Keeping persistent data in memory

I am about to develop a Windows service in C#. This service needs to keep track of events in the system and write some data to files from time to time. These ongoing events form a certain state, so I'll keep the state in memory and update it as events arrive. I don't want to over-complicate things, so I don't want the state to be persisted to disk, but I'm wondering if I could somehow make it persistent in memory, so that if the service crashes (and is automatically restarted by Windows) it could pick up where it left off and carry on (possibly losing some events, which is not a big deal).
I was thinking along the lines of creating a "shared" memory area, thus letting Windows manage it, and using it only in the service, but I'm not sure that object will persist after the service dies.
Any ideas?
EDIT: I'm not looking for an overkill solution. The data is somewhat important, so I'd like to keep it waiting in memory until the service is restarted, but it is not critical. It's more of a nice-to-have feature if I can persist the data easily, without working with files, external third-party processes and so on. My ideal solution would be a simple built-in feature (in .NET or in Windows) that provides some in-memory persistence, just to recover from a crash.

You can use a persistent Caching Block from the Microsoft Enterprise Library.
It is configurable, and you can use several backing stores, such as a database or isolated storage.

I know you said that you don't want to over-complicate things by persisting the state to disk, but it's definitely going to be much more complicated to persist it in shared memory or with any of the other solutions listed here. The reason so many applications use databases or file storage is that it's the simplest solution.
I would recommend you keep all the state in a single object or object hierarchy, serialize this object to XML and write it to a file. It really doesn't get much simpler than that.
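The suggestion above (one state object, serialized to XML, written to a file) could look roughly like the sketch below. The ServiceState type, its members and the StateStore helper are hypothetical names made up for illustration; only XmlSerializer and the System.IO calls are real framework APIs. Writing to a temporary file first is an extra precaution I am assuming you would want for crash recovery, not something the answer requires.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical state type. XmlSerializer needs a public class with a
// parameterless constructor and public read/write members.
public class ServiceState
{
    public DateTime LastEventUtc { get; set; }
    public List<string> PendingItems { get; set; }

    public ServiceState()
    {
        PendingItems = new List<string>();
    }
}

public static class StateStore
{
    private static readonly XmlSerializer Serializer =
        new XmlSerializer(typeof(ServiceState));

    // Write to a temporary file first, then swap it in, so a crash in the
    // middle of a save does not leave a half-written state file behind.
    public static void Save(ServiceState state, string path)
    {
        string tempPath = path + ".tmp";
        using (FileStream stream = File.Create(tempPath))
        {
            Serializer.Serialize(stream, state);
        }

        if (File.Exists(path))
        {
            File.Replace(tempPath, path, null);
        }
        else
        {
            File.Move(tempPath, path);
        }
    }

    // Returns a fresh, empty state when no file exists yet (first start).
    public static ServiceState Load(string path)
    {
        if (!File.Exists(path))
        {
            return new ServiceState();
        }

        using (FileStream stream = File.OpenRead(path))
        {
            return (ServiceState)Serializer.Deserialize(stream);
        }
    }
}
```

The service would call StateStore.Load once at startup and StateStore.Save whenever the state changes (or on a timer), so a restart after a crash resumes from the last saved snapshot.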

You could use Memcached, or Redis (which also persists its data on disk, but handles it automatically).
http://code.google.com/p/redis/
You could also take a look at this question:
Memcached with Windows and .NET

I don't see why it would be harder to persist to disk.
Using db4o, you can persist the instances you are already working with.

How about using isolated storage and persisting the object into memory that way?

Even if, for instance, you keep the data in the shared memory of some other networked PC, how would you "guarantee" that the networked PC won't hang, restart, halt, etc.? In that case your service will lose the persisted data anyway.
I would suggest storing the data on the same disk, and chances are that is where you will end up anyway.
Note that, because of the volatile nature of memory (RAM), you cannot reload data that was there before a system restart unless you use some mechanism to store it on disk and reload it.
--EDIT--
In that case, how about using MSMQ? So you can push everything over the queue, and even if your service gets a restart, it would look for the items in the queue and continue onwards.

Related

Pattern or library for caching data from web service calls and update in the background

I'm working on a web application that uses a number of external data sources for data that we need to display on the front end. Some of the external calls are expensive and some also come with a monetary cost, so we need a way to persist the results of these external requests so that they survive, for example, an app restart.
I've started with a proof of concept, and my current solution is a combination of a persistent cache/storage (which stores serialized JSON in files on disk) and a runtime cache. When the app starts it populates the runtime cache from the persistent cache; if the persistent cache is empty it goes ahead and calls the web services. The next time the app restarts we load from the persistent cache, avoiding the calls to the external sources.
After the first population we want the cache to be updated in the background by some kind of update process on a given schedule. We also want this update process to be smart enough to update the cache only if the request to the web service was successful, and otherwise keep the old version. There's also a twist here: some web services might return a complete collection while others require one call per entity, so the update process might differ depending on the concrete web service.
I'm thinking that this scenario can't be totally unique, so I've looked around and done a fair bit of Googling, but I haven't found any patterns or libraries that deal with something like this.
So what I'm looking for is any patterns that might be useful for us, and any C# libraries or articles on the subject. I don't want to "reinvent the wheel". If anyone has solved similar problems, I would love to hear more about how you approached them.
Thank you so much!
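The "only replace the cached value if the web service call succeeded" rule described above can be sketched in a few lines. This is a minimal illustration, not a library recommendation: LastGoodCache and its members are hypothetical names, the fetch delegate stands in for whatever call a concrete web service needs (whole collection or one entity), and the scheduling and the on-disk persistent layer are deliberately left out.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical cache: keeps the last good value per key and replaces it
// only when the fetch delegate completes without throwing.
public class LastGoodCache<TValue>
{
    private readonly ConcurrentDictionary<string, TValue> _values =
        new ConcurrentDictionary<string, TValue>();

    public bool TryGet(string key, out TValue value)
    {
        return _values.TryGetValue(key, out value);
    }

    // Returns true when the entry was refreshed, false when the old entry
    // (if any) was kept because the fetch failed.
    public bool Refresh(string key, Func<TValue> fetch)
    {
        TValue fresh;
        try
        {
            fresh = fetch();
        }
        catch (Exception)
        {
            // Assumption: any exception counts as a failed request.
            return false;
        }

        _values[key] = fresh;
        return true;
    }
}
```

Because each key gets its own fetch delegate, a service that returns a complete collection and one that needs a call per entity can share the same cache while using different update logic.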

ORM for stateful application. Does EF fit? Or any?

I need an ORM that is suitable for a stateful application. I'm going to keep entities between requests in a low-latency real-time game server with persistent client connections. There is only one server instance connected to the database, so no data can be changed from "outside" and the server can rely on its cache.
When a user logs in to the server remotely, their whole profile is loaded into server memory. Several higher-level services are also created for each user to operate on the profile data and provide functionality. They can also have internal fields (state) to store temporary data. When a user wants to change their signature, they ask the corresponding service to do so. The service tracks how frequently the user changes the signature and allows it only once per ten minutes (for example); such a short interval is not tracked in the db, it is temporary state. The change should be stored to the db by executing only one query: UPDATE users SET signature = ... WHERE user_id = .... When a user logs off, the profile is unloaded from server memory after minutes or hours of inactivity. The db here is only storage. This is what I call stateful.
Some entities are considered "static data" and loaded only once at application start. Those can be referenced from other "dynamic" entities. Loading "dynamic" entity should not require reloading referenced "static data" entity.
Update/Insert/Delete should set/insert/delete only changed properties/entities even with "detached" entity.
Write operations should not have to load data from the database (perform a SELECT) beforehand each time to detect changes. (State can be tracked in a dynamically generated subclass.) I have the state locally, so there is no sense in loading anything. I want to continue tracking changes even outside of the connection scope and "upload" changes when I want.
While performing operations, references to persisted objects should not be changed.
DBConnection-per-user is not going to work. The expected number of online users is in the thousands.
Entities from "static data" can be assigned to "dynamic" entity properties (which represent foreign keys), and Update should handle this correctly.
Right now I'm using NHibernate, even though it is designed for stateless applications. It supports reattaching to a session, but that looks like very uncommon usage, requires me to rely on undocumented behavior and doesn't solve everything.
I'm not sure about Entity Framework. Can I use it that way? Or can you suggest another ORM?
If the server recreates (or, worse, reloads) user objects each time a user hits a button, it will eat CPU very fast. Scaling CPU vertically is expensive and has little effect. By contrast, if you run out of RAM you can just go and buy more; it is like horizontal scaling but easier to code. If you think another approach should be used here, I'm ready to discuss it.

Yes, you can use EF for this kind of application. Please keep in mind that under heavy load you will get some db errors from time to time, and it is typically faster to recover from errors when your application tracks the changes rather than EF. By the way, you can use NHibernate this way too.

I have used hibernate in a stateful desktop application with extremely long sessions: the session starts when the application launches, and remains open for as long as the application is running. I had no problems with that. I make absolutely no use of attaching, detaching, reattaching, etc. I know it is not standard practice, but that does not mean it is not doable, or that there are any pitfalls. (Edit: but of course read the discussion below for possible pitfalls suggested by others.)
I have even implemented my own change notification mechanism on top of that, (separate thread polling the DB directly, bypassing hibernate,) so it is even possible to have external agents modify the database while hibernate is running, and to have your application take notice of these changes.
If you have lots and lots of stuff already working with hibernate, it would probably not be a good idea to abandon what you already have and rewrite it unless you are sure that hibernate absolutely won't do what you want to accomplish.

Fast Distributed Memory Access in C#

I have a C# WCF service that hosts 120 GB of memory in a Dictionary<File,byte[]> for very fast access to file contents, which has worked really well for me. Upon access, the file contents are wrapped in a MemoryStream and read.
This service needs to be restarted every day to load some static data from the database that can change on a daily basis. The restart takes a long time because of the huge amount of data that has to be loaded into memory again.
So I decided to host this memory in a different process on the same machine and access it through sockets. The data process will always be up and running. TcpListener/TcpClient and NetworkStream were used in a fashion similar to the following:
memoryStream.Read(position.PositionData, 0, position.SizeOfData);
position.NetworkStream.Write(position.PositionData, 0, position.SizeOfData);
Problem is: this was 10 times slower than hosting the memory in the same process. Slowdown is expected, but a factor of 10 is too much.
I thought of MemoryMappedFiles, but those are more useful for random access to a specific view of the file. My file access is sequential from the beginning all the way to the end.
Is there a different technology or library that could be used in my case? Or is a slowdown of this size simply to be expected?

I assume you are using SQL Server. If so, Service Broker and SqlNotification or query notifications may be your friends here. I presume you need more of a push messaging model, which automatically propagates changes back to the service when something changes in the db. That would avoid restarting the memory- and resource-intensive process, so there would be no need to reload your heavyweight dictionary.

Utilizing two Redis instances - Similar to Mongos

I have been reading that the proper way to scale Redis is to add a separate instance (even on the same machine is OK, because the load is CPU-bound). What I am wondering is whether there are any existing components out there that facilitate round-robin writes and reads, similar to Mongos, so that I could just call into it and it would properly write to or read from one of the underlying instances. I realize that it is more complicated than what I have described above, but I didn't want to reinvent the wheel by trying to write my own proxy, etc. to handle this.
Any suggestions / tips, etc would be appreciated.
Thanks,
S

The approach will work for scaling reads, but not writes as Redis is not yet released with redis-cluster.
For load balancing reads, any TCP load balancer should work fine such as Balance. I link that one because it is software based and pretty simple to set up and use. Of course, if you have a hardware load balancer you could do it there, or use any of several other software based load balancers.
Another option is to implement round robin in your client code, though I prefer to not do that myself. Once redis-cluster is released it won't really matter which server you connect to.
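For what it's worth, client-side round robin over read replicas can be as small as the sketch below. RoundRobinSelector is a hypothetical helper, and the endpoint strings are just whatever your Redis client library expects; no particular client is assumed.

```csharp
using System;
using System.Threading;

// Hypothetical helper that hands out endpoints in rotation.
public class RoundRobinSelector
{
    private readonly string[] _endpoints;
    private int _counter = -1;

    public RoundRobinSelector(string[] endpoints)
    {
        if (endpoints == null || endpoints.Length == 0)
        {
            throw new ArgumentException("At least one endpoint is required.", "endpoints");
        }
        _endpoints = (string[])endpoints.Clone();
    }

    public string Next()
    {
        // Interlocked keeps the counter correct under concurrent callers;
        // the uint cast keeps the index non-negative after int overflow.
        uint ticket = unchecked((uint)Interlocked.Increment(ref _counter));
        return _endpoints[ticket % (uint)_endpoints.Length];
    }
}
```

Each read would ask the selector for the next endpoint and open (or reuse) a connection to it. Note that this does nothing about replicas that are down, which is one reason a TCP load balancer is often the more comfortable option.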
For balancing writes, you'll need to go the route of sharding your data, which is described rather well IMO at Craigslist's Redis usage page. If you think you'll need to go this route, I'd recommend taking the line JZ takes and doing the underlying setup in advance. Ideally, once redis-cluster is ready there should be minimal, if any, code changes needed to move to having the cluster handle it for you.
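To make "sharding your data" concrete, here is a generic sketch of picking a write master from the key. It is not taken from the page linked above: ShardSelector is a hypothetical name, and hash-modulo is just the simplest scheme I could assume. A fixed hash (FNV-1a here) is used instead of string.GetHashCode so that every client process maps a key to the same master.

```csharp
using System;
using System.Text;

// Hypothetical helper that maps a key to one of several write masters.
public class ShardSelector
{
    private readonly string[] _masters;

    public ShardSelector(string[] masters)
    {
        if (masters == null || masters.Length == 0)
        {
            throw new ArgumentException("At least one master is required.", "masters");
        }
        _masters = (string[])masters.Clone();
    }

    public string ForKey(string key)
    {
        // 32-bit FNV-1a over the UTF-8 bytes of the key.
        uint hash = 2166136261;
        foreach (byte b in Encoding.UTF8.GetBytes(key))
        {
            unchecked
            {
                hash ^= b;
                hash *= 16777619;
            }
        }
        return _masters[hash % (uint)_masters.Length];
    }
}
```

The obvious drawback of plain modulo is that changing the number of masters remaps most keys, which is why doing the setup in advance (more shards than you currently need) is attractive.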
If you want a single IP to handle both reads and writes as well as multiple sharded write masters you would likely need to write that "proxy" yourself, or put the code in the client code you write. Alternatively, this proxy announcement may hold what you need, though I don't see anything about routing writes in it.
Ultimately, I think you'd need to test and validate you actually need that write scaling before implementing it. I've found that if I have all reads on one or more slaves, and have the slaves manage disk persistence, performance of writes is usually not an issue.

Keeping in sync with database

The solution we developed uses a database (sqlserver 2005) for persistence purposes, and thus, all updated data is saved to the database, instead of sent to the program.
I have a front-end (desktop) that currently keeps polling the database for updates that may happen anytime on some critical data, and I am not really a fan of database polling and wasted CPU cycles with work that is being redone uselessly.
Our manager doesn't seem to mind us polling the database. The amount of data is small (fewer than 100 records) and the polling interval is long (1 min), but I am a coder, and I do mind. Is there a better way to keep the data in memory as closely synced as possible with the data in the database? The system is developed using C# 3.5.

Since you're on SQL 2005, you can use a SqlDependency to be notified of changes. Note that you can use it pretty effortlessly with System.Web.Caching.Cache, which, despite its namespace, runs just fine in a WinForms app.
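A rough sketch of the SqlDependency approach on its own (without System.Web.Caching.Cache) follows. The CriticalDataWatcher class, the dbo.CriticalData table and its columns are hypothetical; SqlDependency, SqlCommand and SqlNotificationEventArgs are the real System.Data.SqlClient types. It assumes Service Broker is enabled on the database and that the query follows the query-notification rules (explicit column list, two-part table name).

```csharp
using System;
using System.Data.SqlClient;

// Hypothetical wrapper that raises DataChanged whenever the watched
// query's result set changes in the database.
public class CriticalDataWatcher
{
    // Hypothetical table and columns; query notifications need an explicit
    // column list and a schema-qualified table name.
    private const string Query = "SELECT Id, Value FROM dbo.CriticalData";

    private readonly string _connectionString;

    public event Action DataChanged;

    public CriticalDataWatcher(string connectionString)
    {
        _connectionString = connectionString;
    }

    public void Start()
    {
        SqlDependency.Start(_connectionString);
        Subscribe();
    }

    public void Stop()
    {
        SqlDependency.Stop(_connectionString);
    }

    private void Subscribe()
    {
        using (var connection = new SqlConnection(_connectionString))
        using (var command = new SqlCommand(Query, connection))
        {
            // The dependency must be attached before the command runs.
            var dependency = new SqlDependency(command);
            dependency.OnChange += OnChange;

            connection.Open();
            using (SqlDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    // Refresh the in-memory copy from the reader here.
                }
            }
        }
    }

    private void OnChange(object sender, SqlNotificationEventArgs e)
    {
        ((SqlDependency)sender).OnChange -= OnChange;

        // Anything other than Change usually means the subscription itself
        // failed, so do not resubscribe in a loop.
        if (e.Type != SqlNotificationType.Change)
        {
            return;
        }

        // A notification fires only once; subscribing again also re-reads
        // the data.
        Subscribe();

        Action handler = DataChanged;
        if (handler != null)
        {
            handler();
        }
    }
}
```

The handler runs on a background thread, so a WinForms front end would need to marshal back to the UI thread (for example with Control.Invoke) before touching controls.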

First thought off the top of my head is a trigger combined with a message queue.

This is probably overkill for your situation, but it may be interesting to take a look at the Microsoft Sync Framework.

SQL Notification Services will allow you to have the database callback to an app based off a number of protocols. One method of implementation is to have the notification service create (or modify) a file on an accessible network share and have your desktop app react by using a FileSystemWatcher.
More information on Notification Services can be found at: http://technet.microsoft.com/en-us/library/aa226909(SQL.80).aspx
Please note, though, that this may be a sledgehammer-to-crack-a-nut approach.
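The desktop side of that idea, reacting to a file on a share being created or modified, might look like the sketch below. SignalFileWatcher and the Signalled event are hypothetical names; the directory and file name are whatever the notification side is configured to write. Only the FileSystemWatcher usage is real API.

```csharp
using System;
using System.IO;

// Hypothetical wrapper: raises Signalled when the marker file is created
// or written to.
public class SignalFileWatcher : IDisposable
{
    private readonly FileSystemWatcher _watcher;

    public event Action Signalled;

    public SignalFileWatcher(string directory, string fileName)
    {
        // The second argument is a filter, so only this one file is watched.
        _watcher = new FileSystemWatcher(directory, fileName);
        _watcher.NotifyFilter = NotifyFilters.LastWrite | NotifyFilters.FileName;
        _watcher.Created += OnFileEvent;
        _watcher.Changed += OnFileEvent;
        _watcher.EnableRaisingEvents = true;
    }

    private void OnFileEvent(object sender, FileSystemEventArgs e)
    {
        // Runs on a background thread, and may fire more than once for a
        // single write, so the handler should be cheap and idempotent.
        Action handler = Signalled;
        if (handler != null)
        {
            handler();
        }
    }

    public void Dispose()
    {
        _watcher.Dispose();
    }
}
```

The app would subscribe to Signalled and reload its records from the database when it fires, replacing the one-minute polling loop.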

In ASP.NET, http://msdn.microsoft.com/en-us/library/ms178604(VS.80).aspx.

This may also be overkill but maybe you could implement some sort of caching mechanism. That is, when the data is written to the database, you could cache it at the same time and when you're trying to fetch data back from the DB, check the cache first.
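A minimal sketch of that "write to the database and the cache together, read from the cache first" idea is below. WriteThroughCache and the two delegates are hypothetical stand-ins for your real data-access code. One assumption matters a lot: this only stays in sync if every write goes through the same process, so it would not notice rows changed by another program.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical cache that is updated on every write and consulted before
// the database on every read.
public class WriteThroughCache<TKey, TValue>
{
    private readonly Dictionary<TKey, TValue> _cache = new Dictionary<TKey, TValue>();
    private readonly object _gate = new object();
    private readonly Func<TKey, TValue> _loadFromDb;
    private readonly Action<TKey, TValue> _saveToDb;

    public WriteThroughCache(Func<TKey, TValue> loadFromDb, Action<TKey, TValue> saveToDb)
    {
        _loadFromDb = loadFromDb;
        _saveToDb = saveToDb;
    }

    public void Save(TKey key, TValue value)
    {
        // Database first: if the write throws, the cache keeps the old value.
        _saveToDb(key, value);
        lock (_gate)
        {
            _cache[key] = value;
        }
    }

    public TValue Get(TKey key)
    {
        lock (_gate)
        {
            TValue cached;
            if (_cache.TryGetValue(key, out cached))
            {
                return cached;
            }
        }

        // Cache miss: fall back to the database and remember the result.
        TValue loaded = _loadFromDb(key);
        lock (_gate)
        {
            _cache[key] = loaded;
        }
        return loaded;
    }
}
```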
