In my current setup I have a dedicated AppFabric server. Most of the objects stored there are reference objects, which means most of the operations are 'Get' operations. Therefore I've considered using LocalCache.
Unfortunately, recently I experienced problems with the availability of the cache server resulting from various network issues. The application server continues to work directly with the DB in these cases thanks to a provider I've written. However, it has a very large impact on performance as expected.
I want to be able to use some kind of a local cache for the highly referenced objects, even when the cache server is down. For this purpose I've considered using the MemoryCache of .Net 4. I don't really care about the objects being stale and I rely on a timeout eviction policy, therefore I don't worry about synchronization between the application servers.
I'd like to hear what you think about this solution.
- Are there any other points I should consider?
- Is there a better solution to provide fast access for highly referenced objects even when the cache server is down?
AppFabric's LocalCache is a client cache, local and in-process to the client application. It stores references to frequently used data so that the application does not need to deserialize the same object again. However, since LocalCache works with the cache server, it will not work if the cache server is down.
One possible solution to your problem is, as you mentioned, an independent client cache, so that even if the cache server goes down the client cache is still available.
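A minimal sketch of that layering, written in Python for brevity rather than against the real MemoryCache or AppFabric APIs. The names `TtlCache`, `get_reference_object`, `remote_get` and `db_get` are all hypothetical, and it assumes the cache-server client raises `ConnectionError` when the server is unreachable.

```python
import time


class TtlCache:
    """Tiny in-process cache with a fixed time-to-live per entry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired: evict lazily on read
            return None
        return value

    def set(self, key, value):
        self._items[key] = (self._clock() + self._ttl, value)


def get_reference_object(key, local, remote_get, db_get):
    """Look in the local cache, then the cache server, then the database."""
    value = local.get(key)
    if value is not None:
        return value
    try:
        value = remote_get(key)  # assumed to raise if the cache server is down
    except ConnectionError:
        value = None
    if value is None:
        value = db_get(key)  # last resort: the database
    local.set(key, value)
    return value
```

With this shape, a cache-server outage only costs one database read per key per TTL window on each application server, which matches the "stale is fine, timeout eviction" requirement in the question.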
When relying on an in-process cache, keep in mind that it stores references to the cached objects. If your application modifies an object after getting it from the cache, the object is modified in the cache as well. Also, if multiple threads may end up modifying the same item in the cache, you will need thread synchronization for such objects.
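One way to sidestep the shared-reference problem is to hand out copies instead of the stored object. The sketch below is illustrative Python, not the MemoryCache API; `CopyOnReadCache` is a hypothetical name, and the deep copy has a cost that may not be worth paying for large objects.

```python
import copy
import threading


class CopyOnReadCache:
    """In-process cache that never exposes its stored objects directly."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = {}

    def set(self, key, value):
        with self._lock:
            # Store a private copy so later changes by the caller do not leak in.
            self._items[key] = copy.deepcopy(value)

    def get(self, key):
        with self._lock:
            # Return a copy so callers cannot mutate the cached instance.
            return copy.deepcopy(self._items.get(key))
```

For read-only reference data, treating cached objects as immutable by convention achieves the same thing without the copying overhead.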
However, even with an independent client cache, your application may end up hitting the database frequently, since data in the client cache of one application server is not accessible to other servers.
A better solution might be replicated cache servers, where each server holds all cached data. This not only improves get performance for reference data but also eliminates the single point of failure you have now.
If AppFabric is not a hard requirement for your application, you may look into NCache for better scalability and high availability.
Did you consider AppFabric's local cache feature? Or is it not suitable for you?
I have a unique (or so I think) problem - we have an ASP.NET web app using MVC principles. The project will be at most single-threaded (our business requires a single point of control). We are using Entity Framework to connect to the database.
Problem:
We want to query our database less frequently than every page load.
I have considered putting our database connection in a singleton but am worried about connecting to it too infrequently -- will a query still work if it connected a significant time ago? How would you recommend connecting to the database?
How would you recommend connecting to the database?
Do NOT use a shared connection. Connections are not thread-safe, and are pooled by .NET, so creating one generally isn't an expensive operation.
The best practice is to create a command and connection for every database request. If you are using Entity Framework, then this will be taken care of for you.
If you want to cache results using the built-in Session or Cache properties, then that's fine, but don't cache disposable resources like connections, EF contexts, etc.
If at some point you find you have a measurable performance problem directly related to creating connections or contexts, then you can try and deal with that, but don't try to optimize something that might not even be a problem.
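To illustrate "cache the results, not the connection", here is a rough sketch in Python, with `sqlite3` standing in for the real database. `query_cached` and the module-level `_results` dictionary are hypothetical names; in ASP.NET the result cache would be the built-in Cache and the short-lived connection would come from the provider's pool.

```python
import sqlite3
import time
from contextlib import closing

_results = {}  # (db_path, sql, params) -> (expires_at, rows)


def query_cached(db_path, sql, params=(), ttl_seconds=30.0):
    """Return cached rows while they are fresh, otherwise query the database."""
    key = (db_path, sql, tuple(params))
    entry = _results.get(key)
    if entry is not None and time.monotonic() < entry[0]:
        return entry[1]
    # Open a connection for this request only and close it right away.
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(sql, params).fetchall()
    _results[key] = (time.monotonic() + ttl_seconds, rows)
    return rows
```

Only plain data ends up in the cache; the connection never outlives the call, so there is nothing to go stale if the next query arrives hours later.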
If you want to get data without connecting to the database, you need to cache it - either in memory, in a file or in whatever means of storage you want, but you need to keep it in front of the DB somehow. There is no other way known to me.
If by connecting you mean building a completely new SqlConnection to your DB, then you can either rely on connection pooling (EF is smart enough to keep your connections alive for some minutes even after you finish your business) or you can just create connections and keep them alive inside your application by not closing them instantly (i.e. keeping track of them inside a structure).
But you should definitely consider if this is REALLY what you want. The way EF does it internally is most of the time exactly what you want.
Some further reading:
https://learn.microsoft.com/en-us/aspnet/mvc/overview/older-versions/getting-started-with-ef-5-using-mvc-4/implementing-the-repository-and-unit-of-work-patterns-in-an-asp-net-mvc-application
I'm looking for a simple way to implement a local memory store which can be used on an Azure .NET instance
I've been looking at Azure Co-located Caching and it seems to support all of my requirements:
1. Work on both web roles and worker roles
2. Implement a simple LRU
3. Keep cached objects in memory (RAM)
4. Allow me to define the cache size as a percentage of the machine's total RAM
5. Keep the cache on the same machine of the web/worker role (co-located mode)
6. Allow me to access the same cache from multiple AppDomains running on the same machine (Web Roles may split my handlers into different AppDomains)
The only problem I have with Azure Co-located caching is that different instances communicate and try to share their caches - and I don't really need all that.
I want every machine to have its own separate in-memory cache. When I query this cache, I don't want to waste any time on making a network request to other instances' caches.
Local Cache config?
I've seen a configuration setting in Azure Caching to enable a Local Cache - but it still seems like machines may communicate with each other (e.g. during a cache miss). This config also requires a ttlValue and objectCount and I want TTL to be "forever" and the object count to be "until you fill the entire cache". Specifying maxInt in both cases feels wrong.
What about a simple static variable?
When I really think about it, all this Azure caching seems like a bit of an overkill for what I need. I basically just need a static variable in the application/role level.. except that doesn't work for requirement #6 (different AppDomains). Requirement #4 is also a bit harder to implement in this case.
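For reference, the "static variable" approach with LRU eviction is only a few lines. This is a Python sketch with a hypothetical `LruCache` class; it bounds the cache by entry count, not by a percentage of RAM, and like any in-process structure it is not shared across AppDomains or processes.

```python
import threading
from collections import OrderedDict


class LruCache:
    """In-process cache that evicts the least recently used entry when full."""

    def __init__(self, max_entries):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self._max = max_entries
        self._items = OrderedDict()
        self._lock = threading.Lock()

    def get(self, key, default=None):
        with self._lock:
            if key not in self._items:
                return default
            self._items.move_to_end(key)  # mark as most recently used
            return self._items[key]

    def set(self, key, value):
        with self._lock:
            self._items[key] = value
            self._items.move_to_end(key)
            while len(self._items) > self._max:
                self._items.popitem(last=False)  # evict least recently used
```

A single module-level instance, for example `cache = LruCache(10000)`, plays the role of the static variable.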
Memcached
I think good old memcached seems to do exactly what I want. Problem is I'm using Azure as a PaaS and I don't really want to administer my own VM's. I don't think I can install memcached on my roles.. [UPDATE] It seems it is possible to run memcached locally on my roles. Is there a more elegant "native" solution without using memcached itself?
You can certainly install memcached on Web and Worker roles. Steve Marx blogged about getting memcached running on an Azure Cloud Service several years ago, before the Virtual Machine features were present. This is an older post, so you may find other ways of dealing with this, such as using startup tasks instead of the OnStart method in RoleEntryPoint, etc.
I have used the "free" versions of SQL Server for local caching and they have worked great. It depends on what you are doing, but I have run both SQL Server Express and Compact to store entire small static data sets for a fantasy football site I wrote that included 5 years of statistics. They worked really well even on small/medium Azure instances because of their small footprint.
http://blogs.msdn.com/b/jerrynixon/archive/2012/02/26/sql-express-v-localdb-v-sql-compact-edition.aspx
Best part is you can use t-sql. Your cache requirements might be more complex or not scale to this.
I would like to maintain a list of objects that is distributed between N load balanced servers: whenever a client changes the list on one server, I would like these changes to migrate to the other servers. So, I guess this is a case of master-master replication.
What is the simplest way of handling this? One simplifying fact is that each change to an object in the list has an associated increasing version number attached to it. So, it is possible to resolve conflicts if an item was changed on two different servers, and these two deltas make their way to a third server.
Edit: clarification: I am quite familiar with distributed key-value stores like Memcached and Redis. That is not the issue here; what I am interested in is a mechanism to resolve conflicts in a shared list: if server A changes an item in the list, and server B removes the item, for example, how to resolve the conflict programmatically.
I suggest memcached. It's a distributed server cache system that seems to fit your needs perfectly. Check out this link:
Which .NET Memcached client do you use, EnyimMemcached vs. BeITMemcached?
If passing the entire list doesn't suit you (I don't know if memcached is smart enough to diff your lists) then I would suggest giving the old DataSet object a look, as its DiffGrams should be well suited for passing around just the deltas if your data set is large.
Put your changes in a queue. Have each server look at the queue, and act upon it.
For example, queue could have:
add item #33
remove item #55
update item #22
and so on
Upon doing a change, write to the queue, and have each server pick up items from the queue and update its list according to that.
I built an in-memory database using this method, and it worked perfectly on multiple 'servers'.
EDIT:
When servers want to update each other, this has to happen:
Each server that updates will put an UPDATE (or ADD or DELETE) request into the queue for all other servers. Each server should also store the list of queued requests that originated from it so it will not load its own updates from the queue.
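A rough sketch of the queue idea, in Python with a plain list standing in for the shared queue. `Replica` and its methods are hypothetical names. Instead of each server remembering the requests it queued, this version tags every entry with its origin and skips its own entries, which is a slight variation on what is described above.

```python
class Replica:
    """One server's copy of the list, kept in sync through a shared change log."""

    def __init__(self, name, shared_log):
        self.name = name
        self.items = {}  # item id -> value
        self._log = shared_log  # stand-in for the shared queue
        self._next = 0  # index of the next log entry to read

    def _apply(self, op, item_id, value):
        if op in ("add", "update"):
            self.items[item_id] = value
        elif op == "remove":
            self.items.pop(item_id, None)

    def change(self, op, item_id, value=None):
        """Apply a change locally and publish it for the other servers."""
        self._apply(op, item_id, value)
        self._log.append((self.name, op, item_id, value))

    def catch_up(self):
        """Apply every unseen change that originated on another server."""
        while self._next < len(self._log):
            origin, op, item_id, value = self._log[self._next]
            self._next += 1
            if origin != self.name:
                self._apply(op, item_id, value)
```

Note that this alone does not resolve conflicting changes to the same item made on two servers at nearly the same time; that still needs an ordering rule such as the version numbers mentioned in the question.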
Does each server have its own version of the list cached locally, or do you plan to use a centralized caching layer?
As suggested, you can have a centralized "push" process which works off a centralized queue. Any changes submitted by any server are enqueued, and the "push" process can push updates to all the servers via some remoting / WebService mechanism.
This offers the advantage of any changes/updates/deletes being applied at once (or close in time) to all the servers, centralized validation or logging if needed. This also solves the problem of multiple updates - the latest one takes precedence.
I've seen this implemented as a Windows service with an internal queue (which can be persisted to a DB asynchronously for resiliency). It takes items one by one, validates each item, logs the change/content, and finally pushes it to the local lists via WebService calls to each web server (the servers maintain an in-memory list which simply gets updated/added/deleted as needed).
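The "latest one takes precedence" rule can be sketched using the increasing version numbers from the question. This is illustrative Python with hypothetical names (`merge_change`, `live_items`). Removals are kept as tombstones, so a late-arriving older update cannot resurrect a deleted item.

```python
def merge_change(store, item_id, version, value, deleted=False):
    """Apply a change only if it is newer than what the store already holds."""
    current = store.get(item_id)
    if current is not None and current["version"] >= version:
        return False  # stale or duplicate delta: ignore it
    store[item_id] = {"version": version, "value": value, "deleted": deleted}
    return True


def live_items(store):
    """Return the visible list contents, hiding tombstoned (removed) items."""
    return {key: entry["value"] for key, entry in store.items() if not entry["deleted"]}
```

Because each change is accepted or rejected purely on its version, two servers that receive the same deltas in different orders end up with the same state. If two servers can produce the same version number for one item, an extra tie-breaker (for example a server id) would be needed; that case is not handled here.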
There are algorithms that can be used to synchronize distributed systems.
In your case you need an algorithm that, given two events in the system, tells you which one of them happened first. If you can decide that for any two events, then all the conflicts can be resolved.
I recommend using Lamport clocks.
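A minimal Lamport clock looks roughly like this (Python sketch; the class and the use of the node id as a tie-breaker are illustrative choices, not something from the question):

```python
class LamportClock:
    """Logical clock: a counter that only moves forward."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.time = 0

    def tick(self):
        """Stamp a local event (for example, a change made on this server)."""
        self.time += 1
        return (self.time, self.node_id)

    def receive(self, remote_time):
        """Stamp the receipt of a message carrying the sender's clock value."""
        self.time = max(self.time, remote_time) + 1
        return (self.time, self.node_id)
```

Comparing the `(time, node_id)` stamps as tuples gives a total order that every server agrees on: if one change causally precedes another it always gets the smaller stamp, and concurrent changes are ordered arbitrarily but consistently by node id.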
If you're on a Windows platform, I suggest you take a look at "Windows Server AppFabric", and especially the Caching feature. The name is funky, but I think it's exactly what you're looking for, I quote:
A distributed in-memory cache that provides .NET applications with
high-speed access, scale, and high availability to application data.
I consider using CLR trigger instead of traditional T-SQL one because I need to use some logic that is already implemented in C#. I'm aware that SQL server supports CLR integration and in my case it seems like a solution that's worth a shot.
However, the operations I want to perform can be somewhat slow. Not slow enough to rule out using them in triggered actions completely, but probably noticeably slow when it comes to inserting hundreds of thousands of records. The slowest part can strongly benefit from caching; I expect very few cache misses and thousands of cache hits. At this point it all leads to a question: can CLR triggers have any state? And, more importantly, what's the life cycle of this state?
I suppose I could use static fields of trigger class to hold some state, but I have no idea when it gets initialized (When the server is started? At transaction start? Not specified?). I am not sure if it's the safe route and therefore ask what the common practices for using some state in CLR triggers are (if any).
To avoid confusion: I need to cache CLR objects, not the results of some SQL queries, so it's not about how good SQL Server itself is at caching; I want to cache some data that doesn't belong to the database. Also, I am considering CLR not because I can't do string manipulations and bounds checking in T-SQL. I need to execute some logic that is implemented in a CLR class library and has a lot of dependencies. Whether I should use triggers in this case is another question that has almost nothing to do with this one.
Many thanks in advance.
PS: I will appreciate any comments and insights on topic, even the ones that don't answer my question directly, but please don't make it all about "triggers are evil and shouldn't ever be used" and "CLR integration is slow and a major compatibility pain". Also, I know that it may scream "premature optimization" to someone, but at the moment I just want to know what my optimization options are going in since I'm new to CLR integration in SQL server. I won't optimize it unless profiling results suggest so, but I don't want to implement the whole thing to realize it's too slow and there is nothing I can do about it.
I use SQL Server 2008 and .NET 3.5.
While it is possible to use static class fields in the SQLCLR Trigger class to cache values, there are several things you need to be very cautious about:
How much data do you plan on caching? You don't want to take up too much memory that SQL Server should instead be using for queries.
There is a single AppDomain per Database per Assembly Owner (i.e. AUTHORIZATION on the Assembly). This means that the code in any particular Assembly is shared across all SQL Server Sessions (i.e. SPIDs). If the data is just lookup data that won't change based on which process is interacting with the static field, then this is fine. But if the data is different per process, then this will produce "odd" behavior unless you associate a value such as the current TransactionID with the process.
If the data is per process, assuming you find a way to differentiate each particular SPID / SESSION, how are you going to clean up the old data? It will exist in memory until explicitly removed or the AppDomain is unloaded. This is not a problem for common lookup data that is meant to be shared with everyone as that type of data doesn't increase with each new process. But per-process data will continually increase unless cleared out.
AppDomains can be unloaded at any time and for a variety of reasons (memory pressure, drop/recreate of the Assembly, security change related to the Assembly, security change related to the DB, running DBCC FREESYSTEMCACHE('ALL'), etc). If the data being cached can cause different outcomes between sequential processes if one process relies upon data cached by a prior process, then this cannot be guaranteed to work. If the cache being dropped between processes results in nothing more than the need to reload the cache, then it should be fine.
Other notes (but nothing to be cautious about):
AppDomains are loaded when the first method is called in an Assembly where there is no currently running AppDomain for the Database that the Assembly exists in and the User that is the Authorizer of that Assembly.
AppDomains will remain loaded until they are unloaded by SQL Server for one of the reasons noted above, but none of those scenarios will necessarily occur. Meaning, the AppDomain can remain loaded for a very long time (i.e. until server / service restart).
Each Assembly is loaded the first time a method inside of it is referenced.
In order to make use of the loading event, you can place code in the static class constructor. Just be aware that there is no SqlContext available, so you can't make any SqlConnections in a static class constructor that use the in-process Context Connection (i.e. Context Connection = true).
I am about to develop a Windows service in C#. This service needs to keep track of events in the system, and write some data to files from time to time. These ongoing events form a certain state, so I'll keep the state in memory and update it as events will arrive. I don't want to over-complicate things so I don't want the state to be persistent on disk, but I'm wondering if I could somehow make it persistent in memory, so that if the service crashes (and auto restarts by Windows) it could pick up from where it left and go on (possibly losing some events, not a big deal).
I was thinking along the line of creating a "shared" memory area, thus letting Windows manage it, and using it only in the service - but I'm not sure that object will persist after the service dies.
Any ideas?
EDIT: I'm not looking for an overkill solution. The data is somewhat important so I'd like to keep it waiting in memory until the service is restarted, but the data is not too important. It's more of a nice-to-have feature if I can persist the data easily, without working with files, external 3rd party processes and so on. My ideal solution would be a simple built-in feature (in .NET or in Windows) that will provide me with some in-memory persistence, just to recover from a crash event.
You can use a Persistent Caching Block from the Microsoft Enterprise Library.
It is configurable and you can use many backing stores like database and isolated storage.
I know you said that you don't want to over-complicate things by persisting it to disk, but it's definitely going to be much more complicated to persist stuff into shared memory or any of the solutions listed here. The reason why so many applications use databases or file storage is that it's the simplest solution.
I would recommend you keep all the state in a single object or object hierarchy, serialize this object to XML and write it to a file. It really doesn't get much simpler than that.
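As a rough illustration of how little code that is, here is a Python sketch (in .NET the equivalent would be a serializer writing to a file). The function names and the flat string-to-string state shape are assumptions for the example. The write goes to a temporary file first and is then swapped in, so a crash mid-write does not leave a half-written state file.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def save_state(path, state):
    """Write a dict of string keys and values to an XML file, replacing it atomically."""
    root = ET.Element("state")
    for key, value in state.items():
        ET.SubElement(root, "entry", {"name": key}).text = value
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "wb") as handle:
        ET.ElementTree(root).write(handle, encoding="utf-8", xml_declaration=True)
    os.replace(tmp_path, path)  # swap the finished file into place


def load_state(path):
    """Read the state back; return an empty dict if nothing was saved yet."""
    if not os.path.exists(path):
        return {}
    root = ET.parse(path).getroot()
    return {entry.get("name"): entry.text or "" for entry in root.findall("entry")}
```

The service would call `load_state` once at startup and `save_state` whenever the in-memory state changes enough to be worth keeping.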
You could use Memcached, or Redis (which also persists its data on disk, but handles it automatically).
http://code.google.com/p/redis/
You could also take a look at this question:
Memcached with Windows and .NET
I don't see why it'd be harder to persist to disk.
using db4o you can persist the instances you are already working with.
How about using isolated storage and persisting the object into memory that way?
Even if, for instance, you keep the data in the shared memory of some other networked PC, how would you "guarantee" that the networked PC won't hang/restart/halt/etc.? In that case your service will lose the persisted data anyway.
I would suggest storing the data on the same disk, and chances are that is where you will end up anyway.
Note that, because of the volatile nature of memory(RAM) you cannot reload data that was previously there, before the system restart; not unless you use some mechanism to store/reload on disk.
--EDIT--
In that case, how about using MSMQ? So you can push everything over the queue, and even if your service gets a restart, it would look for the items in the queue and continue onwards.