I have a design issue.
Let's say you have an application that stores data in its cache during runtime. What do you think is the maximum amount of data (in MB) an application should cache before it must use a DB?
Thanks.
How much memory do you have? How much can you afford to lose when the app or system crashes? How long can you afford the start-up time to be when you restart and have to reload those caches? Typically, even with caches, you need to write through to the DB anyway (or to something) to persist the data.
And if you eventually have "too much data" to fit into memory, then you're paging working sets from the DB into memory anyway. You also have cache synchronization issues if someone changes the DB behind your back.
All sorts of fun issues.
But, if you have the memory, go ahead and use it.
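To make the write-through point concrete, here is a minimal sketch of a write-through cache in C#. The IRepository interface and its Save/Load methods are made-up placeholders for whatever persistence layer you have, not a real API:

    using System.Collections.Generic;

    // Stand-in for your own DB access code (made-up interface).
    public interface IRepository<TKey, TValue>
    {
        void Save(TKey key, TValue value);
        TValue Load(TKey key);
    }

    // Write-through cache: every update is persisted first, then cached,
    // so a crash can only lose the read cache, never the data itself.
    public class WriteThroughCache<TKey, TValue>
    {
        private readonly Dictionary<TKey, TValue> _map = new Dictionary<TKey, TValue>();
        private readonly object _gate = new object();
        private readonly IRepository<TKey, TValue> _db;

        public WriteThroughCache(IRepository<TKey, TValue> db) { _db = db; }

        public void Put(TKey key, TValue value)
        {
            _db.Save(key, value);                   // persist first (write-through)
            lock (_gate) { _map[key] = value; }     // then update the in-memory copy
        }

        public TValue Get(TKey key)
        {
            lock (_gate)
            {
                TValue cached;
                if (_map.TryGetValue(key, out cached)) return cached;
            }
            TValue loaded = _db.Load(key);          // cache miss: page in from the DB
            lock (_gate) { _map[key] = loaded; }
            return loaded;
        }
    }

If you are willing to trade durability for responsiveness, the Save call could be queued and flushed later instead, which is exactly the "pay in lost data on a crash" trade-off discussed below.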
There's a reasonable chance that SQL Server is better at this than you are, and that you should cache approximately nothing.
Like Will says, though, if you're willing to pay in terms of lost data on a crash, you might increase responsiveness by not persisting until later. But anything you can do "for free" (that is, without giving up the guarantees you'd get by writing to the DB immediately), the DB can in principle do for itself. SQL Server is quite aggressive about using almost all available free memory (and quite polite about reducing its memory usage when other apps need RAM), so any memory you use for caching is memory you're taking away from SQL Server's own caches.
We are designing an enterprise application which caches a lot of data from the back end. Users are allowed to open an arbitrary number of app windows, and each loads its own data and caches it. To manage memory consumption and avoid degrading overall OS performance, we decided to write a cache manager that automatically monitors the app's memory footprint and removes data from the cache when needed.
The problem is that we have difficulty identifying when it is time to free up memory. Currently we use a very simple approach: we just start throwing things out of the cache when the app's memory usage exceeds 80% of physical memory.
Are there any (alternative) established practices for dealing with this kind of problem?
This is basically OK. There is no really good strategy here. If there are multiple competing applications, this can lead to competition for cache space and spurious evictions.
If you pick the threshold too low, you waste cache space. If it's too high, nothing else might fit into memory, including the file cache, DLLs, ...
What do you mean by "available physical memory"? Do you mean installed memory or memory that's free? How can an app use 80% of free memory? I'm unclear on the definition that you are using.
SQL Server uses memory until the OS signals that it's low on memory (I believe that happens when 95% of "something" is being used).
You certainly do not want to use the GC to free memory. It will routinely kill your entire cache.
Maybe you could move the cache contents to disk entirely? Or you could share the cache between .NET processes by having a hidden cache-server process that can be queried by the app processes.
I want to stress that if your app consumes, say, 99% of installed RAM, performance will be very bad because the file cache will be almost empty. That means even DLLs and NGEN'ed .NET code will be paged out and back in frequently.
Maybe a better strategy is to assume that 1 GB will be needed to adequately cache the OS and app files, treating only the installed RAM minus that 1 GB as available to your app; you can then consume memory until only about 10% of that remainder is free.
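As a rough illustration of that rule of thumb (under one possible reading of it: keep 1 GB plus 10% of the remaining installed RAM free), the check could look something like this; TrimIfNeeded and the evictOldestEntry delegate are made-up names standing in for whatever your cache manager actually does:

    using System;
    using System.Diagnostics;
    using Microsoft.VisualBasic.Devices;   // reference Microsoft.VisualBasic.dll for ComputerInfo

    public static class CacheTrimmer
    {
        // Evict until free RAM is back above 1 GB (for the OS, file cache and
        // app files) plus 10% of the rest of the installed RAM.
        public static void TrimIfNeeded(Func<bool> evictOldestEntry)
        {
            double oneGb = 1024.0 * 1024 * 1024;
            double installed = new ComputerInfo().TotalPhysicalMemory;
            double reserve = oneGb + 0.10 * (installed - oneGb);

            using (PerformanceCounter freeMb = new PerformanceCounter("Memory", "Available MBytes"))
            {
                // evictOldestEntry is the cache manager's own eviction policy;
                // it should return false once there is nothing left to evict.
                while (freeMb.NextValue() * 1024 * 1024 < reserve && evictOldestEntry())
                {
                }
            }
        }
    }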
I believe that the MVC Mini Profiler stores all the response times in the HttpRuntime cache. Please let me know if I'm wrong, but if that's the case, what is the maximum size of this cache? How many operations can it profile before the cache is full? We are using the mini profiler to profile the operations of a test suite, and the test suite will grow over time, so should I be concerned?
On a related note: when all the tests have been profiled, I simply call the Save method of the mini profiler's SqlServerStorage class, and all the response times are saved to a SQL Server database. Is there any way I could call the Save method more frequently without starting and stopping the profiler again and again? We just start it at the start of the test suite and end it when all the tests have been profiled; we consider one entry in the MiniProfilers table to be one profiling session. Right now I am not able to call Save more than once because it needs a new MiniProfilerId every time it is called.
Any suggestions?
I'm not directly familiar with the mini profiler, but I do have quite a bit of experience with the cache. The HttpRuntime.Cache property provides a reference to the System.Web.Caching.Cache instance, which is ASP.NET's implementation of an object cache. In general use this cache is static, so there is only one instance, and you can configure its behavior in the Web.config file.
Some things to keep in mind about this cache: you will never get an out-of-memory error from it. The cache has a memory-percentage limit that tells it how full it is allowed to get; once it gets near that limit it starts culling objects from the cache, beginning with the least recently touched ones. So the short answer to your first question is no, don't worry about the memory limits. One of the main selling points of a managed platform is that you shouldn't have to micromanage memory consumption; let the framework handle it.
As for #2, I wouldn't worry about it either. The cache may throw away the response object itself, but I would venture a guess that it has already been included in the profiler's result aggregation, so you really shouldn't need the original request object unless you want to inspect it in depth.
Long story short, I wouldn't worry about this anymore unless you hit a real issue. Let the cache do its job, and trust that the engineers who built it knew what they were doing until you have proof otherwise.
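For reference, this is roughly how you can inspect the limits the cache is running with and hint at eviction order. The key, the value, and the Demo wrapper are made up for illustration, but the Cache members shown are the real System.Web.Caching API:

    using System;
    using System.Web;
    using System.Web.Caching;

    public static class CacheExamples
    {
        public static void Demo()
        {
            // Effective limits come from machine.config/web.config
            // (<system.web><caching><cache privateBytesLimit="..."
            //  percentagePhysicalMemoryUsedLimit="..." /></caching></system.web>).
            long bytesLimit = HttpRuntime.Cache.EffectivePrivateBytesLimit;
            long memoryPct  = HttpRuntime.Cache.EffectivePercentagePhysicalMemoryLimit;
            Console.WriteLine("Cache limits: {0} bytes, {1}% of physical memory", bytesLimit, memoryPct);

            // Insert with a sliding expiration and a low priority so this entry
            // is among the first to be culled when the cache trims itself.
            HttpRuntime.Cache.Insert(
                "profilerResults",               // made-up key, for illustration only
                new object(),                    // whatever you are caching
                null,                            // no cache dependency
                Cache.NoAbsoluteExpiration,
                TimeSpan.FromMinutes(20),        // sliding expiration
                CacheItemPriority.Low,
                null);                           // no removal callback
        }
    }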
I am trying to improve the performance of a Windows Service, developed in C# and .NET 2.0, that processes a large number of files. I want to process more files per second.
As part of its processing, the service does a database query for each file to retrieve some system parameters.
Those parameters change annually, and I am thinking that I would gain some performance if I loaded those parameters into a singleton and refreshed it periodically. Instead of making a database query for each file being processed, I would get the parameters from memory.
To complete the scenario: I am using Windows Server 2008 R2 64-bit, SQL Server 2008 as the database, and C# with .NET 2.0 as already mentioned.
Am I right in my approach? What would you do?
Thanks!
Those parameters change annually
Yes, do cache them in memory. Especially if they are large or complex.
You should take care to invalidate them at the right time once a year, depending on how accurate that has to be.
Simply caching them for an hour or even for a few minutes might be a good compromise.
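Since you're on .NET 2.0, a plain lock-based singleton with a timestamp check is enough. Here is a minimal sketch; SystemParameters and LoadFromDatabase are placeholders for your own type and for the query the service already runs per file:

    using System;

    public class SystemParameters { /* the yearly values go here */ }

    public static class SystemParametersCache
    {
        private static readonly object SyncRoot = new object();
        private static readonly TimeSpan RefreshInterval = TimeSpan.FromHours(1);
        private static SystemParameters _cached;
        private static DateTime _loadedAtUtc = DateTime.MinValue;

        public static SystemParameters Current
        {
            get
            {
                lock (SyncRoot)
                {
                    if (_cached == null || DateTime.UtcNow - _loadedAtUtc > RefreshInterval)
                    {
                        _cached = LoadFromDatabase();   // the existing query, now run rarely
                        _loadedAtUtc = DateTime.UtcNow;
                    }
                    return _cached;
                }
            }
        }

        private static SystemParameters LoadFromDatabase()
        {
            // Placeholder: run the same query the service currently issues per file.
            throw new NotImplementedException();
        }
    }

The hourly refresh interval is the compromise mentioned above; tighten or relax it depending on how quickly the yearly change must be picked up.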
Accessing data in RAM is definitely faster than any other kind of data access, except for CPU-level storage such as registers and the CPU caches.
Caching would pay off even if the data changed every minute, so yes, caching that query will be much faster.
Crossing a network or going to disk is always orders of magnitude slower than in-memory access.
Databases can cache data in memory too, so if you can achieve that and you're not crossing a network, the database might be faster, since its data access patterns, indexes, etc. may be faster than your code. But that's the best case; if you need it faster, in-memory caches help.
But be aware that in-memory caches add complexity and bugs. You have to determine the lifetime of the cached data and how to refresh it, and the more complex it is, the more weird edge-case state bugs you will have. Even though the parameters only change annually, you still have to handle that changeover.
We have an ASP.NET 4.0 application that pulls from a database a complex data structure that takes over 12 hours to build into an in-memory data structure (which is later stored in HttpRuntime.Cache). The size of the data structure is increasing quickly, and we can't keep waiting 12+ hours to get it into memory whenever the application restarts. This is a major issue if you want to change the web.config or any code in the web application, since that causes a restart: it means a long wait before the application can be used, which hinders development and deployment updates.
The data structure MUST be in memory to work at a speed that makes the website usable. In-memory databases such as memcache or Redis are slow in comparison to HttpRuntime.Cache and would not work in our situation (in-memory DBs have to serialize on put/get; cached objects can't reference each other directly, only look each other up by key, which degrades performance; and with a large number of keys, performance drops quickly). Performance is a must here.
What we would like to do is quickly dump the HttpRuntime.Cache to disk before the application ends (on a restart), and be able to load it back immediately when the application starts again (hopefully within minutes instead of 12+ hours or days).
The in-memory structure is around 50GB.
Is there a solution to this?
In memory databases such as memcache or Redis are slow in comparison to HttpRuntime.Cache
Yes, but they are very fast compared to a 12+ hour spin-up. Personally, I think you're taking the wrong approach here in forcing the load of a 50 GB structure. Just a suggestion, but we use HttpRuntime.Cache as part of a multi-tier caching strategy:
the local cache is checked first
otherwise redis is used as the next tier of cache (which is faster than the underlying database, persistent, and shared by any number of app servers); the local cache is then updated
otherwise, the underlying database is hit (and then both redis and the local cache are updated)
The point being, at load we don't require anything in memory - it is filled as it is needed, and from then on it is fast. We also use pub/sub (again courtesy of redis) to ensure cache invalidation is prompt. The net result: it is fast enough when cold, and very fast when warm.
Basically, I would look at anything that avoids needing the 50GB data before you can do anything.
If this data isn't really cache but is your data, I would look at serialization of a proper object model. I would suggest protobuf-net (I'm biased, as the author) as a strong candidate here: very fast, with very small output.
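A rough sketch of that tiered lookup is below. The IRemoteCache interface is a made-up abstraction standing in for a redis client (so as not to misrepresent any particular client's API), and values are passed around as byte arrays on the assumption that they are serialized with something like protobuf-net:

    using System;
    using System.Web;
    using System.Web.Caching;

    // Made-up abstraction over redis (or any out-of-process cache tier).
    public interface IRemoteCache
    {
        byte[] Get(string key);
        void Set(string key, byte[] value, TimeSpan ttl);
    }

    public class TieredCache
    {
        private readonly IRemoteCache _remote;
        private readonly Func<string, byte[]> _loadFromDatabase;   // last resort

        public TieredCache(IRemoteCache remote, Func<string, byte[]> loadFromDatabase)
        {
            _remote = remote;
            _loadFromDatabase = loadFromDatabase;
        }

        public byte[] Get(string key)
        {
            // 1. Local, in-process cache (fastest).
            byte[] value = HttpRuntime.Cache[key] as byte[];
            if (value != null) return value;

            // 2. Remote tier: survives app restarts and is shared across servers.
            value = _remote.Get(key);
            if (value == null)
            {
                // 3. Underlying database, warming the remote tier on the way back.
                value = _loadFromDatabase(key);
                _remote.Set(key, value, TimeSpan.FromHours(1));
            }

            // Warm the local cache with a short sliding expiration.
            HttpRuntime.Cache.Insert(key, value, null,
                Cache.NoAbsoluteExpiration, TimeSpan.FromMinutes(5));
            return value;
        }
    }

Cache invalidation via redis pub/sub, as mentioned above, would then simply remove the key from HttpRuntime.Cache on each app server when a publish arrives.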
As I said above, I want to know: what are all the disadvantages of using a cache? Is it good to use caching on a website?
I don't see any disadvantages to using the cache.
The only disadvantages, if you can call them that, come from incorrect usage.
There are several potential problems when using the cache though:
You will experience increased memory usage if you store objects in memory instead of a database
You may end up storing objects in cache that you don't want there (old objects or dynamic data for instance)
You may cache too much, causing your application's performance to degrade since the cache eats all the server's resources
You may cache too little, ending up with a system of increased complexity and no performance gain
You may cache the wrong data
And so on. Caching is hard, but used correctly it is a Good Thing.
Caching is a good thing. It will help your site run faster and avoid downloading the same content over and over again. Of course, you should avoid caching dynamically generated pages.
Another problem is the caching of images and similar resources. If you do cache them, it will be tricky to update them when the need arises. You should always choose cache lifetimes carefully, making a compromise between faster loading and how quickly updates propagate.
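For the image case, one common approach in ASP.NET is to send explicit cache headers with a bounded lifetime, so browsers reuse the file but pick up changes after the chosen window. A sketch (the handler name, the path, and the one-week figure are all arbitrary):

    using System;
    using System.Web;

    // Serves an image with a one-week client-side cache window.
    public class ImageHandler : IHttpHandler
    {
        public bool IsReusable { get { return true; } }

        public void ProcessRequest(HttpContext context)
        {
            context.Response.ContentType = "image/png";
            context.Response.Cache.SetCacheability(HttpCacheability.Public);
            context.Response.Cache.SetExpires(DateTime.UtcNow.AddDays(7));   // the compromise: fast loads vs. weekly refresh
            context.Response.Cache.SetMaxAge(TimeSpan.FromDays(7));
            context.Response.WriteFile(context.Server.MapPath("~/images/logo.png"));   // made-up path
        }
    }

A common complement is to version the file name or query string (logo.png?v=2) so that an updated image is fetched immediately even within the cache window.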