I'm having some trouble getting my cache to work the way I want.
The problem:
The process of retrieving the requested data is very time consuming. If using standard ASP.NET caching some users will take the "hit" of retrieving the data. This is not acceptable.
The solution?:
It is not super important that the data is 100% current. I would like to serve old invalidated data while updating the cached data in another thread making the new data available for future requests. I reckon that the data needs to be persisted in some way in order to be able to serve the first user after application restart without that user taking the "hit".
I've made a solution which does somewhat of the above, but I'm wondering if there is a "best practice" way or of there is a caching framework out there already supporting this behaviour?
There are tools that do this, for example Microsofts ISA Server (may be a bit expensive / overkill).
You can cache it in memory using Enterprise Libary Caching. Let your users read from Cache, and have other pages that update the Cache, these other pages should be called as regularly as you need to keep the data upto date.
You could listen when the Cached Item is Removed and Process then,
public void RemovedCallback(String k, Object v, CacheItemRemovedReason r)
{
// Put Item Back IN Cache, ( so others can use it until u have finished grabbing the new data)
// Spawn Thread to Go Get Up To Date Data
// Over right Old data with new return...
}
in global asax
protected void Application_Start(object sender, EventArgs e)
{
// Spawn worker thread to pre-load critical data
}
Ohh...I have no idea if this is best practice, i just thought it would be slick~
Good Luck~
I created my own solution with a Dictionary/Hashtable in memory as a duplicate of the actual cache. When a method call came in requesting the object from cache and it wasn't there but was present in memory, the memory stored object was returned and fired a new thread to update the object in both memory and the cache using a delegate method.
You can do this pretty easily with the Cache and Timer classes built into .NET. The Timer runs on a separate thread.
And I actually wrote a very small wrapper library called WebCacheHelper which exposes this functionality in an overloaded constructor. The library also serves as a strongly typed wrapper around the Cache object.
Here's an example of how you could do this...
public readonly static WebCacheHelper.Cache<int> RegisteredUsersCount =
new WebCacheHelper.Cache<int>(new TimeSpan(0, 5, 0), () => GetRegisteredUsersCount());
This has a lazy loading aspect to it where GetRegisteredUsersCount() will be executed on the calling thread the instant that RegisteredUsersCount is first accessed. However, after that it's executed every 5 minutes on a background thread. This means that the only user who will be penalized with a slow wait time will be the very first user.
Then getting the value is as simple as referencing RegisteredUsersCount.Value.
Yeah you could just cache the most frequently accessed data when your app starts but that still means the first user to trigger that would "take the hit" as you say (assuming inproc cache of course).
What I do in this situation is using a CacheTable in db to cache the latest data, and running a background job (with a windows service. in a shared environment you can use threads also) that refreshes the data on the table.
There is a very little posibility to show user a blank screen. I eliminate this by also caching via asp.net cache for 1 minute.
Don't know if it's a bad design, but it's working great without a problem on a highly used web site.
Related
I have a web-service which is called by some web-service clients. This web-service returns the current inventory list of an inventory. This list can be big, 10K+ of product IDs, and it takes quite some time (~4 minutes) to refresh the list by reading data in the database. I don't want to refresh the list every time this web-service is called, as it may consume too much resource on my database server, and the performance is always not very good.
What I intend to do is giving the inventory list some time-to-live value, which means when a client asks for the inventory list, if the data is not out-of-date I just return the data right away; if the data is obsolete I will read it from the database, update this list data and its time-to-live value, and then return the refreshed data back to the web-service client. As there may be several clients call this web-service, It looks like I need a multi-thread synchronization(multiple-read single-write, ReaderWriterLockSlim class?) to protect this inventory list, but I haven't found a good design to make this web-service have good performance: only one client refreshes the data, the other clients don't have to redo the work if the data is still within the time-to-live frame and the web-service should return the result as soon as possible after the write thread completes the update.
I also think about another solution (also use ReaderWriterLockSlim class): creating a separate thread to refresh the inventory list periodically (write-thread refreshes the data every 15 minutes), and let all the web-service clients use read-thread to read the data. This may work, but I don't really like it as this solution still waste resource of the web-server. For example, if there is no client's request, the system still has to refresh the inventory list data every 15 minutes.
Please suggest some solution. Thanks.
I would suggest using a MemoryCache.
https://stackoverflow.com/a/22935978/34092 can be used to detect when the item expires. https://msdn.microsoft.com/en-us/library/7kxdx246.aspx is also worth a read.
At this point, the first line of the code you write (in CacheRemovedCallback) should write the value back to the MemoryCache - this will allow readers to keep reading it.
Then it should get the latest data, and then write that new data to the MemoryCache (again passing in a CacheItemPolicy to allow the callback to be called when the latest version is removed). And so on and so on...
Do you ever run only one instance of your service? Then in-memory caching is enough for you. Or use a ConcurrentDictionary if you don't want to implement the lock yourself.
If you run multiple instances of that services, it might be advisable to use a out of process cache like Redis.
Also, you could eventually maintain the cached list so that it is always in sync with what you have in the database!?
There are many different cache vendors dotnet and asp.net and asp.net core have different solutions. For distributed caching there are also many different options. Just pick whatever fits best for the framework you use.
You can also use libraries like CacheManager to help you implement what you need more easily.
In my ASP.NET-MVC-application I store information in static classes with static vars. But ASP.NET is recycling all data and threads after and my "App_Start"-procedure will call after the cleanup.
I solved the problem with the backup-tasks with HangFire.
But to generate the static-class, I need a long time. The first request after the recycle has to wait while the static-classes are set up.
Why the delay? I am using the EntityFramework and for correct handling I need all records from the database with their relations.
So I hold all records with static-classes and use the database as 2nd strategy.
I have no ideas what I can do to improve performance.
My first idea was to serialize the complete data - but how is the performance for deserialize a ArrayList with 2K or more records?
Is there a way to prevent the recyclefor my static ArrayList?
I'd recommend you use the application cache mechanism for ASP.NET instead. However, by default, the cache is still in-memory and maintained within the process, so app pool recycles would still wipe it out. The solution is to change the storage location of the application cache so it's in a different process. See this answer for some recommendations for how to store your application cache.
In short, I wouldn't recommend trying to avoid app pool recycling since it can really save you a lot of trouble.
In my client-server architecture I have few API functions which usage need to be limited.
Server is written in .net C# and it is running on IIS.
Until now I didn't need to perform any synchronization. Code was written in a way that even if client would send same request multiple times (e.g. create sth request) one call will end with success and all others with error (because of server code + db structure).
What is the best way to perform such limitations? For example I want no more that 1 call of API method: foo() per user per minute.
I thought about some SynchronizationTable which would have just one column unique_text and before computing foo() call I'll write something like foo{userId}{date}{HH:mm} to this table. If call end with success I know that there wasn't foo call from that user in current minute.
I think there is much better way, probably in server code, without using db for that. Of course, there could be thousands of users calling foo.
To clarify what I need: I think it could be some light DictionaryMutex.
For example:
private static DictionaryMutex FooLock = new DictionaryMutex();
FooLock.lock(User.GUID);
try
{
...
}
finally
{
FooLock.unlock(User.GUID);
}
EDIT:
Solution in which one user cannot call foo twice at the same time is also sufficient for me. By "at the same time" I mean that server started to handle second call before returning result for first call.
Note, that keeping this state in memory in an IIS worker process opens the possibility to lose all this data at any instant in time. Worker processes can restart for any number of reasons.
Also, you probably want to have two web servers for high availability. Keeping the state inside of worker processes makes the application no longer clustering-ready. This is often a no-go.
Web apps really should be stateless. Many reasons for that. If you can help it, don't manage your own data structures like suggested in the question and comments.
Depending on how big the call volume is, I'd consider these options:
SQL Server. Your queries are extremely simple and easy to optimize for. Expect 1000s of such queries per seconds per CPU core. This can bear a lot of load. You can use a SQL Express for free.
A specialized store like Redis. Stack Overflow is using Redis as a persistent, clustering-enabled cache. A good idea.
A distributed cache, like Microsoft Velocity. Or others.
This storage problem is rather easy because it fits a key/value store model well. And the data is near worthless so you don't even need to backup.
I think you're overestimating how costly this rate limitation will be. Your web-service is probably doing a lot more costly things than a single UPDATE by primary key to a simple table.
I'm trying to use the cache in an ASP.NET MVC web application to store some list data that rarely updates. I insert this data into the cache in an UpdateCache() method like this:
HttpContext.Current.Cache.Insert("MyApp-Products", products, null, DateTime.Now.AddYears(99), Cache.NoSlidingExpiration);
Then in my model I retrieve it:
public static List<string> GetProducts()
{
var cachedProducts = HttpContext.Current.Cache["MyApp-Products"];
if (cachedProducts == null)
{
UpdateCache();
cachedProducts = HttpContext.Current.Cache["MyApp-Products"];
}
return ((List<string>)cachedProducts );
}
The first time I visit the page, UpdateCache() is called as expected. If I refresh, the data comes from the cache and I do not need to call UpdateCache(). However, after maybe 15 mins, I come back to the app and the cached values are gone. My understanding was that this cache was per application, not session, so I would have expected it to still be there for myself or another user.
Is there something wrong with the way I'm storing the cache? Or is there something I'm not understanding about how the Cache in ASP.NET works in a web app?
My understanding was that this cache was per application, not
session, so I would have expected it to still be there for myself or
another user.
While the cache is per application, ASP.NET doesn't provide you any guarantees that if you stored something into the cache you will find it back there. The cache could be evicted under different circumstances such as for example your server starts running low on memory and so on. You can subscribe to an event and get notified when an item is evicted from the cache. You can also define a priority when caching an item. The higher the priority, the lower the chance of this item getting evicted.
Also since the cache is stored in the memory of the web server (by default) you should not forget the fact that your application domain could be recycled at any point by IIS. For example after a certain amount of inactivity or if it starts running low on memory or even if a certain CPU threshold usageis reached, ... and everything that is stored in the memory including the cache will simply disappear into the void and the next request in your application will start a new AppDomain.
But in any cases make sure that you check if the item is present in the cache before using it. Never rely on the fact that if you stored something into it, you will find it.
All this blabla to come to the really essential point which is something very worrying with your code. It looks like you are storing a List<string> instance into the cache. But since the cache is per application, this single instance of List<string> could be shared between multiple users of the application and of course this could happen concurrently. And as you know List<T> is not a thread safe structure. So with this code you will, at best, get an exception and at worst you will get corrupt data. So be very careful what you are caching and how you are synchronizing the access to the structure especially if you are caching a non thread-safe class.
IF this is on a full IIS, and happens around every 15minuntes. Remember to check the Idle timeout value.
That being said, if this list "never" changes, why not store it in a static array instead.
I can see how storing data in general in database is better than in static object.
For one, in case data does change, it is easier to update DB than the application.
Try explicitly setting absolute expiration when caching your object:
HttpRuntime.Cache.Insert("MyApp-Products", cachedProducts, null, DateTime.Now.AddDays(20), TimeSpan.Zero );
Note HttpRuntime is used instead of HttpContext for performance reasons even though the difference is minor.
At the new place I am working, I've been tasking with developing a web-application framework. I am new (6 months ish) to the ASP.NET framework and things seem pretty straight forward, but I have a few questions that I'd like to ask you ASP professionals. I'll note that I am no stranger to C#.
Long life objects/Caching
What is the preferred method to deal with objects that you don't want to re-initialize every time a page is it? I noticed that there was a cache manager that can be used, but are there any caveats to using this? For example, I might want to cache various things and I was thinking about writing a wrapper around the cache that prefixed cache names so that I could implement different caches using the same underlying .NET cache manager.
1) Are there any design considerations I need to think about the objects that I am want to cache?
2) If I want to implement a manager of some time that is around during the lifetime of the web application (thread-safe, obviously), is it enough to initialize it during app_start and kill it in app_end? Or is this practiced frowned upon and any managers are created uniquely in the constructor/init method of the page being served.
3) If I have a long-term object initialized at app start, is this likely to get affected when the app pool is recycled? If it is destroy at app end is it a case of it simply getting destroyed and then recreated again? I am fine with this restriction, I just want to get a little clearer :)
Long Life Threads
I've done a bit of research on this and this question is probably redundant. It seems it is not safe to start a worker thread in the ASP.NET environment and instead, use a windows service to do long-running tasks. The latter isn't exactly a problem, the target environments will have the facility to install services, but I just wanted to double check that this was absolutely necessary. I understand threads can throw exceptions and die, but I do not understand the reasoning behind prohibiting them. If .NET provided a a thread framework that encompassed System.Thread, but also provided notifications for when the Application Server was going to recycle the App-Pool, we could actually do something about it rather than just keel over and die at the point we were stopped.
Are there any solutions to threading in ASP.NET or is it basically "service"?
I am sure I'll have more queries, but this is it for now.
EDIT: Thankyou for all the responses!
So here's the main thing that you're going to want to keep in mind. The IIS may get reset or may reset itself (based on criteria) while you're working. You can never know when that will happen unless it stops rendering your page while you're waiting on the response (in which case you'll get a browser notice that the page stopped responding, eventually.
Threads
This is why you shouldn't use threads in ASP.NET apps. However, that's not to say you can't. Once again, you'll need to configure the IIS engine properly (I've had it hang when spawning a lot of threads, but that may have been machine dependent). If you can trust that nobody will cause ASP.NET to recompile your code/restart your application (by saving the web.config, for instance) then you will have less issues than you might otherwise.
Instead of running a Windows service, you could use an ASMX or WCF service which also run on IIS/.NET. That's up to you. But with multiple service pools it allows you to keep everything "in the same environment" as far as installations and builds are concerned. They obviously don't share the same processpool/memoryspace.
"You're Wrong!"
I'm sure someone will read this far and go "but you can't thread in ASP.NET!!!" so here's the link that shows you how to do it from that venerable MSDN http://msdn.microsoft.com/en-us/magazine/cc164128.aspx
Now onto Long life objects/Caching
Caching
So it depends on what you mean by caching. Is this per user, per system, per application, per database, or per page? Each is possible, but takes some contrivance and complexity, depending on needs.
The simplest way to do it per page is with static variables. This is also highly dangerous if you're using it for user-code-stuff because there's no indication to the end user that the variable is going to change, if more than one users uses the page. Instead, if you need something to live with the user while they work with the page in particular, you could either stuff it into session (serverside caching, stays with the user, they can use it across multiple pages) or you could stick it into ViewState.
The cachemanager you reference above would be good for application style caching, where everyone using the webapp can use the same datastore. That might be good for intensive queries where you want to get the values back as quickly as possible so long as they're not stale. That's up to you to decide. Also, things like application settings could be stored there, if you use a database layer for storage.
Long term cache objects
You could initialize it in the app_start with no problem, and the same goes for destroying it at the end if you felt the need, but yes, you do need to watch out for what I described at first about the system throwing all your code out and restarting.
Keel over and die
But you don't get notified when you're (the app pool here) going to be restarted (as far as I know) so you can pretty much keel over and die on anything. Always assume the app is going to go down on you before your request, and that every request is the first one.
Really tho, that just leads back into web-design in the first place. You don't know that this is the first visitor or the fifty millionth (unless you're storing that information in memory of course) so just like the app is stateless, you also need to plan your architecture to be stateless as much as possible. That's where web-apps are great.
If you need state on a regular basis, consider sticking with desktop apps. If you can live with stateless-ness, welcome to ASP.NET and web development.
1) The main thing about caching is understanding the lifetime of the cache, and the effects of caching (particularly large) objects in cache. Consider caching a 1MB object in memory that is generated each time your default.aspx page is hit; and after a year of production you're getting 10,000 hits an hour, and object lifetime is 2 hours. You can easily chew up TONS of memory, which can affect performance, and also may cause things to be prematurely expired from the cache, which in turn can cause other issues. As long as you understand the effects of all of this, you're fine.
2) Starting it up in Application_Start and shutting it down in Application_End is fine. You can also implement a custom HttpApplication with an http module.
3) Yes, when your app pool is recycled it calls Application_End and everything is shutdown and destroyed.
4) (Threads) The issue with threads comes up in relation to scaling. If you hit that default.aspx page, and it fires up a thread, and that page gets hit 10,000 in 2 minutes, you could potentially have a ton of threads running in your application pool. Again, as long as you understand the ramifications of firing up a thread, you can do it. ThreadPool is another story, the asp.net runtime uses the ThreadPool to process requests, so if you tie up all the threadpool threads, your application can potentially hang because there isn't a thread available to process the request.
1) Are there any design considerations I need to think about the objects that I am want to cache?
2) If I want to implement a manager of some time that is around during the lifetime of the web application (thread-safe, obviously), is it enough to initialize it during app_start and kill it in app_end? Or is this practiced frowned upon and any managers are created uniquely in the constructor/init method of the page being served.
There's a difference between data caching and output caching. I think you're looking for data caching which means caching some object for use in the application. This can be done via HttpContext.Current.Cache. You can also cache page output and differentiate that on conditions so the page logic doesn't have to run at all. This functionality is also built into ASP.NET. Something to keep in mind when doing data caching is that you need to be careful about the scope of the things you cache. For example, when using Entity Framework, you might be tempted to cache some object that's been retrieved from the DB. However, if your DB Context is scoped per request (a new one for every user visiting your site, probably the correct way) then your cached object will rely on this DB Context for lazy loading but the DB Context will be disposed of after the first request ends.
3) If I have a long-term object initialized at app start, is this likely to get affected when the app pool is recycled? If it is destroy at app end is it a case of it simply getting destroyed and then recreated again? I am fine with this restriction, I just want to get a little clearer :)
Perhaps the biggest issue with threading in ASP.NET is that it runs in the same process as all your requests. Even if this weren't an issue in and of itself, IIS can be configured (and if you don't own the servers almost certainly will be configured) to shut down the app if it's inactive (which you mentioned) which can cause issues for these threads. I have seen solutions to that including making sure IIS never recycles the app pool to spawning a thread that hits the site to keep it alive even on hosted servers