Caching and multi-thread synchronization with ReaderWriterLockSlim - C#

I have a web service that is called by several web service clients. This web service returns the current inventory list of an inventory. The list can be big, 10K+ product IDs, and it takes quite some time (~4 minutes) to refresh it by reading data from the database. I don't want to refresh the list every time this web service is called, as that may consume too much resource on my database server, and the performance would suffer.
What I intend to do is give the inventory list a time-to-live value: when a client asks for the inventory list, if the data is not out of date I return it right away; if the data is obsolete I read it from the database, update the list and its time-to-live value, and then return the refreshed data to the client. As several clients may call this web service, it looks like I need multi-thread synchronization (multiple-reader single-writer, the ReaderWriterLockSlim class?) to protect this inventory list, but I haven't found a design that gives good performance: only one client should refresh the data, the other clients shouldn't redo the work while the data is still within its time-to-live window, and the web service should return the result as soon as possible after the write thread completes the update.
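To illustrate, here is a rough sketch of the double-checked pattern I have in mind (LoadFromDatabase is a placeholder for the real ~4-minute query):

using System;
using System.Collections.Generic;
using System.Threading;

public class InventoryCache
{
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
    private readonly TimeSpan _ttl = TimeSpan.FromMinutes(15);
    private List<string> _inventory = new List<string>();
    private DateTime _loadedAt = DateTime.MinValue;

    public List<string> Get()
    {
        _lock.EnterReadLock();
        try
        {
            if (DateTime.UtcNow - _loadedAt <= _ttl)
                return _inventory;                 // fresh data: concurrent fast path
        }
        finally { _lock.ExitReadLock(); }

        _lock.EnterWriteLock();                    // stale data: threads queue here
        try
        {
            if (DateTime.UtcNow - _loadedAt > _ttl)   // re-check: another thread
            {                                          // may have refreshed already
                _inventory = LoadFromDatabase();
                _loadedAt = DateTime.UtcNow;
            }
            return _inventory;                     // returned right after the update
        }
        finally { _lock.ExitWriteLock(); }
    }

    private List<string> LoadFromDatabase()
    {
        return new List<string>();                 // placeholder for the slow DB read
    }
}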
I have also thought about another solution (also using ReaderWriterLockSlim): create a separate thread that refreshes the inventory list periodically (a write thread refreshes the data every 15 minutes), and let all the web service clients read the data under read locks. This may work, but I don't really like it, as it still wastes resources on the web server: even if there are no client requests, the system still has to refresh the inventory list every 15 minutes.
Please suggest a solution. Thanks.

I would suggest using a MemoryCache.
https://stackoverflow.com/a/22935978/34092 can be used to detect when the item expires. https://msdn.microsoft.com/en-us/library/7kxdx246.aspx is also worth a read.
At this point, the first line of the code you write (in the removed callback) should write the value back to the MemoryCache - this will allow readers to keep reading it.
Then it should fetch the latest data and write that new data to the MemoryCache (again passing in a CacheItemPolicy so the callback is called when the latest version is removed). And so on and so on...
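A minimal sketch of that loop, assuming System.Runtime.Caching (LoadInventory is a hypothetical stand-in for the slow database read; note the reason check, which stops the re-inserts from retriggering the callback):

using System;
using System.Runtime.Caching;

public static class InventoryCache
{
    private const string Key = "InventoryList";

    public static void Insert(object data)
    {
        var policy = new CacheItemPolicy
        {
            AbsoluteExpiration = DateTimeOffset.Now.AddMinutes(15),
            RemovedCallback = OnRemoved            // re-armed on every insert
        };
        MemoryCache.Default.Set(Key, data, policy);
    }

    private static void OnRemoved(CacheEntryRemovedArguments args)
    {
        // Only react to TTL expiry; the Set calls below replace the entry,
        // which also fires this callback (reason Removed) and would loop.
        if (args.RemovedReason != CacheEntryRemovedReason.Expired) return;

        // First line: put the expired value straight back so readers
        // keep getting an answer while the refresh runs.
        Insert(args.CacheItem.Value);

        // Then fetch the latest data and overwrite the stale copy; the new
        // policy means this callback fires again at the next expiry.
        Insert(LoadInventory());
    }

    private static object LoadInventory()
    {
        return new object(); // placeholder for the ~4 minute DB query
    }
}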

Do you only ever run one instance of your service? Then in-memory caching is enough for you. Use a ConcurrentDictionary if you don't want to implement the locking yourself.
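For the single-instance case, a minimal sketch of a TTL cache built on ConcurrentDictionary (my own illustration, not a library API; the Lazy<T> wrapper ensures only one caller runs the expensive load per expiry):

using System;
using System.Collections.Concurrent;

public class TtlCache<TKey, TValue>
{
    private class Entry
    {
        public TValue Value;
        public DateTime LoadedAt;
    }

    private readonly ConcurrentDictionary<TKey, Lazy<Entry>> _entries =
        new ConcurrentDictionary<TKey, Lazy<Entry>>();
    private readonly TimeSpan _ttl;
    private readonly Func<TKey, TValue> _load;

    public TtlCache(TimeSpan ttl, Func<TKey, TValue> load)
    {
        _ttl = ttl;
        _load = load;
    }

    public TValue Get(TKey key)
    {
        Lazy<Entry> lazy = _entries.GetOrAdd(key, Create);
        if (DateTime.UtcNow - lazy.Value.LoadedAt > _ttl)
        {
            // Replace the expired entry; TryUpdate ensures only one thread
            // swaps in a new Lazy, so the load still runs exactly once.
            _entries.TryUpdate(key, Create(key), lazy);
            lazy = _entries[key];
        }
        return lazy.Value.Value;
    }

    private Lazy<Entry> Create(TKey key) =>
        new Lazy<Entry>(() => new Entry { Value = _load(key), LoadedAt = DateTime.UtcNow });
}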
If you run multiple instances of the service, it might be advisable to use an out-of-process cache like Redis.
Also, you could even maintain the cached list continuously so that it is always in sync with what you have in the database.
There are many different cache vendors; .NET, ASP.NET, and ASP.NET Core each have their own solutions. For distributed caching there are also many options. Just pick whatever fits the framework you use best.
You can also use libraries like CacheManager to help you implement what you need more easily.

Related

Pattern or library for caching data from web service calls and update in the background

I'm working on a web application that uses a number of external data sources for data we need to display on the front end. Some of the external calls are expensive and some also come with a monetary cost, so we need a way to persist the results of these external requests so they survive e.g. an app restart.
I've started with a proof of concept, and my current solution is a combination of a persistent cache/storage (stores serialized JSON in files on disk) and a runtime cache. When the app starts it populates the runtime cache from the persistent cache; if the persistent cache is empty it goes ahead and calls the web services. The next time the app restarts we load from the persistent cache, avoiding the calls to the external sources.
After the first population we want the cache to be updated in the background by some kind of update process on a given schedule. We also want this update process to be smart enough to only update the cache if the request to the web service was successful, and otherwise keep the old version. There's also a twist here: some web services return a complete collection while others require one call per entity, so the update process might differ depending on the concrete web service.
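To illustrate, a stripped-down sketch of the two layers in my proof of concept (names and the choice of System.Text.Json are placeholders):

using System;
using System.IO;
using System.Text.Json;

public class PersistentCache<T>
{
    private readonly string _filePath;
    private readonly Func<T> _fetchFromWebService;

    public PersistentCache(string filePath, Func<T> fetchFromWebService)
    {
        _filePath = filePath;
        _fetchFromWebService = fetchFromWebService;
    }

    // Runtime copy; populated once at app start.
    public T Value { get; private set; }

    public void Initialize()
    {
        if (File.Exists(_filePath))
        {
            // Warm start: load the serialized result of the last successful call.
            Value = JsonSerializer.Deserialize<T>(File.ReadAllText(_filePath));
        }
        else
        {
            // Cold start: pay for the external call once, then persist it.
            Refresh();
        }
    }

    // Scheduled background update: only overwrite on success,
    // otherwise the old data is kept.
    public void Refresh()
    {
        var fresh = _fetchFromWebService(); // throws on failure -> old value kept
        File.WriteAllText(_filePath, JsonSerializer.Serialize(fresh));
        Value = fresh;
    }
}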
I'm thinking this scenario can't be totally unique, so I've looked around and done a fair bit of Googling, but I haven't found any patterns or libraries that deal with something like this.
So what I'm looking for is any patterns that might be useful for us, and any C# libraries or articles on the subject; I don't want to reinvent the wheel. If anyone has solved similar problems I would love to hear how you approached them.
Thank you so much!

Web api cache architecture

I'm a .NET Web API developer and I want to know if I'm doing things correctly.
I'm saving changeable (mutable) objects in the cache.
Other developers on my team said only static data should be stored in the cache.
So I wanted to know whether only static data should be stored in the cache, or whether there's another right way to do it.
Thanks.
I use caching for changeable objects because they take a reasonable amount of time to generate, although the frequency of their changing varies.
There are a couple of things which I do to try and make sure the data is always valid.
On the cached item I put a policy that keeps the item in cache for, say, 15 minutes, with a sliding expiration. This keeps frequently used items in cache but drops less-used items.
I also have cache-eviction endpoints on the API, and the process that updates the data in the database calls the relevant endpoint once it completes. The updated items are then removed from the cache and hence rebuilt the next time they are requested.
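A minimal sketch of both pieces, assuming System.Runtime.Caching's MemoryCache (key names are hypothetical):

using System;
using System.Runtime.Caching;

public static class ProductCache
{
    // Keep an item alive for 15 minutes after its last access (sliding),
    // so hot items stay cached while cold ones age out.
    public static void Store(string key, object item)
    {
        var policy = new CacheItemPolicy
        {
            SlidingExpiration = TimeSpan.FromMinutes(15)
        };
        MemoryCache.Default.Set(key, item, policy);
    }

    // Eviction endpoint body: the DB update process calls this when it
    // finishes, so the stale entry is rebuilt on the next request.
    public static void Evict(string key)
    {
        MemoryCache.Default.Remove(key);
    }
}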
In the end I think it all boils down to how long it takes to get the object you are trying to return, and whether the delay to generate it is acceptable.

Conflict resolution in a distributed list

I would like to maintain a list of objects that is distributed across N load-balanced servers: whenever a client changes the list on one server, I would like these changes to propagate to the other servers. So, I guess this is a case of master-master replication.
What is the simplest way of handling this? One simplifying fact is that each change to an object in the list carries an increasing version number. This makes it possible to resolve conflicts when an item is changed on two different servers and the two deltas make their way to a third server.
Edit: a clarification. I am quite familiar with distributed key-value stores like memcached and Redis. That is not the issue here; what I am interested in is a mechanism for resolving conflicts in a shared list: if server A changes an item in the list and server B removes the same item, for example, how do I resolve the conflict programmatically?
I suggest memcached. It's a distributed server cache system that seems to fit your needs perfectly. Check out this link:
Which .NET Memcached client do you use, EnyimMemcached vs. BeITMemcached?
If passing the entire list around doesn't suit you (I don't know whether memcached is smart enough to diff your lists), then I would suggest giving the old DataSet object a look, as its diffgrams are well suited to passing around just the deltas when your data set is large.
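A rough sketch of the diffgram round-trip, assuming two DataSets with the same schema (file-based transport here is just for illustration):

using System.Data;

public static class ListSync
{
    public static void WriteDelta(DataSet local, string path)
    {
        DataSet changes = local.GetChanges();          // only the modified rows
        changes?.WriteXml(path, XmlWriteMode.DiffGram);
    }

    public static void ApplyDelta(DataSet local, string path)
    {
        var delta = new DataSet();
        delta.ReadXml(path, XmlReadMode.DiffGram);
        local.Merge(delta);                            // fold the deltas in
        local.AcceptChanges();
    }
}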
Put your changes in a queue. Have each server look at the queue, and act upon it.
For example, the queue could contain:
add item #33
remove item #55
update item #22
and so on
Upon making a change, a server writes it to the queue; each server picks up items from the queue and updates its own list accordingly.
I built an in-memory database with this method, and it worked perfectly across multiple 'servers'.
EDIT:
When servers want to update each other, this is what has to happen:
Each server that makes an update puts an UPDATE (or ADD or DELETE) request into the queue for all the other servers. Each server should also keep track of the queued requests that originated from it, so it does not load its own updates from the queue.
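A rough in-process sketch of the scheme (the Guid server ids and per-server inboxes are my own simplifications; in production the queues would live in a message broker or shared store, and the list items would hold real data rather than just a version number):

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public enum ChangeKind { Add, Update, Delete }

public class Change
{
    public Guid Origin;
    public ChangeKind Kind;
    public int ItemId;
    public int Version;
}

public class ReplicatedList
{
    public Guid Id { get; } = Guid.NewGuid();
    public ConcurrentQueue<Change> Inbox { get; } = new ConcurrentQueue<Change>();

    private readonly IList<ReplicatedList> _peers;
    private readonly Dictionary<int, int> _items = new Dictionary<int, int>(); // id -> version

    public ReplicatedList(IList<ReplicatedList> peers) { _peers = peers; }

    // Local change: apply it, then broadcast to every other server's inbox,
    // so a server never re-loads its own updates.
    public void Publish(ChangeKind kind, int itemId, int version)
    {
        var change = new Change { Origin = Id, Kind = kind, ItemId = itemId, Version = version };
        Apply(change);
        foreach (var peer in _peers)
            if (peer.Id != Id) peer.Inbox.Enqueue(change);
    }

    // Called periodically on each server to drain its inbox.
    public void Pump()
    {
        Change change;
        while (Inbox.TryDequeue(out change)) Apply(change);
    }

    private void Apply(Change c)
    {
        int current;
        // The increasing version number resolves conflicts:
        // an older delta never overwrites a newer one.
        if (_items.TryGetValue(c.ItemId, out current) && current >= c.Version) return;
        if (c.Kind == ChangeKind.Delete) _items.Remove(c.ItemId);
        else _items[c.ItemId] = c.Version;
    }
}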
Does each server have its own locally cached version of the list, or do you plan to use a centralized caching layer?
As suggested, you can have a centralized "push" process that works off a centralized queue. Any changes submitted by any server are enqueued, and the "push" process pushes updates to all the servers via some remoting / web-service mechanism.
This offers the advantage of changes/updates/deletes being applied to all the servers at once (or close in time), plus centralized validation and logging if needed. It also solves the problem of multiple updates: the latest one takes precedence.
I've seen this implemented as a Windows service with an internal queue (which can be persisted to the DB asynchronously for resiliency). It manages the queue by taking items one by one, validating each item, logging the change/content, and finally pushing it to the local lists via web-service calls to each web server (the servers maintain in-memory lists that simply get updated/added to/deleted from as needed).
There are algorithms that can be used to synchronize distributed systems.
In your case you need an algorithm that, given two events in the system, tells you which of them happened first. If you can decide for any two events which one came first, then all the conflicts can be resolved.
I recommend you use Lamport clocks.
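A minimal Lamport clock sketch (the lock-free update loop is just one way to make it thread-safe):

using System;
using System.Threading;

public class LamportClock
{
    private long _time;

    // Call before any local event or before sending a message; the
    // returned timestamp is attached to the outgoing change.
    public long Tick() => Interlocked.Increment(ref _time);

    // Call when a message stamped with the sender's clock arrives:
    // jump past the sender's time, preserving happened-before order.
    public long Receive(long remoteTime)
    {
        long current, next;
        do
        {
            current = Interlocked.Read(ref _time);
            next = Math.Max(current, remoteTime) + 1;
        } while (Interlocked.CompareExchange(ref _time, next, current) != current);
        return next;
    }
}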
If you're on a Windows platform, I suggest you take a look at Windows Server AppFabric, especially its Caching feature. The name is funky, but I think it's exactly what you're looking for. I quote:
A distributed in-memory cache that provides .NET applications with
high-speed access, scale, and high availability to application data.

Update data without relying directly on the connection?

I have an application that, once started, gets some initial data from my database; after that, some functions may update or insert data into it.
Since my database is not on the same computer as the one running the application, and I would like to be able to freely move the application server around, I am looking for a more flexible way to insert/update/query data as needed.
I was thinking of using a web API on a separate thread in my application, with some kind of list where this thread tries to push the updates every X minutes; once a given entry has been applied it is removed from the list.
This way, instead of being held up by database queries and the like, the application would run freely, queuing whatever has to be updated or inserted.
The main point is that I can run the functions without worrying about connectivity issues (or related problems) on the database end, since all the changes are queued to be applied to it.
Is this approach OK? Bad? Are there better recommendations for this scenario?
On "can access DB through some web server instead of talking directly to DB server": yes this is very common and recommended approach. It is much easier to limit set of operations exposed through custom API (web services, REST services, ...) than restrict direct communication with DB.
On "sync on separate thread..." - you need to figure out what are requirements of the synchronization. Delayed sync may be ok if you don't need to know latest data and not care if updates from client are commited to storage immediately.

ASP.NET Persistent Caching ("Lazy loading"-style?)

I'm having some trouble getting my cache to work the way I want.
The problem:
Retrieving the requested data is a very time-consuming process. With standard ASP.NET caching, some users will take the "hit" of retrieving the data. This is not acceptable.
The solution?:
It is not super important that the data is 100% current. I would like to serve old, invalidated data while updating the cached data on another thread, making the new data available for future requests. I reckon the data needs to be persisted in some way so that the first user after an application restart can be served without taking the "hit".
I've built a solution that does somewhat of the above, but I'm wondering if there is a "best practice" way, or if there is a caching framework out there that already supports this behaviour?
There are tools that do this, for example Microsoft's ISA Server (which may be a bit expensive / overkill).
You can cache it in memory using Enterprise Library Caching. Let your users read from the cache, and have other pages that update the cache; these other pages should be called as regularly as you need to keep the data up to date.
You could also listen for when the cached item is removed and process it then:
public void RemovedCallback(string key, object value, CacheItemRemovedReason reason)
{
    // Put the item back in the cache, so others can use it until
    // we have finished grabbing the new data.
    HttpRuntime.Cache.Insert(key, value);

    // Spawn a thread to go and get up-to-date data, then overwrite
    // the old data with the new result (LoadData is a hypothetical helper).
    ThreadPool.QueueUserWorkItem(_ => HttpRuntime.Cache.Insert(key, LoadData(key)));
}
In Global.asax:
protected void Application_Start(object sender, EventArgs e)
{
    // Spawn a worker thread to pre-load critical data, so the first
    // real user never takes the hit (PreloadCriticalData is yours to write).
    ThreadPool.QueueUserWorkItem(_ => PreloadCriticalData());
}
Ohh... I have no idea if this is best practice, I just thought it would be slick~
Good Luck~
I created my own solution with a Dictionary/Hashtable in memory as a duplicate of the actual cache. When a method call came in requesting the object from the cache and it wasn't there but was present in memory, the memory-stored object was returned and a new thread was fired off to update the object in both memory and the cache, using a delegate method.
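In rough outline (the rebuild delegate stands in for whatever regenerates the value; classic ASP.NET's HttpRuntime.Cache is assumed):

using System;
using System.Collections.Generic;
using System.Threading;
using System.Web;

public static class MirrorCache
{
    private static readonly Dictionary<string, object> Mirror = new Dictionary<string, object>();
    private static readonly object Sync = new object();

    public static object Get(string key, Func<object> rebuild)
    {
        object value = HttpRuntime.Cache[key];
        if (value != null) return value;              // normal cache hit

        lock (Sync)
        {
            if (!Mirror.TryGetValue(key, out value))
                return null;                          // true miss: nothing to serve

            // Serve the stale mirror copy and refresh both stores in the
            // background (a real version would guard against firing
            // multiple refreshes for the same key).
            ThreadPool.QueueUserWorkItem(_ =>
            {
                object fresh = rebuild();
                lock (Sync) { Mirror[key] = fresh; }
                HttpRuntime.Cache.Insert(key, fresh);
            });
            return value;
        }
    }
}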
You can do this pretty easily with the Cache and Timer classes built into .NET. The Timer runs on a separate thread.
And I actually wrote a very small wrapper library called WebCacheHelper which exposes this functionality in an overloaded constructor. The library also serves as a strongly typed wrapper around the Cache object.
Here's an example of how you could do this...
public static readonly WebCacheHelper.Cache<int> RegisteredUsersCount =
    new WebCacheHelper.Cache<int>(new TimeSpan(0, 5, 0), () => GetRegisteredUsersCount());
This has a lazy loading aspect to it where GetRegisteredUsersCount() will be executed on the calling thread the instant that RegisteredUsersCount is first accessed. However, after that it's executed every 5 minutes on a background thread. This means that the only user who will be penalized with a slow wait time will be the very first user.
Then getting the value is as simple as referencing RegisteredUsersCount.Value.
Yeah, you could just cache the most frequently accessed data when your app starts, but that still means the first user to trigger that would "take the hit", as you say (assuming an in-proc cache, of course).
What I do in this situation is use a cache table in the DB to hold the latest data, and run a background job (in a Windows service; in a shared environment you can also use threads) that refreshes the data in that table.
There is only a very small chance of showing the user a blank screen; I eliminate it by also caching via the ASP.NET cache for 1 minute.
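Roughly, the read path looks like this (readFromCacheTable is a hypothetical fast query against the pre-populated cache table):

using System;
using System.Web;

public static class TwoLevelCache
{
    public static T Get<T>(string key, Func<T> readFromCacheTable) where T : class
    {
        var cached = (T)HttpRuntime.Cache[key];
        if (cached != null) return cached;

        // Miss: the DB read is cheap because the Windows service has
        // already materialized the expensive data into the cache table.
        T value = readFromCacheTable();
        HttpRuntime.Cache.Insert(key, value, null,
            DateTime.UtcNow.AddMinutes(1),               // 1-minute front cache
            System.Web.Caching.Cache.NoSlidingExpiration);
        return value;
    }
}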
I don't know if it's a bad design, but it's working great, without problems, on a heavily used web site.
