Implementing a simple local memory cache on an Azure instance

I'm looking for a simple way to implement a local memory store which can be used on an Azure .NET instance.
I've been looking at Azure Co-located Caching and it seems to support all of my requirements:
1. Work on both web roles and worker roles
2. Implement a simple LRU
3. Keep cached objects in memory (RAM)
4. Allow me to define the cache size as a percentage of the machine's total RAM
5. Keep the cache on the same machine as the web/worker role (co-located mode)
6. Allow me to access the same cache from multiple AppDomains running on the same machine (Web Roles may split my handlers into different AppDomains)
The only problem I have with Azure Co-located caching is that different instances communicate and try to share their caches - and I don't really need all that.
I want every machine to have its own separate in-memory cache. When I query this cache, I don't want to waste any time on making a network request to other instances' caches.
Local Cache config?
I've seen a configuration setting in Azure Caching to enable a Local Cache, but it still seems like machines may communicate with each other (i.e. during a cache miss). This config also requires a ttlValue and an objectCount, and I want the TTL to be "forever" and the object count to be "until you fill the entire cache". Specifying maxInt in both cases feels wrong.
What about a simple static variable?
When I really think about it, all this Azure caching seems like a bit of an overkill for what I need. I basically just need a static variable at the application/role level, except that doesn't work for requirement #6 (different AppDomains). Requirement #4 is also a bit harder to implement in this case.
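To make the static-variable idea concrete, here is a minimal sketch of an in-process LRU cache held in a static field. The names `LruCache` and `LocalCache` are hypothetical and not part of any Azure SDK, and the capacity is an item count rather than a percentage of RAM, so this covers requirements #2 and #3 only. As noted above, a static is per AppDomain, so it does not solve requirement #6.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not an Azure API): a thread-safe, fixed-capacity LRU cache.
public sealed class LruCache<TKey, TValue>
{
    private readonly int _capacity;
    private readonly object _gate = new object();
    private readonly Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>> _map;
    // Most recently used entries sit at the front of the list.
    private readonly LinkedList<KeyValuePair<TKey, TValue>> _order =
        new LinkedList<KeyValuePair<TKey, TValue>>();

    public LruCache(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _capacity = capacity;
        _map = new Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>>(capacity);
    }

    public int Count { get { lock (_gate) { return _map.Count; } } }

    public bool TryGet(TKey key, out TValue value)
    {
        lock (_gate)
        {
            LinkedListNode<KeyValuePair<TKey, TValue>> node;
            if (_map.TryGetValue(key, out node))
            {
                // A hit makes this entry the most recently used one.
                _order.Remove(node);
                _order.AddFirst(node);
                value = node.Value.Value;
                return true;
            }
            value = default(TValue);
            return false;
        }
    }

    public void Set(TKey key, TValue value)
    {
        lock (_gate)
        {
            LinkedListNode<KeyValuePair<TKey, TValue>> existing;
            if (_map.TryGetValue(key, out existing))
            {
                _order.Remove(existing);
                _map.Remove(key);
            }
            else if (_map.Count >= _capacity)
            {
                // Evict the least recently used entry (the tail of the list).
                LinkedListNode<KeyValuePair<TKey, TValue>> oldest = _order.Last;
                _order.RemoveLast();
                _map.Remove(oldest.Value.Key);
            }
            var node = new LinkedListNode<KeyValuePair<TKey, TValue>>(
                new KeyValuePair<TKey, TValue>(key, value));
            _order.AddFirst(node);
            _map[key] = node;
        }
    }
}

// One instance per AppDomain: statics are not shared across AppDomains.
public static class LocalCache
{
    // The capacity of 10000 items is an arbitrary assumption for illustration.
    public static readonly LruCache<string, object> Instance =
        new LruCache<string, object>(10000);
}
```

Callers would use `LocalCache.Instance.TryGet(key, out value)` and fall back to the real data source on a miss, followed by `LocalCache.Instance.Set(key, value)`.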
Memcached
I think good old memcached does exactly what I want. The problem is that I'm using Azure as a PaaS and I don't really want to administer my own VMs. I don't think I can install memcached on my roles. [UPDATE] It seems it is possible to run memcached locally on my roles. Is there a more elegant "native" solution without using memcached itself?

You can certainly install memcached on Web and Worker roles. Steve Marx blogged about getting memcached running on an Azure Cloud Service several years ago, before the Virtual Machine features were present. This is an older post, so you may find other ways of dealing with this, such as using startup tasks instead of the OnStart method in RoleEntryPoint.

I have used the "free" versions of SQL Server for local caching and they have worked great. It depends on what you are doing, but I have run both SQL Server Express and Compact to store entire small static data sets for a fantasy football site I wrote that included 5 years of statistics. They worked really well even on small/medium Azure instances because of their small footprint.
http://blogs.msdn.com/b/jerrynixon/archive/2012/02/26/sql-express-v-localdb-v-sql-compact-edition.aspx
The best part is that you can use T-SQL. Your cache requirements might be more complex, or might not scale to this.

Related

How to refresh static class from another website

I have an ASP.Net webform application and ASP.Net WebApi, both are on the same IIS but in different sites and App pools. Both work with the same DB. I have stored some settings values from DB in the static class. Now I need to refresh this static class on the webform app when I change the settings via WebApi and vice versa. I'm using named pipes for sending the flag into the second app 'on setting change'. But I think that named pipes are not 100% reliable. Is there any other (better) mechanism for how to sync these two classes?

There are a number of solutions to this. Which one you choose will depend on the frequency of the updates and how critical it is that the data is in sync.
Ideally you should look for a solution that supports your service instances being distributed across multiple physical locations. You will find the overall implementation simpler, and it will allow you to scale your solution beyond the current single server.
If it is critical that the many instances are in sync, then WebSockets are a proven protocol and design pattern for orchestrating multiple instances.
At a high level, you define a single server instance that will orchestrate messaging between all the client instances. The clients (your static class) establish a persistent WebSocket connection to the server, which the server can use to send messages to the clients when they need to refresh the config.
You can do this from first principles following this Asynchronous Server Socket Example, but there are implementation frameworks like SignalR that you might find useful as well.
A simpler but less efficient pattern is to simply poll a single source frequently to determine when you need to refresh. The source could be a single timestamp value in a SQL database, or you could use a reliable cloud based storage like MS Azure Tables or Blob storage.
If the call to check for the update is simple and efficient you can usually get away with this without too much effort or causing too much trouble.
Polling can even be more efficient in scenarios where the update frequency is high, especially if the updates are more frequent than the times you need to check if the values have changed.
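The polling pattern described above could look roughly like the following. This is a minimal sketch, assuming the shared source exposes a single version number or timestamp; `PolledSettings`, `readVersion` and `loadSettings` are hypothetical names, and the two delegates stand in for whatever cheap DB or storage calls you actually have.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical holder that replaces the static settings class.
public sealed class PolledSettings : IDisposable
{
    private readonly object _gate = new object();
    private readonly Func<long> _readVersion;   // cheap check, e.g. one timestamp column
    private readonly Func<IReadOnlyDictionary<string, string>> _loadSettings; // full reload
    private readonly Timer _timer;
    private long _loadedVersion = -1;
    private volatile IReadOnlyDictionary<string, string> _current =
        new Dictionary<string, string>();

    public PolledSettings(Func<long> readVersion,
                          Func<IReadOnlyDictionary<string, string>> loadSettings,
                          TimeSpan pollInterval)
    {
        _readVersion = readVersion;
        _loadSettings = loadSettings;
        RefreshIfChanged(); // initial load
        _timer = new Timer(_ =>
        {
            // Keep serving the previous settings if the source is unreachable.
            try { RefreshIfChanged(); } catch (Exception) { /* a real app would log this */ }
        }, null, pollInterval, pollInterval);
    }

    public IReadOnlyDictionary<string, string> Current { get { return _current; } }

    // Returns true when a newer version was found and the settings were reloaded.
    public bool RefreshIfChanged()
    {
        lock (_gate)
        {
            long version = _readVersion();
            if (version == _loadedVersion) return false;
            _current = _loadSettings();
            _loadedVersion = version;
            return true;
        }
    }

    public void Dispose() { _timer.Dispose(); }
}
```

Both applications would write a new version value whenever they change a setting, and each would run its own `PolledSettings` instance; the poll interval bounds how stale the other side can be.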
You could also look into a distributed cache, either to replace the whole static class or just to manage the refresh token. Redis is a reliable option that is easy to plug in to ASP.Net; you can set up a local Redis server as explained here, or you could use a cloud-hosted implementation like the one offered by Azure.

Multi level cache - AppFabric with MemoryCache

In my current setup I have a dedicated Appfabric server. Most of the objects stored there are reference objects which means most of the operations are 'Get' operations. Therefore I've considered using LocalCache.
Unfortunately, recently I experienced problems with the availability of the cache server resulting from various network issues. The application server continues to work directly with the DB in these cases thanks to a provider I've written. However, it has a very large impact on performance as expected.
I want to be able to use some kind of a local cache for the highly referenced objects, even when the cache server is down. For this purpose I've considered using the MemoryCache of .Net 4. I don't really care about the objects being stale and I rely on a timeout eviction policy, therefore I don't worry about synchronization between the application servers.
I would like to hear what you think about this solution.
- Are there any other points I should consider?
- Is there a better solution to provide fast access for highly referenced objects even when the cache server is down?

AppFabric's LocalCache is a client cache, local and in-process to the client application, which stores references to frequently used data so that the application does not need to deserialize the same object again. However, since LocalCache works with the cache server, it will not work if the cache server is down.
One possible solution to your problem is, as you have mentioned, an independent client cache, so that even if the cache server goes down the client cache will still be available.
When relying on an in-process cache, keep in mind that in-process caches store references to the cached objects. If your application modifies an object after getting it from the cache, it will be modified in the cache as well. Also, if multiple threads may end up modifying the same item in the cache, you will need thread synchronization for such objects.
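An independent client cache of this kind can be sketched as below. This is an illustration under stated assumptions, not AppFabric or MemoryCache API: `LocalFallbackCache` is a hypothetical class, `loadFromServer` stands in for the call to the cache server (or the DB provider), and serving a stale entry when that call throws is a design choice that fits the asker's tolerance for stale data.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical in-process cache with timeout eviction and a stale fallback.
public sealed class LocalFallbackCache<T>
{
    private sealed class Entry
    {
        public T Value;
        public DateTime ExpiresUtc;
    }

    private readonly ConcurrentDictionary<string, Entry> _entries =
        new ConcurrentDictionary<string, Entry>();
    private readonly TimeSpan _ttl;
    private readonly Func<DateTime> _utcNow; // injectable clock, useful for tests

    public LocalFallbackCache(TimeSpan ttl, Func<DateTime> utcNow = null)
    {
        _ttl = ttl;
        _utcNow = utcNow ?? (() => DateTime.UtcNow);
    }

    public T GetOrLoad(string key, Func<string, T> loadFromServer)
    {
        DateTime now = _utcNow();
        Entry cached;
        bool found = _entries.TryGetValue(key, out cached);
        if (found && cached.ExpiresUtc > now)
            return cached.Value; // fresh local hit, no network call

        try
        {
            T fresh = loadFromServer(key);
            _entries[key] = new Entry { Value = fresh, ExpiresUtc = now + _ttl };
            return fresh;
        }
        catch (Exception) when (found)
        {
            // The server is unreachable: serve the expired local copy instead of failing.
            return cached.Value;
        }
    }
}
```

Because the cache hands out references, as described above, the cached values should be treated as immutable or copied before modification.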
However, even with an independent client cache, your application may end up hitting the database frequently, since data in the client cache of one application server will not be accessible to other servers.
A better solution might be to use replicated cache servers, where each server holds all cached data. This will not only improve get performance for reference data but also eliminate the single point of failure you have in your case.
If AppFabric is not a hard requirement for your application, you may look into NCache for better scalability and high availability.

Did you consider AppFabric's local cache feature? Or is it not suitable for you?

How to build a highly scalable global counter in Azure?

I am trying to set up a global counter in Windows Azure which would keep track of the number of games started within a day. Each time a player starts a game, a Web Service call is made from the client to the server and a global counter would be incremented by one. This should be fairly simple to do with a database... But I wonder how I could do this efficiently. The database approach is good for a few hundred simultaneous clients, but what will happen if I have 100,000 clients?
Thanks for your help/ideas!

A little over a year ago, this was a topic in a Cloud Cover episode: Cloud Cover Episode 43 - Scalable Counters with Windows Azure. They discussed how to create an Apathy Button (similar to the Like Button on Facebook).
Steve Marx also discusses this in detail in a blog post with source code: Architecting Scalable Counters with Windows Azure. In this solution they're doing the following:
On each instance, keep track of a local counter
Use Interlocked.Increment to modify the local counter
If the counter changed, save the new value in table storage (have a timer do this every few seconds). For each deployment/instance, you'll have 1 record in the counters table.
To display the total count, take the sum of all records in the counters table.
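The steps above can be sketched as follows. This is not the code from the blog post: `InstanceCounter` is a hypothetical class, and a `ConcurrentDictionary` stands in for the counters table so the example stays self-contained. In a real deployment `Flush` would write the instance's record to table storage instead.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;

// Hypothetical per-instance counter; one record per instance in the "table".
public sealed class InstanceCounter : IDisposable
{
    private readonly string _instanceId;
    private readonly ConcurrentDictionary<string, long> _table; // stand-in for table storage
    private readonly object _flushGate = new object();
    private readonly Timer _timer;
    private long _count;
    private long _lastSaved;

    public InstanceCounter(string instanceId,
                           ConcurrentDictionary<string, long> table,
                           TimeSpan flushInterval)
    {
        _instanceId = instanceId;
        _table = table;
        // Save the local value every few seconds rather than on every request.
        _timer = new Timer(_ => Flush(), null, flushInterval, flushInterval);
    }

    // Called on every "game started" request; lock-free and local to this instance.
    public void Increment()
    {
        Interlocked.Increment(ref _count);
    }

    // Writes this instance's record only if the counter changed since the last save.
    public void Flush()
    {
        lock (_flushGate)
        {
            long snapshot = Interlocked.Read(ref _count);
            if (snapshot == _lastSaved) return;
            _table[_instanceId] = snapshot;
            _lastSaved = snapshot;
        }
    }

    // The global total is the sum of all per-instance records.
    public static long Total(ConcurrentDictionary<string, long> table)
    {
        return table.Values.Sum();
    }

    public void Dispose() { _timer.Dispose(); }
}
```

The trade-off is that the displayed total lags behind by up to one flush interval, in exchange for one storage write per instance per interval instead of one per request.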

Well, there are a bunch of choices. And I don't know which is best for you. But I'll present them here with some pros and cons and you can come to your own conclusions given your requirements.
The simplest answer is "put it in storage." Both SQL Azure and the core Azure table or blob storage options are out there for you. One issue to contend with is performance in the face of large scale concurrency, but I'd also encourage you to think about correctness. You really want something that supports atomic increment to outsource this problem IMO.
Another variation of a storage oriented option would be a highly available VM. You could spin up your own VM on Azure, back a data drive on to Azure Drives, and then use something on top of the OS to do this (a database server, an app that uses the file system directly, whatever). This would be more similar to what you'd do at home but would have fairly unfortunate trade-offs...your entire cloud is now reliant on the availability of this one VM, cost is something to think about, scalability of the solution, and so on.
Splunk is also an option to consider, if you look at VMs.
As an earlier commenter mentioned, you could compute off of log data. But this would likely not be super real time.
Service Bus is another option to consider. You could pump messages over SB for these events and have a consumer that reads them and emits a "summary." There are a bunch of design patterns to consider if you look at this. The SB stack is pretty well documented. Another interesting element of SB is that you might be able to trade off 100% correctness for perf/scale/cost. This might be a worthy trade-off for you depending upon your goals.
Azure also exposes queues which might be a fit. I'll admit I think SB is probably a better fit but it is worth looking at both if you are going down this path.
Sorry I don't have a silver bullet but I hope this helps.

I would suggest you follow the pattern described in .NET Multi-Tier Application. This would help you decouple the Web role, which faces your clients, from the Worker role, which stores the data to a persistence medium (either SQL Server or Azure Storage), by using the Service Bus.
This is also an efficient model to scale, as you can spin up new instances of the web role, the worker role, or both. For the dashboard, depending on the load, you can cache your data periodically and serve it from the cache. This would compromise the accuracy of the data, but would still give you an option for easy scaling. You can even invalidate the cache every minute and reload it from the persistence medium to get the latest value.
As for whether to use SQL Server or Azure Storage: if there is no need for relational capabilities like JOINs, you can very well go for Azure Storage.

how many webservices

I have a web service that looks like this:
public class TheService : System.Web.Services.WebService
{
    [WebMethod(EnableSession = true)]
    public string GetData(string Param1, string Param2) { ... }
}
In other words, it's contained in one class and in there, I have one public method and there is another private method that does a read to the database.
The issue I'm facing is in terms of scalability. I'm building a web app that should work for 1,000 daily users and each user will do about 300-500 calls a day to the web service and so that's about 300,000 to 500,000 requests per day. I need to add 9 more calls to the web service. Some of these calls will involve database writes.
My question is this: am I better off creating 9 separate web services, or continuing with the one service I have and adding the other methods? Or maybe something different and better? I'm planning to deploy the application on Azure so I'm not really concerned about hardware, just the application side of things.

I wouldn't base my decision off the volume, or for performance/scalability reasons. You won't get much if any performance benefit from keeping them lumped together or separating them. Any grouping or filtering that can be done while the services are grouped one way can also be done with the services grouped the other way. The ability to partition between servers will be the same, too.
Design
Instead I would focus on trying to make your code understandable and maintainable. Group your services how they make the most sense architecturally within your program. Keep them logically grouped how they make the most sense to be grouped, from a problem-domain perspective (as opposed to a solution domain perspective).
Since you're free to group them how you want, I recommend you read up on SOLID, which is a set of guiding principles for creating software architecture.
One of the principles listed that is particularly important is the Interface Segregation Principle, which can be defined by the notion that "many client specific interfaces are better than one general purpose interface."
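As a small illustration of interface segregation applied to a service like the one in the question, the sketch below splits reads and writes into two client-specific contracts. All names here (`IDataQueries`, `IDataCommands`, `InMemoryDataService`, `ReadOnlyClient`, `SaveData`) are hypothetical, and the in-memory dictionary merely stands in for the database.

```csharp
using System.Collections.Generic;

// Hypothetical contracts: one per kind of client instead of one general-purpose interface.
public interface IDataQueries
{
    string GetData(string param1, string param2);
}

public interface IDataCommands
{
    void SaveData(string param1, string param2, string value);
}

// A single class may still implement both; each caller only sees the slice it needs.
public sealed class InMemoryDataService : IDataQueries, IDataCommands
{
    private readonly Dictionary<string, string> _rows = new Dictionary<string, string>();

    public string GetData(string param1, string param2)
    {
        string value;
        return _rows.TryGetValue(param1 + "|" + param2, out value) ? value : null;
    }

    public void SaveData(string param1, string param2, string value)
    {
        _rows[param1 + "|" + param2] = value;
    }
}

// A read-only client depends on IDataQueries alone and cannot trigger writes.
public sealed class ReadOnlyClient
{
    private readonly IDataQueries _queries;

    public ReadOnlyClient(IDataQueries queries) { _queries = queries; }

    public string Describe(string a, string b)
    {
        return _queries.GetData(a, b) ?? "(none)";
    }
}
```

Whether the contracts end up in one web service or several is then a deployment decision that can change without touching the clients.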
Performance and scalability
Since you mentioned performance and scalability being a concern, I recommend you follow this plan:
1. Determine how long you can wait until you can patch/maintain the software
2. Determine your expected load, including both average and peak load-per-time (you've determined the average), and how much you expect this traffic to grow over time (specifically over the period you can go without patching/maintaining the software)
3. Create a model describing exactly which calls will be done and in which ratios (per time and per server)
4. Create automation that mirrors these models as closely as you can. Try to model both average and peak traffic, and surpassing your highest scale traffic
5. Profile your code, DB, network traffic, and disk traffic while running this automation
6. Determine the bottlenecks, and if they are within acceptable tolerance
7. Optimize your bottlenecks (as required), and repeat from the profiling step forward
8. The next release of your software, repeat from the top to add scenarios/load/automation
9. Perform regression testing using your existing tests, altered to fit the new scale

Splitting the web methods into several web services won't help you here; load balancing will.

The number of web services will not have any effect on the scalability of the app.
Finding your bottlenecks will help scalability. If your bottleneck is the DB, you may need to find ways to tune your queries, partition your data across more stores, etc. If your bottleneck is CPU on the web services (web roles in Azure), then adding more than one web role to your cluster will help. Azure supports that.
But don't simply start adding roles. Understand where your bottlenecks are. Measure, profile and tune.
Azure has devfabric and IIS locally to help you profile locally as well.

Splitting the web services into multiple web roles because of physical constraints, and not necessarily because of logical layout, may be worth considering because:
Using Azure you can scale out your Roles independently of one another. This means that IF different web methods need to scale in different patterns (i.e. your first web method has the biggest volume in the mornings and after lunch, your other two web methods have the biggest volume in the evening and during the night, and the last 2 web methods are usually flat throughout the day), it may very well be worth splitting your methods across Roles by scalability constraints and not by logical constraints.
By increasing/decreasing the servers allocated to each method independently you may be able to fine-tune your optimal power vs. need with much greater precision.
HTH

Actually, creating separate Web Services, as Igorek suggested, will provide much more granular scale-out. In that scenario, you can deploy different Web Services to different Roles, each role getting its own set of instances (along with the option to create different instance sizes per role). Windows Azure will load-balance across all the instances of a Role.
So from a granularity standpoint:
Least granular: Combine all methods into a single Web Service, hosted on a single Role. As you scale out to multiple instances, all service method requests are load-balanced across all instances. Because you're combining everything into one Role, you will find this to be optimized for cost: You can run all Web Services code in a single instance (really 2 instances to give yourself SLA).
More granular: Create separate Web Services, each with their own methods, and host on the same Role (allows you to exercise SOLID principles, as Merlyn described). Same basic performance characteristics as the first option, as all requests are still load-balanced across the same set of instances.
Most granular: Create separate Web Services, each with their own methods, and host each Web Service endpoint on a separate Role, allowing for independent VM sizing and scale-out of each Web Service endpoint. This option has a higher runtime cost to it, as you now have a minimum of one instance per Web Service endpoint (again, 2 instances in a real world, live application).

I am not sure about your exact case, but moving expensive tasks (from a CPU/DB point of view) to a separate Worker Role is usually a good solution on Azure. In that case you will have one WebRole with services that receive requests and create tasks for the Worker Roles (it will be lightweight, so you should not need many instances of it), and one or a few Worker Roles that process those tasks. Either #1: Worker Roles are created per kind of task (to group similar actions like reading/writing data to the DB), or #2: one Worker Role handles any type of task. I don't see any benefit in #2, because to get the same behavior you can just create one WebRole with many instances and handle everything there. So you will be able to control processing time by adding/removing Worker Roles.
As other people suggested, using the Azure platform will not by itself make the app scalable. Especially if you are using SQL Azure, you will need to implement sharding or add more databases to avoid one big DB serving all requests.
I don't know if it's related to this question, but just to let you know: Azure drops connections that have been inactive for 60 seconds (I did not find a way to increase that timeout; you can Google this problem). This may be an issue if you are porting web services to Azure and your responses can take 60 seconds. One way to avoid it is to keep the connection active, which is pretty simple if clients know about this "feature".

High availability & scalability for C#

I've got a C# service that currently runs single-instance on a PC. I'd like to split this component so that it runs on multiple PCs. Each PC should be assigned a certain part of the work. If one PC fails, its work should be moved to a backup machine.
Data synchronization can be done by the DB, so that should not be much of an issue. My current idea is to use some kind of load balancer that splits and sends the incoming requests to the array of PCs and makes sure the work is actually processed.
How would I implement such a functionality? I'm not sure if I'm asking the right question. If my understanding of how this goal should be achieved is wrong, please give me a hint.
Edit:
I wonder if the idea given above (a load balancer splits work packages across PCs and checks for results) is feasible at all. If there is some kind of already implemented solution to this seemingly common problem, I'd love to use that solution.
Availability is a critical requirement.

I'd recommend looking at a Pull model of load-sharing, rather than a Push model. When pushing work, the coordinating server(s)/load-balancer must be aware of all the servers that are currently running in your system so that it knows where to forward requests; this must either be set in config or dynamically set (such as in the Publisher-Subscriber model), then constantly checked to detect if any servers have gone offline. Whilst it's entirely feasible, it can complicate the scaling-out of your application.
With a Pull architecture, you have a central work queue (hosted in MSMQ, SQL Server Service Broker or similar) and each processing service pulls work off that queue. Expose a WCF service to accept external requests and place work onto the queue, safe in the knowledge that some server will do the work, even though you don't know exactly which one. This has the added benefits that each server monitors its own workload and picks up work as and when it is ready, and you can easily add or remove servers to/from this model without any change in config.
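The pull model can be illustrated in a single process with a shared queue and several consumers. This is only a sketch: `PullModelDemo` is a hypothetical name, an in-memory `BlockingCollection` stands in for MSMQ or Service Broker, and tasks stand in for the separate PCs. In-memory queues do not survive a crash, which is why a durable queue is suggested above.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class PullModelDemo
{
    // Starts workerCount workers that each pull items off one shared queue.
    // Returns how many items each worker ended up processing.
    public static IDictionary<string, int> Process(IEnumerable<string> workItems,
                                                   int workerCount,
                                                   Action<string> handle)
    {
        var processedBy = new ConcurrentDictionary<string, int>();
        using (var queue = new BlockingCollection<string>())
        {
            var workers = new Task[workerCount];
            for (int i = 0; i < workerCount; i++)
            {
                string name = "worker-" + i;
                workers[i] = Task.Run(() =>
                {
                    // Each worker takes the next item only when it is ready for more work.
                    foreach (string item in queue.GetConsumingEnumerable())
                    {
                        handle(item);
                        processedBy.AddOrUpdate(name, 1, (key, n) => n + 1);
                    }
                });
            }

            // The producer does not know or care which worker will take an item.
            foreach (string item in workItems) queue.Add(item);
            queue.CompleteAdding();
            Task.WaitAll(workers);
        }
        return processedBy;
    }
}
```

Adding capacity means starting another consumer against the same queue; nothing on the producer side changes.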
This architecture is supported by NServiceBus and the communication between Windows Azure Web & Worker roles.

From what you said, each PC will require a full copy of your service:
"Each PC should be assigned a certain part of the work. If one PC fails, its work should be moved to a backup machine"
Otherwise you won't be able to move its work to another PC.
I would be tempted to have a central server which farms out work to individual PCs. This means that you would need some form of communication between each machine and the central server, and a record kept on the central server of what work has been assigned where.
You'll also need each machine to measure its CPU load and reject work if it is too busy.
A multi-threaded approach to the service would make good use of the multiple processor cores that are ubiquitous nowadays.

How about using a server and multi-threading your processing? Or even multi-threading on a PC as you can get many cores on a standard desktop now.
This obviously doesn't deal with the machine going down, but could give you much more performance for less investment.

You can look at Windows clustering; you will have to handle a set of issues that depend on the behaviour of the service (if you add more details about the service itself, I can answer more specifically).

This depends on how you want to split your workload. This is usually done in one of two ways.
Splitting the same workload across multiple services
This means the same service is installed on different servers and does the same job. Assume your service reads huge amounts of data from the DB servers, processes it to produce huge client-specific data files, and finally sends those data files to the clients. In this approach all your services installed on different servers do the same work, but they split the work to increase performance.
Splitting parts of the workload across multiple services
In this approach each service is assigned an individual job and works towards a different goal. In the example above, one service is responsible for reading data from the DB and generating the huge data files, and another service is configured only to read the data files and send them to clients.
I have implemented the second approach in one of my projects, because it let me isolate and debug errors in case of any failures.

The usual approach for load balancer is to split service requests evenly between all service instances.
For each work item (request) you can store the relevant information in the database. Then each service should also have at least one background thread checking the database for abandoned work items.
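A check for abandoned work items could follow the logic sketched below. The `WorkItem` fields, the heartbeat idea and the `Reclaim` helper are all assumptions made for illustration; the answer does not specify a schema. The list stands in for a database table, and in a real system the claim would have to be a single atomic statement (for example one UPDATE with a WHERE clause) so that two services cannot reclaim the same item.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row in a work-items table.
public sealed class WorkItem
{
    public int Id;
    public string Owner;              // null while unclaimed
    public DateTime LastHeartbeatUtc; // updated periodically by the owning service
    public bool Done;
}

public static class AbandonedWork
{
    // Takes over unfinished items whose owner has not reported within the timeout.
    public static List<WorkItem> Reclaim(IEnumerable<WorkItem> items,
                                         string newOwner,
                                         DateTime nowUtc,
                                         TimeSpan timeout)
    {
        List<WorkItem> abandoned = items
            .Where(w => !w.Done
                        && w.Owner != null
                        && w.Owner != newOwner
                        && nowUtc - w.LastHeartbeatUtc > timeout)
            .ToList();

        foreach (WorkItem w in abandoned)
        {
            w.Owner = newOwner;
            w.LastHeartbeatUtc = nowUtc;
        }
        return abandoned;
    }
}
```

The background thread in each service would call something like this on a timer and then process whatever it reclaimed.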

I would suggest that you publish your service through WCF (Windows Communication Foundation).
Then implement a "central" client application which can keep track of available providers of your service and dish out work. The central app will act as scheduler and load balancer of the tasks to be performed.
Check out Juval Löwy's book on WCF ("Programming WCF Services") for a good introduction to this topic.

You can have a look at NGrid : http://ngrid.sourceforge.net/
or Alchemi : http://www.gridbus.org/~alchemi/index.html
both are grid computing framework with load balancers that will get you started in no time.
Cheers,
Florian
