I have a web service that looks like this:
public class TheService : System.Web.Services.WebService
{
    [WebMethod(EnableSession = true)]
    public string GetData(string Param1, string Param2) { ... }
}
In other words, the service is contained in one class, and inside it I have one public method plus a private method that reads from the database.
The issue I'm facing is scalability. I'm building a web app that should support 1,000 daily users, each making about 300-500 calls a day to the web service, so roughly 300,000 to 500,000 requests per day. I need to add 9 more calls to the web service, and some of them will involve database writes.
My question is this: am I better off creating 9 separate web services, or should I continue with the one service I have and add the other methods? Or maybe there is something different and better. I'm planning to deploy the application on Azure, so I'm not really concerned about hardware, just the application side of things.
I wouldn't base the decision on volume, or on performance/scalability reasons. You won't get much, if any, performance benefit from keeping the methods lumped together or separating them. Any grouping or filtering that can be done with the services grouped one way can also be done with them grouped the other way, and the ability to partition between servers is the same, too.
Design
Instead, I would focus on making your code understandable and maintainable. Group your services the way that makes the most sense architecturally within your program, keeping them logically grouped from a problem-domain perspective (as opposed to a solution-domain perspective).
Since you're free to group them how you want, I recommend you read up on SOLID, which is a set of guiding principles for creating software architecture.
One of the principles listed that is particularly important is the Interface Segregation Principle, which can be defined by the notion that "many client specific interfaces are better than one general purpose interface."
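As a rough illustration of that principle applied to your service (the interface names here are hypothetical, not anything your existing code requires):

// Clients that only read data depend on a small read contract,
// not on every method the service happens to expose.
public interface IDataReader
{
    string GetData(string param1, string param2);
}

public interface IDataWriter
{
    void SaveData(string param1, string payload);
}

// One service class can still implement both; each caller only sees the slice it needs.
public class TheService : IDataReader, IDataWriter
{
    public string GetData(string param1, string param2) { /* read from the DB */ return ""; }
    public void SaveData(string param1, string payload) { /* write to the DB */ }
}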
Performance and scalability
Since you mentioned performance and scalability being a concern, I recommend you follow this plan:
Determine how long you can wait until you can patch/maintain the software
Determine your expected load, including both average and peak load-per-time (you've determined the average), and how much you expect this traffic to grow over time (specifically over the period you can go without patching/maintaining the software)
Create a model describing exactly which calls will be done and in which ratios (per time and per server)
Create automation that mirrors these models as closely as you can. Try to model both average and peak traffic, and to go beyond your highest expected traffic (a minimal load-driver sketch follows this list)
Profile your code, DB, network traffic, and disk traffic while running this automation
Determine the bottlenecks, and whether they are within acceptable tolerance
Optimize your bottlenecks (as required), and repeat from the profiling step forward
For the next release of your software, repeat from the top to add scenarios/load/automation
Perform regression testing using your existing tests, altered to fit the new scale
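A minimal sketch of such a load driver, assuming an HTTP GET-enabled endpoint at a placeholder URL and a made-up call mix (tune the concurrency, iteration counts, and method ratios to your own model):

using System;
using System.Diagnostics;
using System.Net.Http;
using System.Threading.Tasks;

class LoadDriver
{
    static async Task Main()
    {
        // Base address and path are placeholders for your deployed service.
        var http = new HttpClient { BaseAddress = new Uri("http://localhost/TheService.asmx/") };
        int concurrentUsers = 50;
        int callsPerUser = 100;
        var sw = Stopwatch.StartNew();

        var users = new Task[concurrentUsers];
        for (int u = 0; u < concurrentUsers; u++)
        {
            users[u] = Task.Run(async () =>
            {
                for (int call = 0; call < callsPerUser; call++)
                {
                    // GetData dominates this mix; add your other methods in their modelled ratios.
                    await http.GetAsync("GetData?Param1=a&Param2=b");
                }
            });
        }
        await Task.WhenAll(users);

        Console.WriteLine("{0} calls in {1:N0} ms",
            concurrentUsers * callsPerUser, sw.ElapsedMilliseconds);
    }
}

Run this against a staging deployment while profiling the web tier, the database, and the network, then compare the results against your tolerances.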
Splitting the web methods into several web services won't help you here; load balancing will.
The number of web services will not have any effect on the scalability of the app.
Finding your bottlenecks will help scalability. If your bottleneck is the DB, you may need to find ways to tune your queries, partition your data across more stores, etc. If your bottleneck is CPU on the web services (web roles in Azure), then adding more than one web role to your cluster will help; Azure supports that.
But don't simply start adding roles. Understand where your bottlenecks are: measure, profile, and tune.
Azure also gives you the dev fabric and local IIS, so you can profile locally as well.
Splitting the web services into multiple web roles because of physical constraints, rather than logical layout, may be worth considering because:
Using Azure you can scale out your Roles independently of one another. This means that if different web methods need to scale in different patterns (e.g. your first web method has its biggest volume in the mornings and after lunch, your other two web methods have their biggest volume in the evening and during the night, and the last two web methods are usually flat throughout the day), it may very well be worth splitting your methods across Roles by scalability constraints rather than by logical constraints.
By increasing/decreasing the servers allocated to each method independently, you may be able to fine-tune your optimal power vs. need with much greater precision.
HTH
Actually, creating separate Web Services, as Igorek suggested, will provide much more granular scale-out. In that scenario, you can deploy different Web Services to different Roles, each role getting its own set of instances (along with the option to create different instance sizes per role). Windows Azure will load-balance across all the instances of a Role.
So from a granularity standpoint:
Least granular: Combine all methods into a single Web Service, hosted on a single Role. As you scale out to multiple instances, all service method requests are load-balanced across all instances. Because you're combining everything into one Role, you will find this to be optimized for cost: You can run all Web Services code in a single instance (really 2 instances to give yourself SLA).
More granular: Create separate Web Services, each with their own methods, and host on the same Role (allows you to exercise SOLID principles, as Merlyn described). Same basic performance characteristics as the first option, as all requests are still load-balanced across the same set of instances.
Most granular: Create separate Web Services, each with their own methods, and host each Web Service endpoint on a separate Role, allowing for independent VM sizing and scale-out of each Web Service endpoint. This option has a higher runtime cost to it, as you now have a minimum of one instance per Web Service endpoint (again, 2 instances in a real world, live application).
I am not sure about your exact case, but moving expensive (CPU/DB-heavy) tasks to a separate Worker Role is usually a good solution for Azure. In that case you will have one Web Role with services that receive requests (it will be lightweight, so you should not need many instances for it) and create tasks for Worker Roles, plus one or a few Worker Roles that process those tasks. Either (#1) Worker Roles can be created per kind of task (to group similar actions, like reading/writing data to the DB), or (#2) one Worker Role can handle any type of task. I don't see any benefit in #2, because to get the same behavior you could just create one Web Role with many instances and handle everything there. This way you get the ability to control processing time by adding/removing Worker Roles.
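A minimal sketch of that hand-off using an Azure storage queue (the connection string, queue name, and the processing callback are assumptions; Service Bus queues would work similarly):

using System;
using Microsoft.WindowsAzure.Storage;        // classic Azure storage client library
using Microsoft.WindowsAzure.Storage.Queue;

public static class TaskQueue
{
    private static CloudQueue GetQueue()
    {
        var account = CloudStorageAccount.Parse(
            "DefaultEndpointsProtocol=https;AccountName=youraccount;AccountKey=...");
        var queue = account.CreateCloudQueueClient().GetQueueReference("db-write-tasks");
        queue.CreateIfNotExists();
        return queue;
    }

    // Web Role: accept the request quickly and hand the heavy DB work to the queue.
    public static void Enqueue(string taskPayload)
    {
        GetQueue().AddMessage(new CloudQueueMessage(taskPayload));
    }

    // Worker Role (called from its Run loop): pull one task and process it.
    public static void ProcessNext(Action<string> processDbWrite)
    {
        var queue = GetQueue();
        var message = queue.GetMessage();
        if (message == null) return;          // nothing queued right now

        processDbWrite(message.AsString);     // the expensive DB write happens here
        queue.DeleteMessage(message);
    }
}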
As other people have suggested, using the Azure platform by itself will not make the app scalable. Especially if you are using SQL Azure, you will need to implement sharding or add multiple databases to avoid one big DB handling all requests.
I don't know if this is related to your question, but just so you know: Azure drops connections that are not active for 60 seconds (I did not find any way to increase that timeout; you can Google this problem). This may be an issue if you are porting web services to Azure and your responses can approach 60 seconds. One way to avoid it is keeping the connection active, which is pretty simple if clients know about this "feature".
Related
I have an ASP.NET WebForms application and an ASP.NET Web API; both are on the same IIS server but in different sites and app pools, and both work with the same DB. I have stored some settings values from the DB in a static class. Now I need to refresh this static class in the WebForms app when I change the settings via the Web API, and vice versa. I'm using named pipes to send a flag to the second app on setting change, but I think named pipes are not 100% reliable. Is there any other (better) mechanism for keeping these two classes in sync?
There are a number of solutions to this; which one you choose will depend on the frequency of the updates and how critical it is that the data stays in sync.
Ideally you should look for a solution that supports your service instances being distributed across multiple physical locations; you will find the overall implementation simpler, and it will allow you to scale your solution beyond the current single server.
If it is critical that the many instances are in sync, then a WebSocket solution is a proven protocol and design pattern to orchestrate between multiple instances.
At a high level, you define a single server instance that will orchestrate messaging between all the client instances. The clients (your static class) establish a persistent Web Socket connection to the server that the server can use to send messages to the client when they need to refresh the config.
You can do this from first principles following this Asynchronous Server Socket Example, but there are implementation frameworks like SignalR that you might find useful as well.
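For example, a minimal SignalR sketch along these lines (the hub name, the orchestrator URL, and the reload callback are assumptions; each app references the server or client package as appropriate):

using System;
using Microsoft.AspNet.SignalR;            // server-side hub
using Microsoft.AspNet.SignalR.Client;     // .NET client

// Hosted in one central app; the hub is only used as a broadcast channel.
public class SettingsHub : Hub { }

public static class SettingsSync
{
    // Call this wherever the settings are changed (e.g. in the Web API action).
    public static void NotifySettingsChanged()
    {
        var hub = GlobalHost.ConnectionManager.GetHubContext<SettingsHub>();
        hub.Clients.All.settingsChanged();            // pushed to every connected app
    }

    // Each app (WebForms and Web API) keeps one persistent connection open.
    public static void StartListening(Action reloadSettings)
    {
        var connection = new HubConnection("http://orchestrator-host/"); // assumed URL
        var proxy = connection.CreateHubProxy("SettingsHub");
        proxy.On("settingsChanged", reloadSettings);  // refresh the static class here
        connection.Start().Wait();
    }
}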
A simpler but less efficient pattern is to simply poll a single source frequently to determine when you need to refresh. The source could be a single timestamp value in a SQL database, or you could use a reliable cloud based storage like MS Azure Tables or Blob storage.
If the call to check for the update is simple and efficient you can usually get away with this without too much effort or causing too much trouble.
Polling can even be more efficient in scenarios where the update frequency is high, especially if updates happen more often than you actually need to check whether the values have changed.
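A minimal sketch of the polling variant, assuming both apps watch a single LastUpdatedUtc value in the shared DB (the table, column, connection string, and poll interval are all assumptions):

using System;
using System.Data.SqlClient;
using System.Threading;

public static class SettingsPoller
{
    private static DateTime _lastSeen = DateTime.MinValue;
    private static readonly Timer Poll =
        new Timer(_ => CheckForUpdate(), null, TimeSpan.FromSeconds(30), TimeSpan.FromSeconds(30));

    private static void CheckForUpdate()
    {
        using (var conn = new SqlConnection("Server=.;Database=AppDb;Integrated Security=true"))
        using (var cmd = new SqlCommand("SELECT MAX(LastUpdatedUtc) FROM dbo.Settings", conn))
        {
            conn.Open();
            var result = cmd.ExecuteScalar();
            if (result == null || result == DBNull.Value) return;

            var latest = (DateTime)result;
            if (latest > _lastSeen)
            {
                ReloadSettings();      // hypothetical: re-read the static settings class from the DB
                _lastSeen = latest;
            }
        }
    }

    private static void ReloadSettings() { /* re-populate the static class */ }
}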
You could also look into a distributed cache, either to replace the whole static class or just to manage the refresh token. Redis is a reliable option that is easy to plug into ASP.NET; you can set up a local Redis server as explained here, or you could use a cloud-hosted implementation like the one offered by Azure.
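A minimal refresh-token sketch with StackExchange.Redis (the host, key name, and reload callback are assumptions):

using System;
using StackExchange.Redis;

public static class SettingsToken
{
    private static readonly ConnectionMultiplexer Redis =
        ConnectionMultiplexer.Connect("your-redis-host:6379");   // assumed host
    private static string _lastLoadedVersion;

    // Writer side: bump the token whenever the settings are changed.
    public static void MarkChanged()
    {
        Redis.GetDatabase().StringSet("settings:version", DateTime.UtcNow.Ticks.ToString());
    }

    // Reader side: call this periodically (or per request) and reload if the token moved.
    public static void RefreshIfChanged(Action reloadSettings)
    {
        string version = Redis.GetDatabase().StringGet("settings:version");
        if (version != null && version != _lastLoadedVersion)
        {
            reloadSettings();                 // re-read the static class from the DB
            _lastLoadedVersion = version;
        }
    }
}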
I am currently working on an MVC 4 application that reads data from a set of WCF services. Currently, when a user hits a page, a number of WCF requests are triggered to get data for different parts of the page. I want to improve its performance.
My idea is that when a user lands on a page, a single WCF call is made which retrieves all the data that the multiple calls previously did, and that data is put into the user's request HttpContext.
Does the single, larger WCF call over named pipes perform better than the multiple smaller calls over named pipes? And are there any performance implications of putting a large set of data into the HttpContext?
I think you are trying to solve one problem by producing even more problems.
If you query all the data at once and store it in HttpContext, it will speed up performance for opening new pages, but it will take considerably longer to open the page for the first time. Also, you may easily run out of memory, especially with many users at a time, if you are storing data in HttpContext per user.
I think first you need to localize the problem and find the root cause of poor performance. It may be a query or it may be some database locks.
In any case caching is a good idea, but don't use HttpContext for it. Use the ASP.NET cache or some distributed cache like AppFabric. These tools provide a lot of built-in features, and it will be easier for you to scale your application later.
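For instance, a minimal in-process cache wrapper (the names and the five-minute expiration are assumptions; HttpRuntime.Cache works along the same lines):

using System;
using System.Runtime.Caching;

public static class PageDataCache
{
    private static readonly MemoryCache Cache = MemoryCache.Default;

    // Returns the cached value if present, otherwise loads it (e.g. via the WCF call) and caches it.
    public static T GetOrLoad<T>(string key, Func<T> load) where T : class
    {
        var cached = Cache.Get(key) as T;
        if (cached != null) return cached;

        var value = load();
        Cache.Set(key, value, new CacheItemPolicy
        {
            SlidingExpiration = TimeSpan.FromMinutes(5)
        });
        return value;
    }
}

The same shape transfers to a distributed cache such as AppFabric; only the get/set calls change.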
Hope it helps.
I am trying to setup in Windows Azure a global counter which would keep track of the number of games started within a day. Each time a player starts a game, a Web Service call is made from the client to the server and a global counter would be incremented by one. This should be fairly simple to do with a database... But I wonder how I could efficiently do this. The database approach is good for a few hundreds clients simultaneously, but what will happen if I have 100,000 clients?
Thanks for your help/ideas!
A little over a year ago, this was a topic in a Cloud Cover episode: Cloud Cover Episode 43 - Scalable Counters with Windows Azure. They discussed how to create an Apathy Button (similar to the Like button on Facebook).
Steve Marx also discusses this in detail in a blog post with source code: Architecting Scalable Counters with Windows Azure. In this solution they're doing the following:
On each instance, keep track of a local counter
Use Interlocked.Increment to modify the local counter
If the counter changed, save the new value in table storage (have a timer do this every few seconds). For each deployment/instance, you'll have 1 record in the counters table.
To display the total count, take the sum of all records in the counters table (a sketch of this pattern follows below).
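A minimal sketch of the per-instance counter part of that pattern (the table-storage save and the instance-id lookup are placeholders here; the Steve Marx post has the full implementation):

using System;
using System.Threading;

public class GameCounter
{
    private static long _count;
    private static long _lastSaved;
    private static readonly Timer FlushTimer =
        new Timer(_ => Flush(), null, TimeSpan.FromSeconds(5), TimeSpan.FromSeconds(5));

    // Called from the web service each time a game starts: cheap, thread-safe, in-memory.
    public static void GameStarted()
    {
        Interlocked.Increment(ref _count);
    }

    // Timer callback: only touch table storage when the counter actually changed.
    private static void Flush()
    {
        long current = Interlocked.Read(ref _count);
        if (current == _lastSaved) return;

        SaveToTableStorage(GetInstanceId(), current);   // one row per deployment/instance
        _lastSaved = current;
    }

    // Placeholders for the storage upsert and the role-instance id lookup.
    private static string GetInstanceId() { return Environment.MachineName; }
    private static void SaveToTableStorage(string instanceId, long count) { /* upsert the row */ }
}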
Well, there are a bunch of choices. And I don't know which is best for you. But I'll present them here with some pros and cons and you can come to your own conclusions given your requirements.
The simplest answer is "put it in storage." Both SQL Azure and the core Azure table or blob storage options are out there for you. One issue to contend with is performance in the face of large-scale concurrency, but I'd also encourage you to think about correctness. You really want something that supports an atomic increment if you're going to outsource this problem, IMO.
Another variation of a storage oriented option would be a highly available VM. You could spin up your own VM on Azure, back a data drive on to Azure Drives, and then use something on top of the OS to do this (a database server, an app that uses the file system directly, whatever). This would be more similar to what you'd do at home but would have fairly unfortunate trade-offs...your entire cloud is now reliant on the availability of this one VM, cost is something to think about, scalability of the solution, and so on.
Splunk is also an option to consider, if you look at VMs.
As an earlier commenter mentioned, you could compute off of log data. But this would likely not be super real time.
Service Bus is another option to consider. You could pump messages over SB for these events and have a consumer that reads them and emits a "summary." There are a bunch of design patterns to consider if you look at this. The SB stack is pretty well documented. Another interesting element of SB is that you might be able to trade off 100% correctness for perf/scale/cost. This might be a worthy trade-off for you depending upon your goals.
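A rough sketch of that shape with the brokered-messaging API (the connection string, queue name, and the summary callback are assumptions):

using System;
using Microsoft.ServiceBus.Messaging;   // classic Service Bus brokered messaging SDK

public static class GameStartEvents
{
    // Connection string and queue name are placeholders.
    private static readonly QueueClient Client = QueueClient.CreateFromConnectionString(
        "Endpoint=sb://yournamespace.servicebus.windows.net/;SharedAccessKeyName=...;SharedAccessKey=...",
        "game-started");

    // Producer (web service): emit one small message per game start.
    public static void Publish(string playerId)
    {
        Client.Send(new BrokeredMessage(playerId));
    }

    // Consumer (worker): fold the events into a running daily summary.
    // The message pump auto-completes each message when the callback returns without throwing.
    public static void StartConsuming(Action<string> incrementDailySummary)
    {
        Client.OnMessage(message => incrementDailySummary(message.GetBody<string>()));
    }
}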
Azure also exposes queues which might be a fit. I'll admit I think SB is probably a better fit but it is worth looking at both if you are going down this path.
Sorry I don't have a silver bullet but I hope this helps.
I would suggest you follow the pattern described in .NET Multi-Tier Application. This helps you decouple the web role that faces your clients from the worker role that stores the data to a persistence medium (SQL Server or Azure Storage), using the Service Bus between them.
This is also an efficient model to scale, as you can spin up new instances of the web role, the worker role, or both. For the dashboard, depending on the load, you can cache your data periodically and serve it from the cache. This compromises the accuracy of the data somewhat, but still provides an option for easy scaling. You can even invalidate the cache every minute and reload it from the persistence medium to get the latest value.
Regarding whether to use SQL Server or Azure Storage: if there is no need for relational capabilities like JOINs, you can very well go for Azure Storage.
I am in the process of creating an application which will communicate with a single server where WCF web service(s) would be installed. I am a little new to this process and was wondering which of these two options would be better in the long run to handle the load from a significant number of users:
1- Create and install a single Web Service on a multi-core server for all of the client applications to communicate with.
2- Create and install multiple Web Services on a multi-core server, each to communicate with different modules inside of the client application.
All in all, I'm just trying to figure out whether, in terms of processing time with a large number of users, there is a significant difference between options 1 and 2, or if option 2 would just create an unnecessary programming headache.
Thanks,
Patrick
The advantage of having multiple web services would be that each can have their own application pool (i.e. worker process) in IIS. So you can recycle one application pool for one web service without affecting the others.
The advantage of having a single web service would be potentially easier maintenance, since the code is in one file, etc. Of course, if it's a lot of code, this can make maintenance harder too.
So the question is, what's the right level of granularity?
You can split the web services up per business function, and I've found that this is a good approach. For example, if you have some business methods that deal with invoicing, you could put those into an Invoicing web service.
If you have other business methods that deal with shipping orders, you could put those into a Shipping web service.
This creates a nice split, in my opinion, and also lets you leverage the application pool advantages discussed earlier.
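As a concrete (hypothetical) sketch of that kind of per-business-function split in WCF:

using System.Runtime.Serialization;
using System.ServiceModel;

[DataContract]
public class Invoice
{
    [DataMember] public int Id { get; set; }
    [DataMember] public decimal Total { get; set; }
}

// Contracts split by business function rather than one catch-all service.
[ServiceContract]
public interface IInvoicingService
{
    [OperationContract]
    Invoice GetInvoice(int invoiceId);
}

[ServiceContract]
public interface IShippingService
{
    [OperationContract]
    void ShipOrder(int orderId);
}

// Each implementation can be hosted as its own .svc endpoint in its own IIS
// application/app pool, so recycling one does not touch the other.
public class InvoicingService : IInvoicingService
{
    public Invoice GetInvoice(int invoiceId)
    {
        return new Invoice { Id = invoiceId, Total = 0m };   // load from the DB in practice
    }
}

public class ShippingService : IShippingService
{
    public void ShipOrder(int orderId) { /* create the shipment, notify the carrier, etc. */ }
}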
Example
You can see a real world example of this type of split with FedEx. Note how they split their web services up by shipping, tracking and visibility, etc.
I have a little experience with WCF and would like to get your opinion/suggestion on how the following problem can be solved:
A web service needs to be accessible from multiple clients simultaneously, and the service needs to return results from a shared data set. The concrete project I'm working on has to store a list of IP addresses/ranges. This list will be queried by a bunch of web servers for validation purposes, and we're talking about a couple of thousand or more queries per minute.
My initial draft approach was to use a Windows service as a WCF host, with the service-contract-implementing class decorated with [ServiceBehavior(InstanceContextMode = InstanceContextMode.Single, ConcurrencyMode = ConcurrencyMode.Multiple)], holding a list object and custom locking for accessing it. So basically I have a WCF service singleton with a list = shared data -> multiple clients. What I do not like about it is that the data and communication layers are merged into one, and performance-wise this doesn't feel "right".
What I really, really want is a Windows service running an instance of the IP-list-holding container class, a second service running the WCF service contract implementation, and a way for the latter to query the former nicely with minimal blocking. Using another WCF channel would not really take me far from the initial draft implementation, or would it?
What approach would you take? Project is still in a very early stage so complete design re-do is not out of question.
All ideas are appreciated. Thanks!
UPDATE: The data set will be changed dynamically. Web service will have a separate method to add IP or IP range and on top of that there will be a scheduled task that will trigger data cleanup every 10-15 minutes according to some rules.
UPDATE 2: a separate benchmark project will be kicked off that should use MySQL as a data backend (instead of an in-memory list).
It depends how far it has to scale. If a single server will suffice, then fine; keep it conveniently in memory (as long as you can recreate the data if the server gets restarted). If the data-volume is low, then simple blocking (lock) should work fine to synchronize the data, or for higher throughput a ReaderWriterLockSlim. I would probably not store it directly in the WCF class instance, though.
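A minimal sketch of that kind of shared store, kept outside the WCF service class and guarded with a ReaderWriterLockSlim (the class and member names are assumptions; real IP ranges would need a more structured representation than plain strings):

using System.Collections.Generic;
using System.Threading;

public static class IpListStore
{
    private static readonly ReaderWriterLockSlim Lock = new ReaderWriterLockSlim();
    private static readonly HashSet<string> Addresses = new HashSet<string>();

    public static bool Contains(string ip)
    {
        Lock.EnterReadLock();                 // many concurrent readers are allowed
        try { return Addresses.Contains(ip); }
        finally { Lock.ExitReadLock(); }
    }

    public static void Add(string ip)
    {
        Lock.EnterWriteLock();                // writers are exclusive
        try { Addresses.Add(ip); }
        finally { Lock.ExitWriteLock(); }
    }
}

The WCF service methods then simply call IpListStore.Contains / IpListStore.Add, which keeps the shared data out of the service class itself.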
I would avoid anything involving sessions (if/when this ties into the WCF life-cycle); this is rarely helpful to simple services.
For distributed load (over multiple servers) I would give consideration to a separate dedicated backend. A database or memcached / AppFabric / etc would be worth consideration.