High availability & scalability for C#

I've got a C# service that currently runs single-instance on a PC. I'd like to split this component so that it runs on multiple PCs. Each PC should be assigned a certain part of the work. If one PC fails, its work should be moved to a backup machine.
Data synchronization can be done by the DB, so that should not be much of an issue. My current idea is to use some kind of load balancer that splits and sends the incoming requests to the array of PCs and makes sure the work is actually processed.
How would I implement such a functionality? I'm not sure if I'm asking the right question. If my understanding of how this goal should be achieved is wrong, please give me a hint.
Edit:
I wonder if the idea given above (a load balancer splits work packages among the PCs and checks for results) is feasible at all. If there is some kind of already-implemented solution to this seemingly common problem, I'd love to use it.
Availability is a critical requirement.

I'd recommend looking at a Pull model of load-sharing, rather than a Push model. When pushing work, the coordinating server(s)/load-balancer must be aware of all the servers that are currently running in your system so that it knows where to forward requests; this must either be set in config or dynamically set (such as in the Publisher-Subscriber model), then constantly checked to detect if any servers have gone offline. Whilst it's entirely feasible, it can complicate the scaling-out of your application.
With a Pull architecture, you have a central work queue (hosted in MSMQ, SQL Server Service Broker or similar) and each processing service pulls work off that queue. Expose a WCF service to accept external requests and place work onto the queue, safe in the knowledge that some server will do the work, even though you don't know exactly which one. This has the added benefits that each server monitors its own workload and picks up work as and when it is ready, and that you can easily add or remove servers from this model without any change in config.
This architecture is supported by NServiceBus and the communication between Windows Azure Web & Worker roles.
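To make the pull model concrete, here is a minimal sketch of the worker side, assuming an MSMQ private queue (the path .\private$\work and the string payload are placeholders, not anything prescribed above). A WCF front end would simply Send() to the same queue; every PC runs a copy of this worker and they compete for messages.

    using System;
    using System.Messaging; // add a reference to System.Messaging.dll

    class PullWorker
    {
        static void Main()
        {
            // Hypothetical local queue path; create it on first run.
            const string path = @".\private$\work";
            if (!MessageQueue.Exists(path))
                MessageQueue.Create(path);

            using (var queue = new MessageQueue(path))
            {
                queue.Formatter = new XmlMessageFormatter(new[] { typeof(string) });

                while (true)
                {
                    // Blocks until a work item arrives; each worker competes for
                    // messages, so scaling out is just starting another copy.
                    Message message = queue.Receive();
                    var workItem = (string)message.Body;
                    Console.WriteLine("Processing: " + workItem);
                    // ... do the actual work, then record the result in the DB ...
                }
            }
        }
    }

Because no worker needs to know about any other, adding or removing a PC requires no configuration change, which is the main point of the pull approach.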

From what you said, each PC will require a full copy of your service:
Each PC should be assigned a certain part of the work. If one PC fails, its work should be moved to a backup machine.
Otherwise you won't be able to move its work to another PC.
I would be tempted to have a central server which farms out work to individual PCs. This means you would need some form of communication with each machine, and you would need to keep a record on the central server of what work has been assigned where.
You'll also need each machine to measure its own CPU load and reject work if it is too busy.
A multi-threaded approach to the service would make good use of the multiple processor cores that are ubiquitous nowadays.
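As a rough illustration of the bookkeeping such a central server needs (which work item lives on which machine, plus a heartbeat so dead machines can be detected), here is a hedged sketch; the class and member names are made up for the example and the rebalancing is deliberately naive.

    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;

    // Illustrative only: tracks assignments and moves work off machines
    // that have stopped sending heartbeats.
    class Dispatcher
    {
        private readonly ConcurrentDictionary<Guid, string> _assignments =
            new ConcurrentDictionary<Guid, string>();
        private readonly ConcurrentDictionary<string, DateTime> _lastHeartbeat =
            new ConcurrentDictionary<string, DateTime>();

        public void Assign(Guid workItemId, string machine)
        {
            _assignments[workItemId] = machine;
        }

        public void Heartbeat(string machine)
        {
            _lastHeartbeat[machine] = DateTime.UtcNow;
        }

        // Call periodically (e.g. from a timer) to reassign work from failed machines.
        public void ReassignFromDeadMachines(TimeSpan timeout, IList<string> healthyMachines)
        {
            foreach (var assignment in _assignments.ToArray())
            {
                DateTime lastSeen;
                bool alive = _lastHeartbeat.TryGetValue(assignment.Value, out lastSeen)
                             && DateTime.UtcNow - lastSeen < timeout;

                if (!alive && healthyMachines.Count > 0)
                {
                    // Naive rebalancing: hand the item to the first healthy machine.
                    _assignments[assignment.Key] = healthyMachines[0];
                }
            }
        }
    }

In a real system the assignment table would live in the shared database rather than in memory, so the central server itself can be restarted or failed over.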

How about using a server and multi-threading your processing? Or even multi-threading on a PC, since you can get many cores on a standard desktop now.
This obviously doesn't deal with the machine going down, but it could give you much more performance for less investment.
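If you go down that road, the TPL makes the single-box version almost trivial; this is a minimal sketch (the work items and the DoWork body are placeholders).

    using System;
    using System.Linq;
    using System.Threading;
    using System.Threading.Tasks;

    class SingleBoxScaling
    {
        static void Main()
        {
            var workItems = Enumerable.Range(1, 1000); // placeholder work items

            // Spread the items across the available cores; cap the degree of
            // parallelism if the work is mostly I/O- or DB-bound.
            Parallel.ForEach(
                workItems,
                new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount },
                item => DoWork(item));
        }

        static void DoWork(int item)
        {
            // Placeholder for the real processing.
            Console.WriteLine("Processed item {0} on thread {1}",
                item, Thread.CurrentThread.ManagedThreadId);
        }
    }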

You can look into Windows clustering. You will also have to handle a set of issues that depend on the behaviour of the service (if you can add more details about the service itself, I can give a more specific answer).

This depends on how you want to split your workload. This is usually done by:
Splitting the same workload across multiple services
This means the same service is installed on different servers and does the same job. Assume your service reads huge amounts of data from the DB servers and processes it to produce large client-specific data files, and finally those data files are sent to the clients. In this approach all the services installed on the different servers do the same work, but they split it between them to increase performance.
Splitting parts of the workload across multiple services
In this approach each service is assigned its own job and works towards a different goal. In the example above, one service is responsible for reading data from the DB and generating the large data files, and another service is configured only to read the data files and send them to the clients.
I have implemented the 2nd approach in one of my projects, because it let me isolate and debug the errors in case of any failures.

The usual approach for a load balancer is to split service requests evenly between all service instances.
For each work item (request) you can store the relevant information in the database. Each service should then also have at least one background thread checking the database for abandoned work items.
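As a hedged sketch of that recovery thread (the WorkItems table with Status, OwnerMachine and LastHeartbeat columns is a made-up schema; adapt it to whatever you actually store per request):

    using System;
    using System.Data.SqlClient;
    using System.Threading;

    class AbandonedWorkMonitor
    {
        private readonly string _connectionString;

        public AbandonedWorkMonitor(string connectionString)
        {
            _connectionString = connectionString;
        }

        public void Start()
        {
            new Thread(Run) { IsBackground = true }.Start();
        }

        private void Run()
        {
            while (true)
            {
                try
                {
                    using (var connection = new SqlConnection(_connectionString))
                    using (var command = new SqlCommand(
                        // Hypothetical schema: release items whose owner stopped heartbeating.
                        @"UPDATE WorkItems
                          SET Status = 'Pending', OwnerMachine = NULL
                          WHERE Status = 'InProgress'
                            AND LastHeartbeat < DATEADD(MINUTE, -5, GETUTCDATE())",
                        connection))
                    {
                        connection.Open();
                        int reclaimed = command.ExecuteNonQuery();
                        if (reclaimed > 0)
                            Console.WriteLine("Requeued {0} abandoned work item(s)", reclaimed);
                    }
                }
                catch (Exception ex)
                {
                    Console.WriteLine("Recovery check failed: " + ex.Message);
                }

                Thread.Sleep(TimeSpan.FromMinutes(1)); // check interval is arbitrary
            }
        }
    }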

I would suggest that you publish your service through WCF (Windows Communication Foundation).
Then implement a "central" client application which can keep track of available providers of your service and dish out work. The central app will act as scheduler and load balancer of the tasks to be performed.
Check out Juval Löwy's book on WCF ("Programming WCF Services") for a good introduction to this topic.

You can have a look at NGrid : http://ngrid.sourceforge.net/
or Alchemi : http://www.gridbus.org/~alchemi/index.html
Both are grid computing frameworks with load balancers that will get you started in no time.
Cheers,
Florian

Related

Use SQL Service Broker to Decouple Service From Database?

I'm looking at putting together a fairly straight-forward WCF-based service, and I have a question about how best to decouple it from the database.
Background: The service I'm going to be implementing is highly critical, geographically distributed, and needs to be as available as possible through a disaster or database failure. The business logic is pretty simple; it receives events from an external source, maintains a state table, and broadcasts processed updates to connected clients. I'm replacing a service that currently handles 400-600 incoming events per second, and approximately 10-20 concurrently connected clients. There will be multiple instances of the service running in multiple locations across the US. All instances host the same state data and share events. There is one instance of a master (SQL Server 2008) database in one location.
Challenge: I've built a number of applications similar to this in the past, and I have most of the architectural hurdles behind me. But there's one challenge I've come across for which I can't help but imagine there's a better solution: in my design, the database (MSSQL) is used only for persistence; the database is only read when the first instance of the service starts and for offline reporting. During normal operation, the application only ever writes historical data to the DB.
To fully decouple the application from the database, in the past I've used SQL Service Broker: On each server running the service, I install an instance of SQL Server Express that essentially just acts as a queue for Service Broker messages to the core (SSB "target") database. In normal operating conditions, the application executes all its SQL operations against the local instance, which queues/forwards them to the target DB via SSB. This works pretty well, and to be honest I'm fairly happy with it... As long as the local instance of SQL Server Express is up, the application will obviously stay unaware of problems at the target DB, network issues between it and the target DB, etc., and it's highly survivable in the case of a localized disaster. It's easy to monitor, not too horribly ugly to set up, and it's all supported. In short, it works, and I'm content to live with it if I have to.
But it strikes me as a bit of a kludge. It feels like there should be a better way to do it.
Obviously one option is to just queue the database operations in process. I don't like that because if I'm going to decouple things at all, I'd prefer to really decouple and keep my application itself as far away from the DB as possible. I could also write a Data Service that queues these operations... I actually briefly started down that path before thinking to myself, "Wait, isn't this what SSB already does?"
Due to unchangeable external constraints, a more robust/HA SQL Server architecture is not an option. I've been given my one DB cluster and that's that.
So I'm open to just about any thoughts and/or criticisms. Is there something obvious I'm missing? This feels like the kind of thing where there could be something stone-simple I've just somehow overlooked (though not for lack of searching.) Am I making some kind of wider architectural mistake here?
Thanks in advance!
My opinion is obviously biased, but for the record I can point to several fairly big projects that do (or did) it the same way, like High Volume Contiguous Real Time ETL, March Madness on Demand or MySpace's use of SQL Server Service Broker.
But several things have changed in recent years, and the primary change is the rise of PaaS offerings. Today you can have a highly available, scalable database and messaging platform, e.g. SQL Azure and Azure Queues/Azure Service Bus, or DynamoDB and SQS if you're willing to step outside SQL/ACID. Arguably, the price point of a fleet of SQL Express instances pushing to a central SQL Server Standard Edition will be lower than a PaaS solution, but it will be hard to beat PaaS in terms of availability, free maintenance and scale on demand.
So aside from the PaaS point of view above, I would argue that the solution you have is superior to pretty much anything else the MS stack has. WCF is certainly easy to program against, unless you have the anti-SOAP fever, but it has basically zero to offer in terms of availability/reliability: your process is gone === your data is gone, end of story. WCF over MSMQ is 'WCF' in name only; the programming model of queued channels is miles away from the http/net binding WCF programming model. And MSMQ has little to stand up against Service Broker (aside from ubiquity). But then again, as you probably know, I am really biased in my opinion...

Register certain events on client machine and notify to another C#

Please don't be confused by the title of this question; I don't know the exact technical term for what I want to accomplish :). My requirement may be a little strange, and I have already implemented it, but I need a best practice/method to do it properly.
Here is my situation.
I am developing a client system monitoring Windows application (tracking software on the client side and monitoring software on my system). I have many systems connected to a LAN and one monitoring system. If certain actions happen on a client system, I should get notified. I cannot use any databases in my network, so what I do is this: since my system is also connected to the LAN, I shared one folder on my system. Whenever some action happens on a client system, the tracking software creates a file containing the event in the shared folder on my system. The monitoring software uses a timer which continuously checks for new files in the shared folder at a certain interval (15 minutes). If any file is found, the monitoring system knows some event has happened and shows the event.
The problem is that I only get notified after 15 minutes. Also, I don't think this is the best way; there may be better methods. Is there any way to register an event directly with my monitoring application from the client machine?
Please NOTE: I cannot use any Database for this purpose.
Any suggestions will be appreciated.
Take a look at SignalR - it provides real time notification and can be used exactly as you describe.
You would not require a database (but remember if your server isn't running you will miss events - this may or may not be acceptable).
Take a look at FileSystemWatcher. This will monitor directories and raise events. In my experience it works well, but it can miss events under large amounts of traffic (its internal buffer can overflow).
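A minimal sketch of that approach for the shared-folder setup described above (the UNC path and file filter are placeholders):

    using System;
    using System.IO;

    class SharedFolderMonitor
    {
        static void Main()
        {
            // Placeholder path: point this at the shared folder the tracking software writes to.
            var watcher = new FileSystemWatcher(@"\\monitoring-pc\events")
            {
                Filter = "*.txt",
                NotifyFilter = NotifyFilters.FileName | NotifyFilters.LastWrite
            };

            watcher.Created += (sender, e) =>
            {
                // Fires as soon as a file is dropped, instead of every 15 minutes.
                Console.WriteLine("New event file: " + e.FullPath);
            };

            watcher.EnableRaisingEvents = true;

            Console.WriteLine("Watching for event files. Press Enter to exit.");
            Console.ReadLine();
        }
    }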
This sounds like a perfect candidate for MSMQ (MS Message Queue) and Triggers.
Create an MSMQ queue that all your tracking software instances can write to. Then have an MSMQ trigger (perhaps connecting to a front end through WCF/named pipes) display an alert in your monitoring software.
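A hedged sketch of the monitoring side reading such a queue asynchronously (the queue path and string payload are assumptions, and the MSMQ trigger / WCF plumbing is left out):

    using System;
    using System.Messaging; // add a reference to System.Messaging.dll

    class MonitorQueueListener
    {
        static void Main()
        {
            // Hypothetical queue that all tracking software instances write to.
            const string path = @".\private$\tracking-events";
            if (!MessageQueue.Exists(path))
                MessageQueue.Create(path);

            var queue = new MessageQueue(path)
            {
                Formatter = new XmlMessageFormatter(new[] { typeof(string) })
            };

            queue.ReceiveCompleted += (sender, e) =>
            {
                Message message = queue.EndReceive(e.AsyncResult);
                Console.WriteLine("Alert: " + (string)message.Body);
                queue.BeginReceive(); // listen for the next event
            };

            queue.BeginReceive();
            Console.WriteLine("Listening for tracking events. Press Enter to exit.");
            Console.ReadLine();
        }
    }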
You may want to use WCF Framework.
Here are two links that can help you:
wcf-tutorial-events-and-callbacks
wcf-tutorial-basic-interprocess-communication

Networked Client-Server application advice

I'm trying to design an application that will allow two users over a network to play the prisoner's dilemma game (http://en.wikipedia.org/wiki/Prisoner%27s_dilemma).
Basically, this involves:
Game starts (Round 1).
Player 1 chooses to either cooperate, or betray.
Player 2 chooses to either cooperate, or betray.
Each other's decisions are then displayed
Round 2 begins
Etc.
I've done some thinking and searching and I think the application should contain the following:
A server class that accepts incoming TCP/IP connections
GUI clients (a separate program)
For each connection (maximum 2) the server will create a new ConnectedClient class. This class will contain the details of the two player's machines/identities.
The Server class and the ConnectedClient class will connect/subscribe to each other's events so they can alert one another when, e.g., a server instruction is ready to transmit to the players, or the players have transmitted their inputs to the server.
I'm not sure whether the best approach is to use a single thread to do all the work, or to make it multithreaded. Single-threaded would obviously be easier, but I'm not sure whether it is possible in this situation - I've never made an application requiring TCP/IP connections before, and I'm not sure if you can listen for two incoming connections on one thread.
I've found the following guide online, but it seems that it opens two clients on two threads, and they communicate directly to each other - bypassing the server (which I will need to control the game logic): http://www.codeproject.com/Articles/429144/Simple-Instant-Messenger-with-SSL-Encryption-in-Cs
I'm very interested in, and would be grateful for, any advice on how you would go about implementing the application (mainly the server class).
I hope I've explained my intentions clearly. Thanks in advance.
My first piece of advice would be to forget about TCP/IP and sockets here. You can definitely do it with that technology stack, but you would also get a lot of headaches implementing all the things you want, because it is too low-level a technology for this class of task. I would go with TCP/IP and sockets only out of academic interest, or if I needed tremendous control over the communication, or if I had very high performance requirements.
So my second piece of advice would be to look at WCF. Don't be afraid if you haven't used it before; it's not that difficult, and if you were ready to use sockets for your app, you can definitely handle WCF. For your task you can create basic communication from scratch within 1-2 hours using any WCF tutorial.
So, I would create a server-side WCF service that exposes some API operations containing your business logic. It can be hosted within a Windows service, IIS, or even a console application.
Your clients would then use that WCF service, calling its operations as if they were methods on another local class in your project. WCF can also help you with the events you want (it's a slightly more advanced topic, though), and you can mostly forget about threading here; most of it works out of the box.
First, as others have said, separate your game logic as much as you can, so the basic functionality won't depend too much on your communication infrastructure.
For the communication, WCF can handle the task. You can make your clients send a request to a service hosted in IIS, do some kind of identification/authentication, and open a duplex channel through which your service can push results and communicate the start of new rounds.
Once one client connects, it waits for another. When that happens, the service notifies the first client using the duplex channel callback and awaits its choice. Then it asks the second user and awaits their response. When it comes, it notifies both of the result and restarts the game.
Going a little bit deeper in the implementation:
You will have a service with some operations (like Register and PushDecision; more if needed). You will also define a callback interface with the operations your service needs to push to the client (NotifyResult and RequestDecision; again, these are examples). You then create proxies for your clients that map to your service operations and implement the callback operations in a way that exposes events and raises them when the service pushes messages.
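As a hedged sketch of what those two contracts could look like (operation names follow the examples above; binding configuration and hosting are omitted):

    using System.ServiceModel;

    // Callback contract: operations the server pushes to a connected client.
    public interface IGameCallback
    {
        [OperationContract(IsOneWay = true)]
        void RequestDecision(int round);

        [OperationContract(IsOneWay = true)]
        void NotifyResult(int round, string yourChoice, string opponentChoice);
    }

    // Service contract: operations the clients call, tied to the callback above.
    [ServiceContract(CallbackContract = typeof(IGameCallback))]
    public interface IGameService
    {
        [OperationContract]
        void Register(string playerName);

        [OperationContract(IsOneWay = true)]
        void PushDecision(string playerName, string choice);
    }

Inside a service operation the current caller's callback channel is obtained with OperationContext.Current.GetCallbackChannel<IGameCallback>(), which is what the server stores per player in order to push RequestDecision and NotifyResult later.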
A use case:
Client A creates the proxy and calls Register on the server. The server receives the call, registers the client and saves the callback object in its state. A duplex connection is now established. What does that mean? It means that (if you're using PollingDuplexBinding, as you probably will) from now on the proxy object in Client A will be making long-poll requests to the server, checking if there is a callback message. If there isn't, it long-polls again. If there is, it calls the corresponding callback method in the proxy, passing the data the server has pushed. The callback method in the proxy will typically raise an event, or execute a delegate; it's up to you to choose.
Client B connects (calling Register) and does the same as A did, and the server, noticing that two clients are connected, requests a response from A through its saved callback. This can happen during the processing of B's Register call, or it can be triggered to execute on a new thread (or better, run in the ThreadPool or started as a new Task) in B's Register call.
Client A will receive the server callback requesting its choice. It can then notify the user and get the choice through the UI. A new call is made to the server (PushDecision, for example). The server receives Client A's choice and asks B the same way. Once it has both responses, it calculates the result and pushes the outcome to the clients.
An advantage of using Duplex Channels with PollingDuplex with WPF is that, because it uses long polling, there is no need to use any port other than 80.
This is by no means a final implementation; it's just a little guide to give you some ideas instead of just giving you some misty advice. Of course, there may be a bunch of other ways of doing this with WCF.
We can first assume that the application handles only two users at a time and then, if you want, you can scale up, making your service keep some form of state, for example a mapping table with locked access.
Some thoughts on WCF: there is an easy path to start developing with WCF using the Visual Studio tools (svcutil), but I don't like that approach. You don't "get to know" the WCF infrastructure well, you become tied to the verbose magic with which it generates your proxies, and you lose flexibility, especially in special scenarios like the duplex polling you may want to use.
The other way, which is to manually create your services and your proxies, is not that hard, though, and gets very interesting once you realize what you can do with it. Related to that I can give you one piece of advice: do everything you can to make your proxy operations use the Task-based Async Pattern (you can see the different ways to implement proxy operations here). This will make your code much cleaner and more straightforward when combined with the new C# async/await keywords, and your UI will be a joy to implement.
I can recommend some links to get you started. Some of them are old, but very didactic.
There used to be a fantastic article of WCF in this link but it seems to be now offline. Luckily, I found the content available there in a file in this link.
This one covers your hosting options.
Topics on WCF infrastructure: link
Topics on Duplex Services: link link link
Topics on Task-based Async Pattern: link link link
Well, one piece of advice I can give you, if you insist that all users communicate through the server and you want your application to scale:
Separate your logic (by understanding each part of the logic you want to build on the server)
Design your classes so that they can handle multiple users per transaction
Use IOCP (I/O completion ports) whenever possible
It depends on the structure of your application. If you need authentication, user profiles etc., you may introduce WCF or some other web service layer for users and hide your actual work behind it (this will cost you performance, but it might be the only suitable solution you have). So you would have your authentication framework at the top of your server logic, and pipelined action logic behind it; i.e. users are authenticated in order to access the services presented by the server, but those services pipeline all users and handle as many as possible simultaneously. If you don't need authentication, then clients might communicate directly with your server logic, and you may use completion ports on each user's request - a lot of work to be done here.

running timer from global.asax vs quartz.net

I am developing an ASP.NET site that needs to hit a few social media sites daily for blanket friend/follower data. I have chosen Arvixe business class as my hosting. In the future, if we grow, I'd love to get onto a dedicated server and run a Windows service; however, since that is not in the cards at this point, I need another reliable way of running scheduled tasks. I am familiar with running a thread timer from App_Code (global.asax), but app pool recycling will cause some problems with the timer. I have never used task scheduling like Quartz but have read a lot about it on Stack Overflow.

I was looking for some advice on how to approach my goal. One big problem I have with either method is that I will need the crawler threads to sleep for up to an hour regularly due to API call limits. My first thought was to use the DB to record the start and end of each job. When the app pool recycles, I would clear out any parts not completed and only start parts that do not have a record of running on that day. What do the experts here think? Any good links to sample architecture for this type of scheduling?
It doesn't really matter what method you use, whether you roll your own or use Quartz. You are at the mercy of ASP.NET/IIS because that's where you want to host it.
Do you have a spare computer lying around that can just run a scheduled task and upload data to a hosted database? To be honest, it's possibly safer (depending on your use case) to just do it that way than to try to run a scheduler in ASP.NET.
Somewhat along the lines of Bryan's post:
Find a spare computer.
Instead of allowing DB access, have it call a web service on your site. This service call should be the initiator of the process you are trying to run. Don't try to put params into it; just something like "StartProcess()" should work fine.
As far as going to sleep and resuming later, take a look at Workflow Foundation. There are some nice built-in features to persist state.
Don't expose your DB to the outside world; instead expose that page or web service and wrap some security around it. WCF has some nice built-in security features for that.
The best part is that when you decide to move off, you can keep your web service and have it called from a Windows Service in the same manner.
As long as you use a persistent job store (like a database) and you write and schedule your jobs so that they can handle things like being killed halfway through, having IIS recycle your process is not that big a deal.
The bigger issue is that IIS shuts your site down if it doesn't have traffic. If you can keep your site up, then just make sure you set the misfire policy appropriately and that your jobs store any state data needed to pick up where they left off, and you should be able to pull it off.
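As a minimal sketch of that setup, assuming Quartz.NET 3.x (the job name, interval and misfire instruction are placeholders, and the ADO.NET job store still has to be configured separately in the scheduler properties):

    using System;
    using System.Threading.Tasks;
    using Quartz;
    using Quartz.Impl;

    // Placeholder job: persist progress (e.g. which accounts were crawled) inside
    // Execute so a recycle mid-run can pick up where it left off.
    public class CrawlJob : IJob
    {
        public Task Execute(IJobExecutionContext context)
        {
            Console.WriteLine("Crawling social media APIs at {0}", DateTime.UtcNow);
            return Task.CompletedTask;
        }
    }

    public static class SchedulerSetup
    {
        public static async Task StartAsync()
        {
            IScheduler scheduler = await new StdSchedulerFactory().GetScheduler();
            await scheduler.Start();

            IJobDetail job = JobBuilder.Create<CrawlJob>()
                .WithIdentity("dailyCrawl")
                .Build();

            ITrigger trigger = TriggerBuilder.Create()
                .WithSimpleSchedule(s => s
                    .WithIntervalInHours(24)
                    .RepeatForever()
                    // Run a missed execution as soon as the scheduler comes back up.
                    .WithMisfireHandlingInstructionFireNow())
                .Build();

            await scheduler.ScheduleJob(job, trigger);
        }
    }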
If you are language-agnostic and don't mind writing your "job-activation-script" in your favourite, Linux-supported language...
One solution that has worked very well for me is:
Getting relatively cheap, stable Linux hosting (from reputable companies),
Creating a WCF service on your .NET hosted platform that contains the logic you want to run regularly (RESTful, SOAP or XML-RPC, whichever suits you),
Triggering the calls from cron jobs on your Linux host, written in your language of choice (I use PHP).
This has worked very well, like I said: no VPS expense, configurable, and externally activated. I have one central place where my jobs are activated, with 99 to 100% uptime (I've never had any failures).

Need advice to query data from sql server on every 5 seconds and send it to other app.(.NET C#)

I have a requirement to read data from a table (SQL Server 2005) and send that data to another application every 5 seconds. I am looking for the best approach to do this.
Right now I am planning to write a console application (.NET and C#) which will read the data from SQL Server 2005 (a QUEUE table which is filled by different applications) and send it to the other application through TCP/IP (central server), and to run that console application as a scheduled task every 5 seconds. I am assuming the scheduled task will take care of discarding a new run if the task is already running (to avoid concurrent executions).
Has anybody come across a similar situation? Please share your experience and advise me on the best approach.
Thanks in advance for your valuable time spent on my request.
-Por-hills-
We have done similar work. If you are going to query a SQL database every 5 seconds, be sure to use a stored procedure that is optimized to be very fast. It should not update data unless absolutely necessary. This approach is typically called 'polling', and I've found that it is acceptable if your SQL Server is not otherwise bogged down with too many other calls.
In the approaches we've used, a Windows Service that does the polling works well.
To communicate results to another app, it all depends on what the other app is doing, what type of interface you can make into it, and how quickly you need the results. The WCF class libraries from Microsoft provide many workable approaches for real-time communication. My preference is to write to the application's database and then have the application read the data (if that works for that app). If you need something real-time, WCF is the way to go, and I'd suggest using a stateless protocol like HTTP if < 5 sec response time is required (using standard HTTP posts), or TCP/IP if sub-second response time is required.
Since I assume your central storage is also SQL Server 2005, have you considered using what SQL Server 2005 offers out of the box to achieve your requirements? Rather than poll every 5 seconds, marshal and unmarshal TCP/IP, implement authentication and authorization for the TCP/IP pipe, scale TCP transmission with boxcarring, manage message acknowledgments and retries, deal with central site availability, fragment large messages and so on and so forth, why not simply use Service Broker? It does all you need and more, out of the box, already tested, already tuned for performance and scalability.
Getting reliable messaging right is not trivial, and you should focus your efforts on meeting your business specifics, not reinventing the wheel.
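For illustration, receiving from a Service Broker queue in C# looks roughly like this; it is only a hedged sketch, with a made-up connection string and a queue named EventQueue, and it skips conversation/end-dialog handling:

    using System;
    using System.Data.SqlClient;

    class ServiceBrokerReceiver
    {
        static void Main()
        {
            // Placeholder connection string.
            const string connectionString = "Server=.;Database=Events;Integrated Security=true";

            using (var connection = new SqlConnection(connectionString))
            {
                connection.Open();

                while (true)
                {
                    using (var command = new SqlCommand(
                        // Blocks for up to 5 seconds waiting for a message instead of polling a table.
                        @"WAITFOR (
                              RECEIVE TOP(1) conversation_handle, message_type_name, message_body
                              FROM EventQueue
                          ), TIMEOUT 5000;",
                        connection))
                    using (var reader = command.ExecuteReader())
                    {
                        while (reader.Read())
                        {
                            string messageType = reader.GetString(1);
                            byte[] body = reader.IsDBNull(2) ? null : (byte[])reader[2];
                            Console.WriteLine("Received {0} ({1} bytes)",
                                messageType, body == null ? 0 : body.Length);
                            // Forward to the other application here.
                        }
                    }
                }
            }
        }
    }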
I would recommend writing a Windows Service (since you are using C#) that has a timer which runs every 5 seconds. That way you won't be starting and stopping an application all the time, it can run even when no one is logged into the machine, and it will start automatically when the machine is restarted.
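A hedged sketch of the timer portion of such a service (call Start/Stop from OnStart/OnStop of your ServiceBase class); the overlap guard addresses the concurrent-execution concern from the question:

    using System;
    using System.Threading;
    using System.Timers;
    using Timer = System.Timers.Timer;

    public class QueuePoller
    {
        private readonly Timer _timer = new Timer(5000); // 5 seconds
        private int _running; // 0 = idle, 1 = a poll is in progress

        public void Start()
        {
            _timer.Elapsed += OnElapsed;
            _timer.Start();
        }

        public void Stop()
        {
            _timer.Stop();
        }

        private void OnElapsed(object sender, ElapsedEventArgs e)
        {
            // Skip this tick if the previous poll hasn't finished yet.
            if (Interlocked.CompareExchange(ref _running, 1, 0) != 0)
                return;

            try
            {
                // Placeholder: read the QUEUE table and forward rows over TCP/IP here.
                Console.WriteLine("Polling queue table at {0}", DateTime.UtcNow);
            }
            finally
            {
                Interlocked.Exchange(ref _running, 0);
            }
        }
    }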
For one of my projects, I needed to do something periodically. I opted for a service and set up a timer that takes care of reading the data. You might consider that solution. It has worked well for me.
I suggest creating a Windows service rather than a console application, and doing the timing yourself: create a timer and execute one step on each timer event. For the communication you have many choices; I would consider using standard technologies like a web service or Windows Communication Foundation.
Besides this custom solution, I would evaluate whether the task can be solved using Microsoft Integration Services.
Finally, another question comes to mind: why do you need this application at all? Why don't the application(s) consuming the data query the database themselves? Is the expensive polling required? Is it possible for the data producers to signal the availability of new data directly to the data consumers?
I am not sure about the details of your project, specifically related to security, but maybe it would be better to create an SSIS package and schedule it as a job?
