Hopefully two simple questions relating to creating a server application:
Is there a theoretical/practical limit on the number of simultaneous sockets that can be open? (Ignoring the resources required to process the data once it has arrived.) If it's relevant, I am targeting the .NET Framework.
Should each connection be run in a separate thread that's permanently assigned to it, or should a thread pool be used? The dedicated-thread approach seems simpler, but it seems odd to have 100+ threads running at once. Is this acceptable practice?
Any advice is greatly appreciated
Venatu
You may find the following answer useful. It illustrates how to write a scalable TCP server using the .NET thread pool and asynchronous sockets methods (BeginAccept/EndAccept and BeginReceive/EndReceive).
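A minimal sketch of that asynchronous-accept/receive pattern (the class and handler names here are my own, and error handling is omitted):

```csharp
using System;
using System.Net;
using System.Net.Sockets;

class AsyncEchoServer
{
    private readonly Socket _listener;

    public AsyncEchoServer(int port)
    {
        _listener = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        _listener.Bind(new IPEndPoint(IPAddress.Loopback, port));
        _listener.Listen(100);
        _listener.BeginAccept(HandleAccept, null);      // callback runs on a thread-pool thread
    }

    private void HandleAccept(IAsyncResult ar)
    {
        Socket client = _listener.EndAccept(ar);
        _listener.BeginAccept(HandleAccept, null);      // immediately start accepting the next client

        byte[] buffer = new byte[4096];
        client.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None,
            result => HandleReceive(client, buffer, result), null);
    }

    private void HandleReceive(Socket client, byte[] buffer, IAsyncResult ar)
    {
        int read = client.EndReceive(ar);
        if (read == 0) { client.Close(); return; }      // peer closed the connection

        client.Send(buffer, 0, read, SocketFlags.None); // echo the data back
        client.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None,
            result => HandleReceive(client, buffer, result), null);
    }
}
```

Note that no thread is dedicated to any single connection; the thread pool services whichever sockets currently have data.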
That being said, it is rarely a good idea to write your own server when you could use one of the numerous WCF bindings (or even write custom ones) and benefit from the full power of the WCF infrastructure. It will probably scale better than most custom-written servers.
There are practical limits, yes. However, you will most likely run out of resources to handle the load long before you reach them: CPU or memory will be exhausted before the connection limit is.
For maximum scalability, you don't want a separate thread per connection; instead, use an asynchronous model that only uses threads when servicing active connections (i.e., ones currently sending or receiving data).
If I remember correctly (I did socket programming a long time ago), the best way to implement them is with the ReceiveAsync (.NET 3.5) or BeginReceive methods, using asynchronous callbacks that run on the thread pool. Don't open a thread for every connection; it is a waste of resources.
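A hedged sketch of the ReceiveAsync pattern with SocketAsyncEventArgs (the ReceiveLoop helper and its callback are illustrative, not a standard API):

```csharp
using System;
using System.Net.Sockets;

static class ReceiveLoop
{
    // Starts an asynchronous receive loop on an already-connected socket.
    public static void Start(Socket socket, Action<byte[], int> onData)
    {
        var args = new SocketAsyncEventArgs();
        args.SetBuffer(new byte[4096], 0, 4096);
        args.Completed += (sender, e) => Process(socket, e, onData);

        // ReceiveAsync returns false when it completed synchronously;
        // in that case the Completed event will NOT fire, so process inline.
        if (!socket.ReceiveAsync(args))
            Process(socket, args, onData);
    }

    private static void Process(Socket socket, SocketAsyncEventArgs e, Action<byte[], int> onData)
    {
        while (true)
        {
            if (e.SocketError != SocketError.Success || e.BytesTransferred == 0)
            {
                socket.Close();          // error or orderly shutdown
                return;
            }
            onData(e.Buffer, e.BytesTransferred);
            if (socket.ReceiveAsync(e))  // pending: Completed will call back
                return;
        }
    }
}
```

The same SocketAsyncEventArgs object is reused for every receive on the connection, which is the main allocation advantage of this API over BeginReceive.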
A few years ago I developed a server app (C#, .NET 4.0) that has multiple clients which connect to it. The way I did this was to create a thread for every single connection, and maintain a list of these connections. When I tested the app, it handled connections from 50 clients across my country, and it ran OK (from what I saw).
My questions are these:
For a scalable solution, is multi-threading a viable solution for handling multiple connections to various clients, or should I handle all connections on the same thread?
Are there limits to the number of threads and threading in general under .NET?
Are there downsides for using threads in .NET?
I know this is sort of vague, but I have forgotten some of the more intricate details since I developed the project some time ago. I am interested in developing a scalable solution for a server app in .NET and would like to know from the start if there were areas for improvement in my earlier approach.
UPDATE 1
I didn't use a thread pool. I actually created a Thread per connection running a single method (let's call it threadLife).
In threadLife I had a while(true) loop in which I waited for messages from the client, so the loop was blocked until a message arrived.
In my application, the connections were quite stable (i.e. the clients would stay connected for long periods of time), so each connection was kept alive until the client disconnected. I didn't close the connection after every message; I would receive very frequent messages that let me know the client's state.
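The thread-per-connection setup described above looks roughly like this (PerClientHandler and the line-based message framing are assumptions for illustration):

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Threading;

static class PerClientHandler
{
    // One dedicated thread per client, blocking in a receive loop
    // for the lifetime of the connection.
    public static Thread StartThreadLife(TcpClient client, Action<string> onMessage)
    {
        var t = new Thread(() =>
        {
            using (var reader = new StreamReader(client.GetStream()))
            {
                while (true)                 // blocks here until a line arrives
                {
                    string line = reader.ReadLine();
                    if (line == null) break; // client disconnected
                    onMessage(line);
                }
            }
            client.Close();
        });
        t.IsBackground = true;               // don't keep the process alive
        t.Start();
        return t;
    }
}
```

This works at 50 clients, but each connection permanently occupies a thread (and its stack) even while idle, which is the scalability problem discussed in the answers below.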
Thread-per-connection is not a scalable solution.
To scale well, you should use asynchronous socket methods exclusively. The question is whether to multiplex them all on a single thread or to use the thread pool. The thread pool would scale better than multiplexing, but it introduces multithreading complexities.
A lot of devs attempt to learn socket programming and multithreading at the same time, which is just too much.
One can use message queues, load balancing, dispatching, and so on. There is no single answer: some solutions fit some problems well, others don't.
Good places to start can be:
the ØMQ documentation
Warning: Unstable Paradigms!
The Push Framework Technical Architecture
One thread per connection would not be scalable.
I would suggest that one thread could handle a set of connected clients. If the number of connected clients increases, the server application should be able to add as many threads as needed to handle the ramp-up, and then additional server instances if things continue to grow.
Instead of having threads that each do everything (as per-connection handlers do), have dedicated threads that each process one kind of task and share in-memory data in a synchronized way.
This is more or less the way IIS works: the number of worker processes, the number of threads, and the thread pools are all manageable through the control panel.
I remember the OpenSim project (a virtual world platform) once ran this way: one thread per connected client. It has since been refactored along the lines explained above.
Apparently you are well started already with multithreading. Maybe this free ebook will help you to dig further.
When I started with multithreading, I had some trouble at first understanding that it means several executions of the same code at the same time, possibly within the same instance.
Does anyone know if AspNetWebSocket (introduced in .NET Framework 4.5) uses IOCP for handling requests, rather than one thread per connection?
There's a relatively easy way for you to work this out for yourself.
Create a simple program.
Watch it with a task monitor and look at the number of threads.
Create lots of web socket connections to it.
See if the thread count increases whilst the connections are alive.
Compare the number of connections to the number of threads.
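For the thread-counting step, the process's own thread count can be sampled like this (the probe class name is made up):

```csharp
using System;
using System.Diagnostics;

static class ThreadCountProbe
{
    // Number of OS threads currently alive in this process.
    public static int CurrentThreadCount()
    {
        return Process.GetCurrentProcess().Threads.Count;
    }
}
```

Sample CurrentThreadCount() before and after opening N connections; an IOCP-based implementation should show far fewer than N additional threads.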
But what does it matter? All that counts is whether the API meets your performance and scalability requirements, which you will only discover by performance-testing your application.
Note that I would be VERY surprised if the implementation does NOT use IOCP, but there's really little point in asking, IMHO.
I want to create a high-performance server in C# which could take about ~10k clients. I started writing a TcpServer in C#, and for each client connection I open a new thread. I also use one thread to accept connections. So far so good; it works fine.
The server has to deserialize incoming AMF objects, do some logic (like saving the position of a player), and send some objects back (serializing them). I am not worried about the serializing/deserializing part at the moment.
My main concern is that I will have a lot of threads with 10k clients, and I've read somewhere that an OS can only hold a few hundred threads.
Are there any sources/articles available on writing a decent asynchronous, threaded server? Are there other possibilities, or will 10k threads work fine? I've looked on Google, but I couldn't find much about design patterns or clear explanations.
You're going to run into a number of problems.
You can't spin up 10,000 threads, for a couple of reasons. It'll thrash the kernel scheduler. And if you're running a 32-bit process, the default stack reservation of 1 MB per thread means 10k threads will reserve about 10 GB of address space. That'll fail.
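The address-space arithmetic behind that claim, as a small worked example:

```csharp
static class StackMath
{
    // Address space reserved for thread stacks:
    // thread count times the per-thread stack reservation.
    public static long ReservedStackBytes(long threadCount, long stackBytesPerThread)
    {
        return threadCount * stackBytesPerThread;
    }
}
// ReservedStackBytes(10000, 1024 * 1024) == 10485760000 bytes, about 9.8 GB --
// far more than the ~2 GB user address space of a default 32-bit process.
```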
You can't use a simple select loop either. At its heart, select is O(N) in the number of sockets. With 10k sockets, that's bad.
You can use IO Completion Ports. This is the scenario they're designed for. To my knowledge there is no stable, managed IO Completion port library. You'll have to write your own using P/Invoke or Managed C++. Have fun.
The way to write an efficient multithreaded server is to use I/O completion ports (using a thread per request is quite inefficient, as #Marcelo mentions).
If you use the asynchronous version of the .NET socket class, you get this for free. See this question which has pointers to documentation.
You want to look into using IO completion ports. You basically have a threadpool and a queue of IO operations.
"I/O completion ports provide an efficient threading model for processing multiple asynchronous I/O requests on a multiprocessor system. When a process creates an I/O completion port, the system creates an associated queue object for requests whose sole purpose is to service these requests. Processes that handle many concurrent asynchronous I/O requests can do so more quickly and efficiently by using I/O completion ports in conjunction with a pre-allocated thread pool than by creating threads at the time they receive an I/O request."
You definitely don't want a thread per request. Even if you have fewer clients, the overhead of creating and destroying threads will cripple the server, and there's no way you'll get to 10,000 threads; the OS scheduler will die a horrible death long before then.
There are numerous articles online about asynchronous server programming in C# (e.g., here). Just google around a bit.
I know there are many cases where multi-threading is a good fit for an application, but when is it best to multi-thread a .NET web application?
A web application is almost certainly already multi-threaded by the hosting environment (IIS etc.). If your page is CPU-bound (and wants to use multiple cores), then arguably multiple threads are a bad idea, because when your system is under load you are already using them all.
The time it might help is when you are IO bound; for example, you have a web-page that needs to talk to 3 external web-services, talk to a database, and write a file (all unrelated). You can do those in parallel on different threads (ideally using the inbuilt async operations, to maximise completion-port usage) to reduce the overall processing time - all without impacting local CPU overly much (here the real delay is on the network).
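A sketch of that IO-bound fan-out, assuming three unrelated HTTP endpoints (this uses the later HttpClient/Task.WhenAll APIs for brevity; the older Begin/End pattern achieves the same completion-port effect):

```csharp
using System.Net.Http;
using System.Threading.Tasks;

static class FanOut
{
    // Starts all three requests before awaiting any of them, so they run
    // concurrently; no thread is blocked while the requests are in flight.
    public static Task<string[]> FetchAllAsync(HttpClient http,
        string urlA, string urlB, string urlC)
    {
        Task<string> a = http.GetStringAsync(urlA);
        Task<string> b = http.GetStringAsync(urlB);
        Task<string> c = http.GetStringAsync(urlC);
        return Task.WhenAll(a, b, c);
    }
}
```

The overall latency is roughly that of the slowest call rather than the sum of all three.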
Of course, in such cases you might also do better by simply queuing the work in the web application, and having a separate service dequeue and process them - but then you can't provide an immediate response to the caller (they'd need to check back later to verify completion etc).
IMHO you should avoid the use of multithreading in a web-based application. A multithreaded design may increase the performance of a standard app (with the right design), but in a web application you usually want to maintain high throughput rather than raw per-request speed. That said, if you only have a few concurrent connections, you may be able to use parallel threads without a global performance degradation.
Multithreading is a technique that gives a single process more processing time so it can run faster: more threads consume more CPU cycles (from multiple CPUs, if you have them). For a desktop application, this makes a lot of sense. But granting more CPU cycles to one web user takes those same cycles away from the 99 other users making requests at the same time! So, technically, it's a bad thing.
However, a web application might use other services and processes that are using multiple threads. Databases, for example, won't create a separate thread for every user that connects to them. They limit the number of threads to just a few, adding connections to a connection pool for faster usage. As long as there are connections available or pooled, the user will have database access. When the database runs out of connections, the user will have to wait.
So, basically, multiple threads can be used in a web context to limit the number of active users at a specific moment! It allows the system to share a resource among multiple users without overloading it; users simply have to stand in line until it's their turn.
This would not be multi-threading in the web application itself, but multi-threading in a service that is consumed by the web application. In this case, it's used as a limitation by only allowing a small amount of threads to be active.
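The "stand in line" behaviour can be sketched with a semaphore that caps how many callers use a resource concurrently (ThrottledResource is a made-up name, not a framework type):

```csharp
using System;
using System.Threading;

class ThrottledResource
{
    private readonly Semaphore _slots;

    public ThrottledResource(int maxConcurrent)
    {
        // Only maxConcurrent callers may be inside Use() at once.
        _slots = new Semaphore(maxConcurrent, maxConcurrent);
    }

    public T Use<T>(Func<T> work)
    {
        _slots.WaitOne();           // stand in line if all slots are taken
        try { return work(); }
        finally { _slots.Release(); }
    }
}
```

This is essentially what a database connection pool does: the pool size bounds concurrency, and excess callers wait rather than overloading the resource.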
In order to benefit from multithreading, your application has to do a significant amount of work that can be run in parallel. If this is not the case, the overhead of multithreading may very well outweigh the benefits.
In my experience most web applications consist of a number of short running methods, so apart from the parallelism already offered by the hosting environment, I would say that it is rare to benefit from multithreading within the individual parts of a web application. There are probably examples where it will offer a benefit, but my guess is that it isn't very common.
ASP.NET is already capable of spawning several threads for processing several requests in parallel, so for simple request processing there is rarely a case when you would need to manually spawn another thread. However, there are a few uncommon scenarios that I have come across which warranted the creation of another thread:
If there is some operation that might take a while and can run in parallel with the rest of the page processing, you might spawn a secondary thread there. For example, if there was a web service that you had to poll as a result of the request, you might spawn another thread in Page_Init, and check for results in Page_PreRender (waiting if necessary). Though it's still questionable whether this would be a performance benefit - spawning a thread isn't cheap, and the time between a typical Page_Init and Page_PreRender is measured in milliseconds anyway. Keeping a thread pool for this might be a little more efficient, and ASP.NET also has something called "asynchronous pages" that might be even better suited to this need.
If there is a pool of resources that you wish to clean up periodically. For example, imagine that you are using some weird DBMS that comes with limited .NET bindings, but there is no pooling support (this was my case). In that case you might want to implement the DB connection pool yourself, and this would necessitate a "cleaner thread" which would wake up, say, once a minute and check if there are connections that have not been used for a long while (and thus can be closed off).
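The "cleaner thread" idea from the second scenario might be sketched like this (PooledConnection is a minimal stand-in for a real driver's connection type):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

class PooledConnection            // minimal stand-in for a real driver connection
{
    public DateTime LastUsed;
    public void Close() { }
}

class ConnectionPoolCleaner
{
    private readonly List<PooledConnection> _pool;
    private readonly object _lock = new object();

    public ConnectionPoolCleaner(List<PooledConnection> pool)
    {
        _pool = pool;
        var cleaner = new Thread(CleanLoop) { IsBackground = true };
        cleaner.Start();             // wakes up once a minute
    }

    // One sweep: close and drop connections idle longer than maxIdle.
    public void SweepOnce(TimeSpan maxIdle)
    {
        lock (_lock)
        {
            _pool.RemoveAll(c =>
            {
                if (DateTime.UtcNow - c.LastUsed <= maxIdle) return false;
                c.Close();
                return true;
            });
        }
    }

    private void CleanLoop()
    {
        while (true)
        {
            Thread.Sleep(TimeSpan.FromMinutes(1));
            SweepOnce(TimeSpan.FromMinutes(10));
        }
    }
}
```

Keeping the sweep in its own method also makes the cleanup logic testable without waiting on the timer.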
Another thing to keep in mind when implementing your own threads in ASP.NET - ASP.NET likes to kill off its processes if they have been inactive for a while. Thus you should not rely on your thread staying alive forever. It might get terminated at any moment and you better be ready for it.
I'm looking to move a Windows C++ application to C# so that some major enhancements are a bit easier. The C++ application is single-threaded and uses a home-grown reactor pattern for all event handling of accepting, reading and writing sockets, and timers. All socket handling is done async.
What is the accepted way to implement a reactor pattern in C#? Are there existing libraries?
brofield,
Unfortunately, the mentality of the C# world is still in the thread-per-connection realm. I'm looking for a way to handle multiple connections on a single Compact Framework/Windows CE box, and I'm looking to write my own Proactor/Reactor pattern (fashioned after the one used in ACE). The Compact Framework doesn't seem to support asynchronous connecting, just asynchronous reading and writing. I also need tight control over timeouts (it's a soft real-time application).
Alan,
One reason to implement a proactor/reactor pattern is so that you don't have to have a thread running for each connection. A web server is the classic example: a busy server could easily have hundreds of connections active at any one time. With that many threads (and I've seen implementations that use one thread to read and another to write), the time spent context switching becomes significant. Under Windows CE on a 750 MHz ARM processor, I measured over a millisecond, with peaks as high as 4 milliseconds.
I still find that most C# and Java applications I've come across have far too many threads running. Starting another thread seems to be the solution for everything. For example, Eclipse (the IDE) uses 44 threads before I even open a project. 44 threads? To do what, exactly? Is that why Eclipse is so slow?
Have a read of Asynchronous Programming in C# using Iterators;
In this article we will look at how to write programs that perform asynchronous operations without the typical inversion of control. To briefly introduce what I mean by 'asynchronous' and 'inversion of control': asynchronous refers to programs that perform long-running operations that don't necessarily block the calling thread, for example accessing the network, calling web services, or performing any other I/O operation in general.
The Windows kernel has a very nice asynchronous wait API called I/O Completion Ports, which is very good for implementing a reactor pattern. Unfortunately, the System.Net.Sockets library in the .NET framework does not fully expose I/O Completion Ports to developers. The biggest problem with System.Net.Sockets is that you have no control over which thread will dequeue an asynchronous completion event. All your asynchronous completions must occur on some random global .NET ThreadPool thread, chosen for you by the framework.
Threads, sockets, and event handling have first-class support in C# and .NET, so the need for an external library is pretty light. In other words, if you know what you are doing, the framework makes it very straightforward to roll your own socket event-handling routines.
Both companies I have worked for tended to reuse the default .NET Socket library as-is for all network connections.
Edit to add: Basically what this means is this is an excellent opportunity to learn about Delegates, Events, Threads, and Sockets in .NET/C#
Check SignalR. It looks very promising.
Have a look at the Interlace project on Google Code. We use it in all of our products.