Does anyone know if AspNetWebSocket (introduced in .NET Framework 4.5) uses IOCP for handling requests instead of one thread per connection?
There's a relatively easy way for you to work this out for yourself.
Create a simple program.
Watch it with a task monitor and look at the number of threads.
Create lots of web socket connections to it.
See if the thread count increases whilst the connections are alive.
Compare the number of connections to the number of threads.
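A minimal sketch of the measurement side (illustrative, not a complete load test; run it inside, or alongside, the server under test while connections are opened against it):

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Sample this process's thread count once per second so you can watch
// whether it grows with the number of live connections.
class ThreadCountMonitor
{
    public static int SampleThreadCount()
    {
        var process = Process.GetCurrentProcess();
        process.Refresh();                // re-read OS-level process info
        return process.Threads.Count;     // OS threads, not only managed ones
    }

    public static void RunMonitor(int seconds)
    {
        for (int i = 0; i < seconds; i++)
        {
            Console.WriteLine($"Threads: {SampleThreadCount()}");
            Thread.Sleep(1000);           // one sample per second
        }
    }
}
```

If the count stays roughly flat while connections climb, you are looking at an IOCP/thread-pool implementation rather than thread-per-connection.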
But does it really matter? What counts is whether the API meets your performance and scalability requirements, and you will only discover that by performance testing your application.
Note that I would be VERY surprised if the implementation did NOT use IOCP, but there's really little point in asking, IMHO.
A few years ago I developed a server app (C#, .NET 4.0) that has multiple clients connecting to it. The way I did this was to create a thread for every single connection and maintain a list of these connections. When I tested the app, it handled connections from 50 clients across my country, and it ran OK (from what I saw).
My questions are these:
For a scalable solution, is multi-threading a viable solution for handling multiple connections to various clients, or should I handle all connections on the same thread?
Are there limits to the number of threads and threading in general under .NET?
Are there downsides for using threads in .NET?
I know this is sort of vague, but I have forgotten some of the more intricate details since I developed the project some time ago. I am interested in developing a scalable server app in .NET and would like to know from the start whether there were areas for improvement in my approach.
UPDATE 1
I didn't instantiate a thread pool. I actually created a Thread for a method (let's call it threadLife).
In threadLife I had a while(true) loop in which I waited for messages from the client. The loop blocked until I received a message.
In my application, the connections were quite stable (i.e. the clients would stay connected for long periods of time), so the connections were kept alive until the client disconnected (I didn't close the connection after every message; I would receive very frequent messages that let me know the client's state).
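As a sketch, the pattern described above looks roughly like this (names such as ConnectionHandler and ThreadLife are illustrative; real code needs error handling and shutdown logic):

```csharp
using System.Net.Sockets;
using System.Threading;

// Thread-per-connection: one dedicated thread blocks in a read loop
// for the lifetime of each client connection.
class ConnectionHandler
{
    private readonly TcpClient client;

    public ConnectionHandler(TcpClient client) { this.client = client; }

    public void Start()
    {
        new Thread(ThreadLife) { IsBackground = true }.Start();
    }

    private void ThreadLife()
    {
        using (client)
        {
            NetworkStream stream = client.GetStream();
            var buffer = new byte[4096];
            while (true)
            {
                // Read blocks this thread until the client sends data.
                int read = stream.Read(buffer, 0, buffer.Length);
                if (read == 0) break; // graceful disconnect
                // ... process the message in buffer[0..read) ...
            }
        }
    }
}
```

The cost is one blocked thread (and its stack) per client, which is exactly what the answers below argue against.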
Thread-per-connection is not a scalable solution.
To scale well, you should use asynchronous socket methods exclusively. The question is whether to multiplex them all on a single thread or to use the thread pool. The thread pool would scale better than multiplexing, but it introduces multithreading complexities.
A lot of devs attempt to learn socket programming and multithreading at the same time, which is just too much.
One can use message queues, load balancing, dispatching, etc. There is no single answer: some solutions fit some problems well, others don't.
Good places to start can be:
the ØMQ documentation
Warning: Unstable Paradigms!
The Push Framework Technical Architecture
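For a flavor of the asynchronous approach mentioned above, here is a minimal async/await accept-and-echo loop over the thread pool (a sketch assuming .NET 4.5+, not production code):

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

// No thread is blocked while awaiting data; I/O completions are
// dispatched on thread-pool threads.
class AsyncEchoServer
{
    public static async Task RunAsync(int port)
    {
        var listener = new TcpListener(IPAddress.Loopback, port);
        listener.Start();
        while (true)
        {
            TcpClient client = await listener.AcceptTcpClientAsync();
            var ignored = HandleClientAsync(client); // one async flow per connection
        }
    }

    private static async Task HandleClientAsync(TcpClient client)
    {
        using (client)
        {
            NetworkStream stream = client.GetStream();
            var buffer = new byte[4096];
            int read;
            while ((read = await stream.ReadAsync(buffer, 0, buffer.Length)) > 0)
            {
                await stream.WriteAsync(buffer, 0, read); // echo back
            }
        }
    }
}
```

Each connection costs an async state machine instead of a dedicated thread, which is what makes this scale.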
One thread per connection would not be scalable.
I would suggest that one thread could handle a set of connected clients. If the number of connected clients increases, the server application should be able to add as many threads as needed to handle the ramp-up, and then additional server instances if things continue to grow.
Instead of having threads that do everything (as application instances do), have some dedicated threads that each process the same kind of task separately and share in-memory data in a synchronized way.
This is more or less the way IIS works. The number of worker processes, the number of threads, and the thread pools are all manageable through the control panel.
I remember the OpenSim project (a virtual world platform) ran this way: one thread per connected client. It has since been refactored along the lines explained above.
Apparently you are well started already with multithreading. Maybe this free ebook will help you to dig further.
When I started multithreading, I had some trouble at first understanding that it means several executions of the same code at the same time, possibly within the same instance.
Recently I've read a lot about parallel programming in .NET but I am still confused by contradicting statements over the texts on this subject.
For example, the popup description (shown when hovering over the tag's icon) of the stackoverflow.com task-parallel-library tag:
"The Task Parallel Library is part of .NET 4. It is a set of APIs to
enable developers to program multi-core shared memory processors"
Does this mean that multi-core and parallel programming applications were impossible using prior versions of .NET?
Do I control multicore/parallel usage and the distribution of work between cores in a .NET multithreaded application?
How can I identify the core on which a thread is to run and assign a thread to a specific core?
What has the .NET 4.0+ Task Parallel Library enabled that was impossible to do in previous versions of .NET?
Update:
Well, it was difficult to formulate specific questions, but I'd like to better understand:
What is the difference in .NET between developing a multi-threaded application and parallel programming?
So far, I have not been able to grasp the difference between them.
Update2:
MSDN's "Parallel Programming in the .NET Framework" starts from .NET 4.0, and its article on the Task Parallel Library says:
"Starting with the .NET Framework 4, the TPL is the preferred way to
write multithreaded and parallel code"
Can you give me hints how to specifically create parallel code in pre-.NET4 (in .NET3.5), taking into account that I am familiar with multi-threading development?
I see "multithreading" as just what the term says: using multiple threads.
"Parallel processing" would be: splitting up a group of work among multiple threads so the work can be processed in parallel.
Thus, parallel processing is a special case of multithreading.
Does this mean that multi-core and parallel programming applications were impossible using prior versions of .NET?
Not at all. You could do it using the Thread class. It was just much harder to write, and much much harder to get it right.
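For instance, a pre-TPL parallel sum could be hand-rolled with raw Thread objects along these lines (a sketch; the partitioning, result slots, and joining are all manual, which is exactly the work the TPL now does for you):

```csharp
using System;
using System.Threading;

// Split an array into chunks, sum each chunk on its own thread, then
// join the threads and combine the partial sums.
class ManualParallelSum
{
    public static long Sum(int[] data, int threadCount)
    {
        var partials = new long[threadCount];
        var threads = new Thread[threadCount];
        int chunk = (data.Length + threadCount - 1) / threadCount;

        for (int t = 0; t < threadCount; t++)
        {
            int index = t; // capture the loop variable for the closure
            threads[t] = new Thread(() =>
            {
                int start = index * chunk;
                int end = Math.Min(start + chunk, data.Length);
                long local = 0;
                for (int i = start; i < end; i++) local += data[i];
                partials[index] = local; // each thread writes only its own slot
            });
            threads[t].Start();
        }
        foreach (var thread in threads) thread.Join();

        long total = 0;
        foreach (var p in partials) total += p;
        return total;
    }
}
```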
Do I control multicore/parallel usage and the distribution of work between cores in a .NET multithreaded application?
Not really, but you don't need to. You can mess around with processor affinity for your application, but at the .NET level that's hardly ever a winning strategy.
The Task Parallel Library includes a "partitioner" concept that can be used to control the distribution of work, which is a better solution than controlling the distribution of threads over cores.
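A sketch of the partitioner in action, letting the TPL hand each worker a contiguous range rather than single elements (method and class names here are illustrative):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Partitioner.Create(0, n) splits [0, n) into ranges; each worker
// processes whole ranges and keeps a thread-local partial sum.
class PartitionerExample
{
    public static long SumOfSquares(int n)
    {
        long total = 0;
        object gate = new object();
        Parallel.ForEach(
            Partitioner.Create(0, n),            // contiguous index ranges
            () => 0L,                            // thread-local accumulator
            (range, state, local) =>
            {
                for (int i = range.Item1; i < range.Item2; i++)
                    local += (long)i * i;
                return local;
            },
            local => { lock (gate) total += local; }); // combine once per worker
        return total;
    }
}
```

The lock runs only once per worker thread at the end, not once per element, so contention stays negligible.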
How can I identify the core on which a thread is to run and assign a thread to a specific core?
You're not supposed to do this. A .NET thread doesn't necessarily correspond with an OS thread; you're at a higher level of abstraction than that. Now, the default .NET host does map threads 1-to-1, so if you want to depend on an undocumented implementation detail, then you can poke through the abstraction and use P/invoke to determine/drive your processor affinity. But as noted above, it's not useful.
What has the .NET 4.0+ Task Parallel Library enabled that was impossible to do in previous versions of .NET?
Nothing. But it sure has made parallel processing (and multithreading) much easier!
Can you give me hints how to specifically create parallel code in pre-.NET4 (in .NET3.5), taking into account that I am familiar with multi-threading development?
First off, there's no reason to develop for that platform. None. .NET 4.5 is already out, and the last version (.NET 4.0) supports all OSes that the next older version (.NET 3.5) did.
But if you really want to, you can do simple parallel processing by spinning up Thread objects or BackgroundWorkers, or by queueing work directly to the thread pool. All of these approaches require more code (particularly around error handling) than the Task type in the TPL.
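As a sketch of the .NET 3.5-style approach: queue work items to the thread pool and block on a ManualResetEvent until the last one signals (names are illustrative; this is the extra ceremony the Task type now hides):

```csharp
using System;
using System.Threading;

// Fan work out to the thread pool and wait for completion with a
// ManualResetEvent plus an interlocked pending-work counter.
class ThreadPoolFanOut
{
    public static int[] Squares(int[] inputs)
    {
        var results = new int[inputs.Length];
        if (inputs.Length == 0) return results; // nothing to wait for

        int pending = inputs.Length;
        using (var done = new ManualResetEvent(false))
        {
            for (int i = 0; i < inputs.Length; i++)
            {
                int index = i; // capture the loop variable for the closure
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    results[index] = inputs[index] * inputs[index];
                    // The last work item to finish signals the event.
                    if (Interlocked.Decrement(ref pending) == 0)
                        done.Set();
                });
            }
            done.WaitOne();
        }
        return results;
    }
}
```

Note that none of this handles exceptions thrown inside a work item; propagating those correctly is precisely the error-handling code the TPL saves you.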
What if I asked you, "Do you write business software in your own home-grown language? Do you drink water only after digging your own well?"
That's the difference between writing multithreaded code by creating and managing threads yourself and using the TPL's abstraction over threads. Multicore scheduling of threads onto cores is handled by the OS, so you don't need to worry about whether your threads are being executed on the cores your system supports, AFAIK.
Check this article; it basically sums up what was (virtually) impossible before the TPL. Even though many companies had brewed their own parallel processing libraries, none of them were fully optimized to take advantage of all the resources of the popular architectures (simply because it's a big task, and Microsoft has a lot of resources, and they are good). It's also interesting to compare Intel's counterpart implementation, TBB, with the TPL.
Does this mean that multi-core and parallel programming applications were impossible using prior versions of .NET?
Not at all. Types like Thread and ThreadPool for scheduling computations on other threads, and ManualResetEvent for synchronization, have been there since .NET 1.0.
Do I control multicore/parallel usage and the distribution of work between cores in a .NET multithreaded application?
No, that's mostly the job of the OS. You can set ProcessorAffinity of a ProcessThread, but there is no simple way to get a ProcessThread from a Thread (because it was originally thought that .Net Threads may not directly correspond to OS threads). There is usually no reason to do this and you especially shouldn't do it for ThreadPool threads.
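For completeness, a sketch of what process-level affinity looks like (rarely worth doing, as the answer notes; behavior is OS-dependent):

```csharp
using System;
using System.Diagnostics;

// Affinity is set per process (or per ProcessThread), not per managed
// Thread. This pins the whole process to core 0, then restores the
// original mask.
class AffinityDemo
{
    public static void PinAndRestore()
    {
        Process current = Process.GetCurrentProcess();
        IntPtr original = current.ProcessorAffinity; // bit i = core i

        current.ProcessorAffinity = (IntPtr)1;       // core 0 only
        Console.WriteLine($"Pinned to mask {current.ProcessorAffinity}");

        current.ProcessorAffinity = original;        // undo the pinning

        // ProcessThread exposes ProcessorAffinity too, but there is no
        // supported public mapping from a managed Thread to a ProcessThread.
    }
}
```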
What has the .NET 4.0+ Task Parallel Library enabled that was impossible to do in previous versions of .NET?
I'd say it didn't make anything impossible possible. But it made lots of tasks much simpler.
You could always write your own version of ThreadPool and manually use synchronization primitives (like ManualResetEvent) for synchronization between threads. But doing that properly and efficiently is a lot of error-prone work.
What is the difference in .NET between developing a multi-threaded application and parallel programming?
This is just a question of naming and doesn't have much to do with your previous questions. Parallel programming means performing multiple operations at the same time, but it doesn't say how you achieve parallelism. For that, you could use multiple computers, multiple processes, multiple threads, or even a single thread.
(Parallel programming on a single thread can work if the operations are not CPU-bound, like reading a file from disk or fetching some data from the internet.)
So, multi-threaded programming is a subset of parallel programming, though the one most commonly used in .NET.
Multithreading was available even on single-core CPUs. I believe that in the .NET world, "parallel programming" refers to the compiler/language, namespace, and library additions that facilitate multi-core capabilities (better than before). In this sense, "parallel programming" is a category under multithreading that provides improved support for multiple CPUs/cores.
My own ponderings: at the same time I see .NET "parallel programming" to encompass not only multi-threading, but other techniques. Consider the fact that the new async/await facilities don't guarantee multi-threading, as in certain scenarios they are only an abstraction of the continuation-passing-style paradigm that could accomplish everything on a single thread. Include in the mix parallelism that comes from running different processes (potentially on different machines) and in that sense, multithreading is only a portion of the broader concept of "parallel programming".
But if you consider the .NET releases I think the former is a better explanation.
Hopefully two simple questions relating to creating a server application:
Is there a theoretical/practical limit on the number of simultaneous sockets that can be open (ignoring the resources required to process the data once it has arrived)? If it's of relevance, I am targeting the .NET Framework.
Should each connection be run on a separate thread that's permanently assigned to it, or should a thread pool be used? The dedicated-thread approach seems simpler, but it seems odd to have 100+ threads running at once. Is this acceptable practice?
Any advice is greatly appreciated
Venatu
You may find the following answer useful. It illustrates how to write a scalable TCP server using the .NET thread pool and asynchronous sockets methods (BeginAccept/EndAccept and BeginReceive/EndReceive).
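As a rough sketch of that APM (Begin/End) style, with callbacks running on I/O thread-pool threads (illustrative only; no error handling or shutdown logic):

```csharp
using System;
using System.Net;
using System.Net.Sockets;

// Accept and receive via BeginAccept/EndAccept and BeginReceive/
// EndReceive: no thread is dedicated to any single connection.
class ApmServer
{
    private readonly Socket listener =
        new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);

    public int Start(int port)
    {
        listener.Bind(new IPEndPoint(IPAddress.Loopback, port));
        listener.Listen(100);
        listener.BeginAccept(OnAccept, null);
        return ((IPEndPoint)listener.LocalEndPoint).Port; // actual bound port
    }

    private void OnAccept(IAsyncResult ar)
    {
        Socket client = listener.EndAccept(ar);
        listener.BeginAccept(OnAccept, null); // keep accepting

        var buffer = new byte[4096];
        client.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None,
            result => OnReceive(client, buffer, result), null);
    }

    private void OnReceive(Socket client, byte[] buffer, IAsyncResult ar)
    {
        int read = client.EndReceive(ar);
        if (read == 0) { client.Close(); return; } // peer disconnected
        // ... process buffer[0..read) ...
        client.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None,
            result => OnReceive(client, buffer, result), null);
    }
}
```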
That being said, it is rarely a good idea to write your own server when you could use one of the numerous WCF bindings (or even write custom ones) and benefit from the full power of the WCF infrastructure. It will probably scale better than most custom-written servers.
There are practical limits, yes. However, you will most likely run out of resources to handle the load long before you reach them: CPU or memory are more likely to be exhausted than the number of available connections.
For maximum scalability, you don't want a separate thread per connection; rather, use an asynchronous model that only occupies threads while servicing active connections (those currently sending or receiving data).
If I remember correctly (I did sockets a long time ago), the very best way of implementing them is with the ReceiveAsync (.NET 3.5) / BeginReceive methods, using asynchronous callbacks that utilize the thread pool. Don't open a thread for every connection; it is a waste of resources.
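A sketch of the SocketAsyncEventArgs/ReceiveAsync pattern mentioned (illustrative; a real server would pool the event-args objects and add error handling):

```csharp
using System;
using System.Net.Sockets;
using System.Threading;

// ReceiveAsync returns false when the operation completed
// synchronously, in which case Completed does not fire and we must
// handle the result inline.
class SaeaReceiver
{
    private readonly Socket socket;
    private readonly SocketAsyncEventArgs args = new SocketAsyncEventArgs();
    private int totalBytes;

    public int TotalBytesReceived => Volatile.Read(ref totalBytes);

    public SaeaReceiver(Socket connectedSocket)
    {
        socket = connectedSocket;
        args.SetBuffer(new byte[4096], 0, 4096);
        args.Completed += (sender, e) => ProcessReceive(e);
    }

    public void StartReceiving()
    {
        if (!socket.ReceiveAsync(args))   // completed synchronously
            ProcessReceive(args);
    }

    private void ProcessReceive(SocketAsyncEventArgs e)
    {
        if (e.SocketError != SocketError.Success || e.BytesTransferred == 0)
        {
            socket.Close();               // failed or peer disconnected
            return;
        }
        // ... process e.Buffer[e.Offset .. e.Offset + e.BytesTransferred) ...
        Interlocked.Add(ref totalBytes, e.BytesTransferred);
        StartReceiving();                 // post the next receive
    }
}
```

Unlike Begin/End, this reuses a single pre-allocated buffer and event-args object per connection, which is why it puts far less pressure on the garbage collector under load.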
I'm looking to move a Windows C++ application to C# so that some major enhancements are a bit easier. The C++ application is single-threaded and uses a home-grown reactor pattern for all event handling of accepting, reading and writing sockets, and timers. All socket handling is done async.
What is the accepted way to implement a reactor pattern in C#? Are there existing libraries?
brofield,
Unfortunately, the mentality of the C# world is still in the thread-per-connection realm. I'm looking for a way to handle multiple connections on a single Compact Framework / Windows CE box and looking to write my own Proactor/Reactor pattern (fashioned after the one used in ACE). The Compact Framework doesn't seem to support async connecting, just async reading and writing. I also need tight control over timeouts (soft real-time application).
Alan,
One reason to implement a proactor/reactor pattern is so that you don't have to have a thread running for each connection. A web server is the classic example: a busy server could easily have hundreds of connections active at any one time. With that many threads (and I've seen implementations that have one thread to read and another to write data), the time spent context switching becomes significant. Under Windows CE on a 750 MHz ARM processor, I measured over a millisecond, with peaks as high as 4 milliseconds.
I still find that most C# and Java applications I've come across have way too many threads running. Starting another thread seems to be the solution for everything. For example, Eclipse (the IDE) uses 44 threads even before I actually open a project. 44 threads? To do what, exactly? Is that why Eclipse is so slow?
Have a read of Asynchronous Programming in C# using Iterators:
In this article we will look at how to write programs that perform asynchronous operations without the typical inversion of control. To briefly introduce what I mean by 'asynchronous' and 'inversion of control': asynchronous refers to programs that perform long-running operations that don't necessarily block the calling thread, for example accessing the network, calling web services, or performing any other I/O operation in general.
The Windows kernel has a very nice asynchronous wait API called I/O Completion Ports, which is very good for implementing a reactor pattern. Unfortunately, the System.Net.Sockets library in the .NET framework does not fully expose I/O Completion Ports to developers. The biggest problem with System.Net.Sockets is that you have no control over which thread will dequeue an asynchronous completion event. All your asynchronous completions must occur on some random global .NET ThreadPool thread, chosen for you by the framework.
Threads, sockets, and event handling have first-class support in C# and .NET, so the need for an external library is pretty light. In other words, if you know what you are doing, the framework makes it very straightforward to roll your own socket event handling routines.
Both companies I have worked for tended to reuse the default .NET Socket library as-is for all network connections.
Edit to add: basically, this means it's an excellent opportunity to learn about delegates, events, threads, and sockets in .NET/C#.
Check SignalR. It looks very promising.
Have a look at the project Interlace on google code. We use this in all of our products.
I used multiple threads in a few programs, but still don't feel very comfortable about it.
What multi-threading libraries for C#/.NET are out there and which advantages does one have over the other?
By multi-threading libraries I mean everything which helps make programming with multiple threads easier.
Which built-in .NET facilities (e.g. ThreadPool) do you use regularly?
Which problems did you encounter?
There are various reasons for using multiple threads in an application:
UI responsiveness
Concurrent operations
Parallel speedup
The approach one should choose depends on what you're trying to do. For UI responsiveness, consider using BackgroundWorker, for example.
For concurrent operations (e.g. a server: something that doesn't have to be parallel, but probably does need to be concurrent even on a single-core system), consider using the thread pool or, if the tasks are long-lived and you need a lot of them, consider using one thread per task.
If you have a so-called embarrassingly parallel problem that can be easily divided up into small subproblems, consider using a pool of worker threads (as many threads as CPU cores) that pull tasks from a queue. The Microsoft Task Parallel Library (TPL) may help here. If the job can be easily expressed as a monadic stream computation (i.e. with a query in LINQ with work in transformations and aggregations etc.), Parallel LINQ (same link) which runs on top of TPL may help.
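As a sketch, such an embarrassingly parallel job can be expressed as a PLINQ query, with the work in the transformation and the combining in the aggregation (the prime test here is deliberately naive):

```csharp
using System;
using System.Linq;

// AsParallel() partitions the range across worker threads; Count()
// aggregates the per-thread results.
class PlinqExample
{
    public static int CountPrimes(int limit)
    {
        return Enumerable.Range(2, limit - 1)   // candidates 2..limit
            .AsParallel()                       // partitioned across workers
            .Count(n =>
            {
                for (int d = 2; d * d <= n; d++)   // naive trial division
                    if (n % d == 0) return false;
                return true;
            });
    }
}
```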
There are other approaches, such as Actor-style parallelism as seen in Erlang, which are harder to implement efficiently in .NET because of the lack of a green threading model or means to implement same, such as CLR-supported continuations.
I like this one
http://www.codeplex.com/smartthreadpool
Check out the Power Threading library.
I have written a lot of threading code in my day, and even implemented my own thread pool and dispatcher. A lot of it is documented here:
http://web.archive.org/web/20120708232527/http://devplanet.com/blogs/brianr/default.aspx
Just realize that I wrote these for very specific purposes and tested them under those conditions; there is no real silver bullet.
My advice would be to get comfortable with the thread pool before you move on to any other libraries. A lot of the framework code uses the thread pool, so even if you happen to find The Best Threads Library(TM), you will still have to work with the thread pool, so you really need to understand it.
You should also keep in mind that a lot of work has been put into implementing and tuning the thread pool. The upcoming version of .NET has numerous improvements triggered by the development of the parallel libraries.
From my point of view, many of the "problems" with the current thread pool can be worked around by knowing its strengths and weaknesses.
Please keep in mind that you really should be closing threads (or letting the thread pool dispose of them) when you no longer need them, unless you will need them again soon. The reason I say this is that each thread requires stack memory (usually 1 MB), so when you have applications sitting on threads but not using them, you are wasting memory.
For example, Outlook on my machine right now has 20 threads open and is using 0% CPU. That is simply a waste of (at least) 20 MB of memory. Word is also using another 10 threads at 0% CPU. 30 MB may not seem like much, but what if every application wasted 10-20 threads?
Again, if you need access to a thread pool on a regular basis, then you don't need to close it (creating/destroying threads has an overhead).
You don't have to use the thread pool explicitly; you can use BeginInvoke/EndInvoke if you need async calls. It uses the thread pool behind the scenes. See here: http://msdn.microsoft.com/en-us/library/2e08f6yc.aspx
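A sketch of that BeginInvoke/EndInvoke pattern (note: delegate BeginInvoke is supported on the .NET Framework only, not on .NET Core; method names here are illustrative):

```csharp
using System;
using System.Threading;

// BeginInvoke queues the delegate's target method to a thread-pool
// thread; EndInvoke collects the result (and rethrows any exception).
class DelegateAsyncCall
{
    static int SlowAdd(int a, int b)
    {
        Thread.Sleep(100);          // simulate a long-running call
        return a + b;
    }

    public static int CallAsync()
    {
        Func<int, int, int> add = SlowAdd;
        IAsyncResult ar = add.BeginInvoke(2, 3, null, null); // runs on the pool
        // ... this thread is free to do other work here ...
        return add.EndInvoke(ar);   // blocks until the call completes
    }
}
```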
You should take a look at the Concurrency & Coordination Runtime (CCR). The CCR can be a little daunting at first, as it requires a slightly different mindset. This video does a fairly good job of explaining how it works...
In my opinion this would be the way to go, and I also hear that it will use the same scheduler as the TPL.
For me, the built-in classes of the Framework are more than enough. The ThreadPool is odd and lame, but you can write your own easily.
I often used the BackgroundWorker class for front ends, because it makes life much easier: invoking is done automatically for the event handlers.
I regularly start threads manually and save them in a dictionary together with a ManualResetEvent, to be able to examine which of them have ended already. I use the WaitHandle.WaitAll() method for this. The problem there is that WaitHandle.WaitAll does not accept arrays with more than 64 WaitHandles at once.
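One workaround for the 64-handle limit is simply to wait in chunks, as in this sketch:

```csharp
using System;
using System.Threading;

// WaitHandle.WaitAll rejects arrays longer than 64 entries, so wait
// on the handles in batches of 64 instead.
static class WaitAllHelper
{
    public static void WaitAllChunked(WaitHandle[] handles)
    {
        for (int i = 0; i < handles.Length; i += 64)
        {
            int count = Math.Min(64, handles.Length - i);
            var chunk = new WaitHandle[count];
            Array.Copy(handles, i, chunk, 0, count);
            WaitHandle.WaitAll(chunk); // blocks until this batch signals;
                                       // net effect: all handles signaled
        }
    }
}
```

Waiting batch by batch gives the same end result as one big WaitAll, since the method only returns once every handle has been signaled.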
You might want to look at this series of articles about threading patterns. Right now it has sample code for implementing a WorkerThread and a ThreadedQueue.
http://devpinoy.org/blogs/jakelite/archive/tags/Threading+Patterns/default.aspx