.NET C# Socket Concurrency issues

Can an instance of System.Net.Sockets.Socket be shared by two threads, so that one uses the Send() method and the other uses the Receive() method?
Is it safe?
I need it to be not only thread-safe, but also for the Send/Receive methods to be non-synchronized, so that each thread can call them concurrently.
Is there another way of doing it?
Thanks for helping. I am experienced in Java but am having a hard time with this one.

It should be safe, yes. MSDN documents instances of the Socket class as thread safe.
I don't know if it's a good idea, however. You might be making it difficult for yourself by using two threads. You probably want to look at BeginSend and BeginReceive for asynchronous versions, in which case you shouldn't need multiple threads.

A little off topic, but using the sync methods only makes sense if you have a limited number of clients. In my experience async sockets are slower to respond, but they are much better at handling many clients.
So:
sync responds faster
async is more scalable

Yes, it is perfectly safe to call Send and Receive from two different threads at the same time.
If you want your application to scale to hundreds of active sockets, you'll want to use the BeginReceive/BeginSend methods rather than creating threads manually. The runtime then handles the waiting behind the scenes, so you don't spawn hundreds of threads to service the sockets. Exactly what it does is platform dependent: on Windows it uses the 'high performance' I/O completion ports; under Linux (Mono) it uses epoll, I believe. Either way, you'll end up using far fewer threads than active sockets, which is always a good thing :)
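
As a minimal sketch of the two-thread arrangement discussed above: one thread only calls Send and another only calls Receive on the same Socket instance. The loopback echo peer, the payload text and all variable names are assumptions made up for illustration, not part of the original question.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading;

// Loopback echo peer so the sketch is self-contained (hypothetical setup).
var listener = new TcpListener(IPAddress.Loopback, 0);
listener.Start();
int port = ((IPEndPoint)listener.LocalEndpoint).Port;

var echoThread = new Thread(() =>
{
    using Socket peer = listener.AcceptSocket();
    var buf = new byte[256];
    int n;
    while ((n = peer.Receive(buf)) > 0)
        peer.Send(buf, 0, n, SocketFlags.None);   // echo back what arrived
});
echoThread.Start();

using var client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
client.Connect(IPAddress.Loopback, port);

byte[] payload = Encoding.ASCII.GetBytes("ping-1;ping-2;ping-3;");
var received = new StringBuilder();

// Thread A only sends.
var sendThread = new Thread(() =>
{
    client.Send(payload);
    client.Shutdown(SocketShutdown.Send);         // tell the peer we are done sending
});

// Thread B only receives, on the same Socket instance.
var receiveThread = new Thread(() =>
{
    var buf = new byte[64];
    int n;
    while ((n = client.Receive(buf)) > 0)
        received.Append(Encoding.ASCII.GetString(buf, 0, n));
});

receiveThread.Start();
sendThread.Start();
sendThread.Join();
receiveThread.Join();
echoThread.Join();
listener.Stop();

Console.WriteLine(received.ToString());
```

Note that this only shows one sender and one receiver. If several threads called Send on the same socket at once, the bytes of their messages could interleave on the stream, so that case would still need your own coordination.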

Related

Asynchronous programming in Windows service, Is it relevant

In a Windows service there is no blocking UI thread, so is asynchronous programming relevant inside a Windows service?

The alternatives are to either block (i.e. do nothing until required data is available) or await (yield processing and then return automatically when the data is available).
In a situation when the program (a Windows service included) can do nothing further until the data arrives, there may seem little difference between the two, and as far as that program itself is concerned, this is true.
However, the program will be running in a thread allocated to it by the operating system (even if it is using only a single thread). Threads are not free resources and if a large number are in use, the OS will not hand out new ones until old ones terminate or become free. Thus other programs will be held up.
When a program blocks, it keeps hold of its thread, making it unavailable for use elsewhere. When it awaits, the thread becomes available for others to use.
So using await will make the whole computer run more efficiently.

Async programming allows threads to be used efficiently when the work they run would otherwise block. Blocking occurs in the UI, but also when performing I/O, and therefore when communicating.
If your service does not perform heavy I/O and does not use sockets or pipes, you won't see a benefit within the service, although I cannot imagine what such a service could do.
Generally speaking, async programming also benefits the hosting system, because it lets your workload run with fewer resources overall. However, you have to consider that async programming does not perform the kind of resource sharing described in other answers: your implementation will use its threads more efficiently (i.e. in a Task-oriented way), but you won't magically have more threads available.

The two things aren't related.
Most Windows services don't have a GUI thread, as they don't have a GUI. Instead they'll have a main thread, and probably many other child threads that implement the service. Any of these threads may want to make use of asynchronous programming techniques. For example, they may be reading or writing over a socket, a classic case for an asynchronous programming model.

Design of asynchronous socket classes in C#

I've written a small asynchronous TCP server/client in C#...
... And I've just been thinking:
The C# API implements select and poll, a classic but easy way to do async. Why did Microsoft introduce the BeginConnect/BeginSend family, which (in my opinion) has a more complicated design and adds lines of code too?
Also, following the BeginXXX() "trend", I noticed that the System.Threading import is required (for the events). Does that mean threads are involved too?

select and poll have two problems:
They are generally used in a single-threaded way. They do not scale for this reason.
They require all IO to be dispatched through a central place that does the polling.
It is much nicer to be able to just specify a callback that will be called on completion. This scales automatically and no central dispatch point is needed. Async IO in .NET is quite free of hassles. It just works (efficiently).
Async IO on Windows is threadless. While an IO is running not a single thread is busy serving it. All async IO in .NET uses truly async IO supported by the OS. This means either overlapped IO or completion ports.
Look into async/await which also can be used with sockets. They provide the easiest way to use async IO that I know of. That includes all languages and platforms. select and poll aren't even in the same league judged by ease of use.
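
A minimal sketch of what async/await over a socket connection can look like, using TcpListener, TcpClient and NetworkStream from System.Net.Sockets. The loopback echo setup, the helper names EchoOnceAsync and RoundTripAsync, and the message text are assumptions for illustration only.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading.Tasks;

var listener = new TcpListener(IPAddress.Loopback, 0);
listener.Start();
int port = ((IPEndPoint)listener.LocalEndpoint).Port;

// Server side (hypothetical helper): accept one client and echo
// until the client shuts down its sending side.
async Task EchoOnceAsync()
{
    using TcpClient peer = await listener.AcceptTcpClientAsync();
    NetworkStream stream = peer.GetStream();
    var buf = new byte[256];
    int n;
    while ((n = await stream.ReadAsync(buf, 0, buf.Length)) > 0)
        await stream.WriteAsync(buf, 0, n);
}

// Client side (hypothetical helper): send a message, then read the echo
// until the server closes the connection.
async Task<string> RoundTripAsync(string text)
{
    using var client = new TcpClient();
    await client.ConnectAsync(IPAddress.Loopback, port);
    NetworkStream stream = client.GetStream();

    byte[] payload = Encoding.ASCII.GetBytes(text);
    await stream.WriteAsync(payload, 0, payload.Length);
    client.Client.Shutdown(SocketShutdown.Send);

    var reply = new StringBuilder();
    var buf = new byte[64];
    int n;
    while ((n = await stream.ReadAsync(buf, 0, buf.Length)) > 0)
        reply.Append(Encoding.ASCII.GetString(buf, 0, n));
    return reply.ToString();
}

Task server = EchoOnceAsync();
string echoed = await RoundTripAsync("hello over async sockets");
await server;
listener.Stop();

Console.WriteLine(echoed);
```

No callbacks, wait handles or manually created threads appear in the code; each await hands control back to the caller while the I/O is pending.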

Is there a performance penalty using await/async?

I'm considering rewriting my network library with await/async paradigm. Lots of code which uses the library is still synchronous, so I'm planning on moving the entire library into the async mode and then creating method stubs, which would transform the async calls into synchronous calls.
Can anyone suggest to me whether this is going to make my library worse for synchronous use? (like if it would consume more cpu, method calls would take longer to execute etc)?

It will definitely not be faster; synchronous code can respond to incoming data more quickly.
The advantage of doing it asynchronously is that your library will scale a lot better and handle many more connections, a side effect of not having hundreds of threads doing nothing but waiting for data to arrive. The disadvantage is that your library will be much harder for the client app to use, which is the problem async/await solves. There is no benefit if you make it synchronous again yourself; that choice must be left to the client app.

The proper answer here is to benchmark it, try rewriting a couple of methods with async and see how they perform when used synchronously.
Having said that, this article explains that yes, there is a cost to setting up the state required for async methods, so only use them where they are beneficial. For a network library (where the vast majority of your time is probably spent waiting for the network) the time cost of setting up async is probably negligible.
In summary, for a network library it's probably fine, but benchmarking is the only way to be sure.

The only performance penalty is that async/await uses a compiler-generated state machine. The performance impact is comparable to using yield return instead of returning an array or list.

Server Architecture

Hopefully two simple questions relating to creating a server application:
Is there a theoretical/practical limit on the number of simultaneous sockets that can be open? Ignoring the resources required to process the data once it has arrived! If it's of relevance, I am targeting the .NET Framework.
Should each connection run in a separate thread that's permanently assigned to it, or should a thread pool be used? The dedicated thread approach seems simpler, but it seems odd to have 100+ threads running at once. Is this acceptable practice?
Any advice is greatly appreciated
Venatu

You may find the following answer useful. It illustrates how to write a scalable TCP server using the .NET thread pool and asynchronous sockets methods (BeginAccept/EndAccept and BeginReceive/EndReceive).
That being said, it is rarely a good idea to write your own server when you could use one of the numerous WCF bindings (or even write custom ones) and benefit from the full power of the WCF infrastructure. It will probably scale better than any custom-written server.

There are practical limits, yes. However, you will most likely run out of resources to handle the load long before you reach them. CPU, or memory are more likely to be exhausted before number of connections.
For maximum scalability, you don't want a separate thread per connection. Instead, use an asynchronous model that only uses threads when servicing active connections (as in, those currently receiving or sending data).

If I remember correctly (I last did sockets a long time ago), the best way of implementing them is with the ReceiveAsync (.NET 3.5) / BeginReceive methods, using asynchronous callbacks that run on the thread pool. Don't open a thread for every connection; it is a waste of resources.
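
To illustrate the "no thread per connection" advice from the answers above, here is a minimal sketch of an accept loop that starts one asynchronous handler per connection instead of one thread. It uses async/await rather than the BeginAccept/BeginReceive callbacks the answers mention. The helper names, the client count of 5 and the echo behaviour are assumptions for illustration; a real server would accept in an unbounded loop.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading.Tasks;

var listener = new TcpListener(IPAddress.Loopback, 0);
listener.Start();
int port = ((IPEndPoint)listener.LocalEndpoint).Port;

// Echo handler for one connection; no thread is held while a read is pending.
async Task HandleClientAsync(TcpClient peer)
{
    using (peer)
    {
        NetworkStream stream = peer.GetStream();
        var buf = new byte[256];
        int n;
        while ((n = await stream.ReadAsync(buf, 0, buf.Length)) > 0)
            await stream.WriteAsync(buf, 0, n);
    }
}

// Accepts a fixed number of clients so the sketch terminates.
async Task AcceptLoopAsync(int expectedClients)
{
    var handlers = new List<Task>();
    for (int i = 0; i < expectedClients; i++)
    {
        TcpClient peer = await listener.AcceptTcpClientAsync();
        handlers.Add(HandleClientAsync(peer));   // not awaited here, so the next accept can proceed
    }
    await Task.WhenAll(handlers);
}

// Test client: send a message, half-close, and read the echo back.
async Task<string> ClientAsync(string text)
{
    using var client = new TcpClient();
    await client.ConnectAsync(IPAddress.Loopback, port);
    NetworkStream stream = client.GetStream();
    byte[] payload = Encoding.ASCII.GetBytes(text);
    await stream.WriteAsync(payload, 0, payload.Length);
    client.Client.Shutdown(SocketShutdown.Send);

    var reply = new StringBuilder();
    var buf = new byte[64];
    int n;
    while ((n = await stream.ReadAsync(buf, 0, buf.Length)) > 0)
        reply.Append(Encoding.ASCII.GetString(buf, 0, n));
    return reply.ToString();
}

const int clientCount = 5;
Task server = AcceptLoopAsync(clientCount);
var clients = new Task<string>[clientCount];
for (int i = 0; i < clientCount; i++)
    clients[i] = ClientAsync($"client-{i}");

string[] replies = await Task.WhenAll(clients);
await server;
listener.Stop();

Console.WriteLine(string.Join(",", replies));
```

The handlers only occupy a thread pool thread while they have bytes to process, which is the property the answers above are recommending.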

Sync Vs. Async Sockets Performance in .NET

Everything that I read about sockets in .NET says that the asynchronous pattern gives better performance (especially with the new SocketAsyncEventArgs which saves on the allocation).
I think this makes sense if we're talking about a server with many client connections where it's not possible to allocate one thread per connection. Then I can see the advantage of using the ThreadPool threads and getting async callbacks on them.
But in my app, I'm the client and I just need to listen to one server sending market tick data over one tcp connection. Right now, I create a single thread, set the priority to Highest, and call Socket.Receive() with it. My thread blocks on this call and wakes up once new data arrives.
If I were to switch this to an async pattern so that I get a callback when there's new data, I see two issues
The threadpool threads will have default priority so it seems they will be strictly worse than my own thread which has Highest priority.
I'll still have to send everything through a single thread at some point. Say that I get N callbacks at almost the same time on N different threadpool threads notifying me that there's new data. The N byte arrays that they deliver can't be processed on the threadpool threads because there's no guarantee that they represent N unique market data messages because TCP is stream based. I'll have to lock and put the bytes into an array anyway and signal some other thread that can process what's in the array. So I'm not sure what having N threadpool threads is buying me.
Am I thinking about this wrong? Is there a reason to use the async pattern in my specific case of one client connected to one server?
UPDATE:
So I think that I was misunderstanding the async pattern in (2) above. I would get a callback on one worker thread when there was data available. Then I would begin another async receive and get another callback, etc. I wouldn't get N callbacks at the same time.
The question still is the same though. Is there any reason that the callbacks would be better in my specific situation where I'm the client and only connected to one server.

The slowest part of your application will be the network communication. It's highly likely that you will make almost no difference to performance for a one thread, one connection client by tweaking things like this. The network communication itself will dwarf all other contributions to processing or context switching time.
"Say that I get N callbacks at almost the same time on N different threadpool threads notifying me that there's new data."
Why is that going to happen? If you have one socket, you Begin an operation on it to receive data, and you get exactly one callback when it's done. You then decide whether to do another operation. It sounds like you're overcomplicating it, though maybe I'm oversimplifying it with regard to what you're trying to do.
In summary, I'd say: pick the simplest programming model that gets you what you want; considering choices available in your scenario, they would be unlikely to make any noticeable difference to performance whichever one you go with. With the blocking model, you're "wasting" a thread that could be doing some real work, but hey... maybe you don't have any real work for it to do.
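
Because TCP is stream based, whichever receive model you pick still has to reassemble messages from arbitrary chunks. A minimal sketch of that step, assuming newline-delimited ASCII messages (the framing, the Feed helper and the sample tick strings are all made up for illustration; the real market data protocol is not specified in the question):

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Bytes received so far that do not yet form a complete message.
var pending = new List<byte>();

// Feed one received chunk; returns every message that this chunk completes.
List<string> Feed(byte[] chunk, int count)
{
    var messages = new List<string>();
    for (int i = 0; i < count; i++)
    {
        if (chunk[i] == (byte)'\n')
        {
            messages.Add(Encoding.ASCII.GetString(pending.ToArray()));
            pending.Clear();
        }
        else
        {
            pending.Add(chunk[i]);
        }
    }
    return messages;
}

// Simulated chunks, split the way Receive might happen to deliver them.
byte[] c1 = Encoding.ASCII.GetBytes("TICK A 1");
byte[] c2 = Encoding.ASCII.GetBytes("0.5\nTICK B 7\nTI");
byte[] c3 = Encoding.ASCII.GetBytes("CK C 3\n");

var all = new List<string>();
all.AddRange(Feed(c1, c1.Length));
all.AddRange(Feed(c2, c2.Length));
all.AddRange(Feed(c3, c3.Length));

Console.WriteLine(string.Join(" | ", all));
```

The same Feed call can sit after a blocking Receive or inside an async receive continuation. Since only one receive is outstanding at a time on the single connection, no locking is needed around the pending buffer in either model.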

The number one rule of performance is only try to improve it when you have to.
I see you mention standards but never mention problems. If you are not having any, then you don't need to worry about what the standards say.

"This class was specifically designed for network server applications that require high performance."
As I understand, you are a client here, having only a single connection.
Data on this connection arrives in order, consumed by a single thread.
You will probably lose performance if you instead receive small amounts on separate threads, just so that you can assemble them later in a serialized, and thus effectively single-threaded, manner.
Much Ado about Nothing.
You do not really need to speed this up, you probably cannot.
What you can do, however is to dispatch work units to other threads after you receive them.
You do not need SocketAsyncEventArgs for this. This might speed things up.
As always, measure & measure.
Also, just because you can, it does not mean you should.
If the performance is enough for the foreseeable future, why complicate matters?
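
The suggestion above to dispatch work units to other threads after receiving them could be sketched with a BlockingCollection as a producer/consumer queue. The queue capacity, the worker count of 2, the fake "tick" messages and the counter standing in for real processing are all assumptions for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Bounded queue between the single receiving thread and the workers.
var queue = new BlockingCollection<string>(boundedCapacity: 1024);
int processed = 0;

// Worker tasks: each takes complete messages off the queue until it is closed.
var workers = new Task[2];
for (int w = 0; w < workers.Length; w++)
{
    workers[w] = Task.Run(() =>
    {
        foreach (string message in queue.GetConsumingEnumerable())
        {
            // Placeholder for real per-message work.
            Interlocked.Increment(ref processed);
        }
    });
}

// Stand-in for the receive loop: in a real client these would be the
// complete messages reassembled from Socket.Receive.
for (int i = 0; i < 100; i++)
    queue.Add($"tick {i}");

queue.CompleteAdding();      // no more messages; lets the workers finish
Task.WaitAll(workers);

Console.WriteLine(processed);
```

With more than one worker, messages are not guaranteed to be processed in arrival order. If ordering matters for your tick data, use a single consumer or partition the messages by key, and as the answer says, measure before assuming this helps.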
