Are Asynchronous writes to a socket thread safe? - c#

Consider the Socket.BeginSend() method. If two thread pool threads were to call this method simultaneously, would their respective messages end up mixing with each other or does the socket class keep this from happening?

.NET Socket instances are not thread-safe in that simultaneous calls to some methods (the same or different ones) may cause inconsistent state. However, the BeginSend() and BeginReceive() methods are thread-safe with respect to themselves.
It is safe to have multiple outstanding calls to each (or both).
In the case of BeginReceive(), they will be serviced when data is available in the order called. This can be useful if, for example, your processing is lengthy but you want other receives to occur as quickly as possible. Of course in this case you may have your code processing multiple receipts simultaneously and you may need your own synchronization logic to protect your application state.
In the case of BeginSend(), each call will attempt to push the sent data into the socket buffer, and as soon as it is accepted there, your callback will be called (where you will call EndSend()). If there is insufficient buffer space for any call, it will block.
Note: don't assume that the default 8k buffer means "I can quickly call BeginSend() with exactly 8k of data, then it will block," as the following are true:
The 8K is a "nominal size" number, and the buffer may shrink and grow somewhat
As you are queuing up the 8K worth of calls, data is already being sent out on the network, reducing that 8K of queued data
In general:
If you call BeginSend() several times within a thread, you are assured that the sends will leave the machine in the order they were called.
If you call BeginSend() from several threads, there is no guarantee of order unless you use some other blocking mechanism to force the actual calls to occur in some specific order. Each call however will send its data properly in one contiguous stream of network bytes.
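To make the single-thread case concrete, here is a minimal sketch (the class and method names are made up for illustration): several BeginSend() calls issued back to back from one thread put their bytes on the wire in the order the calls were made, each as one contiguous run.

```csharp
using System.Net.Sockets;
using System.Text;

// Illustrative sketch: consecutive BeginSend() calls from a single thread.
static class OrderedSendExample
{
    public static void SendAll(Socket socket, string[] messages)
    {
        foreach (string message in messages)
        {
            byte[] data = Encoding.UTF8.GetBytes(message);

            // Each call queues its data as one contiguous block; the callback
            // completes the operation with EndSend().
            socket.BeginSend(data, 0, data.Length, SocketFlags.None,
                ar => ((Socket)ar.AsyncState).EndSend(ar), socket);
        }
    }
}
```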

I found a similar post on the MSDN forum which seems to answer your question.
You can queue multiple BeginSends at the same time. You don't need to lock.
Edit:
Even more interesting information:
If you scroll down a bit in the Remarks section of the MSDN doc for BeginSend(), you will find an interesting use of callback methods that could be relevant for you.
[...] If you want the original thread to block after you call the BeginSend method, use the WaitHandle.WaitOne method. [...]
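A sketch of the pattern that remark describes (the method name is illustrative): block the calling thread on the operation's wait handle until the send completes.

```csharp
using System;
using System.Net.Sockets;

static class BlockingSendExample
{
    public static int SendAndWait(Socket socket, byte[] data)
    {
        IAsyncResult result = socket.BeginSend(data, 0, data.Length, SocketFlags.None, null, null);

        result.AsyncWaitHandle.WaitOne();   // WaitHandle.WaitOne, as in the MSDN remark
        return socket.EndSend(result);      // number of bytes actually sent
    }
}
```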

Related

Trouble understanding the mechanics+benefits of async in .NET

On reading this article it gives the impression that using async can mean that a webserver can concurrently serve more requests than it has threads.
I don't understand how this works though. I do understand (at least I think I understand) that using async one can spin off multiple IO requests without blocking inside a request and without starting new threads. I believe there's some magic going on but deep within the covers there's probably a call to select().
However, even then, I still think you need one thread per request. The thread presumably has the current stack, and other information about where execution in that thread is up to (like an instruction pointer). I can't see how one can just trash that information whilst you're waiting for a file descriptor in a select() call, lest you forget where you're at and also risk all your active data being garbage collected. But the article first referenced suggested this somehow happens, and I quote:
Asynchronous requests allow a smaller number of threads to handle a larger number of requests.
I don't really understand the mechanics of how this happens, and to me it seems impossible. Sure, async will reduce the number of additional threads to deal with IO requests, but I still can't see how you can avoid at least one thread per request. Unless you do something weird like trashing the thread but saving its state to the heap and then restoring it later, but I don't really see how that achieves much (you could just apply the thread pool to "running threads" and achieve basically the same thing).

How does .NET handle IOCP thread safety?

I'm playing with SocketAsyncEventArgs and IO Completion Ports.
I've been looking but I can't seem to find how .NET handles race conditions.
Need clarification on this stack overflow question:
https://stackoverflow.com/a/28690948/855421
As a side note, don't forget that your request might have completed synchronously. Perhaps you're reading from a TCP stream in a while loop, 512 bytes at a time. If the socket buffer has enough data in it, multiple ReadAsyncs can return immediately without doing any thread switching at all. [emphasis mine]
For the sake of simplicity, let's assume one client and one server. The server is using an IOCP. If the client is a fast writer but the server is a slow reader, does IOCP mean the kernel/underlying process can signal multiple threads?
1. The socket reads 512 bytes; the kernel signals an IOCP thread.
2. The server processes the new bytes.
3. The socket receives another X bytes, but the server is still processing the previous buffer.
Does the kernel spin up another thread? SocketAsyncEventArgs has a Buffer which by definition is: "Gets the data buffer to use with an asynchronous socket method." So the buffer should not change over the lifetime of the SocketAsyncEventArgs if I understand that correctly.
What's preventing SocketAsyncEventArgs.Buffer from getting corrupted by IOCP thread 2?
Or does the .NET framework synchronize IOCP threads? If so, what's the point of spinning up a new thread then if IOCP thread 1 blocks the previous read?
I've been looking but I can't seem to find how .NET handles race conditions.
For the most part, it doesn't. It's up to you to do that. But, it's not clear from your question that you really have a race condition problem.
You are asking about this text, in the other answer:
If the socket buffer has enough data in it, multiple ReadAsyncs can return immediately without doing any thread switching at all
First, to be clear: the method's name is ReceiveAsync(), not ReadAsync(). Other classes, like StreamReader and NetworkStream, have ReadAsync() methods, and those methods have very little to do with what your question is about. Now, that clarified…
That quote is about the opposite of a race condition. The author of that text is warning you that, should you happen to call ReceiveAsync() on a socket that already has data ready to be read, the data will be read synchronously and the SocketAsyncEventArgs.Completed event will not be raised later. It will be the responsibility of the thread that called ReceiveAsync() to also process the data that was read.
All of this would happen in a single thread. There wouldn't be any race condition in that scenario.
Now, let's consider your "fast writer, slow reader" scenario. The worst that can happen there is that the first read, which could take place in any thread, does not complete immediately, but by the time the Completed event is raised, the writer has overrun the reader's pace. In this case, since part of handling the Completed event is likely to be calling ReceiveAsync() again, which now will return synchronously, an IOCP thread pool thread will get tied up looping on the calls to ReceiveAsync(). No new thread is needed, because the current IOCP thread is doing all the work synchronously. But it does prevent that thread from handling other IOCP events.
All that will mean though, is that if you have some other socket the server is handling and which also needs to call ReceiveAsync(), the framework will have to ensure there's another thread in the IOCP thread pool available to handle that I/O. But, that's a completely different socket and you would necessarily be using a completely different buffer for that socket anyway.
Again, no race condition.
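A minimal sketch of that single-threaded path (assuming the SocketAsyncEventArgs already has a buffer assigned with SetBuffer; the helper names are illustrative): ReceiveAsync() returns false when the operation completed synchronously, in which case the calling thread processes the data itself and loops; only asynchronous completions raise the Completed event on an IOCP thread.

```csharp
using System.Net.Sockets;

static class ReceiveLoopExample
{
    public static void StartReceiving(Socket socket, SocketAsyncEventArgs args)
    {
        // Completed is raised only for operations that did NOT finish synchronously.
        args.Completed += (sender, e) =>
        {
            if (Process(e))
                PostReceive((Socket)sender, e);
        };

        PostReceive(socket, args);
    }

    private static void PostReceive(Socket socket, SocketAsyncEventArgs args)
    {
        // ReceiveAsync returns false when data was already buffered and the call
        // completed synchronously; this thread handles it and immediately loops.
        while (!socket.ReceiveAsync(args))
        {
            if (!Process(args))
                return;     // connection closed or errored: stop receiving
        }
    }

    private static bool Process(SocketAsyncEventArgs args)
    {
        if (args.SocketError != SocketError.Success || args.BytesTransferred == 0)
            return false;   // peer closed the connection or an error occurred

        // Received bytes: args.Buffer[args.Offset .. args.Offset + args.BytesTransferred)
        return true;
    }
}
```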
Now, all that said, if you want to get really confused, it is possible to use asynchronous I/O in the .NET Socket API (whether with BeginReceive() or ReceiveAsync() or even wrapping the socket in a NetworkStream and using ReadAsync()) in such a way that you do have a race condition for a particular socket.
I hesitate to even mention it, because there's no evidence in your question this pertains to you at all, nor that you're even really interested in having this level of detail. Adding this explanation could just confuse things. But, for the sake of completeness…
It is possible to have issued more than one read operation on a socket at any given time. This would be somewhat akin to double- or triple-buffered video display (if you're familiar with that concept). The idea being that you might still be handling a read operation while new data comes in, and it would be more performant to have a new read operation already in progress to handle that data before you're done handling the current read operation.
This sounds great, but in practice because of the way Windows schedules threads, and in particular does not guarantee a particular ordering of thread scheduling, if you try to implement your code that way, you create the possibility that your code will see read operations completed out of order. That is, if you for example call ReceiveAsync() twice in a row (with two different SocketAsyncEventArgs objects and two different buffers, of course), your Completed event handler might get called with the second buffer first.
This isn't because the read operations themselves complete out of order. They don't. Hence the emphasis on "your" above. The problem is that while the IOCP threads handling the IO completions become runnable in the correct order (because the buffers are filled in the order you provided them by calling ReceiveAsync() multiple times), the second IOCP thread to become runnable could wind up being the first thread to actually be scheduled to run by Windows.
This is not hard to deal with. You just have to make sure that you track the buffer sequence as you issue the read operations, so that you can reassemble the buffers in the correct order later. All of the async options available provide a mechanism for you to include additional user state data (e.g. SocketAsyncEventArgs.UserToken), so you can use this to track the order of your buffers.
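A rough sketch of that bookkeeping (the sequence counter and the reassembly step are illustrative, not part of the API; each SocketAsyncEventArgs is assumed to have its own buffer and to have OnCompleted wired to its Completed event):

```csharp
using System.Net.Sockets;
using System.Threading;

static class OrderedReceiveExample
{
    private static int _nextSequence;

    public static void IssueRead(Socket socket, SocketAsyncEventArgs args)
    {
        // Tag this operation before issuing it so completions can be reordered later.
        args.UserToken = Interlocked.Increment(ref _nextSequence);

        if (!socket.ReceiveAsync(args))
            OnCompleted(socket, args);          // completed synchronously
    }

    public static void OnCompleted(object sender, SocketAsyncEventArgs args)
    {
        int sequence = (int)args.UserToken;

        // Hand (sequence, buffer slice) to whatever reassembles the stream;
        // consumers sort by sequence number before interpreting the bytes.
    }
}
```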
Again, this is not common. For most scenarios, a completely orderly implementation, where you only issue a new read operation after you're completely done with the current read operation, is completely sufficient. If you're worried at all about getting a multi-buffer read implementation correct, just don't bother. Stick with the simple approach.

Is safe to call NetworkStream.BeginWrite multiple times from one thread?

I am implementing a TCP client in my Unity3D game and I am wondering if it's actually safe or not to call the NetworkStream.BeginWrite without waiting until the previous call finishes writing.
From what I understood while reading the documentation, it's safe as long as I am not performing concurrent BeginWrite calls from different threads (and Unity has only one thread for the game main loop).
For reading, I call BeginRead right after making the connection, with an asynchronous callback in which I read the incoming data from TcpClient.GetStream(), put it into a separate MemoryStream under lock(readMemoryStream), and call BeginRead again. Besides that, in my Update() function (on the main game thread) I check for new data in readMemoryStream, check whether a complete message has arrived, unpack it (using the same lock(readMemoryStream), of course), and perform operations on the game objects based on the message from the server.
Will this approach work fine? Won't BeginRead interfere with BeginWrite?
Again, I am using callback thread to read the data and main thread to write.
As long as no two threads are calling BeginWrite() concurrently, all is well. The same thread, or even other threads, can call BeginWrite() consecutively before earlier calls have completed.
Do note that the completion callbacks might be executed out of order; if you do implement it this way and the order of the execution of the completion callbacks matters, it is up to you to keep track of which asynchronous operation is which. Of course, for writing to the socket, this often doesn't matter, as you may not have anything to do in the completion callback other than to call EndWrite().
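One illustrative way to do that tracking (the sequence parameter is made up for the example): pass a per-write identifier as the BeginWrite state object so the completion callback can tell which of the overlapped writes it belongs to, even if the callbacks run out of order.

```csharp
using System.Net.Sockets;

static class OverlappedWriteExample
{
    public static void WriteChunk(NetworkStream stream, byte[] chunk, int sequence)
    {
        stream.BeginWrite(chunk, 0, chunk.Length, ar =>
        {
            stream.EndWrite(ar);                    // complete the operation

            int finished = (int)ar.AsyncState;      // which write just finished
            // For writes this is often all there is to do, as noted above.
        }, sequence);
    }
}
```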
Reading from and writing to a socket are completely independent operations. The socket is full-duplex and can safely handle concurrently pending read and write operations on the same socket.
You didn't ask, but like BeginWrite(), you can also call BeginRead() multiple times without earlier operations completing. And again, as with BeginWrite(), it's up to you to keep track of the correct order of the operations so that when your completion callback is executed for each one, you know which order the received data should be in.
Note that since the order of the completions is critical for read operations (something often not the case for write operations), it is common for all but the largest-scale implementations to never overlap read operations on a given socket. The code is much simpler when for a given socket, only one read operation is in progress at a time.
One last caveat: do note that your buffers are pinned for the duration of the I/O operation. Too many outstanding I/O operations can interfere with the efficient management of the heap, due to fragmentation. This is unlikely to be an issue in a client implementation, but a large-scale server implementation should take this into account (e.g. by allocating large buffers so that they come from the LOH, where things are always pinned anyway).
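A rough sketch of the large-buffer idea, assuming the documented 85,000-byte Large Object Heap threshold: arrays at or above that size are allocated on the LOH, which is not compacted by default, so pinning them for overlapped I/O does not fragment the normal GC heaps. A common variation is one big array carved into per-connection segments.

```csharp
// Illustrative only: a buffer sized to land on the Large Object Heap.
const int LargeObjectHeapThreshold = 85_000;            // documented LOH threshold, in bytes
byte[] ioBuffer = new byte[LargeObjectHeapThreshold];   // lands on the LOH; cheap to pin
```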

Sync Vs. Async Sockets Performance in .NET

Everything that I read about sockets in .NET says that the asynchronous pattern gives better performance (especially with the new SocketAsyncEventArgs which saves on the allocation).
I think this makes sense if we're talking about a server with many client connections where it's not possible to allocate one thread per connection. Then I can see the advantage of using the ThreadPool threads and getting async callbacks on them.
But in my app, I'm the client and I just need to listen to one server sending market tick data over one TCP connection. Right now, I create a single thread, set its priority to Highest, and have it call Socket.Receive(). My thread blocks on this call and wakes up once new data arrives.
If I were to switch this to an async pattern so that I get a callback when there's new data, I see two issues:
The threadpool threads will have default priority so it seems they will be strictly worse than my own thread which has Highest priority.
I'll still have to send everything through a single thread at some point. Say that I get N callbacks at almost the same time on N different threadpool threads notifying me that there's new data. The N byte arrays that they deliver can't be processed on the threadpool threads because there's no guarantee that they represent N unique market data messages because TCP is stream based. I'll have to lock and put the bytes into an array anyway and signal some other thread that can process what's in the array. So I'm not sure what having N threadpool threads is buying me.
Am I thinking about this wrong? Is there a reason to use the async pattern in my specific case of one client connected to one server?
UPDATE:
So I think that I was misunderstanding the async pattern in (2) above. I would get a callback on one worker thread when there was data available. Then I would begin another async receive and get another callback, etc. I wouldn't get N callbacks at the same time.
The question still is the same though. Is there any reason that the callbacks would be better in my specific situation where I'm the client and only connected to one server.
The slowest part of your application will be the network communication. It's highly likely that you will make almost no difference to performance for a one thread, one connection client by tweaking things like this. The network communication itself will dwarf all other contributions to processing or context switching time.
Say that I get N callbacks at almost the same time on N different threadpool threads notifying me that there's new data.
Why is that going to happen? If you have one socket, you Begin an operation on it to receive data, and you get exactly one callback when it's done. You then decide whether to do another operation. It sounds like you're overcomplicating it, though maybe I'm oversimplifying it with regard to what you're trying to do.
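To make that one-operation-at-a-time model concrete, a rough sketch (the class name and the 8K buffer are illustrative choices): each BeginReceive() produces exactly one callback, and the callback decides whether to issue the next receive.

```csharp
using System;
using System.Net.Sockets;

class SingleReceiveLoop
{
    private readonly Socket _socket;
    private readonly byte[] _buffer = new byte[8192];

    public SingleReceiveLoop(Socket socket) => _socket = socket;

    public void Start() =>
        _socket.BeginReceive(_buffer, 0, _buffer.Length, SocketFlags.None, OnReceive, null);

    private void OnReceive(IAsyncResult ar)
    {
        int bytesRead = _socket.EndReceive(ar);
        if (bytesRead == 0)
            return;                         // peer closed the connection

        // Process _buffer[0 .. bytesRead) here, then decide to receive again.
        Start();
    }
}
```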
In summary, I'd say: pick the simplest programming model that gets you what you want; given the choices available in your scenario, they are unlikely to make any noticeable difference to performance whichever one you go with. With the blocking model, you're "wasting" a thread that could be doing some real work, but hey... maybe you don't have any real work for it to do.
The number one rule of performance is only try to improve it when you have to.
I see you mention standards but never mention problems; if you are not having any, then you don't need to worry about what the standards say.
"This class was specifically designed for network server applications that require high performance."
As I understand, you are a client here, having only a single connection.
Data on this connection arrives in order, consumed by a single thread.
You will probably lose performance if you instead receive small amounts on separate threads, just so that you can assemble them later in a serialized, effectively single-threaded, manner.
Much Ado about Nothing.
You do not really need to speed this up, and you probably cannot.
What you can do, however is to dispatch work units to other threads after you receive them.
You do not need SocketAsyncEventArgs for this. This might speed things up.
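A sketch of that suggestion (HandleMessage is a placeholder, and real code would still need message framing, since TCP is a byte stream): keep the single blocking receive thread, but push the per-message work onto the thread pool so the receive loop never waits on processing.

```csharp
using System;
using System.Net.Sockets;
using System.Threading;

static class DispatchExample
{
    public static void ReceiveLoop(Socket socket)
    {
        var buffer = new byte[8192];
        int read;

        while ((read = socket.Receive(buffer)) > 0)
        {
            // Copy the bytes so the receive buffer can be reused immediately.
            var chunk = new byte[read];
            Buffer.BlockCopy(buffer, 0, chunk, 0, read);

            ThreadPool.QueueUserWorkItem(_ => HandleMessage(chunk));
        }
    }

    private static void HandleMessage(byte[] chunk)
    {
        // Placeholder for the actual per-message work.
    }
}
```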
As always, measure & measure.
Also, just because you can, it does not mean you should.
If the performance is enough for the foreseeable future, why complicate matters?

Letting the client pull data

I'm writing a server in C# which creates a (long, possibly even infinite) IEnumerable<Result> in response to a client request, and then streams those results back to the client.
Can I set it up so that, if the client is reading slowly (or possibly not at all for a couple of seconds at a time), the server won't need a thread stalled waiting for buffer space to clear up so that it can pull the next couple of Results, serialize them, and stuff them onto the network?
Is this how NetworkStream.BeginWrite works? The documentation is unclear (to me) about when the callback method will be called. Does it happen basically immediately, just on another thread which then blocks on EndWrite waiting for the actual writing to happen? Does it happen when some sort of lower-level buffer in the sockets API underflows? Does it happen when the data has been actually written to the network? Does it happen when it has been acknowledged?
I'm confused, so there's a possibility that this whole question is off-base. If so, could you turn me around and point me in the right direction to solve what I'd expect is a fairly common problem?
I'll answer the third part of your question in a bit more detail.
The MSDN documentation states that:
When your application calls BeginWrite, the system uses a separate thread to execute the specified callback method, and blocks on EndWrite until the NetworkStream sends the number of bytes requested or throws an exception.
As far as my understanding goes, whether or not the callback method is called immediately after calling BeginSend depends upon the underlying implementation and platform. For example, if IO completion ports are available on Windows, it won't be. A thread from the thread pool will block before calling it.
In fact, the NetworkStream's BeginWrite method simply calls the underlying socket's BeginSend method on my .Net implementation. Mine uses the underlying WSASend Winsock function with completion ports, where available. This makes it far more efficient than simply creating your own thread for each send/write operation, even if you were to use a thread pool.
The Socket.BeginSend method then calls the OverlappedAsyncResult.CheckAsyncCallOverlappedResult method if the result of WSASend was IOPending, which in turn calls the native RegisterWaitForSingleObject Win32 function. This will cause one of the threads in the thread pool to block until the WSASend method signals that it has completed, after which the callback method is called.
The Socket.EndSend method, called by NetworkStream.EndWrite, will wait for the send operation to complete. The reason it has to do this is that if IO completion ports are not available, the callback method will be called straight away.
I must stress again that these details are specific to my implementation of .Net and my platform, but that should hopefully give you some insight.
First, the only way your main thread can keep executing while other work is being done is through the use of another thread. A thread can't do two things at once.
However, I think what you're trying to avoid is messing with the Thread object, and yes, that is possible through the use of BeginWrite. As for your questions:
The documentation is unclear (to me) about when the callback method will be called.
The call is made after the network driver reads the data into its buffers.
Does it happen basically immediately, just on another thread which then blocks on EndWrite waiting for the actual writing to happen?
Nope, just once it's in the buffers handled by the network driver.
Does it happen when some sort of lower-level buffer in the sockets API underflows?
If by underflow you mean it has room for it, then yes.
Does it happen when the data has been actually written to the network?
No.
Does it happen when it has been acknowledged?
No.
EDIT
Personally I would try using a Thread. There's a lot of stuff that BeginWrite is doing behind the scenes that you should probably recognize... plus I'm weird and I like controlling my threads.
