I'm writing a server in C# which creates a (long, possibly even infinite) IEnumerable<Result> in response to a client request, and then streams those results back to the client.
Can I set it up so that, if the client is reading slowly (or possibly not at all for a couple of seconds at a time), the server won't need a thread stalled waiting for buffer space to clear up so that it can pull the next couple of Results, serialize them, and stuff them onto the network?
Is this how NetworkStream.BeginWrite works? The documentation is unclear (to me) about when the callback method will be called. Does it happen basically immediately, just on another thread which then blocks on EndWrite waiting for the actual writing to happen? Does it happen when some sort of lower-level buffer in the sockets API underflows? Does it happen when the data has been actually written to the network? Does it happen when it has been acknowledged?
I'm confused, so there's a possibility that this whole question is off-base. If so, could you turn me around and point me in the right direction to solve what I'd expect is a fairly common problem?
I'll answer the third part of your question in a bit more detail.
The MSDN documentation states that:
When your application calls BeginWrite, the system uses a separate thread to execute the specified callback method, and blocks on EndWrite until the NetworkStream sends the number of bytes requested or throws an exception.
As far as my understanding goes, whether or not the callback method is called immediately after calling BeginSend depends upon the underlying implementation and platform. For example, if IO completion ports are available on Windows, it won't be. A thread from the thread pool will block before calling it.
In fact, the NetworkStream's BeginWrite method simply calls the underlying socket's BeginSend method on my .Net implementation. Mine uses the underlying WSASend Winsock function with completion ports, where available. This makes it far more efficient than simply creating your own thread for each send/write operation, even if you were to use a thread pool.
The Socket.BeginSend method then calls the OverlappedAsyncResult.CheckAsyncCallOverlappedResult method if the result of WSASend was IOPending, which in turn calls the native RegisterWaitForSingleObject Win32 function. This will cause one of the threads in the thread pool to block until the WSASend method signals that it has completed, after which the callback method is called.
The Socket.EndSend method, called by NetworkStream.EndWrite, will wait for the send operation to complete. It has to do this because, if IO completion ports are not available, the callback method will be called straight away.
I must stress again that these details are specific to my implementation of .Net and my platform, but that should hopefully give you some insight.
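On newer versions of .NET the same idea can be expressed with the Task-based Stream.WriteAsync instead of BeginWrite/EndWrite. The following is only a minimal sketch: ResultStreamer, SendAllAsync and the chunks parameter are hypothetical names, and it assumes each Result has already been serialized to a byte[] (the question does not say how serialization is done).

```csharp
using System.Collections.Generic;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of the framework.
public static class ResultStreamer
{
    // 'chunks' stands in for the already-serialized Results (an assumption).
    public static async Task SendAllAsync(
        Stream stream,
        IEnumerable<byte[]> chunks,
        CancellationToken cancellationToken = default(CancellationToken))
    {
        foreach (byte[] chunk in chunks)
        {
            // The next item is pulled from the sequence only after this write
            // has completed, so a slow reader throttles the producer.
            await stream.WriteAsync(chunk, 0, chunk.Length, cancellationToken)
                        .ConfigureAwait(false);
        }
    }
}
```

Passing the NetworkStream obtained from the connection as stream should mean that, while the client is not reading, the method is simply suspended at the await rather than holding a thread in a blocking Write call. Because the sequence is enumerated lazily, an infinite IEnumerable is fine as long as the token is eventually cancelled or the connection is closed.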
First, the only way your main thread can keep executing while other work is being done is through the use of another thread. A thread can't do two things at once.
However, I think what you're trying to avoid is messing with the Thread object yourself, and yes, that is possible through the use of BeginWrite. As for your questions:
The documentation is unclear (to me)
about when the callback method will be
called.
The call is made after the network driver reads the data into its buffers.
Does it happen basically immediately,
just on another thread which then
blocks on EndWrite waiting for the
actual writing to happen?
Nope, just until it's in the buffers handled by the network driver.
Does it happen when some sort of
lower-level buffer in the sockets API
underflows?
If by underflow you mean it has room for it, then yes.
Does it happen when the data has been
actually written to the network?
No.
Does it happen when it has been
acknowledged?
No.
EDIT
Personally I would try using a Thread. There's a lot of stuff that BeginWrite is doing behind the scenes that you should probably recognize... plus I'm weird and I like controlling my threads.
Related
I'm playing with SocketAsyncEventArgs and IO Completion Ports.
I've been looking but I can't seem to find how .NET handles race conditions.
Need clarification on this stack overflow question:
https://stackoverflow.com/a/28690948/855421
As a side note, don't forget that your request might have completed synchronously. Perhaps you're reading from a TCP stream in a while loop, 512 bytes at a time. If the socket buffer has enough data in it, multiple ReadAsyncs can return immediately without doing any thread switching at all. [emphasis mine]
For the sake of simplicity, let's assume one client and one server. The server is using IOCP. If the client is a fast writer but the server is a slow reader, does IOCP mean the kernel/underlying process can signal multiple threads?
1. The socket reads 512 bytes and the kernel signals an IOCP thread
2. The server processes the new bytes
3. The socket receives another X bytes, but the server is still processing the previous buffer
Does the kernel spin up another thread? SocketAsyncEventArgs has a Buffer which by definition is: "Gets the data buffer to use with an asynchronous socket method." So the buffer should not change over the lifetime of the SocketAsyncEventArgs if I understand that correctly.
What's preventing SocketAsyncEventArgs.Buffer from getting corrupted by IOCP thread 2?
Or does the .NET framework synchronize IOCP threads? If so, what's the point of spinning up a new thread then if IOCP thread 1 blocks the previous read?
I've been looking but I can't seem to find how .NET handles race conditions.
For the most part, it doesn't. It's up to you to do that. But, it's not clear from your question that you really have a race condition problem.
You are asking about this text, in the other answer:
If the socket buffer has enough data in it, multiple ReadAsyncs can return immediately without doing any thread switching at all
First, to be clear: the method's name is ReceiveAsync(), not ReadAsync(). Other classes, like StreamReader and NetworkStream have ReadAsync() methods, and these methods have very little to do with what your question is about. Now, that clarified…
That quote is about the opposite of a race condition. The author of that text is warning you that, should you happen to call ReceiveAsync() on a socket that already has data ready to be read, the data will be read synchronously and the SocketAsyncEventArgs.Completed event will not be raised later. It will be the responsibility of the thread that called ReceiveAsync() to also process the data that was read.
All of this would happen in a single thread. There wouldn't be any race condition in that scenario.
Now, let's consider your "fast writer, slow reader" scenario. The worst that can happen there is that the first read, which could take place in any thread, does not complete immediately, but by the time the Completed event is raised, the writer has overrun the reader's pace. In this case, since part of handling the Completed event is likely to be calling ReceiveAsync() again, which now will return synchronously, an IOCP thread pool thread will get tied up looping on the calls to ReceiveAsync(). No new thread is needed, because the current IOCP thread is doing all the work synchronously. But it does prevent that thread from handling other IOCP events.
All that will mean though, is that if you have some other socket the server is handling and which also needs to call ReceiveAsync(), the framework will have to ensure there's another thread in the IOCP thread pool available to handle that I/O. But, that's a completely different socket and you would necessarily be using a completely different buffer for that socket anyway.
Again, no race condition.
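To make the synchronous-completion case concrete, here is a minimal sketch of a single-buffer receive loop. ReceiveLoop and its onData callback are hypothetical names made up for this illustration; the framework calls used are Socket.ReceiveAsync(), SocketAsyncEventArgs.SetBuffer() and the Completed event.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical helper; only one receive is ever outstanding on the socket.
public sealed class ReceiveLoop
{
    private readonly Socket _socket;
    private readonly SocketAsyncEventArgs _args = new SocketAsyncEventArgs();
    private readonly Action<byte[], int> _onData; // called with (buffer, byteCount)

    public ReceiveLoop(Socket socket, Action<byte[], int> onData)
    {
        _socket = socket;
        _onData = onData;
        _args.SetBuffer(new byte[512], 0, 512);
        _args.Completed += (sender, e) =>
        {
            // Asynchronous completion: process the data, then keep reading.
            if (Process(e)) Pump();
        };
    }

    public void Start()
    {
        Pump();
    }

    private void Pump()
    {
        // ReceiveAsync() returns false when it completed synchronously. In that
        // case Completed is NOT raised, so the calling thread handles the data.
        while (!_socket.ReceiveAsync(_args))
        {
            if (!Process(_args)) return;
        }
        // true: the operation is pending and Completed will be raised later.
    }

    private bool Process(SocketAsyncEventArgs e)
    {
        if (e.SocketError != SocketError.Success || e.BytesTransferred == 0)
            return false; // error, or the remote side closed the connection
        _onData(e.Buffer, e.BytesTransferred);
        return true;
    }
}
```

The buffer is only handed back to the socket after onData has returned, which is why a second thread never touches it while it is still being processed.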
Now, all that said, if you want to get really confused, it is possible to use asynchronous I/O in the .NET Socket API (whether with BeginReceive() or ReceiveAsync() or even wrapping the socket in a NetworkStream and using ReadAsync()) in such a way that you do have a race condition for a particular socket.
I hesitate to even mention it, because there's no evidence in your question this pertains to you at all, nor that you're even really interested in having this level of detail. Adding this explanation could just confuse things. But, for the sake of completeness…
It is possible to have issued more than one read operation on a socket at any given time. This would be somewhat akin to double- or triple-buffered video display (if you're familiar with that concept). The idea being that you might still be handling a read operation while new data comes in, and it would be more performant to have a new read operation already in progress to handle that data before you're done handling the current read operation.
This sounds great, but in practice because of the way Windows schedules threads, and in particular does not guarantee a particular ordering of thread scheduling, if you try to implement your code that way, you create the possibility that your code will see read operations completed out of order. That is, if you for example call ReceiveAsync() twice in a row (with two different SocketAsyncEventArgs objects and two different buffers, of course), your Completed event handler might get called with the second buffer first.
This isn't because the read operations themselves complete out of order. They don't. Hence the emphasis on "your" above. The problem is that while the IOCP threads handling the IO completions become runnable in the correct order (because the buffers are filled in the order you provided them by calling ReceiveAsync() multiple times), the second IOCP thread to become runnable could wind up being the first thread to actually be scheduled to run by Windows.
This is not hard to deal with. You just have to make sure that you track the buffer sequence as you issue the read operations, so that you can reassemble the buffers in the correct order later. All of the async options available provide a mechanism for you to include additional user state data (e.g. SocketAsyncEventArgs.UserToken), so you can use this to track the order of your buffers.
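One way to do that tracking, sketched under assumptions: SequenceReorderer is a hypothetical class, and the sequence number it hands out is what you would store in SocketAsyncEventArgs.UserToken (or the state object of BeginReceive()) when issuing each read.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical reassembly helper; not part of the Socket API.
public sealed class SequenceReorderer
{
    private readonly object _gate = new object();
    private readonly Dictionary<long, byte[]> _pending = new Dictionary<long, byte[]>();
    private readonly Action<byte[]> _deliver;
    private long _nextToIssue;
    private long _nextToDeliver;

    public SequenceReorderer(Action<byte[]> deliver)
    {
        _deliver = deliver;
    }

    // Call just before issuing each read, in the same order the reads are issued.
    public long NextSequence()
    {
        lock (_gate) return _nextToIssue++;
    }

    // Call from the completion handler with the tag and a copy of the received bytes.
    public void Complete(long sequence, byte[] data)
    {
        lock (_gate)
        {
            _pending[sequence] = data;
            // Release every buffer that is now contiguous with what was delivered.
            byte[] next;
            while (_pending.TryGetValue(_nextToDeliver, out next))
            {
                _pending.Remove(_nextToDeliver);
                _nextToDeliver++;
                _deliver(next);
            }
        }
    }
}
```

If the handler for read 1 runs before the handler for read 0, its data is simply parked until read 0 arrives, and then both are delivered in order.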
Again, this is not common. For most scenarios, a completely orderly implementation, where you only issue a new read operation after you're completely done with the current read operation, is completely sufficient. If you're worried at all about getting a multi-buffer read implementation correct, just don't bother. Stick with the simple approach.
I am implementing a TCP client in my Unity3D game and I am wondering if it's actually safe or not to call the NetworkStream.BeginWrite without waiting until the previous call finishes writing.
From what I understood while reading the documentation, it's safe as long as I am not making concurrent BeginWrite calls from different threads (and Unity has only one thread for the main game loop).
For reading, I call BeginRead right after making a connection, with an asynchronous callback in which I read the incoming data from TcpClient.GetStream(), put it into a separate MemoryStream under lock(readMemoryStream), and call BeginRead again. Besides that, in my Update() function (on the main game thread) I check for new data in readMemoryStream, check for a complete message and unpack it (using the same lock(readMemoryStream), of course), and perform operations on the game objects based on the message from the server.
Will this approach work fine? Won't BeginRead interfere with BeginWrite?
Again, I am using the callback thread to read the data and the main thread to write.
As long as no two threads are calling BeginWrite() concurrently, all is well. The same thread, or even other threads, can call BeginWrite() consecutively before earlier calls have completed.
Do note that the completion callbacks might be executed out of order; if you do implement it this way and the order of the execution of the completion callbacks matters, it is up to you to keep track of which asynchronous operation is which. Of course, for writing to the socket, this often doesn't matter, as you may not have anything to do in the completion callback other than to call EndWrite().
Reading from and writing to a socket are completely independent operations. The socket is full-duplex and can safely handle concurrently pending read and write operations on the same socket.
You didn't ask, but like BeginWrite(), you can also call BeginRead() multiple times without earlier operations completing. And again, as with BeginWrite(), it's up to you to keep track of the correct order of the operations so that when your completion callback is executed for each one, you know which order the received data should be in.
Note that since the order of the completions is critical for read operations (something often not the case for write operations), it is common for all but the largest-scale implementations to never overlap read operations on a given socket. The code is much simpler when for a given socket, only one read operation is in progress at a time.
One last caveat: do note that your buffers are pinned for the duration of the I/O operation. Too many outstanding I/O operations can interfere with the efficient management of the heap, due to fragmentation. This is unlikely to be an issue in a client implementation, but a large-scale server implementation should take this into account (e.g. by allocating large buffers so that they come from the LOH, which is not compacted by default, so pinning there costs little extra).
Consider the Socket.BeginSend() method. If two thread pool threads were to call this method simultaneously, would their respective messages end up mixing with each other or does the socket class keep this from happening?
.NET Socket instances are not thread-safe in that simultaneous calls to some methods (the same or different ones) may cause inconsistent state. However, the BeginSend() and BeginReceive() methods are thread-safe with respect to themselves.
It is safe to have multiple outstanding calls to each (or both).
In the case of BeginReceive(), they will be serviced when data is available in the order called. This can be useful if, for example, your processing is lengthy but you want other receives to occur as quickly as possible. Of course in this case you may have your code processing multiple receipts simultaneously and you may need your own synchronization logic to protect your application state.
In the case of BeginSend(), each call will attempt to push the sent data into the socket buffer, and as soon as it is accepted there, your callback will be called (where you will call EndSend()). If there is insufficient buffer space for a call, that send stays pending, and its callback is delayed, until space frees up.
Note: don't assume that the default 8k buffer means "I can quickly call BeginSend() with exactly 8k of data, and after that sends will stall," as the following are true:
The 8K is a "nominal size" number, and the buffer may shrink and grow somewhat
As you are queuing up the 8K worth of calls, data is already being sent on the network, reducing that 8K of queued data
In general:
If you call BeginSend() several times within a thread, you are assured that the sends will leave the machine in the order they were called.
If you call BeginSend() from several threads, there is no guarantee of order unless you use some other blocking mechanism to force the actual calls to occur in some specific order. Each call however will send its data properly in one contiguous stream of network bytes.
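As an illustration of forcing an order across threads, here is a minimal sketch that keeps a header and its body adjacent on the wire. FramedSender is a hypothetical wrapper, and the sketch takes the ordering statement above at face value: it relies on BeginSend() calls that are issued one after another being transmitted in that order.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical wrapper; the class and method names are illustrative only.
public sealed class FramedSender
{
    private readonly Socket _socket;
    private readonly object _sendLock = new object();

    public FramedSender(Socket socket)
    {
        _socket = socket;
    }

    // Queues header and body back to back. The lock covers only the two
    // BeginSend() calls, not the transmission itself, so no other thread
    // can slip its own send in between them.
    public void Send(byte[] header, byte[] body)
    {
        lock (_sendLock)
        {
            _socket.BeginSend(header, 0, header.Length, SocketFlags.None, OnSent, null);
            _socket.BeginSend(body, 0, body.Length, SocketFlags.None, OnSent, null);
        }
    }

    private void OnSent(IAsyncResult ar)
    {
        try { _socket.EndSend(ar); }
        catch (SocketException) { /* a real server would log and close the connection */ }
        catch (ObjectDisposedException) { /* socket was closed while the send was pending */ }
    }
}
```

Without the lock, each individual buffer would still go out contiguously, but a header from one thread could be followed by a body from another.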
I found a similar post on the MSDN forum which seems to answer your question.
You can queue multiple BeginSends at the same time. You don't need to lock
Edit:
Some more interesting information:
If you scroll down a bit to the Remarks section of the MSDN documentation for BeginSend(), you will find a use of callback methods that could be relevant for you.
[...] If you want the original thread to block after you call the BeginSend method, use the WaitHandle.WaitOne method. [...]
I thought C# was an event-driven programming language.
This type of thing seems rather messy and inefficient to me:
tcpListener.Start();
while (true)
{
    // Blocks until a client connects.
    TcpClient client = this.tcpListener.AcceptTcpClient();
    // Hand the connection to a dedicated thread.
    Thread clientThread = new Thread(new ParameterizedThreadStart(HandleClientCommunication));
    clientThread.Start(client);
}
I also have to do the same kind of thing when waiting for new messages to arrive. Obviously these functions are enclosed within a thread.
Is there not a better way to do this that doesn't involve infinite loops and wasted CPU cycles? Events? Notifications? Something? If not, is it bad practice to do a Thread.Sleep so it isn't processing nearly as often?
There is absolutely nothing wrong with the method you posted. There are also no wasted CPU cycles like you mentioned. TcpListener.AcceptTcpClient() blocks the thread until a client connects to it, which means it does not take up any CPU cycles while waiting. So the only time the loop actually loops is when a client connects.
Of course you may want to use something other than while(true) if you want a way to exit the loop and stop listening for connections, but that's another topic. In short, this is a good way to accept connections and I don't see any purpose of having a Thread.Sleep anywhere in here.
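For the "way to exit the loop" part, here is one possible sketch that uses TcpListener.AcceptTcpClientAsync() and a stop flag. AcceptLoop, RunAsync, Stop and the handleClient delegate are hypothetical names chosen for this example, not anything from the code in the question.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

// Hypothetical wrapper around TcpListener with a way to stop accepting.
public sealed class AcceptLoop
{
    private readonly TcpListener _listener;
    private volatile bool _stopping;

    public AcceptLoop(IPAddress address, int port)
    {
        _listener = new TcpListener(address, port);
    }

    // Meaningful once RunAsync has been called (the listener is started there).
    public IPEndPoint LocalEndPoint
    {
        get { return (IPEndPoint)_listener.LocalEndpoint; }
    }

    public async Task RunAsync(Func<TcpClient, Task> handleClient)
    {
        _listener.Start();
        try
        {
            while (!_stopping)
            {
                // No thread is blocked while waiting for the next connection.
                TcpClient client = await _listener.AcceptTcpClientAsync().ConfigureAwait(false);
                Task ignored = handleClient(client); // not awaited: keep accepting
            }
        }
        catch (Exception) when (_stopping)
        {
            // Stop() closed the listener while an accept was pending.
        }
    }

    public void Stop()
    {
        _stopping = true;
        _listener.Stop();
    }
}
```

The blocking loop in the question remains perfectly valid; this variant only trades the dedicated accepting thread for a continuation and gives the loop a clean exit.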
There are actually three ways to handle IO operations for sockets. The first one is to use the blocking functions just as you do. They are usually used to handle a client socket, since the client expects an answer directly most of the time (and therefore can use blocking reads).
For any other socket handlers I would recommend to use one of the two asynchronous (non-blocking) models.
The first model is the easiest one to use. And it's recognized by the Begin/End method names and the IAsyncResult return value from the Begin method. You pass a callback (function pointer) to the Begin method which will be invoked when something has happened. As an example take a look at BeginReceive.
The second asynchronous model is more like the Windows IO model (IO Completion Ports). It's also the newest model in .NET and should give you the best performance. A SocketAsyncEventArgs object is used to control the behavior (like which method to invoke when an operation completes). You also need to be aware that an operation can complete immediately, in which case the callback method will not be invoked. Read more about ReceiveAsync.
How exactly does a Handle relate to a thread? I am writing a service that accepts an HTTP request and calls a method before returning a response. I have written a test client that sends out 10,000 HTTP requests (using a semaphore to make sure that only 1000 request are made at a time).
If I call the method (the method processed before returning a response) through the ThreadPool, or through a generic Action<T>.BeginInvoke, the service's handle count will go way up and stay there until all the requests have finished, but the thread count of the service stays pretty much flat.
However, if I synchronously call the method before returning the response, the thread count goes up, but the handle count will go through extreme peaks and valleys.
This is C# on a Windows machine (Server 2008).
Your description is too vague to give a good diagnostic. But the ThreadPool was designed to carefully throttle the number of active threads. It will avoid running more threads than you have CPU cores. Only when a thread gets "stuck" will it schedule an extra thread. That explains why you see the number of threads not increase wildly. And, indirectly, why the handle count stays stable because the machine is doing less work.
You can think of a handle as an abstraction of a pointer. There are lots of things in Windows that use handles (when you open a file at the API level, you get a handle to the file; when you create a window, the window has a handle; a thread has a handle; etc.). So your handle count probably has to do with operations that are occurring on your threads. If you have more threads running, more stuff is going on at the same time, so you will see more handles open.