I'm developing a game server. I accept clients through a TcpListener:
var Listener = new TcpListener(IPAddress.Any, CommonConfig.Settings.GamePort);
Listener.Start();
ListenerStarted = true;
while (ListenerStarted)
{
    TcpClient tcpClient = await Listener.AcceptTcpClientAsync();
    ProcessClientTearOff(tcpClient);
}
Then the data is read from the client through ReadAsync:
byte[] Buffer = new byte[8192];
int i = await Stream.ReadAsync(Buffer, 0, 8192);
After that, the data is processed using the method
RequestHandling(byte[] data)
which performs various actions. Clients actively interact with each other, and therefore there are thread-safety problems. I was looking for information on how to properly organize the server structure and found a possible design in which the data is received asynchronously (as I have now), but the processing and execution of actions happen on one thread.
One thread to accept clients and get data, one thread to process and execute, one thread to send data to clients.
But I cannot understand how this can be implemented. With Task you can specify the order in which methods execute, but only before the tasks are started. Is it possible to run all packet processing on a separate thread so that all actions execute sequentially, in queue order? Or is there an alternative to this?
The question is fairly vague, which is understandable given that you are seeking a general concept for organizing this.
You don't need a separate thread to process a queue. Usually, a lock is an easier solution. A lock has an internal queue as an implementation detail; that queue contains the threads that are waiting to enter.
A good pattern for your case seems to be the following. Make each connection thread/task execute this loop:
while (true)
{
    var message = await ReceiveMessageFromNetwork();
    lock (globalLock)
    {
        ApplyMessage(message); // no IO here
    }
}
The queue is implicit in the lock. I marked some code as "no IO" because you have to quickly leave the lock so that other threads/tasks can enter.
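Tying this back to the accept loop from the question, a minimal sketch (ProcessClientTearOff and RequestHandling are the question's names; ReadMessageAsync is a hypothetical helper that reads one complete framed message, returning null on disconnect):
private static readonly object globalLock = new object();

async void ProcessClientTearOff(TcpClient tcpClient) // fire-and-forget from the accept loop
{
    using (var stream = tcpClient.GetStream())
    {
        while (true)
        {
            byte[] message = await ReadMessageAsync(stream); // receive asynchronously, outside the lock
            if (message == null)
                break; // client disconnected
            lock (globalLock)
            {
                RequestHandling(message); // game logic runs strictly one message at a time; no IO here
            }
        }
    }
}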
I am somewhat new to parallel programming in C# (when I started my project I worked through the MSDN examples for the TPL) and would appreciate some input on the following example code.
It is one of several background worker tasks. This specific task pushes status messages to a log.
var uiCts = new CancellationTokenSource();
var globalMsgQueue = new ConcurrentQueue<string>();
var backgroundUiTask = new Task(
    () =>
    {
        while (!uiCts.IsCancellationRequested)
        {
            while (globalMsgQueue.Count > 0)
                ConsumeMsgQueue();
            Thread.Sleep(backgroundUiTimeOut);
        }
    },
    uiCts.Token);
// Somewhere else entirely
backgroundUiTask.Start();
Task.WaitAll(backgroundUiTask);
I'm asking for professional input after reading several topics like Alternatives to using Thread.Sleep for waiting, Is it always bad to use Thread.Sleep()?, When to use Task.Delay, when to use Thread.Sleep?, and Continuous polling using Tasks, which prompt me to use Task.Delay instead of Thread.Sleep as a first step and to introduce TaskCreationOptions.LongRunning.
But I wonder what other caveats I might be missing. Is polling the MsgQueue.Count a code smell? Would a better version rely on an event instead?
First of all, there's no reason to use the Task constructor or Task.Start. Tasks aren't threads; they don't run themselves. They are a promise that something will complete in the future and may or may not produce a result. Some of them will run on a threadpool thread. Use Task.Run to create and run a task in a single step when you need to.
I assume the actual problem is how to create a buffered background worker. .NET already offers classes that can do this.
ActionBlock<T>
The ActionBlock class already implements this and a lot more - it allows you to specify how big the input buffer is, how many tasks will process incoming messages concurrently, supports cancellation and asynchronous completion.
A logging block could be as simple as this:
_logBlock = new ActionBlock<string>(msg => File.AppendAllText("myLog.txt", msg));
The ActionBlock class itself takes care of buffering the inputs, feeding new messages to the worker function as they arrive, potentially blocking senders if the buffer gets full, etc. There's no need for polling.
Other code can use Post or SendAsync to send messages to the block:
_logBlock.Post("some message");
When we are done, we can tell the block to Complete() and await for it to process any remaining messages:
_logBlock.Complete();
await _logBlock.Completion;
Channels
A newer, lower-level option is to use Channels. You can think of channels as a kind of asynchronous queue, although they can be used to implement complex processing pipelines. If ActionBlock was written today, it would use Channels internally.
With channels, you need to provide the "worker" task yourself. There's no need for polling though, as the ChannelReader class allows you to read messages asynchronously or even use await foreach.
The writer method could look like this:
public ChannelWriter<string> LogIt(string path, CancellationToken token = default)
{
    var channel = Channel.CreateUnbounded<string>();
    var writer = channel.Writer;
    _ = Task.Run(async () =>
    {
        await foreach (var msg in channel.Reader.ReadAllAsync(token))
        {
            File.AppendAllText(path, msg);
        }
    }, token).ContinueWith(t => writer.TryComplete(t.Exception));
    return writer;
}
// ...
_logWriter = LogIt(somePath);
Other code can send messages by using WriteAsync or TryWrite, e.g.:
_logWriter.TryWrite(someMessage);
When we're done, we can call Complete() or TryComplete() on the writer:
_logWriter.TryComplete();
The line
.ContinueWith(t => writer.TryComplete(t.Exception));
is needed to ensure the channel is closed even if an exception occurs or the cancellation token is signaled.
This may seem too cumbersome at first, but channels allow us to easily run initialization code or carry state from one message to the next. We could open a stream before the loop starts and use it instead of reopening the file on each call to File.AppendAllText, e.g.:
public ChannelWriter<string> LogIt(string path, CancellationToken token = default)
{
    var channel = Channel.CreateUnbounded<string>();
    var writer = channel.Writer;
    _ = Task.Run(async () =>
    {
        //***** Can't do this with an ActionBlock ****
        using (var file = File.AppendText(path))
        {
            await foreach (var msg in channel.Reader.ReadAllAsync(token))
            {
                file.WriteLine(msg);
                //Or
                //await file.WriteLineAsync(msg);
            }
        }
    }, token).ContinueWith(t => writer.TryComplete(t.Exception));
    return writer;
}
Task.Delay is definitely better than Thread.Sleep, because it does not block a thread-pool thread; during the wait, that thread is available to handle other tasks. And you don't need to make your task long-running: long-running tasks run on a dedicated thread, where Task.Delay is meaningless.
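A minimal sketch of that first step, reusing the names from the question:
var backgroundUiTask = Task.Run(async () =>
{
    while (!uiCts.IsCancellationRequested)
    {
        while (globalMsgQueue.Count > 0)
            ConsumeMsgQueue();
        await Task.Delay(backgroundUiTimeOut); // yields the thread instead of blocking it
    }
});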
Instead, I recommend a different approach: just use System.Threading.Timer and make your life simple. The timer runs its callback on a thread-pool thread, and you will not have to worry about delay or sleep at all.
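A sketch of the timer approach, again reusing the question's names (the 100 ms period is an arbitrary assumption):
var timer = new System.Threading.Timer(_ =>
{
    while (globalMsgQueue.Count > 0)
        ConsumeMsgQueue();
}, null, 0, 100); // start immediately, fire every 100 ms
// ... on shutdown:
timer.Dispose();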
The TPL Dataflow library is the preferred tool for this kind of job. It allows building efficient producer-consumer pairs quite easily, and more complex pipelines as well, while offering a complete set of configuration options. In your case using a single ActionBlock should be enough.
A simpler solution you might consider is to use a BlockingCollection. It has the advantage of not requiring the installation of any package (because it is built-in), and it's also much easier to learn. You don't have to learn more than the methods Add, CompleteAdding, and GetConsumingEnumerable. It also supports cancellation. The drawback is that it's a blocking collection, so it blocks the consumer thread while waiting for new messages to arrive, and the producer thread while waiting for available space in the internal buffer (only if you specify a boundedCapacity in the constructor).
var uiCts = new CancellationTokenSource();
var globalMsgQueue = new BlockingCollection<string>();
var backgroundUiTask = new Task(() =>
{
    foreach (var item in globalMsgQueue.GetConsumingEnumerable(uiCts.Token))
    {
        ConsumeMsgQueueItem(item);
    }
}, uiCts.Token);
The BlockingCollection uses a ConcurrentQueue internally as a buffer.
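For completeness, a sketch of the producer side and shutdown with this version (names follow the question's code):
backgroundUiTask.Start();                  // the Task constructor does not start the task
globalMsgQueue.Add("some status message"); // producer: safe to call from any thread
// ... on shutdown:
globalMsgQueue.CompleteAdding();           // lets GetConsumingEnumerable drain and exit
backgroundUiTask.Wait();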
I want to receive constantly while a socket is alive, and to call a send method concurrently from different threads (I'm sorry for my English).
I have thought about the following:
As there will be multiple clients, I think it would be better if the operating system is responsible for providing the threads, so as not to compromise performance; therefore, I do not want to poll a Receive method in a loop on a different thread for each client. I discarded that instantly.
To avoid creating instances of the type that implements the IAsyncResult interface on each call to BeginReceive, and thus not stress the GC and get better performance under a heavy load of IO operations, I decided not to use BeginReceive/EndReceive.
Instead, I've thought about using the ReceiveAsync method and a reusable SocketAsyncEventArgs instance for each connection. For example, if the server supports 1000 connections, it will have that same number of SocketAsyncEventArgs instances, one for each client for as long as the connection is alive. When the client disconnects, the underlying SocketAsyncEventArgs instance returns to the pool for later use with another connection (SOLUTION CHOSEN).
Regarding the sending operation, I do not care that send requests are not processed in the order they were made; the important thing is that the bytes of one message do not mix with those of other messages, so that in the receive buffer on the remote host the messages arrive one after the other and the class in charge of the protocol can interpret them.
For this reason, I do not want to call BeginSend/EndSend: apart from creating different IAsyncResult instances on each call, I understand there is a possibility that message bytes get mixed when calling from different threads. Is that correct? I do not remember where I read it, a long time ago, and I would like reliable sources. Regardless of whether it is true or not, I do not intend to use it.
I also understand that SendAsync is like a wrapper over BeginSend/EndSend, with the difference that it reuses the underlying IAsyncResult instance. Which makes me think that although I can call SendAsync concurrently, just as with BeginSend, there is also the possibility that the bytes get mixed in the output buffer.
To process multiple calls with SendAsync you also have to provide a different SocketAsyncEventArgs instance on each call, which I do not like. Then I thought about using the Semaphore class to guarantee one send operation at a time and, once it completes, reuse the SocketAsyncEventArgs instance from the previous send. This way I would only need two SocketAsyncEventArgs instances per connection (one to receive and one to send), I would avoid a pool of these objects, and I would also prevent the bytes of different messages from getting mixed in the output buffer.
I am satisfied with the solution I found for receiving constantly, but I am not sure about the send operation. I do not see any advantage in using SendAsync; I only want multiple threads to be able to call the send method concurrently. So I thought about using the synchronous Send method and wrapping it in an asynchronous method like the one shown below:
public virtual async Task<int> SendAsync(byte[] buffer, int offset, int size, CancellationToken cancellationToken)
{
    ThrowIfDisposed();
    var sendingRequest = CancellationTokenSource.CreateLinkedTokenSource(requests.Token, cancellationToken);
    int bytesSent = 0;
    try
    {
        // Serialize senders so the bytes of different messages cannot interleave.
        await semaphore.WaitAsync(sendingRequest.Token).ConfigureAwait(false);
        try
        {
            while (bytesSent < size)
            {
                // Send may write fewer bytes than requested, so advance past what was already sent.
                int bytesWritten = clientSocket.Send(buffer, offset + bytesSent, size - bytesSent, SocketFlags.None);
                if (bytesWritten == 0)
                {
                    throw new SocketException((int)SocketError.NotConnected);
                }
                bytesSent += bytesWritten;
            }
        }
        finally
        {
            semaphore.Release();
        }
    }
    catch (SocketException)
    {
        Disconnect();
        throw;
    }
    finally
    {
        sendingRequest.Dispose();
    }
    return bytesSent;
}
I would appreciate it if you could tell me where I am wrong in what I said above, whether my approach is correct, and how to improve what I already have.
Thanks.
I have been struggling a bit with some async await stuff. I am using RabbitMQ for sending/receiving messages between some programs.
As a bit of background, the RabbitMQ client uses three or so threads that I can see: a connection thread and two heartbeat threads. Whenever a message is received via TCP, the connection thread handles it and calls a callback which I have supplied via an interface. The documentation says it is best to avoid doing lots of work during this call, since it's done on the same thread as the connection and things need to continue on. They supply a QueueingBasicConsumer which has a blocking 'Dequeue' method that is used to wait for a message to be received.
I wanted my consumers to be able to actually release their thread context during this waiting time so somebody else could do some work, so I decided to use async/await tasks. I wrote an AwaitableBasicConsumer class which uses TaskCompletionSources in the following fashion:
I have an awaitable Dequeue method:
public Task<RabbitMQ.Client.Events.BasicDeliverEventArgs> DequeueAsync(CancellationToken cancellationToken)
{
    //we are enqueueing a TCS. This is a "read"
    rwLock.EnterReadLock();
    try
    {
        var tcs = new TaskCompletionSource<RabbitMQ.Client.Events.BasicDeliverEventArgs>();
        //if we are cancelled before we finish, this will cause the tcs to become cancelled
        cancellationToken.Register(() =>
        {
            tcs.TrySetCanceled();
        });
        //if there is something in the undelivered queue, the task will be immediately completed
        //otherwise, we queue the task into deliveryTCS
        if (!TryDeliverUndelivered(tcs))
            deliveryTCS.Enqueue(tcs);
        return tcs.Task;
    }
    finally
    {
        rwLock.ExitReadLock();
    }
}
The callback which the RabbitMQ client calls fulfills the tasks. It is called from the context of the AMQP connection thread:
public void HandleBasicDeliver(string consumerTag, ulong deliveryTag, bool redelivered, string exchange, string routingKey, RabbitMQ.Client.IBasicProperties properties, byte[] body)
{
//we want nothing added while we remove. We also block until everybody is done.
rwLock.EnterWriteLock();
try
{
RabbitMQ.Client.Events.BasicDeliverEventArgs e = new RabbitMQ.Client.Events.BasicDeliverEventArgs(consumerTag, deliveryTag, redelivered, exchange, routingKey, properties, body);
bool sent = false;
TaskCompletionSource<RabbitMQ.Client.Events.BasicDeliverEventArgs> tcs;
while (deliveryTCS.TryDequeue(out tcs))
{
//once we manage to actually set somebody's result, we are done with handling this
if (tcs.TrySetResult(e))
{
sent = true;
break;
}
}
//if nothing was sent, we queue up what we got so that somebody can get it later.
/**
* Without the rwlock, this logic would cause concurrency problems in the case where after the while block completes without sending, somebody enqueues themselves. They would get the
* next message and the person who enqueues after them would get the message received now. Locking prevents that from happening since nobody can add to the queue while we are
* doing our thing here.
*/
if (!sent)
{
undelivered.Enqueue(e);
}
}
finally
{
rwLock.ExitWriteLock();
}
}
rwLock is a ReaderWriterLockSlim. The two queues (deliveryTCS and undelivered) are ConcurrentQueues.
The problem:
Every once in a while, the method that awaits the dequeue method throws an exception. This would not normally be an issue since that method is also async and so it enters the "Exception" completion state that tasks enter. The problem comes in the situation where the task that calls DequeueAsync is resumed after the await on the AMQP Connection thread that the RabbitMQ client creates. Normally I have seen tasks resume onto the main thread or one of the worker threads floating around. However, when it resumes onto the AMQP thread and an exception is thrown, everything stalls. The task does not enter its "Exception state" and the AMQP Connection thread is left saying that it is executing the method that had the exception occur.
My main confusion here is why this doesn't work:
var task = c.RunAsync(); //<-- This method awaits the DequeueAsync and throws an exception afterwards
ConsumerTaskState state = new ConsumerTaskState()
{
Connection = connection,
CancellationToken = cancellationToken
};
//if there is a problem, we execute our faulted method
//PROBLEM: If task fails when its resumed onto the AMQP thread, this method is never called
task.ContinueWith(this.OnFaulted, state, TaskContinuationOptions.OnlyOnFaulted);
Here is the RunAsync method, set up for the test:
public async Task RunAsync()
{
using (var channel = this.Connection.CreateModel())
{
...
AwaitableBasicConsumer consumer = new AwaitableBasicConsumer(channel);
var result = consumer.DequeueAsync(this.CancellationToken);
//wait until we find something to eat
await result;
throw new NotImplementedException(); //<-- the test exception. Normally this causes OnFaulted to be called, but sometimes it stalls
...
} //<-- This is where the debugger says the thread is sitting at when I find it in the stalled state
}
Reading what I have written, I see that I may not have explained my problem very well. If clarification is needed, just ask.
My solutions that I have come up with are as follows:
Remove all async/await code and just use straight-up threads and blocking. Performance will decrease, but at least it won't stall sometimes.
Somehow exempt the AMQP threads from being used for resuming tasks. I assume that they were sleeping or something and then the default TaskScheduler decided to use them. If I could find a way to tell the task scheduler that those threads are off limits, that would be great.
Does anyone have an explanation for why this is happening or any suggestions to solving this? Right now I am removing the async code just so that the program is reliable, but I really want to understand what is going on here.
I first recommend that you read my async intro, which explains in precise terms how await will capture a context and use that to resume execution. In short, it will capture the current SynchronizationContext (or the current TaskScheduler if SynchronizationContext.Current is null).
The other important detail is that async continuations are scheduled with TaskContinuationOptions.ExecuteSynchronously (as #svick pointed out in a comment). I have a blog post about this but AFAIK it is not officially documented anywhere. This detail does make writing an async producer/consumer queue difficult.
The reason await isn't "switching back to the original context" is (probably) because the RabbitMQ threads don't have a SynchronizationContext or TaskScheduler - thus, the continuation is executed directly when you call TrySetResult because those threads look just like regular thread pool threads.
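As a side note: on .NET Framework 4.6 and later, you can create the TaskCompletionSource with TaskCreationOptions.RunContinuationsAsynchronously, which forces awaiters to resume on the thread pool instead of inline on the thread that calls TrySetResult:
var tcs = new TaskCompletionSource<RabbitMQ.Client.Events.BasicDeliverEventArgs>(
    TaskCreationOptions.RunContinuationsAsynchronously); // continuations won't run inline on the AMQP thread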
BTW, reading through your code, I suspect your use of a reader/writer lock and concurrent queues is incorrect. I can't be sure without seeing the whole code, but that's my impression.
I strongly recommend you use an existing async queue and build a consumer around that (in other words, let someone else do the hard part :). The BufferBlock<T> type in TPL Dataflow can act as an async queue; that would be my first recommendation if you have Dataflow available on your platform. Otherwise, I have an AsyncProducerConsumerQueue type in my AsyncEx library, or you could write your own (as I describe on my blog).
Here's an example using BufferBlock<T>:
private readonly BufferBlock<RabbitMQ.Client.Events.BasicDeliverEventArgs> _queue = new BufferBlock<RabbitMQ.Client.Events.BasicDeliverEventArgs>();
public void HandleBasicDeliver(string consumerTag, ulong deliveryTag, bool redelivered, string exchange, string routingKey, RabbitMQ.Client.IBasicProperties properties, byte[] body)
{
RabbitMQ.Client.Events.BasicDeliverEventArgs e = new RabbitMQ.Client.Events.BasicDeliverEventArgs(consumerTag, deliveryTag, redelivered, exchange, routingKey, properties, body);
_queue.Post(e);
}
public Task<RabbitMQ.Client.Events.BasicDeliverEventArgs> DequeueAsync(CancellationToken cancellationToken)
{
return _queue.ReceiveAsync(cancellationToken);
}
In this example, I'm keeping your DequeueAsync API. However, once you start using TPL Dataflow, consider using it elsewhere as well. When you need a queue like this, it's common to find other parts of your code that would also benefit from a dataflow approach. E.g., instead of having a bunch of methods calling DequeueAsync, you could link your BufferBlock to an ActionBlock.
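For example, a sketch of that linking (ProcessMessage is a hypothetical handler):
var processor = new ActionBlock<RabbitMQ.Client.Events.BasicDeliverEventArgs>(
    e => ProcessMessage(e)); // hypothetical handler for each delivery
_queue.LinkTo(processor, new DataflowLinkOptions { PropagateCompletion = true });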
As I understand it, TcpListener will queue connections once you call Start(). Each time you call AcceptTcpClient (or BeginAcceptTcpClient), it will dequeue one item from the queue.
If we load test our TcpListener app by sending 1,000 connections to it at once, the queue builds far faster than we can clear it, leading (eventually) to client timeouts because connections sit in the queue without getting a response. However, the server doesn't appear to be under much pressure: our app isn't consuming much CPU time, and the other monitored resources on the machine aren't breaking a sweat. It feels like we're not running efficiently enough right now.
We're calling BeginAcceptTcpClient and then immediately handing over to a ThreadPool thread to actually do the work, then calling BeginAcceptTcpClient again. The work involved doesn't seem to put any pressure on the machine; it's basically just a 3-second sleep followed by a dictionary lookup and then a 100-byte write to the TcpClient's stream.
Here's the TcpListener code we're using:
// Thread signal.
private static ManualResetEvent tcpClientConnected = new ManualResetEvent(false);
public void DoBeginAcceptTcpClient(TcpListener listener)
{
// Set the event to nonsignaled state.
tcpClientConnected.Reset();
listener.BeginAcceptTcpClient(
new AsyncCallback(DoAcceptTcpClientCallback),
listener);
// Wait for signal
tcpClientConnected.WaitOne();
}
public void DoAcceptTcpClientCallback(IAsyncResult ar)
{
// Get the listener that handles the client request, and the TcpClient
TcpListener listener = (TcpListener)ar.AsyncState;
TcpClient client = listener.EndAcceptTcpClient(ar);
if (inProduction)
ThreadPool.QueueUserWorkItem(state => HandleTcpRequest(client, serverCertificate)); // With SSL
else
ThreadPool.QueueUserWorkItem(state => HandleTcpRequest(client)); // Without SSL
// Signal the calling thread to continue.
tcpClientConnected.Set();
}
public void Start()
{
currentHandledRequests = 0;
tcpListener = new TcpListener(IPAddress.Any, 10000);
try
{
tcpListener.Start();
while (true)
DoBeginAcceptTcpClient(tcpListener);
}
catch (SocketException)
{
// The TcpListener is shutting down, exit gracefully
CheckBuffer();
return;
}
}
I'm assuming the answer will be related to using Sockets instead of TcpListener, or at least using TcpListener.AcceptSocket, but I wondered how we'd go about doing that?
One idea we had was to call AcceptTcpClient and immediately Enqueue the TcpClient into one of multiple Queue<TcpClient> objects. That way, we could poll those queues on separate threads (one queue per thread), without running into monitors that might block the thread while waiting for other Dequeue operations. Each queue thread could then use ThreadPool.QueueUserWorkItem to have the work done in a ThreadPool thread and then move onto dequeuing the next TcpClient in its queue. Would you recommend this approach, or is our problem that we're using TcpListener and no amount of rapid dequeueing is going to fix that?
I've whipped up some code that uses sockets directly, but I lack the means of performing a load test with 1000 clients. Could you please try to test how this code compares to your current solution? I'd be very interested in the results as I'm building a server that needs to accept a lot of connections as well right now.
static WaitCallback handleTcpRequest = new WaitCallback(HandleTcpRequest);
static void Main()
{
var e = new SocketAsyncEventArgs();
e.Completed += new EventHandler<SocketAsyncEventArgs>(e_Completed);
var socket = new Socket(
AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
socket.Bind(new IPEndPoint(IPAddress.Loopback, 8181));
socket.Listen((int)SocketOptionName.MaxConnections);
socket.AcceptAsync(e);
Console.WriteLine("--ready--");
Console.ReadLine();
socket.Close();
}
static void e_Completed(object sender, SocketAsyncEventArgs e)
{
var socket = (Socket)sender;
ThreadPool.QueueUserWorkItem(handleTcpRequest, e.AcceptSocket);
e.AcceptSocket = null;
socket.AcceptAsync(e);
}
static void HandleTcpRequest(object state)
{
var socket = (Socket)state;
Thread.Sleep(100); // do work
socket.Close();
}
Unless I'm missing something, you're calling BeginAcceptTcpClient, which is asynchronous, but then you're calling WaitOne() to wait until the asynchronous code finishes, which effectively makes the process synchronous. Your code can only accept one client at a time. Or am I totally crazy? At the very least, this seems like a lot of context switching for nothing.
It was alluded to in the other answers, but I would suggest that in your tcpListener.Start() method you use the overload that allows you to set the backlog to a number higher than the maximum number of connections you're expecting at one time:
public void Start()
{
currentHandledRequests = 0;
tcpListener = new TcpListener(IPAddress.Any, 10000);
try
{
tcpListener.Start(1100); // This is the backlog parameter
while (true)
DoBeginAcceptTcpClient(tcpListener);
}
catch (SocketException)
{
// The TcpListener is shutting down, exit gracefully
CheckBuffer();
return;
}
}
Basically, this option sets how many "pending" TCP connections are allowed that are waiting for an Accept to be called. If you are not accepting connections fast enough, and this backlog fills up, the TCP connections will be automatically rejected, and you won't even get a chance to process them.
As others have mentioned, the other possibility is speeding up how fast you process the incoming connections. You still, however, should set the backlog to a higher value, even if you can speed up the accept time.
The first thing to ask yourself is: "Is 1000 connections all at once reasonable?" Personally, I think it's unlikely that you will get into that situation. More likely you have 1000 connections occurring over a short period of time.
I have a TCP test program that I use to test my server framework, it can do things like X connections in total in batches of Y with a gap of Z ms between each batch; which I personally find is more real world than 'vast number all at once'. It's free, it might help, you can get it from here: http://www.lenholgate.com/blog/2005/11/windows-tcpip-server-performance.html
As others have said, increase the listen backlog, process the connections faster, use asynchronous accepts if possible...
Just a suggestion: why not accept the clients synchronously (by using AcceptTcpClient instead of BeginAcceptTcpClient) and then process each client on a new thread? That way, you won't have to wait for a client to be processed before you can accept the next one.
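A minimal sketch of that suggestion, reusing HandleTcpRequest from the question:
while (true)
{
    // Blocks until a client connects; no callback or event needed.
    TcpClient client = tcpListener.AcceptTcpClient();
    // Hand the client off so the accept loop continues immediately.
    ThreadPool.QueueUserWorkItem(_ => HandleTcpRequest(client));
}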
I want to create an asynchronous socket Server using the SocketAsyncEventArgs event.
The server should manage about 1000 connections at the same time. What is the best way to handle the logic for each packet?
The Server design is based on this MSDN example, so every socket will have his own SocketAsyncEventArgs for receiving data.
Do the logic stuff inside the receive function.
No overhead will be created, but since the next ReceiveAsync() call won't be made before the logic has completed, new data can't be read from the socket in the meantime. The two main questions for me are: if the client sends a lot of data and the logic processing is heavy, how will the system handle it (are packets lost because the buffer is too full)? Also, if all clients send data at the same time, will there be 1000 threads, or is there an internal limit so that a new thread can't start before another one completes?
Use a queue.
The receive function will be very short and execute fast, but you'll have decent overhead because of the queue. Problems are: if your worker threads are not fast enough under heavy server load, your queue can fill up, so maybe you have to force packet drops. You also get the producer/consumer problem, which can slow down the entire queue with too many locks.
So what would be the better design: logic in the receive function, logic in worker threads, or anything completely different I've missed so far?
Another question regarding data sending:
Is it better to have one SocketAsyncEventArgs tied to each socket (analogous to the receive event) and use a buffering system to make one send call for several small packets (say the packets would sometimes otherwise be sent directly one after another), or to use a different SocketAsyncEventArgs for every packet and store them in a pool for reuse?
To effectively implement async sockets, each socket will need more than one SocketAsyncEventArgs. There is also an issue with the byte[] buffer in each SocketAsyncEventArgs. In short, the byte buffers will be pinned whenever a managed-to-native transition occurs (sending/receiving). If you allocate the SocketAsyncEventArgs and byte buffers as needed, you can run into OutOfMemoryExceptions with many clients due to fragmentation and the inability of the GC to compact pinned memory.
The best way to handle this is to create a SocketBufferPool class that allocates a large number of bytes and SocketAsyncEventArgs instances when the application first starts; this way, the pinned memory will be contiguous. Then simply reuse the buffers from the pool as needed.
In practice I've found it best to create a wrapper class around the SocketAsyncEventArgs and a SocketBufferPool class to manage the distribution of resources.
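The SocketBufferPool and SocketEventArgs wrapper themselves aren't shown in this answer; as a rough, hypothetical sketch of the pool idea (one contiguous allocation carved into fixed-size buffers, recycled through a concurrent bag; the real wrapper would also carry the Socket, SequenceNumber, and Completed members used below):
using System;
using System.Collections.Concurrent;
using System.Net.Sockets;

public sealed class SocketBufferPool
{
    public static readonly SocketBufferPool Instance = new SocketBufferPool(1000, 8192);

    private readonly ConcurrentBag<SocketAsyncEventArgs> pool = new ConcurrentBag<SocketAsyncEventArgs>();

    private SocketBufferPool(int count, int bufferSize)
    {
        // One big allocation: the pinned buffers stay contiguous instead of fragmenting the heap.
        byte[] slab = new byte[count * bufferSize];
        for (int i = 0; i < count; i++)
        {
            var e = new SocketAsyncEventArgs();
            e.SetBuffer(slab, i * bufferSize, bufferSize);
            pool.Add(e);
        }
    }

    public SocketAsyncEventArgs Alloc()
    {
        return pool.TryTake(out var e) ? e : throw new InvalidOperationException("Buffer pool exhausted.");
    }

    public void Free(SocketAsyncEventArgs e)
    {
        pool.Add(e);
    }
}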
As an example, here is the code for a BeginReceive method:
private void BeginReceive(Socket socket)
{
Contract.Requires(socket != null, "socket");
SocketEventArgs e = SocketBufferPool.Instance.Alloc();
e.Socket = socket;
e.Completed += new EventHandler<SocketEventArgs>(this.HandleIOCompleted);
if (!socket.ReceiveAsync(e.AsyncEventArgs)) {
this.HandleIOCompleted(null, e);
}
}
And here is the HandleIOCompleted method:
private void HandleIOCompleted(object sender, SocketEventArgs e)
{
e.Completed -= this.HandleIOCompleted;
bool closed = false;
lock (this.sequenceLock) {
e.SequenceNumber = this.sequenceNumber++;
}
switch (e.LastOperation) {
case SocketAsyncOperation.Send:
case SocketAsyncOperation.SendPackets:
case SocketAsyncOperation.SendTo:
if (e.SocketError == SocketError.Success) {
this.OnDataSent(e);
}
break;
case SocketAsyncOperation.Receive:
case SocketAsyncOperation.ReceiveFrom:
case SocketAsyncOperation.ReceiveMessageFrom:
if ((e.BytesTransferred > 0) && (e.SocketError == SocketError.Success)) {
this.BeginReceive(e.Socket);
if (this.ReceiveTimeout > 0) {
this.SetReceiveTimeout(e.Socket);
}
} else {
closed = true;
}
if (e.SocketError == SocketError.Success) {
this.OnDataReceived(e);
}
break;
case SocketAsyncOperation.Disconnect:
closed = true;
break;
case SocketAsyncOperation.Accept:
case SocketAsyncOperation.Connect:
case SocketAsyncOperation.None:
break;
}
if (closed) {
this.HandleSocketClosed(e.Socket);
}
SocketBufferPool.Instance.Free(e);
}
The above code is contained in a TcpSocket class that raises DataReceived & DataSent events. One thing to notice is the receive case block (case SocketAsyncOperation.Receive and friends): if the socket hasn't had an error, it immediately starts another BeginReceive(), which will allocate another SocketEventArgs from the pool.
Another important note is the SocketEventArgs.SequenceNumber property set in the HandleIOCompleted method. Although async requests complete in the order queued, you are still subject to other thread race conditions. Since the code calls BeginReceive before raising the DataReceived event, there is a possibility that the thread servicing the original IOCP blocks after calling BeginReceive but before raising the event, while a second async receive completes on a new thread and raises its DataReceived event first. Although this is a fairly rare edge case, it can occur, and the SequenceNumber property gives the consuming app the ability to ensure that data is processed in the correct order.
One other area to be aware of is async sends. Oftentimes, async send requests complete synchronously (SendAsync returns false if the call completed synchronously), and this can severely degrade performance. The additional overhead of the async call coming back on an IOCP can, in practice, cause worse performance than simply using the synchronous call: the async call requires two kernel calls and a heap allocation, while the synchronous call happens on the stack.
Hope this helps,
Bill
In your code, you do this:
if (!socket.ReceiveAsync(e.AsyncEventArgs)) {
this.HandleIOCompleted(null, e);
}
But it is an error to do that. There is a reason why the callback is not invoked when the operation finishes synchronously: such a call can fill up the stack.
Imagine that each ReceiveAsync always returns synchronously. If your HandleIOCompleted ran in a while loop, you could process each synchronously returned result at the same stack level, and break out of the while when an operation doesn't complete synchronously.
But by doing it the way you do, you end up creating a new frame on the stack for every synchronous completion... so if you are unlucky enough, you will cause a StackOverflowException.
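A sketch of the loop shape (ProcessReceive stands in for the work HandleIOCompleted does on a completed receive):
// Handle synchronous completions at this stack level and only
// return when the operation actually goes asynchronous.
while (!socket.ReceiveAsync(e.AsyncEventArgs))
{
    ProcessReceive(e); // the operation completed synchronously; process it inline
}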