I want to receive continuously while a socket is alive, and to call a send method concurrently from different threads (I'm sorry for my English).
I have thought about the following:
Since there will be multiple clients, I think it is better if the operating system is responsible for scheduling the threads, so as not to compromise performance; therefore, I do not want a loop around the blocking Receive method on a dedicated thread per client. I discarded that option instantly.
To avoid creating an instance of the type implementing IAsyncResult on every call to BeginReceive, and thus not stress the GC and get better performance under a heavy load of I/O operations, I decided not to use BeginReceive/EndReceive.
Instead, I've thought about using the ReceiveAsync method with one reusable SocketAsyncEventArgs instance per connection. For example, if the server supports 1000 connections, it will hold that same number of SocketAsyncEventArgs instances, one per client for as long as the connection is alive. When the client disconnects, the underlying SocketAsyncEventArgs instance returns to the pool for later use by another connection. (SOLUTION CHOSEN)
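A minimal sketch of the pool I have in mind looks like this (class and member names are just illustrative):

using System;
using System.Collections.Concurrent;
using System.Net.Sockets;

public sealed class SocketAsyncEventArgsPool
{
    private readonly ConcurrentStack<SocketAsyncEventArgs> _pool =
        new ConcurrentStack<SocketAsyncEventArgs>();

    public SocketAsyncEventArgsPool(int capacity, int bufferSize)
    {
        for (int i = 0; i < capacity; i++)
        {
            var args = new SocketAsyncEventArgs();
            // Each instance keeps its own receive buffer for its lifetime.
            args.SetBuffer(new byte[bufferSize], 0, bufferSize);
            _pool.Push(args);
        }
    }

    public SocketAsyncEventArgs Rent() =>
        _pool.TryPop(out var args)
            ? args
            : throw new InvalidOperationException("Pool exhausted.");

    public void Return(SocketAsyncEventArgs args) => _pool.Push(args);
}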
Regarding the send operation, I do not care if send requests are not processed in call order; the important thing is that the bytes of one message do not get mixed with those of other messages, so that in the receive buffer on the remote host the messages arrive one after the other and the class in charge of the protocol can interpret them.
For this reason I do not want to call BeginSend/EndSend: apart from creating a different IAsyncResult instance on each call, I understand there is a possibility that message bytes get interleaved when calling from different threads. Is that correct? I read it somewhere a long time ago but do not remember where; I would like reliable sources. Regardless of whether it is true or not, I do not intend to use it.
I also understand that SendAsync is something like a mask over BeginSend/EndSend, with the difference that it reuses the underlying state object. Which makes me think that although I can call SendAsync concurrently, just as with BeginSend, there is also the possibility that the bytes get interleaved in the output buffer.
To process multiple concurrent calls with SendAsync you would also have to provide a different SocketAsyncEventArgs instance on each call, which I do not like. Then I thought about using the Semaphore class to guarantee one send operation at a time and, once it completes, reuse the SocketAsyncEventArgs instance from the previous send operation. This way I would use only two SocketAsyncEventArgs instances per connection (one to receive and one to send), I would avoid a pool of these objects for sends, and it would also prevent the bytes of different messages from being mixed in the output buffer.
Regarding the solution I found for receiving constantly, I am satisfied. But for the send operation I am not sure; I do not find any advantage in using SendAsync. I only want multiple threads to be able to call the send method concurrently. So I thought about using the synchronous Send method and wrapping it in an asynchronous method like the one shown below:
public virtual async Task<int> SendAsync(byte[] buffer, int offset, int size, CancellationToken cancellationToken)
{
    ThrowIfDisposed();
    using (var sendingRequest = CancellationTokenSource.CreateLinkedTokenSource(requests.Token, cancellationToken))
    {
        // Acquire the semaphore outside the try block so a cancelled wait
        // does not release a semaphore that was never acquired.
        await semaphore.WaitAsync(sendingRequest.Token).ConfigureAwait(false);
        int bytesSent = 0;
        try
        {
            while (bytesSent < size)
            {
                // Advance past what has already been sent; Send may write
                // fewer bytes than requested.
                int bytesWritten = clientSocket.Send(buffer, offset + bytesSent, size - bytesSent, SocketFlags.None);
                if (bytesWritten == 0)
                {
                    throw new SocketException((int)SocketError.NotConnected);
                }
                bytesSent += bytesWritten;
            }
        }
        catch (SocketException)
        {
            Disconnect();
            throw;
        }
        finally
        {
            semaphore.Release();
        }
        return bytesSent;
    }
}
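For comparison, here is a sketch using the Task-returning Socket.SendAsync(ArraySegment<byte>, SocketFlags) overload (available on .NET Core / .NET Standard 2.0), with the same semaphore discipline and the same members (semaphore, requests, clientSocket, Disconnect) as above; the name SendDataAsync is just to avoid clashing with the method shown:

public virtual async Task<int> SendDataAsync(byte[] buffer, int offset, int size, CancellationToken cancellationToken)
{
    ThrowIfDisposed();
    using (var sendingRequest = CancellationTokenSource.CreateLinkedTokenSource(requests.Token, cancellationToken))
    {
        await semaphore.WaitAsync(sendingRequest.Token).ConfigureAwait(false);
        try
        {
            int bytesSent = 0;
            while (bytesSent < size)
            {
                var segment = new ArraySegment<byte>(buffer, offset + bytesSent, size - bytesSent);
                // Awaitable overload; no manual SocketAsyncEventArgs management.
                int n = await clientSocket.SendAsync(segment, SocketFlags.None).ConfigureAwait(false);
                if (n == 0)
                    throw new SocketException((int)SocketError.NotConnected);
                bytesSent += n;
            }
            return bytesSent;
        }
        catch (SocketException)
        {
            Disconnect();
            throw;
        }
        finally
        {
            semaphore.Release();
        }
    }
}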
I would appreciate it if you could tell me where anything I said above is wrong, whether my approach is correct, or how to improve what I already have.
Thanks.
Related
I have a server which communicates with 50 or more devices over a TCP LAN. There is a Task.Run for each socket's message-reading loop.
I buffer each incoming message into a blocking queue, and each blocking queue has its own Task.Run consuming it via BlockingCollection.Take().
So something like (semi-pseudocode):
Socket Reading Task
Task.Run(() =>
{
    while (notCancelled)
    {
        element = ReadXml();
        switch (element)
        {
            case messageheader:
                MessageBlockingQueue.Add(deserialize<messageType>());
            ...
        }
    }
});
Message Buffer Task
Task.Run(() =>
{
    while (notCancelled)
    {
        Process(MessageQueue.Take());
    }
});
So that would make 50+ reading tasks and 50+ tasks blocking on their own buffers.
I did it this way to avoid blocking the reading loop and to distribute processing time across messages more fairly, or so I believe.
Is this an inefficient way to handle it? What would be a better way?
You may be interested in the "channels" work, in particular: System.Threading.Channels. The aim of this is to provide asynchronous producer/consumer queues, covering both single and multiple producer and consumer scenarios, upper limits, etc. By using an asynchronous API, you aren't tying up lots of threads just waiting for something to do.
Your read loop would become:
while (notCancelled)
{
    var next = await queue.Reader.ReadAsync(optionalCancellationToken);
    Process(next);
}
and the producer:
switch (element)
{
    case messageheader:
        queue.Writer.TryWrite(deserialize<messageType>());
    ...
}
so: minimal changes
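Creating the queue could look like this (a sketch; Message stands in for your deserialized type, and Channel.CreateBounded plus WriteAsync is the option if you want the upper limits mentioned above):

using System.Threading.Channels;

// SingleReader/SingleWriter are optimization hints that match the
// one-reader-loop / one-consumer-loop shape per socket.
var queue = Channel.CreateUnbounded<Message>(new UnboundedChannelOptions
{
    SingleReader = true,
    SingleWriter = true
});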
Alternatively - or in combination - you could look into things like "pipelines" (https://www.nuget.org/packages/System.IO.Pipelines/) - since you're dealing with TCP data, this would be an ideal fit, and is something I've looked at for the custom web-socket server here on Stack Overflow (which deals with huge numbers of connections). Since the API is async throughout, it does a good job of balancing work - and the pipelines API is engineered with typical TCP scenarios in mind, for example partially consuming incoming data streams as you detect frame boundaries. I've written about this usage a lot, with code examples mostly here. Note that "pipelines" doesn't include a direct TCP layer, but the "kestrel" server includes one, or the third-party library https://www.nuget.org/packages/Pipelines.Sockets.Unofficial/ does (disclosure: I wrote it).
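To give a flavor of the consume side, a rough sketch of a PipeReader loop (assumes System.IO.Pipelines and System.Buffers; TryParseFrame and Process are placeholders for your protocol handling):

async Task ReadLoopAsync(PipeReader reader)
{
    while (true)
    {
        ReadResult result = await reader.ReadAsync();
        ReadOnlySequence<byte> buffer = result.Buffer;

        // Consume as many complete frames as the buffer currently holds.
        while (TryParseFrame(ref buffer, out var message)) // placeholder parser
        {
            Process(message); // placeholder handler
        }

        // Report what was consumed vs. merely examined, so a partial frame
        // stays buffered until more bytes arrive.
        reader.AdvanceTo(buffer.Start, buffer.End);

        if (result.IsCompleted) break;
    }
    reader.Complete();
}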
I actually do something similar in another project. What I learned or would do differently are the following:
First of all, it is better to use dedicated threads for the reading/writing loops (created with new Thread(ParameterizedThreadStart)), because Task.Run uses a pool thread, and since you use it in a (nearly) endless loop, that thread is practically never returned to the pool.
var thread = new Thread(ReaderLoop) { Name = nameof(ReaderLoop) }; // priority, etc if needed
thread.Start(cancellationToken);
Your Process can be an event, which you can invoke asynchronously so your reader loop can return immediately and process new incoming packets as fast as possible:
private void ReaderLoop(object state)
{
    var token = (CancellationToken)state;
    while (!token.IsCancellationRequested)
    {
        try
        {
            var message = MessageQueue.Take(token);
            OnMessageReceived(new MessageReceivedEventArgs(message));
        }
        catch (OperationCanceledException)
        {
            if (!disposed && IsRunning)
                Stop();
            break;
        }
    }
}
Please note that if a delegate has multiple targets, its asynchronous invocation is not trivial. I created this extension method for invoking a delegate on pool threads:
public static void InvokeAsync<TEventArgs>(this EventHandler<TEventArgs> eventHandler, object sender, TEventArgs args)
{
    void Callback(IAsyncResult ar)
    {
        var method = (EventHandler<TEventArgs>)ar.AsyncState;
        try
        {
            method.EndInvoke(ar);
        }
        catch (Exception e)
        {
            HandleError(e, method);
        }
    }

    foreach (EventHandler<TEventArgs> handler in eventHandler.GetInvocationList())
        handler.BeginInvoke(sender, args, Callback, handler);
}
So the OnMessageReceived implementation can be:
protected virtual void OnMessageReceived(MessageReceivedEventArgs e)
=> messageReceivedHandler.InvokeAsync(this, e);
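Note that delegate BeginInvoke is not supported on .NET Core; there, a similar pool-thread fan-out can be sketched with Task.Run instead (same HandleError helper as above):

public static void InvokeAsync<TEventArgs>(this EventHandler<TEventArgs> eventHandler, object sender, TEventArgs args)
{
    foreach (EventHandler<TEventArgs> handler in eventHandler.GetInvocationList())
    {
        var target = handler; // capture the individual subscriber
        Task.Run(() =>
        {
            try
            {
                target(sender, args);
            }
            catch (Exception e)
            {
                HandleError(e, target);
            }
        });
    }
}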
Finally, it was a big lesson for me that BlockingCollection<T> has some performance issues. It uses SpinWait internally, whose SpinOnce method waits longer and longer if no data arrives for a long time. This is a tricky issue because even if you log every single step of the processing, you will not notice that everything starts delayed, unless you can also mock the server side. Here you can find a fast BlockingCollection implementation using an AutoResetEvent for triggering incoming data. I added a Take(CancellationToken) overload to it as follows:
/// <summary>
/// Takes an item from the <see cref="FastBlockingCollection{T}"/>
/// </summary>
public T Take(CancellationToken token)
{
    T item;
    while (!queue.TryDequeue(out item))
    {
        waitHandle.WaitOne(cancellationCheckTimeout); // can be 10-100 ms
        token.ThrowIfCancellationRequested();
    }
    return item;
}
Basically that's it. Maybe not everything is applicable in your case; e.g., if a near-immediate response is not crucial, the regular BlockingCollection will also do.
Yes, this is a bit inefficient, because you block ThreadPool threads.
I already discussed this problem in Using Task.Yield to overcome ThreadPool starvation while implementing producer/consumer pattern.
You can also look at examples testing a producer/consumer pattern here:
https://github.com/BBGONE/TestThreadAffinity
You can use await Task.Yield in the loop to give other tasks access to this thread.
You can also solve it by using dedicated threads or, better, a custom TaskScheduler that uses its own thread pool. But it is inefficient to create 50+ plain threads; it is better to adjust the tasks so they are more cooperative.
If you use a BlockingCollection (which can block the thread for a long time while waiting to write, if bounded, or while waiting to read when there are no items), then it is better to use System.Threading.Tasks.Channels: https://github.com/stephentoub/corefxlab/blob/master/src/System.Threading.Tasks.Channels/README.md
They don't block the thread while waiting for the collection to become available for writing or reading. There's an example of how it is used: https://github.com/BBGONE/TestThreadAffinity/tree/master/ThreadingChannelsCoreFX/ChannelsTest
I'm developing a game server. I accept clients through TcpListener.
var Listener = new TcpListener(IPAddress.Any, CommonConfig.Settings.GamePort);
Listener.Start();
ListenerStarted = true;
while (ListenerStarted)
{
    TcpClient tcpClient = await Listener.AcceptTcpClientAsync();
    ProcessClientTearOff(tcpClient);
}
Then the data is received from the client through ReadAsync:
byte[] Buffer = new byte[8192];
int i = await Stream.ReadAsync(Buffer, 0, 8192);
After that, the data is processed using the method
RequestHandling(byte[] data)
and various actions are performed. Clients actively interact with each other, and therefore there are problems with thread safety. I was looking for information on how to organize the server structure properly and found a possible design in which the data is received asynchronously (as I have now), but the processing and execution of actions occur on one thread.
One thread to accept clients and get data, one thread to process and execute, one thread to send data to clients.
But I cannot understand how this can be realized. With Task you can specify the order in which methods are executed, but only before the tasks are started. Is it possible to run all packet processing on a separate thread, so that all actions are executed sequentially in queue order? Or is there an alternative to this?
The question is fairly vague which is understandable given that you are seeking a general concept to organize this.
You don't need a separate thread to process a queue. Usually, a lock is an easier solution. A lock has an internal queue as an implementation detail; the queue contains the threads that are waiting to enter.
A good pattern for your case seems to be the following. Make each connection thread/task execute this loop:
while (true) {
    var message = await ReceiveMessageFromNetwork();
    lock (globalLock) {
        ApplyMessage(message); //no IO here
    }
}
The queue is implicit in the lock. I marked some code as "no IO" because you have to quickly leave the lock so that other threads/tasks can enter.
I have an observable which wraps a data source, which it continually watches and spits out changes as they occur:
IObservable<String> ReadDatasource()
{
    //returns data from the data source, and watches it for changes
}
and within my code I have a TCP connection
interface Connection
{
    Task Send(String data);
    Boolean IsAvailable { get; }
}
which subscribes to the observable:
Connection _connection;

ReadDatabase()
    .SubscribeOn(NewThreadScheduler.Default)
    .Subscribe(
        onNext: async r =>
        {
            if (_connection.IsAvailable)
            {
                try
                {
                    await _connection.Send(r);
                }
                catch (Exception ex)
                {
                    Console.WriteLine(ex.Message);
                }
            }
        });
If the connection is closed by the client while the observable is spitting out large volumes of data in quick succession, there are a tonne of built up tasks still awaiting (at least I think this is the case), which then throw a tonne of exceptions due to the connection not being available (the _connection.IsAvailable has already been checked). FWIW, I do not have the ability to make changes inside the _connection.Send(data) method. I have no issue waiting for the _connection.Send(data) to complete before moving onto the next element in the observable sequence. In fact, that would probably be preferable.
Is there a simple Rx style of handling this case?
there are a tonne of built up tasks still awaiting... which then throw a tonne of exceptions due to the connection not being available
Yes, that's what I would expect with this code. And there's nothing really wrong with that, since each one of those sends are in fact failing to send. If your code works fine with this, then you might just want to keep it as-is.
Otherwise...
(the _connection.IsAvailable has already been checked).
Yes. Connection.IsAvailable is useless. So are Socket.Connected / TcpClient.Connected, for that matter. They're all useless because all they tell you is whether an error has already occurred. Which you, er, already know because the last call already threw an exception. They do not provide any guarantee or even a guess as to whether the next method will succeed. This is why you need other mechanisms to detect socket connection failure.
I have no issue waiting for the _connection.Send(data) to complete before moving onto the next element in the observable sequence. In fact, that would probably be preferable.
If Connection is a simple wrapper around a socket without a write queue, then you should definitely only perform one call to Send at a time. This is due to the fact that in resource-constrained scenarios (i.e., always in production, not on your dev box), a "write" operation for a socket may only write some of the bytes to the actual network stream. I assume your Connection wrapper is handling partial writes by continuing to write until the entire data buffer is sent. This works great unless the code calls Send multiple times - in which case you can end up with bytes being out of order (A and then B are written; A partially completes and the wrapper sends the rest of A in another write... after B).
So, you'll need a write queue for reliable operation. If Connection already provides one, then I'd say you don't need to do anything else; the multiple Sends failing are normal. But if Connection only handles sending that single data buffer and does not queue up its write requests, then you'll need to do that yourself.
This is most easily accomplished by using a TPL Dataflow block. Specifically, ActionBlock<T>:
// Define the source observable.
var obs = ReadDatabase().SubscribeOn(NewThreadScheduler.Default);

// Create our queue which calls Send for each observable item.
var queue = new ActionBlock<string>(data => _connection.Send(data));

try
{
    // Subscribe the queue to the observable and (asynchronously) wait for it to complete.
    using (var subscription = obs.Subscribe(queue.AsObserver()))
        await queue.Completion;
}
catch (Exception ex)
{
    // The first exception thrown from Send will end up here.
    Console.WriteLine(ex.Message);
}
Dataflow blocks understand asynchronous code, and by default they only process one item at a time. So, this code will invoke Send one at a time, buffering up additional data items in a FIFO queue until that Send completes.
Dataflow blocks have a "fail fast" behavior, so the first Send that throws will fault the block, causing it to discard all remaining queued writes. When the block faults, await queue.Completion will throw, unsubscribing from the observable and displaying the message.
If the observable completes, then await queue.Completion will complete, again unsubscribing from the observable, and continue execution normally.
For more about interfacing Rx with TPL Dataflow, see my Concurrency in C# Cookbook, recipe 7.7. You may also find this Stack Overflow answer helpful in understanding why passing an async lambda to Subscribe isn't ideal.
I've been upgrading some older software from the Begin/End pattern in C# to use the new async functionality of the TcpClient class.
Long story short, this receive method works great for small numbers of connected sockets, and continues to work great for 10,000+ connections. The problem comes when these sockets disconnect.
The method I am using server side is, in essence, this (heavily simplified but still causes the problem):
private async void ReceiveDataUntilStopped(object state)
{
    while (IsConnected)
    {
        try
        {
            byte[] data = new byte[8192];
            int recvCount = await _stream.ReadAsync(data, 0, data.Length);
            if (recvCount == 0) { throw new Exception(); }
            Array.Resize(ref data, recvCount);
            Console.WriteLine(">>{0}<<", Encoding.UTF8.GetString(data));
        }
        catch { Shutdown(); return; }
    }
}
This method is called using ThreadPool.QueueUserWorkItem(ReceiveDataUntilStopped); when the connection is accepted.
To test the server, I connect 1,000 sockets. The time it takes to accept these is negligible, around 2 seconds or so. I'm very pleased with this. However, when I disconnect these 1,000 sockets, the process takes a substantial amount of time, 15 seconds or more, to handle the closure of these sockets (the Shutdown method). During this time, my server refuses any more connections. I emptied the contents of the Shutdown method to see if something in there was blocking, but the delay remains the same.
Am I being stupid and doing something I shouldn't? I'm relatively new to the async/await pattern, but enjoying it so far.
Is this unavoidable behaviour? I understand it's unlikely in production that 1,000 sockets will disconnect at the same time, but I'd like to be able to handle a scenario like this without causing a denial of service. It strikes me as odd that the listener stops accepting new sockets, but I expect this is because all the ThreadPool threads are busy shutting down the disconnected sockets?
EDIT: While I agree that throwing an exception when 0 bytes are received is not good control flow, this is not the source of the problem. The problem is still present with simply if (recvCount == 0) { Shutdown(); return; }. This is because ReadAsync throws an IOException if the other side disconnects uncleanly. I'm also aware that I'm not handling the buffers properly, etc.; this is just an example, with minimal content, just like SO likes. I use the following code to accept clients:
private async void AcceptClientsUntilStopped()
{
    while (IsListening)
    {
        try
        {
            ServerConnection newConnection = new ServerConnection(await _listener.AcceptTcpClientAsync());
            lock (_connections) { _connections.Add(newConnection); }
            Console.WriteLine(_connections.Count);
        }
        catch { Stop(); }
    }
}
if (recvCount == 0) { throw new Exception(); }
In case of disconnect you throw an exception. Exceptions are very expensive. I benchmarked them once at 10000/sec. This is very slow.
Under the debugger, exceptions are vastly slower again (maybe 100x).
This is a misuse of exceptions for control flow. From a code quality standpoint this is really bad. Your exception handling is also really bad because it catches too much: you meant to catch socket problems, but you're also swallowing all possible bugs, such as an NRE.
using (mySocket) //whatever you are using, maybe a TcpClient
{
    while (true)
    {
        byte[] data = new byte[8192];
        int recvCount = await _stream.ReadAsync(data, 0, data.Length);
        if (recvCount == 0) break;
        Array.Resize(ref data, recvCount);
        Console.WriteLine(">>{0}<<", Encoding.UTF8.GetString(data));
    }
    Shutdown();
}
Much better, wow.
Further issues: inefficient buffer handling, broken UTF-8 decoding (you can't split UTF-8 at an arbitrary byte position!), and usage of async void (you should probably use Task.Run to initiate this method, or simply call it and discard the resulting task).
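On the UTF-8 point, a sketch of stream-safe decoding: a Decoder (System.Text) kept across reads buffers partial multi-byte sequences, unlike calling Encoding.UTF8.GetString on each chunk.

// Create once per connection, not per read.
Decoder decoder = Encoding.UTF8.GetDecoder();
char[] chars = new char[8192]; // UTF-8 never yields more chars than bytes

// Inside the read loop, instead of Encoding.UTF8.GetString(data):
int charCount = decoder.GetChars(data, 0, recvCount, chars, 0);
Console.Write(new string(chars, 0, charCount));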
In the comments we discovered that the following works:
Start a high-priority thread and accept synchronously on it (no await). That should keep the accepting going. Fixing the exceptions is not going to be 100% possible, but: await increases the cost of exceptions because it rethrows them, using ExceptionDispatchInfo, which holds a process-global lock while doing so. That might be part of your scalability problems. You could improve performance by doing await readTask.ContinueWith(_ => { }); that way, await will never throw.
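A sketch of that trick, under the same assumptions as the question's code (_stream, Shutdown):

byte[] data = new byte[8192];
Task<int> readTask = _stream.ReadAsync(data, 0, data.Length);

// Await a no-op continuation instead of the task itself, so await never rethrows.
await readTask.ContinueWith(_ => { });

if (readTask.Status != TaskStatus.RanToCompletion || readTask.Result == 0)
{
    Shutdown();
    return;
}
int recvCount = readTask.Result;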
Based on the code provided and my initial understanding of the problem, I think there are several things you should do to address this issue.
Use async Task instead of async void. This ensures that the resulting Task is observable and that faults propagate.
Instead of invoking ThreadPool.QueueUserWorkItem(ReceiveDataUntilStopped), call ReceiveDataUntilStopped via await in the context of an async Task method.
With async/await, the Task and Task<T> objects represent the asynchronous operation. If you are concerned that the continuations after an await execute on the original calling thread, you could use .ConfigureAwait(false) to prevent capturing the current synchronization context. This is explained very well here and here too.
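A minimal sketch of those two changes applied to the question's code (signatures only; the method keeps its existing body):

private async Task ReceiveDataUntilStopped(ServerConnection connection)
{
    // ... same receive loop as in the question, reading from connection ...
}

// In the accept loop, instead of ThreadPool.QueueUserWorkItem(ReceiveDataUntilStopped):
await ReceiveDataUntilStopped(newConnection).ConfigureAwait(false);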
Additionally, look at how a similar "read-while" was written with this example.
I want to create an asynchronous socket server using SocketAsyncEventArgs.
The server should manage about 1000 connections at the same time. What is the best way to handle the logic for each packet?
The Server design is based on this MSDN example, so every socket will have his own SocketAsyncEventArgs for receiving data.
Do the logic stuff inside the receive function.
No overhead will be created, but since the next ReceiveAsync() call won’t be made before the logic has completed, new data can’t be read from the socket. The two main questions for me are: if the client sends a lot of data and the logic processing is heavy, how will the system handle it (packets lost because the buffer is too full)? Also, if all clients send data at the same time, will there be 1000 threads, or is there an internal limit so that a new thread can’t start before another one completes execution?
Use a queue.
The receive function will be very short and execute fast, but you’ll have decent overhead because of the queue. The problems are: if your worker threads are not fast enough under heavy server load, your queue can fill up, so maybe you have to force packet drops. You also get the producer/consumer problem, which can slow down the entire queue with too many locks.
So which will be the better design: logic in the receive function, logic in worker threads, or something completely different I’ve missed so far?
Another question, regarding sending data.
Is it better to have a SocketAsyncEventArgs tied to each socket (analogous to the receive event) and use a buffer system to make one send call for a few small packets (say the packets would otherwise sometimes be sent directly one after another), or to use a different SocketAsyncEventArgs for every packet and store them in a pool for reuse?
To effectively implement async sockets, each socket will need more than one SocketAsyncEventArgs. There is also an issue with the byte[] buffer in each SocketAsyncEventArgs: in short, the byte buffers are pinned whenever a managed-to-native transition occurs (sending/receiving). If you allocate SocketAsyncEventArgs instances and byte buffers as needed, you can run into OutOfMemoryException with many clients, due to fragmentation and the inability of the GC to compact pinned memory.
The best way to handle this is to create a SocketBufferPool class that allocates a large number of bytes and SocketAsyncEventArgs instances when the application first starts; this way the pinned memory will be contiguous. Then simply reuse the buffers from the pool as needed.
In practice I've found it best to create a wrapper class around the SocketAsyncEventArgs and a SocketBufferPool class to manage the distribution of resources.
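The SocketBufferPool source isn't shown here; a minimal sketch of the idea (illustrative names and sizes, returning raw SocketAsyncEventArgs rather than the wrapper) could look like this:

using System;
using System.Collections.Concurrent;
using System.Net.Sockets;

public sealed class SocketBufferPool
{
    // Illustrative sizes; tune for your connection count and buffer size.
    public static readonly SocketBufferPool Instance = new SocketBufferPool(1000, 8192);

    private readonly ConcurrentStack<SocketAsyncEventArgs> _pool =
        new ConcurrentStack<SocketAsyncEventArgs>();

    private SocketBufferPool(int count, int segmentSize)
    {
        // One contiguous allocation: pinning slices of it during I/O
        // cannot fragment the rest of the heap.
        byte[] block = new byte[count * segmentSize];
        for (int i = 0; i < count; i++)
        {
            var args = new SocketAsyncEventArgs();
            args.SetBuffer(block, i * segmentSize, segmentSize);
            _pool.Push(args);
        }
    }

    public SocketAsyncEventArgs Alloc() =>
        _pool.TryPop(out var args)
            ? args
            : throw new InvalidOperationException("Pool exhausted.");

    public void Free(SocketAsyncEventArgs args) => _pool.Push(args);
}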
As an example, here is the code for a BeginReceive method:
private void BeginReceive(Socket socket)
{
    Contract.Requires(socket != null, "socket");

    SocketEventArgs e = SocketBufferPool.Instance.Alloc();
    e.Socket = socket;
    e.Completed += new EventHandler<SocketEventArgs>(this.HandleIOCompleted);

    if (!socket.ReceiveAsync(e.AsyncEventArgs)) {
        this.HandleIOCompleted(null, e);
    }
}
And here is the HandleIOCompleted method:
private void HandleIOCompleted(object sender, SocketEventArgs e)
{
    e.Completed -= this.HandleIOCompleted;
    bool closed = false;

    lock (this.sequenceLock) {
        e.SequenceNumber = this.sequenceNumber++;
    }

    switch (e.LastOperation) {
        case SocketAsyncOperation.Send:
        case SocketAsyncOperation.SendPackets:
        case SocketAsyncOperation.SendTo:
            if (e.SocketError == SocketError.Success) {
                this.OnDataSent(e);
            }
            break;

        case SocketAsyncOperation.Receive:
        case SocketAsyncOperation.ReceiveFrom:
        case SocketAsyncOperation.ReceiveMessageFrom:
            if ((e.BytesTransferred > 0) && (e.SocketError == SocketError.Success)) {
                this.BeginReceive(e.Socket);
                if (this.ReceiveTimeout > 0) {
                    this.SetReceiveTimeout(e.Socket);
                }
            } else {
                closed = true;
            }

            if (e.SocketError == SocketError.Success) {
                this.OnDataReceived(e);
            }
            break;

        case SocketAsyncOperation.Disconnect:
            closed = true;
            break;

        case SocketAsyncOperation.Accept:
        case SocketAsyncOperation.Connect:
        case SocketAsyncOperation.None:
            break;
    }

    if (closed) {
        this.HandleSocketClosed(e.Socket);
    }

    SocketBufferPool.Instance.Free(e);
}
The above code is contained in a TcpSocket class that raises DataReceived and DataSent events. One thing to notice is the case SocketAsyncOperation.ReceiveMessageFrom: block; if the socket hasn't had an error, it immediately starts another BeginReceive(), which will allocate another SocketEventArgs from the pool.
Another important note is the SocketEventArgs.SequenceNumber property set in the HandleIOCompleted method. Although async requests complete in the order queued, you are still subject to other thread race conditions. Since the code calls BeginReceive before raising the DataReceived event, there is a possibility that the thread servicing the original IOCP blocks after calling BeginReceive but before raising the event, while a second async receive completes on a new thread and raises its DataReceived event first. Although this is a fairly rare edge case, it can occur, and the SequenceNumber property gives the consuming app the ability to ensure that data is processed in the correct order.
One other area to be aware of is async sends. Oftentimes, async send requests will complete synchronously (SendAsync returns false if the call completed synchronously), and this can severely degrade performance. The additional overhead of the async call coming back on an IOCP can, in practice, cause worse performance than simply using the synchronous call: the async call requires two kernel calls and a heap allocation, while the synchronous call happens on the stack.
Hope this helps,
Bill
In your code, you do this:
if (!socket.ReceiveAsync(e.AsyncEventArgs)) {
    this.HandleIOCompleted(null, e);
}
But it is an error to do that. There is a reason why the callback is not invoked when the operation finishes synchronously: invoking it anyway can fill up the stack.
Imagine that ReceiveAsync always returns synchronously. If your HandleIOCompleted ran inside a while loop, you could process each synchronously completed result at the same stack level, and break out of the while when a call did not complete synchronously.
But by doing it the way you do, you end up creating a new frame on the stack for each synchronous completion... so if you are unlucky enough, you will cause a StackOverflowException.
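A sketch of that loop shape, with ProcessReceive standing in for whatever handling your completion callback does:

private void StartReceive(Socket socket, SocketAsyncEventArgs e)
{
    // Process synchronous completions iteratively, at the same stack level,
    // instead of recursing into the completion handler.
    while (!socket.ReceiveAsync(e))
    {
        if (!ProcessReceive(e)) // placeholder; returns false when the connection is done
            return;
    }
    // When ReceiveAsync returns true, the Completed event fires later on an
    // I/O thread; its handler should call ProcessReceive and then loop back
    // into StartReceive.
}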