async / await vs BeginRead, EndRead - c#

I don't quite 'get' async and await yet, and I'm looking for some clarification around a particular problem I'm about to solve. Basically, I need to write some code that'll handle a TCP connection. It'll essentially just receive data and process it until the connection is closed.
I'd normally write this code using the NetworkStream BeginRead and EndRead pattern, but since the async / await pattern is much cleaner, I'm tempted to use that instead. However, since I admittedly don't fully understand exactly what is involved in these, I'm a little wary of the consequences. Will one use more resources than the other? Will one use a thread where the other would use IOCP, and so on?
Convoluted example time. These two do the same thing - count the bytes in a stream:
class StreamCount
{
private Stream str;
private int total = 0;
private byte[] buffer = new byte[1000];
public Task<int> CountBytes(Stream str)
{
this.str = str;
var tcs = new TaskCompletionSource<int>();
Action onComplete = () => tcs.SetResult(total);
str.BeginRead(this.buffer, 0, 1000, this.BeginReadCallback, onComplete);
return tcs.Task;
}
private void BeginReadCallback(IAsyncResult ar)
{
var bytesRead = str.EndRead(ar);
if (bytesRead == 0)
{
((Action)ar.AsyncState)();
}
else
{
total += bytesRead;
str.BeginRead(this.buffer, 0, 1000, this.BeginReadCallback, ar.AsyncState);
}
}
}
... And...
public static async Task<int> CountBytes(Stream str)
{
var buffer = new byte[1000];
var total = 0;
while (true)
{
int bytesRead = await str.ReadAsync(buffer, 0, 1000);
if (bytesRead == 0)
{
break;
}
total += bytesRead;
}
return total;
}
To my eyes, the async way looks cleaner, but there is that 'while (true)' loop that my uneducated brain tells me is going to use an extra thread, more resources, and therefore won't scale as well as the other one. But I'm fairly sure that is wrong. Are these doing the same thing in the same way?

To my eyes, the async way looks cleaner, but there is that 'while (true)' loop that my uneducated brain tells me is going to use an extra thread, more resources, and therefore won't scale as well as the other one.
Nope, it won't. The loop will only use a thread when it's actually running code... just as it would in your BeginRead callback. The await expression will return control to whatever the calling code is, having registered a continuation which jumps back to the right place in the method (in an appropriate thread, based on the synchronization context) and then continues running until it either gets to the end of the method or hits another await expression. It's exactly what you want :)
It's worth learning more about how async/await works behind the scenes - you might want to start with the MSDN page on it, as a jumping off point.
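To make the equivalence a bit more concrete, here is a minimal sketch that drives the very same BeginRead/EndRead pair from an await loop by wrapping it with Task.Factory.FromAsync. The class and method names (StreamCounting, CountBytesApmAsync) are made up for illustration. No thread sits in the loop while a read is pending; the method simply resumes when the read completes.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class StreamCounting
{
    // Hypothetical helper: counts the bytes in a stream using the APM pair,
    // wrapped in a Task so it can be awaited.
    public static async Task<int> CountBytesApmAsync(Stream stream)
    {
        var buffer = new byte[1000];
        var total = 0;
        while (true)
        {
            // FromAsync calls BeginRead immediately and EndRead once the read completes.
            int bytesRead = await Task<int>.Factory.FromAsync(
                stream.BeginRead, stream.EndRead, buffer, 0, buffer.Length, null);
            if (bytesRead == 0)
            {
                return total; // end of stream
            }
            total += bytesRead;
        }
    }
}
```

Calling `await StreamCounting.CountBytesApmAsync(stream)` should give the same total as the ReadAsync version in the question; the only difference is which API starts the read.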

Related

hanging in ReadAsync from stream

To get my GUI responsive while receiving data I (think I) need to implement the read of a BT device asynchronously.
But once I try to make it async by awaiting read it's hanging in the first call or 2nd call.
Debugging doesn't give any insight either: breakpoints don't fire, etc.
When running synchronously the loop is usually running more than once.
public async Task<byte[]> ReadBufferFromStreamAsync(NetworkStream stream)
{
var totalRead = 0;
byte[] buffer = new byte[IO_BUFFER_SIZE];
while (!buffer.Contains((byte)'#'))
{
int read = await stream.ReadAsync(buffer, totalRead, buffer.Length - totalRead);
totalRead += read;
}
return buffer;
}
public async Task<string> readAsync()
{
string answer = System.Text.Encoding.UTF8.GetString(await ReadBufferFromStreamAsync(myStream));
Debug.Write(answer);
return answer.Trim(charsToTrim);
}
public async Task<string> WriteReadAsync(string str)
{
Debug.Write($"send:{str},");
await writeAsync(str);
var value = await readAsync();
Debug.Write($"received:{value}");
return value;
}
whereas this runs fine:
....
Task<int> read = stream.ReadAsync(buffer, totalRead, buffer.Length - totalRead);
totalRead += read.Result;
I would also be keen to know how you debug this kind of code when something goes wrong.
As confirmed with your latest comment, you have an async deadlock. This question goes into a lot of detail about that and provides links to resources where you can learn more.
You say that somewhere in your code you have something like:
public void MyMethod()
{
var result = MyOtherMethodAsync().Result; // blocks the calling thread until the task completes
}
This isn't really the right way to call async methods. You end up with a chicken and egg type situation, where MyMethod needs to be free to receive the result of MyOtherMethodAsync, so the async method basically waits to resume. The other side of this is MyMethod's call to .Result which is blocking, waiting for MyOtherMethodAsync to complete. This ends up with them both waiting for each other, and then we have ourselves a deadlock.
The best solution is to make MyMethod async:
public async Task MyMethod()
{
await MyOtherMethodAsync();
}
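But this sometimes isn't possible. If the method has to be sync then you can use .ConfigureAwait(false) to prevent it from hanging, but as Stephen Cleary notes, it is a hack. Note that it only helps when it is applied to the awaits inside MyOtherMethodAsync (and everything it calls), so that their continuations don't need the blocked context; appending it to the outer call right before blocking doesn't change how those inner awaits resume. If you can't change the async method, a common workaround is to start it on a thread-pool thread, where there is no UI context to capture:
public void MyMethod()
{
Task.Run(() => MyOtherMethodAsync()).GetAwaiter().GetResult();
}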
Note that for UI event handler methods, you can change them to async void:
public async void Button1_Clicked(object sender, EventArgs e)
{
await MyOtherMethodAsync();
}
But note that using async void effectively means that nothing waits for the result of the Button1_Clicked method. It becomes fire and forget. This is, however, the recommended practice for integrating UI buttons, etc. with async methods.
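For what it's worth, here is a rough sketch of how the read loop from the question could apply ConfigureAwait(false) on its own awaits, while also stopping when the stream ends or the buffer fills up (the original loop would spin forever in either case). The names TerminatedReader and ReadUntilAsync and the maxBytes parameter are my own, not from the question.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class TerminatedReader
{
    // Hypothetical helper: reads until the terminator byte has arrived,
    // the buffer is full, or the stream ends, whichever comes first.
    public static async Task<byte[]> ReadUntilAsync(Stream stream, byte terminator, int maxBytes)
    {
        var buffer = new byte[maxBytes];
        var totalRead = 0;
        while (totalRead < buffer.Length)
        {
            // ConfigureAwait(false): this continuation does not need the caller's context.
            int read = await stream
                .ReadAsync(buffer, totalRead, buffer.Length - totalRead)
                .ConfigureAwait(false);
            if (read == 0)
            {
                break; // the other side closed the connection
            }
            // Only scan the bytes that just arrived.
            int hit = Array.IndexOf(buffer, terminator, totalRead, read);
            totalRead += read;
            if (hit >= 0)
            {
                break;
            }
        }
        // Return only the bytes actually received (this may include bytes
        // that arrived after the terminator in the same read).
        var result = new byte[totalRead];
        Array.Copy(buffer, result, totalRead);
        return result;
    }
}
```

The caller should still await this all the way up to the event handler; ConfigureAwait(false) inside the helper only means the helper itself never needs the UI thread to resume.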

What is the fastest possible way to read a serial port in .net?

I need a serial port program to read data coming in at 4800 baud. Right now I have a simulator sending 15 lines of data every second. The output of it seems to get "behind" and can't keep up with the speed/amount of data coming in.
I have tried using ReadLine() with a DataReceived event, which did not seem to be reliable, and now I am using an async method with serialPort.BaseStream.ReadAsync:
okToReadPort = true;
Task readTask = new Task(startAsyncRead);
readTask.Start();
//this method starts the async read process and the "nmeaList" is what
// is used by the other thread to display data
public async void startAsyncRead()
{
while (okToReadPort)
{
Task<string> task = ReadLineAsync(serialPort);
string line = await task;
NMEAMsg tempMsg = new NMEAMsg(line);
if (tempMsg.sentenceType != null)
{
nmeaList[tempMsg.sentenceType] = tempMsg;
}
}
}
public static async Task<string> ReadLineAsync(
this SerialPort serialPort)
{
// Console.WriteLine("Entering ReadLineAsync()...");
byte[] buffer = new byte[1];
string ret = string.Empty;
while (true)
{
await serialPort.BaseStream.ReadAsync(buffer, 0, 1);
ret += serialPort.Encoding.GetString(buffer);
if (ret.EndsWith(serialPort.NewLine))
return ret.Substring(0, ret.Length - serialPort.NewLine.Length);
}
}
This still seems inefficient, does anyone know of a better way to ensure that every piece of data is read from the port and accounted for?
Generally speaking, your issue is that you are performing IO synchronously with data processing. It doesn't help that your data processing is relatively expensive (string concatenation).
To fix the general problem, when you read a byte put it into a processing buffer (BlockingCollection works great here as it solves Producer/Consumer) and have another thread read from the buffer. That way the serial port can immediately begin reading again instead of waiting for your processing to finish.
As a side note, you would likely see a benefit by using StringBuilder in your code instead of string concatenation. You should still process via queue though.
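A minimal sketch of that producer/consumer split might look like the following. It reads larger chunks instead of one byte at a time, collects text in a StringBuilder, and hands complete lines to a BlockingCollection. The names LinePipeline, ProduceLinesAsync and ConsumeAll are hypothetical, and I'm assuming a plain Stream (serialPort.BaseStream would be the one to pass in) whose ReadAsync returns 0 when it is closed.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class LinePipeline
{
    // Producer: reads chunks from the stream and queues every complete line.
    public static async Task ProduceLinesAsync(
        Stream stream, Encoding encoding, string newLine, BlockingCollection<string> queue)
    {
        var buffer = new byte[256];
        var decoder = encoding.GetDecoder(); // keeps state across chunk boundaries
        var chars = new char[encoding.GetMaxCharCount(buffer.Length)];
        var pending = new StringBuilder();
        try
        {
            int read;
            while ((read = await stream.ReadAsync(buffer, 0, buffer.Length).ConfigureAwait(false)) > 0)
            {
                int charCount = decoder.GetChars(buffer, 0, read, chars, 0);
                pending.Append(chars, 0, charCount);

                // Queue each complete line; keep the unfinished tail for the next chunk.
                string text = pending.ToString();
                int index;
                while ((index = text.IndexOf(newLine, StringComparison.Ordinal)) >= 0)
                {
                    queue.Add(text.Substring(0, index));
                    text = text.Substring(index + newLine.Length);
                }
                pending.Clear().Append(text);
            }
        }
        finally
        {
            queue.CompleteAdding(); // lets the consumer loop finish
        }
    }

    // Consumer: run this on another thread; it blocks until lines arrive.
    public static List<string> ConsumeAll(BlockingCollection<string> queue)
    {
        var lines = new List<string>();
        foreach (string line in queue.GetConsumingEnumerable())
        {
            lines.Add(line); // the NMEA parsing from the question would go here
        }
        return lines;
    }
}
```

In the real program the consumer would parse each line into an NMEAMsg instead of collecting it in a list, so the port is never left waiting on the parsing.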

Prevent overlapping async code execution

I have read about async-await patterns and that Mutex is incompatible with asynchronous code so I wonder how to write the following method in a lightweight way, without object allocations (not even for a Task<>) but without excessive encumbrance (from a pattern).
The problem is as follows. Look at this method:
public async void SendAsync(Stream stream, byte[] data)
{
await stream.WriteAsync(data.Length);
await stream.WriteAsync(data);
await stream.FlushAsync();
}
Now, this method probably is not very useful. In fact there is plenty more code, but for the problem at hand it's perfectly complete.
The problem here is that I have a single main thread that will invoke the method very quickly without await:
for (i = 0; i < 10; i++)
{
SendAsync(stream[i], buffer[i]);
}
All I want is that the main thread loops through quickly without blocking.
Now, this loop is looped over, too:
while (true)
{
for (i = 0; i < 10; i++)
{
SendAsync(stream[i], buffer[i]);
}
// TODO: add a break criterion here
}
My suspicion (fear) is that a very chunky buffer (say 1 MB) on a congested stream[i] (yes, it's a TCP/IP NetworkStream underneath) may cause the nth invocation to start before the preceding (n-1)th invocation has finished.
The two tasks will then interfere with each other.
I don't want any of these two solutions I've already discarded:
(Non-)Solution 1: Mutex
This will throw an exception because Mutex has thread affinity and async code is thread-agnostic:
public async void SendAsync(Stream stream, byte[] data)
{
mutex.WaitOne(); // EXCEPTION!
await stream.WriteAsync(data.Length);
await stream.WriteAsync(data);
await stream.FlushAsync();
mutex.ReleaseMutex();
}
Solution 2: return a Task to *.WaitAll()
The loop is in consumer land while SendAsync() is in framework land, I don't want to force consumers to use an intricate pattern:
while (true)
{
for (i = 0; i < 10; i++)
{
tasks[i] = SendAsync(stream[i], buffer[i]);
}
Task.WaitAll(tasks);
// TODO: add a break criterion here
}
Furthermore, I ABSOLUTELY don't want Tasks to be allocated on each loop; this code will run at a moderate frequency (5-10 Hz) with 1000+ open streams. I cannot allocate 10000+ objects per second, it would GC like crazy. Mind that this is minimal code per SO policy, the actual code is much more complex.
What I'd like to see is some kind of outside-loop allocation that allows waiting for completion only when absolutely necessary (like a Mutex) but at the same time does not allocate memory inside the loop.
Let's say you have a wrapper class over the stream, you could do something like this :
private Stream stream;
private Task currentTask;
public async void SendAsync(byte[] data)
{
currentTask = WriteAsyncWhenStreamAvailable(data);
await currentTask;
}
private async Task WriteAsyncWhenStreamAvailable(byte[] data)
{
if (currentTask != null)
await currentTask.ConfigureAwait(false);
await WriteAsync(data);
}
public async Task WriteAsync(byte[] data)
{
await stream.WriteAsync(data.Length).ConfigureAwait(false);
await stream.WriteAsync(data).ConfigureAwait(false);
await stream.FlushAsync().ConfigureAwait(false);
}
This way you always wait for the previous WriteAsync to end before sending the new data. This solution ensures that the buffers are sent in the requested order.
You could use a SemaphoreSlim, which supports asynchronous waiting, instead of a Mutex:
private SemaphoreSlim semaphore = new SemaphoreSlim(1);
public async void SendAsync(Stream stream, byte[] data)
{
await semaphore.WaitAsync(); // no thread affinity, so no exception here
try
{
await stream.WriteAsync(data.Length);
await stream.WriteAsync(data);
await stream.FlushAsync();
}
finally
{
semaphore.Release(); // released even if a write throws
}
}
Beware of OutOfMemoryException if you blindly produce new data without taking into account the size of pending buffers...
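Here is a self-contained sketch of the SemaphoreSlim idea using real Stream overloads, since WriteAsync(data.Length) in the question is shorthand rather than an actual API. The class name SerializedSender and the 4-byte length prefix are my assumptions. It returns a Task so callers can observe failures, which does not meet the no-allocation wish in the question, and as far as I know SemaphoreSlim gives no ordering guarantee among waiters, so it prevents interleaving but should not be relied on for ordering.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public sealed class SerializedSender
{
    private readonly Stream stream;
    // One permit: at most one send touches the stream at a time.
    private readonly SemaphoreSlim gate = new SemaphoreSlim(1, 1);

    public SerializedSender(Stream stream)
    {
        this.stream = stream;
    }

    // Writes a 4-byte length prefix followed by the payload.
    // Concurrent calls never interleave their writes.
    public async Task SendAsync(byte[] data)
    {
        await gate.WaitAsync().ConfigureAwait(false);
        try
        {
            byte[] prefix = BitConverter.GetBytes(data.Length);
            await stream.WriteAsync(prefix, 0, prefix.Length).ConfigureAwait(false);
            await stream.WriteAsync(data, 0, data.Length).ConfigureAwait(false);
            await stream.FlushAsync().ConfigureAwait(false);
        }
        finally
        {
            gate.Release(); // released even if a write throws
        }
    }
}
```

With one SerializedSender per stream, the loop from the question can keep calling SendAsync without awaiting; a call that arrives while a send is in flight simply queues on the semaphore.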

c# 4.5 - Should a TCP Server, mainly doing database inserts, start each client on a Task

My understanding is that async await is for IO (network, db, etc) and parallel task is for cpu.
Note: This code is a little rough, to keep it concise for this post.
I have a windows service created in c# that has the following code
while (true)
{
var socket = await tcpListener.AcceptSocketAsync();
if (socket == null) { break; }
var client = new RemoteClient(socket);
Task.Run(() => client.ProcessMessage());
}
In the RemoteClient class the ProcessMessage method does this
byte[] buffer = new byte[4096];
rawMessage = string.Empty;
while (true)
{
Array.Clear(buffer, 0, buffer.Length);
int bytesRead = await networkStream.ReadAsync(buffer, 0, buffer.Length);
rawMessage += (System.Text.Encoding.ASCII.GetString(buffer).Replace("\0", string.Empty));
if (bytesRead == 0 || buffer[buffer.Length - 1] == 0)
{
StoreMessage();
return;
}
}
So I have the I/O work happening asynchronously. But my concern and my question is in using Task.Run to kick off the work am I still creating a block?
I'm trying to take a TCP connection and release it as quickly as possible in order to scale to a large number of connections.
I feel like I'm mixing paradigms here.
Thanks
My understanding is that async await is for IO (network, db, etc) and parallel task is for cpu.
I would say that understanding is incorrect. async/await is for any asynchronous operation, whether I/O or CPU bound.
…my concern and my question is in using Task.Run to kick off the work am I still creating a block?
"A block"? What kind of block do you think you would be creating otherwise?
Personally, I would not write the code that way. The accept operation will already complete in a thread pool thread (or synchronously in the same thread), i.e. one from the IOCP thread pool. It would be perfectly fine to set up some initial conditions for the connection on that thread, and then initiate the I/O from there. There's no reason to queue up the work on yet another thread.
So the way I'd write the code is like this:
async Task ProcessMessage()
{
byte[] buffer = new byte[4096];
rawMessage = string.Empty;
while (true)
{
Array.Clear(buffer, 0, buffer.Length);
int bytesRead = await networkStream.ReadAsync(buffer, 0, buffer.Length);
rawMessage += (System.Text.Encoding.ASCII.GetString(buffer).Replace("\0", string.Empty));
if (bytesRead == 0 || buffer[buffer.Length - 1] == 0)
{
StoreMessage();
return;
}
}
}
Then in your service:
while (true)
{
var socket = await tcpListener.AcceptSocketAsync();
if (socket == null) { break; }
var client = new RemoteClient(socket);
var _ = client.ProcessMessage();
}
Notes:
The dummy _ variable is just there to keep the compiler from warning you about the ignored, non-awaited async return value.
Since you are ignoring the returned Task object, you won't receive thrown exceptions. So in lieu of that, you should add appropriate exception handling to the ProcessMessage() method itself.
I agree with commenter shr regarding cleanup. You didn't provide a complete code example, so we don't know what e.g. the StoreMessage() method does. But presumably/hopefully you have logic in there somewhere that correctly and gracefully shuts down the connection and closes the socket.
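As a rough illustration of the exception-handling and cleanup notes, the fire-and-forget call can go through a small wrapper that catches failures and always runs cleanup. ConnectionRunner, RunClientAsync and the two callbacks are hypothetical names; what you log and how you close the socket depends on code that isn't in the question.

```csharp
using System;
using System.Threading.Tasks;

public static class ConnectionRunner
{
    // Hypothetical wrapper: runs one connection's work, reports any failure,
    // and always runs the cleanup callback.
    public static async Task RunClientAsync(
        Func<Task> processMessage, Action<Exception> onError, Action cleanup)
    {
        try
        {
            await processMessage().ConfigureAwait(false);
        }
        catch (Exception ex)
        {
            // Nobody awaits the returned task, so this is the only place
            // the exception can be observed (for example, logged).
            onError(ex);
        }
        finally
        {
            cleanup(); // for example, close the socket
        }
    }
}
```

In the accept loop this would be used as something like `var _ = ConnectionRunner.RunClientAsync(client.ProcessMessage, ex => Log(ex), () => socket.Dispose());`, where Log stands in for whatever logging you already have.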

Asynchronous SHA256 Hashing

I have the following method:
public static string Sha256Hash(string input) {
if(String.IsNullOrEmpty(input)) return String.Empty;
using(HashAlgorithm algorithm = new SHA256CryptoServiceProvider()) {
byte[] inputBytes = Encoding.UTF8.GetBytes(input);
byte[] hashBytes = algorithm.ComputeHash(inputBytes);
return BitConverter.ToString(hashBytes).Replace("-", String.Empty);
}
}
Is there a way to make it asynchronous? I was hoping to use the async and await keywords, but the HashAlgorithm class does not provide any asynchronous support for this.
Another approach was to encapsulate all the logic in a:
public static async Task<string> Sha256Hash(string input) {
return await Task.Run(() => {
//Hashing here...
});
}
But this does not seem clean and I'm not sure if it's a correct (or efficient) way to perform an operation asynchronously.
What can I do to accomplish this?
As stated by the other answerers, hashing is a CPU-bound activity so it doesn't have Async methods you can call. You can, however, make your hashing method async by asynchronously reading the file block by block and then hashing the bytes you read from the file. The hashing will be done synchronously but the read will be asynchronous and consequently your entire method will be async.
Here is sample code for achieving the purpose I just described.
public static async Threading.Tasks.Task<string> GetHashAsync<T>(this Stream stream)
where T : HashAlgorithm, new()
{
StringBuilder sb;
using (var algo = new T())
{
var buffer = new byte[8192];
int bytesRead;
// compute the hash on 8KiB blocks
while ((bytesRead = await stream.ReadAsync(buffer, 0, buffer.Length)) != 0)
algo.TransformBlock(buffer, 0, bytesRead, buffer, 0);
algo.TransformFinalBlock(buffer, 0, bytesRead);
// build the hash string
sb = new StringBuilder(algo.HashSize / 4);
foreach (var b in algo.Hash)
sb.AppendFormat("{0:x2}", b);
}
return sb?.ToString();
}
The function can be invoked as such
using (var stream = System.IO.File.OpenRead(@"C:\path\to\file.txt"))
{
string sha256 = await stream.GetHashAsync<SHA256CryptoServiceProvider>();
}
Of course, you could equally call the method with other hash algorithms such as SHA1CryptoServiceProvider or SHA512CryptoServiceProvider as the generic type parameter.
Likewise, with a few modifications you can also get it to hash a string, as in your specific case.
The work that you're doing is inherently synchronous CPU bound work. It's not inherently asynchronous as something like network IO is going to be. If you would like to run some synchronous CPU bound work in another thread and asynchronously wait for it to be completed, then Task.Run is indeed the proper tool to accomplish that, assuming the operation is sufficiently long running to need to perform it asynchronously.
That said, there really isn't any reason to expose an asynchronous wrapper over your synchronous method. It generally makes more sense to just expose the method synchronously, and if a particular caller needs it to run asynchronously in another thread, they can use Task.Run to explicitly indicate that need for that particular invocation.
The overhead of running this asynchronously (using Task.Run) will probably be higher than just running it synchronously.
An asynchronous interface is not available because it is a CPU bound operation. You can make it asynchronous (using Task.Run) as you pointed out, but I would recommend against it.
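To illustrate the advice in the answers above, here is a small sketch that keeps the hash method synchronous and leaves Task.Run to the one caller that wants it off its thread. I've used SHA256.Create() rather than SHA256CryptoServiceProvider from the question, and the Caller class and HashOffThreadAsync method are made up for the example.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Threading.Tasks;

public static class Hashing
{
    // Plain synchronous API: hashing is CPU-bound, so there is nothing to await.
    public static string Sha256Hash(string input)
    {
        if (string.IsNullOrEmpty(input)) return string.Empty;
        using (SHA256 algorithm = SHA256.Create())
        {
            byte[] hashBytes = algorithm.ComputeHash(Encoding.UTF8.GetBytes(input));
            return BitConverter.ToString(hashBytes).Replace("-", string.Empty);
        }
    }
}

public static class Caller
{
    // Hypothetical caller (say, a UI handler) that does not want to hash on its own thread.
    public static async Task<string> HashOffThreadAsync(string text)
    {
        // The caller opts in to a thread-pool thread for this one call.
        return await Task.Run(() => Hashing.Sha256Hash(text));
    }
}
```

Callers that don't care simply use Hashing.Sha256Hash directly and avoid the scheduling overhead.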
