Grpc.Core 2.38.0
I have a collection of applications that communicate between processes using gRPC streaming. From time to time we've noticed a lockup, followed by memory exhaustion, in server processes that are unable to finish a call to IAsyncStreamWriter.WriteAsync(...).
Recent changes to grpc (.net) have changed the WriteAsync API to accept a CancellationToken, however this is not available in the Grpc.Core package.
A misconfigured grpc client accepting a stream can cause a deadlock. If a client does not dispose of the AsyncServerStreamingCall during error handling, the deadlock will occur on the server.
Example:
async Task ClientStreamingThread()
{
while (...)
{
var theStream = grpcService.SomeStream(new());
try
{
while (await theStream.ResponseStream.MoveNext(shutdownToken.Token))
{
var theData = theStream.ResponseStream.Current;
}
}
catch (RpcException)
{
// if an exception occurs, start over, reopen the stream
}
}
}
The example above contains the misbehaving client. If an RpcException occurs, we'll return to the start of the while loop and open another stream without cleaning up the previous. This causes the deadlock.
"Fix" the client code by disposing of the previous stream like the following:
async Task ClientStreamingThread()
{
while (...)
{
// important. dispose of theStream if it goes out of scope
using var theStream = grpcService.SomeStream(new());
try
{
while (await theStream.ResponseStream.MoveNext(shutdownToken.Token))
{
var theData = theStream.ResponseStream.Current;
}
}
catch (RpcException)
{
// if an exception occurs, start over, reopen the stream
}
}
}
Related
I'm currently working on an ASP.NET Core web app, which consists of a web server and two long-running services: a TCP server (for managing my own clients) and a TCP client (integration with an external platform).
Both services run alongside the web server. I achieved that by making them inherit from BackgroundService and registering them in DI this way:
services.AddHostedService(provider => provider.GetService<TcpClientService>());
services.AddHostedService(provider => provider.GetService<TcpServerService>());
Unfortunately, during development I ran into a weird issue (which doesn't let me sleep at night, so at this point I beg for your help). For some reason the async code in TcpClientService blocks execution of the other services (web server and TCP server).
using System;
using System.IO;
using System.Net.Sockets;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
namespace ClientService.AsyncPoblem
{
public class TcpClientService : BackgroundService
{
private readonly ILogger<TcpClientService> _logger;
private bool Connected { get; set; }
private TcpClient TcpClient { get; set; }
public TcpClientService(ILogger<TcpClientService> logger)
{
_logger = logger;
}
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
while (!stoppingToken.IsCancellationRequested)
{
try
{
if (Connected)
{
await Task.Delay(100, stoppingToken); // check every 100ms if still connected
}
else
{
TcpClient = new TcpClient("localhost", 1234);
HandleClient(TcpClient); // <-- Call causing the issue
_logger.Log(LogLevel.Debug, "After call");
}
}
catch (Exception e)
{
// log the exception, wait for 3s and try again
_logger.Log(LogLevel.Critical, "An error occured while trying to connect with server.");
_logger.Log(LogLevel.Critical, e.ToString());
await Task.Delay(3000, stoppingToken);
}
}
}
private async Task HandleClient(TcpClient client)
{
Connected = true;
await using var ns = client.GetStream();
using var streamReader = new StreamReader(ns);
var msgBuilder = new StringBuilder();
bool reading = false;
var buffer = new char[1024];
while (!streamReader.EndOfStream)
{
var res = await streamReader.ReadAsync(buffer, 0, 1024);
foreach (var value in buffer)
{
if (value == '\x02')
{
msgBuilder.Clear();
reading = true;
}
else if (value == '\x03')
{
reading = false;
if (msgBuilder.Length > 0)
{
Console.WriteLine(msgBuilder);
msgBuilder.Clear();
}
}
else if (value == '\x00')
{
break;
}
else if (reading)
{
msgBuilder.Append(value);
}
}
Array.Clear(buffer, 0, buffer.Length);
}
Connected = false;
}
}
}
The call causing the issue is in the else branch of the ExecuteAsync method:
else
{
TcpClient = new TcpClient("localhost", 1234);
HandleClient(TcpClient); // <-- Call causing the issue
_logger.Log(LogLevel.Debug, "After call");
}
The code reads properly from the socket, but it blocks initialization of the web server and the TCP server. In fact, even the log call is never reached. Whether or not I put await in front of HandleClient(), the code behaves the same.
I've done some tests, and I figured out that this piece of code is not blocking anymore ("After call" log shows up):
else
{
TcpClient = new TcpClient("localhost", 1234);
await Task.Delay(1);
HandleClient(TcpClient); // <- moving Task.Delay into HandleClient also works
_logger.Log(LogLevel.Debug, "After call");
}
This also works like a charm (if I try to await Task.Run(), it will block "After call" log, but rest of app will start with no problem):
else
{
tcpClient = new TcpClient("localhost", 6969);
Connected = true;
Task.Run(() => ReceiveAsync(tcpClient));
_logger.Log(LogLevel.Debug, "After call");
}
There are a couple more combinations that make it work, but my question is: why do the other methods work (especially the 1 ms delay, which completely shuts down my brain) while firing HandleClient() without await doesn't? I know that fire and forget may not be the most elegant solution, but it should work and do its job, shouldn't it? I searched for almost a month and still didn't find a single explanation for that. At this point I have a hard time falling asleep at night, because I have no one to ask and can't stop thinking about it.
Update
(Sorry for disappearing for over a day without any answers)
After many, many hours of investigation, I started debugging once again. Every time I hit the while loop in HandleClient(), I lost control of the debugger; the program seemed to keep running, but it never reached await streamReader.ReadAsync(). At some point I decided to change the condition of the while loop to true (I have no idea why I didn't think of trying it before), and everything began to work as expected. Messages were read from the TCP socket, and the other services started without any issues.
Here is the piece of code causing the issue:
while (!streamReader.EndOfStream) // <----- issue
{
var res = await streamReader.ReadAsync(buffer, 0, 1024);
// ...
After that observation, I decided to print out the result of EndOfStream before reaching the loop, to see what happens
Console.WriteLine(streamReader.EndOfStream);
while (!streamReader.EndOfStream)
{
var res = await streamReader.ReadAsync(buffer, 0, 1024);
// ...
Now the exact same thing was happening, but before even reaching the loop!
Explanation
Note:
I'm not a senior programmer, especially when it comes to asynchronous TCP communication, so I might be wrong here, but I will do my best.
streamReader.EndOfStream is not a regular field; it is a property, and it has logic inside its getter.
This is what it looks like from the inside:
public bool EndOfStream
{
get
{
ThrowIfDisposed();
CheckAsyncTaskInProgress();
if (_charPos < _charLen)
{
return false;
}
// This may block on pipes!
int numRead = ReadBuffer();
return numRead == 0;
}
}
The EndOfStream getter is synchronous. To detect whether the stream has ended, it calls ReadBuffer(). Since there is no data in the buffer yet and the stream hasn't ended, the call hangs until there is some data to read. Unfortunately, that means it cannot be used safely in an asynchronous context: on a network stream with no pending data it blocks the calling thread (unfortunate, because it seems to be the only way to instantly detect an interrupted connection, a broken cable or the end of the stream).
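One way to avoid the getter altogether is to rely on the return value of ReadAsync, which is 0 once the remote side has closed the stream. The following is only a minimal sketch under that assumption, not the poster's final code; the class and method names are hypothetical, and it does not detect a half-open connection (a broken cable with no close).

```csharp
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class StreamReading
{
    // Hypothetical helper: reads until the remote side closes the stream.
    // ReadAsync returns 0 at end of stream, so the blocking EndOfStream
    // getter is never called.
    public static async Task<string> ReadToEndWithoutEndOfStreamAsync(Stream stream)
    {
        using var reader = new StreamReader(stream, Encoding.UTF8);
        var buffer = new char[1024];
        var text = new StringBuilder();
        int charsRead;
        while ((charsRead = await reader.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            // Only the first charsRead entries of the buffer are valid.
            text.Append(buffer, 0, charsRead);
        }
        return text.ToString();
    }
}
```

Processing only the first charsRead characters also removes the need to clear the buffer after every read.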
I don't have a finished piece of code yet; I need to rewrite it and add some broken-connection detection. I will post my solution as soon as I finish.
I would like to thank everyone for trying to help me, and especially #RoarS., who took the biggest part in the discussion and spent some of his own time taking a closer look at my issue.
This is poorly documented behaviour of the BackgroundService class. All registered IHostedService will be started sequentially in the order they were registered. The application will not start until each IHostedService has returned from StartAsync. A BackgroundService is an IHostedService that starts your ExecuteAsync task before returning from StartAsync. Async methods will run until their first call to await an incomplete task before returning.
TLDR; If you don't await anything in your ExecuteAsync method, the server will never start.
Since you aren't awaiting that async method, your code boils down to:
while(true)
HandleClient(...);
(Do you really want to spawn an infinite number of TcpClient instances as fast as the CPU will go?) There's a really easy fix:
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
await Task.Yield();
// ...
}
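The behaviour described above (an async method runs synchronously until its first await of an incomplete task) can be observed in isolation. This is a small illustrative sketch with hypothetical names, not part of BackgroundService itself; it assumes no SynchronizationContext is installed, as in a console app.

```csharp
using System;
using System.Threading.Tasks;

public static class StartupDemo
{
    // Hypothetical stand-in for ExecuteAsync: everything before the first
    // incomplete await runs synchronously, before the caller gets the Task.
    public static async Task SyncPrefix(Action beforeFirstAwait)
    {
        beforeFirstAwait();
        await Task.Delay(10);
    }

    // With Task.Yield first, the caller gets an incomplete Task back
    // immediately and the rest runs later as a continuation.
    public static async Task YieldFirst(Action afterYield)
    {
        await Task.Yield();
        afterYield();
    }
}
```

In the question's HandleClient, the synchronous prefix includes the EndOfStream check, which is why any blocking there delays everything that is waiting for StartAsync to return.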
Consider the following simplified example (ready to roll in LinqPad, elevated account required):
void Main()
{
Go();
Thread.Sleep(100000);
}
async void Go()
{
TcpListener listener = new TcpListener(IPAddress.Any, 6666);
try
{
listener.Start();
using(TcpClient client = await listener.AcceptTcpClientAsync()
.ConfigureAwait(false))
using(var cts = new CancellationTokenSource(TimeSpan.FromSeconds(5)))
{
// Register the callback only after cts has been declared
cts.Token.Register(() => Console.WriteLine("Token was canceled"));
var stream=client.GetStream();
var buffer=new byte[64];
try
{
var amtRead = await stream.ReadAsync(buffer,
0,
buffer.Length,
cts.Token);
Console.WriteLine("finished");
}
catch(TaskCanceledException)
{
Console.WriteLine("boom");
}
}
}
finally
{
listener.Stop();
}
}
If I connect a telnet client to localhost:6666 and sit around doing nothing for 5 seconds, why do I see "Token was canceled" but never see "boom" (or "finished")?
Will this NetworkStream not respect cancellation?
I can work around this with a combination of Task.Delay() and Task.WhenAny, but I'd prefer to get it working as expected.
Conversely, the following example of cancellation:
async void Go(CancellationToken ct)
{
using(var cts=new CancellationTokenSource(TimeSpan.FromSeconds(5)))
{
try
{
await Task.Delay(TimeSpan.FromSeconds(10),cts.Token)
.ConfigureAwait(false);
}
catch(TaskCanceledException)
{
Console.WriteLine("boom");
}
}
}
Prints "boom", as expected. What's going on?
No, NetworkStream does not support cancellation.
Unfortunately, the underlying Win32 APIs do not always support per-operation cancellation. Traditionally, you could cancel all I/O for a particular handle, but the method to cancel a single I/O operation is fairly recent. Most of the .NET BCL was written against the XP API (or older), which did not include CancelIoEx.
Stream compounds this issue by "faking" support for cancellation (and asynchronous I/O, too) even if the implementation doesn't support it. The "fake" support for cancellation just checks the token immediately and then starts a regular asynchronous read that cannot be cancelled. That's what you're seeing happen with NetworkStream.
With sockets (and most Win32 types), the traditional approach is to close the handle if you want to abort communications. This causes all current operations (both reads and writes) to fail. Technically this is a violation of BCL thread safety as documented, but it does work.
cts.Token.Register(() => client.Close());
...
catch (ObjectDisposedException)
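Expanded into a complete method, the close-on-cancel approach might look like the sketch below. The helper name and the -1 return convention are assumptions made for illustration. IOException is caught as well because, depending on the runtime, the aborted read may surface that way instead of as ObjectDisposedException.

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Threading;
using System.Threading.Tasks;

public static class SocketTimeouts
{
    // Hypothetical helper: returns the number of bytes read, or -1 if the
    // token fired before any data arrived.
    public static async Task<int> ReadOrCloseAsync(
        TcpClient client, byte[] buffer, CancellationToken token)
    {
        NetworkStream stream = client.GetStream();

        // Closing the client is what actually aborts the pending read.
        using (token.Register(() => client.Close()))
        {
            try
            {
                return await stream
                    .ReadAsync(buffer, 0, buffer.Length)
                    .ConfigureAwait(false);
            }
            catch (ObjectDisposedException) when (token.IsCancellationRequested)
            {
                return -1;
            }
            catch (IOException) when (token.IsCancellationRequested)
            {
                return -1;
            }
        }
    }
}
```

Note that the connection is unusable after cancellation; this aborts communication rather than cancelling a single read.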
If, on the other hand, you want to detect a half-open scenario (where your side is reading but the other side has lost its connection), then the best solution is to periodically send data. I describe this more on my blog.
I am using the .NET 4.5 HttpClient class to make a POST request to a server a number of times. The first 3 calls run quickly, but the fourth time a call to await client.PostAsync(...) is made, it hangs for several seconds before returning the expected response.
using (HttpClient client = new HttpClient())
{
// Prepare query
StringBuilder queryBuilder = new StringBuilder();
queryBuilder.Append("?arg=value");
// Send query
using (var result = await client.PostAsync(BaseUrl + queryBuilder.ToString(),
new StreamContent(streamData)))
{
Stream stream = await result.Content.ReadAsStreamAsync();
return new MyResult(stream);
}
}
The server code is shown below:
HttpListener listener;
void Run()
{
listener.Start();
ThreadPool.QueueUserWorkItem((o) =>
{
while (listener.IsListening)
{
ThreadPool.QueueUserWorkItem((c) =>
{
var context = c as HttpListenerContext;
try
{
// Handle request
}
finally
{
// Always close the stream
context.Response.OutputStream.Close();
}
}, listener.GetContext());
}
});
}
Inserting a debug statement at // Handle request shows that the server code doesn't seem to receive the request as soon as it is sent.
I have already investigated whether it could be a problem with the client not closing the response, meaning that the number of connections the ServicePoint provider allows could be reached. However, I have tried increasing ServicePointManager.MaxServicePoints but this has no effect at all.
I also found this similar question:
.NET HttpClient hangs after several requests (unless Fiddler is active)
I don't believe this is the problem with my code - even changing my code to exactly what is given there didn't fix the problem.
The problem was that there were too many Task instances scheduled to run.
Changing some of the Task.Factory.StartNew calls in my program for tasks which ran for a long time to use the TaskCreationOptions.LongRunning option fixed this. It appears that the task scheduler was waiting for other tasks to finish before it scheduled the request to the server.
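For reference, a call with that option looks roughly like the following sketch. The wrapper class is hypothetical; the StartNew overload itself is the standard one taking an action, a token, creation options and a scheduler.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LongRunningWork
{
    // Hypothetical wrapper: hints to the scheduler that the work will run
    // for a long time, so it should not occupy a regular pool worker.
    public static Task Start(Action work, CancellationToken token)
    {
        return Task.Factory.StartNew(
            work,
            token,
            TaskCreationOptions.LongRunning,
            TaskScheduler.Default);
    }
}
```

Short-lived work, such as handling a single request, can stay on the regular thread pool.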
My requirement is to process X number of files; we usually receive around 100 files each day. Each one is a zip file, so I have to open it, create a stream and send it to a WebApi service, which is a workflow that calls two more WebApi steps.
I implemented a console application that loops through the files and calls a wrapper which makes a REST call using HttpWebRequest.GetResponse().
I stress-tested the solution with 11K files. The synchronous version takes around 17 minutes to process all of them, but I would like to create an async version that uses await HttpWebRequest.GetResponseAsync().
Here is the Async version:
private async Task<KeyValuePair<HttpStatusCode, string>> REST_CallAsync(
string httpMethod,
string url,
string contentType,
object bodyMessage = null,
Dictionary<string, object> headerParameters = null,
object[] queryStringParamaters = null,
string requestData = "")
{
try
{
HttpWebRequest req = (HttpWebRequest)HttpWebRequest.Create("some url");
req.Method = "POST";
req.ContentType = contentType;
//Adding zip stream to body
var reqBodyBytes = ReadFully((Stream)bodyMessage);
req.ContentLength = reqBodyBytes.Length;
Stream reqStream = req.GetRequestStream();
reqStream.Write(reqBodyBytes, 0, reqBodyBytes.Length);
reqStream.Close();
//Async call
var resp = await req.GetResponseAsync();
var httpResponse = (HttpWebResponse)resp as HttpWebResponse;
var responseData = new StreamReader(resp.GetResponseStream()).ReadToEnd();
return new KeyValuePair<HttpStatusCode,string>(httpResponse.StatusCode, responseData);
}
catch (WebException webEx)
{
//something
}
catch (Exception ex)
{
//something
}
In my console Application I have a loop to open and call the async (CallServiceAsync under the covers calls the method above)
foreach (var zipFile in Directory.EnumerateFiles(directory))
{
using (var zipStream = System.IO.File.OpenRead(zipFile))
{
await _restFulService.CallServiceAsync<WorkflowResponse>(
zipStream,
headerParameters,
null,
true);
}
processId++;
}
}
What ended up happening was that only 2K of the 11K files got processed and no exception was thrown, so I was clueless. I then changed the code that calls the async method to:
foreach (var zipFile in Directory.EnumerateFiles(directory))
{
using (var zipStream = System.IO.File.OpenRead(zipFile))
{
tasks.Add(_restFulService.CallServiceAsync<WorkflowResponse>(
zipStream,
headerParameters,
null,
true));
}
}
}
And I have another loop to await the tasks:
foreach (var task in await System.Threading.Tasks.Task.WhenAll(tasks))
{
if (task.Value != null)
{
Console.WriteLine("Ending Process");
}
}
And now I am facing a different error: when I process three files, the third one receives:
The client is disconnected because the underlying request has been completed. There is no longer an HttpContext available.
My question is, what am I doing wrong here? I use SimpleInjector as my IoC container; could that be the problem?
Also, when you call WhenAll, does it wait for each thread to run? Doesn't that make it synchronous, so that it waits for one thread to finish before executing the next one? I am new to this async world, so any help would be much appreciated.
Well for those that added -1 to my question and instead of providing some type of solution just suggested something meaningless, here it is the answer and the reason why specifying as much detail as possible is useful.
First problem: since I'm using IIS Express, the web applications are not available unless I'm running my solution (F5). That happened to me sometimes, not always.
The second problem, and the one that gave me a huge headache, is that not all the files got processed. I should have known the reason before: it is the way I used async/await in a console application. I forced my console app to work with async by doing:
static void Main(string[] args)
{
System.Threading.Tasks.Task.Run(() => MainAsync(args)).Wait();
}
static async void MainAsync(string[] args)
{
//rest of code
Then, if you note, in my foreach I had the await keyword. What was happening is that await returns control to the caller. Because MainAsync is declared async void, nothing can wait for it: the Task.Run delegate returns at the first incomplete await, Wait() returns, Main exits, and the process ends while files are still in flight (I had used await by mistake when calling the async method).
So the result was that my process only handled some X number of files. What I ended up doing is the following:
Add a list of tasks, the same way I did above:
tasks.Add(_restFulService.CallServiceAsync<WorkflowResponse>(....
And the way to run the threads is (in my console app):
ExecuteAsync(tasks);
Finally my method:
static void ExecuteAsync(List<System.Threading.Tasks.Task<KeyValuePair<HttpStatusCode, WorkflowResponse>>> tasks)
{
System.Threading.Tasks.Task.WhenAll(tasks).Wait();
}
UPDATE: Based on Scott's feedback, I changed the way I execute my threads.
And now I'm able to process all my files. I tested it: processing 1000 files took 160+ seconds with my synchronous process (I have a three-step workflow to process each file) and 80+ seconds with the async process, so almost half the time. On my production server with IIS, I believe the execution time will be lower.
Hope this helps anyone facing this type of issue.
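The pattern in this answer can be condensed into the sketch below: the async method returns a Task rather than void, so the entry point has something to block on. The names are hypothetical and the jobs stand in for the CallServiceAsync calls. It also touches the earlier WhenAll question: the tasks are already running when they are collected, and WhenAll only waits for all of them to finish; it does not run them one after another.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ConsoleEntry
{
    // Returning Task (not void) gives the caller something to wait on.
    public static async Task<int> ProcessAllAsync(IEnumerable<Func<Task>> jobs)
    {
        // Each job starts as soon as it is invoked; ToList forces that now.
        List<Task> tasks = jobs.Select(job => job()).ToList();
        await Task.WhenAll(tasks);
        return tasks.Count;
    }

    // What a synchronous Main would call: block until every job is done.
    public static int RunBlocking(IEnumerable<Func<Task>> jobs) =>
        ProcessAllAsync(jobs).GetAwaiter().GetResult();
}
```

With C# 7.1 or later, Main itself can be declared as static async Task Main, which removes the need for the blocking wrapper.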
I am trying to establish a TCP connection with a number of IPs in parallel, and do that as fast as possible. I have converted some older code to use AsyncCTP for that purpose, introducing the parallelism.
Changes to Design and Speed, and Accessing Successful Connections?
My question is three-fold:
How bad is the following flow / what should I change?
i.e. the await starts a bunch of parallel TcpRequest threads,
but within each TcpRequest there is a tcpClient.BeginConnect
as well as another thread being spawn for reading (if connection is successful)
and the writing to the stream is done with a Wait / Pulse mechanism in a while loop.
Secondly, how could I make the process of connecting to a number of targets faster?
Currently, if the ip:port targets are not actually running any servers, I get "All Done" printed about 18 seconds after the start when trying to connect to about 500 local targets (which are not listening on those ports, so the connections fail).
How could I access the WriteToQueue method of successful connections from the mothership?
Async Mothership Trying to Connect to All Targets in Parallel
// First get a bunch of IPAddress:Port targets
var endpoints = EndPointer.Get();
// Try connect to all those targets
var tasks = from t in endpoints select TcpRequester.ConnectAsync(t);
await TaskEx.WhenAll(tasks);
Debug.WriteLine("All Done");
Static Accessor for Individual TcpRequest Tasks
public static Task<TcpRequester> ConnectAsync(IPEndPoint endPoint)
{
var tcpRequester = Task<TcpRequester>.Factory.StartNew(() =>
{
var request = new TcpRequester();
request.Connect(endPoint);
return request;
}
);
return tcpRequester;
}
TcpRequester with BeginConnect TimeOut and new Thread for Reading
public void Connect(IPEndPoint endPoint)
{
TcpClient tcpClient = null;
Stream stream = null;
try
{
using (tcpClient = new TcpClient())
{
tcpClient.ReceiveTimeout = 1000;
tcpClient.SendTimeout = 1000;
IAsyncResult ar = tcpClient.BeginConnect(endPoint.Address, endPoint.Port, null, null);
WaitHandle wh;
wh = ar.AsyncWaitHandle;
try
{
if (!ar.AsyncWaitHandle.WaitOne(TimeSpan.FromMilliseconds(1000), false))
{
throw new TimeoutException();
}
if (tcpClient.Client != null)
{
// Success
tcpClient.EndConnect(ar);
}
if (tcpClient.Connected)
{
stream = tcpClient.GetStream();
}
// Start to read stream until told to close or remote close
ThreadStart reader = () => Read(stream);
// Reading is done in a separate thread
var thread = new Thread(reader);
thread.Start();
// See Writer method below
Writer(stream);
} finally
{
wh.Close();
}
}
} catch (Exception ex)
{
if (tcpClient != null)
tcpClient.Close();
}
}
}
Writing to Stream with Wait and Pulse
readonly Object _writeLock = new Object();
public void WriteToQueue(String message)
{
_bytesToBeWritten.Add(Convert(message));
lock (_writeLock)
{
Monitor.Pulse(_writeLock);
}
}
void Writer(Stream stream)
{
while (!_halt)
{
while (_bytesToBeWritten.Count > 0 && !_halt)
{
// Write method does the actual writing to the stream:
if (Write(stream, _bytesToBeWritten.ElementAt(0)))
{
_bytesToBeWritten.RemoveAt(0);
} else
{
Discontinue();
}
}
if (!(_bytesToBeWritten.Count > 0) && !_halt)
{
lock (_writeLock)
{
Monitor.Wait(_writeLock);
}
}
}
Debug.WriteLine("Discontinuing Writer and TcpRequester");
}
There are a few red flags that pop out at a cursory glance.
You have this Stream that is accepting reads and writes, but there is no clear indication that the operations have been synchronized appropriately. The documentation does state that a Stream's instance methods are not safe for multithreaded operations.
There does not appear to be synchronization around operations involving _bytesToBeWritten.
Acquiring a lock solely to execute Monitor.Wait and Monitor.Pulse is a little weird, if not downright incorrect. It is basically equivalent to using a ManualResetEvent.
It is almost never correct to use Monitor.Wait without a while loop. To understand why you have to understand the purpose of pulsing and waiting on a lock. That is really outside the scope of this answer.
It appears like the Writer and WriteToQueue methods are an attempt to generate a producer-consumer queue. The .NET BCL already contains the innards for this via the BlockingCollection class.
For what it is worth I see nothing flagrantly wrong with the general approach and usage of the await keyword.
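As an illustration of the BlockingCollection suggestion, the Writer/WriteToQueue pair could be replaced by something like the sketch below. The class name and the write delegate are hypothetical; the delegate stands in for the question's Write(stream, bytes) method.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public sealed class WriteQueue : IDisposable
{
    private readonly BlockingCollection<byte[]> _pending =
        new BlockingCollection<byte[]>();
    private readonly Task _writer;

    public WriteQueue(Action<byte[]> write)
    {
        // Single consumer: blocks while the queue is empty and ends once
        // CompleteAdding has been called and the queue has drained.
        _writer = Task.Factory.StartNew(() =>
        {
            foreach (byte[] message in _pending.GetConsumingEnumerable())
            {
                write(message);
            }
        }, TaskCreationOptions.LongRunning);
    }

    // Producers may call this from any thread; no explicit lock is needed.
    public void Enqueue(byte[] message) => _pending.Add(message);

    public void Dispose()
    {
        _pending.CompleteAdding();
        _writer.Wait();
        _pending.Dispose();
    }
}
```

Because only the consumer task touches the stream for writing, this also keeps writes serialized without Monitor.Wait and Monitor.Pulse.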