TcpClient SocketException with timeout after 20s no matter what - c#

I'd like to wait for a slow response from a client with TcpClient but get a timeout after about 20s no matter how I configure it. This is my attempt:
using (var client = new TcpClient { ReceiveTimeout = 9999999, SendTimeout = 9999999 })
{
    await client.ConnectAsync(ip, port);
    using (var stream = client.GetStream())
    {
        // Some quick read/writes happen here via the stream with stream.Write() and stream.Read(), successfully.
        // Now the remote host is calculating something long and will reply if finished. This throws the below exception however instead of waiting for >20s.
        var bytesRead = await stream.ReadAsync(new byte[8], 0, 8);
    }
}
The exception is an IOException:
Unable to read data from the transport connection: A connection
attempt failed because the connected party did not properly respond
after a period of time, or established connection failed because
connected host has failed to respond.
...which contains a SocketException inside:
A connection attempt failed because the connected party did not
properly respond after a period of time, or established connection
failed because connected host has failed to respond
SocketErrorCode is TimedOut.
The 20s seems to be an OS default on Windows but isn't it possible to override it from managed code by interacting with TcpClient? Or how can I wait for the response otherwise?
I've also tried the old-style BeginRead-EndRead way and the same happens on EndRead. The problem is also not caused by Windows Firewall or Defender.

I'd like to wait for a slow response from a client
It's important to note that it's the connection that is failing. The connection timeout is only for establishing a connection, which should always be very fast. In fact, the OS will accept connections on behalf of an application, so you're literally just talking about a packet round-trip. 21 seconds should be plenty.
Once the connection is established, then you can just remove the ReceiveTimeout/SendTimeout and use asynchronous reads to wait forever.
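As a minimal sketch of that advice (the helper name ReadExactlyAsync is made up for illustration and is not part of the question's code), an asynchronous read without any timeout configuration could look like this. ReceiveTimeout only affects synchronous reads, so it is simply left out:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

static class SlowReplyReader
{
    // Hypothetical helper: reads exactly `count` bytes from the stream,
    // waiting for as long as the peer needs to produce them.
    // No ReceiveTimeout is set; the await simply stays pending until data arrives.
    public static async Task<byte[]> ReadExactlyAsync(Stream stream, int count)
    {
        var buffer = new byte[count];
        var offset = 0;
        while (offset < count)
        {
            // ReadAsync may return fewer bytes than requested, so loop.
            int read = await stream.ReadAsync(buffer, offset, count - offset);
            if (read == 0)
            {
                // 0 means the peer closed its side before sending everything.
                throw new EndOfStreamException(
                    "Connection closed after " + offset + " of " + count + " bytes.");
            }
            offset += read;
        }
        return buffer;
    }
}
```

With the question's code this would be called as `var reply = await SlowReplyReader.ReadExactlyAsync(stream, 8);`. Unlike the original `new byte[8]` passed inline, the buffer is returned so the reply can actually be inspected.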

It turns out that the remote host wasn't responding in a timely manner, hence the problem. Let me elaborate, and though this will be a solution very specific to my case maybe it will be useful for others too.
The real issue wasn't a timeout per se, as the exception suggested, but rather what the exceptions thrown on subsequent Read() calls showed: "An existing connection was forcibly closed by the remote host".
The remote host wasn't purposely closing the connection. Rather what happened is that when it was slow to respond it was actually so busy that it wasn't processing any TCP traffic either. While the local host wasn't explicitly sending anything while waiting for a response this still was an issue: the local host tried to send ACKs for previous transmissions of the remote host. Since these couldn't be delivered the local host determined that the remote host "forcibly closed" the connection.
I got the clue from looking at the traffic with Wireshark (always good to try to look at what's beneath the surface instead of guessing around): it was apparent that while the remote host was busy it showed complete radio silence. At the same time Wireshark showed retransmission attempts carried out by the local host, indicating that this is behind the issue.
Thus the solution couldn't be implemented on the local host either, the behavior of the remote host needed to be changed.

Related

C# TcpClient - What's the best way for a client to determine if remote server has gracefully shutdown connection?

I know there are various suggested ways to achieve this, using Poll/Available/Send, etc., but none of them seem to work for me. I have a connection to a remote server, which the remote server gracefully disconnects after a specific message. I need to ensure I don't disconnect from the remote server myself, and wait for the server to shutdown connection before I can safely reconnect and send other follow-up messages.
I'm using the ReadAsync method on Stream to get responses from that connection, as below:
while (await TcpClientObject.GetStream().ReadAsync(bufferData, 0, bufferData.Length) > 0)
{
//My logic here to handle responses
}
What's the most recommended approach for me to verify that the remote server has gracefully shutdown the connection before attempting a reconnect? Thanks.
If everything goes well ReadAsync will return 0 when the server closes the connection.
An exception is thrown if your side detects an abnormal disconnection, but in the worst case your end still thinks it's connected and ReadAsync won't return.
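A rough sketch of how the first two outcomes can be told apart in the read loop. The names CloseReason and ReadUntilClosedAsync are invented for this example; the third case (a silently dead connection) cannot be detected by this loop alone:

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Threading.Tasks;

// Hypothetical result type for this sketch.
enum CloseReason { Graceful, Aborted }

static class DisconnectWatcher
{
    // Reads until the connection ends and reports how it ended.
    // onData receives the shared buffer and the number of valid bytes in it.
    public static async Task<CloseReason> ReadUntilClosedAsync(
        NetworkStream stream, Action<byte[], int> onData)
    {
        var buffer = new byte[4096];
        try
        {
            while (true)
            {
                int read = await stream.ReadAsync(buffer, 0, buffer.Length);
                if (read == 0)
                {
                    // The server shut down its sending side cleanly.
                    return CloseReason.Graceful;
                }
                onData(buffer, read);
            }
        }
        catch (IOException)
        {
            // NetworkStream wraps socket failures (e.g. a reset) in IOException.
            return CloseReason.Aborted;
        }
    }
}
```

Once the method returns CloseReason.Graceful, the server has closed its side, so the old TcpClient can be closed and a new connection opened.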

If no data is received over stream tcpclient close the connection

I would like to know what exactly happens on TcpClient's NetworkStream when a timeout occurs.
While debugging the code I found that if no data is received within the configured timeout period after a request is sent, it throws the exception below and unfortunately closes the connection (TcpClient.Connected becomes false):
Unable to read data from the transport connection: A connection
attempt failed because the connected party did not properly respond
after a period of time, or established connection failed because
connected host has failed to respond.
Throwing the exception is fine, but I would like to know how I can prevent it from closing the connection.
It would be great if someone could provide more insight on this.
Have you checked this one: "Reconnect TCPClient after interruption"? I think that if your TCP connection has a long enough TTL, then should an exception occur (I believe a SocketException would be thrown) you can catch it and initiate your retry logic. There are several ways to implement this, and the choice obviously depends on the use case, but normally there is a configurable number of attempts before "giving up" on connecting. That way your manager will retry connecting X times and carry on if a connection succeeds; otherwise it will propagate the exception up the chain.
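A minimal sketch of such retry logic, assuming a fixed delay between attempts. The class and method names (Reconnector, ConnectWithRetryAsync) and the maxAttempts and delay parameters are illustrative choices, not taken from the linked question:

```csharp
using System;
using System.Net.Sockets;
using System.Threading.Tasks;

static class Reconnector
{
    // Tries to connect up to maxAttempts times, pausing `delay` between tries.
    // The last SocketException propagates to the caller if every attempt fails.
    public static async Task<TcpClient> ConnectWithRetryAsync(
        string host, int port, int maxAttempts, TimeSpan delay)
    {
        for (int attempt = 1; ; attempt++)
        {
            var client = new TcpClient();
            try
            {
                await client.ConnectAsync(host, port);
                return client; // success: the caller owns and disposes the client
            }
            catch (SocketException)
            {
                client.Close(); // release the failed socket before retrying
                if (attempt >= maxAttempts)
                {
                    throw; // give up and propagate up the chain
                }
                await Task.Delay(delay);
            }
        }
    }
}
```

A TcpClient cannot be reused after its connection has failed, which is why each attempt creates a fresh instance.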

Self-healing SslStream

I'm writing a service that needs to maintain a long running SSL connection to a remote server. I need this server to be self-healing, that is if it's disconnected for any reason then the next time it's written to it will reconnect. I've tried this:
bool isConnected = client.Connected && client.Client.Poll(0, SelectMode.SelectWrite) && stream.CanWrite;
if (!isConnected )
{
this.connected = false;
GetConnection();
}
stream.Write(bytes, 0, bytes.Length);
stream.Flush();
But I find it doesn't act as I would expect it. If I simulate a network outage by disabling my wifi, I'm still able to write to the stream with stream.Write() for approximately 20 seconds. Then next time I try to write to it, none of client.Connected, client.Client.Poll(), or stream.CanWrite() return false, but when I go to write to the stream I get a socket exception. Finally, if I try to recreate the connection, I get this exception: An existing connection was forcibly closed by the remote host.
I would appreciate any help creating a long-running SslStream that can withstand network failure. Thanks!
From a 10,000-foot point of view:
The reason you can still write to the stream after shutting down your wifi is that there is a network buffer holding the data for transmission. A successful stream.Write/stream.Flush means the TCP/IP stack has accepted the data and buffered it for transmission, not that the data has reached its target.
It takes time for the TCP/IP stack to notice a full media disconnection (connection lost/reset), because even if there is no physical link TCP/IP treats this as a temporary network issue and keeps retrying for a while (the network could simply be dropping packets, so the stack keeps retransmitting).
Looking at it the other way round, you wouldn't want all your programs to fail whenever there is a network hiccup (this happens all too often on the internet), so TCP/IP takes its time before notifying the application layer that the connection has become invalid (after retrying several times and waiting a reasonable amount of time).
You can always reconnect to the server when the SslStream fails and continue sending data, although you will find it is not as easy as that: there are scenarios where you send data and the server never receives it, and others where the server receives the data but you never receive any ACK from it. So depending on your needs, self-healing alone might not be enough.
Self-healing is simple to implement; data consistency and reliability are harder and usually require the server to support some kind of reliable messaging mechanism to ensure all data has been sent and received.
The underlying protocol for SSL is TCP. TCP will usually only send data if the application wants it to deliver data, or if it needs to reply to data received from the other side by sending an ACK. This means that a broken connection, such as a lost link, will not be noticed until you try to send data. But you will not notice immediately, because:
A write to the socket only delivers the data to the OS kernel and returns success if this delivery was successful.
The kernel will then try to deliver the data to the peer and will wait for the ACK from the peer.
If it does not get an ACK it will retry delivering the data, and only after some unsuccessful retries will the kernel declare the connection broken.
Only after the connection is marked broken by the kernel will the next write or read return the error from the kernel to user space, for example by returning EPIPE on a write.
This means that if you want to know up-front whether the connection is still alive, you have to make sure there is a regular data exchange on the connection. At the TCP level you might enable TCP keep-alive (SO_KEEPALIVE), but by default this might use an interval of some hours between exchanged packets. At the SSL layer you might try to use the infamous heartbeat extension, but most peers will not understand it. The last choice is to implement some kind of heartbeat in your own application.
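As a hedged illustration of the first and last options: the sketch below turns on TCP keep-alive for a socket and runs a very simple application-level heartbeat. The single 0x00 ping byte is a made-up message format, and the server would have to understand and tolerate it. The names Heartbeat, EnableTcpKeepAlive and RunAsync are invented for this example.

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Threading;
using System.Threading.Tasks;

static class Heartbeat
{
    // Enables TCP keep-alive with the operating system's default interval.
    public static void EnableTcpKeepAlive(Socket socket)
    {
        socket.SetSocketOption(SocketOptionLevel.Socket, SocketOptionName.KeepAlive, true);
    }

    // Writes a ping every `interval` until cancelled.
    // When the kernel finally declares the connection broken, the write throws
    // (typically an IOException) and the caller can start reconnecting.
    // Note: SslStream does not support concurrent writes, so real code has to
    // serialize these pings with its regular writes (not shown here).
    public static async Task RunAsync(Stream stream, TimeSpan interval, CancellationToken token)
    {
        var ping = new byte[] { 0x00 }; // hypothetical ping message
        try
        {
            while (true)
            {
                await Task.Delay(interval, token);
                await stream.WriteAsync(ping, 0, ping.Length, token);
                await stream.FlushAsync(token);
            }
        }
        catch (OperationCanceledException)
        {
            // Normal shutdown requested by the caller.
        }
    }
}
```

A heartbeat that also expects a reply within a deadline notices a dead peer sooner, because it does not depend on the kernel's retransmission limit. That needs cooperation from the server's protocol.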
As for the self-healing: when reconnecting you get a new TCP connection and you also need to do a full SSL handshake, because the last SSL connection was not cleanly closed and thus cannot be resumed. The server has no idea that this new connection is just a continuation of the old one, so you have to implement some kind of meta-connection spanning multiple TCP connections inside your application layer on both client and server. Inside this meta-connection you need your own data tracking to detect which data was really accepted by the peer and which was only sent but never explicitly accepted because the connection broke. That sounds like a kind of TCP on top of TCP.

How can you get Socket.Shutdown to raise a SocketException?

MSDN states that Socket.Shutdown can throw a SocketException. I've had this happen to me in production recently after introducing a load balancer between my clients and my server. But I cannot reproduce it in testing without a load balancer. Can you?
Some background - I have a server application written in C# that uses TCP sockets to communicate with clients. The application protocol is very simple for the server: accept connection, read request, send response, wait for client shutdown (read expecting 0 bytes), shutdown.
This code has been in production without issue for many years. However after introducing a load balancer in front of multiple server machines one of the server processes crashed due to an unhandled SocketException that was raised when the server called Socket.Shutdown. The particular client had timed out whilst waiting for the server to respond and attempted to close the connection early. The exception message on the server was "An existing connection was forcibly closed by the remote host." It is not unusual for the client to do this, but obviously prior to the load balancer the server was raising this error at a different point in the code. Still it's clearly a server bug and the fix is obvious - handle the exception.
However using a test client application (also written in C#), I cannot find a sequence of operations that will cause the server to raise an exception during Socket.Shutdown. It appears that the load balancer did something unusual to the TCP packets, but still, I dislike using that as excuse for failing to reproduce the issue.
I can run both server and client code in debug and I have WireShark watching the packets.
On the client side, after the connection is established, the operations are:
Socket.Send() // single call
Socket.Receive() // this one times out in our scenario
Socket.XXX() // various choices as described below
On the server side, after the connection is established, the operations are:
1) Socket.Receive() //multiple calls until complete message is received
2) // Processing...
3) Socket.Send() //single call
4) Socket.Receive() // single call expecting 0 bytes
5) Socket.Shutdown()
Presume each call is wrapped with try..catch(SocketException)
A) If I pause the server during step 2, wait for the client to time out, and initiate a client shutdown using Socket.Shutdown(SocketShutdown.Send), a FIN packet is sent to the server. When the server resumes processing, all the calls will succeed (3 through 5) because that's a perfectly acceptable TCP flow.
B) If I pause the server during step 2, wait for the client to time out, and initiate a client shutdown using Socket.Shutdown(SocketShutdown.Both) or Socket.Close(), again a FIN packet is sent to the server. When the server resumes processing, step 3 succeeds, but it causes the client to send a RST packet in response as it is not accepting more data. If this RST arrives before step 4 then Socket.Receive throws and step 5 succeeds. If it arrives after step 4, then Socket.Receive succeeds (returns 0 bytes), and step 5 still succeeds.
C) If the client has "Dont Linger" set (Linger enabled with 0 timeout), and I pause the server during processing, wait for the client to time out, and initiate a client shutdown using Socket.Shutdown(SocketShutdown.Both) or Socket.Close(), a RST packet is immediately sent to the server. When the server resumes processing, steps 3 and 4 will fail but step 5 still succeeds.
I think what puzzles me most is that Socket.Shutdown appears to ignore my test client RST packets and yet evidently my load balancer was able to send a RST packet that was not ignored. What am I missing? What else can I try?
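For reference, the "handle the exception" fix mentioned above could be sketched as follows. The helper name TryShutdown is invented here, and it does not answer the question of how to reproduce the failure:

```csharp
using System;
using System.Net.Sockets;

static class SafeSocket
{
    // Attempts a full shutdown and reports failure instead of letting
    // a SocketException escape and crash the process.
    public static bool TryShutdown(Socket socket, out SocketError error)
    {
        try
        {
            socket.Shutdown(SocketShutdown.Both);
            error = SocketError.Success;
            return true;
        }
        catch (SocketException ex)
        {
            // e.g. ConnectionReset when the peer has already sent a RST.
            error = ex.SocketErrorCode;
            return false;
        }
    }
}
```

The caller can log the returned SocketError and still call Close() on the socket afterwards.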

Unable to make 2 parallel TCP requests to the same TCP Client

Error:
Unable to read data from the transport connection: A blocking operation was interrupted by a call to WSACancelBlockingCall
Situation
There is a TCP Server
My web application connects to this TCP Server
Using the below code:
TcpClientInfo = new TcpClient();
_result = TcpClientInfo.BeginConnect(<serverAddress>,<portNumber>, null, null);
bool success = _result.AsyncWaitHandle.WaitOne(20000, true);
if (!success)
{
TcpClientInfo.Close();
throw new Exception("Connection Timeout: Failed to establish connection.");
}
NetworkStreamInfo = TcpClientInfo.GetStream();
NetworkStreamInfo.ReadTimeout = 20000;
Two users use the same application from two different locations to access information from this server at the SAME TIME
The server takes around 2 seconds to reply
Both connect
But one of the users gets the above error
"Unable to read data from the transport connection: A blocking operation was interrupted by a call to WSACancelBlockingCall"
when trying to read data from stream
How can I resolve this issue?
Use a better way of connecting to the server
Can't because it's a server issue
If it is a server issue, how should the server handle requests to avoid this problem?
This looks Windows-specific to me, which isn't my strong point, but...
You don't show us the server code, only the client code. I can only assume, then, that your server code accepts a socket connection, does its magic, sends something back, and closes the client connection. If this is your case, then that's the problem.
The accept() call is a blocking one that waits for the next client connection attempt and binds to it. There may be a queue of connection attempts created and administered by the OS, but it can still only accept one connection at a time.
If you want to be able to handle multiple simultaneous requests, you have to change your server to call accept(), and when a new connection comes in, launch a worker thread/process to handle the request and go back to the top of the loop where the accept() is. So the main loop hands off the actual work to another thread/process so it can get back to the business of waiting for the next connection attempt.
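Since the server code was not shown, the following is only a sketch of that accept-and-hand-off loop in C#, using tasks instead of dedicated threads. The names ConcurrentServer, RunAsync and handleClient are hypothetical, and the listener is assumed to have been started by the caller:

```csharp
using System;
using System.Net.Sockets;
using System.Threading.Tasks;

static class ConcurrentServer
{
    // Accepts connections in a loop and hands each one to handleClient on a
    // separate task, so a slow request does not block the next accept.
    public static async Task RunAsync(TcpListener listener, Func<TcpClient, Task> handleClient)
    {
        while (true)
        {
            TcpClient client;
            try
            {
                client = await listener.AcceptTcpClientAsync();
            }
            catch (ObjectDisposedException) { return; } // listener stopped
            catch (SocketException) { return; }         // treated as shutdown in this sketch

            // Fire and forget: go straight back to accepting.
            _ = Task.Run(async () =>
            {
                try
                {
                    using (client)
                    {
                        await handleClient(client);
                    }
                }
                catch (Exception ex)
                {
                    // One failing client must not take down the accept loop.
                    Console.Error.WriteLine("Client handler failed: " + ex.Message);
                }
            });
        }
    }
}
```

Each connection gets its own handler invocation, so two users hitting the server at the same time are served independently.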
Real server applications are more complex than this. They launch a bunch of "worker bee" threads/processes in a pool and reuse them for future requests. Web servers do this, for instance.
If my assumptions about your server code are wrong, please enlighten us as to what it looks like.
Just a thought.
If your server takes 2 seconds to respond, shouldn't the timeout values be 2000 instead of 20000 (which is 20 seconds)? The first argument of AsyncWaitHandle.WaitOne() is in milliseconds.
If you are waiting 20 seconds, maybe your server is disconnecting you for being idle?
