Non-persistent HTTP/1.1 connections faster than persistent ones? - c#

I am using Entity Framework 4.0 together with a REST web service.
On the client side, loading the data/entities takes 40 sequential web requests.
When I set HttpWebRequest.KeepAlive to false (Fiddler shows Connection: Close headers in the client-server communication), data loading is about 50% faster (the requests are still sequential), and I am wondering why.
From Wikipedia:
HTTP persistent connection, also called HTTP keep-alive, or HTTP connection reuse, is the idea of using the same TCP connection to send and receive multiple HTTP requests/responses, as opposed to opening a new connection for every single request/response pair.
From MSDN:
When the KeepAlive property is true, the application makes persistent connections to the servers that support them.
When using HTTP/1.1, Keep-Alive is on/true by default.
What's wrong? How can I speed up the persistent requests?

Maybe the client's limit on the number of concurrent connections per host is higher for non-persistent connections than for persistent ones. With keep-alive the client might allow, say, 10 connections in parallel, while without keep-alive it might allow 15.
But that would only be faster on a local network, where establishing a connection is really fast. On the internet (RTT of 5-200 ms) every new connection costs roughly one extra round trip for the three-way handshake (SYN, SYN+ACK, ACK) before the request can even be sent. So if you have many small requests (for example images under 1 kB), keep-alive can be about twice as fast: you set up the connection once, and then each request is one packet out and one packet back. Without keep-alive, every request needs the three handshake packets, then the request, then the response, and then up to four more packets (a FIN and an ACK in each direction) to close the connection.
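A rough back-of-the-envelope model of that argument, as a sketch only. It assumes each request and each response fits in a single packet, that a new TCP connection costs one extra round trip before the request can go out, and it ignores TLS, server processing time and connection teardown. `ConnectionCostModel` and its method names are made up for this illustration.

```csharp
using System;

// Hypothetical helper: estimates the wall-clock time of N sequential small requests.
public static class ConnectionCostModel
{
    // Keep-alive: one handshake for the whole run, then one round trip per request.
    public static TimeSpan WithKeepAlive(int requests, TimeSpan rtt)
    {
        if (requests <= 0) return TimeSpan.Zero;
        return TimeSpan.FromTicks(rtt.Ticks * (requests + 1));
    }

    // Connection: close: every request pays one round trip for the handshake
    // and one for the request/response exchange.
    public static TimeSpan WithoutKeepAlive(int requests, TimeSpan rtt)
    {
        if (requests <= 0) return TimeSpan.Zero;
        return TimeSpan.FromTicks(rtt.Ticks * requests * 2L);
    }
}
```

Under these assumptions, 40 sequential requests at a 50 ms RTT take about 2.05 s with keep-alive and 4 s without. On a LAN with a sub-millisecond RTT the difference all but disappears, so other effects can dominate.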

Related

How to reduce delays caused by a Server TCP Spurious retransmission and subsequent Client TCP retransmission?

I have a .NET application (running on a Windows PC) that communicates with a Linux box via OPC UA. The use case is to make ~40 read requests to the server serially. Once these 40 read calls are complete, the next cycle of 40 read calls begins. Each read call returns a response from the server carrying a payload of ~16 KB, which is fragmented and delivered to the client. For most requests, the server finishes delivering the complete response within 5 ms. However, for some requests it takes ~300 ms.
In scenarios where this delay exists, I can see the following pattern of re-transmissions.
[71612] A new Read request is sent to the server.
[71613-71630] The response is delivered to the client.
[71631] A new Read request is sent to the server.
[71632] A TCP Spurious Retransmission occurs from the server for packet [71844] with Seq No. 61624844
[71633] Client sends a DUP ACK for the packet.
[71634] Client does a TCP Retransmission for the read request in [71846] after 288ms
This delay adds up and causes some 5-6 seconds of delay for a complete cycle of 40 requests. I want to figure out what is causing these retransmissions (and hence the delays) and what can be done to:
Reduce the frequency of retransmissions.
Reduce the 300 ms the client waits before retransmitting the obstructed read request.
I have tried disabling the Nagle algorithm on the server to improve performance, but it had no effect. Also, when the response size is halved (8 KB), retransmissions are rare and the delay is correspondingly small. But reducing the response size is not a valid solution in our use case.
The connection to the Linux box goes through a switch; connecting to it directly, point-to-point, reduces the delay only marginally.
I can share the relevant code, but I think the issue is more likely in the TCP stack (or in some configuration that should be enabled), so the code would make little difference.

ServicePoint Configuration - Application starving http connections?

I have two C# asp.net applications running on IIS:
The main application creates up to 80 threads, each of which establishes an HTTP connection to a certain endpoint (the same endpoint for all of them, on the LAN) roughly every 3 seconds.
That endpoint is hosted on localhost (e.g. localhost:4510).
This endpoint is the second application, the "driver", which ultimately establishes a connection to a device within the LAN.
So it's entirely possible to have 80 threads trying to make a request to the driver/device at the same time.
Over time the app seems to have issues with anything involving HTTP clients: RavenDB, Elasticsearch and also the 80 threads.
I read a few things about the ServicePointManager class, especially DefaultConnectionLimit and MaxServicePoints, and how they influence HTTP throughput.
I only have a basic understanding of the underlying mechanism, so I'd like to ask whether I should focus on a specific subject, or what I should check to improve HTTP throughput.
Update:
With the current configuration, CPU load and memory consumption are both low.
The following code shows how the 80 HttpClients that connect to the driver on localhost:4510 are created:
var driverBaseAddressSp = ServicePointManager.FindServicePoint(driverBaseAddress);
Debug.WriteLine(driverBaseAddressSp.ConnectionLimit);
Debug.WriteLine(driverBaseAddressSp.MaxIdleTime);
var connectionUriSp = ServicePointManager.FindServicePoint(connectionUri);
Debug.WriteLine(connectionUriSp.ConnectionLimit);
Debug.WriteLine(connectionUriSp.MaxIdleTime);
return new HttpClient { BaseAddress = driverBaseAddress };
ConnectionLimit shows int.MaxValue when debugging, but I cannot find any configuration for it in the solution.
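For reference, a minimal sketch of setting the limit explicitly in code rather than relying on whatever default is in effect. It assumes the .NET Framework HTTP stack, where HttpClient goes through HttpWebRequest and therefore through ServicePointManager; on .NET Core and later, HttpClient does not consult these settings. `ServicePointSetup`, `ApplyLimit` and the value 100 are illustrative names and numbers, not something taken from the question.

```csharp
using System;
using System.Net;

// Hypothetical helper for making the connection limit explicit.
public static class ServicePointSetup
{
    // Sets the process-wide default and the limit of the ServicePoint
    // that serves the given endpoint, then returns the effective limit.
    public static int ApplyLimit(Uri endpoint, int limit)
    {
        // Picked up by ServicePoint instances created after this call.
        ServicePointManager.DefaultConnectionLimit = limit;

        // Covers a ServicePoint that already exists for this endpoint.
        ServicePoint servicePoint = ServicePointManager.FindServicePoint(endpoint);
        servicePoint.ConnectionLimit = limit;
        return servicePoint.ConnectionLimit;
    }
}
```

Calling something like `ServicePointSetup.ApplyLimit(driverBaseAddress, 100)` once at startup, before the 80 clients are created, would make the limit visible in the code instead of leaving it implicit.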

How to fully terminate HttpWebRequest

Even though I am properly terminating everything, when I check the existing HTTP connections I see that they are not terminated.
For example, when I open 200 concurrent connections by starting different tasks,
I see
158 Established HTTP connections
927 TimeWait
95 SynSent
24 LastAck
6 CloseWait
34 FinWait
Worse, the number of TimeWait connections keeps increasing each minute.
So how can I prevent this from happening?
After a while, Windows becomes unable to make any new requests.
This problem occurs when I use web proxies: Too many proxy connection kills window's resolving hosts ability
Here is when I use 200 connections with different proxies
Connections in the TimeWait state can cause a performance problem.
First, take a look at the TCP state diagram:
https://en.wikipedia.org/wiki/File:Tcp_state_diagram_fixed_new.svg
TimeWait is the state a TCP connection enters after the machine's TCP has sent the ACK segment in response to a FIN segment received from its peer (details in RFC 793, which defined TCP back in 1981: http://www.ietf.org/rfc/rfc793.txt). In this state the socket resources, including the TCB (TCP Control Block) and of course the port, are not released to the OS; they are released only after a timeout expires. The original reason is to deal with the Two Generals problem that can arise between peers over an unreliable medium. The timeout is configurable, and its default value depends on the operating system.
These links can help you to set the TcpTimedWaitDelay parameter in Windows:
https://technet.microsoft.com/en-us/library/cc938217.aspx
http://msdn.microsoft.com/en-us/library/ee377084%28v=bts.10%29.aspx
It says the default value is 240 seconds, but in my tests I experienced lower times (between 60 and 120).
Anyway, today's networks are more reliable, and web services requiring high performance and throughput should reduce this value. I would suggest setting it to 30 seconds, the lowest value in the documented range of 30 to 300 seconds; a smaller value such as 5 seconds is outside that range and may not be honored.
Another parameter that could be useful is the maximum number of ephemeral ports Windows allows a client to open. Windows Server limits the ephemeral TCP port range by default; on some Windows versions the highest ephemeral port is 5000. You can change this by setting the MaxUserPort value in the registry.
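To see how the two settings interact, here is a small arithmetic sketch. It assumes the client is the side that closes each connection first (so the client holds the port in TimeWait), that all connections go to the same remote endpoint, and that the ephemeral range on an older Windows version runs from 1025 to 5000; the range bounds are an assumption for illustration. `TimeWaitBudget` is a made-up name.

```csharp
using System;

// Hypothetical helper: how many new outbound connections per second can be
// sustained before every ephemeral port is tied up in TimeWait.
public static class TimeWaitBudget
{
    public static double MaxConnectionsPerSecond(int firstPort, int lastPort, int timeWaitSeconds)
    {
        if (lastPort < firstPort)
            throw new ArgumentException("lastPort must not be below firstPort.");
        if (timeWaitSeconds <= 0)
            throw new ArgumentOutOfRangeException(nameof(timeWaitSeconds));

        // Each closed connection keeps its local port for the full TimeWait delay.
        int availablePorts = lastPort - firstPort + 1;
        return (double)availablePorts / timeWaitSeconds;
    }
}
```

With these assumptions, `MaxConnectionsPerSecond(1025, 5000, 240)` comes out at roughly 16.6 connections per second, and lowering the delay to 30 seconds raises it to roughly 132.5. Widening the port range via MaxUserPort scales the result the same way.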

.NET WebSockets forcibly closed despite keep-alive and activity on the connection

We have written a simple WebSocket client using System.Net.WebSockets. The KeepAliveInterval on the ClientWebSocket is set to 30 seconds.
The connection is opened successfully and traffic flows as expected in both directions; when the connection is idle, the client sends Pong frames to the server every 30 seconds (visible in Wireshark).
But after 100 seconds the connection is abruptly terminated due to the TCP socket being closed at the client end (watching in Wireshark we see the client send a FIN). The server responds with a 1001 Going Away before closing the socket.
After a lot of digging we have tracked down the cause and found a rather heavy-handed workaround. Despite a lot of Google and Stack Overflow searching we have only seen a couple of other examples of people posting about the problem and nobody with an answer, so I'm posting this to save others the pain and in the hope that someone may be able to suggest a better workaround.
The source of the 100 second timeout is that the WebSocket uses a System.Net.ServicePoint, which has a MaxIdleTime property to allow idle sockets to be closed. On opening the WebSocket if there is an existing ServicePoint for the Uri it will use that, with whatever the MaxIdleTime property was set to on creation. If not, a new ServicePoint instance will be created, with MaxIdleTime set from the current value of the System.Net.ServicePointManager MaxServicePointIdleTime property (which defaults to 100,000 milliseconds).
The issue is that neither WebSocket traffic nor WebSocket keep-alives (Ping/Pong) appear to register as traffic as far as the ServicePoint idle timer is concerned. So exactly 100 seconds after opening the WebSocket it just gets torn down, despite traffic or keep-alives.
Our hunch is that this may be because the WebSocket starts life as an HTTP request which is then upgraded to a websocket. It appears that the idle timer is only looking for HTTP traffic. If that is indeed what is happening that seems like a major bug in the System.Net.WebSockets implementation.
The workaround we are using is to set the MaxIdleTime on the ServicePoint to int.MaxValue. This allows the WebSocket to stay open indefinitely. But the downside is that this value applies to any other connections for that ServicePoint. In our context (which is a Load test using Visual Studio Web and Load testing) we have other (HTTP) connections open for the same ServicePoint, and in fact there is already an active ServicePoint instance by the time that we open our WebSocket. This means that after we update the MaxIdleTime, all HTTP connections for the Load test will have no idle timeout. This doesn't feel quite comfortable, although in practice the web server should be closing idle connections anyway.
We also briefly explored whether we could create a new ServicePoint instance reserved just for our WebSocket connection, but couldn't see a clean way of doing that.
One other little twist which made this harder to track down is that although the System.Net.ServicePointManager MaxServicePointIdleTime property defaults to 100 seconds, Visual Studio is overriding this value and setting it to 120 seconds - which made it harder to search for.
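The workaround described above can be sketched as follows. This is a minimal illustration, assuming the .NET Framework WebSocket client that sits on top of ServicePoint; which Uri the WebSocket's ServicePoint is registered under in a given setup is also an assumption you would need to verify. `WebSocketIdleWorkaround` and `DisableIdleTimeout` are names invented for the example.

```csharp
using System;
using System.Net;

// Hypothetical helper implementing the MaxIdleTime workaround.
public static class WebSocketIdleWorkaround
{
    // Looks up (or creates) the ServicePoint for the endpoint and effectively
    // switches off its idle timer. Note that this affects every connection
    // sharing that ServicePoint, not just the WebSocket.
    public static ServicePoint DisableIdleTimeout(Uri endpoint)
    {
        ServicePoint servicePoint = ServicePointManager.FindServicePoint(endpoint);
        servicePoint.MaxIdleTime = int.MaxValue; // milliseconds
        return servicePoint;
    }
}
```

Because a ServicePoint that does not exist yet takes its MaxIdleTime from ServicePointManager.MaxServicePointIdleTime at creation, calling the helper before opening the WebSocket should have the same effect as patching an existing instance afterwards.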
I ran into this issue this week. Your workaround got me pointed in the right direction, but I believe I've narrowed down the root cause.
If a "Content-Length: 0" header is included in the "101 Switching Protocols" response from a WebSocket server, ClientWebSocket gets confused and schedules the connection for cleanup in 100 seconds.
Here's the offending code from the .Net Reference Source:
//if the returned contentlength is zero, preemptively invoke calldone on the stream.
//this will wake up any pending reads.
if (m_ContentLength == 0 && m_ConnectStream is ConnectStream) {
    ((ConnectStream)m_ConnectStream).CallDone();
}
According to RFC 7230 Section 3.3.2, Content-Length is prohibited in 1xx (Informational) messages, but I've found it mistakenly included in some server implementations.
For additional details, including some sample code for diagnosing ServicePoint issues, see this thread: https://github.com/ably/ably-dotnet/issues/107
I set the KeepAliveInterval for the socket to 0 like this:
theSocket.Options.KeepAliveInterval = TimeSpan.Zero;
That eliminated the problem of the websocket shutting down when the timeout was reached. But then again, it probably also turns off the sending of ping messages altogether.
I have been studying this issue recently. I compared packet captures in Wireshark (websocket-client for Python versus ClientWebSocket in .NET) and found out what happens. In ClientWebSocket, Options.KeepAliveInterval only sends a packet to the server when no message has been received from the server during that period. But some servers only check whether the client is actively sending messages. So we have to send arbitrary packets to the server manually at regular intervals (they need not be ping packets, and WebSocketMessageType has no ping type), even if the server side is continuously sending packets. That's the solution.
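A minimal sketch of such a periodic send loop. To keep it independent of any particular socket type, the actual send is passed in as a delegate; `Heartbeat` and `RunAsync` are hypothetical names, and the interval and payload are left to the caller.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: calls sendAsync once per interval until cancelled.
public static class Heartbeat
{
    // Returns the number of heartbeats sent before cancellation.
    // Exceptions thrown by sendAsync (other than cancellation) propagate to the caller.
    public static async Task<int> RunAsync(
        Func<CancellationToken, Task> sendAsync,
        TimeSpan interval,
        CancellationToken token)
    {
        int sent = 0;
        try
        {
            while (true)
            {
                await Task.Delay(interval, token); // wait first, then send
                await sendAsync(token);
                sent++;
            }
        }
        catch (OperationCanceledException)
        {
            // Cancellation is the normal way to stop the loop.
        }
        return sent;
    }
}
```

With a ClientWebSocket, the delegate could wrap a call to `SendAsync(new ArraySegment<byte>(payload), WebSocketMessageType.Text, true, ct)`, where `payload` is whatever small message the server accepts as a no-op. Keep in mind that a WebSocket allows only one outstanding send at a time, so the heartbeat has to be coordinated with the application's own sends.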

Self-healing SslStream

I'm writing a service that needs to maintain a long-running SSL connection to a remote server. I need this connection to be self-healing; that is, if it's disconnected for any reason, then the next time it's written to it will reconnect. I've tried this:
bool isConnected = client.Connected && client.Client.Poll(0, SelectMode.SelectWrite) && stream.CanWrite;
if (!isConnected)
{
    this.connected = false;
    GetConnection();
}
stream.Write(bytes, 0, bytes.Length);
stream.Flush();
But I find it doesn't behave as I would expect. If I simulate a network outage by disabling my wifi, I'm still able to write to the stream with stream.Write() for approximately 20 seconds. The next time I try to write to it, none of client.Connected, client.Client.Poll(), or stream.CanWrite return false, but when I go to write to the stream I get a socket exception. Finally, if I try to recreate the connection, I get this exception: An existing connection was forcibly closed by the remote host.
I would appreciate any help creating a long-running SslStream that can withstand network failure. Thanks!
From a 10,000-foot point of view:
The reason you can still write to the stream after turning off your wifi is that a network buffer holds the data for transmission. A successful stream.Write/stream.Flush means the TCP/IP stack has accepted the data and buffered it for transmission, not that the data has reached its target.
It takes time for the TCP/IP stack to notice a full media disconnection (connection lost/reset): even with no physical link, TCP/IP treats this as a temporary network issue and keeps retrying for a while (the network could simply be dropping packets at that moment).
Looked at the other way around, you would not want all your programs to fail at every network hiccup (these happen all too often on the internet), so TCP/IP takes its time before notifying the application layer that the connection has become invalid (after retrying several times and waiting a reasonable amount of time).
You can always reconnect to the server when the SslStream fails and continue sending data, although you will find it is not as easy as that: there are scenarios where you send data that the server never receives, and others where the server receives the data but you never get an ACK back. So depending on your needs, self-healing alone may not be enough.
Self-healing is simple to implement; data consistency and reliability are harder and usually require the server to support some kind of reliable messaging mechanism to ensure all data has been sent and received.
The underlying protocol for SSL is TCP. TCP usually only sends data when the application asks it to deliver data, or when it needs to reply to data received from the other side by sending an ACK. This means that a broken connection, such as a lost link, will not be noticed until you try to send data. And you will not notice immediately, because:
A write to the socket only hands the data to the OS kernel and returns success if that hand-off succeeded.
The kernel then tries to deliver the data to the peer and waits for the peer's ACK.
If no ACK arrives, it retries the delivery, and only after several unsuccessful retries does the kernel declare the connection broken.
Only after the kernel has marked the connection as broken does the next write or read return the error from the kernel to user space, for example EPIPE on a write.
This means that if you want to know up front whether the connection is still alive, you have to make sure there is a regular data exchange on the connection. At the TCP level you can enable TCP keep-alive (SO_KEEPALIVE), but with default settings the interval between probes can be a couple of hours. At the SSL layer you could try the infamous heartbeat extension, but most peers will not understand it. The last option is to implement some kind of heartbeat in your own application.
As for the self-healing: when reconnecting you get a new TCP connection and you also need to do a full SSL handshake, because the previous SSL connection was not cleanly closed and thus cannot be resumed. The server has no idea that this new connection is a continuation of the old one, so you have to implement some kind of meta-connection spanning multiple TCP connections in your application layer, on both client and server. Inside this meta-connection you need your own data tracking to detect which data was really accepted by the peer and which was only sent but never explicitly accepted because the connection broke. It sounds like a kind of TCP on top of TCP.
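The reconnect half of this can be sketched in a few lines. This is only an illustration of "reconnect and retry once on a failed write", with the connection logic passed in as a factory delegate; `ReconnectingWriter` is a hypothetical name. It does not solve the delivery-tracking problem described above: a write that failed may or may not have reached the peer, so a blind resend can duplicate data, and earlier writes that appeared to succeed may still have been lost.

```csharp
using System;
using System.IO;

// Hypothetical wrapper: reconnects once and resends when a write fails.
public sealed class ReconnectingWriter : IDisposable
{
    private readonly Func<Stream> _connect; // e.g. opens a TcpClient and an authenticated SslStream
    private Stream _stream;

    public ReconnectingWriter(Func<Stream> connect)
    {
        _connect = connect ?? throw new ArgumentNullException(nameof(connect));
    }

    // Number of times a failed write triggered a reconnect.
    public int Reconnects { get; private set; }

    public void Write(byte[] bytes)
    {
        try
        {
            if (_stream == null) _stream = _connect();
            _stream.Write(bytes, 0, bytes.Length);
            _stream.Flush();
        }
        catch (Exception ex) when (ex is IOException || ex is ObjectDisposedException)
        {
            // The old stream is unusable: drop it, reconnect and resend once.
            // A second failure propagates to the caller.
            Reset();
            Reconnects++;
            _stream = _connect();
            _stream.Write(bytes, 0, bytes.Length);
            _stream.Flush();
        }
    }

    private void Reset()
    {
        _stream?.Dispose();
        _stream = null;
    }

    public void Dispose() => Reset();
}
```

Letting the exception from the write drive the reconnect, instead of checking Connected or Poll beforehand, matches the point made in both answers: the stack only reports a dead connection when an operation on it fails.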
