Connection closing while downloading extremely large file - c#

We have a webservice that serves up files. Recently, we have come across a Very Large File - more than 2 GB - that can't be copied into the buffer. I've modified the code to use HttpCompletionOption.ResponseHeadersRead to not use the buffer and copy directly to a stream. However, most of the time I get
System.IO.IOException: 'Unable to read data from the transport connection: The connection was closed.'
Curl is able to download it without problem. The exception doesn't happen every time, but it's most of the time. I set HttpClient.Timeout to an hour, so that's not the problem. The Exception itself is very ambiguous and I can't find any reason that it would be closing the connection. The logs on the web server also say
info: Microsoft.AspNetCore.Server.Kestrel[34]
Connection id "0HLO4L4D3UAMS", Request id "0HLO4L4D3UAMS:00000001": the application aborted the connection.
so it seems to be something on the client side.
var requestMessage = GetVeryLargeFile(asset, HttpMethod.Get);
using (var result = await _client.SendAsync(requestMessage, HttpCompletionOption.ResponseHeadersRead, cancellationToken).ConfigureAwait(false))
{
result.EnsureSuccessStatusCode();
cancellationToken.ThrowIfCancellationRequested();
using (var stream = await result.Content.ReadAsStreamAsync().ConfigureAwait(false))
{
cancellationToken.ThrowIfCancellationRequested();
using (var fileStream = _fileProvider.Create(filePath))
{
await stream.CopyToAsync(fileStream.StreamInstance).ConfigureAwait(false);
if (fileStream.Length == 0)
{
throw new DataException($"No data retrieved for {asset.Url}");
}
}
}
}
UPDATE:
Based on comments here, I changed the copy line to be synchronous, and that fixed the error. That's certainly less than optimal, but I'm still struggling to figure out why the async will randomly close the connection.
stream.CopyTo(fileStream.StreamInstance);


WebSocket simplex stream - how can the client close the connection?

I'm using WebSockets to stream values from server to client. The connection should be closed when the stream of values is completed (server-side termination), or when the client stops subscribing (client-side termination).
How can the client gracefully close the connection?
A rough sample to demonstrate the issue in AspNetCore; server pushes a (potentially) infinite stream of values; client subscribes to the first 5 values, and should then close the connection.
app.Use(async (context, next) =>
{
if (context.Request.Path == "/read")
{
var client = new ClientWebSocket();
await client.ConnectAsync(new Uri(@"wss://localhost:7125/write"), CancellationToken.None);
for (int i = 0; i < 5; i++)
await client.ReceiveAsync(new ArraySegment<byte>(new byte[sizeof(int)]), CancellationToken.None);
// This does not seem to have any particular effect on writer
// await client.CloseOutputAsync(WebSocketCloseStatus.NormalClosure, string.Empty, CancellationToken.None);
// This freezes in AspNetcore on .NET6 core (because the implementation waits for the connection to close, which never happens)
// In AspNet (non-core, .Net Framework 4.8), this seems to throw an exception that data cannot be read after the connection has been closed (i.e. the socket seems to only be closeable if no data is pending to be read)
await client.CloseAsync(WebSocketCloseStatus.NormalClosure, string.Empty, CancellationToken.None);
}
if (context.Request.Path == "/write")
{
var ws = await context.WebSockets.AcceptWebSocketAsync();
await foreach (var number in GetNumbers())
{
var bytes = BitConverter.GetBytes(number);
if (ws.State != WebSocketState.Open)
throw new Exception("I wish we'd hit this branch, but we never do!");
await ws.SendAsync(new ArraySegment<byte>(bytes), WebSocketMessageType.Binary, true, CancellationToken.None);
}
}
});
static async IAsyncEnumerable<int> GetNumbers()
{
for (int i = 0; i <= int.MaxValue; i++)
{
yield return i;
await Task.Delay(25);
}
}
The general issue seems to be that the close message isn't picked up by the /write method, i.e. ws.State remains WebSocketState.Open. I'm assuming that only receive operations update the connection status?
Is there any good way to handle this situation / for the server to pick up the client's request to close the connection?
I would quite like to avoid the client having to send any explicit messages to the server / for the server to have to read the stream explicitly. I'm increasingly wondering if that is possible, though.
The WebSocket protocol works much like TCP in that a connection is established first; the only difference is that the handshake is initiated over HTTP[S].
One send action from one side matches one receive action from the other, and vice versa.
You can see this detail (if I am not mistaken) in the remarks section of the documentation:
Exactly one send and one receive is supported on each WebSocket object in parallel.
So the server should receive at least one data segment in order to recognize the close-intention message from the client. The same applies on the client side.
How to receive message, recognize close intention, and properly react to it - see here.
How to send close intention message - see here.
I suspect you should call webSocket.ReceiveAsync at least once in the background on your server.
Then, in a ContinueWith task, call CancellationTokenSource.Cancel() for the current server socket session.
That repo works, except for docker-compose; I am something of a newcomer to complex DevOps things. )
UPDATE
The remarks section of the docs is not about matching send and receive actions on different sides of the conversation. I just wanted you to notice how this TCP concept works, i.e. you should receive data at least once.
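A minimal sketch of the idea above, assuming an ASP.NET Core server and the GetNumbers() generator from the question. The names SimplexWriter, SendUntilClientClosesAsync and ReceiveUntilCloseAsync are hypothetical, and handling of a dropped connection is deliberately limited. The server keeps one receive pending in the background so that the client's Close frame is actually read, and the send loop stops once that happens:

```csharp
using System;
using System.Collections.Generic;
using System.Net.WebSockets;
using System.Threading;
using System.Threading.Tasks;

static class SimplexWriter
{
    // Hypothetical helper: pushes values until the client sends a Close frame
    // or the sequence of values ends.
    public static async Task SendUntilClientClosesAsync(WebSocket ws, IAsyncEnumerable<int> numbers)
    {
        using var cts = new CancellationTokenSource();

        // Keep one receive pending; without it the Close frame is never read
        // and ws.State stays Open.
        Task receiveLoop = ReceiveUntilCloseAsync(ws, cts);

        await foreach (int number in numbers)
        {
            if (cts.IsCancellationRequested) break; // client asked to close

            await ws.SendAsync(new ArraySegment<byte>(BitConverter.GetBytes(number)),
                               WebSocketMessageType.Binary, true, CancellationToken.None);
        }

        if (ws.State == WebSocketState.CloseReceived)
        {
            // Client-side termination: answer the client's Close frame.
            await ws.CloseOutputAsync(WebSocketCloseStatus.NormalClosure, string.Empty, CancellationToken.None);
        }
        else if (ws.State == WebSocketState.Open)
        {
            // Server-side termination: the values ran out first.
            await ws.CloseAsync(WebSocketCloseStatus.NormalClosure, string.Empty, CancellationToken.None);
        }

        await receiveLoop; // finish before cts is disposed
    }

    private static async Task ReceiveUntilCloseAsync(WebSocket ws, CancellationTokenSource cts)
    {
        var scratch = new byte[256];
        try
        {
            while (ws.State == WebSocketState.Open)
            {
                WebSocketReceiveResult result =
                    await ws.ReceiveAsync(new ArraySegment<byte>(scratch), CancellationToken.None);

                if (result.MessageType == WebSocketMessageType.Close)
                {
                    cts.Cancel(); // signal the send loop to stop
                }
                // Any other client message is ignored in this simplex sketch.
            }
        }
        catch (WebSocketException)
        {
            cts.Cancel(); // connection dropped; stop sending
        }
    }
}
```

In the /write branch of the question this would be called as `await SimplexWriter.SendUntilClientClosesAsync(ws, GetNumbers());`. The client does not need to send any data messages; its CloseAsync call is enough, because the pending receive on the server picks up the Close frame.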

Could somebody explain how this server program closes its socket after every connection? And how could I keep the socket open?

So I want to make a TCP connection between 2 UWP apps using StreamSockets. I found this example on the Microsoft webpage and it works. The problem is that it closes its sockets after every connection that's been established. I want to understand when it closes (I can't find it in the code, and that confuses me a bit), and I also want to know how I could keep the connection between server and client open so I don't have to reconnect every time I want to send something.
Example: https://learn.microsoft.com/en-us/windows/uwp/networking/sockets#build-a-basic-tcp-socket-client-and-server
I have looked in the StreamSocket documentation on Windows and can't really find things about closing the socket.
I assume it happens somewhere in this method. It's the server side of the program that is executed when a connection is received.
private async void StreamSocketListener_ConnectionReceived
(Windows.Networking.Sockets.StreamSocketListener sender,
Windows.Networking.Sockets.
StreamSocketListenerConnectionReceivedEventArgs args)
{
string request;
using (var streamReader = new
StreamReader(args.Socket.InputStream.AsStreamForRead()))
{
request = await streamReader.ReadLineAsync();
}
await this.Dispatcher.RunAsync(CoreDispatcherPriority.Normal, () => this.serverListBox.Items.Add(string.Format("server received the request: \"{0}\"", request)));
// Echo the request back as the response.
using (Stream outputStream = args.Socket.OutputStream.AsStreamForWrite())
{
using (var streamWriter = new StreamWriter(outputStream))
{
await streamWriter.WriteLineAsync(request);
await streamWriter.FlushAsync();
}
}
await this.Dispatcher.RunAsync(CoreDispatcherPriority.Normal, () => this.serverListBox.Items.Add(string.Format("server sent back the response: \"{0}\"", request)));
sender.Dispose();
await this.Dispatcher.RunAsync(CoreDispatcherPriority.Normal, () =>
this.serverListBox.Items.Add("server closed its socket"));
}
Any help would greatly be appreciated!
I found this example on the microsoft webpage and it works.
Unfortunately, all the Microsoft socket examples are not good examples of how to write socket applications. They are only examples on how to call those APIs. Building a production-quality socket application is non-trivial, and the Microsoft socket examples will mislead you.
For example, this socket server:
Uses Dispatcher to update the UI rather than modern solutions like IProgress<T>.
Reads from its input stream until a newline is found. This is a problem because:
There is no timeout for the request to arrive.
The input buffer grows without bounds.
There's no handling of the half-open scenario.
Most socket examples from Microsoft have the same problems, all of which have to be addressed when writing production-quality socket code. And writing production-quality socket code is much harder than it first appears.
For this reason, I always recommend using an alternative technology (e.g., self-hosted SignalR) if possible.
But to answer your actual question:
I want to understand when it closes
With sockets, there are actually two streams: an input stream and output stream. Both are closed when sender.Dispose(); is called. However, the input stream is also closed when the StreamReader is disposed, and the output stream is also closed when the StreamWriter is disposed. These happen at the end of their using blocks. This is why you cannot read the second message after closing the StreamReader.
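To illustrate the disposal behaviour described above, here is a small sketch that uses a MemoryStream as a stand-in for the socket's input stream (an assumption; the class and method names are hypothetical). The StreamReader and StreamWriter constructors that take a leaveOpen argument let the reader or writer be disposed without closing the stream underneath:

```csharp
using System;
using System.IO;
using System.Text;

static class LeaveOpenDemo
{
    // Returns true if the underlying stream is still readable after the
    // StreamReader wrapped around it has been disposed.
    public static bool StreamSurvivesReaderDispose(bool leaveOpen)
    {
        // Stand-in for args.Socket.InputStream.AsStreamForRead().
        var stream = new MemoryStream(Encoding.UTF8.GetBytes("first\nsecond\n"));

        using (var reader = new StreamReader(stream, Encoding.UTF8, false, 1024, leaveOpen))
        {
            reader.ReadLine();
        } // reader disposed here; the stream is closed too unless leaveOpen is true

        return stream.CanRead;
    }
}
```

For a long-lived connection it is usually simpler to create one reader and one writer per connection, loop on ReadLineAsync, and dispose them only when the conversation is over. A StreamReader buffers ahead, so creating a new reader per message can lose data that the previous reader had already buffered.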

HttpClient not throwing after lost connection

I have a Windows Store application in which I need to download a file a couple of megabytes in size.
I am trying to do this using the HttpClient.
Here is a simplification of the code:
using (var httpClient = new HttpClient())
{
var request =
await httpClient.SendAsync(
new HttpRequestMessage(
HttpMethod.Get,
"http://openpandora.info:8080/Battlefield%204%20-%20Fishing%20in%20Baku%20-%20Xbox%20One.mp4"),
HttpCompletionOption.ResponseHeadersRead);
var outputFile =
await
ApplicationData.Current.LocalFolder.CreateFileAsync(
"test.data",
CreationCollisionOption.ReplaceExisting);
using (var outputStream = await outputFile.OpenStreamForWriteAsync())
{
await request.Content.CopyToAsync(outputStream);
}
}
The sample code provided here is downloading a 2GB file for illustration purposes.
The issue is the following. If the client has no internet connection when the app is started, the code throws an exception as expected. However, if the client loses internet connectivity while the download is running no exception is thrown and the code will never execute beyond the code block. If the client encounters no connection problems during download, the code works fine.
Any insight on why this is the case?

How to determine if an HTTP response is complete

I am working on building a simple proxy which will log certain requests which are passed through it. The proxy does not need to interfere with the traffic being passed through it (at this point in the project) and so I am trying to do as little parsing of the raw request/response as possible during the process (the request and response are pushed off to a queue to be logged outside of the proxy).
My sample works fine, except that I cannot reliably tell when the "response" is complete, so I have connections left open for longer than needed. The relevant code is below:
var request = getRequest(url);
byte[] buffer;
int bytesRead = 1;
var dataSent = false;
var timeoutTicks = DateTime.Now.AddMinutes(1).Ticks;
Console.WriteLine(" Sending data to address: {0}", url);
Console.WriteLine(" Waiting for response from host...");
using (var outboundStream = request.GetStream()) {
while (request.Connected && (DateTime.Now.Ticks < timeoutTicks)) {
while (outboundStream.DataAvailable) {
dataSent = true;
buffer = new byte[OUTPUT_BUFFER_SIZE];
bytesRead = outboundStream.Read(buffer, 0, OUTPUT_BUFFER_SIZE);
if (bytesRead > 0) { _clientSocket.Send(buffer, bytesRead, SocketFlags.None); }
Console.WriteLine(" pushed {0} bytes to requesting host...", _backBuffer.Length);
}
if (request.Connected) { Thread.Sleep(0); }
}
}
Console.WriteLine(" Finished with response from host...");
Console.WriteLine(" Disconnecting socket");
_clientSocket.Shutdown(SocketShutdown.Both);
My question is whether there is an easy way to tell that the response is complete without parsing headers. Given that this response could be anything (encoded, encrypted, gzip'ed etc), I don't want to have to decode the actual response to get the length and determine if I can disconnect my socket.
As David pointed out, connections should remain open for a period of time. You should not close connections unless the client side does that (or if the keep alive interval expires).
Changing to HTTP/1.0 will not work since you are a server and it's the client that will specify HTTP/1.1 in the request. Sure, you can send an error message with HTTP/1.0 as the version and hope that the client changes to 1.0, but it seems inefficient.
HTTP messages look like this:
REQUEST LINE
HEADERS
(empty line)
BODY
The only way to know when a response is done is to search for the Content-Length header. Simply search for "Content-Length:" in the request buffer and extract everything to the linefeed. (But trim the found value before converting to int).
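The search described above can be sketched as follows, assuming the header block has already been decoded into a string. HeaderScan and TryGetContentLength are hypothetical names. The scan is deliberately naive: it would also match a header whose name merely ends in "Content-Length:", and it should be run on the header block only, not on the body. A response sent with Transfer-Encoding: chunked carries no Content-Length, so a null result needs separate handling.

```csharp
using System;
using System.Globalization;

static class HeaderScan
{
    // Returns the Content-Length value found in a raw header block,
    // or null if the header is absent or its value is not a plain number.
    public static long? TryGetContentLength(string rawHeaders)
    {
        const string name = "Content-Length:";

        // Header names are case-insensitive.
        int start = rawHeaders.IndexOf(name, StringComparison.OrdinalIgnoreCase);
        if (start < 0) return null;

        start += name.Length;
        int end = rawHeaders.IndexOf('\n', start);
        if (end < 0) end = rawHeaders.Length;

        // Trim removes the leading space and the trailing '\r'.
        string value = rawHeaders.Substring(start, end - start).Trim();

        return long.TryParse(value, NumberStyles.None, CultureInfo.InvariantCulture, out long parsed)
            ? parsed
            : (long?)null;
    }
}
```

Once the value is known, the proxy can count the body bytes that follow the empty line and stop reading when that many have been forwarded.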
The other alternative is to use the parser in my webserver to get all headers. It should be quite easy to use just the parser and nothing more from the library.
Update: There is a better parser here: HttpParser.cs
If you make a HTTP/1.0 request instead of 1.1, the server should close the connection as soon as it's through since it doesn't need to keep the connection open for another request.
Other than that, you really need to parse the content length header in the response to get the best value.
Using blocking IO and multiple threads might be your answer. Specifically
string data;
using (var response = request.GetResponse())
using (var stream = response.GetResponseStream())
using (var reader = new StreamReader(stream))
{
    // Blocks until the whole response body has been read.
    data = reader.ReadToEnd();
}
This is for textual data, however binary handling is similar.

The operation has timed out with WebClient.DownloadFile and correct url's

I am batch uploading products to a database.
I am downloading the images from their URLs to the site to be used for the products.
The code I've written works fine for the first 25 iterations (always that number for some reason), but then throws a System.Net.WebException "The operation has timed out".
if (!File.Exists(localFilename))
{
using (WebClient Client = new WebClient())
{
Client.DownloadFile(remoteFilename, localFilename);
}
}
I checked the remote url it was requesting and it is a valid image url that returns an image.
Also, when I step through it with the debugger, I don't get the timeout error.
HELP! ;)
If I were in your shoes, here's a few possibilities I'd investigate:
if you're running this code from multiple threads, you may be bumping up against the System.Net.ServicePointManager.DefaultConnectionLimit property. Try increasing it to 50-100 when you start up your app. note that I don't think this is your problem, but trying this is easier than the other stuff below. :-)
another possibility is that you're swamping the server. This is usually hard to do with a single-threaded client, but is possible since multiple other clients may be hitting the server also. But because the problem always happens at #25, this seems unlikely since you'd expect to see more variation.
you may be running into a problem with keepalive HTTP connections backing up between your client and the server. this also seems unlikely.
the hard cutoff of 25 makes me think that this may be a proxy or firewall limit, either on your end or the server's, where >25 connections made from one client IP to one server (or proxy) will get throttled.
My money is on the last one, since it always breaks at a nice round number of requests, and stepping through in the debugger (i.e. slower!) doesn't trigger the problem.
To test all this, I'd start with the easy thing: stick in a delay (Thread.Sleep) before each HTTP call, and see if the problem goes away. If it does, reduce the delay until the problem comes back. If it doesn't, increase the delay up to a large number (e.g. 10 seconds) until the problem goes away. If it doesn't go away with a 10 second delay, that's truly a mystery and I'd need more info to diagnose.
If it does go away with a delay, then you need to figure out why-- and whether the limit is permanent (e.g. server's firewall which you can't change) or something you can change. To get more info, you'll want to time the requests (e.g. check DateTime.Now before and after each call) to see if you see a pattern. If the timings are all consistent and suddenly get huge, that suggests a network/firewall/proxy throttling. If the timings gradually increase, that suggests a server you're gradually overloading and lengthening its request queue.
In addition to timing the requests, I'd set the timeout of your webclient calls to be longer, so you can figure out if the timeout is infinite or just a bit longer than the default. To do this, you'll need an alternative to the WebClient class, since it doesn't support a timeout. This thread on MSDN Forums has a reasonable alternative code sample.
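One common workaround, shown here as a sketch and not necessarily the sample from the linked thread, is to subclass WebClient and set the timeout on the underlying request. TimeoutWebClient and TimeoutMilliseconds are hypothetical names; GetWebRequest is the protected virtual method that WebClient calls for each download.

```csharp
using System;
using System.Net;

// Hypothetical subclass: WebClient itself exposes no Timeout property.
public class TimeoutWebClient : WebClient
{
    // Timeout applied to every request made through this client.
    public int TimeoutMilliseconds { get; set; } = 30000;

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        if (request != null)
        {
            request.Timeout = TimeoutMilliseconds;
        }
        return request;
    }
}
```

It is used like a normal WebClient, for example `using (var client = new TimeoutWebClient { TimeoutMilliseconds = 120000 }) { client.DownloadFile(remoteFilename, localFilename); }`.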
An alternative to adding timing in your code is to use Fiddler:
download fiddler and start it up.
set your webclient code's Proxy property to point to the fiddler proxy (localhost:8888)
run your app and look at fiddler.
It seems that WebClient is not closing the Response object it uses when done, which in your case leaves many responses open at the same time; with a limit of 25 connections on the remote server, you get the timeout exception. When you debug, responses opened earlier get closed due to their inner timeout, etc.
(I inspected WebClient with Reflector and couldn't find an instruction that closes the response.)
I propose that you use HttpWebRequest and HttpWebResponse so that you can clean up objects after each download:
HttpWebRequest request;
HttpWebResponse response = null;
try
{
FileStream fs;
Stream s;
byte[] read;
int count;
read = new byte[256];
request = (HttpWebRequest)WebRequest.Create(remoteFilename);
request.Timeout = 30000;
request.AllowWriteStreamBuffering = false;
response = (HttpWebResponse)request.GetResponse();
s = response.GetResponseStream();
fs = new FileStream(localFilename, FileMode.Create);
// Read once per iteration; a second Read inside the loop would drop every other chunk.
while ((count = s.Read(read, 0, read.Length)) > 0)
{
fs.Write(read, 0, count);
}
fs.Close();
s.Close();
}
catch (System.Net.WebException)
{
//....
}finally
{
//Close Response
if (response != null)
response.Close();
}
Here's a slightly simplified version of manji's answer:
private static void DownloadFile(Uri remoteUri, string localPath)
{
var request = (HttpWebRequest)WebRequest.Create(remoteUri);
request.Timeout = 30000;
request.AllowWriteStreamBuffering = false;
using (var response = (HttpWebResponse)request.GetResponse())
using (var s = response.GetResponseStream())
using (var fs = new FileStream(localPath, FileMode.Create))
{
byte[] buffer = new byte[4096];
int bytesRead;
// Read once per iteration; a second Read inside the loop would drop every other chunk.
while ((bytesRead = s.Read(buffer, 0, buffer.Length)) > 0)
{
fs.Write(buffer, 0, bytesRead);
}
}
}
I had the same problem and solved it by adding these lines to the configuration file app.config:
<system.net>
<connectionManagement>
<add address="*" maxconnection="100" />
</connectionManagement>
</system.net>
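The same limit can also be raised in code through the ServicePointManager.DefaultConnectionLimit property mentioned in an earlier answer. A minimal sketch, assuming .NET Framework with WebClient or HttpWebRequest (ConnectionLimits and Raise are hypothetical names):

```csharp
using System.Net;

static class ConnectionLimits
{
    // Call once at startup, before the first request is made.
    public static void Raise(int limit)
    {
        ServicePointManager.DefaultConnectionLimit = limit;
    }
}
```

Calling `ConnectionLimits.Raise(100);` at application start corresponds to the maxconnection="100" setting above.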
