C# HttpClient.GetAsync ignores HttpCompletionOption.ResponseHeadersRead

I ran into a strange situation with the C# HttpClient. I am passing HttpCompletionOption.ResponseHeadersRead to GetAsync so that I get the response headers as quickly as possible, without waiting for the content. But when downloading files, await GetAsync does not return until the whole content has been downloaded over the network (I checked this with Fiddler). The example code below downloads a 1 GB test file; the application hangs in await client.GetAsync until the entire file has been received. How do I get control back as soon as the headers have been received, without waiting for the complete content transfer?
using System;
using System.IO;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public class Program
{
    private const int HttpBufferSize = 81920;

    private static async Task Main(string[] args)
    {
        var url = new Uri("http://212.183.159.230/1GB.zip");
        await DownloadFileAsync(@"C:\1GB.zip", url, CancellationToken.None).ConfigureAwait(false);
    }

    private static async Task DownloadFileAsync(string filePath, Uri fileEndpoint,
        CancellationToken token)
    {
        using var client = new HttpClient();
        // Expected to return as soon as the response headers have been read.
        using var response = await client.GetAsync(fileEndpoint, HttpCompletionOption.ResponseHeadersRead, token).ConfigureAwait(false);
        response.EnsureSuccessStatusCode();
        await using var contentStream = await response.Content.ReadAsStreamAsync(token).ConfigureAwait(false);
        await using var stream = new FileStream(filePath, FileMode.Create, FileAccess.Write, FileShare.None);
        await contentStream.CopyToAsync(stream, HttpBufferSize, token).ConfigureAwait(false);
    }
}

You are sending a GET request. If you only need the headers, you can send a HEAD request instead. An example for HttpClient:
client.SendAsync(new HttpRequestMessage(HttpMethod.Head, url))
Caution: some servers reject HEAD requests, so make sure to handle that case gracefully. For example, fall back to a GET request if the HEAD request fails, at the cost of some speed.
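The fallback described above could be sketched roughly as follows. This is only an illustration: HeaderProbe and GetHeadersAsync are hypothetical names, and it assumes that any non-success status on the HEAD request should trigger the GET fallback.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class HeaderProbe
{
    // Hypothetical helper: returns a response whose headers are available,
    // trying HEAD first and falling back to GET if HEAD is not successful.
    public static async Task<HttpResponseMessage> GetHeadersAsync(
        HttpClient client, Uri url, CancellationToken token = default)
    {
        using var headRequest = new HttpRequestMessage(HttpMethod.Head, url);
        var headResponse = await client.SendAsync(headRequest, token).ConfigureAwait(false);
        if (headResponse.IsSuccessStatusCode)
        {
            return headResponse;
        }

        // HEAD was refused (for example 405 Method Not Allowed): release it and retry with GET.
        headResponse.Dispose();

        // ResponseHeadersRead completes once the headers arrive, so the body is
        // not buffered; the caller should dispose the response without reading it.
        return await client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead, token)
            .ConfigureAwait(false);
    }
}
```

The caller owns the returned response and should dispose it once the headers have been inspected.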

I have identified the reason for this behavior: it was Fiddler. It acts as a proxy and apparently does not forward a response until it has received it completely. To check this, I added console output around each of the operations:
Console.WriteLine($"Start GetAsync - {DateTime.Now}");
using var response = await client.GetAsync(fileEndpoint, HttpCompletionOption.ResponseHeadersRead, token).ConfigureAwait(false);
Console.WriteLine($"End GetAsync - {DateTime.Now}");
response.EnsureSuccessStatusCode();
await using var contentStream = await response.Content.ReadAsStreamAsync(token).ConfigureAwait(false);
await using var stream = new FileStream(filePath, FileMode.Create, FileAccess.Write, FileShare.None);
Console.WriteLine($"Start CopyToAsync - {DateTime.Now}");
await contentStream.CopyToAsync(stream, HttpBufferSize, token).ConfigureAwait(false);
Console.WriteLine($"End CopyToAsync - {DateTime.Now}");
Results with Fiddler running:
Start GetAsync - 30.06.2021 17:46:03
End GetAsync - 30.06.2021 17:46:49
Start CopyToAsync - 30.06.2021 17:46:49
End CopyToAsync - 30.06.2021 17:46:51
Results without Fiddler:
Start GetAsync - 30.06.2021 17:38:32
End GetAsync - 30.06.2021 17:38:32
Start CopyToAsync - 30.06.2021 17:38:32
End CopyToAsync - 30.06.2021 17:39:48
Conclusion: be careful with proxies
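If you suspect a proxy is buffering the response, one way to confirm it is to run the same request through a client that bypasses the proxy. A minimal sketch, assuming the proxy is picked up from the default system settings; DirectClient and its methods are hypothetical names:

```csharp
using System.Net.Http;

public static class DirectClient
{
    // Hypothetical factory: a handler that ignores any configured proxy.
    public static HttpClientHandler CreateHandler()
    {
        return new HttpClientHandler { UseProxy = false };
    }

    // Builds an HttpClient that talks to the server directly.
    public static HttpClient Create()
    {
        // disposeHandler: true means disposing the client also disposes the handler.
        return new HttpClient(CreateHandler(), disposeHandler: true);
    }
}
```

If GetAsync returns immediately with this client but not with the default one, the delay is coming from the proxy rather than from HttpClient.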

Related

Does HttpClient (from HttpClientFactory) Dispose Clean-up the HttpResponseMessage / Content?

I know there are many questions about calling Dispose on HttpClient, and my understanding is that it isn't necessary but shouldn't (normally) cause any harm in .NET Core when using HttpClientFactory. I am wondering about the effect (if any) on one particular use case:
HttpResponseMessage response = null;
using (HttpClient client = httpFactory.CreateClient("NEW"))
{
    const string url = "https://url";
    response = await client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead);
}
Stream stream = await response.Content.ReadAsStreamAsync();
........... use stream .............................
Does disposing the HttpClient run the risk of cleaning up or otherwise affecting the HttpResponseMessage/Stream (assume the stream processing takes a long time)?
Thanks

net core console app C# HttpClient hangs when request takes long

.NET Core 3.0 console app.
I'm sending batch requests to the Dynamics CRM API. I read a folder of text files that are generated by another process (Node.js); each file holds the content of one request. I iterate through the files, read each one as a string, and send it. At first the problem seemed to occur when a file was large (500 KB+): when a request takes a long time (5 minutes or longer), HttpClient.SendAsync starts but NEVER completes and the app hangs there. I know the request was sent successfully because I checked in the API and the changes were applied, but the automated process stops and I have to kill the app. If the content is smaller, everything runs smoothly.
The most frustrating thing is that there is no error or other feedback that points me to a solution or workaround.
Important details:
I don't think this is a deadlock, since every async call in the chain is awaited and SendAsync is also called with ConfigureAwait(false). No .Result, Wait() or any other blocking operation is used.
I'm not exhausting network sockets by creating one HttpClient instance per request: I use IHttpClientFactory to hand out the HttpClient instance as a singleton. The issue happens on the first long-running request it comes across.
Update
I realized the problem has nothing to do with the payload size but with the time the request takes to complete: files of 200 KB or less also hang when the server takes more than 8 minutes to respond.
The code:
HttpRequestMessage GetRequest(string batchId, string requestBody)
{
    var request = new HttpRequestMessage(HttpMethod.Post, "$batch");
    request.Headers.Add("OData-MaxVersion", "4.0");
    request.Headers.Add("OData-Version", "4.0");
    request.Headers.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
    request.Content = GetContent(batchId, requestBody);
    return request;
}

StringContent GetContent(string batchId, string requestBody)
{
    var content = new StringContent(requestBody);
    content.Headers.Remove("Content-Type");
    content.Headers.Add("Content-Type", $"multipart/mixed;boundary=batch_{batchId}");
    return content;
}

string GetRequestBody(string file)
{
    var requestBody = "";
    using (var sr = new StreamReader(file))
    {
        requestBody = sr.ReadToEnd();
    }
    return requestBody;
}

private async Task<HttpResponseMessage> SendRequest(string batchId, string file)
{
    // this call is awaited since the access token may expire then is necessary to request a new one
    var httpClient = await _httpClientBuilder.BuildHttpClient().ConfigureAwait(false);
    var requestBody = GetRequestBody(file); // Here I read the text from the file
    using (var request = GetRequest(batchId, requestBody))
    {
        using (var response = await httpClient.SendAsync(request, HttpCompletionOption.ResponseHeadersRead).ConfigureAwait(false)) // after this call nothing is executed
        {
            if (response.IsSuccessStatusCode)
            {
                CustomConsole.Success("({0})Successfully Sent batch: {1}", response.StatusCode, batchId);
                File.Delete(file);
            }
            else
            {
                CustomConsole.Error("({0}) Error sending batch: {1}", response.StatusCode, batchId);
                var responseBody = await response.Content.ReadAsStringAsync().ConfigureAwait(false);
                CustomConsole.Info("Response Body:");
                CustomConsole.Info("{0}", responseBody);
            }
            return response;
        }
    }
}

HttpClient timeout using HttpCompletionOption.ResponseHeadersRead

In a .NET Core 3.1 console application on Windows, I'm trying to figure out why httpClient.Timeout does not seem to apply when reading the content after using HttpCompletionOption.ResponseHeadersRead.
static async Task Main(string[] args)
{
    var httpClient = new HttpClient();
    // if using HttpCompletionOption this timeout doesn't work
    httpClient.Timeout = TimeSpan.FromSeconds(5);
    var uri = new Uri("http://brokenlinkcheckerchecker.com/files/200MB.zip");
    // will not timeout
    //using var httpResponseMessage = await httpClient.GetAsync(uri, HttpCompletionOption.ResponseHeadersRead);
    // will timeout after 5s with a TaskCanceledException
    var httpResponseMessage = await httpClient.GetAsync(uri);
    Console.WriteLine($"Status code is {httpResponseMessage.StatusCode}. Press any key to get content");
    Console.ReadLine();
    Console.WriteLine("getting content");
    var html = await httpResponseMessage.Content.ReadAsStringAsync();
    Console.WriteLine($"finished and length is {html.Length}");
}
Have also tried a CancellationToken
// will not timeout
var cts = new CancellationTokenSource(5000);
using var httpResponseMessage = await httpClient.GetAsync(uri, HttpCompletionOption.ResponseHeadersRead,
    cts.Token);
and ReadAsStreamAsync
// will not timeout
using (Stream streamToReadFrom = await httpResponseMessage.Content.ReadAsStreamAsync())
{
    string fileToWriteTo = Path.GetTempFileName();
    using (Stream streamToWriteTo = File.Open(fileToWriteTo, FileMode.Create))
    {
        await streamToReadFrom.CopyToAsync(streamToWriteTo);
    }
}
I learned about HttpCompletionOption from this great article:
https://www.stevejgordon.co.uk/using-httpcompletionoption-responseheadersread-to-improve-httpclient-performance-dotnet
Update
Using @StephenCleary's answer below and passing the CancellationToken into the CopyToAsync method, this now works as expected.
I've included the updated code below, which copies into a MemoryStream and then decodes it into a string, something I found tricky to work out. For my use case this is good.
string html;
await using (var streamToReadFrom = await httpResponseMessage.Content.ReadAsStreamAsync())
await using (var streamToWriteTo = new MemoryStream())
{
    await streamToReadFrom.CopyToAsync(streamToWriteTo, cts.Token);
    // careful of what encoding - read from incoming MIME
    html = Encoding.UTF8.GetString(streamToWriteTo.ToArray());
}
I would expect HttpClient.Timeout to apply only to the GetAsync part of the request. HttpCompletionOption.ResponseHeadersRead means "consider the Get complete when the response headers are read", so at that point the operation is complete. The timeout therefore just doesn't apply to reading from the stream afterwards.
I recommend using Polly's Timeout instead of HttpClient.Timeout; Polly is a generic library that can be used to time out any operation.
If you don't want to use Polly at this time, you can pass the CancellationToken to Stream.CopyToAsync.
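As a rough sketch of the CancellationToken approach, the helper below applies a timeout to the copy itself. StreamTimeouts and CopyWithTimeoutAsync are hypothetical names, and the 81920-byte buffer size is an arbitrary choice for illustration.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class StreamTimeouts
{
    // Hypothetical helper: copies source to destination and cancels the copy
    // if it has not finished within the given timeout.
    public static async Task CopyWithTimeoutAsync(
        Stream source, Stream destination, TimeSpan timeout, CancellationToken token = default)
    {
        // Link to the caller's token so either the caller or the timer can cancel.
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(token);
        cts.CancelAfter(timeout);

        // The token is observed by CopyToAsync, which is what makes the timeout
        // effective while the body is still streaming.
        await source.CopyToAsync(destination, 81920, cts.Token).ConfigureAwait(false);
    }
}
```

With a response obtained via ResponseHeadersRead, you would pass the stream from response.Content.ReadAsStreamAsync() as source. If the timeout elapses, the copy ends with an OperationCanceledException (or a derived type), which the caller needs to handle.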

await httpClient.GetByteArrayAsync() in HttpClient suddenly stops after several videos have been downloaded

After downloading 2-4 videos from the API using HttpClient, an error suddenly appears.
Here's my code:
public async Task<byte[]> GetMedia(string id)
{
    var api = $"/api/v1/download/{id}";
    var Uri = $"{MccBaseURL}{api}";
    byte[] responseBody;
    httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("No");
    try
    {
        HttpResponseMessage response = await httpClient.GetAsync(Uri);
        response.EnsureSuccessStatusCode();
        responseBody = await response.Content.ReadAsByteArrayAsync();
        if (response.IsSuccessStatusCode)
        {
            return responseBody;
        }
    }
    catch (Exception ex)
    {
        Debug.Print(ex.Message);
        throw;
    }
}
Below is the error I get, with additional error info:
[image: Error]
Please help.
First, you should dispose your HttpResponseMessage, as you have in your answer, but not in the original question.
The most likely issue, though, is your use of DefaultRequestHeaders. You should only use this for headers that apply to every request that the HttpClient instance will send, and then you should set them only once, when you create the client, as the documentation implies ("Headers set on this property don't need to be set on request messages again").
While HttpClient is essentially thread-safe, the DefaultRequestHeaders (and BaseAddress) properties are not. You're changing these values while the client instance is potentially busy using them elsewhere. It's not clear whether you're using the singleton HttpClient elsewhere as well, possibly changing the default headers there too, but if so that would significantly increase the chances of issues arising.
Some additional references about the non-thread-safety of these properties:
https://github.com/dotnet/dotnet-api-docs/issues/1085
http://www.michaeltaylorp3.net/httpclient-is-it-really-thread-safe/
https://github.com/MicrosoftDocs/architecture-center/issues/935
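One way to avoid touching DefaultRequestHeaders on a shared client is to put the header on each request message instead. A minimal sketch; MediaRequests, BuildDownloadRequest and the scheme/token parameters are hypothetical, and only the /api/v1/download/{id} path comes from the question:

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

public static class MediaRequests
{
    // Hypothetical helper: builds a GET request that carries its own
    // Authorization header, leaving the shared HttpClient untouched.
    public static HttpRequestMessage BuildDownloadRequest(
        Uri baseUri, string id, string scheme, string token)
    {
        var request = new HttpRequestMessage(
            HttpMethod.Get, new Uri(baseUri, $"/api/v1/download/{id}"));

        // Per-request header: safe even when other threads use the same client.
        request.Headers.Authorization = new AuthenticationHeaderValue(scheme, token);
        return request;
    }
}
```

The request would then be sent with httpClient.SendAsync(request), and both the request and the response should be disposed when done.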
I found an answer which is:
public async Task<bool> GetMedia(string saveDir, string id)
{
    var api = $"/api/v1/download/{id}";
    var Uri = $"{MccBaseURL}{api}";
    using (HttpClient client = new HttpClient())
    {
        using (HttpResponseMessage response = await client.GetAsync(Uri, HttpCompletionOption.ResponseHeadersRead))
        using (System.IO.Stream streamToReadFrom = await response.Content.ReadAsStreamAsync())
        {
            string fileToWriteTo = System.IO.Path.GetTempFileName();
            using (System.IO.FileStream streamToWriteTo = new System.IO.FileStream(saveDir, System.IO.FileMode.Create))
            {
                await streamToReadFrom.CopyToAsync(streamToWriteTo);
                return true;
            }
        }
    }
}
It really was some kind of memory problem, caused by using the same HttpClient over and over again. So I now create a new instance. I'm a super noob! Sorry!

HttpClient.PutAsync finish immediately with no response

I am trying to upload a file with the PUT method to an HTTP server (Apache Tika) using the following code:
private static async Task<string> send(string fileName, string url)
{
    using (var fileStream = File.OpenRead(fileName))
    {
        var client = new HttpClient();
        client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("text/plain"));
        var content = new StreamContent(fileStream);
        content.Headers.ContentType = new MediaTypeHeaderValue("application/pdf");
        var response = await client.PutAsync(url, content);
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}
In Main the method is called this way:
private static void Main(string[] args)
{
    // ...
    send(options.FileName, options.Url).
        ContinueWith(task => Console.WriteLine(task.Result));
}
In response, the server should return HTTP 200 and a text response (the parsed PDF file). I've checked this behavior with Fiddler and it works fine as far as the server is concerned.
Unfortunately, execution finishes right after the call to PutAsync.
What am I doing wrong?
You're executing this from a console application, which will terminate after your call to send. You'll have to use Wait or Result on it in order for Main not to terminate:
private static void Main(string[] args)
{
    var sendResult = send(options.FileName, options.Url).Result;
    Console.WriteLine(sendResult);
}
Note: this should only be used inside a console application. Using Task.Wait or Task.Result can deadlock in application types that have a synchronization context (UI applications, for example), because the continuation is marshaled back to the context that is being blocked.
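With C# 7.1 or later, another option is to make Main itself async, so nothing has to block on .Result. A minimal sketch; AsyncMainSketch, the SendAsync stand-in and the placeholder arguments are hypothetical, and the real method would call PutAsync as in the question.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncMainSketch
{
    // Stand-in for the question's send method (hypothetical): the real one
    // would upload the file with PutAsync and return the response text.
    private static async Task<string> SendAsync(string fileName, string url)
    {
        await Task.Delay(10).ConfigureAwait(false); // simulates network latency
        return $"sent {fileName} to {url}";
    }

    // Because Main returns Task, the process stays alive until the awaited
    // work has finished, without calling Wait() or Result.
    public static async Task Main(string[] args)
    {
        var result = await SendAsync("sample.pdf", "http://localhost/upload");
        Console.WriteLine(result);
    }
}
```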
