So I have multiple threads trying to get a response from a resource, but for some reason - even though they are running in separate threads - each response only returns when all the others are either still waiting or closed. I tried using
WebResponse response = await request.GetResponseAsync(); but first of all that seems redundant to me, since I'm already running separate threads, and also Visual Studio tells me
The 'await' operator can only be used within an async method. Consider marking this method with the 'async' modifier and changing its return type to 'Task'.
What's going on here?
EDIT (Code):
Start method (called from a single thread)
public void Start()
{
if (!Started)
{
ByteAt = 0;
request = (HttpWebRequest)WebRequest.Create(URL);
request.Method = "GET";
request.AddRange(ByteStart, ByteStart + ByteLength);
downloadThread = new Thread(DownloadThreadWorker);
downloadThread.Start();
Started = true;
Paused = false;
}
}
Download threads:
private void DownloadThreadWorker()
{
WebResponse response = request.GetResponse();
if (response != null)
{
if (!CheckRange(response))
Abort(String.Format("Multi part downloads not supported (Requested length: {0}, response length: {1})", ByteLength, response.ContentLength));
else
{ ...
Per the HTTP 1.1 RFC, a client should make no more than 2 concurrent connections to the same host. I'm not sure about the latest versions of IE, but previously IE honored this limitation (it could be changed via a registry key) and kept no more than 2 connections to the same host at any one time. This could be the limitation you're experiencing.
Or try setting ServicePointManager.DefaultConnectionLimit above 2.
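For example, set it once at startup, before the first request is created:
ServicePointManager.DefaultConnectionLimit = 10; // default is 2 per host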
Related
I'm using the code below (slightly simplified) to make a web request:
public async Task<string> GetResponseAsync()
{
WebRequest webrequest = WebRequest.Create(url);
WebResponse response = null;
string content = string.Empty;
webrequest.Method = "GET";
webrequest.Timeout = 10000; // 10 seconds
response = await webrequest.GetResponseAsync();//this seems to not get started
using (Stream dataStream = response.GetResponseStream())
{
StreamReader reader = new StreamReader(dataStream);
content = await reader.ReadToEndAsync();
}
response?.Close();
return content;
}
This code has been working in production for months. Recently some changes have been made to the load balancer of the underlying service and now intermittently the line with GetResponseAsync gets stuck.
Below is a screenshot from the tasks debugging window. It will stay in this state for hours and the timeout does not work. The tasks window only shows tasks which are either "Awaiting" or "Scheduled". There is no task in any other state. Double clicking the task in red will go to line with GetResponseAsync method.
I feel like I might be missing something obvious here. What can be the reason of this getting stuck?
As per the link below, use ConfigureAwait to prevent deadlocks. Please read the extensive post on deadlocks caused by blocking on async code:
https://blog.stephencleary.com/2012/07/dont-block-on-async-code.html
public static async Task<JObject> GetJsonAsync(Uri uri)
{
// (real-world code shouldn't use HttpClient in a using block; this is just example code)
using (var client = new HttpClient())
{
var jsonString = await client.GetStringAsync(uri).ConfigureAwait(false);
return JObject.Parse(jsonString);
}
}
It's likely that code outside of this method is placing restrictions on the ExecutionContext or SynchronizationContext your task needs to resume execution.
It turned out that the SSL handshake failed and that the timeout does not work in this case. The solution was to pass a CancellationToken with the timeout like this:
await webrequest.GetResponseAsync(new CancellationTokenSource(millisecondsDelay: 10000).Token)
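Note that WebRequest.GetResponseAsync has no CancellationToken overload in the BCL, so the call above presumably goes through an extension method. A minimal sketch of such an extension (assuming using System.Net, System.Threading, and System.Threading.Tasks), which registers Abort() on cancellation:
public static class WebRequestExtensions
{
    public static async Task<WebResponse> GetResponseAsync(this WebRequest request, CancellationToken token)
    {
        // Abort() forces the pending GetResponseAsync to complete with a WebException.
        using (token.Register(() => request.Abort(), useSynchronizationContext: false))
        {
            try
            {
                return await request.GetResponseAsync().ConfigureAwait(false);
            }
            catch (WebException ex) when (token.IsCancellationRequested)
            {
                // Translate the Abort-induced WebException into a proper cancellation.
                throw new OperationCanceledException(ex.Message, ex, token);
            }
        }
    }
}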
I have a program which gets the HTML code of ~500 web pages every 5 minutes.
It runs correctly until the first failure (unable to download the source within 6 seconds).
After that, all threads will fail,
and if I restart the program, it again runs correctly until the next failure.
Where am I going wrong, and what should I do to make this better?
This function runs every 5 minutes:
foreach (Company company in companies)
{
string link = company.GetLink();
Thread t = new Thread(() => F(company, link));
t.Start();
if (!t.Join(TimeSpan.FromSeconds(6)))
{
Debug.WriteLine( company.Name + " Fails");
t.Abort();
}
}
and this function downloads the HTML code:
private void F(Company company, string link)
{
try
{
string htmlCode = GetInformationFromWeb.GetHtmlRequest(link);
company.HtmlCode = htmlCode;
}
catch (Exception ex)
{
}
}
and this class:
public class GetInformationFromWeb
{
public static string GetHtmlRequest(string url)
{
using (MyWebClient client = new MyWebClient())
{
client.Encoding = Encoding.UTF8;
string htmlCode = client.DownloadString(url);
return htmlCode;
}
}
}
and web client class
public class MyWebClient : WebClient
{
protected override WebRequest GetWebRequest(Uri address)
{
HttpWebRequest request = base.GetWebRequest(address) as HttpWebRequest;
request.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
return request;
}
}
If your foreach is looping over 500 companies, and each is creating a new thread, your internet connection could become a bottleneck and you will exceed the 6-second timeout and fail very often.
I suggest you try parallelism. Note MaxDegreeOfParallelism, which sets the maximum number of parallel executions. You can tune this to suit your needs.
Parallel.ForEach(companies, new ParallelOptions { MaxDegreeOfParallelism = 10 }, (company) =>
{
try
{
string htmlCode = GetInformationFromWeb.GetHtmlRequest(company.GetLink());
company.HtmlCode = htmlCode;
}
catch(Exception ex)
{
//ignore or process exception
}
});
I have four basic suggestions:
Use HttpClient instead of the obsolete WebClient. HttpClient handles asynchronous operations natively and has far more flexibility to take advantage of. You can even read downloaded contents to strings/streams on a different thread, since you can configure await not to schedule your continuations back. Or you can configure the client's timeout to 6 seconds so it raises a TaskCanceledException when exceeded.
Avoid swallowing exceptions (like you do in your F function), as it breaks debugging and obscures the real cause of problems. A correctly written program will never raise an exception during normal operation.
You are using threads in a useless way: they do not even overlap, because t.Join blocks the calling loop right after each thread starts, so every download just waits for the previous one. In .NET it is better to do multitasking with Tasks (for example by calling Task.Run(async delegate() { await YourTaskAsync(); }), or AsyncContext.Run(...) if you need UI access), which won't block anything.
The whole GetInformationFromWeb class is pointless at the moment, and you are also spawning multiple client objects pointlessly, since one HTTP client object can handle multiple requests. With HttpClient you just instantiate it once as a static global variable with all the necessary configuration and then call it from anywhere with as little code as client.GetStringAsync(uri); a minimal sketch follows below.
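A minimal sketch of suggestions 1 and 4 combined, assuming the asker's Company type with its GetLink(), Name and HtmlCode members (and using System.Linq, System.Net.Http, and System.Diagnostics):
// One shared client for the whole application; 6-second timeout per request.
private static readonly HttpClient client = new HttpClient
{
    Timeout = TimeSpan.FromSeconds(6) // exceeding it raises TaskCanceledException
};

private static async Task FetchAllAsync(IEnumerable<Company> companies)
{
    // Start all downloads concurrently; they genuinely overlap.
    var tasks = companies.Select(async company =>
    {
        try
        {
            company.HtmlCode = await client.GetStringAsync(company.GetLink());
        }
        catch (Exception ex) // HttpRequestException, TaskCanceledException, ...
        {
            Debug.WriteLine(company.Name + " failed: " + ex.Message); // log, don't swallow silently
        }
    });
    await Task.WhenAll(tasks);
}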
OT: Is this some kind of academic project?
I am using the .NET 4.5 HttpClient class to make a POST request to a server a number of times. The first 3 calls run quickly, but the fourth time a call to await client.PostAsync(...) is made, it hangs for several seconds before returning the expected response.
using (HttpClient client = new HttpClient())
{
// Prepare query
StringBuilder queryBuilder = new StringBuilder();
queryBuilder.Append("?arg=value");
// Send query
using (var result = await client.PostAsync(BaseUrl + queryBuilder.ToString(),
new StreamContent(streamData)))
{
Stream stream = await result.Content.ReadAsStreamAsync();
return new MyResult(stream);
}
}
The server code is shown below:
HttpListener listener;
void Run()
{
listener.Start();
ThreadPool.QueueUserWorkItem((o) =>
{
while (listener.IsListening)
{
ThreadPool.QueueUserWorkItem((c) =>
{
var context = c as HttpListenerContext;
try
{
// Handle request
}
finally
{
// Always close the stream
context.Response.OutputStream.Close();
}
}, listener.GetContext());
}
});
}
Inserting a debug statement at // Handle request shows that the server code doesn't seem to receive the request as soon as it is sent.
I have already investigated whether it could be a problem with the client not closing the response, meaning that the number of connections the ServicePoint provider allows could be reached. However, I have tried increasing ServicePointManager.MaxServicePoints but this has no effect at all.
I also found this similar question:
.NET HttpClient hangs after several requests (unless Fiddler is active)
I don't believe this is the problem with my code - even changing my code to exactly what is given there didn't fix the problem.
The problem was that there were too many Task instances scheduled to run.
Changing some of the Task.Factory.StartNew calls in my program for tasks which ran for a long time to use the TaskCreationOptions.LongRunning option fixed this. It appears that the task scheduler was waiting for other tasks to finish before it scheduled the request to the server.
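For illustration, the change was along these lines, where LongRunningLoop stands in for the actual long-running work:
// Before: the long-running loop occupies a thread-pool thread, starving
// short work items such as async HTTP continuations.
Task.Factory.StartNew(() => LongRunningLoop());

// After: LongRunning hints the scheduler to run the task on its own
// dedicated thread, leaving the pool free for short tasks.
Task.Factory.StartNew(() => LongRunningLoop(), TaskCreationOptions.LongRunning);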
I'm currently using code that makes HTTP requests using the HttpClient class. Although you can specify a timeout for the request, the value applies to the entirety of the request (which includes resolving the host name, establishing a connection, sending the request and receiving the response).
I need a way to make requests fail fast if they cannot resolve the name or establish a connection, but I also sometimes need to receive large amounts of data, so cannot just reduce the timeout.
Is there a way to achieve this using either a built in (BCL) class or an alternative HTTP client stack?
I've looked briefly at RestSharp and ServiceStack, but neither seems to provide a timeout just for the connection part (but do correct me if I am wrong).
You can use a Timer to abort the request if the connection takes too long. Attach a handler to its Elapsed event. You can use something like this:
static WebRequest request;
private static void sendAndReceive()
{
// The request, with a big timeout for receiving a large amount of data
request = HttpWebRequest.Create("http://localhost:8081/index/");
request.Timeout = 100000;
// The connection timeout
var ConnectionTimeoutTime = 100;
Timer timer = new Timer(ConnectionTimeoutTime);
timer.Elapsed += connectionTimeout;
timer.Enabled = true;
Console.WriteLine("Connecting...");
try
{
using (var stream = request.GetRequestStream())
{
Console.WriteLine("Connection success !");
timer.Enabled = false;
/*
* Sending data ...
*/
System.Threading.Thread.Sleep(1000000);
}
using (var response = (HttpWebResponse)request.GetResponse())
{
/*
* Receiving data...
*/
}
}
catch (WebException e)
{
if(e.Status==WebExceptionStatus.RequestCanceled)
Console.WriteLine("Connection canceled (timeout)");
else if(e.Status==WebExceptionStatus.ConnectFailure)
Console.WriteLine("Can't connect to server");
else if(e.Status==WebExceptionStatus.Timeout)
Console.WriteLine("Timeout");
else
Console.WriteLine("Error");
}
}
static void connectionTimeout(object sender, System.Timers.ElapsedEventArgs e)
{
Console.WriteLine("Connection failed...");
Timer timer = (Timer)sender;
timer.Enabled = false;
request.Abort();
}
Times here are just examples; you will have to adjust them to your needs.
.NET's HttpWebRequest exposes two properties for specifying a timeout when connecting to a remote HTTP server:
Timeout - Gets or sets the time-out value in milliseconds for the GetResponse and GetRequestStream methods.
ReadWriteTimeout - The number of milliseconds before the writing or reading times out. The default value is 300,000 milliseconds (5 minutes).
The Timeout property is the closest to what you're after, but the documentation notes that regardless of the Timeout value, DNS resolution may take up to 15 seconds:
A Domain Name System (DNS) query may take up to 15 seconds to return or time out. If your request contains a host name that requires resolution and you set Timeout to a value less than 15 seconds, it may take 15 seconds or more before a WebException is thrown to indicate a timeout on your request.
One way to pre-empt a timeout lower than 15s for DNS lookups is to resolve the hostname yourself, but many solutions require P/Invoke to specify low-level settings.
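One managed alternative (no P/Invoke) is to race the async DNS lookup against your own delay; a sketch, assuming you then hand the resolved address to the request yourself:
// Resolve the host with a 2-second budget instead of the 15-second window.
// Note: losing the race does not cancel the underlying lookup; it only
// stops you from waiting on it.
var dnsTask = Dns.GetHostEntryAsync("example.com");
if (await Task.WhenAny(dnsTask, Task.Delay(2000)) != dnsTask)
    throw new TimeoutException("DNS lookup took longer than 2 seconds");
IPAddress address = dnsTask.Result.AddressList[0];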
Specifying timeouts in ServiceStack HTTP Clients
The underlying HttpWebRequest Timeout and ReadWriteTimeout properties can also be specified in ServiceStack's high-level HTTP Clients, i.e. in C# Service Clients with:
var client = new JsonServiceClient(BaseUri) {
Timeout = TimeSpan.FromSeconds(30)
};
Or using ServiceStack's HTTP Utils with:
var timeoutMs = 30 * 1000;
var response = url.GetStringFromUrl(requestFilter: req =>
req.Timeout = timeoutMs);
I believe RestSharp does have timeout properties in RestClient.
var request = new RestRequest();
var client = new RestClient
{
Timeout = timeout, //Timeout in milliseconds to use for requests made by this client instance
ReadWriteTimeout = readWriteTimeout //The number of milliseconds before the writing or reading times out.
};
var response = client.Execute(request);
//Handle response
You're right: you are unable to set this specific timeout.
I don't have enough information about how those libraries were built, but I believe they fit the purpose they were designed for: someone wants to make a request and set a timeout for the whole thing.
I suggest you take a different approach.
You are trying to do two different things at once here, which HttpWebRequest performs as one:
Find the host / establish a connection;
Transfer data.
You could try to separate this into two stages:
Use the Ping class (check this out) to try to reach your host, with its own timeout (a sketch follows below);
Then use HttpWebRequest with whatever (long) timeout your transfers need.
This should not slow everything down, since part of the name/route resolution is done in the first stage, so that work is not entirely thrown away.
There's a drawback to this solution: your remote host must accept pings.
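A sketch of the first stage with the Ping class (System.Net.NetworkInformation) and a short budget:
// Stage 1: is the host reachable at all? Give it 500 ms.
using (var ping = new Ping())
{
    PingReply reply = ping.Send("example.com", 500); // timeout in milliseconds
    if (reply.Status != IPStatus.Success)
        throw new TimeoutException("Host unreachable within 500 ms");
}
// Stage 2: proceed with the normal request and its long timeout.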
Hope this helps.
I used this method to check whether a connection can be established. This, however, doesn't guarantee that the connection can also be established by the subsequent HttpWebRequest call.
private static bool CanConnect(string machine)
{
using (TcpClient client = new TcpClient())
{
if (!client.ConnectAsync(machine, 443).Wait(50)) // Check if we can connect in 50ms
{
return false;
}
}
return true;
}
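Usage sketch, failing fast before the real request is built:
if (!CanConnect("example.com"))
    throw new TimeoutException("No connection to example.com:443 within 50 ms");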
If timeouts do not suit your needs, don't use them. You can use a wait handle that waits for the operation to complete: when you get a response, signal the handle and proceed. That way you get short-lived requests when failing and long-running requests for large amounts of data.
Something like this, maybe:
var handler = new ManualResetEvent(false);
var request = (HttpWebRequest)WebRequest.Create(url);
// initialize parameters such as Method here
request.BeginGetResponse(new AsyncCallback(delegate(IAsyncResult result)
{
try
{
var request = (HttpWebRequest)result.AsyncState;
using (var response = (HttpWebResponse)request.EndGetResponse(result))
{
using (var stream = response.GetResponseStream())
{
// success
}
response.Close();
}
}
catch (Exception e)
{
// fail operations go here
}
finally
{
handler.Set(); // whenever i succeed or fail
}
}), request);
handler.WaitOne(); // wait for the operation to complete
What about asking for only the headers first, and then requesting the resource as usual if that succeeds:
webRequest.Method = "HEAD";
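Fleshed out slightly as a hypothetical helper (names are illustrative): a cheap HEAD probe with a short timeout, so DNS and connect failures surface fast before the real GET with its generous timeout.
static bool HeadProbe(string url, int timeoutMs = 2000)
{
    var probe = (HttpWebRequest)WebRequest.Create(url);
    probe.Method = "HEAD";
    probe.Timeout = timeoutMs; // covers connect + response headers
    try
    {
        using (probe.GetResponse()) { } // headers only, no body transfer
        return true;
    }
    catch (WebException)
    {
        return false; // fail fast on DNS, connect, or HTTP errors
    }
}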
I have a requirement to process X files; we usually receive around 100 files each day. Each is a zip file, so I have to open it, create a stream, and send it to a WebApi service, which is a workflow; this workflow calls two more WebApi steps.
I implemented a console application that loops through the files and calls a wrapper, which makes a REST call using HttpWebRequest.GetResponse().
I stress-tested the solution with 11K files; a synchronous version takes around 17 minutes to process them all, but I would like to create an async version and use await HttpWebRequest.GetResponseAsync().
Here is the async version:
private async Task<KeyValuePair<HttpStatusCode, string>> REST_CallAsync(
string httpMethod,
string url,
string contentType,
object bodyMessage = null,
Dictionary<string, object> headerParameters = null,
object[] queryStringParamaters = null,
string requestData = "")
{
try
{
HttpWebRequest req = (HttpWebRequest)HttpWebRequest.Create("some url");
req.Method = "POST";
req.ContentType = contentType;
//Adding zip stream to body
var reqBodyBytes = ReadFully((Stream)bodyMessage);
req.ContentLength = reqBodyBytes.Length;
Stream reqStream = req.GetRequestStream();
reqStream.Write(reqBodyBytes, 0, reqBodyBytes.Length);
reqStream.Close();
//Async call
var resp = await req.GetResponseAsync();
var httpResponse = (HttpWebResponse)resp;
var responseData = new StreamReader(resp.GetResponseStream()).ReadToEnd();
return new KeyValuePair<HttpStatusCode,string>(httpResponse.StatusCode, responseData);
}
catch (WebException webEx)
{
//something
throw;
}
catch (Exception ex)
{
//something
throw;
}
}
In my console application I have a loop that opens each file and calls the async method (CallServiceAsync, under the covers, calls the method above):
foreach (var zipFile in Directory.EnumerateFiles(directory))
{
using (var zipStream = System.IO.File.OpenRead(zipFile))
{
await _restFulService.CallServiceAsync<WorkflowResponse>(
zipStream,
headerParameters,
null,
true);
}
processId++;
}
What ended up happening was that only 2K of the 11K files got processed, and nothing threw an exception, so I was clueless. So I changed the calling code to:
foreach (var zipFile in Directory.EnumerateFiles(directory))
{
using (var zipStream = System.IO.File.OpenRead(zipFile))
{
tasks.Add(_restFulService.CallServiceAsync<WorkflowResponse>(
zipStream,
headerParameters,
null,
true));
}
}
And another loop to await the tasks:
foreach (var task in await System.Threading.Tasks.Task.WhenAll(tasks))
{
if (task.Value != null)
{
Console.WriteLine("Ending Process");
}
}
And now I am facing a different error: when I process three files, the third one receives:
The client is disconnected because the underlying request has been completed. There is no longer an HttpContext available.
My question is: what am I doing wrong here? I use SimpleInjector as my IoC container; could that be the problem?
Also, when you use WhenAll, does it wait for each task to run? Doesn't that make it synchronous, so it waits for one task to finish before executing the next? I am new to this async world, so any help would be much appreciated.
Well, for those that added -1 to my question and, instead of providing some kind of solution, just suggested something meaningless, here is the answer, and the reason why specifying as much detail as possible is useful.
First problem: since I'm using IIS Express, if I'm not running my solution (F5), the web applications are not available. That happened to me sometimes, though not always.
The second problem, and the one giving me a huge headache, is that not all the files got processed. I should have known the reason for this issue beforehand: it is the usage of async-await in a console application. I forced my console app to work with async by doing:
static void Main(string[] args)
{
System.Threading.Tasks.Task.Run(() => MainAsync(args)).Wait();
}
static async void MainAsync(string[] args) // async void returns to the caller at the first await, so the Wait() above does not actually wait for completion
{
//rest of code
Then, if you note, in my foreach I had the await keyword, and what was happening is that await, by design, sends control flow back to the caller; in this case the OS is the one calling the console app. (That is why it doesn't make much sense to use async-await at the top level of a console app; I did it because I mistakenly awaited an async method there.)
So the result was that my process only handled some X number of files. What I ended up doing is the following:
Add a list of tasks, the same way I did above:
tasks.Add(_restFulService.CallServiceAsync<WorkflowResponse>(....
And the way to run the tasks is (in my console app):
ExecuteAsync(tasks);
Finally my method:
static void ExecuteAsync(List<System.Threading.Tasks.Task<KeyValuePair<HttpStatusCode, WorkflowResponse>>> tasks)
{
System.Threading.Tasks.Task.WhenAll(tasks).Wait();
}
UPDATE: Based on Scott's feedback, I changed the way I execute my threads.
And now I'm able to process all my files. I tested it: processing 1,000 files with the synchronous process took around 160+ seconds end to end (I have a three-step workflow to process each file), and with the async process in place it took 80+ seconds, almost half the time. On my production server with IIS I believe the execution time will be lower.
Hope this helps anyone facing this type of issue.