WebRequest slower in multiple tasks - C#

When I use WebRequest inside tasks under Mono, it is very slow to get a response from the server, but very fast when the request is sent outside a task. I did some research but found no solution. The same code works well on Windows and on .NET Core on Linux, so I suspect it is a Mono-specific problem. I hope someone can help me out.
public static void Main(string[] args)
{
    ServicePointManager.DefaultConnectionLimit = 12;
    ServicePointManager.Expect100Continue = false;

    var tasks = Enumerable.Range(0, 10).Select(async _ =>
    {
        var timer = Stopwatch.StartNew();
        await ValidateUrlAsync("https://www.bing.com");
        timer.Stop();
        Console.WriteLine(timer.ElapsedMilliseconds);
        //request.Abort();
    }).ToArray();
    Task.WaitAll(tasks);

    // Single request
    {
        var timer = Stopwatch.StartNew();
        ValidateUrlAsync("https://www.bing.com").ConfigureAwait(false).GetAwaiter().GetResult();
        timer.Stop();
        Console.WriteLine(timer.ElapsedMilliseconds);
    }
    Console.Read();
}

public static async Task<bool> ValidateUrlAsync(string url)
{
    using (var response = (HttpWebResponse)await WebRequest.Create(url).GetResponseAsync())
        return response.StatusCode == HttpStatusCode.OK;
}
The Mono version: 5.16.0.179.
Ubuntu: 16.04.2 LTS.
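A useful cross-check, assuming the goal is only to validate the URL, is to run the same 10-task loop against a single shared HttpClient; if that stays fast under Mono, the slowdown is specific to the WebRequest stack rather than the networking layer as a whole. A minimal sketch (the method name is mine):
// Sketch only. Requires: using System.Net; using System.Net.Http; using System.Threading.Tasks;
private static readonly HttpClient Client = new HttpClient();

public static async Task<bool> ValidateUrlWithHttpClientAsync(string url)
{
    // ResponseHeadersRead avoids buffering the body; only the status code is needed.
    using (var response = await Client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead))
        return response.StatusCode == HttpStatusCode.OK;
}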

Related

Deadlock when calling HttpClient.PostAsync

I'm currently working on a project that consists of 2 applications: one that's basically a REST server, and a client. On the server side everything seems to work fine, but I get a deadlock when posting asynchronously from the client application.
The POST handler in the server application is, as of now (using Nancy):
Post("/", args =>
{
bool updated = false;
string json = Request.Body.AsString();
PostData data = JsonConvert.DeserializeObject<PostData>(json);
int clientCode = data.ClientCode;
bool started = data.Started;
// TODO:
// Update database
return updated;
}
);
The deadlocking part of the client application is (I'm also adding the GET handling just for completeness, but it's working fine):
public async Task<string> Get()
{
    string actualRequest = $"{uri}/{request}";
    response = await client.GetAsync(actualRequest);
    response.EnsureSuccessStatusCode();
    result = await response.Content.ReadAsStringAsync();
    return result;
}

public async Task<string> Post(object data)
{
    string json = JsonConvert.SerializeObject(data);
    StringContent content = new StringContent(json, Encoding.UTF8, "application/json");
    response = await client.PostAsync(uri, content);
    result = await response.Content.ReadAsStringAsync();
    return result;
}
The small chunk of code used for testing is:
private static string response = "";

private static async Task TestPostConnection(RestClient client)
{
    PostData data = new PostData(1, true);
    response = await client.Post(data);
}

private static async Task TestGetConnection(RestClient client)
{
    client.Request = "id=1";
    response = await client.Get();
}

private static async Task InitializeClient()
{
    string ipAddress = "localhost";
    RestClient client = new RestClient("RestClient", new Uri($"http://{ipAddress}:{8080}"));
    await TestGetConnection(client); // OK
    await TestPostConnection(client);
}
It's probably something silly that I can't see, because I've been working on it for about 4 hours :P
Thanks in advance!
Edit:
I'm using .NET Framework 4.7.2 and I'm not calling Result anywhere (I just searched in Visual Studio to double-check).
This is the Main method:
[STAThread]
static async Task Main()
{
    await InitializeClient(); // Call the async methods above
    Application.EnableVisualStyles();
    Application.SetCompatibleTextRenderingDefault(false);
    Application.Run(new ClientForm(response));
}
I just solved it.
The problem was the return type of the POST handler in the server application:
Post("/", args =>
{
bool updated = false;
string json = Request.Body.AsString();
PostData data = JsonConvert.DeserializeObject<PostData>(json);
int clientCode = data.ClientCode;
bool started = data.Started;
// TODO:
// Update database
return updated.ToString();
}
);
The RestClient.Post() that I've implemented was expecting a string as a result (in fact the return type of the method is (async) Task<string>), but I was returning a bool.
Adding ToString() solved it.
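An alternative sketch (not what I did) would be to leave the handler returning the bare bool and let the client parse it instead:
// Hypothetical variant: parse the server's reply on the client side instead of
// calling ToString() on the server. Assumes the same client/uri fields as Post() above.
public async Task<bool> PostAsBool(object data)
{
    string json = JsonConvert.SerializeObject(data);
    var content = new StringContent(json, Encoding.UTF8, "application/json");
    HttpResponseMessage response = await client.PostAsync(uri, content);
    string body = await response.Content.ReadAsStringAsync();
    return bool.TryParse(body, out bool updated) && updated;
}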
Thanks all!

HttpClient POST long delay before request processing on IIS, multithreading

I have a simple console application which sends HTTP POST from multiple threads:
List<Task> tasks = new List<Task>();
for (int i = 0; i < 100; i++)
{
    tasks.Add(Task.Factory.StartNew(() => SendQuery(url1, query1)));
    tasks.Add(Task.Factory.StartNew(() => SendQuery(url2, query2)));
}
Task.WaitAll(tasks.ToArray());
SendQuery(string uri, string requestString) looks like this:
Uri url = new Uri(uri);
try
{
    using (HttpClient client = new HttpClient { Timeout = new TimeSpan(0, 0, 10, 0) })
    {
        StringContent content = new StringContent(requestString);
        content.Headers.ContentType = new MediaTypeHeaderValue("application/json");
        HttpResponseMessage response = client.PostAsync(url, content).Result;
        response.EnsureSuccessStatusCode();
    }
}
catch (Exception ex)
{
    Console.WriteLine(ex);
}
The program works without any errors and all the queries are eventually processed, but after the tasks list is filled each thread hangs on client.PostAsync(url, content).Result, and only after several minutes does IIS start to process the queries. Why does this delay occur? What is happening during this time? I am using IIS 7.5 on Windows Server 2008 R2 to host the web services that provide url1 and url2.
Your requests may be getting throttled at the default connection limit of 2 (depending on the .NET version). Set this value to 500 or 1000 at the start of your program and let us know the effect:
ServicePointManager.DefaultConnectionLimit = 500;
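For what it's worth (not part of the original answer): as far as I know the new limit only applies to ServicePoint objects created after it is set, so it needs to run before the first request to that host. A minimal placement sketch:
static void Main(string[] args)
{
    // Raise the per-host connection limit before any request is made;
    // existing ServicePoint instances keep the old limit.
    ServicePointManager.DefaultConnectionLimit = 500;

    // ... start the tasks that call SendQuery(url1, query1) / SendQuery(url2, query2) ...
}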

Consuming Streaming API Call using HTTP Web Request [duplicate]

How can I use HttpWebRequest (.NET, C#) asynchronously?
Use HttpWebRequest.BeginGetResponse()
HttpWebRequest webRequest;

void StartWebRequest()
{
    webRequest.BeginGetResponse(new AsyncCallback(FinishWebRequest), null);
}

void FinishWebRequest(IAsyncResult result)
{
    webRequest.EndGetResponse(result);
}
The callback function is called when the asynchronous operation is complete. You need to at least call EndGetResponse() from this function.
By far the easiest way is by using TaskFactory.FromAsync from the TPL. It's literally a couple of lines of code when used in conjunction with the new async/await keywords:
var request = WebRequest.Create("http://www.stackoverflow.com");
var response = (HttpWebResponse) await Task.Factory
    .FromAsync<WebResponse>(request.BeginGetResponse,
                            request.EndGetResponse,
                            null);
Debug.Assert(response.StatusCode == HttpStatusCode.OK);
If you can't use the C#5 compiler then the above can be accomplished using the Task.ContinueWith method:
Task.Factory.FromAsync<WebResponse>(request.BeginGetResponse,
                                    request.EndGetResponse,
                                    null)
    .ContinueWith(task =>
    {
        var response = (HttpWebResponse) task.Result;
        Debug.Assert(response.StatusCode == HttpStatusCode.OK);
    });
Considering the answer:
HttpWebRequest webRequest;

void StartWebRequest()
{
    webRequest.BeginGetResponse(new AsyncCallback(FinishWebRequest), null);
}

void FinishWebRequest(IAsyncResult result)
{
    webRequest.EndGetResponse(result);
}
You could send the request pointer or any other object like this:
void StartWebRequest()
{
    HttpWebRequest webRequest = ...;
    webRequest.BeginGetResponse(new AsyncCallback(FinishWebRequest), webRequest);
}

void FinishWebRequest(IAsyncResult result)
{
    HttpWebResponse response = (result.AsyncState as HttpWebRequest).EndGetResponse(result) as HttpWebResponse;
}
Greetings
Everyone so far has been wrong, because BeginGetResponse() does some work on the current thread. From the documentation:
The BeginGetResponse method requires some synchronous setup tasks to complete (DNS resolution, proxy detection, and TCP socket connection, for example) before this method becomes asynchronous. As a result, this method should never be called on a user interface (UI) thread because it might take considerable time (up to several minutes depending on network settings) to complete the initial synchronous setup tasks before an exception for an error is thrown or the method succeeds.
So to do this right:
void DoWithResponse(HttpWebRequest request, Action<HttpWebResponse> responseAction)
{
    Action wrapperAction = () =>
    {
        request.BeginGetResponse(new AsyncCallback((iar) =>
        {
            var response = (HttpWebResponse)((HttpWebRequest)iar.AsyncState).EndGetResponse(iar);
            responseAction(response);
        }), request);
    };
    wrapperAction.BeginInvoke(new AsyncCallback((iar) =>
    {
        var action = (Action)iar.AsyncState;
        action.EndInvoke(iar);
    }), wrapperAction);
}
You can then do what you need to with the response. For example:
HttpWebRequest request;
// init your request...then:
DoWithResponse(request, (response) => {
    var body = new StreamReader(response.GetResponseStream()).ReadToEnd();
    Console.Write(body);
});
public static async Task<byte[]> GetBytesAsync(string url) {
    var request = (HttpWebRequest)WebRequest.Create(url);
    using (var response = await request.GetResponseAsync())
    using (var content = new MemoryStream())
    using (var responseStream = response.GetResponseStream()) {
        await responseStream.CopyToAsync(content);
        return content.ToArray();
    }
}

public static async Task<string> GetStringAsync(string url) {
    var bytes = await GetBytesAsync(url);
    return Encoding.UTF8.GetString(bytes, 0, bytes.Length);
}
I ended up using BackgroundWorker, it is definitely asynchronous unlike some of the above solutions, it handles returning to the GUI thread for you, and it is very easy to understand.
It is also very easy to handle exceptions, as they end up in the RunWorkerCompleted method, but make sure you read this: Unhandled exceptions in BackgroundWorker
I used WebClient but obviously you could use HttpWebRequest.GetResponse if you wanted.
var worker = new BackgroundWorker();

worker.DoWork += (sender, args) => {
    args.Result = new WebClient().DownloadString(settings.test_url);
};

worker.RunWorkerCompleted += (sender, e) => {
    if (e.Error != null) {
        connectivityLabel.Text = "Error: " + e.Error.Message;
    } else {
        connectivityLabel.Text = "Connectivity OK";
        Log.d("result:" + e.Result);
    }
};

connectivityLabel.Text = "Testing Connectivity";
worker.RunWorkerAsync();
.NET has changed since many of these answers were posted, and I'd like to provide a more up-to-date answer. Use an async method to start a Task that will run on a background thread:
private async Task<String> MakeRequestAsync(String url)
{
    String responseText = await Task.Run(() =>
    {
        try
        {
            HttpWebRequest request = WebRequest.Create(url) as HttpWebRequest;
            WebResponse response = request.GetResponse();
            Stream responseStream = response.GetResponseStream();
            return new StreamReader(responseStream).ReadToEnd();
        }
        catch (Exception e)
        {
            Console.WriteLine("Error: " + e.Message);
        }
        return null;
    });
    return responseText;
}
To use the async method:
String response = await MakeRequestAsync("http://example.com/");
Update:
This solution does not work for UWP apps, which use WebRequest.GetResponseAsync() instead of WebRequest.GetResponse(), and it does not call the Dispose() methods where appropriate. #dragansr has a good alternative solution that addresses these issues.
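For reference, a hedged sketch of what a GetResponseAsync-based variant with proper disposal might look like (my own illustration; the alternative referenced above follows below):
// Sketch only: awaitable variant that disposes the response, stream and reader.
// Requires: using System.IO; using System.Net; using System.Threading.Tasks;
private static async Task<string> MakeRequestAsync(string url)
{
    var request = (HttpWebRequest)WebRequest.Create(url);
    using (WebResponse response = await request.GetResponseAsync())
    using (Stream responseStream = response.GetResponseStream())
    using (var reader = new StreamReader(responseStream))
    {
        return await reader.ReadToEndAsync();
    }
}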
public void GetResponseAsync(HttpWebRequest request, Action<HttpWebResponse> gotResponse)
{
    if (request != null) {
        request.BeginGetResponse((r) => {
            // try/catch here because the callback runs on a different execution path
            // than the invocation; an unhandled exception here may crash the process
            try {
                HttpWebResponse response = (HttpWebResponse)request.EndGetResponse(r);
                if (gotResponse != null)
                    gotResponse(response);
            } catch (Exception x) {
                Console.WriteLine("Unable to get response for '" + request.RequestUri + "' Err: " + x);
            }
        }, null);
    }
}
Follow-up to #Isak's answer, which is very good. Nonetheless, its biggest flaw is that it will only call responseAction if the response has a 200-299 status. The best way to fix this is:
private void DoWithResponseAsync(HttpWebRequest request, Action<HttpWebResponse> responseAction)
{
    Action wrapperAction = () =>
    {
        request.BeginGetResponse(new AsyncCallback((iar) =>
        {
            HttpWebResponse response;
            try
            {
                response = (HttpWebResponse)((HttpWebRequest)iar.AsyncState).EndGetResponse(iar);
            }
            catch (WebException ex)
            {
                // It needs to be done like this in order to read responses with an error status:
                response = ex.Response as HttpWebResponse;
            }
            responseAction(response);
        }), request);
    };
    wrapperAction.BeginInvoke(new AsyncCallback((iar) =>
    {
        var action = (Action)iar.AsyncState;
        action.EndInvoke(iar);
    }), wrapperAction);
}
And then use it as #Isak shows:
HttpWebRequest request;
// init your request...then:
DoWithResponseAsync(request, (response) => {
    var body = new StreamReader(response.GetResponseStream()).ReadToEnd();
    Console.Write(body);
});
I've been using this for async UnityWebRequest (UWR) calls; hopefully it helps someone:
string uri = "http://some.place.online";
using (UnityWebRequest uwr = UnityWebRequest.Get(uri))
{
var asyncOp = uwr.SendWebRequest();
while (asyncOp.isDone == false) await Task.Delay(1000 / 30); // 30 hertz
if(uwr.result == UnityWebRequest.Result.Success) return uwr.downloadHandler.text;
Debug.LogError(uwr.error);
}
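Since that fragment uses await and return, it has to sit inside an async method; a possible wrapper (my own sketch, the method name is made up) would be:
// Hypothetical wrapper around the snippet above; returns null on failure.
private static async Task<string> GetTextAsync(string uri)
{
    using (UnityWebRequest uwr = UnityWebRequest.Get(uri))
    {
        var asyncOp = uwr.SendWebRequest();
        while (!asyncOp.isDone) await Task.Delay(1000 / 30); // poll at ~30 Hz
        if (uwr.result == UnityWebRequest.Result.Success) return uwr.downloadHandler.text;
        Debug.LogError(uwr.error);
        return null;
    }
}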

Why is this implementation of async WebRequests slower than the synchronous implementation?

I have an application that requires many requests to a third party REST service. I thought that modifying this part of the application to make the requests asynchronously would potentially speed things up, so I wrote a POC console application to test things out.
To my surprise the async code takes almost twice as long to complete as the synchronous version. Am I just doing it wrong?
async static void LoadUrlsAsync()
{
    var startTime = DateTime.Now;
    Console.WriteLine("LoadUrlsAsync Start - {0}", startTime);

    var numberOfRequest = 3;
    var tasks = new List<Task<string>>();
    for (int i = 0; i < numberOfRequest; i++)
    {
        var request = WebRequest.Create(@"http://www.google.com/images/srpr/logo11w.png") as HttpWebRequest;
        request.Method = "GET";
        var task = LoadUrlAsync(request);
        tasks.Add(task);
    }

    var results = await Task.WhenAll(tasks);

    var stopTime = DateTime.Now;
    var duration = (stopTime - startTime);
    Console.WriteLine("LoadUrlsAsync Complete - {0}", stopTime);
    Console.WriteLine("LoadUrlsAsync Duration - {0}ms", duration);
}
async static Task<string> LoadUrlAsync(WebRequest request)
{
    string value = string.Empty;
    using (var response = await request.GetResponseAsync())
    using (var responseStream = response.GetResponseStream())
    using (var reader = new StreamReader(responseStream))
    {
        value = reader.ReadToEnd();
        Console.WriteLine("{0} - Bytes: {1}", request.RequestUri, value.Length);
    }
    return value;
}
NOTE:
I have also tried setting maxconnection="100" in app.config in an attempt to eliminate throttling from the System.Net connection pool. The setting doesn't seem to have any impact on performance.
<system.net>
  <connectionManagement>
    <add address="*" maxconnection="100" />
  </connectionManagement>
</system.net>
First, try to avoid microbenchmarking. When the differences in your code timings are swamped by network conditions, your results lose meaning.
That said, you should set ServicePointManager.DefaultConnectionLimit to int.MaxValue. Also, use end-to-end async methods (e.g., StreamReader.ReadToEndAsync) - or even better, use HttpClient, which was designed for async HTTP.
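To make that concrete, here is a hedged sketch of an HttpClient-based version of the benchmark loop (my illustration, not the answerer's code):
// Sketch: end-to-end async download of N URLs with a single shared HttpClient.
// Requires: using System.Collections.Generic; using System.Linq;
//           using System.Net.Http; using System.Threading.Tasks;
static async Task<string[]> LoadUrlsWithHttpClientAsync(IEnumerable<string> urls)
{
    using (var client = new HttpClient())
    {
        // GetStringAsync is asynchronous all the way down, including reading the body.
        IEnumerable<Task<string>> tasks = urls.Select(url => client.GetStringAsync(url));
        return await Task.WhenAll(tasks);
    }
}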
The async version becomes faster as you increase the number of requests. I'm not certain, but my guess is that you are amortizing the cost of setting up the threads; once you pass this threshold the async version becomes superior. Try 50 or even 500 requests and you should see that async is faster. That's how it worked out for me:
500 Async Requests: 11.133 seconds
500 Sync Requests: 18.136 seconds
If you only have ~3 calls then I suggest avoiding async. Here's what I used to test:
public class SeperateClass
{
    static int numberOfRequest = 500;

    public async static void LoadUrlsAsync()
    {
        var startTime = DateTime.Now;
        Console.WriteLine("LoadUrlsAsync Start - {0}", startTime);

        var tasks = new List<Task<string>>();
        for (int i = 0; i < numberOfRequest; i++)
        {
            var request = WebRequest.Create(@"http://www.google.com/images/srpr/logo11w.png") as HttpWebRequest;
            request.Method = "GET";
            var task = LoadUrlAsync(request);
            tasks.Add(task);
        }

        var results = await Task.WhenAll(tasks);

        var stopTime = DateTime.Now;
        var duration = (stopTime - startTime);
        Console.WriteLine("LoadUrlsAsync Complete - {0}", stopTime);
        Console.WriteLine("LoadUrlsAsync Duration - {0}ms", duration);
    }

    async static Task<string> LoadUrlAsync(WebRequest request)
    {
        string value = string.Empty;
        using (var response = await request.GetResponseAsync())
        using (var responseStream = response.GetResponseStream())
        using (var reader = new StreamReader(responseStream))
        {
            value = reader.ReadToEnd();
            Console.WriteLine("{0} - Bytes: {1}", request.RequestUri, value.Length);
        }
        return value;
    }
}
public class SeperateClassSync
{
    static int numberOfRequest = 500;

    public async static void LoadUrlsSync()
    {
        var startTime = DateTime.Now;
        Console.WriteLine("LoadUrlsSync Start - {0}", startTime);

        var tasks = new List<Task<string>>();
        for (int i = 0; i < numberOfRequest; i++)
        {
            var request = WebRequest.Create(@"http://www.google.com/images/srpr/logo11w.png") as HttpWebRequest;
            request.Method = "GET";
            var task = LoadUrlSync(request);
            tasks.Add(task);
        }

        var results = await Task.WhenAll(tasks);

        var stopTime = DateTime.Now;
        var duration = (stopTime - startTime);
        Console.WriteLine("LoadUrlsSync Complete - {0}", stopTime);
        Console.WriteLine("LoadUrlsSync Duration - {0}ms", duration);
    }

    async static Task<string> LoadUrlSync(WebRequest request)
    {
        string value = string.Empty;
        using (var response = request.GetResponse()) // Still async framework, just changed to a sync call here
        using (var responseStream = response.GetResponseStream())
        using (var reader = new StreamReader(responseStream))
        {
            value = reader.ReadToEnd();
            Console.WriteLine("{0} - Bytes: {1}", request.RequestUri, value.Length);
        }
        return value;
    }
}
class Program
{
    static void Main(string[] args)
    {
        SeperateClass.LoadUrlsAsync();
        Console.ReadLine(); // record result and run again
        SeperateClassSync.LoadUrlsSync();
        Console.ReadLine();
    }
}
In my tests it's faster to use the WebRequest.GetResponseAsync() method for 3 parallel requests.
It should be more noticeable with large requests, many requests (3 is not many), and requests from different websites.
What are the exact results you are getting? In your question you are converting a TimeSpan to a string and calling it milliseconds, but you aren't actually calculating the milliseconds; it's displaying the standard TimeSpan.ToString() output, which shows fractions of a second.
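In other words, to actually report milliseconds the logging line would need something like:
// duration is a TimeSpan; TotalMilliseconds gives the elapsed time as a number.
Console.WriteLine("LoadUrlsAsync Duration - {0}ms", duration.TotalMilliseconds);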
It appears that the problem was more of an environmental issue than anything else. Once I moved the code to another machine on a different network, the results were much more in line with my expectations.
The original async code does in fact execute more quickly than the synchronous version. This helps me ensure that I am not introducing additional complexity to our application without the expected performance gains.

a faster way to download multiple files

I need to download about 2 million files from the SEC website. Each file has a unique URL and is on average 10 kB. This is my current implementation:
List<string> urls = new List<string>();
// ... initialize urls ...
WebBrowser browser = new WebBrowser();
foreach (string url in urls)
{
    browser.Navigate(url);
    while (browser.ReadyState != WebBrowserReadyState.Complete) Application.DoEvents();
    StreamReader sr = new StreamReader(browser.DocumentStream);
    StreamWriter sw = new StreamWriter(url.Substring(url.LastIndexOf('/')));
    sw.Write(sr.ReadToEnd());
    sr.Close();
    sw.Close();
}
The projected time is about 12 days... is there a faster way?
Edit: By the way, the local file handling takes only 7% of the time.
Edit: This is my final implementation:
void Main()
{
    ServicePointManager.DefaultConnectionLimit = 10000;
    List<string> urls = new List<string>();
    // ... initialize urls ...
    int retries = urls.AsParallel().WithDegreeOfParallelism(8).Sum(arg => downloadFile(arg));
}

public int downloadFile(string url)
{
    int retries = 0;
retry:
    try
    {
        HttpWebRequest webrequest = (HttpWebRequest)WebRequest.Create(url);
        webrequest.Timeout = 10000;
        webrequest.ReadWriteTimeout = 10000;
        webrequest.Proxy = null;
        webrequest.KeepAlive = false;
        HttpWebResponse webresponse = (HttpWebResponse)webrequest.GetResponse();
        using (Stream sr = webresponse.GetResponseStream())
        using (FileStream sw = File.Create(url.Substring(url.LastIndexOf('/'))))
        {
            sr.CopyTo(sw);
        }
    }
    catch (Exception ee)
    {
        if (ee.Message != "The remote server returned an error: (404) Not Found." && ee.Message != "The remote server returned an error: (403) Forbidden.")
        {
            if (ee.Message.StartsWith("The operation has timed out") || ee.Message == "Unable to connect to the remote server" || ee.Message.StartsWith("The request was aborted: ") || ee.Message.StartsWith("Unable to read data from the transport connection: ") || ee.Message == "The remote server returned an error: (408) Request Timeout.") retries++;
            else MessageBox.Show(ee.Message, "Error", MessageBoxButtons.OK, MessageBoxIcon.Error);
            goto retry;
        }
    }
    return retries;
}
Execute the downloads concurrently instead of sequentially, and set a sensible MaxDegreeOfParallelism; otherwise you will try to make too many simultaneous requests, which will look like a DoS attack:
public static void Main(string[] args)
{
    var urls = new List<string>();
    Parallel.ForEach(
        urls,
        new ParallelOptions { MaxDegreeOfParallelism = 10 },
        DownloadFile);
}

public static void DownloadFile(string url)
{
    using (var sr = new StreamReader(HttpWebRequest.Create(url)
        .GetResponse().GetResponseStream()))
    using (var sw = new StreamWriter(url.Substring(url.LastIndexOf('/'))))
    {
        sw.Write(sr.ReadToEnd());
    }
}
Download the files in several threads. The number of threads depends on your throughput. Also, look at the WebClient and HttpWebRequest classes. Simple sample:
var list = new[]
{
    "http://google.com",
    "http://yahoo.com",
    "http://stackoverflow.com"
};

var tasks = Parallel.ForEach(list,
    s =>
    {
        using (var client = new WebClient())
        {
            Console.WriteLine($"starting to download {s}");
            string result = client.DownloadString((string)s);
            Console.WriteLine($"finished downloading {s}");
        }
    });
I'd use several threads in parallel, with a WebClient. I recommend setting the max degree of parallelism to the number of threads you want, since an unspecified degree of parallelism doesn't work well for long-running tasks. I've used 50 parallel downloads in one of my projects without a problem, but depending on the speed of an individual download, a much lower number might be sufficient.
If you download multiple files in parallel from the same server, you're by default limited to a small number (2 or 4) of parallel downloads. While the HTTP standard specifies such a low limit, many servers don't enforce it. Use ServicePointManager.DefaultConnectionLimit = 10000; to increase the limit; a minimal sketch of this combination follows.
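A hedged sketch of that combination (raising the connection limit plus WebClient downloads with a capped degree of parallelism; the method name, target-directory handling, and the cap of 50 are illustrative):
// Illustrative only: parallel WebClient downloads with an explicit cap.
// Requires: using System.Collections.Generic; using System.Net; using System.Threading.Tasks;
static void DownloadAll(IReadOnlyList<string> urls, string targetDirectory)
{
    ServicePointManager.DefaultConnectionLimit = 10000; // lift the per-host default of 2
    Parallel.ForEach(
        urls,
        new ParallelOptions { MaxDegreeOfParallelism = 50 }, // tune to your bandwidth
        url =>
        {
            using (var client = new WebClient())
            {
                string fileName = url.Substring(url.LastIndexOf('/') + 1);
                client.DownloadFile(url, System.IO.Path.Combine(targetDirectory, fileName));
            }
        });
}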
I think the code from o17t H1H' S'k is right and all, but for I/O-bound tasks an async method should be used, like this:
public static async Task DownloadFileAsync(HttpClient httpClient, string url, string fileToWriteTo)
{
    using HttpResponseMessage response = await httpClient.GetAsync(url, HttpCompletionOption.ResponseHeadersRead);
    using Stream streamToReadFrom = await response.Content.ReadAsStreamAsync();
    using Stream streamToWriteTo = File.Open(fileToWriteTo, FileMode.Create);
    await streamToReadFrom.CopyToAsync(streamToWriteTo);
}
Parallel.ForEach is also available as Parallel.ForEachAsync. Parallel.ForEach has a lot of features that the async version doesn't have, but most of them are deprecated anyway. You can implement a producer/consumer system with Channel or BlockingCollection to handle the 2 million files, but that is only needed if you don't know all the URLs at the start; a minimal Channel-based sketch follows the example below.
private static async void StartDownload()
{
    (string, string)[] urls = new ValueTuple<string, string>[]{
        new ("https://dotnet.microsoft.com", "C:/YoureFile.html"),
        new ("https://www.microsoft.com", "C:/YoureFile1.html"),
        new ("https://stackoverflow.com", "C:/YoureFile2.html")};

    var client = new HttpClient();
    ParallelOptions options = new() { MaxDegreeOfParallelism = 2 };
    await Parallel.ForEachAsync(urls, options, async (url, token) =>
    {
        await DownloadFileAsync(client, url.Item1, url.Item2);
    });
}
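As mentioned above, if the URLs arrive over time rather than being known up front, a Channel-based producer/consumer is one option. A hedged sketch (my own illustration; the bounded capacity of 1000 and the four consumers are arbitrary):
// Sketch: a producer writes URLs into a bounded channel, a few consumers download them.
// Requires: using System.Collections.Generic; using System.Linq; using System.Net.Http;
//           using System.Threading.Channels; using System.Threading.Tasks;
private static async Task DownloadFromChannelAsync(IAsyncEnumerable<(string Url, string Path)> incoming)
{
    var channel = Channel.CreateBounded<(string Url, string Path)>(1000);
    var client = new HttpClient();

    // Producer: push work items as they are discovered.
    var producer = Task.Run(async () =>
    {
        await foreach (var item in incoming)
            await channel.Writer.WriteAsync(item);
        channel.Writer.Complete();
    });

    // Consumers: drain the channel with a fixed degree of parallelism.
    var consumers = Enumerable.Range(0, 4).Select(_ => Task.Run(async () =>
    {
        await foreach (var (url, path) in channel.Reader.ReadAllAsync())
            await DownloadFileAsync(client, url, path); // reuses the method shown earlier
    })).ToList();

    await Task.WhenAll(consumers.Append(producer));
}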
Also look into this NuGet package. The GitHub wiki gives examples of how to use it. For downloading 2 million files this is a good library, and it also has a retry function. To download a file you only have to create an instance of LoadRequest; it downloads the file, under its own name, into the Downloads directory.
private static void StartDownload()
{
    string[] urls = new string[]{
        "https://dotnet.microsoft.com",
        "https://www.microsoft.com",
        "https://stackoverflow.com"};

    foreach (string url in urls)
        new LoadRequest(url).Start();
}
I hope this helps to improve the code.
