I need to download about 2 million files from the SEC website. Each file has a unique URL and averages about 10 kB. This is my current implementation:
List<string> urls = new List<string>();
// ... initialize urls ...
WebBrowser browser = new WebBrowser();
foreach (string url in urls)
{
browser.Navigate(url);
while (browser.ReadyState != WebBrowserReadyState.Complete) Application.DoEvents();
StreamReader sr = new StreamReader(browser.DocumentStream);
StreamWriter sw = new StreamWriter(url.Substring(url.LastIndexOf('/') + 1));
sw.Write(sr.ReadToEnd());
sr.Close();
sw.Close();
}
The projected time is about 12 days... is there a faster way?
Edit: BTW, the local file handling takes only 7% of the time.
Edit: this is my final implementation:
void Main()
{
ServicePointManager.DefaultConnectionLimit = 10000;
List<string> urls = new List<string>();
// ... initialize urls ...
int retries = urls.AsParallel().WithDegreeOfParallelism(8).Sum(arg => downloadFile(arg));
}
public int downloadFile(string url)
{
int retries = 0;
retry:
try
{
HttpWebRequest webrequest = (HttpWebRequest)WebRequest.Create(url);
webrequest.Timeout = 10000;
webrequest.ReadWriteTimeout = 10000;
webrequest.Proxy = null;
webrequest.KeepAlive = false;
using (HttpWebResponse webresponse = (HttpWebResponse)webrequest.GetResponse())
using (Stream sr = webresponse.GetResponseStream())
using (FileStream sw = File.Create(url.Substring(url.LastIndexOf('/') + 1)))
{
sr.CopyTo(sw);
}
}
catch (Exception ee)
{
if (ee.Message != "The remote server returned an error: (404) Not Found." && ee.Message != "The remote server returned an error: (403) Forbidden.")
{
if (ee.Message.StartsWith("The operation has timed out") ||
ee.Message == "Unable to connect to the remote server" ||
ee.Message.StartsWith("The request was aborted: ") ||
ee.Message.StartsWith("Unable to read data from the transport connection: ") ||
ee.Message == "The remote server returned an error: (408) Request Timeout.")
retries++;
else MessageBox.Show(ee.Message, "Error", MessageBoxButtons.OK, MessageBoxIcon.Error);
goto retry;
}
}
return retries;
}
Execute the downloads concurrently instead of sequentially, and set a sensible MaxDegreeOfParallelism; otherwise you will try to make too many simultaneous requests, which will look like a DoS attack:
public static void Main(string[] args)
{
var urls = new List<string>();
Parallel.ForEach(
urls,
new ParallelOptions{MaxDegreeOfParallelism = 10},
DownloadFile);
}
public static void DownloadFile(string url)
{
using (var response = WebRequest.Create(url).GetResponse())
using (var sr = new StreamReader(response.GetResponseStream()))
using (var sw = new StreamWriter(url.Substring(url.LastIndexOf('/') + 1)))
{
sw.Write(sr.ReadToEnd());
}
}
Download the files on several threads; the right number of threads depends on your throughput. Also, look at the WebClient and HttpWebRequest classes. Simple sample:
var list = new[]
{
"http://google.com",
"http://yahoo.com",
"http://stackoverflow.com"
};
Parallel.ForEach(list,
s =>
{
using (var client = new WebClient())
{
Console.WriteLine($"starting to download {s}");
string result = client.DownloadString(s);
Console.WriteLine($"finished downloading {s}");
}
});
I'd use several threads in parallel, with a WebClient. I recommend setting the max degree of parallelism to the number of threads you want, since an unspecified degree of parallelism doesn't work well for long-running tasks. I've used 50 parallel downloads in one of my projects without a problem, but depending on the speed of an individual download a much lower number might be sufficient.
If you download multiple files in parallel from the same server, you're by default limited to a small number (2 or 4) of parallel downloads. While the HTTP standard specifies such a low limit, many servers don't enforce it. Use ServicePointManager.DefaultConnectionLimit = 10000; to increase the limit.
I think the code from o17t H1H' S'k is right, but to perform I/O-bound work an async method should be used.
Like this:
public static async Task DownloadFileAsync(HttpClient httpClient, string url, string fileToWriteTo)
{
using HttpResponseMessage response = await httpClient.GetAsync(url, HttpCompletionOption.ResponseHeadersRead);
using Stream streamToReadFrom = await response.Content.ReadAsStreamAsync();
using Stream streamToWriteTo = File.Open(fileToWriteTo, FileMode.Create);
await streamToReadFrom.CopyToAsync(streamToWriteTo);
}
Parallel.ForEach is also available as Parallel.ForEachAsync. Parallel.ForEach has a number of features that the async version lacks, but most of them are rarely needed here. To handle 2 million files you can also implement a producer/consumer pipeline with Channel or BlockingCollection, but only if you don't know all the URLs at the start; a sketch follows the code below.
private static async Task StartDownload()
{
(string, string)[] urls = new ValueTuple<string, string>[]{
new ("https://dotnet.microsoft.com", "C:/YoureFile.html"),
new ( "https://www.microsoft.com", "C:/YoureFile1.html"),
new ( "https://stackoverflow.com", "C:/YoureFile2.html")};
var client = new HttpClient();
ParallelOptions options = new() { MaxDegreeOfParallelism = 2 };
await Parallel.ForEachAsync(urls, options, async (url, token) =>
{
await DownloadFileAsync(client, url.Item1, url.Item2);
});
}
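If the URLs are discovered while downloads are already running, a bounded Channel (from System.Threading.Channels) keeps memory in check. Here is a minimal sketch, reusing the DownloadFileAsync method from above; the sample URL/file pair, the capacity of 1000, and the 8 workers are placeholder choices:
private static async Task RunChannelPipelineAsync(HttpClient httpClient)
{
    // Bounded so the producer waits instead of buffering 2 million entries.
    var channel = Channel.CreateBounded<(string Url, string File)>(1000);

    // Producer: write URL/file pairs as they are discovered, then complete.
    var producer = Task.Run(async () =>
    {
        await channel.Writer.WriteAsync(("https://dotnet.microsoft.com", "C:/YoureFile.html"));
        // ... write the remaining pairs here ...
        channel.Writer.Complete();
    });

    // Consumers: a fixed number of workers drain the channel concurrently.
    var workers = Enumerable.Range(0, 8).Select(_ => Task.Run(async () =>
    {
        await foreach (var (url, file) in channel.Reader.ReadAllAsync())
            await DownloadFileAsync(httpClient, url, file);
    })).ToArray();

    await producer;
    await Task.WhenAll(workers);
}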
Also look into this NuGet package. The GitHub wiki gives examples of how to use it. For downloading 2 million files it is a good library, and it also has a retry function. To download a file you only have to create an instance of LoadRequest, and it downloads the file under its own name into the Downloads directory.
private static void StartDownload()
{
string[] urls = new string[]{
"https://dotnet.microsoft.com",
"https://www.microsoft.com",
"https://stackoverflow.com"};
foreach (string url in urls)
new LoadRequest(url).Start();
}
I hope this helps to improve the code.
Related
I need to understand how SemaphoreSlim works with ThreadPool.
I need to read a file from the local disk of a computer, read each line, and then send that line as an HttpWebRequest to 2 different servers, getting 2 responses back correspondingly.
So let's say the file has 100 requests (this number can be in the thousands or even more in a real scenario); when all of these requests have been sent I should get back 200 responses (as mentioned, each request goes to 2 different servers and fetches 2 responses from them). Here is my code:
static void Main(string[] args)
{
try
{
SendEntriesInFile(someFileOnTheLocaldisk);
Console.WriteLine();
}
catch (Exception e)
{
Console.WriteLine("Regression Tool Error: Major Unspecified Error:\n" + e);
}
}
public class MyParameters
{
// constructor and properties elided
}
private static readonly SemaphoreSlim threadSemaphore = new SemaphoreSlim(5, 10);
private void SendEntriesInFile(FileInfo file)
{
using (StreamReader reader = file.OpenText())
{
string entry = reader.ReadLine();
while (!String.IsNullOrEmpty(entry))
{
MyParameters myParams = new MyParameters(entry, totalNumberOfEntries, serverAddresses, requestType, fileName);
threadSemaphore.Wait();
ThreadPool.QueueUserWorkItem(new WaitCallback(Send), myParams);
entry = reader.ReadLine();
}
}
}
private void Send(object state)
{
MyParameters myParams = (MyParameters)state;
for (int i = 0; i < myParams.ServerAddresses.Count; i++)
{
byte[] bytesArray = Encoding.UTF8.GetBytes(myParams.Request);
HttpWebRequest webRequest = null;
if (myParams.TypeOfRequest == "tlc")
{
webRequest = (HttpWebRequest)WebRequest.Create(string.Format("http://{0}:{1}{2}", myParams.ServerAddresses[i], port, "/SomeMethod1"));
}
else
{
webRequest = (HttpWebRequest)WebRequest.Create(string.Format("http://{0}:{1}{2}", myParams.ServerAddresses[i], port, "/SomeMethod2"));
}
if (webRequest != null)
{
webRequest.Method = "POST";
webRequest.ContentType = "application/x-www-form-urlencoded";
webRequest.ContentLength = bytesArray.Length;
webRequest.Timeout = responseTimeout;
webRequest.ReadWriteTimeout = transmissionTimeout;
webRequest.ServicePoint.ConnectionLimit = maxConnections;
webRequest.ServicePoint.ConnectionLeaseTimeout = connectionLifeDuration;
webRequest.ServicePoint.MaxIdleTime = maxConnectionIdleTimeout;
webRequest.ServicePoint.UseNagleAlgorithm = nagleAlgorithm;
webRequest.ServicePoint.Expect100Continue = oneHundredContinue;
using (Stream requestStream = webRequest.GetRequestStream())
{
//Write the request through the request stream
requestStream.Write(bytesArray, 0, bytesArray.Length);
requestStream.Flush();
}
string response = "";
using (HttpWebResponse httpWebResponse = (HttpWebResponse)webRequest.GetResponse())
{
if (httpWebResponse != null)
{
using (Stream responseStream = httpWebResponse.GetResponseStream())
{
using (StreamReader stmReader = new StreamReader(responseStream))
{
response = stmReader.ReadToEnd();
string fileName = "";
if (i == 0)
{
fileName = ...; // name is generated through some logic here
}
else
{
fileName = ...; // name is generated through some logic here
}
using (StreamWriter writer = new StreamWriter(fileName))
{
writer.WriteLine(response);
}
}
}
}
}
}
}
Console.WriteLine(" Release semaphore: ---- " + threadSemaphore.Release());
}
The only thing I'm confused about is that, with something like the above, my semaphore allows 5 threads to execute the Send() method concurrently while 5 other threads wait in a queue for their turn. Since my file contains 100 requests, I should get back 200 responses, yet every time I end up with only 107, 108, or 109 responses. Why don't I get all 200 responses back? With different code (not discussed here), where I send, say, 10 requests in parallel on 10 threads created on demand (not using the ThreadPool), I do get all 200 responses back.
Lots of very helpful comments. To add to them, I would recommend using just pure async I/O for this.
I would recommend applying the semaphore as a DelegatingHandler.
You want to register this class.
public class LimitedConcurrentHttpHandler
: DelegatingHandler
{
private readonly SemaphoreSlim _concurrencyLimit = new(8);
public LimitedConcurrentHttpHandler(){ }
public LimitedConcurrentHttpHandler(HttpMessageHandler inner) : base(inner) {}
protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
{
await _concurrencyLimit.WaitAsync(cancellationToken);
try
{
return await base.SendAsync(request, cancellationToken);
}
finally
{
_concurrencyLimit.Release();
}
}
}
This way you can concentrate on the actual business logic in your code.
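For example, without dependency injection the handler can simply wrap HttpClientHandler. A minimal sketch, where urls is a placeholder collection:
var client = new HttpClient(new LimitedConcurrentHttpHandler(new HttpClientHandler()));

// However many tasks are started, at most 8 requests are in flight at once.
var downloads = urls.Select(u => client.GetByteArrayAsync(u));
await Task.WhenAll(downloads);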
My HttpClient uses digest authentication to connect to the server and expects search queries in response. These search queries can come in at any time, so the client is expected to leave the connection open at all times.
The connection is made using the following code:
public static async void ListenForSearchQueries(int resourceId)
{
var url = $"xxx/yyy/{resourceId}/waitForSearchRequest?token=abc";
var httpHandler = new HttpClientHandler { PreAuthenticate = true };
using (var digestAuthMessageHandler = new DigestAuthMessageHandler(httpHandler, "user", "password"))
using (var client = new HttpClient(digestAuthMessageHandler))
{
client.Timeout = TimeSpan.FromMilliseconds(Timeout.Infinite);
var request = new HttpRequestMessage(HttpMethod.Get, url);
var tokenSource = new CancellationTokenSource();
tokenSource.CancelAfter(TimeSpan.FromMilliseconds(Timeout.Infinite));
using (var response = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead, tokenSource.Token))
{
Console.WriteLine("\nResponse code: " + response.StatusCode);
using (var body = await response.Content.ReadAsStreamAsync())
using (var reader = new StreamReader(body))
while (!reader.EndOfStream)
Console.WriteLine(reader.ReadLine());
}
}
}
This is how the method is being used in the Main method of a console application.
private static void Main(string[] args)
{
const int serviceId = 128;
.
.
.
ListenForSearchQueries(resourceId);
Console.ReadKey();
}
This is what the output on the console window looks like:
Response code: OK
--searchRequestBoundary
Even though the timeout for the client is set to infinity, the connection times out roughly five minutes after the first output (which is not the default timeout of HttpClient), throwing the following exception.
System.IO.IOException occurred
HResult=0x80131620
Message=The read operation failed, see inner exception.
Source=System.Net.Http
StackTrace:
at System.Net.Http.HttpClientHandler.WebExceptionWrapperStream.Read(Byte[] buffer, Int32 offset, Int32 count)
at System.Net.Http.DelegatingStream.Read(Byte[] buffer, Int32 offset, Int32 count)
at System.IO.StreamReader.ReadBuffer()
at System.IO.StreamReader.get_EndOfStream()
at ConsoleTester.Program.<ListenSearchQueriesDigestAuthMessageHandler>d__10.MoveNext() in C:\Users\xyz\ProjName\ConsoleTester\Program.cs:line 270
Inner Exception 1:
WebException: The operation has timed out.
The DelegatingHandler used for the authentication is a rough adaptation of this code (see the source section).
Why is the client timing out and how can I prevent this?
My ultimate goal is to make a call and wait indefinitely for a response. When a response does come, I don't want the connection to close because more responses might come in the future. Unfortunately, I can't change anything at the server end.
Although the default value of Stream.CanTimeout is false, the stream returned by response.Content.ReadAsStreamAsync() reports CanTimeout as true.
The default read and write timeout for this stream is 5 minutes; that is, after five minutes of inactivity the stream throws an exception much like the one shown in the question.
To change this behavior, adjust the ReadTimeout and/or WriteTimeout properties of the stream.
Below is the modified version of the ListenForSearchQueries method that changes ReadTimeout to Timeout.Infinite.
public static async void ListenForSearchQueries(int resourceId)
{
var url = $"xxx/yyy/{resourceId}/waitForSearchRequest?token=abc";
var httpHandler = new HttpClientHandler { PreAuthenticate = true };
using (var digestAuthMessageHandler = new DigestAuthMessageHandler(httpHandler, "user", "password"))
using (var client = new HttpClient(digestAuthMessageHandler))
{
client.Timeout = TimeSpan.FromMilliseconds(Timeout.Infinite);
var request = new HttpRequestMessage(HttpMethod.Get, url);
var tokenSource = new CancellationTokenSource();
tokenSource.CancelAfter(TimeSpan.FromMilliseconds(Timeout.Infinite));
using (var response = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead, tokenSource.Token))
{
Console.WriteLine("\nResponse code: " + response.StatusCode);
using (var body = await response.Content.ReadAsStreamAsync())
{
body.ReadTimeout = Timeout.Infinite;
using (var reader = new StreamReader(body))
while (!reader.EndOfStream)
Console.WriteLine(reader.ReadLine());
}
}
}
}
This fixed the exception, which was actually being thrown by the stream but looked as if it was being thrown by the HttpClient.
Make the method return a Task
public static async Task ListenForSearchQueries(int resourceId) {
//...code removed for brevity
}
Update the console's Main method to wait for the task to complete.
public static void Main(string[] args) {
const int serviceId = 128;
.
.
.
ListenForSearchQueries(resourceId).Wait();
Console.ReadKey();
}
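Alternatively, on C# 7.1 or later the entry point itself can be async, which avoids the blocking Wait() (a minimal sketch, reusing the fixed signature above):
public static async Task Main(string[] args)
{
    const int resourceId = 128;
    await ListenForSearchQueries(resourceId);
    Console.ReadKey();
}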
I solved this problem in the following way:
var stream = await response.Content.ReadAsStreamAsync();
int ctext = 0; // counter used below; how it advances is elided in the original
string text;
while (true) // the original looped on an always-true flag (b == 1)
{
var bytes = new byte[1];
try
{
var bytesread = await stream.ReadAsync(bytes, 0, 1);
if (bytesread > 0)
{
text = Encoding.UTF8.GetString(bytes);
Console.WriteLine(text);
using (System.IO.StreamWriter escritor = new System.IO.StreamWriter(@"C:\orden\ConSegu.txt", true))
{
if (ctext == 100)
{
escritor.WriteLine(text);
ctext = 0;
}
escritor.Write(text);
}
}
}
catch (Exception ex)
{
Console.WriteLine("error");
Console.WriteLine(ex.Message);
}
}
This way I read the answer byte by byte and save it to a .txt file.
Later I read the .txt and erase it again. For the moment this is the solution I found for receiving the notifications the server sends me over the persistent HTTP connection.
I want to download a .torrent file for a Linux distro, but for some reason the final file downloaded by my app is different from the one downloaded manually. The one my app downloads is 31 KB and is an invalid .torrent file, while the right one (when I download it manually) is 41 KB and is valid.
The URL of the file I want to download is http://torcache.net/torrent/C348CBCA08288AE07A97DD641C5D09EE25299FAC.torrent
Why is this happening, and how can I download the same file (the valid one, at 41 KB)?
Thanks.
C# code from the method that downloads the file above:
string sLinkTorCache = @"http://torcache.net/torrent/C348CBCA08288AE07A97DD641C5D09EE25299FAC.torrent";
using (System.Net.WebClient wc = new System.Net.WebClient())
{
var path = @"D:\Baixar automaticamente"; // HACK: get this from settings in the final version
var data = Helper.Retry(() => wc.DownloadData(sLinkTorCache), TimeSpan.FromSeconds(3), 5);
string fileName = null;
// Try to extract the filename from the Content-Disposition header
if (!string.IsNullOrEmpty(wc.ResponseHeaders["Content-Disposition"]))
{
fileName = wc.ResponseHeaders["Content-Disposition"].Substring(wc.ResponseHeaders["Content-Disposition"].IndexOf("filename=") + 10).Replace("\"", "");
}
var torrentPath = Path.Combine(path, fileName ?? "Arch Linux Distro");
if (File.Exists(torrentPath))
{
File.Delete(torrentPath);
}
Helper.Retry(() => wc.DownloadFile(new Uri(sLinkTorCache), torrentPath), TimeSpan.FromSeconds(3), 5);
}
Helper.Retry (tries to execute the method again in case of HTTP exceptions):
public static void Retry(Action action, TimeSpan retryInterval, int retryCount = 3)
{
Retry<object>(() =>
{
action();
return null;
}, retryInterval, retryCount);
}
public static T Retry<T>(Func<T> action, TimeSpan retryInterval, int retryCount = 3)
{
var exceptions = new List<Exception>();
for (int retry = 0; retry < retryCount; retry++)
{
try
{
if (retry > 0)
System.Threading.Thread.Sleep(retryInterval); // TODO: add the using for the thread
return action();
}
catch (Exception ex)
{
exceptions.Add(ex);
}
}
throw new AggregateException(exceptions);
}
I initially thought the site was responding with junk if it thought the request came from a bot (that is, checking some of the headers). After having a look with Fiddler, it turns out the data returned is exactly the same for both a web browser and the code. Which means we're not properly decompressing (extracting) the response. It's very common for web servers to compress data (using something like gzip), and WebClient does not automatically decompress it.
Using the answer from Automatically decompress gzip response via WebClient.DownloadData, I managed to get it to work properly.
Also note that you're downloading the file twice. You don't need to do that.
Working code:
//Taken from above linked question
class MyWebClient : WebClient
{
protected override WebRequest GetWebRequest(Uri address)
{
HttpWebRequest request = base.GetWebRequest(address) as HttpWebRequest;
request.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
return request;
}
}
And using it:
string sLinkTorCache = @"http://torcache.net/torrent/C348CBCA08288AE07A97DD641C5D09EE25299FAC.torrent";
using (var wc = new MyWebClient())
{
var path = @"C:\Junk";
var data = Helper.Retry(() => wc.DownloadData(sLinkTorCache), TimeSpan.FromSeconds(3), 5);
string fileName = null; // left null so the default name below is used
var torrentPath = Path.Combine(path, fileName ?? "Arch Linux Distro.torrent");
if (File.Exists(torrentPath))
File.Delete(torrentPath);
File.WriteAllBytes(torrentPath, data);
}
I have a simple console application which sends HTTP POST from multiple threads:
List<Task> tasks = new List<Task>();
for (int i = 0; i < 100; i++)
{
tasks.Add(Task.Factory.StartNew(() => SendQuery(url1, query1)));
tasks.Add(Task.Factory.StartNew(() => SendQuery(url2, query2)));
}
Task.WaitAll(tasks.ToArray());
SendQuery(string uri, string requestString) looks like this:
Uri url = new Uri(uri);
try
{
using (HttpClient client = new HttpClient { Timeout = new TimeSpan(0, 0, 10, 0) })
{
StringContent content = new StringContent(requestString);
content.Headers.ContentType = new MediaTypeHeaderValue("application/json");
HttpResponseMessage response = client.PostAsync(url, content).Result;
response.EnsureSuccessStatusCode();
}
}
catch (Exception ex)
{
Console.WriteLine(ex);
}
The program works without any errors and all the queries are eventually processed, but after the tasks list is filled each thread hangs on client.PostAsync(url, content).Result, and only after several minutes does IIS start to process the queries. Why does this delay occur? What's happening during this time? I am using IIS 7.5 on Windows Server 2008 R2 to host the web services that provide url1 and url2.
Set this value to 500 or 1000 at the start of your program and let us know the effects.
Your requests may be getting throttled at the default value of 2 (depending on the .NET version).
ServicePointManager.DefaultConnectionLimit = 500;
I have an application that requires many requests to a third party REST service. I thought that modifying this part of the application to make the requests asynchronously would potentially speed things up, so I wrote a POC console application to test things out.
To my surprise the async code takes almost twice as long to complete as the synchronous version. Am I just doing it wrong?
async static void LoadUrlsAsync()
{
var startTime = DateTime.Now;
Console.WriteLine("LoadUrlsAsync Start - {0}", startTime);
var numberOfRequest = 3;
var tasks = new List<Task<string>>();
for (int i = 0; i < numberOfRequest; i++)
{
var request = WebRequest.Create(@"http://www.google.com/images/srpr/logo11w.png") as HttpWebRequest;
request.Method = "GET";
var task = LoadUrlAsync(request);
tasks.Add(task);
}
var results = await Task.WhenAll(tasks);
var stopTime = DateTime.Now;
var duration = (stopTime - startTime);
Console.WriteLine("LoadUrlsAsync Complete - {0}", stopTime);
Console.WriteLine("LoadUrlsAsync Duration - {0}ms", duration);
}
async static Task<string> LoadUrlAsync(WebRequest request)
{
string value = string.Empty;
using (var response = await request.GetResponseAsync())
using (var responseStream = response.GetResponseStream())
using (var reader = new StreamReader(responseStream))
{
value = reader.ReadToEnd();
Console.WriteLine("{0} - Bytes: {1}", request.RequestUri, value.Length);
}
return value;
}
NOTE:
I have also tried setting maxconnection=100 in the app.config, in an attempt to eliminate throttling from the System.Net connection pool. This setting doesn't seem to make an impact on performance.
<system.net>
<connectionManagement>
<add address="*" maxconnection="100" />
</connectionManagement>
</system.net>
First, try to avoid microbenchmarking. When the differences in your code timings are swamped by network conditions, your results lose meaning.
That said, you should set ServicePointManager.DefaultConnectionLimit to int.MaxValue. Also, use end-to-end async methods (i.e., StreamReader.ReadToEndAsync) - or even better, use HttpClient, which was designed for async HTTP.
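A minimal sketch of the download method rewritten around a shared HttpClient (the names are placeholders, not the poster's code):
static readonly HttpClient client = new HttpClient();

static async Task<string> LoadUrlAsync(string url)
{
    // Fully asynchronous end to end: no blocking reads on the response body.
    using (var response = await client.GetAsync(url))
    {
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}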
The async version becomes faster as you increase the number of requests. I'm not certain, but my guess is that with only a few requests the fixed cost of setting up the work dominates; once you pass that threshold the async version becomes superior. Try 50 or even 500 requests and you should see that async is faster. That's how it worked out for me:
500 Async Requests: 11.133 seconds
500 Sync Requests: 18.136 seconds
If you only have ~3 calls then I suggest avoiding async. Here's what I used to test:
public class SeperateClass
{
static int numberOfRequest = 500;
public async static void LoadUrlsAsync()
{
var startTime = DateTime.Now;
Console.WriteLine("LoadUrlsAsync Start - {0}", startTime);
var tasks = new List<Task<string>>();
for (int i = 0; i < numberOfRequest; i++)
{
var request = WebRequest.Create(@"http://www.google.com/images/srpr/logo11w.png") as HttpWebRequest;
request.Method = "GET";
var task = LoadUrlAsync(request);
tasks.Add(task);
}
var results = await Task.WhenAll(tasks);
var stopTime = DateTime.Now;
var duration = (stopTime - startTime);
Console.WriteLine("LoadUrlsAsync Complete - {0}", stopTime);
Console.WriteLine("LoadUrlsAsync Duration - {0}ms", duration);
}
async static Task<string> LoadUrlAsync(WebRequest request)
{
string value = string.Empty;
using (var response = await request.GetResponseAsync())
using (var responseStream = response.GetResponseStream())
using (var reader = new StreamReader(responseStream))
{
value = reader.ReadToEnd();
Console.WriteLine("{0} - Bytes: {1}", request.RequestUri, value.Length);
}
return value;
}
}
public class SeperateClassSync
{
static int numberOfRequest = 500;
public async static void LoadUrlsSync()
{
var startTime = DateTime.Now;
Console.WriteLine("LoadUrlsSync Start - {0}", startTime);
var tasks = new List<Task<string>>();
for (int i = 0; i < numberOfRequest; i++)
{
var request = WebRequest.Create(@"http://www.google.com/images/srpr/logo11w.png") as HttpWebRequest;
request.Method = "GET";
var task = LoadUrlSync(request);
tasks.Add(task);
}
var results = await Task.WhenAll(tasks);
var stopTime = DateTime.Now;
var duration = (stopTime - startTime);
Console.WriteLine("LoadUrlsSync Complete - {0}", stopTime);
Console.WriteLine("LoadUrlsSync Duration - {0}ms", duration);
}
async static Task<string> LoadUrlSync(WebRequest request)
{
string value = string.Empty;
using (var response = request.GetResponse())//Still async FW, just changed to Sync call here
using (var responseStream = response.GetResponseStream())
using (var reader = new StreamReader(responseStream))
{
value = reader.ReadToEnd();
Console.WriteLine("{0} - Bytes: {1}", request.RequestUri, value.Length);
}
return value;
}
}
class Program
{
static void Main(string[] args)
{
SeperateClass.LoadUrlsAsync();
Console.ReadLine();//record result and run again
SeperateClassSync.LoadUrlsSync();
Console.ReadLine();
}
}
In my tests it's faster to use the WebRequest.GetResponseAsync() method even for 3 parallel requests.
It should be more noticeable with large requests, with many requests (3 is not many), and with requests to different websites.
What are the exact results you are getting? In your question you convert a TimeSpan to a string and call it milliseconds, but you aren't actually calculating milliseconds: it displays the standard TimeSpan.ToString, which shows fractions of a second.
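For actual milliseconds, print TimeSpan.TotalMilliseconds instead:
Console.WriteLine("LoadUrlsAsync Duration - {0}ms", duration.TotalMilliseconds);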
It appears that the problem was more of an environmental issue than anything else. Once I moved the code to another machine on a different network, the results were much more in line with my expectations.
The original async code does in fact execute more quickly than the synchronous version. This helps me ensure that I am not introducing additional complexity to our application without the expected performance gains.