WebClient is very slow - c#

I have problem with Webclient.
It is very slow. It takes about 3-5 seconds to downloadString from one website.
I don't have any network problems.
This is my Modifed WebClient.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Net;
namespace StatusChecker
{
class WebClientEx: WebClient
{
public CookieContainer CookieContainer { get; private set; }
public WebClientEx()
{
CookieContainer = new CookieContainer();
ServicePointManager.Expect100Continue = false;
Encoding = System.Text.Encoding.UTF8;
WebRequest.DefaultWebProxy = null;
Proxy = null;
}
public void ClearCookies()
{
CookieContainer = new CookieContainer();
}
protected override WebRequest GetWebRequest(Uri address)
{
var request = base.GetWebRequest(address);
if (request is HttpWebRequest)
{
(request as HttpWebRequest).CookieContainer = CookieContainer;
}
return request;
}
}
}
UPDATE:
In wireshark I saw that single DownladString is sending and receiving few thousands packets.

There may be two issues at hand here (that I've also noticed in my own programs previously):
The first request takes an abnormally long time: This occurs because WebRequest by default detects and loads proxy settings the first time it starts, which can take quite a while. To stop this, simply set the proxy property (WebRequest.Proxy) to null and it'll bypass the check (provided you can directly access the internet)
You can't download more than 2 items at once: By default, you can only have 2 simultaneous HTTP connections open. To change this, set ServicePointManager.DefaultConnectionLimit to something larger. I usually set this to int.MaxValue (just make sure you don't spam the host with 1,000,000 connections).

There are a few options if it is related to the initial proxy settings being checked:
Disable the automatic proxy detection settings in Internet Explorer
Set the proxy to null:
WebClient.Proxy = null
On application startup set the default webproxy to null:
WebRequest.DefaultWebProxy = null;
In older .NET code instead of setting to null, you used to write this (but null is now preferred):
webclient.Proxy = GlobalProxySelection.GetEmptyWebProxy();

Maybe it will help somebody. Some web services support compression (gzip or other). So you can add Accept-Encoding header for your requests and then enable automatic decompression for web client instance. Chrome works in that way.

Related

Can't I set If-Modified-Since on a WebClient?

I am using WebClient to retrieve a website. I decided to set If-Modified-Since because if the website hasn't changed, I don't want to get it again:
var c = new WebClient();
c.Headers[HttpRequestHeader.IfModifiedSince] = Last_refreshed.ToUniversalTime().ToString("r");
Where Last_refreshed is a variable in which I store the time I've last seen the website.
But when I run this, I get a WebException with the text:
The 'If-Modified-Since' header must be modified using the appropriate property or method.
Parameter name: name
Turns out the API docs mention this:
In addition, some other headers are also restricted when using a WebClient object. These restricted headers include, but are not limited to the following:
Accept
Connection
Content-Length
Expect (when the value is set to "100-continue")
If-Modified-Since
Range
Transfer-Encoding
The HttpWebRequest class has properties for setting some of the above headers. If it is important for an application to set these headers, then the HttpWebRequest class should be used instead of the WebRequest class.
So does this mean there's no way to set them from WebClient? Why not? What's wrong with specifying If-Modified-Since in a normal HTTP GET?
I know I can just use HttpWebRequest, but I don't want to because it's too much work (have to do a bunch of casting, can't just get the content as a string).
Also, I know Cannot set some HTTP headers when using System.Net.WebRequest is related, but it doesn't actually answer my question.
As unwieldy as it may be, I have opted to subclass WebClient in order to add the functionality in a way that mimics the way WebClient typically works (in which headers are consumed by / reset after each use):
public class ApiWebClient : WebClient {
public DateTime? IfModifiedSince { get; set; }
protected override WebRequest GetWebRequest(Uri address) {
var webRequest = base.GetWebRequest(address);
var httpWebRequest = webRequest as HttpWebRequest;
if (httpWebRequest != null) {
if (IfModifiedSince != null) {
httpWebRequest.IfModifiedSince = IfModifiedSince.Value;
IfModifiedSince = null;
}
// Handle other headers or properties here
}
return webRequest;
}
}
This has the advantage of not having to write boilerplate for the standard operations that WebClient provides, while still offering some of the flexibility of using WebRequest.

Http Requests Timing Out in Production but not Dev

Below is an example of a website that when requested from my local dev environment returns ok but when requested from the production server, the request times out after 15 seconds. The request headers are exactly the same. Any ideas?
http://www.dealsdirect.com.au/p/wall-mounted-fish-tank-30cm/
Here's one thing that I wanna point beside what other stuff I've already provided. When you call GetResponse the object that's returned has to be disposed of ASAP. Otherwise stalling will occur, or rather the next call will block and possibly time out because there's a limit to the number of requests that can go through the HTTP request engine concurrently in System.Net.
// The request doesn't matter, it's not holding on to any critical resources
var request = (HttpWebRequest)WebRequest.Create(url);
// The response however is, these will eventually be reclaimed by the GC
// but you'll run into problems similar to deadlocks if you don't dispose them yourself
// when you have many of them
using (var response = request.GetResponse())
{
// Do stuff with `response` here
}
This is my old answer
This question is really hard to answer without knowing more about the specifics. There's no reason why the IIS would behave like this which leads me to conclude that the problem has to do with something you app is doing but I know nothing about it. If you can reproduce the problem with a debugger attached you might be able to track down where the problem is occuring but if you cannot do this first then there's little I can do to help.
Are you using the ASP.NET Development Server or IIS Express in development?
If this is an issue with proxies here's a factory method I use to setup HTTP requests that require where the proxy requires some authentication (though, I don't believe I ever received a time out):
HttpWebRequest CreateRequest(Uri url)
{
var request = (HttpWebRequest)WebRequest.Create(url);
request.Timeout = 120 * 1000;
request.CookieContainer = _cookieContainer;
if (UseSystemWebProxy)
{
var proxy = WebRequest.GetSystemWebProxy();
if (UseDefaultCredentials)
{
proxy.Credentials = CredentialCache.DefaultCredentials;
}
if (UseNetworkCredentials != null
&& UseNetworkCredentials.Length > 0)
{
var networkCredential = new NetworkCredential();
networkCredential.UserName = UseNetworkCredentials[0];
if (UseNetworkCredentials.Length > 1)
{
networkCredential.Password = UseNetworkCredentials[1];
}
if (UseNetworkCredentials.Length > 2)
{
networkCredential.Domain = UseNetworkCredentials[2];
}
proxy.Credentials = networkCredential;
}
request.Proxy = proxy;
}
return request;
}
Try this out Adrian and let me know how it goes.

What is the best way to download files via HTTP using .NET?

In one of my application I'm using the WebClient class to download files from a web server. Depending on the web server sometimes the application download millions of documents. It seems to be when there are lot of documents, performance vise the WebClient doesn't scale up well.
Also it seems to be the WebClient doesn't immediately close the connection it opened for the WebServer even after it successfully download the particular document.
I would like to know what other alternatives I have.
Update:
Also I noticed that for each and every download WebClient performs the authentication hand shake. I was expecting to see this hand shake once since my application only communicate with a single web server. Shouldn't the subsequent calls of the WebClient reuse the authentication session?
Update: My application also calls some web service methods and for these web service calls it seems to authentication session is reused. Also I'm using WCF to communicate with the web service.
I think you can still use "WebClient". However, you are better off using the "using" block as a good practice. This will make sure that the object is closed and is disposed off:-
using(WebClient client = new WebClient()) {
// Use client
}
I bet you are running into the default limit of 2 connections per server. Try running this code at the beginning of your program:
var cme = new System.Net.Configuration.ConnectionManagementElement();
cme.MaxConnection = 100;
System.Net.ServicePointManager.DefaultConnectionLimit = 100;
I have noticed the same behavior with the session in another project I was working on. To solve this "problem" I did use a static CookieContainer (since the session of the client is recognized by a value saved in a cookie).
public static class SomeStatics
{
private static CookieContainer _cookieContainer;
public static CookieContainer CookieContainer
{
get
{
if (_cookieContainer == null)
{
_cookieContainer = new CookieContainer();
}
return _cookieContainer;
}
}
}
public class CookieAwareWebClient : WebClient
{
protected override WebRequest GetWebRequest(Uri address)
{
WebRequest request = base.GetWebRequest(address);
if (request is HttpWebRequest)
{
(request as HttpWebRequest).CookieContainer = SomeStatics.CookieContainer;
(request as HttpWebRequest).KeepAlive = false;
}
return request;
}
}
//now the code that will download the file
using(WebClient client = new CookieAwareWebClient())
{
client.DownloadFile("http://address.com/somefile.pdf", #"c:\\temp\savedfile.pdf");
}
The code is just an example and inspired on Using CookieContainer with WebClient class and C# get rid of Connection header in WebClient.
The above code will close your connection immediately after the file is download and it will reuse the authentication.
WebClient is probably the best option. It doesn't close the connection straight away for a reason: so it can use the same connection again, without having to open a new one. If you find that it's not reusing the connection as expected, that's usually because you're not Close()ing the response from the previous request:
var request = WebRequest.Create("...");
// populate parameters
var response = request.GetResponse();
// process response
response.Close(); // <-- make sure you don't forget this!

How to perform a fast web request in C#

I have a HTTP based API which I potentially need to call many times. The problem is that I can't get the request to take less than about 20 seconds, though the same request made through a browser is near instantaneous. The following code illustrates how I have implemented it so far.
WebRequest r = HttpWebRequest.Create("https://example.com/http/command?param=blabla");
var response = r.GetResponse();
One solution would be to make an asynchronous request but I would like to know why it takes so long and if I can avoid it. I have also tried using the WebClient class but I suspect it uses a WebRequest internally.
Update:
Running the following code took about 40 seconds in Release Mode (measured with Stopwatch):
WebRequest g = HttpWebRequest.Create("http://www.google.com");
var response = g.GetResponse();
I'm working at a university where there might be different things in the network configuration affecting the performance, but the direct use of the browser illustrates that it should be near instant.
Update 2:
I uploaded the code to a remote machine and it worked fine so the conclusion must be that the .NET code does something extra compared to the browser or it has problems resolving the address through the university network (proxy issues or something?!).
This problem is similar to another post on StackOverflow:
Stackoverflow-2519655(HttpWebrequest is extremely slow)
Most of the time the problem is the Proxy server property. You should set this property to null, otherwise the object will attempt to search for an appropriate proxy server to use before going directly to the source. Note: this property is turn on by default, so you have to explicitly tell the object not to perform this proxy search.
request.Proxy = null;
using (var response = (HttpWebResponse)request.GetResponse())
{
}
I was having the 30 second delay on 'first' attempt - JamesR's reference to the other post mentioning setting proxy to null solved it instantly!
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(_site.url);
request.Proxy = null; // <-- this is the good stuff
...
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
Does your site have an invalid SSL cert? Try adding this
ServicePointManager.ServerCertificateValidationCallback = new System.Net.Security.RemoteCertificateValidationCallback(AlwaysAccept);
//... somewhere AlwaysAccept is defined as:
using System.Security.Cryptography.X509Certificates;
using System.Net.Security;
public bool AlwaysAccept(object sender, X509Certificate certification, X509Chain chain, SslPolicyErrors sslPolicyErrors)
{
return true;
}
You don't close your Request. As soon as you hit the number of allowed connections, you have to wait for the earlier ones to time out. Try
using (var response = g.GetResponse())
{
// do stuff with your response
}

C# get rid of Connection header in WebClient

I'm using the C# using the WebClient().
I was testing out what headers are sent, and I noticed that the following header is automatically added.
Connection : Keep-Alive
Is there any way to remove this?
I had ran into the same issue this morning. Following on Jon Skeet's hint, it can be achieved by passing HttpWebRequest to WebClient by inheriting it:
class MyWebClient : WebClient
{
protected override WebRequest GetWebRequest(Uri address)
{
WebRequest request = base.GetWebRequest(address);
if (request is HttpWebRequest)
{
(request as HttpWebRequest).KeepAlive = false;
}
return request;
}
}
Now sent headers will include Connection : close
Use HttpWebRequest instead of WebClient (it's slightly less convenient, but not by very much) and set the KeepAlive property to false.
I haven't tested this - it's possible that it'll just change the value of the Connection header instead of removing it - but it's worth a try. The docs for the Connection property at least suggest that it only adds Keep-Alive.

Categories