I am using WebClient to retrieve a website. I decided to set If-Modified-Since because if the website hasn't changed, I don't want to get it again:
var c = new WebClient();
c.Headers[HttpRequestHeader.IfModifiedSince] = Last_refreshed.ToUniversalTime().ToString("r");
Where Last_refreshed is a variable in which I store the time I've last seen the website.
But when I run this, I get an exception with the text:
The 'If-Modified-Since' header must be modified using the appropriate property or method.
Parameter name: name
Turns out the API docs mention this:
In addition, some other headers are also restricted when using a WebClient object. These restricted headers include, but are not limited to the following:
Accept
Connection
Content-Length
Expect (when the value is set to "100-continue")
If-Modified-Since
Range
Transfer-Encoding
The HttpWebRequest class has properties for setting some of the above headers. If it is important for an application to set these headers, then the HttpWebRequest class should be used instead of the WebRequest class.
So does this mean there's no way to set them from WebClient? Why not? What's wrong with specifying If-Modified-Since in a normal HTTP GET?
I know I can just use HttpWebRequest, but I don't want to because it's too much work (have to do a bunch of casting, can't just get the content as a string).
Also, I know "Cannot set some HTTP headers when using System.Net.WebRequest" is related, but it doesn't actually answer my question.
As unwieldy as it may be, I have opted to subclass WebClient in order to add the functionality in a way that mimics the way WebClient typically works (in which headers are consumed by / reset after each use):
public class ApiWebClient : WebClient {
    public DateTime? IfModifiedSince { get; set; }

    protected override WebRequest GetWebRequest(Uri address) {
        var webRequest = base.GetWebRequest(address);
        var httpWebRequest = webRequest as HttpWebRequest;
        if (httpWebRequest != null) {
            if (IfModifiedSince != null) {
                httpWebRequest.IfModifiedSince = IfModifiedSince.Value;
                IfModifiedSince = null;
            }
            // Handle other headers or properties here
        }
        return webRequest;
    }
}
This has the advantage of not having to write boilerplate for the standard operations that WebClient provides, while still offering some of the flexibility of using WebRequest.
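One thing to keep in mind when sending If-Modified-Since: a server that honors it replies with 304 Not Modified, and WebClient reports that as a WebException rather than as an empty result. Below is a minimal sketch of how a caller might deal with that. The subclass is repeated so the block compiles on its own, and ConditionalWebClient and TryDownloadString are hypothetical names, not part of WebClient.

```csharp
using System;
using System.Net;

// Same idea as the subclass above, repeated so this sketch stands alone.
public class ConditionalWebClient : WebClient
{
    // Cleared again once it has been applied to a request.
    public DateTime? IfModifiedSince { get; set; }

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        var http = request as HttpWebRequest;
        if (http != null && IfModifiedSince.HasValue)
        {
            http.IfModifiedSince = IfModifiedSince.Value;
            IfModifiedSince = null;
        }
        return request;
    }

    // Hypothetical helper: returns false when the server answers 304 Not Modified.
    public bool TryDownloadString(string url, DateTime lastSeen, out string body)
    {
        IfModifiedSince = lastSeen;
        try
        {
            body = DownloadString(url);
            return true;
        }
        catch (WebException ex)
        {
            var response = ex.Response as HttpWebResponse;
            if (response != null && response.StatusCode == HttpStatusCode.NotModified)
            {
                body = null; // nothing new since lastSeen
                return false;
            }
            throw; // any other failure is a real error
        }
    }
}
```

A caller would then write something like `if (client.TryDownloadString(url, Last_refreshed, out page)) { ... }` and only update its stored copy when the method returns true.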
I am using an asp:Image control to display images from various URLs. I set ImageUrl to the address of an image on a remote website (e.g. http://www.google.com/favicon.ico). How can I tell whether the image exists, that is, whether the URL of the image is valid?
You can validate if a URI is valid using the Uri.TryCreate method.
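Note that Uri.TryCreate only checks the syntax of the address, not whether anything is actually served there. A minimal sketch of such a check follows; UrlChecks and IsWellFormedHttpUrl are hypothetical names chosen for illustration.

```csharp
using System;

public static class UrlChecks
{
    // Hypothetical helper: true only for syntactically valid absolute http/https URLs.
    // It says nothing about whether a resource exists at that address.
    public static bool IsWellFormedHttpUrl(string candidate)
    {
        Uri uri;
        return Uri.TryCreate(candidate, UriKind.Absolute, out uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
    }
}
```

Restricting the scheme to http and https is a design choice made here; drop that condition if other schemes are acceptable.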
You shouldn't be checking whether the image exists in your ASP.NET application. It is the job of the browser to download the image. You can add JavaScript to allow the browser to replace the missing image with a default image, as described in this question.
You can't do this using the asp:Image control alone. However, with a little extra work you could use an ASHX handler to make a programmatic HTTP request for the image (e.g. passing the image URL on the query string). If the request succeeds, you can then stream the image to the response.
If the request returns a 404 status, then you can serve another predefined image instead.
However, this is like using a sledgehammer to crack a nut and should not be used extensively throughout a site as it could potentially create a significant load - essentially, you are asking the server (not the user's browser) to download the image. Also it could be a potential XSS security risk if not implemented carefully.
It would be fine for individual cases, specifically when you actually need to retain the requested image locally. Any images requested should be written to disk so that future requests can serve the previously retained images.
Obviously, JavaScript is a solution too, but I mention the above as a possibility, depending on the requirements.
class MyClient : WebClient
{
    public bool HeadOnly { get; set; }

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest req = base.GetWebRequest(address);
        if (HeadOnly && req.Method == "GET")
        {
            req.Method = "HEAD";
        }
        return req;
    }
}
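If your compiler does not support auto-implemented properties (C# 2.0), the HeadOnly property can be written out in full:
private bool headOnly;
public bool HeadOnly {
    get { return headOnly; }
    set { headOnly = value; }
}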
using (var client = new MyClient()) {
    client.HeadOnly = true;
    // fine, no content downloaded
    string s1 = client.DownloadString("http://google.com");
    // throws a WebException (404 Not Found)
    string s2 = client.DownloadString("http://google.com/silly");
}
try this one !!!
I have the following code with which I download a web-page into a byte array and then print it with Response.Write:
WebClient client = new WebClient();
byte[] data = client.DownloadData(requestUri);
/*********** Init response headers ********/
WebHeaderCollection responseHeaders = client.ResponseHeaders;
for (int i = 0; i < responseHeaders.Count; i++)
{
    Response.Headers.Add(responseHeaders.GetKey(i), responseHeaders[i]);
}
/***************************************************/
Besides the response headers, I need to add request headers as well. I tried to do it with the following code:
/*********** Init request headers ********/
NameValueCollection requestHeaders = Request.Headers;
foreach (string key in requestHeaders)
{
    client.Headers.Add(key, requestHeaders[key]);
}
/***************************************************/
However it does not work and I get the following exception:
This header must be modified using the appropriate property.
Parameter name: name
Could anybody help me with this? What's the correct way of adding request headers with WebClient?
Thank you.
The headers collection "protects" some of the possible headers, as described on the MSDN page here: http://msdn.microsoft.com/en-us/library/system.net.webclient.headers.aspx
That page seems to give all the answers you need, but to quote the important part:
Some common headers are considered restricted and are protected by the
system and cannot be set or changed in a WebHeaderCollection object.
Any attempt to set one of these restricted headers in the
WebHeaderCollection object associated with a WebClient object will
throw an exception later when attempting to send the WebClient
request.
Restricted headers protected by the system include, but are not
limited to the following:
Date
Host
In addition, some other headers are also restricted when using a
WebClient object. These restricted headers include, but are not
limited to the following:
Accept
Connection
Content-Length
Expect (when the value is set to "100-continue")
If-Modified-Since
Range
Transfer-Encoding
The HttpWebRequest class has properties for setting some of the above
headers. If it is important for an application to set these headers,
then the HttpWebRequest class should be used instead of the WebRequest
class.
I suspect the reason for this is that many of these headers, such as Date and Host, must be set differently for each request, so you should not be copying them. Indeed, I would suggest not copying any of them. Put in your own User-Agent: if the page you are fetching relies on a certain value, you want to make sure you always send a valid one rather than relying on the original user to supply it.
Essentially work out what you need to do rather than finding something that works and doing that without fully understanding what you are doing.
It looks like you're trying to set a header that must be set through one of the request's properties (such as CachePolicy, ContentLength or ContentType) rather than through the headers collection.
Moreover, it's not a good idea to blindly copy all the headers; copy only those you really need.
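As a rough illustration of copying only the headers you need, here is a sketch that forwards an explicit allow list and skips anything WebHeaderCollection.IsRestricted reports as restricted. HeaderForwarding and CopyAllowed are hypothetical names. Also note that IsRestricted reflects the general request-header restrictions, while WebClient itself accepts some of those names (User-Agent, for example), so this filter errs on the side of skipping.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Specialized;
using System.Net;

public static class HeaderForwarding
{
    // Hypothetical helper: copies only allow-listed headers that are not restricted.
    public static void CopyAllowed(NameValueCollection source,
                                   WebHeaderCollection target,
                                   ICollection<string> allowList)
    {
        foreach (string name in source.AllKeys)
        {
            if (name == null) continue;
            if (!allowList.Contains(name)) continue;              // not explicitly wanted
            if (WebHeaderCollection.IsRestricted(name)) continue; // must be set via a property
            target[name] = source[name];
        }
    }
}
```

In the question's code this would be called as `HeaderForwarding.CopyAllowed(Request.Headers, client.Headers, allowed)`, where `allowed` could be a `HashSet<string>` built with `StringComparer.OrdinalIgnoreCase` so that header names match regardless of case.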
I have a problem with WebClient.
It is very slow. It takes about 3-5 seconds to DownloadString from one website.
I don't have any network problems.
This is my modified WebClient:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Net;

namespace StatusChecker
{
    class WebClientEx : WebClient
    {
        public CookieContainer CookieContainer { get; private set; }

        public WebClientEx()
        {
            CookieContainer = new CookieContainer();
            ServicePointManager.Expect100Continue = false;
            Encoding = System.Text.Encoding.UTF8;
            WebRequest.DefaultWebProxy = null;
            Proxy = null;
        }

        public void ClearCookies()
        {
            CookieContainer = new CookieContainer();
        }

        protected override WebRequest GetWebRequest(Uri address)
        {
            var request = base.GetWebRequest(address);
            if (request is HttpWebRequest)
            {
                (request as HttpWebRequest).CookieContainer = CookieContainer;
            }
            return request;
        }
    }
}
UPDATE:
In Wireshark I saw that a single DownloadString call sends and receives a few thousand packets.
There may be two issues at hand here (that I've also noticed in my own programs previously):
The first request takes an abnormally long time: This occurs because WebRequest by default detects and loads proxy settings the first time it starts, which can take quite a while. To stop this, simply set the proxy property (WebRequest.Proxy) to null and it'll bypass the check (provided you can directly access the internet)
You can't download more than 2 items at once: By default, you can only have 2 simultaneous HTTP connections open to the same host. To change this, set ServicePointManager.DefaultConnectionLimit to something larger. I usually set this to int.MaxValue (just make sure you don't spam the host with 1,000,000 connections).
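Both adjustments can be applied once at startup. The following is only a sketch: NetworkDefaults and Apply are hypothetical names, and clearing the default proxy assumes the machine can reach the internet directly.

```csharp
using System.Net;

public static class NetworkDefaults
{
    // Hypothetical helper, meant to be called once at application startup.
    public static void Apply(int maxConnectionsPerHost)
    {
        // Skip automatic proxy detection (only safe with direct internet access).
        WebRequest.DefaultWebProxy = null;

        // Raise the limit on simultaneous connections to a single host.
        ServicePointManager.DefaultConnectionLimit = maxConnectionsPerHost;
    }
}
```

Calling `NetworkDefaults.Apply(16)` before the first request is one way to use it; both settings are process-wide, so they affect every request made afterwards.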
There are a few options if it is related to the initial proxy settings being checked:
Disable the automatic proxy detection settings in Internet Explorer
Set the proxy to null:
webClient.Proxy = null;
On application startup set the default webproxy to null:
WebRequest.DefaultWebProxy = null;
In older .NET code instead of setting to null, you used to write this (but null is now preferred):
webclient.Proxy = GlobalProxySelection.GetEmptyWebProxy();
Maybe this will help somebody. Some web services support compression (gzip or other), so you can send an Accept-Encoding header with your requests and enable automatic decompression on the underlying HttpWebRequest (WebClient does not expose that setting itself). Chrome works that way.
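A minimal sketch of that idea, using the same GetWebRequest override pattern as the other answers here. DecompressingWebClient and EnableDecompression are hypothetical names. Setting AutomaticDecompression makes HttpWebRequest send the Accept-Encoding header itself, so it does not need to be added by hand.

```csharp
using System;
using System.Net;

public class DecompressingWebClient : WebClient
{
    // Hypothetical helper, public so the setting can be checked without a network call.
    public static void EnableDecompression(WebRequest request)
    {
        var http = request as HttpWebRequest;
        if (http != null)
        {
            // Advertises gzip/deflate support and transparently inflates the reply.
            http.AutomaticDecompression =
                DecompressionMethods.GZip | DecompressionMethods.Deflate;
        }
    }

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        EnableDecompression(request);
        return request;
    }
}
```

Whether this speeds anything up depends on the server actually compressing its responses.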
I have used this code
WebClient webClient = new WebClient();
byte[] reqHTML;
reqHTML = webClient.DownloadData(url);
to request a URL. My question is: when using this code, are cookies set or not?
Cookies are not sent by default with WebClient. You could, however, write your own implementation that uses a cookie container:
public class CookieAwareWebClient : WebClient
{
    private CookieContainer _container = new CookieContainer();

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        if (request is HttpWebRequest)
            ((HttpWebRequest)request).CookieContainer = _container;
        return request;
    }
}
If you mean the cookies from the ASP.NET page that is executing - then no: I'm pretty sure that WebClient is not going to look at all for the cookies on the current executing web request.
If you want this functionality, can you perhaps use AJAX from the browser? Perhaps via jQuery? That should flow the context etc as per standard browser rules.
Alternatively, you are going to have to handle the cookies yourself (i.e. copy them into the WebClient, and back if needed).
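For the "handle the cookies yourself" route, the copying step might look roughly like the sketch below. CookieCopy and ForTarget are hypothetical names, and the name/value pairs are assumed to have been read from the incoming request beforehand (for example from Request.Cookies in the page).

```csharp
using System;
using System.Collections.Generic;
using System.Net;

public static class CookieCopy
{
    // Hypothetical helper: builds a container holding the given name/value pairs,
    // scoped to the host of the target URI.
    public static CookieContainer ForTarget(Uri target, IDictionary<string, string> cookies)
    {
        var container = new CookieContainer();
        foreach (KeyValuePair<string, string> pair in cookies)
        {
            // Path "/" makes the cookie apply to every URL on that host.
            container.Add(new Cookie(pair.Key, pair.Value, "/", target.Host));
        }
        return container;
    }
}
```

The resulting container can then be assigned to the request in a GetWebRequest override, as in the cookie-aware client shown earlier. Only forward cookies that the remote site is actually supposed to see.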
I'm using WebClient in C#.
I was testing out what headers are sent, and I noticed that the following header is automatically added.
Connection : Keep-Alive
Is there any way to remove this?
I ran into the same issue this morning. Following Jon Skeet's hint, it can be achieved by subclassing WebClient and configuring the HttpWebRequest it creates:
class MyWebClient : WebClient
{
    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        if (request is HttpWebRequest)
        {
            (request as HttpWebRequest).KeepAlive = false;
        }
        return request;
    }
}
Now the sent headers will include Connection: close
Use HttpWebRequest instead of WebClient (it's slightly less convenient, but not by very much) and set the KeepAlive property to false.
I haven't tested this - it's possible that it'll just change the value of the Connection header instead of removing it - but it's worth a try. The docs for the Connection property at least suggest that it only adds Keep-Alive.