Add request headers with WebClient C# - c#

I have the following code with which I download a web-page into a byte array and then print it with Response.Write:
WebClient client = new WebClient();
byte[] data = client.DownloadData(requestUri);
/*********** Init response headers ********/
WebHeaderCollection responseHeaders = client.ResponseHeaders;
for (int i = 0; i < responseHeaders.Count; i++)
{
    Response.Headers.Add(responseHeaders.GetKey(i), responseHeaders[i]);
}
/***************************************************/
Besides the response headers, I need to add the request headers as well. I tried to do it with the following code:
/*********** Init request headers ********/
NameValueCollection requestHeaders = Request.Headers;
foreach (string key in requestHeaders)
{
    client.Headers.Add(key, requestHeaders[key]);
}
/***************************************************/
However, it does not work and I get the following exception:
This header must be modified using the appropriate property. Parameter name: name
Could anybody help me with this? What's the correct way of adding request headers with WebClient?
Thank you.

The headers collection "protects" some of the possible headers, as described on the MSDN page here: http://msdn.microsoft.com/en-us/library/system.net.webclient.headers.aspx
That page seems to give all the answers you need, but to quote the important part:
Some common headers are considered restricted and are protected by the
system and cannot be set or changed in a WebHeaderCollection object.
Any attempt to set one of these restricted headers in the
WebHeaderCollection object associated with a WebClient object will
throw an exception later when attempting to send the WebClient
request.
Restricted headers protected by the system include, but are not
limited to the following:
Date
Host
In addition, some other headers are also restricted when using a
WebClient object. These restricted headers include, but are not
limited to the following:
Accept
Connection
Content-Length
Expect (when the value is set to "100-continue")
If-Modified-Since
Range
Transfer-Encoding
The HttpWebRequest class has properties for setting some of the above
headers. If it is important for an application to set these headers,
then the HttpWebRequest class should be used instead of the WebRequest
class.
I suspect the reason for this is that many of these headers, such as Date and Host, must be set differently on each request, so you should not be copying them. Indeed, I would personally suggest that you should not copy any of them. Put in your own user agent: if the page you are fetching relies on a certain value, you want to make sure you always send a valid value rather than relying on the original user to supply it.
Essentially, work out what you need to do rather than finding something that works and doing it without fully understanding what you are doing.

Looks like you're trying to set a header that must be set through a dedicated property (for example ContentLength or ContentType on HttpWebRequest) rather than through the Headers collection.
Moreover, it's not a good idea to blindly copy all the headers; copy only the ones you really need.
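As a rough sketch of the "copy only what you need" advice, the loop from the question could skip any header that the framework reports as restricted. WebHeaderCollection.IsRestricted is a real static method; the HeaderCopy class and CopyUnrestricted method names are made up for this illustration, and the sketch assumes the incoming header names are well-formed.

```csharp
using System;
using System.Collections.Specialized;
using System.Net;

static class HeaderCopy
{
    // Hypothetical helper: copies every header from source to target,
    // except those that WebHeaderCollection treats as restricted
    // (Host, Connection, Content-Length and so on).
    public static void CopyUnrestricted(NameValueCollection source, WebHeaderCollection target)
    {
        foreach (string key in source.AllKeys)
        {
            if (key == null || WebHeaderCollection.IsRestricted(key))
            {
                continue; // restricted headers must be set via properties instead
            }
            target[key] = source[key];
        }
    }
}
```

With the code from the question, the call would look like HeaderCopy.CopyUnrestricted(Request.Headers, client.Headers). An explicit allow-list of header names would be stricter still and is probably closer to what both answers recommend.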

Related

Can't I set If-Modified-Since on a WebClient?

I am using WebClient to retrieve a website. I decided to set If-Modified-Since because if the website hasn't changed, I don't want to get it again:
var c = new WebClient();
c.Headers[HttpRequestHeader.IfModifiedSince] = Last_refreshed.ToUniversalTime().ToString("r");
Where Last_refreshed is a variable in which I store the time I've last seen the website.
But when I run this, I get a WebException with the text:
The 'If-Modified-Since' header must be modified using the appropriate property or method.
Parameter name: name
Turns out the API docs mention this:
In addition, some other headers are also restricted when using a WebClient object. These restricted headers include, but are not limited to the following:
Accept
Connection
Content-Length
Expect (when the value is set to "100-continue")
If-Modified-Since
Range
Transfer-Encoding
The HttpWebRequest class has properties for setting some of the above headers. If it is important for an application to set these headers, then the HttpWebRequest class should be used instead of the WebRequest class.
So does this mean there's no way to set them from WebClient? Why not? What's wrong with specifying If-Modified-Since in a normal HTTP GET?
I know I can just use HttpWebRequest, but I don't want to because it's too much work (have to do a bunch of casting, can't just get the content as a string).
Also, I know Cannot set some HTTP headers when using System.Net.WebRequest is related, but it doesn't actually answer my question.
As unwieldy as it may be, I have opted to subclass WebClient in order to add the functionality in a way that mimics the way WebClient typically works (in which headers are consumed by / reset after each use):
public class ApiWebClient : WebClient {
    public DateTime? IfModifiedSince { get; set; }

    protected override WebRequest GetWebRequest(Uri address) {
        var webRequest = base.GetWebRequest(address);
        var httpWebRequest = webRequest as HttpWebRequest;
        if (httpWebRequest != null) {
            if (IfModifiedSince != null) {
                httpWebRequest.IfModifiedSince = IfModifiedSince.Value;
                IfModifiedSince = null;
            }
            // Handle other headers or properties here
        }
        return webRequest;
    }
}
This has the advantage of not having to write boilerplate for the standard operations that WebClient provides, while still offering some of the flexibility of using WebRequest.
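One thing to keep in mind when sending If-Modified-Since this way: HttpWebRequest reports a 304 Not Modified answer as a WebException rather than as a normal response. A minimal sketch of handling that follows; the ConditionalFetch class and its method names are invented for this example, and the client passed in is assumed to be the ApiWebClient above with IfModifiedSince already set.

```csharp
using System;
using System.Net;

static class ConditionalFetch
{
    // True when the exception carries an HTTP 304 Not Modified response.
    public static bool IsNotModified(WebException ex)
    {
        var response = ex.Response as HttpWebResponse;
        return response != null && response.StatusCode == HttpStatusCode.NotModified;
    }

    // Returns the body, or null when the server says nothing has changed.
    // Any other failure is rethrown unchanged.
    public static string DownloadOrNullIfNotModified(WebClient client, string url)
    {
        try
        {
            return client.DownloadString(url);
        }
        catch (WebException ex)
        {
            if (IsNotModified(ex))
            {
                return null;
            }
            throw;
        }
    }
}
```

Returning null for "unchanged" is only one possible convention; a caller that needs to distinguish more cases might prefer a small result type.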

HTTP request whose headers can be controlled and is automatically decompressed

I'm trying to send HTTP requests in C# that look like HTTP requests from a certain software. I wanted to use System.Net.HttpWebRequest but it doesn't give me the control I need over its headers: their letter-casing can't be changed (e.g. I want the Connection header to be keep-alive and not Keep-Alive), I don't have full control over the headers ordering, etc.
I tried using HttpClient from CodeScales library. Unfortunately, it doesn't decompress responses automatically (see HttpWebRequest.AutomaticDecompression). I decompressed it myself with System.IO.Compression.GZipStream and DeflateStream, but it didn't work when the response had the header Transfer-Encoding: chunked.
System.Net.Http.HttpRequestHeaders seems to give more control over headers than HttpWebRequest, but still not enough.
How can it be done?
Edit: I know that HTTP accepts those headers as valid anyway, but I'm working with a server that validates the headers and refuses to respond if they're not exactly what it expects.
To set some headers on the HttpWebRequest class, you have to either use a property of the class (for example HttpWebRequest.KeepAlive = true) or add the custom header to the request by calling the Add method on the request's Headers collection.
One important point: if you try to add a header through the collection when the request already exposes it as a property, an exception is thrown.
objRequest.Headers.Add("Accept", "some data");
is incorrect. Instead, write:
objRequest.Accept = "some data";
In your case you can use:
objRequest.KeepAlive = true;
Don't worry too much about the letter-casing; it doesn't matter as long as you're sending the appropriate headers to the server.

Content Headers Remove fails for string Authorization

The following test fails inexplicably:
[Test]
public void CrazyAssHttpRequestMessageTest()
{
    var subject = new HttpRequestMessage()
    {
        Method = HttpMethod.Get,
        Content = new StringContent("some content")
    };
    subject.Content.Headers.Remove("Authorization");
}
The exception is:
System.InvalidOperationException : Misused header name. Make sure
request headers are used with HttpRequestMessage, response headers
with HttpResponseMessage, and content headers with HttpContent
objects.
Why? Any other header seems to work fine, replace Authorization with something else and all is ok.
The HttpContentHeaders class only supports a subset of HTTP headers -- the headers relating to content. It seems a bit of an odd decision to split them up that way, but that's the way the framework works.
The upshot is that there will never be an Authorization header in request.Content.Headers.
You get exactly the same error if you try to remove "Content-Type" from HttpRequestHeaders or HttpResponseHeaders, or if you try to add an unexpected header to these collections without calling TryAddWithoutValidation. Even more frustrating is the fact that Contains() will throw if you try to check for an invalid header. You can check for existence without throwing without worrying about the exact type of header collection using HttpHeaders.TryGetValues, or just use request.Content.Headers.Any(x => x.Key == "Authorization").
The classes named above each document the headers they explicitly support (as strongly typed properties), e.g. HttpContentHeaders.ContentType.
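The TryGetValues approach mentioned above could be wrapped as follows. This is only a sketch: the HeaderChecks class and HasHeader method are names invented for the example, while HttpHeaders.TryGetValues and TryAddWithoutValidation are the real framework methods.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Headers;

static class HeaderChecks
{
    // Hypothetical helper: reports whether a header is present without throwing,
    // regardless of whether the collection holds request, response or content headers.
    public static bool HasHeader(HttpHeaders headers, string name)
    {
        IEnumerable<string> values;
        return headers.TryGetValues(name, out values);
    }
}

static class HeaderChecksDemo
{
    public static void Run()
    {
        var request = new HttpRequestMessage(HttpMethod.Get, "http://example.com/")
        {
            Content = new StringContent("some content")
        };
        request.Headers.TryAddWithoutValidation("Authorization", "Basic abc123");

        // Authorization lives on the request headers, never on the content headers.
        Console.WriteLine(HeaderChecks.HasHeader(request.Headers, "Authorization"));
        Console.WriteLine(HeaderChecks.HasHeader(request.Content.Headers, "Authorization"));
    }
}
```

The same helper can be pointed at response.Headers or response.Content.Headers, which avoids having to know in advance which collection a given header belongs to.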

check to see if URL is a download link using webclient c#

I am reading from the history database, and for every URL read, I download it and store the data in a string. I want to be able to determine whether the link is a download link, e.g. a .exe or .zip file. I assume I need to read the headers to determine this, but I don't know how to do it with WebClient. Any suggestions?
while (sqlite_datareader.Read())
{
    noIndex = false;
    string url = (string)sqlite_datareader["url"];
    try
    {
        if (url.Contains("http") && (!url.Contains(".pdf")) && (!url.Contains(".jpg")) && (!url.Contains("https")) && !isInBlackList(url))
        {
            WebClient client = new WebClient();
            client.Headers.Add("user-agent", "Only a test!");
            String htmlCode = client.DownloadString(url);
        }
    }
    catch (WebException)
    {
        // The original post omits the catch block; this placeholder just skips failed downloads.
    }
}
Instead of loading the complete content behind the link, I would issue a HEAD request.
The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response. The metainformation contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request. This method can be used for obtaining metainformation about the entity implied by the request without transferring the entity-body itself. This method is often used for testing hypertext links for validity, accessibility, and recent modification.
Quoted from http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
See these questions for C# examples
How to check if a file exists on a server using c# and the WebClient class
How to check if System.Net.WebClient.DownloadData is downloading a binary file?
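A HEAD request along those lines might look like the sketch below. It uses HttpWebRequest directly, since WebClient has no built-in HEAD method. The LinkProbe class, its method names and the short list of "download" content types are all assumptions made for this illustration, not a definitive classification.

```csharp
using System;
using System.Net;

static class LinkProbe
{
    // Assumed examples of content types that indicate a file download.
    static readonly string[] DownloadTypes =
    {
        "application/zip",
        "application/octet-stream",
        "application/x-msdownload"
    };

    // Issues a HEAD request and returns the Content-Type header without fetching the body.
    public static string GetContentType(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "HEAD";
        using (var response = (HttpWebResponse)request.GetResponse())
        {
            return response.ContentType;
        }
    }

    // Pure check, kept separate so it can be used on any Content-Type string.
    public static bool LooksLikeDownload(string contentType)
    {
        if (string.IsNullOrEmpty(contentType))
        {
            return false;
        }
        foreach (string type in DownloadTypes)
        {
            if (contentType.StartsWith(type, StringComparison.OrdinalIgnoreCase))
            {
                return true;
            }
        }
        return false;
    }
}
```

A caller would combine the two as LinkProbe.LooksLikeDownload(LinkProbe.GetContentType(url)). Some servers reject or mishandle HEAD, so falling back to a GET may still be necessary in practice.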
You're on the right track; you'll need to examine the ResponseHeaders after a successful request:
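var someType = "application/zip";
string contentType = client.ResponseHeaders["Content-Type"];
// The header may be missing, so check for null before calling Contains.
if (contentType != null && contentType.Contains(someType)) {
    // this was a "download link"
}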
The tricky part will be in determining what constitutes a download link since there are so many content types possible. For example, how would you decide whether XML data is a download link or not?
Check WebClient's ResponseHeaders collection to validate the response file type.
In case anyone has the same problem: I used an attribute in the history database, places.sqlite, which came in very handy!
places.sqlite contains a table called moz_historyvisits, which has a column visit_type. According to [1], a visit_type of 7 is a download link. Therefore, reading this value determines whether it is a download link without reading the response headers or even sending a HEAD request.
[1] http://www.firefoxforensics.com/research/moz_historyvisits.shtml

How to set WebClient Content-Type Header?

To connect to a third-party service I need to make an HTTPS POST. One of the requirements is to send a custom content type.
I'm using WebClient, but I can't find how to set it. I've tried making a new class and overriding the CreateRequest method, but that makes the request crash.
Is there any way to do that without having to rewrite the CopyHeadersTo method?
EDIT: CopyHeadersTo is a method I've seen using .NET Reflector. It's invoked from GetWebRequest and sets all request headers, including Content-Type, from private properties.
You could try adding to the Headers collection.
myWebClient.Headers.Add("Content-Type","application/xxx");
webclient.Headers[HttpRequestHeader.ContentType] = "application/x-www-form-urlencoded";
Well, I just missed the Request.ContentType property. If the GetWebRequest method is overridden, setting ContentType to the desired value does the job.
Still, the connection to the third party is not working. Go figure.
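For readers who want to see what that override could look like, here is a minimal sketch. The CustomContentTypeWebClient class and its ContentType property are names invented for this example; WebClient.GetWebRequest and WebRequest.ContentType are the real members being used.

```csharp
using System;
using System.Net;

class CustomContentTypeWebClient : WebClient
{
    // Hypothetical property: the content type to send with each request.
    public string ContentType { get; set; }

    protected override WebRequest GetWebRequest(Uri address)
    {
        // Let WebClient build the request and copy its own headers first,
        // then apply the custom content type on top.
        WebRequest request = base.GetWebRequest(address);
        if (!string.IsNullOrEmpty(ContentType))
        {
            request.ContentType = ContentType;
        }
        return request;
    }
}
```

Usage would be along the lines of new CustomContentTypeWebClient { ContentType = "application/xxx" } followed by UploadString or UploadData, where "application/xxx" stands in for whatever type the third-party service requires.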
I encountered this too, and found that you must use the client HTTP stack; otherwise the browser HTTP stack blocks changes to Content-Type for security reasons. This MSDN link explains that.
WebRequest.RegisterPrefix("http://", WebRequestCreator.ClientHttp);
WebRequest.RegisterPrefix("https://", WebRequestCreator.ClientHttp);
client.Headers["Content-Type"] = "application/json";
