How to check the modify time of a remote file - c#

I Am in need of knowing the last modification DateTime of a remote file prior to downloading the entire content. This to save up on downloading bytes I am never going to need anyway.
Currently I am using WebClient to download the file. It is not needed to keep the use of WebClient specifically. The Last-Modified key can be found within the response headers but the entire file is downloaded at that point in time.
WebClient webClient = new WebClient();
byte[] buffer = webClient.DownloadData( uri );
WebHeaderCollection webClientHeaders = webClient.ResponseHeaders;
String modified = webClientHeaders.GetKey( "Last-Modified" );
Also I am not sure if that key is always included at each file on the internet.

You can use the HTTP "HEAD" method to just get the file's headers.
...
var request = WebRequest.Create(uri);
request.Method = "HEAD";
...
Then you can extract the last modified date and check whether to download the file or not.
Just be aware that not all servers implement Last-modified properly.

Related

Why does WebClient.UploadValues overwrites my html web page?

I'm familiar with Winform and WPF, but new to web developing. One day saw WebClient.UploadValues and decided to try it.
static void Main(string[] args)
{
using (var client = new WebClient())
{
var values = new NameValueCollection();
values["thing1"] = "hello";
values["thing2"] = "world";
//A single file that contains plain html
var response = client.UploadValues("D:\\page.html", values);
var responseString = Encoding.Default.GetString(response);
Console.WriteLine(responseString);
}
Console.ReadLine();
}
After run, nothing printed, and the html file content becomes like this:
thing1=hello&thing2=world
Could anyone explain it, thanks!
The UploadValues method is intended to be used with the HTTP protocol. This means that you need to host your html on a web server and make the request like that:
var response = client.UploadValues("http://some_server/page.html", values);
In this case the method will send the values to the server by using application/x-www-form-urlencoded encoding and it will return the response from the HTTP request.
I have never used the UploadValues with a local file and the documentation doesn't seem to mention anything about it. They only mention HTTP or FTP protocols. So I suppose that this is some side effect when using it with a local file -> it simply overwrites the contents of this file with the payload that is being sent.
You are using WebClient not as it was intended.
The purpose of WebClient.UploadValues is to upload the specified name/value collection to the resource identified by the specified URI.
But it should not be some local file on your disk, but instead it should be some web-service listening for requests and issuing responces.

Exception in BackgroundUploader

I am trying to upload some avi file to server. It works fine with HttpRequest but i need to continue uploading even if i suspend app so thats why i am trying to use BackgroundUploader. I am following this guideline on msdn http://msdn.microsoft.com/en-us/library/windows/apps/jj152727.aspx. So my code looks something like this.
StorageFile storageFile = KnownFolders.VideosLibrary.GetFileAsync("fileName");
BackgroundUploader uploader = new BackgroundUploader();
uploader.Method = "POST";
uploader.SetRequestHeader("Content-Type", "multipart/form-data; boundary=" + boundary);
var fs = await storageFile.OpenAsync(Windows.Storage.FileAccessMode.ReadWrite);
IInputStream aaaa = fs.GetInputStreamAt(0);
UploadOperation upload = uploader.CreateUploadFromStreamAsync(new Uri("uploadUri"), aaaa);
await HandleUploadAsync(upload, true);
the rest is same as on MSDN. And i am getting exception Unsupported media type (415) in method HandleUploadAsync on line
await upload.StartAsync().AsTask(cts.Token, progressCallback);
What am i doing wrong? Or what can cause this kind of exception?
EDIT : I solved my problem as i commented down here and in my answer. I think at the end i am basically just sending some data to server that are recognized and interpreted as i want to. So if i use BackgroundUploader i am not only uploading some file i am also sending information about how am i doing that(as i mentioned in my answer). So by the same idea i can also upload folder to server and by that i am not sending any actual content only some description about how to do that. And if i compare request that i am making by HttpRequest and BackgroundUploader they are equal and thats what i wanted.
So the problem part was the header of request. I have some header in my request that is recognized by server and i was trying to put it to BackgroundUploader through SetRequestheader method but it did not work. As Kieqic suggested i used Fiddler and by that i compare request made by HttpRequest and BackgroundUploader. I found out they are completely different. So through SetRequestheader i add some parts like expected Content-Type and for the rest parts of header to make them equal i add it before content of my file as array of bytes. And this works so conclusion is in my case using Fiddler that helped my how to construct request header.

Add request headers with WebClient C#

I have the following code with which I download a web-page into a byte array and then print it with Response.Write:
WebClient client = new WebClient();
byte[] data = client.DownloadData(requestUri);
/*********** Init response headers ********/
WebHeaderCollection responseHeaders = client.ResponseHeaders;
for (int i = 0; i < responseHeaders.Count; i++)
{
Response.Headers.Add(responseHeaders.GetKey(i), responseHeaders[i]);
}
/***************************************************/
Besides of the response headers, I need to add request headers as well. I try to do it with the following code:
/*********** Init request headers ********/
NameValueCollection requestHeaders = Request.Headers;
foreach (string key in requestHeaders)
{
client.Headers.Add(key, requestHeaders[key]);
}
/***************************************************/
However it does not work and I get the following exception:
This header must be modified using the appropriate property.Parameter name: name
Could anybody help me with this? What's the correct way of adding request headers with WebClient?
Thank you.
The headers collection "protects" some of the possible headers as described on the msdn page here: http://msdn.microsoft.com/en-us/library/system.net.webclient.headers.aspx
That page seems to give all the answer you need but to quote the important part:
Some common headers are considered restricted and are protected by the
system and cannot be set or changed in a WebHeaderCollection object.
Any attempt to set one of these restricted headers in the
WebHeaderCollection object associated with a WebClient object will
throw an exception later when attempting to send the WebClient
request.
Restricted headers protected by the system include, but are not
limited to the following:
Date
Host
In addition, some other headers are also restricted when using a
WebClient object. These restricted headers include, but are not
limited to the following:
Accept
Connection
Content-Length
Expect (when the value is set to "100-continue"
If-Modified-Since
Range
Transfer-Encoding
The HttpWebRequest class has properties for setting some of the above
headers. If it is important for an application to set these headers,
then the HttpWebRequest class should be used instead of the WebRequest
class.
I suspect the reason for this is that many of the headers such as Date and host must be set differently on a different request. You should not be copying them. Indeed I would personally probably suggest that you should not be copying any of them. Put in your own user agent - If the page you are getting relies on a certain value then I'd think you want to make sure you always send a valid value rather than relying on the original user to give you that information.
Essentially work out what you need to do rather than finding something that works and doing that without fully understanding what you are doing.
Looks like you're trying to set some header which is must be set using one of the WebClient properties (CachePolicy, ContentLength or ContentType)
Moreover, it's not very good to blindly copy all the headers, you need to get just those you really need.

check to see if URL is a download link using webclient c#

I am reading from the history database, and for every URL read, I am downloading it and storing the data into a string. I want to be able to determine if the link is a download link, i.e. .exe or .zip for e.g. I am assuming I need to read the headers to determine this, but I don't know how to do it with WebClient. Any suggestions?
while (sqlite_datareader.Read())
{
noIndex = false;
string url = (string)sqlite_datareader["url"];
try
{
if (url.Contains("http") && (!url.Contains(".pdf")) && (!url.Contains(".jpg")) && (!url.Contains("https")) && !isInBlackList(url))
{
WebClient client = new WebClient();
client.Headers.Add("user-agent", "Only a test!");
String htmlCode = client.DownloadString(url);
}
}
}
Instead of loading the complete content behind the link, I would issue a HEAD request.
The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response. The metainformation contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request. This method can be used for obtaining metainformation about the entity implied by the request without transferring the entity-body itself. This method is often used for testing hypertext links for validity, accessibility, and recent modification.
Quote of http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
See these questions for C# examples
How to check if a file exists on a server using c# and the WebClient class
How to check if System.Net.WebClient.DownloadData is downloading a binary file?
You're on the right track; you'll need to examine the ResponseHeaders after a successful request:
var someType = "application/zip";
if (client.ResponseHeaders["Content-Type"].Contains(someType)) {
// this was a "download link"
}
The tricky part will be in determining what constitutes a download link since there are so many content types possible. For example, how would you decide whether XML data is a download link or not?
Try to check WebClient's ResponseHeaders collections to validate response file type.
In case, anyone has the same problem, I have used an attribute in the history places.sqlite database which came in very handy!
Places.sqlite contains a table called moz_historyvisits which contains a column visit_type. According to [1], a visit_type of 7 is a download link. Therefore, reading this value will determine if it is a download link without reading the response header or even sending out a head method.
[1] http://www.firefoxforensics.com/research/moz_historyvisits.shtml

Downloading a Cab file using WebClient gives too few bytes

I need to download a Cab file from a Url into a stream.
using (WebClient client = new WebClient())
{
client.Credentials = CredentialCache.DefaultCredentials;
byte[] fileContents = client.DownloadData("http://localhost/sites/hfsc/FormServerTemplates/HfscInspectionForm.xsn");
using (MemoryStream ms = new MemoryStream(fileContents))
{
FormTemplate = formExtractor.ExtractFormTemplateComponent(ms, "template.xml");
}
}
This is fairly straight forward, however my cab extractor (CabLib) is throwing an exception that it's not a valid cabinet.
I was previously using a SharePoint call to get the byte stream and that was returning 30942 bytes. The stream I get through that method worked correctly with CabLib. The stream I get with the WebClient returns only 28087 bytes.
I have noticed that the responce header content-type is coming back as text/html; charset=utf-8
I'm not too sure why but I think it's what's affecting the data I get back.
I beleive the problem is that SharePoint is passing the xsn to the Forms Server to render as an info path form in HTML for you. You need to stop this from happening. You can do this by adding some query string parameters to the URL request.
These can be found at:
http://msdn.microsoft.com/en-us/library/ms772417.aspx
I suggest you use NoRedirect=true

Categories