Get original filename when downloading with WebClient - c#

Is there any way to know the original name of a file you download using the WebClient when the Uri doesn't contain the name?
This happens for example in sites where the download originates from a dynamic page where the name isn't known beforehand.
Using my browser, the file gets the correct name. But how can this be done using the WebClient?
E.g.
WebClient wc = new WebClient();
var data = wc.DownloadData(@"www.sometime.com\getfile?id=123");
Using DownloadFile() isn't a solution since this method needs a filename in advance.

You need to examine the response headers and see if there is a content-disposition header present which includes the actual filename.
WebClient wc = new WebClient();
var data = wc.DownloadData(@"www.sometime.com\getfile?id=123");
string fileName = "";
// Try to extract the filename from the Content-Disposition header
if (!String.IsNullOrEmpty(wc.ResponseHeaders["Content-Disposition"]))
{
fileName = wc.ResponseHeaders["Content-Disposition"].Substring(wc.ResponseHeaders["Content-Disposition"].IndexOf("filename=") + 9).Replace("\"", "");
}

Read the "Content-Disposition" response header through WebClient.ResponseHeaders.
It should look like this:
Content-Disposition: attachment; filename="fname.ext"
Your code should look like:
string fileName = null;
string header = wc.ResponseHeaders["Content-Disposition"] ?? string.Empty;
const string filename = "filename=";
int index = header.LastIndexOf(filename, StringComparison.OrdinalIgnoreCase);
if (index > -1)
{
// Strip the quotes that surround the value in the header shown above
fileName = header.Substring(index + filename.Length).Trim('"');
}
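The Substring-based snippets above assume that filename= is the last parameter in the header and that no filename*= form is present. As a hedged alternative, here is a minimal sketch that lets the framework parse the header, assuming System.Net.Http is available in your project. The class name ContentDispositionParser and the method TryGetFileName are made up for this illustration.

```csharp
using System;
using System.Net.Http.Headers;

// Hypothetical helper (not from the answers above).
public static class ContentDispositionParser
{
    // Returns true and the bare file name when the header carries one.
    public static bool TryGetFileName(string headerValue, out string fileName)
    {
        fileName = null;
        if (string.IsNullOrWhiteSpace(headerValue) ||
            !ContentDispositionHeaderValue.TryParse(headerValue, out var parsed))
        {
            return false;
        }

        // Prefer the extended form (filename*=...) when the server sent it.
        string raw = parsed.FileNameStar ?? parsed.FileName;
        if (string.IsNullOrWhiteSpace(raw))
        {
            return false;
        }

        // FileName can still include the surrounding quotes, so trim them.
        fileName = raw.Trim('"');
        return fileName.Length > 0;
    }
}
```

With WebClient this could be called as ContentDispositionParser.TryGetFileName(wc.ResponseHeaders["Content-Disposition"], out var name) after the download has completed.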

To get the filename without downloading the file:
public string GetFilenameFromWebServer(string url)
{
string result = "";
var req = System.Net.WebRequest.Create(url);
req.Method = "HEAD";
using (System.Net.WebResponse resp = req.GetResponse())
{
// Try to extract the filename from the Content-Disposition header
if (!string.IsNullOrEmpty(resp.Headers["Content-Disposition"]))
{
result = resp.Headers["Content-Disposition"].Substring(resp.Headers["Content-Disposition"].IndexOf("filename=") + 9).Replace("\"", "");
}
}
return result;
}

If you, like me, have to deal with a Content-Disposition header that is not formatted correctly or cannot be parsed automatically by the ContentDisposition class for some reason, here's my solution:
string fileName = null;
// Getting file name
var request = WebRequest.Create(url);
request.Method = "HEAD";
using (var response = request.GetResponse())
{
// Headers are not correct... So we need to parse manually
var contentDisposition = response.Headers["Content-Disposition"];
// We delete everything up to and including 'Filename="'
var fileNameMarker= "filename=\"";
var beginIndex = contentDisposition.ToLower().IndexOf(fileNameMarker);
contentDisposition = contentDisposition.Substring(beginIndex + fileNameMarker.Length);
//We only get the string until the next double quote
var fileNameLength = contentDisposition.ToLower().IndexOf("\"");
fileName = contentDisposition.Substring(0, fileNameLength);
}

Related

AzureDevops Api: Get item API with download true return a json

I'm trying to download a Git File using C#. I use the following code:
Stream response = await client.GetStreamAsync(url);
var splitpath = path.Split("/");
Stream file = File.OpenWrite(splitpath[splitpath.Length - 1]);
response.CopyToAsync(file);
response.Close();
file.Close();
Following this documentation, I use the following url:
string url = mainurl + name + "/_apis/git/repositories/" + rep + "/items?path=" + path + "&download=true&api-version=6.0";
but the file saved contains a json containing different links and information about the git file.
To check if all was working well, I tried to download it in a zip format, using the following url:
string url = mainurl + name + "/_apis/git/repositories/" + rep + "/items?path=" + path + "&$format=zip";
And it works fine, the file downloaded is a zip file containing the original file with its content...
Can someone help me? Thanks
P.S. I know that I can set IncludeContent to True, and get the content in the json, but I need the original file.
Since you are using C#, I will give you a C# sample to get the original files:
using RestSharp;
using System;
using System.IO;
using System.IO.Compression;
namespace xxx
{
class Program
{
static void Main(string[] args)
{
string OrganizationName = "xxx";
string ProjectName = "xxx";
string RepositoryName = "xxx";
string Personal_Access_Token = "xxx";
string archive_path = "./"+RepositoryName+".zip";
string extract_path = "./"+RepositoryName+"";
string url = "https://dev.azure.com/"+OrganizationName+"/"+ProjectName+"/_apis/git/repositories/"+RepositoryName+"/items?$format=zip&api-version=6.0";
var client = new RestClient(url);
//client.Timeout = -1;
var request = new RestRequest(url, Method.Get);
request.AddHeader("Authorization", "Basic "+Personal_Access_Token);
var response = client.Execute(request);
//save the zip file
File.WriteAllBytes(archive_path, response.RawBytes);
//unzip the file
if (Directory.Exists(extract_path))
{
Directory.Delete(extract_path, true);
ZipFile.ExtractToDirectory(archive_path, extract_path);
}
else
{
ZipFile.ExtractToDirectory(archive_path, extract_path);
}
}
}
}
This works successfully on my side.
Let me know whether this works on your side.
var personalaccesstoken = "xyz....";
using (HttpClient client = new HttpClient())
{
client.DefaultRequestHeaders.Accept.Add(
new System.Net.Http.Headers.MediaTypeWithQualityHeaderValue("*/*")); //this did the magic for me
client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Basic",
Convert.ToBase64String(
System.Text.ASCIIEncoding.ASCII.GetBytes(
string.Format("{0}:{1}", "", personalaccesstoken))));
using (Stream stream = await client.GetStreamAsync(
"https://dev.azure.com/fabrikam/myproj/_apis/git/repositories/myrepoid/items?path=%2Fsrc%2Ffolder%2Ffile.txt&api-version=7.0")) //no download arg
{
StreamReader sr = new StreamReader(stream);
var text = sr.ReadToEnd();
return text; // text has the content of the source file
}
}
There is no need for the download parameter in the URL.
The Accept request header should not ask for JSON.
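The two answers build the Authorization header differently: one passes the token as is, the other base64-encodes an empty user name plus the token. A small sketch of the second form follows, in case the raw token is what you have. AzureDevOpsAuth and FromPersonalAccessToken are names invented for this example.

```csharp
using System;
using System.Net.Http.Headers;
using System.Text;

// Hypothetical helper (not part of any Azure DevOps SDK).
public static class AzureDevOpsAuth
{
    public static AuthenticationHeaderValue FromPersonalAccessToken(string pat)
    {
        // Basic credentials are base64("username:password").
        // The user name is left empty, as in the HttpClient answer above.
        string encoded = Convert.ToBase64String(Encoding.ASCII.GetBytes(":" + pat));
        return new AuthenticationHeaderValue("Basic", encoded);
    }
}
```

It could then be assigned with client.DefaultRequestHeaders.Authorization = AzureDevOpsAuth.FromPersonalAccessToken(personalaccesstoken);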

get downloaded file from URL and Illegal characters in path

string uri = "https://sometest.com/l/admin/ical.html?t=TD61C7NibbV0m5bnDqYC_q";
string filePath = "D:\\Data\\Name";
WebClient webClient = new WebClient();
webClient.DownloadFile(uri, (filePath + "/" + uri.Substring(uri.LastIndexOf('/'))));
/// filePath + "/" + uri.Substring(uri.LastIndexOf('/')) = "D:\\Data\\Name//ical.html?t=TD61C7NibbV0m5bnDqYC_q"
Opening the entire uri string automatically downloads an iCal file. The file name is room113558101.ics (not that this will help).
How can I get the file correctly?
You are building your file path the wrong way, which results in an invalid file name (ical.html?t=TD61C7NibbV0m5bnDqYC_q). Instead, use the Uri.Segments property and take the last path segment (which will be ical.html in this case). Also, don't combine file paths by hand; use Path.Combine:
var uri = new Uri("https://sometest.com/l/admin/ical.html?t=TD61C7NibbV0m5bnDqYC_q");
var lastSegment = uri.Segments[uri.Segments.Length - 1];
string directory = "D:\\Data\\Name";
string filePath = Path.Combine(directory, lastSegment);
WebClient webClient = new WebClient();
webClient.DownloadFile(uri, filePath);
To answer your edited question about getting the correct filename: in this case you don't know the correct filename until you make a request to the server and get a response. The filename will be in the Content-Disposition response header, so you should do it like this:
var uri = new Uri("https://sometest.com/l/admin/ical.html?t=TD61C7NibbV0m5bnDqYC_q");
string directory = "D:\\Data\\Name";
WebClient webClient = new WebClient();
// make a request to server with `OpenRead`. This will fetch response headers but will not read whole response into memory
using (var stream = webClient.OpenRead(uri)) {
// get and parse Content-Disposition header if any
var cdRaw = webClient.ResponseHeaders["Content-Disposition"];
string filePath;
if (!String.IsNullOrWhiteSpace(cdRaw)) {
filePath = Path.Combine(directory, new System.Net.Mime.ContentDisposition(cdRaw).FileName);
}
else {
// if no such header - fallback to previous way
filePath = Path.Combine(directory, uri.Segments[uri.Segments.Length - 1]);
}
// copy response stream to target file
using (var fs = File.Create(filePath)) {
stream.CopyTo(fs);
}
}
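A file name taken from a response header is chosen by the server, so it may be worth cleaning it before it reaches Path.Combine. The sketch below is one way to do that, assuming a modern .NET runtime where Path.GetFileName does not throw on unusual characters. SafePaths, CombineWithUntrustedName and the fallback name download.bin are all invented for the illustration.

```csharp
using System;
using System.IO;

// Hypothetical helper (not from the answer above).
public static class SafePaths
{
    public static string CombineWithUntrustedName(
        string directory, string untrustedName, string fallbackName = "download.bin")
    {
        // Keep only the last path component so "../" segments cannot escape the directory.
        string name = Path.GetFileName(untrustedName ?? string.Empty);

        // Replace characters that the current platform rejects in file names.
        foreach (char c in Path.GetInvalidFileNameChars())
        {
            name = name.Replace(c, '_');
        }

        // Fall back to a fixed name when nothing usable is left.
        if (string.IsNullOrWhiteSpace(name) || name == "." || name == "..")
        {
            name = fallbackName;
        }

        return Path.Combine(directory, name);
    }
}
```

In the snippet above, this would take the place of the direct Path.Combine(directory, ...FileName) call.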

how to get filename from URL without downloading file c#

this is my code
Uri uri = new Uri(this.Url);
var data = client.DownloadData(uri);
if (!String.IsNullOrEmpty(client.ResponseHeaders["Content-Disposition"]))
{
FileName = client.ResponseHeaders["Content-Disposition"].Substring(client.ResponseHeaders["Content-Disposition"].IndexOf("filename=") + 10).Replace("\"", "");
}
How can I get the file name without downloading the file, I mean without using client.DownloadData?
WebClient does not support this, but with HttpWebRequest you can either be nice and send a HEAD request, if the server supports it, or send a normal GET request and simply not download the data:
The HEAD request:
HttpWebRequest request = (HttpWebRequest)System.Net.WebRequest.Create(uri);
request.Method = "HEAD";
HttpWebResponse response = (HttpWebResponse) request.GetResponse();
string disposition = response.Headers["Content-Disposition"];
string filename = disposition.Substring(disposition.IndexOf("filename=") + 9).Replace("\"", "");
response.Close();
If the server doesn't support HEAD, send a normal GET request:
HttpWebRequest request = (HttpWebRequest)System.Net.WebRequest.Create(uri);
HttpWebResponse response = (HttpWebResponse) request.GetResponse();
string disposition = response.Headers["Content-Disposition"];
string filename = disposition.Substring(disposition.IndexOf("filename=") + 9).Replace("\"", "");
response.Close();
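The same HEAD-then-GET idea can also be sketched with HttpClient. This is only an illustration under some assumptions: RemoteFileName, FromResponse and GetAsync are hypothetical names, and the fallback to the last URI path segment is a choice made here, not something the answer prescribes.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper (not from the answer above).
public static class RemoteFileName
{
    // Picks a name from Content-Disposition, else from the URI path.
    public static string FromResponse(HttpResponseMessage response, Uri requestUri)
    {
        var disposition = response.Content?.Headers.ContentDisposition;
        string name = (disposition?.FileNameStar ?? disposition?.FileName)?.Trim('"');
        return string.IsNullOrEmpty(name) ? Path.GetFileName(requestUri.LocalPath) : name;
    }

    public static async Task<string> GetAsync(HttpClient client, Uri uri)
    {
        // Try HEAD first: the server should answer with headers only.
        using (var head = new HttpRequestMessage(HttpMethod.Head, uri))
        using (var response = await client.SendAsync(head))
        {
            if (response.IsSuccessStatusCode)
            {
                return FromResponse(response, uri);
            }
        }

        // Fall back to GET, returning as soon as the headers have arrived.
        // The body is never read; disposing the response abandons it.
        using (var get = new HttpRequestMessage(HttpMethod.Get, uri))
        using (var response = await client.SendAsync(get, HttpCompletionOption.ResponseHeadersRead))
        {
            response.EnsureSuccessStatusCode();
            return FromResponse(response, uri);
        }
    }
}
```

A caller would use string name = await RemoteFileName.GetAsync(client, uri); with an HttpClient instance of its own.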

Don't know which file I'm downloading

I'm trying to download a file, from a link that looks like:
www.sample.com/download.php?id=1234231
I don't know which file I'll get from this link.
First I tried webclient.downloadfile(link,path) - but the path I gave as the folder that the file should be in gave me an access denied error.
My problem is that I can't determine the file I'll get.
I've tried something like:
var wreq = (HttpWebRequest)HttpWebRequest.Create(link);
using (var res = (HttpWebResponse) wreq.GetResponse())
{
using (var reader = new StreamReader(res.GetResponseStream()))
{
//get filename Header
var filenameHeader =
res.GetResponseHeader("Content-Disposition")
.Split(';')
.Where(s => s.Contains("filename"))
.ToList()[
0];
var fileName = filenameHeader.Replace(" ", "").Split('=')[1];
//clear fileName
fileName = fileName.Replace(":", "");
using (var writer = new StreamReader(Path.Combine(folderToSave , fileName),FileMode.Create))
{
writer.Write(reader.ReadToEnd());
}
}
}
Isn't there something simpler than that?
Is there any chance that I will download a file and not get a "Content-Disposition" header?
Last thing, at the moment I'm trying to write the file using a StreamWriter but the resulting file is corrupted. I assume that this is something related to not writing in binary format, but I'm not sure.
I've also checked the "Content-Length" header and it had a different value than response.GetResponse().ToString().Length; maybe the header is counted in the length as well?
You can extend the WebClient class for this:
class MyWebClient : WebClient
{
public string FileName { get; private set; }
protected override WebResponse GetWebResponse(WebRequest request)
{
WebResponse response = base.GetWebResponse(request);
FileName = Regex.Match(((HttpWebResponse)response).Headers["Content-Disposition"], "filename=(.+?)$").Result("$1");
string regexSearch = new string(Path.GetInvalidFileNameChars()) + new string(Path.GetInvalidPathChars());
Regex r = new Regex(string.Format("[{0}]", Regex.Escape(regexSearch)));
FileName = r.Replace(FileName, "-");
return response;
}
}
Usage:
MyWebClient mwc = new MyWebClient();
byte[] bytes = mwc.DownloadData("http://subtitle.co.il//downloadsubtitle.php?id=202500");
File.WriteAllBytes(Path.Combine(folderToSave, mwc.FileName), bytes);
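On the corruption mentioned in the question: StreamReader and StreamWriter decode and encode text, so bytes that are not valid text can change on the way through. Content-Length counts the bytes of the body, which is why it need not match the length of a decoded string. If you prefer to stream the response to disk instead of using DownloadData, a byte-for-byte copy could look like the sketch below. BinaryDownload and SaveTo are hypothetical names.

```csharp
using System.IO;

// Hypothetical helper (not from the answer above).
public static class BinaryDownload
{
    // Copies the raw bytes of the source stream to a file; no text decoding takes place.
    // Returns the number of bytes written, which can be compared with Content-Length.
    public static long SaveTo(Stream source, string path)
    {
        using (var target = File.Create(path))
        {
            source.CopyTo(target);
            return target.Length;
        }
    }
}
```

In the question's code, the source would be res.GetResponseStream() and the path the combined folder and file name.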

Unable to cast object of type 'System.Net.HttpWebRequest' to type 'System.Net.FileWebRequest'

I am trying to download a file from a server with FileWebRequest, but I get the error shown in the title.
The download method is here:
public string HttpFileGetReq(Uri uri, int reqTimeout, Encoding encoding)
{
try
{
string stringResponse;
var req = (FileWebRequest)WebRequest.Create(uri);
req.Timeout = reqTimeout;
req.Method = WebRequestMethods.File.DownloadFile;
var res = (FileWebResponse)req.GetResponse();
//using (var receiveStream = res.GetResponseStream())
//using (var readStream = new StreamReader(receiveStream,encoding))
//{
// stringResponse = readStream.ReadToEnd();
//}
return stringResponse="0K";
}
catch (WebException webException)
{
throw webException;
}
}
Usage is here:
public dynamic LoadRoomMsg(IAccount account, string roomId)
{
try
{
string uri = string.Format("http://www-pokec.azet.sk/_s/chat/nacitajPrispevky.php?{0}&lok={1}&lastMsg=0&pub=0&prv=0&r=1295633087203&changeroom=1" , account.SessionId, roomId);
var htmlStringResult = HttpFileGetReq(new Uri(uri), ReqTimeout, EncodingType);
//var htmlStringResult = _httpReq.HttpGetReq(new Uri(string.Format("{0}{1}?{2}&lok=", PokecUrl.RoomMsg,account.SessionId,roomId)),
// ReqTimeout, account.Cookies, EncodingType);
if (!string.IsNullOrEmpty(htmlStringResult))
{
return true;
}
return false;
}
catch (Exception exception)
{
throw exception;
}
}
The URL of the file is here.
I would like to read this file into a string variable, that's all. If anyone has some time and can help me, I would be very grateful.
Your URL (http://...) will produce a HttpWebRequest. You can check with the debugger.
From MSDN:
The FileWebRequest class implements the WebRequest abstract base class for Uniform Resource Identifiers (URIs) that use the file:// scheme to request local files.
Note the file:// and local files in there.
Tip: Just use the WebClient class.
Rather than implementing your own web streams, let the .NET Framework do it all for you with WebClient, for example:
string uri = string.Format(
"http://www-pokec.azet.sk/_s/chat/nacitajPrispevky.php?{0}&lok={1}&lastMsg=0&pub=0&prv=0&r=1295633087203&changeroom=1",
account.SessionId,
roomId);
System.Net.WebClient wc = new System.Net.WebClient();
string webData = wc.DownloadString(uri);
// ...parse the webData response here...
Looking at the response from the URL you posted:
{"reason":0}
parsing that should be a simple task with a little string manipulation.
Change FileWebRequest and FileWebResponse to HttpWebRequest and HttpWebResponse.
It doesn't matter that what you're downloading may be a file; as far as the .NET Framework is concerned, you're just retrieving a page from a website.
FileWebRequest is for file:// protocols. Since you're using an http:// url, you want to use HttpWebRequest.
public string HttpFileGetReq(Uri uri, int reqTimeout, Encoding encoding)
{
string stringResponse;
var req = (HttpWebRequest)WebRequest.Create(uri);
req.Timeout = reqTimeout;
var res = (HttpWebResponse)req.GetResponse();
using (var receiveStream = res.GetResponseStream())
{
using (var readStream = new StreamReader(receiveStream,encoding))
{
return readStream.ReadToEnd();
}
}
}
