get downloaded file from URL and Illegal characters in path

string uri = "https://sometest.com/l/admin/ical.html?t=TD61C7NibbV0m5bnDqYC_q";
string filePath = "D:\\Data\\Name";
WebClient webClient = new WebClient();
webClient.DownloadFile(uri, (filePath + "/" + uri.Substring(uri.LastIndexOf('/'))));
// filePath + "/" + uri.Substring(uri.LastIndexOf('/')) evaluates to "D:\\Data\\Name//ical.html?t=TD61C7NibbV0m5bnDqYC_q"
Opening the full URI in a browser automatically downloads an iCal file. The file name is room113558101.ics (not that this will help).
How can I download the file correctly?

You are building the file path incorrectly, which results in an invalid file name (ical.html?t=TD61C7NibbV0m5bnDqYC_q); the ? character is not allowed in Windows file names. Instead, use the Uri.Segments property and take the last path segment (which is ical.html in this case). Also, don't combine file paths by hand; use Path.Combine:
var uri = new Uri("https://sometest.com/l/admin/ical.html?t=TD61C7NibbV0m5bnDqYC_q");
var lastSegment = uri.Segments[uri.Segments.Length - 1];
string directory = "D:\\Data\\Name";
string filePath = Path.Combine(directory, lastSegment);
WebClient webClient = new WebClient();
webClient.DownloadFile(uri, filePath);
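As a rough sketch, the same idea can be wrapped in a small helper. The class name `UrlFileNames`, the method `FromUrl`, and the `download` fallback are made up for illustration. The query string never reaches the file name because `Uri.Segments` covers only the path part of the URL.

```csharp
using System;

public static class UrlFileNames
{
    // Returns the last path segment of an absolute URL, ignoring the query string.
    // Falls back to a caller-supplied name when the URL ends in '/'
    // (that is, when there is no file-like last segment).
    public static string FromUrl(string url, string fallback = "download")
    {
        var uri = new Uri(url);
        // Segments covers only the path, e.g. "/", "l/", "admin/", "ical.html"
        string last = uri.Segments[uri.Segments.Length - 1];
        if (last.EndsWith("/"))
        {
            return fallback;
        }
        // Segments are percent-encoded, so decode "%20" and similar escapes
        return Uri.UnescapeDataString(last);
    }
}
```

A decoded name can still contain characters that are invalid on the target file system, so further sanitizing may be needed before passing it to Path.Combine.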
To answer your edited question about getting the correct file name: in this case you don't know the correct file name until you make a request to the server and get a response. The file name, if the server provides one, is in the Content-Disposition response header. So you can do it like this:
var uri = new Uri("https://sometest.com/l/admin/ical.html?t=TD61C7NibbV0m5bnDqYC_q");
string directory = "D:\\Data\\Name";
WebClient webClient = new WebClient();
// make a request to the server with `OpenRead`. This fetches the response headers but does not read the whole response into memory
using (var stream = webClient.OpenRead(uri)) {
    // get and parse the Content-Disposition header, if any
    var cdRaw = webClient.ResponseHeaders["Content-Disposition"];
    string filePath;
    if (!String.IsNullOrWhiteSpace(cdRaw)) {
        filePath = Path.Combine(directory, new System.Net.Mime.ContentDisposition(cdRaw).FileName);
    }
    else {
        // if there is no such header, fall back to the previous approach
        filePath = Path.Combine(directory, uri.Segments[uri.Segments.Length - 1]);
    }
    // copy the response stream to the target file
    using (var fs = File.Create(filePath)) {
        stream.CopyTo(fs);
    }
}

Related

AzureDevops Api: Get item API with download true return a json

I'm trying to download a Git File using C#. I use the following code:
Stream response = await client.GetStreamAsync(url);
var splitpath = path.Split("/");
Stream file = File.OpenWrite(splitpath[splitpath.Length - 1]);
await response.CopyToAsync(file);
response.Close();
file.Close();
Following this documentation, I use the following url:
string url = mainurl + name + "/_apis/git/repositories/" + rep + "/items?path=" + path + "&download=true&api-version=6.0";
but the saved file contains JSON with various links and information about the Git file instead of the file's content.
To check if all was working well, I tried to download it in a zip format, using the following url:
string url = mainurl + name + "/_apis/git/repositories/" + rep + "/items?path=" + path + "&$format=zip";
And it works fine, the file downloaded is a zip file containing the original file with its content...
Can someone help me? Thanks
P.S. I know that I can set IncludeContent to True, and get the content in the json, but I need the original file.

Since you are using C#, here is a C# sample that downloads the repository as a zip archive and extracts the original files:
using RestSharp;
using System;
using System.IO;
using System.IO.Compression;
namespace xxx
{
    class Program
    {
        static void Main(string[] args)
        {
            string OrganizationName = "xxx";
            string ProjectName = "xxx";
            string RepositoryName = "xxx";
            // assumption: this value is already the Base64 encoding of ":<PAT>", as the Basic scheme expects
            string Personal_Access_Token = "xxx";
            string archive_path = "./" + RepositoryName + ".zip";
            string extract_path = "./" + RepositoryName;
            string url = "https://dev.azure.com/" + OrganizationName + "/" + ProjectName + "/_apis/git/repositories/" + RepositoryName + "/items?$format=zip&api-version=6.0";
            var client = new RestClient(url);
            //client.Timeout = -1;
            var request = new RestRequest(url, Method.Get);
            request.AddHeader("Authorization", "Basic " + Personal_Access_Token);
            var response = client.Execute(request);
            // save the zip file to the same path that is extracted below
            File.WriteAllBytes(archive_path, response.RawBytes);
            // remove any previous extraction, then unzip the file
            if (Directory.Exists(extract_path))
            {
                Directory.Delete(extract_path, true);
            }
            ZipFile.ExtractToDirectory(archive_path, extract_path);
        }
    }
}
This worked on my side. Let me know whether it works on yours.

var personalaccesstoken = "xyz....";
using (HttpClient client = new HttpClient())
{
    client.DefaultRequestHeaders.Accept.Add(
        new System.Net.Http.Headers.MediaTypeWithQualityHeaderValue("*/*")); // this did the magic for me
    client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Basic",
        Convert.ToBase64String(
            System.Text.ASCIIEncoding.ASCII.GetBytes(
                string.Format("{0}:{1}", "", personalaccesstoken))));
    using (Stream stream = await client.GetStreamAsync(
        "https://dev.azure.com/fabrikam/myproj/_apis/git/repositories/myrepoid/items?path=%2Fsrc%2Ffolder%2Ffile.txt&api-version=7.0")) // no download argument
    {
        StreamReader sr = new StreamReader(stream);
        var text = sr.ReadToEnd();
        return text; // text has the content of the source file
    }
}
There is no need for the download parameter in the URL.
The Accept request header should not be JSON.
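A small sketch of the two pieces that are easy to get wrong here: the Basic authorization value built from a personal access token (empty user name, a colon, then the token, Base64-encoded, as in the answer above) and a percent-encoded `path` query value. The names `AzureDevOpsRequests`, `BasicAuthValue`, and `ItemUrl` are hypothetical, and the URL layout simply mirrors the one used above.

```csharp
using System;
using System.Text;

public static class AzureDevOpsRequests
{
    // Base64 of ":<token>" (empty user name), for use with the "Basic" scheme.
    public static string BasicAuthValue(string personalAccessToken)
    {
        return Convert.ToBase64String(Encoding.ASCII.GetBytes(":" + personalAccessToken));
    }

    // Builds the items URL with the repository path percent-encoded,
    // so "/src/folder/file.txt" becomes "%2Fsrc%2Ffolder%2Ffile.txt".
    public static string ItemUrl(string organization, string project, string repository, string path)
    {
        return "https://dev.azure.com/" + organization + "/" + project
            + "/_apis/git/repositories/" + repository
            + "/items?path=" + Uri.EscapeDataString(path)
            + "&api-version=7.0";
    }
}
```

The first value would go into `new AuthenticationHeaderValue("Basic", ...)`, and the second into `GetStreamAsync`. To keep the original file rather than its text, the returned stream can be copied to a FileStream instead of being read with a StreamReader.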

Get original filename when downloading with WebClient

Is there any way to know the original name of a file you download using the WebClient when the Uri doesn't contain the name?
This happens for example in sites where the download originates from a dynamic page where the name isn't known beforehand.
Using my browser, the file gets the correct name. But how can this be done using the WebClient?
E.g.
WebClient wc = new WebClient();
var data = wc.DownloadData(@"www.sometime.com\getfile?id=123");
Using DownloadFile() isn't a solution since this method needs a filename in advance.

You need to examine the response headers and see whether a Content-Disposition header is present that includes the actual file name.
WebClient wc = new WebClient();
var data = wc.DownloadData(@"www.sometime.com\getfile?id=123");
string fileName = "";
// Try to extract the filename from the Content-Disposition header
if (!String.IsNullOrEmpty(wc.ResponseHeaders["Content-Disposition"]))
{
    fileName = wc.ResponseHeaders["Content-Disposition"].Substring(wc.ResponseHeaders["Content-Disposition"].IndexOf("filename=") + 9).Replace("\"", "");
}

Read the Content-Disposition response header via WebClient.ResponseHeaders.
It should look like this:
Content-Disposition: attachment; filename="fname.ext"
Your code should look like:
string header = wc.ResponseHeaders["Content-Disposition"] ?? string.Empty;
const string filename = "filename=";
int index = header.LastIndexOf(filename, StringComparison.OrdinalIgnoreCase);
if (index > -1)
{
    fileName = header.Substring(index + filename.Length);
}

To get the filename without downloading the file:
public string GetFilenameFromWebServer(string url)
{
    string result = "";
    var req = System.Net.WebRequest.Create(url);
    req.Method = "HEAD";
    using (System.Net.WebResponse resp = req.GetResponse())
    {
        // Try to extract the filename from the Content-Disposition header
        if (!string.IsNullOrEmpty(resp.Headers["Content-Disposition"]))
        {
            result = resp.Headers["Content-Disposition"].Substring(resp.Headers["Content-Disposition"].IndexOf("filename=") + 9).Replace("\"", "");
        }
    }
    return result;
}

If you, like me, have to deal with a Content-Disposition header that is not formatted correctly or cannot be parsed automatically by the ContentDisposition class for some reason, here's my solution:
string fileName = null;
// Getting the file name
var request = WebRequest.Create(url);
request.Method = "HEAD";
using (var response = request.GetResponse())
{
    // The headers are not correct, so we need to parse manually
    var contentDisposition = response.Headers["Content-Disposition"];
    // Delete everything up to and including 'filename="' (matched case-insensitively)
    var fileNameMarker = "filename=\"";
    var beginIndex = contentDisposition.ToLower().IndexOf(fileNameMarker);
    contentDisposition = contentDisposition.Substring(beginIndex + fileNameMarker.Length);
    // Keep only the string up to the next double quote
    var fileNameLength = contentDisposition.ToLower().IndexOf("\"");
    fileName = contentDisposition.Substring(0, fileNameLength);
}
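The manual approaches above can be folded into one null-safe helper. This is only a sketch under simple assumptions: the names `ContentDispositionNames` and `GetFileName` are invented here, it handles the plain `filename=` parameter with or without quotes, it ignores the extended `filename*=` form, and it would mis-split a quoted name that itself contains a semicolon.

```csharp
using System;

public static class ContentDispositionNames
{
    // Extracts the plain "filename" parameter from a Content-Disposition value.
    // Returns null when the header is missing or has no such parameter.
    public static string GetFileName(string headerValue)
    {
        if (string.IsNullOrWhiteSpace(headerValue))
        {
            return null;
        }

        foreach (string part in headerValue.Split(';'))
        {
            string trimmed = part.Trim();
            // Matches "filename=" but not the extended "filename*=" form
            if (trimmed.StartsWith("filename=", StringComparison.OrdinalIgnoreCase))
            {
                string value = trimmed.Substring("filename=".Length).Trim().Trim('"');
                return value.Length > 0 ? value : null;
            }
        }
        return null;
    }
}
```

It could be called with `wc.ResponseHeaders["Content-Disposition"]` after a download, or with `resp.Headers["Content-Disposition"]` after a HEAD request.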

Get meta data of a file using c#

I need to read a file's metadata using C#. The file is stored on a third-party site.
I am able to download the file from that server, but I can't get the original metadata of the file I downloaded.
How can I achieve this in C#? Below is my code.
string FilePath = AppDomain.CurrentDomain.BaseDirectory + @"Downloads\";
string Url = txtUrl.Text.Trim();
Uri _Url = new Uri(Url);
System.Net.HttpWebRequest request = (System.Net.HttpWebRequest)System.Net.WebRequest.Create(_Url);
request.Timeout = Timeout.Infinite;
System.Net.HttpWebResponse response = (System.Net.HttpWebResponse)request.GetResponse();
response.Close();
if (response.ContentType != "text/html; charset=UTF-8")
{
    string FileSize = response.Headers.Get("Content-Length");
    int lastindex = Url.LastIndexOf("/");
    string TempUrlName = Url.Substring(lastindex + 1, Url.Length - (lastindex + 1));
    WebClient oWebClient = new WebClient();
    oWebClient.DownloadFile(txtUrl.Text.Trim(), FilePath + @"\" + TempUrlName);
    if (File.Exists(FilePath + @"\" + TempUrlName))
    {
        FileInfo oInfo = new FileInfo(FilePath + @"\" + TempUrlName);
        DateTime time = oInfo.CreationTime;
        time = oInfo.LastAccessTime;
        time = oInfo.LastWriteTime;
    }
}
I can get the file size, creation time, last access time, and last write time only after saving the file locally. But I need the file's metadata while it is still on the server.
Thanks

Since those are properties stored in the file system and changed once you save them locally, you won't be able to access those via HTTP.
Do you have any influence on the third party? Maybe have them send those properties along in the headers?
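For what HTTP itself can carry, a minimal sketch follows. It assumes the third-party server sends the standard Content-Length, Last-Modified, and Content-Type headers, which it is not obliged to do; the names `RemoteFileInfo` and `FromResponse` are illustrative only. Creation and last access times have no standard HTTP header, so they are not covered.

```csharp
using System;
using System.Net.Http;

public static class RemoteFileInfo
{
    // Reads the metadata a server chose to expose in its response headers.
    // Any of the values may be null when the corresponding header was not sent.
    public static (long? Length, DateTimeOffset? LastModified, string MediaType) FromResponse(HttpResponseMessage response)
    {
        var headers = response.Content.Headers;
        return (headers.ContentLength, headers.LastModified, headers.ContentType?.MediaType);
    }
}
```

One way to use it without downloading the body is to send an `HttpRequestMessage` created with `HttpMethod.Head` through `HttpClient.SendAsync` and pass the response to `FromResponse`, provided the server supports HEAD requests.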

Don't know which file I'm downloading

I'm trying to download a file, from a link that looks like:
www.sample.com/download.php?id=1234231
I don't know which file I'll get from this link.
First I tried webclient.downloadfile(link,path) - but the path I gave as the folder that the file should be in gave me an access denied error.
My problem is that I can't determine the file I'll get.
I've tried something like:
var wreq = (HttpWebRequest)HttpWebRequest.Create(link);
using (var res = (HttpWebResponse)wreq.GetResponse())
{
    using (var reader = new StreamReader(res.GetResponseStream()))
    {
        // get the filename header
        var filenameHeader =
            res.GetResponseHeader("Content-Disposition")
                .Split(';')
                .Where(s => s.Contains("filename"))
                .ToList()[0];
        var fileName = filenameHeader.Replace(" ", "").Split('=')[1];
        // clean up fileName
        fileName = fileName.Replace(":", "");
        using (var writer = new StreamWriter(Path.Combine(folderToSave, fileName), false))
        {
            writer.Write(reader.ReadToEnd());
        }
    }
}
Isn't there something simpler than that?
Is there any chance that I will download a file and not get a "Content-Disposition" header?
One last thing: at the moment I'm writing the file using a StreamWriter, but the resulting file is corrupted. I assume this is related to not writing in binary format, but I'm not sure.
I've also checked the "Content-Length" header, and its value differed from response.GetResponse().ToString().Length. Maybe the headers are counted in the length as well?

You can extend the WebClient class for this:
class MyWebClient : WebClient
{
    public string FileName { get; private set; }
    protected override WebResponse GetWebResponse(WebRequest request)
    {
        WebResponse response = base.GetWebResponse(request);
        FileName = Regex.Match(((HttpWebResponse)response).Headers["Content-Disposition"], "filename=(.+?)$").Result("$1");
        string regexSearch = new string(Path.GetInvalidFileNameChars()) + new string(Path.GetInvalidPathChars());
        Regex r = new Regex(string.Format("[{0}]", Regex.Escape(regexSearch)));
        FileName = r.Replace(FileName, "-");
        return response;
    }
}
Usage:
MyWebClient mwc = new MyWebClient();
byte[] bytes = mwc.DownloadData("http://subtitle.co.il//downloadsubtitle.php?id=202500");
File.WriteAllBytes(Path.Combine(folderToSave, mwc.FileName), bytes);
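The corruption described in the question is consistent with binary data passing through a text decoder, and Content-Length counts the bytes of the response body, not characters and not headers. A small self-contained sketch of the difference (the class and method names are illustrative, and UTF-8 is assumed because it is the default for StreamReader and StreamWriter):

```csharp
using System.IO;
using System.Text;

public static class BinaryCopyDemo
{
    // Round-trips bytes through a text decoder, similar to what
    // StreamReader.ReadToEnd followed by StreamWriter.Write does.
    // Bytes that are not valid UTF-8 are replaced, so the data changes.
    public static byte[] ThroughText(byte[] payload)
    {
        string text = Encoding.UTF8.GetString(payload);
        return Encoding.UTF8.GetBytes(text);
    }

    // Copies the bytes stream-to-stream without any decoding.
    public static byte[] ThroughStreams(byte[] payload)
    {
        using (var source = new MemoryStream(payload))
        using (var target = new MemoryStream())
        {
            source.CopyTo(target);
            return target.ToArray();
        }
    }
}
```

With a payload such as the first bytes of a JPEG file (0xFF, 0xD8, 0xFF, 0xE0), the text round trip returns different bytes, while the stream copy returns them unchanged. DownloadData with File.WriteAllBytes, as in the usage above, stays on the binary path.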

ASP.NET Download Image file from URL with querystring

I'm trying to download an image from a URL. The URL has a security key appended to the end and I keep getting the following error:
System.Net.WebException: An exception occurred during a WebClient request. ---> System.ArgumentException: Illegal characters in path
I'm not sure of the correct syntax to use for this. Here is my code:
string remoteImgPath = "https://mysource.com/2012-08-01/Images/front/y/123456789.jpg?api_key=RgTYUSXe7783u45sRR";
string fileName = Path.GetFileName(remoteImgPath);
string localPath = AppDomain.CurrentDomain.BaseDirectory + "LocalFolder\\Images\\Originals\\" + fileName;
WebClient webClient = new WebClient();
webClient.DownloadFile(remoteImgPath, localPath);
return localPath;

I think that this is what you're looking for:
string remoteImgPath = "https://mysource.com/2012-08-01/Images/front/y/123456789.jpg?api_key=RgTYUSXe7783u45sRR";
Uri remoteImgPathUri = new Uri(remoteImgPath);
string remoteImgPathWithoutQuery = remoteImgPathUri.GetLeftPart(UriPartial.Path);
string fileName = Path.GetFileName(remoteImgPathWithoutQuery);
string localPath = AppDomain.CurrentDomain.BaseDirectory + "LocalFolder\\Images\\Originals\\" + fileName;
WebClient webClient = new WebClient();
webClient.DownloadFile(remoteImgPath, localPath);
return localPath;
