I'm trying to download a file, from a link that looks like:
www.sample.com/download.php?id=1234231
I don't know which file I'll get from this link.
First I tried WebClient.DownloadFile(link, path), but when I passed the folder the file should be saved in as the path, I got an access denied error.
My problem is that I can't determine the file I'll get.
I've tried something like:
var wreq = (HttpWebRequest)HttpWebRequest.Create(link);
using (var res = (HttpWebResponse) wreq.GetResponse())
{
using (var reader = new StreamReader(res.GetResponseStream()))
{
//get filename Header
var filenameHeader =
res.GetResponseHeader("Content-Disposition")
.Split(';')
.Where(s => s.Contains("filename"))
.ToList()[
0];
var fileName = filenameHeader.Replace(" ", "").Split('=')[1];
//clear fileName
fileName = fileName.Replace(":", "");
using (var writer = new StreamWriter(Path.Combine(folderToSave, fileName), false))
{
writer.Write(reader.ReadToEnd());
}
}
}
Isn't there something simpler than that?
Is there any chance that I will download a file and not get a "Content-Disposition" header?
Last thing, at the moment I'm trying to write the file using a StreamWriter but the resulting file is corrupted. I assume that this is something related to not writing in binary format, but I'm not sure.
I've also checked the "Content-Length" header, and its value was different from response.GetResponse().ToString().Length. Maybe the header is counted in the length as well?
You can extend WebClient class for this
class MyWebClient : WebClient
{
public string FileName { get; private set; }
protected override WebResponse GetWebResponse(WebRequest request)
{
WebResponse response = base.GetWebResponse(request);
FileName = Regex.Match(((HttpWebResponse)response).Headers["Content-Disposition"], "filename=(.+?)$").Result("$1");
string regexSearch = new string(Path.GetInvalidFileNameChars()) + new string(Path.GetInvalidPathChars());
Regex r = new Regex(string.Format("[{0}]", Regex.Escape(regexSearch)));
FileName = r.Replace(FileName, "-");
return response;
}
}
Usage:
MyWebClient mwc = new MyWebClient();
byte[] bytes = mwc.DownloadData("http://subtitle.co.il//downloadsubtitle.php?id=202500");
File.WriteAllBytes(Path.Combine(folderToSave, mwc.FileName), bytes);
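On the corruption mentioned in the question: a StreamReader/StreamWriter pair decodes the response as text and re-encodes it, which changes any bytes that are not valid in that encoding. DownloadData and File.WriteAllBytes avoid this because they work on raw bytes. Here is a minimal sketch of the difference, using in-memory streams instead of a real response. The class and method names are made up for illustration.

```csharp
using System;
using System.IO;

static class BinaryCopyDemo
{
    // Copies the data as text, the way the code in the question does.
    // Bytes that are not valid UTF-8 are replaced during decoding.
    public static byte[] CopyAsText(byte[] input)
    {
        using (var source = new MemoryStream(input))
        using (var target = new MemoryStream())
        {
            using (var reader = new StreamReader(source))
            using (var writer = new StreamWriter(target))
            {
                writer.Write(reader.ReadToEnd());
            }
            // ToArray still works after the writer has closed the stream.
            return target.ToArray();
        }
    }

    // Copies the data byte for byte, with no decoding step.
    public static byte[] CopyAsBytes(byte[] input)
    {
        using (var source = new MemoryStream(input))
        using (var target = new MemoryStream())
        {
            source.CopyTo(target);
            return target.ToArray();
        }
    }
}
```

For a real download the same idea applies: copy res.GetResponseStream() into a FileStream with CopyTo instead of going through ReadToEnd.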
I'm trying to download a Git File using C#. I use the following code:
Stream response = await client.GetStreamAsync(url);
var splitpath = path.Split("/");
Stream file = File.OpenWrite(splitpath[splitpath.Length - 1]);
response.CopyToAsync(file);
response.Close();
file.Close();
Following this documentation, I use the following url:
string url = mainurl + name + "/_apis/git/repositories/" + rep + "/items?path=" + path + "&download=true&api-version=6.0";
but the file saved contains a json containing different links and information about the git file.
To check if all was working well, I tried to download it in a zip format, using the following url:
string url = mainurl + name + "/_apis/git/repositories/" + rep + "/items?path=" + path + "&$format=zip";
And it works fine, the file downloaded is a zip file containing the original file with its content...
Can someone help me? Thanks
P.S. I know that I can set IncludeContent to True, and get the content in the json, but I need the original file.
Since you are using C#, I will give you a C# sample to get the original files:
using RestSharp;
using System;
using System.IO;
using System.IO.Compression;
namespace xxx
{
class Program
{
static void Main(string[] args)
{
string OrganizationName = "xxx";
string ProjectName = "xxx";
string RepositoryName = "xxx";
string Personal_Access_Token = "xxx";
string archive_path = "./"+RepositoryName+".zip";
string extract_path = "./"+RepositoryName+"";
string url = "https://dev.azure.com/"+OrganizationName+"/"+ProjectName+"/_apis/git/repositories/"+RepositoryName+"/items?$format=zip&api-version=6.0";
var client = new RestClient(url);
//client.Timeout = -1;
var request = new RestRequest(url, Method.Get);
request.AddHeader("Authorization", "Basic "+Personal_Access_Token);
var response = client.Execute(request);
//save the zip file to archive_path so the extraction step below can find it
File.WriteAllBytes(archive_path, response.RawBytes);
//unzip the file, replacing any previous extraction
if (Directory.Exists(extract_path))
{
Directory.Delete(extract_path, true);
}
ZipFile.ExtractToDirectory(archive_path, extract_path);
}
}
}
This works on my side. Let me know whether it works on yours.
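One thing to watch in the sample above: it appends Personal_Access_Token directly after "Basic ". If that variable holds the raw token rather than an already encoded value, the header will probably be rejected, because Basic authentication expects Base64-encoded "user:password" credentials. With a personal access token the user name is left empty. A small sketch of building the header, assuming a raw token (the helper name is hypothetical):

```csharp
using System;
using System.Net.Http.Headers;
using System.Text;

static class AzureDevOpsAuth
{
    // Builds a Basic Authorization header from a raw personal access token.
    // Assumption: the user name is empty, so the credentials are ":" + token.
    public static AuthenticationHeaderValue FromPersonalAccessToken(string token)
    {
        string credentials = Convert.ToBase64String(Encoding.ASCII.GetBytes(":" + token));
        return new AuthenticationHeaderValue("Basic", credentials);
    }
}
```

With RestSharp this could be passed as request.AddHeader("Authorization", AzureDevOpsAuth.FromPersonalAccessToken(Personal_Access_Token).ToString()).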
var personalaccesstoken = "xyz....";
using (HttpClient client = new HttpClient())
{
client.DefaultRequestHeaders.Accept.Add(
new System.Net.Http.Headers.MediaTypeWithQualityHeaderValue("*/*")); //this did the magic for me
client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Basic",
Convert.ToBase64String(
System.Text.ASCIIEncoding.ASCII.GetBytes(
string.Format("{0}:{1}", "", personalaccesstoken))));
using (Stream stream = await client.GetStreamAsync(
"https://dev.azure.com/fabrikam/myproj/_apis/git/repositories/myrepoid/items?path=%2Fsrc%2Ffolder%2Ffile.txt&api-version=7.0")) //no download arg
{
StreamReader sr = new StreamReader(stream);
var text = sr.ReadToEnd();
return text; // text has the content of the source file
}
}
There is no need for the download parameter in the URL.
The Accept request header should not ask for JSON.
I have an API that uses the POST method. I can download the file from this API with Postman, but I would like to know how to download it from C# code. I tried the code below, but the POST method is not allowed to download the file.
Code:
using (var client = new WebClient())
{
client.Headers.Add("X-Cleartax-Auth-Token", ConfigurationManager.AppSettings["auth-token"]);
client.Headers[HttpRequestHeader.ContentType] = "application/json";
string url = ConfigurationManager.AppSettings["host"] + ConfigurationManager.AppSettings["taxable_entities"] + "/ewaybill/download?print_type=detailed";
TransId Id = new TransId()
{
id = TblHeader.Rows[0]["id"].ToString()
};
List<string> ids = new List<string>();
ids.Add(TblHeader.Rows[0]["id"].ToString());
string DATA = JsonConvert.SerializeObject(ids, Newtonsoft.Json.Formatting.Indented);
string res = client.UploadString(url, "POST",DATA);
client.DownloadFile(url, ConfigurationManager.AppSettings["InvoicePath"].ToString() + CboGatePassNo.EditValue.ToString().Replace("/", "-") + ".pdf");
}
Postman:
URL: https://ewbbackend-preprodpub-http.internal.cleartax.co/gst/v0.1/taxable_entities/1c74ddd2-6383-4f4b-a7a5-007ddd08f9ea/ewaybill/download?print_type=detailed
Headers:
Content-Type: application/json
X-Cleartax-Auth-Token: b1f57327-96db-4829-97cf-2f3a59a3a548
Body:
[
"GLD24449"
]
using (WebClient client = new WebClient())
{
client.Headers.Add("X-Cleartax-Auth-Token", ConfigurationManager.AppSettings["auth-token"]);
client.Headers[HttpRequestHeader.ContentType] = "application/json";
string url = ConfigurationManager.AppSettings["host"] + ConfigurationManager.AppSettings["taxable_entities"] + "/ewaybill/download?print_type=detailed";
client.Encoding = Encoding.UTF8;
//var data = "[\"GLD24449\"]";
var data = UTF8Encoding.UTF8.GetBytes(TblHeader.Rows[0]["id"].ToString());
byte[] r = client.UploadData(url, data);
using (var stream = System.IO.File.Create("FilePath"))
{
stream.Write(r, 0, r.Length);
}
}
Try this, and remember to change the file path. Since the data you posted is not valid JSON, I decided to post the data this way.
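If the endpoint does expect the JSON array shown in the Postman body, the request bytes can be built by serializing the list of ids rather than sending the bare id. This is a sketch under that assumption. It uses System.Text.Json instead of the Newtonsoft.Json calls from the question, and the class and method names are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

static class EwaybillRequest
{
    // Serializes the ids as a JSON array, e.g. ["GLD24449"],
    // and returns the UTF-8 bytes to pass to UploadData.
    public static byte[] BuildBody(IEnumerable<string> ids)
    {
        string json = JsonSerializer.Serialize(ids);
        return Encoding.UTF8.GetBytes(json);
    }
}
```

The result would replace the data variable above: client.UploadData(url, EwaybillRequest.BuildBody(ids)).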
I think it's straightforward, but instead of WebClient you can use HttpClient, which is the better choice.
For downloading with HttpClient, see "Download file with WebClient or HttpClient?".
For a comparison of the two, see "Deciding between HttpClient and WebClient".
Example using WebClient:
public static void Main(string[] args)
{
string path = @"download.pdf";
// Delete the file if it exists.
if (File.Exists(path))
{
File.Delete(path);
}
var uri = new Uri("https://ewbbackend-preprodpub-http.internal.cleartax.co/gst/v0.1/taxable_entities/1c74ddd2-6383-4f4b-a7a5-007ddd08f9ea/ewaybill/download?print_type=detailed");
WebClient client = new WebClient();
client.Headers[HttpRequestHeader.ContentType] = "application/json";
client.Headers.Add("X-Cleartax-Auth-Token", "b1f57327-96db-4829-97cf-2f3a59a3a548");
client.Encoding = Encoding.UTF8;
var data = UTF8Encoding.UTF8.GetBytes("[\"GLD24449\"]");
byte[] r = client.UploadData(uri, data);
using (var stream = System.IO.File.Create(path))
{
stream.Write(r, 0, r.Length);
}
}
Here is the sample code using HttpClient; don't forget to change the path.
public class Program
{
public static async Task Main(string[] args)
{
string path = @"download.pdf";
// Delete the file if it exists.
if (File.Exists(path))
{
File.Delete(path);
}
var uri = new Uri("https://ewbbackend-preprodpub-http.internal.cleartax.co/gst/v0.1/taxable_entities/1c74ddd2-6383-4f4b-a7a5-007ddd08f9ea/ewaybill/download?print_type=detailed");
HttpClient client = new HttpClient();
var request = new HttpRequestMessage(HttpMethod.Post, uri)
{
Content = new StringContent("[\"GLD24449\"]", Encoding.UTF8, "application/json")
};
request.Headers.Add("X-Cleartax-Auth-Token", "b1f57327-96db-4829-97cf-2f3a59a3a548");
var response = await client.SendAsync(request);
if (response.IsSuccessStatusCode)
{
using (FileStream fs = File.Create(path))
{
await response.Content.CopyToAsync(fs);
}
}
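else
{
// report the failure instead of silently ignoring it
Console.WriteLine("Download failed: " + (int)response.StatusCode + " " + response.ReasonPhrase);
}
}
}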
I'm using Convert API to convert docx to PDF. With the old API version everything works well, but I'm trying to migrate to the new API version, and the resulting PDF is not a valid document and will not open. I'm not sure what I am doing wrong; maybe something about the encoding?
The response that I get from Convert API is a JSON with the File Name, File Size and File Data. Maybe this File Data needs to be processed to create a valid PDF file? If I just write that data to a file, it does not work.
public string ConvertReportToPDF(string fileName)
{
string resultFileName = "";
key = "xxxxx";
var requestContent = new MultipartFormDataContent();
var fileStream = System.IO.File.OpenRead(fileName);
var stream = new StreamContent(fileStream);
requestContent.Add(stream, "File", fileStream.Name);
var response = new HttpClient().PostAsync("https://v2.convertapi.com/docx/to/pdf?Secret=" + key, requestContent).Result;
FileReportResponse responseDeserialized = JsonConvert.DeserializeObject<FileReportResponse>(response.Content.ReadAsStringAsync().Result);
var path = SERVER_TEMP_PATH + "\\" + responseDeserialized.Files.First().FileName;
System.IO.File.WriteAllText(path, responseDeserialized.Files.First().FileData);
return responseDeserialized.Files.First().FileName;
}
The file data in the JSON is Base64 encoded; decode it before writing it to a file.
public string ConvertReportToPDF(string fileName)
{
string resultFileName = "";
key = "xxxxx";
var requestContent = new MultipartFormDataContent();
var fileStream = System.IO.File.OpenRead(fileName);
var stream = new StreamContent(fileStream);
requestContent.Add(stream, "File", fileStream.Name);
var response = new HttpClient().PostAsync("https://v2.convertapi.com/docx/to/pdf?Secret=" + key, requestContent).Result;
FileReportResponse responseDeserialized = JsonConvert.DeserializeObject<FileReportResponse>(response.Content.ReadAsStringAsync().Result);
var path = SERVER_TEMP_PATH + "\\" + responseDeserialized.Files.First().FileName;
System.IO.File.WriteAllBytes(path, Convert.FromBase64String(responseDeserialized.Files.First().FileData));
return responseDeserialized.Files.First().FileName;
}
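The decoding step can be isolated into a small helper, which makes it easy to check on its own. This is only a sketch: it assumes FileData is plain Base64 with no prefix, and the class and method names are hypothetical.

```csharp
using System;
using System.IO;

static class ConvertedFile
{
    // Decodes the Base64 file data from the JSON response
    // and writes the raw bytes to the given path.
    public static void Save(string path, string base64FileData)
    {
        byte[] bytes = Convert.FromBase64String(base64FileData);
        File.WriteAllBytes(path, bytes);
    }
}
```

A quick sanity check is that the saved file starts with the bytes "%PDF"; in Base64 that means the FileData string begins with "JVBER".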
Why use a JSON response in C# when you can use a binary response instead? The response will be smaller and there is no need to decode it. To change the response type, add an accept: application/octet-stream header to the request to ask the server for a binary response. The whole code will look like this:
using System;
using System.Net;
using System.IO;
class MainClass {
public static void Main (string[] args) {
const string fileToConvert = "test.docx";
const string fileToSave = "test.pdf";
const string Secret="";
if (string.IsNullOrEmpty(Secret))
Console.WriteLine("The secret is missing, get one for free at https://www.convertapi.com/a");
else
try
{
Console.WriteLine("Please wait, converting!");
using (var client = new WebClient())
{
client.Headers.Add("accept", "application/octet-stream");
var resultFile = client.UploadFile(new Uri("http://v2.convertapi.com/docx/to/pdf?Secret=" + Secret), fileToConvert);
File.WriteAllBytes(fileToSave, resultFile );
Console.WriteLine("File converted successfully");
}
}
catch (WebException e)
{
Console.WriteLine("Status Code : {0}", ((HttpWebResponse)e.Response).StatusCode);
Console.WriteLine("Status Description : {0}", ((HttpWebResponse)e.Response).StatusDescription);
Console.WriteLine("Body : {0}", new StreamReader(e.Response.GetResponseStream()).ReadToEnd());
}
}
}
string uri = "https://sometest.com/l/admin/ical.html?t=TD61C7NibbV0m5bnDqYC_q";
string filePath = "D:\\Data\\Name";
WebClient webClient = new WebClient();
webClient.DownloadFile(uri, (filePath + "/" + uri.Substring(uri.LastIndexOf('/'))));
/// filePath + "/" + uri.Substring(uri.LastIndexOf('/')) = "D:\\Data\\Name//ical.html?t=TD61C7NibbV0m5bnDqYC_q"
When accessing the entire (string) uri, a .ical file is downloaded automatically... The file name is room113558101.ics (not that this will help).
How can I get the file correctly?
You are building your file path the wrong way, which results in an invalid file name (ical.html?t=TD61C7NibbV0m5bnDqYC_q). Instead, use the Uri.Segments property and take the last path segment (which will be ical.html in this case). Also, don't combine file paths by hand; use Path.Combine:
var uri = new Uri("https://sometest.com/l/admin/ical.html?t=TD61C7NibbV0m5bnDqYC_q");
var lastSegment = uri.Segments[uri.Segments.Length - 1];
string directory = "D:\\Data\\Name";
string filePath = Path.Combine(directory, lastSegment);
WebClient webClient = new WebClient();
webClient.DownloadFile(uri, filePath);
To answer your edited question about getting the correct file name: in this case you don't know the correct file name until you make a request to the server and get a response. The file name will be in the Content-Disposition header of the response, so you should do it like this:
var uri = new Uri("https://sometest.com/l/admin/ical.html?t=TD61C7NibbV0m5bnDqYC_q");
string directory = "D:\\Data\\Name";
WebClient webClient = new WebClient();
// make a request to server with `OpenRead`. This will fetch response headers but will not read whole response into memory
using (var stream = webClient.OpenRead(uri)) {
// get and parse Content-Disposition header if any
var cdRaw = webClient.ResponseHeaders["Content-Disposition"];
string filePath;
if (!String.IsNullOrWhiteSpace(cdRaw)) {
filePath = Path.Combine(directory, new System.Net.Mime.ContentDisposition(cdRaw).FileName);
}
else {
// if no such header - fallback to previous way
filePath = Path.Combine(directory, uri.Segments[uri.Segments.Length - 1]);
}
// copy response stream to target file
using (var fs = File.Create(filePath)) {
stream.CopyTo(fs);
}
}
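The name selection in the snippet above can also be pulled into a helper that has no network dependency, which makes the fallback logic easier to test. This is a sketch, not a definitive implementation: the class name, method name and the "download" default are assumptions. It additionally tolerates a malformed header and keeps only the last path component of whatever name the server sends.

```csharp
using System;
using System.IO;
using System.Net.Mime;

static class DownloadFileName
{
    // Picks a file name from the Content-Disposition header if possible,
    // otherwise from the last segment of the URI, otherwise a fixed fallback.
    public static string Resolve(string contentDispositionHeader, Uri uri, string fallback = "download")
    {
        string name = null;
        if (!string.IsNullOrWhiteSpace(contentDispositionHeader))
        {
            try
            {
                name = new ContentDisposition(contentDispositionHeader).FileName;
            }
            catch (FormatException)
            {
                // Malformed header: fall through to the URI-based name.
            }
        }
        if (string.IsNullOrWhiteSpace(name))
        {
            name = Uri.UnescapeDataString(uri.Segments[uri.Segments.Length - 1]);
        }
        // Keep only the final path component so the name cannot point outside the target directory.
        name = Path.GetFileName(name);
        return string.IsNullOrWhiteSpace(name) ? fallback : name;
    }
}
```

In the snippet above it would be called as Path.Combine(directory, DownloadFileName.Resolve(cdRaw, uri)).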
Is there any way to know the original name of a file you download using the WebClient when the Uri doesn't contain the name?
This happens for example in sites where the download originates from a dynamic page where the name isn't known beforehand.
Using my browser, the file gets the correct name. But how can this be done using the WebClient?
E.g.
WebClient wc= new WebClient();
var data= wc.DownloadData(@"www.sometime.com\getfile?id=123");
Using DownloadFile() isn't a solution since this method needs a filename in advance.
You need to examine the response headers and see if there is a content-disposition header present which includes the actual filename.
WebClient wc = new WebClient();
var data= wc.DownloadData(@"www.sometime.com\getfile?id=123");
string fileName = "";
// Try to extract the filename from the Content-Disposition header
if (!String.IsNullOrEmpty(wc.ResponseHeaders["Content-Disposition"]))
{
fileName = wc.ResponseHeaders["Content-Disposition"].Substring(wc.ResponseHeaders["Content-Disposition"].IndexOf("filename=") + 9).Replace("\"", "");
}
Read the "Content-Disposition" response header with WebClient.ResponseHeaders.
It should look like:
Content-Disposition: attachment; filename="fname.ext"
Your code should look like:
string header = wc.ResponseHeaders["Content-Disposition"]??string.Empty;
const string filename="filename=";
int index = header.LastIndexOf(filename,StringComparison.OrdinalIgnoreCase);
if (index > -1)
{
fileName = header.Substring(index + filename.Length).Trim('"'); // strip the surrounding quotes
}
To get the filename without downloading the file:
public string GetFilenameFromWebServer(string url)
{
string result = "";
var req = System.Net.WebRequest.Create(url);
req.Method = "HEAD";
using (System.Net.WebResponse resp = req.GetResponse())
{
// Try to extract the filename from the Content-Disposition header
if (!string.IsNullOrEmpty(resp.Headers["Content-Disposition"]))
{
result = resp.Headers["Content-Disposition"].Substring(resp.Headers["Content-Disposition"].IndexOf("filename=") + 9).Replace("\"", "");
}
}
return result;
}
If you, like me, have to deal with a Content-Disposition header that is not formatted correctly or cannot be parsed automatically by the ContentDisposition class for some reason, here's my solution :
string fileName = null;
// Getting file name
var request = WebRequest.Create(url);
request.Method = "HEAD";
using (var response = request.GetResponse())
{
// Headers are not correct... So we need to parse manually
var contentDisposition = response.Headers["Content-Disposition"];
// We delete everything up to and including 'Filename="'
var fileNameMarker= "filename=\"";
var beginIndex = contentDisposition.ToLower().IndexOf(fileNameMarker);
contentDisposition = contentDisposition.Substring(beginIndex + fileNameMarker.Length);
//We only get the string until the next double quote
var fileNameLength = contentDisposition.ToLower().IndexOf("\"");
fileName = contentDisposition.Substring(0, fileNameLength);
}