Don't know which file I'm downloading - c#

I'm trying to download a file, from a link that looks like:
www.sample.com/download.php?id=1234231
I don't know which file I'll get from this link.
First I tried webclient.downloadfile(link,path) - but the path I gave as the folder that the file should be in gave me an access denied error.
My problem is that I can't determine the file I'll get.
I've tried something like:
var wreq = (HttpWebRequest)HttpWebRequest.Create(link);
using (var res = (HttpWebResponse) wreq.GetResponse())
{
using (var reader = new StreamReader(res.GetResponseStream()))
{
//get filename Header
var filenameHeader =
res.GetResponseHeader("Content-Disposition")
.Split(';')
.Where(s => s.Contains("filename"))
.ToList()[
0];
var fileName = filenameHeader.Replace(" ", "").Split('=')[1];
//clear fileName
fileName = fileName.Replace(":", "");
using (var writer = new StreamReader(Path.Combine(folderToSave , fileName),FileMode.Create))
{
writer.Write(reader.ReadToEnd());
}
}
}
Isn't there something simpler than that?
Is is there any chance that I will download a file and not get a "Content-Disposition" header?
Last thing, at the moment I'm trying to write the file using a StreamWriter but the resulting file is corrupted. I assume that this is something related to not writing in binary format, but I'm not sure.
I've also checked the "Content-Length" header and it was a different value than the response.GetResponse().ToString().Length, maybe the header is counted it the length as well?

You can extend WebClient class for this
class MyWebClient : WebClient
{
public string FileName { get; private set; }
protected override WebResponse GetWebResponse(WebRequest request)
{
WebResponse response = base.GetWebResponse(request);
FileName = Regex.Match(((HttpWebResponse)response).Headers["Content-Disposition"], "filename=(.+?)$").Result("$1");
string regexSearch = new string(Path.GetInvalidFileNameChars()) + new string(Path.GetInvalidPathChars());
Regex r = new Regex(string.Format("[{0}]", Regex.Escape(regexSearch)));
FileName = r.Replace(FileName, "-");
return response;
}
}
Usage:
MyWebClient mwc = new MyWebClient();
byte[] bytes = mwc.DownloadData("http://subtitle.co.il//downloadsubtitle.php?id=202500");
File.WriteAllBytes(Path.Combine(folderToSave, mwc.FileName), bytes);

Related

AzureDevops Api: Get item API with download true return a json

I'm trying to download a Git File using C#. I use the following code:
Stream response = await client.GetStreamAsync(url);
var splitpath = path.Split("/");
Stream file = File.OpenWrite(splitpath[splitpath.Length - 1]);
response.CopyToAsync(file);
response.Close();
file.Close();
Following this documentation, I use the following url:
string url = mainurl + name + "/_apis/git/repositories/" + rep + "/items?path=" + path + "&download=true&api-version=6.0";
but the file saved contains a json containing different links and information about the git file.
To check if all was working well, I tried to download it in a zip format, using the following url:
string url = mainurl + name + "/_apis/git/repositories/" + rep + "/items?path=" + path + "&$format=zip";
And it works fine, the file downloaded is a zip file containing the original file with its content...
Can someone help me? Thanks
P.S. I know that I can set IncludeContent to True, and get the content in the json, but I need the original file.
Since you are using C#, I will give you a C# sample to get the original files:
using RestSharp;
using System;
using System.IO;
using System.IO.Compression;
namespace xxx
{
class Program
{
static void Main(string[] args)
{
string OrganizationName = "xxx";
string ProjectName = "xxx";
string RepositoryName = "xxx";
string Personal_Access_Token = "xxx";
string archive_path = "./"+RepositoryName+".zip";
string extract_path = "./"+RepositoryName+"";
string url = "https://dev.azure.com/"+OrganizationName+"/"+ProjectName+"/_apis/git/repositories/"+RepositoryName+"/items?$format=zip&api-version=6.0";
var client = new RestClient(url);
//client.Timeout = -1;
var request = new RestRequest(url, Method.Get);
request.AddHeader("Authorization", "Basic "+Personal_Access_Token);
var response = client.Execute(request);
//save the zip file
File.WriteAllBytes("./PushBack.zip", response.RawBytes);
//unzip the file
if (Directory.Exists(extract_path))
{
Directory.Delete(extract_path, true);
ZipFile.ExtractToDirectory(archive_path, extract_path);
}
else
{
ZipFile.ExtractToDirectory(archive_path, extract_path);
}
}
}
}
Successfully on my side:
Let me know whether this works on your side.
var personalaccesstoken = "xyz....";
using (HttpClient client = new HttpClient())
{
client.DefaultRequestHeaders.Accept.Add(
new System.Net.Http.Headers.MediaTypeWithQualityHeaderValue("*/*")); //this did the magic for me
client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Basic",
Convert.ToBase64String(
System.Text.ASCIIEncoding.ASCII.GetBytes(
string.Format("{0}:{1}", "", personalaccesstoken))));
using (Stream stream = await client.GetStreamAsync(
"https://dev.azure.com/fabrikam/myproj/_apis/git/repositories/myrepoid/items?path=%2Fsrc%2Ffolder%2Ffile.txt&api-version=7.0")) //no download arg
{
StreamReader sr = new StreamReader(stream);
var text = sr.ReadToEnd();
return text; // text has the content of the source file
}
}
no need for download parameter in the url
request headers should not be json

How to download file from API using Post Method

I have an API using POST Method.From this API I can download the file via Postmen tool.But I would like to know how to download file from C# Code.I have tried below code but POST Method is not allowed to download the file.
Code:-
using (var client = new WebClient())
{
client.Headers.Add("X-Cleartax-Auth-Token", ConfigurationManager.AppSettings["auth-token"]);
client.Headers[HttpRequestHeader.ContentType] = "application/json";
string url = ConfigurationManager.AppSettings["host"] + ConfigurationManager.AppSettings["taxable_entities"] + "/ewaybill/download?print_type=detailed";
TransId Id = new TransId()
{
id = TblHeader.Rows[0]["id"].ToString()
};
List<string> ids = new List<string>();
ids.Add(TblHeader.Rows[0]["id"].ToString());
string DATA = JsonConvert.SerializeObject(ids, Newtonsoft.Json.Formatting.Indented);
string res = client.UploadString(url, "POST",DATA);
client.DownloadFile(url, ConfigurationManager.AppSettings["InvoicePath"].ToString() + CboGatePassNo.EditValue.ToString().Replace("/", "-") + ".pdf");
}
Postmen Tool:-
URL : https://ewbbackend-preprodpub-http.internal.cleartax.co/gst/v0.1/taxable_entities/1c74ddd2-6383-4f4b-a7a5-007ddd08f9ea/ewaybill/download?print_type=detailed
Header :-
Content-Type : application/json
X-Cleartax-Auth-Token :b1f57327-96db-4829-97cf-2f3a59a3a548
Body :-
[
"GLD24449"
]
using (WebClient client = new WebClient())
{
client.Headers.Add("X-Cleartax-Auth-Token", ConfigurationManager.AppSettings["auth-token"]);
client.Headers[HttpRequestHeader.ContentType] = "application/json";
string url = ConfigurationManager.AppSettings["host"] + ConfigurationManager.AppSettings["taxable_entities"] + "/ewaybill/download?print_type=detailed";
client.Encoding = Encoding.UTF8;
//var data = "[\"GLD24449\"]";
var data = UTF8Encoding.UTF8.GetBytes(TblHeader.Rows[0]["id"].ToString());
byte[] r = client.UploadData(url, data);
using (var stream = System.IO.File.Create("FilePath"))
{
stream.Write(r,0,r.length);
}
}
Try this. Remember to change the filepath. Since the data you posted is not valid
json. So, I decide to post data this way.
I think it's straight forward, but instead of using WebClient, you can use HttpClient, it's better.
here is the answer HTTP client for downloading -> Download file with WebClient or HttpClient?
comparison between the HTTP client and web client-> Deciding between HttpClient and WebClient
Example Using WebClient
public static void Main(string[] args)
{
string path = #"download.pdf";
// Delete the file if it exists.
if (File.Exists(path))
{
File.Delete(path);
}
var uri = new Uri("https://ewbbackend-preprodpub-http.internal.cleartax.co/gst/v0.1/taxable_entities/1c74ddd2-6383-4f4b-a7a5-007ddd08f9ea/ewaybill/download?print_type=detailed");
WebClient client = new WebClient();
client.Headers[HttpRequestHeader.ContentType] = "application/json";
client.Headers.Add("X-Cleartax-Auth-Token", "b1f57327-96db-4829-97cf-2f3a59a3a548");
client.Encoding = Encoding.UTF8;
var data = UTF8Encoding.UTF8.GetBytes("[\"GLD24449\"]");
byte[] r = client.UploadData(uri, data);
using (var stream = System.IO.File.Create(path))
{
stream.Write(r, 0, r.Length);
}
}
Here is the sample code, don't forget to change the path.
public class Program
{
public static async Task Main(string[] args)
{
string path = #"download.pdf";
// Delete the file if it exists.
if (File.Exists(path))
{
File.Delete(path);
}
var uri = new Uri("https://ewbbackend-preprodpub-http.internal.cleartax.co/gst/v0.1/taxable_entities/1c74ddd2-6383-4f4b-a7a5-007ddd08f9ea/ewaybill/download?print_type=detailed");
HttpClient client = new HttpClient();
var request = new HttpRequestMessage(HttpMethod.Post, uri)
{
Content = new StringContent("[\"GLD24449\"]", Encoding.UTF8, "application/json")
};
request.Headers.Add("X-Cleartax-Auth-Token", "b1f57327-96db-4829-97cf-2f3a59a3a548");
var response = await client.SendAsync(request);
if (response.IsSuccessStatusCode)
{
using (FileStream fs = File.Create(path))
{
await response.Content.CopyToAsync(fs);
}
}
else
{
}
}

Convert API, new API version docx to PDF

I'm using Convert API to convert docx to PDF. With the old API version everything works good, but I'm trying to migrate to the new API version and when I open the PDF is not a valid document and it will not open. Not sure what I am doing wrong, maybe something about the encoding?
The response that I get from Convert API is a JSON with the File Name, File Size and File Data. Maybe this File Data needs to be processed to create a valid PDF file? if I just write that data in a file it does not work.
public string ConvertReportToPDF(string fileName)
{
string resultFileName = "";
key = "xxxxx";
var requestContent = new MultipartFormDataContent();
var fileStream = System.IO.File.OpenRead(fileName);
var stream = new StreamContent(fileStream);
requestContent.Add(stream, "File", fileStream.Name);
var response = new HttpClient().PostAsync("https://v2.convertapi.com/docx/to/pdf?Secret=" + key, requestContent).Result;
FileReportResponse responseDeserialized = JsonConvert.DeserializeObject<FileReportResponse>(response.Content.ReadAsStringAsync().Result);
var path = SERVER_TEMP_PATH + "\\" + responseDeserialized.Files.First().FileName;
System.IO.File.WriteAllText(path, responseDeserialized.Files.First().FileData);
return responseDeserialized.Files.First().FileName;
}
File data in JSON is Base64 encoded, decode it before writing to a file.
public string ConvertReportToPDF(string fileName)
{
string resultFileName = "";
key = "xxxxx";
var requestContent = new MultipartFormDataContent();
var fileStream = System.IO.File.OpenRead(fileName);
var stream = new StreamContent(fileStream);
requestContent.Add(stream, "File", fileStream.Name);
var response = new HttpClient().PostAsync("https://v2.convertapi.com/docx/to/pdf?Secret=" + key, requestContent).Result;
FileReportResponse responseDeserialized = JsonConvert.DeserializeObject<FileReportResponse>(response.Content.ReadAsStringAsync().Result);
var path = SERVER_TEMP_PATH + "\\" + responseDeserialized.Files.First().FileName;
System.IO.File.WriteAllText(path, Convert.FromBase64String(responseDeserialized.Files.First().FileData));
return responseDeserialized.Files.First().FileName;
}
Why to use JSON response in C# when you can use binary response instead. A response will be smaller, no need to decode. To change response type you need to add accept=application/octet-stream header to request to ask for binary response from server. The whole code will look like
using System;
using System.Net;
using System.IO;
class MainClass {
public static void Main (string[] args) {
const string fileToConvert = "test.docx";
const string fileToSave = "test.pdf";
const string Secret="";
if (string.IsNullOrEmpty(Secret))
Console.WriteLine("The secret is missing, get one for free at https://www.convertapi.com/a");
else
try
{
Console.WriteLine("Please wait, converting!");
using (var client = new WebClient())
{
client.Headers.Add("accept", "application/octet-stream");
var resultFile = client.UploadFile(new Uri("http://v2.convertapi.com/docx/to/pdf?Secret=" + Secret), fileToConvert);
File.WriteAllBytes(fileToSave, resultFile );
Console.WriteLine("File converted successfully");
}
}
catch (WebException e)
{
Console.WriteLine("Status Code : {0}", ((HttpWebResponse)e.Response).StatusCode);
Console.WriteLine("Status Description : {0}", ((HttpWebResponse)e.Response).StatusDescription);
Console.WriteLine("Body : {0}", new StreamReader(e.Response.GetResponseStream()).ReadToEnd());
}
}
}

get downloaded file from URL and Illegal characters in path

string uri = "https://sometest.com/l/admin/ical.html?t=TD61C7NibbV0m5bnDqYC_q";
string filePath = "D:\\Data\\Name";
WebClient webClient = new WebClient();
webClient.DownloadFile(uri, (filePath + "/" + uri.Substring(uri.LastIndexOf('/'))));
/// filePath + "/" + uri.Substring(uri.LastIndexOf('/')) = "D:\\Data\\Name//ical.html?t=TD61C7NibbV0m5bnDqYC_q"
Accesing the entire ( string ) uri, a .ical file will be automatically downloaded... The file name is room113558101.ics ( not that this will help ).
How can I get the file correctly?
You are building your filepath in a wrong way, which results in invalid file name (ical.html?t=TD61C7NibbV0m5bnDqYC_q). Instead, use Uri.Segments property and use last path segment (which will be ical.html in this case. Also, don't combine file paths by hand - use Path.Combine:
var uri = new Uri("https://sometest.com/l/admin/ical.html?t=TD61C7NibbV0m5bnDqYC_q");
var lastSegment = uri.Segments[uri.Segments.Length - 1];
string directory = "D:\\Data\\Name";
string filePath = Path.Combine(directory, lastSegment);
WebClient webClient = new WebClient();
webClient.DownloadFile(uri, filePath);
To answer your edited question about getting correct filename. In this case you don't know correct filename until you make a request to server and get a response. Filename will be contained in response Content-Disposition header. So you should do it like this:
var uri = new Uri("https://sometest.com/l/admin/ical.html?t=TD61C7NibbV0m5bnDqYC_q");
string directory = "D:\\Data\\Name";
WebClient webClient = new WebClient();
// make a request to server with `OpenRead`. This will fetch response headers but will not read whole response into memory
using (var stream = webClient.OpenRead(uri)) {
// get and parse Content-Disposition header if any
var cdRaw = webClient.ResponseHeaders["Content-Disposition"];
string filePath;
if (!String.IsNullOrWhiteSpace(cdRaw)) {
filePath = Path.Combine(directory, new System.Net.Mime.ContentDisposition(cdRaw).FileName);
}
else {
// if no such header - fallback to previous way
filePath = Path.Combine(directory, uri.Segments[uri.Segments.Length - 1]);
}
// copy response stream to target file
using (var fs = File.Create(filePath)) {
stream.CopyTo(fs);
}
}

Get original filename when downloading with WebClient

Is there any way to know the original name of a file you download using the WebClient when the Uri doesn't contain the name?
This happens for example in sites where the download originates from a dynamic page where the name isn't known beforehand.
Using my browser, the file gets the orrect name. But how can this be done using the WebClient?
E.g.
WebClient wc= new WebClient();
var data= wc.DownloadData(#"www.sometime.com\getfile?id=123");
Using DownloadFile() isn't a solution since this method needs a filename in advance.
You need to examine the response headers and see if there is a content-disposition header present which includes the actual filename.
WebClient wc = new WebClient();
var data= wc.DownloadData(#"www.sometime.com\getfile?id=123");
string fileName = "";
// Try to extract the filename from the Content-Disposition header
if (!String.IsNullOrEmpty(wc.ResponseHeaders["Content-Disposition"]))
{
fileName = wc.ResponseHeaders["Content-Disposition"].Substring(wc.ResponseHeaders["Content-Disposition"].IndexOf("filename=") + 9).Replace("\"", "");
}
Read the Response Header "Content-Disposition" with WebClient.ResponseHeaders
It should be:
Content-Disposition: attachment; filename="fname.ext"
your code should look like:
string header = wc.ResponseHeaders["Content-Disposition"]??string.Empty;
const string filename="filename=";
int index = header.LastIndexOf(filename,StringComparison.OrdinalIgnoreCase);
if (index > -1)
{
fileName = header.Substring(index+filename.Length);
}
To get the filename without downloading the file:
public string GetFilenameFromWebServer(string url)
{
string result = "";
var req = System.Net.WebRequest.Create(url);
req.Method = "HEAD";
using (System.Net.WebResponse resp = req.GetResponse())
{
// Try to extract the filename from the Content-Disposition header
if (!string.IsNullOrEmpty(resp.Headers["Content-Disposition"]))
{
result = resp.Headers["Content-Disposition"].Substring(resp.Headers["Content-Disposition"].IndexOf("filename=") + 9).Replace("\"", "");
}
}
return result;
}
If you, like me, have to deal with a Content-Disposition header that is not formatted correctly or cannot be parsed automatically by the ContentDisposition class for some reason, here's my solution :
string fileName = null;
// Getting file name
var request = WebRequest.Create(url);
request.Method = "HEAD";
using (var response = request.GetResponse())
{
// Headers are not correct... So we need to parse manually
var contentDisposition = response.Headers["Content-Disposition"];
// We delete everything up to and including 'Filename="'
var fileNameMarker= "filename=\"";
var beginIndex = contentDisposition.ToLower().IndexOf(fileNameMarker);
contentDisposition = contentDisposition.Substring(beginIndex + fileNameMarker.Length);
//We only get the string until the next double quote
var fileNameLength = contentDisposition.ToLower().IndexOf("\"");
fileName = contentDisposition.Substring(0, fileNameLength);
}

Categories