C# WebClient DownloadData issue on double quotes - c#

I'm making a request to a URL, but in the returned string every " appears as \".
This is my code:
WebClient webclient = new WebClient();
byte[] databuffer = webclient.DownloadData(url);
return Encoding.UTF8.GetString(databuffer);
Why is the content of the web page returned as, for instance,
<div id=\"whatever\"> instead of <div id="whatever">?

There's no problem: you are probably looking at the result in the Visual Studio debugger, which escapes quotes in its display. The actual string you are getting doesn't contain any \". Try saving it to a file and you will see:
File.WriteAllBytes(@"c:\test.htm", databuffer);
So no worries, unless the web page you are downloading is broken and really does use \" instead of " in the response.
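A small check (not the original code) that makes the point: the decoded string holds a plain quote character, and the `\"` only appears in the debugger's escaped display.

```csharp
using System;
using System.Text;

// Simulate the downloaded bytes for <div id="whatever">.
byte[] dataBuffer = Encoding.UTF8.GetBytes("<div id=\"whatever\">");
string html = Encoding.UTF8.GetString(dataBuffer);

// The string contains a bare quote; there is no backslash in it.
Console.WriteLine(html);                  // <div id="whatever">
Console.WriteLine(html.Contains("\\\"")); // False
```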

Related

PDF is downloaded when sending MemoryStream to embed/iframe in view of ASP.NET MVC

I'm trying to show a pdf in my view that I download from an API; however, every time the view is loaded the browser downloads the pdf instead of showing it inside the iframe or embed.
I'm getting and returning the pdf in my action like this:
public FileStreamResult GetPDFInBrowser(Guid id)
{
string url = ...; // a link that gets the pdf from the API by id; the content is stored as a string inside the "root" object
WebClient client = new WebClient();
string response = client.DownloadString(url);
Root responseObject = JsonConvert.DeserializeObject<Root>(response);
OOLDocument oolDocument = responseObject.value[0];
byte[] byteArray = Encoding.UTF8.GetBytes(oolDocument.Contents);
MemoryStream stream = new MemoryStream(byteArray);
return File(stream, "application/pdf", oolDocument.FileName);
}
Afterwards I try to show the pdf in my view like this with either iframe or embed:
<iframe src="@baseUrl/GetPDFInBrowser?id=@Model.DocumentsGuid" class="pdfviewer">
</iframe>
<embed src="@baseUrl/GetPDFInBrowser?id=@Model.DocumentsGuid" class="pdfviewer" type="application/pdf" id="pdfObjectViewer">
</embed>
So, how do I make sure the pdf is embedded in the browser instead of having the browser download it?
Any help or tips for improving the code are greatly appreciated.
Edit:
With some more searching combined with trial and error I was able to fix the error I was getting and have therefore removed it from the remainder of the question.
For those interested, I fixed the error "Cannot load pdf-document" by changing:
byte[] byteArray = Encoding.UTF8.GetBytes(oolDocument.Contents);
To:
byte[] byteArray = Convert.FromBase64String(oolDocument.Contents);
However, the pdf is still being downloaded instead of being shown in the view.
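The remaining download-instead-of-display behaviour comes from the third argument to File(): passing a fileDownloadName makes MVC send a Content-Disposition: attachment header, which tells the browser to download. A sketch of the action without it, reusing the question's own Root/OOLDocument types:

```csharp
public FileStreamResult GetPDFInBrowser(Guid id)
{
    string url = ...; // same API link as above
    WebClient client = new WebClient();
    string response = client.DownloadString(url);
    Root responseObject = JsonConvert.DeserializeObject<Root>(response);
    OOLDocument oolDocument = responseObject.value[0];
    byte[] byteArray = Convert.FromBase64String(oolDocument.Contents);
    MemoryStream stream = new MemoryStream(byteArray);
    // Optional: keep the file name but ask the browser to render inline.
    Response.AppendHeader("Content-Disposition",
        "inline; filename=" + oolDocument.FileName);
    // No fileDownloadName argument, so MVC does not set "attachment".
    return File(stream, "application/pdf");
}
```

With the attachment disposition gone, the iframe/embed should render the pdf in place, assuming the browser has a pdf viewer.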

Download and encode HTML page into file

I'd like to download some web pages that use charset="UTF-8".
This page is a sample: http://en.wikipedia.org/wiki/Billboard_Year-End_Hot_100_singles_of_2003
I always end up with special characters like this:
Beyoncé instead of Beyoncé
I tried the following code:
WebClient webClient = new WebClient();
webClient.Encoding = System.Text.Encoding.UTF8;
webClient.DownloadFile(url, fileName);
or this one:
WebClient client = new WebClient();
Byte[] pageData = client.DownloadData(url);
string pageHtml = Encoding.UTF8.GetString(pageData);
System.IO.File.WriteAllText(fileName, pageHtml);
What am I doing wrong?
I just want an easy way to download web pages and write them to files. After that is done I will extract data from these files, and obviously I want "normal" characters like I see on the original web page, not these garbled ones.
The problem is that the WriteAllText method doesn't mark the file as UTF-8 unless you pass the encoding explicitly. You should add the encoding:
System.IO.File.WriteAllText(fileName, pageHtml, Encoding.UTF8);
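Why this helps: on .NET, File.WriteAllText with an explicit Encoding.UTF8 prepends a UTF-8 byte-order mark, which is what lets editors and browsers detect the encoding when they reopen the file. A minimal check:

```csharp
using System;
using System.IO;
using System.Text;

string path = Path.GetTempFileName();
File.WriteAllText(path, "Beyoncé", Encoding.UTF8);

// Encoding.UTF8 emits the BOM bytes EF BB BF before the content.
byte[] bytes = File.ReadAllBytes(path);
Console.WriteLine(bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF); // True
File.Delete(path);
```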

String encoding with a JSON flow got by web request C#

I have a little problem with a string in C#. I'm fetching a JSON feed from a URL:
WebClient webC = new WebClient();
string jsonStr = webC.DownloadString("http://www.express-board.fr/api/jobs");
But when I write the string to the console, I have an encoding problem:
[...]"contract":"Freelance/Indépendant"[...]
I have tried lots of tricks from Stack Overflow with the Encoding class, but could not solve the problem. Of course, if I open the link directly in my web browser and then in Notepad++, there is no problem.
Sometimes, with some combination of encodings (ASCII → UTF-8, I think), I get this instead:
[...]"contract":"Freelance/Ind??pendant"[...]
This actually returns the string as intended:
WebClient webC = new WebClient();
webC.Encoding = Encoding.UTF8;
string jsonStr = webC.DownloadString("http://www.express-board.fr/api/jobs");
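Why setting the Encoding property matters: the API sends UTF-8 bytes, but when the response declares no charset, DownloadString falls back to WebClient.Encoding, which defaults to the ANSI code page. Decoding UTF-8 bytes as Latin-1 reproduces exactly the garbled output from the question:

```csharp
using System;
using System.Text;

// UTF-8 bytes of the value the API actually sends.
byte[] utf8Bytes = Encoding.UTF8.GetBytes("Freelance/Indépendant");

// Decoding them with the wrong (single-byte) encoding splits é into two chars.
string wrong = Encoding.GetEncoding("ISO-8859-1").GetString(utf8Bytes);
string right = Encoding.UTF8.GetString(utf8Bytes);

Console.WriteLine(wrong); // Freelance/Indépendant
Console.WriteLine(right); // Freelance/Indépendant
```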

WebClient.DownloadFile vs. WebClient.DownloadData

I am using WebClient.DownloadFile to download a small executable file from the internet. This method is working very well. However, I would now like to download this executable file into a byte array rather than onto my hard drive. I did some reading and came across the WebClient.DownloadData method. The problem I am having with the DownloadData method is that rather than downloading my file, my code downloads the HTML of my file's download page.
I have tried using dozens of sites - each brings me the same issue. Below is the code I am using.
// Create a new instance of the System.Net 'WebClient'
System.Net.WebClient client = new System.Net.WebClient();
// Download URL
Uri uri = new Uri("http://www35.multiupload.com:81/files/4D7B4D2BFC3F1A9F765A433BA32ED2C5883D0CE133154A0FDB7E7786547A3165DA62393141C4AF8FF36C75222566CF3EB64AF6FBCFC02099BB209C891529CF7B90C83D9C63D39D989CBB8ECE6DE2B83B/Project1.exe");
byte[] dbytes = client.DownloadData(uri);
MessageBox.Show(dbytes.Length.ToString()); // Not the size of my file
Keep in mind that I am attempting to download the data of an executable file into a byte array.
Thank you for any help,
Evan
You are attempting to download a file using an expired token url. See below:
URL: http://www35.multiupload.com:81/files/4D7B4D2BFC3F1A9F765A433BA32ED2C5883D0CE133154A0FDB7E7786547A3165DA62393141C4AF8FF36C75222566CF3EB64AF6FBCFC02099BB209C891529CF7B90C83D9C63D39D989CBB8ECE6DE2B83B/Project1.exe
Server: www35
Token:
4D7B4D2BFC3F1A9F765A433BA32ED2C5883D0CE133154A0FDB7E7786547A3165DA62393141C4AF8FF36C75222566CF3EB64AF6FBCFC02099BB209C891529CF7B90C83D9C63D39D989CBB8ECE6DE2B83B
You can't just download a file by waiting for the timer to end, and copy the direct link, it's a "token" link. It will only work for a specified period of time before redirecting you back to the download page (which is why you are getting HTML instead of binary data).
Workaround
You will have to download the multiupload's HTML and parse the direct download link from the HTML source code. Only this way provides a sure-fire way of getting an up-to-date token url.
As @Dark Slipstream said, you're attempting to download a file using an expired token url.
Here's how to get the new url:
System.Net.WebClient client = new System.Net.WebClient();
// Download URL
Uri uri = new Uri("http://www.multiupload.com/39QMACX7XS");
byte[] dbytes = client.DownloadData(uri);
string responseStr = System.Text.Encoding.ASCII.GetString(dbytes);
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(responseStr);
string urlToDownload = doc.DocumentNode.SelectNodes("//a[contains(@href,'files/')]")[0].Attributes["href"].Value;
byte[] data = client.DownloadData(urlToDownload);
int length = data.Length;
I'm not handling the exceptions here.

Help needed parsing an FTP file list in C#

I am using this code to get a list of all the files in a directory, where webRequestUrl = something.com/directory/:
FtpWebRequest fwrr = (FtpWebRequest)FtpWebRequest.Create(new Uri("ftp://" + webRequestUrl));
fwrr.Credentials = new NetworkCredential(username, password);
fwrr.Method = WebRequestMethods.Ftp.ListDirectoryDetails;
StreamReader srr = new StreamReader(fwrr.GetResponse().GetResponseStream());
string str = srr.ReadLine();
ArrayList strList = new ArrayList();
while (str != null)
{
strList.Add(str);
str = srr.ReadLine();
}
but instead of a list of files I am getting some HTML-document-type lines.
The FTP server is Windows-based; the same code works fine against a Unix server.
Please help.
Thanks.
It works for me when the FTP server is on an internal machine and I use ftp://192.168.0.155. If I try that in IE, I get the same HTML result as yours.
I doubt it's happening because of the URL, but can you try replacing the hostname with the IP address (just a wild guess)? Even if you are getting HTML, you can strip the unnecessary parts and parse the files.
I even tried with ftp://sub.a.com/somefolder and it worked for me. It seems the browser wraps HTML around the FTP response, because I get different HTML when I open the FTP site in IE and in Chrome.
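If the server really is Windows-based (IIS), ListDirectoryDetails often returns DOS-style lines rather than the Unix ls output the code may have been parsing before, which could explain the difference between the two servers. A sketch for parsing one such line (the sample line and its format are assumptions, not from the question):

```csharp
using System;
using System.Globalization;

// IIS-style (DOS) listing lines look roughly like:
//   "06-25-09  02:41PM       <DIR>          images"
//   "06-25-09  02:41PM            144700153 setup.exe"
string line = "06-25-09  02:41PM            144700153 setup.exe";

// Split into date, time, size-or-<DIR>, and the rest (the name may contain spaces).
string[] parts = line.Split((char[])null, 4, StringSplitOptions.RemoveEmptyEntries);

DateTime modified = DateTime.ParseExact(parts[0] + " " + parts[1],
    "MM-dd-yy hh:mmtt", CultureInfo.InvariantCulture);
bool isDirectory = parts[2] == "<DIR>";
long size = isDirectory ? 0 : long.Parse(parts[2]);
string name = parts[3];

Console.WriteLine($"{name} {size} {isDirectory}"); // setup.exe 144700153 False
```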
