WebClient.DownloadString(url) does not work with 2nd parameter - c#

I am using the WebClient.DownloadString() method to download some data. I am using the following code:
static void Main(string[] args)
{
string query = "select+%3farticle+%3fmesh+where+{+%3farticle+a+npg%3aArticle+.+%3farticle+npg%3ahasRecord+[+dc%3asubject+%3fmesh+]+.+filter+regex%28%3fmesh%2c+\"blood\"%2c+\"i\"%29+}";
NameValueCollection queries = new NameValueCollection();
queries.Add("query", query);
//queries.Add("output", "sparql_json");
using (WebClient wc = new WebClient())
{
wc.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");
wc.QueryString = queries;
string result = wc.DownloadString("http://data.nature.com/sparql");
Console.WriteLine(result);
}
Console.ReadLine();
}
With this code, tt works fine and gives me an xml string as the output. But I would like to get a JSON output and hence I un-commented the line
queries.Add("output", "sparql_json");
and executed the same program and it seems to be fetching an error message from the server.
However, if I try to use a web browser and use the same url (as given below), it gives me a JSON as expected:
URL that works in browsers
I am wondering what the problem could be. Especially when it works in a browser and not using a webclient. Is the webclient doing something different here?
Please note that I also tried to specify the query as
query + "&output=sparql_json"
But that does not work either.
Could someone please tell me what the problem might be?
Thanks

Add wc.Headers.Add("Accept","application/json");. Here is the full source I tested
string query = "select ?article ?mesh where { ?article a npg:Article . ?article npg:hasRecord [ dc:subject ?mesh ] . filter regex(?mesh, \"blood\", \"i\") }";
NameValueCollection queries = new NameValueCollection();
queries.Add("query", query);
queries.Add("output", "sparql_json");
using (WebClient wc = new WebClient())
{
wc.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");
wc.Headers.Add("Accept","application/json");
wc.QueryString = queries;
string result = wc.DownloadString("http://data.nature.com/sparql");
Console.WriteLine(result);
}
Console.ReadLine();

Related

WebClient is throwing error while calling UploadString()

I have to work on JSON data from API (in my windows app), and I am trying make a POST request using WebClient.UploadString();
Below is my code, but its throwing error, I tried various options but not able to copy the JSON as a string.
string result = "";
string url = "https://30prnabicq-dsn.algolia.net/1/indexes/*/queries?x-algolia-agent=Algolia for vanilla JavaScript (lite) 3.24.12;JS Helper 2.24.0;vue-instantsearch 1.5.0&x-algolia-application-id=30PRNABICQ&x-algolia-api-key=dcccebe87b846b64f545bf63f989c2b1";
string json = "{\"requests\":[{\"indexName\":\"vacatures\",\"params\":\"query=&hitsPerPage=20&page=0&highlightPreTag=__ais-highlight__&highlightPostTag=__/ais-highlight__&facets=[\"category\",\"contract\",\"experienceNeeded\",\"region\"]&tagFilters=\"}]}";
using (var client = new WebClient())
{
client.Headers[HttpRequestHeader.Host] = "30prnabicq-dsn.algolia.net";
client.Headers[HttpRequestHeader.UserAgent] = "Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:60.0) Gecko/20100101 Firefox/60.0";
client.Headers[HttpRequestHeader.Accept] = "application/json";
client.Headers[HttpRequestHeader.AcceptLanguage] = "en-US,en;q=0.5";
client.Headers[HttpRequestHeader.AcceptEncoding] = "gzip, deflate, br";
client.Headers[HttpRequestHeader.Referer] = "https://bouwjobs.be/";
client.Headers[HttpRequestHeader.ContentType] = "application/json";
client.Headers[HttpRequestHeader.ContentLength] = "249";
client.Headers[HttpRequestHeader.Origin] = "https://bouwjobs.be";
client.Headers[HttpRequestHeader.Connection] = "keep-alive";
client.Headers[HttpRequestHeader.Cache - Control] = "max-age=0";
result = client.UploadString(url, "POST", json);
return result;
}
Please guide me in correcting my code.
Note - Some restricted headers I have included in my code, but even after commenting out those it is throwing error.
You do not seem uploading a valid Json with WebClient. Double quotes in your inner array facets means your query parameter has ended. Remove quotes from it.
string json = "{\"requests\":[{\"indexName\":\"vacatures\",\"paras\":\"query=&hitsPerPage=20&page=0&highlightPreTag=__ais-highlight__&highlightPostTag=__/ais-highlight__&facets=[category,contract,experienceNeeded,region]&tagFilters=\"}]}";
This is valid json and should work fine.

get method returns root url content

i try to ge the content of this url: https://www.eganba.com/index.php?p=Products&ctg_id=2000&sort_type=rel-desc&view=0&page=1
but as a result of the following code the response contains the content of this url, the home page: https://www.eganba.com
in addition, when i try to get the first url content via Postman application the response is correct.
do you have any idea?
WebRequest request = WebRequest.Create("https://www.eganba.com/index.php?p=Products&ctg_id=2000&sort_type=rel-desc&view=0&page=1");
request.Method = "GET";
request.Headers["X-Requested-With"] = "XMLHttpRequest";
WebResponse response = request.GetResponse();
Use WebClient method which inside System.Net. I think this code gives you what you need. It return the page's html
using (WebClient client = new WebClient())
{
client.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");
client.Headers.Add("accept", "text/html");
var htmlCode = client.DownloadString("https://www.eganba.com/?p=Products&ctg_id=2000&sort_type=rel-desc&view=0&page=1");
var result = htmlCode.Contains("Stokta var") ? true : false;
}
Hope it helps to you.

C# WebClient DownloadString returns gibberish

I am attempting to view the source of http://simpledesktops.com/browse/desktops/2012/may/17/where-the-wild-things-are/ using the code:
String URL = "http://simpledesktops.com/browse/desktops/2012/may/17/where-the-wild-things-are/";
WebClient webClient = new WebClient();
webClient.Headers.Add("user-agent", "Mozilla/5.0 (Windows; Windows NT 5.1; rv:1.9.2.4) Gecko/20100611 Firefox/3.6.4");
webClient.Encoding = Encoding.GetEncoding("Windows-1255");
string download = webClient.DownloadString(URL);
webClient.Dispose();
Console.WriteLine(download);
When I run this, the console returns a bunch of nonsense that looks like it's been decoded incorrectly.
I've also attempted adding headers with no avail:
webClient.Headers.Add("user-agent", "Mozilla/5.0 (Windows; Windows NT 5.1; rv:1.9.2.4) Gecko/20100611 Firefox/3.6.4");
webClient.Headers.Add("Accept-Encoding", "gzip,deflate");
Other websites all returned the proper html source. I can also view the page's source through Chrome. What's going on here?
Response of that URL is gzipped, you should decompress it or set empty Accept-Encoding header, you don't need that user-agent field.
String URL = "http://simpledesktops.com/browse/desktops/2012/may/17/where-the-wild-things-are/";
WebClient webClient = new WebClient();
webClient.Headers.Add("Accept-Encoding", "");
string download = webClient.DownloadString(URL);
I've had the same thing bug me today.
Using a WebClient object to check whether a URL is returning something.
But my experience is different. I tried removing the Accept-Encoding, basically using the code #Antonio Bakula gave in his answer. But I kept getting the same error every time (InvalidOperationException)
So this did not work:
WebClient wc = new WebClient();
wc.Headers.Add("Accept-Encoding", "");
string result = wc.DownloadString(url);
But adding 'any' text as a User Agent instead did do the trick. This worked fine:
WebClient wc = new WebClient();
wc.Headers.Add(HttpRequestHeader.UserAgent, "My User Agent String");
System.IO.Stream stream = wc.OpenRead(url);
Your mileage may vary obviously, also of note. I'm using ASP.NET 4.0.30319.

Cant access site using webclient method..?

I am making a desktop yellowpage application. I can access all countries yellowpage site but not australian site. I dont know why?
Here is the code
class Program
{
static void Main(string[] args)
{
WebClient wb = new WebClient();
wb.Headers.Add("user-agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US)");
string html = wb.DownloadString("http://www.yellowpages.com.au");
Console.WriteLine(html);
}
}
For all other site I get html of the website for australian site I get null. i even tried httpwebrequest also.
Here is the yellowpage australian site: http://www.yellowpages.com.au
Thanks in advance
It looks like that website will only send over gzip'ed data. Try switching to HttpWebRequest and using auto decompression:
var request = (HttpWebRequest)WebRequest.Create("http://www.yellowpages.com.au");
request.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705;)";
request.Headers.Add(HttpRequestHeader.AcceptEncoding, "gzip,deflate");
request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
In addition to #bkaid's correct (and upvoted) answer, you can use your own class inherited from WebClient to uncompress/handle gzip compressed html:
public class GZipWebClient : WebClient
{
protected override WebRequest GetWebRequest(Uri address)
{
HttpWebRequest request = (HttpWebRequest)base.GetWebRequest(address);
request.AutomaticDecompression = DecompressionMethods.GZip |
DecompressionMethods.Deflate;
return request;
}
}
Having done this, the following works just fine:
WebClient wb = new GZipWebClient();
string html = wb.DownloadString("http://www.yellowpages.com.au");
When I view the transfer from that website in Wireshark, it says it's a malformed HTTP packet. It says it uses chunked transfer, then says the following chunk has 0 bytes and then sends the code of the website. That's why WebClient returns an empty string (not null). And I think it's correct behavior.
It seems browsers ignore this error and so they can display the page properly.
EDIT:
As bkaid pointed out, the server seems to handle send correct gziped response. The following code works for me:
WebClient wb = new WebClient();
wb.Headers.Add("Accept-Encoding", "gzip");
string html;
using (var webStream = wb.OpenRead("http://www.yellowpages.com.au"))
using (var gzipStream = new GZipStream(webStream, CompressionMode.Decompress))
using (var streamReader = new StreamReader(gzipStream))
html = streamReader.ReadToEnd();

Why Does my HttpWebRequest Return 400 Bad request?

The following code fails with a 400 bad request exception. My network connection is good and I can go to the site but I cannot get this uri with HttpWebRequest.
private void button3_Click(object sender, EventArgs e)
{
WebRequest req = HttpWebRequest.Create(#"http://www.youtube.com/");
try
{
//returns a 400 bad request... Any ideas???
WebResponse response = req.GetResponse();
}
catch (WebException ex)
{
Log(ex.Message);
}
}
First, cast the WebRequest to an HttpWebRequest like this:
HttpWebRequest req = (HttpWebRequest)HttpWebRequest.Create(#"http://www.youtube.com/");
Then, add this line of code:
req.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)";
Set UserAgent and Referer in your HttpWebRequest:
var request = (HttpWebRequest)WebRequest.Create(#"http://www.youtube.com/");
request.Referer = "http://www.youtube.com/"; // optional
request.UserAgent =
"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; WOW64; " +
"Trident/4.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; " +
".NET CLR 3.5.21022; .NET CLR 3.5.30729; .NET CLR 3.0.30618; " +
"InfoPath.2; OfficeLiveConnector.1.3; OfficeLivePatch.0.0)";
try
{
var response = (HttpWebResponse)request.GetResponse();
using (var reader = new StreamReader(response.GetResponseStream()))
{
var html = reader.ReadToEnd();
}
}
catch (WebException ex)
{
Log(ex);
}
There could be many causes for this problem. Do you have any more details about the WebException?
One cause, which I've run into before, is that you have a bad user agent string. Some websites (google for instance) check that requests are coming from known user agents to prevent automated bots from hitting their pages.
In fact, you may want to check that the user agreement for YouTube does not preclude you from doing what you're doing. If it does, then what you're doing may be better accomplished by going through approved channels such as web services.
Maybe you've got a proxy server running, and you haven't set the Proxy property of the HttpWebRequest?

Categories