Unable to get HTML - c#

I'm trying to open an adfoc.us/504....9 link with HttpWebRequest.
However, the response contains no HTML.
try
{
    req = WebRequest.Create(txtLink.Text);
    WebProxy wp = new WebProxy(proxies[0]);
    //req.Proxy = wp;
    WebResponse wr = req.GetResponse();
    StreamReader sr = new StreamReader(wr.GetResponseStream());
    string content = sr.ReadToEnd();
    MessageBox.Show(content);
    sr.Close();
}
catch (UriFormatException)
{
    MessageBox.Show("URL should be in this format:\nhttp://www.google.com");
    return;
}
If I use a website like google.com, I get a message box with Google's HTML source.
If I use the adfoc.us/50.... link, I get an empty string.
Where could the problem be?
Thank you.
EDIT: I resolved the problem by installing GeckoFx component.

This is just a guess.
If you can open the link in your browser but not from your code, adfoc.us may be blocking you because it can't find a User-Agent header. Try adding the User-Agent header that a browser sends.
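A minimal sketch of that suggestion, assuming the site only checks for a browser-like User-Agent (not verified against adfoc.us). The class name, method names, and the exact User-Agent string are placeholders chosen for illustration:

```csharp
using System;
using System.IO;
using System.Net;

static class UserAgentExample
{
    // Example value only; any current browser's User-Agent string could be used instead.
    public const string BrowserUserAgent =
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0";

    // Builds a request that identifies itself like a browser. Nothing is sent yet.
    public static HttpWebRequest CreateRequest(string url)
    {
        var req = (HttpWebRequest)WebRequest.Create(url);
        req.UserAgent = BrowserUserAgent;
        return req;
    }

    // Sends the request and returns the response body as text.
    public static string Download(string url)
    {
        HttpWebRequest req = CreateRequest(url);
        using (var resp = (HttpWebResponse)req.GetResponse())
        using (var reader = new StreamReader(resp.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Calling UserAgentExample.Download(txtLink.Text) would then replace the WebRequest code in the question.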

Try this:
var req = (System.Net.HttpWebRequest) System.Net.WebRequest.Create("");
req.AllowAutoRedirect = true;
You can also set MaximumAutomaticRedirections manually.

When initializing the WebRequest, add the following:
req.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
It seems the site doesn't like the default headers. I took the value above from a Firefox request header.

Related

how to open webpage in c# without using webbrowser class

I want to know how to open a web page in C# without using the WebBrowser class. This is my first time using C#. I tried the code below, but it did not work. Can anyone help?
HttpWebRequest myRequest = (HttpWebRequest)WebRequest.Create("http://google.com");
myRequest.Method = "GET";
WebResponse myResponse = myRequest.GetResponse();
StreamReader sr = new StreamReader(myResponse.GetResponseStream(), System.Text.Encoding.UTF8);
string result = sr.ReadToEnd();
sr.Close();
myResponse.Close();
If you simply want to open a website without doing anything else with it, you can launch it in the default browser like this:
string url = "http://google.com";
System.Diagnostics.Process.Start(url);

c# App skip after reading JSON response

I have been trying to use an online API that returns JSON.
I am using a WinForms application at the moment.
So far I have tried:
WebClient cHttp = new WebClient();
string htmlCode = cHttp.DownloadString(path); // <--------
// ----------- And then this
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(path);
request.Method = WebRequestMethods.Http.Get;
request.Accept = "application/json";
HttpWebResponse response = (HttpWebResponse)request.GetResponse(); // <-----
At the lines marked with arrows, the program doesn't crash; it just reaches that line and then skips all the code below it. My form then opens without the rest of my code having run. What am I doing wrong?
Thank you
Use a try-catch block and you'll see the error:
try
{
    WebClient cHttp = new WebClient();
    string htmlCode = cHttp.DownloadString(path);
}
catch (Exception e)
{
    Debug.WriteLine(e);
}
An exception is making execution jump out of your method. Either catch the exception and return a null object, or find and fix the cause of the error.
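As a rough sketch of the "catch and return null" option, the helper below catches WebException specifically so the failure status can be inspected. The class name, method name, and the out parameter are invented for this example, and which exception types to handle is an assumption:

```csharp
using System;
using System.Net;

static class SafeDownload
{
    // Returns the response body, or null when the request fails.
    // The failure reason is reported through 'error' instead of being thrown.
    public static string TryDownload(string url, out string error)
    {
        error = null;
        try
        {
            using (var client = new WebClient())
            {
                return client.DownloadString(url);
            }
        }
        catch (WebException ex)
        {
            // Status distinguishes DNS failures, timeouts, HTTP error responses, and so on.
            error = ex.Status + ": " + ex.Message;
            return null;
        }
        catch (Exception ex) when (ex is ArgumentException || ex is NotSupportedException)
        {
            // The address itself was unusable (empty, null, or an unsupported scheme).
            error = ex.Message;
            return null;
        }
    }
}
```

The caller can then check for null and log or display the error text rather than letting the exception end the method silently.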

connect to website using a free proxy server programmatically

I need to connect to a website using a proxy server. I can do this manually; for example, I can use the online proxy http://zend2.com and then surf to www.google.com. But this must be done programmatically. I know I can use the WebProxy class, but how do I write code that uses a proxy server?
Can anyone give me a code snippet as an example?
thanks
Based on how zend2 works, you can build a URL like this:
http://zend2.com/bro.php?u=http%3A%2F%2Fwww.google.com&b=12&f=norefer
to browse Google.
In C#, build the URL like this:
string targetUrl = "http://www.google.com";
string proxyUrlFormat = "http://zend2.com/bro.php?u={0}&b=12&f=norefer";
// HttpUtility lives in the System.Web namespace
string actualUrl = string.Format(proxyUrlFormat, HttpUtility.UrlEncode(targetUrl));
// Do something with the proxy-ed url
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(new Uri(actualUrl));
// GetResponse() returns a WebResponse, so a cast is required
HttpWebResponse resp = (HttpWebResponse)req.GetResponse();
string content = null;
using (StreamReader sr = new StreamReader(resp.GetResponseStream()))
{
    content = sr.ReadToEnd();
}
Console.WriteLine(content);
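If the project has no reference to System.Web, Uri.EscapeDataString from the System namespace can do the encoding instead. A small sketch, assuming the zend2 URL format shown above still applies; the class and method names are made up for illustration:

```csharp
using System;

static class ProxiedUrl
{
    // URL pattern taken from the answer above; {0} receives the encoded target URL.
    private const string ProxyUrlFormat = "http://zend2.com/bro.php?u={0}&b=12&f=norefer";

    // Percent-encodes the target URL and places it in the proxy's query string.
    public static string Build(string targetUrl)
    {
        return string.Format(ProxyUrlFormat, Uri.EscapeDataString(targetUrl));
    }
}
```

ProxiedUrl.Build("http://www.google.com") produces the same address as the hand-written example URL in this answer, and the result can be passed to WebRequest.Create as before.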
You can use the WebProxy class.
MSDN code:
WebProxy proxyObject = new WebProxy("http://proxyserver:80/", true);
WebRequest req = WebRequest.Create("http://www.contoso.com");
req.Proxy = proxyObject;
In your case:
WebProxy proxyObject = new WebProxy("http://zend2.com", true);
// The URL needs a scheme, otherwise WebRequest.Create throws UriFormatException
WebRequest req = WebRequest.Create("http://www.google.com");
req.Proxy = proxyObject;

Grabbing HTML from URL doesn't work - any tips?

I have tried several methods in C# using WebClient and WebResponse, and they all return
<html><head><meta http-equiv=\"REFRESH\" content=\"0; URL=http://www.windowsphone.com/en-US/games?list=xbox\"><script type=\"text/javascript\">function OnBack(){}</script></head></html>"
instead of the actual rendered page when you use a browser to go to http://www.windowsphone.com/en-US/games?list=xbox
How would you go about grabbing the HTML from that location?
http://www.windowsphone.com/en-US/games?list=xbox
Thanks!
/edit: examples added:
Tried:
string inputUrl = "http://www.windowsphone.com/en-US/games?list=xbox";
string resultHTML = String.Empty;
Uri inputUri = new Uri(inputUrl);
WebRequest request = WebRequest.CreateDefault(inputUri);
request.Method = "GET";
WebResponse response;
try
{
    response = request.GetResponse();
    using (StreamReader reader = new StreamReader(response.GetResponseStream()))
    {
        resultHTML = reader.ReadToEnd();
    }
}
catch { }
Tried:
string inputUrl = "http://www.windowsphone.com/en-US/games?list=xbox";
string resultHTML = String.Empty;
WebClient webClient = new WebClient();
try
{
    resultHTML = webClient.DownloadString(inputUrl);
}
catch { }
Tried:
string inputUrl = "http://www.windowsphone.com/en-US/games?list=xbox";
string resultHTML = String.Empty;
WebResponse objResponse;
WebRequest objRequest = HttpWebRequest.Create(inputUrl);
try
{
    objResponse = objRequest.GetResponse();
    using (StreamReader sr = new StreamReader(objResponse.GetResponseStream()))
    {
        resultHTML = sr.ReadToEnd();
        sr.Close();
    }
}
catch { }
I checked this URL, and you need to handle cookies.
When you try to access the page for the first time, you are redirected to an https URL on login.live.com and then redirected back to the original URL. The https page sets a cookie called MSPRequ for the domain login.live.com. If you do not have this cookie, you cannot access the site.
I tried disabling cookies in my browser and it ends up looping infinitely back to the URL https://login.live.com/login.srf?wa=wsignin1.0&rpsnv=11&checkda=1&ct=1328303901&rver=6.1.6195.0&wp=MBI&wreply=http:%2F%2Fwww.windowsphone.com%2Fen-US%2Fgames%3Flist%3Dxbox&lc=1033&id=268289. It's been going on for several minutes now and doesn't appear it will ever stop.
So you will have to grab the cookie from the https page when it is set, and persist that cookie for your subsequent requests.
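One way to persist cookies with HttpWebRequest is to share a single CookieContainer across requests. The sketch below shows that idea only; whether it is enough to get past the login.live.com redirect has not been verified, and the class and member names are invented for the example:

```csharp
using System;
using System.IO;
using System.Net;

class CookieSession
{
    // One container shared by every request, so cookies set by one
    // response are sent back on the requests that follow.
    private readonly CookieContainer cookies = new CookieContainer();

    public CookieContainer Cookies
    {
        get { return cookies; }
    }

    // Builds a request bound to the shared cookie container. Nothing is sent yet.
    public HttpWebRequest CreateRequest(string url)
    {
        var req = (HttpWebRequest)WebRequest.Create(url);
        req.CookieContainer = cookies;
        req.AllowAutoRedirect = true;
        return req;
    }

    // Sends a GET request and returns the response body as text.
    public string Get(string url)
    {
        HttpWebRequest req = CreateRequest(url);
        using (var resp = (HttpWebResponse)req.GetResponse())
        using (var reader = new StreamReader(resp.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Creating one CookieSession and calling Get on it for each page keeps the cookies between calls. WebClient does not expose a cookie container directly, which is why this sketch uses HttpWebRequest.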
This might be because the server you are requesting HTML from returns different HTML depending on the User-Agent string. You might try something like this:
webClient.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");
That particular header may not work, but you could try others that would mimic standard browsers.

Can't get HTML code through HttpWebRequest

I am trying to parse the HTML code of the page at http://odds.bestbetting.com/horse-racing/today in order to have a list of races, etc.
The problem is that I am not able to retrieve the HTML code of the page. Here is the C# code of the function:
public static string Http(string url) {
    Uri myUri = new Uri(url);
    // Create a 'HttpWebRequest' object for the specified url.
    HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create(myUri);
    myHttpWebRequest.AllowAutoRedirect = true;
    // Send the request and wait for response.
    HttpWebResponse myHttpWebResponse = (HttpWebResponse)myHttpWebRequest.GetResponse();
    var stream = myHttpWebResponse.GetResponseStream();
    var reader = new StreamReader(stream);
    var html = reader.ReadToEnd();
    // Release resources of response object.
    myHttpWebResponse.Close();
    return html;
}
When I execute the program calling the function it throws an exception on
HttpWebResponse myHttpWebResponse =
(HttpWebResponse)myHttpWebRequest.GetResponse();
which is:
Cannot handle redirect from HTTP/HTTPS protocols to other dissimilar ones.
I have read this question but I don't seem to have the same problem.
I've also tried figuring something out by sniffing the traffic with Fiddler, but I can't see where it redirects to or anything similar. I have only extracted these two possible redirections: odds.bestbetting.com/horse-racing/2011-06-10/byCourse
and odds.bestbetting.com/horse-racing/2011-06-10/byTime, but querying them produces the same result as above.
It's not the first time I've done something like this, but I'm really lost on this one. Any help?
Thanks!
I finally found the solution... it was indeed a problem with the headers, specifically the User-Agent one.
After lots of searching I found someone having the same problem with the same site. Although his code was different, the important bit was that he manually set the UserAgent property of the request to that of a browser. I think I had tried this before, but I may have done it badly... sorry.
The final code, if it is of interest to anyone, is this:
public static string Http(string url) {
    if (url.Length > 0)
    {
        Uri myUri = new Uri(url);
        // Create a 'HttpWebRequest' object for the specified url.
        HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create(myUri);
        // Set the user agent as if we were a web browser
        myHttpWebRequest.UserAgent = @"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4";
        HttpWebResponse myHttpWebResponse = (HttpWebResponse)myHttpWebRequest.GetResponse();
        var stream = myHttpWebResponse.GetResponseStream();
        var reader = new StreamReader(stream);
        var html = reader.ReadToEnd();
        // Release resources of response object.
        myHttpWebResponse.Close();
        return html;
    }
    else { return "NO URL"; }
}
Thank you very much for helping.
There are a dozen possible causes for your problem.
One of them is that the redirect from the server points to an FTP site, or something like that.
It could also be that the server requires some headers in the request that you are failing to provide.
Check what a browser would send to the site and try to replicate it.
