I'm working on a C# project that uses a public XML feed for calculations. I originally used XmlDocument.Load, but migrated to WebClient.DownloadString so I could include headers in my request. The feed I'm accessing usually responds quickly, but every now and again it fails to respond within the timeout period of the WebClient object, and I get an exception. Here's my code:
XmlDocument xmlDoc = new XmlDocument();
WebClient client = new WebClient();
client.Headers["User-Agent"] = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.835.202 Safari/535.1";
client.Headers["Accept"] = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
string data = client.DownloadString(/*URL*/);
xmlDoc.LoadXml(data);
I've read that you cannot change the timeout property of WebClient, and people who have this problem should use HttpWebRequest instead. Unfortunately, I don't know how to go about implementing this in a way that still allows me to use my headers AND send that result to xmlDoc. Due to the nature of this application, I don't care how long it takes to receive the data; I can handle alerting the user.
What is the best way to go about doing this?
You could use a WebClient derived class for this, which just adds the timeout you want for each fetch:
public class TimeoutWebClient : WebClient
{
protected override WebRequest GetWebRequest(Uri address)
{
HttpWebRequest request = (HttpWebRequest)base.GetWebRequest(address);
request.Timeout = 60000; //1 minute timeout
return request;
}
}
If you use TimeoutWebClient instead of WebClient now, you get the timeout behavior that you want. If the custom headers you need are the same for each request, you could add those here as well and your calling code remains very clean.
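For example, a rough usage sketch (the URL is a placeholder, and the headers are the ones from the question):
XmlDocument xmlDoc = new XmlDocument();
using (TimeoutWebClient client = new TimeoutWebClient())
{
    client.Headers["User-Agent"] = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.835.202 Safari/535.1";
    client.Headers["Accept"] = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
    xmlDoc.LoadXml(client.DownloadString(/*URL*/));
}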
XmlDocument xmlDoc = new XmlDocument();
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(/*URL*/);
request.Timeout = System.Threading.Timeout.Infinite; // or any value in milliseconds
// fill in request.Headers...
// The response body is exposed as a stream, which XmlDocument.Load accepts directly.
xmlDoc.Load(request.GetResponse().GetResponseStream());
You could just catch the exception, then reissue the request. You might want to put some other logic in here to abort after a certain number of failed attempts.
bool retry;
do
{
    retry = false;
    try
    {
        string data = client.DownloadString(/*URL*/);
    }
    catch (WebException e)
    {
        retry = true;
    }
} while (retry);
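A rough sketch of that abort logic, assuming a hypothetical limit of three attempts:
const int maxAttempts = 3; // hypothetical limit
int attempts = 0;
string data = null;
bool retry;
do
{
    retry = false;
    attempts++;
    try
    {
        data = client.DownloadString(/*URL*/);
    }
    catch (WebException)
    {
        retry = attempts < maxAttempts; // give up after maxAttempts failures
    }
} while (retry);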
Does anyone know how to check if a webpage is asking for HTTP Authentication via C# using the WebRequest class? I'm not asking how to post Credentials to the page, just how to check if the page is asking for Authentication.
Current Snippet to get HTML:
WebRequest wrq = WebRequest.Create(address);
WebResponse wrs = wrq.GetResponse();
Uri uri = wrs.ResponseUri;
StreamReader strdr = new StreamReader(wrs.GetResponseStream());
string html = strdr.ReadToEnd();
strdr.Close();
wrs.Close();
return html;
PHP Server side source:
<?php
if (!isset($_SERVER['PHP_AUTH_USER'])) {
    header('WWW-Authenticate: Basic realm="Secure Sign-in"');
    header('HTTP/1.0 401 Unauthorized');
    echo 'Text to send if user hits Cancel button';
    exit;
} else {
    echo "<p>Hello {$_SERVER['PHP_AUTH_USER']}.</p>";
    echo "<p>You entered {$_SERVER['PHP_AUTH_PW']} as your password.</p>";
}
?>
For an HTTP URL, WebRequest.GetResponse actually returns an HttpWebResponse. Just cast it and you can retrieve StatusCode.
However, .NET will throw a WebException if it receives a response with a 4xx or 5xx status (thanks for your feedback).
There is a little workaround; check it out:
HttpWebRequest wrq = (HttpWebRequest)WebRequest.Create(@"http://webstrand.comoj.com/locked/safe.php");
HttpWebResponse wrs = null;
try
{
    wrs = (HttpWebResponse)wrq.GetResponse();
}
catch (System.Net.WebException protocolError)
{
    if (((HttpWebResponse)protocolError.Response).StatusCode == HttpStatusCode.Unauthorized)
    {
        //do something
    }
}
catch (System.Exception generalError)
{
    //run to the hills
}
if (wrs != null && wrs.StatusCode == HttpStatusCode.OK)
{
    Uri uri = wrs.ResponseUri;
    StreamReader strdr = new StreamReader(wrs.GetResponseStream());
    string html = strdr.ReadToEnd();
    strdr.Close();
    wrs.Close();
}
Hope this helps.
Regards
Might want to try
WebClient wc = new WebClient();
CredentialCache credCache = new CredentialCache();
If you can work with WebClient instead of WebRequest, you should; it's a bit higher level and makes things like headers easier to handle.
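For instance, a rough sketch of spotting an authentication challenge with WebClient (the URL is a placeholder; WebClient surfaces the 401 as a WebException):
WebClient wc = new WebClient();
try
{
    string html = wc.DownloadString("http://example.com/protected"); // placeholder URL
}
catch (WebException ex)
{
    HttpWebResponse resp = ex.Response as HttpWebResponse;
    if (resp != null && resp.StatusCode == HttpStatusCode.Unauthorized)
    {
        // The page is asking for authentication; the scheme and realm are in the
        // WWW-Authenticate response header.
        string challenge = resp.Headers["WWW-Authenticate"];
    }
}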
Also, might want to check this thread:
System.Net.WebClient fails weirdly
I previously had a small VBScript that would test if a specific website was accessible by sending a GET request. The script itself was extremely simple and did everything I needed:
Function GETRequest(URL) 'Sends a GET http request to a specific URL
Dim objHttpRequest
Set objHttpRequest = CreateObject("MSXML2.XMLHTTP.3.0")
objHttpRequest.Open "GET", URL, False
On Error Resume Next 'Error checking in case access is denied
objHttpRequest.Send
GETRequest = objHttpRequest.Status
End Function
I now want to include this sort of functionality in an expanded C# application. However I've been unable to get the same results my previous script provided.
Using code similar to what I've posted below sort of gets me a proper result, but fails to run if my network connection has failed.
public static void GETRequest()
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://url");
    request.Method = "GET";
    HttpStatusCode status;
    HttpWebResponse response;
    try
    {
        response = (HttpWebResponse)request.GetResponse();
        status = response.StatusCode;
        Console.WriteLine((int)response.StatusCode);
        Console.WriteLine(status);
    }
    catch (WebException e)
    {
        status = ((HttpWebResponse)e.Response).StatusCode;
        Console.WriteLine(status);
    }
}
But as I said, I need to know if the site is accessible, no matter the reason: the portal could be down, or the problem might be on the side of the PC that's trying to access it. Either way, I don't care.
When I used MSXML2.XMLHTTP.3.0 in the script I was able to get values ranging from 12000 to 12156 if I was having network problems. I would like to have the same functionality in my C# app, that way I could at least write a minimum of information to a log and let the computer act accordingly. Any ideas?
A direct translation of your code would be something like this:
static void GetStatusCode(string url)
{
dynamic httpRequest = Activator.CreateInstance(Type.GetTypeFromProgID("MSXML2.XMLHTTP.3.0"));
httpRequest.Open("GET", url, false);
try { httpRequest.Send(); }
catch { }
finally { Console.WriteLine(httpRequest.Status); }
}
It's as small and simple as your VBScript script, and uses the same COM object to send the request.
This code happily gives me error codes like 12029 (ERROR_WINHTTP_CANNOT_CONNECT) or 12007 (ERROR_WINHTTP_NAME_NOT_RESOLVED), etc.
If the code is failing only when you don't have an available network connection, you can call GetIsNetworkAvailable() before executing your code. This method returns a boolean indicating whether a network connection is available. If it returns false, you can return early or alert the user; otherwise, continue.
System.Net.NetworkInformation.NetworkInterface.GetIsNetworkAvailable()
Using the code you provided above:
public static void GETRequest()
{
    if (!System.Net.NetworkInformation.NetworkInterface.GetIsNetworkAvailable())
        return; // or alert the user there is no connection

    HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://url");
    request.Method = "GET";
    HttpStatusCode status;
    HttpWebResponse response;
    try
    {
        response = (HttpWebResponse)request.GetResponse();
        status = response.StatusCode;
        Console.WriteLine((int)response.StatusCode);
        Console.WriteLine(status);
    }
    catch (WebException e)
    {
        // e.Response can still be null (e.g. on a DNS failure), so guard before casting
        HttpWebResponse errorResponse = e.Response as HttpWebResponse;
        if (errorResponse != null)
        {
            status = errorResponse.StatusCode;
            Console.WriteLine(status);
        }
        else
        {
            Console.WriteLine(e.Status); // WebExceptionStatus, e.g. NameResolutionFailure
        }
    }
}
This should work for you; I've used it many times before and have cut it down a bit for your needs:
private static string GetStatusCode(string url)
{
    HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
    req.Method = WebRequestMethods.Http.Get;
    req.ProtocolVersion = HttpVersion.Version11;
    req.UserAgent = "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)";
    try
    {
        HttpWebResponse response = (HttpWebResponse)req.GetResponse();
        StringBuilder sb = new StringBuilder();
        foreach (string header in response.Headers)
        {
            sb.AppendLine(string.Format("{0}: {1}", header, response.GetResponseHeader(header)));
        }
        return string.Format("Response Status Code: {0}\nServer:{1}\nProtocol: {2}\nRequest Method: {3}\n\n***Headers***\n\n{4}",
            response.StatusCode, response.Server, response.ProtocolVersion, response.Method, sb);
    }
    catch (Exception e)
    {
        return string.Format("Error: {0}", e.ToString());
    }
}
Feel free to ignore the section that gets the headers
I am trying to parse the HTML code of the page at http://odds.bestbetting.com/horse-racing/today in order to have a list of races, etc.
The problem is I am not being able to retrieve the HTML code of the page. Here is the C# code of the function:
public static string Http(string url) {
    Uri myUri = new Uri(url);
    // Create a 'HttpWebRequest' object for the specified url.
    HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create(myUri);
    myHttpWebRequest.AllowAutoRedirect = true;
    // Send the request and wait for response.
    HttpWebResponse myHttpWebResponse = (HttpWebResponse)myHttpWebRequest.GetResponse();
    var stream = myHttpWebResponse.GetResponseStream();
    var reader = new StreamReader(stream);
    var html = reader.ReadToEnd();
    // Release resources of response object.
    myHttpWebResponse.Close();
    return html;
}
When I execute the program calling the function it throws an exception on
HttpWebResponse myHttpWebResponse =
(HttpWebResponse)myHttpWebRequest.GetResponse();
which is:
Cannot handle redirect from HTTP/HTTPS protocols to other dissimilar ones.
I have read this question but I don't seem to have the same problem.
I've also tried figuring something out by sniffing the traffic with Fiddler, but I can't see where it redirects to or anything similar. I just extracted these two possible redirections: odds.bestbetting.com/horse-racing/2011-06-10/byCourse and odds.bestbetting.com/horse-racing/2011-06-10/byTime, but querying them produces the same result as above.
It's not the first time I've done something like this, but I'm really lost on this one. Any help?
Thanks!
I finally found the solution... it was indeed a problem with the headers, specifically the User-Agent one.
After lots of searching I found someone having the same problem with the same site. Although his code was different, the important bit was that he set the UserAgent property of the request manually to that of a browser. I think I had tried this before, but I may have done it badly... sorry.
The final code if it is of interest to any one is this:
public static string Http(string url) {
    if (url.Length > 0)
    {
        Uri myUri = new Uri(url);
        // Create a 'HttpWebRequest' object for the specified url.
        HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create(myUri);
        // Set the user agent as if we were a web browser
        myHttpWebRequest.UserAgent = @"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4";
        HttpWebResponse myHttpWebResponse = (HttpWebResponse)myHttpWebRequest.GetResponse();
        var stream = myHttpWebResponse.GetResponseStream();
        var reader = new StreamReader(stream);
        var html = reader.ReadToEnd();
        // Release resources of response object.
        myHttpWebResponse.Close();
        return html;
    }
    else { return "NO URL"; }
}
Thank you very much for helping.
There can be a dozen probable causes for your problem.
One of them is that the redirect from the server is pointing to an FTP site, or something like that.
It can also be that the server requires some headers in the request that you're failing to provide.
Check what a browser would send to the site and try to replicate.
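For instance, a rough sketch of mimicking a browser's usual request headers with HttpWebRequest (the header values here are only examples):
HttpWebRequest req = (HttpWebRequest)WebRequest.Create("http://odds.bestbetting.com/horse-racing/today");
req.AllowAutoRedirect = true;
// Headers a typical browser would send; the exact values are illustrative
req.UserAgent = "Mozilla/5.0 (Windows NT 6.1; rv:10.0) Gecko/20100101 Firefox/10.0";
req.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
req.Headers["Accept-Language"] = "en-GB,en;q=0.5";
using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
using (StreamReader reader = new StreamReader(resp.GetResponseStream()))
{
    string html = reader.ReadToEnd();
}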
I am trying to screen-scrape a page of a web app that just contains text and is hosted by a third party. It's not a properly formed HTML page, but the text that is displayed will tell us if the web app is up or down.
When I try to scrape the screen, it returns an error when it attempts the WebRequest. The error is "The remote server returned an error: (500) Internal Server Error."
public void ScrapeScreen()
{
    try
    {
        var url = textBox1.Text;
        var request = WebRequest.Create(url);
        var response = request.GetResponse();
        var stream = response.GetResponseStream();
        var reader = new StreamReader(stream);
        var result = reader.ReadToEnd();
        stream.Dispose();
        reader.Dispose();
        richTextBox1.Text = result;
    }
    catch (Exception ex)
    {
        MessageBox.Show(ex.Message);
    }
}
Any ideas how I can get the text from the page?
Some sites don't like the default UserAgent. Consider changing it to something real, like:
((HttpWebRequest)request).UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.125 Safari/533.4";
First, try this:
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
However, if you're just looking for text and don't need to POST any data to the server, you may want to look at the WebClient class. It more closely resembles a real browser and takes care of a lot of the HTTP header plumbing that you may otherwise end up having to tweak if you stick with the HttpWebRequest class.
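A minimal sketch of that WebClient route, assuming the URL still comes from the text box (the User-Agent value is just an example):
using (var client = new WebClient())
{
    // Some servers reject requests that carry no realistic User-Agent
    client.Headers["User-Agent"] = "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.125 Safari/533.4";
    richTextBox1.Text = client.DownloadString(textBox1.Text);
}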
For the project I'm working on, we have a desktop program that contacts an online server for a store. Because it's used in schools, getting the proxy setup right is tricky. What we've gone for is to allow users to specify proxy details to use if they want; otherwise it uses the ones from IE. We've also tried to guard against incorrect details being entered, so the code tries the user-specified proxy; if that fails, the default one; if that fails, the default with credentials; and if that fails, null.
The problem I'm having is that in places where the proxy settings need to be changed in quick succession (for example, if their registration fails because the proxy is wrong, they change one tiny thing and try again within seconds), I end up with calls to HttpWebRequest.GetResponse() timing out, causing the program to freeze for a good while. Sometimes if I leave a minute or two between the changes it doesn't freeze, but not every time (I just tried again after 10 minutes and it's timing out again).
I can't spot anything in the code that could cause this - though it looks a bit messy. I don't think it could be the server refusing the request unless it's generic server behaviour as I've tried this with requests to our server and others such as google.co.uk.
I'm posting the code in the hope that someone may be able to spot something that's wrong with it, or knows a much simpler way of doing what we're trying to.
The tests we run are without any proxy, so the first part is usually skipped. The first time ApplyProxy is run, it works fine and finishes everything in the first try block, the second, it can either timeout on the GetResponse in the first try block and then go through the rest of the code, or it can work there and timeout on the actual requests made for the registration.
Code:
void ApplyProxy()
{
    Boolean ProxySuccess = true;
    String WebRequestURI = @"http://www.google.co.uk";
    if (UseProxy)
    {
        try
        {
            String ProxyUrl = (ProxyUri.ToLower().Contains("http://")) ?
                ProxyUri :
                "http://" + ProxyUri;
            WebRequest.DefaultWebProxy = new WebProxy(ProxyUrl);
            if (!string.IsNullOrEmpty(ProxyUsername) && !string.IsNullOrEmpty(ProxyPassword))
                WebRequest.DefaultWebProxy.Credentials = new NetworkCredential(ProxyUsername, ProxyPassword);
            HttpWebRequest request = HttpWebRequest.Create(WebRequestURI) as HttpWebRequest;
            request.Method = "GET";
            HttpWebResponse response = request.GetResponse() as HttpWebResponse;
        }
        catch
        {
            ProxySuccess = false;
        }
    }
    if (!ProxySuccess || !UseProxy)
    {
        try
        {
            WebRequest.DefaultWebProxy = WebRequest.GetSystemWebProxy();
            HttpWebRequest request = HttpWebRequest.Create(WebRequestURI) as HttpWebRequest;
            request.Method = "GET";
            HttpWebResponse response = request.GetResponse() as HttpWebResponse;
        }
        catch (Exception e)
        { //try with credentials
            //make a new proxy from defaults
            WebRequest.DefaultWebProxy = WebRequest.GetSystemWebProxy();
            String newProxyURI = WebRequest.DefaultWebProxy.GetProxy(new Uri(WebRequestURI)).ToString();
            if (newProxyURI == String.Empty)
            { //check we actually get a result
                WebRequest.DefaultWebProxy = null;
                return;
            }
            //continue
            WebProxy NewProxy = new WebProxy(newProxyURI);
            NewProxy.UseDefaultCredentials = true;
            NewProxy.Credentials = CredentialCache.DefaultCredentials;
            WebRequest.DefaultWebProxy = NewProxy;
            try
            {
                HttpWebRequest request = HttpWebRequest.Create(WebRequestURI) as HttpWebRequest;
                request.Method = "GET";
                HttpWebResponse response = request.GetResponse() as HttpWebResponse;
            }
            catch
            {
                WebRequest.DefaultWebProxy = null;
            }
        }
    }
}
Is it not just a case of needing to set the Timeout property of the HttpWebRequest? It could be that the connection is being made but not serviced (a wrong type of proxy server or a stalled server, for example), in which case the request may be waiting for the full Timeout period before giving up; a shorter timeout may be preferable here.
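A minimal sketch of that, using an illustrative 10-second limit on the proxy test request:
HttpWebRequest request = HttpWebRequest.Create(WebRequestURI) as HttpWebRequest;
request.Method = "GET";
request.Timeout = 10000; // e.g. give up after 10 seconds instead of the 100-second default
HttpWebResponse response = request.GetResponse() as HttpWebResponse;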
It seems to have been a programming error on my part. The responses were left open, and evidently either the program or the server doesn't like this. Simply closing each HttpWebResponse once done seems to remove the issue.
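For example, wrapping the test request's response in a using block (a sketch based on the snippet from ApplyProxy) guarantees it is closed even when later code throws:
HttpWebRequest request = HttpWebRequest.Create(WebRequestURI) as HttpWebRequest;
request.Method = "GET";
using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
{
    // the response, and the connection behind it, are released when this block exits
}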