C# WebRequest: check if a page requires HTTP authentication

Does anyone know how to check if a webpage is asking for HTTP Authentication via C# using the WebRequest class? I'm not asking how to post Credentials to the page, just how to check if the page is asking for Authentication.
Current Snippet to get HTML:
WebRequest wrq = WebRequest.Create(address);
WebResponse wrs = wrq.GetResponse();
Uri uri = wrs.ResponseUri;
StreamReader strdr = new StreamReader(wrs.GetResponseStream());
string html = strdr.ReadToEnd();
wrs.Close();
strdr.Close();
return html;
PHP Server side source:
<?php
if (!isset($_SERVER['PHP_AUTH_USER'])) {
header('WWW-Authenticate: Basic realm="Secure Sign-in"');
header('HTTP/1.0 401 Unauthorized');
echo 'Text to send if user hits Cancel button';
exit;
} else {
echo "<p>Hello {$_SERVER['PHP_AUTH_USER']}.</p>";
echo "<p>You entered {$_SERVER['PHP_AUTH_PW']} as your password.</p>";
}
?>

For an http:// or https:// address, WebRequest.GetResponse returns an object of type HttpWebResponse. Just cast it and you can read StatusCode.
However, .NET throws a WebException if it receives a response with a 4xx or 5xx status (thanks for your feedback), so a 401 never reaches the normal code path.
The workaround is to catch the exception and inspect its Response property:
HttpWebRequest wrq = (HttpWebRequest)WebRequest.Create(@"http://webstrand.comoj.com/locked/safe.php");
HttpWebResponse wrs = null;
try
{
wrs = (HttpWebResponse)wrq.GetResponse();
}
catch (System.Net.WebException protocolError)
{
// Response is null when the failure was not an HTTP response (DNS failure, timeout, ...)
HttpWebResponse errorResponse = protocolError.Response as HttpWebResponse;
if (errorResponse != null && errorResponse.StatusCode == HttpStatusCode.Unauthorized)
{
// the page is asking for authentication
}
}
catch (System.Exception generalError)
{
//run to the hills
}
// wrs is still null if GetResponse threw
if (wrs != null && wrs.StatusCode == HttpStatusCode.OK)
{
Uri uri = wrs.ResponseUri;
StreamReader strdr = new StreamReader(wrs.GetResponseStream());
string html = strdr.ReadToEnd();
strdr.Close();
wrs.Close();
}
Hope this helps.
Regards
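A 401 response normally carries a WWW-Authenticate header (the PHP script in the question sends `Basic realm="Secure Sign-in"`), so the same check can also report which scheme is being requested. The following is only a minimal sketch: `AuthProbe` and `GetChallengeScheme` are hypothetical names, not part of .NET, and the parsing simply takes the first token of the header.

```csharp
using System;
using System.Net;

public static class AuthProbe
{
    // Hypothetical helper: returns the challenge scheme (e.g. "Basic") when the
    // status is 401 and a WWW-Authenticate value is present; otherwise null.
    public static string GetChallengeScheme(HttpStatusCode status, string wwwAuthenticate)
    {
        if (status != HttpStatusCode.Unauthorized || string.IsNullOrWhiteSpace(wwwAuthenticate))
            return null;

        // The scheme is the first token, e.g. "Basic" in: Basic realm="Secure Sign-in"
        string trimmed = wwwAuthenticate.Trim();
        int space = trimmed.IndexOf(' ');
        return space < 0 ? trimmed : trimmed.Substring(0, space);
    }
}
```

Inside a catch block for WebException it could be called as `AuthProbe.GetChallengeScheme(response.StatusCode, response.Headers["WWW-Authenticate"])`, where `response` is the HttpWebResponse taken from the exception's Response property.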

Might want to try
WebClient wc = new WebClient();
CredentialCache credCache = new CredentialCache();
If you can work with WebClient instead of WebRequest, you should; it's a bit higher level and makes things like headers easier to handle.
Also, might want to check this thread:
System.Net.WebClient fails weirdly

Related

Check whether a web URL is working or not in selenium

I'm trying to create a Selenium script in C# to check whether a URL is working or returning an error. What is the simplest way to do that?
Don't do it with Selenium; use HttpClient:
string url = "url";
var client = new HttpClient();
var checkingResponse = await client.GetAsync(url);
if (checkingResponse.IsSuccessStatusCode) {
Console.WriteLine($"{url} is alive");
}
To check whether a URL is working or returning an error from Selenium's C# client, you can use the HttpWebRequest and HttpWebResponse classes to get the response and its status code as follows:
// url is assumed to be the IWebElement of a link found by Selenium
string href = url.GetAttribute("href");
HttpWebRequest re = (HttpWebRequest)WebRequest.Create(href);
try
{
var response = (HttpWebResponse)re.GetResponse();
System.Console.WriteLine($"URL: {href} status is: {response.StatusCode}");
}
catch (WebException e)
{
// e.Response is null when no HTTP response arrived (DNS failure, timeout, ...)
var errorResponse = e.Response as HttpWebResponse;
System.Console.WriteLine($"URL: {href} status is: {(errorResponse != null ? errorResponse.StatusCode.ToString() : e.Status.ToString())}");
}

Sending an HTTP request in C# and catching network issues

I previously had a small VBScript that would test if a specific website was accessible by sending a GET request. The script itself was extremely simple and did everything I needed:
Function GETRequest(URL) 'Sends a GET http request to a specific URL
Dim objHttpRequest
Set objHttpRequest = CreateObject("MSXML2.XMLHTTP.3.0")
objHttpRequest.Open "GET", URL, False
On Error Resume Next 'Error checking in case access is denied
objHttpRequest.Send
GETRequest = objHttpRequest.Status
End Function
I now want to include this sort of functionality in an expanded C# application. However I've been unable to get the same results my previous script provided.
Using code similar to what I've posted below sort of gets me a proper result, but fails to run if my network connection has failed.
public static void GETRequest()
{
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://url");
request.Method = "GET";
HttpStatusCode status;
HttpWebResponse response;
try
{
response = (HttpWebResponse)request.GetResponse();
status = response.StatusCode;
Console.WriteLine((int)response.StatusCode);
Console.WriteLine(status);
}
catch (WebException e)
{
status = ((HttpWebResponse)e.Response).StatusCode;
Console.WriteLine(status);
}
}
But as I said, I need to know whether the site is accessible, no matter the reason: the portal could be down, or the problem might be on the side of the PC that's trying to access it. Either way, I don't care.
When I used MSXML2.XMLHTTP.3.0 in the script I was able to get values ranging from 12000 to 12156 if I was having network problems. I would like to have the same functionality in my C# app, that way I could at least write a minimum of information to a log and let the computer act accordingly. Any ideas?
A direct translation of your code would be something like this:
static void GetStatusCode(string url)
{
dynamic httpRequest = Activator.CreateInstance(Type.GetTypeFromProgID("MSXML2.XMLHTTP.3.0"));
httpRequest.Open("GET", url, false);
try { httpRequest.Send(); }
catch { }
finally { Console.WriteLine(httpRequest.Status); }
}
It's as small and simple as your VBScript script, and uses the same COM object to send the request.
This code happily gives me error codes such as 12029 (ERROR_WINHTTP_CANNOT_CONNECT) or 12007 (ERROR_WINHTTP_NAME_NOT_RESOLVED).
If the code is failing only when you don't have an available network connection, you can use GetIsNetworkAvailable() before executing your code. This method will return a boolean indicating if a network connection is available or not. If it returns false, you could execute an early return / notify the user, and if not, continue.
System.Net.NetworkInformation.NetworkInterface.GetIsNetworkAvailable()
Using the code you provided above:
public static void GETRequest()
{
if (!System.Net.NetworkInformation.NetworkInterface.GetIsNetworkAvailable())
return; //or alert the user there is no connection
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://url");
request.Method = "GET";
HttpStatusCode status;
HttpWebResponse response;
try
{
response = (HttpWebResponse)request.GetResponse();
status = response.StatusCode;
Console.WriteLine((int)response.StatusCode);
Console.WriteLine(status);
}
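catch (WebException e)
{
// e.Response is null when no HTTP response was received (DNS failure, timeout, ...)
HttpWebResponse errorResponse = e.Response as HttpWebResponse;
if (errorResponse != null)
Console.WriteLine(errorResponse.StatusCode);
else
Console.WriteLine(e.Status);
}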
}
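When the failure happens before any HTTP response arrives (no DNS, no route, timeout), WebException.Response is null and the reason is carried by WebException.Status instead. The sketch below turns either case into a single line for a log file. `RequestFailure.Describe` is a hypothetical helper name, and the text it produces is based on .NET's WebExceptionStatus names, not on the 12xxx codes that MSXML2 reports.

```csharp
using System;
using System.Net;

public static class RequestFailure
{
    // Hypothetical helper: turns a WebException into one log-friendly line.
    public static string Describe(WebException e)
    {
        // An HTTP response arrived: report its status code (e.g. 404, 503).
        HttpWebResponse http = e.Response as HttpWebResponse;
        if (http != null)
            return "HTTP " + (int)http.StatusCode + " " + http.StatusCode;

        // No response at all: Status says why
        // (NameResolutionFailure, ConnectFailure, Timeout, ...).
        return "Network " + e.Status;
    }
}
```

In a `catch (WebException e)` block, `Console.WriteLine(RequestFailure.Describe(e))` would then cover both a server that answers with an error and a machine that cannot reach the server at all.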
This should work for you. I've used it many times before and cut it down a bit for your needs:
private static string GetStatusCode(string url)
{
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
req.Method = WebRequestMethods.Http.Get;
req.ProtocolVersion = HttpVersion.Version11;
req.UserAgent = "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)";
try
{
HttpWebResponse response = (HttpWebResponse)req.GetResponse();
StringBuilder sb = new StringBuilder();
foreach (string header in response.Headers)
{
sb.AppendLine(string.Format("{0}: {1}", header, response.GetResponseHeader(header)));
}
return string.Format("Response Status Code: {0}\nServer:{1}\nProtocol: {2}\nRequest Method: {3}\n\n***Headers***\n\n{4}", response.StatusCode,response.Server, response.ProtocolVersion, response.Method, sb);
}
catch (Exception e)
{
return string.Format("Error: {0}", e.ToString());
}
}
Feel free to ignore the section that gets the headers

Can't get HTML code through HttpWebRequest

I am trying to parse the HTML code of the page at http://odds.bestbetting.com/horse-racing/today in order to have a list of races, etc.
The problem is that I am not able to retrieve the HTML code of the page. Here is the C# code of the function:
public static string Http(string url) {
Uri myUri = new Uri(url);
// Create a 'HttpWebRequest' object for the specified url.
HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create(myUri);
myHttpWebRequest.AllowAutoRedirect = true;
// Send the request and wait for response.
HttpWebResponse myHttpWebResponse = (HttpWebResponse)myHttpWebRequest.GetResponse();
var stream = myHttpWebResponse.GetResponseStream();
var reader = new StreamReader(stream);
var html = reader.ReadToEnd();
// Release resources of response object.
myHttpWebResponse.Close();
return html;
}
When I execute the program calling the function it throws an exception on
HttpWebResponse myHttpWebResponse =
(HttpWebResponse)myHttpWebRequest.GetResponse();
which is:
Cannot handle redirect from HTTP/HTTPS protocols to other dissimilar ones.
I have read this question but I don't seem to have the same problem.
I've also tried to figure something out by sniffing the traffic with Fiddler, but I can't see where it redirects or anything similar. I have only extracted these two possible redirections: odds.bestbetting.com/horse-racing/2011-06-10/byCourse and odds.bestbetting.com/horse-racing/2011-06-10/byTime, but querying them produces the same result as above.
It's not the first time I've done something like this, but I'm really lost on this one. Any help?
Thanks!
I finally found the solution: it really was a problem with the headers, specifically the User-Agent one.
After a lot of searching I found someone who had the same problem with the same site. His code was different, but the important bit was that he set the UserAgent property of the request manually to that of a browser. I think I had tried this before, but I may have done it badly... sorry.
The final code, in case it is of interest to anyone, is this:
public static string Http(string url) {
if (url.Length > 0)
{
Uri myUri = new Uri(url);
// Create a 'HttpWebRequest' object for the specified url.
HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create(myUri);
// Set the user agent as if we were a web browser
myHttpWebRequest.UserAgent = @"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4";
HttpWebResponse myHttpWebResponse = (HttpWebResponse)myHttpWebRequest.GetResponse();
var stream = myHttpWebResponse.GetResponseStream();
var reader = new StreamReader(stream);
var html = reader.ReadToEnd();
// Release resources of response object.
myHttpWebResponse.Close();
return html;
}
else { return "NO URL"; }
}
Thank you very much for helping.
There can be a dozen possible causes for your problem.
One of them is that the redirect from the server points to an FTP site, or something like that.
It can also be that the server requires some headers in the request that you're failing to provide.
Check what a browser would send to the site and try to replicate it.
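Replicating a browser usually means setting a handful of request headers before calling GetResponse. The sketch below only builds the request and does not send it. `BrowserLikeRequest` is a hypothetical helper name, the User-Agent is the one quoted in the accepted solution, and the Accept and Accept-Language values are illustrative assumptions rather than values known to be required by any particular site.

```csharp
using System;
using System.Net;

public static class BrowserLikeRequest
{
    // Hypothetical helper: builds (but does not send) a request that carries
    // a few browser-style headers.
    public static HttpWebRequest Create(string url)
    {
        // WebRequest is flagged obsolete (SYSLIB0014) on newer .NET versions but still works.
#pragma warning disable SYSLIB0014
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
#pragma warning restore SYSLIB0014

        // Some headers have dedicated properties and must be set through them.
        request.UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4";
        request.Accept = "text/html,application/xhtml+xml;q=0.9,*/*;q=0.8";

        // Other headers can be added through the Headers collection.
        request.Headers["Accept-Language"] = "en-US,en;q=0.5";
        return request;
    }
}
```

Comparing these values with what Fiddler shows for a real browser session is a reasonable way to find out which header the server actually insists on.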

I want to check whether the file in a url entered exists or not using .net

I am developing a tool that validates the links found at a URL the user enters. Suppose I have entered a URL (e.g. http://www-review-k6.thinkcentral.com/content/hsp/science/hspscience/na/gr3/se_9780153722271_/content/nlsg3_006.html) in textbox1, and I want to check whether the targets of all its links exist on the remote server. Finally, I want a log file of the broken links.
You can use HttpWebRequest.
Note four things:
1) GetResponse throws a WebException if the link doesn't exist.
2) You may want to disable auto-redirect.
3) You may also want to check that it's a valid URL. If it isn't, a UriFormatException is thrown.
UPDATED
4) As Paige suggested, set request.Method to "HEAD" so that the whole remote file isn't downloaded.
static bool UrlExists(string url)
{
try
{
HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(url);
request.Method = "HEAD";
request.AllowAutoRedirect = false;
request.GetResponse();
}
catch (UriFormatException)
{
// Invalid Url
return false;
}
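catch (WebException ex)
{
// ex.Response is null when no HTTP response arrived (DNS failure, timeout, ...)
HttpWebResponse webResponse = ex.Response as HttpWebResponse;
if (webResponse == null || webResponse.StatusCode == HttpStatusCode.NotFound)
{
// Valid URL, but unreachable or not found
return false;
}
}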
return true;
}
Use the HttpWebResponse class:
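HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create("http://www.gooogle.com/");
try
{
webRequest.GetResponse().Close(); // no exception: the URL exists
}
catch (WebException ex)
{
// GetResponse throws on 4xx/5xx, so the 404 check belongs here
HttpWebResponse error = ex.Response as HttpWebResponse;
if (error != null && error.StatusCode == HttpStatusCode.NotFound)
{ /* do something */ }
}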
bool LinkExist(string link)
{
HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(link);
try
{
using (HttpWebResponse webResponse = (HttpWebResponse)webRequest.GetResponse())
return webResponse.StatusCode != HttpStatusCode.NotFound;
}
catch (WebException)
{
return false; // 4xx/5xx response, or no response at all
}
}
Use an HTTP HEAD request as explained in this article: http://www.eggheadcafe.com/tutorials/aspnet/2c13cafc-be1c-4dd8-9129-f82f59991517/the-lowly-http-head-reque.aspx
Make an HTTP request to the URL and see if you get a 404 response. If so, it does not exist.
Do you need a code example?
If your goal is robust validation of page source, consider using a tool that is already written, like the W3C Link Checker. It can be run as a command-line program that finds links, pictures, CSS, etc. and checks them for validity. It can also recursively check an entire website.

How can I validate a URL in C# to avoid 404 errors?

I need to write a tool in C# that reports broken URLs. A URL should only be reported as broken if the user would see a 404 error in the browser. I believe there might be tricks to handle web servers that do URL rewriting. Here's what I have. As you can see, some URLs validate incorrectly.
string url = "";
// TEST CASES
//url = "http://newsroom.lds.org/ldsnewsroom/eng/news-releases-stories/local-churches-teach-how-to-plan-for-disasters"; //Prints "BROKEN", although this is getting re-written to good url below.
//url = "http://beta-newsroom.lds.org/article/local-churches-teach-how-to-plan-for-disasters"; // Prints "GOOD"
//url = "http://"; //Prints "BROKEN"
//url = "google.com"; //Prints "BROKEN" although this should be good.
//url = "www.google.com"; //Prints "BROKEN" although this should be good.
//url = "http://www.google.com"; //Prints "GOOD"
try
{
if (url != "")
{
WebRequest Irequest = WebRequest.Create(url);
WebResponse Iresponse = Irequest.GetResponse();
if (Iresponse != null)
{
_txbl.Text = "GOOD";
}
}
}
catch (Exception ex)
{
_txbl.Text = "BROKEN";
}
For one, Irequest and Iresponse shouldn't be named like that. They should just be webRequest and webResponse, or even just request and response. The capital "I" prefix is generally only used for interface naming, not for instance variables.
To do your URL validity checking, use UriBuilder to get a Uri. Then you should use HttpWebRequest and HttpWebResponse so that you can check the strongly typed status code response. Finally, you should be a bit more informative about what was broken.
Here are links to some of the additional .NET features I introduced:
string.IsNullOrEmpty()
HttpWebRequest
HttpWebResponse
HttpStatusCode
Uri
UriBuilder
string.Format()
Sample:
try
{
if (!string.IsNullOrEmpty(url))
{
UriBuilder uriBuilder = new UriBuilder(url);
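// Note: GetResponse throws a WebException for 4xx/5xx responses, so a 404 usually ends up in the catch block
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uriBuilder.Uri);
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
if (response.StatusCode == HttpStatusCode.NotFound)
{
_txbl.Text = "Broken - 404 Not Found";
}
else if (response.StatusCode == HttpStatusCode.OK)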
{
_txbl.Text = "URL appears to be good.";
}
else //There are a lot of other status codes you could check for...
{
_txbl.Text = string.Format("URL might be ok. Status: {0}.",
response.StatusCode.ToString());
}
}
}
catch (Exception ex)
{
_txbl.Text = string.Format("Broken- Other error: {0}", ex.Message);
}
Prepend http:// or https:// to the URL and pass it to the WebClient.OpenRead method. It throws a WebException if the URL is malformed or the resource cannot be read.
private WebClient webClient = new WebClient();
try {
Stream strm = webClient.OpenRead(URL);
}
catch (WebException) {
throw; // rethrow without resetting the stack trace
}
The problem is that most of those 'should be good' cases are actually dealt with at the browser level, I believe. If you omit the 'http://' it's an invalid request, but the browser puts it in for you.
So maybe you could do a similar check that the browser would do:
Ensure there is an 'http://' at the beginning
Ensure there is a 'www.' at the beginning
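The first of those browser-style checks can be sketched as a small normalization step that runs before the request is made. This is only an illustration under stated assumptions: `UrlNormalizer.TryNormalize` is a hypothetical helper name, it assumes http:// when no scheme was typed, it accepts only http and https, and it deliberately does not add 'www.' since not every host answers under that name.

```csharp
using System;

public static class UrlNormalizer
{
    // Hypothetical helper: tries to turn user input such as "google.com"
    // into an absolute http(s) Uri.
    public static bool TryNormalize(string input, out Uri uri)
    {
        uri = null;
        if (string.IsNullOrWhiteSpace(input))
            return false;

        string candidate = input.Trim();

        // Like a browser, assume http:// when no scheme was typed.
        if (!candidate.Contains("://"))
            candidate = "http://" + candidate;

        Uri parsed;
        if (!Uri.TryCreate(candidate, UriKind.Absolute, out parsed))
            return false;

        // Only web addresses are of interest here.
        if (parsed.Scheme != Uri.UriSchemeHttp && parsed.Scheme != Uri.UriSchemeHttps)
            return false;

        uri = parsed;
        return true;
    }
}
```

A caller would pass the resulting Uri to WebRequest.Create only when TryNormalize returns true, and report the input as malformed otherwise.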
