Status code 301 not showing correctly in C#

I am able to get the status code numbers with an enum cast, as suggested by dtb in Getting Http Status code number (200, 301, 404, etc.) from HttpWebRequest and HttpWebResponse. However, for a site that has moved permanently I am also getting 200 (OK); what I want to see is 301. My code is below. What could be wrong or need correcting? Please help.
public int GetHeaders(string url)
{
    //HttpStatusCode result = default(HttpStatusCode);
    int result = 0;

    var request = HttpWebRequest.Create(url);
    request.Method = "HEAD";

    try
    {
        using (var response = request.GetResponse() as HttpWebResponse)
        {
            if (response != null)
            {
                result = (int)response.StatusCode;
                response.Close();
            }
        }
    }
    catch (WebException we)
    {
        if (we.Response != null)
        {
            result = (int)((HttpWebResponse)we.Response).StatusCode;
        }
    }

    return result;
}
The tool where I am using this code can show 404s and non-existent domains, but it ignores redirects and shows details for the redirect target instead. For example, if I put my older domain easytipsandtricks.com in the text field, it shows the results for tipscow.com (if you check easytipsandtricks.com in any online checker tool, you will notice that it gives the correct redirect message, 301 Moved). Please help.

You need to set HttpWebRequest.AllowAutoRedirect to false (the default is true) so that it does not automatically follow redirects (30x responses).
If AllowAutoRedirect is set to false, all responses with an HTTP status code from 300 to 399 are returned to the application.
Sample:
var request = (HttpWebRequest)HttpWebRequest.Create(url);
request.Method = "HEAD";
request.AllowAutoRedirect = false;
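Applied to the GetHeaders method from the question, a minimal sketch might look like this (the only changes to the original logic are the cast to HttpWebRequest, so the property is available, and the AllowAutoRedirect line):
public int GetHeaders(string url)
{
    int result = 0;

    var request = (HttpWebRequest)WebRequest.Create(url);
    request.Method = "HEAD";
    request.AllowAutoRedirect = false; // keep 301/302 responses instead of following them

    try
    {
        using (var response = request.GetResponse() as HttpWebResponse)
        {
            if (response != null)
            {
                // for a permanently moved site this is now 301 rather than the 200 of the target
                result = (int)response.StatusCode;
            }
        }
    }
    catch (WebException we)
    {
        if (we.Response != null)
        {
            result = (int)((HttpWebResponse)we.Response).StatusCode;
        }
    }

    return result;
}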

Related

Test connection to webpage

I'm trying to test in my code whether a specific web site is available.
public override HealthCheckResult DoHealthCheck()
{
    // Create a request for the health-check URL.
    var request = (HttpWebRequest)WebRequest.Create(_healthCheckUrl);
    // Set the Method property of the request to GET.
    request.Method = "GET";
    request.Timeout = HealthCheckConfiguration.Timeout * 1000; // very, very large timeout
    // Set the ContentType property of the WebRequest.
    request.ContentType = "text/xml; encoding='utf-8'";

    // Get the response.
    try
    {
        HttpWebResponse response = (HttpWebResponse)request.GetResponse();
        if (response.StatusCode == HttpStatusCode.OK)
        {
            return HealthCheckResult.Success;
        }
    }
    catch (WebException)
    {
        return HealthCheckResult.Timeout;
    }
    return HealthCheckResult.Failed;
}
Sometimes I get HttpStatusCode.OK, but most of the time I get the timeout exception.
If I just copy and paste the _healthCheckUrl string into Internet Explorer, the page always loads quickly. I don't understand why I get so many timeouts in the code.
Thanks in advance.

Sending an HTTP request in C# and catching network issues

I previously had a small VBScript that would test if a specific website was accessible by sending a GET request. The script itself was extremely simple and did everything I needed:
Function GETRequest(URL) 'Sends a GET http request to a specific URL
    Dim objHttpRequest
    Set objHttpRequest = CreateObject("MSXML2.XMLHTTP.3.0")
    objHttpRequest.Open "GET", URL, False
    On Error Resume Next 'Error checking in case access is denied
    objHttpRequest.Send
    GETRequest = objHttpRequest.Status
End Function
I now want to include this sort of functionality in an expanded C# application. However I've been unable to get the same results my previous script provided.
Using code similar to what I've posted below sort of gets me a proper result, but fails to run if my network connection has failed.
public static void GETRequest()
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://url");
    request.Method = "GET";

    HttpStatusCode status;
    HttpWebResponse response;

    try
    {
        response = (HttpWebResponse)request.GetResponse();
        status = response.StatusCode;
        Console.WriteLine((int)response.StatusCode);
        Console.WriteLine(status);
    }
    catch (WebException e)
    {
        status = ((HttpWebResponse)e.Response).StatusCode;
        Console.WriteLine(status);
    }
}
But as I said, I need to know if the site is accessible, no matter the reason: the portal could be down, or the problem might be on the side of the PC that's trying to access it. Either way, I don't care.
When I used MSXML2.XMLHTTP.3.0 in the script I was able to get values ranging from 12000 to 12156 if I was having network problems. I would like to have the same functionality in my C# app, that way I could at least write a minimum of information to a log and let the computer act accordingly. Any ideas?
A direct translation of your code would be something like this:
static void GetStatusCode(string url)
{
    dynamic httpRequest = Activator.CreateInstance(Type.GetTypeFromProgID("MSXML2.XMLHTTP.3.0"));
    httpRequest.Open("GET", url, false);

    try { httpRequest.Send(); }
    catch { }
    finally { Console.WriteLine(httpRequest.Status); }
}
It's as small and simple as your VBScript script, and uses the same COM object to send the request.
This code happily gives me error codes like 12029 ERROR_WINHTTP_CANNOT_CONNECT or 12007 ERROR_WINHTTP_NAME_NOT_RESOLVED, etc.
If the code fails only when you don't have an available network connection, you can call GetIsNetworkAvailable() before executing your code. This method returns a boolean indicating whether a network connection is available. If it returns false, you can return early or notify the user; otherwise, continue.
System.Net.NetworkInformation.NetworkInterface.GetIsNetworkAvailable()
Using the code you provided above:
public static void GETRequest()
{
    if (!System.Net.NetworkInformation.NetworkInterface.GetIsNetworkAvailable())
        return; // or alert the user that there is no connection

    HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://url");
    request.Method = "GET";

    HttpStatusCode status;
    HttpWebResponse response;

    try
    {
        response = (HttpWebResponse)request.GetResponse();
        status = response.StatusCode;
        Console.WriteLine((int)response.StatusCode);
        Console.WriteLine(status);
    }
    catch (WebException e)
    {
        status = ((HttpWebResponse)e.Response).StatusCode;
        Console.WriteLine(status);
    }
}
This should work for you; I've used it many times before and have cut it down a bit for your needs:
private static string GetStatusCode(string url)
{
    HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
    req.Method = WebRequestMethods.Http.Get;
    req.ProtocolVersion = HttpVersion.Version11;
    req.UserAgent = "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)";

    try
    {
        HttpWebResponse response = (HttpWebResponse)req.GetResponse();

        StringBuilder sb = new StringBuilder();
        foreach (string header in response.Headers)
        {
            sb.AppendLine(string.Format("{0}: {1}", header, response.GetResponseHeader(header)));
        }

        return string.Format(
            "Response Status Code: {0}\nServer: {1}\nProtocol: {2}\nRequest Method: {3}\n\n***Headers***\n\n{4}",
            response.StatusCode, response.Server, response.ProtocolVersion, response.Method, sb);
    }
    catch (Exception e)
    {
        return string.Format("Error: {0}", e.ToString());
    }
}
Feel free to ignore the section that gets the headers.

I want to check whether the file in a url entered exists or not using .net

I am developing a tool that validates the links in an entered URL. Suppose I have entered a URL (e.g. http://www-review-k6.thinkcentral.com/content/hsp/science/hspscience/na/gr3/se_9780153722271_/content/nlsg3_006.html) in textbox1, and I want to check whether the content of every link exists on the remote server or not. Finally, I want a log file for the broken links.
You can use HttpWebRequest.
Note four things:
1) The web request will throw an exception if the link doesn't exist.
2) You may want to disable auto redirect.
3) You may also want to check whether it's a valid URL; if not, it will throw a UriFormatException.
4) (Updated, per Paige's suggestion) Use "HEAD" as the request.Method so that it won't download the whole remote file.
static bool UrlExists(string url)
{
    try
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "HEAD";
        request.AllowAutoRedirect = false;
        request.GetResponse();
    }
    catch (UriFormatException)
    {
        // Invalid URL
        return false;
    }
    catch (WebException ex)
    {
        // Valid URL, but the resource does not exist
        HttpWebResponse webResponse = (HttpWebResponse)ex.Response;
        if (webResponse.StatusCode == HttpStatusCode.NotFound)
        {
            return false;
        }
    }
    return true;
}
Use the HttpWebResponse class:
HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create("http://www.gooogle.com/");
try { webRequest.GetResponse(); }
catch (WebException ex) // a 404 surfaces as a WebException rather than a normal response
{
    var response = (HttpWebResponse)ex.Response;
    if (response != null && response.StatusCode == HttpStatusCode.NotFound)
    {
        // do something
    }
}
bool LinkExist(string link)
{
    HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(link);
    HttpWebResponse webResponse = (HttpWebResponse)webRequest.GetResponse();
    return webResponse.StatusCode != HttpStatusCode.NotFound;
}
Use an HTTP HEAD request as explained in this article: http://www.eggheadcafe.com/tutorials/aspnet/2c13cafc-be1c-4dd8-9129-f82f59991517/the-lowly-http-head-reque.aspx
Make an HTTP request to the URL and see if you get a 404 response. If so, it does not exist.
Do you need a code example?
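For example, a minimal sketch of that check might look like the following (UrlReturns404 is just an illustrative name; HEAD is used so the body is not downloaded):
static bool UrlReturns404(string url)
{
    var request = (HttpWebRequest)WebRequest.Create(url);
    request.Method = "HEAD"; // headers are enough; switch to GET if a server rejects HEAD

    try
    {
        using (request.GetResponse()) { }
        return false; // the request succeeded, so no 404
    }
    catch (WebException ex)
    {
        // a 404 surfaces here as a WebException with the response attached
        var response = ex.Response as HttpWebResponse;
        return response != null && response.StatusCode == HttpStatusCode.NotFound;
    }
}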
If your goal is robust validation of the page source, consider using a tool that is already written, like the W3C Link Checker. It can be run as a command-line program that handles finding links, pictures, CSS, etc. and checking them for validity. It can also recursively check an entire website.

How to expand URLs in C#?

If I have a URL like http://popurls.com/go/msn.com/l4eba1e6a0ffbd9fc915948434423a7d5, how do I expand it back to the original URL programmatically? Obviously I could use an API like expandurl.com, but that limits me to 100 requests per hour.
Make a request to the URL and check the status code returned. If 301 or 302, look for a Location header, which will contain the "expanded URL":
string url = "http://popurls.com/go/msn.com/l4eba1e6a0ffbd9fc915948434423a7d5";

var request = (HttpWebRequest)WebRequest.Create(url);
request.AllowAutoRedirect = false;

var response = (HttpWebResponse)request.GetResponse();

if ((int)response.StatusCode == 301 || (int)response.StatusCode == 302)
{
    url = response.Headers["Location"];
}
Note: This solution presumes that only one redirect occurs; this may or may not be what you need. If you simply want to de-obfuscate URLs from shorteners (bit.ly et al.), this solution should work well. If a chain of redirects is possible, see the sketch below.
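If there can be more than one redirect, a minimal sketch that loops until a non-redirect response comes back might look like this (the cap of 10 hops is an arbitrary safeguard, not part of the original answer):
static string ExpandUrl(string url)
{
    // follow Location headers until the response is no longer a redirect
    for (int hops = 0; hops < 10; hops++) // arbitrary cap to avoid redirect loops
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "HEAD"; // switch to GET if a server rejects HEAD
        request.AllowAutoRedirect = false;

        using (var response = (HttpWebResponse)request.GetResponse())
        {
            int code = (int)response.StatusCode;
            if (code < 300 || code > 399)
            {
                return url; // not a redirect; this is the final URL
            }

            // resolve against the current URL in case Location is relative
            url = new Uri(new Uri(url), response.Headers["Location"]).AbsoluteUri;
        }
    }
    return url;
}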
Managed to find an answer.
HttpWebRequest req = (HttpWebRequest)WebRequest.Create("http://popurls.com/go/msn.com/l4eba1e6a0ffbd9fc915948434423a7d5");
req.AllowAutoRedirect = true;
HttpWebResponse res = (HttpWebResponse)req.GetResponse();
ServicePoint sp = req.ServicePoint;
Console.WriteLine("End address is " + sp.Address.ToString());

Is there a faster way to check if an external web page exists?

I wrote this method to check if a page exists or not:
protected bool PageExists(string url)
{
    try
    {
        Uri u = new Uri(url);
        WebRequest w = WebRequest.Create(u);
        w.Method = WebRequestMethods.Http.Head;

        using (StreamReader s = new StreamReader(w.GetResponse().GetResponseStream()))
        {
            return (s.ReadToEnd().Length >= 0);
        }
    }
    catch
    {
        return false;
    }
}
I am using it to check a set of pages (it iterates from AAAA to AAAZ), and it takes between 3 and 7 seconds to run the entire loop. Is there a faster or more efficient way to do this?
I think your approach is rather good, but I would change it to download only the headers by adding w.Method = WebRequestMethods.Http.Head; before calling GetResponse.
This could do it:
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://www.example.com");
request.Method = WebRequestMethods.Http.Head;
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
bool pageExists = response.StatusCode == HttpStatusCode.OK;
You probably want to check for other status codes as well.
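For instance, one variation (the accepted set of status codes here is just an assumption; adjust it to your needs):
// treat OK and permanent/temporary redirects as "the page exists"
bool pageExists =
    response.StatusCode == HttpStatusCode.OK ||
    response.StatusCode == HttpStatusCode.Moved ||
    response.StatusCode == HttpStatusCode.Redirect;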
// RequestCachePolicy / RequestCacheLevel live in System.Net.Cache
static bool GetCheck(string address)
{
    try
    {
        HttpWebRequest request = WebRequest.Create(address) as HttpWebRequest;
        request.Method = "GET";
        request.CachePolicy = new RequestCachePolicy(RequestCacheLevel.NoCacheNoStore);
        var response = request.GetResponse();
        return (response.Headers.Count > 0);
    }
    catch
    {
        return false;
    }
}

static bool HeadCheck(string address)
{
    try
    {
        HttpWebRequest request = WebRequest.Create(address) as HttpWebRequest;
        request.Method = "HEAD";
        request.CachePolicy = new RequestCachePolicy(RequestCacheLevel.NoCacheNoStore);
        var response = request.GetResponse();
        return (response.Headers.Count > 0);
    }
    catch
    {
        return false;
    }
}
Beware: certain pages (e.g. WCF .svc files) may not return anything from a HEAD request. I know because I'm working around this right now.
EDIT: I know there are better ways to check the returned data than counting headers, but this is a copy/paste from code where that mattered to us.
One obvious speedup is to run several requests in parallel - most of the time will be spent on IO, so spawning 10 threads to each check a page will complete the whole iteration around 10 times faster.
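A minimal sketch of that idea, assuming a PageExists(string) method like the one in the question, could use Parallel.ForEach to run the checks concurrently:
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// check many URLs concurrently; results are keyed by URL
IDictionary<string, bool> CheckAll(IEnumerable<string> urls)
{
    var results = new ConcurrentDictionary<string, bool>();

    Parallel.ForEach(
        urls,
        new ParallelOptions { MaxDegreeOfParallelism = 10 }, // roughly the "10 threads" idea
        url => results[url] = PageExists(url));              // PageExists from the question

    return results;
}
Note that HttpWebRequest also limits concurrent connections per host (ServicePointManager.DefaultConnectionLimit, 2 by default for client applications), so that limit may need raising to see the full speedup.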
1. You could do it asynchronously, because right now you are waiting for the result of each request before starting the next. For a few pages, you could just throw your function onto the ThreadPool and wait for all requests to finish. For more requests, you could use the asynchronous methods of the response stream (BeginRead etc.).
2. The other thing that can help (it helps me, for sure) is to clear the .Proxy property:
w.Proxy = null;
Without this, at least the first request is much slower, at least on my machine.
3. You can avoid downloading the whole page and download only the headers, by setting .Method to "HEAD".
I simply used Fredrik Mörk's answer above but placed it within a method:
private bool checkURL(string url)
{
    bool pageExists = false;

    try
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = WebRequestMethods.Http.Head;
        HttpWebResponse response = (HttpWebResponse)request.GetResponse();
        pageExists = response.StatusCode == HttpStatusCode.OK;
    }
    catch (Exception e)
    {
        // Do whatever you want when it's not working...
        //Response.Write(e.ToString());
    }

    return pageExists;
}
