Use HttpWebRequest to check if a URL exists - C#

I'm using a function to check if an external url exists. Here's the code with the status messages removed for clarity.
public static bool VerifyUrl(string url)
{
    url.ThrowNullOrEmpty("url");

    if (!(url.StartsWith("http://") || url.StartsWith("https://")))
        return false;

    var uri = new Uri(url);
    var webRequest = HttpWebRequest.Create(uri);
    webRequest.Timeout = 5000;
    webRequest.Method = "HEAD";

    HttpWebResponse webResponse;
    try
    {
        webResponse = (HttpWebResponse)webRequest.GetResponse();
        webResponse.Close();
    }
    catch (WebException)
    {
        return false;
    }

    if (string.Compare(uri.Host, webResponse.ResponseUri.Host, true) != 0)
    {
        string responseUri = webResponse.ResponseUri.ToString().ToLower();
        if (responseUri.IndexOf("error") > -1 || responseUri.IndexOf("404.") > -1 || responseUri.IndexOf("500.") > -1)
            return false;
    }

    return true;
}
I've run a test over some external URLs and found that about 20 out of 100 come back as errors. If I add a user agent, the error rate drops to around 14%.
The errors coming back are "forbidden" (which a user agent resolves for about 6%), "service unavailable", "method not allowed", "not implemented" or "connection closed".
Is there anything I can do to my code to ensure more, preferably all, of the URLs give a valid response confirming their existence?
Alternatively, is there code that can be purchased to do this more effectively?
UPDATE - 14th Nov 12 ----------------------------------------------------------------------
After following advice from previous respondents, I'm now in a situation where I have a single domain that returns Service Unavailable (503). The example I have is www.marksandspencer.com.
When I use the HTTP sniffer web-sniffer.net, as opposed to the one recommended in this thread, it works, returning the data using a webrequest GET. However, I can't work out what I need to do to make it work in my code.

I finally got to the point of being able to validate all the URLs without exception.
Firstly, I took Davio's advice. Some domains return an error on Request.HEAD, so I have included a retry for specific scenarios; this issues a new Request.GET for the second attempt.
Secondly, the Amazon scenario. Amazon was intermittently returning a 503 error for its own site and permanent 503 errors for sites hosted on the Amazon framework.
After some digging I found that adding the following line to the request resolved both. It is the Accept string used by Firefox.
var request = (HttpWebRequest)HttpWebRequest.Create(uri);
request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
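For reference, a rough sketch of the HEAD-then-GET retry described above (the helper name, the exact set of retried status codes, and the usings are illustrative additions, not the code I actually shipped):
// Sketch only - requires: using System; using System.Net;
private static HttpWebResponse RequestHeadWithGetFallback(Uri uri)
{
    const string firefoxAccept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";

    var head = (HttpWebRequest)WebRequest.Create(uri);
    head.Method = "HEAD";
    head.Timeout = 5000;
    head.Accept = firefoxAccept;

    try
    {
        return (HttpWebResponse)head.GetResponse();
    }
    catch (WebException ex)
    {
        var failed = ex.Response as HttpWebResponse;
        bool retryWithGet = failed != null &&
            (failed.StatusCode == HttpStatusCode.MethodNotAllowed ||    // 405
             failed.StatusCode == HttpStatusCode.NotImplemented ||      // 501
             failed.StatusCode == HttpStatusCode.ServiceUnavailable);   // 503
        if (!retryWithGet)
            throw;

        // Some servers reject HEAD, so retry the same URL with a normal GET.
        var get = (HttpWebRequest)WebRequest.Create(uri);
        get.Method = "GET";
        get.Timeout = 5000;
        get.Accept = firefoxAccept;
        return (HttpWebResponse)get.GetResponse();
    }
}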

Related

What is the role of my machine's Application pool in Windows service development when consuming third party rest services [duplicate]

I am working on a Windows Service in Visual Studio 2017. When calling REST APIs, I get exceptions while debugging the code. Sometimes the first 2-3 calls work, and after that I get exceptions:
System.Net.WebException: 'The remote server returned an error: (503)
Server Unavailable.'
The remote server returned an error: (429)
Unable to connect to the remote server
When calling the same APIs from Postman, I get responses successfully.
This is my code:
private void timer1_Tick(object sender, ElapsedEventArgs e)
{
    WriteToFile("timer1_Tick method called..");
    try
    {
        string jsonString = "";
        string jsonstring2 = "";
        string prodfetchurl = HOST;
        var req = WebRequest.Create(prodfetchurl) as HttpWebRequest;
        req.Method = "GET";
        InitializeRequest(req);
        req.Accept = MIME_TYPE;
        //System.Threading.Thread.Sleep(5000);
        var response = (HttpWebResponse)req.GetResponse();
        WriteToFile("First service called...");
        if (response.StatusCode == HttpStatusCode.OK)
        {
            Stream responseStream = response.GetResponseStream();
            StreamReader responseReader = new StreamReader(responseStream);
            jsonString = responseReader.ReadToEnd();
        }
        var deserialsseobj = JsonConvert.DeserializeObject<ProductList>(jsonString).Products.Where(i => i.Failed > 0).ToList();
        foreach (var a in deserialsseobj)
        {
            var pid = a.ID;
            string url = FailedDevicesUrl + pid.Value + "/failed";
            var req2 = WebRequest.Create(url) as HttpWebRequest;
            req2.Method = "GET";
            InitializeRequest(req2);
            req2.Timeout = 300000;
            req2.Accept = MIME_TYPE;
            var response1 = (HttpWebResponse)req2.GetResponse();
            Stream responsestream2 = response1.GetResponseStream();
            WriteToFile("Second service called...");
            if (response1.StatusCode == HttpStatusCode.OK)
            {
                StreamReader responsereader1 = new StreamReader(responsestream2);
                jsonstring2 = responsereader1.ReadToEnd();
            }
            var output = JsonConvert.DeserializeObject<List<FailedDeviceList>>(jsonstring2); // Will get List of the Failed devices
            List<int> deviceids = new List<int>();
            Reprocessdata reproc = new Reprocessdata();
            Reprocessdata.DeviceId rprod = new Reprocessdata.DeviceId();
            reproc.ForceFlag = true;
            reproc.ProductID = pid.Value;
            foreach (var dd in output)
            {
                rprod.ID = dd.DeviceId;
                reproc.DeviceIds.Add(rprod);
            }
            // Reprocess the Product in Devices
            var req3 = WebRequest.Create(ReprocessUrl) as HttpWebRequest;
            req3.Method = "POST";
            InitializeRequest(req3);
            req3.Accept = MIME_TYPE;
            req3.Timeout = 300000;
            req3.ContentType = "application/json";
            using (StreamWriter writer = new StreamWriter(req3.GetRequestStream()))
            {
                string json = new JavaScriptSerializer().Serialize(reproc);
                writer.Write(json);
                writer.Close();
            }
            System.Threading.Thread.Sleep(5000);
            var response5 = (HttpWebResponse)req3.GetResponse();
            WriteToFile("Third service called...");
            if (response5.StatusCode == HttpStatusCode.OK)
            {
                string result;
                using (StreamReader rdr = new StreamReader(response5.GetResponseStream()))
                {
                    result = rdr.ReadToEnd();
                }
            }
        }
        response.Close();
    }
    catch (Exception ex)
    {
        WriteToFile("Simple Service Error on: {0} " + ex.Message + ex.StackTrace);
    }
}
Methods used in the above code:
protected override void OnStart(string[] args)
{
    base.OnStart(args);
    timer1 = new System.Timers.Timer();
    timer1.Interval = 60000; //every 1 min
    timer1.Elapsed += new System.Timers.ElapsedEventHandler(timer1_Tick);
    timer1.Enabled = true;
    WriteToFile("Service has started..");
}

public void InitializeRequest(HttpWebRequest request)
{
    request.Headers.Add("aw-tenant-code", API_TENANT_CODE);
    request.Credentials = new NetworkCredential(USER_NAME, PASSWORD);
    request.KeepAlive = false;
    request.AddRange(1024);
}
When I contacted the service provider, they said everything is fine on their side. Is my code buggy, or is a Windows service not reliable for this? How can I fix this issue?
Note: All of the APIs work fine from an Angular application using Visual Studio Code, which suggests the problem is in my code.
Edit1: The three services below are the ones I am using from this VMware document.
private const string HOST = "https:host/api/mdm/products/search?";
private const string FailedDevicesUrl = "https:host/api/mdm/products/";
private const string ReprocessUrl = "https:host/api/mdm/products/reprocessProduct";
An HTTP 429 response code indicates that you are sending too many requests to the target web service.
This means the service you are sending requests to has a policy that blocks some requests based on a requests-per-time limit.
I should also note that the external service may be manually configured to return a 403 code in specific cases that you cannot know about. If so, this information may be explained in the external service's documentation... or not :)
What can you do about this?
Fit within the limitations
You can research in detail what limits the target web service imposes and set up your code to fit within them. For example, if the service accepts only one request per 10 minutes, you must set up your timer to send one request every 10 or more minutes. If the documentation does not provide such information, you can probe it manually by looking for patterns in the external service's responses.
Use proxy
Every limitation policy is based on information about the request sender, and usually that information consists only of the sender's IP address. This means that if you send two requests from two different IP addresses, the limitation policy will treat them as two different computers sending the requests. So you can find/buy/rent some proxy IP addresses and send requests to the target web server through them.
How to connect through a proxy in C# using WebRequest is shown in this answer.
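For illustration, a minimal sketch of pointing a single WebRequest at a proxy (the URL, proxy address and credentials below are placeholders):
// Sketch only - the target URL, proxy host, port and credentials are hypothetical.
var request = (HttpWebRequest)WebRequest.Create("https://example.com/resource");
request.Proxy = new WebProxy("http://my-proxy-host:8080")
{
    // Omit Credentials if the proxy does not require authentication.
    Credentials = new NetworkCredential("proxyUser", "proxyPassword")
};
using (var response = (HttpWebResponse)request.GetResponse())
{
    // handle the response as usual
}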
Negotiate with the external service provider
If you are able to contact the external service's developers or help center, you can ask them to relax the limitations for your IP address (if it is static) or to provide some mechanism for bypassing the limitation policy. If for some reason they cannot offer that, you can at least ask for detailed information about the limitations.
Retry mechanism
Sometimes the 503 error code, which is the outer exception you received, is simply caused by the service being unavailable: the server may be under maintenance or temporarily overloaded. So you can write a retry mechanism that keeps re-sending the request until the server becomes accessible.
The Polly library can help you build such a retry mechanism.
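For example, a rough sketch with Polly (the retry count, delays and URL are arbitrary; assumes the Polly NuGet package is installed):
// Sketch only - requires: using System; using System.Net; using Polly;
var retryPolicy = Policy
    .Handle<WebException>()   // HttpWebRequest surfaces 429/503 responses as WebException
    .WaitAndRetry(3, attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)));   // waits 2s, 4s, 8s

var response = retryPolicy.Execute(() =>
{
    var req = (HttpWebRequest)WebRequest.Create("https://example.com/resource");   // placeholder URL
    req.Method = "GET";
    return (HttpWebResponse)req.GetResponse();
});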
The inner error of that 503 is:
The remote server returned an error: (429)
HTTP 429 indicates too many requests; your upstream server may not be able to process all of the requests being sent.
This can happen when you have hit a rate-limiting / throttling threshold while calling a third-party API.
UPDATE
As per page 28 of the API docs, you can configure throttling when creating a new API. Check whether the throttling limit is set too low, or perhaps turn throttling off and see whether that fixes the error.

How do I stay on the page using HttpWebRequest?

So I am trying to simulate a person being on my website, in this case from my console application.
I can connect to it using an HttpWebRequest and create a WebRequest, but it doesn't show as a person being on the website in my dashboard. However, when I browse to my website manually, my dashboard system (WordPress) says that someone is online on the website.
So my question is, how do I accomplish the same thing? Would I have to create a socket connection, or is this possible using KeepAlive? I think the issue is that my request isn't on the page for long enough: it connects and gets the response, but it doesn't actually keep a connection open, if that makes any sense.
That's just my theory, so please correct me if I am wrong.
public static bool isServerOnline()
{
    Boolean ret = false;
    try
    {
        HttpWebRequest req = (HttpWebRequest)HttpWebRequest.Create("https://arcticinnovative.com");
        req.CookieContainer = cookieContainer; // <= HERE
        req.Method = "HEAD";
        req.KeepAlive = false;
        HttpWebResponse resp = (HttpWebResponse)req.GetResponse();
        if (resp.StatusCode == HttpStatusCode.OK)
        {
            // HTTP = 200 - Internet connection available, server online
            ret = true;
        }
        resp.Close();
        return ret;
    }
    catch (WebException we)
    {
        // Exception - connection not available
        Debug.Print("InternetUtils - isServerOnline - " + we.Status);
        return false;
    }
}
According to the documentation found here, you can set KeepAlive to true in order to maintain a persistent connection.
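For example (a minimal sketch based on the code in the question):
HttpWebRequest req = (HttpWebRequest)HttpWebRequest.Create("https://arcticinnovative.com");
req.CookieContainer = cookieContainer;   // keep reusing the same cookies across requests
req.Method = "HEAD";
req.KeepAlive = true;    // ask the server to keep the underlying TCP connection open
using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
{
    // the connection returns to the ServicePoint pool and can be reused by later requests
}
Note that keep-alive only reuses the TCP connection; whether the WordPress dashboard counts you as an online visitor depends on how it tracks sessions (typically cookies and/or JavaScript), so you may also need to send the relevant cookies with each request.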

How to validate existence of OneDrive document via its Share link?

My C#/UWP app has a section where users can enter links to OneDrive documents and other web resources as reference information. A user can click a button to test a link after they've entered it to make sure it launches as expected. I need to validate that the link target exists before launching the URI and raise an error if the link is not valid before attempting to launch the URI.
It's straightforward to validate web sites and non-OneDrive web-based docs by creating an HttpWebRequest using the URL and evaluating the response's status value. (See sample code below.)
However, OneDrive document share links seem to have problems with this approach, returning a [405 Method Not Allowed] error. I'm guessing this is because OneDrive share links do lots of forwarding and redirection before they get to the actual document.
try
{
    // Create the HTTP request
    HttpWebRequest request = WebRequest.Create(urlString) as HttpWebRequest;
    request.Timeout = 5000; // Set the timeout to 5 seconds -- don't keep the user waiting too long!
    request.Method = "HEAD"; // Get only the header information -- no need to download page content

    // Get the HTTP response
    using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
    {
        // Get the response status code
        int statusCode = (int)response.StatusCode;

        // Successful request...return true
        if (statusCode >= 100 && statusCode < 400)
        {
            return true;
        }
        // Unsuccessful request (server-side errors)...return false
        else // if (statusCode >= 500 && statusCode < 600)
        {
            Log.Error( $"URL not valid. Server error. (Status = {statusCode})" );
            return false;
        }
    }
}
// Handle HTTP exceptions
catch (WebException e)
{
    // Get the entire HTTP response from the exception
    HttpWebResponse response = (HttpWebResponse)e.Response;

    // Grab the HTTP status code from the response
    int statusCode = (int)response.StatusCode;

    // Unsuccessful request (client-side errors)...return false
    if (statusCode >= 400 && statusCode <= 499)
    {
        Log.Error( $"URL not valid. Client error. (Status = {statusCode})" );
        return false;
    }
    else // Unhandled errors
    {
        Log.Error( $"Unhandled status returned for URL. (Status = {e.Status})" );
        return false;
    }
}
// Handle non-HTTP exceptions
catch (Exception e)
{
    Log.Error( $"Unexpected error. Could not validate URL." );
    return false;
}
I can trap the 405 error and launch the URL anyhow using the Windows.System.Launcher.LaunchUriAsync method. The OneDrive link launches just fine...IF the OneDrive document actually exists.
But if the document doesn't exist, or if the share permissions have been revoked, I end up with a browser page with something like a [404 Not Found] error...exactly what I'm trying to avoid by doing the validation!
Is there a way to validate OneDrive share links WITHOUT actually launching them in a browser? Are there other types of links (bit.ly links, perhaps?) that also create problems in the same way? Perhaps a better question: Can I validate ALL web resources in the same way without knowing anything but the URL?
The best way to avoid the redirects and get access to an item's metadata using a sharing link is to make an API call to the shares endpoint. You'll want to encode your URL as outlined here and then pass it to the API like:
HEAD https://api.onedrive.com/v1.0/shares/u!{encodedurl}
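For example, a minimal sketch of that check (the "u!" token encoding follows the sharing-token scheme described in the linked docs; error handling is reduced to a boolean result):
// Sketch only - requires: using System; using System.Net; using System.Text;
static bool SharedItemExists(string sharingUrl)
{
    // Base64-encode the sharing URL, strip the padding, make it URL-safe, then prefix "u!".
    string base64 = Convert.ToBase64String(Encoding.UTF8.GetBytes(sharingUrl));
    string shareToken = "u!" + base64.TrimEnd('=').Replace('/', '_').Replace('+', '-');

    var request = (HttpWebRequest)WebRequest.Create("https://api.onedrive.com/v1.0/shares/" + shareToken);
    request.Method = "HEAD";
    try
    {
        using (var response = (HttpWebResponse)request.GetResponse())
        {
            return (int)response.StatusCode >= 200 && (int)response.StatusCode < 300;
        }
    }
    catch (WebException)
    {
        // 404 / 403 etc. -> the item is gone or the sharing link has been revoked.
        return false;
    }
}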

.NET service responds 500 internal error and "missing parameter" to HttpWebRequest POSTS but test form works fine

I am using a simple .NET service (asmx) that works fine when invoking via the test form (POST). When invoking via a HttpWebRequest object, I get a WebException "System.Net.WebException: The remote server returned an error: (500) Internal Server Error." Digging deeper, reading the WebException.Response.GetResponseStream() I get the message: "Missing parameter: serviceType." but I've clearly included this parameter.
I'm at a loss here, and it's made worse by the fact that I don't have access to debug the service itself.
Here is the code being used to make the request:
string postData = String.Format("serviceType={0}&SaleID={1}&Zip={2}", request.service, request.saleId, request.postalCode);
byte[] data = (new ASCIIEncoding()).GetBytes(postData);

HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create(url);
httpWebRequest.Timeout = 60000;
httpWebRequest.Method = "POST";
httpWebRequest.ContentType = "application/x-www-form-urlencoded";
httpWebRequest.ContentLength = data.Length;

using (Stream newStream = httpWebRequest.GetRequestStream())
{
    newStream.Write(data, 0, data.Length);
}

try
{
    using (response = (HttpWebResponse)httpWebRequest.GetResponse())
    {
        if (response.StatusCode != HttpStatusCode.OK)
            throw new Exception("There was an error with the shipping freight service.");

        string responseData;
        using (StreamReader responseStream = new StreamReader(httpWebRequest.GetResponse().GetResponseStream(), System.Text.Encoding.GetEncoding("iso-8859-1")))
        {
            responseData = responseStream.ReadToEnd();
            responseStream.Close();
        }

        if (string.IsNullOrEmpty(responseData))
            throw new Exception("There was an error with the shipping freight service. Request went through but response is empty.");

        XmlDocument providerResponse = new XmlDocument();
        providerResponse.LoadXml(responseData);
        return providerResponse;
    }
}
catch (WebException webExp)
{
    string exMessage = webExp.Message;
    if (webExp.Response != null)
    {
        using (StreamReader responseReader = new StreamReader(webExp.Response.GetResponseStream()))
        {
            exMessage = responseReader.ReadToEnd();
        }
    }
    throw new Exception(exMessage);
}
Anyone have an idea what could be happening?
Thanks.
UPDATE
Stepping through the debugger, I see the parameters are correct. I also see the parameters are correct in Fiddler.
Examining Fiddler, I see 2 requests each time this code executes. The first request is a POST that sends the parameters. It gets a 301 response code with a "Document Moved Object Moved This document may be found here" message. The second request is a GET to the same URL with no body. It gets a 500 server error with the "Missing parameter: serviceType." message.
It seems like you found your problem when you looked at the requests in Fiddler. Taking an excerpt from http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html:
10.3.2 301 Moved Permanently
The requested resource has been assigned a new permanent URI and any future references to this resource SHOULD use one of the returned URIs. Clients with link editing capabilities ought to automatically re-link references to the Request-URI to one or more of the new references returned by the server, where possible.
.....
Note: When automatically redirecting a POST request after
receiving a 301 status code, some existing HTTP/1.0 user agents
will erroneously change it into a GET request.
Here's a couple options that you can take:
Hard-code your program to use the new Url that you see in the 301 response in Fiddler
Adjust your code to retrieve the 301 response, parse the new Url out of it, and build a new request to that Url.
The latter option would be ideal if you're dealing with user-provided input for the Url (like a web browser does), since you don't know in advance where the user is going to want your program to go.
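A rough sketch of that second option, assuming the service answers the original URL with a 301 and a Location header (the helper name is mine, and a real implementation should cap the number of redirects it follows):
// Sketch only - requires: using System; using System.IO; using System.Net;
static HttpWebResponse PostFollowingRedirect(string url, byte[] formData)
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
    request.Method = "POST";
    request.ContentType = "application/x-www-form-urlencoded";
    request.ContentLength = formData.Length;
    request.AllowAutoRedirect = false;   // stop the framework from turning the redirected POST into a GET

    using (Stream stream = request.GetRequestStream())
    {
        stream.Write(formData, 0, formData.Length);
    }

    HttpWebResponse response = (HttpWebResponse)request.GetResponse();
    if (response.StatusCode == HttpStatusCode.MovedPermanently ||
        response.StatusCode == HttpStatusCode.Redirect)
    {
        // Resolve the Location header (it may be relative) and re-POST the same body there.
        string newUrl = new Uri(new Uri(url), response.Headers["Location"]).ToString();
        response.Close();
        return PostFollowingRedirect(newUrl, formData);
    }
    return response;
}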

Changing DefaultWebProxy causing WebRequests to time out

For the project I'm working on, we have a desktop program that contacts an online server for a store. Because it's used in schools, getting the proxy setup right is tricky. What we've gone for is to allow users to specify proxy details to use if they want, otherwise it uses the ones from IE. We've also tried to bypass incorrect details being put in, so the code tries the user specified proxy, if that fails the default one, if that fails, then with credentials, if that fails then null.
The problem I'm having is that in places where the proxy settings need to be changed in succession (for example, if their registration fails because the proxy is wrong, they change one tiny thing and try again within seconds), I end up with calls to HttpWebRequest.GetResponse() timing out, causing the program to freeze for a good while. Sometimes if I leave a minute or two between the changes it doesn't freeze, but not every time (I just tried again after 10 minutes and it's timing out again).
I can't spot anything in the code that could cause this - though it looks a bit messy. I don't think it could be the server refusing the request unless it's generic server behaviour as I've tried this with requests to our server and others such as google.co.uk.
I'm posting the code in the hope that someone may be able to spot something that's wrong with it, or knows a much simpler way of doing what we're trying to.
The tests we run are without any proxy, so the first part is usually skipped. The first time ApplyProxy is run, it works fine and finishes everything in the first try block, the second, it can either timeout on the GetResponse in the first try block and then go through the rest of the code, or it can work there and timeout on the actual requests made for the registration.
Code:
void ApplyProxy()
{
    Boolean ProxySuccess = true;
    String WebRequestURI = @"http://www.google.co.uk";

    if (UseProxy)
    {
        try
        {
            String ProxyUrl = (ProxyUri.ToLower().Contains("http://")) ?
                ProxyUri :
                "http://" + ProxyUri;
            WebRequest.DefaultWebProxy = new WebProxy(ProxyUrl);
            if (!string.IsNullOrEmpty(ProxyUsername) && !string.IsNullOrEmpty(ProxyPassword))
                WebRequest.DefaultWebProxy.Credentials = new NetworkCredential(ProxyUsername, ProxyPassword);

            HttpWebRequest request = HttpWebRequest.Create(WebRequestURI) as HttpWebRequest;
            request.Method = "GET";
            HttpWebResponse response = request.GetResponse() as HttpWebResponse;
        }
        catch
        {
            ProxySuccess = false;
        }
    }

    if (!ProxySuccess || !UseProxy)
    {
        try
        {
            WebRequest.DefaultWebProxy = WebRequest.GetSystemWebProxy();
            HttpWebRequest request = HttpWebRequest.Create(WebRequestURI) as HttpWebRequest;
            request.Method = "GET";
            HttpWebResponse response = request.GetResponse() as HttpWebResponse;
        }
        catch (Exception e)
        { //try with credentials
            //make a new proxy from defaults
            WebRequest.DefaultWebProxy = WebRequest.GetSystemWebProxy();
            String newProxyURI = WebRequest.DefaultWebProxy.GetProxy(new Uri(WebRequestURI)).ToString();
            if (newProxyURI == String.Empty)
            { //check we actually get a result
                WebRequest.DefaultWebProxy = null;
                return;
            }
            //continue
            WebProxy NewProxy = new WebProxy(newProxyURI);
            NewProxy.UseDefaultCredentials = true;
            NewProxy.Credentials = CredentialCache.DefaultCredentials;
            WebRequest.DefaultWebProxy = NewProxy;
            try
            {
                HttpWebRequest request = HttpWebRequest.Create(WebRequestURI) as HttpWebRequest;
                request.Method = "GET";
                HttpWebResponse response = request.GetResponse() as HttpWebResponse;
            }
            catch
            {
                WebRequest.DefaultWebProxy = null;
            }
        }
    }
}
Is it not just a case of needing to set the Timeout property of the HttpWebRequest? It could be that the connection is being made but not serviced (wrong type of proxy server or a stalled server, for example), in which case the request may be waiting for the full Timeout period before giving up; a shorter timeout may be preferable here.
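For example (the values are arbitrary):
HttpWebRequest request = HttpWebRequest.Create(WebRequestURI) as HttpWebRequest;
request.Method = "GET";
request.Timeout = 10000;            // give up on connecting/waiting for headers after 10 seconds
request.ReadWriteTimeout = 10000;   // also bound the time spent reading the response body
using (HttpWebResponse response = request.GetResponse() as HttpWebResponse) { }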
Seems to be a programming error on my part. The responses were left open, and obviously either the program or the server doesn't like this. Simply closing each HttpWebResponse once done seems to remove the issue.
