My C#/UWP app has a section where users can enter links to OneDrive documents and other web resources as reference information. A user can click a button to test a link after entering it to make sure it launches as expected. I need to validate that the link target exists, and raise an error if it doesn't, before attempting to launch the URI.
It's straightforward to validate websites and non-OneDrive web-based docs by creating an HttpWebRequest from the URL and evaluating the response's status value. (See sample code below.)
However, OneDrive document share links seem to have problems with this approach, returning a [405 Method Not Allowed] error. I'm guessing this is because OneDrive share links do lots of forwarding and redirection before they get to the actual document.
try
{
    // Create the HTTP request
    HttpWebRequest request = WebRequest.Create(urlString) as HttpWebRequest;
    request.Timeout = 5000;  // Set the timeout to 5 seconds -- don't keep the user waiting too long!
    request.Method = "HEAD"; // Get only the header information -- no need to download page content

    // Get the HTTP response
    using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
    {
        // Get the response status code
        int statusCode = (int)response.StatusCode;

        // Successful request...return true
        if (statusCode >= 100 && statusCode < 400)
        {
            return true;
        }
        // Unsuccessful request (server-side errors)...return false
        else // if (statusCode >= 500 && statusCode < 600)
        {
            Log.Error($"URL not valid. Server error. (Status = {statusCode})");
            return false;
        }
    }
}
// Handle HTTP exceptions
catch (WebException e)
{
    // Get the entire HTTP response from the exception.
    // Note: e.Response is null for non-protocol errors (DNS failure, timeout, etc.)
    HttpWebResponse response = e.Response as HttpWebResponse;
    if (response == null)
    {
        Log.Error($"URL not valid. No response received. (Status = {e.Status})");
        return false;
    }

    // Grab the HTTP status code from the response
    int statusCode = (int)response.StatusCode;

    // Unsuccessful request (client-side errors)...return false
    if (statusCode >= 400 && statusCode <= 499)
    {
        Log.Error($"URL not valid. Client error. (Status = {statusCode})");
        return false;
    }
    else // Unhandled errors
    {
        Log.Error($"Unhandled status returned for URL. (Status = {e.Status})");
        return false;
    }
}
// Handle non-HTTP exceptions
catch (Exception e)
{
    Log.Error($"Unexpected error. Could not validate URL. ({e.Message})");
    return false;
}
I can trap the 405 error and launch the URL anyhow using the Windows.System.Launcher.LaunchUriAsync method. The OneDrive link launches just fine...IF the OneDrive document actually exists.
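For reference, a minimal sketch of that fallback, assuming urlString holds the entered link (LaunchUriAsync returns true if the default handler accepted the URI):
// Fallback: launch the URI anyway and let the default browser/app handle it
bool launched = await Windows.System.Launcher.LaunchUriAsync(new Uri(urlString));
if (!launched)
{
    Log.Error($"Could not launch URI: {urlString}");
}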
But if the document doesn't exist, or if the share permissions have been revoked, I end up with a browser page with something like a [404 Not Found] error...exactly what I'm trying to avoid by doing the validation!
Is there a way to validate OneDrive share links WITHOUT actually launching them in a browser? Are there other types of links (bit.ly links, perhaps?) that also create problems in the same way? Perhaps a better question: Can I validate ALL web resources in the same way without knowing anything but the URL?
The best way to avoid the redirects and get access to an item's metadata using a sharing link is to make an API call to the shares endpoint. You'll want to encode your URL as outlined here and then pass it to the API like:
HEAD https://api.onedrive.com/v1.0/shares/u!{encodedurl}
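A minimal sketch of that call, assuming the "u!" share-ID encoding from the OneDrive docs (base64-encode the URL, trim '=' padding, swap '/' for '_' and '+' for '-') and using HttpClient; the method name is my own:
// Requires System.Net.Http, System.Text, and System.Threading.Tasks
static async Task<bool> ShareLinkExistsAsync(string sharingUrl)
{
    // Encode the sharing URL into a share ID as described in the docs
    string base64 = Convert.ToBase64String(Encoding.UTF8.GetBytes(sharingUrl));
    string shareId = "u!" + base64.TrimEnd('=').Replace('/', '_').Replace('+', '-');

    using (var client = new HttpClient())
    using (var request = new HttpRequestMessage(HttpMethod.Head,
               $"https://api.onedrive.com/v1.0/shares/{shareId}"))
    using (var response = await client.SendAsync(request))
    {
        // A 404 here means the item is gone or the share has been revoked
        return response.IsSuccessStatusCode;
    }
}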
Related
I just want to know whether a web page is reachable or not without using the WebResponse class, because when I use this class it takes too long to get a response. I want to do it without code like the below:
Dim url As New System.Uri("http://www.stackoverflow.com/")
Dim request As WebRequest = WebRequest.CreateDefault(url)
request.Method = "GET"

Dim response As WebResponse
Try
    response = request.GetResponse()
Catch exc As WebException
    response = exc.Response
End Try
You can't without using the proper classes for it, or writing your own.
My two cents: just execute the HttpWebRequest and check if the resulting HTTP status code is not 404:
try
{
    HttpWebRequest q = (HttpWebRequest)WebRequest.Create(theUrl);
    using (HttpWebResponse r = (HttpWebResponse)q.GetResponse())
    {
        if (r.StatusCode == HttpStatusCode.NotFound)
        {
            // page does not exist
        }
    }
}
catch (WebException ex)
{
    HttpWebResponse r = ex.Response as HttpWebResponse;
    if (r != null && r.StatusCode == HttpStatusCode.NotFound)
    {
        // page does not exist
    }
}
You could create a basic socket connection to the given server on the desired port (80). If you can connect, you know that the server is online, and you can immediately close the connection without sending or receiving any data.
EDIT: My answer was of course not really correct. Connecting to the server on port 80 only verifies that the server accepts requests, not that the specific web page exists. After connecting you could send a request like GET /page.html HTTP/1.1 and parse the server's answer, but for that it is much more comfortable to use WebRequest or WebClient.
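A minimal sketch of that reachability check with TcpClient (host and port are placeholders; this says nothing about whether any particular page exists):
try
{
    using (var client = new System.Net.Sockets.TcpClient())
    {
        // Bound the wait so an unreachable host doesn't block for long
        bool reachable = client.ConnectAsync("www.example.com", 80).Wait(5000);
        Console.WriteLine(reachable ? "Server accepts connections" : "Connect timed out");
    }
}
catch (AggregateException)
{
    Console.WriteLine("Connection refused or host not found");
}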
I have a WinForms app I'm using to connect to a server via a PHP script hosted online. I've made it so my program can store this address within the settings of the app itself, like so:
http://server.webhost.com/file/uploadimage.html
Then, to retrieve this address in my program, I simply call the following:
Settings.Default.ServerAddress;
Then, to send my file to the server, I have a method call that looks like this:
UploadToServer.HttpUploadFile(Settings.Default.ServerAddress, sfd.FileName.ToString(), "file", "image/jpeg", nvc);
However, I have no idea how to check that the address entered is actually valid. Is there a best practice for achieving that?
One way to make sure that a URL is working is to actually request it for content; you can make this cheaper by issuing a HEAD request, which retrieves only the headers. Like so:
try
{
    HttpWebRequest request = WebRequest.Create("yoururl") as HttpWebRequest;
    request.Method = "HEAD"; // Get only the header information -- no need to download any content

    using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
    {
        int statusCode = (int)response.StatusCode;
        if (statusCode >= 100 && statusCode < 400) // Good requests
        {
        }
        else // if (statusCode >= 500 && statusCode <= 510) // Server errors
        {
            // Hard to reach here since an exception would be thrown
        }
    }
}
catch (WebException ex)
{
    // Handle the exception -- something is wrong with the url
}
Use System.Uri (http://msdn.microsoft.com/en-us/library/system.uri.aspx) to parse it. You'll get an exception if it's not "valid". But as the others who commented have stated, depending on what kind of "valid" you want, this may or may not be good enough for what you are doing.
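A sketch of that syntax-only check; Uri.TryCreate avoids the exception the Uri constructor throws on malformed input (urlString is a placeholder):
Uri parsed;
bool isWellFormed = Uri.TryCreate(urlString, UriKind.Absolute, out parsed)
    && (parsed.Scheme == Uri.UriSchemeHttp || parsed.Scheme == Uri.UriSchemeHttps);
// Note: this validates syntax only -- it says nothing about whether the target exists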
I need to get the URL of the final destination of a shortened URL. At the moment I am doing the following, which seems to work:
var request = WebRequest.Create(shortenedUri);
var response = request.GetResponse();
return response.ResponseUri;
But can anyone suggest a better way?
If the shortened URL was generated by an online service provider, only that provider stores the mapping between the short URL and the actual one. So you need to query the provider by sending an HTTP request to it, exactly as you did. Also, don't forget to properly dispose of IDisposable resources by wrapping them in using statements:
var request = WebRequest.Create(shortenedUri);
using (var response = request.GetResponse())
{
return response.ResponseUri;
}
If the service provider supports the HEAD verb, you could use it and read the Location response HTTP header, which must point to the actual URL. As an alternative, you could set the AllowAutoRedirect property to false on the request object and then read the Location response header. This way the client won't redirect to the actual resource and fetch the entire response body when you are not interested in it.
Of course, the best approach would be if your online service provider offered an API that directly gives you the actual URL for a short one.
You do need to make an HTTP request - but you don't need to follow the redirect, which WebRequest will do by default. Here's a short example of making just one request:
using System;
using System.Net;

class Test
{
    static void Main()
    {
        string url = "http://tinyurl.com/so-hints";
        Console.WriteLine(LengthenUrl(url));
    }

    static string LengthenUrl(string url)
    {
        var request = WebRequest.CreateHttp(url);
        request.AllowAutoRedirect = false;
        using (var response = request.GetResponse())
        {
            var status = ((HttpWebResponse)response).StatusCode;
            if (status == HttpStatusCode.MovedPermanently || // 301
                status == HttpStatusCode.Redirect)           // 302
            {
                return response.Headers["Location"];
            }
            // TODO: Work out a better exception
            throw new Exception("No redirect required.");
        }
    }
}
Note that this means if the "lengthened" URL is itself a redirect, you won't get the "final" URI as you would in your original code. Likewise if the lengthened URL is invalid, you won't spot that - you'll just get the URL that you would have redirected to. Whether that's a good thing or not depends on your use case...
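If you do need the final target, here's a sketch of following the chain manually instead; the hop limit and method name are my own additions:
// Follow redirects by hand so each intermediate Location header is visible,
// bounding the number of hops to avoid redirect loops
static Uri ResolveFinalUri(Uri uri, int maxHops = 10)
{
    for (int i = 0; i < maxHops; i++)
    {
        var request = WebRequest.CreateHttp(uri);
        request.Method = "HEAD";
        request.AllowAutoRedirect = false;
        using (var response = (HttpWebResponse)request.GetResponse())
        {
            int status = (int)response.StatusCode;
            if (status < 300 || status >= 400)
                return uri; // not a redirect -- this is the final target
            uri = new Uri(uri, response.Headers["Location"]); // header may be relative
        }
    }
    throw new WebException("Too many redirects.");
}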
I'm using a function to check if an external url exists. Here's the code with the status messages removed for clarity.
public static bool VerifyUrl(string url)
{
    url.ThrowNullOrEmpty("url");

    if (!(url.StartsWith("http://") || url.StartsWith("https://")))
        return false;

    var uri = new Uri(url);
    var webRequest = WebRequest.Create(uri);
    webRequest.Timeout = 5000;
    webRequest.Method = "HEAD";

    HttpWebResponse webResponse;
    try
    {
        webResponse = (HttpWebResponse)webRequest.GetResponse();
        webResponse.Close();
    }
    catch (WebException)
    {
        return false;
    }

    // Some servers redirect to a custom error page instead of returning an error status
    if (string.Compare(uri.Host, webResponse.ResponseUri.Host, true) != 0)
    {
        string responseUri = webResponse.ResponseUri.ToString().ToLower();
        if (responseUri.IndexOf("error") > -1 || responseUri.IndexOf("404.") > -1 || responseUri.IndexOf("500.") > -1)
            return false;
    }

    return true;
}
I've run a test over some external URLs and found that about 20 out of 100 come back as errors. If I add a user agent, the errors drop to around 14%.
The errors coming back are "forbidden" (which adding a user agent resolves for about 6%), "service unavailable", "method not allowed", "not implemented" or "connection closed".
Is there anything I can do to my code to ensure that more, preferably all, of the URLs give a valid response to their existence?
Alternatively, is there code that can be purchased to do this more effectively?
UPDATE - 14 Nov 2012
After following the advice from previous respondents, I'm now in a situation where I have a single domain that returns Service Unavailable (503). The example I have is www.marksandspencer.com.
When I use the HTTP sniffer at web-sniffer.net, as opposed to the one recommended in this thread, it works, returning the data with a GET request; however, I can't work out what I need to do to make it work in my code.
I finally got to the point of being able to validate all the URLs without exception.
Firstly, I took Davio's advice. Some domains return an error on a HEAD request, so I have included a retry for specific scenarios that issues a new GET request the second time around.
Secondly, the Amazon scenario: Amazon was intermittently returning a 503 error for its own site, and permanent 503 errors for sites hosted on the Amazon framework.
After some digging, I found that adding the following line to the request resolved both. It is the Accept string used by Firefox.
var request = (HttpWebRequest)HttpWebRequest.Create(uri);
request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
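Putting those pieces together, a sketch of the HEAD-with-GET-fallback plus Accept header described above (the method name is illustrative, not the poster's actual code):
static bool TryValidate(Uri uri)
{
    foreach (string method in new[] { "HEAD", "GET" })
    {
        try
        {
            var request = (HttpWebRequest)WebRequest.Create(uri);
            request.Method = method;
            request.Timeout = 5000;
            // Accept string used by Firefox; some hosts return 503 without it
            request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                return (int)response.StatusCode < 400;
            }
        }
        catch (WebException)
        {
            // Some servers reject HEAD with 405/501 even though the resource
            // exists -- fall through and retry with GET
        }
    }
    return false;
}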
I am developing a tool for validating the links in an entered URL. Suppose I have entered a URL
(e.g. http://www-review-k6.thinkcentral.com/content/hsp/science/hspscience/na/gr3/se_9780153722271_/content/nlsg3_006.html)
in textbox1, and I want to check whether the content of all the links exists on the remote server or not. Finally, I want a log file for the broken links.
You can use HttpWebRequest.
Note four things:
1) The web request will throw an exception if the link doesn't exist.
2) You may like to disable auto-redirect.
3) You may also like to check whether it's a valid URL; if not, it will throw a UriFormatException.
UPDATED
4) As Paige suggested, use "HEAD" in request.Method so that it won't download the whole remote file.
static bool UrlExists(string url)
{
    try
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "HEAD";
        request.AllowAutoRedirect = false;
        using (request.GetResponse())
        {
        }
    }
    catch (UriFormatException)
    {
        // Invalid Url
        return false;
    }
    catch (WebException ex)
    {
        // Valid Url, but the resource may not exist.
        // Note: ex.Response is null for non-protocol errors (DNS failure, timeout, ...)
        HttpWebResponse webResponse = ex.Response as HttpWebResponse;
        if (webResponse == null || webResponse.StatusCode == HttpStatusCode.NotFound)
        {
            return false;
        }
    }
    return true;
}
Use the HttpWebResponse class:
HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create("http://www.gooogle.com/");
HttpWebResponse response = (HttpWebResponse)webRequest.GetResponse();
if (response.StatusCode == HttpStatusCode.NotFound)
{
    // do something
    // (note that GetResponse typically throws a WebException on a 404,
    // so in practice you check ex.Response.StatusCode in a catch block)
}
bool LinkExist(string link)
{
    HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(link);
    HttpWebResponse webResponse = (HttpWebResponse)webRequest.GetResponse();
    return webResponse.StatusCode != HttpStatusCode.NotFound;
}
Use an HTTP HEAD request as explained in this article: http://www.eggheadcafe.com/tutorials/aspnet/2c13cafc-be1c-4dd8-9129-f82f59991517/the-lowly-http-head-reque.aspx
Make an HTTP request to the URL and see if you get a 404 response. If so, then it does not exist.
Do you need a code example?
If your goal is robust validation of page source, consider using a tool that is already written, like the W3C Link Checker. It can be run as a command-line program that handles finding links, pictures, CSS, etc., and checks them for validity. It can also recursively check an entire website.