How to check whether a string is a valid HTTP URL? - c#

There are the Uri.IsWellFormedUriString and Uri.TryCreate methods, but they seem to return true for file paths, etc.
How do I check whether a string is a valid (not necessarily active) HTTP URL for input validation purposes?

Try this to validate HTTP URLs (uriName is the URI you want to test):
Uri uriResult;
bool result = Uri.TryCreate(uriName, UriKind.Absolute, out uriResult)
&& uriResult.Scheme == Uri.UriSchemeHttp;
Or, if you want to accept both HTTP and HTTPS URLs as valid (per J0e3gan's comment):
Uri uriResult;
bool result = Uri.TryCreate(uriName, UriKind.Absolute, out uriResult)
&& (uriResult.Scheme == Uri.UriSchemeHttp || uriResult.Scheme == Uri.UriSchemeHttps);
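For reuse, the check above can be wrapped in a small helper. This is only a sketch: the class name UrlChecks and the method name IsHttpUrl are made up for illustration and do not come from the answer.

```csharp
using System;

public static class UrlChecks
{
    // Hypothetical helper: returns true only for absolute URIs whose
    // scheme is http or https. Uri.Scheme is always lower-case, so
    // "HTTP://EXAMPLE.COM" is accepted as well.
    public static bool IsHttpUrl(string candidate)
    {
        return Uri.TryCreate(candidate, UriKind.Absolute, out Uri parsed)
            && (parsed.Scheme == Uri.UriSchemeHttp
                || parsed.Scheme == Uri.UriSchemeHttps);
    }
}
```

With this, UrlChecks.IsHttpUrl("https://example.com/page") passes, while "ftp://example.com" and a scheme-less "www.example.com" are rejected, because the first has the wrong scheme and the second is not an absolute URI.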

This method works for both HTTP and HTTPS URLs. Just one line :)
if (Uri.IsWellFormedUriString("https://www.google.com", UriKind.Absolute))
MSDN: IsWellFormedUriString
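Note that IsWellFormedUriString only checks well-formedness, so, as the question points out, it also returns true for other schemes such as ftp://. A minimal sketch of combining it with a scheme check follows; the class and method names are assumptions chosen for illustration.

```csharp
using System;

public static class WellFormedDemo
{
    // Hypothetical helper: well-formedness alone also passes ftp:// and
    // other schemes, so the scheme is checked explicitly afterwards.
    public static bool IsWellFormedHttpUrl(string candidate)
    {
        if (!Uri.IsWellFormedUriString(candidate, UriKind.Absolute))
        {
            return false;
        }

        // TryCreate is used instead of the constructor so that nothing
        // can throw, even for unexpected input.
        return Uri.TryCreate(candidate, UriKind.Absolute, out Uri parsed)
            && (parsed.Scheme == Uri.UriSchemeHttp
                || parsed.Scheme == Uri.UriSchemeHttps);
    }
}
```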

Try that:
bool IsValidURL(string URL)
{
string Pattern = @"^(?:http(s)?:\/\/)?[\w.-]+(?:\.[\w\.-]+)+[\w\-\._~:/?#[\]@!\$&'\(\)\*\+,;=.]+$";
Regex Rgx = new Regex(Pattern, RegexOptions.Compiled | RegexOptions.IgnoreCase);
return Rgx.IsMatch(URL);
}
It will accept URLs like these:
http(s)://www.example.com
http(s)://stackoverflow.example.com
http(s)://www.example.com/page
http(s)://www.example.com/page?id=1&product=2
http(s)://www.example.com/page#start
http(s)://www.example.com:8080
http(s)://127.0.0.1
127.0.0.1
www.example.com
example.com

public static bool CheckURLValid(this string source)
{
Uri uriResult;
return Uri.TryCreate(source, UriKind.Absolute, out uriResult) && uriResult.Scheme == Uri.UriSchemeHttp;
}
Usage:
string url = "htts://adasd.xc.";
if (url.CheckURLValid())
{
//valid process
}
UPDATE: (single line of code) Thanks @GoClimbColorado
public static bool CheckURLValid(this string source) => Uri.TryCreate(source, UriKind.Absolute, out Uri uriResult) && uriResult.Scheme == Uri.UriSchemeHttps;
Usage:
string url = "htts://adasd.xc.";
if (url.CheckURLValid())
{
//valid process
}

All the answers here either allow URLs with other schemes (e.g., file://, ftp://) or reject human-readable URLs that don't start with http:// or https:// (e.g., www.google.com) which is not good when dealing with user inputs.
Here's how I do it:
public static bool ValidHttpURL(string s, out Uri resultURI)
{
if (!Regex.IsMatch(s, @"^https?:\/\/", RegexOptions.IgnoreCase))
s = "http://" + s;
if (Uri.TryCreate(s, UriKind.Absolute, out resultURI))
return (resultURI.Scheme == Uri.UriSchemeHttp ||
resultURI.Scheme == Uri.UriSchemeHttps);
return false;
}
Usage:
string[] inputs = new[] {
"https://www.google.com",
"http://www.google.com",
"www.google.com",
"google.com",
"javascript:alert('Hack me!')"
};
foreach (string s in inputs)
{
Uri uriResult;
bool result = ValidHttpURL(s, out uriResult);
Console.WriteLine(result + "\t" + uriResult?.AbsoluteUri);
}
Output:
True https://www.google.com/
True http://www.google.com/
True http://www.google.com/
True http://google.com/
False

After Uri.TryCreate succeeds, you can check Uri.Scheme to see whether it is HTTP or HTTPS.

As an alternative approach to using a regex, this code uses Uri.TryCreate per the OP, but then also checks the result to ensure that its Scheme is one of http or https:
bool passed =
Uri.TryCreate(url, UriKind.Absolute, out Uri uriResult)
&& (uriResult.Scheme == Uri.UriSchemeHttp
|| uriResult.Scheme == Uri.UriSchemeHttps);

This would return bool:
Uri.IsWellFormedUriString(a.GetAttribute("href"), UriKind.Absolute)

Problem: Valid URLs should start with one of the following “prefixes”: https, http, www
Url must contain http:// or https://
Url may contain only one instance of www.
Url Host name type must be Dns
Url max length is 100
Solution:
public static bool IsValidUrl(string webSiteUrl)
{
if (webSiteUrl.StartsWith("www."))
{
webSiteUrl = "http://" + webSiteUrl;
}
return Uri.TryCreate(webSiteUrl, UriKind.Absolute, out Uri uriResult)
&& (uriResult.Scheme == Uri.UriSchemeHttp
|| uriResult.Scheme == Uri.UriSchemeHttps)
&& uriResult.Host.Replace("www.", "").Split('.').Count() > 1
&& uriResult.HostNameType == UriHostNameType.Dns
&& uriResult.Host.Length > uriResult.Host.LastIndexOf(".") + 1
&& 100 >= webSiteUrl.Length;
}
Validated with Unit Tests
Positive Unit Test:
[TestCase("http://www.example.com/")]
[TestCase("https://www.example.com")]
[TestCase("http://example.com")]
[TestCase("https://example.com")]
[TestCase("www.example.com")]
public void IsValidUrlTest(string url)
{
bool result = UriHelper.IsValidUrl(url);
Assert.AreEqual(result, true);
}
Negative Unit Test:
[TestCase("http.www.example.com")]
[TestCase("http:www.example.com")]
[TestCase("http:/www.example.com")]
[TestCase("http://www.example.")]
[TestCase("http://www.example..com")]
[TestCase("https.www.example.com")]
[TestCase("https:www.example.com")]
[TestCase("https:/www.example.com")]
[TestCase("http:/example.com")]
[TestCase("https:/example.com")]
public void IsInvalidUrlTest(string url)
{
bool result = UriHelper.IsValidUrl(url);
Assert.AreEqual(result, false);
}
Note: IsValidUrl method should not validate any relative url path like example.com
See:
Should I Use Relative or Absolute URLs?

Uri uri = null;
if (!Uri.TryCreate(url, UriKind.Absolute, out uri) || null == uri)
return false;
else
return true;
Here url is the string you have to test.

I've created this function to help me with URL validation; you can customize it as you like. Note that this is written in Python 3.10.6.
import re
def url_validator(url: str) -> bool:
    """
    Use this function to filter URLs so that only valid ones are followed.
    :param url: the URL to validate
    :type url: str
    :return: True if the passed url is valid, otherwise False
    :rtype: bool
    """
    # the following regex is copied from the Django source code
    # to validate a URL using a regex
    regex = re.compile(
        r"^(?:http|ftp)s?://"  # http://, https://, ftp:// or ftps://
        r"(?:(?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+(?:[A-Z]{2,6}\.?|[A-Z0-9-]{2,}\.?)|"  # domain...
        r"localhost|"  # localhost...
        r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"  # ...or ip
        r"(?::\d+)?"  # optional port
        r"(?:/?|[/?]\S+)$",
        re.IGNORECASE,
    )
    blocked_sites: list[str] = []
    for site in blocked_sites:
        if site in url or site == url:
            return False
    # if the URL is not blocked, return True only when it matches the regex
    if re.match(regex, url):
        return True
    return False

Related

C# - Validating Website URLs. IsWellFormedUriString always returning true

I need a check that returns true for the following website urls:
I need to make sure that websites that start as www. pass as true. Also google.com should return true.
www.google.com
google.com
http://google.com
http://www.google.com
https://google.com
https://www.google.com
I have been using IsWellFormedUriString and haven't gotten anywhere. It keeps returning true. I also have used Uri.TryCreate and can't get it to work either. There is so much on Stack Overflow regarding this topic but none of them are working. I must be doing something wrong.
Here is my ValidateUrl function:
public static bool ValidateUrl(string url)
{
try
{
if (url.Substring(0, 3) != "www." && url.Substring(0, 4) != "http" && url.Substring(0, 5) != "https")
{
url = "http://" + url;
}
if (Uri.IsWellFormedUriString(url, UriKind.RelativeOrAbsolute))
{
Uri strUri = new Uri(url);
return true;
}
else
{
return false;
}
}
catch (Exception exc)
{
throw exc;
}
}
And I am calling it like this:
if (ValidateUrl(url) == false) {
validationErrors.Add(new Error()
{
fieldName = "url",
errorDescription = "Url is not in correct format."
});
}
It is returning true for htp:/google.com. I know there's a lot on this site regarding this topic but I have been trying to get this to work all day yesterday and nothing is working.
If you want your users to copy and paste from the db into the browser and enter a valid site, I think you should validate the url format
and at the same time verify the existence of the url
for example:
Uri.IsWellFormedUriString("http://www.google.com", UriKind.Absolute);
This returns true because the URL is in the correct form.
WebRequest request = WebRequest.Create("http://www.google.com");
try
{
request.GetResponse();
}
catch (Exception ex)
{
throw; // rethrow, preserving the original stack trace
}
An exception is thrown if no response can be obtained from the URL.
Hi.
If I understand your question correctly, then I would check it like this:
public static bool ValidateUrl(string url)
{
if (url.StartsWith("https://www.") || url.StartsWith("http://www.") || url.StartsWith("https://google.com") || url.StartsWith("http://google.com"))
{
return true;
}
else
{
return false;
}
}
Any domain name not google.com but with https://www. or http://www. returns true otherwise false.
If you want to test if an HTTP(S) url is good or not, you should use this :
(credit : stackoverflow.com/a/56116499/215552 )
Uri uriResult;
bool result = Uri.TryCreate(uriName, UriKind.Absolute, out uriResult)
&& (uriResult.Scheme == Uri.UriSchemeHttp || uriResult.Scheme == Uri.UriSchemeHttps);
So in your case :
public static bool ValidateUrl(string url){
Uri uriResult;
return Uri.TryCreate(url, UriKind.Absolute, out uriResult)
&& (uriResult.Scheme == Uri.UriSchemeHttp || uriResult.Scheme == Uri.UriSchemeHttps);
}
// EDIT : try this :
public static bool ValidateUrl(string URL)
{
string Pattern = @"(http(s)?://)?([\w-]+\.)+[\w-]+[\w-]+[\.]+[\][a-z.]{2,3}$+([./?%&=]*)?";
Regex Rgx = new Regex(Pattern, RegexOptions.Compiled | RegexOptions.IgnoreCase);
return Rgx.IsMatch(URL);
}
I got it working by writing a small helper method that uses Regex to validate the url.
The following URLs pass:
google.com
www.google.com
http://google.com
http://www.google.com
https://google.com/test/test
https://www.google.com/test
It fails on:
www.google.com/a bad path with white space/
Below is the helper method I created:
public static bool ValidateUrl(string value, bool required, int minLength, int maxLength)
{
value = value.Trim();
if (required == false && value == "") return true;
if (required && value == "") return false;
Regex pattern = new Regex(@"^(?:http(s)?:\/\/)?[\w.-]+(?:\.[\w\.-]+)+[\w\-\._~:/?#[\]@!\$&'\(\)\*\+,;=.]+$");
Match match = pattern.Match(value);
if (match.Success == false) return false;
return true;
}
This allows users to input any valid url, plus it accounts for bad url paths with white space which is exactly what I needed.

How to get type of HTTP POST data?

In the picture above, I have Request Body of a POST request with FiddlerCore dll.
Here is how I capture it:
private void FiddlerApplication_AfterSessionComplete(Session sess)
{
string requestBody = "";
if (sess.oRequest != null)
{
if (sess.oRequest.headers != null)
{
requestBody = sess.GetRequestBodyAsString();
}
}
}
However, I only need to capture it when the body contains parameters (the last 2 lines in the picture); in the other cases I don't need to capture it.
So far I filter it by string matching. What would be the proper way to do this?
NOTE: Each line on the picture is a different request, for a total of 5.
If there is no content type then ignore it. Figure out the ones you do want and take those.
private void FiddlerApplication_AfterSessionComplete(Session sess) {
if (sess == null || sess.oRequest == null || sess.oRequest.headers == null)
return;
// Ignore HTTPS connect requests or other non-POST requests
if (sess.RequestMethod == "CONNECT" || sess.RequestMethod != "POST")
return;
var reqHeaders = sess.oRequest.headers.ToString(); //request headers
// Get the content type of the request
var contentType = sess.oRequest["Content-Type"];
// Lets assume you have a List<string> of approved content types.
// Ignore requests that do not have a content type
// or are not in the approved list of types.
if (string.IsNullOrEmpty(contentType) || !approvedContent.Any(c => contentType.Contains(c)))
return;
var reqBody = sess.GetRequestBodyAsString();//get the Body of the request
//...other code.
}
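The content-type test can also be pulled out into a helper that is independent of FiddlerCore, which makes it easy to try in isolation. The sketch below is an assumption-laden illustration: the class name, the IsApproved method and the two entries in the allow-list are hypothetical, and it compares the media type exactly (after dropping parameters such as "; charset=UTF-8") instead of using a substring match.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContentTypeFilter
{
    // Hypothetical allow-list; adjust it to the media types you care about.
    public static readonly IReadOnlyList<string> ApprovedContent = new[]
    {
        "application/x-www-form-urlencoded",
        "multipart/form-data"
    };

    // Returns true when the Content-Type header value names an approved
    // media type. A missing or empty header is treated as not approved.
    public static bool IsApproved(string contentType)
    {
        if (string.IsNullOrWhiteSpace(contentType))
        {
            return false;
        }

        // Keep only the media type, dropping parameters like "; charset=UTF-8".
        string mediaType = contentType.Split(';')[0].Trim();

        return ApprovedContent.Any(approved =>
            string.Equals(approved, mediaType, StringComparison.OrdinalIgnoreCase));
    }
}
```

Inside the session handler, the early return would then read something like if (!ContentTypeFilter.IsApproved(contentType)) return;.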

TempData empty when redirect to other url than root, why?

I have this method.
public ActionResult SetTempDataToChangeVendor(int vendorId, string url)
{
TempData["ChangeVendor"] = vendorId;
if (url == null) return Redirect("/");
var slug = _urlRecordRepository.Table.FirstOrDefault(s => s.Slug == url);
if (slug == null) RedirectToAction("PageNotFound", "Common");
return Redirect("/" + url);
}
It works just fine when it redirects to /. But when a url is supplied, the TempData is empty, and I can't understand why.
TempData is a bucket where you can dump data that is only needed for the following request. That is, anything you put into TempData is discarded after the next request completes.

Validate a URL exists or not from DB

I have a situation where I need to prevent a user from entering a URL that already exists in the DB.
Here is the function that I am using to validate:
public bool IsContentUrlExists(string url)
{
url = url.Trim().TrimEnd(new[]{'/'});
return Context.Contents.Any(content => content.Url == url);
}
With this method I can validate a URL such as "/testurl/" against a matching "/testurl" in the DB.
But it does not work the other way around, when I compare the string "/testurl" with "/testurl/" in the DB.
I need to remove the trailing slash in both cases, but TrimEnd(new[]{'/'}) will not work on a column in an EF query, so the following method fails:
public bool IsContentUrlExists(string url)
{
url = url.Trim().TrimEnd(new[]{'/'});
return Context.Contents.Any(content => content.Url.Trim().TrimEnd(new[]{'/'}) == url);
}
Can anyone help me with an alternative solution?
N.B: We don't have any standard for URL in our existing DB
Using your code plus mine
public bool IsContentUrlExists(string url)
{
url = url.Trim().TrimEnd(new[]{'/'});
return Context.Contents.Any(content => content.Url == url || content.Url == url + "/");
}
Untested, but something like the above should work, shouldn't it?
Wing
As an alternate you can try:
url = url.Trim().TrimEnd(new[] { '/' });
var lstUrls = new List<string> { url, url + "/" };
return Context.Contents.Any(content => lstUrls.Contains(content.Url));
It compares a list of two strings: one ending with slash and the other without.
If there will be a match for any of these two strings in database then it means the URL exists!
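To see both directions of the comparison working, here is a small in-memory sketch. It uses a plain IEnumerable<string> as a stand-in for the Url column of Context.Contents, which is an assumption made purely so the example runs without a database; the class and method names are also hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContentUrlLookup
{
    // storedUrls stands in for the Url values in the database (assumption).
    public static bool UrlExists(IEnumerable<string> storedUrls, string url)
    {
        // Normalise only the input; the stored values are left untouched.
        string trimmed = url.Trim().TrimEnd('/');

        // Match the stored value with or without a single trailing slash.
        var candidates = new List<string> { trimmed, trimmed + "/" };

        return storedUrls.Any(stored => candidates.Contains(stored));
    }
}
```

Note that a stored value with more than one trailing slash, such as "/testurl//", would still not match with this approach.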

ASP.Net Redirect to secure

I am wanting to redirect a page to a secure connection for an ASPX file.
Clients are asked to copy and paste a URL that looks like this foo.com.au into the browser.
I have the code below working in the code-behind file, but I am wondering whether, once it is deployed to production, it will update the URL to include www after the https://, since the URL provided to clients does not have www in it.
protected override void OnPreInit(EventArgs e)
{
base.OnPreInit(e);
if (!Request.IsLocal && !Request.IsSecureConnection)
{
string redirectUrl = Request.Url.ToString().Replace("http:", "https:");
Response.Redirect(redirectUrl);
}
}
Rather than using Request.Url, use Request.Url.AbsoluteUri. In addition, you should not assume that the URL will be entered in lowercase. I would revise the code to be:
if (!Request.IsLocal && !Request.IsSecureConnection)
{
if (Request.Url.Scheme.Equals(Uri.UriSchemeHttp, StringComparison.InvariantCultureIgnoreCase))
{
// Strip the scheme and the "://" delimiter, leaving host, path and query
string sNonSchemeUrl = Request.Url.AbsoluteUri.Substring(Uri.UriSchemeHttp.Length + Uri.SchemeDelimiter.Length);
// Ensure www. is prepended if it is missing
if (!sNonSchemeUrl.StartsWith("www", StringComparison.InvariantCultureIgnoreCase)) {
sNonSchemeUrl = "www." + sNonSchemeUrl;
}
string redirectUrl = Uri.UriSchemeHttps + Uri.SchemeDelimiter + sNonSchemeUrl;
Response.Redirect(redirectUrl);
}
}
If you do this, it will change the scheme and prepend www. when it is missing. So, if the absoluteUri is
http://foo.com.au
it will be changed to
https://www.foo.com.au
One last note: when we have done this, we have never tried it in OnPreInit, we always perform this logic in Page_Load. I am not sure what, if any, ramifications there will be for redirecting at that portion of the page lifecycle, but if you run into issues, you could move it into Page_Load.
This was my final implementation, which also accounts for requests that come through for https://foo rather than https://www.foo
if (!Request.IsLocal &&
!Request.Url.AbsoluteUri.StartsWith("https://www.", StringComparison.OrdinalIgnoreCase))
{
string translatedUrl;
string nonSchemeUrl = Request.Url.AbsoluteUri;
string stringToReplace = (Request.Url.Scheme == Uri.UriSchemeHttp ? Uri.UriSchemeHttp + "://" : Uri.UriSchemeHttps + "://");
nonSchemeUrl = nonSchemeUrl.Replace(stringToReplace, string.Empty);
if (!nonSchemeUrl.StartsWith("www", StringComparison.InvariantCultureIgnoreCase))nonSchemeUrl = "www." + nonSchemeUrl;
translatedUrl = Uri.UriSchemeHttps + "://" + nonSchemeUrl;
Response.Redirect(translatedUrl);
}
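As an alternative to string replacement, the target URL can be assembled with UriBuilder, which avoids accidentally replacing "http:" elsewhere in the URL. This is a hedged sketch, not the poster's code: the class and method names are hypothetical, and it assumes the secure site runs on the default HTTPS port, so any explicit port on the incoming request is dropped.

```csharp
using System;

public static class SecureRedirect
{
    // Hypothetical helper: rewrites a request URL to https://www.<host>...
    public static string ToSecureWwwUrl(Uri requestUrl)
    {
        var builder = new UriBuilder(requestUrl)
        {
            Scheme = Uri.UriSchemeHttps,
            // -1 means "use the default port for the scheme"; without this
            // the original port (for example 80) could be carried over.
            Port = -1
        };

        // Prepend www. only when the host does not already start with it.
        if (!builder.Host.StartsWith("www.", StringComparison.OrdinalIgnoreCase))
        {
            builder.Host = "www." + builder.Host;
        }

        return builder.Uri.AbsoluteUri;
    }
}
```

In the page code this would be used along the lines of Response.Redirect(SecureRedirect.ToSecureWwwUrl(Request.Url));.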
