Extract domain with subdomain inside a class - C#

I have an ASP.NET 3.5 web application written in C# 2008.
I want to extract the full domain name from the current URL inside a class method.
For example, if the current URL is:
http://subdomain.domain.com/pagename.aspx
OR
https://subdomain.domain.com/pagename.aspx?param=value&param2=value2
Then the result should be like,
http://subdomain.domain.com
OR
https://subdomain.domain.com

You're looking for the Uri class:
new Uri(str).GetLeftPart(UriPartial.Authority)

Create a Uri and query the Host property:
var uri = new Uri(str);
var host = uri.Host;
(Later)
I just realized that you want the scheme and the domain. In that case, @SLaks's answer is the one you want. You could do it by combining uri.Scheme and uri.Host, but that drops any port number and can get messy for things like mailto URLs.
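A minimal sketch of the two approaches above, side by side. The class and method names (OriginExample, GetOrigin) are made up for illustration; only Uri, GetLeftPart and UriPartial come from the framework. The localhost URL is there to show what happens to a non-default port.

```csharp
using System;

public static class OriginExample
{
    // Returns the scheme plus the authority (host and any non-default port).
    public static string GetOrigin(string url)
    {
        return new Uri(url).GetLeftPart(UriPartial.Authority);
    }

    public static void Main()
    {
        // prints "http://subdomain.domain.com"
        Console.WriteLine(GetOrigin("http://subdomain.domain.com/pagename.aspx"));
        // prints "https://subdomain.domain.com"
        Console.WriteLine(GetOrigin("https://subdomain.domain.com/pagename.aspx?param=value&param2=value2"));

        Uri withPort = new Uri("http://localhost:61123/ProductPage.aspx");
        // prints "http://localhost:61123" (the port is part of the authority)
        Console.WriteLine(withPort.GetLeftPart(UriPartial.Authority));
        // prints "http://localhost" (Scheme + Host loses the port)
        Console.WriteLine(withPort.Scheme + "://" + withPort.Host);
    }
}
```

Inside a class method of a web application, the input string would typically be HttpContext.Current.Request.Url.AbsoluteUri.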

Related

How to maintain the right URL in C#/ASP.NET?

I was given some code. One of its pages shows search results, and the user can click one of the records, which is expected to bring up a page where the selected record can be modified.
However, when it tries to bring up the page, IE shows "This page cannot be displayed".
The URL is obviously wrong: first I see something like http://www.Something.org/Search.aspx, then it turns into http://localhost:61123/ProductPage.aspx
I searched the code and found the following line, which I think is the cause. My question is:
What should I do to avoid the hard-coded URL and make it dynamic, so that it always points to the right domain?
string url = string.Format("http://localhost:61123/ProductPage.aspx?BC={0}&From={1}", barCode, "Search");
Response.Redirect(url);
Thanks.
Use HttpContext.Current.Request.Url to see the URL of the current request. Url contains many things, including the scheme and Host. Because Response.Redirect treats a URL without a scheme as a relative path, take the scheme and authority together with GetLeftPart(UriPartial.Authority).
By the way, if you're using C# 6 or later (Visual Studio 2015, which shipped alongside .NET 4.6) you can create the string like so:
string url = $"{HttpContext.Current.Request.Url.GetLeftPart(UriPartial.Authority)}/ProductPage.aspx?BC={barCode}&From=Search";
Or you can use string.Format:
string origin = HttpContext.Current.Request.Url.GetLeftPart(UriPartial.Authority);
string url = string.Format("{0}/ProductPage.aspx?BC={1}&From={2}", origin, barCode, "Search");
You can store the Host segment in your AppSettings section of your Web.Config file (per config / environment like so)
Debug / Development Web.Config
Production / Release Web.Config (with config override to replace the localhost value with something.org host)
and then use it in your code like so.
// Creates a URI using the HostUrlSegment set in the current web.config
Uri hostUri = new Uri(ConfigurationManager.AppSettings.Get("HostUrlSegment"));
// Works like Path.Combine(..) to construct a proper URL from the host
// and the other URL segments. The $ is the C# 6 string interpolation syntax
// (makes for readable code)
Uri fullUri = new Uri(hostUri, $"ProductPage.aspx?BC={barCode}&From=Search");
// fullUri.AbsoluteUri will contain the proper URL
Response.Redirect(fullUri.AbsoluteUri);
The Uri class has many useful properties and methods that give you the relative URL, the absolute URL, the URL fragment, the host name, and so on.
This should do it.
string url = string.Format("ProductPage.aspx?BC={0}&From={1}", barCode, "Search");
Response.Redirect(url);
If you are using C# 6 or later (the compiler that shipped alongside .NET 4.6) you can also use this string interpolation version
string url = $"ProductPage.aspx?BC={barCode}&From=Search";
Response.Redirect(url);
You should just be able to omit the hostname to stay on the current domain.
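If an absolute URL is needed after all, the answers above can be combined roughly as follows. This is only a sketch: RedirectUrlExample and BuildProductUrl are hypothetical names, the current request URL is passed in as a parameter (in a page it would be Request.Url) so the code runs outside ASP.NET, and it assumes ProductPage.aspx sits at the site root as in the original localhost URL. Escaping the bar code is an addition not present in the original code.

```csharp
using System;

public static class RedirectUrlExample
{
    // Builds an absolute URL on the same scheme, host and port as the current request.
    public static string BuildProductUrl(Uri currentRequestUrl, string barCode)
    {
        string origin = currentRequestUrl.GetLeftPart(UriPartial.Authority);
        // Escape the value so characters such as '&' or spaces cannot break the query string.
        string query = "BC=" + Uri.EscapeDataString(barCode) + "&From=Search";
        return origin + "/ProductPage.aspx?" + query;
    }

    public static void Main()
    {
        Uri current = new Uri("http://www.something.org/Search.aspx");
        // prints "http://www.something.org/ProductPage.aspx?BC=12345&From=Search"
        Console.WriteLine(BuildProductUrl(current, "12345"));
    }
}
```

In the page itself this would be used as Response.Redirect(RedirectUrlExample.BuildProductUrl(Request.Url, barCode)).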

Strip off the prefix of URLs

I have got the following log of URL strings. The logs contain millions of records.
www.example.com/p1?q=k
example.com/p1?q=k
http://example.com/p1?q=k
https://example.com/p1?q=k
http://www.example.com/p1?q=k
I used the C# Uri class, but it throws an exception for strings in the format "example.com/p1?q=K".
I was wondering if there is a generally accepted, standard method for dealing with such different types of URL to get the website name and the relative URL.
P.S.: I could strip off http:// and https:// using a regex or string comparison, but I'm curious to know if there are any elegant solutions.
This will not work with your existing scheme-less examples. However, you can play around with it and prepend the missing parts where needed, which means you will need a few variables to store the http://, https://, and www. prefixes.
System.Uri uriPre = new Uri ("http://www.example.com/p1?q=k");
string uriString = uriPre.Host + uriPre.PathAndQuery;
uriString = uriString.Replace("www.", "");
yields
"example.com/p1?q=k"
The rest of the coding you will have to figure out, because only you know when to use the different protocols in the example I've provided.
To expand on Alexei Levenkov's answer, here is an example that you can use to try to create a new Uri:
Uri tempValue;
if (Uri.TryCreate("example.com/p1?q=k", UriKind.Relative, out tempValue))
{
    // do something with tempValue, or return it
}
Uri is the class that is designed to deal with URIs:
var noSchemaRelativeUri = new Uri("example.com/foo", UriKind.Relative);
Either UriBuilder or Uri(Uri baseUri, Uri relativeUri) can be used to construct an absolute Uri.
To pick between relative and absolute, you can use Uri.TryCreate.
Note: strictly speaking, "www.example.com" and "example.com" are unrelated domain names, and converting one to the other is not guaranteed to produce a registered domain name (although most sites register both and redirect from one to the other).
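One way to put the suggestions above together for a log like the one in the question is sketched below. LogUrlExample and TrySplit are hypothetical names. The sketch assumes that every entry without an http:// or https:// prefix is an http URL with the scheme left off, which matches the sample lines but may not hold for a real log. It leaves "www." alone, for the reason given in the note above.

```csharp
using System;

public static class LogUrlExample
{
    // Splits a logged URL into host and path-plus-query.
    // Returns false when the entry cannot be parsed as a URL.
    public static bool TrySplit(string raw, out string host, out string pathAndQuery)
    {
        host = null;
        pathAndQuery = null;
        if (string.IsNullOrEmpty(raw)) return false;

        // Assumption: entries without a scheme are http URLs.
        bool hasScheme =
            raw.StartsWith("http://", StringComparison.OrdinalIgnoreCase) ||
            raw.StartsWith("https://", StringComparison.OrdinalIgnoreCase);
        string candidate = hasScheme ? raw : "http://" + raw;

        Uri uri;
        if (!Uri.TryCreate(candidate, UriKind.Absolute, out uri)) return false;

        host = uri.Host;
        pathAndQuery = uri.PathAndQuery;
        return true;
    }

    public static void Main()
    {
        string[] samples = { "www.example.com/p1?q=k", "example.com/p1?q=k", "https://example.com/p1?q=k" };
        foreach (string s in samples)
        {
            string host, path;
            if (TrySplit(s, out host, out path))
                Console.WriteLine(host + " | " + path);
        }
    }
}
```

TryCreate is used instead of the constructor so that a malformed line among millions of records is skipped rather than throwing.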

How to get website name from domain name?

I fetch the domain from the URL as follows:
var uri = new Uri("Http://www.google.com");
var host = uri.Host;
//host ="www.google.com"
But I want only google.com in Host,
host = "google.com"
Given the accepted answer I guess the issue was not knowing how to manipulate strings rather than how to deal with uris... but for anyone else who ends up here:
The Uri class does not have this property so you will have to parse it yourself.
Presumably you do not know what the subdomain is ahead of time, so a simple replace may not be possible.
This is not trivial, since the TLDs are so varied (http://en.wikipedia.org/wiki/List_of_Internet_top-level_domains) and there may be multiple parts to the URL (e.g. http://pre.subdomain.domain.co.uk).
You will have to decide exactly what you want to get and how complex you want the solution to be.
simple - do a string replace, see ekad's answer
medium - regex that works most of the time, see Strip protocol and subdomain from a URL
or complex - refer to a list of suffixes in order to figure out what is subdomain and what is domain eg
Get the subdomain from a URL
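To give an idea of what the "complex" option involves, here is a rough sketch. The suffix table is a tiny hypothetical one hard-coded for illustration; a real implementation would load a maintained list (for example the Public Suffix List) and handle its wildcard and exception rules, which this sketch does not. RegistrableDomainExample and GetRegistrableDomain are made-up names.

```csharp
using System;

public static class RegistrableDomainExample
{
    // Hypothetical, incomplete suffix table, for illustration only.
    private static readonly string[] Suffixes = { "co.uk", "com", "org", "uk" };

    // Returns the label directly in front of the longest known suffix, plus that suffix.
    public static string GetRegistrableDomain(string host)
    {
        string[] labels = host.ToLowerInvariant().Split('.');

        // Walk from the longest candidate suffix to the shortest.
        for (int i = 1; i < labels.Length; i++)
        {
            string candidate = string.Join(".", labels, i, labels.Length - i);
            if (Array.IndexOf(Suffixes, candidate) >= 0)
            {
                return labels[i - 1] + "." + candidate;
            }
        }
        return host; // no known suffix: leave the host unchanged
    }

    public static void Main()
    {
        // prints "google.com"
        Console.WriteLine(GetRegistrableDomain(new Uri("Http://www.google.com").Host));
        // prints "domain.co.uk"
        Console.WriteLine(GetRegistrableDomain("pre.subdomain.domain.co.uk"));
    }
}
```

Checking the longest suffix first is what keeps "domain.co.uk" from being cut down to "co.uk".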
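If host begins with "www.", you can strip that prefix like this. Substring removes only the leading "www.", whereas String.Replace would also remove any later occurrence inside the host:
var uri = new Uri("Http://www.google.com");
var host = uri.Host.ToLowerInvariant();
if (host.StartsWith("www.", StringComparison.Ordinal))
{
    host = host.Substring(4); // drop the leading "www."
}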

Getting full URL from URL with tilde(~) sign

I am trying to turn a typical ASP.NET URL starting with the tilde sign ('~') into a full, exact URL starting with "http:".
I have this string: "~/PageB.aspx"
And I want it to become "http://myServer.com/PageB.aspx"
I know there are several methods to parse URLs and get the different server and application paths. I have tried several but have not gotten the result I want.
Try out
System.Web.VirtualPathUtility.ToAbsolute("yourRelativePath");
There are various ways that are available in ASP.NET that we can use to resolve relative paths to a resource on the server-side and making it available on the client-side. I know of 4 ways -
1) Request.ApplicationPath
2) System.Web.VirtualPathUtility
3) Page.ResolveUrl
4) Page.ResolveClientUrl
Good article : Different approaches for resolving URLs in ASP.NET
If you're in a page handler you can always use the ResolveUrl method to convert the relative path to a server-specific path. But if you want the "http://www.yourserver.se" part as well, you'll have to prepend Request.Url.Scheme and Request.Url.Authority to it:
string.Format("{0}://{1}{2}", Request.Url.Scheme, Request.Url.Authority, Page.ResolveUrl(relativeUrl));
This method looks the nicest to me. No string manipulation, it can tolerate both relative or absolute URLs as input, and it uses the exact same scheme, authority, port, and root path as whatever the current request is using:
private Uri GetAbsoluteUri(string redirectUrl)
{
    var redirectUri = new Uri(redirectUrl, UriKind.RelativeOrAbsolute);
    if (!redirectUri.IsAbsoluteUri)
    {
        // The base must end with "/" so that a virtual directory such as "/MyApp" is kept when resolving.
        var baseUri = new Uri(Request.Url.GetLeftPart(UriPartial.Authority) + Request.ApplicationPath.TrimEnd('/') + "/");
        redirectUri = new Uri(baseUri, redirectUri);
    }
    return redirectUri;
}
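For the "~/" form specifically, the pieces mentioned in the answers above can be sketched without string guessing about the host. In this sketch AppRelativeUrlExample and ToAbsoluteUrl are hypothetical names, and requestUrl and applicationPath stand in for Request.Url and Request.ApplicationPath so that the code runs outside ASP.NET. Inside a web application, System.Web.VirtualPathUtility.ToAbsolute does the "~/" to application path step for you.

```csharp
using System;

public static class AppRelativeUrlExample
{
    // Turns "~/PageB.aspx" into an absolute URL on the current request's scheme, host and port.
    public static string ToAbsoluteUrl(Uri requestUrl, string applicationPath, string appRelativePath)
    {
        if (!appRelativePath.StartsWith("~/", StringComparison.Ordinal))
            throw new ArgumentException("Expected a path starting with \"~/\".", "appRelativePath");

        string origin = requestUrl.GetLeftPart(UriPartial.Authority);
        // ApplicationPath is "/" for a root application and e.g. "/MyApp" for a virtual directory.
        string root = applicationPath.TrimEnd('/');
        return origin + root + "/" + appRelativePath.Substring(2);
    }

    public static void Main()
    {
        Uri current = new Uri("http://myserver.com/PageA.aspx");
        // prints "http://myserver.com/PageB.aspx"
        Console.WriteLine(ToAbsoluteUrl(current, "/", "~/PageB.aspx"));
        // prints "http://myserver.com/MyApp/PageB.aspx"
        Console.WriteLine(ToAbsoluteUrl(current, "/MyApp", "~/PageB.aspx"));
    }
}
```

Note that Uri lower-cases the host name, so "myServer.com" in the request comes back as "myserver.com".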

filter duplicate URLs domain from List c#

I have a list of 100,000 URLs in a List(Of String), which can contain URLs in forms such as:
yahoo.com
http://yahoo.com
http://www.yahoo.com
I have tried using a combination of regex and the Uri class, but that didn't help, so I dumped the code. I also tried using this code, but it only removes exact duplicates, since it's not domain specific.
list = new ArrayList<T>(new HashSet<T>(list))
How can I filter these duplicates and keep just one of the URLs when they contain the same name, e.g. yahoo?
Thanks
[EDIT]
Please note that
all URLs are of different domains, but can usually have duplicates like the example I gave above
also, I am using .NET 2.0, so I can't use LINQ
This worked for me
[TestMethod]
public void TestMethod1()
{
    var sites = new List<string> {"yahoo.com", "http://yahoo.com", "http://www.yahoo.com"};
    var result = sites.Select(
        s =>
            s.StartsWith("http://www.")
                ? s
                : s.StartsWith("http://")
                    ? "http://www." + s.Substring(7)
                    : "http://www." + s).Distinct();
    Assert.AreEqual(1, result.Count());
}
I think the Uri class would be able to help in this case. I am not at a VS machine where I can test; however, pass the URL string to the Uri constructor, and use the Host property for comparison:
List<string> distinctHosts = new List<string>();
foreach (string url in UrlList)
{
    Uri uri = new Uri(url);
    if (!distinctHosts.Contains(uri.Host))
    {
        distinctHosts.Add(uri.Host);
    }
}
This feels a bit primitive, and could probably be more elegant - possibly without a foreach; but like I said, I'm not at a development machine where I could work with it.
I think this would be able to handle any variation of a valid Url. Building an ArrayList is not a good idea; in my opinion, Regex would require that you maintain some sort of custom 'MatchList' that could get unwieldy.
As @Damokles points out, you should have some form of validation. The Uri class does require a protocol such as 'http://' or 'ftp://'. However, you do not want to assume 'badurl.com' is actually invalid:
if (!url.StartsWith("http://")) { /* add protocol */ } // then check Host domain as above
...should be sufficient simply to retrieve a distinct host or domain name. I recommend any option that does not require guessing the index position of any part of the Url as that is tightly bound to specific formats.
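Since the question rules out LINQ, the approach above could look roughly like this on .NET 2.0. It is a sketch under two assumptions that the asker would need to confirm: entries without "://" are http URLs, and "www.yahoo.com" counts as the same site as "yahoo.com". DistinctHostsExample and DistinctHosts are hypothetical names. A Dictionary stands in for a set because HashSet<T> only arrived in .NET 3.5, and it keeps lookups fast for 100,000 entries where List.Contains would scan the whole list each time.

```csharp
using System;
using System.Collections.Generic;

public static class DistinctHostsExample
{
    // Returns each host once, in first-seen order, ignoring scheme and a leading "www.".
    public static List<string> DistinctHosts(IEnumerable<string> urls)
    {
        Dictionary<string, bool> seen = new Dictionary<string, bool>(StringComparer.OrdinalIgnoreCase);
        List<string> result = new List<string>();

        foreach (string raw in urls)
        {
            // Assumption: entries without a scheme are http URLs.
            string candidate = raw.IndexOf("://", StringComparison.Ordinal) >= 0 ? raw : "http://" + raw;

            Uri uri;
            if (!Uri.TryCreate(candidate, UriKind.Absolute, out uri)) continue; // skip unparsable entries

            string host = uri.Host;
            // Assumption: the "www." form is a duplicate of the bare domain.
            if (host.StartsWith("www.", StringComparison.OrdinalIgnoreCase)) host = host.Substring(4);

            if (!seen.ContainsKey(host))
            {
                seen.Add(host, true);
                result.Add(host);
            }
        }
        return result;
    }

    public static void Main()
    {
        string[] urls = { "yahoo.com", "http://yahoo.com", "http://www.yahoo.com" };
        foreach (string host in DistinctHosts(urls))
            Console.WriteLine(host); // prints "yahoo.com" once
    }
}
```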
You can do this with the Uri class and Linq/extension methods. The trick is to normalize the Url before using it with the Uri class. Also note that the Uri class requires the scheme, so that will have to be added for ones where it's not present. You can use a different property of the Uri class to achieve different results. The example below returns all unique Urls and treats yahoo.com differently than www.yahoo.com.
string[] urls = new[] {
    "yahoo.com",
    "http://yahoo.com",
    "http://www.yahoo.com" };
var unique = urls.
    Select(url => new System.Uri(
        url.StartsWith("http") ? url : "http://" + url).Host).
    Distinct();
(Edited to clean up formatting and to make the scheme addition part support both "http://" and "https://")
Try a Regex then .*?(\w+\.\w+)$ assuming you don't have anything after the tld.
