Extract domain name from URL in C# - c#

This question has answer in other languages/platforms but I couldn't find a robust solution in C#. Here I'm looking for the part of URL which we use in WHOIS so I'm not interested in sub-domains, port, schema, etc.
Example 1: http://s1.website.co.uk/folder/querystring?key=value => website.co.uk
Example 2: ftp://username:password#website.com => website.com
The result should be the same when the owner in whois is the same so sub1.xyz.com and sub2.xyz.com both belong to who has the xyz.com which I'm need to extract from a URL.

I needed the same, so I wrote a class that you can copy and paste into your solution. It uses a hard coded string array of tld's. http://pastebin.com/raw.php?i=VY3DCNhp
Console.WriteLine(GetDomain.GetDomainFromUrl("http://www.beta.microsoft.com/path/page.htm"));
outputs microsoft.com
and
Console.WriteLine(GetDomain.GetDomainFromUrl("http://www.beta.microsoft.co.uk/path/page.htm"));
outputs microsoft.co.uk

As #Pete noted, this is a little bit complicated, but I'll give it a try.
Note that this application must contain a complete list of known TLD's. These can be retrieved from http://publicsuffix.org/. Left extracting the list from this site as an exercise for the reader.
class Program
{
static void Main(string[] args)
{
var testCases = new[]
{
"www.domain.com.ac",
"www.domain.ac",
"domain.com.ac",
"domain.ac",
"localdomain",
"localdomain.local"
};
foreach (string testCase in testCases)
{
Console.WriteLine("{0} => {1}", testCase, UriHelper.GetDomainFromUri(new Uri("http://" + testCase + "/")));
}
/* Produces the following results:
www.domain.com.ac => domain.com.ac
www.domain.ac => domain.ac
domain.com.ac => domain.com.ac
domain.ac => domain.ac
localdomain => localdomain
localdomain.local => localdomain.local
*/
}
}
public static class UriHelper
{
private static HashSet<string> _tlds;
static UriHelper()
{
_tlds = new HashSet<string>
{
"com.ac",
"edu.ac",
"gov.ac",
"net.ac",
"mil.ac",
"org.ac",
"ac"
// Complete this list from http://publicsuffix.org/.
};
}
public static string GetDomainFromUri(Uri uri)
{
return GetDomainFromHostName(uri.Host);
}
public static string GetDomainFromHostName(string hostName)
{
string[] hostNameParts = hostName.Split('.');
if (hostNameParts.Length == 1)
return hostNameParts[0];
int matchingParts = FindMatchingParts(hostNameParts, 1);
return GetPartOfHostName(hostNameParts, hostNameParts.Length - matchingParts);
}
private static int FindMatchingParts(string[] hostNameParts, int offset)
{
if (offset == hostNameParts.Length)
return hostNameParts.Length;
string domain = GetPartOfHostName(hostNameParts, offset);
if (_tlds.Contains(domain.ToLowerInvariant()))
return (hostNameParts.Length - offset) + 1;
return FindMatchingParts(hostNameParts, offset + 1);
}
private static string GetPartOfHostName(string[] hostNameParts, int offset)
{
var sb = new StringBuilder();
for (int i = offset; i < hostNameParts.Length; i++)
{
if (sb.Length > 0)
sb.Append('.');
sb.Append(hostNameParts[i]);
}
string domain = sb.ToString();
return domain;
}
}

The closest you could get is the System.Uri.Host property, which would extract the sub1.xyz.com portion. Unfortunately, it's hard to know what exactly is the "toplevel" portion of the host (e.g. sub1.foo.co.uk versus sub1.xyz.com)

if you need to domain name then you can use URi.hostadress in .net
if you need the url from content then you need to parse them using regex.

Related

Get the two first and last chars of a string C#

Well I'd like to get the two first and last chars of a string. This is what I already got
public static string FirstLastChars(this string str)
{
return str.Substring(0,2);
}
BTW. It's an extension method
Starting C# 8.0 you can use array ranges:
public static class StringExtentions {
public static string FirstLastChars(this string str)
{
// If it's less that 4, return the entire string
if(str.Length < 4) return str;
return str[..2] + str[^2..];
}
}
Check solution here: https://dotnetfiddle.net/zBBT3U
You can use the existing string Substring method. Check the following code.
public static string FirstLastChars(this string str)
{
if(str.Length < 4)
{
return str;
}
return str.Substring(0,2) + str.Substring(str.Length - 1, 1) ;
}
As an alternative you can make use of the Span Api.
First you need to create a buffer and pass that to a Span instance. (With this you have a writable Span.)
var strBuffer = new char[3];
var resultSpan = new Span<char>(strBuffer);
From the original str you can create two ReadOnlySpan<char> instances to have references for the first two letters and for the last letter.
var strBegin = str.AsSpan(0, 2);
var strEnd = str.AsSpan(str.Length - 1, 1);
In order to make a single string from two Span objects you need to concatenate them. You can do this by making use of the writable span.
strBegin.CopyTo(resultSpan.Slice(0, 2));
strEnd.CopyTo(resultSpan.Slice(2, 1));
Finally let's convert the Span<char> into string. You can do this in several ways, but the two most convenient commands are the followings:
new string(resultSpan)
resultSpan.ToString()
Helper:
public static class StringExtentions
{
public static string FirstLastChars(this string str)
{
if (str.Length < 4) return str;
var strBuffer = new char[3];
var resultSpan = new Span<char>(strBuffer);
var strBegin = str.AsSpan(0, 2);
var strEnd = str.AsSpan(str.Length - 1, 1);
strBegin.CopyTo(resultSpan.Slice(0, 2));
strEnd.CopyTo(resultSpan.Slice(2, 1));
return new string(resultSpan);
}
}
Usage:
class Program
{
static void Main(string[] args)
{
string str = "Hello World!";
Console.WriteLine(str.FirstLastChars()); //He!
}
}
Try this code-
static void Main(string[] args)
{
int[] numbers = { 12, 34, 64, 98, 32, 46 };
var finalList = numbers.Skip(2).SkipLast(2).ToArray();
foreach(var item in finalList)
{
Console.WriteLine(item + "");
}
}

Replace every other of a certain char in a string

I have searched a lot to find a solution to this, but could not find anything. I do however suspect that it is because I don't know what to search for.
First, I have a string that I convert to an array. The string will be formatted like so:
"99.28099822998047,68.375 118.30699729919434,57.625 126.49999713897705,37.875 113.94499683380127,11.048999786376953 96.00499725341797,8.5"
I create the array with the following code:
public static Array StringToArray(string String)
{
var list = new List<string>();
string[] Coords = String.Split(' ', ',');
foreach (string Coord in Coords)
{
list.Add(Coord);
}
var array = list.ToArray();
return array;
}
Now my problem is; I am trying to find a way to convert it back into a string, with the same formatting. So, I could create a string simply using:
public static String ArrayToString(Array array)
{
string String = string.Join(",", array);
return String;
}
and then hopefully replace every 2nd "," with a space (" "). Is this possible? Or are there a whole other way you would do this?
Thank you in advance! I hope my question makes sense.
There is no built-in way of doing what you need. However, it's pretty trivial to achieve what it is you need e.g.
public static string[] StringToArray(string str)
{
return str.Replace(" ", ",").Split(',');
}
public static string ArrayToString(string[] array)
{
StringBuilder sb = new StringBuilder();
for (int i = 0; i <= array.Length-1; i++)
{
sb.AppendFormat(i % 2 != 0 ? "{0} " : "{0},", array[i]);
}
return sb.ToString();
}
If those are pairs of coordinates, you can start by parsing them like pairs, not like separate numbers:
public static IEnumerable<string[]> ParseCoordinates(string input)
{
return input.Split(' ').Select(vector => vector.Split(','));
}
It is easier then to reconstruct the original string:
public static string PrintCoordinates(IEnumerable<string[]> coords)
{
return String.Join(" ", coords.Select(vector => String.Join(",", vector)));
}
But if you absolutely need to have your data in a flat structure like array, it is then possible to convert it to a more structured format:
public static IEnumerable<string[]> Pairwise(string[] coords)
{
coords.Zip(coords.Skip(1), (coord1, coord2) => new[] { coord1, coord2 });
}
You then can use this method in conjunction with PrintCoordinates to reconstruct your initial string.
Here is a route to do it. I don't think other solutions were removing last comma or space. I also include a test.
public static String ArrayToString(Array array)
{
var useComma = true;
var stringBuilder = new StringBuilder();
foreach (var value in array)
{
if (useComma)
{
stringBuilder.AppendFormat("{0}{1}", value, ",");
}
else
{
stringBuilder.AppendFormat("{0}{1}", value, " ");
}
useComma = !useComma;
}
// Remove last space or comma
stringBuilder.Length = stringBuilder.Length - 1;
return stringBuilder.ToString();
}
[TestMethod]
public void ArrayToStringTest()
{
var expectedStringValue =
"99.28099822998047,68.375 118.30699729919434,57.625 126.49999713897705,37.875 113.94499683380127,11.048999786376953 96.00499725341797,8.5";
var array = new[]
{
"99.28099822998047",
"68.375",
"118.30699729919434",
"57.625",
"126.49999713897705",
"37.875",
"113.94499683380127",
"11.048999786376953",
"96.00499725341797",
"8.5",
};
var actualStringValue = ArrayToString(array);
Assert.AreEqual(expectedStringValue, actualStringValue);
}
Another way of doing it:
string inputString = "1.11,11.3 2.22,12.4 2.55,12.8";
List<string[]> splitted = inputString.Split(' ').Select(a => a.Split(',')).ToList();
string joined = string.Join(" ", splitted.Select(a => string.Join(",",a)).ToArray());
"splitted" list will look like this:
1.11 11.3
2.22 12.4
2.55 12.8
"joined" string is the same as "inputString"
Here's another approach to this problem.
public static string ArrayToString(string[] array)
{
Debug.Assert(array.Length % 2 == 0, "Array is not dividable by two.");
// Group all coordinates as pairs of two.
int index = 0;
var coordinates = from item in array
group item by index++ / 2
into pair
select pair;
// Format each coordinate pair with a comma.
var formattedCoordinates = coordinates.Select(i => string.Join(",", i));
// Now concatinate all the pairs with a space.
return string.Join(" ", formattedCoordinates);
}
And a simple demonstration:
public static void A_Simple_Test()
{
string expected = "1,2 3,4";
string[] array = new string[] { "1", "2", "3", "4" };
Debug.Assert(expected == ArrayToString(array));
}

Query Expansion logic in C#

I have to write logic for expanding queries for solr search engine. I am using
Dictionary<string, string> dicSynon = new Dictionary<string, string>();.
Each time I will be getiing strings like "2 ln ca". In my dictionary I am having synonyms for ln and ca as lane and california. Now I need to pass solr all the combination of strings. Like this
2 ln ca
2 lane ca
2 lane california
2 ln california
Please help me to build the logic....
This is a exercise in using combinatorics and Linqs SelectMany:
First you have to writeyourself some function to give you a sequence of synonyms given a word (including the word, so "2" will result in ("2")) - let's call this 'Synonmys' - using a dictionary it can look like this:
private Dictionary<string, IEnumerable<string>> synonyms = new Dictionary<string, IEnumerable<string>>();
public IEnumerable<string> GetSynonmys(string word)
{
return synonyms.ContainsKey(word) ? synonyms[word] : new[]{word};
}
(you have to fill the Dictionary yourself...)
Having this your task is rather easy - just think on it. You have to combine every synonym of a word with all the combinations you get from doing the task on the rest of the words - that's exactly where you can use SelectMany (I only paste the rest seperated by a space - maybe you should refactor a bit) - the Algorithm yourself is your standard recursive combinations-algorithm - you will see this a lot if you know this kind of problem:
public string[] GetWords(string text)
{
return text.Split(new[]{' '}); // add more seperators if you need
}
public IEnumerable<string> GetCombinations(string[] words, int lookAt = 0)
{
if (lookAt >= words.Length) return new[]{""};
var currentWord = words[lookAt];
var synonymsForCurrentWord = GetSynonmys(currentWord);
var combinationsForRest = GetCombinations(words, lookAt + 1);
return synonymsForCurrentWord.SelectMany(synonym => combinationsForRest.Select(rest => synonym + " " + rest));
}
Here is a complete example:
class Program
{
static void Main(string[] args)
{
var test = new Synonmys(Tuple.Create("ca", "ca"),
Tuple.Create("ca", "california"),
Tuple.Create("ln", "ln"),
Tuple.Create("ln", "lane"));
foreach (var comb in test.GetCombinations("2 ln ca"))
Console.WriteLine("Combination: " + comb);
}
}
class Synonmys
{
private Dictionary<string, IEnumerable<string>> synonyms;
public Synonmys(params Tuple<string, string>[] syns )
{
synonyms = syns.GroupBy(s => s.Item1).ToDictionary(g => g.Key, g => g.Select(kvp => kvp.Item2));
}
private IEnumerable<string> GetSynonmys(string word)
{
return synonyms.ContainsKey(word) ? synonyms[word] : new[]{word};
}
private string[] GetWords(string text)
{
return text.Split(new[]{' '}); // add more seperators if you need
}
private IEnumerable<string> GetCombinations(string[] words, int lookAt = 0)
{
if (lookAt >= words.Length) return new[]{""};
var currentWord = words[lookAt];
var synonymsForCurrentWord = GetSynonmys(currentWord);
var combinationsForRest = GetCombinations(words, lookAt + 1);
return synonymsForCurrentWord.SelectMany(synonym => combinationsForRest.Select(rest => synonym + " " + rest));
}
public IEnumerable<string> GetCombinations(string text)
{
return GetCombinations(GetWords(text));
}
}
Feel free to comment if something is not crystal-clear here ;)

compare the characters in two strings

In C#, how do I compare the characters in two strings.
For example, let's say I have these two strings
"bc3231dsc" and "bc3462dsc"
How do I programically figure out the the strings
both start with "bc3" and end with "dsc"?
So the given would be two variables:
var1 = "bc3231dsc";
var2 = "bc3462dsc";
After comparing each characters from var1 to var2, I would want the output to be:
leftMatch = "bc3";
center1 = "231";
center2 = "462";
rightMatch = "dsc";
Conditions:
1. The strings will always be a length of 9 character.
2. The strings are not case sensitive.
The string class has 2 methods (StartsWith and Endwith) that you can use.
After reading your question and the already given answers i think there are some constraints are missing, which are maybe obvious to you, but not to the community. But maybe we can do a little guess work:
You'll have a bunch of string pairs that should be compared.
The two strings in each pair are of the same length or you are only interested by comparing the characters read simultaneously from left to right.
Get some kind of enumeration that tells me where each block starts and how long it is.
Due to the fact, that a string is only a enumeration of chars you could use LINQ here to get an idea of the matching characters like this:
private IEnumerable<bool> CommonChars(string first, string second)
{
if (first == null)
throw new ArgumentNullException("first");
if (second == null)
throw new ArgumentNullException("second");
var charsToCompare = first.Zip(second, (LeftChar, RightChar) => new { LeftChar, RightChar });
var matchingChars = charsToCompare.Select(pair => pair.LeftChar == pair.RightChar);
return matchingChars;
}
With this we can proceed and now find out how long each block of consecutive true and false flags are with this method:
private IEnumerable<Tuple<int, int>> Pack(IEnumerable<bool> source)
{
if (source == null)
throw new ArgumentNullException("source");
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
{
yield break;
}
bool current = iterator.Current;
int index = 0;
int length = 1;
while (iterator.MoveNext())
{
if(current != iterator.Current)
{
yield return Tuple.Create(index, length);
index += length;
length = 0;
}
current = iterator.Current;
length++;
}
yield return Tuple.Create(index, length);
}
}
Currently i don't know if there is an already existing LINQ function that provides the same functionality. As far as i have already read it should be possible with SelectMany() (cause in theory you can accomplish any LINQ task with this method), but as an adhoc implementation the above was easier (for me).
These functions could then be used in a way something like this:
var firstString = "bc3231dsc";
var secondString = "bc3462dsc";
var commonChars = CommonChars(firstString, secondString);
var packs = Pack(commonChars);
foreach (var item in packs)
{
Console.WriteLine("Left side: " + firstString.Substring(item.Item1, item.Item2));
Console.WriteLine("Right side: " + secondString.Substring(item.Item1, item.Item2));
Console.WriteLine();
}
Which would you then give this output:
Left side: bc3
Right side: bc3
Left side: 231
Right side: 462
Left side: dsc
Right side: dsc
The biggest drawback is in someway the usage of Tuple cause it leads to the ugly property names Item1 and Item2 which are far away from being instantly readable. But if it is really wanted you could introduce your own simple class holding two integers and has some rock-solid property names. Also currently the information is lost about if each block is shared by both strings or if they are different. But once again it should be fairly simply to get this information also into the tuple or your own class.
static void Main(string[] args)
{
string test1 = "bc3231dsc";
string tes2 = "bc3462dsc";
string firstmatch = GetMatch(test1, tes2, false);
string lasttmatch = GetMatch(test1, tes2, true);
string center1 = test1.Substring(firstmatch.Length, test1.Length -(firstmatch.Length + lasttmatch.Length)) ;
string center2 = test2.Substring(firstmatch.Length, test1.Length -(firstmatch.Length + lasttmatch.Length)) ;
}
public static string GetMatch(string fist, string second, bool isReverse)
{
if (isReverse)
{
fist = ReverseString(fist);
second = ReverseString(second);
}
StringBuilder builder = new StringBuilder();
char[] ar1 = fist.ToArray();
for (int i = 0; i < ar1.Length; i++)
{
if (fist.Length > i + 1 && ar1[i].Equals(second[i]))
{
builder.Append(ar1[i]);
}
else
{
break;
}
}
if (isReverse)
{
return ReverseString(builder.ToString());
}
return builder.ToString();
}
public static string ReverseString(string s)
{
char[] arr = s.ToCharArray();
Array.Reverse(arr);
return new string(arr);
}
Pseudo code of what you need..
int stringpos = 0
string resultstart = ""
while not end of string (either of the two)
{
if string1.substr(stringpos) == string1.substr(stringpos)
resultstart =resultstart + string1.substr(stringpos)
else
exit while
}
resultstart has you start string.. you can do the same going backwards...
Another solution you can use is Regular Expressions.
Regex re = new Regex("^bc3.*?dsc$");
String first = "bc3231dsc";
if(re.IsMatch(first)) {
//Act accordingly...
}
This gives you more flexibility when matching. The pattern above matches any string that starts in bc3 and ends in dsc with anything between except a linefeed. By changing .*? to \d, you could specify that you only want digits between the two fields. From there, the possibilities are endless.
using System;
using System.Text.RegularExpressions;
using System.Collections.Generic;
class Sample {
static public void Main(){
string s1 = "bc3231dsc";
string s2 = "bc3462dsc";
List<string> common_str = commonStrings(s1,s2);
foreach ( var s in common_str)
Console.WriteLine(s);
}
static public List<string> commonStrings(string s1, string s2){
int len = s1.Length;
char [] match_chars = new char[len];
for(var i = 0; i < len ; ++i)
match_chars[i] = (Char.ToLower(s1[i])==Char.ToLower(s2[i]))? '#' : '_';
string pat = new String(match_chars);
Regex regex = new Regex("(#+)", RegexOptions.Compiled);
List<string> result = new List<string>();
foreach (Match match in regex.Matches(pat))
result.Add(s1.Substring(match.Index, match.Length));
return result;
}
}
for UPDATE CONDITION
using System;
class Sample {
static public void Main(){
string s1 = "bc3231dsc";
string s2 = "bc3462dsc";
int len = 9;//s1.Length;//cond.1)
int l_pos = 0;
int r_pos = len;
for(int i=0;i<len && Char.ToLower(s1[i])==Char.ToLower(s2[i]);++i){
++l_pos;
}
for(int i=len-1;i>0 && Char.ToLower(s1[i])==Char.ToLower(s2[i]);--i){
--r_pos;
}
string leftMatch = s1.Substring(0,l_pos);
string center1 = s1.Substring(l_pos, r_pos - l_pos);
string center2 = s2.Substring(l_pos, r_pos - l_pos);
string rightMatch = s1.Substring(r_pos);
Console.Write(
"leftMatch = \"{0}\"\n" +
"center1 = \"{1}\"\n" +
"center2 = \"{2}\"\n" +
"rightMatch = \"{3}\"\n",leftMatch, center1, center2, rightMatch);
}
}

print name of the variable in c#

i have a statement
int A = 10,B=6,C=5;
and i want to write a print function such that i pass the int variable to it and
it prints me the variable name and the value.
eg if i call print(A)
it must return "A: 10", and print (B) then it must return "B:6"
in short i want to know how can i access the name of the variable and print it to string in c#. DO i have to use reflection?
After reading the answers
Hi all, thanks for the suggestions provided. I shall try them out, however i wanted to know if it is at all possible in .NET 2.0? Nothing similar to
#define prt(x) std::cout << #x " = '" << x << "'" << std::endl;
macro which is there in C/C++?
The only sensible way to do this would be to use the Expression API; but that changes the code yet further...
static void Main() {
int A = 10, B = 6, C = 5;
Print(() => A);
}
static void Print<T>(Expression<Func<T>> expression) {
Console.WriteLine("{0}={1}",
((MemberExpression)expression.Body).Member.Name,
expression.Compile()());
}
Note: if this is for debugging purposes, be sure to add [Conditional("DEBUG")] to the method, as using a variable in this way changes the nature of the code in subtle ways.
You can use lambda expressions:
static void Main( string[] args ) {
int A = 50, B = 30, C = 17;
Print( () => A );
Print( () => B );
Print( () => C );
}
static void Print<T>( System.Linq.Expressions.Expression<Func<T>> input ) {
System.Linq.Expressions.LambdaExpression lambda = (System.Linq.Expressions.LambdaExpression)input;
System.Linq.Expressions.MemberExpression member = (System.Linq.Expressions.MemberExpression)lambda.Body;
var result = input.Compile()();
Console.WriteLine( "{0}: {1}", member.Member.Name, result );
}
This is not possible without some 'help' from the call site; even reflection does not know about names of local variables.
This is not possible to do with reflection (see Brian and Joel). In general this is not possible simply because you cannot guarantee a named value is being passed to your print function. For instance, I could just as easily do the following
print(42);
print(A + 42);
Neither of these expressions actually has a name. What would you expect to print here?
Another solution (from a closed post):
Inspired by Jon Skeet's post about Null Reference exception handling and suddenly being reminded about projection there is a way to kinda do that.
Here is complete working codez:
public static class ObjectExtensions {
public static string GetVariableName<T>(this T obj) {
System.Reflection.PropertyInfo[] objGetTypeGetProperties = obj.GetType().GetProperties();
if(objGetTypeGetProperties.Length == 1)
return objGetTypeGetProperties[0].Name;
else
throw new ArgumentException("object must contain one property");
}
}
class Program {
static void Main(string[] args) {
string strName = "sdsd";
Console.WriteLine(new {strName}.GetVariableName());
int intName = 2343;
Console.WriteLine(new { intName }.GetVariableName());
}
}
If you need to support more types than int, use the Expression API but avoid generics and handle the different expression gracefully:
private static string ToDebugOutput(params Expression<Func<object>>[] variables)
{
var sb = new StringBuilder();
foreach (var input in variables)
{
string name;
if (input.Body is UnaryExpression unary && unary.Operand is MemberExpression operand)
{
name = operand.Member.Name;
}
else if (input.Body is MemberExpression member)
{
name = member.Member.Name;
}
else
{
throw new NotSupportedException($"typeof lambda: {input.Body.GetType()}");
}
var result = input.Compile()();
sb.Append($"{name}={result}, ");
}
return sb.ToString();
}
Usage:
string s = "123";
double d = 1.23;
int i2 = 123;
var out2 = ToDebugOutput(() => s, () => d, () => i2);
// out2 = "s=123, d=1.23, i2=123, "

Categories