Regex Issue in C#


I am trying to create a C# routine that removes all of the following prefixes and suffixes and returns just the root word of a domain:
var stripChars = new List<string> { "http://", "https://", "www.", "ftp.", ".com", ".net", ".org", ".info", ".co", ".me", ".mobi", ".us", ".biz" };
I do this with the following code:
originalDomain = stripChars.Aggregate(originalDomain, (current, repl) => Regex.Replace(current, repl, @"", RegexOptions.IgnoreCase));
Which seems to work in almost all cases. Today, however, I discovered that setting "originalDomain" to "NameCheap.com" does not return:
NameCheap
Like it should, but rather:
NCheap
Can anyone look at this and tell me what is going wrong? Any help would be appreciated.

This is expected: an unescaped dot in a regex matches any character.
Therefore, .me matches "ame" in NameCheap, and removing it leaves NCheap.
Escape the dots with a backslash so they match only a literal dot.
Also, you'd be better off using a dedicated URI API for this kind of operation.
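As a minimal sketch of the escaping fix, Regex.Escape can escape each entry so it is treated as literal text. The class name DomainRoot and the method Strip are hypothetical names chosen for illustration; the list is the one from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class DomainRoot
{
    // Same literal prefixes and suffixes as in the question.
    private static readonly List<string> StripChars = new List<string>
    {
        "http://", "https://", "www.", "ftp.", ".com", ".net", ".org",
        ".info", ".co", ".me", ".mobi", ".us", ".biz"
    };

    // Hypothetical helper: removes each entry as literal text, ignoring case.
    public static string Strip(string domain)
    {
        // Regex.Escape turns "." into "\." so it no longer matches any character.
        return StripChars.Aggregate(domain, (current, repl) =>
            Regex.Replace(current, Regex.Escape(repl), "", RegexOptions.IgnoreCase));
    }
}
```

With this change, DomainRoot.Strip("NameCheap.com") yields NameCheap. Note that the entries are still removed wherever they occur in the string, not only at the start or end, which is one reason a URI API may be the safer choice.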

I know this doesn't answer your question directly, but given the specific task you are trying to accomplish I would recommend trying something like this:
Uri uri = new Uri(originalDomain);
originalDomain = uri.Host;
EDIT:
If your input may not contain a scheme, you can use UriBuilder, as noted in this post:
var hostName = new UriBuilder(input).Host;
Hope this helps.
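A short sketch of the UriBuilder approach, wrapped in a hypothetical helper (HostExtractor.GetHost is an illustrative name, not part of any library). It relies on the documented behavior that UriBuilder defaults to the http scheme when the input has none.

```csharp
using System;

public static class HostExtractor
{
    // Hypothetical helper: returns the host part of an input
    // that may or may not include a scheme.
    public static string GetHost(string input)
    {
        // UriBuilder assumes http:// when the input has no scheme.
        return new UriBuilder(input).Host;
    }
}
```

This returns the full host name, so a leading "www." and the top-level domain are still present and would need to be removed separately if only the root word is wanted.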

Related

How to check if the string contains dynamic word that starts with first letter

I am trying to check whether a string contains a word that varies but always starts with the letter 'F' (C#). The output from the service call looks like this:
Exception_Remote_Call--VNQ DN ERROR CODE found ERROR CODE= F0123,
ERROR DESCRIPTION= NOT AVAILABLE
In the above string, the word F0123 changes depending on the error code. I tried the following, but it only works for F0123 and fails when the output contains, say, F0111. I would like to detect any code that starts with 'F'.
var isStartsWithF = s.Contains("F0123");
I would really appreciate any help. Thank you in advance!

This is a job for regular expressions. To make things clearer and easier to spot for future maintainers, I might include the ERROR CODE = as part of the expression:
var data = "Exception_Remote_Call--VNQ DN ERROR CODE found ERROR CODE = F0123, ERROR DESCRIPTION= NOT AVAILABLE";
var exp = new Regex(@"ERROR CODE\s?= (F\d{4,5})");
var result = exp.Match(data).Groups[1].Value;
See it work here:
https://dotnetfiddle.net/nOXOCt
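Since the original question asks whether such a code is present at all, a sketch that checks Match.Success before reading the group may be useful. The names ErrorCodeParser and TryFindFCode are hypothetical; the pattern is the one from the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ErrorCodeParser
{
    // Same pattern as in the answer: "ERROR CODE", optional whitespace,
    // "=", a space, then F followed by 4 or 5 digits.
    private static readonly Regex CodePattern =
        new Regex(@"ERROR CODE\s?= (F\d{4,5})");

    // Hypothetical helper: returns true and the code when one is found,
    // otherwise false and an empty string.
    public static bool TryFindFCode(string data, out string code)
    {
        Match match = CodePattern.Match(data);
        code = match.Success ? match.Groups[1].Value : string.Empty;
        return match.Success;
    }
}
```

The boolean result plays the role of isStartsWithF from the question, and the out parameter carries the actual code for logging or branching.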

Microsoft Bot Framework Special Characters

A coworker of mine developed a bot using MS Bot Framework. It's working as expected, but in some responses special characters are shown in place of apostrophes. Please let me know if you have experienced this and have a fix. Thanks. :)

You can use the Regex class to strip unwanted special characters from the response text. Here is a sample you can adapt:
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        string inputString = "Manoj# Bhard#waj";
        // An unescaped "." matches every character and would blank out the whole string.
        // [^\w\s] matches only characters that are not letters, digits, underscores, or whitespace.
        string outputString = Regex.Replace(inputString, @"[^\w\s]", "");
        // Write the input and the cleaned output.
        Console.WriteLine(inputString);
        Console.WriteLine(outputString);
    }
}

I also faced this issue. I fixed it by creating the message with the following format settings in the bot code (app.js):
var customMessage = new builder.Message(session)
    .text("I didn't quite get that. For us ......")
    .textFormat("plain")
    .textLocale("en-us");
session.send(customMessage);
See the official documentation for V3: https://learn.microsoft.com/en-us/azure/bot-service/nodejs/bot-builder-nodejs-message-create?view=azure-bot-service-3.0

ARSoft.Tools.Net SpfValidator.CheckHost() not responding

I am trying to follow this example:
https://docs.ar-soft.de/arsoft.tools.net/#SPF%20SenderIP%20Validation.html
var validator = new SpfValidator()
{
    HeloDomain = DomainName.Parse("example.com"),
    LocalDomain = DomainName.Parse("receivingmta.example.com"),
    LocalIP = IPAddress.Parse("192.0.2.1")
};
SpfQualifier result = validator.CheckHost(IPAddress.Parse("192.0.2.200"),
    DomainName.Parse("example.com"), "sender@example.com").Result;
However, no matter what IPs and urls I use, CheckHost() method does not finish.
Does anybody know the correct use, or example input parameters for which this would complete?
I would expect an exception if inputs are invalid.

You're using it the same way I am, and it works perfectly for me. Maybe something in your firewall is blocking it from performing the DNS lookup queries?

I need to strip a Google Alerts URL

To preface, I know there are similar threads about this, but I am using C#, not Java, Python, or PHP. Some threads provide a solution for a single URL, which is not universal. Thanks for not flagging me.
So I am using Google Alerts to get links to articles via email. I have already written a program that can strip the URLs out of the email as well as another program to scrape the websites. My issue is that the links in the google alerts email look like this:
https://www.google.com/url?rct=j&sa=t&url=http://www.foxnews.com/health/2016/08/19/virtual-reality-treadmills-help-prevent-falls-in-elderly.html&ct=ga&cd=CAEYACoTOTc2NjE4NjYyNzMzNzc3NDcyODIaODk2NWUwYzRjMzdmOGI4Nzpjb206ZW46VVM&usg=AFQjCNGyK2EyVBLoKnNkdxIBDf8a_B3Ung. Yeah, ugly.
Because this redirects to the actual article through google, my scraping program does not work on these links. I have tried a million different RegExs from questions here and other sources. I managed to strip off everything up until the http:// of the actual article but it still has the tail end that screws it up. Here is what I have so far. They now look like:
http://www.foxnews.com/health/2016/08/19/virtual-reality-treadmills-help-prevent-falls-in-elderly.html&ct=ga&cd=CAEYACoTOTc2NjE4NjYyNzMzNzc3NDcyODIaODk2NWUwYzRjMzdmOGI4Nzpjb206ZW46VVM&usg=AFQjCNGyK2EyVBLoKnNkdxIBDf8a_B3Ung
private List<string> GetLinks(string message)
{
    List<string> list = new List<string>();
    Regex urlRx = new Regex(@"((http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?)", RegexOptions.IgnoreCase);
    MatchCollection matches = urlRx.Matches(message);
    foreach (Match match in matches)
    {
        if (!match.ToString().Contains("news.google.com/news") && !match.ToString().Contains("google.com/alerts"))
        {
            string find = "=http";
            int ind = match.ToString().IndexOf(find);
            list.Add(match.ToString().Substring(ind + 1));
        }
    }
    return list;
}
Some help getting rid of the endings would be awesome, be it a new RegEx or some extra code. Thanks in advance.

You can use HttpUtility.ParseQueryString to retrieve the url part of the query string. It is located in the System.Web namespace (reference required).
var uri = new Uri("https://www.google.com/url?rct=j&sa=t&url=http://www.foxnews.com/health/2016/08/19/virtual-reality-treadmills-help-prevent-falls-in-elderly.html&ct=ga&cd=CAEYACoTOTc2NjE4NjYyNzMzNzc3NDcyODIaODk2NWUwYzRjMzdmOGI4Nzpjb206ZW46VVM&usg=AFQjCNGyK2EyVBLoKnNkdxIBDf8a_B3Ung");
var queries = HttpUtility.ParseQueryString(uri.Query);
var foxNews = queries["url"]; //http://www.foxnews.com/health/2016/08/19/virtual-reality-treadmills-help-prevent-falls-in-elderly.html
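To apply this to every link the question's GetLinks method collects, the lookup can be wrapped in a small helper. This is only a sketch: AlertLinks.Unwrap is a hypothetical name, and falling back to the original link when no url parameter exists is an assumption about the desired behavior.

```csharp
using System;
using System.Web;

public static class AlertLinks
{
    // Hypothetical helper: returns the "url" query parameter of a redirect
    // link, or the link unchanged when it has no such parameter.
    public static string Unwrap(string link)
    {
        var uri = new Uri(link);
        string target = HttpUtility.ParseQueryString(uri.Query)["url"];
        return string.IsNullOrEmpty(target) ? link : target;
    }
}
```

Inside the foreach loop, list.Add(AlertLinks.Unwrap(match.ToString())) could then replace the IndexOf and Substring steps, which also removes the trailing &ct=...&usg=... parameters.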

Comparing the url segment to string

I have done this in the code-behind:
var uri = new Uri(Request.Url.ToString());
if ("newsFeed" == Request.Url.Segments[2])
{
    L1.Attributes.Add("class", "active");
}
The URL of the page is:
http://localhost:52040/ClientSide/newsFeed/allEr.aspx
So the condition is supposed to be true and the if block should run, but it doesn't.
What is the problem?

Segments[2] is "newsFeed/", not "newsFeed", because each segment keeps its trailing slash. So you can do:
if ("newsFeed" == Request.Url.Segments[2].Trim('/'))
Or use string.TrimEnd.
An easier way to debug these problems in the future is to set a breakpoint and use the watch window. There you can see the value of Request.Url.Segments[2].
See: How to: Use Debugger Variable Windows
By the way, Request.Url is already of type Uri, so you don't have to create a new Uri instance from ToString.
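As a small illustration of how Uri.Segments splits the path, here is a sketch of the comparison wrapped in a hypothetical helper (SegmentCheck.SegmentEquals is an illustrative name). It also guards against an index beyond the number of segments.

```csharp
using System;

public static class SegmentCheck
{
    // Hypothetical helper: compares one path segment to a name,
    // ignoring the segment's trailing slash.
    public static bool SegmentEquals(Uri url, int index, string name)
    {
        // For /ClientSide/newsFeed/allEr.aspx the segments are
        // "/", "ClientSide/", "newsFeed/", "allEr.aspx".
        return index < url.Segments.Length
            && url.Segments[index].TrimEnd('/') == name;
    }
}
```

In the code-behind, the condition would then read SegmentCheck.SegmentEquals(Request.Url, 2, "newsFeed").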

Try this:
if(Request.Url.Segments[2].Contains("newsFeed"))
