How to remove a pattern from a string using Regex

I want to find paths from a string and remove them, e.g.:
string1 = "'c:\a\b\c'!MyUDF(param1, param2,..) + 'c:\a\b\c'!MyUDF(param3, param4,..)..."
I'd like a regex to find the pattern '[some path]'!MyUDF, and remove '[path]'.
Thanks.
Edit:
Example input:
string1 = "'c:\a\b\c'!MyUDF(param1, param2,..) + 'c:\a\b\c'!MyUDF(param3, param4,..)";
Expected output: "MyUDF(param1, param2,...) + MyUDF(param3, param4,...)"
where MyUDF is a function name, so it consists of only letters

input = Regex.Replace(input, "'[^']+'(?=!MyUDF)", "");
If the path is followed by ! and some other word, you can use:
input = Regex.Replace(input, @"'[^']+'(?=!\w+)", "");
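A minimal sketch of the same idea, wrapped in a helper so it can be tried on the example input. The names PathStripper and StripPaths are hypothetical, chosen only for illustration. Note one deliberate difference: the lookahead patterns above leave the "!" in place, so this sketch consumes the "!" as well, on the assumption that the expected output in the question (no "!") is what is wanted.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PathStripper
{
    // '[^']+'   a quoted segment: a quote, one or more non-quote characters, a quote
    // !         the separator, removed together with the path
    // (?=\w+)   lookahead: only match when a word (the function name) follows
    public static string StripPaths(string input)
    {
        return Regex.Replace(input, @"'[^']+'!(?=\w+)", "");
    }
}
```

Called with the question's example, PathStripper.StripPaths should leave only the two MyUDF(...) calls joined by " + ".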

Alright, if the ! is always in the string as you suggest, this Regex !(.*)?\( will get you what you want. Here is a Regex 101 to prove it.
To use it, you might do something like this:
var result = Regex.Replace(myString, @"!(.*)?\(");

The feature you want, if you are dealing with file paths, is in System.IO.Path.
It has many methods, and working with paths is one of its specific purposes.

Related

Pattern not correct with regular expressions in c#

I am doing a project, and I want to remove the http protocol from a string. In my Excel sheet there are two types: one is http://www.email@domain.com and the other is http://email@domain.com. I have tried many combinations but I can't find the right one.
My code only works with the first type and not with the second one.
var website_domain_in_excel = list_of_information_in_excel[2];
string pattern = "(http://\\www.)";
Console.WriteLine(Regex.Replace(website_domain_in_excel, pattern, String.Empty));
Thank you for your time

The pattern you want is this:
string pattern = @"http:\/\/(?:www\.)?";
This matches http:// and then an optional non-capturing group matching www..
You can see an explanation of the regex here and this fiddle for a working demo in C#.

You can use the pattern "http:\/\/www\.|http:\/\/", which matches either "http://www." or "http://".
The code would look like this:
var website_domain_in_excel = list_of_information_in_excel[2];
string pattern = @"http:\/\/www\.|http:\/\/";
Console.WriteLine(Regex.Replace(website_domain_in_excel, pattern, String.Empty));

A non-regex solution:
var eml = "http://www.email@domain.com";
eml = eml.Replace("http://", "").Replace("www.", "");
// eml now is "email@domain.com"
You might want to check that "www." only appears at the start. The (unusual) "email@www.domain.com" should remain intact.
But if you really want a regex:
eml = Regex.Replace(eml, "^https?://(www\\.)?", "");
This also catches "https", because of the ? after the "s".
It will also find and replace an optional "www.", but only at the start.
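For reference, here is a small sketch of the anchored pattern as a reusable helper. The names UrlPrefix and Strip are made up for this example, and the case-insensitive option is an added assumption (so that "HTTP://" is handled too), not something the question asked for.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlPrefix
{
    // ^           anchor at the start, so a "www." later in the value is left alone
    // https?://   "http://" or "https://"
    // (?:www\.)?  an optional, literal "www."
    private static readonly Regex Prefix =
        new Regex(@"^https?://(?:www\.)?", RegexOptions.IgnoreCase);

    public static string Strip(string value)
    {
        return Prefix.Replace(value, "");
    }
}
```

Because of the ^ anchor, a value that merely contains "www." somewhere in the middle keeps it.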

RegEx to extract partial string

So simple, but I'm struggling. I use RegExp every 2 years or so, so I'm rusty.
I have these two url strings
http://localhost:58876/Products/Product1
https://localhost:58876/Products/Product1
The result I want is
localhost:58876
Basically remove the http(s):// and everything after the first single / so I end up with the domain with or without the port number
P.S: I'm working with C#

This worked for me (tested in Notepad++):
(\w+:\d+)

You can use the following regex to split the URL:
((http[s]?|ftp):/)?/?([^:/\s]+)(:([^/]*))?((/\w+)*/)([\w\-.]+[^#?\s]+)(\?([^#]*))?(#(.*))?
Capture groups 3 and 5 (the host and the port) are the ones you are looking for.

(^[^h]|\/\/)([\w\d\:\#\.]+:?[\d]?+)
then in c#:
string address = ...
char[] MyChar = {'/'};
string NewString = address.TrimStart(MyChar);
EDIT: this also worked with localhost:58876/Products/Product1.

Just match anything but a slash: /^https?:\/\/([^\/]+)\/.*$/
var url = 'http://localhost:58876/Products/Product1';
var match = url.match(/^https?:\/\/([^\/]+)\/.*$/);
if (match && match.length > 0) document.write(match[1]);
Even shorter: /\/\/([^\/]+)/. Note that there are much better ways to parse URLs. Depending on your platform, there's PHP's parse_url, Node.js's url module, or libraries like uri.js that handle the many faces of valid URIs.
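Since the question is about C#, the equivalent "let a parser do it" route there is the built-in System.Uri class. The sketch below is only an illustration; the names HostExtractor and HostAndPort are hypothetical. It relies on Uri.Authority, which returns the host plus the port when the port is not the default for the scheme.

```csharp
using System;

public static class HostExtractor
{
    // Returns "host:port" (or just "host" when the port is the scheme's default),
    // or an empty string when the input is not an absolute URI.
    public static string HostAndPort(string url)
    {
        if (!Uri.TryCreate(url, UriKind.Absolute, out var uri))
        {
            return string.Empty;
        }
        return uri.Authority;
    }
}
```

Both URLs from the question should come back as localhost:58876, with no regex involved.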

Splitting a string from a specific point in C#

I have looked at the split string methods in C#, such as:
string[] lines = Regex.Split(value, @"\\");
However, I have come across a situation where I need to extract a file name from a file path, so I don't want to split the string on every occurrence of "\", e.g.:
C:\Windows\System32\calc.exe
Expected output:
calc.exe

new FileInfo(@"C:\Windows\System32\calc.exe").Name

Let the framework take the strain. Use the Path.GetFileName method.
http://msdn.microsoft.com/en-us/library/system.io.path.getfilename.aspx
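One caveat worth a sketch: Path.GetFileName uses the directory separators of the operating system it runs on, so on Linux or macOS a backslash is not treated as a separator. The helper below is not Path.GetFileName itself but a hand-rolled stand-in (the names FileNames and FromWindowsPath are hypothetical) that cuts after the last "\" or "/" regardless of platform. On Windows, Path.GetFileName remains the simpler choice.

```csharp
using System;

public static class FileNames
{
    // Returns everything after the last '\' or '/'.
    // When no separator is present, LastIndexOfAny returns -1 and the
    // whole input is returned unchanged.
    public static string FromWindowsPath(string path)
    {
        int lastSeparator = path.LastIndexOfAny(new[] { '\\', '/' });
        return path.Substring(lastSeparator + 1);
    }
}
```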

C# Regular expression problem

I have the following string:
http://www.powerwXXe.com/text1 123-456 text2 text3/
Can someone give me advice on how to get the value of text1, text2 and text3 and put them into a string. I have heard of regular expressions but have no idea how to use them.

Instead of going the RegEx route, if you know that the string will always be of a similar format, you can use string.Split, first on /, then on space, and retrieve the results from the resulting string arrays.
string[] slashes = myString.Split('/');
string[] textVals = slashes[3].Split(' ');
// at this point:
// textVals[0] = "text1"
// textVals[1] = "123-456"
// textVals[2] = "text2"
// textVals[3] = "text3"

Here is a link on getting started with regular expressions in C#: Regular Expression Tutorial
I don't think it is appropriate to write out a tutorial here since the information is online, so please check out the link and let me know if you have a specific question.

Instead of using regex, you can use string.Format("http://myurl.com/{0}{1}{2}", value1, textbox2.Text, textbox3.Text) and format the URL in whatever fashion. If you are looking to go the regex route, you can always check regexlib.

The use of regular expressions relies on patterns you see in your strings - you need to be able to generalize the pattern of strings you're looking for before you can use a regular expression.
For a problem of this scope, if you can pin down the pattern, you're probably better off using other string parsing methods, such as String.IndexOf and String.Split.
Regular expressions are a powerful tool, and certainly worth learning, but they might not be necessary here.

Based on the example you gave, it looks as though text1, text2 and text3 are separated by spaces? If so, and if you always know the positions they'll be in, you may want to skip regular expressions and just use .Split(' ') to split the string into an array of strings and then grab the pertinent items from there. Something like this:
string foo = "http://www.powerwXXe.com/text1 123-456 text2 text3/";
string[] fooParts = foo.Split(' ');
string text1 = fooParts[0].Replace("http://www.powerwXXe.com/", "");
string text2 = fooParts[2];
string text3 = fooParts[3].Replace("/", "");
You'd want to perform bounds checking on the string[] before trying to grab anything from it, but this would work. Regex is awesome for string parsing, but when it's simple stuff you need to do, sometimes it's overkill when simple methods from the string class will do.

It all depends on how much you know about the string you are parsing. Where does the string come from and how much do you know about its formatting?
Based on your example string you could get away with something as simple as
string pattern = @"http://www.powerwXXe.com/(?<myGroup1>\S+)\s\S+\s(?<myGroup2>\S+)\s(?<myGroup3>\S+)/";
var reg = new System.Text.RegularExpressions.Regex(pattern);
string input = "http://www.powerwXXe.com/text1 123-456 text2 text3/";
System.Text.RegularExpressions.Match myMatch = reg.Match(input);
The captured strings would then be contained in myMatch.Groups["myGroup1"], ["myGroup2"] and ["myGroup3"] respectively.
This however assumes that your string always begins with http://www.powerwXXe.com/, that there will always be three groups to capture and that the groups are separated by a space (which is an illegal character in URLs and would in almost all cases be converted to %20, which would have to be accounted for in the pattern).
So, how much do you know about your string? And, as some have already stated, do you really need regular expressions?
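To make the named-group idea concrete, here is a small sketch that reads the three values out of the match. It is only an illustration under the same assumptions listed above (fixed prefix, three values, single spaces). The names SegmentParser and Parse and the group names text1 to text3 are made up for this example, and the pattern is tightened slightly: the dots are escaped and the whole string is anchored.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SegmentParser
{
    // Three named groups of non-whitespace text; the unnamed \S+ skips "123-456".
    // The last group also excludes '/', so the trailing slash is not captured.
    private static readonly Regex Pattern = new Regex(
        @"^http://www\.powerwXXe\.com/(?<text1>\S+)\s\S+\s(?<text2>\S+)\s(?<text3>[^\s/]+)/$");

    // Returns the three captured values, or an empty array when the input does not match.
    public static string[] Parse(string input)
    {
        Match m = Pattern.Match(input);
        if (!m.Success)
        {
            return new string[0];
        }
        return new[]
        {
            m.Groups["text1"].Value,
            m.Groups["text2"].Value,
            m.Groups["text3"].Value
        };
    }
}
```

Checking the length of the returned array before indexing into it covers the case where the input does not have the expected shape.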

Using .NET RegEx to retrieve part of a string after the second '-'

This is my first stack message. Hope you can help.
I have several strings I need to break up for use later. Here are a couple of examples of what I mean:
fred-064528-NEEDED
frederic-84728957-NEEDED
sam-028-NEEDED
As you can see above, the string lengths vary greatly, so I believe regex is the only way to achieve what I want. What I need is the rest of the string after the second hyphen ('-').
I am very weak at regex, so any help would be great.
Thanks in advance.

Just to offer an alternative without using regex:
foreach (string s in list)
{
    // Take everything after the last hyphen.
    int x = s.LastIndexOf('-');
    string sub = s.Substring(x + 1);
}
Add validation to taste.

Something like this. It will take anything (except line breaks) after the second '-' including the '-' sign.
var exp = #"^\w*-\w*-(.*)$";
var match = Regex.Match("frederic-84728957-NEE-DED", exp);
if (match.Success)
{
var result = match.Groups[1]; //Result is NEE-DED
Console.WriteLine(result);
}
EDIT: I answered another question which relates to this. Except, it asked for a LINQ solution and my answer was the following which I find pretty clear.
Pimp my LINQ: a learning exercise based upon another post
var result = String.Join("-", inputData.Split('-').Skip(2));
or
var result = inputData.Split('-').Skip(2).FirstOrDefault(); //If the last part is NEE-DED then only NEE is returned.
As mentioned in the other SO thread it is not the fastest way of doing this.

If they are part of larger text:
(\w+-){2}(\w+)
If they are presented as whole lines, and you know you don't have other hyphens, you may also use:
[^-]*$
Another option, if you have each line as a string, is to use split (again, depending on whether or not you're expecting extra hyphens, you may omit the count parameter, or use LastIndexOf):
string[] tokens = line.Split("-".ToCharArray(), 3);
string s = tokens.Last();
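A slightly more defensive sketch of the split-with-count approach, in case some lines have fewer than two hyphens. The names HyphenSplitter and AfterSecondHyphen are hypothetical, and returning an empty string for short input is an assumption about what the caller wants.

```csharp
using System;

public static class HyphenSplitter
{
    // Splits into at most three parts, so any further hyphens stay in the last part.
    // Returns an empty string when the input has fewer than two hyphens.
    public static string AfterSecondHyphen(string value)
    {
        string[] parts = value.Split(new[] { '-' }, 3);
        return parts.Length == 3 ? parts[2] : string.Empty;
    }
}
```

With the count set to 3, a value such as frederic-84728957-NEE-DED keeps its trailing "NEE-DED" intact instead of being cut at the third hyphen.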

This should work:
.*?-.*?-(.*)

This should do the trick:
([^\-]+)\-([^\-]+)\-(.*?)$

The regex pattern will be
(?<first>.*)?-(?<second>.*)?-(?<third>.*)?(\s|$)
Then you can read the named group "third" to get the text after the second hyphen.
Alternatively, you can do a string.Split('-') and take the item at index 2 from the array.
