RegEx to extract partial string - c#

So simple, but I'm struggling. I only do RegExp every 2 years or so, so I'm rusty.
I have these two url strings
http://localhost:58876/Products/Product1
https://localhost:58876/Products/Product1
The result I want is
localhost:58876
Basically remove the http(s):// and everything after the first single / so I end up with the domain with or without the port number
P.S: I'm working with C#

This worked for me (tested in Notepad++):
(\w+:\d+)

You can use the following regex to split the URL:
((http[s]?|ftp):/)?/?([^:/\s]+)(:([^/]*))?((/\w+)*/)([\w\-.]+[^#?\s]+)(\?([^#]*))?(#(.*))?
The RegEx positions 3 and 5 are those you are looking for.

(^[^h]|\/\/)([\w\d\:\#\.]+:?[\d]?+)
then in c#:
string address = ...
char[] MyChar = {'/'};
string NewString = address.TrimStart(MyChar);
EDIT: this also worked with localhost:58876/Products/Product1!

Just match anything but a slash: /^https?:\/\/([^\/]+)\/.*$/
var url = 'http://localhost:58876/Products/Product1';
var match = url.match(/^https?:\/\/([^\/]+)\/.*$/);
if(match&&match.length>0)document.write(match[1]);
Even shorter: /\/\/([^\/]+)/. Note that there are (a lot) better ways to parse URLs. Depending on your platform, there’s PHP’s parse_url, NodeJS’s url module or libraries like uri.js that handle the many faces of valid URIs.
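Since the question is about C#, here is a minimal sketch of both approaches in that language. It assumes the input is either an absolute http(s) URL or the same string without the scheme. The class and method names (HostExtractor, FromUri, FromRegex) are made up for illustration. System.Uri.Authority returns the host plus the port when the port is not the scheme's default.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not taken from any answer above.
public static class HostExtractor
{
    // Let System.Uri do the parsing; requires an absolute URL with a scheme.
    public static string FromUri(string url)
    {
        return new Uri(url).Authority;
    }

    // Regex variant: skip an optional http(s):// prefix,
    // then capture everything up to the first slash.
    public static string FromRegex(string url)
    {
        Match match = Regex.Match(url, @"^(?:https?://)?([^/]+)");
        return match.Success ? match.Groups[1].Value : string.Empty;
    }
}
```

Calling HostExtractor.FromUri("http://localhost:58876/Products/Product1") or HostExtractor.FromRegex on the same string should both give localhost:58876. The regex variant also accepts the string without the scheme, which new Uri(...) would not parse as an http URL.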

Related

C# Regex for Web.config Rewrite Url

I have the following url
www.localproject.com:843/user/validate/eyJhbGciOiJodHRwOi8vd3d3LnczLm9yZy8yMDAxLzA0L3htbGRzaWctbW9yZSNobWFjLXNoYTI1NiIsInR5cCI6IkpXVCJ9eyJ0ZW1wVXJsIjoie1wiQ3VzdG9tZXJJZFwiOjEsXCJDb3Vyc2VJZFwiOjEsXCJUb2tlblwiOm51bGwsXCJFeHBpcnlcIjpcIjIwMTgtMDQtMThUMTc6MzU6MTMuOTQ2MjM2NCswNTowMFwifSJ9uvm7jZ3us5UFa1hqh4bod2cSamcxF2rRUbfxs7DHQs
from which I need to extract only 10 to 30 characters after validate including numbers etc. For example, I need only this
eyJhbGciOiJodHRwOi8vd3d
What should the regex be? I tried the following, but it is not working:
^api/user/validate/(^([a-zA-Z\d]){50})
Try
api\/user\/validate\/(.{23})
Can see it working here - https://regex101.com/r/Xr6Ueo/1
You can do this with regex or Substring. If the string is stored in web.config you can still get to it with ConfigurationManager. I recommend taking the parts you do not need out of the Uri. This way once you deploy to prod there's less of a chance of things breaking down.
var input = new Uri("www.localproject.com:843/user/validate/eyJhbGciOiJodHRwOi8vd3d3LnczLm9yZy8yMDAxLzA0L3htbGRzaWctbW9yZSNobWFjLXNoYTI1NiIsInR5cCI6IkpXVCJ9eyJ0ZW1wVXJsIjoie1wiQ3VzdG9tZXJJZFwiOjEsXCJDb3Vyc2VJZFwiOjEsXCJUb2tlblwiOm51bGwsXCJFeHBpcnlcIjpcIjIwMTgtMDQtMThUMTc6MzU6MTMuOTQ2MjM2NCswNTowMFwifSJ9uvm7jZ3us5UFa1hqh4bod2cSamcxF2rRUbfxs7DHQs");
var environmentAgnosticPath = input.Segments[input.Segments.Length - 1];//take just the part after /validate
//example output: eyJhbGciOiJodHRwOi8vd3d
//I don't know what you mean by 10 - 30 since the example is 23 characters
var resultFromSubstring = environmentAgnosticPath.Substring(0,23);
var pattern = @"[A-Za-z0-9]{10,23}";
Regex regex = new Regex(pattern);
var resultFromRegex = regex.Match(environmentAgnosticPath).Value;
And generally regex101 is a great playground for figuring out the regex part.
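For a version that does not depend on Uri parsing, the regex can be anchored on the /validate/ segment. This is only a sketch: the TokenPrefix class is a made-up name, and the 10 to 23 character bound is an assumption based on the 23-character example in the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the name and the 10-23 bound are assumptions.
public static class TokenPrefix
{
    // Captures between 10 and 23 alphanumeric characters that
    // immediately follow "/validate/". The quantifier is greedy,
    // so it takes 23 characters whenever that many are available.
    public static string Extract(string url)
    {
        Match m = Regex.Match(url, @"/validate/([A-Za-z0-9]{10,23})");
        return m.Success ? m.Groups[1].Value : string.Empty;
    }
}
```

With the URL from the question this should return the first 23 characters of the token. If fewer than 10 alphanumeric characters follow /validate/, it returns an empty string.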

How to remove a pattern from a string using Regex

I want to find paths from a string and remove them, e.g.:
string1 = "'c:\a\b\c'!MyUDF(param1, param2,..) + 'c:\a\b\c'!MyUDF(param3, param4,..)..."
I'd like a regex to find the pattern '[some path]'!MyUDF, and remove '[path]'.
Thanks.
Edit:
Example input:
string1 = "'c:\a\b\c'!MyUDF(param1, param2,..) + 'c:\a\b\c'!MyUDF(param3, param4,..)";
Expected output: "MyUDF(param1, param2,...) + MyUDF(param3, param4,...)"
where MyUDF is a function name, so it consists of only letters
input=Regex.Replace(input,"'[^']+'(?=!MyUDF)","");
If the path is followed by ! and some other word, you can use
input=Regex.Replace(input,@"'[^']+'(?=!\w+)","");
Alright, if the ! is always in the string as you suggest, this Regex !(.*)?\( will get you what you want. Here is a Regex 101 to prove it.
To use it, you might do something like this:
var result = Regex.Replace(myString, @"!(.*)?\(");
The feature you want, if you are dealing with file paths, is in System.IO.Path.
There are many methods there, but that is one of its specific purposes.
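Note that the expected output in the question also drops the ! separator, while a lookahead such as (?=!MyUDF) leaves it in place. A minimal sketch that consumes the ! as well might look like this. The PathStripper name is made up, and it assumes paths never contain a single quote and that the function name consists of word characters followed by an opening parenthesis.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; assumes no single quotes inside the path.
public static class PathStripper
{
    // Removes each 'path'! prefix, but only when a function call
    // (word characters followed by an opening parenthesis) comes next.
    public static string StripQuotedPaths(string formula)
    {
        return Regex.Replace(formula, @"'[^']+'!(?=\w+\()", "");
    }
}
```

For the input 'c:\a\b\c'!MyUDF(param1, param2) + 'c:\a\b\c'!MyUDF(param3, param4) this should produce MyUDF(param1, param2) + MyUDF(param3, param4).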

Using .NET RegEx to retrieve part of a string after the second '-'

This is my first stack message. Hope you can help.
I have several strings I need to break up for use later. Here are a couple of examples of what I mean:
fred-064528-NEEDED
frederic-84728957-NEEDED
sam-028-NEEDED
As you can see above, the string lengths vary greatly, so I believe regex is the only way to achieve what I want. What I need is the rest of the string after the second hyphen ('-').
I am very weak at regex, so any help would be great.
Thanks in advance.
Just to offer an alternative without using regex:
foreach (string s in list)
{
    int x = s.LastIndexOf('-');      // index of the last hyphen
    string sub = s.Substring(x + 1); // everything after it
}
Add validation to taste.
Something like this. It will take anything (except line breaks) after the second '-', including any further '-' signs.
var exp = @"^\w*-\w*-(.*)$";
var match = Regex.Match("frederic-84728957-NEE-DED", exp);
if (match.Success)
{
var result = match.Groups[1]; //Result is NEE-DED
Console.WriteLine(result);
}
EDIT: I answered another question which relates to this. Except, it asked for a LINQ solution and my answer was the following which I find pretty clear.
Pimp my LINQ: a learning exercise based upon another post
var result = String.Join("-", inputData.Split('-').Skip(2));
or
var result = inputData.Split('-').Skip(2).FirstOrDefault(); //If the last part is NEE-DED then only NEE is returned.
As mentioned in the other SO thread it is not the fastest way of doing this.
If they are part of larger text:
(\w+-){2}(\w+)
If they are presented as whole lines, and you know you don't have other hyphens, you may also use:
[^-]*$
Another option, if you have each line as a string, is to use split (again, depending on whether or not you're expecting extra hyphens, you may omit the count parameter, or use LastIndexOf):
string[] tokens = line.Split("-".ToCharArray(), 3);
string s = tokens.Last();
This should work:
.*?-.*?-(.*)
This should do the trick:
([^\-]+)\-([^\-]+)\-(.*?)$
The regex pattern will be
(?<first>.*)?-(?<second>.*)?-(?<third>.*)?(\s|$)
Then you can read the named group "third" to get the text after the 2nd hyphen.
Alternatively,
you can do a string.Split('-') and take the item at index 2 of the array.
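To compare the Split and regex approaches side by side, here is a small sketch. The class and method names are made up for illustration, and both variants assume that input with fewer than two hyphens should yield an empty string. Both keep any extra hyphens in the tail.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper comparing two of the approaches above.
public static class HyphenTail
{
    // Split into at most 3 parts so the tail keeps its own hyphens.
    public static string AfterSecondHyphenSplit(string s)
    {
        string[] parts = s.Split(new[] { '-' }, 3);
        return parts.Length == 3 ? parts[2] : string.Empty;
    }

    // Two hyphen-free runs, each followed by a hyphen, then capture the rest.
    public static string AfterSecondHyphenRegex(string s)
    {
        Match m = Regex.Match(s, @"^[^-]*-[^-]*-(.*)$");
        return m.Success ? m.Groups[1].Value : string.Empty;
    }
}
```

For fred-064528-NEEDED both methods should return NEEDED, and for frederic-84728957-NEE-DED both should return NEE-DED.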

Extracting Data from a String Using Regular Expressions

I need some help extracting the following bits of information using regular expressions.
Here is my input string "C:\Yes"
******** Missing character at start of string and in between but not at the end =
a weird superscript looking L.***
I need to extract "C:\" into one string and "Yes" into another.
Thanks In Advance.
I wouldn't bother with regular expressions for that. Too much work, and I'd be too likely to screw it up.
var x = @"C:\Yes";
var root = Path.GetPathRoot(x); // => @"C:\"
var file = Path.GetFileName(x); // => "Yes"
The following regular expression returns C:\ in the first capture group and the rest in the second:
^(\w:\\)(.*)$
This is looking for: a full string (^…$) starting with a letter (\w, although [a-z] would probably be more accurate for Windows drive letters), followed by :\. All the rest (.*) is captured in the second group.
Notice that this won’t work with UNC paths. If you’re working with paths, your best bet is not to use strings and regular expressions but rather the API found in System.IO. The classes found there already offer the functionality that you want.
Regex r = new Regex(@"([A-Z]:\\)([A-Za-z]+)");
Match m = r.Match(@"C:\Yes");
string val1 = m.Groups[1].Value; // "C:\" (Groups[0] is the whole match)
string val2 = m.Groups[2].Value; // "Yes"

Extract substring from string with Regex

Imagine that users are inserting strings in several computers.
On one computer, the pattern in the configuration will extract some characters of that string, lets say position 4 to 5.
On another computer, the extract pattern will return other characters, for instance, last 3 positions of the string.
These configurations (the Regex patterns) are different for each computer, and should be available for change by the administrator, without having to change the source code.
Some examples:
Original_String          Return_Value
User1 - abcd78defg123    78
User2 - abcd78defg123    78g1
User3 - mm127788abcd     12
User4 - 123456pp12asd    ppsd
Can it be done with Regex?
Thanks.
Why do you want to use regex for this? What is wrong with:
string foo = s.Substring(4,2);
string bar = s.Substring(s.Length-3,3);
(you can wrap those up to do a bit of bounds-checking on the length easily enough)
If you really want, you could wrap it up in a Func<string,string> to put somewhere - not sure I'd bother, though:
Func<string, string> get4and5 = s => s.Substring(4, 2);
Func<string,string> getLast3 = s => s.Substring(s.Length - 3, 3);
string value = "abcd78defg123";
string foo = getLast3(value);
string bar = get4and5(value);
If you really want to use regex:
^...(..)
And:
.*(...)$
To have a regex capture values for further use you typically use (). Depending on the regex flavor the group delimiters may need escaping, as in \(\); in .NET plain () captures, while [] denotes a character class rather than a capture.
Example
User4 - 123456pp12asd ppsd
is most interesting in that you have 2 separate capture areas here. Is there some default rule on how to join them together, or would you then want to be able to specify how to make the result?
Perhaps something like
r/......(..)...(..)/\1\2/ for ppsd
r/......(..)...(..)/\2-\1/ for sd-pp
do you want to run a regex to get the captures and handle them yourself, or do you want to run more advanced manipulation commands?
I'm not sure what you are hoping to get by using RegEx. RegEx is used for pattern matching. If you want to extract based on position, just use substring.
It seems to me that Regex really isn't the solution here. To return a section of a string beginning at position pos (starting at 0) and of length length, you simply call the Substring function as such:
string section = str.Substring(pos, length);
Grouping. You could match on /^.{3}(.{2})/ and then look at group $1 for example.
The question is why? Normal string handling i.e. actual substring methods are going to be faster and clearer in intent.
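If the patterns really do have to live in configuration, one possible convention is to concatenate every capture group of the configured pattern in order. The sketch below assumes exactly that joining rule. The ConfiguredExtractor name is made up, and the pattern string would be read from wherever the administrator stores it.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper; the "concatenate all groups" rule is an assumption.
public static class ConfiguredExtractor
{
    // Applies an administrator-supplied pattern and joins its capture
    // groups in order. Groups[0] is the whole match, so it is skipped.
    public static string Extract(string input, string pattern)
    {
        Match match = Regex.Match(input, pattern);
        if (!match.Success)
        {
            return string.Empty;
        }

        return string.Concat(
            match.Groups.Cast<Group>()
                 .Skip(1)
                 .Where(g => g.Success)
                 .Select(g => g.Value));
    }
}
```

For the examples in the question, the pattern ^.{4}(..) applied to abcd78defg123 should give 78, and ^.{6}(..).{3}(..) applied to 123456pp12asd should give ppsd. A different joining rule, such as a replacement template, would need a slightly different design.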
