I have this string:
AnyText: "jonathon" <usernameredacted#example.com>
Desired output using regex:
AnyText: <usernameredacted#example.com>
Omit anything in between!
I am still a rookie at regular expressions. Could anyone out there help me with the matching & replacing expression for the above scenario?
Try this:
string input = "jonathon <usernameredacted#example.com>";
string output = Regex.Match(input, @"<[^>]+>").Groups[0].Value;
Console.WriteLine(output); //<usernameredacted#example.com>
You could use the following regex to match all the characters that you want to replace with an empty string:
^[^<]*
The first ^ is an anchor to the beginning of the string. The ^ inside the character class means that the character class is negated, i.e. any character that isn't a < will match. The * is a greedy quantifier. So in summary, this regex will swallow up all characters from the beginning of the string up to, but not including, the first <.
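As a minimal sketch of that replacement in C# (the names PrefixStripper and StripBeforeAngle are made up for illustration), Regex.Replace with an empty replacement string removes everything in front of the first <:

```csharp
using System;
using System.Text.RegularExpressions;

public static class PrefixStripper
{
    // Removes every character from the start of the string up to,
    // but not including, the first '<'. If the input contains no '<',
    // the whole string is removed.
    public static string StripBeforeAngle(string input)
    {
        return Regex.Replace(input, @"^[^<]*", "");
    }
}
```

Note that this also drops the AnyText: prefix, so for the string in the question it leaves only <usernameredacted#example.com>.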
Here is the way to do it in the VBA flavor: replace "^[^""]*" with "".
^ marks the start of the string.
[^""]* matches any run of characters other than a quote sign (the quote is doubled because it sits inside a VBA string literal).
UPDATE:
Since you mentioned in your additional comment that you want to grab the "From:" and the email address, but none of the junk in between or after, I figure extracting would work better than replacing. Here is a VBA function written for Excel that gives you back all the subgroup matches (everything you put in parentheses) and nothing else.
Function RegexExtract(ByVal text As String, _
                      ByVal extract_what As String) As String
    Application.ScreenUpdating = False
    Dim i As Long
    Dim result As String
    Dim allMatches As Object
    Dim RE As Object
    Set RE = CreateObject("vbscript.regexp")
    RE.Pattern = extract_what
    RE.Global = True
    Set allMatches = RE.Execute(text)
    ' Concatenate the submatches (parenthesized groups) of the first match, if any
    If allMatches.Count > 0 Then
        For i = 0 To allMatches.Item(0).SubMatches.Count - 1
            result = result & allMatches.Item(0).SubMatches.Item(i)
        Next
    End If
    RegexExtract = result
    Application.ScreenUpdating = True
End Function
Using this code, your pattern would be: "^(.+: ).+(<.+>).*"
^ denotes the start of the string
(.+: ) denotes the first match group: .+ is one or more characters, followed by : and a space
.+ denotes one or more characters
(<.+>) denotes the second match group: < is a literal <, then .+ for one or more characters, then the final >
.* denotes zero or more characters
So in Excel you'd use (assuming the cell is A1):
=RegexExtract(A1, "^(.+: ).+(<.+>).*")
Regex.IsMatch method returns the wrong result while checking the following condition,
string text = "$0.00";
Regex compareValue = new Regex(text);
bool result = compareValue.IsMatch(text);
The above code returns False. Please let me know if I missed anything.
The Regex class has a special method for escaping characters in a pattern: Regex.Escape()
Change your code like this:
string text = "$0.00";
Regex compareValue = new Regex(Regex.Escape(text)); // Escape characters in text
bool result = compareValue.IsMatch(text);
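To make the effect of Regex.Escape concrete, here is a small sketch; the class EscapeDemo and the method ContainsLiteral are hypothetical names, not part of .NET. It treats the second argument as plain text instead of as a pattern:

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // Returns true when 'input' contains 'literal' as plain text.
    // Regex.Escape puts a backslash in front of metacharacters such as
    // '$' and '.', so they lose their special meaning.
    public static bool ContainsLiteral(string input, string literal)
    {
        return Regex.IsMatch(input, Regex.Escape(literal));
    }
}
```

Regex.Escape("$0.00") produces \$0\.00, so ContainsLiteral("Total: $0.00", "$0.00") is true, while the unescaped call Regex.IsMatch("$0.00", "$0.00") is false.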
"$" is a special character in C# regex. Escape it first.
Regex compareValue = new Regex(@"\$0\.00");
bool result = compareValue.IsMatch("$0.00");
Regex expressions: https://msdn.microsoft.com/en-us/library/az24scfc(v=vs.110).aspx
Both '.' and '$' are special characters, so you need to escape them if you want to match the characters themselves. '.' matches any character (except a newline by default) and '$' matches the end of the string.
see: https://regex101.com/r/pK2uY6/1
You have to escape $ since it is a special (reserved) character which means "end of string". In case . means just dot (say, decimal separator) you have to escape it as well (when not escaped, . means "any symbol"):
string pattern = @"\$0\.00";
bool result = Regex.IsMatch(text, pattern);
As for your original pattern, it has no chance to match any string, since $0.00 means
$ end of string, followed by
0 zero
. any character
0 zero
0 zero
but end of string can't be followed by...
I am trying to write a regex for a field that accepts the following format:
XXX-XXX
where X is a digit between 0 and 9, so three digits before the dash and three after it.
I started with the following but I got lost in adding validation after the dash.
([0-9-])\w+([0-9-])
3 digits, a dash then 3 digits:
\d{3}-\d{3}
var example = "123-455";
var pattern = @"\A(\d){3}-(\d){3}\Z";
var result = Regex.Match(example, pattern);
This will not only search for the pattern within your string, but also make sure that the beginning and end of the pattern is at the beginning and end of your target string. This ensures that you won't get a match e.g. for:
"silly123-456stuff" or "0123-4567".
In other words, it both looks for a pattern and limits its length by anchoring it to the beginning and end of the string.
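A short sketch of how the anchored pattern behaves on those inputs; CodeValidator and IsValid are illustrative names, and the pattern is the one from the snippet above:

```csharp
using System;
using System.Text.RegularExpressions;

public static class CodeValidator
{
    // \A and \Z pin the match to the start and end of the input,
    // so nothing may come before or after the six digits and the dash.
    private static readonly Regex Pattern = new Regex(@"\A(\d){3}-(\d){3}\Z");

    public static bool IsValid(string s)
    {
        return Pattern.IsMatch(s);
    }
}
```

IsValid("123-455") is true, while IsValid("silly123-456stuff") and IsValid("0123-4567") are both false. In .NET, \Z still allows a single trailing newline; if that should be rejected as well, \z can be used in its place.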
string pattern = @"^([0-9]{3})-([0-9]{3})$";
Regex rgx = new Regex(pattern);
I would add the beginning and end of line anchors to the regex:
^\d{3}-\d{3}$
^ = at the beginning of the line
\d = a number
{3} = three times
- = a dash
\d = a number
{3} = three times
$ = the end of the line
Not setting the start and end of line could catch invalid patterns, such as Text123-4858
Edit: even better than line markers, the anchors proposed by Kjartan are the correct answer in this case.
In C# I'm trying to validate a string that looks like:
I#paramname='test'
or
O#paramname=2827
Here is my code:
string t1 = "I#parameter='test'";
string r = @"^([Ii]|[Oo])#\w=\w";
var re = new Regex(r);
If I take the "=\w" off the end of variable r, I get True. If I add "=\w" after the \w, it's False. I want the characters between # and = to be any alphanumeric value. Anything after the = sign can contain alphanumerics and ' (single quotes). What am I doing wrong here? I have rarely used regular expressions and can normally find an example, but this is a custom format, and even with cheat sheets I'm having issues.
^([Ii]|[Oo])#\w+=(?<q>'?)[\w\d]+\k<q>$
Regular expression:
^ start of line
([Ii]|[Oo]) either (I or i) or (O or o)
\w+ 1 or more word characters
= equals sign
(?<q>'?) capture 0 or 1 quotes in named group q
[\w\d]+ 1 or more word or digit characters
\k<q> repeat of what was captured in named group q
$ end of line
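Here is a minimal sketch of that pattern in use; ParamValidator and IsValid are names invented for this example. The named group q captures either a quote or nothing, and \k<q> then requires the same thing at the end, so the quotes have to be balanced:

```csharp
using System;
using System.Text.RegularExpressions;

public static class ParamValidator
{
    // (?<q>'?) captures an optional opening quote; \k<q> demands the
    // same text (a quote, or the empty string) just before the end.
    private static readonly Regex Pattern =
        new Regex(@"^([Ii]|[Oo])#\w+=(?<q>'?)[\w\d]+\k<q>$");

    public static bool IsValid(string s)
    {
        return Pattern.IsMatch(s);
    }
}
```

With this, IsValid("I#paramname='test'") and IsValid("O#paramname=2827") are true, while IsValid("I#paramname='test") is false because the closing quote is missing.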
Use \w+ instead of \w to match one or more characters, or \w* to match zero or more.
Try this:
^([Ii]|[Oo])#\w+=\'*\w+\'*
If you are being a bit more strict with using paramname:
^([Ii]|[Oo])#paramname=[']?[\w]+[']?
You could try something like this:
Regex rx = new Regex( @"^([IO])#(\w+)=(.*)$" , RegexOptions.IgnoreCase ) ;
Match group 1 will give you the value of I or O (the parameter direction?)
Match group 2 will give you the name of the parameter
Match group 3 will give you the value of the parameter
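Reading those three groups could look roughly like this; ParamParser and TryParse are hypothetical names chosen for the sketch, and the pattern is the one shown above:

```csharp
using System;
using System.Text.RegularExpressions;

public static class ParamParser
{
    private static readonly Regex Rx =
        new Regex(@"^([IO])#(\w+)=(.*)$", RegexOptions.IgnoreCase);

    // Splits the input into direction (group 1), name (group 2) and
    // value (group 3). On failure all three outputs are empty strings.
    public static bool TryParse(string s, out string direction,
                                out string name, out string value)
    {
        Match m = Rx.Match(s);
        direction = m.Success ? m.Groups[1].Value : "";
        name = m.Success ? m.Groups[2].Value : "";
        value = m.Success ? m.Groups[3].Value : "";
        return m.Success;
    }
}
```

For the input I#paramname='test' this yields I, paramname and 'test' (quotes included, since group 3 takes everything after the = sign).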
You could be stricter about the 3rd group and match it as
(([^']+)|('(('')|([^']+))*'))
The first alternative matches one or more non-quote characters; the second alternative matches a quoted string literal in which any internal (embedded) quotes are escaped by doubling them, so it would match things like
'' (the empty string)
'foo bar'
'That''s All, Folks!'
I'm trying to create a regular expression that will accept a certain command format. The pattern is as follows:
Can start with a $ followed by two characters from 0-9, A-F, a-f (i.e. $00 - $FF)
or
Can be any value except for "&<>'/"
*If the value starts with $, the next two characters need to be a valid hex value from 00-ff
So far I have this
Regex correctValue = new Regex("($[0-9a-fA-F][0-9a-fA-F])");
Any help will be greatly appreciated!
You just need to add "\" symbol before your "$" and it works:
string input = "$00";
Match m = Regex.Match(input, @"^\$[0-9a-fA-F][0-9a-fA-F]$");
if (m.Success)
{
    foreach (Group g in m.Groups)
        Console.WriteLine(g.Value);
}
else
    Console.WriteLine("Didn't match");
If I'm following you correctly, the net result you're looking for is any value that is not in the list "&<>'/", since any combination of $ and two alphanumeric characters would also not be in that list. Thus you could make your expression:
Regex correctValue = new Regex("[^&<>'/]");
Update: But just in case you do need to know how to properly match the $00 - $FF, this would do the trick:
Regex correctValue = new Regex(@"\$[0-9A-Fa-f]{2}");
In regular expressions, $ is an anchor assertion and means:
The match must occur at the end of the string or before \n at the end of the line or string.
Try using [$] (a character class containing a single character) or \$ (a character escape) instead.
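Putting the two rules from the question into one anchored pattern might look like the sketch below. It rests on one possible reading of the requirements, which is an assumption: a value is either exactly $ plus two hex digits, or a non-empty string that does not start with $ and contains none of & < > ' /. CommandValidator and IsValid are illustrative names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CommandValidator
{
    // First alternative: a literal '$' followed by exactly two hex digits.
    // Second alternative: one or more characters, none of them forbidden,
    // with the first one also not allowed to be '$'.
    // The ^ and $ anchors make the pattern cover the whole input.
    private static readonly Regex Pattern =
        new Regex(@"^(?:\$[0-9A-Fa-f]{2}|[^$&<>'/][^&<>'/]*)$");

    public static bool IsValid(string s)
    {
        return Pattern.IsMatch(s);
    }
}
```

Under this reading, IsValid("$0A") and IsValid("hello") are true, while IsValid("$G1"), IsValid("$0") and IsValid("a&b") are false. If a hex prefix followed by more text (for example $00abc) should be accepted, the first alternative would need to be extended.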
I want to find if a string contains a repeated sequence of a known substring (with comma separators) and nothing else and return true if this is the case; otherwise false. For example: the substring is "0,8"
String A: "0,8,0,8,0,8,0,8" returns true
String B: "0,8,0,8,1,0,8,0" returns false because of '1'
I tried using the C# string function Contains, but it does not suit my requirements. I am totally new to regular expressions, but I feel they should be powerful enough to do this. What regex should I use?
The pattern for a string containing nothing but a repeated number of a given substring (possibly zero of them, resulting in an empty string) is \A(?:substring goes here)*\z. The \A matches the beginning of the string, the \z the end of the string, and the (?:...)* matches 0 or more copies of anything matching the thing between the colon and the close parenthesis.
But your string doesn't actually match \A(?:0,8)*\z, because of the extra commas; an example that would match is "0,80,80,80,8". You need to account for the commas explicitly with something like \A0,8(?:,0,8)*\z.
You can build such a thing in C# thus:
string OkSubstring = "0,8";
string aOK = "0,8,0,8,0,8,0,8";
string bOK = "0,8,0,8,1,0,8,0";
Regex OkRegex = new Regex( @"\A" + OkSubstring + "(?:," + OkSubstring + @")*\z" );
OkRegex.IsMatch(aOK); // True
OkRegex.IsMatch(bOK); // False
That hard-codes the comma-delimiter; you could make it more general. Or maybe you just need the literal regex. Either way, that's the pattern you need.
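One way to generalize it, sketched here with made-up names (RepeatMatcher, IsRepetitionOf), is to pass the delimiter in as a parameter and run both parts through Regex.Escape so that characters like . or | in them are matched literally:

```csharp
using System;
using System.Text.RegularExpressions;

public static class RepeatMatcher
{
    // Builds \A item (?: delimiter item)* \z from escaped pieces, i.e.
    // one item followed by zero or more "delimiter + item" repetitions.
    public static bool IsRepetitionOf(string input, string item, string delimiter)
    {
        string escapedItem = Regex.Escape(item);
        string escapedDelimiter = Regex.Escape(delimiter);
        string pattern = @"\A" + escapedItem
                       + "(?:" + escapedDelimiter + escapedItem + @")*\z";
        return Regex.IsMatch(input, pattern);
    }
}
```

IsRepetitionOf("0,8,0,8,0,8,0,8", "0,8", ",") is true and IsRepetitionOf("0,8,0,8,1,0,8,0", "0,8", ",") is false. Unlike the \A(?:...)*\z form, this version requires at least one copy of the item, so the empty string does not match.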
EDIT Changed the anchors per Mike Samuel's suggestion.