I'm trying to get the values between {} and %% with a single regex.
This is what I have so far. I can get the values for each delimiter individually, but I'd like to learn how to combine both patterns.
var regex = new Regex(@"%(.*?)%|\{([^}]*)\}");
String s = "This is a {test} %String%. %Stack% {Overflow}";
Expected answer for the above string
test
String
Stack
Overflow
Individual regexes:
@"%(.*?)%" gives me String and Stack
@"\{([^}]*)\}" gives me test and Overflow
Here is my code:
var regex = new Regex(@"%(.*?)%|\{([^}]*)\}");
var matches = regex.Matches(s);
foreach (Match match in matches)
{
Console.WriteLine(match.Groups[1].Value);
}
Your regex is close. You can use named capturing groups, giving both alternatives the same group name:
String s = "This is a {test} %String%. %Stack% {Overflow}";
var list = Regex.Matches(s, @"\{(?<name>.+?)\}|%(?<name>.+?)%")
.Cast<Match>()
.Select(m => m.Groups["name"].Value)
.ToList();
If you want to learn how conditional expressions work, here is a solution using that kind of .NET regex capability:
(?:(?<p>%)|(?<b>{))(?<v>.*?)(?(p)%|})
See the regex demo
Here is how it works:
(?:(?<p>%)|(?<b>{)) - match and capture either Group "p" with % (percentage), or Group "b" (brace) with {
(?<v>.*?) - match and capture into Group "v" (value) any character (even a newline since I will be using RegexOptions.Singleline) zero or more times, but as few as possible (lazy matching with *? quantifier)
(?(p)%|}) - a conditional expression meaning: if "p" group was matched, match %, else, match }.
C# demo:
var s = "This is a {test} %String%. %Stack% {Overflow}";
var regex = "(?:(?<p>%)|(?<b>{))(?<v>.*?)(?(p)%|})";
var matches = Regex.Matches(s, regex, RegexOptions.Singleline);
// var matches_list = Regex.Matches(s, regex, RegexOptions.Singleline)
// .Cast<Match>()
// .Select(p => p.Groups["v"].Value)
// .ToList();
// Or just a demo writeline
foreach (Match match in matches)
Console.WriteLine(match.Groups["v"].Value);
Sometimes the capture is in group 1 and sometimes it's in group 2 because you have two pairs of parentheses.
Your original code will work if you do this instead:
Console.WriteLine(match.Groups[1].Value + match.Groups[2].Value);
because one group will be the empty string and the other will be the value you're interested in.
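A variation on the same idea, sketched on the assumption that you would rather test which group took part in the match than concatenate both: Group.Success is true only for the alternative that actually matched. The names DelimitedValues and Extract are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class DelimitedValues
{
    // Hypothetical helper: returns the text found between %...% or {...}.
    public static List<string> Extract(string input)
    {
        var regex = new Regex(@"%(.*?)%|\{([^}]*)\}");
        var values = new List<string>();
        foreach (Match match in regex.Matches(input))
        {
            // Only one of the two groups participates in any given match.
            Group g = match.Groups[1].Success ? match.Groups[1] : match.Groups[2];
            values.Add(g.Value);
        }
        return values;
    }

    static void Main()
    {
        foreach (string v in Extract("This is a {test} %String%. %Stack% {Overflow}"))
            Console.WriteLine(v);
        // prints test, String, Stack, Overflow (one per line)
    }
}
```

Checking Success also distinguishes "this alternative did not match" from "it matched an empty value", which the concatenation cannot do.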
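@"[{%](.*?)[}%]"
The idea being:
{ or %
anything (as little as possible)
} or %
Note that inside a character class | is a literal character, so it should not be used as a separator there. This pattern also does not check that the closing delimiter corresponds to the opening one, so {a% would match too.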
I think you should use a combination of alternation and nested groups, with lazy quantifiers so that each match stops at the first closing delimiter:
((\{(.*?)\})|(%(.*?)%))
Related
Can anybody help me with a regular expression in C#?
I want to create a pattern for this input:
{a? ab 12 ?? cd}
This is my pattern:
([A-Fa-f0-9?]{2})+
The problem is the curly brackets. This doesn't work:
{(([A-Fa-f0-9?]{2})+)}
It only works for
{ab}
I would use {([A-Fa-f0-9?]+|[^}]+)}
It captures one group, which matches either:
[A-Fa-f0-9?]+ - one or more characters from the list (hex digits or ?), or
[^}]+ - one or more characters other than }
If you allow leading/trailing whitespace within {...} string, the expression will look like
{(?:\s*([A-Fa-f0-9?]{2}))+\s*}
See this regex demo
If you only allow a single regular space only between the values inside {...} and no space after { and before }, you can use
{(?:([A-Fa-f0-9?]{2})(?: (?!}))?)+}
See this regex demo. Note this one is much stricter. Details of the first pattern:
{ - a { char
(?:\s*([A-Fa-f0-9?]{2}))+ - one or more occurrences of
\s* - zero or more whitespaces
([A-Fa-f0-9?]{2}) - Capturing group 1: two hex or ? chars
\s* - zero or more whitespaces
} - a } char.
See a C# demo:
var text = "{a? ab 12 ?? cd}";
var pattern = @"{(?:([A-Fa-f0-9?]{2})(?: (?!}))?)+}";
var result = Regex.Matches(text, pattern)
.Cast<Match>()
.Select(x => x.Groups[1].Captures.Cast<Capture>().Select(m => m.Value))
.ToList();
foreach (var list in result)
Console.WriteLine(string.Join("; ", list));
// => a?; ab; 12; ??; cd
If you want to capture pairs of chars between the curly braces, you can use a single capture group:
{([A-Fa-f0-9?]{2}(?: [A-Fa-f0-9?]{2})*)}
Explanation
{ Match {
( Capture group 1
[A-Fa-f0-9?]{2} Match 2 times any of the listed characters
(?: [A-Fa-f0-9?]{2})* Optionally repeat a space and again 2 of the listed characters
) Close group 1
} Match }
Example code
string pattern = @"{([A-Fa-f0-9?]{2}(?: [A-Fa-f0-9?]{2})*)}";
string input = @"{a? ab 12 ?? cd}
{ab}";
foreach (Match m in Regex.Matches(input, pattern))
{
Console.WriteLine(m.Groups[1].Value);
}
Output
a? ab 12 ?? cd
ab
Given:
var input = "test <123>";
Regex.Matches(input, "<.*?>");
Result:
<123>
This gives me the result I want but includes the angle brackets. That is OK because I can easily do a search and replace, but I was wondering whether there is a way to strip them in the expression itself.
You need to use a capturing group:
var input = "test <123>";
var results = Regex.Matches(input, "<(.*?)>")
.Cast<Match>()
.Select(m => m.Groups[1].Value)
.ToList();
The m.Groups[1].Value lets you get the capturing group #1 value.
A better, more efficient regex is <([^>]*)>: it matches <, then matches and captures into Group 1 zero or more chars other than >, and then just matches >.
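As a minimal sketch of the negated character class variant: because [^>]* can never run past the first >, no lazy quantifier is needed. The names AngleBrackets and Extract, and the second tag in the sample input, are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class AngleBrackets
{
    // Hypothetical helper: returns the text inside each <...> pair.
    public static List<string> Extract(string input)
    {
        return Regex.Matches(input, @"<([^>]*)>")
            .Cast<Match>()
            .Select(m => m.Groups[1].Value) // Group 1 excludes the brackets
            .ToList();
    }

    static void Main()
    {
        foreach (string v in Extract("test <123> and <abc>"))
            Console.WriteLine(v);
        // prints 123 and abc (one per line)
    }
}
```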
I've got an input string that looks like this:
level=<device[195].level>&name=<device[195].name>
I want to create a RegEx that will parse out each of the <device> tags, for example, I'd expect two items to be matched from my input string: <device[195].level> and <device[195].name>.
So far I've had some luck with this pattern and code, but it always finds both of the device tags as a single match:
var pattern = "<device\\[[0-9]*\\]\\.\\S*>";
Regex rgx = new Regex(pattern);
var matches = rgx.Matches(httpData);
The result is that matches will contain a single result with the value <device[195].level>&name=<device[195].name>
I'm guessing there must be a way to 'terminate' the pattern, but I'm not sure what it is.
Use non-greedy quantifiers:
<device\[\d+\]\.\S+?>
Also, use verbatim strings for regexes; they avoid double escaping and are much more readable:
var pattern = @"<device\[\d+\]\.\S+?>";
As a side note, I guess in your case using \w instead of \S would be more in line with what you intended, but I left the \S because I can't know that.
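To see the difference between the greedy and the non-greedy quantifier on the input from the question, here is a small sketch. The class name DeviceTags and its members are assumptions made for the illustration; the two patterns are the ones quoted above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class DeviceTags
{
    // Pattern from the question: \S* is greedy and runs to the last '>'.
    public const string Greedy = "<device\\[[0-9]*\\]\\.\\S*>";
    // Pattern from this answer: \S+? stops at the first '>'.
    public const string Lazy = @"<device\[\d+\]\.\S+?>";

    public static List<string> Find(string input, string pattern)
    {
        return Regex.Matches(input, pattern)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToList();
    }

    static void Main()
    {
        string httpData = "level=<device[195].level>&name=<device[195].name>";
        Console.WriteLine(Find(httpData, Greedy).Count); // 1
        Console.WriteLine(Find(httpData, Lazy).Count);   // 2
    }
}
```

The greedy version backtracks only as far as the last > in the string, which is why both tags end up in one match.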
It depends on how much of the structure of the angle blocks you need to match, but you can do
"\\<device.+?\\>"
I want to create a RegEx that will parse out each of the <device> tags
I'd expect two items to be matched from my input string:
1. <device[195].level>
2. <device[195].name>
This should work. Get the matched group from index 1
(<device[^>]*>)
String literal for use in programs:
@"(<device[^>]*>)"
Change your repetition operator and use \w instead of \S
var pattern = @"<device\[[0-9]+\]\.\w+>";
String s = @"level=<device[195].level>&name=<device[195].name>";
foreach (Match m in Regex.Matches(s, @"<device\[[0-9]+\]\.\w+>"))
Console.WriteLine(m.Value);
Output
<device[195].level>
<device[195].name>
Use named match groups and create a LINQ projection. There will be two matches, one per item:
string data = "level=<device[195].level>&name=<device[195].name>";
string pattern = @"
(?<variable>[^=]+) # get the variable name
(?:=<device\[) # static '=<device'
(?<index>[^\]]+) # device number index
(?:]\.) # static ].
(?<sub>[^>]+) # Get the sub command
(?:>&?) # Match but don't capture the > and possible &
";
// IgnorePatternWhitespace only allows the pattern to be documented; it does not affect matching.
var items = Regex.Matches(data, pattern, RegexOptions.IgnorePatternWhitespace)
.OfType<Match>()
.Select (mt => new
{
Variable = mt.Groups["variable"].Value,
Index = mt.Groups["index"].Value,
Sub = mt.Groups["sub"].Value
})
.ToList();
items.ForEach(itm => Console.WriteLine ("{0}:{1}:{2}", itm.Variable, itm.Index, itm.Sub));
/* Output
level:195:level
name:195:name
*/
I'm having a bit of trouble with this regex. I have a line that could look like this
PREF-FA/WV/WB/LO...could continue
or
PREF-FA
and I need to grab all the ratings (FA/WV/WB etc.) for each line and put each one in its own class instance. Is this something a regex could handle, or should I just split the string up?
I have a class called rating, and a List whose length determines how many ratings are in the line above.
Thanks
How about
Regex
.Matches("PREF-FA/WV/WB/LO" , @".+?-(?<rating>.{2})(?:/(?<rating>.{2}))*")
.Cast<Match>()
.SelectMany(m => m.Groups["rating"].Captures.Cast<Capture>().Select(c => c.Value))
gives an IEnumerable<string> with values "FA", "WV", "WB", "LO"
To go back to the .NET 2.0 world:
MatchCollection matches=Regex
.Matches("PREF-FA/WV/WB/LO",@".+?-(?<rating>.{2})(?:/(?<rating>.{2}))*");
List<string> ratings=new List<string>();
foreach(Match m in matches)
{
CaptureCollection captures=m.Groups["rating"].Captures;
foreach(Capture c in captures)
{
ratings.Add(c.Value);
}
}
You could try:
((?:\w{2}/)*\w{2})$
?: to avoid capturing the 2-letter words and the slash.
Test it on Rubular if you want. The regex works with many regex engines.
If the line always begins with PREF-, you could use:
^PREF-((?:\w{2}/)*\w{2})$
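Since the question also asks whether to simply split the string, here is a rough sketch that combines both: the anchored pattern validates the line and captures the rating list, and Split('/') then separates the ratings. RatingParser and Parse are hypothetical names, and the two-character rating format is taken from the pattern above.

```csharp
using System;
using System.Text.RegularExpressions;

static class RatingParser
{
    // Hypothetical helper: returns the ratings of a "PREF-..." line,
    // or an empty array when the line does not have the expected shape.
    public static string[] Parse(string line)
    {
        Match m = Regex.Match(line, @"^PREF-((?:\w{2}/)*\w{2})$");
        return m.Success ? m.Groups[1].Value.Split('/') : new string[0];
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", Parse("PREF-FA/WV/WB/LO"))); // FA, WV, WB, LO
        Console.WriteLine(string.Join(", ", Parse("PREF-FA")));          // FA
        Console.WriteLine(Parse("PREF-FAX").Length);                     // 0
    }
}
```

Each returned string could then be wrapped in whatever rating class the application uses.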
You can use this regex (?<=PREF-).*$
resultString = Regex.Match(subjectString, "(?<=PREF-).*$",
RegexOptions.Singleline | RegexOptions.Multiline).Value;
It uses a positive lookbehind to require PREF- and then matches the string that follows.
If you want to loop through all the matches:
Regex ItemRegex = new Regex(@"(?<=PREF-).*$", RegexOptions.Compiled);
foreach (Match ItemMatch in ItemRegex.Matches(subjectString))
{
Console.WriteLine(ItemMatch);
}
How can I use lookbehind in a C# Regex in order to skip matches of repeated prefix patterns?
Example - I'm trying to have the expression match all the b characters following any number of a characters:
Regex expression = new Regex("(?<=a).*");
foreach (Match result in expression.Matches("aaabbbb"))
MessageBox.Show(result.Value);
returns aabbbb, since the lookbehind matches only a single a. How can I make it match all the a's at the beginning?
I've tried
Regex expression = new Regex("(?<=a+).*");
and
Regex expression = new Regex("(?<=a)+.*");
with no results...
What I'm expecting is bbbb.
Are you looking for a repeated capturing group?
(.)\1*
This will return two matches.
Given:
aaabbbb
This will result in:
aaa
bbbb
This:
(?<=(.))(?!\1).*
This uses the same principle: the lookbehind captures the previous character into a backreference, and the negative lookahead then asserts that the next character is a different one.
That matches:
bbbb
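Both patterns from this answer can be tried with a short sketch. The class and method names (RepeatedRuns, Runs, AfterFirstRun) are invented for the example, and AfterFirstRun deliberately looks at the first match only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class RepeatedRuns
{
    // (.)\1* matches a character followed by any repeats of that character.
    public static List<string> Runs(string input)
    {
        return Regex.Matches(input, @"(.)\1*")
            .Cast<Match>()
            .Select(m => m.Value)
            .ToList();
    }

    // First position where the previous character differs from the next one;
    // the rest of the line is matched from there.
    public static string AfterFirstRun(string input)
    {
        return Regex.Match(input, @"(?<=(.))(?!\1).*").Value;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(" ", Runs("aaabbbb"))); // aaa bbbb
        Console.WriteLine(AfterFirstRun("aaabbbb"));          // bbbb
    }
}
```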
I figured it out eventually:
Regex expression = new Regex("(?<=a+)[^a]+");
foreach (Match result in expression.Matches(@"aaabbbb"))
MessageBox.Show(result.Value);
I must not allow the a's to be matched by the part outside the lookbehind. This way, the expression will only match the runs of non-a characters that follow a repetitions.
Matching aaabbbb yields bbbb and matching aaabbbbcccbbbbaaaaaabbzzabbb results in bbbbcccbbbb, bbzz and bbb.
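To make the difference visible, this sketch reports where the first match starts for each of the three patterns discussed in this thread. A lookbehind only checks what precedes the current position, so with .* the match can begin right after the first a; with [^a]+ the engine has to move on until the next character is no longer an a. LookbehindDemo and Describe are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

static class LookbehindDemo
{
    // Hypothetical helper: "index:value" of the first match, or "no match".
    public static string Describe(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Index + ":" + m.Value : "no match";
    }

    static void Main()
    {
        Console.WriteLine(Describe("(?<=a).*", "aaabbbb"));     // 1:aabbbb
        Console.WriteLine(Describe("(?<=a+).*", "aaabbbb"));    // 1:aabbbb
        Console.WriteLine(Describe("(?<=a+)[^a]+", "aaabbbb")); // 3:bbbb
    }
}
```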
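The lookbehind appears to skip only one "a" because it merely asserts that an "a" precedes the current position (it neither consumes nor captures it). The first position where that holds is right after the first "a", so .* matches everything from there.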
Would this pattern work for you instead? New pattern: \ba+(.+)\b
It uses a word boundary \b to anchor both ends of the word. It matches at least one "a" followed by the rest of the characters up to the closing word boundary. The remaining characters are captured in a group so you can reference them easily.
string pattern = @"\ba+(.+)\b";
foreach (Match m in Regex.Matches("aaabbbb", pattern))
{
Console.WriteLine("Match: " + m.Value);
Console.WriteLine("Group capture: " + m.Groups[1].Value);
}
UPDATE: If you want to skip the first occurrence of any duplicated letters, then match the rest of the string, you could do this:
string pattern = @"\b(.)(\1)*(?<Content>.+)\b";
foreach (Match m in Regex.Matches("aaabbbb", pattern))
{
Console.WriteLine("Match: " + m.Value);
Console.WriteLine("Group capture: " + m.Groups["Content"].Value);
}