Split a string by Regex [duplicate] - c#

This question already has answers here:
Regular expression to extract text between square brackets
(15 answers)
Closed 5 years ago.
I'm trying to work out how to split this kind of string with a regex in C#.
[01,01,01][02,03,00][03,07,00][04,06,00][05,02,00][06,04,00][07,08,00][08,05,00]
Can someone knowledgeable about regex point me to how to achieve this?
Sample regex pattern that doesn't work:
[\dd,\dd,\dd]
Desired output:
[01,01,01]
[02,03,00]
[03,07,00]
[04,06,00]
[05,02,00]
[06,04,00]
[07,08,00]
[08,05,00]

The pattern (\[.+?\]) will do the job in C#, e.g.:
var s = @"[01,01,01][02,03,00][03,07,00][04,06,00][05,02,00][06,04,00][07,08,00][08,05,00]";
var reg = new Regex(@"(\[.+?\])");
var matches = reg.Matches(s);
foreach (Match m in matches)
{
    Console.WriteLine(m.Value);
}
EDIT This is how the expression (\[.+?\]) works:
first, the outer parentheses, ( and ), capture whatever the inner pattern matched
then the escaped square brackets, \[ and \], match the literal [ and ] in the source string
finally, .+? matches one or more characters, but as few as possible, so that a single match doesn't swallow everything between the first [ and the last ]
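The difference between the lazy .+? and a greedy .+ can be seen by running both against a shortened version of the input. This is only a minimal sketch; the class name LazyVsGreedyDemo and the helper AllMatches are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original answer.
public static class LazyVsGreedyDemo
{
    // Returns the text of every match of the pattern in the input.
    public static string[] AllMatches(string input, string pattern)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        var input = "[01,01,01][02,03,00][03,07,00]";

        // Lazy quantifier: each match stops at the first closing bracket,
        // so three separate groups are found.
        Console.WriteLine(string.Join(" | ", AllMatches(input, @"\[.+?\]")));

        // Greedy quantifier: one match that runs from the first "["
        // to the last "]".
        Console.WriteLine(string.Join(" | ", AllMatches(input, @"\[.+\]")));
    }
}
```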

I know you stipulated Regex, but it's worth looking at Split again, if only for academic purposes (this needs using System.Linq;):
Code
var input = "[01,01,01][02,03,00][03,07,00][04,06,00][05,02,00][06,04,00][07,08,00][08,05,00]";
var output = input.Split(new[] { ']' }, StringSplitOptions.RemoveEmptyEntries)
    .Select(x => x + "]") // put the bracket back
    .ToList();
foreach (var o in output)
    Console.WriteLine(o);
Output
[01,01,01]
[02,03,00]
[03,07,00]
[04,06,00]
[05,02,00]
[06,04,00]
[07,08,00]
[08,05,00]
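Since the question title asks about splitting by regex, here is a hedged sketch of that variant: Regex.Split with a lookbehind and a lookahead splits at the empty position between ] and [, so no bracket has to be added back. The class and method names are hypothetical, and the sketch assumes the groups are directly adjacent, as in the sample input.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original answers.
public static class BracketSplitDemo
{
    public static string[] SplitGroups(string input)
    {
        // (?<=\]) requires a "]" just before the split position,
        // (?=\[) requires a "[" just after it. Nothing is consumed,
        // so both brackets stay in the resulting pieces.
        return Regex.Split(input, @"(?<=\])(?=\[)");
    }

    public static void Main()
    {
        var input = "[01,01,01][02,03,00][03,07,00]";
        foreach (var part in SplitGroups(input))
            Console.WriteLine(part);
    }
}
```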

The Regex solution below is restricted to exactly three two-digit values separated by commas. Inside the foreach loop you can access the matched text via match.Value.
Remember to include using System.Text.RegularExpressions;
var input = "[01,01,01][02,03,00][03,07,00][04,06,00][05,02,00][06,04,00][07,08,00][08,05,00]";
foreach (Match match in Regex.Matches(input, @"\[\d{2},\d{2},\d{2}\]"))
{
    // do stuff with match.Value
}

Thanks all for the answers. I also got it working by using this code:
string pattern = @"\[\d\d,\d\d,\d\d]";
Regex rgx = new Regex(pattern, RegexOptions.IgnoreCase);
MatchCollection matches = rgx.Matches(myResult);
Debug.WriteLine(matches.Count);
foreach (Match match in matches)
Debug.WriteLine(match.Value);

Related

Use RegEx to extract specific part from string

I have strings like
"Augustin Ralf (050288)"
"45 Max Müller (4563)"
"Hans (Adam) Meider (056754)"
I am searching for a regex to extract the last part in the brackets, for example this results for the strings above:
"050288"
"4563"
"056754"
I have tried with
var match = Regex.Match(input, @".*(\(\d*\))");
But I also get the brackets in the result. Is there a way to extract the strings without the brackets?
Taking your requirements precisely, you are looking for
\(([^()]+)\)$
This captures anything between the parentheses (not nested!), whether digits or anything else, and anchors the match to the end of the string. If you happen to have whitespace at the end, use
\(([^()]+)\)\s*$
In C# this could be
using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        string pattern = @"\(([^()]+)\)$";
        // Assumes \n line endings; with \r\n line endings use \r?$ instead of $.
        string input = @"Augustin Ralf (050288)
45 Max Müller (4563)
Hans (Adam) Meider (056754)
";
        RegexOptions options = RegexOptions.Multiline;
        foreach (Match m in Regex.Matches(input, pattern, options))
        {
            // Group 1 holds the text between the parentheses.
            Console.WriteLine("'{0}' found at index {1}.", m.Groups[1].Value, m.Groups[1].Index);
        }
    }
}
See a demo on regex101.com.
Please use the regex \(([^)]*)\)[^(]*$; the value you want is in capture group 1. I have tested it and it works as expected.
You can extract the number between the parentheses without having to deal with capturing groups by using the following regex.
(?<=\()\d+(?=\)$)
demo
Explanation:
(?<=\() : positive lookbehind for (, meaning the match starts right after a ( without including it in the result.
\d+ : matches consecutive digits until a non-digit character is found
(?=\)$) : positive lookahead for ) followed by the end of the line, meaning the match ends right before a ) at the end of the line, without including either in the result.
Edit: If the number can be within parentheses that are not at the end of the line, remove $ from the regex.
var match = Regex.Match(input, @".*\((\d*)\)");
https://regex101.com/r/Wk9asY/1
Here are three options for you.
The first one uses the simplest pattern plus the Trim method.
The second one captures the desired value in a group and then reads it from that group.
The third one uses a lookbehind and a lookahead.
var inputs = new string[] {
    "Augustin Ralf (050288)", "45 Max Müller (4563)", "Hans (Adam) Meider (056754)"
};
foreach (var input in inputs)
{
    var match = Regex.Match(input, @"\(\d+\)");
    Console.WriteLine(match.Value.Trim('(', ')'));
}
Console.WriteLine();
foreach (var input in inputs)
{
    var match = Regex.Match(input, @"\((\d+)\)");
    Console.WriteLine(match.Groups[1]);
}
Console.WriteLine();
foreach (var input in inputs)
{
    var match = Regex.Match(input, @"(?<=\()\d+(?=\))");
    Console.WriteLine(match.Value);
}
Console.WriteLine();

Use C# RegEx to retrieve a list of matching strings found in a source string? [duplicate]

This question already has an answer here:
Simple and tested online regex containing regex delimiters does not work in C# code
(1 answer)
Closed 3 years ago.
I'm a RegEx novice, so I'm hoping someone out there can give me a hint.
I want to find a straightforward way (using RegEx?) to extract a list/array of values that match a pattern from a string.
If source string is "Hello #bob and #mark and #dave", I want to retrieve a list containing "#bob", "#mark" and "#dave" or, even better, "bob", "mark" and "dave" (without the # symbol).
So far, I have something like this (in C#):
string note = "Hello, #bob and #mark and #dave";
var r = new Regex(@"/(#)\w+/g");
var listOfFound = r.Match(note);
I'm hoping listOfFound will be an array or a List containing the three values.
I could do this with some clever string parsing, but it seems like this should be a piece of cake for RegEx, if I could only come up with the right pattern.
Thanks for any help!
Regexes in C# don't need delimiters, and options must be supplied as the second argument to the constructor. No options are required in this case, because you can get all the matches using Regex.Matches. Note that by using a lookbehind for the # ((?<=#)) we can avoid having the # in the match:
string note = "Hello, #bob and #mark and #dave";
Regex r = new Regex(@"(?<=#)\w+");
foreach (Match match in r.Matches(note))
    Console.WriteLine("Found '{0}' at position {1}", match.Value, match.Index);
Output:
Found 'bob' at position 8
Found 'mark' at position 17
Found 'dave' at position 27
To get all the values into a list/array you could use something like:
string note = "Hello, #bob and #mark and #dave";
Regex r = new Regex(@"(?<=#)\w+");
// list of matches
List<String> Matches = new List<String>();
foreach (Match match in r.Matches(note))
Matches.Add(match.Value);
// array of matches
String[] listOfFound = Matches.ToArray();
You could do it without Regex, for example:
var listOfFound = note.Split().Where(word => word.StartsWith("#"));
Replace
var listOfFound = r.Match(note);
with
var listOfFound = r.Matches(note);

Easy Regex capture [duplicate]

This question already has answers here:
Regular Expression Groups in C#
(5 answers)
Closed 6 years ago.
New to using C# Regex, I am trying to capture two comma separated integers from a string into two variables.
Example: 13,567
I tried variations on
Regex regex = new Regex(@"(\d+),(\d+)");
var matches = regex.Matches("12,345");
foreach (var itemMatch in matches)
Debug.Print(itemMatch.Value);
This yields just one value, which is the entire string. I worked around this by changing the pattern to "(\d+)", but that ignores the middle comma entirely, so I would also get a match if there were any other text between the integers.
How do I get it to extract both integers and ensure there is a comma between them?
Can do this with String.Split
Why not just use a split and parse?
var results = "123,456".Split(',').Select(int.Parse).ToArray();
var left = results[0];
var right = results[1];
Alternatively, you can use a loop with int.TryParse to handle failures, but for what you're looking for this should cover it.
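A minimal sketch of the int.TryParse variant, written for exactly two values rather than as a loop. The class PairParser and the method TryParsePair are hypothetical names chosen for this example.

```csharp
using System;

// Hypothetical helper, not part of the original answer.
public static class PairParser
{
    // Parses "left,right" and returns false instead of throwing
    // when the input is not exactly two integers separated by a comma.
    // Note: int.TryParse tolerates surrounding whitespace and a sign.
    public static bool TryParsePair(string text, out int left, out int right)
    {
        left = 0;
        right = 0;
        var parts = text.Split(',');
        return parts.Length == 2
            && int.TryParse(parts[0], out left)
            && int.TryParse(parts[1], out right);
    }

    public static void Main()
    {
        int left, right;
        if (TryParsePair("123,456", out left, out right))
            Console.WriteLine("Left " + left + ", Right " + right);

        if (!TryParsePair("123,abc", out left, out right))
            Console.WriteLine("Could not parse the second input.");
    }
}
```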
If you're really committed to a Regex
You can do this with a Regex too, just need to use groups of the match
Regex r = new Regex(@"(\d+)\,(\d+)", RegexOptions.Compiled);
var r1 = r.Match("123,456");
//first is total match
Console.WriteLine(r1.Groups[0].Value);
//Then first and second groups
var left = int.Parse(r1.Groups[1].Value);
var right = int.Parse(r1.Groups[2].Value);
Console.WriteLine("Left "+ left);
Console.WriteLine("Right "+right);
Made a dotnetfiddle you can test the solution in as well
With Regex, you can use this:
Regex regex = new Regex(@"\d+(?=,)|(?<=,)\d+");
var matches = regex.Matches("12,345");
foreach (Match itemMatch in matches)
Console.WriteLine(itemMatch.Value);
prints:
12
345
This uses a look-ahead and a look-behind for the comma:
\d+(?=,) <---- // Match numbers followed by a ,
| <---- // OR
(?<=,)\d+ <---- // Match numbers preceded by a ,
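If the whole string should be exactly two integers with a single comma between them, one option is to anchor the grouped pattern with ^ and $ and read both groups from one match. This is a sketch under that assumption; AnchoredPairDemo and TryMatchPair are illustrative names only.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original answers.
public static class AnchoredPairDemo
{
    // Succeeds only when the entire input is "digits,digits".
    public static bool TryMatchPair(string text, out string left, out string right)
    {
        // ^ and $ anchor the pattern to the whole input
        // ($ also allows one trailing newline).
        var m = Regex.Match(text, @"^(\d+),(\d+)$");
        left = m.Success ? m.Groups[1].Value : null;
        right = m.Success ? m.Groups[2].Value : null;
        return m.Success;
    }

    public static void Main()
    {
        string left, right;
        if (TryMatchPair("12,345", out left, out right))
            Console.WriteLine(left + " and " + right);

        // Text between the integers instead of a comma is rejected.
        Console.WriteLine(TryMatchPair("12 abc 345", out left, out right));
    }
}
```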

Looking for patterns in a string how to?

I'm trying to find all instances of the substring EnemyType('XXXX'), where XXXX is an arbitrary string and EnemyType('XXXX') can appear multiple times.
Right now I'm using a combination of IndexOf/Substring calls in C#, but I would like to know if there's a cleaner way of doing it.
Use regex. Example:
using System.Text.RegularExpressions;
var inputString = " EnemyType('1234')abcdeEnemyType('5678')xyz";
var regex = new Regex(@"EnemyType\('\d{4}'\)");
var matches = regex.Matches(inputString);
foreach (Match i in matches)
{
Console.WriteLine(i.Value);
}
It will print:
EnemyType('1234')
EnemyType('5678')
The pattern to match is @"EnemyType\('\d{4}'\)", where \d{4} means 4 numeric characters (0-9). The parentheses are escaped with a backslash.
Edit: Since you only want the string inside quotes, not the whole string, you can use named groups instead.
var inputString = " EnemyType('1234')abcdeEnemyType('5678')xyz";
var regex = new Regex(@"EnemyType\('(?<id>[^']+)'\)");
var matches = regex.Matches(inputString);
foreach (Match i in matches)
{
Console.WriteLine(i.Groups["id"].Value);
}
Now it prints:
1234
5678
Regex is a really nice tool for parsing strings. If you often parse strings, regex can make life so much easier.

C# regex. Everything inside curly brackets {} and mod (%) characters

I'm trying to get the values between {} and %% with a single regex.
This is what I have so far. I can successfully get the values for each delimiter individually, but I would like to learn how to combine both.
var regex = new Regex(@"%(.*?)%|\{([^}]*)\}");
String s = "This is a {test} %String%. %Stack% {Overflow}";
Expected answer for the above string
test
String
Stack
Overflow
Individual regex
@"%(.*?)%" gives me String and Stack
@"\{([^}]*)\}" gives me test and Overflow
The following is my code.
var regex = new Regex(@"%(.*?)%|\{([^}]*)\}");
var matches = regex.Matches(s);
foreach (Match match in matches)
{
Console.WriteLine(match.Groups[1].Value);
}
This is similar to your regex, but uses named capturing groups (.NET allows the same group name in both alternatives):
String s = "This is a {test} %String%. %Stack% {Overflow}";
var list = Regex.Matches(s, @"\{(?<name>.+?)\}|%(?<name>.+?)%")
.Cast<Match>()
.Select(m => m.Groups["name"].Value)
.ToList();
If you want to learn how conditional expressions work, here is a solution using that kind of .NET regex capability:
(?:(?<p>%)|(?<b>{))(?<v>.*?)(?(p)%|})
See the regex demo
Here is how it works:
(?:(?<p>%)|(?<b>{)) - match and capture either Group "p" with % (percentage), or Group "b" (brace) with {
(?<v>.*?) - match and capture into Group "v" (value) any character (even a newline since I will be using RegexOptions.Singleline) zero or more times, but as few as possible (lazy matching with *? quantifier)
(?(p)%|}) - a conditional expression meaning: if "p" group was matched, match %, else, match }.
C# demo:
var s = "This is a {test} %String%. %Stack% {Overflow}";
var regex = "(?:(?<p>%)|(?<b>{))(?<v>.*?)(?(p)%|})";
var matches = Regex.Matches(s, regex, RegexOptions.Singleline);
// var matches_list = Regex.Matches(s, regex, RegexOptions.Singleline)
// .Cast<Match>()
// .Select(p => p.Groups["v"].Value)
// .ToList();
// Or just a demo writeline
foreach (Match match in matches)
Console.WriteLine(match.Groups["v"].Value);
Sometimes the capture is in group 1 and sometimes it's in group 2 because you have two pairs of parentheses.
Your original code will work if you do this instead:
Console.WriteLine(match.Groups[1].Value + match.Groups[2].Value);
because one group will be the empty string and the other will be the value you're interested in.
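Instead of concatenating the two groups, the Success property of each group can be checked to see which alternative matched. A minimal sketch using the pattern from the question; the class EitherGroupDemo and the method Extract are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original answers.
public static class EitherGroupDemo
{
    public static List<string> Extract(string s)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(s, @"%(.*?)%|\{([^}]*)\}"))
        {
            // Group 1 participates for %...%, group 2 for {...}.
            var group = m.Groups[1].Success ? m.Groups[1] : m.Groups[2];
            result.Add(group.Value);
        }
        return result;
    }

    public static void Main()
    {
        var s = "This is a {test} %String%. %Stack% {Overflow}";
        foreach (var value in Extract(s))
            Console.WriteLine(value);
    }
}
```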
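@"[{%](.*?)[}%]"
The idea being:
{ or %
anything (as little as possible)
} or %
Note that | is a literal character inside a character class, so it is not needed there. This pattern also does not check that the opening and closing delimiters are of the same kind.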
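I think you should use a combination of alternation and nested groups, with lazy quantifiers so that each match stops at the nearest closing delimiter:
((\{(.*?)\})|(%(.*?)%))
The value is then in group 3 or group 5.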
