> String St = "New Specification Result : Measures 0.0039mm ( 4 Microns )New Specification Result : Measures 0.0047mm ( 5 Microns )";
The strings I want to get are 0.0039mm and 0.0047mm, but the code I use keeps giving me only 0.0047mm.
var src = St;
var pattern = @"([0-9].[0-9]{4}mm)";
var expr = new Regex(pattern, RegexOptions.IgnoreCase);
foreach (Match match in expr.Matches(src))
{
string key = match.Groups[1].Value;
string key2 = match.Groups[2].Value;
label1.Text = key + key2;
}
Your code is mostly fine: the millimeter value you are trying to match is captured correctly, but in the first capture group, not the second. There is a slight problem with your pattern, which should be this:
([0-9]\.[0-9]{4}mm)
You intend the dot to be a literal decimal point, so it should be escaped with a backslash. Here is the full code:
var pattern = @"([0-9]\.[0-9]{4}mm)";
var expr = new Regex(pattern, RegexOptions.IgnoreCase);
foreach (Match match in expr.Matches(src))
{
string key = match.Groups[1].Value;
string key2 = match.Groups[2].Value; // this doesn't match to anything here
Console.WriteLine(key);
}
You want the following. Your loop is overwriting your copy of the first result (and you don't have a second capture group; you have a second match).
var st = "New Specification Result : Measures 0.0039mm(4 Microns)New Specification Result: Measures 0.0047mm(5 Microns";
var pattern = @"([0-9]\.[0-9]{4}mm)";
var expr = new Regex(pattern, RegexOptions.IgnoreCase);
string key = "";
foreach (Match match in expr.Matches(st))
{
key += match.Groups[1].Value;
}
You want to iterate through each match and then join them together for display:
var mm = new Regex(@"([0-9]\.[0-9]{4}mm)").Matches(src).Cast<Match>().Select(m => m.Groups[1].Value).ToList();
var list = string.Join(" ", mm);
label1.Text = list;
Currently you are only getting the last match because you keep overwriting the text in your label.
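The distinction between matches and capture groups can be sketched as a small helper. This is only an illustration: the class name `MeasurementExtractor` and the method `ExtractMeasurements` are made up for this example and are not from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MeasurementExtractor
{
    // Hypothetical helper: returns every "d.ddddmm" token found in the input.
    public static List<string> ExtractMeasurements(string input)
    {
        var results = new List<string>();

        // The pattern has ONE capture group. Each measurement in the input
        // produces its own Match, and group 1 of that match holds the token.
        foreach (Match match in Regex.Matches(input, @"([0-9]\.[0-9]{4}mm)"))
        {
            results.Add(match.Groups[1].Value);
        }

        return results;
    }
}
```

With the string from the question, `ExtractMeasurements` should return 0.0039mm and 0.0047mm, and `string.Join(" ", ...)` over that list gives a single text for the label.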
Related
I need to recover each number in a glued string
For example, from these strings:
string test = "number1+3"
string test1 = "number 1+4"
I want to recover (1 and 3) and (1 and 4)
How can I do this?
CODE
string test= "number1+3";
List<int> res = new List<int>();
string[] digits= Regex.Split(test, @"\D+");
foreach (string value in digits)
{
int number;
if (int.TryParse(value, out number))
{
res.Add(number);
}
}
This regex should work
string pattern = @"\d+";
string test = "number1+3";
foreach (Match match in Regex.Matches(test, pattern))
Console.WriteLine("Found '{0}' at position {1}",
match.Value, match.Index);
Note that if you intend to use it multiple times, it's better, for performance reasons, to create a Regex instance than to use the static method.
var res = new List<int>();
var regex = new Regex(@"\d+");
void addMatches(string text) {
foreach (Match match in regex.Matches(text))
{
int number = int.Parse(match.Value);
res.Add(number);
}
}
string test = "number1+3";
addMatches(test);
string test1 = "number 1+4";
addMatches(test1);
This calls for a regular expression:
(\d+)\+(\d+)
Match m = Regex.Match(input, @"(\d+)\+(\d+)");
string first = m.Groups[1].Captures[0].Value;
string second = m.Groups[2].Captures[0].Value;
An alternative to regular expressions:
string test = "number 1+4";
int[] numbers = test.Replace("number", string.Empty, StringComparison.InvariantCultureIgnoreCase)
.Trim()
.Split("+", StringSplitOptions.RemoveEmptyEntries)
.Select(x => Convert.ToInt32(x))
.ToArray();
I've found a lot of examples of how to check something using regex, or how to split text using regular expressions.
But how can I extract words out of a string ?
Example:
aaaa 12312 <asdad> 12334 </asdad>
Let's say I have something like this, and I want to extract all the numbers [0-9]* and put them in a list.
Or if I have two different kinds of elements:
aaaa 1234 ...... 1234 ::::: asgsgd
And I want to pick out the digits that come after ...... and the words that come after :::::
Can I extract these strings with a single regex?
Here's a solution for your first problem:
class Program
{
static void Main(string[] args)
{
string data = "aaaa 12312 <asdad> 12334 </asdad>";
Regex reg = new Regex("[0-9]+");
foreach (var match in reg.Matches(data))
{
Console.WriteLine(match);
}
Console.ReadLine();
}
}
In the general case, you can do this using capturing parentheses:
string input = "aaaa 1234 ...... 1234 ::::: asgsgd";
string regex = @"\.\.\.\. (\d+) ::::: (\w+)";
Match m = Regex.Match(input, regex);
if (m.Success) {
int numberAfterDots = int.Parse(m.Groups[1].Value);
string wordAfterColons = m.Groups[2].Value;
// ... Do something with these values
}
But the first part you asked (extract all the numbers) is a bit easier:
string input = "aaaa 1234 ...... 1234 ::::: asgsgd";
var numbers = Regex.Matches(input, @"\d+")
.Cast<Match>()
.Select(m => int.Parse(m.Value))
.ToList();
Now numbers will be a list of integers.
For your specific examples:
string firstString = "aaaa 12312 <asdad> 12334 </asdad>";
Regex firstRegex = new Regex(@"(?<Digits>[\d]+)", RegexOptions.ExplicitCapture);
if (firstRegex.IsMatch(firstString))
{
MatchCollection firstMatches = firstRegex.Matches(firstString);
foreach (Match match in firstMatches)
{
Console.WriteLine("Digits: " + match.Groups["Digits"].Value);
}
}
string secondString = "aaaa 1234 ...... 1234 ::::: asgsgd";
Regex secondRegex = new Regex(@"([\.]+\s(?<Digits>[\d]+))|([\:]+\s(?<Words>[a-zA-Z]+))", RegexOptions.ExplicitCapture);
if (secondRegex.IsMatch(secondString))
{
MatchCollection secondMatches = secondRegex.Matches(secondString);
foreach (Match match in secondMatches)
{
if (match.Groups["Digits"].Success)
{
Console.WriteLine("Digits: " + match.Groups["Digits"].Value);
}
if (match.Groups["Words"].Success)
{
Console.WriteLine("Words: " + match.Groups["Words"].Value);
}
}
}
Hope that helps. The output is:
Digits: 12312
Digits: 12334
Digits: 1234
Words: asgsgd
Something like this will do nicely!
var text = "aaaa 12312 <asdad> 12334 </asdad>";
var matches = Regex.Matches(text, @"\w+");
var arrayOfMatched = matches.Cast<Match>().Select(m => m.Value).ToArray();
Console.WriteLine(string.Join(", ", arrayOfMatched));
\w+ matches consecutive word characters. We then select the values out of the list of matches and turn them into an array.
Regex itemsRegex = new Regex(@"\d+");
MatchCollection matches = itemsRegex.Matches(text);
int[] values = matches.Cast<Match>().Select(m => Convert.ToInt32(m.Value)).ToArray();
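One thing to watch when converting matches to integers: a pattern such as `\d*` also matches the empty string at every position where there is no digit, and `Convert.ToInt32("")` throws a FormatException. A minimal sketch that avoids this, assuming a made-up helper named `ExtractInts` in a made-up class `NumberExtractor`, uses `\d+` together with `int.TryParse`:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NumberExtractor
{
    // Hypothetical helper: collects every run of digits in the text as an int.
    public static List<int> ExtractInts(string text)
    {
        var values = new List<int>();

        // \d+ requires at least one digit, so no empty matches are produced.
        foreach (Match m in Regex.Matches(text, @"\d+"))
        {
            // TryParse skips digit runs that do not fit in an int
            // instead of throwing an exception.
            if (int.TryParse(m.Value, out int value))
            {
                values.Add(value);
            }
        }

        return values;
    }
}
```

For the sample text "aaaa 12312 <asdad> 12334 </asdad>" this should give a list containing 12312 and 12334.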
Regex phoneregex = new Regex(@"[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]");
String unicornCanneryDirectory = "unicorn cannery 483-8627 cha...";
String numbersToCall = "";
//the second argument is the position in the input string at which to start searching,
//we probably want 0, the first character
Match matchIterator = phoneregex.Match(unicornCanneryDirectory, 0);
//Success tells us whether matchIterator holds a match or not
while (matchIterator.Success) {
String aResult = matchIterator.Value;
//we could manipulate our match now but I'm going to concatenate them all for later
numbersToCall += aResult + " ";
matchIterator = matchIterator.NextMatch();
}
// use my concatenated matches now
String message = "Unicorn rights activists demand more sparkles in the unicorn canneries under the new law...";
// phoneDialer is a hypothetical object, not part of .NET
phoneDialer.MassCallWithAutomatedMessage(numbersToCall, message);
http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.match.nextmatch.aspx
I have a string that looks like this:
var expression = @"Args(""token1"") + Args(""token2"")";
I want to retrieve a collection of strings that are enclosed in Args("") in the expression.
How would I do this in C# or VB.NET?
Regex:
string expression = "Args(\"token1\") + Args(\"token2\")";
Regex r = new Regex("Args\\(\"([^\"]+)\"\\)");
List<string> tokens = new List<string>();
foreach (var match in r.Matches(expression)) {
string s = match.ToString();
int start = s.IndexOf('\"');
int end = s.LastIndexOf('\"');
tokens.Add(s.Substring(start + 1, end - start - 1));
}
Non-regex (this assumes that the string is in the correct format!):
string expression = "Args(\"token1\") + Args(\"token2\")";
List<string> tokens = new List<string>();
int index;
while (!String.IsNullOrEmpty(expression) && (index = expression.IndexOf("Args(\"")) >= 0) {
int start = expression.IndexOf('\"', index);
string s = expression.Substring(start + 1);
int end = s.IndexOf("\")");
tokens.Add(s.Substring(0, end));
expression = s.Substring(end + 2);
}
There is another regular expression method for accomplishing this, using lookahead and lookbehind assertions:
Regex regex = new Regex("(?<=Args\\(\").*?(?=\"\\))");
string input = "Args(\"token1\") + Args(\"token2\")";
MatchCollection matches = regex.Matches(input);
foreach (var match in matches)
{
Console.WriteLine(match.ToString());
}
This strips away the Args sections of the string, giving just the tokens.
If you want token1 and token2, you can use the following regex; each token ends up in Groups[1] of its match:
string input = @"Args(""token1"") + Args(""token2"")";
MatchCollection matches = Regex.Matches(input, @"Args\(""([^""]+)""\)");
Sorry if this is not what you are looking for.
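To turn those matches into a collection of strings, the captured group of each match can be read directly. The following is a sketch only; `ArgsTokenExtractor` and `ExtractTokens` are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ArgsTokenExtractor
{
    // Hypothetical helper: returns the text between Args(" and ") for every occurrence.
    public static List<string> ExtractTokens(string expression)
    {
        var tokens = new List<string>();

        // In a verbatim string, "" stands for one literal double quote.
        // Group 1 captures everything between the quotes.
        foreach (Match m in Regex.Matches(expression, @"Args\(""([^""]+)""\)"))
        {
            tokens.Add(m.Groups[1].Value);
        }

        return tokens;
    }
}
```

For the expression in the question this should yield token1 and token2, without any IndexOf or Substring work on the match text.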
If your collection looks like this:
IList<String> expression = new List<String> { "token1", "token2" };
var collection = expression.Select(s => Args(s));
As long as Args returns the same type as the queried collection type, this should work.
You can then iterate over the collection like so:
foreach (var s in collection)
{
Console.WriteLine(s);
}
In a program I'm reading in some data files, part of which are formatted as a series of records each in square brackets. Each record contains a section title and a series of key/value pairs.
I originally wrote code to loop through and extract the values, but decided it could be done more elegantly using regular expressions. Below is my resulting code (I just hacked it out for now in a console app, so I know the variable names aren't that great, etc.).
Can you suggest improvements? I feel it shouldn't be necessary to do two matches and a substring, but can't figure out how to do it all in one big step:
string input = "[section1 key1=value1 key2=value2][section2 key1=value1 key2=value2 key3=value3][section3 key1=value1]";
MatchCollection matches = Regex.Matches(input, @"\[[^\]]*\]");
foreach (Match match in matches)
{
string subinput = match.Value;
int firstSpace = subinput.IndexOf(' ');
string section = subinput.Substring(1, firstSpace-1);
Console.WriteLine(section);
MatchCollection newMatches = Regex.Matches(subinput.Substring(firstSpace + 1), @"\s*(\w+)\s*=\s*(\w+)\s*");
foreach (Match newMatch in newMatches)
{
Console.WriteLine("{0}={1}", newMatch.Groups[1].Value, newMatch.Groups[2].Value);
}
}
I prefer named captures, nice formatting, and clarity:
string input = "[section1 key1=value1 key2=value2][section2 key1=value1 key2=value2 key3=value3][section3 key1=value1]";
MatchCollection matches = Regex.Matches(input, @"\[
(?<sectionName>\S+)
(\s+
(?<key>[^=]+)
=
(?<value>[^ \] ]+)
)+
]", RegexOptions.IgnorePatternWhitespace);
foreach(Match currentMatch in matches)
{
Console.WriteLine("Section: {0}", currentMatch.Groups["sectionName"].Value);
CaptureCollection keys = currentMatch.Groups["key"].Captures;
CaptureCollection values = currentMatch.Groups["value"].Captures;
for(int i = 0; i < keys.Count; i++)
{
Console.WriteLine("{0}={1}", keys[i].Value, values[i].Value);
}
}
You should take advantage of the collections to get each key. So something like this then:
string input = "[section1 key1=value1 key2=value2][section2 key1=value1 key2=value2 key3=value3][section3 key1=value1]";
Regex r = new Regex(@"(\[(\S+) (\s*\w+\s*=\s*\w+\s*)*\])", RegexOptions.Compiled);
foreach (Match m in r.Matches(input))
{
Console.WriteLine(m.Groups[2].Value);
foreach (Capture c in m.Groups[3].Captures)
{
Console.WriteLine(c.Value);
}
}
Resulting output:
section1
key1=value1
key2=value2
section2
key1=value1
key2=value2
key3=value3
section3
key1=value1
You should be able to do something with nested groups like this:
string pattern = @"\[(\S+)(\s+([^\s=]+)=([^\s\]]+))*\]";
I haven't tested it in C# or looped through the matches, but the results look right on rubular.com
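A rough sketch of how that nested-group pattern might be looped over in C# follows. It relies on the fact that a group inside a repeated group records every repetition in its Captures collection. The class `SectionParser` and method `Parse` are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SectionParser
{
    // Hypothetical helper: maps each section name to its key/value pairs.
    public static Dictionary<string, List<KeyValuePair<string, string>>> Parse(string input)
    {
        var result = new Dictionary<string, List<KeyValuePair<string, string>>>();
        string pattern = @"\[(\S+)(\s+([^\s=]+)=([^\s\]]+))*\]";

        foreach (Match m in Regex.Matches(input, pattern))
        {
            // Group 1 is the section name. Groups 3 and 4 sit inside the
            // repeated group 2, so each keeps one Capture per key/value pair.
            CaptureCollection keys = m.Groups[3].Captures;
            CaptureCollection values = m.Groups[4].Captures;

            var pairs = new List<KeyValuePair<string, string>>();
            for (int i = 0; i < keys.Count; i++)
            {
                pairs.Add(new KeyValuePair<string, string>(keys[i].Value, values[i].Value));
            }

            // A repeated section name overwrites the earlier entry in this sketch.
            result[m.Groups[1].Value] = pairs;
        }

        return result;
    }
}
```

For the sample input, `Parse(input)["section2"]` should contain three pairs, the last being key3=value3.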
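This matches the section number and the first two key/value pairs of each section (the pattern cannot step over the space before a third pair, so key3 in section2 is not captured) ...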
var input = "[section1 key1=value1 key2=value2][section2 key1=value1 key2=value2 key3=value3][section3 key1=value1]";
var ms = Regex.Matches(input, @"section(\d+)\s*(\w+=\w+)\s*(\w+=\w+)*");
foreach (Match m in ms)
{
Console.WriteLine("Section " + m.Groups[1].Value);
for (var i = 2; i < m.Groups.Count; i++)
{
if( !m.Groups[i].Success ) continue;
var kvp = m.Groups[i].Value.Split( '=' );
Console.WriteLine( "{0}={1}", kvp[0], kvp[1] );
}
}
I need to cut out and save/use part of a string in C#. I figure the best way to do this is by using Regex. My string looks like this:
"changed from 1 to 10".
I need a way to cut out the two numbers and use them elsewhere. What's a good way to do this?
Error checking left as an exercise...
Regex regex = new Regex( @"\d+" );
MatchCollection matches = regex.Matches( "changed from 1 to 10" );
int num1 = int.Parse( matches[0].Value );
int num2 = int.Parse( matches[1].Value );
Matching only exactly the string "changed from x to y":
string pattern = @"^changed from ([0-9]+) to ([0-9]+)$";
Regex r = new Regex(pattern);
Match m = r.Match(text);
if (m.Success) {
// Groups[0] is the whole match; the two numbers are in Groups[1] and Groups[2]
int from = Convert.ToInt32(m.Groups[1].Value);
int to = Convert.ToInt32(m.Groups[2].Value);
// Do stuff
} else {
// Error, regex did not match
}
In your regex, put the fields you want to record in parentheses, and then use the Match.Groups property to extract the matched fields.
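As an illustration of reading parenthesized fields with error checking, here is a small sketch. The helper `TryParseChange` and the class `ChangeParser` are invented names, and the pattern assumes the text is exactly of the form "changed from x to y".

```csharp
using System;
using System.Text.RegularExpressions;

public static class ChangeParser
{
    // Hypothetical helper: extracts both numbers from "changed from x to y".
    // Returns false if the text has a different shape or a number does not fit in an int.
    public static bool TryParseChange(string text, out int oldValue, out int newValue)
    {
        oldValue = 0;
        newValue = 0;

        Match m = Regex.Match(text, @"^changed from ([0-9]+) to ([0-9]+)$");

        // Groups[0] is the entire match; the parenthesized fields start at Groups[1].
        return m.Success
            && int.TryParse(m.Groups[1].Value, out oldValue)
            && int.TryParse(m.Groups[2].Value, out newValue);
    }
}
```

Calling `TryParseChange("changed from 1 to 10", out int a, out int b)` should return true with a equal to 1 and b equal to 10.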
Use named capture groups.
Regex r = new Regex(@"(?<FirstNumber>[0-9]{1,2})\D+(?<SecondNumber>[0-9]{1,2})");
string input = "changed from 1 to 10";
string firstNumber = "";
string secondNumber = "";
MatchCollection joinMatches = r.Matches(input);
foreach (Match m in joinMatches)
{
firstNumber= m.Groups["FirstNumber"].Value;
secondNumber= m.Groups["SecondNumber"].Value;
}
Get Expresso to help you out; it has an export-to-C# option.
DISCLAIMER: Regex is probably not right (my copy of Expresso expired :D)
Here is a code snippet that does almost what I wanted:
using System.Text.RegularExpressions;
string text = "changed from 1 to 10";
string pattern = @"\b(?<digit>\d+)\b";
Regex r = new Regex(pattern);
MatchCollection mc = r.Matches(text);
foreach (Match m in mc) {
CaptureCollection cc = m.Groups["digit"].Captures;
foreach (Capture c in cc){
Console.WriteLine((Convert.ToInt32(c.Value)));
}
}