Regular expression question (C#) - c#

How do I write a regular expression to match (_Rev. n.nn) in the following filenames (where n is a number):
Filename_Rev. 1.00
Filename_Rev. 1.10
Thanks

The following should work (for the whole line):
@"^Filename_Rev\.\s\d\.\d\d$"

Should capture versions >9
Edit: Fixed
string captureString = "abc123butts_Rev. 1.00";
Regex reg = new Regex(@"(.(?!_Rev))+\w_Rev\. (?<version>\d+\.\d+)");
string version = reg.Match(captureString).Groups["version"].Value;

Building off of @leppie's answer (give him the green check not me), you can extract the numbers from your regex match by putting parens around the \d's.
Regex foo = new Regex(@"_Rev\.\s(\d)\.(\d\d)$");
GroupCollection groups = foo.Match("Filename_Rev. 1.00").Groups;
string majorNum = groups[1].Value;
string minorNum = groups[2].Value;
System.Console.WriteLine(majorNum);
System.Console.WriteLine(minorNum);
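Combining the two answers above, here is a minimal sketch that uses named groups and also accepts multi-digit numbers. The class and method names (RevisionParser, TryParse) are made up for illustration, and using [0-9]+ instead of a single digit is an assumption about which version numbers you want to accept.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class RevisionParser
{
    // "_Rev. <major>.<minor>" at the end of the input.
    // [0-9]+ is an assumption so that versions such as 10.25 also match.
    private static readonly Regex RevPattern =
        new Regex(@"_Rev\.\s(?<major>[0-9]+)\.(?<minor>[0-9]+)$");

    // Returns true and fills major/minor when the name ends with a revision suffix.
    public static bool TryParse(string fileName, out string major, out string minor)
    {
        major = string.Empty;
        minor = string.Empty;

        Match m = RevPattern.Match(fileName);
        if (!m.Success)
        {
            return false;
        }

        major = m.Groups["major"].Value;
        minor = m.Groups["minor"].Value;
        return true;
    }
}
```

The parts are returned as strings so that "1.00" and "1.0" stay distinguishable; convert them with int.Parse if you need numbers.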

Related

Can LINQ be used to search for Regex expressions in a string?

I have the following code that works, but I would like to change it to use LINQ to check whether any of the regex search strings occur in the target.
foreach (Paragraph comment in
    wordDoc.MainDocumentPart.Document.Body.Descendants<Paragraph>().Where<Paragraph>(comment => comment.InnerText.Contains("cmt")))
{
    //print values
}
More precisely, I have to select through LINQ the strings that start with letters or with the symbols - or •.
Is this regex correct for my case?
string pattern = @"^[a-zA-Z-]+$";
Regex rg = new Regex(pattern);
Any suggestions, please?
Thanks in advance for any help.
You can. It would be better to use query syntax though, as described here: https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/linq/how-to-combine-linq-queries-with-regular-expressions
Example:
var queryMatchingFiles =
    from file in fileList
    where file.Extension == ".htm"
    let fileText = System.IO.File.ReadAllText(file.FullName)
    let matches = searchTerm.Matches(fileText)
    where matches.Count > 0
    select new
    {
        name = file.FullName,
        matchedValues = from System.Text.RegularExpressions.Match match in matches
                        select match.Value
    };
Your pattern is fine; just remove the $ from the end and allow any character after it:
@"^[a-zA-Z-]+. *"
Your regex should be modified as
^[\p{L}•-]
To also allow whitespace at the start of the string add \s and use
^[\p{L}\s•-]
Details
^ - start of string
[\p{L}•-] - a letter, • or -
[\p{L}\s•-] - a letter, whitespace, • or -
In C#, use
var reg = new Regex(@"^[\p{L}•-]");
foreach (Paragraph comment in
    wordDoc.MainDocumentPart.Document.Body.Descendants<Paragraph>()
        .Where<Paragraph>(comment => reg.IsMatch(comment.InnerText)))
{
    //print values
}
If you want to match those items containing cmt and also matching this regex, you may adjust the pattern to
var reg = new Regex(@"^(?=.*cmt)[\p{L}\s•-]", RegexOptions.Singleline);
If you need to only allow cmt at the start of the string:
var reg = new Regex(@"^(?:cmt|[\p{L}\s•-])");
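To try the pattern without a Word document, here is a minimal sketch that applies the same check to plain strings. ParagraphFilter and StartingWithLetterOrBullet are hypothetical names; the bullet is written as \u2022 only so the example does not depend on the source file encoding.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class ParagraphFilter
{
    // A letter, the bullet character (U+2022) or "-" at the very start of the text.
    private static readonly Regex StartPattern = new Regex(@"^[\p{L}\u2022-]");

    // Keeps only the texts whose first character is a letter, a bullet or a hyphen.
    public static List<string> StartingWithLetterOrBullet(IEnumerable<string> texts)
    {
        return texts.Where(t => StartPattern.IsMatch(t)).ToList();
    }
}
```

With the OpenXML code above you would pass the InnerText of each paragraph, or keep the Where clause on the paragraphs themselves as shown in the answer.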

How can I split a regex into exact words?

I need a little help regarding Regular Expressions in C#
I have the following string
"[[Sender.Name]]\r[[Sender.AdditionalInfo]]\r[[Sender.Street]]\r[[Sender.ZipCode]] [[Sender.Location]]\r[[Sender.Country]]\r"
The string could also contain spaces and theoretically any other characters. So I really need to match the [[words]].
What I need is a text array like this
"[[Sender.Name]]",
"[[Sender.AdditionalInfo]]",
"[[Sender.Street]]",
// ... And so on.
I'm pretty sure that this is perfectly doable with:
var stringArray = Regex.Split(line, @"\[\[+\]\]")
I'm just too stupid to find the correct Regex for the Regex.Split() call.
Anyone here that can tell me the correct Regular Expression to use in my case?
As you can tell I'm not that experienced with RegEx :)
Why don't you split on "\r"?
You don't need a regex for that; just use the standard string function:
string[] delimiters = { "\r" }; // regular (non-verbatim) literal, so \r is a carriage return
string[] split = line.Split(delimiters, StringSplitOptions.None);
Use matching rather than splitting if you want to get the [[..]] blocks.
Regex rgx = new Regex(@"\[\[.*?\]\]");
foreach (Match m in rgx.Matches(input))
    Console.WriteLine(m.Groups[0].Value);
The regex you are using (\[\[+\]\]) matches two or more literal [ characters immediately followed by two literal ] characters, so it never matches a block that has text between the brackets.
A regex solution is to match all the non-] characters inside doubled [ and ] (the string inside the brackets should not be empty, I guess?), and then cast the MatchCollection to a list or array (here is an example with a list):
var str = "[[Sender.Name]]\r[[Sender.AdditionalInfo]]\r[[Sender.Street]]\r[[Sender.ZipCode]] [[Sender.Location]]\r[[Sender.Country]]\r";
var rgx22 = new Regex(@"\[\[[^]]+?\]\]");
var res345 = rgx22.Matches(str).Cast<Match>().ToList();
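If you want a string array, as asked in the question, rather than a list of Match objects, a minimal sketch could look like the following. PlaceholderExtractor and Extract are hypothetical names, and the ] inside the character class is escaped here purely for readability.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class PlaceholderExtractor
{
    // "[[", then one or more characters that are not "]", then "]]".
    private static readonly Regex Placeholder = new Regex(@"\[\[[^\]]+\]\]");

    // Returns every [[...]] block in the order it appears in the input.
    public static string[] Extract(string input)
    {
        return Placeholder.Matches(input)
                          .Cast<Match>()
                          .Select(m => m.Value)
                          .ToArray();
    }
}
```

Whatever sits between the blocks (carriage returns, spaces, other text) is simply skipped, so there is no need to know the separators in advance.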

Regex: Parse specific string with one 18-digit number

C#/.NET 4.0
I need to parse a string containing an 18-digit number. I also need the substrings at the left and right side.
Example strings:
string a = "Frl Camp Gerbesklooster 871687120000000691 OPLDN 2010 H1";
string b = "some text with spaces 123456789012345678 more text";
How it should be parsed:
string aParsed[0] = "Frl Camp Gerbesklooster";
string aParsed[1] = "871687120000000691";
string aParsed[2] = "OPLDN 2010 H1";
string bParsed[0] = "some text with spaces";
string bParsed[1] = "123456789012345678";
string bParsed[2] = "more text";
There is always that 18-digit number in the middle of the string. I'm an absolute newbie to Regex so I don't actually have a try of my own.
What is the best way to do this? Should I use regular expressions?
Thanks.
You can use something like the regex: (.*)(\d{18})(.*).
The key here is to use {18} to specify that there must be exactly 18 digits and to capture each part in a group.
var parts = Regex.Matches(s, @"(.*)(\d{18})(.*)")
    .Cast<Match>()
    .SelectMany(m => m.Groups.Cast<Group>().Skip(1).Select(g => g.Value))
    .ToArray();
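The groups captured that way still carry the spaces around the number. Here is a minimal sketch, under the assumption that you want the left and right parts trimmed as in the question's expected output. EighteenDigitParser and Parse are hypothetical names.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class EighteenDigitParser
{
    // Lazy left part, exactly 18 ASCII digits, then the rest of the line.
    private static readonly Regex Pattern =
        new Regex(@"^(?<left>.*?)(?<number>[0-9]{18})(?<right>.*)$");

    // Returns { left, number, right } with surrounding whitespace trimmed,
    // or null when the input contains no run of 18 digits.
    public static string[] Parse(string input)
    {
        Match m = Pattern.Match(input);
        if (!m.Success)
        {
            return null;
        }

        return new[]
        {
            m.Groups["left"].Value.Trim(),
            m.Groups["number"].Value,
            m.Groups["right"].Value.Trim()
        };
    }
}
```

Note that a longer run of digits would also match (its first 18 digits become the number), so add a check for that if such input is possible.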
Daniël,
Although the question is already answered, the following may be a useful reference for learning regular expressions.
http://txt2re.com
Regards,
Liam

regex problem C#

const string strRegex = @"(?<city_country>.+) ?(bis|zu)? (?<price>[\d.,]+) eur";
searchQuery = RemoveSpacesFromString(searchQuery);
Regex regex = new Regex(strRegex, RegexOptions.IgnoreCase);
Match m = regex.Match(searchQuery);
ComplexAdvertismentsQuery query = new ComplexAdvertismentsQuery();
if (m.Success)
{
    query.CountryName = m.Groups["city_country"].Value;
    query.CityOrAreaName = m.Groups["city_country"].Value;
    query.PriceFrom = Convert.ToDecimal(1);
    query.PriceTo = Convert.ToDecimal(m.Groups["price"].Value);
}
else
    return null;
return query;
My search string is "Agadir ca. 600 eur". "ca." is neither "bis" nor "zu", yet the regex still matches. What is wrong with the regex? I want it to match only if the word "bis" or "zu" is present.
I think this is wrong:
 ?(bis|zu)?
Agadir ca. becomes your city_country, and the (bis|zu)? part is skipped because you've marked it as optional with ?.
It looks like it's because you made bis and zu optional. Try changing (bis|zu)? to (bis|zu).
Remove the question mark in (bis|zu)?. As it is right now, the .+ of <city_country> matches up to the price and includes ca..
In fact, you might want to change the whole ?(bis|zu)? part to ( bis| zu).
(?<city_country>.+) ?(bis|zu)? (?<price>[\d.,]+) eur
                    ^        ^
                    1        2
The first ? belongs to the space before it.
The second ? belongs to the (bis|zu) group.
In both cases the ? makes the preceding expression optional.
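Putting the suggestions above together, here is a minimal sketch of the pattern with both question marks removed, so that "bis" or "zu" is required. PriceQueryParser and TryParse are hypothetical names, and the sketch leaves out the asker's RemoveSpacesFromString and ComplexAdvertismentsQuery because their behaviour is not shown.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class PriceQueryParser
{
    // The space and the (bis|zu) group are now mandatory.
    private static readonly Regex Pattern = new Regex(
        @"(?<city_country>.+) (?:bis|zu) (?<price>[\d.,]+) eur",
        RegexOptions.IgnoreCase);

    // Returns true only when the query contains "bis" or "zu" before the price.
    public static bool TryParse(string query, out string place, out string price)
    {
        place = string.Empty;
        price = string.Empty;

        Match m = Pattern.Match(query);
        if (!m.Success)
        {
            return false;
        }

        place = m.Groups["city_country"].Value;
        price = m.Groups["price"].Value;
        return true;
    }
}
```

With this pattern "Agadir ca. 600 eur" no longer matches, while "Agadir bis 600 eur" yields "Agadir" and "600".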

Regular expression to retrieve everything before first slash

I need a regular expression to basically get the first part of a string, before the first backslash (\).
For example in the following:
C:\MyFolder\MyFile.zip
The part I need is "C:"
Another example:
somebucketname\MyFolder\MyFile.zip
I would need "somebucketname"
I also need a regular expression to retrieve the "right hand" part of it, so everything after the first slash (excluding the slash.)
For example
somebucketname\MyFolder\MyFile.zip
would return
MyFolder\MyFile.zip.
You don't need a regular expression (it would incur too much overhead for a simple problem like this), try this instead:
yourString = yourString.Substring(0, yourString.IndexOf('\\'));
And for finding everything after the first slash you can do this:
yourString = yourString.Substring(yourString.IndexOf('\\') + 1);
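One caveat: IndexOf returns -1 when the string contains no backslash, and Substring(0, -1) then throws. Here is a minimal sketch that guards against that. PathSplitter and SplitAtFirstBackslash are hypothetical names, and returning the whole input as the first part when there is no backslash is an assumption about what you want in that case.

```csharp
// Hypothetical helper for illustration only.
public static class PathSplitter
{
    // Splits at the first backslash. When there is none, the whole input
    // becomes the first part and the rest is empty (an assumed convention).
    public static void SplitAtFirstBackslash(string path, out string first, out string rest)
    {
        int index = path.IndexOf('\\');
        if (index < 0)
        {
            first = path;
            rest = string.Empty;
            return;
        }

        first = path.Substring(0, index);   // everything before the backslash
        rest = path.Substring(index + 1);   // everything after it, backslash excluded
    }
}
```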
This problem can be handled quite cleanly with the .NET regular expression engine. What makes .NET regular expressions really nice is the ability to use named group captures.
Using a named group capture allows you to define a name for each part of the regular expression you are interested in "capturing", which you can reference later to get at its value. The syntax for a named group capture is (?<name>Some Regex Expression). Remember also to include the using directive for System.Text.RegularExpressions when using regular expressions in your project.
Enjoy!
//Regular expression
string _regex = @"(?<first_part>[a-zA-Z:0-9]+)\\{1}(?<second_part>(.)+)";
//Example 1
{
    Match match = Regex.Match(@"C:\MyFolder\MyFile.zip", _regex, RegexOptions.IgnoreCase);
    string firstPart = match.Groups["first_part"].Captures[0].Value;
    string secondPart = match.Groups["second_part"].Captures[0].Value;
}
//Example 2
{
    Match match = Regex.Match(@"somebucketname\MyFolder\MyFile.zip", _regex, RegexOptions.IgnoreCase);
    string firstPart = match.Groups["first_part"].Captures[0].Value;
    string secondPart = match.Groups["second_part"].Captures[0].Value;
}
You are aware that .NET's file handling classes do this a lot more elegantly, right?
For example in your last example, you could do:
FileInfo fi = new FileInfo(@"somebucketname\MyFolder\MyFile.zip");
string nameOnly = fi.Name; // the file name only, i.e. "MyFile.zip"
For the first example you could do:
FileInfo fi = new FileInfo(@"C:\MyFolder\MyFile.zip");
string driveOnly = fi.Directory.Root.Name.Replace(@"\", ""); // Root is a member of DirectoryInfo, not FileInfo
This matches all non \ chars
[^\\]*
Here is the regular expression solution using the lazy (non-greedy) quantifier '*?'. Note that the match includes the trailing backslash.
var pattern = "^.*?\\\\";
var m = Regex.Match("c:\\test\\gimmick.txt", pattern);
MessageBox.Show(m.Captures[0].Value);
Split on slash, then get first item
words = s.Split('\\');
words[0]
