How can I find a formatted number with Regex? - C#

This is my text: 0.123.456Vaaa.789.V
I want to find: 123.456V
I am using this pattern in C#: \.[0-9]*[\.]?[0-9]*V
But it returns two values: 123.456V and 789.V
I don't want the case where nothing follows the ".": 789.V
How can I fix my pattern?
Thank you.

In your pattern, [\.]? is redundant: the dot needs either the character class or the escape, not both. I suggest writing the optional dot as \.?, which is the least ambiguous form. The [0-9]* after the optional dot matches zero or more digits, which is why you get unexpected matches such as 789.V.
You do not seem to need the \. at the start, either.
You can use
[0-9]*\.?[0-9]+V
See the .NET regex demo.
Details:
[0-9]* - zero or more ASCII digits
\.? - an optional .
[0-9]+ - one or more digits
V - a V char.
See a C# regex demo:
var results = Regex.Matches(text, @"[0-9]*\.?[0-9]+V")
.Cast<Match>()
.Select(x => x.Value)
.ToList();
// => 123.456V

I think the simplest solution would be:
\d+\.\d+V
meaning you want to find some arbitrary number of digits, followed by a dot, followed by more digits, followed by the letter V.
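As a rough check, the two suggested patterns can be run side by side on the sample text from the question. This is only a sketch: the class name NumberBeforeV and the helper Find are made up for illustration. On this input both patterns should yield the single value 123.456V; they differ on inputs without a dot (for example 12V), which only the first pattern accepts.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class NumberBeforeV
{
    // Hypothetical helper: returns every match of the pattern in the text.
    public static List<string> Find(string text, string pattern)
    {
        return Regex.Matches(text, pattern)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToList();
    }

    public static void Main()
    {
        string text = "0.123.456Vaaa.789.V";

        // Both patterns require at least one digit directly before V,
        // so the "789.V" fragment is rejected.
        foreach (string pattern in new[] { @"[0-9]*\.?[0-9]+V", @"\d+\.\d+V" })
        {
            Console.WriteLine(pattern + " => " + string.Join(", ", Find(text, pattern)));
        }
    }
}
```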

Related

Split String by Regex Expression

This is my string.
19282511~2017-08-28 13:24:28~Entering (A/B)~1013~283264/89282511~2017-08-28 13:24:28~Entering (A/B)~1013~283266/79282511~2017-08-28 13:24:28~Entering (A/B)~1013~283261
I would like this string to be split like below:
19282511~2017-08-28 13:24:28~Entering (A/B)~1013~283264
89282511~2017-08-28 13:24:28~Entering (A/B)~1013~283266
79282511~2017-08-28 13:24:28~Entering (A/B)~1013~283261
I cannot blindly split my string on the slash (/), since the value A/B would also get split.
Any idea how to do this with a regular expression?
Your help will definitely be appreciated.
You may split with / that is in between digits:
(?<=\d)/(?=\d)
See the regex demo
Details
(?<=\d) - a positive lookbehind that requires a digit to appear immediately to the left of the current location
/ - a / char
(?=\d) - a positive lookahead that requires a digit to appear immediately to the right of the current location.
Since the \d pattern is inside non-consuming patterns, only / will be removed upon splitting and the digits will remain in the resulting items.
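The splitting approach can be sketched in C# with Regex.Split. The class and method names below (SlashSplitter, SplitRecords) are illustrative only; the input is the string from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SlashSplitter
{
    // Splits only on a slash that has a digit on both sides,
    // so the slash inside "(A/B)" is left alone.
    public static string[] SplitRecords(string s)
    {
        return Regex.Split(s, @"(?<=\d)/(?=\d)");
    }

    public static void Main()
    {
        string s = "19282511~2017-08-28 13:24:28~Entering (A/B)~1013~283264/"
                 + "89282511~2017-08-28 13:24:28~Entering (A/B)~1013~283266/"
                 + "79282511~2017-08-28 13:24:28~Entering (A/B)~1013~283261";

        foreach (string record in SplitRecords(s))
        {
            Console.WriteLine(record);
        }
    }
}
```

Because the pattern contains no capturing groups, Regex.Split returns only the pieces between the separators.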
Another idea is to match and capture these strings using
/?([^~]*(?:~[^~]*){3}~\d+)
See this regex demo.
Details
/? - 1 or 0 / chars
([^~]*(?:~[^~]*){3}~\d+) - Group 1 (what you need to grab):
[^~]* - zero or more chars other than ~
(?:~[^~]*){3} - exactly 3 sequences of ~ and then 0+ chars other than ~
~\d+ - a ~ and then 1 or more digits.
The C# code will look like
var results = Regex.Matches(s, @"/?([^~]*(?:~[^~]*){3}~\d+)")
.Cast<Match>()
.Select(m => m.Groups[1].Value)
.ToList();
NOTE: By default, \d matches all Unicode digits. If you do not want this behavior, use the RegexOptions.ECMAScript option, or replace \d with [0-9] to only match ASCII digits.
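To make the note about Unicode digits concrete, here is a small sketch that tests a single Arabic-Indic digit (U+0663) against the three variants mentioned. The class name DigitClasses and its helper methods are assumptions for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DigitClasses
{
    // Default .NET behavior: \d matches any Unicode decimal digit.
    public static bool MatchesDefault(string s)
    {
        return Regex.IsMatch(s, @"\d");
    }

    // With RegexOptions.ECMAScript, \d is restricted to ASCII 0-9.
    public static bool MatchesEcmaScript(string s)
    {
        return Regex.IsMatch(s, @"\d", RegexOptions.ECMAScript);
    }

    // An explicit ASCII range behaves the same regardless of options.
    public static bool MatchesAsciiRange(string s)
    {
        return Regex.IsMatch(s, "[0-9]");
    }

    public static void Main()
    {
        string arabicIndicThree = "\u0663";
        Console.WriteLine(MatchesDefault(arabicIndicThree));    // True
        Console.WriteLine(MatchesEcmaScript(arabicIndicThree)); // False
        Console.WriteLine(MatchesAsciiRange(arabicIndicThree)); // False
    }
}
```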

RegExp multiple matches in text

I want to write a regexp to get multiple matches of the first character and next three digits. Some valid examples:
A123,
V322,
R333.
I tried something like this:
[a-zA-Z](1)\d3
but it gets me just the first match!
Could you show me how to rewrite this regexp to get multiple results? Thank you so much and have a nice day!
Your regex does not work because it matches:
[a-zA-Z] - an ASCII letter, then
(1) - a 1 digit (and puts into a capture)
\d - any 1 digit
3 - a 3 digit.
So, it matches Y193, E103, etc., even in longer phrases, where Y and E are not first letters.
You need to use a word boundary and fix your pattern as
\b[a-zA-Z][0-9]{3}
NOTE: if you need to match it as a whole word, add \b at the end: \b[a-zA-Z][0-9]{3}\b.
See the regex demo.
Details:
\b - leading word boundary
[a-zA-Z] - an ASCII letter
[0-9]{3} - 3 digits.
C# code:
var results = Regex.Matches(s, @"\b[a-zA-Z][0-9]{3}")
.Cast<Match>()
.Select(m => m.Value)
.ToList();
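The effect of the optional trailing \b can be seen on an input that contains a letter followed by four digits. This is a sketch only; the class LetterAndDigits, the wholeWord flag, and the sample string are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class LetterAndDigits
{
    // wholeWord = true appends \b so the three digits must end the word.
    public static List<string> Find(string s, bool wholeWord)
    {
        string pattern = wholeWord
            ? @"\b[a-zA-Z][0-9]{3}\b"
            : @"\b[a-zA-Z][0-9]{3}";

        return Regex.Matches(s, pattern)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToList();
    }

    public static void Main()
    {
        string s = "A123, V322, R333. X4567";

        // Without the trailing \b, the prefix X456 of X4567 is also matched.
        Console.WriteLine(string.Join(" ", Find(s, false))); // A123 V322 R333 X456
        Console.WriteLine(string.Join(" ", Find(s, true)));  // A123 V322 R333
    }
}
```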

Regex for both negative and positive values in dash-separated string

I'm reading weight and dimension dash-separated values from serial port.
This is what incoming data look like right now:
-15.0cm-47.8cm-83.1cm: 0.115 kg
And this is my pattern for it
@"(\d+\.\d+)"
However, sometimes one of those values can be negative as well, for example
--15.0cm-47.8cm--83.1cm: 0.115 kg.
My question is how I can get both negative and positive values at the same time? My expected output for the above string is [ "-15.0", "47.8", "-83.1", "0.115"].
You may use a lookbehind pattern to make sure there is a "dash" before another one (that will get consumed, i.e. added to the match value):
(?:(?<=-)-)?\d+\.\d+
See the regex demo against a --15.0cm-47.8cm--83.1cm: 0.115 kg string:
Here, (?:(?<=-)-)? is an optional non-capturing group that matches a - that is preceded with another -. The \d+\.\d+ matches 1+ digits, . and again 1 or more digits.
C# code:
var results = Regex.Matches(str, @"(?:(?<=-)-)?\d+\.\d+")
.Cast<Match>()
.Select(m => m.Value)
.ToList();
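If the readings are needed as numbers rather than strings, the matches can be parsed afterwards. The sketch below is one possible way to do that, assuming the values always use a dot as the decimal separator; the class SerialReadings and the method ParseValues are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

public static class SerialReadings
{
    // Extracts the signed decimal values and parses them with the
    // invariant culture so "." is always the decimal separator.
    public static List<double> ParseValues(string str)
    {
        return Regex.Matches(str, @"(?:(?<=-)-)?\d+\.\d+")
            .Cast<Match>()
            .Select(m => double.Parse(m.Value, CultureInfo.InvariantCulture))
            .ToList();
    }

    public static void Main()
    {
        foreach (double value in ParseValues("--15.0cm-47.8cm--83.1cm: 0.115 kg"))
        {
            Console.WriteLine(value.ToString(CultureInfo.InvariantCulture));
        }
    }
}
```

A single dash is treated as a separator, so only a dash that directly follows another dash is read as a minus sign.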

Numeric substrings between dots

I am trying to make a regex that finds substrings that start with a dot (.), contain only digits, and end either with another dot or at the end of the string.
To clarify, here are a few examples:
abc.123.ds => 123
aAsd.12sd.SAs.32.asd.3123 => 32 and 3123
111.2e2 => no result
aaa.bbb.13.320.a => 13 and 320
I tried different approaches; the closest I came to a result is "^[.][0-9]+\.?$", but it still fails.
Any tips would be greatly appreciated.
The ^[.][0-9]+\.?$ pattern fails because ^ forces the pattern to match at the start of the string and $ makes it match at the end of the string (so it must cover the full string). In addition, the \.? at the end, when matched, consumes the . and will not let you match an adjacent number that needs that same dot in front of it.
I suggest using lookarounds:
(?<=\.)[0-9]+(?=\.|$)
See the regex demo
Details:
(?<=\.) - there must be a . immediately to the left of the current position
[0-9]+ - 1+ digits
(?=\.|$) - there must be a . or end of string immediately to the right of the current location.
C#:
var res = Regex.Matches(str, @"(?<=\.)[0-9]+(?=\.|$)")
.Cast<Match>()
.Select(m => m.Value)
.ToList();
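As a quick check, the lookaround pattern can be run against the four examples from the question. The class name DotNumbers and the method Find are assumptions made for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class DotNumbers
{
    // Digits that have a dot on the left and a dot or end of string on the right.
    public static List<string> Find(string str)
    {
        return Regex.Matches(str, @"(?<=\.)[0-9]+(?=\.|$)")
            .Cast<Match>()
            .Select(m => m.Value)
            .ToList();
    }

    public static void Main()
    {
        string[] samples =
        {
            "abc.123.ds",
            "aAsd.12sd.SAs.32.asd.3123",
            "111.2e2",
            "aaa.bbb.13.320.a"
        };

        foreach (string sample in samples)
        {
            Console.WriteLine(sample + " => [" + string.Join(", ", Find(sample)) + "]");
        }
    }
}
```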
Remove the beginning-of-line anchor and use an alternation for the end:
\.[0-9]+(\.|$)
It is pretty simple using capturing groups:
int[] result = Regex.Matches(str, @"\.(\d+)\.?").Cast<Match>().Select(x => int.Parse(x.Groups[1].Value)).ToArray();
Groups[0] is your entire match
\.(\d+)\.?
Groups[1] is the first parenthesized expression
\d+
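One thing to watch with patterns that consume the trailing dot: when two numbers share a single dot, the second one can be skipped, because that dot is already part of the previous match. The sketch below compares the consuming pattern \.[0-9]+(\.|$) with a lookaround-based pattern on one of the question's examples; the class OverlapDemo and its method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class OverlapDemo
{
    // Consuming variant: the digits are read from capture group 1.
    public static List<string> FindConsuming(string str)
    {
        return Regex.Matches(str, @"\.([0-9]+)(\.|$)")
            .Cast<Match>()
            .Select(m => m.Groups[1].Value)
            .ToList();
    }

    // Lookaround variant: the dots are checked but not consumed.
    public static List<string> FindWithLookarounds(string str)
    {
        return Regex.Matches(str, @"(?<=\.)[0-9]+(?=\.|$)")
            .Cast<Match>()
            .Select(m => m.Value)
            .ToList();
    }

    public static void Main()
    {
        string str = "aaa.bbb.13.320.a";
        Console.WriteLine(string.Join(", ", FindConsuming(str)));       // 13
        Console.WriteLine(string.Join(", ", FindWithLookarounds(str))); // 13, 320
    }
}
```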

Regular expression - second group of digits

Hi, I would like to pull the second group of digits, the one after the dash (-), from the string below:
D:\data\home\Logs_Audit\VO12_LAB_20140617-000301.txt
I used \d{8} to pull 20140617, but now I want to pull 000301.
EDIT 1:
Now I would like to pull VO12_LAB from the above string. Could you please help me?
I am not good at regular expressions and I haven't found a good tutorial to understand them.
EDIT 2:
I found that something like
\w{2,3}\d{2,3}_\w{2,3}
works for me. Do you think it is accurate enough?
You can use lookahead/lookbehind to find the group based on "anchors", like this:
(?<=[-])\\d+(?=[.]txt)
The groups before and after the \\d+ are non-capturing zero-width "markers", in the sense that they do not consume any characters from the string, only describe character combinations that need to precede and/or follow the text that you would like to match.
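A minimal sketch of the anchor idea, written with C# verbatim strings so the backslashes do not need doubling. The class LogFileName and its two methods are hypothetical; the path is the one from the question, and \d{8} is the pattern the asker already used for the date.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LogFileName
{
    // The first run of exactly eight digits, as used in the question.
    public static string ExtractDate(string path)
    {
        return Regex.Match(path, @"\d{8}").Value;
    }

    // Digits that sit between a dash and the ".txt" extension.
    public static string ExtractTime(string path)
    {
        return Regex.Match(path, @"(?<=-)\d+(?=\.txt)").Value;
    }

    public static void Main()
    {
        string path = @"D:\data\home\Logs_Audit\VO12_LAB_20140617-000301.txt";
        Console.WriteLine(ExtractDate(path)); // 20140617
        Console.WriteLine(ExtractTime(path)); // 000301
    }
}
```

Match.Value is an empty string when nothing matches, so callers that need to distinguish that case should check Match.Success instead.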
You can use a Positive Lookahead for this.
\d+(?=\.)
Explanation: This matches digits (1 or more times) that are followed by a dot .
\d+ digits (0-9) (1 or more time)
(?= look ahead to see if there is:
\. '.'
) end of look-ahead
Live Demo
Final Solution:
String s = @"D:\data\home\Logs\V_LAB_20140617-000301.txt";
Match m = Regex.Match(s, #"\d+(?=\.)");
if (m.Success) {
Console.WriteLine(m.Value); //=> "000301"
}
You can use this regex:
(?<=-)(\d+)
The first group will contain the digits.
Live Demo
