Numeric substrings between dots

I am trying to write a regex that finds substrings that start with a dot (.), contain only digits, and end either with another dot or at the end of the string.
To clarify, here are a few examples:
abc.123.ds => 123
aAsd.12sd.SAs.32.asd.3123 => 32 and 3123
111.2e2 => no result
aaa.bbb.13.320.a => 13 and 320
I tried different approaches. The closest I came to a result is "^[.][0-9]+\.?$", but it still fails.
Any tips would be greatly appreciated.

The ^[.][0-9]+\.?$ pattern fails because ^ forces the match to begin at the start of the string and $ forces it to end at the end of the string, so only the full string can match. In addition, the \.? at the end, when it matches, consumes the . and so prevents matching an adjacent number that needs that same dot in front of it.
I suggest using lookarounds:
(?<=\.)[0-9]+(?=\.|$)
See the regex demo
Details:
(?<=\.) - there must be a . immediately to the left of the current position
[0-9]+ - 1+ digits
(?=\.|$) - there must be a . or end of string immediately to the right of the current location.
C#:
var res = Regex.Matches(str, @"(?<=\.)[0-9]+(?=\.|$)")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToList();
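As a quick check, the lookaround pattern can be run over the four inputs from the question. This is only a sketch: the helper name NumbersBetweenDots is made up for illustration, and the snippet assumes C# 9 or later so that it can be written as top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the answer): returns every all-digit segment
// that has a dot on its left and a dot or the end of the string on its right.
static List<string> NumbersBetweenDots(string input) =>
    Regex.Matches(input, @"(?<=\.)[0-9]+(?=\.|$)")
        .Cast<Match>()
        .Select(m => m.Value)
        .ToList();

var samples = new[] { "abc.123.ds", "aAsd.12sd.SAs.32.asd.3123", "111.2e2", "aaa.bbb.13.320.a" };
foreach (var s in samples)
{
    var found = NumbersBetweenDots(s);
    var shown = found.Count == 0 ? "no result" : string.Join(" and ", found);
    Console.WriteLine(s + " => " + shown);
}
```

Because the lookarounds do not consume the dots, the dot between 13 and 320 can serve as the right boundary of one match and the left boundary of the next.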

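Remove the beginning-of-line anchor and use an alternation for the end:
\.[0-9]+(\.|$)
Note that each match includes the dots, and because the trailing dot is consumed, two directly adjacent numbers (such as 13 and 320 in .13.320.) cannot both match.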

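It is pretty simple using a capturing group:
int[] result = Regex.Matches(str, @"\.(\d+)(?=\.|$)")
    .Cast<Match>()
    .Select(x => int.Parse(x.Groups[1].Value))
    .ToArray();
Groups[0] is the entire match:
\.(\d+)
Groups[1] is the first parenthesized subexpression:
\d+
The lookahead (?=\.|$) checks for the closing dot or the end of the string without consuming it, so adjacent numbers such as .13.320. are both found.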

Related

How can find format number by Regex?

This is my text: 0.123.456Vaaa.789.V
I want to find: 123.456V
I am using this pattern in C#: \.[0-9]*[\.]?[0-9]*V
But it returns 2 values: 123.456V and 789.V
I don't want to match the case with nothing after the ".", such as 789.V
How can I fix my pattern?
Thank you.

In your pattern, [\.]? does not need to be a character class, and inside a character class the dot does not need to be escaped. I suggest writing the optional dot as \.?, which is the least ambiguous form. The [0-9]* after the optional dot matches zero or more digits, which is why you get the unexpected 789.V match.
You do not seem to need the \. at the start, either.
You can use
[0-9]*\.?[0-9]+V
See the .NET regex demo.
Details:
[0-9]* - zero or more ASCII digits
\.? - an optional .
[0-9]+ - one or more digits
V - a V char.
See a C# regex demo:
var results = Regex.Matches(text, @"[0-9]*\.?[0-9]+V")
    .Cast<Match>()
    .Select(x => x.Value)
    .ToList();
// => 123.456V
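To see the difference between the two patterns side by side, a small sketch like this can help. The FindAll helper is a made-up name, and the snippet assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: collects the values of all matches of a pattern.
static string[] FindAll(string input, string pattern) =>
    Regex.Matches(input, pattern)
        .Cast<Match>()
        .Select(m => m.Value)
        .ToArray();

var text = "0.123.456Vaaa.789.V";

// Pattern from the question: digits after the optional dot are optional,
// so a segment like ".789.V" is accepted as well.
var original = FindAll(text, @"\.[0-9]*[\.]?[0-9]*V");

// Suggested pattern: at least one digit is required right before V.
var corrected = FindAll(text, @"[0-9]*\.?[0-9]+V");

Console.WriteLine("original:  " + string.Join(" | ", original));
Console.WriteLine("corrected: " + string.Join(" | ", corrected));
```

With this input the question's pattern produces two matches (each including its leading dot), while the suggested pattern produces only 123.456V.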

I think the simplest solution would be:
\d+\.\d+V
meaning you want to find some arbitrary number of digits, followed by a dot, followed by more digits, followed by the letter V.

C# Regex to obtain string up until a pattern

I've always been really bad when it comes to using regular expressions but it is something I want to seriously understand because as we all know, it is quite useful.
This is for a personal project, to keep my folders organized and neat.
I have a bunch of folders with the following naming pattern XXXXXXXX.XXXXXXX.XXXXXX.SYY.EYY.SOMETHINGELSE
There can be any amount of X repeating separated by ".", but the SYY.EYY is always there. So what I want is a regular expression to retrieve all the text represented by XXX without the "." if possible up until the SYY.EYY pattern.
I managed to detect the pattern because YY are always numbers, so something like \d{2} will detect it, but I'm wondering if it's possible to also add the rest of the pattern to that \d{2}.
Any help is appreciated :)

If YY is, as you stated, 2 digits and you want the text without the . up until, for example, S11.E22, you could use the \G anchor and a capturing group to get each part without the dot.
The value is in the Match.Groups property.
\G(?!S[0-9]{2}\.E[0-9]{2})([^.]+)\.
In parts
\G Assert position at the end of previous match (start at the beginning)
(?! Negative lookahead, assert what is directly to the right is not
S[0-9]{2}\.E[0-9]{2} Match S, 2 digits, . E and 2 digits
) Close lookahead
( Capture group 1
[^.]+ Match 1+ times any char except a dot
) Close group 1
\. Match dot literal
Regex demo | C# demo
For example
string pattern = @"\G(?!S[0-9]{2}\.E[0-9]{2})([^.]+)\.";
string input = @"XXXXXXXX.XXXXXXX.XXXXXX.S11.E22.SOMETHINGELSE";
foreach (Match m in Regex.Matches(input, pattern))
{
    Console.WriteLine(m.Groups[1].Value);
}
Output
XXXXXXXX
XXXXXXX
XXXXXX
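If the goal is a single cleaned-up name rather than separate parts, the captured groups can be joined afterwards. The following is a minimal sketch under that assumption; the helper name TitleBeforeEpisode and the choice of a space as separator are illustrative only, and the snippet assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: joins the dot-separated parts that come before
// the SYY.EYY marker, using a space as the separator.
static string TitleBeforeEpisode(string folderName)
{
    var parts = Regex.Matches(folderName, @"\G(?!S[0-9]{2}\.E[0-9]{2})([^.]+)\.")
        .Cast<Match>()
        .Select(m => m.Groups[1].Value); // group 1 holds the part without the dot
    return string.Join(" ", parts);
}

Console.WriteLine(TitleBeforeEpisode("XXXXXXXX.XXXXXXX.XXXXXX.S11.E22.SOMETHINGELSE"));
// prints: XXXXXXXX XXXXXXX XXXXXX
```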

You can "replace/cut" the "." with C#.
The regex to get up until the SYY.EYY can be like this:
.SYY.EYY$
Line ends with word -> Regex: ExampleWord$

I would do something like:
var leftPart = Regex.Match(x, @"^.*?(?=S\d{2}\.E\d{2})").Value;
// this now has XXXXXXXX.XXXXXXX.XXXXXX.
// And we can:
var left = leftPart.Replace(".", " "); // or any other char

C# equivalent for this regex pattern

I have this regular expression pattern: .{2}\@.{2}\K|\..*(*SKIP)(?!)|.(?=.*\.)
It works perfectly for replacing the matches to get
trabc@abtrec.com.lo => ***bc@ab*****.com.lo
demomail@demodomain.com => ******il@de*********.com
But when I try to use it in C#, the \K, (*SKIP) and (*F) constructs are not allowed.
What would be the C# version of this pattern? Or do you know a simpler way to mask the email without the unsupported constructs?
Demo
UPDATE:
(*SKIP): this verb causes the match to fail at the current starting position in the subject if the rest of the pattern does not match
(*F): Forces a matching failure at the given position in the pattern (the same as (?!))

Try this regex:
\w(?=.{2,}@)|(?<=@[^\.]{2,})\w
Click for Demo
Explanation:
\w - matches a word character
(?=.{2,}@) - positive lookahead to find the position immediately followed by 2+ occurrences of any character followed by @
| - OR
(?<=@[^\.]{2,}) - positive lookbehind to find the position immediately preceded by @ followed by 2+ occurrences of any character that is not a .
\w - matches a word character.
Replace each match with a *
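In C# the replacement step could look like the sketch below. The MaskEmail name and the sample address are made up for illustration, and the snippet assumes C# 9 or later for top-level statements. The variable-length lookbehind used here is supported by the .NET regex engine.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: replaces every word character matched by the
// lookaround pattern with a single asterisk.
static string MaskEmail(string email) =>
    Regex.Replace(email, @"\w(?=.{2,}@)|(?<=@[^\.]{2,})\w", "*");

// Made-up sample address, not one of the inputs from the question.
Console.WriteLine(MaskEmail("username@example.com"));
// prints: ******me@ex*****.com
```

The last two characters before the @ and the first two after it stay visible, and everything from the first dot of the domain onward is left untouched because the lookbehind cannot cross a dot.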

You can achieve the same result with a regex that matches items in one block, and applying a custom match evaluator:
var res = Regex.Replace(
    s,
    @"^.*(?=.{2}\@.{2})|(?<=.{2}\@.{2}).*(?=\.com.*$)",
    match => new string('*', match.Length)
);
The regex has two parts:
The one on the left, ^.*(?=.{2}\@.{2}), matches the user name portion except the last two characters
The one on the right, (?<=.{2}\@.{2}).*(?=\.com.*$), matches the rest of the domain after its first two characters, up to the ".com..." ending.
Demo.

RegExp multiply matches in text

I want to write a regexp to get multiple matches of the first character and next three digits. Some valid examples:
A123,
V322,
R333.
I tried something like this:
[a-zA-Z](1)\d3
but it gets me just the first match!
Could you show me how to rewrite this regexp to get multiple results? Thank you so much and have a nice day!

Your regex does not work because it matches:
[a-zA-Z] - an ASCII letter, then
(1) - a 1 digit (and puts it into a capture)
\d - any 1 digit
3 - a 3 digit.
So, it matches Y193, E103, etc., even in longer phrases, where Y and E are not first letters.
You need to use a word boundary and fix your pattern as
\b[a-zA-Z][0-9]{3}
NOTE: if you need to match it as a whole word, add \b at the end: \b[a-zA-Z][0-9]{3}\b.
See the regex demo.
Details:
\b - leading word boundary
[a-zA-Z] - an ASCII letter
[0-9]{3} - 3 digits.
C# code:
var results = Regex.Matches(s, @"\b[a-zA-Z][0-9]{3}")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToList();
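A short sketch of the whole-word variant, run against a made-up input that mixes valid codes with near misses. The FindCodes name and the sample text are assumptions for illustration, and the snippet assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: finds every standalone "letter + three digits" token.
static List<string> FindCodes(string s) =>
    Regex.Matches(s, @"\b[a-zA-Z][0-9]{3}\b")
        .Cast<Match>()
        .Select(m => m.Value)
        .ToList();

// "XY193" is rejected by the leading \b, "B4567" by the trailing \b.
var codes = FindCodes("A123, V322, R333. XY193 B4567");
Console.WriteLine(string.Join(", ", codes));
// prints: A123, V322, R333
```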

Regex for both negative and positive values in dash-separated string

I'm reading weight and dimension dash-separated values from serial port.
This is what incoming data look like right now:
-15.0cm-47.8cm-83.1cm: 0.115 kg
And this is my pattern for it
@"(\d+\.\d+)"
However, sometimes one of those values can be negative as well, for example
--15.0cm-47.8cm--83.1cm: 0.115 kg.
My question is how I can get both negative and positive values at the same time? My expected output for the above string is [ "-15.0", "47.8", "-83.1", "0.115"].

You may use a lookbehind pattern to make sure there is a "dash" before another one (that will get consumed, i.e. added to the match value):
(?:(?<=-)-)?\d+\.\d+
See the regex demo against a --15.0cm-47.8cm--83.1cm: 0.115 kg string:
Here, (?:(?<=-)-)? is an optional non-capturing group that matches a - that is preceded with another -. The \d+\.\d+ matches 1+ digits, . and again 1 or more digits.
C# code:
var results = Regex.Matches(str, @"(?:(?<=-)-)?\d+\.\d+")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToList();
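If the values are needed as numbers rather than strings, the matches can be parsed afterwards. The following is a sketch only: the ReadValues name is made up, parsing with the invariant culture is an assumption (so that "." is always the decimal separator regardless of the machine's locale), and the snippet assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: extracts the signed decimal values from one line.
static double[] ReadValues(string line) =>
    Regex.Matches(line, @"(?:(?<=-)-)?\d+\.\d+")
        .Cast<Match>()
        // InvariantCulture keeps "." as the decimal separator on any locale.
        .Select(m => double.Parse(m.Value, CultureInfo.InvariantCulture))
        .ToArray();

var values = ReadValues("--15.0cm-47.8cm--83.1cm: 0.115 kg");
foreach (var v in values)
{
    Console.WriteLine(v.ToString(CultureInfo.InvariantCulture));
}
```

For the sample line this yields four values: two negative (-15.0 and -83.1) and two positive (47.8 and 0.115).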
