Match numbers that are not in the context of Value(x)

I am trying to match the numbers that are not in the context of Value(X) and discard the rest of the text.
Example text:
lorem ipsum Value (3) dfasdf 654345435ds sdfsdf asdf
asd
F
asdf
sad Value (2)
Example Regex:
Value\((\d)\)
Thanks for help.

The .NET regex engine supports quantifiers inside a lookbehind assertion.
What you might do is assert that, from the current position, there is no Value( directly to the left with 1+ digits and ) to the right. If that is the case, match 1 or more digits.
The pattern matches:
(?<!\bValue[\p{Zs}\t]*\((?=[0-9]+\)))[0-9]+
(?<! Negative lookbehind, assert that what is to the left is not
\bValue Match Value preceded by a word boundary to prevent a partial match
[\p{Zs}\t]*\( Match optional horizontal spaces followed by (
(?=[0-9]+\)) Positive lookahead, assert 1+ digits followed by ) to the right
) Close lookbehind
[0-9]+ Match 1+ digits 0-9
Note that in .NET \d matches not only 0-9 but also decimal digits from other scripts (unless RegexOptions.ECMAScript is set). If you want to match those as well, use \d; otherwise use [0-9].
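As a rough sketch of how the pattern above behaves on the question's sample text, the snippet below runs it through Regex.Matches. The class and member names (ValueContextDemo, FromAnswer, MultiDigitVariant, Find) are hypothetical. The second pattern is not from the answer: it is an assumed variant that adds [0-9]* inside the lookbehind, because the original only rejects the first digit after the parenthesis, so "Value (23)" would still yield "3".

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class; only the first pattern comes from the answer above.
public static class ValueContextDemo
{
    // Pattern from the answer: skip digits that directly follow "Value(" and are closed by ")".
    public static readonly Regex FromAnswer =
        new Regex(@"(?<!\bValue[\p{Zs}\t]*\((?=[0-9]+\)))[0-9]+");

    // Assumed variant: [0-9]* inside the lookbehind also rejects the second and later
    // digits of a multi-digit value such as "Value (23)".
    public static readonly Regex MultiDigitVariant =
        new Regex(@"(?<!\bValue[\p{Zs}\t]*\([0-9]*(?=[0-9]*\)))[0-9]+");

    // Returns every matched number as a string, in order of appearance.
    public static string[] Find(Regex rx, string text)
    {
        return rx.Matches(text).Cast<Match>().Select(m => m.Value).ToArray();
    }
}
```

With the sample text from the question, both patterns return only 654345435. They differ on input like "Value (23) 7", where the answer's pattern returns 3 and 7 while the variant returns only 7.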

You are looking for:
(?<!Value *\()\d+
Note that I am assuming that every Value( has a closing bracket.
Explanation:
(?<!Value *\() asserts that what follows it is not preceded by "Value(", "Value (", "Value  (" and so on.
\d+ matches one or more digits.

Something like this ought to do you:
private static readonly Regex rx = new Regex(@"
    (?<!        # A zero-width negative look-behind assertion, consisting of:
      \b        # - a word boundary, followed by
      Value     # - the literal 'Value', followed by
      \s*       # - zero or more whitespace characters, followed by
      [(]       # - a left parenthesis '(', followed by
      \s*       # - zero or more whitespace characters,
    )           # The whole of which is followed by
    (           # A number, consisting of
      -?        # - an optional minus sign, followed by
      \d+       # - 1 or more decimal digits,
    )           # The whole of which is followed by
    (?!         # A zero-width negative look-ahead assertion, consisting of
      \s*       # - zero or more whitespace characters, followed by
      [)]       # - a single right parenthesis ')'
    )           #
    ",
    rxOpts
);
private const RegexOptions rxOpts = RegexOptions.IgnoreCase
                                  | RegexOptions.ExplicitCapture
                                  | RegexOptions.IgnorePatternWhitespace
                                  ;
Then . . .
foreach ( Match m in rx.Matches( someText ) )
{
    string nbr = m.Value;
    Console.WriteLine("Found '{0}'", nbr);
}

Related

How would I write a regular expression to match numeric or alphanumeric words, but not words without numbers?

This will execute in the C# Regex Engine, in the .Net Framework 4.7.2.
I need a Regular Expression to search strings for "words" that match the following properties:
A numeric value, such as 1234, or 10.00
An alphanumeric value, such as ABC123 or ABC10.00
NOT an alpha-only value, such as cat or CAT
Matches separated by any non alpha-numeric character.
Matches: "123", "ABC123", "abc123", "10.00", "ABC.123", "Foo10.00"
Non-matches: "sugar", "rush", "XYZ"
In the following example string, the matches I want are in bold-italic:
789|--|789 ABC 123 10.00 ABC123 123ABC ABC123ABC abc.123.abc
I am currently using the following regex, but it is just an aggregation of all the special cases, and doesn't cover fully-complex cases. There must be a more efficient way to write this:
(?<=^|[\W])(?:[\d]+[A-Za-z]{1,}|[A-Za-z]+[\d]{1,}|[\d]+[.]+[\d]{1,}|[\d]{1,})(?=$|[\W])
This regex will match most of the examples above, but it will not match any value where we toggle from numbers to letters and back, or vice-versa, like this: A1B2C3D4.
To test: https://regex101.com/r/oeSg10/1

You may use
(?xi) # Enable free-spacing and case insensitive mode
\b # Word boundary
(?=[A-Z.]*[0-9]) # After any 0+ letters/dots there must be a digit
[A-Z0-9]+ # 1+ letters or digits
(?:\.[A-Z0-9]+)* # 0+ repetitions of a . and then 1+ letters/digits
\b # Word boundary
See the regex demo at regex101.com and a .NET regex demo showing it really works in a .NET environment.
In C# code, you may use
var Pattern = new Regex(@"
\b # Word boundary
(?=[A-Z.]*[0-9]) # After any 0+ letters/dots there must be a digit
[A-Z0-9]+ # 1+ letters or digits
(?:\.[A-Z0-9]+)* # 0+ repetitions of a . and then 1+ letters/digits
\b # Word boundary",
RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace);
where (?x) = RegexOptions.IgnorePatternWhitespace and (?i) = RegexOptions.IgnoreCase.
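To see which words the pattern picks out of the question's example string, here is a minimal sketch that uses a single-line form of the same pattern with RegexOptions.IgnoreCase. The class and method names (AlphaNumericWords, Find) are hypothetical and only serve the illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the pattern from the answer above.
public static class AlphaNumericWords
{
    // Same pattern as above, written on one line without free-spacing comments.
    private static readonly Regex Rx = new Regex(
        @"\b(?=[A-Z.]*[0-9])[A-Z0-9]+(?:\.[A-Z0-9]+)*\b",
        RegexOptions.IgnoreCase);

    // Returns all numeric or alphanumeric "words" found in the text.
    public static string[] Find(string text)
    {
        return Rx.Matches(text).Cast<Match>().Select(m => m.Value).ToArray();
    }
}
```

Calling AlphaNumericWords.Find on the example string from the question should return every token except ABC, and a mixed token such as A1B2C3D4 is matched as a whole.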

Regex to get square brackets containing numbers only but are not within square brackets themselves

Sample String
"[] [ds*[000112]] [1448472995] sample string [1448472995] ***";
The regex should match
[1448472995] [1448472995]
and should not match [000112] since there is outer square bracket.
Currently I have this regex that is matching [000112] as well
const string unixTimeStampPattern = @"\[([0-9]+)]";

This is a good way to do it using balanced text.
( \[ \d+ \] ) # (1)
| # or,
\[ # Opening bracket
(?> # Then either match (possessively):
[^\[\]]+ # non - brackets
| # or
\[ # [ increase the bracket counter
(?<Depth> )
| # or
\] # ] decrease the bracket counter
(?<-Depth> )
)* # Repeat as needed.
(?(Depth) # Assert that the bracket counter is at zero
(?!)
)
\] # Closing bracket
C# sample
string sTestSample = "[] [ds*[000112]] [1448472995] sample string [1448472995] ***";
Regex RxBracket = new Regex(@"(\[\d+\])|\[(?>[^\[\]]+|\[(?<Depth>)|\](?<-Depth>))*(?(Depth)(?!))\]");
Match bracketMatch = RxBracket.Match(sTestSample);
while (bracketMatch.Success)
{
if (bracketMatch.Groups[1].Success)
Console.WriteLine("{0}", bracketMatch);
bracketMatch = bracketMatch.NextMatch();
}
Output
[1448472995]
[1448472995]

You need to use balancing groups to handle this - it looks a bit daunting but isn't all that complicated:
Regex regexObj = new Regex(
@"\[ # Match opening bracket.
\d+ # Match a number.
\] # Match closing bracket.
(?= # Assert that the following can be matched ahead:
(?> # The following group (made atomic to avoid backtracking):
[^\[\]]+ # One or more characters except brackets
| # or
\[ (?<Depth>) # an opening bracket (increase bracket counter)
| # or
\] (?<-Depth>) # a closing bracket (decrease bracket counter, can't go below 0).
)* # Repeat ad libitum.
(?(Depth)(?!)) # Assert that the bracket counter is now zero.
[^\[\]]* # Match any remaining non-bracket characters
\z # until the end of the string.
) # End of lookahead.",
RegexOptions.IgnorePatternWhitespace);
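A short usage sketch for the lookahead-based pattern above, run against the sample string from the question. The pattern is the same one written on a single line; the class and method names (TopLevelBrackets, Find) are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical wrapper; the pattern is the single-line form of the one above.
public static class TopLevelBrackets
{
    private static readonly Regex Rx = new Regex(
        @"\[\d+\](?=(?>[^\[\]]+|\[(?<Depth>)|\](?<-Depth>))*(?(Depth)(?!))[^\[\]]*\z)");

    // Returns bracketed numbers that are followed only by balanced brackets
    // up to the end of the string, i.e. numbers not nested in another pair.
    public static string[] Find(string text)
    {
        return Rx.Matches(text).Cast<Match>().Select(m => m.Value).ToArray();
    }
}
```

On the sample string this should return the two [1448472995] entries and skip [000112], because the unmatched ] that follows [000112] cannot decrease an empty bracket counter, so the lookahead never reaches the end of the string.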

Are you just trying to capture the Unix time stamp? Then you can try a simpler one where you specify the exact number of characters matched in a group.
\[([0-9]{10})\]
Here I limit it to 10 characters since I doubt the time stamp will hit 11 characters anytime soon... To protect against that:
\[([0-9]{10,11})\]
Of course this could lead to false positives if you have a 10-length number in an enclosing bracket.

This will match your expression as expected: http://regexr.com/3csg3 (it uses a lookahead).

Which regular expression will let me match just the first and last letters?

Examples:
i General Biology i
i General Biology
General Biology i
I need to catch any phrase that begins with a single letter or number, ends with a letter or number, or both begins and ends with a single letter or number so that I can pre-parse the data to this:
General Biology
I've tried tons of examples on Rubular but can't seem to figure this one out. I've used literal match groups to get those characters but I don't want the match groups per se I literally just want the regex to only capture those two letters.

You can use the following to achieve this:
String result = Regex.Replace(input, @"(?i)^[a-z0-9]\s+|\s+[a-z0-9]$", "");
Explanation:
This removes a single letter/number at the beginning/end of the string followed or preceded by whitespace.
(?i) # set flags for this block (case-insensitive)
^ # the beginning of the string
[a-z0-9] # any character of: 'a' to 'z', '0' to '9'
\s+ # whitespace (\n, \r, \t, \f, and " ") (1 or more times)
| # OR
\s+ # whitespace (\n, \r, \t, \f, and " ") (1 or more times)
[a-z0-9] # any character of: 'a' to 'z', '0' to '9'
$ # before an optional \n, and the end of the string
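The replacement can be checked against the three example phrases with a small sketch like the one below. The class and method names (EdgeTokenTrimmer, Trim) are hypothetical; the pattern is the one from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the Regex.Replace call from the answer above.
public static class EdgeTokenTrimmer
{
    // Removes a single leading and/or trailing letter or digit
    // together with the whitespace that separates it from the phrase.
    public static string Trim(string input)
    {
        return Regex.Replace(input, @"(?i)^[a-z0-9]\s+|\s+[a-z0-9]$", "");
    }
}
```

For each of the inputs "i General Biology i", "i General Biology" and "General Biology i", Trim returns "General Biology"; a phrase without a single-character token at either end is returned unchanged.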

Regex to capture parenthesis with hash tag?

So far I have this perfectly working regex:
(?:(?<=\s)|^)#(\w*[A-Za-z_]+\w*)
It finds any word that starts with a hash tag (ex. #lolz but not hsshs#jdjd)
The problem is I also want it to match parenthesis. So if I have this it will match:
(#lolz wow)
or
(wow #cool)
or
(#cool)
Any idea on how can I make or use my regex to work like that?

The following seemed to work for me ...
\(?#(\w*[A-Za-z_]+\w*)\)?

The way you are using the following in context is overkill..
\w*[A-Za-z_]\w*
\w alone matches word characters ( a-z, A-Z, 0-9, _ ). And it is not necessary for the use of the non-capturing group (?: to be wrapped around your lookbehind assertion here.
I do believe that the following would suffice by itself.
(?<=^|\s)\(?#(\w+)\)?
Regular expression:
(?<= look behind to see if there is:
^ the beginning of the string
| OR
\s whitespace (\n, \r, \t, \f, and " ")
) end of look-behind
\(? '(' (optional (matching the most amount possible))
# '#'
( group and capture to \1:
\w+ word characters (a-z, A-Z, 0-9, _) (1 or more times)
) end of \1
\)? ')' (optional (matching the most amount possible))
You can also use a negative lookbehind here if you wanted to.
(?<![^\s])\(?#(\w+)\)?
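A minimal C# sketch of the negative-lookbehind version, assuming the tag name is read from capture group 1. The class and method names (HashTags, Find) and the test sentence are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the negative-lookbehind pattern above.
public static class HashTags
{
    private static readonly Regex Rx = new Regex(@"(?<![^\s])\(?#(\w+)\)?");

    // Returns the tag names (group 1), without '#' or surrounding parentheses.
    public static string[] Find(string text)
    {
        return Rx.Matches(text).Cast<Match>().Select(m => m.Groups[1].Value).ToArray();
    }
}
```

For the input "(#lolz wow) (wow #cool) (#cool) hsshs#jdjd" this returns lolz, cool and cool; the # inside hsshs#jdjd is skipped because it is preceded by a non-whitespace character.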

Mask in Multiline TextBox

I would like to implement a TextBox in which the user can only insert text in a pattern like this:
dddddddddd,
dddddddddd,
dddddddddd,
...
where d is a digit. If the user leaves the control with fewer than 10 digits in a row, validation should fail. The user should not be able to write more than 10 digits in one line; after that, only a comma "," should be accepted.
Thanks for help

Match m = Regex.Match(textBox.Text, @"^\d{10},$", RegexOptions.Multiline);
Haven't tried it, but it should work.

I suggest the regex
\A(?:\s*\d{10},)*\s*\d{10}\s*\Z
Explanation:
\A # start of the string
(?: # match the following zero or more times:
\s* # optional whitespace, including newlines
\d{10}, # 10 digits, followed by a comma
)* # end of repeated group
\s* # match optional whitespace
\d{10} # match 10 digits (this time no comma)
\s* # optional whitespace
\Z # end of string
In C#, this would look like
validInput = Regex.IsMatch(subjectString, @"\A(?:\s*\d{10},)*\s*\d{10}\s*\Z");
Note that you need to use a verbatim string (@"...") or double all the backslashes in the regex.
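The validation could be wrapped as in the sketch below, for example to call from the control's validation handler. The class and method names (TenDigitLines, IsValid) are hypothetical. Note that the pattern as written expects the last line to have no trailing comma.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the validation pattern suggested above.
public static class TenDigitLines
{
    private static readonly Regex Rx =
        new Regex(@"\A(?:\s*\d{10},)*\s*\d{10}\s*\Z");

    // True when the whole text is a comma-separated list of 10-digit groups,
    // with optional whitespace (including line breaks) between the groups.
    public static bool IsValid(string text)
    {
        return Rx.IsMatch(text);
    }
}
```

For example, "1234567890,\r\n1234567890" is accepted, while a line with fewer than 10 digits or with 11 digits is rejected.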

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.
