How to set repeat regular expression?

I have the regular expression ^\d{5}$|^\d{5}-\d{4}$, which validates a single US ZIP code.
But I need to validate a list such as "zip, zip, zip". How can I do this?
I tried ^(\d{5}$|^\d{5}-\d{4},)*$ but it does not work.

Try
((^|, )(\d{5}|\d{5}-\d{4}))*$
Tester: http://regexr.com?36297
Each ZIP must be preceded by (^|, ), that is, by the beginning of the string or by a comma followed by a space.
Note that \d is usually the wrong choice in .NET, because it also matches non-ASCII Unicode digits such as ٠١٢٣٤. [0-9] is normally better.
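The difference between \d and [0-9] can be checked directly. This is a minimal sketch; the class and method names are made up for illustration. It also shows RegexOptions.ECMAScript, which restricts \d to ASCII digits.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DigitClassDemo
{
    // Five Arabic-Indic digits (U+0660..U+0664), written as escapes.
    public const string ArabicIndic = "\u0660\u0661\u0662\u0663\u0664";

    // By default, \d matches any Unicode decimal digit.
    public static bool MatchesBackslashD(string s) =>
        Regex.IsMatch(s, @"^\d{5}$");

    // [0-9] matches ASCII digits only.
    public static bool MatchesAsciiClass(string s) =>
        Regex.IsMatch(s, @"^[0-9]{5}$");

    // With RegexOptions.ECMAScript, \d behaves like [0-9].
    public static bool MatchesEcmaScriptD(string s) =>
        Regex.IsMatch(s, @"^\d{5}$", RegexOptions.ECMAScript);
}
```

Calling the three methods with DigitClassDemo.ArabicIndic shows that only the plain \d version accepts the non-ASCII digits.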

The expression you appear to need is:
^\d{5}(|-\d{4})(,\d{5}(|-\d{4}))*$
The one you were attempting to write was:
^((\d{5}|\d{5}-\d{4}),)*$
but that would require every ZIP to have a trailing comma, which the very last one would not have.
Breaking down the suggested expression:
\d{5}(|-\d{4}) is a variant of your original that simply makes the -1234 part optional.
(,\d{5}(|-\d{4}))* is the same expression preceded by a comma, allowed zero or more times.
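As a rough sketch of how that expression could be used for validation in C# (the class name ZipListStrict is hypothetical, and [0-9] is substituted for \d to stay with ASCII digits):

```csharp
using System;
using System.Text.RegularExpressions;

public static class ZipListStrict
{
    // One ZIP (optionally ZIP+4), then zero or more ",ZIP" repetitions.
    // No whitespace is allowed around the commas.
    private static readonly Regex Pattern =
        new Regex(@"^[0-9]{5}(|-[0-9]{4})(,[0-9]{5}(|-[0-9]{4}))*$");

    public static bool IsValid(string input) => Pattern.IsMatch(input);
}
```

With this pattern "12345,12345-6789" is accepted, while "12345," and "12345, 54321" are rejected because of the trailing comma and the space respectively.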

I would use this for speed:
^\d{5}(?:-\d{4})?(?:,\s*\d{5}(?:-\d{4})?)*$
expanded
^
\d{5}
(?: - \d{4} )?
(?:
, \s* \d{5}
(?: - \d{4} )?
)*
$
and this for speed/flexibility:
^\s*\d{5}(?:\s*-\s*\d{4})?(?:\s*,\s*\d{5}(?:\s*-\s*\d{4})?)*\s*$
expanded
^
\s*
\d{5}
(?: \s* - \s* \d{4} )?
(?:
\s* , \s* \d{5}
(?: \s* - \s* \d{4} )?
)*
\s*
$
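The expanded layout can be used as-is in C# when RegexOptions.IgnorePatternWhitespace is set. The following sketch assumes the flexible variant; the class name ZipListFlexible is invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ZipListFlexible
{
    // Expanded form of the flexible pattern. Unescaped whitespace in the
    // pattern is ignored because of RegexOptions.IgnorePatternWhitespace.
    private static readonly Regex Pattern = new Regex(@"
        ^
        \s*
        \d{5}
        (?: \s* - \s* \d{4} )?
        (?:
            \s* , \s* \d{5}
            (?: \s* - \s* \d{4} )?
        )*
        \s*
        $",
        RegexOptions.IgnorePatternWhitespace);

    public static bool IsValid(string input) => Pattern.IsMatch(input);
}
```

This version tolerates optional whitespace around commas and hyphens, so "12345, 12345-6789 ,54321" passes while "12345,,54321" does not.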

Related

Match numbers that are not in the context of Value(x)

I am trying to match the numbers that are not in the context of Value(X) and discard the rest of the text.
Example text:
lorem ipsum Value (3) dfasdf 654345435ds sdfsdf asdf
asd
F
asdf
sad Value (2)
Example Regex:
Value\((\d)\)
Thanks for help.

The .NET regex engine supports quantifiers inside a lookbehind assertion.
What you might do is assert that, to the left of the current position, there is no Value( that has 1+ digits followed by ) to the right. If that is the case, match 1 or more digits.
The pattern matches:
(?<!\bValue[\p{Zs}\t]*\((?=[0-9]+\)))[0-9]+
(?<! Negative lookbehind, assert that what is directly to the left is not
\bValue Match Value preceded by a word boundary to prevent a partial match
[\p{Zs}\t]*\( Match optional horizontal whitespace followed by (
(?=[0-9]+\)) Positive lookahead, assert 1+ digits followed by ) to the right
) Close the lookbehind
[0-9]+ Match 1+ digits 0-9
Note that \d matches not only 0-9 but also digits from other scripts. If you want to match those as well, use \d; otherwise use [0-9].
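A small sketch of how the pattern could be applied to the sample text from the question follows. The class and method names are hypothetical, and only the single-digit Value (x) occurrences from the sample are exercised here.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ValueContextDemo
{
    // The lookbehind pattern from the answer, unchanged.
    private static readonly Regex Numbers =
        new Regex(@"(?<!\bValue[\p{Zs}\t]*\((?=[0-9]+\)))[0-9]+");

    // Collects every number that is not directly inside Value( ... ).
    public static List<string> Extract(string text)
    {
        var result = new List<string>();
        foreach (Match m in Numbers.Matches(text))
        {
            result.Add(m.Value);
        }
        return result;
    }
}
```

For the input "lorem ipsum Value (3) dfasdf 654345435ds sdfsdf asdf\nsad Value (2)", Extract returns a single entry, 654345435.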

You are looking for:
(?<!Value *\()\d+
Note that I am assuming that every Value( has a closing bracket.
Explanation:
(?<!Value *\() asserts that what follows it is not preceded by "Value(", "Value (", "Value  (" and so on.
\d+ matches one or more digits.

Something like this ought to do you:
private static readonly Regex rx = new Regex(@"
    (?<!      # A zero-width negative look-behind assertion, consisting of:
      \b      # - a word boundary, followed by
      Value   # - the literal 'Value', followed by
      \s*     # - zero or more whitespace characters, followed by
      [(]     # - a left parenthesis '(', followed by
      \s*     # - zero or more whitespace characters,
    )         # The whole of which is followed by
    (         # A number, consisting of
      -?      # - an optional minus sign, followed by
      \d+     # - 1 or more decimal digits,
    )         # The whole of which is followed by
    (?!       # A zero-width negative look-ahead assertion, consisting of
      \s*     # - zero or more whitespace characters, followed by
      [)]     # - a single right parenthesis ')'
    )         #
    ",
    rxOpts
);
private const RegexOptions rxOpts = RegexOptions.IgnoreCase
                                  | RegexOptions.ExplicitCapture
                                  | RegexOptions.IgnorePatternWhitespace
                                  ;
Then . . .
foreach ( Match m in rx.Matches( someText ) )
{
    string nbr = m.Value;
    Console.WriteLine("Found '{0}'", nbr);
}

Regex to get square brackets containing numbers only but are not within square brackets themselves

Sample String
"[] [ds*[000112]] [1448472995] sample string [1448472995] ***";
The regex should match
[1448472995] [1448472995]
and should not match [000112], since it sits inside an outer pair of square brackets.
Currently I have this regex, which matches [000112] as well:
const string unixTimeStampPattern = @"\[([0-9]+)]";

This is a good way to do it using balanced text.
( \[ \d+ \] ) # (1)
| # or,
\[ # Opening bracket
(?> # Then either match (possessively):
[^\[\]]+ # non - brackets
| # or
\[ # [ increase the bracket counter
(?<Depth> )
| # or
\] # ] decrease the bracket counter
(?<-Depth> )
)* # Repeat as needed.
(?(Depth) # Assert that the bracket counter is at zero
(?!)
)
\] # Closing bracket
C# sample
string sTestSample = "[] [ds*[000112]] [1448472995] sample string [1448472995] ***";
Regex RxBracket = new Regex(@"(\[\d+\])|\[(?>[^\[\]]+|\[(?<Depth>)|\](?<-Depth>))*(?(Depth)(?!))\]");
Match bracketMatch = RxBracket.Match(sTestSample);
while (bracketMatch.Success)
{
    // Group 1 is only set by the first alternative, i.e. a top-level [digits].
    if (bracketMatch.Groups[1].Success)
        Console.WriteLine("{0}", bracketMatch);
    bracketMatch = bracketMatch.NextMatch();
}
Output
[1448472995]
[1448472995]

You need to use balancing groups to handle this - it looks a bit daunting but isn't all that complicated:
Regex regexObj = new Regex(
    @"\[                  # Match opening bracket.
    \d+                   # Match a number.
    \]                    # Match closing bracket.
    (?=                   # Assert that the following can be matched ahead:
      (?>                 # The following group (made atomic to avoid backtracking):
        [^\[\]]+          # One or more characters except brackets
      |                   # or
        \[ (?<Depth>)     # an opening bracket (increase bracket counter)
      |                   # or
        \] (?<-Depth>)    # a closing bracket (decrease bracket counter, can't go below 0).
      )*                  # Repeat ad libitum.
      (?(Depth)(?!))      # Assert that the bracket counter is now zero.
      [^\[\]]*            # Match any remaining non-bracket characters
      \z                  # until the end of the string.
    )                     # End of lookahead.",
    RegexOptions.IgnorePatternWhitespace);
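To see the lookahead version in action, the pattern can be collapsed onto one line and run against the sample string. This is only a sketch; TopLevelBrackets and Find are invented names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TopLevelBrackets
{
    // Compact form of the commented pattern above (no IgnorePatternWhitespace needed).
    private static readonly Regex Pattern = new Regex(
        @"\[\d+\](?=(?>[^\[\]]+|\[(?<Depth>)|\](?<-Depth>))*(?(Depth)(?!))[^\[\]]*\z)");

    // Returns every [digits] group whose remaining text is bracket-balanced,
    // which rules out groups nested inside an outer bracket pair.
    public static List<string> Find(string input)
    {
        var result = new List<string>();
        foreach (Match m in Pattern.Matches(input))
        {
            result.Add(m.Value);
        }
        return result;
    }
}
```

On the question's sample string this yields the two [1448472995] entries and skips [000112], because the text after [000112] starts with an unmatched closing bracket.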

Are you just trying to capture the Unix timestamp? Then you can try a simpler pattern that fixes the number of digits matched in the group.
\[([0-9]{10})\]
Here I limit it to 10 digits, since I doubt the timestamp will reach 11 digits anytime soon. To protect against that:
\[([0-9]{10,11})\]
Of course, this can lead to false positives if a 10-digit number appears inside an enclosing bracket.

This will match your expression as expected; it uses a lookahead: http://regexr.com/3csg3

Regex to capture parenthesis with hash tag?

So far I have this perfectly working regex:
(?:(?<=\s)|^)#(\w*[A-Za-z_]+\w*)
It finds any word that starts with a hash tag (e.g. #lolz but not hsshs#jdjd).
The problem is that I also want it to match when parentheses are involved. So it should match in each of these:
(#lolz wow)
or
(wow #cool)
or
(#cool)
Any idea how I can change my regex to work like that?

The following seemed to work for me ...
\(?#(\w*[A-Za-z_]+\w*)\)?

The way you are using the following in this context is overkill:
\w*[A-Za-z_]\w*
\w alone matches word characters (a-z, A-Z, 0-9, _). It is also unnecessary to wrap your lookbehind assertion in the non-capturing group (?: here. Note, though, that a plain \w+ also accepts all-digit tags such as #123, which the original pattern rejects.
I believe the following would suffice by itself.
(?<=^|\s)\(?#(\w+)\)?
Regular expression:
(?<= look behind to see if there is:
^ the beginning of the string
| OR
\s whitespace (\n, \r, \t, \f, and " ")
) end of look-behind
\(? '(' (optional, matching as many as possible)
# '#'
( group and capture to \1:
\w+ word characters (a-z, A-Z, 0-9, _) (1 or more times)
) end of \1
\)? ')' (optional, matching as many as possible)
You can also use a negative lookbehind here if you want to.
(?<![^\s])\(?#(\w+)\)?
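A short C# sketch of the negative-lookbehind variant, written here with the equivalent (?<!\S), pulls the tag text out of capture group 1. HashtagDemo and Tags are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HashtagDemo
{
    // (?<!\S) is equivalent to (?<![^\s]): no non-whitespace character to the left.
    private static readonly Regex Pattern = new Regex(@"(?<!\S)\(?#(\w+)\)?");

    // Returns the tag names without '#' or surrounding parentheses.
    public static List<string> Tags(string text)
    {
        var tags = new List<string>();
        foreach (Match m in Pattern.Matches(text))
        {
            tags.Add(m.Groups[1].Value);
        }
        return tags;
    }
}
```

For the question's inputs, "(#lolz wow)" gives lolz, "(wow #cool)" and "(#cool)" give cool, and "hsshs#jdjd" gives nothing.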

C#: Regex for string with enclosing single-quotes (and escaping by doubling the quotes)

I could not find a regex for my problem. The examples I find always escape with a backslash.
But I need escaping by doubling the enclosing character.
Example: 'o''reilly'
Result: o'reilly

'(?:''|[^']*)*'
will match a quote-delimited string that may contain doubled (escaped) quotes. So that's your regex to find those strings.
Explanation:
' # Match a single quote.
(?: # Either match... (use (?> instead of (?: if you can)
'' # a doubled quote
| # or
[^']* # anything that's not a quote
)* # any number of times.
' # Match a single quote.
To remove the quotes correctly, you could then do it in two steps:
First, search for (?<!')'(?!') to find all single quotes; replace them with nothing.
Explanation:
(?<!') # Assert that the previous character (if present) isn't a quote
' # Match a quote
(?!') # Assert that the next character (if present) isn't a quote
Second, search for '' and replace all with '.
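As an alternative to the two replacement passes, the content can be captured in one match and unescaped afterwards. This sketch swaps in a slightly different pattern, '(?:''|[^'])*' with a capture group and anchors, and is not the pattern from the answer; QuotedLiteral and Unquote are made-up names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuotedLiteral
{
    // Anchored variant: opening quote, then doubled quotes or non-quote
    // characters (captured in group 1), then the closing quote.
    private static readonly Regex Pattern = new Regex(@"^'((?:''|[^'])*)'$");

    // Strips the enclosing quotes and collapses each '' into a single '.
    public static string Unquote(string literal)
    {
        Match m = Pattern.Match(literal);
        if (!m.Success)
        {
            throw new FormatException("Not a single-quoted literal: " + literal);
        }
        return m.Groups[1].Value.Replace("''", "'");
    }
}
```

Unquote("'o''reilly'") returns o'reilly. Capturing first also handles a literal that ends in an escaped quote, such as 'it''', which becomes it'.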

Regular expression to find separator dots in formula

The C# expression library I am using does not directly support my table/field parameter syntax.
The following table/field parameter names are not directly supported:
TableName1.FieldName1
[TableName1].[FieldName1]
[Table Name 1].[Field Name 1]
It accepts alphanumeric parameters without spaces, or most characters enclosed within square brackets. I would like to use C# regular expressions to replace the dot separators and neighboring brackets with a different delimiter, so the results would be as follows:
[TableName1|FieldName1]
[TableName1|FieldName1]
[Table Name 1|Field Name 1]
I also need to skip any string literals within single quotes, like:
'TableName1.FieldName1'
And, of course, ignore any numeric literals like:
12345.6789
EDIT: Thank you for your feedback on improving my question. Hopefully it is clearer now.

I've written a completely new answer, now that the problem is clarified:
You can do this in a single regex. It is quite bulletproof, I think, but as you can see, it's not exactly self-explanatory, which is why I've commented it liberally. Hope it makes sense.
You're lucky that .NET allows re-use of named capturing groups, otherwise you would have had to do this in several steps.
resultString = Regex.Replace(subjectString,
    @"(?:                  # Either match...
      (?<before>           # (and capture into backref <before>)
        (?=\w*\p{L})       # (as long as it contains at least one letter):
        \w+                # one or more alphanumeric characters,
      )                    # (End of capturing group <before>).
      \.                   # then a literal dot,
      (?<after>            # (now capture again, into backref <after>)
        (?=\w*\p{L})       # (as long as it contains at least one letter):
        \w+                # one or more alphanumeric characters.
      )                    # (End of capturing group <after>) and end of match.
    |                      # Or:
      \[                   # Match a literal [
      (?<before>           # (now capture into backref <before>)
        [^\]]+             # one or more characters except ]
      )                    # (End of capturing group <before>).
      \]\.\[               # Match literal ].[
      (?<after>            # (capture into backref <after>)
        [^\]]+             # one or more characters except ]
      )                    # (End of capturing group <after>).
      \]                   # Match a literal ]
    )                      # End of alternation. The match is now finished, but
    (?=                    # only if the rest of the line matches either...
      [^']*$               # only non-quote characters
    |                      # or
      [^']*'[^']*'         # one complete quoted literal (two quote characters)
      [^']*                # plus any number of non-quote characters
      $                    # until the end of the line.
    )                      # End of the lookahead assertion.",
    "[${before}|${after}]", RegexOptions.Multiline | RegexOptions.IgnorePatternWhitespace);

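To check the replacement against the examples in the question, the same pattern can be written in compact form and wrapped in a helper. This is a sketch only; DelimiterRewriter and Rewrite are hypothetical names, and the inputs are the ones listed in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DelimiterRewriter
{
    // Compact form of the commented pattern above. .NET lets both alternatives
    // reuse the group names <before> and <after>.
    private static readonly Regex Pattern = new Regex(
        @"(?:(?<before>(?=\w*\p{L})\w+)\.(?<after>(?=\w*\p{L})\w+)" +
        @"|\[(?<before>[^\]]+)\]\.\[(?<after>[^\]]+)\])" +
        @"(?=[^']*$|[^']*'[^']*'[^']*$)",
        RegexOptions.Multiline);

    // Rewrites Table.Field and [Table].[Field] to [Table|Field].
    public static string Rewrite(string expression) =>
        Pattern.Replace(expression, "[${before}|${after}]");
}
```

Rewrite turns TableName1.FieldName1 and [TableName1].[FieldName1] into [TableName1|FieldName1], and leaves 'TableName1.FieldName1' and 12345.6789 unchanged.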