Regex pattern generator - c#

I'm trying to write a regex pattern that matches either of these:
Name[0]/Something
or
Name/Something
The words Name and Something will always be known.
I have a pattern that works for Name[0]/Something, but I want a single regex that covers both forms.
I tried to mark [0] as optional, but it didn't work:
var regexPattern = "Name" + @"\([\d*\]?)/" + "Something";
Is there a generator where I can enter some words and have it build the pattern for me?

Use this:
Name(\[\d+\])?\/Something
\d+ allows one or more digits
\[\d+\] allows one or more digits inside [ and ]. So it will allow [0], [12] etc but reject []
(\[\d+\])? allows digit with brackets to be present either zero times or once
\/ indicates a slash (only one)
Name and Something are string literals
Regex 101 Demo

You were close, the regex Name(\[\d+\])?\/Something will do.

The problem is with first '\' in your pattern before '('.
Here is what you need:
var str = "Name[0]/Something or Name/Something";
Regex rg = new Regex(@"Name(\[\d+\])?/Something");
var matches = rg.Matches(str);
foreach (Match a in matches)
{
    Console.WriteLine(a.Value);
}
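On the "generator" part of the question: a small helper can assemble the pattern from the two known words. This is only a sketch; the class and method names (PathPatternDemo, Build) are made up for illustration, and it assumes the whole input should match, hence the added ^ and $ anchors. Regex.Escape keeps any regex metacharacters in the words literal.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PathPatternDemo
{
    // Hypothetical helper: name, an optional "[digits]" index, "/", then something.
    // Regex.Escape keeps metacharacters in the known words literal.
    public static Regex Build(string name, string something)
    {
        string pattern = "^" + Regex.Escape(name) + @"(\[\d+\])?/" + Regex.Escape(something) + "$";
        return new Regex(pattern);
    }

    public static void Main()
    {
        Regex rg = Build("Name", "Something");
        foreach (string input in new[] { "Name[0]/Something", "Name/Something", "Name[]/Something" })
        {
            // The first two inputs match; "Name[]/Something" does not.
            Console.WriteLine(input + " -> " + rg.IsMatch(input));
        }
    }
}
```

Drop the anchors if the text may appear inside a longer string, as in the Matches example above.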

var string = 'Name[0]/Something';
var regex = /^(Name)(\[\d*\])?\/Something$/;
console.log(regex.test(string));
string = 'Name/Something';
console.log(regex.test(string));
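Your pattern \([\d*\]?)/ went wrong in two places:
There is no need for \ before ( in this case; escaping it makes it a literal parenthesis instead of the start of a group.
The ? after ] applies only to the ] character, so it means: ] zero or one time.
So, if you want the whole [...] part to appear zero or one time, you can try: (\[\d*\])?
Hope this helps!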

I think this is what you are looking for:
Name(\[\d+\])?\/Something
Name literal
(\[\d+\])? a number (1 or more digits) between brackets, optional (0 or 1 times)
/Something literal slash followed by the literal Something
https://regex101.com/r/G8tIHC/1

Related

Regex to find special pattern

I have a string to parse. First I have to check whether the string contains a special pattern:
I want to know if there are substrings which start with "$(",
and end with ")",
and between those start and end markers there should not be
any whitespace,
and there should not be a "$" character.
I have a little regex for it in C#:
string input = "$(abc)";
string pattern = @"\$\(([^$][^\s]*)\)";
Regex rgx = new Regex(pattern, RegexOptions.IgnoreCase);
MatchCollection matches = rgx.Matches(input);
foreach (var match in matches)
{
Console.WriteLine("value = " + match);
}
It works for many cases but fails for input = $(a$(), where the inner expression is empty. I do NOT want a match when the input is $() [there is nothing between the start and end markers].
What is wrong with my regex?
Note: [^$] matches any single character except $.
Use the below regex if you want to match $()
\$\(([^\s$]*)\)
Use the below regex if you don't want to match $(),
\$\(([^\s$]+)\)
* repeats the preceding token zero or more times.
+ Repeats the preceding token one or more times.
Your regex \$\(([^$][^\s]*)\) is wrong. It won't allow $ as the first character inside () but it allows it as the second, third, etc. You need to combine the negated classes in your regex in order to match any character that is neither whitespace nor $.
Your current regex does not match $() because the [^$] matches at least 1 character. The only way I can think of where you would have this match would be when you have an input containing more than one pair of parentheses, like:
$()(something)
In those cases, you will also need to exclude at least the closing paren:
string pattern = @"\$\(([^$\s)]+)\)";
The above matches for example:
abc in $(abc) and
abc and def in $(def)$()$(abc)(something).
Simply replace the * with a + and merge the options.
string pattern = @"\$\(([^$\s]+)\)";
+ means 1 or more
* means 0 or more
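To check the behaviour of the + version with the closing paren excluded, here is a minimal sketch that collects the captured contents. The class and method names (TokenDemo, ExtractTokens) are illustrative, not from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TokenDemo
{
    // Returns the text inside every $(...) whose content is non-empty
    // and contains no '$', no whitespace and no ')'.
    public static List<string> ExtractTokens(string input)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(input, @"\$\(([^$\s)]+)\)"))
        {
            result.Add(m.Groups[1].Value);
        }
        return result;
    }

    public static void Main()
    {
        // Prints "def,abc": the empty $() and the plain (something) are skipped.
        Console.WriteLine(string.Join(",", ExtractTokens("$(def)$()$(abc)(something)")));
        // Prints 0: "$(a$()" contains no valid token.
        Console.WriteLine(ExtractTokens("$(a$()").Count);
    }
}
```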

Regex removing empty spaces when using replace

My situation is not about removing spaces, but keeping them. I have strings of the form >[database values] that I would like to find. I created a regex to find them, and a second one to remove the >, [ and ]. The code below takes a string that comes from a document. The first pattern looks for anything of the form >[some stuff]; the second then "removes" the >, [ and ].
string decoded = "document in string format";
string pattern = @">\[[A-z, /, \s]*\]";
string pattern2 = @"[>, \[, \]]";
Regex rgx = new Regex(pattern);
Regex rgx2 = new Regex(pattern2);
foreach (Match match in rgx.Matches(decoded))
{
    string replacedValue = rgx2.Replace(match.Value, "");
    Console.WriteLine(match.Value);
    Console.WriteLine(replacedValue);
}
The output of my first Console.WriteLine is correct, so I get things like >[123 sesame St]. But my second output shows that the replace removes not just those characters but also the spaces, so I get something like 123sesameSt. I don't see any space being replaced in my regex. Am I forgetting something? Is it perhaps implicit in a replace?
The [A-z, /, \s] and [>, \[, \]] in your patterns are also looking for commas and spaces. Just list the characters without delimiting them, like this: [A-Za-z/\s]
string pattern = @">\[[A-Za-z/\s]*\]";
string pattern2 = @"[>\[\]]";
Edit to include Casimir's tip.
After rereading your question (if I understand it correctly) I realize that your two-step approach is unnecessary. You only need one replacement using a capture group:
string pattern = @">\[([^]]*)]";
Regex rgx = new Regex(pattern);
string result = rgx.Replace(yourtext, "$1");
pattern details:
>\[ # literals: >[
( # open the capture group 1
[^]]* # all that is not a ]
) # close the capture group 1
] # literal ]
the replacement string refers to the capture group 1 with $1
By writing [>, \[, \]] in pattern2 you define a character class that matches any single one of the characters listed between the square brackets: >, the comma, the space, [ and ]. I guess you don't want to match the space and the comma, so leave them out:
string pattern2 = @"[>\[\]]";
Alternatively, you could use
string pattern2 = @"(>\[|\])";
Thereby, you either match >[ or ] which better expresses your intention.
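To confirm that spaces survive, here is a small sketch of the single-replacement approach with a capture group. The names MarkerDemo and StripMarkers and the sample sentence are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MarkerDemo
{
    // Replaces every >[...] with just its contents (capture group 1).
    // Nothing inside the brackets is touched, so spaces are kept.
    public static string StripMarkers(string text)
    {
        return Regex.Replace(text, @">\[([^\]]*)\]", "$1");
    }

    public static void Main()
    {
        // Prints: Ship to 123 sesame St today
        Console.WriteLine(StripMarkers("Ship to >[123 sesame St] today"));
    }
}
```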

C# regexp negative lookahead

I have a problem with replacing characters after a specific character. For example, I want to replace the first 'aa' with '33' using this code.
string str = "dc1aaaafg";
string pattern = @"a{2}(?!(1))";
Regex rgx = new Regex(pattern);
string result = rgx.Replace(str, "33");
But the result is 'dc13333fg'. It also replaced the second group after '1'. I need to replace only the first group, giving 'dc133aafg'. How can I achieve this? I have a large string that may need many replacements; this is just an example.
Regex.Replace() is global. It will replace as many times as the pattern matches*.
You could use Regex.Replace(String, String, Int32) to limit the number of operations.
string result = rgx.Replace(str, "33", 1);
Or you change the pattern to a look-behind.
Regex rgx = new Regex(@"(?<=1)a{2}");
string result = rgx.Replace(str, "33");
* Note that Replace() is global, but not incremental. Using the expression a{2} on "aaaaaa" with the replacement "ba" will result in "bababa", not in "bbbbba".
There is an overload to the Replace method in which you can specify the number of times. Specify 1 and it shall do only the first match.
string result = rgx.Replace(str, "33", 1);
A regex pattern cannot express that only the first match is relevant.
Use Regex.Match to get the position and length of the first match. Then use Substring (or Remove followed by Insert) to construct a new string from the old string, that has the replacement you want.
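The Match-plus-Substring approach could look roughly like this. ReplaceFirst is a hypothetical helper name; note that in this sketch the replacement is inserted literally, so substitutions such as $1 are not expanded.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FirstMatchDemo
{
    // Hypothetical helper: replaces only the first match of pattern in input.
    // The replacement is inserted as-is (no $1-style substitution).
    public static string ReplaceFirst(string input, string pattern, string replacement)
    {
        Match m = Regex.Match(input, pattern);
        if (!m.Success)
        {
            return input; // nothing to replace
        }
        return input.Substring(0, m.Index) + replacement + input.Substring(m.Index + m.Length);
    }

    public static void Main()
    {
        // Prints: dc133aafg
        Console.WriteLine(ReplaceFirst("dc1aaaafg", "a{2}", "33"));
    }
}
```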
Try a negative lookbehind: (a{2})(?<!\1{2})
(a{2}) # 'a' two times
(?<! # negative look behind
\1{2} # '\1' is the captured group 'a' twice to "jump" over the captured group
)

.NET Regex - "Not" Match

I have a regular expression:
12345678|[0]{8}|[1]{8}|[2]{8}|[3]{8}|[4]{8}|[5]{8}|[6]{8}|[7]{8}|[8]{8}|[9]{8}
which matches if the string contains 12345678 or 11111111 or 22222222 ... or ... 99999999.
How can I change this to match only if the string does NOT contain any of the above? (Unfortunately I am not able to just use !IsMatch in the C# code.) EDIT: that code is a black box to me, and I am trying to set the regex in an existing config file.
This will match everything...
foundMatch = Regex.IsMatch(SubjectString, @"^(?:(?!12345678|(\d)\1{7}).)*$");
unless one of the "forbidden" sequences is found in the string.
Not using !IsMatch, as you can see.
Edit:
Adding your second constraint can be done with a lookahead assertion:
foundMatch = Regex.IsMatch(SubjectString, @"^(?=\d{9,12})(?:(?!12345678|(\d)\1{7}).)*$");
Works perfectly
string s = "55555555";
Regex regx = new Regex(@"^(?:12345678|(\d)\1{7})$");
if (!regx.IsMatch(s)) {
    Console.WriteLine("It does not match!!!");
}
else {
    Console.WriteLine("it matched");
}
Console.ReadLine();
Btw. I simplified your expression a bit and added anchors
^(?:12345678|(\d)\1{7})$
The (\d)\1{7} part takes a digit \d and the \1 checks if this digit is repeated 7 more times.
Update
This regex is doing what you want
Regex regx = new Regex(@"^(?!(?:12345678|(\d)\1{7})$).*$");
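A quick way to see what the negated, anchored pattern above accepts is to run it against a few inputs. This sketch assumes the whole string is being tested (the anchored variant), not a substring search; the names NotMatchDemo and IsAllowed are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NotMatchDemo
{
    // Matches any single-line string that is NOT exactly 12345678
    // and NOT exactly one digit repeated eight times.
    private static readonly Regex NotForbidden =
        new Regex(@"^(?!(?:12345678|(\d)\1{7})$).*$");

    public static bool IsAllowed(string s)
    {
        return NotForbidden.IsMatch(s);
    }

    public static void Main()
    {
        Console.WriteLine(IsAllowed("55555555"));  // False: one digit repeated eight times
        Console.WriteLine(IsAllowed("12345678"));  // False: the forbidden sequence
        Console.WriteLine(IsAllowed("12345679"));  // True
        Console.WriteLine(IsAllowed("555555555")); // True: nine digits is not an exact forbidden value
    }
}
```

Because the positive IsMatch now returns true for allowed values, the pattern can be dropped into a config file that only supports "must match" rules.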
First of all, you don't need any of those [] brackets; you can just do 0{8}|1{8}| etc.
Now for your problem. Try using a negative lookahead:
@"^(?:(?!12345678|(\d)\1{7}).)*$"
That should take care of your issue without using !IsMatch.
I am not able to just !IsMatch in the C# unfortunately.
Why not? What's wrong with the following solution?
bool notMatch = !Regex.IsMatch(yourString, "^(12345678|[0]{8}|[1]{8}|[2]{8}|[3]{8}|[4]{8}|[5]{8}|[6]{8}|[7]{8}|[8]{8}|[9]{8})$");
notMatch will be true for any string that is not exactly 12345678, 00000000, 11111111, ..., 99999999.

Quick Help With Regex C#

How can I match on the following string: A constant string name, followed by a period, followed by any positive integer, followed by another dot.
For example I want to find anything like this:
SomeText.1.
SomeText.99.
SomeText.100.
SomeText.1002.
Regex.Match(input, @"SomeText\.\d+\.");
Try something like this:
^SomeText\.\d+\.$
To explain:
The ^ means the beginning of the line, and $ means the end of the line. This ensures that the entire string matches the expression, not just that something in it happens to match the pattern.
The SomeText part is self explanatory.
The \. means "match a single .". The \ is required to escape the meaning of the period, which by itself would mean "Any single character"
The \d+ means "One or more digits".
Then the \. again, and finally $ to signify that's where we expect the string to end.
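The difference the anchors make can be seen by comparing the two patterns side by side. This is just a sketch; the class and method names (AnchorDemo, ContainsToken, IsExactlyToken) are made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // Unanchored: true if the pattern occurs anywhere in the string.
    public static bool ContainsToken(string s)
    {
        return Regex.IsMatch(s, @"SomeText\.\d+\.");
    }

    // Anchored: true only if the whole string is the pattern.
    public static bool IsExactlyToken(string s)
    {
        return Regex.IsMatch(s, @"^SomeText\.\d+\.$");
    }

    public static void Main()
    {
        string embedded = "prefix SomeText.99. suffix";
        Console.WriteLine(ContainsToken(embedded));         // True
        Console.WriteLine(IsExactlyToken(embedded));        // False
        Console.WriteLine(IsExactlyToken("SomeText.100.")); // True
        Console.WriteLine(IsExactlyToken("SomeText.x."));   // False: x is not a digit
    }
}
```

In .NET, $ also matches just before a final newline; use \z instead if a trailing newline should be rejected.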
If you want to be able to retrieve the number, try:
var exp = new Regex(@"SomeText\.(?<number>\d+)\.", RegexOptions.Compiled);
foreach (string s in allStrings)
{
    var collection = exp.Match(s);
    if (collection.Success)
    {
        int myNumber = int.Parse(collection.Groups["number"].Value);
        // ...
    }
}
Your regex would look like SomeText\.\d+\.
Which, in c# code would be
var result = Regex.Match(stringToMatch, @"SomeText\.\d+\.");
