Find a pattern using Regex - c#

I have a bunch of string lines with different formats. I want to find a regex pattern that matches only specific lines. I got part of the way there myself with this: \b([A-Z0-9]{2,})\b. However, I wasn't able to find a pattern that matches only lines 3, 6 and 8. Thank you.
// DONE:
return Test;
TESTER
MessageBoxButtons.OK,
.GetConnectionString();
TOURNAMENT TRACKER
// Create
TEST 4 ME

My guess is that your solution also matched the first and fourth line. If you want to exclude the lines with characters other than those specified, you could look at the whole line instead of checking single words:
^[0-9A-Z]+(\s[0-9A-Z]+)*$
It will match lines consisting of white space separated words which contain numbers or uppercase letters.

If you check the whole line you can use this
^[A-Z0-9 ]+$
Assuming case-sensitivity is set then this will match only uppercase characters, numbers and spaces from the start to the end of the line.
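As a rough sketch of how the whole-line check could be applied to the sample lines from the question (the class and method names are made up for illustration, and each line is assumed to arrive as a separate string):

```csharp
using System;
using System.Text.RegularExpressions;

public static class WholeLineDemo
{
    // Whole-line pattern from the answer above: uppercase letters, digits and spaces only.
    private static readonly Regex UpperCaseLine = new Regex(@"^[A-Z0-9 ]+$");

    // Returns the lines that consist only of uppercase letters, digits and spaces.
    public static string[] FilterLines(string[] lines)
    {
        return Array.FindAll(lines, line => UpperCaseLine.IsMatch(line));
    }

    public static void Main()
    {
        string[] lines =
        {
            "// DONE:", "return Test;", "TESTER", "MessageBoxButtons.OK,",
            ".GetConnectionString();", "TOURNAMENT TRACKER", "// Create", "TEST 4 ME"
        };

        // Prints TESTER, TOURNAMENT TRACKER and TEST 4 ME, one per line.
        foreach (string line in FilterLines(lines))
        {
            Console.WriteLine(line);
        }
    }
}
```

If the input is one multi-line string instead of separate lines, RegexOptions.Multiline makes ^ and $ anchor at line breaks.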


How to eliminate digits followed by specific string

I have quite a long regex pattern. Here is just a part of it:
string pattern = @"((?<!top=)(?<![A-Za-z])\d)+";
Given the string:
date(Account/AccountClose) gt 2019-03-25 and Brg eq '100'&$select=IdAccountCurrent&$skip=10&$top=10
It matches 2019, 03, 25, 100, 10 and 0.
I want to eliminate the last 0 from the matching result. In other words, any number that comes right after top= should not match.
My solution works only if there is a single digit after top=. How can I achieve the desired result?
UPDATE: Unfortunately, the suggested solutions are not suited to the whole pattern. I tried to keep my example simple, but it looks like that is impossible.
So my whole regex pattern is:
string pattern = @"((?<!top=)(?<![A-Za-z])\d|-|T\d+|:|\.|\+|(?<=\d)Z)+|\bfalse\b|\btrue\b|\bnull\b|'[^']+'|\(['\d][^\)]+\)";
I need to edit this pattern to eliminate all digits right after top=.
my whole example (please see the last row in this example, last 0 should not be matched)
Just add 0-9 to the character class in your lookbehind, so that a matched digit cannot be preceded by another digit:
((?<!top=)(?<![A-Za-z0-9])\d+)
But you can also just use word boundaries:
(?<!top=)\b(\d+)
You can change your regex to this where I've used \b to reject the partial matching of digits,
(?<!top=)(?<![A-Za-z])\b\d+
The way you wrote your regex, ((?<!top=)(?<![A-Za-z])\d)+, applies the lookbehind conditions to each digit individually and then repeats that group one or more times, which rules out using \b inside it. I therefore removed the outer parentheses and used \b\d+ instead. Hopefully this gives you all your desired matches. Let me know if you face any issues.
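For what it's worth, here is a small sketch that runs the \b\d+ variant against the query string from the question. The class and method names are only illustrative, and it covers the simplified pattern, not the full one from the update:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TopFilterDemo
{
    // Returns every whole run of digits that is not preceded by a letter or by "top=".
    public static string[] FindNumbers(string input)
    {
        return Regex.Matches(input, @"(?<!top=)(?<![A-Za-z])\b\d+")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        string query = "date(Account/AccountClose) gt 2019-03-25 and Brg eq '100'&$select=IdAccountCurrent&$skip=10&$top=10";

        // Prints: 2019, 03, 25, 100, 10
        // The single 10 comes from $skip=10; the 10 after top= is rejected as a whole.
        Console.WriteLine(string.Join(", ", FindNumbers(query)));
    }
}
```

Because \b cannot occur between the 1 and the 0 of top=10, the engine has no way to start a match in the middle of that number, which is what the per-digit version allowed.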

Match text not surrounded by & and ;

I am currently using the following regular expression:
(?<!&)[^&;]*(?!;)
To match text like this:
match1&lt;match2&gt;
And extract:
match1
match2
However, this seems to match an extra five empty strings. See Regex Storm.
How can I only match the two listed above?
Note the existing pattern ((?<=^|;)[^&]+) by @xanatos will only match matches 1 to 3 in the following string and not match4:
match1&lte;match2&lt;match;3+match&4
Try changing the * to a +:
(?<!&)[^&;]+(?!;)
More correct regex:
(?<=^|;)[^&]+
The basic idea here is that a "good" substring starts at the beginning of the string (^) or right after the ;, and ends when you encounter a & ([^&]+).
Third version... though at this point we are demonstrating the old saying: you have a problem, you decide to use regexes, and now you have two problems:
(?<=^|;)([^&]|&(?=[^&;]*(?:&|$)))+
I have managed it with:
(?<Text>.+?)(?:&[^&;]*?;|$)
This seems to match all of the corner cases but it might not work with a case I can't think of at the moment.
This won't work if the string starts with a &...; pattern or is only that.
See Regex Storm.
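To compare two of the suggestions side by side, here is a minimal sketch. The helper names are invented for illustration, and the test input is assumed to use complete &...; entities:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class EntityTextDemo
{
    // Lookbehind version: text that starts at the beginning or right after ';' and runs up to the next '&'.
    public static string[] ByLookbehind(string input)
    {
        return Regex.Matches(input, @"(?<=^|;)[^&]+")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    // Named-group version: lazily captured text followed by an &...; entity or the end of the string.
    public static string[] ByNamedGroup(string input)
    {
        return Regex.Matches(input, @"(?<Text>.+?)(?:&[^&;]*?;|$)")
                    .Cast<Match>()
                    .Select(m => m.Groups["Text"].Value)
                    .ToArray();
    }

    public static void Main()
    {
        string input = "match1&lt;match2&gt;";

        // Both lines print: match1, match2
        Console.WriteLine(string.Join(", ", ByLookbehind(input)));
        Console.WriteLine(string.Join(", ", ByNamedGroup(input)));
    }
}
```

The two differ on input that starts with an entity: for "&lt;match1" the lookbehind version returns only match1, while the named-group version returns the whole string, which is the limitation mentioned above.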

C# Regular Expression: Search the first 3 letters of each name

Does anyone know how I can write a regex (C#) that searches on the first 3 letters of each part of a full name?
Without using (.*)
I tried (.*), but it runs far beyond the requested name: if it finds the first condition and then finds the second condition 100 words later, it returns text that is not what I am looking for, so I have to limit the number of words.
Example: \s*(?:\s+\S+){0,2}\s*
I would like to ignore name parts with fewer than 3 characters if they exist in the name.
Search for any name whose parts start with these 3-character prefixes:
'Mar Jac Rey' (regex that performs search)
Should match:
Marck Jacobs L. S. Reynolds
Marcus Jacobine Reys
Maroon Jacqueline by Reyils
Can anyone help me?
The zero-or-more quantifier (*) is 'greedy' by default; that is, it will consume as many characters as possible while still allowing the remainder of the pattern to match. This is why Mar.*Jac will match from the first Mar in the input to the last Jac, including everything in between.
One potential solution is just to make your pattern 'non-greedy' (*?). This will make it consume as few characters as possible in order to match the remainder of the pattern.
Mar.*?Jac.*?Rey
However, this is not a great solution because it would still match the various name parts regardless of what other text appears in between—e.g. Marcus Jacobine Should Not Match Reys would be a valid match.
To allow only whitespace or at most 2 consecutive non-whitespace characters to appear between each name part, you'd have to get more fancy:
\bMar\w*(\s+\S{0,2})*\s+Jac\w*(\s+\S{0,2})*\s+Rey\w*
The pattern (\s+\S{0,2})*\s+ will match any number of non-whitespace tokens of at most two characters each, separated by whitespace. The \w* after each name part ensures that the entire name is included in that part of the match (you might want to use \S* instead here, but that's not entirely clear from your question). And I threw in a word boundary (\b) at the beginning to ensure that the match does not start in the middle of a 'word' (e.g. OMar would not match).
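A quick sketch to check that pattern against the examples from the question (the class and method names are just placeholders):

```csharp
using System;
using System.Text.RegularExpressions;

public static class NameSearchDemo
{
    // Name parts starting with Mar, Jac and Rey, separated by whitespace
    // and optionally by short tokens of at most two characters.
    private static readonly Regex NamePattern =
        new Regex(@"\bMar\w*(\s+\S{0,2})*\s+Jac\w*(\s+\S{0,2})*\s+Rey\w*");

    public static bool IsNameMatch(string text)
    {
        return NamePattern.IsMatch(text);
    }

    public static void Main()
    {
        string[] candidates =
        {
            "Marck Jacobs L. S. Reynolds",            // True
            "Marcus Jacobine Reys",                   // True
            "Maroon Jacqueline by Reyils",            // True
            "Marcus Jacobine Should Not Match Reys",  // False: longer words in between
            "OMar Jacobs Reys"                        // False: Mar is not at a word boundary
        };

        foreach (string candidate in candidates)
        {
            Console.WriteLine(candidate + " -> " + IsNameMatch(candidate));
        }
    }
}
```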
I think what you want is this regular expression, which checks whether the text starts with one of the prefixes (add RegexOptions.IgnoreCase to make it case-insensitive):
@"^(Mar|Jac|Rey)"
Less specific:
@"^[\w]{3}"
If you want to capture the first three letters of every word longer than three characters, you could use something like:
((?<name>[\w]{3})\w+)+
And enable ExplicitCapture when initializing your Regex.
It will return a series of matches, each with a group named "name" that holds one result.
Code sample:
Regex regex = new Regex(@"((?<name>[\w]{3})\w+)+", RegexOptions.ExplicitCapture | RegexOptions.IgnoreCase);
var matches = regex.Matches("Marck Jacobs L. S. Reynolds");
// Each match exposes its three-letter prefix through the named group.
foreach (Match m in matches) Console.WriteLine(m.Groups["name"].Value);
If you also want to capture 3-character words, you can replace the last "\w" with a space. In that case, remember to handle the last word of the phrase.

how to match multiple words in string using regex?

I am trying to match 3 words that can appear anywhere in the string:
Win
Enter
Now
All 3 words must exist in the string for it to count as a match. But I am having trouble getting a match even when all 3 words do exist.
Below is the regex I am using: http://regexr.com/39b83
^(?=.*?win)(?=.*?(enter))(?=.*?(now)).*
The regex works when all three words are on the same line, but when they are spread across different lines of the string, it fails to match.
Any direction or help is appreciated.
Since you don't want to match words like center (with the word "enter"), I would use:
/(\benter\b)|(\bwin\b)|(\bnow\b)/
C# supports the (?s) inline modifier (DOTALL, the same as RegexOptions.Singleline), so you could try the regex below. Note that it only matches when the words appear in this order:
(?i)(?s)win.*?enter.*?now
How about...
/(win|enter|now)/gi
It sounds like you want to match the lines on which these words appear, across up to three lines. That’s not really easy, but:
/^.*win.*(?:\s+.*)?enter.*(?:\s+.*)?now.*|^.*win.*(?:\s+.*)?now.*(?:\s+.*)?enter.*|^.*enter.*(?:\s+.*)?win.*(?:\s+.*)?now.*|^.*enter.*(?:\s+.*)?now.*(?:\s+.*)?win.*|^.*now.*(?:\s+.*)?win.*(?:\s+.*)?enter.*|^.*now.*(?:\s+.*)?enter.*(?:\s+.*)?win.*/igm
should do it.
It's because the dot doesn't match the newline character. To change this, you have two ways. The first is to use the s modifier (which allows the dot to match newlines):
(?s)^(?=.*\bwin\b)(?=.*\benter\b)(?=.*\bnow\b).*
But this feature isn't always available (for example, in JavaScript before ES2018). The second way is to replace the dot with [\s\S] (a character class that matches any character):
^(?=[\s\S]*\bwin\b)(?=[\s\S]*\benter\b)(?=[\s\S]*\bnow\b)[\s\S]+
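In C#, the first variant could be tried out roughly like this. The class and method names are invented, and RegexOptions.IgnoreCase is an assumption on my part, since the words in the question are capitalised while the pattern is lowercase:

```csharp
using System;
using System.Text.RegularExpressions;

public static class AllWordsDemo
{
    // (?s) lets '.' match newlines, so each lookahead can scan the whole text from the start.
    private static readonly Regex AllThree = new Regex(
        @"(?s)^(?=.*\bwin\b)(?=.*\benter\b)(?=.*\bnow\b).*",
        RegexOptions.IgnoreCase);

    // True only when win, enter and now all occur as whole words, in any order, on any line.
    public static bool ContainsAllWords(string text)
    {
        return AllThree.IsMatch(text);
    }

    public static void Main()
    {
        Console.WriteLine(ContainsAllWords("Enter the contest\nright now\nto WIN a prize")); // True
        Console.WriteLine(ContainsAllWords("Win now"));                                      // False: enter is missing
        Console.WriteLine(ContainsAllWords("center stage, win now"));                        // False: center is not enter
    }
}
```

The three lookaheads each start from position 0, which is why the order of the words does not matter here.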

Match text after colon

I want to match the word after the "type :".
What I have?
My actual pattern:
(?<=type\s:\s)(\w*)
Text:
"type : text,"
It works exactly as I want when I have just one whitespace before/after the colon:
"type_SPACE_:_SPACE_text"
But if I have 2 spaces or none, it doesn't work.
I already tried this, but it doesn't match:
(?<=type\s*:\s*)(\w*)
I also tried this, my best approach so far, but with it the matched text contains the colon:
(?<=type)(\s*):(\s*)(.*)(?=,)
To test, I use gskinner's tester:
http://gskinner.com/RegExr/
If you're doing this in C# and using the included Regex engine, your original regex should work, with a slight modification:
string myString = "type : something";
var match = Regex.Match(myString, @"(?<=type\s*:\s*)\w+");
Console.Write(match);
Edit: The reason the (?<=type\s*:\s*)(\w*) version wasn't working for you with multiple spaces is that \w* can match zero characters. The lookbehind is already satisfied immediately after the colon and after each following space, so the engine happily returns empty strings at those positions before it reaches the word.
You can view the various matched strings by using Regex.Matches. You'll see that your matched word is in there, but it's not the first result.
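To see the difference between \w+ and \w* for yourself, something along these lines should do (helper names are made up; this relies on .NET allowing variable-length lookbehind):

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TypeValueDemo
{
    // \w+ needs at least one word character, so empty matches are impossible.
    public static string ExtractType(string input)
    {
        return Regex.Match(input, @"(?<=type\s*:\s*)\w+").Value;
    }

    // \w* may match nothing, so this also returns empty strings found before the word.
    public static string[] AllMatchesWithStar(string input)
    {
        return Regex.Matches(input, @"(?<=type\s*:\s*)\w*")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(ExtractType("type :  text,")); // text
        Console.WriteLine(ExtractType("type:text,"));    // text

        // The first \w* result is an empty string; the word shows up later in the list.
        foreach (string value in AllMatchesWithStar("type :  text,"))
        {
            Console.WriteLine("[" + value + "]");
        }
    }
}
```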
