How do I go about using Regex.Matches groups in Visual C# - c#

I have the following statement, and I want to extract the values of Video[0] and title[0].
String text = "< php $video[0]='aEOqMqVWB5s';$title[0]='www.yahoo.com'; ?>";
How do I go about using Regex.Matches, and the Groups[0].Value? So in this example, the first group will be aEOqMqVWB5s, and the second group will be www.yahoo.com.
Thanks in advance.

Parentheses will create the two groups you need. You will also have to escape the characters $, [, ] and ?, which you can do with a backslash ("\"). So your regex will look something like: @"< php \$video\[0\]='(.*)';\$title\[0\]='(.*)'; \?>"

You can use this pattern:
\$(?:video|title)\[0\]='(.*?)'
Sample Code:
C# example (untested, to give you an idea of how to start):
MatchCollection mcol = Regex.Matches(inputStr, @"\$(?:video|title)\[0\]='(.*?)'");
foreach (Match m in mcol)
{
    Debug.Print(m.ToString()); // See Output window
    // m.Groups[0].Value is the whole match; m.Groups[1].Value is the text between the quotes
    // adjust your loop accordingly
}
Regex breakup:
\$ = looks for the `$` character; it needs escaping since it has a special meaning for the regex engine.
(?:video|title) = matches either the word `video` or `title`, without capturing it as a group.
\[0\]= = looks for a literal `[` followed by `0`, followed by `]` and `=`.
'(.*?)' = lazy match for anything enclosed in single quotes; the `()` makes a capturing group here.
Recommended read:
An example for Regex.Matches.Groups
Live demo:
Online Regex tester Live demo
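To make the difference between Groups[0] and Groups[1] concrete, here is a minimal sketch that runs the pattern above against the string from the question. It is written as C# top-level statements (an assumption; in an older project the same lines would go inside Main), and the variable names are only illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

// Input copied from the question; the pattern is the one suggested above.
string text = "< php $video[0]='aEOqMqVWB5s';$title[0]='www.yahoo.com'; ?>";
MatchCollection matches = Regex.Matches(text, @"\$(?:video|title)\[0\]='(.*?)'");

// Groups[0] is the whole match; Groups[1] is the text between the single quotes.
string video = matches[0].Groups[1].Value;
string title = matches[1].Groups[1].Value;

Console.WriteLine(video);
Console.WriteLine(title);
```

Each call to Regex.Matches yields one Match per `$video[0]=...` or `$title[0]=...` assignment, so the two values come from two separate matches rather than from two groups of a single match.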

Related

Get substring with RegEx

I am really struggling with RegEx. I want my RegEx (if possible) to do 2 things:
1- Validate that the whole string respects the format NAME_STKBYGRP.CSV
2- Extract the NAME substring if match
Examples:
TEST_STKBYGRP.CSV -> TEST
other_stkbygrp.csv -> other
test_wrong.csv -> ""
Here is what I tried so far
string input = "NAME_STKBYGRP.CSV";
Regex regex = new Regex("([A-Z])*_STKBYGRP.CSV", RegexOptions.IgnoreCase);
string s = regex.Match(input).Value;
It does return "" if it doesn't match, but it returns the whole input if it matches.
You need to read regex.Match(input).Groups[1].Value if you only want the value of the first group.
You should also add a ^ and $ at the start and end of your regex if you want to rule out strings like evilnumber12345_NAME_STKBYGRP.CSVevilsuffix.
Edit: adv12 also has a good point about the location of the * - it should be inside the parentheses.
First off, your * should be inside the parentheses. Otherwise, you'll capture several single-character groups. Then, use Match.Groups[1] to get just the characters matched by the portion of the regex in the parentheses.
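Putting both answers together, a minimal sketch could look like the following. ExtractName is a hypothetical helper name, the code is written as C# top-level statements, and using + instead of * (so that an empty NAME is rejected) and escaping the dot are my assumptions rather than something the question states.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: returns NAME, or "" when the file name does not fit NAME_STKBYGRP.CSV.
static string ExtractName(string fileName)
{
    // ^ and $ force the whole string to match; the quantifier sits inside the parentheses
    // so that group 1 holds the complete name; \. matches a literal dot.
    Match m = Regex.Match(fileName, @"^([A-Z]+)_STKBYGRP\.CSV$", RegexOptions.IgnoreCase);
    return m.Success ? m.Groups[1].Value : "";
}

Console.WriteLine(ExtractName("TEST_STKBYGRP.CSV"));
Console.WriteLine(ExtractName("other_stkbygrp.csv"));
Console.WriteLine(ExtractName("test_wrong.csv"));
```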

Regex to find anything after ']' and before '['

I have a regex working to find anything between the square brackets in a text file, which is this:
Regex squareBrackets = new Regex(@"\[(.*?)\]");
And I want to create a regex that is basically the opposite way round to select whatever is after what's in the square brackets. So I thought just swap them round?
Regex textAfterTitles = new Regex(@"\](.*?)\[");
But this does not work and Regex's confuse me - can anyone help?
Cheers
You can use a lookbehind:
var textAfterTiles = new Regex(@"(?<=\[(.*?)\]).*");
You can combine it with a lookahead if you have multiple such bracketed groups, such as:
var textAfterTiles = "before [one] inside [two] after";
If you wanted to match " inside " and " after", you could do this:
new Regex(@"(?<=\[(.*?)\])[^\[]*");
The same \[(.*?)] regex (I'd just remove the redundant escaping backslash before ]), or the even better \[([^]]*)], can be used to split the text and get the text outside [...] (if used with the RegexOptions.ExplicitCapture modifier):
var data = "A bracket is a tall punctuation mark[1] typically used in matched pairs within text,[2] to set apart or interject other text.";
Console.WriteLine(String.Join("\n", Regex.Split(data, @"\[([^]]*)]", RegexOptions.ExplicitCapture)));
Output of the C# demo:
A bracket is a tall punctuation mark
typically used in matched pairs within text,
to set apart or interject other text.
The RegexOptions.ExplicitCapture flag makes the capturing group inside the pattern non-capturing, and thus, the captured text is not output into the resulting split array.
If you do not have to keep the same regex, just remove the capture group, use \[[^]]*] for splitting.
You can try this one
\]([^\]]*)\[
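For the lookbehind approach, a small sketch along these lines may help. It applies the lookbehind pattern from the earlier answer to that answer's sample string; the variable names are illustrative, and the code is written as C# top-level statements.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string sample = "before [one] inside [two] after";

// The lookbehind requires a complete [...] block right before the match;
// [^\[]* then consumes everything up to the next "[" (or the end of the string).
string[] pieces = Regex.Matches(sample, @"(?<=\[(.*?)\])[^\[]*")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

foreach (string piece in pieces)
{
    Console.WriteLine("'" + piece + "'");
}
```

Note that the text before the first bracket ("before ") is not matched, because nothing bracketed precedes it.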

Regex pattern in C# that starts with # and ends with &#x9;

I need a regex pattern for text that starts with "#" and ends with "&#x9;".
I tried the below pattern
string pattern = "^[#].*?[&#x9;]$";
but it is not working
Since &#x9; is the hex code of the tab character, why not just use the StartsWith and EndsWith methods instead?
if (yourString.StartsWith("#") && yourString.EndsWith("\t"))
{
// Pass
}
This pattern works fine. I have tested it.
string pattern = "#(.*?)9";
See below link to test it online.
https://regex101.com/r/iR6nP6/1
C#
const string str = "dadasd#beetween9ddasdasd";
var match = Regex.Match(str, "#(.*?)9");
Console.WriteLine(match.Groups[1].Value);
In regex syntax, [] denotes a set of characters, of which the engine will attempt to match exactly one. Thus, [&#x9] means: match one of &, #, x or 9, in no particular order.
If you are after order, which seems you are, you will need to remove the []. Something like so should work: string pattern = "^#.*?&#x9$";
you mean something like:
string pattern = "^#.*?[ ]$"
There are also many fine regex helpers on the web, for example https://regex101.com/. It gives a nice explanation of how your text will be handled.
You should use \t to match tab character
You can use special character sequences to put non-printable characters in your regular expression. Use \t to match a tab character (ASCII 0x09)
Try following Regex
^\#.*\t\;$
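If the string really ends with a tab character (rather than the literal text &#x9;), a minimal sketch of both suggestions could look like this. That interpretation is an assumption, the variable names are illustrative, and the code is written as C# top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// In a verbatim string, \t reaches the regex engine as the escape for a tab character.
string pattern = @"^#.*\t$";

bool withTab = Regex.IsMatch("#some text\t", pattern);
bool withoutTab = Regex.IsMatch("#some text", pattern);

// The same check without a regex, as suggested in the first answer.
string candidate = "#some text\t";
bool plainCheck = candidate.StartsWith("#", StringComparison.Ordinal)
    && candidate.EndsWith("\t", StringComparison.Ordinal);

Console.WriteLine(withTab);
Console.WriteLine(withoutTab);
Console.WriteLine(plainCheck);
```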

Limit regex expression by character in c#

I have the following pattern: (\s\w+). I need it to match every word in my string that is separated by a space.
For example
When I have this string
many word in the textarea must be happy
I get
many
word
in
the
textarea
must
be
happy
That is correct, but when I have another character, for example
many word in the textarea , must be happy
I get
many
word
in
the
textarea
must
be
happy
But "must be happy" should be ignored, because I want matching to stop when another character is in the string
Edit:
Example 2
all cats { in } the world are nice
Should return
all
cats
Because { is another separator for me
Example 3
My 3 cats are ... funny
Should return
My
3
cats
are
Because 3 is alphanumeric and . is separator for me
What can I do?
To do that you need to use the \G anchor, which matches the position at the start of the string or right after the previous match. So you can do it with this pattern:
@"(?<=\G\s*)\w+"
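As a rough sketch of how the \G pattern could be used, the helper below collects the words up to the first separator. LeadingWords is a hypothetical name, and the code is written as C# top-level statements; the inputs are the three examples from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: each word must start where the previous match ended,
// optionally after whitespace, so the first non-word, non-space character ends the chain.
static List<string> LeadingWords(string input) =>
    Regex.Matches(input, @"(?<=\G\s*)\w+")
        .Cast<Match>()
        .Select(m => m.Value)
        .ToList();

Console.WriteLine(string.Join("|", LeadingWords("many word in the textarea , must be happy")));
Console.WriteLine(string.Join("|", LeadingWords("all cats { in } the world are nice")));
Console.WriteLine(string.Join("|", LeadingWords("My 3 cats are ... funny")));
```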
[^\w\s\n].*$|(\w+\s+)
Try this. Grab the captures or matches. Set flag m for multiline mode. See demo.
http://regex101.com/r/kP4pZ2/12
I think Sam I Am's comment is correct: you'll require two regular expressions.
Capture the text up to a non-word character.
Capture all the words with a space on one side.
Here's the corresponding code:
"^(\\w+\\s+)+"
"(\\w+\\s+)"
You can combine these two to capture just the individual words pretty easily - like so
"^(\\w+\\s+)+"
Here's a complete piece of code demonstrating the pattern:
string input = "many word in the textarea , must be happy";
string pattern = "^(\\w+\\s+)+";
Match match = Regex.Match(input, pattern);
// Never throws a NullReferenceException, because the Groups indexer returns an empty, unsuccessful group rather than null - check it out!
foreach (Capture capture in match.Groups[1].Captures)
{
    Console.WriteLine(capture.Value);
}
EDIT
Check out Casimir et Hippolyte for a really clean answer.
All in one regex :-) Result is in list
Regex regex = new Regex(@"^((\w+)\s*)+([^\w\s]|$).*");
Match m = regex.Match(inputString);
if (m.Success)
{
    List<string> list =
        m.Groups[2].Captures.Cast<Capture>()
            .Select(c => c.Value).ToList();
}

How do I exclude a regex value in a replace

I have a regular expression which searches for strings using a Prefix and a Suffix. In its simplest form, \$\$\w+\$\$ will find $$My_Name$$ (in this case the Prefix and Suffix are both equal to $$). Another example would be \[\#\w+\#\] to match [#My_Name#].
The Prefix and Suffix will always be a specific string of 0 to n characters which I can always safely escape for a direct character match.
I extract the Matches in my C# code so I can work with them but obviously my match contains $$My_Name$$ but what I want is to simply get the inner string between the Suffix and Prefix: My_Name.
How do I exclude the Prefix and Suffix from the result?
Change your REGEX to \$\$(\w+)\$\$ and use $1 to get the matching (inner) string.
For example
string pattern = @"\$\$(\w+)\$\$";
string input = "$$My_Name$$";
Regex rgx = new Regex(pattern);
Match result = rgx.Match(input);
Console.WriteLine(result.Groups[1]);
Outputs: My_Name
P.S - There's no need to use explicitly typed local variables, but I just wanted the types to be clear.
You can wrap your \w+ in a group, like this: (\w+). Then, when you retrieve the matched string, you can ask for that subgroup.
I may be wrong (you didn't provide any code), but I think this is how it is done: .Groups[1].Value on the result of Regex.Match.
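Since the question says the Prefix and Suffix are arbitrary strings that can be escaped, a minimal sketch could build the pattern with Regex.Escape and read group 1. ExtractInner is a hypothetical helper name, and returning "" when nothing matches is my assumption; the code is written as C# top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the text between prefix and suffix, or "" if there is no match.
static string ExtractInner(string input, string prefix, string suffix)
{
    // Regex.Escape turns characters such as $ and [ into literals;
    // the parentheses around \w+ make the inner text available as group 1.
    string pattern = Regex.Escape(prefix) + @"(\w+)" + Regex.Escape(suffix);
    Match m = Regex.Match(input, pattern);
    return m.Success ? m.Groups[1].Value : "";
}

Console.WriteLine(ExtractInner("$$My_Name$$", "$$", "$$"));
Console.WriteLine(ExtractInner("[#My_Name#]", "[#", "#]"));
```

Groups[0] would still contain the full $$My_Name$$ text, so the match can be used for replacing while group 1 supplies the inner name.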
How about the regex below. It works by capturing the first character into a named group then capturing any repeats into a named group called first group which it then uses to match the end of the string. It will work with any number of repeated character so long as they repeated at the end of the word.
'(?<first_group>(?<first_char>.)\k<first_char>+)(?<word>\w+)\k<first_group>+'
You just need to then extract the capture group named word like so:
String sample = "$$My_Name$$";
Regex regex = new Regex(@"(?<first_group>(?<first_char>.)\k<first_char>+)(?<word>\w+)\k<first_group>+");
Match match = regex.Match(sample);
if (match.Success)
{
Console.WriteLine(match.Groups["word"].Value);
}
You can use a named group like this:
(\$\$)(?<group1>.+?)\1 -- pattern 1 (first case)
\[(#)(?<group2>.+?)\1\] -- pattern 2 (second case)
or, as a combined representation:
(\$\$)(?<group1>.+?)\1|\[(#)(?<group2>.+?)\2\]
(In .NET, unnamed groups are numbered before named ones, so in the combined pattern (#) is group 2.)
I would suggest using .+? so that the inner group can contain any character, not only word characters, and stops at the first occurrence of your suffix.
Live Demo
