Match a particular word after double quotes - c#,regex - c#

I want to match a particular word only when it is enclosed in double quotes.
I am using the regex @"\bspecific\S*id\b", which matches anything that starts with specific and ends with id.
But I want something that matches only the quoted form:
"specific-anything-id" - should match (it must be enclosed in double quotes)
**<specific-anything-id>** - should not match
specific-"anything"-id - should not match

You can include the double quotes and use a negated character class [^"] (matching any char but ") rather than \S (that can also match double quotes as it matches any non-whitespace character):
var pattern = @"""specific[^""]*id""";
You do not need word boundaries either here.
See the regex demo and a C# demo:
var s = "\"specific-anything-id\" <specific-anything-id> specific-\"anything\"-id";
var matches = Regex.Matches(s, @"""specific[^""]*id""");
foreach (Match m in matches)
Console.WriteLine(m.Value); // => "specific-anything-id"

Use:
"([^"]+)"
Capture group 1 will then contain the ID you want, without the surrounding quotes.
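A minimal sketch of reading that capture group in C#, reusing the sample string from the answer above. The class and method names are made up for illustration. Note that this pattern captures any double-quoted text, so on this sample it would also return anything, which the question wanted to exclude.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QuotedCaptureDemo
{
    // Returns the text found between each pair of double quotes.
    public static List<string> QuotedValues(string input)
    {
        var values = new List<string>();
        foreach (Match m in Regex.Matches(input, "\"([^\"]+)\""))
            values.Add(m.Groups[1].Value); // group 1 = text without the quotes
        return values;
    }

    public static void Main()
    {
        var s = "\"specific-anything-id\" <specific-anything-id> specific-\"anything\"-id";
        foreach (var v in QuotedValues(s))
            Console.WriteLine(v);
        // Prints:
        // specific-anything-id
        // anything
    }
}
```

If only the specific...id form is wanted, the more restrictive pattern from the first answer is probably the better fit.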

Related

How to extract text that lies between parentheses

I have a string like (CAT,A)(DOG,C)(MOUSE,D)
I want to get the DOG value, C, using a regular expression.
I tried the following:
Match match = Regex.Match(rspData, @"\(DOG,*?\)");
if (match.Success)
    Console.WriteLine(match.Value);
But it is not working. Could anyone help me solve this issue?
You can use
(?<=\(DOG,)\w+(?=\))
(?<=\(DOG,)[^()]*(?=\))
See the regex demo.
Details:
(?<=\(DOG,) - a positive lookbehind that matches a location that is immediately preceded with (DOG, string
\w+ - one or more letters, digits, connector punctuation
[^()]* - zero or more chars other than ( and )
(?=\)) - a positive lookahead that matches a location that is immediately followed with ).
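As a rough sketch, the lookaround pattern can be used from C# like this. The helper name TryGetValue and the idea of passing the key as a parameter are assumptions added for illustration; the input string is the one from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Tries to read the value stored next to `key` in text shaped like "(KEY,VALUE)".
    public static bool TryGetValue(string input, string key, out string value)
    {
        // The lookbehind requires "(KEY," just before the match and the lookahead
        // requires ")" just after it; neither becomes part of the match itself.
        string pattern = @"(?<=\(" + Regex.Escape(key) + @",)[^()]*(?=\))";
        Match m = Regex.Match(input, pattern);
        value = m.Success ? m.Value : string.Empty;
        return m.Success;
    }

    public static void Main()
    {
        string rspData = "(CAT,A)(DOG,C)(MOUSE,D)";
        if (TryGetValue(rspData, "DOG", out string value))
            Console.WriteLine(value); // C
    }
}
```

Because the lookarounds are zero-width, m.Value is already the bare value and no capture group is needed.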
As an alternative you can also use a capture group:
\(DOG,([^()]*)\)
Explanation
\(DOG, Match (DOG,
([^()]*) Capture group 1, match 0+ chars other than ( or )
\) Match )
Regex demo | C# demo
string rspData = "(CAT,A)(DOG,C)(MOUSE,D)";
Match match = Regex.Match(rspData, @"\(DOG,([^()]*)\)");
if (match.Success)
{
    Console.WriteLine(match.Groups[1].Value);
}
Output
C

C# Replace some char from a Regex Match [duplicate]

Ultimately, I want to replace all the \t characters that are enclosed within double quotes.
I'm currently on Regex101 trying various iterations of my regex. This is the closest I have so far:
originString = blah\t\"blah\tblah\"\t\"blah\"\tblah\tblah\t\"blah\tblah\t\tblah\t\"\t\"\tbleh\"
regex = \t?+\"{1}[^"]?+([\t])?+[^"]?+\"
\t?+ maybe one or more tab
\"{1} a double quote
[^"]?+ anything but a double quote
([\t])?+ capture all the tabs
[^"]?+ anything but a double quote
\"{1} a double quote
My logic is flawed!
I need your help in grouping the tab characters.
Match the double-quoted substrings with a simple "[^"]+" regex (if there are no escape sequences to account for) and replace the tabs only within those matches, using a match evaluator:
var str = "A tab\there \"inside\ta\tdouble-quoted\tsubstring\" some\there";
var pattern = "\"[^\"]+\""; // A pattern to match a double quoted substring with no escape sequences
var result = Regex.Replace(str, pattern, m =>
m.Value.Replace("\t", "-")); // Replace the tabs inside double quotes with -
Console.WriteLine(result);
// => A tab here "inside-a-double-quoted-substring" some here
See the C# demo
You can use this:
\"[^\"]*\"
originally answered here

Regular expression matching a given structure

I need to generate a regex to match any string with this structure:
{"anyWord"}{"aSpace"}{"-"}{"anyLetter"}
How can I do it?
Thanks
EDIT
I have tried:
string txt="print -c";
string re1="((?:[a-z][a-z]+))"; // Word 1
Regex r = new Regex(re1,RegexOptions.IgnoreCase|RegexOptions.Singleline);
Match m = r.Match(txt);
if (m.Success)
{
String word1=m.Groups[1].ToString();
Console.Write("("+word1.ToString()+")"+"\n");
}
Console.ReadLine();
but this only matches the word "print"
This would be pretty straight-forward :
[a-zA-Z]+\s\-[a-zA-Z]
explained as follows :
[a-zA-Z]+ # Matches 1 or more letters
\s # Matches a single whitespace character (a space, tab, etc.)
\- # Matches a single hyphen / dash
[a-zA-Z] # Matches a single letter
If you needed to implement this in C#, you could just use the Regex class and specifically the Regex.Matches() method:
var matches = Regex.Matches(yourString, @"[a-zA-Z]+\s\-[a-zA-Z]");
Some example matching might look like this :
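A small sketch of such matching, using a made-up input string (not from the original answer) that contains two word/space/dash/letter sequences:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WordDashLetterDemo
{
    // Collects every "word, whitespace, hyphen, letter" sequence in the input.
    public static List<string> FindAll(string input)
    {
        var found = new List<string>();
        foreach (Match m in Regex.Matches(input, @"[a-zA-Z]+\s\-[a-zA-Z]"))
            found.Add(m.Value);
        return found;
    }

    public static void Main()
    {
        // Hypothetical sample text, chosen only for illustration.
        foreach (var hit in FindAll("print -c, then list -a"))
            Console.WriteLine(hit);
        // Prints:
        // print -c
        // list -a
    }
}
```

A string such as print-c yields no match, because the pattern requires a whitespace character before the hyphen.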

How can I use lookbehind in a C# Regex in order to skip matches of repeated prefix patterns?

Example - I'm trying to have the expression match all the b characters following any number of a characters:
Regex expression = new Regex("(?<=a).*");
foreach (Match result in expression.Matches("aaabbbb"))
MessageBox.Show(result.Value);
returns aabbbb, the lookbehind matching only an a. How can I make it so that it would match all the as in the beginning?
I've tried
Regex expression = new Regex("(?<=a+).*");
and
Regex expression = new Regex("(?<=a)+.*");
with no results...
What I'm expecting is bbbb.
Are you looking for a repeated capturing group?
(.)\1*
This will return two matches.
Given:
aaabbbb
This will result in:
aaa
bbbb
This:
(?<=(.))(?!\1).*
This uses the above principle: it first finds the previous character, captures it into a backreference, and then asserts that the next character is not that same character.
That matches:
bbbb
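A hedged sketch of trying that lookbehind-plus-backreference pattern in C#. The class and method names are illustrative only. It uses Regex.Match to take the first match; since .* can also match the empty string, iterating over all matches may additionally report an empty match at the end of the input.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BackreferenceLookaroundDemo
{
    // Returns the first match: everything from the first position where the
    // previous character differs from the next one.
    public static string FirstMatch(string input)
    {
        Match m = Regex.Match(input, @"(?<=(.))(?!\1).*");
        return m.Success ? m.Value : string.Empty;
    }

    public static void Main()
    {
        Console.WriteLine(FirstMatch("aaabbbb")); // bbbb
    }
}
```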
I figured it out eventually:
Regex expression = new Regex("(?<=a+)[^a]+");
foreach (Match result in expression.Matches(@"aaabbbb"))
    MessageBox.Show(result.Value);
I must not allow the as to be matched by the part of the pattern outside the lookbehind. This way, the expression will only match those b repetitions that follow a repetitions.
Matching aaabbbb yields bbbb and matching aaabbbbcccbbbbaaaaaabbzzabbb results in bbbbcccbbbb, bbzz and bbb.
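To check those results outside a UI, the same pattern can be run in a console program. This is only a sketch; the class and method names are assumptions, and Console output replaces the MessageBox calls.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LookbehindRunsDemo
{
    // Returns every run of non-"a" characters that directly follows one or more "a"s.
    public static List<string> RunsAfterA(string input)
    {
        var runs = new List<string>();
        foreach (Match m in Regex.Matches(input, "(?<=a+)[^a]+"))
            runs.Add(m.Value);
        return runs;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", RunsAfterA("aaabbbb")));
        // bbbb
        Console.WriteLine(string.Join(", ", RunsAfterA("aaabbbbcccbbbbaaaaaabbzzabbb")));
        // bbbbcccbbbb, bbzz, bbb
    }
}
```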
The look-behind skips only the first "a" because it is zero-width: it succeeds at the first position that is preceded by an "a" (right after the first character), and .* then matches everything from that position onward, including the remaining "a" characters.
Would this pattern work for you instead? New pattern: \ba+(.+)\b
It uses a word boundary \b to anchor both ends of the word. It matches at least one "a" followed by the rest of the characters up to the closing word boundary. The remaining characters are captured in a group so you can reference them easily.
string pattern = @"\ba+(.+)\b";
foreach (Match m in Regex.Matches("aaabbbb", pattern))
{
Console.WriteLine("Match: " + m.Value);
Console.WriteLine("Group capture: " + m.Groups[1].Value);
}
UPDATE: If you want to skip the first occurrence of any duplicated letters, then match the rest of the string, you could do this:
string pattern = @"\b(.)(\1)*(?<Content>.+)\b";
foreach (Match m in Regex.Matches("aaabbbb", pattern))
{
Console.WriteLine("Match: " + m.Value);
Console.WriteLine("Group capture: " + m.Groups["Content"].Value);
}

Regex to match and return group names

I need to match the following strings and returns the values as groups:
abctic
abctac
xyztic
xyztac
ghhtic
ghhtac
The pattern I wrote, with grouping, is as follows:
(?<arch>[abc,xyz,ghh])(?<flavor>[tic,tac]$)
The above returns only parts of group names. (meaning match is not correct).
If I use * in each sub pattern instead of $ at the end, groups are correct, but that would mean that abcticff will also match.
Please let me know what my correct regex should be.
Your pattern is incorrect because square brackets define a character class, which matches a single character from the listed set; [x,y] therefore matches one x, one comma, or one y. To specify alternative matches, use the pipe symbol | instead.
Your pattern should be: ^(?<arch>abc|xyz|ghh)(?<flavor>tic|tac)$
The ^ and $ metacharacters ensure the string matches from start to end. If you need to match text in a larger string you could replace them with \b to match on a word boundary.
Try this approach:
string[] inputs = { "abctic", "abctac", "xyztic", "xyztac", "ghhtic", "ghhtac" };
string pattern = @"^(?<arch>abc|xyz|ghh)(?<flavor>tic|tac)$";
foreach (var input in inputs)
{
var match = Regex.Match(input, pattern);
if (match.Success)
{
Console.WriteLine("Arch: {0} - Flavor: {1}",
match.Groups["arch"].Value,
match.Groups["flavor"].Value);
}
else
Console.WriteLine("No match for: " + input);
}
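To illustrate the two points made in this answer, here is a sketch that first shows what the bracketed form from the question actually matches, then uses the \b variant to pull pairs out of a longer string. The sample sentence and the class and method names are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AlternationDemo
{
    // Finds arch/flavor pairs inside a larger text using word boundaries.
    public static List<string> FindPairs(string text)
    {
        var pairs = new List<string>();
        var pattern = @"\b(?<arch>abc|xyz|ghh)(?<flavor>tic|tac)\b";
        foreach (Match m in Regex.Matches(text, pattern))
            pairs.Add(m.Groups["arch"].Value + "/" + m.Groups["flavor"].Value);
        return pairs;
    }

    public static void Main()
    {
        // A bracket expression is a character class: it matches ONE character
        // from the listed set, so a lone comma satisfies it and "abc" does not.
        Console.WriteLine(Regex.IsMatch(",", @"^[abc,xyz,ghh]$"));   // True
        Console.WriteLine(Regex.IsMatch("abc", @"^[abc,xyz,ghh]$")); // False

        // Hypothetical sample text; abcticff is skipped because no word
        // boundary follows "tic" there.
        foreach (var pair in FindPairs("use abctic and xyztac, but not abcticff"))
            Console.WriteLine(pair);
        // Prints:
        // abc/tic
        // xyz/tac
    }
}
```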
