Regular expression in C# to find matches like "x-[some_string]" - c#

What is the regular expression in C# to find matches inside text
that start with "x-[" and end with "]"?
I've tried something like this:
Regex urlRx = new Regex(@"^x-[.*]$", RegexOptions.IgnoreCase);

Simple:
x-\[([^]]+)\]
# that is: look for x-[ literally
# capture and save anything that is not a ]
# followed by ]
See a demo on regex101.com.
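As a minimal sketch, here is how that pattern could be applied in C# to pull every bracketed value out of a longer text. The sample text is made up for illustration, and RegexOptions.IgnoreCase is carried over from the question.

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Hypothetical sample text; only the x-[...] tokens should match.
        string text = "first x-[alpha], then x-[beta gamma] and a plain [skipped]";

        // x-\[     literal prefix
        // ([^]]+)  capture one or more characters that are not ]
        // \]       literal closing bracket
        foreach (Match m in Regex.Matches(text, @"x-\[([^]]+)\]", RegexOptions.IgnoreCase))
        {
            Console.WriteLine(m.Groups[1].Value);
        }
        // Prints:
        // alpha
        // beta gamma
    }
}
```

Because the pattern has no ^ or $ anchors, it finds matches anywhere inside the text rather than requiring the whole string to match.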

This should work
string input = "x-[ABCD]";
string pattern = "^x-\\[(.*)\\]$";
Regex rgx = new Regex(pattern);
Match match = rgx.Match(input);
if (match.Success) {
Console.WriteLine(match.Groups[1].Value);
}
UPDATE
As Jan pointed out, there will be too much backtracking in cases like x-[ABCDEFGHJJHGHGFGHGFVFGHGFGHGFGHGGHGGHGDCNJK]ABCD]. My updated regex is similar to his:
^x-\[([^\]]*)\]$

Do you really need a regex for this? Simple string operations should serve your purpose:
yourString.EndsWith("]");
yourString.StartsWith("x-[");
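A rough sketch of how the two checks could be combined to also extract the inner text. The helper name ExtractInner is hypothetical, and this only handles the case where the whole string is a single x-[...] token, not matches embedded in longer text.

```csharp
using System;

class Program
{
    // Hypothetical helper: returns the text between "x-[" and the final "]",
    // or null when the string does not have that shape.
    static string ExtractInner(string s)
    {
        const string prefix = "x-[";
        if (s.Length > prefix.Length
            && s.StartsWith(prefix, StringComparison.OrdinalIgnoreCase)
            && s.EndsWith("]", StringComparison.Ordinal))
        {
            // Skip the prefix and drop the trailing bracket.
            return s.Substring(prefix.Length, s.Length - prefix.Length - 1);
        }
        return null;
    }

    static void Main()
    {
        Console.WriteLine(ExtractInner("x-[ABCD]"));               // ABCD
        Console.WriteLine(ExtractInner("y-[ABCD]") ?? "no match"); // no match
    }
}
```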

Related

C# Capturing the first match with regex

I've got an input string that looks like this:
url=https%3A%2F%2Fdomain.com%2Fsale-deal%3Futm_source%3Dinsider-primary-action%3Dinsider-primary-action&utm_source=FB
or
url=https%3A%2F%2Fdomain.com%2Fsale&utm_source=FB&sub_id1=M12
The string sometimes contains %3Futm_source and sometimes does not.
How can I get the link between url= and either %3Futm_source or &utm_source?
Regex reg = new Regex(@"url=(https%3A%2F%2Fdomain.com[a-zA-Z0-9-_/%\.]+)%3Futm_source|&utm_source");
Match result = reg.Match(inPut);
Console.WriteLine(result.Groups[1].Value);
It always captures everything from url= to &utm_source.
You can use this:
(?<=url=).*?(?=%3Futm_source|&utm_source)
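(?<=url=) - Positive lookbehind. Asserts that url= comes immediately before the match.
.*? - Lazily matches anything except a newline, as few characters as possible.
(?=%3Futm_source|&utm_source) - Positive lookahead. Asserts that %3Futm_source or &utm_source follows the match.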
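A minimal sketch of this lookaround pattern in C#, run against the two sample strings from the question. Since both lookarounds are zero-width, the match value itself is the encoded link and no capturing group is needed.

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // The two inputs quoted in the question.
        string[] inputs =
        {
            "url=https%3A%2F%2Fdomain.com%2Fsale-deal%3Futm_source%3Dinsider-primary-action%3Dinsider-primary-action&utm_source=FB",
            "url=https%3A%2F%2Fdomain.com%2Fsale&utm_source=FB&sub_id1=M12"
        };

        var rx = new Regex(@"(?<=url=).*?(?=%3Futm_source|&utm_source)");

        foreach (string input in inputs)
        {
            Match m = rx.Match(input);
            if (m.Success)
            {
                Console.WriteLine(m.Value);
            }
        }
        // Prints:
        // https%3A%2F%2Fdomain.com%2Fsale-deal
        // https%3A%2F%2Fdomain.com%2Fsale
    }
}
```

The lazy quantifier stops at whichever of the two terminators appears first, which is why the first input is cut at %3Futm_source rather than at the later &utm_source.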

Regular Expression to match the pattern

I am looking for a regular expression search pattern to find data between $< and >$.
string pattern = "\b\$<[^>]*>\$";
is not working.
Thanks,
You can make use of a tempered greedy token:
\$<(?:(?!\$<|>\$)[\s\S])*>\$
See demo
This way, you will match only the closest boundaries.
Your regex does not match because you do not allow > in-between your markers, and you are using \b where you most probably do not have a word boundary.
If you do not want the delimiters in the output, use a capturing group:
\$<((?:(?!\$<|>\$)[\s\S])*)>\$
   ^                      ^
And the result will be in Group 1.
In C#, you should consider declaring regex patterns (whenever possible) as verbatim string literals (with @"") because you then won't have to worry about doubling backslashes:
var rx = new Regex(@"\$<(?:(?!\$<|>\$)[\s\S])*>\$");
Or, since there is a singleline flag (and this is preferable):
var rx = new Regex(@"\$<((?:(?!\$<|>\$).)*)>\$", RegexOptions.Singleline | RegexOptions.CultureInvariant);
var res = rx.Matches(text).Cast<Match>().Select(p => p.Groups[1].Value).ToList();
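A small self-contained sketch of the Singleline variant, collecting every Group 1 value through Regex.Matches (which returns all matches, unlike Regex.Match). The input text is invented to show a stray opening marker and a value that spans two lines.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Hypothetical input: a stray "$<junk " opener and a multi-line value.
        string text = "a $<first>$ b $<junk $<second>$ c $<multi\nline>$";

        var rx = new Regex(@"\$<((?:(?!\$<|>\$).)*)>\$",
                           RegexOptions.Singleline | RegexOptions.CultureInvariant);

        // Cast<Match>() makes the MatchCollection usable with LINQ.
        var res = rx.Matches(text).Cast<Match>()
                    .Select(m => m.Groups[1].Value)
                    .ToList();

        foreach (string value in res)
        {
            // Show embedded newlines as \n so each result stays on one line.
            Console.WriteLine("[" + value.Replace("\n", "\\n") + "]");
        }
        // Prints:
        // [first]
        // [second]
        // [multi\nline]
    }
}
```

The stray "$<junk " never becomes part of a result because the tempered token refuses to step over a second $<, so the match can only start at the closest opening marker.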
This pattern will do the work:
(?<=\$<).*(?=>\$)
Demo: https://regex101.com/r/oY6mO2/1
To find this pattern in PHP, you can use the following regex (the $ signs must be escaped, otherwise they act as end-of-string anchors):
/\$<(.*?)>\$/s
For example:
$arrayWhichStoreKeyValueArrayOfYourPattern = array();
preg_match_all('/\$<(.*?)>\$/s',
$yourcontentinwhichyoufind,
$arrayWhichStoreKeyValueArrayOfYourPattern);
for($i=0;$i<count($arrayWhichStoreKeyValueArrayOfYourPattern[0]);$i++)
{
$content=
str_replace(
$arrayWhichStoreKeyValueArrayOfYourPattern[0][$i],
constant($arrayWhichStoreKeyValueArrayOfYourPattern[1][$i]),
$yourcontentinwhichyoufind);
}
This example replaces each placeholder in $yourcontentinwhichyoufind with the value of the constant that has the same name.
For example, suppose you have a string like the one below and constants whose names match its placeholders.
**global.php**
//in this file my constant declared.
define("MYNAME","Hiren Raiyani");
define("CONSTANT_VAL","constant value");
**demo.php**
$content="Hello this is $<MYNAME>$ and this is simple demo to replace $<CONSTANT_VAL>$";
$myarray = array();
preg_match_all('/\$<(.*?)>\$/s', $content, $myarray);
for($i=0;$i<count($myarray[0]);$i++)
{
$content=str_replace(
$myarray[0][$i],
constant($myarray[1][$i]),
$content);
}
That's all, as far as I know.

How can I split a regex into exact words?

I need a little help regarding Regular Expressions in C#
I have the following string
"[[Sender.Name]]\r[[Sender.AdditionalInfo]]\r[[Sender.Street]]\r[[Sender.ZipCode]] [[Sender.Location]]\r[[Sender.Country]]\r"
The string could also contain spaces and theoretically any other characters. So I really need to match the [[words]].
What I need is a text array like this
"[[Sender.Name]]",
"[[Sender.AdditionalInfo]]",
"[[Sender.Street]]",
// ... And so on.
I'm pretty sure that this is perfectly doable with:
var stringArray = Regex.Split(line, @"\[\[+\]\]");
I'm just too stupid to find the correct Regex for the Regex.Split() call.
Anyone here that can tell me the correct Regular Expression to use in my case?
As you can tell I'm not that experienced with RegEx :)
Why don't you split on "\r"?
You don't need a regex for that; just use the standard string function:
string[] delimiters = { "\r" }; // regular string literal, so \r is an actual carriage return
string[] split = line.Split(delimiters, StringSplitOptions.None);
Use matching rather than splitting if you want to get the [[..]] blocks.
Regex rgx = new Regex(@"\[\[.*?\]\]");
foreach (Match m in rgx.Matches(input))
Console.WriteLine(m.Groups[0].Value);
The regex you are using (\[\[+\]\]) matches one literal [, then one or more further [s, then two literal ]s. It allows nothing between the brackets.
A regex solution is to match the non-] characters between the doubled [ and ] (I assume the string inside the brackets should not be empty), and then cast the MatchCollection to a list or array (here is an example with a list):
var str = "[[Sender.Name]]\r[[Sender.AdditionalInfo]]\r[[Sender.Street]]\r[[Sender.ZipCode]] [[Sender.Location]]\r[[Sender.Country]]\r";
var rgx22 = new Regex(@"\[\[[^]]+?\]\]");
var res345 = rgx22.Matches(str).Cast<Match>().ToList();
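To get the string array the question asks for, the matches can be projected to their values. This is only a sketch of one way to do it, using the sample line from the question; the variable names are arbitrary.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        string line = "[[Sender.Name]]\r[[Sender.AdditionalInfo]]\r[[Sender.Street]]\r[[Sender.ZipCode]] [[Sender.Location]]\r[[Sender.Country]]\r";

        // Match each [[...]] token instead of splitting on what lies between them.
        string[] tokens = Regex.Matches(line, @"\[\[[^]]+?\]\]")
                               .Cast<Match>()
                               .Select(m => m.Value)
                               .ToArray();

        foreach (string token in tokens)
        {
            Console.WriteLine(token);
        }
        // Prints:
        // [[Sender.Name]]
        // [[Sender.AdditionalInfo]]
        // [[Sender.Street]]
        // [[Sender.ZipCode]]
        // [[Sender.Location]]
        // [[Sender.Country]]
    }
}
```

Matching sidesteps the problem that the separators vary (a carriage return in most places, a space between ZipCode and Location).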

Select only numeric part of a selection in a single regex

Well, I don't know how to explain that exactly, but I have this text:
abc=0;def=2;abc=1;ghi=4;jkl=2
What I want to do is select abc=0 and abc=1, but excluding the abc= part...
My regex is abc=\d+, but it includes the abc= part...
I read something about this, and the answer was (?!abc=)\d+, but it selects all the numbers in the text...
So, can somebody help me with this?
Thanks in advance.
If your regex engine supports \K, you could use the regex below to match the number that comes right after the string abc=:
abc=\K\d+
Or, if your engine doesn't support \K (the .NET engine does not), use a positive lookbehind:
(?<=abc=)\d+
The C# code would be:
{
string str = "abc=0;def=2;abc=1;ghi=4;jkl=2";
Regex rgx = new Regex(@"(?<=abc=)\d+");
foreach (Match m in rgx.Matches(str))
Console.WriteLine(m.Value);
}
Explanation:
(?<=abc=) Positive lookbehind, which places the start of the match just after the string abc=.
\d+ Matches one or more digits.
You don't need a lookaround assertion here. You can simply use a capturing group to capture the matched context that you want and refer back to the matched group using the Match.Groups Property.
abc=(\d+)
Example:
string s = "abc=0;def=2;abc=1;ghi=4;jkl=2";
foreach (Match m in Regex.Matches(s, @"abc=(\d+)"))
Console.WriteLine(m.Groups[1].Value);
Output
0
1

C# - Regex Match whole words

I need to match all the whole words containing a given string.
string s = "ABC.MYTESTING
XYZ.YOUTESTED
ANY.TESTING";
Regex r = new Regex("(?<TM>[!\..]*TEST.*)", ...);
MatchCollection mc = r.Matches(s);
I need the result to be:
MYTESTING
YOUTESTED
TESTING
But I get:
TESTING
TESTED
.TESTING
How do I achieve this with regular expressions?
Edit: Extended sample string.
If you are looking for all words containing 'TEST', you should use
@"(?<TM>\w*TEST\w*)"
\w matches word characters. With RegexOptions.ECMAScript it is short for [A-Za-z0-9_]; by default in .NET it also matches other Unicode letters and digits.
Keep it simple: why not just try \w*TEST\w* as the match pattern.
I get the results you are expecting with the following:
string s = @"ABC.MYTESTING
XYZ.YOUTESTED
ANY.TESTING";
var m = Regex.Matches(s, @"(\w*TEST\w*)", RegexOptions.IgnoreCase);
Try using \b. It's the regex anchor for a word boundary. If you wanted to match whole words you could use:
/\b[a-z]+\b/i
BTW, .NET doesn't need the surrounding /, and the i is just a case-insensitive match flag.
.NET alternative:
var re = new Regex(@"\b[a-z]+\b", RegexOptions.IgnoreCase);
Using Groups I think you can achieve it.
string s = @"ABC.TESTING
XYZ.TESTED";
Regex r = new Regex(@"(?<TM>[!\..]*(?<test>TEST.*))", RegexOptions.Multiline);
var mc= r.Matches(s);
foreach (Match match in mc)
{
Console.WriteLine(match.Groups["test"]);
}
Works exactly like you want.
BTW, your regular expression pattern should be a verbatim string (@""):
Regex r = new Regex(@"(?<TM>[^.]*TEST.*)", RegexOptions.IgnoreCase);
First, as @manojlds said, you should use verbatim strings for regexes whenever possible. Otherwise you'll have to use two backslashes in most of your regex escape sequences, not just one (e.g. [!\\..]*).
Second, if you want to match anything but a dot, that part of the regex should be [^.]*. ^ is the metacharacter that inverts the character class, not !, and . has no special meaning in that context, so it doesn't need to be escaped. As written, [!\..] matches either ! or a dot. You should probably use \w* instead, or even [A-Z]*, depending on what exactly you mean by "word".
Regex r = new Regex(@"(?<TM>[A-Z]*TEST[A-Z]*)", RegexOptions.IgnoreCase);
That way you don't need to bother with word boundaries, though they don't hurt:
Regex r = new Regex(@"(?<TM>\b[A-Z]*TEST[A-Z]*\b)", RegexOptions.IgnoreCase);
Finally, if you're always taking the whole match anyway, you don't need to use a capturing group:
Regex r = new Regex(@"\b[A-Z]*TEST[A-Z]*\b", RegexOptions.IgnoreCase);
The matched text will be available via Match's Value property.
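Putting that last pattern together with the sample data from the question, a minimal sketch might look like this. The input is written with \n line breaks, which is an assumption about how the original multi-line string was built.

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Sample data from the question, joined with \n (assumed line breaks).
        string s = "ABC.MYTESTING\nXYZ.YOUTESTED\nANY.TESTING";

        // No capturing group: the whole match is the word containing TEST.
        var r = new Regex(@"\b[A-Z]*TEST[A-Z]*\b", RegexOptions.IgnoreCase);

        foreach (Match m in r.Matches(s))
        {
            Console.WriteLine(m.Value);
        }
        // Prints:
        // MYTESTING
        // YOUTESTED
        // TESTING
    }
}
```

The prefixes ABC, XYZ and ANY are skipped because the dot is not in [A-Z], so a match cannot extend across it.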
