value extraction using regular expressions - c#

I have a text file that contains commands in this format ex:>> call <<
I want to use a regular expression to extract "call".
How could this be done?
Regex ComandStart = new Regex(">>", RegexOptions.Multiline);
Regex ComandEnd = new Regex("<<", RegexOptions.Multiline);

Create the regex:
var regex = new Regex(@"ex:>>([a-z]+)<<");
Then extract the value if it matches:
var match = regex.Match("ex:>>call<<");
var yourString = match.Groups[1].Value; // yourString == "call" here
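The sample in the question has spaces around the command (">> call <<"), which the pattern above does not allow. A minimal sketch of a whitespace-tolerant variant follows; the class name CommandExtractor, the method name Extract, and the assumption that a command consists only of ASCII letters are illustrative and not taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class CommandExtractor
{
    // Assumption: a command is a run of ASCII letters, optionally padded
    // with whitespace, between ">>" and "<<".
    private static readonly Regex CommandPattern =
        new Regex(@">>\s*([A-Za-z]+)\s*<<");

    // Returns every command name found in the text, in order of appearance.
    public static List<string> Extract(string text)
    {
        var commands = new List<string>();
        foreach (Match m in CommandPattern.Matches(text))
        {
            commands.Add(m.Groups[1].Value);
        }
        return commands;
    }

    static void Main()
    {
        foreach (string command in Extract(">> call <<\n>>stop<<"))
        {
            Console.WriteLine(command);
        }
    }
}
```

Using Matches instead of Match picks up every command in the file contents rather than only the first one.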

If you're on Windows, you could use Notepad++ with the Regex helper plugin to test your regexes.
This expression works fine for me:
/ex:\>\>(.*)\<\</
No idea about c# specifically, sorry

Related

Regular expression in C# to find matches like "x-[some_string]"

What is the Regular Expression in C# to find matches inside text
that starting with "x-[" and ending with "]"?
I've tried something like this:
Regex urlRx = new Regex(@"^x-[.*]$", RegexOptions.IgnoreCase);
Simple:
x-\[([^]]+)\]
# that is: look for x-[ literally
# capture and save anything that is not a ]
# followed by ]
See a demo on regex101.com.
This should work
string input = "x-[ABCD]";
string pattern = "^x-\\[(.*)\\]$";
Regex rgx = new Regex(pattern);
Match match = rgx.Match(input);
if (match.Success) {
Console.WriteLine(match.Groups[1].Value);
}
UPDATE
As pointed out by Jan, there will be too much backtracking in cases like x-[ABCDEFGHJJHGHGFGHGFVFGHGFGHGFGHGGHGGHGDCNJK]ABCD]. My updated regex is similar to his:
^x-\[([^\]]*)\]$
Do you really need a regex for this? Simple string operations should serve your purpose:
bool isMatch = yourString.StartsWith("x-[") && yourString.EndsWith("]");
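If the whole string is expected to have the x-[...] shape, the same string operations can also return the text between the brackets. The following is a rough sketch only; BracketParser and GetInner are hypothetical names. Unlike the ^x-\[([^\]]*)\]$ pattern, it does not reject a ] inside the brackets, and it cannot find matches embedded in longer text.

```csharp
using System;

static class BracketParser
{
    // Returns the text between the leading "x-[" and the trailing "]",
    // or null when the input does not have that shape.
    public static string GetInner(string s)
    {
        const string prefix = "x-[";
        if (s == null || s.Length < prefix.Length + 1)
        {
            return null;
        }
        if (!s.StartsWith(prefix, StringComparison.Ordinal) ||
            !s.EndsWith("]", StringComparison.Ordinal))
        {
            return null;
        }
        // Skip the prefix and drop the final "]".
        return s.Substring(prefix.Length, s.Length - prefix.Length - 1);
    }

    static void Main()
    {
        Console.WriteLine(GetInner("x-[ABCD]"));
    }
}
```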

Regular Expression to match the pattern

I am looking for Regular Expression search pattern to find data within $< and >$.
string pattern = "\b\$<[^>]*>\$";
is not working.
Thanks,
You can make use of a tempered greedy token:
\$<(?:(?!\$<|>\$)[\s\S])*>\$
This way, you will match only the closest boundaries.
Your regex does not match because you do not allow > in-between your markers, and you are using \b where you most probably do not have a word boundary.
If you do not want to get the delimiters in the output, use capturing group:
\$<((?:(?!\$<|>\$)[\s\S])*)>\$
   ^                      ^
And the result will be in Group 1.
In C#, you should consider declaring all regex patterns (whenever possible) with the help of a verbatim string literal notation (with @"") because you won't have to worry about doubling backslashes:
var rx = new Regex(@"\$<(?:(?!\$<|>\$)[\s\S])*>\$");
Or, since there is a singleline flag (and this is preferable):
var rx = new Regex(@"\$<((?:(?!\$<|>\$).)*)>\$", RegexOptions.Singleline | RegexOptions.CultureInvariant);
var res = rx.Matches(text).Cast<Match>().Select(p => p.Groups[1].Value).ToList();
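To see the tempered greedy token at work, here is a small self-contained sketch. The class name MarkerExtractor, the method name Extract, and the sample text are made up for illustration. The second marker contains a lone >, which is the case the original [^>]* pattern could not handle.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class MarkerExtractor
{
    // Group 1 holds the text between "$<" and the closest ">$".
    private static readonly Regex MarkerPattern = new Regex(
        @"\$<((?:(?!\$<|>\$).)*)>\$",
        RegexOptions.Singleline | RegexOptions.CultureInvariant);

    // Returns the contents of every $<...>$ marker in the text.
    public static string[] Extract(string text)
    {
        return MarkerPattern.Matches(text)
            .Cast<Match>()
            .Select(m => m.Groups[1].Value)
            .ToArray();
    }

    static void Main()
    {
        string text = "Hello $<MYNAME>$, a > b here: $<x > y>$";
        // prints: MYNAME | x > y
        Console.WriteLine(string.Join(" | ", Extract(text)));
    }
}
```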
This pattern will do the work:
(?<=\$<).*(?=>\$)
Demo: https://regex101.com/r/oY6mO2/1
In PHP, you can find this pattern with the following regex (the $ characters must be escaped, otherwise they act as end-of-line anchors):
/\$<(.*?)>\$/s
For example:
$arrayWhichStoreKeyValueArrayOfYourPattern = array();
preg_match_all('/\$<(.*?)>\$/s',
$yourcontentinwhichyoufind,
$arrayWhichStoreKeyValueArrayOfYourPattern);
for($i=0;$i<count($arrayWhichStoreKeyValueArrayOfYourPattern[0]);$i++)
{
$content=
str_replace(
$arrayWhichStoreKeyValueArrayOfYourPattern[0][$i],
constant($arrayWhichStoreKeyValueArrayOfYourPattern[1][$i]),
$yourcontentinwhichyoufind);
}
This example replaces each marker in $yourcontentinwhichyoufind with the value of the constant that has the same name.
For example, suppose you have a string like this, and constants with the same names are defined:
**global.php**
//in this file my constant declared.
define("MYNAME","Hiren Raiyani");
define("CONSTANT_VAL","contant value");
**demo.php**
$content="Hello this is $<MYNAME>$ and this is simple demo to replace $<CONSTANT_VAL>$";
$myarray = array();
preg_match_all('/\$<(.*?)>\$/s', $content, $myarray);
for($i=0;$i<count($myarray[0]);$i++)
{
$content=str_replace(
$myarray[0][$i],
constant($myarray[1][$i]),
$content);
}
That's all, as far as I know.

Regular expression question (C#)

How do I write a regular expression to match (_Rev. n.nn) in the following filenames (where n is a number):
Filename_Rev. 1.00
Filename_Rev. 1.10
Thanks
The following should work (for the whole line):
@"^Filename_Rev\.\s\d\.\d\d$"
Should capture versions >9
Edit: Fixed
string captureString = "abc123butts_Rev. 1.00";
Regex reg = new Regex(@"(.(?!_Rev))+\w_Rev\. (?<version>\d+\.\d+)");
string version = reg.Match(captureString).Groups["version"].Value;
Building off of @leppie's answer (give him the green check, not me), you can extract the numbers from your regex match by putting parens around the \d's.
Regex foo = new Regex(@"_Rev\.\s(\d)\.(\d\d)$");
GroupCollection groups = foo.Match("Filename_Rev. 1.00").Groups;
string majorNum = groups[1].Value;
string minorNum = groups[2].Value;
System.Console.WriteLine(majorNum);
System.Console.WriteLine(minorNum);
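If the major or minor part can have more than one digit, \d+ can replace the fixed-width \d and \d\d. The sketch below also converts both parts to integers; RevisionParser and TryParse are hypothetical names, and treating the parts as integers is an assumption (it drops leading zeros, so "1.00" yields a minor value of 0).

```csharp
using System;
using System.Text.RegularExpressions;

static class RevisionParser
{
    // Matches a trailing "_Rev. <major>.<minor>" with any number of digits.
    private static readonly Regex RevisionPattern =
        new Regex(@"_Rev\.\s(\d+)\.(\d+)$");

    // Returns true and fills major/minor when the file name ends with a revision.
    public static bool TryParse(string fileName, out int major, out int minor)
    {
        major = 0;
        minor = 0;
        Match m = RevisionPattern.Match(fileName);
        if (!m.Success)
        {
            return false;
        }
        return int.TryParse(m.Groups[1].Value, out major)
            && int.TryParse(m.Groups[2].Value, out minor);
    }

    static void Main()
    {
        int major, minor;
        if (TryParse("Filename_Rev. 12.10", out major, out minor))
        {
            Console.WriteLine(major + " / " + minor);
        }
    }
}
```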

C# - Regex Match whole words

I need to match all the whole words containing a given a string.
string s = "ABC.MYTESTING
XYZ.YOUTESTED
ANY.TESTING";
Regex r = new Regex("(?<TM>[!\..]*TEST.*)", ...);
MatchCollection mc = r.Matches(s);
I need the result to be:
MYTESTING
YOUTESTED
TESTING
But I get:
TESTING
TESTED
.TESTING
How do I achieve this with Regular expressions.
Edit: Extended sample string.
If you were looking for all words including 'TEST', you should use
@"(?<TM>\w*TEST\w*)"
\w matches word characters; it is roughly [A-Za-z0-9_], although in .NET it also matches Unicode letters and digits unless RegexOptions.ECMAScript is set.
Keep it simple: why not just try \w*TEST\w* as the match pattern.
I get the results you are expecting with the following:
string s = @"ABC.MYTESTING
XYZ.YOUTESTED
ANY.TESTING";
var m = Regex.Matches(s, @"(\w*TEST\w*)", RegexOptions.IgnoreCase);
Try using \b. It's the regex anchor for a word boundary. If you wanted to match whole words, you could use:
/\b[a-z]+\b/i
BTW, .net doesn't need the surrounding /, and the i is just a case-insensitive match flag.
.NET Alternative:
var re = new Regex(@"\b[a-z]+\b", RegexOptions.IgnoreCase);
Using Groups I think you can achieve it.
string s = @"ABC.TESTING
XYZ.TESTED";
Regex r = new Regex(@"(?<TM>[!\..]*(?<test>TEST.*))", RegexOptions.Multiline);
var mc= r.Matches(s);
foreach (Match match in mc)
{
Console.WriteLine(match.Groups["test"]);
}
Works exactly like you want.
BTW, your regular expression pattern should be a verbatim string (@"")
Regex r = new Regex(@"(?<TM>[^.]*TEST.*)", RegexOptions.IgnoreCase);
First, as @manojlds said, you should use verbatim strings for regexes whenever possible. Otherwise you'll have to use two backslashes in most of your regex escape sequences, not just one (e.g. [!\\..]*).
Second, if you want to match anything but a dot, that part of the regex should be [^.]*. ^ is the metacharacter that inverts the character class, not !, and . has no special meaning in that context, so it doesn't need to be escaped. But you should probably use \w* instead, or even [A-Z]*, depending on what exactly you mean by "word". [!\..] matches ! or ..
Regex r = new Regex(@"(?<TM>[A-Z]*TEST[A-Z]*)", RegexOptions.IgnoreCase);
That way you don't need to bother with word boundaries, though they don't hurt:
Regex r = new Regex(@"(?<TM>\b[A-Z]*TEST[A-Z]*\b)", RegexOptions.IgnoreCase);
Finally, if you're always taking the whole match anyway, you don't need to use a capturing group:
Regex r = new Regex(@"\b[A-Z]*TEST[A-Z]*\b", RegexOptions.IgnoreCase);
The matched text will be available via Match's Value property.
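Putting the pieces together, a minimal sketch that collects every whole word containing a given fragment could look like this. WordFinder and WordsContaining are made-up names, and "word" is assumed to mean a run of \w characters.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class WordFinder
{
    // Returns every whole word in text that contains the fragment
    // (case-insensitive), in order of appearance.
    public static List<string> WordsContaining(string text, string fragment)
    {
        // Regex.Escape keeps characters such as '.' in the fragment literal.
        string pattern = @"\b\w*" + Regex.Escape(fragment) + @"\w*\b";
        var words = new List<string>();
        foreach (Match m in Regex.Matches(text, pattern, RegexOptions.IgnoreCase))
        {
            // No capturing group is needed; the whole match is the word.
            words.Add(m.Value);
        }
        return words;
    }

    static void Main()
    {
        string s = "ABC.MYTESTING\nXYZ.YOUTESTED\nANY.TESTING";
        foreach (string word in WordsContaining(s, "TEST"))
        {
            Console.WriteLine(word);
        }
    }
}
```

For the sample string from the question this yields MYTESTING, YOUTESTED and TESTING.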

Regular expression to retrieve everything before first slash

I need a regular expression to basically get the first part of a string, before the first backslash (\).
For example in the following:
C:\MyFolder\MyFile.zip
The part I need is "C:"
Another example:
somebucketname\MyFolder\MyFile.zip
I would need "somebucketname"
I also need a regular expression to retrieve the "right hand" part of it, so everything after the first slash (excluding the slash.)
For example
somebucketname\MyFolder\MyFile.zip
would return
MyFolder\MyFile.zip.
You don't need a regular expression (it would incur too much overhead for a simple problem like this), try this instead:
yourString = yourString.Substring(0, yourString.IndexOf('\\'));
And for finding everything after the first slash you can do this:
yourString = yourString.Substring(yourString.IndexOf('\\') + 1);
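One caveat with the Substring approach: IndexOf returns -1 when the string contains no backslash, and Substring(0, -1) then throws an ArgumentOutOfRangeException. A small sketch that guards against that and returns both parts at once is shown below; PathSplitter and SplitAtFirstBackslash are hypothetical names, and returning an empty tail when there is no backslash is an assumption about the desired behavior.

```csharp
using System;

static class PathSplitter
{
    // Splits the path at the first backslash. Item1 is the part before it,
    // Item2 the part after it (empty when the path contains no backslash).
    public static Tuple<string, string> SplitAtFirstBackslash(string path)
    {
        int index = path.IndexOf('\\');
        if (index < 0)
        {
            return Tuple.Create(path, string.Empty);
        }
        return Tuple.Create(path.Substring(0, index), path.Substring(index + 1));
    }

    static void Main()
    {
        var parts = SplitAtFirstBackslash(@"somebucketname\MyFolder\MyFile.zip");
        Console.WriteLine(parts.Item1);
        Console.WriteLine(parts.Item2);
    }
}
```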
This problem can be handled quite cleanly with the .NET regular expression engine. What makes .NET regular expressions really nice is the ability to use named group captures.
Using a named group capture allows you to define a name for each part of the regular expression you are interested in “capturing”, which you can reference later to get at its value. The syntax for a named group is (?<name>subexpression). Remember also to include the using System.Text.RegularExpressions; directive when using regular expressions in your project.
Enjoy!
//Regular expression
string _regex = @"(?<first_part>[a-zA-Z:0-9]+)\\{1}(?<second_part>(.)+)";
//Example 1
{
Match match = Regex.Match(@"C:\MyFolder\MyFile.zip", _regex, RegexOptions.IgnoreCase);
string firstPart = match.Groups["first_part"].Captures[0].Value;
string secondPart = match.Groups["second_part"].Captures[0].Value;
}
//Example 2
{
Match match = Regex.Match(@"somebucketname\MyFolder\MyFile.zip", _regex, RegexOptions.IgnoreCase);
string firstPart = match.Groups["first_part"].Captures[0].Value;
string secondPart = match.Groups["second_part"].Captures[0].Value;
}
You are aware that .NET's file handling classes do this a lot more elegantly, right?
For example in your last example, you could do:
FileInfo fi = new FileInfo(@"somebucketname\MyFolder\MyFile.zip");
string nameOnly = fi.Name;
The first example you could do:
FileInfo fi = new FileInfo(@"C:\MyFolder\MyFile.zip");
string driveOnly = fi.Directory.Root.Name.Replace(@"\", ""); // Root is a DirectoryInfo member, reached via fi.Directory
This matches all non \ chars
[^\\]*
Here is a regular expression solution using the lazy (non-greedy) quantifier '*?'. Note that the match includes the trailing backslash:
var pattern = "^.*?\\\\";
var m = Regex.Match("c:\\test\\gimmick.txt", pattern);
MessageBox.Show(m.Captures[0].Value);
Split on the backslash, then take the first item:
string[] words = s.Split('\\');
string first = words[0];
