Write a regular expression to search for a substring in C#

I have a string which I want to examine and search for a substring within it. If the substring is found, I want to do something on the original string.
The string looks like this:
"\r\radmin#Modem -- *<456> \radmin#Modem -- *<456> "
Goal: check whether the substring pattern " -- *<456> " occurs in the string, and report success or failure. The number between < and > can have one or more digits, with no upper limit (1, 5, 36, 76, 478, 975, etc.).
What regular expression pattern do I need?

Use this:
var myRegex = new Regex("(?<=<)[0-9]+(?=>)");
string resultString = myRegex.Match(yourString).Value;
Console.WriteLine(resultString);
// matches 456
Explanation
The lookbehind (?<=<) asserts that what precedes is <
[0-9]+ matches one or more digits
The lookahead (?=>) asserts that what follows is >

You can use the following piece of code to check whether your pattern exists:
string yourInput = "\r\radmin#Modem -- *<456> \radmin#Modem -- *<456> ";
string pattern = @"<(\d+)>";
bool success = Regex.Match(yourInput, pattern, RegexOptions.IgnoreCase).Success;
success will be true if a number enclosed in < and > is found.

With this pattern you can match the string: "--\s\*\<\d{3}\>"
Note: If the number of digits can vary, use "--\s\*\<\d{MIN,MAX}\>", where MIN and MAX are the smallest and largest number of digits that can appear in the part of the string you want to match. If there is no upper limit, use \d+ instead of \d{MIN,MAX}.
using System;
using System.Text.RegularExpressions;
class Example
{
    static void Main()
    {
        // The input string from the question.
        string text = "\r\radmin#Modem -- *<456> \radmin#Modem -- *<456> ";
        // This regex will match the pattern you're looking for.
        // Since you're new to regexes :) I'll explain it a little:
        // "--" matches "--" literally, "\s" matches exactly one whitespace character.
        // "\*" matches the "*", and "\<" and "\>" match "<" and ">" respectively.
        // "\d" matches a digit and "{3}" requires exactly three of them.
        string pat = @"--\s\*\<\d{3}\>";
        // Instantiate the regular expression object.
        Regex r = new Regex(pat, RegexOptions.IgnoreCase);
        // Match the regular expression pattern against a text string.
        Match m = r.Match(text);
        while (m.Success)
        {
            // Do something ...
            // Find next match
            m = m.NextMatch();
        }
    }
}
This will allow you to make any changes on a per match basis. So every time you match the regex you can do something to your string and then look if there is another match and so on...

Perhaps the following can help you:
static void Main(string[] args)
{
    string originalString = "\r\radmin#Modem -- *<456> \radmin#Modem -- *<456> ";
    Regex reg = new Regex(@"-- \*<[1-9][0-9]*>");
    bool isMatch = reg.IsMatch(originalString);
    Console.WriteLine(isMatch);
}

You can use Regex.IsMatch using this regular expression --\\s\\*<\\d+> for matching strings like -- *<456>
bool MatchTheNumTag(string str)
{
Regex reg = new Regex("--\\s\\*<\\d+>");
return reg.IsMatch(str);
}
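The answers above can be combined into one small sketch. It assumes the separator is always "--" followed by a single whitespace character and "*", and that the number has one or more digits; the class and method names (ModemTag, HasTag, CountTags) are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class ModemTag
{
    // "--", one whitespace character, a literal '*', then '<', one or more digits, '>'.
    static readonly Regex TagRegex = new Regex(@"--\s\*<(\d+)>");

    // True if the tag occurs at least once in the input.
    public static bool HasTag(string input) => TagRegex.IsMatch(input);

    // Number of non-overlapping occurrences of the tag.
    public static int CountTags(string input) => TagRegex.Matches(input).Count;

    static void Main()
    {
        string sample = "\r\radmin#Modem -- *<456> \radmin#Modem -- *<456> ";
        Console.WriteLine(HasTag(sample));                  // True
        Console.WriteLine(CountTags(sample));               // 2
        Console.WriteLine(HasTag("admin#Modem -- *<> "));   // False: no digits between < and >
    }
}
```

IsMatch is enough when only success or failure matters; Matches is the option when each occurrence needs to be processed.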

You can use this regex:
<[1-9][0-9]*>
Explanation:
[1-9]
matches a single digit from 1 to 9, so the number cannot start with 0.
[0-9]*
matches any digit from 0 to 9; the * allows zero or more of them, so the number can be as long as you want.
Another way:
You can also use the shorthand class for digits, although whether it is available depends on the regex flavor (it is in .NET):
\d

Welcome to regexes!
You have to know that certain characters in regexes are special characters and need to be escaped; you can find them here: http://www.regular-expressions.info/characters.html
This means that a regex for your pattern would be \s\-\-\s\*<456>
\s just means whitespace. To accept any number rather than only 456, replace 456 with \d+.

Related

Build a regex that does not contain the first and last character you are looking for in the match

I have the following problem.
This is what the regex looks like:
var regexTest = new Regex(@"'\d.*\d#");
This is what the string looks like:
var text = "dsadsadsadsa('1.222222#dsadsa'";
That is the result of what I would like to have:
1.222222
That's the result I'm getting right now ...:
'1.222222#
You want to extract the float number between ' and #, so use
var text = "dsadsadsadsa('1.222222#dsadsa'";
var regexTest = new Regex(@"'(\d+\.\d+)#");
var m = regexTest.Match(text);
if (m.Success)
{
    Console.WriteLine(m.Groups[1].Value);
}
Here, (\d+\.\d+) captures 1+ digits, a . and then 1+ digits into Group 1, which you can access using m.Groups[1].Value. Only use that value if there was a match (see the m.Success check in the snippet); after a failed match the group is empty.
Just enclose the part you want to get in parentheses, so that you can get it as a group:
var regexTest = new Regex(@"'(\d.*\d)#");
-----------------------------^------^----
In '\d.*\d# you are matching ' followed by a digit, then any character 0+ times, followed by a digit and #. That would match '1.222222# but also, for example, '1.A2# because of the .*
To avoid matching the ' and the #, you could use a positive lookbehind and a positive lookahead to assert that they are there. If you only want to match digits and the decimal point, the .* can be replaced with an explicit \d+\.\d+:
(?<=')\d+\.\d+(?=#)
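The two approaches from these answers (a capture group versus lookarounds) can be compared side by side. This is only a sketch; the class and method names are hypothetical, and both methods return null when nothing matches.

```csharp
using System;
using System.Text.RegularExpressions;

static class FloatExtractor
{
    // Variant 1: the delimiters are part of the match, but only Group 1 is returned.
    public static string WithGroup(string text)
    {
        Match m = Regex.Match(text, @"'(\d+\.\d+)#");
        return m.Success ? m.Groups[1].Value : null;
    }

    // Variant 2: the delimiters are only asserted, so the whole match is the number.
    public static string WithLookarounds(string text)
    {
        Match m = Regex.Match(text, @"(?<=')\d+\.\d+(?=#)");
        return m.Success ? m.Value : null;
    }

    static void Main()
    {
        string text = "dsadsadsadsa('1.222222#dsadsa'";
        Console.WriteLine(WithGroup(text));        // 1.222222
        Console.WriteLine(WithLookarounds(text));  // 1.222222
    }
}
```

Both give the same result here; the capture group is usually easier to read, while lookarounds are convenient when the whole match has to be the value, for example with Regex.Replace.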

search string for everything before a set of characters in C#

I'm looking for a way to search a string for everything before a set of characters in C#. For Example, if this is my string value:
This is is a test.... 12345
I want to build a new string with all of the characters before "12345".
So my new string would equal "This is is a test.... "
Is there a way to do this?
I've found Regex examples where you can focus on one character but not a sequence of characters.
You don't need to use a Regex:
public string GetBitBefore(string text, string end)
{
    var index = text.IndexOf(end);
    if (index == -1) return text;
    return text.Substring(0, index);
}
You can use a lazy quantifier to match anything, followed by a lookahead:
var match = Regex.Match("This is is a test.... 12345", @".*?(?=\d{5})");
where:
.*? lazily matches everything (up to the lookahead)
(?=…) is a positive lookahead: the pattern must be matched, but is not included in the result
\d{5} matches exactly five digits. I'm assuming this is what should end your prefix; you can replace it with any other pattern
You can do so with help of regex lookahead.
.*(?=12345)
Example:
var data = "This is is a test.... 12345";
var rxStr = ".*(?=12345)";
var rx = new System.Text.RegularExpressions.Regex (rxStr,
System.Text.RegularExpressions.RegexOptions.IgnoreCase);
var match = rx.Match(data);
if (match.Success) {
Console.WriteLine (match.Value);
}
The code snippet above prints everything up to 12345:
This is is a test....
For more detail, see the documentation on regex positive lookahead.
This should get you started:
var reg = new Regex("^(.+)12345$");
var match = reg.Match("This is is a test.... 12345");
var group = match.Groups[1]; // This is is a test....
Of course you'd want to do some additional validation, but this is the basic idea.
^ means start of string
$ means end of string
The asterisk tells the engine to attempt to match the preceding token zero or more times. The plus tells the engine to attempt to match the preceding token one or more times.
{min,max} indicates the minimum and maximum number of repetitions.
\d matches a single character that is a digit, \w matches a "word character" (alphanumeric characters plus underscore), and \s matches a whitespace character (includes tabs and line breaks).
[^a] is a negated character class: it matches any single character except a
The dot matches a single character, except line break characters
In your case there are many ways to accomplish the task.
E.g. excluding digits: ^[^\d]* (equivalently ^\D*) matches everything from the start of the string up to the first digit.
If you know the set of characters and they are not only digits, don't use a regex but IndexOf(). If you know that the separator between the first and second part is "...", you can use Split()
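The three options mentioned above (IndexOf, the ^\D* pattern, and Split) might look like this in practice. This is a sketch under the assumption that the marker or separator is known in advance; PrefixDemo and its method names are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

static class PrefixDemo
{
    // Everything before the first occurrence of a known marker; the whole text if it is absent.
    public static string BeforeMarker(string text, string marker)
    {
        int index = text.IndexOf(marker, StringComparison.Ordinal);
        return index < 0 ? text : text.Substring(0, index);
    }

    // Everything from the start of the string up to the first digit.
    public static string BeforeFirstDigit(string text) => Regex.Match(text, @"^\D*").Value;

    // Everything before the first occurrence of a known separator, via Split.
    public static string BeforeSeparator(string text, string separator) =>
        text.Split(new[] { separator }, StringSplitOptions.None)[0];

    static void Main()
    {
        string input = "This is is a test.... 12345";
        Console.WriteLine("[" + BeforeMarker(input, "12345") + "]");   // [This is is a test.... ]
        Console.WriteLine("[" + BeforeFirstDigit(input) + "]");        // [This is is a test.... ]
        Console.WriteLine("[" + BeforeSeparator(input, "...") + "]");  // [This is is a test]
    }
}
```

The regex variant only helps when the end of the prefix is described by a pattern rather than by a fixed string.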
Take a look at this snippet:
class Program
{
    static void Main(string[] args)
    {
        string input = "This is is a test.... 12345";
        // Here we call Regex.Matches with two named groups.
        MatchCollection matches = Regex.Matches(input, @"(?<MySentence>(\w+\s*)*)(?<MyNumberPart>\d*)");
        foreach (Match item in matches)
        {
            Console.WriteLine(item.Groups["MySentence"]);
            Console.WriteLine("******");
            Console.WriteLine(item.Groups["MyNumberPart"]);
        }
        Console.ReadKey();
    }
}
You could just split, although that is not as efficient as the IndexOf solution:
string value = "oiasjdoiasj12345";
string end = "12345";
string result = value.Split(new string[] { end }, StringSplitOptions.None)[0]; // Take the first part of the result; not the quickest but fairly simple

Regex to find special pattern

I have a string to parse. First I have to check if string contains special pattern:
I wanted to know if there is substrings which starts with "$(",
and end with ")",
and between those start and end special strings,there should not be
any white-empty space,
it should not include "$" character inside it.
I have a little regex for it in C#
string input = "$(abc)";
string pattern = @"\$\(([^$][^\s]*)\)";
Regex rgx = new Regex(pattern, RegexOptions.IgnoreCase);
MatchCollection matches = rgx.Matches(input);
foreach (var match in matches)
{
    Console.WriteLine("value = " + match);
}
It works for many cases but fails for the input $(a$(), where the inner expression is empty. I do NOT want a match when the input is $() (that is, when there is nothing between the start and end identifiers).
What is wrong with my regex?
Note: [^$] matches any single character except $
Use the below regex if you want to match $()
\$\(([^\s$]*)\)
Use the below regex if you don't want to match $(),
\$\(([^\s$]+)\)
* repeats the preceding token zero or more times.
+ Repeats the preceding token one or more times.
Your regex \$\(([^$][^\s]*)\) is wrong. It won't allow $ as the first character inside (), but it allows it as the second, third, etc. You need to combine the negated classes in your regex in order to match any character that is neither whitespace nor $.
Your current regex does not match $() because the [^$] matches at least 1 character. The only way I can think of where you would have this match would be when you have an input containing more than one parens, like:
$()(something)
In those cases, you will also need to exclude at least the closing paren:
string pattern = @"\$\(([^$\s)]+)\)";
The above matches for example:
abc in $(abc) and
abc and def in $(def)$()$(abc)(something).
Simply replace the * with a + and merge the options.
string pattern = @"\$\(([^$\s]+)\)";
+ means 1 or more
* means 0 or more
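To see how the merged negated class behaves, here is a small sketch that runs the stricter pattern from the answers (the one that also excludes the closing parenthesis) against a few inputs. The DollarParens class and its Extract method are hypothetical names used only for this illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class DollarParens
{
    // "$(", then one or more characters that are not '$', whitespace or ')', then ")".
    static readonly Regex Pattern = new Regex(@"\$\(([^$\s)]+)\)");

    // Returns the content of every "$(...)" occurrence that satisfies the rules.
    public static string[] Extract(string input) =>
        Pattern.Matches(input)
               .Cast<Match>()
               .Select(m => m.Groups[1].Value)
               .ToArray();

    static void Main()
    {
        string[] inputs = { "$(abc)", "$()", "$(a b)", "$(a$b)", "$(def)$()$(abc)(something)" };
        foreach (string input in inputs)
        {
            Console.WriteLine(input + " -> [" + string.Join(", ", Extract(input)) + "]");
        }
    }
}
```

Only the first and last inputs produce results (abc, and def plus abc); the empty, whitespace-containing and $-containing bodies are rejected.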

Get a special substring in C#

I need to extract a substring from an existing string. This string starts with uninteresting characters (including "," "space" and numbers) and ends with ", 123," or ", 57," or something similar, where the numbers can change. I only need the numbers.
Thanks
public static void Main(string[] args)
{
    string input = "This is 2 much junk, 123,";
    var match = Regex.Match(input, @"(\d+),$"); // Ends with at least one digit
                                                // followed by a comma;
                                                // grab the digits.
    if (match.Success)
        Console.WriteLine(match.Groups[1]); // Prints '123'
}
Regex to match numbers: Regex regex = new Regex(@"\d+");
Source (slightly modified): Regex for numbers only
I think this is what you're looking for:
Remove all non numeric characters from a string using Regex
using System.Text.RegularExpressions;
...
string newString = Regex.Replace(oldString, "[^.0-9]", "");
(If you don't want to allow the decimal delimiter in the final result, remove the . from the regular expression above).
Try something like this :
String numbers = new String(yourString.TakeWhile(x => char.IsNumber(x)).ToArray());
You can use \d+ to match all runs of digits within a given string.
So your code would be
var lst = Regex.Matches(inp, @"\d+")
    .Cast<Match>()
    .Select(x => x.Value);
lst now contains all the numbers.
But if your input always looks like the one in your question, you don't need a regex:
int start = input.LastIndexOf(", ") + 2;
string number = input.Substring(start, input.LastIndexOf(",") - start);
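For the case described in the question (junk first, then a number followed by a final comma), an anchored pattern with a capture group might be the most direct route. The sketch below assumes the number is always the last thing before the trailing comma, optionally followed by whitespace; TrailingNumber and Extract are illustrative names.

```csharp
using System;
using System.Text.RegularExpressions;

static class TrailingNumber
{
    // One or more digits, a comma, optional trailing whitespace, then end of input.
    public static string Extract(string input)
    {
        Match m = Regex.Match(input, @"(\d+),\s*$");
        return m.Success ? m.Groups[1].Value : null;
    }

    static void Main()
    {
        Console.WriteLine(Extract("This is 2 much junk, 123,"));  // 123
        Console.WriteLine(Extract("a, b 4, 57,"));                 // 57
        Console.WriteLine(Extract("no number here,") == null);    // True
    }
}
```

Because the pattern is anchored with $, the earlier digits in the junk part (such as the 2 in the first sample) are ignored.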

Regex to match and return group names

I need to match the following strings and returns the values as groups:
abctic
abctac
xyztic
xyztac
ghhtic
ghhtac
The pattern I wrote with grouping is as follows:
(?<arch>[abc,xyz,ghh])(?<flavor>[tic,tac]$)
The above returns only parts of group names. (meaning match is not correct).
If I use * in each sub pattern instead of $ at the end, groups are correct, but that would mean that abcticff will also match.
Please let me know what my correct regex should be.
Your pattern is incorrect because a pipe symbol | is used to specify alternate matches, not a comma in brackets as you were using, i.e., [x,y].
Your pattern should be: ^(?<arch>abc|xyz|ghh)(?<flavor>tic|tac)$
The ^ and $ metacharacters ensure that the whole string matches from start to end. If you need to match text inside a larger string, you could replace them with \b to match on a word boundary.
Try this approach:
string[] inputs = { "abctic", "abctac", "xyztic", "xyztac", "ghhtic", "ghhtac" };
string pattern = @"^(?<arch>abc|xyz|ghh)(?<flavor>tic|tac)$";
foreach (var input in inputs)
{
    var match = Regex.Match(input, pattern);
    if (match.Success)
    {
        Console.WriteLine("Arch: {0} - Flavor: {1}",
            match.Groups["arch"].Value,
            match.Groups["flavor"].Value);
    }
    else
        Console.WriteLine("No match for: " + input);
}
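The word-boundary variant mentioned above could be sketched as follows, for the case where the codes are embedded in a longer text. The sample sentence and the ArchFlavor/Find names are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class ArchFlavor
{
    // \b on both sides replaces ^ and $, so codes are found inside longer text
    // while longer words such as "abcticff" are still rejected.
    static readonly Regex Pattern = new Regex(@"\b(?<arch>abc|xyz|ghh)(?<flavor>tic|tac)\b");

    // Returns every code found in the text as "arch/flavor".
    public static List<string> Find(string text)
    {
        var found = new List<string>();
        foreach (Match m in Pattern.Matches(text))
        {
            found.Add(m.Groups["arch"].Value + "/" + m.Groups["flavor"].Value);
        }
        return found;
    }

    static void Main()
    {
        List<string> codes = Find("build abctic, then xyztac, but not abcticff");
        Console.WriteLine(string.Join(" ", codes));  // abc/tic xyz/tac
    }
}
```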
