match.regex syntax with digit character and a #

match.regex syntax with digit character and a # - c#

i have a string with this format :
111111#1
the number of digit character is 5 or 6 and after that i set a '#' and also set a digit charcter.
i use Regex.IsMatch like this :
if (Regex.IsMatch(string, #"^d{6}#\d{1}"))
{...}
but it cant handle my string
what is my mistake?

You're missing the backslash on the first d so it's not matching against digits:
Regex.IsMatch("111111#1", #"^\d{6}#\d{1}")

This single line Regex will capture two groups: the leading five to six digits and the '#' followed by a single digit:
(\d{5,6})(#\d{1})
Example:
string pattern = #"(\d{5,6})(#\d{1})";
string input = "111111#1";
MatchCollection matches = Regex.Matches(input, pattern);
foreach (Match match in matches)
{
var firstGroupValue = match.Groups[1]; // "111111"
var secondGroupValue = match.Groups[2]; // "#1"
}

Related

Regex - Get digits after a colon

I have a regex:
var topPayMatch = Regex.Match(result, #"(?<=Top Pay)(\D*)(\d+(?:\.\d+)?)", RegexOptions.IgnoreCase);
And I have to convert this to int which I did
topPayMatch = Convert.ToInt32(topPayMatchString.Groups[2].Value);
So now...
Top Pay: 1,000,000 then it currently grabs the first digit, which is 1. I want all 1000000.
If Top Pay: 888,888 then I want all 888888.
What should I add to my regex?

You can use something as simple like #"(?<=Top Pay: )([0-9,]+)". Note that, decimals will be ignored with this regex.
This will match all numbers with their commas after Top Pay:, which after you can parse it to an integer.
Example:
Regex rgx = new Regex(#"(?<=Top Pay: )([0-9,]+)");
string str = "Top Pay: 1,000,000";
Match match = rgx.Match(str);
if (match.Success)
{
string val = match.Value;
int num = int.Parse(val, System.Globalization.NumberStyles.AllowThousands);
Console.WriteLine(num);
}
Console.WriteLine("Ended");
Source:
Convert int from string with commas

If you use the lookbehind, you don't need the capture groups and you can move the \D* into the lookbehind.
To get the values, you can match 1+ digits followed by optional repetitions of , and 1+ digits.
Note that your example data contains comma's and no dots, and using ? as a quantifier means 0 or 1 time.
(?<=Top Pay\D*)\d+(?:,\d+)*
The pattern matches:
(?<=Top Pay\D*) Positive lookbehind, assert what is to the left is Top Pay and optional non digits
\d+ Match 1+ digits
(?:,\d+)* Optionally repeat a , and 1+ digits
See a .NET regex demo and a C# demo
string pattern = #"(?<=Top Pay\D*)\d+(?:,\d+)*";
string input = #"Top Pay: 1,000,000
Top Pay: 888,888";
RegexOptions options = RegexOptions.IgnoreCase;
foreach (Match m in Regex.Matches(input, pattern, options))
{
var topPayMatch = int.Parse(m.Value, System.Globalization.NumberStyles.AllowThousands);
Console.WriteLine(topPayMatch);
}
Output
1000000
888888

REGEX Expression C#. Split string by whitespace outside the quotation marks

I'm trying to define a regular expression for the Split function in order to obtain all substring split by a whitespace omitting those whitespaces that are into single quotation marks.
Example:
key1:value1 key2:'value2 value3'
i Need these separated values:
key1:value1
key2:'value2 value3'
I'm tried to perform this in different ways:
Regex.Split(q, #"(\s)^('\s')").ToList();
Regex.Split(q, #"(\s)(^'.\s.')").ToList();
Regex.Split(q, #"(?=.*\s)").ToList();
What i am wrong with this code?
Could you please help me with this?
Thanks in advance

A working example:
(\w+):(?:(\w+)|'([^']+)')
(\w+) # key: 1 or more word chars (captured)
: # literal
(?: # non-captured grouped alternatives
(\w+) # value: 1 or more word chars (captured)
| # or
'([^']+)' # 1 or more not "'" enclosed by "'" (captured)
) # end of group
Demo
Your try:
(\s)^('\s')
^ means beginning of line, \s is a white-space characters. If you want to use the not-operator, this only works in a character class [^\s] -> 1 character not a white-space.

var st = "key1:value1 key2:'value2 value3'";
var result = Regex.Matches(st, #"\w+:\w+|\w+:\'[^']+\'");
foreach (var item in result)
Console.WriteLine(item);
The result should be:
key1:value1
key2:'value2 value3'

Try following :
static void Main(string[] args)
{
string input = "key1:value1 key2:'value2 value3'";
string pattern = #"\s*(?'key'[^:]+):((?'value'[^'][^\s]+)|'(?'value'[^']+))";
MatchCollection matches = Regex.Matches(input, pattern);
foreach (Match match in matches)
{
Console.WriteLine("Key : '{0}', Value : '{1}'", match.Groups["key"].Value, match.Groups["value"].Value);
}
Console.ReadLine();
}

search string for everything before a set of characters in C#

I'm looking for a way to search a string for everything before a set of characters in C#. For Example, if this is my string value:
This is is a test.... 12345
I want build a new string with all of the characters before "12345".
So my new string would equal "This is is a test.... "
Is there a way to do this?
I've found Regex examples where you can focus on one character but not a sequence of characters.

You don't need to use a Regex:
public string GetBitBefore(string text, string end)
{
var index = text.IndexOf(end);
if (index == -1) return text;
return text.Substring(0, index);
}

You can use a lazy quantifier to match anything, followed by a lookahead:
var match = Regex.Match("This is is a test.... 12345", #".*?(?=\d{5})");
where:
.*? lazily matches everything (up to the lookahead)
(?=…) is a positive lookahead: the pattern must be matched, but is not included in the result
\d{5} matches exactly five digits. I'm assuming this is your lookahead; you can replace it

You can do so with help of regex lookahead.
.*(?=12345)
Example:
var data = "This is is a test.... 12345";
var rxStr = ".*(?=12345)";
var rx = new System.Text.RegularExpressions.Regex (rxStr,
System.Text.RegularExpressions.RegexOptions.IgnoreCase);
var match = rx.Match(data);
if (match.Success) {
Console.WriteLine (match.Value);
}
Above code snippet will print every thing upto 12345:
This is is a test....
For more detail about see regex positive lookahead

This should get you started:
var reg = new Regex("^(.+)12345$");
var match = reg.Match("This is is a test.... 12345");
var group = match.Groups[1]; // This is is a test....
Of course you'd want to do some additional validation, but this is the basic idea.

^ means start of string
$ means end of string
The asterisk tells the engine to attempt to match the preceding token zero or more times. The plus tells the engine to attempt to match the preceding token once or more
{min,max} indicate the minimum/maximum number of matches.
\d matches a single character that is a digit, \w matches a "word character" (alphanumeric characters plus underscore), and \s matches a whitespace character (includes tabs and line breaks).
[^a] means not so exclude a
The dot matches a single character, except line break characters
In your case there many way to accomplish the task.
Eg excluding digit: ^[^\d]*
If you know the set of characters and they are not only digit, don't use regex but IndexOf(). If you know the separator between first and second part as "..." you can use Split()

Take a look at this snippet:
class Program
{
static void Main(string[] args)
{
string input = "This is is a test.... 12345";
// Here we call Regex.Match.
MatchCollection matches = Regex.Matches(input, #"(?<MySentence>(\w+\s*)*)(?<MyNumberPart>\d*)");
foreach (Match item in matches)
{
Console.WriteLine(item.Groups["MySentence"]);
Console.WriteLine("******");
Console.WriteLine(item.Groups["MyNumberPart"]);
}
Console.ReadKey();
}
}

You could just split, not as optimal as the indexOf solution
string value = "oiasjdoiasj12345";
string end = "12345";
string result = value.Split(new string[] { end }, StringSplitOptions.None)[0] //Take first part of the result, not the quickest but fairly simple

Get a specific part from a string based on a pattern

I have a string in this format:
ABCD_EFDG20120700.0.xml
This has a pattern which has three parts to it:
First is the set of chars before the '_', the 'ABCD'
Second are the set of chars 'EFDG' after the '_'
Third are the remaining 20120700.0.xml
I can split the original string and get the number(s) from the second element in the split result using this switch:
\d+
Match m = Regex.Match(splitname[1], "\\d+");
That returns only '20120700'. But I need '20120700.0'.
How do I get the required string?

You can extend your regex to look for any number of digits, then period and then any number of digits once again:
Match m = Regex.Match(splitname[1], "\\d+\\.\\d+");
Although with such regular expression you don't even need to split the string:
string s = "ABCD_EFDG20120700.0.xml";
Match m = Regex.Match(s, "\\d+\\.\\d+");
string result = m.Value; // result is 20120700.0

I can suggest you to use one regex operation for all you want like this:
var rgx = new Regex(#"^([^_]+)_([^\d.]+)([\d.]+\d+)\.(.*)$");
var matches = rgx.Matches(input);
if (matches.Count > 0)
{
Console.WriteLine("{0}", matches[0].Groups[0]); // All input string
Console.WriteLine("{0}", matches[0].Groups[1]); // ABCD
Console.WriteLine("{0}", matches[0].Groups[2]); // EFGH
Console.WriteLine("{0}", matches[0].Groups[3]); // 20120700.0
Console.WriteLine("{0}", matches[0].Groups[4]); // xml
}

Regex help with sample pattern. C#

I decided to use Regex, now I have two problems :)
Given the input string "hello world [2] [200] [%8] [%1c] [%d]",
What would be an approprite pattern to match the instances of "[%8]" "[%1c]" + "[%d]" ? (So a percentage sign, followed by any length alphanumeric, all enclosed in square brackets).
for the "[2]" and [200], I already use
Regex.Matches(input, "(\\[)[0-9]*?\\]");
Which works fine.
Any help would be appreicated.

MatchCollection matches = null;
try {
Regex regexObj = new Regex(#"\[[%\w]+\]");
matches = regexObj.Matches(input);
if (matches.Count > 0) {
// Access individual matches using matches.Item[]
} else {
// Match attempt failed
}
} catch (ArgumentException ex) {
// Syntax error in the regular expression
}

The Regex needed to match this pattern of "[%anyLengthAlphaNumeric]" in a string is this "[(%\w+)]"
The leading "[" is escaped with the "\" then you are creating a grouping of characters with the (...). This grouping is defined as %\w+. The \w is a shortcut for all word characters including letters and digits no spaces. The + matches one or more instances of the previous symbol, character or group. Then the trailing "]" is escaped with a "\" and catches the closing bracket.
Here is a basic code example:
string input = #"hello world [2] [200] [%8] [%1c] [%d]";
Regex example = new Regex(#"\[(%\w+)\]");
MatchCollection matches = example.Matches(input);

Try this:
Regex.Matches(input, "\\[%[0-9a-f]+\\]");
Or as a combined regular expression:
Regex.Matches(input, "\\[(\\d+|%[0-9a-f]+)\\]");

How about #"\[%[0-9a-f]*?\]"?
string input = "hello world [2] [200] [%8] [%1c] [%d]";
MatchCollection matches = Regex.Matches(input, #"\[%[0-9a-f]*?\]");
matches.Count // = 3

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

match.regex syntax with digit character and a # - c#

i have a string with this format : 111111#1 the number of digit character is 5 or 6 and after that i set a '#' and also set a digit charcter. i use Regex.IsMatch like this : if (Regex.IsMatch(string, #"^d{6}#\d{1}")) {...} but it cant handle my string what is my mistake?

You're missing the backslash on the first d so it's not matching against digits: Regex.IsMatch("111111#1", #"^\d{6}#\d{1}")

Related

Regex - Get digits after a colon

REGEX Expression C#. Split string by whitespace outside the quotation marks

search string for everything before a set of characters in C#

Get a specific part from a string based on a pattern

Regex help with sample pattern. C#

Categories

Resources