Match regular expression for the pattern dddG-xyz - c#

My try:
string exp1 = "\\d+G-";
string z = Regex.Match("CN=314G-VK1,OU=Grupper,OU=314,OU=Skole,OU=03Skien,DC=login,DC=sk-asp,DC=no",exp1).Value;
Console.WriteLine(z);
I want to match 314G-VK1 from the string z.
Where,
314 is decimal and it can be any number of digit. say 125632588.
G- is constant.
And VK1 can be character or decimal but length will be only 3, say er5.
How can i meet the requirements?
From my code i only get output 314G-. I tried several ways but that don't help me anymore.

You may use
string exp1 = #"\d+G-\w{3}";
See the regex demo
The \w{3} pattern will match 3 word chars, i.e. mostly letters, digits, underscores. You may precise it if need be, e.g. to only match 3 uppercase ASCII letters or digits, you may use [A-Z0-9]{3}. To also include lowercase letters, add them to the character class, [A-Za-z0-9]{3}.
Regulex regex graph:
.NET regex test results:
C# code demo:
string exp1 = #"\d+G-\w{3}";
string s = "CN=314G-VK1,OU=Grupper,OU=314,OU=Skole,OU=03Skien,DC=login,DC=sk-asp,DC=no";
string z = Regex.Match(s, exp1)?.Value;
Console.WriteLine(z); // => 314G-VK1

You are almost correct but, you just need to add [0-9a-zA-Z]{3} after G-.
string exp1 = "\\d+G-[0-9a-zA-Z]{3}";
string z = Regex.Match("CN=314G-VK1,OU=Grupper,OU=314,OU=Skole,OU=03Skien,DC=login,DC=sk-asp,DC=no", exp1).Value;
Console.WriteLine(z);
Check demo here

Try this:
string exp1 = #"\d+G-[a-zA-Z0-9]{3}"
[a-zA-Z0-9]{3} will match with 3-char alphanumeric string.

Related

Regex only letters except set of numbers

I'm using Replace(#"[^a-zA-Z]+", "");
leave only letters, but I have a set of numbers or characters that I want to keep as well, ex: 122456 and 112466. But I'm having trouble leaving it only if it's this sequence:
ex input:
abc 1239 asm122456000
I want to:
abscasm122456
tried this: ([^a-zA-Z])+|(?!122456)
My answer doesn't applying Replace(), but achieves a similar result:
(?:[a-zA-Z]+|\d{6})
which captures the group (non-capturing group) with the alphabetic character(s) or a set of digits with 6 occurrences.
Regex 101 & Test Result
Join all the matching values into a single string.
using System.Linq;
Regex regex = new Regex("(?:[a-zA-Z]+|\\d{6})");
string input = "abc 1239 asm12245600";
string output = "";
var matches = regex.Matches(input);
if (matches.Count > 0)
output = String.Join("", matches.Select(x => x.Value));
Sample .NET Fiddle
Alternate way,
using .Split() and .All(),
string input = "abc 1239 asm122456000";
string output = string.Join("", input.Split().Where(x => !x.All(char.IsDigit)));
.NET Fiddle
It is very simple: you need to match and capture what you need to keep, and just match what you need to remove, and then utilize a backreference to the captured group value in the replacement pattern to put it back into the resulting string.
Here is the regex:
(122456|112466)|[^a-zA-Z]
See the regex demo. Details:
(122456|112466) - Capturing group with ID 1: either of the two alternatives
| - or
[^a-zA-Z] - a char other than an ASCII letter (use \P{L} if you need to match any char other than any Unicode letter).
Note the removed + quantifier as [^A-Za-z] also matches digits.
You need to use $1 in the replacement:
var result = Regex.Replace(text, #"(122456|112466)|[^a-zA-Z]", "$1");

Regular expression match between string and last digit

I'm trying to come up with a regular expression matches the text in bold in all the examples.
Between the string "JZ" and any character before "-"
JZ123456789-301A
JZ134255872-22013
Between the string "JZ" and the last character
JZ123456789D
I have tried the following but it only works for the first example
(?<=JZ).*(?=-)
You can use (?<=JZ)[0-9]+, presuming the desired text will always be numeric.
Try it out here
You may use
JZ([^-]*)(?:-|.$)
and grab Group 1 value. See the regex demo.
Details
JZ - a literal substring
([^-]*) - Capturing group 1: zero or more chars other than -
(?:-|.$) - a non-capturing group matching either - or any char at the end of the string
C# code:
var m = Regex.Match(s, #"JZ([^-]*)(?:-|.$)");
if (m.Success)
{
Console.WriteLine(m.Groups[1].Value);
}
If, for some reason, you need to obtain the required value as a whole match, use lookarounds:
(?<=JZ)[^-]*(?=-|.$)
See this regex variation demo. Use m.Value in the code above to grab the value.
A one-line answer without regex:
string s,r;
// if your string always starts with JZ
s = "JZ123456789-301A";
r = string.Concat(s.Substring(2).TakeWhile(char.IsDigit));
Console.WriteLine(r); // output : 123456789
// if your string starts with anything
s = "A12JZ123456789-301A";
r = string.Concat(s.Substring(s.IndexOf("JZ")).TakeWhile(char.IsDigit));
Console.WriteLine(r); // output : 123456789
Basically, we remove everything before and including the delimiter "JZ", then we take each char while they are digit. The Concat is use to transform the IEnumerable<char> to a string. I think it is easier to read.
Try it online

Regex to get first 6 and last 4 characters of a string

I would like to use regex instead of string.replace() to get the first 6 chars of a string and the last 4 chars of the same string and substitute it with another character: & for example. The string is always with 16 chars. Im doing some research but i never worked with regex before. Thanks
If you prefer to use regular expression, you could use the following. The dot . will match any character except a newline sequence, so you can specify {n} to match exactly n times and use beginning/end of string anchors.
String r = Regex.Replace("123456foobar7890", #"^.{6}|.{4}$",
m => new string('&', m.ToString().Length));
Console.WriteLine(r); //=> "&&&&&&foobar&&&&"
If you want to invert the logic, replacing the middle portion of your string you can use Positive Lookbehind.
String r = Regex.Replace("123456foobar7890", #"(?<=^.{6}).{6}",
m => new string('&', m.ToString().Length));
Console.WriteLine(r); //=> "123456&&&&&&7890"

get an special Substring in c#

I need to extract a substring from an existing string. This String starts with uninteresting characters (include "," "space" and numbers) and ends with ", 123," or ", 57," or something like this where the numbers can change. I only need the Numbers.
Thanks
public static void Main(string[] args)
{
string input = "This is 2 much junk, 123,";
var match = Regex.Match(input, #"(\d*),$"); // Ends with at least one digit
// followed by comma,
// grab the digits.
if(match.Success)
Console.WriteLine(match.Groups[1]); // Prints '123'
}
Regex to match numbers: Regex regex = new Regex(#"\d+");
Source (slightly modified): Regex for numbers only
I think this is what you're looking for:
Remove all non numeric characters from a string using Regex
using System.Text.RegularExpressions;
...
string newString = Regex.Replace(oldString, "[^.0-9]", "");
(If you don't want to allow the decimal delimiter in the final result, remove the . from the regular expression above).
Try something like this :
String numbers = new String(yourString.TakeWhile(x => char.IsNumber(x)).ToArray());
You can use \d+ to match all digits within a given string
So your code would be
var lst=Regex.Matches(inp,reg)
.Cast<Match>()
.Select(x=x.Value);
lst now contain all the numbers
But if your input would be same as provided in your question you don't need regex
input.Substring(input.LastIndexOf(", "),input.LastIndexOf(","));

How to ignore regex matches in C#?

An input string:
string datar = "aag, afg, agg, arg";
I am trying to get matches: "aag" and "arg", but following won't work:
string regr = "a[a-z&&[^fg]]g";
string regr = "a[a-z[^fg]]g";
What is the correct way of ignoring regex matches in C#?
The obvious way is to use a[a-eh-z]g, but you could also try with a negative lookbehind like this :
string regr = "a[a-z](?<!f|g)g"
Explanation :
a Match the character "a"
[a-z] Match a single character in the range between "a" and "z"
(?<!XXX) Assert that it is impossible to match the regex below with the match ending at this position (negative lookbehind)
f|g Match the character "f" or match the character "g"
g Match the character "g"
Character classes aren't quite that fancy. The simple solution is:
a[a-eh-z]g
If you really want to explicitly list out the letters that don't belong, you could try something like:
a[^\W\d_A-Zfg]g
This character class matches everything except:
\W excludes non-word characters, i.e. punctuation, whitespace, and other special characters. What's left are letters, digits, and the underscore _.
\d removes digits so now we have letters and the underscore _.
_ removes the underscore so now we only match letters.
A-Z removes uppercase letters so now we only match lowercase letters.
Finally at this point we can list the individual lowercase letters we don't want to match.
All in all way more complicated than we'd likely ever want. That's regular expressions for ya!
What you're using is Java's set intersection syntax:
a[a-z&&[^fg]]g
..meaning the intersection of the two sets ('a' THROUGH 'z') and (ANYTHING EXCEPT 'f' OR 'g'). No other regex flavor that I know of uses that notation. The .NET flavor uses the simpler set subtraction syntax:
a[a-z-[fg]]g
...that is, the set ('a' THROUGH 'z') minus the set ('f', 'g').
Java demo:
String s = "aag, afg, agg, arg, a%g";
Matcher m = Pattern.compile("a[a-z&&[^fg]]g").matcher(s);
while (m.find())
{
System.out.println(m.group());
}
C# demo:
string s = #"aag, afg, agg, arg, a%g";
foreach (Match m in Regex.Matches(s, #"a[a-z-[fg]]g"))
{
Console.WriteLine(m.Value);
}
Output of both is
aag
arg
Try this if you want match arg and aag:
a[ar]g
If you want to match everything except afg and agg, you need this regex:
a[^fg]g
It seems like you're trying to match any three alphabetic characters, with the condition that the second character cannot be f or g. If this is the case, why not use the following regular expression:
string regr = "a[a-eh-z]g";
Regex: a[a-eh-z]g.
Then use Regex.Matches to get the matched substrings.

Categories