How can I split a regex into exact words? - c#

I need a little help regarding Regular Expressions in C#
I have the following string
"[[Sender.Name]]\r[[Sender.AdditionalInfo]]\r[[Sender.Street]]\r[[Sender.ZipCode]] [[Sender.Location]]\r[[Sender.Country]]\r"
The string could also contain spaces and theoretically any other characters, so I really need to match the [[words]].
What I need is a text array like this
"[[Sender.Name]]",
"[[Sender.AdditionalInfo]]",
"[[Sender.Street]]",
// ... And so on.
I'm pretty sure that this is perfectly doable with:
var stringArray = Regex.Split(line, @"\[\[+\]\]");
I'm just too stupid to find the correct Regex for the Regex.Split() call.
Anyone here that can tell me the correct Regular Expression to use in my case?
As you can tell I'm not that experienced with RegEx :)

Why don't you split on "\r"?
You don't need a regex for that; just use the standard string function:
string[] delimiters = { "\r" };
string[] split = line.Split(delimiters, StringSplitOptions.None);

Use matching instead of splitting if you want to get the [[..]] blocks.
Regex rgx = new Regex(@"\[\[.*?\]\]");
foreach (Match m in rgx.Matches(input))
    Console.WriteLine(m.Groups[0].Value);

The regex you are using (\[\[+\]\]) matches two or more literal [ characters followed by exactly two literal ] characters, so it never matches the text between the brackets.
A regex solution is to match all the non-] characters inside the doubled [ and ] (the string inside the brackets should not be empty, I guess?) and convert the MatchCollection to a list or array (here is an example with a list):
var str = "[[Sender.Name]]\r[[Sender.AdditionalInfo]]\r[[Sender.Street]]\r[[Sender.ZipCode]] [[Sender.Location]]\r[[Sender.Country]]\r";
var rgx22 = new Regex(@"\[\[[^]]+?\]\]");
var res345 = rgx22.Matches(str).Cast<Match>().ToList();
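If a plain string array is wanted, as in the question, the matches can be projected to their values. This is only a minimal sketch: the class and method names (PlaceholderDemo, ExtractPlaceholders) are made up for illustration, and the sample line is a shortened version of the one in the question.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PlaceholderDemo
{
    // Hypothetical helper: returns every [[...]] token in the order it appears.
    public static string[] ExtractPlaceholders(string line)
    {
        // \[\[ and \]\] are the literal brackets;
        // [^\]]+ is one or more characters that are not ']'.
        return Regex.Matches(line, @"\[\[[^\]]+\]\]")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        var line = "[[Sender.Name]]\r[[Sender.Street]]\r[[Sender.ZipCode]] [[Sender.Location]]\r";
        foreach (var token in ExtractPlaceholders(line))
            Console.WriteLine(token); // one [[...]] token per line
    }
}
```

Because this matches the tokens instead of splitting on separators, it does not matter whether the tokens are separated by "\r", spaces, or anything else.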

Related

Use C# RegEx to retrieve a list of matching strings found in a source string? [duplicate]

This question already has an answer here:
Simple and tested online regex containing regex delimiters does not work in C# code
(1 answer)
Closed 3 years ago.
I'm a RegEx novice, so I'm hoping someone out there can give me a hint.
I want to find a straightforward way (using RegEx?) to extract a list/array of values that match a pattern from a string.
If source string is "Hello #bob and #mark and #dave", I want to retrieve a list containing "#bob", "#mark" and "#dave" or, even better, "bob", "mark" and "dave" (without the # symbol).
So far, I have something like this (in C#):
string note = "Hello, #bob and #mark and #dave";
var r = new Regex(@"/(#)\w+/g");
var listOfFound = r.Match(note);
I'm hoping listOfFound will be an array or a List containing the three values.
I could do this with some clever string parsing, but it seems like this should be a piece of cake for RegEx, if I could only come up with the right pattern.
Thanks for any help!
Regexes in C# don't need delimiters and options must be supplied as the second argument to the constructor, but are not required in this case as you can get all your matches using Regex.Matches. Note that by using a lookbehind for the # ((?<=#)) we can avoid having the # in the match:
string note = "Hello, #bob and #mark and #dave";
Regex r = new Regex(@"(?<=#)\w+");
foreach (Match match in r.Matches(note))
Console.WriteLine("Found '{0}' at position {1}", match.Value, match.Index);
Output:
Found 'bob' at position 8
Found 'mark' at position 17
Found 'dave' at position 27
To get all the values into a list/array you could use something like:
string note = "Hello, #bob and #mark and #dave";
Regex r = new Regex(@"(?<=#)\w+");
// list of matches
List<String> Matches = new List<String>();
foreach (Match match in r.Matches(note))
Matches.Add(match.Value);
// array of matches
String[] listOfFound = Matches.ToArray();
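The same collection step can probably be written more compactly with LINQ. The sketch below assumes System.Linq is available; MentionDemo and ExtractNames are illustrative names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class MentionDemo
{
    // Hypothetical helper: returns the word after each '#', without the '#'.
    public static List<string> ExtractNames(string note)
    {
        // (?<=#) is a lookbehind, so the '#' must precede the match but is not part of it.
        return Regex.Matches(note, @"(?<=#)\w+")
                    .Cast<Match>()          // MatchCollection is non-generic on older frameworks
                    .Select(m => m.Value)
                    .ToList();
    }

    public static void Main()
    {
        var names = ExtractNames("Hello, #bob and #mark and #dave");
        Console.WriteLine(string.Join(", ", names)); // bob, mark, dave
    }
}
```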
You could do it without Regex, for example:
var listOfFound = note.Split().Where(word => word.StartsWith("#"));
Replace
var listOfFound = r.Match(note);
by
var listOfFound = r.Matches(note);

Search string using Pattern within long string in C#

I need to search for a pattern within a string.
For example:
string big = "Hello there, I need information for ticket XYZ12345. I also submitted ticket ZYX54321. Please update.";
Now I need to extract/find/seek words based on the pattern XXX00000, i.e. 3 letters and then 5 digits.
Is there any way to do this ?
Even extraction will be okay for me.
Please help.
foreach (Match m in Regex.Matches(big, "([A-Za-z]{3}[0-9]{5})"))
{
    if (m.Success)
    {
        string ticket = m.Groups[1].Value; // here is your match
    }
}
How about this one?
([XYZ]{3}[0-9]{5})
You can use Regex Tester to test your expressions.
You can use a simple regular expression to match your string:
([A-Za-z]{3}[0-9]{5})
The full code will be:
string strRegex = @"([A-Za-z]{3}[0-9]{5})";
Regex myRegex = new Regex(strRegex, RegexOptions.IgnoreCase);
string strTargetString = @"Hello there, I need information for ticket XYZ12345. I also submitted ticket ZYX54321. Please update.";
foreach (Match myMatch in myRegex.Matches(strTargetString))
{
if (myMatch.Success)
{
// Add your code here
}
}
You could always use a chatbot extension for the requests.
As for extracting the required information out of a sentence without any context, you can use a regex for that.
You can use http://rubular.com/ to test it.
An example would be
[A-Za-z]{3}[0-9]{5}
which would find XXX00000.
Hope that helped.
Use a regex:
string ticketNumber = string.Empty;
var match = Regex.Match(myString, @"[A-Za-z]{3}\d{5}");
if(match.Success)
{
ticketNumber = match.Value;
}
Here's a regex:
var str = "ABCD12345 ABC123456 ABC12345 XYZ98765";
foreach (Match m in Regex.Matches(str, @"(?<![A-Z])[A-Z]{3}[0-9]{5}(?![0-9])"))
Console.WriteLine(m.Value);
The extra bits are the zero-width negative look-behind ((?<![A-Z])) and negative look-ahead ((?![0-9])) expressions, which make sure you don't capture extra letters or digits. The above example only matches the third and fourth parts, not the first and second. A plain [A-Z]{3}[0-9]{5} would also match inside longer runs: BCD12345 within ABCD12345 and ABC12345 within ABC123456.
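To see the difference the lookarounds make, the two patterns can be run side by side on the same input. This is just a sketch; TicketDemo and Find are illustrative names.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TicketDemo
{
    // Hypothetical helper: returns the text of every match of pattern in text.
    public static string[] Find(string text, string pattern)
    {
        return Regex.Matches(text, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        var str = "ABCD12345 ABC123456 ABC12345 XYZ98765";

        // Without lookarounds the pattern also matches inside longer runs.
        var plain = Find(str, @"[A-Z]{3}[0-9]{5}");
        Console.WriteLine(string.Join(" ", plain));   // BCD12345 ABC12345 ABC12345 XYZ98765

        // With lookarounds only exact "3 letters + 5 digits" tokens match.
        var guarded = Find(str, @"(?<![A-Z])[A-Z]{3}[0-9]{5}(?![0-9])");
        Console.WriteLine(string.Join(" ", guarded)); // ABC12345 XYZ98765
    }
}
```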

Regular expression to find all words which starts with white space and ends with white space

I need to find words in a string with starting and ending white space. I am finding issues while searching white spaces. However, I could achieve the below. Starts and ends with ##. Any help with whitespaces will be great.
string input = "##12## ##13##";
foreach (Match match in Regex.Matches(input, @"##\b\S+?\b##"))
{
    MessageBox.Show(match.Groups[1].Value);
}
From MSDN doc:
// Define a regular expression for repeated words.
Regex rx = new Regex(@"\b(?<word>\w+)\s+(\k<word>)\b",
RegexOptions.Compiled | RegexOptions.IgnoreCase);
\s+(?=</)
is the expression you're after. It means one or more white-space characters followed by </.
In my opinion it is better to use string.Split() instead of Regex:
var wordsArray = s.Split(new []{' '},StringSplitOptions.RemoveEmptyEntries);
It is better to avoid regex if you can achieve the same result more easily with standard string methods.
I can't tell exactly what you have in mind, but I hope this code can help you:
string[] ha = input.Split(new[] { '#' }, StringSplitOptions.RemoveEmptyEntries);
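One thing worth noting about the question's loop: it reads Groups[1], but the pattern has no capturing group, so that value is always empty. A possible fix, assuming the goal is the text between the ## markers, is to wrap that part in parentheses. TokenDemo and ExtractTokens are illustrative names.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TokenDemo
{
    // Hypothetical helper: returns the text between each pair of ## markers.
    public static string[] ExtractTokens(string input)
    {
        // (\S+?) is capturing group 1: the shortest run of
        // non-whitespace characters that is followed by "##".
        return Regex.Matches(input, @"##(\S+?)##")
                    .Cast<Match>()
                    .Select(m => m.Groups[1].Value)
                    .ToArray();
    }

    public static void Main()
    {
        foreach (var token in ExtractTokens("##12## ##13##"))
            Console.WriteLine(token); // prints 12, then 13
    }
}
```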

C# - Regex Match whole words

I need to match all the whole words containing a given string.
string s = "ABC.MYTESTING
XYZ.YOUTESTED
ANY.TESTING";
Regex r = new Regex("(?<TM>[!\..]*TEST.*)", ...);
MatchCollection mc = r.Matches(s);
I need the result to be:
MYTESTING
YOUTESTED
TESTING
But I get:
TESTING
TESTED
.TESTING
How do I achieve this with Regular expressions.
Edit: Extended sample string.
If you were looking for all words including 'TEST', you should use
@"(?<TM>\w*TEST\w*)"
\w matches word characters: roughly [A-Za-z0-9_], although in .NET it also matches Unicode letters and digits unless RegexOptions.ECMAScript is specified.
Keep it simple: why not just try \w*TEST\w* as the match pattern.
I get the results you are expecting with the following:
string s = @"ABC.MYTESTING
XYZ.YOUTESTED
ANY.TESTING";
var m = Regex.Matches(s, @"(\w*TEST\w*)", RegexOptions.IgnoreCase);
Try using \b. It's the regex anchor for a word boundary. If you wanted to match both words you could use:
/\b[a-z]+\b/i
BTW, .net doesn't need the surrounding /, and the i is just a case-insensitive match flag.
.NET Alternative:
var re = new Regex(@"\b[a-z]+\b", RegexOptions.IgnoreCase);
Using Groups I think you can achieve it.
string s = @"ABC.TESTING
XYZ.TESTED";
Regex r = new Regex(@"(?<TM>[!\..]*(?<test>TEST.*))", RegexOptions.Multiline);
var mc= r.Matches(s);
foreach (Match match in mc)
{
Console.WriteLine(match.Groups["test"]);
}
Works exactly like you want.
BTW, your regular expression pattern should be a verbatim string (@"")
Regex r = new Regex(@"(?<TM>[^.]*TEST.*)", RegexOptions.IgnoreCase);
First, as @manojlds said, you should use verbatim strings for regexes whenever possible. Otherwise you'll have to use two backslashes in most of your regex escape sequences, not just one (e.g. [!\\..]*).
Second, if you want to match anything but a dot, that part of the regex should be [^.]*. ^ is the metacharacter that inverts the character class, not !, and . has no special meaning in that context, so it doesn't need to be escaped. As written, [!\..] matches only ! or a literal dot. But you should probably use \w* instead, or even [A-Z]*, depending on what exactly you mean by "word".
Regex r = new Regex(@"(?<TM>[A-Z]*TEST[A-Z]*)", RegexOptions.IgnoreCase);
That way you don't need to bother with word boundaries, though they don't hurt:
Regex r = new Regex(@"(?<TM>\b[A-Z]*TEST[A-Z]*\b)", RegexOptions.IgnoreCase);
Finally, if you're always taking the whole match anyway, you don't need to use a capturing group:
Regex r = new Regex(@"\b[A-Z]*TEST[A-Z]*\b", RegexOptions.IgnoreCase);
The matched text will be available via Match's Value property.
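Putting the final pattern together with the extended sample from the question, a small runnable sketch could look like this. WholeWordDemo and FindWords are illustrative names, and the sample lines are joined with "\n" here for simplicity.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class WholeWordDemo
{
    // Hypothetical helper: returns every whole word (letters only) that contains "TEST".
    public static string[] FindWords(string s)
    {
        return Regex.Matches(s, @"\b[A-Z]*TEST[A-Z]*\b", RegexOptions.IgnoreCase)
                    .Cast<Match>()
                    .Select(m => m.Value) // whole match, no capturing group needed
                    .ToArray();
    }

    public static void Main()
    {
        var s = "ABC.MYTESTING\nXYZ.YOUTESTED\nANY.TESTING";
        foreach (var word in FindWords(s))
            Console.WriteLine(word); // MYTESTING, YOUTESTED, TESTING
    }
}
```

The dot is not in [A-Z], so the prefix before it (ABC, XYZ, ANY) is never pulled into the match.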

How can I split a string using regex to return a list of values?

How can I take the string foo[]=1&foo[]=5&foo[]=2 and return a collection with the values 1,5,2 in that order. I am looking for an answer using regex in C#. Thanks
In C# you can use capturing groups
private void RegexTest()
{
String input = "foo[]=1&foo[]=5&foo[]=2";
String pattern = @"foo\[\]=(\d+)";
Regex regex = new Regex(pattern);
foreach (Match match in regex.Matches(input))
{
Console.Out.WriteLine(match.Groups[1]);
}
}
I don't know C#, but...
In Java:
String[] nums = yourString.split("&?foo\\[\\]=");
The argument to the String.split() method is a regex telling the method where to split the String. Note that the result starts with an empty string, because the input begins with a delimiter.
I'd use this particular pattern:
string re = @"foo\[\]=(?<value>\d+)";
So something like (not tested):
Regex reValues = new Regex(re, RegexOptions.Compiled);
List<int> values = new List<int>();
foreach (Match m in reValues.Matches(input)) // input is the string to search
{
    values.Add(int.Parse(m.Groups["value"].Value));
}
Use the Regex.Split() method with an appropriate regex. This will split on parts of the string that match the regular expression and return the results as a string[].
Assuming you want all the values in your querystring without checking if they're numeric, (and without just matching on names like foo[]) you could use this: "&?[^&=]+="
string[] values = Regex.Split("foo[]=1&foo[]=5&foo[]=2", "&?[^&=]+=");
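One caveat with this approach: because the input starts with a match of the delimiter pattern, Regex.Split returns an empty string as the first element. A minimal sketch that filters it out follows; QuerySplitDemo and SplitValues are illustrative names.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class QuerySplitDemo
{
    // Hypothetical helper: splits on "name=" (with an optional leading '&')
    // and drops the empty entry produced by the delimiter at the start.
    public static string[] SplitValues(string query)
    {
        return Regex.Split(query, "&?[^&=]+=")
                    .Where(part => part.Length > 0)
                    .ToArray();
    }

    public static void Main()
    {
        var values = SplitValues("foo[]=1&foo[]=5&foo[]=2");
        Console.WriteLine(string.Join(",", values)); // 1,5,2
    }
}
```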
Incidentally, if you're playing with regular expressions the site http://gskinner.com/RegExr/ is fantastic (I'm just a fan).
Assuming you're dealing with numbers this pattern should match:
/=(\d+)&?/
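This should do:
using System.Text.RegularExpressions;
string digits = Regex.Replace(s, @"[^0-9]", ""); // removes every character that is not a digit
Where s is the string from which you want the numbers to be extracted. Note that this returns the digits as one concatenated string ("152" for the example), not a list.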
Just make sure to escape the ampersand like so:
/=(\d+)\&/
Here's an alternative solution using the built-in string.Split function:
string x = "foo[]=1&foo[]=5&foo[]=2";
string[] separator = new string[2] { "foo[]=", "&" };
string[] vals = x.Split(separator, StringSplitOptions.RemoveEmptyEntries);
