Regex: Get a list of id - c#

I need to get a list of id from a string. The regex for the string is like this:
"GET_LIST( [A-Za-z0-9]{5,10}){0,100}";
When I send a string like this:
GET_LIST 1000 10001 10002
I'd like to get something like "1000 10001 10002" or, better, a list of ids. But when I try to get this with matches.Groups[1].Value;
I only get the last id.
My code currently looks like this:
public IList<string> ExctractListId(string command)
{
IList<string> id = new List<string>();
Match matches = new Regex(ReponseListeService).Match(command);
if (matches.Success)
{
string ids = matches.Groups[1].Value;
Console.WriteLine(ids);
return id;
}
return id;
}
I know the code is not complete; I just want to get a list or a string with all the ids.
This code is for homework, and I can't use Split(), Concat(), ...
How can I do this?

You may use
private static string pattern = @"^GET_LIST(?:\s+([A-Za-z0-9]{4,10})){0,100}$";
private static List<string> ExtractListId(string command)
{
    return Regex.Matches(command, pattern)
        .Cast<Match>()
        .SelectMany(p => p.Groups[1].Captures
            .Cast<Capture>()
            .Select(t => t.Value))
        .ToList();
}
See the C# demo and a regex demo.
Details
^ - matches start of string
GET_LIST - a literal substring
(?:\s+([A-Za-z0-9]{4,10})){0,100} - 0 to 100 occurrences of
\s+ - 1+ whitespaces
([A-Za-z0-9]{4,10}) - Capturing group 1: 4 to 10 alphanumeric ASCII chars
$ - end of string.
Note that we have a capturing group (([A-Za-z0-9]{4,10})) inside a quantified non-capturing group (?:...){0,100}. To get those values, you should access the group capture collection. As the group has ID 1, you need to get match.Groups[1] and access all its .Captures.
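To make the difference between Group.Value and Group.Captures concrete, here is a minimal sketch that reuses the pattern above. The class and method names (CaptureDemo, LastId, AllIds) are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Same pattern as in the answer above.
    private const string Pattern = @"^GET_LIST(?:\s+([A-Za-z0-9]{4,10})){0,100}$";

    // Groups[1].Value only exposes the text of the LAST capture of group 1.
    public static string LastId(string command)
    {
        Match m = Regex.Match(command, Pattern);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    // Groups[1].Captures keeps every capture of group 1, in match order.
    public static List<string> AllIds(string command)
    {
        Match m = Regex.Match(command, Pattern);
        if (!m.Success)
        {
            return new List<string>();
        }
        return m.Groups[1].Captures.Cast<Capture>().Select(c => c.Value).ToList();
    }
}
```

With the sample command from the question, LastId reproduces the "only the last id" behaviour, while AllIds returns all three ids.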

You can also use the String.Split() method to split the string on whitespace characters, and then return all items that can be parsed to an int. Note that this will return all items that are valid integers, so it will work with your sample input, but if you have other types of input it may need some modification.
public static IList<string> ExctractListId(string command)
{
    if (command == null || !command.StartsWith("GET_LIST"))
    {
        return new List<string>();
    }
    int temp;
    // Keep only the whitespace-separated items that parse as integers.
    return command.Split().Where(item => int.TryParse(item, out temp)).ToList();
}
Example usage:
private static void Main()
{
    Console.WriteLine(string.Join(", ", ExctractListId("GET_LIST 1000 10001 10002")));
    Console.Write("\nDone! Press any key to exit...");
    Console.ReadKey();
}
Output (followed by the exit prompt):
1000, 10001, 10002

The data you are searching contains whitespace, so add a space or \s to the regex and try again.
Hope this helps.
Sorry, I couldn't completely understand the problem.
A small code snippet using JavaScript:
function getId(data){
var regex = /^GET_LIST(([\d\s]{5,10}){0,100})/g;
var match = regex.exec(data);
console.log(match[1]);
}

Related

Check if the inputs have the same values in Regex

I am trying to get the input from the user in a single line with [, ,] separators. Like this:
[Q,W,1] [R,T,3] [Y,U,9]
And then I will use these inputs in a function like this:
f.MyFunction('Q','W',1); // Third parameter will be taken as integer
f.MyFunction('R','T',3);
f.MyFunction('Y','U',9);
So, using Regex:
var funcArgRE = new Regex(@"\[(.),(.),(\d+)\]", RegexOptions.Compiled);
foreach (Match match in funcArgRE.Matches(input))
{
    var g = match.Groups;
    f.MyFunction(g[1].Value[0], g[2].Value[0], Int32.Parse(g[3].Value));
}
But I also want to check the inputs if they have the same char combination
Like
[Q,W,1] [U,Y,3] [Z,K,1] [Y,U,9]
if(theyHaveTheSame_Combination)
// do sth.
How can I do this inside the regex code piece?
You can use
\[([A-Za-z]),([A-Za-z]),(\d+)](?=.*\[(?:\1,\2|\2,\1),\d+])
Or, if you can really have anything in place of letters:
\[(.),(.),(\d+)](?=.*\[(?:\1,\2|\2,\1),\d+])
See the regex demo.
Details:
\[(.),(.),(\d+)] - a [ char, any single char (Group 1), comma, any single char (Group 2), comma, one or more digits (Group 3), ] char
(?=.*\[(?:\1,\2|\2,\1),\d+]) - a positive lookahead that requires the following pattern to appear immediately to the right of the current location:
.* - any zero or more chars other than line break chars as many as possible
\[ - a [ char
(?:\1,\2|\2,\1) - Group 1 value, comma, Group 2 value, or Group 2 value, comma, Group 1 value
,\d+] - comma, one or more digits, ] char.
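As a rough sketch of how the lookahead pattern above could drive the "same combination" check in C#: the class name DuplicatePairCheck and the method HasSameCombination are hypothetical, and the code only reports whether any pair of letters repeats later in the line, in either order.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DuplicatePairCheck
{
    // Pattern from the answer above: a triple that is followed, somewhere later
    // on the same line, by another triple with the same two chars in either order.
    private static readonly Regex Duplicate = new Regex(
        @"\[(.),(.),(\d+)](?=.*\[(?:\1,\2|\2,\1),\d+])");

    // True when at least one char combination occurs more than once.
    public static bool HasSameCombination(string input)
    {
        return Duplicate.IsMatch(input);
    }
}
```

For the sample input with [U,Y,3] and [Y,U,9] this reports a repeated combination; for the first sample line, where all pairs differ, it does not.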
Regex can help you parse the input string, but to compare the different inputs to see if any are duplicates, you will need some other logic.
The structure of your data seems to be this:
class Command {
public char Letter1;
public char Letter2;
public int Number;
}
class CommandBatch {
public Command[] Commands;
}
You can use the regex you have to populate a CommandBatch.
You can create a function to compare two Commands, to see if they have matching letters.
bool AreMatching(Command c1, Command c2) {
return (c1.Letter1 == c2.Letter1 && c1.Letter2 == c2.Letter2)
|| (c1.Letter1 == c2.Letter2 && c1.Letter2 == c2.Letter1);
}
And then you can use that to make a function that checks a whole CommandBatch.
bool AnyDuplicates(CommandBatch batch) {
var pairs = from c1 in batch.Commands
from c2 in batch.Commands
where c1 != c2
select (c1, c2);
return pairs.Any(tup => AreMatching(tup.Item1, tup.Item2));
}

check for a substring(of a string) in the dictionary and return the key's(substring) value

I have a dictionary like below,
PropStreetSuffixDict.Add("ROAD", "RD");
PropStreetSuffixDict.Add("STREET","ST"); and many more.
My requirement is that when a string contains either ROAD or STREET as a substring, I want to return the related value for that substring.
For example, CHURCH ACROSS ROAD should return RD.
This is what I tried, which only works if the input string is exactly the same as a key of the dictionary.
private string GetSuffix(string input)
{
string suffix=string.Empty;
suffix = PropStreetSuffixDict.Where(x => x.Key.ToUpper().Trim() ==
input.ToUpper().Trim()).FirstOrDefault().Value;
return suffix;
}
Note:
If a string contains more than one such substring, it should return the value for the first occurrence of any of the substrings,
i.e. if STREET CHURCH ACROSS ROAD is the input, it should return ST, not RD.
You can try something like this
private string GetSuffix(string input)
{
    string[] words = input.ToUpper().Split(' ');
    // Enumerate the input words first so the result follows their order in the input.
    string suffix = (from word in words
                     join dic in PropStreetSuffixDict on word equals dic.Key
                     select dic.Value).FirstOrDefault();
    return suffix ?? string.Empty;
}
Split the input into words, then use LINQ to join them against the dictionary keys.
If you want it to return the first occurrence in the input string (GetSuffix("CHURCH STREET ACROSS ROAD") ==> "ST"), it becomes a little tricky.
The code below finds where each key occurs in the input string and returns the value for the key found at the earliest position.
private string GetSuffix(string input)
{
var suffix = PropStreetSuffixDict
.Select(kvp => new
{
Position = input.IndexOf(kvp.Key.Trim(), StringComparison.CurrentCultureIgnoreCase),
Value = kvp.Value
})
.OrderBy(x => x.Position)
.FirstOrDefault(x => x.Position > -1)?.Value;
return suffix ?? string.Empty;
}
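If you don't care about the order of occurrence in the input string, you can simplify it to this:
private string GetSuffix(string input)
{
    // KeyValuePair is a struct, so FirstOrDefault yields a pair with a null Value when nothing matches.
    // Contains(string, StringComparison) needs .NET Core 2.1 or later; on .NET Framework use IndexOf(...) > -1.
    var suffix = PropStreetSuffixDict.FirstOrDefault(kvp => input.Contains(kvp.Key.Trim(), StringComparison.CurrentCultureIgnoreCase)).Value;
    return suffix ?? string.Empty;
}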
I would recommend using RegEx to split apart your words; that way you can efficiently split on multiple characters, not just spaces, if required. This solution also makes it easy to replace individual words without having to track the position and length of the matched word versus the length of the replacement value.
You could use a function like this:
public string ReplaceWords(string input, Dictionary<string,string> dictionary)
{
    var result = Regex.Replace(input, @"\w*", (match) =>
    {
        if (dictionary.TryGetValue(match.Value, out var replacement))
        {
            return replacement;
        }
        return match.Value;
    });
    return result;
}
It will take an input string, split it up, and replace the individual words with those in the supplied dictionary. The particular RegEx of \w* will match any continuous run of "word" characters, so it will break on spaces, commas, dashes, and anything else that isn't part of a "word".
This code does use some newer C# language features that you may not have access to (inline out parameters). Just let me know if you can't use those and I'll update it to work without them.
You can use it like this:
Console.WriteLine(ReplaceWords("CHURCH ACROSS ROAD", PropStreetSuffixDict));
Console.WriteLine(ReplaceWords("CHURCH ACROSS STREET", PropStreetSuffixDict));
Console.WriteLine(ReplaceWords("CHURCH ACROSS ROAD, LEFT AT THE OTHER STREET", PropStreetSuffixDict));
For the following results:
CHURCH ACROSS RD
CHURCH ACROSS ST
CHURCH ACROSS RD, LEFT AT THE OTHER ST

How to move 12 digit numbers from richtextbox to textbox2

I want to move 12 digit numbers from richtextbox to textbox2 by a program.
I enter these words for richtextbox
sdgsjglksdjgkl,512025151988,512025151988,512025151988,512025151988,512025151988,sdgsgd
I need to get only these 12 digit numbers to textbox2..
I tried this code, but it prints System.Text.RegularExpressions.MatchCollection instead of the digits.
Here is the code I use:
private void button2_Click(object sender, EventArgs e)
{
Regex RX = new Regex("[0-9]{1,12}$");
textBox2.Text = (RX.Matches(richTextBox1.Text)).ToString();
}
I don't know how to move these numbers to textbox2. Please help me.
Split with a comma, then take all items that are of length 12 and are all digits:
var richTextBox1_Text = "sdgsjglksdjgkl,512025151988,512025151988,512025151988,512025151988,512025151988,sdgsgd";
Console.Write(
    string.Join(",",
        richTextBox1_Text.Split(',')
            .Where(m => m.Length == 12 && m.All(char.IsDigit))));
// => 512025151988,512025151988,512025151988,512025151988,512025151988
See the C# demo
In your code:
textBox2.Text = string.Join(",",
richTextBox1.Text.Split(',')
.Where(m=>m.Length==12 && m.All(char.IsDigit)));
For more complex scenarios, use a \b\d{12}\b regex like this:
textBox2.Text = string.Join("\r\n",
    Regex.Matches(richTextBox1.Text, @"\b\d{12}\b")
        .Cast<Match>()
        .Select(m => m.Value));
I have created a method that returns a collection of strings, each containing a number.
public static IEnumerable<string> SeparateNumbers(string inputText)
{
var matches = Regex.Matches(inputText, "[0-9]{12}");
foreach (Match match in matches)
{
yield return inputText.Substring(match.Index, match.Length);
}
}
You would simply use it like this. I have also added a way to comma separate them again:
string inputText = "sdgsjglksdjgkl,512025151988,512025151988,512025151988,512025151988,512025151988,sdgsgd";
var separatedNumbers = SeparateNumbers(inputText)
.ToArray();
string numbersOnly = string.Join(',', separatedNumbers);
I hope this helps.
Edit:
The reason that it gives you this: System.Text.RegularExpressions.MatchCollection is because of the default implementation of the ToString method, it simply gives you the full name of the type (including namespaces).
Also, if you want it to match any amount of numbers up to 12, simply change the regex to [0-9]{1,12} as you did initially.
1) If you want to return all 12-digit numbers (no more, no less), your regex should be [0-9]{12}. The $ you had in your OP matches the end of a string or line, and the {1,12} in your OP matches any number of digits from 1 to 12. If the number has to be surrounded by commas or string anchors so that 13-digit numbers are not matched, your regex would look something like (?<=^|,)[0-9]{12}(?=,|$).
2) According to the documentation, Regex.Matches(string) returns a MatchCollection. If you convert that to a string, you just get the type name. You have to go through each item in the collection, like:
Match match = regex.Match(input);
while (match.Success) {
// Your logic here
match = match.NextMatch();
}
3) I think string.Split(',') is easier to use. Then, loop through the array and return all strings that are 12 characters long and are numeric. Alternatively, you could use Linq as others have pointed out.
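The anchored pattern from point 1 can be sketched as follows. This is only an illustration under the assumption that the numbers are separated by commas; the class name TwelveDigitNumbers and the method Extract are invented for the example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TwelveDigitNumbers
{
    // Exactly 12 digits, bounded by a comma or by the start/end of the string,
    // so that longer digit runs are not partially matched.
    private static readonly Regex Exact = new Regex(@"(?<=^|,)[0-9]{12}(?=,|$)");

    // Iterates the MatchCollection and joins the matched values with commas,
    // instead of calling ToString() on the collection itself.
    public static string Extract(string text)
    {
        return string.Join(",", Exact.Matches(text).Cast<Match>().Select(m => m.Value));
    }
}
```

In the button handler from the question, the result could then be assigned with textBox2.Text = TwelveDigitNumbers.Extract(richTextBox1.Text);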
You can simply use LINQ for this purpose. If you want to get just 512025151988:
textBox2.Text = string.Join("",richTextBox1.Text.SkipWhile(c =>
!char.IsDigit(c)).TakeWhile(char.IsDigit));
Or if you want to get all numbers (512025151988,512025151988,512025151988,512025151988,512025151988):
textBox2.Text = string.Join(",",richTextBox1.Text.Split(',')
.Select(d => string.Join("",d
.SkipWhile(c => !char.IsDigit(c)).TakeWhile(char.IsDigit))))
.TrimStart(',').TrimEnd(',');
Replace first comma with space if you need to join results with space. string.Join(" ",...

Attempting to capture multiple groups but only the last group is captured

I am trying to use regex to help to convert the following string into a Dictionary:
{TheKey|TheValue}{AnotherKey|AnotherValue}
Like such:
["TheKey"] = "TheValue"
["AnotherKey"] = "AnotherValue"
To parse the string for the dictionary, I am using the regular expression:
^(\{(.+?)\|(.+?)\})*?$
But it will only capture the last group of {AnotherKey|AnotherValue}.
How do I get it to capture all of the groups?
I am using C#.
Alternatively, is there a more straightforward way to approach this rather than using Regex?
Code (Properties["PromptedValues"] contains the string to be parsed):
var regex = Regex.Matches(Properties["PromptedValues"], @"^(\{(.+?)\|(.+?)\})*?$");
foreach (Match match in regex) {
    if (match.Groups.Count == 4) {
        var key = match.Groups[2].Value.ToLower();
        var value = match.Groups[3].Value;
        values.Add(key, new StringPromptedFieldHandler(key, value));
    }
}
This is coded to work for the single value, I would be looking to update it once I can get it to capture multiple values.
The $ says that the match must occur at the end of the string, or before \n at the end of the line or string.
The ^ says that the match must start at the beginning of the string or line.
See the MSDN regular expression documentation for more regex syntax.
Once you remove the ^ and $, your regex will match each of the sets. Read up on Match.Groups and you will get something like the following:
public class Example
{
public static void Main()
{
string pattern = @"\{(.+?)\|(.+?)\}";
string input = "{TheKey|TheValue}{AnotherKey|AnotherValue}";
MatchCollection matches = Regex.Matches(input, pattern);
foreach (Match match in matches)
{
Console.WriteLine("The Key: {0}", match.Groups[1].Value);
Console.WriteLine("The Value: {0}", match.Groups[2].Value);
Console.WriteLine();
}
Console.WriteLine();
}
}
Your regex tries to match against the entire line. You can get individual pairs if you don't use anchors:
var input = "{TheKey|TheValue}{AnotherKey|AnotherValue}";
var matches = Regex.Matches(input, @"(\{(.+?)\|(.+?)\})");
Debug.Assert(matches.Count == 2);
It's better to name the fields though:
var matches = Regex.Matches(input, @"\{(?<key>.+?)\|(?<value>.+?)\}");
This allows you to access the fields by name, and even use LINQ:
var pairs= from match in matches.Cast<Match>()
select new {
key=match.Groups["key"].Value,
value=match.Groups["value"].Value
};
Alternatively, you can use the Captures property of your groups to get all of the times they matched.
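// Use Regex.Match (a single Match) rather than Regex.Matches here.
var match = Regex.Match(Properties["PromptedValues"], @"^(\{(.+?)\|(.+?)\})*?$");
if (match.Success)
{
    for (var i = 0; i < match.Groups[1].Captures.Count; i++)
    {
        var key = match.Groups[2].Captures[i].Value.ToLower();
        var value = match.Groups[3].Captures[i].Value;
    }
}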
This has the advantage of still checking that your entire string was made up of matches. Solutions suggesting you remove the anchors will find things that look like matches in a longer string, but will not fail for you if anything was malformed.
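A minimal sketch of the Captures approach that builds the dictionary in one pass, using the anchored pattern from the question. The class name PromptedValuesParser and the choice to return null for input that does not match are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PromptedValuesParser
{
    // The anchored pattern from the question; group 2 is the key, group 3 the value.
    private const string Pattern = @"^(\{(.+?)\|(.+?)\})*?$";

    // Returns null when the input as a whole does not match the pattern.
    public static Dictionary<string, string> Parse(string input)
    {
        Match match = Regex.Match(input, Pattern);
        if (!match.Success)
        {
            return null;
        }

        var result = new Dictionary<string, string>();
        CaptureCollection keys = match.Groups[2].Captures;
        CaptureCollection values = match.Groups[3].Captures;
        for (int i = 0; i < keys.Count; i++)
        {
            // Indexer assignment: a repeated key overwrites the earlier value.
            result[keys[i].Value.ToLower()] = values[i].Value;
        }
        return result;
    }
}
```

Because the pattern uses .+? for keys and values, some inputs with stray braces can still match; tightening the character classes would make the validation stricter.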

Get specific numbers from string

In my current project I have to work a lot with substrings, and I'm wondering if there is an easier way to get numbers out of a string.
Example:
I have a string like this:
12 text text 7 text
I want to be able to get the first number set or the second number set.
So if I ask for number set 1 I will get 12 in return, and if I ask for number set 2 I will get 7 in return.
Thanks!
This will create an array of integers from the string:
using System.Linq;
using System.Text.RegularExpressions;
class Program {
static void Main() {
string text = "12 text text 7 text";
int[] numbers = (from Match m in Regex.Matches(text, #"\d+") select int.Parse(m.Value)).ToArray();
}
}
Try using regular expressions: the pattern [0-9]+ matches any run of digits within your string. The C# code to use this regex is roughly as follows:
Match match = Regex.Match(input, "[0-9]+");
// Here we check the Match instance.
if (match.Success)
{
    // The pattern has no capturing groups, so read the whole match to get the first number.
    string value = match.Value;
}
You will of course still have to parse the returned strings.
Looks like a good match for Regex.
The basic regular expression would be \d+ to match on (one or more digits).
You would iterate through the Matches collection returned from Regex.Matches and parse each returned match in turn.
var matches = Regex.Matches(input, @"\d+");
foreach (Match match in matches)
{
    myIntList.Add(int.Parse(match.Value));
}
You could use regex:
Regex regex = new Regex(@"[0-9]+");
You can split the string into parts using string.Split, and then traverse the parts with a foreach, applying int.TryParse, something like this:
string test = "12 text text 7 text";
var numbers = new List<int>();
int i;
foreach (string s in test.Split(' '))
{
if (int.TryParse(s, out i)) numbers.Add(i);
}
Now numbers has the list of valid values
