Check if the inputs have the same values in Regex - c#

I am trying to get the input from the user in a single line with [, ,] separators, like this:
[Q,W,1] [R,T,3] [Y,U,9]
And then I will use these inputs in a function like this:
f.MyFunction('Q','W',1); // Third parameter will be taken as integer
f.MyFunction('R','T',3);
f.MyFunction('Y','U',9);
So, using Regex:
var funcArgRE = new Regex(@"\[(.),(.),(\d+)\]", RegexOptions.Compiled);
foreach (Match match in funcArgRE.Matches(input))
{
var g = match.Groups;
f.MyFunction(g[1].Value[0], g[2].Value[0], Int32.Parse(g[3].Value));
}
But I also want to check whether any of the inputs have the same char combination, like:
[Q,W,1] [U,Y,3] [Z,K,1] [Y,U,9]
if(theyHaveTheSame_Combination)
// do sth.
How can I do this inside the regex code piece?

You can use
\[([A-Za-z]),([A-Za-z]),(\d+)](?=.*\[(?:\1,\2|\2,\1),\d+])
Or, if you can really have anything in place of letters:
\[(.),(.),(\d+)](?=.*\[(?:\1,\2|\2,\1),\d+])
See the regex demo.
Details:
\[(.),(.),(\d+)] - a [ char, any single char (Group 1), comma, any single char (Group 2), comma, one or more digits (Group 3), ] char
(?=.*\[(?:\1,\2|\2,\1),\d+]) - a positive lookahead that requires the following pattern to appear immediately to the right of the current location:
.* - any zero or more chars other than line break chars as many as possible
\[ - a [ char
(?:\1,\2|\2,\1) - Group 1 value, comma, Group 2 value, or Group 2 value, comma, Group 1 value
,\d+] - comma, one or more digits, ] char.
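As a minimal sketch of how the pattern above could be used from C#, the check reduces to a single IsMatch call. The class and method names here (CombinationCheck, HasRepeatedCombination) are hypothetical and not part of the original question.

```csharp
using System.Text.RegularExpressions;

public static class CombinationCheck
{
    // Pattern from the answer above: a triple that is followed, somewhere later
    // on the same line, by a triple with the same two characters in either order.
    private static readonly Regex RepeatedPair = new Regex(
        @"\[(.),(.),(\d+)](?=.*\[(?:\1,\2|\2,\1),\d+])",
        RegexOptions.Compiled);

    // Hypothetical helper: true when any character pair occurs more than once.
    public static bool HasRepeatedCombination(string input)
    {
        return RepeatedPair.IsMatch(input);
    }
}
```

With the sample input [Q,W,1] [U,Y,3] [Z,K,1] [Y,U,9], the second triple satisfies the lookahead because [Y,U,9] appears later, so the method returns true.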

Regex can help you parse the input string, but to compare the different inputs to see if any are duplicates, you will need some other logic.
The structure of your data seems to be this:
class Command {
public char Letter1;
public char Letter2;
public int Number;
}
class CommandBatch {
public Command[] Commands;
}
You can use the regex you have to populate a CommandBatch.
You can create a function to compare two Commands, to see if they have matching letters.
bool AreMatching(Command c1, Command c2) {
return (c1.Letter1 == c2.Letter1 && c1.Letter2 == c2.Letter2)
|| (c1.Letter1 == c2.Letter2 && c1.Letter2 == c2.Letter1);
}
And then you can use that to make a function that checks a whole CommandBatch.
bool AnyDuplicates(CommandBatch batch) {
var pairs = from c1 in batch.Commands
from c2 in batch.Commands
where c1 != c2
select (c1, c2);
return pairs.Any(tup => AreMatching(tup.Item1, tup.Item2));
}
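The pairwise comparison above checks every command against every other one. An alternative sketch (a different technique, not the code from this answer) parses with the regex from the question and tracks order-normalized letter pairs in a HashSet. The PairDuplicates class name is hypothetical, and value tuples are assumed to be available.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PairDuplicates
{
    // Same pattern as in the question: [char,char,digits].
    private static readonly Regex FuncArg =
        new Regex(@"\[(.),(.),(\d+)\]", RegexOptions.Compiled);

    public static bool AnyDuplicates(string input)
    {
        var seen = new HashSet<(char, char)>();
        foreach (Match match in FuncArg.Matches(input))
        {
            char a = match.Groups[1].Value[0];
            char b = match.Groups[2].Value[0];
            // Order the pair so that (Y,U) and (U,Y) produce the same key.
            var key = a <= b ? (a, b) : (b, a);
            // HashSet.Add returns false when the key is already present.
            if (!seen.Add(key))
            {
                return true;
            }
        }
        return false;
    }
}
```

This makes a single pass over the matches instead of comparing all pairs, at the cost of the extra set.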

Related

How to capitalize 1st letter (ignoring non a-z) with regex in c#?

There are tons of posts regarding how to capitalize the first letter with C#, but I specifically am struggling how to do this when ignoring prefixed non-letter characters and tags inside them. Eg,
<style=blah>capitalize the word, 'capitalize'</style>
How to ignore potential <> tags (or non-letter chars before it, like asterisk *) and the contents within them, THEN capitalize "capitalize"?
I tried:
public static string CapitalizeFirstCharToUpperRegex(string str)
{
// Check for empty string.
if (string.IsNullOrEmpty(str))
return string.Empty;
// Return char and concat substring.
// Start @ first char, no matter what (avoid <tags>, etc)
string pattern = @"(^.*?)([a-z])(.+)";
// Extract middle, then upper 1st char
string middleUpperFirst = Regex.Replace(str, pattern, "$2");
middleUpperFirst = CapitalizeFirstCharToUpper(str); // Works
// Inject the middle back in
string final = $"$1{middleUpperFirst}$3";
return Regex.Replace(str, pattern, final);
}
EDIT:
Input: <style=foo>first non-tagged word 1st char upper</style>
Expected output: <style=foo>First non-tagged word 1st char upper</style>
You may use
<[^<>]*>|(?<!\p{L})(\p{L})(\p{L}*)
The regex does the following:
<[^<>]*> - matches <, any 0+ chars other than < and > and then >
| - or
(?<!\p{L}) - finds a position not immediately preceded with a letter
(\p{L}) - captures into Group 1 any letter
(\p{L}*) - captures into Group 2 any 0+ letters (that is necessary if you want to lowercase the rest of the word).
Then, check if Group 2 matched, and if yes, capitalize the first group value and lowercase the second one, else, return the whole value:
var result = Regex.Replace(s, @"<[^<>]*>|(?<!\p{L})(\p{L})(\p{L}*)", m =>
m.Groups[1].Success ?
m.Groups[1].Value.ToUpper() + m.Groups[2].Value.ToLower() :
m.Value);
If you do not need to lowercase the rest of the word, remove the second group and the code related to it:
var result = Regex.Replace(s, @"<[^<>]*>|(?<!\p{L})(\p{L})", m =>
m.Groups[1].Success ?
m.Groups[1].Value.ToUpper() : m.Value);
To only replace the first occurrence using this approach, you need to set a flag and reverse it once the first match is found:
var s = "<style=foo>first non-tagged word 1st char upper</style>";
var found = false;
var result = Regex.Replace(s, @"<[^<>]*>|(?<!\p{L})(\p{L})", m => {
if (m.Groups[1].Success && !found) {
found = !found;
return m.Groups[1].Value.ToUpper();
} else {
return m.Value;
}
});
Console.WriteLine(result); // => <style=foo>First non-tagged word 1st char upper</style>
See the C# demo.
Using a lookbehind, you can match the first word that directly follows a tag's closing > (here, the first 'capitalize') and then capitalize the match.
The regex is the following:
(?<=<.*>)\w+
It matches the word that immediately follows the closing > of a tag.
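The capitalization step is not shown in this answer. A minimal sketch of it, assuming the text always starts with a tag and only the first such word should change, could use the Regex.Replace overload that takes a match evaluator and a replacement count. The class and method names are hypothetical.

```csharp
using System.Text.RegularExpressions;

public static class TagAwareCapitalizer
{
    // Lookbehind pattern from the answer above: a word directly preceded by <...>.
    private static readonly Regex WordAfterTag = new Regex(@"(?<=<.*>)\w+");

    public static string CapitalizeFirstWordAfterTag(string text)
    {
        // The final argument (1) limits the replacement to the first match.
        return WordAfterTag.Replace(
            text,
            m => char.ToUpperInvariant(m.Value[0]) + m.Value.Substring(1),
            1);
    }
}
```

Text that does not contain a tag directly before the first word is returned unchanged, so this is narrower than the alternation-based approach shown earlier.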

Regex: Get a list of id

I need to get a list of IDs from a string. The regex for the string looks like this:
"GET_LIST( [A-Za-z0-9]{5,10}){0,100}";
When I send a string like this:
GET_LIST 1000 10001 10002
I'd like to get something like "1000 10001 10002" or, better, a list of IDs. But when I try to get this with matches.Groups[1].Value;
I only get the last ID.
My code currently looks like this:
public IList<string> ExctractListId(string command)
{
IList<string> id = new List<string>();
Match matches = new Regex(ReponseListeService).Match(command);
if (matches.Success)
{
string ids = matches.Groups[1].Value;
Console.WriteLine(ids);
return id;
}
return id;
}
I know the code is not fully right; I just want to get a list or a string with all the IDs.
This code is for homework, and I can't use Split(), Concat(), ...
How can I do this?
You may use
private static string pattern = @"^GET_LIST(?:\s+([A-Za-z0-9]{4,10})){0,100}$";
private static List<string> ExtractListId(string command)
{
return Regex.Matches(command, pattern)
.Cast<Match>().SelectMany(p => p.Groups[1].Captures
.Cast<Capture>()
.Select(t => t.Value)
)
.ToList();
}
See the C# demo and a regex demo.
Details
^ - matches start of string
GET_LIST - a literal substring
(?:\s+([A-Za-z0-9]{4,10})){0,100} - 0 to 100 occurrences of
\s+ - 1+ whitespaces
([A-Za-z0-9]{4,10}) - Capturing group 1: 4 to 10 alphanumeric ASCII chars
$ - end of string.
Note that we have a capturing group (([A-Za-z0-9]{4,10})) inside a quantified non-capturing group (?:...){0,100}. To get those values, you should access the group capture collection. As the group has ID 1, you need to get match.Groups[1] and access all its .Captures.
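To make the difference between the group's Value and its Captures concrete, here is a small sketch that reuses the pattern above and avoids Split() and LINQ. The CaptureDemo class and its method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    private const string Pattern = @"^GET_LIST(?:\s+([A-Za-z0-9]{4,10})){0,100}$";

    // Groups[1].Value only holds the text of the LAST repetition of the group.
    public static string LastId(string command)
    {
        return Regex.Match(command, Pattern).Groups[1].Value;
    }

    // Groups[1].Captures holds every repetition, in left-to-right order.
    public static List<string> AllIds(string command)
    {
        var ids = new List<string>();
        Match match = Regex.Match(command, Pattern);
        foreach (Capture capture in match.Groups[1].Captures)
        {
            ids.Add(capture.Value);
        }
        return ids;
    }
}
```

For "GET_LIST 1000 10001 10002", LastId returns "10002" while AllIds returns the three IDs 1000, 10001 and 10002.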
You can also use the String.Split() method to split the string on whitespace characters, and then return all items that can be parsed to an int. Note that this will return all items that are valid integers, so it will work with your sample input, but if you have other types of input it may need some modification.
public static IList<string> ExctractListId(string command)
{
if (command == null || !command.StartsWith("GET_LIST"))
{
return new List<string>();
}
int temp;
return command.Split().Where(item => int.TryParse(item, out temp)).ToList();
}
Example usage:
private static void Main()
{
Console.WriteLine(string.Join(", ", ExctractListId("GET_LIST 1000 10001 10002")));
GetKeyFromUser("\nDone! Press any key to exit...");
}
The data you are searching contains whitespace, so add a space or \s to the regex and try again.
Hope this helps.
Sorry, I couldn't completely understand the problem.
A small code snippet using Javascript
function getId(data){
var regex = /^GET_LIST(([\d\s]{5,10}){0,100})/g;
var match = regex.exec(data);
console.log(match[1]);
}

compiled Regex template with passing value dynamically [duplicate]

This is the input string: 23x^45*y or 2x^2 or y^4*x^3.
I am matching ^[0-9]+ after letter x. In other words I am matching x followed by ^ followed by numbers. Problem is that I don't know that I am matching x, it could be any letter that I stored as variable in my char array.
For example:
foreach (char cEle in myarray) // cEle is letter in char array x, y, z, ...
{
match CEle in regex(input) //PSEUDOCODE
}
I am new to regex and I know that this can be done if I define regex variables, but I don't know how.
You can use the pattern @"[cEle]\^\d+" which you can create dynamically from your character array:
string s = "23x^45*y or 2x^2 or y^4*x^3";
char[] letters = { 'e', 'x', 'L' };
string regex = string.Format(@"[{0}]\^\d+",
Regex.Escape(new string(letters)));
foreach (Match match in Regex.Matches(s, regex))
Console.WriteLine(match);
Result:
x^45
x^2
x^3
A few things to note:
It is necessary to escape the ^ inside the regular expression otherwise it has a special meaning "start of line".
It is a good idea to use Regex.Escape when inserting literal strings from a user into a regular expression, to avoid that any characters they type get misinterpreted as special characters.
This will also match the x from the end of variables with longer names like tax^2. This can be avoided by requiring a word boundary (\b).
If you write x^1 as just x then this regular expression will not match it. This can be fixed by using (\^\d+)?.
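If the letters need to be handled one at a time, as in the loop from the question, the same idea can be sketched with one pattern per letter. This is an illustrative variant rather than the code above: the ExponentFinder class and ExponentsOf method are hypothetical names, and the exponent is captured in a group so the letter and ^ are left out of the result.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ExponentFinder
{
    // Returns the exponents that follow "<letter>^" in the input, in order.
    public static List<string> ExponentsOf(char letter, string input)
    {
        // Regex.Escape guards against a "letter" that is a regex metacharacter;
        // the ^ is escaped so it is matched literally.
        string pattern = Regex.Escape(letter.ToString()) + @"\^(\d+)";
        var exponents = new List<string>();
        foreach (Match match in Regex.Matches(input, pattern))
        {
            exponents.Add(match.Groups[1].Value);
        }
        return exponents;
    }
}
```

Calling ExponentsOf for each element of the char array gives a separate list per variable; for the sample string, 'x' yields 45, 2, 3 and 'y' yields 4.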
The easiest and fastest way to implement this, from my point of view, is the following:
Input: This?_isWhat?IWANT
string tokenRef = "?";
Regex pattern = new Regex($@"([^{tokenRef}\/>]+)");
The pattern skips my tokenRef and yields the following output:
Group1 This
Group2 _isWhat
Group3 IWANT
Try using this pattern for capturing the number but excluding the x^ prefix:
(?<=x\^)[0-9]+
string strInput = "23x^45*y or 2x^2 or y^4*x^3";
foreach (Match match in Regex.Matches(strInput, @"(?<=x\^)[0-9]+"))
Console.WriteLine(match);
This should print :
45
2
3
Do not forget to use the option IgnoreCase for matching, if required.

Regex expression for matching only floating point numbers

Hi, I need a regular expression for extracting only floating-point numbers, from right to left.
Example string
Earning per Equity Share (in ) face value of 2 each26 1,675.10
1,252.56
My current Regex
(\+|-)?[0-9][0-9]*(\,[0-9]*)?(\.[0-9]*)? with RegexOptions.RightToLeft
but
Current Output is
1,252.56
1,675.10
26
2
However, I do not want to match 26 or 2.
Please help.
Maybe something like this will help
Regex
/[-+]?[0-9,\.]*([,\.])[0-9]*/g
Example input
Earning -34 5 b4 pe8r blah4 t3st + - (in) 1,252.56 face
-12234,23423.342 of 1,675.10 1,252.56
Matches
1,252.56
-12234,23423.342
1,675.10
1,252.56
Explanation
[-+]? match a single character present in the list below
Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy]
-+ a single character in the list -+ literally
[0-9,\.]* match a single character present in the list below
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
0-9 a single character in the range between 0 and 9
, the literal character ,
\. matches the character . literally
1st Capturing group ([,\.])
[,\.] match a single character present in the list below
, the literal character ,
\. matches the character . literally
[0-9]* match a single character present in the list below
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
0-9 a single character in the range between 0 and 9
g modifier: global. All matches (don't return on first match)
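For the C# side of the question, a stricter variant can be sketched as follows. This is not the pattern explained above: it assumes numbers use commas only as thousands separators and always have a decimal part, and it uses RegexOptions.RightToLeft so matches come back from the end of the text first. The DecimalExtractor class name is hypothetical.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class DecimalExtractor
{
    // Optional sign, 1-3 digits, optional ",ddd" groups, then a mandatory ".digits".
    private static readonly Regex DecimalNumber = new Regex(
        @"[-+]?\d{1,3}(?:,\d{3})*\.\d+",
        RegexOptions.RightToLeft);

    // Returns the matched numbers as text, starting from the end of the input.
    public static List<string> ExtractRightToLeft(string input)
    {
        var numbers = new List<string>();
        foreach (Match match in DecimalNumber.Matches(input))
        {
            numbers.Add(match.Value);
        }
        return numbers;
    }
}
```

Because the decimal part is mandatory, plain integers such as 26 and 2 in the question's sample are skipped.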
Although this is a regex question, it is also tagged as C#.
Below is an example of how you might get a little bit more control over your output.
It's also culture-specific, only picks up numbers with a decimal place, and has no false positives.
Method
private List<double> GetNumbers(string input)
{
// declare result
var resultList = new List<double>();
// if input is empty return empty results
if (string.IsNullOrEmpty(input))
{
return resultList;
}
// Split input in to words, exclude empty entries
var words = input.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
// set your desired culture
var culture = CultureInfo.CreateSpecificCulture("en-GB");
// Refine words into a list that represents potential numbers
// must have decimal place, must not start or end with decimal place
var refinedList = words.Where(x => x.Contains(".") && !x.StartsWith(".") && !x.EndsWith("."));
foreach (var word in refinedList)
{
double value;
// parse words using designated culture, and the Number option of double.TryParse
if (double.TryParse(word, NumberStyles.Number, culture, out value))
{
resultList.Add(value);
}
}
return resultList;
}
Usage
var testString = "Earning -34 5 b4 , . 234. 234, ,345 45.345 $234234 234.3453.345 $23423.2342 +234 -23423 pe8r blah4 t3st + - (in) 1,252.56 face -12234,23423.342 of 1,675.10 1,252.56";
var results = GetNumbers(testString);
foreach (var item in results)
{
Debug.WriteLine("{0}", item);
}
Output
45.345
1252.56
-1223423423.342
1675.1
1252.56
Additional Notes
You can learn more about double.TryParse and its options here.
You can learn more about the CultureInfo Class here.

.Net Removing all the first 0 of a string

I got the following :
01.05.03
I need to convert that to 1.5.3
The problem is I cannot only trim the 0 because if I got :
01.05.10
I need to convert that to 1.5.10
So, what's the better way to solve that problem ? Regex ? If so, any regex example doing that ?
Expanding on the answer of @FrustratedWithFormsDesigner:
string Strip0s(string s)
{
return string.Join<int>(".", from x in s.Split('.') select int.Parse(x));
}
Regex-replace
(?<=^|\.)0+
with the empty string. The regex is:
(?<= # begin positive look-behind (i.e. "a position preceded by")
^|\. # the start of the string or a literal dot †
) # end positive look-behind
0+ # one or more "0" characters
† note that not all regex flavors support variable-length look-behind, but .NET does.
If you expect this kind of input: "00.03.03" and want to keep the leading zero in this case (like "0.3.3"), use this expression instead:
(?<=^|\.)0+(?=\d)
and again replace with the empty string.
From the comments (thanks Kobi): There is a more concise expression that does not require look-behind and is equivalent to my second suggestion:
\b0+(?=\d)
which is
\b # a word boundary (a position between a word char and a non-word char)
0+ # one or more "0" characters
(?=\d) # positive look-ahead: a position that's followed by a digit
This works because the 0 happens to be a word character, so word boundaries can be used to find the first 0 in a row. It is a more compatible expression, because many regex flavors do not support variable-length look-behind, and some (like JavaScript) no look-behind at all.
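As a minimal sketch, the word-boundary expression can be applied with Regex.Replace. The LeadingZeros class and Strip method are hypothetical names used only for illustration.

```csharp
using System.Text.RegularExpressions;

public static class LeadingZeros
{
    // Removes runs of zeros that start a number and are followed by another digit,
    // so a part consisting only of zeros keeps a single 0.
    public static string Strip(string version)
    {
        return Regex.Replace(version, @"\b0+(?=\d)", "");
    }
}
```

With this, "01.05.03" becomes "1.5.3", "01.05.10" becomes "1.5.10", and "00.03.03" becomes "0.3.3".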
You could split the string on ., then trim the leading 0s on the results of the split, then merge them back together.
I don't know of a way to do this in a single operation, but you could write a function that hides this and makes it look like a single operation. ;)
UPDATE:
I didn't even think of the other guy's regex. Yeah, that will probably do it in a single operation.
Here's another way you could do what FrustratedWithFormsDesigner suggests:
string s = "01.05.10";
string s2 = string.Join(
".",
s.Split('.')
.Select(str => str.TrimStart('0'))
.ToArray()
);
This is almost the same as dtb's answer, but doesn't require that the substrings be valid integers (it would also work with, e.g., "000A.007.0HHIMARK").
UPDATE: If you'd want any strings consisting of all 0s in the input string to be output as a single 0, you could use this:
string s2 = string.Join(
".",
s.Split('.')
.Select(str => TrimLeadingZeros(str))
.ToArray()
);
public static string TrimLeadingZeros(string text) {
int number;
if (int.TryParse(text, out number))
return number.ToString();
else
return text.TrimStart('0');
}
Example input/output:
00.00.000A.007.0HHIMARK // input
0.0.A.7.HHIMARK // output
There's also the old-school way which probably has better performance characteristics than most other solutions mentioned. Something like:
static public string NormalizeVersionString(string versionString)
{
if(versionString == null)
throw new ArgumentNullException("versionString");
bool insideNumber = false;
StringBuilder sb = new StringBuilder(versionString.Length);
foreach(char c in versionString)
{
if(c == '.')
{
sb.Append('.');
insideNumber = false;
}
else if(c >= '1' && c <= '9')
{
sb.Append(c);
insideNumber = true;
}
else if(c == '0')
{
if(insideNumber)
sb.Append('0');
}
}
return sb.ToString();
}
string s = "01.05.10";
string newS = s.Replace(".0", ".");
newS = newS.StartsWith("0") ? newS.Substring(1, newS.Length - 1) : newS;
Console.WriteLine(newS);
NOTE: You will have to thoroughly check all possible input combinations.
This looks like it is a date format, if so I would use Date processing code
DateTime time = DateTime.Parse("01.02.03");
String newFormat = time.ToString("d.M.yy");
or even better
String newFormat = time.ToShortDateString();
which will respect your and your clients' culture settings.
If this data is not a date then don't use this :)
I had a similar requirement to parse a string with street addresses, where some of the house numbers had leading zeroes and I needed to remove them while keeping the rest of the text intact. I slightly edited the accepted answer to meet my requirements; maybe someone will find it useful. It does basically the same as the accepted answer, except that I check whether each part of the string can be parsed as an integer and fall back to the original string value when it cannot.
string Strip0s(string s)
{
int outputValue;
return
string.Join(" ",
from x in s.Split(new[] { ' ' })
select int.TryParse(x, out outputValue) ? outputValue.ToString() : x);
}
Input: "Islands Brygge 34 B 07 TV"
Output: "Islands Brygge 34 B 7 TV"
