Regex to match all Romanian phone numbers - c#

I searched all over Google for a way to verify that a phone number is Romanian, but didn't find anything that helps me...
I want a Regex validator for the following numbers format:
074xxxxxxx
075xxxxxxx
076xxxxxxx
078xxxxxxx
072xxxxxxx
077xxxxxxx
0251xxxxxx
0351xxxxxx
This is the regex that I've made, but it is not working:
{ "Romania", new Regex("(/^(?:(?:(?:00\\s?|\\+)40\\s?|0)(?:7\\d{2}\\s?\\d{3}\\s?\\d{3}|(21|31)\\d{1}\\s?\\d{3}\\s?\\d{3}|((2|3)[3-7]\\d{1})\\s?\\d$)")}
It doesn't validate the correct numbers format.
More details:
If the number begins with other than the initial ones that I've added, then that number is not valid.
The x's can be any digits, but they must not all be the same digit, like 0000000, 1111111, etc.
It can also have the following format (but not mandatory): (072)xxxxxxx
Is there any way of doing this?
I want to implement this to store these numbers in database and check if their format is Romanian.
This is the code where I need to add the regex expression...there should be a new Regex named "Romanian"
static IDictionary<string, Regex> countryRegex = new Dictionary<string, Regex>()
{
{ "USA", new Regex("^[2-9]\\d{2}-\\d{3}-\\d{4}$")},
{ "UK", new Regex("(^1300\\d{6}$)|(^1800|1900|1902\\d{6}$)|(^0[2|3|7|8]{1}[0-9]{8}$)|(^13\\d{4}$)|(^04\\d{2,3}\\d{6}$)")},
{ "Netherlands", new Regex("(^\\+[0-9]{2}|^\\+[0-9]{2}\\(0\\)|^\\(\\+[0-9]{2}\\)\\(0\\)|^00[0-9]{2}|^0)([0-9]{9}$|[0-9\\-\\s]{10}$)")},
};

If I understand the rules correctly, this pattern should work:
^(?<paren>\()?0(?:(?:72|74|75|76|77|78)(?(paren)\))(?<first>\d)(?!\k<first>{6})\d{6}|(?:251|351)(?(paren)\))(?<first>\d)(?!\k<first>{5})\d{5})$
So, you could add it to your code like this:
static IDictionary<string, Regex> countryRegex = new Dictionary<string, Regex>()
{
{ "USA", new Regex("^[2-9]\\d{2}-\\d{3}-\\d{4}$")},
{ "UK", new Regex("(^1300\\d{6}$)|(^1800|1900|1902\\d{6}$)|(^0[2|3|7|8]{1}[0-9]{8}$)|(^13\\d{4}$)|(^04\\d{2,3}\\d{6}$)")},
{ "Netherlands", new Regex("(^\\+[0-9]{2}|^\\+[0-9]{2}\\(0\\)|^\\(\\+[0-9]{2}\\)\\(0\\)|^00[0-9]{2}|^0)([0-9]{9}$|[0-9\\-\\s]{10}$)")},
{ "Romania", new Regex(@"^(?<paren>\()?0(?:(?:72|74|75|76|77|78)(?(paren)\))(?<first>\d)(?!\k<first>{6})\d{6}|(?:251|351)(?(paren)\))(?<first>\d)(?!\k<first>{5})\d{5})$")}
};
Here is the meaning of the pattern:
^ - Matches must start at the beginning of the input string
(?<paren>\()? - Optionally matches a ( character. If it is there, it captures it in a group named paren
0 - The number must start with a single 0
(?: - Begins a non-capturing group for the purpose of matching one of two different formats
(?:72|74|75|76|77|78)(?(paren)\))(?<first>\d)(?!\k<first>{6})\d{6} - The first format
(?:72|74|75|76|77|78) - The next two digits must be 72, 74, 75, 76, 77, or 78
(?(paren)\)) - If the opening ( exists, then there must be a closing ) here
(?<first>\d) - Matches just the first of the ending seven digits and captures it in a group named first
(?!\k<first>{6}) - A negative look-ahead which ensures that the remaining six digits are not all the same as the first one
\d{6} - Matches the remaining six digits
| - The or operator
(?:251|351)(?(paren)\))(?<first>\d)(?!\k<first>{5})\d{5} - The second format
(?:251|351) - The next three digits must be 251 or 351.
(?(paren)\)) - If the opening ( exists, then there must be a closing ) here
(?<first>\d) - Matches just the first of the ending six digits and captures it in a group named first
(?!\k<first>{5}) - A negative look-ahead which ensures that the remaining five digits are not all the same as the first one
\d{5} - Matches the remaining five digits
) - Ends the non-capturing group which specified the two potential formats
$ - The match must go all the way to the end of the input string
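As a quick check of the pattern above, here is a minimal sketch that runs it against a few sample inputs. The class name RomanianPhone, the IsValid helper and the sample numbers are made up for illustration; only the pattern itself comes from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RomanianPhone
{
    // Pattern from the answer above, written as a verbatim string (@"...")
    // so the backslashes do not need to be doubled.
    private static readonly Regex Pattern = new Regex(
        @"^(?<paren>\()?0(?:(?:72|74|75|76|77|78)(?(paren)\))(?<first>\d)(?!\k<first>{6})\d{6}|(?:251|351)(?(paren)\))(?<first>\d)(?!\k<first>{5})\d{5})$");

    // Hypothetical helper: true when the whole input matches the pattern.
    public static bool IsValid(string input) => Pattern.IsMatch(input);

    public static void Main()
    {
        // Illustrative inputs only; none of them are real phone numbers.
        string[] samples =
        {
            "0741234567",   // listed mobile prefix
            "(072)1234567", // optional parentheses around the prefix
            "0251123456",   // listed landline prefix
            "0740000000",   // same digit repeated after the prefix
            "0251111111",   // same digit repeated after the prefix
            "0731234567",   // prefix not in the list
            "(0721234567"   // opening parenthesis without a closing one
        };

        foreach (string sample in samples)
        {
            Console.WriteLine($"{sample} -> {IsValid(sample)}");
        }
    }
}
```

With these inputs, the first three should be accepted and the last four rejected, which lines up with the rules in the question.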

Try this one: ^(?=0[723][2-8]\d{7})(?!.*(.)\1{2,}).{10}$ - The negative lookahead (?!...) is testing the repeating characters
I use http://regexr.com/ to test

This matches your examples:
0(([7][456728])|([23]51)).*

Related

Regex pattern not working on my C# code however it works on an online tester

I want to extract the double value from the string that contains a specific keyword. For example:
Amount : USD 3,747,190.67
I need to extract the value "3,747,190.67" from the string above using the keyword Amount, for that I tested this pattern in different online Regex testers and it works:
(?<=\bAmount.*)(\d+\,*\.*)*
However it doesn't work on my C# code:
if (type == typeof(double))
{
double doubleVal = 0;
pattern = @"(?<=\bAmount.*)(\d+\,*\.*)*";
matchPattern = Regex.Match(textToParse, pattern);
if (matchPattern.Success)
{
double.TryParse(matchPattern.Value.ToString(), out doubleVal);
}
return doubleVal;
}
This one works:
(?<=\bAmount.*)\d+(,\d+)*(\.\d+)?
(?<=\bAmount.*) the look behind
\d+                      leading digits (at least one digit)
(,\d+)*               thousands groups (zero or more times)
(\.\d+)?             decimals (? = optional)
Note that the regex tester says "9 matches found" for your pattern. For my pattern it says "1 match found".
The problem with your pattern is that its second part (\d+\,*\.*)* can be empty because of the * at the end. The * quantifier means zero, one or more repetitions. Therefore, the look-behind finds 8 empty entries between Amount and the number. The last of the 9 matches is the number. You can correct it by replacing the * with a +. See: regextester with *, regextester with +. You can also test it with "your" tester and switch to the table to see the detailed results.
My solution does not allow consecutive commas or points but allows numbers without thousands groups or a decimal part.
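To see the difference in C#, here is a small sketch comparing the first match of both patterns on the sample text. The class and method names (AmountDemo, FirstMatch, ParseAmount) are invented for this illustration, and parsing with the invariant culture is my own assumption so that the result does not depend on the machine's regional settings.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class AmountDemo
{
    // Pattern from the question: every part is optional, so it can match an empty string.
    public const string OriginalPattern = @"(?<=\bAmount.*)(\d+\,*\.*)*";

    // Pattern from the answer: requires at least one leading digit.
    public const string FixedPattern = @"(?<=\bAmount.*)\d+(,\d+)*(\.\d+)?";

    // Returns the text of the first match; an empty or failed match yields "".
    public static string FirstMatch(string text, string pattern) =>
        Regex.Match(text, pattern).Value;

    // Assumption: ',' is the group separator and '.' the decimal point,
    // which is what the invariant culture uses.
    public static double ParseAmount(string value) =>
        double.Parse(value, NumberStyles.Number, CultureInfo.InvariantCulture);

    public static void Main()
    {
        string text = "Amount : USD 3,747,190.67";

        // The first match of the original pattern is the empty string right after "Amount".
        Console.WriteLine($"original: '{FirstMatch(text, OriginalPattern)}'");

        string value = FirstMatch(text, FixedPattern);
        Console.WriteLine($"fixed:    '{value}'");
        Console.WriteLine(ParseAmount(value).ToString(CultureInfo.InvariantCulture));
    }
}
```

Regex.Match only returns the first match, which is why the original code ended up parsing an empty string and returning 0.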
The lookbehind (?<=\bAmount.*) is always true in the example data after Amount.
The first 7 matches are empty, as (\d+\,*\.*)* can not consume a character where there is no digit, but it matches at the position as the quantifier * matches 0 or more times.
See this screenshot of the matches:
You might use
(?<=\bAmount\D*)\d{1,3}(?:,\d{3})*(?:\.\d{1,2})?\b
(?<=\bAmount\D*) Positive lookbehind, assert Amount to the left followed by optional non digits
\d{1,3} Match 1-3 digits
(?:,\d{3})* Optionally repeat , and 3 digits
(?:\.\d{1,2})? Optionally match . and 1 or 2 digits
\b A word boundary
See a .NET regex demo
For example
double doubleVal = 0;
var pattern = @"(?<=\bAmount\D*)\d{1,3}(?:,\d{3})*(?:\.\d{1,2})?\b";
var textToParse = "Amount : USD 3,747,190.67";
var matchPattern = Regex.Match(textToParse, pattern);
if (matchPattern.Success)
{
double.TryParse(matchPattern.Value.ToString(), out doubleVal);
}
Console.WriteLine(doubleVal);
Output
3747190.67
You can omit the word boundaries if needed if a partial match is also valid
(?<=Amount\D*)\d{1,3}(?:,\d{3})*(?:\.\d{1,2})?

REGEX Matching string nonconsecutively

I'm trying to understand how to match a specific string that's held within an array (This string will always be 3 characters long, ex: 123, 568, 458 etc) and I would match that string to a longer string of characters that could be in any order (9841273 for example). Is it possible to check that at least 2 of the 3 characters in the string match (in this example) strMoves? Please see my code below for clarification.
private readonly string[] strSolutions = new string[8] { "123", "159", "147", "258", "357", "369", "456", "789" };
private static string strMoves = "1823742";
foreach (string strResult in strSolutions)
{
Regex rgxMain = new Regex("[" + strMoves + "]{2}");
if (rgxMain.IsMatch(strResult))
{
MessageBox.Show(strResult);
}
}
The portion where I have designated "{2}" in Regex is where I expected the result to check for at least 2 matching characters, but my logic is definitely flawed. It will return true IF the two characters are in consecutive order as compared to the string in strResult. If it's not in the correct order it will return false. I'm going to continue to research on this but if anyone has ideas on where to look in Microsoft's documentation, that would be greatly appreciated!
Correct order where it would return true: "144257" when matched to "123"
incorrect order: "35718" when matched to "123"
The 3 is before the 1, so it won't match.
You can use the following solution if you need to find at least two different not necessarily consecutive chars from a specified set in a longer string:
new Regex($@"([{strMoves}]).*(?!\1)[{strMoves}]", RegexOptions.Singleline)
It will look like
([1823742]).*(?!\1)[1823742]
See the regex demo.
Pattern details:
([1823742]) - Capturing group 1: one of the chars in the character class
.* - any zero or more chars as many as possible (due to RegexOptions.Singleline, . matches any char including newline chars)
(?!\1) - a negative lookahead that fails the match if the next char is a starting point of the value stored in the Group 1 memory buffer (since it is a single char here, the next char should not equal the text in Group 1, one of the specified digits)
[1823742] - one of the chars in the character class.
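Plugged into the loop from the question, the pattern could be used roughly like this. This is only a sketch: MoveMatcher and HasTwoDistinct are hypothetical names, Console.WriteLine stands in for MessageBox.Show, and it assumes strMoves contains only digits, so nothing needs escaping inside the character class.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MoveMatcher
{
    // True when the candidate contains at least two different characters
    // from strMoves, in any order and not necessarily adjacent.
    // Assumption: strMoves holds digits only, so it is safe inside [...].
    public static bool HasTwoDistinct(string strMoves, string candidate)
    {
        var rgx = new Regex($@"([{strMoves}]).*(?!\1)[{strMoves}]", RegexOptions.Singleline);
        return rgx.IsMatch(candidate);
    }

    public static void Main()
    {
        string[] strSolutions = { "123", "159", "147", "258", "357", "369", "456", "789" };
        string strMoves = "1823742";

        foreach (string strResult in strSolutions)
        {
            if (HasTwoDistinct(strMoves, strResult))
            {
                Console.WriteLine(strResult);
            }
        }
    }
}
```

For the data in the question this should report 123, 147, 258, 357 and 789, since each of those contains at least two different digits from strMoves.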

Extract phone numbers and exclude extraneous characters

I'm trying to create a regex which will extract a complete phone number from a string (which is the only thing in the string) but leaving out any cruft like decorative brackets, etc.
The pattern I have mostly appears to work, but returns a list of matches - whereas I want it to return the phone number with the characters removed. Unfortunately, it completely fails if I add the start and end of line matchers...
^(?!\(\d+\)\s*){1}(?:[\+\d\s]*)$
Without the ^ and $ this matches the following numbers:
12345-678-901 returns three groups: 12345 678 901
+44-123-4567-8901 returns four groups: +44 123 4567 8901
(+48) 123 456 7890 returns four groups: +48 123 456 7890
How can I get the groups to be returned as a single, joined up whole?
Other than that, the only change I would like to include is to return nothing if there are any non-numeric, non-bracket, non-+ characters anywhere. So, this should fail:
(+48) 123 burger 7890
I'd keep it simple, makes it more readable and maintainable:
public string CleanPhoneNumber(string messynumber){
if(Regex.IsMatch(messynumber, "[a-z]"))
return "";
else
return Regex.Replace(messynumber, "[^0-9+]", "");
}
If any lowercase letters are present (extend this range if you wish), return blank; otherwise replace every char that is not 0-9 or + with nothing. This produces output like 0123456789 and +481234567 with all the brackets, spaces and hyphens removed too. If you want to keep those in the output, add them to the Regex.
Side note: It's not immediately clear to me what you think is "cruft" that should be stripped (non a-z?) and what you think is "cruft" that should cause blank (a-z?). I struggled with this because you said (paraphrase) "non digit, non bracket, non plus should cause blank", but earlier in your examples your processing permitted numbers that had hyphens and also spaces. Read strictly, the spec would make hyphens and spaces "cruft that causes the whole thing to return blank" too.
I've assumed that it's lowercase chars from the "burger" example but as noted you can extend the range in the IF part should you need to include other chars that return blank
If you have a lot of them to do maybe pre compile a regex as a class level variable and use it in the method:
private Regex _strip = new Regex( "[^0-9+]", RegexOptions.Compiled);
public string CleanPhoneNumber(string messynumber){
if(Regex.IsMatch(messynumber, "[a-z]"))
return "";
else
return _strip.Replace(messynumber, "");
}
...
for(int x = 0; x < millionStrArray.Length; x++)
millionStrArray[x] = CleanPhoneNumber(millionStrArray[x]);
I don't think you'll gain much from compiling the IsMatch one but you could try it in a similar pattern
Other options exist if you're avoiding regex; you could even do it using LINQ, or looping on char arrays, stringbuilders etc. Regex is probably the easiest in terms of short maintainable code
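For comparison, a LINQ version of the same idea might look like the sketch below. It keeps the same assumptions as the regex version (any lowercase a-z letter rejects the input, only ASCII digits and + are kept); the class name PhoneCleaner is made up for the example.

```csharp
using System;
using System.Linq;

public static class PhoneCleaner
{
    public static string CleanPhoneNumber(string messyNumber)
    {
        // Same rule as the regex version: any lowercase letter rejects the input.
        if (messyNumber.Any(c => c >= 'a' && c <= 'z'))
        {
            return "";
        }

        // Keep only ASCII digits and '+'; everything else is dropped.
        return new string(messyNumber
            .Where(c => (c >= '0' && c <= '9') || c == '+')
            .ToArray());
    }

    public static void Main()
    {
        Console.WriteLine(CleanPhoneNumber("(+48) 123 456 7890"));
        Console.WriteLine(CleanPhoneNumber("12345-678-901"));
        Console.WriteLine(CleanPhoneNumber("(+48) 123 burger 7890"));
    }
}
```

Whether this reads better than the two-line regex version is a matter of taste; it mainly avoids the regex dependency.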
The strategy here is to use a look ahead and kick out (fail) a match if word characters are found.
Then when there are no characters, it then captures the + and all numbers into a match group named "Phone". We then extract that from the match's "Phone" capture group and combine as such:
string pattern = @"
^
(?=[\W\d+\s]+\Z) # Only allows Non Words, decimals and spaces; stop match if letters found
(?<Phone>\+?) # If a plus found at the beginning; allow it
( # Group begin
(?:\W*) # Match but don't *capture* any non numbers
(?<Phone>[\d]+) # Put the numbers in.
)+ # 1 to many numbers.
";
var number = "+44-123-33-8901";
var phoneNumber =
string.Join(string.Empty,
Regex.Match(number,
pattern,
RegexOptions.IgnorePatternWhitespace // Allows us to comment the pattern
).Groups["Phone"]
.Captures
.OfType<Capture>()
.Select(cp => cp.Value));
// phoneNumber is `+44123338901`
If one looks at the match structure, the data it houses is this:
Match #0
[0]: +44-123-33-8901
["1"] → [1]: -8901
→1 Captures: 44, -123, -33, -8901
["Phone"] → [2]: 8901
→2 Captures: +, 44, 123, 33, 8901
As you can see match[0] contains the whole match, but we only need the captures under the "Phone" group. With those captures { +, 44, 123, 33, 8901 } we now can bring them all back together by the string.Join.

Regular Expression for exactly N elements, not less, not more

I am trying to figure out what the regex for finding matches with exactly N occurrences, not less, not more, of a group of characters. It looks like a pretty simple task, but I have not been able to find the proper regex for it.
More specifically, I want a regex that tells whether a given string contains exactly 3 digits - not less, not more.
I thought I would be able to achieve it simply by treating the 3 digits as a group and adding a quantifier of {1} after it, but it does not work.
Alternatively, I expected [0-9][0-9][0-9] to work as well, but again it does not. Both regexes return the very same results for an input set such as
1, 12, 123, 1234, 12345.
Below is a code sample that performs what I tried, as described above.
class Program
{
static void Main(string[] args)
{
List<Regex> regexes = new List<Regex> { new Regex("\\d{3}"), new Regex("[0-9][0-9][0-9]"), new Regex("(\\d{3}){1}") };
List<int> numbers = new List<int> { 1, 12, 123, 1234, 12345 };
foreach(Regex regex in regexes)
{
Console.WriteLine("Testing regex {0}", regex.ToString());
foreach (int number in numbers)
{
Console.WriteLine(string.Format("{0} {1}", number, regex.IsMatch(number.ToString()) ? "is a match" : "not a match"));
}
Console.WriteLine();
}
}
}
The output to the program above is:
Clearly, only 123 should be a match out of all the input values.
What would be the regular expression that treats "123" alone as a match ?
All of your regular expressions are for 3 digits anywhere on the input. You are looking for:
new Regex("^\\d{3}$")
The ^ matches the beginning of the input, and the $ matches the end of the input. So this regular expression states, "From the beginning, there must be three digits, then expect the end."
You should prefix with ^ to indicate the start of the string and $ to indicate its end. See http://regexr.com/3be8e for a working example.
You should be looking for n characters followed by a non-character. So, if you are looking for digits, you should be looking for n digits followed by a non-digit. Make sure that you precede the regex by a non-digit as well.
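Both suggestions can be sketched in C# as follows. The anchored pattern is the one from the answers above; the lookaround pattern is my own way of expressing "no digit before and no digit after", so treat it as one possible reading of the last answer. The class and method names are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ThreeDigits
{
    // The whole input must be exactly three digits.
    private static readonly Regex WholeInput = new Regex(@"^\d{3}$");

    // A run of exactly three digits anywhere in the input:
    // not preceded by a digit and not followed by a digit.
    private static readonly Regex Embedded = new Regex(@"(?<!\d)\d{3}(?!\d)");

    public static bool IsExactlyThreeDigits(string input) => WholeInput.IsMatch(input);

    public static bool ContainsRunOfExactlyThree(string input) => Embedded.IsMatch(input);

    public static void Main()
    {
        foreach (int number in new[] { 1, 12, 123, 1234, 12345 })
        {
            string text = number.ToString();
            Console.WriteLine($"{text}: {(IsExactlyThreeDigits(text) ? "is a match" : "not a match")}");
        }
    }
}
```

The anchored form fits the test program in the question; the lookaround form is only needed when the three digits may be surrounded by other text.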

Regex expression for matching only floating point numbers

Hi, I need a regular expression for extracting only floating point numbers, from right to left
Example string
Earning per Equity Share (in ) face value of 2 each26 1,675.10
1,252.56
My current Regex
(\+|-)?[0-9][0-9]*(\,[0-9]*)?(\.[0-9]*)? with RegexOptions.RightToLeft
but
Current Output is
1,252.56
1,675.10
26
2
However, I do not want to match 26 or 2
Please help me
Maybe something like this will help
Regex
/[-+]?[0-9,\.]*([,\.])[0-9]*/g
Example input
Earning -34 5 b4 pe8r blah4 t3st + - (in) 1,252.56 face
-12234,23423.342 of 1,675.10 1,252.56
Matches
1,252.56
-12234,23423.342
1,675.10
1,252.56
Explanation
[-+]? match a single character present in the list below
Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy]
-+ a single character in the list -+ literally
[0-9,\.]* match a single character present in the list below
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
0-9 a single character in the range between 0 and 9
, the literal character ,
\. matches the character . literally
1st Capturing group ([,\.])
[,\.] match a single character present in the list below
, the literal character ,
\. matches the character . literally
[0-9]* match a single character present in the list below
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
0-9 a single character in the range between 0 and 9
g modifier: global. All matches (don't return on first match)
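In C# terms, a stricter variant of this idea could be sketched as below. Note that the pattern here is not the one from the answer: it is my own tightened version that insists on a decimal point followed by digits, so plain integers such as 2 and 26 are skipped. The names FloatExtractor and Extract are hypothetical, and the sketch scans left to right; RegexOptions.RightToLeft could be passed as in the question if the reverse order is wanted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class FloatExtractor
{
    // Optional sign, digits, optional ",ddd" groups, then a mandatory
    // decimal point followed by at least one digit.
    private static readonly Regex DecimalNumber =
        new Regex(@"[-+]?\d+(?:,\d{3})*\.\d+");

    // Returns the matched texts in the order they appear in the input.
    public static List<string> Extract(string input) =>
        DecimalNumber.Matches(input)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToList();

    public static void Main()
    {
        string input = "Earning per Equity Share (in ) face value of 2 each26 1,675.10\n1,252.56";
        foreach (string value in Extract(input))
        {
            Console.WriteLine(value);
        }
    }
}
```

On the example string from the question this should yield 1,675.10 and 1,252.56 and leave out 2 and 26.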
Although this is a Regex question, it is also tagged as C#.
Below is an example of how you might get a little bit more control over your output.
It's also culture-specific and only picks up numbers with a decimal place, and has no false positives.
Method
private List<double> GetNumbers(string input)
{
// declare result
var resultList = new List<double>();
// if input is empty return empty results
if (string.IsNullOrEmpty(input))
{
return resultList;
}
// Split input in to words, exclude empty entries
var words = input.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
// set your desired culture
var culture = CultureInfo.CreateSpecificCulture("en-GB");
// Refine words into a list that represents potential numbers
// must have decimal place, must not start or end with decimal place
var refinedList = words.Where(x => x.Contains(".") && !x.StartsWith(".") && !x.EndsWith("."));
foreach (var word in refinedList)
{
double value;
// parse words using designated culture, and the Number option of double.TryParse
if (double.TryParse(word, NumberStyles.Number, culture, out value))
{
resultList.Add(value);
}
}
return resultList;
}
Usage
var testString = "Earning -34 5 b4 , . 234. 234, ,345 45.345 $234234 234.3453.345 $23423.2342 +234 -23423 pe8r blah4 t3st + - (in) 1,252.56 face -12234,23423.342 of 1,675.10 1,252.56";
var results = GetNumbers(testString);
foreach (var item in results)
{
Debug.WriteLine("{0}", item);
}
Output
45.345
1252.56
-1223423423.342
1675.1
1252.56
Additional Notes
You can learn more about double.TryParse and its options here.
You can learn more about the CultureInfo Class here.
