How to search MongoDB for various phone number formats? - c#

I'm attempting to search my MongoDB for string phone numbers that are all in different formats.
E.g. (323)704-3234, 3237043234, 323-704-3234, +1-323-704-3234, 323.704.3234, etc.
Does MongoDB provide an operator or regex that lets me match strings while ignoring the special characters?
For example, in C#:
collection.Find(Query.Matches("PhoneNumber",(some regex, replace, or where)3237043234))

Sorry, but you are doing it wrong.
You can accept phone numbers in any format you like, but before storing them you need to convert them to one canonical format. In your case the simplest choice is a plain number, which is smaller to store and to index.
Then, when a user searches for a number, they can also enter it in any format; convert it to the same numeric form before you run the query. (Do not forget to put an index on the phone number field.) To be more user friendly, you can show the output in the same format the user entered.
To convert the phone numbers you already have, iterate through the collection and update each document.
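The normalization step described above could be sketched as follows. This is only an illustration: the PhoneNormalizer class and its Normalize method are hypothetical names, and dropping a leading 1 from 11-digit input assumes North American numbers, which the question's examples suggest but do not state.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class PhoneNormalizer
{
    // Hypothetical helper: reduces a phone string in any of the question's
    // formats to a single numeric value suitable for storing and indexing.
    public static long Normalize(string raw)
    {
        // Remove everything that is not an ASCII digit: (, ), -, ., +, spaces.
        string digits = Regex.Replace(raw ?? string.Empty, "[^0-9]", "");

        // Assumption: numbers are North American, so an 11-digit value that
        // starts with 1 carries the country code, which is dropped here.
        if (digits.Length == 11 && digits[0] == '1')
        {
            digits = digits.Substring(1);
        }

        if (digits.Length == 0)
        {
            throw new FormatException("No digits found in phone number.");
        }

        return long.Parse(digits, CultureInfo.InvariantCulture);
    }
}
```

The same method would be applied both when saving a document and to the user's search input, so the query becomes a plain equality match on the indexed numeric field.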

Related

c# string format validate

Update: The acceptable format is ADD|<number>|<number>.
I need to check that the request the server receives is in this format, with the numbers enclosed in <>.
After that I have to read the numbers, add them, and write the result back. So if the request does not fit the format, for example ADD|<5>|<8>,
I have to reject it and return a specific error message (it is not a number, it is the wrong format, etc.). I have checked the ADD| part, I split the parts into an array, and I can check whether the numbers are really numbers. But I cannot check whether the numbers are enclosed in <>, because they can have any number of digits: ADD|<7>|<13> does not have the same number of characters as ADD|<2358>|<78961156>. How can I check that the numbers are between <>?
Please help me with the following: I need to write a client-server console application, and I would like to validate requests from the clients. The acceptable format is XXX|<number>|<number>.
I can split the message like here:
string[] messageProcess = message.Split('|');
and I can check if it is a number or not:
if (!(double.TryParse(messageProcess[1], out double number1)) || !(double.TryParse(messageProcess[2], out double number2)))
but how can I check the <number> part?
Thank you for your advice.
You can use Regex for that.
If I understood you correctly, the following inputs should pass validation:
xxx|1232|32133
xxx|5345|23423
XXX|1323|45645
and the following shouldn't:
YYY|1231|34423
XXX|ds12|sda43
If my assumptions are correct, this Regex should do the trick:
XXX\|\d+\|\d+
What does it do?
first it looks for three X's... (if it doesn't matter whether the X's are uppercase or lowercase, substitute XXX with (?:XXX|xxx) or use the case-insensitive regex flag)
separated by a pipe (|)...
then it looks for one or more digits...
separated by a pipe (|)...
finally ending with another set of one or more digits
You can see the demo here: Regex101 Demo
And since you are using C#, Regex.IsMatch() would probably fit you best. Note that IsMatch succeeds if the pattern matches anywhere in the input, so anchor the pattern with ^ and $ to validate the whole message. It is worth reading the documentation if you are unfamiliar with regular expressions and how to use them in C#.
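As a rough sketch of how this could look in C# for the ADD|<number>|<number> format from the question's update, the pattern below also requires the literal angle brackets and captures the two numbers. The RequestValidator class and TryAdd method are hypothetical names, and restricting the numbers to non-negative integers that fit in a long is an assumption; the question itself parses with double.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class RequestValidator
{
    // ^ and \z anchor the pattern so the whole message must match.
    // [0-9]+ is one or more ASCII digits; < > and | are matched literally.
    private static readonly Regex AddRequest =
        new Regex(@"^ADD\|<([0-9]+)>\|<([0-9]+)>\z");

    // Returns true and the sum when the message is well formed,
    // false when the format is wrong or a number is out of range.
    public static bool TryAdd(string message, out long sum)
    {
        sum = 0;
        Match match = AddRequest.Match(message ?? string.Empty);
        if (!match.Success)
        {
            return false; // wrong format, e.g. missing <> or non-digits
        }

        long a, b;
        if (!long.TryParse(match.Groups[1].Value, NumberStyles.None, CultureInfo.InvariantCulture, out a) ||
            !long.TryParse(match.Groups[2].Value, NumberStyles.None, CultureInfo.InvariantCulture, out b))
        {
            return false; // digits only, but too large for a long
        }

        try
        {
            sum = checked(a + b);
        }
        catch (OverflowException)
        {
            return false; // the sum itself does not fit in a long
        }
        return true;
    }
}
```

Because the digits are captured by the groups, the length of each number no longer matters: ADD|<7>|<13> and ADD|<2358>|<78961156> are handled by the same pattern. To report different error messages, the format check and the two number checks could return distinct error codes instead of a single false.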

Reverse RegExp from user entered string ( C#)

Is it possible to generate regular expressions from a user entered string? Are there any C# libraries to do this?
For example a user enters a string e.g. ABCxyz123 and the C# code automatically generates [A-Z]{3}[a-z]{3}\d{3}.
This is a simple string but we could have more complicated strings like
MON-0123/AB/5678-abc 2/7
Or
1234-678/abc::1234ABC?246
I already have a string tokeniser (from a previous stackoverflow question) so I could construct a regex from the list of tokens.
But I was wondering if there is a lib or C# code out there that’ll do it.
Edit: Important, I should have also said: it's not the actual characters in the string that are important, but the type of each character and how many there are.
e.g A user could enter a "pattern" string of ABCxyz123.
This would be interpreted as
3 upper case alphas followed by
3 lower case alphas followed by
3 digits
So other users must then enter strings that match the compiled pattern [A-Z]{3}[a-z]{3}\d{3}, e.g. QAZplm789.
It's the format of the user-entered strings that needs to be checked, not the actual content, if that makes sense.
Jerry has a related link
creating a regular expression for a list of strings
There are a few other links off this.
I'm not trying to do anything complicated e.g NLP etc.
I could use a C# expression builder and dynamic LINQ at a push, but that seems like overkill and a code maintenance nightmare.
I'll write my own "simple" regex builder from the tokenized string.
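A minimal sketch of such a builder is shown below. It works directly on characters rather than on the tokeniser mentioned above, which is not shown in the question. PatternRegexBuilder and its methods are hypothetical names, and the three character classes (A-Z, a-z, 0-9) plus escaped literals for everything else are an assumption about what the patterns need.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class PatternRegexBuilder
{
    // Maps one sample character to the regex fragment for its "type".
    private static string ClassOf(char c)
    {
        if (c >= 'A' && c <= 'Z') return "[A-Z]";
        if (c >= 'a' && c <= 'z') return "[a-z]";
        if (c >= '0' && c <= '9') return @"\d";
        // Anything else (separators, punctuation, spaces) is kept literally.
        return Regex.Escape(c.ToString());
    }

    // Collapses runs of the same character type into fragment{count}.
    public static string Build(string sample)
    {
        var pattern = new StringBuilder();
        int i = 0;
        while (i < sample.Length)
        {
            string fragment = ClassOf(sample[i]);
            int run = 1;
            while (i + run < sample.Length && ClassOf(sample[i + run]) == fragment)
            {
                run++;
            }

            pattern.Append(fragment);
            if (run > 1)
            {
                pattern.Append('{').Append(run).Append('}');
            }
            i += run;
        }
        return pattern.ToString();
    }
}
```

For the sample ABCxyz123 this produces [A-Z]{3}[a-z]{3}\d{3}. The result is not anchored, so when validating other users' input it would need to be wrapped in ^ and $.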
Example Use Case:
An office admin user where I work could set up the format for each field by typing a sample pattern string. My code converts this to a regex, and I store the regexes in a database.
E.g. field one requires 3 digits at the start. If there are 2 digits, send it to workflow 1; if there are 3, send it to workflow 2. I could simply check the number of characters with Substring or whatever, but that would be a hard-coded solution.
I am trying to do this generically for multiple documents with multiple fields. Also, each field could have multiple format checkers.
I don't want to write specific C# checks for every single field in numerous documents.
I'll get on with it, should keep me amused for a couple of days.

Convert Hebrew Letters into Equivalent Number

Other than hard-coding this by hand, I was wondering whether the .NET Framework has this built in. I know it can automatically convert Hebrew dates into Gregorian dates, but I need to convert Hebrew numerals into ordinary (decimal) numbers.
E.g.
א = 1
ב = 2
This goes into the hundreds. See here for more info.
Here is the approach that you should take:
Make a Dictionary<char,int> that maps each Hebrew letter to its numeric value
Parse the string one character at a time (best to do it right-to-left)
For each character, look up its value in the dictionary and add it to a running sum
Be sure to handle the common conventions for separating the hundreds-letters from the tens-letters (double quotation mark) and the thousands-letters from the hundreds (single quotation mark). For example, 5770 = ה'תש"ע. See the details in the link above for more on separators.
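The steps above could be sketched roughly as follows. This is an illustration, not the code from the repository mentioned below: HebrewNumerals and ToNumber are hypothetical names, final letter forms are assumed to carry the same value as their regular forms, and a single quotation mark is treated as a thousands marker only when more letters follow it.

```csharp
using System;
using System.Collections.Generic;

public static class HebrewNumerals
{
    // Values for U+05D0 (alef) through U+05EA (tav), in code point order.
    // Assumption: final forms have the same value as the regular letter.
    private static readonly int[] ValuesByCodePoint =
    {
        1, 2, 3, 4, 5, 6, 7, 8, 9, 10, // alef .. yod
        20, 20, 30, 40, 40, 50, 50,    // final kaf, kaf, lamed, final mem, mem, final nun, nun
        60, 70, 80, 80, 90, 90,        // samekh, ayin, final pe, pe, final tsadi, tsadi
        100, 200, 300, 400             // qof, resh, shin, tav
    };

    private static readonly Dictionary<char, int> LetterValues = BuildTable();

    private static Dictionary<char, int> BuildTable()
    {
        var table = new Dictionary<char, int>();
        for (int i = 0; i < ValuesByCodePoint.Length; i++)
        {
            table[(char)('\u05D0' + i)] = ValuesByCodePoint[i];
        }
        return table;
    }

    public static int ToNumber(string text)
    {
        int sum = 0;
        // Index 0 is the rightmost letter on screen, so this loop reads
        // the numeral in its natural right-to-left order.
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            int value;
            if (c == '\'' || c == '\u05F3') // apostrophe or geresh
            {
                // Letters before the mark are thousands, if more letters follow.
                if (i < text.Length - 1) sum *= 1000;
            }
            else if (c == '"' || c == '\u05F4') // quotation mark or gershayim
            {
                // Pure separator: contributes nothing to the sum.
            }
            else if (LetterValues.TryGetValue(c, out value))
            {
                sum += value;
            }
            else
            {
                throw new FormatException("Unexpected character in Hebrew numeral: " + c);
            }
        }
        return sum;
    }
}
```

With the example above, the letters before the single quotation mark contribute 5000 and the remaining letters add 400, 300 and 70, giving 5770.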
Edit: I just published a GitHub Repo that exposes functionality for converting Hebrew text to numbers, and numbers to their Hebrew letter equivalents.

Format string using multiple specifiers

Is there a way to use Int32.ToString("<some string format specifier>") with more than one specifier?
Specifically, I want to format an int in hexadecimal but force the string to be 8 characters long, padding the empty positions with 0's.
For example, I want to format the decimal number 1234 as the string "000004D2".
The way I wanted to do this was by combining the specifiers "X" and "00000000", but I can't seem to find any examples of combining specifiers. Do I need to create my own FormatProvider?
I need to do this because I am writing a viewer that displays an array of bytes and supports different groupings and formats. For example, it can display the array as 4-byte integers in hexadecimal, or as 2-byte signed integers, much like the Memory viewer in VS.
For that specific example, you can just use "X8" as your format specifier. I don't know about the more general case - but if you have any other specific requirements, it's probably worth asking about those separately.
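As a small illustration of the "X8" specifier, the digit count after X sets the minimum width and pads with leading zeros. The HexFormatting class and its method names are hypothetical, and the 2-byte variant is an assumption about what the viewer described in the question might need.

```csharp
using System;

public static class HexFormatting
{
    // "X8": uppercase hexadecimal, padded with leading zeros to 8 digits.
    public static string ToHex8(int value)
    {
        return value.ToString("X8");
    }

    // "X4": the same idea for a 2-byte value, padded to 4 digits.
    public static string ToHex4(short value)
    {
        return value.ToString("X4");
    }
}
```

HexFormatting.ToHex8(1234) returns "000004D2", the string asked for in the question.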

How to show long numbers in Excel?

I have to build a C# program that generates CSV files containing long numbers (held as strings in my program). The problem is that when I open the CSV file in Excel, the numbers appear like this:
1234E+ or 1234560000000 (the trailing digits become 0)
How can I retain the formatting of the numbers? If I open the file as a text file, the numbers are shown correctly.
Thanks in advance.
As others have mentioned, you can force the data to be a string. The best way to do that is ="1234567890123". The = makes the cell a formula, and the quotation marks make the enclosed value an Excel string literal. This will display all the digits, even beyond Excel's numeric precision limit, but the cell (generally) can't be used directly in numeric calculations.
If you need the data to remain numeric, the best way is probably to create a native Excel file (.xls or .xlsx). Various approaches for that can be found in the solutions to this related Stack Overflow question.
If you don't mind having thousands separators, there is one other trick you can use, which is to make your C# program insert the thousands separators and surround the value in quotes: "1,234,567,890,123". Do not include a leading = (as that will force it to be a string). Note that in this case, the quotation marks are for protecting the commas in the CSV, not for specifying an Excel string literal.
Format those long numbers as strings by putting a ' (apostrophe) in front or making a formula out of it: ="1234567890123"
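On the C# side, generating such cells might look like the sketch below. CsvExcelCells and its methods are hypothetical names. The formula cell is written with standard CSV quoting (the whole field in quotes, inner quotes doubled); how Excel interprets it when opening the file is an assumption worth checking against the Excel version you target.

```csharp
using System;
using System.Globalization;

public static class CsvExcelCells
{
    // Produces a CSV field whose content is ="<digits>", so Excel treats
    // the value as a string formula and shows every digit.
    // The field is quoted and its inner quotation marks are doubled.
    public static string AsTextFormula(string digits)
    {
        return "\"=\"\"" + digits + "\"\"\"";
    }

    // Produces a quoted CSV field with thousands separators, e.g.
    // "1,234,567,890,123". The quotes only protect the commas in the CSV.
    public static string AsGroupedNumber(long value)
    {
        return "\"" + value.ToString("N0", CultureInfo.InvariantCulture) + "\"";
    }
}
```

Each returned string is one complete CSV field, so a line would be built with string.Join(",", fields).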
You can't. Excel stores numbers with fifteen digits of precision. If you don't mind not having the ability to perform calculations on the numbers from within Excel, you can store them as Text, and all of the digits will display.
When I generate data to be imported into Excel, I do not generate a CSV file if I want control over how the data are displayed. Instead, I write out an Excel file in which the properties of the cells are set appropriately. I do not know whether there is a library that would do that for you in C# without requiring Excel to be installed on the machine generating the files, but it is something to look into.
My two cents:
I think it's important to realize there is a difference between "data" and "formatting". In this example you are trying to store both in a data-only file. This will, as you can tell from the other answers, change the nature of the data (in other words, cause it to be converted to a string). A CSV file is a data-only file. You can do some tricks here and there to merge formatting in with the data, but to my way of thinking this essentially corrupts the data by mixing it with non-data values, i.e. formatting.
If you really need to store formatting information, I suggest that, if you have time to develop it, you switch to a file type capable of storing formatting separately from the data. This problem sounds like a good candidate for an XML Spreadsheet solution. That way you can specify not only your data, but also its type and any formatting you choose to use.
