Regular expression for ensuring domain name is English characters only [closed] - c#

It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center.
Closed 10 years ago.
What's a good, clear regex for matching a domain name that must consist of:
Only English alpha characters, plus numbers
Including spaces or other separator characters that are valid, and reliably handled within a domain name
To clarify, this is for the purposes of validating a domain name. Whilst there are moves in the internet community to support internationalisation of domain names, I've done a fair bit of research into this and to keep my explanation fairly simple, only domain names that include characters that are part of a modern UK English character set (including numbers) are reliably handled by the Domain Name System (DNS). I'm not indicating a desire to prohibit internationalisation - I've done a lot of work during my career doing the opposite!
To answer this, what I was looking for is something like this (tested and works). Sorry the original question wasn't explicit enough about what I was trying to do; I've upvoted the suggestions that helped me provide this answer to the community:
^[\w- .]*$
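'\w' = shorthand for [a-zA-Z0-9_] (in .NET this holds only with RegexOptions.ECMAScript; by default \w also matches Unicode letters and digits)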
'- .' = allow '-', ' ', '.'
asterisk = any of the previous characters zero or more times
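A minimal C# sketch of how the pattern above might be applied. The class name DomainNameCheck and its method are hypothetical. Because .NET's \w is Unicode-aware by default, the character class is spelled out in ASCII here, and \z is used instead of $ so that a trailing newline is rejected; both are assumptions about the intent rather than part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DomainNameCheck
{
    // ASCII-only equivalent of ^[\w- .]*$ : letters, digits, '_', '.', ' ' and '-'.
    private static readonly Regex AllowedChars =
        new Regex(@"^[A-Za-z0-9_. -]*\z");

    // Returns true when every character of the candidate is in the allowed set.
    // Note: like the original pattern, this also accepts the empty string.
    public static bool HasOnlyAllowedChars(string candidate)
    {
        return candidate != null && AllowedChars.IsMatch(candidate);
    }
}
```

This only checks the character set; it does not check label lengths or whether the name actually resolves.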

You can use this one:
(?i)[a-z0-9\p{Z}]
where \p{Z} is the Unicode "all separators" category and (?i) turns on the ignore-case option.

You may use [a-zA-Z\d\s\p{P}]+ as the simplest solution, or go with a non-Unicode solution:
POSIX defines character classes [:...:], but not every regex engine supports them.
Equivalent explicit sets can be used instead:
[:alnum:] [A-Za-z0-9] Alphanumeric characters
[:space:] [ \t\r\n\v\f] Whitespace characters
[:punct:] [\]\[!"#$%&'()*+,./:;<=>?@\^_`{|}~-] Punctuation characters
So putting them together you will get
^[A-Za-z0-9 \t\r\n\v\f\]\[!"#$%&'()*+,./:;<=>?@\^_`{|}~-]+$
This way you can see exactly what you are going to match and what not. Please note that some characters are escaped with \ because without escaping they would have a different meaning.

Related

is there a conditional split in C#? [closed]

Closed 9 years ago.
I came across a problem:
I want to split a string after every ". "
For example, if I have the sentences:
"Danny went to school. it was wonderful. "
I want my output to be
Danny went to school.
it was wonderful.
which I can easily solve like this:
string[] list = currentResult.Split(new string[] { ". " }, StringSplitOptions.None);
BUT!
what if I have, for example:
1. Danny went to School. and : 2. James went to school as well.
my output will be:
1.
Danny went to School. and :
2.
James went to school as well
.
I don't want it to split when there is a number before the dot, for example.
Can I solve this somehow?
Thanks!
The problem here is how to deal with oddly formatted data. If you have control over your data, you might consider using 1) and 2) instead of 1. and 2. If you don't, you might have to resort to regex to discern whether a . is part of a line or the end of one, as this is beyond the capabilities of String.Split.
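One way the regex route might look, as a hedged sketch: split on whitespace that follows a '.', but only when the character before that '.' is not a digit. The class name SentenceSplitter is hypothetical, and the pattern assumes that a digit directly before a dot always marks a list number rather than the end of a sentence.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SentenceSplitter
{
    // Split on whitespace that follows a '.', but only when the character
    // before that '.' is not a digit (so "1." and "2." stay attached).
    private static readonly Regex Boundary = new Regex(@"(?<=\D\.)\s+");

    public static string[] Split(string text)
    {
        // Trim first so trailing whitespace does not produce an empty last item.
        return Boundary.Split(text.Trim());
    }
}
```

With this assumption, a sentence that genuinely ends in a number ("It happened in 1999. Then...") would not be split, so it is a heuristic rather than a general solution.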
You could always go character by character, and do something like:
NOTE: Untested, but looks right :)
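// Assumes: using System.Collections.Generic; and a string variable named str.
List<string> strings = new List<string>();
int curStart = 0;
for (int index = 1; index < str.Length; index++)
{
    // Split at a '.' only when the character before it is not a digit.
    if (str[index] == '.' && !char.IsDigit(str[index - 1]))
    {
        // Keep the '.' with its sentence and trim surrounding spaces.
        strings.Add(str.Substring(curStart, index - curStart + 1).Trim());
        curStart = index + 1;
    }
}
// Add whatever follows the last split point, if anything.
string rest = str.Substring(curStart).Trim();
if (rest.Length > 0) strings.Add(rest);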
I thought I'd take a stab at producing an answer that matches what you ask, although the comments make a lot of sense in the larger scope of what you want.
Find out how to use regex in C# code from: http://www.dotnetperls.com/regex-matches
I used http://regexpal.com/ to confirm my regex. Play around with that or a similar page to get a handle on regex. It's worth knowing how to regex.
Look at http://www.mikesdotnetting.com/Article/46/CSharp-Regular-Expressions-Cheat-Sheet or someplace else for a list of the commands and definitions for regex.
The regex ".*?\D[.:](\s|$)" will turn the string:
1. Danny went to School. and : 2. James went to school as well. Danny went to school. it was wonderful.
into the following matches (separated here by new lines):
1. Danny went to School.
and :
2. James went to school as well.
Danny went to school.
it was wonderful.
Note that I took the liberty to separate matches based on ':' as well since your example does so.
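A hedged C# sketch of collecting those matches with Regex.Matches. It uses a variant of the pattern from this answer, written with [.:] as the terminator set and (\s|$) so that a final sentence with no trailing space is also matched. The class name SentenceMatcher is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SentenceMatcher
{
    // Lazily consume text up to a '.' or ':' that is preceded by a non-digit
    // and followed by whitespace or the end of the input.
    private static readonly Regex Sentence = new Regex(@".*?\D[.:](\s|$)");

    public static List<string> Extract(string text)
    {
        var result = new List<string>();
        foreach (Match m in Sentence.Matches(text))
        {
            // Each match includes its trailing whitespace, so trim it off.
            result.Add(m.Value.Trim());
        }
        return result;
    }
}
```

Any text after the last terminator is not returned by this approach, so input that does not end in '.' or ':' loses its tail.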

What algorithm can break text up into its component words? [closed]

Closed 10 years ago.
I was pleasantly surprised to find how easy it is to use iTextSharp to extract the text from a pdf file. By following this article, I was able to get a pdf file converted to text with this simple code:
string pdfFilename = dlg.FileName;
// Show just the file name, without the path
string pdfFileNameOnly = System.IO.Path.GetFileName(pdfFilename);
lblFunnyMammalsFile.Content = pdfFileNameOnly;
string textFilename = String.Format(@"C:\Scrooge\McDuckbilledPlatypus\{0}.txt", pdfFileNameOnly);
PDFParser pdfParser = new PDFParser();
if (!pdfParser.ExtractText(pdfFilename, textFilename))
{
MessageBox.Show("there was a boo-boo");
}
The problem is that the text file generated contains text like this (i.e. it has no spaces):
IwaspleasantlysurprisedtofindhoweasyitistouseiTextSharptoextractthetextfromatextfile.
Is there an algorithm "out there" that will take text like that and make a best guess as to where the word breaks (AKA "spaces") should go?
Though I agree with Gavin that there's an easy way to solve this problem in this case, the problem itself is an interesting one.
This would require a heuristic algorithm to solve. I'll explain why I think so in a bit, but first I'll explain my algorithm.
Store all the dictionary words in a Trie. Now take a sentence, and look up in the trie to get to a word. The trie tracks the end of the word. Once you find a word, add a space to it in your sentence. This will work for your sentence. But consider these two examples:
He gave me this book
He told me a parable
For the first example, the above algorithm works fine but for the second example, the algorithm outputs:
He told me a par able
In order to avoid this, we will need to consider a longest match but if we do that then the output for the first example becomes:
He gave met his book.
So we are stuck, and hence need to add heuristics to the algorithm that can judge that, grammatically, "He gave met his book" doesn't make sense.
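The two strategies can be compared with a small sketch. To keep it short it uses a HashSet lookup with backtracking instead of a trie, and a tiny hard-coded dictionary; both are simplifications of mine, and the names WordSegmenter and Segment are hypothetical. A real implementation would load a full word list and add the grammatical or frequency heuristics described above.

```csharp
using System;
using System.Collections.Generic;

public static class WordSegmenter
{
    // Tiny illustrative dictionary; a real one would hold a full word list.
    private static readonly HashSet<string> Words = new HashSet<string>
    {
        "he", "gave", "me", "met", "this", "his", "book",
        "told", "a", "par", "able", "parable"
    };

    private const int MaxWordLength = 7; // length of "parable"

    // Returns the first full segmentation found, or null if there is none.
    public static string Segment(string text, bool longestFirst)
    {
        var parts = new List<string>();
        return TrySegment(text, 0, longestFirst, parts)
            ? string.Join(" ", parts.ToArray())
            : null;
    }

    private static bool TrySegment(string text, int start, bool longestFirst, List<string> parts)
    {
        if (start == text.Length) return true; // consumed everything
        int max = Math.Min(MaxWordLength, text.Length - start);
        for (int i = 1; i <= max; i++)
        {
            // Try candidate lengths in ascending or descending order.
            int len = longestFirst ? max - i + 1 : i;
            string candidate = text.Substring(start, len);
            if (!Words.Contains(candidate)) continue;
            parts.Add(candidate);
            if (TrySegment(text, start + len, longestFirst, parts)) return true;
            parts.RemoveAt(parts.Count - 1); // backtrack and try another length
        }
        return false;
    }
}
```

Calling Segment("hetoldmeaparable", false) gives "he told me a par able", while Segment("hegavemethisbook", true) gives "he gave met his book", which is the conflict between shortest and longest match that the answer describes.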

Password Policy Setting for a Website in c# 2008 [closed]

Closed 10 years ago.
I have a web application in c# 2008.
I'm assigned a task to set the password policy for this web site.
The policy is
The 1st character is Upper Case
The 2nd character is lower case
The 3rd character is a "special character"
The 4th through 8th character are random digits
The password is exactly 8 characters
The password should expire after 6 months
I'm not able to figure out this. Thanks in advance.
If you want to do it "right" and correct, go for Regular Expressions. If you don't have any experience with them and it's urgent, forget it.
Instead go with the quick and dirty way. This is an untested sketch (it needs using System.Linq, and it treats any non-alphanumeric character as "special"):
static bool IsValidPassword(string password)
{
    // The policy requires exactly 8 characters.
    if (password == null || password.Length != 8)
        return false;
    return char.IsUpper(password[0])                      // 1st: upper case
        && char.IsLower(password[1])                      // 2nd: lower case
        && !char.IsLetterOrDigit(password[2])             // 3rd: special character
        && password.Skip(3).All(c => char.IsDigit(c));    // 4th-8th: digits
}
Not sure what you want to do for making the password expire in 6 months. If you are treating your password as a custom class with an "expiration date" field, and you just want 6 months from now, just use MyPassword.ExpirationDate = DateTime.Now.AddMonths(6);
It's not good practice to ask here without having tried anything. It sounds like you are trying to get your job done by others. I can suggest the approach you should take instead of providing code.
You can do it with regular expressions; search for them, there are many resources. You should construct a regular expression that checks all the constraints you want except the password expiration. Check password expiration against your database instead: you can define a job that runs every midnight, scans the password table, and detects the passwords that have expired.
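For illustration only, a sketch of what such a regular expression and expiry check might look like. The class and method names are hypothetical, "special character" is assumed to mean anything that is not an ASCII letter, digit, or whitespace, and the six-month rule is assumed to count from the date the password was last changed.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PasswordPolicy
{
    // Upper case, lower case, one special character, then exactly five digits.
    private static readonly Regex Pattern =
        new Regex(@"^[A-Z][a-z][^A-Za-z0-9\s][0-9]{5}\z");

    public static bool IsValid(string password)
    {
        return password != null && Pattern.IsMatch(password);
    }

    // True once six months have passed since the password was last changed.
    public static bool IsExpired(DateTime lastChanged, DateTime now)
    {
        return now >= lastChanged.AddMonths(6);
    }
}
```

The pattern fixes the total length at 8 implicitly (1 + 1 + 1 + 5), so no separate length check is needed.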

Phone Number Validation in Multiple Countries [closed]

Closed 11 years ago.
I am having trouble finding regex validation patterns for phone numbers in different countries. I have little time to try to write my own and was hoping a regex guru would be able to help.
I've checked the usual sources like regexlib already, so if anyone can help with any of them I'd be grateful.
I need a separate phone number validation expression for each of the following:
Germany
US
Australia
New Zealand
Canada
Asia
France
The format is here.
Writing the regex is not trivial, but if you specify the rules it would not be difficult.
Instead of making an elaborate regular expression to match how you think your visitor will input their phone number, try a different approach. Take the phone number input and strip out the symbols. Then the regular expression can be simple and just check for 10 numeric digits (US number, for example). Then if you need to save the phone number in a consistent format, build that format off of the 10 digits.
This example validates U.S. phone numbers by looking for 10 numeric digits.
protected bool IsValidPhone(string strPhoneInput)
{
// Remove symbols (dash, space and parentheses, etc.)
string strPhone = Regex.Replace(strPhoneInput, @"[- ()\*\!]", String.Empty);
// Check for exactly 10 numbers left over
Regex regTenDigits = new Regex(@"^\d{10}$");
Match matTenDigits = regTenDigits.Match(strPhone);
return matTenDigits.Success;
}
A phone number is just a number; what do you want to validate there?
Here you can see what different numbers look like.
Also, Asia is not a country; it is a continent containing many countries.
It's close to impossible to get a single regular expression that will cover all countries.
I'd go with [0-9+][0-9() ]* -- this simply allows any digit to start (or the "+" character), then any combination of digits or parentheses or spaces.
In general validation any further is not really going to be of much use. If the user of the page wants to be contacted by phone, they'll enter a valid phone number -- if not, then they won't.
A better way to enforce a correct phone number and eliminate most simple miskeying is to require the number to be entered twice -- then the user is likely to at least check it!
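A small sketch of the permissive check suggested in this answer, with anchors added so that the whole input has to match. The class name PhoneCheck and the decision to trim the input first are my assumptions.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PhoneCheck
{
    // A digit or '+' first, then any mix of digits, parentheses and spaces.
    private static readonly Regex Plausible = new Regex(@"^[0-9+][0-9() ]*\z");

    public static bool LooksLikePhoneNumber(string input)
    {
        return input != null && Plausible.IsMatch(input.Trim());
    }
}
```

Note that a number starting with a parenthesis, such as "(02) 9999 9999", is rejected by this pattern as written, because the first character must be a digit or '+'.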

How do I capture named groups in C# .NET regex? [closed]

Closed 12 years ago.
I'm trying to use named groups to parse a string.
An example input is:
exitcode: 0; session id is RDP-Tcp#2
and my attempted regex is:
("(exitCode)+(\s)*:(\s)*(?<exitCode>[^;]+)(\s)*;(\s)*(session id is)(\s)*(?<sessionID>[^;]*)(\s)*");
Where is my syntax wrong?
Thanks
In your example:
exitcode: 0; session id is RDP-Tcp#2
It does not end with a semi-colon, but it seems your regular expression expects a semi-colon to mark the end of sessionID:
(?<sessionID>[^;]*)
I notice that immediately following both your named groups, you have optional whitespace matches -- perhaps it would help to add whitespace into the character classes, like this:
(?<exitCode>[^;\s]+)
(?<sessionID>[^;\s]*)
Even better, split the string on the semi-colon first, and then perhaps you don't even need a regular expression. You'd have these two substrings after you split on the semi-colon, and the exitcode and sessionID happen to be on the ends of the strings, making it easy to parse them any number of ways:
exitcode: 0
session id is RDP-Tcp#2
Richard's answer really covers it already - either remove or make optional the semicolon at the end and it should work, and definitely consider putting whitespace in the negated classes or just splitting on semi-colon, but a little extra food for thought. :)
Don't bother with \s where it's not necessary - looks like your output is some form of log or something, so it should be more predictable, and if so something simpler can do:
exitcode: (?<exitCode>\d+);\s+session id is\s+(?<sessionID>[^;\s]*);?
For the splitting on semi-colon, you'll get an array of two strings. Here's a sketch in C# (needs using System.Text.RegularExpressions), assuming exitcode is numeric and sessionid doesn't have spaces in it:
string[] splitResult = Regex.Split(input, @"\s*;\s*");
string exitCode = Regex.Match(splitResult[0], @"\d+").Value;
string sessionId = Regex.Match(splitResult[1], @"\S*$").Value;
Depending on who will be maintaining the code, this might be considered more readable than the above expression.
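For completeness, a hedged sketch of reading the named groups in C# with Match.Groups, based on the simplified expression above. The class name SessionLineParser is hypothetical. RegexOptions.IgnoreCase is an addition of mine: the attempted pattern in the question spells the literal as "exitCode" while the sample input has "exitcode", and a case-sensitive match fails on that difference alone.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SessionLineParser
{
    private static readonly Regex LinePattern = new Regex(
        @"exitcode:\s*(?<exitCode>[0-9]+);\s*session id is\s+(?<sessionID>[^;\s]*);?",
        RegexOptions.IgnoreCase);

    // Returns false when the line does not have the expected shape.
    public static bool TryParse(string line, out int exitCode, out string sessionId)
    {
        exitCode = 0;
        sessionId = string.Empty;
        Match match = LinePattern.Match(line);
        if (!match.Success) return false;

        // Named groups are read by name through the Groups indexer.
        if (!int.TryParse(match.Groups["exitCode"].Value, out exitCode)) return false;
        sessionId = match.Groups["sessionID"].Value;
        return true;
    }
}
```

For the sample input "exitcode: 0; session id is RDP-Tcp#2" this yields an exit code of 0 and a session id of "RDP-Tcp#2".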
