Match domain names as regular expressions

Match domain names as regular expressions - c#

can some tell how can i write regular expression matching abc.com.ae or abc.net.af or anything ,ae which is the last in the string is optional
this is successfull with / but not . don't know why
^[a-z]{1,25}.[a-z]{3}$
answer
^[a-z]{1,30}\.[a-z]{3}((\.)[a-z]{2})?$

The answer is: write \. instead of ., because . means “any character”.

You can match specific characters using the \u escape code followed immediately by the unicode number for that character in hex format and it must also be four digits only. I think a full stop is \u002E.
In your example where '/' works but not '.', this is because a '/' is recognised as a normal character and does not need to be escaped whereas the full stop has a different meaning in c# regex (it matches any character).
If youre still not sure, here is a useful guide to regex in C#: http://www.radsoftware.com.au/articles/regexlearnsyntax.aspx
And this one has examples of regex for URLs: http://www.radsoftware.com.au/articles/regexsyntaxadvanced.aspx

I sounds like you need some environment to learn regular expressions.
The best way to learn regular expressions is with an interactive regular expression simulator: type in an expression, and it will give you the results, instantly.
There are a few free (and not so free) regular expression simulators available:
Free web version from http://lumadis.be/regex/test_regex.php.
RegExBuddy from http://www.regexbuddy.com/. I am not affiliated with this program, but I have used it with good success in the past.

Related

C# Regex to match attributes [duplicate]

To review regular expresions I read this tutorial. Anyways that tutorial mentions that \b matches a word boundary (between \w and \W characters). That tutorial also gives a link where you can install expresso (program that helps when creating regular expressions).
So I have created my regular expressions in expresso and I do inded get a match. Now when I copy the same regex to visual studio I do not get a match. Take a look:
Why am I not getting a match? in the immediate window I am showing the content of variable output. In expresso I do get a match and in visual studio I don't. why?

The C# language and .NET Regular Expressions both have their own distinct set of backslash-escape sequences, but the C# compiler is intercepting the "\b" in your string and converting it into an ASCII backspace character so the RegEx class never sees it. You need to make your string verbatim (prefix with an at-symbol) or double-escape the 'b' so the backslash is passed to RegEx like so:
#"\bCOMPILATION UNIT";
Or
"\\bCOMPILATION UNIT"
I'll say the .NET RegEx documentation does not make this clear. It took me a while to figure this out at first too.
Fun-fact: The \r and \n characters (carriage-return and line-break respectively) and some others are recognized by both RegEx and the C# language, so the end-result is the same, even if the compiled string is different.

You should use #"\bCOMPILATION UNIT". This is a verbatim literal. When you do "\b" instead, it parses \b into a special character. You can also do "\\b", whose double backslash is parsed into a real backslash, but it's generally easier to just use verbatims when dealing with regex.

How can I check the following conditions in c# using regular expression

I've a condition like this
if I enter the text format as
9 - This should only allow numbers
s- Should only allow special chars
a - should only allow alphabets
x - should allow alpha numerics
There may be combinations like, if I specify '9s' this should allow numbers and special chars,
'sa' - should allow alphabes and numerics etc..
How can I check these conditions using regular expressions using c#.
Thanks

You can translate these conditions into regex like this:
Start the regex with ^[.
Then add one or more of the
following:
Numbers: \p{N}
Special characters (i. e.
non-alphanumerics): \W
Letters: \p{L}
Alphanumerics: \w
End the regex with ]+$
Enclose the regex in a verbatim string.
So, for "only letters", it's #"^[\p{L}]+$"; for "numbers and special characters", it's #"^[\p{N}\W]+$" etc.

You cannot 'generate' regular expressions using C# unless you code for it. But you surely can go to a site like 'www.regexlib.com' to find and build the regular expressions you want.
Then you can execute your regular expressions using C# to validate user inputs. this link would give you the knowledge how to use C# for it.
Hope this helps, regards.

A beginner's guide to interpreting Regex?

Greetings.
I've been tasked with debugging part of an application that involves a Regex -- but, I have never dealt with Regex before. Two questions:
1) I know that the regexes are supposed to be testing whether or not two strings are equivalent, but what specifically do the two regex statements, below, mean in plain English?
2) Does anyone have a recommendation on websites / sources where I can learn more about Regexes? (preferably in C#)
if (Regex.IsMatch(testString, #"^(\s*?)(" + tag + #")(\s*?),", RegexOptions.IgnoreCase))
{
result = true;
}
else if (Regex.IsMatch(testString, #",(\s*?)(" + tag + #")(\s*?),", RegexOptions.IgnoreCase))
{
result = true;
}

It's going to be difficult to tell what that regex means, without knowing what's in tag. In fact, it looks like that regex is broken (or, at least, doesn't properly escape inputs).
Roughly speaking, for the first regex:
The ^ says to match at the beginning of the string.
The (...) sets up a capturing group (which is available, although this example apparently doesn't use it).
The \s matches any white space characters (spaces, tabs, etc.)
The *? matches zero or more of the previous character (in this case, whitespace), and because it has a question-mark, it matches the minimum number of characters needed to make the rest of the expression work.
The (" + tag + #") inserts the contents of the tag into the regex. As I mention, that's dangerous, without escaping.
The (\s*?) matches the same as the before (the minimum number of whitespace characters)
The , matches a trailing comma.
The second regex is very similar, but looks for a starting comma (rather than the beginning of the string).
I like the Python documentation for Regular Expressions, but it looks like this site
has a pretty good, basic introduction, with C# examples.

One word - Cribsheet (or is that two?) :)

I'm not c# savvy but I can recommend an awesome guide to regular expressions that I use for Bash and Java programming. It applies to pretty much all languages:
http://www.amazon.com/Mastering-Regular-Expressions-Jeffrey-Friedl/dp/0596528124/ref=tmm_pap_title_0
It is totally worth $30 to own this book. It is VERY thorough and helped my fundamental understanding of Regex a lot.
-Ryan

Since you specifically tagged C#, I recommend the Regex Hero as a tool you can use to play around with them since it's running on .NET. It also lets you toggle the different RegexOptions flags as you would pass them into the constructor when creating a new Regex.
Also, if you're using a version of Visual Studio 2010 that supports extensions, I would take a look at the Regex Editor extension... it will popup whenever you type new Regex( and offer you some guidance and autocomplete for your regex pattern.

Using The Regex Coach
The regular expression is a sequence consisting of the expression '(\s*?)', the expression '(tag)', the expression '(\s*?)', and the character ','.
where (\s*?) is defined as The regular expression is a repetition which matches a whitespace character as often as necessary.
the second one matches a , at the start too
As for good learning websites, I like www.regular-expressions.info/
Super simple version:
At the start of a string 0 or more spaces, whatever Tag is, 0 or More spaces, a comma.
the second one is
a comma, 0 or more spaces, whatever Tag is, 0 or More spaces, a comma.

Once you have the very basic idea about regex (it's full of resources over there) I recommend you to use Expresso for creating your regular expressions.
Expresso editor is equally suitable as a teaching tool for the beginning user of regular expressions or as a full-featured development environment for the experienced programmer or web designer with an extensive knowledge of regular expressions.

Your premise is not correct. Regular expressions are not used to tell if two strings are equivalent, but rather if the input string matches a certain pattern.
The first test above looks for any text that does not contain "zero or more whitespace charaters" searching "non-greedy". Then matches the text of the variable "tag" in the middle, then "zero or more whitespace characters, non greedy" again.
The second one is very similar, except that it allows for beginning whitespace as long as it follows a comma.
It is hard to explain "non-greedy" in this context, especially involving whitespace characters, so look here for more information.

A regular expression is a way to describe a set of strings that have some particular characteristics.
They don't merely need just to compare two strings.. what you usually do it to test if a string matches a particular regular expression. They can also be used to do simple parsing of a string in tokens that respect some patterns..
The good thing about regexps is that they allow you to express certain constraints inside a string keeping it general and able to match a group of strings that respect those constraints.. then they follow a formal specification that doesn't leave ambiguities around..
Here you can find a comparison table of various regular expression languages in many different programming languages and a specific guide for C# if you follow its link.
Usually the implementations for the various languages are quite similar since the syntax is somewhat standardized from the theoretical topics regexps come from, so any tutorial about regexp will be fine, then you'll just need to get into C# API.

1) The first regex is trying to do a case-insensitive match starting at the beginning of the test string. It then matches optional whitespace, followed by whatever is in tag, followed by optional whitespace then finally a comma.
The second matches a string containing a comma, followed by optional whitespace, followed by whatever is in tag, followed by optional whitespace then finally a comma.
Thought it's for C# I recommend picking up the Perl Pocket Reference which has a great Regex syntax reference. It helped my out a lot when I was learning regexes 14 years ago.

http://www.myregextester.com/ is a decent regular expression tester that also has an explain option for C# regexps - For Instance check out this example:
The regular expression:
(?-imsx:^(\s*?)(tagtext)(\s*?),)
matches as follows:
NODE EXPLANATION
----------------------------------------------------------------------
(?-imsx: group, but do not capture (case-sensitive)
(with ^ and $ matching normally) (with . not
matching \n) (matching whitespace and #
normally):
----------------------------------------------------------------------
^ the beginning of the string
----------------------------------------------------------------------
( group and capture to \1:
----------------------------------------------------------------------
\s*? whitespace (\n, \r, \t, \f, and " ") (0
or more times (matching the least amount
possible))
----------------------------------------------------------------------
) end of \1
----------------------------------------------------------------------
( group and capture to \2:
----------------------------------------------------------------------
tagtext 'tagtext'
----------------------------------------------------------------------
) end of \2
----------------------------------------------------------------------
( group and capture to \3:
----------------------------------------------------------------------
\s*? whitespace (\n, \r, \t, \f, and " ") (0
or more times (matching the least amount
possible))
----------------------------------------------------------------------
) end of \3
----------------------------------------------------------------------
, ','
----------------------------------------------------------------------
) end of grouping
----------------------------------------------------------------------

A regular expression does not tell you if two strings match, but rather if a given string matches a pattern.
This site is my favorite for learning and testing regular expressions:
http://gskinner.com/RegExr/
It allows you to interactively test regular expressions as you write them, and provides a built-in tutorial.

Although it doesn't use C#, Rejex is a simple tool for testing and learning about regular expressions which includes a quick reference for the special characters

It looks like that they are trying to match some kind of list of words delimited by colons (UPDATE: commas).
The first one is probably matching first item and the second one some item after the first one excluding the last one. I hope you will understand :).
A good source of information about regular expressions is at http://www.regular-expressions.info/

also a great site to test your regular expressions with extra info: http://regex101.com/

Regular Expression to reject special characters other than commas

I am working in asp.net. I am using Regular Expression Validator
Could you please help me in creating a regular expression for not allowing special characters other than comma. Comma has to be allowed.
I checked in regexlib, however I could not find a match. I treid with ^(a-z|A-Z|0-9)*[^#$%^&*()']*$ . When I add other characters as invalid, it does not work.
Also could you please suggest me a place where I can find a good resource of regular expressions? regexlib seems to be big; but any other place which lists very limited but most used examples?
Also, can I create expressions using C# code? Any articles for that?

[\w\s,]+
works fine, as you can see bellow.
RegExr is a great place to test your regular expressions with real time results, it also comes with a very complete list of common expressions.
[] character class \w Matches any word character (alphanumeric & underscore). \s
Matches any whitespace character (spaces, tabs, line breaks). , include comma + is greedy match; which will match the previous 1 or more times.

[\d\w\s,]*
Just a guess

To answer on any articles, I got started here, find it to be an excellent resource:
http://www.regular-expressions.info/
For your current problem, try something like this:
[\w\s,]*
Here's a breakdown:
Match a single character present in the list below «[\w\s,]*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
A word character (letters, digits, etc.) «\w»
A whitespace character (spaces, tabs, line breaks, etc.) «\s»
The character “,” «,»

For a single character that is not a comma, [^,] should work perfectly fine.

You can try [\w\s,] regular expression. This regex will match only alpha-numeric characters and comma. If any other character appears within text, then this wont match.
For your second question regarding regular expression resource, you can goto
http://www.regular-expressions.info/
This website has lot of tutorials on regex, plus it has lot of usefult information.
Also, can I create expressions using
C# code? Any articles for that?
By this, do you mean to say you want to know which class and methods for regular expression execution? Or you want tool that will create regular expression for you?

You can create expressions with C#, something like this usually does the trick:
Regex regex = new Regex(#"^[a-z | 0-9 | /,]*$", RegexOptions.IgnoreCase);
System.Console.Write("Enter Text");
String s = System.Console.ReadLine();
Match match = regex.Match(s);
if (match.Success == true)
{
System.Console.WriteLine("True");
}
else
{
System.Console.WriteLine("False");
}
System.Console.ReadLine();
You need to import the System.Text.RegularExpressions;
The regular expression above, accepts only numbers, letters (both upper and lower case) and the comma.
For a small introduction to Regular Expressions, I think that the book for MCTS 70-536 can be of a big help, I am pretty sure that you can either download it from somewhere or obtain a copy.
I am assuming that you never messed around with regular expressions in C#, hence I provided the code above.
Hope this helps.

Thank you, all..
[\w\s,]* works
Let me go through regular-expressions.info and come back if I need further support.
Let me try the C# code approach and come back if I need further support.
[This forum is awesome. Quality replies so qucik..]
Thanks again

(…) is denoting a grouping and not a character set that’s denoted with […]. So try this:
^[a-zA-Z0-9,]*$
This will only allow alphanumeric characters and the comma.

Convert an Extended Regular expression to .NET compatible RegEx

I've got a RegEx that works great on *NIX systems and languages that support Extended Regular Expressions (ERE). I haven't found a freely available library for .NET that supports ERE's, nor have I had any lucky trying to translate this into a non-ERE and get the same result. Here is the RegEx:
^\+(<{7} \.|={7}$|>{7} \.)
Background: the point of the RegEx is to identify if a given string appears to have the markers from an unresolved subversion merge.

It looks to me like ERE syntax is mostly upward-compatible with .NET's regex flavor, as it is with most other "Perl-compatible" flavors (Perl, PHP, Python, JavaScript, Ruby, Java...). In other words, anything you can do in an ERE regex, you should be able to do in an identical .NET regex. Certainly your example:
^\+(<{7} \.|={7}$|>{7} \.)
means the same thing in .NET as it does in ERE. The only major exception I can see is in the area of POSIX bracket expressions; .NET follows the Unicode standard instead.
It's when you go to apply the regex that things really get different. In C# you might apply that regex like this:
string result = Regex.Match(targetString, #"^\+(<{7} \.|={7}$|>{7} \.)").Value;
C#'s verbatim strings save you having to escape backslashes like in some other languages' string literals; you only have to escape quotation marks, which you do by doubling them:
#"He said, ""Look out!""";
Does that answer your question?

Are you sure you don't have a typo in that? RegexBuddy (when set to either POSIX ERE or GNU ERE) says that the "+" quantifier must be preceded by a token that can be repeated. Other than that, this appears to be a valid .NET Regex. You might want to check out one of the great O'Reilly books on regular expressions as well. If this doesn't help, please post some examples of text you're trying to match/not match.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Match domain names as regular expressions - c#

can some tell how can i write regular expression matching abc.com.ae or abc.net.af or anything ,ae which is the last in the string is optional this is successfull with / but not . don't know why ^[a-z]{1,25}.[a-z]{3}$ answer ^[a-z]{1,30}\.[a-z]{3}((\.)[a-z]{2})?$

The answer is: write \. instead of ., because . means “any character”.

Related

C# Regex to match attributes [duplicate]

How can I check the following conditions in c# using regular expression

A beginner's guide to interpreting Regex?

Regular Expression to reject special characters other than commas

Convert an Extended Regular expression to .NET compatible RegEx

Categories

Resources