String Needs to Contain 2 words - c#

I have a textbox on one of my views, and that textbox should not accept anything that has more than 2 words or less than 2 words. This textbox needs 2 words.
Basically this textbox accepts a person's first and last name. I don't want people to only enter one or the other.
Is there a way to check for a space character between 2 words, and for another space character followed by any letter, number, etc. after the 2nd word, if it exists? I think that if the user accidentally 'fat-fingers' an extra space after the 2nd word, that should be fine because there are still only 2 words.
For example:
/* the _ character means space */
John /* not accepted */
John_ /* not accepted */
John_Smith_a /* not accepted */
John_Smith_ /* accepted */
Any help is appreciated.

There are multiple approaches that you could use to solve this. I'll go over a few.
Using the String.Split() Method
You could use the String.Split() method to break up a string into its individual components based on a delimiter. In this case, you could use a space as a delimiter to get the individual words:
// Get your words, removing any empty entries along the way
var words = YourTextBox.Text.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
// Determine how many words you have here
if(words.Length != 2)
{
// Tell the user they made a horrible mistake not typing two words here
}
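To see how the split behaves on the examples from the question, here is a minimal sketch. The class and method names (`SplitWordCount`, `CountWords`) are made up for illustration; only `String.Split` and `StringSplitOptions.RemoveEmptyEntries` come from the answer above.

```csharp
using System;

public static class SplitWordCount
{
    // Counts space-separated words. RemoveEmptyEntries drops the empty
    // strings produced by leading, trailing, or repeated spaces.
    public static int CountWords(string input)
    {
        return input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Length;
    }

    public static void Main()
    {
        Console.WriteLine(CountWords("John"));          // 1
        Console.WriteLine(CountWords("John Smith "));   // 2
        Console.WriteLine(CountWords("John Smith a"));  // 3

        // Without RemoveEmptyEntries, the trailing space yields a third, empty entry.
        Console.WriteLine("John Smith ".Split(' ').Length); // 3
    }
}
```

The last line is why the option matters here: a plain split would reject a name typed with one trailing space.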
Using a Regular Expression
Additionally, you could attempt to resolve this via a Regular Expression using the Regex.IsMatch() method :
// Check for exactly two words (and allow for leading and trailing spaces)
if(!Regex.IsMatch(input, @"^(\s+)?\w+\s+\w+(\s+)?$"))
{
// There are not two words, do something
}
The expression itself may look a bit scary, but it can be broken down as follows:
^ # This matches the start of your string
(\s+)? # This optionally allows for a single series of one or more whitespace characters
\w+ # This allows for one or more "word" characters that make up your first word
\s+ # Again you allow for a series of whitespace characters, you can drop the + if you just want one
\w+ # Here's your second word, nothing new here
(\s+)? # Finally allow for some trailing spaces (up to you if you want them)
$ # This matches the end of your string, so nothing else can follow the second word
A "word" character \w is a character class in Regular Expressions that matches a letter, a digit or an underscore. In .NET it also matches Unicode letters and digits by default, so it is the equivalent of [a-zA-Z0-9_] only when RegexOptions.ECMAScript is used.
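As a quick check against the examples from the question, here is a minimal sketch. The names `TwoWordCheck` and `IsTwoWords` are hypothetical, and the pattern used is a variant that ends with `$` so that nothing may follow the second word apart from whitespace.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TwoWordCheck
{
    // Anchored at both ends: optional whitespace, a word, whitespace,
    // a second word, optional whitespace.
    private static readonly Regex TwoWords = new Regex(@"^\s*\w+\s+\w+\s*$");

    public static bool IsTwoWords(string input)
    {
        return input != null && TwoWords.IsMatch(input);
    }

    public static void Main()
    {
        foreach (var sample in new[] { "John", "John ", "John Smith a", "John Smith " })
        {
            Console.WriteLine("\"{0}\" -> {1}", sample, IsTwoWords(sample));
        }
    }
}
```

Only the last sample is accepted. Note that `\w` does not match apostrophes or hyphens, so a name such as O'Brien would be rejected by this pattern; whether that matters depends on the names you expect.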
Taking Advantage of Regular Expressions using MVC's RegularExpressionAttribute
Finally, since you are using MVC, you could take advantage of the [RegularExpression] attribute on your model itself:
[RegularExpression(@"^(\s+)?\w+\s+\w+(\s+)?$", ErrorMessage = "Exactly two words are required.")]
public string YourProperty { get; set; }
This will allow you to simply check ModelState.IsValid within your Controller Action to see if your Model has any errors or not:
// This will check your validation attributes like the one mentioned above
if(!ModelState.IsValid)
{
// You probably have some errors, like not exactly two words
}

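Use it like this:
string s = "John Smith a";
// Trim the input, split on spaces, and require exactly two non-empty words
if (s.Trim().Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Length == 2)
{
// Exactly two words were entered
}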

The tag implies MVC here, so I would recommend using the RegularExpressionAttribute class:
public class YourModel
{
[RegularExpression(@"^\s*\w+\s+\w+\s*$", ErrorMessage = "You must have exactly two words separated by a space.")]
public string YourProperty { get; set; }
}

Match m = Regex.Match(this.yourTextBox.Text, @"^\s*\w+\s+\w+\s*$");
if (m.Success)
//do something
else
//do something else
With my very limited knowledge of regular expressions, I believe that this will solve your issue.

The cleanest way is to use regular expressions with the IsMatch method like this:
Regex.IsMatch("One Two", @"^\w+\s\w+\s?$")
Returns true if the input is a match.

Try this:
if (str.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Length == 2)
{
//Do Something
}
str is the variable holding your string to compare

Related

Merging 3 Regular Expressions to make a Slug/URL validation check

I am trying to merge a few working RegEx patterns together (AND them). I don't think I am doing this properly, further, the first RegEx might be getting in the way of the next two.
Slug example (no special characters except for - and _):
(^[a-z0-9-_]+$)
Then I would like to ensure the first character is NOT - or _:
(^[^-_])
Then I would like to ensure the last character is NOT - or _:
([^-_]$)
Match (good Alias):
my-new_page
pagename
Not-Match (bad Alias)
-my-new-page
my-new-page_
!@#$%^&*()
If this RegExp can be simplified, I am more than happy to use the simpler version. I am trying to create validation on a page URL that the user can provide. I am looking for the user to:
Not start or end with a special character
Start and end with a number or letter
Be able to include - and _ in the middle (not at the start or end)
Once I get that working, I can tweak it for other characters as needed.
In the end I am applying as an Annotation to my model like so:
[RegularExpression(
@"(^[a-z0-9-_]+$)?(^[^-_])?([^-_]$)",
ErrorMessage = "Alias is not valid")
]
Thank you, and let me know if I should provide more information.
See regex in use here
^[a-z\d](?:[a-z\d_-]*[a-z\d])?$
^ Assert position at the start of the line
[a-z\d] Match any lowercase ASCII letter or digit
(?:[a-z\d_-]*[a-z\d])? Optionally match the following
[a-z\d_-]* Match any character in the set any number of times
[a-z\d] Match any lowercase ASCII letter or digit
$ Assert position at the end of the line
See code in use here
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
Regex regex = new Regex(@"^[a-z\d](?:[a-z\d_-]*[a-z\d])?$");
string[] strings = {"my-new_page", "pagename", "-my-new-page", "my-new-page_", "!@#$%^&*()"};
foreach(string s in strings) {
if (regex.IsMatch(s))
{
Console.WriteLine(s);
}
}
}
}
Result (only positive matches):
my-new_page
pagename

C# Extract part of the string that starts with specific letters

I have a string which I extract from an HTML document like this:
var elas = htmlDoc.DocumentNode.SelectSingleNode("//a[@class='a-size-small a-link-normal a-text-normal']");
if (elas != null)
{
//
_extractedString = elas.Attributes["href"].Value;
}
The HREF attribute contains this part of the string:
gp/offer-listing/B002755TC0/
And I'm trying to extract the B002755TC0 value, but the problem here is that the string will vary by its length and I cannot simply use Substring method that C# offers to extract that value...
Instead, I was wondering whether there's a clever way to do this, perhaps by matching the part of the string that comes right before the value I'm after?
For example I know for a fact that each href has this structure like I've shown, So I would simply match these keywords:
offer-listing/
So I would find this keyword and start extracting the part of the string B002755TC0 until the next " / " sign ?
Can someone help me out with this ?
This is a perfect job for a regular expression :
string text = "gp/offer-listing/B002755TC0/";
Regex pattern = new Regex(@"offer-listing/(\w+)/");
Match match = pattern.Match(text);
string whatYouAreLookingFor = match.Groups[1].Value;
Explanation : we just match the exact pattern you need.
'offer-listing/'
followed by any combination of (at least one) 'word characters' (letters, digits and underscore, but not hyphens),
followed by a slash.
The parenthesis () mean 'capture this group' (so we can extract it later with match.Groups[1]).
EDIT: if you want to extract also from this : /dp/B01KRHBT9Q/
Then you could use this pattern :
Regex pattern = new Regex(@"/(\w+)/$");
which will match both this string and the previous. The $ stands for the end of the string, so this literally means :
capture the characters in between the last two slashes of the string
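A minimal sketch of the same idea with a guard for input that does not fit the pattern (reading `Groups[1].Value` on a failed match just yields an empty string, which can hide problems). The names `HrefIdExtractor` and `TryExtractId` are hypothetical, chosen only for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HrefIdExtractor
{
    // Captures the run of word characters between the last two slashes.
    private static readonly Regex LastSegment = new Regex(@"/(\w+)/$");

    // Returns true and sets id when the href ends with /<word characters>/.
    public static bool TryExtractId(string href, out string id)
    {
        Match match = LastSegment.Match(href);
        id = match.Success ? match.Groups[1].Value : string.Empty;
        return match.Success;
    }

    public static void Main()
    {
        string id;
        foreach (var href in new[] { "gp/offer-listing/B002755TC0/", "/dp/B01KRHBT9Q/", "gp/offer-listing/" })
        {
            Console.WriteLine(TryExtractId(href, out id) ? id : "(no match)");
        }
    }
}
```

The first two inputs yield B002755TC0 and B01KRHBT9Q; the third has nothing between the last two slashes that `\w+` can match, so it reports no match.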
Though there is already an accepted answer, I thought I'd share another solution that doesn't use Regex. Find the position of your pattern in the input and add its length; the wanted text starts at that index. To find the end, search for the first "/" after the beginning of the wanted text:
string input = "gp/offer-listing/B002755TC0/";
string pat = "offer-listing/";
int beginning = input.IndexOf(pat) + pat.Length;
int end = input.IndexOf("/", beginning);
string result = input.Substring(beginning, end - beginning);
If your desired output is always the last piece, you could also use Split and take the last non-empty piece (Last() requires using System.Linq):
string result2 = input.Split(new string[] { "/" }, StringSplitOptions.RemoveEmptyEntries)
.Last();

Regular Expression oddity, why does this happen?

This simple regular expression matches the text of Movie. Am I wrong in reading this as "Q repeated zero or more times"? Why does it match, shouldn't it return false?
public class Program
{
private static void Main(string[] args)
{
Regex regex = new Regex("Q*");
string input = "Movie";
if (regex.IsMatch(input))
{
Console.WriteLine("Yup.");
}
else
{
Console.WriteLine("Nope.");
}
}
}
As you are saying correctly, it means “Q repeated zero or more times”. In this case, it’s zero times, so you are essentially trying to match "" in your input string. As IsMatch doesn’t care where it matches, it can match the empty string anywhere within your input string, so it returns true.
If you want to make sure that the whole input string has to match, you can add ^ and $: "^Q*$".
Regex regex = new Regex("^Q*$");
Console.WriteLine(regex.IsMatch("Movie")); // false
Console.WriteLine(regex.IsMatch("QQQ")); // true
Console.WriteLine(regex.IsMatch("")); // true
You are right in reading this regex as Q repeated 0 or more times. The thing with that is the 0. When you try a regex, it will try to find any successful match.
The only way for the regex to match the string is to match an empty string (Q repeated 0 times), and an empty string can be found at any position in the input, including between characters. If you didn't know that before: yes, a regex can match empty strings between characters. You can try:
(Q*)
To get a capture group and use .Matches and Groups[1].Value to see what has been captured. You'll see that it's an empty string.
Usually, if you want to check the existence of a character, you don't use regex, but use .Contains. Otherwise, if you do want to use regex, you'd drop the quantifier, or use one which matches at least one particular character.
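To make the empty matches visible, here is a minimal sketch. The class and helper names (`EmptyMatchDemo`, `AllMatchesEmpty`) are made up for illustration; `Regex.Matches`, `Match.Index` and `Match.Length` are the standard .NET members.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmptyMatchDemo
{
    // True when the pattern matches somewhere and every match is zero-length.
    public static bool AllMatchesEmpty(string input, string pattern)
    {
        MatchCollection matches = Regex.Matches(input, pattern);
        if (matches.Count == 0) return false;
        foreach (Match m in matches)
        {
            if (m.Length != 0) return false;
        }
        return true;
    }

    public static void Main()
    {
        // Every match of "Q*" in "Movie" has length 0.
        foreach (Match m in Regex.Matches("Movie", "Q*"))
        {
            Console.WriteLine("index {0}, length {1}", m.Index, m.Length);
        }

        // Requiring at least one Q, or using Contains, gives the expected answer.
        Console.WriteLine(Regex.IsMatch("Movie", "Q+")); // False
        Console.WriteLine("Movie".Contains("Q"));        // False
    }
}
```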

Check if an expression is a match with regex

In C# I have two strings: [I/text] and [S/100x20].
So, the first one is [I/ followed by text and ending in ].
And the second is [S/ followed by an integer, then x, then another integer, and ending in ].
I need to check if a given string is a match of one of this formats. I tried the following:
(?<word>.*?) and (?<word>[0-9]x[0-9])
But this does not seem to work and I am missing the [I/...] and [S/...] parts.
How can I do this?
This should do nicely:
Regex rex = new Regex(@"\[I/[^\]]+\]|\[S/\d+x\d+\]");
If the text in [I/text] is supposed to include only alphanumeric characters, then @Oleg's use of \w instead of [^\]] would be better. Also, using + means there needs to be at least one character from the preceding character class, whereas * allows the class to match zero times. Adjust as needed.
And use:
string testString1 = "[I/text]";
if(rex.IsMatch(testString1))
{
// should match..
}
string testString2 = "[S/100x20]";
if(rex.IsMatch(testString2))
{
// should match..
}
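The following regex does it. It matches either complete bracketed token; wrap it as ^(?:...)$ if the entire input string must match:
@"(\[I/\w+\])|(\[S/\d+x\d+\])"
(\[I/\w+\]) matches [I/ followed by one or more word characters and a closing ]
(\[S/\d+x\d+\]) matches [S/ followed by an integer, x, another integer, and a closing ]
The above works.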
use http://regexr.com?34543 to play with your expressions
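If the whole input has to be one of the two formats, and the pieces are needed afterwards, the alternation can be anchored and given named groups. This is only a sketch under those assumptions: the names `TagParser` and `Describe` and the group names `text`, `width` and `height` are invented for the example and do not come from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagParser
{
    // ^(?: ... | ... )$ forces the entire input to be one of the two formats.
    private static readonly Regex TagPattern = new Regex(
        @"^(?:\[I/(?<text>[^\]]+)\]|\[S/(?<width>\d+)x(?<height>\d+)\])$");

    public static string Describe(string input)
    {
        Match m = TagPattern.Match(input);
        if (!m.Success) return "no match";
        if (m.Groups["text"].Success) return "I: " + m.Groups["text"].Value;
        return "S: " + m.Groups["width"].Value + " by " + m.Groups["height"].Value;
    }

    public static void Main()
    {
        Console.WriteLine(Describe("[I/text]"));    // I: text
        Console.WriteLine(Describe("[S/100x20]"));  // S: 100 by 20
        Console.WriteLine(Describe("[S/100x]"));    // no match
        Console.WriteLine(Describe("x[I/text]"));   // no match
    }
}
```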

Using Regex to determine if string contains a repeated sequence of a particular substring with comma separators and nothing else

I want to find if a string contains a repeated sequence of a known substring (with comma separators) and nothing else and return true if this is the case; otherwise false. For example: the substring is "0,8"
String A: "0,8,0,8,0,8,0,8" returns true
String B: "0,8,0,8,1,0,8,0" returns false because of '1'
I tried using the C# string functions Contains but it does not suit my requirements. I am totally new to regular expression but I feel it should be powerful enough to do this. What RegEx should I use to do this?
The pattern for a string containing nothing but a repeated number of a given substring (possibly zero of them, resulting in an empty string) is \A(?:substring goes here)*\z. The \A matches the beginning of the string, the \z the end of the string, and the (?:...)* matches 0 or more copies of anything matching the thing between the colon and the close parenthesis.
But your string doesn't actually match \A(?:0,8)*\z, because of the extra commas; an example that would match is "0,80,80,80,8". You need to account for the commas explicitly with something like \A0,8(?:,0,8)*\z.
You can build such a thing in C# thus:
string OkSubstring = "0,8";
string aOk = "0,8,0,8,0,8,0,8";
string bOk = "0,8,0,8,1,0,8,0";
Regex OkRegex = new Regex( @"\A" + OkSubstring + "(?:," + OkSubstring + @")*\z" );
OkRegex.IsMatch(aOk); // True
OkRegex.IsMatch(bOk); // False
That hard-codes the comma-delimiter; you could make it more general. Or maybe you just need the literal regex. Either way, that's the pattern you need.
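One way to make it more general is to pass the delimiter in and escape both parts with `Regex.Escape`, so that characters such as `.` or `|` in the substring are treated literally. This is a sketch, not the answerer's code; `RepeatedSequence` and `BuildPattern` are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RepeatedSequence
{
    // Builds \A item (?: delimiter item )* \z with both parts escaped.
    public static Regex BuildPattern(string item, string delimiter)
    {
        string escapedItem = Regex.Escape(item);
        string escapedDelimiter = Regex.Escape(delimiter);
        return new Regex(@"\A" + escapedItem + "(?:" + escapedDelimiter + escapedItem + @")*\z");
    }

    public static void Main()
    {
        Regex ok = BuildPattern("0,8", ",");
        Console.WriteLine(ok.IsMatch("0,8,0,8,0,8,0,8")); // True
        Console.WriteLine(ok.IsMatch("0,8,0,8,1,0,8,0")); // False
        Console.WriteLine(ok.IsMatch(""));                // False
    }
}
```

Because the first copy of the item sits outside the repeated group, this variant requires at least one occurrence, so the empty string is rejected.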
EDIT Changed the anchors per Mike Samuel's suggestion.
