How to validate comma-separated string using Regex [duplicate] - c#

I need to validate my C# model class.
    [Required(ErrorMessage = "Comma Separated String Required")]
    [RegularExpression(@"", ErrorMessage = "Invalid Comma Separated String.")]
    [RegularExpression(@"", ErrorMessage = "Duplicate Code.")]
    public string CommaSeparatedString { get; set; }
I just tried the following regex, but it is not working for me.
((\s+)??(\d[a-z]|[a-z]\d|[a-z]),?)+?$
In my case, CommaSeparatedString can be:
ASAEW1,ASAEW2,ASA,S4,ASAEW5,ASAEW6,ASAEW7 - Valid
ASAEW1,ASAEW2,ASA,S4,ASAEW5,ASAEW6,ASAEW7,ASAEW6 - Invalid - Duplicate ASAEW6
ASAEW1,ASAEW2,ASA,S4,ASAEW5,ASAEW6,ASAEW7, - Invalid - Comma at end
ASAEW1,ASAEW2,,ASA,S4,ASAEW5,ASAEW6,ASAEW7 - Invalid - No value between 2,3 comma
These are the rules I need to enforce. Is there any way to check for duplicates in a comma-separated string? I need to show the 'Duplicate Code' error message if CommaSeparatedString contains duplicates. How can I do this?

I'm not a regex magician, but puzzled together something that might work for you here:
^((([A-Z]+\d*)(?!.*,\3\b)),)*[A-Z]+\d*$
In steps:
^((( - Start-of-string anchor followed by three opening capture groups
[A-Z]+\d* - The third capture group must consist of capital letters (at least one) followed by zero or more digits
(?!.*,\3\b) - Negative lookahead that makes sure the code just captured in group 3 does not appear again further down the line
),)* - Closes group 2, matches a comma, then closes group 1, which may repeat zero or more times
[A-Z]+\d* - The last bit repeats the same pattern we were looking for in group 3, this time for the final code
$ - End-of-string anchor
I'm not the best at explaining either, but I hope it's clear enough and working (I'm hoping backreferences are allowed in C#, as I have no experience with it) =)
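.NET regular expressions do support numbered backreferences such as \3, so the pattern can be tried directly. The following is a minimal sketch that runs it against the four sample strings from the question; the class and method names (CodeListRegex, IsValid) are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CodeListRegex
{
    // Pattern taken verbatim from the answer above; \3 refers to the code just captured.
    private static readonly Regex UniqueCodes =
        new Regex(@"^((([A-Z]+\d*)(?!.*,\3\b)),)*[A-Z]+\d*$");

    public static bool IsValid(string input) => UniqueCodes.IsMatch(input);

    public static void Main()
    {
        string[] samples =
        {
            "ASAEW1,ASAEW2,ASA,S4,ASAEW5,ASAEW6,ASAEW7",        // True
            "ASAEW1,ASAEW2,ASA,S4,ASAEW5,ASAEW6,ASAEW7,ASAEW6", // False: duplicate
            "ASAEW1,ASAEW2,ASA,S4,ASAEW5,ASAEW6,ASAEW7,",       // False: comma at end
            "ASAEW1,ASAEW2,,ASA,S4,ASAEW5,ASAEW6,ASAEW7"        // False: empty item
        };
        foreach (var s in samples)
        {
            Console.WriteLine($"{IsValid(s)} {s}");
        }
    }
}
```

Note that a single pattern like this can only report "valid" or "invalid"; it cannot tell a duplicate apart from a malformed list, so it does not by itself give the separate 'Duplicate Code' message the question asks for.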

You can take a look at custom validations for your model, using either the ASP.NET (Core?) framework or the FluentValidation NuGet package.
For a solution depending only on the framework, I wrote a sample that might work for you or at least get you started:
    using System;
    using System.Collections.Generic;
    using System.ComponentModel.DataAnnotations;

    public class MyModel : IValidatableObject
    {
        public string CommaSeparatedString { get; set; }

        public IEnumerable<ValidationResult> Validate(ValidationContext validationContext)
        {
            // Nothing to check for a missing value; this also avoids a NullReferenceException.
            if (string.IsNullOrEmpty(CommaSeparatedString)) yield break;

            if (CommaSeparatedString.EndsWith(","))
            {
                yield return new ValidationResult("Invalid Comma Separated String - Comma at end");
            }

            // Split(char) is available on every .NET version.
            var splitCodes = CommaSeparatedString.Split(',');
            var setOfCodes = new HashSet<string>();
            foreach (var code in splitCodes)
            {
                if (code.Trim() == String.Empty)
                {
                    yield return new ValidationResult("Invalid Comma Separated String - Missing Code");
                    continue;
                }

                // HashSet.Add returns false when the code was already present.
                var added = setOfCodes.Add(code);
                if (!added) yield return new ValidationResult($"Duplicate Code: {code}");
            }
        }
    }
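If you would rather keep the attribute style used in the question, the duplicate check could also live in a custom ValidationAttribute. This is only a sketch: UniqueCodesAttribute is a hypothetical name, and it deliberately checks nothing but duplicates, leaving format errors to other validators.

```csharp
using System;
using System.ComponentModel.DataAnnotations;
using System.Linq;

// Hypothetical attribute: reports a failure only when a code occurs more than once.
[AttributeUsage(AttributeTargets.Property)]
public sealed class UniqueCodesAttribute : ValidationAttribute
{
    public override bool IsValid(object value)
    {
        // Null or non-string values are left to [Required] and other attributes.
        if (!(value is string text)) return true;

        // Ignore empty items here; they are a format problem, not a duplicate.
        var codes = text.Split(',')
                        .Select(c => c.Trim())
                        .Where(c => c.Length > 0)
                        .ToList();

        return codes.Distinct().Count() == codes.Count;
    }
}
```

It would then be applied as [UniqueCodes(ErrorMessage = "Duplicate Code.")] on the CommaSeparatedString property.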

Related

check for valid string format

I am trying to build a validation system that checks whether a string is in the correct format.
The required format can only contain digits and dashes (-) and must be grouped like ***-**-*****-**-*, that is, 3-2-5-2-1 digits.
For example, 978-14-08855-65-2
Can I use Regex, as I have for an email checking system, by changing the format key @"^([\w]+)@([\w])\.([\w]+)$"?
The email checking code is:
    public static bool ValidEmail(string email, out string error)
    {
        error = "";
        string regexEmailCOM = @"^([\w]+)@([\w])\.([\w]+)$"; // allows for .com emails
        string regexEmailCoUK = @"^([\w]+)@([\w])\.([\w]+)\.([\w]+)$"; // allows for .co.uk emails
        // The pattern is the regex and the email is the input being tested
        return Regex.IsMatch(email, regexEmailCOM) || Regex.IsMatch(email, regexEmailCoUK);
    }
Regex is indeed a good fit for this situation.
One possible expression would be:
^\d{3}-\d\d-\d{5}-\d\d-\d$
This matches exactly 5 groups of only digits (\d) separated by -. Use curly brackets to set a fixed number of repeats.
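As a quick check, the expression can be wrapped in a small helper. This is a sketch only; GroupedDigits and IsValid are illustrative names, not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupedDigits
{
    // 3-2-5-2-1 digits separated by dashes, anchored at both ends.
    private static readonly Regex Format =
        new Regex(@"^\d{3}-\d\d-\d{5}-\d\d-\d$");

    public static bool IsValid(string value) =>
        value != null && Format.IsMatch(value);

    public static void Main()
    {
        Console.WriteLine(IsValid("978-14-08855-65-2"));  // True
        Console.WriteLine(IsValid("978-1-408855-65-2"));  // False: wrong group sizes
        Console.WriteLine(IsValid("978-14-08855-65-2X")); // False: trailing character
    }
}
```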

Issue with Validating Phone number on user input,Int to large c#

Overview of Project:
I am creating a multi-form application, which consists of two forms and one parent class. Both forms have a series of validation functions, such as isLetter(), isNumber() and isValidEmail(). My issue comes when using the isNumber() function.
    public static bool numValidation(string strNum)
    {
        if (!string.IsNullOrWhiteSpace(strNum))
        {
            int temp;
            if (int.TryParse(strNum, out temp))
            {
                Console.WriteLine("Phone Number is a valid input: " + temp);
                return true;
            }
            else
            {
                Console.WriteLine(temp + "Is not Valid input!!");
            }
        }
        return false;
    }
At first glance it works fine, but once I tried to break it, I realised that when an actual phone number is entered I get an error saying that the number is too large. Any ideas how to get around this constraint? The only reason I need this validation is for phone numbers, fax numbers, etc. I simply need this function to accept very large numbers such as phone numbers.
I suggest that you use a regular expression to validate the input in your case:
    public static bool numValidation(string strNum)
    {
        Regex regex = new Regex(@"^[0-9]+$");
        return regex.IsMatch(strNum);
    }
Parsing a string and then not using the parsed value is not really needed.
For more details, check out this answer - Regex for numbers only
Building on Mauricio Gracia Gutierrez's answer:
You could enhance the expression to check the length of the number:
Between 5 and 10 digits:
    Regex regex = new Regex(@"^\d{5,10}$");
Max 10 digits:
    Regex regex = new Regex(@"^\d{1,10}$");
At least 5 digits:
    Regex regex = new Regex(@"^\d{5,}$");
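To make the difference concrete, here is a small sketch comparing int.TryParse with a digits-only regex. The sample number and the names PhoneDigits, IsDigits and IsFiveToTenDigits are made up for illustration. int.TryParse rejects the sample because its numeric value is larger than int.MaxValue (2,147,483,647), while the regex only looks at the characters.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PhoneDigits
{
    private static readonly Regex DigitsOnly = new Regex(@"^[0-9]+$");
    private static readonly Regex FiveToTen = new Regex(@"^[0-9]{5,10}$");

    // True when the string consists of one or more ASCII digits.
    public static bool IsDigits(string s) => s != null && DigitsOnly.IsMatch(s);

    // True when the string consists of 5 to 10 ASCII digits.
    public static bool IsFiveToTenDigits(string s) => s != null && FiveToTen.IsMatch(s);

    public static void Main()
    {
        string phone = "07911123456"; // hypothetical 11-digit number

        Console.WriteLine(int.TryParse(phone, out _)); // False: value exceeds int.MaxValue
        Console.WriteLine(IsDigits(phone));            // True
        Console.WriteLine(IsFiveToTenDigits(phone));   // False: 11 digits
    }
}
```

A regex check also keeps leading zeros meaningful, which a numeric parse would discard.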

Validate a string based on certain Format

I need to validate an input string against certain formats, i.e.
Proj-######## (4 letters, 1 dash and 8 digits)
OP###### (2 letters, 6 digits)
Can someone please help me with this?
I tried the approach below. It works for 1 dash and 8 digits, but I can't work out how to extend the regex so that it allows exactly 4 letters before the dash.
    private static readonly Regex boxNumberRegex = new Regex(@"^\d-\d{8}$");
    public static bool VerifyBoxNumber(string boxNumber)
    {
        return boxNumberRegex.IsMatch(boxNumber);
    }
Try this.
\b[a-zA-Z]{4}-\d{8}\b - is for Proj-########
\b[a-zA-Z]{2}\d{6}\b - is for OP######
If you want to learn to build regular expressions, have a look at this article. It is worth reading.
http://www.codeproject.com/Articles/9099/The-Minute-Regex-Tutorial
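If the whole input has to be one of the two formats, rather than merely contain one, the two patterns could be combined into a single alternation anchored with ^ and $ instead of \b. A possible sketch, reusing the question's VerifyBoxNumber name; the BoxNumber class name is an assumption:

```csharp
using System;
using System.Text.RegularExpressions;

public static class BoxNumber
{
    // Either 4 letters, a dash and 8 digits, or 2 letters and 6 digits.
    private static readonly Regex BoxNumberRegex =
        new Regex(@"^(?:[A-Za-z]{4}-\d{8}|[A-Za-z]{2}\d{6})$");

    public static bool VerifyBoxNumber(string boxNumber) =>
        boxNumber != null && BoxNumberRegex.IsMatch(boxNumber);

    public static void Main()
    {
        Console.WriteLine(VerifyBoxNumber("Proj-12345678")); // True
        Console.WriteLine(VerifyBoxNumber("OP123456"));      // True
        Console.WriteLine(VerifyBoxNumber("Proj-123456"));   // False: only 6 digits
        Console.WriteLine(VerifyBoxNumber("OP12345678"));    // False: 8 digits
    }
}
```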

C# Regex, any more efficient way to parse string enclosed by symbol?

I'm not sure if it's okay to ask... But here goes.
I implemented a method that parses a string using regex. Each match is handled by a delegate, in the following order (I think the order is not actually important, but I wrote it this way and it's not fully tested):
Pattern Regex.Replace: @"(?<!\\)\$.+?\$" then String.Replace: @"\$", @"$"; Replaces a string enclosed by dollar signs. Ignores backslash-escaped ones, then removes the backslash. Ex: "$global name$" -> "motherofglobalvar", "Money \$9000" -> "Money $9000"
Pattern Regex.Replace: @"(?<!\\)%.+?%" then String.Replace: @"\%", @"%"; Replaces a string enclosed by percent signs. Ignores backslash-escaped ones, then removes the backslash. Same as the previous example: "%local var%" -> "lordoflocalvar", "It's over 9000\%" -> "It's over 9000%"
Pattern Regex.Replace: @"(?<!\\)@" then String.Replace: @"\@", @"@"; Replaces the char '@' with a space, ' '. Ignores backslash-escaped ones, then removes the backslash. Ex: "I@hit@the@ground@too@hard" -> "I hit the ground too hard", "qw\@op" -> "qw@op"
What I've done without much experience (I think):
    //parse variable
    public static string ParseVariable(string text)
    {
        return Regex.Replace(Regex.Replace(Regex.Replace(text, @"(?<!\\)\$.+?\$", match =>
        {
            string trim = match.Value.Trim('$');
            string trimUpper = trim.ToUpper();
            return variableGlobal.ContainsKey(trim) ? variableGlobal[trim] : match.Value;
        }).Replace(@"\$", @"$"), @"(?<!\\)%.+?%", match =>
        {
            string trim = match.Value.Trim('%');
            string trimUpper = trim.ToUpper();
            return variableLocal.ContainsKey(trim) ? variableLocal[trim] : match.Value;
        }).Replace(@"\%", @"%"), @"(?<!\\)@", " ").Replace(@"\@", @"@");
    }
In short, what I used is: Regex.Replace().Replace()
Since I need to parse 3 kinds of symbols, I chained it as follows: Regex.Replace(Regex.Replace(Regex.Replace().Replace()).Replace()).Replace()
Is there any more efficient way than this? I mean, like without need to go through the text 6 times? (3 times regex.replace, 3 times string.replace, where each replace modifies the text to be used by the next replace )
Or is it the best way it can do?
Thanks.
Here's a unique take on the problem, I think. You can build a class that will be used to construct the overall pattern piece by piece. This class will also be responsible for generating the MatchEvaluator delegate that will be passed to Replace.
    class RegexReplacer
    {
        public string Pattern { get; private set; }
        public string Replacement { get; private set; }
        public string GroupName { get; private set; }
        public RegexReplacer NextReplacer { get; private set; }

        public RegexReplacer(string pattern, string replacement, string groupName, RegexReplacer nextReplacer = null)
        {
            this.Pattern = pattern;
            this.Replacement = replacement;
            this.GroupName = groupName;
            this.NextReplacer = nextReplacer;
        }

        public string GetAggregatedPattern()
        {
            // If there isn't another replacer, there is no alternation; otherwise, build an
            // alternation between this pattern and the next replacer's "full" pattern.
            string alternation = (this.NextReplacer == null ? string.Empty : "|" + this.NextReplacer.GetAggregatedPattern());
            // The (?<XXX>) syntax builds a named capture group, which GetReplaceDelegate relies on.
            return string.Format("(?<{0}>{1}){2}", this.GroupName, this.Pattern, alternation);
        }

        public MatchEvaluator GetReplaceDelegate()
        {
            return (match) =>
            {
                if (match.Groups[this.GroupName] != null && match.Groups[this.GroupName].Length > 0) // Did we get a hit on the group name?
                {
                    return this.Replacement;
                }
                else if (this.NextReplacer != null) // No? Then is there another replacer to inspect?
                {
                    MatchEvaluator next = this.NextReplacer.GetReplaceDelegate();
                    return next(match);
                }
                else
                {
                    return match.Value; // No? Then simply return the value
                }
            };
        }
    }
It should be obvious what Pattern and Replacement represent. GroupName is kind of a hack to let the replacement evaluator know which RegexReplacer fragment resulted in the match. NextReplacer points to another replacer instance that holds a different pattern fragment (and so on down the chain).
The idea here is to have a kind of linked list of objects that will represent the overall pattern. You can call GetAggregatedPattern on the outermost replacer to get the full pattern: each replacer calls the next replacer's GetAggregatedPattern to get that replacer's pattern fragment, to which it concatenates its own fragment. GetReplaceDelegate generates a MatchEvaluator. This MatchEvaluator will compare its own GroupName to the Match's captured groups. If the group name was captured, then we have a hit, and we return this replacer's Replacement value. Otherwise, we step into the next replacer (if there is one) and repeat the group name comparison. If there is no hit on any replacer, then we simply yield back the original value (i.e. what was matched by the pattern; this should be rare).
The usage of such might look like this:
    string target = @"$global name$ Money \$9000 %local var% It's over 9000\% I@hit@the@ground@too@hard qw\@op";
    RegexReplacer dollarWrapped = new RegexReplacer(@"(?<!\\)\$[^$]+\$", "motherofglobalvar", "dollarWrapped");
    RegexReplacer slashDollar = new RegexReplacer(@"\\\$", "$", "slashDollar", dollarWrapped);
    RegexReplacer percentWrapped = new RegexReplacer(@"(?<!\\)%[^%]+%", "lordoflocalvar", "percentWrapped", slashDollar);
    RegexReplacer slashPercent = new RegexReplacer(@"\\%", "%", "slashPercent", percentWrapped);
    RegexReplacer singleAt = new RegexReplacer(@"(?<!\\)@", " ", "singleAt", slashPercent);
    RegexReplacer slashAt = new RegexReplacer(@"\\@", "@", "slashAt", singleAt);
    RegexReplacer replacer = slashAt;
    string pattern = replacer.GetAggregatedPattern();
    MatchEvaluator evaluator = replacer.GetReplaceDelegate();
    string result = Regex.Replace(target, pattern, evaluator);
Because you want each replacer to know if it got a hit, and because we are hacking this by using group names, you want to make sure that each group name is distinct. A simple way to ensure this would be to use a name that's identical to the variable name since you can't have two variables with the same name within the same scope.
You can see above that I am building each part of the pattern separately, but as I build, I pass the previous replacer as a 4th parameter to the current replacer. This builds the chain of replacers. Once built, I use the last replacer constructed in order to generate the overall pattern and evaluator. If you use anything but, then you will only have part of the overall pattern. Finally, it's simply a matter of passing the generated pattern and evaluator to the Replace method.
Keep in mind that this approach was targeted more at the problem as described. It may work in more general scenarios, but I've only worked with what you've presented. Also, since this is more of a parsing question, a parser may be the proper route to take--although the learning curve is going to be higher.
Also keep in mind that I haven't profiled this code. It certainly doesn't loop over the target string multiple times, but it does involve additional method calls during replacement. You would certainly want to test it in your environment.
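For comparison, the same single-pass idea can be sketched without the replacer chain: one alternation with named groups and one MatchEvaluator that looks the captured name up in a dictionary. Everything here is an assumption for illustration: the class name SinglePassParser, the two dictionaries standing in for the question's variableGlobal and variableLocal, and '@' as the third symbol. It has not been profiled either.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SinglePassParser
{
    // Hypothetical stand-ins for the question's variableGlobal / variableLocal.
    private static readonly Dictionary<string, string> Globals =
        new Dictionary<string, string> { ["global name"] = "motherofglobalvar" };
    private static readonly Dictionary<string, string> Locals =
        new Dictionary<string, string> { ["local var"] = "lordoflocalvar" };

    // The escaped-symbol alternative comes first, so "\$" is consumed as a pair
    // before a bare "$" could start a variable; no lookbehind is needed.
    private static readonly Regex Token = new Regex(
        @"\\(?<esc>[$%@])|\$(?<global>[^$]+)\$|%(?<local>[^%]+)%|(?<at>@)");

    public static string Parse(string text)
    {
        return Token.Replace(text, match =>
        {
            if (match.Groups["esc"].Success)
                return match.Groups["esc"].Value; // drop the backslash

            if (match.Groups["global"].Success)
                return Globals.TryGetValue(match.Groups["global"].Value, out var g) ? g : match.Value;

            if (match.Groups["local"].Success)
                return Locals.TryGetValue(match.Groups["local"].Value, out var l) ? l : match.Value;

            return " "; // a bare '@' becomes a space
        });
    }
}
```

Calling Parse on the target string from the usage example would go through the text once. Unknown variable names are left untouched, as in the question's original code.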

How to preserve "{0}" after two string.Format calls [duplicate]

When I keep rewriting the same thing, I usually write what I call a string pattern for it.
Let's say I would like to do SQL injection to extend ORM functionality...
    protected static string FULLTEXTPATTERN = "EXISTS CONTAINSTABLE([{0}],*,'\"{1}\"') WHERE [key] = {0}.id";
Usually I have the table name and the value, which I combine with string.Format(FULLTEXTPATTERN, ...), and everything is fine.
Imagine now that I have to do that in two steps: first injecting the table name, then the value I search for. So I would like to write something like:
    protected static string FULLTEXTPATTERN = "EXISTS CONTAINSTABLE([{0}],*,'\"{{0}}/*Something that returns {0} after string.Format*/\"') WHERE [key] = {0}.id";
    ...
    var PartialPattern = string.Format(FULLTEXTPATTERN, "TableX");
    //PartialPattern = "EXISTS CONTAINSTABLE([TableX],*,'\"{0}\"') WHERE [key] = {0}.id"
    ...
    //later in the code
    ...
    var sqlStatement = string.Format(PartialPattern, "Pitming");
    //sqlStatement = "EXISTS CONTAINSTABLE([TableX],*,'\"Pitming\"') WHERE [key] = {0}.id"
Is there a way to do it?
Logic says that you would simply put {{{0}}} in the format string to have it reduce down to {0} after the second string.Format call, but you can't - that throws a FormatException. But that's because you need yet another { and }, otherwise it really is not in the correct format :).
What you could do - set your full format to this (note the 4 { and } characters at the end):
"EXISTS CONTAINSTABLE([{0}],*,'\"{{0}}\"') WHERE [key] = {{{{0}}}}.id";
Then your final string will contain the {0} you expect.
As a proof - run this test:
    [TestMethod]
    public void StringFormatTest()
    {
        string result = string.Format(string.Format(
            "{0} {{0}} {{{{0}}}}", "inner"), "middle");
        Assert.AreEqual("inner middle {0}", result);
    }
Is it possible to delay generating SQL to the point at which you have all the required inputs so that you can use one call to String.Format() and multiple fields?
Alternatively, you could build the query iteratively using a StringBuilder rather than String.Format().
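The brace-doubling rule can be sketched as a two-stage helper. This is an illustration under assumptions: the names TwoStageFormat, BindTable and BindValue are made up, and the template assumes the table name is wanted in both places, so only the search value is deferred with {{0}}.

```csharp
using System;

public static class TwoStageFormat
{
    // {0} is filled by the first call; {{0}} survives it as {0} for the second call.
    public const string Template =
        "EXISTS CONTAINSTABLE([{0}],*,'\"{{0}}\"') WHERE [key] = {0}.id";

    // Stage 1: inject the table name.
    public static string BindTable(string tableName) =>
        string.Format(Template, tableName);

    // Stage 2: inject the search value into the partially bound pattern.
    public static string BindValue(string partialPattern, string value) =>
        string.Format(partialPattern, value);

    public static void Main()
    {
        string partial = BindTable("TableX");
        Console.WriteLine(partial);
        // EXISTS CONTAINSTABLE([TableX],*,'"{0}"') WHERE [key] = TableX.id

        Console.WriteLine(BindValue(partial, "Pitming"));
        // EXISTS CONTAINSTABLE([TableX],*,'"Pitming"') WHERE [key] = TableX.id
    }
}
```

Each additional formatting stage a placeholder has to survive doubles its braces again, which is why a placeholder meant to remain after two calls needs {{{{0}}}}.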
