Environment: Microsoft Visual Studio 2008, C#
How do I get the index of a whole word found in a string?
string dateStringsToValidate = "birthdatecake||birthdate||other||strings";
string testValue = "birthdate";
var result = dateStringsToValidate.IndexOf(testValue);
It doesn't have to be done the way I did it, either. For example, would it be better to use regular expressions or other methods?
Update:
The word is birthdate, not birthdatecake. It doesn't have to retrieve the match, but the index should point to the right word. I don't think IndexOf is what I'm looking for, then. Sorry for being unclear.
Use regular expressions for this
string dateStringsToValidate = "birthdatecake||birthdate||other||strings";
string testValue = "strings";
var result = WholeWordIndexOf(dateStringsToValidate, testValue);
// ...
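// Note: optional parameters need C# 4.0; on C# 3.0 (VS 2008) use an overload instead.
public int WholeWordIndexOf(string source, string word, bool ignoreCase = false)
{
    // \b anchors the match at word boundaries, so "birthdate" no longer matches
    // inside "birthdatecake"; Regex.Escape guards against metacharacters in the word.
    string pattern = @"\b" + Regex.Escape(word) + @"\b";
    var regex = new Regex(pattern, ignoreCase ?
        RegexOptions.IgnoreCase :
        RegexOptions.None);
    var match = regex.Match(source);
    return match.Success ? match.Index : -1;
}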
Learn more about regex options in C# here
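If a regex feels heavy for this, the same whole-word check can be sketched with plain IndexOf plus a look at the neighbouring characters. The class and method names here (WholeWordSearch, IndexOfWholeWord) are made up for illustration, and "word character" is assumed to mean a letter, digit, or underscore, which mirrors what \b treats as part of a word.

```csharp
using System;

public static class WholeWordSearch
{
    // Hypothetical helper (not from the answers above): returns the character
    // index of the first occurrence of `word` that is not directly adjacent to
    // a letter, digit, or underscore, or -1 when there is none.
    public static int IndexOfWholeWord(string source, string word)
    {
        if (string.IsNullOrEmpty(source) || string.IsNullOrEmpty(word))
            return -1;

        int start = 0;
        while (start <= source.Length - word.Length)
        {
            int index = source.IndexOf(word, start, StringComparison.Ordinal);
            if (index < 0)
                return -1;

            int end = index + word.Length;
            bool leftOk = index == 0 || !IsWordChar(source[index - 1]);
            bool rightOk = end == source.Length || !IsWordChar(source[end]);
            if (leftOk && rightOk)
                return index;

            start = index + 1; // partial hit (e.g. inside "birthdatecake"): keep scanning
        }
        return -1;
    }

    private static bool IsWordChar(char c)
    {
        return char.IsLetterOrDigit(c) || c == '_';
    }
}
```

For the string in the question, IndexOfWholeWord(dateStringsToValidate, "birthdate") skips the hit inside "birthdatecake" and returns 15.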
Another option, depending on your needs, is to split the string (as I see you have some delimiters). Please note that the index returned by this option is the index by word count, not character count (in this case 1, as C# arrays are zero-based).
string dateStringsToValidate = "birthdatecake||birthdate||other||strings";
var split = dateStringsToValidate.Split(new string[] { "||" }, StringSplitOptions.RemoveEmptyEntries);
string testValue = "birthdate";
var result = split.ToList().IndexOf(testValue);
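If the character offset is wanted after all, it can be recovered from the split pieces by adding up the lengths that precede the hit. This is only a sketch: TokenOffsets and OffsetOfToken are hypothetical names, and it assumes StringSplitOptions.None so that empty entries still count toward the offset.

```csharp
using System;

public static class TokenOffsets
{
    // Hypothetical helper: returns the character offset of the first token that
    // equals `token` exactly, or -1 if no token matches.
    public static int OffsetOfToken(string source, string delimiter, string token)
    {
        // Keep empty entries; dropping them would make the running offset wrong.
        string[] parts = source.Split(new[] { delimiter }, StringSplitOptions.None);

        int offset = 0;
        foreach (string part in parts)
        {
            if (part == token)
                return offset;
            offset += part.Length + delimiter.Length;
        }
        return -1;
    }
}
```

With the sample string and "||" as the delimiter, "birthdate" is reported at offset 15 and "strings" at 33.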
If you must deal with the exact index in the given string, then this is of little use to you. If you just want to find the best match in the string, this could work for you.
var dateStringsToValidate = "birthdatecake||birthdate||other||strings";
var toFind = "birthdate";
var splitDateStrings = dateStringsToValidate.Split(new[] {"||"}, StringSplitOptions.None);
var best = splitDateStrings
.Where(s => s.Contains(toFind))
.OrderBy(s => s.Length*1.0/toFind.Length) // a metric to define "best match"
.FirstOrDefault();
Console.WriteLine(best);
I have a value like the one below:
string value = "11,.Ad23";
int n;
bool isNumeric = int.TryParse(value, out n);
I check whether the string is numeric. If it is not numeric and contains non-numeric characters, I need to extract those characters.
The result must be as below:
,.Ad
How can I do this in C#?
If it doesn't matter if the non-digits are consecutive, it's simple:
string nonNumericValue = string.Concat(value.Where(c => !Char.IsDigit(c)));
Online Demo: http://ideone.com/croMht
If you use .NET 3.5, as mentioned in the comments, there is no overload of String.Concat (or String.Join, as in Dmytri's answer) that takes an IEnumerable<T>. Passing a char[] to String.Concat there binds to the Concat(object) overload and yields "System.Char[]", so build the string from the array instead:
string nonNumericValue = new string(value.Where(c => !Char.IsDigit(c)).ToArray());
That takes all non-digits. If you instead want only the middle part, skip the leading digits and then take everything up to the next digit:
string nonNumericValue = string.Concat(value.SkipWhile(Char.IsDigit)
.TakeWhile(c => !Char.IsDigit(c)));
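The difference between the two variants only shows when non-digits appear more than once, so here is a small sketch with a longer input. The class and method names are hypothetical, and the input "11,.Ad23xy" is an assumed example, not one from the question. It uses new string(char[]) so it behaves the same on .NET 3.5 and later.

```csharp
using System;
using System.Linq;

public static class NonDigitExtraction
{
    // Every non-digit character, wherever it appears.
    public static string AllNonDigits(string value)
    {
        return new string(value.Where(c => !char.IsDigit(c)).ToArray());
    }

    // Only the first run of non-digits that follows the leading digits.
    public static string FirstNonDigitRun(string value)
    {
        return new string(value
            .SkipWhile(c => char.IsDigit(c))
            .TakeWhile(c => !char.IsDigit(c))
            .ToArray());
    }
}
```

For "11,.Ad23xy", AllNonDigits gives ",.Adxy" while FirstNonDigitRun gives ",.Ad". For the question's "11,.Ad23" both give ",.Ad".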
Regular expression solution (glue together all non-numeric values):
String source = "11,.Ad23";
String result = String.Join("", Regex
.Matches(source, @"\D{1}")
.OfType<Match>()
.Select(item => item.Value));
Edit: it seems that you use an old version of .NET. In that case you can use straightforward code without Regex, LINQ, etc.:
String source = "11,.Ad23";
StringBuilder sb = new StringBuilder(source.Length);
foreach (Char ch in source)
if (!Char.IsDigit(ch))
sb.Append(ch);
String result = sb.ToString();
Although I like the solution proposed, I think a more efficient way would be to use a regular expression such as
[^\D]
which is equivalent to \d and is called as
var regex = new Regex(@"[^\D]");
var nonNumeric = regex.Replace("11,.Ad23", "");
Which returns:
,.Ad
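Since [^\D] is a double negative ("not a non-digit"), it can be written as plain \d. A minimal sketch showing that both forms strip the same characters; DigitStripper and its method names are made up for this example.

```csharp
using System.Text.RegularExpressions;

public static class DigitStripper
{
    // Removes every digit using the direct character class.
    public static string StripDigits(string value)
    {
        return Regex.Replace(value, @"\d", "");
    }

    // Same result using the double-negated class from the answer above.
    public static string StripDigitsNegated(string value)
    {
        return Regex.Replace(value, @"[^\D]", "");
    }
}
```

Both calls return ",.Ad" for "11,.Ad23".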
Would a LINQ solution work for you?
string value = "11,.Ad23";
var result = new string(value.Where(x => !char.IsDigit(x)).ToArray());
I'm trying to make a regular expression in C# that will match strings like "", but my regex stops at the first match, and I'd like to match the whole string.
I've tried a lot of ways to do this. Currently, my code looks like this:
string sPattern = @"/&#\d{2};/";
Regex rExp = new Regex(sPattern);
MatchCollection mcMatches = rExp.Matches(txtInput.Text);
foreach (Match m in mcMatches) {
if (!m.Success) {
//Give Warning
}
}
I also tried lblDebug.Text = Regex.IsMatch(txtInput.Text, "(&#[0-9]{2};)+").ToString(); but it also only finds the first match.
Any tips?
Edit:
The end result I'm seeking is that strings like &# are labeled as incorrect. As it is now, since only the first match is made, my code marks this as a correct string.
Second Edit:
I changed my code to this
string sPattern = @"&#\d{2};";
Regex rExp = new Regex(sPattern);
MatchCollection mcMatches = rExp.Matches(txtInput.Text);
int iMatchCount = 0;
foreach (Match m in mcMatches) {
if (m.Success) {
iMatchCount++;
}
}
int iTotalStrings = txtInput.Text.Length / 5;
int iVerify = txtInput.Text.Length % 5;
if (iTotalStrings == iMatchCount && iVerify == 0) {
lblDebug.Text = "True";
} else {
lblDebug.Text = "False";
}
And this works the way I expected, but I still think this can be achieved in a better way.
Third Edit:
As @devundef suggested, the expression "^(&#\d{2};)+$" does the work I was hoping for, so with this, my final code looks like this:
string sPattern = @"^(&#\d{2};)+$";
Regex rExp = new Regex(sPattern);
lblDebug.Text = rExp.IsMatch(txtInput.Text).ToString();
I always neglect the start and end of string characters (^ / $).
Remove the / at the start and end of the expression.
string sPattern = @"&#\d{2};";
EDIT
I tested the pattern and it works as expected. Not sure what you want.
Two options:
&#\d{2}; => will give N matches in the string. On the string it will match 2 groups, and
(&#\d{2};)+ => will match the whole string as one single group. On the string it will match 1 group,
Edit 2:
What you want is not to get the groups but to know whether the string is in the right format. This is the pattern:
Regex rExp = new Regex(@"^(&#\d{2};)+$");
var isValid = rExp.IsMatch(""); // isValid = true
isValid = rExp.IsMatch("xyz"); // isValid = false
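To make the difference between "find each occurrence" and "validate the whole string" concrete, here is a small sketch. The inputs in the usage note ("&#12;&#34;" and "&#12;&#3") are assumed examples, and EntityValidator with its members is a hypothetical wrapper, not code from the question.

```csharp
using System.Text.RegularExpressions;

public static class EntityValidator
{
    // Anchored: the entire input must consist of one or more &#NN; sequences.
    private static readonly Regex WholeString = new Regex(@"^(&#\d{2};)+$");

    // Unanchored: finds each &#NN; sequence wherever it occurs.
    private static readonly Regex SingleEntity = new Regex(@"&#\d{2};");

    public static bool IsValid(string input)
    {
        return WholeString.IsMatch(input);
    }

    public static int CountEntities(string input)
    {
        return SingleEntity.Matches(input).Count;
    }
}
```

With "&#12;&#34;" the unanchored pattern finds 2 matches and IsValid returns true. With "&#12;&#3" it still finds 1 match, but IsValid returns false because the trailing "&#3" is not covered. Note that $ in .NET also matches just before a final newline; \z is the stricter choice if that matters.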
Here you go: (&#\d{2};)+ should work for one or more occurrences, and
(&#\d{2};)* for zero or more.
Recommend: http://www.weitz.de/regex-coach/
I have a Visual Studio 2008 C# .NET 3.5 application where I need to parse a macro.
Given a serial number that is N digits long, and a macro like %SERIALNUMBER3%, I would like this parse method to return only the first 3 digits of the serial number.
string serialnumber = "123456789";
string macro = "%SERIALNUMBER3%";
string parsed = SomeParseMethod(serialnumber, macro);
// parsed == "123"
Given %SERIALNUMBER7%, return the first 7 digits, etc.
I can do this using String.IndexOf and some complexity, but I wondered if there was a simple method. Maybe using a Regex replace.
What's the simplest method of doing this?
var str = "%SERIALNUMBER3%";
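// Letters only for the token: a greedy \w+ would also swallow the leading digits of a multi-digit count.
var reg = new Regex(@"%([A-Za-z]+)(\d+)%");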
var match = reg.Match( str );
if( match.Success )
{
string token = match.Groups[1].Value;
int numDigits = int.Parse( match.Groups[2].Value );
}
Use the Regex class. Your expression will be something like:
@"%([A-Za-z]+)(\d+)%"
Your first capture group is the ID (in this case, "SERIALNUMBER"), and your second capture group is the number of digits (in this case, "3").
Very quick and dirty example:
static void Main(string[] args)
{
string serialnumber = "123456789";
string macro = "%SERIALNUMBER3%";
var match = Regex.Match(macro, @"\d+");
string parsed = serialnumber.Substring(0, int.Parse(match.ToString()));
}
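Since the question mentions a Regex replace, here is one way it could look, using the Regex.Replace overload that takes a MatchEvaluator. This is a sketch under assumptions: SerialMacro and ExpandSerialMacro are hypothetical names, only the %SERIALNUMBERn% macro is handled, and a count larger than the serial number is clamped rather than treated as an error.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SerialMacro
{
    // Replaces every %SERIALNUMBERn% in `text` with the first n characters
    // of `serialNumber` (or the whole serial number if n is too large).
    public static string ExpandSerialMacro(string serialNumber, string text)
    {
        return Regex.Replace(text, @"%SERIALNUMBER(\d+)%", m =>
        {
            int count = int.Parse(m.Groups[1].Value);
            return serialNumber.Substring(0, Math.Min(count, serialNumber.Length));
        });
    }
}
```

ExpandSerialMacro("123456789", "%SERIALNUMBER3%") gives "123", and the same call works when the macro is embedded in longer text, e.g. "SN-%SERIALNUMBER2%-X" becomes "SN-12-X".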
If I have various strings that have text followed by whitespace followed by text, how can I parse the substring beginning with the first character in the second block of text?
For example:
If I have the string:
"stringA stringB"
How can I extract the substring
"stringB"
The strings are of various lengths but will all be of the format described above (text, whitespace, text).
I'm sure this can be easily done with regex but I'm having trouble finding the proper syntax for C#.
No RegEx needed, just split it.
var test = "stringA stringB";
var second = test.Split()[1];
and if you are in the wonderful LINQ-land
var second = "string1 string2".Split().ElementAtOrDefault(1);
and with RegEx (for completeness)
var str2 = Regex.Match("str1 str2", @"\w (.*$)").Groups[1].Value;
Use string.Split():
var test = "stringA stringB";
var elements = test.Split(new[]
{
' '
});
var desiredItem = elements.ElementAtOrDefault(1);
If you want to split on all whitespace characters (MSDN tells us more):
var test = "stringA stringB";
//var elements = test.Split(); // pseudo overload
var elements = test.Split(null); // correct overload
var desiredItem = elements.ElementAtOrDefault(1);
Edit:
Why pseudo-overload?
.Split() gets compiled to .Split(new char[0]), which is not documented in MSDN.
If all strings are separated by a whitespace you don't need a regex here. You could just use the Split() method:
string[] result = { };
string myStrings = "stringA stringB stringC";
result = myStrings.Split(' ');
You don't even need Split(). I think a simple IndexOf/Substring will do the job.
var input = "A B";
var result = string.Empty;
var index = input.IndexOf(' ');
if (index >= 0)
{
result = input.Substring(index + 1);
}
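The answers above assume a single space between the two blocks. If the gap can be several spaces or tabs, a sketch along these lines may be more forgiving. SecondWord and SecondToken are hypothetical names; the code relies on the documented behaviour that a null separator array makes Split use whitespace as the delimiter.

```csharp
using System;

public static class SecondWord
{
    // Returns the second whitespace-separated token, or null if there is none.
    public static string SecondToken(string input)
    {
        // A null char[] separator means "split on whitespace"; RemoveEmptyEntries
        // collapses runs of spaces or tabs into a single break.
        string[] parts = input.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        return parts.Length > 1 ? parts[1] : null;
    }
}
```

SecondToken("stringA stringB") and SecondToken("stringA   \t stringB") both return "stringB", while SecondToken("single") returns null.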
What's the easiest way to parse a string and extract a number and a letter? I have a string that can be in one of the following formats (number|letter or letter|number), e.g. "10A", "B5", "C10", "1G", etc.
I need to extract the 2 parts, i.e. "10A" -> "10" and "A".
Update: Thanks to everyone for all the excellent answers
Easiest way is probably to use regular expressions.
((?<number>\d+)(?<letter>[a-zA-Z])|(?<letter>[a-zA-Z])(?<number>\d+))
You can then match it with your string and extract the value from the groups.
var regex = new Regex(@"((?<number>\d+)(?<letter>[a-zA-Z])|(?<letter>[a-zA-Z])(?<number>\d+))");
Match match = regex.Match("10A");
string letter = match.Groups["letter"].Value;
int number = int.Parse(match.Groups["number"].Value);
The easiest and fastest is to use simple string operations. Use the IsDigit method to check where the letter is, and parse the rest of the string to a number:
char letter = str[0];
int index = 1;
if (Char.IsDigit(letter)) {
letter = str[str.Length - 1];
index = 0;
}
int number = int.Parse(str.Substring(index, str.Length - 1));
Just to be different:
string number = input.Trim("ABCDEFGHIJKLMNOPQRSTUVWXYZ".ToCharArray());
string letter = input.Trim("0123456789".ToCharArray());
char letter = str.Single(c => char.IsLetter(c));
int num = int.Parse(new string(str.Where(c => char.IsDigit(c)).ToArray()));
This solution is not terribly strict (it would allow things like "5A2" and return 'A' and 52) but it may be fine for your purposes.
Here is how I would approach this. You can step through this and put gc1["letter"], gc1["number"], gc2["letter"], and gc2["number"] in the watch window to see that it worked (step just past the last line of code here, of course).
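The regular expression accepts either pattern, requiring one or more letters and one or more digits in each case. The alternation is wrapped in a non-capturing group so that ^ and $ apply to both branches.
Regex pattern = new Regex("^(?:(?<letter>[a-zA-Z]+)(?<number>[0-9]+)|(?<number>[0-9]+)(?<letter>[a-zA-Z]+))$");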
string s1 = "12A";
string s2 = "B45";
Match m1 = pattern.Match(s1);
Match m2 = pattern.Match(s2);
GroupCollection gc1 = m1.Groups;
GroupCollection gc2 = m2.Groups;
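The named-group approach can be wrapped in a small TryParse-style helper so callers do not have to touch the Match object. This is a sketch, assuming exactly one letter and at least one digit as in the question's examples; TagParser and TryParseTag are hypothetical names.

```csharp
using System.Text.RegularExpressions;

public static class TagParser
{
    // One letter plus one or more digits, in either order, covering the whole input.
    // .NET allows the same group name in both branches of the alternation.
    private static readonly Regex TagPattern = new Regex(
        @"^(?:(?<number>[0-9]+)(?<letter>[A-Za-z])|(?<letter>[A-Za-z])(?<number>[0-9]+))$");

    public static bool TryParseTag(string input, out int number, out char letter)
    {
        number = 0;
        letter = '\0';

        Match m = TagPattern.Match(input);
        if (!m.Success)
            return false;

        letter = m.Groups["letter"].Value[0];
        // int.TryParse also rejects digit runs too long for an int.
        return int.TryParse(m.Groups["number"].Value, out number);
    }
}
```

TryParseTag("10A", out n, out l) yields 10 and 'A', "B5" yields 5 and 'B', and a loose input such as "5A2" is rejected.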
Using Sprache and some Linq kung-fu:
var tagParser =
from a in Parse.Number.Or(Parse.Letter.Once().Text())
from b in Parse.Letter.Once().Text().Or(Parse.Number)
select char.IsDigit(a[0]) ?
new{Number=a, Letter=b} : new{Number=b, Letter=a};
var tag1 = tagParser.Parse("10A");
var tag2 = tagParser.Parse("A10");
Console.WriteLine(tag1.Letter); // should be A
Console.WriteLine(tag1.Number); // should be 10
Console.WriteLine(tag2.Letter); // should be A
Console.WriteLine(tag2.Number); // should be 10
/* Output:
A
10
A
10
*/