Regular expression to split string and number - c#

I have a string of the form:
codename123
Is there a regular expression that can be used with Regex.Split() to split the alphabetic part and the numeric part into a two-element string array?

I know you asked for the Split method, but as an alternative you could use named capturing groups:
var numAlpha = new Regex("(?<Alpha>[a-zA-Z]*)(?<Numeric>[0-9]*)");
var match = numAlpha.Match("codename123");
var alpha = match.Groups["Alpha"].Value;
var num = match.Groups["Numeric"].Value;

splitArray = Regex.Split("codename123", #"(?<=\p{L})(?=\p{N})");
will split between a Unicode letter and a Unicode digit.

Regex is a little heavy handed for this, if your string is always of that form. You could use
"codename123".IndexOfAny(new char[] {'1','2','3','4','5','6','7','8','9','0'})
and two calls to Substring.

A little verbose, but
Regex.Split( "codename123", #"(?<=[a-zA-Z])(?=\d)" );
Can you be more specific about your requirements? Maybe a few other input examples.

IMO, it would be a lot easier to find matches, like:
Regex.Matches("codename123", #"[a-zA-Z]+|\d+")
.Cast<Match>()
.Select(m => m.Value)
.ToArray();
rather than to use Regex.Split.

Well, is a one-line only: Regex.Split("codename123", "^([a-z]+)");

Another simpler way is
string originalstring = "codename123";
string alphabets = string.empty;
string numbers = string.empty;
foreach (char item in mainstring)
{
if (Char.IsLetter(item))
alphabets += item;
if (Char.IsNumber(item))
numbers += item;
}

this code is written in java/logic should be same elsewhere
public String splitStringAndNumber(String string) {
String pattern = "(?<Alpha>[a-zA-Z]*)(?<Numeric>[0-9]*)";
Pattern p = Pattern.compile(pattern);
Matcher m = p.matcher(string);
if (m.find()) {
return (m.group(1) + " " + m.group(2));
}
return "";
}

Related

c# trying to change first letter to uppercase but doesn't work

I have to convert the first letter of every word the user inputs into uppercase. I don't think I'm doing it right so it doesn't work but I'm not sure where has gone wrong D: Thank you in advance for your help! ^^
static void Main(string[] args)
{
Console.Write("Enter anything: ");
string x = Console.ReadLine();
string pattern = "^";
Regex expression = new Regex(pattern);
var regexp = new System.Text.RegularExpressions.Regex(pattern);
Match result = expression.Match(x);
Console.WriteLine(x);
foreach(var match in x)
{
Console.Write(match);
}
Console.WriteLine();
}
If your exercise isn't regex operations, there are built-in utilities to do what you are asking:
System.Globalization.TextInfo ti = System.Globalization.CultureInfo.CurrentCulture.TextInfo;
string titleString = ti.ToTitleCase("this string will be title cased");
Console.WriteLine(titleString);
Prints:
This String Will Be Title Cased
If you operation is for regex, see this previous StackOverflow answer: Sublime Text: Regex to convert Uppercase to Title Case?
First of all, your Regex "^" matches the start of a line. If you need to match each word in a multi-word line, you'll need a different Regex, e.g. "[A-Za-z]".
You're also not doing anything to actually change the first letter to upper case. Note that strings in C# are immutable (they cannot be changed after creation), so you will need to create a new string which consists of the first letter of the original string, upper cased, followed by the rest of the string. Give that part a try on your own. If you have trouble, post a new question with your attempt.
string pattern = "(?:^|(?<= ))(.)"
^ doesnt capture anything by itself.You can replace by uppercase letters by applying function to $1.See demo.
https://regex101.com/r/uE3cC4/29
I would approach this using Model Extensions.
PHP has a nice method called ucfirst.
So I translated that into C#
public static string UcFirst(this string s)
{
var stringArr = s.ToCharArray(0, s.Length);
var char1ToUpper = char.Parse(stringArr[0]
.ToString()
.ToUpper());
stringArr[0] = char1ToUpper;
return string.Join("", stringArr);
}
Usage:
[Test]
public void UcFirst()
{
string s = "john";
s = s.UcFirst();
Assert.AreEqual("John", s);
}
Obviously you would still have to split your sentence into a list and call UcFirst for each item in the list.
Google C# Model Extensions if you need help with what is going on.
One more way to do it with regex:
string input = "this string will be title cased, even if there are.cases.like.that";
string output = Regex.Replace(input, #"(?<!\w)\w", m => m.Value.ToUpper());
I hope this may help
public static string CapsFirstLetter(string inputValue)
{
char[] values = new char[inputValue.Length];
int count = 0;
foreach (char f in inputValue){
if (count == 0){
values[count] = Convert.ToChar(f.ToString().ToUpper());
}
else{
values[count] = f;
}
count++;
}
return new string(values);
}

Get non numeric values from string value c#

I have value like below
string value = "11,.Ad23";
int n;
bool isNumeric = int.TryParse(value, out n);
I control if string is numeric or not.If string is not numeric and has non numeric i need to get non numeric values as below
Result must be as below
,.Ad
How can i do this in c# ?
If it doesn't matter if the non-digits are consecutive, it's simple:
string nonNumericValue = string.Concat(value.Where(c => !Char.IsDigit(c)));
Online Demo: http://ideone.com/croMht
If you use .NET 3.5. as mentioned in the comment there was no overload of String.Concat (or String.Join as in Dmytris answer) that takes an IEnumerable<string>, so you need to create an array:
string nonNumericValue = string.Concat(value.Where(c => !Char.IsDigit(c)).ToArray());
That takes all non-digits. If you instead want to take the middle part, so skip the digits, then take all until the the next digits:
string nonNumericValue = string.Concat(value.SkipWhile(Char.IsDigit)
.TakeWhile(c => !Char.IsDigit(c)));
Regular expression solution (glue together all non-numeric values):
String source = "11,.Ad23";
String result = String.Join("", Regex
.Matches(source, #"\D{1}")
.OfType<Match>()
.Select(item => item.Value));
Edit: it seems that you use and old version of .Net, in that case you can use straightforward code without RegEx, Linq etc:
String source = "11,.Ad23";
StringBuilder sb = new StringBuilder(source.Length);
foreach (Char ch in source)
if (!Char.IsDigit(ch))
sb.Append(ch);
String result = sb.ToString();
Although I like the solution proposed I think a more efficent way would be using regular expressions such as
[^\D]
Which called as
var regex = new Regex(#"[^\D]");
var nonNumeric = regex.Replace("11,.Ad23", ""));
Which returns:
,.Ad
Would a LINQ solution work for you?
string value = "11,.Ad23";
var result = new string(value.Where(x => !char.IsDigit(x)).ToArray());

Substring or split word in clamp

I need an solution for my problem. I have a clause like:
Hello guys I am cool (test)
An now I need an effective method to split just only the part in the parentheses and the result should be:
test
My try is to split the String in words like. But I don't think it is the best way.
string[] words = s.Split(' ');
I do not think that split is the solution to your problem
Regex is very good for extracting data.
using System.Text.RegularExpression;
...
string result = Regex.Match(s, #"\((.*?)\)").Groups[1].Value;
This should do the trick.
Assuming:
var input = "Hello guys I am cool (test)";
..Non-Regex version:
var nonRegex = input.Substring(input.IndexOf('(') + 1, input.LastIndexOf(')') - (input.IndexOf('(') + 1));
..Regex version:
var regex = Regex.Match(input, #"\((\w+)\)").Groups[1].Value;
You can use regex for this:
string parenthesized = Regex.Match(s, #"(?<=\()[^)]+(?=\))").Value;
Here's an explanation of the various parts of the regex pattern:
(?<=\(): Lookbehind for the ( (excluded from the match)
[^)]+: Sequence of characters consisting of anything except )
(?=\)): Lookahead for the ) (excluded from the match)
The most efficient way is using string methods but you don't need Split but Substring and IndexOf. Note that this currently just finds a single word in parentheses:
string text = "Hello guys I am cool (test)";
string result = "--no parentheses--";
int index = text.IndexOf('(');
if(index++ >= 0) // ++ used to look behind ( which is a single character
{
int endIndex = text.IndexOf(')', index);
if(endIndex >= 0)
{
result = text.Substring(index, endIndex - index);
}
}
string s = "Hello guys I am cool (test)";
var result = s.Substring(s.IndexOf("test"), 4);

Is there a better way to create acronym from upper letters in C#?

What is the best way to create acronym from upper letters in C#?
Example:
Alfa_BetaGameDelta_Epsilon
Expected result:
ABGDE
My solution works, but it's not nice
var classNameAbbreviationRegex = new Regex("[A-Z]+", RegexOptions.Compiled);
var matches = classNameAbbreviationRegex.Matches(enumTypeName);
var letters = new string[matches.Count];
for (var i = 0; i < matches.Count; i++)
{
letters[i] = matches[i].Value;
}
var abbreviation = string.Join(string.Empty, letters);
string.Join("", s.Where(char.IsUpper));
string.Join("", s.Where(x => char.IsUpper(x))
string test = "Alfa_BetaGameDelta_Epsilon";
string result = string.Concat(test.Where(char.IsUpper));
You can use the Where method to filter out the upper case characters, and the Char.IsUpper method can be used as a delegate directly without a lambda expression. You can create the resulting string from an array of characters:
string abbreviation = new String(enumTypeName.Where(Char.IsUpper).ToArray());
By using MORE regexes :-)
var ac = string.Join(string.Empty,
Regex.Match("Alfa_BetaGameDelta_Epsilon",
"(?:([A-Z]+)(?:[^A-Z]*))*")
.Groups[1]
.Captures
.Cast<Capture>()
.Select(p => p.Value));
More regexes are always the solution, expecially with LINQ! :-)
The regex puts all the [A-Z] in capture group 1 (because all the other () are non-capturing group (?:)) and "skips" all the non [A-Z] ([^A-Z]) by putting them in a non-capturing group. This is done 0-infinite times by the last *. Then a little LINQ to select the value of each capture .Select(p => p.Value) and the string.Join to join them.
Note that this isn't Unicode friendly... ÀÈÌÒÙ will be ignored. A better regex would use #"(?:(\p{Lu}+)(?:[^\p{Lu}]*))*" where \p{Lu} is the Unicode category UppercaseLetter.
(yes, this is useless... The other methods that use LINQ + IsUpper are better :-) but the whole example was built just to show the problems of Regexes with Unicode)
MUCH EASIER:
var ac = Regex.Replace("Alfa_BetaGameDelta_Epsilon", #"[^\p{Lu}]", string.Empty);
simply remove all the non-uppercase letters :-)
var str = "Alfa_BetaGammaDelta_Epsilon";
var abbreviation = string.Join(string.Empty, str.Where(c => c.IsUpper()));

Is there a method for removing whitespace characters from a string?

Is there a string class member function (or something else) for removing all spaces from a string? Something like Python's str.strip() ?
You could simply do:
myString = myString.Replace(" ", "");
If you want to remove all white space characters you could use Linq, even if the syntax is not very appealing for this use case:
myString = new string(myString.Where(c => !char.IsWhiteSpace(c)).ToArray());
String.Trim method removes trailing and leading white spaces. It is the functional equivalent of Python's strip method.
LINQ feels like overkill here, converting a string to a list, filtering the list, then turning it back onto a string. For removal of all white space, I would go for a regular expression. Regex.Replace(s, #"\s", ""). This is a common idiom and has probably been optimized.
If you want to remove the spaces that prepend the string or at itt's end, you might want to have a look at TrimStart() and TrimEnd() and Trim().
If you're looking to replace all whitespace in a string (not just leading and trailing whitespace) based on .NET's determination of what's whitespace or not, you could use a pretty simple LINQ query to make it work.
string whitespaceStripped = new string((from char c in someString
where !char.IsWhiteSpace(c)
select c).ToArray());
Yes, Trim.
String a = "blabla ";
var b = a.Trim(); // or TrimEnd or TrimStart
Yes, String.Trim().
var result = " a b ".Trim();
gives "a b" in result. By default all whitespace is trimmed. If you want to remove only space you need to type
var result = " a b ".Trim(' ');
If you want to remove all spaces in a string you can use string.Replace().
var result = " a b ".Replace(" ", "");
gives "ab" in result. But that is not equivalent to str.strip() in Python.
I don't know much about Python...
IF the str.strip() just removes whitespace at the start and the end then you could use str = str.Trim() in .NET... otherwise you could just str = str.Replace ( " ", "") for removing all spaces.
IF it removes all whitespace then use
str = (from c in str where !char.IsWhiteSpace(c) select c).ToString()
There are many diffrent ways, some faster then others:
public static string StripTabsAndNewlines(this string s) {
//string builder (fast)
StringBuilder sb = new StringBuilder();
for (int i = 0; i < str.Length; i++) {
if ( ! Char.IsWhiteSpace(s[i])) {
sb.Append();
}
}
return sb.tostring();
//linq (faster ?)
return new string(input.ToCharArray().Where(c => !Char.IsWhiteSpace(c)).ToArray());
//regex (slow)
return Regex.Replace(s, #"\s+", "")
}
you could use
StringVariable.Replace(" ","")
I'm surprised no one mentioned this:
String.Join("", " all manner\tof\ndifferent\twhite spaces!\n".Split())
string.Split by default splits along the characters that are char.IsWhiteSpace so this is a very similar solution to filtering those characters out by the direct use of char.IsWhiteSpace and it's a one-liner that works in pre-LINQ environments as well.
Strip spaces? Strip whitespaces? Why should it matter? It only matters if we're searching for an existing implementation, but let's not forget how fun it is to program the solution rather than search MSDN (boring).
You should be able to strip any chars from any string by using 1 of the 2 functions below.
You can remove any chars like this
static string RemoveCharsFromString(string textChars, string removeChars)
{
string tempResult = "";
foreach (char c in textChars)
{
if (!removeChars.Contains(c))
{
tempResult = tempResult + c;
}
}
return tempResult;
}
or you can enforce a character set (so to speak) like this
static string EnforceCharLimitation(string textChars, string allowChars)
{
string tempResult = "";
foreach (char c in textChars)
{
if (allowChars.Contains(c))
{
tempResult = tempResult + c;
}
}
return tempResult;
}

Categories