Removing all non letter characters from a string in C#

Removing all non letter characters from a string in C# - c#

I want to remove all non letter characters from a string. When I say all letters I mean anything that isn't in the alphabet, or an apostrophe. This is the code I have.
public static string RemoveBadChars(string word)
{
char[] chars = new char[word.Length];
for (int i = 0; i < word.Length; i++)
{
char c = word[i];
if ((int)c >= 65 && (int)c <= 90)
{
chars[i] = c;
}
else if ((int)c >= 97 && (int)c <= 122)
{
chars[i] = c;
}
else if ((int)c == 44)
{
chars[i] = c;
}
}
word = new string(chars);
return word;
}
It's close, but doesn't quite work. The problem is this:
[in]: "(the"
[out]: " the"
It gives me a space there instead of the "(". I want to remove the character entirely.

The Char class has a method that could help out. Use Char.IsLetter() to detect valid letters (and an additional check for the apostrophe), then pass the result to the string constructor:
var input = "(the;':";
var result = new string(input.Where(c => Char.IsLetter(c) || c == '\'').ToArray());
Output:
the'

You should use Regular Expression (Regex) instead.
public static string RemoveBadChars(string word)
{
Regex reg = new Regex("[^a-zA-Z']");
return reg.Replace(word, string.Empty);
}
If you don't want to replace spaces:
Regex reg = new Regex("[^a-zA-Z' ]");

A regular expression would be better as this is pretty inefficient, but to answer your question, the problem with your code is that you should use a different variable other than i inside your for loop. So, something like this:
public static string RemoveBadChars(string word)
{
char[] chars = new char[word.Length];
int myindex=0;
for (int i = 0; i < word.Length; i++)
{
char c = word[i];
if ((int)c >= 65 && (int)c <= 90)
{
chars[myindex] = c;
myindex++;
}
else if ((int)c >= 97 && (int)c <= 122)
{
chars[myindex] = c;
myindex++;
}
else if ((int)c == 44)
{
chars[myindex] = c;
myindex++;
}
}
word = new string(chars);
return word;
}

private static Regex badChars = new Regex("[^A-Za-z']");
public static string RemoveBadChars(string word)
{
return badChars.Replace(word, "");
}
This creates a Regular Expression that consists of a character class (enclosed in square brackets) that looks for anything that is not (the leading ^ inside the character class) A-Z, a-z, or '. It then defines a function that replaces anything that matches the expression with an empty string.

This is the working answer, he says he want to remove none-letters chars
public static string RemoveNoneLetterChars(string word)
{
Regex reg = new Regex(#"\W");
return reg.Replace(word, " "); // or return reg.Replace(word, String.Empty);
}

word.Aggregate(new StringBuilder(word.Length), (acc, c) => acc.Append(Char.IsLetter(c) ? c.ToString() : "")).ToString();
Or you can substitute whatever function in place of IsLetter.

Related

How can i Split words inside this .txt in c# [duplicate]

How to split text into words?
Example text:
'Oh, you can't help that,' said the Cat: 'we're all mad here. I'm mad. You're mad.'
The words in that line are:
Oh
you
can't
help
that
said
the
Cat
we're
all
mad
here
I'm
mad
You're
mad

Split text on whitespace, then trim punctuation.
var text = "'Oh, you can't help that,' said the Cat: 'we're all mad here. I'm mad. You're mad.'";
var punctuation = text.Where(Char.IsPunctuation).Distinct().ToArray();
var words = text.Split().Select(x => x.Trim(punctuation));
Agrees exactly with example.

First, Remove all special characeters:
var fixedInput = Regex.Replace(input, "[^a-zA-Z0-9% ._]", string.Empty);
// This regex doesn't support apostrophe so the extension method is better
Then split it:
var split = fixedInput.Split(' ');
For a simpler C# solution for removing special characters (that you can easily change), add this extension method (I added a support for an apostrophe):
public static string RemoveSpecialCharacters(this string str) {
var sb = new StringBuilder();
foreach (char c in str) {
if ((c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '\'' || c == ' ') {
sb.Append(c);
}
}
return sb.ToString();
}
Then use it like so:
var words = input.RemoveSpecialCharacters().Split(' ');
You'll be surprised to know that this extension method is very efficient (surely much more efficient then the Regex) so I'll suggest you use it ;)
Update
I agree that this is an English only approach but to make it Unicode compatible all you have to do is replace:
(c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
With:
char.IsLetter(c)
Which supports Unicode, .Net Also offers you char.IsSymbol and char.IsLetterOrDigit for the variety of cases

Just to add a variation on #Adam Fridental's answer which is very good, you could try this Regex:
var text = "'Oh, you can't help that,' said the Cat: 'we're all mad here. I'm mad. You're mad.'";
var matches = Regex.Matches(text, #"\w+[^\s]*\w+|\w");
foreach (Match match in matches) {
var word = match.Value;
}
I believe this is the shortest RegEx that will get all the words
\w+[^\s]*\w+|\w

If you don't want to use a Regex object, you could do something like...
string mystring="Oh, you can't help that,' said the Cat: 'we're all mad here. I'm mad. You're mad.";
List<string> words=mystring.Replace(",","").Replace(":","").Replace(".","").Split(" ").ToList();
You'll still have to handle the trailing apostrophe at the end of "that,'"

This is one of solution, i dont use any helper class or method.
public static List<string> ExtractChars(string inputString) {
var result = new List<string>();
int startIndex = -1;
for (int i = 0; i < inputString.Length; i++) {
var character = inputString[i];
if ((character >= 'a' && character <= 'z') ||
(character >= 'A' && character <= 'Z')) {
if (startIndex == -1) {
startIndex = i;
}
if (i == inputString.Length - 1) {
result.Add(GetString(inputString, startIndex, i));
}
continue;
}
if (startIndex != -1) {
result.Add(GetString(inputString, startIndex, i - 1));
startIndex = -1;
}
}
return result;
}
public static string GetString(string inputString, int startIndex, int endIndex) {
string result = "";
for (int i = startIndex; i <= endIndex; i++) {
result += inputString[i];
}
return result;
}

If you want to use the "for cycle" to check each char and save all punctuation in the input string I've create this class. The method GetSplitSentence() return a list of SentenceSplitResult. In this list there are saved all the words and all the punctuation & numbers. Each punctuation or numbers saved is an item in the list. The sentenceSplitResult.isAWord is used to check if is a word or not. [Sorry for my English]
public class SentenceSplitResult
{
public string word;
public bool isAWord;
}
public class StringsHelper
{
private readonly List<SentenceSplitResult> outputList = new List<SentenceSplitResult>();
private readonly string input;
public StringsHelper(string input)
{
this.input = input;
}
public List<SentenceSplitResult> GetSplitSentence()
{
StringBuilder sb = new StringBuilder();
try
{
if (String.IsNullOrEmpty(input)) {
Logger.Log(new ArgumentNullException(), "GetSplitSentence - input is null or empy");
return outputList;
}
bool isAletter = IsAValidLetter(input[0]);
// Each char i checked if is a part of a word.
// If is YES > I can store the char for later
// IF is NO > I Save the word (if exist) and then save the punctuation
foreach (var _char in input)
{
isAletter = IsAValidLetter(_char);
if (isAletter == true)
{
sb.Append(_char);
}
else
{
SaveWord(sb.ToString());
sb.Clear();
SaveANotWord(_char);
}
}
SaveWord(sb.ToString());
}
catch (Exception ex)
{
Logger.Log(ex);
}
return outputList;
}
private static bool IsAValidLetter(char _char)
{
if ((Char.IsPunctuation(_char) == true) || (_char == ' ') || (Char.IsNumber(_char) == true))
{
return false;
}
return true;
}
private void SaveWord(string word)
{
if (String.IsNullOrEmpty(word) == false)
{
outputList.Add(new SentenceSplitResult()
{
isAWord = true,
word = word
});
}
}
private void SaveANotWord(char _char)
{
outputList.Add(new SentenceSplitResult()
{
isAWord = false,
word = _char.ToString()
});
}

You could try using a regex to remove the apostrophes that aren't surrounded by letters (i.e. single quotes) and then using the Char static methods to strip all the other characters. By calling the regex first you can keep the contraction apostrophes (e.g. can't) but remove the single quotes like in 'Oh.
string myText = "'Oh, you can't help that,' said the Cat: 'we're all mad here. I'm mad. You're mad.'";
Regex reg = new Regex("\b[\"']\b");
myText = reg.Replace(myText, "");
string[] listOfWords = RemoveCharacters(myText);
public string[] RemoveCharacters(string input)
{
StringBuilder sb = new StringBuilder();
foreach (char c in input)
{
if (Char.IsLetter(c) || Char.IsWhiteSpace(c) || c == '\'')
sb.Append(c);
}
return sb.ToString().Split(' ');
}

Replace character in string with Uppercase of next in line (Pascal Casing)

I want to remove all underscores from a string with the uppercase of the character following the underscore. So for example: _my_string_ becomes: MyString similarly: my_string becomes MyString
Is there a simpler way to do it? I currently have the following (assuming no input has two consecutive underscores):
StringBuilder sb = new StringBuilder();
int i;
for (i = 0; i < input.Length - 1; i++)
{
if (input[i] == '_')
sb.Append(char.ToUpper(input[++i]));
else if (i == 0)
sb.Append(char.ToUpper(input[i]));
else
sb.Append(input[i]);
}
if (i < input.Length && input[i] != '_')
sb.Append(input[i]);
return sb.ToString();
Now I know this is not totally related, but I thought to run some numbers on the implementations provided in the answers, and here are the results in Milliseconds for each implementation using 1000000 iterations of the string: "_my_string_121_a_" :
Achilles: 313
Raj: 870
Damian: 7916
Dmitry: 5380
Equalsk: 574
method utilised:
Stopwatch stp = new Stopwatch();
stp.Start();
for (int i = 0; i < 1000000; i++)
{
sb = Test("_my_string_121_a_");
}
stp.Stop();
long timeConsumed= stp.ElapsedMilliseconds;
In the end I think I'll go with Raj's implementation, because it's just very simple and easy to understand.

This must do it using ToTitleCase using System.Globalization namespace
static string toCamel(string input)
{
TextInfo info = CultureInfo.CurrentCulture.TextInfo;
input= info.ToTitleCase(input).Replace("_", string.Empty);
return input;
}

Shorter (regular expressions), but I doubt if it's better (regular expressions are less readable):
string source = "_my_string_123__A_";
// MyString123A
string result = Regex
// _ + lower case Letter -> upper case letter (thanks to Wiktor Stribiżew)
.Replace(source, #"(_+|^)(\p{Ll})?", match => match.Groups[2].Value.ToUpper())
// all the other _ should be just removed
.Replace("_", "");

Loops over each character and converts to uppercase as necessary.
public string GetNewString(string input)
{
var convert = false;
var sb = new StringBuilder();
foreach (var c in input)
{
if (c == '_')
{
convert = true;
continue;
}
if (convert)
{
sb.Append(char.ToUpper(c));
convert = false;
continue;
}
sb.Append(c);
}
return sb.ToString().First().ToString().ToUpper() + sb.ToString().Substring(1);
}
Usage:
GetNewString("my_string");
GetNewString("___this_is_anewstring_");
GetNewString("___this_is_123new34tring_");
Output:
MyString
ThisIsAnewstring
ThisIs123new34tring

Try with Regex:
var regex = new Regex("^[a-z]|_[a-z]?");
var result = regex.Replace("my_string_1234", x => x.Value== "_" ? "" : x.Value.Last().ToString().ToUpper());
Tested with:
my_string -> MyString
_my_string -> MyString
_my_string_ -> MyString

You need convert snake case to camel case, You can use this code it's working for me
var x ="_my_string_".Split(new[] {"_"}, StringSplitOptions.RemoveEmptyEntries)
.Select(s => char.ToUpperInvariant(s[0]) + s.Substring(1, s.Length - 1))
.Aggregate(string.Empty, (s1, s2) => s1 + s2);
x = MyString

static string toCamel(string input)
{
StringBuilder sb = new StringBuilder();
int i;
for (i = 0; i < input.Length; i++)
{
if ((i == 0) || (i > 0 && input[i - 1] == '_'))
sb.Append(char.ToUpper(input[i]));
else
sb.Append(char.ToLower(input[i]));
}
return sb.ToString();
}

How to split text into words?

How to split text into words?
Example text:
'Oh, you can't help that,' said the Cat: 'we're all mad here. I'm mad. You're mad.'
The words in that line are:
Oh
you
can't
help
that
said
the
Cat
we're
all
mad
here
I'm
mad
You're
mad

Split text on whitespace, then trim punctuation.
var text = "'Oh, you can't help that,' said the Cat: 'we're all mad here. I'm mad. You're mad.'";
var punctuation = text.Where(Char.IsPunctuation).Distinct().ToArray();
var words = text.Split().Select(x => x.Trim(punctuation));
Agrees exactly with example.

First, Remove all special characeters:
var fixedInput = Regex.Replace(input, "[^a-zA-Z0-9% ._]", string.Empty);
// This regex doesn't support apostrophe so the extension method is better
Then split it:
var split = fixedInput.Split(' ');
For a simpler C# solution for removing special characters (that you can easily change), add this extension method (I added a support for an apostrophe):
public static string RemoveSpecialCharacters(this string str) {
var sb = new StringBuilder();
foreach (char c in str) {
if ((c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '\'' || c == ' ') {
sb.Append(c);
}
}
return sb.ToString();
}
Then use it like so:
var words = input.RemoveSpecialCharacters().Split(' ');
You'll be surprised to know that this extension method is very efficient (surely much more efficient then the Regex) so I'll suggest you use it ;)
Update
I agree that this is an English only approach but to make it Unicode compatible all you have to do is replace:
(c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
With:
char.IsLetter(c)
Which supports Unicode, .Net Also offers you char.IsSymbol and char.IsLetterOrDigit for the variety of cases

Just to add a variation on #Adam Fridental's answer which is very good, you could try this Regex:
var text = "'Oh, you can't help that,' said the Cat: 'we're all mad here. I'm mad. You're mad.'";
var matches = Regex.Matches(text, #"\w+[^\s]*\w+|\w");
foreach (Match match in matches) {
var word = match.Value;
}
I believe this is the shortest RegEx that will get all the words
\w+[^\s]*\w+|\w

If you don't want to use a Regex object, you could do something like...
string mystring="Oh, you can't help that,' said the Cat: 'we're all mad here. I'm mad. You're mad.";
List<string> words=mystring.Replace(",","").Replace(":","").Replace(".","").Split(" ").ToList();
You'll still have to handle the trailing apostrophe at the end of "that,'"

This is one of solution, i dont use any helper class or method.
public static List<string> ExtractChars(string inputString) {
var result = new List<string>();
int startIndex = -1;
for (int i = 0; i < inputString.Length; i++) {
var character = inputString[i];
if ((character >= 'a' && character <= 'z') ||
(character >= 'A' && character <= 'Z')) {
if (startIndex == -1) {
startIndex = i;
}
if (i == inputString.Length - 1) {
result.Add(GetString(inputString, startIndex, i));
}
continue;
}
if (startIndex != -1) {
result.Add(GetString(inputString, startIndex, i - 1));
startIndex = -1;
}
}
return result;
}
public static string GetString(string inputString, int startIndex, int endIndex) {
string result = "";
for (int i = startIndex; i <= endIndex; i++) {
result += inputString[i];
}
return result;
}

If you want to use the "for cycle" to check each char and save all punctuation in the input string I've create this class. The method GetSplitSentence() return a list of SentenceSplitResult. In this list there are saved all the words and all the punctuation & numbers. Each punctuation or numbers saved is an item in the list. The sentenceSplitResult.isAWord is used to check if is a word or not. [Sorry for my English]
public class SentenceSplitResult
{
public string word;
public bool isAWord;
}
public class StringsHelper
{
private readonly List<SentenceSplitResult> outputList = new List<SentenceSplitResult>();
private readonly string input;
public StringsHelper(string input)
{
this.input = input;
}
public List<SentenceSplitResult> GetSplitSentence()
{
StringBuilder sb = new StringBuilder();
try
{
if (String.IsNullOrEmpty(input)) {
Logger.Log(new ArgumentNullException(), "GetSplitSentence - input is null or empy");
return outputList;
}
bool isAletter = IsAValidLetter(input[0]);
// Each char i checked if is a part of a word.
// If is YES > I can store the char for later
// IF is NO > I Save the word (if exist) and then save the punctuation
foreach (var _char in input)
{
isAletter = IsAValidLetter(_char);
if (isAletter == true)
{
sb.Append(_char);
}
else
{
SaveWord(sb.ToString());
sb.Clear();
SaveANotWord(_char);
}
}
SaveWord(sb.ToString());
}
catch (Exception ex)
{
Logger.Log(ex);
}
return outputList;
}
private static bool IsAValidLetter(char _char)
{
if ((Char.IsPunctuation(_char) == true) || (_char == ' ') || (Char.IsNumber(_char) == true))
{
return false;
}
return true;
}
private void SaveWord(string word)
{
if (String.IsNullOrEmpty(word) == false)
{
outputList.Add(new SentenceSplitResult()
{
isAWord = true,
word = word
});
}
}
private void SaveANotWord(char _char)
{
outputList.Add(new SentenceSplitResult()
{
isAWord = false,
word = _char.ToString()
});
}

You could try using a regex to remove the apostrophes that aren't surrounded by letters (i.e. single quotes) and then using the Char static methods to strip all the other characters. By calling the regex first you can keep the contraction apostrophes (e.g. can't) but remove the single quotes like in 'Oh.
string myText = "'Oh, you can't help that,' said the Cat: 'we're all mad here. I'm mad. You're mad.'";
Regex reg = new Regex("\b[\"']\b");
myText = reg.Replace(myText, "");
string[] listOfWords = RemoveCharacters(myText);
public string[] RemoveCharacters(string input)
{
StringBuilder sb = new StringBuilder();
foreach (char c in input)
{
if (Char.IsLetter(c) || Char.IsWhiteSpace(c) || c == '\'')
sb.Append(c);
}
return sb.ToString().Split(' ');
}

Remove a char from a String

I'm looking for a method which can remove a character of a string.
for example I have " 3*X^4" and I want to remove characters '*' & '^' then the string would be like this "3X4" .

Maybe:
string s = Regex.Replace(input, "[*^]", "");

var s = "3*X^4";
var simplified = s.Replace("*", "").Replace("^", "");
// simplified is now "3X4"

try this..it will remove all special character from string
public static string RemoveSpecialCharacters(string str)
{
StringBuilder sb = new StringBuilder();
foreach (char c in str)
{
if ((c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
|| c == '.' || c == '_')
{
sb.Append(c);
}
}
return sb.ToString();
}

Another solution would be extracting the unwanted characters manually - this might be slightly more performant than repeatedly calling string.Replace especially for larger numbers of unwanted characters:
StringBuilder result = new StringBuilder(input.Length);
foreach (char ch in input) {
switch (ch) {
case '*':
case '^':
break;
default:
result.Append(ch);
break;
}
}
string s = result.ToString();
Or maybe extracting is the wrong word: Rather, you copy all characters except for those that you don't want.

Try this: String.Replace(Old String, New String)
string S = "3*X^4";
string str = S.Replace("*","").Replace("^","");

Parse an integer from a string with trailing garbage

I need to parse a decimal integer that appears at the start of a string.
There may be trailing garbage following the decimal number. This needs to be ignored (even if it contains other numbers.)
e.g.
"1" => 1
" 42 " => 42
" 3 -.X.-" => 3
" 2 3 4 5" => 2
Is there a built-in method in the .NET framework to do this?
int.TryParse() is not suitable. It allows trailing spaces but not other trailing characters.
It would be quite easy to implement this but I would prefer to use the standard method if it exists.

You can use Linq to do this, no Regular Expressions needed:
public static int GetLeadingInt(string input)
{
return Int32.Parse(new string(input.Trim().TakeWhile(c => char.IsDigit(c) || c == '.').ToArray()));
}
This works for all your provided examples:
string[] tests = new string[] {
"1",
" 42 ",
" 3 -.X.-",
" 2 3 4 5"
};
foreach (string test in tests)
{
Console.WriteLine("Result: " + GetLeadingInt(test));
}

foreach (var m in Regex.Matches(" 3 - .x. 4", #"\d+"))
{
Console.WriteLine(m);
}
Updated per comments
Not sure why you don't like regular expressions, so I'll just post what I think is the shortest solution.
To get first int:
Match match = Regex.Match(" 3 - .x. - 4", #"\d+");
if (match.Success)
Console.WriteLine(int.Parse(match.Value));

There's no standard .NET method for doing this - although I wouldn't be surprised to find that VB had something in the Microsoft.VisualBasic assembly (which is shipped with .NET, so it's not an issue to use it even from C#).
Will the result always be non-negative (which would make things easier)?
To be honest, regular expressions are the easiest option here, but...
public static string RemoveCruftFromNumber(string text)
{
int end = 0;
// First move past leading spaces
while (end < text.Length && text[end] == ' ')
{
end++;
}
// Now move past digits
while (end < text.Length && char.IsDigit(text[end]))
{
end++;
}
return text.Substring(0, end);
}
Then you just need to call int.TryParse on the result of RemoveCruftFromNumber (don't forget that the integer may be too big to store in an int).

I like #Donut's approach.
I'd like to add though, that char.IsDigit and char.IsNumber also allow for some unicode characters which are digits in other languages and scripts (see here).
If you only want to check for the digits 0 to 9 you could use "0123456789".Contains(c).
Three example implementions:
To remove trailing non-digit characters:
var digits = new string(input.Trim().TakeWhile(c =>
("0123456789").Contains(c)
).ToArray());
To remove leading non-digit characters:
var digits = new string(input.Trim().SkipWhile(c =>
!("0123456789").Contains(c)
).ToArray());
To remove all non-digit characters:
var digits = new string(input.Trim().Where(c =>
("0123456789").Contains(c)
).ToArray());
And of course: int.Parse(digits) or int.TryParse(digits, out output)

This doesn't really answer your question (about a built-in C# method), but you could try chopping off characters at the end of the input string one by one until int.TryParse() accepts it as a valid number:
for (int p = input.Length; p > 0; p--)
{
int num;
if (int.TryParse(input.Substring(0, p), out num))
return num;
}
throw new Exception("Malformed integer: " + input);
Of course, this will be slow if input is very long.
ADDENDUM (March 2016)
This could be made faster by chopping off all non-digit/non-space characters on the right before attempting each parse:
for (int p = input.Length; p > 0; p--)
{
char ch;
do
{
ch = input[--p];
} while ((ch < '0' || ch > '9') && ch != ' ' && p > 0);
p++;
int num;
if (int.TryParse(input.Substring(0, p), out num))
return num;
}
throw new Exception("Malformed integer: " + input);

string s = " 3 -.X.-".Trim();
string collectedNumber = string.empty;
int i;
for (x = 0; x < s.length; x++)
{
if (int.TryParse(s[x], out i))
collectedNumber += s[x];
else
break; // not a number - that's it - get out.
}
if (int.TryParse(collectedNumber, out i))
Console.WriteLine(i);
else
Console.WriteLine("no number found");

This is how I would have done it in Java:
int parseLeadingInt(String input)
{
NumberFormat fmt = NumberFormat.getIntegerInstance();
fmt.setGroupingUsed(false);
return fmt.parse(input, new ParsePosition(0)).intValue();
}
I was hoping something similar would be possible in .NET.
This is the regex-based solution I am currently using:
int? parseLeadingInt(string input)
{
int result = 0;
Match match = Regex.Match(input, "^[ \t]*\\d+");
if (match.Success && int.TryParse(match.Value, out result))
{
return result;
}
return null;
}

Might as well add mine too.
string temp = " 3 .x£";
string numbersOnly = String.Empty;
int tempInt;
for (int i = 0; i < temp.Length; i++)
{
if (Int32.TryParse(Convert.ToString(temp[i]), out tempInt))
{
numbersOnly += temp[i];
}
}
Int32.TryParse(numbersOnly, out tempInt);
MessageBox.Show(tempInt.ToString());
The message box is just for testing purposes, just delete it once you verify the method is working.

I'm not sure why you would avoid Regex in this situation.
Here's a little hackery that you can adjust to your needs.
" 3 -.X.-".ToCharArray().FindInteger().ToList().ForEach(Console.WriteLine);
public static class CharArrayExtensions
{
public static IEnumerable<char> FindInteger(this IEnumerable<char> array)
{
foreach (var c in array)
{
if(char.IsNumber(c))
yield return c;
}
}
}
EDIT:
That's true about the incorrect result (and the maintenance dev :) ).
Here's a revision:
public static int FindFirstInteger(this IEnumerable<char> array)
{
bool foundInteger = false;
var ints = new List<char>();
foreach (var c in array)
{
if(char.IsNumber(c))
{
foundInteger = true;
ints.Add(c);
}
else
{
if(foundInteger)
{
break;
}
}
}
string s = string.Empty;
ints.ForEach(i => s += i.ToString());
return int.Parse(s);
}

private string GetInt(string s)
{
int i = 0;
s = s.Trim();
while (i<s.Length && char.IsDigit(s[i])) i++;
return s.Substring(0, i);
}

Similar to Donut's above but with a TryParse:
private static bool TryGetLeadingInt(string input, out int output)
{
var trimmedString = new string(input.Trim().TakeWhile(c => char.IsDigit(c) || c == '.').ToArray());
var canParse = int.TryParse( trimmedString, out output);
return canParse;
}

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Removing all non letter characters from a string in C# - c#

You should use Regular Expression (Regex) instead. public static string RemoveBadChars(string word) { Regex reg = new Regex("[^a-zA-Z']"); return reg.Replace(word, string.Empty); } If you don't want to replace spaces: Regex reg = new Regex("[^a-zA-Z' ]");

This is the working answer, he says he want to remove none-letters chars public static string RemoveNoneLetterChars(string word) { Regex reg = new Regex(#"\W"); return reg.Replace(word, " "); // or return reg.Replace(word, String.Empty); }

word.Aggregate(new StringBuilder(word.Length), (acc, c) => acc.Append(Char.IsLetter(c) ? c.ToString() : "")).ToString(); Or you can substitute whatever function in place of IsLetter.

Related

How can i Split words inside this .txt in c# [duplicate]

Replace character in string with Uppercase of next in line (Pascal Casing)

How to split text into words?

Remove a char from a String

Parse an integer from a string with trailing garbage

Categories

Resources