How to remove a charlist from a string - c#

How can I remove a specific list of chars from a string?
For example I have the string Multilanguage File07 and want to remove all vowels, spaces and numbers to get the string MltlnggFl.
Is there any shorter way than using a foreach loop?
string MyLongString = "Multilanguage File07";
string MyShortString = MyLongString;
char[] charlist = new char[17]
    { 'a', 'e', 'i', 'o', 'u',
      '0', '1', '2', '3', '4', '5',
      '6', '7', '8', '9', '0', ' ' };
foreach (char letter in charlist)
{
    MyShortString = MyShortString.Replace(letter.ToString(), "");
}

Use this code to replace a list of chars within a string:
using System.Text.RegularExpressions;
string MyLongString = "Multilanguage File07";
string MyShortString = Regex.Replace(MyLongString, "[aeiou0-9 ]", "");
Result:
Multilanguage File07 => MltlnggFl
Text from which some chars should be removed 12345 => Txtfrmwhchsmchrsshldbrmvd
Explanation of how it works:
The regular expression I use here is a character class: a list of independent characters enclosed in brackets []
=> [aeiou0-9 ]
Regex.Replace() scans the whole string and checks each character against that character class.
Every matched character is replaced by an empty string (""). The class is case-sensitive, so uppercase vowels are kept.
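The character class above lists only lowercase vowels, so an uppercase vowel such as the A in "MultilanguAge" would survive. A minimal sketch, assuming you also want uppercase vowels removed, passes RegexOptions.IgnoreCase. The class and method names are made up for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

public static class VowelStripper
{
    // Hypothetical helper: removes vowels (either case), digits and spaces.
    public static string Strip(string input)
    {
        return Regex.Replace(input, "[aeiou0-9 ]", "", RegexOptions.IgnoreCase);
    }
}
```

With this sketch, VowelStripper.Strip("MultilanguAge File07") returns MltlnggFl.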

How about this:
var charList = new HashSet<char>("aeiou0123456789 ");
MyLongString = new string(MyLongString.Where(c => !charList.Contains(c)).ToArray());
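If you need this in more than one place, the same HashSet idea can be wrapped in an extension method. This is only a sketch: the names CharFilter and RemoveChars are invented here, and it uses a StringBuilder instead of LINQ, which is a design choice rather than a requirement.

```csharp
using System.Collections.Generic;
using System.Text;

public static class CharFilter
{
    // Hypothetical extension method: returns "source" without any character found in "unwanted".
    public static string RemoveChars(this string source, IEnumerable<char> unwanted)
    {
        var blocked = new HashSet<char>(unwanted);
        var sb = new StringBuilder(source.Length);
        foreach (char c in source)
        {
            // Keep only the characters that are not in the blocked set.
            if (!blocked.Contains(c))
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

Usage: "Multilanguage File07".RemoveChars("aeiou0123456789 ") gives MltlnggFl. A string can be passed as the second argument because string implements IEnumerable<char>.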

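Try this pattern: [aeiou0-9 ]+. Replace it with an empty string and you will get your desired result.
In PCRE I would wrap it in a branch reset, (?|([aeiou0-9 ]+)), so all characters are captured into one group for easier manipulation, but .NET's Regex does not support (?|...), and a plain replacement needs no capture group anyway.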

public void removeVowels()
{
    string str = "MultilanguAge File07";
    var chr = str.Where(c => !"aeiouAEIOU0123456789 ".Contains(c)).ToList();
    Console.WriteLine(string.Join("", chr));
}
1st line: creates the string variable to process.
2nd line: uses LINQ to skip vowels (uppercase and lowercase), the digits 0-9 and spaces, and converts the remaining characters to a list. Every digit has to be listed explicitly: inside a plain string, "0-9" is just the three characters 0, - and 9, not a range.
3rd line: joins the chr list into a single string with string.Join.
Result: MltlnggFl
Note: the removeVowels function removes uppercase vowels as well as lowercase vowels, digits and spaces.

Related

How to create char array of letters from a string array ? (C#)

For example, I have a string:
"Nice to meet you"
, there are 13 letters when we count repeated letters, but I want to create a char array of the letters in this string without repeats. For the string above it should create an array like
{'N', 'i', 'c', 'e', 't', 'o', 'y', 'u', 'm'}
I searched Google for 2 hours but found nothing: there were lots of answers about strings and char arrays, but none for my situation. I thought I could write the code by checking every letter in the array with two for loops, but I got syntax errors, so I decided to ask.

You can do this:
var foo = "Nice to meet you";
var fooArr = foo.ToCharArray();
HashSet<char> set = new();
set.UnionWith(fooArr);
// or, if you want to leave out whitespace, use this line instead of the one above
set.UnionWith(fooArr.Where(c => c != ' '));
UPDATE:
You could even make an extension method:
public static class StringExtensions
{
    public static IEnumerable<char> ToUniqueCharArray(this string source, char? ignoreChar)
    {
        var charArray = source.ToCharArray();
        HashSet<char> set = new();
        set.UnionWith(charArray.Where(c => c != ignoreChar));
        return set;
    }
}
And then you can use it as:
var foo = "Nice to meet you";
var uniqueChars = foo.ToUniqueCharArray(ignoreChar: ' ');
// if you want to keep the whitespace
var uniqueCharsWithSpace = foo.ToUniqueCharArray(ignoreChar: null);

This piece of code does the job:
var sentence = "Nice To meet you";
var arr = sentence.ToLower().Where(x => x != ' ').ToHashSet();
Console.WriteLine(string.Join(",", arr));
I added ToLower() in case you do not want to distinguish between uppercase and lowercase; if case matters, just remove that call.
HashSet removes all duplicate letters.

I tried this one and it works too
"Nice to meet you".Replace(" ", "").ToCharArray().Distinct();

A very short solution is to use .Except() on the input string:
string text = "Nice to meet you";
char[] letters = text.Except(" ").ToArray();
Here, .Except():
translates both the text string and the parameter string (" ") to char collections
filters out all the chars in the text char collection that are present in the parameter char collection
returns a collection of distinct chars from the filtered text char collection
Visualizing the process
Let's use the blue banana as an example.
var input = "blue banana";
input.Except(" ") will be translated to:
{ 'b', 'l', 'u', 'e', ' ', 'b', 'a', 'n', 'a', 'n', 'a' }.Except({ ' ' })
Filtering out all ' ' occurrences in the text char array produces:
{ 'b', 'l', 'u', 'e', 'b', 'a', 'n', 'a', 'n', 'a' }
The distinct char collection will have all the duplicates of 'b', 'a' and 'n' removed, resulting in:
{ 'b', 'l', 'u', 'e', 'a', 'n' }
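The walkthrough above can be checked with a few lines of code. This is a small sketch; the class and method names are placeholders chosen for the example.

```csharp
using System.Linq;

public static class ExceptDemo
{
    // Returns the distinct characters of "text" that do not occur in "excluded".
    public static char[] UniqueCharsExcept(string text, string excluded)
    {
        // Except() is a set difference, so duplicates in "text" are dropped as well.
        return text.Except(excluded).ToArray();
    }
}
```

Calling ExceptDemo.UniqueCharsExcept("blue banana", " ") returns six characters: b, l, u, e, a and n. In the current LINQ to Objects implementation they come back in order of first appearance, but it is safer not to rely on that if the order matters to you.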

The ToCharArray method of string is the only thing you need to get the characters as an array (note that it keeps spaces and repeated letters):
using System;
public class HelloWorld
{
    public static void Main(string[] args)
    {
        string str = "Nice to meet you";
        char[] carr = str.ToCharArray();
        for (int i = 0; i < carr.Length; i++)
            Console.WriteLine(carr[i]);
    }
}

String str = "Nice To meet you";
char[] letters = str.ToLower().Except(" ").ToArray();

A solution just using for-loops (no generics or Linq), with comments explaining things:
// get rid of the spaces
String str = "Nice to meet you".Replace(" ", "");
// a temporary array more than long enough (we will trim it later)
char[] temp = new char[str.Length];
// the index where to insert the next char into "temp". This is also the length of the filled-in part
var idx = 0;
// loop through the source string
for (int i = 0; i < str.Length; i++)
{
    // get the character at this position (NB "surrogate pairs" will fail here)
    var c = str[i];
    // did we find it in the following loop?
    var found = false;
    // loop through the collected characters to see if we already have it
    for (int j = 0; j < idx; j++)
    {
        if (temp[j] == c)
        {
            // yes, we already have it!
            found = true;
            break;
        }
    }
    if (!found)
    {
        // we didn't already have it, so add
        temp[idx] = c;
        idx += 1;
    }
}
// "temp" is too long, so create a new array of the correct size
var letters = new char[idx];
Array.Copy(temp, letters, idx);
// now "letters" contains the unique letters
That "surrogate pairs" remark basically means that emojis will fail.
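If emojis or other characters outside the Basic Multilingual Plane matter to you, one possible way around the surrogate-pair problem is to iterate over text elements instead of chars. The sketch below uses StringInfo.GetTextElementEnumerator from System.Globalization; the class and method names are invented for the example, and it returns strings rather than chars because a single text element can span more than one char.

```csharp
using System.Collections.Generic;
using System.Globalization;

public static class TextElements
{
    // Hypothetical helper: distinct text elements of "source", skipping whitespace,
    // in order of first appearance.
    public static List<string> DistinctTextElements(string source)
    {
        var seen = new HashSet<string>();
        var result = new List<string>();
        TextElementEnumerator elements = StringInfo.GetTextElementEnumerator(source);
        while (elements.MoveNext())
        {
            string element = elements.GetTextElement();
            if (string.IsNullOrWhiteSpace(element))
                continue;
            // HashSet.Add returns false when the element was already seen.
            if (seen.Add(element))
                result.Add(element);
        }
        return result;
    }
}
```

For "Nice to meet you" this yields the same nine letters as the loop above, each as a one-character string.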

Get everything before dot or comma c#

How can I get a substring of everything before a dot or comma?
For example:
string input = "2.1";
int charLocation = input.IndexOf(".", StringComparison.Ordinal);
string test = input.Substring(0, charLocation);
But what if I have input = "2,1"?
I would like to do it in one call, without using Substring twice (once for the dot and once for the comma).

string test = input.Split(new Char[] { ',', '.' })[0];

This will split the string for either comma or period...
input.Split(',','.');

Use the IndexOfAny function. It allows you to specify a list of characters to look for, rather than just a single character. You could then make a substring up to the return value of that function.
e.g.
char[] chars = { '.', ',' };
string result = s.Substring(0, s.IndexOfAny(chars));
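One thing to watch with this approach: IndexOfAny returns -1 when none of the characters occurs, and Substring throws for a negative length. A minimal sketch that guards against that case, assuming you want the whole input back when there is no separator (PrefixHelper and BeforeSeparator are made-up names):

```csharp
public static class PrefixHelper
{
    private static readonly char[] Separators = { '.', ',' };

    // Hypothetical helper: text before the first '.' or ','; the whole input if neither occurs.
    public static string BeforeSeparator(string input)
    {
        int index = input.IndexOfAny(Separators);
        return index < 0 ? input : input.Substring(0, index);
    }
}
```

PrefixHelper.BeforeSeparator("2.1") and PrefixHelper.BeforeSeparator("2,1") both return 2, while an input without a separator such as "42" is returned unchanged.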

C# - regex finding all ascii enclosed in ' ' and convert them to hex ascii output

I am writing a converter for user input that converts numeric strings and ASCII characters enclosed in ' ' to their hex representation. Entering numbers works fine with:
string TestText = "lorem, 'C', 127, 0x06, '#' ipsum";
TestText = Regex.Replace(
TestText,
" +\\d{1,3}",
(MatchEvaluator)(match => Convert.ToByte(match.Value).ToString("X2")));
Out.Text = TestText;
But how can I detect ASCII chars enclosed in ' ' and convert them to a hex string, so that 'C' becomes 43 and '+' becomes 2B?

Basically, you want to match the regular expression '[^']'. This matches any single character other than ' that is enclosed in '.
Then, in your match evaluator, you take the character in the middle and convert it to a hexadecimal string. To do that, first cast the char to an int, then call ToString("X2") (use "x2" if you prefer lowercase hex digits):
TestText = Regex.Replace(TestText, "'[^']'",
    (MatchEvaluator)(match => ((int)match.Value[1]).ToString("X2")));

First, you need a RegEx to capture the character inside the 's: "'(.)'"
Then you need to convert that character to its hex equivalent, like so: Encoding.ASCII.GetBytes(match.Groups[1].Value).First().ToString("X2")
so your final code would look like this:
string TestText = "lorem, 'C', 127, 0x06, '#' ipsum '+'";
TestText = Regex.Replace(TestText, @" +\d{1,3}", match => Convert.ToByte(match.Value).ToString("X2"));
TestText = Regex.Replace(TestText, "'(.)'", match => Encoding.ASCII.GetBytes(match.Groups[1].Value).First().ToString("X2"));
Out.Text = TestText;
Note that, as pointed out in the comments, your RegEx is currently matching the 0 at the beginning of 0x06, which may not be what you want.
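One possible way to avoid touching 0x06 is to handle both cases in a single pass and require word boundaries around the decimal numbers. This is a sketch under assumptions, not the asker's code: the pattern, the HexConverter name and the choice to leave out-of-range numbers unchanged are all mine.

```csharp
using System.Text.RegularExpressions;

public static class HexConverter
{
    // Matches either a quoted single character (group 1) or a standalone run of 1-3 digits.
    private static readonly Regex Token = new Regex(@"'(.)'|\b\d{1,3}\b");

    public static string ToHex(string input)
    {
        return Token.Replace(input, m =>
        {
            // Quoted character: convert its code to two uppercase hex digits.
            if (m.Groups[1].Success)
                return ((int)m.Groups[1].Value[0]).ToString("X2");

            // Decimal number: convert only if it fits in a byte, otherwise leave it as it is.
            byte number;
            return byte.TryParse(m.Value, out number) ? number.ToString("X2") : m.Value;
        });
    }
}
```

For the input "lorem, 'C', 127, 0x06, '#' ipsum" this returns "lorem, 43, 7F, 0x06, 23 ipsum". Doing both replacements in one pass also keeps the hex output of the first step from being picked up as a decimal number by the second.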

C# Trim characters from the end of string

I have this string
"1.3.1.\tProduction and Sales Analysis:"
I want to trim numbers and escape sequences from the start and end of the string.
The output should be:
"Production and Sales Analysis:"
My code:
Char[] trimArray = new Char[] {'0','1','2','3','4','5','6','7','8','9','.',',',':','\\','/'};
String test = "1.3.1.\tProduction and Sales Analysis:";
test = test.TrimEnd(trimArray);
But the problem is that when a string like 23232-232123-asd-323 comes in, it also removes the digits.
I want to remove the unwanted characters from the start and end of the string, but keep strings like 23232-232123-asd-323 or mobile numbers intact.
Thanks.

Is there always '\t' in the middle?
You can try to split by '\t' and trim end.
Split(new char[]{'\t'});
If colon is always at the end of what you want to get you can split again ;)

If there is something in common separating the bad numbers from the good ones, such as the tab character "\t", then how about finding the index of that character and taking the substring of everything after it? For example:
String test = "1.3.1.\tProduction and Sales Analysis:";
int index = test.LastIndexOf('\t');
test = test.Substring(index + 1);
This should give you "Production and Sales Analysis:". You could do the same for the ":", except that you want everything before it, like so:
index = test.LastIndexOf(':');
test = test.Substring(0, index);

With regex:
using System.Text.RegularExpressions;
string input = "1.3.1.\tProduction and Sales Analysis:";
string rgx = @"(?:[0-9.]+)*(?:\\[a-zA-Z])*([\w\s\.\:-_]+)(?:\\[a-zA-Z])*";
string result = Regex.Match(input, rgx).Groups[1].Value.TrimStart();
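A narrower alternative, if the unwanted prefix is always section numbering like "1.3.1." followed by whitespace, is to strip only that prefix. The sketch below rests on that assumption about the input format; HeadingCleaner and StripNumbering are hypothetical names.

```csharp
using System.Text.RegularExpressions;

public static class HeadingCleaner
{
    // Assumed numbering format: one or more "digits + dot" groups, then at least one whitespace char.
    private static readonly Regex LeadingNumbering = new Regex(@"^(?:\d+\.)+\s+");

    public static string StripNumbering(string input)
    {
        // Inputs that do not start with such numbering are returned unchanged.
        return LeadingNumbering.Replace(input, "");
    }
}
```

With this, "1.3.1.\tProduction and Sales Analysis:" becomes "Production and Sales Analysis:", while 23232-232123-asd-323 is left alone because its leading digits are followed by a hyphen rather than a dot.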

remove the any character from string except number,dot(.), and comma(,) [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
remove the invalid character from price
Hi friends,
I have a scenario where I have to remove invalid characters from a price using C# code.
I would like a regular expression for this, or something better.
For example, my price is
"3,950,000 ( Ex. TAX )"
and I want to remove "( Ex. TAX )" from it.
In general, I have to remove every character from the string except digits, dot (.), and comma (,).
Please help.
Thanks in advance
Shivi

private string RemoveExtraText(string value)
{
    var allowedChars = "01234567890.,";
    return new string(value.Where(c => allowedChars.Contains(c)).ToArray());
}

string s = @"3,950,000 ( Ex. TAX )";
string result = string.Empty;
foreach (var c in s)
{
    int ascii = (int)c;
    // keep digits (48-57), comma (44) and dot (46)
    if ((ascii >= 48 && ascii <= 57) || ascii == 44 || ascii == 46)
        result += c;
}
Console.Write(result);
Notice that the dot in "Ex. TAX" will stay, so the result is "3,950,000." with a trailing dot.

How about this:
using System.Text.RegularExpressions;
public static Regex regex = new Regex(
"(\\d|[,\\.])*",
RegexOptions.IgnoreCase
| RegexOptions.CultureInvariant
| RegexOptions.IgnorePatternWhitespace
| RegexOptions.Compiled
);
//// Capture the first Match, if any, in the InputText
Match m = regex.Match(InputText);
//// Capture all Matches in the InputText
MatchCollection ms = regex.Matches(InputText);
//// Test to see if there is a match in the InputText
bool IsMatch = regex.IsMatch(InputText);

You can use LINQ
HashSet<char> validChars = new HashSet<char>(
new char[] { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', ',', '.' });
var washedString = new string((from c in "3,950,000 ( Ex. TAX )"
where validChars.Contains(c)
select c).ToArray());
but the "." in "Ex. TAX" will remain.
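Both behaviours can also be written with Regex. The first method below is the regex equivalent of the character filter and keeps the stray dot; the second is one possible way around it, assuming the price is the first number in the string. PriceCleaner and its method names are invented for this sketch.

```csharp
using System.Text.RegularExpressions;

public static class PriceCleaner
{
    // Removes every character that is not an ASCII digit, dot or comma.
    public static string KeepPriceChars(string value)
    {
        return Regex.Replace(value, "[^0-9.,]", "");
    }

    // Returns the first run of digits with optional dot- or comma-separated groups,
    // or an empty string if the input contains no digits.
    public static string FirstNumber(string value)
    {
        return Regex.Match(value, @"\d+(?:[.,]\d+)*").Value;
    }
}
```

For "3,950,000 ( Ex. TAX )", KeepPriceChars returns "3,950,000." (with the dot from "Ex."), while FirstNumber returns "3,950,000".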

You may use a negated character class, something like [^0-9.,], and replace every match with an empty string.
