Removing specific special characters from a string - c#

I would like to remove spaces (' '), dots ('.') and hyphens ('-') from a string, using a regular expression.
My current approach:
string input = "hello how --r dsbadb...dasjidhdsa.dasbhdgsa--dasb";
var res = input
.ToCharArray()
.Where(i => i != ' ' && i != '-' && i != '.')
.Aggregate("", (a, b) => a + b);

string filteredInput = Regex.Replace(input, "[ .-]+", "");
should be easier and more readable.

var result = string.Concat(input.Where(c => !new[] { '.', ' ', '-' }.Contains(c)));

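string result = Regex.Replace(input, @"[\s\.-]+", "");
\s matches any whitespace character (spaces, but also tabs and newlines), \. matches dots, and the trailing - matches hyphens; every match is replaced with an empty string. The pattern has to be a verbatim string (@"..."), because \s is not a valid escape sequence in a regular C# string literal.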
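To make the difference between the answers above concrete, here is a minimal sketch that puts the regex and LINQ variants side by side. The class and method names are hypothetical and only serve the illustration; the point is that a character class with a literal space leaves tabs and newlines alone, while \s removes them too.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class; the names are for illustration only.
public static class CharStripper
{
    // Removes literal spaces, dots and hyphens.
    // The hyphen comes last in the class, so it is a literal, not a range.
    public static string StripWithRegex(string input)
    {
        return Regex.Replace(input, "[ .-]", "");
    }

    // Same idea, but \s also removes tabs, newlines and other whitespace.
    public static string StripWhitespaceWithRegex(string input)
    {
        return Regex.Replace(input, @"[\s.-]", "");
    }

    // LINQ version: keeps every character that is not in the unwanted set.
    public static string StripWithLinq(string input)
    {
        char[] unwanted = { ' ', '.', '-' };
        return string.Concat(input.Where(c => !unwanted.Contains(c)));
    }
}
```

For an input such as "a b.c-d" all three methods should return "abcd"; they only diverge once the input contains whitespace other than a plain space.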

Related

Split word from string

I use this method to take the first few words of a string, but it does not seem to treat \n as a separator. How can I solve it?
public string SplitXWord(string text, int wordCount)
{
string output = "";
IEnumerable<string> words = text.Split().Take(wordCount);
foreach (string word in words)
{
output += " " + word;
}
return output;
}
Well, string.Split() splits by white-spaces only
https://learn.microsoft.com/en-us/dotnet/api/system.string.split?view=net-6.0
Split is used to break a delimited string into substrings. You can use either a character array or a string array to specify zero or more delimiting characters or strings. If no delimiting characters are specified, the string is split at white-space characters.
bold is mine.
So far so good, string.Split() splits on spaces ' ', tabulation '\t', new line '\n', carriage return '\r' etc.:
Console.Write(string.Join(", ", "a\nb\rc\td e".Split()));
produces
a, b, c, d, e
If you want to split on your own delimiters, you should provide them:
Console.Write(string.Join(", ", "a\nb\rc\td e".Split(new char[] {' ', '\t'})));
Note that \r and \n are preserved when splitting on ' ' and '\t':
a
b
c, d, e
So, it seems that your method should be something like this:
using System.Linq;
...
//DONE: static - we don't want this here
public static string SplitXWord(string text, int wordCount) {
//DONE: don't forget about degenerated cases
if (string.IsNullOrWhiteSpace(text) || wordCount <= 0)
return "";
//TODO: specify delimiters on which you want to split
return string.Join(" ", text
.Split(
new char[] { ' ', '\t' },
wordCount + 1,
StringSplitOptions.RemoveEmptyEntries)
.Take(wordCount));
}
Use the overload of the Split method that accepts an array of char separators and removes empty entries:
string str = "my test \n\r string \n is here";
string[] words = str.Split(new []{' ', '\r', '\n'}, StringSplitOptions.RemoveEmptyEntries);
UPDATE:
Another solution with regex and keeping line characters:
string str = "my test\r\n string\n is here";
var wordsByRegex = Regex.Split(str, @"(?= ).+?(\r|\n|\r\n)?").ToList();
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
namespace ConsoleApp17
{
class Program
{
static void Main(string[] args)
{
string myStr = "hello my friend \n whats up \n bro";
string[] mySplitStr = myStr.Split("\n");
mySplitStr.ToList().ForEach(str=>{
Console.WriteLine(str);
//to remove the white spaces
//Console.WriteLine(str.Replace(" ",""));
});
Console.ReadLine();
}
}
}
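Pulling the answers above together, here is a minimal sketch of the question's SplitXWord with the newline characters listed explicitly as delimiters. The class name WordSplitter and the exact delimiter set are assumptions made for this example; adjust the array to whatever should count as a word boundary.

```csharp
using System;
using System.Linq;

// Hypothetical wrapper class; only SplitXWord comes from the question.
public static class WordSplitter
{
    // Delimiters assumed for this sketch: space, tab, line feed, carriage return.
    private static readonly char[] Delimiters = { ' ', '\t', '\n', '\r' };

    // Returns the first wordCount words of text, joined by single spaces.
    public static string SplitXWord(string text, int wordCount)
    {
        // Degenerate cases: nothing to split, or no words requested.
        if (string.IsNullOrWhiteSpace(text) || wordCount <= 0)
            return "";

        return string.Join(" ", text
            .Split(Delimiters, StringSplitOptions.RemoveEmptyEntries)
            .Take(wordCount));
    }
}
```

With this delimiter set, line breaks between words are dropped and replaced by a single space in the result, which may or may not be what you want if the line structure matters.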

How to get string to contain only numbers, dashes, and space?

I am using regex to check if the string contains only numbers, dashes and spaces:
Regex regex = new Regex(@"^[0-9- ]+$");
if(!regex.IsMatch(str))
How can I do this without regex?
You can use LINQ to iterate over the characters, and char.IsDigit to check for a digit.
bool invalid = myString.Any( x => x != ' ' && x != '-' && !char.IsDigit(x) );
Here's a LINQ solution:
var allowedChars = "1234567890- ";
var str = "3275-235 23-325";
if (str.All(x => allowedChars.Contains(x))){
Console.WriteLine("true");
}
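One caveat with the answers above: char.IsDigit returns true for any Unicode decimal digit (Arabic-Indic digits, for example), whereas the regex class [0-9] only accepts ASCII digits, and the + in the original pattern also rejects an empty string. If the goal is to stay close to the original regex, an explicit range comparison is one option. This is only a sketch, and the class and method names are made up for the example.

```csharp
using System;
using System.Linq;

// Hypothetical helper; the names are not from the question or answers.
public static class DigitDashSpaceCheck
{
    // True when s is non-empty and contains only ASCII digits, '-' and ' '.
    public static bool IsValid(string s)
    {
        return !string.IsNullOrEmpty(s)
            && s.All(c => (c >= '0' && c <= '9') || c == '-' || c == ' ');
    }
}
```

Whether non-ASCII digits should be accepted depends on where the input comes from, so treat the range check as a design choice rather than a correction.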

reverse words BUT dot should be at the end

I have sentence:
"I love Marry."
and I would like to get:
"Marry love I." (dot at the end)
How can I do that?
public static string ReverseWords(string originalString)
{
return string.Join(" ", originalString.Split(' ').Where(x => !string.IsNullOrEmpty(x)).Reverse());
}
You can remove the last '.' before the split.
Demo:
public static string ReverseWords(string originalString)
{
var input = originalString.EndsWith(".") ? originalString.Remove(originalString.Length - 1) : originalString; // will trim ending '.'
return string.Join(" ", input.Split().Reverse()) + ".";
}
Try this. I am making it into several statements for readability.
var words = originalString.Split(new [] {' ', '.'}, StringSplitOptions.RemoveEmptyEntries).Reverse();
That gets your words in reverse order, and avoids the need for your Where clause. Then join them back with the period:
return string.Join(' ', words) + '.';
Do it in two steps where you split on . first;
return
string.Join(".",
originalString.Split('.')
.ToList()
.Select(s => string.Join(" ", s.Split(' ').Where(x => !string.IsNullOrEmpty(x)).Reverse())));
For single sentences, remove the dot and append it again in the end.
To remove the dot you can use TrimEnd which will remove all dots from the end of the string. If there is none, nothing is removed:
public static string ReverseWords(string originalString)
{
originalString = originalString.TrimEnd('.');
originalString = string.Join(" ", originalString.Split(' ').Where(x => !string.IsNullOrEmpty(x)).Reverse());
return originalString + ".";
}
For multiple sentences you can split the input string at the ., which will give you an array of sentences without dots. Then you simply reverse each part, append a dot and put them back together (I used a StringBuilder to do that):
public static string ReverseWordsMultiple(string originalString)
{
String[] sentences = originalString.Split(new char[] { '.' }, StringSplitOptions.RemoveEmptyEntries);
StringBuilder builder = new StringBuilder();
foreach (String sentence in sentences)
{
builder.Append(string.Join(" ", sentence.Split(' ').Where(x => !string.IsNullOrEmpty(x)).Reverse()));
builder.Append(". ");
}
return builder.ToString().TrimEnd();
}

Regex to split by a Targeted String up to a certain character

I have an LDAP query from which I need to build the domain.
So, split by "DC=" up to a "comma"
INPUT:
LDAP://DC=SOMETHINGS,DC=ELSE,DC=NET\account
RESULT:
SOMETHINGS.ELSE.NET
You can do it quite simply using the regex pattern DC=(\w*).
var str = @"LDAP://DC=SOMETHINGS,DC=ELSE,DC=NET\account";
var result = String.Join(".", Regex.Matches(str, @"DC=(\w*)")
.Cast<Match>()
.Select(m => m.Groups[1].Value));
Without Regex you can do:
string ldapStr = @"LDAP://DC=SOMETHINGS,DC=ELSE,DC=NET\account";
int startIndex = ldapStr.IndexOf("DC=");
// The domain components end at the backslash that precedes the account name.
// (Cutting at the last "DC=" instead would drop the final component, NET.)
int endIndex = ldapStr.IndexOf('\\');
string output = null;
if (startIndex >= 0 && endIndex > startIndex)
{
string domainComponentStr = ldapStr.Substring(startIndex, endIndex - startIndex);
output = String.Join(".",domainComponentStr.Split(new[] {"DC=", ","}, StringSplitOptions.RemoveEmptyEntries));
}
If you are always going to get the string in a similar format, then you can also do:
string ldapStr = @"LDAP://DC=SOMETHINGS,DC=ELSE,DC=NET\account";
var outputStr = String.Join(".", ldapStr.Split(new[] {"DC=", ",","\\"}, StringSplitOptions.RemoveEmptyEntries)
.Skip(1)
.Take(3));
And you will get:
outputStr = "SOMETHINGS.ELSE.NET"
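One thing to watch with the DC=(\w*) pattern is that \w does not match a hyphen, which can appear in domain names. A hedged variant, assuming each component runs up to the next comma or backslash, might look like the sketch below. The class and method names are invented for the illustration, and real distinguished names can contain escaped characters that this does not handle.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper; not part of any LDAP library.
public static class LdapDomain
{
    // Joins the values of all DC= components with dots.
    // Each value is taken up to the next comma or backslash.
    public static string FromPath(string ldapPath)
    {
        var parts = Regex.Matches(ldapPath, @"DC=([^,\\]+)")
            .Cast<Match>()
            .Select(m => m.Groups[1].Value);
        return string.Join(".", parts);
    }
}
```

For the input from the question this should give SOMETHINGS.ELSE.NET, and a component such as DC=MY-CORP is kept whole instead of being cut at the hyphen.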

manipulating strings

I am trying to remove some special characters from a string.
I have got the following string
[_fesd][009] Statement
and I want to get rid of all '_' '[' and ']'
I managed to remove the first characters with TrimStart and I get fesd][009] Statement
How should I remove the special characters from the middle of my string?
Currently I'm using the following code
string newStr = str.Trim(new Char[] { '[', ']', '_' });
where str is the string that should be manipulated and the result should be stored in newStr.
string newStr = str.Replace("[", "").Replace("]", "").Replace("_", "");
var newStr = Regex.Replace("[_fesd][009] Statement", "(\\[)|(\\])|(_)", string.Empty);
Use string.Replace with string.Empty as the string to replace with.
You could use Linq for it:
static void Main(string[] args)
{
var s = @"[_fesd][009] Statement";
var unwanted = @"_[]";
var sanitizedS = s
.Where(i => !unwanted.Contains(i))
.Aggregate<char, string>("", (a, b) => a + b);
Console.WriteLine(sanitizedS);
// output: fesd009 Statement
}
var chars = new Char[] { '[', ']', '_' };
var newValue = new String(str.Where(x => !chars.Contains(x)).ToArray());
