I am trying to write "text" into a file with
private void WriteToLogs(string text)
{
File.AppendAllText(todayMessageLog, $"({DateTime.Now}) Server Page: \"{text.Trim()}\"\n");
}
The text comes out as this:
"text (a bunch of white space)"
The text string is made up of these:
string username = e.NewClientUsername.Trim().Replace(" ", "");
string ip = e.NewClientIP.Trim().Replace(" ", "");
WriteToLogs($"{username.Trim().Replace(" ", "")} ({ip.Trim().Replace(" ", "")}) connected"); // NONE OF THESE WORKED FOR REMOVING THE WHITE SPACE
The "e" parameter comes from a custom EventArgs class in another namespace and NewClientIP and NewClientUsername are properties inside the class
As you can see, I tried with both Trim and Replace on both the strings themselves and the method but nothing removes the white space.
If the Trim() and Replace() methods do not work, the string is likely not padded with the usual white-space characters like SPACE or TAB, but something else. There are many other characters which can show up blank.
Try printing the result with something like BitConverter.ToString(Text.Encoding.UTF8.GetBytes(text)). Spaces would show up as 20-20-20-..., but you will probably get something else.
The white space shows up as 00, not 20, how can I remove it?
Good. Use the argument to the Trim() method, like so:
var text = "Blah\0\0\0\0";
text.Length → 8
text.Trim('\0').Length → 4
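The diagnosis and the fix above can be combined into one short sketch. The class and method names here (NulPaddingDemo, HexDump, StripNulPadding) are made up for illustration; only BitConverter.ToString, Encoding.UTF8.GetBytes and Trim come from the answers.

```csharp
using System;
using System.Text;

public static class NulPaddingDemo
{
    // Shows the UTF-8 bytes as hex so invisible characters become visible.
    public static string HexDump(string text) =>
        BitConverter.ToString(Encoding.UTF8.GetBytes(text));

    // Removes NUL characters and common white space from both ends.
    public static string StripNulPadding(string text) =>
        text.Trim('\0', ' ', '\t', '\r', '\n');

    public static void Main()
    {
        string padded = "Blah\0\0\0\0";
        Console.WriteLine(HexDump(padded));                // 42-6C-61-68-00-00-00-00
        Console.WriteLine(StripNulPadding(padded).Length); // 4
    }
}
```

Passing the characters explicitly matters because the parameterless Trim() only strips white space, and NUL is not white space.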
I hope this works for you.
using System;
using System.Text.RegularExpressions;
public class Program
{
public static void Main()
{
//This is your text
string input = "This is text with far too much "+
"This is text with far too much "+
"This is text with far too much ";
//This is the Regex
string pattern = "\\s";
//Value with the replace
string replacement = "";
//Replace
string result = Regex.Replace(input, pattern, replacement);
//Result
Console.WriteLine("Original String: {0}", input);
Console.WriteLine("Replacement String: {0}", result);
}
}
If you want to trim white space (not only ' ', but also \t, \u00A0 etc.) as well as \0 (which is not white space), you can try regular expressions:
using System.Text.RegularExpressions;
...
// Trim whitespaces and \0
string result = Regex.Replace(input, @"(^[\s\0]+)|([\s\0]+$)", "");
For reference
// Trim start whitespaces and \0
string resultStart = Regex.Replace(input, @"^[\s\0]+", "");
// Trim end whitespaces and \0
string resultEnd = Regex.Replace(input, @"[\s\0]+$", "");
Same idea (regular expressions), but a different pattern if you want not to trim but to remove all white space and \0:
string result = Regex.Replace(input, @"[\s\0]+", "");
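A minimal sketch of the two patterns side by side, with a check that \s on its own leaves \0 untouched. The class and method names (WhitespaceAndNulDemo, TrimEnds, RemoveAll) are hypothetical wrappers, not part of the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceAndNulDemo
{
    // Trims white space and NUL characters from both ends only.
    public static string TrimEnds(string input) =>
        Regex.Replace(input, @"(^[\s\0]+)|([\s\0]+$)", "");

    // Removes white space and NUL characters everywhere in the string.
    public static string RemoveAll(string input) =>
        Regex.Replace(input, @"[\s\0]+", "");

    public static void Main()
    {
        string input = " \0Hello world\0\0 ";
        Console.WriteLine(TrimEnds(input));  // Hello world
        Console.WriteLine(RemoveAll(input)); // Helloworld

        // \s alone does not match NUL, so the length is unchanged.
        Console.WriteLine(Regex.Replace("a\0b", @"\s", "").Length); // 3
    }
}
```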
I'm doing a bulk upload of information from a .csv file and I need to replace the non-ASCII character "�" with a normal space, " ".
The character "�" is written "\uFFFD" in C, C++, and Java, and is called the REPLACEMENT CHARACTER. The official C# documentation lists other space-like characters as well, such as U+FEFF, U+205F, U+200B, U+180E, and U+202F.
I'm trying to do the replacement this way:
public string Errors = "";
public void test(){
string textFromCsvCell = "";
string validCharacters = "^[0-9A-Za-z().:%-/ ]+$";
textFromCsvCell = "This is my text from csv file"; //All spaces aren't normal space " "
string cleaned = textFromCsvCell.Replace("\uFFFD", "\"");
if (Regex.IsMatch(cleaned, validCharacters ))
//All code for insert
else
Errors=cleaned;
//print Errors
}
The test method shows me this text:
"This is my�texto from csv file"
I tried some other solutions too:
Attempt 1: Using Trim
Regex.Replace(value.Trim(), @"[^\S\r\n]+", " ");
Attempt 2: Using Replace
System.Text.RegularExpressions.Regex.Replace(str, @"\s+", " ");
Attempt 3: Using Trim
String.Trim(new char[]{'\uFEFF', '\u200B'});
Try solution 4: Add [\S\r\n] to validCharacters
string validCharacters = "^[\S\r\n0-9A-Za-z().:%-/ ]+$";
Nothing works.
How can I replace it?
Sources:
Unicode Character 'REPLACEMENT CHARACTER' (U+FFFD)
Trying to replace all white space with a single space
Strip the byte order mark from string in C#
Remove extra whitespaces, but keep new lines using a regular expression in C#
EDITED
This is the original string:
"SYSTEM OF MONITORING CONTINUES OF GLUCOSE"
in 0x... notation
SYSTEM OF0xA0MONITORING CONTINUES OF GLUCOSE
Solution
Go to the Unicode code converter. Look at the conversions and do the replace.
In my case, I do a simple replace:
string value = "SYSTEM OF MONITORING CONTINUES OF GLUCOSE";
//value contains a non-breaking space (U+00A0)
//value is "SYSTEM OF�MONITORING CONTINUES OF GLUCOSE"
string cleaned = "";
string pattern = @"[^\u0000-\u007F]+";
string replacement = " ";
Regex rgx = new Regex(pattern);
cleaned = rgx.Replace(value, replacement);
if (Regex.IsMatch(cleaned, "^[0-9A-Za-z().:<>%-/ ]+$"))
{
//all code for insert
}
else
{
//Error messages
}
This expression represents all possible spaces: space, tab, page break, line break and carriage return
[ \f\n\r\t\v\u00a0\u1680\u180e\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200a\u2028\u2029\u202f\u205f\u3000]
References
Regular expressions (MDN)
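Replacing everything outside the ASCII range also removes legitimate accented letters. As an alternative sketch (not the accepted approach above), the Unicode category \p{Zs} can stand in for the long list of space characters; it covers U+00A0 among others. The names UnicodeSpaceDemo and NormalizeSpaces are assumptions made for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnicodeSpaceDemo
{
    // Replaces Unicode space separators (category Zs, which includes U+00A0)
    // and the replacement character U+FFFD with a plain space.
    // Other non-ASCII characters, such as accented letters, are kept.
    public static string NormalizeSpaces(string value) =>
        Regex.Replace(value, @"[\p{Zs}\uFFFD]", " ");

    public static void Main()
    {
        Console.WriteLine(NormalizeSpaces("SYSTEM OF\u00A0MONITORING")); // SYSTEM OF MONITORING
        Console.WriteLine(NormalizeSpaces("te\uFFFDxt"));                // te xt
    }
}
```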
Using String.Replace:
Use a simple String.Replace().
I've assumed that the only characters you want to remove are the ones you've mentioned in the question: � and you want to replace them by a normal space.
string text = "imp�ortant";
string cleaned = text.Replace('\u00ef', ' ')
.Replace('\u00bf', ' ')
.Replace('\u00bd', ' ');
// Returns 'imp ortant'
Or using Regex.Replace:
string cleaned = Regex.Replace(text, "[\u00ef\u00bf\u00bd]", " ");
// Returns 'imp ortant'
Try it out: Dotnet Fiddle
Define a range of ASCII characters, and replace anything that is not within that range.
We want to find only non-ASCII characters, so we match anything outside the ASCII range and replace it.
Regex.Replace("This is my te\uFFFDxt from csv file", @"[^\u0000-\u007F]+", " ")
The above pattern matches anything that is not (^) in the set [ ] covering the range \u0000-\u007F (the ASCII characters; everything past \u007F is non-ASCII) and replaces it with a space.
Result
This is my te xt from csv file
You can adjust the range provided \u0000-\u007F as needed to expand the range of allowed characters to suit your needs.
If you just want ASCII then try the following:
var ascii = new ASCIIEncoding();
byte[] encodedBytes = ascii.GetBytes(text);
var cleaned = ascii.GetString(encodedBytes).Replace("?", " ");
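One drawback of the snippet above is that Replace("?", " ") also replaces question marks that were in the original text. A possible variation, sketched here under the assumption that every non-ASCII character should become a single space, is to give the encoding a replacement fallback of " " instead of the default "?". AsciiFallbackDemo and ToAscii are hypothetical names.

```csharp
using System;
using System.Text;

public static class AsciiFallbackDemo
{
    // Round-trips the text through ASCII; characters that cannot be
    // encoded are replaced with a space instead of the default '?'.
    public static string ToAscii(string text)
    {
        Encoding ascii = Encoding.GetEncoding(
            "us-ascii",
            new EncoderReplacementFallback(" "),
            new DecoderReplacementFallback(" "));
        return ascii.GetString(ascii.GetBytes(text));
    }

    public static void Main()
    {
        // The genuine question mark at the end is preserved.
        Console.WriteLine(ToAscii("Is this te\uFFFDxt?")); // Is this te xt?
    }
}
```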
I would like to remove text contained between each of multiple pairs of brackets. The code below works fine if there is only ONE pair of brackets within the string:
var text = "This (remove me) works fine!";
// Remove text between brackets.
text = Regex.Replace(text, @"\(.*\)", "");
// Remove extra spaces.
text = Regex.Replace(text, @"\s+", " ");
Console.WriteLine(text);
This works fine!
However, if there are MULTIPLE sets of brackets contained within the string too much text is removed. The Regex expression removes all text between the FIRST opening bracket and LAST closing bracket.
var text = "This is (remove me) not (remove me) a problem!";
// Remove text between brackets.
text = Regex.Replace(text, @"\(.*\)", "");
// Remove extra spaces.
text = Regex.Replace(text, @"\s+", " ");
Console.WriteLine(text);
This is a problem!
I'm stumped - I'm sure there's a simple solution, but I'm out of ideas...
Help most welcome!
You have two main possibilities:
change .* to .*? i.e. match as few as possible and thus match ) as early as possible:
text = Regex.Replace(text, @"\(.*?\)", "");
text = Regex.Replace(text, @"\s{2,}", " "); // let's exclude trivial replaces
change .* to [^)]* i.e. match any symbols except ):
text = Regex.Replace(text, @"\([^)]*\)", "");
text = Regex.Replace(text, @"\s{2,}", " ");
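To see the difference on the string from the question, here is a small sketch that runs the greedy pattern and the negated-class pattern next to each other. BracketDemo and its method names are invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BracketDemo
{
    // Greedy: .* runs to the LAST closing bracket in the string.
    public static string StripGreedy(string text) =>
        Collapse(Regex.Replace(text, @"\(.*\)", ""));

    // Negated class: [^)]* stops at the FIRST closing bracket.
    public static string StripEachPair(string text) =>
        Collapse(Regex.Replace(text, @"\([^)]*\)", ""));

    // Collapses runs of two or more white-space characters into one space.
    private static string Collapse(string text) =>
        Regex.Replace(text, @"\s{2,}", " ");

    public static void Main()
    {
        string text = "This is (remove me) not (remove me) a problem!";
        Console.WriteLine(StripGreedy(text));   // This is a problem!
        Console.WriteLine(StripEachPair(text)); // This is not a problem!
    }
}
```

Neither pattern handles nested brackets; that case needs a different approach.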
A working example in C# that handles curly braces "{"; the match will be {{pc_mem_kc}}:
string str = "{{pc_mem_kc}} of members were health (test message)";
var pattern = @"\{.*?\}}";
var data11 = Regex.Matches(str, pattern, RegexOptions.IgnoreCase);
How can I replace Line Breaks within a string in C#?
Use replace with Environment.NewLine
myString = myString.Replace(System.Environment.NewLine, "replacement text"); //add a line terminating ;
As mentioned in other posts, if the string comes from another environment (OS) then you'd need to replace that particular environment's implementation of new line control characters.
The solutions posted so far either only replace Environment.NewLine or they fail if the replacement string contains line breaks because they call string.Replace multiple times.
Here's a solution that uses a regular expression to make all three replacements in just one pass over the string. This means that the replacement string can safely contain line breaks.
string result = Regex.Replace(input, @"\r\n?|\n", replacementString);
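The failure mode described above can be reproduced with a sketch like the following, where the replacement itself is "\r\n". The names SinglePassDemo, Chained and SinglePass are hypothetical. The single-pass version uses a MatchEvaluator so the replacement is inserted literally; that is a choice made for this sketch, since a plain replacement string would treat $ specially.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglePassDemo
{
    // Three passes: later passes see line breaks inserted by earlier ones.
    public static string Chained(string input, string replacement) =>
        input.Replace("\r\n", replacement)
             .Replace("\n", replacement)
             .Replace("\r", replacement);

    // One pass: each original line break is matched exactly once.
    public static string SinglePass(string input, string replacement) =>
        Regex.Replace(input, @"\r\n?|\n", _ => replacement);

    public static void Main()
    {
        string input = "a\nb";
        Console.WriteLine(Chained(input, "\r\n").Length);    // 5 ("a\r\n\nb")
        Console.WriteLine(SinglePass(input, "\r\n").Length); // 4 ("a\r\nb")
    }
}
```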
To extend The.Anyi.9's answer, you should also be aware of the different types of line break in general use. Dependent on where your file originated, you may want to look at making sure you catch all the alternatives...
string replaceWith = "";
string removedBreaks = Line.Replace("\r\n", replaceWith).Replace("\n", replaceWith).Replace("\r", replaceWith);
should get you going...
I would use Environment.NewLine when I wanted to insert a newline for a string, but not to remove all newlines from a string.
Depending on your platform you can have different types of newlines, but even inside the same platform often different types of newlines are used. In particular when dealing with file formats and protocols.
string ReplaceNewlines(string blockOfText, string replaceWith)
{
return blockOfText.Replace("\r\n", replaceWith).Replace("\n", replaceWith).Replace("\r", replaceWith);
}
If your code is supposed to run in different environments, I would consider using the Environment.NewLine constant, since it is specifically the newline used in the specific environment.
line = line.Replace(Environment.NewLine, "newLineReplacement");
However, if you get the text from a file originating on another system, this might not be the correct answer, and you should replace with whatever newline constant is used on the other system. It will typically be \n or \r\n.
If you want to "clean" the new lines, flamebaud's comment suggesting the regex @"[\r\n]+" is the best choice.
using System;
using System.Text.RegularExpressions;
class MainClass {
public static void Main (string[] args) {
string str = "AAA\r\nBBB\r\n\r\n\r\nCCC\r\r\rDDD\n\n\nEEE";
Console.WriteLine (str.Replace(System.Environment.NewLine, "-"));
/* Result:
AAA
-BBB
-
-
-CCC
DDD---EEE
*/
Console.WriteLine (Regex.Replace(str, @"\r\n?|\n", "-"));
// Result:
// AAA-BBB---CCC---DDD---EEE
Console.WriteLine (Regex.Replace(str, @"[\r\n]+", "-"));
// Result:
// AAA-BBB-CCC-DDD-EEE
}
}
Use the method that is new in .NET 6:
myString = myString.ReplaceLineEndings();
It replaces all newline sequences in the current string with Environment.NewLine.
Documentation:
ReplaceLineEndings
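A short sketch of the overload that takes a replacement string, assuming .NET 6 or later. The wrapper names (LineEndingsDemo, JoinWith) are made up for illustration.

```csharp
using System;

public static class LineEndingsDemo
{
    // Replaces every newline sequence (\r\n, \n, \r and others) with the
    // given text, using string.ReplaceLineEndings(string) from .NET 6+.
    public static string JoinWith(string text, string replacement) =>
        text.ReplaceLineEndings(replacement);

    public static void Main()
    {
        Console.WriteLine(JoinWith("AAA\r\nBBB\nCCC\rDDD", "-")); // AAA-BBB-CCC-DDD
    }
}
```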
Don't forget that Replace doesn't modify the string in place; it returns a new string with the characters replaced. The following will remove line breaks (not replace them). I'd use @Brian R. Bondy's method if replacing them with something else, perhaps wrapped as an extension method. Remember to check for null values before calling Replace or the extension methods provided.
string line = ...
line = line.Replace( "\r", "").Replace( "\n", "" );
As extension methods:
public static class StringExtensions
{
public static string RemoveLineBreaks( this string lines )
{
return lines.Replace( "\r", "").Replace( "\n", "" );
}
public static string ReplaceLineBreaks( this string lines, string replacement )
{
return lines.Replace( "\r\n", replacement )
.Replace( "\r", replacement )
.Replace( "\n", replacement );
}
}
To make sure all possible ways of line breaks (Windows, Mac and Unix) are replaced you should use:
myString.Replace("\r\n", "\n").Replace("\r", "\n").Replace("\n", "replacement");
Apply the replacements in this order, so that a combination of line-ending characters does not produce extra line breaks.
Why not both?
string ReplacementString = "";
Regex.Replace(strin.Replace(System.Environment.NewLine, ReplacementString), @"(\r\n?|\n)", ReplacementString);
Note: Replace strin with the name of your input string.
I needed to replace the \r\n with an actual carriage return and line feed and replace \t with an actual tab. So I came up with the following:
public string Transform(string data)
{
string result = data;
char cr = (char)13;
char lf = (char)10;
char tab = (char)9;
result = result.Replace("\\r", cr.ToString());
result = result.Replace("\\n", lf.ToString());
result = result.Replace("\\t", tab.ToString());
return result;
}
var answer = Regex.Replace(value, "(\n|\r)+", replacementString);
As new line can be delimited by \n, \r and \r\n, first we’ll replace \r and \r\n with \n, and only then split data string.
The following lines should go to the parseCSV method:
function parseCSV(data) {
//alert(data);
//replace Windows new lines (\r\n)
data = data.replace(/\r\n/g, "\n");
//replace classic Mac new lines (\r)
data = data.replace(/\r/g, "\n");
//split into rows
var rows = data.split("\n");
return rows;
}
Use the .Replace() method
Line = Line.Replace("\n", "whatever you want to replace with"); // Replace returns a new string
Best way to replace linebreaks safely is
yourString.Replace("\r\n","\n") //handling windows linebreaks
.Replace("\r","\n") //handling mac linebreaks
That should produce a string with only \n (i.e. line feed) as line breaks.
This code is useful for fixing mixed line breaks too.
Another option is to create a StringReader over the string in question. On the reader, do .ReadLine() in a loop. Then you have the lines separated, no matter what (consistent or inconsistent) separators they had. With that, you can proceed as you wish; one possibility is to use a StringBuilder and call .AppendLine on it.
The advantage is, you let the framework decide what constitutes a "line break".
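A minimal sketch of the StringReader idea follows. It collects the lines in a list and joins them with string.Join rather than using StringBuilder.AppendLine, which is a simplification chosen here; LineReaderDemo and JoinLines are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class LineReaderDemo
{
    // Splits the text with StringReader.ReadLine, which accepts \r, \n and
    // \r\n as line terminators, then joins the lines with the separator.
    public static string JoinLines(string text, string separator)
    {
        var lines = new List<string>();
        using (var reader = new StringReader(text))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line);
            }
        }
        return string.Join(separator, lines);
    }

    public static void Main()
    {
        Console.WriteLine(JoinLines("a\r\nb\nc\rd", "|")); // a|b|c|d
    }
}
```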
string s = Regex.Replace(source_string, "\n", "\r\n");
or
string s = Regex.Replace(source_string, "\r\n", "\n");
depending on which way you want to go.
Hope it helps.
If you want to replace only the newlines:
var input = @"sdfhlu \r\n sdkuidfs\r\ndfgdgfd";
var match = @"[\\ ]+";
var replaceWith = " ";
Console.WriteLine("input: " + input);
var x = Regex.Replace(input.Replace(@"\n", replaceWith).Replace(@"\r", replaceWith), match, replaceWith);
Console.WriteLine("output: " + x);
If you want to replace newlines, tabs and white spaces:
var input = "sdfhlusdkuidfs\r\ndfgdgfd";
var match = @"\s+"; // [\\s] in a verbatim string would match a backslash or the letter 's'
var replaceWith = "";
Console.WriteLine("input: " + input);
var x = Regex.Replace(input, match, replaceWith);
Console.WriteLine("output: " + x);
This is a very long-winded one-liner, but it is the only solution I found that works if you cannot use special character escapes such as "\r", "\n", \x0d and \u000D, or System.Environment.NewLine, as parameters to the Replace() method.
MyStr.Replace( System.String.Concat( System.Char.ConvertFromUtf32(13).ToString(), System.Char.ConvertFromUtf32(10).ToString() ), ReplacementString );
This is somewhat off-topic, but to get it to work inside Visual Studio's XML .props files, which invoke .NET via the XML properties, I had to dress it up as shown below.
The Visual Studio XML --> .NET environment just would not accept special character escapes such as "\r", "\n", \x0d and \u000D, or System.Environment.NewLine, as parameters to the replace() method.
$([System.IO.File]::ReadAllText('MyFile.txt').replace( $([System.String]::Concat($([System.Char]::ConvertFromUtf32(13).ToString()),$([System.Char]::ConvertFromUtf32(10).ToString()))),$([System.String]::Concat('^',$([System.Char]::ConvertFromUtf32(13).ToString()),$([System.Char]::ConvertFromUtf32(10).ToString())))))
Based on @mark-bayers' answer, and for cleaner output:
string result = Regex.Replace(ex.Message, @"(\r\n?|\r?\n)+", "replacement text");
It removes \r\n, \n and \r, preferring the longer match, and collapses multiple occurrences into one.
I need to remove all characters that can't be part of URLs, such as spaces, <, > and so on.
I am getting the data from a database.
For example, if the retrieved data is: Product #number 123!
the new string should be: Product-number-123
Should I use regex? Is there a regex pattern for that?
Thanks
Here is an example of how to generate a URL-friendly string from a "normal" string:
public static string GenerateSlug(string phrase)
{
string str = phrase.ToLower();
str = Regex.Replace(str, @"[^a-z0-9\s-]", ""); // invalid chars
str = Regex.Replace(str, @"\s+", " ").Trim(); // convert multiple spaces into one space
str = str.Substring(0, str.Length <= 45 ? str.Length : 45).Trim(); // cut and trim it
str = Regex.Replace(str, @"\s", "-"); // hyphens
return str;
}
You may want to remove the trim-part if you are sure that you always want the full string.
Source
An easy regex to do this is:
string cleaned = Regex.Replace(url, @"[^a-zA-Z0-9]+","-");
To just perform the replacement of special characters like "<" you can use Server.UrlEncode(string s). And you can do the opposite with Server.UrlDecode(string s).
How do I replace \n with empty space?
I get an empty literal error if I do this:
string temp = mystring.Replace('\n', '');
String.Replace('\n', '') doesn't work because '' is not a valid character literal.
If you use the String.Replace(string, string) overload, it should work.
string temp = mystring.Replace("\n", "");
As replacing "\n" with "" doesn't give you the result that you want, that means that what you should replace is actually not "\n", but some other character combination.
One possibility is that what you should replace is the "\r\n" character combination, which is the newline code in a Windows system. If you replace only the "\n" (line feed) character it will leave the "\r" (carriage return) character, which still may be interpreted as a line break, depending on how you display the string.
If the source of the string is system specific you should use that specific string, otherwise you should use Environment.NewLine to get the newline character combination for the current system.
string temp = mystring.Replace("\r\n", string.Empty);
or:
string temp = mystring.Replace(Environment.NewLine, string.Empty);
This should work.
string temp = mystring.Replace("\n", "");
Are you sure there are actual \n new lines in your original string?
string temp = mystring.Replace("\n", string.Empty).Replace("\r", string.Empty);
Obviously, this removes both '\n' and '\r' and is as simple as I know how to do it.
If you use
string temp = mystring.Replace("\r\n", "").Replace("\n", "");
then you won't have to worry about where your string is coming from.
One caveat: in .NET the linefeed is "\r\n". So if you're loading your text from a file, you might have to use that instead of just "\n"
edit> as samuel pointed out in the comments, "\r\n" is not .NET specific, but is windows specific.
What about creating an Extension Method like this....
public static string ReplaceTHAT(this string s)
{
return s.Replace("\r\n", ""); // Windows line breaks are \r\n (CR then LF), not \n\r
}
And then when you want to replace that wherever you want you can do this.
s.ReplaceTHAT();
Best Regards!
Here is your exact answer...
const char LineFeed = '\n'; // #10
string temp = new System.Text.RegularExpressions.Regex(
LineFeed.ToString() // the Regex constructor takes a string, not a char
).Replace(mystring, string.Empty);
But this one is much better... Specially if you are trying to split the lines (you may also use it with Split)
const char CarriageReturn = '\r'; // #13
const char LineFeed = '\n'; // #10
string temp = new System.Text.RegularExpressions.Regex(
string.Format("{0}?{1}", CarriageReturn, LineFeed)
).Replace(mystring, string.Empty);
string temp = mystring.Replace("\n", " ");
@gnomixa - What do you mean in your comment about not achieving anything? The following works for me in VS2005.
If your goal is to remove the newline characters, thereby shortening the string, look at this:
string originalStringWithNewline = "12\n345"; // length is 6
System.Diagnostics.Debug.Assert(originalStringWithNewline.Length == 6);
string newStringWithoutNewline = originalStringWithNewline.Replace("\n", ""); // new length is 5
System.Diagnostics.Debug.Assert(newStringWithoutNewline.Length == 5);
If your goal is to replace the newline characters with a space character, leaving the string length the same, look at this example:
string originalStringWithNewline = "12\n345"; // length is 6
System.Diagnostics.Debug.Assert(originalStringWithNewline.Length == 6);
string newStringWithoutNewline = originalStringWithNewline.Replace("\n", " "); // new length is still 6
System.Diagnostics.Debug.Assert(newStringWithoutNewline.Length == 6);
And you have to replace single-character strings instead of characters because '' is not a valid character literal to pass to Replace(char, char).
I know this is an old post but I'd like to add my method.
public static string Replace(string text, string[] toReplace, string replaceWith)
{
foreach (string str in toReplace)
text = text.Replace(str, replaceWith);
return text;
}
Example usage:
string newText = Replace("This is an \r\n \n an example.", new string[] { "\r\n", "\n" }, "");
Found on Bytes.com:
string temp = mystring.Replace('\n', '\0'); // caution: '\0' is the NUL character, not an empty char; the string keeps its length
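A quick check, sketched with made-up names (NulIsNotEmptyDemo, SwapForNul, RemoveNewlines), suggests why substituting '\0' is not the same as removing the character: the NUL stays in the string and the length does not change.

```csharp
using System;

public static class NulIsNotEmptyDemo
{
    // Swaps each line feed for a NUL character; nothing is removed.
    public static string SwapForNul(string text) => text.Replace('\n', '\0');

    // Actually removes each line feed by using the string overload.
    public static string RemoveNewlines(string text) => text.Replace("\n", "");

    public static void Main()
    {
        string original = "12\n345";
        Console.WriteLine(SwapForNul(original).Length);      // 6
        Console.WriteLine(SwapForNul(original).IndexOf('\0')); // 2
        Console.WriteLine(RemoveNewlines(original).Length);  // 5
    }
}
```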