I have this string:
string specialCharacterString = @"\n";
where "\n" is the new line special character.
Is it possible to convert that string (of two characters) into a single char? How do I do something like:
char specialCharacter = Parse(specialCharacterString);
where the value of specialCharacter would be equal to \n?
Is there anything in .NET that would parse the string for me, or must I use if or switch on the string (which can contain any special character) to accomplish this? Note that char.Parse(string) cannot handle escape sequences and treats the string above as two characters.
Maybe I am oversimplifying but can't you just do the following:
txtString.Replace("\n", "$");
It is technically a string to string replacement but would be string to char...
You can always cast it to a char since you know what char you are replacing the string with.
I'm not sure what the business need is, but if you need to parse C# in C# you can use a tool like ANTLR, which has a C# grammar (https://github.com/antlr/grammars-v4/).
I don't think there is a ready-made tool designed just for strings.
Try using Regex.Unescape(specialCharacterString);
It returns a new string in which the escape sequences have been converted to the characters they represent.
For example:
var literalStringWithEscapeCharacters = @"Hello\tWorld";
var stringWithEscapeCharacters = Regex.Unescape(literalStringWithEscapeCharacters);
Console.WriteLine(stringWithEscapeCharacters);
Will print: Hello World
Instead of: Hello\tWorld
Then you can find escape characters in stringWithEscapeCharacters like this:
var escapeChars= new [] { '\n' };
var characters = stringWithEscapeCharacters.Where(c => escapeChars.Contains(c)).ToList();
All escape characters described here:
https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/strings/#string-escape-sequences
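To get the single char the question asks for, the Regex.Unescape approach can be wrapped in a small helper. This is only a sketch: the names EscapeParser and ParseEscapedChar are made up for illustration, and Regex.Unescape follows regular-expression escape rules, which overlap with C# literal escapes (\n, \t, \r, \uXXXX and so on) but are not identical to them.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeParser
{
    // Hypothetical helper: converts a string holding one escape sequence
    // (for example the two characters '\' and 'n') into the char it denotes.
    public static char ParseEscapedChar(string escaped)
    {
        // Regex.Unescape turns escape sequences into the characters they represent.
        string unescaped = Regex.Unescape(escaped);

        // Reject input that does not collapse to exactly one character.
        if (unescaped.Length != 1)
            throw new FormatException("Input does not denote exactly one character.");

        return unescaped[0];
    }
}
```

For example, EscapeParser.ParseEscapedChar(@"\n") should give the newline character, while a plain two-letter string such as "ab" would throw a FormatException.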
Related
I'm trying to compare two characters in C#. The "==" operator does not work for strings, you have to use the .Equals() method. In the following code example I want to read each character in the input string, and output another string without spaces.
string inputName, outputName = null;
// read input name from file
foreach (char indexChar in inputName)
{
if (!indexChar.Equals(" "))
outputName += indexChar;
}
This does not work, the comparison always equals false, even when the input name has embedded spaces. I also tried using the overload method Equals(string, string), which did not work either. I'm assuming C# treats char variables as a string of length 1. Microsoft's documentation doesn't seem to mention comparing characters. Does anyone have a better method for comparing characters in a string?
" " is a string of length one; a char and a string never match; you want ' ', the space character:
if (indexChar != ' ')
However, if you're just trying to remove all spaces, it is probably easier to just do:
var outputName = inputName.Replace(" ", "");
This avoids allocating lots of intermediate strings.
Note also that the space character isn't the only whitespace character in unicode. If you need to deal with all whitespace characters, a regex may be a better option:
var outputName = Regex.Replace(inputName, @"\s", "");
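If you do want to keep the character-by-character loop from the question, a sketch along these lines shows the char comparison working. The class and method names are invented for illustration; a StringBuilder is used so the loop does not allocate a new string on every append, and char.IsWhiteSpace is offered as a loop-based alternative that is similar in spirit to the \s regex.

```csharp
using System;
using System.Text;

public static class WhitespaceStripper
{
    // Removes only the space character, comparing char to char with ' '.
    public static string RemoveSpaces(string input)
    {
        var builder = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            if (c != ' ')
                builder.Append(c);
        }
        return builder.ToString();
    }

    // Removes every character that char.IsWhiteSpace reports as whitespace
    // (spaces, tabs, line breaks and other Unicode whitespace).
    public static string RemoveAllWhitespace(string input)
    {
        var builder = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            if (!char.IsWhiteSpace(c))
                builder.Append(c);
        }
        return builder.ToString();
    }
}
```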
You can use .CompareTo(char) to compare characters.
Example :
if('Z'.CompareTo('Z') == 0)
Console.WriteLine("Same character !");
Thanks for all the great suggestions. inputName.CompareTo(" ") is not the way to go for this example, you would still have to have a loop. I ended up using:
var outputName = Regex.Replace(inputName, @"\s", "");
which works, and it's only one line of code!
In C# we can check a string with the string.Contains() method, e.g.:
string test = "Microsoft";
if (test.Contains("i"))
test = test.Replace("i","a");
This is fine. But what if the text I want to replace is the " symbol itself?
I want to achieve this:
"<html><head>
I want to remove the " symbol present in check so that the result would be:
<html><head>
The " character can also be replaced, just like any other:
test = test.Replace("\"","");
Also, note that you don't have to test if the character exists : your test.Contains("i") could be removed since the .Replace() method won't do anything (no replace, no error thrown) if the character doesn't exist inside the string.
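A short sketch of both points, using a made-up helper name (QuoteRemover.StripQuotes): the quote is written as "\"" inside a regular literal, and Replace is called without a Contains check first.

```csharp
using System;

public static class QuoteRemover
{
    // Removes every double-quote character; a string without quotes
    // comes back with the same content and no exception is thrown.
    public static string StripQuotes(string text)
    {
        return text.Replace("\"", "");
    }
}
```

Calling QuoteRemover.StripQuotes("\"<html><head>") should give <html><head>, and a string with no quotes passes through with its content unchanged.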
To include a quote symbol in a string, you need to escape it using a backslash. In your example, you want to use something like this:
if (test.Contains("\""))
There are two ways to include a '"' character in a string literal. All the answers so far have used the c-style way:
var quotation = "Parting is such sweet sorrow";
var howSweetIsIt = quotation + " that I shall say \"good-night\" till it be morrow.";
In some contexts (especially for users experienced with Visual Basic), the verbatim string literal may be easier to read. A verbatim string literal begins with an @ sign, and the only character that requires escaping is the quotation mark; all other characters are included verbatim (hence the name). Significantly, the method of escaping the quotation mark is different: rather than preceding it with a backslash, it must be doubled:
var howSweetIsIt = quotation + @" that I shall say ""good-night"" till it be morrow.";
string SymbolString = "Micro\"so\"ft";
The string above uses the escape character \ to insert " between the characters.
string Result = SymbolString.Replace("\"", string.Empty);
The Replace call above replaces each " character with an empty string.
Is this what you are trying to achieve?
if (check.Contains("\""))
output = check.Replace("\"", "");
output = check.Replace("\"", "");
Just remember to use "\"" for the quote sign as the backslash is an escape character.
if (str.Contains("\""))
{
str = str.Replace("\"", "");
}
I am writing a program to process special text files. Some of these text files end with a SUB character (a substitute character. It may be 0x1A.) How do I detect this character and remove it from the text file using C#?
If it's really 0x1A in the binary data, and if you're reading it as an ASCII or UTF-8 file, it should end up as U+001A when read in .NET. So you may be able to write something like:
string text = File.ReadAllText("file.txt");
text = text.Replace("\u001a", "");
File.WriteAllText("file.txt", text);
Note that the "\u001a" part is a string consisting of a single character: \uxxxx is an escape sequence for a single UTF-16 code unit with the given value expressed in hex.
The easiest answer would probably be a Regex:
public static string RemoveAll(this string input, char toRemove)
{
//produces a pattern like "\u001a+", which matches any run of one or more
//occurrences of the character with that hex value (\x takes exactly two
//hex digits, so the four-digit \u form is used to cover every char)
var pattern = @"\u" + ((int)toRemove).ToString("x4") + "+";
return Regex.Replace(input, pattern, String.Empty);
}
//usage
var cleanString = dirtyString.RemoveAll((char)0x1a);
Yes, you could just pass in the int, but that requires knowing the integer value of the character. Using a char as a parameter allows you to specify a literal or char variable with less muck.
C# has a method to detect control characters (including SUB).
See msdn : https://msdn.microsoft.com/en-us/library/9s05w2k9(v=vs.110).aspx
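As a rough sketch of that idea, assuming the method meant here is char.IsControl: the helper below (ControlCharFilter is a made-up name) detects SUB and strips control characters while keeping carriage returns, line feeds and tabs, which are also control characters but usually belong in a text file. Whether other control characters should be kept is an assumption you would need to check against your own files.

```csharp
using System;
using System.Linq;

public static class ControlCharFilter
{
    // True if the text contains the SUB character (U+001A).
    public static bool ContainsSub(string text)
    {
        return text.IndexOf('\u001a') >= 0;
    }

    // Drops every control character except CR, LF and tab.
    public static string RemoveControlChars(string text)
    {
        var kept = text.Where(c => !char.IsControl(c) || c == '\r' || c == '\n' || c == '\t');
        return new string(kept.ToArray());
    }
}
```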
You could also try something like this, which should work:
using (FileStream f = File.OpenRead("path\\file")) //Your filename + extension
{
using (StreamReader sr = new StreamReader(f))
{
string text = sr.ReadToEnd();
text = text.Replace("\u001a", string.Empty);
}
}
In my C# application I want to convert escape sequences in a string to the special characters they represent.
My input string is "G\u00f6teborg" and I want the output to be Göteborg.
I am using the code below:
string name = "G\\u00f6teborg";
StringBuilder sb = new StringBuilder(name);
sb = sb.Replace(@"\\", @"\");
string name1 = System.Web.HttpUtility.HtmlDecode(sb.ToString());
Console.WriteLine(name1);
In the above code the double backslash stays the same; it is not replaced with a single backslash, so after decoding I get the output G\u00f6teborg.
Please help me find a solution for this.
Thanks in advance.
string name = "G\\u00f6teborg";
Just remove one of the backslashes:
string name = "G\u00f6teborg";
If you got the input from a user then you need to do more: it’s not enough to replace a backslash because that’s not how the characters are stored internally, the \uXXXX is an escape sequence representing a Unicode code point.
If you want to replace a user input escape sequence by a Unicode code point you need to parse the user input properly. You can use a regular expression for that:
MatchEvaluator replacer = m => ((char) int.Parse(m.Groups[1].Value, NumberStyles.AllowHexSpecifier)).ToString();
string result = Regex.Replace(name, @"\\u([a-fA-F0-9]{4})", replacer);
This matches each escape group (\u followed by four hex digits), extracts the hex digits, parses them and translates them to a character.
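Put together with the using directives it needs, the same approach might look like the sketch below. UnicodeUnescaper and Decode are illustrative names rather than an existing API, and the code only handles the four-digit \uXXXX form described above.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class UnicodeUnescaper
{
    // Matches a literal backslash, 'u', and exactly four hex digits.
    private static readonly Regex EscapePattern = new Regex(@"\\u([0-9a-fA-F]{4})");

    // Replaces each \uXXXX sequence in the input with the character it encodes.
    public static string Decode(string input)
    {
        return EscapePattern.Replace(input, m =>
            ((char)int.Parse(m.Groups[1].Value, NumberStyles.AllowHexSpecifier)).ToString());
    }
}
```

With the input from the question, UnicodeUnescaper.Decode("G\\u00f6teborg") should give Göteborg.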
What does @ mean at the start of a string in C#?
Sorry but I can't find this on Google. I guess it maybe is not accepting my search string when I do a search.
Can someone tell me what this means in C#
var a = @"abc";
what's the meaning of the @?
It is a verbatim string literal, which basically means it will take any character as-is, including new lines. The one exception is ": to write out a ", use "".
The advantage of @-quoting is that escape sequences are not processed,
which makes it easy to write, for example, a fully qualified file
name:
@"c:\Docs\Source\a.txt" // rather than "c:\\Docs\\Source\\a.txt"
It means it's a verbatim string literal.
Without it, a \ in a string makes the compiler treat the \ and the character after it as an escape sequence, such as \n for new line. With an @ in front, the \ is treated literally.
In the example you've given, there is no difference in the output.
This says that the characters inside the double quotation marks should be interpreted exactly as they are.
You can see that the backslash is treated as a character and not an
escape sequence when the @ is used. The C# compiler also allows you to
use real newlines in verbatim literals. You must encode quotation
marks by doubling them.
string fileLocation = "C:\\CSharpProjects";
string fileLocation = @"C:\CSharpProjects";
Look at here for examples.
C# supports two forms of string literals: regular string literals and verbatim string literals.
A regular string literal consists of zero or more characters enclosed
in double quotes, as in "hello", and may include both simple escape
sequences (such as \t for the tab character) and hexadecimal and
Unicode escape sequences.
A verbatim string literal consists of an @ character followed by a
double-quote character, zero or more characters, and a closing
double-quote character. A simple example is @"hello". In a verbatim
string literal, the characters between the delimiters are interpreted
verbatim, the only exception being a quote-escape-sequence. In
particular, simple escape sequences and hexadecimal and Unicode
escape sequences are not processed in verbatim string literals. A
verbatim string literal may span multiple lines.
Code Example
string a = "hello, world"; // hello, world
string b = @"hello, world"; // hello, world
string c = "hello \t world"; // hello world
string d = @"hello \t world"; // hello \t world
string e = "Joe said \"Hello\" to me"; // Joe said "Hello" to me
string f = @"Joe said ""Hello"" to me"; // Joe said "Hello" to me
string g = "\\\\server\\share\\file.txt"; // \\server\share\file.txt
string h = @"\\server\share\file.txt"; // \\server\share\file.txt
string i = "one\r\ntwo\r\nthree";
string j = @"one
two
three";
Reference link: MSDN