C# ReadLine shows double backslashes when reading a text file - c#

I have read the answers to the following question:
Reading a line from text file returns unwanted slash
The answer states:
The debug view for strings escapes things that would normally be escaped in code. This means that things like inline quotes or real \'s will be escaped with a \ (making quotes look like \" and single slashes look like \\).
These slashes are not in the actual string, they are only there in the text viewer. You can verify that by writing out the string to the console or the debug output. Your string.Replace didn't work because there was nothing to replace.
The actual string displays correctly, without double backslashes, in the Console and in the debugger when I use the magnifier, but when I add it to a List collection of strings, the list item contains the double backslashes.
var data = new List<string>();
if (File.Exists(textFile))
{
    // Read file using StreamReader. Reads file line by line
    using (StreamReader file = new StreamReader(textFile))
    {
        string ln;
        while ((ln = file.ReadLine()) != null)
        {
            Console.WriteLine(ln);
            data.Add(ln); // <----- contains double quotes
        }
        file.Close();
    }
}
The incoming file contains a lot of data from the mainframe.
I'm not expecting double quotes. I'm using VS 2017.
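One way to check whether the doubled backslashes really exist in the list items is to inspect the characters rather than the debugger tooltip. The following is only a minimal sketch: the temporary file, the sample path and the `BackslashCheck` class are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class BackslashCheck
{
    // Reads every line of a file into a list, the same way the question does.
    public static List<string> ReadLines(string path)
    {
        var data = new List<string>();
        using (var reader = new StreamReader(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                data.Add(line);
            }
        }
        return data;
    }

    public static void Main()
    {
        // Hypothetical sample: a temp file holding one path with single backslashes.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, @"C:\data\in.txt" + Environment.NewLine);

        string first = ReadLines(path)[0];

        Console.WriteLine(first);                  // the raw characters, as the console shows them
        Console.WriteLine(first.Length);           // each backslash is counted once
        Console.WriteLine(first.Contains(@"\\"));  // False: no doubled backslash in memory

        File.Delete(path);
    }
}
```

If `Contains(@"\\")` reports False for the items in the list, the extra backslashes are only the debugger's escaped display of the string.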

Related

CSV file double-spacing lines

I am having trouble with OpenOffice Calc opening a CSV file that I create using StreamWriter in C#. When it opens, there is an empty line between every line that should be there (double-spaced). There seems to be some kind of doubling of the carriage returns. When I open the file in Notepad, it reads correctly. When I changed the program to write integers instead of strings, the problem went away. It seems to be adding a return to the end of each string, and then the formatting adds another return that I'm not seeing.
Output looks like this...
1...

2...

3...
Output should look like this...
1...
2...
3...
Here is the ForEach loop I use to write the List to file...
using (StreamWriter sw = new StreamWriter(@"c:\andy\Arduino StreamWriter.csv", false, Encoding.UTF8))
{
    foreach (string element in SerialPortString)
    {
        sw.WriteLine(element);
    }
}
There is only one field of data per line, so there are no delimiters, just new lines. I tried formatting so that it would write with quotes around each field hoping that would eliminate confusion for the CSV format, but I wasn't able to figure that out either.
Any help would be appreciated.
Thanks.
Change
sw.WriteLine(element);
to
sw.WriteLine(element.Trim());
or maybe
sw.WriteLine(element.TrimEnd());
Trim the element first. That will remove any LineFeeds or other whitespace characters around the 'edges' of the characters. Then the StreamWriter's CRLFs will be the only newlines present.
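The effect of trimming can be sketched without touching the disk. This is an illustration only: the sample readings and the `CsvLines` class are assumptions, and a `StringWriter` stands in for the `StreamWriter` so that the result can be inspected as a string.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class CsvLines
{
    // Writes one element per line, stripping trailing CR/LF first so that
    // WriteLine contributes the only line terminator.
    public static string WriteTrimmed(IEnumerable<string> elements)
    {
        var sb = new StringBuilder();
        using (var sw = new StringWriter(sb))
        {
            sw.NewLine = "\r\n"; // fixed here so the result is the same on every platform
            foreach (string element in elements)
            {
                sw.WriteLine(element.TrimEnd('\r', '\n'));
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Hypothetical serial data: each reading arrives with its own trailing newline.
        var serialPortStrings = new List<string> { "1\r\n", "2\r", "3\n" };
        Console.Write(WriteTrimmed(serialPortStrings));
    }
}
```

`TrimEnd('\r', '\n')` removes only trailing line breaks. Plain `Trim()` or `TrimEnd()` would also strip spaces and tabs, which may or may not be wanted for the data in question.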

Using the @ symbol for a directory so two backslashes aren't needed

This seems very simple, but after some research I cannot find out how to use the @ sign with a directory so that it doesn't need two backslashes.
An example is:
DirectoryInfo folderInfo = new DirectoryInfo(@"C:\");
But in my application the directory will be dynamic, so I cannot do this:
DirectoryInfo folderInfo = new DirectoryInfo(@Globals.directoryRoute);
So I was wondering what the correct way is to put the @ symbol before the string.
Globals.directoryRoute is set to C:\, but the user can change this input, so I was hoping that instead of having to parse out every double backslash I could use this so that only one backslash is needed.
Would this be an effective way of doing it or should I just parse out every second backslash?
The @ prefix tells the compiler not to treat the backslash as an escape character within the string literal that follows. If the string is entered at runtime, you don't need to worry about that. So you can just use the content of Globals.directoryRoute as it is.
The double backslashes are only needed for string literals in your code. In memory, only a single backslash is stored in the string, so no @ symbol is needed when dealing with strings that are already in memory. Similarly, user input does not need the double backslashes, since it is not interpreted in the same manner as source code. For instance, if you have a text box called txtPath, the user can simply type C:\some\path, not C:\\some\\path as you would normally need to do in source code. When you read the value of that text box in code, you can just use:
string path = txtPath.Text;
This will be the same as if you had the following code:
string path = @"C:\some\path";
or, equivalently:
string path = "C:\\some\\path";

How to read double quotes (") in a text file in C#?

I have to read a text file and then to parse it, in C# using VS 2010. The sample text is as follows,
[TOOL_TYPE]
; provides the name of the selected tool for programming
“Phoenix Select Advanced”;
[TOOL_SERIAL_NUMBER]
; provides the serial number for the tool
7654321;
[PRESSURE_CORRECTION]
; provides the Pressure correction information requirement
“Yes”;
[SURFACE_MOUNT]
; provides the surface mount information
“Yes”;
[SAPPHIRE_TYPE]
; provides the sapphire type information
“No”;
Now I have to parse only the string data (in double quotes) and headers (in square brackets[]), and then save it into another text file. I can successfully parse the headers but the string data in double quotes is not appearing correctly, as shown below.
[TOOL_TYPE]
�Phoenix Select Advanced�;
[TOOL_SERIAL_NUMBER]
7654321;
[PRESSURE_CORRECTION]
�Yes�;
[SURFACE_MOUNT]
�Yes�;
[SAPPHIRE_TYPE]
�No�;
[EXTENDED_TELEMETRY]
�Yes�;
[OVERRIDE_SENSE_RESISTOR]
�No�;
Please note the special character (�) that appears wherever a double quote should be.
How can I write the double quotes (") to the destination file and avoid the (�)?
Update
I am using the following line for my parsing
temporaryconfigFileWriter.WriteLine(configFileLine, false, Encoding.Unicode);
Here is the complete code I am using:
string temporaryConfigurationFileName = System.Environment.GetFolderPath(Environment.SpecialFolder.Desktop) + "\\Temporary_Configuration_File.txt";
// Reader for the configuration file ('configFileReader') and writer for the temporary configuration file ('temporaryconfigFileWriter')
StreamReader configFileReader = new StreamReader(CommandLineVariables.ConfigurationFileName);
StreamWriter temporaryconfigFileWriter = new StreamWriter(temporaryConfigurationFileName);
// Check whether the 'END_OF_FILE' header is specified, to avoid searching for the end of the file indefinitely
if ((File.ReadAllText(CommandLineVariables.ConfigurationFileName)).Contains("[END_OF_FILE]"))
{
    // Read the file until it reaches 'END_OF_FILE'
    while (!((configFileLine = configFileReader.ReadLine()).Contains("[END_OF_FILE]")))
    {
        configFileLine = configFileLine.Trim();
        if (!(configFileLine.StartsWith(";")) && !(string.IsNullOrEmpty(configFileLine)))
        {
            temporaryconfigFileWriter.WriteLine(configFileLine, false, Encoding.UTF8);
        }
    }
    // Write the last header [END_OF_FILE]
    temporaryconfigFileWriter.WriteLine(configFileLine);
    configFileReader.Close();
    temporaryconfigFileWriter.Close();
}
Your input file doesn't contain straight double quotes. It contains the opening (“) and closing (”) double quotation marks, not the standard ASCII version.
First, make sure that you are reading your input with the correct encoding. (Try several and display the string in a text box in C#; you'll see fairly quickly whether the characters show correctly.)
If you want such characters to appear in your output, you must write the output file as something other than ASCII. If you write it as UTF-8, for example, make sure that it starts with the byte order mark; otherwise the file will still be readable, but some software such as Notepad may display two characters per quote because it won't detect that the file isn't ASCII.
Another choice is simply to replace “ and ” with "
It appears that you are using proper typographic quotes (“...”) instead of the straight ASCII ones ("..."). My guess would be that you read the text file with the wrong encoding.
If you can see them properly in Notepad and neither ASCII nor one of the Unicode encodings works, then it's probably codepage 1252. You can get that encoding via
Encoding.GetEncoding(1252)
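To illustrate why the wrong encoding produces �, here is a small sketch that decodes the same bytes two ways. It assumes the file really is Windows-1252, which the question does not confirm, and it assumes .NET Core 3.0 or later, where code page 1252 has to be registered first. On .NET Framework the `RegisterProvider` line can be dropped. The `QuoteDecoding` class and the sample bytes are invented for the example.

```csharp
using System;
using System.Text;

public static class QuoteDecoding
{
    // The bytes of “Yes”; as a Windows-1252 file would store them:
    // 0x93 and 0x94 are the opening and closing curly quotes.
    public static readonly byte[] Sample = { 0x93, 0x59, 0x65, 0x73, 0x94, 0x3B };

    // Decoding as UTF-8 turns each invalid byte into U+FFFD, the � character.
    public static string DecodeAsUtf8(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }

    // Decoding as code page 1252 recovers the typographic quotes.
    public static string DecodeAs1252(byte[] bytes)
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code pages
        // are only available after registering this provider.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1252).GetString(bytes);
    }

    // Optional: swap typographic quotes for straight ASCII quotes.
    public static string ToStraightQuotes(string text)
    {
        return text.Replace('\u201C', '"').Replace('\u201D', '"');
    }

    public static void Main()
    {
        Console.WriteLine(DecodeAsUtf8(Sample) == "\uFFFDYes\uFFFD;");   // True
        Console.WriteLine(DecodeAs1252(Sample) == "\u201CYes\u201D;");   // True
        Console.WriteLine(ToStraightQuotes(DecodeAs1252(Sample)));       // "Yes";
    }
}
```

For a file, the same idea would be to pass the encoding to the reader, for example `new StreamReader(path, Encoding.GetEncoding(1252))`. Note also that in the question's code the extra arguments in `WriteLine(configFileLine, false, Encoding.UTF8)` are treated as format arguments by the `WriteLine(string, object, object)` overload, so they do not change the writer's encoding. The encoding would be set through the `StreamWriter` constructor instead.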

Cannot remove backslashes added by StreamReader.ReadLine - c sharp

The text I read from the file contains the following string: 1"-DC-19082-A3
After reading that line, I get the following string (seen while debugging): "\"1\"\"-DC-19082-A3\""
Since I'm using it to search the DB, it's of no use like that.
Any idea how I can get back to the original?
I've seen that StreamWriter.WriteLine would do the job, but I don't want to create a file just for this.
I tried the following, but it didn't work:
StringBuilder sb = new StringBuilder();
sb.Append(@"\");
sb.Append('"');
string strToReplace = sb.ToString();
string lineNumberToSearchFor = lineNumber.Replace(strToReplace, string.Empty);
hopefully there's an easy way of achieving this
many thanks!
It adds the backslashes to indicate quotes in the middle of the string. The backslashes aren't actually there, the quotes are.
If you want to remove the quotes instead:
myString = myString.Replace("\"", string.Empty);
The backslashes are added by the Watch window.
The string itself doesn't have backslashes.
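A quick sketch of what is actually in memory, using the string from the question. `StripQuotes` follows the answer above. `UnquoteCsvField` is an extra, hypothetical helper that assumes the line is a CSV-style quoted field (outer quotes plus doubled inner quotes), which the question does not state.

```csharp
using System;

public static class QuoteStripping
{
    // Removes every double-quote character. Replace returns a new string,
    // so the result has to be used or assigned.
    public static string StripQuotes(string s)
    {
        return s.Replace("\"", string.Empty);
    }

    // Assumption: the value is a CSV-quoted field. Drops the outer quotes
    // and turns each doubled quote back into a single one.
    public static string UnquoteCsvField(string field)
    {
        if (field.Length >= 2 && field[0] == '"' && field[field.Length - 1] == '"')
        {
            string inner = field.Substring(1, field.Length - 2);
            return inner.Replace("\"\"", "\"");
        }
        return field;
    }

    public static void Main()
    {
        // The debugger shows this value as "\"1\"\"-DC-19082-A3\"".
        string lineNumber = "\"1\"\"-DC-19082-A3\"";

        Console.WriteLine(lineNumber);                  // "1""-DC-19082-A3"
        Console.WriteLine(lineNumber.Contains(@"\"));   // False: no backslash in the string
        Console.WriteLine(StripQuotes(lineNumber));     // 1-DC-19082-A3
        Console.WriteLine(UnquoteCsvField(lineNumber)); // 1"-DC-19082-A3
    }
}
```

If the value being searched for in the DB is 1"-DC-19082-A3, the second helper is probably closer to what is needed, since removing every quote also removes the one that belongs to the value.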

How to read/write text and avoid special character signs (<, , >, etc)

I am currently parsing some C# scripts that are stored in a database, extracting the body of some methods in the code, and then writing an XML file that shows the id, the body of the extracted methods, etc.
The problem I have right now is that when I write the code into the XML I have to write it as a literal string, so I thought I'd need to add " at the beginning and end:
new XElement("MethodName", @"""" + Extractor.GetMethodBody(rule.RuleScript, "MethodName") + @"""")
This works, but I have a problem: things that are written in the DB as
for (int n = 1; n < 10; n++)
are written into the XML file (or printed to console) as:
for (int n = 1; n &lt; 10; n++)
How can I get it to print the actual character and not its code? The code in the database is written with the actual characters, not the "safe" &lt;-like ones.
Inside XML (as a text value) it is correct for < to be encoded as &lt;. The internal representation of XML doesn't affect the value, so let it be encoded. You can get around this by forcing a CDATA section, but in all honesty, it isn't worth it. But here is an example using CDATA:
string noEncoding = new XElement("foo", new XCData("a < b")).ToString();
Why do you think that you have to write it as a literal string? That is not so. Besides, you are not writing it as a literal string at all; it's still a dynamic string value, only with quotation marks added around it.
A literal string is a string that is written literally in the code, like "Hello world". If you get the string in any other way, it's not a literal string.
The quotation marks that you have added simply become part of the value; they don't do anything else to the string. You can add the string without the quotation marks just fine:
new XElement("MethodName", Extractor.GetMethodBody(rule.RuleScript, "MethodName"))
Now, the characters are encoded when they are put in the XML because they need to be encoded. You can't put a < character inside a value without encoding it.
If you show the XML, you will see the encoded values, and that is just a sign that it works as it should. When you read the XML, the encoded characters will be decoded, and you end up with the original string.
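The round trip can be shown in a few lines. This is a sketch only: the `XmlEscaping` class is made up for the example, and the loop header stands in for whatever `Extractor.GetMethodBody` returns.

```csharp
using System;
using System.Xml.Linq;

public static class XmlEscaping
{
    // Serializes the method body as element text; XElement escapes '<' for us.
    public static string Serialize(string body)
    {
        return new XElement("MethodName", body).ToString();
    }

    // Parses the serialized XML again and returns the decoded text value.
    public static string RoundTrip(string body)
    {
        return XElement.Parse(Serialize(body)).Value;
    }

    public static void Main()
    {
        string body = "for (int n = 1; n < 10; n++)";

        Console.WriteLine(Serialize(body)); // <MethodName>for (int n = 1; n &lt; 10; n++)</MethodName>
        Console.WriteLine(RoundTrip(body)); // for (int n = 1; n < 10; n++)
    }
}
```

The `&lt;` only exists in the serialized text. Anything that reads the file with an XML parser gets the original `<` back.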
I don't know what software is going to be used to read the XML, but any that I know of will throw an error on parsing any XML that does not escape < and > characters which aren't used as tag starts and ends. It's just part of the XML specification; these characters are reserved as part of the structure.
If I were you, then, I'd part ways with the System.Xml utilities and write this file myself. Any decent XML tool is going to encode those characters for you, so you should probably not use one. Go with a StreamWriter and create the output the way you are being told to. That way you can control the XML output yourself, even if it means breaking the XML specification.
using (StreamWriter sw = new StreamWriter("c:\\xmlText.xml", false, Encoding.UTF8))
{
    sw.WriteLine("<?xml version=\"1.0\"?>");
    sw.WriteLine("<Class>");
    sw.Write("\t<Method Name=\"MethodName\">");
    sw.Write(@"""" + Extractor.GetMethodBody(rule.RuleScript, "MethodName") + @"""");
    sw.WriteLine("</Method>");
    // ... and so on and so forth
    sw.WriteLine("</Class>");
}
