StreamWriter.WriteLine is including an extra \r

I created a class with the responsibility to generate a text file where each line represents the information of an object of 'MyDataClass' class. Below is a simplification of my code:
public class Generator
{
private readonly Stream _stream;
private readonly StreamWriter _streamWriter;
private readonly List<MyDataClass> _items;
public Generator(Stream stream)
{
_stream = stream;
_streamWriter = new StreamWriter(_stream, Encoding.GetEncoding("ISO-8859-1"));
}
public void Generate()
{
foreach (var item in _items)
{
var line = AnotherClass.GetLineFrom(item);
_streamWriter.WriteLine(line);
}
_streamWriter.Flush();
_stream.Position = 0;
}
}
And I call this class like this:
using (var file = new FileStream("name", FileMode.OpenOrCreate, FileAccess.ReadWrite))
{
new Generator(file).Generate();
}
When I run the application in Visual Studio (I tested with run (Ctrl+F5) and debug (F5), in both Debug and Release mode), everything goes according to plan. But when I publish the application to an IIS server, the StreamWriter class puts an extra \r before the end of each line.
Check out the hexadecimal view of both generated files:
Running in Visual Studio:
http://www.jonataspiazzi.xpg.com.br/hex_vs.bmp
Running in IIS:
http://www.jonataspiazzi.xpg.com.br/hex_iis.bmp
Some things I already checked:
Writing the line variable (from var line = AnotherClass.GetLineFrom(item);) to a log to see whether an extra '\r' is included by the class AnotherClass.
That turned up nothing: the last char in line is a regular char, as expected (in the example above it is a space).
Writing another piece of code to see whether the problem is general to all StreamWriter instances on IIS.
I tried this:
var ms = new MemoryStream();
var sw = new StreamWriter(ms, Encoding.GetEncoding("ISO-8859-1"));
sw.WriteLine("Test");
sw.WriteLine("Of");
sw.WriteLine("Lines");
sw.Flush();
ms.Position = 0;
In this case the code works well in both Visual Studio and IIS.
I've been stuck on this for 3 days and have already tried everything I can think of. Does anyone have any clue what I can try?
UPDATE
It gets weirder! I tried replacing the line _streamWriter.WriteLine(line); with:
_streamWriter.Write(linhaTexto + Environment.NewLine);
And even worse:
_streamWriter.Write(linhaTexto + "\r\n");
Both keep generating the extra \r character.
I tried replacing it with this:
_streamWriter.Write(linhaTexto + "#\r\n#");
And got:
http://www.jonataspiazzi.xpg.com.br/hex_sharp.bmp

According to MSDN, WriteLine:
"Writes data followed by a line terminator to the text string or stream."
So your last line should be written with
_streamWriter.Write(line);
Put it outside of your loop and change your loop so it doesn't manage the last line.

My guess is that the extra \r is added during the FTP transfer (maybe try a binary transfer).
Like here
I've tested the code, and the extra \r is not caused by the code in the question.
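To make the FTP guess concrete: a text-mode transfer that puts a CR in front of every LF, without checking whether one is already there, would turn \r\n into \r\r\n, which is the kind of extra \r described in the question. The sketch below only simulates that behaviour. SimulateAsciiTransfer and CountDoubledCr are hypothetical helpers, not part of any FTP library, and whether a particular client or server really rewrites line endings this way is an assumption to verify.

```csharp
using System;
using System.IO;
using System.Text;

public static class TransferDemo
{
    // Hypothetical stand-in for a text-mode transfer: inserts CR before
    // every LF without checking whether a CR is already present.
    public static byte[] SimulateAsciiTransfer(byte[] input)
    {
        using (var output = new MemoryStream())
        {
            foreach (byte b in input)
            {
                if (b == (byte)'\n') output.WriteByte((byte)'\r');
                output.WriteByte(b);
            }
            return output.ToArray();
        }
    }

    // Counts occurrences of the byte sequence CR CR LF.
    public static int CountDoubledCr(byte[] data)
    {
        int count = 0;
        for (int i = 0; i + 2 < data.Length; i++)
        {
            if (data[i] == '\r' && data[i + 1] == '\r' && data[i + 2] == '\n') count++;
        }
        return count;
    }

    public static void Main()
    {
        byte[] original = Encoding.GetEncoding("ISO-8859-1").GetBytes("Test\r\nOf\r\nLines\r\n");
        byte[] transferred = SimulateAsciiTransfer(original);
        Console.WriteLine(CountDoubledCr(original));    // 0
        Console.WriteLine(CountDoubledCr(transferred)); // 3
    }
}
```

Running CountDoubledCr on the file as it sits on the server, and again on the downloaded copy, would show whether the extra byte appears before or after the transfer.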

I had a similar issue. Environment.NewLine and WriteLine gave me an extra \r character. But the code below worked for me:
StringBuilder sbFileContent = new StringBuilder();
sbFileContent.Append(line);
sbFileContent.Append("\n");
streamWriter.Write(sbFileContent.ToString());
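Rather than appending "\n" by hand, the terminator that WriteLine emits can be set through the writer's NewLine property (inherited from TextWriter). The following is a minimal sketch, assuming the consumer of the file accepts a bare LF; WriteLines is a hypothetical helper name. It pins down what the writer itself produces, but it would not undo a \r that is added after the file has been written.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class NewLineDemo
{
    // Writes each line followed by the given terminator and returns the raw bytes.
    public static byte[] WriteLines(IEnumerable<string> lines, string terminator)
    {
        var stream = new MemoryStream();
        using (var writer = new StreamWriter(stream, Encoding.GetEncoding("ISO-8859-1")))
        {
            writer.NewLine = terminator; // replaces the default Environment.NewLine
            foreach (string line in lines)
            {
                writer.WriteLine(line);
            }
        }
        // ToArray still works after the writer has closed the MemoryStream.
        return stream.ToArray();
    }

    public static void Main()
    {
        byte[] bytes = WriteLines(new[] { "A", "B" }, "\n");
        Console.WriteLine(BitConverter.ToString(bytes)); // 41-0A-42-0A
    }
}
```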

I just had a similar problem where the code below would randomly insert blank lines in the output file (outFile):
using (StreamWriter outFile = new StreamWriter(outFilePath, true)) {
foreach (string line in File.ReadLines(logPath)) {
string concatLine = parse(line, out bool shouldWrite);
if (shouldWrite) {
outFile.WriteLine(concatLine);
}
}
}
Using Antar's idea, I changed my parse function so that it returns a line with Environment.NewLine appended, i.e.
return myStringBuilder.Append(Environment.NewLine).ToString();
and then in the foreach loop above, changed the
outFile.WriteLine(concatLine);
to
outFile.Write(concatLine);
and now it writes the file without a bunch of random new lines inserted. However, I still have absolutely no idea why I should have to do this.

Related

StreamWriter adds extra character(s) on new line(s) at the end of file

I'm trying to modify an .ini file, in C# with .NET 5.0, using FileStream and StreamReader / StreamWriter. I just need to modify the first line of the file so I read the entire file into a list of strings called strList, modify the first line, and then write it all back to the same file.
List<string> strList = new List<string>();
using (FileStream fs = File.OpenRead(@"C:\MyFolder\test.ini"))
{
using (StreamReader sr = new StreamReader(fs))
{
while (!sr.EndOfStream)
{
strList.Add(sr.ReadLine());
}
}
}
strList[0] = "test01";
using (FileStream fs = File.OpenWrite(@"C:\MyFolder\test.ini"))
{
using (StreamWriter sw = new StreamWriter(fs))
{
for (int x = 0; x < strList.Count; x++)
{
sw.WriteLine(strList[x]);
}
}
}
The issue I'm running into is that I'll have new character(s) at the end of my file on new line(s). I verified that the number of lines I read from the file matches what is in the file and that the for loop only writes that same number of lines back into the file. I don't have any issues writing other strings except for "test01". This string is the only one that causes the issue that I just described. It seems to be grabbing characters from the last line like R or LAYER from MULTI_LAYER.
Ex 1: This
S10087_U1
Cq4InEq=TRUE
XtrVer=5.5
IOCUPDATEMDB=TRUE
ARCHITECTURE=MULTI_LAYER
Becomes this
test01
Cq4InEq=TRUE
XtrVer=5.5
IOCUPDATEMDB=TRUE
ARCHITECTURE=MULTI_LAYER
R
Ex 2: This
test01 - Copy
Cq4InEq=TRUE
XtrVer=5.5
IOCUPDATEMDB=TRUE
ARCHITECTURE=MULTI_LAYER
ER
Becomes this
test01
Cq4InEq=TRUE
XtrVer=5.5
IOCUPDATEMDB=TRUE
ARCHITECTURE=MULTI_LAYER
LAYER
Replacing the StreamWriter portion with the following seems to fix the issue but I'm trying to figure out why using StreamWriter doesn't work as I expect it to.
File.WriteAllLines(@"C:\MyFolder\test.ini", strList);
This is because you're using File.OpenWrite. From the remarks in the documentation:
The OpenWrite method opens a file if one already exists for the file path, or creates a new file if one does not exist. For an existing file, it does not append the new text to the existing text. Instead, it overwrites the existing characters with the new characters. If you overwrite a longer string (such as "This is a test of the OpenWrite method") with a shorter string (such as "Second run"), the file will contain a mix of the strings ("Second runtest of the OpenWrite method").
While you could just change your code to use File.Create instead, I'd suggest changing the code more significantly - not just the writing, but the reading too:
string path = @"C:\MyFolder\test.ini";
var lines = File.ReadAllLines(path);
lines[0] = "test01";
File.WriteAllLines(path, lines);
That's much simpler code to do the same thing.
The half-way house between the two would be to use File.OpenText (to return a StreamReader) and File.CreateText (to return a StreamWriter). There's no need to do the wrapping yourself.
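The difference between File.OpenWrite and File.Create can be reproduced with the strings from the documentation remark quoted above. This is a small sketch that uses a temporary file; Overwrite is a hypothetical helper name, and the truncate flag only selects which of the two File methods opens the stream.

```csharp
using System;
using System.IO;

public static class OpenWriteDemo
{
    // Overwrites the file with text and returns the resulting file content.
    public static string Overwrite(string path, string text, bool truncate)
    {
        // File.Create truncates an existing file; File.OpenWrite does not.
        using (FileStream fs = truncate ? File.Create(path) : File.OpenWrite(path))
        using (var writer = new StreamWriter(fs))
        {
            writer.Write(text);
        }
        return File.ReadAllText(path);
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, "This is a test of the OpenWrite method");
            Console.WriteLine(Overwrite(path, "Second run", truncate: false));
            // Second runtest of the OpenWrite method
            Console.WriteLine(Overwrite(path, "Second run", truncate: true));
            // Second run
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

The same leftover effect explains the stray R and LAYER in the question: the new first line is shorter than the old one, so the tail of the old file survives past the end of the newly written text.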

OutofMemory Exception when reading and replacing strings with StreamReader and StreamWriter

I'm trying to convert a file's encoding and replace some text along the way. Unfortunately, I'm getting an OutOfMemory exception. I'm not sure why. As I understand it, it streams the original file line by line into a var (str), completes a couple of string replacements, and then writes the converted line to the StreamWriter.
Can someone tell me what I'm doing wrong here?
EDIT 1
- I'm currently testing a single file - 1GB:2.5m rows.
- Replaced read and replace into a single line. Same results!
EDIT 2
By the way, can anyone tell me why the question was downvoted? I'd like to know for future postings.
The problem is with the file itself. It's output from SQL Server BCP where I explicitly flag the row terminator with a specific string. By default, when the row terminator flag is omitted, BCP adds a newline at the end of each row and the code below works perfectly.
What I still don't understand is: when I set the row terminator flag with a specific string, each record appears on a newline, so why doesn't streamreader see each record on a separate line? Instead, it appears it views the entire file as one long line. That still doesn't explain the OOM exception since I have well over a 100G of memory.
Unfortunately, explicitly setting the row terminator flag is a must. For now, I'll take this over to dba exchange.
Thanks
static void Main(string[] args)
{
String msg = String.Empty;
String str = String.Empty;
DirectoryInfo dInfo = new DirectoryInfo(@"\\server\share");
foreach (var f in dInfo.GetFiles())
{
using (StreamReader sr = new StreamReader(f.FullName, Encoding.Unicode, false))
{
using (StreamWriter sw = new StreamWriter(f.DirectoryName + "\\new\\" + f.Name, false, Encoding.UTF8))
{
try
{
while (!sr.EndOfStream)
{
str = sr.ReadLine().Replace("this","that");
sw.WriteLine(str);
}
}
catch (Exception e)
{
msg += f.Name + ": " + e.Message;
}
}
}
}
Console.WriteLine(msg);
Console.ReadLine();
}
Well, your main reading and writing code needs just one line of data. Your msg string, on the other hand, keeps getting larger and larger with each exception.
You'll need to have many millions of files in the folder to get an OutOfMemory exception this way, though.
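If, as EDIT 2 in the question suggests, the records are separated by a custom terminator string rather than by a line break that StreamReader recognises, ReadLine has to build the whole file as one string, which can fail long before physical memory runs out. One possible workaround is to split on the terminator while streaming. The sketch below assumes a made-up terminator "|~|"; ReadRecords and EndsWith are hypothetical helpers, and the real terminator passed to BCP would have to be substituted.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class RecordReader
{
    // Yields one record at a time, splitting on an arbitrary terminator string,
    // so only the current record is held in memory.
    public static IEnumerable<string> ReadRecords(TextReader reader, string terminator)
    {
        if (string.IsNullOrEmpty(terminator))
            throw new ArgumentException("Terminator must not be empty.", nameof(terminator));

        var record = new StringBuilder();
        int ch;
        while ((ch = reader.Read()) >= 0)
        {
            record.Append((char)ch);
            if (EndsWith(record, terminator))
            {
                record.Length -= terminator.Length; // drop the terminator itself
                yield return record.ToString();
                record.Clear();
            }
        }
        if (record.Length > 0) yield return record.ToString(); // last record without terminator
    }

    private static bool EndsWith(StringBuilder sb, string suffix)
    {
        if (sb.Length < suffix.Length) return false;
        int offset = sb.Length - suffix.Length;
        for (int i = 0; i < suffix.Length; i++)
        {
            if (sb[offset + i] != suffix[i]) return false;
        }
        return true;
    }

    public static void Main()
    {
        using (var reader = new StringReader("row one|~|row two|~|row three"))
        {
            foreach (string record in ReadRecords(reader, "|~|"))
            {
                Console.WriteLine(record.Replace("row", "record"));
            }
        }
    }
}
```

In the question's loop, the StreamReader would be passed in place of the StringReader, and each record would be written with sw.Write(record + terminator) or sw.WriteLine(record), depending on the format the output needs.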

In C#, How can I copy a file with arbitrary encoding, reading line by line, without adding or deleting a newline

I need to be able to take a text file with unknown encoding (e.g., UTF-8, UTF-16, ...) and copy it line by line, making specific changes as I go. In this example, I am changing the encoding, however there are other uses for this kind of processing.
What I can't figure out is how to determine if the last line has a newline! Some programs care about the difference between a file with these records:
Rec1<newline>
Rec2<newline>
And a file with these:
Rec1<newline>
Rec2
How can I tell the difference in my code so that I can take appropriate action?
using (StreamReader reader = new StreamReader(sourcePath))
using (StreamWriter writer = new StreamWriter(destinationPath, false, outputEncoding))
{
bool isFirstLine = true;
while (!reader.EndOfStream)
{
string line = reader.ReadLine();
if (isFirstLine)
{
writer.Write(line);
isFirstLine = false;
}
else
{
writer.Write("\r\n" + line);
}
}
//if (LastLineHasNewline)
//{
// writer.Write("\n");
//}
writer.Flush();
}
The commented-out code is what I want to be able to do, but I can't figure out how to set the condition LastLineHasNewline! Remember, I have no a priori knowledge of the input file encoding.
"Remember, I have no a priori knowledge of the input file encoding."
That's the fundamental problem to solve.
If the file could be using any encoding, then there is no concept of reading "line by line" as you can't possibly tell what the line ending is.
I suggest you first address this part, and the rest will be easy. Now, without knowing the context it's hard to say whether that means you should be asking the user for the encoding, or detecting it heuristically, or something else - but I wouldn't start trying to use the data before you can fully understand it.
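One heuristic for the detection step is to let StreamReader look for a byte order mark and then ask it which encoding it settled on. This is only a sketch: Detect is a hypothetical helper, the fallback encoding is an assumption the caller has to supply, and a file without a BOM will simply report the fallback, so this narrows the problem rather than solving it.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingProbe
{
    // Returns the encoding StreamReader selects after checking for a BOM,
    // or the fallback when no BOM is present.
    public static Encoding Detect(string path, Encoding fallback)
    {
        using (var reader = new StreamReader(path, fallback, detectEncodingFromByteOrderMarks: true))
        {
            reader.Peek(); // forces the first buffer read, which is when detection happens
            return reader.CurrentEncoding;
        }
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            // Encoding.Unicode (UTF-16 LE) writes a BOM by default.
            File.WriteAllText(path, "Rec1\r\nRec2\r\n", Encoding.Unicode);
            Console.WriteLine(Detect(path, Encoding.UTF8).CodePage); // 1200
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```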
As often happens, the moment you go to ask for help, the answer comes to the surface. The commented out code becomes:
if (LastLineHasNewline(reader))
{
writer.Write("\n");
}
And the function looks like this:
private static bool LastLineHasNewline(StreamReader reader)
{
byte[] newlineBytes = reader.CurrentEncoding.GetBytes("\n");
int newlineByteCount = newlineBytes.Length;
reader.BaseStream.Seek(-newlineByteCount, SeekOrigin.End);
byte[] inputBytes = new byte[newlineByteCount];
reader.BaseStream.Read(inputBytes, 0, newlineByteCount);
for (int i = 0; i < newlineByteCount; i++)
{
if (newlineBytes[i] != inputBytes[i])
return false;
}
return true;
}
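A variant of the same idea that leaves the reader's position alone and copes with files shorter than one newline is sketched below. EndsWithNewline is a hypothetical helper; it assumes the encoding is already known (for example from reader.CurrentEncoding after the file has been read) and, like the function above, it only tests for "\n".

```csharp
using System;
using System.IO;
using System.Text;

public static class TailCheck
{
    // Returns true when the file's final bytes are the encoded form of "\n".
    public static bool EndsWithNewline(string path, Encoding encoding)
    {
        byte[] newline = encoding.GetBytes("\n");
        using (FileStream fs = File.OpenRead(path))
        {
            if (fs.Length < newline.Length) return false; // too short to end in a newline

            fs.Seek(-newline.Length, SeekOrigin.End);
            byte[] tail = new byte[newline.Length];
            int read = 0;
            while (read < tail.Length) // Read may return fewer bytes than requested
            {
                int n = fs.Read(tail, read, tail.Length - read);
                if (n == 0) return false;
                read += n;
            }

            for (int i = 0; i < newline.Length; i++)
            {
                if (tail[i] != newline[i]) return false;
            }
            return true;
        }
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, "Rec1\r\nRec2\r\n", Encoding.Unicode);
            Console.WriteLine(EndsWithNewline(path, Encoding.Unicode)); // True
            File.WriteAllText(path, "Rec1\r\nRec2", Encoding.Unicode);
            Console.WriteLine(EndsWithNewline(path, Encoding.Unicode)); // False
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```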

How do I locate a particular word in a text file using .NET

I am sending mails (in ASP.NET, C#) using a template stored in a text file (.txt), like the one below:
User Name :<User Name>
Address : <Address>.
I replace the words within the angle brackets in the text file using the code below:
StreamReader sr;
sr = File.OpenText(HttpContext.Current.Server.MapPath(txt));
copy = sr.ReadToEnd();
sr.Close(); //close the reader
copy = copy.Replace(word.ToUpper(),"#" + word.ToUpper()); //remove the word specified UC
//save new copy into existing text file
FileInfo newText = new FileInfo(HttpContext.Current.Server.MapPath(txt));
StreamWriter newCopy = newText.CreateText();
newCopy.WriteLine(copy);
newCopy.Write(newCopy.NewLine);
newCopy.Close();
Now I have a new problem:
users will be adding new words within angle brackets; for example, they might add <Salary>.
In that case I have to read the file and find the word <Salary>.
In other words, I have to find all the words that are enclosed in angle brackets (<>).
How do I do that?
Having a stream for your file, you can build something similar to a typical tokenizer.
In general terms, this works as a finite state machine: you need an enumeration for the states (in this case it could be simplified down to a boolean, but I'll give you the general approach so you can reuse it for similar tasks) and a function implementing the logic. C#'s iterators are a good fit for this problem, so I'll be using them in the snippet below. Your function will take a reader as an argument, will use an enumerated value and a char buffer internally, and will yield the strings one by one. You'll need this near the start of your code file:
using System.Collections.Generic;
using System.IO;
using System.Text;
And then, inside your class, something like this:
enum States {
OUT,
IN,
}
IEnumerable<string> GetStrings(TextReader reader) {
States state=States.OUT;
StringBuilder buffer=new StringBuilder(); // initialized so the compiler's definite-assignment check passes
int ch;
while((ch=reader.Read())>=0) {
switch(state) {
case States.OUT:
if(ch=='<') {
state=States.IN;
buffer=new StringBuilder();
}
break;
case States.IN:
if(ch=='>') {
state=States.OUT;
yield return buffer.ToString();
} else {
buffer.Append((char)ch); // Read() returns a UTF-16 code unit; ConvertFromUtf32 would throw on surrogates
}
break;
}
}
}
The finite-state machine model always has the same layout: while(READ_INPUT) { switch(STATE) {...}}: inside each case of the switch, you may be producing output and/or altering the state. Beyond that, the algorithm is defined in terms of states and state changes: for any given state and input combination, there is an exact new state and output combination (the output can be "nothing" on those states that trigger no output; and the state may be the same old state if no state change is triggered).
Hope this helps.
EDIT: forgot to mention a couple of things:
1) You get a TextReader to pass to the function by creating a StreamReader for a file, or a StringReader if you already have the file's contents in a string.
2) The memory and time costs of this approach are O(n), with n being the length of the file. They seem quite reasonable for this kind of task.
Using regex.
var matches = Regex.Matches(text, "<(.*?)>");
List<string> words = new List<string>();
for (int i = 0; i < matches.Count; i++)
{
words.Add(matches[i].Groups[1].Value);
}
Of course, this assumes you already have the file's text in a variable. Since you have to read the entire file to achieve that, you could instead look for the words as you are reading the stream, but I don't know what the performance trade-off would be.
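A self-contained version of the regex approach, run against the template from the question, might look like the sketch below. FindPlaceholders is a hypothetical helper, and the pattern <([^<>]+)> is a slightly stricter variant of the one above (it refuses nested or empty brackets), which is a design choice rather than a requirement.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PlaceholderFinder
{
    // Returns each distinct word found between angle brackets, in order of first appearance.
    public static List<string> FindPlaceholders(string text)
    {
        var words = new List<string>();
        foreach (Match match in Regex.Matches(text, "<([^<>]+)>"))
        {
            string word = match.Groups[1].Value.Trim();
            if (!words.Contains(word)) words.Add(word);
        }
        return words;
    }

    public static void Main()
    {
        string template = "User Name :<User Name>\nAddress : <Address>.\nSalary : <Salary>";
        foreach (string word in FindPlaceholders(template))
        {
            Console.WriteLine(word);
        }
        // User Name
        // Address
        // Salary
    }
}
```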
This is not an answer, but comments can't do this:
You should place some of your objects into using blocks. Something like this:
using(StreamReader sr = File.OpenText(HttpContext.Current.Server.MapPath(txt)))
{
copy = sr.ReadToEnd();
} // reader is closed by the end of the using block
//remove the word specified UC
copy = copy.Replace(word.ToUpper(), "#" + word.ToUpper());
//save new copy into existing text file
FileInfo newText = new FileInfo(HttpContext.Current.Server.MapPath(txt));
using(var newCopy = newText.CreateText())
{
newCopy.WriteLine(copy);
newCopy.Write(newCopy.NewLine);
}
The using block ensures that resources are cleaned up even if an exception is thrown.
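That guarantee can be checked with a small experiment. The sketch below uses a hypothetical Tracker class that only records whether Dispose was called; it stands in for StreamReader or StreamWriter, which release their file handles in the same way.

```csharp
using System;

// Hypothetical disposable that records whether Dispose was called.
public sealed class Tracker : IDisposable
{
    public bool Disposed { get; private set; }
    public void Dispose() { Disposed = true; }
}

public static class UsingDemo
{
    public static bool DisposedAfterFailure()
    {
        var tracker = new Tracker();
        try
        {
            using (tracker)
            {
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            // By the time the exception reaches this handler, Dispose has already run.
        }
        return tracker.Disposed;
    }

    public static void Main()
    {
        Console.WriteLine(DisposedAfterFailure()); // True
    }
}
```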
