Is there a way to read large text file in parts?

I have a large file (60 MB). I am reading it into a string and returning that string to another method.
When I read the file into a string, it throws a System.OutOfMemoryException.
Is there a way to read the file in parts and append each part to the string?
If not, is there a way around this?
    public static string Serialize()
    {
        string returnValue;
        System.IO.FileInfo file1 = new System.IO.FileInfo(@"c:\file.txt");
        returnValue = System.IO.File.ReadAllText(file1.ToString());
        return returnValue;
    }

How do you read the file right now?
You could use the StreamReader class and read the file line by line (ReadLine method).
You could also read a specified number of characters from the file on each read operation (Read method).
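To make the Read suggestion concrete, here is a minimal sketch that reads a file in fixed-size blocks of characters and hands each block to a callback, so the whole file never has to sit in one string. The class name ChunkedReader, the method ReadInChunks, and its parameters are made up for illustration; only StreamReader.Read(char[], int, int) comes from the framework.

```csharp
using System;
using System.IO;

public static class ChunkedReader
{
    // Reads the file in blocks of up to bufferSize characters and passes
    // each block to handleChunk. Returns the total number of characters read.
    // (Hypothetical helper, not part of the .NET class library.)
    public static long ReadInChunks(string path, int bufferSize, Action<string> handleChunk)
    {
        long totalChars = 0;
        char[] buffer = new char[bufferSize];
        using (var reader = new StreamReader(path))
        {
            int read;
            // Read returns 0 once the end of the file has been reached.
            while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
            {
                handleChunk(new string(buffer, 0, read));
                totalChars += read;
            }
        }
        return totalChars;
    }
}
```

A call such as ChunkedReader.ReadInChunks(path, 4096, chunk => Process(chunk)) would process the file piece by piece, where Process stands for whatever you do with each part. A block may be shorter than bufferSize, so the callback should not assume a fixed length.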

Yes, it's called streaming. Have a look at the StreamReader class. Though I'm not sure why you want all 60 MB in one string. It's probably best to deal with it a little at a time if possible (possibly, in your scenario, on a line-by-line basis?).
Instead of ReadAllText, look at OpenRead and pass the returned FileStream into the constructor of a StreamReader. Have a look at doing something along these lines if possible:
    using (FileStream fs = File.OpenRead(@"c:\theFile.text"))
    using (StreamReader sr = new StreamReader(fs))
    {
        string oneLine;
        // ReadLine returns null at the end of the file.
        while ((oneLine = sr.ReadLine()) != null)
        {
            // process oneLine here
        }
    }

Even if you read it line by line (or in parts by streaming), you can still run out of memory if you append everything to a single string. Is compressing it along the way an option? If not, I'd probably increase the memory available to the process (on a JVM, for instance, by upping maxHeap to 512MB or similar).

Related

Replace Text in a TextFile c#

What is the best way to replace text in a text file?
I do not want to give the file a new name.
I do not want the text to become one long string, which is what happens when I use File.ReadAllText: the text is stored as a string and I lose carriage returns etc.
Also, I guess I will run into issues using a StreamReader/StreamWriter because you cannot read and write to the same file?
Thanks

You can do it with a stream opened for both reading and writing:
    FileStream fileStream = new FileStream(@"c:\myFile.txt", FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.None);
    var streamWriter = new StreamWriter(fileStream);
    var streamReader = new StreamReader(fileStream);
    ...
    fileStream.Close();
But the easiest way is still to read the whole file, edit the text, and write it back to the file:
    var text = File.ReadAllText(@"c:\myFile.txt");
    ...
    File.WriteAllText(@"c:\myFile.txt", text);

Depending on your file format, you could also read your files line by line (using File.ReadLines) and perform the text replacements for each line.
You can also refer to this answer for a variant based on streams, which is the preferred way if your file is large: How to read a large (1 GB) txt file in .NET?
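For the line-by-line approach, one way to keep the original file name is to write the modified lines to a temporary file and then swap it in. The following is only a sketch under a few assumptions: the names TextReplacer and ReplaceInFile and the ".tmp" suffix are invented for the example, every line ending is rewritten as Environment.NewLine, and the output is written with StreamWriter's default encoding (UTF-8).

```csharp
using System.IO;

public static class TextReplacer
{
    // Replaces oldValue with newValue on every line of the file at path.
    // (Hypothetical helper, not part of the .NET class library.)
    public static void ReplaceInFile(string path, string oldValue, string newValue)
    {
        string tempPath = path + ".tmp"; // assumed temp-file name
        using (var reader = new StreamReader(path))
        using (var writer = new StreamWriter(tempPath))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // Only one line is held in memory at a time.
                writer.WriteLine(line.Replace(oldValue, newValue));
            }
        }
        // Swap the rewritten file in under the original name.
        File.Delete(path);
        File.Move(tempPath, path);
    }
}
```

Because both streams are closed before the swap, the read-and-write-the-same-file problem does not come up, and the line breaks survive because each line is written back with WriteLine.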

StreamReader fails to detect BOM

I have the following piece of code:
    using (StreamReader sr = new StreamReader(path, Encoding.GetEncoding("shift-jis"), true)) {
        mCertainFileIsUTFFormat = !sr.CurrentEncoding.Equals(Encoding.GetEncoding("shift-jis"));
        mCodingFromBOM = sr.CurrentEncoding;
        String line = sr.ReadToEnd();
        return line.Split('\n');
    }
Basically, it reads a file and assumes Shift-JIS if there is no BOM. Alas, this method always, no matter what, returns the Shift-JIS encoding, even if the file in question has a BOM. Am I doing something wrong here, or is there perhaps a known issue? I could always open the file as binary and do the work myself, but this is supposed to do what I want :)

You need to call a Read method of some kind first: StreamReader will not detect the encoding before reading. I.e., get the encoding after your ReadToEnd call:
    String line = sr.ReadToEnd();
    mCodingFromBOM = sr.CurrentEncoding;
Info: StreamReader.CurrentEncoding
The value can be different after the first call to any Read method of StreamReader, since encoding autodetection is not done until the first call to a Read method.
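As a small illustration of that ordering, the sketch below reads one character before asking for CurrentEncoding. The names EncodingSniffer and DetectEncoding are hypothetical, and the fallback encoding is whatever you would otherwise assume (Shift-JIS in the question).

```csharp
using System.IO;
using System.Text;

public static class EncodingSniffer
{
    // Returns the encoding indicated by the file's BOM, or fallback when no BOM is found.
    // (Hypothetical helper, not part of the .NET class library.)
    public static Encoding DetectEncoding(string path, Encoding fallback)
    {
        // The third argument turns on detection from byte order marks.
        using (var reader = new StreamReader(path, fallback, true))
        {
            reader.Read(); // detection only happens once something has been read
            return reader.CurrentEncoding;
        }
    }
}
```

Comparing the result with the fallback (for example by CodePage) then tells you whether a BOM overrode the assumed encoding.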

C#: Compare text file contents to a string variable

I have an application that dumps text to a text file. I think there might be an issue with the text not containing the proper carriage returns, so I'm in the process of writing a test that will compare the contents of this file to a string variable that I declare in the code.
Ex:
1) Code creates a text file that contains the text:
This is line 1
This is line 2
This is line 3
2) I have the following string that I want to compare it to:
    string testString = "This is line 1\nThis is line 2\nThis is line3";
I understand that I could open a file stream reader, read the text file line by line, and store that in a mutable string variable while appending "\n" after each line, but I'm wondering if this is re-inventing the wheel (in other words, whether .NET has a built-in class for something like this). Thanks in advance.

You can either use StreamReader's ReadToEnd() method to read the contents into a single string, like
    using System.IO;
    using (StreamReader streamReader = new StreamReader(filePath))
    {
        string text = streamReader.ReadToEnd();
    }
Note: you have to make sure that you release the resources (the code above uses "using" to do that). ReadToEnd() also assumes that the stream knows when it has reached an end. For interactive protocols in which the server sends data only when you ask for it and does not close the connection, ReadToEnd might block indefinitely because it never reaches an end, and should be avoided. You should also make sure that the current position in the stream is at the start.
You can also use ReadAllText, like
    // Open the file to read from.
    string readText = File.ReadAllText(path);
which is simple: it opens the file, reads all of its text, and takes care of closing it as well.

No, there is nothing built in for this. The easiest way, assuming that your file is small, is to just read the whole thing and compare them:
    var fileContents = File.ReadAllText(fileName);
    return testString == fileContents;
If the file is fairly long, you may want to compare the file line by line, since finding a difference early on would allow you to reduce IO.
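A line-by-line comparison along those lines could look like the sketch below, which stops at the first line that differs. FileTextComparer and LinesMatch are made-up names. Note the assumption baked in: ReadLine treats "\r\n", "\n" and "\r" all as line terminators, so this compares line content only and would not catch wrong carriage returns; for that, compare the raw strings as shown above.

```csharp
using System.IO;

public static class FileTextComparer
{
    // Returns true when the file and the expected text contain the same lines.
    // Line-ending differences are ignored because ReadLine strips terminators.
    // (Hypothetical helper, not part of the .NET class library.)
    public static bool LinesMatch(string path, string expected)
    {
        using (var fileReader = new StreamReader(path))
        using (var expectedReader = new StringReader(expected))
        {
            while (true)
            {
                string fileLine = fileReader.ReadLine();
                string expectedLine = expectedReader.ReadLine();
                // Both sources ran out at the same time: everything matched.
                if (fileLine == null && expectedLine == null) return true;
                // A differing line, or one source ending early, is a mismatch.
                if (fileLine != expectedLine) return false;
            }
        }
    }
}
```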

A faster way to implement reading all the text in a file is
    System.IO.File.ReadAllText()
but there's no shorter way to do the string-level comparison:
    if (System.IO.File.ReadAllText(filename) == "This is line 1\nThis is line 2\nThis is line3") {
        // it matches
    }

This should work:
    StreamReader streamReader = new StreamReader(filePath);
    string originalString = streamReader.ReadToEnd();
    streamReader.Close();
I don't think there is a quicker way of doing it in C#.

You can read the entire file into a string variable this way:
    string stringContainingFilesContent;
    using (FileStream stream = new FileStream(yourFileName, FileMode.Open, FileAccess.Read, FileShare.Read))
    using (StreamReader reader = new StreamReader(stream))
    {
        stringContainingFilesContent = reader.ReadToEnd();
    }
    // and check for your condition
    if (testString.Equals(stringContainingFilesContent, StringComparison.OrdinalIgnoreCase))
    {
        // the contents match (ignoring case)
    }

Is there a restriction on the amount of text I can write / read with a stream writer / reader in C#?

I am trying to write to a text file using a StreamWriter.
The text I am trying to write comes from a different text file.
I tried:
    string line = reader.ReadLine(); // reader is a StreamReader I defined before
    while (line != null)
    {
        sw.WriteLine(line); // sw is a StreamWriter I defined before
        line = reader.ReadLine();
    }
I also tried:
    while (!(reader.EndOfStream))
    {
        sw.WriteLine(reader.ReadLine()); // sw is a StreamWriter I defined before
    }
Both methods succeeded in copying text from one file to the other, but for some reason not all of the text was copied.
The text file I am trying to copy from is very large, about 96000 lines, and only the first ~95000 lines are copied.
Therefore, I am asking whether there is a restriction on the amount of text I can write / read with a stream writer / reader in C#.
Also, I am asking for suggestions on how to copy all of the text successfully.
(I read that there is a copy method on the Stream class, but that is for .NET 4, so it won't help.)
EDIT: I tried replacing the text at the end that wasn't copied with text from the start that was copied. I got the same problem, so it isn't a problem with the characters.

Hmm. Probably you are not flushing your stream. Try setting sw.AutoFlush = true; or, before you close sw, call sw.Flush();

I am going to guess that you are not calling flush on your output stream. This would cause the last few (sometimes a lot) of lines to not be written to the output file.
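A simple way to make sure the flush always happens is to wrap both the reader and the writer in using blocks, since disposing a StreamWriter flushes whatever is still in its buffer. A minimal sketch, with LineCopier and CopyLines as invented names:

```csharp
using System.IO;

public static class LineCopier
{
    // Copies every line from sourcePath to destinationPath and returns the line count.
    // (Hypothetical helper, not part of the .NET class library.)
    public static int CopyLines(string sourcePath, string destinationPath)
    {
        int count = 0;
        using (var reader = new StreamReader(sourcePath))
        using (var writer = new StreamWriter(destinationPath))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                writer.WriteLine(line);
                count++;
            }
        } // leaving the using block disposes the writer, which flushes the last buffered lines
        return count;
    }
}
```

With this shape there is no need to remember Flush or Close, even if an exception is thrown halfway through the copy.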

Read Text File Up to GO

I have a multi-megabyte .sql file with GO statements on their own lines every 10k or so. I am trying to come up with a way to read the file line by line until I hit a line that contains only "go" and a newline character, then return what was read to the caller, and then read the next bunch of text until I hit GO again.
Peek only lets me read one character. What's a smart way to make this work in C# on Framework 4.0?
Thanks.

This should do exactly what you were saying.
Call this function with a string of the filename of the SQL file, and it will return an IEnumerable<string> (a bunch of strings) that each hold a SQL batch (each up until a GO statement) which you can then loop through with foreach or anything else.
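    // Requires: using System; using System.Collections.Generic; using System.IO; using System.Text;
    public static IEnumerable<string> GetSqlBatches(string filename)
    {
        using (StreamReader sr = new StreamReader(filename))
        {
            StringBuilder readSoFar = new StringBuilder();
            string line;
            while ((line = sr.ReadLine()) != null)
            {
                readSoFar.AppendLine(line);
                // A line containing only GO (in any case) ends the current batch.
                if (line.Trim().Equals("GO", StringComparison.OrdinalIgnoreCase))
                {
                    yield return readSoFar.ToString();
                    readSoFar = new StringBuilder();
                }
            }
            // Return any trailing text that is not followed by a final GO.
            if (readSoFar.Length > 0)
            {
                yield return readSoFar.ToString();
            }
        }
    }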

If you use SQL Server there is no need to parse file manually. Use Server.ConnectionContext.ExecuteNonQuery method instead. It is part of SMO.
See Handling "GO" Separators in SQL Scripts - the easy way by Jon Galloway.

If you read the file with a BufferedStream then you can seek back to the beginning of the line once you read a "GO" token.
If the file will always be on disk (instead of in memory or coming from a network connection), you can also just use a FileStream because that is also seekable.
