StringBuilder.ToString() throws OutOfMemoryException - c#

I have created a StringBuilder of length 132370292. When I try to get the string using the ToString() method, it throws an OutOfMemoryException.
StringBuilder SB = new StringBuilder();
for (int i = 0; i <= 5000; i++)
{
    SB.Append("Some Junk Data for testing. My Actual Data is created from different sources by Appending to the String Builder.");
}
try
{
    string str = SB.ToString(); // Throws OOM mostly
    Console.WriteLine("String Created Successfully");
}
catch (OutOfMemoryException ex)
{
    StreamWriter sw = new StreamWriter(@"c:\memo.txt", true);
    sw.Write(SB.ToString()); // Always writes to the file without any error
    sw.Close();
    Console.WriteLine("Written to File Successfully");
}
What is the reason for the OOM while creating a new string, and why doesn't it throw OOM while writing to a file?
Machine Details: 64-bit, Windows-7, 2GB RAM, .NET version 2.0

What is the reason for the OOM while creating a new string
Because you're running out of memory - or at least, the CLR can't allocate an object with the size you've requested. It's really that simple. If you want to avoid the errors, don't try to create strings that don't fit into memory. Note that even if you have a lot of memory, and even if you're running a 64-bit CLR, there are limits to the size of objects that can be created.
and why it doesn't throw OOM while writing to a file ?
Because you have more disk space than memory.
I'm pretty sure the code isn't exactly as you're describing though. This line would fail to compile:
sw.write(SB.ToString());
... because the method is Write rather than write. And if you're actually calling SB.ToString(), then that's just as likely to fail as str = SB.ToString().
It seems more likely that you're actually writing to the file in a streaming fashion, e.g.
using (var writer = File.CreateText(...))
{
    for (int i = 0; i < 5000; i++)
    {
        writer.Write(mytext);
    }
}
That way you never need to have huge amounts of text in memory - it just writes it to disk as it goes, possibly with some buffering, but not enough to cause memory issues.

Workaround: If you want to write a big string stored in a StringBuilder to a StreamWriter, I would write it in chunks like this to avoid the OOM exception from SB.ToString(). But if the OOM is thrown while appending content to the StringBuilder itself, you will have to address that separately.
public const int CHUNK_STRING_LENGTH = 30000;
while (SB.Length > CHUNK_STRING_LENGTH)
{
    sw.Write(SB.ToString(0, CHUNK_STRING_LENGTH));
    SB.Remove(0, CHUNK_STRING_LENGTH);
}
sw.Write(SB.ToString()); // remainder is now smaller than one chunk

You have to remember that strings in .NET are stored in memory as UTF-16 (2 bytes per character). This means a string of length 132370292 will require roughly 260 MB of RAM.
Furthermore, while executing
string str = SB.ToString();
you are creating a COPY of your string (another 260MB).
Keep in mind that each process has its own memory limit, so an OutOfMemoryException can be thrown even if you have some free RAM left.
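As a rough back-of-the-envelope check of those numbers (estimates only; object headers and the builder's internal buffers are ignored):

long chars = 132370292;            // the StringBuilder length from the question
long stringBytes = chars * 2;      // UTF-16 is 2 bytes per char: ~265,000,000 bytes (~260 MB)
long peakBytes = stringBytes * 2;  // during ToString() the builder and the new copy coexist: ~520 MB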

This might help someone: if your logic needs large objects, you can build your application as 64-bit and also add this section to your app.config:
<runtime>
  <gcAllowVeryLargeObjects enabled="true" />
</runtime>
gcAllowVeryLargeObjects: on 64-bit platforms, enables arrays that are greater than 2 gigabytes (GB) in total size.

string m_filename = @"c:\temp\myfile.xml";
StreamWriter sw = new StreamWriter(m_filename);
while (sb.Length > 0)
{
    int writelen = Math.Min(sb.Length, 30000);
    sw.Write(sb.ToString(0, writelen));
    sb.Remove(0, writelen);
}
sw.Flush();
sw.Close();
sw = null;

Related

How to handle out of memory exception string builder

I have this method that is expected to return a string, but the string is pretty big, maybe gigabytes. Currently this runs into an out of memory exception.
I was thinking I would write to a file (random filename), then read it, delete the file, and send back the response. I tried MemoryStream and StreamWriter, but that too ran into an out of memory exception.
I did not want to initialize the StringBuilder with a capacity since the size was not known, and I think it needs a contiguous block of memory.
What is the best course of action to solve this problem?
public string Result()
{
    StringBuilder response = new StringBuilder();
    for (int i = 1; i <= int.MaxValue; i++)
    {
        response.AppendFormat("keep adding some text {0}", i);
        response.Append(Environment.NewLine);
    }
    return response.ToString();
}
Even if you could solve the memory problem on the StringBuilder, the resulting string from calling response.ToString() would be larger than the maximum length of a string in .NET; see: https://stackoverflow.com/a/140749/3997704
So you will have to make your function store its result elsewhere, like in a file.
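A minimal sketch of that approach (using System and System.IO), assuming the caller can accept a file path instead of the text itself; the temp-file location and the bounded loop are placeholders for illustration, not from the original post:

public string Result()
{
    // Stream the output to a file instead of building one huge in-memory string.
    string path = Path.Combine(Path.GetTempPath(), Guid.NewGuid() + ".txt");
    using (StreamWriter writer = new StreamWriter(path))
    {
        for (int i = 1; i <= 1000000; i++) // bounded here only to keep the sketch finite
        {
            writer.WriteLine("keep adding some text {0}", i);
        }
    }
    return path; // callers read or stream the file and delete it when done
}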

System.IO.Compression.ZipArchive work with Large File

I have code in an SSIS script task, written in C#, that zips a file.
I have a problem when zipping a file of roughly 1 GB.
I tried to implement this code and still get the error 'System.OutOfMemoryException':
System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
at ST_4cb59661fb81431abcf503766697a1db.ScriptMain.AddFileToZipUsingStream(String sZipFile, String sFilePath, String sFileName, String sBackupFolder, String sPrefixFolder) in c:\Users\dtmp857\AppData\Local\Temp\vsta\84bef43d323b439ba25df47c365b5a29\ScriptMain.cs:line 333
at ST_4cb59661fb81431abcf503766697a1db.ScriptMain.Main() in c:\Users\dtmp857\AppData\Local\Temp\vsta\84bef43d323b439ba25df47c365b5a29\ScriptMain.cs:line 131
This is the snippet of code when zipping file:
protected bool AddFileToZipUsingStream(string sZipFile, string sFilePath, string sFileName, string sBackupFolder, string sPrefixFolder)
{
    bool bIsSuccess = false;
    try
    {
        if (File.Exists(sZipFile))
        {
            using (ZipArchive addFile = ZipFile.Open(sZipFile, ZipArchiveMode.Update))
            {
                addFile.CreateEntryFromFile(sFilePath, sFileName);
                //Move File after zipping it
                BackupFile(sFilePath, sBackupFolder, sPrefixFolder);
            }
        }
        else
        {
            //from https://stackoverflow.com/questions/28360775/adding-large-files-to-io-compression-ziparchiveentry-throws-outofmemoryexception
            using (var zipFile = ZipFile.Open(sZipFile, ZipArchiveMode.Update))
            {
                var zipEntry = zipFile.CreateEntry(sFileName);
                using (var writer = new BinaryWriter(zipEntry.Open()))
                using (FileStream fs = File.Open(sFilePath, FileMode.Open))
                {
                    var buffer = new byte[16 * 1024];
                    using (var data = new BinaryReader(fs))
                    {
                        int read;
                        while ((read = data.Read(buffer, 0, buffer.Length)) > 0)
                            writer.Write(buffer, 0, read);
                    }
                }
            }
            //Move File after zipping it
            BackupFile(sFilePath, sBackupFolder, sPrefixFolder);
        }
        bIsSuccess = true;
    }
    catch (Exception ex)
    {
        throw ex;
    }
    return bIsSuccess;
}
What am I missing? Please give me suggestions, maybe a tutorial or best practices for handling this problem.
I know this is an old post but what can I say, it helped me sort out some stuff and still comes up as a top hit on Google.
So there is definitely something wrong with the System.IO.Compression library!
First and Foremost...
You must make sure to turn off "Prefer 32-bit". Having this set (in my case with a build for "AnyCPU") causes so many inconsistent issues.
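For reference, on most project types that checkbox lives under the project's Build settings and maps to the Prefer32Bit MSBuild property; an illustrative csproj fragment (your project file may differ) would look like this:

<PropertyGroup>
  <PlatformTarget>AnyCPU</PlatformTarget>
  <Prefer32Bit>false</Prefer32Bit>
</PropertyGroup>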
Now with that said, I took some demo files (several less than 500MB, one at 500MB, and one at 1GB), and created a sample program with 3 buttons that made use of the 3 methods.
Button 1 - ZipFile.CreateFromDirectory(AbsolutePath, TargetFile);
Button 2 - ZipArchive.CreateEntryFromFile(AbsolutePath, RelativePath);
Button 3 - Using the [16 * 1024] Byte Buffer method from above
Now here is where it gets interesting. (Assuming that the program is built as "AnyCPU" and with NO 32 Preferred check)... all 3 Methods worked on a Windows 64-Bit OS, regardless of how much memory it had.
However, as soon as I ran the same test on a 32-Bit OS, regardless of how much memory it had, ONLY method 1 worked!
Methods 2 and 3 blew up with the out-of-memory error, and to add salt to the wound, method 3 (the preferred chunking method) actually corrupted more files than method 2!
By corrupted, I mean that the 500 MB and the 1 GB files ended up in the zipped archive but at a size less than the original (they were basically truncated).
So I dunno... since there are not many 32-bit OS around anymore, I guess maybe it is a moot point.
But it seems like there are some bugs in the System.IO.Compression framework!
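One more thing worth checking, based on how ZipArchiveMode is documented: Update mode holds the archive contents in memory until the archive is disposed, which on its own can explain OOM with files around 1 GB. If the entry only needs to be appended to a brand-new archive, Create mode plus Stream.CopyTo streams the data to disk instead. A rough sketch of that variant (the method and parameter names below are illustrative, not from the original code):

using System.IO;
using System.IO.Compression;

static void AddFileToNewZip(string sZipFile, string sFilePath, string sFileName)
{
    // ZipArchiveMode.Create writes each entry straight through to the archive stream,
    // so the large file never has to be buffered in memory as a whole.
    using (FileStream zipStream = new FileStream(sZipFile, FileMode.Create))
    using (ZipArchive archive = new ZipArchive(zipStream, ZipArchiveMode.Create))
    using (Stream entryStream = archive.CreateEntry(sFileName, CompressionLevel.Optimal).Open())
    using (FileStream source = File.OpenRead(sFilePath))
    {
        source.CopyTo(entryStream); // copies in small buffered chunks
    }
}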

Dispose array of string in a loop

I have the following loop inside a function:
for (int i = 0; i < 46; i++)
{
    string[] arrStr = File.ReadAllLines(path + "File_" + i + ".txt");
    List<string> output = new List<string>();
    for (int j = 0; j < arrStr.Length; j++)
    {
        // Do Something
        output.Add(someString);
    }
    File.WriteAllLines(path + "output_File_" + i + ".txt", output.ToArray());
    output.Clear();
}
Each txt file has about 20k lines. The function opens 46 of them, and I need to run the function more than 1k times, so I'm planning to leave the program running overnight. So far I haven't found any errors, but since a 20k-element string array is referenced at each iteration of the loop, I'm afraid there might be some issue with garbage accumulating from the arrays of past iterations. If there is such a risk, which method is best to dispose of the old array in this case?
Also, is it memory safe to run 3 programs like this at the same time?
Use streams with using; this will handle the memory management for you:
for (int i = 0; i < 46; i++)
{
    using (StreamReader reader = new StreamReader(path))
    {
        using (StreamWriter writer = new StreamWriter(outputpath))
        {
            while (!reader.EndOfStream)
            {
                string line = reader.ReadLine();
                // do something with line
                writer.WriteLine(line);
            }
        }
    }
}
The Dispose methods of StreamReader and StreamWriter are automatically called when exiting the using block, freeing up any memory used. Using streams also ensures your entire file isn't in memory at once.
More info on MSDN - File Stream and I/O
Sounds like you came from the C world :-)
C# garbage collection is fine, you will not have any problems with that.
I would be more worried about file-system errors.

Out of memory exception reading "Large" file

I'm trying to serialize an object into a string
The first problem I encountered was that the XmlSerializer.Serialize method threw an out of memory exception. I've tried all kinds of solutions and none worked, so I serialized it into a file instead.
The file is about 300 MB (32-bit process, 8 GB RAM), and trying to read it with StreamReader.ReadToEnd also results in an out of memory exception.
The XML format and loading it into a string are not optional but a must.
The question is:
Any reason why a 300 MB file would throw that kind of exception? 300 MB is not really a large file.
Serialization code that fails on .Serialize
using (MemoryStream ms = new MemoryStream())
{
    var type = obj.GetType();
    if (!serializers.ContainsKey(type))
        serializers.Add(type, new XmlSerializer(type));
    // new XmlSerializer(obj.GetType()).Serialize(ms, obj);
    serializers[type].Serialize(ms, obj);
    ms.Position = 0;
    using (StreamReader sr = new StreamReader(ms))
    {
        return sr.ReadToEnd();
    }
}
Serialization and read from file that fails on ReadToEnd
var type = obj.GetType();
if (!serializers.ContainsKey(type))
    serializers.Add(type, new XmlSerializer(type));
FileStream fs = new FileStream(@"c:/temp.xml", FileMode.Create);
TextWriter writer = new StreamWriter(fs, new UTF8Encoding());
serializers[type].Serialize(writer, obj);
writer.Close();
fs.Close();
using (StreamReader sr = new StreamReader(@"c:/temp.xml"))
{
    return sr.ReadToEnd();
}
The object is large because it's the entire configuration object of an elaborate system...
UPDATE:
Reading the file in chunks (8*1024 chars) will load the file into a StringBuilder, but the builder fails on ToString()... I'm starting to think there is no way, which is really strange.
Yeah, if you're using 32-bit, trying to load 300MB in one chunk is going to be awkward, especially when using approaches that don't know the final size (number of characters, not bytes) in advance, thus have to keep doubling an internal buffer. And that is just when processing the string! It then needs to rip that into a DOM, which can often take several times as much space as the underlying data. And finally, you need to deserialize it into the actual objects, usually taking about the same again.
So - indeed, trying to do this in 32-bit will be tough.
The first thing to try is: don't use ReadToEnd - just use XmlReader.Create with either the file path or the FileStream, and let XmlReader worry about how to load the data. Don't load the contents for it.
After that... the next thing to do is: don't limit it to 32-bit.
Well, you could try enabling the 3GB switch, but... moving to 64-bit would be preferable.
Aside: xml is not a good choice for large volumes of data.
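To make the first suggestion concrete, here's a rough sketch of deserializing straight from the file (System.Xml / System.Xml.Serialization), so the 300 MB of XML never has to exist as a single string; MyConfiguration is a placeholder for whatever the configuration type actually is:

var serializer = new XmlSerializer(typeof(MyConfiguration)); // MyConfiguration is a placeholder type
using (XmlReader reader = XmlReader.Create(@"c:/temp.xml"))
{
    var config = (MyConfiguration)serializer.Deserialize(reader);
    // work with the object graph directly instead of its XML text
}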
Exploring the source code for StreamReader.ReadToEnd reveals that it internally makes use of the StringBuilder.Append method:
public override String ReadToEnd()
{
    if (stream == null)
        __Error.ReaderClosed();
#if FEATURE_ASYNC_IO
    CheckAsyncTaskInProgress();
#endif
    // Call ReadBuffer, then pull data out of charBuffer.
    StringBuilder sb = new StringBuilder(charLen - charPos);
    do {
        sb.Append(charBuffer, charPos, charLen - charPos);
        charPos = charLen; // Note we consumed these characters
        ReadBuffer();
    } while (charLen > 0);
    return sb.ToString();
}
which is most probably what throws this exception, leading to this question/answer: interesting OutOfMemoryException with StringBuilder

Issue with memory management and program performance

OK, I made a C# winform app, it's a File_Splitter_Joiner.
You just give it a file and it splits it for you to a number of pieces you specify.
The splitting is done in a separate thread.
Everything was working pretty fine until I sliced a 1Gig file!
In the task manager, I saw that my program started consuming 1 gigabyte of memory and my computer almost died!
Not just that: when slicing finished, the memory consumption didn't change!
(I don't know if this means that the garbage collector isn't working, although I'm pretty sure that I lost all references to what was holding the big data chunks, so it should work.)
Here's the Splitter constructor (just to give you a better idea):
public FileSplitter(string FileToSplitPath, string PiecesFolder, int NumberOfPieces, int PieceSize, SplittingMethod Method)
{
    FileToSplitInfo = new FileInfo(FileToSplitPath);
    this.FileToSplitPath = FileToSplitPath;
    this.PiecesFolder = PiecesFolder;
    this.NumberOfPieces = NumberOfPieces;
    this.PieceSize = PieceSize;
    this.Method = Method;
    SplitterThread = new Thread(Split);
}
And here is the method that did the actual splitting:
(I'm still a newbie, so what you're about to see 'may not' be done in the best way ever possible, I'm just learning here)
private void Split()
{
    int remainingSize = 0;
    int remainingPos = -1;
    bool isNumberOfPiecesEqualInSize = true;
    int fileSize = (int)FileToSplitInfo.Length; // FileToSplitInfo is a FileInfo object
    if (fileSize % PieceSize != 0)
    {
        remainingSize = fileSize % PieceSize;
        remainingPos = fileSize - remainingSize;
        isNumberOfPiecesEqualInSize = false;
    }
    byte[] fileBytes = new byte[fileSize];
    var _fs = File.Open(FileToSplitPath, FileMode.Open);
    BinaryReader br = new BinaryReader(_fs);
    br.Read(fileBytes, 0, fileSize);
    br.Close();
    _fs.Close();
    for (int i = 0, index = 0; i < NumberOfPieces; i++, index += PieceSize)
    {
        var fs = File.Create(PiecesFolder + "\\" + Path.GetFileName(FileToSplitPath) + "." + (i + 1).ToString());
        var bw = new BinaryWriter(fs);
        bw.Write(fileBytes, index, PieceSize);
        if (i == NumberOfPieces - 1 && !isNumberOfPiecesEqualInSize && Method == SplittingMethod.NumberOfPieces)
            bw.Write(fileBytes, remainingPos, remainingSize);
        bw.Close();
        fs.Close();
    }
    MessageBox.Show("File has been splitted successfully!");
    SplitterThread.Abort();
}
Now, instead of reading the bytes of the file via a BinaryReader, I was originally reading them via the File.ReadAllBytes method. It was working fine with small file sizes, but I got an OutOfMemoryException when I dealt with our big guy; I don't know why I didn't get that exception when I read the bytes via a BinaryReader.
(That was an in-between question.)
So, the main question is: how can I load big files (gigabytes, speaking) in a way that doesn't consume so much memory? I mean, how can I make my program not consume all that memory?
And how can I free the used memory after the splitting is done?
(I actually used
bw.Dispose(); fs.Dispose();
instead of
bw.Close(); fs.Close();
and it was the same.)
I know the question might not make sense, because when we load something it goes into our memory, not somewhere else. But the reason I asked it like that is that I used another splitting/joining program (not written by me) just to see if it had the same problem: I loaded the file, the program consumed about 5 MB of RAM, and when I started splitting it used about 10 MB!!
Now that is a VERY big difference. Probably that app was written in C/C++.
So to sum up, who sucks? Is it my code, and if so how can I fix it? Or is it C# when it comes to performance?
Thank you SOOO much for anything you could hook me up with :)
The following 2 lines will kill you:
int fileSize = (int)FileToSplitInfo.Length; // a FileInfo object
...
byte[] fileBytes = new byte[fileSize];
Your code will fail when the size is over Int32.MaxValue. The cast is unnecessary; just use long fileSize = FileToSplitInfo.Length;
This corrected code will fail when there is not enough contiguous memory. Fragmentation (of the LOH) will bring you down sooner or later.
You allocate memory for the entire file, but you only need PieceSize bytes at a time.
You don't even need to know the fileSize; just:
byte[] pieceBuffer = new byte[PieceSize];
while (true)
{
    int nBytes = br.Read(pieceBuffer, 0, pieceBuffer.Length);
    if (nBytes == 0)
        break;
    // write this piece, the length is nBytes
}
There are different aspects that can be made better:
If you are working with a big file, why first read it all into an array and only afterwards write it into another file? Just write into the new file while reading from the other (see the sketch after this list).
Use using to guarantee disposal of the streams in any case, whether there is an exception or not.
If you begin to work with really big files, like 1 GB or even more, I would recommend looking at memory-mapped files. You will gain a considerable memory-consumption benefit, possibly at some performance cost.
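Putting the first two points together, here is a rough sketch of a streaming version of Split; it reuses the fields from the question, and note that Stream.Read may return fewer bytes than requested, so a production version would loop until each buffer is full if exact piece sizes matter:

private void Split()
{
    byte[] pieceBuffer = new byte[PieceSize]; // only one piece is ever held in memory
    using (FileStream source = File.OpenRead(FileToSplitPath))
    {
        int pieceIndex = 1;
        while (true)
        {
            int bytesRead = source.Read(pieceBuffer, 0, pieceBuffer.Length);
            if (bytesRead == 0)
                break; // end of file reached

            string piecePath = Path.Combine(PiecesFolder,
                Path.GetFileName(FileToSplitPath) + "." + pieceIndex);
            using (FileStream piece = File.Create(piecePath))
            {
                piece.Write(pieceBuffer, 0, bytesRead); // write exactly what was read
            }
            pieceIndex++;
        }
    }
}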
