IOException on reading large files - c#

Here is my code for writting file from one file to another. It works fine for file having size less than 2GB but throws exception when it is of greater size. So it copies less than 2 gb data and then throws exception after that. Any fixes?
const int bufferSize = 2048;
byte[] buffer = new byte[bufferSize];
int bytes = 0;
using (var input = filedata.DataStream)
using (var output = ServiceModel.FileManager.Current.GetFile(filedata.FileName).Open(FileMode.CreateNew, FileAccess.Write, FileShare.Read))
{
while ((bytes = input.Read(buffer, 0, bufferSize)) > 0) //Throws exception: An exception has been thrown when reading the stream.
{
output.Write(buffer, 0, bytes);
}
}

Related

Convert Stream to byte[] c# for large files of 2GB

Trying to convert Stream object to byte[] and using the below method for the same:
public static byte[] ReadFully(System.IO.Stream input)
{
byte[] buffer = new byte[16*1024];
using (System.IO.MemoryStream ms = new System.IO.MemoryStream())
{
int read;
while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
{
ms.Write(buffer, 0, read);
}
return ms.ToArray();
}
}
However the input parameter "input" is for large file that is of 2 GB and hence the code does not get enter into while loop and hence does not convert it to byte array.
For smaller files it is working fine
That's what a Stream is for.
You don't load the whole content into a byte[], you read a small buffer from the Stream into memory and handle it, then dispose and read the next buffer.
If you still need to use a byte[]:
It seems like your app can't handle more than 2^32 Bytes Memory, meaning it's 32bit.
Try changing it to 64bit (in Project Properties go to Build and disable Prefer 32 bit)
Object limited below 32bit. (that is why all index using int)
how about use list contains byte array to deal entire data?
public List<byte[]> ReadBytesList(string fileName)
{
List<byte[]> rawDataBytes= new List<byte[]>();
byte[] buff;
FileStream fs = new FileStream(fileName,
FileMode.Open,
FileAccess.Read);
BinaryReader br = new BinaryReader(fs);
long numBytes = new FileInfo(fileName).Length;
int arrayCount= (int)(numBytes / 2100000000); //2147483648 is max
int arrayRest = (int)(numBytes % 2100000000);
if(arrayCount>0)
{
for (int i = 0; i < arrayCount; i++)
{
buff = br.ReadBytes(2100000000);
rawDataBytes.Add(buff);
}
buff = br.ReadBytes(arrayRest);
rawDataBytes.Add(buff);
}
else
{
buff = br.ReadBytes(arrayRest);
rawDataBytes.Add(buff);
}
return rawDataBytes;
}

Convert a VERY LARGE binary file into a Base64String incrementally

I need help converting a VERY LARGE binary file (ZIP file) to a Base64String and back again. The files are too large to be loaded into memory all at once (they throw OutOfMemoryExceptions) otherwise this would be a simple task. I do not want to process the contents of the ZIP file individually, I want to process the entire ZIP file.
The problem:
I can convert the entire ZIP file (test sizes vary from 1 MB to 800 MB at present) to Base64String, but when I convert it back, it is corrupted. The new ZIP file is the correct size, it is recognized as a ZIP file by Windows and WinRAR/7-Zip, etc., and I can even look inside the ZIP file and see the contents with the correct sizes/properties, but when I attempt to extract from the ZIP file, I get: "Error: 0x80004005" which is a general error code.
I am not sure where or why the corruption is happening. I have done some investigating, and I have noticed the following:
If you have a large text file, you can convert it to Base64String incrementally without issue. If calling Convert.ToBase64String on the entire file yielded: "abcdefghijklmnopqrstuvwx", then calling it on the file in two pieces would yield: "abcdefghijkl" and "mnopqrstuvwx".
Unfortunately, if the file is a binary then the result is different. While the entire file might yield: "abcdefghijklmnopqrstuvwx", trying to process this in two pieces would yield something like: "oiweh87yakgb" and "kyckshfguywp".
Is there a way to incrementally base 64 encode a binary file while avoiding this corruption?
My code:
private void ConvertLargeFile()
{
FileStream inputStream = new FileStream("C:\\Users\\test\\Desktop\\my.zip", FileMode.Open, FileAccess.Read);
byte[] buffer = new byte[MultipleOfThree];
int bytesRead = inputStream.Read(buffer, 0, buffer.Length);
while(bytesRead > 0)
{
byte[] secondaryBuffer = new byte[buffer.Length];
int secondaryBufferBytesRead = bytesRead;
Array.Copy(buffer, secondaryBuffer, buffer.Length);
bool isFinalChunk = false;
Array.Clear(buffer, 0, buffer.Length);
bytesRead = inputStream.Read(buffer, 0, buffer.Length);
if(bytesRead == 0)
{
isFinalChunk = true;
buffer = new byte[secondaryBufferBytesRead];
Array.Copy(secondaryBuffer, buffer, buffer.length);
}
String base64String = Convert.ToBase64String(isFinalChunk ? buffer : secondaryBuffer);
File.AppendAllText("C:\\Users\\test\\Desktop\\Base64Zip", base64String);
}
inputStream.Dispose();
}
The decoding is more of the same. I use the size of the base64String variable above (which varies depending on the original buffer size that I test with), as the buffer size for decoding. Then, instead of Convert.ToBase64String(), I call Convert.FromBase64String() and write to a different file name/path.
EDIT:
In my haste to reduce the code (I refactored it into a new project, separate from other processing to eliminate code that isn't central to the issue) I introduced a bug. The base 64 conversion should be performed on the secondaryBuffer for all iterations save the last (Identified by isFinalChunk), when buffer should be used. I have corrected the code above.
EDIT #2:
Thank you all for your comments/feedback. After correcting the bug (see the above edit), I re-tested my code, and it is actually working now. I intend to test and implement #rene's solution as it appears to be the best, but I thought that I should let everyone know of my discovery as well.
Based on the code shown in the blog from Wiktor Zychla the following code works. This same solution is indicated in the remarks section of Convert.ToBase64String as pointed out by Ivan Stoev
// using System.Security.Cryptography
private void ConvertLargeFile()
{
//encode
var filein= #"C:\Users\test\Desktop\my.zip";
var fileout = #"C:\Users\test\Desktop\Base64Zip";
using (FileStream fs = File.Open(fileout, FileMode.Create))
using (var cs=new CryptoStream(fs, new ToBase64Transform(),
CryptoStreamMode.Write))
using(var fi =File.Open(filein, FileMode.Open))
{
fi.CopyTo(cs);
}
// the zip file is now stored in base64zip
// and decode
using (FileStream f64 = File.Open(fileout, FileMode.Open) )
using (var cs=new CryptoStream(f64, new FromBase64Transform(),
CryptoStreamMode.Read ) )
using(var fo =File.Open(filein +".orig", FileMode.Create))
{
cs.CopyTo(fo);
}
// the original file is in my.zip.orig
// use the commandlinetool
// fc my.zip my.zip.orig
// to verify that the start file and the encoded and decoded file
// are the same
}
The code uses standard classes found in System.Security.Cryptography namespace and uses a CryptoStream and the FromBase64Transform and its counterpart ToBase64Transform
You can avoid using a secondary buffer by passing offset and length to Convert.ToBase64String, like this:
private void ConvertLargeFile()
{
using (var inputStream = new FileStream("C:\\Users\\test\\Desktop\\my.zip", FileMode.Open, FileAccess.Read))
{
byte[] buffer = new byte[MultipleOfThree];
int bytesRead = inputStream.Read(buffer, 0, buffer.Length);
while(bytesRead > 0)
{
String base64String = Convert.ToBase64String(buffer, 0, bytesRead);
File.AppendAllText("C:\\Users\\test\\Desktop\\Base64Zip", base64String);
bytesRead = inputStream.Read(buffer, 0, buffer.Length);
}
}
}
The above should work, but I think Rene's answer is actually the better solution.
Use this code:
public void ConvertLargeFile(string source , string destination)
{
using (FileStream inputStream = new FileStream(source, FileMode.Open, FileAccess.Read))
{
int buffer_size = 30000; //or any multiple of 3
byte[] buffer = new byte[buffer_size];
int bytesRead = inputStream.Read(buffer, 0, buffer.Length);
while (bytesRead > 0)
{
byte[] buffer2 = buffer;
if(bytesRead < buffer_size)
{
buffer2 = new byte[bytesRead];
Buffer.BlockCopy(buffer, 0, buffer2, 0, bytesRead);
}
string base64String = System.Convert.ToBase64String(buffer2);
File.AppendAllText(destination, base64String);
bytesRead = inputStream.Read(buffer, 0, buffer.Length);
}
}
}

System.ArgumentException using FileStream

Instead of reading all at once, I first create a FileStream to open the file, read into a buffer, then call NetworkStream.write() to write its content.
Here's the code.
using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
{
try
{
int len = (int)fs.Length;
byte[] data = new byte[len];
byte[] buffer = new byte[bufferSize];
int count, sum = 0;
while ((count = fs.Read(buffer, sum, len - sum)) > 0)
{
netstream.Write(buffer,sum,len-sum);
sum += count;
}
...
It's throwing the error:
An unhandled exception of type 'System.ArgumentException' occurred in
mscorlib.dll
Additional information:
Offset and length were out of bounds for the array or count is greater
than the number of elements from index to the end of the source
collection.
I don't see any array out of bound issue here .
Suggestions please
Offset and Length should be based on Buffer length not the whole file, here is an example of reading chucked data from a FileStream and write it to another stream :
using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
{
byte[] buffer = new byte[bufferSize];
while (true)
{
var count = fs.Read(buffer, 0, buffer.Length);
if (count == 0) break;
netstream.Write(buffer, 0, count);
}
}

C# Cyclic redundancy check Exception when reading the stream of a CD

I want to read a CD to write it's content to a file, but I get an IOException (Data error Cyclic Redundancy Check) after a few read/write to my output file.
I'm trying to make an ISO creator program and the disk I read is 500 Mo game CD, and the exception arise after reading about at 1,9 Mo of data. The CD is not broken and perfectly usable.
I don't know if they are limitations with chunk size or buffer size (using 4096 bytes).
I'm very interested by any idea or experience to fix that problem.
EDIT
I get this error with every CD, not only this one. And it's not broken, as I used it just before starting to make the program to install the game it contains.
I use this code to read the CD :
const int BUFFER_SIZE = 4096;
private void MakeISO()
{
_HDEV = NativeMethods.CreateFileR(DRIVE_NAME);
string targetFile = TARGET + "\\" + NAME;
try
{
_FSR = new FileStream(_HDEV, FileAccess.Read, BUFFER_SIZE);
_FSW = new FileStream(targetFile, FileMode.Create, FileAccess.Write, FileShare.None, BUFFER_SIZE);
byte[] buffer = new byte[BUFFER_SIZE];
int length;
while ((length = _FSR.Read(buffer, 0, buffer.Length)) > 0)
{
_FSW.Write(buffer, 0, buffer.Length);
}
// Don't work either
//do
//{
// _FSR.Read(buffer, 0, BUFFER_SIZE);
// _FSW.Write(buffer, 0, BUFFER_SIZE);
//}
//while (_fsw.Position == _fsr.Position);
}
catch (Exception ex)
{
//
}
finally
{
CloseAll();
}
}

System.OutOfMemory exception when trying to read large files

public static byte[] ReadMemoryMappedFile(string fileName)
{
long length = new FileInfo(fileName).Length;
using (var stream = File.Open(fileName, FileMode.OpenOrCreate, FileAccess.Read, FileShare.ReadWrite))
{
using (var mmf = MemoryMappedFile.CreateFromFile(stream, null, length, MemoryMappedFileAccess.Read, null, HandleInheritability.Inheritable, false))
{
using (var viewStream = mmf.CreateViewStream(0, length, MemoryMappedFileAccess.Read))
{
using (BinaryReader binReader = new BinaryReader(viewStream))
{
var result = binReader.ReadBytes((int)length);
return result;
}
}
}
}
}
OpenFileDialog openfile = new OpenFileDialog();
openfile.Filter = "All Files (*.*)|*.*";
openfile.ShowDialog();
byte[] buff = ReadMemoryMappedFile(openfile.FileName);
texteditor.Text = BitConverter.ToString(buff).Replace("-"," "); <----A first chance exception of type 'System.OutOfMemoryException' occurred in mscorlib.dll
I get a System.OutOfMemory exception when trying to read large files.
I've read a lot for 4 weeks in all the web... and tried a lot!!! But still, I can't seem to find a good solution to my problem.
Please help me..
Update
public byte[] FileToByteArray(string fileName)
{
byte[] buff = null;
FileStream fs = new FileStream(fileName,
FileMode.Open,
FileAccess.Read);
BinaryReader br = new BinaryReader(fs);
long numBytes = new FileInfo(fileName).Length;
buff = br.ReadBytes((int)numBytes);
//return buff;
return File.ReadAllBytes(fileName);
}
OR
public static byte[] FileToByteArray(FileStream stream, int initialLength)
{
// If we've been passed an unhelpful initial length, just
// use 32K.
if (initialLength < 1)
{
initialLength = 32768;
}
BinaryReader br = new BinaryReader(stream);
byte[] buffer = new byte[initialLength];
int read = 0;
int chunk;
while ((chunk = br.Read(buffer, read, buffer.Length - read)) > 0)
{
read += chunk;
// If we've reached the end of our buffer, check to see if there's
// any more information
if (read == buffer.Length)
{
int nextByte = br.ReadByte();
// End of stream? If so, we're done
if (nextByte == -1)
{
return buffer;
}
// Nope. Resize the buffer, put in the byte we've just
// read, and continue
byte[] newBuffer = new byte[buffer.Length * 2];
Array.Copy(buffer, newBuffer, buffer.Length);
newBuffer[read] = (byte)nextByte;
buffer = newBuffer;
read++;
}
}
// Buffer is now too big. Shrink it.
byte[] ret = new byte[read];
Array.Copy(buffer, ret, read);
return ret;
}
I still get a System.OutOfMemory exception when trying to read large files.
If your file is 4GB, then BitConverter will turn each byte into XX- string, each char in string is 2 bytes * 3 chars per byte * 4 294 967 295 bytes = 25 769 803 770. You need +25Gb of free memory to fit entire string, plus you already have your file in memory as byte array.
Besides, no single object in a .Net program may be over 2GB. Theoretical limit for a string length would be 1,073,741,823 chars, but you also need to have a 64-bit process.
So solution in your case - open FileStream. Read first 16384 bytes (or how much can fit on your screen), convert to hex and display, and remember file offset. When user wants to navigate to next or previous page - seek to that position in file on disk, read and display again, etc.
You need to read the file in chunks, keep track of where you are in the file, page the contents on screen and use seek and position to move up and down in the file stream.
You will not be able to display 4Gb file reading all of it in memory first by any approach.
The approach is to virtualize the data, reading only the visible lines when user scrolls. If you need to do a read-only text viewer then you can use WPF ItemsControl with virtulizing stack panel and bind to custom IList collection which will lazily fetch lines from the file calculating file offset by for the line index.

Categories