I have been attempting to save a large PdfDocument into a byte array using various means but always come back to an out of memory exception (file is 200 MB and 2.5K pages).
My initial attempt was to simply use MemoryStream
public static byte[] ProcessLargePdfDocument(PdfDocument pdfDocument)
{
using (MemoryStream stream = new MemoryStream())
{
pdfDocument.Save(stream, true);
return stream.ToArray();
}
}
Then I tried adding in some buffering
public static byte[] ProcessLargePdfDocument(PdfDocument pdfDocument, long whereToStartReading = 0)
{
List<byte> byteList = new List<byte>();
using (MemoryStream stream = new MemoryStream())
{
pdfDocument.Save(stream, false);
byte[] buffer = new byte[megabyte];
stream.Seek(whereToStartReading, SeekOrigin.Begin);
int bytesRead = stream.Read(buffer, 0, megabyte);
while (bytesRead > 0)
{
byteList.AddRange(buffer);
bytesRead = stream.Read(buffer, 0, megabyte);
}
}
return byteList.ToArray();
}
No matter what I try I get an out of memory exception on the pdfDocument.Save call. I am able to write it to a file location and to read it back using a buffered FileStream in dev but I'm not able to do this on the production environment due to permissions (yet).
Two tips:
Make sure your process runs as a 64-bit process to allow it to use more than 2 GiB of RAM.
stream.ToArray() creates a copy, stream.GetBuffer() lets you access the internal buffer of the MemoryStream. If the exception occurs after the Save() this may make a difference.
I need help converting a VERY LARGE binary file (ZIP file) to a Base64String and back again. The files are too large to be loaded into memory all at once (they throw OutOfMemoryExceptions) otherwise this would be a simple task. I do not want to process the contents of the ZIP file individually, I want to process the entire ZIP file.
The problem:
I can convert the entire ZIP file (test sizes vary from 1 MB to 800 MB at present) to Base64String, but when I convert it back, it is corrupted. The new ZIP file is the correct size, it is recognized as a ZIP file by Windows and WinRAR/7-Zip, etc., and I can even look inside the ZIP file and see the contents with the correct sizes/properties, but when I attempt to extract from the ZIP file, I get: "Error: 0x80004005" which is a general error code.
I am not sure where or why the corruption is happening. I have done some investigating, and I have noticed the following:
If you have a large text file, you can convert it to Base64String incrementally without issue. If calling Convert.ToBase64String on the entire file yielded: "abcdefghijklmnopqrstuvwx", then calling it on the file in two pieces would yield: "abcdefghijkl" and "mnopqrstuvwx".
Unfortunately, if the file is a binary then the result is different. While the entire file might yield: "abcdefghijklmnopqrstuvwx", trying to process this in two pieces would yield something like: "oiweh87yakgb" and "kyckshfguywp".
Is there a way to incrementally base 64 encode a binary file while avoiding this corruption?
My code:
private void ConvertLargeFile()
{
FileStream inputStream = new FileStream("C:\\Users\\test\\Desktop\\my.zip", FileMode.Open, FileAccess.Read);
byte[] buffer = new byte[MultipleOfThree];
int bytesRead = inputStream.Read(buffer, 0, buffer.Length);
while(bytesRead > 0)
{
byte[] secondaryBuffer = new byte[buffer.Length];
int secondaryBufferBytesRead = bytesRead;
Array.Copy(buffer, secondaryBuffer, buffer.Length);
bool isFinalChunk = false;
Array.Clear(buffer, 0, buffer.Length);
bytesRead = inputStream.Read(buffer, 0, buffer.Length);
if(bytesRead == 0)
{
isFinalChunk = true;
buffer = new byte[secondaryBufferBytesRead];
Array.Copy(secondaryBuffer, buffer, buffer.length);
}
String base64String = Convert.ToBase64String(isFinalChunk ? buffer : secondaryBuffer);
File.AppendAllText("C:\\Users\\test\\Desktop\\Base64Zip", base64String);
}
inputStream.Dispose();
}
The decoding is more of the same. I use the size of the base64String variable above (which varies depending on the original buffer size that I test with), as the buffer size for decoding. Then, instead of Convert.ToBase64String(), I call Convert.FromBase64String() and write to a different file name/path.
EDIT:
In my haste to reduce the code (I refactored it into a new project, separate from other processing to eliminate code that isn't central to the issue) I introduced a bug. The base 64 conversion should be performed on the secondaryBuffer for all iterations save the last (Identified by isFinalChunk), when buffer should be used. I have corrected the code above.
EDIT #2:
Thank you all for your comments/feedback. After correcting the bug (see the above edit), I re-tested my code, and it is actually working now. I intend to test and implement #rene's solution as it appears to be the best, but I thought that I should let everyone know of my discovery as well.
Based on the code shown in the blog from Wiktor Zychla the following code works. This same solution is indicated in the remarks section of Convert.ToBase64String as pointed out by Ivan Stoev
// using System.Security.Cryptography
private void ConvertLargeFile()
{
//encode
var filein= #"C:\Users\test\Desktop\my.zip";
var fileout = #"C:\Users\test\Desktop\Base64Zip";
using (FileStream fs = File.Open(fileout, FileMode.Create))
using (var cs=new CryptoStream(fs, new ToBase64Transform(),
CryptoStreamMode.Write))
using(var fi =File.Open(filein, FileMode.Open))
{
fi.CopyTo(cs);
}
// the zip file is now stored in base64zip
// and decode
using (FileStream f64 = File.Open(fileout, FileMode.Open) )
using (var cs=new CryptoStream(f64, new FromBase64Transform(),
CryptoStreamMode.Read ) )
using(var fo =File.Open(filein +".orig", FileMode.Create))
{
cs.CopyTo(fo);
}
// the original file is in my.zip.orig
// use the commandlinetool
// fc my.zip my.zip.orig
// to verify that the start file and the encoded and decoded file
// are the same
}
The code uses standard classes found in System.Security.Cryptography namespace and uses a CryptoStream and the FromBase64Transform and its counterpart ToBase64Transform
You can avoid using a secondary buffer by passing offset and length to Convert.ToBase64String, like this:
private void ConvertLargeFile()
{
using (var inputStream = new FileStream("C:\\Users\\test\\Desktop\\my.zip", FileMode.Open, FileAccess.Read))
{
byte[] buffer = new byte[MultipleOfThree];
int bytesRead = inputStream.Read(buffer, 0, buffer.Length);
while(bytesRead > 0)
{
String base64String = Convert.ToBase64String(buffer, 0, bytesRead);
File.AppendAllText("C:\\Users\\test\\Desktop\\Base64Zip", base64String);
bytesRead = inputStream.Read(buffer, 0, buffer.Length);
}
}
}
The above should work, but I think Rene's answer is actually the better solution.
Use this code:
public void ConvertLargeFile(string source , string destination)
{
using (FileStream inputStream = new FileStream(source, FileMode.Open, FileAccess.Read))
{
int buffer_size = 30000; //or any multiple of 3
byte[] buffer = new byte[buffer_size];
int bytesRead = inputStream.Read(buffer, 0, buffer.Length);
while (bytesRead > 0)
{
byte[] buffer2 = buffer;
if(bytesRead < buffer_size)
{
buffer2 = new byte[bytesRead];
Buffer.BlockCopy(buffer, 0, buffer2, 0, bytesRead);
}
string base64String = System.Convert.ToBase64String(buffer2);
File.AppendAllText(destination, base64String);
bytesRead = inputStream.Read(buffer, 0, buffer.Length);
}
}
}
I am trying to add a large video file(~500MB) to an ArchiveEntry by using this code:
using (var zipFile = ZipFile.Open(outputZipFile, ZipArchiveMode.Update))
{
var zipEntry = zipFile.CreateEntry("largeVideoFile.avi");
using (var writer = new BinaryWriter(zipEntry.Open()))
{
using (FileStream fs = File.Open(#"largeVideoFile.avi", FileMode.Open))
{
var buffer = new byte[16 * 1024];
using (var data = new BinaryReader(fs))
{
int read;
while ((read = data.Read(buffer, 0, buffer.Length)) > 0)
{
writer.Write(buffer, 0, read);
}
}
}
}
}
I am getting the error
System.OutOfMemoryException
when writer.Write is called, alltought I used a intermediate buffer....
Any idea how to solve this?
Build the application as any CPU and execute it in a x64 machine. This should fix the issue. (Or directly build the application as x64).
Videos normally cannot be compressed a lot and the zip file probably remains in memory until the are completely created.
I want to iterate through the contents of a zipped archive and, where the contents are readable, display them. I can do this for text based files, but can't seem to work out how to pull out binary data from things like images. Here's what I have:
var zipArchive = new System.IO.Compression.ZipArchive(stream);
foreach (var entry in zipArchive.Entries)
{
using (var entryStream = entry.Open())
{
if (IsFileBinary(entry.Name))
{
using (BinaryReader br = new BinaryReader(entryStream))
{
//var fileSize = await reader.LoadAsync((uint)entryStream.Length);
var fileSize = br.BaseStream.Length;
byte[] read = br.ReadBytes((int)fileSize);
binaryContent = read;
I can see inside the zip file, but calls to Length result in an OperationNotSupported error. Also, given that I'm getting a long and then having to cast to an integer, it feels like I'm missing something quite fundamental about how this should work.
I think the stream will decompress the data as it is read, which means that the stream cannot know the decompressed length. Calling entry.Length should return the correct size value that you can use. You can also call entry.CompressedLength to get the compressed size.
Just copy the stream into a file or another stream:
using (var fs = await file.OpenStreamForWriteAsync())
{
using (var src = entry.Open())
{
var buffLen = 1024;
var buff = new byte[buffLen];
int read;
while ((read = await src.ReadAsync(buff, 0, buffLen)) > 0)
{
await fs.WriteAsync(buff, 0, read);
await fs.FlushAsync();
}
}
}
I am trying to reduce the size of an Image I am taking from the camera before I save it to Isolated Storage. I already have it reduced to the lowest resolution (640x480) How can I reduce the bytes to 100kb instead of what they are coming out at almost 1mb.
void cam_CaptureImageAvailable(object sender, Microsoft.Devices.ContentReadyEventArgs e)
{
string fileName = folderName+"\\MyImage" + savedCounter + ".jpg";
try
{
// Set the position of the stream back to start
e.ImageStream.Seek(0, SeekOrigin.Begin);
// Save picture as JPEG to isolated storage.
using (IsolatedStorageFile isStore = IsolatedStorageFile.GetUserStoreForApplication())
{
using (IsolatedStorageFileStream targetStream = isStore.OpenFile(fileName, FileMode.Create, FileAccess.Write))
{
// Initialize the buffer for 4KB disk pages.
byte[] readBuffer = new byte[4096];
int bytesRead = -1;
// Copy the image to isolated storage.
while ((bytesRead = e.ImageStream.Read(readBuffer, 0, readBuffer.Length)) > 0)
{
targetStream.Write(readBuffer, 0, bytesRead);
}
}
}
}
finally
{
// Close image stream
e.ImageStream.Close();
}
}
I have nothing about Windows Phone, but maybe this link will help you:
http://msdn.microsoft.com/en-us/library/ff769549(v=vs.92).aspx
1) load your jpeg
2) copy to bitmap
3) save as bitmap with quality settings
http://msdn.microsoft.com/en-us/library/system.windows.media.imaging.extensions.savejpeg(v=vs.92).aspx