I'm having an issue with copying data from a MemoryStream into a Stream inside a ZipArchive. The following is NOT working - it returns only 114 bytes:
async Task<byte[]> GetDataAsByteArray(IDataSource dataSource)
{
    using (var zipStream = new MemoryStream())
    {
        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Create, true))
        {
            var file = archive.CreateEntry("compressed.file");
            using (var targetStream = file.Open())
            {
                using (var sourceStream = new MemoryStream())
                {
                    await dataSource.LoadIntoStream(sourceStream);
                    sourceStream.CopyTo(targetStream);
                }
            }
        }
        var result = zipStream.ToArray();
        zipStream.Close();
        return result;
    }
}
However, using the implementation below for the copy step, all 1103 bytes are written to the array/memory stream:
await targetStream.WriteAsync(sourceStream.ToArray(), 0, (int) sourceStream.Length);
I'm wondering why CopyTo yields fewer bytes. I also feel uneasy about the cast to Int32 in the second implementation.
FYI: comparing the byte arrays, it looks like only the header and footer of the zip file were written by the first implementation.
Stream.CopyTo() starts copying from the stream's current Position, which probably isn't 0 after that LoadIntoStream() call. Since it is a MemoryStream, you can simply fix it like this:
await dataSource.LoadIntoStream(sourceStream);
sourceStream.Position = 0;
sourceStream.CopyTo(targetStream);
Set sourceStream.Position = 0 before copying it. Copying proceeds from the current position to the end of the stream.
As others have said, the Position is probably no longer 0. You can't always set the Position back to 0, though, for example on network and compressed streams. You should check the stream.CanSeek property before doing any such operations; if it is false, copy the stream to a new MemoryStream first (which can be seeked), and then, after each operation that changes the position, set the Position back to 0.
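A minimal sketch of that buffering approach (the helper name EnsureSeekableAsync and the use of CopyToAsync are my own, not from the original code):

// Hypothetical helper: returns a seekable stream positioned at 0.
static async Task<Stream> EnsureSeekableAsync(Stream source)
{
    if (source.CanSeek)
    {
        source.Position = 0;
        return source;
    }

    // Non-seekable streams (network, compression, ...) get buffered into
    // a MemoryStream, which supports seeking.
    var buffered = new MemoryStream();
    await source.CopyToAsync(buffered);
    buffered.Position = 0;
    return buffered;
}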
Related
I have small sig files that are exactly 256 bytes. When uploading to a file upload controller in an ASP.NET Core web app, the buffer is filled correctly with the 256 bytes, but they aren't written to the output stream and the file is empty. CopyToAsync works fine. This only happens with certain files. The problem is reproducible in a console application:
string SoureFile = @"C:\Users\me\source\repos\files\mySigFile.sig";
byte[] buffer = new byte[1024 * 64];
string tmp = @"C:\Users\me\Downloads\tempsigfile.tmp";
string tmp2 = @"C:\Users\me\Downloads\tempsigfile2.tmp";

var inputStream = File.Open(SoureFile, FileMode.OpenOrCreate);

// doesn't work
using FileStream writer = new(tmp, FileMode.Create);
int read;
while ((read = await inputStream.ReadAsync(buffer)) != 0)
{
    await writer.WriteAsync(buffer.AsMemory(0, read));
}
inputStream.Position = 0;

// works
using (var stream = new FileStream(tmp2, FileMode.Create))
{
    await inputStream.CopyToAsync(stream);
}

FileInfo info = new FileInfo(tmp);
Console.WriteLine(info.Length); // 0
FileInfo info2 = new FileInfo(tmp2);
Console.WriteLine(info2.Length); // 256
Doing this (using declaration, no braces):
using FileStream writer = new(tmp, FileMode.Create);
means writer will only be disposed at the end of the scope, i.e. at the end of the method. WriteAsync does not necessarily write the data to the file right away; it may land in an internal in-memory buffer and only reach the file when that buffer fills, when you close the file, or when you explicitly call Flush on the stream. You do none of that, and the file is only closed at the end of the method, but your check:
FileInfo info = new FileInfo(tmp);
Console.WriteLine(info.Length); //0
is performed before the actual write to the file happens, so you see 0. If you check the contents of the file after this method (program) completes, you'll see it contains the correct data.
In the second case you use a using statement:
using (var stream = new FileStream(tmp2, FileMode.Create))
{
    await inputStream.CopyToAsync(stream);
}
so you write to a file, close it, and only after that check the contents. Then it works as you expect.
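One way to fix the first variant is to scope the writer so it is disposed (and therefore flushed) before the length check. A minimal sketch, reusing the variables from the question:

// Scope the writer with braces so it is flushed and closed
// before the FileInfo check below runs.
using (FileStream writer = new(tmp, FileMode.Create))
{
    int read;
    while ((read = await inputStream.ReadAsync(buffer)) != 0)
    {
        await writer.WriteAsync(buffer.AsMemory(0, read));
    }
} // writer is disposed here, so the buffered bytes reach the disk

FileInfo info = new FileInfo(tmp);
Console.WriteLine(info.Length); // now 256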
I have multiple Stream instances and would like to zip them up using ZipArchive. This is the code I'm using:
using var memoryStream = new MemoryStream();
using (var zipArchive = new ZipArchive(memoryStream, ZipArchiveMode.Create, true))
{
    foreach (var stream in streams)
    {
        var entry = zipArchive.CreateEntry($"{Guid.NewGuid()}.jpeg");
        using var entryStream = entry.Open();
        stream.CopyTo(entryStream);
    }
}
memoryStream.Seek(0, SeekOrigin.Begin);
// memoryStream.Position = 0; // this throws the same exception
This code works fine when streams contains only a few items, but it fails with 10 items. The error comes from the last line, where I try to rewind back to the initial position.
{System.ArgumentException: Offset and length were out of bounds for the array or count is greater than the number of elements from index to the end of the source collection. at System.IO.MemoryStream.Read(System.Byte[] buffer, System.Int32 offset, System.Int…}
This is the memory stream state in debugger:
CanRead: true
CanSeek: true
CanWrite: true
Capacity: 8534016
Length: 6853402
Position: 6853402
Note that this code runs on a physical iOS device, in an app written with Xamarin.iOS.
What's causing this error to be thrown?
I have implemented a code block to convert a Stream into a byte array; the snippet is shown below. Unfortunately, it throws an OutOfMemoryException while converting the MemoryStream to an array (return newDocument.ToArray();). Could someone please help me with this?
public byte[] MergeToBytes()
{
    using (var processor = new PdfDocumentProcessor())
    {
        AppendStreamsToDocumentProcessor(processor);
        using (var newDocument = new MemoryStream())
        {
            processor.SaveDocument(newDocument);
            return newDocument.ToArray();
        }
    }
}

public Stream MergeToStream()
{
    return new MemoryStream(MergeToBytes());
}
Firstly: how big is the document? If it is too big for the byte[] limit, you're going to have to use a different approach.
However, a MemoryStream is already backed by an (oversized) array; you can get this simply using newDocument.TryGetBuffer(out var buffer), and noting that you must restrict yourself to the portion of the .Array indicated by .Offset (usually, but not always, zero) and .Count (the number of bytes that should be considered "live"). Note that TryGetBuffer can return false, but not in the new MemoryStream() scenario.
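A small sketch of what that looks like; Process is a hypothetical consumer, not part of any API here:

// Get the underlying buffer without copying; only the slice
// [segment.Offset, segment.Offset + segment.Count) contains live data.
if (newDocument.TryGetBuffer(out ArraySegment<byte> segment))
{
    Process(segment.Array, segment.Offset, segment.Count);
}
else
{
    // TryGetBuffer returns false when the MemoryStream wraps a
    // caller-supplied buffer that isn't publicly visible; fall back to a copy.
    byte[] copy = newDocument.ToArray();
    Process(copy, 0, copy.Length);
}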
It is also interesting that you're converting a MemoryStream to a byte[] and then back to a MemoryStream. An alternative here would have been to simply set the Position back to 0, i.e. rewind it. So:
public Stream MergeToStream()
{
    using var processor = new PdfDocumentProcessor();
    AppendStreamsToDocumentProcessor(processor);
    var newDocument = new MemoryStream();
    processor.SaveDocument(newDocument);
    newDocument.Position = 0;
    return newDocument;
}
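Hypothetical usage, just to show the returned stream is ready to read from the start (the file path is illustrative):

// Copy the merged document straight to a file; no intermediate byte[] needed.
using (var merged = MergeToStream())
using (var file = File.Create("merged.pdf"))
{
    merged.CopyTo(file);
}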
I needed to put a byte array into a memory stream, so initially I used:
byte[] Input;
using (MemoryStream mem = new MemoryStream())
{
    mem.Write(Input, 0, (int)Input.Length);
    StreamReader stream = new StreamReader(mem);
    ...
}
I wanted to use the StreamReader to read lines from a text file.
It didn't work.
Then I used
using (MemoryStream mem = new MemoryStream(Input))
instead and removed
mem.Write(Input, 0, (int)Input.Length);
It worked. I don't know why. Why did it work?
In your first approach, you use mem.Write(Input, 0, (int)Input.Length);. Note that MemoryStream.Write sets the stream's read/write position past the written data; in your example that means the position ends up at the end of the stream. Trying to read from the MemoryStream then returns no data, because the read/write position is already at the end of the stream.
In your second approach, you passed the Input byte array as argument to the MemoryStream constructor. Providing the byte array through the constructor not only will make MemoryStream use this byte array, but more importantly it keeps the initial stream position of zero. Thus, when trying to read from the MemoryStream initialized in this way, the data contained in the input byte array will be returned as expected.
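A tiny demonstration of the difference between the two approaches:

var data = new byte[] { 1, 2, 3 };

var written = new MemoryStream();
written.Write(data, 0, data.Length);
Console.WriteLine(written.Position); // 3 -- at the end, so a read returns nothing

var wrapped = new MemoryStream(data);
Console.WriteLine(wrapped.Position); // 0 -- reads start at the beginning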
How to fix the problem with the first approach?
You can make the first approach with MemoryStream.Write work by simply setting the MemoryStream position back to the intended/original value (zero in your example) after writing the data to the MemoryStream:
byte[] Input;
using (MemoryStream mem = new MemoryStream())
{
    mem.Write(Input, 0, (int)Input.Length);
    mem.Position = 0;
    using (StreamReader stream = new StreamReader(mem))
    {
        ...
    }
}
I'm working on a project where I need the ability to zip and unzip streams and byte arrays. I was running some unit tests that create a zip from a stream and then unzip it, and the only way DotNetZip sees the result as a zip is if I run streamToZip.Seek(0, SeekOrigin.Begin) and streamToZip.Flush(). If I don't do this, I get the error "Cannot read Block, No data" from ZipFile.Read(stream).
I was wondering if anyone could explain why that is. I've seen a few articles on using Seek to set the relative read position, but none that really explain why it is required in this situation.
Here is my Code:
Zipping the Object:
public Stream ZipObject(Stream data)
{
    var output = new MemoryStream();
    using (var zip = new ZipFile())
    {
        zip.AddEntry(Name, data);
        zip.Save(output);
        FlushStream(output);
        ZippedItem = output;
    }
    return output;
}
Unzipping the Object:
public List<Stream> UnZipObject(Stream data)
{
    FlushStream(data); // This is what I had to add in to make it work
    using (var zip = ZipFile.Read(data))
    {
        foreach (var item in zip)
        {
            var newStream = new MemoryStream();
            item.Extract(newStream);
            UnZippedItems.Add(newStream);
        }
    }
    return UnZippedItems;
}
Flush method I had to add:
private static void FlushStream(Stream stream)
{
    stream.Seek(0, SeekOrigin.Begin);
    stream.Flush();
}
When you return output from ZipObject, that stream is at the end - you've just written the data. You need to "rewind" it so that the data can then be read. Imagine you had a video cassette, and had just recorded a program - you'd need to rewind it before you watched it, right? It's exactly the same here.
I would suggest doing this in ZipObject itself though - and I don't believe the Flush call is necessary. I'd personally use the Position property, too:
public Stream ZipObject(Stream data)
{
    var output = new MemoryStream();
    using (var zip = new ZipFile())
    {
        zip.AddEntry(Name, data);
        zip.Save(output);
    }
    output.Position = 0;
    return output;
}
When you write to a stream, the position changes. If you want to decompress the same stream object, you'll need to reset the position; otherwise you'll get an EndOfStreamException, because ZipFile.Read starts reading at stream.Position.
So
stream.Seek(0, SeekOrigin.Begin);
Or
stream.Position = 0;
would do the trick.
Off-topic, but useful:
public IEnumerable<Stream> UnZipObject(Stream data)
{
    using (var zip = ZipFile.Read(data))
    {
        foreach (var item in zip)
        {
            var newStream = new MemoryStream();
            item.Extract(newStream);
            newStream.Position = 0;
            yield return newStream;
        }
    }
}
This won't unzip all items into memory at once (despite the MemoryStream used in UnZipObject()); items are only extracted when the result is iterated, because they are yielded (the method returns an IEnumerable<Stream>). More info on yield: http://msdn.microsoft.com/en-us/library/vstudio/9k7k7cf0.aspx
Normally I wouldn't recommend returning data as a stream, because a stream acts somewhat like an iterator (it uses .Position as its current position), so it isn't thread-safe by default. I'd rather return these memory streams as ToArray().
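A sketch of that byte-array alternative (the method name is mine; it assumes the same DotNetZip types as above):

public IEnumerable<byte[]> UnZipObjectAsArrays(Stream data)
{
    using (var zip = ZipFile.Read(data))
    {
        foreach (var item in zip)
        {
            using (var newStream = new MemoryStream())
            {
                item.Extract(newStream);
                // ToArray copies all written bytes regardless of Position,
                // so no rewind is needed here.
                yield return newStream.ToArray();
            }
        }
    }
}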