GZipStream compression not working - c#

I'm trying to read in a file and compress it using GZipStream, like this:
using (var outStream = new MemoryStream())
{
    using (var fileStream = new FileStream(filename, FileMode.Open, FileAccess.Read))
    {
        using (var gzipStream = new GZipStream(outStream, CompressionMode.Compress))
        {
            fileStream.CopyTo(gzipStream);
            Debug.WriteLine(
                "Compressed from {0} to {1} bytes",
                fileStream.Length,
                outStream.Length);
            // "outStream" is utilised here (persisted to a NoSql database).
        }
    }
}
The problem is that outStream.Length always shows 10 bytes. What am I doing wrong?
I've tried calling gzipStream.Close() after the fileStream.CopyTo line (as suggested in other forums) but this seems to close outStream too, so the subsequent code that uses it falls over.

MSDN says: The write operation might not occur immediately but is buffered until the buffer size is reached or until the Flush or Close method is called.
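In other words, the fact that all the Write operations are done doesn't mean the data is already in the MemoryStream; the 10 bytes you see are just the gzip header. You have to close (dispose) the gzipStream before reading outStream. Flush() is not a reliable substitute, because the gzip trailer is only written when the stream is closed. Since closing a GZipStream also closes the underlying stream by default, either pass true for the leaveOpen constructor argument or read the result with outStream.ToArray(), which still works on a closed MemoryStream.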
Example:
using (var outStream = new MemoryStream())
{
    using (var fileStream = new FileStream(filename, FileMode.Open, FileAccess.Read))
    {
        // The third argument (leaveOpen: true) stops gzipStream from closing outStream.
        using (var gzipStream = new GZipStream(outStream, CompressionMode.Compress, true))
        {
            fileStream.CopyTo(gzipStream);
        }
        // gzipStream is disposed here, so outStream now holds the complete output.
        Debug.WriteLine(
            "Compressed from {0} to {1} bytes",
            fileStream.Length,
            outStream.Length);
        // "outStream" is utilised here (persisted to a NoSql database).
    }
}
Also, ideally, move the code that uses outStream outside of the FileStream block as well. You want to close files as soon as you can, rather than waiting for some other processing to finish.
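The advice above can be sketched as a small helper. This is only an illustration, assuming the three-argument GZipStream constructor is available; the names GzipSketch, Compress and Decompress are hypothetical and not part of the original question.

```csharp
using System.IO;
using System.IO.Compression;

public static class GzipSketch
{
    // Compresses a byte array and returns the complete gzip payload.
    public static byte[] Compress(byte[] input)
    {
        using (var outStream = new MemoryStream())
        {
            // leaveOpen: true keeps outStream usable after gzipStream is disposed.
            using (var gzipStream = new GZipStream(outStream, CompressionMode.Compress, leaveOpen: true))
            {
                gzipStream.Write(input, 0, input.Length);
            } // Disposing here pushes out buffered data and the gzip trailer.

            return outStream.ToArray();
        }
    }

    // Decompresses a payload; used here only to check that it is complete.
    public static byte[] Decompress(byte[] compressed)
    {
        using (var inStream = new MemoryStream(compressed))
        using (var gzipStream = new GZipStream(inStream, CompressionMode.Decompress))
        using (var result = new MemoryStream())
        {
            gzipStream.CopyTo(result);
            return result.ToArray();
        }
    }
}
```

Reading the file into the compressor with fileStream.CopyTo(gzipStream) instead of Write follows the same pattern; the important part is that the output is read only after the GZipStream has been disposed.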

Related

FileStream using block not disposing of file properly when using CopyToAsync

I have a situation where I need to asynchronously move a small list of files to another location on the network. I have the following method to do this, but it is occasionally throwing an IO Exception (cannot access the file x because it is being used by another process) when trying to delete the source file. I expected the using block to take care of disposing the FileStreams for me so am not sure what is going on.
public static async Task MoveFileAsync(string sourceFile, string destinationFile)
{
    using (var sourceStream = new FileStream(sourceFile, FileMode.Open, FileAccess.Read, FileShare.Read, 4096, FileOptions.Asynchronous | FileOptions.SequentialScan))
    using (var destinationStream = new FileStream(destinationFile, FileMode.CreateNew, FileAccess.Write, FileShare.None, 4096, FileOptions.Asynchronous | FileOptions.SequentialScan))
    {
        await sourceStream.CopyToAsync(destinationStream);
    }
    File.Delete(sourceFile);
}
I tried doing this with a File.Move in a Parallel.ForEach loop but found the above method was much quicker in my tests. Any pointers on what might be going on would be greatly appreciated.

Writing to MemoryStream not working as expected

I'm using the DryWetMidi library to process some MIDI data.
First I get the MIDI Data as a MemoryStream from the Clipboard:
MemoryStream ms = (MemoryStream)Clipboard.GetDataObject().GetData("Standard MIDI File");
MidiFile mid = MidiFile.Read(ms);
Then I do some stuff with the midi:
mid.RemoveNotes(n => n.NoteName == NoteName.FSharp);
Now I want to write it back to the Clipboard. I managed to do this like this:
using (FileStream file = new FileStream("file.mid", FileMode.Create, FileAccess.
{
mid.Write(file);
}
using (MemoryStream ms2 = new MemoryStream())
using (FileStream file = new FileStream("file.mid", FileMode.Open, FileAccess.Read))
{
byte[] bytes = new byte[file.Length];
file.Read(bytes, 0, (int)file.Length);
ms2.Write(bytes, 0, (int)file.Length);
Clipboard.Clear();
Clipboard.SetData(midiFormat, ms2);
}
File.Delete("file.mid");
As you can see, first I write the MIDI to a file, then I read that file into a MemoryStream, which I then write to the Clipboard. This doesn't make much sense, because it would be simpler to write to a MemoryStream directly. Also, I don't want to write a file to the user's file system. But that's where the problem is. I tried it like this:
using (MemoryStream ms2 = new MemoryStream())
{
mid.Write(ms2);
}
This doesn't give me an error, but the MemoryStream is completely empty. Calling ms2.Length results in a System.ObjectDisposedException.
How can I write the midi directly into the MemoryStream?
EDIT: Here's the link to the DryWetMidi Write() Method.
Second Edit: Here's a piece of code that won't work:
MemoryStream ms = (MemoryStream)Clipboard.GetDataObject().GetData(midiFormat);
MidiFile mid = MidiFile.Read(ms);
mid.RemoveNotes(n => n.NoteName == NoteName.FSharp);
MemoryStream ms2 = new MemoryStream();
mid.Write(ms2);
var T = ms2.Length; //This will throw an exception
Third Edit: I am 100% sure that the code posted is exactly the same I'm running. Here's the StackTrace. (Gist because formatting was terrible on SO).
As far as I can see, DryWetMidi uses a BinaryWriter to write to the stream, and the default behaviour of BinaryWriter is that when it is disposed, it disposes the underlying stream as well.
You can't read from MemoryStream when it's disposed but you can call ToArray().
byte[] result;
using (MemoryStream ms2 = new MemoryStream())
{
    mid.Write(ms2);
    result = ms2.ToArray();
}
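The BinaryWriter behaviour described above can be reproduced without DryWetMidi. The sketch below is a generic illustration, not a description of the library's internals (which the answer only infers); the class and method names are hypothetical.

```csharp
using System.IO;
using System.Text;

public static class BinaryWriterSketch
{
    // Default BinaryWriter: disposing the writer also disposes the stream.
    public static byte[] WriteThenCopy(int value)
    {
        var stream = new MemoryStream();
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(value);
        }
        // stream.Length would throw ObjectDisposedException at this point,
        // but ToArray() still works on a closed MemoryStream.
        return stream.ToArray();
    }

    // leaveOpen: true keeps the stream alive after the writer is disposed.
    public static long WriteAndKeepOpen(int value)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true))
            {
                writer.Write(value);
            }
            return stream.Length; // An Int32 occupies 4 bytes.
        }
    }
}
```

When the writer is created inside a library, as appears to be the case here, the leaveOpen option is not under the caller's control, which is why ToArray() is the practical way out.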

Async FileStream Writes "NUL" into file

I am using this code to write asynchronously to a file
public static void AsyncWrite(string file, string text)
{
    try
    {
        byte[] data = Encoding.Unicode.GetBytes(text);
        using (FileStream fs = new FileStream(file, FileMode.Create,
            FileAccess.Write, FileShare.Read, 1, true))
            fs.BeginWrite(data, 0, data.Length, null, null);
    }
    catch
    {
    }
}
For some reason, from time to time, rather than writing text into the file as expected, Notepad++ shows the following output:
BeginWrite is asynchronous, so it might well happen that the stream is closed through the using statement while other things are happening.
I wouldn't wrap the stream in a using block when writing asynchronously with BeginWrite. Instead I'd create a proper callback method and close the stream there. This would also give you the chance to call EndWrite, as recommended.
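As an alternative to a hand-written callback, the Task-based WriteAsync (available from .NET Framework 4.5) lets the using block stay, because the await completes before the stream is disposed. This is a swapped-in technique rather than the callback approach described above, and the names AsyncWriteSketch and WriteTextAsync are hypothetical.

```csharp
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class AsyncWriteSketch
{
    public static async Task WriteTextAsync(string file, string text)
    {
        byte[] data = Encoding.Unicode.GetBytes(text);
        using (var fs = new FileStream(file, FileMode.Create, FileAccess.Write,
                                       FileShare.Read, 4096, useAsync: true))
        {
            // The write has finished by the time the using block disposes fs.
            await fs.WriteAsync(data, 0, data.Length);
        }
    }
}
```

Callers need to await the returned Task (or otherwise observe it); discarding it would bring back the original problem of not knowing when the write has completed.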

create file and save to it using memorystream

How can I create a file and write to it using a MemoryStream?
I need to use the MemoryStream to prevent other threads from trying to access the file.
The data I'm trying to save to the file is HTML.
How can this be done?
(Presuming you mean how to copy a file's content to a memory stream)
If you are using .NET Framework 4 or later (which added Stream.CopyTo):
var memoryStream = new MemoryStream();
using var fileStream = new FileStream(FilePath, FileMode.Open, FileAccess.Read);
fileStream.CopyTo(memoryStream);
Here is code that writes a string into a MemoryStream (it does not yet create a file):
byte[] data = System.Text.Encoding.ASCII.GetBytes("This is a sample string");
System.IO.MemoryStream ms = new System.IO.MemoryStream();
ms.Write(data, 0, data.Length);
ms.Close();
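To get from the MemoryStream to an actual file, one option is MemoryStream.WriteTo, sketched below. The names MemoryToFileSketch and SaveHtml are hypothetical, and opening the file with FileShare.None is an assumption about what "prevent other threads from accessing the file" is meant to achieve.

```csharp
using System.IO;
using System.Text;

public static class MemoryToFileSketch
{
    public static void SaveHtml(string path, string html)
    {
        byte[] data = Encoding.UTF8.GetBytes(html);
        using (var ms = new MemoryStream())
        {
            // Build the content in memory first.
            ms.Write(data, 0, data.Length);

            // FileShare.None refuses other handles while the file is being written.
            using (var file = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.None))
            {
                ms.WriteTo(file);
            }
        }
    }
}
```

Building the content in memory keeps the time the file is held open short, since the file handle only exists for the single WriteTo call.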

Writing to then reading from a MemoryStream

I'm using DataContractJsonSerializer, which likes to output to a Stream. I want to top-and-tail the outputs of the serializer so I was using a StreamWriter to alternately write in the extra bits I needed.
var ser = new DataContractJsonSerializer(typeof (TValue));
using (var stream = new MemoryStream())
{
    using (var sw = new StreamWriter(stream))
    {
        sw.Write("{");
        foreach (var kvp in keysAndValues)
        {
            sw.Write("'{0}':", kvp.Key);
            ser.WriteObject(stream, kvp.Value);
        }
        sw.Write("}");
    }
    using (var streamReader = new StreamReader(stream))
    {
        return streamReader.ReadToEnd();
    }
}
When I do this I get an ArgumentException "Stream was not readable".
I'm probably doing all sorts wrong here so all answers welcome. Thanks.
Three things:
Don't close the StreamWriter. That will close the MemoryStream. You do need to flush the writer though.
Reset the position of the stream before reading.
If you're going to write directly to the stream, you need to flush the writer first.
So:
using (var stream = new MemoryStream())
{
    var sw = new StreamWriter(stream);
    sw.Write("{");
    foreach (var kvp in keysAndValues)
    {
        sw.Write("'{0}':", kvp.Key);
        sw.Flush();
        ser.WriteObject(stream, kvp.Value);
    }
    sw.Write("}");
    sw.Flush();
    stream.Position = 0;
    using (var streamReader = new StreamReader(stream))
    {
        return streamReader.ReadToEnd();
    }
}
There's another simpler alternative though. All you're doing with the stream when reading is converting it into a string. You can do that more simply:
return Encoding.UTF8.GetString(stream.GetBuffer(), 0, (int) stream.Length);
Unfortunately MemoryStream.Length will throw if the stream has been closed, so you'd probably want to call the StreamWriter constructor that doesn't close the underlying stream, or just don't close the StreamWriter.
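The "constructor that doesn't close the underlying stream" is the StreamWriter overload taking a leaveOpen flag (available from .NET Framework 4.5). A minimal sketch, with the hypothetical names LeaveOpenSketch and WriteAndRead:

```csharp
using System.IO;
using System.Text;

public static class LeaveOpenSketch
{
    public static string WriteAndRead(string text)
    {
        using (var stream = new MemoryStream())
        {
            // UTF8Encoding(false) avoids writing a byte-order mark at the start.
            using (var sw = new StreamWriter(stream, new UTF8Encoding(false), 1024, leaveOpen: true))
            {
                sw.Write(text);
            } // Disposing flushes the writer but leaves the stream open.

            // Length is still readable because the stream was not closed.
            return Encoding.UTF8.GetString(stream.GetBuffer(), 0, (int)stream.Length);
        }
    }
}
```

Passing Encoding.UTF8 instead would make the writer emit a byte-order mark when it starts at position 0, which would then show up at the front of the decoded string.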
I'm concerned by you writing directly to the stream - what is ser? Is it an XML serializer, or a binary one? If it's binary, your model is somewhat flawed - you shouldn't mix binary and text data without being very careful about it. If it's XML, you may find that you end up with byte-order marks in the middle of your string, which could be problematic.
Setting the memory stream's position to the beginning might help.
stream.Position = 0;
But the core problem is that the StreamWriter closes your memory stream when it is itself closed.
Flushing the StreamWriter at the point where its using block currently ends, and only disposing of it after you have read the data out of the memory stream, will solve this for you.
You may also want to consider using a StringWriter instead...
using (var writer = new StringWriter())
{
    writer.Write("{");
    foreach (var kvp in keysAndValues)
    {
        writer.Write("'{0}':", kvp.Key);
        // Only valid if the serializer offers a TextWriter overload.
        ser.WriteObject(writer, kvp.Value);
    }
    writer.Write("}");
    return writer.ToString();
}
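This requires that your serializer's WriteObject call can accept a TextWriter instead of a Stream. DataContractJsonSerializer.WriteObject has no such overload, so this only applies to serializers that do.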
To access the content of a MemoryStream after it has been closed use the ToArray() or GetBuffer() methods. The following code demonstrates how to get the content of the memory buffer as a UTF8 encoded string.
byte[] buff = stream.ToArray();
return Encoding.UTF8.GetString(buff,0,buff.Length);
Note: ToArray() is simpler to use than GetBuffer() because ToArray() returns exactly the bytes written to the stream, rather than the whole buffer (which might be larger than the stream content). ToArray() makes a copy of the bytes.
Note: GetBuffer() is more performant than ToArray(), as it doesn't make a copy of the bytes. You do need to take care of possible undefined trailing bytes at the end of the buffer by using the stream length rather than the buffer size; since Length throws on a closed stream, capture it before the stream is closed. Using GetBuffer() is strongly advised if the stream size is 85,000 bytes or more, because the ToArray() copy would be allocated on the Large Object Heap, where its lifetime can become problematic.
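The two notes can be compared side by side. This is a small sketch with hypothetical names (BufferSketch, ReadWithToArray, ReadWithGetBuffer), assuming a MemoryStream created with the default constructor so that its buffer is publicly accessible.

```csharp
using System.IO;
using System.Text;

public static class BufferSketch
{
    // ToArray() copies exactly the written bytes and works after Close().
    public static string ReadWithToArray(string text)
    {
        var stream = new MemoryStream();
        byte[] data = Encoding.UTF8.GetBytes(text);
        stream.Write(data, 0, data.Length);
        stream.Close();

        byte[] copy = stream.ToArray();
        return Encoding.UTF8.GetString(copy, 0, copy.Length);
    }

    // GetBuffer() returns the internal array without copying. It may be longer
    // than the content, so the length is captured before the stream is closed.
    public static string ReadWithGetBuffer(string text)
    {
        var stream = new MemoryStream();
        byte[] data = Encoding.UTF8.GetBytes(text);
        stream.Write(data, 0, data.Length);
        int length = (int)stream.Length;
        stream.Close();

        byte[] buffer = stream.GetBuffer();
        return Encoding.UTF8.GetString(buffer, 0, length);
    }
}
```

Decoding the whole GetBuffer() array instead of the first length bytes would typically append zero bytes to the string, which is the trailing-bytes pitfall the note warns about.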
It is also possible to clone the original MemoryStream as follows, for example to read it through a StreamReader:
using (MemoryStream readStream = new MemoryStream(stream.ToArray()))
{
...
}
The ideal solution is to access the original MemoryStream before it has been closed, if possible.
Just a wild guess: maybe you need to flush the StreamWriter? Possibly the system sees that there are writes "pending". By flushing, you know for sure that the stream contains all written characters and is readable.
