Disposing MemoryStreams and GZipStreams

Disposing MemoryStreams and GZipStreams - c#

I want to compress a ProtoBuffer object on serialisation and decompress on deserialisation. Unfortunatly, C# stdlib offers only compression routines that work on streams rather than on byte[], that makes it a bit unesseray more verbose than a function call. My Code so far:
class MyObject{
public string P1 {get; set;}
public string P2 {get; set;}
// ...
public byte[] Serialize(){
var builder = new BinaryFormat.MyObject.Builder();
builder.SetP1(P1);
builder.SetP2(P2);
// ...
// object is now build, let's compress it.
var ms = new MemoryStream();
// Without this using, the serialisatoin/deserialisation Tests fail
using (var gz = new GZipStream(ms, CompressionMode.Compress))
{
builder.Build().WriteTo(gz);
}
return ms.ToArray();
}
public void Deserialize(byte[] data)
{
var ms = new MemoryStream();
// Here, Tests work, even when the "using" is left out, like this:
(new GZipStream(new MemoryStream(data), CompressionMode.Decompress)).CopyTo(ms);
var msg = BinaryFormat.MachineInfo.ParseFrom(ms.ToArray());
P1 = msg.P1;
P2 = msg.P2;
// ...
}
}
When dealing with streams, it seems one has to manually take care of the disposal of the objects. I wonder why that is, I'd expect GZipStream to be fully managed Code. And I wonder If Deserialize works only by accident and if I should dispose the MemoryStreams aswell.
I know I could probably solve this problem by simply using a thrid party compression library, but that's somewhat besides the point of this question.

GZipStream needs to be disposed so it flushes it's final blocks of compression out of its buffer to its underlying stream, it also calls dispose on the stream you passed in unless you use the overload that takes in a bool and you pass in false.
If you where using the overload that did not dispose of the MemoryStream it is not as critical to have the MemoryStream be disposed because it is not writing its internall buffer anywhere. The only thing it does is set some flags and set a Task object null so it can be GCed sooner if the stream lifetime is longer than the dispose point.
protected override void Dispose(bool disposing)
{
try {
if (disposing) {
_isOpen = false;
_writable = false;
_expandable = false;
// Don't set buffer to null - allow TryGetBuffer, GetBuffer & ToArray to work.
#if FEATURE_ASYNC_IO
_lastReadTask = null;
#endif
}
}
finally {
// Call base.Close() to cleanup async IO resources
base.Dispose(disposing);
}
}
Also, although the comment says "Call base.Close() to cleanup async IO resources" the base dispose function from the Stream class does nothing at all.
protected virtual void Dispose(bool disposing)
{
// Note: Never change this to call other virtual methods on Stream
// like Write, since the state on subclasses has already been
// torn down. This is the last code to run on cleanup for a stream.
}
All that being said, when decompressing a GZipStream you can likely get away with not disposing it for the same reason as not disposing the MemoryStream, when decompressing it does not buffer bytes anywhere so there is no need to flush any buffers.

Related

Finding a memory leak

I have an issue with the following code. I create a memory stream in the GetDB function and the return value is used in a using block. For some unknown reason if I dump my objects I see that the MemoryStream is still around at the end of the Main method. This cause me a massive leak. Any idea how I can clean this buffer ?
I have actually checked that the Dispose method has been called on the MemoryStream but the object seems to stay around, I have used the diagnostic tools of Visual Studio 2017 for this task.
class Program
{
static void Main(string[] args)
{
List<CsvProduct> products;
using (var s = GetDb())
{
products = Utf8Json.JsonSerializer.Deserialize<List<CsvProduct>>(s).ToList();
}
}
public static Stream GetDb()
{
var filepath = Path.Combine("c:/users/tom/Downloads", "productdb.zip");
using (var archive = ZipFile.OpenRead(filepath))
{
var data = archive.Entries.Single(e => e.FullName == "productdb.json");
using (var s = data.Open())
{
var ms = new MemoryStream();
s.CopyTo(ms);
ms.Seek(0, SeekOrigin.Begin);
return (Stream)ms;
}
}
}
}

For some unknown reason if I dump my objects I see that the MemoryStream is still around at the end of the Main method.
That isn't particuarly abnormal; GC happens separately.
This cause me a massive leak.
That isn't a leak, it is just memory usage.
Any idea how I can clean this buffer ?
I would probably just not use a MemoryStream, instead returning something that wraps the live uncompressing stream (from s = data.Open()). The problem here, though, is that you can't just return s - as archive would still be disposed upon leaving the method. So if I needed to solve this, I would create a custom Stream that wraps an inner stream and which disposes a second object when disposed, i.e.
class MyStream : Stream {
private readonly Stream _source;
private readonly IDisposable _parent;
public MyStream(Stream, IDisposable) {...assign...}
// not shown: Implement all Stream methods via `_source` proxy
public override void Dispose()
{
_source.Dispose();
_parent.Dispose();
}
}
then have:
public static Stream GetDb()
{
var filepath = Path.Combine("c:/users/tom/Downloads", "productdb.zip");
var archive = ZipFile.OpenRead(filepath);
var data = archive.Entries.Single(e => e.FullName == "productdb.json");
var s = data.Open();
return new MyStream(s, archive);
}
(could be improved slightly to make sure that archive is disposed if an exception happens before we return with success)

Generic Stream that can call Dispose() method for another object after closing

Here is a very simple example how I can return a Stream to ASP.NET MVC:
public ActionResult DownloadData(...)
{
using(var lib = new SomeLibrary())
{
// do some stuff...
var stream = new MemoryStream();
lib.SaveAs(stream);
stream.Position = 0;
return File(stream, "contenttype", "filename");
}
}
The problem is that MemoryStream will be allocated in large heap object area and on 32 bit system it will cause OutOfMemoryException quite soon because of RAM fragmentation that will prevent allocation of a large memory block even if there is enough memory. On 64 bit systems this method is also quite slow.
What I want is just return a stream, similar to this
public ActionResult DownloadData(...)
{
using(var lib = new SomeLibrary())
{
// do some stuff...
return File(lib.Stream, "contenttype", "filename");
}
}
However then the using statement will call .Dispose() before my data is returned. If I remove the using statement completely, then the library will lock resources until garbage collector clean up memory.
I think the best solution will be to use a generic Stream that just copy the source stream and call .Dispose() at the end:
public ActionResult DownloadData(...)
{
var lib = new SomeLibrary()
try
{
// do some stuff...
var stream = new CopyStream(lib.Stream, lib);
return File(stream, "contenttype", "filename");
}
catch(Exception)
{
lib.Dispose();
throw;
}
}
The generic stream should call .Dispose() for the second parameter lib after closing.
Is there any existing implementation for such a Stream in .NET or NuGet?

Maybe i did not understand exactly what you need but in your case, there is no need to dispose the stream by yourself (with an explicit Dispose() call or with the using keyword).
The File helper method coming from ASP MVC controller is designed to return streams : It creates a FileStreamResult that is in charge of disposing the encapsulated stream and will do it when its job is done.
On this topic, you can find a far better and more detailed answer of Darin Dimitrov in here.

Give up the ownership of a stream/file

I'm having a real strange issue while working with streams in mono.
I open a stream like this:
_stream = new FileStream(filename, FileMode.Open, FileAccess.ReadWrite, FileShare.Read);
Where _stream of course is a FileStream. I added a dispose method for this class like that:
public override void Dispose()
{
Dispose(true);
GC.SuppressFinalize(this);
}
private void Dispose(bool disposing)
{
if(!_disposed)
{
if(disposing)
{
_stream.Flush(true);
_stream.Dispose();
_items.Clear();
}
_disposed = true;
}
}
Now i have this simple test:
[Test]
public void SimpleTest()
{
SyncFileWriter s = new SyncFileWriter("file");
s.Save("a", "b");
s.Dispose();
SyncFileWriter s2 = new SyncFileWriter("file");
s2.Dispose();
}
But if fails with a IOException: Sharing violation. [edit]It looks like there was an error with monodevelop. Restarted it and now it fails everytime[/edit]
How to avoid this and close the stream properly?

It looks like it was my fault. The SyncFileWriter is derived from a abstract class that checkes whether the file exist, or not. If not, it will initialize this file with some default things and that's why I did the following
File.Create(filename);
As you see, I create a file, but File.Create opens a stream and does not close it after leaving it's scope, whatever.
Now I do this:
Filestream f = File.Create(filename);
...
f.Dispose();
and everything is fine.
I think here lies also the problem for the first nondeterministic behavoir, because it does this only, when a parameter is null, and it is only null, when you create a new file.
Anyway, Thanks for your help!

Not quite sure what you're trying to achieve, but I guess this may help: http://www.blackwasp.co.uk/UsingStatement.aspx

Why does visual studio code analysis warn me that this object can be disposed twice in this code (when it can't)

Code is:
using (MemoryStream memorystream = new MemoryStream(bytes))
{
using (BinaryWriter writer = new BinaryWriter(memorystream))
{
writer.Write((double)100.0);
}
return memorystream.ToArray();
}
Isn't the above code appropriate to properly dispose of both object?
Is the code analysis at all useful? Other than garbage information about variables names and namespaces it seems to complain about a lot of things that are not reality. I am really thinking that maybe it is useful and I am just missing the point.
OKAY to address concerns whether the MemoryStream is disposed of or not (its not) here is an example where the VS code analysis gives me the exact same warning. Clearly nothing is getting disposed of here
public class MyClass : IDisposable
{
public void DoSomethingElse()
{
}
#region IDisposable Members
public void Dispose()
{
throw new NotImplementedException();
}
#endregion
}
public class MyOtherClass : IDisposable
{
public MyOtherClass(MyClass mc)
{
}
public void DoSomething() { }
}
public void Foo()
{
using (MyClass mc = new MyClass())
{
using (MyOtherClass otherclass = new MyOtherClass(mc))
{
otherclass.DoSomething();
}
mc.DoSomethingElse();
}
}

On "If the stream were disposed then how can I call memorystream.ToArray()" part of the question:
MemeoryStream.Dispose does not release internal buffer. The only thing that it does is blocking all Stream methods (like Read) by throwing "Object Disposed" exception. I.e. following code is perfectly valid (usually similar code accessing lower level storage can be written for other other Stream objects that manage some sort of storage):
MemoryStream memoryStream = new MemoryStream();
using (memoryStream)
{
// write something to the stream
}
// Note memoryStream.Write will fail here with an exception,
// only calls to get storage (ToArray and GetBuffer) make sense at this point.
var data = memoryStream.ToArray();

BinaryWriter does not dispose the MemoryStream. YOU NEED TO DISPOSE OF IT. Get the results of .ToArray in a variable, then dispose the MemoryStream, then return the result. Or use a try/finally to dispose of the MemoryStream.

If I were to guess I would say that the logic of the analysis rule is identifying a possiblity that goes someting like this:
"You are passing a disposable object into the constructor of another disposable object (thus semantically implying a transfer of ownership). If ownership is truly being trasnsferred to the second object, it's likely that it will dispose of the object passed to its constructor in its own Dispose method but you're disposing of it as well."
Your second example proves that it is not actually analysing whether or not the second object takes ownership of and disposes the first, but rather this rule is saying this pattern has some semantic ambiguity, as the code responsible for disposal of the first object is no longer clear.
The dispose pattern has all of the inherent pitfalls of alloc/free and new/delete as only one class should own and thus control the lifetime of a disposable instance.
All pure conjecture of course but that's how I'd read it.

MemoryStream wraps a managed buffer that will be garbage collected at some point in the future regardless of whether its disposed of.
The IDisposable implementation is part of the Stream contract. It closes the stream preventing you from using the Read and Write methods but you are still permitted to access the underlying bytes. (Preventing calls to ToArray would probably make it difficult to use with StreamWriters.) A Stream implementation that does use unmanaged resources like a FileStream would release the file handle when it is disposed of.
The code analysis is noting that both you and the StreamWriter are calling Dispose on the same object and warning about it. This is pretty meaningless for a MemoryStream but might be dangerous with some other IDisposable implementations.
You appear to be confusing the IDisposable interface with garbage collection. Calling Dispose does not release managed objects, it just gives your class the opportunity to release unmanaged resources in a deterministic manner.

I think BinaryWriter.Dispose() will call its underlying MemoryStream's .Dispose(), and then your using will re-dispose it.
At least I think that's what will happen. I haven't opened the source for BinaryWriter to verify, but I was always under the impression it closed and disposed its underlying stream.
Edit:
This is the source for the classes. You can see where your memory stream is being disposed:
BinaryStream:
public BinaryWriter(Stream output) : this(output, new UTF8Encoding(false, true))
{
}
public BinaryWriter(Stream output, Encoding encoding)
{
if (output==null)
throw new ArgumentNullException("output");
if (encoding==null)
throw new ArgumentNullException("encoding");
if (!output.CanWrite)
throw new ArgumentException(Environment.GetResourceString("Argument_StreamNotWritable"));
Contract.EndContractBlock();
OutStream = output;
_buffer = new byte[16];
_encoding = encoding;
_encoder = _encoding.GetEncoder();
}
protected virtual void Dispose(bool disposing)
{
if (disposing)
OutStream.Close();
}
MemoryStream:
public class MemoryStream : Stream
{
....
}
Stream
public virtual void Close()
{
Dispose(true);
GC.SuppressFinalize(this);
}
To answer one of your further questions, "why can you still call .ToArray() if MemoryStream has been disposed", well, this test passes just fine:
[TestMethod]
public void TestDispose()
{
var m = new MemoryStream();
m.WriteByte(120);
m.Dispose();
var a = m.ToArray();
Assert.AreEqual(1, a.Length);
}
So the .ToArray() method is still accessible after dispose.
The bytes are still available. MemoryStream dispose just sets some flags internally that prevent you from further modifying the stream:
protected override void Dispose(bool disposing)
{
try {
if (disposing) {
_isOpen = false;
_writable = false;
_expandable = false;
// Don't set buffer to null - allow GetBuffer & ToArray to work.
}
}
finally {
// Call base.Close() to cleanup async IO resources
base.Dispose(disposing);
}
}
In fact, note that the source has the comment in it:
// Don't set buffer to null - allow GetBuffer & ToArray to work.
so that really answers that question :)

Disposing of FileStream does not end Asynchronous Read

I open a FileStream to be written to Asynchronously.
m_oFile.BeginRead(arrInputReport, 0, m_nInputReportLength, new AsyncCallback(ReadCompleted), arrInputReport);
I dispose this using the following code:
if (bDisposing)
{
if (m_oFile != null)
{
m_oFile.Dispose();
m_oFile = null;
}
Unfortunately, after calling the dispose method, ReadComplete method still receives results:
protected void ReadCompleted(IAsyncResult iResult)
{
byte[] arrBuff = (byte[])iResult.AsyncState; // retrieve the read buffer
try
{
m_oFile.EndRead(iResult);
It will get a nullReference error at the m_oFile.EndRead line. Checking for null gets rid of the exception but just traps the program in this method. How can I dispose of the ReadComplete method?

If you don't have reliable access to the instance you used, then I would suggest using the AsyncState to store all the things you need. For example:
var state = Tuple.Create(m_oFile, arrInputReport);
m_oFile.BeginRead(...blah... , ReadCompleted, state);
with:
var state = (Tuple<FileInfo, byte[]>)iResult.AsyncState;
FileInfo file = state.Item1;
byte[] arrBuff = state.Item2;
Note a null-check isn't enough, because if a new file has been used, you would be calling it on the wrong FileInfo instance - not a good thing.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Disposing MemoryStreams and GZipStreams - c#

Related

Finding a memory leak

Generic Stream that can call Dispose() method for another object after closing

Give up the ownership of a stream/file

Why does visual studio code analysis warn me that this object can be disposed twice in this code (when it can't)

Disposing of FileStream does not end Asynchronous Read

Categories

Resources