Erasing IBuffers in a Destructor


I have a class that internally manages an IBuffer. The data is sensitive, so I would like to have the class ensure that the buffer is 0'd out before it is destroyed to avoid leaving the bits in memory. I have an Erase() method which is as follows:
public static void Erase(this IBuffer value)
{
    using (var writer = new DataWriter(value.AsStream().AsOutputStream()))
    {
        for (int i = 0; i < value.Length; i++)
            writer.WriteByte(0);
        var storeTask = Task.Run(async () => await writer.StoreAsync());
        storeTask.Wait();
    }
}
First, I recognize that my use of Task.Run here to call a non-CPU-bound async method is suspect, but I haven't found a synchronous equivalent. Alternatives are welcome.
The problem I'm experiencing is that in Debug mode, and most of the time in Release mode, it runs perfectly. However, occasionally in Release mode, I bump into an exception when Finalize() is run on my object:
An unhandled exception of type 'System.Runtime.InteropServices.InvalidComObjectException' occurred in System.Runtime.WindowsRuntime.dll
Additional information: Excep_InvalidComObject_NoRCW_Wrapper. For more information, visit http://go.microsoft.com/fwlink/?LinkId=623485
The URL really only talks about the optimizations for exceptions in .NET Native, not this specific exception type.
I presume the cause may have something to do with the IBuffer being destroyed before my Erase method has a chance to complete.
How can I properly achieve the behavior I want?

C# finalizers are not like C++ destructors, even though the term "destructor" is sometimes used to describe them. A C# finalizer is not guaranteed to run at all. If the finalizer does run, the order in which it runs relative to other finalizers is not defined, so your finalizer can't rely on accessing objects that themselves have finalizers (e.g. a COM object wrapper).
So yes, it's entirely possible that if you rely on a finalizer and your finalizer attempts to use an object that itself might have a finalizer, you may find that when your finalizer runs, the object it's trying to use may already have been cleaned up.
It is true that if you implement Dispose(), one strategy is to also implement a finalizer (another is to use a SafeHandle subclass to wrap unmanaged resources). But that's just a backstop, and since finalizers aren't guaranteed to run, it's not a 100% reliable one. The guidance to implement a finalizer isn't because that's a 100% reliable way to clean things up, but rather because it's the closest you're going to get if you're dealing with buggy client code that forgets to call Dispose().
So, yes…in C# the correct strategy here is to implement IDisposable and require clients that want the memory cleaned up safely to make sure that they follow the rules and call Dispose() when they are done with the object.
By the way, as far as your Task.Run() goes…
You should be using asynchronous methods asynchronously. E.g. don't implement Erase(), instead implement EraseAsync() and use await inside.
But if you really insist on waiting for them, there's no need to wrap the call in an anonymous async method that you execute with Task.Run(). That's major overkill. Just wait on the operation returned by StoreAsync(): convert it directly to a Task<T> object using the AsTask() extension method, and then wait on that task object. (Don't rely on the operation's GetResults() method for this; it does not block and is only valid once the operation has completed.)
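As a rough illustration of the IDisposable approach described above, here is a minimal sketch. It assumes the sensitive data lives in a plain byte[] rather than an IBuffer (so it runs without WinRT), and the class name SensitiveData and its members are hypothetical, not part of the original code.

```csharp
using System;

// Hypothetical holder for sensitive bytes; a byte[] stands in for the IBuffer.
public sealed class SensitiveData : IDisposable
{
    private readonly byte[] _buffer;
    private bool _disposed;

    // Assumption: the holder takes ownership of the array passed in.
    public SensitiveData(byte[] secret)
    {
        _buffer = secret ?? throw new ArgumentNullException(nameof(secret));
    }

    public int Length => _buffer.Length;

    // Deterministic cleanup: zero the bytes while every object involved is still valid.
    // There is deliberately no finalizer, because a finalizer cannot rely on other objects.
    public void Dispose()
    {
        if (_disposed) return;   // safe to call more than once
        Array.Clear(_buffer, 0, _buffer.Length);
        _disposed = true;
    }
}
```

A caller would typically wrap the object in a using block, so the erase runs when the block exits, even if an exception is thrown inside it.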

Related

Blocking in Dispose method

Can blocking in a Dispose() method (the synchronous variant) influence the GC process in any way?
Supposing:
classes with no explicit finalizers,
no real resource allocation/freeing within Dispose, just abusing the pattern to inject some code at the point of Dispose call
Up to now, I have understood Dispose as a "normal" method that can also be called by compiler-generated code from syntactic-sugar constructs such as using (var i = new Something(...)){} or using var i = new Something(...); if using is not suitable for us (for any reason), we just call it directly.
This implies that any blocking operation inside it only delays execution at the point of the Dispose call, right?
This implies that the GC does not care about Dispose at all and just collects any instance once there are no references to it, regardless of whether Dispose has been called, right?
From the above I infer that, under the conditions mentioned, there is no reason to expect a memory leak or any influence on the GC when working with instances that block in Dispose, right?
Example of such class:
class DisposeBlocker : IDisposable
{
    private Task localWork;
    private Task remoteWork;
    ...
    DisposeBlocker(Task work)
    {
        remoteWork = work;
    }
    public void Dispose()
    {
        for (var i = 0; i < 1000000; ++i) {} // CPU Bound
        Thread.Sleep(1000); // Should be ok too, right?
        Task.Delay(1000).GetAwaiter().GetResult(); // Is this still ok? Thread pools, contexts, similar stuff...
        Task.WhenAll(localWork, remoteWork).GetAwaiter().GetResult(); // Same as the previous one, right?
    }
}

Correct.
If it has no finalizer, GC will not care about the Dispose method.
This sounds like a big code smell, though: if Dispose is necessary, I would imagine it would be desirable to ensure it is called in most circumstances, ergo you need a finalizer. And finalizers MUST NOT BLOCK or throw an exception under any circumstances.
It's also unexpected for Dispose to block. So you really should avoid this kind of setup.

Is it safe to call CancellationTokenSource.Cancel multiple times?

For example, if I want to cancel some operation in a Dispose() call (which can be called multiple times), then do I need to write
public void Dispose()
{
    if (!cancellationTokenSource.IsCancellationRequested)
    {
        cancellationTokenSource.Cancel();
    }
}
or is it enough with the simpler
public void Dispose()
{
    cancellationTokenSource.Cancel();
}
(You are welcome to comment on whether it is wise or not to cancel things in a Dispose method, but that is not the point of this question.)

Yes.
But only if the CancellationTokenSource has not been disposed yet.
From the reference source:
ThrowIfDisposed();
// ...
// fast-path test to check if Notify has been called previously
if (IsCancellationRequested)
    return;
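To make the "only if it has not been disposed yet" caveat concrete, here is a minimal sketch of a Dispose() that stays safe when called repeatedly. The class name CancellableWorker and its disposed flag are hypothetical; the sketch assumes the class owns the CancellationTokenSource and is the only one that disposes it.

```csharp
using System;
using System.Threading;

// Hypothetical owner of a CancellationTokenSource whose Dispose() may be called repeatedly.
public sealed class CancellableWorker : IDisposable
{
    private readonly CancellationTokenSource _cts = new CancellationTokenSource();
    private bool _disposed;

    public CancellationToken Token => _cts.Token;

    public void Dispose()
    {
        if (_disposed) return;   // second and later calls do nothing
        _disposed = true;

        _cts.Cancel();           // repeated Cancel() on a live source would simply return early
        _cts.Dispose();          // from here on, Cancel() must not be called on the source again
    }
}
```

The flag is what protects the second Dispose() call here: without it, the second call would reach Cancel() on an already disposed source.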

This seems to be more a question about the Dispose pattern than about CancellationToken or anything else. And I am uncertain whether you implemented said pattern properly. Here is the official MS document on the matter:
https://learn.microsoft.com/en-us/dotnet/standard/design-guidelines/dispose-pattern
And here is my interpretation:
There are two levels of cleanup: disposal and finalization. As the code for both is very similar, they are often combined into one function (in C# this is usually the Dispose one). The main difference is whether you relay the call to contained objects. You always relay a Dispose call (the relay is usually what Dispose is about). You never relay a finalization call (finalization is between that instance and the GC only).
There are also two cases: one in which you handle unmanaged resources directly, and one in which you just handle another disposable class.
Unmanaged resource directly
In this case the first thing you do is implement a finalizer, so at least the GC can reliably clean this up. Then you implement IDisposable as an additional feature so programmers can use things like the using pattern to have it cleaned up deterministically at runtime.
Handling something that implements IDisposable
You have a resource that implements IDisposable (say, a FileStream reference). You implement IDisposable in your class for the sole purpose of relaying the Dispose() call to said FileStream. This is by far the more common case. I would guess it makes up about 95-99% of all Dispose implementations.
One thing to keep in mind here is that "Dispose" and "Finalize" often imply lower-level cleanup. A SqlConnection you call Dispose on will be closed first (if necessary). A file handle you dispose of will also be closed first. Note, however, that cancellationTokenSource.Dispose() does not request cancellation itself, so Cancel() still has to be called first if the operation should actually be cancelled; Dispose() itself is repeatable. The class does implement IDisposable, and if a class does, it is usually safer to call Dispose as well rather than relying only on manual cleanup via Cancel: https://learn.microsoft.com/en-us/dotnet/api/system.threading.cancellationtokensource?view=netframework-4.7.2

What is the exact condition we must call dispose method for managed code

I have some doubts related to Dispose and finalizers in C#, which I list below:
1. Apart from unmanaged resources, what exactly is the need for the Dispose method? Why do we use Dispose to release the memory of managed code if there is a garbage collector to release the memory?
2. Also, why are finalizers not recommended? Microsoft must have had some reason to develop the finalizer feature, yet most of the sites I have visited suggest that finalizers are not recommended. What is the reason?
3. Sometimes we use only object.Dispose to release, whereas sometimes we use the IDisposable interface. Why?
4. What is the exact condition under which we must call the Dispose method?

For your #1: as you correctly wrote in your question, the main reason to use Dispose is to release unmanaged resources (like file handles, database connections, etc.), but there is one more case where Dispose does work related to managed resources, and that is disconnecting event handlers. There is a really good explanation about this here.
Answering your #2, finalizers are not recommended because they introduce performance issues and because of that, you should avoid using them if you can use better solutions. As stated in this fragment from "Effective C#" by Bill Wagner:
A finalizer is a defensive mechanism that ensures your objects always
have a way to release unmanaged resources
And if you keep reading...
Finalizers are the only way to guarantee that unmanaged resources
allocated by an object of a given type are eventually released. But
finalizers execute at nondeterministic times, so your design and
coding practices should minimize the need for creating finalizers, and
also minimize the need for executing the finalizers that do exist.
So the finalizer seems to be the only thing you can do to make sure the unmanaged resources are released, so maybe that was the reason you were looking for (I don't really know Microsoft's reason for it, sorry).
To answer your #3 I would need an exact code example of what you mean, but I will try to guess. I suppose you are talking about the following two scenarios:
Calling myObject.Dispose() after using it, explicit way. For example, we can create an instance, use it and then call Dispose:
var myObject = new MyObject();
// More code here...
myObject.Dispose();
That will be ok if you are sure that between the creation of your instance and the calling to the Dispose method there is no exception in your code, which could cause the call to Dispose to be missed. Of course you can always use a finally block:
// Declared outside the try block so that it is still in scope in finally.
MyObject myObject = null;
try {
    myObject = new MyObject();
    (...)
}
catch (Exception) {
    // manage exception
}
finally {
    if (myObject != null)
        myObject.Dispose();
}
Calling Dispose through the IDisposable interface with using. It is basically the same as the previous example with the finally block, but the block is created "automatically":
using (MyObject myObject = new MyObject()) {
    // your code here
}
You can check the docs here.
And answering your #4. I think that this is a good answer, but do not forget to read the comments. So, in short, if it has a Dispose method, it should be called.
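The event-handler case mentioned under #1 can be sketched as follows. The Ticker and TickCounter types are hypothetical and only serve to show a Dispose() that involves no unmanaged resources at all.

```csharp
using System;

// Hypothetical publisher that outlives its subscribers.
public class Ticker
{
    public event EventHandler Tick;

    public void RaiseTick() => Tick?.Invoke(this, EventArgs.Empty);
}

// Hypothetical subscriber: Dispose() detaches the handler so that the
// long-lived Ticker no longer references (and keeps alive) this instance.
public sealed class TickCounter : IDisposable
{
    private readonly Ticker _ticker;

    public int Count { get; private set; }

    public TickCounter(Ticker ticker)
    {
        _ticker = ticker;
        _ticker.Tick += OnTick;
    }

    private void OnTick(object sender, EventArgs e) => Count++;

    // Removing a handler that is no longer attached is a no-op,
    // so calling Dispose() twice is harmless.
    public void Dispose() => _ticker.Tick -= OnTick;
}
```

Without the unsubscription, the publisher's delegate list would keep the subscriber reachable for as long as the publisher itself lives.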

Should I add a destructor/finalizer to my class that contains a Dataset?

I've read through this post about disposing of datasets and I still have a question about the destructor. I know that post basically says that you don't need to dispose of Datasets, Datatables, and Dataviews, but my dataset is MASSIVE, so I want to release that memory ASAP. So, my question, should I include a destructor even though the dataset will be disposed when my objects' dispose method is called? Also, explain to me again why the "bool disposing" is needed.
public DEditUtil(DataSet dsTxData)
{
    this.dsTxData = dsTxData;
}
public void Dispose()
{
    Dispose(true);
    GC.SuppressFinalize(this);
}
protected virtual void Dispose(bool disposing)
{
    if (!disposed)
    {
        if (disposing)
            dsTxData.Dispose();
        disposed = true;
    }
}
~DEditUtil()
{
    Dispose(false);
}

Yes, in general you should implement the full IDisposable pattern whenever either of the following is true:
You have unmanaged resources being allocated by your class, or
You have managed resources that implement IDisposable (which implies that they, in turn, have unmanaged resources)
The finalizer (the general CLR term for what C++/C# call a "destructor") is there to handle cases where your Dispose method is not called for some reason. The boolean value being passed in to your protected Dispose() method indicates whether you are being called from within the public Dispose or from within your finalizer.
If your public Dispose method is being called, that call stack is deterministic: your dispose method is being called directly, so you can safely call methods (including Dispose) on your child objects.
If you are inside of the finalizer, then you have no idea what's going on with other objects that are also being garbage-collected. In general, it may not be safe to call methods on managed objects you control from within your finalizer.
So, the boolean value basically says: "if true, dispose everything; if false, only dispose my unmanaged resources and let everyone else deal with theirs."
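The two paths through Dispose(bool) can be sketched like this. The class NativeBufferOwner is hypothetical: it assumes one unmanaged allocation (made with Marshal.AllocHGlobal) and one managed IDisposable member, purely to show which cleanup belongs on which path.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical class owning one unmanaged and one managed resource.
public class NativeBufferOwner : IDisposable
{
    private IntPtr _nativeMemory = Marshal.AllocHGlobal(64);     // unmanaged
    private readonly MemoryStream _stream = new MemoryStream();  // managed, IDisposable
    private bool _disposed;

    public bool IsDisposed => _disposed;

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this); // cleanup is done, so the finalizer need not run
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed) return;

        if (disposing)
        {
            // Called from the public Dispose(): other managed objects are safe to touch.
            _stream.Dispose();
        }

        // Runs on both paths: the unmanaged memory belongs to this object alone.
        Marshal.FreeHGlobal(_nativeMemory);
        _nativeMemory = IntPtr.Zero;
        _disposed = true;
    }

    ~NativeBufferOwner()
    {
        // Called by the GC: only unmanaged cleanup, no calls into other managed objects.
        Dispose(false);
    }
}
```

A class that holds only managed IDisposable members, like the DataSet wrapper in the question, has nothing to do on the disposing == false path.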

The memory used by your DataSet object will be eligible for garbage collection as soon as it is no longer referenced by the code.
The garbage collector will make that memory available to the program at a later (nondeterministic) time.
Neither of these depends on having a destructor or on calls to Dispose, so the answer is no: you don't need a destructor.

No, you do not need any other method call here; what you did is already enough.
If you do not call Dispose() yourself, the runtime will eventually run the finalizer, which calls Dispose(false); leave it up to the GC to decide how and when to reclaim the memory.
If you have really serious problems with memory, you can try to call GC.Collect() to force a garbage collection. That usually works, but it is never good practice to use it that way, so avoid it as much as possible.
EDIT
According to the comments, it's important to pay attention to the execution flow in your case, because the DataSet cleanup will be done only if disposed == false and disposing == true, which, in the code provided, is the case only for an explicit call to Dispose() from your code.

Very seldom should user-written classes ever use finalizers (or C# destructors) for any purpose other than to log failures to call Dispose. Unless one is delving deep into the particulars of how finalizers work, and exactly what is or is not guaranteed about the context in which they run, one should never call any other object's Dispose method within a finalizer. In particular, if one's object's Finalize() method is running, any IDisposable objects to which it holds a reference will usually fall into one of the following categories:
Someone else still has a reference to that object and expects it to be usable, so calling `Dispose` would be bad.
The object cannot be safely disposed within a finalizer thread context, so calling `Dispose` would be bad.
The object would have kept the present object alive if there were anything meaningful for its `Dispose` handler to do; the fact that the present object's `Finalize` method is running implies that there's no longer any need to call `Dispose` on the other object (this scenario can occur with events).
The object has already had its `Finalize` method called, so calling `Dispose` would be at best superfluous.
The object is scheduled to have its `Finalize` method called, so calling `Dispose` would likely be superfluous.
Although there are a few cases where an object might need to clean up another IDisposable object within a Finalize method, using Finalize properly in such cases is tricky, and using it improperly is apt to be worse than not using it at all. Among other things, Finalize generally only runs when an entity requests an IDisposable and wrongfully fails to call Dispose before abandoning it. It's usually better to focus one's efforts on making sure that Dispose gets called properly before an object is abandoned than on trying to properly handle buggy consumer code.

Why is 'using' improving C# performances

It seems that in most cases the C# compiler could call Dispose() automatically. Most uses of the using pattern look like this:
public void SomeMethod()
{
    ...
    using (var foo = new Foo())
    {
        ...
    }
    // Foo isn't used after here (obviously).
    ...
}
Since foo isn't used (that's a very simple detection) and since it's not provided as an argument to another method (that's an assumption that applies to many use cases and can be extended), the compiler could automatically and immediately call Dispose() without requiring the developer to do it.
This means that in most cases the using is pretty useless if the compiler does some smart work. IDisposable seems low-level enough to me to be taken into account by a compiler.
Now why isn't this done? Wouldn't that improve performance (if the developers are... dirty)?

A couple of points:
Calling Dispose does not increase performance. IDisposable is designed for scenarios where you are using limited and/or unmanaged resources that cannot be accounted for by the runtime.
There is no clear and obvious mechanism by which the compiler could treat IDisposable objects in the code. What makes an instance a candidate for being disposed of automatically, and what doesn't? What if the instance is (or could be) exposed outside of the method? There's nothing to say that just because I pass an object to another function or class I want it to be usable beyond the scope of the method.
Consider, for example, a factory pattern that takes a Stream and deserializes an instance of a class.
public class Foo
{
    public static Foo FromStream(System.IO.Stream stream) { ... }
}
And I call it:
Stream stream = new FileStream(path, FileMode.Open);
Foo foo = Foo.FromStream(stream);
Now, I may or may not want that Stream to be disposed of when the method exits. If Foo's factory reads all of the necessary data from the Stream and no longer needs it, then I would want it to be disposed of. If the Foo object has to hold on to the stream and use it over its lifetime, then I wouldn't want it to be disposed of.
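The framework's own StreamReader faces the same ownership question and lets the caller decide through its leaveOpen constructor parameter. The following sketch contrasts the two choices; the class StreamOwnershipDemo and its method names are hypothetical.

```csharp
using System.IO;
using System.Text;

public static class StreamOwnershipDemo
{
    // The reader takes ownership: disposing it also closes the stream.
    public static string ReadAllAndClose(Stream stream)
    {
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    // The caller keeps ownership: leaveOpen tells the reader not to close the stream.
    public static string ReadAllAndLeaveOpen(Stream stream)
    {
        using (var reader = new StreamReader(stream, Encoding.UTF8,
                   detectEncodingFromByteOrderMarks: true, bufferSize: 1024, leaveOpen: true))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Only the author of the calling code knows which of the two behaviors is intended, which is why the decision cannot be left to the compiler.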
Likewise, what about instances that are retrieved from something other than a constructor, like Control.CreateGraphics()? These instances could exist outside of the code, so the compiler wouldn't dispose of them automatically.
Giving the user control (and providing an idiom like the using block) makes the user's intention clear and makes it much easier to spot places where IDisposable instances are not being properly disposed of. If the compiler were to automatically dispose of some instances, then debugging would be that much more difficult, as the developer would have to decipher how the automatic disposal rules applied to each and every block of code that used an IDisposable object.
In the end, there are two reasons (by convention) for implementing IDisposable on a type.
You are using an unmanaged resource (meaning you're making a P/Invoke call that returns something like a handle that must be released by a different P/Invoke call)
Your type has instances of IDisposable that should be disposed of when this object's lifetime is over.
In the first case, all such types are supposed to implement a finalizer that calls Dispose and releases all unmanaged resources if the developer fails to do so (this is to prevent memory and handle leaks).

Garbage collection (which, while not directly related to IDisposable, is what cleans up unused objects) isn't that simple.
Let me re-word this a little bit. Automatically calling Dispose() isn't that simple. It also won't directly increase performance. More on that a little later.
If you had the following code:
public void DoSomeWork(SqlCommand command)
{
    SqlConnection conn = new SqlConnection(connString);
    conn.Open();
    command.Connection = conn;
    // Rest of the work here
}
How would the compiler know when you were done using the conn object? Or if you passed a reference to some other method that was holding on to it?
Explicitly calling Dispose() or using a using block clearly states your intent and forces things to get cleaned up properly.
Now, back to performance. Simply calling Dispose() on an Object doesn't guarantee any performance increase. The Dispose() method is used for "cleaning up" resources when you're done with an Object.
The performance increase can come when using un-managed resources. If a managed object doesn't properly dispose of its un-managed resources, then you have a memory leak. Ugly stuff.
Leaving the determination to call Dispose() up to the compiler would take away that level of clarity and make debugging memory leaks caused by un-managed resources that much more difficult.

You're asking the compiler to perform a semantic analysis of your code. The fact that something isn't explicitly referenced after a certain point in the source does not mean that it isn't being used. If I create a chain of references and pass one out to a method, which may or may not store that reference in a property or some other persistent container, should I really expect the compiler to trace through all of that and figure out what I really meant?
Volatile entities may also be a concern.
Besides, using() {....} is more readable and intuitive, which is worth a lot in terms of maintainability.
As engineers or programmers, we strive to be efficient, but that is rarely the same thing as lazy.

Look at the MSDN article for the C# using statement. The using statement is just a shortcut to avoid writing try and finally in a lot of places. Calling Dispose is not a low-level functionality like garbage collection.
As you can see, using is translated into:
{
    Font font1 = new Font("Arial", 10.0f);
    try
    {
        byte charset = font1.GdiCharSet;
    }
    finally
    {
        if (font1 != null)
            ((IDisposable)font1).Dispose();
    }
}
How would the compiler know where to put the finally block? Does it call it on Garbage Collection?
Garbage collection doesn't happen as soon as you leave a method; it can happen only after there are no references to the object, and even then at some later time. (Read this article on Garbage Collection to understand it better.) A resource could be tied up for much longer than needed.
The thought that keeps popping into my head is that the compiler should not protect developers who do not clean up their resources. Just because a language is managed doesn't mean that it is going to protect you from yourself.

C++ supports this; they call it "stack semantics for reference types". I support adding this to C#, but it will require different syntax (changing the semantics based on whether or not a local variable is passed to another method isn't a good idea).

I think that you are thinking about finalizers. Finalizers use the destructor syntax in C#, and they are called automatically by the garbage collector. Finalizers are only appropriate to use when you are cleaning up unmanaged resources.
Dispose is intended to allow for early cleanup of unmanaged resources (and it can be used to clean managed resources as well).
Detection is actually trickier than it looks. What if you have code like this:
var myDisposable = new...
AMethod(myDisposable);
// (not used again)
It's possible that some code in AMethod holds on to a reference to myDisposable.
Maybe it gets assigned to an instance variable inside of that method
Maybe myDisposable subscribes to an event inside of AMethod (then the event publisher holds a reference to myDisposable)
Maybe another thread is spawned by AMethod
Maybe myDisposable becomes "enclosed" by an anonymous method or lambda expression inside of AMethod.
All of those things make it difficult to know for absolute certain that your object is no longer in use, so Dispose is there to let a developer say "ok, I know that it's safe to run my cleanup code now".
Bear in mind also that dispose doesn't deallocate your object -- only the GC can do that. (The GC does have the magic to understand all of the scenarios that I described, and it knows when to clean up your object, and if you really need code to run when the GC detects no references, you can use a finalizer). Be careful with finalizers, though -- they are only for unmanaged allocations that your class owns.
You can read more about this stuff here:
http://msdn.microsoft.com/en-us/magazine/bb985010.aspx
and here: http://www.bluebytesoftware.com/blog/2005/04/08/DGUpdateDisposeFinalizationAndResourceManagement.aspx
If you need unmanaged handle cleanup, read about SafeHandles as well.

It's not the responsibility of the compiler to interpret the scopes in your application and do things like figure out when you no longer need memory. In fact, I'm pretty sure that's an impossible problem to solve, because there's no way for the compiler to know what your program will look like at runtime, no matter how smart it is.
This is why we have the garbage collection. The problem with garbage collection is that it runs on an indeterminate interval, and typically if an object implements IDisposable, the reason is because you want the ability to dispose of it immediately. Like, right now immediately. Constructs such as database connections aren't just disposable because they have some special work to do when they get trashed - it's also because they are scarce.

It seems difficult for the GC to know that you won't be using this variable any more later in the same method. Obviously, once you leave the method without keeping a further reference to your variable, the GC will eventually collect it. But the using block in your sample states explicitly that you are sure you will not be using this variable afterwards.

The using statement has nothing to do with performance (unless you consider avoiding resource/memory leaks as performance).
All it does for you is guarantee that the IDisposable.Dispose method is called on the object in question when it goes out of scope, even if an exception has occurred inside the using block.
The Dispose() method is then responsible for releasing any resources used by the object. These are most often unmanaged resources such as files, fonts, images etc, but could also be simple "clean-up" activities on managed objects (not garbage collection however).
Of course if the Dispose() method is implemented badly, the using statement provides zero benefit.
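The guarantee described above, that Dispose() runs even when an exception escapes the using block, can be checked with a small sketch. TrackedResource and UsingDemo are hypothetical names introduced only for this illustration.

```csharp
using System;

// Hypothetical resource that records whether Dispose() ran.
public sealed class TrackedResource : IDisposable
{
    public bool Disposed { get; private set; }

    public void Dispose() => Disposed = true;
}

public static class UsingDemo
{
    public static bool DisposedDespiteException()
    {
        var resource = new TrackedResource();
        try
        {
            using (resource)
            {
                throw new InvalidOperationException("failure inside the using block");
            }
        }
        catch (InvalidOperationException)
        {
            // By the time control reaches this handler,
            // the using block has already called Dispose().
        }
        return resource.Disposed;
    }
}
```

What Dispose() then actually releases is entirely up to the implementation of the resource type.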

I think the OP is saying "why bother with 'using' when the compiler should be able to work it out magically pretty easily".
I think the OP is saying that
public void SomeMethod()
{
    ...
    var foo = new Foo();
    ... do stuff with Foo ...
    // Foo isn't used after here (obviously).
    ...
}
should be equivalent to
public void SomeMethod()
{
    ...
    using (var foo = new Foo())
    {
        ... do stuff with Foo ...
    }
    // Foo isn't used after here (obviously).
    ...
}
because Foo isn't used again.
The answer of course is that the compiler cannot work it out pretty easily. Garbage collection (which is what magically runs finalizers in .NET; it never calls Dispose() itself) is a very complicated field. Just because the symbol isn't used below that point doesn't mean that the variable isn't being used.
Take this example:
public void SomeMethod()
{
    ...
    var foo = new Foo();
    foo.DoStuffWith(someRandomObject);
    someOtherClass.Method(foo);
    // Foo isn't used after here (obviously).
    // Or is it??
    ...
}
In this example, someRandomObject and someOtherClass might both have references to what foo points to, so if we called foo.Dispose() it would break them. You say you're just imagining the simple case, but the only 'simple case' where what you're proposing works is the case where you make no method calls on foo and do not pass foo or any of its members to anything else: effectively, when you don't even use foo at all, in which case you probably have no need to declare it. Even then, you can never be sure that some kind of reflection or event hackery didn't get a reference to foo just by its very creation, or that Foo didn't hook itself up with something else during its constructor.

In addition to the fine reasons listed above, since the problem can't be solved reliably for all cases, those "easy" cases are something that code analysis tools can and do detect. Let the compiler do stuff deterministically, and let your automatic code analysis tools tell you when you're doing something silly like forgetting to call Dispose.
