Today I saw a piece of code that seemed odd at first glance and made me reconsider. Here is a shortened version of the code:
if(list != null){
    list.Clear();
    list = null;
}
My thought was, why not replace it simply by:
list = null;
I read a bit and I understand that clearing a list removes its references to the objects, allowing the GC to do its thing, but does not "resize" the list. The memory allocated for the list itself stays the same.
On the other hand, setting the variable to null also removes the reference to the list (and thus to its items), also allowing the GC to do its thing.
So I have been trying to figure out a reason to do it like the first block. One scenario I thought of is if you have two references to the list. The first block would clear the items in the list, so even if the second reference remains, the GC can still reclaim the memory allocated for the items.
Nonetheless, I feel like there's something weird about this so I would like to know if the scenario I mentioned makes sense?
Also, are there any other scenarios where we would have to Clear() a list right before setting the reference to null?
Finally, if the scenario I mentioned made sense, wouldn't it be better off to just make sure we don't hold multiple references to this list at once and how would we do that (explicitly)?
Edit: I get the difference between Clearing and Nulling the list. I'm mostly curious to know if there is something inside the GC that would make it so that there would be a reason to Clear before Nulling.
The list.Clear() is not necessary in your scenario (where the List is private and only used within the class).
A great intro level link on reachability / live objects is http://levibotelho.com/development/how-does-the-garbage-collector-work :
How does the garbage collector identify garbage?
In Microsoft’s implementation of the .NET framework the garbage collector determines if an object is garbage by examining the reference type variables pointing to it. In the context of the garbage collector, reference type variables are known as “roots”. Examples of roots include:
A reference on the stack
A reference in a static variable
A reference in another object on the managed heap that is not eligible for garbage collection
A reference in the form of a local variable in a method
The key bit in this context is A reference in another object on the managed heap that is not eligible for garbage collection. Thus, if the List is eligible to be collected (and the objects within the list aren't referenced elsewhere) then those objects in the List are also eligible to be collected.
In other words, the GC will realise that list and its contents are unreachable in the same pass.
So, is there an instance where list.Clear() would be useful? Yes. It might be useful if you have two references to a single List (e.g. as two fields in two different objects). One of those references may wish to clear the list in a way that the other reference is also impacted - in which list.Clear() is perfect.
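To make the two-reference case concrete, here is a minimal sketch. The class and method names are made up for illustration; the only point is that Clear() acts on the one shared List object, while assigning null only drops a single reference.

```csharp
using System;
using System.Collections.Generic;

public static class SharedListDemo
{
    // Returns the item count seen through a second reference after the
    // first reference clears the list and is then set to null.
    public static int CountSeenByOtherReference()
    {
        var list = new List<string> { "a", "b", "c" };
        List<string> other = list;   // second reference to the same List object

        list.Clear();                // empties the one shared List object
        list = null;                 // drops only this reference

        return other.Count;          // the other reference now sees an empty list
    }

    public static void Main()
    {
        Console.WriteLine(CountSeenByOtherReference()); // prints 0
    }
}
```

Without the Clear() call, the second reference would still see all three items, and they would stay reachable for as long as that reference lives.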
This answer started as a comment for Mick, who claims that:
It depends on which version of .NET you are working with. On mobile platforms like Xamarin or mono, you may find that the garbage collector needs this kind of help in order to do its work.
That statement is begging to be fact checked. So, let us see...
.NET
.NET uses a generational mark and sweep garbage collector. You can see the abstract of the algorithm in What happens during a garbage collection. In summary, it walks the object graph, and any object it cannot reach can be erased.
Thus, the garbage collector will correctly identify the items of the list as collectible in the same iteration, regardless of whether or not you clear the list. There is no need to decouple the objects beforehand.
This means that clearing the list does not help the garbage collector on the regular implementation of .NET.
Note: If there were another reference to the list, then the fact that you cleared the list would be visible.
Mono and Xamarin
Mono
As it turns out, the same is true for Mono.
Xamarin.Android
Also true for Xamarin.Android.
Xamarin.iOS
However, Xamarin.iOS requires additional considerations. In particular, MonoTouch will use wrapped Objective-C objects which are beyond the garbage collector. See Avoid strong circular references under iOS Performance. These objects require different semantics.
Xamarin.iOS will minimize the use of Objective-C objects by keeping a cache:
C# NSObjects are also created on demand when you invoke a method or a property that returns an NSObject. At this point, the runtime will look into an object cache and determine whether a given Objective-C NSObject has already been surfaced to the managed world or not. If the object has been surfaced, the existing object will be returned, otherwise a constructor that takes an IntPtr as a parameter is invoked to construct the object.
The system keeps these objects alive even if there are no references from managed code:
User-subclasses of NSObjects often contain C# state so whenever the Objective-C runtime performs a "retain" operation on one of these objects, the runtime creates a GCHandle that keeps the managed object alive, even if there are no C# visible references to the object. This simplifies bookeeping a lot, since the state will be preserved automatically for you.
Emphasis mine.
Thus, under Xamarin.iOS, if there were a chance that the list might contain wrapped Objective-C objects, this code would help the garbage collector.
See the question How does memory management works on Xamarin.IOS: in his answer, Miguel de Icaza explains that the semantics are to "retain" the object when you take a reference and "release" it when the reference is null.
On the Objective-C side, "release" does not mean to destroy the object. Objective-C uses reference counting. When we "retain" the object the counter is incremented, and when we "release" it the counter is decremented. The system destroys the object when the counter reaches zero. See: About Memory Management.
Therefore, Objective-C is bad at handling circular references (if A references B and B references A, their reference counts never reach zero, even if neither can be reached), so you should avoid them in Xamarin.iOS. In fact, forgetting to decouple references will lead to leaks in Xamarin.iOS. See: Xamarin iOS memory leaks everywhere.
Others
dotGNU also uses a generational mark and sweep garbage collector.
I also had a look at CrossNet (that compiles IL to C++), it appears they attempted to implement it too. I do not know how good it is.
It depends on which version of .NET you are working with. On mobile platforms like Xamarin or mono, you may find that the garbage collector needs this kind of help in order to do its work, whereas on desktop platforms the garbage collector implementation may be more elaborate. Each implementation of the CLI out there is going to have its own implementation of the garbage collector, and it is likely to behave differently from one implementation to another.
I can remember 10 years ago working on a Windows Mobile application which had memory issues and this sort of code was the solution. This was probably due to the mobile platform requiring a garbage collector that was more frugal with processing power than the desktop.
Decoupling objects helps simplify the analysis the garbage collector needs to do and helps avoid scenarios where the garbage collector fails to recognise that a large graph of objects has actually become disconnected from all the threads in your application, which results in memory leaks.
Anyone who believes you can't have memory leaks in .NET is an inexperienced .NET developer. On desktop platforms just ensuring Dispose is called on objects which implement them may be enough, however with other implementations you may find it is not.
List.Clear() will decouple the objects in the list from the list and each other.
EDIT: So to be clear I'm not claiming that any particular implementation currently out there is susceptible to memory leaks. And again depending on when this answer is read the robustness of the garbage collector on any implementation of the CLI currently out there could have changed since the time writing this.
Essentially I'm suggesting if you know that your code needs to be cross platform and used across many implementations of the .NET framework, especially implementations of the .NET framework for mobile devices, it could be worth investing time into decoupling objects when they are no longer required. In that case I'd start off by adding decoupling to classes that already implement Dispose, and then if needed look at implementing IDisposable on classes that don't implement IDisposable and ensuring Dispose is called on those classes.
How to tell for sure if it's needed? You need to instrument and monitor the memory usage of your application on each platform it is to be deployed on. Rather than writing lots of superfluous code, I think the best approach is to wait until your monitoring tools indicate you have memory leaks.
As mentioned in the docs:
List.Clear Method (): Count is set to 0, and references to other objects from elements of the collection are also released.
In your 1st snippet:
if(list != null){
    list.Clear();
    list = null;
}
If you just set the list to null, you release your reference to the actual list object in memory (so the list itself remains in memory) and then wait for the garbage collector to come and release its allocated memory.
But the problem is that your list may contain elements that hold references to other objects, for example:
list → objectA, objectB, objectC
objectB → objectB1, objectB2
So, after setting the list to null, the list has no references left and should be collected by the garbage collector later, but objectB1 and objectB2 are still referenced by objectB (which is still in memory), and because of that the garbage collector needs to analyse the object reference chain. To make this less confusing, the snippet calls .Clear() first.
Clearing the list ensures that if the list is not garbage collected for some reason, then at the very least, the elements it contained can still be disposed of.
As stated in the comments, preventing other references to the list from existing requires careful planning, and clearing the list before nulling it doesn't incur a big enough performance hit to justify trying to avoid doing so.
I want to put a reference to a C# object into unmanaged memory (C), I guess as a pointer (int), and when the C code calls back into C# later on, I want to get the reference back from the unmanaged memory, so I can resolve it, and access the object. The reason is that the C code controls which object should be used, there's no real alternative. I have limited control over the C code and C++/CLI is not an option.
Question: Is that possible and safe, if so, how?
Well, it is possible. Primary concern is that your scheme is very incompatible with the garbage collector, it moves objects in memory when it compacts the heap. That's something you can stop, you can pin the object so the GC cannot move it. You use GCHandle.Alloc() to allocate a GCHandleType.Pinned handle and pass the return value of GCHandle.AddrOfPinnedObject() to your C code, presumably with a pinvoke call.
You have to fret about how long that object needs to stay pinned. A couple of seconds, tops, is okay, but it gets pretty detrimental to the GC if you keep it pinned for a long time. It is a rock in the road that the GC constantly has to drive around. And the heap segment can never be recycled, that single object can cost you a handful of megabytes.
In which case you should consider allocating unmanaged memory and copying the object into it. Use Marshal.AllocHGlobal() to allocate, Marshal.StructureToPtr() to copy the object into it. Possibly multiple times if you modify the object and the changes need to be visible to the C code as well.
Either way, the object must be blittable or you get a runtime error. An expensive word that just means that the object must have simple field types, the kind that a C program has a shot at reading correctly. Don't use bool. Be careful with the declaration in the C program, pretty easy to corrupt the heap when you get it wrong.
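As a rough sketch of the two approaches described above (pinning, and copying into unmanaged memory), assuming a hypothetical blittable struct named Sample and leaving out the actual pinvoke call into the C code:

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct Sample          // hypothetical blittable struct: simple field types only
{
    public int Id;
    public double Value;
}

public static class InteropSketch
{
    // Pins a blittable array and reads its first element through the raw
    // address, i.e. the address that would be handed to the C code.
    public static int ReadFirstViaPinnedAddress(int[] data)
    {
        GCHandle handle = GCHandle.Alloc(data, GCHandleType.Pinned);
        try
        {
            IntPtr address = handle.AddrOfPinnedObject();
            return Marshal.ReadInt32(address);
        }
        finally
        {
            handle.Free();    // unpin as soon as the native side is done
        }
    }

    // Copies a struct into unmanaged memory and reads it back, standing in
    // for "C code reads the copy" without keeping anything pinned.
    public static Sample RoundTripThroughUnmanagedMemory(Sample value)
    {
        IntPtr buffer = Marshal.AllocHGlobal(Marshal.SizeOf<Sample>());
        try
        {
            Marshal.StructureToPtr(value, buffer, false);
            return Marshal.PtrToStructure<Sample>(buffer);
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);   // unmanaged memory is never collected for you
        }
    }

    public static void Main()
    {
        Console.WriteLine(ReadFirstViaPinnedAddress(new[] { 42, 7 }));
        Sample copy = RoundTripThroughUnmanagedMemory(new Sample { Id = 3, Value = 1.5 });
        Console.WriteLine(copy.Id + " " + copy.Value);
    }
}
```

In real use the handle or buffer would have to stay alive for as long as the C code holds the address, which is exactly the lifetime question raised above.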
When you control the 'handing out' and the 'use after receiving back' phases you can simply use a List or array and pass around the index.
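A minimal sketch of that index idea, with a hypothetical ObjectRegistry class: the C side only ever stores a plain int, so the garbage collector is free to move the objects.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical registry: hands out an index that unmanaged code can store
// and pass back later, instead of a pointer to a managed object.
public sealed class ObjectRegistry<T> where T : class
{
    private readonly List<T> _items = new List<T>();
    private readonly object _gate = new object();

    // Stores the object and returns the index to give to the C code.
    public int Register(T item)
    {
        lock (_gate)
        {
            _items.Add(item);
            return _items.Count - 1;
        }
    }

    // Called from the callback: turns the index back into the object.
    public T Resolve(int index)
    {
        lock (_gate)
        {
            return _items[index];
        }
    }
}

public static class RegistryDemo
{
    public static void Main()
    {
        var registry = new ObjectRegistry<string>();
        int id = registry.Register("first");
        Console.WriteLine(registry.Resolve(id));
    }
}
```

Note that this sketch never removes entries, so every registered object stays reachable for as long as the registry itself does.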
It's possible to consume C# objects via COM and proxies created by the CLR called COM-Callable Wrappers.
You just need to assign a GUID assembly attribute to identify the COM type library, e.g.:
[assembly: Guid ("39ec755f-022e-497a-9ac8-70ba92cfdb7c")]
And then use the Type Library Exporter tool (tlbexp.exe) to generate the COM type library (.tlb) file, which can be consumed in the COM world:
tlbexp.exe YourLibrary.dll
If you mean safe in the C#'s sense of the word, then certainly unsafe, as you'll be using the objects in the unmanaged world, and lifetimes are controlled from the COM side via reference counting as opposed to CLR's GC.
I have a concern. I'm a first year student of computer science. Normally I'm very inquisitive in class, but my teacher does not always have or know the answer. Are destructors necessary in C#? What I mean is: do I have to implement a destructor method as I normally do with constructors, is it good practice, or can I avoid it and let the garbage collector do it for me?
Destructors (or finalizers) are good to have in the language - but you should almost never use them. Basically you should only need them if you have a direct handle on an unmanaged resource, and not only is that incredibly rare, but using SafeHandle as a tiny level of indirection is a better idea anyway (which handles clean-up for you). See Joe Duffy's blog post on the topic for more details.
For what it's worth, I can't remember the last time I wrote a finalizer other than to test some odd behaviour or other.
For the vast majority of the time, life is simpler:
The garbage collector can handle memory resource cleanup
If you use an unmanaged resource (e.g. a file) locally within a method, use a using statement to make sure you release it when you're done with it
If you need a reference to an unmanaged resource (or anything else which implements IDisposable) as an instance variable within your type, your type should itself implement IDisposable. (I try to avoid this where possible. Even when it is necessary, you can make life simpler by making your class sealed, at which point you at least don't need to worry about other subclasses having even more unmanaged state to clean up.)
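The last two points can be sketched as follows. LogWriter is a made-up sealed class that holds a disposable as an instance field and therefore implements IDisposable itself; the caller uses a using statement so Dispose runs even if an exception is thrown.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical sealed type that owns a disposable field, so it implements
// IDisposable itself. No finalizer is needed: StreamWriter and the stream
// beneath it already take care of their own unmanaged state.
public sealed class LogWriter : IDisposable
{
    private readonly StreamWriter _writer;

    public LogWriter(Stream stream)
    {
        _writer = new StreamWriter(stream);
    }

    public void Write(string line)
    {
        _writer.WriteLine(line);
    }

    public void Dispose()
    {
        _writer.Dispose();   // flushes and closes the underlying stream
    }
}

public static class UsingDemo
{
    public static string WriteAndReadBack()
    {
        var memory = new MemoryStream();
        using (var log = new LogWriter(memory))
        {
            log.Write("hello");
        }   // Dispose is called here, even if Write had thrown

        // MemoryStream.ToArray still works after the stream is closed.
        return Encoding.UTF8.GetString(memory.ToArray());
    }

    public static void Main()
    {
        Console.Write(WriteAndReadBack());
    }
}
```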
No, destructors are not necessary in C#. The reason is that in C# memory is managed automatically, and you don't have to do anything apart from creating the object. When the garbage collector determines that an object is no longer referenced anywhere in your application, it reclaims its memory, without you having to declare a destructor for the object as you would in C++, for instance.
Nothing is unnecessary in any language. They serve their purpose.
Destructors will destruct the object, and you can end up with object resurrection. (If you try to access a destructed object, you might get an error.)
The GC will automatically do this for you when the object no longer has any references to it, so there is no need for you to do this explicitly.
Also, implementing IDisposable should be given preference over destructor.
There are lots of questions about managed vs unmanaged resources. I understand the basic definition of the two. However, I have a hard time knowing when a resource or object is managed or unmanaged.
When I think of unmanaged resources I tend to think of native code that isn't directly part of .NET such as pinvoke or marshaling resources. I would normally think of resources meant to interface to something that will use HW such as a file handle or network connection also being unmanaged.
What about .NET objects that wrap native unmanaged resources such as a FileStream.
A FileStream must use unmanaged resources, but when I implement the IDisposable pattern, should I consider this a managed or unmanaged resource?
I've been assuming thus far that if the object implements IDisposable, then it is managed. How would I know that IntPtr should be handled as an unmanaged resource?
A FileStream must use unmanaged resources, but when I implement the IDisposable pattern, should I consider this a managed or unmanaged resource?
A FileStream is a managed resource.
Managed resources are classes that contain (and must manage) unmanaged resources. Usually the actual resource is several layers down.
I've been assuming thus far that if the object implements IDisposable, then it is managed.
Correct.
How would I know that IntPtr should be handled as an unmanaged resource?
From the documentation of the API that you got its value from. But do note that in practice, most programmers never deal with unmanaged resources directly. And when you do have to, use the SafeHandle class to turn an unmanaged resource into a managed resource.
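As an illustration of turning an unmanaged resource into a managed one with SafeHandle, here is a small sketch. UnmanagedBufferHandle is a hypothetical class, and Marshal.AllocHGlobal stands in for whatever API actually handed you the IntPtr.

```csharp
using System;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

// Hypothetical SafeHandle subclass that owns a block of unmanaged memory.
// Once wrapped like this, the IntPtr is released by Dispose, or by the
// SafeHandle's own finalizer if Dispose is forgotten.
public sealed class UnmanagedBufferHandle : SafeHandleZeroOrMinusOneIsInvalid
{
    public UnmanagedBufferHandle(int size) : base(true)   // true: we own the handle
    {
        SetHandle(Marshal.AllocHGlobal(size));
    }

    protected override bool ReleaseHandle()
    {
        Marshal.FreeHGlobal(handle);
        return true;
    }
}

public static class SafeHandleDemo
{
    public static void Main()
    {
        using (var buffer = new UnmanagedBufferHandle(16))
        {
            Marshal.WriteInt32(buffer.DangerousGetHandle(), 123);
            Console.WriteLine(Marshal.ReadInt32(buffer.DangerousGetHandle()));
        }   // the unmanaged memory is freed here
    }
}
```

A class that stores such a handle in a field then only needs a plain Dispose method that disposes the handle, with no finalizer of its own.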
It is pretty straight-forward, you can never accidentally allocate an unmanaged resource. A pinvoke call is required to allocate it, you'd know about it. The term "object" is overloaded, but there is no such thing as an unmanaged object, all objects in a .NET program are managed. You may interop with code written in another language that supports creating objects, like C++. But you cannot directly use such an object, a C++/CLI wrapper is required. Which makes it a managed class that implements IDisposable.
If you work with a poorly documented library then do pay attention when you get an IntPtr back. That's a pretty strong indication that an unmanaged allocation is involved, either a pointer to unmanaged memory or an operating system handle. That library should then also give you a way to release it, if it doesn't otherwise manage it automatically. Contact the owner of the library if you are not sure how to properly deal with it.
It was Microsoft's job to provide managed wrapper classes around all common operating system resources, like FileStream, Socket, etcetera. Those classes almost always implement IDisposable. The only thing you have to do in your code when you store such a class object in your own class is to implement IDisposable yourself, just so you can call the Dispose() method on those objects. Or use the using statement if you use them as a local variable in a method.
It is most helpful to think of a "resource" in this context as meaning "something which an object has asked something else to do on its behalf, until further notice, to the detriment of everyone else". An object constitutes a "managed resource" if abandoning it would result in the garbage collector notifying the object of abandonment, and the object in turn instructing anything that was acting on its behalf to stop doing so. An "unmanaged resource" is a resource which is not encapsulated within a managed resource.
If some object Foo allocates a handle to unmanaged memory, it asks the memory manager to grant it exclusive use of some area of memory, making it unavailable to any other code that might otherwise want to use it, until such time as Foo informs the memory manager that the memory is no longer needed and should thus be made available for other purposes. What makes the handle an unmanaged resource is not the fact that it was received via an API, but rather the fact that even if all deliberate references to it were abandoned the memory manager would forever continue granting exclusive use of the memory to an object which no longer needs it (and likely no longer exists).
While API handles are the most common kind of unmanaged resource, there are countless other kinds as well. Things like monitor locks and events exist entirely within the managed-code world of .net, but can nonetheless represent unmanaged resources since acquiring a lock and abandoning while code is waiting on it may result in that code waiting forever, and since a short-lived object which subscribes to an event from a long-lived object and fails to unsubscribe before it is abandoned may cause that long-lived object to continue carrying around the event reference indefinitely (a small burden if only one subscriber is abandoned, but an unbounded burden if an unbounded number of subscribers are created and abandoned).
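The event case can be made concrete with a small sketch. Publisher and Subscriber are made-up names; the SubscriberCount property exists only so the effect of unsubscribing is observable.

```csharp
using System;

// Long-lived object that raises an event.
public sealed class Publisher
{
    public event EventHandler Changed;

    // Number of handlers currently attached (for demonstration only).
    public int SubscriberCount
    {
        get { return Changed == null ? 0 : Changed.GetInvocationList().Length; }
    }

    public void RaiseChanged()
    {
        if (Changed != null) Changed(this, EventArgs.Empty);
    }
}

// Short-lived object. The subscription is the "resource" it holds: until it
// unsubscribes, the publisher keeps a reference to it.
public sealed class Subscriber : IDisposable
{
    private readonly Publisher _publisher;

    public int Notifications { get; private set; }

    public Subscriber(Publisher publisher)
    {
        _publisher = publisher;
        _publisher.Changed += OnChanged;
    }

    private void OnChanged(object sender, EventArgs e)
    {
        Notifications++;
    }

    public void Dispose()
    {
        _publisher.Changed -= OnChanged;   // release the subscription
    }
}

public static class EventDemo
{
    public static void Main()
    {
        var publisher = new Publisher();
        using (var subscriber = new Subscriber(publisher))
        {
            publisher.RaiseChanged();
            Console.WriteLine(publisher.SubscriberCount);   // 1 while subscribed
        }
        Console.WriteLine(publisher.SubscriberCount);       // 0 after Dispose
    }
}
```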
Addendum
A fundamental assumption of the garbage collector is that when object X holds a reference to object Y, it is because X is "interested" in Y. In some situations, however, the reference may be held because X wants Y to hold a reference to it even though Y doesn't "care" one way or the other. Such situations occur frequently with notification event handlers. Object Y may want to be notified every time something happens to object X. Although X has to keep a reference to Y so it can perform such notifications, X itself doesn't care about the notifications. It only performs them because of a presumption that some rooted object might care about Y's receiving them.
In some cases, it's possible to use what's called a "weak event pattern". Unfortunately, while there are many weak event patterns in .net, all of them have quirks and limitations due to the lack of a proper WeakDelegate type. Further, while weak events are helpful, they're not a panacea. Suppose, for example, that Y has asked long-lived object X to notify it when something happens, the only existing reference to Y is the one X uses for such notification, the only thing Y does with such notification is to increment a property in some object Z, and that setting that property modifies nothing outside Z. Under that scenario, even though object Z will be the only thing in the universe that "cares" about object Y, Z won't hold any sort of reference to Y whatsoever, and so the garbage collector will have no way of tying Y's lifetime to that of Z. If X holds a strong reference to Y, the latter will be kept alive even after nobody's interested in it. If X only holds a weak reference, then Y may be garbage-collected even if Z is interested in it. There is no mechanism by which the garbage collector can automatically infer that Z is interested in Y.
In a complex application (involving inversion of control and quite a few classes) it is hardly possible to know when a certain object won't be referenced any longer.
First Question: Does the statement above suggest that there is a design flaw in such an application? There is a pattern saying: "In all OO programming it is about objects using other types of objects to ease up implementation. However: For any object created there should be some owner that will take care of its lifetime."
I assume it is safe to state that traditional unmanaged OO programming works as stated above: some owner will eventually free / release the used object.
However, the benefit of a managed language is that in principle you don't have to care about lifetime management anymore. As long as an object is referenced somehow (event handler...) and from anywhere (maybe not the "owner"), it lives and should live, since it is still in use.
I really like that idea and that you don't have to think in terms of owner relationships. However at some point in a program it might get obvious that you want to get rid of an object (or at least mute it in a way as it wouldn't be there).
IStoppable: a suggestion of a design pattern
There could be an interface like "IStoppable", with a "Stop()" method and a "Stopped" event, so that any other object using it can remove its references to the object. (It would therefore need to unplug its OnStopped event handler within the event handler, if that is possible.) As a result the object is no longer needed and will get collected.
Maybe it is naive, but what I like to believe about that idea is that there wouldn't be an undefined state of the object. Even if some other object failed to unregister itself in OnStopped, it will just stay alive and can still be called. Nothing gets broken just by removing most references to it.
I think this pattern can be viewed as an anarchistic app design, since
it is based on the idea that ANY other object can manage the lifetime of an IStoppable
there is no need for an owner
it would be considered as OK to leave the decision of unregistering from an IStoppable to those using it
you don't need to dispose, destroy or throw away - you just stop and let live (let GC do the dirty part)
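To make the proposal easier to discuss, the IStoppable idea described above could be sketched roughly like this. Only the interface name, Stop() and Stopped come from the description; Worker and Consumer are hypothetical types added for illustration.

```csharp
using System;

// The proposed interface: anyone may call Stop(), and users listen for Stopped.
public interface IStoppable
{
    void Stop();
    event EventHandler Stopped;
}

// Hypothetical implementation.
public sealed class Worker : IStoppable
{
    public event EventHandler Stopped;

    public bool IsStopped { get; private set; }

    public void Stop()
    {
        if (IsStopped) return;          // stopping twice is harmless
        IsStopped = true;
        if (Stopped != null) Stopped(this, EventArgs.Empty);
    }
}

// Hypothetical user of an IStoppable: drops its reference when notified.
public sealed class Consumer
{
    public IStoppable Target { get; private set; }

    public Consumer(IStoppable target)
    {
        Target = target;
        Target.Stopped += OnStopped;
    }

    private void OnStopped(object sender, EventArgs e)
    {
        Target.Stopped -= OnStopped;    // unplug the handler from within the handler
        Target = null;                  // release the reference so the GC can collect it
    }
}

public static class StoppableDemo
{
    public static void Main()
    {
        var worker = new Worker();
        var consumer = new Consumer(worker);
        worker.Stop();
        Console.WriteLine(consumer.Target == null);
    }
}
```

Unsubscribing inside the handler is safe here because delegate instances are immutable: the invocation already in progress works on the old handler list.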
IDisposable: from scratch and just to check a related pattern:
The disposable pattern suggests that you should still think and work like in unmanaged OO programming: dispose of an object that you don't need any longer.
using is your friend in a method (very comfortable!)
your own IDisposable implementation is your friend otherwise.
after using it / calling Dispose you shouldn't call it any longer: undefined behaviour.
implementation and resource centric: it is not so much about when and why, but more about the details of reclaiming resources
So again: in an application where I don't know whether anything other than an "owner" is pointing to an object, it is hard to ensure that no one will reference and call it any longer.
I read of a "Dispose" event in the Component class of .NET. Is there a design pattern around it?
Why would I want to think in terms of Disposables? Why should I?
In a managed world...
Thanks!
Sebastian
I personally don't like the idea of IStoppable, as defined above. You're saying you want any object to manage the lifetime of the object. However, a defined lifecycle really suggests ownership, and allowing multiple objects to manage the lifetime of a single object is going to cause issues in the long run.
IDisposable is, however, a well defined pattern in the .NET world. I wrote an entire series on implementing IDisposable which is a decent introduction to its usage. However, its purpose is handling resources which have an unmanaged component - when you have a managed object that refers to a native resource, it's often desirable to have explicit control of the lifetime of that resource. IDisposable is a defined pattern for handling that situation.
That being said, a proper implementation of IDisposable will still clean up your resources if you fail to call Dispose(). The downside is that the resource will be cleaned up during the object's finalization, which could occur at any arbitrary point after the object is no longer used. This can be very bad for quite a few reasons - especially if you're using native resources that are limited in nature. By not disposing of the resource immediately, you can run out of resources before the GC runs on the object, especially if there isn't a lot of memory pressure in the system.
Ok first I would point out a few things I find uncomfortable about your IStoppable suggestion.
IStoppable raises event Stopped, consumers must know about this and release references. This is a bit complex at best, problematic at worst. Consumers must know where every reference is in order to remove/reset the reference.
You claim "... Nothing got broken just by removing most references onto it.". That entirely depends on the object implementing IStoppable and its uses. Say, for example, my IStoppable object is an object cache. Now I forget about or ignore the event and suddenly I'm using a different object cache than the rest of the world... maybe that is ok, maybe not.
Events are a horrible way to provide behavior like this because exceptions prove difficult to handle. What does it mean when the third out of 10 event handlers throws an exception in the IStoppable.Stopped event?
I think what you're trying to express is an object that may be 'owned' by many things and can be forcefully released by one? In this case you might consider using a reference counter pattern, more like old-school COM. That of course has issues as well, but they are less of a problem in a managed world.
The issue with a reference counter around an object is that you come back to the idea of an invalid/uninitialized object. One possible way to solve this is to provide the reference counter with a valid 'default' instance (or a factory delegate) to use when all references have been released and someone still wants an instance.
I think you have a misunderstanding of modern OO languages; in particular scope and garbage collection.
The lifetime of an object is very much controlled by its scope, whether the scope is limited to a using clause, a method, or even the appdomain.
Although you don't necessarily "care" about the lifetime of the object, the runtime does: the object becomes eligible for garbage collection once it goes out of scope and nothing else references it.
You can speed up that process by purposely telling the garbage collector to run now, but that's usually a pointless exercise as the runtime will schedule collections at the most opportune time anyway.
If you are talking about objects in multi-threaded applications, these already expose mechanisms to stop their execution or otherwise kill them on demand.
Which leaves us with unmanaged resources. For those, the wrapper should implement IDisposable. I'll skip talking about it as Reed Copsey has already covered that ground nicely.
While there are times a Disposed event (like the one used by Windows Forms) can be useful, events do add a fair bit of overhead. In cases where an object will keep all the IDisposables it ever owns until it's disposed (a common situation) it may be better to keep a List(Of IDisposable) and have a private function "T RegDisp<T>(T obj) where T:IDisposable" which will add an object to the disposables list and return it. Instead of setting a field to SomeDisposable, set it to RegDisp(SomeDisposable). Note that in VB, provided all constructor calls are wrapped in factory methods, it's possible to safely use RegDisp() within field initializers, but that cannot be done in C#.
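A minimal C# sketch of the RegDisp idea, assuming a hypothetical ReportBuilder class that owns two disposables. Since C# field initializers cannot call instance methods, the registration happens in the constructor.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical owner that keeps every IDisposable it creates until it is
// itself disposed.
public sealed class ReportBuilder : IDisposable
{
    private readonly List<IDisposable> _disposables = new List<IDisposable>();
    private readonly MemoryStream _buffer;
    private readonly StreamWriter _writer;

    public ReportBuilder()
    {
        _buffer = RegDisp(new MemoryStream());
        _writer = RegDisp(new StreamWriter(_buffer));
    }

    // True until Dispose has closed the underlying buffer.
    public bool IsOpen
    {
        get { return _buffer.CanRead; }
    }

    // Adds the object to the disposables list and returns it unchanged.
    private T RegDisp<T>(T obj) where T : IDisposable
    {
        _disposables.Add(obj);
        return obj;
    }

    public void Dispose()
    {
        // Dispose in reverse order of creation, so wrappers go before
        // the objects they wrap.
        for (int i = _disposables.Count - 1; i >= 0; i--)
        {
            _disposables[i].Dispose();
        }
        _disposables.Clear();
    }
}

public static class RegDispDemo
{
    public static void Main()
    {
        var builder = new ReportBuilder();
        Console.WriteLine(builder.IsOpen);
        builder.Dispose();
        Console.WriteLine(builder.IsOpen);
    }
}
```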
Incidentally, if an IDisposable's constructor accepts an IDisposable as a parameter, it may often be helpful to have it accept a Boolean indicating whether or not ownership of that object will be transferred. If a possibly-owned IDisposable will be exposed in a mutable property (e.g. PictureBox.Image) the property itself should be read-only, with a setter method that accepts an ownership flag. Calling the set method when the object owns the old object should Dispose the old object before setting the new one. Using that approach will eliminate much of the need for a Disposed event.
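The ownership flag on a constructor might look like the following sketch. StreamConsumer and its ownsStream parameter are hypothetical names chosen for illustration.

```csharp
using System;
using System.IO;

// Hypothetical wrapper whose constructor is told whether ownership of the
// passed-in stream is transferred to it.
public sealed class StreamConsumer : IDisposable
{
    private readonly Stream _stream;
    private readonly bool _ownsStream;

    public StreamConsumer(Stream stream, bool ownsStream)
    {
        _stream = stream;
        _ownsStream = ownsStream;
    }

    public void Dispose()
    {
        // Only dispose what we were explicitly given ownership of.
        if (_ownsStream)
        {
            _stream.Dispose();
        }
    }
}

public static class OwnershipDemo
{
    public static void Main()
    {
        var owned = new MemoryStream();
        var shared = new MemoryStream();

        new StreamConsumer(owned, true).Dispose();
        new StreamConsumer(shared, false).Dispose();

        Console.WriteLine(owned.CanRead);    // False: disposed with its owner
        Console.WriteLine(shared.CanRead);   // True: still usable by the caller
    }
}
```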