Garbage Collection - Is it always safe to access objects via strong references?

Garbage Collection - Is it always safe to access objects via strong references? - c#

My understanding is that when the GC finds a sub-graph of objects that are no longer accessible (via strong references) from the main graph, it will collect them up and reclaim the memory. My question is concerning the order in which inaccessible objects are deleted. Does this occur as an atomic operation? Are all of the inaccessible objects finalized at once, or does the GC finalize each object one-by-one while the application is still executing? If the objects are finalized one-by-one, is there a particular order that is followed?
If I have an object A that holds a weak reference to object B, then it is clear that A must check if B is still alive before invoking any of B’s instance methods. Now suppose B holds a strong reference to another object C. If B is still alive, am I always guaranteed that C will also still be alive? Is there any possibility that the GC might have marked both B & C for collection, but C is finalized before B?
My guess is that it will always be safe to access C from B (since this is a strong reference). But I would like to confirm this assumption, because if I am wrong I could introduce a very intermittent hard-to-track-down bug.
public class ClassA
{
private readonly WeakReference objBWeakRef;
public ClassA(ClassB objB)
{
objBWeakRef = new WeakReference(objB);
}
public void DoSomething()
{
// The null check is required because objB
// may have been cleaned-up by the GC
var objBStrongRef = (ClassB) objBWeakRef.Target;
if (objBStrongRef != null)
{
objBStrongRef.DoSomething();
}
}
}
public class ClassB
{
private readonly ClassC objCStrongRef;
public ClassB(ClassC objC)
{
objCStrongRef = objC;
}
public void DoSomething()
{
// Do I also need to do some kind of checking here?
// Is it possible that objC has been collected, but
// the GC has not yet gotten around to collecting objB?
objCStrongRef.DoSomething();
}
}
public class ClassC
{
public void DoSomething()
{
// do something here... if object is still alive.
}
}

Yes, if your ClassB has a reference to a ClassC instance via objCStrongRef, then if the ClassB is still alive you don't need to worry about the ClassC evaporating at random. The exception to this is if you write a finalizer, i.e.
~ClassB() { ...}
In there, you should not attempt to talk to objCStrongRef at all; because you can have no clue which object gets finalized first. If ClassC needs something at finalization, it should have a separate ~ClassC() - although in reality finalizer methods are really rare and you shouldn't add them without good reason (unmanaged handles, etc)
Re finalization order: no, there is no particular order followed, because full loops are possible - and it needs to be broken somewhere arbitrarily.

In normal managed methods, you can rely on everything that B refers to with strong references to be alive as long as B is alive. The reference to C from B will be good for the lifetime of B.
The exception is in finalization code. If you implement a finalizer (which should be considered an unusual case, not the norm), then all bets are off within the call chain of the finalizer. Finalizers may not execute for a long time after the GC notices that the objects are no longer referenced by live objects - objects with finalizers tend to hang around a lot longer than "normal" objects, which is another reason not to use finalizers. A finalizer may not execute at all, ever, in extreme panic shutdown corner cases. You can't assume anything about what thread the finalizer will execute on, or what order finalizers will execute in. And you should avoid referring to any reference variables in your object because you don't know if they have already been finalized.
This is all academic though. Your code example doesn't implement a finalizer, so you don't need to worry about the bizzarro finalizer world.

From the "CLR via C#" by Jeffrey Richter:
Furthermore, be aware of the fact that you have no control over when
the Finalize method will execute. Finalize methods run when a garbage
collection occurs, which may happen when your application requests
more memory. Also, the CLR doesn’t make any guarantees as to the order
in which Finalize methods are called, so you should avoid writing a
Finalize method that accesses other objects whose type defines a
Finalize method; those other objects could have been finalized
already. However, it is perfectly OK to access value type instances or
reference type objects that do not define a Finalize method. You also
need to be careful when calling static methods because these methods
can internally access objects that have been finalized, causing the
behavior of the static method to become unpredictable.

If there exists no reference path of any sort which can reach an object, there will be no means to examine the memory space that object used to occupy, unless or until such time as the GC has run and made that memory space usable for reuse and some new object is created that uses it; by the time that occurs, the old object won't exist anymore. Regardless of when the GC runs, an object becomes instantly inaccessible the moment the last reference to it is destroyed or becomes inaccessible.
Objects with active finalizers always have a reference path, since a data structure called the "finalization queue" holds a reference to every such object. Objects in this queue are the last things processed by the GC; if an object is found to be referenced by the queue but by nothing else, a reference will be stored in a structure called the "freachable" queue, which lists objects whose Finalize method should be run at first opportunity. Once a GC cycle completes, this list will be considered a strong rooted reference, but the finalizer thread will start pulling things out of it. Typically, once items get pulled from the list there won't be any reference to them anywhere and they'll disappear.
Weak references add another wrinkle, since objects are considered eligible for collection even if weak references to them exist. The simplest way to regard those as working is to figure that once every object requiring retention has been identified, the system will go through and invalidate every WeakReference whose targets do not require retention. Once every such WeakReference instances has been invalidated, the object will be inaccessible.
The only situation where atomic semantics might matter would be when two or more WeakReference instances target the same object. In that scenario, it might theoretically be possible for one thread to access the Target property of a WeakReference at the exact moment that the GC was invalidating another WeakReference which had the same target. I don't think this situation can actually arise; it could be prevented by having multiple WeakReference instances that share the same target also share their GCHandle. In that case, either the access to the target would happen soon enough to keep the object alive, or else the handle would be invalidated, effectively invalidating all WeakReference instances that hold a reference to it.

Related

How can my Main() method be holding references to variables that have been disposed and are no longer used?

I'm working on fixing memory leaks in a large project, and so I've shrunk it down into one Main() method wherein a reference type object Obj1 containing a reference to another reference type object Obj2. Then, I create another object of type Obj1 which contains a reference to the same Obj2 object. Both of the objects are in their own using blocks, like so:
using (dynamic obj1_a = new Obj1(args))
{
do some actions...
using (dynamic obj1_b = new Obj1(args))
{
do some more actions...
//Memory Snapshot 1 taken here
}
}
GC.Collect();
GC.WaitForPendingFinalizer();
GC.Collect();
//Memory Snapshot 2 taken here
Somehow, when I take the 2 snapshots at the points commented above and compare them, .NET Memory Profiler indicates that even though the two objects obj1_a and obj1_b have been disposed, they haven't been GC'ed. When I examine the reference graphs, I see that the memory profiler says that both objects are referenced by my Main() method itself. I've gone through the whole code of the Main() method (it's not very complex, just creating, slightly modifying and then testing for garbage-collection) to see if there is a variable reference remaining to these two objects but there are none. How is it still possible that my Main() method could be holding these objects in memory? It's important that they get garbage collected (or at least are able to get GC'ed) because they contain references to many more reference and value types and the program becomes quite a memory drain without it.

even though the two objects obj1_a and obj1_b have been disposed, they haven't been GC'ed
Your statement presupposes that there is a connection between an object being disposed and it being deallocated by the garbage collector. Make sure you fully understand the following statement: disposing an object has no effect whatsoever on whether it is eligible to be collected. From the GC's perspective, Dispose is just a method. You might as well say "even though I called ToString on an object, it still hasn't been GC'd". What does ToString have to do with the GC? Nothing. What does Dispose have to do with the GC? Nothing whatsoever.
Now, this slightly overstates the case. A finalizable object should implement IDisposable, and its Dispose should call SuppressFinalization. That has an effect on the garbage collector because finalizable objects always live at least one collection longer than they otherwise would because the finalization queue is a root. But here the effect is not due directly to Dispose; it is simply a convention that disposers suppress finalization. It is the suppression which has an effect on the GC behaviour.
How is it still possible that my Main() method could be holding these objects in memory?
An object is collected by the GC when the GC determines that there is no living root containing a direct or indirect reference to the object.
A local variable in an active method is a living root.
The runtime is permitted to, entirely at its discretion and for any reason whatsoever, to both (1) determine that a local is never read again and treat it as dead early, and (2) keep a local alive longer even when control has passed beyond the local variable declaration space of that variable.
Even though your obj1_a and obj1_b are out of scope by the time the GC runs, the runtime is entirely permitted to pretend that they were declared at the top scope of Main, and is permitted to keep them alive until Main completes, which is after the GC runs.
It's important that they get garbage collected (or at least are able to get GC'ed) because they contain references to many more reference and value types and the program becomes quite a memory drain without it.
If you require fine-grained control over lifetimes of objects then languages which have automatic memory deallocation do not meet your requirements.

Singleton class backed via serialization

I'm writing a Singleton class for access to a List. When the Singleton is first initialized this List is populated by de-serializing an xml file.
This all works great but I want this List to be updatable and in order to do so I'm currently having to re-serialize each time I add or remove from it.
I would prefer to simply have one call to serialize the list when the object is removed from memory. However I feel that trying to do this in the ~Destructor can potentially cause lost data etc.
What is the best pattern to follow to achieve this behavior?
Thanks

Destructors in C# are very different from C++ destructors. They are not actual destructors that are run on fully correct objects which are no longer required, but a last resort measure to deal with unmanaged resources (MSDN):
You should override Finalize for a class that uses unmanaged resources
such as file handles or database connections that must be released
when the managed object that uses them is discarded during garbage
collection.
So it is a bad idea to use Finalizer (Destructor) to interoperate with managed resources, because:
The finalizers of two objects are not guaranteed to run in any
specific order, even if one object refers to the other. That is, if
Object A has a reference to Object B and both have finalizers, Object
B might have already been finalized when the finalizer of Object A
starts.
Deserialized data list may be correct during such finalization, or it may have been already reclaimed during previous GC runs. Even if it is not so with current garbage collector, no one can guarantee that such problem won't appear in the next .NET version. Accessing managed resources in Finalizer's is practically an undefined behaviour.
Possible solution
Taking into account that your object is singleton, and thus it can be garbage collected only when application exits (or never if it is never requested),
the simplest way to allow such persistent storage to object's data is to attach event handler to the application close(exit) event.
...
public static Singleton Instance
{
get
{
if (_Instance == null)
{
_Instance = new Singleton();
Application.Exit += (sender, args) =>
{
_Instance.SaveChanges();
}
}
return _Instance;
}
}
Or use appropriate Lazy<T> constructor if your Singleton is based on its use.

Can I access reference type instance fields/properties safely within a finalizer?

I always thought that the answer to this was no, but I cannot find any source stating this.
In my class below, can I access (managed) fields/properties of an instance of C in the finalizer, i.e. in ReleaseUnmanaged()? What restrictions, if any, are there? Will GC or finalization ever have set these members to null?
The only thing I can find is that stuff on the finalizer queue can be finalized in any order. So in that case, since the recommendation is that types should allow users to call Dispose() more than once, why does the recommended pattern bother with a disposing boolean? What bad things might happen if my finalizer called Dispose(true) instead of Dispose(false)?
public class C : IDisposable
{
private void ReleaseUnmanaged() { }
private void ReleaseOtherDisposables() { }
protected virtual void Dispose(bool disposing)
{
ReleaseUnmanaged();
if (disposing)
{
ReleaseOtherDisposables();
}
}
~ C()
{
Dispose(false);
}
public void Dispose()
{
Dispose(true);
GC.SuppressFinalize(this);
}
}

In general case - yes, you can. If a class has nonempty finalizer, first time GC would collect instance of this class, it calls finalizer instead (only if you didn't call GC.SuppressFinalize on it earlier). Object instance seen from finalizer looks just like it did last time you touched it. You can even create new (direct or indirect) link from root to your instance and thus resurrect it.
Even if you hold unmanaged pointer to unpinned object and inspect raw memory contents, you shouldn't be able to see partially-deallocated object, because .NET uses copying GC. If an instance is alive during collection, it is either promoted to next generation or moved to completely new memory block along with other instances. If it is not reachable it is either left where it was or the whole heap is released and returned to OS. Keep in mind, however, that finalizers can and will be called on instances of objects that failed to construct (i.e. when was an exception thrown during object construction).
Edit: As for Dispose(true) vs Dispose(false) in well-written classes there shouldn't be much of a difference in the long run. If your finalizer called Dispose(true), it would only remove links from your object to other objects, but since your object is already nonreachable, releasing other instances referenced by your object won't matter their reachability.
For more details on .NET GC implementation details, I recommend C# 5.0 in a Nutshell by Joseph and Ben Albahari.

The GC will never collect any object if there is any way via which any code could ever get a reference to it. If the only references that exist to an object are weak references, the GC will invalidate them so that there's no way any code could ever get a reference to that object, whereupon it will then be able to collect it.
If an object has an active finalizer, then if the GC would collect it (but for the existence of the finalizer), the GC will instead add it to a queue of objects whose finalizers should be run ASAP and, having done so, de-activate it. The reference within the queue will prevent the GC from collecting the object until the finalizer has run; once the finalizer finishes, if no other references exist to the object and it hasn't re-registered its finalizer, it will cease to exist.
The biggest problems with finalizers accessing outside objects are:
A finalizer will be run in a threading context which is unrelated to any threading context where the object was used, but should not perform any operations which cannot be guaranteed to complete quickly. This often creates contradictory requirement that code use locking when accessing other objects, but that code not block while waiting on locks.
If a finalizer has a reference to another object which also has a finalizer, there's no guarantee as to which will run first.
Both of these factors severely limit the ability of objects with finalizers tosafely access outside objects they don't own. Further, because of the way finalization is implemented, it's possible for the system to decide to run a finalizer when no strong references exist, but for external strong references to be created and used before the finalizer runs; the object with the finalizer has no say in whether that happens.

What happens to 'object' after local scope?

I am eager to know what happens to a object in .NET when it falls out of scope. Something like this:
class A
{
ClassB myObjectB = new ClassB();
}
//what happens to object at this point here?
What happens in the memory? Does GC get called when it falls out of scope and loose it reference in the heap?

What happens in the memory? Does GC get called when it falls out of scope and loose it reference in the heap?
No - the GC doesn't get called at this point.
What happens is that the object is left in memory, but is now "unrooted" - basically, there are no references to that object from any other object. As such, the object is now eligible for garbage collection.
At some point in the future, the GC will run. When this happens, the unrooted object will now be eligible for garbage collection. Depending on which generation holds the object, it may or may not be cleaned at that point. Eventually, if the program continues executing and if memory pressure causes the appropriate generation to be collected, the memory for that object will be reclaimed.
There is no definite point in time when this will happen (unless you explicitly call GC.Collect, which is a bad idea). However, with managed code, you don't really need to worry about when this will happen. Just assume that it will happen when and if appropriate for your specific application.

Following from pst's comment, a better example might be this:
void M()
{
ClassB myObjectB = new ClassB();
}
//what happens to object at this point here?
In this example, myObjectB is a local variable rather than a field. So, what happens when the local variable goes out of scope? Nothing! Scope has nothing to do with object lifetime in C#.
What really happens is that the JIT compiler decides to release the object at some point. That can be before the end of the variable's scope, if the variable is not going to be used in the rest of the method. Once the object is no longer referenced, as other answers have also mentioned, it becomes eligible for collection by the GC. It is not actually collected until the GC runs (actually, until the GC collects the generation in which the object is living).
As pst implied, a field is a poor example because it will always be reachable whenever its containing object is reachable, so the separation between scope and object lifetime is even greater:
class A
{
private object o = //whatever
}
void B()
{
var a = new A();
// here, the field o is not in scope, but the object it refers to is reachable and cannot be collected.
GC.KeepAlive(a);
}

I find "falling out of scope" to be a much more C++-specific way of thinking of things, where at the end of a given scope an object with automatic storage is freed and has its destructor called.
In the C# world, there's no "falling out of scope". Variables (read: names) exist in a certain scope, and that's it. The GC really has no concern for this; an object can be collected before the end of its name's scope even exits, or long afterwards depending on what references it and when the GC decides a collection is necessary.
The two concepts should then be divorced and reasoned about separately. Scoping is all about names, whereas garbage collection cares only about the reachability of objects. When an object is no longer reachable from one of the known roots, it will be scheduled for collection.

If there's nothing referencing the object, it gets eventually collected by GC. The thing is, you can't predict WHEN exactly it will happen. You just know, it will happen.

Generally speaking, Garbage Collection happens at 3 distinct generations (0, 1, or 2). When each of these is collected depends on how much resources are needed by the OS.
A call to GC.Collect() will collect all of the available resources but it is possible to define which generation of resources to collect.

What is the correct way to free memory in C#

I have a timer in C# which executes some code inside it's method. Inside the code I'm using several temporary objects.
If I have something like Foo o = new Foo(); inside the method, does that mean that each time the timer ticks, I'm creating a new object and a new reference to that object?
If I have string foo = null and then I just put something temporal in foo, is it the same as above?
Does the garbage collector ever delete the object and the reference or objects are continually created and stay in memory?
If I just declare Foo o; and not point it to any instance, isn't that disposed when the method ends?
If I want to ensure that everything is deleted, what is the best way of doing it:
with the using statement inside the method
by calling dispose method at the end
by putting Foo o; outside the timer's method and just make the assignment o = new Foo() inside, so then the pointer to the object is deleted after the method ends, the garbage collector will delete the object.

1.If I have something like Foo o = new Foo(); inside the method, does that
mean that each time the timer ticks,
I'm creating a new object and a new
reference to that object?
Yes.
2.If I have string foo = null and then I just put something temporal in foo,
is it the same as above?
If you are asking if the behavior is the same then yes.
3.Does the garbage collector ever delete the object and the reference or
objects are continually created and
stay in memory?
The memory used by those objects is most certainly collected after the references are deemed to be unused.
4.If I just declare Foo o; and not point it to any instance, isn't that
disposed when the method ends?
No, since no object was created then there is no object to collect (dispose is not the right word).
5.If I want to ensure that everything is deleted, what is the best way of
doing it
If the object's class implements IDisposable then you certainly want to greedily call Dispose as soon as possible. The using keyword makes this easier because it calls Dispose automatically in an exception-safe way.
Other than that there really is nothing else you need to do except to stop using the object. If the reference is a local variable then when it goes out of scope it will be eligible for collection.1 If it is a class level variable then you may need to assign null to it to make it eligible before the containing class is eligible.
1This is technically incorrect (or at least a little misleading). An object can be eligible for collection long before it goes out of scope. The CLR is optimized to collect memory when it detects that a reference is no longer used. In extreme cases the CLR can collect an object even while one of its methods is still executing!
Update:
Here is an example that demonstrates that the GC will collect objects even though they may still be in-scope. You have to compile a Release build and run this outside of the debugger.
static void Main(string[] args)
{
Console.WriteLine("Before allocation");
var bo = new BigObject();
Console.WriteLine("After allocation");
bo.SomeMethod();
Console.ReadLine();
// The object is technically in-scope here which means it must still be rooted.
}
private class BigObject
{
private byte[] LotsOfMemory = new byte[Int32.MaxValue / 4];
public BigObject()
{
Console.WriteLine("BigObject()");
}
~BigObject()
{
Console.WriteLine("~BigObject()");
}
public void SomeMethod()
{
Console.WriteLine("Begin SomeMethod");
GC.Collect();
GC.WaitForPendingFinalizers();
Console.WriteLine("End SomeMethod");
}
}
On my machine the finalizer is run while SomeMethod is still executing!

The .NET garbage collector takes care of all this for you.
It is able to determine when objects are no longer referenced and will (eventually) free the memory that had been allocated to them.

Objects are eligable for garbage collection once they go out of scope become unreachable (thanks ben!). The memory won't be freed unless the garbage collector believes you are running out of memory.
For managed resources, the garbage collector will know when this is, and you don't need to do anything.
For unmanaged resources (such as connections to databases or opened files) the garbage collector has no way of knowing how much memory they are consuming, and that is why you need to free them manually (using dispose, or much better still the using block)
If objects are not being freed, either you have plenty of memory left and there is no need, or you are maintaining a reference to them in your application, and therefore the garbage collector will not free them (in case you actually use this reference you maintained)

Let's answer your questions one by one.
Yes, you make a new object whenever this statement is executed, however, it goes "out of scope" when you exit the method and it is eligible for garbage collection.
Well this would be the same as #1, except that you've used a string type. A string type is immutable and you get a new object every time you make an assignment.
Yes the garbage collector collects the out of scope objects, unless you assign the object to a variable with a large scope such as class variable.
Yes.
The using statement only applies to objects that implement the IDisposable interface. If that is the case, by all means using is best for objects within a method's scope. Don't put Foo o at a larger scope unless you have a good reason to do so. It is best to limit the scope of any variable to the smallest scope that makes sense.

Here's a quick overview:
Once references are gone, your object will likely be garbage collected.
You can only count on statistical collection that keeps your heap size normal provided all references to garbage are really gone. In other words, there is no guarantee a specific object will ever be garbage collected.
It follows that your finalizer will also never be guaranteed to be called. Avoid finalizers.
Two common sources of leaks:
Event handlers and delegates are references. If you subscribe to an event of an object, you are referencing to it. If you have a delegate to an object's method, you are referencing it.
Unmanaged resources, by definition, are not automatically collected. This is what the IDisposable pattern is for.
Finally, if you want a reference that does not prevent the object from getting collected, look into WeakReference.
One last thing: If you declare Foo foo; without assigning it you don't have to worry - nothing is leaked. If Foo is a reference type, nothing was created. If Foo is a value type, it is allocated on the stack and thus will automatically be cleaned up.

Yes
What do you mean by the same? It will be re-executed every time the method is run.
Yes, the .Net garbage collector uses an algorithm that starts with any global/in-scope variables, traverses them while following any reference it finds recursively, and deletes any object in memory deemed to be unreachable. see here for more detail on Garbage Collection
Yes, the memory from all variables declared in a method is released when the method exits as they are all unreachable. In addition, any variables that are declared but never used will be optimized out by the compiler, so in reality your Foo variable will never ever take up memory.
the using statement simply calls dispose on an IDisposable object when it exits, so this is equivalent to your second bullet point. Both will indicate that you are done with the object and tell the GC that you are ready to let go of it. Overwriting the only reference to the object will have a similar effect.

The garbage collector will come around and clean up anything that no longer has references to it. Unless you have unmanaged resources inside Foo, calling Dispose or using a using statement on it won't really help you much.
I'm fairly sure this applies, since it was still in C#. But, I took a game design course using XNA and we spent some time talking about the garbage collector for C#. Garbage collecting is expensive, since you have to check if you have any references to the object you want to collect. So, the GC tries to put this off as long as possible. So, as long as you weren't running out of physical memory when your program went to 700MB, it might just be the GC being lazy and not worrying about it yet.
But, if you just use Foo o outside the loop and create a o = new Foo() each time around, it should all work out fine.

As Brian points out the GC can collect anything that is unreachable including objects that are still in scope and even while instance methods of those objects are still executing. consider the following code:
class foo
{
static int liveFooInstances;
public foo()
{
Interlocked.Increment(ref foo.liveFooInstances);
}
public void TestMethod()
{
Console.WriteLine("entering method");
while (Interlocked.CompareExchange(ref foo.liveFooInstances, 1, 1) == 1)
{
Console.WriteLine("running GC.Collect");
GC.Collect();
GC.WaitForPendingFinalizers();
}
Console.WriteLine("exiting method");
}
~foo()
{
Console.WriteLine("in ~foo");
Interlocked.Decrement(ref foo.liveFooInstances);
}
}
class Program
{
static void Main(string[] args)
{
foo aFoo = new foo();
aFoo.TestMethod();
//Console.WriteLine(aFoo.ToString()); // if this line is uncommented TestMethod will never return
}
}
if run with a debug build, with the debugger attached, or with the specified line uncommented TestMethod will never return. But running without a debugger attached TestMethod will return.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.