C# Garbage collection


Say we have:
public void foo()
{
    someRefType test = new someRefType();
    test = new someRefType();
}
What does the garbage collector do with the first heap object? Is it immediately garbage collected before the new assignment? What is the general mechanism?

What does the garbage collector do with the first heap object?
Who knows? It's not deterministic. Think of it like this: on a system with infinite memory, the garbage collector doesn't have to do anything. You might think that's a bad example, but that's exactly what the garbage collector is simulating for you: a system with infinite memory. On a system with sufficiently more memory available than your program requires, the garbage collector never has to run. Consequently, your program cannot make any assumptions about when (if ever) memory will be collected.
So, the answer to your question is: we don't know.
Is it immediately garbage collected before the new assignment?
No. The garbage collector is not deterministic. You have no idea when it will collect and release garbage. You cannot make any assumptions about when garbage will be collected or when finalizers will run.
In fact, it's very unlikely it's collected so quickly (that would make collections happen too frequently). Additionally, on a system with sufficient memory, the garbage collector never has to run.
What is the general mechanism?
That's a fairly broad question. But the underlying principle is very simple: a garbage collector simulates a machine with infinite memory. To do this, it somehow keeps track of memory and is able to determine when memory is garbage. When it sees fit, due to its need to simulate infinite memory, it will from time to time collect this garbage and make it available for allocation again.
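As a rough illustration of the points above, here is a minimal sketch that watches the first object through a WeakReference, which tracks an object without keeping it alive. The names SomeRefType, CollectionTimingDemo and AllocateAndDrop are made up for this example. The allocation is placed in a separate, non-inlined method on the assumption that this stops a debug build from keeping the object alive through a local variable. Results can vary by runtime and build configuration, so no particular output is promised.

```csharp
using System;
using System.Runtime.CompilerServices;

class SomeRefType { }

static class CollectionTimingDemo
{
    // Hypothetical helper: mirrors foo() from the question, but also returns
    // a weak reference that lets us ask later whether the first object
    // is still in memory.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference AllocateAndDrop()
    {
        SomeRefType test = new SomeRefType();
        WeakReference tracker = new WeakReference(test); // does not keep the object alive
        test = new SomeRefType(); // the first object is now unreachable, but nothing collects it yet
        return tracker;
    }

    static void Main()
    {
        WeakReference first = AllocateAndDrop();

        // Usually still True: becoming unreachable does not trigger a collection.
        Console.WriteLine("Alive right after reassignment: " + first.IsAlive);

        // Forced here only for the experiment; not recommended in production code.
        GC.Collect();

        // Usually False, but the runtime makes no hard promise.
        Console.WriteLine("Alive after a forced collection: " + first.IsAlive);
    }
}
```

The weak reference only lets you observe the outcome; it does not change when the collector decides to run.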

No, there is nothing that says that the object is immediately collected. In fact, it is quite unlikely that it is. It will be collected eventually by the garbage collector, but you can't know exactly when.
You can force a collection by calling GC.Collect, although this is normally not recommended.
Exactly how the garbage collection works is a fairly large subject, but there is great documentation you can read on MSDN.

There are numerous different strategies for garbage collection, and they have gotten more sophisticated and more efficient over the years. There are lots of excellent resources in the literature and on the web that talk about them. But I also find that sometimes an imperfect and colorful metaphor gives me an intuition that helps me get started. So allow me to try:
.NET has a so-called "generational" garbage collector, and I think of it as behaving a lot like I do myself. I let dirty clothes and mail ("C# objects") pile up all over my living room floor ("memory") over a period of several days, and when I find that I can't see the carpet any more ("memory full") I spend some time cleaning up ("garbage collecting") the living room ("generation 0"), throwing away the objects that aren't needed any more ("no longer reachable") and moving the remaining ones to my bedroom ("generation 1"). Quite often this buys me some time and I don't need to do any more work. But when my bedroom fills up I do something similar, throwing away some objects and moving the others to my basement ("generation 2"). Occasionally even the basement fills up, and then I have a real problem and need to do some major spring cleaning ("full collection").
Applying this metaphor to your example, we might guess that the first piece of trash ("heap object") just sits around until I get around to picking it up ("run the generation 0 collector") which happens when I feel like it, when the floor gets completely covered, or maybe never :-)
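To make the "rooms" a little more concrete, the sketch below asks the runtime which generation a surviving object is in before and after forced collections, using GC.GetGeneration and GC.MaxGeneration. The class and method names (GenerationDemo, ObserveGenerations) are invented for this example. On the Microsoft runtimes an object that survives typically moves from generation 0 to 1 to 2, but that promotion is an implementation detail, so treat the printed numbers as an observation rather than a guarantee.

```csharp
using System;

static class GenerationDemo
{
    // Returns the generation reported for one long-lived object:
    // right after allocation, after one collection, and after two.
    public static int[] ObserveGenerations()
    {
        object survivor = new object();
        int[] seen = new int[3];

        seen[0] = GC.GetGeneration(survivor);
        GC.Collect(); // forced only for the experiment
        seen[1] = GC.GetGeneration(survivor);
        GC.Collect();
        seen[2] = GC.GetGeneration(survivor);

        GC.KeepAlive(survivor); // make sure the object is reachable up to this point
        return seen;
    }

    static void Main()
    {
        Console.WriteLine("Highest generation: " + GC.MaxGeneration);
        int[] seen = ObserveGenerations();
        Console.WriteLine("After allocation:      gen " + seen[0]);
        Console.WriteLine("After one collection:  gen " + seen[1]);
        Console.WriteLine("After two collections: gen " + seen[2]);
    }
}
```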

To see when objects are being deleted, you can add a finalizer (written with destructor syntax in C#) to your class that prints when and which object is being finalized, as in the sample below:
using System;

class MyClass
{
    private int _id;

    public MyClass(int id)
    {
        _id = id;
    }

    // Finalizer: runs at some point after the object becomes unreachable.
    ~MyClass()
    {
        Console.WriteLine("Object " + _id + " deleted at " + DateTime.Now + " .");
    }
}

class Program
{
    static void Main(string[] args)
    {
        MyClass p1 = new MyClass(1);
        p1 = new MyClass(2); // object 1 is now unreachable
        Console.ReadKey();
    }
}
To make the garbage collector free these objects sooner, you could add a large array field to them, something like private int[] memory; and in the constructor: memory = new int[10000000].
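For experiments only, another way to see finalizers run is to force a collection and then wait for the finalizer thread, instead of adding memory pressure. The sketch below counts finalized objects. Tracked, FinalizerDemo and AllocateGarbage are hypothetical names, and the exact count printed can differ between runtimes and build configurations.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

class Tracked
{
    public static int FinalizedCount;

    // Finalizers run on a separate thread, so the counter is updated atomically.
    ~Tracked()
    {
        Interlocked.Increment(ref FinalizedCount);
    }
}

static class FinalizerDemo
{
    // Allocates objects that become garbage immediately.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void AllocateGarbage(int count)
    {
        for (int i = 0; i < count; i++)
        {
            new Tracked();
        }
    }

    static void Main()
    {
        AllocateGarbage(1000);
        Console.WriteLine("Finalized before collecting: " + Tracked.FinalizedCount);

        GC.Collect();                  // find the unreachable objects
        GC.WaitForPendingFinalizers(); // block until their finalizers have run

        Console.WriteLine("Finalized after collecting: " + Tracked.FinalizedCount);
    }
}
```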

Related

Calling GC.Collect inside loop and thread and Clearing collection doesn't decrease memory usage

I have many objects, and each object has many members. I need to insert collection data into each member and clear it again after use. After clearing, I hoped GC.Collect() would reclaim the memory immediately, but memory usage does not seem to decrease. In Task Manager it keeps increasing; only after all the processing tasks complete do I see memory usage go down.
Memory usage reaches up to 10 GB, almost 100% on my PC, and only goes down after all the processing is done. I'm afraid that if a client PC does not have enough memory, this will cause an OutOfMemoryException.
I also notice that clearing all the collection data doesn't seem to reduce memory usage.
How should I reclaim the memory?
And is it OK to call GC.Collect inside a loop inside a thread?
An illustration looks like this:
public class Progress
{
    Obj2 obj2 = new Obj2();

    public void Processing()
    {
        //here my thread start.. (I have some thread class)
        AsyncClass.DoTask(() =>
        {
            foreach (var curProcess in AllObjects)
            {
                var allSolutions = (from m in curProcess.memories
                                    where....
                                    select m).ToList();
                foreach (var memory in allSolutions)
                {
                    foreach (var data in alldata)
                    {
                        //some process
                        ...
                        ...
                        result = data.result;
                        //pushing the data into memory members obj
                        obj2.PushCalculation(memory, result);
                    }
                }
                foreach (var memory in allSolutions)
                {
                    obj2.ClearValues(memory);
                }
                GC.Collect();
                GC.WaitForPendingFinalizers();
            }
        });
    }
}

public class Obj2 : IDisposable
{
    public void PushCalculation(memoryObj mem, List<data> results)
    {
        foreach (var result in results)
        {
            mem.data1.Add(result);
            mem.data2.AddRange(result * 1);
            //etc... all about pushing into memory object members
        }
    }

    public void ClearValues(memoryObj mem)
    {
        // clear all collections of memoryObj members
        mem.data1.Clear();
        mem.data1 = null;
        mem.data2.Clear();
        mem.data2 = null;
        ......
        ......
    }
}

First of all: you should let the garbage collector (GC) do its job. It can probably judge when to collect garbage a lot better than you, based on the system, current load, application profile and other factors.
Second: the garbage collector allocates some memory space to operate in. This space may not shrink even though the collector correctly collected most of it. It is then empty, but available for future allocations; or it may shrink at a later point in time.
But if the garbage collector is actually unable to collect some of your objects, then you still have references to those objects somewhere. Perhaps in a static field, or in a collection you forgot about?
What you should take from this is: never call GC.Collect(). There are only very few good reasons to call GC.Collect(). If you have some memory issue that prompted you to use GC.Collect() then you should instead be investigating what's causing the memory issue.
Possible causes that I can think of off the top of my head:
Keeping a reference to an object that you no longer need.
Using an unmanaged resource and afterwards not disposing it properly.

Garbage collectors are complex beasts and the simplicity of the interface to them is deceiving. Just call GC.Collect(), right?
The .NET garbage collector is a generational GC. When it runs, it looks for new objects to get rid of first. This comes from the idea that the overwhelming majority of objects are short-lived. Objects that survived are placed into a different memory pool and this pool isn't scanned as often. If I remember correctly, there are three such pools. This means that when the GC does a run, it does not scan the whole heap.
Also note that those are pools. This means that when the virtual machine wants to allocate memory, it looks in the pool for free memory before trying to expand the heap. This is only possible if the memory of collected objects is not immediately reclaimed. In other words, when the GC collects an object, it doesn't necessarily return the memory to the operating system.
Which leads us to GC.Collect(). We know that when the GC runs, it doesn't scan the whole heap, and when it destroys objects, it doesn't return memory. Then, what's the point of calling GC.Collect()? I can't answer that one. As far as I'm concerned, calling GC.Collect() isn't useful.
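The difference between "collected" and "returned to the operating system" can be explored with a small sketch that prints two numbers side by side: the managed heap size reported by GC.GetTotalMemory and the process working set reported by Process.WorkingSet64. HeapVersusProcessMemory and Report are names made up for this example, and the 200 MB figure is arbitrary. The numbers depend heavily on the runtime, GC mode and operating system, so this is a way to look at your own machine, not a statement of what it will show.

```csharp
using System;
using System.Diagnostics;

static class HeapVersusProcessMemory
{
    // Prints the managed heap size next to the memory the OS attributes to the process.
    public static void Report(string label)
    {
        long managedBytes = GC.GetTotalMemory(false); // false: do not force a collection first
        long workingSetBytes;
        using (Process self = Process.GetCurrentProcess())
        {
            workingSetBytes = self.WorkingSet64;
        }
        Console.WriteLine(label + ": managed heap ~" + (managedBytes / (1024 * 1024)) +
                          " MB, working set ~" + (workingSetBytes / (1024 * 1024)) + " MB");
    }

    static void Main()
    {
        Report("start");

        // Allocate roughly 200 MB in 1 MB blocks.
        byte[][] blocks = new byte[200][];
        for (int i = 0; i < blocks.Length; i++)
        {
            blocks[i] = new byte[1024 * 1024];
        }
        Report("after allocating");

        blocks = null; // drop the only reference
        GC.Collect();  // forced only for the experiment
        Report("after collecting");
    }
}
```

If the managed heap number drops while the working set stays high, that matches the description above: the memory is free for the application to reuse, but has not (yet) been handed back to the OS.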

Your code is probably not real but I will try to answer anyway.
When you call PushCalculation, you add pointers to objects of data class (assuming data is not a struct) to the mem.data1, mem.data2 collections. So after that operation you have two references to each data object. One goes from mem.data and the other is from data.results.
Then in ClearValues you clear one of the references from each data object but results still pins them all. You need to clear data.results too if you want the garbage collector to free the memory.
In other words you have a memory leak. See this answer on how to investigate memory leaks.
How to debug the potential memory leak?
I personally used WinDbg and found it extremely useful. It shows what pins down an object in memory
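The "two references" situation described in this answer can be reproduced in miniature. In the sketch below, two static lists stand in for the question's collections (the names results and memoryData are invented for the example, as are the class and its methods), and a WeakReference is used to check whether the shared object is still in memory. Clearing only one list cannot free the object, because the other list still holds it.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

static class SecondReferenceDemo
{
    // Hypothetical stand-ins: 'results' plays the role of the source data,
    // 'memoryData' plays the role of mem.data1 from the question.
    static readonly List<object> results = new List<object>();
    static readonly List<object> memoryData = new List<object>();

    // Adds the same object to both lists and returns a weak reference to it.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference PushOne()
    {
        object item = new object();
        results.Add(item);
        memoryData.Add(item);
        return new WeakReference(item);
    }

    public static void ClearMemoryData() { memoryData.Clear(); }
    public static void ClearResults() { results.Clear(); }

    public static bool IsAliveAfterCollect(WeakReference tracker)
    {
        GC.Collect(); // forced only for the experiment
        GC.WaitForPendingFinalizers();
        GC.Collect();
        return tracker.IsAlive;
    }

    static void Main()
    {
        WeakReference tracker = PushOne();

        ClearMemoryData();
        // Still alive: 'results' holds a reference, so the object is reachable.
        Console.WriteLine("After clearing one list: alive = " + IsAliveAfterCollect(tracker));

        ClearResults();
        // Usually no longer alive, since nothing references the object any more.
        Console.WriteLine("After clearing both lists: alive = " + IsAliveAfterCollect(tracker));
    }
}
```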

Object instantiation within destructor of same class and garbage collection

Recently, while taking an introductory unit focused on object-oriented programming, I was introduced to the garbage collector in C#, and learned that its role is to "clean up" objects that are no longer being referenced. Then I was introduced to destructors, and how they're called just before the object is deleted.
Naturally, I got thinking, but I never remembered to ask the lecturer about it; what will happen if you create an instance of a class within the destructor of the same class?
C# example
class Person{
~Person(){
Person p = new Person();
Console.WriteLine("Person destroyed");
}
}
class Program{
static void Main(string[] args){
Person p = new Person();
}
}
I would like to approach this from a more theoretical point of view, so I'm reluctant (at this stage) to try it since I probably wouldn't understand anyway, but I have a few theories. Besides, I'm not at my regular computer right now ;)
Person.~Person() is going to recurse, as each time the new Person is created, it's going to call its destructor and create a new Person ad infinitum, or until some kind of memory-related exception occurs. Subsequently, main will never terminate.
The compiler will complain (adding this option to every scenario seems like a good idea anyway).
Somehow, some kind of "destructor skipping" will occur. ie. object destruction wouldn't be called sequentially, so neither would the constructor.
Now for a similarly related question. If the Garbage Collector's role is to delete the objects that are no longer referenced/needed, how would a situation like the one above be handled in an environment without a Garbage Collector - say, C++?

There's no real mystery here I think.
It won't 'recurse' as such - you're just chucking a new object on the managed heap which is immediately dereferenced; thus making it a candidate for garbage collection.
Eventually the garbage collector will come round again, triggering the operation again etc.
That's not recursion - more like a chain. But ultimately each Person will be removed from memory.
And, after a while the Garbage collector will send you an email complaining that you're not playing fair.
As for C++, well my guess is a stack overflow, since construction/destruction is happening there and then, and a very sulky computer afterwards.
If your next logical thought is 'shouldn't the runtime/language stop this from happening?' - no. The language or runtimes in question are not there to stop you doing something that would otherwise be considered ill-advised; it trusts you, the programmer, to make sure you're not doing that.
That said - in an application shutdown scenario (re your comment below) the .Net runtime is going to act out of self-interest and will ultimately stop processing these finalizers to enact a shutdown. Finalizers are for your benefit, not the runtime's.

A more interesting point to make is that an object can actually resurrect itself in the finalizer!
class Foo
{
static public List<Foo> ZombieFoos = new List<Foo>();
~Foo()
{
ZombieFoos.Add(this);
// Now there is a reference to this instance again (in the list)..
// The GC will not reclaim this instance.. huzzah we have been resurrected!!
}
}
Not even remotely recommended...

You could, in fact, make the system spend 99.9% of its time in GC. Just have every Person object allocate enough memory for the GC to trigger a generation 0 collection.

Why does this memory not get cleaned up, or get allocated at all?

So, I've got this awesome program that is very useful:
static void Main(string[] args)
{
new Dictionary<int,int>(10000000);
while (true)
{
System.Threading.Thread.Sleep(1000);
}
}
This doesn't even produce any warnings from the compiler, which is surprising.
Running this allocates a chunk of memory. If I run several copies, I'll eventually get to a point where I can't start any more because I've run out of memory.
Why doesn't the garbage collector ever clean up the memory, instead letting the system get into a state where there is not enough memory for new processes?
Heck, why isn't the memory allocation optimized out? It can never be referenced by anything ever!
So what's going on here?

The garbage collector is non-deterministic, and responds to memory pressure. If nothing requires the memory, it might not collect for a while. It can't optimize away the new, as that changes your code: the constructor could have side-effects. Also, in debug it is even more likely to decide not to collect.
In a release/optimized build, I would expect this to collect at some point when there is a good reason to. There is also GC.Collect, but that should generally be avoided except for extreme scenarios or certain profiling demands.
As a "why" - there is a difference in the GC behaviour between GC "generations"; and you have some big arrays on the "large object heap" (LOH). This LOH is pretty expensive to keep checking, which may explain further why it is so reluctant.
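One way to see that large allocations are treated differently is to ask which generation a freshly allocated array reports. In the sketch below (LargeObjectHeapDemo and GenerationOfNewArray are made-up names), the small array is well under the roughly 85,000-byte large-object threshold used by the Microsoft runtimes and the large one is well over it. On those runtimes a large-object-heap allocation typically reports the highest generation straight away, but this is an implementation detail and other runtimes may behave differently.

```csharp
using System;

static class LargeObjectHeapDemo
{
    // Allocates a byte array of the given size and returns the generation
    // the runtime reports for it immediately after allocation.
    public static int GenerationOfNewArray(int sizeInBytes)
    {
        byte[] buffer = new byte[sizeInBytes];
        int generation = GC.GetGeneration(buffer);
        GC.KeepAlive(buffer); // keep the array reachable until the query is done
        return generation;
    }

    static void Main()
    {
        Console.WriteLine("Highest generation: " + GC.MaxGeneration);
        Console.WriteLine("1,000-byte array:     gen " + GenerationOfNewArray(1000));
        Console.WriteLine("1,000,000-byte array: gen " + GenerationOfNewArray(1000000));
    }
}
```

A Dictionary<int,int> created with a capacity of 10000000 allocates its internal arrays in this size class, which is why the answer above brings up the LOH.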

My guess is that a hidden Gen0 collection is being done.
Here is my test program:
static void Main(string[] args)
{
new Dictionary<int, int>(10000000);
Thread.Sleep(5000);
int x = 1; // or 0;
int i = 0;
while (true)
{
object o = ++i;
Thread.Sleep(x);
}
}
When the system executes the Sleep(1), the system must think that this is a good time for a quick, hidden GC on Gen0 only. So the 'object o = ++i' statement never places pressure on Gen0, and never triggers a GC collection and hence never releases the Dictionary.
Sleep(1) http://www.freeimagehosting.net/uploads/6fad1952e0.png
Change x to 0. Now, this hidden GC does not occur, and things work as expected, with the 'object o = ++i' statement causing the Dictionary to be collected.
Sleep(0) http://www.freeimagehosting.net/uploads/f285b8acdb.png

The GC probably runs and frees the memory... for the application itself. That is, if the Sleep() calls need to allocate some RAM, they will probably find plenty of it, namely the big blocks which were initially allocated for the huge Dictionary.
This does not mean that the GC gave the memory back to the operating system. From the OS point of view, the big blocks may still be part of the process, not usable by any other process.
The allocation is not optimized out because it is some external code. Your Main class calls out to a constructor for Dictionary<int,int> which could do anything possibly with various side effects. As a human programmer you expect that constructor not to have externally visible side effects, but the compiler and the VM do not know that for sure. So the code cannot dispense with really creating a Dictionary<int,int> instance and calling its constructor. Similarly, the Dictionary<int,int> constructor does not know that it is called for an object which will soon become unreachable, hence it cannot optimize itself out.

I don't know this for a fact but I'd guess it's because even without you creating a reference to your new Dictionary, it has been linked to the local scope at that point, which your program never leaves. To check if this is the case, just create the Dictionary in an inner scope which you can leave before starting your loop, e.g.
static void Main(string[] args)
{
{
new Dictionary<int, int>(10000000);
}
while (true)
{
System.Threading.Thread.Sleep(1000);
}
}
This should now leave the memory available for garbage collection.

Delegate variables not garbage collected

Recently discovered that the variables inside ToGadget, and presumably the delegate as well, weren't getting garbage collected. Can anyone see why .NET holds a reference to this? Seems that the delegate and all would be marked for garbage collection after Foo ends. Literally saw Billions in memory after dumping the heap.
Note: 'result.Things' is a List<Gadget> () and Converter is a System delegate.
public Blah Foo()
{
var result = new Blah();
result.Things = this.Things.ConvertAll(new Converter<Widget, Gadget>(ToGadget));
return result;
}
.................
public static Gadget ToGadget(Widget w)
{
return new Gadget(w);
}
Update: changing the 'ConvertAll' to this cleans up the delegates and corresponding object references. This suggests to me that either List<> ConvertAll is somehow holding on to the delegate or I don't understand how these things are garbage collected.
foreach (var t in this.Things)
{
result.Things.Add(ToGadget(t));
}

Use a memory profiler.
You can ask on StackOverflow all day and get a bunch of educated guesses, or you can slap a memory profiler on your application and immediately see what is rooted and what is garbage. There are tools available that are built specifically to solve your exact problem quickly and easily. Use them!

There is one major flaw in your question, which may be the cause of confusion:
Seems that the delegate and all would be marked for garbage collection after Foo ends.
The CLR doesn't "mark items" for collection at the end of a routine. Rather, once that routine ends, there is no longer an (active) reference to any of the items referenced in your delegate. At that point, they are what is referred to as "unrooted".
Later, when the CLR determines that there is a certain amount of memory pressure, the garbage collector will execute. It will search through and find all unrooted elements, and potentially collect them.
The important distinction here is that the timing is not something that can be predicted. The objects may never be collected until your program ends, or they may get collected right away. It's up to the system to determine when it will collect. This doesn't happen when Foo ends - but rather at some unknown amount of time after Foo ends.
Edit:
This is actually directly addressing your question, btw. You can see if this is the issue by forcing a garbage collection. Just add, after your call to Foo, a call to:
GC.Collect();
GC.WaitForPendingFinalizers();
Then do your checking of the CLR's heap. At this point, if you're still getting objects in the heap, it's because the objects are still being rooted by something. Your simplified example doesn't show this happening, but as this is a very simplified example, it's difficult to determine where this would happen. (Note: I don't recommend keeping this in your code, if this is the case. Calling GC.Collect() manually is almost always a bad idea...)

It looks like your function is set up to return the new Blah(). Is it actually being returned in your code? I see in the piece you posted that it is not. If that is the case, then the new Blah() would have a scope outside of Foo and it may be the calling function that is actually holding the references in scope. Also, you're creating new Gadget() as well. Depending on how many Blahs to Gadgets you have, you could be exponentially filling your memory as the Gadgets will be scoped with the Blahs which are then held in scope beyond Foo.
Whether I'm right or wrong, this possibility was kinda funny to type.

C# How can I destroy a temporary string array before it gets garbage collected?

I have a string that contains comma-separated email addresses. I load this into a string array, and from there populate a list, which is easier to work with. Once the list is populated, I would like to be able to destroy the now unused string array, because the class still has a lot of work to do before the garbage collector will clean up this waste of memory.
How can I manually destroy this string array...
While reviewing the code, if you have a cleaner more efficient way of populating the list, recommendations are welcome.
Here is code:
public class EmailReportManager
{
private List<string> emailAddresses;
public EmailReportManager(string emailAddressesCommaSeperatedList)
{
loadAddresses(emailAddressesCommaSeperatedList);
}
private void loadAddresses(string emailAddressesCommaSeperatedList)
{
string[] addresses = emailAddressesCommaSeperatedList.Split(',');
for (int addressCount = 0; addressCount < addresses.Length; addressCount++)
{
this.emailAddresses.Add(addresses[addressCount]);
}
//Want to destroy addresses here.....
}
}

You can't "destroy" the array. Options are:
You can clear the array using Array.Clear, so that it won't hold references to any strings any more. This won't reclaim any memory.
You can set the variable to null, so that that particular variable doesn't prevent the array from being garbage collected. This won't reclaim any memory.
You could call GC.Collect after making sure you don't have any references to the array any more. This will probably reclaim the memory (it's not guaranteed) but it's generally not a good idea. In particular, forcing a full GC regularly can significantly harm performance.
It's worth understanding that in your case you don't need to set the array variable to null - it's about to go out of scope anyway. Even if it wasn't about to go out of scope, if it wasn't going to be used in the rest of the method (and the JIT could tell that) then setting it to null would be pointless. The GC can pretty reliably tell when a variable is no longer relevant. It's rarely a good idea to set a variable to null for the sake of GC - if you find you want to, that's usually an indication that you could refactor your code to be more modular anyway.
It's usually a good idea to just trust the GC. Why do you think the GC isn't going to get round to reclaiming your array, and do you have a really good reason to care?

addresses = null;
Then if you really want to force GC, call this (otherwise, just set it to null and let the GC do its job on its own time):
GC.Collect();
Also see this question: Force garbage collection of arrays, C#

I think you can make the code easier to read by using the AddRange method of the List<> class.
public class EmailReportManager
{
private List<string> emailAddresses = new List<string>();
public EmailReportManager(string emailAddressesCommaSeperatedList)
{
this.emailAddresses.AddRange(emailAddressesCommaSeperatedList.Split(','));
}
}

You don't need to do anything at all to make the array a candidate for garbage collection, and you shouldn't do anything, as everything that you try will only be a waste of time. The garbage collector will collect the array at a convenient time, and in almost every case the garbage collector has a lot more information about the current state of memory usage than what you can possibly anticipate.
From the moment that the array is not used any more (i.e. when you exit the loop where you copy from it), the garbage collector knows that it can be removed. There is no need to clear the reference to the array, as the garbage collector already knows that it's no longer used, so the only thing that accomplishes is a waste of a few processor cycles. Calling Clear on the array to remove the references from it is just a waste of time, and will actually keep the array in memory longer, as the garbage collector can't remove it until you have cleared it.
Besides, all the strings from the array are now in the list, so the only thing that can be garbage collected is the array of references. All the strings that the references point to are still in use, so collecting the array doesn't free a lot of memory.
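The claim that the list and the array share the same string objects is easy to check with object.ReferenceEquals. The sketch below (SharedStringsDemo and SharesStrings are hypothetical names, and the addresses are placeholders) copies a split array into a list and confirms that each list element is the very same object as the corresponding array element, so only the small array of references is ever a candidate for collection.

```csharp
using System;
using System.Collections.Generic;

static class SharedStringsDemo
{
    // Returns true if every element of the list is the same object
    // (not merely an equal string) as the matching array element.
    public static bool SharesStrings(string commaSeparated)
    {
        string[] addresses = commaSeparated.Split(',');
        List<string> list = new List<string>(addresses); // copies references, not string contents

        for (int i = 0; i < addresses.Length; i++)
        {
            if (!object.ReferenceEquals(addresses[i], list[i]))
            {
                return false;
            }
        }
        return true;
    }

    static void Main()
    {
        Console.WriteLine(SharesStrings("a@example.com,b@example.com")); // prints True
    }
}
```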

private void loadAddresses(string emailAddressesCommaSeperatedList) {
emailAddresses = new List<String>();
foreach (string s in emailAddressesCommaSeperatedList.Split(','))
{
emailAddresses.Add(s.Trim());
}
}
The list needs to be instantiated before calling Add on it =)

Even on Windows XP, you have at least 2 GB of memory at your disposal. Are you really that close to using 2 GB that you need to worry about releasing the memory a few seconds earlier?
