Ok so I understand about the stack and the heap (values live on the Stack, references on the Heap).
When I declare a new instance of a Class, this lives on the heap, with a reference to this point in memory on the stack. I also know that C# does it's own Garbage Collection (ie. It determines when an instanciated class is no longer in use and reclaims the memory).
I have 2 questions:
Is my understanding of Garbage Collection correct?
Can I do my own? If so is there any real benefit to doing this myself or should I just leave it.
I ask because I have a method in a For loop. Every time I go through a loop, I create a new instance of my Class. In my head I visualise all of these classes lying around in a heap, not doing anything but taking up memory and I want to get rid of them as quickly as I can to keep things neat and tidy!
Am I understanding this correctly or am I missing something?
Ok so I understand about the stack and the heap (values live on the Stack, references on the Heap
I don't think you understand about the stack and the heap. If values live on the stack then where does an array of integers live? Integers are values. Are you telling me that an array of integers keeps its integers on the stack? When you return an array of integers from a method, say, with ten thousand integers in it, are you telling me that those ten thousand integers are copied onto the stack?
Values live on the stack when they live on the stack, and live on the heap when they live on the heap. The idea that the type of a thing has to do with the lifetime of its storage is nonsense. Storage locations that are short lived go on the stack; storage locations that are long lived go on the heap, and that is independent of their type. A long-lived int has to go on the heap, same as a long-lived instance of a class.
When I declare a new instance of a Class, this lives on the heap, with a reference to this point in memory on the stack.
Why does the reference have to go on the stack? Again, the lifetime of the storage of the reference has nothing to do with its type. If the storage of the reference is long-lived then the reference goes on the heap.
I also know that C# does it's own Garbage Collection (ie. It determines when an instanciated class is no longer in use and reclaims the memory).
The C# language does not do so; the CLR does so.
Is my understanding of Garbage Collection correct?
You seem to believe a lot of lies about the stack and the heap, so odds are good no, it's not.
Can I do my own?
Not in C#, no.
I ask because I have a method in a For loop. Every time I go through a loop, I create a new instance of my Class. In my head I visualise all of these classes lying around in a heap, not doing anything but taking up memory and I want to get rid of them as quickly as I can to keep things neat and tidy!
The whole point of garbage collection is to free you from worrying about tidying up. That's why its called "automatic garbage collection". It tidies for you.
If you are worried that your loops are creating collection pressure, and you wish to avoid collection pressure for performance reasons then I advise that you pursue a pooling strategy. It would be wise to start with an explicit pooling strategy; that is:
while(whatever)
{
Frob f = FrobPool.FetchFromPool();
f.Blah();
FrobPool.ReturnToPool(f);
}
rather than attempting to do automatic pooling using a resurrecting finalizer. I advise against both finalizers and object resurrection in general unless you are an expert on finalization semantics.
The pool of course allocates a new Frob if there is not one in the pool. If there is one in the pool, then it hands it out and removes it from the pool until it is put back in. (If you forget to put a Frob back in the pool, the GC will get to it eventually.) By pursuing a pooling strategy you cause the GC to eventually move all the Frobs to the generation 2 heap, instead of creating lots of collection pressure in the generation 0 heap. The collection pressure then disappears because no new Frobs are allocated. If something else is producing collection pressure, the Frobs are all safely in the gen 2 heap where they are rarely visited.
This of course is the exact opposite of the strategy you described; the whole point of the pooling strategy is to cause objects to hang around forever. Objects hanging around forever is a good thing if you're going to use them.
Of course, do not make these sorts of changes before you know via profiling that you have a performance problem due to collection pressure! It is rare to have such a problem on the desktop CLR; it is rather more common on the compact CLR.
More generally, if you are the kind of person who feels uncomfortable having a memory manager clean up for you on its schedule, then C# is not the right language for you. Consider C instead.
values live on the Stack, references on the Heap
This is an implementation detail. There is nothing to stop a .NET Framework from storing both on the stack.
I also know that C# does it's own Garbage Collection
C# has nothing to do with this. This is a service provided by the CLR. VB.NET, F#, etc all still have garbage collection.
The CLR will remove an object from memory if it has no strong roots. For example, when your class instance goes out of scope in your for loop. There will be a few lying around, but they will get collected eventually, either by garbage collection or the program terminating.
Can I do my own? If so is there any real benefit to doing this myself or should I just leave it?
You can use GC.Collect to force a collection. You should not do it because it is an expensive operation. More expensive than letting a few objects occupy memory a little bit longer than they are absolutely needed. The GC is incredibly good at what it does on its own. You will also force short lived objects to promote to generations they wouldn't get normally.
First off, to Erics seminal post about The truth about value types
Secondly on Garbage collection, the collector knows far more about your running program than you do, don't try to second guess it unless you're in the incredibly unlikely situation that you have a memory leak.
So to your second question, no don't try to "help" the GC.
I'll find a post to this effect on the CG and update this answer.
Can I do my own? If so is there any real benefit to doing this myself or should I just leave it.
Yes you can with GC.Collect but you shouldn't. The GC is optimized for variables that are short lived, ones in a method, and variables that are long lived, ones that generally stick around for the life time of the application.
Variables that are in-between aren't as common and aren't really optimum for the GC.
By forcing a GC.Collect you're more likely to cause variables in scope to be in forced into that in-between state which is the opposite from you are trying to accomplish.
Also from the MSDN article Writing High-Performance Managed Applications : A Primer
The GC is self-tuning and will adjust itself according to applications
memory requirements. In most cases programmatically invoking a GC will
hinder that tuning. "Helping" the GC by calling GC.Collect will more
than likely not improve your applications performance
Your understanding of Garbage Collection is good enough. Essentially, an unreferenced instance is deemed as being out-of-scope and no longer needed. Having determined this, the collector will remove an unreferenced object at some future point.
There's no way to force the Garbage Collector to collect just a specific instance. You can ask it to do its normal "collect everything possible" operation GC.Collect(), but you shouldn't.; the garbage-collector is efficient and effective if you just leave it to its own devices.
In particular it excels at collecting objects which have a short lifespan, just like those that are created as temporary objects. You shouldn't have to worry about creating loads of objects in a loop, unless they have a long lifespan that prevents immediate collection.
Please see this related question with regard to the Stack and Heap.
In your specific scenario, agreed, if you new up objects in a for-loop then you're going to have sub-optimal performance. Are the objects stored (or otherwise used) within the loop, or are they discarded? If the latter, can you optimize this by newing up one object outside the loop and re-using it?
With regard to can you implement your own GC, there is no explicit delete keyword in C#, you have to leave it to the CLR. You can however give it hints such as when to collect, or what to ignore during collection, however I'd leave that unless absolutely necessary.
Best regards,
Read the following article by Microsoft to get a level of knowledge about Garbage Collection in C#. I'm sure it'll help anyone who need information regarding this matter.
Memory Management and Garbage Collection in the .NET Framework
If you are interested in performance of some areas in your code when writing C#, you can write unsafe code. You will have a plus of performance, and also, in your fixed block, the garbage collector most likely will not occur.
Garbage collection is basically reference tracking. I can't think of any good reason why you would want to change it. Are you having some sort of problem where you find that memory isn't being freed? Or maybe you are looking for the dispose pattern
Edit:
Replaced "reference counting" with "reference tracking" to not be confused with the Increment/Decrement Counter on object Reference/Dereference (eg from Python).
I thought it was pretty common to refer to the object graph generation as "Counting" like in this answer:
Why no Reference Counting + Garbage Collection in C#?
But I will not pick up the glove of (the) Eric Lippert :)
Related
I wanted to add a small debug UI to my OpenGL game, which will be updated frequently with various debugging options/output displays. One thing I wanted was a constant counter that shows active objects in each generation of the garbage collector. I don't want names or anything, just a total count; something that I can eyeball when I do certain things within the game.
My problem, however, is that I can't seem to find a way to count the total objects currently alive in the various generations.
I even considered keeping a global static field, which would be incremented within every constructor and decremented within class finalizers. This would require hand-coding said functionality into every class though, and would not solve the problem of a "per-generation total".
Do you know how I could go about doing this?
(Question title:) "Counting total objects queued for garbage collection"
(From the question's body:) "My problem, however, is that I can't seem to find a way to count the total objects currently alive in the various generations."
Remark: Your question's title and body ask for opposite things. In the title, you're asking for the number of objects that can no longer be reached via any GC root, while in the body, you're asking for "live" objects, i.e. those that can still be reached via any GC root.
Let me start by saying that there might not be any way to do this, basically because objects in .NET are not reference-counted, so they cannot be immediately marked as "no longer needed" when the last reference to them disappears or goes out of scope. I believe .NET's mark-and-compact garbage collector only discovers which objects are alive and which can be reclaimed during an actual garbage collection (during the "mark" phase). You however seem to want this information in advance, i.e. before a GC occurs.
That being said, here are perhaps your best options:
Perhaps your best bet in .NET's managed Framework Class Library are performance counters. But it doesn't look like there are any suitable counters available: There are performance counters giving you the number of allocated bytes in the various GC generations, but AFAIK no counters for the number of live/dead objects.
You might also want to take a look at the CLR's (i.e. the runtime's) unmanaged, COM-based Debugging API. Given that you have retrieved an ICorDebugProcess5 interface, these methods might be of interest:
ICorDebugProcess5::EnumerateGCReferences method:
"Gets an enumerator for all objects that are to be garbage-collected in a process."
See also this answer to a similar question on SO.
Note that this is about objects that are to be garbage-collected, not about live objects.
ICorDebugProcess5::GetGCHeapInformation method:
"Provides general information about the garbage collection heap, including whether it is currently enumerable."
If it turns out that the managed heap is enumerable, you could use…
ICorDebugProcess5::EnumerateHeap method:
"Gets an enumerator for the objects on the managed heap."
The objects returned by this enumerator are of this type:
COR_HEAPOBJECT structure:
"Provides information about an object on the managed heap."
You might not be actually interested in these details, but just in the number of objects returned by the enumerator.
(I haven't used this API myself, perhaps there exists a better and more efficient way.)
In Sept 2015, Microsoft published a managed library called clrmd aka Microsoft.Diagnostics.Runtime on GitHub. It is based on the same foundation as the unmanaged debugging API mentioned above. The project includes documentation about enumerating objects in the GC heap.
Btw. there is an extremely informative book out there by Ben Watson, "Writing High-Performance .NET Code", which includes solid tips on how to make .NET memory allocation and GC more efficient.
Garbage Collector doesn't have to collect objects.
... that fact will be discovered when the garbage collector
runs the collector for whatever generation the object was in. (If it
runs at all, which it might not. There is no guarantee that the GC
runs.)
(C) Eric Lippert
If the application performs normally and the memory consumption is not increasing the GC can let it work without interruptions. That means that numbers will differ from run to run.
If I were you I wouldn't spend time on getting generations information, but just the size of used memory.
The simple but not very accurate way is to get it from GC.
// Determine the best available approximation of the number
// of bytes currently allocated in managed memory.
Console.WriteLine("Total Memory: {0}", GC.GetTotalMemory(false));
If you see that used memory increases and decreases often, then you can use existing profilers to figure out where are you allocating too mush, or even where the memory leak is.
I'm converting a C# project to C++ and have a question about deleting objects after use. In C# the GC of course takes care of deleting objects, but in C++ it has to be done explicitly using the delete keyword.
My question is, is it ok to just follow each object's usage throughout a method and then delete it as soon as it goes out of scope (ie method end/re-assignment)?
I know though that the GC waits for a certain size of garbage (~1MB) before deleting; does it do this because there is an overhead when using delete?
As this is a game I am creating there will potentially be lots of objects being created and deleted every second, so would it be better to keep track of pointers that go out of scope, and once that size reachs 1MB to then delete the pointers?
(as a side note: later when the game is optimised, objects will be loaded once at startup so there is not much to delete during gameplay)
Your problem is that you are using pointers in C++.
This is a fundamental problem that you must fix, then all your problems go away. As chance would have it, I got so fed up with this general trend that I created a set of presentation slides on this issue. – (CC BY, so feel free to use them).
Have a look at the slides. While they are certainly not entirely serious, the fundamental message is still true: Don’t use pointers. But more accurately, the message should read: Don’t use delete.
In your particular situation you might find yourself with a lot of long-lived small objects. This is indeed a situation which a modern GC handles quite well, and which reference-counting smart pointers (shared_ptr) handle less efficiently. If (and only if!) this becomes a performance problem, consider switching to a small object allocator library.
You should be using RAII as much as possible in C++ so you do not have to explicitly deleteanything anytime.
Once you use RAII through smart pointers and your own resource managing classes every dynamic allocation you make will exist only till there are any possible references to it, You do not have to manage any resources explicitly.
Memory management in C# and C++ is completely different. You shouldn't try to mimic the behavior of .NET's GC in C++. In .NET allocating memory is super fast (basically moving a pointer) whereas freeing it is the heavy task. In C++ allocating memory isn't that lightweight for several reasons, mainly because a large enough chunk of memory has to be found. When memory chunks of different sizes are allocated and freed many times during the execution of the program the heap can get fragmented, containing many small "holes" of free memory. In .NET this won't happen because the GC will compact the heap. Freeing memory in C++ is quite fast, though.
Best practices in .NET don't necessarily work in C++. For example, pooling and reusing objects in .NET isn't recommended most of the time, because the objects get promoted to higher generations by the GC. The GC works best for short lived objects. On the other hand, pooling objects in C++ can be very useful to avoid heap fragmentation. Also, allocating a larger chunk of memory and using placement new can work great for many smaller objects that need to be allocated and freed frequently, as it can occur in games. Read up on general memory management techniques in C++ such as RAII or placement new.
Also, I'd recommend getting the books "Effective C++" and "More effective C++".
Well, the simplest solution might be to just use garbage collection in
C++. The Boehm collector works well, for example. Still, there are
pros and cons (but porting code originally written in C# would be a
likely candidate for a case where the pros largely outweigh the cons.)
Otherwise, if you convert the code to idiomatic C++, there shouldn't be
that many dynamically allocated objects to worry about. Unlike C#, C++
has value semantics by default, and most of your short lived objects
should be simply local variables, possibly copied if they are returned,
but not allocated dynamically. In C++, dynamic allocation is normally
only used for entity objects, whose lifetime depends on external events;
e.g. a Monster is created at some random time, with a probability
depending on the game state, and is deleted at some later time, in
reaction to events which change the game state. In this case, you
delete the object when the monster ceases to be part of the game. In
C#, you probably have a dispose function, or something similar, for
such objects, since they typically have concrete actions which must be
carried out when they cease to exist—things like deregistering as
an Observer, if that's one of the patterns you're using. In C++, this
sort of thing is typically handled by the destructor, and instead of
calling dispose, you call delete the object.
Substituting a shared_ptr in every instance that you use a reference in C# would get you the closest approximation at probably the lowest effort input when converting the code.
However you specifically mention following an objects use through a method and deleteing at the end - a better approach is not to new up the object at all but simply instantiate it inline/on the stack. In fact if you take this approach even for returned objects with the new copy semantics being introduced this becomes an efficient way to deal with returned objects also - so there is no need to use pointers in almost every scenario.
There are a lot more things to take into considerations when deallocating objects than just calling delete whenever it goes out of scope. You have to make sure that you only call delete once and only call it once all pointers to that object have gone out of scope. The garbage collector in .NET handles all of that for you.
The construct that is mostly corresponding to that in C++ is tr1::shared_ptr<> which keeps a reference counter to the object and deallocates when it drops to zero. A first approach to get things running would be to make all C# references in to C++ tr1::shared_ptr<>. Then you can go into those places where it is a performance bottleneck (only after you've verified with a profile that it is an actual bottleneck) and change to more efficient memory handling.
GC feature of c++ has been discussed a lot in SO.
Try Reading through this!!
Garbage Collection in C++
Since C# is a managed language that performs garbage collection automatically to clean up objects etc, ...
what are the ways one can introduce a memory leak?
Are there some non-obvious ways that one should look out for?
How can you detect or look for memory leaks (once you understand how they are generated etc.)
Usually leaks show up in the form of a developer writing code that "holds on" to objects when they shouldn't be, which subsequently disallows the garbage collector from collecting on those objects.
The garbage collector is pretty good at what it does, but if you don't understand what it's doing, the likelihood of you introducing memory issues into your program is pretty high.
I would suggest reading up on the GC and understanding how it works.
Here's something to get you started:
http://www.simple-talk.com/dotnet/.net-framework/understanding-garbage-collection-in-.net/
Two.
First, building up referenced objects. Like creating objects that subscribe to an event on a form. The form is active, so can not be collected, and the event subscriptions... keep the objects alive.
Second, block the garbage colelctor in native code. Like the oracle ODP.NET driever does under certain conditions. Stops the finalizer, so any object requiring finalization WILL NOT GET IT - and thus never be released.
The obvious "memory leak" one could cause in a GC-ed language would be caused simply be retaining a reference to an object after it's needed - this is especially likely if you roll your own caching or keep other global state.
Another way would be leaking memory in unmanaged resources that weren't disposed of, although most of the standard library classes will probably dispose of those in destructors so the memory will be reclaimed sooner or later.
(I'm marking the post as CW because of the open-ended nature of the question.)
A memory leak means keeping in memory objects you don't need anymore. In C#, considering the GC collects unreferenced objects, it's equivalent to say keeping references to objects you don't need.
Think about uncorrectly scoped declarations, infinite recursion or iteration...
Hello and thank you very much for your help!
Does anybody have a good idea to find unreferenced objects of a specific class before garbage collection? (preferable as soon as possible)
In my case, I need to create a lot of small objects of a specific class for temporary use only. The problem is that I don’t know when the object is not needed anymore. I would like to collect the objects of that class which are not referenced any more (as soon as possible) before garbage collection so that I can recycle them and don’t need to create them new. I think that would make the code much faster.
Kind Regards,
David
First off, before you do this you should do extensive profiling to determine that you really, truly do have a performance problem caused by collection pressure. The garbage collector is highly tuned and works quite well most of the time; situations where you need to pool objects for performance reasons are rare.
I actually am in that scenario; we have determined through extensive testing that there are certain objects we use all the time on a temporary basis, ("builders" of other objects, essentially) and that the cost of collection pressure caused by re-allocating them frequently is measurable and high.
What we do is we have a pool class which maintains an array of "blank" objects. When you need a new object, the pool checks the array and returns an object that is in the array if we have one, nulling out the array entry. If we don't have one then it creates a new object. When the temporary user is done with the object, it passes it back to the pool, which "blanks" it and sticks it back in the array. (Growing the array if necessary.)
If a user forgets to put the object back into the pool, or cannot do so because an exception was thrown before the "back in the pool" call, who cares? All we've done in that case is perhaps slightly de-optimized a future allocation. The cost is that you need to remember to put the object back in the pool when you're done with it.
There's no way to "hook" the garbage collector to put stuff back in the pool automatically that I know of.
You can't directly control garbage collection, but you could create a manager class that is responsible for creating, holding the references and disposing of these objects. As long as the manager class is in scope, its objects will not be garbage collected.
I need to dispose of an object so it can release everything it owns, but it doesn't implement the IDisposable so I can't use it in a using block. How can I make the garbage collector collect it?
You can force a collection with GC.Collect(). Be very careful using this, since a full collection can take some time. The best-practice is to just let the GC determine when the best time to collect is.
Does the object contain unmanaged resources but does not implement IDisposable? If so, it's a bug.
If it doesn't, it shouldn't matter if it gets released right away, the garbage collector should do the right thing.
If it "owns" anything other than memory, you need to fix the object to use IDisposable. If it's not an object you control this is something worth picking a different vendor over, because it speaks to the core of how well your vendor really understands .Net.
If it does just own memory, even a lot of it, all you have to do is make sure the object goes out of scope. Don't call GC.Collect() — it's one of those things that if you have to ask, you shouldn't do it.
You can't perform garbage collection on a single object. You could request a garbage collection by calling GC.Collect() but this will effect all objects subject to cleanup. It is also highly discouraged as it can have a negative effect on the performance of later collections.
Also, calling Dispose on an object does not clean up it's memory. It only allows the object to remove references to unmanaged resources. For example, calling Dispose on a StreamWriter closes the stream and releases the Windows file handle. The memory for the object on the managed heap does not get reclaimed until a subsequent garbage collection.
Chris Sells also discussed this on .NET Rocks. I think it was during his first appearance but the subject might have been revisited in later interviews.
http://www.dotnetrocks.com/default.aspx?showNum=10
This article by Francesco Balena is also a good reference:
When and How to Use Dispose and Finalize in C#
http://www.devx.com/dotnet/Article/33167/0/page/1
Garbage collection in .NET is non deterministic, meaning you can't really control when it happens. You can suggest, but that doesn't mean it will listen.
Tells us a little bit more about the object and why you want to do this. We can make some suggestions based off of that. Code always helps. And depending on the object, there might be a Close method or something similar. Maybe the useage is to call that. If there is no Close or Dispose type of method, you probably don't want to rely on that object, as you will probably get memory leaks if in fact it does contain resourses which will need to be released.
If the object goes out of scope and it have no external references it will be collected rather fast (likely on the next collection).
BEWARE: of f ra gm enta tion in many cases, GC.Collect() or some IDisposal is not very helpful, especially for large objects (LOH is for objects ~80kb+, performs no compaction and is subject to high levels of fragmentation for many common use cases) which will then lead to out of memory (OOM) issues even with potentially hundreds of MB free. As time marches on, things get bigger, though perhaps not this size (80 something kb) for LOH relegated objects, high degrees of parallelism exasperates this issue due simply due to more objects in less time (and likely varying in size) being instantiated/released.
Array’s are the usual suspects for this problem (it’s also often hard to identify due to non-specific exceptions and assertions from the runtime, something like “high % of large object heap fragmentation” would be swell), the prognosis for code suffering from this problem is to implement an aggressive re-use strategy.
A class in Systems.Collections.Concurrent.ObjectPool from the parallel extensions beta1 samples helps (unfortunately there is not a simple ubiquitous pattern which I have seen, like maybe some attached property/extension methods?), it is simple enough to drop in or re-implement for most projects, you assign a generator Func<> and use Get/Put helper methods to re-use your previous object’s and forgo usual garbage collection. It is usually sufficient to focus on array’s and not the individual array elements.
It would be nice if .NET 4 updated all of the .ToArray() methods everywhere to include .ToArray(T target).
Getting the hang of using SOS/windbg (.loadby sos mscoreei for CLRv4) to analyze this class of issue can help. Thinking about it, the current garbage collection system is more like garbage re-cycling (using the same physical memory again), ObjectPool is analogous to garbage re-using. If anybody remembers the 3 R’s, reducing your memory use is a good idea too, for performance sakes ;)