I'm converting a C# project to C++ and have a question about deleting objects after use. In C# the GC of course takes care of deleting objects, but in C++ it has to be done explicitly using the delete keyword.
My question is, is it ok to just follow each object's usage throughout a method and then delete it as soon as it goes out of scope (ie method end/re-assignment)?
I know though that the GC waits for a certain size of garbage (~1MB) before deleting; does it do this because there is an overhead when using delete?
As this is a game I am creating, there will potentially be lots of objects being created and deleted every second, so would it be better to keep track of pointers that go out of scope, and delete them once their total size reaches 1MB?
(as a side note: later when the game is optimised, objects will be loaded once at startup so there is not much to delete during gameplay)
Your problem is that you are using pointers in C++.
This is a fundamental problem that you must fix, then all your problems go away. As chance would have it, I got so fed up with this general trend that I created a set of presentation slides on this issue (CC BY, so feel free to use them).
Have a look at the slides. While they are certainly not entirely serious, the fundamental message is still true: Don’t use pointers. But more accurately, the message should read: Don’t use delete.
In your particular situation you might find yourself with a lot of long-lived small objects. This is indeed a situation which a modern GC handles quite well, and which reference-counting smart pointers (shared_ptr) handle less efficiently. If (and only if!) this becomes a performance problem, consider switching to a small object allocator library.
You should be using RAII as much as possible in C++ so that you never have to explicitly delete anything.
Once you use RAII through smart pointers and your own resource-managing classes, every dynamic allocation you make will exist only as long as something can still reference it. You do not have to manage any resources explicitly.
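To make the RAII point concrete, here is a minimal sketch, assuming a C++14 compiler. Texture, live_count and load_and_use are hypothetical names invented for this illustration; they are not from the question.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical game resource, used only for illustration.
struct Texture {
    std::string name;
    explicit Texture(std::string n) : name(std::move(n)) { ++live_count; }
    ~Texture() { --live_count; }
    Texture(const Texture&) = delete;
    Texture& operator=(const Texture&) = delete;
    static int live_count;  // number of Texture objects currently alive
};
int Texture::live_count = 0;

// Returns how many textures were alive while the scope was still open.
int load_and_use() {
    auto sky = std::make_unique<Texture>("sky");   // owned by this scope
    std::vector<std::unique_ptr<Texture>> level;   // the container owns its elements
    level.push_back(std::make_unique<Texture>("grass"));
    level.push_back(std::make_unique<Texture>("rock"));
    assert(sky != nullptr);
    return Texture::live_count;
}   // sky and every element of level are destroyed here; no delete is written anywhere
```

Calling load_and_use() sees three live textures inside the function, and the count is back to zero once it returns, because the destructors run when the owners go out of scope.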
Memory management in C# and C++ is completely different. You shouldn't try to mimic the behavior of .NET's GC in C++. In .NET allocating memory is super fast (basically moving a pointer) whereas freeing it is the heavy task. In C++ allocating memory isn't that lightweight for several reasons, mainly because a large enough chunk of memory has to be found. When memory chunks of different sizes are allocated and freed many times during the execution of the program the heap can get fragmented, containing many small "holes" of free memory. In .NET this won't happen because the GC will compact the heap. Freeing memory in C++ is quite fast, though.
Best practices in .NET don't necessarily work in C++. For example, pooling and reusing objects in .NET isn't recommended most of the time, because the objects get promoted to higher generations by the GC. The GC works best for short lived objects. On the other hand, pooling objects in C++ can be very useful to avoid heap fragmentation. Also, allocating a larger chunk of memory and using placement new can work great for many smaller objects that need to be allocated and freed frequently, as it can occur in games. Read up on general memory management techniques in C++ such as RAII or placement new.
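As a rough sketch of the pooling and placement-new idea above: the pool below allocates one chunk up front and constructs objects inside it. FixedPool, Bullet and all member names are hypothetical and chosen for this example only. It is a fixed-capacity toy, not a tuned allocator, and it assumes every object is handed back with destroy() before the pool itself is destroyed.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Hypothetical fixed-capacity pool: one up-front allocation, objects built with placement new.
template <typename T>
class FixedPool {
public:
    explicit FixedPool(std::size_t capacity) : storage_(capacity) {
        free_.reserve(capacity);
        for (Slot& slot : storage_) free_.push_back(slot.bytes);  // every slot starts free
    }
    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;

    // Constructs a T in a free slot, or returns nullptr when the pool is exhausted.
    template <typename... Args>
    T* create(Args&&... args) {
        if (free_.empty()) return nullptr;
        void* memory = free_.back();
        free_.pop_back();
        return ::new (memory) T(std::forward<Args>(args)...);  // placement new: no heap call
    }

    // Runs the destructor and hands the slot back to the pool.
    void destroy(T* object) {
        assert(object != nullptr);
        object->~T();
        free_.push_back(object);
    }

    std::size_t available() const { return free_.size(); }

private:
    struct Slot { alignas(T) unsigned char bytes[sizeof(T)]; };  // raw, correctly aligned storage
    std::vector<Slot> storage_;  // the single large chunk
    std::vector<void*> free_;    // addresses of currently unused slots
};

// Hypothetical small, frequently created game object.
struct Bullet {
    float x, y;
    Bullet(float x_, float y_) : x(x_), y(y_) {}
};
```

With FixedPool<Bullet> pool(256), each pool.create(x, y) and pool.destroy(ptr) reuses the same storage, so spawning and removing bullets every frame never touches the general-purpose heap.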
Also, I'd recommend getting the books "Effective C++" and "More Effective C++".
Well, the simplest solution might be to just use garbage collection in
C++. The Boehm collector works well, for example. Still, there are
pros and cons (but porting code originally written in C# would be a
likely candidate for a case where the pros largely outweigh the cons.)
Otherwise, if you convert the code to idiomatic C++, there shouldn't be
that many dynamically allocated objects to worry about. Unlike C#, C++
has value semantics by default, and most of your short lived objects
should be simply local variables, possibly copied if they are returned,
but not allocated dynamically. In C++, dynamic allocation is normally
only used for entity objects, whose lifetime depends on external events;
e.g. a Monster is created at some random time, with a probability
depending on the game state, and is deleted at some later time, in
reaction to events which change the game state. In this case, you
delete the object when the monster ceases to be part of the game. In
C#, you probably have a dispose function, or something similar, for
such objects, since they typically have concrete actions which must be
carried out when they cease to exist—things like deregistering as
an Observer, if that's one of the patterns you're using. In C++, this
sort of thing is typically handled by the destructor, and instead of
calling dispose, you delete the object.
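A small sketch of both halves of that answer, with hypothetical names (Vec2, World, Monster) invented for illustration: a short-lived value is just a local that gets copied on return, while an entity object deregisters itself in its destructor. Ownership is shown with std::unique_ptr rather than a raw delete, which is my substitution, not something the answer prescribes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Short-lived value type: lives on the stack and is returned by copy.
struct Vec2 { float x, y; };

Vec2 midpoint(Vec2 a, Vec2 b) {
    return Vec2{(a.x + b.x) / 2.0f, (a.y + b.y) / 2.0f};
}

class Monster;

// Hypothetical subject that entities register with (Observer pattern).
class World {
public:
    void add(Monster* m) { assert(m != nullptr); observers_.push_back(m); }
    void remove(Monster* m) {
        observers_.erase(std::remove(observers_.begin(), observers_.end(), m),
                         observers_.end());
    }
    std::size_t observer_count() const { return observers_.size(); }
private:
    std::vector<Monster*> observers_;  // non-owning pointers
};

// Entity object: created and destroyed in response to game events.
class Monster {
public:
    explicit Monster(World& world) : world_(world) { world_.add(this); }
    ~Monster() { world_.remove(this); }  // plays the role of C#'s Dispose
    Monster(const Monster&) = delete;
    Monster& operator=(const Monster&) = delete;
private:
    World& world_;
};
```

Holding the monster in a std::unique_ptr<Monster> and resetting it when the monster leaves the game runs the destructor at exactly that point, so the deregistration cannot be forgotten.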
Substituting a shared_ptr in every instance that you use a reference in C# would get you the closest approximation at probably the lowest effort input when converting the code.
However, you specifically mention following an object's use through a method and deleting it at the end. A better approach is not to new up the object at all but simply to instantiate it inline, on the stack. If you take this approach even for returned objects, the move semantics introduced in C++11 make returning by value efficient too, so there is almost never a need to use pointers.
There are a lot more things to take into consideration when deallocating objects than just calling delete whenever something goes out of scope. You have to make sure that you call delete only once, and only after all pointers to that object have gone out of scope. The garbage collector in .NET handles all of that for you.
The closest corresponding construct in C++ is tr1::shared_ptr<> (std::shared_ptr<> since C++11), which keeps a reference count for the object and deallocates it when the count drops to zero. A first approach to get things running would be to turn all C# references into C++ tr1::shared_ptr<>. Then you can go to the places where it is a performance bottleneck (only after you've verified with a profiler that it is an actual bottleneck) and change to more efficient memory handling.
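Here is a minimal sketch of the reference-counting behaviour described above, using std::shared_ptr (the C++11 spelling of tr1::shared_ptr). Enemy, Squad and the two functions are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical object that several owners refer to, like a C# reference.
struct Enemy { int hp = 100; };

struct Squad { std::vector<std::shared_ptr<Enemy>> members; };

// Returns the reference count while three owners exist.
long count_owners() {
    auto enemy = std::make_shared<Enemy>();  // count == 1
    Squad red, blue;
    red.members.push_back(enemy);            // count == 2
    blue.members.push_back(enemy);           // count == 3
    return enemy.use_count();
}

// Returns true if the object was freed as soon as its last owner went away.
bool freed_with_last_owner() {
    std::weak_ptr<Enemy> watcher;            // observes without owning
    {
        auto enemy = std::make_shared<Enemy>();
        watcher = enemy;
        assert(!watcher.expired());          // still alive inside the scope
    }                                        // last shared_ptr destroyed here
    return watcher.expired();
}
```

Unlike a tracing GC, the release happens deterministically when the count reaches zero. The trade-off is that cycles of shared_ptr never reach zero on their own, which is where std::weak_ptr comes in.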
Garbage collection in C++ has been discussed a lot on SO.
Try reading through this:
Garbage Collection in C++
Related
I know C# gives the programmer the ability to access and use pointers in an unsafe context. But when is this needed?
Under what circumstances does using pointers become inevitable?
Is it only for performance reasons?
Also, why does C# expose this functionality through an unsafe context, and remove all of the managed advantages from it? Is it possible to use pointers without losing any advantages of the managed environment, theoretically?
When is this needed? Under what circumstances does using pointers become inevitable?
When the net cost of a managed, safe solution is unacceptable but the net cost of an unsafe solution is acceptable. You can determine the net cost or net benefit by subtracting the total benefits from the total costs. The benefits of an unsafe solution are things like "no time wasted on unnecessary runtime checks to ensure correctness"; the costs are (1) having to write code that is safe even with the managed safety system turned off, and (2) having to deal with potentially making the garbage collector less efficient, because it cannot move around memory that has an unmanaged pointer into it.
Or, if you are the person writing the marshalling layer.
Is it only for performance reasons?
It seems perverse to use pointers in a managed language for reasons other than performance.
You can use the methods in the Marshal class to deal with interoperating with unmanaged code in the vast majority of cases. (There might be a few cases in which it is difficult or impossible to use the marshalling gear to solve an interop problem, but I don't know of any.)
Of course, as I said, if you are the person writing the Marshal class then obviously you don't get to use the marshalling layer to solve your problem. In that case you'd need to implement it using pointers.
Why does C# expose this functionality through an unsafe context, and remove all of the managed advantages from it?
Those managed advantages come with performance costs. For example, every time you ask an array for its tenth element, the runtime needs to do a check to see if there is a tenth element, and throw an exception if there isn't. With pointers that runtime cost is eliminated.
The corresponding developer cost is that if you do it wrong then you get to deal with memory corruption bugs that format your hard disk and crash your process an hour later, rather than dealing with a nice clean exception at the point of the error.
Is it possible to use pointers without losing any advantages of managed environment, theoretically?
By "advantages" I assume you mean advantages like garbage collection, type safety and referential integrity. Thus your question is essentially "is it in theory possible to turn off the safety system but still get the benefits of the safety system being turned on?" No, clearly it is not. If you turn off that safety system because you don't like how expensive it is then you don't get the benefits of it being on!
Pointers are an inherent contradiction to the managed, garbage-collected, environment.
Once you start messing with raw pointers, the GC has no clue what's going on.
Specifically, it cannot tell whether objects are reachable, since it doesn't know where your pointers are.
It also cannot move objects around in memory, since that would break your pointers.
All of this would be solved by GC-tracked pointers; that's what references are.
You should only use pointers in messy advanced interop scenarios or for highly sophisticated optimization.
If you have to ask, you probably shouldn't.
The GC can move objects around in memory; using unsafe keeps an object outside of the GC's control, and avoids this. The fixed statement pins an object, but lets the GC manage the memory.
By definition, if you have a pointer to the address of an object, and the GC moves it, your pointer is no longer valid.
As to why you need pointers: the primary reason is to work with unmanaged DLLs, e.g. those written in C++.
Also note, when you pin variables and use pointers, you're more susceptible to heap fragmentation.
Edit
You've touched on the core issue of managed vs. unmanaged code... how does the memory get released?
You can mix code for performance as you describe, you just can't cross managed/unmanaged boundaries with pointers (i.e. you can't use pointers outside of the 'unsafe' context).
As for how they get cleaned... You have to manage your own memory; objects that your pointers point to were created/allocated (usually within the C++ DLL) using (hopefully) CoTaskMemAlloc(), and you have to release that memory in the same manner, calling CoTaskMemFree(), or you'll have a memory leak. Note that only memory allocated with CoTaskMemAlloc() can be freed with CoTaskMemFree().
The other alternative is to expose a method from your native C++ dll that takes a pointer and frees it... this lets the DLL decide how to free the memory, which works best if it used some other method to allocate memory. Most native dlls you work with are third-party dlls that you can't modify, and they don't usually have (that I've seen) such functions to call.
An example of freeing memory, taken from here:
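// Managed input passed to the native function.
string[] array = new string[2];
array[0] = "hello";
array[1] = "world";
// test() is the native function from the linked example; its result is assumed to be allocated with CoTaskMemAlloc().
IntPtr ptr = test(array);
// Copy the native string into a managed string before freeing it.
string result = Marshal.PtrToStringAuto(ptr);
// Release the native buffer with the matching deallocator.
Marshal.FreeCoTaskMem(ptr);
System.Console.WriteLine(result);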
Some more reading material:
C# deallocate memory referenced by IntPtr
The second answer down explains the different allocation/deallocation methods
How to free IntPtr in C#?
Reinforces the need to deallocate in the same manner the memory was allocated
http://msdn.microsoft.com/en-us/library/aa366533%28VS.85%29.aspx
Official MSDN documentation on the various ways to allocate and deallocate memory.
In short... you need to know how the memory was allocated in order to free it.
Edit
If I understand your question correctly, the short answer is yes, you can hand the data off to unmanaged pointers, work with it in an unsafe context, and have the data available once you exit the unsafe context.
The key is that you have to pin the managed object you're referencing with a fixed block. This prevents the memory you're referencing from being moved by the GC while in the unsafe block. There are a number of subtleties involved here, e.g. you can't reassign a pointer initialized in a fixed block... you should read up on unsafe and fixed statements if you're really set on managing your own code.
All that said, the benefits of managing your own objects and using pointers in the manner you describe may not buy you as much of a performance increase as you might think. Reasons why not:
C# is very optimized and very fast
Your pointer code is still generated as IL, which has to be jitted (at which point further optimizations come into play)
You're not turning the Garbage Collector off... you're just keeping the objects you're working with out of the GC's purview. So every 100ms or so, the GC still interrupts your code and executes its functions for all the other variables in your managed code.
HTH,
James
The most common reasons to use pointers explicitly in C#:
doing low-level work (like string manipulation) that is very performance sensitive,
interfacing with unmanaged APIs.
The reason why the syntax associated with pointers was removed from C# (according to my knowledge and viewpoint — Jon Skeet would answer better B-)) was it turned out to be superfluous in most situations.
From the language design perspective, once you manage memory by a garbage collector you have to introduce severe constraints on what is and what is not possible to do with pointers. For example, using a pointer to point into the middle of an object can cause severe problems to the GC. Hence, once the restrictions are in place, you can just omit the extra syntax and end up with “automatic” references.
Also, the ultra-benevolent approach found in C/C++ is a common source of errors. For most situations, where micro-performance doesn't matter at all, it is better to offer tighter rules and constrain the developer in favor of fewer bugs that would be very hard to discover. Thus for common business applications the so-called “managed” environments like .NET and Java are better suited than languages that presume to work against the bare-metal machine.
Say you want to communicate between two applications using IPC (shared memory). You can marshal the data to memory and pass the data pointer to the other application via Windows messaging or something similar. In the receiving application you can read the data back.
This is also useful when transferring data from .NET to legacy VB6 apps: you marshal the data to memory, pass the pointer to the VB6 app using Windows messaging, and use VB6 CopyMemory() to fetch the data from the managed memory space into the VB6 app's unmanaged memory space.
Ok so I understand about the stack and the heap (values live on the Stack, references on the Heap).
When I declare a new instance of a Class, this lives on the heap, with a reference to this point in memory on the stack. I also know that C# does its own Garbage Collection (i.e. it determines when an instantiated class is no longer in use and reclaims the memory).
I have 2 questions:
Is my understanding of Garbage Collection correct?
Can I do my own? If so is there any real benefit to doing this myself or should I just leave it.
I ask because I have a method in a For loop. Every time I go through a loop, I create a new instance of my Class. In my head I visualise all of these classes lying around in a heap, not doing anything but taking up memory and I want to get rid of them as quickly as I can to keep things neat and tidy!
Am I understanding this correctly or am I missing something?
Ok so I understand about the stack and the heap (values live on the Stack, references on the Heap).
I don't think you understand about the stack and the heap. If values live on the stack then where does an array of integers live? Integers are values. Are you telling me that an array of integers keeps its integers on the stack? When you return an array of integers from a method, say, with ten thousand integers in it, are you telling me that those ten thousand integers are copied onto the stack?
Values live on the stack when they live on the stack, and live on the heap when they live on the heap. The idea that the type of a thing has to do with the lifetime of its storage is nonsense. Storage locations that are short lived go on the stack; storage locations that are long lived go on the heap, and that is independent of their type. A long-lived int has to go on the heap, same as a long-lived instance of a class.
When I declare a new instance of a Class, this lives on the heap, with a reference to this point in memory on the stack.
Why does the reference have to go on the stack? Again, the lifetime of the storage of the reference has nothing to do with its type. If the storage of the reference is long-lived then the reference goes on the heap.
I also know that C# does its own Garbage Collection (i.e. it determines when an instantiated class is no longer in use and reclaims the memory).
The C# language does not do so; the CLR does so.
Is my understanding of Garbage Collection correct?
You seem to believe a lot of lies about the stack and the heap, so odds are good no, it's not.
Can I do my own?
Not in C#, no.
I ask because I have a method in a For loop. Every time I go through a loop, I create a new instance of my Class. In my head I visualise all of these classes lying around in a heap, not doing anything but taking up memory and I want to get rid of them as quickly as I can to keep things neat and tidy!
The whole point of garbage collection is to free you from worrying about tidying up. That's why it's called "automatic garbage collection". It tidies for you.
If you are worried that your loops are creating collection pressure, and you wish to avoid collection pressure for performance reasons then I advise that you pursue a pooling strategy. It would be wise to start with an explicit pooling strategy; that is:
while (whatever)
{
    Frob f = FrobPool.FetchFromPool();
    f.Blah();
    FrobPool.ReturnToPool(f);
}
rather than attempting to do automatic pooling using a resurrecting finalizer. I advise against both finalizers and object resurrection in general unless you are an expert on finalization semantics.
The pool of course allocates a new Frob if there is not one in the pool. If there is one in the pool, then it hands it out and removes it from the pool until it is put back in. (If you forget to put a Frob back in the pool, the GC will get to it eventually.) By pursuing a pooling strategy you cause the GC to eventually move all the Frobs to the generation 2 heap, instead of creating lots of collection pressure in the generation 0 heap. The collection pressure then disappears because no new Frobs are allocated. If something else is producing collection pressure, the Frobs are all safely in the gen 2 heap where they are rarely visited.
This of course is the exact opposite of the strategy you described; the whole point of the pooling strategy is to cause objects to hang around forever. Objects hanging around forever is a good thing if you're going to use them.
Of course, do not make these sorts of changes before you know via profiling that you have a performance problem due to collection pressure! It is rare to have such a problem on the desktop CLR; it is rather more common on the compact CLR.
More generally, if you are the kind of person who feels uncomfortable having a memory manager clean up for you on its schedule, then C# is not the right language for you. Consider C instead.
values live on the Stack, references on the Heap
This is an implementation detail. There is nothing to stop a .NET Framework from storing both on the stack.
I also know that C# does its own Garbage Collection
C# has nothing to do with this. This is a service provided by the CLR. VB.NET, F#, etc all still have garbage collection.
The CLR will remove an object from memory if it has no strong roots. For example, when your class instance goes out of scope in your for loop. There will be a few lying around, but they will get collected eventually, either by garbage collection or the program terminating.
Can I do my own? If so is there any real benefit to doing this myself or should I just leave it?
You can use GC.Collect to force a collection. You should not do it because it is an expensive operation. More expensive than letting a few objects occupy memory a little bit longer than they are absolutely needed. The GC is incredibly good at what it does on its own. You will also force short lived objects to promote to generations they wouldn't get normally.
First off, see Eric's seminal post, The Truth About Value Types.
Secondly, on garbage collection: the collector knows far more about your running program than you do. Don't try to second-guess it unless you're in the incredibly unlikely situation that you have a memory leak.
So, to your second question: no, don't try to "help" the GC.
I'll find a post to this effect on the GC and update this answer.
Can I do my own? If so is there any real benefit to doing this myself or should I just leave it.
Yes you can with GC.Collect but you shouldn't. The GC is optimized for variables that are short lived, ones in a method, and variables that are long lived, ones that generally stick around for the life time of the application.
Variables that are in-between aren't as common and aren't really optimum for the GC.
By forcing a GC.Collect you're more likely to push variables that are still in scope into that in-between state, which is the opposite of what you are trying to accomplish.
Also from the MSDN article Writing High-Performance Managed Applications : A Primer
The GC is self-tuning and will adjust itself according to applications
memory requirements. In most cases programmatically invoking a GC will
hinder that tuning. "Helping" the GC by calling GC.Collect will more
than likely not improve your applications performance
Your understanding of Garbage Collection is good enough. Essentially, an unreferenced instance is deemed as being out-of-scope and no longer needed. Having determined this, the collector will remove an unreferenced object at some future point.
There's no way to force the Garbage Collector to collect just a specific instance. You can ask it to do its normal "collect everything possible" operation with GC.Collect(), but you shouldn't; the garbage collector is efficient and effective if you just leave it to its own devices.
In particular it excels at collecting objects which have a short lifespan, just like those that are created as temporary objects. You shouldn't have to worry about creating loads of objects in a loop, unless they have a long lifespan that prevents immediate collection.
Please see this related question with regard to the Stack and Heap.
In your specific scenario, agreed, if you new up objects in a for-loop then you're going to have sub-optimal performance. Are the objects stored (or otherwise used) within the loop, or are they discarded? If the latter, can you optimize this by newing up one object outside the loop and re-using it?
As for whether you can implement your own GC: there is no explicit delete keyword in C#, so you have to leave it to the CLR. You can, however, give it hints, such as when to collect or what to ignore during collection, but I'd leave that alone unless absolutely necessary.
Best regards,
Read the following article by Microsoft to get a grounding in Garbage Collection in C#. I'm sure it'll help anyone who needs information regarding this matter.
Memory Management and Garbage Collection in the .NET Framework
If you are interested in the performance of some areas of your code when writing C#, you can write unsafe code. You may gain some performance, and inside a fixed block the garbage collector will not move the object you have pinned.
Garbage collection is basically reference tracking. I can't think of any good reason why you would want to change it. Are you having some sort of problem where you find that memory isn't being freed? Or maybe you are looking for the dispose pattern?
Edit:
Replaced "reference counting" with "reference tracking" to not be confused with the Increment/Decrement Counter on object Reference/Dereference (eg from Python).
I thought it was pretty common to refer to the object graph generation as "Counting" like in this answer:
Why no Reference Counting + Garbage Collection in C#?
But I will not pick up the glove of (the) Eric Lippert :)
I need to dispose of an object so it can release everything it owns, but it doesn't implement the IDisposable so I can't use it in a using block. How can I make the garbage collector collect it?
You can force a collection with GC.Collect(). Be very careful using this, since a full collection can take some time. The best-practice is to just let the GC determine when the best time to collect is.
Does the object contain unmanaged resources but does not implement IDisposable? If so, it's a bug.
If it doesn't, it shouldn't matter if it gets released right away, the garbage collector should do the right thing.
If it "owns" anything other than memory, you need to fix the object to use IDisposable. If it's not an object you control this is something worth picking a different vendor over, because it speaks to the core of how well your vendor really understands .Net.
If it does just own memory, even a lot of it, all you have to do is make sure the object goes out of scope. Don't call GC.Collect() — it's one of those things that if you have to ask, you shouldn't do it.
You can't perform garbage collection on a single object. You could request a garbage collection by calling GC.Collect(), but this will affect all objects subject to cleanup. It is also highly discouraged, as it can have a negative effect on the performance of later collections.
Also, calling Dispose on an object does not clean up its memory. It only allows the object to remove references to unmanaged resources. For example, calling Dispose on a StreamWriter closes the stream and releases the Windows file handle. The memory for the object on the managed heap does not get reclaimed until a subsequent garbage collection.
Chris Sells also discussed this on .NET Rocks. I think it was during his first appearance but the subject might have been revisited in later interviews.
http://www.dotnetrocks.com/default.aspx?showNum=10
This article by Francesco Balena is also a good reference:
When and How to Use Dispose and Finalize in C#
http://www.devx.com/dotnet/Article/33167/0/page/1
Garbage collection in .NET is non deterministic, meaning you can't really control when it happens. You can suggest, but that doesn't mean it will listen.
Tell us a little bit more about the object and why you want to do this. We can make some suggestions based on that. Code always helps. And depending on the object, there might be a Close method or something similar. Maybe the intended usage is to call that. If there is no Close or Dispose type of method, you probably don't want to rely on that object, as you will probably get memory leaks if it does in fact contain resources which need to be released.
If the object goes out of scope and it has no external references, it will be collected rather fast (likely on the next collection).
BEWARE of fragmentation: in many cases GC.Collect() or some IDisposable is not very helpful, especially for large objects (the LOH is for objects of ~80kb and up, performs no compaction, and is subject to high levels of fragmentation for many common use cases), which can then lead to out-of-memory (OOM) issues even with potentially hundreds of MB free. As time marches on, things get bigger, though perhaps not this size threshold (80-something kb) for LOH-relegated objects. High degrees of parallelism exacerbate the issue simply because more objects (likely of varying sizes) are instantiated and released in less time.
Arrays are the usual suspects for this problem (it’s also often hard to identify due to non-specific exceptions and assertions from the runtime; something like “high % of large object heap fragmentation” would be swell). The remedy for code suffering from this problem is to implement an aggressive re-use strategy.
The ObjectPool class in System.Collections.Concurrent from the parallel extensions beta1 samples helps (unfortunately there is no simple, ubiquitous pattern that I have seen, like maybe some attached property or extension methods?). It is simple enough to drop in or re-implement for most projects: you assign a generator Func<> and use the Get/Put helper methods to re-use your previous objects and forgo the usual garbage collection. It is usually sufficient to focus on arrays and not the individual array elements.
It would be nice if .NET 4 updated all of the .ToArray() methods everywhere to include .ToArray(T target).
Getting the hang of using SOS/windbg (.loadby sos mscoreei for CLRv4) to analyze this class of issue can help. Thinking about it, the current garbage collection system is more like garbage recycling (using the same physical memory again), while ObjectPool is analogous to garbage re-use. If anybody remembers the 3 R’s, reducing your memory use is a good idea too, for performance's sake ;)
I'm learning C#. From what I know, you have to set things up correctly to have the garbage collector actually delete everything as it should be. I'm looking for wisdom learned over the years from you, the intelligent.
I'm coming from a C++ background and am VERY used to code-smells and development patterns. I want to learn what code-smells are like in C#. Give me advice!
What are the best ways to get things deleted?
How can you figure out when you have "memory leaks"?
Edit: I am trying to develop a punch-list of "stuff to always do for memory management"
Thanks, so much.
C# and the .NET Framework use managed memory, and everything (except allocated unmanaged resources) is garbage collected.
It is safe to assume that managed types are always garbage collected. That includes arrays, classes and structures. Feel free to do int[] stuff = new int[32]; and forget about it.
If you open a file, database connection, or any other unmanaged resource in a class, implement the IDisposable interface and in your Dispose method de-allocate the unmanaged resource.
Any class which implements IDisposable should be explicitly closed, or used in a (I think cool) using block, like this:
using (StreamReader reader = new StreamReader("myfile.txt"))
{
    // ... your code here
}
Here .NET will dispose reader when out of the { } scope.
The first thing with GC is that it is non-deterministic; if you want a resource cleaned up promptly, implement IDisposable and use using; that doesn't collect the managed memory, but can help a lot with unmanaged resources and onward chains.
In particular, things to watch out for:
lots of pinning (places a lot of restrictions on what the GC can do)
lots of finalizers (you don't usually need them; slows down GC)
static events - easy way to keep a lot of large object graphs alive ;-p
events on an inexpensive long-life object, that can see an expensive object that should have been cleaned up
"captured variables" accidentally keeping graphs alive
For investigating memory leaks... "SOS" is one of the easiest routes; you can use SOS to find all instances of a type, and what can see it, etc.
In general, the less you worry about memory allocation in C#, the better off you are. I would leave it to a profiler to tell me when I'm having issues with collection.
You can't create memory leaks in C# in the same way as you do in C++. The garbage collector will always "have your back". What you can do is create objects and hold references to them even though you never use them. That's a code smell to look out for.
Other than that:
Have some notion of how frequently collection will occur (for performance reasons)
Don't hold references to objects longer than you need
Dispose of objects that implement IDisposable as soon as you're done with them (use the using syntax)
Properly implement the IDisposable interface
The main sources of memory leaks I can think of are:
Keeping references to objects you don't need any more (usually in some sort of collection). Remember that everything you add to a collection you still hold a reference to will stay in memory.
Having circular references, e.g. having delegates registered with an event. So even though you explicitly don't reference an object, it can't get garbage collected because one of its methods is registered as a delegate with an event. In these cases you need to remember to remove the delegate before discarding the reference.
Interoperating with native code and failing to free the native resources. Even if you use managed wrappers that implement finalizers, the CLR often doesn't clean them up fast enough, because it doesn't understand their memory footprint. You should use the using(IDisposable ){} pattern.
One other thing to consider for memory management is if you are implementing any Observer patterns and not disposing of the references correctly.
For instance:
Object A watches Object B
Object B is disposed. If the reference from A to B is not disposed of properly, the GC will not properly dispose of the object. Because the event handler is still assigned, the GC doesn't see it as an unused resource.
If you have a small set of objects you're working with, this may be irrelevant. However, if you're working with thousands of objects, this can cause a gradual increase in memory over the life of the application.
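Here is a rough sketch of the event-handler situation described above. Publisher, Watcher and HandlerCount are hypothetical names I made up for the example. The point is only that the object raising the event holds a delegate to the subscriber, so the subscriber stays reachable until the handler is removed.

```csharp
using System;

// Hypothetical long-lived object that raises an event.
public sealed class Publisher
{
    public event EventHandler Changed;

    // Number of handlers currently registered; each one keeps its target object reachable.
    public int HandlerCount
    {
        get { return Changed == null ? 0 : Changed.GetInvocationList().Length; }
    }

    public void RaiseChanged()
    {
        EventHandler handler = Changed;
        if (handler != null) handler(this, EventArgs.Empty);
    }
}

// Hypothetical subscriber that removes its handler when disposed.
public sealed class Watcher : IDisposable
{
    private readonly Publisher _publisher;
    public int Notifications { get; private set; }

    public Watcher(Publisher publisher)
    {
        _publisher = publisher;
        _publisher.Changed += OnChanged; // the publisher now references this watcher
    }

    private void OnChanged(object sender, EventArgs e)
    {
        Notifications++;
    }

    public void Dispose()
    {
        _publisher.Changed -= OnChanged; // without this, the publisher keeps the watcher alive
    }
}
```

Creating watchers inside a using block (or calling Dispose when you are done with them) brings the handler count back to zero, and the watchers become eligible for collection.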
There are some great memory management software applications to monitor what's going on with the heap of your application. I found great benefit from utilizing .Net Memory Profiler.
HTH
I recommend using .NET Memory Profiler
.NET Memory Profiler is a powerful tool for finding memory leaks and optimizing the memory usage in programs written in C#, VB.NET or any other .NET Language.
.NET Memory Profiler will help you to:
View real-time memory and resource information
Easily identify memory leaks by collecting and comparing snapshots of .NET memory
Find instances that are not properly disposed
Get detailed information about unmanaged resource usage
Optimize memory usage
Investigate memory problems in production code
Perform automated memory testing
Retrieve information about native memory
Take a look at their video tutorials:
http://memprofiler.com/tutorials/
Others have already mentioned the importance of IDisposable, and some of the things to watch out for in your code.
I wanted to suggest some additional resources; I found the following invaluable when learning the details of .NET GC and how to trouble-shoot memory issues in .NET applications.
CLR via C# by Jeffrey Richter is an excellent book. Worth the purchase price just for the chapter on GC and memory.
This blog (by a Microsoft "ASP.NET Escalation Engineer") is often my go-to source for tips and tricks for using WinDbg, SOS, and for spotting certain types of memory leaks. Tess even designed .NET debugging demos/labs which will walk you through common memory issues and how to recognize and solve them.
Debugging Tools for Windows (WinDbg, SOS, etc)
You can use tools like CLR Profiler. It takes some time to learn how to use it correctly, but it is free. (It has helped me find memory leaks several times.)
The best way to ensure that objects get deleted, or in .NET lingo, garbage-collected, is to ensure that all root references (references that can be traced through methods and objects to the first method on a thread's call stack) to an object are set to null.
The GC cannot, and will not, collect an object if there are any rooted references to it, no matter whether it implements IDisposable or not.
Circular references impose no penalty or possibility of memory leaks, as the GC marks which objects it has visited in the object graph. With delegates or event handlers, however, it is common to forget to remove the reference from an event to a target method, so the object that contains the target method can't be collected as long as the object owning the event is rooted.
What are the best ways to get things deleted?
NOTE: the following works only for types containing unmanaged resources. It doesn't help with purely managed types.
Probably the best method is to implement and follow the IDisposable pattern; and call the dispose method on all objects implementing it.
The 'using' statement is your best friend. Loosely put, it will call dispose for you on objects implementing IDisposable.
I'm currently working on a ray-tracer in C# as a hobby project. I'm trying to achieve a decent rendering speed by implementing some tricks from a C++ implementation and have run into a spot of trouble.
The objects in the scenes which the ray-tracer renders are stored in a KdTree structure and the tree's nodes are, in turn, stored in an array. The optimization I'm having problems with is while trying to fit as many tree nodes as possible into a cache line. One means of doing this is for nodes to contain a pointer to the left child node only. It is then implicit that the right child follows directly after the left one in the array.
The nodes are structs and during tree construction they are successfully put into the array by a static memory manager class. When I begin to traverse the tree it, at first, seems to work just fine. Then at a point early in the rendering (about the same place each time), the left child pointer of the root node is suddenly pointing at a null pointer. I have come to the conclusion that the garbage collector has moved the structs, as the array lies on the heap.
I've tried several things to pin the addresses in memory but none of them seems to last for the entire application lifetime as I need. The 'fixed' keyword only seems to help during single method calls, and declaring 'fixed' arrays can only be done on simple types, which a node isn't. Is there a good way to do this, or am I just too far down the path of stuff C# wasn't meant for?
Btw, changing to C++, while perhaps the better choice for a high performance program, is not an option.
Firstly, if you're using C# normally, you can't suddenly get a null reference due to the garbage collector moving stuff, because the garbage collector also updates all references, so you don't need to worry about it moving stuff around.
You can pin things in memory but this may cause more problems than it solves. For one thing, it prevents the garbage collector from compacting memory properly, and may impact performance in that way.
One thing I would say from your post is that using structs may not help performance as you hope. C# fails to inline any method calls involving structs, and even though they've fixed this in their latest runtime beta, structs frequently don't perform that well.
Personally, I would say C++ tricks like this don't generally tend to carry over too well into C#. You may have to learn to let go a bit; there can be other more subtle ways to improve performance ;)
What is your static memory manager actually doing? Unless it is doing something unsafe (P/Invoke, unsafe code), the behaviour you are seeing is a bug in your program, and not due to the behaviour of the CLR.
Secondly, what do you mean by 'pointer', with respect to links between structures? Do you literally mean an unsafe KdTree* pointer? Don't do that. Instead, use an index into the array. Since I expect that all nodes for a single tree are stored in the same array, you won't need a separate reference to the array. Just a single index will do.
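A minimal sketch of the index idea, assuming the layout from the question (right child stored directly after the left child). KdNode, KdTreeDemo and the single Split field are invented for illustration, and the tree is reduced to one dimension to keep it short; a real kd-tree node would also carry the split axis.

```csharp
using System;

// Hypothetical compact node: children are addressed by array index, not by pointer.
public struct KdNode
{
    public float Split;     // split plane position (illustrative field)
    public int LeftIndex;   // index of the left child, or -1 for a leaf

    public bool IsLeaf { get { return LeftIndex < 0; } }
    // The right child is implicit: it is stored directly after the left child.
    public int RightIndex { get { return LeftIndex + 1; } }
}

public static class KdTreeDemo
{
    // Walks from the root to a leaf, going left when the query value is below the split plane.
    public static int FindLeaf(KdNode[] nodes, float value)
    {
        int current = 0; // root is stored at index 0
        while (!nodes[current].IsLeaf)
        {
            current = value < nodes[current].Split
                ? nodes[current].LeftIndex
                : nodes[current].RightIndex;
        }
        return current;
    }

    // Builds a three-node tree: root at 0, left leaf at 1, right leaf at 2.
    public static KdNode[] BuildTinyTree()
    {
        return new[]
        {
            new KdNode { Split = 5f, LeftIndex = 1 },
            new KdNode { Split = 0f, LeftIndex = -1 },
            new KdNode { Split = 0f, LeftIndex = -1 },
        };
    }
}
```

Because the links are plain ints, the GC is free to move the array; the indices stay valid and nothing needs to be pinned.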
Finally, if you really really must use KdTree* pointers, then your static memory manager should allocate a large block using e.g. Marshal.AllocHGlobal or another unmanaged memory source; it should both treat this large block as a KdTree array (i.e. index a KdTree* C-style) and it should suballocate nodes from this array, by bumping a "free" pointer.
If you ever have to resize this array, then you'll need to update all the pointers, of course.
The basic lesson here is that unsafe pointers and managed memory do not mix outside of 'fixed' blocks, which of course have stack frame affinity (i.e. when the function returns, the pinned behaviour goes away). There is a way to pin arbitrary objects, like your array, using GCHandle.Alloc(yourArray, GCHandleType.Pinned), but you almost certainly don't want to go down that route.
You will get more sensible answers if you describe in more detail what you are doing.
If you really want to do this, you can use the GCHandle.Alloc method to pin an object without it being automatically released at the end of the scope, as happens with the fixed statement.
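For reference, a small sketch of what pinning with GCHandle looks like. PinningDemo and its method are hypothetical names; GCHandle.Alloc, AddrOfPinnedObject, Free and Marshal.ReadInt32 are the real APIs. Here the handle is freed in the same method to keep the example short, but it could just as well be stored in a field and freed much later.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PinningDemo
{
    // Pins an int array, reads one element through its raw address, then unpins it.
    public static int ReadThroughPinnedAddress(int[] data, int index)
    {
        GCHandle handle = GCHandle.Alloc(data, GCHandleType.Pinned);
        try
        {
            IntPtr address = handle.AddrOfPinnedObject(); // stable until Free() is called
            return Marshal.ReadInt32(address, index * sizeof(int));
        }
        finally
        {
            handle.Free(); // forgetting this keeps the array pinned (and alive) indefinitely
        }
    }
}
```

Note that pinning this way only works for arrays of blittable element types, so a node struct would have to contain only such fields.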
But, as other people have been saying, doing this is putting undue pressure on the garbage collector. What about just creating a struct that holds onto a pair of your nodes and then managing an array of NodePairs rather than an array of nodes?
If you really do want to have completely unmanaged access to a chunk of memory, you would probably be better off allocating the memory directly from the unmanaged heap rather than permanently pinning a part of the managed heap (this prevents the heap from being able to properly compact itself). One quick and simple way to do this would be to use Marshal.AllocHGlobal method.
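A rough sketch of the unmanaged-heap approach, assuming a simple fixed-size buffer of ints. UnmanagedIntBuffer is a made-up class; Marshal.AllocHGlobal, Marshal.FreeHGlobal, Marshal.ReadInt32 and Marshal.WriteInt32 are the real calls. Wrapping the block in an IDisposable class is one way to make sure it gets freed.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical fixed-size buffer of ints living on the unmanaged heap.
public sealed class UnmanagedIntBuffer : IDisposable
{
    private IntPtr _memory;
    private readonly int _count;

    public UnmanagedIntBuffer(int count)
    {
        _count = count;
        _memory = Marshal.AllocHGlobal(count * sizeof(int)); // never moved by the GC
    }

    public int this[int index]
    {
        get { Check(index); return Marshal.ReadInt32(_memory, index * sizeof(int)); }
        set { Check(index); Marshal.WriteInt32(_memory, index * sizeof(int), value); }
    }

    private void Check(int index)
    {
        if (_memory == IntPtr.Zero) throw new ObjectDisposedException(nameof(UnmanagedIntBuffer));
        if (index < 0 || index >= _count) throw new ArgumentOutOfRangeException(nameof(index));
    }

    // The GC will not free this memory for us; it must be released explicitly.
    public void Dispose()
    {
        if (_memory == IntPtr.Zero) return;
        Marshal.FreeHGlobal(_memory);
        _memory = IntPtr.Zero;
    }
}
```

The memory returned by AllocHGlobal is not zeroed, so each slot should be written before it is read.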
Is it really prohibitive to store the pair of array reference and index?
What is your static memory manager actually doing? Unless it is doing something unsafe (P/Invoke, unsafe code), the behaviour you are seeing is a bug in your program, and not due to the behaviour of the CLR.
I was in fact speaking about unsafe pointers. What I wanted was something like Marshal.AllocHGlobal, though with a lifetime exceeding a single method call. On reflection it seems that just using an index is the right solution, as I might have gotten too caught up in mimicking the C++ code.
One thing I would say from your post is that using structs may not help performance as you hope. C# fails to inline any method calls involving structs, and even though they've fixed this in their latest run-time beta, structs frequently don't perform that well.
I looked into this a bit and I see it has been fixed in .NET 3.5SP1; I assume that's what you were referring to as the run-time beta. In fact, I now understand that this change accounted for a doubling of my rendering speed. Now, structs are aggressively in-lined, improving their performance greatly on X86 systems (X64 had better struct performance in advance).