Removing List items without any other reference - c#

I have a List of objects that can be accessed by multiple users through a WebService. However, the number of objects in the list is steadily growing, so I need some memory management.
I would like to remove all elements from the list that are not used by any user. However, I cannot do this simply by calling the GC, because there is still one reference (the one from the List), and I don't know how to get the number of references to an object.
So, is there a way to clear all objects that have just one reference? Or to get the number of references? Or to determine whether there is no other reference outside the List? Any solution is welcome.

You can use a so-called weak list.
Basically, a weak list is a list whose references are "ignored" by the GC. So although the list still references an item, that reference is not counted, and (depending on which implementation of weak list you use) the item is removed from the list automatically at some point.
Unfortunately, there is no direct implementation of a weak list in the .NET Framework. There is the ConditionalWeakTable, though, which you might be able to use like a list, and there are several examples of weak lists on the web which use the WeakReference type or similar mechanisms.
Examples:
Is there a way to do a WeakList or WeakCollection (like WeakReference) in CLR?
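A minimal sketch of the weak list idea, built on WeakReference<T>. The WeakList<T> class and its member names are hypothetical (nothing like it ships with the framework), and when a dead entry disappears depends entirely on when the GC happens to run.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical WeakList<T>: stores items without keeping them alive.
public sealed class WeakList<T> where T : class
{
    private List<WeakReference<T>> _items = new List<WeakReference<T>>();
    private readonly object _gate = new object();

    public void Add(T item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));
        lock (_gate) { _items.Add(new WeakReference<T>(item)); }
    }

    // Returns strong references to the items that are still alive and
    // drops the entries whose targets have already been collected.
    public List<T> GetLiveItems()
    {
        var live = new List<T>();
        lock (_gate)
        {
            var stillAlive = new List<WeakReference<T>>(_items.Count);
            foreach (WeakReference<T> weak in _items)
            {
                if (weak.TryGetTarget(out T target))
                {
                    live.Add(target);
                    stillAlive.Add(weak);
                }
            }
            _items = stillAlive;
        }
        return live;
    }
}
```

An item stays in the list only as long as some user of the WebService holds a strong reference to it. Once the last strong reference is gone and a collection has run, the next call to GetLiveItems drops the entry.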

Related

What, exactly, do MemoryCache's memory limits mean?

System.Runtime.Caching.MemoryCache is a class in the .NET Framework (version 4+) that caches objects in memory, using strings as keys. Unlike a plain System.Collections.Generic.Dictionary<string, object>, this class has all kinds of bells and whistles that let you configure how big the cache can grow (in either absolute or relative terms), set different expiration policies for different cache items, and much more.
My questions relate to the memory limits. None of the docs on MSDN seem to explain this satisfactorily, and the code on Reference Source is fairly opaque. Sorry about piling all of this into one SO "question", but I can't figure out how to take some out into their own questions, because they're really just different views of one overall question: "how do you reconcile idiomatic C#/.NET with the notion of a generally useful in-memory cache that has configurable memory limits that's implemented nearly entirely in managed code?"
Do key sizes count towards the space that the MemoryCache is considered to take up? What about keys in the intern pool, each of which should only add the size of an object reference to the size of the cache?
Does MemoryCache consider more than just the size of the object references that it stores when determining the size of the object being stored in the pool? I mean... it has to, right? Otherwise, the configuration options are extremely misleading for the common-case... for the remaining questions, I'm going to assume that it does.
Given that MemoryCache almost certainly considers more than the size of the object references of the values stored in the cache, how deep does it go?
If I were implementing something like this, I would find it very difficult to consider the memory usage of the "child" members of individual objects, without also pulling in "parent" reference properties.
e.g., imagine a class in a game application, Player. Player has some player-specific state that's encapsulated in a public PlayerStateData PlayerState { get; } property that encapsulates what direction the player is looking, how many sprockets they're holding, etc., as well as a reference to the entire game's state public GameStateData GameState { get; } that can be used to get back to the game's (much larger) state from a method that only knows about a player.
Does MemoryCache consider both PlayerState and GameState when considering the size of the contribution to the cache?
Maybe it's more like "what's the total size on the managed heap taken up by the objects directly stored in the cache, and everything that's reachable through members of those objects"?
It seems like it would be silly to multiply the size of GameState's contribution to the limit by 5 just because 5 players are cached... but then again, a likely implementation might do just that, and it's difficult to count PlayerState without counting GameState.
If an object is stored multiple times in the MemoryCache, does each entry count separately towards the limit?
Related to the previous one, if an object is stored directly in the MemoryCache, but also indirectly through another object's members, what impact does either one have on the memory limit?
If an object is stored in the MemoryCache, but also referenced by some other live objects completely disconnected from the MemoryCache, which objects count against the memory limit? What about if it's an array of objects, some (but not all) of which have incoming external references?
My own research led me to SRef.cs, which I gave up on trying to understand after getting here, which later leads here. Guessing the answers to all these questions would revolve around finding and meditating on the code that ultimately populated the INT64 that's stored in that handle.
I know this is late but I've done a lot of digging in the source code to try to understand what is going on and I have a fairly good idea now. I will say that MemoryCache is the worst documented class on MSDN, which kind of baffles me for something intended to be used by people trying to optimize their applications.
MemoryCache uses a special "sized reference" to measure the size of objects. It all looks like a giant hack in the memory cache source code involving reflection to wrap an internal type called "System.SizedReference", which from what I can tell causes the GC to set the size of the object graph it points to during gen 2 collections.
From my testing, this WILL include the size of parent objects, and thus all child objects referenced by the parent etc, BUT I've found that if you make references to parent objects weak references (i.e. via WeakReference or WeakReference<>) then it is no longer counted as part of the object graph, so that is what I do for all cache objects now.
I believe cache objects need to be completely self-contained or use weak references to other objects for the memory limit to work at all.
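As a rough illustration of that workaround, here is how the Player example from the question could hold its parent through a weak reference. The class shapes are assumptions based on the question's description, and whether this actually changes what MemoryCache measures rests on the testing described above, not on anything documented.

```csharp
using System;

public sealed class GameStateData
{
    public int Level { get; set; }
}

public sealed class PlayerStateData
{
    public int Sprockets { get; set; }
}

public sealed class Player
{
    // Weak back-reference: the shared game state is not strongly
    // reachable through a cached Player.
    private readonly WeakReference<GameStateData> _gameState;

    public Player(PlayerStateData playerState, GameStateData gameState)
    {
        PlayerState = playerState;
        _gameState = new WeakReference<GameStateData>(gameState);
    }

    // Self-contained child state, held strongly.
    public PlayerStateData PlayerState { get; }

    // Returns null if the game state has already been collected.
    public GameStateData GameState =>
        _gameState.TryGetTarget(out GameStateData state) ? state : null;
}
```

Something else (the game itself, for instance) has to keep the GameStateData alive; callers of Player.GameState need to handle the null case.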
If you want to play with it yourself, just copy the code from SRef.cs, create an object graph and point a new SRef instance to it, and then call GC.Collect. After the collection the approximate size will be set to the size of the object graph.

How to find out if an object is referencing another object?

I'm having trouble creating copies of my class instances from a dictionary of templates. It appears that MemberwiseClone() leaves some fields referenced to the dictionary's template fields. I'd like to be able to see if that's so in a convenient way, like Visual Studio's DataTips provide.
Is there a way to find out if an instance of a reference type object (or its fields) is referencing another instance of the same type (after memberwise cloning)?
The rule is that any value type will be copied and any reference type will only copy the reference. It is a shallow copy.
If that is not the behaviour that you want then you need to roll your own clone method.
You are probably talking about a deep copy, in which case this will tell you what you need to know: How do you do a deep copy of an object in .NET (C# specifically)?
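A small sketch of the difference, using a hypothetical Template class. Object.ReferenceEquals is a direct way to check whether a field in the clone still points at the same instance as the template's field.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical template class, for illustration only.
public sealed class Template
{
    public int Id;                                  // value type: copied
    public List<string> Tags = new List<string>();  // reference type: only the reference is copied

    // Shallow copy: the clone shares the Tags list with the original.
    public Template ShallowCopy() => (Template)MemberwiseClone();

    // Hand-rolled deeper copy: the clone gets its own list.
    // Strings are immutable, so copying the list itself is enough here.
    public Template DeepCopy()
    {
        var copy = (Template)MemberwiseClone();
        copy.Tags = new List<string>(Tags);
        return copy;
    }
}
```

With these definitions, object.ReferenceEquals(original.Tags, original.ShallowCopy().Tags) is true, while the same check against DeepCopy() is false.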
As for counting the number of references to an instance, Eric Lippert says that C# does not do reference counting (see "C# - Get number of references to object"), so again you would have to roll your own. But I don't think that is what you want to do.
You could use a memory profiler for manually checking the references. See .NET Memory Profiling Tools.
One "feature" of Java is that there's really only one non-primitive type: an object reference, which may be used in all sorts of ways. While that makes the framework easy to implement, it means that the type of a variable is insufficient to describe its meaning. While .net improves on Java in many ways, it shares that fundamental weakness.
For example, suppose an object George has a field Bob of type IList<String> [or, for Java, List<String>]. There are at least five fundamentally different things such a field could represent:
1. It holds a reference to an object holding a set of strings to which it will never allow any changes. If item #5 of that list is "Barney", then item #5 on that list will never be anything other than "Barney". In this scenario, `Bob` encapsulates only immutable aspects of the list. Further, the reference may be freely shared to any code that's interested in that aspect of George's state.
2. It holds a reference to an object holding a set of strings which may be modified by anyone holding a reference, but no entity with a reference to that object will modify the list nor allow it to be exposed to anything that might do so. In other words, while the list would allow its contents to be changed, nothing will in fact ever alter those contents. As above, `Bob` encapsulates only immutable aspects of the list, but `George` is responsible for maintaining such immutability by exposing the reference only to code that can be trusted not to modify the list.
3. It holds the only reference, anywhere in the universe, to an object holding a set of strings which it modifies at will. In this scenario, `Bob` encapsulates the *mutable state* of the list. If one copies `George`, one must make a new list with the same items as the old, and give the copy a reference to that. Note also that `George` cannot pass a reference to any code that might persist the reference, whether or not that code would try to modify the list.
4. It holds a reference to a list which is "owned" by some other object, which will be used either to add items to the list for the other object's benefit, or to observe things the other object has put in the list for `George`'s benefit. In this scenario, `Bob` encapsulates the *identity* of the list. In a correct clone, `Bob` must identify the *same list* as in the original.
5. It holds a reference to a list which it owns, and which will be mutated, but to which some other objects also hold a reference (perhaps so they can add things to the list for `George`'s benefit, or perhaps so they can see things `George` does with the list). In this scenario, `Bob` encapsulates *both mutable state and identity*. The existence of a field which encapsulates both aspects means that *it is not possible to make a semantically-correct copy of `George` without the cooperation of other objects*.
In short, it's possible for Bob to encapsulate the list's mutable state, its identity, both, or neither (immutable state, other than identity, is a 'freebie'). If it encapsulates only mutable state, a semantically-correct copy of George must have Bob reference a different list which is initialized with the same contents. If it encapsulates only identity, a semantically-correct copy must have Bob reference the same list. If it encapsulates both mutable state and identity, George cannot be properly cloned in isolation. Fields that do neither may be copied or not, as convenient.
If one can properly determine which fields encapsulate the referenced objects' mutable states, which ones encapsulate identity, and which ones both, it will be obvious what a semantically-correct cloning operation should do. Unfortunately, there's no standard convention in the Framework for categorizing fields in such fashion, so you'll have to come up with your own method and then a cloning scheme that uses it.
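A sketch of what such a hand-written scheme might look like for two of the cases above (owned mutable state and identity only). The field names and the Clone method are hypothetical; the role of each field is recorded in comments only, since the Framework offers no convention for it.

```csharp
using System;
using System.Collections.Generic;

public sealed class George
{
    // Owned mutable state: only George holds this reference,
    // so a clone needs its own list with the same contents.
    private readonly List<string> _ownedNotes;

    // Identity only: the list belongs to some other object,
    // so a clone must refer to the very same list.
    private readonly IList<string> _sharedInbox;

    public George(IList<string> sharedInbox)
        : this(new List<string>(), sharedInbox) { }

    private George(List<string> ownedNotes, IList<string> sharedInbox)
    {
        _ownedNotes = ownedNotes;
        _sharedInbox = sharedInbox;
    }

    public IList<string> SharedInbox => _sharedInbox;
    public int NoteCount => _ownedNotes.Count;
    public void AddNote(string note) => _ownedNotes.Add(note);

    // Copies each field according to its role.
    public George Clone() =>
        new George(new List<string>(_ownedNotes), _sharedInbox);
}
```

A field that encapsulates both mutable state and identity has no counterpart here, because no Clone method on George alone can handle it correctly.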

Is there a way to return all unreferenced objects?

I have this "Linked Web"(?) data structure, i.e. each object has a bunch of references to other objects.
So I wrote a method that is supposed to 'remove' a passed object by removing all references to it.
I need to test it and make sure that after I run the method, the particular object I want to remove is not referenced by anything else.
How could I do this?
One idea would be to force a garbage collection, then run my delete-object method, and then force another GC to see if it found an object.
If it found an object for deletion, then I would assume that my method works,
but if it found nothing to collect, then I would assume that something is still referencing it, and I would have to plug that leak.
Is this possible? How?
Thanks,
Ryan
You can keep track of each object via a WeakReference and check the IsAlive property after a garbage collection.
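A minimal sketch of that check. CollectionProbe and its method names are hypothetical. The object is created inside a separate, non-inlined method so that no local variable in the test keeps it alive; even so, the GC gives no hard guarantee about when an unreachable object is collected, and debug builds may extend lifetimes, so treat the result as a hint.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class CollectionProbe
{
    // Runs the factory (for example: create the object, add it to the
    // web, then unhook it) and returns only a weak reference to the result.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference Track(Func<object> create)
    {
        return new WeakReference(create());
    }

    // Forces a full collection and reports whether the target is gone.
    public static bool IsCollectedAfterGc(WeakReference weak)
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        return !weak.IsAlive;
    }
}
```

If IsCollectedAfterGc returns false after the removal method has run, something is still holding a strong reference to the object.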
This strikes me as an overly convoluted way to test whether your logic is correct. I would construct a couple of unit tests in the following manner.
1. Add an object to the web. In the test, keep a reference to the object.
2. Call the logic that unhooks the reference from the web.
3. Iterate over (or traverse) your web and use Object.ReferenceEquals to see if the reference you're looking for is still hanging around. Since you have numerous possible connections, this may not be the swiftest of operations. However, in a relatively small test it should be fine.
There is nothing in your scenario that leads me to believe you need to leverage garbage collection to test your data structure. Also, I believe your data structure is correctly termed a graph. Good luck
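The traversal step could be sketched like this, assuming a hypothetical Node class whose Links list holds its outgoing references. The visited set guards against cycles.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical graph node; the real structure may look different.
public sealed class Node
{
    public List<Node> Links { get; } = new List<Node>();
}

public static class GraphCheck
{
    // True if any node reachable from start (including start itself)
    // still links to target.
    public static bool IsReferenced(Node start, Node target)
    {
        // Node does not override Equals, so the set compares by reference.
        var visited = new HashSet<Node>();
        var pending = new Stack<Node>();
        pending.Push(start);

        while (pending.Count > 0)
        {
            Node current = pending.Pop();
            if (!visited.Add(current)) continue; // already examined

            foreach (Node next in current.Links)
            {
                if (object.ReferenceEquals(next, target)) return true;
                pending.Push(next);
            }
        }
        return false;
    }
}
```

A unit test would build a small graph, call the removal logic, and assert that IsReferenced returns false for the removed node.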

C# reference collection for storing reference types

I would like to implement a collection (something like List<T>) that holds all the objects I have created during the entire lifespan of my application, as if it were an array of pointers in C++. The idea is that when my process starts I can use a central factory to create all objects and then periodically validate/invalidate their state. Basically, I want to make sure that my process only deals with valid instances and that I don't re-fetch information I have already fetched from the database. So all my objects will basically be in one place: my collection. A cool thing I can do with this is avoid database calls for data I already have (even if I updated it after retrieval, it's still up to date, unless of course some other process updated it, but that's a different concern). I don't want to be calling new Customer("James Thomas"); again if I already initialized James Thomas sometime in the past. Currently I end up with multiple copies of the same object across the appdomain, some out of sync and others in sync, and even though I deal with this using a timestamp field on the MSSQL server, I'd like to keep only one copy per customer in my appdomain (per process would be better, if possible).
I can't use regular collections like List or ArrayList, for example, because I cannot pass parameters by their real local reference (using ref) to their existing Add() methods where I'm creating them, so that's not too good, I think. So how can this be implemented, or can it be implemented at all? A 'linked list' type of class with all methods working with ref & out params is what I'm thinking of now, but it may get ugly pretty quickly. Is there another way to implement such a collection, like RefList<T>.Add(ref T obj)?
So the bottom line is: I don't want to re-create an object if I've already created it before during the entire application life, unless I decide to re-create it explicitly (maybe it's out of date or something, so I have to fetch it again from the db). Are there alternatives, maybe?
The easiest way to do what you're trying to accomplish is to create a wrapper that holds on to the list. This wrapper will have an add method which takes in a ref. In the add, it looks up the value in the list and creates it when it can't find the value. Or use a cache.
But... this statement would make me worry:
"I don't want re-create an object if I've already created it before during the entire application life"
As Raymond Chen points out, a cache with a bad policy is another name for a memory leak. What you've described is a cache with no policy.
To fix this, for a non-web app you should consider using either System.Runtime.Caching (for 4.0) or the Enterprise Library Caching Block (for 3.5 and earlier). If this is a web app, then you can use System.Web.Caching. Or, if you must roll your own, at least get a sensible policy in place.
All of this of course assumes that your database's caching is insufficient.
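If you do roll your own, a "get or create" wrapper with at least a time-based expiry might look roughly like the sketch below. ExpiringCache and its members are hypothetical names, not an existing framework type; the clock is injectable only so the expiry can be tested, and entries for keys that are never requested again are not cleaned up here, so a real policy would also need a size limit or a periodic sweep.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical roll-your-own cache with a simple time-to-live policy.
public sealed class ExpiringCache<TKey, TValue>
{
    private struct Entry
    {
        public TValue Value;
        public DateTime ExpiresAt;
    }

    private readonly Dictionary<TKey, Entry> _entries = new Dictionary<TKey, Entry>();
    private readonly object _gate = new object();
    private readonly TimeSpan _timeToLive;
    private readonly Func<DateTime> _clock;

    public ExpiringCache(TimeSpan timeToLive, Func<DateTime> clock = null)
    {
        _timeToLive = timeToLive;
        _clock = clock ?? (() => DateTime.UtcNow);
    }

    // Returns the cached value, or creates and stores a new one
    // when the key is missing or its entry has expired.
    public TValue GetOrAdd(TKey key, Func<TKey, TValue> create)
    {
        lock (_gate)
        {
            DateTime now = _clock();
            if (_entries.TryGetValue(key, out Entry entry) && entry.ExpiresAt > now)
                return entry.Value;

            TValue value = create(key);
            _entries[key] = new Entry { Value = value, ExpiresAt = now + _timeToLive };
            return value;
        }
    }

    // Explicit invalidation, for when the caller knows the data is stale.
    public bool Invalidate(TKey key)
    {
        lock (_gate) { return _entries.Remove(key); }
    }
}
```

No ref parameters are needed: reference types are already stored by reference, so a cached Customer and the one handed back to the caller are the same instance.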
Using IoC will save you many, many bugs, make your application easier to test, and make your modules less coupled.
IoC performance is pretty good.
I recommend the implementation from the Castle project:
http://stw.castleproject.org/Windsor.MainPage.ashx
You may need a day to learn it, but it's great.

How do I iterate through instances of a class in C#?

Is there a way to iterate over instances of a class in C#? These instances are not tracked or managed in a collection.
Not inside the regular framework. You would need to track them manually.
You can, however, do this in windbg/sos - mainly for debugging purposes (not for routine code).
You have to have references to them somewhere, or at least know where to look, so in identifying them you'd probably put them into a collection which you'd then iterate.
If you don't know where the references live, then you'd have to introduce some kind of tracking mechanism. Perhaps a static collection on the type? It would have to be implemented carefully though.
Not directly.
You could conceptually have your object place a copy of itself into some well-known place (e.g. a static collection) and then use that to iterate, but then you'd have to make sure you cleared the instance out of that collection at some point or else it'll never get garbage collected.
In the comment thread on this post there is an interesting discussion and solution related to this question.
As Marc said, if you want to do it in code, you would need to keep a collection of them. If you are debugging, have a look at this blog post: http://blogs.msdn.com/tess/archive/2006/01/23/516139.aspx
If you need an in-memory collection of all the object instances of a certain type, you could consider using a collection of System.WeakReference objects. A weak reference is a reference that does not keep the object it references alive. This would let you keep a collection of weak references to the object instances you want to enumerate. Have a look at WeakReference in the help for more info.
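One way the tracking could be sketched is with a type that registers itself on construction. Widget and GetLiveInstances are hypothetical names. Because the registry holds only weak references, it does not stop instances from being garbage collected; dead entries are pruned whenever the registry is enumerated.

```csharp
using System;
using System.Collections.Generic;

public class Widget
{
    // Static registry of weak references to every instance created.
    private static readonly List<WeakReference<Widget>> Registry =
        new List<WeakReference<Widget>>();

    public Widget(string name)
    {
        Name = name;
        lock (Registry) { Registry.Add(new WeakReference<Widget>(this)); }
    }

    public string Name { get; }

    // Returns the instances that are still alive, in creation order,
    // and removes registry entries whose targets have been collected.
    public static List<Widget> GetLiveInstances()
    {
        var live = new List<Widget>();
        lock (Registry)
        {
            for (int i = Registry.Count - 1; i >= 0; i--)
            {
                if (Registry[i].TryGetTarget(out Widget widget)) live.Add(widget);
                else Registry.RemoveAt(i);
            }
        }
        live.Reverse(); // the loop above walked the registry backwards
        return live;
    }
}
```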
