System.Runtime.Caching.MemoryCache is a class in the .NET Framework (version 4+) that caches objects in memory, using strings as keys. Beyond what System.Collections.Generic.Dictionary<string, object> offers, this class has all kinds of bells and whistles that let you configure how big the cache can grow (in either absolute or relative terms), set different expiration policies for different cache items, and much more.
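For context, here is roughly what that configuration looks like in code. The key names below are the documented MemoryCache configuration entries, but the cache name, limits and policy are just illustrative:

    using System;
    using System.Collections.Specialized;
    using System.Runtime.Caching;

    var config = new NameValueCollection();
    config.Add("cacheMemoryLimitMegabytes", "100");      // absolute limit
    config.Add("physicalMemoryLimitPercentage", "10");   // relative limit (% of physical RAM)

    var cache = new MemoryCache("demoCache", config);
    cache.Add("some-key", new byte[1024],
              new CacheItemPolicy { SlidingExpiration = TimeSpan.FromMinutes(5) });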
My questions relate to the memory limits. None of the docs on MSDN seem to explain this satisfactorily, and the code on Reference Source is fairly opaque. Sorry about piling all of this into one SO "question", but I can't figure out how to split any of it out into separate questions, because they're really just different views of one overall question: "how do you reconcile idiomatic C#/.NET with the notion of a generally useful in-memory cache that has configurable memory limits and is implemented nearly entirely in managed code?"
Do key sizes count towards the space that the MemoryCache is considered to take up? What about keys in the intern pool, each of which should only add the size of an object reference to the size of the cache?
Does MemoryCache consider more than just the size of the object references that it stores when determining the size of an object stored in the cache? I mean... it has to, right? Otherwise, the configuration options are extremely misleading for the common case. For the remaining questions, I'm going to assume that it does.
Given that MemoryCache almost certainly considers more than the size of the object references of the values stored in the cache, how deep does it go?
If I were implementing something like this, I would find it very difficult to consider the memory usage of the "child" members of individual objects, without also pulling in "parent" reference properties.
e.g., imagine a class in a game application, Player. Player has some player-specific state encapsulated in a public PlayerStateData PlayerState { get; } property (what direction the player is looking, how many sprockets they're holding, etc.), as well as a reference to the entire game's state, public GameStateData GameState { get; }, that can be used to get back to the game's (much larger) state from a method that only knows about a player.
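Sketched as (hypothetical) types, just to make the shape of the example concrete:

    using System.Collections.Generic;

    public class GameStateData
    {
        public List<Player> Players { get; } = new List<Player>();
        // ...the rest of the (much larger) game state
    }

    public class PlayerStateData
    {
        public float FacingDegrees { get; set; }
        public int Sprockets { get; set; }
    }

    public class Player
    {
        public Player(GameStateData game) { GameState = game; PlayerState = new PlayerStateData(); }

        public PlayerStateData PlayerState { get; }   // player-specific state
        public GameStateData GameState { get; }       // back-reference to the entire game's state
    }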
Does MemoryCache consider both PlayerState and GameState when considering the size of the contribution to the cache?
Maybe it's more like "what's the total size on the managed heap taken up by the objects directly stored in the cache, and everything that's reachable through members of those objects"?
It seems like it would be silly to multiply the size of GameState's contribution to the limit by 5 just because 5 players are cached... but then again, a likely implementation might do just that, and it's difficult to count PlayerState without counting GameState.
If an object is stored multiple times in the MemoryCache, does each entry count separately towards the limit?
Related to the previous one, if an object is stored directly in the MemoryCache, but also indirectly through another object's members, what impact does either one have on the memory limit?
If an object is stored in the MemoryCache, but also referenced by some other live objects completely disconnected from the MemoryCache, which objects count against the memory limit? What about if it's an array of objects, some (but not all) of which have incoming external references?
My own research led me to SRef.cs, which I gave up on trying to understand after getting here, which later leads here. My guess is that the answers to all these questions revolve around finding and meditating on the code that ultimately populates the Int64 that's stored in that handle.
I know this is late but I've done a lot of digging in the source code to try to understand what is going on and I have a fairly good idea now. I will say that MemoryCache is the worst documented class on MSDN, which kind of baffles me for something intended to be used by people trying to optimize their applications.
MemoryCache uses a special "sized reference" to measure the size of objects. It all looks like a giant hack in the memory cache source code involving reflection to wrap an internal type called "System.SizedReference", which from what I can tell causes the GC to set the size of the object graph it points to during gen 2 collections.
From my testing, this WILL include the size of parent objects, and thus all the child objects referenced by the parent, etc. BUT I've found that if you make references to parent objects weak references (i.e. via WeakReference or WeakReference<T>), then they are no longer counted as part of the object graph, so that is what I do for all cache objects now.
I believe cache objects need to be completely self-contained or use weak references to other objects for the memory limit to work at all.
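For example (a sketch reusing the hypothetical Player/GameStateData shapes from the question), the cached object can hold its parent through a WeakReference so the parent graph is not measured along with it:

    using System;

    public class CachedPlayer
    {
        private readonly WeakReference<GameStateData> _game;   // weak: not traversed when the entry is sized

        public CachedPlayer(GameStateData game)
        {
            _game = new WeakReference<GameStateData>(game);
        }

        public GameStateData Game
        {
            get
            {
                GameStateData game;
                return _game.TryGetTarget(out game) ? game : null;
            }
        }
    }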
If you want to play with it yourself, just copy the code from SRef.cs, create an object graph and point a new SRef instance to it, and then call GC.Collect. After the collection the approximate size will be set to the size of the object graph.
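If you'd rather not copy SRef.cs, a rough equivalent via reflection looks something like this. The internal type name, constructor visibility and ApproximateSize property are implementation details lifted from the reference source and may differ between framework versions:

    using System;
    using System.Reflection;

    Type sizedRefType = Type.GetType("System.SizedReference", throwOnError: true);

    object graph = new byte[1024 * 1024];                 // any object graph you want to measure
    object sizedRef = Activator.CreateInstance(
        sizedRefType,
        BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.CreateInstance,
        null, new object[] { graph }, null);

    GC.Collect();                                          // the size is filled in during a gen 2 collection

    long size = (long)sizedRefType.InvokeMember(
        "ApproximateSize",
        BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.GetProperty,
        null, sizedRef, null);

    Console.WriteLine("Approximate size: {0} bytes", size);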
Say we have a Game class.
The Game class needs to pass down a reference to its SpriteBatch. That is, the class calls a method, passing it along, and that method in turn passes it to other methods, until it is finally used.
Is this bad for performance? Is it better to just use statics?
I see one obvious disadvantage of statics, though: you can't have duplicate, independent instances of that functionality in the same application.
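To make the two alternatives concrete, here is a rough sketch (illustrative XNA-style types and names, not actual project code):

    using System.Collections.Generic;
    using Microsoft.Xna.Framework.Graphics;

    // Option 1: pass the reference down the call chain.
    class Player
    {
        public void Draw(SpriteBatch spriteBatch) { /* spriteBatch.Draw(...) */ }
    }

    class World
    {
        readonly List<Player> players = new List<Player>();

        public void Draw(SpriteBatch spriteBatch)
        {
            foreach (var p in players) p.Draw(spriteBatch);   // the dependency is visible at every call site
        }
    }

    // Option 2: a static holder, set once at startup and read from anywhere.
    static class Globals
    {
        public static SpriteBatch SpriteBatch;   // hidden dependency, but nothing extra passed down the calls
    }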
It is not easy to answer your question, as you have not specifically described your requirements, but generally I can give you some advice.
Always consider encapsulation: do not expose properties if they are not used elsewhere.
Performance: for reference types there is no performance penalty, as they are already reference types; but if your type is a value type, there will be a very small penalty for passing it by value.
So there is a design-versus-performance trade-off. Unless your method is called millions of times, you never have to think twice about a public static property.
There are pros and cons, as with everything.
Whether this is good or bad from a performance point of view depends on how computationally intensive that code is and how often it runs inside your game.
So here are my considerations on the subject.
Passing as a parameter:
Cons: you push one more variable onto the stack for every call in the chain. That is very fast, but again, it depends on how the code in question is used, so avoiding it can bring some benefit; that's why I list this point under cons.
Pros: you explicitly state that the function at the top of the call stack needs that parameter for reading and/or writing, so someone looking at the code can easily see the semantic dependencies of your calls.
Using a static:
Cons: there is no clear evidence (short of direct knowledge or good written documentation) of which parameters would or could affect the computation inside those functions.
Pros: you don't pass it on the stack through every function in the chain.
I would personally recommend passing it as a parameter, because that clearly shows what the calling code depends on, and even if there were some measurable performance drawback, it most probably would not be relevant in your case. But again, as Rico Mariani always suggests: measure, measure, measure...
Statics are mostly not the best way, because if later on you want to make multiple instances, you might be in trouble.
Of course passing references costs a bit of performance, but how much it matters depends on how many instances you create. Unless you are creating millions of objects in a very short time, it is unlikely to be an issue.
I was using a custom method to deep clone objects the other day, and I know you can deep clone in different ways (reflection, binary serialization, etc), so I was just wondering:
What is/are the reason(s) that Microsoft does not include a deep copy method in the framework?
The problem is substantially harder than you seem to realize, at least in the general case.
For starters, a copy isn't just deep or shallow, it's a spectrum.
Let's imagine for a second that we have a list of arrays of strings, and we want to make a copy of it.
We start out at the shallowest level: we just copy the reference to the whole thing into another variable. Any changes to the list referenced by either variable are seen by the other.
So now we go and create a brand new list to give to the second variable, and for each item in the first list we add it to the second list. Now we can modify the list referenced by either variable without the other seeing it. But if we grab the first item of either list and replace a string inside that array, the change is seen through both lists!
Now we go through and create a new list, and for each array in the first list we create a new array, add each of the strings from the underlying array to the new array, and add each of those new arrays to the new list. Now we can mutate any of the arrays in either list without the other seeing the changes. But wait, both lists are still referencing the same strings (which are reference types, after all; internally they hold a character array for their data). What if some mean person were to come along and mutate one of the strings (using unsafe code you could actually do this)? So now you're copying all of the strings too, for a fully deep copy. But what if we don't need to do that? What if we know that nobody is so mean that they would mutate a string? Or, for that matter, what if we know that none of the arrays will be mutated (or that if they are, the changes are supposed to be seen through both lists)?
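In code, the levels just described look roughly like this:

    using System.Collections.Generic;
    using System.Linq;

    var original = new List<string[]> { new[] { "a", "b" }, new[] { "c" } };

    // Shallowest: copy the reference only; both variables see every change.
    var alias = original;

    // One level deeper: a new list, but the same inner arrays.
    var listCopy = new List<string[]>(original);

    // Deeper still: a new list and new arrays; only the strings themselves are still shared.
    var deeperCopy = original.Select(arr => (string[])arr.Clone()).ToList();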
Then of course there are problems such as circular references, and fields in a class that don't really represent its state (e.g. cached values or derived data that a clone could simply re-calculate as needed).
Realistically you'd need to have every type implement ICloneable or some equivalent, and have its own custom code for how to clone itself. This would be a lot of work to maintain for a language, especially since there are so many ways that complex objects could possibly be cloned. The cost would be quite high, and the benefits (outside of a handful of objects for which it is deemed worthwhile to implement clone methods) are generally not worth it. You, as a programmer, can write your own logic for cloning a type based on how deep you know you need to go.
It's similar to how it works (or doesn't work) in C and C++:
To do a deep copy, you actually have to know how different data is interpreted. In trivial cases, a shallow copy (which is provided) is the same as a deep copy. But once this is no longer true, it really depends on the implementation and interpretation. There's no general rule of thumb.
Let's use a game as a simple example:
An NPC object has two integers as members. One integer represents its health points; the other one is its unique ID.
If you clone the NPC, you have to keep the amount of health while changing the unique ID. This is something the compiler/runtime can't determine on its own. You have to code this, essentially telling the program "how to copy".
I can think of two possible solutions:
Add a keyword to denote things that can't be copied. While this sounds like a good idea, it doesn't really solve the issue. You can tell the compiler that UniqueID must not be copied, but at the same time you can't define how the copy should happen. And even if you could, you could just...
Create a copy constructor (C++) or a method to copy/clone the object (C#, e.g. CopyTo()).
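For instance, a sketch of the second option for the NPC example above (names made up):

    class Npc
    {
        public int Health { get; set; }
        public int UniqueId { get; private set; }

        public Npc(int uniqueId) { UniqueId = uniqueId; }

        // "How to copy" has to be spelled out by hand: keep the health, mint a new ID.
        public Npc CloneWithNewId(int newUniqueId)
        {
            return new Npc(newUniqueId) { Health = this.Health };
        }
    }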
Hmm.. My view is that:
A) because you very rarely want the copy to be really deep
B) because the framework cannot guarantee to know how to truly and meaningfully CLONE an object
C) because implementing deep-cloning in a naive way is simple and takes one method and several lines of code using reflection and recursion
but I'll try to find an old MSDN article that covered that
edit: I've not found it :( I'm still sure that I saw it somewhere, but I cannot google it up now. However, here are some useful links about the related ICloneable and similar topics:
http://blogs.msdn.com/b/brada/archive/2004/05/03/125427.aspx
http://blogs.msdn.com/b/mrtechnocal/archive/2009/10/19/why-not-icloneable-t.aspx
https://stackoverflow.com/a/3712564/717732
So, as I've not found the author's words, let me expand the points:
A: because you very rarely want the copy to be really deep
You see, how can the framework guess, in general, how deep the copy should be? Let's assume completely deep, and let's assume it has been implemented. Now we have memberwise-clone and total-clone methods. Still, there are some cases where people will need clone-me-but-not-the-root-base, so they will post yet more questions asking why the total-clone has no way of cutting off the root base. Or the one just above it. Etc. Providing deep-clone solves almost nothing from the .NET team's point of view, as we, the users, will still rant about it just because we see some partial tools, and we're lazy and want to have everything :)
B) because the framework cannot guarantee to know how to truly and meaningfully CLONE an object
Especially with some special objects with handles or native-like IDs, like those from Entity Framework, .NET Remoting proxies, COM wrappers, etc.: you might successfully read and clone the upper layers of the class hierarchy, but eventually, somewhere below, you find some arcane thingies like IntPtrs that you just know you should not copy. Most of the time. But sometimes you can. But the framework's code must be universal. Deep-cloning would either have to be harshly complicated, with many sanity checks against special-looking class members, or it would produce dangerous results if the programmer invoked it on something whose base classes the programmer did not care to analyze.
B+) Also, please note that the more base classes you have in your tree, the more probable it is that they will have some parameterized constructors, which might indicate that direct copying is not a good idea. Directly copiable classes usually have parameterless constructors and all the copiable data accessible through properties.
B++) From the framework designer's point of view, considering memory and speed, shallow copying is almost always very fast, while deep copying is just the opposite. It is beneficial to the framework's and platform's reputation NOT to let developers freely deep-copy huge objects. Anyway, would you need a deep copy if your object were lightweight and simple? :) Not providing a deep copy encourages developers to design around the need for one, which usually makes the application lighter and faster.
C) because implementing deep-cloning in a naive way is simple and takes one method and several lines of code using reflection and recursion
Given a shallow copy, how hard is it to actually write a deep copy? Not so hard! Just implement a method that is given an object obj:
A rough C# sketch of that method, using reflection (MemberwiseClone provides the shallow copy; add using System.Reflection;):

    static object DeepCopier(object obj)
    {
        if (obj == null || obj.GetType().IsValueType || obj is string) return obj;   // immutable leaves stay as-is
        object clone = typeof(object).GetMethod("MemberwiseClone",                   // the shallow copy
            BindingFlags.Instance | BindingFlags.NonPublic).Invoke(obj, null);
        foreach (var f in obj.GetType().GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic))
            f.SetValue(clone, DeepCopier(f.GetValue(obj)));                           // recurse into every field
        return clone;
    }
and well, that's all. As you can see, the field enumeration has to be performed with reflection, and so does reading and writing the fields.
However, this approach is very naive. It has a serious flaw: what if some object has two fields that point to the same other object? We should detect that and do the cloning only once, then assign both fields to that single clone. Also, if an object pointed to by some field has a reference to another object that is also pointed to by yet another object (...), that may also need to be tracked and cloned only once. And what about cycles? If somewhere deep in the tree an object has a reference back to the root, an algorithm like the one above would happily descend and re-copy everything again, and again, and eventually choke with a StackOverflowException.
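One (still simplified) way to patch the sketch above is to carry a dictionary of already-cloned objects, keyed by reference identity, so shared references and cycles map to a single clone. This still ignores arrays and other collection internals:

    using System;
    using System.Collections.Generic;
    using System.Reflection;
    using System.Runtime.CompilerServices;

    // compares objects by reference identity, ignoring any Equals overrides
    sealed class ReferenceComparer : IEqualityComparer<object>
    {
        public new bool Equals(object x, object y) { return ReferenceEquals(x, y); }
        public int GetHashCode(object o) { return RuntimeHelpers.GetHashCode(o); }
    }

    static object DeepCopier(object obj, Dictionary<object, object> seen)
    {
        if (obj == null || obj.GetType().IsValueType || obj is string) return obj;

        object existing;
        if (seen.TryGetValue(obj, out existing)) return existing;   // shared reference or cycle: reuse the clone

        object clone = typeof(object).GetMethod("MemberwiseClone",
            BindingFlags.Instance | BindingFlags.NonPublic).Invoke(obj, null);
        seen.Add(obj, clone);                                        // register before descending into fields

        foreach (var f in obj.GetType().GetFields(
                 BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic))
            f.SetValue(clone, DeepCopier(f.GetValue(obj), seen));

        return clone;
    }

    // usage: DeepCopier(root, new Dictionary<object, object>(new ReferenceComparer()));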
This makes the cloning quite hard to track, and it starts to look more like serialization. In fact, if your class is a DataContract or Serializable, you can simply serialize it and deserialize it to get a perfect deep copy :)
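For a [Serializable] type, the classic round-trip looks like this (BinaryFormatter is used here only as an illustration; any serializer that captures the whole graph will do):

    using System.IO;
    using System.Runtime.Serialization.Formatters.Binary;

    static T DeepCopyViaSerialization<T>(T obj)   // T and everything it references must be [Serializable]
    {
        var formatter = new BinaryFormatter();
        using (var stream = new MemoryStream())
        {
            formatter.Serialize(stream, obj);
            stream.Position = 0;
            return (T)formatter.Deserialize(stream);
        }
    }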
Deep-cloning is hard to do in a universal way unless you know what the object means, what all its fields mean, and which ones should really be cloned and which should be shared. If you, as the developer, know that this is just a data object that is perfectly safe to deep-clone, then why not just make it Serializable? And if you can't make it Serializable, then you probably can't deep-clone it either!
I'm writing an XNA engine and I am storing all of the models in a List. In order to be able to use this throughout the engine, I've made it a public static List<Model> so I can access it from any new classes that I develop. It certainly makes the list of models really easy to get to, but is this the right usage? Or would I be better off actually passing a variable through in a method declaration?
In OOP it's generally advisable to avoid using static methods and properties, unless you have a very good reason to do so. One of the reasons for that is that in the future you may want to have two or more instances of this list for some reason, and then you'll be stuck with static calls.
Static methods and properties are too rigid. As Stevey states it:
"Static methods are as flexible as granite. Every time you use one, you're casting part of your program in concrete. Just make sure you don't have your foot jammed in there as you're watching it harden. Someday you will be amazed that, by gosh, you really DO need another implementation of that dang PrintSpooler class, and it should have been an interface, a factory, and a set of implementation classes. D'oh!"
For game development I advocate "Doing The Simplest Thing That Could Possibly Work". That includes using global variables (public static in C#), if that is an easy solution. You can always turn it into something more formal later. The "find all references" tool in Visual Studio makes this really easy.
That being said, there are very few cases where a global variable is actually the "correct" way to do something. So if you are going to use it, you should be aware of and understand the correct solution, so that you can make the best trade-off between "being lazy" and "writing good code".
If you are going to make something global, you need to fully understand why you are doing so.
In this particular case, it sounds like you're trying to get at content. You should be aware that ContentManager will automatically return the same content object if you ask for it multiple times. So rather than loading models into a global list, consider making your Game class's built-in ContentManager available via a public static property on your Game class.
Or, better still, there's a method that I prefer: I explain it in the answer to another question. Basically you make the content references private static in the classes that use them and pass the ContentManager into public static LoadContent functions. This compartmentalises your use of static to individual classes, rather than using a global that is accessed from all over your program (which would be difficult to extricate later). It also correctly handles loading content at the correct time.
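A rough sketch of that pattern, with made-up class and asset names:

    using Microsoft.Xna.Framework.Content;
    using Microsoft.Xna.Framework.Graphics;

    class Ship
    {
        static Model model;   // shared by every Ship, but private to this class

        public static void LoadContent(ContentManager content)
        {
            model = content.Load<Model>("Models/ship");   // ContentManager caches, so repeated loads are cheap
        }

        // instance methods draw using the statically loaded model
    }

    // In your Game's LoadContent():  Ship.LoadContent(Content);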
I'd avoid using statics as much as possible; over time you'll just end up with spaghetti code.
If you pass it in the constructor, you're eliminating an unnecessary dependency; low coupling is good. The fewer dependencies there are, the better.
I would suggest implementing a singleton object which encapsulates the model list.
Have a look at the MSDN singleton implementation.
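A minimal version, loosely following the static-initialization singleton from that article (names are illustrative):

    using System.Collections.Generic;
    using Microsoft.Xna.Framework.Graphics;

    public sealed class ModelRepository
    {
        private static readonly ModelRepository instance = new ModelRepository();

        private ModelRepository() { Models = new List<Model>(); }

        public static ModelRepository Instance { get { return instance; } }

        public List<Model> Models { get; private set; }
    }

    // usage: ModelRepository.Instance.Models.Add(someModel);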
This is a matter of balance and trade-offs.
Of course, OOP purists will say to avoid such global variables at all costs, since they break code compartmentalization by introducing something that goes "out of the box" for any module, thus making it hard to maintain, change, debug, etc.
However, my personal experience has been that it should be avoided only if you are part of a very large enterprise solutions team, maintaining a very large enterprise-class application.
For other cases, encapsulating globally-accessible data into a "global" object (or a static object, same thing) simplifies OOP coding to a great extent.
You may get a middle ground by writing a global GetModels() function that returns the list of models, or by using DI to automatically inject the list of models.
Ok, the title might sound a bit vague but I really can't think of something clearer here.
I recently was in the position where I needed a Point class, simply with two properties, X and Y and a ctor Point(int x, int y). Nothing fancy. Now this thing already exists in .NET, but this was in a library that handled certain objects on a map and dragging System.Drawing into this just felt ... wrong somehow. Even though System.Drawing.Point fit what I needed perfectly, I now created that struct myself again in that project.
Now I wonder whether that was a right or sensible thing to do. System.Drawing.Point would have come with a reference to that assembly as well, if I remember correctly. And putting something in System.Drawing into a totally non-drawing related context was somehow weird.
Thoughts? What if it wouldn't have implied a reference to another assembly?
I disagree with the other answers so far and say that it actually matters. The Point sample is a simple one, but in general using a class in a context it hasn't been designed for may have undesired effects.
A class may have been implemented for a particular use case only; e.g. it may have no support for thread safety, require use within a particular framework, or expose functionality that is unwanted in your code.
It might especially lead to problems when a newer version of the assembly is deployed, which is no longer compatible with the way that you use it, or the newer version brings additional members and dependencies that you don't want to have in your code.
The context of the namespace is fundamental to understanding the precise function of a class; a Connection class is going to be a very different beast from one namespace to the next. Is a Web.Cache class going to be suitable for caching in other applications, or does it have a fundamental dependency on web infrastructure?
MSDN describes System.Drawing.Point structure as follows:
"Represents an ordered pair of integer x- and y-coordinates that defines a point in a two-dimensional plane."
With such a general description, it could be argued that this structure is only incidentally related to drawing and really belongs in a more fundamental namespace.
However, as it does live in System.Drawing, the implication is that it represents a point within a 2-dimensional drawing space. As such, using it for purposes other than drawing is mis-use of the class; it may function for your needs, but it is not being used for its original purpose.
I wouldn't say that what you did was wrong. However, a namespace is really just a container for declarations within which each name is unique. The namespace name does provide some context to help you find the right function for your need, but don't get hung up on the context not quite fitting your usage; if the object is ideal, then it's ideal. Apart from the using statement, you will probably never actively refer to the namespace again.
If your purpose for the point was in no way drawing related, I think you did the right thing. Using a System.Drawing.Point in code which does nothing drawing related at all may confuse people down the line into thinking it's used for some drawing functionality.
I was in a similar situation recently. Not needing the Point class, but I'll use that as an example.
I made my own Point class because System.Drawing.Point uses ints, and I needed doubles. I realised later that it was a good idea even if I had only needed ints, because I can then extend the class as needed and add methods, interfaces, attributes, etc., which I wouldn't have been able to do had I used System.Drawing.Point.
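For instance, the home-grown version might start as nothing more than this (hypothetical PointD), and can grow members as the project needs them:

    public struct PointD
    {
        public double X { get; private set; }
        public double Y { get; private set; }

        public PointD(double x, double y) : this()
        {
            X = x;
            Y = y;
        }
    }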
There's the principle that you should not duplicate knowledge in your programs, but in some cases such as this one, it can't be avoided.
As for implying reference to another assembly, you can do this:
using Point = System.Drawing.Point; // or whatever your namespace is called
I would have just used the System.Drawing.Point. Why re-create something that already exists? Who cares what the namespace is called if it provides the functionality you need? Just my 2 cents...
If you're not importing any "luggage" (like having to initialize stuff that you don't use) and it's exactly what you need, I'd go for the existing code. There's just no point in recreating existing code. Chances are also that you may later discover functions that do what you need and expect that particular class.
Still, I can identify with feeling strange having an "alien" object in your code. One way to get around this would be to subclass it with your own point. That'll look a bit strange if you're not changing anything in your subclass, but that's only visible in the source of that class, and where you use it, it's consistently named. Also it'll start looking actually smart as soon as you find yourself adding functionality to your custom Point.
I would worry very little about bringing something in from the namespace given that both functionally and conceptually (how it is described in the docs) it fits the goal.
I might hesitate to bring in a new assembly though. In this case, Point is so quick to roll on one's own that I might consider the hassle of doing so less than that of adding another dependency to the assembly I was writing.
Otherwise, as long as I wasn't using it as a key (the GetHashCode implementation in Point isn't great in the way it collides, e.g. for all of {0,1}, {1,0}, {2,3} and {3,2}, if you have a lot of low-valued or rectangularly distributed points), I'd use it.
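For illustration: on the .NET Framework, where Point.GetHashCode is essentially x ^ y, all four of those points hash to the same value:

    using System;
    using System.Drawing;

    Console.WriteLine(new Point(0, 1).GetHashCode());   // 1
    Console.WriteLine(new Point(1, 0).GetHashCode());   // 1
    Console.WriteLine(new Point(2, 3).GetHashCode());   // 1
    Console.WriteLine(new Point(3, 2).GetHashCode());   // 1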