Is there a way to find how much memory is used for a particular object? For example a List. Taking everything into account, like string interning and whatever else is done by the compiler/runtime environment.
ANTS Memory Profiler profiles the memory consumption of .NET code. I've had great results with it in the past.
You'd really have to define exactly what you meant by "how much memory is used for a particular object". For instance, you could mean "if this object were garbage collected, how much would be freed" - or you could mean "how much memory does this object and everything it touches take up."
Your point about string interning is a good example. Suppose you do:
List<string> internedStrings = new List<string>();
List<string> nonInternedStrings = new List<string>();
for (int i = 0; i < 1000; i++)
{
    string tmp = new string(' ', i + 1);
    nonInternedStrings.Add(tmp);
    tmp = string.Intern(tmp);  // Intern is a static method on String
    internedStrings.Add(tmp);
}
Does nonInternedStrings really take up more memory than internedStrings? If internedStrings were garbage collected, it wouldn't free as much memory - but if internedStrings had never been created (including not interning each of its elements) then more memory would never have been required.
If you can be more specific about exactly what you mean, we may be able to help you. It's a complex issue though.
This seems to be a sibling of this Delphi question. A naive algorithm won't take into account the difference between aggregation and composition. Even an algorithm based on mark and sweep won't tell you whether a hash table had to grow its internal array because an object was referenced by it. You probably are better off profiling your application for a variety of scenarios and plotting the resource usage against N, where N is some measure of the scale of your dataset.
Have you tried CLR Profiler 2.0?
I want to process many integers in a class, so I gathered them into an int* array.
int*[] pp = new int*[]{&aaa,&bbb,&ccc};
However, the compiler rejected the code above with the following error:
> You can only take the address of an unfixed expression inside of a fixed statement initializer
I know I can change the code above to avoid this error; however, consider that ddd and eee will join the array in the future.
public enum E {
    aaa,
    bbb,
    ccc,
    _count
}

for (int i = 0; i < (int)E._count; i++)
gg[(int)E.bbb]
Dictionary<string, int> ppp = new Dictionary<string, int>();
ppp["aaa"] = ppp.Count;
ppp["bbb"] = ppp.Count;
ppp["ccc"] = ppp.Count;

gg[ppp["bbb"]]
These solutions work, but they make both the code and the execution time longer.
I have also hoped for an unofficial patch to the compiler, or a new unofficial C# compiler, but I have not seen an available download in many years; it seems very unlikely we will get one.
Are there better ways, so that:
I do not need to count the elements of the array ppp myself.
If the code gets longer, it is only by several characters.
The execution time does not increase much.
Adding ddd and eee to the array takes only one or two statements per new member.
The .NET runtime is a managed execution environment which (among other things) provides garbage collection. The .NET garbage collector (GC) not only manages the allocation and release of memory, but also transparently moves objects around the "managed heap", blocking the rest of your code while doing it.
It also compacts (defragments) the memory by moving longer lived objects together, and even "promoting" them into different parts of the heap, called generations, to avoid checking their status too often.
There is a bunch of memory being copied all the time without your program even realizing it. Since garbage collection is an operation that can happen at any time during the execution of your program, any pointer-related
("unsafe") operations must be done within a small scope, by telling the runtime to "pin" the objects using the fixed keyword. This prevents the GC from moving them, but only for a while.
Using pointers and unsafe code in C# is not only less safe, but also not very idiomatic for managed languages in general. If you come from a C background, you may feel at home with these constructs, but C# has a completely different philosophy: your job as a C# programmer should be to write reliable, readable and maintainable code, and only then think about squeezing out a couple of CPU cycles for performance reasons. You can use pointers from time to time in small functions doing some very specific, time-critical work. But even then it is your duty to profile before making such optimizations. Even the most experienced programmers often fail at predicting bottlenecks before profiling.
Finally, regarding your actual code:
I don't see why you think this:
int*[] pp = new int*[] {&aaa, &bbb, &ccc};
would be any more performant than this:
int[] pp = new int[] {aaa, bbb, ccc};
On a 32-bit machine, an int and a pointer are of the same size. On a 64-bit machine, a pointer is even bigger.
Consider replacing these plain ints with a class of your own which will provide some context and additional functionality/data to each of these values. Create a new question describing the actual problem you are trying to solve (you can also use Code Review for such questions) and you will benefit from much better suggestions.
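As a hedged sketch of that suggestion (all names invented), a small wrapper class gives you shared, updatable values without any unsafe code:

class Counter
{
    public int Value;
}

var aaa = new Counter();
var bbb = new Counter();
var ccc = new Counter();

// adding ddd later is a single extra element in the initializer
Counter[] pp = { aaa, bbb, ccc };

pp[1].Value++;  // updates the same object that bbb refers to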
I was having a discussion with a colleague the other day about this hypothetical situation. Consider this pseudocode:
public void Main()
{
    MyDto dto = Repository.GetDto();
    foreach (var row in dto.Rows)
    {
        ProcessStrings(row);
    }
}

public void ProcessStrings(DataRow row)
{
    string string1 = GetStringFromDataRow(row, 1);
    string string2 = GetStringFromDataRow(row, 2);
    // do something with the strings
}
Then this functionally identical alternative:
public void Main()
{
    string string1 = null;
    string string2 = null;
    MyDto dto = Repository.GetDto();
    foreach (var row in dto.Rows)
    {
        ProcessStrings(row, string1, string2);
    }
}

public void ProcessStrings(DataRow row, string string1, string string2)
{
    string1 = GetStringFromDataRow(row, 1);
    string2 = GetStringFromDataRow(row, 2);
    // do something with the strings
}
How will these differ in processing when running the compiled code? Are we right in thinking the second version is marginally more efficient because the string variables will take up less memory and only be disposed once, whereas in the first version, they're disposed of on each pass of the loop?
Would it make any difference if the strings in the second version were passed by ref or as out parameters?
When you're dealing with "marginally more efficient" levels of optimization, you risk not seeing the whole picture and ending up "marginally less efficient".
This answer here risks the same thing, but with that caveat, let's look at the hypothesis:
> Storing a string into a variable creates a new instance of the string
No, not at all. A string is an object, what you're storing in the variable is a reference to that object. On 32-bit systems this reference is 4 bytes in size, on 64-bit it is 8. Nothing more, nothing less. Moving 4/8 bytes around is overhead that you're not really going to notice a lot.
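You can verify the reference width of the current process yourself (a trivial check, not from the original answer):

// IntPtr.Size is 4 in a 32-bit process and 8 in a 64-bit process
Console.WriteLine(IntPtr.Size);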
So, given the very little information we have about the internals of the methods being called, neither of the two examples creates more or fewer strings than the other; on this count they're equivalent.
So what is different?
Well, in one example you're storing the two string references in local variables. These will most likely live in CPU registers, possibly in stack memory; it's hard to say, it depends on the rest of the code. Does it matter? Highly unlikely.
In the other example you're passing in two parameters as null and then reusing those parameters locally. These parameters can likewise be passed in CPU registers or stack memory, same as the locals. Does it matter? Not at all.
So most likely there is going to be absolutely no difference at all.
Note one thing: you mention "disposal". This term is reserved for objects implementing IDisposable and the act of disposing of them by calling IDisposable.Dispose. Strings are not such objects, so this is not relevant to this question.
If, instead, by disposal you mean "garbage collection", then since I already established that neither of the two examples creates more or less objects than the others due to the differences you asked about, this is also irrelevant.
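To make the terminology concrete, here is a small illustration (not from the original answer): disposal applies to types like MemoryStream, not to strings:

using (var stream = new System.IO.MemoryStream())
{
    // work with the stream
}   // Dispose() runs here because MemoryStream implements IDisposable

string s = "hello";  // strings have no Dispose(); the instance is simply
                     // garbage collected once nothing references it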
This is not important, however. It isn't important what you or I or your colleague thinks is going to have an effect. Knowing is quite different, which leads me to...
The real tip I can give about optimization:
Measure
Measure
Measure
Understand
Verify that you understand it correctly
Change, if possible
You measure, use a profiler to find the real bottlenecks and real time spenders in your code, then understand why those are bottlenecks, then ensure your understanding is correct, then you can see if you can change it.
In your code I will venture a guess that if you were to profile your program you would find that those two examples will have absolutely no effect whatsoever on the running time. If they do have effect it is going to be on order of nanoseconds. Most likely, the very act of looking at the profiler results will give you one or more "huh, that's odd" realizations about your program, and you'll find bottlenecks that are far bigger fish than the variables in play here.
In both of your alternatives, GetStringFromDataRow creates a new string every time. Whether you store a reference to that string in a local variable or in a parameter variable (which in your case is essentially no different from a local variable) does not matter. Imagine you never even assigned the result of GetStringFromDataRow to any variable - an instance of the string is still created and stored somewhere in memory until garbage collected. Passing your strings by reference wouldn't make much difference either: you would be able to reuse the memory location that stores the reference to the created string (you can think of it as the memory address of the string instance), but not the memory for the string contents.
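A small sketch of that last point (invented names, not from the original answer): passing the string by ref lets the callee overwrite the caller's reference, but the new string's contents are allocated on the heap either way:

static void Fill(ref string s)
{
    // a fresh 100-character string is allocated here regardless of
    // how the reference was passed in
    s = new string('x', 100);
}

static void Main()
{
    string s = null;
    Fill(ref s);                         // only the 4/8-byte reference slot is reused
    System.Console.WriteLine(s.Length);  // 100
}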
Is this:
foreach (Type item in myCollection)
{
    StringBuilder sb = new StringBuilder();
}
much slower than:
StringBuilder sb = new StringBuilder();
foreach (Type item in myCollection)
{
    sb = new StringBuilder();
}
In other words, will it really matter where I declare my StringBuilder?
No, it will not matter performance-wise where you declare it.
For general code cleanliness, you should declare it in the innermost scope where it is used, i.e. your first example.
You could maybe gain some performance, if you write this:
StringBuilder sb = new StringBuilder();
foreach (Type item in myCollection)
{
    sb.Length = 0;
}
This way you instantiate the StringBuilder just once and reset its length inside the loop, which should be slightly faster than instantiating a new object each time.
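As a side note (not from the original answer; assumes .NET 4 or later, where StringBuilder.Clear() exists), the same reset can be written more idiomatically:

StringBuilder sb = new StringBuilder();
foreach (Type item in myCollection)
{
    sb.Clear();  // equivalent to sb.Length = 0; reuses the internal buffer
}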
In the 2nd example you're creating an extra instance of StringBuilder. Apart from that they are both the same, so the performance difference is negligible.
There isn't enough code here to clearly indicate a performance difference in your specific case. Having said that, the difference between declaring a reference variable inside of a loop like this vs. outside is trivial for most cases.
The effective difference between your two code samples is that the second will allocate 1 more instance of StringBuilder than the first. The performance impact of this as compared to the rest of your application is essentially nothing.
The best way to check is by trying both methods in a loop, say 100,000 iterations each. Measure how long each set of 100,000 iterations takes and compare the two. I don't think there is a lot of difference.
There is a small difference, though. The first example declares the variable anew on every iteration, while the second uses just one variable. The compiler is smart enough to do some optimizations here, so you won't notice a speed improvement.
However, if you don't need the last object created inside the loop once you're past the loop, the first solution is better. In the second solution it just takes longer before the garbage collector frees the last object created; in the first example the collector can free it a bit sooner. It depends on the rest of the code, but if you store a lot of data in this StringBuilder, the second example might hold on to that memory considerably longer, decreasing the performance of your code after leaving the loop. Then again, if the object eats up 100 KB and you have 16 GB in your machine, no one cares... The garbage collector will eventually free it, probably as soon as you leave the method that contains the loop.
If you have other similar type code segments, you could always profile or put some timers around the code and run a benchmark type test to see for yourself. Another factor would be the memory footprint, which others have commented on.
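If you do put timers around it, a rough sketch with Stopwatch might look like this (the loop body is invented for illustration):

var sw = System.Diagnostics.Stopwatch.StartNew();
for (int i = 0; i < 100000; i++)
{
    var sb = new System.Text.StringBuilder();  // variant 1: declared inside the loop
    sb.Append(i);
}
sw.Stop();
System.Console.WriteLine($"Declared inside: {sw.ElapsedMilliseconds} ms");

Run the same loop again with the StringBuilder hoisted outside and compare the two numbers.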
Is it possible to get the size(in bytes) of a Session object after storing something such as a datatable inside it?
I want to get the size of a particular Session object, such as Session["table1"], not the whole Session collection, so the other question, while helpful, is not quite a duplicate.
You can use marshalling to create a copy of the object; that would give you an approximate figure for how much memory it uses.
But, as always it's impossible to give an exact figure of the memory usage. A DataTable object is not a single solid piece of memory that you can measure. It contains a lot of objects and they have references between them, and there may be several references to the same object which means that there isn't one copy of the object for each reference to it. Each DataRow for example has a reference to the table that it belongs to, but that of course doesn't mean that each row has a complete copy of the entire table.
You could use reflection, see this article.
You might also want to consider having a look at some Memory Performance Counters or perhaps profiling your application with a tool such as DotTrace or the CLR Profiler.
Maybe you can use external tools like CLR Profiler or VSTS Profiler to check it.
This is taken almost line-for-line from the "duplicate question" from the first comment in the question.
long totalSessionBytes;
BinaryFormatter b = new BinaryFormatter();
MemoryStream m = new MemoryStream();  // must be instantiated before serializing into it
b.Serialize(m, Session["table1"]);
totalSessionBytes = m.Length;         // Length is a long
I'm developing an application which currently has hundreds of objects created.
Is it possible to determine (or approximate) the memory allocated by an object (class instance)?
You could use a memory profiler like
.NET Memory Profiler (http://memprofiler.com/)
or
CLR Profiler (free) (http://clrprofiler.codeplex.com/)
A coarse approach could be the following, in case you want to know what's happening with a particular object:
// Measure starting-point memory use
long GC_MemoryStart = System.GC.GetTotalMemory(true);

// Allocate a new byte array of 20000 elements (about 20000 bytes)
byte[] MyByteArray = new byte[20000];

// Obtain measurements after creating the new byte[]
long GC_MemoryEnd = System.GC.GetTotalMemory(true);

// Ensure that the array stays in memory and doesn't get optimized away
GC.KeepAlive(MyByteArray);

// The difference approximates the array's footprint
long bytesUsed = GC_MemoryEnd - GC_MemoryStart;
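Wrapped up as a reusable helper, the same idea might look like this (a sketch, not from the original answer; concurrent allocations on other threads will skew the number):

static long ApproximateSize(Func<object> allocate)
{
    long before = GC.GetTotalMemory(true);  // force a collection first
    object instance = allocate();
    long after = GC.GetTotalMemory(true);
    GC.KeepAlive(instance);                 // keep the object alive past the measurement
    return after - before;
}

// usage: long bytes = ApproximateSize(() => new byte[20000]);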
Process-wide figures can be obtained like this:
var MyProcess = System.Diagnostics.Process.GetCurrentProcess();
long Process_MemoryStart = MyProcess.PrivateMemorySize64;
hope this helps ;)
The ANTS memory profiler will tell you exactly how much is allocated for each object/method/etc.
Here's a related post where we discussed determining the size of reference types.
You can also use WinDbg and either the SOS or SOSEX (like SOS, but with a lot more commands and improvements to some existing ones) WinDbg extensions. The command you would use to analyze an object at a particular memory address is !objsize
One VERY important item to remember is that !objsize only gives you the size of the class itself and DOES NOT necessarily include the size of the aggregate objects contained inside the class - I have no idea why it doesn't do this as it is quite frustrating and misleading at times.
I've created 2 Feature Suggestions on the Connect website that ask for this ability to be included in Visual Studio. Please vote for the items if you would like to see them added as well!
https://connect.microsoft.com/VisualStudio/feedback/details/637373/add-feature-to-debugger-to-view-an-objects-memory-footprint-usage
https://connect.microsoft.com/VisualStudio/feedback/details/637376/add-feature-to-debugger-to-view-an-objects-rooted-references
EDIT:
I'm adding the following to clarify some info from the answer provided by Charles Bretana:
the OP asked about the size of an 'object' not a 'class'. An object is an instance of a class. Maybe this is what you meant?
The memory allocated for an object does not include the JITted code. The JIT code lives in its own 'JIT Code Heap'.
The JIT only compiles code on a method by method basis - not at a class level. So if a method never gets called for a class, it is never JIT compiled and thus never has memory allocated for it on the JIT Code Heap.
As an aside, there are about 8 different heaps that the CLR uses:
Loader Heap: contains CLR structures and the type system
High Frequency Heap: statics, MethodTables, FieldDescs, interface map
Low Frequency Heap: EEClass, ClassLoader and lookup tables
Stub Heap: stubs for CAS, COM wrappers, P/Invoke
Large Object Heap: memory allocations that require more than 85k bytes
GC Heap: user allocated heap memory private to the app
JIT Code Heap: memory allocated by mscoree (the Execution Engine) and the JIT compiler for managed code
Process/Base Heap: interop/unmanaged allocations, native memory, etc
HTH
Each "class" requires enough memory to hold all of it's jit-compiled code for all it's members that have been called by the runtime, (although if you don't call a method for quite some time, the CLR can release that memory and re-jit it again if you call it again... plus enough memory to hold all static variables declared in the class... but this memory is allocated only once per class, no matter how many instances of the class you create.
For each instance of the class that you create (and that has not been garbage collected), you can approximate the memory footprint by adding up the memory used by each declared instance variable (field); a worked sketch follows the list below:
reference variables (refs to other objects) take 4 or 8 bytes (32-bit vs. 64-bit process)
Int16, Int32, and Int64 take 2, 4, or 8 bytes, respectively
a string variable takes extra storage for some metadata elements, plus the size of the address pointer
In addition, each reference variable in an object could also be considered to "indirectly" include the memory taken up on the heap by the object it points to, although you would probably want to count that memory as belonging to that object not the variable that references it...
etc. etc.
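Putting those rules together, a hypothetical example (the class and numbers are illustrative, assuming a 64-bit process):

class Person
{
    public int Age;      // 4 bytes
    public long Id;      // 8 bytes
    public string Name;  // 8 bytes for the reference only; the character
                         // data lives in a separate heap object
}

// Add the per-object overhead (object header plus method table pointer,
// about 16 bytes on 64-bit) and round up to the allocation alignment.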
To get a general sense for the memory allocation in your application, use the following sos command in WinDbg
!dumpheap -stat
Note that !dumpheap only gives you the bytes of the object type itself, and doesn't include the bytes of any other object types that it might reference.
If you want to see the total held bytes (the sum of the bytes of all objects referenced by your object) for a specific object type, use a memory profiler like dotTrace - http://www.jetbrains.com/profiler/
If you can - Serialize it!
Dim myObjectSize As Long
Dim ms As New IO.MemoryStream
Dim bf As New Runtime.Serialization.Formatters.Binary.BinaryFormatter()
bf.Serialize(ms, myObject)
myObjectSize = ms.Position
There is the academic question of "What is the size of an object at runtime?" That is interesting, but it can only be properly answered by a profiler attached to the running process. I spent quite a while looking at this recently and determined that there is no generic method that is accurate and fast enough to ever use in a production system. Simple cases like arrays of numerical types have easy answers, but beyond this the best answer is: don't bother trying to work it out. Why do you want to know this? Is there other information available that could serve the same purpose?
In my case I ended up wanting to answer this question because I had various data that were useful, but could be discarded to free up RAM for more critical services. The poster boys here are an Undo Stack and a Cache.
Eventually I concluded that the right way to manage the size of the undo stack and the cache was to query for the amount of available memory (it's a 64-bit process so it is safe to assume it is all available) and then allow more items to be added if there is a sufficiently large buffer of RAM and require items to be removed if RAM is running low.
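A sketch of that policy (assumes .NET Core 3.0 or later, where GC.GetGCMemoryInfo is available; the eviction call is a hypothetical hook):

var info = GC.GetGCMemoryInfo();
long budget = info.TotalAvailableMemoryBytes;              // what the GC considers available
long used = GC.GetTotalMemory(forceFullCollection: false);

if (used > budget * 0.8)
{
    // memory is running low: drop the oldest undo-stack / cache entries
    cache.EvictOldest();  // hypothetical method on your cache
}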
For any Unity dev lurking around for an answer, here's a way to compare the memory allocations of two different classes, inspired by @varun's answer:
void Start()
{
    var totalMemory = System.GC.GetTotalMemory(false);

    var class1 = new Class1[100000];
    for (int i = 0; i < 100000; i++)
    {
        class1[i] = new Class1();
    }
    var newTotalMemory = System.GC.GetTotalMemory(false);
    Debug.Log($"Class1: {newTotalMemory} - {totalMemory} = {newTotalMemory - totalMemory}");

    var class2 = new Class2[100000];
    for (int i = 0; i < 100000; i++)
    {
        class2[i] = new Class2(10, 10);
    }
    var newTotalMemory2 = System.GC.GetTotalMemory(false);
    Debug.Log($"Class2: {newTotalMemory2} - {newTotalMemory} = {newTotalMemory2 - newTotalMemory}");

    // KeepAlive belongs after the measurements, so the arrays cannot be
    // collected before the GetTotalMemory calls above have run
    System.GC.KeepAlive(class1);
    System.GC.KeepAlive(class2);
}