I will have literally tens of millions of instances of some class MyClass and want to minimize its memory footprint. The question of measuring how much space an object takes in memory was discussed in "Find out the size of a .net object".
I decided to follow Jon Skeet's suggestion, and this is my code:
// Edit: This line is "dangerous and foolish" :-)
// (However, commenting it does not change the result)
// [StructLayout(LayoutKind.Sequential, Pack = 1)]
public class MyClass
{
    public bool isit;
    public MyClass nextRight;
    public MyClass nextDown;
}
class Program
{
    static void Main(string[] args)
    {
        var a1 = new MyClass(); // to prevent JIT code mangling the result (Skeet)
        var before = GC.GetTotalMemory(true);
        MyClass[] arr = new MyClass[10000];
        for (int i = 0; i < 10000; i++)
            arr[i] = new MyClass();
        var after = GC.GetTotalMemory(true);
        var per = (after - before) / 10000.0;
        Console.WriteLine("Before: {0} After: {1} Per: {2}", before, after, per);
        Console.ReadLine();
    }
}
I ran the program on 64-bit Windows, chose "Release", platform target "Any CPU", and enabled "Optimize code" (the options only matter if I explicitly target x86). The result is, sadly, 48 bytes per instance.
My calculation would be 8 bytes per reference, plus 1 byte for the bool, plus roughly 8 bytes of object overhead. What is going on? Is this a conspiracy to keep RAM prices high and/or let non-Microsoft code bloat? Well, ok, I guess my real question is: what am I doing wrong, or how can I minimize the size of MyClass?
Edit: I apologize for being sloppy in my question; I edited a couple of identifier names. My concrete and immediate concern was to build a "2-dim linked list" as a sparse boolean matrix implementation, where I can easily get an enumeration of the set values in a given row/column. [Of course that means I also have to store the x,y coordinates in the class, which makes my idea even less feasible.]
Approach the problem from the other end. Rather than asking yourself "how can I make this data structure smaller and still have tens of millions of them allocated?" ask yourself "how can I represent this data using a completely different data structure that is far more compact?"
It looks like you are building a doubly-linked list of bools, which, as you note, uses thirty to fifty times more memory than it needs to. Is there some reason why you're not simply using a BitArray to store your list of bools?
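For illustration, a minimal sketch of the BitArray idea (the dimensions and coordinates here are made up, not taken from the question):

using System;
using System.Collections;

class BitGridDemo
{
    static void Main()
    {
        int width = 10000, height = 10000;
        // One bit per cell: roughly 12.5 MB for 100 million flags,
        // versus ~48 bytes per flag with a reference-type node.
        var bits = new BitArray(width * height);

        int row = 42, col = 7;                // illustrative coordinates
        bits[row * width + col] = true;       // set a flag
        bool isSet = bits[row * width + col]; // test a flag
        Console.WriteLine(isSet);
    }
}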
UPDATE:
in fact I was trying to implement a sparse boolean two-dimensional matrix
Well why didn't you say so in the first place?
When I want to make a sparse Boolean two-d matrix of enormous size, I build an immutable persistent boolean quadtree with a memoized factory. If the array is sparse, or even if it is dense but self-similar in some way, you can achieve enormous compressions. Square arrays of 2^64 x 2^64 Booleans are easily representable even though obviously as a real array, that would be more memory than exists in the world.
I have been toying with the idea of doing a series of blog articles on this technique; I will likely do so in late March. (UPDATE: I did not write that article in March 2012; I wrote it in August 2020. https://ericlippert.com/2020/08/17/life-part-32/)
Briefly, the idea is to make an abstract class Quad that has two subclasses: Single, and Multi. "Single" is a doubleton -- like a singleton, but with exactly two instances, called True and False. A Multi is a Quad that has four sub-quads, called NorthEast, SouthEast, SouthWest and NorthWest.
Each Quad has an integer "level"; the level of a Single is zero, and a multi of level n is required to have all of its children be Quads of level n-1.
The Multi factory is memoized; when you ask it to make a new Multi with four children, it consults a cache to see if it has made it before. If it has, it does not construct a new one; it hands out the old one. Since Quads are immutable, you do not have to worry about someone changing the Quad on you after it is in the cache.
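A rough sketch of that shape in C# (the Quad, Single, and Multi names come from the description above; the cache key and other details are illustrative, not the author's actual implementation):

using System;
using System.Collections.Generic;

public abstract class Quad
{
    public abstract int Level { get; }
}

// The "doubleton": exactly two instances, True and False.
public sealed class Single : Quad
{
    public static readonly Single True = new Single(true);
    public static readonly Single False = new Single(false);
    public bool Value { get; private set; }
    public override int Level { get { return 0; } }
    private Single(bool value) { Value = value; }
}

public sealed class Multi : Quad
{
    public Quad NorthEast { get; private set; }
    public Quad NorthWest { get; private set; }
    public Quad SouthEast { get; private set; }
    public Quad SouthWest { get; private set; }
    private readonly int level;
    public override int Level { get { return level; } }

    private Multi(Quad ne, Quad nw, Quad se, Quad sw)
    {
        NorthEast = ne; NorthWest = nw; SouthEast = se; SouthWest = sw;
        level = ne.Level + 1;
    }

    // Memoized factory: asking for the same four children always
    // returns the same shared instance.
    private static readonly Dictionary<Tuple<Quad, Quad, Quad, Quad>, Multi> cache =
        new Dictionary<Tuple<Quad, Quad, Quad, Quad>, Multi>();

    public static Multi Make(Quad ne, Quad nw, Quad se, Quad sw)
    {
        var key = Tuple.Create(ne, nw, se, sw);
        Multi result;
        if (!cache.TryGetValue(key, out result))
        {
            result = new Multi(ne, nw, se, sw);
            cache[key] = result;
        }
        return result;
    }
}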
Consider now how many memory words (a word is 4 or 8 bytes depending on architecture) an "all false" Multi of level n consumes. A level 1 "all false" multi consumes four words for the links to its children, a word for the level count (if necessary; you are not required to keep the level in the multi, though it helps for debugging) and a couple words for the sync block and so on. Let's call it eight words. (Plus the memory for the False Single quad, which we can assume is a constant two or three words, and thereby may be ignored.)
A level 2 "all false" multi consumes the same eight words, but each of its four children is the same level 1 multi. Therefore the total consumption of the level 2 "all false" multi is let's say 16 words.
The same for the level 3, 4, ... and so on. The total memory consumption for a level 64 multi that is logically a 2^64 x 2^64 square array of Booleans is only 64 x 16 memory words!
Make sense? Hopefully that is enough of a sketch to get you going. If not, see my blog link above.
8 (object reference) + 8 (object reference) + 1 (bool) + 16 (header) + 8 (reference in array itself) = 41
Even if it's misaligned internally, each instance will be aligned on the heap. So we're looking at at least 48 bytes.
I can't for the life of me see why you'd want a linked list of bools, though. A list of them would take 48 times less space, and that's before you get to optimisations like storing a bool per bit, which would make it 384 times smaller. And easier to manipulate.
If these hundreds of millions of instances of the class are mostly copies of the class with minor variations in property values, then your system is a prime candidate for what is called the Flyweight pattern. This pattern minimizes memory use by using the same instances over and over, and just changing the properties as needed...
Related
This is for game programming. Let's say I have a Unit that can track 10 enemies within its range. Each enemy has a priority between 0-100. So the array currently looks like this (numbers represent priority):
Enemy - 96
Enemy - 78
Enemy - 77
Enemy - 73
Enemy - 61
Enemy - 49
Enemy - 42
Enemy - 36
Enemy - 22
Enemy - 17
Say a new enemy wanders within range and has a priority of 69; it will be inserted between 73 and 61, and 17 will be removed from the array (well, the 17 would be removed before the insertion, I believe).
Is there any way to figure out that it needs to be inserted between 73 and 61 without an O(n) operation?
I feel you're asking the wrong question here. You have to first find the spot to insert into and then insert the element. These are two operations that are tied together, and I feel you shouldn't be asking how to do one faster without the other; it'll make sense why towards the end. But I'm addressing the question of actually inserting faster.
Short Answer: No
Answer you'll get from someone that's too smart for themselves:
The only way to accomplish this is to not use an array. In an array, unless you are inserting into the first or last position, the insert will be O(n). This is because the array consists of its elements occupying contiguous space in memory. That is how you are able to reference a particular element in O(1) time: you know exactly where that element is. The cost is that to insert in the middle you need to move, on average, half the elements in the array. So while you can look up with a binary search in O(log n) time, you cannot insert in that time.
So if you're going to do anything, you'll need a different data structure. A simple binary tree may be the solution; it will do the insertion in O(log n) time. On the other hand, if you're feeding it a sorted array you have to worry about tree balancing, so you might need a red-black tree. Or, if you are always popping the element that is the closest or the furthest, you can use a heap. A heap is the natural data structure for a priority queue. It has the additional advantage of fitting a tree structure into an array, so it has far better spatial locality (more on this later).
The truth:
You'll most likely have a dozen, maybe a few dozen, enemies in the vicinity at most. At that level the asymptotic performance does not matter, because it is designed especially for large values of 'n'. What you're looking at is a religious adherence to your CS 201 professor's calls about Big Oh. Linear search and insertion will be the fastest method, and the answer to "will it scale?" is: who the hell cares. If you try to implement a complicated algorithm to scale it, you will almost always be slower, since what is determining your speed is not the software, it is the hardware, and you're better off sticking to doing things that the hardware knows how to deal with well: linearly going down memory. In fact, after the prefetchers do their thing, it would be faster to linearly go through each element, even if there were a couple of thousand elements, than to implement a red-black tree, because a data structure like a tree would allocate memory all over the place without any regard to spatial locality, and the calls to allocate more memory for a node are in themselves more expensive than the time it takes to read through a thousand elements. Which is why graphics cards use insertion sort all over the place.
Heap Sort
Heap sort might actually be faster, depending on the input data, since it uses a linear array, although it may confuse the prefetchers, so it's hard to say. The only limitation is that you can only pop the highest-priority element. Obviously you can define highest priority to be either the lowest or the largest element. Heap sort is too fancy for me to try to describe here; just Google it. It does separate insertion and removal into two O(log(n)) operations. The biggest downside of heap sort is that it will seriously decrease the debuggability of the code. A heap is not a sorted array; it has an order to it, but other than heap sort being a complicated, unintuitive algorithm, it is not readily apparent to a human being whether a heap is set up correctly. So you would introduce more bugs for, in the best case, little benefit. Hell, the last time I had to do a heap sort I copied the code for it, and that had bugs in it.
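As an aside, if a heap really is what's wanted, modern .NET (6 and later) ships PriorityQueue<TElement, TPriority>, so you don't have to hand-roll and debug one. A small sketch, with an illustrative Enemy type:

using System;
using System.Collections.Generic;

class Enemy
{
    public string Name;
    public int Priority;
}

class HeapDemo
{
    static void Main()
    {
        // PriorityQueue is a min-heap by default; invert the comparer so
        // Dequeue() pops the highest priority first.
        var queue = new PriorityQueue<Enemy, int>(
            Comparer<int>.Create((a, b) => b.CompareTo(a)));

        queue.Enqueue(new Enemy { Name = "scout", Priority = 17 }, 17);
        queue.Enqueue(new Enemy { Name = "boss", Priority = 96 }, 96);
        queue.Enqueue(new Enemy { Name = "grunt", Priority = 69 }, 69);

        Console.WriteLine(queue.Dequeue().Name); // "boss" -- O(log n) per operation
    }
}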
Insertion Sort With Binary Search
So this is what it seems like you're trying to do. The truth is that it's a very bad idea. On average, insertion sort takes O(n), and we know this is a hard limit for inserting a random element into a sorted array. Yes, we can find the spot to insert into faster by using a binary search, but then the average insertion still takes O(n). Alternatively, in the best case, if the element goes into the last position, the insert takes O(1) time, because it is already in the correct place. However, if you do a binary search to find the insertion location, then finding out you're supposed to insert in the last position takes O(log(n)) time, and the insertion itself takes O(1) time. So in trying to optimize it, you've severely degraded the best-case performance. Looking at your use case, this queue holds the enemies with their priorities. The priority of an enemy is likely a function of its strength and its distance, which means that when an enemy enters the priority queue, it will likely have a very low priority. This plays very well into the O(1) best case of insertion. If you degrade the best-case performance, you will do more harm than good, because it is also your most common case.
Premature optimization is the root of all evil -- Donald Knuth
Since you are maintaining a sorted search pool at all times, you can use binary search. First check the middle element, then check the element halfway between the middle element and whichever end of the array is closer, and so on until you find the location. This will give you O(log₂ n) time.
Sure, assuming you are using an Array type to house the list, this is really easy.
I will assume Enemy is your class name, and that it has a property called Priority to perform the sort. We will need an IComparer<Enemy> that looks like the following:
public class EnemyComparer : IComparer<Enemy>
{
    int IComparer<Enemy>.Compare(Enemy x, Enemy y)
    {
        return y.Priority.CompareTo(x.Priority); // reverse operands to invert ordering
    }
}
Then we can write a simple InsertEnemy routine as follows:
public static bool InsertEnemy(Enemy[] enemies, Enemy newEnemy)
{
    // Binary search in O(log N)
    var ix = Array.BinarySearch(enemies, newEnemy, new EnemyComparer());
    // If not found, the bitwise complement is the insertion index
    if (ix < 0)
        ix = ~ix;
    // If the insertion index is after the list, we bail out...
    if (ix >= enemies.Length)
        return false; // insert is after the last item
    // Move enemies down the list to make room for the insertion...
    if (ix + 1 < enemies.Length)
        Array.ConstrainedCopy(enemies, ix, enemies, ix + 1, enemies.Length - (ix + 1));
    // Now insert the newEnemy into the position
    enemies[ix] = newEnemy;
    return true;
}
There are other data structures that would make this a bit faster, but this should prove efficient enough. A B-Tree or binary tree would be ok if the list will get large, but for 10 items it's doubtful it would be faster.
The method above was tested with the addition of the following:
public class Enemy
{
    public int Priority;
}

public static void Main()
{
    var rand = new Random();
    // Start with a sorted list of 10 (sorted descending, to match EnemyComparer)
    var enemies = Enumerable.Range(0, 10)
        .Select(i => new Enemy() { Priority = rand.Next(0, 100) })
        .OrderByDescending(e => e.Priority)
        .ToArray();
    // Insert random entries
    for (int i = 0; i < 100; i++)
        InsertEnemy(enemies, new Enemy() { Priority = rand.Next(100) });
}
I've seen this a couple of times recently in high-profile code, where constant values are defined as variables, named after the value, and then used only once. I wondered why this gets done.
E.g. Linux Source (resize.c)
unsigned five = 5;
unsigned seven = 7;
E.g. C#.NET Source (Quaternion.cs)
double zero = 0;
double one = 1;
Naming numbers is terrible practice; one day something will need to change, and you'll end up with unsigned five = 7.
If it has some meaning, give it a meaningful name. The 'magic number' five is no improvement over the magic number 5, it's worse because it might not actually equal 5.
This kind of thing generally arises from cargo-cult programming style guidelines, where someone heard that "magic numbers are bad" and forbade their use without fully understanding why.
Well named variables
Giving proper names to variables can dramatically clarify code, such as
constant int MAXIMUM_PRESSURE_VALUE=2;
This gives two key advantages:
The value MAXIMUM_PRESSURE_VALUE may be used in many different places; if for whatever reason that value changes, you need to change it in only one place.
Where it is used, it immediately shows what the code is doing; for example, the following obviously checks whether the pressure is dangerously high:
if (pressure>MAXIMUM_PRESSURE_VALUE){
//without me telling you you can guess there'll be some safety protection in here
}
Poorly named variables
However, everything has a counter-argument, and what you have shown looks very much like a good idea taken so far that it makes no sense. Defining TWO as 2 doesn't add any value:
constant int TWO=2;
The value TWO may be used in many different places, perhaps to double things, perhaps to access an index. If in the future you need to change the index, you cannot just change to int TWO=3;, because that would affect all the other (completely unrelated) ways you've used TWO; now you'd be tripling instead of doubling, etc.
Where used it gives you no more information than if you just used "2". Compare the following two pieces of code:
if (pressure>2){
//2 might be good, I have no idea what happens here
}
or
if (pressure>TWO){
//TWO means 2, 2 might be good, I still have no idea what happens here
}
Worse still (as seems to be the case here), TWO may not equal 2; if so, this is a form of obfuscation where the effect is to make the code less clear: obviously it achieves that.
The usual reason for this is a coding standard which forbids magic numbers but doesn't count TWO as a magic number; which of course it is! 99% of the time you want to use a meaningful variable name but in that 1% of the time using TWO instead of 2 gains you nothing (Sorry, I mean ZERO).
this code is inspired by Java but is intended to be language agnostic
Short version:
A constant five that just holds the number five is pretty useless. Don't go around making these for no reason (sometimes you have to because of syntax or typing rules, though).
The named variables in Quaternion.cs aren't strictly necessary, but you can make the case for the code being significantly more readable with them than without.
The named variables in ext4/resize.c aren't constants at all. They're tersely-named counters. Their names obscure their function a bit, but this code actually does correctly follow the project's specialized coding standards.
What's going on with Quaternion.cs?
This one's pretty easy.
Right after this:
double zero = 0;
double one = 1;
The code does this:
return zero.GetHashCode() ^ one.GetHashCode();
Without the local variables, what does the alternative look like?
return 0.0.GetHashCode() ^ 1.0.GetHashCode(); // doubles, not ints!
What a mess! Readability is definitely on the side of creating the locals here. Moreover, I think explicitly naming the variables indicates "We've thought about this carefully" much more clearly than just writing a single confusing return statement would.
What's going on with resize.c?
In the case of ext4/resize.c, these numbers aren't actually constants at all. If you follow the code, you'll see that they're counters and their values actually change over multiple iterations of a while loop.
Note how they're initialized:
unsigned three = 1;
unsigned five = 5;
unsigned seven = 7;
Three equals one, huh? What's that about?
See, what actually happens is that update_backups passes these variables by reference to the function ext4_list_backups:
/*
* Iterate through the groups which hold BACKUP superblock/GDT copies in an
* ext4 filesystem. The counters should be initialized to 1, 5, and 7 before
* calling this for the first time. In a sparse filesystem it will be the
* sequence of powers of 3, 5, and 7: 1, 3, 5, 7, 9, 25, 27, 49, 81, ...
* For a non-sparse filesystem it will be every group: 1, 2, 3, 4, ...
*/
static unsigned ext4_list_backups(struct super_block *sb, unsigned *three,
unsigned *five, unsigned *seven)
They're counters that are preserved over the course of multiple calls. If you look at the function body, you'll see that it's juggling the counters to find the next power of 3, 5, or 7, creating the sequence you see in the comment: 1, 3, 5, 7, 9, 25, 27, &c.
Now, for the weirdest part: the variable three is initialized to 1 because 3^0 = 1. The power 0 is a special case, though, because it's the only time 3^x = 5^x = 7^x. Try your hand at rewriting ext4_list_backups to work with all three counters initialized to 1 (3^0, 5^0, 7^0) and you'll see how much more cumbersome the code becomes. Sometimes it's easier to just tell the caller to do something funky (initialize the list to 1, 5, 7) in the comments.
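A rough C# re-creation of that counter juggling, just to illustrate the pattern (this is not the kernel code, and the names are made up):

using System;

class BackupGroupDemo
{
    // Returns the smallest of the three counters and advances that counter
    // to its next power, reproducing the 1, 3, 5, 7, 9, 25, 27, ... sequence.
    static uint NextBackupGroup(ref uint three, ref uint five, ref uint seven)
    {
        uint result;
        if (three <= five && three <= seven) { result = three; three *= 3; }
        else if (five <= seven)              { result = five;  five  *= 5; }
        else                                 { result = seven; seven *= 7; }
        return result;
    }

    static void Main()
    {
        uint three = 1, five = 5, seven = 7;   // the "funky" initialization
        for (int i = 0; i < 9; i++)
            Console.Write(NextBackupGroup(ref three, ref five, ref seven) + " ");
        // Output: 1 3 5 7 9 25 27 49 81
    }
}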
So, is five = 5 good coding style?
Is "five" a good name for the thing that the variable five represents in resize.c? In my opinion, it's not a style you should emulate in just any random project you take on. The simple name five doesn't communicate much about the purpose of the variable. If you're working on a web application or rapidly prototyping a video chat client or something and decide to name a variable five, you're probably going to create headaches and annoyance for anyone else who needs to maintain and modify your code.
However, this is one example where generalities about programming don't paint the full picture. Take a look at the kernel's coding style document, particularly the chapter on naming.
GLOBAL variables (to be used only if you really need them) need to
have descriptive names, as do global functions. If you have a function
that counts the number of active users, you should call that
"count_active_users()" or similar, you should not call it "cntusr()".
...
LOCAL variable names should be short, and to the point. If you have
some random integer loop counter, it should probably be called "i".
Calling it "loop_counter" is non-productive, if there is no chance of it
being mis-understood. Similarly, "tmp" can be just about any type of
variable that is used to hold a temporary value.
If you are afraid to mix up your local variable names, you have another
problem, which is called the function-growth-hormone-imbalance syndrome.
See chapter 6 (Functions).
Part of this is C-style coding tradition. Part of it is purposeful social engineering. A lot of kernel code is sensitive stuff, and it's been revised and tested many times. Since Linux is a big open-source project, it's not really hurting for contributions — in most ways, the bigger challenge is checking those contributions for quality.
Calling that variable five instead of something like nextPowerOfFive is a way to discourage contributors from meddling in code they don't understand. It's an attempt to force you to really read the code you're modifying in detail, line by line, before you try to make any changes.
Did the kernel maintainers make the right decision? I can't say. But it's clearly a purposeful move.
My organisation has certain programming guidelines, one of which concerns the use of magic numbers...
eg:
if (input == 3) //3 what? Elephants?....3 really is the magic number here...
This would be changed to:
#define INPUT_1_VOLTAGE_THRESHOLD 3u
if (input == INPUT_1_VOLTAGE_THRESHOLD) //Not elephants :(
We also have a source file with every value from -200,000 to 200,000 #defined in the format:
#define MINUS_TWO_ZERO_ZERO_ZERO_ZERO_ZERO -200000
which can be used in place of magic numbers, for example when referencing a specific index of an array.
I imagine this has been done for "Readability".
The numbers 0, 1, ... are integers. Here, the 'named variables' give the integer a different type. It might be more reasonable to declare these as constants (const unsigned five = 5;).
I've used something akin to that a couple times to write values to files:
const int32_t zero = 0;
fwrite(&zero, sizeof(zero), 1, myfile);
fwrite accepts a const pointer, but if some function needs a non const pointer, you'll end up using a non const variable.
P.S.: It always leaves me wondering what the sizeof zero might be.
How do you come to the conclusion that it is used only once? It is public; it could be used any number of times from any assembly.
public static readonly Quaternion Zero = new Quaternion();
public static readonly Quaternion One = new Quaternion(1.0f, 1.0f, 1.0f, 1.0f);
The same thing applies to the .NET Framework decimal class, which also exposes public constants like this:
public const decimal One = 1m;
public const decimal Zero = 0m;
Numbers are often given a name when these numbers have special meaning.
For example, in the Quaternion case the identity quaternion and the unit-length quaternion have special meaning and are frequently used in a special context. Namely, a Quaternion of (0,0,0,1) is the identity quaternion, so it's common practice to define it instead of using magic numbers.
For example
// define as static
static Quaternion Identity = new Quaternion(0,0,0,1);
Quaternion Q1 = Quaternion.Identity;
//or
if ( Q1.Length == Unit ) // not considering floating point error
One of my first programming jobs was on a PDP 11 using Basic. The Basic interpreter allocated memory to every number required, so every time the program mentioned 0, a byte or two would be used to store the number 0. Of course back in those days memory was a lot more limited than today and so it was important to conserve.
Every program in that work place started with:
10 U0%=0
20 U1%=1
That is, for those who have forgotten their Basic:
Line number 10: create an integer variable called U0 and assign it the number 0
Line number 20: create an integer variable called U1 and assign it the number 1
These variables, by local convention, never held any other value, so they were effectively constants. They allowed 0 and 1 to be used throughout the program without wasting any memory.
Aaaaah, the good old days!
sometimes it's more readable to write:
double pi=3.14; //Constant or even not constant
...
CircleArea=pi*r*r;
instead of:
CircleArea=3.14*r*r;
and maybe you will use pi again (you are not sure, but you think it's possible later or in other classes if it is public)
and then if you want to change pi=3.14 into pi=3.14159 it's easier.
and the same goes for other constants like e=2.71, Avogadro's number, etc.
I am using a Dictionary<int,int> to store the frequency of colors in an image, where the key is the color (as an int), and the value is the number of times the color has been found in the image.
When I process larger / more colorful images, this dictionary grows very large. I get an out of memory exception at just around 6,000,000 entries. Is this the expected capacity when running in 32-bit mode? If so, is there anything I can do about it? And what might be some alternative methods of keeping track of this data that won't run out of memory?
For reference, here is the code that loops through the pixels in a bitmap and saves the frequency in the Dictionary<int,int>:
Bitmap b; // = something...
Dictionary<int, int> count = new Dictionary<int, int>();
System.Drawing.Color color;
for (int i = 0; i < b.Width; i++)
{
    for (int j = 0; j < b.Height; j++)
    {
        color = b.GetPixel(i, j);
        int colorString = color.ToArgb();
        if (!count.Keys.Contains(color.ToArgb()))
        {
            count.Add(colorString, 0);
        }
        count[colorString] = count[colorString] + 1;
    }
}
Edit: In case you were wondering what image has that many different colors in it: http://allrgb.com/images/mandelbrot.png
Edit: I also should mention that this is running inside an asp.net web application using .Net 4.0. So there may be additional memory restrictions.
Edit: I just ran the same code inside a console application and had no problems. The problem only happens in ASP.Net.
Update: Given the OP's sample image, it seems that the maximum number of items would be over 16 million, and apparently even that is too much to allocate when instantiating the dictionary. I see three options here:
Resize the image down to a manageable size and work from that.
Try to convert to a color scheme with fewer color possibilities.
Go for an array of fixed size as others have suggested.
Previous answer: the problem is that you don't allocate enough space for your dictionary. At some point, when it is expanding, you just run out of memory for the expansion, but not necessarily for the new dictionary.
Example: this code runs out of memory at nearly 24 million entries (in my machine, running in 32-bit mode):
Dictionary<int, int> count = new Dictionary<int, int>();
for (int i = 0; ; i++)
    count.Add(i, i);
because at the last expansion it is still using the space for the entries already there, and it tries to allocate new space for so many million more, and that is too much.
Now, if we initially allocate space for, say, 40 million entries, it runs without problem:
Dictionary<int, int> count = new Dictionary<int, int>(40000000);
So try to indicate how many entries there will be when creating the dictionary.
From MSDN:
The capacity of a Dictionary is the number of elements that can be added to the Dictionary before resizing is necessary. As elements are added to a Dictionary, the capacity is automatically increased as required by reallocating the internal array.
If the size of the collection can be estimated, specifying the initial capacity eliminates the need to perform a number of resizing operations while adding elements to the Dictionary.
Each dictionary entry holds two 4-byte integers: 8 bytes total. 8 bytes * 6 million entries is only about 48 MB, +/- some space for object overhead, alignment, etc. There's plenty of space in memory for this. .Net provides virtual address space of up to 2 GB per process. 48 MB or so shouldn't cause a problem.
I expect what's actually happening here is related to how the dictionary auto-expands and how the garbage collector handles (or doesn't handle) compaction.
First, the auto-expanding part. Last time I checked (back around .Net 2.0*), collections in .Net tended to use arrays internally. They would allocate a reasonably-sized array in the collection constructor (say, 10 items), and then use a doubling algorithm to create additional space whenever the array filled up. All the existing items would have to be copied to the new array, but then the old array could be garbage collected. The garbage collector is pretty reliable about this, and so it means you're left using space for at most 2n - 1 items in the collection.
Now the Garbage Collector compaction part. After a certain size, these arrays end up in a section of memory called the Large Object Heap. Garbage Collection still works here (though less often). What doesn't really work here well is compaction (think memory defragmentation). The physical memory used by the old object will be released, returned to the operating system, and available for other processes. However, the virtual address space in your process... the table that maps program memory offsets to physical memory addresses, will still have the (empty) space reserved.
This is important, because remember: we're working with a rapidly growing object. It's possible for such an object to take up address space far larger than the final size of the object itself. An object grows enough, fast enough, and suddenly you get an OutOfMemoryException, even though your app isn't really using all that much RAM.
The first solution here is to allocate enough space in the initial collection for all of your data. This allows you to skip all those re-allocations and copying. Your data will live in a single array, and use only the space you actually asked for. Most collections, including the Dictionary, have an overload for the constructor that allows you to give it the number of items you want the first array to use. Be careful here: you don't need to allocate an item for every pixel in your image. There will be a lot of repeated colors. You only need to allocate enough to have space for each color in your image. If it's only large images that give you problems, and you're almost handling them with six million records, you might find that 8 million is plenty.
My next suggestion is to group your pixel colors. A human can't tell and doesn't care if two colors are just one bit apart in any of the rgb components. You might go as far as to look at the separate RGB values for each pixel and normalize the pixel so that you only care about changes of more than 5 or so for an R,G,or B value. That would get you from 16.5 million potential colors all the way down to only about 132,000, and the data will likely be more useful, too. That might look something like this:
var colorCounts = new Dictionary<Color, int>(132651);
foreach (Color c in GetImagePixels().Select(p => Color.FromArgb((p.R / 5) * 5, (p.G / 5) * 5, (p.B / 5) * 5)))
{
    // TryGetValue avoids a KeyNotFoundException the first time a color is seen
    int n;
    colorCounts.TryGetValue(c, out n);
    colorCounts[c] = n + 1;
}
* IIRC, somewhere in a recent or upcoming version of .Net both of these issues are being addressed: one by allowing you to force compaction of the LOH, and the other by using a set of arrays for collection backing stores, rather than trying to keep everything in one big array.
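For reference, the LOH compaction knob mentioned in the footnote did ship (in .NET Framework 4.5.1 and later); a tiny sketch:

using System;
using System.Runtime;

class LohCompactionDemo
{
    static void Main()
    {
        // Ask the GC to compact the large object heap on the next
        // blocking full collection.
        GCSettings.LargeObjectHeapCompactionMode =
            GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();
    }
}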
The maximum object size limit provided by the CLR is 2 GB:
When you run a 64-bit managed application on a 64-bit Windows
operating system, you can create an object of no more than 2 gigabytes
(GB).
You may be better off using an array.
You may also check this BigArray<T>, getting around the 2GB array size limit
In the 32 bit runtime, the maximum number of items you can have in a Dictionary<int, int> is in the neighborhood of 61.7 million. See my old article for more info.
If you're running in 32 bit mode, then your entire application plus whatever bits of ASP.NET and the underlying machinery is required all have to fit within the memory available to your process: normally 2 GB in the 32-bit runtime.
By the way, a really wacky way to solve your problem (but one I wouldn't recommend unless you're really hurting for memory), would be the following (assuming a 24-bit image):
Call LockBits to get a pointer to the raw image data
Compress the per-scan-line padding by moving the data for each scan line to fill the previous row's padding. You end up with an array of 3-byte values followed by a bunch of empty space (to equal the padding).
Sort the image data. That is, sort the 3-byte values. You'd have to write a custom sort, but it wouldn't be too bad.
Go sequentially through the array and count the number of unique values.
Allocate a 2-dimensional array: int[count,2] to hold the values and their occurrence counts.
Go sequentially through the array again to count occurrences of each unique value and populate the counts array.
I wouldn't honestly suggest using this method. Just got a little laugh when I thought of it.
Try using an array instead. I doubt it will run out of memory. 6 million int array elements is not a big deal.
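A hedged sketch of that: for 24-bit color, a flat array indexed by the RGB value is a fixed 64 MB (2^24 ints), with no rehashing and no per-entry overhead. The GetPixel loop mirrors the question's code; whether a single 64 MB allocation is acceptable in your ASP.NET process is something you'd have to verify.

using System.Drawing;

class ColorCountDemo
{
    static int[] CountColors(Bitmap b)
    {
        var counts = new int[1 << 24];   // one counter per possible RGB value
        for (int x = 0; x < b.Width; x++)
            for (int y = 0; y < b.Height; y++)
                counts[b.GetPixel(x, y).ToArgb() & 0xFFFFFF]++;
        return counts;
    }
}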
I'm writing an app that will create thousands of small objects and store them recursively in arrays. By "recursively" I mean that each instance of K will have an array of K instances, which will each have an array of K instances, and so on; this array plus one int field are the only properties, plus some methods. I found that memory usage grows very fast even for a small amount of data (about 1 MB), and when the data I'm processing is about 10 MB I get an "OutOfMemoryException", not to mention when it's bigger (I have 4 GB of RAM) :). So what do you suggest I do? I figured that if I created a separate class V to process those objects, so that instances of K would have only the array of K's plus one integer field, and made K a struct, not a class, it should optimize things a bit -- no garbage collection and stuff... But it's a bit of a challenge, so I'd rather ask you whether it's a good idea before I start a total rewrite :).
EDIT:
Ok, some abstract code
public void Add(string word) {
    int i;
    string shorterWord;
    if (word.Length > 0) {
        i = //something, it's really irrelevant
        if (t[i] == null) {
            t[i] = new MyClass();
        }
        shorterWord = word.Substring(1);
        // end of word
        if (shorterWord.Length == 0) {
            t[i].WordEnd = END;
        }
        // saving the word letter by letter
        t[i].Add(shorterWord);
    }
}
When researching deeper into this I worked from the following assumptions (they may be inexact; I'm getting old for a programmer). A class has extra memory consumption because a reference is required to address it: store the reference, and an Int32-sized pointer is needed on a 32-bit compile. Objects are always allocated on the heap (I can't remember if C++ has other possibilities; I would venture yes).
The short answer, found in the article below: an object has a 12-byte basic footprint plus 4 possibly unused bytes depending on your class (no doubt something to do with padding).
http://www.codeproject.com/Articles/231120/Reducing-memory-footprint-and-object-instance-size
Another issue you'll run into is that arrays also have overhead. A possibility would be to manage your own offsets into a larger array or arrays, which in turn gets closer to something a more efficient language would be better suited for.
I'm not sure if there are libraries that provide storage for small objects in an efficient manner. There probably are.
My take on it: use structs, manage your own offsets in a large array, and use proper packing instructions if it serves you (although I suspect this comes at a runtime cost of a few extra instructions each time you address unevenly packed data):
[StructLayout(LayoutKind.Sequential, Pack = 1)]
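A minimal sketch of that "manage your own offsets" idea, assuming nodes can live in one big struct array and refer to each other by int index instead of by object reference (the field names are illustrative):

using System;

struct Node
{
    public int FirstChild;    // index into the pool, -1 if none
    public int NextSibling;   // index into the pool, -1 if none
    public int Value;         // the single int field from the question
}

class NodePool
{
    private Node[] nodes = new Node[1024];
    private int count;

    // Returns the index of the new node; no per-node object header is paid.
    public int Allocate(int value)
    {
        if (count == nodes.Length)
            Array.Resize(ref nodes, nodes.Length * 2);
        nodes[count] = new Node { FirstChild = -1, NextSibling = -1, Value = value };
        return count++;
    }

    public Node Get(int index) { return nodes[index]; }
    public void Set(int index, Node node) { nodes[index] = node; }
}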
Your stack is blowing up.
Do it iteratively instead of recursively.
You're not blowing the system stack up, you're blowing the code stack up; 10K function calls will blow it out of the water.
You need proper tail recursion, which is just an iterative hack.
Make sure you have enough memory in your system, over 100 MB+, etc. It really depends on your system. A linked list of recursive objects is what you are looking at. If you keep recursing, it is going to hit the memory limit and an OutOfMemoryException will be thrown. Make sure you keep track of the memory usage in any program. Nothing is unlimited, especially memory. If memory is limited, save it to disk.
Looks like there is infinite recursion in your code, so out-of-memory is thrown. Check the code: recursive code needs a start and an end condition; otherwise it will blow past 10 terabytes of memory at some point.
You can use a better data structure,
i.e. each letter can be a byte (a=0, b=1, ...). Each word fragment can be indexed as well, especially the substrings; you should get away with significantly less memory (though with a performance penalty).
Just post your recursive algorithm (with sanitized variable names). If you are doing a BFS-type traversal and keeping all objects in memory, you will run out of memory. For example, in this case, replace it with DFS.
Edit 1:
You can speed up the algorithm by estimating how many items you will generate and then allocating that much memory at once. As the algorithm progresses, fill up the allocated memory. This reduces fragmentation and the reallocate-and-copy-when-full operations.
Nonetheless, after you are done operating on these generated words, you should delete them from your data structure so they can be GC'd and you don't run out of memory.
After my last, failed, attempt at asking a question here I'm trying a more precise question this time:
What I have:
[1] A huge dataset (finite, but I wasted days of multi-core processing time to compute it before...) of ISet<Point>.
[2] A list of input values between 0 and 2^n, n ≤ 17
What I need:
A table over [1] and [2], where I map every value of [2] to a value of [1]
The processing:
For this computation I have a formula that takes a bit value (from [2]) and a set of positions (from [1]) and creates a new ISet<Point>. I need to find out which of the original sets is equal to the resulting set (i.e. the "cell" in the table at "A7" might point to "B").
The naive way:
Compute the new ISet<Point> and use .Contains(mySet) or something similar on the list of values from [1]. I did that in previous versions of this proof of concept/pet project and it was dead slow when I started feeding huge numbers. Yes, I used a profiler. No, this wasn't the only slow part of the system, but I wasted a considerable amount of time in this naive lookup/mapping.
The question, finally:
Since I basically just need to remap to the input, I thought about creating a list of hashed values for the list of ISet<Point>, doing the same for my result from the processing, and thereby avoiding comparing whole sets.
Is this a good idea? Would you call this premature optimization (I know that the naive way above is too slow, but should I implement something less clever first? Performance is really important here, think days of runtime again)? Any other suggestions to ease the burden here, or ideas on what I should read up on?
Update: Sorry for not providing a better explanation or a sample right away.
Sample for [1] (note: these are real possible data points, but obviously the list is limited):
new List<ISet<Point>>() {
    new HashSet<Point>() { new Point(0,0) },
    new HashSet<Point>() { new Point(0,0), new Point(2,1) },
    new HashSet<Point>() { new Point(0,1), new Point(3,1) }
}
[2] is just a boolean vector of length n. For n = 2 it's
0,0
0,1
1,0
1,1
I can do that one by using an int or long, basically.
Now I have a function that takes a vector and an ISet<Point> and returns a new ISet<Point>. It's not a 1:1 transformation: a set of 5 might result in a set of 11 or whatever. The resulting ISet<Point> is, however, guaranteed to be part of the input.
Using letters for a set of points and numbers for the bit vectors, I'm starting with this
A B C D E F
1
2
3
4
5
6
7
What I need to have at the end is
A B C D E F
1 - C A E - -
2 B C E F A -
3 ................
4 ................
5 ................
6 F C B A - -
7 E - C A - D
There are several costly operations in this project; one is the preparation of the sets of points ([1]). But this question is about the matching now: I can easily (more or less, not that important now) compute a target ISet for a given bit vector and a source ISet. Now I need to match/find that in the original set.
The whole beast is going to be a state machine, where each set of points is a valid state. Later I don't care about the individual states; I can actually refer to them by anything (a letter, an index, whatever). I just need to keep the associations:
1, B => C
Update: Eric asked if a HashSet would be possible. The answer is yes, but only if the dataset stays small enough. My question (hashing) is: Might it be possible/a good idea to employ a hashing algorithm for this hashset? My idea is this:
Walk the (lazily generated) list/sequence of ISet<Point> (I could change this type, I just want to stress that it is a mathematical set of points, no duplicates).
Create a simpler representation of the input (a hash?) and store it (in a hashset?)
Compute all target sets for this input, but only store again a simple representation (see above)
Discard the set
Fix up the mapping (equal hash = equal state)
Good idea? Problems with this? One problem I could come up with is collisions (how probable is that?), and I wouldn't know a good hashing function to begin with...
OK, I think I understand the problem at least now. Let me see if I can rephrase.
Let's start by leaving sets out of it. We'll keep it abstract.
You have a large list L, containing instances of reference type S (for "set"). Such a list is of course logically a mapping from natural numbers N onto S.
L: N --> S
S has the property that two instances can be compared for both reference equality and value equality. That is, there can be two instances of S which are not reference equals, but logically represent the same value.
You have a function F which takes a value of type V (for "vector") and an instance of type S and produces another instance of type S.
F: (V, S) --> S
Furthermore, you know that if F is given an instance of S from L then the resulting instance of S will be value equals to something on the list, but not necessarily reference equals.
The problem you face is: given an instance s of S which is the result of a call to F, which member L(n) is value-equals to s?
Yes?
The naive method -- going down L(1), L(2), ..., testing set equality along the way -- will be dead slow. It'll be at least linear in the size of L.
I can think of several different ways to proceed. The easiest is your initial thought: make L something other than a list. Can you make it a HashSet<S> instead of List<S>? If you implement a hashing algorithm and equality method on S then you can build a fast lookup table.
If that doesn't suit then we'll explore other options.
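For illustration, one concrete way to build that lookup table, assuming the sets can be HashSet<Point>: HashSet<T>.CreateSetComparer() supplies content-based hashing and equality, so a dictionary keyed on the set itself maps each computed result back to its position in L. Names here are illustrative.

using System;
using System.Collections.Generic;
using System.Drawing;

class SetLookupDemo
{
    // Map each set in L to its index, keyed by the set's contents.
    static Dictionary<HashSet<Point>, int> BuildIndex(List<HashSet<Point>> L)
    {
        var index = new Dictionary<HashSet<Point>, int>(
            HashSet<Point>.CreateSetComparer());
        for (int i = 0; i < L.Count; i++)
            index[L[i]] = i;
        return index;
    }

    // Given s = F(v, L[n]), find which member of L it is value-equal to.
    static int Lookup(Dictionary<HashSet<Point>, int> index, HashSet<Point> s)
    {
        return index[s];
    }
}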
UPDATE:
OK, so I can see two basic ways to deal with your memory problem. (1) Keep everything in memory using data structures that are much smaller than your current implementation, or (2) change how you store stuff on disk so that you can store an "index" in memory and rapidly go to the right "page" of the disk file to extract the information you need.
You could be representing a point as a single short where the top byte is x and the bottom byte is y, instead of representing it as two ints; a savings of 75%.
A set of points could be implemented as a sorted array of shorts, which is pretty compact and easy to write a hash algorithm for.
That's probably the approach I'd go for since your data are so compressible.
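A sketch of that compact representation, under the stated assumption that x and y each fit in a byte: each point packs into one ushort, the set is stored as a sorted array, and hashing and equality are computed over the contents (the type and member names are illustrative):

using System;
using System.Collections.Generic;
using System.Drawing;
using System.Linq;

sealed class PackedPointSet : IEquatable<PackedPointSet>
{
    private readonly ushort[] points;   // sorted, duplicate-free

    public PackedPointSet(IEnumerable<Point> pts)
    {
        // Pack x into the high byte and y into the low byte
        // (assumes 0 <= X, Y <= 255, as described above).
        points = pts.Select(p => (ushort)((p.X << 8) | p.Y))
                    .Distinct()
                    .OrderBy(v => v)
                    .ToArray();
    }

    public bool Equals(PackedPointSet other)
    {
        return other != null && points.SequenceEqual(other.points);
    }

    public override bool Equals(object obj) { return Equals(obj as PackedPointSet); }

    public override int GetHashCode()
    {
        // Simple content-based hash over the sorted values.
        unchecked
        {
            int hash = 17;
            foreach (var p in points) hash = hash * 31 + p;
            return hash;
        }
    }
}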