I have been looking at some SO questions related to the max size of an array of bytes (here and here) and have been playing with some arrays and getting some results I don't quite understand. My code is as follows:
byte[] myByteArr;
byte[] myByteArr2 = new byte[671084476];
for (int i = 1; i < 2; i++)
{
myByteArr = new byte[671084476];
}
This compiles, but on execution it throws a 'System.OutOfMemoryException' on the initialization of myByteArr. If I change the 2 in the for loop to a 1, or if I comment out either one of the initializations (myByteArr2 or myByteArr), it runs fine.
Also, byte[] myByteArr = new byte[Int32.MaxValue - 56]; throws the same exception.
Why does this happen when compiled for 32-bit? Aren't they within the 2GB limit?
The limits of a 32-bit program are not per-object. It's a process limit. You cannot have more than 2GB total in use.
Not only that, but in practice, it's often difficult to get anywhere near 2GB due to address space fragmentation. .NET's managed (ie. movable) memory helps somewhat, but doesn't eliminate this problem.
Even if you are using a 64-bit process, you may have a similar problem because in C# arrays are indexed by an int, which is defined as a 32-bit signed integer, and thus can't address past the 2GB boundary in an array of bytes. If you read the answer to the second link carefully, you'll also see that there is a 2GB per object limit. Your array of bytes presumably has some overhead, so it can't get to the full 2GB just for the raw data.
See @Habib's link in the comments for details.
Related
I am writing a .NET application running on Windows Server 2016 that does an HTTP GET on a bunch of pieces of a large file. This dramatically speeds up the download process since you can download them in parallel. Unfortunately, once they are downloaded, it takes a fairly long time to piece them all back together.
There are between 2-4k files that need to be combined. The server this will run on has PLENTY of memory, close to 800GB. I thought it would make sense to use MemoryStreams to store the downloaded pieces until they can be sequentially written to disk, BUT I am only able to consume about 2.5GB of memory before I get a System.OutOfMemoryException. The server has hundreds of GB available, and I can't figure out how to use them.
MemoryStreams are built around byte arrays. Arrays cannot be larger than 2GB currently.
The current implementation of System.Array uses Int32 for all its internal counters etc, so the theoretical maximum number of elements is Int32.MaxValue.
There's also a 2GB max-size-per-object limit imposed by the Microsoft CLR.
As you try to put the content in a single MemoryStream the underlying array gets too large, hence the exception.
Try to store the pieces separately, and write them directly to the FileStream (or whatever you use) when ready, without first trying to concatenate them all into 1 object.
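As an illustration (a minimal sketch of my own, not part of the original answer), the pieces can be written straight to a FileStream in order, so no single object ever has to hold the whole file:
using System.Collections.Generic;
using System.IO;

static class PieceWriter
{
    // piecesInOrder: the downloaded chunks, each well under 2 GB, in file order.
    public static void WriteToDisk(string outputPath, IEnumerable<byte[]> piecesInOrder)
    {
        using (var output = new FileStream(outputPath, FileMode.Create, FileAccess.Write))
        {
            foreach (byte[] piece in piecesInOrder)
            {
                // Each piece is written directly; nothing is concatenated in memory.
                output.Write(piece, 0, piece.Length);
            }
        }
    }
}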
According to the source code of the MemoryStream class, you will not be able to store more than 2 GB of data in one instance of this class.
The reason is that the maximum length of the stream is set to Int32.MaxValue and the maximum length of a byte array is set to 0x7FFFFFC7, which is 2,147,483,591 in decimal (just under 2 GB).
Snippet MemoryStream
private const int MemStreamMaxLength = Int32.MaxValue;
Snippet array
// We impose limits on maximum array lenght in each dimension to allow efficient
// implementation of advanced range check elimination in future.
// Keep in sync with vm\gcscan.cpp and HashHelpers.MaxPrimeArrayLength.
// The constants are defined in this method: inline SIZE_T MaxArrayLength(SIZE_T componentSize) from gcscan
// We have different max sizes for arrays with elements of size 1 for backwards compatibility
internal const int MaxArrayLength = 0X7FEFFFFF;
internal const int MaxByteArrayLength = 0x7FFFFFC7;
The question More than 2GB of managed memory was already discussed a long time ago on the Microsoft forum, and it has a reference to a blog article, BigArray, getting around the 2GB array size limit, there.
Update
I suggest using the following code, which should be able to allocate more than 4 GB on an x64 build but will fail somewhere below 4 GB on an x86 build:
private static void Main(string[] args)
{
    List<byte[]> data = new List<byte[]>();
    Random random = new Random();
    while (true)
    {
        try
        {
            // Allocate 1 MB at a time; the list keeps every chunk reachable.
            var tmpArray = new byte[1024 * 1024];
            random.NextBytes(tmpArray);
            data.Add(tmpArray);
            Console.WriteLine($"{data.Count} MB allocated");
        }
        catch (OutOfMemoryException)
        {
            Console.WriteLine("Further allocation failed.");
            return; // stop instead of retrying the failed allocation forever
        }
    }
}
As has already been pointed out, the main problem here is the nature of MemoryStream being backed by a byte[], which has fixed upper size.
The option of using an alternative Stream implementation has been noted. Another alternative is to look into "pipelines", the new IO API (System.IO.Pipelines). A "pipeline" is based around discontiguous memory, which means it isn't required to use a single contiguous buffer; the pipelines library will allocate multiple slabs as needed, which your code can process. I have written extensively on this topic; part 1 is here. Part 3 probably has the most code focus.
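For a flavour of the idea, here is a hedged sketch of my own (not from the linked posts; it assumes the System.IO.Pipelines package and a runtime with the Memory<byte>-based Stream overloads, i.e. .NET Core 2.1 or later). One task fills the pipe from the source stream while another drains it to the destination; the pipe hands out pooled, discontiguous slabs instead of one contiguous buffer:
using System;
using System.IO;
using System.IO.Pipelines;
using System.Threading.Tasks;

static class PipeCopy
{
    public static Task CopyAsync(Stream source, Stream destination)
    {
        var pipe = new Pipe();
        // Producer and consumer run concurrently.
        return Task.WhenAll(FillAsync(source, pipe.Writer), DrainAsync(pipe.Reader, destination));
    }

    private static async Task FillAsync(Stream source, PipeWriter writer)
    {
        while (true)
        {
            Memory<byte> memory = writer.GetMemory(64 * 1024); // ask the pipe for buffer space
            int read = await source.ReadAsync(memory);
            if (read == 0) break;                              // end of source
            writer.Advance(read);
            FlushResult flushed = await writer.FlushAsync();   // make the data visible to the reader
            if (flushed.IsCompleted) break;
        }
        writer.Complete();
    }

    private static async Task DrainAsync(PipeReader reader, Stream destination)
    {
        while (true)
        {
            ReadResult result = await reader.ReadAsync();
            foreach (ReadOnlyMemory<byte> segment in result.Buffer) // the buffer may consist of several segments
                await destination.WriteAsync(segment);
            reader.AdvanceTo(result.Buffer.End);                    // mark everything as consumed
            if (result.IsCompleted) break;
        }
        reader.Complete();
    }
}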
Just to confirm that I understand your question: you're downloading a single very large file in multiple parallel chunks and you know how big the final file is? If you don't then this does get a bit more complicated but it can still be done.
The best option is probably to use a MemoryMappedFile (MMF). What you'll do is to create the destination file via MMF. Each thread will create a view accessor to that file and write to it in parallel. At the end, close the MMF. This essentially gives you the behavior that you wanted with MemoryStreams but Windows backs the file by disk. One of the benefits to this approach is that Windows manages storing the data to disk in the background (flushing) so you don't have to, and should result in excellent performance.
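A hedged sketch of that approach (the names are mine, and it assumes C# 7 tuples plus knowing the total file size and each piece's offset up front, e.g. from the Content-Length and the ranges you requested):
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Threading.Tasks;

static class PieceAssembler
{
    public static void Assemble(string destinationPath, long totalSize, (long Offset, byte[] Data)[] pieces)
    {
        // Creating the mapping with a capacity also creates the destination file at its full size.
        using (var mmf = MemoryMappedFile.CreateFromFile(destinationPath, FileMode.Create, null, totalSize))
        {
            // Each piece gets its own view over its slice of the file; views can be written in parallel.
            Parallel.ForEach(pieces, piece =>
            {
                using (var view = mmf.CreateViewStream(piece.Offset, piece.Data.Length))
                {
                    view.Write(piece.Data, 0, piece.Data.Length);
                }
            });
        } // disposing the mapping closes it; Windows flushes dirty pages to disk in the background
    }
}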
I am working on optimization of memory consuming application. In relation to that I have question regarding C# reference type size overhead.
The C# object consumes as many bytes as its fields, plus some additional administrative
overhead. I presume that administrative overhead can be different for different .NET versions and implementations.
Do you know what is the size (or maximum size if the overhead is variable) of the administrative overhead for C# objects (C# 4.0 and Windows 7 and 8 environment)?
Does the administrative overhead differ between the 32-bit and 64-bit .NET runtimes?
Typically, there is an 8 or 12 byte overhead per object allocated by the GC. There are 4 bytes for the syncblk and 4 bytes for the type handle on 32bit runtimes, 8 bytes on 64bit runtimes. For details, see the "ObjectInstance" section of Drill Into .NET Framework Internals to See How the CLR Creates Runtime Objects on MSDN Magazine.
Note that the size of a reference itself also differs between 32-bit and 64-bit .NET runtimes (4 bytes vs. 8 bytes).
Also, there may be padding for types to fit on address boundaries, though this depends a lot on the type in question. This can cause "empty space" between objects as well, but is up to the runtime (mostly, though you can affect it with StructLayoutAttribute) to determine when and how data is aligned.
There is an article online with the title "The Truth About .NET Objects And Sharing Them Between AppDomains" which shows some rotor source code and some results of experimenting with objects and sharing them between app domains via a plain pointer.
http://geekswithblogs.net/akraus1/archive/2012/07/25/150301.aspx
12 bytes for all 32-bit versions of the CLR
24 bytes for all 64-bit versions of the CLR
You can test this quite easily by adding millions of objects (N) to an array. Since the pointer size is known, you can calculate the object size by dividing the measured difference by N.
// Baseline, forcing a full collection first
var initial = GC.GetTotalMemory(true);
const int N = 10 * 1000 * 1000;
var arr = new object[N];
for (int i = 0; i < N; i++)
{
    arr[i] = new object();
}
// Subtract the baseline and the N references held by the array itself,
// then divide by N to get the per-object size
var ObjSize = (GC.GetTotalMemory(false) - initial - N * IntPtr.Size) / N;
to get an approximate value on your .NET platform.
The object size is actually defined to allow the GC to make assumptions about the minimum object size.
\sscli20\clr\src\vm\object.h
//
// The generational GC requires that every object be at least 12 bytes
// in size.
#define MIN_OBJECT_SIZE (2*sizeof(BYTE*) + sizeof(ObjHeader))
For 32-bit, for example, this means that the minimum object size is 12 bytes, which leaves a 4-byte hole. This hole is empty for an empty object, but if you add e.g. an int field to your empty class, it is filled and the object size stays at 12 bytes.
There are two types of overhead for an object:
Internal data used to handle the object.
Padding between data members.
The internal data is two pointers, so in a 32-bit application that is 8 bytes, and in a 64-bit application that is 16 bytes.
Data members are padded so that they start on an even address boundary. If you for example have a byte and an int in the class, the byte is probably padded with three unused bytes so that the int starts on the next machine word boundary.
The layout of the classes is determined by the JIT compiler depending on the architecture of the system (and might vary between framework versions), so it's not known to the C# compiler.
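To make the padding visible, here is a small illustrative sketch of my own (not from the answers above). Marshal.SizeOf reports the unmanaged, sequential-layout size rather than the CLR's internal auto layout, but it demonstrates the same rule: the byte is padded so the int can start on a 4-byte boundary.
using System;
using System.Runtime.InteropServices;

// Sequential layout: 1 byte + 3 bytes of padding + 4-byte int = 8 bytes.
[StructLayout(LayoutKind.Sequential)]
struct Padded { public byte B; public int I; }

// Pack = 1 removes the padding: 1 + 4 = 5 bytes.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct Packed { public byte B; public int I; }

class PaddingDemo
{
    static void Main()
    {
        Console.WriteLine(Marshal.SizeOf(typeof(Padded))); // 8
        Console.WriteLine(Marshal.SizeOf(typeof(Packed))); // 5
    }
}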
Okay, this problem is indeed a challenge!
Background
I am working on an arithmetic-based project involving larger-than-normal numbers. I knew I was going to be working with a worst-case scenario of 4 GB-capped file sizes (I was hoping to even extend that to a 5 GB cap, as I have seen file sizes greater than 4 GB before - specifically image *.iso files).
The Question In General
Now, the algorithm(s) to which I will apply computation to do not matter at the moment, but the loading and handling of such large quantities of data - the numbers - do.
A System.IO.File.ReadAllBytes(String) call can only read up to 2 GB of file data, so this is my first problem - how will I go about loading and/or configuring access to memory for such file sizes - twice as much, if not more?
Next, I was writing my own class to treat a 'stream' or array of bytes as a big number and to add operator methods to perform hex arithmetic, until I read about the System.Numerics.BigInteger class online - but given that there is no BigInteger.MaxValue and that I can only load a max of 2 GB of data at a time, I don't know what the potential of BigInteger would be - even compared to the object I was writing, called Number() (which does have my desired minimum potential). There were also issues with available memory and performance, though I do not care so much about speed as about completing this experimental process successfully.
Summary
How should I load 4-5 gigabytes of data?
How should I store and handle the data after having been loaded? Stick with BigInteger or finish my own Number class?
How should I handle such large quantities of memory during runtime without running out of memory? I'll be treating the 4-5 GB of data like any other number instead of an array of bytes - performing such arithmetic as division and multiplication.
PS I cannot reveal too much information about this project under a non-disclosure agreement. ;)
For those who would like to see a sample operator from my Number object for a per-byte array adder (C#):
public static Number operator +(Number n1, Number n2)
{
// GB5_ARRAY is a cap constant for 5 GB - 5368709120L
byte[] data = new byte[GB5_ARRAY];
byte rem = 0x00, bA, bB, rm, dt;
// Iterate through all bytes until the second to last
// The last byte is the remainder if any
// I tested this algorithm on smaller arrays provided by the `BitConverter` class,
// then I made a few tweaks to satisfy the larger arrays and the Number object
for (long iDx = 0; iDx <= GB5_ARRAY-1; iDx++)
{
// bData is a byte[] with GB5_ARRAY number of bytes
// Perform a check - solves for unequal (or jagged) arrays
if (iDx < GB5_ARRAY - 1) { bA = n1.bData[iDx]; bB = n2.bData[iDx]; } else { bA = 0x00; bB = 0x00; }
Add(bA, bB, rem, out dt, out rm);
// set data and prepare for the next interval
rem = rm; data[iDx] = dt;
}
return new Number(data);
}
private static void Add(byte a, byte b, byte r, out byte result, out byte remainder)
{
int i = a + b + r;
result = (byte)(i % 256); // find the byte amount through modulus arithmetic
remainder = (byte)((i - result) / 256); // find remainder
}
Normally, you would process large files using a streaming API, either raw binary (Stream), or via some protocol-reader (XmlReader, StreamReader, etc). This also could be done via memory-mapped files in some cases. The key point here is that you only look at a small portion of the file at a time (a moderately-sized buffer of data, a logical "line", or "node", etc - depending on the scenario).
Where this gets odd is your desire to map this somehow directly to some form of large number. Frankly, I don't know how we can help with that without more information, but if you are dealing with an actual number of this size, I think you're going to struggle unless the binary protocol makes that convenient. And "performing such arithmetic as division and multiplication" is meaningless on raw data; that only makes sense on parsed data with custom operations defined.
Also: note that in .NET 4.5 you can flip a configuration switch to expand the maximum size of arrays, going over the 2GB limit. It still has a limit, but: it is a bit bigger. Unfortunately, the maximum number of elements is still the same, so if you are using a byte[] array it won't help. But if you are using SomeCompositeStruct[] you should be able to get higher usage. See gcAllowVeryLargeObjects
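The switch goes in the application configuration file; a minimal app.config enabling it (the process must run as 64-bit, and this requires .NET 4.5 or later) looks like this:
<configuration>
  <runtime>
    <gcAllowVeryLargeObjects enabled="true" />
  </runtime>
</configuration>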
Use FileStream: http://msdn.microsoft.com/en-us/library/system.io.filestream.aspx
FileStream is the beginning for you.
If you don't have enough memory (it should be at least 4x your maximum number size, I think) you will need to use the hard disk. So instead of having all the data in memory, you would load part of the data, do some computing, and write it back to the hard disk.
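A minimal sketch of that pattern (my own illustration; ProcessChunk is a hypothetical placeholder for whatever arithmetic you apply to each block):
using System.IO;

static class ChunkedProcessing
{
    // 64 MB at a time - far below the 2 GB array limit, and tunable.
    private const int ChunkSize = 64 * 1024 * 1024;

    public static void Process(string inputPath, string outputPath)
    {
        byte[] buffer = new byte[ChunkSize];
        using (var input = new FileStream(inputPath, FileMode.Open, FileAccess.Read))
        using (var output = new FileStream(outputPath, FileMode.Create, FileAccess.Write))
        {
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                ProcessChunk(buffer, read);    // hypothetical per-chunk computation
                output.Write(buffer, 0, read); // write the transformed block back out
            }
        }
    }

    private static void ProcessChunk(byte[] data, int count)
    {
        // placeholder for the arithmetic on this block
    }
}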
I need to declare square matrices in C# WinForms with more than 20000 items in a row.
I read about 2GB .Net object size limit in 32bit and also the same case in 64bit OS.
So as I understood the single answer - is using unsafe code or separate library built withing C++ compiler.
The problem is worse for me because ushort[20000,20000] is smaller than 2GB, but in practice I cannot allocate even 700MB of memory. My limit is 650MB and I don't understand why - I have 32-bit WinXP with 3GB of memory.
I tried to use Marshal.AllocHGlobal(700<<20) but it throws OutOfMemoryException, GC.GetTotalMemory returns 4.5MB before trying to allocate memory.
I found only that many people say to use unsafe code, but I cannot find an example of how to declare a 2-dim array on the heap (no stack can hold such a huge amount of data) and how to work with it using pointers.
Is it pure C++ code inside of unsafe{} brackets?
PS. Please don't ask WHY I need such huge arrays... but if you want to know - I need to analyze texts (for example books) and find a lot of indexes. So the answer is - matrices of relations between words.
Edit: Could somebody please provide a small example of working with matrices using pointers in unsafe code? I know that under 32-bit it is impossible to allocate more space, but I spent much time googling for such an example and found NOTHING.
Why demand a huge 2-D array? You can simulate this with, for example, a jagged array - ushort[][] - almost as fast, and you won't hit the same single-object limit. You'll still need buckets-o-RAM of course, so x64 is implied...
ushort[][] arr = new ushort[size][];
for(int i = 0 ; i < size ; i++) {
arr[i] = new ushort[size];
}
Besides which - you might want to look at sparse-arrays, eta-vectors, and all that jazz.
The reason why you can't get near even the 2GB allocation in 32-bit Windows is that arrays in the CLR are laid out in contiguous memory. In 32-bit Windows you have such a restricted address space that you'll find nothing like a 2GB hole in the virtual address space of the process. Your experiments suggest that the largest region of available address space is 650MB. Moving to 64-bit Windows should at least allow you to use a full 2GB allocation.
Note that the virtual address space limitation on 32-bit Windows has nothing to do with the amount of physical memory you have in your computer, in your case 3GB. Instead the limitation is caused by the number of bits the CPU uses to address memory. 32-bit Windows uses, unsurprisingly, 32 bits to access each memory address, which gives a total addressable memory space of 4GB. By default Windows keeps 2GB for itself and gives 2GB to the currently running process, so you can see why the CLR will find nothing like a 2GB allocation. With some trickery you can change the OS/user split so that Windows only keeps 1GB for itself and gives the running process 3GB, which might help. However, with 64-bit Windows the addressable memory assigned to each process jumps up to 8 terabytes, so here the CLR will almost certainly be able to use full 2GB allocations for arrays.
I'm so happy! :) Recently I played around with the subject problem - I tried to resolve it using a database, but found that this way is far from perfect. The [20000,20000] matrix was implemented as a single table.
Even with properly set up indexes, the time required just to create more than 400 million records is about 1 hour on my PC. That is not critical for me.
Then I ran the algorithm to work with that matrix (which requires joining the same table twice!) and after it had worked for more than half an hour it had not made even a single step.
After that I understood that the only way is to find a way to work with such a matrix in memory only, and so back to C# again.
I created a pilot application to test the memory allocation process and to determine where exactly the allocation process stops when using different structures.
As was said in my first post, it is possible to allocate only about 650MB using 2-dim arrays under 32-bit WinXP.
Results after using Win7 and a 64-bit compilation were also sad - less than 700MB.
I used JAGGED ARRAYS [][] instead of a single 2-dim array [,], and the results you can see below:
Compiled in Release mode as 32bit app - WinXP 32bit 3GB phys. mem. - 1.45GB
Compiled in Release mode as 64bit app - Win7 64bit 2GB under VM - 7.5GB
The sources of the application which I used for testing are attached to this post.
I cannot find how to attach source files here, so I will just describe the design and include the code manually.
Create WinForms application.
Put the following controls on the form, with default names:
1 button, 1 numericUpDown and 1 listbox
In .cs file add next code and run.
private void button1_Click(object sender, EventArgs e)
{
//Log(string.Format("Memory used before collection: {0}", GC.GetTotalMemory(false)));
GC.Collect();
//Log(string.Format("Memory used after collection: {0}", GC.GetTotalMemory(true)));
listBox1.Items.Clear();
if (string.IsNullOrEmpty(numericUpDown1.Text )) {
Log("Enter integer value");
}else{
int val = (int) numericUpDown1.Value;
Log(TryAllocate(val));
}
}
/// <summary>
/// Memory test method
/// </summary>
/// <param name="rowLen">row length; the jagged array holds rowLen * rowLen ints</param>
private IEnumerable<string> TryAllocate(int rowLen) {
    var r = new List<string>();
    r.Add(string.Format("Allocating using jagged array with overall size (MB) = {0}",
        ((long)rowLen * rowLen * Marshal.SizeOf(typeof(int))) >> 20));
    try {
        bool allocated = true;
        var ar = new int[rowLen][];
        for (int i = 0; i < ar.Length; i++) {
            try {
                ar[i] = new int[rowLen];
            }
            catch (OutOfMemoryException) {
                r.Add(string.Format("Unable to allocate memory on step {0}. Allocated {1} MB", i,
                    ((long)rowLen * i * Marshal.SizeOf(typeof(int))) >> 20));
                allocated = false;
                break;
            }
        }
        if (allocated) {
            r.Add("Memory was successfully allocated");
        }
    }
    catch (Exception e) {
        r.Add(e.Message + e.StackTrace);
    }
    return r;
}
#region Logging
private void Log(string s) {
listBox1.Items.Add(s);
}
private void Log(IEnumerable<string> s)
{
if (s != null) {
foreach (var ss in s) {
listBox1.Items.Add ( ss );
}
}
}
#endregion
The problem is solved for me. Guys, thank you in advance!
If sparse array does not apply, it's probably better to just do it in C/C++ with platform APIs related to memory mapped file: http://en.wikipedia.org/wiki/Memory-mapped_file
For the OutOfMemoryException read this thread (especially nobugz and Brian Rasmussen's answer):
Microsoft Visual C# 2008 Reducing number of loaded dlls
If you explained what you are trying to do it would be easier to help. Maybe there are better ways than allocating such a huge amount of memory at once.
Re-design is also choice number one in this great blog post:
BigArray, getting around the 2GB array size limit
The options suggested in this article are:
Re-design
Native memory for an array containing simple types (a rough sketch follows this list); sample code available here:
Unsafe Code Tutorial
Unsafe Code and Pointers (C# Programming Guide)
How to: Use Pointers to Copy an Array of Bytes (C# Programming Guide)
Writing a BigArray class which segments the large data structure into smaller segments of manageable size, sample code in the above blog post
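As a rough illustration of the native-memory option (my own sketch, not the article's sample; it assumes you compile with /unsafe and accept that there is no bounds checking), a 2-dim ushort matrix can be backed by a single Marshal.AllocHGlobal block and addressed through a pointer:
using System;
using System.Runtime.InteropServices;

public sealed unsafe class NativeUShortMatrix : IDisposable
{
    private readonly ushort* _data;
    private readonly int _rows, _cols;

    public NativeUShortMatrix(int rows, int cols)
    {
        _rows = rows;
        _cols = cols;
        // One contiguous native allocation; this throws OutOfMemoryException if the
        // address space is too fragmented, just like a managed array would.
        long byteCount = (long)rows * cols * sizeof(ushort);
        _data = (ushort*)Marshal.AllocHGlobal(new IntPtr(byteCount)).ToPointer();
    }

    public int Rows { get { return _rows; } }
    public int Cols { get { return _cols; } }

    // No bounds checking - callers must stay within [0, Rows) and [0, Cols).
    public ushort this[int row, int col]
    {
        get { return _data[(long)row * _cols + col]; }
        set { _data[(long)row * _cols + col] = value; }
    }

    public void Dispose()
    {
        Marshal.FreeHGlobal(new IntPtr(_data));
    }
}
Note that a [20000,20000] ushort matrix is still roughly 760MB of contiguous address space, so in a 32-bit process this will most likely fail for the same fragmentation reasons discussed above; the sketch only becomes practical as an x64 build.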
How can I reduce the number of loaded dlls When debugging in Visual C# 2008 Express Edition?
When running a visual C# project in the debugger I get an OutOfMemoryException due to fragmentation of 2GB virtual address space and we assume that the loaded dlls might be the reason for the fragmentation.
Brian Rasmussen, you made my day! :)
His proposal of "disabling the visual studio hosting process" solved the problem.
(for more information see history of question-development below)
Hi,
I need two big int-arrays to be loaded in memory with ~120 million elements (~470MB) each, and both in one Visual C# project.
When I'm trying to instantiate the 2nd Array I get an OutOfMemoryException.
I do have enough total free memory and after doing a web-search I thought my problem is that there aren't big enough contiguous free memory blocks on my system.
BUT! - when I'm instantiating only one of the arrays in one Visual C# instance and then open another Visual C# instance, the 2nd instance can instantiate an array of 470MB.
(Edit for clarification: In the paragraph above I meant running it in the debugger of Visual C#)
And the task-manager shows the corresponding memory usage-increase just as you would expect it.
So not enough contiguous memory blocks on the whole system isn't the problem. Then I tried running a compiled executable that instantiates both arrays which works also (memory usage 1GB)
Summary:
OutOfMemoryException in Visual C# using two big int arrays, but running the compiled exe works (mem usage 1GB) and two separate Visual C# instances are able to find two big enough contiguous memory blocks for my big arrays, but I need one Visual C# instance to be able to provide the memory.
Update:
First of all special thanks to nobugz and Brian Rasmussen, I think they are spot on with their prediction that "the Fragmentation of 2GB virtual address space of the process" is the problem.
Following their suggestions I used VMMap and listdlls for my short amateur-analysis and I get:
* 21 dlls listed for the "standalone"-exe. (the one that works and uses 1GB of memory.)
* 58 dlls listed for vshost.exe-version. (the version which is run when debugging and that throws the exception and only uses 500MB)
VMMap showed me that the biggest free memory blocks for the debugger version were 262, 175, 167, 155 and 108 MB.
So VMMap says that there is no contiguous 500MB block, and based on the info about free blocks I added ~9 smaller int-arrays which added up to more than 1.2GB of memory usage and actually did work.
So from that I would say that we can call "fragmentation of 2GB virtual address space" guilty.
From the listdlls output I created a small spreadsheet with the hex numbers converted to decimal to check the free areas between dlls, and I did find big free space for the standalone version in between its (21) dlls, but not for the vshost-debugger version (58 dlls). I'm not claiming that there can't be anything else in between, and I'm not really sure if what I'm doing there makes sense, but it seems consistent with VMMap's analysis, and it seems as if the dlls alone already fragment the memory for the debugger version.
So perhaps a solution would be if I would be able to reduce the number of dlls used by the debugger.
1. Is that possible?
2. If yes how would I do that?
You are battling virtual memory address space fragmentation. A process on the 32-bit version of Windows has 2 gigabytes of memory available. That memory is shared by code as well as data. Chunks of code are the CLR and the JIT compiler as well as the ngen-ed framework assemblies. Chunks of data are the various heaps used by .NET, including the loader heap (static variables) and the garbage collected heaps. These chunks are located at various addresses in the memory map. The free memory is available for you to allocate your arrays.
Problem is, a large array requires a contiguous chunk of memory. The "holes" in the address space, between chunks of code and data, are not large enough to allow you to allocate such large arrays. The first hole is typically between 450 and 550 Megabytes, that's why your first array allocation succeeded. The next available hole is a lot smaller. Too small to fit another big array, you'll get OOM even though you've got an easy gigabyte of free memory left.
You can look at the virtual memory layout of your process with the SysInternals' VMMap utility. Okay for diagnostics, but it isn't going to solve your problem. There's only one real fix, moving to a 64-bit version of Windows. Perhaps better: rethink your algorithm so it doesn't require such large arrays.
3rd update: You can reduce the number of loaded DLLs significantly by disabling the Visual Studio hosting process (project properties, debug). Doing so will still allow you to debug the application, but it will get rid of a lot of DLLs and a number of helper threads as well.
On a small test project the number of loaded DLLs went from 69 to 34 when I disabled the hosting process. I also got rid of 10+ threads. All in all a significant reduction in memory usage which should also help reduce heap fragmentation.
Additional info on the hosting process: http://msdn.microsoft.com/en-us/library/ms242202.aspx
The reason you can load the second array in a new application is that each process gets a full 2 GB virtual address space. I.e. the OS will swap pages to allow each process to address the total amount of memory. When you try to allocate both arrays in one process the runtime must be able to allocate two contiguous chunks of the desired size. What are you storing in the array? If you store objects, you need additional space for each of the objects.
Remember an application doesn't actually request physical memory. Instead each application is given an address space from which they can allocate virtual memory. The OS then maps the virtual memory to physical memory. It is a rather complex process (Russinovich spends 100+ pages on how Windows handle memory in his Windows Internal book). For more details on how Windows does this please see http://blogs.technet.com/markrussinovich/archive/2008/11/17/3155406.aspx
Update: I've been pondering this question for a while and it does sound a bit odd. When you run the application through Visual Studio, you may see additional modules loaded depending on your configuration. On my setup I get a number of different DLLs loaded during debug due to profilers and TypeMock (which essentially does its magic via the profiler hooks).
Depending on the size and load address of these they may prevent the runtime from allocating contiguous memory. Having said that, I am still a bit surprised that you get an OOM after allocating just two of those big arrays as their combined size is less than 1 GB.
You can look at the loaded DLLs using the listdlls tools from SysInternals. It will show you load addresses and size. Alternatively, you can use WinDbg. The lm command shows loaded modules. If you want size as well, you need to specify the v option for verbose output. WinDbg will also allow you to examine the .NET heaps, which may help you to pinpoint why memory cannot be allocated.
2nd Update: If you're on Windows XP, you can try to rebase some of the loaded DLLs to free up more contiguous space. Vista and Windows 7 uses ASLR, so I am not sure you'll benefit from rebasing on those platforms.
This isn't an answer per se, but perhaps an alternative might work.
If the problem is indeed that you have fragmented memory, then perhaps one workaround would be to just use those holes, instead of trying to find a hole big enough for everything consecutively.
Here's a very simple BigArray class that doesn't add too much overhead (some overhead is introduced, especially in the constructor, in order to initialize the buckets).
The statistics for the array is:
Main executes in 404ms
static Program-constructor doesn't show up
The statistics for the class is:
Main took 473ms
static Program-constructor takes 837ms (initializing the buckets)
The class allocates a bunch of 8192-element arrays (13-bit indexes), which for reference types on 64-bit will fall below the LOH (large object heap) limit. If you're only going to use this for Int32, you can probably up this to 14 and probably even make it non-generic, although I doubt it will improve performance much.
In the other direction, if you're afraid you're going to have a lot of holes smaller than the 8192-element arrays (64KB on 64-bit or 32KB on 32-bit), you can just reduce the bit-size of the bucket indexes through its constant. This will add more overhead to the constructor and add more memory overhead, since the outermost array will be bigger, but performance should not be affected.
Here's the code:
using System;
using NUnit.Framework;
namespace ConsoleApplication5
{
class Program
{
// static int[] a = new int[100 * 1024 * 1024];
static BigArray<int> a = new BigArray<int>(100 * 1024 * 1024);
static void Main(string[] args)
{
int l = a.Length;
for (int index = 0; index < l; index++)
a[index] = index;
for (int index = 0; index < l; index++)
if (a[index] != index)
throw new InvalidOperationException();
}
}
[TestFixture]
public class BigArrayTests
{
[Test]
public void Constructor_ZeroLength_ThrowsArgumentOutOfRangeException()
{
Assert.Throws<ArgumentOutOfRangeException>(() =>
{
new BigArray<int>(0);
});
}
[Test]
public void Constructor_NegativeLength_ThrowsArgumentOutOfRangeException()
{
Assert.Throws<ArgumentOutOfRangeException>(() =>
{
new BigArray<int>(-1);
});
}
[Test]
public void Indexer_SetsAndRetrievesCorrectValues()
{
BigArray<int> array = new BigArray<int>(10001);
for (int index = 0; index < array.Length; index++)
array[index] = index;
for (int index = 0; index < array.Length; index++)
Assert.That(array[index], Is.EqualTo(index));
}
private const int PRIME_ARRAY_SIZE = 10007;
[Test]
public void Indexer_RetrieveElementJustPastEnd_ThrowsIndexOutOfRangeException()
{
BigArray<int> array = new BigArray<int>(PRIME_ARRAY_SIZE);
Assert.Throws<IndexOutOfRangeException>(() =>
{
array[PRIME_ARRAY_SIZE] = 0;
});
}
[Test]
public void Indexer_RetrieveElementJustBeforeStart_ThrowsIndexOutOfRangeException()
{
BigArray<int> array = new BigArray<int>(PRIME_ARRAY_SIZE);
Assert.Throws<IndexOutOfRangeException>(() =>
{
array[-1] = 0;
});
}
[Test]
public void Constructor_BoundarySizes_ProducesCorrectlySizedArrays()
{
for (int index = 1; index < 16384; index++)
{
BigArray<int> arr = new BigArray<int>(index);
Assert.That(arr.Length, Is.EqualTo(index));
arr[index - 1] = 42;
Assert.That(arr[index - 1], Is.EqualTo(42));
Assert.Throws<IndexOutOfRangeException>(() =>
{
arr[index] = 42;
});
}
}
}
public class BigArray<T>
{
const int BUCKET_INDEX_BITS = 13;
const int BUCKET_SIZE = 1 << BUCKET_INDEX_BITS;
const int BUCKET_INDEX_MASK = BUCKET_SIZE - 1;
private readonly T[][] _Buckets;
private readonly int _Length;
public BigArray(int length)
{
if (length < 1)
throw new ArgumentOutOfRangeException("length");
_Length = length;
int bucketCount = length >> BUCKET_INDEX_BITS;
bool lastBucketIsFull = true;
if ((length & BUCKET_INDEX_MASK) != 0)
{
bucketCount++;
lastBucketIsFull = false;
}
_Buckets = new T[bucketCount][];
for (int index = 0; index < bucketCount; index++)
{
if (index < bucketCount - 1 || lastBucketIsFull)
_Buckets[index] = new T[BUCKET_SIZE];
else
_Buckets[index] = new T[(length & BUCKET_INDEX_MASK)];
}
}
public int Length
{
get
{
return _Length;
}
}
public T this[int index]
{
get
{
return _Buckets[index >> BUCKET_INDEX_BITS][index & BUCKET_INDEX_MASK];
}
set
{
_Buckets[index >> BUCKET_INDEX_BITS][index & BUCKET_INDEX_MASK] = value;
}
}
}
}
I had a similar issue once and what I ended up doing was using a list instead of an array. When creating the lists I set the capacity to the required sizes, and I defined both lists BEFORE I tried adding values to them. I'm not sure if you can use lists instead of arrays, but it might be something to consider. In the end I had to run the executable on a 64-bit OS, because when I added the items to the list the overall memory usage went above 2GB, but at least I was able to run and debug locally with a reduced set of data.
A question: Are all elements of your array occupied? If many of them contain some default value then maybe you could reduce memory consumption using an implementation of a sparse array that only allocates memory for the non-default values. Just a thought.
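For example, a minimal sketch of that idea (my own illustration, with a generic element type and the assumption that missing entries should read back as the default value):
using System.Collections.Generic;

public class SparseArray<T>
{
    // Only entries that were explicitly set are stored; everything else reads back as default(T).
    private readonly Dictionary<long, T> _values = new Dictionary<long, T>();

    public T this[long index]
    {
        get
        {
            T value;
            return _values.TryGetValue(index, out value) ? value : default(T);
        }
        set { _values[index] = value; }
    }
}
For a 2-D matrix you could key on a combined index such as (long)row * width + col.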
Each 32-bit process has a 2GB address space (unless you ask the user to add /3GB in the boot options), so if you can accept some performance drop-off, you can start a new process to get 2GB more address space - well, a little less than that. The new process would still be fragmented by all the CLR dlls plus all the Win32 DLLs they use, so you can get rid of the address space fragmentation caused by the CLR dlls by writing the new process in a native language, e.g. C++. You can even move some of your calculation to the new process so you get more address space in your main app and keep the chatter between the two processes low.
You can communicate between your processes using any of the interprocess communication methods. You can find many IPC samples in the All-In-One Code Framework.
I have experience with two desktop applications and one mobile application hitting out-of-memory limits. I understand the issues. I do not know your requirements, but I suggest moving your lookup arrays into SQL CE. Performance is good, you will be surprised, and SQL CE runs in-process. With the last desktop application, I was able to reduce my memory footprint from 2.1GB to 720MB, which had the benefit of speeding up the application due to significantly reducing page faults. (Your problem is fragmentation of the AppDomain's memory, which you have no control over.)
Honestly, I do not think you will be satisfied with performance after squeezing these arrays into memory. Don't forget, excessive page faults has a significant impact on performance.
If you do go SqlServerCe, make sure to keep the connection open to improve performance. Also, single row lookups (scalar) may be slower than returning a result set.
If you really want to know what is going on with memory, use CLR Profiler. VMMap is not going to help. The OS does not allocate memory directly to your application. The Framework does, by grabbing large chunks of OS memory for itself (caching the memory) and then allocating pieces of this memory to the application when needed.
CLR Profiler for the .NET Framework 2.0 at
https://github.com/MicrosoftArchive/clrprofiler