I have a C# application that I used to run on an XP machine.
I recently switched to a Windows 7 machine.
When running under the debugger I now get the following error: "System.StackOverflowException". I still have the XP machine, and the problem does not occur there.
It's overflowing in the middle of a recursive algorithm.
Is anyone familiar with this problem? Is the OS responsible for this, or is it the machine itself?
Many thanks for your help,
Michael
It would be helpful to know just how deep the recursion goes in XP before reaching the base case, and where it errors in Win7.
Theoretically, a Windows 7 process should have more available stack space than a WinXP process; at the very least, they should be the same. However, there are other factors at play here. Check out this blog post: http://blogs.technet.com/b/markrussinovich/archive/2009/07/08/3261309.aspx
In short, the limiting factor is usually "resident available memory"; this is physical RAM (not page file space) that is available for data that must be kept there and can't be swapped to the page file. A lot of things must be kept "resident" on the average computer and cannot be swapped out to the page file; most important is that anything that must be run in "kernel mode" (requiring direct access to the core system) must be kept in RAM to avoid page faults, even when there are no active threads for that process at the time.
Windows 7 has more of these "kernel-mode" processes. For instance, Windows Aero (which wasn't part of WinXP) uses your graphics card to accelerate rendering of the desktop, and so it must run in kernel mode. The Windows 7 kernel itself is larger, because it includes additional security and additional built-in hardware support. Windows 7 also has additional background processes etc that run in kernel mode that weren't in WinXP.
So, all other things being equal (including RAM), a Windows 7 machine will actually have less resident memory available to commit to your recursive algorithm, meaning that the algorithm will not be able to recurse deeply enough to reach the base case before a call triggers a StackOverflowException due to Windows not having enough resident memory to meet the "commit" required for the new call.
In addition, Windows 7 arranges things in memory differently. Older Windows versions (XP and older) reserved a memory space for each new process in roughly sequential fashion; the N+1th process (or thread) is given a memory address one block after the last one reserved for the Nth process/thread. Beginning with Windows Vista, memory was allocated in a more "random" fashion; Windows will choose a location in memory that may or may not be adjacent to any other reserved block (it's only guaranteed not to be a part of any other reserved block). This is a security feature designed to confuse malware and prevent it from successfully snooping around in other processes' memory. However, the less space-efficient allocation scheme means that the OS will more quickly run out of 1MB blocks of contiguous RAM to allocate to each new thread. At that point, it begins allocating the gaps. So, depending on your Windows 7 machine's specific memory usage footprint, the thread for your recursive function may request the usual 1MB of stack space, and be given a pointer by the OS which actually only has 128K of contiguous space. Your program won't be able to tell the difference, until it can't actually commit all the space it thought it had reserved. This can produce Heisenbugs where it'll work one time but fail the next because of non-deterministic differences in the exact memory space Windows reserves for the thread each time.
The answer to all of this is "more RAM". The amount needed by the core kernel-mode processes is relatively static, so every GB of additional RAM you can add is a GB that is available solely for user program processes and threads.
How recursive is recursive?
Anything deeper than about ten or so could be risky.
If you're exhausting the stack and you're sure it's not a bug, you could manage your own stack...
For instance:
void Process(SomeType foo)
{
    DoWork(foo); // work on foo
    foreach (var child in foo.Children)
    {
        Process(child);
    }
}
could become
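void Process(SomeType foo)
{
    // An explicit stack on the heap replaces the CLR call stack.
    Stack<SomeType> bar = new Stack<SomeType>();
    bar.Push(foo);
    while (bar.Count > 0) // Count avoids the System.Linq dependency of Any()
    {
        var item = bar.Pop();
        DoWork(item); // work on item
        foreach (var child in item.Children)
        {
            bar.Push(child);
        }
    }
}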
thus eliminating any CLR call-stack problems.
Of course, this won't fix an unbounded recursion.
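If you keep the recursive form and worry about it running away, one possible guard is RuntimeHelpers.EnsureSufficientExecutionStack(), which throws a catchable InsufficientExecutionStackException while there is still room to unwind. This is only a sketch: GuardedRecursion, Count and TryCount are made-up names for illustration, not anything from the code above.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical helper, for illustration only.
public static class GuardedRecursion
{
    // Counts down recursively, using one stack frame per step.
    public static int Count(int n)
    {
        // Throws InsufficientExecutionStackException when the remaining
        // stack space is low, before a real stack overflow can happen.
        RuntimeHelpers.EnsureSufficientExecutionStack();
        return n == 0 ? 0 : 1 + Count(n - 1);
    }

    // Returns false instead of crashing when the recursion is too deep.
    public static bool TryCount(int n, out int result)
    {
        try
        {
            result = Count(n);
            return true;
        }
        catch (InsufficientExecutionStackException)
        {
            result = -1;
            return false;
        }
    }
}
```

Calling GuardedRecursion.TryCount(1000, out depth) should succeed, while an absurd depth such as int.MaxValue should return false rather than take the process down. The check costs a little on every call, so it probably belongs only in code paths where the depth depends on input data.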
I don't believe this has anything to do with the physical RAM on your PC. I suspect the reason you didn't see it on XP is simply that Windows 7 probably has a (slightly?) different version of .NET.
Clearly, you need to limit the depth of your recursion somehow (or substitute a non-recursive loop).
But you can potentially configure your .NET stack size. Please look at these links:
http://www.atalasoft.com/cs/blogs/rickm/archive/2008/04/22/increasing-the-size-of-your-stack-net-memory-management-part-3.aspx
http://msdn.microsoft.com/en-us/library/5cykbwz4.aspx
How does the .NET IL .maxstack directive work?
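As a rough illustration of configuring the stack from code, the Thread constructor has an overload that takes a maxStackSize argument, so a deep recursion can be moved onto a dedicated thread with a larger reservation. The sketch below assumes that is acceptable for your application; BigStackRunner, Run and SumTo are hypothetical names, and the 64 MB figure in the usage note is an arbitrary choice.

```csharp
using System;
using System.Threading;

// Hypothetical helper, for illustration only.
public static class BigStackRunner
{
    // Runs 'work' on a dedicated thread whose stack reservation is
    // 'stackBytes', waits for it, and returns its result.
    public static T Run<T>(Func<T> work, int stackBytes)
    {
        T result = default(T);
        Exception error = null;

        var thread = new Thread(() =>
        {
            try { result = work(); }
            catch (Exception ex) { error = ex; }
        }, stackBytes);

        thread.Start();
        thread.Join();

        if (error != null)
            throw new InvalidOperationException("Worker thread failed", error);
        return result;
    }

    // Deliberately recursive sum of 1..n: one stack frame per step.
    public static long SumTo(int n)
    {
        return n == 0 ? 0 : n + SumTo(n - 1);
    }
}
```

For example, BigStackRunner.Run(() => BigStackRunner.SumTo(100000), 64 * 1024 * 1024) runs a recursion 100,000 frames deep on a 64 MB stack. This only raises the ceiling; an unbounded recursion will still overflow.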
I am working to diagnose a series of OutOfMemoryException problems within an application of ours. This is an internal 32-bit (x86) OWIN-hosted WebAPI that runs within a console application and talks to a series of hardware components in parallel. For that reason it's creating around 20 instances of a library, and the sharp increase in "virtual size" memory matches when those instances are created.
From the output of Process Explorer, and dotMemory, it does not appear that we're allocating that much actual memory within this application:
From reading many, many SO answers I think I understand that our problem is either from fragmentation within the G0, G1, G2 & LOH heaps, or we're possibly bumping into the 2GB addressable memory limit for a 32-bit process running on Windows 7. This application works in batches where it collects a bunch of data from hardware devices, creates collections in memory to aggregate that data into a single object, and then saves it to be retrieved by a client app. This activity is the cause of the spikes in the dotMemory visual, but these data structures are not enormous, which I think the dotMemory chart shows.
Looking at the heaps has shown they rarely grow beyond 10-15MB in size, and I don't see much evidence that the LOH is growing too large or being severely fragmented. I'm really struggling with how to proceed to better understand what's happening here.
So my question is two-fold:
Is it conceivable that we could be hitting that 2GB limit for virtual memory, and that's a cause for these memory exceptions?
If that is a possible cause then am I right in thinking a 64-bit build would get around that?
We are exploring moving to a 64-bit build, but that would require updating some low-level libraries we use to also be 64-bit. It's certainly an option we will explore eventually (if not sooner), but we're trying to understand this situation better before investing the time required.
Update after setting the LARGEADDRESSAWARE flag
Based on a recommendation I set that flag on the binary and interestingly saw the virtual size jump immediately to nearly 3GB. I don't know if I should be alarmed by that?!
I will monitor the application with this configuration for the next several hours.
In my case the advice provided by @ThomasWeller was indeed correct and enabling the "large address aware" flag has allowed this application to run for several days without throwing memory exceptions.
Today's PCs have a large amount of physical RAM but still, the stack size of C# is only 1 MB for 32-bit processes and 4 MB for 64-bit processes (Stack capacity in C#).
Why is the stack size in the CLR still so limited?
And why is it exactly 1 MB (4 MB), and not 2 MB or 512 KB? Why was it decided to use these amounts?
I am interested in considerations and reasons behind that decision.
You are looking at the guy that made that choice. David Cutler and his team selected one megabyte as the default stack size. Nothing to do with .NET or C#, this was nailed down when they created Windows NT. One megabyte is what it picks when the EXE header of a program or the CreateThread() winapi call doesn't specify the stack size explicitly. Which is the normal way, almost any programmer leaves it up to the OS to pick the size.
That choice probably pre-dates the Windows NT design, history is way too murky about this. Would be nice if Cutler would write a book about it, but he's never been a writer. He's been extraordinarily influential on the way computers work. His first OS design was RSX-11M, a 16-bit operating system for DEC computers (Digital Equipment Corporation). It heavily influenced Gary Kildall's CP/M, the first decent OS for 8-bit microprocessors. Which heavily influenced MS-DOS.
His next design was VMS, an operating system for 32-bit processors with virtual memory support. Very successful. His next one was cancelled by DEC around the time the company started disintegrating, not being able to compete with cheap PC hardware. Cue Microsoft, they made him an offer he could not refuse. Many of his co-workers joined too. They worked on VMS v2, better known as Windows NT. DEC got upset about it, money changed hands to settle it. Whether VMS already picked one megabyte is something I don't know, I only know RSX-11 well enough. It isn't unlikely.
Enough history. One megabyte is a lot, a real thread rarely consumes more than a couple of handfuls of kilobytes. So a megabyte is actually rather wasteful. It is however the kind of waste you can afford on a demand-paged virtual memory operating system, that megabyte is just virtual memory. Just numbers to the processor, one each for every 4096 bytes. You never actually use the physical memory, the RAM in the machine, until you actually address it.
It is extra excessive in a .NET program because the one megabyte size was originally picked to accommodate native programs. Which tend to create large stack frames, storing strings and buffers (arrays) on the stack as well. Infamous for being a malware attack vector, a buffer overflow can manipulate the program with data. Not the way .NET programs work, strings and arrays are allocated on the GC heap and indexing is checked. The only way to allocate space on the stack with C# is with the unsafe stackalloc keyword.
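To make the stackalloc point concrete, here is a small sketch. It assumes a compiler that supports C# 7.2 or later, where stackalloc can be assigned to a Span<int> without an unsafe block; with older compilers the same allocation needs an unsafe context and a pointer. StackBufferDemo and SumOfSquares are made-up names, and the 256-element cap is an arbitrary safety limit.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class StackBufferDemo
{
    // Fills a stack-allocated buffer with squares and returns their sum.
    public static int SumOfSquares(int count)
    {
        // Keep the buffer small: it lives in this method's stack frame.
        if (count < 0 || count > 256)
            throw new ArgumentOutOfRangeException(nameof(count));

        // The buffer is on the thread's stack, not on the GC heap.
        Span<int> buffer = stackalloc int[count];
        for (int i = 0; i < buffer.Length; i++)
            buffer[i] = i * i;

        int sum = 0;
        foreach (int value in buffer)
            sum += value;
        return sum;
    }
}
```

The buffer disappears when the method returns, so nothing is left for the garbage collector, but every element comes out of the same limited stack that the call frames use.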
The only non-trivial usage of the stack in .NET is by the jitter. It uses the stack of your thread to just-in-time compile MSIL to machine code. I've never seen or checked how much space it requires, it rather depends on the nature of the code and whether or not the optimizer is enabled, but a couple of tens of kilobytes is a rough guess. Which is otherwise how this website got its name, a stack overflow in a .NET program is quite fatal. There isn't enough space left (less than 3 kilobytes) to still reliably JIT any code that tries to catch the exception. Kaboom to desktop is the only option.
Last but not least, a .NET program does something pretty unproductive with the stack. The CLR will commit the stack of a thread. That's an expensive word that means that it doesn't just reserve the size of the stack, it also makes sure that space is reserved in the operating system's paging file so the stack can always be swapped out when necessary. Failing to commit is a fatal error and terminates a program unconditionally. That only happens on a machine with very little RAM that runs entirely too many processes, such a machine will have turned to molasses before programs start dying. A possible problem 15+ years ago, not today. Programmers that tune their program to act like an F1 race-car use the <disableCommitThreadStack> element in their .config file.
Fwiw, Cutler didn't stop designing operating systems. That photo was made while he worked on Azure.
Update, I noticed that .NET no longer commits the stack. Not exactly sure when or why this happened, it's been too long since I checked. I'm guessing this design change happened somewhere around .NET 4.5. Pretty sensible change.
The default reserved stack size is specified by the linker and it can be overridden by developers via changing the PE value at the link time or for an individual thread by specifying the dwStackSize parameter for the CreateThread WinAPI function.
If you create a thread with an initial stack size larger than or equal to the default stack size, then it is rounded up to the nearest multiple of 1 MB.
Why is the value 1 MB for 32-bit processes and 4 MB for 64-bit? I think you should ask the developers who designed Windows, or wait until one of them answers your question.
Mark Russinovich probably knows, and you can contact him. Maybe you can find this information in his Windows Internals books earlier than the sixth edition, which has less information about stacks than his article. Or maybe Raymond Chen knows the reasons, since he writes interesting things about Windows internals and its history. He could answer your question too, but you would have to post a suggestion to the Suggestion Box.
For now I'll try to explain some probable reasons why Microsoft chose these values, using MSDN and Mark's and Raymond's blogs.
The defaults probably have these values because in early times PCs were slow and allocating memory on the stack was much faster than allocating memory on the heap. Since stack allocations were much cheaper they were used heavily, which required a larger stack size.
So the value was the optimal reserved stack size for most applications. It's optimal because it allows a lot of nested calls and stack allocations for passing structures to called functions, while still allowing a lot of threads to be created.
Nowadays these values are mostly kept for backward compatibility, because structures that are passed as parameters to WinAPI functions are still allocated on the stack. But if you're not using stack allocations, a thread's stack usage will be significantly less than the default 1 MB, which is wasteful, as Hans Passant mentioned. To mitigate this, the OS commits only the first page of the stack (4 KB) unless the PE header of the application specifies otherwise. Other pages are allocated on demand.
Some applications override the reserved address space and the initially committed size to optimize memory usage. As an example, the maximum stack size of a thread in a native IIS process is 256 KB (KB932909). Decreasing the default values in this way is recommended by Microsoft:
It is best to choose as small a stack size as possible and commit the stack that is needed for the thread or fiber to run reliably. Every page that is reserved for the stack cannot be used for any other purpose.
Sources:
Thread Stack Size (Microsoft Docs)
Pushing the Limits of Windows: Processes and Threads (Mark Russinovich)
By default, the maximum stack size of a thread that is created in a native IIS process is 256 KB (KB932909)
Here is the deal: when my web server starts up, it creates a couple of lengthy arrays (20M elements) of really small objects (1 to 3 ints each). The cumulative size of any individual array is NOT larger than 2GB (the CLR limitation; see the link below for some details). The w3wp.exe process does grow in memory usage to close to 2GB (never more than that). The code is compiled in Any CPU platform mode and runs on Windows 7 x64 with 8GB of RAM.
What on earth makes it throw OutOfMemoryException while creating my lists? Does it make any difference whether I host the process through IIS or VS? This does not appear to happen in PROD, but I am experiencing it on my dev machine all the time. (Will try to restart now...)
This may be related but I don't seem to have objects that big:
Very large collection in .Net causes out-of-memory exception
EDIT:
It does make a difference whether it runs in IIS or VS: I don't see it happening when the process is started in IIS. So could it be a VS debugger limitation?
Based on your updated question, it's obvious that Visual Studio does not run in 64 bit mode. So your limitation is 2GB under Visual Studio.
This post probably contains some code helpful to prove this fact:
How to detect Windows 64-bit platform with .NET?
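A quick way to check this from inside the process itself is sketched below. It uses Environment.Is64BitProcess and Environment.Is64BitOperatingSystem (available since .NET 4.0) together with IntPtr.Size; BitnessInfo and Describe are hypothetical names chosen for the example.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class BitnessInfo
{
    // Reports whether the current process and the OS are 32-bit or 64-bit.
    public static string Describe()
    {
        return string.Format(
            "process: {0}-bit, OS: {1}-bit, pointer size: {2} bytes",
            Environment.Is64BitProcess ? 64 : 32,
            Environment.Is64BitOperatingSystem ? 64 : 32,
            IntPtr.Size);
    }
}
```

Logging BitnessInfo.Describe() at startup under both hosts should show whether the failing host is running the code as a 32-bit process.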
Probably the memory allocations are not optimized (i.e. they are done in small steps with resizes). This has the potential to fragment the heap so badly that there is no longer enough contiguous free space to store the 'semi-large' array.
That allocation fails, and this situation is by definition OOM, even though plenty of heap fragments might be available. Usually, excessive use of LINQ can cause this; at a certain point deferred execution loses its appeal, and you can buy a lot of performance/resources by doing one or two '.ToList()' calls at strategic places (in my experience, often close to the beginning of your generating process, where the bulk of the data arrives).
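The "small steps and resizes" pattern can be seen with a plain List<int>: every time the backing array is full, a larger one is allocated and the old one is left behind as garbage. The sketch below only counts those reallocations; ListGrowthDemo and CountResizes are hypothetical names, and the item count in the usage note is an arbitrary example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class ListGrowthDemo
{
    // Adds itemCount items and counts how often the backing array
    // of the list had to be reallocated along the way.
    public static int CountResizes(List<int> list, int itemCount)
    {
        int resizes = 0;
        int capacity = list.Capacity;
        for (int i = 0; i < itemCount; i++)
        {
            list.Add(i);
            if (list.Capacity != capacity)
            {
                resizes++;
                capacity = list.Capacity;
            }
        }
        return resizes;
    }
}
```

CountResizes(new List<int>(), 1000) reports several reallocations, while CountResizes(new List<int>(1000), 1000) reports none, because the capacity was given up front. With 20M elements each of those discarded arrays is large enough to land in the large object heap, which is where the fragmentation comes from.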
Check whether your app pool recycling threshold is set to 2GB
http://technet.microsoft.com/en-us/library/cc732519%28WS.10%29.aspx
My small stress test, which allocates random-length arrays (100..200MB each) in a loop, shows different behaviour on a 64-bit Win7 machine and on 32-bit XP (in a VM). Both systems first allocate as many arrays as will fit into the LOH, as normal. Then the LOH gets bigger and bigger until the available virtual address space is filled up. Expected behaviour so far. But then, on further requests, the two behave differently:
While on Win7 an OutOfMemoryException (OOM) is thrown, on XP it seems the heap gets increased and even swapped to disk; at least no OOM is thrown. (I don't know whether this has to do with XP running in a virtual box.)
Question:
How does the runtime (or the OS?) decide, for a managed memory allocation request that is too large to satisfy, whether an OOM is generated or the large object heap is increased, and eventually even swapped to disk?
If it is swapped, when does an OOM occur then?
IMO this question is important for all production environments that potentially deal with larger datasets. Somehow it feels "safer" to know the system would rather slow down dramatically in such situations (by swapping) than simply throw an OOM. At least it should somehow be deterministic, right?
Edit: the app is a 32-bit application, therefore running in 32-bit mode on Win 7.
The normal rules apply, a managed process is not treated differently by the Windows memory manager. The ultimate source for chunks of memory is the Windows memory manager. If it cannot find a hole in the virtual memory address space to fit the requested memory allocation then it fails the VirtualAlloc() call and the CLR generates OOM.
Same for swapping behavior, if pages in RAM are needed to map pages of other processes or even pages of the same process then they'll get swapped out. This is not otherwise associated with OOM.
You cannot assume it will work exactly the same on XP as it does on Win7 x64. Getting OOM on x64 when you build your program targeting AnyCPU is quite unusual, a 64-bit operating system has a very large virtual memory address space. The upper limit is set by the maximum size of the paging file. A 32-bit program will run in the WOW emulation layer, it can have a 4 GB address space if you set the LARGEADDRESSAWARE option bit with Editbin.exe.
You can use Sysinternals' VMMap utility to see how the address space of your process is carved up.
I was hoping someone could explain why my application uses varying amounts of RAM when loaded. I'm speaking about a compiled version that uses the exe directly. It's a pretty basic application and there are no conditional branches in its startup. Yet every time I start it up the RAM amount varies from 6MB to 16MB.
I know it's on the small end of usage anyway, but I'm curious why this happens.
Edit: to give a bit more clarification on what the app actually does.
It is a WinForm project.
It connects to a database using sqlclient to retrieve a list of servers.
Based on that list a series of buttons are created to start and stop a service on those servers.
Using the System.Timers class to audit the status of the services on those servers every 20 seconds.
The application at this point sits there and waits for user input via one of the button clicks to start/stop the service.
The trick here is that the amount of RAM reported by Task Manager is not the amount of RAM used by your application. Rather, it is the amount of RAM reserved for use by your application.
Remember that with managed frameworks like .Net, you don't request or release memory directly. Rather, a garbage collector manages the memory for you. The amount of memory reserved for your application at a given time can vary and depends on a lot of different factors, including memory pressure created at the time by other programs.
Think of it this way: if you need 10 MBs of RAM for your app, is it faster to request and return it to the operating system 1 MB at a time over 10 requests/releases or reserve the block at once with one request/release? Now extend that to a scenario where you don't know exactly how much RAM you'll need, only that it's somewhere in the neighborhood of 10 MB. Additionally, your computer has 1 GB sitting there unused. Of course the best thing to do is take a good-sized chunk of that available RAM. Even 20 or 30 MB wouldn't be unreasonable relative to the ram that's sitting there unused, because unused RAM is wasted performance.
If your system later starts to feel some memory pressure then .Net can easily return some RAM to the system. This is one of the ways managed languages can sometimes give better performance than languages like C++ with traditional memory management: a garbage collector that can more easily take the entire system health into account when allocating memory.
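To see the difference between what the garbage collector thinks is in use and what the process has been given, you can compare GC.GetTotalMemory with the counters on the Process class. This is only a sketch; MemorySnapshot and its members are hypothetical names, and the numbers will differ from run to run.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, for illustration only.
public static class MemorySnapshot
{
    // Bytes the garbage collector currently believes are allocated.
    public static long ManagedBytes()
    {
        return GC.GetTotalMemory(false);
    }

    // Physical memory currently mapped to this process, roughly the
    // figure that tools such as Task Manager report.
    public static long WorkingSetBytes()
    {
        using (Process process = Process.GetCurrentProcess())
        {
            return process.WorkingSet64;
        }
    }

    public static string Describe()
    {
        return string.Format(
            "managed heap: {0:N0} bytes, working set: {1:N0} bytes",
            ManagedBytes(),
            WorkingSetBytes());
    }
}
```

Logging MemorySnapshot.Describe() at startup over several runs should show whether the variation is in the managed heap itself or only in what the process has reserved around it.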
What are you using to determine how much memory is being "used"? Even with regular applications Windows will aggressively allocate unused memory in advance; with .NET applications it's even more complicated to tell how much memory is actually being used and how much Windows is just tacking on so that it will be available instantly when needed. If another application actually asks for memory, this reserved memory will be repurposed.
One way to check is to minimize the application (at least on XP). If you are looking at the memory use in something like Task Manager you'll notice it drops off right away, eliminating the seemingly "random" amount allocated.
It may be related to the jitter, after the first load the jitter already created a compiled version and it doesn't need to run. Other than that you would have to give us some more details about the app and which kind of memory you are referring to.