C# Huge object initializer throws Stack Overflow error

I need to build an object that consists of almost 20000 nested objects (in multiple levels). Each object is a simple database entity with 1-5 fields or a list of entities. I am using an inline object initializer to initialize my root object.
new OUTPUT() { XREF_CATALOG_MATERIALS = xrefCatalogMaterials.Find(x => x.MATERIAL.PART_NUM.Equals("xxxx")), FUNCTION = new FUNCTION() {...
I tried running in both x86 and x64 mode, and in both cases I get a StackOverflowException. The same code and logic work fine in other cases where my object is not that big (around 6000 nested objects).
Is there any way to increase the .NET application heap size? Any suggestions for solving this issue?

From that description, you don't have a problem with heap size; you have a problem with stack size. It looks like you're trying to invoke too many nested functions, and every function call consumes stack space.
The stack is much smaller than the heap, and it is relatively easy to overflow it. The easiest way is recursion.
https://msdn.microsoft.com/en-us/library/system.stackoverflowexception(v=vs.110).aspx
StackOverflowException is thrown for execution stack overflow errors, typically in case of a very deep or unbounded recursion.
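One possible workaround, not proposed in the answer above, is to run the stack-hungry construction on a dedicated thread with a larger stack. The sketch below relies on the `Thread(ThreadStart, int maxStackSize)` constructor; the names `BigStack` and `Run` and the 16 MB default are assumptions chosen for illustration, and splitting the initializer into smaller methods may be the better long-term fix.

```csharp
using System;
using System.Threading;

// Hypothetical helper: BigStack and Run are illustrative names, not a framework API.
public static class BigStack
{
    // Runs 'work' on a new thread whose maximum stack size is 'stackBytes'
    // (16 MB here is an arbitrary assumption) and returns its result.
    public static T Run<T>(Func<T> work, int stackBytes = 16 * 1024 * 1024)
    {
        T result = default(T);
        Exception error = null;

        var thread = new Thread(() =>
        {
            try { result = work(); }
            catch (Exception ex) { error = ex; }   // carry ordinary exceptions back to the caller
        }, stackBytes);

        thread.Start();
        thread.Join();                             // block until the work has finished

        if (error != null)
            throw new InvalidOperationException("Work failed on the large-stack thread.", error);
        return result;
    }
}
```

With a hypothetical `BuildRoot()` method that contains the big initializer, usage would look like `OUTPUT root = BigStack.Run(() => BuildRoot());`. A StackOverflowException still cannot be caught this way; the larger stack only raises the limit.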

Related

C# - How to Bypass Error cs0212 Cheaply for Programmers and Computers?

I want to process many integers in a class, so I listed them into an int* array.
int*[] pp = new int*[]{&aaa,&bbb,&ccc};
However, the compiler declined the code above with the following EXCUSE:
> You can only take the address of an unfixed expression inside of a fixed statement initializer
I know I can change the code above to avoid this error; however, we need to consider that ddd and eee will join the array in the future.
public enum E {
aaa,
bbb,
ccc,
_count
}
for(int i=0;i<(int)E._count;i++)
gg[(int)E.bbb]
 
Dictionary<string,int>ppp=new Dictionary<string,int>();
ppp["aaa"]=ppp.Count;
ppp["bbb"]=ppp.Count;
ppp["ccc"]=ppp.Count;
gg[ppp["bbb"]]
These solutions work, but they make both the code and the execution time longer.
I also expect a nonofficial patch to the compiler or a new nonofficial C# compiler, but I have not seen an available download for many years; it seems very difficult to have one for us.
Are there better ways, so that:
I do not need to count the elements of the array ppp.
If the code becomes longer, it is only by a few characters.
The execution time does not increase much.
Adding ddd and eee to the array takes only one or two statements for each new member.
The .NET runtime is a managed execution runtime which (among other things) provides garbage collection. The .NET garbage collector (GC) not only manages the allocation and release of memory, but also transparently moves the objects around the "managed heap", blocking the rest of your code while doing it.
It also compacts (defragments) the memory by moving longer lived objects together, and even "promoting" them into different parts of the heap, called generations, to avoid checking their status too often.
There is a bunch of memory being copied all the time without your program even realizing it. Since garbage collection is an operation that can happen at any time during the execution of your program, any pointer-related ("unsafe") operations must be done within a small scope, by telling the runtime to "pin" the objects using the fixed keyword. This prevents the GC from moving them, but only for a while.
Using pointers and unsafe code in C# is not only less safe, but also not very idiomatic for managed languages in general. If you come from a C background, you may feel at home with these constructs, but C# has a completely different philosophy: your job as a C# programmer should be to write reliable, readable and maintainable code, and only then think about squeezing out a couple of CPU cycles for performance reasons. You can use pointers from time to time in small functions that do some very specific, time-critical work. But even then it is your duty to profile before making such optimizations. Even the most experienced programmers often fail at predicting bottlenecks before profiling.
Finally, regarding your actual code:
I don't see why you think this:
int*[] pp = new int*[] {&aaa, &bbb, &ccc};
would be any more performant than this:
int[] pp = new int[] {aaa, bbb, ccc};
On a 32-bit machine, an int and a pointer are of the same size. On a 64-bit machine, a pointer is even bigger.
Consider replacing these plain ints with a class of your own which will provide some context and additional functionality/data to each of these values. Create a new question describing the actual problem you are trying to solve (you can also use Code Review for such questions) and you will benefit from much better suggestions.
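As a rough illustration of the "class of your own" suggestion, the sketch below wraps each integer in a small reference type, so an array can refer to the same values without pointers or unsafe code. `IntBox`, `IntBoxDemo` and `AddToAll` are hypothetical names invented for this example; the real class would carry whatever context the actual problem needs.

```csharp
using System;

// Hypothetical wrapper: a reference type holding one mutable integer.
public sealed class IntBox
{
    public int Value;
    public IntBox(int value) { Value = value; }
}

public static class IntBoxDemo
{
    // Adds 'delta' to every box. The array stores references, so the change
    // is also visible through the variables the boxes were created in.
    public static void AddToAll(IntBox[] boxes, int delta)
    {
        for (int i = 0; i < boxes.Length; i++)
            boxes[i].Value += delta;
    }
}
```

With `var aaa = new IntBox(1); var bbb = new IntBox(2);` and `var pp = new[] { aaa, bbb };`, calling `IntBoxDemo.AddToAll(pp, 10)` also changes `aaa.Value` and `bbb.Value`. Adding ddd later is one declaration plus one extra element in the array initializer, and `pp.Length` replaces any manual count.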

Most efficient way, Tags or List<GameObject>?

In my game I can iterate over either a list of game objects or the objects with a given tag, and I would like to know which way is more efficient.
Do tags save memory, or does Unity need a lot of resources to search by tag?
public List<City> _Citys = new List<City>();
or
foreach(GameObject go in GameObject.FindGameObjectsWithTag("City"))
You're better off using a List of City objects and doing a standard for loop to iterate over the 'City' objects. The List simply holds references to the 'City' objects, so the impact on memory should be minimal. You could also use a GameObject[] array instead of a List (an array is what FindGameObjectsWithTag returns).
It's better for performance to use a populated List/Array rather than searching by Tags and of course you're explicitly pointing to an object rather than using 'magic' strings -- if you change the tag name later on then the FindGameObjectsWithTag method will silently break, as it will no longer find any objects.
Also, avoid using a foreach loop in Unity, as this unfortunately creates a lot of garbage (the garbage collector in Unity isn't great, so it's best to create as little garbage as possible). Instead, just use a standard for loop:
Replace the “foreach” loops with simple “for” loops. For some reason, every iteration of every “foreach” loop generated 24 Bytes of garbage memory. A simple loop iterating 10 times left 240 Bytes of memory ready to be collected which was just unacceptable
EDIT: As mentioned in pid's answer - measure. You can use the built-in Unity profiler to inspect memory usage: http://docs.unity3d.com/Manual/ProfilerMemory.html
Per Microsoft's C# API rules, verbs such as Find* or Count* denote active code while terms such as Length stand for actual values that require no code execution.
Whether the Unity3D folks respected those guidelines is a matter of debate, but from the name of the method I can already tell that it has a cost and should not be taken too lightly.
On the other side, your question is about performance, not correctness. Both ways are correct per se, but one is supposed to have better performance.
So, the main rule of refactoring for performance is: MEASURE.
The answer depends on memory allocation and garbage collection, so it is impossible to tell which really is faster without measuring.
So the best advice I could give you is pretty general. Whenever you feel the need to enhance performance of code you have to actually measure what you are about to improve, before and after.
Your code examples are 2 distinctly different things. One is instantiating a list, and one is enumerating over an IEnumerable returned from a function call.
I assume you mean the difference between iterating over your declared list and iterating over the return value of GameObject.FindGameObjectsWithTag(), in which case:
Storing a List as a member variable in your class, populating it once and then iterating over it several times is more efficient than calling GameObject.FindGameObjectsWithTag and iterating over its result several times.
This is because you keep your List and the references to the objects in it at all times, without having to repopulate it.
GameObject.FindGameObjectsWithTag will search your entire object hierarchy and compile a list of all the objects that match your search criteria. This is done every time you call the function, so there is additional overhead even if the number of objects it finds stays the same, because it still searches your hierarchy.
To be honest, you could just cache the results of GameObject.FindGameObjectsWithTag in a List, provided the set of objects returned will not change (that is, as long as you are not instantiating or destroying any of those objects).

Push a stack onto another stack

In C#, is there a way to push one Stack onto another Stack without iterating through the stack elements? If not, is there a better data structure I should be using? In Java you can do:
stack1.addAll(stack2)
I was hoping to find the C# analogue...
0. Safe Solution - Extension Method
public static class Util {
    public static void AddAll<T>(this Stack<T> stack1, Stack<T> stack2) {
        T[] arr = new T[stack2.Count];
        stack2.CopyTo(arr, 0);
        for (int i = arr.Length - 1; i >= 0; i--) {
            stack1.Push(arr[i]);
        }
    }
}
Probably the best approach is to create an extension method. Note that I am putting the second stack "on top" of the first, so to speak, by looping from arr.Length-1 to 0. So this code:
Stack<int> x = new Stack<int>();
x.Push(1);
x.Push(2);
Stack<int> y = new Stack<int>();
y.Push(3);
y.Push(4);
x.AddAll(y);
Will result in x being: 4,3,2,1 (listed from top to bottom). Which is what you would expect if you had pushed 1,2,3,4. Of course, if you were to loop through your second stack, actually pop its elements and push them onto the first stack, you would end up with 3,4,2,1 instead. Again, modify the for loop as you see fit. Or you could add another parameter to specify which behavior you would like. I don't have Java handy, so I don't know what they do.
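The "another parameter" idea could be sketched roughly as follows. `StackExtensions`, `PushAll` and `preserveOrder` are names made up for this illustration. The sketch relies on `Stack<T>.ToArray()` returning the elements in pop order (top first), and it leaves the source stack unchanged.

```csharp
using System.Collections.Generic;

// Hypothetical extension class; the names here are illustrative only.
public static class StackExtensions
{
    // Pushes every element of 'source' onto 'target'.
    // preserveOrder = true : the top of 'source' ends up as the top of 'target'.
    // preserveOrder = false: same order as popping 'source' item by item (reversed).
    public static void PushAll<T>(this Stack<T> target, Stack<T> source, bool preserveOrder = true)
    {
        T[] items = source.ToArray();   // index 0 is the top of 'source'
        if (preserveOrder)
        {
            // Push from the bottom of 'source' upwards.
            for (int i = items.Length - 1; i >= 0; i--)
                target.Push(items[i]);
        }
        else
        {
            // Push from the top of 'source' downwards.
            for (int i = 0; i < items.Length; i++)
                target.Push(items[i]);
        }
    }
}
```

With x holding 1,2 and y holding 3,4 as in the example above, `x.PushAll(y)` leaves x as 4,3,2,1 from top to bottom, while `x.PushAll(y, false)` leaves it as 3,4,2,1.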
Having said that, you could do this, but I don't make any guarantees that it will continue to work. MS could always change the default behavior of how stack works when calling ToList. But, this is shorter, and on my machine with .NET 4.5 works the same as the extension method above:
1 Line Linq solution:
y.Reverse().ToList().ForEach(item => x.Push(item));
In your question, wanting to do this "without iterating through the stack elements" basically means a LinkedList-based stack where you would just join the first and last elements to combine stacks in constant time.
However, unless you have a very specific reason for using LinkedList, it's likely a better idea to just iterate over the elements of an array-based (List-based) stack.
As far as a specific implementation goes, you should probably clarify whether you want the second stack to be added to the first in the same stack order or to be reversed into the first stack by being popped out.
An addAll would just be a convenience method for a foreach loop that adds all of the items. There really isn't much you can do besides that:
foreach (var item in stack2)
    stack1.Push(item);
Note that enumerating a Stack yields its items from top to bottom, so this loop pushes them onto stack1 in reversed order. If you do this particularly frequently, you can add an extension method for it, for your own convenience.
This isn't meant to be done with the current .NET Stack implementation.
In order for the content of a Stack to be "grafted" onto the end of another Stack without iterating through its elements, the internal details of how the Stack class stores them in memory have to be known. Based on the principle of encapsulation, this information is "officially" only known inside the Stack class itself. .NET's Stack does not expose methods to do this, so without using reflection there is no way to do it as the OP requested.
Conceivably you could use reflection to append to the internal array of one Stack the content of another Stack and also update the field that stores the stack length but this would be highly dependent on the implementation of the Stack class which could be changed in future versions of the framework without warning.
If you really need a Stack that can do this you could write your own Stack class from scratch or simply use another collection like ArrayList or LinkedList which have the method you want and add Push and Pop extension methods to them.

C# Stack Overflow Overwrite EIP

I would like to write a vulnerable program, to better understand Stack Overflow (causes) in c#, and also for educational purposes. Basically, I "just" want a stack overflow, that overwrites the EIP, so I get control over it and can point to my own code.
My problem is: Which objects do use the stack as memory location?
For example: the program parses a text file by reading it recursively, byte by byte, until a line break is found (yeah, I think nobody would do this, but as this is only for learning...). Currently, I'm appending the hex value of the chars in a text file to a string. This string is a field of an object that is instantiated after calling main().
Using WinDbg, I got these values after the stack has overflown from (nearly) endless recursion:
(14a0.17e0): Break instruction exception - code 80000003 (first chance)
eax=00000000 ebx=00000000 ecx=0023f618 edx=778570b4 esi=fffffffe edi=00000000
eip=778b04f6 esp=0023f634 ebp=0023f660 iopl=0
BTW I'm using a Win7 x86 AMD machine, if this is of interest.
I've seen many C++ examples causing a stack overflow using strcpy, is there any similar method in c#?
Best Regards,
NoMad
edit: I use this code to cause the stack overflow.
class FileTest
{
    FileStream fs = new FileStream("test.txt", FileMode.Open, FileAccess.Read);
    string line = String.Empty;
    public FileTest()
    {
        Console.WriteLine(ReadTillBreak());
    }
    private string ReadTillBreak()
    {
        int b = 0;
        b = fs.ReadByte();
        line += (char)b;
        if (b != 13)
            ReadTillBreak();
        return line;
    }
}
Is it possible to overflow the stack and write into the eip with the line string (so, content of test.txt)?
The reason you can exploit stack corruption in C and C++ is that you handle memory yourself and the language allows you to do all sorts of crazy stuff. C# runs in an environment that is specifically designed to prevent a lot of these problems. That is, while you can easily generate a stack overflow in C#, there's no way to modify the control flow of the program that way using managed code.
The way exploits against managed environments usually work is by breaking out of the sandbox so to speak. As long as the code runs in the sandbox there are a lot of these tricks that will simply not work.
If you want to learn about stack corruption I suggest you stick to C or C++.
I'm not entirely clear on your description of what you have tried. Stack overflows do not generally "overwrite the EIP".
To cause a stack overflow, the most straightforward way is something like this.
void RecursiveMethod()
{
    RecursiveMethod();
}
Since each call to this method stores the return address on the stack, calling it endlessly like this without returning will eventually use up all stack space. Of course, modern Windows applications have tons of stack space so it could take a while. You could increase the amount of stack usage for each call by adding arguments or local variables within the method.

C#: How to test for StackOverflowException

Say you have a method that could potentially get stuck in an endless method-call loop and crash with a StackOverflowException. For example my naive RecursiveSelect method mentioned in this question.
Starting with the .NET Framework version 2.0, a StackOverflowException object cannot be caught by a try-catch block and the corresponding process is terminated by default. Consequently, users are advised to write their code to detect and prevent a stack overflow. For example, if your application depends on recursion, use a counter or a state condition to terminate the recursive loop.
Taking that information (from this answer) into account, since the exception can't be caught, is it even possible to write a test for something like this? Or would a test for this, if that failed, actually break the whole test-suite?
Note: I know I could just try it out and see what happens, but I am more interested in general information about it. Like, would different test frameworks and test runners handle this differently? Should I avoid a test like this even though it might be possible?
You would need to solve the Halting Problem! That would get you rich and famous :)
It's evil but you can spin it up in a new process. Launch the process from the unit test and wait for it to complete and check the result.
How about checking the number of frames on the stack in an assert statement?
const int MaxFrameCount = 100000;
Debug.Assert(new StackTrace().FrameCount < MaxFrameCount);
In your example from the related question this would be (The costly assert statement would be removed in the release build):
public static IEnumerable<T> SelectRecursive<T>(this IEnumerable<T> subjects, Func<T, IEnumerable<T>> selector)
{
    const int MaxFrameCount = 100000;
    Debug.Assert(new StackTrace().FrameCount < MaxFrameCount);
    // Stop if subjects are null or empty
    if (subjects == null || !subjects.Any())
        yield break;
    // For each subject
    foreach (var subject in subjects)
    {
        // Yield it
        yield return subject;
        // Then yield all its descendants
        foreach (var descendant in SelectRecursive(selector(subject), selector))
            yield return descendant;
    }
}
It's not a general test though, as you need to expect it to happen, plus you can only check the frame count and not the actual size of the stack. It is also not possible to check whether another call will exceed stack space, all that you can do is roughly estimate how many calls in total will fit on your stack.
The idea is to keep track of how deeply a recursive function is nested, so that it doesn't use too much stack space. Example:
string ProcessString(string s, int index) {
    // Done: every character has been processed
    if (index >= s.Length) return s;
    if (index > 1000) return "Too deeply nested";
    s = s.Substring(0, index) + s.Substring(index, 1).ToUpper() + s.Substring(index + 1);
    return ProcessString(s, index + 1);
}
This of course can't totally protect you from stack overflows, as the method can be called with too little stack space left to start with, but it makes sure that the method doesn't singlehandedly cause a stack overflow.
We cannot have a test for StackOverflow because this is the situation when there is no more stack left for allocation, the application would exit automatically in this situation.
If you are writing library code that somebody else is going to use, stack overflows tend to be a lot worse than other bugs because the other code can't just swallow the StackOverflowException; their entire process is going down.
There's no easy way to write a test that expects and catches a StackOverflowException, but that's not the behavior you want to be testing!
Here are some tips for testing that your code doesn't overflow the stack:
Parallelize your test runs. If you have a separate test suite for the stack overflow cases, then you'll still get results from the other tests if one test runner goes down. (Even if you don't separate your test runs, I'd consider stack overflows to be so bad that it's worth crashing the whole test runner if it happens. Your code shouldn't break in the first place!)
Threads may have different amounts of stack space, and if somebody is calling your code you can't control how much stack space is available. While the default for the 32-bit CLR is a 1MB stack and for the 64-bit CLR a 2MB stack, be aware that web servers will default to a much smaller stack. Your test code could use one of the Thread constructors that takes a smaller stack size if you want to verify that your code won't overflow the stack with less available space.
Test every different build flavor that you ship (Release and Debug? with or without debugging symbols? x86 and x64 and AnyCPU?) and platforms you'll support (64 bit? 32 bit? 64 bit OS running 32 bit? .NET 4.5? 3.5? mono?). The actual size of the stack frame in the generated native code could be different, so what breaks on one machine might not break on another.
Since your tests might pass on one build machine but fail on another, ensure that if it starts failing it doesn't block checkins for your entire project!
Once you measure how few iterations N cause your code to overflow the stack, don't just test that number! I'd test that a much larger number (50 N?) of iterations doesn't overflow the stack (so you don't get tests that pass on one build server but fail on another).
Think about every possible code path where a function can eventually later call itself. Your product might prevent the X() -> X() -> X() -> ... recursive stack overflow, but what if there is a X() -> A() -> B() -> C() -> ... -> W() -> X() -> A() -> ... recursive case that is still broken?
PS: I don't have more details about this strategy, but apparently if your code hosts the CLR then you can specify that stack overflow only crashes the AppDomain?
First and foremost I think the method should handle this and make sure it does not recurse too deep.
When this check is done, I think the check should be tested - if anything - by exposing the maximum depth reached and assert it is never larger than the check allows.
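A minimal sketch of that idea, assuming a made-up class named `DepthLimitedWalker`: the recursive method enforces a limit and records the deepest level it reached, so a test can assert on `MaxDepthReached` instead of trying to provoke a real StackOverflowException. The class, its members and the choice of InvalidOperationException are all assumptions for illustration.

```csharp
using System;

// Hypothetical class showing a recursion guard that exposes the depth it reached.
public sealed class DepthLimitedWalker
{
    private readonly int _maxDepth;

    // Deepest recursion level reached so far; a test can assert on this.
    public int MaxDepthReached { get; private set; }

    public DepthLimitedWalker(int maxDepth) { _maxDepth = maxDepth; }

    // Recurses 'levels' times, refusing to go deeper than the configured limit.
    public void Descend(int levels) { Descend(levels, 1); }

    private void Descend(int levels, int depth)
    {
        if (depth > _maxDepth)
            throw new InvalidOperationException("Recursion limit of " + _maxDepth + " exceeded.");
        if (depth > MaxDepthReached)
            MaxDepthReached = depth;
        if (levels > 1)
            Descend(levels - 1, depth + 1);
    }
}
```

A test can then call `Descend` with an input deeper than the limit, expect the ordinary (catchable) exception, and assert that `MaxDepthReached` never exceeds the limit.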
