Is there a full list of optimizations done by the /optimize C# compiler key available anywhere?
EDIT:
Why is it disabled by default?
Is it worth using in a real-world app? -- It turns out it is disabled by default only in the Debug configuration and enabled in Release.
Scott Hanselman has a blog post that shows a few examples of what /optimize (which is enabled in Release Builds) does.
As a summary: /optimize does many things, with no exact number or definition given, but one of the more visible is method inlining. If you have a method A() which calls B() which calls C() which calls D(), the compiler may "skip" B and C and go from A to D directly. This may cause a "weird" call stack in the Release build.
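A minimal sketch of that situation (the method names here are made up purely for illustration) - run this as a Release build without a debugger attached, and the printed stack trace may be missing the B and C frames:
using System;

class InliningDemo
{
    static void A() => B();
    static void B() => C();   // small enough that the JIT may inline it
    static void C() => D();   // likewise a candidate for inlining
    static void D() => Console.WriteLine(Environment.StackTrace);

    static void Main() => A();
}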
It is disabled by default for debug builds. For Release builds it is enabled.
It is definitely worth enabling this switch as the compiler makes lots of tweaks and optimizations depending on the kind of code you have.
For example: skipping redundant initializations, comparisons that never change, etc.
Note: You might have some difficulty debugging if you turn on optimization, as the code you have and the IL code that is generated may not match. This is the reason it is turned on only for Release builds.
Quoted from the MSDN page:
The /optimize option enables or disables optimizations performed by the compiler to make your output file smaller, faster, and more efficient.
In other words, it does exactly what you think it would - optimises the compiled CIL (Common Intermediate Language) code that gets executed by the .NET VM. I wouldn't worry about what the specific optimisations are - suffice to say that they are many, and probably quite complex in some cases. If you are really interested in what sort of things it does, you could probably investigate the Mono C# Compiler (I doubt the details about the MS C# one are public).
The reason optimisation is disabled by default for Debug configurations is that it makes certain debugging features impossible. A few notable ones:
Perhaps most crucially, the Edit and Continue feature is disabled - i.e. no modifying code during execution.
Breaking execution often means the wrong line of code is highlighted (usually the one after the expected one).
Unused local variables aren't actually assigned or even declared (see the sketch below).
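As a small illustration of that last point (a made-up method, purely for example), an optimized build simply drops the unused local from the emitted IL:
class Example
{
    static int Sum(int[] values)
    {
        int unused = 42;   // with /optimize on, this local is dropped from the IL entirely
        int total = 0;
        foreach (int v in values)
            total += v;
        return total;
    }
}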
Really, the default options for optimisation never ought to be changed. Having the option off for debugging is highly useful, while having it on for Release mode is equally wise.
Related
I heard compiling in Release mode generates more optimized code than Debug mode, which is fine.
But is this optimization in the IL? Is it in the machine code once the CLR runs it? Is the metadata structure of a PE compiled in Release different from one compiled in Debug?
Thanks.
Building in the Release configuration turns on the /optimize compile option for the C# compiler. That has a few side effects: the IL indeed changes, but not a great deal. Notably, the compiler no longer makes an effort to make the code perfectly debuggable. For example, it skips an empty static constructor, it no longer emits the NOP opcodes that allow you to set a breakpoint on a curly brace, and it allows local variables with different scopes to overlap in a stack frame. Small stuff.
The most important difference is the [Debuggable] attribute that's emitted for the assembly: its IsJITOptimizerDisabled property is false.
Which turns on the real optimizer, the one that's built into the jitter. You'll find the list of optimizations it performs in this answer. Do note the usefulness of this approach: any language benefits from having the code optimizer in the jitter instead of the compiler.
So in a nutshell, very minor changes in the IL, very large changes in the generated machine code.
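If you want to check this yourself, here is a small sketch that reads the assembly's [Debuggable] attribute via reflection (System.Diagnostics.DebuggableAttribute is the real API; the program around it is illustrative). Run it from a Debug build and from a Release build and compare the output:
using System;
using System.Diagnostics;
using System.Reflection;

class Program
{
    static void Main()
    {
        // Read the [Debuggable] attribute the compiler emitted for this assembly.
        var attr = (DebuggableAttribute)Attribute.GetCustomAttribute(
            Assembly.GetExecutingAssembly(), typeof(DebuggableAttribute));

        if (attr == null)
            Console.WriteLine("No DebuggableAttribute emitted.");
        else
            Console.WriteLine("IsJITOptimizerDisabled = " + attr.IsJITOptimizerDisabled);
    }
}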
Yes, there's some optimization in the IL - in particular, the debug version will include NOP instructions which make it easy for a debugger to insert break points, I believe. There are also potentially differences in terms of the level of debug information provided (line numbers etc).
I suggest you take a small sample program, compile it in both ways, and then look at the output in ildasm.
The C# compiler doesn't do much optimization - the JIT compiler does most of that - but I think there are some differences.
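For example, something along these lines (assuming csc.exe and ildasm.exe are on your PATH; the file names are illustrative) lets you diff the two outputs:
csc /debug /out:DebugVer.exe Program.cs
csc /optimize /out:ReleaseVer.exe Program.cs
ildasm /text DebugVer.exe > DebugVer.il
ildasm /text ReleaseVer.exe > ReleaseVer.il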
The CIL differs; it is optimized. Since the machine code is a translation of the CIL, it also differs. You can see it for yourself: just open the Disassembly window in Visual Studio. Metadata should remain the same, as you don't change the structure of class contracts between releases.
In VB there is a side-effect of Edit + Continue support compiled into the executable, which can cause a memory leak. It is affected by any event that is declared with the WithEvents keyword. A WeakReference keeps track of those event instances. Problem is, those WeakReferences are leaked if you run the app without a debugger. The rate at which the process consumes memory is highly dependent on how many instances of the class get created. The leak is 16 bytes per event per object.
Disclaimer: copied from Hans' answer here
See this Microsoft knowledge base article.
This is not an answer to the exact question. Just to add that you can purposefully mark which code has to be run in Debug mode and which in Release mode with the help of preprocessor directives.
#if DEBUG
// code only meant for debug mode
#endif
#if !DEBUG
// code only meant for release mode
#endif
So if you do this you'd get different IL generated.
The .NET CLR JIT will, to my understanding, try to optimize code using patterns such as Method Inlining, Loop Unrolling, etc. In the case of Method Inlining, this would not be performed for reasons such as the following:
Methods that are greater than 32 bytes of IL will not be inlined.
Virtual functions are not inlined.
Methods that have complex flow control will not be inlined. Complex flow control is any flow control other than if/then/else; in this case, switch or while.
Methods that contain exception-handling blocks are not inlined, though methods that throw exceptions are still candidates for inlining.
If any of the method's formal arguments are structs, the method will not be inlined.
Etc...
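For reference, inlining can also be influenced explicitly from code - this uses the real System.Runtime.CompilerServices.MethodImplAttribute, though AggressiveInlining is only a hint the JIT may still ignore:
using System.Runtime.CompilerServices;

class Hints
{
    // Asks the JIT to consider inlining even past its usual size threshold.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    static int Square(int x) => x * x;

    // Forbids inlining entirely, e.g. to keep a frame visible in stack traces.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static int Cube(int x) => x * x * x;
}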
My question is: is there any way to detect what the JIT optimization process is deciding to skip, for these or other reasons?
My thinking is that I want to know what areas of code may need to be restructured to ensure I can take advantage of JIT optimizations.
Nowadays you can run your application on your own build of CoreCLR and gather all the statistics you want. You can examine clrconfigvalues.h and enable any flag you need to get the related information (for example JitDump, via the set COMPLUS_JitDump command at a command prompt).
It's not quite easy, but it's possible.
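Short of rebuilding the runtime, one crude probe you can write yourself is to compare stack depths in an optimized run. This is a hedged sketch (frame counts are an observation, not a guarantee): if the JIT inlined Helper, its frame disappears from the count.
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

class InliningProbe
{
    // NoInlining pins this frame so the count below has a stable baseline.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static int CountFrames() => Helper();

    // No attribute: in an optimized run without a debugger, the JIT may inline
    // this, and its frame then disappears from the stack trace.
    static int Helper() => new StackTrace().FrameCount;

    static void Main() => Console.WriteLine("Frames observed: " + CountFrames());
}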
How much performance gain (if any) can a windows service gain between a debug build and release build and why?
For managed code, unless you have a lot of stuff conditionally compiled in for DEBUG builds, there should be little difference - the IL should be pretty much the same. The jitter generates different code depending on whether it runs under a debugger; the compilation to IL isn't affected much.
There are some things /optimize does when compiling to IL, but they aren't particularly aggressive. And some of those IL optimizations will probably be handled by the jitter anyway, even if they aren't applied in the IL (like the removal of NOPs).
See Eric Lippert's article http://blogs.msdn.com/ericlippert/archive/2009/06/11/what-does-the-optimize-switch-do.aspx for details:
The /optimize flag does not change a huge amount of our emitting and generation logic. We try to always generate straightforward, verifiable code and then rely upon the jitter to do the heavy lifting of optimizations when it generates the real machine code. But we will do some simple optimizations with that flag set.
Read Eric's article for information about what /optimize does differently in IL generation.
Well, though the question is a duplicate, I feel that some of the better answers in the original question are at the very bottom. Personally, I have seen situations where there is an appreciable difference between Debug and Release modes. (Example: property performance, where there was a 2x difference between accessing properties in Debug and Release mode.) Whether this difference would be present in actual software (instead of a benchmark-like program) is debatable, but I have seen it happen in one product I worked on.
From Neil's answer on the original question, from msdn social:
It is not well documented; here's what I know. The compiler emits an instance of the System.Diagnostics.DebuggableAttribute. In the debug version, the IsJITOptimizerDisabled property is True; in the release version it is False. You can see this attribute in the assembly manifest with ildasm.exe.
The JIT compiler uses this attribute to disable optimizations that would make debugging difficult - the ones that move code around, like loop-invariant hoisting. In selected cases, this can make a big difference in performance. Not usually, though.
Mapping breakpoints to execution addresses is the job of the debugger. It uses the .pdb file and info generated by the JIT compiler that provides the IL instruction to code address mapping. If you would write your own debugger, you'd use ICorDebugCode::GetILToNativeMapping().
I've encountered the following paragraph:
“Debug vs. Release setting in the IDE when you compile your code in Visual Studio makes almost no difference to performance… the generated code is almost the same. The C# compiler doesn’t really do any optimization. The C# compiler just spits out IL… and at the runtime it’s the JITer that does all the optimization. The JITer does have a Debug/Release mode and that makes a huge difference to performance. But that doesn’t key off whether you run the Debug or Release configuration of your project, that keys off whether a debugger is attached.”
The source is here and the podcast is here.
Can someone direct me to a Microsoft article that can actually prove this?
Googling "C# debug vs release performance" mostly returns results saying "Debug has a lot of performance hit", "release is optimized", and "don't deploy debug to production".
Partially true. In Debug mode, the compiler emits debug symbols for all variables and compiles the code as is. In Release mode, some optimizations are included:
unused variables do not get compiled at all
some loop variables are taken out of the loop by the compiler if they are proven to be invariants (see the sketch below)
code written under an #if DEBUG directive is not included, etc.
The rest is up to the JIT.
Full list of optimizations here courtesy of Eric Lippert.
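To illustrate the loop-invariant point above (a hypothetical method; the exact transformation is up to the compiler and jitter), the optimizer can effectively rewrite the first form into the second:
class Hoisting
{
    // As written: factor * 2 is re-evaluated on every iteration.
    static int SumScaled(int[] data, int factor)
    {
        int sum = 0;
        for (int i = 0; i < data.Length; i++)
            sum += data[i] * (factor * 2);
        return sum;
    }

    // What the optimizer effectively produces: the invariant is hoisted out.
    static int SumScaledHoisted(int[] data, int factor)
    {
        int sum = 0;
        int scale = factor * 2;   // computed once, before the loop
        for (int i = 0; i < data.Length; i++)
            sum += data[i] * scale;
        return sum;
    }
}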
There is no article which "proves" anything about a performance question. The way to prove an assertion about the performance impact of a change is to try it both ways and test it under realistic-but-controlled conditions.
You're asking a question about performance, so clearly you care about performance. If you care about performance then the right thing to do is to set some performance goals and then write yourself a test suite which tracks your progress against those goals. Once you have such a test suite you can then easily use it to test for yourself the truth or falsity of statements like "the debug build is slower".
And furthermore, you'll be able to get meaningful results. "Slower" is meaningless because it is not clear whether it's one microsecond slower or twenty minutes slower. "10% slower under realistic conditions" is more meaningful.
Spend the time you would have spent researching this question online on building a device which answers the question. You'll get far more accurate results that way. Anything you read online is just a guess about what might happen. Reason from facts you gathered yourself, not from other people's guesses about how your program might behave.
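A minimal version of such a device might look like this (the workload and iteration count are placeholders - substitute something representative of your app, then run it once from a Debug build and once from a Release build, outside the debugger, and compare):
using System;
using System.Diagnostics;

class Harness
{
    static void Main()
    {
        const int iterations = 1_000_000;
        var sw = Stopwatch.StartNew();
        long total = 0;
        for (int i = 0; i < iterations; i++)
            total += Work(i);   // replace with the code under test
        sw.Stop();
        Console.WriteLine(iterations + " iterations: " + sw.ElapsedMilliseconds + " ms (result " + total + ")");
    }

    static long Work(int i) => (long)i * i % 97;
}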
I can’t comment on the performance but the advice “don’t deploy debug to production” still holds simply because debug code usually does quite a few things differently in large products. For one thing, you might have debug switches active and for another there will probably be additional redundant sanity checks and debug outputs that don’t belong in production code.
From msdn social:
It is not well documented; here's what I know. The compiler emits an instance of the System.Diagnostics.DebuggableAttribute. In the debug version, the IsJITOptimizerDisabled property is True; in the release version it is False. You can see this attribute in the assembly manifest with ildasm.exe.
The JIT compiler uses this attribute to disable optimizations that would make debugging difficult - the ones that move code around, like loop-invariant hoisting. In selected cases, this can make a big difference in performance. Not usually, though.
Mapping breakpoints to execution addresses is the job of the debugger. It uses the .pdb file and info generated by the JIT compiler that provides the IL instruction to code address mapping. If you would write your own debugger, you'd use ICorDebugCode::GetILToNativeMapping().
Basically debug deployment will be slower since the JIT compiler optimizations are disabled.
What you read is quite valid. Release is usually leaner due to JIT optimization, to debug code not being included (#if DEBUG or [Conditional("DEBUG")]), to minimal debug symbol loading, and - often not considered - to a smaller assembly, which reduces loading time. The performance difference is more obvious when running the code in VS, because more extensive PDBs and symbols are loaded, but if you run it independently the differences may be less apparent. Certain code will optimize better than other code, using the same optimizing heuristics as in other languages.
Scott has a good explanation on inline method optimization here
See this article that give a brief explanation why it is different in ASP.NET environment for debug and release setting.
One thing you should note, regarding performance and whether the debugger is attached or not, something that took us by surprise.
We had a piece of code, involving many tight loops, that seemed to take forever to debug, yet ran quite well on its own. In other words, no customers or clients were experiencing problems, but when we were debugging it seemed to run like molasses.
The culprit was a Debug.WriteLine in one of the tight loops, which spit out thousands of log messages, left from a debug session a while back. It seems that when the debugger is attached and listens to such output, there's overhead involved that slows down the program. For this particular code, it was on the order of 0.2-0.3 seconds runtime on its own, and 30+ seconds when the debugger was attached.
Simple solution though: just remove the debug messages that were no longer needed.
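For reference, Debug.WriteLine is marked [Conditional("DEBUG")], so the calls vanish entirely from Release builds; the cost only shows up in Debug builds, and especially with a debugger or trace listener attached. A minimal reproduction of the situation above (the loop body is illustrative):
using System.Diagnostics;

class TightLoop
{
    static void Main()
    {
        for (int i = 0; i < 1_000_000; i++)
        {
            // Compiled away entirely in Release builds; in Debug with a debugger
            // attached, each call reaches the listeners and adds real overhead.
            Debug.WriteLine("iteration " + i);
        }
    }
}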
From the MSDN site:
Release vs. Debug configurations
While you are still working on your project, you will typically build your application by using the debug configuration, because this configuration enables you to view the value of variables and control execution in the debugger. You can also create and test builds in the release configuration to ensure that you have not introduced any bugs that only manifest on one type of build or the other. In .NET Framework programming, such bugs are very rare, but they can occur.
When you are ready to distribute your application to end users, create a release build, which will be much smaller and will usually have much better performance than the corresponding debug configuration. You can set the build configuration in the Build pane of the Project Designer, or in the Build toolbar. For more information, see Build Configurations.
I recently ran into a performance issue. The products full list was taking too much time, about 80 seconds. I tuned the DB and improved the queries, and there wasn't any difference. I decided to create a test project, and I found out that the same process executed in 4 seconds. Then I realized the main project was in Debug mode and the test project was in Release mode. I switched the main project to Release mode and the products full list took only 4 seconds to display all the results.
Summary: Debug mode is far slower than Release mode, as it keeps debugging information. You should always deploy in Release mode. You can still have debugging information if you include .pdb files; that way you can log errors with line numbers, for example.
To a large extent, that depends on whether your app is compute-bound, and it is not always easy to tell, as in Lasse's example. If I've got the slightest question about what it's doing, I pause it a few times and examine the stack. If there's something extra going on that I didn't really need, that spots it immediately.
Debug and Release modes have differences. There is a tool called Fuzzlyn: it is a fuzzer which utilizes Roslyn to generate random C# programs, runs them on .NET Core, and ensures that they give the same results when compiled in Debug and Release mode.
A lot of bugs have been found and reported with this tool.
I'm writing my own scripting language in C#, with some features I like, and I chose to use MSIL as the output's bytecode (Reflection.Emit is quite useful, and I don't have to think up another bytecode). It works, it emits an executable which can be run (even decompiled with Reflector :) ) and is quite fast.
But I want to run multiple 'processes' in one process + one thread, and control their assigned CPU time manually (and also implement much more robust IPC than is offered by the .NET Framework). Is there any way to entirely disable the JIT and create my own VM, stepping instruction after instruction using the .NET Framework (and controlling memory usage, etc.), without the need to write anything on my own? Or to achieve this, must I write an entire MSIL interpreter?
EDIT 1): I know that interpreting IL isn't the fastest thing in the universe :)
EDIT 2): To clarify - I want my VM to be some kind of 'operating system': it gets some CPU time and divides it between processes, controls memory allocation for them, and so on. It doesn't have to be fast or efficient, just a proof of concept for some of my experiments. I don't need to implement it at the level of processing every instruction - if that part is done by .NET, I won't mind; I just want to say: step one instruction, and wait till I tell you to step the next.
EDIT 3): I realized that ICorDebug can maybe accomplish my needs; now looking at the implementation of Mono's runtime.
You could use Mono - I believe it offers an option to interpret the IL instead of JITting it. The fact that it's open source means (subject to licensing) that you should be able to modify it according to your needs, too.
Mono doesn't have all of .NET's functionality, admittedly - but it may do all you need.
Beware that MSIL was designed to be parsed by a JIT compiler. It is not very suitable for an interpreter. A good example is perhaps the ADD instruction. It is used to add a wide variety of value type values: byte, short, int32, int64, ushort, uint32, uint64. Your compiler knows what kind of add is required but you'll lose that type info when generating the MSIL.
Now you need to find it back at runtime and that requires checking the types of the values on the evaluation stack. Very slow.
An easily interpreted IL has dedicated ADD instructions like ADD8, ADD16, etc.
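A hedged sketch of what that costs an interpreter (the names are made up, and a real interpreter would work off an evaluation stack rather than object parameters) - a single polymorphic ADD forces a type dispatch on every execution:
class Interp
{
    // Interpreting MSIL's single polymorphic ADD opcode: operand types must be
    // rediscovered at runtime, every time the instruction executes.
    static object Add(object a, object b)
    {
        if (a is int ia && b is int ib) return ia + ib;
        if (a is long la && b is long lb) return la + lb;
        if (a is double da && b is double db) return da + db;
        // ... plus byte, short, ushort, uint, ulong, float ...
        throw new System.InvalidOperationException("unsupported operand types");
    }

    // With typed opcodes (ADD8, ADD16, ADD32, ...) each handler is a single add.
    static int Add32(int a, int b) => a + b;
}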
Microsoft's implementation of the Common Language Runtime has only one execution system, the JIT. Mono, on the other hand, comes with both a JIT and an interpreter.
I, however, do not fully understand what exactly you want to do yourself and what you would like to leave to Microsoft's implementation:
Is there any way to entirely disable the JIT and create my own VM?
and
... without the need to write anything on my own? Or to achieve this, must I write an entire MSIL interpreter?
are sort of contradictory.
If you think you can write a better execution system than Microsoft's JIT, you will have to write it from scratch. Bear in mind, however, that both Microsoft's and Mono's JITs are highly optimized compilers. (Programming language shootout)
Being able to schedule CPU time for operating system processes exactly is not possible from user mode; that's the operating system's task.
Some implementation of green threads might be an idea, but that is definitely a topic for unmanaged code. If that's what you want, have a look at the CLR hosting API.
I would suggest you try to implement your language in CIL. After all, it gets compiled down to raw x86. If you don't care about verifiability, you can use pointers where necessary.
One thing you could consider doing is generating code in a state-machine style. Let me explain what I mean by this.
When you write generator methods in C# with yield return, the method is compiled into an inner IEnumerator class that implements a state machine. The method's code is compiled into logical blocks that are terminated with a yield return or yield break statement, and each block corresponds to a numbered state. Because each yield return must provide a value, each block ends by storing a value in a local field. The enumerator object, in order to generate its next value, calls a method that consists of a giant switch statement on the current state number in order to run the current block, then advances the state and returns the value of the local field.
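As a concrete sketch (hand-written and simplified compared to what the compiler actually emits), here is the state-machine shape generated for something like IEnumerable<int> Numbers() { yield return 1; yield return 2; }:
using System.Collections;
using System.Collections.Generic;

// A hand-written sketch of the state machine the compiler generates for:
//   IEnumerable<int> Numbers() { yield return 1; yield return 2; }
class NumbersEnumerator : IEnumerator<int>
{
    private int _state;     // which block runs next
    private int _current;   // field backing the yielded value

    public int Current => _current;
    object IEnumerator.Current => _current;

    public bool MoveNext()
    {
        switch (_state)
        {
            case 0:          // code before the first yield return
                _current = 1;
                _state = 1;
                return true;
            case 1:          // code between the two yields
                _current = 2;
                _state = 2;
                return true;
            default:         // exhausted
                return false;
        }
    }

    public void Reset() => throw new System.NotSupportedException();
    public void Dispose() { }
}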
Your scripting language could generate its methods in a similar style, where a method corresponds to a state machine object, and the VM allocates time by advancing the state machine during the time allotted. A few tricky parts to this approach: implementing things like method calls and try/finally blocks is harder than generating straight-up MSIL.