I have a rather large solution revolving around a WebAPI project. I ran into some performance issues on a particular web service, and used the built-in performance profiler in VS2013 to find the bottlenecks and deal with them. Eventually I got the response time on a HTTP request down from around 500ms to 50ms (I use an external app to perform repeated requests and log the round-trip time).
However, I only see this improvement while running the WebAPI from the performance profiler tool. As soon as I switch back to running it straight from Visual Studio (F5) or on our test server, the response times increase to around 400ms, still an improvement on the original 500, but not exactly magnificent.
It makes only a slight difference whether I run it in debug mode or release mode. Setting "debug info" to "none" rather than the default "pdb-only" on all the involved projects helps a tiny bit, bringing the average response time down to around 350ms.
I can't, for the life of me, figure out what the performance profiler tool does to optimize the code further. And it's killing me that I've seen how fast it can run, but I'm unable to achieve the same performance.
It turns out the performance profiler wasn't doing anything different to optimize the code, but it did run IIS Express without any debuggers enabled.
By going to the Web API project's properties and switching to the Web tab, I could uncheck all the debuggers, and the response times during debugging then matched what I was seeing when running the profiler. Obviously, disabling the ASP.NET debugger prevents any debugging of the code.
Mikal is right.
This performance difference (where the code runs faster under the profiler than under the debugger) also troubled me for a few hours. I even moved the code from the Web API project into a console application to test, and the console application performed as well as the Web API did under the profiler.
Then I figured out that it was because the ASP.NET debugger was enabled in debugging mode, which consumed a lot of CPU. After unchecking that debugger, performance was back, as good as in the console application and in profiling mode.
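If you want a quick sanity check of which mode you're actually measuring, you can log whether a managed debugger is attached. This is just a small sketch using standard System.Diagnostics, nothing specific to this setup:

```csharp
using System;
using System.Diagnostics;

static class DebuggerCheck
{
    // Logs whether a managed debugger is attached to the process, so you can
    // tell apart F5, Ctrl+F5 and profiler runs when comparing response times.
    public static void Report()
    {
        Console.WriteLine(Debugger.IsAttached
            ? "Debugger attached - expect pessimistic timings."
            : "No debugger attached - timings should match the profiler run.");
    }
}
```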
As the load on our Azure website has increased (along with the complexity of the work that it's doing), we've noticed that we're running into CPU utilization issues. CPU utilization gradually creeps upward over the course of several hours, even as traffic levels remain fairly steady. Over time, if the Azure stats are correct, we'll somehow manage to get > 60 seconds of CPU per instance (not quite clear how that works), and response times will start increasing dramatically.
If I restart the web server, CPU drops immediately, and then begins a slow creep back up. For instance, in the image below, you can see CPU creeping up, followed by the restart (with the red circle) and then a recovery of the CPU.
I'm strongly inclined to suspect that this is a problem somewhere in my own code, but I'm scratching my head as to how to figure it out. So far, any attempts to reproduce this on my dev or testing environments have proven ineffectual. Nearly all the suggestions for profiling IIS/C# performance seem to presume either direct access to the machine in question or at least a "Cloud Service" instance rather than an Azure Website.
I know this is a bit of a long shot, but... any suggestions, either for what it might be, or how to troubleshoot it?
(We're using C# 5.0, .NET 4.5.1, ASP.NET MVC 5.2.0, Web API 2.2, EF 6.1.1, Azure Service Bus, Azure SQL Database, Azure Redis Cache, and async for every significant code path.)
Edit 8/5/14 - I've tried some of the suggestions below. But when the website is truly busy, i.e., ~100% CPU utilization, any attempt to download a mini-dump or GC dump results in a 500 error, with the message, "Not enough storage." The various times that I have been able to download a mini-dump or GC dump, they haven't shown anything particularly interesting, at least, so far as I could figure out. (For instance, the most interesting thing in the GC dump was a half dozen or so >100KB string instances - those seem to be associated with the bundling subsystem in some fashion, so I suspect that they're just cached ScriptBundle or StyleBundle instances.)
Try remote debugging to your site from Visual Studio.
Try https://{sitename}.scm.azurewebsites.net/ProcessExplorer/ - there you can take memory dumps and GC dumps of your w3wp process.
Then you can compare 2 GC dumps to find memory leaks and open the memory dump with windbg/VS for further "offline" debugging.
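If the dump downloads keep failing under load, a lighter-weight fallback is to have the site periodically log its own CPU time so you can see when the creep starts. A minimal sketch (plain System.Diagnostics; the interval and log sink are up to you):

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class CpuSampler
{
    private static Timer _timer;   // keep a reference so the timer isn't collected

    // Call once at startup (e.g. from Application_Start) to log CPU time,
    // thread count and working set at a fixed interval.
    public static void Start(TimeSpan interval)
    {
        var process = Process.GetCurrentProcess();
        _timer = new Timer(_ =>
        {
            process.Refresh();   // refresh the cached process counters
            Trace.TraceInformation(
                "CPU time: {0:F1}s, threads: {1}, working set: {2:N0} bytes",
                process.TotalProcessorTime.TotalSeconds,
                process.Threads.Count,
                process.WorkingSet64);
        }, null, TimeSpan.Zero, interval);
    }
}
```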
This is a very strange issue that I am experiencing, and almost goes against anything logical that I can think of. I am currently profiling a website which we are building, which sometimes takes 5 seconds for a page to load. This happens both on IIS, and Visual Studio Development Server. However, when I profile it using ANTS Performance Profiler, it performs up to 5x faster, and loads in less than a second.
I am quite baffled as to why this can happen, because as far as I know, profiling should increase the time, not decrease it. Could anyone maybe shed some light on this?
Site is developed in Visual Studio 2010, ASP.Net v4.0, C#.
This is interesting, as it's very rare (I work on ANTS support). The main difference ANTS imparts on a process is permissions, since the process is usually fully initiated by ANTS and inherits its permissions. We have some routines that optimise the start-up procedure, but I've never heard of a speed-up like this. Using Task Manager, take a look at the login account the process runs under with ANTS and normally, then try to run your process under the account that ANTS uses. You may find this helps to explain the speed-up.
Performance testing needs to be done in a carefully controlled setting. Things like the system file cache, network, machine load, NGEN status, and virus scanners can all affect the results.
Use PerfView to understand how the 5 seconds are spent (it could be waiting for disk I/O):
http://www.microsoft.com/en-us/download/details.aspx?id=28567
I have a web service that is in much need of some optimization. It's part of an enterprise application that resides on a virtual server machine and has become a huge bottleneck. I'm confident in my ability to make this more efficient, but I was wondering if anyone out there has had a good experience with a profiler or optimization tool that would help point me to the trouble spots.
The web service's main function is to generate PDFs, which are created using SQL reports and a third-party PDF writer utility. Basically it gets an ID and creates X number of PDFs based on the number of forms associated with that ID. So it has a loop that runs an average of 8 times per ID, and there are thousands of IDs sent daily. Needless to say, there is always a backlog of PDFs to be created, which the client would rather not see.
I have also thought about using multiple threads to generate the PDF pages asynchronously, but I'm hesitant because they said they had issues with multithreading on the "virtual server". So if anyone can point me to a good tutorial or advice about multithreading on a virtual server, I would appreciate that too.
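For reference, this is roughly what I have in mind; the IPdfGenerator shape is a placeholder for our actual code, and the degree of parallelism is just a guess:

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Placeholder for whatever currently builds the PDFs (SQL report + PDF writer).
public interface IPdfGenerator
{
    IEnumerable<int> GetFormIds(int id);   // forms associated with one ID (~8 on average)
    void CreatePdf(int id, int formId);    // renders a single form to PDF
}

public class PdfBatchProcessor
{
    // Generates the PDFs for one ID, running the per-form work in parallel
    // instead of the current sequential loop.
    public void ProcessId(int id, IPdfGenerator generator)
    {
        Parallel.ForEach(
            generator.GetFormIds(id),
            new ParallelOptions { MaxDegreeOfParallelism = 4 },  // cap so the VM isn't saturated
            formId => generator.CreatePdf(id, formId));
    }
}
```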
Thanks for any help you can give.
I've used this one before and it's great:
JetBrains dotTrace
http://www.jetbrains.com/profiler/whatsnew/
Try Telerik's JustTrace; it has a lot of neat stuff. It has a 60-day free trial with support, so you can try it out first.
Fast Profiling
JustTrace aims to redefine fast memory and performance profiling. It adds minimal overhead to the profiled application, allows near seamless execution, and enables analysis-in-place, thus eliminating the need to move the application from its environment. The user can examine different stages of the application’s behavior by swiftly taking multiple snapshots throughout its lifetime.
Made-to-Measure Profiling
JustTrace offers three distinct profilers – Sampling, Memory and Tracing – to meet even the most demanding profiling requirements.
Profiling of Already Running Processes
JustTrace allows for unobtrusive attaching to live processes. Should an application start experiencing higher memory or CPU consumption, analysis on its state gives the opportunity to handle scenarios that are otherwise hard to reproduce.
Simple yet Intuitive UI
By definition, a memory and performance profiling tool should enable you to speed up the performance of your apps without slowing you down or getting into your way. JustTrace employs a minimalistic yet highly intuitive user interface that allows for easy navigation of the performance and memory results. A few effortless steps take you from choosing the application being profiled to an in-depth analysis of the profiling insights made by JustTrace. Memory and performance profiling has never been easier.
Live Profiling
JustTrace enables real-time monitoring of the application’s execution. The close-up watching of the application’s behavior brings out potential performance bottlenecks to the surface, and provides reliable hints of the application’s stages that are worth investigating.
Stand-alone Tool and Seamless Visual Studio Integration
JustTrace offers seamless integration with Visual Studio and can also be used as a stand-alone tool. The integration of JustTrace into Visual Studio’s UI removes a burdensome step by cutting the time needed to jump between the development environment and the tool to test the resulting memory and CPU utilization improvements. Simply modify the code, then run it through the Visual Studio UI and get JustTrace’s core capabilities in a single tool window.
Profiling of Multiple Application Types
JustTrace enables the profiling of local applications, running applications, Silverlight applications and local ASP.NET web sites.
I would suggest taking a look at ANTS Memory & Performance Profiler from Red Gate:
ANTS Memory Profiler
ANTS Performance Profiler
The ANTS profilers do a fantastic job of identifying bottlenecks and memory leaks. They're not free, but they're very affordable and offer fully functional trials so you can evaluate the products.
There are other profilers:
ANTS: http://www.red-gate.com/products/dotnet-development/ants-performance-profiler/
ANTS can also profile SQL calls. They also have an EAP open at the moment which gives you more functionality for database calls; that is here:
http://help.red-gate.com/help/ANTSPerformanceProfiler/download_eap.html
There is YourKit:
http://www.yourkit.com/
Visual Studio has a profiler too, but it's not as good.
I have some C# code which passes a delegate as a callback to an unmanaged method via a P/Invoke function call in an NUnit test.
The code works great and passes all tests in both Release and Debug modes, and it runs fast on one machine whether running under the debugger or not.
But after setting up a nearly identical development environment on another PC for a new developer starting soon, it runs fast there in both the Release and Debug configurations, but horribly slowly when the debugger is attached.
Note that I have seen this type of slowness before with "debug unmanaged code" enabled on the project. I have disabled that and recompiled, and it makes no difference with or without it; I tried it both ways several times.
Also, there aren't any breakpoints or watch variables set.
As an aside, this unit test actually calls the unmanaged method in a loop 1 million times which returns after incrementing a counter. It's extremely simple code that was only testing the performance of making unmanaged calls across AppDomains.
Please remember that this is identical code from the same git commit, and it only runs slowly under the debugger on one of the machines. There are no code differences between them, so it seems conclusive that this isn't a "code" issue but rather, I will wildly guess, a Visual Studio setting somewhere related to unmanaged vs. managed debugging.
Thanks in advance for any ideas. If you really think seeing the code will help, I'll post the C# unit test and the cpp file too.
Edit: I've narrowed down that this slowness under the debugger only happens for the unmanaged code that calls back into a different AppDomain. So in these performance tests there is the primary AppDomain and a second one. Managed-to-unmanaged calls that call back from the primary domain into itself are fast! But calls that come back from unmanaged code into the other AppDomain are very, very slow: from 20 million per second down to only 4 or 5 thousand per second.
Note that the method being called in the test is void callback(), so no arguments or return value. In other words, there's nothing to marshal.
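To give a rough idea of the shape without posting everything yet (the DLL name and export here are made up, and this leaves out the cross-AppDomain setup):

```csharp
using System;
using System.Runtime.InteropServices;

class CallbackPerfSketch
{
    // The callback takes no arguments and returns nothing, so there is
    // nothing to marshal per call.
    [UnmanagedFunctionPointer(CallingConvention.Cdecl)]
    delegate void NativeCallback();

    // "NativeLib.dll" / "InvokeCallback" are placeholders for the real export.
    [DllImport("NativeLib.dll", CallingConvention = CallingConvention.Cdecl)]
    static extern void InvokeCallback(NativeCallback callback);

    static int _counter;

    static void RunTest()
    {
        NativeCallback cb = () => _counter++;   // managed side just bumps a counter

        for (int i = 0; i < 1000000; i++)       // 1 million round trips
            InvokeCallback(cb);
    }
}
```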
Edit: I was fiddling with different settings and now my development box is SLOW too. I was sure it was the "Just My Code" setting, which I saw was off for the faster machine, so I enabled it to try that out. But now, even after disabling it again, it's still slow. So I'm not sure whether this is the cause or not.
Check whether the symbol file settings are the same on both machines. Loading all symbols for native code can take a very long time (Tools -> Options -> Debugging -> Symbols).
I've encountered the following paragraph:
“Debug vs. Release setting in the IDE when you compile your code in Visual Studio makes almost no difference to performance… the generated code is almost the same. The C# compiler doesn’t really do any optimization. The C# compiler just spits out IL… and at the runtime it’s the JITer that does all the optimization. The JITer does have a Debug/Release mode and that makes a huge difference to performance. But that doesn’t key off whether you run the Debug or Release configuration of your project, that keys off whether a debugger is attached.”
The source is here and the podcast is here.
Can someone direct me to a Microsoft article that can actually prove this?
Googling "C# debug vs release performance" mostly returns results saying "Debug has a lot of performance hit", "release is optimized", and "don't deploy debug to production".
Partially true. In debug mode, the compiler emits debug symbols for all variables and compiles the code as is. In release mode, some optimizations are included:
unused variables do not get compiled at all
some loop variables are taken out of the loop by the compiler if they are proven to be invariants
code written under the #if DEBUG directive is not included, etc.
The rest is up to the JIT.
Full list of optimizations here courtesy of Eric Lippert.
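As a small, self-contained illustration of the "debug code not included" point (both mechanisms are standard C#; nothing here is specific to the quoted list):

```csharp
using System;
using System.Diagnostics;

class DebugOnlyExample
{
    static void Main()
    {
#if DEBUG
        // Only compiled into builds that define the DEBUG symbol.
        Console.WriteLine("Debug-only block.");
#endif
        // Calls to LogTrace are removed entirely by the compiler in Release builds.
        LogTrace("This call disappears when DEBUG is not defined.");
    }

    // The Conditional attribute tells the compiler to drop call sites
    // when the named symbol is not defined.
    [Conditional("DEBUG")]
    static void LogTrace(string message)
    {
        Console.WriteLine(message);
    }
}
```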
There is no article which "proves" anything about a performance question. The way to prove an assertion about the performance impact of a change is to try it both ways and test it under realistic-but-controlled conditions.
You're asking a question about performance, so clearly you care about performance. If you care about performance then the right thing to do is to set some performance goals and then write yourself a test suite which tracks your progress against those goals. Once you have such a test suite you can then easily use it to test for yourself the truth or falsity of statements like "the debug build is slower".
And furthermore, you'll be able to get meaningful results. "Slower" is meaningless because it is not clear whether it's one microsecond slower or twenty minutes slower. "10% slower under realistic conditions" is more meaningful.
Spend the time you would have spent researching this question online on building a device which answers the question. You'll get far more accurate results that way. Anything you read online is just a guess about what might happen. Reason from facts you gathered yourself, not from other people's guesses about how your program might behave.
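For example, even a bare-bones harness along these lines (Stopwatch is the standard tool; the workload and iteration count are yours to pick) will tell you more about your own code than any article:

```csharp
using System;
using System.Diagnostics;

static class TinyBenchmark
{
    // Runs the given action repeatedly and reports the average time per iteration.
    // Run the same binary in Debug and Release, with and without a debugger
    // attached, and compare the numbers for yourself.
    public static void Measure(string name, Action action, int iterations)
    {
        action();   // warm-up call so JIT compilation isn't included in the timing

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            action();
        sw.Stop();

        Console.WriteLine("{0}: {1:F4} ms/iteration (debugger attached: {2})",
            name,
            sw.Elapsed.TotalMilliseconds / iterations,
            Debugger.IsAttached);
    }
}
```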
I can’t comment on the performance but the advice “don’t deploy debug to production” still holds simply because debug code usually does quite a few things differently in large products. For one thing, you might have debug switches active and for another there will probably be additional redundant sanity checks and debug outputs that don’t belong in production code.
From MSDN Social:
It is not well documented; here's what I know. The compiler emits an instance of the System.Diagnostics.DebuggableAttribute. In the debug version, the IsJITOptimizerDisabled property is True; in the release version it is False. You can see this attribute in the assembly manifest with ildasm.exe.
The JIT compiler uses this attribute to disable optimizations that would make debugging difficult, the ones that move code around, like loop-invariant hoisting. In selected cases, this can make a big difference in performance. Not usually though.
Mapping breakpoints to execution addresses is the job of the debugger. It uses the .pdb file and info generated by the JIT compiler that provides the IL-instruction-to-code-address mapping. If you were to write your own debugger, you'd use ICorDebugCode::GetILToNativeMapping().
Basically debug deployment will be slower since the JIT compiler optimizations are disabled.
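You can check this yourself on any assembly with a few lines of reflection; a quick sketch:

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

class JitOptimizationCheck
{
    static void Main()
    {
        // Inspect the DebuggableAttribute the C# compiler emitted for this assembly.
        var attribute = (DebuggableAttribute)Attribute.GetCustomAttribute(
            Assembly.GetExecutingAssembly(), typeof(DebuggableAttribute));

        if (attribute == null)
            Console.WriteLine("No DebuggableAttribute: JIT optimizations are enabled.");
        else
            Console.WriteLine("IsJITOptimizerDisabled = {0}", attribute.IsJITOptimizerDisabled);
    }
}
```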
What you read is quite valid. Release builds are usually leaner due to JIT optimization, the exclusion of debug code (#if DEBUG or [Conditional("DEBUG")]), minimal debug symbol loading, and, often overlooked, a smaller assembly, which reduces loading time. The performance difference is more obvious when running the code in VS, because more extensive PDBs and symbols are loaded, but if you run it independently, the difference may be less apparent. Certain code will optimize better than other code, and the JIT uses the same optimizing heuristics as in other languages.
Scott has a good explanation of inline method optimization here.
See this article for a brief explanation of why the debug and release settings behave differently in the ASP.NET environment.
One thing you should note regarding performance and whether the debugger is attached or not, something that took us by surprise:
We had a piece of code, involving many tight loops, that seemed to take forever to debug, yet ran quite well on its own. In other words, no customers or clients were experiencing problems, but when we were debugging it seemed to run like molasses.
The culprit was a Debug.WriteLine in one of the tight loops, which spit out thousands of log messages, left from a debug session a while back. It seems that when the debugger is attached and listens to such output, there's overhead involved that slows down the program. For this particular code, it was on the order of 0.2-0.3 seconds runtime on its own, and 30+ seconds when the debugger was attached.
The solution was simple though: just remove the debug messages that were no longer needed.
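For anyone who wants to reproduce the effect, the pattern was essentially this (the loop body and count are illustrative, not our actual code):

```csharp
using System.Diagnostics;

class DebugOutputPitfall
{
    static long Sum()
    {
        long total = 0;
        for (int i = 0; i < 1000000; i++)
        {
            // Harmless on its own, but with a debugger attached every call is
            // forwarded to the debugger's output window, which dominates the runtime.
            Debug.WriteLine("iteration " + i);
            total += i;
        }
        return total;
    }
}
```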
From the MSDN site...
Release vs. Debug configurations
While you are still working on your project, you will typically build your application by using the debug configuration, because this configuration enables you to view the value of variables and control execution in the debugger. You can also create and test builds in the release configuration to ensure that you have not introduced any bugs that only manifest on one type of build or the other. In .NET Framework programming, such bugs are very rare, but they can occur.
When you are ready to distribute your application to end users, create a release build, which will be much smaller and will usually have much better performance than the corresponding debug configuration. You can set the build configuration in the Build pane of the Project Designer, or in the Build toolbar. For more information, see Build Configurations.
I recently ran into a performance issue. The full products list was taking too much time, about 80 seconds. I tuned the DB and improved the queries, and there wasn't any difference. I decided to create a test project, and I found out that the same process executed in 4 seconds. Then I realized the project was in Debug mode and the test project was in Release mode. I switched the main project to Release mode, and the products list took only 4 seconds to display all the results.
Summary: Debug mode is far slower than Release mode because it keeps debugging information. You should always deploy in Release mode. You can still have debugging information if you include .pdb files; that way you can log errors with line numbers, for example.
To a large extent, that depends on whether your app is compute-bound, and it is not always easy to tell, as in Lasse's example. If I've got the slightest question about what it's doing, I pause it a few times and examine the stack. If there's something extra going on that I didn't really need, that spots it immediately.
Debug and Release modes do have differences. There is a tool called Fuzzlyn: it is a fuzzer that uses Roslyn to generate random C# programs, runs them on .NET Core, and ensures that they give the same results when compiled in debug and release mode.
A lot of bugs have been found and reported with this tool.