.NET Service - Analysing Development vs Test performance - C#

I have a .NET REST API written in C# using MVC 5.
The API uses a repository that fire-hoses the necessary data from the database, then analyses it and transforms it into a usable model. The transformation uses a lot of LINQ to shape the data.
On dev (Windows 10, i7 8-core @ 3.7 GHz, 32 GB RAM) it takes 10 seconds for a large test range.
Running on a VM (Windows 2008 R2, virtual Xeon with 8 virtual cores @ 2.99 GHz, 8 GB RAM) it takes 300 seconds (5 minutes).
Neither exhausts memory, and neither is CPU-bound (CPU touches 50% on the VM, and barely registers on the dev box).
Same database, code etc.
The API makes use of async APIs to load some peripheral data whilst it's doing its primary job, so I could put some logs in to record timings, I guess.
What are the common techniques for tackling this problem? Can the CPU speed really be making that much difference?
thanks
EDIT:
Following a comment by Pieter, I've increased the VM's memory to 12 GB and monitored the performance of the VM whilst executing the operation. It's not the best visual aid (a Task Manager screenshot at the end of the operation), but what it did show was that the vCPUs never really went above ~60%, and memory - apart from a few MB at the beginning of the request - never went above 2.7 GB.
If IIS / .NET / my operation is not maxing out the resources, what is taking so long?
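Regarding the idea of logging timings: a minimal sketch of what I have in mind, where the repository, transformation and peripheral-load calls are hypothetical placeholders rather than my actual method names:

var sw = System.Diagnostics.Stopwatch.StartNew();
var rawData = repository.LoadAll();                      // hypothetical fire-hose load from the database
System.Diagnostics.Trace.TraceInformation("Load: {0} ms", sw.ElapsedMilliseconds);
sw.Restart();
var model = TransformToModel(rawData);                   // hypothetical LINQ-heavy transformation
System.Diagnostics.Trace.TraceInformation("Transform: {0} ms", sw.ElapsedMilliseconds);
sw.Restart();
var peripheral = await LoadPeripheralDataAsync();        // hypothetical async peripheral-data load
System.Diagnostics.Trace.TraceInformation("Peripheral: {0} ms", sw.ElapsedMilliseconds);

Comparing these per-phase numbers between the dev box and the VM would at least show whether the time is going into the database load, the LINQ transformation, or the async peripheral calls.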

Related

Azure Worker Role - Threading, memory consumption, fragmentation and compaction

Our running Cloud Service role makes intensive use of threading. It spins up more or less 100 Tasks. Most of them do something, then sleep for some minutes, then start again. Some of them spin up a Service Bus queue client and handle up to 5 concurrent messages. Those services do a lot of stuff, but mainly data transformation: they take something from the DB and send it to external web services, or they take something from those web services and put it in the DB.
From my application's service inventory, I'd say that there are no more than 300 "concurrent" Tasks.
Because of some error logs, I had to dig deeper, and I found something strange. I'll try to sum up what I discovered and what I think about it. I'd be glad to get a sanity check from you.
This application is a .NET 4.5.2 worker role, compiled for a 64-bit architecture, and runs on 2 Standard_A2 (Medium) instances (i.e. 3.5 GB memory, 2-core CPU). Because of the use of Redis caching, we have tweaked the ThreadPool minimums like this:
ThreadPool.SetMinThreads(300, 300);
Now, here is what I discovered going through my analysis:
It happens that I log this error: "There were not enough free threads in the ThreadPool to complete the operation."
I think this is related to the async / await operations throughout our application. If I understood correctly, async/await should use the I/O completion port (IOCP) part of the thread pool. The fact that there are no free IOCP threads led me to raise the thread pool's "SetMaxThreads" limit accordingly. Does that make sense?
int CurrentMaxWorkerThreads = 0;
int CurrentMaxIOPorts = 0;
ThreadPool.GetMaxThreads(out CurrentMaxWorkerThreads, out CurrentMaxIOPorts);
ThreadPool.SetMaxThreads(CurrentMaxWorkerThreads, 4000);
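As a side note, before (or instead of) raising the limits, it might be worth confirming which part of the pool is actually starved. A minimal diagnostic sketch, not taken from the real application, that periodically logs the available worker and IOCP threads:

// Keep a reference to the timer so it isn't garbage collected.
var poolMonitor = new System.Threading.Timer(_ =>
{
    int availableWorkers, availableIocp;
    ThreadPool.GetAvailableThreads(out availableWorkers, out availableIocp);
    System.Diagnostics.Trace.TraceInformation(
        "Available worker threads: {0}, available IOCP threads: {1}",
        availableWorkers, availableIocp);
}, null, TimeSpan.Zero, TimeSpan.FromSeconds(30));

If the IOCP number is the one dropping to zero around the time of the errors, that would support the SetMaxThreads change above.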
Memory usage: I've seen that after some days of running (2-3 days), the process takes roughly 2 GB. Using ANTS Memory Profiler, I discovered that a lot of this used memory is free, because it is fragmented (I'll ask about this in point 3). The question is: shouldn't my application scale up to more than 2 GB? Isn't the 2 GB limit only for 32-bit applications?
Memory fragmentation: I've read a lot about this over the last few days, and I really think that trying to figure out why it's happening would be very time-consuming, not least because - using ANTS Memory Profiler - I see that a lot of the memory consumption is due to byte arrays (which are probably generated to send data out of the application). Right now, just to keep things under control, I'd compact the LOH using the .NET 4.5.1 feature:
GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect();

How to know when too much parallelism / too many threads is killing performance?

I'm using several applications to process/send/retrieve data on/from the web, and I'm wondering whether I can add more of these apps or whether there are already too many apps on this server. I use parallel processing, for example when I need to retrieve data from several sources. Most of my classes are async (due to the new Mongo queries, which are async).
I'm wondering how many threads is "too many" for a CPU, and is there a method to figure it out?
Here is a screenshot of my server at the moment. I can see that there are 24 logical processors, 191 processes and 3484 active threads! Could we say that this is too many? But the CPU utilization is only 31%...
And for example, on this server I had an app with a degree of parallelism of 20; I set it to 8 and the performance was just the same...
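For context, one common way to cap concurrency for async work (rather than just tuning a blanket degree of parallelism) is a SemaphoreSlim gate. A rough sketch, where sources, FetchAsync and the limit of 8 are made-up placeholders:

var gate = new System.Threading.SemaphoreSlim(8);        // at most 8 requests in flight
var tasks = sources.Select(async source =>
{
    await gate.WaitAsync();
    try
    {
        return await FetchAsync(source);                 // placeholder for the real web/Mongo call
    }
    finally
    {
        gate.Release();
    }
});
var results = await Task.WhenAll(tasks);

Measuring throughput while adjusting that single limit is usually more informative than counting raw thread totals in Task Manager.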

.NET Website Huge Peak Working Set Memory

I have an ASP.NET / C# 4.0 website on a shared server.
The OS, Windows Server 2012, is running IIS 7, has 32 GB of RAM and a 3.29 GHz processor.
The server is running into difficulty now and again, such as problems RDP'ing in and other PHP websites running slowly.
Our sysadmin has suggested my website as a possible memory hog and the cause of these issues.
At any given time the site's "Memory (private working set)" is 2 GB and can peak as high as 15 GB.
I ran a trial version of JetBrains dotMemory on the server, attached to the website's w3wp.exe process. This was my first time using the program; I am a complete novice here.
I took two memory snapshots using dotMemory.
The basic snapshot comparison can be seen here: http://i.imgur.com/0Jk8yYE.jpg
From the above comparison we can see that System.String and BusinessObjects.Item are the two items with the most survived bytes.
Drilling down on System.String, I could see that the main dominating survived object was System.Web.Caching.CacheEntry at 135 MB. See the screengrab: http://i.imgur.com/t1hs8nd.jpg
Which leads me to suspect maybe I cache too much?
I cache all data that comes out of my database: HTML page content, nav-menu items, relationships between pages & children, articles etc., using HttpContext.Current.Cache.Insert.
The cache timeout is set to 10080 minutes (7 days).
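For reference, the inserts look roughly like this; the key, value and expiry shown here are illustrative placeholders, not my actual code:

HttpContext.Current.Cache.Insert(
    "page-content-" + pageId,                            // illustrative cache key
    pageContent,                                         // illustrative cached object
    null,                                                // no cache dependency
    DateTime.UtcNow.AddMinutes(10080),                   // absolute expiration, 7 days out
    System.Web.Caching.Cache.NoSlidingExpiration);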
My Questions:
1 - Is 2 GB Memory (private working set), with peaks as high as 15 GB, too much for a server with 32 GB RAM?
2 - Have I used dotMemory correctly to identify an issue?
3 - Is my caching an issue?
4 - Possible other causes?
5 - Possible solutions?
Thanks
Usage of a large amount of memory cannot by itself slow down a program. It can be slowed down:
if Windows is actively using the swap file - ask an admin to check this case;
if the .NET program causes too much garbage collection (produces too much memory traffic).
You can see memory traffic information in dotMemory. Unfortunately, it can't gather this information when it attaches to a running program; the only way to collect object creation stack traces and memory traffic information is to launch your program under the dotMemory profiler with the corresponding settings enabled.
BUT!
The problem is that you can't tell whether a memory traffic amount "N" is high or not; the only way to find the root of a performance problem is to use a performance profiler.
For example, JetBrains dotTrace can show you how much time your program spends in garbage collection, and only if that turns out to be a bottleneck should you use a memory profiler to find the root of the traffic.
Conclusion: try to find the bottleneck using a performance profiler first.
Then, if you still have questions about dotMemory, ask me and I'll try to help you :)

Troubleshooting creeping CPU utilization on Azure websites

As the load on our Azure website has increased (along with the complexity of the work that it's doing), we've noticed that we're running into CPU utilization issues. CPU utilization gradually creeps upward over the course of several hours, even as traffic levels remain fairly steady. Over time, if the Azure stats are correct, we'll somehow manage to get > 60 seconds of CPU per instance (not quite clear how that works), and response times will start increasing dramatically.
If I restart the web server, CPU drops immediately, and then begins a slow creep back up. For instance, in the image below, you can see CPU creeping up, followed by the restart (with the red circle) and then a recovery of the CPU.
I'm strongly inclined to suspect that this is a problem somewhere in my own code, but I'm scratching my head as to how to figure it out. So far, any attempts to reproduce this on my dev or testing environments have proven ineffectual. Nearly all the suggestions for profiling IIS/C# performance seem to presume either direct access to the machine in question or at least a "Cloud Service" instance rather than an Azure Website.
I know this is a bit of a long shot, but... any suggestions, either for what it might be, or how to troubleshoot it?
(We're using C# 5.0, .NET 4.5.1, ASP.NET MVC 5.2.0, WebAPI 2.2, EF 6.1.1, Azure Service Bus, Azure SQL Database, Azure Redis Cache, and async for every significant code path.)
Edit 8/5/14 - I've tried some of the suggestions below. But when the website is truly busy, i.e., ~100% CPU utilization, any attempt to download a mini-dump or GC dump results in a 500 error, with the message, "Not enough storage." The various times that I have been able to download a mini-dump or GC dump, they haven't shown anything particularly interesting, at least, so far as I could figure out. (For instance, the most interesting thing in the GC dump was a half dozen or so >100KB string instances - those seem to be associated with the bundling subsystem in some fashion, so I suspect that they're just cached ScriptBundle or StyleBundle instances.)
Try remote debugging to your site from Visual Studio.
Try https://{sitename}.scm.azurewebsites.net/ProcessExplorer/ - there you can take memory dumps and GC dumps of your w3wp process.
Then you can compare two GC dumps to find memory leaks, and open the memory dump with WinDbg/VS for further "offline" debugging.

Multiple app instances, Windows GDI limit

I'm trying to run hundreds of instances of the same app (written in C#) simultaneously, and after about 200 instances the GUI starts to slow down dramatically, to the point that the load time of the next instance climbs to 20 s (from 1 s).
The test machine is:
Xeon 5520
12 GB RAM
Windows 2008 Web 64-bit
At max load (200 instances) the CPU is at about 20% and RAM at 45%, so I'm sure it's not a hardware issue.
I already tried configuring the Session size and SharedSection in the Windows registry, but it doesn't seem to help.
I also tried running the app in the background and across multiple (different) sessions, and still the same (I thought maybe it was a per-session limitation).
When the slowdown occurs on one session, for example, I can log in to another session and that desktop works without a problem (the first desktop becomes unusable).
My question is - is there a way to strip the GDI objects or maybe eliminate the use of the GUI? Or is it a Windows limitation?
P.S. - I can't change the app since it's third party.
Thanks in advance.
With 200 instances running, the constant context switching is probably hurting performance. Context switching isn't counted in CPU load.
Edit: whoops, wrong link.
Try monitoring context switching on your system
http://technet.microsoft.com/en-us/library/cc938606.aspx
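If it helps, a small C# sketch that samples the System\Context Switches/sec performance counter (roughly what that article describes); the interval and iteration count are arbitrary:

// Samples the system-wide context-switch rate. Rate counters need two readings,
// so the first NextValue() call just primes the counter.
using (var counter = new System.Diagnostics.PerformanceCounter("System", "Context Switches/sec"))
{
    counter.NextValue();
    for (int i = 0; i < 10; i++)
    {
        System.Threading.Thread.Sleep(5000);
        Console.WriteLine("Context switches/sec: {0:N0}", counter.NextValue());
    }
}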
I doubt it's GDI - if you run out of GDI handles/resources you'll notice vast chunks of your windows failing to redraw, rather than everything slowing down.
The most likely reason for a sudden drop in performance is that you are maxing out your RAM and thrashing your virtual memory as all your processes fight for CPU time. Check memory usage, and if it's high, see if you can reduce the footprint of your application. Or apply a "hardware fix" by installing more RAM. Or add sleeps into your apps where possible so that they aren't demanding constant timeslices from the CPU (and thus needing to be constantly paged back in from virtual memory).
