What could be rate limiting CPU cycles on my C# WCF Service?


Something very strange started happening on our production servers a day or two ago with a WCF service we run there: something seems to be rate limiting the process's CPU cycles to the amount that would be available on one core, even though the load is spread across all cores (the process is not burning one core at 100% usage).
The service is mostly just a CRUD (create, read, update, delete) service, with the exception of a few long-running service calls (they can take up to 20 minutes). These long-running calls kick off a simple Thread and return void so as not to make the client application wait or hold up the WCF connection:
// WCF Service Side
[OperationBehavior]
public void StartLongRunningProcess()
{
    Thread workerThread = new Thread(DoWork);
    workerThread.Start();
}

private void DoWork()
{
    // Call SQL Stored proc
    // Write the 100k+ records to new excel spreadsheet
    // return (which kills off this thread)
}
Before the above call is kicked off, the service responds as it should, fetching data to display on the front-end quickly.
When you kick off the long-running process and the CPU usage goes to 100 / CPUCores percent, the front-end response gets slower and slower, and after a few minutes the service won't accept any more WCF connections.
What I think is happening is that the long-running process is using all the CPU cycles the OS allows it, because something is rate limiting it, and WCF never gets a chance to accept the incoming connection, never mind execute the request.
At some point I started wondering whether the cluster our virtual servers run on is somehow doing this, but then we managed to reproduce it on our development machines with the client talking to the service over the loopback address, so the hardware firewalls are not interfering with the network traffic either.
While testing this inside Visual Studio, I managed to start 4 of these long-running processes and confirmed with the debugger that all 4 are executing simultaneously on different threads (by checking Thread.CurrentThread.ManagedThreadId), yet they still use only 100 / CPUCores percent of CPU in total.
On the production server, usage doesn't go over 25% (4 cores). When we doubled the CPU cores to 8, it didn't go over 12.5%.
Our development machines have 8 cores and also won't go over 12.5% CPU usage.
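The 25% and 12.5% ceilings reported above are exactly what one fully busy core looks like when expressed as a share of total CPU. A minimal sketch of that arithmetic follows; the class and method names are hypothetical and used only for illustration.

```csharp
using System;

public static class CpuCeiling
{
    // Share of total CPU (in percent) that a single fully busy core represents.
    public static double OneCorePercent(int coreCount)
    {
        if (coreCount <= 0)
            throw new ArgumentOutOfRangeException("coreCount");
        return 100.0 / coreCount;
    }
}
```

Calling CpuCeiling.OneCorePercent(4) gives 25 and CpuCeiling.OneCorePercent(8) gives 12.5, matching the production and development figures; passing Environment.ProcessorCount gives the ceiling for the current machine.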
Other things worth mentioning about the service:
It's a Windows Service
It's running inside of a TopShelf host
The problem didn't start after a deployment (of our service anyway)
Production server is running Windows Server 2008 R2 Datacenter
Dev Machines are running Windows 7 Enterprise
Things that we have checked, double checked, and tried:
Changing the process' priority up to High from Normal
Checked that the processor affinity for the process is not limiting to a specific core
The [ServiceBehavior] Attribute is set to ConcurrencyMode = ConcurrencyMode.Multiple
Incoming WCF Service calls are executing on different threads
Removed TopShelf from the equation by hosting the WCF service in just a console application
Set the WCF Service throttling values: <serviceThrottling maxConcurrentCalls="1000" maxConcurrentInstances="1000" maxConcurrentSessions="1000" />
Any ideas on what could be causing this?

There must be a shared resource that only allows a single thread to access it at a time. This would effectively only allow one thread at a time to run, and create exactly the situation you have.
Processor affinity masks are the only way to limit a process to a single CPU, and if you did this you would see one CPU pinned and all the others idle (which is not your situation).
We use a tool called LeanSentry that is very good at identifying these kinds of problems. It will attach itself to IIS as a debugger and capture stack dumps of all executing processes, then tell you if most of your threads are blocked in the same spot. There is a free trial that would be long enough for you to figure this out.
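The shared-resource explanation above can be illustrated with a small sketch. It is not the asker's code: the lock object, class name, and spin-wait workload are all assumptions standing in for whatever resource the real worker threads contend on. Several threads run at once, but because each one does its CPU-bound work while holding the same lock, no more than one of them is ever inside the work section, so total CPU use stays at about one core's worth even though the scheduler may move the work between cores.

```csharp
using System;
using System.Threading;

public static class SerializedWorkDemo
{
    // Hypothetical shared resource: every worker must hold this lock to do its work.
    private static readonly object SharedGate = new object();

    // Starts threadCount threads that all do their work under the same lock and
    // returns the highest number of threads observed inside the lock at once.
    public static int MaxThreadsInsideLock(int threadCount, int iterationsPerThread)
    {
        int inside = 0;
        int maxInside = 0;
        var threads = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int n = 0; n < iterationsPerThread; n++)
                {
                    lock (SharedGate)
                    {
                        int now = Interlocked.Increment(ref inside);
                        if (now > maxInside) maxInside = now; // only written while holding the lock
                        Thread.SpinWait(1000);                // stand-in for CPU-bound work
                        Interlocked.Decrement(ref inside);
                    }
                }
            });
            threads[i].Start();
        }

        foreach (var t in threads) t.Join();
        return maxInside;
    }
}
```

However many threads are started, the method returns 1. A stack dump taken while such a process is running would show most threads waiting at the same lock, which is the pattern to look for in the real service.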

The CPU usage pattern looks to me like a lock on a table in the SQL database. I would use SQL Server Management Studio to analyze the statements and see whether that confirms it.
You also mentioned that you call a stored procedure, so you might want to take a look at that as well.
This all just looks like a database issue to me.

Related

IIS web server and thread pool issues

This question is about an ASP.NET 4.0, IIS-based Azure cloud service:
I need to know the right number of IOCP threads to set for a production web service where we make 10-20K remote calls/sec.
I also need to know the right number of worker threads to set for a production web service, especially to handle 10-20K API calls/sec arriving in bursts.
Basically, each of my cloud service VMs should handle 10-20K requests/sec, but it is not able to do so because of a thread pool issue in ASP.NET.
My production service does nothing but get data from Redis and return it.

Assuming the code is efficient and there is enough hardware, i.e. there are no issues related to memory, CPU, or network:
1. You should try to keep IOCP threads to a minimal number, 50-100.
2. You should try to keep worker (CPU) threads high enough to handle bursts of requests.
I am not sure it's a good idea to keep 2-5K active threads to cater to 10-20K requests/sec.
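One way to apply settings like these from code is ThreadPool.SetMinThreads, which takes the worker-thread minimum and the IOCP-thread minimum. The sketch below is only an illustration: the helper name is hypothetical, and any numbers passed to it are placeholders to be replaced by values found through load testing.

```csharp
using System;
using System.Threading;

public static class ThreadPoolTuning
{
    // Raises the thread pool minimums so a burst of requests does not have to
    // wait for the pool to add threads gradually. Existing minimums are never lowered.
    // Returns true if the runtime accepted the new values.
    public static bool RaiseMinimums(int minWorkerThreads, int minIocpThreads)
    {
        int currentWorker, currentIocp;
        ThreadPool.GetMinThreads(out currentWorker, out currentIocp);

        return ThreadPool.SetMinThreads(
            Math.Max(currentWorker, minWorkerThreads),
            Math.Max(currentIocp, minIocpThreads));
    }
}
```

For example, ThreadPoolTuning.RaiseMinimums(200, 100) could be called once at application start. SetMinThreads returns false, and changes nothing, when a requested minimum exceeds the pool's current maximum.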

Discrete .NET middleware processor vs spawning a new process from IIS

I have a 4 tier .NET application which consists of a
Silverlight 5 Client
MVC4 Web API Controller (Supplying data to the SL5 Client)
Windows Service - responsible for majority of data processing.
Oracle DB storage.
The workflow is simple: SL5 client sends a request to the rest service, the rest service simply stores it in the DB.
The windows service, while periodically polling the DB for new records, detects the new records and attempts to process them accordingly. Once finished it updates the records and their status in the DB.
In the meantime the SL5 Client also periodically polls the DB to see if the records have been processed. When they are, the result is retrieved and rendered on the screen.
So the question here is the following:
Is there a difference between spawning the same processing code (currently in the windows service) in a new discrete process (right out of the Web API Controller), vs keeping it as is in the windows service?
Aside from removing the constant DB polling that happens in the windows service, it simplifies processing greatly because it can be done on a per-request basis as the requests arrive from the client. But are there any other drawbacks? Perhaps server or other issues with IIS?

Yes there is a difference.
Windows services are the right tool for asynchronous processing. Operations can take a long time without producing strange effects. After all, it is a continuously running service.
IIS, on the other hand, processes requests using a thread pool. Long-running tasks have the potential to exhaust that thread pool, so this may cause problems depending on the number of background tasks you start. Also, IIS makes no guarantee to keep long-running tasks alive. If the web site is recycled, which happens regularly in a default IIS installation, your background task may die.

WCF NetNamedPipeBinding replying slowly under heavy load

I have a WCF server using NetNamedPipeBinding.
When the server is loaded with requests, replies are very slow (1-7 seconds).
The application code runs very fast, but the time between sending the reply and receiving it is long.
Is this because there are lots of messages in the pipe and they are processed sequentially? Is there a way to improve that?
There are only 2 processes involved (caller and service), the calls are two-way, and the caller process uses different threads to make the calls.
Thanks.

If you are creating a separate Thread for each request, you could be starving your system. Since both client and server are on the same machine, it may be the client's fault that the server is slow.
There are lots of ways to do multithreading in .NET, and creating a new Thread for each call may be the worst. At the very least you should move your calls to the thread pool (http://msdn.microsoft.com/en-us/library/3dasc8as.aspx),
or you may want to use the async methods of the proxy (http://msdn.microsoft.com/en-us/library/ms730059.aspx).
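A minimal sketch of the thread pool option, assuming the client currently creates one Thread per call: the calls are queued with ThreadPool.QueueUserWorkItem and a CountdownEvent is used to wait for them. The helper name and the call delegate are hypothetical; in a real client the delegate would invoke the WCF proxy.

```csharp
using System;
using System.Threading;

public static class PooledCalls
{
    // Queues callCount invocations of call on the thread pool and blocks until
    // all of them have finished. Returns how many completed without throwing.
    // Note: an exception thrown by call is not caught here.
    public static int RunOnThreadPool(int callCount, Action<int> call)
    {
        int completed = 0;
        using (var done = new CountdownEvent(callCount))
        {
            for (int i = 0; i < callCount; i++)
            {
                int index = i; // copy so each work item sees its own value
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    try
                    {
                        call(index);
                        Interlocked.Increment(ref completed);
                    }
                    finally
                    {
                        done.Signal();
                    }
                });
            }
            done.Wait();
        }
        return completed;
    }
}
```

The pool reuses a bounded set of threads instead of creating and destroying one per request, which is the main reason it tends to behave better under load than raw Thread objects.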

Windows kernel queuing outbound network connections

We have an application (a meta-search engine) that must make 50 - 250 outbound HTTP connections in response to a user action, frequently.
The way we do this is by creating a bunch of HttpWebRequests and running them asynchronously using Action.BeginInvoke. This uses the ThreadPool to launch the web requests, which each run synchronously on their own thread. Note that it is currently done this way because this was originally a .NET 2.0 app and there was no TPL to speak of.
Using ETW (our event sources combined with the .NET Framework and kernel ones) and NetMon, what we see is that while the thread pool can start 200 threads running our code in about 300ms (so no thread pool exhaustion issues here), it takes a variable amount of time, sometimes up to 10-15 seconds, for the Windows kernel to make all the TCP connections that have been queued up.
This is very obvious in NetMon: you see around 60-100 TCP connections open (SYN) immediately (the number varies, but it's never more than around 120), then the rest trickle in over a period of time. It's as if the connections are being queued somewhere, but I don't know where, and I don't know how to tune this so we can perform more concurrent outgoing connections. The Perfmon Outbound Connection Queue counter stays at 0, but in the Connections Established counter you can see an initial spike of connections and then a gradual increase as the rest filter through.
It does appear that latency to the endpoints we are connecting to plays a part, as running the code close to those endpoints doesn't show the problem as significantly.
I've taken comprehensive ETW traces but there is no decent documentation on many of the Microsoft providers, which would be a help I'm sure.
Any advice to work around this or advice on tuning windows for a large amount of outgoing connections would be great. The platform is Win7 (dev) and Win2k8R2 (prod).

It looks like slow DNS queries are the culprit here. Looking at the ETW provider "Microsoft-Windows-Networking-Correlation", I can trace the network call from inception to connection and note that many connections are taking > 1 second at the DNS resolver (Microsoft-Windows-RPC).
It appears our local DNS server is slow or can't handle the load we are throwing at it, and isn't caching aggressively. Production did not show symptoms as severe because the production DNS servers do everything right.
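A quick way to check for slow resolution without a full ETW trace is to time a lookup directly. This is a rough sketch rather than the method used above; the helper name is hypothetical, and a single measurement can be skewed by the operating system's DNS cache, so uncached host names give the most meaningful numbers.

```csharp
using System;
using System.Diagnostics;
using System.Net;

public static class DnsTiming
{
    // Resolves hostName once and returns how long the lookup took.
    // Dns.GetHostAddresses throws a SocketException if resolution fails.
    public static TimeSpan TimeLookup(string hostName)
    {
        var watch = Stopwatch.StartNew();
        Dns.GetHostAddresses(hostName);
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Running DnsTiming.TimeLookup against each endpoint host and logging any result over a second would show whether the delay sits in name resolution before the TCP connection is even attempted.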

How to avoid a manually started thread from dying?

I have a process in a system that runs on IIS. It takes hours to finish, so it runs on its own thread.
The problem is that this thread is dropped after some time because the IIS process times out (no activity). This thread can't stop in the middle.
How can I prevent this timeout while the thread is running?

In the settings of the Application Pool in IIS you could configure it not to recycle the AppDomain after a certain period of inactivity. Note, however, that running long-running tasks in IIS is a bad idea and this setting might not be 100% reliable. For example, if your server starts running low on memory or CPU usage gets high, IIS could still recycle it. IIRC this threshold can also be configured.
The best way would be to externalize those long-running tasks into a separate Windows Service.
And if you cannot do either of those things and you are absolutely desperate, the last thing you could try is to auto-ping the web application from this background thread by sending HTTP requests at regular intervals to keep it from dying. But once again, that should really be the last thing you attempt.
