Here is my problem:
I have just been brought onto a massive ASP.NET C# project and I've been charged with fixing some performance issues (not my area of expertise). More specifically, after 5 - 7 redirects/ajax calls the web server stops responding and the whole page (and eventually the browser) freezes.
I don't think this is a coding issue, as I've set up breakpoints in a few pages (Page_Load method) and after the 5 requests execution does not even reach the breakpoints.
I don't believe this is related to the browser's per-server connection limit, as I've increased the browser's maximum connections per server parameter and got the same behavior. Furthermore, after these 5 requests in one browser (IE), the application stops working in FF as well.
This is not a resource issue as the w3wp.exe process never exceeds 500MB memory.
One thing I've noticed when using Fiddler and other tools to monitor the requests is that the server takes a very long time when loading image files (png, jpg). I don't know if this is relevant.
I've enabled failed request tracing on the server and the only thing I've noticed is that some requests fail with a 401 error even though I've set Anonymous Authentication to enabled.
Here is the exact message
MODULE_SET_RESPONSE_ERROR_STATUS
ModuleName ManagedPipelineHandler
Notification 128
HttpStatus 401
HttpReason Unauthorized
HttpSubStatus 0
ErrorCode 0
ConfigExceptionInfo
Notification EXECUTE_REQUEST_HANDLER
ErrorCode The operation completed successfully. (0x0)
This message is sometimes thrown with ModuleName: ScriptModule
I have already wasted 2 days on this thing and I'm running out of ideas so any suggestions would be appreciated.
Like any large generic problem, your best bet in diagnosing the issue is to figure out how to break it down into smaller parts, how to form hypotheses about the cause, and how to validate or invalidate those hypotheses. My first inclination would be to hypothesize that the server-side processes in this particular case are taking a long time, causing your client requests to block and making the whole thing seem frozen.
From there, I would attempt to replicate the long running server side processes by creating isolated client side tests - perhaps if the URLs are HTTP gets, I would test the same URLs individually. If they were HTTP posts, I'd create an isolated test form if feasible to see what happens with each request. If a long running server side process is found then you have a starting point.
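One way to run such isolated tests for HTTP GETs is a small timing helper. This is only a sketch: `RequestTimer`, `TimeEachAsync`, and the `fetch` delegate are names invented here for illustration, and the requests are deliberately issued one at a time so a slow URL stands out.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

public static class RequestTimer
{
    // Calls fetch(url) for each URL sequentially and records how long each call took.
    // 'fetch' is supplied by the caller (hypothetical helper, not part of the original project).
    public static async Task<List<KeyValuePair<string, TimeSpan>>> TimeEachAsync(
        IEnumerable<string> urls, Func<string, Task> fetch)
    {
        var results = new List<KeyValuePair<string, TimeSpan>>();
        foreach (var url in urls)
        {
            var stopwatch = Stopwatch.StartNew();
            try
            {
                await fetch(url);
            }
            catch (Exception ex)
            {
                // A failure is still a data point; report it and keep going.
                Console.WriteLine(url + " failed: " + ex.Message);
            }
            stopwatch.Stop();
            results.Add(new KeyValuePair<string, TimeSpan>(url, stopwatch.Elapsed));
        }
        return results;
    }
}
```

With an `HttpClient` instance named `client`, a call could look like `await RequestTimer.TimeEachAsync(urls, u => client.GetAsync(u))`. Any URL whose elapsed time is far above the others is a candidate for the long-running server-side process.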
If there are no long running server side processes then it may be JavaScript / client side coding issues that need to be looked into. But definitely, when you're working on a large, unfamiliar project, your best bet is to figure out how to break down the issue into smaller components that can then be tested.
I solved the issue finally. Here is what I did:
Experimented with IIS settings and App_Pool recycling and noticed that there is nothing wrong with the way it handles requests that actually reach it.
I focused on the Http.sys module and noticed that in the log files there were a lot of Timer_ConnectionIdle and Client_Reset errors.
After some more experimentation and a lot of Google searches, I accidentally found this answer and it solved my issue. As the answer suggests the problem was caused by the AVG antivirus installed and incorrectly configured on the server.
Thanks for all the help and suggestions.
If it's ajax calls that are causing your browser to freeze, make sure they are not blocking ajax calls.
Just appending to Shan's answer, which is a good one.
First off, there is obviously a code issue as this is by no means 'normal' behavior for IIS.
That said, you must isolate it as Shan indicated. For example, given the server itself no longer accepts connections then we can pretty well eliminate javascript as the source of the problem and relegate it to being just a symptom.
Typically when a worker process spins into space like this it is due to either an infinite loop or an issue where multiple threads are trying to lock the same resource. I bet if you let it run long enough IIS itself will timeout, kill and restart the process.
With that in mind you want to look for any type of multithreaded garbage (which I highly recommend you don't do in a web server) or for anything that indicates a tight infinite loop. A loop is going to become apparent if you execute the requests individually. A multi-threaded issue will only show up if you happen to get a collision.
Run various performance counters on the web server. Also, once it locks up, let it sit that way for a while. Once IIS performs its own reset on the worker process, go look for indicators in the event log.
I have a .NET Core API that must make around 150,000 calls to collect data from external services. I am running these requests in parallel using Parallel.ForEach and that seems to be working great; however, I get an error from the HTTP client for around 100,000 of my requests!
The Operation was canceled
Looking back at this, I wish I had also logged the exception type, but I believe this is due to the outgoing connection limit being too low.
Through debugging I have found that this returns 2:
ServicePointManager.DefaultConnectionLimit
On the face of it, if this really is the maximum number of open connections allowed to an external domain / server, I want to increase that as high as possible. Ideally to 150,000, to ensure parallel processing doesn't cause an issue.
The problem is I can't find any information on what a safe limit is, or how much load this will put on my machine, if it is even a lot. Since this issue causes a real request to be made, my data provider counts it in my charges, but obviously I get nothing from it since the .NET Core framework is just throwing my result away.
Since this problem is also intermittent it can be difficult to debug and I would just like to set this value as high as is safe to do so on my local machine.
I believe this question is relevant to stackoverflow since it does deal directly with the technical issue above, whereas other questions I could find only ask details about what this setting is.
As far as I understand, you are trying to make 150,000 simultaneous requests to external services. I presume that your services are RESTful web services. If that is the case, when you set DefaultConnectionLimit to an arbitrary (very high) number, every single request opens a port for requesting data. This definitely clogs your network and your ports (port range is 0 to 65535).
Besides, making 150,000 requests without any throttling consumes your OS resources uncontrollably.
DefaultConnectionLimit is there because it protects you from aforementioned problems.
You may want to consider using SemaphoreSlim for throttling.
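A minimal sketch of that kind of throttling, assuming the work can be expressed as an async delegate per item. `Throttle`, `RunAsync`, and `maxConcurrency` are hypothetical names chosen for this example; the right concurrency value depends on the remote service and is not something the question establishes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class Throttle
{
    // Runs 'work' for every item, but lets at most 'maxConcurrency' calls be in flight at once.
    public static async Task<TResult[]> RunAsync<TItem, TResult>(
        IEnumerable<TItem> items, int maxConcurrency, Func<TItem, Task<TResult>> work)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = items.Select(async item =>
            {
                await gate.WaitAsync();      // wait for a free slot
                try
                {
                    return await work(item);
                }
                finally
                {
                    gate.Release();          // free the slot even if 'work' throws
                }
            }).ToList();                     // ToList starts every task exactly once

            return await Task.WhenAll(tasks);
        }
    }
}
```

All tasks are created up front, but each one waits on the semaphore before doing any real work, so the number of simultaneous outgoing requests stays bounded regardless of how many items there are.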
My company has an application that keeps track of information related to web sites that are hosted on various machines. A central server runs a windows service that gets a list of sites to check, and then queries a service running on those target sites to get a response that can be used to update the local data.
My task has been to apply multithreading to this process to reduce the time it takes to run through all the sites (almost 3000 sites that take about 8 hours to run sequentially). The service runs through successfully when it's not multithreaded, but the moment I spread out the work to multiple threads (testing with 3 right now, plus a watcher thread) there's a bizarre crash that seems to originate from the call to the remote services that are supposed to provide the data. It's a SOAP/XML call.
When run on the test server, the service just gives up and doesn't complete its task, but doesn't stop running. When run through the debugger (Dev Studio 2010) the whole thing just stops. I'll run it, and seconds later it'll stop debugging, but not because it completed. It does not throw an exception or give me any kind of message. With breakpoints I can walk through to the point where it just stops. Event logging leads me to the same spot. It stops on the line of code that tries to get a response from the web service on the other sites. And again: it only does that when multithreaded.
I found some information that suggested there's a limit to the number of connections that defaults to 2. The proposed solution is to add some tags to the app.config, but that hasn't solved the problem...
<system.net>
  <connectionManagement>
    <add address="*" maxconnection="20"/>
  </connectionManagement>
</system.net>
I still think it might be related to the number of allowed connections, but I have been unable to find much information about it online. Is there something straightforward I'm missing? Any help would be much appreciated.
No crash, however bizarre, will escape the stack dump. Try going through that dump and see if it points to some obvious function.
Are you using some third party tool or some other component for the actual service call? If yes, then please check the documentation, or contact the person who wrote it, to confirm that their components are thread safe. If they are not, you have a large task ahead. :) (I have worked on DBs which are not thread safe, so trust me, it is not very uncommon to find a few global static variables thrown around.)
Lastly, if you are 100% sure that this is due to multiple threads, then put a lock in your worker thread. Initially, say it covers the entire main while-loop. Theoretically it should not crash, because even though it is multi-threaded, you have serialized the execution.
The next step is to reduce the scope of the lock. Say there are three functions in the main while-loop, f1(), f2(), f3(); then start locking f2() and f3() while leaving f1() unlocked. If things work out, then the problem is somewhere in f2() or f3().
I hope you get the idea of what I am suggesting.
I know this is like a blind man guessing at an elephant, but that is the best you can do if your code uses a lot of external components which are not adequately documented.
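The lock-narrowing steps can be sketched like this. `SerializedLoop` and its members are hypothetical; the three `Action` delegates stand in for the f1(), f2(), f3() calls of the real worker loop, which is not shown in the question.

```csharp
using System;

public sealed class SerializedLoop
{
    // One lock object shared by every worker thread.
    private static readonly object Gate = new object();

    private readonly Action _f1;
    private readonly Action _f2;
    private readonly Action _f3;

    public SerializedLoop(Action f1, Action f2, Action f3)
    {
        _f1 = f1;
        _f2 = f2;
        _f3 = f3;
    }

    // Step 1: the whole loop body is under the lock, so threads effectively run one at a time.
    public void RunFullyLocked()
    {
        lock (Gate)
        {
            _f1();
            _f2();
            _f3();
        }
    }

    // Step 2: f1 runs unlocked. If the crash stays away, the unsafe code is in f2 or f3;
    // if it comes back, f1 is the suspect.
    public void RunPartiallyLocked()
    {
        _f1();
        lock (Gate)
        {
            _f2();
            _f3();
        }
    }
}
```

Each worker thread would call one of these methods per iteration. Keep moving calls out of the lock until the crash reappears; the last call moved is where to look.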
I am experiencing the exact same issue as a user reports on eggheadcafe, but don't know what steps to take after reading the following answer:
Two problems you should chase down:
1. Why is the website leaking resources to the finalizers. That is bad
2. What is Oracle code waiting on -- work with Oracle's support on it
This is the issue:
I have an intermittent problem with a web site hosted on IIS6 (w2k3 sp2). It appears to occur randomly to users when they click on a hyperlink within a page. The request is sent to the web server but a response is never returned. If the user tries to navigate to another hyperlink they are not able to (i.e. the web site appears to hang for that user). Other users of the website at the time are not affected by this hang and if the user with the problem opens a new http session (closing IE and opening the web site again) they no longer experience the hang.
I've placed a debugger (IISState) on the w3wp process with the following output. Entries with "Thread is waiting for a lock to be released. Looking for lock owner." look like they might be causing the issue. Can anyone tell what lock the process is waiting on?
Thanks
http://www.eggheadcafe.com/software/aspnet/33799697/session-hangs.aspx
In my case my .Net C# MVC application runs against a MySQL database for data and a MS SQL database for .Net membership.
I hope someone with more knowledge of IIS can help resolve this problem.
It sounds like you have a race condition in your database calls resulting in a deadlock at the database level. You may want to look at the settings you have in your application pool for database connections. Likely you will need to put some checks in somewhere or redefine procedures in order to reduce the likelihood of the race:
http://msdn.microsoft.com/en-us/library/ms178104.aspx
I would attribute the hang you are seeing to session serialization. Not the part about saving/loading it from some source, but the fact that ASP.NET does not allow the same session to execute two pages in parallel, unless they execute with a read-only session. The latter is done either in the page directive or in web.config, by setting EnableSessionState="ReadOnly".
Your problem still exists, though; this won't change the fact that the first thread hangs. I would verify that your database connections are disposed correctly. However, you never mention any Oracle database in your question (only MySQL and SQL Server). Why are you using the Oracle drivers at all? (This seems like a valid place to start debugging.)
However, as stated by David Wang in his answer in your linked question, part two of your problem is a lock that's never released. You'll need support from Oracle (or their source code) to debug this further.
An IIS hang is not surprising. IISState is out of date; you can use Debug Diag instead:
http://support.microsoft.com/kb/919791 (if CPU usage is high)
http://support.microsoft.com/kb/919792 (otherwise)
The hang dumps should tell you what the root cause is.
Microsoft support can help analyze the dumps, if you are not familiar with the tricks. http://support.microsoft.com
I have a web service written in C#.
It behaves rather strangely during pool recycling.
Say I configure a pool with 5 worker processes which should recycle after 100 requests (in production it's actually 10000, but never mind that). I get the proper response for the first 100 requests per process (i.e. 500 requests), but after that some of the requests return an improper result (I also get timeouts, but that is okay as the process is recycling).
Since these improper results seem to happen AFTER the recycle, while the service is starting up, it is rather hard to just attach the debugger and see what happens (as the debugger is detached when the recycle occurs).
So my question(s) is/are:
1. Does anybody know a good method for debugging this kind of thing?
Edit: 2. Does anybody happen to have an idea of what might be wrong (the service has no state information between requests)? - I found the error by attaching the debugger and luckily seeing an exception (caught in a global exception handler - god I hate those). But question 1 still stands: is there an easier way than attaching the debugger and hoping you make it in time to see the error?
You should make it clear what the improper result is. If it is not a .NET error, you should review your code and add some application level logging to your own code.
A debugger can only help when you have nothing else to resort to.
What I have ended up doing (for now) is to remove most of those "semi-global" try/catch/do-nothing handlers and then write a SoapExtension for handling "Unhandled Exceptions" that dumps out all the information I can get at.
I got most of the inspiration from Jeff Atwood's article on CodeProject: http://www.codeproject.com/kb/aspnet/ASPNETExceptionHandling.aspx
It's not really the same as attaching the debugger, but it will have to do for now.
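For readers who have not written one, a SoapExtension along those lines might look roughly like the following. This is not the poster's actual code: the class name and the choice of `Trace.TraceError` as the log sink are assumptions, and it only applies to classic ASMX services on the .NET Framework (it needs a reference to System.Web.Services).

```csharp
using System;
using System.Diagnostics;
using System.Web.Services.Protocols;

// Hypothetical extension that logs any exception escaping a web method.
public class ExceptionLoggingExtension : SoapExtension
{
    // No per-service or per-method state is needed, so the initializers return null.
    public override object GetInitializer(Type serviceType)
    {
        return null;
    }

    public override object GetInitializer(LogicalMethodInfo methodInfo, SoapExtensionAttribute attribute)
    {
        return null;
    }

    public override void Initialize(object initializer)
    {
    }

    public override void ProcessMessage(SoapMessage message)
    {
        // On the server, an unhandled exception is wrapped in a SoapException and is
        // visible on the message once the response has been serialized.
        if (message.Stage == SoapMessageStage.AfterSerialize && message.Exception != null)
        {
            Exception cause = message.Exception.InnerException ?? message.Exception;
            Trace.TraceError("Unhandled exception in {0}: {1}", message.MethodInfo.Name, cause);
        }
    }
}
```

The extension still has to be registered, typically under the soapExtensionTypes element of the webServices section in web.config, before ASP.NET will call it. Because it writes to a log, it keeps working across pool recycles where an attached debugger would be detached.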
We have an IIS hosted web method which is randomly dying on us about 10% of the time. In trying to debug this we've added Log.Debug() messages in front of every real code line and it appears to be dying on random lines.
Has anyone seen this or have an idea on how to debug this?
[Additional Details]
We've spent a lot of time looking at it and have discovered the following...
We have a separate self-hosted WCF Service that accesses the same database and lives on the same machine. When it is under heavy load the web method croaks every time. If it's not under load then things usually work fine (but not 100%).
High CPU doesn't seem to be part of the problem. We ran a small app that created a high cpu load and the web service did not die.
The web service dies when we either new up an XmlSerializer (without doing the sgen precomp) OR have NHibernate create a SessionFactory. The only two things these have in common are that they 1) seem like things people commonly do and 2) seem like they would be fairly intensive.
We've added a Global.asax to try to capture Application_End and Application_Error but neither event gets fired. This to me implies that we're not dealing with a normal application pool resetting?
Sounds like it might be a threading issue. You are using informative debug messages -- you should try to reproduce the issue while running the debugger and breaking on all exceptions. Make sure you check all the windows logs for information on why the app pool crashed.
Per comment: It's hard to say, but many things can cause a thread to appear to "just die." Memory issues: are you doing any interop? Improper marshaling: are you touching data on another thread? But I will play the probabilities and ask if you're sure you're handling any exception that might be happening and logging it. Are you sure you are not gobbling up an exception and not reporting it? Somewhere down low? Is this a permissions issue? Are you running in partial trust or on a low privilege user account?
Figured it out.. two problems really..
We added Global.asax but it didn't get copied over which explains why we weren't seeing any messages. We fixed this and found out that...
Our WCF log was being written out to the bin directory of the IIS Web Service. In retrospect this is kind of silly since the WS is an old school web service. The WCF stuff is in the same directory only for some reason that is unknown to us since the initial person who set things up is gone..
Lesson learned.. Somewhere there is a message that explains everything.. you just have to find it.