For the past two days I've been trying to resolve the following error:
Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.
As far as I can tell, all of our DbContext usages are wrapped in using blocks, yet it still looks like the contexts don't get disposed, or something else is holding connections.
Is there a way to determine whether the web app is leaking connections? Or is there something else I can try?
As a temporary measure I increased the timeouts in SQL Server and raised the pool size from its default of 100 to 200, but it didn't help, and that's not the solution I want anyway.
Any suggestions would help.
I think I have a case similar to yours. Our old project version (still using EF4) leaks connections when my code does NOT touch the context. The context is created and a new connection allocated, but on Dispose() the context does not seem to return its connection.
If I just perform one little query (.First() on any random table) on that underutilized context, the situation improves.
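Roughly, the workaround looks like this (just a sketch; MyEntities and SomeTable are placeholder names for our EF4 context and any table):
using (var ctx = new MyEntities())
{
    // Touch the context with one trivial query; without this, Dispose()
    // did not appear to return the underlying connection to the pool.
    var touch = ctx.SomeTable.First();

    // ... the work that previously never touched the context ...
}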
A more recent branch of my project uses EF6. That code works fine.
What helped me track this down was SQL Profiler. I also knew roughly what my problem user was doing, so I repeated his steps and kept an eye on Profiler's SPID column. Then I stepped through parts of the code until I got one of the ghost connections I was chasing.
(I stumbled across this just now, so I have not had time to investigate further)
I recently monitored my SQL database activity and found about 400 processes in Activity Monitor. Later I figured out that the problem was with my connection object, which was not physically cleared even though I had completely closed and disposed it; once I suspended IIS, all the processes in Activity Monitor disappeared.
After a little searching I found that I can clear all of my connections from the application's pool so that all the useless processes visible in SSMS would be killed, but
I'm really concerned about its impact on the web server. It's true that this approach would clear the useless tasks from SSMS, but then a brand-new connection really has to be created for every request. Is it worth it?
Considering my application is an enterprise app that is supposed to handle a great many requests, I'm afraid this approach could bring the IIS server down.
Note that my connection string value is not completely fixed across requests: I vary only its "Application Name" section per request, based on the request parameters, so that I can see the requestor's information in SQL Activity Monitor and SQL Profiler.
Is it worth doing this given my business scope, or is it better to fix the connection string value? In other words, is the performance hit of this approach so severe that I have to change my logging strategy, or is it just a little slower?
Note that my connection string value is not completely fixed across requests: I vary only its "Application Name" section per request, based on the request parameters, so that I can see the requestor's information in SQL Activity Monitor and SQL Profiler.
This is really bad because it kills pooling. You might as well disable pooling but that comes with a heavy performance penalty (which you are paying right now already).
Don't do that. Obtain monitoring information in a different way.
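For instance, here is a sketch of one alternative, assuming SQL Server 2016+ (for sp_set_session_context) and a single fixed connection string; requestorInfo stands in for whatever you currently encode in Application Name:
using (var conn = new SqlConnection(fixedConnectionString))
{
    conn.Open(); // drawn from the one shared pool
    using (var cmd = new SqlCommand(
        "EXEC sp_set_session_context @key = N'requestor', @value = @val;", conn))
    {
        cmd.Parameters.AddWithValue("@val", requestorInfo);
        cmd.ExecuteNonQuery();
    }
    // ... do the real work on the same connection; monitoring can read
    // SESSION_CONTEXT(N'requestor') instead of the Application Name ...
}
This keeps one pool for the whole app while still tagging each session with per-request information.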
Besides that, neither SQL Server nor .NET has a problem with 400 connections. That's unusually high, but it will not cause problems.
If you run multiple instances of the app (e.g. for HA) this will multiply. The hard limit is 32,767 connections. I'm not aware of any reason why this would cause a slowdown for the app, but it might cause problems for your monitoring tools.
I have a multi-threaded application that talks to SQL Server via LINQ to SQL. The app runs fine on a quad-core (Intel i7) machine when the number of threads is artificially kept at 8:
Parallel.ForEach(allIds,
    new ParallelOptions { MaxDegreeOfParallelism = 8 },
    x => DoTheWork(x));
When the number of threads is left to the system to decide:
Parallel.ForEach(allIds, x => DoTheWork(x));
After running for a little while, I get the following exception:
Timeout expired. The timeout period elapsed prior to obtaining a
connection from the pool. This may have occurred because all pooled
connections were in use and max pool size was reached.
There are only two patterns in my app for calling SQL:
first:
using (var dc = new MyDataContext())
{
    //do stuff
    dc.SafeSubmitChanges();
}
second:
using (var dc = new MyDataContext())
{
    //do some other stuff
    DoStuff(dc);
}
.....
private void DoStuff(DataContext dc)
{
    //do stuff
    dc.SafeSubmitChanges();
}
I decided to throttle the calls with logic of this form:
public static class DataContextExtension
{
    public const int SQL_WAIT_PERIOD = 5000;

    public static void SafeSubmitChanges(this DataContext dc)
    {
        try
        {
            dc.SubmitChanges();
        }
        catch (Exception e)
        {
            if (e.Message ==
                "Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.")
            {
                System.Data.SqlClient.SqlConnection.ClearAllPools();
                System.Threading.Thread.Sleep(SQL_WAIT_PERIOD);
                dc.SafeSubmitChanges();
            }
            else
            {
                throw;
            }
        }
    }
}
This made absolutely no difference. Once the app throws the first exception of this kind, all sorts of random places in the app (even lines of code that have nothing to do with SQL Server) start throwing this exception.
Q1: Isn't religiously employing the using statement supposed to guard against exactly this scenario?
Q2: What is wrong and how do I fix this?
Note: There are approximately 250,000 ids. I also tested with MaxDegreeOfParallelism = 16 and got the same exception.
I suppose it depends on how many items there are in allIds. If Parallel.ForEach creates too many concurrent tasks, each one may try to open a connection to the database (in parallel), exhausting the connection pool and leaving it unable to serve all the concurrent tasks requesting new connections.
If satisfying a connection request takes longer than the timeout, that error message would make sense. When you set MaxDegreeOfParallelism = 8, you have no more than 8 concurrent tasks, and thus no more than 8 connections "checked out" from the pool. By the time a task completes (and Parallel.ForEach has a free slot to run a new one), its connection has been returned to the pool, so when Parallel.ForEach runs the next item the pool can satisfy the next connection request; that is why you don't experience the issue when you artificially limit the concurrency.
EDIT 1
#hatched's suggestion above is on the right track: increase the pool size. However, there is a caveat. Your bottleneck likely isn't really computing power, but database activity. What I suspect (speculation, admittedly) is happening is that while talking to the database, a thread can't do much and gets blocked (or switches to another task). The thread pool sees that more tasks are pending but the CPU is underutilized (because of the outstanding IO operations), and so decides to take on more tasks for the available CPU slack. This of course just saturates the bottleneck even more, and you're back to square one. So even if you increase the connection pool size, you're likely to keep running into the wall until your pool size is as big as your task list. As such, you may actually want bounded parallelism so that it never exhausts the connection pool, fine-tuning the degree of parallelism up or down depending on DB load etc. (see the sketch below).
One way to find out whether the above is true is to see why connections are taking so long and not getting returned to the pool, i.e. analyze whether there is DB contention slowing all connections down. If so, more parallelization won't do you any good (in fact, it would make things worse).
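For what it's worth, a sketch of what bounding it against the configured pool might look like (connectionString is your app's own string; the halving is just a heuristic to leave headroom, so tune it against observed DB load):
var builder = new SqlConnectionStringBuilder(connectionString);
int cap = Math.Max(1, builder.MaxPoolSize / 2); // MaxPoolSize defaults to 100
Parallel.ForEach(allIds,
    new ParallelOptions { MaxDegreeOfParallelism = cap },
    x => DoTheWork(x));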
I was thinking the following might help. In my experience with Oracle, the DB connection pool has caused me issues before, so I thought there might be a similar issue with the SQL Server connection pool. Knowing the default connection settings and watching connection activity on the DB is often useful information.
If you are using SQL Server 2008, the default connection pool size is 100 and the default connection timeout is 15 seconds. I would have the SQL admin track how many connections you're making while running the app and see whether you're putting load on the DB server; maybe add some performance counters as well. Since this looks like a SQL Server exception, I would get some metrics to see what is happening. You could also use IntelliTrace to help see DB activity.
IntelliTrace Link: http://www.dotnetcurry.com/showarticle.aspx?ID=943
SQL Server 2008 Connection Pool Link: http://msdn.microsoft.com/en-us/library/8xx3tyca(v=vs.110).aspx
Performance Counters Link: http://msdn.microsoft.com/en-us/library/ms254503(v=vs.110).aspx
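As a crude starting metric, you can also count server-side connections from the app itself, assuming the login has VIEW SERVER STATE permission (a sketch):
using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand("SELECT COUNT(*) FROM sys.dm_exec_connections;", conn))
{
    conn.Open();
    int total = (int)cmd.ExecuteScalar();
    Console.WriteLine("Server-side connections: " + total);
}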
I could be way off target here, but I wonder if the problem is not being caused as a side effect of this fact about connection pooling (Taken from here, emphasis mine):
When connection pooling is enabled, and if a timeout error or other login error occurs, an exception will be thrown and subsequent connection attempts will fail for the next five seconds, the "blocking period". If the application attempts to connect within the blocking period, the first exception will be thrown again. Subsequent failures after a blocking period ends will result in a new blocking period that is twice as long as the previous one, up to a maximum of one minute.
So in other words, it's not that you are running out of connections per se, it's that something is failing on one or more of the parallel operations, perhaps because the poor table is caving under the pressure of parallel writes - have you profiled what's happening database-side to see if there are any problems with contention on the table during the operation?
This could then cause other requests for connections to start to back up due to the "penalty" described above; hence the exceptions and once you start to get one, your SafeSubmit method can only ever make things worse because it keeps retrying an already banjaxed operation.
This explanation would also heavily support the idea that the real bottleneck here is the database, and that maybe it's not a good idea to try to hammer a table with unbounded parallel IO; it's better to measure and come up with a maximum DOP based on the characteristics of what the database can bear (which could well be different for different hardware).
Also, as regards your first question, using only guarantees that your DataContext object will be auto-magically Dispose()d when its block ends, so it's not at all designed to protect in this scenario; it is just syntactic sugar for
var dc = new DataContext();
try
{
    //do stuff with dc
}
finally
{
    if (dc != null)
        dc.Dispose();
}
and in this case that's not a guard against there being (too) many DataContexts currently trying to connect to the database at the same time.
Are you sure you are not facing connection leaks? Please check out the accepted answer at this link.
Moreover, have you already set MultipleActiveResultSets = true?
From MSDN:
When true, an application can maintain multiple active result sets
(MARS). When false, an application must process or cancel all result
sets from one batch before it can execute any other batch on that
connection. Recognized values are true and false.
For more information, see Multiple Active Result Sets (MARS).
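If you want to try it, the connection string change is just this (server and database names are placeholders):
var cs = "Data Source=myServer;Initial Catalog=MyDb;Integrated Security=True;"
       + "MultipleActiveResultSets=True";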
We currently have a little situation on our hands: it seems that someone, somewhere forgot to close a connection in code. The result is that the pool of connections gets exhausted relatively quickly. As a temporary patch we added Max Pool Size=500; to the connection string on the web service, and we recycle the pool when all connections are spent, until we figure this out.
So far we have done this:
SELECT SPId
FROM MASTER..SysProcesses
WHERE DBId = DB_ID('MyDb') and last_batch < DATEADD(MINUTE, -15, GETDATE())
to get the SPIDs that haven't been used for 15 minutes. We're now trying to get the query that was last executed on such a SPID with:
DBCC INPUTBUFFER(61)
but the queries displayed vary, meaning either something at the base level of our connection handling is broken, or our deduction is erroneous...
Is there an error in our thinking here? Do DBCC / sysprocesses give the results we're expecting, or is there some side-effect catch (for example, the influence of pooled connections)?
(please, stick to what we could find out using SQL since the guys that did the code are many and not all present right now)
I would expect that there is a myriad of different queries 'remembered' by inputbuffer - depending on the timing of your failure and the variety of queries you run, it seems unlikely that you'd see consistent queries in this way. Recall that the connections will eventually be closed, but only when they're GC'd and finalized.
As Mitch suggests, you need to scour your source for connection-opens and ensure they're localized and wrapped in a using(). Also look for possibly-long-lived objects that might be holding on to connections. In an early version of our catalog, ASP page objects held connections that weren't managed properly.
To narrow it down, can you monitor connection-counts (perfmon) as you focus on specific portions of your app? Does it happen more in CRUD areas vs. reporting or other queries? That might help narrow down the source-scour you need to do.
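A sketch of reading those counters from code (PerformanceCounter lives in System.Diagnostics), assuming the ".NET Data Provider for SqlServer" counter category is available on the box; note that some of its counters are off by default and the instance names are per-process:
var category = new PerformanceCounterCategory(".NET Data Provider for SqlServer");
foreach (var instance in category.GetInstanceNames())
{
    using (var counter = new PerformanceCounter(
        category.CategoryName, "NumberOfPooledConnections", instance, true))
    {
        Console.WriteLine(instance + ": " + counter.NextValue());
    }
}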
Are you able to change the connection strings to contain information about where and why the connection was created in the Application Name field?
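Something like this sketch, where the tag is a hypothetical call-site name; be aware that every distinct connection string gets its own pool, so treat it as a temporary diagnostic rather than a permanent pattern:
var builder = new SqlConnectionStringBuilder(baseConnectionString)
{
    ApplicationName = "CatalogPage.LoadOrders" // where/why this connection was created
};
using (var conn = new SqlConnection(builder.ConnectionString))
{
    conn.Open();
    // program_name in MASTER..SysProcesses will now identify this call site
}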
We have a web service coded in C# that makes many calls to an MS SQL Server 2005 database. The code uses using blocks combined with ADO.NET's connection pooling.
During a SQL trace, we saw many, many calls to "sp_resetconnection". Most of these are short (< 0.5 sec), but sometimes we get calls lasting as much as 9 seconds.
From what I've read, sp_resetconnection is related to connection pooling and basically resets the state of an open connection. My questions:
Why does an open connection need its state reset?
Why so many of these calls?
What could cause a call to sp_resetconnection to take a non-trivial amount of time?
This is quite the mystery to me, and I appreciate any and all help!
The reset simply clears session state so that you don't have to reconnect to get a clean connection. It wipes the connection clean of things like SET or USE operations so each query has a clean slate.
The connection is still being reused. Here's an extensive list:
sp_reset_connection resets the following aspects of a connection:
It resets all error states and numbers (like @@error)
It stops all ECs (execution contexts) that are child threads of a parent EC executing a parallel query
It will wait for any I/O operations that are still outstanding
It will free any buffers on the server held by the connection
It will unlock any buffer resources that are used by the connection
It will release all memory owned by the connection
It will clear any work or temporary tables that are created by the connection
It will kill all global cursors owned by the connection
It will close any open SQL-XML handles
It will delete any open SQL-XML related work tables
It will close all system tables
It will close all user tables
It will drop all temporary objects
It will abort open transactions
It will defect from a distributed transaction when enlisted
It will decrement the reference count for users in the current database, which releases the shared database lock
It will free acquired locks
It will release any handles that may have been acquired
It will reset all SET options to the default values
It will reset the @@rowcount value
It will reset the @@identity value
It will reset any session level trace options using dbcc traceon()
sp_reset_connection will NOT reset:
Security context, which is why connection pooling matches connections based on the exact connection string
If you entered an application role using sp_setapprole, since application roles cannot be reverted
The transaction isolation level(!)
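That last point is worth a concrete illustration; here is a sketch (cs is a placeholder connection string, and the exact reset behaviour varies by SQL Server and driver version, so verify on yours):
using (var conn = new SqlConnection(cs))
{
    conn.Open();
    using (var tx = conn.BeginTransaction(System.Data.IsolationLevel.Serializable))
    {
        // ... work under SERIALIZABLE ...
        tx.Commit();
    }
}
// The physical connection goes back to the pool still at SERIALIZABLE;
// the next caller that draws it may inherit that isolation level.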
Here's an explanation of What does sp_reset_connection do? which says, in part "Data access API's layers like ODBC, OLE-DB and SqlClient call the (internal) stored procedure sp_reset_connection when re-using a connection from a connection pool. It does this to reset the state of the connection before it gets re-used." Then it gives some specifics of what that system sproc does. It's a good thing.
sp_resetconnection will get called every time you request a new connection from the pool.
It has to do this since the pool cannot guarantee that the user (you, the programmer, probably :)
has left the connection in a proper state; e.g. returning an old connection with uncommitted transactions would be... bad.
The number of calls should be related to the number of times you fetch a new connection.
As for some calls taking a non-trivial amount of time, I'm not sure. It could be that the server is just very busy processing other stuff at that time, or it could be network delays.
Basically the calls are there to clear out state information. If you have ANY open DataReaders, the reset will take a LOT longer. Each DataReader is only holding a single row but could pull more rows, and each one has to be cleared before the reset can proceed. So make sure you have everything in using() statements and are not leaving readers open anywhere.
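The safe shape looks like this sketch (table name is a placeholder):
using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand("SELECT Id FROM dbo.SomeTable;", conn))
{
    conn.Open();
    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            // consume every row; Dispose() then closes the reader even if
            // an exception is thrown, so nothing is left half-read at reset time
        }
    }
}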
How many total connections do you have running when this happens?
If you have a max of 5 and you hit all 5, then calling a reset will block, and it will appear to take a long time. It really doesn't; it is just blocked waiting on a pooled connection to become available.
Also if you are running on SQL Express you can get blocked due to threading requirements very easily (could also happen in full SQL Server, but much less likely).
What happens if you turn off connection pooling?
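For that experiment, the connection string change would be (placeholders for server/database):
var cs = "Data Source=myServer;Initial Catalog=MyDb;Integrated Security=True;Pooling=False";
Every Open() then creates a brand-new physical connection, so expect it to be slower; it's useful purely as a diagnostic.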
I have a web service that's been running fine without modification for a couple of years now. Suddenly today it decides that it would not like to function, and throws a SQL timeout:
System.Data.SqlClient.SqlException: Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
Interesting to note that this web service lives on the same server as the database, and also that if I pull the query out of a SQL trace and run it in Management Studio, it returns in under a second. But it times out after exactly 30 seconds when called from the web service, without fail. I'm using the Enterprise Library to connect to the database, so I can't imagine that randomly started failing.
I'm not quite sure what could suddenly make it stop working. I've recycled the app pool it's in, and even restarted the SQL process that I saw it was using. Same behavior. Any way I can troubleshoot this?
UPDATE: Mitch nailed it. As soon as I added "WITH RECOMPILE" before the "AS" keyword in the sproc definition (see the sketch below), it came back to life. Bravo!
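Here is the shape of that fix, issued from code purely for illustration; dbo.MyProc and its body are placeholders for the real sproc:
const string fixProc = @"
ALTER PROCEDURE dbo.MyProc @SomeParam int
WITH RECOMPILE  -- the line added before AS: compile a fresh plan per execution
AS
BEGIN
    SELECT * FROM dbo.SomeTable WHERE SomeColumn = @SomeParam;
END";
using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(fixProc, conn))
{
    conn.Open();
    cmd.ExecuteNonQuery();
}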
The symptoms you describe are 99.9% certain of being due to an incorrectly cached query plan.
Please see these answers:
Big difference in execution time of stored proc between Managment Studio and TableAdapter
Rule of thumb on when to use WITH RECOMPILE option
which include the advice to rebuild indexes and ensure statistics are up to date as a starting point.
Do you have a regular index maintenance job scheduled and enabled?
The canonical reference is: Slow in the Application, Fast in SSMS?
Rebuild any relevant indexes.
Update statistics, and check the SET options on the query in Profiler; SSMS might be using different connection SET options.