I am currently calling several stored procedures from some .NET code via SqlConnection. I'd like to disable the caching done by SQL Server so that I can measure performance periodically (I'm gonna be comparing it to another server that likely won't have any cached data either). Is this possible to do without modifying the sprocs?
This is the code that I am currently using:
using (SqlConnection connection = new SqlConnection(/* connection string goes here */))
using (SqlCommand command = new SqlCommand(procName, connection))
{
    command.Parameters.AddRange(parameters);
    command.CommandType = System.Data.CommandType.StoredProcedure;
    connection.Open();
    using (SqlDataReader r = command.ExecuteReader())
    {
        // todo: read data here
    }
}
First thing, by "cacheing" here I'm assuming you're referring to the Execution Plan Cache. Once SQL Server figures out the best order to execute your statements, it stores it for a while. This problem is commonly known as "Parameter Sniffing". This is what you clear when you run dbcc freeproccache. Unfortunately, that's an admin-privileged command and it affects all connections.
The root of the problem is that your SQL probably performs differently with a different set of parameters. SQL Server will only store the execution plan of the first execution it sees and the parameters associated with it. So if the arguments on first execution are good for the common case, your app will perform fine. But once in a while, the wrong arguments will get used on first execution and your entire application can perform poorly.
There are a number of ways to optimize your SQL statement to reduce the impact of this, but it's not completely avoidable.
Generate the SQL dynamically - You take the performance hit of generating a query plan on each execution, but this may be worth it if using the wrong execution plan causes your query to never return. I suggest this path, though it is more cumbersome; a rough sketch follows. I found SET STATISTICS TIME ON and SQL Profiler helpful in reducing the plan generation time. The biggest improvement came from using three-part naming (database.schema.table) for the tables.
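A rough sketch of that approach, reusing the table from the hint example below (the OPTION (RECOMPILE) hint is my addition rather than part of the original answer; it makes SQL Server compile a fresh plan on every execution, which is the effect that generating the SQL dynamically is buying you):

using (var conn = new SqlConnection(/* connection string */))
using (var cmd = new SqlCommand(
    "SELECT Col1, Col2 FROM MyDb.MySchema.MyTab WHERE Col1 = @Parameter OPTION (RECOMPILE);",
    conn))
{
    cmd.CommandType = System.Data.CommandType.Text;
    // 'value' and the NVARCHAR(50) length are placeholders
    cmd.Parameters.Add("@Parameter", System.Data.SqlDbType.NVarChar, 50).Value = value;
    conn.Open();
    using (var r = cmd.ExecuteReader())
    {
        while (r.Read()) { /* read data here */ }
    }
}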
Specify a "good set" of initial parameters for your query with query hints.
SELECT Col1, Col2
FROM MyDb.MySchema.MyTab
WHERE Col1 = @Parameter
OPTION (OPTIMIZE FOR (@Parameter = 'value'));
This link describes the parameter sniffing problem fairly well. It was a bigger problem in SQL Server 2005; later versions do a better job of avoiding it.
I have a console batch application which includes a process that uses SqlDataAdapter.Fill(DataTable) to perform a simple SELECT on a table.
private DataTable getMyTable(string conStr)
{
    DataTable tb = new DataTable();
    StringBuilder bSql = new StringBuilder();
    bSql.AppendLine("SELECT * FROM MyDB.dbo.MyTable");
    bSql.AppendLine("WHERE LEN(IdString) > 0");
    try
    {
        string connStr = ConfigurationManager.ConnectionStrings[conStr].ConnectionString;
        using (SqlConnection conn = new SqlConnection(connStr))
        {
            conn.Open();
            using (SqlDataAdapter adpt = new SqlDataAdapter(bSql.ToString(), conn))
            {
                adpt.Fill(tb);
            }
        }
        return tb;
    }
    catch (SqlException)
    {
        throw; // rethrow without resetting the stack trace
    }
    catch (Exception)
    {
        throw;
    }
}
This method is executed synchronously, and ran successfully in several test environments over many months of testing, whether started from the command line or under control of an AutoSys job.
When moved into production, however, the process hung up -- at the Fill method as nearly as we can tell. Worse, instead of timing out, it apparently started spawning new request threads, and after a couple hours, had consumed more than 5 GB of memory on the application server. This affected other active applications, making me very unpopular. There was no exception thrown.
The Connection String is about as plain-vanilla as they come.
"data source=SERVER\INSTANCE;initial catalog=MyDB;integrated security=True;"
Apologies if I use the wrong terms regarding what the SQL DBA reported below, but when we had a trace put on the SQL Server, it showed the Application ID (under which the AutoSys job was running) being accepted as a valid login. The server then appeared to process the SELECT query. However, it never returned a response. Instead, it went into an "awaiting command" status. The request thread appeared to remain open for a few minutes, then disappeared.
The DBA said there was no sign of a deadlock, but that he would need to monitor in real time to determine whether there was blocking.
This only occurs in the production environment; in test environments, the SQL Servers always responded in under a second.
The AutoSys Application ID is not a new one -- it's been used for several years with other SQL Servers and had no issues. The DBA even ran the SELECT query manually on the production SQL server logged in as that ID, and it responded normally.
We've been unable to reproduce the problem in any non-production environment, and hesitate to run it in production without a server admin standing by to kill the process. Our security requirements limit my access to view server logs and processes, and I usually have to engage another specialist to look at them for me.
We need to solve this problem sooner or later. The amount of data we're looking at is currently only a few rows, but will increase over the next few months. From what's happening, my best guess is that it involves communication and/or security between the application server and the SQL server.
Any additional ideas or items to investigate are welcome. Thanks everyone.
This may be tied to permissions. SQL Server does some odd things instead of giving a proper error message sometimes.
My suggestion, and this might improve performance anyway, is to write a stored procedure on the server side that executes the select, and call the stored procedure. That way, the DBA can ensure you have proper access to the stored procedure without allowing direct access to the table (if for some reason that's being blocked), and you may see a slight performance boost as well.
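A minimal sketch of the server side (procedure and account names are hypothetical; the SELECT is the one from the question):

CREATE PROCEDURE dbo.GetMyTableRows
AS
BEGIN
    SET NOCOUNT ON;
    SELECT * FROM MyDB.dbo.MyTable
    WHERE LEN(IdString) > 0;
END
GO
-- Grant the AutoSys account rights on the proc only; with ownership
-- chaining it needs no direct SELECT permission on the table.
GRANT EXECUTE ON dbo.GetMyTableRows TO [DOMAIN\AutoSysAppId];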
Though it may be caused by some strange permissions/ADO.NET issue as mentioned by @user1895086, I'd nonetheless recommend rechecking a few things one more time:
Ensure that the query run manually by the DBA and the one executed by your app are the same - either hardcode it or at least log it just before running. It is better to be safe than sorry.
Try to select only a few rows - it is always a good idea not to select the entire table if you can avoid it, and in our case a SELECT TOP 1 (or 100) query may not exhibit such problems. Perhaps there is just much more data than you think and ADO.NET just dutifully tries to load all those rows. Or perhaps not.
Try a SqlDataReader to be sure that SqlDataAdapter does not cause any issues - yes, SqlDataAdapter uses a DataReader internally, but we would at least exclude those additional operations from the list of suspects (a minimal sketch follows this list).
Try to get your hands on a dump of those 5 GB of memory - analyzing memory dumps is not a trivial task, but it shouldn't be too difficult to see what is eating those hefty chunks of memory, because I somehow doubt that ADO.NET will just spawn a lot of additional objects for no reason.
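Here is that sketch, combining the second and third suggestions (the TOP 100 cap and the 30-second timeout are my assumptions; CommandTimeout makes a hang surface as a SqlException instead of waiting forever):

private DataTable getMyTableTop(string conStr)
{
    var tb = new DataTable();
    string connStr = ConfigurationManager.ConnectionStrings[conStr].ConnectionString;
    using (var conn = new SqlConnection(connStr))
    using (var cmd = new SqlCommand(
        "SELECT TOP 100 * FROM MyDB.dbo.MyTable WHERE LEN(IdString) > 0", conn))
    {
        cmd.CommandTimeout = 30; // seconds; fail fast instead of hanging
        conn.Open();
        using (var rdr = cmd.ExecuteReader())
        {
            tb.Load(rdr); // fill the DataTable without SqlDataAdapter
        }
    }
    return tb;
}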
Is there any benefit to explicitly using the StoredProcedure CommandType as opposed to just using a Text Command? In other words, is
cmd = new SqlCommand("EXEC StoredProc(#p1, #p2)");
cmd.CommandType = CommandType.Text;
cmd.Parameters.Add("#p1", 1);
cmd.Parameters.Add("#p2", 2);
any worse than
cmd = new SqlCommand("StoredProc");
cmd.CommandType = CommandType.StoredProcedure;
cmd.Parameters.Add("#p1", 1);
cmd.Parameters.Add("#p2", 2);
EDIT: Fixed bad copy-paste job (again). Also, the whole point of the question is for a data access class. I'd much rather be able to pass the stored proc name and parameters in one line, as opposed to extra lines for each parameter.
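For what it's worth, the one-line goal is achievable either way with a small helper; a hypothetical sketch (not from the question):

public static int ExecProc(string connStr, string procName, params SqlParameter[] ps)
{
    using (var conn = new SqlConnection(connStr))
    using (var cmd = new SqlCommand(procName, conn) { CommandType = CommandType.StoredProcedure })
    {
        cmd.Parameters.AddRange(ps);
        conn.Open();
        return cmd.ExecuteNonQuery();
    }
}

// Usage:
// ExecProc(connStr, "StoredProc", new SqlParameter("@p1", 1), new SqlParameter("@p2", 2));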
One difference is how message pumping happens.
Where I used to work we had a number of batch processes that ran over night. Many of them simply involved running a stored procedure. We used to schedule these using sql server jobs, but moved away from it to instead call the procedures from a .Net program. This allowed us to keep all our scheduled tasks in one place, even the ones that had nothing to do with Sql Server.
It also allowed us to build better logging functionality into the .NET program that calls the procedures, so that the logging from all of the overnight processes was consistent. The stored procedures would use the SQL PRINT and RAISERROR statements, and the .NET program would receive and log those. What we learned was that CommandType.StoredProcedure would always buffer these messages into batches of about 50. The .NET code wouldn't see any log events until the procedure finished or flushed the buffer, no matter what options you set on the connection or what you did in your SQL. CommandType.Text fixed this for us.
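For reference, the usual pattern for receiving those messages in .NET is the connection's InfoMessage event; this sketch is mine rather than the poster's code, and the proc name is hypothetical:

using (var conn = new SqlConnection(connStr))
using (var cmd = new SqlCommand("EXEC dbo.NightlyBatch", conn) { CommandType = CommandType.Text })
{
    // PRINT and low-severity RAISERROR output arrives here
    conn.InfoMessage += (sender, e) => Console.WriteLine("[sql] " + e.Message);
    conn.Open();
    cmd.ExecuteNonQuery();
}

On the server side, RAISERROR('msg', 0, 1) WITH NOWAIT is the usual trick for pushing a message out immediately, though the poster's experience suggests CommandType.Text was still needed for the client to see the messages promptly.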
As a side issue, I'd use explicit types with your query parameters; letting .NET try to infer your parameter types can cause issues in some situations. For example:
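// Explicitly typed parameters (sketch): the provider sends exactly INT and
// NVARCHAR(50), instead of deriving a type (and a length) from each value.
cmd.Parameters.Add("@p1", SqlDbType.Int).Value = 1;
cmd.Parameters.Add("@name", SqlDbType.NVarChar, 50).Value = "abc";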
It's cleaner.
You're calling a stored procedure, so why not just use CommandType.StoredProcedure?
We are using LINQ to connect to the database, and in one situation we need to execute a big query. I can execute that query directly in SQL Server, but when I try to execute the same query through ASP.NET it shows a timeout error. Can you help me?
Also, we are using page methods (web methods) to contact the SQL Server.
Have a look at the answer to this question: Linq-to-SQL Timeout
You can set the command timeout on your DataContext object (https://msdn.microsoft.com/library/system.data.linq.datacontext.commandtimeout%28v=vs.110%29.aspx).
Example (from the linked answer):
using (MainContext db = new MainContext())
{
    db.CommandTimeout = 3 * 60; // 3 minutes (the value is in seconds)
    // ... run the query here; it now uses the longer timeout
}
You need to increase the CommandTimeout, not the ConnectionTimeout. The ConnectionTimeout (mentioned in another answer) is the amount of time the app allows for connecting to the DB, not for running a command.
You probably also want to look into improving the performance of your SQL query by adding indexes, etc. You could use SQL Profiler to catch the SQL statement that LINQ to SQL generated, then grab that query and look at its execution plan in SSMS to see where it spends most of its time. That is generally a good place to start.
I am using old-school ADO.NET with C#, so there is a lot of this kind of code. Is it better to make one function per query and open and close the db each time, or to run multiple queries with the same connection object? Below is just one query, for example purposes only.
using (SqlConnection connection = new SqlConnection(ConfigurationManager.ConnectionStrings["DBConnectMain"].ConnectionString))
{
    // Add user to database, so they can't vote multiple times
    string sql = "insert into PollRespondents (PollId, MemberId) values (@PollId, @MemberId)";
    SqlCommand sqlCmd = new SqlCommand(sql, connection);
    sqlCmd.Parameters.Add("@PollId", SqlDbType.Int);
    sqlCmd.Parameters["@PollId"].Value = PollId;
    sqlCmd.Parameters.Add("@MemberId", SqlDbType.Int);
    sqlCmd.Parameters["@MemberId"].Value = Session["MemberId"];
    try
    {
        connection.Open();
        int rowsAffected = sqlCmd.ExecuteNonQuery();
    }
    catch (Exception ex)
    {
        //Console.WriteLine(ex.Message);
    }
}
Well, you could measure; but as long as you are using the connections (so they are disposed even if you get an exception), and have pooling enabled (for SQL server it is enabled by default) it won't matter hugely; closing (or disposing) just returns the underlying connection to the pool. Both approaches work. Sorry, that doesn't help much ;p
Just don't keep an open connection while you do other lengthy non-db work. Close it and re-open it; you may actually get the same underlying connection back, but somebody else (another thread) might have made use of it while you weren't.
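In sketch form (my example, reusing the insert from the question): with pooling on, "open late, close early" costs almost nothing, because Close/Dispose just returns the connection to the pool:

// Each operation opens "its own" connection; the pool makes this cheap.
void SaveVote(string connStr, int pollId, int memberId)
{
    using (var conn = new SqlConnection(connStr))
    using (var cmd = new SqlCommand(
        "insert into PollRespondents (PollId, MemberId) values (@PollId, @MemberId)", conn))
    {
        cmd.Parameters.Add("@PollId", SqlDbType.Int).Value = pollId;
        cmd.Parameters.Add("@MemberId", SqlDbType.Int).Value = memberId;
        conn.Open();  // usually grabs an already-open pooled connection
        cmd.ExecuteNonQuery();
    } // Dispose returns the underlying connection to the pool
}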
For most cases, opening and closing a connection per query is the way to go (as Chris Lively pointed out). However, there are some cases where you'll run into performance bottlenecks with this solution.
For example, when dealing with very large volumes of relatively quick to execute queries that are dependent on previous results, I might suggest executing multiple queries in a single connection. You might encounter this when doing batch processing of data, or data massaging for reporting purposes.
Always be sure to use the 'using' wrapper to avoid memory leaks, though, regardless of which pattern you follow.
If the methods are structured such that a single command is executed within a single method, then Yes: instantiate and dispose of the connection for each command.
If the methods are structured such that you have multiple commands executed in the same block of code, then the outer block needs to be the using clause for the connection.
ADO.NET is very good about connection pooling, so instantiating and disposing of the connection object is going to be extremely fast and really won't impact performance.
As an example, we have a few pages that will execute up to 50 queries in order to compose the page. Because there is branching code to determine the queries to run, we have each of them wrapped with their own using (connection...) clauses.
We once ripped those out, grabbed one connection object, and passed it to the individual methods. This had exactly zero performance improvement while complicating the hell out of the code with all the exception clauses everywhere to ensure the connection was properly disposed at the end. At the end of the test, we rolled back the code to how it was before. Much cleaner to know exactly what was going on and when a connection was being used.
Well, as always, it depends. If you have five database calls to make within the same method call, you should probably use a single connection.
However, holding onto connection while nothing is happening isn't usually advised from a scalability standpoint.
ADO.NET is old school now? Wow, you just made me feel old. To me Rogue Wave ODBC using Borland C++ on Windows 3.1 is old school.
To answer: in general you want to understand how your data drivers work. Understand such concepts as connection pooling, and learn to profile the transaction costs associated with connecting/disconnecting and executing queries. Then take that knowledge and apply it to your situation.
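As a hypothetical starting point for that kind of profiling (the iteration count and connection string are assumptions), measure what open/query/close actually costs once the pool is warm:

// Rough micro-benchmark: per-iteration cost of open + trivial query + close.
var sw = System.Diagnostics.Stopwatch.StartNew();
for (int i = 0; i < 1000; i++)
{
    using (var conn = new SqlConnection(connStr))
    using (var cmd = new SqlCommand("SELECT 1", conn))
    {
        conn.Open(); // pooled after the first iteration
        cmd.ExecuteScalar();
    }
}
sw.Stop();
Console.WriteLine(sw.Elapsed.TotalMilliseconds / 1000.0 + " ms per open+query+close");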
I have a webpage that takes 10 minutes to run one query against a database, but the same query returns in less than a second when run from SQL Server Management Studio.
The webpage is just firing SQL at the database, which executes a stored procedure, which in turn performs a pretty simple select over four tables. Again, the code is basic ADO.NET: setting the CommandText on a SqlCommand and then calling ExecuteReader to get the data.
The webpage normally works quickly, but when it slows down the only way to speed it up again is to defragment the indexes on the tables being queried (different ones at different times), which doesn't seem to make sense when the same query executes so quickly manually.
I have had a look at this question but it doesn't apply as the webpage is literally just firing text at the database.
Does anyone have any good ideas why this is going slow one way and not the other?
Thanks
I would suspect parameter sniffing.
The cached execution plan used for your application's connection probably won't be usable by your SSMS connection due to different set options so it will generate a new different plan.
You can retrieve the cached plans for the stored procedure by using the query below. Then compare to see if they are different (e.g. is the slow one doing index seeks and bookmark lookups at a place where the other one does a scan?)
USE YourDatabase;

SELECT *
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) qp
CROSS APPLY sys.dm_exec_plan_attributes(cp.plan_handle) epa
WHERE st.objectid = OBJECT_ID('YourProcName')
  AND epa.attribute = 'set_options';
Is there any difference between the command text of the query in the app and the query you are executing manually? Since you said that reindexing helps performance (which also updates statistics), it sounds like it may be getting stuck on a bad execution plan.
You might want to run a SQL trace and capture the Showplan XML event to see what the execution plan looks like, and also capture the statement-completed event (though this can slow the server down if a lot of statements are coming through the system, so be careful) to be sure the statement sent to SQL Server is the same one you are running manually.