Reduce number of database calls - c#

I have a stored procedure which accepts five parameters and performs an update on a table:
Update Table
Set field = @Field
Where col1 = @Para1 and Col2 = @Para2 and Col3 = @Para3 and col4 = @Para4
From the user's perspective, you can select multiple values for each of the condition parameters.
For example, you can select 2 options which need to match Col1 in the database table (and which need to be passed as @Para1).
So I am storing all the selected values in separate lists.
At the moment I am using nested foreach loops to do the update:
foreach (var g in _list1)
{
    foreach (var o in _list2)
    {
        foreach (var l in _list3)
        {
            foreach (var a in _list4)
            {
                UpdateData(g, o, l, a);
            }
        }
    }
}
I am sure this is not a good way of doing this, since it results in a large number of database calls. Is there any way I can avoid the loops and achieve the same result with a minimum number of db calls?
Update
I am looking for an approach other than Table-Valued Parameters.

You can bring the query to this form:
Update Table Set field = @Field Where col1 IN {} and Col2 IN {} and Col3 IN {} and col4 IN {}
and pass parameters this way: https://stackoverflow.com/a/337792/580053
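For illustration, here is a minimal sketch of that idea (the method name, table/column names, and the assumption that every list contains at least one value are mine, not from the question):

// Sketch: build one parameterized IN list per column and send a single UPDATE.
// Assumes each list is non-empty (an empty IN () clause is invalid SQL).
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;

static void UpdateWithInLists(string connectionString, string field,
    IList<int> list1, IList<int> list2, IList<int> list3, IList<int> list4)
{
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = conn.CreateCommand())
    {
        // Local helper: adds @a0, @a1, ... parameters and returns "@a0, @a1, ..." for the IN list.
        string AddInClause(string prefix, IList<int> values) =>
            string.Join(", ", values.Select((v, i) =>
            {
                var name = "@" + prefix + i;
                cmd.Parameters.AddWithValue(name, v);
                return name;
            }));

        cmd.Parameters.AddWithValue("@Field", field);
        cmd.CommandText =
            "UPDATE [Table] SET field = @Field" +
            $" WHERE col1 IN ({AddInClause("a", list1)})" +
            $" AND Col2 IN ({AddInClause("b", list2)})" +
            $" AND Col3 IN ({AddInClause("c", list3)})" +
            $" AND col4 IN ({AddInClause("d", list4)})";

        conn.Open();
        cmd.ExecuteNonQuery();   // one round trip instead of nested loops
    }
}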

One possible way would be to use Table-Valued Parameters to pass the multiple values per condition to the stored procedure. This would reduce the loops in your code and should still provide the functionality that you are looking for.
If I am not mistaken they were introduced in SQL Server 2008, so as long as you don't have to support 2005 or earlier they should be fine to use.

Consider using the MS Data Access Application Block from the Enterprise Library for the UpdateDataSet command.
Essentially, you would build a DataTable where each row is a parameter set, and then execute the "batch" of parameter sets against the open connection.
You can, of course, do the same without it by building a string that contains several update commands and executing it against the DB.
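A minimal sketch of that batched-string approach (the UpdateBatch helper, parameter names, and table/column names are assumptions for illustration):

// Sketch: concatenate several parameterized UPDATE statements into one command,
// so the whole batch goes to the server in a single round trip.
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Text;

static void UpdateBatch(string connectionString, string field,
    IEnumerable<(int Para1, int Para2, int Para3, int Para4)> rows)
{
    var sql = new StringBuilder();
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = conn.CreateCommand())
    {
        cmd.Parameters.AddWithValue("@Field", field);
        int i = 0;
        foreach (var r in rows)
        {
            sql.AppendLine(
                $"UPDATE [Table] SET field = @Field " +
                $"WHERE col1 = @p1_{i} AND Col2 = @p2_{i} AND Col3 = @p3_{i} AND col4 = @p4_{i};");
            cmd.Parameters.AddWithValue($"@p1_{i}", r.Para1);
            cmd.Parameters.AddWithValue($"@p2_{i}", r.Para2);
            cmd.Parameters.AddWithValue($"@p3_{i}", r.Para3);
            cmd.Parameters.AddWithValue($"@p4_{i}", r.Para4);
            i++;
        }
        cmd.CommandText = sql.ToString();
        conn.Open();
        cmd.ExecuteNonQuery();
    }
}

Note that SQL Server caps a single command at 2100 parameters, so a very large batch would need to be split into chunks.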

Since table-valued parameters are off limits to you, you may consider an XML-based approach:
Build an XML document containing the four columns that you would like to pass.
Change the signature of your stored procedure to accept a single XML-valued parameter instead of four scalar parameters
Change the code of your stored procedure to perform the updates based on the XML that you get
Call your new stored procedure once with the XML that you constructed in memory using the four nested loops.
This should reduce the number of round-trips, and speed up the overall execution time. Here is a link to an article explaining how inserting many rows can be done at once using XML; your situation is somewhat similar, so you should be able to use the approach outlined in that article.
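As an illustration only (the element names, the dbo.UpdateFromXml proc, and its parameters are assumptions, not the article's code), the client side might look like this:

// Sketch: serialize all parameter combinations to XML and make a single proc call.
// The stored procedure would shred the XML with nodes()/value() and join it
// to the target table in one set-based UPDATE.
using System.Data;
using System.Data.SqlClient;
using System.Linq;
using System.Xml.Linq;

static void UpdateViaXml(string connectionString, string field,
    int[] list1, int[] list2, int[] list3, int[] list4)
{
    var doc = new XElement("rows",
        from g in list1
        from o in list2
        from l in list3
        from a in list4
        select new XElement("row",
            new XAttribute("p1", g), new XAttribute("p2", o),
            new XAttribute("p3", l), new XAttribute("p4", a)));

    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand("dbo.UpdateFromXml", conn))   // hypothetical proc
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.AddWithValue("@Field", field);
        cmd.Parameters.Add("@Rows", SqlDbType.Xml).Value = doc.ToString();
        conn.Open();
        cmd.ExecuteNonQuery();   // one round trip for all combinations
    }
}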

So long as you have the freedom to update the structure of the stored procedure, the method I would suggest for this would be to use a table-valued parameter instead of the multiple parameters.
A good example which goes into both server and database code for this can be found at: http://www.codeproject.com/Articles/39161/C-and-Table-Value-Parameters
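A minimal sketch of that approach, assuming a user-defined table type named dbo.ConditionList and a proc dbo.UpdateWithTvp (both hypothetical names, not the article's code):

// Sketch: pass all selected value combinations in one table-valued parameter.
// Server side (one-time setup, assumed names):
//   CREATE TYPE dbo.ConditionList AS TABLE (Para1 int, Para2 int, Para3 int, Para4 int);
//   CREATE PROCEDURE dbo.UpdateWithTvp @Field varchar(50), @Conditions dbo.ConditionList READONLY AS ...
using System.Data;
using System.Data.SqlClient;

static void UpdateWithTvp(string connectionString, string field, DataTable conditions)
{
    // 'conditions' has columns Para1..Para4, one row per combination.
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand("dbo.UpdateWithTvp", conn))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.AddWithValue("@Field", field);
        var tvp = cmd.Parameters.AddWithValue("@Conditions", conditions);
        tvp.SqlDbType = SqlDbType.Structured;
        tvp.TypeName = "dbo.ConditionList";
        conn.Open();
        cmd.ExecuteNonQuery();
    }
}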

Why are you using a stored procedure for this? In my opinion you shouldn't use SPs for simple CRUD operations. The real power of stored procedures is for heavy calculations and things like that.
Table-valued parameters would be my choice, but since you are looking for another approach, why don't you go the simpler way and just dynamically construct a bulk/mass update query in your server-side code and run it against the DB?

Related

Trying to get an UPSERT working on a set of data using dapper

I'm trying to get an upsert working on a collection of IDs (not the primary key - that's an identity int column) on a table using dapper. This doesn't need to be a dapper function; I'm just including that in case it helps.
I'm wondering if it's possible (either through straight SQL or using a dapper function) to run an upsert on a collection of IDs (specifically an IEnumerable of ints).
I really only need a simple example to get me started, so an example would be:
I have three objects of type Foo:
{ "ExternalID" : 1010101, "DescriptorString" : "I am a descriptive string", "OtherStuff" : "This is some other stuff" }
{ "ExternalID" : 1010122, "DescriptorString" : "I am a descriptive string123", "OtherStuff" : "This is some other stuff123" }
{ "ExternalID" : 1033333, "DescriptorString" : "I am a descriptive string555", "OtherStuff" : "This is some other stuff555" }
I have a table called Bar, with those same column names (where only 1033333 exists):
Table Bar
ID | ExternalID | DescriptorString               | OtherStuff
1  | 1033333    | "I am a descriptive string555" | "This is some other stuff555"
Well, since you said that this didn't need to be dapper-based ;-), I will say that the fastest and cleanest way to get this data upserted is to use Table-Valued Parameters (TVPs) which were introduced in SQL Server 2008. You need to create a User-Defined Table Type (one time) to define the structure, and then you can use it in either ad hoc queries or pass to a stored procedure. But this way you don't need to export to a file just to import, nor do you need to convert it to XML just to convert it back to a table.
Rather than copy/paste a large code block, I have noted three links below where I have posted the code to do this (all here on S.O.). The first two links are the full code (SQL and C#) to accomplish this (the 2nd link being the most analogous to what you are trying to do). Each is a slight variation on the theme (which shows the flexibility of using TVPs). The third is another variation but not the full code as it just shows the differences from one of the first two in order to fit that particular situation.
But in all 3 cases, the data is streamed from the app into SQL Server. There is no creating of any additional collection or external file; you use what you currently have and only need to duplicate the values of a single row at a time to be sent over. And on the SQL Server side, it all comes through as a populated Table Variable.
This is far more efficient than taking data you already have in memory, converting it to a file (takes time and disk space) or XML (takes CPU and memory) or a DataTable (for SqlBulkCopy; takes CPU and memory) or something else, only to rely on an external factor such as the filesystem (the files will need to be cleaned up, right?) or need to parse out of XML.
How can I insert 10 million records in the shortest time possible?
Pass Dictionary<string,int> to Stored Procedure T-SQL
Storing a Dictionary<int,string> or KeyValuePair in a database
Now, there are some issues with the MERGE command (see Use Caution with SQL Server's MERGE Statement) that might be a reason to avoid using it. So, I have posted the "upsert" code that I have been using for years to an answer on DBA.StackExchange:
How to avoid using Merge query when upserting multiple data using xml parameter?
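The streaming mechanism described above looks roughly like this (a sketch only; the dbo.FooList type, the dbo.UpsertFoos proc, and the Foo class are assumed names, not the code from those links):

// Sketch: stream an IEnumerable<SqlDataRecord> as a TVP, one row at a time,
// instead of materializing a DataTable or XML document first.
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using Microsoft.SqlServer.Server;

class Foo { public int ExternalID; public string DescriptorString; public string OtherStuff; }

static IEnumerable<SqlDataRecord> ToRecords(IEnumerable<Foo> foos)
{
    var meta = new[]
    {
        new SqlMetaData("ExternalID", SqlDbType.Int),
        new SqlMetaData("DescriptorString", SqlDbType.VarChar, 200),
        new SqlMetaData("OtherStuff", SqlDbType.VarChar, 200)
    };
    foreach (var f in foos)
    {
        var rec = new SqlDataRecord(meta);
        rec.SetInt32(0, f.ExternalID);
        rec.SetString(1, f.DescriptorString);
        rec.SetString(2, f.OtherStuff);
        yield return rec;          // sent to SQL Server as it is enumerated
    }
}

static void UpsertFoos(string connectionString, IEnumerable<Foo> foos)
{
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand("dbo.UpsertFoos", conn))   // hypothetical proc taking dbo.FooList READONLY
    {
        cmd.CommandType = CommandType.StoredProcedure;
        var p = cmd.Parameters.Add("@Foos", SqlDbType.Structured);
        p.TypeName = "dbo.FooList";
        p.Value = ToRecords(foos);
        conn.Open();
        cmd.ExecuteNonQuery();
    }
}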

Best Practice: writing data from DGV to SQL Server table

I have an unbound DataGridView with one visible field.
The user can copy data, from the clipboard, into this DGV in a similar manner to this article
Now I'd like to move this data into a table on SQL Server.
It's been suggested to me to do the following:
Create a stored procedure that takes a single parameter and writes that input to a table
Loop through the items in the DGV feeding each into the stored procedure and therefore writing them to the table
Can I not just grab all the items in the DGV and insert them into the target table at once, without having to loop?
Or is the loop method (with up to 2,000 iterations) the best practice in such a situation? (Or is there no particular best practice?)
If you are looking at using a stored proc, then you can follow some of the examples of passing arrays of values proposed by Erland Sommarskog.
Take a look at:
http://www.sommarskog.se/arrays-in-sql-2008.html <- For SS 2008 based around Table Valued Parameters.
http://www.sommarskog.se/arrays-in-sql-2005.html <- Options for SS 2005. I've used the XML method quite a few times and found it quite useful.
If you are using SS 2008, then you could possibly investigate his example of using the datatable as a source.
Not sure if these are considered best practice or not, but it is certainly food for thought.

How do I use enums in TSQL without hard-coding magic numbers all over my SQL scripts/procs?

We have enums in our C# code:
public enum JobStatus
{
    Ready = 0,
    Running = 1,
    Cancelling = 2,
}
These values are also stored in database fields, and we have lots of TSQL (mostly stored procs, as well as some batches and SSIS) that also processes the data:
SELECT TOP 1 @JobSID = JobSID
FROM Job
WHERE Status = 0 /* JobStatus.Ready */
ORDER BY SubmitDate ASC
CREATE TABLE ImportCrossEffect(
/* lots deleted */
Source tinyint
DEFAULT 1 NOT NULL -- 0: Unknown (default), 1:Imported, 2:Keyed
)
How do I avoid hard coding the “magic numbers” in the TSQL?
How do I remove the risk of the enumerations not matching on the C# and TSQL sides?
(I have included the C# tag, as I would like a solution that "single sources" the definitions of the enums on both the C# and TSQL sides.)
Updates:
We don't have tables in the database with the enum names in them; the values are just stored in tinyint columns.
I was hoping for something like a SQL pre-processor that would "expand" all the enums to their "magic values".
You can always pass the value from the enumeration into the stored proc/command that you are trying to execute. This way, you never have to worry about the enumerations in the database.
If you want to store the enumerations in the database, then I suggest you create a view (possibly titled with your enumeration), like so:
create view JobStatus as
select 0 as Ready, 1 as Running, 2 as Cancelling
You can then access/join on the view if you need to.
Note, the query optimizer treats any reference to the above as a constant scan/scalar operation, not a table scan, so you aren't incurring the reads that would occur if you were accessing an actual table.
In one project we defined an attribute, applied to every enum member, that stored the table and value expected in the database, and the unit tests verified the link. Messy, though.
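A rough sketch of that idea (the attribute name and the lookup table name are assumptions; the original project's code was not shown):

// Sketch: each enum member declares where its value lives in the database,
// and a test cross-checks the enum against that table.
using System;

[AttributeUsage(AttributeTargets.Field)]
class DbEnumValueAttribute : Attribute
{
    public string Table { get; }   // lookup table holding the canonical values
    public int Value { get; }      // value expected in that table
    public DbEnumValueAttribute(string table, int value) { Table = table; Value = value; }
}

public enum JobStatus
{
    [DbEnumValue("JobStatusLookup", 0)] Ready = 0,
    [DbEnumValue("JobStatusLookup", 1)] Running = 1,
    [DbEnumValue("JobStatusLookup", 2)] Cancelling = 2,
}

// A unit test would reflect over typeof(JobStatus), read each DbEnumValueAttribute,
// and assert that the (name, value) pair exists in the named lookup table.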
I like to use table valued functions for this. They work well with encapsulating and packaging variables together, almost like named enums from other languages.
Let's say we have a couple of variables for different order statuses:
New = 100
In progress = 250
Completed = 500
What I would do is create a function that returns my variables
CREATE FUNCTION dbo.OrderStatus()
RETURNS @T_def TABLE
(
    NEW int,
    IN_PROGRESS int,
    COMPLETED int
)
AS
BEGIN
    INSERT INTO @T_def (NEW, IN_PROGRESS, COMPLETED)
    VALUES (100, 250, 500);
    RETURN
END
I can now use my variables in my queries, and even IntelliSense will hint at the variables:
SELECT
OrderStatus.NEW
FROM dbo.OrderStatus() "OrderStatus";
Or as part of another query using CROSS APPLY
SELECT
OrderStatus.NEW
,*
FROM CUSTOMER_ORDERS
CROSS APPLY dbo.OrderStatus() "OrderStatus";
In pure TSQL the only similar thing I can think of is a scalar UDF that returns the value wanted.
In LINQ to SQL you can map members as C#/.NET enums, and it handles the translation for you.
But to be honest; in most pure TSQL I'd mainly just use the literals.
Perhaps you could implement managed code on the SQL server instead of using normal TSQL? Not 100% sure if this would work, but it might be an option to explore.
Possible Solution here...

What's a good alternative to firing a stored procedure 368 times to update the database?

I'm working on a .NET component that gets a set of data from the database, performs some business logic on that set of data, and then updates single records in the database via a stored procedure that looks something like spUpdateOrderDetailDiscountedItem.
For small sets of data, this isn't a problem, but when I had a very large set of data that required an iteration of 368 stored proc calls to update the records in the database, I realized I had a problem. A senior dev looked at my stored proc code and said it looked fine, but now I'd like to explore a better method for sending "batch" data to the database.
What options do I have for updating the database in batch? Is this possible with stored procs? What other options do I have?
I won't have the option of installing a full-fledged ORM, but any advice is appreciated.
Additional Background Info:
Our current data access model was built 5 years ago and all calls to the db currently get executed via modular/static functions with names like ExecQuery and GetDataTable. I'm not certain that I'm required to stay within that model, but I'd have to provide a very good justification for going outside of our current DAL to get to the DB.
Also worth noting, I'm fairly new when it comes to CRUD operations and the database. I much prefer to play/work in the .NET side of code, but the data has to be stored somewhere, right?
Stored Proc contents:
ALTER PROCEDURE [dbo].[spUpdateOrderDetailDiscountedItem]
    -- Add the parameters for the stored procedure here
    @OrderDetailID decimal = 0,
    @Discount money = 0,
    @ExtPrice money = 0,
    @LineDiscountTypeID int = 0,
    @OrdersID decimal = 0,
    @QuantityDiscounted money = 0,
    @UpdateOrderHeader int = 0,
    @PromoCode varchar(6) = '',
    @TotalDiscount money = 0
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    -- Insert statements for procedure here
    Update OrderDetail
    Set Discount = @Discount, ExtPrice = @ExtPrice, LineDiscountTypeID = @LineDiscountTypeID, LineDiscountPercent = @QuantityDiscounted
    From OrderDetail with (nolock)
    Where OrderDetailID = @OrderDetailID

    if @UpdateOrderHeader = -1
    Begin
        -- This code should get run the last time this query is executed, but only then.
        exec spUpdateOrdersHeaderForSkuGroupSourceCode @OrdersID, 7, 0, @PromoCode, @TotalDiscount
    End
If you are using SQL 2008, then you can use a table-valued parameter to push all of the updates in one sproc call.
Update:
Incidentally, we are using this in combination with the merge statement. That way sql server takes care of figuring out if we are inserting new records or updating existing ones. This mechanism is used at several major locations in our web app and handles hundreds of changes at a time. During regular load we will see this proc get called around 50 times a second and it is MUCH faster than any other way we've found... and certainly a LOT cheaper than buying bigger DB servers.
An easy alternative way I've seen in use is to build a SQL statement consisting of EXEC calls to the sproc, with the parameters inlined in the string. Not sure if this is advised or not, but from the .NET perspective you are only populating one SqlCommand and calling ExecuteNonQuery once...
Note if you choose this then please, please use the StringBuilder! :-)
Update: I much prefer Chris Lively's answer, didn't know about table-valued parameters until now... unfortunately the OP is using 2005.
You can send the full set of data as XML input to the stored procedure. Then you can perform set operations to modify the database. Set-based will beat RBAR on performance almost every single time.
If you are using a version of SQL Server prior to 2008, you can move your code entirely into the stored procedure itself.
There are good and "bad" things about this.
Good
No need to pull the data across a network wire.
Faster if your logic is set based
Scales up
Bad
If you have rules against any logic in the database, this would break your design.
If the logic cannot be set based then you might end up with a different set of performance problems
If you have outside dependencies, this might increase difficulty.
Without details on exactly what operations you are performing on the data it's hard to give a solid recommendation.
UPDATE
Ben asked what I meant in one of my comments about the CLR and SQL Server. Read Using CLR Integration in SQL Server 2005. The basic idea is that you can write .Net code to do your data manipulation and have that code live inside the SQL server itself. This saves you from having to read all of the data across the network and send updates back that way.
The code is callable by your existing proc's and gives you the entire power of .net so that you don't have to do things like cursors. The sql will stay set based while the .net code can perform operations on individual records.
Incidentally, this is how things like hierarchyid were implemented in SQL 2008.
The only real downside is that some DBA's don't like to introduce developer code like this into the database server. So depending on your environment, this may not be an option. However, if it is, then it is a very powerful way to take care of your problem while leaving the data and processing within your database server.
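As a rough illustration of the SQLCLR idea (the class, procedure name, and placeholder statement are invented; assembly deployment via CREATE ASSEMBLY is not shown):

// Sketch: a CLR stored procedure that runs inside SQL Server, reading and
// updating rows over the in-process "context connection" so no data crosses the network.
using System.Data.SqlClient;
using Microsoft.SqlServer.Server;

public class OrderDiscountProcs
{
    [SqlProcedure]
    public static void ApplyDiscounts()
    {
        using (var conn = new SqlConnection("context connection=true"))
        {
            conn.Open();
            // The business logic that previously ran in the .NET component would go here;
            // the point is that reads and writes stay inside the server process
            // instead of 368 round trips from the app.
            using (var cmd = new SqlCommand(
                "UPDATE OrderDetail SET Discount = Discount WHERE 1 = 0", conn))   // placeholder statement
            {
                cmd.ExecuteNonQuery();
            }
        }
    }
}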
You could create a batched statement with 368 calls to your proc; then at least you will not have 368 round trips. Pseudo-code:
var lotsOfCommands = "exec spUpdateOrderDetailDiscountedItem 1; exec spUpdateOrderDetailDiscountedItem 2; exec spUpdateOrderDetailDiscountedItem ... 368";
var command = new SqlCommand(lotsOfCommands, connection);
command.CommandType = CommandType.Text;
command.ExecuteNonQuery();
I had issues when trying to do the same thing (via inserts, updates, whatever). While using an OleDbCommand with parameters, it took a bunch of time to constantly re-create the object and parameters each time I called it. So, I made a property on my object for handling such calls and also added the appropriate "parameters" to the function. Then, when I needed to actually call/execute it, I would loop through each parameter in the object, set it to whatever I needed it to be, then execute it. This created a SIGNIFICANT performance improvement... Pseudo-code of my operation:
protected OleDbCommand oSQLInsert = new OleDbCommand();
// the "?" are place-holders for parameters... can be named parameters,
// just for visual purposes
oSQLInsert.CommandText = "insert into MyTable ( fld1, fld2, fld3 ) values ( ?, ?, ? )";
// Now, add the parameters
OleDbParameter NewParm = new OleDbParameter("parmFld1", 0);
oSQLInsert.Parameters.Add( NewParm );
NewParm = new OleDbParameter("parmFld2", "something" );
oSQLInsert.Parameters.Add( NewParm );
NewParm = new OleDbParameter("parmFld3", 0);
oSQLInsert.Parameters.Add( NewParm );
Now, the SQL command and place-holders for the call are all ready to go... Then, when I'm ready to actually call it, I would do something like:
oSQLInsert.Parameters[0].Value = 123;
oSQLInsert.Parameters[1].Value = "New Value";
oSQLInsert.Parameters[2].Value = 3;
Then, just execute it. With hundreds of repeated calls, the time spent re-creating your commands over and over would otherwise kill you...
Good luck.
Is this a one-time action (like "just import those 368 new customers once") or do you regularly have to do 368 sproc calls?
If it's a one-time action, just go with the 368 calls.
(if the sproc does much more than just updates and is likely to drag down the performance, run it in the evening or at night or whenever no one's working).
IMO, premature optimization of database calls for one-time actions is not worth the time you spend with it.
Bulk CSV Import
(1) Build the data output via a StringBuilder as CSV, then do a bulk CSV import:
http://msdn.microsoft.com/en-us/library/ms188365.aspx
Table-valued parameters would be best, but since you're on SQL 05, you can use the SqlBulkCopy class to insert batches of records. In my experience, this is very fast.
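A minimal SqlBulkCopy sketch (the staging table name and columns are assumptions); for updates you would typically bulk-copy into a staging table and then run one set-based UPDATE joining it to the real table:

// Sketch: bulk-copy a DataTable of changed rows into a staging table in one shot,
// then apply the staged changes with a single set-based statement.
using System.Data;
using System.Data.SqlClient;

static void BulkCopyChanges(string connectionString, DataTable changes)
{
    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();
        using (var bulk = new SqlBulkCopy(conn))
        {
            bulk.DestinationTableName = "dbo.OrderDetailChanges_Staging";   // hypothetical staging table
            bulk.BatchSize = 1000;
            bulk.WriteToServer(changes);
        }
        using (var apply = new SqlCommand(
            "UPDATE od SET od.Discount = s.Discount " +
            "FROM OrderDetail od JOIN dbo.OrderDetailChanges_Staging s ON s.OrderDetailID = od.OrderDetailID", conn))
        {
            apply.ExecuteNonQuery();
        }
    }
}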

Getting rows from a SQL table matching a dictionary using LINQ

I have the following code snippet:
var matchingAuthors = from authors in DB.AuthorTable
                      where m_authors.Keys.Contains(authors.AuthorId)
                      select authors;

foreach (AuthorTableEntry author in matchingAuthors)
{
    ....
}
where m_authors is a Dictionary containing the "Author" entries, and DB.AuthorTable is a SQL table. When the size of m_authors goes beyond a certain value (somewhere around the 3000 entries mark), I get an exception:
System.Data.SqlClient.SqlException: The incoming tabular data stream (TDS) remote procedure call (RPC) protocol stream is incorrect.
Too many parameters were provided in this RPC request. The maximum is 2100.
Is there any way I can get around this and work with a larger size dictionary? Alternatively, is there a better way to get all rows in a SQL table where a particular column value for that row matches one of the dictionary entries?
LINQ to SQL translates a local collection's Contains() into a parameterized IN statement:
...
WHERE AuthorId IN (@p0, @p1, @p2, ...)
...
So the error you're seeing is that SQL ran out of parameters to use for your keys. I can think of two options:
Select the whole table and filter using LINQ to Objects.
Generate an expression tree from your keys: see Option 2 here.
Another option is to consider how you populate m_authors and whether you can include that in the query as a query element itself so it turns into a server-side join/subselect.
Depending on your requirements, you could break apart the work into multiple smaller chunks (first thousand, second thousand, etc.) This runs certain risks if your data is read-write and changes frequently, but it might give you a bit better scalability beyond pulling back thousands of rows in one big gulp. And, if your data can be worked on in part (i.e. without having the entire set in memory), you could send off chunks to be worked on in a separate thread while you are pulling back the next chunk.
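Building on the question's snippet, a sketch of that chunking idea (the chunk size and the per-row ProcessAuthor call are assumptions):

// Sketch: query the keys in chunks so each query stays well under the 2100-parameter limit.
using System.Collections.Generic;
using System.Linq;

const int ChunkSize = 2000;
var allKeys = m_authors.Keys.ToList();

for (int i = 0; i < allKeys.Count; i += ChunkSize)
{
    var chunk = allKeys.Skip(i).Take(ChunkSize).ToList();

    var matchingAuthors = from authors in DB.AuthorTable
                          where chunk.Contains(authors.AuthorId)
                          select authors;

    foreach (AuthorTableEntry author in matchingAuthors)
    {
        ProcessAuthor(author);   // hypothetical per-row work from the original loop
    }
}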
