I am using stored procedures to fetch information from the database. First I fetch all the parent elements and hold them in the array and then using the parent Id I fetch all the related children. Each parent can have 150 children. There are about 100 parent elements. What is the best way to increase the performance of the fetch operation. Currently it takes 13 seconds to retrieve.
Here is the basic algorithm:
while (reader.Read())
{
    Parent p = new Parent();
    // assign properties to the parent from the reader
    p.Children = GetChildrenByParentId(p.Id);
}
You should get all of that data in one SQL select / stored proc (do some sort of join on the child data) and then populate the parent and child objects. Right now you issue one query per parent and pull back 100 * 150 = 15,000 rows one small batch at a time; if you can do it in one request, I would expect a dramatic performance improvement.
As Brian mentioned in a comment, this pattern is known as RBAR: Row By Agonizing Row :)
If you like acronyms, here is more:
https://www.simple-talk.com/sql/t-sql-programming/rbar--row-by-agonizing-row/
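As a rough sketch of the single-round-trip approach (the `Parent`/`Child` classes, the table and column names, and `connectionString` are all assumptions, not the asker's actual schema), you join parents to children in one query and group the rows client-side:

```csharp
using System.Collections.Generic;
using System.Data.SqlClient;

// Hypothetical sketch: one round trip that joins parents to children.
// Table/column names (Parent, Child, ParentId, Name) are assumptions.
var parents = new Dictionary<int, Parent>();
using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand(
    @"SELECT p.Id, p.Name, c.Id AS ChildId, c.Name AS ChildName
      FROM Parent p
      LEFT JOIN Child c ON c.ParentId = p.Id
      ORDER BY p.Id", connection))
{
    connection.Open();
    using (SqlDataReader reader = command.ExecuteReader())
    {
        while (reader.Read())
        {
            int parentId = reader.GetInt32(0);
            if (!parents.TryGetValue(parentId, out Parent p))
            {
                p = new Parent
                {
                    Id = parentId,
                    Name = reader.GetString(1),
                    Children = new List<Child>()
                };
                parents.Add(parentId, p);
            }
            if (!reader.IsDBNull(2)) // parent with no children
            {
                p.Children.Add(new Child
                {
                    Id = reader.GetInt32(2),
                    Name = reader.GetString(3)
                });
            }
        }
    }
}
```

With ~15,000 joined rows this is a single request instead of 101, which is usually where the 13 seconds goes.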
The first and most important step is to measure the performance. Is it SQL Server that is the bottleneck, or the .NET code?
Also, you need to minimize the times you have to go back to the database, so if you can retrieve all of the data you need in a single stored procedure, that would be best.
From your question, it sounds to me like SQL Server is the problem. To test this, run your stored procedure from SQL Query Analyzer and see how long it takes for a known parent id. I bet you just need some indexes added to your underlying tables to make it possible for SQL to get the data faster. If possible, look at the execution plan for the stored procedure. You can find a good article about reading execution plans here.
SQL Server 2008 makes this easy: create a user-defined table type and pass the list of parent IDs in it, OR just reuse the logic you used to get those parent IDs in the first place and join directly to the tables that hold the child data.
To create the table type, you make something like this:
CREATE TYPE [dbo].[Int32List]
AS TABLE (
[ID] int NOT NULL
);
GO
And your stored proc goes something like this:
CREATE PROCEDURE [dbo].[MyStoredProc]
    @ParentIDTable [dbo].[Int32List] READONLY
AS
--logic goes here
GO
And you call that procedure from your C# code like this:
DataTable ParentIDs = new DataTable();
ParentIDs.Columns.Add("ID", typeof(int));
// add one row per parent ID before executing the command
SqlConnection connection = new SqlConnection(yourConnectionInfo);
SqlCommand command = new SqlCommand("MyStoredProc", connection);
command.CommandType = CommandType.StoredProcedure;
command.Parameters.Add("@ParentIDTable", SqlDbType.Structured).Value = ParentIDs;
command.Parameters["@ParentIDTable"].TypeName = "dbo.Int32List";
This way is nice, because it's a great way to effectively pass a list of values to SQL Server and treat it like a table. I use table types all over my applications where I want to pass an array of values to a stored proc. Just remember that the column names in the C# DataTable need to match the column names in the table type you created, and the TypeName property needs to match the table type's name.
With this method, you will only make one call to the DB. When you iterate through the results, make sure to include the ParentID in the select list so you can match each child to the proper parent object.
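As a sketch of that matching step (assuming a `parents` dictionary keyed by ID built beforehand, and hypothetical `ParentID`/`ChildID`/`Name` columns in the proc's select list):

```csharp
// Hypothetical sketch: 'command' is the stored-proc SqlCommand above,
// 'parents' is a Dictionary<int, Parent> populated earlier.
// Column names (ParentID, ChildID, Name) are assumptions.
connection.Open();
using (SqlDataReader reader = command.ExecuteReader())
{
    while (reader.Read())
    {
        int parentId = (int)reader["ParentID"];
        Child child = new Child
        {
            Id = (int)reader["ChildID"],
            Name = (string)reader["Name"]
        };
        // attach each child row to its parent in O(1)
        parents[parentId].Children.Add(child);
    }
}
```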
Here's a great resource to explain table types in more detail: http://www.sommarskog.se/arrays-in-sql-2008.html
Related
In my C# code, I will be populating a Dictionary.
I need to get that data into a MySQL table in the most efficient way possible.
Is it possible to pass that to a MySQL stored procedure? I guess I could pass it in some sort of string with commas, etc, so that the stored procedure could then call a function to parse the string and populate the table, but that's a pain.
Any other ideas?
Thanks!
Based on the comments so far, let me try to show the code/pseudocode I'm working on.
The code that builds the dictionary will look something like this:
private void DistributeCallsToReschedule()
{
    CallTimeSpacing = GetNumMinutesInNextCallWindow() / callsToReschedule.Count;
    DateTime currTimeToCall = new DateTime();
    foreach (int id in callsToReschedule)
    {
        CallIdTimeToCallMap.Add(id, currTimeToCall);
        // DateTime is immutable; AddMinutes returns a new value
        currTimeToCall = currTimeToCall.AddMinutes(CallTimeSpacing);
    }
}
So, the dictionary can contain my entries.
What I HOPE I can do is to pass the dictionary to a stored procedure as shown below.
If this isn't possible, what's the most efficient way to do what the stored procedure indicates? That is, the stored procedure wants a table to JOIN to that holds the data from the dictionary populated in the C# code. In other words, what's the most efficient way to get the dictionary's data into a table in MySQL? And if that isn't possible and I have to loop, what's the most efficient way to do that: call a stored procedure iteratively, or build one prepared statement containing all the values (via StringBuilder, I suppose)?
PARAMETERS PASSED TO STORED PROCEDURE BY C# CODE:
@CallIdTimeToCallMap

Put @CallIdTimeToCallMap into CallIdTimeToCallMapTable;

update cr
set cr.TimeToCall = map.TimeToCall
from callRequest cr
inner join CallIdTimeToCallMapTable map on
    cr.id = map.id
You have to map objects to tables and columns before any relational database can do anything with them. Objects are not relations.
You don't say what the parameters are that the stored procedure is expecting.
If it's an INSERT or UPDATE that expects a large set of objects, I'd wonder if a stored procedure is the right answer. You'd have to call it repeatedly, once to write the row for each object in the set. I'd consider a prepared statement, binding variables, and batching so you can do it in one round trip.
Is the set a single unit of work? Have you thought about transactional behavior?
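To make the batching suggestion concrete, here is a hedged sketch using the MySql.Data ADO.NET provider. The table and column names (`CallIdTimeToCallMapTable`, `id`, `TimeToCall`) are assumptions taken from the pseudocode above, not a confirmed schema:

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using MySql.Data.MySqlClient;

// Hypothetical sketch: write the whole dictionary in one round trip
// using a single multi-row parameterized INSERT inside a transaction.
void WriteMap(MySqlConnection connection, Dictionary<int, DateTime> map)
{
    var sql = new StringBuilder(
        "INSERT INTO CallIdTimeToCallMapTable (id, TimeToCall) VALUES ");
    var command = new MySqlCommand { Connection = connection };

    int i = 0;
    foreach (var pair in map)
    {
        if (i > 0) sql.Append(", ");
        sql.AppendFormat("(@id{0}, @time{0})", i);
        command.Parameters.AddWithValue("@id" + i, pair.Key);
        command.Parameters.AddWithValue("@time" + i, pair.Value);
        i++;
    }

    command.CommandText = sql.ToString();
    using (MySqlTransaction tx = connection.BeginTransaction())
    {
        command.Transaction = tx;
        command.ExecuteNonQuery(); // one statement, one round trip
        tx.Commit();
    }
}
```

For very large dictionaries you would chunk the batch to stay under MySQL's max_allowed_packet limit, but for a few hundred entries one statement is fine.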
I have many tables in the database that contain at least one column holding a URL, and these URLs are repeated a lot throughout the database. So I normalize them into a dedicated table and just use numeric IDs everywhere I need them; I often need to join on them, so numeric IDs are much better than full strings.
In MySQL + C++, to insert a lot of URLs in one go, I used to use multi-row INSERT IGNOREs or mysql_set_local_infile_handler(), then a batched SELECT with IN () to pull the IDs back from the database.
In C# + SQL Server I noticed there's a SqlBulkCopy class that's very useful and fast for mass insertion. But I also need mass selection to resolve the URL IDs after I insert them. Is there any such helper class that would work the same way for SELECT ... WHERE IN (many, urls, here)?
Or do you have a better idea for turning URLs into numbers in a consistent manner in C#? I thought about crc32'ing or crc64'ing the URLs, but I worry about collisions. I wouldn't care if collisions were few, but if not... it would be an issue.
PS: We're talking about tens of millions of Urls to get an idea of scale.
PS: For basic large inserts, SqlBulkCopy is faster than SqlDbType.Structured. Plus it has the SqlRowsCopied event for a status-tracking callback.
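For reference, a minimal SqlBulkCopy sketch with the SqlRowsCopied callback wired up (the destination table name `Urls`, its column, and `connectionString` are assumptions):

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

// Hypothetical sketch: bulk-insert a DataTable of URLs.
// Destination table name (Urls) and its column are assumptions.
DataTable urls = new DataTable();
urls.Columns.Add("Url", typeof(string));
urls.Rows.Add("http://example.com/a");
urls.Rows.Add("http://example.com/b");

using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    using (var bulk = new SqlBulkCopy(connection))
    {
        bulk.DestinationTableName = "Urls";
        bulk.NotifyAfter = 10000; // fire SqlRowsCopied every 10k rows
        bulk.SqlRowsCopied += (sender, e) =>
            Console.WriteLine("{0} rows copied", e.RowsCopied);
        bulk.WriteToServer(urls);
    }
}
```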
There is an even better way than SqlBulkCopy for this.
It's called Structured Parameters, and it allows you to pass a table-valued parameter to a stored procedure or query through ADO.NET.
There are code examples in the article, so I will only highlight what you need to do to get it up and working:
Create a user-defined table type in the database. You can call it UrlTable.
Set up an SP or query that does the SELECT by joining with a table-valued parameter of type UrlTable.
In your backing code (C#), create a DataTable with the same structure as UrlTable, populate it with URLs, and pass it to a SqlCommand as a structured parameter. Note that column order correspondence between the DataTable and the table type is critical.
What ADO.NET does behind the scenes (if you profile the query you can see this) is that before the query it declares a variable of type UrlTable and populates it (INSERT statements) with what you pass in the structured parameter.
Other than that, query-wise, you can do pretty much everything with table-valued parameters in SQL (join, select, etc).
I think you could use the IGNORE_DUP_KEY option on your index. If you set IGNORE_DUP_KEY = ON on the index of the URL column, the duplicate values are simply ignored and the rest are inserted appropriately.
I have a form containing 10 drop-down lists. These lists are populated by doing 10 calls to the database at form load.
I want to know the performance impact on the application as well as on SQL Server in the following 2 cases. Also, please suggest the best approach.
Fetch the data for each of these drop-down lists with 10 separate requests
Create a stored proc that fetches all 10 result sets and returns them to the UI in a single data reader to create the entities (single hit)
Please share your views...
It's good if you fetch the data in one go, i.e., by calling the procedure once and getting all ten dropdowns' data. But it also depends on the number of records you have and the time to process each record that you are going to bind to each dropdown box.
Option 1 is easier to maintain:
1. 10 requests don't cost very much.
2. Suppose some day you want to query only five of them; you can easily combine just those parts. If you put them all into one stored procedure, things will be difficult when the business logic changes.
You can return multiple tables from a SQL Server stored procedure.
Create a stored procedure with multiple select queries.
For example, if your SP has 10 select queries, it will return ten result sets, or tables.
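On the consuming side, multiple result sets from one proc can be walked with NextResult. A minimal sketch (the proc name and the binding step are assumptions):

```csharp
using System.Data;
using System.Data.SqlClient;

// Hypothetical sketch: one proc, ten result sets, one round trip.
// The proc name (GetAllDropDownLists) is an assumed example.
using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand("GetAllDropDownLists", connection))
{
    command.CommandType = CommandType.StoredProcedure;
    connection.Open();
    using (SqlDataReader reader = command.ExecuteReader())
    {
        do
        {
            while (reader.Read())
            {
                // bind the current result set's rows to the
                // matching drop-down list / entity collection
            }
        } while (reader.NextResult()); // advance to the next SELECT
    }
}
```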
A few months back we had the same situation, and we went for option 2, in a way: we had 5 data tables being returned from different SPs, so we made one SP with 5 output parameters.
In those parameters we send as input whether each specific data table is required or not, and the SP later returns the index at which that data table appears in the results.
CREATE PROCEDURE [dbo].[MySP]
    @pTable1 smallint OUTPUT,
    @pTable2 smallint OUTPUT
AS
BEGIN
    DECLARE @iLocation smallint = 0;
    IF @pTable1 = 1
    BEGIN
        SELECT * FROM TABLE1;
        SET @pTable1 = @iLocation;
        SET @iLocation = @iLocation + 1;
    END
    -- ... and so on for @pTable2, etc.
END
I hope this gives you a better idea.
I have written a single stored procedure that returns 2 tables:
select *
from workers
select *
from orders
I call this stored procedure from my C# application and get a DataSet with two tables, and everything is working fine.
My question is: how can I change the table names on the SQL Server side so that on the C# side I can access them by name (instead of Tables[0]):
myDataSet.Tables["workers"]...
I tried to look for the answer in Google but couldn't find it. Maybe the search keywords were not sufficient.
You cannot really do anything from the server-side to influence those table names - those names only exist on the client-side, in your ADO.NET code.
What you can do is on the client-side - add table mappings - something like:
SqlDataAdapter dap = new SqlDataAdapter(YourSqlCommandHere);
dap.TableMappings.Add("Table", "workers");
dap.TableMappings.Add("Table1", "orders");
This would "rename" the Table (first result set) to workers and Table1 (second result set) to orders before you actually fill the data. So after the call to
dap.Fill(myDataSet);
you would then have myDataSet.Tables["workers"] and myDataSet.Tables["orders"] available for you to use.
The TDS protocol documentation (which is the protocol used to return results from SQL Server) does not mention a "result set name". So the only way you will ever be able to access the result sets in ADO.NET is by number, as in your example.
Can someone suggest the best way to retrieve a scalar value when the site uses .xsd files for the data sets? I have such a site, where before I commit to an insert I need to check for duplicates.
Back in the day one would just instantiate new connection and command objects and run the query through the BLL/DAL: easy job. With this prepackaged .xsd file that the Studio creates for you, I have no idea how to do it.
Thanks,
Risho
First, I would recommend adding a unique index in your database to ensure that it's impossible to create duplicates.
To answer your question: you can add queries to the automatically created TableAdapters:
How to: Create TableAdapter queries
From MSDN:

TableAdapter with multiple queries

Unlike standard data adapters, TableAdapters can contain multiple queries to fill their associated data tables. You can define as many queries for a TableAdapter as your application requires, as long as each query returns data that conforms to the same schema as its associated data table. This enables loading of data that satisfies differing criteria. For example, if your application contains a table of customers, you can create a query that fills the table with every customer whose name begins with a certain letter, and another query that fills the table with all customers located in the same state. To fill a Customers table with customers in a given state you can create a FillByState query that takes a parameter for the state value: SELECT * FROM Customers WHERE State = @State. You execute the query by calling the FillByState method and passing in the parameter value like this: CustomerTableAdapter.FillByState("WA").

In addition to queries that return data of the same schema as the TableAdapter's data table, you can add queries that return scalar (single) values. For example, creating a query that returns a count of customers (SELECT Count(*) FROM Customers) is valid for a CustomersTableAdapter even though the data returned does not conform to the table's schema.
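Concretely, after adding a scalar query in the dataset designer, the duplicate check could look something like the sketch below. The dataset namespace (`DataSet1TableAdapters`), the adapter, and the query name (`CountByEmail`) are all hypothetical names you would substitute with what the designer generated for you:

```csharp
using System;

// Hypothetical sketch: the TableAdapter and its designer-added scalar
// query (CountByEmail, defined as SELECT COUNT(*) ... WHERE Email = @Email)
// are assumed names.
var adapter = new DataSet1TableAdapters.CustomersTableAdapter();

// Designer-generated scalar queries return object; Convert handles null.
object result = adapter.CountByEmail("someone@example.com");
int duplicates = Convert.ToInt32(result);

if (duplicates == 0)
{
    // no duplicate found: safe to run the insert
}
```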