Handling concurrent attempts on a SQL transaction (special situation) - c#

I have 3 tables in a SQL Server 2008 R2 database whose records need to be filled one right after another, so I used a transaction to do this. Basically, I have 2 INSERT stored procedures in the middle of a transaction that insert records into these tables, as in the code below.
The transaction is handled with the C# SqlTransaction class in ASP.NET.
The following procedures are called in the middle of the transaction.
First Table:
ALTER PROCEDURE [INSERT_RESOURCE]
    @docID int,
    @resTitle nvarchar(500),
    @resCategory nvarchar(100),
    @resType nvarchar(50),
    @resLink nvarchar(MAX),
    @createdBy nvarchar(50),
    @createdDateTime datetime
AS
BEGIN
    INSERT INTO Resource
    VALUES(@resTitle, @resCategory, @resType,
           @resLink, @createdBy, @createdDateTime)
END
Second Table:
CREATE PROCEDURE [INSERT_RESOURCE_DOCUMENT]
    @docName nvarchar(200),
    @docSize nvarchar(50),
    @docType nvarchar(50),
    @docPath nvarchar(MAX),
    @docTitle nvarchar(100),
    @uploadBy nvarchar(50),
    @uploadDateTime datetime
AS
BEGIN
    INSERT INTO Document
    VALUES(@docName, @docSize, @docType, @docPath,
           @docTitle, @uploadBy, @uploadDateTime)
    INSERT INTO Resource_Document --Third table
    VALUES(
        (SELECT TOP 1 ResourceID FROM Resource ORDER BY ResourceID DESC),
        (SELECT TOP 1 DocID FROM Document ORDER BY DocID DESC)
    )
END
The above procedures work fine, but there is a possible issue with the third INSERT statement, which uses the latest IDs of the first two tables to insert data into the third table. Because it gets them with SELECT TOP 1 queries, it might pick up the wrong ID if someone else runs the same transaction at the same time and adds rows to the first two tables.
So I was wondering, how can I resolve this issue in the transaction?
Is there any other way I can use in that third INSERT to get those IDs from the first two tables?

Your problem here is scope. You want the last value inserted by that user, during that transaction. Your SELECT TOP 1 queries break out of the user's scope and may return the last value inserted by any user.
To remain in the user's scope, take advantage of SQL Server's scoping functions. Combine all 3 of these actions into a single stored procedure, then use the SCOPE_IDENTITY() function to get the last identity value inserted in the current session and scope. This ensures that users won't get each other's inserted values.
Read more here: http://msdn.microsoft.com/en-us/library/ms190315.aspx
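As a rough sketch of that suggestion, the two procedures from the question could be merged along these lines. The procedure name is made up for illustration, and the sketch assumes that ResourceID and DocID are int IDENTITY columns, which the question implies but does not state. The positional VALUES lists are kept exactly as in the question.

```sql
-- Hypothetical combined procedure. The name is invented, and ResourceID and
-- DocID are assumed to be int IDENTITY columns (not stated in the question).
CREATE PROCEDURE [INSERT_RESOURCE_WITH_DOCUMENT]
    @resTitle nvarchar(500),
    @resCategory nvarchar(100),
    @resType nvarchar(50),
    @resLink nvarchar(MAX),
    @createdBy nvarchar(50),
    @createdDateTime datetime,
    @docName nvarchar(200),
    @docSize nvarchar(50),
    @docType nvarchar(50),
    @docPath nvarchar(MAX),
    @docTitle nvarchar(100),
    @uploadBy nvarchar(50),
    @uploadDateTime datetime
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @newResourceID int, @newDocID int;

    INSERT INTO Resource
    VALUES (@resTitle, @resCategory, @resType,
            @resLink, @createdBy, @createdDateTime);
    -- Identity value generated by the INSERT just above, in this scope only.
    SET @newResourceID = SCOPE_IDENTITY();

    INSERT INTO Document
    VALUES (@docName, @docSize, @docType, @docPath,
            @docTitle, @uploadBy, @uploadDateTime);
    SET @newDocID = SCOPE_IDENTITY();

    -- Link the two rows using the captured IDs instead of SELECT TOP 1.
    INSERT INTO Resource_Document
    VALUES (@newResourceID, @newDocID);
END
```

The calling code can still wrap the single procedure call in its SqlTransaction, so all three inserts commit or roll back together.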

The third script will definitely lead to an issue when two records are added at the same time.
I think you could place an AFTER INSERT trigger on Resource (for every insert on Resource) and another AFTER INSERT trigger on Document (for every insert on Document).
Or you could join the two tables above (Resource and Document) and then create a trigger that adds the data to the third table (Resource_Document).
For reference - http://msdn.microsoft.com/en-us/library/ms189799.aspx

Related

T-SQL insert into sometimes fails to add to a table at random

So I have two tables, Records (Input ID = Primary Key) and TaskNotes (Input Id, TaskNote : No primary key).
There used to be a single stored procedure which would add to the record table, get the primary id that was generated, then add that ID to the TaskNotes table, along with the task notes text.
Recently, there was an issue where the sproc would run seemingly half way, with the record being added, but the task notes entry not being run.
I have since split it into an AddRecord stored procedure and an AddTaskNotes stored procedure, which are called from a C# application.
This works much as before; however, at random, AddTaskNotes still won't run.
I think the issue is a locking of the TaskNotes table.
Has anyone experienced this before and could let me know how it was resolved?
The current rate is about 1 failed tasknotes for every 400 record entries.
This is the AddRecord statement:
INSERT INTO Time.Records
    ( TeamID ,
      UserID ,
      TimeIN ,
      TimeOUT
    )
VALUES ( @TeamID , @UserID , @TimeIN , @TimeOUT );
return SCOPE_IDENTITY();
This is the AddTaskNotes statement:
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;
    INSERT INTO Time.TaskNotes ( InputID, TaskNotes )
    VALUES ( @InputID, @TaskNotes );
END

Explain Code First CRUD auto-generated SQL for Identity column

Code-first auto generates an insert procedure code as below for a table that has ProductID as primary key (identity column).
CREATE PROCEDURE [dbo].[InsertProducts]
    @ProductName [nvarchar](max),
    @Date [datetime]
AS
BEGIN
    INSERT dbo.ProductsTable([ProductName], [Date])
    VALUES (@ProductName, @Date)
    -- identity stuff starts here
    DECLARE @ProductID int
    SELECT @ProductID = [ProductID]
    FROM dbo.FIT_StorageLocations
    WHERE @@ROWCOUNT > 0 AND [ProductID] = scope_identity()
    SELECT t0.[ProductID]
    FROM dbo.ProductsTable AS t0
    WHERE @@ROWCOUNT > 0 AND t0.[ProductID] = @ProductID
END
GO
Could you please explain the code that handles the identity column? Also, if an insert procedure is to be manually written from scratch, would it be handled differently?
If, for example, I remove this auto-generated code, I encounter one of the following errors:
Procedure ....expects parameter '@ProductID', which was not supplied
Store update, insert, or delete statement affected an unexpected number of rows (0). Entities may have been modified or deleted since entities were loaded. See http://go.microsoft.com/fwlink/?LinkId=472540 for information on understanding and handling optimistic concurrency exceptions.
In the app, this is how I call the procedure which works fine until I try to mess with the code first auto generated SQL:
using (var db = new AppContext())
{
var record = new ProductObj()
{
ProductName= this.ProductName,
Date = DateTime.UtcNow
};
db.ProductDbSet.Add(record);
db.SaveChanges();
}
I guess there are two things to be explained here.
Why a SELECT statement when I insert stuff?
Let's first see what a regular insert by Entity Framework looks like. By "regular" I mean an insert without mapping CUD actions to stored procedures. The normal pattern is:
INSERT [dbo].[Product]([Name], ...)
VALUES (@0, ...)
SELECT [Id]
FROM [dbo].[Product]
WHERE @@ROWCOUNT > 0 AND [Id] = scope_identity()
So the INSERT is followed by a SELECT. This is because EF needs to know the identity value that the database assigns to the new Product to assign it to the entity object's Product.ProductId property and to track the entity. If for some reason you'd decide to do an update immediately after the insert, EF will be able to generate an update statement like UPDATE ... WHERE Id = @0.
When the insert is handled by a stored procedure, the sproc should return the new Id value in a way that looks like the regular insert. It expects to receive a one-column result set of which the column is named after the identity column. It should contain one row, the new identity value.
So that's why there is a SELECT statement in there, and why EF complains if you remove it. But, you might ask, does EF really need 7 lines of code to get an assigned identity value?
Why so much code?
Honestly, I have to speculate a bit here, because it isn't documented as far as I can find. But let's look at a minimal working version:
INSERT [dbo].[Products]([Name])
VALUES (@Name)
SELECT scope_identity() AS ProductId;
This does the job. It's even the standard example of many tutorials, including official ones, on mapping CUD actions to stored procedures.
But a database can be stuffed with triggers, constraints, defaults, etc. It's hard to predict their influence on the returned scope_identity() under the wide range of circumstances EF may encounter. So EF wants to guarantee that the returned value really belongs to the newly inserted record. And that a record has actually been inserted in the first place. That's why it adds the SELECT from the Product table, including the @@ROWCOUNT.
To implement these safeguards, a minimal version would be:
INSERT [dbo].[Products]([Name])
VALUES (@Name)
SELECT t0.[ProductId]
FROM [dbo].[Products] AS t0
WHERE @@ROWCOUNT > 0 AND t0.[ProductId] = scope_identity()
Same as in the regular insert.
That's as far as I can follow EF. It puzzles me a bit that this single SELECT apparently is enough for a regular INSERT but not for a stored procedure. I can't explain why there are two SELECTs in the generated code.

Insert multiple sql rows via stored proc

I have looked at some related topics, but my question isn't quite answered:
C# - Inserting multiple rows using a stored procedure
Insert Update stored proc on SQL Server
Efficient Multiple SQL insertion
I have the following kind of setup when running my stored procedure in the code behind for my web application. The thing is I am now faced with the possibility of inserting multiple products and I would like to do it all in one ExecuteNonQuery rather than do a foreach loop and run it n number of times.
I am not sure how to do this, or if it can be, with my current setup.
The code should be somewhat self explanatory but if clarification is needed let me know. Thanks.
SqlDatabase database = new SqlDatabase(transMangr.ConnectionString);
DbCommand commandWrapper = StoredProcedureProvider.GetCommandWrapper(database, "proc_name", useStoredProc);
database.AddInParameter(commandWrapper, "@ProductID", DbType.Int32, entity._productID);
database.AddInParameter(commandWrapper, "@ProductDesc", DbType.String, entity._desc);
...more parameters...
Utility.ExecuteNonQuery(transMangr, commandWrapper);
Proc
ALTER PROCEDURE [dbo].[Products_Insert]
    -- Add the parameters for the stored procedure here
    @ProductID int,
    @Link varchar(max),
    @ProductDesc varchar(max),
    @Date DateTime
AS BEGIN
    SET NOCOUNT ON;
    INSERT INTO [dbo].[Prodcuts]
    (
        [CategoryID],
        [Link],
        [Desc],
        [Date]
    )
    VALUES
    (
        @ProductID,
        @Link,
        @ProductDesc,
        @Date
    )
END
You should be fine running your stored procedure in a loop. Just make sure that you commit rarely, not after every insert.
For alternatives, you have already found the discussion about loading data.
Personally, I like SQL bulk insert of the form insert into myTable (select *, literalValue from someOtherTable);
But that will probably not do in your case.
You could pass all your data as a table-valued parameter; MSDN has a pretty good write-up about it.
Something along the lines of the following should work:
CREATE TABLE dbo.tSegments
(
    SegmentID BIGINT NOT NULL CONSTRAINT pkSegment PRIMARY KEY CLUSTERED,
    SegCount BIGINT NOT NULL
);
CREATE TYPE dbo.SegmentTableType AS TABLE
(
    SegmentID BIGINT NOT NULL
);
CREATE PROCEDURE dbo.sp_addSegments
    @Segments dbo.SegmentTableType READONLY
AS
BEGIN
    MERGE INTO dbo.tSegments AS tSeg
    USING @Segments AS S
    ON tSeg.SegmentID = S.SegmentID
    -- Existing segment: bump its counter (the target alias is tSeg).
    WHEN MATCHED THEN UPDATE SET tSeg.SegCount = tSeg.SegCount + 1
    -- New segment: the inserted key must come from the source row S.
    WHEN NOT MATCHED THEN INSERT VALUES (S.SegmentID, 1);
END
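To show how such a procedure might be called, here is a small T-SQL sketch that fills a variable of the table type and passes the whole batch in one call. The variable name and the sample IDs are made up for illustration.

```sql
-- Fill a variable of the table type with the keys to upsert.
-- @batch and the IDs 101-103 are made-up sample values.
DECLARE @batch dbo.SegmentTableType;

INSERT INTO @batch (SegmentID)
VALUES (101), (102), (103);

-- One round trip handles the whole batch.
EXEC dbo.sp_addSegments @Segments = @batch;
```

It is probably safest to keep the SegmentID values distinct within one batch, since MERGE raises an error when several source rows match the same target row. From ADO.NET, the same parameter can be supplied as a DataTable with SqlDbType.Structured.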
Define the commandWrapper and the parameters for the command outside of the loop; then, within the loop, you just assign the parameter values and execute the proc.
SqlDatabase database = new SqlDatabase(transMangr.ConnectionString);
DbCommand commandWrapper = StoredProcedureProvider.GetCommandWrapper(database, "proc_name", useStoredProc);
database.AddInParameter(commandWrapper, "@ProductID", DbType.Int32);
database.AddInParameter(commandWrapper, "@ProductDesc", DbType.String);
...more parameters...
foreach (var entity in entitties)
{
    database.SetParameterValue(commandWrapper, "@ProductID", entity._productID);
    database.SetParameterValue(commandWrapper, "@ProductDesc", entity._desc);
    //..more parameters...
    Utility.ExecuteNonQuery(transMangr, commandWrapper);
}
This is not ideal from a purist's point of view, but sometimes you are limited by frameworks and libraries: you may be forced to call stored procedures in a certain way, bind parameters in a certain way, and use connections that are managed by pools as part of your framework.
In such circumstances, a method we have found to work is to simply write your stored procedure with a lot of parameters, usually a name followed by a number, e.g. @ProductId1, @ProductDesc1, @ProductId2, @ProductDesc2, up to a number you decide, say 32.
You can use some form of scripting language to produce the lines for this.
The stored procedure can first insert all the values into a table variable (or temp table) that allows NULLs, then do bulk inserts / merges on this data in a way similar to Johnv2020's answer. You might remove the NULL rows first.
This will usually be more efficient than inserting one row at a time (partly because of the database operations themselves, and partly because of your framework's overhead in getting a connection, calling the procedure, etc.).
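A minimal sketch of this numbered-parameter approach, cut down to three slots, might look like the following. The procedure name, the slot count, and the choice of target columns are assumptions for illustration. The table name is spelled as in the question, and the sketch assumes its remaining columns are nullable or have defaults.

```sql
-- Hypothetical procedure: the name and the cap of three slots are invented.
-- Unused slots are simply left at their NULL defaults by the caller.
CREATE PROCEDURE dbo.Products_InsertMany
    @ProductId1 int = NULL, @ProductDesc1 varchar(max) = NULL,
    @ProductId2 int = NULL, @ProductDesc2 varchar(max) = NULL,
    @ProductId3 int = NULL, @ProductDesc3 varchar(max) = NULL
AS
BEGIN
    SET NOCOUNT ON;

    -- Stage every slot, including the unused (NULL) ones.
    DECLARE @staged TABLE (ProductId int NULL, ProductDesc varchar(max) NULL);

    INSERT INTO @staged (ProductId, ProductDesc)
    VALUES (@ProductId1, @ProductDesc1),
           (@ProductId2, @ProductDesc2),
           (@ProductId3, @ProductDesc3);

    -- Remove the unused slots before the real insert.
    DELETE FROM @staged WHERE ProductId IS NULL;

    -- One set-based insert for all supplied rows.
    INSERT INTO [dbo].[Prodcuts] ([CategoryID], [Desc])
    SELECT ProductId, ProductDesc
    FROM @staged;
END
```

A caller with fewer rows than slots passes only the parameters it needs; the rest stay NULL and are filtered out.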

How do I structure this transaction?

We have an ASP.NET/MSSQL based web app which generates orders with sequential order numbers.
When a user saves a form, a new order is created as follows:
SELECT MAX(order_number) FROM order_table, call this max_order_number
set new_order_number = max_order_number + 1
INSERT a new order record, with this new_order_number (it's just a field in the order record, not a database key)
If I enclose the above 3 steps in single transaction, will it avoid duplicate order numbers from being created, if two customers save a new order at the same time? (And let's say the system is eventually on a web farm with multiple IIS servers and one MSSQL server).
I want to avoid two customers selecting the same MAX(order_number) due to concurrency somewhere in the system.
What isolation level should be used? Thank you.
Why not just use an Identity as the order number?
Edit:
As far as I know, you can make the current order_number column an Identity (you may have to reset the seed, it's been a while since I've done this). You might want to do some tests.
Here's a good read about what actually goes on when you change a column to an Identity in SSMS. The author mentions how this may take a while if the table already has millions of rows.
Using an identity is by far the best idea. I create all my tables like this:
CREATE TABLE mytable (
mytable_id int identity(1, 1) not null primary key,
name varchar(50)
)
The "identity" flag means "Let SQL Server assign this number for me". The (1, 1) means that identity numbers start at 1 and are incremented by 1 each time someone inserts a record into the table. "not null" means that nobody is allowed to insert a null into this column, and "primary key" means that the column uniquely identifies each row; by default, SQL Server creates a clustered index on it. With this kind of table, you can then insert your record like this:
-- We don't need to insert into mytable_id column; SQL Server does it for us!
INSERT INTO mytable (name) VALUES ('Bob Roberts')
But to answer your literal question, I can give a lesson about how transactions work. It's certainly possible, although not optimal, to do this:
-- Begin a transaction: everything within it is committed or rolled back
-- as a unit. That alone does not stop another session from reading the
-- same MAX value, so the SELECT below also takes explicit locks.
BEGIN TRANSACTION
DECLARE @id bigint
-- Retrieve the maximum order number. UPDLOCK and HOLDLOCK keep the rows
-- read locked until the transaction ends, so a concurrent caller waits
-- here instead of reading the same maximum.
SELECT @id = MAX(order_number) FROM order_table WITH (UPDLOCK, HOLDLOCK)
-- Other sessions running this same code are blocked at their SELECT until
-- we commit. ISNULL covers the empty-table case, where MAX returns NULL.
INSERT INTO order_table (order_number) VALUES (ISNULL(@id, 0) + 1)
-- Committing the transaction releases the locks and allows other programs
-- to work on the order table
COMMIT TRANSACTION
Just keep in mind that declaring your table with an identity primary key column does this all for you automatically.
The risk is two processes selecting the MAX(order_number) before one of them inserts the new order. A safer way is to do it in one step, although without a locking hint or a unique constraint on order_number this narrows the window rather than strictly guaranteeing unique numbers:
INSERT INTO order_table
(order_number, /* other fields */)
SELECT MAX(order_number) + 1,
/* other values */
FROM order_table
I agree with G_M; use an Identity field. When you add your record, just
INSERT INTO order_table (/* other fields */)
VALUES (/* other values */); SELECT SCOPE_IDENTITY()
The value returned by SCOPE_IDENTITY() will be your order number.

MySql Batching Stored Procedure Calls with .Net / Connector?

Is there a way to batch stored procedure calls in MySql with the .Net / Connector to increase performance?
Here's the scenario... I'm using a stored procedure that accepts a few parameters as input. This procedure basically checks to see whether an existing record should be updated or a new one inserted (I'm not using INSERT INTO .. ON DUPLICATE KEY UPDATE because the check involves date ranges, so I can't really make a primary key out of the criteria).
I want to call this procedure a lot of times (let's say batches of 1000 or so). I can of course, use one MySqlConnection and one MySqlCommand instance and keep changing the parameter values, and calling .ExecuteNonQuery().
I'm wondering if there's a better way to batch these calls?
The only thought that comes to mind is to manually construct a string like 'call sp_myprocedure(@parama_1,@paramb_1);call sp_myprocedure(@parama_2,@paramb_2);...', and then create all the appropriate parameters. I'm not convinced this will be any better than calling .ExecuteNonQuery() a bunch of times.
Any advice? Thanks!
EDIT: More info
I'm actually trying to store data from an external data source, on a regular basis. Basically I'm taking rss feeds of Domain auctions (from various sources like godaddy, pool, etc.), and updating a table with the auction info using this stored procedure (let's call it sp_storeSale). Now, in this table that the sale info gets stored, I want to keep historical records for sales for a given domain, so I have a domain table, and a sale table. The sale table has a many to one relationship with the domain table.
Here's the stored procedure:
-- --------------------------------------------------------------------------------
-- Routine DDL
-- Note: comments before and after the routine body will not be stored by the server
-- --------------------------------------------------------------------------------
DELIMITER $$
CREATE PROCEDURE `DomainFace`.`sp_storeSale`
(
    p_middle VARCHAR(63),
    p_extension VARCHAR(10),
    p_brokerId INT,
    p_endDate DATETIME,
    p_url VARCHAR(500),
    p_category INT,
    p_saleType INT,
    p_priceOrBid DECIMAL(10, 2),
    p_currency VARCHAR(3)
)
BEGIN
    -- In MySQL, parameters and DECLAREd locals are referenced without '@';
    -- '@name' is a separate session variable. The p_/v_ prefixes also keep
    -- them from shadowing the column names of the sale table.
    DECLARE v_existingId BIGINT DEFAULT NULL;
    DECLARE v_domainId BIGINT DEFAULT 0;
    SET v_domainId = fn_getDomainId(p_middle, p_extension);
    SET v_existingId = (
        SELECT id FROM sale
        WHERE
            domainId = v_domainId
            AND brokerId = p_brokerId
            AND UTC_TIMESTAMP() BETWEEN startDate AND endDate
    );
    IF v_existingId IS NOT NULL THEN
        UPDATE sale SET
            endDate = p_endDate,
            url = p_url,
            category = p_category,
            saleType = p_saleType,
            priceOrBid = p_priceOrBid,
            currency = p_currency
        WHERE
            id = v_existingId;
    ELSE
        INSERT INTO sale (domainId, brokerId, startDate, endDate, url,
            category, saleType, priceOrBid, currency)
        VALUES (v_domainId, p_brokerId, UTC_TIMESTAMP(), p_endDate, p_url,
            p_category, p_saleType, p_priceOrBid, p_currency);
    END IF;
END$$
DELIMITER ;
As you can see, I'm basically looking for an existing record that is not 'expired', but has the same domain, and broker, in which case I assume the auction is not over yet, and the data is an update to the existing auction. Otherwise, I assume the auction is over, it is a historical record, and the data I've got is for a new auction, so I create a new record.
Hope that clears up what I'm trying to achieve :)
I'm not entirely sure what you're trying to do but it sounds kinda house-keeping or maintenance related so I won't be too ashamed at posting the following suggestion.
Why don't you move all of your logic into the database and process it all server-side?
The following example uses a cursor (shock/horror) but it's perfectly acceptable to use them in such circumstances.
If you can avoid using cursors at all - great, but the main point of my suggestion is about moving the logic from your application tier back into the data tier to save on the round trips. You'd call the following sproc once and it would process the entire range of data in single call.
call house_keeping(curdate() - interval 1 month, curdate());
Also, if you can provide just a bit more information about what you're trying to do we might be able to suggest other approaches.
Example stored procedure
drop procedure if exists house_keeping;
delimiter #
create procedure house_keeping
(
in p_start_date date,
in p_end_date date
)
begin
declare v_done tinyint default 0;
declare v_id int unsigned;
declare v_expired_date date;
declare v_cur cursor for
select id, expired_date from foo where
expired_date between p_start_date and p_end_date;
declare continue handler for not found set v_done = 1;
open v_cur;
repeat
fetch v_cur into v_id, v_expired_date;
/*
if <some condition> then
insert ...
else
update ...
end if;
*/
until v_done end repeat;
close v_cur;
end #
delimiter ;
Just in case you think I'm completely mad in suggesting cursors, you might want to read this:
Optimal MySQL settings for queries that deliver large amounts of data?
Hope this helps :)
