Bulk Copying and Deleting in One Transaction - C#

In a C# application I would like to copy table data from one server (SQL Server 2000) to another server (SQL Server 2005). I want to copy the data in a single operation and then delete the existing data from the SQL Server 2000 table. I need to do all of this (bulk copying and deleting) in a single transaction. How can I achieve this?
Note: I have two different SQL Server connections - how can I achieve this in a single transaction?

To minimise the duration of the transaction, I always do this by bulk-copying to a staging table (same schema, but different name - no indexes etc), and then once all the data is at the server, do something like:
BEGIN TRAN
DELETE FROM FOO
INSERT FOO ...
SELECT ...
FROM FOO_STAGING
COMMIT TRAN
DELETE FROM FOO_STAGING
(the transaction could be either in the TSQL or on the connection via managed code, or via TransactionScope; the TSQL could be either as command-text, or as a SPROC)
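For illustration, here is a rough C# sketch of that pattern applied to the two-server question above, assuming the FOO / FOO_STAGING names from the TSQL, that the source table fits comfortably in memory, and that MSDTC is available so the TransactionScope can span both connections (connection strings are placeholders):
using System.Data;
using System.Data.SqlClient;
using System.Transactions;

static void MoveFooData(string sourceConnString2000, string destConnString2005)
{
    // 1. Outside any transaction: read the source rows and bulk-copy them
    //    into the staging table on the destination server.
    var sourceRows = new DataTable();
    using (var source = new SqlConnection(sourceConnString2000))
    using (var adapter = new SqlDataAdapter("SELECT * FROM FOO", source))
    {
        adapter.Fill(sourceRows);
    }

    using (var dest = new SqlConnection(destConnString2005))
    {
        dest.Open();
        using (var bulk = new SqlBulkCopy(dest) { DestinationTableName = "FOO_STAGING" })
        {
            bulk.WriteToServer(sourceRows);
        }
    }

    // 2. Short transaction spanning both connections (escalated to a
    //    distributed transaction by MSDTC): swap the data in, purge the source.
    using (var scope = new TransactionScope())
    {
        using (var dest = new SqlConnection(destConnString2005))
        using (var swap = new SqlCommand(
            "DELETE FROM FOO; INSERT FOO SELECT * FROM FOO_STAGING;", dest))
        {
            dest.Open();
            swap.ExecuteNonQuery();
        }

        using (var source = new SqlConnection(sourceConnString2000))
        using (var purge = new SqlCommand("DELETE FROM FOO", source))
        {
            source.Open();
            purge.ExecuteNonQuery();
        }

        scope.Complete();
    }

    // 3. Clean up the staging table outside the transaction.
    using (var dest = new SqlConnection(destConnString2005))
    using (var cleanup = new SqlCommand("DELETE FROM FOO_STAGING", dest))
    {
        dest.Open();
        cleanup.ExecuteNonQuery();
    }
}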

You can do this using linked servers, although I've never tried doing this between a SQL2005 and a SQL2000 instance. On your SQL2005 instance:
EXEC sp_addlinkedserver 'Sql2000Server' --Only need to do this once on the server
BEGIN TRAN
INSERT INTO dbo.MyTable (id, [column])
SELECT id, [column] FROM Sql2000Server.MyDatabase.dbo.MyTable
DELETE FROM Sql2000Server.MyDatabase.dbo.MyTable
--etc...
COMMIT TRAN
See Microsoft books online for the syntax to add / remove linked servers (http://msdn.microsoft.com/en-us/library/ms190479.aspx)

Further to the linked server suggestion, you can use SSIS as well, which would be my preferred method.

Related

Select and read the first record of a table, and delete that record after it is read, in one stored procedure

I want to read records one by one and delete each record after it has been read. The table is a temp table, and a multi-threaded program will use its data. I need each record to be read just once, and not by multiple threads.
Is there any solution using stored procedures to make this program thread safe (delete a record just after it is read by the first thread)?
First, I feel like I have to warn you that it's probably not the best idea to do this in SQL Server - relational databases work best with a set-based approach, not on a row-by-row basis.
Reading and deleting each row individually will have very poor performance.
Having said that, here's one way to delete a row, get it back to the client using the output clause, and (thanks to the rowlock hint) do it in a thread safe manner:
;WITH FirstRow AS (
    -- ORDER BY is not allowed directly on DELETE, hence the CTE
    SELECT TOP (1) * FROM #tempTable WITH (ROWLOCK) ORDER BY id
)
DELETE FROM FirstRow
OUTPUT deleted.*
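As a rough illustration of the client side, here is a C# sketch of several workers draining such a table concurrently. The connection string and the dbo.WorkQueue table are placeholders (a local #temp table is only visible to the connection that created it), and the added READPAST hint - an assumption beyond the ROWLOCK hint above - lets a worker skip rows another worker is currently deleting:
using System;
using System.Data.SqlClient;
using System.Threading.Tasks;

static class QueueWorker
{
    const string ConnectionString = "Server=.;Database=MyDb;Integrated Security=true";

    const string DequeueSql = @"
;WITH FirstRow AS (
    SELECT TOP (1) * FROM dbo.WorkQueue WITH (ROWLOCK, READPAST) ORDER BY id
)
DELETE FROM FirstRow
OUTPUT deleted.id;";

    // Returns the id of the dequeued row, or null when the table is empty.
    static int? TryDequeue()
    {
        using (var conn = new SqlConnection(ConnectionString))
        using (var cmd = new SqlCommand(DequeueSql, conn))
        {
            conn.Open();
            object id = cmd.ExecuteScalar();
            return id == null ? (int?)null : (int)id;
        }
    }

    static void Main()
    {
        // Each row is handed to exactly one of the concurrent workers.
        Parallel.For(0, 4, worker =>
        {
            int? id;
            while ((id = TryDequeue()) != null)
            {
                Console.WriteLine("Worker {0} processed row {1}", worker, id);
            }
        });
    }
}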
This should be your Stored Procedure code:
CREATE PROCEDURE DeleteButOne
AS
BEGIN
    SELECT TOP 1 * FROM [Your Table Name]
    DELETE FROM [Your Table Name] WHERE Id NOT IN (SELECT TOP 1 Id FROM [Your Table Name])
END
And then you can execute the procedure:
EXECUTE DeleteButOne

Is it possible to record INSERTed row references for deletion later?

For testing I INSERT a bunch of rows into a SQL Server DB. I want to clear them up at the end. While this should be a specific test-only database, I want to be well behaved just in case. For the same reason, I don't want to make assumptions about what data may already be in the DB when deleting my test data, e.g. by deleting all rows or all rows with dates in a certain range.
When inserting rows, is there a way I can capture a reference to the actual rows inserted, so I can later delete those exact rows regardless of whether the data in them has been modified? That way I know exactly which rows I 'own' for the test.
I'm calling SQL Server queries from C# code within the MSTest framework.
Yes, it is possible with the OUTPUT clause.
You need to create one intermediate table variable for capturing the identity values of the inserted records.
DECLARE @TAB TABLE(ID INT)
INSERT INTO TABLE1
OUTPUT inserted.Table1PK INTO @TAB
SELECT id, name ... -- etc.
Now check your table variable
SELECT * FROM @TAB
If you want, you can then dump the table variable data into an actual (persisted) table.
For more info look at OUTPUT Clause (Transact-SQL)
Now you can delete the TABLE1 records like this:
DELETE FROM TABLE1 WHERE ID IN (SELECT ID FROM @TAB)
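Since you are calling the queries from C#, a minimal sketch of capturing those keys on the client side might look like the following (the connection string, the column list and the values are placeholders; TABLE1 and Table1PK come from the example above):
using System.Collections.Generic;
using System.Data.SqlClient;

// Collect the keys of the rows this test inserts, so cleanup can target
// exactly those rows no matter what else is in the table.
var insertedKeys = new List<int>();

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(
    "INSERT INTO TABLE1 (id, name) OUTPUT inserted.Table1PK VALUES (@id, @name)", conn))
{
    cmd.Parameters.AddWithValue("@id", 1);
    cmd.Parameters.AddWithValue("@name", "test row");
    conn.Open();

    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            insertedKeys.Add(reader.GetInt32(0));
        }
    }
}

// During cleanup:
// DELETE FROM TABLE1 WHERE Table1PK IN (<the captured keys>)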
You can use transactions and rollback the transaction instead of committing the transaction at the end of your test.
You can do this with an explicit transaction, or by set implicit_transactions on; at the start of your session and not committing the transactions.
begin transaction test /* explicit transaction */
insert into TestTable (pk,col) values
(1,'val');
update TestTable
set pk=2
where pk=1;
select *
from TestTable
where pk = 2;
rollback transaction
--commit transaction
I wouldn't do this in a production environment because the transaction will hold locks until the transaction completes, but if you are testing combinations of inserts, updates, and deletes where the pk might change, this could be safer than trying to track all of your changes to manually roll them back.
This would also protect you from making mistakes in your tests that you aren't able to manually roll back.
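A minimal MSTest-style sketch of this idea, with a placeholder connection string and table: because scope.Complete() is never called, everything done on the enlisted connection is rolled back when the scope is disposed at the end of the test.
using System.Data.SqlClient;
using System.Transactions;
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class TestTableTests
{
    const string TestConnectionString = "Server=.;Database=TestDb;Integrated Security=true";

    [TestMethod]
    public void InsertedTestRowsAreRolledBackAutomatically()
    {
        using (var scope = new TransactionScope())
        using (var conn = new SqlConnection(TestConnectionString))
        {
            conn.Open(); // enlists in the ambient transaction

            using (var cmd = new SqlCommand(
                "insert into TestTable (pk, col) values (1, 'val')", conn))
            {
                cmd.ExecuteNonQuery();
            }

            // ... run the assertions for the test here ...

            // scope.Complete() is deliberately NOT called, so Dispose()
            // rolls the insert back when the using block ends.
        }
    }
}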
Related: How to: Write a SQL Server Unit Test that Runs within the Scope of a Single Transaction

T-SQL Equivalent of .NET TransactionScopeOption.Suppress

In my .NET code, inside a database transaction (using TransactionScope), I could include a nested block with TransactionScopeOption.Suppress, which ensures that the commands inside the nested block are committed even if the outer block rolls back.
Following is a code sample:
using (TransactionScope txnScope = new TransactionScope(TransactionScopeOption.Required))
{
db.ExecuteNonQuery(CommandType.Text, "Insert Into Business(Value) Values('Some Value')");
using (TransactionScope txnLogging = new TransactionScope(TransactionScopeOption.Suppress))
{
db.ExecuteNonQuery(CommandType.Text, "Insert Into Logging(LogMsg) Values('Log Message')");
txnLogging.Complete();
}
// Something goes wrong here. Logging is still committed
txnScope.Complete();
}
I was trying to find if this could be done in T-SQL. A few people have recommended OPENROWSET, but it doesn't look very 'elegant' to use. Besides, I think it is a bad idea to put connection information in T-SQL code.
I've used SQL Service Broker in the past, but it also supports transactional messaging, which means a message is not posted to the queue until the database transaction is committed.
My requirement: our application stored procedures are being fired by some third party application, within an implicit transaction initiated outside the stored procedure. I want to be able to catch and log any errors (in a database table in the same database) within my stored procedures. I need to re-throw the exception to let the third party app roll back the transaction, and for it to know that the operation has failed (and thus do whatever is required in case of a failure).
You can set up a loopback linked server with the 'remote proc transaction promotion' option set to false and then access it in TSQL, or use a CLR procedure in SQL Server to create a new connection outside the transaction and do your work.
Both methods suggested in How to create an autonomous transaction in SQL Server 2008.
Both methods involve creating new connections. There is an open connect item requesting this functionality be provided natively.
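For the CLR route, a sketch of what such a procedure might look like (the connection string and the Logging table are placeholders; the assembly would need at least EXTERNAL_ACCESS permission, and Enlist=false keeps the new connection from enlisting in the caller's transaction):
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

public static class AutonomousLogging
{
    [SqlProcedure]
    public static void LogOutsideTransaction(SqlString message)
    {
        // A full (non-context) connection with Enlist=false stays outside the
        // caller's transaction, so the log row survives a later rollback.
        using (var conn = new SqlConnection(
            "Data Source=.;Initial Catalog=MyDb;Integrated Security=true;Enlist=false"))
        using (var cmd = new SqlCommand(
            "INSERT INTO dbo.Logging (LogMsg) VALUES (@msg)", conn))
        {
            cmd.Parameters.AddWithValue("@msg", message.Value);
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}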
Values in a table variable exist beyond a ROLLBACK.
So in the following example, all the rows that were going to be deleted can be inserted into a persisted table and queried later on thanks to a combination of OUTPUT and table variables.
-- First, create our table
CREATE TABLE [dbo].[DateTest] ([Date_Test_Id] INT IDENTITY(1, 1), [Test_Date] datetime2(3));
-- Populate it with 15,000,000 rows
-- from 1st Jan 1900 to 1st Jan 2017.
INSERT INTO [dbo].[DateTest] ([Test_Date])
SELECT
TOP (15000000)
DATEADD(DAY, 0, ABS(CHECKSUM(NEWID())) % 42734)
FROM [sys].[messages] AS [m1]
CROSS JOIN [sys].[messages] AS [m2];
BEGIN TRAN;
BEGIN TRY
DECLARE @logger TABLE ([Date_Test_Id] INT, [Test_Date] DATETIME);
-- Delete every 1000th row
DELETE FROM [dbo].[DateTest]
OUTPUT deleted.Date_Test_Id, deleted.Test_Date INTO @logger
WHERE [Date_Test_Id] % 1000 = 0;
-- Make it fail
SELECT 1/0
-- So this will never happen
COMMIT TRANSACTION;
END TRY
BEGIN CATCH
ROLLBACK TRAN
SELECT * INTO dbo.logger FROM @logger;
END CATCH;
SELECT * FROM dbo.logger;
DROP TABLE dbo.logger;

Should you make multiple insert calls or pass XML?

I have an account creation process and basically when the user signs up, I have to make entries in multiple tables, namely User, Profile, and Addresses. There will be 1 entry in the User table, 1 entry in Profile, and 2-3 entries in the Address table. So, at most there will be 5 entries. My question is: should I pass an XML representation of this to my stored procedure and parse it there, or should I create a transaction object in my C# code, keep the connection open, and insert the addresses one by one in a loop?
How do you approach this scenario? Can making multiple calls degrade performance even though the connection is open?
No offence, but you're overthinking this.
Gather your information; when you have it all together, create a transaction and insert the new rows one at a time. There's no performance hit here, as the transaction will be short-lived.
A problem would arise if you created the transaction on the connection, inserted the user row, then waited for the user to enter more profile information, inserted that, then waited for them to add address information, and then inserted that. DO NOT DO THIS - it is a needlessly long-running transaction and will create problems.
However, your scenario (where you have all the data) is a correct use of a transaction: it ensures your data integrity, will not put any strain on your database, and will not - on its own - create deadlocks.
Hope this helps.
P.S. The drawback of the XML approach is the added complexity: your code needs to know the schema of the XML, and your stored procedure needs to know it too. The stored procedure also has the added complexity of parsing the XML before inserting the rows. I really don't see the advantage of the extra complexity for what is a simple, short-running transaction.
If you want to insert records into multiple tables, then using an XML parameter is a complex method. Creating the XML in .NET and extracting records from the XML for three different tables is complex in SQL Server.
Executing the queries within a transaction is an easy approach, but there is some performance cost in switching back and forth between .NET code and SQL Server.
The best approach is to use table-valued parameters in the stored procedure. Create three DataTables in .NET code and pass them to the stored procedure.
--Create types TargetUDT1, TargetUDT2 and TargetUDT3 for each table, with all the fields that need to be inserted
CREATE TYPE [TargetUDT1] AS TABLE
(
[FirstName] [varchar](100)NOT NULL,
[LastName] [varchar](100)NOT NULL,
[Email] [varchar](200) NOT NULL
)
--Now write the stored procedure as follows.
CREATE PROCEDURE AddToTarget(
@TargetUDT1 TargetUDT1 READONLY,
@TargetUDT2 TargetUDT2 READONLY,
@TargetUDT3 TargetUDT3 READONLY)
AS
BEGIN
INSERT INTO [Target1]
SELECT * FROM @TargetUDT1
INSERT INTO [Target2]
SELECT * FROM @TargetUDT2
INSERT INTO [Target3]
SELECT * FROM @TargetUDT3
END
In .NET, create the three DataTables, fill in the values, and call the stored procedure normally.
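A rough sketch of the .NET side, assuming the type and procedure names above (the connection string, the column values, and the other two parameters - elided here - are placeholders):
using System.Data;
using System.Data.SqlClient;

// One DataTable per table type; columns must match the type definition.
var users = new DataTable();
users.Columns.Add("FirstName", typeof(string));
users.Columns.Add("LastName", typeof(string));
users.Columns.Add("Email", typeof(string));
users.Rows.Add("John", "Doe", "john@example.com");

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand("AddToTarget", conn))
{
    cmd.CommandType = CommandType.StoredProcedure;

    var p1 = cmd.Parameters.AddWithValue("@TargetUDT1", users);
    p1.SqlDbType = SqlDbType.Structured;
    p1.TypeName = "dbo.TargetUDT1";

    // ... add @TargetUDT2 and @TargetUDT3 the same way ...

    conn.Open();
    cmd.ExecuteNonQuery();
}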
For example, assuming your XML is as below:
<StoredProcedure>
<User>
<UserName></UserName>
</User>
<Profile>
<FirstName></FirstName>
</Profile>
<Address>
<Data></Data>
<Data></Data>
<Data></Data>
</Address>
</StoredProcedure>
this would be your stored procedure:
INSERT INTO Users (UserName)
SELECT UserName FROM OPENXML(@idoc, 'StoredProcedure/User', 2)
WITH ( UserName NVARCHAR(256) )
where the following provides the @idoc variable value, and @doc is the input parameter of the stored procedure:
DECLARE @idoc INT
--Create an internal representation of the XML document.
EXEC sp_xml_preparedocument @idoc OUTPUT, @doc
Using a similar technique you would run the 3 inserts in a single stored procedure. Note that it is a single call to the database, and multiple Address elements will be inserted in that single call to the stored procedure.
Update
Just so as not to mislead you, here is a complete stored procedure so you can see what you are going to do:
USE [DBNAME]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER OFF
GO
CREATE PROCEDURE [dbo].[procedure_name]
@doc [ntext]
WITH EXECUTE AS CALLER
AS
DECLARE @idoc INT
DECLARE @RowCount INT
DECLARE @ErrorProfile INT
SET @ErrorProfile = 0
--Create an internal representation of the XML document.
EXEC sp_xml_preparedocument @idoc OUTPUT, @doc
BEGIN TRANSACTION
INSERT INTO Users (UserName)
SELECT UserName FROM OPENXML(@idoc,'StoredProcedure/User',2)
WITH ( UserName NVARCHAR(256) )
-- Insert Address
-- Insert Profile
SELECT @ErrorProfile = @@ERROR
IF @ErrorProfile = 0
BEGIN
COMMIT TRAN
END
ELSE
BEGIN
ROLLBACK TRAN
END
EXEC sp_xml_removedocument @idoc
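For completeness, calling such a procedure from C# might look roughly like this (procedure and parameter names are taken from the example above; in practice the XML string would be built from your User/Profile/Address data rather than hard-coded):
using System.Data;
using System.Data.SqlClient;

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand("dbo.procedure_name", conn))
{
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.Parameters.Add("@doc", SqlDbType.NText).Value =
        "<StoredProcedure>" +
        "  <User><UserName>jdoe</UserName></User>" +
        "  <Profile><FirstName>John</FirstName></Profile>" +
        "  <Address><Data>Home address</Data><Data>Work address</Data></Address>" +
        "</StoredProcedure>";
    conn.Open();
    cmd.ExecuteNonQuery();
}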
Have you noticed any actual performance problems? What you are trying to do is very straightforward, and many applications do it day in, day out. Be careful not to be drawn into premature optimization.
Database inserts should be very cheap. As you have suggested, create a new transaction scope, open your connection, run your inserts, commit the transaction, and finally dispose of everything.
using (var tran = new TransactionScope())
using (var conn = new SqlConnection(YourConnectionString))
using (var insertCommand1 = conn.CreateCommand())
using (var insertCommand2 = conn.CreateCommand())
{
insertCommand1.CommandText = "..."; // SQL to insert
insertCommand2.CommandText = "..."; // SQL to insert
conn.Open();
insertCommand1.ExecuteNonQuery();
insertCommand2.ExecuteNonQuery();
tran.Complete();
}
Bundling all your logic into a stored procedure and using XML gives you added complications: you will need additional logic in your database, you now have to transform your entities into an XML blob, and your code becomes harder to unit test.
There are a number of things you can do to make the code easier to use. The first step would be to push your database logic into a reusable database layer and use the concept of a repository to read and write your objects from the database.
You could of course make your life a lot easier and have a look at any of the ORM (Object-relational mapping) libraries that are available. They take away the pain of talking to the database and handle that for you.

Fastest import of a CSV to a database table

I have implemented import functionality which takes data from a CSV file in an ASP.NET application. The size of the file can vary from a few KB to a maximum of 10 MB.
However, when an import occurs and the file has more than 50,000 records, it takes around 20 minutes, which is way too long. I need to perform an import of around 300,000 records within a timespan of 2-3 minutes.
I know that the import to a database also depends on the physical memory of the DB server. I create insert scripts in bulk and execute them. I also know that using SqlBulkCopy would be another option, but in my case it is not just inserts of products that take place but also updates and deletes - there is a field called "FUNCTION CODE" which decides whether to Insert, Update or Delete.
Any suggestions as to how to go about this would be greatly appreciated.
One approach would be to implement multiple threads which carry out the processing simultaneously, but I have never implemented threading to date and hence am not aware of the complications I would incur by doing so.
Thanks & Regards,
Francis P.
SqlBulkCopy is definitely going to be fastest. I would approach this by inserting the data into a temporary table on the database. Once the data is in the temp table, you could use SQL to merge/insert/delete accordingly.
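A rough sketch of that idea: SqlBulkCopy pushes the parsed CSV rows into a staging table in one shot, and a single batch of set-based statements then applies the FUNCTION CODE logic. All table and column names here are illustrative, not taken from the question:
using System.Data;
using System.Data.SqlClient;

static void ImportCsvRows(DataTable csvRows, string connectionString)
{
    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();

        // 1. One bulk operation to get everything into the staging table.
        using (var bulk = new SqlBulkCopy(conn) { DestinationTableName = "dbo.ProductStaging" })
        {
            bulk.BatchSize = 5000;
            bulk.WriteToServer(csvRows);
        }

        // 2. Apply the FUNCTION CODE logic set-based, in one round trip.
        const string applyChanges = @"
UPDATE p SET p.Name = s.Name, p.Price = s.Price
FROM dbo.Product p JOIN dbo.ProductStaging s ON s.ProductId = p.ProductId
WHERE s.FunctionCode = 'U';

DELETE p
FROM dbo.Product p JOIN dbo.ProductStaging s ON s.ProductId = p.ProductId
WHERE s.FunctionCode = 'D';

INSERT dbo.Product (ProductId, Name, Price)
SELECT s.ProductId, s.Name, s.Price
FROM dbo.ProductStaging s
WHERE s.FunctionCode = 'I';

TRUNCATE TABLE dbo.ProductStaging;";

        using (var cmd = new SqlCommand(applyChanges, conn))
        {
            cmd.ExecuteNonQuery();
        }
    }
}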
I guess you are using SQL Server...
If you are using 2005/2008, consider using SSIS to process the file. See TechNet.
Importing a huge amount of data within the ASP.NET process is not the best thing you can do. You might upload the file and start a separate process that does the magic for you.
If this is a repeated process, the file is uploaded via ASP.NET, and you are doing some decision making on the data to decide between insert/update/delete, then try out http://www.codeproject.com/KB/database/CsvReader.aspx - it is a fast CSV reader, quite quick and economical with memory.
You are doing all your database queries with one connection, sequentially. So for every insert/update/delete you are sending the command over the wire, waiting for the DB to do its thing, and then waking up again when something is sent back.
Databases are optimized for heavy parallel access. So there are 2 easy routes for a significant speedup:
Open X connections to the database (you will have to tune X, but just start with 5) and either spin up 5 threads which each do a chunk of the same work you were doing,
or use asynchronous calls and, when a callback arrives, fire off the next query.
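A sketch of the multi-connection route with threads (the statements and connection string are placeholders; tune the degree of parallelism as suggested above):
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using System.Threading.Tasks;

static class ParallelImporter
{
    // Splits the statements into `degree` chunks and runs each chunk
    // on its own connection in parallel.
    public static void Run(IList<string> statements, string connectionString, int degree = 5)
    {
        var chunks = statements
            .Select((sql, index) => new { sql, index })
            .GroupBy(x => x.index % degree, x => x.sql);

        Parallel.ForEach(chunks, chunk =>
        {
            using (var conn = new SqlConnection(connectionString))
            {
                conn.Open();
                foreach (var sql in chunk)
                {
                    using (var cmd = new SqlCommand(sql, conn))
                    {
                        cmd.ExecuteNonQuery();
                    }
                }
            }
        });
    }
}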
I suggest using the XML functionality in SQL Server 2005/2008, which will allow you to bulk insert and bulk update. I'd take the following approach:
Process the entire file into an in-memory data structure.
Create a single XML document from this structure to pass to a stored proc.
Create a stored proc to load data from the XML document into a temporary table, then perform the inserts and updates. See below for guidance on creating the stored proc.
There are numerous advantages to this approach:
The entire operation is completed in one database call, although if your dataset is really large you may want to batch it.
You can wrap all the database writes into a single transaction easily, and roll back if anything fails.
You are not using any dynamic SQL, which could have posed a security risk.
You can return the IDs of the inserted, updated and/or deleted records using the OUTPUT clause.
In terms of the stored proc you will need, something like the following should work:
CREATE PROCEDURE MyBulkUpdater
(
@p_XmlData VARCHAR(MAX)
)
AS
DECLARE @hDoc INT
EXEC sp_xml_preparedocument @hDoc OUTPUT, @p_XmlData
-- Temporary table, should contain the same schema as the table you want to update
CREATE TABLE #MyTempTable
(
-- ...
)
INSERT INTO #MyTempTable
(
[Field1],
[Field2]
)
SELECT
XMLData.Field1,
XMLData.Field2
FROM OPENXML (@hDoc, 'ROOT/MyRealTable', 1)
WITH
(
[Field1] int,
[Field2] varchar(50),
[__ORDERBY] int
) AS XMLData
EXEC sp_xml_removedocument @hDoc
Now you can simply insert, update and delete your real table from your temporary table as required, e.g.:
INSERT INTO MyRealTable (Field1, Field2)
SELECT Field1, Field2
FROM #MyTempTable
WHERE ...
UPDATE rt
SET rt.Field2 = tt.Field2
FROM MyRealTable rt
JOIN #MyTempTable tt ON tt.Field1 = rt.Field1
WHERE ...
For an example of the XML you need to pass in, you can do:
SELECT TOP 1 *, 0 AS __ORDERBY FROM MyRealTable AS MyRealTable FOR XML AUTO, ROOT('ROOT')
For more info, see OPENXML, sp_xml_preparedocument and sp_xml_removedocument.
