How to get the IDENTITY values after SqlBulkCopy? - C#

I have to get the IDENTITY values from a table after running SqlBulkCopy into that same table. The volume of data could be thousands of records.
Can someone help me out with this?

Disclaimer: I'm the owner of the project Bulk Operations
In short, this project overcomes SqlBulkCopy limitations by adding MUST-HAVE features like outputting the inserted identity values.
Under the hood, it uses SqlBulkCopy and a similar method to the one in Mr Moose's answer.
var bulk = new BulkOperation(connection);
// Output Identity Value
bulk.ColumnMappings.Add("CustomerID", ColumnMappingDirectionType.Output);
// Map Column
bulk.ColumnMappings.Add("Code");
bulk.ColumnMappings.Add("Name");
bulk.ColumnMappings.Add("Email");
bulk.BulkInsert(dt);
EDIT: Answer to a comment
Can I simply get an IList back? I see it's saved back in the customers table, but there is no variable where I can get hold of it; can you please help with that, so I can insert into the Orders.CustomerID column?
It depends. You can keep a reference to the Customer DataRow in a column named CustomerRef in the Order DataTable.
Once your customers are merged, you can easily populate the CustomerID column from the CustomerRef column in your Order DataTable.
Here is an example of what I'm trying to say: https://dotnetfiddle.net/Hw5rf3

I've used a solution similar to the one from Marc Gravell, in that it is useful to first import into a temp table.
I've also used MERGE with OUTPUT, as described by Jamie Thomson in this post, to match the rows inserted from my temp table with the ids generated by the IDENTITY column of the table I want to insert into.
This is particularly useful when you need to use that id as a foreign key reference in other tables you are populating.
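Here is a minimal sketch of that pattern, assuming a hypothetical dbo.Customer target table and a #CustomerStaging temp table that carries its own staging key (all names are made up for illustration). MERGE is used instead of a plain INSERT because its OUTPUT clause can reference source columns, which is what lets you pair each staging row with the identity it received:
CREATE TABLE #CustomerStaging (StagingKey int NOT NULL, Name varchar(100) NOT NULL)
DECLARE @IdMap TABLE (StagingKey int NOT NULL, CustomerID int NOT NULL)
MERGE dbo.Customer AS tgt
USING #CustomerStaging AS src
ON 1 = 0 -- never matches, so every staging row is inserted
WHEN NOT MATCHED THEN
INSERT (Name) VALUES (src.Name)
OUTPUT src.StagingKey, INSERTED.CustomerID INTO @IdMap (StagingKey, CustomerID);
-- @IdMap now maps each staging row to its new identity,
-- ready to be used as a foreign key when populating child tables
SELECT StagingKey, CustomerID FROM @IdMap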

Try this
CREATE TABLE #temp
(
DataRow varchar(max)
)
BULK INSERT #Temp FROM 'C:\tt.txt'
ALTER TABLE #temp
ADD id INT IDENTITY(1,1) NOT NULL
SELECT * FROM #temp

-- dummy schema
CREATE TABLE TMP (data varchar(max))
CREATE TABLE [Table1] (id int not null identity(1,1), data varchar(max))
CREATE TABLE [Table2] (id int not null identity(1,1), id1 int not null, data varchar(max))
-- imagine this is the SqlBulkCopy
INSERT TMP VALUES('abc')
INSERT TMP VALUES('def')
INSERT TMP VALUES('ghi')
-- now push into the real tables
INSERT [Table1]
OUTPUT INSERTED.id, INSERTED.data INTO [Table2](id1,data)
SELECT data FROM TMP

Related

Bulk Insert With Auto Increment - No Identity column

I am trying to implement a bulk insert of data from a DataTable. In my MS SQL destination table I have a primary key column that is not an IDENTITY column, so I have to increment it manually. But that is not possible in code because there will be multiple threads working on the same table. Please give me a suggestion if you have any.
public void BulkInsert(DataTable dtTable)
{
DataTable dtProductSold = dtTable;
//creating the SqlBulkCopy object and making sure it is disposed
using (SqlBulkCopy objbulk = new SqlBulkCopy(ConStr.ToString()))
{
//assigning the destination table name
objbulk.DestinationTableName = "BatchData_InvReportMapping";
//mapping the columns (source column, destination column)
objbulk.ColumnMappings.Add("InvPK", "InvPK");
objbulk.ColumnMappings.Add("DateValue", "DateValue");
objbulk.ColumnMappings.Add("TextValue", "TextValue");
objbulk.ColumnMappings.Add("NumericValue", "NumericValue");
objbulk.ColumnMappings.Add("ErrorValue", "ErrorValue");
//inserting the bulk records into the database
objbulk.WriteToServer(dtProductSold);
}
}
Thanks in advance,
This is too long for a comment.
If you have a primary key column, then you need to take responsibility for its being unique and non-NULL when you insert rows. SQL Server offers a very handy mechanism to help with this, which is the identity column.
If you do not have an identity column, then you basically have two options:
Load data that has a valid primary key column.
Create a trigger that assigns the value when rows are loaded in.
Oh, wait. The default option for bulk insert is not to fire triggers, so the second choice really isn't a good option.
Instead, modify the table to have an identity primary key column. Then define a view on the table without the primary key and do the bulk insert into the view. The primary key will then be assigned automatically.
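Here is a rough sketch of that view approach, borrowing column names from the question (the exact definitions are assumptions):
-- give the table an IDENTITY primary key
CREATE TABLE dbo.BatchData_InvReportMapping
(
InvPK int IDENTITY(1,1) NOT NULL PRIMARY KEY,
DateValue datetime NULL,
TextValue varchar(200) NULL,
NumericValue decimal(18, 4) NULL,
ErrorValue varchar(200) NULL
)
GO
-- a view that hides the identity column
CREATE VIEW dbo.BatchData_InvReportMapping_Load
AS
SELECT DateValue, TextValue, NumericValue, ErrorValue
FROM dbo.BatchData_InvReportMapping
GO
-- the load targets the view and the base table assigns InvPK automatically
INSERT dbo.BatchData_InvReportMapping_Load (DateValue, TextValue, NumericValue, ErrorValue)
VALUES (GETDATE(), 'sample', 1.0, NULL)
With SqlBulkCopy, DestinationTableName would point at the view instead of the table.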
EDIT:
There is a third option, which might be feasible. Load the data into a staging table. Then insert from the staging table into the final table, calculating the primary key value. Something like this:
insert into finaltable (pk, . . .)
select m.maxpk + seqnum, . . . .
from (select row_number() over (order by (select null)) as seqnum,
. . .
from stagingtable
) s cross join
(select max(pk) as maxpk
from finaltable
) m;
I had one idea.
Generally we use tables to store records; even if you insert the data through the front end, it is ultimately stored in a table. So I suggest using a sequence together with an insert trigger on the table: when you insert data into the table, the trigger fires first, the sequence is incremented, and the incremented value is stored along with the other values in the row. Just try this. In Oracle 11g we don't have IDENTITY, so we use a sequence and an insert trigger for the identity column.
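A minimal Oracle 11g sketch of that sequence-plus-trigger pattern (the table, column, and sequence names are hypothetical):
CREATE SEQUENCE batch_data_seq START WITH 1 INCREMENT BY 1;
CREATE OR REPLACE TRIGGER batch_data_bi
BEFORE INSERT ON batch_data
FOR EACH ROW
BEGIN
-- assign the next sequence value when no key was supplied
IF :NEW.inv_pk IS NULL THEN
SELECT batch_data_seq.NEXTVAL INTO :NEW.inv_pk FROM dual;
END IF;
END;
/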
Create a table called Ids, with columns TableName VARCHAR(50) and Id INT.
When you want to generate your ids, read the relevant row and increment it by the number of rows you want to insert, within the same transaction.
You can then bulk insert those rows whenever you want, without worrying about other threads inserting them.
This is similar to how NHibernate's HiLo generator works; a minimal sketch follows.
http://weblogs.asp.net/ricardoperes/making-better-use-of-the-nhibernate-hilo-generator
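A minimal T-SQL sketch of that reservation table (the names are assumptions):
CREATE TABLE dbo.Ids (TableName varchar(50) NOT NULL PRIMARY KEY, Id int NOT NULL)
DECLARE @rowsToInsert int = 5000
DECLARE @lastId int
BEGIN TRAN
-- the UPDATE takes an exclusive lock on the counter row, so concurrent
-- callers queue behind it until this transaction commits
UPDATE dbo.Ids
SET @lastId = Id = Id + @rowsToInsert
WHERE TableName = 'BatchData_InvReportMapping'
COMMIT TRAN
-- the reserved block is (@lastId - @rowsToInsert + 1) through @lastId;
-- number the DataTable rows with these values before calling SqlBulkCopy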

Avoiding duplicate record insertion into SQL table

I have a windows service which basically watches a folder for any CSV file. Each record in the CSV file is inserted into a SQL table. If the same CSV file is put in that folder, it can lead to duplicate record entries in the table. How can I avoid duplicate insertions into the SQL table?
Try INSERT ... WHERE NOT EXISTS, where a, b and c are the relevant columns and @a, @b and @c are the relevant values.
INSERT INTO table
(
a,
b,
c
)
VALUES
(
@a,
@b,
@c
)
WHERE NOT EXISTS
(
SELECT 0 FROM table WHERE a = @a AND b = @b AND c = @c
)
The accepted answer has a syntax error and is not compatible with relational databases like MySQL.
Specifically, the following is not compatible with most databases:
values(...) where not exists
While the following is generic SQL, and is compatible with all databases:
select ... where not exists
Given that, if you want to insert a single record into a table after checking if it already exists, you can do a simple select with a where not exists clause as part of your insert statement, like this:
INSERT
INTO table_name (
primary_col,
col_1,
col_2
)
SELECT 1234,
'val_1',
'val_2'
WHERE NOT EXISTS (
SELECT 1
FROM table_name
WHERE primary_col=1234
);
Simply pass all values with the select keyword, and put the primary or unique key condition in the where clause.
Problems with the answers using WHERE NOT EXISTS are:
performance: row-by-row processing potentially requires a very large number of scans against the target table
NULL handling: for every column that might contain NULLs you have to write the matching condition in a more complicated way, like
(a = @a OR (a IS NULL AND @a IS NULL)).
Repeat that for 10 columns and voilà, you hate SQL :)
A better answer takes advantage of the set-processing capabilities that relational databases provide (in short: never use row-by-row processing in SQL if you can avoid it; if you can't, think again and avoid it anyway).
So for the answer:
load (all) data into a temporary table (or a staging table that can be safely truncated before load)
run the insert in a "set"-way:
INSERT INTO table (<columns>)
select <columns> from #temptab
EXCEPT
select <columns> from table
Keep in mind that EXCEPT deals safely with NULLs for every kind of column ;) and that the optimizer can choose a high-performance join type (hash, loop, or merge join) for the matching, depending on the available indexes and table statistics.

Procedure returning a list of identity [duplicate]

I am inserting records through a query similar to this one:
insert into tbl_xyz select field1 from tbl_abc
Now I would like to retrieve the newly generated IDENTITY values of the inserted records. How do I do this with the minimum amount of locking and maximum reliability?
You can get this information using the OUTPUT clause.
You can output your information to a temp target table or view.
Here's an example:
DECLARE @InsertedIDs TABLE (ID bigint)
INSERT into DestTable (col1, col2, col3, col4)
OUTPUT INSERTED.ID INTO @InsertedIDs
SELECT col1, col2, col3, col4 FROM SourceTable
You can then query the table variable @InsertedIDs for your inserted IDs.
@@IDENTITY will return the last inserted IDENTITY value, so you have two possible problems:
Beware of triggers executed when inserting into tbl_xyz, as these may change the value of @@IDENTITY.
Does tbl_abc have more than one row? If so, @@IDENTITY will only return the identity value of the last row.
Issue 1 can be resolved by using SCOPE_IDENTITY() instead of @@IDENTITY.
Issue 2 is harder to resolve. Does field1 in tbl_abc define a unique record within tbl_xyz? If so, you could reselect the data from tbl_xyz with the identity column, as sketched below. There are other solutions using CURSORs, but these will be slow.
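A minimal sketch of that reselect, assuming field1 is unique in both tables and the identity column in tbl_xyz is named id:
-- pair each source row with the identity it received in tbl_xyz
SELECT x.id, x.field1
FROM tbl_xyz x
INNER JOIN tbl_abc a ON a.field1 = x.field1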
SELECT @@IDENTITY
This is how I've done it before. Not sure if this will meet the latter half of your post though.
EDIT
Found this link too, but not sure if it is the same...
How to insert multiple records and get the identity value?
As far as I know, you can't really do this with straight SQL in the same script. But you could create an INSERT trigger. Now, I hate triggers, but it's one way of doing it.
Depending on what you are trying to do, you might want to insert the rows into a temp table or table variable first, and deal with the result set that way. Hopefully, there is a unique column that you can link to.
You could also lock the table, get the max key, insert your rows, and then get your max key again and do a range.
Trigger:
--Use the Inserted table. This contains all of the inserted rows.
SELECT * FROM Inserted
Temp Table:
select field1, unique_col into #temp from tbl_abc
insert into tbl_xyz (field1, unique_col) select field1, unique_col from tbl_abc
--This could be an update, or a cursor, or whatever you want to do
SELECT * FROM tbl_xyz WHERE EXISTS (SELECT top 1 unique_col FROM #temp WHERE unique_col = tbl_xyz.unique_col)
Key Range:
Declare @minkey as int, @maxkey as int
BEGIN TRAN --You have to lock the table for this to work
--key is the name of your identity column
SELECT @minkey = MAX(key) FROM tbl_xyz
insert into tbl_xyz select field1 from tbl_abc
SELECT @maxkey = MAX(key) FROM tbl_xyz
COMMIT TRAN
SELECT * FROM tbl_xyz WHERE key > @minkey AND key <= @maxkey

Is it possible to get new values for Id (IDENTITY) before inserting data in a table?

Is it possible to get new values for Id (IDENTITY) before inserting data in a table?
Is it possible to write something like this:
INSERT INTO Table1
SELECT *GET_NEW_IDENTITY*, Field1, Field2 FROM Table2
I need the values of Id because I want to insert data into Table1 and, just after, insert data into another table which has a foreign key linked to Table1 (via Id).
IDENT_CURRENT. Returns the last identity value generated for a specified table or view. The last identity value generated can be for any session and any scope.
SCOPE_IDENTITY. Returns the last identity value inserted into an identity column in the same scope. A scope is a module: a stored procedure, trigger, function, or batch.
OUTPUT. Returns information from, or expressions based on, each row affected by an INSERT, UPDATE, DELETE, or MERGE statement. [...] The OUTPUT clause may be useful to retrieve the value of identity or computed columns after an INSERT or UPDATE operation.
You can also have the INSERT statement return the newly inserted value for later use. For example:
create table demo( Id int identity primary key, data varchar(10))
go
insert into demo(data) output inserted.Id values('something')
No, because it is the act of adding a row which creates the new identity value.
To do what you want,
SELECT newid = @@IDENTITY
just after the INSERT
Why would you need to get the identity value before doing the insert? Just do the insert into Table1, read SCOPE_IDENTITY(), and then use the resulting Id value for your insert into the related table.
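A minimal sketch of that order of operations for a single row (ChildTable and its columns are hypothetical):
DECLARE @NewId int
-- insert the parent row first
INSERT INTO Table1 (Field1, Field2)
SELECT TOP (1) Field1, Field2 FROM Table2
-- capture the identity generated by the insert above
SET @NewId = SCOPE_IDENTITY()
-- use it as the foreign key in the related table
INSERT INTO ChildTable (Table1Id, SomeValue)
VALUES (@NewId, 'example')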
This is just a quick demo. You can use the new IDs to insert into or update another table, run a query, etc. Hopefully I did not introduce errors into the script while formatting and editing the post.
-- run [1] before this script once to have environment
--create temporary table once if not dropped after
-- really only ID field is needed, the others are for illustration
create table #temp_id (Id int, d1 int, d2 int)
select * from Table2; -- read-only source data, filled once by the setup script [1]
select * from Table1;--interesting for following runs
insert into Table1
OUTPUT INSERTED.id
-- really only ID is needed, the rest is for illustration
, inserted.d1, inserted.d2 INTO #temp_id
select field1, field2, null -- null to be merged later
-- or inserted/updated into another table
from Table2;
select * from Table1;
select * from #temp_id;
MERGE Table1 AS TARGET
USING #temp_id AS SOURCE
ON (TARGET.id = SOURCE.id)
WHEN MATCHED
--additional AND/OR conditions are redundant if Table1.Id is the PK
THEN
UPDATE SET TARGET.IDnew = SOURCE.id;
select * from Table1;
--drop table #temp_id
--drop table table1
--drop table table2
[1]
Reproducing the tables from question and filling with data
create table Table1( Id int identity primary key, d1 int, d2 int, IDnew int)
create table Table2( field1 int, field2 int)
insert into table2 values(111,222)
insert into table2 values(333,444)
IDENT_CURRENT('tableName') returns the current value of the identity for the given table. The identity value that will be assigned on Insert will be IDENT_CURRENT('tableName') + IDENT_INCR('tableName').
SELECT IDENT_CURRENT('tableName') + IDENT_INCR('tableName')

How can I Insert/Update into two related tables in one command?

A database exists with two tables:
Data_t: DataID is the primary key, an IDENTITY(1,1) column. It also has another field, 'LEFT' TINYINT.
Data_Link_t: DataID is both the PK and an FK, and must exist in Data_t. It also has another field, 'RIGHT' SMALLINT.
Coming from a Microsoft Access environment into C# and SQL Server, I'm looking for a good method of importing a record into this relationship.
The record contains information that belongs on both sides of this join (possibly inserting/updating upwards of 5000 records at once). Bonus points for processing the entire batch in some kind of LINQ list-type command, but even if this is done record by record, the key goal is that BOTH sides of the record are processed in the same step.
There are countless approaches, and I'm looking at too many to decide which way to go, so I thought it faster to ask the general public. Is LINQ an option for inserting/updating a big list like this with LINQ to SQL? Should I go record by record? What approach should I use to add a record to normalized tables that, when joined, make up the full record?
Sounds like a case where I'd write a small stored proc and call that from C#, e.g. as a function on my LINQ to SQL data context object.
Something like:
CREATE PROCEDURE dbo.InsertData(@Left TINYINT, @Right SMALLINT)
AS BEGIN
DECLARE @DataID INT
INSERT INTO dbo.Data_t([LEFT]) VALUES(@Left)
SELECT @DataID = SCOPE_IDENTITY();
INSERT INTO dbo.Data_Link_T(DataID, [RIGHT]) VALUES(@DataID, @Right)
END
If you import that into your data context, you could call this something like:
using(YourDataContext ctx = new YourDataContext())
{
foreach(YourObjectType obj in YourListOfObjects)
{
ctx.InsertData(obj.Left, obj.Right);
}
}
and let the stored proc handle all the rest (all the details, like determining and using the IDENTITY from the first table in the second one) for you.
I have never tried it myself, but you might be able to do exactly what you are asking for by creating an updateable view and then inserting records into the view.
UPDATE
I just tried it, and it doesn't look like it will work.
Msg 4405, Level 16, State 1, Line 1
View or function 'Data_t_and_Data_Link_t' is not updatable because the modification affects multiple base tables.
I guess this is just one more thing for all the Relational Database Theory purists to hate about SQL Server.
ANOTHER UPDATE
Further research has found a way to do it. It can be done with a view and an "instead of" trigger.
create table Data_t
(
DataID int not null identity primary key,
[LEFT] tinyint,
)
GO
create table Data_Link_t
(
DataID int not null primary key foreign key references Data_T (DataID),
[RIGHT] smallint,
)
GO
create view Data_t_and_Data_Link_t
as
select
d.DataID,
d.[LEFT],
dl.[RIGHT]
from
Data_t d
inner join Data_Link_t dl on dl.DataID = d.DataID
GO
create trigger trgInsData_t_and_Data_Link_t on Data_t_and_Data_Link_T
instead of insert
as
insert into Data_t ([LEFT]) select [LEFT] from inserted
insert into Data_Link_t (DataID, [RIGHT]) select @@IDENTITY, [RIGHT] from inserted
go
insert into Data_t_and_Data_Link_t ([LEFT],[RIGHT]) values (1, 2)
