I am trying to implement a bulk insert of data from a DataTable. In my SQL Server destination table I have a primary key column that is not an identity column, so I have to increment it manually. But that is not possible in code, because multiple threads will be inserting into the same table. Please give me suggestions if you have any.
public void BulkInsert(DataTable dtTable)
{
    DataTable dtProductSold = dtTable;

    // create the SqlBulkCopy object from the connection string
    using (SqlBulkCopy objbulk = new SqlBulkCopy(ConStr.ToString()))
    {
        // assign the destination table name
        objbulk.DestinationTableName = "BatchData_InvReportMapping";

        // map source columns to destination columns
        objbulk.ColumnMappings.Add("InvPK", "InvPK");
        objbulk.ColumnMappings.Add("DateValue", "DateValue");
        objbulk.ColumnMappings.Add("TextValue", "TextValue");
        objbulk.ColumnMappings.Add("NumericValue", "NumericValue");
        objbulk.ColumnMappings.Add("ErrorValue", "ErrorValue");

        // bulk insert the records into the database
        objbulk.WriteToServer(dtProductSold);
    }
}
Thanks in advance,
This is too long for a comment.
If you have a primary key column, then you need to take responsibility for its being unique and non-NULL when you insert rows. SQL Server offers a very handy mechanism to help with this, which is the identity column.
If you do not have an identity, then you basically have two options:
Load data that has a valid primary key column.
Create a trigger that assigns the value when rows are loaded in.
Oh, wait. The default option for bulk insert is not to fire triggers, so the second choice really isn't a good option.
Instead, modify the table to have an identity primary key column. Then define a view on the table without the primary key and do the bulk insert into the view. The primary key will then be assigned automatically.
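A minimal sketch of that view approach, using the table from the question (the column types here are guesses):

create table dbo.BatchData_InvReportMapping
(
    InvPK int identity(1,1) primary key,
    DateValue datetime,
    TextValue varchar(255),
    NumericValue decimal(18, 4),
    ErrorValue varchar(255)
);
GO

-- the view exposes every column except the identity
create view dbo.BatchData_InvReportMapping_Load
as
    select DateValue, TextValue, NumericValue, ErrorValue
    from dbo.BatchData_InvReportMapping;
GO

Point SqlBulkCopy's DestinationTableName at the view, drop the InvPK column mapping, and the identity hands out the keys safely across threads.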
EDIT:
There is a third option, which might be feasible. Load the data into a staging table. Then insert from the staging table into the final table, calculating the primary key value. Something like this:
insert into finaltable (pk, . . .)
    select m.maxpk + s.seqnum, . . .
    from (select row_number() over (order by (select null)) as seqnum,
                 . . .
          from stagingtable
         ) s cross join
         (select max(pk) as maxpk
          from finaltable
         ) m;
I had one idea.

Generally we use tables to store records; even if you insert the data through a front end, it is ultimately stored in a table. So I suggest using a sequence with an insert trigger on the table: when you insert data into the table, the trigger fires first, the sequence is incremented, and the incremented value is stored along with the other values in the row. Just try this. In Oracle 11g we don't have identity(), which is why we use a sequence plus an insert trigger for the identity column.
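For what it's worth, SQL Server 2012 and later has sequences as well, and there a default constraint can stand in for the trigger. A rough sketch against the table from the question (the sequence and constraint names are invented):

-- on an existing table, start the sequence above the current MAX(InvPK)
create sequence dbo.InvPKSequence as int start with 1 increment by 1;
GO

alter table dbo.BatchData_InvReportMapping
    add constraint DF_BatchData_InvPK
    default (next value for dbo.InvPKSequence) for InvPK;
GO

Remove the InvPK column mapping from the SqlBulkCopy call, and the default fires for every inserted row, so the keys stay unique across threads.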
Create a table called Ids, with two columns: TableName VARCHAR(50) and Id INT.
When you want to generate your ids, read the relevant row and increment it by the number of rows you want to insert, all within the same transaction.
You can now bulk insert those rows whenever you want, without worrying about other threads taking the same values.
This is similar to how NHibernate's HiLo generator works.
http://weblogs.asp.net/ricardoperes/making-better-use-of-the-nhibernate-hilo-generator
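A sketch of the reservation step in T-SQL (using the Ids table described above; the block size is whatever you are about to insert):

declare @count int = 5000;  -- number of rows about to be inserted
declare @first int;

-- atomically grab a block of ids; the update lock keeps other threads out
update dbo.Ids
set @first = Id + 1,
    Id = Id + @count
where TableName = 'BatchData_InvReportMapping';

-- number the in-memory rows @first through @first + @count - 1,
-- then bulk insert them without fear of collisions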
We are converting database primary keys from GUIDs to auto-incremented INTs. We have data that we parse from text files and put into two C# DataTables Claim and ClaimCharge that we have been using to bulk insert into identically named tables in the database. In the database, ClaimCharge.ClaimID is a foreign key to Claim.ID and several claim charges exist for one claim.
With GUIDs we generated the Claim and ClaimCharge IDs in C#, so bulk inserting was no problem. But with INTs, I don't know what the Claim.ID will be, so I can't assign ClaimCharge.ClaimID. I need some ideas on how this could be accomplished with INTs.
For instance, if the Claim table could be manually locked against inserts, I could:
Bulk insert into alternate tables named ClaimBulkData and ClaimChargeBulkData. These tables would still use GUIDs for convenience in keeping the relationship maintained between C# and SQL.
Manually lock the Claim table against inserts (I don't know if this is possible) and get the MAX(ID).
Increment all of the data in ClaimBulkData using MAX(ID).
Associate ClaimChargeBulkData to ClaimBulkData using the newly updated INT.
Insert data into the real Claim table as a set, with IDENTITY_INSERT ON, using some kind of exception to the imaginary lock created in step 2.
Release the manually created lock against inserts on the Claim table (again, I don't know if this is possible).
Insert data into the real ClaimCharge table.
I want to avoid inserting the data one row at a time in either C# or T-SQL.
Why not just add the new auto-increment column to the master tables? You will then have both the GUID and the auto-id columns, so you can fix up the foreign key relationships (one master table at a time).
i.e.,
Assume you have Master1, Detail1, and Detail2:
alter table Master1 add ID int identity(1,1) not null
GO
alter table Detail1 add Master1ID int null
GO
alter table Detail2 add Master1ID int null
GO
Then update Detail1 and Detail2 by joining to Master1 on the old GUID key, setting the corresponding value of Master1ID in each table, as sketched below.
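That fix-up update would look something like this (OldGuid and Master1Guid are placeholder names for the existing GUID key columns):

update d
set d.Master1ID = m.ID
from Detail1 d
inner join Master1 m on m.OldGuid = d.Master1Guid;

-- and the same again for Detail2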
You can then add the foreign keys based on Master1ID to Detail1 and Detail2.
At this point you should have a complete set of data based on both sets of keys, and you can test update views, etc. to make sure they work with the new integer ids
Finally, once all is cool, drop the now-unneeded GUID foreign keys and the GUID columns themselves.
You can always run a database pack once you get everything clean and converted, if your intent with this restructuring was to reduce overall disk usage. The point is that much of the work in a process like this is the fix-up of foreign keys.
I have a small process that works on a few SQL tables. The tables were originally Guid primary keyed, but for efficiency we're updating them to a BigInt identity.
I have a batch insert that adds an item in the primary key table, then several items in the foreign key table. With Guids, this was easy, as I'd create the Guid in the code and pass it in for the parameter.
I'm curious what the best approach is for an identity column. I know I can do:
declare @id int
insert into PrimaryKeyTable (...) Values (...)
select @id = Scope_Identity()
and get back the primary key.
Is the best approach to split the batch into two, and pass the parameter back in from the code for the foreign key inserts? Or is there a way to do all the inserts in one SQL statement? Is there a general public opinion on the matter, or a best practice? Thank you for any guidance.
You need to use OUTPUT
INSERT...
OUTPUT INSERTED.ID
This allows you to do a batch insert, and it will spit out the batched identity IDs and whatever else you explicitly set to output.
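For example, something like this (the table and column names are invented):

declare @NewIds table (ID bigint);

insert into PrimaryKeyTable (Name)
output INSERTED.ID into @NewIds
select Name from StagingTable;

-- @NewIds now holds every identity value the batch generated
select ID from @NewIds;

One caveat: OUTPUT on a plain INSERT can only return INSERTED columns, so if you need to match the new keys back to your source rows, output a unique column (such as Name here) alongside the ID.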
Relatively simple problem.
Table A has ID int PK, unique Name varchar(500), and cola, colb, etc
Table B has a foreign key to Table A.
So, in the application, we are generating records for both table A and table B into DataTables in memory.
We would be generating thousands of these records on a very large number of "clients".
Eventually we make the call to store these records. However, records from table A may already exist in the database, so we need to get the primary keys for the records that already exist, and insert the missing ones. Then insert all records for table B with the correct foreign key.
Proposed solution:
I was considering sending an XML document to SQL Server to open as a rowset into TableVarA, updating TableVarA with the primary keys for the records that already exist, then inserting the missing records and outputting those to TableVarNew. I would then select the Name and primary key from TableVarA union all TableVarNew.
Then in code populate the correct FKs into TableB in memory, and insert all of these records using SqlBulkCopy.
Does this sound like a good solution? And if so, what is the best way to populate the FKs in memory for TableB to match the primary key from the returned DataSet?
Sounds like a plan - but I think the handling of Table A can be simpler (a single in-memory table/table variable should be sufficient):
have a TableVarA that contains all rows for Table A
update the ID for all rows that already exist in Table A (should be doable in a single SQL statement)
insert all non-existing rows (that still have an empty ID) into Table A and make a note of their ID
This could all happen in a single table variable - I don't see why you need to copy stuff around....
Once you've handled your Table A, as you say, update Table B's foreign keys and bulk insert those rows in one go.
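A sketch of those steps (assuming the FK is on ID, that Name is the unique natural key to match on, and with the other columns trimmed for brevity):

declare @TableVarA table (ID int null, Name varchar(500) not null);

-- step 1: stamp the IDs of rows that already exist in Table A
update tv
set tv.ID = a.ID
from @TableVarA tv
inner join TableA a on a.Name = tv.Name;

-- step 2: insert the missing rows and capture their new IDs
declare @NewRows table (ID int, Name varchar(500));

insert into TableA (Name)
output INSERTED.ID, INSERTED.Name into @NewRows
select Name from @TableVarA where ID is null;

update tv
set tv.ID = n.ID
from @TableVarA tv
inner join @NewRows n on n.Name = tv.Name;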
What I'm not quite clear on is how Table B references Table A - you just said it has an FK, but you didn't specify which column it is on (assuming ID). And how do your rows from Table B reference Table A for new rows that aren't inserted yet and thus don't have an ID in Table A yet?
This is more of a comment than a complete answer, but I was running out of room, so please don't vote it down for not being up to answer criteria.
My concern would be that by evaluating a set for missing keys and then inserting in bulk, you take the risk that a key got added elsewhere in the meantime. You stated this could come from a large number of clients, so this is going to happen. Yes, you could wrap it in a big transaction, but big transactions are hogs and would lock out other clients.
My thought is to deal separately, in bulk, with the rows that already have keys, assuming there is no risk the PK would be deleted. A TVP is efficient, but you need explicit knowledge of which rows got processed. I think you need to first search on Name to get the list of PKs that already exist, then process those via a TVP.
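The TVP lookup might look something like this (the type and procedure names are invented):

create type dbo.NameList as table (Name varchar(500) primary key);
GO

create procedure dbo.GetExistingKeys
    @Names dbo.NameList readonly
as
begin
    select a.ID, a.Name
    from TableA a
    inner join @Names n on n.Name = a.Name;
end
GO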
For data integrity, process the rest one at a time via a stored procedure that creates the PK as necessary.
Thousands of records is not scary (millions is). A large number of "clients" is the scary part.
I've got a table PartsMedia where I can insert all the images related to a product.
The table has the following columns:
PartsMediaID, auto-increment
PartsNo
MediaLink
MediaDescription
CatalogCode
SortCode
I want to insert a complete row with automatic increment, and the PartsNo should be the same as the PartsNo from the PartsMaster table.
The MediaLink should be the PartsNo + '-2.jpg'.
The MediaDescription is, for example, 'image2'.
The CatalogCode should be 'catalog',
and the SortCode should be '0'.
From the PartsMaster table I just need the PartNo, so I can add it to the PartsMedia table.
The PartNo is the foreign key in the PartsMedia table.
The following is what I've got so far, but no luck:
insert into dbo.PartsMedia (PartNo,MediaLink,MediaDescription,CatalogCode, SortCode)
values (dbo.PartsMaster.PartNo, PartsMaster.PartNo+'-2.jpg','image2', 'catalog','0')
I need some help.
Kind regards,
It's unclear to me what you really want.
But if this is MS SQL, and you're trying to override the identity column (which has auto-increment), you need to tell SQL Server that you want to insert a new value into the identity column:
SET IDENTITY_INSERT tablename ON
YOUR INSERT GOES HERE
SET IDENTITY_INSERT tablename OFF
Your insert statement lacks a select-clause that grabs the correct row(s) from the PartsMaster table.
insert into foo(a, b, c)
select x, y, z from T
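Applied to the tables in the question, that would be something like this (assuming PartNo is a character column; cast it first if not):

insert into dbo.PartsMedia (PartNo, MediaLink, MediaDescription, CatalogCode, SortCode)
select pm.PartNo, pm.PartNo + '-2.jpg', 'image2', 'catalog', '0'
from dbo.PartsMaster pm;

PartsMediaID is left out of the column list, so the auto-increment assigns it for you.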
(warning: Dev pretending to know anything about databases)
It sounds like you have a data normalization problem. Each entity should have only one ID in your database, and it only makes sense for something like a surrogate key ID (for that table) to be auto-increment.
If you want to refer to the ID of an entity in a different table, you should have a foreign key constraint, and that column shouldn't be auto-increment.
Reason being - what if in the future you want more than one piece of media (image) for a part? Maybe in the future you'll want pics and vids. In these scenarios, you need to support duplicate PartsNo values.
A database exists with two tables:

Data_t: DataID, the primary key, identity(1,1). It also has another field, LEFT TINYINT.

Data_Link_t: DataID, a PK and FK where DataID MUST exist in Data_t. It also has another field, RIGHT SMALLINT.
Coming from a Microsoft Access environment into C# and SQL Server, I'm looking for a good method of importing a record into this relationship.
The record contains information that belongs on both sides of this join (possibly inserting/updating upwards of 5000 records at once). Bonus points for processing the entire batch in some kind of LINQ list-type command, but even if this is done record by record, the key goal is that BOTH sides of the record are processed in the same step.
There are countless approaches, and I'm looking at too many to determine which way I should go, so I thought it faster to ask the general public. Is LINQ an option for inserting/updating a big list like this with LINQ to SQL? Should I go record by record? What approach should I use to add a record to normalized tables that, when joined, create the full record?
Sounds like a case where I'd write a small stored proc and call that from C# - e.g. as a function on my Linq-to-SQL data context object.
Something like:
CREATE PROCEDURE dbo.InsertData(@Left TINYINT, @Right SMALLINT)
AS BEGIN
    DECLARE @DataID INT

    INSERT INTO dbo.Data_t([LEFT]) VALUES(@Left)
    SELECT @DataID = SCOPE_IDENTITY();

    INSERT INTO dbo.Data_Link_T(DataID, [RIGHT]) VALUES(@DataID, @Right)
END
If you import that into your data context, you could call this something like:
using(YourDataContext ctx = new YourDataContext())
{
    foreach(YourObjectType obj in YourListOfObjects)
    {
        ctx.InsertData(obj.Left, obj.Right);
    }
}
and let the stored proc handle all the rest (all the details, like determining and using the IDENTITY from the first table in the second one) for you.
I have never tried it myself, but you might be able to do exactly what you are asking for by creating an updateable view and then inserting records into the view.
UPDATE
I just tried it, and it doesn't look like it will work.
Msg 4405, Level 16, State 1, Line 1
View or function 'Data_t_and_Data_Link_t' is not updatable because the modification affects multiple base tables.
I guess this is just one more thing for all the Relational Database Theory purists to hate about SQL Server.
ANOTHER UPDATE
Further research has found a way to do it. It can be done with a view and an "instead of" trigger.
create table Data_t
(
    DataID int not null identity primary key,
    [LEFT] tinyint
)
GO
create table Data_Link_t
(
    DataID int not null primary key foreign key references Data_T (DataID),
    [RIGHT] smallint
)
GO
create view Data_t_and_Data_Link_t
as
select
d.DataID,
d.[LEFT],
dl.[RIGHT]
from
Data_t d
inner join Data_Link_t dl on dl.DataID = d.DataID
GO
create trigger trgInsData_t_and_Data_Link_t on Data_t_and_Data_Link_T
instead of insert
as
insert into Data_t ([LEFT]) select [LEFT] from inserted
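-- @@IDENTITY picks up the key generated by the insert above; note that
-- this only pairs rows correctly for single-row inserts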
insert into Data_Link_t (DataID, [RIGHT]) select @@IDENTITY, [RIGHT] from inserted
go
insert into Data_t_and_Data_Link_t ([LEFT],[RIGHT]) values (1, 2)