I have 3 tables:
Staging: gets employee records inserted every month.
Master: contains all previously entered records from staging; each record is unique.
Changes: keeps track of all changes; it has no primary key.
The tables have 10 columns. Every month the staging table receives about 2,500,000 records. Using a cursor, I am able to insert new records from staging into the master table.
When it comes to update, I am using an inner join to get the records from staging that already exist in the master table.
To find out whether any of the employee info has changed, do I have to write something like this:
WHERE Staging.FirstName <> Master.FirstName
OR Staging.LastName <> Master.LastName
OR ...
And so on for 10 columns, or is there an easier way?
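One alternative, assuming both tables have the same column list, is to let EXCEPT do the column-by-column comparison; unlike `<>`, it also treats two NULLs as equal. A sketch (EmployeeId as the join key is an assumption):

```sql
SELECT s.*
FROM Staging AS s
INNER JOIN Master AS m
    ON m.EmployeeId = s.EmployeeId   -- hypothetical key column
WHERE EXISTS (
    -- yields a row only when at least one of the listed columns differs;
    -- unlike <>, EXCEPT treats NULL and NULL as equal
    SELECT s.FirstName, s.LastName   -- ...list all 10 columns
    EXCEPT
    SELECT m.FirstName, m.LastName   -- ...list all 10 columns
);
```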
If the two tables really do have identical columns, you could create a persisted computed column in each table that holds a checksum of the entire row (see http://technet.microsoft.com/en-us/library/ms189788.aspx), create an index on it, and then use that for your joins.
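A minimal sketch of that idea, with hypothetical column names; note that CHECKSUM can collide, so it is best used to narrow the comparison rather than replace it:

```sql
-- Persisted computed checksum over the row's columns (hypothetical names)
ALTER TABLE Master
    ADD RowChecksum AS CHECKSUM(FirstName, LastName /* ...remaining columns */) PERSISTED;

CREATE INDEX IX_Master_RowChecksum ON Master (RowChecksum);
-- Repeat for Staging, then join on the key and compare RowChecksum values.
```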
Using a cursor for millions of rows does not sound like fun. Maybe you should look at EXCEPT/MERGE:
WITH NewAndChanged AS (
    SELECT Stage.Id
          ,Stage.Col1
          ,Stage.Col2
    FROM Stage
    EXCEPT
    SELECT Master.Id
          ,Master.Col1
          ,Master.Col2
    FROM Master
)
MERGE Master
USING NewAndChanged
    ON Master.Id = NewAndChanged.Id
WHEN MATCHED THEN
    UPDATE SET Col1 = NewAndChanged.Col1
              ,Col2 = NewAndChanged.Col2
WHEN NOT MATCHED THEN
    INSERT (Id, Col1, Col2)
    VALUES (NewAndChanged.Id, NewAndChanged.Col1, NewAndChanged.Col2);
I have two tables in my sample database. These are called:
Active Products
Inactive Products
In my ASP form I have two list boxes and a dropdown list.
Dropdown list: Category.
When I select a category, such as milk products, in the dropdown list, the products in that category are shown in the list on the left side:
Products to Activate - [left side]
Activated Products - [right side]
I use two buttons to move list items between the left and right sides. When I click the update button, it takes the items in the right-side list box and inserts them into the first table, Active Products.
Now suppose I select two items on the right side, move them to the left-side list box, and click the update button; those moved items should be updated in the second table. How do I do this?
Active table:
Categoryid | Productid
-----------+----------
         1 |        1
         1 |        5
         1 |        6
For example, if I select the first category, its products are displayed on the right side. If I move Productid 5 and 6 to the left side, they should be deleted from this table and inserted into the inactive table.
My expected output should look like this. How do I write a function to get this output? Someone please guide me. Thanks in advance.
Active table:
Categoryid | Productid
-----------+----------
         1 |        1
Inactive table:
Categoryid | Productid
-----------+----------
         1 |        5
         1 |        6
For inserting:
INSERT INTO InactiveTable
SELECT * FROM ActiveTable WHERE RowId = '<insert rowid>'
For deleting:
DELETE FROM ActiveTable
WHERE RowId = '<insert rowid>'
Obviously insert before you delete.
First off, I'd suggest you use an active/inactive flag (just add a bit column to your table).
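A sketch of the flag approach, assuming a single Products table (names are hypothetical):

```sql
ALTER TABLE Products
    ADD IsActive bit NOT NULL DEFAULT 1;

-- "Moving" a product to inactive becomes a simple update instead of a delete+insert
UPDATE Products
SET IsActive = 0
WHERE Categoryid = 1 AND Productid IN (5, 6);
```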
If that's really not what you want, you can write an SQL-trigger. Should look something like this:
CREATE TRIGGER dbo.SetActiveProductToInactive
ON dbo.ActiveProducts
FOR DELETE
AS
    INSERT INTO dbo.InactiveProducts
    SELECT * FROM deleted
GO
This inserts all deleted rows into your InactiveProducts table whenever they are deleted (a FOR DELETE trigger fires after the delete). If this is too much for what you are trying to accomplish, you should look at James' answer.
In PostgreSQL, a DELETE with a RETURNING clause returns the deleted rows. Try this.
Creating new table:
CREATE TABLE archived_table_timestamp AS
WITH deleted_rows AS (
    DELETE FROM main_table RETURNING *
)
SELECT * FROM deleted_rows;
To insert into an existing table:
WITH deleted_rows as (
DELETE FROM active_table WHERE 'condition' RETURNING *
)
INSERT INTO archived_table SELECT * FROM deleted_rows;
As @Nick.McDermaid mentioned in the comments, it would be better to have an inactive flag on the product table. But if you still need to move these rows between the two tables, you can use the OUTPUT clause to do it in a single statement.
First we create the two tables:
create table #Active (Categoryid int, Productid int);
create table #Inactive (Categoryid int, Productid int);
Second we insert these records:
insert into #Active (Categoryid, Productid) values (1, 1), (1, 5), (1, 6);
Third we delete some rows from the #Active table and insert them into the #Inactive table:
delete t
output deleted.*
into #Inactive
from #Active AS t
where t.Productid in (5, 6);
If you need more details about the OUTPUT clause, see:
https://learn.microsoft.com/en-us/sql/t-sql/queries/output-clause-transact-sql
I am trying to implement a bulk insert of data from a DataTable. In my MS SQL table (the destination table) I have a primary key column that is not an identity column, so I have to increment it manually. But that is not possible in code, because multiple threads will be writing to the same table. Please give me suggestions.
public void BulkInsert(DataTable dtTable)
{
    DataTable dtProductSold = dtTable;

    // create the SqlBulkCopy object
    SqlBulkCopy objbulk = new SqlBulkCopy(ConStr.ToString());

    // assign the destination table name
    objbulk.DestinationTableName = "BatchData_InvReportMapping";

    // map the columns
    objbulk.ColumnMappings.Add("InvPK", "InvPK");
    objbulk.ColumnMappings.Add("DateValue", "DateValue");
    objbulk.ColumnMappings.Add("TextValue", "TextValue");
    objbulk.ColumnMappings.Add("NumericValue", "NumericValue");
    objbulk.ColumnMappings.Add("ErrorValue", "ErrorValue");

    // bulk insert the records into the database
    objbulk.WriteToServer(dtProductSold);
}
Thanks in advance,
This is too long for a comment.
If you have a primary key column, then you need to take responsibility for its being unique and non-NULL when you insert rows. SQL Server offers a very handy mechanism to help with this, which is the identity column.
If you do not have an identity column, then you basically have two options:
Load data that has a valid primary key column.
Create a trigger that assigns the value when rows are loaded in.
Oh, wait. The default option for bulk insert is not to fire triggers, so the second choice really isn't a good option.
Instead, modify the table to have an identity primary key column. Then define a view on the table without the primary key and do the bulk insert into the view. The primary key will then be assigned automatically.
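A sketch of the view trick, with hypothetical names based on the column mapping above:

```sql
-- Base table owns the identity primary key
CREATE TABLE dbo.BatchData_InvReportMapping (
    InvPK int IDENTITY(1,1) PRIMARY KEY,
    DateValue datetime,
    TextValue nvarchar(100),
    NumericValue decimal(18,2),
    ErrorValue nvarchar(100)
);
GO
-- The view exposes every column except the identity
CREATE VIEW dbo.BatchData_Load AS
    SELECT DateValue, TextValue, NumericValue, ErrorValue
    FROM dbo.BatchData_InvReportMapping;
GO
-- Point SqlBulkCopy at "BatchData_Load"; InvPK is assigned automatically.
```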
EDIT:
There is a third option, which might be feasible. Load the data into a staging table. Then insert from the staging table into the final table, calculating the primary key value. Something like this:
insert into finaltable (pk, . . .)
select m.maxpk + seqnum, . . . .
from (select row_number() over (order by (select null)) as seqnum,
. . .
from stagingtable
) s cross join
(select max(pk) as maxpk
from finaltable
) m;
I had one idea.
Generally we use tables to store records; even if you insert the data through the front end, it is ultimately stored in a table. So I suggest using a sequence together with an insert trigger on the table. When you insert data into the table, the trigger is called first, the sequence is incremented, and the incremented value is stored along with the other values in the row. Just try this: in Oracle 11g we don't have IDENTITY, so we use a sequence and an insert trigger for the identity column.
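In Oracle terms, the suggestion looks something like this (table and column names are made up):

```sql
CREATE SEQUENCE emp_seq START WITH 1 INCREMENT BY 1;

CREATE OR REPLACE TRIGGER emp_before_insert
BEFORE INSERT ON employees
FOR EACH ROW
BEGIN
    -- fill the key column from the sequence before the row is stored
    SELECT emp_seq.NEXTVAL INTO :NEW.emp_id FROM dual;
END;
/
```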
Create a table called Ids with columns TableName VARCHAR(50) and Id INT.
When you want to generate your ids, read the relevant row and increment it by the number of rows you want to insert, within the same transaction.
You can now bulk insert these rows whenever you want without worrying about other threads claiming the same ids.
This is similar to how NHibernate's HiLo generator works:
http://weblogs.asp.net/ricardoperes/making-better-use-of-the-nhibernate-hilo-generator
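A rough T-SQL sketch of that scheme (names are hypothetical); the single UPDATE both reads and advances the counter atomically:

```sql
CREATE TABLE Ids (TableName varchar(50) PRIMARY KEY, NextId int NOT NULL);

DECLARE @BlockSize int = 1000, @FirstId int;

-- reserve a block of ids; column references on the right-hand side
-- of the assignments see the pre-update values
UPDATE Ids
SET @FirstId = NextId,
    NextId   = NextId + @BlockSize
WHERE TableName = 'BatchData_InvReportMapping';

-- ids @FirstId .. @FirstId + @BlockSize - 1 can now be assigned client-side
```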
I have a .Net DataTable that contains records, all of which are "added" records. The corresponding table in the database may contain millions of rows. If I attempt to simply call the "Update" method on my SqlDataAdapter, any existing records cause an exception to be raised due to a violation of the primary key constraint. I considered loading all of the physical table's records into a second DataTable instance, merging the two, and then calling the Update method on the second DataTable. This actually works exactly like I want. However, my concern is that if there are 30 billion records in the physical table, loading all of that data into a DataTable in memory could be an issue.
I considered selecting a sub-set of data from the physical table and proceeding as described above, but the construction of the sub-query has proved to be very involved and very tedious. You see, I am not working with a single known table. I am working with a DataSet that contains several hundred DataTables. Each of the DataTables maps to its own physical table. The name and schema of the tables are not known at compile time. This has to all be done at run time.
I have played with the SqlBulkCopy class but have the same issue - duplicate records raise an exception.
I don't want to have to dynamically construct queries for each table at run time. If that is the only way, so be it, but I can't help thinking that there must be a simpler solution using what ADO.NET provides.
You could create your insert command like this:
declare #pk int = 1
declare #txt nvarchar(100) = 'nothing'
insert into #temp (id, txt)
select distinct #pk, #txt
where not exists (select id from #temp x where x.id = #pk)
assuming that your table #temp (a temporary table used for this example) is created like this, with a primary key on id:
create table #temp (id int not null primary key, txt nvarchar(100))
I have 3 tables in a dataset. When I click the save button, I want to add these tables to the database tables using a data adapter. All 3 tables' primary keys are SQL-generated auto numbers.
The relationships between the Invoice, InvoiceProduct, and InvoiceProductExp tables are:
InvoiceNo has many InvoiceProductNo
InvoiceProductNo has many InvoiceProductExpNo
The following code cannot solve these relationships:
DECLARE #InvoiceNo INT
DECLARE #InvoiceProductNo INT
INSERT INTO Invoice ([Date])
VALUES (GETDATE())
SELECT #InvoiceNo = SCOPE_IDENTITY()
INSERT INTO InvoiceProduct([InvoiceNo])
VALUES (#InvoiceNo)
SELECT #InvoiceProductNo = SCOPE_IDENTITY()
INSERT INTO InvoiceProductExp ([InvoiceProductNo], [InvoiceNo])
VALUES (#InvoiceProductNo, #InvoiceNo)
If you are using a Dataset and DataAdapter, you shouldn't be issuing all those statements. Each data adapter needs only know how to update its records. When you update the parent, the identity value will be put into your dataset automatically and the child records will automatically be set (assuming you set up your relationships correctly.) After that, you update the child tables.
Read some of the comments in this SO thread, there are some good code snippets there.
A database exists with two tables:
Data_t: DataID primary key, IDENTITY(1,1). Also has another field LEFT TINYINT.
Data_Link_t: DataID PK and FK, where DataID must exist in Data_t. Also has another field RIGHT SMALLINT.
Coming from a Microsoft Access environment into C# and SQL Server, I'm looking for a good method of importing a record into this relationship.
Each record contains information that belongs on both sides of this join (possibly inserting/updating upwards of 5000 records at once). It would be a bonus to process the entire batch in some kind of LINQ list command, but even if this is done record by record, the key goal is that BOTH sides of the record are processed in the same step.
There are countless approaches, and I'm looking at too many to determine which way to go, so I thought it faster to ask the general public. Is LINQ to SQL an option for inserting/updating a big list like this? Should I go record by record? What approach should I use to add a record to normalized tables that, when joined, form the full record?
Sounds like a case where I'd write a small stored proc and call that from C# - e.g. as a function on my Linq-to-SQL data context object.
Something like:
CREATE PROCEDURE dbo.InsertData(@Left TINYINT, @Right SMALLINT)
AS BEGIN
    DECLARE @DataID INT

    INSERT INTO dbo.Data_t([LEFT]) VALUES(@Left)
    SELECT @DataID = SCOPE_IDENTITY();

    INSERT INTO dbo.Data_Link_T(DataID, [RIGHT]) VALUES(@DataID, @Right)
END
If you import that into your data context, you could call this something like:
using(YourDataContext ctx = new YourDataContext())
{
    foreach(YourObjectType obj in YourListOfObjects)
    {
        ctx.InsertData(obj.Left, obj.Right);
    }
}
and let the stored proc handle all the rest (all the details, like determining and using the IDENTITY from the first table in the second one) for you.
I have never tried it myself, but you might be able to do exactly what you are asking for by creating an updateable view and then inserting records into the view.
UPDATE
I just tried it, and it doesn't look like it will work.
Msg 4405, Level 16, State 1, Line 1
View or function 'Data_t_and_Data_Link_t' is not updatable because the modification affects multiple base tables.
I guess this is just one more thing for all the Relational Database Theory purists to hate about SQL Server.
ANOTHER UPDATE
Further research has found a way to do it. It can be done with a view and an "instead of" trigger.
create table Data_t
(
    DataID int not null identity primary key,
    [LEFT] tinyint
)
GO
create table Data_Link_t
(
    DataID int not null primary key foreign key references Data_T (DataID),
    [RIGHT] smallint
)
GO
create view Data_t_and_Data_Link_t
as
select
d.DataID,
d.[LEFT],
dl.[RIGHT]
from
Data_t d
inner join Data_Link_t dl on dl.DataID = d.DataID
GO
create trigger trgInsData_t_and_Data_Link_t on Data_t_and_Data_Link_T
instead of insert
as
    insert into Data_t ([LEFT]) select [LEFT] from inserted
    -- note: @@IDENTITY only works here for single-row inserts
    insert into Data_Link_t (DataID, [RIGHT]) select @@IDENTITY, [RIGHT] from inserted
go
insert into Data_t_and_Data_Link_t ([LEFT],[RIGHT]) values (1, 2)