I want to add a post-deployment script to insert data. But when I execute the script below, I get this error message:
Must declare the table variable "@VehicleType".
The purpose of this script is to force inserting the Id values, which form the primary key. I know that we can use:
SET IDENTITY_INSERT [dbo].[TableName] ON;
GO
and after the MERGE:
SET IDENTITY_INSERT [dbo].[TableName] OFF;
GO
But it's not working for me, and I can't deploy my DacPac.
DECLARE @VehicleType TABLE(
[VehicleTypeId] BIGINT,
[Name] NVARCHAR(200) NOT NULL
);
INSERT INTO @VehicleType ([VehicleTypeId], [Name])
VALUES(1,'Automobile'),(2,'HeavyVehicle'),(3,'Motorcycle')
SET IDENTITY_INSERT [dbo].[VehicleType] ON;
GO
MERGE INTO [dbo].[VehicleType]
USING @VehicleType as vhlt
ON ([dbo].[VehicleType].[VehicleTypeId] = vhlt.[VehicleTypeId] and [dbo].[VehicleType].[Name] = vhlt.[Name])
WHEN NOT MATCHED THEN
INSERT VALUES ([VehicleTypeId], [Name]);
SET IDENTITY_INSERT [dbo].[VehicleType] OFF;
GO
The first thing that I need to point out is that a lookup table such as this should not have an auto-number (IDENTITY) column. Auto-number values are synthetic and should only be used when you don't care what the value is, just that it's unique. That works well when users can add values to the lookup set.
For non-user-serviceable lookup tables, like your VehicleType, you do care what the VehicleTypeId values are so auto-number works against you. If you can remove the IDENTITY property from the column definition, you absolutely should. If you can't, you're stuck using IDENTITY_INSERT. Just remember not to use IDENTITY in future lookup table designs.
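If you can drop the IDENTITY property, the table definition is as simple as declaring the key yourself; a minimal sketch based on the columns in the question:
CREATE TABLE [dbo].[VehicleType]
(
    [VehicleTypeId] BIGINT NOT NULL PRIMARY KEY,  -- no IDENTITY: you assign the values
    [Name] NVARCHAR(200) NOT NULL
);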
(As an aside, the error itself comes from the GO separators: GO ends a batch, and a table variable only exists within the batch that declares it, so the MERGE in the following batch can't see @VehicleType.) There's a way to do this without the table variable at all: use a CTE with the MERGE statement.
The CTE will contain all the values you want using unioned static SELECT statements (using UNION ALL skips duplicate checks since you know there won't be any duplicates).
The MERGE statement uses the CTE as its source table.
The ON clause in the MERGE statement should only compare the primary key. If it also compared non-key columns, changing a non-key value in the source would make the existing row count as "not matched", causing an insert that violates the primary key. Instead, update the Name column to the value in the CTE; to do that, I've added a WHEN MATCHED clause below.
I suggest always using dst and src as aliases in a MERGE statement. The MERGE statement can involve a lot of different parts. If you use the same aliases for the source and target/destination tables, that's one less bit of complexity you have to worry about. I've seen this actually help developers learn the syntax and this technique.
I like to name the CTE src so I don't actually have to alias it.
-- Only IF you cannot remove the IDENTITY property from VehicleTypeId:
SET IDENTITY_INSERT [dbo].[VehicleType] ON;
GO
-- Stage the data and merge it into the table in one shot:
WITH [src] ([VehicleTypeId], [Name])
AS
(
SELECT 1, 'Automobile'
UNION ALL SELECT 2, 'HeavyVehicle'
UNION ALL SELECT 3, 'Motorcycle'
)
MERGE INTO [dbo].[VehicleType] AS [dst]
USING [src]
ON [dst].[VehicleTypeId] = [src].[VehicleTypeId]
WHEN NOT MATCHED BY TARGET THEN
INSERT ([VehicleTypeId], [Name])
VALUES ([src].[VehicleTypeId], [src].[Name])
WHEN MATCHED AND ([dst].[Name] <> [src].[Name]) THEN
UPDATE
SET [Name] = [src].[Name]
;
GO
-- Again, only IF you cannot remove the IDENTITY property from VehicleTypeId:
SET IDENTITY_INSERT [dbo].[VehicleType] OFF;
GO
When I write a script like this that I intend to run from SSMS, I include an OUTPUT clause so I can see what happened and verify that it was right.
...
OUTPUT $action AS [*Action],
COALESCE([inserted].[VehicleTypeId], [deleted].[VehicleTypeId]) AS [=VehicleTypeId],
[deleted].[Name] AS [-Name],
[inserted].[Name] AS [+Name]
-- Repeat deleted and inserted AS -/+ for remaining non-key columns.
;
A note about the prefixes I've used in the aliases:
* for metadata columns (really, it's only ever *Action).
= for the primary key columns; these won't change, so there's only one alias per key column.
- for the old values (e.g. -Name).
+ for the new values (e.g. +Name).
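Assembled, the OUTPUT clause goes after the last WHEN clause of the MERGE above, just before the terminating semicolon. The tail end of the statement would look like this:
    WHEN MATCHED AND ([dst].[Name] <> [src].[Name]) THEN
        UPDATE
            SET [Name] = [src].[Name]
    OUTPUT $action AS [*Action],
        COALESCE([inserted].[VehicleTypeId], [deleted].[VehicleTypeId]) AS [=VehicleTypeId],
        [deleted].[Name] AS [-Name],
        [inserted].[Name] AS [+Name]
    ;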
Code-first auto-generates an insert procedure as below for a table that has ProductID as its primary key (an identity column).
CREATE PROCEDURE [dbo].[InsertProducts]
@ProductName [nvarchar](max),
@Date [datetime]
AS
BEGIN
INSERT dbo.ProductsTable([ProductName], [Date])
VALUES (@ProductName, @Date)
-- identity stuff starts here
DECLARE @ProductID int
SELECT @ProductID = [ProductID]
FROM dbo.ProductsTable
WHERE @@ROWCOUNT > 0 AND [ProductID] = scope_identity()
SELECT t0.[ProductID]
FROM dbo.ProductsTable AS t0
WHERE @@ROWCOUNT > 0 AND t0.[ProductID] = @ProductID
END
GO
Could you please explain the code that handles the identity column? Also, if an insert procedure is to be manually written from scratch, would it be handled differently?
If, for example, I remove this auto-generated code, I encounter one of the following errors:
Procedure .... expects parameter '@ProductID', which was not supplied
Store update, insert, or delete statement affected an unexpected number of rows (0). Entities may have been modified or deleted since entities were loaded. See http://go.microsoft.com/fwlink/?LinkId=472540 for information on understanding and handling optimistic concurrency exceptions.
In the app, this is how I call the procedure, which works fine until I try to mess with the code-first auto-generated SQL:
using (var db = new AppContext())
{
var record = new ProductObj()
{
ProductName = this.ProductName,
Date = DateTime.UtcNow
};
db.ProductDbSet.Add(record);
db.SaveChanges();
}
I guess there are two things to be explained here.
Why a SELECT statement when I insert stuff?
Let's first see what a regular insert by Entity Framework looks like. By "regular" I mean an insert without mapping CUD actions to stored procedures. The normal pattern is:
INSERT [dbo].[Product]([Name], ...)
VALUES (@0, ...)
SELECT [Id]
FROM [dbo].[Product]
WHERE @@ROWCOUNT > 0 AND [Id] = scope_identity()
So the INSERT is followed by a SELECT. This is because EF needs to know the identity value that the database assigns to the new Product to assign it to the entity object's Product.ProductId property and to track the entity. If for some reason you'd decide to do an update immediately after the insert, EF will be able to generate an update statement like UPDATE ... WHERE Id = @0.
When the insert is handled by a stored procedure, the sproc should return the new Id value in a way that looks like the regular insert: EF expects a one-column result set whose column is named after the identity column, containing one row with the new identity value.
So that's why there is a SELECT statement in there, and why EF complains if you remove it. But, you might ask, does EF really need 7 lines of code to get an assigned identity value?
Why so much code?
Honestly, I have to speculate a bit here, because it isn't documented as far as I can find. But let's look at a minimal working version:
INSERT [dbo].[Products]([Name])
VALUES (@Name)
SELECT scope_identity() AS ProductId;
This does the job. It's even the standard example of many tutorials, including official ones, on mapping CUD actions to stored procedures.
But a database can be stuffed with triggers, constraints, defaults, etc. It's hard to predict their influence on the returned scope_identity() under the wide range of circumstances EF may encounter. So EF wants to guarantee that the returned value really belongs to the newly inserted record, and that a record has actually been inserted in the first place. That's why it adds the SELECT from the Product table, including the @@ROWCOUNT check.
To implement these safeguards, a minimal version would be:
INSERT [dbo].[Products]([Name])
VALUES (@Name)
SELECT t0.[ProductId]
FROM [dbo].[Products] AS t0
WHERE @@ROWCOUNT > 0 AND t0.[ProductId] = scope_identity()
Same as in the regular insert.
That's as far as I can follow EF. It puzzles me a bit that this single SELECT apparently is enough for a regular INSERT but not for a stored procedure. I can't explain why there are two SELECTs in the generated code.
How can I get the primary key value and put it in another column when I insert the data?
Here is my table schema:
CREATE TABLE IF NOT EXISTS [MyTable] (
[ID] INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
[custom_ID] INTEGER NULL,
[Name] VARCHAR (200) NULL)
The query I have so far is:
INSERT INTO MyTable (custom_ID, Name)
values (
' {Here I need to get the primary key value, and then put it in custom_ID} ',
'someName')
Thanks!
Don't do that. You are violating all sorts of good database design principles by going this route. The primary key is supposed to represent the data that uniquely identifies a tuple (row). When you start keeping multiple copies of your primary key, you defeat the entire purpose of the key.
As suggested by Chris:
CREATE TRIGGER MyTable_CustomID AFTER INSERT ON MyTable
WHEN NEW.custom_ID IS NULL
BEGIN
    UPDATE MyTable SET custom_ID = NEW.ID WHERE ROWID = NEW.ROWID;
END;
As far as I can tell, you are trying to generate some custom id based upon the PK; e.g. if the PK is 1, you need a custom id like ABCDEF/000/1. In that case you have to pick up the id generated by the insert and then run an update statement, either in a trigger or just after the insert statement. Since this is SQLite, the equivalent of SQL Server's SCOPE_IDENTITY() or @@IDENTITY is last_insert_rowid().
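For example, the insert-then-update variant (without the trigger) might look like this in SQLite, using the question's table and last_insert_rowid():
INSERT INTO MyTable (Name) VALUES ('someName');
-- last_insert_rowid() returns the ID just generated on this connection
UPDATE MyTable SET custom_ID = ID WHERE ID = last_insert_rowid();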
Try something like this:
insert [Order] (col1, col2, ...) values ('val1', 'val2', ...) -- Note: no ID is specified
declare @id int = scope_identity()
insert OrderDetail (order_id, col1, ...) values (@id, 'val1', ...)
I am inserting records through a query similar to this one:
insert into tbl_xyz select field1 from tbl_abc
Now I would like to retrieve the newly generated IDENTITY values of the inserted records. How do I do this with the minimum amount of locking and maximum reliability?
You can get this information using the OUTPUT clause.
You can send the output rows into a table variable or a temp table.
Here's an example:
DECLARE @InsertedIDs TABLE (ID bigint)
INSERT into DestTable (col1, col2, col3, col4)
OUTPUT INSERTED.ID INTO @InsertedIDs
SELECT col1, col2, col3, col4 FROM SourceTable
You can then query the table variable @InsertedIDs for your inserted IDs.
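For instance, still in the same batch (a table variable only lives for the batch that declares it):
-- Join the captured IDs back to the destination rows
SELECT d.ID, d.col1
FROM DestTable AS d
JOIN @InsertedIDs AS i ON i.ID = d.ID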
@@IDENTITY will return you the last inserted IDENTITY value, so you have two possible problems:
Beware of triggers executed when inserting into tbl_xyz, as they may change the value of @@IDENTITY.
Does tbl_abc have more than one row? If so, @@IDENTITY will only return the identity value of the last row.
Issue 1 can be resolved by using SCOPE_IDENTITY() instead of @@IDENTITY.
Issue 2 is harder to resolve. Does field1 in tbl_abc define a unique record within tbl_xyz? If so, you could reselect the data from tbl_xyz with the identity column. There are other solutions using CURSORs, but these will be slow.
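A sketch of that reselect, assuming the identity column in tbl_xyz is named id and that field1 is unique:
-- field1 uniquely identifies each inserted row, so this recovers the new ids
SELECT x.id, x.field1
FROM tbl_xyz AS x
JOIN tbl_abc AS a ON a.field1 = x.field1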
SELECT @@IDENTITY
This is how I've done it before. Not sure if this will meet the latter half of your post though.
EDIT
Found this link too, but not sure if it is the same...
How to insert multiple records and get the identity value?
As far as I know, you can't really do this with straight SQL in the same script. But you could create an INSERT trigger. Now, I hate triggers, but it's one way of doing it.
Depending on what you are trying to do, you might want to insert the rows into a temp table or table variable first, and deal with the result set that way. Hopefully, there is a unique column that you can link to.
You could also lock the table, get the max key, insert your rows, and then get your max key again and do a range.
Trigger:
--Use the Inserted table. This contains all of the inserted rows.
SELECT * FROM Inserted
Temp Table:
SELECT field1, unique_col INTO #temp FROM tbl_abc
insert into tbl_xyz (field1, unique_col) select field1, unique_col from tbl_abc
--This could be an update, or a cursor, or whatever you want to do
SELECT * FROM tbl_xyz WHERE EXISTS (SELECT top 1 unique_col FROM #temp WHERE unique_col = tbl_xyz.unique_col)
Key Range:
Declare @minkey as int, @maxkey as int
BEGIN TRAN --You have to lock the table for this to work
--key is the name of your identity column
--TABLOCKX + HOLDLOCK actually takes and holds that lock until COMMIT
SELECT @minkey = MAX([key]) FROM tbl_xyz WITH (TABLOCKX, HOLDLOCK)
insert into tbl_xyz select field1 from tbl_abc
SELECT @maxkey = MAX([key]) FROM tbl_xyz
COMMIT TRAN
SELECT * FROM tbl_xyz WHERE [key] > @minkey AND [key] <= @maxkey
We have an ASP.NET/MSSQL based web app which generates orders with sequential order numbers.
When a user saves a form, a new order is created as follows:
SELECT MAX(order_number) FROM order_table, call this max_order_number
set new_order_number = max_order_number + 1
INSERT a new order record, with this new_order_number (it's just a field in the order record, not a database key)
If I enclose the above 3 steps in a single transaction, will it avoid duplicate order numbers from being created if two customers save a new order at the same time? (And let's say the system is eventually on a web farm with multiple IIS servers and one MSSQL server.)
I want to avoid two customers selecting the same MAX(order_number) due to concurrency somewhere in the system.
What isolation level should be used? Thank you.
Why not just use an Identity as the order number?
Edit:
As far as I know, you can make the current order_number column an Identity (you may have to reset the seed, it's been a while since I've done this). You might want to do some tests.
Here's a good read about what actually goes on when you change a column to an Identity in SSMS. The author mentions how this may take a while if the table already has millions of rows.
Using an identity is by far the best idea. I create all my tables like this:
CREATE TABLE mytable (
mytable_id int identity(1, 1) not null primary key,
name varchar(50)
)
The "identity" flag means, "Let SQL Server assign this number for me". The (1, 1) means that identity numbers should start at 1 and be incremented by 1 each time someone inserts a record into the table. Not Null means that nobody should be allowed to insert a null into this column, and "primary key" means that we should create a clustered index on this column. With this kind of a table, you can then insert your record like this:
-- We don't need to insert into mytable_id column; SQL Server does it for us!
INSERT INTO mytable (name) VALUES ('Bob Roberts')
But to answer your literal question, I can give a lesson about how transactions work. It's certainly possible, although not optimal, to do this:
-- Begin a transaction. Note that a transaction by itself does not stop
-- other sessions from reading the same MAX; the lock hints below are what
-- actually serialize access to the table.
BEGIN TRANSACTION
DECLARE @id bigint
-- Retrieves the maximum order number from the table and holds a table
-- lock until the transaction ends
SELECT @id = MAX(order_number) FROM order_table WITH (TABLOCKX, HOLDLOCK)
-- While the lock is held, no other queries can change the order table,
-- so this insert statement is guaranteed not to create a duplicate
INSERT INTO order_table (order_number) VALUES (@id + 1)
-- Committing the transaction releases the lock and allows other programs
-- to work on the order table
COMMIT TRANSACTION
Just keep in mind that declaring your table with an identity primary key column does this all for you automatically.
The risk is two processes selecting the MAX(order_number) before one of them inserts the new order. A safer way is to do it in one step:
INSERT INTO order_table
(order_number, /* other fields */)
SELECT MAX(order_number) + 1,
/* other values */
FROM order_table
I agree with G_M; use an Identity field. When you add your record, just
INSERT INTO order_table (/* other fields */)
VALUES (/* other fields */);
SELECT SCOPE_IDENTITY()
The return value from SCOPE_IDENTITY() will be your order number.
I receive a daily XML file that contains thousands of records, each being a business transaction that I need to store in an internal database for use in reporting and billing.
I was under the impression that each day's file contained only unique records, but have discovered that my definition of unique is not exactly the same as the provider's.
The current application that imports this data is a C#/.NET 3.5 console application; it uses SqlBulkCopy into a MS SQL Server 2008 database table where the columns exactly match the structure of the XML records. Each record has just over 100 fields, and there is no natural key in the data, or rather, the fields that would make sense as a composite key also have to allow nulls. Currently the table has several indexes, but no primary key.
Basically the entire row needs to be unique; if one field is different, it is valid enough to be inserted. I looked at creating an MD5 hash of the entire row, inserting that into the database, and using a constraint to prevent SqlBulkCopy from inserting the row, but I don't see how to get the MD5 hash into the BulkCopy operation, and I'm not sure whether the whole operation would fail and roll back if any one record failed, or if it would continue.
The file contains a very large number of records. Going row by row through the XML, querying the database for a record that matches all fields, and then deciding whether to insert is really the only way I can see of doing this. I was just hoping not to have to rewrite the application entirely, and the bulk copy operation is so much faster.
Does anyone know of a way to use SqlBulkCopy while preventing duplicate rows, without a primary key? Or any suggestion for a different way to do this?
I'd upload the data into a staging table then deal with duplicates afterwards on copy to the final table.
For example, you can create a (non-unique) index on the staging table to deal with the "key"
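A sketch of that flow (all table and column names are illustrative; the INTERSECT form of the existence check is used because, unlike =, it treats NULLs as equal, which matters given the nullable columns):
-- Bulk copy lands in dbo.Staging_Transactions first, then move only new rows:
INSERT INTO dbo.Transactions (col1, col2, col3)
SELECT DISTINCT s.col1, s.col2, s.col3
FROM dbo.Staging_Transactions AS s
WHERE NOT EXISTS
(
    SELECT s.col1, s.col2, s.col3
    INTERSECT
    SELECT t.col1, t.col2, t.col3
    FROM dbo.Transactions AS t
);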
Given that you're using SQL 2008, you have two options to solve the problem easily without needing to change your application much (if at all).
The first possible solution is to create a second table like the first one, but with a surrogate identity key and a uniqueness constraint added using the ignore_dup_key option, which will do all the heavy lifting of eliminating the duplicates for you.
Here's an example you can run in SSMS to see what's happening:
if object_id( 'tempdb..#test1' ) is not null drop table #test1;
if object_id( 'tempdb..#test2' ) is not null drop table #test2;
go
-- example heap table with duplicate record
create table #test1
(
col1 int
,col2 varchar(50)
,col3 char(3)
);
insert #test1( col1, col2, col3 )
values
( 250, 'Joe''s IT Consulting and Bait Shop', null )
,( 120, 'Mary''s Dry Cleaning and Taxidermy', 'ACK' )
,( 250, 'Joe''s IT Consulting and Bait Shop', null ) -- dup record
,( 666, 'The Honest Politician', 'LIE' )
,( 100, 'My Invisible Friend', 'WHO' )
;
go
-- secondary table for removing duplicates
create table #test2
(
sk int not null identity primary key
,col1 int
,col2 varchar(50)
,col3 char(3)
-- add a uniqueness constraint to filter dups
,constraint UQ_test2 unique ( col1, col2, col3 ) with ( ignore_dup_key = on )
);
go
-- insert all records from original table
-- this should generate a warning if duplicate records were ignored
insert #test2( col1, col2, col3 )
select col1, col2, col3
from #test1;
go
Alternatively, you can also remove the duplicates in-place without a second table, but the performance may be too slow for your needs. Here's the code for that example, also runnable in SSMS:
if object_id( 'tempdb..#test1' ) is not null drop table #test1;
go
-- example heap table with duplicate record
create table #test1
(
col1 int
,col2 varchar(50)
,col3 char(3)
);
insert #test1( col1, col2, col3 )
values
( 250, 'Joe''s IT Consulting and Bait Shop', null )
,( 120, 'Mary''s Dry Cleaning and Taxidermy', 'ACK' )
,( 250, 'Joe''s IT Consulting and Bait Shop', null ) -- dup record
,( 666, 'The Honest Politician', 'LIE' )
,( 100, 'My Invisible Friend', 'WHO' )
;
go
-- add temporary PK and index
alter table #test1 add sk int not null identity constraint PK_test1 primary key clustered;
create index IX_test1 on #test1( col1, col2, col3 );
go
-- note: rebuilding the indexes may or may not provide a performance benefit
alter index PK_test1 on #test1 rebuild;
alter index IX_test1 on #test1 rebuild;
go
-- remove duplicates
with ranks as
(
select
sk
,ordinal = row_number() over
(
-- put all the columns composing uniqueness into the partition
partition by col1, col2, col3
order by sk
)
from #test1
)
delete
from ranks
where ordinal > 1;
go
-- remove added columns
drop index IX_test1 on #test1;
alter table #test1 drop constraint PK_test1;
alter table #test1 drop column sk;
go
Why not, instead of a primary key, simply create an index and set
Ignore Duplicate Keys: YES
This will prevent any duplicate key from raising an error; the duplicate row simply will not be inserted (as it exists already).
I use this method to insert around 120,000 rows per day and it works flawlessly.
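In T-SQL, that index option looks like this (index and column names are illustrative):
CREATE UNIQUE INDEX UQ_Transactions_AllColumns
    ON dbo.Transactions (col1, col2, col3)
    WITH (IGNORE_DUP_KEY = ON);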
I would bulk copy into a temporary table and then push the data from that into the actual destination table. In this way, you can use SQL to check for and handle duplicates.
What is the data volume? You have 2 options that I can see:
1: filter it at source, by implementing your own IDataReader and using some hash over the data, and simply skipping any duplicates so that they never get passed into the TDS.
2: filter it in the DB; at the simplest level, I guess you could have multiple stages of import - the raw, unsanitised data - and then copy the DISTINCT data into your actual tables, perhaps using an intermediate table if you want to. You might want to use CHECKSUM for some of this, but it depends.
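A sketch of the CHECKSUM idea (illustrative names; CHECKSUM can collide, so treat the hash as a pre-filter and verify the full column list on a hash match):
-- Give the final table a computed hash and index it, so the duplicate check
-- can seek on one small column instead of comparing 100+ columns directly.
ALTER TABLE dbo.Transactions
    ADD row_hash AS CHECKSUM(col1, col2, col3);  -- list every column in practice
CREATE INDEX IX_Transactions_RowHash
    ON dbo.Transactions (row_hash);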
And fix that table. No table should ever be without a unique index, preferably as a PK. Even if you add a surrogate key because there is no natural key, you need to be able to specifically identify a particular record. Otherwise, how will you get rid of the duplicates you already have?
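Adding a surrogate key to the existing table could look like this (a sketch with hypothetical names, using the same pattern as the temporary PK in the earlier in-place dedup example):
ALTER TABLE dbo.Transactions
    ADD transaction_id int NOT NULL IDENTITY(1, 1)
        CONSTRAINT PK_Transactions PRIMARY KEY CLUSTERED;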
I think this is a lot cleaner.
// dt is the source DataTable loaded from the XML
var dtcolumns = new string[] { "Col1", "Col2", "Col3" };
var dtDistinct = dt.DefaultView.ToTable(true, dtcolumns);
using (var cn = new SqlConnection(connectionString))
using (var copy = new SqlBulkCopy(cn))
{
    cn.Open();
    copy.ColumnMappings.Add(0, 0);
    copy.ColumnMappings.Add(1, 1);
    copy.ColumnMappings.Add(2, 2);
    copy.DestinationTableName = "TableNameToMapTo";
    copy.WriteToServer(dtDistinct);
}
This way you only need one database table and can keep the business logic in code.