Updates in LINQ to SQL - C#

For updating records, instead of querying the context and updating each record individually, we currently use code that calls DeleteAllOnSubmit on the existing set of rows and InsertAllOnSubmit on the new set of rows.
This has worked fine for the majority of our scenarios: if the same row (same content) appears as both an insert and a delete in the ChangeSet, the pair gets removed, and if the primary key is the same but the content differs, LINQ converts the pair into a single update. The problem is when the primary keys match only in a case-insensitive manner, say 'abc' and 'Abc'. LINQ treats them as different keys and runs the insert first, followed by the delete, which fails with a primary key violation because, under our database settings, the two keys are considered equal. Is there a way to make LINQ use a case-insensitive comparison when it determines an update from the inserts and deletes in the ChangeSet?
I am aware that the other option would be to query the database and, if the record is present, do an update instead of an insert and a delete. But we have this logic for multiple objects, and we would like to see if there are other options that work.
Thanks for the responses.
Let me try to explain the issue we have with an example.
Say we have two tables, Bank and Branch, where a Bank can have multiple Branches.
We are given a set of branches that need to be set in the table. So the logic would be to delete all branches for that bank and set it to the set of branches we have.
The current code we have does something like this:
DataContext dc = new DataContext();
var destBranches = dc.Branches.Where(b => b.BankID.Equals("123"));
dc.Branches.DeleteAllOnSubmit(destBranches);
dc.Branches.InsertAllOnSubmit(branches);
dc.SubmitChanges();
If we went with the update route, then for each branch we would have to see if it exists in the destination; if so, modify its properties, and if not, insert it. Finally, any destination branch that is not in the incoming set would have to be deleted. We have lots of tables to which this change would need to be made.
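For reference, a minimal sketch of that update route, keeping the same shorthand as the snippet above and assuming BranchID and Name properties (illustrative names, not from the original post). Matching the keys in memory also lets you compare them case-insensitively, which sidesteps the collation mismatch:

DataContext dc = new DataContext();
var destBranches = dc.Branches.Where(b => b.BankID.Equals("123")).ToList();

// Update rows whose key already exists (compared case-insensitively), insert the rest.
foreach (var branch in branches)
{
    var existing = destBranches.FirstOrDefault(
        d => string.Equals(d.BranchID, branch.BranchID, StringComparison.OrdinalIgnoreCase));
    if (existing == null)
        dc.Branches.InsertOnSubmit(branch);
    else
        existing.Name = branch.Name; // copy the remaining columns the same way
}

// Delete destination rows that are no longer in the incoming set.
var incomingIDs = new HashSet<string>(branches.Select(b => b.BranchID),
                                      StringComparer.OrdinalIgnoreCase);
dc.Branches.DeleteAllOnSubmit(destBranches.Where(d => !incomingIDs.Contains(d.BranchID)));

dc.SubmitChanges();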

If you have SQL Server 2008, look into the MERGE statement. It performs an update/insert in one shot. SQL Server 2008 stored procedures also accept table-valued parameters, which would make this trivial.
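For example, a hedged sketch that drives a per-row MERGE through DataContext.ExecuteCommand (dbo.Branch and its columns are assumed names). Note that LINQ to SQL itself has no direct support for table-valued parameters, so a set-based MERGE over a TVP would typically go through a plain SqlCommand or a stored procedure instead:

using (var dc = new DataContext(connectionString))
{
    foreach (var b in branches)
    {
        // {0}, {1}, {2} are turned into SQL parameters by ExecuteCommand.
        dc.ExecuteCommand(@"
            MERGE INTO dbo.Branch AS target
            USING (SELECT {0} AS BranchID, {1} AS BankID, {2} AS Name) AS source
                ON target.BranchID = source.BranchID
            WHEN MATCHED THEN
                UPDATE SET BankID = source.BankID, Name = source.Name
            WHEN NOT MATCHED THEN
                INSERT (BranchID, BankID, Name)
                VALUES (source.BranchID, source.BankID, source.Name);",
            b.BranchID, b.BankID, b.Name);
    }
}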

You may also try Plinqo. It does all the batch update dirty work for you.

Related

Setting a primary key with ROW_NUMBER in a view mapped with Entity Fluent API makes linq timeout

My problem is the following: I map my view to an object through the Entity Fluent API. I needed a view containing a few left joins, and there was no unique identifier in the tables, so Entity always returned the same set of objects. In a few different threads/blogs, I saw a solution consisting of adding a column with
ROW_NUMBER() OVER (ORDER BY Id)
I then tried to map it in Entity :
in my class I add a property
public long Row { get; set; }
and in my configuration class I add
HasKey(imc => imc.Row);
Property(imc => imc.Row).HasColumnName("Row");
Apparently, the mapping works. What doesn't work is that, when I query the objects with LINQ, even a Count() will time out; however, the request itself only returns about 200 rows when run in a SQL Management Studio environment.
Has anyone ever seen this issue?
EDIT:
I have been able to bypass the problem by replacing the "row_number()" with a newid() in the MS SQL View, but I'm still afraid it might be a problem later on.
Your query is slow, which causes the timeout. About 1 million people have seen this before. You would need to analyze the query plan. Computing a row number over the whole table can be slow if the table is unindexed. Also, a row number cannot be used as a key because its values change when you change the underlying data, and EF does not support changing keys.
If you use newid() as the "key" in the view, then you get fresh IDs each time. I think you might not be aware of the fact that a view is merely a shortcut for that particular query; its contents are not stored anywhere.
Introduce a column that can be used as a key, for example an IDENTITY column.
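A minimal sketch of the Fluent API mapping once the view exposes a stable, stored key column (ImportedRow, MyView, and Id are illustrative names, not from the original post):

using System.Data.Entity.ModelConfiguration;

public class ImportedRow
{
    public int Id { get; set; }   // backed by a persisted IDENTITY column surfaced through the view
    // ...other mapped columns
}

public class ImportedRowConfiguration : EntityTypeConfiguration<ImportedRow>
{
    public ImportedRowConfiguration()
    {
        ToTable("MyView");
        HasKey(r => r.Id);
        Property(r => r.Id).HasColumnName("Id");
    }
}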

LINQ to SQL CRUD (insert specifically) - inserting multiple items

I know you can insert new items into your SQL database (LINQ to SQL, code generated by SQLMetal.exe). You can attach new items with the Attach method on your entity table and so on, or you can edit existing records.
Now, let's say that instead of one new entity you're presented with a lot of them, some of which may well already exist in the table. There is a primary key, but it's possible there may be some altered records in the collection, so the primary key probably isn't going to be the best method of figuring out what's changed.
Do I have to go through every record in my LINQ table and compare all of its column data with all of the column data in the entities in the collection in question? That would tell me which ones are new, which ones have had changes, and which ones can be discarded, but it seems like a really long-winded way of doing it.
Is there an easier way?
Thanks.
I think an "UPSERT" is what you're after.
It's basically a combined insert/update command for SQL: if the row exists, update it; if not, create it.
http://www.databasejournal.com/features/mssql/article.php/3739131/UPSERT-Functionality-in-SQL-Server-2008.htm
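If you stay in LINQ to SQL rather than dropping to SQL, one sketch is to pull the matching rows in a single query and decide per item; Items, ItemID, and Name are assumed names for illustration:

var incomingIds = incomingItems.Select(i => i.ItemID).ToList();
var existingById = dc.Items
                     .Where(i => incomingIds.Contains(i.ItemID))   // translates to a single IN query
                     .ToDictionary(i => i.ItemID);

foreach (var incoming in incomingItems)
{
    Item existing;
    if (existingById.TryGetValue(incoming.ItemID, out existing))
        existing.Name = incoming.Name;       // generated setters ignore unchanged values, so identical rows produce no UPDATE
    else
        dc.Items.InsertOnSubmit(incoming);   // genuinely new row
}
dc.SubmitChanges();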

How to get the primary key from a table without making a second trip?

How would I get the primary key ID number from a Table without making a second trip to the database in LINQ To SQL?
Right now, I submit the data to a table and then make another trip to figure out what id was assigned to the new row (an auto-increment identity field). I want to do this in LINQ to SQL and not in raw SQL (I no longer use raw SQL).
Also, the second part of my question: I am always careful to know the ID of a user that's online, because I'd rather look up their information in various tables using their ID rather than a GUID or a username, which are long strings. I do this because I think a numeric compare in SQL Server is much (?) more efficient than comparing a username (string) or even a GUID (a very long string). My question is, am I more concerned than I should be? Is the difference worth always keeping the user id (int32) in, say, session state?
#RedFilter provided some interesting/promising leads for the first question. Since I am at this stage unable to try them, can anyone confirm the changes he recommended in the comments section of his answer?
If you have a reference to the object, you can just use that reference and read the primary key after you call db.SubmitChanges(). The LINQ object will automatically update its (identity) primary key field to reflect the new value assigned to it by SQL Server.
Example (vb.net):
Dim db As New NorthwindDataContext
Dim prod As New Product
prod.ProductName = "cheese!"
db.Products.InsertOnSubmit(prod)
db.SubmitChanges()
MessageBox.Show(prod.ProductID)
You could probably include the above code in a function and return the ProductID (or equivalent primary key) and use it somewhere else.
EDIT: If you are not doing atomic updates, you could add each new product to a separate Collection and iterate through it after you call SubmitChanges. I wish LINQ provided a 'database sneak peek' like a dataset would.
Unless you are doing something out of the ordinary, you should not need to do anything extra to retrieve the primary key that is generated.
When you call SubmitChanges on your Linq-to-SQL datacontext, it automatically updates the primary key values for your objects.
Regarding your second question - there may be a small performance improvement by doing a scan on a numeric field as opposed to something like varchar() but you will see much better performance either way by ensuring that you have the correct columns in your database indexed. And, with SQL Server if you create a primary key using an identity column, it will by default have a clustered index over it.
Linq to SQL automatically sets the identity value of your class with the ID generated when you insert a new record. Just access the property. I don't know if it uses a separate query for this or not, having never used it, but it is not unusual for ORMs to require another query to get back the last inserted ID.
Two ways you can do this independent of Linq To SQL (that may work with it):
1) If you are using SQL Server 2005 or higher, you can use the OUTPUT clause:
Returns information from, or expressions based on, each row affected by an INSERT, UPDATE, or DELETE statement. These results can be returned to the processing application for use in such things as confirmation messages, archiving, and other such application requirements. Alternatively, results can be inserted into a table or table variable.
2) Alternately, you can construct a batch INSERT statement like this:
insert into MyTable
(field1)
values
('xxx');
select scope_identity();
which works at least as far back as SQL Server 2000.
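From LINQ to SQL, one hedged way to run that batch is DataContext.ExecuteQuery, using the same MyTable/field1 names as the snippet above:

// Runs the INSERT plus SELECT SCOPE_IDENTITY() as one batch and reads the scalar back.
int newId = dc.ExecuteQuery<int>(
    "insert into MyTable (field1) values ({0}); select cast(scope_identity() as int);",
    "xxx").Single();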
In T-SQL, you could use the OUTPUT clause, saying:
INSERT table (columns...)
OUTPUT inserted.ID
SELECT columns...
So if you can configure LINQ to use that construct for doing inserts, then you can probably get it back easily. But whether LINQ can get a value back from an insert, I'll let someone else answer that.
Calling a stored procedure from LINQ that returns the ID as an output parameter is probably the easiest approach.
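For instance (a hypothetical sketch: InsertProduct and its @NewID OUTPUT parameter are made-up names, but the LINQ to SQL designer does expose OUTPUT parameters as ref arguments on the generated method):

int? newId = null;
db.InsertProduct("cheese!", ref newId);   // hypothetical generated wrapper for a proc with an OUTPUT @NewID parameter
Console.WriteLine(newId);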

Best way to check whether a row has been updated in SQL

I have an UPDATE statement that updates a table, and there is a column that records the last-modified time. If the data in a specific row has not changed, I don't want to change the last-modified datetime.
What is the best way to check whether an UPDATE statement will change the row of data or not?
Thanks,
Check the old vs. new data in your code instead of doing it in a query.
No need to bother the DB layer unnecessarily if data didn't change at all.
In short, if data didn't change, don't send the UPDATE statement.
One way is to start a transaction, select the contents of the row and compare it to what you're going to update it to. If they don't match, then do the update and end the transaction. If they match, rollback the transaction.
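A minimal C# sketch of that select-compare-update flow with LINQ to SQL inside a transaction; MyDataContext, Products, RetailPrice, and LastModified are illustrative names:

using (var scope = new TransactionScope())            // System.Transactions
using (var dc = new MyDataContext())
{
    var row = dc.Products.Single(p => p.ProductID == id);
    if (row.RetailPrice != newRetailPrice)
    {
        row.RetailPrice = newRetailPrice;
        row.LastModified = DateTime.Now;               // only touched when something actually changed
        dc.SubmitChanges();
    }
    scope.Complete();                                  // nothing was sent if the values already matched
}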
Sounds like you are going through a table and modifying some rows, then you want to go BACK through the table a second time and update the timestamp for the rows that were just changed.
Don't do it in two passes. Just update the date/time at the same time as you update whatever other columns you are changing:
UPDATE myTable
SET retailprice = wholesaleprice * 1.10,
lastmodified = GetDate()
WHERE ...
Or are you issuing an update statement on ALL rows, but for most rows, it just sets it to the value it already has? Don't do that. Exclude those rows that wouldn't be modified in your where clause:
UPDATE myTable
SET retailprice = wholesaleprice * 1.10,
lastmodified = GetDate()
WHERE retailprice <> wholesaleprice * 1.10
If you want to do this preemptively, the only way I can think of is to modify the WHERE clause of the update statement to compare the existing value against the new value (for EVERY column). If ANY of them are not equal, then the update should take place.
That's when a DAL is handy. It keeps track of all columns, so if none changed then I don't even send an UPDATE statement to the database.
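LINQ to SQL's DataContext does that kind of tracking as well; a quick way to see what it is about to send is GetChangeSet():

ChangeSet changes = dc.GetChangeSet();   // dc is any DataContext with pending changes
Console.WriteLine("{0} inserts, {1} updates, {2} deletes pending",
    changes.Inserts.Count, changes.Updates.Count, changes.Deletes.Count);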
It depends on whether you have control of the data or not. Seb above is correct in saying you should check the old data against the new data before doing the update. But what if the data is not under your control?
Say you are a webservice being asked to do an update. Then the only way to check would be to query the existing data and compare it to the new data.
Don't know of any SQL functionality that would detect whether the update has actually changed any data or not.
There are ways in SQL to detect how many rows have been included in an update statement. Don't know of a way to detect whether an update statement actually changed any data, that would be interesting to know.
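From C#, the affected-row count for a hand-written statement is simply the return value of DataContext.ExecuteCommand (it wraps ExecuteNonQuery); combined with a <> filter like the one shown above, that count reflects the rows that actually changed:

int rowsChanged = dc.ExecuteCommand(
    "update myTable set retailprice = wholesaleprice * 1.10, lastmodified = getdate() " +
    "where retailprice <> wholesaleprice * 1.10");     // returns the number of rows the UPDATE touched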
If you are using SQL Server 2005/2008, you can do the following in the stored procedure.
update newTable
set readKey='1'
output inserted.id,
inserted.readKey as readKey,
deleted.readKey as prevReadKey
into #tempTable
where id = '1111'
Then you can select from #tempTable to check whether prevReadKey and readKey hold the same value; if they do, you can reset your last-modified datetime.
This way you don't have to fire multiple queries against the table when a value is actually changing. When the value is not changing, though, this fires two update statements where none is required; that should be OK if those cases are rare.
P.S. NOTE: The query given might be syntactically wrong as it is not tested, but this is the way your problem can be solved. I have done it this way, using the OUTPUT clause with a MERGE statement, in one of my projects, and it can be done with an UPDATE statement too. Here is the reference for the OUTPUT clause.
You COULD write an INSTEAD OF UPDATE trigger in T-SQL, where you could do what has been suggested above in the DAL layer -- compare the values in the existing record vs. the values in the update statement and either apply the update or not. You could use the Columns_Updated() function in the trigger to see if anything had been updated, and proceed accordingly.
It's not particularly efficient from the machine's point of view, but you could write it once and it would handle this situation no matter which application, stored procedure or other process was trying to update the record.

TSQL: UPDATE with INSERT INTO SELECT FROM

So I have an old database that I'm migrating to a new one. The new one has a slightly different but mostly compatible schema. Additionally, I want to renumber all tables from zero.
Currently I have been using a tool I wrote that manually retrieves the old record, inserts it into the new database, and updates a v2 ID field in the old database to show its corresponding ID location in the new database.
For example, I'm selecting from MV5.Posts and inserting into MV6.Posts. Upon the insert, I retrieve the ID of the new row in MV6.Posts and update it in the old MV5.Posts.MV6ID field.
Is there a way to do this UPDATE via INSERT INTO SELECT FROM so I don't have to process every record manually? I'm using SQL Server 2005, dev edition.
The key with migration is to do several things:
First, do not do anything without a current backup.
Second, if the keys will be changing, you need to store both the old and new in the new structure at least temporarily (Permanently if the key field is exposed to the users because they may be searching by it to get old records).
Next you need to have a thorough understanding of the relationships to child tables. If you change the key field all related tables must change as well. This is where having both old and new key stored comes in handy. If you forget to change any of them, the data will no longer be correct and will be useless. So this is a critical step.
Pick out some test cases of particularly complex data making sure to include one or more test cases for each related table. Store the existing values in work tables.
To start the migration you insert into the new table using a select from the old table. Depending on the amount of records, you may want to loop through batches (not one record at a time) to improve performance. If the new key is an identity, you simply put the value of the old key in its field and let the database create the new keys.
Then do the same with the related tables. Then use the old key value in the table to update the foreign key fields with something like:
Update t2
set t2.fkfield = t1.newkey
from table2 t2
join table1 t1 on t1.oldkey = t2.fkfield
Test your migration by running the test cases and comparing the data with what you stored from before the migration. It is utterly critical to thoroughly test migration data or you can't be sure the data is consistent with the old structure. Migration is a very complex action; it pays to take your time and do it very methodically and thoroughly.
Probably the simplest way would be to add a column on MV6.Posts for oldId, then insert all the records from the old table into the new table. Last, update the old table matching on oldId in the new table with something like:
UPDATE o
SET o.newid = n.id
FROM mv5.posts o, mv6.posts n
WHERE o.id = n.oldid
You could clean up and drop the oldId column afterwards if you wanted to.
The best you can do, as far as I know, is with the OUTPUT clause, assuming you have SQL 2005 or 2008.
USE AdventureWorks;
GO
DECLARE @MyTableVar table( ScrapReasonID smallint,
                           Name varchar(50),
                           ModifiedDate datetime);
INSERT Production.ScrapReason
OUTPUT INSERTED.ScrapReasonID, INSERTED.Name, INSERTED.ModifiedDate
INTO @MyTableVar
VALUES (N'Operator error', GETDATE());
It still would require a second pass to update the original table; however, it might help make your logic simpler. Do you need to update the source table? You could just store the new id's in a third cross reference table.
Heh. I remember doing this in a migration.
Putting the old_id in the new table makes both the update easier -- you can just do an insert into newtable select ... from oldtable -- and the subsequent "stitching" of records easier. In the "stitch" you'll either update the child tables' foreign keys as part of the insert, by doing a subselect on the new parent (insert into newchild select ..., (select id from new_parent where old_id = oldchild.fk) as fk, ... from oldchild), or you'll insert the children first and do a separate update to fix the foreign keys.
Doing it in one insert is faster; doing it in a separate step means that your inserts aren't order-dependent and can be re-done if necessary.
After the migration, you can either drop the old_id columns or, if you have a case where the legacy system exposed the ids and users used the keys as data, you can keep them to allow lookups based on the old_id.
Indeed, if you have the foreign keys correctly defined, you can use systables/information-schema to generate your insert statements.
Is there a way to do this UPDATE via INSERT INTO SELECT FROM so I don't have to process every record manually?
Since you wouldn't want to do it manually, but automatically, create a trigger on MV6.Posts so that UPDATE occurs on MV5.Posts automatically when you insert into MV6.Posts.
And your trigger might look something like this:
create trigger trg_MV6Posts
on MV6.Posts
after insert
as
begin
    -- identity columns cannot be updated, so write the new id into the MV6ID tracking
    -- field instead; assumes MV6.Posts keeps the old id in an OldID column for matching
    update p
    set p.MV6ID = i.ID
    from MV5.Posts p
    join inserted i on i.OldID = p.ID
end
AFAIK, you cannot update two different tables with a single SQL statement.
You can however use triggers to achieve what you want to do.
Make a column MV6.Post.OldMV5Id, then do an
insert into MV6.Post
select .. from MV5.Post
and finally update MV5.Post.MV6ID by joining MV6.Post.OldMV5Id back to MV5.Post.
