I need to implement a "shift by year" operation in my Entity Framework application.
To simplify things, let's suppose I have an entity / table that has a foreign key FK to some other table and a YEAR column (both ints). Users may shift the entity's years into the future, so I need to copy data from 2017 to 2018, and so forth.
There's a restriction in the database that says that the pair (FK, YEAR) must be unique.
When I perform the "shift" in memory and send the changes to the database with SaveChanges, I receive an error from the database saying that I'm violating the unique constraint. I suppose it is performing the updates row by row, hence the violation.
Even if I sort the collection by year descending (so that no duplicates would occur if the rows were sent one by one in that order), the same error happens.
My workaround is to delete the old data and insert new data with the new years, but I think there might be a cleaner solution...
I solved the issue by inverting the order and saving item by item after each modification. Running so many SaveChanges calls isn't very clean, but the overall process is rather complex and already runs inside a transaction, so it's not a big deal.
I had to make some more changes because users can shift years to the past or to the future, in which case I must or must not reverse the order, but to keep this question simple, let's say it's solved by reversing the order and saving item by item.
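For illustration, here is a minimal sketch of that final approach in EF6; the entity set and property names (db.YearlyRows, FK, Year) are hypothetical, and the real code wraps this in a larger transaction:

// Shift forward: update the highest years first, so an updated row never
// collides with a (FK, YEAR) pair that is still waiting to be updated.
var rows = db.YearlyRows
             .Where(r => r.FK == fk)
             .OrderByDescending(r => r.Year)
             .ToList();

foreach (var row in rows)
{
    row.Year += 1;      // move one year into the future
    db.SaveChanges();   // one UPDATE per row, issued in a safe order
}

For a shift into the past, the same loop works with OrderBy(r => r.Year) instead.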
Without getting into the "why", just understand that this is inherited and it's what I have to work with :)
I have an EF6 edmx mapped to a view. There is no identifying column on it, so in order for EF to map the entity, the first non-null column was selected as the PK. The original thought behind this was that it is read-only; no updates or deletes would be done. There is no filtering (OData sits on top of this), and the only - and I mean only - way this is used is select top N * from the entity.
There are 4 records in the view.
TypeCode | Contact | UserID | LocaleID | EntityName
---------|---------|--------|----------|-----------
1        | 6623    | 1032   | 9        | Jane
1        | 6623    | 1032   | 9        | Jane
1        | 6623    | 1032   | 9        | John
1        | 6623    | 1032   | 9        | John
The problem I am seeing is that EF maps all 4 rows the same: all the "John" rows above come back as "Jane".
OK, putting aside the design decision and the fact that there is no identifying column on the view: why is EF mapping the last two rows wrong? My initial thought is that since the "PK" is set to TypeCode, it doesn't know how to tell the rows apart. But why would it be using the key column when just reading results from the database? I would have thought it only mattered for updates and deletes.
When you query data with Entity Framework, the default behavior is that each materialized entity is tracked by its unique key. The key consists of the properties you told EF to use as the key or, alternatively, the properties it inferred as the key (TypeCode in your case). Whenever an entity with a duplicate key tries to enter the change tracker, an error is thrown saying that an object with the same key is already being tracked.
So EF simply can't materialize objects having duplicate primary key values. It would compromise its tracking mechanism.
It appears that, at least in EF6, AsNoTracking() can be used as a work-around. AsNoTracking tells EF to just materialize objects without tracking them, so it doesn't generate entity keys.
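For example, a minimal sketch of that work-around (the context and entity set names are hypothetical):

// AsNoTracking materializes rows without entering them into the change
// tracker, so no identity resolution by key takes place and each row
// comes back as its own object.
var rows = context.ContactViewRows
                  .AsNoTracking()
                  .Take(10)
                  .ToList();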
What I don't understand is why EF doesn't throw an exception when it reads duplicate primary key values. Instead it silently returns the same object as many times as it encounters its key value in the SQL query result. This has confused many, many people to no end.
By the way, a common way to avoid this issue is to generate temporary unique key values in the view by using ROW_NUMBER in SQL Server. That's good enough for read-only data that you read once into one context instance.
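A sketch of that idea as a raw query in EF6 (the view, columns, and the ContactRow DTO are hypothetical):

// ROW_NUMBER hands every row a unique RowId, so a key based on it never
// sees duplicates. Good enough for data read once, read-only.
var rows = context.Database.SqlQuery<ContactRow>(@"
    SELECT ROW_NUMBER() OVER (ORDER BY TypeCode) AS RowId,
           TypeCode, Contact, UserID, LocaleID, EntityName
    FROM   dbo.ContactView").ToList();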
I am writing a new timesheet application, including redesigning the database, and it will require data migration from Oracle to Oracle.
In the old system the field 'EmployeeCode' is the primary key and it is in alphanumeric form, i.e. 'UK001', 'UK002', 'FR001', 'FR002', 'US001'. The Employee table is also linked to the timesheet and other tables, where EmpCode is referenced as a FK.
To make the JOINs in the new system perform faster, I was thinking about adding a new INT column to the Employee table and making it the PK. (I don't know if it will make any big difference.)
-Employee table has about 600 rows.
-Data type of EmpCode is Varchar2(20) in the old DB, which I can reduce to Varchar2(6) in the new system and alter later as the company expands.
I am wondering if it is better to keep EmpCode as the primary key, which will make migrating the data easier, or should I add an INT column?
Someone gave me the following advice in one of my previous threads:
"if you need to create a composite code of AANNN then I'd split this into two: a simple 'Prefix' field of CHAR(2) and an identity field of INT, then turn EmpCode into a computed field that concats the two and stick an index on there" (#Chris)
I am not sure this option would work, as the Employee table is linked to other tables as well (EmpCode is used as a FK in other tables).
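For concreteness, here is a hedged sketch of the quoted advice in Oracle 12c syntax, run through plain ADO.NET; all names are hypothetical and the exact DDL is an assumption, not something tested against your schema:

// Assumes 'conn' is an open connection to the new Oracle database
// (e.g. Oracle.ManagedDataAccess); each DDL statement runs on its own.
var ddl = new[]
{
    @"ALTER TABLE Employee ADD (
          Prefix CHAR(2),
          EmpSeq NUMBER GENERATED ALWAYS AS IDENTITY)",
    // EmpCode2 is a virtual column concatenating the two parts (UK001 style),
    // kept unique by an index so FK-style lookups stay fast.
    @"ALTER TABLE Employee ADD (
          EmpCode2 VARCHAR2(6) GENERATED ALWAYS AS
                   (Prefix || TO_CHAR(EmpSeq, 'FM000')) VIRTUAL)",
    @"CREATE UNIQUE INDEX IX_Employee_EmpCode2 ON Employee (EmpCode2)"
};

foreach (var statement in ddl)
{
    using (var cmd = conn.CreateCommand())
    {
        cmd.CommandText = statement;
        cmd.ExecuteNonQuery();
    }
}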
If you do add this PK and also keep the former PK, you will have some data management issues to deal with - or perhaps your customers will. Getting rid of the old PK may not be feasible if there are existing users who will be upgrading to the new database.
If EmployeeCode, the former PK, is used by the users of the data to identify employees, then you will have to add a constraint to make sure that this field remains unique. Carrying both codes will wipe out any performance gains you were hoping for.
If it were me, I'd leave well enough alone. The performance gains, if any, will be trivial.
The performance difference will be negligible if the index you're creating on the alphanumeric field is the clustered index for the table, which, based on your question, is going to be the case - but I wanted to note that for completeness. I say this for two reasons:
A clustered index is the physical order of the table, so when seeking against that index (looking for more data, presumably off the data page, in a query), a binary search can be performed against it, because the rows are physically stored in that order.
A binary search is just about as efficient as you can get, though let's not forget the statistics behind an index. I call this out because integer primary keys build statistics that make seeks as fast as they can be: mathematically speaking, we know that 2 comes after 1, for example.
So, just keep that in mind when building alphanumeric, or even compound, keys and indexes and comparing them against an integer key. Personally, I prefer to stick with integer primary keys, because I have found them to perform better over time during extreme growth.
I hope this helps.
I use alphanumeric primary keys regularly and see absolutely no issues with it. There is no performance issue, you have a wider addressable space, and you can be more expressive/human readable. Integer keys are just a convention.
Add to that the risk you're adding to your project by making a major architectural change over and above the porting issues, and I'd say stick with the existing schema as much as possible.
There will be no performance improvement - in fact, unless you know and can prove/measure that you have a performance problem, changing things "to make them faster" usually leads to pain.
However, there is a concern that your primary key appears to carry meaning - it's a country code, concatenated with a number. What if an employee moves from the US to the UK? What if the UK hires its 1000th employee?
For that reason, I'd refactor the application to use a meaningless primary key; whether it's an INT or a VARCHAR is not hugely relevant.
You do occasionally come across alphanumeric primary keys; personally, I find they just make life more difficult. If you are able to change it and you want to change it, I would say go ahead - it will make things easier for you later. As for it being a FK, you would need to be careful to write a script to properly update all the data. One way you can do this is:
Step 1: Create a new INT identity column in the parent table and make it the PK
Step 2: Add a new int column in your child table and then:
Step 3: Write an update script like this (a correlated UPDATE, which works on Oracle, where UPDATE ... JOIN syntax isn't available):
UPDATE childTable C
SET C.myNewEmpIDColumn = (SELECT P.myNewEmpIDColumn
                          FROM parentTable P
                          WHERE P.oldEmpID = C.oldEmpID)
Step 4: Repeat steps 2 & 3 for all child tables
Step 5: Delete all old FK columns
Something like that - and don't forget to back up your current DB first ;)
I have a unique constraint on a Navigations table's column called Index. I have two Navigation entities and I want to swap their Index values.
When I call db.SaveChanges it throws an exception indicating that a unique constraint was violated. It seems EF is updating one value and then the other, thus violating the constraint.
Shouldn't it be updating them both in a transaction and then trying to commit once the values are sorted out and not in violation of the constraint?
Is there a way around this without using temporary values?
It is not a problem of EF but of the SQL database, because update commands are executed sequentially. The transaction has nothing to do with this - all constraints are validated per command, not per transaction. If you want to swap unique values, you need more steps, where you use an additional dummy value to avoid this situation.
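A sketch of those extra steps in EF6 (Database.BeginTransaction arrived with EF6); the entity names are hypothetical, and -1 is assumed to be a value no real row ever uses:

using (var tx = db.Database.BeginTransaction())
{
    var first  = db.Navigations.Find(1);
    var second = db.Navigations.Find(2);

    int firstIndex  = first.Index;
    int secondIndex = second.Index;

    first.Index = -1;           // park the first row on the dummy value
    db.SaveChanges();

    second.Index = firstIndex;  // the old value is free now
    db.SaveChanges();

    first.Index = secondIndex;  // complete the swap
    db.SaveChanges();

    tx.Commit();                // all three UPDATEs commit together
}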
You could run a custom SQL Query to swap the values, like this:
UPDATE Navigation
SET valuecolumn = CASE
                    WHEN id = 1 THEN 'value2'
                    WHEN id = 2 THEN 'value1'
                  END
WHERE id IN (1, 2)
However, Entity Framework cannot do that, because it's outside the scope of an ORM - it just executes sequential UPDATE statements for each altered entity, as Ladislav described in his answer.
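That said, you can still fire such a statement yourself from EF as raw SQL; a sketch, reusing the table and values from the example above:

// A single UPDATE is validated per statement (as noted above), after all
// of its rows have changed, so the swap succeeds.
db.Database.ExecuteSqlCommand(@"
    UPDATE Navigation
    SET    valuecolumn = CASE WHEN id = 1 THEN 'value2'
                              WHEN id = 2 THEN 'value1'
           END
    WHERE  id IN (1, 2)");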
Another possibility would be to drop the UNIQUE constraint in your database and rely on the application to properly enforce this constraint. In this case, the EF could save the changes just fine, but depending on your scenario, it may not be possible.
There are a few approaches. Some of them are covered in other answers and comments but for completeness, I will list them out here (note that this is just a list that I brainstormed and it might not be all that 'complete').
Perform all of the updates in a single command. See W0lf's answer for an example of this.
Do two sets of updates - one to flip all of the values to the negative of the intended value, and then a second to flip them from negative to positive. This works on the assumptions that negative values are not prevented by other constraints and that only records in a transient state will ever hold them.
Add an extra column - IsUpdating, for example - and set it to true in the first set of updates where the values are changed, then set it back to false in a second set of updates. Swap the unique constraint for a filtered unique index which ignores records where IsUpdating is true (see the sketch after this list).
Remove the constraint and deal with duplicate values.
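As a sketch of the filtered-index option (SQL Server syntax; table, column, and index names are hypothetical):

// One-time schema change: the unique index only covers settled rows,
// so rows flagged IsUpdating = 1 may temporarily share a value.
db.Database.ExecuteSqlCommand(
    @"ALTER TABLE Navigations ADD IsUpdating BIT NOT NULL DEFAULT 0");
db.Database.ExecuteSqlCommand(
    @"CREATE UNIQUE INDEX UX_Navigations_Index
          ON Navigations ([Index]) WHERE IsUpdating = 0");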
I know you can insert new items into your SQL database (LINQ to SQL, code generated by SQLMetal.exe). You can attach new items with the Attach method on your entity table and what not, or you can edit existing records.
Now, let's say that instead of one new entity you're presented with a lot - some of which may well already exist in the table. There is a primary key, but it's possible there may be some altered records in the collection, so the primary key probably isn't going to be the best way of figuring out what's changed.
Do I have to go through every record in my LINQ table and compare all of its column data with all of the column data of the entities in the collection in question? That would tell me which ones are new, which ones have changed, and which ones can be discarded, but it seems like a really long-winded way of doing it.
Is there an easier way?
Thanks.
I think an "UPSERT" is what you're after.
It's basically a combined insert/update command for SQL: if the row exists, update it; if not, create it.
http://www.databasejournal.com/features/mssql/article.php/3739131/UPSERT-Functionality-in-SQL-Server-2008.htm
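On SQL Server 2008+, that maps to a MERGE statement. A minimal sketch, run here through the LINQ to SQL DataContext (table and column names are hypothetical):

// Upsert one item: update the row if the key exists, insert it otherwise.
// {0} and {1} are parameter placeholders understood by ExecuteCommand.
db.ExecuteCommand(@"
    MERGE INTO Items AS target
    USING (SELECT {0} AS Id, {1} AS Name) AS source
        ON target.Id = source.Id
    WHEN MATCHED THEN
        UPDATE SET target.Name = source.Name
    WHEN NOT MATCHED THEN
        INSERT (Id, Name) VALUES (source.Id, source.Name);",
    item.Id, item.Name);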
I'm using an 'in database' circularly linked list (CLL), and I'm inserting the database entries forming these CLLs using LINQ to SQL.
They have the general form:
id uuid | nextId uuid | current bit
If I try to do a SubmitChanges with a few objects forming a complete CLL, I get the error "A cycle was detected in the set of changes".
I can circumvent this by making the linked list 'circular' in a separate SubmitChanges, but this has two downsides: I lose the ability to do this in one transaction, and for a small period the data in my database isn't correct.
Is there a way to fix this behaviour?
The database needs to enforce its constraints, and I imagine you have a foreign key constraint between nextId and id. If this chain of relations leads back to the start (as you have found), the database will not allow it.
I suspect your choices are:
Remove the foreign key constraint.
Store in the DB as a linked list, and only join the head with the tail in your code.
Even your second option won't work, as the DB won't allow you to add this last reference.
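The questioner's own two-step work-around (closing the circle in a separate SubmitChanges) can at least be made atomic again by wrapping both submits in a System.Transactions scope. A sketch with hypothetical entity names, assuming nextId is nullable during the first step:

using (var scope = new TransactionScope())
{
    // Step 1: insert the chain open-ended; the tail's NextId stays null,
    // so this change set contains no cycle.
    db.Nodes.InsertAllOnSubmit(nodes);
    db.SubmitChanges();

    // Step 2: close the cycle now that every row exists.
    nodes.Last().NextId = nodes.First().Id;
    db.SubmitChanges();

    scope.Complete();   // both submits commit together, or neither does
}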