What is the right order of insertion/deletion/modification on dataset?

MSDN claims that the order is:
Child table: delete records.
Parent table: insert, update, and delete records.
Child table: insert and update records.
I have a problem with that.
Example: ParentTable has two records, parent1 (Id: 1) and parent2 (Id: 2).
ChildTable has one record, child1 (Id: 1, ParentId: 1).
Suppose we update child1 to have a new parent, parent2, and then delete parent1.
We have nothing to delete in the child table.
We delete parent1: this breaks the constraint, because the child is still attached to parent1 unless we update it first.
So what is the right order, and is MSDN wrong on the subject?
My personal thoughts are:
Child table: delete records.
Parent table: insert, update records.
Child table: insert and update records.
Parent table: delete records.
But the problem is that, with a potential unique constraint, we must always delete the records in a table before adding new ones... So I have no solution right now for committing my data to the database.
Edit: thanks for the answers, but your corner case is my daily case... I opted for the ugly solution: disable the constraints, update the database, then re-enable the constraints. I'm still searching for a better solution.

Doesn't your SQL product support deferred constraint checking?
If not, you could try
Delete all child records - delete all parent records - insert all parent records - insert all child records
where any UPDATEs have been split into their constituent DELETEs and INSERTs.
This should work correctly in all cases, but at acceptable speeds probably in none ...
It is also provable that this is the only scheme that can work correctly in all cases, since:
(a) key constraints on the parent dictate that parent DELETEs must precede parent INSERTs,
(b) key constraints on the child dictate that child DELETEs must precede child INSERTs,
(c) the FK dictates that child DELETEs must precede parent DELETEs,
(d) the FK also dictates that child INSERTs must follow parent INSERTs.
The given sequence is the only possible one that satisfies these 4 requirements, and it also shows that UPDATEs to the child make a solution impossible no matter what, since an UPDATE means a "simultaneous" DELETE plus INSERT.
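For what deferred constraint checking buys you, here is a minimal sketch. It uses SQLite through Python's sqlite3 module purely as an assumption for illustration (the question does not say which DBMS is involved), and the table and column names are made up. With the foreign key declared DEFERRABLE INITIALLY DEFERRED, the parent can be deleted before the child is re-pointed, because the check only runs at COMMIT.

```python
import sqlite3


def reparent_with_deferred_fk():
    # isolation_level=None puts the connection in autocommit mode, so the
    # explicit BEGIN/COMMIT statements below control the transaction.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
    conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE child ("
        " id INTEGER PRIMARY KEY,"
        " parent_id INTEGER NOT NULL"
        " REFERENCES parent(id) DEFERRABLE INITIALLY DEFERRED)"
    )
    conn.executemany("INSERT INTO parent (id) VALUES (?)", [(1,), (2,)])
    conn.execute("INSERT INTO child (id, parent_id) VALUES (1, 1)")

    conn.execute("BEGIN")
    # The FK is violated for a moment here; the check is postponed to COMMIT.
    conn.execute("DELETE FROM parent WHERE id = 1")
    conn.execute("UPDATE child SET parent_id = 2 WHERE id = 1")
    conn.execute("COMMIT")  # the constraint is satisfied again, so this succeeds

    parents = [row[0] for row in conn.execute("SELECT id FROM parent ORDER BY id")]
    children = conn.execute("SELECT id, parent_id FROM child ORDER BY id").fetchall()
    conn.close()
    return parents, children
```

Whether your own database offers something equivalent is worth checking in its documentation; not every product does.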

You have to take their context into account. MS said
When updating related tables in a dataset, it is important to update
in the proper sequence to reduce the chance of violating referential
integrity constraints.
in the context of writing client data application software.
Why is it important to reduce the chance of violating referential integrity constraints? Because violating those constraints means
more round trips between the dbms and the client, either for the client code to handle the constraint violations, or for the human user to handle the violations,
more time taken,
more load on the server,
more opportunities for human error, and
more chances for concurrent updates to change the underlying data (possibly confusing either the application code, the human user, or both).
And why do they consider their procedure the right way? Because it provides a single process that will avoid referential integrity violations in almost all the common cases, and even in a lot of the uncommon ones. For example . . .
If the update is a DELETE operation on the referenced table, and if foreign keys in the referencing tables are declared as ON DELETE CASCADE, then the optimal thing is to simply delete the referenced row (the parent row), and let the dbms manage the cascade. (This is also the optimal thing for ON DELETE SET DEFAULT, and for ON DELETE SET NULL.)
If the update is a DELETE operation on the referenced table, and if foreign keys in the referencing tables are declared as ON DELETE RESTRICT, then the optimal thing is to delete all the referencing rows (child rows) first, then delete the referenced row.
But, with proper use of transactions, MS's procedure leaves the database in a consistent state regardless. The value is that it's a single, client-side process to code and to maintain, even though it's not optimal in all cases. (That's often the case in software design--choosing a single way that's not optimal in all cases. ActiveRecord leaps to mind.)
You said
Example: ParentTable has two records, parent1 (Id: 1) and parent2 (Id: 2).
ChildTable has one record, child1 (Id: 1, ParentId: 1).
Suppose we update child1 to have a new parent, parent2, and then delete parent1.
We have nothing to delete in the child table.
We delete parent1: this breaks the constraint, because the child is still attached to parent1 unless we update it first.
That's not a referential integrity issue; it's a procedural issue. This problem clearly requires two transactions.
Update the child to have a new parent, then commit. This data must be corrected regardless of what happens to the first parent. Specifically, this data must be corrected even if there are concurrent updates or other constraints that make it either temporarily or permanently impossible to delete the first parent. (This isn't a referential integrity issue, because there's no ON DELETE SET TO NEXT PARENT ID OR MAKE YOUR BEST GUESS clause in SQL foreign key constraints.)
Delete the first parent, then commit. This might require first updating any number of child rows in any number of tables. In a huge organization, I can imagine some deletes like this taking weeks to finish.

Sounds to me like:
Insert parent2. Child still points to parent1.
Update child to point to parent2. Now nothing references parent1.
Delete parent1.
You'd want to wrap it in a transaction where available.
Depending on your schema, you could also extend this to:
Update parent1 to indicate that it is locked (or lock it in the DB), thus preventing updates.
Insert parent2
Update child to point to parent2
Delete parent1
This order has the advantage that a join between the parent and child will return a consistent result throughout. When the child is updated, the results of a join will "flip" to the new state.
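A rough sketch of the first sequence (insert parent2, re-point the child, delete parent1) inside one transaction. SQLite via Python's sqlite3 module is an assumption made only so the example can run; the table and column names are hypothetical. The foreign key here is an ordinary, immediately checked one, so the order of the statements matters.

```python
import sqlite3


def reparent_then_delete():
    conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transactions
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE child ("
        " id INTEGER PRIMARY KEY,"
        " parent_id INTEGER NOT NULL REFERENCES parent(id))"
    )
    conn.execute("INSERT INTO parent (id) VALUES (1)")
    conn.execute("INSERT INTO child (id, parent_id) VALUES (1, 1)")

    conn.execute("BEGIN")
    try:
        conn.execute("INSERT INTO parent (id) VALUES (2)")           # child still points to 1
        conn.execute("UPDATE child SET parent_id = 2 WHERE id = 1")  # nothing references 1 now
        conn.execute("DELETE FROM parent WHERE id = 1")              # safe to delete
        conn.execute("COMMIT")
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")  # leave the data untouched if any step fails
        raise

    parents = [row[0] for row in conn.execute("SELECT id FROM parent ORDER BY id")]
    children = conn.execute("SELECT id, parent_id FROM child ORDER BY id").fetchall()
    conn.close()
    return parents, children
```

Running the DELETE before the UPDATE would raise an IntegrityError in this setup, which is exactly the situation the question describes.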
EDIT:
Another option is to move the parent/child references into another table, e.g. "links":
CREATE TABLE links (
link_id INT NOT NULL IDENTITY(1,1) PRIMARY KEY,
parent_id INT NOT NULL,
child_id INT NOT NULL
);
You may well want foreign key constraints on the parent and child columns, as well as some appropriate indices. This arrangement allows for very flexible relationships between the parent and child tables (possibly too flexible, but that depends on your application). Now you can do something like:
UPDATE links
SET parent_id = @new_parent_id
WHERE parent_id = @old_parent_id
AND child_id = @child_id;

The need to DELETE a parent record without deleting the child records is unusual enough that I am certain the normally prescribed order of dataset operations defined by MS does not apply in this case.
The most efficient method would be to UPDATE the child records to reflect the new parent, then DELETE the original parent. As others have mentioned, this operation should be performed within a transaction.

I think separating actions on tables is not a good design, so my solution is
insert/update/delete parent table
insert/update/delete child table
The key point is that you should not change the ParentId of a child record; instead, delete the child of parent1 and add a new child to parent2. That way you no longer have to worry about breaking the constraint. And of course you must use a transaction.

The MSDN claim is correct on the basis of dependencies (foreign keys). Think of the order as
Child table: delete records (cascade delete).
Parent table: insert and/or update and/or delete records, the delete being the final step of the cascade delete.
Child table: insert or update records.
Since we are talking about cascade delete, we must guarantee that any child records relating to a parent are deleted before the parent record itself is deleted. If there are no child records, there is nothing to delete at the child level. That's all.
On the other hand, you may approach your case in different ways. I think that a real-life (almost) scenario will be more helpful. Let's assume that the parent table is the master part of orders (orderID, clientID, etc.) and the child table is the details part (detailID, orderID, productOrServiceID, etc.). So you get an order and you have the following
Parent table
orderID = 1 (auto increment)
...
Child table
detailID = 1 (auto increment)
orderID = 1
productOrServiceID = 342
and
detailID = 2
orderID = 1
productOrServiceID = 169
and
detailID = 3
orderID = 1
productOrServiceID = 307
So we have one order for three products/services. Now your client wants you to move the second product or service to a new order and deliver it later. You have two options to do this.
The first one (direct)
Create a new order (new parent record) that gets orderID = 2
Update child table by setting orderID = 2 where orderID = 1 and productOrServiceID = 169
As a result you will have
Parent table
orderID = 1 (auto increment)
...
and
orderID = 2
...
Child table
detailID = 1 (auto increment)
orderID = 1
productOrServiceID = 342
and
detailID = 2
orderID = 2
productOrServiceID = 169
and
detailID = 3
orderID = 1
productOrServiceID = 307
The second one (indirect)
Keep a DataRow of the second product/service from child table as a variable
Delete the corresponding row from child table
Create a new order (new parent record) that gets orderID = 2
Insert the kept DataRow on child table by changing the field orderID from 1 to 2
As a result you will have
Parent table
orderID = 1 (auto increment)
...
and
orderID = 2
...
Child table
detailID = 1 (auto increment)
orderID = 1
productOrServiceID = 342
and
detailID = 3
orderID = 1
productOrServiceID = 307
and
detailID = 4
orderID = 2
productOrServiceID = 169
The reason for the second option, which is by the way the preferable one for many applications, is that it gives raw sequences of detail IDs for each parent record. I have seen cases that expand the second option by recreating all detail records. I think it is quite easy to find open source solutions for this case and check their implementation.
Finally, my personal advice is to avoid doing this kind of thing with datasets unless your application is single-user. Databases can easily handle this "problem" in a thread-safe way with transactions.

Related

Change Lookup(master) table rows to "readonly" when already in use

We have many lookup tables in the system. If a lookup row is already referenced by another table, we shouldn't be allowed to update or delete its "value" column (e.g. EnrollStatusName in the table below).
Example:
Lookup table: EnrollStatus
ID | EnrollStatusName
1  | Pending
2  | Approved
3  | Rejected
Other table: UserRegistration
URID | EnrollStatusID (FK)
11   | 1
12   | 1
13   | 2
In this example I can edit lookup table row 3, since it isn't referenced anywhere.
The solution that comes to mind is to add a read-only column to the lookup table and, whenever there is a DML operation on the UserRegistration table, set the read-only column to true. Is there a better approach? It can be handled either in application code or in SQL, hence I'm tagging c# as well to learn the possibilities.

Delete is easy: just establish a foreign key relationship from some other table, and don't cascade or SET NULL. It's then no longer possible to delete the in-use row, because it has dependent rows in other tables.
Update is perhaps trickier. You can use the same mechanism, and I think it's neatest: instead of doing the update as an update, do it as a delete and insert. If the row is in use, the foreign key will prevent the delete.
Belayer pointed out in the comments that you can use UPDATE also; you'll have to include the PK column in the list of columns you set, and you can't set it to the value it already has, nor to a value that is already in use. You'll probably need a strategy like two updates in a row if you want to have a controlled list of IDs:
UPDATE EnrollStatus SET id=-id, EnrollStatusName='whatever' WHERE id=3
UPDATE EnrollStatus SET id=-id WHERE id=-3
A strategy of flipping it negative and then back to positive will work only if the row is not in use. If it is in use, the first statement will error out.
If you don't care that your PKs end up as a mix of positives and negatives (and you shouldn't, but people do seem to care more than they should about what values PKs have), you can forgo the second update; you can always insert new values as positive incrementing ones and flip-flop them while they're being edited, before they're brought into use.
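To see the negative-then-positive flip in action, here is a small sketch. It assumes SQLite through Python's sqlite3 module only so that it can run as-is (the question does not name a DBMS), and the helper names make_lookup_db and rename_if_unused are invented for the example. The table and column names come from the question.

```python
import sqlite3


def make_lookup_db():
    conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transactions
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE EnrollStatus ("
        " ID INTEGER PRIMARY KEY, EnrollStatusName TEXT NOT NULL)"
    )
    # No ON DELETE / ON UPDATE action: an in-use row can be neither deleted nor re-keyed.
    conn.execute(
        "CREATE TABLE UserRegistration ("
        " URID INTEGER PRIMARY KEY,"
        " EnrollStatusID INTEGER NOT NULL REFERENCES EnrollStatus(ID))"
    )
    conn.executemany(
        "INSERT INTO EnrollStatus VALUES (?, ?)",
        [(1, "Pending"), (2, "Approved"), (3, "Rejected")],
    )
    conn.executemany(
        "INSERT INTO UserRegistration VALUES (?, ?)", [(11, 1), (12, 1), (13, 2)]
    )
    return conn


def rename_if_unused(conn, status_id, new_name):
    """Rename a lookup row; return False if a foreign key shows it is in use."""
    conn.execute("BEGIN")
    try:
        # Changing the key fails when any UserRegistration row references it.
        conn.execute(
            "UPDATE EnrollStatus SET ID = -ID, EnrollStatusName = ? WHERE ID = ?",
            (new_name, status_id),
        )
        # Flip the key back so the list of IDs stays tidy.
        conn.execute("UPDATE EnrollStatus SET ID = -ID WHERE ID = ?", (-status_id,))
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")
        return False
```

With the sample data from the question, renaming row 3 goes through, while renaming row 1 is rejected because registrations 11 and 12 still point at it.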

Tables in SQL using C#

When you have a child table and a parent table, I know that if I use the cascade option,
removing a row from the parent table will also remove the corresponding child rows.
But what about removing a row from the child table first: will that also work?
Will the corresponding row in the parent table be removed?
Is it possible at all?
Edit: I have a database for a game which include:
playersTbl (PK = PlayerID)
GamesTbl (PK = GameID)
GameMovesTbl (PK = MoveID,FK = GameID)
When the user starts the game, he must register for it (he can play alone or as part of a group that plays the game).
What I want is this: when there is just one player in a game
and the user wants to delete this player from the db, it should remove the record from playersTbl, the corresponding game, and the game moves.
Edit: Right now, playersTbl and gamesTbl are strangers to each other,
so the best solution I see is to create a new table that joins those two tables.
Now my DB looks like:
PlayersTbl (PK = PlayerID)
JoinTbl (FK = PlayerID, GameID)
GamesTbl (PK = GameID)
GameMovesTbl (PK = MoveID,FK = GameID)
so if I'm using the cascade option it means that:
PlayersTbl is parent table of JoinTbl
JoinTbl is parent table of GamesTbl
GamesTbl is parent table of GameMovesTbl
But whenever the user deletes a player, it is removed only from PlayersTbl and JoinTbl. So my question is: what are the best relationships between those tables so that the delete works properly?

No, cascading deletes won't remove the parent when the child is deleted... nor should they.
Take for example, a Customer table (the parent) and an Order table (the child). Just because a single Order row is deleted, that doesn't mean that the Customer row that owned the Order should be removed as well. The Customer may have had dozens of other Orders...
If you want deletes to cascade from parent to child, and from child to parent, this would seem to indicate a one-to-one relationship between your tables... at which point you should ask yourself if they really should be two tables or combined into one.
EDIT:
@Elior In your scenario, the Player should be the parent of Game, and Game should be the parent of GameMoves. If you want to remove Player, delete the row, and with cascading deletes enabled, Game rows associated with the Player will be removed, which will then cascade down to remove GameMoves associated with the Game which was removed.
All the cascades are from parent to child... you don't need to remove any parents based on children being removed.
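The parent-to-child chain described above might look like the following sketch. SQLite through Python's sqlite3 module is assumed only to keep the example runnable, and the PlayerID column on GamesTbl is hypothetical (the question's schema does not have one; it is what "Player should be the parent of Game" implies).

```python
import sqlite3


def delete_player_cascades():
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")  # cascades need FK enforcement switched on
    conn.execute("CREATE TABLE PlayersTbl (PlayerID INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE GamesTbl ("
        " GameID INTEGER PRIMARY KEY,"
        " PlayerID INTEGER NOT NULL"
        " REFERENCES PlayersTbl(PlayerID) ON DELETE CASCADE)"
    )
    conn.execute(
        "CREATE TABLE GameMovesTbl ("
        " MoveID INTEGER PRIMARY KEY,"
        " GameID INTEGER NOT NULL"
        " REFERENCES GamesTbl(GameID) ON DELETE CASCADE)"
    )
    conn.execute("INSERT INTO PlayersTbl VALUES (1)")
    conn.execute("INSERT INTO GamesTbl VALUES (10, 1)")
    conn.executemany("INSERT INTO GameMovesTbl VALUES (?, ?)", [(100, 10), (101, 10)])

    # One delete on the top-level parent; the DBMS removes the game and its moves.
    conn.execute("DELETE FROM PlayersTbl WHERE PlayerID = 1")

    counts = tuple(
        conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        for table in ("PlayersTbl", "GamesTbl", "GameMovesTbl")
    )
    conn.close()
    return counts
```

Note that this only fits the single-player case from the question; with several players per game you would need the join table, and deleting one player should not take the game with it.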

I think your best bet is to handle this requirement in the application and not in referential integrity using SQL triggers. If your app uses stored procedures it would be fine to put that logic there. Referential integrity is meant to prevent orphaned child records, not to ensure that all parents have child records.
Technically you could create a trigger that deletes the parent if ALL children are deleted, but in the event that there ever IS a case where you want a parent with no children, the triggers will prevent you from doing that.
It's better to have that rule higher up the stack - either in the application or in a stored procedure (if you're using them).

Database Design Deleting parent and his children in nested Parent-child situation

In Oracle I have a table called Category consisting of three columns:
ID = a system-generated unique key,
Category_name = up to 300 characters,
and parent_id = either -1, which means the category has no parent, or a value from the ID column described above.
The problem is that when I delete a category that is a parent, I need to automatically delete all of its children as well. My question is: does SQL provide any means to do this automatically, or should I take care of it in my upper-layer language, which is C#?
For example, in a foreign key relationship between two tables, I know SQL provides ON DELETE CASCADE to delete the dependent records along with the parent record when the original record is deleted.
However, I don't know of any way in SQL to handle the above situation automatically, meaning that when a parent is deleted in the above table, all of its children are deleted as well.
Thanks in advance for your help.

If the parent_id was set to NULL if there was no parent, you could define a foreign key on category that referenced the primary key in category
SQL> create table category (
  id number primary key,
  category_name varchar2(300),
  parent_id number references category( id )
);
Table created.
You could then declare that foreign key constraint to automatically delete the children when the parent row is deleted.
If you really want to use a magic value of -1 to indicate the absence of a parent rather than using a proper NULL, you could potentially insert a row into the category table with an id of -1 and then create the foreign key constraint. But that is much less elegant than using a NULL.
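As a rough illustration of the self-referencing foreign key with a cascading delete, here is a sketch that runs against SQLite through Python's sqlite3 module. That choice is an assumption made for the sake of a runnable example (the question is about Oracle, where the column types and tooling differ), and the sample rows are invented.

```python
import sqlite3


def delete_category_tree():
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE category ("
        " id INTEGER PRIMARY KEY,"
        " category_name TEXT,"
        # NULL parent_id means "no parent"; the FK points back at the same table.
        " parent_id INTEGER REFERENCES category(id) ON DELETE CASCADE)"
    )
    conn.executemany(
        "INSERT INTO category VALUES (?, ?, ?)",
        [
            (1, "root", None),
            (2, "child", 1),
            (3, "grandchild", 2),
            (4, "unrelated", None),
        ],
    )

    # Deleting the top category removes its children and their children too.
    conn.execute("DELETE FROM category WHERE id = 1")

    remaining = [row[0] for row in conn.execute("SELECT id FROM category ORDER BY id")]
    conn.close()
    return remaining
```

Only the unrelated category (id 4) is left afterwards, since rows 2 and 3 hang off the deleted root.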

How to maintain referential integrity when parents tables keys are changed?

I have created a C# app that makes a clone copy from an MS Access database and migrates the data to another DB server, but that will require changing the primary keys. What is the best way to maintain the referential integrity to the child tables when the parent tables keys are changed?
Thanks,
Andrew

You may already know this but your primary key column values should not be changing, much, if at all. However, that aside, you don't mention what database you are using. But with SQL Server, you can set up FK's to do what is called a cascading update. This means that if a PK value changes, all FK rows in child tables will have the value changed as well.
The following is an article describing this: http://blogs.techrepublic.com.com/datacenter/?p=128
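For a feel of what a cascading update does, here is a minimal sketch. It uses SQLite through Python's sqlite3 module as a stand-in, which is an assumption for illustration only; the answer above is about SQL Server, and the table names are made up.

```python
import sqlite3


def change_parent_key():
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE child ("
        " id INTEGER PRIMARY KEY,"
        " parent_id INTEGER NOT NULL"
        " REFERENCES parent(id) ON UPDATE CASCADE)"
    )
    conn.execute("INSERT INTO parent VALUES (1)")
    conn.executemany("INSERT INTO child VALUES (?, ?)", [(1, 1), (2, 1)])

    # Changing the parent's key rewrites the FK column in every child row.
    conn.execute("UPDATE parent SET id = 100 WHERE id = 1")

    children = conn.execute("SELECT id, parent_id FROM child ORDER BY id").fetchall()
    conn.close()
    return children
```

Both child rows end up pointing at 100 without any explicit UPDATE on the child table.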

I'm assuming you have autoIncrement set as the datatype on the PK field of an Access table and you want equivalent functionality in your new db.
Import the Access tables into destination tables with numeric, not auto-increment, data types. Then add your RI back between parent and child tables. Then edit your PK field to auto increment.

I did end up using composite primary keys, since each time the app makes a clone copy it is a "snapshot" of the entire dataset. I've therefore added a column called "Revision" and set each table's primary key to Pk = OID + REVISION.
For the child table it should reference the parent table by their primary key, which means the foreign key will also be composite. How do you achieve that relationship in Access? What I have done is in Access 2007 go to "Database Tools" -> "Relationship" and there edit the relationship so that it displays the following:
(parent Oid) 1 <--- many (child parentKey), (parent Revision) 1 <--- many (child Revision)
Please tell me if this is the way to do it. Or if someone can tell me how to achieve that using SQL commands, I'll try that too.
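In plain SQL, a composite foreign key of that shape could look roughly like the sketch below. It is written against SQLite through Python's sqlite3 module, which is an assumption made so the example can be run; Access has its own SQL dialect, so treat the DDL as an outline rather than something to paste in. The column names parent_key, oid and revision mirror the relationship described above, and the insert_child helper is invented.

```python
import sqlite3


def make_snapshot_db():
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE parent ("
        " oid INTEGER NOT NULL,"
        " revision INTEGER NOT NULL,"
        " PRIMARY KEY (oid, revision))"
    )
    conn.execute(
        "CREATE TABLE child ("
        " oid INTEGER NOT NULL,"
        " revision INTEGER NOT NULL,"
        " parent_key INTEGER NOT NULL,"
        " PRIMARY KEY (oid, revision),"
        # Both columns together must match one parent row of the same revision.
        " FOREIGN KEY (parent_key, revision) REFERENCES parent (oid, revision))"
    )
    conn.executemany("INSERT INTO parent VALUES (?, ?)", [(1, 1), (1, 2)])
    return conn


def insert_child(conn, oid, revision, parent_key):
    """Return True if the child row was accepted, False on an FK violation."""
    try:
        conn.execute(
            "INSERT INTO child (oid, revision, parent_key) VALUES (?, ?, ?)",
            (oid, revision, parent_key),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

A child at revision 1 pointing to parent 1 is accepted, while a child at revision 3 is rejected because parent (1, 3) does not exist.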
Thanks,

Linq2SQL dealing with inserts/deletes on table with unique constraints

I have a table that looks like the following:
TABLE Foo
{
Guid Id [PK],
int A [FK],
int B [FK],
int C [FK],
}
And unique constraint over A, B and C.
Now say, for example, you insert a row with a fresh PK with A = 1, B = 1, C = 1.
SubmitChanges(), all happy.
Now you edit the table.
You remove the previous entry, and insert a row with a fresh PK with A = 1, B = 1, C = 1.
SubmitChanges() BOOM! Unique key constraint SQL exception.
From what I can see, it first attempts to insert the new record, and then tries to delete the previous one. I can even understand that it is not possible to determine the order in which this needs to happen.
But what can I do about it? Would making those 3 fields a composite PK (and removing the old one) be a better solution, or won't it even work?
For now, the 'solution' is to remove the unique constraints from the DB (but I'd rather not do so).

One option would be to create a transaction (either a connection-bound transaction, or a TransactionScope) - remove the record and SubmitChanges, add the record and SubmitChanges, then finally commit the transaction (or roll-back if you blew up).
Note that you can associate a connection-bound transaction through the data-context constructor IIRC. TransactionScope should also work, and is easier to do - but not quite as efficient.
Alternatively, write an SP that does this swap job at the database, and access that SP via the data-context.
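At the database level, the delete-then-insert-in-one-transaction idea boils down to something like the sketch below. It is not LINQ to SQL: it uses SQLite through Python's sqlite3 module as an assumed stand-in to show the statement order, and the replace_row helper is invented. Ids are stored as text UUIDs to mimic the Guid PK.

```python
import sqlite3
import uuid


def make_foo_db():
    conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transactions
    conn.execute(
        "CREATE TABLE Foo ("
        " Id TEXT PRIMARY KEY,"
        " A INTEGER NOT NULL, B INTEGER NOT NULL, C INTEGER NOT NULL,"
        " UNIQUE (A, B, C))"
    )
    return conn


def replace_row(conn, old_id, a, b, c):
    """Delete the old row first, then insert a fresh one, atomically."""
    new_id = str(uuid.uuid4())
    conn.execute("BEGIN")
    try:
        # Delete first so the unique (A, B, C) slot is free for the insert.
        conn.execute("DELETE FROM Foo WHERE Id = ?", (old_id,))
        conn.execute(
            "INSERT INTO Foo (Id, A, B, C) VALUES (?, ?, ?, ?)", (new_id, a, b, c)
        )
        conn.execute("COMMIT")
    except sqlite3.Error:
        conn.execute("ROLLBACK")  # keep the old row if anything blew up
        raise
    return new_id
```

Reversing the two statements would hit the unique constraint, which matches the exception described in the question.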

I had the same problem. Ended up writing a wrapper class with an 'Added' and 'Deleted' collection of entities that I maintained. As well as a 'Current' collection. The UI was bound to the current collection.
Only when I go to save do I InsertOnSubmit / DeleteOnSubmit, and I parse the 2 collections to decide which entities to do what to.
