Modelling group membership with a "selected" member in Database - c#
In my data model I have an entity Group and another entity GroupMember. One Group consists of one or more GroupMembers, but one GroupMember can only be in one Group at a time. So far no problem: in the database, GroupMember has a foreign key to the Group's id. However, now I want one of the members to be the "default" or "selected" member. There should always be exactly one selected member, no more and no less.
I tried modelling this in Entity Framework having one 1-* association to model the group membership and one (0..1)-1 relationship to save an instance of the selected GroupMember inside of Group.
However, now I obviously have the problem that, when inserting instances of Group and GroupMember, I get an error that Entity Framework cannot determine in which order to insert the items: Group requires a valid GroupMember as the default member, but the GroupMember cannot be inserted without referencing an existing Group entity. A chicken-and-egg problem, so to speak...
The easiest way would probably be to make one of the relationships optional, but this would remove a constraint that I would like to have during normal database operation. Ideally, Entity Framework should insert the data into the database in any order, and the database should check for constraint violations only at the end of the transaction.
Another way would be to model the selected member as a boolean property "IsSelected" in the GroupMember. However I'm not sure how to ensure that there is only one selected member at the same time using only the entity framework designer (I want to try to avoid working with the database directly).
Can you offer any guidance on what would be the preferred way to handle this? Thanks!
The correct way to model this is with an association table:
+-------+              +--------+                 +--------+
| Group |--------------| Member |-----------------| Person |
+-------+ 1          * +--------+ 1             1 +--------+
    | 1                                                | 1
    |                                                  |
    |                                                  |
    | 0..1                                             |
+--------+                                             |
| Leader |---------------------------------------------+
+--------+ 0..1
I'm pretending that "leader" is an accurate description of who is "special" in the group. You should try to use a more descriptive name than "selected".
The schema looks like this:
-- Group is a reserved word in T-SQL, so the table name is bracketed.
CREATE TABLE [Group]
(
    Id int NOT NULL PRIMARY KEY,
    ...
);
CREATE TABLE Person
(
    Id int NOT NULL PRIMARY KEY,
    ...
);
-- PersonId is the primary key: a person has at most one membership.
CREATE TABLE Member
(
    PersonId int NOT NULL PRIMARY KEY
        CONSTRAINT FK_Member_Person FOREIGN KEY REFERENCES Person (Id)
        ON UPDATE CASCADE ON DELETE CASCADE,
    GroupId int NOT NULL
        CONSTRAINT FK_Member_Group FOREIGN KEY REFERENCES [Group] (Id)
        ON UPDATE CASCADE ON DELETE CASCADE
);
CREATE INDEX IX_Member_Group ON Member (GroupId);
-- The UNIQUE constraint on GroupId allows at most one leader per group.
CREATE TABLE Leader
(
    PersonId int NOT NULL PRIMARY KEY
        CONSTRAINT FK_Leader_Person FOREIGN KEY REFERENCES Person (Id)
        ON UPDATE CASCADE ON DELETE CASCADE,
    GroupId int NOT NULL
        CONSTRAINT FK_Leader_Group FOREIGN KEY REFERENCES [Group] (Id)
        ON UPDATE CASCADE ON DELETE CASCADE,
    CONSTRAINT U_Member_Group UNIQUE (GroupId)
);
It expresses the following information about the relationships:
A group exists, period. It may or may not have members. If it has no members, then by definition it also has no leader. It still exists, because new members might be added later.
A person exists, period. A person would not cease to exist simply because his/her group does.
A person may be a member of one and only one group.
A person may also be the leader of a group. A group can only have one leader at a time. The leader of a group may or may not be considered a member.
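The four statements above can be checked against a scaled-down copy of the schema. This is only a sketch: it assumes SQLite (through Python's sqlite3 module) as a stand-in for SQL Server, and the helper names open_db, is_rejected and check_relationships are made up for illustration.

```python
import sqlite3

# Scaled-down copy of the schema; "Group" is quoted because it is a keyword.
SCHEMA = """
CREATE TABLE "Group" (Id INTEGER NOT NULL PRIMARY KEY);
CREATE TABLE Person  (Id INTEGER NOT NULL PRIMARY KEY);
CREATE TABLE Member (
    PersonId INTEGER NOT NULL PRIMARY KEY
        REFERENCES Person (Id) ON UPDATE CASCADE ON DELETE CASCADE,
    GroupId INTEGER NOT NULL
        REFERENCES "Group" (Id) ON UPDATE CASCADE ON DELETE CASCADE
);
CREATE TABLE Leader (
    PersonId INTEGER NOT NULL PRIMARY KEY
        REFERENCES Person (Id) ON UPDATE CASCADE ON DELETE CASCADE,
    GroupId INTEGER NOT NULL UNIQUE
        REFERENCES "Group" (Id) ON UPDATE CASCADE ON DELETE CASCADE
);
"""

def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.executescript(SCHEMA)
    return conn

def is_rejected(conn, sql, params):
    """Run one statement and report whether a constraint rejected it."""
    try:
        conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True

def check_relationships():
    conn = open_db()
    conn.execute('INSERT INTO "Group" (Id) VALUES (1), (2)')   # groups exist without members
    conn.execute("INSERT INTO Person (Id) VALUES (10), (11)")  # persons exist without a group
    conn.execute("INSERT INTO Member (PersonId, GroupId) VALUES (10, 1)")
    conn.execute("INSERT INTO Leader (PersonId, GroupId) VALUES (10, 1)")
    insert_member = "INSERT INTO Member (PersonId, GroupId) VALUES (?, ?)"
    insert_leader = "INSERT INTO Leader (PersonId, GroupId) VALUES (?, ?)"
    return {
        # Person 10 is already a member of group 1 (primary key on PersonId).
        "second_membership_rejected": is_rejected(conn, insert_member, (10, 2)),
        # Group 1 already has a leader (UNIQUE on GroupId).
        "second_leader_rejected": is_rejected(conn, insert_leader, (11, 1)),
        # Person 11 is not a member of group 2, yet may lead it.
        "non_member_leader_rejected": is_rejected(conn, insert_leader, (11, 2)),
    }
```

Calling check_relationships() should report the first two inserts as rejected and the third as accepted, which matches the "may or may not be considered a member" wording above.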
You may think that the constraints imposed by this relational design are significantly looser than the ones asked about in your question. And you'd be right. That's because your question is conflating the data model with the business/domain requirements.
In addition to this model you should also have several business rules, enforced by your application, such as:
If a group has no members, it is deleted/deactivated/hidden.
If a deactivated/hidden group acquires members, it is reactivated/shown.
A person must be a member of some group. This information must be supplied when a new person is added (it does not have to be an existing group, it can be a new group). If a person's membership group is deleted, this should trigger an exception process; alternatively, do not allow a group to be deleted if it still has members.
A group which has members must have a leader. If a new person is added to an empty group, that person becomes the leader. If the leader (person) is deleted, then a new leader should be automatically selected based on some criteria, or an exception process should be triggered.
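As a rough illustration of how such rules might live in application code, here is a small in-memory sketch. The GroupDirectory class, its method names, and the "lowest id becomes the new leader" criterion are all assumptions made for the example, not part of the design above.

```python
class GroupDirectory:
    """Hypothetical in-memory stand-in for the Member and Leader tables."""

    def __init__(self):
        self.member_group = {}  # person_id -> group_id (one group per person)
        self.leader_of = {}     # group_id -> person_id (one leader per group)

    def members(self, group_id):
        return sorted(p for p, g in self.member_group.items() if g == group_id)

    def is_active(self, group_id):
        # Rule: a group without members is treated as deactivated/hidden.
        return bool(self.members(group_id))

    def add_member(self, person_id, group_id):
        if self.member_group.get(person_id) == group_id:
            return  # already a member of this group, nothing to do
        self.remove_person(person_id)  # leave the previous group, if any
        # Rule: the first person added to an empty group becomes its leader.
        if not self.members(group_id):
            self.leader_of[group_id] = person_id
        self.member_group[person_id] = group_id

    def remove_person(self, person_id):
        group_id = self.member_group.pop(person_id, None)
        if group_id is None:
            return
        if self.leader_of.get(group_id) == person_id:
            remaining = self.members(group_id)
            if remaining:
                # Assumed criterion: promote the remaining member with the lowest id.
                self.leader_of[group_id] = remaining[0]
            else:
                del self.leader_of[group_id]
```

In a real application the same checks would sit in the service layer around the ORM calls, inside one transaction, rather than in dictionaries.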
Why is this the "correct" design?
First of all because it accurately portrays the independence of entities and their relationships. Groups and persons do not actually depend on each other; it is simply your business rules dictating that you are not interested in persons without a group membership or groups without any members or leaders.
More importantly because the indexing and constraints are far cleaner:
Querying the members of a group is fast.
Querying the membership(s) of a person is fast.
Querying the leader of a group is fast.
Querying the persons who are also leaders is fast.
Deleting a group will automatically remove all group memberships/leaders.
Deleting a person will automatically remove all group memberships/leaderships.
Changing a membership is still a single UPDATE statement.
Changing a leadership is still a single UPDATE statement.
SQL Server won't complain about multiple cascade paths.
Each table has at most 2 indexes, on the columns you'd expect to be indexed.
You can easily extend this design, e.g. to accommodate different types of membership.
Changes to membership/leadership will never interfere with simple queries (such as finding a person by name).
Every ORM can handle this with no trouble at all. Generally you would treat it as a many-to-many but you might be able to implement it as nullable-one-to-one.
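Two of the points above, the single-UPDATE leadership change and the cascading delete, can be tried out quickly. The sketch below assumes SQLite through Python's sqlite3 module rather than SQL Server, and cascade_demo is a made-up helper name.

```python
import sqlite3

def cascade_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite ignores FKs unless this is set
    conn.executescript("""
        CREATE TABLE "Group" (Id INTEGER PRIMARY KEY);
        CREATE TABLE Person  (Id INTEGER PRIMARY KEY);
        CREATE TABLE Member (
            PersonId INTEGER PRIMARY KEY REFERENCES Person (Id) ON DELETE CASCADE,
            GroupId  INTEGER NOT NULL REFERENCES "Group" (Id) ON DELETE CASCADE
        );
        CREATE TABLE Leader (
            PersonId INTEGER PRIMARY KEY REFERENCES Person (Id) ON DELETE CASCADE,
            GroupId  INTEGER NOT NULL UNIQUE REFERENCES "Group" (Id) ON DELETE CASCADE
        );
        INSERT INTO "Group" VALUES (1);
        INSERT INTO Person VALUES (10), (11);
        INSERT INTO Member VALUES (10, 1), (11, 1);
        INSERT INTO Leader VALUES (10, 1);
    """)
    # Changing the leadership is a single UPDATE.
    conn.execute("UPDATE Leader SET PersonId = ? WHERE GroupId = ?", (11, 1))
    leader = conn.execute("SELECT PersonId FROM Leader WHERE GroupId = 1").fetchone()[0]
    # Deleting the group removes its memberships and leadership, but not the persons.
    conn.execute('DELETE FROM "Group" WHERE Id = 1')
    counts = tuple(
        conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        for table in ("Member", "Leader", "Person")
    )
    return leader, counts
```

Here cascade_demo() returns the new leader's id together with the row counts left in Member, Leader and Person after the group is deleted.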
All of the other solutions have some serious, fatal flaw:
Putting the GroupId on Person and LeaderId on Group results in a cycle that cannot be resolved except by making at least one of the columns nullable. You will also not be able to CASCADE one of the relationships.
Putting the GroupId on Person and an additional IsLeader on Person does not allow you to enforce the upper bound (1 leader per group) without a trigger. Technically you can with a filtered index (available from SQL Server 2008 onward), but it's still wrong-headed: the IsLeader bit does not actually designate a relationship, and if you update the GroupId but forget about IsLeader, you've just made this person the leader of an entirely different group, and probably violated the at-most-one constraint.
Some people will choose to add GroupId to Person but still maintain the Leader association table. That is a better design conceptually, but since you'll likely have a CASCADE from Group to Person, you won't be able to put a two-way CASCADE on Leader as well (you'll get the "multiple cascade paths" error if you try).
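The IsLeader pitfall described above is easy to reproduce. The sketch below assumes SQLite, whose partial indexes play the role of SQL Server's filtered indexes; the index name UX_Person_Leader and the helper is_leader_pitfall are invented for the example.

```python
import sqlite3

def is_leader_pitfall():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Person (
            Id       INTEGER PRIMARY KEY,
            GroupId  INTEGER NOT NULL,
            IsLeader INTEGER NOT NULL DEFAULT 0
        );
        -- At most one row per group may have IsLeader = 1.
        CREATE UNIQUE INDEX UX_Person_Leader ON Person (GroupId) WHERE IsLeader = 1;
        INSERT INTO Person VALUES (1, 100, 1), (2, 100, 0), (3, 200, 0);
    """)
    # The index does enforce the upper bound within a group.
    try:
        conn.execute("UPDATE Person SET IsLeader = 1 WHERE Id = 2")
        second_leader_rejected = False
    except sqlite3.IntegrityError:
        second_leader_rejected = True
    # Moving person 1 without clearing IsLeader silently transfers leadership.
    conn.execute("UPDATE Person SET GroupId = 200 WHERE Id = 1")

    def leaders(group_id):
        rows = conn.execute(
            "SELECT Id FROM Person WHERE GroupId = ? AND IsLeader = 1", (group_id,)
        ).fetchall()
        return [row[0] for row in rows]

    return second_leader_rejected, leaders(100), leaders(200)
```

After the move, group 100 has no leader and person 1 leads group 200, even though nobody ever asked for that.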
Yes, I know it's a little more work and requires you to think a little harder about what your business rules are, but trust me, this is what you want to do. Anything else will only lead to pain.
The easiest way to do this is as follows:
Declare a boolean property IsSelected on the GroupMember entity.
Add a partial class declaration to the GroupMember class (all EF entity classes are declared partial, so it's easy to extend them with custom code).
Subscribe to the 'BeforeValueChanging' event of the IsSelected property (I can't remember the exact name of the event off the top of my head, but you can be sure that EF provides something like that).
In your event handler, you then can implement the desired logic. There's no need to directly care about the database...
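The logic such a handler would carry can be sketched independently of EF. The following is plain Python, not EF code; the Group and GroupMember classes and the decision to refuse a bare deselect are assumptions made for the illustration.

```python
class Group:
    def __init__(self):
        self.members = []

class GroupMember:
    def __init__(self, name, group):
        self.name = name
        self.group = group
        self._is_selected = False
        group.members.append(self)

    @property
    def is_selected(self):
        return self._is_selected

    @is_selected.setter
    def is_selected(self, value):
        if value:
            # Selecting one member deselects every other member of the group.
            for other in self.group.members:
                if other is not self:
                    other._is_selected = False
        elif self._is_selected:
            # Deselecting the selected member would leave the group with none.
            raise ValueError("select another member instead of deselecting this one")
        self._is_selected = value
```

Note that a check like this only guards objects loaded in memory; it does nothing about rows changed by other clients, which is the trade-off of keeping the rule out of the database.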
HTH!
Related
Enforce Integrity to exactly one linking table
I have a table, Order, that needs to be linked to exactly one table, Company, Person, Other. Order -> CompanyOrder (LinkTable) -> Company Order -> PersonOrder (LinkTable) -> Person Order -> OtherOrder (LinkTable) -> Other It may appear that there could be an ownership hierarchy (i.e. A company could have a person who could have an order), but the naming convention above is not representative of the actual domain - we need to specifically have a CompanyOrder, PersonOrder, OtherOrder. The link table may contain contextual information (i.e. if a company has something specific related to an order). The structure was done as there are many tables that directly related to Order, and otherwise would need to be replicated for each kind i.e. CompanyOrderItem, PersonOrderItem, OtherOrderItem. With SQL Server / EF Core at my disposal is there a means to ensure that only CompanyOrder, PersonOrder, OtherOrder has a foreign key to my Order table? I've attempted to enforce a form of referential integrity in EF by having a change request: OrderChangeRequest { OrderType, ContextId } which then can translate into populating CompanyOrder, PersonOrder, or OtherOrder based on the OrderType.
Entity Framework Navigation Property for a field that can store a foreign key from two different tables
I need a table called Loan. The columns of this table will be: Id, Type, BorrowerId, Description. The trick here is that the Type field will determine whether the borrower is an Employee or a Customer, and then the BorrowerId will be either an Id from the Employee table or an Id from the Customer table. Is this bad design as far as Entity Framework is concerned? The reason I ask is that it seems I won't be able to create a Borrower navigation property on the Loan table, since the Id can come from two tables. Does anyone have a solution for this? For example, how can I change my data models to work with navigation properties?
A simple answer to your question is "Yes it's a bad design". Referential Integrity should be strictly enforced and when you remove that ability by alternating the reference you create a window for errors. If you want two options create two columns, and create foreign keys on each to the tables they reference. Then your application will be effectively foolproof. :D
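A minimal sketch of the two-column layout follows, assuming SQLite through Python's sqlite3 module. The CHECK constraint that requires exactly one of the two columns to be filled is an addition of this example and is not part of the answer above; the helper names are made up as well.

```python
import sqlite3

def open_loan_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite ignores FKs unless this is set
    conn.executescript("""
        CREATE TABLE Employee (Id INTEGER PRIMARY KEY);
        CREATE TABLE Customer (Id INTEGER PRIMARY KEY);
        CREATE TABLE Loan (
            Id          INTEGER PRIMARY KEY,
            EmployeeId  INTEGER REFERENCES Employee (Id),
            CustomerId  INTEGER REFERENCES Customer (Id),
            Description TEXT,
            -- Assumed extra rule: exactly one borrower column is set.
            CHECK ((EmployeeId IS NULL) <> (CustomerId IS NULL))
        );
        INSERT INTO Employee VALUES (1);
        INSERT INTO Customer VALUES (1);
    """)
    return conn

def loan_rejected(conn, employee_id, customer_id):
    """Try to insert a loan and report whether a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO Loan (EmployeeId, CustomerId, Description) VALUES (?, ?, ?)",
            (employee_id, customer_id, "example"),
        )
        return False
    except sqlite3.IntegrityError:
        return True
```

With this layout each column can be mapped to its own navigation property, and the database rejects a loan that names both borrowers, neither, or one that does not exist.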
SQL 2012 Computed Column Foreign Key Exclusive Or With Ten Other Tables
I have eleven tables. Call one of them the Parent table, and the other ten are Child tables, perhaps ChildA, ChildB, etc through ChildJ Consider the Parent table to be abstracting a piece of electronics. Every piece of electronics has some common columns, like a name, but each different type of electronic device has widely differing properties. The columns needed to represent a TV are greatly different than, say, a cell phone. The parent may have one and only one child that exists in one of the Child tables, but will not be associated with more than one. Ergo, if Parent has a ChildA, it won't have a ChildB through ChildJ. The way that I have currently implemented this is through one "ChildType" column, one "ChildId" field, and ten persisted computed columns. For example, I assign (arbitrarily) the value 1 for the ChildType of ChildA, 2 for ChildB, etc. (there is a CHECK constraint on ChildType) I then create persisted columns using CASE to give the Parent table a ChildAId, ChildBId, etc by using the Type column. that is, ChildAId AS CASE WHEN ChildType=1 THEN ItemId END PERSISTED, ChildBId AS CASE WHEN ChildType=2 THEN ItemId END PERSISTED, .... etc These computed columns are persisted, as I need to use them in FOREIGN KEY constraints. The contents of the various Child tables are so different as to be completely unrelated to one another. In this way, I have effectively managed to represent a variant type in SQL Other ideas I had considered, and why I rejected them: Use the Id of the Parent table in the ChildX tables. Rejected because it allows more than one Child per Parent. xref tables between Parent and the various ChildX tables. Rejected because it, too, would allow multiple children per parent (and more importantly, multiple parents per child) Create a bunch of columns that represent a superset of all of the data needed for all of the child types. 
Rejected because it is stupid (also, this is what the system I am in the process of replacing did, and one of the things I am trying to avoid). Now to the actual question: While this was a great idea when there were only 2 types of children, I started to get worried when it suddenly jumped up to 10. While this will likely not get as high as 50, it might get up to 25 different child types before we're done. Also, this works really well when brought down into C# through Entity Framework: effectively, a Parent row associated with a ChildA row becomes a Parent object. It is a beautiful thing really, and one of the main reasons I picked what I picked. Is there a more standardized way that data of this type (basically a variant) gets represented in SQL in a way that is controlled through constraints and allows me to query it and consume it with things like Entity Framework? Is the addition of many fields (all but one of which will always be NULL) the trade-off cost of how I'm doing this? Am I not seeing a red flag that I should be seeing?
I believe a quite common way to deal with such cases is your option 1, using the Id of the Parent table in the ChildX tables, with a slight modification. Even though it introduces redundancy, that is an acceptable cost for having a clear model which is enforced by the engine. For instance:
    ParentTable(parentTableId, childType, UNIQUE(parentTableId, childType), PK(parentTableId), CHECK(childType IN (1,2,3,4,5,...)));
    ChildTable1(parentTableId, childType, other attributes, FK(parentTableId, childType), PK(parentTableId, childType), CHECK(childType=1));
    ChildTable2(parentTableId, childType, other attributes, FK(parentTableId, childType), PK(parentTableId, childType), CHECK(childType=2));
If the number of child types is very small, say 2-3, it's OK to have the check constraint in the parent table as in the example above; if you have or expect more, I'd rather create a small lookup table, ChildType, and use a foreign key to it (this way adding a new child type won't require changing the check constraint in ParentTable)...
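The composite-key layout can be exercised with two child types. This sketch assumes SQLite through Python's sqlite3 module; the helper names open_variant_db and child_rejected are illustrative only.

```python
import sqlite3

def open_variant_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite ignores FKs unless this is set
    conn.executescript("""
        CREATE TABLE ParentTable (
            parentTableId INTEGER PRIMARY KEY,
            childType     INTEGER NOT NULL CHECK (childType IN (1, 2)),
            UNIQUE (parentTableId, childType)
        );
        CREATE TABLE ChildTable1 (
            parentTableId INTEGER NOT NULL,
            childType     INTEGER NOT NULL CHECK (childType = 1),
            PRIMARY KEY (parentTableId, childType),
            FOREIGN KEY (parentTableId, childType)
                REFERENCES ParentTable (parentTableId, childType)
        );
        CREATE TABLE ChildTable2 (
            parentTableId INTEGER NOT NULL,
            childType     INTEGER NOT NULL CHECK (childType = 2),
            PRIMARY KEY (parentTableId, childType),
            FOREIGN KEY (parentTableId, childType)
                REFERENCES ParentTable (parentTableId, childType)
        );
        INSERT INTO ParentTable VALUES (1, 1);  -- parent 1 is declared as type 1
    """)
    return conn

def child_rejected(conn, table, parent_id, child_type):
    """Try to attach a child row and report whether a constraint rejected it."""
    if table not in ("ChildTable1", "ChildTable2"):
        raise ValueError("unknown child table")
    try:
        conn.execute(f"INSERT INTO {table} VALUES (?, ?)", (parent_id, child_type))
        return False
    except sqlite3.IntegrityError:
        return True
```

A parent declared as type 1 accepts a ChildTable1 row, while a ChildTable2 row fails either the foreign key (type 2 does not match the parent) or the CHECK (type 1 is not allowed in ChildTable2).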
What you describe looks to me like inheritance. There are 3 common ways to model it; what you describe is specifically similar to table-per-type (TPT). TPT is usually modeled with your option 1, i.e. sharing the PK on the derived tables with the parent table:
    Base(PK id, [type], attr1, attr2)
    Derived1(PK/FK id, attr10, attr11)
    Derived2(PK/FK id, attr20, attr21)
The type attribute on the base table is not really required but may help greatly, because it lets you know what a row actually is without doing a tentative join to all derived tables. Yes, with this model you could insert rows in multiple derived tables. People usually don't bother. You'll notice that you cannot enforce everything in a DB model. a1ex07's answer shows that, at the price of a composite PK, you can enforce this if you really want to. Since you mention EF, you'll be glad to know that it has built-in support for TPT inheritance.
Forcing a one-or-more in the ICollection<> with Entity Framework Code-First
Using Entity Framework 4.1 Code-First, I have set up a many-to-many relationship using the fluent API. If possible, I would like to force one side of the relationship to have 1 or more of the other, instead of the default 0 or more. Currently, if the tables are A and B, A can have 0 or more B's and B can have 0 or more A's, but I would like to force A to have at least 1 B. Can I do this in the data model or do I have to just put it in the business logic? Many thanks.
You can't define this restriction in the model, probably because you can't define a corresponding foreign key constraint in the database. You could introduce an additional required navigation reference from A to B to ensure that A always refers to at least one B. Table A would then need a non-nullable foreign key column to table B. But you would still have to check in business logic that this required reference is also an element in the collection of the many-to-many relationship (which is not enforced in the database: A (Id=1) could refer to B (Id=2) while there is no entry (1,2) in the join table).
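The gap described above, a required reference that is missing from the join table, can at least be detected with a query. The sketch assumes SQLite through Python's sqlite3 module, and the table and column names (A, B, AB, RequiredBId) as well as both helper functions are invented for the example.

```python
import sqlite3

def open_ab_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite ignores FKs unless this is set
    conn.executescript("""
        CREATE TABLE B (Id INTEGER PRIMARY KEY);
        CREATE TABLE A (
            Id          INTEGER PRIMARY KEY,
            RequiredBId INTEGER NOT NULL REFERENCES B (Id)  -- the extra required reference
        );
        CREATE TABLE AB (                                   -- the many-to-many join table
            AId INTEGER NOT NULL REFERENCES A (Id),
            BId INTEGER NOT NULL REFERENCES B (Id),
            PRIMARY KEY (AId, BId)
        );
    """)
    return conn

def a_rows_missing_required_link(conn):
    """Ids of A rows whose required B is not also present in the join table."""
    rows = conn.execute("""
        SELECT A.Id
        FROM A
        LEFT JOIN AB ON AB.AId = A.Id AND AB.BId = A.RequiredBId
        WHERE AB.AId IS NULL
        ORDER BY A.Id
    """).fetchall()
    return [row[0] for row in rows]
```

A check like this would run in the business layer before saving, or periodically as a consistency report, since the database itself accepts the inconsistent state.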
One-to-one self-relationship and Entity Framework
I would like to have an entity which can have a child (one or zero). This child is the same type as the parent. I am not sure how to set up Entity Framework, because I would like to have two navigation properties for every entity: one for navigating to the child and one for navigating to the parent. Basically it is exactly the same structure as a doubly linked list. I think this table structure should be enough:
    int  | id      | PK
    int  | id_next | FK
    text | data
But how can I create navigation properties for next/prev items? I am able to create only a navigation property for the next item. Thanks for the help.
You can't. The problem here is that a one-to-one relation has a very specific requirement: the FK value must be unique in the whole table. Once uniqueness is not enforced, you can add a second entity pointing to the same parent and you have a one-to-many relation. To enforce this in a self-referencing relation like the one in your example, you would place a unique index on id_next, and it will work in SQL Server. The problem is that Entity Framework doesn't support unique keys. Because of that, Entity Framework is only able to build one-to-one relations between two different entity types where the FK in the dependent entity type is also its PK (the only way to force the FK to be unique), so both entities have the same PK value. This cannot work with a self-referencing relation because you cannot have two identical PK values in one table.
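At the database level, the unique index on id_next behaves as described. The sketch below assumes SQLite through Python's sqlite3 module; the table name Node and the helper functions are made up for the illustration.

```python
import sqlite3

def build_list():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite ignores FKs unless this is set
    conn.executescript("""
        CREATE TABLE Node (
            id      INTEGER PRIMARY KEY,
            id_next INTEGER UNIQUE REFERENCES Node (id),  -- at most one predecessor per node
            data    TEXT
        );
        -- Insert tail first so every id_next already exists: 1 -> 2 -> 3
        INSERT INTO Node VALUES (3, NULL, 'c');
        INSERT INTO Node VALUES (2, 3, 'b');
        INSERT INTO Node VALUES (1, 2, 'a');
    """)
    return conn

def previous_id(conn, node_id):
    """The 'previous' direction is just a lookup on the unique id_next column."""
    row = conn.execute("SELECT id FROM Node WHERE id_next = ?", (node_id,)).fetchone()
    return row[0] if row else None

def second_predecessor_rejected(conn):
    """A second node pointing at node 2 would turn the list into a tree."""
    try:
        conn.execute("INSERT INTO Node VALUES (4, 2, 'd')")
        return False
    except sqlite3.IntegrityError:
        return True
```

One difference to keep in mind: SQLite allows many NULLs in a UNIQUE column, whereas a plain unique index in SQL Server allows only one, so there a filtered unique index (WHERE id_next IS NOT NULL) would likely be needed if several chains can end in NULL.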
You can do this in EF4 by specifying a 0..1 -> 0..1 relationship on the entity. Name one of the navigation properties "Previous" and the other "Next". This will create a hidden field on the underlying DB. I haven't thoroughly tested this approach but it seemed to work when I created the database script.
Research Tree structures in the Entity Framework. You basically want a vertical tree (i.e. one branch). The framework won't enforce only one branch, but you can manage that in your business logic.