I was curious what people's thoughts are on keeping the Id of a DAL entity as a property of the domain entity, at the absolute most as a read-only property.
My first thought was that this is OK, but the more I think about it the more I dislike the idea. After all, the domain model is supposed to be completely unaware of how data is persisted, and keeping an Id property on each domain model is a less-than-subtle indication. The persistence layer may be something that doesn't require primary keys, or another property exposed in the domain model may be a suitable candidate for identification, a model number perhaps.
But then that got me thinking, for domain models that do not have a reliable means of uniquely identifying an entry in a database persistence layer, how are they to identify entries when it comes to updating or deleting?
A dictionary based on weak reference keys could do the trick: WeakDictionary<DomainEntity, PrimaryKeyType>. This dictionary would be part of the repository implementation. Whenever a client of the repository fetches a collection of DomainEntity, a weak reference to each entity and its persistence-layer Id is stored in this internal dictionary. Then, when the time comes to return the modified entity to the repository for updating the persistence layer, the following could be done to get back the Id:
PrimaryKeyType id;
if (!weakDictionary.TryGetValue(someDomainEntity, out id))
{
    // id not found, throw exception? custom or otherwise..
    throw new InvalidOperationException("Entity was not obtained from this repository.");
}
// id found, continue happily mapping domain model back to data model.
The benefits of this approach, as I see it, are that the domain entity need not maintain its persistence-layer-specific id, and that the repository forces you to have a legitimate domain entity obtained either from some Fetch... method or from the Add/CreateNew method. Should you try to update or delete any other entity, it will throw an exception.
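A minimal sketch of what such a repository-side lookup could look like, assuming the .NET ConditionalWeakTable<TKey, TValue> class stands in for the WeakDictionary mentioned above. IdentityTracker, KeyBox and the cut-down Employee are hypothetical names used only for illustration, and a long primary key is assumed.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical domain entity that carries no persistence id of its own.
public sealed class Employee
{
    public string FirstName { get; }
    public Employee(string firstName) { FirstName = firstName; }
}

// Hypothetical repository helper: remembers the database key of every entity
// it has handed out, without keeping the entity alive.
public sealed class IdentityTracker<TEntity> where TEntity : class
{
    // ConditionalWeakTable only accepts reference-type values, so the key is boxed.
    private sealed class KeyBox
    {
        public readonly long Value;
        public KeyBox(long value) { Value = value; }
    }

    private readonly ConditionalWeakTable<TEntity, KeyBox> _keys =
        new ConditionalWeakTable<TEntity, KeyBox>();

    // Called by the repository when it fetches or adds an entity.
    public void Track(TEntity entity, long id)
    {
        _keys.Add(entity, new KeyBox(id));
    }

    // Called by the repository before an update or delete.
    public long GetId(TEntity entity)
    {
        KeyBox box;
        if (!_keys.TryGetValue(entity, out box))
            throw new InvalidOperationException("Entity was not obtained from this repository.");
        return box.Value;
    }
}
```

Note that the table matches keys by object reference, so a second instance describing the same employee would not be found.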
I'm aware that this is probably over-engineering and I should just buckle down and get pragmatic; I was just curious what other people thought about this.
I don't want to start another thread just for this minor question, as it is somewhat related. Since I have only recently started looking into DDD (though in this case my database came first), I wondered if I could confirm that I have the right mindset for domain entities. Here is a cut-down example of my Employee domain entity.
public class Employee : DomainEntity
{
    public string FirstName { get; private set; }
    public string LastName { get; private set; }
    public UserGroup Group { get; private set; }
    // etc..

    // only construct valid employees
    public Employee(string firstName, string lastName, SecureString password, UserGroup group)
    {
        // validate, assign..
    }

    // validate, update. (not sure about this one.. pulled it
    // from an open source project, I think that names should be able to be set individually).
    public void AssignName(string firstName, string lastName)
    {
        // validate, update..
    }

    // validate, update.
    public void ResetPassword(SecureString oldPassword, SecureString newPassword)
    {
        // validate, update..
    }
    // etc..
}
Thank you!
Your proposal of using weak references has one major flaw.
As you might know, domain entities have the important characteristic that they must have identity. This matters for comparison: if two entities have the same identity, then they are considered equal, regardless of the values of their other properties:
Entity1 == Entity2 ⇔ Entity1.Identity == Entity2.Identity
A typical "design pattern" would be to inherit all entities from a DomainEntity<T> abstract class, which overrides the comparison of these objects and compares by identity.
Now, consider your approach of using a weak reference look up. Let's take an example:
You fetch an Entity1, say the "Reegan Layzell" user, from a repository. Then you fetch the exact same "Reegan Layzell" entity from the repository again as Entity2. You now have the same entity in your domain as two objects, but they have different references (of course).
When comparing, these entities will not be considered equal in your domain.
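As a rough illustration of the DomainEntity<T> pattern described above, here is a sketch in which equality is decided by identity alone. The User class and its int id are assumptions made for the example, not part of the original question.

```csharp
using System;
using System.Collections.Generic;

// Illustrative base class: two entities are equal when their ids are equal.
public abstract class DomainEntity<TId> : IEquatable<DomainEntity<TId>>
{
    public TId Id { get; }

    protected DomainEntity(TId id) { Id = id; }

    public bool Equals(DomainEntity<TId> other)
    {
        if (ReferenceEquals(other, null)) return false;
        if (ReferenceEquals(this, other)) return true;
        // Entities of different concrete types are never considered equal.
        return GetType() == other.GetType()
            && EqualityComparer<TId>.Default.Equals(Id, other.Id);
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as DomainEntity<TId>);
    }

    public override int GetHashCode()
    {
        return EqualityComparer<TId>.Default.GetHashCode(Id);
    }

    public static bool operator ==(DomainEntity<TId> a, DomainEntity<TId> b)
    {
        return ReferenceEquals(a, null) ? ReferenceEquals(b, null) : a.Equals(b);
    }

    public static bool operator !=(DomainEntity<TId> a, DomainEntity<TId> b)
    {
        return !(a == b);
    }
}

// Hypothetical entity used for the comparison below.
public sealed class User : DomainEntity<int>
{
    public string Name { get; }
    public User(int id, string name) : base(id) { Name = name; }
}
```

With this in place, two separately loaded objects such as new User(7, "Reegan Layzell") and new User(7, "R. Layzell") compare as equal even though they are different references, which a lookup keyed on object references cannot reproduce.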
I admire your fear of introducing database concerns into your domain model, but propagating the database ID into your entities is hardly going to affect the quality of your models and it will save you a lot of trouble. Like you said, we need to be pragmatic.
With regards to your Employee example: Does AssignName really make sense? In reality, can an employee's name really change after creation? Other than that, it looks like you have the right idea. I highly recommend you watch this: Crafting Wicked Domain Models by Jimmy Bogard.
Related
In Domain-Driven Design, how can I hydrate the Id property when I retrieve an entity from a repository? When I create an entity for the first time (before it is persisted), I can generate a unique ID in the entity's constructor. But when I retrieve an entity from the repository, it already has an ID. How do I set the Id property in this case? Passing the ID to the entity's constructor doesn't feel right to me, but maybe it is the correct approach?
I am not using an object-relational mapping (ORM) tool.
public interface IPersonRepository
{
Person GetById(long id);
}
public abstract class Entity
{
public long Id { get; private set; }
protected Entity()
{
Id = // Generate a unique Id with some algorithm.
}
}
public sealed class Person : Entity
{
//...
}
When I CREATE the Entity for the first time (before its persistence), I can generate a unique id in Entity's constructor...
which may not be a good idea. Non-deterministic data (like time, or copies of remote mutable state) should be inputs to your domain model. In practice, you will often get away with it; but that alone doesn't make it a good idea.
The usual answer is that the repository will fetch the persisted representation of the information (a DTO, for example), and hand that off to a factory whose purpose is the construction of the entity.
So the identity of the entity becomes just another piece of information passed from the repository to the factory.
Now, "factory" here is just another life cycle pattern, and it can take many different forms, including the form of a constructor. In that case, the identifier would normally just be passed into the entity as an argument.
Identifiers in particular can be a bit weird, because they don't normally express business semantics. It's typical of the identifier pattern that they are opaque things that really only support equality comparison. Your entities almost never look at their own identifier to figure out what to do next.
But if your entity needs a reference to its own identifier, for whatever reason, you'll normally create that reference when you initialize the object, and leave it unchanged from that point forward (in other words, the entity's identifier property is an immutable reference to an immutable value).
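A small sketch of the repository-to-factory hand-off described above, under the assumption that the persisted representation is a simple DTO. PersonRecord, PersonFactory and the Name property are hypothetical names introduced only for this example.

```csharp
using System;

// Hypothetical persisted representation that the repository reads from storage.
public sealed class PersonRecord
{
    public long Id { get; set; }
    public string Name { get; set; }
}

public abstract class Entity
{
    // Assigned once at construction and never changed afterwards.
    public long Id { get; }

    protected Entity(long id) { Id = id; }
}

public sealed class Person : Entity
{
    public string Name { get; private set; }

    public Person(long id, string name) : base(id)
    {
        if (string.IsNullOrWhiteSpace(name))
            throw new ArgumentException("Name is required.", nameof(name));
        Name = name;
    }
}

// The factory is the one place that knows how a record becomes an entity.
public static class PersonFactory
{
    // New entity: the freshly generated id is an input, not computed inside the entity.
    public static Person CreateNew(long newId, string name)
    {
        return new Person(newId, name);
    }

    // Existing entity: the id is just another field of the persisted record.
    public static Person FromRecord(PersonRecord record)
    {
        return new Person(record.Id, record.Name);
    }
}
```

A GetById implementation would then load a PersonRecord by whatever means it uses and return PersonFactory.FromRecord(record).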
1) Aggregate or Entity?
I think there is some confusion in your question in terms of DDD. In general you shouldn't load entities directly. You should load an aggregate through its aggregate root (which is an entity); all other entities belonging to that aggregate should be loaded along with it.
From Evans DDD:
only AGGREGATE roots can be obtained directly with database queries. All other objects must be found by traversal of associations.
Martin Fowler:
Aggregates are the basic element of transfer of data storage - you request to load or save whole aggregates.
Aggregate Root
2) How to set an Id. It's a good idea to use immutable properties: public long Id { get; private set; }. Let's assume we are doing things correctly by using an immutable id, and go through the possible ways of setting the Id properly.
Set the id from a class method. It looks confusing to set the id of an existing entity (aggregate root). I don't suggest using this option.
Set the id from the constructor. Why not? You set the Id during the creation of the entity (aggregate root). From Evans DDD:
A public constructor must follow the same rules as a FACTORY: It must be an atomic operation that satisfies all invariants of the created object.
Use a factory. From Evans DDD:
Complex assemblies, especially of AGGREGATES, call for FACTORIES
Set the id during deserialisation. This is a clear and simple way, and the one I would choose. I would store the Id and the other data together (it's common practice). GetById(long id) then returns a Person whose Id was already set during deserialisation.
According to DDD, for each aggregate I have a repository. Let's take an example:
Client (model aggregate) > ClientRepository
Visit (model aggregate) > VisitRepository
Now, physically I have an association table in the database which connects Client and Visit, because clients can have many visits.
The question is: should I create a separate model such as ClientVisit, which would also be an aggregate:
public class ClientVisit
{
int clientId;
int visitId;
}
And also a repository such as ClientVisitRepository, which could reference/use ClientRepository and VisitRepository?
Or is it enough to stick with e.g. ClientRepository and get the data from there without an additional model and repository?
Modification to the post:
Instead of Visit (wrong example), let's use Car, so that each client can have many cars. We would also have a unique transactionNumber, so:
Client (model aggregate) > ClientRepository
Car (model aggregate) > CarRepository
Should I then create an aggregate such as:
public class ClientCar
{
int clientId;
int carId;
int transactionNumber;
}
and ClientCarRepository?
No, don't use a different repository for each entity or aggregate. You are not applying DDD completely in your modelling. You have to focus on the Ubiquitous language. Let me explain.
Repositories are meant to be nothing more than serializers and de-serializers for your entities and aggregates. There shouldn't be an intentional 1-to-1 between them. In fact, most of the time you won't have the 1-to-1. In my code, I tend to scope repositories to the bounded context or to a subcontext.
Take a trivial example: A blogging application. I might have a repository that can persist a comment. Persisting the comment means saving the comment itself and updating User's comment count. The Save(Comment comment, Usr usr) method will make two calls to my persistence mechanism to update the individual Entities or Aggregates.
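To make the blogging example concrete, here is a sketch of a repository scoped to the blogging context rather than to a single entity. InMemoryStore is a hypothetical stand-in for the real persistence mechanism, and the Comment and User shapes are assumed for illustration.

```csharp
using System;
using System.Collections.Generic;

public sealed class User
{
    public string Id { get; }
    public int CommentCount { get; private set; }

    public User(string id) { Id = id; }

    public void RecordComment() { CommentCount++; }
}

public sealed class Comment
{
    public string Id { get; }
    public string AuthorId { get; }
    public string Text { get; }

    public Comment(string id, string authorId, string text)
    {
        Id = id;
        AuthorId = authorId;
        Text = text;
    }
}

// Hypothetical stand-in for a database: a key/value map that counts writes.
public sealed class InMemoryStore
{
    public readonly Dictionary<string, object> Rows = new Dictionary<string, object>();
    public int WriteCount { get; private set; }

    public void Write(string key, object value)
    {
        Rows[key] = value;
        WriteCount++;
    }
}

// One repository for the blogging context; no 1-to-1 with a single entity type.
public sealed class BlogRepository
{
    private readonly InMemoryStore _store;

    public BlogRepository(InMemoryStore store) { _store = store; }

    // Persisting a comment touches two entities, hence two writes.
    public void Save(Comment comment, User user)
    {
        user.RecordComment();
        _store.Write("comment:" + comment.Id, comment);
        _store.Write("user:" + user.Id, user);
    }
}
```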
Repository, in the sense of domain driven design, is a life cycle management pattern. See chapter 6 of the "blue book".
Its purpose is to isolate from our application code the strategy we are using to store and retrieve aggregate roots. That's important, because the aggregate roots are the only parts of the domain code that our application code talks to directly.
From this, it follows that you don't need a repository for the client car relation unless it is the root of its own aggregate.
Figuring out whether this relation should be in its own aggregate or not is a matter of domain analysis -- you're going to have to dig into your specific domain to figure out the answer. For something like a car rental domain, I would guess that you'll want this relation, and the information associated with its life cycle, to be in a separate aggregate from the car or the customer. But I wouldn't feel confident in that guess until I had worked through a few edge cases with the domain experts.
Whether you treat an entity as an aggregate root, and thereby introduce a corresponding repository, depends on your domain and its ubiquitous language. One of the key indicators of aggregates is that they encapsulate important domain operations.
It is hard to be precise without knowing your domain. However, in your example, Client seems to be the more natural candidate for an aggregate: a client may acquire new cars, get rid of a few, etc.; the corresponding operations (i.e. adding or removing cars) fit naturally into Client.
ClientCar (or ClientVisit), on the other hand, doesn't seem to have any purpose other than retrieving cars owned by a client. For this purpose, navigating the entity should suffice; no aggregate is necessary. Your Client repository may introduce a method for this purpose like the following:
public interface ClientRepository
{
Client findById(String clientId);
void store(Client client);
IList<Cars> carsOwnedBy(String clientId);
}
The carsOwnedBy method implementation then retrieves a Client and returns only the Cars associated with it.
Most of what I've read (e.g. from the author) indicates that AutoMapper should be used to map an entity to a DTO. It should not load anything from the database.
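One way this could look in C#, sketched with an in-memory dictionary in place of a real data store. The Car and Client shapes, the Plate property and the InMemoryClientRepository name are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Car
{
    public string Plate { get; }
    public Car(string plate) { Plate = plate; }
}

// Client is the aggregate root; cars are added and removed through it.
public sealed class Client
{
    private readonly List<Car> _cars = new List<Car>();

    public string Id { get; }
    public IReadOnlyList<Car> Cars { get { return _cars; } }

    public Client(string id) { Id = id; }

    public void AddCar(Car car)
    {
        if (car == null) throw new ArgumentNullException(nameof(car));
        _cars.Add(car);
    }

    public bool RemoveCar(string plate)
    {
        return _cars.RemoveAll(c => c.Plate == plate) > 0;
    }
}

// Hypothetical in-memory repository; a real one would talk to the database.
public sealed class InMemoryClientRepository
{
    private readonly Dictionary<string, Client> _clients = new Dictionary<string, Client>();

    public void Store(Client client) { _clients[client.Id] = client; }

    public Client FindById(string clientId)
    {
        Client client;
        return _clients.TryGetValue(clientId, out client) ? client : null;
    }

    // Loads the aggregate and returns only its cars (empty list if unknown client).
    public IList<Car> CarsOwnedBy(string clientId)
    {
        var client = FindById(clientId);
        return client == null ? new List<Car>() : client.Cars.ToList();
    }
}
```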
But what if I have this:
public class Customer {
public int Id { get; set; }
public string Name { get; set; }
public virtual ICollection<Order> Orders { get; set; }
}
public class CustomerDto {
public int Id { get; set; }
public string Name { get; set; }
public IEnumerable<int> OrderIds { get; set; } // here is the problem
}
I need to map from DTO to entity (i.e. from CustomerDto to Customer), but first I must use that list of foreign keys to load corresponding entities from the database. AutoMapper can do that with a custom converter.
I agree that it doesn't feel right... but what are the alternatives? Sticking that logic into a controller, service, a repository, some manager class? All that seems to be pushing the logic somewhere else, in the same tier. And if I do that, I must also perform the mapping manually!
From a DDD perspective, the DTO should not be part of the domain. So AutoMapper is also not part of the domain, because it knows about that DTO. So AutoMapper is in the same tier as the controllers, services, etc.
So does it make sense to put the DTO-to-entity logic (which includes accessing the database, and possibly throwing exceptions) into an AutoMapper mapping?
EDIT
@ChrisSimon's great answer below explains from a DDD perspective why I shouldn't do this. From a non-DDD perspective, is there a compelling reason not to use AutoMapper to load from the db?
To start with, I'm going to summarise my understanding of Entities in DDD:
Entities can be created - often using a factory. This is the start of their life-cycle.
Entities can be mutated - have their state modified - by calling methods on the entity. This is how they progress through their lifecycle. By ensuring that the entity owns its own state, and can only have its state modified by calling its methods, the logic that controls the entity's state is all within the entity class, leading to cleaner separation of business logic and more maintainable systems.
Using AutoMapper to convert from a DTO to the entity means the entity is giving up ownership of its state. If the DTO is in an invalid state and you map that directly onto the entity, the entity may end up in an invalid state. You have then lost the value of making entities contain data plus logic, which is the foundation of the DDD entity.
To make a suggestion as to how you should approach this, I'd ask - what is the operation you are trying to achieve? DDD encourages us not to think about CRUD operations, but to think about real business processes, and to model them on our entities. In this case it looks like you are linking Orders to the Customer entity.
In an Application Service I would have a method like:
void LinkOrdersToCustomer(CustomerDto dto)
{
using (var dbTxn = _txnFactory.NewTransaction())
{
var customer = _customerRepository.Get(dto.Id);
foreach (var orderId in dto.OrderIds)
{
var order = _orderRepository.Get(orderId);
customer.LinkToOrder(order);
}
dbTxn.Save();
}
}
Within the LinkToOrder method, I would have explicit logic that did things like:
Check that order is not null
Check that the customer's state permits adding the order (are they currently active? is their account closed? etc.)
Check that the order actually does belong to the customer (what would happen if the order referenced by orderId belonged to another customer?)
Ask the order (via a method on the order entity) if it is in a valid state to be added to a customer.
Only then would I add it to the Customer's Orders collection.
This way, the application 'flow' and infrastructure management is contained within the application/services layer, but the true business logic is contained within the domain layer - within your entities.
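The checks listed above could be sketched inside the entity roughly as follows. The IsActive, IsCancelled and CustomerId members are assumptions chosen to make the checks concrete; the real rules depend on the domain.

```csharp
using System;
using System.Collections.Generic;

public sealed class Order
{
    public int Id { get; }
    public int CustomerId { get; }
    public bool IsCancelled { get; private set; }

    public Order(int id, int customerId)
    {
        Id = id;
        CustomerId = customerId;
    }

    public void Cancel() { IsCancelled = true; }

    // The order itself decides whether it may be linked to a customer.
    public bool CanBeLinked() { return !IsCancelled; }
}

public sealed class Customer
{
    private readonly List<Order> _orders = new List<Order>();

    public int Id { get; }
    public bool IsActive { get; private set; }
    public IReadOnlyCollection<Order> Orders { get { return _orders; } }

    public Customer(int id)
    {
        Id = id;
        IsActive = true;
    }

    public void Deactivate() { IsActive = false; }

    // All business rules for linking live here, not in a mapper or service.
    public void LinkToOrder(Order order)
    {
        if (order == null)
            throw new ArgumentNullException(nameof(order));
        if (!IsActive)
            throw new InvalidOperationException("Customer is not active.");
        if (order.CustomerId != Id)
            throw new InvalidOperationException("Order belongs to another customer.");
        if (!order.CanBeLinked())
            throw new InvalidOperationException("Order is not in a linkable state.");
        if (!_orders.Contains(order))
            _orders.Add(order);
    }
}
```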
If the above requirements are not relevant in your application, you may have other requirements. If not, then perhaps it is not necessary to go the route of DDD - while DDD has a lot to add, its overheads are generally only worth it in systems with lots of complex business logic.
This isn't related to the question you asked, but I'd also suggest you take a look at the modelling of Customer and Order. Are they both independent Aggregates? If so, modelling Customer as containing a collection of Order may lead to problems down the road - what happens when a customer has a million orders? Even if the collection is lazy loaded, you know at some point something will attempt to load it, and there goes your performance. There's some great reading about aggregate design here: http://dddcommunity.org/library/vernon_2011/ which recommends modelling references by Id rather than reference. In your case, you could have a collection of OrderIds, or possibly even a completely new entity to represent the link - CustomerOrderLink which would have two properties - CustomerId, and OrderId. Then none of your entities would have embedded collections.
I am a little bit confused about a problem. I have an entity Product that is represented in the database. It looks like a POCO. Here is an example (I use attributes instead of the fluent API for simplicity).
public class Product
{
[Key]
public int Id { get; set; }
//other properties that have mapping to db
}
But now I want to avoid the Anemic Domain Model anti-pattern.
So I am going to fill the Product model with methods and properties that do not map to the db, which means I have to use [Ignore].
public class Product
{
[Key]
public int Id { get; set; }
[Ignore]
public object FooProperty { get; set; }
//other properties that have mapping to db
//other properties and methods that do not have mapping to db
}
I think this approach spoils my model. In this article I've found an acceptable workaround. Its idea is to separate Product (domain model) and ProductState (state of the product that is stored in the database), so Product is a wrapper for ProductState.
I really want to know the views of other developers. Thanks a lot for your answers.
I realised that my real question is something like: "Should I separate the data model and the domain model? Can I change EF entities from anemic to rich?"
To ensure persistence ignorance of your entities, I've found EF Fluent Mapping to be better than Data Annotations. The mappings are declared in an external file, thus normally your entity doesn't have to change if something in the persistence layer changes. However, there are still some things you can't map with EF.
Vaughn's "backing state object" solution you linked to is nice, but it is an extra layer of indirection which adds a fair amount of complexity to your application. It's a matter of personal taste, but I would use it only in cases when you absolutely need stuff in your entities that cannot be mapped directly because of EF shortcomings. It also plays well with an Event Sourcing approach.
The beauty of the Entity Framework is that it allows you to map your database tables to your domain model using mappings which can be defined using the Fluent API, therefore there is no need to have separate data entities. This is in comparison to its predecessor Linq To SQL where you'd map each table to an individual data entity.
Take the following example, for the paradigm of a Student and Course - a student can take many courses, and a course can have many students, therefore a many-to-many relationship in your database design. This would consist of three tables: Student, Course, StudentToCourse.
The EF will allow you to use Fluent API mappings to create the many collections on either side of the relationship without having the intermediate table (StudentToCourse) defined in your model (StudentToCourse has no existence in a DOMAIN MODEL), you would only need two classes in your domain, Student and Course. Whereas in LinqToSQL you'd have to define all three in your model as data entities and then create mappings between your data entities and domain model resulting in lots of plumbing work open to bugs.
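A sketch of such a mapping, assuming Entity Framework 6 (the EntityFramework NuGet package) and its EntityTypeConfiguration<T> fluent API. The StudentId and CourseId column names and the SchoolContext class are assumptions for illustration.

```csharp
using System.Collections.Generic;
using System.Data.Entity;                      // EF6 package (assumed)
using System.Data.Entity.ModelConfiguration;

public class Student
{
    public int Id { get; set; }
    public string Name { get; set; }
    public virtual ICollection<Course> Courses { get; set; } = new List<Course>();
}

public class Course
{
    public int Id { get; set; }
    public string Title { get; set; }
    public virtual ICollection<Student> Students { get; set; } = new List<Student>();
}

// The mapping lives outside the entities; StudentToCourse never becomes a class.
public class StudentMap : EntityTypeConfiguration<Student>
{
    public StudentMap()
    {
        HasKey(s => s.Id);
        HasMany(s => s.Courses)
            .WithMany(c => c.Students)
            .Map(m =>
            {
                m.ToTable("StudentToCourse");
                m.MapLeftKey("StudentId");   // assumed column name
                m.MapRightKey("CourseId");   // assumed column name
            });
    }
}

public class SchoolContext : DbContext
{
    public DbSet<Student> Students { get; set; }
    public DbSet<Course> Courses { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Register the external mapping class with the model.
        modelBuilder.Configurations.Add(new StudentMap());
    }
}
```

The domain only ever sees Student.Courses and Course.Students; the join table is a detail of the mapping class.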
The argument of the anaemic vs rich domain model should have little effect on your mapping between your model and database tables, but focuses on where you place the behaviour - in either the domain model or the service layer.
I've seen 2 types of entities, like this:
public class Person
{
public int Id {get;set;}
public string Name {get;set;}
public Country Country {get;set;}
}
and like this:
public class Person
{
public int Id {get;set;}
public string Name {get;set;}
public int CountryId {get;set;}
}
I think that the 2nd approach is more lightweight, and you get related data only if you need it.
Which one do you think is better?
It depends what you want. If you only want to get the Country's ID then go for the second option. If you actually want to make use of navigation properties and/or lazy loading, then go for the first option.
Personally, I use Entity Framework and combine options one and two:
public class Person
{
public int Id {get;set;}
public string Name {get;set;}
public int CountryId {get;set;}
public Country Country {get;set;}
}
So I have a choice when it comes to returning data from my repositories. This also means that when I come to save, I can just populate the actual value type properties, instead of having to load the country object and assign it to the person.
Taken at face value, the first is an example of a rich domain model, and the second is a data driven approach. Allowing rich domain models is one of the main benefits of ORM.
The only reason I would include the CountryId (either in place of the Country, or in addition to it) would be as an optimization for some very specific performance problem. Even then I would think twice. And optimization is something you shouldn't be thinking about too much at the initial design stage. What's wrong with Person.Country.Id? (Assuming you need the id at all, and it's not just infrastructure.)
If you are looking at this from any other angle than performance optimisation, then you are probably taking the wrong approach by including 'foreign keys' in your domain model. I had the same problem when first using NHibernate, coming from an ADO type background. I would almost certainly go with the first example.
There are two considerations, Platforms and Traffic, outlined below...
All in Microsoft Platform
In multi-tier solutions where the end client is Silverlight and you share your generated code via RIA Services, or where you have a WPF client with WCF RIA Services, the first solution gives you a better design.
Non-Microsoft End Client
If your end client is a non-Microsoft client such as Flex/Flash, Java or any AJAX-based smart client, then the first model will not be useful, as it needs to track itself (self-tracking objects). The second model is preferred here.
Low Traffic applications
If network traffic is not much of an issue and the design of your software is more important, or you have highly scalable middle tiers for caching (AppFabric, etc.), the first solution is a good one and will give you a better design.
High Traffic applications
The first model will serialize more data than necessary, and that can be a real performance issue in high-traffic applications. In that case the second model will work better, because referenced data is loaded only when the user requests it.
This is quite a tradeoff issue between "Better Design" vs "Better Performance", and it needs to be selected based on parameters mentioned above and there can be more parameters depending upon complexity of project, team size, documentation and more.
Good question! For me
public List<Person> GetPersonsLivingIn(int countryId) {
return ObjectContext.Persons.Where(x => x.CountryId == countryId).ToList();
}
just looks like it works that way without knowing about all the magic (leaky) abstractions that may be present in the ORM that would make x => x.Country == country work. I came from Linq2Sql where I had some problems with the first one when passing around objects created in different object contexts.
But I would do as GenericTypeTea said and include both the id and the navigation property. After all, you'll want a navigable object graph at some point. And that way you can still make
public List<Person> GetPersonsLivingIn(Country country) {
return ObjectContext.Persons.Where(x => x.CountryId == country.CountryId).ToList();
}
which has a more OO feeling interface, but still looks like it would work without magic.
Except in some weird edge cases, there are no good reasons for the second design.
They are both equally lightweight (references are lazily loaded by default), but the second one doesn't give you navigational capabilities, which restricts and complicates your queries.
STOP!
In NHibernate, there is NO need to specify the foreign key in your domain model, not even for performance reasons.
Assuming you have lazy loading enabled (it's enabled by default), calling:
int countryId = person.Country.Id;
...won't incur a database hit to retrieve the Country entity. NHibernate will return a dynamic proxy of your Country, not an actual Country. Because of the proxy, a database hit will only occur on first access to a property of your Country entity, but NHibernate is smart enough to realise that 'person.Country.Id' is the same as accessing the country ID foreign key in your Person table, which gets loaded anyway.
However, the following code:
string countryName = person.Country.Name;
...will hit the database; the call to the 'Name' property will load the entire Country instance.
This behavior assumes you have set-up your mapping like so:
<many-to-one name="Country" class="Country" column="Country_ID" lazy="proxy" />
(note that lazy="proxy" is the default).
Simply put, there is no need to map foreign keys in your domain model with NHibernate.