I have been struggling to understand DDD. Here is a scenario that boggles me. Say we have the entity Fund which has value object allocation/holdings and historical prices. What if a service only wants allocations of a particular fund? Should we return a list of allocation objects or return a Fund entity that contains a list of allocations? If we resort to the first approach, we need to create an Allocation Repository. The second approach seems a bit weird, since the entity is being modified to return only certain value objects to the service. Without much knowledge about the entity, shouldn't the service have all fund fields accessible to it?
My description might not be accurate. Please let me know if I need to clarify my post.
    class Fund
    {
        int fundId;
        List<Allocation> allocations;
        List<Holding> holdings;
    }

    class Allocation
    {
        string type;
        string percentage;
    }
To answer the question in the title: no, you should not. The repository pattern only works if the items in the repository have identity. If an object has identity, then it is an entity, not a value object.
Value objects should be all or nothing, i.e. changing one property on a value object replaces the entire thing. Thus a value object is immutable after creation.
That is not to say that a version of a value object internal to the repository cannot have an identity, but you should not let persistence concerns alter your domain.
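The "all or nothing" rule for value objects can be sketched in C# roughly like this. It is only an illustration under assumptions: the WithPercentage method, the constructor shape, and the use of decimal for the percentage (the question uses string) are not from the original post.

```csharp
using System;

// Hypothetical value object: no identity, immutable, compared by value.
public sealed class Allocation : IEquatable<Allocation>
{
    public string Type { get; }
    public decimal Percentage { get; }   // assumption: decimal instead of the question's string

    public Allocation(string type, decimal percentage)
    {
        Type = type ?? throw new ArgumentNullException(nameof(type));
        Percentage = percentage;
    }

    // "Changing" a property produces a whole new value; the original is untouched.
    public Allocation WithPercentage(decimal percentage)
    {
        return new Allocation(Type, percentage);
    }

    public bool Equals(Allocation other)
    {
        return !(other is null) && Type == other.Type && Percentage == other.Percentage;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Allocation);
    }

    public override int GetHashCode()
    {
        return HashCode.Combine(Type, Percentage);
    }
}
```

Two Allocation instances with the same type and percentage are interchangeable here, which is exactly why a repository keyed by identity has nothing to hold on to.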
Based on your description it actually sounds like Allocation is an entity, because it is differentiable and thus has identity.
Assuming that Allocation is an entity, the question I would then be asking is should Allocation be its own aggregate.
There are multiple variations of repository implementations, but I would not mind returning a list of Allocation IF, and ONLY IF, Allocation is never managed on its own.
In other words, if you will, at some point, want to get information about an Allocation regardless of which Fund it belongs to, then you will need a repository for Allocations, and that repository should have a method like getAllocationsbyFundId(int id) or something similar. If it doesn't make sense to look at Allocations on their own without knowing which Fund they come from, then Allocations are really a part of Fund, and it makes complete sense to have a method on your Fund repository that returns the Allocations of a specific Fund.
If you, however, end up with a GetAllAllocation() method on your Fund repository, then you have slipped out of a clean pattern.
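A minimal sketch of the second option, where Allocations are only ever reached through their Fund. The interface name IFundRepository, the GetById method, and the in-memory implementation are hypothetical and only stand in for whatever persistence you actually use.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Allocation
{
    public string Type { get; set; }
    public string Percentage { get; set; }
}

public class Fund
{
    public int FundId { get; set; }
    public List<Allocation> Allocations { get; } = new List<Allocation>();
}

// Hypothetical repository for the Fund aggregate only; there is no AllocationRepository.
public interface IFundRepository
{
    Fund GetById(int fundId);

    // Scoped to one fund, so the query still goes through the aggregate.
    IReadOnlyList<Allocation> GetAllocationsByFundId(int fundId);
}

// In-memory stand-in for a real data store, for illustration only.
public class InMemoryFundRepository : IFundRepository
{
    private readonly Dictionary<int, Fund> _funds = new Dictionary<int, Fund>();

    public void Add(Fund fund)
    {
        _funds[fund.FundId] = fund;
    }

    public Fund GetById(int fundId)
    {
        return _funds.TryGetValue(fundId, out var fund) ? fund : null;
    }

    public IReadOnlyList<Allocation> GetAllocationsByFundId(int fundId)
    {
        var fund = GetById(fundId);
        // Return a copy so callers cannot change the fund's own list.
        return fund == null ? new List<Allocation>() : fund.Allocations.ToList();
    }
}
```

Note that nothing here returns allocations across all funds; adding such a method is the point where a separate Allocation repository would start to make sense.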
I may not quite understand your domain so let me know if I get this wrong. When we take the Order / OrderLine scenario we may model OrderLine as a VO (much like your Fund / Allocation). Why would we ever want to query a service to return just a list of the OrderLine objects for an Order? :)
However, if you really need to do this you should be loading the Fund instance and using its contained Allocations list. However, querying your domain model usually leads to problems (lazy-loading, fetching strategies, and moving away from tell-don't-ask). If you do need to query, consider creating a lightweight query model (some call it a read model) that performs this function.
So I concur with Mgetz that you should not have a repository of VOs. If you have a fixed list of VOs, then you could use a type of enum structure. In C# you could do this with readonly class instances. Vaughn Vernon calls this 'Standard Types' (if memory serves). I don't think you have that scenario, though.
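For the fixed-list case, the "readonly class instances" idea might look something like the sketch below. The AllocationType class, its codes, and the FromCode helper are made up for illustration and are not from the question.

```csharp
using System.Collections.Generic;

// Hypothetical fixed set of value objects exposed as readonly static instances.
public sealed class AllocationType
{
    public static readonly AllocationType Equity = new AllocationType("EQ", "Equity");
    public static readonly AllocationType Bond = new AllocationType("BD", "Bond");
    public static readonly AllocationType Cash = new AllocationType("CA", "Cash");

    // Declared after the instances so they are already initialized here.
    public static IReadOnlyList<AllocationType> All { get; } =
        new[] { Equity, Bond, Cash };

    public string Code { get; }
    public string Name { get; }

    // Private constructor: no instances beyond the ones listed above.
    private AllocationType(string code, string name)
    {
        Code = code;
        Name = name;
    }

    public static AllocationType FromCode(string code)
    {
        foreach (var type in All)
        {
            if (type.Code == code)
                return type;
        }
        throw new KeyNotFoundException("Unknown allocation type code: " + code);
    }
}
```

Because the set is closed and lives in code, no repository is needed to look these values up.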
I currently have a repository for just about every table in the database and would like to further align myself with DDD by reducing them to aggregate roots only.
Let’s assume that I have the following tables, User and Phone. Each user might have one or more phones. Without the notion of aggregate root I might do something like this:
    // Assuming I have the userId in session, for example, and I want to update a phone number
    List<Phone> phones = PhoneRepository.GetPhoneNumberByUserId(userId);
    phones[0].Number = "911";
    PhoneRepository.Update(phones[0]);
The concept of aggregate roots is easier to understand on paper than in practice. I will never have phone numbers that do not belong to a User, so would it make sense to do away with the PhoneRepository and incorporate phone related methods into the UserRepository? Assuming the answer is yes, I’m going to rewrite the prior code sample.
Am I allowed to have a method on the UserRepository that returns phone numbers? Or should it always return a reference to a User, and then traverse the relationship through the User to get to the phone numbers:
List<Phone> phones = UserRepository.GetPhoneNumbers(userId);
// Or
User user = UserRepository.GetUserWithPhoneNumbers(userId); //this method will join to Phone
Regardless of which way I acquire the phones, assuming I modified one of them, how do I go about updating them? My limited understanding is that objects under the root should be updated through the root, which would steer me towards choice #1 below. Although this will work perfectly well with Entity Framework, it seems extremely un-descriptive: reading the code, I have no idea what I'm actually updating, even though Entity Framework is keeping tabs on changed objects within the graph.
UserRepository.Update(user);
// Or
UserRepository.UpdatePhone(phone);
Lastly, assuming I have several lookup tables that are not really tied to anything, such as CountryCodes, ColorsCodes, SomethingElseCodes. I might use them to populate drop downs or for whatever other reason. Are these standalone repositories? Can they be combined into some sort of logical grouping/repository such as CodesRepository? Or is that against best practices.
You are allowed to have any method you want in your repository :) In both of the cases you mention, it makes sense to return the user with the phone list populated. Normally the user object would not be fully populated with all the sub-information (say all addresses and phone numbers), and we may have different methods for getting the user object populated with different kinds of information. Deferring the rest until it is needed is referred to as lazy loading.
    User GetUserDetailsWithPhones()
    {
        // Populate User along with Phones
    }
For updating, in this case, the user is being updated, not the phone number itself. Storage model may store the phones in different table and that way you may think that just the phones are being updated but that is not the case if you think from DDD perspective. As far as readability is concerned, while the line
UserRepository.Update(user)
alone doesn't convey what is being updated, the code above it would make that clear. It would also most likely be part of a front-end method call that signifies what is being updated.
For the lookup tables, and actually even otherwise, it is useful to have GenericRepository and use that. The custom repository can inherit from the GenericRepository.
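    public class UserRepository : GenericRepository<User>
    {
        public IEnumerable<User> GetUserByCustomCriteria()
        {
            // Query users by some custom criteria
            throw new NotImplementedException();
        }

        public User GetUserDetailsWithPhones()
        {
            // Populate User along with Phones
            throw new NotImplementedException();
        }

        public User GetUserDetailsWithAllSubInfo()
        {
            // Populate User along with all sub information e.g. phones, addresses etc.
            throw new NotImplementedException();
        }
    }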
Search for "Generic Repository Entity Framework" and you will find many nice implementations. Use one of those or write your own.
Your example on the Aggregate Root repository is perfectly fine, i.e. any entity that cannot reasonably exist without depending on another shouldn't have its own repository (in your case Phone). Without this consideration you can quickly find yourself with an explosion of repositories in a 1-1 mapping to db tables.
You should look at using the Unit of Work pattern for data changes rather than the repositories themselves as I think they're causing you some confusion around intent when it comes to persisting changes back to the db. In an EF solution the Unit of Work is essentially an interface wrapper around your EF Context.
With regards to your repository for lookup data we simply create a ReferenceDataRepository that becomes responsible for data that doesn't specifically belong to a domain entity (Countries, Colours etc).
If Phone makes no sense without User, it's an entity (if You care about its identity) or a value object, and it should always be modified through User and retrieved/updated together with it.
Think about aggregate roots as context definers - they draw local contexts but are in global context (Your application) themselves.
If You follow domain driven design, repositories are supposed to be 1:1 per aggregate roots.
No excuses.
I bet these are problems You are facing:
technical difficulties - object-relational impedance mismatch. You are struggling to persist whole object graphs with ease, and Entity Framework kind of fails to help.
domain model is data centric (as opposed to behavior centric). Because of that, You lose knowledge about object hierarchy (previously mentioned contexts) and magically everything becomes an aggregate root.
I'm not sure how to fix first problem, but I've noticed that fixing second one fixes first good enough. To understand what I mean with behavior centric, give this paper a try.
P.s. Reducing repository to aggregate root makes no sense.
P.p.s. Avoid "CodeRepositories". That leads to data centric -> procedural code.
P.p.p.s Avoid unit of work pattern. Aggregate roots should define transaction boundaries.
This is an old question, but thought worth posting a simple solution.
EF Context is already giving you both Unit of Work (tracks changes) and Repositories (in-memory reference to stuff from DB). Further abstraction is not mandatory.
Remove the DbSet<Phone> from your context class, as Phone is not an aggregate root.
Use the 'Phones' navigation property on User instead.
    static void updateNumber(int userId, string oldNumber, string newNumber)
    {
        using (MyContext uow = new MyContext()) // Unit of Work
        {
            DbSet<User> repo = uow.Users; // Repository
            User user = repo.Find(userId);
            if (user == null)
                throw new InvalidOperationException("User not found.");

            // SingleOrDefault returns null when no phone matches, so check before updating
            Phone oldPhone = user.Phones.Where(x => x.Number.Trim() == oldNumber).SingleOrDefault();
            if (oldPhone == null)
                throw new InvalidOperationException("Phone number not found for this user.");

            oldPhone.Number = newNumber;
            uow.SaveChanges();
        }
    }
If a Phone entity only makes sense together with its aggregate root User, then I think adding a new Phone record should be the responsibility of the User domain object, through a specific method (DDD behavior). That makes sense for several reasons. The immediate one is that we should check that the User object exists, since the Phone entity depends on its existence, and perhaps keep a transaction lock on it while doing more validation checks, to ensure that no other process has deleted the aggregate root before we are done validating the operation. With other kinds of aggregate roots you might want to aggregate or calculate some value and persist it in a column of the aggregate root for more efficient processing by other operations later on. Note that although I suggest the User domain object has a method that adds the Phone, this doesn't mean it should know about the existence of the database or EF. One of the great features of EF and Hibernate is that they can track changes made to entity classes transparently, and that includes new related entities added through their navigation collection properties.
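As a rough sketch of that idea, here is a User root that owns the rule for adding a Phone and knows nothing about EF or the database. The AddPhone method, the trimming, and the duplicate check are assumptions chosen for illustration; how the ORM maps the private list is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Phone
{
    public string Number { get; private set; }

    public Phone(string number)
    {
        Number = number;
    }
}

// Aggregate root: phones are added only through User, never directly.
public class User
{
    private readonly List<Phone> _phones = new List<Phone>();

    public int Id { get; }

    // Read-only view, so callers cannot bypass AddPhone.
    public IReadOnlyList<Phone> Phones => _phones.AsReadOnly();

    public User(int id)
    {
        Id = id;
    }

    public Phone AddPhone(string number)
    {
        if (string.IsNullOrWhiteSpace(number))
            throw new ArgumentException("A phone number is required.", nameof(number));

        string trimmed = number.Trim();

        // Hypothetical invariant: a user cannot hold the same number twice.
        if (_phones.Any(p => p.Number == trimmed))
            throw new InvalidOperationException("This user already has that number.");

        var phone = new Phone(trimmed);
        _phones.Add(phone);
        return phone;
    }
}
```

With a change-tracking ORM, saving the User after calling AddPhone would be expected to persist the new Phone as well, so no Phone repository is involved.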
Also, if you want methods that retrieve all phones regardless of the users owning them, you could still do it through the User repository. You only need one method that returns all users as IQueryable; then you can map them to get all user phones and run a refined query on that. So you don't even need a PhoneRepository in this case. Besides, I would rather use a class with extension methods for IQueryable that I can use anywhere, not just from a Repository class, if I wanted to abstract queries behind methods.
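A small sketch of such IQueryable extension methods follows. The names AllPhones and NumberStartsWith are made up, and the classes are bare stand-ins for the real entities; against EF the same calls would be expected to compose into a single query, but this example only runs over an in-memory list.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Phone
{
    public string Number { get; set; }
}

public class User
{
    public int Id { get; set; }
    public List<Phone> Phones { get; } = new List<Phone>();
}

// Hypothetical query helpers, usable on any IQueryable<User>, not only inside a repository.
public static class UserQueryExtensions
{
    // Flattens users into their phones, regardless of owner.
    public static IQueryable<Phone> AllPhones(this IQueryable<User> users)
    {
        return users.SelectMany(u => u.Phones);
    }

    // Further refinement that composes with AllPhones.
    public static IQueryable<Phone> NumberStartsWith(this IQueryable<Phone> phones, string prefix)
    {
        return phones.Where(p => p.Number.StartsWith(prefix));
    }
}
```

Usage would be along the lines of userRepository.GetAll().AllPhones().NumberStartsWith("555"), where GetAll is whatever method exposes the users as IQueryable.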
Just one caveat: to be able to delete Phone entities using only the domain object and not a Phone repository, you need to make sure the UserId is part of the Phone primary key. In other words, the primary key of a Phone record is a composite key made up of UserId and some other property (I suggest an auto-generated identity) in the Phone entity. This makes sense intuitively, as the Phone record is "owned" by the User record, and its removal from the User navigation collection would equal its complete removal from the database.
I'm using Linq-To-SQL for a project with around 75 tables. We have to keep a cache of entire tables that we pull down because the entities are all interrelated and pulling them on demand takes way too long. So, to track all of these entities from all of these tables, we have a single class responsible for maintaining in-memory table references. This Cache object has a different property for each of the 75 table references, and each reference caches its table on demand. For example:
    private EntityTableReference _reference;
    public EntityTableReference EntityTableReference
    {
        get
        {
            // Caches all entities from the table
            return _reference ?? (_reference = new EntityTableReference(this));
        }
    }
Now, I've seen a lot of guides saying that this really goes against the principles of OO. The Cache object doesn't do anything, it just provides a common object to pass around so that we can send a single reference to the Cache object in our function calls rather than a reference to every table that the function needs to access. This has been working really well for us and I don't see any downsides in terms of maintainability, readability, speed, etc.
Are there any criticisms against this sort of design decision? Is this a case where breaking the rules is OK because we've evaluated the advantages and disadvantages, or am I missing something here and digging myself into a hole?
One concern I can see is support for Concurrency. If a lot of processes/threads are accessing this object, the read/write operations might end up becoming a bottleneck.
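On the concurrency point: the `_reference ?? (_reference = ...)` getter in the question is not thread-safe, so two threads could each load the table. One possible way to address that, sketched here with a hypothetical TableCache<T> class and a loader delegate standing in for the real table fetch, is to let Lazy<T> handle the one-time load.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical per-table cache; the loader delegate stands in for the real database fetch.
public class TableCache<T>
{
    private readonly Lazy<IReadOnlyList<T>> _rows;

    public TableCache(Func<IReadOnlyList<T>> loader)
    {
        // ExecutionAndPublication: the loader runs at most once, even with concurrent readers.
        _rows = new Lazy<IReadOnlyList<T>>(loader, LazyThreadSafetyMode.ExecutionAndPublication);
    }

    // First access triggers the load; later accesses reuse the cached rows.
    public IReadOnlyList<T> Rows
    {
        get { return _rows.Value; }
    }
}
```

This only makes the initial load safe. If the cached entities are also modified from several threads, that still needs its own synchronization.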
I'm having difficulty wrapping my head around business objects or more specifically, business object collections.
Here's a quick example of what I'm trying to do.
If I have an Incident Object, this object can have a number of people involved and each of those Person objects can have multiple notes. Notes can't exist without a Person object and Person objects can't exist without an Incident Object.
If I have Public List<Note> notes = new List<Note>() then methods such as ADD and REMOVE become available to Person within Incident. I assume that if I was to call those methods on the Notes collection it will simply remove it from the List but not execute any code to actually add/update/delete the employee from the data source. This leads me to believe that I shouldn't use List but something else?
This also leads me to another question. Where should the actual database CRUD operations reside. Should a Note object have its own CRUD or should the Person object be responsible for it since it can't exist without it?
I'm a little lost about which way to go and I'd like to get this part right because it will be the template for the rest of the program.
Some great information has been given but one thing that you mentioned that may be confusing you is this:
"If i have Public List<Note> notes = new List<Note>() then methods such as ADD, REMOVE become available to Person within Incident."
That all depends on how you design your classes. One thing that you should think about is the way this data relates to one another. That will help you picture your class design.
It sounds like the following:
One incident can involve many people
One person can create many notes
A note is the lowest level and exists due to an incident being created and a responsible person(s) working on that incident.
Incident 1 - many Persons
Person 1 - many notes
You can do this type of relationship in a number of ways. One way may be to actually separate the objects involved, and then create joined objects.
For instance
    public class Incident {
        //insert incident fields here
        //do not add person logic / notes logic
        //probably contains only properties
    }

    public class Person {
        //insert person fields
        //private members with public properties
        //do not embed any other logic
    }

    public class Note {
        //insert note private fields
        //add public properties
        //follow the law of demeter
    }
These classes do not give details to one another, they are just repositories to store this information. You then relate these classes to one another for instance
    public class IncidentPersonnel {
        List<Person> p;
        //add methods to add a person to an incident
        //add methods to remove a person from an incident
        //....
    }
Then you may have another class handling the commenting by personnel
    public class PersonnelNotes {
        List<Note> n;
        //other methods...
    }
You can go further with this but it may complicate things but I am just giving you another idea of how to handle this.
Try to follow the law of demeter for functions
Encapsulate all of your objects, in addition, your neighbor can talk to you but not much else... This will help keep your classes loosely coupled and makes the thought process a bit simpler for you.
Finally, you mentioned how the CRUD operations should work. This all goes back to your DAL (Data Access Layer). Rather than returning rows of data from a table, you could return a referenced object with all of its attributes. Adds and removes work the same way (passing an object in or out). You can use an ORM or write your own DAL. It all depends on how involved you want to get :).
You have several different questions in one here, I will try to answer most.
In regards to problems using List<T> - the framework has a ReadOnlyCollection<T> that is useful in exactly your situation. This is a collection that does not allow adding or removing once created.
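A minimal sketch of that approach, assuming a Person that keeps its notes in a private list and exposes them through ReadOnlyCollection<T>. The AddNote and RemoveNote methods are hypothetical names for the only places where the list may change.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public class Note
{
    public string Text { get; }

    public Note(string text)
    {
        Text = text;
    }
}

public class Person
{
    private readonly List<Note> _notes = new List<Note>();

    // Read-only wrapper over the private list: callers can enumerate but not Add or Remove.
    public ReadOnlyCollection<Note> Notes
    {
        get { return _notes.AsReadOnly(); }
    }

    // Hypothetical methods: the single place to hook validation or persistence calls.
    public Note AddNote(string text)
    {
        if (string.IsNullOrWhiteSpace(text))
            throw new ArgumentException("A note needs some text.", nameof(text));

        var note = new Note(text);
        _notes.Add(note);
        return note;
    }

    public bool RemoveNote(Note note)
    {
        return _notes.Remove(note);
    }
}
```

Since Notes has no public Add or Remove, any code that wants to change the collection has to go through Person.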
In regards to CRUD operation responsibility - that should belong to your data layer, not any of your objects (see SRP - Single Responsibility Principle).
The way I do it is: each object that has child objects contains a list of them, and each object with a parent contains a property of the parent's type. Adding is done by populating an object (or a hierarchy of objects) and sending it to the DAL for persistence if desired. The CRUD operations are all in the DAL, which is agnostic of the object types but uses such types to determine which tables, columns, etc. to access. Deleting is the only thing dealt with differently: setting an object's Deleted property triggers the DAL to remove it.
Now regarding business logic - it does not reside with the objects themselves (the DAOs) but rather it is done by classes that receive or gather such DAOs when necessary, perform their work and send the DAOs back to the DAL for updates.
We are using Linq to SQL to read and write our domain objects to a SQL Server database.
We are exposing a number of services (via WCF) to do various operations. Conceptually, the implementation of these operations consists of three steps: reconstitute the necessary domain objects from the database; execute the operation on the domain objects; persist the (now changed) domain objects back to the database.
Problem is that sometimes, there are two or more instances of the same entity object, which can lead to inconsistencies when saving the objects back to the db. A little made-up example:
    public void Move(string sourceLocationId, string destinationLocationId, string itemId);
which is supposed to move the item with the given id from the source to the destination location (actual services are more complicated, often involving many locations, items etc). Now, it could be that both source and destination location id are the same - a naive implementation would just reconstitute two instances of the entity object, which would lead to problems.
This issue is now "solved" by checking for it manually, i.e. we reconstitute a first location, check if the id of the second is different from it, and if so reconstitute the second, and so on. This is obviously difficult and error-prone.
Anyway, I was actually surprised that there does not seem to be a "standard" solution for this in domain driven design. In particular, repositories or factories do not seem to solve this problem (unless they maintain their own cache, which then needs to be updated etc).
My idea would be to make a DomainContext object per operation, which tracks and caches the domain objects used in that particular method. Instead of reconstituting and saving individual domain objects, such an object would be reconstituted and saved as a whole (possibly using repositories), and it could act as a cache for the domain objects used in that particular operation.
Anyway, it seems that this is a common problem, so how is this usually dealt with? What do you think of the idea above?
The DataContext in Linq-To-Sql supports the Identity Map concept out of the box and should be caching the objects you retrieve. The objects will only be different if you are not using the same DataContext for each GetById() operation.
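To make the Identity Map concept concrete, here is a stripped-down sketch of the idea, not of how DataContext is implemented. The IdentityMap class and its load delegate are hypothetical; the delegate stands in for the actual database read.

```csharp
using System;
using System.Collections.Generic;

public class Location
{
    public string Id { get; }

    public Location(string id)
    {
        Id = id;
    }
}

// Hypothetical identity map scoped to one operation: one instance per key.
public class IdentityMap<TKey, TEntity> where TEntity : class
{
    private readonly Dictionary<TKey, TEntity> _loaded = new Dictionary<TKey, TEntity>();
    private readonly Func<TKey, TEntity> _load;

    public IdentityMap(Func<TKey, TEntity> load)
    {
        _load = load;
    }

    public TEntity Get(TKey id)
    {
        // Hit the loader only the first time an id is requested.
        if (!_loaded.TryGetValue(id, out var entity))
        {
            entity = _load(id);
            _loaded[id] = entity;
        }
        return entity;
    }
}
```

With this in place, asking for the source and destination location with the same id yields the same object, so there are no two diverging copies to save.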
Linq to Sql objects aren't really valid outside of the lifetime of the DataContext. You may find Rick Strahl's Linq to SQL DataContext Lifetime Management a good background read.
Also, the ORM is not responsible for logic in the domain. It's not going to disallow your example Move operation. That's up for the domain to decide what that means. Does it ignore it? or is it an error? It's your domain logic, and that needs to be implemented at the service boundary you are creating.
However, Linq-To-Sql does know when an object changes, and from what I've looked at, it won't record the change if you are re-assigning the same value. e.g. if Item.LocationID = 12, setting the locationID to 12 again won't trigger an update when SubmitChanges() is called.
Based on the example given, I'd be tempted to return early without ever loading an object if the source and destination are the same.
    public void Move(string sourceLocationId, string destinationLocationId, string itemId)
    {
        // Nothing to move if source and destination are the same location
        if (sourceLocationId == destinationLocationId)
            return;

        using (DataContext ctx = new DataContext())
        {
            Item item = ctx.Items.First(o => o.ItemID == itemId);
            Location destination =
                ctx.Locations.First(o => o.LocationID == destinationLocationId);
            item.Location = destination;
            ctx.SubmitChanges();
        }
    }
Another small point, which may or may not be applicable, is you should make your interfaces as chunky as possible. e.g. If you're typically going to perform 10 move operations at once, it's better to call 1 service method to perform all 10 operations at once, rather than 1 operation at a time. ref: chunky vs chatty
Many ORMs use two concepts that, if I understand you, address your issue. The first and most relevant is the Context: this is responsible for ensuring that only one object represents an entity (a database table row, in the simple case), no matter how many times or in how many ways it's requested from the database. The second is the Unit of Work: this ensures that updates to the database for a group of entities either all succeed or all fail.
Both of these are implemented by the ORM I'm most familiar with (LLBLGen Pro), however I believe NHibernate and others also implement these concepts.
Suppose I have
public class Product: Entity
{
public IList<Item> Items { get; set; }
}
Suppose I want to find an item with max something... I can add the method Product.GetMaxItemSmth() and do it with Linq ((from i in Items select i.smth).Max()) or with a manual loop or whatever. Now, the problem is that this will load the full collection into memory.
The correct solution will be to do a specific DB query, but domain entities do not have access to repositories, right? So either I do
productRepository.GetMaxItemSmth(product)
(which is ugly, no?), or even if entities have access to repositories, I use IProductRepository from entity
product.GetMaxItemSmth() { return Service.GetRepository<IProductRepository>().GetMaxItemSmth(); }
which is also ugly and is a duplication of code. I can even go fancy and do an extension
    public static IList<Item> GetMaxItemSmth(this Product product)
    {
        return Service.GetRepository<IProductRepository>().GetMaxItemSmth();
    }
which is better only because it doesn't really clutter the entity with repository... but still does method duplication.
Now, this is the problem of whether to use product.GetMaxItemSmth() or productRepository.GetMaxItemSmth(product)... again. Did I miss something in DDD? What is the correct way here? Just use productRepository.GetMaxItemSmth(product)? Is this what everyone uses and are happy with?
I just don't feel it is right... if I can't access a product's Items from the product itself, why do I need this collection in Product at all??? And then, can Product do anything useful if it can't use specific queries and access its collections without performance hits?
Of course, I can use a less efficient way and never mind, and when it's slow I'll inject repository calls into entities as an optimization... but even this doesn't sound right, does it?
One thing to mention, maybe it's not quite DDD... but I need IList in Product in order to get my DB schema generated with Fluent NHibernate. Feel free to answer in pure DDD context, though.
UPDATE: a very interesting option is described here: http://devlicio.us/blogs/billy_mccafferty/archive/2007/12/03/custom-collections-with-nhibernate-part-i-the-basics.aspx, not only to deal with DB-related collection queries, but also can help with collection access control.
Having an Items collection and having GetXXX() methods are both correct.
To be pure, your Entities shouldn't have direct access to Repositories. However, they can have an indirect reference via a Query Specification. Check out page 229 of Eric Evans' book. Something like this:
    public class Product
    {
        public IList<Item> Items { get; }

        public int GetMaxItemSmth()
        {
            return new ProductItemQuerySpecifications().GetMaxSomething(this);
        }
    }

    public class ProductItemQuerySpecifications
    {
        public int GetMaxSomething(Product product)
        {
            // Resolve the repository however you prefer (DI, Service Locator, etc.)
            var repository = MyContainer.Resolve<IProductRepository>();
            return repository.GetMaxSomething(product);
        }
    }
How you get a reference to the Repository is your choice (DI, Service Locator, etc). Whilst this removes the direct reference between Entity and Repository, it doesn't reduce the LoC.
Generally, I'd only introduce it early if I knew that the number of GetXXX() methods will cause problems in the future. Otherwise, I'd leave it for a future refactoring exercise.
I believe in terms of DDD, whenever you are having problems like this, you should first ask yourself if your entity was designed properly.
If you say that Product has a list of Items, you are saying that Items are part of the Product aggregate. That means that, if you perform data changes on the Product, you are changing the Items too. In this case, your Product and its Items are required to be transactionally consistent. That means that changes to one or the other should always cascade over the entire Product aggregate, and the change should be ATOMIC. For example, if you changed the Product's name and the name of one of its Items, and the database commit of the Item's name works but the commit of the Product's name fails, the Item's name should be rolled back.
This follows from the fact that Aggregates should represent consistency boundaries, not compositional convenience.
If it does not make sense in your domain to require changes on Items and changes on the Product to be transactionally consistent, then Product should not hold a reference to the Items.
You are still allowed to model the relationship between Product and items, you just shouldn't have a direct reference. Instead, you want to have an indirect reference, that is, Product will have a list of Item Ids.
The choice between having a direct reference and an indirect reference should be based first on the question of transactional consistency. Once you have answered that, if it seemed that you needed the transactional consistency, you must then further ask if it could lead to scalability and performance issues.
If you have too many items for too many products, this could scale and perform badly. In that case, you should consider eventual consistency. This is when you still only have an indirect reference from Product to Items, but some other mechanism guarantees that at some future point in time (hopefully as soon as possible), the Product and the Items will be in a consistent state. For example, as the Items' balances change, the Product's Total Balance increases. While the items are being altered one by one, the Product might not have exactly the right Total Balance, but as soon as all items have finished changing, the Product will update itself to reflect the new Total Balance and thus return to a consistent state.
That last choice is harder to make, you have to determine if it is acceptable to have eventual consistency in order to avoid the scalability and performance problems, or if the cost is too high and you'd rather have transactional consistency and live with the scalability and performance issues.
Now, once you have indirect references to Items, how do you perform GetMaxItemSmth()?
In this case, I believe the best way is to use the double dispatch pattern. You create an ItemProcessor class:
    public class ItemProcessor : IItemProcessor
    {
        private readonly IItemRepository _itemRepo;

        public ItemProcessor(IItemRepository itemRepo)
        {
            _itemRepo = itemRepo;
        }

        public Item GetMaxItemSmth(Product product)
        {
            // Here you are free to implement the logic as performant as possible, or as slowly
            // as you want.

            // Slow version
            //Item maxItem = _itemRepo.GetById(product.Items[0]);
            //for(int i = 1; i < product.Items.Count; i++)
            //{
            //    Item item = _itemRepo.GetById(product.Items[i]);
            //    if(item > maxItem) maxItem = item;
            //}

            // Fast version
            Item maxItem = _itemRepo.GetMaxItemSmth();
            return maxItem;
        }
    }
And its corresponding interface:
public interface IItemProcessor
{
Item GetMaxItemSmth(Product product);
}
This class is responsible for performing the logic that involves working with both your Product data and data from other related entities. It could also host any kind of complicated logic that spans multiple entities and doesn't quite fit in any one entity per se, because it requires data from several of them.
Then, on your Product entity you add:
    public class Product
    {
        private List<string> _items; // indirect reference to the Items Product is associated with

        public List<string> Items
        {
            get
            {
                return _items;
            }
        }

        public Product(List<string> items)
        {
            _items = items;
        }

        public Item GetMaxItemSmth(IItemProcessor itemProcessor)
        {
            return itemProcessor.GetMaxItemSmth(this);
        }
    }
NOTE:
If you only need to query the Max items and get a value back, not an Entity, you should bypass this method altogether. Create an IFinder that has a GetMaxItemSmth that returns your specialised read model. It's ok to have a separate model only for querying, and a set of Finder classes that perform specialized queries to retrieve such specialized read model. As you must remember, Aggregates only exist for the purpose of data change. Repositories only work on Aggregates. Therefore, if no data change, no need for either Aggregates or Repositories.
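A finder along those lines might be sketched as follows. Everything here is hypothetical: the IItemFinder interface, the ItemRow shape, and the in-memory implementation, which only stands in for a specialized database query.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical flat row, standing in for whatever the query returns from the data store.
public class ItemRow
{
    public string ProductId { get; set; }
    public int Smth { get; set; }
}

// Query-only abstraction: no aggregates, no repositories, no data change.
public interface IItemFinder
{
    // Returns null when the product has no items.
    int? FindMaxSmth(string productId);
}

// In-memory stand-in; a real finder would push the MAX into the database query.
public class InMemoryItemFinder : IItemFinder
{
    private readonly IEnumerable<ItemRow> _rows;

    public InMemoryItemFinder(IEnumerable<ItemRow> rows)
    {
        _rows = rows;
    }

    public int? FindMaxSmth(string productId)
    {
        return _rows
            .Where(r => r.ProductId == productId)
            .Select(r => (int?)r.Smth)   // nullable so an empty sequence yields null instead of throwing
            .Max();
    }
}
```

The caller gets a plain value back and never touches Product or Item entities, which keeps the read path independent of the aggregate design.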
(Disclaimer: I am just starting to get a grasp of DDD, or at least I believe I am :) )
I will second Mark on this one and emphasize 2 points that took me some time to realize.
Think about your objects in terms of aggregates, which leads to
"The point is that either you load the children together with the parent or you load them separately"
The difficult part is to think about the aggregate for your problem at hand and not to focus on the DB structure supporting it.
An example that emphasizes this point is customer.Orders. Do you really need all the orders of your customer in order to add a new order? Usually not. What if she has 1 million of them?
You might need something like OutstandingAmount or AmountBuyedLastMonth in order to fulfill some scenarios like "AcceptNewOrder" or ApplyCustomerCareProgram.
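A sketch of what that could look like: the Customer root carries a precomputed OutstandingAmount and decides on a new order without loading any Orders collection. The CreditLimit property and the acceptance rule are assumptions made up for the example.

```csharp
using System;

// Hypothetical aggregate root that keeps a summary value instead of all its orders.
public class Customer
{
    public decimal CreditLimit { get; }            // assumption: not in the original post
    public decimal OutstandingAmount { get; private set; }

    public Customer(decimal creditLimit, decimal outstandingAmount)
    {
        CreditLimit = creditLimit;
        OutstandingAmount = outstandingAmount;
    }

    // Decides using only the summary value; no customer.Orders needs to be loaded.
    public bool AcceptNewOrder(decimal orderTotal)
    {
        if (orderTotal <= 0)
            throw new ArgumentOutOfRangeException(nameof(orderTotal), "Order total must be positive.");

        if (OutstandingAmount + orderTotal > CreditLimit)
            return false;

        OutstandingAmount += orderTotal;
        return true;
    }
}
```

The orders themselves would then be stored and loaded separately, for instance through their own repository, since this scenario never needs them as a collection on the customer.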
Is the product the real aggregate root for your scenario?
What if Product is not an Aggregate Root?
i.e. are you going to manipulate the item or the product?
If it is the product, do you need the ItemWithMaxSomething or do you need MaxSomethingOfItemsInProduct?
Another myth: PI (persistence ignorance) means you don't need to think about the DB
Given that you really need the item with maxSomething in your scenario, you need to know what that means in terms of database operations in order to choose the right implementation, either through a service or a property.
For example, if a product has a huge number of items, a solution might be to record the ID of that Item with the product in the DB instead of iterating over the whole list.
The difficult part for me in DDD is defining the right aggregates. I feel more and more that if I need to rely on lazy loading, then I might have overlooked some context boundary.
Hope this helps :)
I think that this is a difficult question that has no hard and fast answer.
A key to one answer is to analyze Aggregates and Associations as discussed in Domain-Driven Design. The point is that either you load the children together with the parent or you load them separately.
When you load them together with the parent (Product in your example), the parent controls all access to the children, including retrieval and write operations. A corollary to this is that there must be no repository for the children: data access is managed by the parent's repository.
So to answer one of your questions: "why do I need this collection in Product at all?" Maybe you don't, but if you do, that would mean that Items would always be loaded when you load a Product. You could implement a Max method that would simply find the Max by looking over all Items in the list. That may not be the most performant implementation, but that would be the way to do it if Product was an Aggregate Root.
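A minimal sketch of such a Max method, assuming Item exposes a comparable numeric property (called Smth here, a made-up name) and that the Items are already loaded with the Product:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Item with a numeric property to compare on.
public class Item
{
    public string Name { get; }
    public decimal Smth { get; }

    public Item(string name, decimal smth)
    {
        Name = name;
        Smth = smth;
    }
}

// Product as Aggregate Root: the Items are loaded together with it.
public class Product
{
    private readonly List<Item> _items;

    public Product(IEnumerable<Item> items)
    {
        _items = items.ToList();
    }

    public IReadOnlyList<Item> Items => _items;

    // Linear scan over the loaded children; returns null when there are none.
    public Item GetMaxItemSmth()
    {
        Item max = null;
        foreach (var item in _items)
        {
            if (max == null || item.Smth > max.Smth) max = item;
        }
        return max;
    }
}
```

The scan is O(n) in the number of Items, which is the performance cost mentioned above.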
What if Product is not an Aggregate Root? Well, the first thing to do is to remove the Items property from Product. You will then need some sort of Service that can retrieve the Items associated with the Product. Such a Service could also have a GetMaxItemSmth method.
Something like this:
public class ProductService
{
    private readonly IItemRepository itemRepository;

    public ProductService(IItemRepository itemRepository)
    {
        this.itemRepository = itemRepository;
    }

    public IEnumerable<Item> GetMaxItemSmth(Product product)
    {
        var max = this.itemRepository.GetMaxItemSmth(product);
        // Do something interesting here
        return max;
    }
}
That is pretty close to your extension method, but with the notable difference that the repository should be an instance injected into the Service. Static stuff is never good for modeling purposes.
As it stands here, the ProductService is a pretty thin wrapper around the Repository itself, so it may be redundant. Often, however, it turns out to be a good place to add other interesting behavior, as I have tried to hint at with my code comment.
Another way you can solve this problem is to track it all in the aggregate root. If Product and Item are both part of the same aggregate, with Product being the root, then all access to the Items is controlled via Product. So in your AddItem method, compare the new Item to the current max item and replace it if need be. Maintain it where it's needed within Product so you don't have to run the SQL query at all. This is one reason why defining aggregates promotes encapsulation.
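Something along these lines, as a sketch only. The Smth property and the MaxItem name are assumptions, and removing Items is left out because it would require recomputing the max:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Item with a numeric property to compare on.
public class Item
{
    public string Name { get; }
    public decimal Smth { get; }

    public Item(string name, decimal smth)
    {
        Name = name;
        Smth = smth;
    }
}

// Product is the aggregate root: Items can only be added through it.
public class Product
{
    private readonly List<Item> _items = new List<Item>();

    public IReadOnlyList<Item> Items => _items;

    // Maintained on every AddItem; null until the first Item is added.
    public Item MaxItem { get; private set; }

    public void AddItem(Item item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));

        _items.Add(item);

        // Compare the new Item to the current max and replace it if need be.
        if (MaxItem == null || item.Smth > MaxItem.Smth)
        {
            MaxItem = item;
        }
    }
}
```

Because nothing outside Product can add to the list, MaxItem cannot drift out of sync with Items.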
Remember that NHibernate is a mapper between the database and your objects. It appears to me that your issue is that your object model is not a viable relational model, and that's OK, but you need to embrace that.
Why not map another collection to your Product entity that uses the power of your relational model to load in an efficient manner? Am I right in assuming that the logic to select this special collection is not rocket science and could easily be implemented in a filtered NHibernate mapped collection?
I know my answer has been vague, but I only understand your question in general terms. My point is that you will have problems if you treat your relational database in an object oriented manner. Tools like NHibernate exist to bridge the gap between them, not to treat them in the same way. Feel free to ask me to clarify any points I didn't make clear.
You can now do that with NHibernate 5 directly, without specific code!
It won't load the whole collection into memory.
See https://github.com/nhibernate/nhibernate-core/blob/master/releasenotes.txt
Build 5.0.0
=============================
** Highlights
...
* Entities collections can be queried with .AsQueryable() Linq extension without being fully loaded.
...