Populating domain entities from database

I have many things like this in my code (this is just one simple example):
var invoice = context.Invoices
    .ForId(invoiceId)
    .Include(i => i.Payments)
    .Include(i => i.OrderLines)
    .First();
And Invoice has a property UnpaidAmount, calculated as:
public double UnpaidAmount
{
    get
    {
        return OrderLines.Sum(ol => ol.Amount) -
               Payments.Sum(p => p.Amount);
    }
}
Now, what often happens in the project is that someone needs to modify the UnpaidAmount logic, for instance to this:
return OrderLines.Sum(ol => ol.Amount) -
       Payments.Sum(p => p.Amount) -
       CreditNotes.Sum(cn => cn.Amount);
Then they need to find every place in the project where UnpaidAmount is used and add CreditNotes to the Include calls that fetch the Invoice. People often forget that and, in this case, Sum on CreditNotes gets called on an empty collection instead of the one fetched from the database.
This becomes really buggy and hard to maintain throughout the project.
One alternative is to use lazy loading so we don't have to think about includes anywhere, but this can lead to performance problems that might not be detected during development, only later in production when the number of records fetched gets larger.
Another is to have one method that fetches the Invoice object with all of its navigation properties, recursively doing the same for navigation properties deeper in the object graph. But that would be overkill, because many things that are not needed would be fetched every time.
I assume this is a trade-off I will have to make, but I would like advice from people who have faced this kind of problem on larger projects: which solution do you think is most maintainable in the long run?

Then they would need to find everywhere in project where UnpaidAmount is used and add CreditNotes to Include when fetching Invoice.
Why does this responsibility live in more than one place?
One alternative is to use lazy loading so we don't have to think about includes anywhere, but this can lead to performance problems that might not be detected during development, only later in production when the number of records fetched gets larger.
Another is to have one method that fetches the Invoice object with all of its navigation properties, recursively doing the same for navigation properties deeper in the object graph. But that would be overkill, because many things that are not needed would be fetched every time.
What I think you are looking for is the Repository Pattern.
The basic idea:
You have one or more role interfaces that consumers can use to describe how they are going to use the Invoice (or alternatively, which view of the invoice satisfies their needs).
You provide implementations of those roles that share a common understanding of how the invoice data is stored. This means that when you introduce CreditNotes into the model, there's only one place that needs to change.
You use your plumbing (dependency injection, or whatever) to ensure that the correct implementation is provided for each role.
In short, you create an explicit contract between the consumers and the suppliers; the consumers describe what they need, the suppliers have freedom of choice in how they meet that need.
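A minimal sketch of that contract, under some assumptions: the role names (IInvoiceSummary, IInvoiceBalance) and the InvoiceFetchPlans class are hypothetical, and the include paths are kept as plain strings so the sample compiles without EF. It only shows the "one place that needs to change" part, not a full repository.

```csharp
using System;
using System.Collections.Generic;

// Role interfaces (hypothetical names): each consumer names the view of the invoice it needs.
public interface IInvoiceSummary { int Id { get; } }
public interface IInvoiceBalance { decimal UnpaidAmount { get; } }

// Supplier side: the single place that knows which navigation
// properties each role depends on.
public static class InvoiceFetchPlans
{
    private static readonly Dictionary<Type, string[]> Plans = new Dictionary<Type, string[]>
    {
        { typeof(IInvoiceSummary), new string[0] },
        // Introducing CreditNotes into UnpaidAmount means editing this line only.
        { typeof(IInvoiceBalance), new[] { "OrderLines", "Payments", "CreditNotes" } },
    };

    // Returns the include paths a query serving the given role must load.
    public static IReadOnlyList<string> For<TRole>()
    {
        string[] paths;
        if (!Plans.TryGetValue(typeof(TRole), out paths))
            throw new InvalidOperationException("No fetch plan registered for role " + typeof(TRole).Name);
        return paths;
    }
}
```

A repository method serving a role could then loop over InvoiceFetchPlans.For<TRole>() and apply each path with the string overload of Include before running the query, so callers never list includes themselves.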
Udi Dahan wrote a few posts related to this idea, back in the day.
Better Domain Driven Design Implementation
Query Objects vs Methods on a Repository
Fetching Strategy Design
Intentions and Interfaces

I can suggest introducing some kind of store class, called, for example, Invoices. It takes the DbContext in its constructor and provides all the methods you need to fetch and save invoices, so you have only one place to write and modify each query.
class Invoices
{
    public Invoices(DbContext dbContext){....}

    public Invoice GetInvoiceById(int invoiceId)
    {
        return this.dbContext.Invoices
            .ForId(invoiceId)
            .Include(i => i.Payments)
            .Include(i => i.OrderLines)
            .FirstOrDefault();
    }
    ...
}

Related

Entity Framework dependencies loading

I have a long-time burning question about how to avoid null errors with data queried via Entity Framework (version 6 - not Core yet, sadly).
Let's say you have a table Employees, and it has a relationship with another table, EmployeePayments (one employee has many employee payments).
On your Employee domain object you create a property TotalPayments which relies on you having loaded the EmployeePayments for that object.
I try to ensure that any time I do a query, I "include" the dependency, for example:
var employees = context.Employees.Include(e => e.EmployeePayments);
The problem is, I have a lot of queries around the place (I use the generic repository pattern, so I call repository functions like GetAll or GetSingle from my service library), and so that's a lot of places to remember to add the includes. If I don't include them, I run the risk of having a null exception if the TotalPayments property is used.
What's the best way to handle this?
Note 1: we have a lot of tables and I don't really want to have to revert to using specific repositories for each one, we take advantage of the generic repository in a lot of ways.... but I will be happy to hear strong arguments for the alternative :)
Note 2: I do not have lazy loading turned on, and don't plan on turning it on, for performance reasons.

This is one reason I consider the Generic Repository an anti-pattern for EF. I use a repository pattern, but scope it like I would a controller, i.e. a CreateOrderController would have a CreateOrderRepository. This repository would provide access to all relevant entities via IQueryable. Common stuff like lookups etc. would have its own secondary repository. Using generic repositories that are geared to working with a single entity type means adding references to several repositories to do specific things, and running into issues like this when attempting to load entities. Sometimes you want related data, sometimes you don't. Simply adding convenience methods to top-level entities effectively "breaks" the rule that an object should always be considered complete, or completable, without relying on lazy loading, which brings significant performance costs.
Having repositories return IQueryable avoids many of the problems by giving control to the calling code how entities are consumed. For instance I don't put helper methods in the entities, but rather code needing to populate a view model relies on Linq to build the view model. If my view model wants a sum of payments for an employee, then my repository returning IQueryable can do the following:
public IQueryable<Employee> GetEmployeeById(int employeeId)
{
    return Context.Employees.Where(x => x.EmployeeId == employeeId);
}
then in the controller / service:
using (var contextScope = ContextScopeFactory.Create())
{
    var employeeViewModel = EmployeeRepository.GetEmployeeById(employeeId)
        .Select(x => new EmployeeSummaryViewModel
        {
            EmployeeId = x.EmployeeId,
            EmployeeName = x.LastName + ", " + x.FirstName,
            TotalPayments = x.Payments.Where(p => p.IsActive).Sum(p => p.Amount)
        }).Single();
}
I use a repository because it is easier to mock out than the DbContext and its DbSets. For synchronous code I just have the mock populate a List<Employee> and return it via AsQueryable(). For async code I need to add a wrapper for an async list.
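For the synchronous case, a test double along those lines might look like the sketch below. The IEmployeeRepository interface, the FakeEmployeeRepository class and the shape of Employee and Payment are assumptions made for illustration; only the AsQueryable() idea comes from the answer.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed entity shapes, just enough for the projection used above.
public class Payment
{
    public decimal Amount { get; set; }
    public bool IsActive { get; set; }
}

public class Employee
{
    public int EmployeeId { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public List<Payment> Payments { get; set; } = new List<Payment>();
}

// Hypothetical abstraction the controller / service depends on.
public interface IEmployeeRepository
{
    IQueryable<Employee> GetEmployeeById(int employeeId);
}

// Test double: an in-memory list exposed as IQueryable, no DbContext involved.
public class FakeEmployeeRepository : IEmployeeRepository
{
    private readonly List<Employee> _employees;

    public FakeEmployeeRepository(IEnumerable<Employee> employees)
    {
        _employees = employees.ToList();
    }

    public IQueryable<Employee> GetEmployeeById(int employeeId)
    {
        return _employees.AsQueryable().Where(x => x.EmployeeId == employeeId);
    }
}
```

The calling code can run the same Select projection against this fake. Keep in mind that such a test executes LINQ to Objects rather than SQL, so it checks the projection logic but not whether EF can translate the query.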
This pattern may go against more traditional views of a repository and separation of concerns that the calling code needs to "know" about the entities, that EF-isms are leaked. However, no matter what approach you try to rationalize to get around the inefficiencies of trying to "hide" EF behind a repository, either you will be left with very inefficient code where repositories return pre-populated DTOs or contain dozens of near identical methods to return different DTOs (or worse, entities in various degrees of completeness) or you are adding complexities like passing in magic strings or expression trees into your methods to tell EF how to filter, how to sort, what to include, paging, etc. Passing in expressions or strings requires the calling code to "know" about the entities and leaks EF restrictions. (Passed in expressions / strings still have to be able to be ultimately understood by EF)
So this may not be a viable answer to the current state of your project, but it might be worth looking into whether your dependency on the repositories can be better managed without splitting them with the Generic pattern, and/or leveraging EF's excellent IQueryable / Linq capabilities to let your controllers/services project the entities into view models / DTOs rather than embedding these reducing computations (sums, counts and the like) in the entities themselves.

DDD repositories with EF explicitly loading

I'm starting to get my head into Domain Driven Design and I'm having some issues with repositories and the fact that EF Core, when I explicitly load related entities, automatically fills my navigation properties.
I have a repository that I use to load my aggregate root and its children. However, some of the aggregate children need to be loaded later on (I need to load those entities based on a date range).
Example:
Load schedule owners
Calculate a date range
Load schedule owner's schedules
I'm trying to keep my data access layer isolated from the core layer and this is where I have some questions.
Imagine this method on my repository:
public List<Schedule> GetSchedules(Guid scheduleOwnerPk, DateRange dateRange)
{
    var schedules = dbContext.Schedules
        .Where(x => x.PkScheduleOwner == scheduleOwnerPk
                 && x.StartDate >= dateRange.Start
                 && x.EndDate <= dateRange.End)
        .ToList();
    return schedules;
}
I can call this method from the core layer in two ways:
// Take advantage of EF Core's ability to fill the navigation property automatically
scheduleOwnerRepository.GetSchedules(scheduleOwner.Pk, dateRange);
or
var schedules = scheduleOwnerRepository.GetSchedules(scheduleOwner.Pk, dateRange);
// At this moment EF Core has already loaded the navigation property, so I need to clear it to avoid duplicated results
scheduleOwner.Schedules.Clear();
// Schedules is exposed as an IEnumerable to protect it from being changed outside the aggregate root
scheduleOwner.AddSchedules(schedules);
The problem with the first approach is that it leaks EF core to the core layer, meaning that the property ScheduleOwner.Schedules will no longer be filled if I move away from EF core.
The second approach abstracts EF core but requires some extra steps to get ScheduleOwner.Schedules filled. Since EF core will automatically load the navigational property after the repository method is called, I'm forced to clear it before adding the results, otherwise I'll be inserting duplicated results.
How do you guys deal with this kind of situation? Do you take advantage of EF core features or do you follow the more natural approach of calling a repository method and use its results to fill some property?
Thanks for the help.

There are a couple of things to consider here.
Try to avoid using your domain model for querying. Rather use a read model through a query layer.
An aggregate is a complete unit as it were so when loaded you load everything. When you run into a scenario where you do not need all of the related data it may indicate that the data is not part of the aggregate but it may, in fact, only be related in a weaker sense.
An example is Order to Customer. Although an Order may very well require a Customer, the Order is an aggregate in its own right. The Customer may have a list of OrderIds, but that may become large rather quickly. One would typically not require a complete list of orders to determine whether an aggregate is valid or complete. However, you may very well need a list of ActiveOrder value objects of some sort if that is required to, say, enforce a maximum order amount, although there are various ways to deal with that case too.
Back to your scenario. An EF entity is not your domain model, and when I have had to make use of EF in the past I would load the entity and then map it to my domain entity in the repository. The repository would only deal with domain aggregates, and you should avoid query methods on the repository. As a minimum, a repository would typically have a Get(id) and a Save(aggregate) method.
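To make the mapping idea concrete, here is a rough sketch. ScheduleOwner and Schedule are named after the question, but their members, the *Row persistence classes and the in-memory store are assumptions; a real implementation would read and write the rows through the DbContext instead of a dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Persistence shape: what the ORM would materialize (plain data, public setters).
public class ScheduleRow
{
    public Guid Pk { get; set; }
    public DateTime StartDate { get; set; }
    public DateTime EndDate { get; set; }
}

public class ScheduleOwnerRow
{
    public Guid Pk { get; set; }
    public List<ScheduleRow> Schedules { get; set; } = new List<ScheduleRow>();
}

// Domain shape: no ORM types, collection exposed read-only.
public class Schedule
{
    public Schedule(Guid pk, DateTime start, DateTime end) { Pk = pk; Start = start; End = end; }
    public Guid Pk { get; }
    public DateTime Start { get; }
    public DateTime End { get; }
}

public class ScheduleOwner
{
    private readonly List<Schedule> _schedules;
    public ScheduleOwner(Guid pk, IEnumerable<Schedule> schedules) { Pk = pk; _schedules = schedules.ToList(); }
    public Guid Pk { get; }
    public IReadOnlyCollection<Schedule> Schedules => _schedules.AsReadOnly();
}

// The minimal repository contract: load a whole aggregate, save a whole aggregate.
public interface IScheduleOwnerRepository
{
    ScheduleOwner Get(Guid pk);
    void Save(ScheduleOwner owner);
}

// In-memory stand-in for the EF-backed implementation; the mapping is the point here.
public class InMemoryScheduleOwnerRepository : IScheduleOwnerRepository
{
    private readonly Dictionary<Guid, ScheduleOwnerRow> _rows = new Dictionary<Guid, ScheduleOwnerRow>();

    public ScheduleOwner Get(Guid pk)
    {
        ScheduleOwnerRow row;
        if (!_rows.TryGetValue(pk, out row)) return null;
        return new ScheduleOwner(row.Pk, row.Schedules.Select(s => new Schedule(s.Pk, s.StartDate, s.EndDate)));
    }

    public void Save(ScheduleOwner owner)
    {
        _rows[owner.Pk] = new ScheduleOwnerRow
        {
            Pk = owner.Pk,
            Schedules = owner.Schedules
                .Select(s => new ScheduleRow { Pk = s.Pk, StartDate = s.Start, EndDate = s.End })
                .ToList()
        };
    }
}
```

Because the domain objects are built by the repository rather than tracked by EF, nothing fills ScheduleOwner.Schedules behind your back, at the cost of writing the mapping by hand.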
I would recommend querying using a separate layer that returns as simple a result as possible. For something like a Count I may return an int whereas something like IScheduleQuery.Search(specification) I may return IEnumerable<DataRow> or, if it contains more complex data or I have a need for a read model I may return IEnumerable<Query.Schedule>.

Using CQRS with repositories

If I understand correctly CQRS is about dividing write and read responsibilities. So I can use repositories in my write model, for example var user = repository.GetUserById(); - this will get the user by id and then repository.UpdateUser(user); will update the user with changed properties. In the read model we can construct more complex DTO's:
public class UsersReadModel
{
    private IMyContext context;

    public UsersReadModel(IMyContext context)
    {
        this.context = context;
    }

    public ComplexUserDTO GetComplexUser(ISelectQuery query)
    {
        ComplexUserDTO user = new ComplexUserDTO();
        // get all user properties, GetUser by id
        user.UserDTO = context.Users.Where(d => d.UserId == query.UserId).ProjectTo<UserDTO>().FirstOrDefault();
        // here I don't need everything from PoliciesTable, I just need two columns, so I use an anonymous object
        var policieObject = context.Policies
            .Where(f => f.BasePolicyId == query.PolicyId)
            .Select(s => new
            {
                s.PoliciesNames,
                // a projected expression needs an explicit member name
                ClientsNames = s.Clients.Select(d => d.ClientNames).ToList()
            })
            .FirstOrDefault();
        user.PoliciesNames = policieObject.PoliciesNames;
        user.ClientsNames = policieObject.ClientsNames;
        return user;
    }
}
So in my write model I get the user by id from my repository, because I don't need to map it to a DTO, and in my read model I also get the user by id, but I map it to a DTO, because I need it that way. Isn't this code duplication (if I want to change how a user is fetched by id, I'll have to change it in both places)? Can I use repositories in my read model? In that case I'll have to use both repositories and the context (for the anonymous object, and for selecting only some of a table's columns) in UsersReadModel.

If your domain is very simple then the Write and the Read will be very similar and a lot of code duplication will occur. In fact, this works in reverse as well: if your Write model is very similar to the Read model then you could implement them as CRUD and you don't necessarily need CQRS.
Can I use repositories in my read model?
You can have anything you want on the Read side; the two sides are separated from many points of view.
In CQRS there are many cases when code duplication occurs. Don't be afraid of that. You could extract that in shared classes.
P.S.
You should have a Read model for every use case, not for every Write model. If you have a 1:1 correspondence from Write to Read then this could also mean that you should have implemented this using CRUD.
P.S. I like to use CQRS even if the domain is simple as I like to have very optimized Read models (different persistence type, no JOINS, custom data sharding etc).

There are a few things to look at here. From your description, it doesn't sound like there is a separation between the read and write models. Remember, they have very different purposes.
CQRS leans heavily on domain-driven design principles. A key principle is the encapsulation of your domain objects.
As a result, you wouldn't expect a domain object to have 'properties' on it (especially not setters). It may have an ID, for example, but not much else. This is because its role is to protect its own invariants, which is not something you can do easily if you have setters.
I would also argue that a domain object shouldn't really have getters, except for the ID. If you have a good read model there is little need for them, and they may encourage incorrect use of the object. There are times when this idea can be relaxed a little, although I can't think of one right now.
As a result, a repository for a domain object can be very simple. GetById and Save (unless you are using event sourcing but that's another topic).
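As an illustration only, a write-side object in that style could look something like this. The User members (ChangeEmail, Deactivate, HasEmail) and the invariants they guard are invented for the example; the GetById / Save pair is the part taken from the text above.

```csharp
using System;

// Write-side domain object: state is private, changes go through methods
// that enforce the invariants. Only the ID is exposed.
public class User
{
    private string _email;
    private bool _deactivated;

    public User(Guid id, string email)
    {
        Id = id;
        _email = RequireEmail(email);
    }

    public Guid Id { get; }

    public void ChangeEmail(string newEmail)
    {
        // Invariant (assumed for the example): deactivated users are frozen.
        if (_deactivated) throw new InvalidOperationException("A deactivated user cannot change email.");
        _email = RequireEmail(newEmail);
    }

    public void Deactivate()
    {
        _deactivated = true;
    }

    // A question the object can answer without handing out its state.
    public bool HasEmail(string candidate)
    {
        return string.Equals(_email, candidate, StringComparison.OrdinalIgnoreCase);
    }

    private static string RequireEmail(string email)
    {
        if (string.IsNullOrWhiteSpace(email) || !email.Contains("@"))
            throw new ArgumentException("A valid email address is required.", nameof(email));
        return email;
    }
}

// The whole repository surface for the write model.
public interface IUserRepository
{
    User GetById(Guid id);
    void Save(User user);
}
```

With no setters there is no way to put a User into an invalid state from outside, which is the point being made; anything the UI wants to display would come from the read model instead.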
The Read model, on the other hand, should be shaped to serve the UI. Each model is likely to have a mix of data from various sources. For example, you are likely to want to see a user's details in the context of their activities, or orders, or value to the company, or whatever the purpose of the application is.
This explanation of the typical structure of a CQRS application may be helpful: CQRS + Event Sourcing - A Step by Step Overview
And this may give you some insight into creating domain objects: Aggregate Root - How to Build One for CQRS and Event Sourcing
Hope this helps.

If I understand correctly CQRS is about dividing write and read responsibilities.
Closer to say that it is about having data models that are designed for the use cases that they support.
We have Truth, and that truth has multiple representations.
The trick is that the representations don't need to be coupled in time -- we can update the "book of record" representation now, and the representations we use to support queries eventually.
Can I use repositories in my read model?
Absolutely. There's no magic.
Udi Dahan would probably suggest that you be thinking about different repositories, or perhaps more precisely methods on your repositories that provide different explicit representations of the read model depending on what you are doing. Each method loads the representation that you need for that particular use case.

Custom Explicit Loading in Entity Framework - any way to do it?

I've got a list of entity object Individual for an employee survey app - an Individual represents an employee or outside rater. The individual has the parent objects Team and Team.Organization, and the child objects Surveys, Surveys.Responses. Responses, in turn, are related to Questions.
So usually, when I want to check the complete information about an Individual, I need to fetch Individuals.Include(Team.Organization).Include(Surveys.Responses.Question).
That's obviously a lot of includes, and has a performance cost, so when I fetch a list of Individuals and don't need their related objects, I don't bother with the Includes... but then the user wants to manipulate an Individual. So here's the challenge. I seem to have 3 options, all bad:
1) Modify the query that downloads the big list of Individuals to .Include(Team.Organization).Include(Surveys.Responses.Question). This gives it bad performance.
2) Individuals.Load(), TeamReference.Load(), OrganizationReference.Load(), Surveys.Load(), (and iterate through the list of Surveys and load their Responses and the Responses' Questions).
3) When a user wishes to manipulate an Individual, I drop that reference and fetch a whole brand new Individual from the database by its primary key. This works, but is ugly because it means I have two different kinds of Individuals, and I can never use one in place of the other. It also creates ugly problems if I'm iterating across a list repeatedly, as it's tricky to avoid loading and dropping the fully-included Individuals repeatedly, which is wasteful.
Is there any way to say
myIndividual.Include("Team.Organization").Include("Surveys.Responses.Question");
with an existing Individual entity, instead of taking approach (3)?
That is, is there any middle-ground between "fetch everything from the database up-front" and "late-load one relationship at a time"?
Possible solution that I'm hoping I could get insight about:
So there's no way to do a manually-implemented explicit load on a navigational-property? No way to have the system interpret
Individual.Surveys = from survey in MyEntities.Surveys.Include("Responses.Question")
                     where survey.IndividualID == Individual.ID
                     select survey; //Individual.Surveys is the navigation collection property holding Surveys on the Individual.
Individual.Team = from team in MyEntities.Teams.Include("Organization")
                  where team.ID == Individual.TeamID
                  select team;
as just loading Individual's related objects from the database instead of being an assignment/update operation? If this means no actual change in X and Y, can I just do that?
I want a way to manually implement a lazy or explicit load that isn't doing it in a dumb (one relation at a time) way. Really, the Teams and Organizations aren't the problem, but the Survey.Responses.Questions are a massive buttload of database hits.
I'm using 3.5, but for the sake of others (and when my project finally migrates to 4) I'm sure responses relevant to 4 would be appreciated. In that context, similar customization of lazy loading would be good to hear about too.
edit: Switched the alphabet soup to my problem domain, edited for clarity.
Thanks

The Include statement is designed to do exactly what you're hoping to do. Having multiple includes does indeed eager load the related entities.
Here is a good blog post about it:
http://thedatafarm.com/blog/data-access/the-cost-of-eager-loading-in-entity-framework/
In addition, you can use strongly typed "Includes" using some nifty ObjectContext extension methods. Here is an example:
http://blogs.microsoft.co.il/blogs/shimmy/archive/2010/08/06/say-goodbye-to-the-hard-coded-objectquery-t-include-calls.aspx

DDD: entity's collection and repositories

Suppose I have
public class Product: Entity
{
public IList<Item> Items { get; set; }
}
Suppose I want to find an item with max something... I can add the method Product.GetMaxItemSmth() and do it with Linq ((from i in Items select i.smth).Max()) or with a manual loop or whatever. Now, the problem is that this will load the full collection into memory.
The correct solution will be to do a specific DB query, but domain entities do not have access to repositories, right? So either I do
productRepository.GetMaxItemSmth(product)
(which is ugly, no?), or even if entities have access to repositories, I use IProductRepository from entity
product.GetMaxItemSmth() { return Service.GetRepository<IProductRepository>().GetMaxItemSmth(); }
which is also ugly and is a duplication of code. I can even go fancy and do an extension
public static IList<Item> GetMaxItemSmth(this Product product)
{
return Service.GetRepository<IProductRepository>().GetMaxItemSmth();
}
which is better only because it doesn't really clutter the entity with repository... but still does method duplication.
Now, this is the problem of whether to use product.GetMaxItemSmth() or productRepository.GetMaxItemSmth(product)... again. Did I miss something in DDD? What is the correct way here? Just use productRepository.GetMaxItemSmth(product)? Is this what everyone uses and are happy with?
I just don't feel it is right... if I can't access a product's Items from the product itself, why do I need this collection in Product at all??? And then, can Product do anything useful if it can't use specific queries and access its collections without performance hits?
Of course, I can use a less efficient way and never mind, and when it's slow I'll inject repository calls into entities as an optimization... but even this doesn't sound right, does it?
One thing to mention, maybe it's not quite DDD... but I need IList in Product in order to get my DB schema generated with Fluent NHibernate. Feel free to answer in pure DDD context, though.
UPDATE: a very interesting option is described here: http://devlicio.us/blogs/billy_mccafferty/archive/2007/12/03/custom-collections-with-nhibernate-part-i-the-basics.aspx, not only to deal with DB-related collection queries, but also can help with collection access control.

Having an Items collection and having GetXXX() methods are both correct.
To be pure, your Entities shouldn't have direct access to Repositories. However, they can have an indirect reference via a Query Specification. Check out page 229 of Eric Evans' book. Something like this:
public class Product
{
    public IList<Item> Items { get; }

    public int GetMaxItemSmth()
    {
        return new ProductItemQuerySpecifications().GetMaxSomething(this);
    }
}

public class ProductItemQuerySpecifications
{
    public int GetMaxSomething(Product product)
    {
        var repository = MyContainer.Resolve<IProductRepository>();
        return repository.GetMaxSomething(product);
    }
}
How you get a reference to the Repository is your choice (DI, Service Locator, etc). Whilst this removes the direct reference between Entity and Repository, it doesn't reduce the LoC.
Generally, I'd only introduce it early if I knew that the number of GetXXX() methods will cause problems in the future. Otherwise, I'd leave it for a future refactoring exercise.

I believe in terms of DDD, whenever you are having problems like this, you should first ask yourself if your entity was designed properly.
If you say that Product has a list of Items, you are saying that the Items are part of the Product aggregate. That means that if you perform data changes on the Product, you are changing the items too. In this case, your Product and its items are required to be transactionally consistent: changes to one or the other should always cascade over the entire Product aggregate, and the change should be atomic. If you changed the Product's name and the name of one of its Items, and the database commit of the Item's name works but fails on the Product's name, the Item's name should be rolled back.
This follows from the fact that aggregates should represent consistency boundaries, not compositional convenience.
If it does not make sense in your domain to require changes on Items and changes on the Product to be transactionally consistent, then Product should not hold a reference to the Items.
You are still allowed to model the relationship between Product and items, you just shouldn't have a direct reference. Instead, you want to have an indirect reference, that is, Product will have a list of Item Ids.
The choice between having a direct reference and an indirect reference should be based first on the question of transactional consistency. Once you have answered that, if it seemed that you needed the transactional consistency, you must then further ask if it could lead to scalability and performance issues.
If you have too many items for too many products, this could scale and perform badly. In that case, you should consider eventual consistency. This is when you still only have an indirect reference from Product to items, but with some other mechanism you guarantee that at some future point in time (hopefully as soon as possible) the Product and the Items will be in a consistent state. For example, as the Items' balances are changed, the Product's total balance increases. While each item is being altered one by one, the Product might not have exactly the right total balance, but as soon as all items have finished changing, the Product will update itself to reflect the new total balance and thus return to a consistent state.
That last choice is harder to make, you have to determine if it is acceptable to have eventual consistency in order to avoid the scalability and performance problems, or if the cost is too high and you'd rather have transactional consistency and live with the scalability and performance issues.
Now, once you have indirect references to Items, how do you perform GetMaxItemSmth()?
In this case, I believe the best way is to use the double dispatch pattern. You create an ItemProcessor class:
public class ItemProcessor : IItemProcessor
{
    private readonly IItemRepository _itemRepo;

    public ItemProcessor(IItemRepository itemRepo)
    {
        _itemRepo = itemRepo;
    }

    public Item GetMaxItemSmth(Product product)
    {
        // Here you are free to implement the logic as performantly as possible,
        // or as slowly as you want.

        // Slow version
        //Item maxItem = _itemRepo.GetById(product.Items[0]);
        //for (int i = 1; i < product.Items.Count; i++)
        //{
        //    Item item = _itemRepo.GetById(product.Items[i]);
        //    if (item > maxItem) maxItem = item;
        //}

        // Fast version: let the repository run one query scoped to this product
        Item maxItem = _itemRepo.GetMaxItemSmth(product);
        return maxItem;
    }
}
And its corresponding interface:
public interface IItemProcessor
{
Item GetMaxItemSmth(Product product);
}
This will be responsible for performing the logic you need that involves working with both your Product data and other related entities' data. It could also host any kind of complicated logic that spans multiple entities and doesn't quite fit on any one entity per se, because it requires data that spans multiple entities.
Then, on your Product entity you add:
public class Product
{
    private List<string> _items; // indirect reference to the Items Product is associated with

    public List<string> Items
    {
        get
        {
            return _items;
        }
    }

    public Product(List<string> items)
    {
        _items = items;
    }

    public Item GetMaxItemSmth(IItemProcessor itemProcessor)
    {
        return itemProcessor.GetMaxItemSmth(this);
    }
}
NOTE:
If you only need to query the max items and get a value back, not an entity, you should bypass this method altogether. Create an IFinder that has a GetMaxItemSmth method returning your specialised read model. It's OK to have a separate model only for querying, and a set of Finder classes that perform specialized queries to retrieve such a read model. Remember, aggregates only exist for the purpose of data change, and repositories only work on aggregates. Therefore, if there is no data change, there is no need for either aggregates or repositories.

(Disclaimer: I am just starting to get a grasp on DDD, or at least I believe I am :) )
I will second Mark on this one and emphasize two points that took me some time to realize.
Think about your objects in terms of aggregates, which leads to
The point is that either you load the children together with the parent or you load them separately
The difficult part is to think about the aggregate for your problem at hand and not to focus on the DB structure supporting it.
An example that emphasizes this point is customer.Orders. Do you really need all the orders of your customer to add a new order? Usually not. What if she has 1 million of them?
You might need something like OutstandingAmount or AmountBuyedLastMonth in order to fulfill scenarios like AcceptNewOrder or ApplyCustomerCareProgram.
Is the product the real aggregate root for your scenario?
What if Product is not an Aggregate Root?
I.e. are you going to manipulate the item or the product?
If it is the product, do you need the ItemWithMaxSomething or do you need MaxSomethingOfItemsInProduct?
Another myth: PI (persistence ignorance) means you don't need to think about the DB
Given that you really need the item with maxSomething in your scenario, you need to know what it means in terms of database operations in order to choose the right implementation, either through a service or a property.
For example, if a product has a huge number of items, a solution might be to record the ID of that Item with the product in the DB instead of iterating over the whole list.
The difficult part for me in DDD is defining the right aggregates. I feel more and more that if I need to rely on lazy loading, then I might have overlooked some context boundary.
Hope this helps :)

I think that this is a difficult question that has no hard and fast answer.
A key to one answer is to analyze Aggregates and Associations as discussed in Domain-Driven Design. The point is that either you load the children together with the parent or you load them separately.
When you load them together with the parent (Product in your example), the parent controls all access to the children, including retrieval and write operations. A corollary to this is that there must be no repository for the children: data access is managed by the parent's repository.
So to answer one of your questions: "why do I need this collection in Product at all?" Maybe you don't, but if you do, that would mean that Items would always be loaded when you load a Product. You could implement a Max method that would simply find the Max by looking over all Items in the list. That may not be the most performant implementation, but that would be the way to do it if Product was an Aggregate Root.
What if Product is not an Aggregate Root? Well, the first thing to do is to remove the Items property from Product. You will then need some sort of Service that can retrieve the Items associated with the Product. Such a Service could also have a GetMaxItemSmth method.
Something like this:
public class ProductService
{
    private readonly IItemRepository itemRepository;

    public ProductService(IItemRepository itemRepository)
    {
        this.itemRepository = itemRepository;
    }

    public IEnumerable<Item> GetMaxItemSmth(Product product)
    {
        var max = this.itemRepository.GetMaxItemSmth(product);
        // Do something interesting here
        return max;
    }
}
That is pretty close to your extension method, but with the notable difference that the repository should be an instance injected into the Service. Static stuff is never good for modeling purposes.
As it stands here, the ProductService is a pretty thin wrapper around the Repository itself, so it may be redundant. Often, however, it turns out to be a good place to add other interesting behavior, as I have tried to hint at with my code comment.

Another way you can solve this problem is to track it all in the aggregate root. If Product and Item are both part of the same aggregate, with Product being the root, then all access to the Items is controlled via Product. So in your AddItem method, compare the new Item to the current max item and replace it if need be. Maintain it where it's needed within Product so you don't have to run the SQL query at all. This is one reason why defining aggregates promotes encapsulation.
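A small sketch of that idea, assuming Item exposes an integer Smth to compare on and that items are only ever added (both assumptions made for the example; removing an item or changing its Smth would need the max to be recomputed):

```csharp
using System;
using System.Collections.Generic;

public class Item
{
    public Item(string name, int smth) { Name = name; Smth = smth; }
    public string Name { get; }
    public int Smth { get; }
}

// Aggregate root: all access to Items goes through Product,
// so it can keep the current max up to date as items arrive.
public class Product
{
    private readonly List<Item> _items = new List<Item>();

    // The item with the highest Smth added so far, or null when there are no items.
    public Item MaxItem { get; private set; }

    public IReadOnlyList<Item> Items => _items;

    public void AddItem(Item item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));
        _items.Add(item);
        if (MaxItem == null || item.Smth > MaxItem.Smth)
            MaxItem = item;
    }
}
```

Reading product.MaxItem then needs neither a query nor a scan of the collection, since the root already did the comparison when each item was added.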

Remember that NHibernate is a mapper between the database and your objects. Your issue appears to me that your object model is not a viable relational model, and that's ok, but you need to embrace that.
Why not map another collection on your Product entity that uses the power of your relational model to load in an efficient manner? Am I right in assuming that the logic to select this special collection is not rocket science and could easily be implemented in a filtered NHibernate mapped collection?
I know my answer has been vague, but I only understand your question in general terms. My point is that you will have problems if you treat your relational database in an object oriented manner. Tools like NHibernate exist to bridge the gap between them, not to treat them in the same way. Feel free to ask me to clarify any points I didn't make clear.

You can now do that directly with NHibernate 5, without any specific code!
It won't load the whole collection into memory.
See https://github.com/nhibernate/nhibernate-core/blob/master/releasenotes.txt
Build 5.0.0
=============================
** Highlights
...
* Entities collections can be queried with .AsQueryable() Linq extension without being fully loaded.
...
