How can I perform aggregate operations via the repository pattern?


I've seen various blog posts (and much conflicting advice) about the repository pattern, and so I'll start by saying that the code below is probably not following the repository pattern in many people's opinion. However, it's a common-enough implementation, and whether it adheres to Fowler's original definition or not, I'm still interested in understanding more about how this implementation is used in practice.
Suppose I have a project where data access is abstracted via an interface such as the one below, which provides basic CRUD operations.
public interface IGenericRepository<T>
{
void Add(T entity);
void Remove(T entity);
void Update(T entity);
IEnumerable<T> Fetch(Expression<Func<T,bool>> where);
}
Further suppose that I have a service layer built atop that, for example:
public class FooService
{
private IGenericRepository<Foo> _fooRepository;
...
public IEnumerable<Foo> GetBrightlyColoredFoos()
{
return _fooRepository.Fetch(f => f.Color == "pink" || f.Color == "yellow");
}
}
Now suppose that I need to know how many brightly colored Foos there are, without actually wanting to enumerate them. Ideally, I want to implement a CountBrightlyColoredFoos() method in my service, but the repository implementation gives me no way to achieve that other than by fetching them all and counting them - which is potentially very inefficient.
I could extend the repository to add a Count() method, but what about other aggregate functions that I might need, such as Min() or Max(), or Sum(), or... you get the idea.
Likewise, what if I wanted to get a list of the distinct Foo colors (SELECT DISTINCT)? Again, the simple repository provides no way to do that sort of thing either.
Keeping the repository simple to make it easy to test/mock is very laudable, but how do you then address these requirements? Surely there are only two ways to go - a more complex repository, or a "back-door" for the service layer to use that bypasses the repository (and thus defeats its purpose).

I would say you need to change your design. What you want to do is have one "main" generic repository that has your basic CRUD, but also smaller repositories for each entity. You will then just have to draw a line on where to place certain operations (like sum, count, max, etc.) Most likely not all your entities are going to have to get counted, summed, etc. and most of the time you won't be able to add a generic version that applies to all entities for aggregate functions.
Base Repository:
public abstract class BaseRep<T> : IBaseRep<T> where T : class
{
//basic CRUD
}
Foo Repository:
public class FooRep : BaseRep<Foo>, IFooRep
{
//foo specific functions
}
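As a rough sketch of where the aggregate operations from the question could live under this design, the Foo-specific repository below adds a count and a distinct-colors query on top of a generic base. The Foo class, the in-memory IQueryable<Foo> source, and the method names CountBrightlyColored and DistinctColors are assumptions for illustration, and the IBaseRep<T> and IFooRep interfaces are left out for brevity. With an ORM, the source would be the ORM's queryable set, so the aggregation could run in the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Foo
{
    public string Color { get; set; } = "";
}

// Generic base: shared query plumbing only.
public abstract class BaseRep<T> where T : class
{
    protected readonly IQueryable<T> Source;

    protected BaseRep(IQueryable<T> source)
    {
        Source = source;
    }

    public IEnumerable<T> Fetch(Expression<Func<T, bool>> where)
    {
        return Source.Where(where).ToList();
    }
}

// Foo-specific repository: aggregates that only make sense for Foo.
public class FooRep : BaseRep<Foo>
{
    public FooRep(IQueryable<Foo> source) : base(source) { }

    public int CountBrightlyColored()
    {
        // Count is composed onto the IQueryable, so a LINQ provider
        // can translate it instead of materializing every row.
        return Source.Count(f => f.Color == "pink" || f.Color == "yellow");
    }

    public IReadOnlyList<string> DistinctColors()
    {
        return Source.Select(f => f.Color).Distinct().ToList();
    }
}
```

The service can then call fooRep.CountBrightlyColored() without enumerating the Foos itself, and entities that never need counting keep the plain CRUD surface.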

Related

Where should I put the complex queries using the Repository Pattern?

I have an application in which I use Entity Framework, and I have a class called BaseRepository<T> with a few basic CRUD methods (Get, GetAll, Update, Delete, Insert). From this class I derive my specific repositories, such as BaseRepository<Products>, BaseRepository<People>, BaseRepository<Countries> and many more.
The problem is that, when I have a complex logic in the service, that involves making joins of several tables and that does not return an entity, but an object that is handled in the service (it is neither a DB entity nor a DTO), I find that repositories don't help me much with just basic CRUD operations.
Where should I put this complex query? In which of the repositories should it be? How do I join these repositories? The problem is that I see that each repository handles a single entity, so what should I do in this case? I've been doing some research and read that returning IQueryable<T> is bad practice, so I rule out the possibility of sending an IQueryable<T> for each of the tables I'm going to join to the service and doing the join there.
I've researched and found no clear answer. I would like to know how and where these complex queries are organized, since I also want to respect the responsibility of each repository with its respective entity.

I would like to know how and where these complex queries are organized, since I also want to respect the responsibility of each repository with its respective entity.
The complex queries are the responsibility of the code that is requesting the data, not the responsibility of the repository. The single responsibility of the repository is to provide access to the data, not to provide every possible request shape that may be needed in any use case. Do you really want to write methods in your repositories like:
customerRepo.GetCustomerWithLastTenOrdersAndSupportCasesAsDTO()
or
customerRepo.GetCustomerForCustomerSupportHomePage()
Of course not. So your repository provides a property of type IQueryable<T> or DbSet<T> which can be used as a starting point for consumers to add whatever queries they need.
I've been doing some research and read that returning IQueryable is bad practice
Don't believe everything you read. There's really not a good alternative to exposing IQueryable<T> from your repository. And once you digest that, there's not much of a reason to have any repository type other than your DbContext subtype.
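To make that point concrete, here is a minimal sketch of a consumer shaping its own query on top of exposed IQueryable<T> sources. The Customer, Order, and CustomerSummary types and the CustomerQueries.Summaries helper are hypothetical, and in-memory lists stand in for the DbSet<T> properties a real context would provide.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

public class Order
{
    public int CustomerId { get; set; }
    public decimal Total { get; set; }
}

// Neither an entity nor a repository concern: a shape one consumer needs.
public class CustomerSummary
{
    public string Name { get; set; } = "";
    public int OrderCount { get; set; }
    public decimal TotalSpent { get; set; }
}

public static class CustomerQueries
{
    public static IQueryable<CustomerSummary> Summaries(
        IQueryable<Customer> customers, IQueryable<Order> orders)
    {
        // Inner join: customers without orders are left out.
        return from c in customers
               join o in orders on c.Id equals o.CustomerId
               group o by new { c.Id, c.Name } into g
               orderby g.Key.Id
               select new CustomerSummary
               {
                   Name = g.Key.Name,
                   OrderCount = g.Count(),
                   TotalSpent = g.Sum(o => o.Total)
               };
    }
}
```

The join and projection live with the code that needs them, and the data access layer only has to hand out the two queryable sets.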

It's hard to answer without seeing code that shows what you want to achieve, but hopefully my answer gives you an idea of how you can use abstract classes to override your queryable collection. If your requirement is more complex, can you provide more information with example code?
Create your BaseRepository like this -
public abstract class BaseRepository<T> where T : class // DbContext.Set<T>() requires a reference type
{
public IQueryable<T> Collection { get; set; }
public readonly DbContext _context;
public BaseRepository(DbContext context)
{
_context = context;
Collection = SetQueryableCollection();
}
public virtual IQueryable<T> SetQueryableCollection() => _context.Set<T>();
// CRUD Operations here for e.g. -
public virtual async Task<List<T>> Get()
{
return await Collection.ToListAsync();
}
}
Now, the class that inherits this -
public class ProductRepository : BaseRepository<Product>
{
public ProductRepository(MyContext context) : base(context)
{
}
//now override the method that sets the collection
public override IQueryable<Product> SetQueryableCollection() =>
_context.Set<Product>().Include(p => p.Brand).ThenInclude(...);
// things to keep in mind, the _context is public in the way I've done it. You can change that and directly expose the Collection and set it to your need per entity type.
}
So now your Get method uses the overridden method to set the Collection.

One big repository vs. many little ones?

I have several product tables in my database:
ProductTypes
ProductCategories
ProductCategoryItems
ProductInventory
The way I see it now, I can make IProduct which would have methods such as:
FindAllTypes()
FindAllCategories(int typeId)
FindAllItems(int categoryId)
Or, I can separate each to mimic the table structure: IProductType, IProductCategory, etc.
Is there a reason to go with one over another?

The idea of repositories is to give each one responsibility for a single entity. In this case, making a repository for each entity is recommended. You can go for one big repository as well, but it is not the best solution: in the end you'll get a HUGE class with lots of methods that is tightly coupled and difficult to maintain.

I don't think having a huge repository is really a good idea, then you'd basically have a data access god class that does everything.
I like to have a base Repository<T> which does common operations such as GetById and GetAll. I normally have all my repositories inherit this base class to get the common methods for free, so I don't have to keep rewriting the same code.

In my opinion it very much depends on the business domain model, it's very important to determine what are your main business entities. Not necessarily every table in the DB is directly mapped to a business entity. Tables are just representations of your one or many entities in a normalized way for relational databases.
Try to picture your domain model beyond the restrictions of normalized relational databases, is there really more than one business concept? Repositories should be constructed around solid, whole, first-class business entities.
My advice would be to have an IProductRepository with the necessary methods to implement CRUD operations and grow it as needed. You don't want overly ambitious interfaces, because you may not need most of what they declare, and that could be a burden. The important thing about interfaces is to decouple your code from the persistence schema, so that later you have the flexibility to switch between schemas.
Maybe in the future the business will need to evolve to a more detailed representation of, for instance, the product's providers, and at that moment you'll use your good judgement to decide whether that represents an important business entity worthy of a dedicated repository or not.
Hope this helps.

I disagree with the others (edit: except with Isaac). The small repositories are a facade (not the pattern).
If the entity types are coupled (have navigation properties to each other), then they are not really separable.
Modifying one entity type and committing the changes may commit changes to others.
Also, you cannot create any small repository above the same unit of work, since the ORM only has a limited number of entities mapped to the database.
Divide your model into separable domains and create one specific unit of work for each domain.
On these units of work, create aggregate roots for each entity type that you may require immediate access to.
Each root should have specifically typed add, remove, getbykeys, query, etc. methods.
The unit of work should have the commitchanges and similar methods on it.
Each of the roots is similar to the small repositories the others mentioned; however, the unit of work is the real medium-sized repository (of which your model may have more than one type).
Example:
// Create one of these
interface IUnitOfWork
{
void Commit();
}
// Create one of these
interface IEntitySet<TEntity> where TEntity : class
{
void Add(TEntity entity);
void Remove(TEntity entity);
TEntity Create<TSpecificEntity>() where TSpecificEntity : TEntity;
IQueryable<TEntity> Query();
}
// Create one of these per entity type
interface IEntitySetOfTEntity1 : IEntitySet<Entity1>
{
Entity1 GetByKeys(int key1);
}
interface IEntitySetOfTEntity2 : IEntitySet<Entity2>
{
Entity2 GetByKeys(short key1, short key2);
}
// Create one of these per separable domain
interface IDomain1UnitOfWork : IUnitOfWork
{
IEntitySetOfTEntity1 Entity1s
{
get;
}
IEntitySetOfTEntity2 Entity2s
{
get;
}
}
All these interfaces and their implementations can be auto-generated.
These interfaces and their implementations are very light weight and by no means are any of them "a HUGE class with lots of methods". Since they can be auto-generated, maintenance is easy.
Specific functionalities can be added to the interfaces IDomain1UnitOfWork, IEntitySetOfTEntity1 and alike by using:
a. extension methods
b. partial interfaces and classes (less recommended, since this results in a less clean DAL)
The IEntitySetOfTEntity1-like interfaces can be discarded if you use extension methods to add the GetByKeys() methods to IEntitySet<Entity1>.
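A minimal sketch of that extension-method option follows. The Key1 property on Entity1, the trimmed-down IEntitySet<TEntity> (only Query() is kept), and the ListEntitySet<TEntity> in-memory implementation are assumptions made to keep the example self-contained; they are not part of the design above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Entity1
{
    public int Key1 { get; set; } // hypothetical key property
}

// Trimmed copy of the interface above; only Query() is needed here.
public interface IEntitySet<TEntity> where TEntity : class
{
    IQueryable<TEntity> Query();
}

// Hypothetical in-memory implementation, present only to make the sketch runnable.
public class ListEntitySet<TEntity> : IEntitySet<TEntity> where TEntity : class
{
    private readonly List<TEntity> _items;

    public ListEntitySet(IEnumerable<TEntity> items)
    {
        _items = items.ToList();
    }

    public IQueryable<TEntity> Query()
    {
        return _items.AsQueryable();
    }
}

public static class Entity1SetExtensions
{
    // Typed lookup added from the outside, so no IEntitySetOfTEntity1 is needed.
    // Returns null when no entity has the given key.
    public static Entity1 GetByKeys(this IEntitySet<Entity1> set, int key1)
    {
        return set.Query().SingleOrDefault(e => e.Key1 == key1);
    }
}
```

Because the method is declared on IEntitySet<Entity1> specifically, it only shows up on entity sets of that type, which is what the per-entity interface was providing.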

Advice With Repository/Service Layer Design Pattern

Trying to make a really simple repository and service layer pattern here. (.NET 4, C#, LINQ, although this question is partially language-agnostic). Note: this is just R&D.
My goal is to minimize the amount of method definitions in my service layer.
Here's my Repository Contract:
interface IFooRepository
{
IEnumerable<Foo> Find();
void Insert(Foo foo);
void Update(Foo foo);
void Delete(Foo foo);
}
Nothing new there.
Now, here's what I'm (trying) to have in my Service Contract:
interface IFooDataService
{
IEnumerable<Foo> Find(FooSearchArgs searchArgs); // interface members are implicitly public
}
Essentially, any particular "Foo" has many properties (id, name, etc.), which I would like to be able to search upon.
So, I don't want to have one Find method for each different property; I just want one. That way, when I create extra properties I don't have to modify the contracts.
The "FooSearchArgs" is just a simple POCO with all the different "Foo" properties in it.
So, that's what I'm trying to do. Here are my questions:
Is this poor design? If so, what are the alternatives?
How can I implement this filtering in the service layer? Would I have to check which properties of "FooSearchArgs" are set, then keep filtering down (if this, then query.Where; if that, query.Where; etc.)? Does anyone have an idea for a clever LINQ IEnumerable extension method to do this (i.e. repository.WhereMeetsSearchCriteria(fooSearchArgs))?
Appreciate the help.

We use something very similar. One thing you need to decide on is whether you are going to expose IQueryable outside of the repository. Your Find method returns IEnumerable, which could be the IQueryable returned from your where clause.
The advantage of returning the IQueryable is that you can further refine your criteria up outside of your repository layer.
repository.Find(predicate).Where(x => x.SomeValue == 1);
The expression will only be evaluated when you come to use the returned data, and herein lies the disadvantage. Because you only hit the database when you actually come to use the results, you could end up trying to call the database after your session (NHibernate) or connection has been closed.
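The deferred behaviour can be seen without a database. The sketch below is an in-memory illustration only (DeferredExecutionDemo is a hypothetical name): the query is defined once, a snapshot is taken with ToList(), and the source then changes before the query is used again. A closed session cannot be reproduced this way, but the timing is the same: the work happens when the results are consumed, not when the query is built.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    public static (int DeferredCount, int SnapshotCount) Run()
    {
        var data = new List<int> { 1, 2, 3 };

        // Nothing is evaluated here; the query is only a description.
        IQueryable<int> query = data.AsQueryable().Where(x => x > 1);

        // ToList() forces evaluation now, against the current contents.
        List<int> snapshot = query.ToList();

        // The source changes after the query was defined.
        data.Add(4);

        // Counting the query evaluates it again and sees the new element;
        // the snapshot does not change.
        return (query.Count(), snapshot.Count);
    }
}
```

Run() returns (3, 2): the deferred query picks up the element added later, while the materialized snapshot keeps the two elements it saw at the time.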
My personal preference is to use the specification pattern, where you pass your Find method an ISpecification object that is used to do the query.
public interface ISpecification<TCandidate>
{
IQueryable<TCandidate> GetSatisfyingElements(IQueryable<TCandidate> source);
}
public class TestSpecification : ISpecification<TestEntity>
{
public IQueryable<TestEntity> GetSatisfyingElements(IQueryable<TestEntity> source)
{
return source.Where(x => x.SomeValue == 2);
}
}
public class ActiveRecordFooRepository: IFooRepository
{
...
public IEnumerable<TEntity> Find<TEntity>(ISpecification<TEntity> specification) where TEntity : class
{
...
return specification.GetSatisfyingElements(ActiveRecordLinq.AsQueryable<TEntity>()).ToArray();
...
}
public TEntity FindFirst<TEntity>(ISpecification<TEntity> specification) where TEntity : class
{
return specification.GetSatisfyingElements(ActiveRecordLinq.AsQueryable<TEntity>()).First();
}
}
Once the specification has built the query, the repository calls ToArray or ToList on the resulting IQueryable so that the query is evaluated there and then. Whilst this may seem less flexible than exposing IQueryable, it comes with several advantages:
Queries are executed straight away, which prevents a call to the database being made after sessions have closed.
Because your queries are now bundled into specifications, they are unit testable.
Specifications are reusable, meaning you don't have code duplication when trying to run similar queries, and any bugs in the queries only need to be fixed in one place.
With the right kind of implementation you can also chain your specifications together.
repository.Find(
firstSpecification
.And(secondSpecification)
.Or(thirdSpecification)
.OrderBy(orderBySpecification));
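One possible shape for such chaining is sketched below; it is not the implementation used in the answer. WhereSpecification<T>, CombinedSpecification<T>, and the And/Or extension methods are hypothetical names, And is modelled as applying one specification to the output of the other, and Or is modelled as a Union, which removes duplicate elements. OrderBy is left out.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public interface ISpecification<T>
{
    IQueryable<T> GetSatisfyingElements(IQueryable<T> source);
}

// Leaf specification built from a single predicate.
public sealed class WhereSpecification<T> : ISpecification<T>
{
    private readonly Expression<Func<T, bool>> _predicate;

    public WhereSpecification(Expression<Func<T, bool>> predicate)
    {
        _predicate = predicate;
    }

    public IQueryable<T> GetSatisfyingElements(IQueryable<T> source)
    {
        return source.Where(_predicate);
    }
}

// Wraps any function over IQueryable<T> as a specification.
internal sealed class CombinedSpecification<T> : ISpecification<T>
{
    private readonly Func<IQueryable<T>, IQueryable<T>> _apply;

    public CombinedSpecification(Func<IQueryable<T>, IQueryable<T>> apply)
    {
        _apply = apply;
    }

    public IQueryable<T> GetSatisfyingElements(IQueryable<T> source)
    {
        return _apply(source);
    }
}

public static class SpecificationExtensions
{
    // And: filter by the left specification, then by the right one.
    public static ISpecification<T> And<T>(this ISpecification<T> left, ISpecification<T> right)
    {
        return new CombinedSpecification<T>(
            src => right.GetSatisfyingElements(left.GetSatisfyingElements(src)));
    }

    // Or: union of both result sets (duplicates are removed by Union).
    public static ISpecification<T> Or<T>(this ISpecification<T> left, ISpecification<T> right)
    {
        return new CombinedSpecification<T>(
            src => left.GetSatisfyingElements(src).Union(right.GetSatisfyingElements(src)));
    }
}
```

With this shape, a.And(b).Or(c) reads as (a AND b) OR c, because each call wraps everything built so far.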

Is passing a Func as a parameter to your service layer's Find method, instead of the FooSearchArgs, an option? Enumerables have a Where method (LINQ) that takes a Func as a parameter, so you could use it to filter the results.
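A minimal sketch of that suggestion, assuming a Foo with Id and Name properties and a hypothetical InMemoryFooRepository standing in for the real repository:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Foo
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

public interface IFooRepository
{
    IEnumerable<Foo> Find();
}

// Hypothetical stand-in for a real repository.
public class InMemoryFooRepository : IFooRepository
{
    private readonly List<Foo> _foos;

    public InMemoryFooRepository(IEnumerable<Foo> foos)
    {
        _foos = foos.ToList();
    }

    public IEnumerable<Foo> Find()
    {
        return _foos;
    }
}

public class FooDataService
{
    private readonly IFooRepository _repository;

    public FooDataService(IFooRepository repository)
    {
        _repository = repository;
    }

    // One Find method covers every property; callers supply the criteria.
    public IEnumerable<Foo> Find(Func<Foo, bool> predicate)
    {
        // With a Func, the filter runs in memory on whatever the repository returns.
        return _repository.Find().Where(predicate);
    }
}
```

A caller would write service.Find(f => f.Name == "bar"). The trade-off is that a Func cannot be translated by a LINQ provider, so every row is fetched before filtering; an Expression<Func<Foo, bool>> applied to an IQueryable would be needed to push the filter into the database.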

Is there anything wrong with having a few private methods exposing IQueryable<T> and all public methods exposing IEnumerable<T>?

I'm wondering if there is a better way to approach this problem. The objective is to reuse code.
Let’s say that I have a Linq-To-SQL datacontext and I've written a "repository style" class that wraps up a lot of the methods I need and exposes IQueryables. (so far, no problem).
Now, I'm building a service layer to sit on top of this repository, many of the service methods will be 1<->1 with repository methods, but some will not. I think a code sample will illustrate this better than words.
public class ServiceLayer
{
MyClassDataContext context;
IMyRepository rpo;
public ServiceLayer(MyClassDataContext ctx)
{
context = ctx;
rpo = new MyRepository(context);
}
private IQueryable<MyClass> ReadAllMyClass()
{
// pretend there is some complex business logic here
// and maybe some filtering of the current users access to "all"
// that I don't want to repeat in all of the public methods that access
// MyClass objects.
return rpo.ReadAllMyClass();
}
public IEnumerable<MyClass> GetAllMyClass()
{
// call private IQueryable so we can do additional "in-database" processing
return this.ReadAllMyClass();
}
public IEnumerable<MyClass> GetActiveMyClass()
{
// call private IQueryable so we can do additional "in-database" processing
// in this case a .Where() clause
return this.ReadAllMyClass().Where(mc => mc.IsActive.Equals(true));
}
#region "Something my class MAY need to do in the future"
private IQueryable<MyOtherTable> ReadAllMyOtherTable()
{
// there could be additional constrains which define
// "all" for the current user
return context.MyOtherTable;
}
public IEnumerable<MyOtherTable> GetAllMyOtherTable()
{
return this.ReadAllMyOtherTable();
}
public IEnumerable<MyOtherTable> GetInactiveOtherTable()
{
return this.ReadAllMyOtherTable().Where(ot => ot.IsActive.Equals(false));
}
#endregion
}
This particular case is not the best illustration, since I could just call the repository directly in the GetActiveMyClass method, but let’s presume that my private IQueryable does some extra processing and business logic that I don't want to replicate in both of my public methods.
Is that a bad way to attack an issue like this? I don't see it being so complex that it really warrants building a third class to sit between the repository and the service class, but I'd like to get your thoughts.
For the sake of argument, let's presume two additional things.
This service is going to be exposed through WCF and that each of these public IEnumerable methods will be calling a .Select(m => m.ToViewModel()) on each returned collection which will convert it to a POCO for serialization.
The service will eventually need to expose some context.SomeOtherTable which won't be wrapped into the repository.

I think it's a good model, since you can create basic private IQueryable functions that can be used by the functions you are exposing publicly. This way your public methods do not need to recreate a lot of the common functionality your IQueryable methods perform; they can be extended as needed and can defer execution, while still hiding that functionality from the public surface.
An example would be getting X out of some table, which may take a lot of logic that you don't need in its raw form. You then have that as a private method, as you do in your example, and the public method adds the finalizing criteria or queries to generate a usable set of data, which could differ from function to function. Why keep reinventing the wheel over and over... just create the basic design (which your IQueryable does) and drop on the tread pattern that is required as needed (which your public IEnumerable does) :)
+1 for a good design IMO.

Loading Subrecords in the Repository Pattern

Using LINQ TO SQL as the underpinning of a Repository-based solution. My implementation is as follows:
IRepository
FindAll
FindByID
Insert
Update
Delete
Then I have extension methods that are used to query the results as such:
WhereSomethingEqualsTrue() ...
My question is as follows:
My Users repository has N roles. Do I create a Roles repository to manage Roles? I worry I'll end up creating dozens of Repositories (1 per table almost except for Join tables) if I go this route. Is a Repository per Table common?

If you are building your Repository to be specific to one Entity (table), such that each Entity has the list of methods in your IRepository interface that you listed above, then what you are really doing is an implementation of the Active Record pattern.
You should definitely not have one Repository per table. You need to identify the Aggregates in your domain model, and the operations that you want to perform on them. Users and Roles are usually tightly related, and generally your application would be performing operations with them in tandem - this calls for a single repository, centered around the User and its set of closely related entities.
I'm guessing from your post that you've seen this example. The problem with this example is that all the repositories are sharing the same CRUD functionality at the base level, but he doesn't go beyond this and implement any of the domain functions. All the repositories in that example look the same - but in reality, real repositories don't all look the same (although they should still be interfaced), there will be specific domain operations associated with each one.
Your repository domain operations should look more like:
userRepository.FindRolesByUserId(int userID)
userRepository.AddUserToRole(int userID)
userRepository.FindAllUsers()
userRepository.FindAllRoles()
userRepository.GetUserSettings(int userID)
etc...
These are specific operations that your application wants to perform on the underlying data, and the Repository should provide that. Think of it as the Repository represents the set of atomic operations that you would perform on the domain. If you choose to share some functionality through a generic repository, and extend specific repositories with extension methods, that's one approach that may work just fine for your app.
A good rule of thumb is that it should be rare for your application to need to instantiate multiple repositories to complete an operation. The need does arise, but if every event handler in your app is juggling six repositories just to take the user's input and correctly instantiate the entities that the input represents, then you probably have design problems.

Is a Repository per Table common?
No, but you can still have several repositories. You should build a repository around an aggregate.
Also, you might be able to abstract some functionality from all the repositories... and, since you are using Linq-to-Sql, you probably can...
You can implement a base repository which in a generic way implements all this common functionality.
The following example serves only to prove this point. It probably needs a lot of improvement...
interface IRepository<T> : IDisposable where T : class
{
IEnumerable<T> FindAll(Func<T, bool> predicate);
T FindByID(Func<T, bool> predicate);
void Insert(T e);
void Update(T e);
void Delete(T e);
}
class MyRepository<T> : IRepository<T> where T : class
{
public DataContext Context { get; set; }
public MyRepository(DataContext context)
{
Context = context; // assign the constructor parameter, not the property to itself
}
public IEnumerable<T> FindAll(Func<T,bool> predicate)
{
return Context.GetTable<T>().Where(predicate);
}
public T FindByID(Func<T,bool> predicate)
{
return Context.GetTable<T>().SingleOrDefault(predicate);
}
public void Insert(T e)
{
Context.GetTable<T>().InsertOnSubmit(e);
}
public void Update(T e)
{
throw new NotImplementedException();
}
public void Delete(T e)
{
Context.GetTable<T>().DeleteOnSubmit(e);
}
public void Dispose()
{
Context.Dispose();
}
}

To me the repository pattern is about putting a thin wrapper around your data access methodology: LINQ to SQL in your case, but NHibernate or something hand-rolled in others. What I've found myself doing is creating an extremely simple repository per table (like the one bruno lists and you already have). That is responsible for finding things and doing CRUD operations.
But then I have a service level that deals more with aggregate roots, as Johannes mentions. I would have a UserService with a method like GetExistingUser(int id). This would internally call the UserRepository.GetById() method to retrieve the user. If your business process requires the user class returned by GetExistingUser() to pretty much always need the User.IsInRoles() property to be filled, then simply have the UserService depend upon both the UserRepository and RoleRepository. In pseudo code it could look something like this:
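public class UserService
{
    // set from the constructor arguments
    private readonly IUserRepository _userRep;
    private readonly IRoleRepository _roleRep;

    public UserService(IUserRepository userRep, IRoleRepository roleRep) {...}

    public User GetById(int id)
    {
        // delegate to the repositories, not to the service itself
        User user = _userRep.GetById(id);
        user.Roles = _roleRep.FindByUser(id);
        return user;
    }
}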
The userRep and roleRep would be constructed with your LINQ to SQL bits something like this:
public class UserRep : IUserRepository
{
public UserRep(string connectionStringName)
{
// use the conn when building your datacontext
}
public User GetById(int id)
{
var context = new DataContext(_conString);
// obviously typing this freeform but you get the idea...
var user = // linq stuff
return user;
}
public IQueryable<User> FindAll()
{
var context = // ... same pattern, delayed execution
}
}
Personally I would make the repository classes internally scoped and have the UserService and other XXXXXService classes public, to keep your consumers of the service API honest. So again I see repositories as more closely linked to the act of talking to a datastore, with your service layer being more closely aligned to the needs of your business process.
I've often found myself really overthinking the flexibility of LINQ to Objects and all that stuff, and using IQueryable et al. instead of just building service methods that spit out what I actually need. Use LINQ where appropriate, but don't try to make the repository do everything.
public IList<User> ActiveUsersInRole(Role role)
{
    var users = _userRep.FindAll(); // IQueryable<User> - delayed execution
    var activeUsersInRole = from u in users
                            where u.IsActive && u.Roles.Contains(role)
                            select u;
    // I'm typing this from memory, so treat it as a sketch, but
    // again the point is that the service is presenting a simple
    // interface and delegating responsibility to
    // the repository with its simple methods.
    return activeUsersInRole.ToList(); // the query executes here
}
So, that was a bit rambling. Not sure if I really helped any, but my advice is to avoid getting too fancy with extension methods, and just add another layer to keep each of the moving parts pretty simple. Works for me.

If we write our repository layer in as much detail as Womp suggests, what do we put in our service layer? Do we have to repeat the same method calls, which would mostly consist of calls to the corresponding repository methods, for use in our controllers or code-behinds? This assumes that you have a service layer, where you write your validation, caching, workflow, authentication/authorization code, right? Or am I way off base?
