Generic Repository or Specific Repository for each entity?

Background
At the company I work for I have been ordered to update an old MVC app and implement a repository pattern for a SQL database. I have created the context of the database using Entity Framework Database-First and got 23 entities.
The first question
Do I need to create a repository for each entity, or should I implement a generic repository for the context? I'm asking because I found the following while searching the internet:
One repository per domain
You should think of a repository as a collection of domain objects in memory. If you’re building an application called Vega, you shouldn’t have a repository like the following:
public class VegaRepository {}
Instead, you should have a separate repository per domain class, like OrderRepository, ShippingRepository and ProductRepository.
Source: Programming with Mosh: 4 Common Mistakes with the Repository Pattern
The second question
Does a generic repository work for Entity Framework Database-First? I'm asking because I found the following while searching the internet:
Entity framework
Do note that the repository pattern is only useful if you have POCOs which are mapped using code first. Otherwise you’ll just break the abstraction with the entities instead (= the repository pattern isn’t very useful then). You can follow this article if you want to get a foundation generated for you.
Source: CodeProject: Repository pattern, done right

To begin with, if you are using a full ORM like Entity Framework or NHibernate, you should avoid implementing an additional layer of Repository and Unit of Work.
This is because the ORM itself exposes both a Generic Repository and a Unit of Work.
In the case of EF, your DbContext is the Unit of Work and DbSet is the Generic Repository. In the case of NHibernate, it is ISession itself.
Building a new Generic Repository wrapper over the existing one is repeated work. Why reinvent the wheel?
But some argue that using the ORM directly in calling code has the following issues:
It makes the code a little more complicated due to the lack of separation of concerns.
Data access code is merged into business logic. As a result, redundant complex query logic spreads to multiple places and is hard to manage.
As many ORM objects are used inline in calling code, the code is very hard to unit test.
As the ORM only exposes a Generic Repository, it causes the many issues mentioned below.
Apart from all of the above, one other issue generally discussed is "What if we decide to change the ORM in the future?". This should not be a key point when making the decision because:
You rarely change ORM, mostly NEVER – YAGNI.
If you change the ORM, you have to make huge changes anyway. You may minimize the effort by encapsulating the complete data access code (NOT just the ORM) inside something. We will discuss that something below.
Considering the four issues mentioned above, it may be necessary to create Repositories even though you are using a full ORM. This is a per-case decision though.
Even in that case, a Generic Repository must be avoided. It is considered an anti-pattern.
Why is a generic repository an anti-pattern?
A repository is a part of the domain being modeled, and that domain is not generic.
Not every entity can be deleted.
Not every entity can be added.
Not every entity has a repository.
Queries vary wildly; the repository API becomes as unique as the entity itself.
For GetById(), identifier types may be different.
Updating specific fields (DML) is not possible.
A generic query mechanism is the responsibility of an ORM.
Most ORMs expose an implementation that closely resembles a Generic Repository.
Repositories should implement the SPECIFIC queries for entities by using the generic query mechanism exposed by the ORM.
Working with composite keys is not possible.
It leaks DAL logic into Services anyway.
If you accept predicate criteria as a parameter, they need to be provided from the Service layer. If this is an ORM-specific class, it leaks the ORM into Services.
I suggest you read these articles (1, 2, 3, 4, 5) explaining why a generic repository is an anti-pattern. This other answer discusses the Repository pattern in general.
So, I suggest:
Do NOT use a repository at all; use the ORM directly in your calling code.
If you have to use a repository, then do not try to implement everything with a Generic Repository.
Instead, optionally create a very simple and small Generic Repository as an abstract base class. Or you can use the Generic Repository exposed by your ORM as the base repository, if the ORM allows it.
Implement Concrete Repositories as per your need and derive all of them from the Generic Repository. Expose the concrete repositories to calling code.
This way you get all the good of a generic repository while bypassing its drawbacks.
Even though it is very rare, this also helps when switching ORM in the future, as ORM code is cleanly abstracted in the DAL/Repositories. Please understand that switching ORM is not a primary objective of a Data Access Layer or Repository.
In any case, do not expose the Generic Repository to calling code.
Also, do not return IQueryable from concrete repositories. This violates the basic purpose of Repositories: to abstract data access. By exposing IQueryable outside the repository, many data access decisions leak into calling code and the Repository loses control over them.
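As a rough illustration of the suggestions above, here is a minimal sketch of a small abstract generic base with a concrete repository derived from it. All names (RepositoryBase, OrderRepository, Order, GetUnpaidOrders) are hypothetical, and an in-memory dictionary stands in for the ORM so the sketch runs on its own; in a real application the base class would presumably delegate to DbSet or ISession instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity, used only for this illustration.
public class Order
{
    public int Id { get; set; }
    public string Customer { get; set; }
    public bool IsPaid { get; set; }
}

// Small generic base. Its members are protected, so calling code never sees them.
public abstract class RepositoryBase<TEntity, TKey> where TEntity : class
{
    // Assumption: this dictionary stands in for the ORM (e.g. a DbSet).
    private readonly Dictionary<TKey, TEntity> _store = new Dictionary<TKey, TEntity>();

    protected void Save(TKey id, TEntity entity)
    {
        _store[id] = entity;
    }

    // Returns null when no entity with the given key exists.
    protected TEntity Find(TKey id)
    {
        TEntity entity;
        return _store.TryGetValue(id, out entity) ? entity : null;
    }

    protected IEnumerable<TEntity> Query(Func<TEntity, bool> predicate)
    {
        return _store.Values.Where(predicate);
    }
}

// Concrete repository: the only type exposed to calling code.
// It offers no Delete (orders cannot be deleted here) and no IQueryable.
public sealed class OrderRepository : RepositoryBase<Order, int>
{
    public void Add(Order order)
    {
        Save(order.Id, order);
    }

    public Order GetById(int id)
    {
        return Find(id);
    }

    // A specific query, returned as a materialized list.
    public IReadOnlyList<Order> GetUnpaidOrders()
    {
        return Query(o => !o.IsPaid).ToList();
    }
}
```

Calling code only ever depends on OrderRepository, so the decision of which operations an Order supports stays inside the repository.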
Do I need to create a repository for each entity or implement a generic repository for the context?
As suggested above, creating a repository for each entity is the better approach. Note that a Repository should ideally return a Domain Model instead of an Entity. But this is a different topic for discussion.
Does a generic repository work for EF Database-First?
As suggested above, EF itself exposes a Generic Repository. Building one more layer on top of it is useless. Your image is saying the same thing.

Related

Is it appropriate for a Factory class to also include functionality of extracting data from a database

One of the main aspects of software development that I struggle with is delegating the correct responsibility into classes within my program. Working at my first junior role I'm also being exposed to a lot of different design patterns and ideas, and sometimes the information can be overwhelming.
Obviously when we are building software we tend to state that a class should be responsible for one thing and one thing only. It should do that thing well and nothing more. So in the case of the Factory pattern the Factory class should be responsible for building the product and exposing an interface that allows the director to extract the product from the factory.
However the factory class obviously needs to receive the data to build the product from somewhere, without input data we have no output product. Therefore I'd like to know whether including functionality for a factory to query a database is appropriate? My rationale for this is that if the factory is tasked with building a particular product, then it should also be responsible for retrieving the data required to build that product. But I'm not 100% sure if this is correct.
Alternatively, should there be a repository class whose responsibility is to retrieve the data in question from the database, which can then be passed to the factory for assembly into the required product? The use of a repository class seems a bit excessive in this case as we have a class that will hold a large number of different pieces of data which then must be shipped into the factory class.
If we also bear in mind Uncle Bob's teachings that state that functions and methods should have absolutely no more than three parameters then we will be breaking this rule by passing in a large amount of data to the factory. If we first assemble the data into an encompassing class before passing to the factory then we are essentially doing the factory's job within the repository class.
Some guidance would be really appreciated on this, as in my head the lines are very blurry and I'm not sure how I should proceed.

You shouldn't use the Factory pattern to build objects from data extracted from a database. The Repository pattern and the Data Mapper pattern exist for this purpose. Together they should encapsulate all the logic for working with the data storage, with the following responsibilities:
the Repository gives business logic an interface for working with the data storage
the Data Mapper converts data from the database into concrete objects
The cooperation between the objects can look like this:
business logic uses a repository to read/persist objects.
the repository uses a Data Mapper to convert objects to INSERT or UPDATE queries and to convert data from the data storage to objects
You can also read more details about the repository pattern in C# on the Microsoft site, and see a C# example of the repository pattern.

Use 2 different classes.
A Data Access Object (DAO) provides an abstract interface to the database, and hides its details.
A factory abstracts and hides the details of the creation of your objects. For example, for unit testing you might want to configure the factory so that it doesn't use the Database at all.
To reduce the number of parameters between the DAO and the factory, wrap your many pieces of data in a few logically related classes.
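A minimal sketch of that split, assuming hypothetical names throughout (IProductDao, DescriptionData, PricingData, ProductFactory, InMemoryProductDao). The DAO returns a few logically related data classes, and the factory builds the product from them without knowing where they came from.

```csharp
using System;

// Hypothetical groupings of the many pieces of data into related classes.
public sealed class DescriptionData
{
    public DescriptionData(string name, string category) { Name = name; Category = category; }
    public string Name { get; }
    public string Category { get; }
}

public sealed class PricingData
{
    public PricingData(decimal netPrice, decimal vatRate) { NetPrice = netPrice; VatRate = vatRate; }
    public decimal NetPrice { get; }
    public decimal VatRate { get; }
}

public sealed class Product
{
    public Product(string name, string category, decimal grossPrice)
    {
        Name = name; Category = category; GrossPrice = grossPrice;
    }
    public string Name { get; }
    public string Category { get; }
    public decimal GrossPrice { get; }
}

// The DAO hides how and where the data is stored.
public interface IProductDao
{
    DescriptionData LoadDescription(int productId);
    PricingData LoadPricing(int productId);
}

// The factory hides how a Product is assembled; it never touches the database.
public sealed class ProductFactory
{
    public Product Create(DescriptionData description, PricingData pricing)
    {
        var grossPrice = pricing.NetPrice * (1 + pricing.VatRate);
        return new Product(description.Name, description.Category, grossPrice);
    }
}

// Stand-in DAO with hard-coded data, e.g. for unit tests.
public sealed class InMemoryProductDao : IProductDao
{
    public DescriptionData LoadDescription(int productId)
    {
        return new DescriptionData("Lamp", "Lighting");
    }

    public PricingData LoadPricing(int productId)
    {
        return new PricingData(10m, 0.2m);
    }
}
```

With this shape the factory takes two parameters however many columns sit behind them, and a test can feed it data without a database.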

Is it appropriate for a Factory class to also include functionality of extracting data from a database
My rationale for this is that if the factory is tasked with building a particular product, then it should also be responsible for retrieving the data required to build that product. But I'm not 100% sure if this is correct.
A product retrieved from a database is not a trivial object; it is a domain model.
A domain model (aka business model, aka entity, which might denote a particular instance of it) belongs to your domain layer (aka business layer).
In this regard, there are some patterns you should at least be familiar with...
(Active Record) VS (Data Mapper + Repository) VS (Table Data Gateway + Factory)
The Active Record pattern arguably violates the Single Responsibility Principle: it leads you to implement database access logic inside your domain model and tightly couples the two.
Ideally, to avoid these drawbacks at the cost of slightly increased complexity (in the short term only), we separate the database access logic into an additional layer, the data access layer. One of the main components of this layer is the data mapper, which in our context (a READ operation) is in charge of retrieving data from the database and mapping it to a new domain model instance, your specific product (entity). More generally, it encapsulates CRUD operations against the database, abstracting the database away. Its API inputs and outputs are entity objects and possibly Query Objects.
Optionally, a full-featured data mapper would make use of patterns such as:
Unit Of Work
Lazy Loading
Identity Map
Transaction
Lock Strategies
Metadata Mapping
Alternatively, should there be a repository class whose responsibility is to retrieve the data in question from the database, which can then be passed to the factory for assembly into the required product? The use of a repository class seems a bit excessive in this case as we have a class that will hold a large number of different pieces of data which then must be shipped into the factory class.
Repository is not a part of your data access layer, but of your domain layer. So it is client of your data access layer. It doesn't encapsulate any database access logic but uses the data mapper.
A repository encapsulates query logic for a particular domain model plus a collection of in-memory entities you've previously retrieved.
A very basic example:
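class ProductRepository
{
    private $productCollection;
    private $dataMapper;

    public function __construct($productCollection, $dataMapper)
    {
        $this->productCollection = $productCollection;
        $this->dataMapper = $dataMapper;
    }

    public function findById($id)
    {
        // Ask the data mapper only when the product is not in memory yet.
        if (!$this->productCollection->has($id)) {
            $product = $this->dataMapper->get(new Query(Product::class, $id));
            $this->productCollection->add($product);
            return $product;
        }
        return $this->productCollection->get($id);
    }
}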
Finally, we can encapsulate database access logic in a table data gateway and use it in a factory. This would result in a solution similar to Gonen I's. It is simple to implement, but there might be drawbacks compared to the data mapper solution. I've never implemented, used or even studied this approach, so I can't say much...
You'd definitely learn a lot by attempting to implement all that by yourself and I'd encourage you to, but keep in mind that if you need a serious solution, ORMs might be interesting for you.
If you're keen to learn more about all this, I recommend Martin Fowler's Patterns of Enterprise Application Architecture book which is summarized here: https://www.martinfowler.com/eaaCatalog/index.html

Why should I build a repository pattern with a unit of work on top of my EF?

According to MSDN, DbSet:
DbSet<TEntity> Class
A DbSet represents the collection of all entities in the context, or
that can be queried from the database, of a given type. DbSet objects
are created from a DbContext using the DbContext.Set method.
And according to MSDN, DbContext:
DbContext Class
A DbContext instance represents a combination of the Unit Of Work and
Repository patterns such that it can be used to query from a database
and group together changes that will then be written back to the store
as a unit. DbContext is conceptually similar to ObjectContext.
So EF already uses the Repository pattern and the Unit of Work internally.
DbSet <----> Repository
DbContext <----> Unit Of Work
Why should I build a repository pattern with a unit of work on the top of my EF?

Why should I build a repository pattern with a unit of work on top of my EF?
Depends on how you want to manage your dependencies.
If Entity Framework is your abstraction layer and the database itself is the dependency, then Entity Framework does indeed already provide your repositories and unit of work. The trade-off is that your domain relies on Entity Framework. As long as that dependency is acceptable, you're good.
If, on the other hand, you want to treat Entity Framework itself as a dependency that can potentially be swapped out without making changes to domain code, then you'd want to create an abstraction as a wrapper around that.
Basically, it all comes down to where you draw the line of what is or is not an "external dependency". For some projects it doesn't matter, for some it's the physical database, for some it's the data access framework, etc.

Why should I build a repository pattern with a unit of work on the top
of my EF?
Because of the Interface Segregation Principle. The method signatures in DbSet and DbContext are basically a big low-level mess, there's a huge mismatch between them and what is typically expected in a Repository and a Unit of Work. In other words, if you use DbSet and DbContext directly, your Application Services code will suffer from leaky abstractions.
In your Application layer, you need to manipulate appropriate semantics. The code in that layer only needs to speak in terms of business transactions and large collections where you can fetch and store stuff. These are very high-level, minimalist abstract concepts. Entity Framework lingo is just too fuzzy and low-level for that, so you need to introduce other idioms - Repository and UoW.

Utility DAL layer with entity framework 6

I'm wondering about the usefulness of building a DAL with EF.
Why not call EF directly in the business layer, considering that the EF DbContext is a unit of work and its DbSets are repositories?
So why add an extra DAL, which is ultimately just a facade?
The only advantage I see is in case we have to change the data access implementation, for example replacing EF with Hibernate or something else. But honestly, I've never seen that happen.

Actually, with a data mapper, developing a DAL is pointless because it would contain zero lines of code.
Everything on top of a data mapper isn't a data access layer but actual domain: a data mapper implementation like an OR/M translates your objects into the underlying relational data and vice versa, so what you build on top of it is your domain, and you are spared the pain of the object-relational impedance mismatch.
There are two reasons to introduce the repository pattern on top of a data mapper. First, you may want to be able to switch the underlying data store in the long run, even to a non-relational one (or from NoSQL to SQL, who knows!). Second, and more decisive, you want to be able to mock the data store with fakes in order to unit test your domain.
Finally, even though Entity Framework implements unit of work and other patterns, sometimes its implementation may not suit your domain requirements and you need to wrap it to make it more concrete for your domain.
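To make the unit-testing argument concrete, here is a minimal sketch with hypothetical names (ICustomerRepository, RegistrationService, FakeCustomerRepository). The domain service depends only on the repository interface, so a test can hand it an in-memory fake instead of an EF-backed implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Customer
{
    public string Email { get; set; }
}

// The abstraction the domain depends on. An EF-backed implementation
// (not shown) would be used in production.
public interface ICustomerRepository
{
    Customer FindByEmail(string email);   // returns null when not found
    void Add(Customer customer);
}

// Domain logic under test: an email address may be registered only once.
public sealed class RegistrationService
{
    private readonly ICustomerRepository _customers;

    public RegistrationService(ICustomerRepository customers)
    {
        _customers = customers;
    }

    public bool Register(string email)
    {
        if (_customers.FindByEmail(email) != null)
        {
            return false;
        }
        _customers.Add(new Customer { Email = email });
        return true;
    }
}

// Fake used by unit tests; no database involved.
public sealed class FakeCustomerRepository : ICustomerRepository
{
    private readonly List<Customer> _items = new List<Customer>();

    public Customer FindByEmail(string email)
    {
        return _items.FirstOrDefault(
            c => string.Equals(c.Email, email, StringComparison.OrdinalIgnoreCase));
    }

    public void Add(Customer customer)
    {
        _items.Add(customer);
    }
}
```

A test then constructs new RegistrationService(new FakeCustomerRepository()) and checks that registering the same address twice fails the second time.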

What should be CRUD?

I have read a lot of documentation about CRUD and I still don't understand what exactly should be CRUDable! It seems most people talk about CRUD entities, but their architecture doesn't show any Create, Read, Update or Delete methods in the entities. They implement these CRUD operations in a separate class. I like to call this kind of class a CRUD controller.
Is it correct to create POCO entities with a CRUD controller? What should be CRUD?

My take is that you should have a repository which performs the CRUD operations.
Then a controller should call the appropriate CRUD method in the repository, possibly via an intermediate service layer.
Read more about the repository pattern here and here.

These classes are usually called repositories. A repository provides access to your entities with means of adding, updating, removing and retrieving one or more entities. So the repository Creates, Reads, Updates and Deletes (CRUD).
When using a database your POCO is normally an object your database entity is converted into, e.g. with AutoMapper in the repository.

In architectural terms, CRUD means that you have entities without business rules. These entities have, at most, some simple validations. When you have this kind of entity you speak of CRUD because you can modify the data without worrying about anything else (but validations). This can be used, for example, for maintaining a list of contacts: name + phone no. + address. At most you can validate that the name is not empty, the phone no. is valid, and the address is valid. But there are no business rules in there.
If business rules are involved, you should avoid CRUD so that the business rules are respected. For example, you should not allow CRUD for an order detail, because there are business rules involved: perhaps you cannot change an order detail if the order is already paid or sent, or confirmed to the customer. Besides, the total order amount depends on the order details. In this case you should use the order with its details as a whole, and read / write / update it all at once. (In DDD this is called an "aggregate".)
Speaking about CRUD is not a question of how you implement it (repository, ORM like DbContext or NHibernate, or whichever you want to use), but a more philosophical question.
Implementing CRUD is much faster than implementing any other architecture which involves business rules (for example DDD). If you can use CRUD for an entity, it is advisable to do so, but not in other cases.
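A minimal sketch of the order example, with hypothetical names (Order, OrderLine, AddLine, MarkPaid). The order details can only be changed through the order itself, which is where the rule about paid orders is enforced and where the total is derived.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class OrderLine
{
    public OrderLine(string sku, int quantity, decimal unitPrice)
    {
        Sku = sku; Quantity = quantity; UnitPrice = unitPrice;
    }
    public string Sku { get; }
    public int Quantity { get; }
    public decimal UnitPrice { get; }
}

// The order and its lines form one unit; lines are never edited directly.
public sealed class Order
{
    private readonly List<OrderLine> _lines = new List<OrderLine>();

    public bool IsPaid { get; private set; }

    // Read-only view, so callers cannot add or remove lines behind the order's back.
    public IReadOnlyList<OrderLine> Lines
    {
        get { return _lines.AsReadOnly(); }
    }

    // The total always follows from the lines; it is never set independently.
    public decimal Total
    {
        get { return _lines.Sum(l => l.Quantity * l.UnitPrice); }
    }

    public void AddLine(string sku, int quantity, decimal unitPrice)
    {
        // Business rule from the text: a paid order can no longer be changed.
        if (IsPaid)
        {
            throw new InvalidOperationException("A paid order can no longer be changed.");
        }
        if (quantity <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(quantity));
        }
        _lines.Add(new OrderLine(sku, quantity, unitPrice));
    }

    public void MarkPaid()
    {
        IsPaid = true;
    }
}
```

A plain CRUD update on an order detail row would bypass both checks, which is the reason for loading and saving the order as a whole.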
As to your comment:
but their architecture doesn't show any Create, Read, Update or Delete methods in their entities
That's natural... you can do CRUD with EF, for example, without explicitly declaring the CRUD methods. Create an entity in the context, or remove it, or modify it, and the CRUD operations will be implicitly executed on SaveChanges.

Remove coupling for EF DBContext and infrastructure layer

I defined my interfaces in the infrastructure layer to use Dependency Injection, but now I have a problem: how can I resolve the DbContext dependency through an interface without adding a reference to the EF DLL in the infrastructure layer and the service layer?

If you need to hide EF completely from your application, you will need to use the repository pattern, hide EF behind your repositories and generate (or write) POCO entities.
If you're more pragmatic, you can use generic repositories with IQueryable support, which allows a great development and unit testing experience, but what to choose is up to you.

You can modify the T4 files (aka T4 templates or .tt files) to create interfaces along with the context, and even split them into separate T4 files for each of the two, so you can place them in separate assemblies. You can also make the context return IQueryable instead of ObjectQuery, however...
In order to write optimized queries that run on the database and not in memory, the queries must take into account the technology beneath them. You cannot write generic queries, unit test them on an in-memory list, and then expect them to translate to SQL correctly and run efficiently and without exceptions.
- You will have to test your queries against a real database (with demo data).
What you should do is implement services which hide the DAL technology from the layers above it, yet inside their implementation use the full power of EF to work as efficiently as possible.
These services can be mocked, to test the layers above them and the services themselves can be tested together with their usage of EF, using a test DB (e.g. using a LOCALDB instance created and started by the test class).
A few of the many relevant links:
Generic Repository With EF 4.1 what is the point
ASP.NET MVC3 and Entity Framework Code first architecture
Is UnitOfWork and GenericRepository Pattern redundant In EF 4.1 code first?
https://softwareengineering.stackexchange.com/questions/133448/unit-integration-testing-my-dal
