Entity Framework vs repository pattern - c#

I'm wondering about the usefulness of the repository pattern in a project that uses Entity Framework. There are opposing claims about this - some say that EF is itself an implementation of the repository and Unit of Work patterns, so there is no need to wrap it in another abstraction layer, while others argue that wrapping it has advantages such as separating the DAL from the BL and making unit tests easier to write. In my experience, I often come across the following approach (generally, not only in EF projects):
Repository (DAL) <-> Service (BL) <-> Controller
Repository + Service + Type = Model
The repository has methods responsible for data access only, e.g.:
public interface IUsersRepository
{
    IEnumerable<User> GetAll();
    User Get(int id);
    User GetByLogin(string login);
    void Update(User item);
    void Delete(int id);
    void Delete(User item);
    // ...
}
Often a generic repository is used instead, whose methods receive functions for filtering, sorting and so on.
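A minimal sketch of such a generic repository, assuming Expression-based filters so that a LINQ provider can translate them (the interface shown here is illustrative, not from any particular library):

public interface IGenericRepository<T> where T : class
{
    // filter and orderBy are optional; passing null returns everything.
    IEnumerable<T> Find(
        Expression<Func<T, bool>> filter = null,
        Func<IQueryable<T>, IOrderedQueryable<T>> orderBy = null);
    T GetById(int id);
    void Add(T item);
    void Update(T item);
    void Delete(T item);
}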
The service in turn uses the repository and contains the business logic, e.g.:
public interface IUsersService
{
    // ...
    bool VerifyPassword(string login, string password);
    void ChangePassword(string login, string password);
    // ...
}
As I understand it, the service shouldn't perform any DAL operations - this means, for example, that we shouldn't return IQueryable collections from the repository, because then the query would be executed outside the repository and unit tests would not be fully reliable (there are differences between LINQ-to-Entities and LINQ-to-Objects).
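To illustrate the kind of difference I mean (a hypothetical snippet, not from my codebase):

// Passes against an in-memory list (LINQ-to-Objects) but throws
// NotSupportedException when translated by EF (LINQ-to-Entities),
// because this String.Equals overload has no SQL translation:
var user = users.FirstOrDefault(
    u => u.Login.Equals(login, StringComparison.OrdinalIgnoreCase));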
I see a problem with query efficiency here - we will often fetch more data from the database than we need and then filter it in memory.
For example, let's consider the bool VerifyPassword(string login, string password) method.
In this case we need to fetch the whole User entity from the database - which may have, say, 50 properties - just to verify a password. We can, of course, create many methods in the repository, like:
string GetPasswordHash(string login)

or

bool VerifyPassword(string login, string passwordHash)
{
    return db.Users.Any(x => x.Login == login && x.Password == passwordHash);
}
without needing to fetch the whole entity from the database, but in my opinion this can be a "little" bit of overhead.
We could also move the VerifyPassword function from the service to the repository - but then we should ask ourselves whether the two layers, repository and service, are needed at all. If we merge them, though, we lose the benefits of separating the DAL and BL layers - the unit tests would in reality be integration tests. So maybe it would be simpler (and better) to keep it all in the controller and inject a mocked DbContext, or use something like Effort for unit tests?
I will be grateful for your answers how you see this and how you solve this problem in your projects.
UPDATE
I understand that the repository pattern makes it easy to change the data source/provider to another one, like NHibernate, LINQ-to-SQL, plain ADO.NET and so on.
But could you tell me how you implement sorting and paging in your repository API? I mean, what's the best way to pass a sorting and paging specification to the repository? I've seen some people pass LINQ function predicates, but that tightly couples the repository to LINQ - using it with plain ADO.NET, stored procedures etc. would be problematic. Creating many methods like GetUsersOrderedByNameDesc() etc. is crazy in my opinion. So maybe I should create my own specification class, build the sorting/paging criteria with it, and process it in the repository? If so, can you provide me with some example implementation?
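To make the question concrete, here is roughly what I imagine - a provider-agnostic specification object plus a LINQ-backed translation (all names are hypothetical; an ADO.NET-backed repository could instead map SortBy onto an ORDER BY clause):

public class QuerySpec
{
    public string SortBy { get; set; }   // entity property name, e.g. "Name"
    public bool Descending { get; set; }
    public int Page { get; set; }        // 1-based page index
    public int PageSize { get; set; }
}

// In a LINQ-based repository implementation:
public IEnumerable<User> GetUsers(QuerySpec spec)
{
    IQueryable<User> query = ApplySort(db.Users, spec);
    return query.Skip((spec.Page - 1) * spec.PageSize)
                .Take(spec.PageSize)
                .ToList();
}

// Builds u => u.<SortBy> as an expression tree, so the sort stays
// translatable by LINQ-to-Entities even though the spec only holds a string.
private static IQueryable<User> ApplySort(IQueryable<User> query, QuerySpec spec)
{
    var param = Expression.Parameter(typeof(User), "u");
    var prop = Expression.Property(param, spec.SortBy);
    var lambda = Expression.Lambda(prop, param);
    var method = spec.Descending ? "OrderByDescending" : "OrderBy";
    var call = Expression.Call(typeof(Queryable), method,
        new[] { typeof(User), prop.Type },
        query.Expression, Expression.Quote(lambda));
    return query.Provider.CreateQuery<User>(call);
}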

I will try to answer your core question. Here are my two cents.
Why repository vs. Entity Framework? You will need to answer some of the questions below when designing or choosing between them:
Is there a chance your project will have different data sources it will talk to? If yes, then the repository pattern over Entity Framework makes sense.
Having a repository pattern will also let you set up your test framework more easily: you can have a mock data store that the repository implementation (DAL) talks to, as sketched below.
You may also want to consider how changes to your data model can percolate back into changes in your business logic when you use Entity Framework directly.
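As an illustration of the testing point above, here is the kind of in-memory test double the repository pattern enables, written against the IUsersRepository interface from the question (the Id and Login properties on User are assumptions):

public class InMemoryUsersRepository : IUsersRepository
{
    private readonly List<User> _users = new List<User>();

    public IEnumerable<User> GetAll() { return _users; }
    public User Get(int id) { return _users.FirstOrDefault(u => u.Id == id); }
    public User GetByLogin(string login) { return _users.FirstOrDefault(u => u.Login == login); }
    public void Update(User item) { /* no-op: the list holds live references */ }
    public void Delete(int id) { _users.RemoveAll(u => u.Id == id); }
    public void Delete(User item) { _users.Remove(item); }

    // Test helper; not part of the interface.
    public void Seed(User user) { _users.Add(user); }
}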

Related

Unit testing existing codebase

I'm working on a medium sized (ASP.NET MVC) project with lots of business logic code since the business rules for the application are quite complex at times.
I'm using the following solution structure:
ClientName.Web (MVC website)
ClientName.Interfaces (defines the business logic methods)
ClientName.BusinessLogic (should probably be called "Services" since I'm using Entity Framework here)
ClientName.Model (contains the EF DbContext class, enums etc..)
ClientName.Tests (unit tests project)
For accessing the (MSSQL) database, I'm using Entity Framework v. 6.
Now, most of the business logic methods are already written and working as they should. However, the size of the codebase is now at a point where I'm fixing one feature and breaking another, which is far from ideal. What I should've done from the very beginning was to write unit tests for the business logic methods, taking a TDD approach.
Due to this, I want to try to bring in unit tests for the existing (and future) business logic methods. I have read about Moq and discovered this blog post on MSDN, which I found interesting. There's one problem, though: my DbContext gets injected into the business logic classes (I run one DbContext per HTTP request), which are used to perform the CRUD operations. A business logic class could look like this:
public class PersonBusiness
{
    private readonly MyContext _myContext;

    public PersonBusiness(MyContext myContext)
    {
        _myContext = myContext;
    }

    public IEnumerable<PersonResponsibility> GetResponsibilities()
    {
        return _myContext.PersonResponsibilities.Where(x => x.IsActive).ToList();
    }

    public void CreatePerson(string name)
    {
        Person person = new Person() { Name = name };
        _myContext.People.Add(person);
        _myContext.SaveChanges();
    }
}
(This is just a very simple example; some of the BL methods are absurdly complex, with reads from N tables, etc.)
As far as I understand, I need a fake DbContext for testing, which Moq can help me with, but what I don't get is how to use the fake DbContext with my business logic classes, since they expect an instance of MyContext. Is there a way I can use my existing methods, but with a fake context instead?
Also, this is a fairly large database with 20-25 tables. Do I have to create the mock data manually for each table, for each test I run, or is there some other way to "mock" it? Some tests will involve e.g. 7 different tables, which makes for a lot of manual mock data :-)
Any help/hint is greatly appreciated.
Thanks in advance.
Instead of passing a MyContext, create an IMyContext interface for your context class and pass that instead. You can then mock what you expect on the context class.
As for mocking your data, 20-25 tables isn't really a large schema. You can build up a library of shared mock data and manipulate it as required for your tests.
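A sketch of how the two pieces fit together, following the Moq-over-DbSet approach from the MSDN post linked in the question (IMyContext and the seeded entities here are illustrative):

public interface IMyContext
{
    DbSet<Person> People { get; }
    DbSet<PersonResponsibility> PersonResponsibilities { get; }
    int SaveChanges();
}

// In a test, back a DbSet with an in-memory list:
var data = new List<PersonResponsibility>
{
    new PersonResponsibility { IsActive = true },
    new PersonResponsibility { IsActive = false }
}.AsQueryable();

var mockSet = new Mock<DbSet<PersonResponsibility>>();
mockSet.As<IQueryable<PersonResponsibility>>().Setup(m => m.Provider).Returns(data.Provider);
mockSet.As<IQueryable<PersonResponsibility>>().Setup(m => m.Expression).Returns(data.Expression);
mockSet.As<IQueryable<PersonResponsibility>>().Setup(m => m.ElementType).Returns(data.ElementType);
mockSet.As<IQueryable<PersonResponsibility>>().Setup(m => m.GetEnumerator()).Returns(data.GetEnumerator());

var mockContext = new Mock<IMyContext>();
mockContext.Setup(c => c.PersonResponsibilities).Returns(mockSet.Object);

// PersonBusiness now takes IMyContext instead of MyContext:
var business = new PersonBusiness(mockContext.Object);
var active = business.GetResponsibilities(); // returns only the active row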

Data access architectures with Raven DB

What data access architectures are available that I can use with Raven DB?
Basically, I want to separate persistence via interfaces, so I don't expose the underlying storage to the upper layers. That is, I don't want my domain to see IDocumentStore or IDocumentSession, which are from Raven DB.
I have implemented the generic repository pattern and that seems to work. However, I am not sure that is actually the correct approach. Maybe I should go towards command-query segregation or something else?
What are your thoughts?
Personally, I'm not really experienced with the Command Pattern. I saw that it was used in Rob Ashton's excellent tutorial.
For myself, I'm going to try using the following:
Repository Pattern (as you've done)
Dependency Injection with StructureMap
Moq for mock testing
Service layer for isolating business logic (not sure of the pattern here, or even if this is a pattern).
So when I wish to get any data from RavenDB (the persistence source), I'll use services, which will then call the appropriate repository. This way I'm not exposing the repository to the application, nor is the repository very heavy or complex -> it's basically FindAll / Save / Delete.
E.g.:
public SomeController(IUserService userService, ILoggingService loggingService)
{
    UserService = userService;
    LoggingService = loggingService;
}

public ActionMethod Index()
{
    // Find all active users, page 1 with 15 records.
    var users = UserService.FindWithIsActive(1, 15);
    return View(new IndexViewModel(users));
}

public class UserService : IUserService
{
    public UserService(IGenericRepository<User> userRepository,
                       ILoggingService loggingService)
    {
        Repository = userRepository;
        LoggingService = loggingService;
    }

    public IEnumerable<User> FindWithIsActive(int page, int count)
    {
        // Note: Repository.Find() returns an IQueryable<User> in this case.
        // Think of it as SELECT * FROM Users, if it were an RDBMS.
        return Repository.Find()
                         .WithIsActive()
                         .Skip((page - 1) * count) // page is 1-based
                         .Take(count)
                         .ToList();
    }
}
So that's a very simple and contrived example with no error/validation checking, try/catch, etc. - and it's pseudo-code - but you can see how the services are rich while the repository is (supposed to be, for me at least) simple or light. And then I only expose any data via services.
That's what I do right now with .NET and Entity Framework, and I'm literally hours away from giving this a go with RavenDb (WOOT!)
What are you trying to achieve by that?
You can't build an application which makes use of both an RDBMS and a DocDB, not efficiently at least. You have to decide for yourself which database you are going to use, and then go all the way with it. If you decide to go with an RDBMS, you can use NHibernate for example - and then again, there is no need for any other abstraction layer.

The responsibilities of my service and repository layer

The other day I asked this question:
Should the repository layer return data-transfer-objects (DTO)?
The answer (well, from just one person, but I already had a hunch that it wasn't a good idea) was that no, the repository layer should not have to deal with DTO objects (their purpose is purely to be sent over the wire); the service layer should deal with that.
Now I've come up with a construction in the meantime that I need your opinion on. The idea is that, when it makes sense to do so, the repository layer can return an interface type I've defined called IProjectable. This wraps the query (the repository layer does not execute it yet) but does not allow the consumer to change the query (it's not IQueryable) - it only allows projection operations on it (so far, for me, only First and ToPagedList) that perform the projection and actually execute the query.
So something like this in the repository:
public IProjectable<User> GetUser(int id)
{
    var query = from u in Set<User>()
                where u.UserID == id
                select u;
    return query.AsProjectable();
}
And in the service layer something like this:
var dto = repository.GetUser(16).Single(u => new SimpleUserDto
{
    FullName = u.FirstName + " " + u.LastName,
    DisplayAddress = u.Address.Street + u.Address.HouseNumber,
    OrderCount = u.Orders.Count()
});
return dto;
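For reference, a stripped-down sketch of what I mean by IProjectable and AsProjectable (my real implementation has more operations):

public interface IProjectable<T>
{
    TResult Single<TResult>(Expression<Func<T, TResult>> projection);
    // ... plus First, ToPagedList, etc.
}

public static class ProjectableExtensions
{
    public static IProjectable<T> AsProjectable<T>(this IQueryable<T> query)
    {
        return new Projectable<T>(query);
    }

    private class Projectable<T> : IProjectable<T>
    {
        private readonly IQueryable<T> _query;

        public Projectable(IQueryable<T> query)
        {
            _query = query;
        }

        public TResult Single<TResult>(Expression<Func<T, TResult>> projection)
        {
            // The projection is composed into the wrapped query, and the
            // query only executes here, inside the wrapper.
            return _query.Select(projection).Single();
        }
    }
}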
Am I correct in saying that doing the actual data access here is still the responsibility of the repository layer (as it should be) and that the projection to a serializable form is the responsibility of the service layer (as it should be)?
The only other way I see to do this efficiently (returning a User from the repository and doing the Count() on its Orders in the service layer would result in an extra query to the database) is to define a type that has all these properties, return it from the repository layer, and just not call it a "Dto" - which seems silly, as it would be identical to the DTO, just named differently for the sake of "purity". This way, it seems, I can have my cake and eat it too, for the most part.
The downside I see is that you can get a mismatch where the service layer performs projections that can't actually be translated to SQL - which it shouldn't have to worry about - or where it performs such complex projections that it becomes questionable which layer is doing the actual data access.
I'm using Entity Framework 4 by the way, if it matters.
"Am I correct in saying that doing the actual data access here is still the responsibility of the repository layer (as it should be) and that the projection to a serializable form is the responsibility of the service layer (as it should be)?"
Yes, you are. The service layer still has no idea how the actual data access is performed (as it should not). Are the calls sent to SQL? Is there a caching layer in between?
"The downside I see is that you can get a mismatch where the service layer performs projections that can't actually be translated to SQL - which it shouldn't have to worry about - or where it performs such complex projections that it becomes questionable which layer is doing the actual data access."
For this problem I use a pipeline pattern, which is basically just a set of extension methods over IProjectable that perform tested projections. Then, in your service layer, you can write your query as a composition of these pipeline methods, for example:
var users = repository.GetUsers().FilterByName("Polity").OrderByAge().ToTransferObjects();
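A minimal sketch of such pipeline methods, written over IQueryable<User> for simplicity (FilterByName, OrderByAge and ToTransferObjects are the hypothetical names from the line above; UserDto and the User properties are assumptions):

public static class UserPipelines
{
    // Each step stays IQueryable, so the provider can translate the whole chain.
    public static IQueryable<User> FilterByName(this IQueryable<User> source, string name)
    {
        return source.Where(u => u.Name == name);
    }

    public static IQueryable<User> OrderByAge(this IQueryable<User> source)
    {
        return source.OrderBy(u => u.Age);
    }

    // The terminal step projects and executes the query.
    public static List<UserDto> ToTransferObjects(this IQueryable<User> source)
    {
        return source.Select(u => new UserDto { Name = u.Name, Age = u.Age }).ToList();
    }
}

Each method can be unit tested in isolation against an in-memory IQueryable, which is what makes the projections "tested".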
One of the developers I most respect, Ayende Rahien (http://ayende.com/Blog/Default.aspx), said: "ORM is your repository" - video here -> http://community.devexpress.com/blogs/seth/archive/2011/03/09/interview-with-ayende-rahien-aka-oren-eini.aspx
The question is: do you really need the repository pattern?
Just my opinion :)

Does Queryability and Lazy Loading in C# blur the lines of Data Access vs Business Logic?

I am experiencing a mid-career philosophical architectural crisis. I see the very clear lines between what is considered client code (UI, Web Services, MVC, MVP, etc.) and the Service Layer. The lines from the Service Layer back, though, are getting more blurred by the minute. And it all started with the ability to query code with LINQ and the concept of lazy loading.
I have created a Business Layer that consists of Contracts and Implementations. The Implementations can then have dependencies on other Contracts, and so on. This is handled via an IoC container with DI. There is one service that handles data access, and all it does is return a UnitOfWork. This UnitOfWork creates a transaction when instantiated and commits the data in the Commit method. [View this article (Testability and Entity Framework 4.0)]:
public interface IUnitOfWork : IDisposable {
    IRepository<T> GetRepository<T>() where T : class;
    void Commit();
}
The Repository is generic and works against two implementations (EF4 and an in-memory data store). T is made up of POCOs that are generated from the database schema or the EF4 mappings. Testability is built into the Repository design: we can leverage the in-memory implementation to assert results against expectations.
public interface IRepository<T> where T : class {
    IQueryable<T> Table { get; }
    void Add(T entity);
    void Remove(T entity);
}
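For illustration, a stripped-down version of the in-memory implementation looks roughly like this (simplified; unit-of-work wiring omitted):

public class InMemoryRepository<T> : IRepository<T> where T : class {
    private readonly List<T> _items = new List<T>();

    public IQueryable<T> Table {
        // LINQ-to-Objects stands in for the EF4 query provider here.
        get { return _items.AsQueryable(); }
    }

    public void Add(T entity) { _items.Add(entity); }
    public void Remove(T entity) { _items.Remove(entity); }
}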
While the data source is abstracted, IQueryable still gives me the ability to create queries anywhere I want within the business logic. Here is an example.
public interface IFoo {
    Bar[] GetAll();
}

public class FooImpl : IFoo {
    IDataAccess _dataAccess;

    public FooImpl(IDataAccess dataAccess) {
        _dataAccess = dataAccess;
    }

    public Bar[] GetAll() {
        Bar[] output;
        using (var work = _dataAccess.DoWork()) {
            output = work.GetRepository<Bar>().Table.ToArray();
        }
        return output;
    }
}
Now you can see how the queries could get even more complex as you perform joins with complex filters.
Therefore, my questions are:
Does it matter that there is no clear distinction between the BLL and the DAL?
Is queryability considered data access or business logic when behind a Repository layer that acts like an in-memory abstraction?
Addition: The more I think about it, maybe the second question was the only one that should have been asked.
I think the best way to answer your questions is to step back a moment and consider why separation between business logic layers and data access layers is the recommended practice.
In my mind, the reasons are simple: keep the business logic separate from the data layer because the business logic is where the value is, because the data layer and the business logic will need to change over time more or less independently of each other, and because the business logic needs to be readable without detailed knowledge of everything the data access layer does.
So the litmus test for your query gymnastics boils down to this:
Can you make a change to the data schema in your system without upsetting a significant portion of the business logic?
Is your business logic readable to you and to other C# developers?
1. Only if you care more about philosophy than getting stuff done. :)
2. I'd say it's business logic because you have an abstraction in between. I would call that repository layer part of DAL, and anything that uses it, BL.
But yeah, this is blurry to me as well. I don't think it matters, though. The point of using patterns like this is to write clean, usable code that is easy to communicate at the same time, and that goal is accomplished either way.
1. Does it matter that there is no clear distinction between the BLL and the DAL?
It sure does matter! Any programmer who uses your Table property needs to understand the ramifications (database round trip, query translation, object tracking). That goes for programmers reading the business logic classes as well.
2. Is queryability considered data access or business logic when behind a Repository layer that acts like an in-memory abstraction?
Abstraction is a blanket that we hide our problems under.
If your abstraction is perfect, then the queries could be abstractly considered as operating against in-memory collections and therefore they are not data access.
However, abstractions leak. If you want queries that make sense in the data world, there must be effort to work above and beyond the abstraction. That extra effort (which defeats abstraction) produces data access code.
Some examples:
output = work.GetRepository<Bar>().Table.ToArray();
This code is (abstractly) fine. But in the data world it results in scanning an entire table, which is (at least generally) dumb!
badquery = work.GetRepository<Customer>().Table.Where(c => c.Name.Contains("Bob")).ToArray();
goodquery = work.GetRepository<Customer>().Table.Where(c => c.Name.StartsWith("Bob")).ToArray();
goodquery is better than badquery when there's an index on Customer.Name. But that fact is not available to us unless we lift the abstraction.
badquery = work.GetRepository<Customer>().Table
    .GroupBy(c => c.Orders.Count())
    .Select(g => new
    {
        TheCount = g.Key,
        TheCustomers = g.ToList()
    }).ToArray();

goodquery = work.GetRepository<Customer>().Table
    .Select(c => new { Customer = c, theCount = c.Orders.Count() })
    .ToArray()
    .GroupBy(x => x.theCount)
    .Select(g => new
    {
        TheCount = g.Key,
        TheCustomers = g.Select(x => x.Customer).ToList()
    })
    .ToArray();
goodquery is better than badquery, since badquery will re-query the database by group key, for each group (and worse, it is highly unlikely there is an index to help with filtering customers by c.Orders.Count()).
"Testability is built into the Repository design: we can leverage the in-memory implementation to assert results against expectations."
Be under no illusions that your queries are being tested if you actually run them against in-memory collections. Those queries are untestable unless a database is involved.

Should a Repository be responsible for "flattening" a domain?

Disclaimer: I'm pretty new to DDD and its associated terminology, so if I'm mislabeling any concepts, please correct me.
I'm currently working on a site with a relatively simple domain model (Catalog items, each of which stores a collection of CatalogImage items).
My repository follows the standard interface of FindByID(int ID), GetAll(), etc.
The problem arises when trying to find a particular image by its ID; I end up with methods such as FindImageByID(int CatalogItemID, int ImgID).
As new requirements develop and the object graph becomes more heavily nested, I can see an explosion of methods such as Find{NestedType}ByID(int catalogItemID, ....., int nestedTypeID).
Should I simply return an IEnumerable from the FindAll() method and use LINQ in a higher layer to form these queries? Or would that be a violation of SoC?
It sounds to me like you have a justification for building multiple repositories.
Example
interface CatalogRepository
{
    Catalog FindByID(int ID);
}

interface CatalogImageRepository
{
    CatalogImage FindByID(int ID);
}
This will properly separate out your concerns, since each repository is only responsible for knowing how to deal with that specific entity.
I would filter the model at a layer above the repository, with LINQ if you like. That keeps the repository simple. If you are using LINQ to get the data from the database, this approach works very well; if you have to use ADO.NET or some other legacy data access layer, it might be harder to keep the repository so simple.
LINQ makes it easy to have the repository return an IQueryable and let the next layer add the filtering, so the actual retrieval of data does not happen until it is asked for. This makes it possible to have a method on the repository like GetImages() that gets all images, while the next layer adds the filtering for a specific image (see the sketch below). If you are using ADO.NET, you probably do not want to bring back all images and then filter... so it could be a trade-off.
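A sketch of that deferred-query approach, reusing the entities from the question (the GetImages method and the property names are assumptions):

public interface ICatalogImageRepository
{
    // Deferred: nothing runs against the database until the query is enumerated.
    IQueryable<CatalogImage> GetImages();
}

// A layer above composes the filter, replacing the Find{NestedType}ByID explosion:
var image = imageRepository.GetImages()
    .SingleOrDefault(i => i.CatalogItemID == catalogItemID && i.ID == imgID);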
