Data Layer Architecture with Multiple Data Sources - C#

I am trying to create a system that allows you to switch between multiple data sources, e.g. from Entity Framework to Dapper, and I am trying to find the best approach to do this.
At the moment I have different projects for different data layers, e.g. Data.EF for Entity Framework and Data.Dapper for Dapper. I have used a database-first approach, but the models it generates are coupled together and not easy to refactor, e.g. for separation of models.
I have a project called Models, which holds domain and view models, and I was thinking of creating a Data.Core project and following the repository pattern. Doing this, however, adds an extra layer, so I would have Presentation / Business / Repository / Data.
I would like to know the best structure for this approach. Should I also use a code-first approach to create my database? This would help separate concerns and improve abstraction. This is quite a big application, so getting the structure right is essential.

I'd suggest factoring your data interfaces out, either into the model (as repository interfaces for your entities) or into an infrastructure project. (I think the latter was your rationale behind creating a Data.Core project.)
Each data source then implements the very same set of interfaces, and you can easily switch between them, even dynamically, using dependency injection.
For instance, using repositories:
Model
  Entities
    Entity
  Repositories
    IEntityRepository
Data.EF
  EntityRepository : Model.IEntityRepository
Data.Dapper
  EntityRepository : Model.IEntityRepository
Then in your business layer you won't even need to reference Data.EF or Data.Dapper: you can work with IEntityRepository and have the implementation injected dynamically, as in the sketch below.
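To make that concrete, here is a minimal sketch using the names from the structure above (the AppDbContext and the method shapes are my own assumptions, and the container shown is Microsoft.Extensions.DependencyInjection):

// Model project: the contract the business layer depends on.
public interface IEntityRepository
{
    Entity GetById(int id);
    void Add(Entity entity);
}

// Data.EF project: one implementation among several.
public class EntityRepository : IEntityRepository
{
    private readonly AppDbContext context;

    public EntityRepository(AppDbContext context)
    {
        this.context = context;
    }

    public Entity GetById(int id) => this.context.Entities.Find(id);

    public void Add(Entity entity) => this.context.Entities.Add(entity);
}

// Composition root: switching from EF to Dapper is a one-line change.
services.AddScoped<IEntityRepository, Data.EF.EntityRepository>();
// services.AddScoped<IEntityRepository, Data.Dapper.EntityRepository>();

The business layer only ever sees IEntityRepository; which assembly actually provides the implementation is decided once, at the composition root.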

I think your approach is correct. I'd say Presentation / Business / Repository / Data is pretty standard these days.
I'd also say the code-first approach using POCOs is the preferred option in the industry today. I would advise starting with a project containing your POCO data structures, along with any logic that belongs in them, and taking it from there. The advantage of this is that your objects model the domain more naturally. If you start with a DB-centric approach, the problem is that, if you are not careful, you may end up with objects more akin to SQL relational tables than to the real model. This was painfully evident in the first versions of .NET, where it was encouraged to use DataSets tightly coupled with the DB, and that often caused problems when working with them in the business layer.
If needed, you can do any complex mapping between the business objects and the DB objects in the repository layer, using a proxy and/or a unit of work.
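To make the code-first starting point concrete, a minimal sketch might look like this (the Product class and its properties are invented for the example):

// A plain POCO: no EF base class, no persistence attributes.
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
}

// The persistence wiring lives in the data layer, not on the object itself.
public class ShopContext : DbContext
{
    public DbSet<Product> Products { get; set; }
}

Because the POCO carries no persistence baggage, the same class can travel through the business layer and be unit tested without a database.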

I would suggest you create your domain objects, use the code-first approach, and also apply the repository pattern.

Yes, the repository pattern does bring in an extra layer. Have a look at this post for more detailed information: Difference between Repository and Service Layer?
RE: code-first approach to create my database
It doesn't matter how big your application is; it is a question of what else you intend to use the database for. If this database is simply a repository for this application, then using code-first is fine, as you are simply storing your code objects. However, if you are using this database as an integration point between applications, then you may wish to design the database separately from the application models.

Related

Separation of models in a Web API application

My team develops a Web API application using Entity Framework.
The GUI is developed by a separate team.
My question is: how should the models be defined? Should we have two projects - one for domain models (database entities) and one for DTOs, which are serializable?
Where should the mapping from DTOs to domain models happen, and when should it happen the opposite way?
Moreover, sometimes all the data needs to be sent to the clients. Should a DTO be created for those cases as well, or should I return a domain model?
Generally speaking, it's a good idea not to let your entities (database models) leak out of your database layer. However, as with everything in software, this can have its downfalls. One such downfall is that it increases the complexity of your data layer: mapping your entities to their DTOs within the database layer ultimately leaves you with repositories full of similar methods returning different DTO types.
Some people also feel that exposing IQueryables from your data layer is a bad thing, as you start to leak abstractions into other layers - though this has always seemed a little extreme to me.
Personally, I favour what I feel is a more pragmatic approach: I prefer to use a tool like AutoMapper to automatically map my entities to my DTOs within the business logic layer.
For example:
// Initial configuration loaded on start up of application and cached by AutoMapper
AutoMapper.Mapper.CreateMap<BlogPostEntity, BlogPostDto>();
// Usage
BlogPostDto blogPostDto = AutoMapper.Mapper.Map<BlogPostDto>(blogPostEntity);
AutoMapper also has the ability to configure more complex mapping, though you should try and avoid this if possible by sticking to flatter DTOs.
In addition, another great feature of AutoMapper is the ability to automatically project your entities to DTOs. This results in much cleaner SQL where only the columns within your DTO are queried:
public IEnumerable<BlogPostDto> GetRecentPosts()
{
    // Project().To<T>() pushes the mapping into the query itself,
    // so only the columns needed by BlogPostDto are selected.
    IEnumerable<BlogPostDto> blogPosts = this.blogRepository.FindAll()
        .Project(this.mappingEngine)
        .To<BlogPostDto>()
        .ToList();

    return blogPosts;
}
Moreover, sometimes all the data needs to be sent to the clients. Should a DTO be created for those cases as well, or should I return a domain model?
DTOs should be created for those. Ultimately you don't want your client depending on your data schema, which is exactly what will happen if you expose your entities.
Alternatives: Command/Query Segregation
It behooves me to also highlight that there are alternatives to a typical layered architecture, such as the Command/Query Segregation approach, where you model your commands and queries via a mediator. I won't go into it in too much detail as it's a whole other subject, but it's one I would definitely favour over the layered approach discussed above. This would result in you mapping your entities to your DTOs directly within the modelled command or query.
I would recommend taking a look at MediatR for this. Its author, Jimmy Bogard, who also created AutoMapper, has a video talking about the same subject.
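To give a flavour of the shape this takes, here is a rough MediatR-style sketch (the BlogContext name, the query shape, and the use of AutoMapper's IMapper are my own assumptions for the example):

// The query is a plain message describing what the caller wants.
public class GetRecentPostsQuery : IRequest<List<BlogPostDto>>
{
}

// The handler owns both the data access and the entity-to-DTO mapping,
// so the DTO boundary is enforced in one place.
public class GetRecentPostsHandler : IRequestHandler<GetRecentPostsQuery, List<BlogPostDto>>
{
    private readonly BlogContext context;
    private readonly IMapper mapper;

    public GetRecentPostsHandler(BlogContext context, IMapper mapper)
    {
        this.context = context;
        this.mapper = mapper;
    }

    public async Task<List<BlogPostDto>> Handle(GetRecentPostsQuery request, CancellationToken cancellationToken)
    {
        var entities = await this.context.BlogPosts
            .OrderByDescending(p => p.PublishedOn)
            .Take(10)
            .ToListAsync(cancellationToken);

        return this.mapper.Map<List<BlogPostDto>>(entities);
    }
}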
I've had similar requirements in several projects and in most cases we separated at least three layers:
Database Layer
The database objects are simple one-to-one representations of the database tables. Nothing else.
Domain Layer
The domain layer defines entity objects which represent a complete business object. In our definition, an entity aggregates all data which is directly associated with it and which cannot be regarded as a dedicated entity in its own right.
An example: in an application which handles invoices you have the tables invoice and invoice_items. The business logic reads both tables and combines the data into an entity object Invoice, as sketched below.
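A minimal sketch of that aggregation (the class and column names are assumptions for the example):

// Database layer: one-to-one representations of invoice and invoice_items.
public class InvoiceRow
{
    public int Id { get; set; }
    public DateTime IssuedOn { get; set; }
}

public class InvoiceItemRow
{
    public int Id { get; set; }
    public int InvoiceId { get; set; }
    public decimal Amount { get; set; }
}

// Domain layer: a single entity combining both tables into one business object.
public class InvoiceItem
{
    public decimal Amount { get; set; }
}

public class Invoice
{
    public int Id { get; set; }
    public DateTime IssuedOn { get; set; }
    public List<InvoiceItem> Items { get; set; } = new List<InvoiceItem>();
    public decimal Total => this.Items.Sum(i => i.Amount);
}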
Application Layer
In the application layer we define models for all the kinds of data we want to send to the client. Passing domain entity objects through to save time is tempting but strictly prohibited: the risk of publishing data which shouldn't be published is too high. Furthermore, you gain more freedom in the design of your API. That's what helps you meet your last requirement (sending all the data to the client): just build a new model which aggregates the data of all the domain objects you need to send.
This is the minimum set of layers we use in all projects. There have been hundreds of cases where we were very happy to have several abstraction layers, which gave us enough flexibility to enhance and scale an application.

Repository Pattern and Mapping between Business Objects and Data Access Objects

I use Entity Framework as the ORM in my .NET MVC project. I've implemented the repository pattern (generic) to get/save/update/remove DAOs (Data Access Objects). I also have business objects which contain all the business logic. I have, for example, a DAO called Student and a BO (Business Object) called Student as well. The BO contains the logic, the DAO just the data stored in the DB.
Now I am wondering if the Student-Repository should return the Business-Object instead of the DAO?
I could achieve that using AutoMapper by converting the DAO to a business object before returning it from Repository.Get() - and the same with all the other methods. But is this good practice?
Update
I have a Data Access Layer project and a project for the business logic. Entity Framework creates its entities as partial classes (in the Data Access project), so I could actually extend the entities with other partial classes. The problem is that I reference the Data Access project from my Business project, so the Data Access project has no access to the logic code in the Business project. I therefore have to put the logic inside the Business project, but as it is not possible to spread partial classes over two projects, I have to go another way... or do you have a good idea of how to structure and solve the problem in a better way?
IMHO there are several goals (some competing):
Make business logic testable in isolation
Design domain objects to match your domain
Decouple data access from everything else
Keep it simple
Can you test your business logic without a database? Probably yes, whether the classes are EF POCO entities or mapped from DAOs.
Do your domain objects match your domain? Are their names well-chosen? Are they always in a valid state? (This can be difficult with a bunch of public read/write properties.) Domain-driven design considerations apply here. (I'm no expert in that.)
Could you swap out EF for Dapper, SQL Server for MongoDB, or current data access for a web service call without changing anything outside the data access layer - with confidence? My suspicion is no. Generic repositories tend to leak IQueryable into other layers. Not everything supports querying, and provider implementations vary. Unit tests typically use LINQ to Objects, which does not behave the same as LINQ to Entities. Also, if you want to extract a web service contract, you would have to look through all classes to find all the queries. See IQueryable is Tight Coupling.
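To make the leak concrete, compare these two hypothetical repository shapes (neither is from the linked article):

// Leaky: every caller can compose arbitrary queries, so every layer
// silently depends on what the underlying LINQ provider supports.
public interface ILeakyRepository<T>
{
    IQueryable<T> Query();
}

// Contained: the query is an implementation detail behind an
// intention-revealing method, so swapping providers stays local.
public interface IStudentRepository
{
    IReadOnlyList<Student> GetTopStudents(int count);
}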
Finally, do you need all of this? If your application's purpose is CRUD data access with no business logic above simple validation, maybe not. These considerations definitely apply to a complex application or site.
Yes, that's totally good practice. Usually you have repository interfaces defined in the domain assembly. These interfaces are used by domain services and implemented in the persistence assembly. Entity Framework allows you to map business entities fluently, without polluting them with attributes or forcing them to inherit from some specific base class (POCO entities). That makes your domain model persistence-ignorant.
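For illustration, such a fluent mapping might look like this (a sketch using EF6's EntityTypeConfiguration; the names are invented):

// Domain assembly: a plain POCO, with no EF attributes or base class.
public class Student
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Persistence assembly: the mapping is declared fluently, away from the entity.
public class StudentMap : EntityTypeConfiguration<Student>
{
    public StudentMap()
    {
        ToTable("Students");
        HasKey(s => s.Id);
        Property(s => s.Name).HasMaxLength(100).IsRequired();
    }
}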

Using EF and Passing Data Through Application Layers

All,
We are using EF as our primary data access technology. Like many apps out there, we have a business objects/domain layer. This layer talks to our repository, which, in turn, talks to EF.
My question is: What is the best mechanism for passing the data back and forth to/from EF? Should we use the EF-generated entity classes (we did DB-first development, so we have entity classes that EF generated), create our own DTOs, use JSON or something else?
Of course, I could make an argument for each of these, as well as a counter-argument against them. I'm looking for opinions based on experience building a non-trivial application using a layered architecture and EF.
Thanks,
John
I would use POCOs and use them with EF. You can still do that with the DB first approach.
The main benefit is that your business objects will not be tied to any data access technology.
Your underlying storage mechanism can, and will, change but your POCOs remain. All that business logic is easily re-used and tested.
As you're looking for cons, I would say it might take longer. However, that cost is well worth it.
With T4 templates, I put the actual EF-generated entities in a common project that is referenced by all other projects. I use the EF database-first models through the entire application (including as view models). If I need to add properties to an entity that are not in the database, I just extend the entity's partial class in the common project. I have written dozens of large n-tier applications using this model and it has worked great.
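For instance (a sketch; the entity and the added property are invented):

// Generated by the T4 template into the common project (do not edit).
public partial class Student
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Hand-written partial class in the same project, adding a property
// that does not exist in the database.
public partial class Student
{
    public string FullName => this.FirstName + " " + this.LastName;
}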

Is POCO the right choice when working with entity framework?

I've been reading about POCOs (Plain Old CLR Objects) for a while but still can't find the real added value of using them instead of the auto-generated partial classes of Entity Framework.
One more thing: is it best to use Entity Framework directly from the presentation layer, or is creating a BLL better?
The main benefit of a POCO is that you can pass it to a class, library or assembly that doesn't need to know anything about Entity Framework to do its job.
Remember the Single Responsibility Principle - your DAL should know about EF, but your Domain should not, and neither should your presentation layer, because EF deals with data access and those layers do not. Passing EF generated class objects up to those layers generally means you need to make those layers aware of EF, breaking the SRP and making it harder to unit test those layers in isolation.
In response to Ali's further query in the comments, here is an expanded explanation.
In more complex applications, you will want to split the logic up into separate concerns - data access, business logic, presentation (and perhaps a lot more).
As Entity Framework deals with data access, it resides in the data access layer - here, you will see little difference between POCOs and EF-generated classes. This is fine, because this layer already knows about Entity Framework.
However, you may want to pass data up to the business logic layer - if you do this with EF-generated classes, then your business logic layer must also know about Entity Framework, because the EF classes rely on a lot of things EF provides. This removes the isolation that your business logic layer should have - isolation you need so you can unit test it correctly by injecting known data from a fake data access layer, which is incredibly hard to do if the business logic layer knows about EF.
With unit testing, you should be testing the layer's functionality, not the functionality of third-party libraries - but with EF you end up testing a lot of EF's functionality, or your own functionality which relies very heavily on EF's. This isn't good, and it can mask errors or issues.
Removing the business logic's dependency on EF-generated classes also allows you to move the layer to as remote a location as you like from the data access layer - you can even stick it behind a web service and it would be completely happy. But you can only do this with POCOs; you cannot do it with EF-generated classes.
POCOs really come into their own in large, complex, multi-layered applications - if you aren't layering your app, then you won't see a whole load of benefits, IMHO.
All of this is my opinion, and I'm just a coder with experience - I'm not a coding rockstar, so some other commenters may like to further expand my answers...
The real benefit of POCOs is that you can use code-first and EF Migrations. If you are not going to use code-first, you can use the designer-generated classes.
If you have a large application you should create a separate BLL, but if your application is very small you can probably go directly with the EF classes in the presentation layer.
Using POCO classes in an ORM allows you to create tests for that code in an easier manner. It also allows you to have a layer of abstraction between your model objects (the POCO classes) and the data access code so if you need to you can swap the data access code (EF for NHibernate, for instance).
I've worked with the POCO model in the past and I can tell you that it's useful for big enterprise projects and large teams of developers where changes to the model happen often and where the monolithic file model used by default by EF does not scale well. The benefits on small projects or in rapid application development are hard to see.
TLDR version: If you're asking yourself what the benefits of POCO and code first are, you probably won't gain anything from using them.

Entity Framework and 3 layer architecture

I have a three-layer architecture program. The questions are:
1. Is EF the data access layer?
2. If I want to use an entity generated by EF from the presentation layer, I have to reference the Data Access project, but this violates the principles of three-layered architecture.
Microsoft Spain released pretty good documentation, a guide, and a sample application for N-layered applications on CodePlex; you can look it up here:
http://microsoftnlayerapp.codeplex.com/
You will find many directions and helpful implementation patterns there.
hth.
Yes, EF would be your data access layer.
With EF you can use T4 templates with POCO support; you can then extract these POCOs into a separate DLL which can be referenced from all of your layers.
What type of application are you building? If you are building an ASP.NET MVC 3 application, your View can be the presentation layer, your Model your data access (which can use EF), and the controllers and/or action filters can contain your business logic. In this scenario you would be using your EF model in the presentation layer but still satisfying the separation of concerns principle.
EF does two things:
1) Generates a domain model for you (optional, but commonly used)
2) Gives you the ability to query/modify your database via that domain model
This can give the appearance of blurring the lines between the domain model and data access, but the two are indeed separate.
As long as you're not doing things like creating object contexts and writing queries directly in your presentation tier, then IMHO you are not breaking the abstraction. The only thing you are "breaking" is the fact that you will need to reference System.Data.Objects (or whatever the EF DLL is) in your presentation project(s) (which is just a physical artifact), unless you go down the route suggested by Jethro of generating your domain model into a separate project.
For a three-tier architecture, I would consider doing the abstraction using the domain model and data model patterns rather than using EF directly from the presentation layer.
The idea is that your data model holds the EF POCO classes, along with repositories which know how to access those classes for the various CRUD operations.
Your domain model holds the models related to your client (so you can put the various view models or validation-related code there); the client can be a WPF or MVC web app.
Between these two sits the business layer, which talks to both the domain and data models.
Your presentation layer then knows nothing about EF, the data layer, or the repositories. When you want to introduce a new data framework or database, you just need to write new repository classes and data model classes (probably with some sort of code generation).
This also keeps your code unit-testable.
