I have a general difference of opinion on an architectural design, and even though Stack Overflow should not be used to ask for opinions, I would like to ask for the pros and cons of the two approaches described below.
Details:
- C# application
- SQL Server database
- Using Entity Framework
- We need to decide which objects we are going to use to hold our information and pass throughout the application
Scenario 1:
We pass the Entity Framework entities around throughout the application: the entity holds all the information, we hand it to the business layer, and eventually our WebApi takes this entity and returns it. No DTOs or POCOs.
If the database schema changes, we update the entity and modify every class where it is used.
Scenario 2:
We create an intermediate class - call it a DTO or a POCO - to hold all the information the application requires. There is an extra step of copying the information from the entity into the POCO, but all EF code stays within the data access layer instead of spreading across all layers.
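To make Scenario 2 concrete, here is a minimal sketch (class names such as AppDbContext, Order, and OrderDto are illustrative, not from the original question): the repository copies the entity into a plain DTO, so no EF type ever leaves the data access layer.
public class OrderDto
{
    public int Id { get; set; }
    public decimal Total { get; set; }
}

public class OrderRepository
{
    private readonly AppDbContext _context; // the EF context never leaves this layer

    public OrderRepository(AppDbContext context)
    {
        _context = context;
    }

    public OrderDto GetOrder(int id)
    {
        // Copy only what the upper layers need; callers never see the EF entity.
        var entity = _context.Orders.Find(id);
        return entity == null
            ? null
            : new OrderDto { Id = entity.Id, Total = entity.Total };
    }
}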
What are the pros and cons of each one?
I would use intermediate classes, i.e. POCOs, instead of EF entities.
The only advantage I see in using EF entities directly is that there is less code to write...
Advantages of using POCOs instead:
You only expose the data your application actually needs
Basically, say you have some GetUsers business method. If you just want a list of users to populate a grid (i.e. you need their ID, name, and first name, for example), you could just write something like this:
public IEnumerable<SimpleUser> GetUsers()
{
    return this.DbContext
        .Users
        .Select(z => new SimpleUser
        {
            ID = z.ID,
            Name = z.Name,
            FirstName = z.FirstName
        })
        .ToList();
}
It is crystal clear what your method actually returns.
Now imagine instead, it returned a full User entity with all the navigation properties and internal stuff you do not want to expose (such as the Password field)...
It really simplifies the job of whoever consumes your services
It's even more obvious for Create-like business methods. You certainly don't want to use a User entity as the parameter; it would be awfully complicated for the consumers of your service to know which properties are actually required (a focused request DTO, sketched after the property list below, removes that ambiguity)...
Imagine the following entity:
public class User
{
    public long ID { get; set; }
    public string Name { get; set; }
    public string FirstName { get; set; }
    public string Password { get; set; }
    public bool IsDeleted { get; set; }
    public bool IsActive { get; set; }
    public virtual ICollection<Profile> Profiles { get; set; }
    public virtual ICollection<UserEvent> Events { get; set; }
}
Which properties are required for you to consume the void Create(User entity); method?
ID: dunno, maybe it's generated, maybe it's not
Name/FirstName: well, those should be set
Password: is that a plain-text password, a hashed version? What is it?
IsDeleted/IsActive: should I activate the user myself? Is it done by the business method?
Profiles: hmm... how do I assign a profile to a user?
Events: what the hell is that??
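For contrast, a hedged sketch of what a focused request DTO for that operation could look like (names are illustrative): every property is obviously required, and nothing internal leaks out.
// Hypothetical request DTO for the Create operation: only what the caller must supply.
public class CreateUserRequest
{
    public string Name { get; set; }
    public string FirstName { get; set; }
    public string PlainTextPassword { get; set; } // hashed by the business layer, not the caller
}

public interface IUserService
{
    // The signature now documents itself: pass these three values, get the new ID back.
    long Create(CreateUserRequest request);
}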
It forces you to not use lazy loading
Yes, I hate this feature for multiple reasons. Some of them are:
extremely hard to use efficiently. I've seen far too many times code that produces thousands of SQL requests because the developers didn't know how to use lazy loading properly
extremely hard to manage exceptions. By allowing SQL requests to be executed at any time (i.e. when you lazy load), you delegate the role of managing database exceptions to the upper layer, i.e. the business layer or even the application. A bad habit.
Using POCO forces you to eager-load your entities, much better IMO.
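As a hedged illustration of the difference (the Users/Orders model and the UserSummary POCO are assumptions, not code from the question):
public class UserSummary
{
    public string Name { get; set; }
    public int OrderCount { get; set; }
}

// Lazy loading: one query for the users, then one extra query per user the
// moment u.Orders is touched - the classic N+1 problem.
public IList<UserSummary> GetSummariesLazily()
{
    var users = this.DbContext.Users.ToList();
    return users
        .Select(u => new UserSummary { Name = u.Name, OrderCount = u.Orders.Count })
        .ToList();
}

// Projecting straight into the POCO: typically a single round trip, and any
// database error surfaces here in the data access layer, not in the UI.
public IList<UserSummary> GetSummariesEagerly()
{
    return this.DbContext.Users
        .Select(u => new UserSummary { Name = u.Name, OrderCount = u.Orders.Count })
        .ToList();
}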
About AutoMapper
AutoMapper is a tool that automagically converts entities to POCOs and vice versa. I do not like it either. See https://stackoverflow.com/a/32459232/870604
I have a counter-question: Why not both?
Consider any arbitrary MVC application. In the model and controller layers you'll generally want to use the EF objects. If you defined them using Code First, you've essentially defined how they are used in your application first and then designed your persistence layer to accurately save the changes your application needs.
Now consider serving these objects to the view layer. The views may or may not reflect your objects, or an aggregation of your working objects. This often leads to POCOs/DTOs that capture whatever is needed in the view. Another scenario is when you want to publish objects through a web service. Many frameworks provide easy serialization of POCO classes, in which case you typically either need to 1) annotate your EF classes or 2) make DTOs.
Also be aware that any lazy loading you may have on your EF classes is lost when you use POCOs or when you close your context.
Related
I'm trying to figure out the best method for integration testing application logic against a real relational database. I'm developing my solution in C# using Entity Framework and NUnit, but this should not be a language-dependent question.
Imagine you're building an application that lets the user create Car entities and Person entities. Each Car must have a Person related to it, so basically one Person can have 0-N Car entities, and one Car can have only 1 Person entity as a FK.
The entities could look like this:
public class Person {
    public int PersonId { get; set; }
    public string Name { get; set; }
    public string Surname { get; set; }
    public List<Car> Cars { get; set; } // the navigation property
}

public class Car {
    public int CarId { get; set; }
    public string Make { get; set; }
    public string Model { get; set; }
    public int PersonId { get; set; } // the foreign key
    public Person Person { get; set; } // the navigation property
}
Suppose you have a class CarRequestHandler with a method GetList that accepts a name filter and returns a list of Car entities that match that name.
Now let's focus on the integration testing aspect of this: I want to write some tests that connect to a real SQL Server database and check whether my logic works correctly and whether I wrote the right queries using EF Core.
If I want to test the CarRequestHandler.GetList(string name) method, I first have to seed the database with some sample Car entities and then I can execute the test to see what the results of the invocation are. But in order to create a Car, I need to have a Person object already created that can be assigned to the Car entity.
Now, doing this by hand in every test method (or even in a setup fixture) can become really tedious and cumbersome to write and maintain: in a more complex database, the dependency graph of the entity handler under test could become huge, which means I may need to build an entire object graph of every dependency my entity needs before it can exist in the real SQL Server database.
Is there some kind of tip you can give me in order to avoid having a big ol' spaghetti codebase that will make me and my teammates go "let's skip testing, we don't have time for that"?
I hope I explained it well enough, let me know if I need to expand on anything.
The best way I know of to run tests against real databases is to use Docker. There is a library called Testcontainers that can greatly simplify the setup for .NET testing. The real chore is getting realistic test data. For some things you can use a faker library to generate realistic-looking test data, but for others you're going to have to manually maintain a test data set when it's too hard to generate.
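As a rough sketch of that setup (this assumes the Testcontainers.MsSql package, NUnit, and a hypothetical AppDbContext; check the library's current API before copying):
[SetUpFixture]
public class DatabaseFixture
{
    public static MsSqlContainer Container { get; private set; }

    [OneTimeSetUp]
    public async Task StartContainer()
    {
        // Spin up a throwaway SQL Server in Docker for the whole test run.
        Container = new MsSqlBuilder().Build();
        await Container.StartAsync();

        // Create the schema once; individual tests seed and clean their own data.
        using (var context = CreateContext())
        {
            await context.Database.EnsureCreatedAsync();
        }
    }

    [OneTimeTearDown]
    public Task StopContainer() => Container.DisposeAsync().AsTask();

    public static AppDbContext CreateContext()
    {
        var options = new DbContextOptionsBuilder<AppDbContext>()
            .UseSqlServer(Container.GetConnectionString())
            .Options;
        return new AppDbContext(options);
    }
}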
Testing against production databases is also an option for things that don't need to write a lot to the database, or whose writes are easy to undo or invisible to users. However, your integration tests lose portability, and everything that runs those tests now needs access to the production database.
If you have trouble adding data to your database you might want to reconsider how your database is structured, and what data the database should contain by default.
Consider looking at it from the application's perspective: it should be fairly simple for the application to store and retrieve objects from the database. So you should test your CarRequestHandler the same way it would be used by the actual application, i.e. test the interface, not the implementation. In some cases you might need to access the database directly in order to set up specific cases, but this should ideally be fairly rare.
In some cases this might involve running a fairly large part of the application if there are many dependencies between various components. This can be easier to manage if you use some kind of dependency injection framework.
Also note the idea of "default data": your application probably needs some amount of pre-existing data to work, and you will need some mechanism to add this data to any new environment. Ideally it should be possible to add this data either from code or by calling a script. You should add this default data to the database before running your tests. This allows the tests both to exercise the queries and to verify that the various components work correctly with the default data.
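One way to keep that seeding manageable (a sketch built on the Person/Car example from the question) is a small test-data builder that fills in required parent entities with defaults, so each test only spells out what it actually cares about:
public class CarBuilder
{
    private string _make = "TestMake";
    private string _model = "TestModel";
    private Person _owner;

    public CarBuilder WithMake(string make) { _make = make; return this; }
    public CarBuilder WithOwner(Person owner) { _owner = owner; return this; }

    public Car Build()
    {
        return new Car
        {
            Make = _make,
            Model = _model,
            // Create the mandatory parent if the test did not provide one.
            Person = _owner ?? new Person { Name = "Default", Surname = "Owner" }
        };
    }
}

// Usage in a test: only the value under test is spelled out.
// context.Cars.Add(new CarBuilder().WithMake("Ford").Build());
// context.SaveChanges();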
Does the application need to be tested in production, or can this be done in a simulation? Is the database distributed?
I'm struggling a little bit with the following problem. Let's say I want to manage dependencies in my project so that my domain won't depend on any external stuff - in this case, on the repository. In this example, my domain lives in project.Domain.
To do so, I declared an interface for my repository in project.Domain, which I implement in project.Infrastructure. Reading the DDD Red Book by Vernon, I noticed that he suggests the method for creating a new ID for an aggregate should be placed in the repository, like:
public class EntityRepository
{
    public EntityId NextIdentity()
    {
        // create new instance of EntityId
    }
}
Inside this EntityId object would be a GUID, but I want to explicitly model my ID, which is why I'm not using plain GUIDs. I also know I could skip this problem completely and generate the GUID on the database side, but for the sake of argument let's assume that I really want to generate it inside my application.
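For what it's worth, an explicitly modelled identifier of that kind could be as small as this (purely an illustrative sketch):
public sealed class EntityId : IEquatable<EntityId>
{
    public Guid Value { get; }

    public EntityId(Guid value)
    {
        if (value == Guid.Empty)
            throw new ArgumentException("Identity must not be empty.", nameof(value));
        Value = value;
    }

    public bool Equals(EntityId other) => other != null && Value == other.Value;
    public override bool Equals(object obj) => Equals(obj as EntityId);
    public override int GetHashCode() => Value.GetHashCode();
    public override string ToString() => Value.ToString();
}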
Right now I'm just wondering - are there any specific reasons for this method to be placed inside the repository as Vernon suggests, or could I implement identity creation inside the entity itself, like:
public class Entity
{
    public static EntityId NextIdentity()
    {
        // create new instance of EntityId
    }
}
You could place it in the repository as Vernon says, but another idea would be to pass a factory into the constructor of your base entity that creates the identifier. That way you have identifiers before you even interact with repositories, and you can provide an implementation per ID-generation strategy. A repository might require a connection to something like a web service or a database, which can be costly and may be unavailable.
There are good strategies (especially with GUIDs) that allow identifiers to be generated safely on the client side. This also makes your application fully independent of the outside world.
This also enables you to have different identifier types throughout your application if the need arises.
For example:
public abstract class Entity<TKey>
{
    public TKey Id { get; }

    protected Entity() { }

    protected Entity(IIdentityFactory<TKey> identityFactory)
    {
        if (identityFactory == null)
            throw new ArgumentNullException(nameof(identityFactory));

        Id = identityFactory.CreateIdentity();
    }
}
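The snippet above relies on an IIdentityFactory<TKey> that isn't shown; a minimal GUID-based sketch of it (my assumption of its shape, not code from the original answer) could be:
public interface IIdentityFactory<TKey>
{
    TKey CreateIdentity();
}

public sealed class GuidIdentityFactory : IIdentityFactory<Guid>
{
    public Guid CreateIdentity() => Guid.NewGuid();
}

// Usage: a concrete entity hands the factory to the base constructor.
public class Order : Entity<Guid>
{
    public Order(IIdentityFactory<Guid> identityFactory) : base(identityFactory) { }
}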
Yes, you could bypass the call to the repository and just generate the identity on the Entity. The problem, however, is that you've broken the core idea behind the repository: keeping everything related to entity storage isolated from the entity itself.
I would say keep the NextIdentity method in the repository, and still use it, even if you are only generating the GUIDs client-side. The benefit is that if at some future point you want to change how the identities are seeded, you can support that through the repository. Whereas if you go with the approach directly on the entity, you would have to refactor later to support such a change.
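In code, that can stay as small as this sketch (EntityId as in the question); the point is that a later switch to database sequences, HiLo, or an external service only touches this one method:
public class EntityRepository
{
    public EntityId NextIdentity()
    {
        // Client-side today; swap the strategy here without touching the entities.
        return new EntityId(Guid.NewGuid());
    }
}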
Also, consider scenarios where you would use different repositories, such as in testing. For example, you might want to generate two identities with the same ID and perform clash testing ("does this fail properly?"). Having the repository handle the generation gives you the opportunity to get creative in such ways, without writing completely artificial test cases that don't mimic actual production calls.
TLDR; Keep it in the repository, even if your identifier can be client-side generated.
This question already has answers here:
Should Entities in Domain Driven Design and Entity Framework be the same?
I have a three tier app with a class library as the Infrastructure Layer, which contains an Entity Framework data model (database first).
Entity Framework generates the entity classes from the Model.tt T4 template. These classes are populated with data from the database.
In the past I would map the classes created by Entity Framework (in the data project) to classes in the Domain project e.g. Infrastructure.dbApplication was mapped to Domain.Application.
My reading tells me that I should be using the classes generated from the .tt template as the domain classes, i.e. add domain methods to the classes generated by Entity Framework. However, this would mean that the domain classes live in the Infrastructure project, wouldn't it? Is it possible to relocate the classes generated by Entity Framework to the Domain project? Am I missing something fundamental here?
I think in the true sense it is a data model - not a domain model. Although people talk about using the Entity Framework model as a domain model, I don't see how you can easily retrofit value objects such as, say, an amount, which in the true domain sense would be represented like this:
public class CustomerTransaction
{
    public int Id { get; set; }
    public string TransactionNumber { get; set; }
    public Amount Amount { get; set; }
}

public class Amount
{
    public decimal Value { get; }
    public Currency Currency { get; }
}
As opposed to the less correct, flat data-model approach:
public class CustomerTransaction
{
    public int Id { get; set; }
    public string TransactionNumber { get; set; }
    public int CurrencyType { get; set; }
    public decimal Amount { get; set; }
}
Yes, the example is anaemic, but I'm only interested in the properties for clarity's sake - not behaviour. For starters, you will need to change the visibility of properties and decide whether you need a default constructor on the "business/data object".
So in the domain sense, Amount is a value object on a CustomerTransaction - which I am assuming is an entity in the example.
So how would this translate to database mappings via Entity Framework? There might be a way to hold the above in a single CustomerTransaction table as the flat structure in the data model, but my approach would be to add a repository around it and map between the domain objects and the data structures.
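Sketching that mapping (the flat data-model type is assumed to be called CustomerTransactionRecord, Currency is assumed to be an enum, and Amount is assumed to have a two-argument constructor):
public class CustomerTransactionRepository
{
    // Flat row from the data model -> domain object with its Amount value object.
    public CustomerTransaction ToDomain(CustomerTransactionRecord row)
    {
        return new CustomerTransaction
        {
            Id = row.Id,
            TransactionNumber = row.TransactionNumber,
            Amount = new Amount(row.Amount, (Currency)row.CurrencyType)
        };
    }

    // Domain object -> flat row, ready to be persisted by EF.
    public CustomerTransactionRecord ToRecord(CustomerTransaction transaction)
    {
        return new CustomerTransactionRecord
        {
            Id = transaction.Id,
            TransactionNumber = transaction.TransactionNumber,
            Amount = transaction.Amount.Value,
            CurrencyType = (int)transaction.Amount.Currency
        };
    }
}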
Udi Dahan has some good information on DDD and ORMs in the true sense. I thought he talked somewhere about keeping the data model instance as a private field in the domain object, but I might be wrong.
Also, that data model suffers from Primitive Obsession (I think Fowler coined the term in his Refactoring book - it is in there at any rate). Jimmy Bogard talks about that here.
Check out Udi Dahan stuff.
You should move your model to a different project; that is good practice. I don't quite get what you meant by "moving it to the Domain project". Normally the Entity Framework generated classes are used as the domain model; there is no need to create a "different" domain model from them. This model should be used only close to database operations, whereas the web (or desktop) application should use only DTOs (data transfer objects).
I don't know whether you use it or not, but this is a nice tool for recreating the model from the database:
https://marketplace.visualstudio.com/items?itemName=SimonHughes.EntityFrameworkReversePOCOGenerator
It lets you store the model in classes instead of an EDMX. Some people refer to this as "code first", but that is a misunderstanding: you can use this tool to create the model and still be "database first". It is done simply to avoid using the EDMX as the model definition.
You can relocate the entity classes by adding a new item in your Domain project: an EF 6.x DbContext Generator (I'm not sure of the exact name, and you might have to install an extension to get this item in the list; it also exists for EF 5.x).
Once you have created this new item, you have to edit it to set the path of your EDMX at the very beginning of the file. In my project, for example, it is:
const string inputFile = @"..\..\DAL.Impl\GlobalSales\Mapping\GlobalSalesContext.edmx";
You will also need to edit the DbContext.tt file to add the right using directives on top of the generated classes. Each time you change the EDMX, you will also have to right-click the generator and choose "Run Custom Tool" to regenerate the classes.
That being said, is it good practice? As you can see, that's what I have done in my project. As long as you do not have EF-specific annotations or the like in the generated entity classes, I would say it is acceptable.
If you need to change your ORM, you can just keep the generated classes and remove all the EF stuff (.tt files, etc.) and the rest of your application will work the same. But that's opinion-based.
Most of what I've read (e.g. from the author) indicates that AutoMapper should be used to map an entity to a DTO. It should not load anything from the database.
But what if I have this:
public class Customer {
    public int Id { get; set; }
    public string Name { get; set; }
    public virtual ICollection<Order> Orders { get; set; }
}

public class CustomerDto {
    public int Id { get; set; }
    public string Name { get; set; }
    public IEnumerable<int> OrderIds { get; set; } // here is the problem
}
I need to map from DTO to entity (i.e. from CustomerDto to Customer), but first I must use that list of foreign keys to load corresponding entities from the database. AutoMapper can do that with a custom converter.
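For reference, a hedged sketch of what such a custom converter could look like (IOrderRepository is my assumption; whether this is a good idea is exactly what the question is about):
public class CustomerDtoToCustomerConverter : ITypeConverter<CustomerDto, Customer>
{
    private readonly IOrderRepository _orderRepository;

    public CustomerDtoToCustomerConverter(IOrderRepository orderRepository)
    {
        _orderRepository = orderRepository;
    }

    public Customer Convert(CustomerDto source, Customer destination, ResolutionContext context)
    {
        return new Customer
        {
            Id = source.Id,
            Name = source.Name,
            // The mapping itself hits the database here - the very thing that feels wrong.
            Orders = source.OrderIds.Select(id => _orderRepository.Get(id)).ToList()
        };
    }
}

// Registered with something like:
// cfg.CreateMap<CustomerDto, Customer>().ConvertUsing<CustomerDtoToCustomerConverter>();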
I agree that it doesn't feel right... but what are the alternatives? Sticking that logic into a controller, a service, a repository, some manager class? All of that just pushes the logic somewhere else in the same tier. And if I do that, I must also perform the mapping manually!
From a DDD perspective, the DTO should not be part of the domain. So AutoMapper is also not part of the domain, because it knows about that DTO. So AutoMapper is in the same tier as the controllers, services, etc.
So does it make sense to put the DTO-to-entity logic (which includes accessing the database, and possibly throwing exceptions) into an AutoMapper mapping?
EDIT
@ChrisSimon's great answer below explains from a DDD perspective why I shouldn't do this. From a non-DDD perspective, is there a compelling reason not to use AutoMapper to load from the database?
To start with, I'm going to summarise my understanding of Entities in DDD:
Entities can be created - often using a factory. This is the start of their life-cycle.
Entities can be mutated - have their state modified - by calling methods on the entity. This is how they progress through their lifecycle. By ensuring that the entity owns its own state, and can only have its state modified by calling its methods, the logic that controls the entity's state is all within the entity class, leading to cleaner separation of business logic and more maintainable systems.
Using AutoMapper to convert from a DTO to the entity means the entity is giving up ownership of its state. If the DTO is in an invalid state and you map it directly onto the entity, the entity may end up in an invalid state - you have lost the value of making entities contain data + logic, which is the foundation of the DDD entity.
To make a suggestion as to how you should approach this, I'd ask - what is the operation you are trying to achieve? DDD encourages us not to think about CRUD operations, but to think about real business processes, and to model them on our entities. In this case it looks like you are linking Orders to the Customer entity.
In an Application Service I would have a method like:
void LinkOrdersToCustomer(CustomerDto dto)
{
    using (var dbTxn = _txnFactory.NewTransaction())
    {
        var customer = _customerRepository.Get(dto.Id);
        foreach (var orderId in dto.OrderIds)
        {
            var order = _orderRepository.Get(orderId);
            customer.LinkToOrder(order);
        }
        dbTxn.Save();
    }
}
Within the LinkToOrder method, I would have explicit logic (sketched after this list) that does things like:
Check that order is not null
Check that the customer's state permits adding the order (are they currently active? is their account closed? etc.)
Check that the order actually does belong to the customer (what would happen if the order referenced by orderId belonged to another customer?)
Ask the order (via a method on the order entity) if it is in a valid state to be added to a customer.
Only then would I add it to the Customer's Orders collection.
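A sketch of how those checks could look inside the entity (BelongsTo, CanBeLinked, and IsActive are assumed domain members, not part of the original question):
public class Customer
{
    private readonly List<Order> _orders = new List<Order>();

    public Guid Id { get; private set; }
    public bool IsActive { get; private set; }

    public void LinkToOrder(Order order)
    {
        if (order == null)
            throw new ArgumentNullException(nameof(order));
        if (!IsActive)
            throw new InvalidOperationException("Orders cannot be linked to an inactive customer.");
        if (!order.BelongsTo(Id))
            throw new InvalidOperationException("The order belongs to another customer.");
        if (!order.CanBeLinked())
            throw new InvalidOperationException("The order is not in a state that allows linking.");

        _orders.Add(order);
    }
}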
This way, the application 'flow' and infrastructure management is contained within the application/services layer, but the true business logic is contained within the domain layer - within your entities.
If the above requirements are not relevant to your application, you may have others. If not, then perhaps it is not necessary to go the DDD route - while DDD has a lot to offer, its overhead is generally only worth it in systems with lots of complex business logic.
This isn't related to the question you asked, but I'd also suggest you take a look at the modelling of Customer and Order. Are they both independent aggregates? If so, modelling Customer as containing a collection of Order may lead to problems down the road - what happens when a customer has a million orders? Even if the collection is lazy loaded, you know at some point something will attempt to load it, and there goes your performance. There's some great reading about aggregate design here: http://dddcommunity.org/library/vernon_2011/ which recommends modelling references by ID rather than by object reference. In your case, you could have a collection of OrderIds, or possibly even a completely new entity to represent the link - CustomerOrderLink - which would have two properties: CustomerId and OrderId. Then none of your entities would have embedded collections.
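As a minimal sketch of the two reference-by-ID options mentioned above (illustrative only):
// Option 1: the aggregate holds only the IDs of the orders it references.
public class Customer
{
    public Guid Id { get; private set; }
    public ICollection<Guid> OrderIds { get; private set; }
}

// Option 2: a dedicated link entity, so neither aggregate embeds a collection.
public class CustomerOrderLink
{
    public Guid CustomerId { get; set; }
    public Guid OrderId { get; set; }
}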
I'm trying to figure out the best approach to architecting this project. Basically, it's a "band" profile site. I'm using ASP.NET 4, EF, and Automapper (structuremap too, but that's not important). I'm running into performance issues and need advice on whether my approach is right or not (my guess is not). I'll focus on specific sections and provide stripped down examples.
I have an Entity Framework repository class that works directly with the EF objects using LINQ:
[Pluggable("Repository")]
public class EntityDataRepository : IRepository
{
    static EntityDataRepository()
    {
        // other mappings removed
        // Data. objects are EF objects, mapping to my DTO classes
        Mapper.CreateMap<Data.Event, Models.EventModel>();
        Mapper.CreateMap<Data.Genre, Models.GenreModel>();
        Mapper.CreateMap<Data.Band, Models.BandModel>();
    }

    public IEnumerable<BandModel> GetBandsByUser(Guid userId)
    {
        using (var ctx = new DbContext())
        {
            var user = GetCurrentUserModel(ctx, userId);
            var efBands = from r in user.BandRelations
                          orderby r.Date
                          select r.Band;

            return Mapper.Map<IEnumerable<Data.Band>, IEnumerable<Models.BandModel>>(efBands);
        }
    }
}
Bands have genres and events. Note that the repository maps the EF objects to my DTO classes and returns a list of them. It acts as a proxy that my controllers call to obtain the data they need (actual logic altered to show what I need):
namespace OpenGrooves.Web.Areas.Edit.Controllers
{
    [Authorize]
    public class MyBandsController : BaseController
    {
        public ActionResult ShowBands()
        {
            IEnumerable<BandModel> bands = repository.GetBandsByUser(loggedUserGuid);
            return View(bands);
        }
    }
}
Finally, here's the BandModel class, which mirrors the Band entity in EF:
public class BandModel
{
    // fluff and scalar properties removed
    public IEnumerable<EventModel> Events { get; set; }
    public IEnumerable<GenreModel> Genres { get; set; }
}
Basically, am I doing this right? The Band EF entity has navigation properties, such as Genres and Events. The problem is that during the AutoMapper mapping these list properties get populated, especially when one of my proxy methods returns a list of BandModels. It seems to invoke the Genres and Events EF queries for each record, which is obviously a major performance killer (at least two queries, for Events and Genres, are run for each BandModel object returned).
Is it OK practice to use EF objects directly in my controllers, possibly even used as models for views?
Is there something I need to change in my mappings to enable lazy loading for these navigational properties (events, genres off a BandModel object)?
Thanks!!
Is it OK practice to use EF objects directly in my controllers, possibly even used as models for views?
Yes, Kinda.
This answer is subjective and depends on how you view your separation of concerns. Most MVC developers, including me, swear by view models. They decouple your data or domain classes from the presentation layer. This is awesome.
Some people don't like being awesome, including other languages and frameworks like every PHP MVC framework, Rails, and Django. Nobody can say those languages "do it wrong", but we .NET devs subscribe to a different paradigm.
Your second question is strange: you ask "is there something to enable lazy loading" right after saying that lazy loading is happening. Care to explain?
Lazy loading is on by default in EF4.
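Since lazy loading is on, the per-record queries described in the question most likely come from AutoMapper touching Genres and Events on each band during mapping. One hedged sketch of a fix (the Bands set and the BandRelations shape are assumptions based on the question) is to eager-load those collections in the repository query itself, so the mapping never triggers extra queries:
public IEnumerable<BandModel> GetBandsByUser(Guid userId)
{
    using (var ctx = new DbContext())
    {
        var efBands = ctx.Bands
            .Include("Genres")
            .Include("Events")
            .Where(b => b.BandRelations.Any(r => r.UserId == userId))
            .ToList(); // materialise while the context is still open

        return Mapper.Map<IEnumerable<Data.Band>, IEnumerable<Models.BandModel>>(efBands);
    }
}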