I'm about to start a new pet project and I've been wondering how I should go about validation when adding an entity to a parent's one-to-many collection. I'll use two example classes to summarize what I'm on about: a Student and a Teacher. The constraint here is that at any given time a Student can only be taught by one (and only one) Teacher, who in turn could be teaching one or more Students.
public class Student
{
public bool IsEnrolled { get; set; }
public virtual Teacher IsCurrentlyBeingTaughtBy { get; set; }
}
public class Teacher
{
public virtual ICollection<Student> IsCurrentlyTeaching { get; set; }
}
When students arrive at a class I need to assign them to the Teacher's IsCurrentlyTeaching collection, but I first need to make sure they're enrolled. My question is: where is the best place to validate this basic rule? The options going around my head at the moment are:
1. Use a repository pattern
As I'm going to be writing unit tests, I'm leaning in favor of this method: I can wrap my data access logic up in a mockable object, and there is a single responsibility here, so I only have to validate this in my repository once. BUT - is validation the responsibility of the repository, or should a repository only deal with the CRUD of entities?
2. Validate this in the controller action
I should mention here that I propose this to be an MVC3 project. Keeping specifically to that, should I be performing this validation in the controller's action before adding the Student to the repository (and subsequently to the Teacher's list of students they're currently teaching)? BUT - am I heading down a fat-controller path that I really shouldn't be?
3. Perform this validation on the Teacher entity
Cutting out the middle-man (i.e. the repository), should I be adding the Student via a method on the Teacher POCO, such as AddStudent(Student student), and throwing a custom exception when trying to add a student who hasn't been enrolled?
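To make option 3 concrete, here's a rough sketch of the kind of thing I have in mind (StudentNotEnrolledException is a hypothetical custom exception):

using System;
using System.Collections.Generic;

public class StudentNotEnrolledException : Exception
{
    public StudentNotEnrolledException()
        : base("Cannot add a student who is not enrolled.") { }
}

public class Teacher
{
    public Teacher()
    {
        IsCurrentlyTeaching = new List<Student>();
    }

    public virtual ICollection<Student> IsCurrentlyTeaching { get; set; }

    public void AddStudent(Student student)
    {
        // The business rule lives on the entity itself.
        if (!student.IsEnrolled)
            throw new StudentNotEnrolledException();

        IsCurrentlyTeaching.Add(student);
        student.IsCurrentlyBeingTaughtBy = this; // keep both ends of the association in sync
    }
}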
There are probably more options available, but these are the three I'm trying to choose between at this present moment and I've got a little tunnel vision from thinking about this. Obviously all of the above can be suitably unit tested but thinking long-term (and accommodating growth) which path should I be heading down?
You may be able to create your own custom validator for this. That would let you piggyback on the validation MVC is already providing. I've never tried this, but I would imagine something like this would work:
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

public class EnsureEnrollment : ValidationAttribute
{
    public EnsureEnrollment() { }

    // The two-argument IsValid overload on ValidationAttribute is
    // protected, so the override must be protected as well.
    protected override ValidationResult IsValid(object value, ValidationContext validationContext)
    {
        var studentList = value as IEnumerable<Student>;
        if (studentList == null)
        {
            return ValidationResult.Success;
        }
        foreach (Student s in studentList)
        {
            if (!s.IsEnrolled)
            {
                // Insert whatever error message you want here.
                // (Assumes Student exposes a Name property.)
                return new ValidationResult("Student \"" + s.Name + "\" is not enrolled.");
            }
        }
        return ValidationResult.Success;
    }
}
Then on your property just add your annotation:
[EnsureEnrollment()]
public virtual ICollection<Student> IsCurrentlyTeaching { get; set; }
Personally, I like having my validation as part of static CRUDL methods on my entities. True, you have to pass the context in to every one of them, but it keeps the controllers a lot cleaner and makes all of that functionality readily available for any other projects that may use your entities in the future.
Previously I created a base class that all of my entities derived from, which had a mandatory override for Validate. The Validate method was called by almost all of the CRUDL methods and other working methods to ensure that the entity was proper before acting on it. Most of these validation rules were a bit more complex than could easily be expressed using the DataAnnotations attributes.
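As a rough sketch of that idea (the names here are illustrative, not from any particular framework):

using System;
using System.Collections.Generic;
using System.Linq;

public abstract class EntityBase
{
    // Each entity describes its own consistency rules.
    protected abstract IEnumerable<string> GetValidationErrors();

    // Called by the CRUDL methods before acting on the entity.
    public void Validate()
    {
        var errors = GetValidationErrors().ToList();
        if (errors.Any())
            throw new InvalidOperationException(string.Join("; ", errors));
    }
}

public class Student : EntityBase
{
    public bool IsEnrolled { get; set; }

    protected override IEnumerable<string> GetValidationErrors()
    {
        if (!IsEnrolled)
            yield return "Student must be enrolled.";
    }
}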
Or you can integrate specific validation points into a method with a more specific purpose. Take for instance:
public static bool AddToTeacher(SchoolContext db, Student student, Teacher teacher)
{
    if (student.IsEnrolled)
    {
        // IsCurrentlyTeaching is a collection, so Add the student to it.
        teacher.IsCurrentlyTeaching.Add(student);
        return db.SaveChanges() > 0;
    }
    return false;
}
The AddToTeacher method only ensures that one specific requirement is met. If I wanted to ensure that the student was properly formed, was on an eligible course track, and so on, I would likely write a short method (or several, all called by a "container" method) to validate those particular points.
In short, I do my best to keep every bit of entity specific code on the entity so that the controller is mostly ignorant of how the entities work.
As for which entity to put it on, it depends on how you think. Student.AddToTeacher is just as viable in my opinion as Teacher.AddStudent. I personally would use the former, just because that is what most of my entities currently look like, with "child" entities adding themselves to "parents" rather than the other way around.
I'm writing validation for a class (e.g. Car) which requires a number of similar/identical database calls.
RuleFor(c => c.Id).MustAsync(async (car, id, context, cancellation) =>
{
    return await _carRepository.Get(id) != null;
}).WithMessage("Car with id '{PropertyValue}' does not exist!");

RuleFor(c => c.Model).MustAsync(async (car, model, context, cancellation) =>
{
    var expectedModel = (ModelType)context.ParentContext.RootContextData["ExpectedModel"];
    var databaseCar = await _carRepository.Get(car.Id); // Repeated database call
    return databaseCar.Model == expectedModel;
}).WithMessage("Stored car does not have the expected model.");
Ideally I would do this call once, but I gather that storing the result as a member on the validator instance is not advised, and overriding ValidateAsync to stash the database result in the context (similarly to ExpectedModel in the example above) results in rather clumsy code to retrieve it.
Am I missing something?
One quick solution could be to add some kind of memoization/caching on your Repository class, so that multiple requests for the same Car within the same context (e.g. HTTP Request) will remember and return the same object without requiring multiple round-trips. But there might be a better way.
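As a sketch of that idea, assuming an ICarRepository with an async Get method (register the decorator with a per-request lifetime so the cache dies with the request):

using System.Collections.Generic;
using System.Threading.Tasks;

public class CachedCarRepository : ICarRepository
{
    private readonly ICarRepository _inner;
    private readonly Dictionary<int, Task<Car>> _cache = new Dictionary<int, Task<Car>>();

    public CachedCarRepository(ICarRepository inner)
    {
        _inner = inner;
    }

    public Task<Car> Get(int id)
    {
        // The first lookup per id hits the database; later lookups within
        // the same request reuse the same pending/completed task.
        Task<Car> car;
        if (!_cache.TryGetValue(id, out car))
        {
            car = _inner.Get(id);
            _cache[id] = car;
        }
        return car;
    }
}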
There are various levels of validation to consider. As Jammer points out, FluentValidation is usually used to validate the consistency of a given model: did the client send me something that appears on the surface to be a valid request? Determining whether that request is valid given the current state of data is another level of validation that people often do in different ways.
One way that you could get the best of both worlds is to create a new class to represent both the given car model and everything that your application needs in order to validate it.
public class ValidCar
{
public CarModel Model {get; set;}
public CarEntity Entity {get; set;}
}
First you assemble all the data you need into a new ValidCar, and then you can use FluentValidation rules on this new model to ensure it's actually valid.
One benefit to this approach is you can have your business logic methods require a ValidCar as a parameter instead of just a CarModel. This makes it very difficult to accidentally forget to validate the car in some code path, and it prepackages up data that's likely to be useful to much of the business-level logic that you plan to use.
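A sketch of what that could look like with FluentValidation (assuming CarModel carries the client's Id and expected Model values, and an async repository as before):

public class ValidCarValidator : AbstractValidator<ValidCar>
{
    public ValidCarValidator()
    {
        // The entity was fetched once while assembling the ValidCar,
        // so none of these rules needs its own database call.
        RuleFor(c => c.Entity)
            .NotNull()
            .WithMessage("Car does not exist!");

        RuleFor(c => c.Entity)
            .Must((c, entity) => entity == null || entity.Model == c.Model.Model)
            .WithMessage("Stored car does not have the expected model.");
    }
}

public async Task<ValidationResult> ValidateCarAsync(CarModel carModel)
{
    // Assemble: one repository call, then purely in-memory validation.
    var validCar = new ValidCar
    {
        Model = carModel,
        Entity = await _carRepository.Get(carModel.Id)
    };
    return await new ValidCarValidator().ValidateAsync(validCar);
}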
I would argue that checking for an existing item with the same ID is not a validation question.
If you have to do this, create a method in your repository that specifically checks it in an optimised way. Only select the ID column, so at least you aren't loading and materialising the entire entity.
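For example, something like this (a sketch assuming EF with async support; AnyAsync translates to an EXISTS query, so no entity is materialised at all):

public async Task<bool> CarExistsAsync(int id)
{
    // Translates to EXISTS-style SQL; nothing is loaded or tracked.
    return await _db.Cars.AnyAsync(c => c.Id == id);
}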
I'm struggling a little bit with the following problem. Let's say I want to manage dependencies in my project so that my domain won't depend on any external stuff - in this case, on the repository. In this example my domain lives in project.Domain.
To do so, I declared an interface for my repository in project.Domain, which I implement in project.Infrastructure. Reading Vernon's DDD red book, I noticed that he suggests the method for creating a new ID for an aggregate should be placed in the repository, like:
public class EntityRepository
{
public EntityId NextIdentity()
{
// create new instance of EntityId
}
}
Inside this EntityId object there would be a GUID, but I want to model my ID explicitly; that's why I'm not using plain GUIDs. I also know I could sidestep this problem completely and generate the GUID on the database side, but for the sake of this argument let's assume that I really want to generate it inside my application.
Right now I'm just wondering - are there any specific reasons for this method to be placed inside the repository, as Vernon suggests, or could I implement identity creation inside the entity itself, for example:
public class Entity
{
public static EntityId NextIdentity()
{
// create new instance of EntityId
}
}
You could place it in the repository as Vernon says, but another idea would be to pass a factory into the constructor of your base entity that creates the identifier. That way you have identifiers before you even interact with repositories, and you can swap in an implementation per ID-generation strategy. A repository might hold a connection to something like a web service or a database, which can be costly or unavailable.
There are solid strategies (especially with GUIDs) for generating identifiers well. This also makes your application fully independent of the outside world.
This also enables you to have different identifier types throughout your application if the need arises.
For example:
public abstract class Entity<TKey>
{
    public TKey Id { get; }

    protected Entity() { }

    protected Entity(IIdentityFactory<TKey> identityFactory)
    {
        if (identityFactory == null)
            throw new ArgumentNullException(nameof(identityFactory));

        Id = identityFactory.CreateIdentity();
    }
}
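The IIdentityFactory<TKey> isn't shown above; a minimal version with one GUID-backed strategy could be:

using System;

public interface IIdentityFactory<TKey>
{
    TKey CreateIdentity();
}

// One concrete strategy: client-side GUID generation, no database round-trip.
public class GuidIdentityFactory : IIdentityFactory<Guid>
{
    public Guid CreateIdentity()
    {
        return Guid.NewGuid();
    }
}

Swapping in, say, a sequential-GUID strategy later is then a one-class change.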
Yes, you could bypass the call to the repository and just generate the identity on the Entity. The problem, however, is that you've broken the core idea behind the repository: keeping everything related to entity storage isolated from the entity itself.
I would say keep the NextIdentity method in the repository, and still use it, even if you are only generating the GUIDs client-side. The benefit is that in some future where you want to change how the identities are seeded, you can support that through the repository. Whereas if you go with the approach directly on the Entity, you would have to refactor later to support such a change.
Also, consider scenarios where you would use different repositories, such as testing: you might want to generate two identities with the same ID and check that the clash "fails properly". Having the repository handle the generation gives you the opportunity to get creative in such ways, without writing completely artificial test cases that don't mimic the calls production would actually make.
TLDR; Keep it in the repository, even if your identifier can be client-side generated.
I often find myself writing implementation details of the persistence layer in my domain objects. In the example below, it seems useful to be able to pass a PersonRecord into the constructor in order to map the properties.
Note: The PersonRecord is an Entity Framework entity in this example.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Email { get; set; }

    public Person(Data.persistence.PersonRecord entity)
    {
        Id = entity.Id;
        Name = entity.Name;
        Email = entity.Email;
    }
}
I also find it useful to map the properties back to the entity when saving the data.
public void UpdateEntity(Data.persistence.PersonRecord entity)
{
    entity.Id = Id;
    entity.Name = Name;
    entity.Email = Email;
}
I then have a repository for this entity which performs the saving.
The concern I have about this approach is that my domain objects are coupled to Entity Framework, making it arduous to replace, along with other SRP-violation issues.
Are my concerns warranted?
I can overcome the constructor issue by swapping the dependency around, taking each property as a parameter and passing in those values inside the repository.
I can do the same for updating the entity, I wouldn't need any extra logic in the domain model but rather let the repository be responsible for mapping the properties.
By doing these things however I also need to do the following:
Every retrieval method in the repository will need to map the value of each property across in the domain object's constructor. This violates the DRY principle unless that logic is abstracted into a factory function or similar (see the sketch after the next point).
The update method in the repository will also be required to map each property; this seems okay, as there will likely be only one update method.
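To sketch the factory-function idea (MapToDomain and PersonContext are hypothetical, and this assumes Person takes its values through a plain constructor, as described above):

public class PersonRepository
{
    private readonly PersonContext _db; // hypothetical EF context

    public PersonRepository(PersonContext db)
    {
        _db = db;
    }

    public Person GetById(int id)
    {
        var record = _db.People.Find(id);
        return record == null ? null : MapToDomain(record);
    }

    // The one place where persistence records become domain objects,
    // so each retrieval method doesn't repeat the property-by-property copy.
    private static Person MapToDomain(Data.persistence.PersonRecord record)
    {
        return new Person(record.Id, record.Name, record.Email);
    }
}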
Am I missing a trick? If not, which is the best approach?
I have a general difference of opinion on an architectural design, and even though Stack Overflow should not be used to ask for opinions, I would like to ask for the pros and cons of the two approaches described below:
Details:
- C# application
- SQL Server database
- Using Entity Framework
- And we need to decide what objects we are going to use to store our information and use all throughout the application
Scenario 1:
We use the Entity Framework entities and pass them all around through our application: the same object is used to store all the information, we pass it around to the BL, and eventually our WebApi will take this entity and return the value. No DTOs nor POCOs.
If the database schema changes, we update the entity and modify it in all the classes where it is used.
Scenario 2:
We create an intermediate class - call it a DTO or call it a POCO - to hold all the information the application requires. There is an intermediate step of taking the information stored in the entity and populating the POCO, but we keep all EF code within the data access layer rather than across all layers.
What are the pros and cons of each one?
I would use intermediate classes, i.e. POCO instead of EF entities.
The only advantage I see in using EF entities directly is that it's less code to write...
Advantages of using POCOs instead:
You only expose the data your application actually needs
Basically, say you have some GetUsers business method. If you just want a list of users to populate a grid (i.e. you need their ID, name, and first name, for example), you could just write something like this:
public IEnumerable<SimpleUser> GetUsers()
{
return this.DbContext
.Users
.Select(z => new SimpleUser
{
ID = z.ID,
Name = z.Name,
FirstName = z.FirstName
})
.ToList();
}
It is crystal clear what your method actually returns.
Now imagine instead that it returned a full User entity with all the navigation properties and internal stuff you do not want to expose (such as the Password field)...
It really simplifies the job of whoever consumes your services.
It's even more obvious for Create-like business methods. You certainly don't want to use a User entity as a parameter; it would be awfully complicated for the consumers of your service to know which properties are actually required...
Imagine the following entity:
public class User
{
public long ID { get; set; }
public string Name { get; set; }
public string FirstName { get; set; }
public string Password { get; set; }
public bool IsDeleted { get; set; }
public bool IsActive { get; set; }
public virtual ICollection<Profile> Profiles { get; set; }
public virtual ICollection<UserEvent> Events { get; set; }
}
Which properties are required for you to consume the void Create(User entity); method?
ID: dunno, maybe it's generated, maybe it's not
Name/FirstName: well, those should be set
Password: is that a plain-text password, a hashed version? What is it?
IsDeleted/IsActive: should I activate the user myself? Is it done by the business method?
Profiles: hmm... how do I attach a profile to a user?
Events: what the hell is that??
It forces you to not use lazy loading
Yes, I hate this feature for multiple reasons. Some of them are:
extremely hard to use efficiently. Too many times I've seen code that produces thousands of SQL requests because the developers didn't know how to use lazy loading properly
extremely hard to manage exceptions. By allowing SQL requests to be executed at any time (i.e. whenever you lazy load), you delegate the job of managing database exceptions to the upper layers, i.e. the business layer or even the application. A bad habit.
Using POCO forces you to eager-load your entities, much better IMO.
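To make that concrete, related data can be pulled inside the same projection, so every SQL round-trip happens in the data layer (ProfileCount is an extra property I'm assuming on SimpleUser for this example):

public IEnumerable<SimpleUser> GetUsersWithProfileCount()
{
    // One query, executed right here; the returned POCOs can never
    // trigger a lazy query later.
    return this.DbContext
        .Users
        .Select(z => new SimpleUser
        {
            ID = z.ID,
            Name = z.Name,
            FirstName = z.FirstName,
            ProfileCount = z.Profiles.Count() // related data fetched in the same query
        })
        .ToList();
}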
About AutoMapper
AutoMapper is a tool that allows you to automagically convert entities to POCOs and vice versa. I do not like it either. See https://stackoverflow.com/a/32459232/870604
I have a counter-question: Why not both?
Consider any arbitrary MVC application. In the model and controller layer you'll generally want to use the EF objects. If you defined them using Code First, you've essentially defined how they are used in your application first and then designed your persistence layer to accurately save the changes you need in your application.
Now consider serving these objects to the view layer. The views may or may not reflect your objects, or an aggregation of your working objects. This often leads to POCOs/DTOs that capture whatever is needed in the view. Another scenario is when you want to publish objects in a web service. Many frameworks provide easy serialization of POCO classes, in which case you typically need to either 1) annotate your EF classes or 2) make DTOs.
Also be aware that any lazy loading you may have on your EF classes is lost when you use POCOs or when you close your context.
I am new to both MVC and Entity Framework and I have a question about the right/preferred way to do this.
I have sort of been following the Nerd Dinner MVC application for how I am writing this application. I have a page that has data from a few different places. It shows details that come from a few different tables and also has a dropdown list from a lookup table.
I created a ViewModel class that contains all of this information:
public class DetailsViewModel
{
    public List<Foo> DropdownListData { get; set; }

    // comes from table 1
    public string Property1 { get; set; }
    public string Property2 { get; set; }

    public Bar SomeBarObject { get; set; } // comes from table 2
}
In the Nerd Dinner code, their example is a little too simplistic. The DinnerFormViewModel takes in a single entity, Dinner; based on the Dinner, it creates a SelectList for the countries from the dinner's location.
Because of the simplicity, their data access code is also pretty simple. He has a simple DinnerRepository with a method called GetDinner(). In his action methods he can do simple things like:
Dinner dinner = new Dinner();
// return the view model
return View(new DinnerFormViewModel(dinner));
OR
Dinner dinner = repository.GetDinner(id);
return View(new DinnerFormViewModel(dinner));
My query is a lot more complex than this, pulling from multiple tables...creating an anonymous type:
var query = from a in ctx.Table1
            where a.Id == id
            select new { a.Property1, a.Property2, a.Foo, a.Bar };
My questions are as follows:
What should my repository class look like? Should the repository class return the ViewModel itself? That doesn't seem like the right way to do things, since the ViewModel sort of implies it is being used in a view. Since my query is returning an anonymous object, how do I return that from my repository so I can construct the ViewModel in my controller actions?
While most of the answers are good, I think they miss a between-the-lines part of your question.
First of all, there is no 100% right way to go about it, and I wouldn't get too hung up on the details of the exact pattern to use yet. As your application gets more and more developed you will start seeing what works and what doesn't, and figure out how best to change it for you and your application. I just finished completely changing the pattern of my ASP.NET MVC backend, mostly because a lot of the advice I found wasn't working for what I was trying to do.
That being said, look at your layers in terms of what they are supposed to do. The repository layer is solely meant for adding, removing, and editing data in your data source. It doesn't know how that data is going to be used, and frankly it doesn't care. Therefore, repositories should just return your EF entities.
The part of your question that others seem to be missing is that you need an additional layer between your controllers and the repositories, usually called the service layer or business layer. This layer contains various classes (however you want to organize them) that get called by controllers. Each of these classes calls the repository to retrieve the desired data and then converts it into the view models that your controllers will end up using.
This service/business layer is where your business logic goes (and if you think about it, converting an entity into a view model is business logic, as it defines how your application is actually going to use that data). This means you don't have to call specific conversion methods or anything. The idea is that you tell your service/business layer what you want to do, and it gives you business entities (view models) back, with your controllers having no knowledge of the actual database structure or how the data was retrieved.
The service layer should be the only layer that calls repository classes as well.
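A bare-bones sketch of that shape, reusing the Nerd Dinner names (DinnerDetailsViewModel and the property picks are illustrative):

public class DinnerService
{
    private readonly DinnerRepository _repository;

    public DinnerService(DinnerRepository repository)
    {
        _repository = repository;
    }

    public DinnerDetailsViewModel GetDinnerDetails(int id)
    {
        // The service asks the repository for entities...
        var dinner = _repository.GetDinner(id);

        // ...and turning them into a view model happens here, so the
        // controller never sees the entity or the database structure.
        return new DinnerDetailsViewModel
        {
            Title = dinner.Title,
            HostedBy = dinner.HostedBy
        };
    }
}

The controller action then shrinks to return View(_dinnerService.GetDinnerDetails(id));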
You are correct that a repository should not return a view model, as changes to your view would then force changes to your data layer.
Your repository should return an aggregate root. If your Property1, Property2, Foo, and Bar are related in some way, I would extract a new class to handle this.
public class FooBarDetails
{
public string Property1 {get;set;}
public string Property2 {get;set;}
public Foo Foo {get;set;}
public Bar Bar {get;set;}
}
var details = _repo.GetDetails(detailId);
If Foo and Bar are not related at all, it might be an option to introduce a service to compose your FooBarDetails.
FooBarDetails details = _service.GetFooBar(id);
where GetFooBar(int) would look something like this:
var foo = _fooRepo.Get(id);
var bar = _barRepo.Get(id);

return new FooBarDetails { Foo = foo, Bar = bar, Property1 = "something", Property2 = "something else" };
This is all conjecture, since the design of the repository really depends on your domain. Using generic terms makes it hard to reason about the potential relationships between your objects.
Updated
From the comments: suppose we are dealing with an Order as the aggregate root. An order would have its OrderItems and also the customer who placed the order.
public class Order
{
public List<OrderItem> Items{get; private set;}
public Customer OrderedBy {get; private set;}
//Other stuff
}
public class Customer
{
public List<Order> Orders{get;set;}
}
Your repo should return a fully hydrated order object.
var order = _rep.Get(orderId);
Since your order has all the information needed I would pass the order directly to the view model.
public class OrderDetailsViewModel
{
public Order Order {get;set;}
public OrderDetailsViewModel(Order order)
{
Order = order;
}
}
Now, having a view model with only one item might seem overkill (and it most likely will be at first). If you need to display more items on your view, it starts to help.
public class OrderDetailsViewModel
{
public Order Order {get;set;}
public List<Order> SimilarOrders {get;set;}
public OrderDetailsViewModel(Order order, List<Order> similarOrders)
{
Order = order;
SimilarOrders = similarOrders;
}
}
The repository should work only with models, not anonymous types, and it should only implement CRUD operations. If you need some filtering, you can add a service layer for that.
For mapping between view models and models you can use any mapping library, such as AutoMapper.
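For example, with AutoMapper the mapping is configured once and reused everywhere (a sketch; Person/PersonViewModel are placeholders):

using AutoMapper;

// One-time configuration, typically at application startup.
var config = new MapperConfiguration(cfg =>
{
    cfg.CreateMap<Person, PersonViewModel>();
});
var mapper = config.CreateMapper();

// In the service layer: entity in, view model out.
var viewModel = mapper.Map<PersonViewModel>(person);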
The current answers are very good. I would just point out that you are abusing anonymous types; they should only be used for intermediate transport steps and never passed to other places in your code (e.g. view model constructors).
My approach would be to inject the view model with all the relevant model classes. E.g. an action method might look like:
var dinner = dinnerRepository.Get(dinnerId);
var bar = barRepository.Get(barId);
var viewModel = new DinnerAndBarFormViewModel(dinner, bar);
return View(viewModel);
I have the same doubt as the poster and I am still not convinced. I personally do not much like the advice of limiting the repository to basic CRUD operations. IMHO, performance should always be taken into account when developing a real application, and substituting two separate queries for a SQL outer join in master-detail relationships doesn't sound too good to me.
Also, this way the principle that only the needed fields should be queried is completely lost: with this approach we are forced to always retrieve all the fields of all the tables involved, which is simply crazy in non-toy applications!