If I have a three-layer Web Forms application that takes user input, I know I can validate that input using validation controls in the presentation layer. Should I also validate in the business and data layers to protect against SQL injection and other issues? What validations should go in each layer?
Another example would be passing an ID to return a record. Should the data layer ensure that the ID is valid, or should that happen in the BLL / UI?
You should validate in all layers of your application.
What validation will occur at each layer is specific to the layer itself. Each layer should be safe to send "bad" requests to and get a meaningful response, but which checks to perform at each layer will depend on your specific requirements.
Broadly:
User Interface - Should validate user input and provide helpful error messages and visual cues for correcting mistakes; it should be protecting your lower layers against invalid user input.
Business / Domain Layer - Should check arguments to methods are valid (throwing ArgumentException and similar when they are not) and should check that operations are possible within the constraints of your business rules; it should be protecting your domain against programming mistakes.
Data Layer - Should check the data you are trying to insert or update is valid within the context of your database, that it meets all the relational constraints and check constraints; it should be protecting your database against mistakes in data-access.
Validation at each layer will ensure that only data and operations the layer believes to be correct are allowed to enter. This gives you a great deal of predictability, knowing information had to meet certain criteria to make it through to your database, that operations had to be logical to make it through your domain layer, and that user input has been sanitized and is easier to work with.
It also gives you security knowing that if any of your layers was subverted, there is another layer performing checks behind it which should prevent anything entering which you don't want to.
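As a rough illustration of the business-layer part of that breakdown, here is a minimal sketch that separates argument checks (programming mistakes) from business-rule checks (operations that are not allowed right now). The `AccountService` class and its rules are hypothetical and only meant to show the shape of the checks.

```csharp
using System;

// Hypothetical domain class; the names and rules are illustrative only.
public class AccountService
{
    public decimal Balance { get; private set; }

    public AccountService(decimal openingBalance)
    {
        // Argument check: protects the domain against a caller's mistake.
        if (openingBalance < 0)
            throw new ArgumentOutOfRangeException(nameof(openingBalance), "Opening balance cannot be negative.");
        Balance = openingBalance;
    }

    public void Withdraw(decimal amount)
    {
        // Argument check: is the request well-formed at all?
        if (amount <= 0)
            throw new ArgumentOutOfRangeException(nameof(amount), "Amount must be positive.");

        // Business-rule check: is the operation possible in the current state?
        if (amount > Balance)
            throw new InvalidOperationException("Insufficient funds.");

        Balance -= amount;
    }
}
```

The UI would normally catch most of these cases first and show a friendly message; the checks here are the second line of defense.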
Should I also validate in the business and data layers as well to protect against SQL injection and also issues?
Yes and Yes.
In your business layer code, you need to validate the input again (as client side can be spoofed), and also for your business logic, making sure the entries make sense for your application.
As for the data layer - you again need to ensure data is valid for the DB. Use parametrized queries as this will pretty much ensure no SQL injection will happen.
As for your specific question regarding the ID - the DB will know if an ID exists or not. Whether that is valid or not depends on whether it has meaning for your business layer. If it is purely a DB artefact (not part of your object model), then the DB needs to handle it; if it is a part of your object model and has significance to it, the business layer should handle it.
You absolutely need to validate in your business and data layers. The UI is an untrusted layer, it is always possible for somebody to bypass your client-side validation and in some cases your server-side UI validation.
Preventing SQL injection is simply a matter of parameterizing your queries. The phrase "SQL Injection" shouldn't even exist anymore, it's been a solved problem for years and years, and yet every day I see people writing queries using string concatenation. Don't do this. Parameterize the commands and you will be fine.
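For reference, a minimal sketch of what parameterizing a command can look like using only the provider-neutral `System.Data.Common` types. The `Customers` table, its columns, and the `CustomerQueries` class are assumptions for illustration, and the `@name` placeholder syntax varies between providers.

```csharp
using System.Data;
using System.Data.Common;

// Hypothetical helper; table and column names are assumed.
public static class CustomerQueries
{
    // The SQL text is a constant: user input never becomes part of it.
    public const string FindByNameSql =
        "SELECT Id, Name FROM Customers WHERE Name = @name";

    // Accepts any ADO.NET provider's connection.
    public static DbCommand BuildFindByName(DbConnection connection, string name)
    {
        DbCommand command = connection.CreateCommand();
        command.CommandText = FindByNameSql;

        DbParameter parameter = command.CreateParameter();
        parameter.ParameterName = "@name";
        parameter.DbType = DbType.String;
        parameter.Value = name; // sent as data, so a value like O'Brien needs no escaping
        command.Parameters.Add(parameter);

        return command;
    }
}
```

The point is that the value travels separately from the SQL text, so there is nothing to concatenate and nothing to escape.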
One of the main reasons you separate your app into multiple tiers is so that each tier is reusable. If individual tiers don't do their own validation, then they are not autonomous and you don't have proper separation of concerns. You also can't do any thorough testing without individual components doing built-in validation.
I tend to relax these restrictions for classes or methods that are internal or private because they're not getting directly tested or used. As long as the public API is fully-validated, private APIs can generally assume that the class is in a valid state.
So, basically, yes, every layer, in fact every public class and method needs to validate its own data/arguments.
Semantic validation, like checking whether or not a particular Customer ID is valid, is going to depend on your design requirements. Obviously the business layer has no way of knowing whether or not an ID exists until said ID actually hits the data layer, so it can't perform this check in advance. Whether it throws an exception for a missing ID or simply returns null/ignores the error depends on exactly what the class/method is designed to do.
However, if this ID needs to be in a special format - for example, maybe you're using specially-coded account numbers ("R-12345-A-678") - then it does become the responsibility of the domain/business layer to validate the input and make sure it conforms to the correct format, especially if the consumer of your business class is trying to create a new account.
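A format check like that could be sketched as follows. The pattern is an assumption inferred only from the single example "R-12345-A-678" (letter, five digits, letter, three digits), and the `AccountNumber` class is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical value check for specially-coded account numbers.
public static class AccountNumber
{
    // Assumed pattern, derived from the one example in the text.
    // \z anchors at the very end of the input (no trailing newline allowed).
    private static readonly Regex Format =
        new Regex(@"^[A-Z]-[0-9]{5}-[A-Z]-[0-9]{3}\z", RegexOptions.CultureInvariant);

    public static bool IsWellFormed(string value)
    {
        return value != null && Format.IsMatch(value);
    }

    // Returns the value unchanged, or throws if it does not match the format.
    public static string Parse(string value)
    {
        if (!IsWellFormed(value))
            throw new FormatException("Account number must look like R-12345-A-678.");
        return value;
    }
}
```

Whether the number actually exists is still a question only the data layer can answer; this only rejects input that could never be valid.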
No layer should trust data coming from another layer. The analogy I use for this is one of fiefdoms. Say you want to send a message to the king. If the message is not in the proper format it will be rejected before it ever gets to his ears. You could continue to send messages until you eventually get the format right or you could use an emissary. The job of the emissary is to help you verify that your message will be in the acceptable format so that the king will hear it.
Each layer in a system is a fiefdom. Each layer acts as an emissary to the layer to which it will send data by verifying that it will be accepted. No layer trusts data coming from outside that layer (no one trusts messages from outside the fiefdom). The database does not trust the middle layer. The middle-layer does not trust the database or the presentation layer. The presentation does not trust the user or the middle layer.
So the answer is that absolutely you should check and re-check the data in each layer.
Short answer: yes.
Validate input as it is received in each new layer and before it gets acted upon. I generally validate such input just before it gets used or passed on to the next layer (JavaScript checks that it's a valid email and free of malicious input; likewise the business layer checks again before constructing a query using it).
To your last question: if the ID returns a record, then it is valid. To confirm validity up front you'd have to look the record up anyway, so you'd be making a lot of unnecessary lookups if you tried that.
I hope that helps.
I do all of my validation at the Presenter layer in the Model-View-Presenter. Validation is somewhat tricky because it's really a crosscutting concern so many times.
I prefer to do it at the presenter layer because I can then short-circuit the call to the model.
The other approach is to do the validation in the model layer, but then there is the issue of communicating errors, because you cannot easily inform other layers of errors aside from exceptions. You can always pack exceptions with data, or create your own custom exception that carries a list of error messages or a similar construct, but that always seems dirty to me.
Later, when I expose my model through a web service, I will implement double validation, checking both in the Presenter and in the Model, since it will be possible to bypass the presenter layer by calling the web service. The other big advantage is that it decouples my presenter-layer validations from the model: the model might only require raw validation of types to match the database, whereas for users of my UI I want more granular rules about what they input, not just what they physically can.
Other questions: SQL injection is a model concern and should not be handled in any middle layers. Many naive SQL injection attempts are blunted when text fields don't allow special characters, though that alone is not a reliable defense. You should almost always be using parameterized SQL, which makes SQL injection unusable.
As for the question on the ID, that's a model concern: either it can get a record with that ID, or it should return null or throw a record-not-found exception, depending on what convention you wish to establish.
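The two conventions for a missing ID can be sketched side by side. This uses an in-memory dictionary as a stand-in for the real data store; `Customer`, `InMemoryCustomerRepository`, and the `Find`/`Get` naming are assumptions for illustration, not an established API.

```csharp
using System;
using System.Collections.Generic;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// In-memory stand-in for a data layer; a real one would query the database.
public class InMemoryCustomerRepository
{
    private readonly Dictionary<int, Customer> _rows = new Dictionary<int, Customer>();

    public void Add(Customer customer)
    {
        _rows[customer.Id] = customer;
    }

    // Convention 1: a missing ID is an expected outcome, so return null.
    public Customer FindById(int id)
    {
        Customer found;
        return _rows.TryGetValue(id, out found) ? found : null;
    }

    // Convention 2: a missing ID means the caller made a mistake, so throw.
    public Customer GetById(int id)
    {
        Customer found = FindById(id);
        if (found == null)
            throw new KeyNotFoundException("No customer with ID " + id + ".");
        return found;
    }
}
```

Whichever convention you pick, applying it consistently across repositories matters more than the choice itself.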
I'm just getting started with DDD and implementing the Onion Architecture.
I'm making an invitation system, where a super user can invite another user by email to his company. However when creating the invitation, I want to ensure that the user is not already created in the system. I want to do that by checking if there's any record in the database with that email. I'm using Entity Framework to handle the database context.
And I've made a repository for both Invitation and UserAccount, that contain methods to find items based on an Id.
I need to use the DB context in order to see if the invitation is still valid, but since the method is declared in the Domain Layer, I can't really figure out how to do it, without breaking the design pattern. The Domain layer should not know anything about the persistence layer.
I thought about injecting the IUserAccountRepository and then executing the required methods in order to complete the Accept() method, but I'm afraid this is wrong.
The Domain layer should not know anything about the persistence layer.
That's right - the domain layer should not know about persistence.
But that constraint doesn't apply to the application layer.
In other words, we design our domain model interface so that it "asks for" the information that we (might) need to successfully compute the next state of the model, and the application has the problem of figuring out where that information comes from.
public UserAccount Accept(Guid userId, Boolean userExistsInDatabase)
What you will see in some designs is that, instead of passing in the answer to the question, we pass in the capability of asking the question, and let the model decide for itself whether the question should be asked and what to do with the answer.
public UserAccount Accept(Guid userId, Roster roster)
In this case, Roster would be an interface defined by your model, that accepts some piece of information that the model already has and reports back some other piece of information that your model understands. Then your application would provide an implementation of this interface when invoking the method.
Passing values across the boundaries is a bit more "pure", in that the model code doesn't need to know anything about the failure modes of the Roster -- all of that code would instead live in the application layer.
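To make the second variant concrete, here is a minimal sketch. `Roster`, `UserAccount`, and `Accept` come from the signatures above; the `Contains` member, the shape of `Invitation`, and the `InMemoryRoster` implementation are assumptions. In a real application the implementation would live in the application or infrastructure layer and would probably wrap the repository or the EF context.

```csharp
using System;
using System.Collections.Generic;

public class UserAccount
{
    public Guid Id { get; }
    public string Email { get; }

    public UserAccount(Guid id, string email)
    {
        Id = id;
        Email = email;
    }
}

// Defined by the domain model; it knows nothing about persistence.
public interface Roster
{
    bool Contains(string email); // hypothetical member
}

public class Invitation
{
    public string Email { get; }
    public bool Accepted { get; private set; }

    public Invitation(string email)
    {
        Email = email;
    }

    public UserAccount Accept(Guid userId, Roster roster)
    {
        if (Accepted)
            throw new InvalidOperationException("Invitation was already accepted.");
        // The model decides to ask the question; the application decides how it is answered.
        if (roster.Contains(Email))
            throw new InvalidOperationException("A user with this email already exists.");
        Accepted = true;
        return new UserAccount(userId, Email);
    }
}

// Stand-in implementation supplied from outside the domain layer.
public class InMemoryRoster : Roster
{
    private readonly HashSet<string> _emails;

    public InMemoryRoster(IEnumerable<string> emails)
    {
        _emails = new HashSet<string>(emails, StringComparer.OrdinalIgnoreCase);
    }

    public bool Contains(string email)
    {
        return _emails.Contains(email);
    }
}
```

The domain project only references the `Roster` interface, so the dependency still points inward.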
It's OK to use Contracts in your Domain Services.
"IUserAccountRepository" is a Contract that we create in the domain and the domain service doesn't know about implementation.
So do not worry about it; that approach is fine.
I'm wondering what's the best way to do validation of database constraints (e.g. UNIQUE) in an ASP.NET MVC application, built with DDD in mind, where the underlying layers are Application Layer (application services), Domain Layer (domain model) and Infrastructure Layer (persistence logic, logging, etc.).
I've been looking through lots of DDD samples, but what many of them don't mention is how to do validation in the repository (I suppose that this is where this type of validation fits). If you know of any samples doing this, please share them; it will be much appreciated.
More specifically, I have two questions. How would you perform the actual validation? Would you explicitly check if a customer name already exists by querying the database, or would you try inserting it directly in the database and catching the error if any (seems messy)? I prefer the first one, and if choosing this, should it be done in the repository, or should it be the job of an application service?
When the error is detected, how would you pass it to ASP.NET MVC so the user can be informed nicely about the error? Preferably using the ModelStateDictionary so the error is easily highlighted on the form.
In the N-Layered app by Microsoft Spain, they use the IValidatableObject interface, and the simplest property validation is placed on the entity itself, such as:
public IEnumerable<ValidationResult> Validate(ValidationContext validationContext)
{
    var validationResults = new List<ValidationResult>();
    if (String.IsNullOrWhiteSpace(this.FirstName))
        validationResults.Add(new ValidationResult(Messages.validation_CustomerFirstNameCannotBeNull, new string[] { "FirstName" }));
    return validationResults;
}
Before the entity is persisted, the Validate method is called to ensure that the properties are valid:
void SaveCustomer(Customer customer)
{
    var validator = EntityValidatorFactory.CreateValidator();
    if (validator.IsValid(customer)) //if customer is valid
    {
        _customerRepository.Add(customer);
        _customerRepository.UnitOfWork.Commit();
    }
    else
        throw new ApplicationValidationErrorsException(validator.GetInvalidMessages<Customer>(customer));
}
The ApplicationValidationErrorsException can then be caught in the MVC application, and the validation error messages can be parsed and inserted into the ModelStateDictionary.
I could add all the validation logic into the SaveCustomer method, e.g. querying the database checking if a customer already exists using a given column (the UNIQUE one).
Maybe this is okay, but I would rather that validator.IsValid (or something similar) did this for me, or that validation were performed once again in the Infrastructure layer (if it belongs there; I'm not sure).
What do you think? How do you do it? I'm very interested in gaining more insight into different validation techniques in layered applications.
Possible solution #1
In the case where the validation logic can't be done in the presentation layer (like Iulian Margarintescu suggests) and needs to be done in the service layer, how would you pass validation errors up to the presentation layer?
Microsoft has a suggestion here (see listing 5). What do you think about that approach?
You mention DDD, yet there is a lot more to DDD than entities and repositories. I assume you are familiar with Mr Eric Evans's book Domain Driven Design, and I would strongly suggest you re-read the chapters about strategic design and bounded contexts. Also Mr Evans has a very nice talk called "What I've learned about DDD since the book" that you can find here. Talks about SOA, CQRS and event sourcing from Greg Young or Udi Dahan also contain a lot of information about DDD and applying DDD. I must warn you that you might discover things that will change the way you think about applying DDD.
Now for your question about validation: one approach might be to query the DB (using an Ajax call that is directed to an app service) as soon as the user types something in the "name" field, and try to suggest an alternative name if the one they entered already exists. When the user submits the form, try to insert the record in the DB and handle any duplicate key exception (at the repository or app service level). Since you are already checking for duplicates ahead of time, the cases where you get an exception should be fairly rare, so any decent "We are sorry, please retry" message should do; unless you have A LOT of users, they will probably never see it.
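A minimal sketch of that check-then-insert flow, with the database replaced by an in-memory set so the example stays self-contained. `CustomerTable`, `DuplicateKeyException`, and `CustomerRegistration` are hypothetical stand-ins; in a real application the duplicate would surface as a provider-specific exception from the UNIQUE constraint.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the error the database raises on a UNIQUE violation.
public class DuplicateKeyException : Exception
{
    public DuplicateKeyException(string message) : base(message) { }
}

// In-memory stand-in for a table with a UNIQUE constraint on the name column.
public class CustomerTable
{
    private readonly HashSet<string> _names =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);

    public bool Exists(string name)
    {
        return _names.Contains(name);
    }

    public void Insert(string name)
    {
        if (!_names.Add(name))
            throw new DuplicateKeyException("Duplicate customer name: " + name);
    }
}

public class CustomerRegistration
{
    private readonly CustomerTable _table;

    public CustomerRegistration(CustomerTable table)
    {
        _table = table;
    }

    // Early, friendly check (e.g. behind an Ajax call while the user types).
    public bool IsNameAvailable(string name)
    {
        return !_table.Exists(name);
    }

    // Final word belongs to the constraint. Returns null on success,
    // or a user-facing message if someone else took the name in the meantime.
    public string Register(string name)
    {
        try
        {
            _table.Insert(name);
            return null;
        }
        catch (DuplicateKeyException)
        {
            return "We are sorry, that name was just taken. Please retry.";
        }
    }
}
```

The early check is only a convenience; the insert still has to handle the duplicate, because two users can pass the check at the same moment.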
This post from Udi Dahan also has some information on approaching validation. Remember that this might be a constraint you are imposing on the business instead of a constraint that the business imposes on you - Maybe it provides more value for the business to allow customers with the same name to register, instead of rejecting them.
Also remember that DDD is a lot more about business than it is about technology. You can do DDD and deploy your app as a single assembly. Layers of client code on top of services on top of entities on top of repositories on top of databases have been abused so many times in the name of "good" design, without any reasons for why it is a good design.
I'm not sure this will answer your question(s), but I hope it will guide you to find the answers yourself.
I'm wondering what's the best way to do validation of database constraints (e.g. UNIQUE)
and if choosing this, should it be done in the repository, or should it be the job of a application service?
It depends on what you are validating.
If it's an aggregate root creation you are trying to validate - then there is nothing more global than app itself that "holds" it. In this case, I apply validation directly in repository.
If it's an entity, it lives in aggregate root context. In this case I'm validating entity uniqueness in aggregate root itself against all the other entities in this particular aggregate root. Same goes for value objects in entities/roots.
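For the entity case, a sketch of what validating uniqueness inside the aggregate root might look like. `Order` and `OrderLine` are hypothetical; the rule (one line per product code) is just an example of a constraint scoped to a single aggregate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class OrderLine
{
    public string ProductCode { get; }
    public int Quantity { get; }

    public OrderLine(string productCode, int quantity)
    {
        ProductCode = productCode;
        Quantity = quantity;
    }
}

// Aggregate root: the only way to add a line is through AddLine,
// so the uniqueness rule cannot be skipped.
public class Order
{
    private readonly List<OrderLine> _lines = new List<OrderLine>();

    public IReadOnlyList<OrderLine> Lines
    {
        get { return _lines; }
    }

    public void AddLine(string productCode, int quantity)
    {
        if (_lines.Any(line => line.ProductCode == productCode))
            throw new InvalidOperationException("Order already has a line for product " + productCode + ".");
        _lines.Add(new OrderLine(productCode, quantity));
    }
}
```

No database round trip is needed here, because everything the rule depends on is already loaded with the aggregate.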
P.s. repository is a service. Do not look at services as universal store for necessary but hard to name properly code. Naming matters. The same goes with names like "Helpers", "Managers", "Common", "Utilities", etc. - they are pretty much meaningless.
Also - you don't need to pollute your code base with pattern names: AllProducts > ProductRepository; OrderRegistrator > OrderService; order.isCompleted > IsOrderCompletedSpecification.IsSatisfiedBy.
More specific, I have two questions. How would you perform the actual validation? Would you explicitly check if a customer name already exists by querying the database, or would you try inserting it directly in the database and catching the error if any (seems messy)?
I would query the database. However, if high performance is a concern and customer name availability is the only thing the database should enforce, I would rely on the database (one less round trip).
When the error is detected, how would you pass it to ASP.NET MVC so the user can be informed nicely about the error? Preferably using the ModelStateDictionary so the error is easily highlighted on the form.
Usually it is not a good idea to use exceptions for controlling application flow, but since I want the UI to show only the things that can actually be done, I just throw an exception when validation fails. In the UI layer, there's a handler that neatly picks it up and renders it as HTML.
Also - it is important to understand what is the scope of command (e.g. product ordering command might check 2 things - if customer ain't debtor and if product is in store). If command has multiple associated validations, those should be coupled together so UI would receive them simultaneously. Otherwise it would lead to annoying user experience (going through multiple errors while trying to order that damn product over and over again).
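One way to couple a command's validations together is to collect every failure first and throw a single exception that carries the whole list. This is only a sketch: `OrderProductCommand`, its properties, and `CommandValidationException` are hypothetical, and the two checks mirror the debtor/in-store example above.

```csharp
using System;
using System.Collections.Generic;

// Carries all validation failures for one command.
public class CommandValidationException : Exception
{
    public IReadOnlyList<string> Errors { get; }

    public CommandValidationException(IReadOnlyList<string> errors)
        : base(string.Join("; ", errors))
    {
        Errors = errors;
    }
}

// Hypothetical command; in practice these values would be looked up, not passed in.
public class OrderProductCommand
{
    public bool CustomerIsDebtor { get; set; }
    public int UnitsInStore { get; set; }
    public int Quantity { get; set; }
}

public static class OrderProductValidator
{
    public static void Validate(OrderProductCommand command)
    {
        var errors = new List<string>();

        // Run every check before reporting, so the user sees all problems at once.
        if (command.CustomerIsDebtor)
            errors.Add("Customer has outstanding debt.");
        if (command.Quantity > command.UnitsInStore)
            errors.Add("Product is not in store in the requested quantity.");

        if (errors.Count > 0)
            throw new CommandValidationException(errors);
    }
}
```

The UI handler can then loop over `Errors` and display them together instead of one per submit.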
I'm using a three-tier architecture with C# and a SQL Server database as the data source. According to the DRY principle, the validation should be done in one place only, which in my case is either the front end data access layer or the database stored procedures.
So I was wondering whether to validate the stored procedure parameters in the data access layer or leave it to the stored procedure itself?
DRY is an important principle, but so is defence in depth.
When it comes to validating input, you must ensure it is safe - this should be done on each and every level (so both in DAL and stored procedure).
As for validating data for business logic, this should be in your business logic layer (BLL).
If you are using a three-tier architecture, I would recommend you investigate using an ORM such as NHibernate or LINQ to Entities instead. An ORM will provide you with better refactorability and hence maintainability (maintainability to me is the most important thing, as it leads to quality in the longer run, based on my experience).
It is not wise to put your validation only in the UI, as it is safer to have your security down in your DAL (data access layer) than in your UI, where it can more easily be bypassed (accidentally or on purpose). Think about SQL injection. You should guard against this in your data access layer rather than only in your UI, as it is easy to miss in the UI and easy to bypass for a malicious user trying to gain access to data they are not allowed to see.
I think that it might make sense to have validation potentially on the UI for usability, and in the data access layer for safety. I do like the DRY principle of doing validation in one place, and you can still do that. If you make a common set of rules which are propagated through to both the data access layer and the UI, then you will have a safe and usable system (through immediate feedback on data entry). Another way could be to have different rules for different layers. For example, field length rules and data entry patterns could be UI specific, while the DAL enforces that the data is valid. That is doing validation in multiple places, but as long as they are not independently doing the same thing, I think you will be OK. This is one of the hardest areas of consideration when designing an application, as validation is a cross-cutting concern and how you do it depends a lot on how you structure the rest of your application design.
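The "common set of rules" idea could look roughly like this: one rule definition that both the UI and the DAL call, so the check exists in two layers without being written twice. All names and the 50-character limit are assumptions for illustration.

```csharp
using System;

// Single definition of the rule, shared by every layer that needs it.
public static class CustomerNameRules
{
    public const int MaxLength = 50; // assumed limit

    // Returns null when the name is valid, otherwise an error message.
    public static string Check(string name)
    {
        if (string.IsNullOrWhiteSpace(name))
            return "Name is required.";
        if (name.Length > MaxLength)
            return "Name must be at most " + MaxLength + " characters.";
        return null;
    }
}

// UI layer: uses the rule for immediate feedback.
public static class CustomerForm
{
    public static string ErrorFor(string input)
    {
        return CustomerNameRules.Check(input);
    }
}

// Data access layer: uses the same rule as a hard guard.
public static class CustomerDal
{
    public static void Save(string name)
    {
        string error = CustomerNameRules.Check(name);
        if (error != null)
            throw new ArgumentException(error, nameof(name));
        // ... the parameterized insert or stored procedure call would go here ...
    }
}
```

The UI turns a failure into a message while the DAL turns it into an exception, but the rule itself lives in one place.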
First off, I am using web forms without any ORM framework.
I have been struggling with how to make my domain objects as "smart" and "rich" as they can be without allowing them access to my service and repository layer. My most recent attempt was in creating a model for gift certificates for an online store.
The main recurring issues that I am seeing is that:
More and more logic keeps being introduced in the service layer. All the calls to the repository must pass through the service layer, and each time the parameters are validated (e.g. exists in db, etc.). As a result my service layer is growing, but my domain objects just have some simple contractual validations. Even object validation is in the service layer, since if the ID of the item is null, it will check the db to ensure that the code is unique. IMHO, the consumer of the system should not care if the functionality they need deals with persistence or not.
I have a separate POCO for transaction log entries for when a gift certificate is redeemed. I assume that I should put a list or collection of these transactions as a property of my Gift Certificate model, but I am still unsure of when that property should be filled. Do I add a separate method on the service for loading the transactions into an object on demand (e.g. LoadTransactions(gc object)), or should the transactions be automatically loaded any time an existing gift certificate or list of gift certificates is requested (or maybe an option in the getGCs to load transactions as well)?
What about computed fields like "Available Balance"... should I even have properties like this on my object? Anytime I am working with the object, I will need to keep updating that property to ensure it is up to date. Right now I simply have a service method GetBalanceByCode(gc code).
Even actions like redeeming a gift certificate are basically 100% data-centric (take some input parameters, validate them and add a transaction log entry to db).
More and more logic keeps being introduced in the service layer (...)
Even object validation is in the service layer (...)
Validation is not the best candidate for a domain model element. Input (my personal preference is that it's represented as commands) should be validated at the application service level. Domain logic should model how the business works and assume that all the arguments are valid. Good candidates for domain logic are computations, for example: you want to have them in one single place and have them well tested.
I have a separate POCO for transaction log entries for when a gift certificate is redeemed.
This kind of object is known as an Event. You can learn about Events from Eric Evans' 'What I learnt since the Blue Book' presentation. An Event is basically an entity which is immutable. Events are quite often aggregates on their own because usually there are lots of them. By making them aggregates, you don't have any problems with lazy loading them as part of other objects' collections.
What about computed fields like "Available Balance"... should I even have properties like this on my object?
Computed properties are kind of logic that naturally fits in domain model, however it's debatable if a better approach is to compute the value each time or compute it when object changes and persist it in the DB.
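As a sketch of the "compute it each time" option: the balance is a read-only property derived from the initial amount and the redemptions, so there is nothing to keep in sync. `GiftCertificate`, `InitialAmount`, and `RecordRedemption` are hypothetical names, and the in-memory list stands in for however the redemptions are actually loaded.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class GiftCertificate
{
    private readonly List<decimal> _redeemedAmounts = new List<decimal>();

    public string Code { get; }
    public decimal InitialAmount { get; }

    public GiftCertificate(string code, decimal initialAmount)
    {
        Code = code;
        InitialAmount = initialAmount;
    }

    // Derived on every read, so it can never be stale.
    public decimal AvailableBalance
    {
        get { return InitialAmount - _redeemedAmounts.Sum(); }
    }

    public void RecordRedemption(decimal amount)
    {
        if (amount <= 0)
            throw new ArgumentOutOfRangeException(nameof(amount));
        if (amount > AvailableBalance)
            throw new InvalidOperationException("Amount exceeds available balance.");
        _redeemedAmounts.Add(amount);
    }
}
```

If certificates accumulate many redemptions, persisting the balance and updating it on change is the other side of the trade-off mentioned above.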
Even actions like redeeming a gift certificate are basically 100% data-centric (take some input parameters, validate them and add a transaction log entry to db).
This action would be modeled as creating a CertificateRedeemed event. This event would probably be created by the Certificate aggregate or some other object. This blog post by Udi Dahan can be helpful.
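A minimal sketch of that idea, assuming the Certificate aggregate is the one creating the event. `CertificateRedeemed` and `Certificate` are named above; every property and the `Redeem` signature are assumptions. The event has no setters, so once created it cannot change.

```csharp
using System;

// Immutable event: all state is set in the constructor.
public sealed class CertificateRedeemed
{
    public string CertificateCode { get; }
    public decimal Amount { get; }
    public DateTime RedeemedAtUtc { get; }

    public CertificateRedeemed(string certificateCode, decimal amount, DateTime redeemedAtUtc)
    {
        CertificateCode = certificateCode;
        Amount = amount;
        RedeemedAtUtc = redeemedAtUtc;
    }
}

public class Certificate
{
    public string Code { get; }
    public decimal Balance { get; private set; }

    public Certificate(string code, decimal balance)
    {
        Code = code;
        Balance = balance;
    }

    // The aggregate enforces its own rule and hands back the event;
    // the caller (e.g. an application service) decides how to persist it.
    public CertificateRedeemed Redeem(decimal amount, DateTime redeemedAtUtc)
    {
        if (amount <= 0 || amount > Balance)
            throw new InvalidOperationException("Redemption amount is not valid for this certificate.");
        Balance -= amount;
        return new CertificateRedeemed(Code, amount, redeemedAtUtc);
    }
}
```

The transaction log entry from the question then becomes the persisted form of this event rather than something the service layer builds by hand.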
This is not an entirely easy question to answer given the fact that domain models are very subjective, and rely a lot on your...well, domain. It sounds like you are actually creating something similar to The Onion Architecture (and Part 2) described by Jeffrey Palermo. This is not a bad pattern to use, though DDD purists will tell you it leads to "anemic" domain models (where your domain objects are basically data holders with no behavior). The thing is, that may be exactly what you need in your scenario. A "full, rich" domain model may be overkill for what you are doing (and given your last bullet point it sounds like that could be the case).
You may not need a domain model for your system at all. You could be well served with some View Models (that is, simple data models to describe your view) and have your UI send some DTOs through your services to put the data in the database. If you find something that requires a more complex approach, then you can apply a richer domain model to that component. Also remember that you don't necessarily have one domain model in your system. There can, and in many cases should, be different models that describe things differently (often grouped into Bounded Contexts). The overall goal of DDD is to simplify otherwise complex systems. If it's causing you additional complexity, then you may be taking the long way round.
There is an approach called DCI (data-context-interactions) which is supposed to be alternative to the old school OOP. Although it does not address explicitly the issue of persistence ignorance, your question brought it to my mind, because it deals with similar issues.
In DCI domain objects are small data-holders with only a little logic, like in your case, and interactions between them are implemented separately. The algorithm of interaction is not spread through small methods of several objects, but it is in one place, which might make it more lucid and understandable.
I think it is still rather academic thing than a solution we should start implementing tomorrow, but someone who comes across this question might be interested.
Hey, I have a Silverlight application that allows the user to modify their username, password, bio etc. This information is stored in a MySQL database and retrieved using a WCF web service.
I need to sanitize all information received from the user before it gets into the database. At the moment I can't store apostrophes in my DB. Where is the best place to sanitize the input (silverlight or WCF methods) and how do I go about it?
BTW, I am not worried about SQL injection as I will be implementing parametrized queries in a few days.
Thanks
The correct answer here is somewhat of a matter of architectural preference. This type of user input validation is a system rule. Many would say that all rule implementation should be done on the service side. From a strict separation of concerns point of view all rules should be enforced in the business logic on the service side of the system.
But when this kind of validation is handled on the client, more immediate feedback can be given to the user, resulting in a more usable interface, with the added benefit of not producing any network traffic merely for the purpose of telling the user that they pressed the wrong key.
In the end neither approach is wrong. The 'best' approach can really only be determined by what you want for your system. Architectural purity vs. user responsiveness.
You are right to use parameterized queries. Alternatively, you could use an ORM and also get the SQL injection protection.