I'm using a three-tier architecture with C# and a SQL Server database as the data source. According to the DRY principle, validation should be done in one place only, which in my case is either the front-end data access layer or the database stored procedures.
So I was wondering: should I validate the stored procedure parameters in the data access layer, or leave that to the stored procedure itself?
DRY is an important principle, but so is defence in depth.
When it comes to validating input, you must ensure it is safe - this should be done at each and every level (so in both the DAL and the stored procedure).
As for validating data for business logic, this should be in your business logic layer (BLL).
If you are using a three-tier architecture, I would recommend you investigate using an ORM instead, such as NHibernate or LINQ to Entities. An ORM will provide you with better refactorability and hence maintainability (maintainability to me is the most important thing, as in my experience it leads to quality in the longer run).
It is not wise to put your validation only in the UI; it is safer to have your security down in your DAL (data access layer) than in your UI, where it can more easily be bypassed (accidentally or on purpose). Think about SQL injection: you should validate against this in your data access layer rather than only in your UI, because it is easy to miss in the UI and easy to bypass for a malicious user trying to gain access to data they are not allowed to see.
I think it can make sense to have validation in the UI for usability and in the data access layer for safety. I do like the DRY principle of doing validation in one place, and you can still do that: if you make a common set of rules that is propagated to both the data access layer and the UI, you will have a system that is both safe and usable (through immediate feedback on data entry). Another way could be to have different rules for different layers - for example, field-length rules and data-entry patterns could be UI specific, while the DAL enforces that the data is valid. That is doing validation in multiple places, but as long as they are not independently doing the same thing, I think you will be OK. This is one of the hardest areas to get right when designing an application, because validation is a cross-cutting concern, and how you do it depends a lot on how you structure the rest of your application design.
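As a rough illustration of that "common set of rules" idea, here is a minimal sketch (the class and rule names are made up): a single static rule class that both the UI validators and the DAL can call before a stored procedure is ever invoked, so the rules themselves are defined only once.

    using System.Text.RegularExpressions;

    // Hypothetical shared rule set: the rule definitions live in one class (DRY),
    // while both the UI layer and the data access layer call into it before acting.
    public static class CustomerRules
    {
        public const int NameMaxLength = 50;
        private static readonly Regex PostcodePattern = new Regex(@"^\d{4,10}$");

        public static bool IsValidName(string name)
        {
            return !string.IsNullOrWhiteSpace(name) && name.Length <= NameMaxLength;
        }

        public static bool IsValidPostcode(string postcode)
        {
            return postcode != null && PostcodePattern.IsMatch(postcode);
        }
    }

The UI can use these rules to give immediate feedback, while the DAL re-checks them (and parameterizes its queries) before touching the database.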
Trying to build a sample project using DDD, I'm facing an issue:
To validate zip codes, addresses, etc., I have a set of DB tables (20 tables, hundreds of columns, about 26 MB) that I would like to query.
Those tables are not related to my domain. They have their own connection string and can be stored outside of the persistence DB.
I was thinking of adding a connection string to the Core and using a simple ORM or raw SQL query to validate the data.
The process is easier to write in C# than in SQL, so there is no stored procedure to do the job.
There is no modification of that data - only querying.
I think it's important to remember that DDD doesn't have to apply to everything you do. If you have a complex problem domain that is worthy of the complexities DDD brings, that's fine. However, it's also fine to have other areas of your software (other boundaries, essentially) that are CRUD. In fact, CRUD is best where you can get away with it, because of its simplicity. As @D.R. said, you can load the data using something more akin to a Transaction Script (I can see something like IZipCodeValidator in your future) and pass the results of that in where you need them, or you might consider allowing your Application Service to go and get that zip code data using CRUD (IZipCodeRepository) and passing it in to a full-on Domain Object that has complex rules for the validation.
I believe it's a DDD purist view to try and avoid passing things to methods on Domain Objects that do things (e.g. DomainObject.ValidateAddress(address, IZipCodeRepository repo)), instead preferring to pass in the values useful for the validation (e.g. DomainObject.ValidateAddress(address, IEnumerable<ZipCode> zipcodes)). I think anyone could see the potential for performance issues there, so your mileage may vary. I'll just say to resist it if you can.
This sounds like a bounded context of its own. I'd probably query it from the core domain using an anti-corruption layer in-between. So your domain simply uses an interface to a service. Your application layer would implement this interface with the anti-corruption layer to the other bounded context. The implementation can use simple DB query mechanisms like ADO.NET queries.
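To make that concrete, here is a minimal sketch of what the interface plus its ADO.NET-backed implementation could look like; the IZipCodeValidator name is borrowed from the earlier answer, and the ZipCodes table and Code column are assumptions.

    using System.Data.SqlClient;

    // In the core/domain project: the domain only depends on this interface.
    public interface IZipCodeValidator
    {
        bool IsKnownZipCode(string zipCode);
    }

    // In the application/infrastructure project: the anti-corruption layer that
    // queries the reference tables directly over their own connection string.
    public class SqlZipCodeValidator : IZipCodeValidator
    {
        private readonly string _connectionString;

        public SqlZipCodeValidator(string connectionString)
        {
            _connectionString = connectionString;
        }

        public bool IsKnownZipCode(string zipCode)
        {
            using (var connection = new SqlConnection(_connectionString))
            using (var command = new SqlCommand(
                "SELECT COUNT(1) FROM ZipCodes WHERE Code = @code", connection))
            {
                command.Parameters.AddWithValue("@code", zipCode);
                connection.Open();
                return (int)command.ExecuteScalar() > 0;
            }
        }
    }

The core project never sees the reference database; it only sees the interface, which keeps the lookup tables entirely outside the domain.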
I have been pondering this problem for a while now and cannot think of an acceptable solution. I have an application that is planned to become very large, so I am trying to make it modular. It is based on MVC 4. I have not decided whether to use an ORM or to map everything myself. I would like to have the following structure:
----------------------
| Database
----------------------
| Data/Data Access Layer (Class Library) (Objects reside here)
----------------------
| Core MVC Project (User and Session are stored here)
----------------------
| MVC Modules
I want to keep the validation of the UpdatedBy field as close to the database as possible, possibly in the Data/Data Access layer. The problem is that I want to store the user in the Session but do the validation in the class library (where there is no Session). I also want to avoid passing the user all over the place as much as possible. Is there a way to store the user in the Session and have the Data Access layer access that info without being passed the user? Does anybody have any recommendations on how to do this elegantly?
EDIT: I want to keep validation and CRUD activities as close to the Data layer as possible, where the Core MVC project just calls Save() on an object and the Data layer validates the object, figures out what user modified or created it, and saves it to the DB.
EDIT 2: It is imperative that the Data layer have absolutely no dependencies on the MVC layer.
The LastUpdated field can easily be implemented with a trigger on DB inserts/updates, but UpdatedBy is a bit trickier.
A key question is: does your business layer require knowledge of who is using it? If so, the interfaces can be designed to require that a username is provided when performing actions. If not, you need to make the data accessible from within/behind the business layer without it being explicitly provided to it (such as with Dependency Injection, or by providing a Context that is available throughout).
You could consider creating a separate audit trail using ActionFilters around your controller actions, which provide easy access to the Session and can build a running history of the actions your users take. This may or may not correspond 100% to your database records, but it does provide a clear history of the actions taken in the application, which is valuable in its own right.
You could also consider using a Command pattern, whereby the application generates specific commands (e.g. an UpdateWidgetName command) that are enacted on the business/data layer. In some regards this is how MVC already works, but having an explicit Command which captures the user and date is still a useful addition to your business layer.
Also be aware of the shortcomings of keeping this on the record itself. You'll only know who last edited the record--you won't be able to tell specifically what they edited, or who edited it previously. For relatively simple scenarios this is usually sufficient, but it is far from providing actual historical data for the record.
If you really want 100% auditing you should look at the Event Sourcing design pattern, where effectively, if an action isn't audited then it didn't happen. It's a very different paradigm from the typical CRUD approach, but it is very powerful (albeit more complicated to design initially).
One other note: consider separating your business and persistence code into two layers. Tying them together makes the business logic tightly coupled to persistence (bad), which prevents it from being reused. Look into implementing a Repository that is dedicated to persisting and retrieving your business objects. It pays off.
If you use a structure like this in your application, you can define some core interfaces that can be used throughout your application (like ICurrentUserProvider), and then you can implement those interfaces in the parts of your application where they are best implemented, without creating a tight coupling or dependency to that specific part of the application.
When your web project is initialized, it can initialize your DI framework so that your controllers get their dependencies injected into them. That way your controller gets the Business Layer services it needs, and those Business Layer services have the data-layer implementations they need (without actually having a direct dependency on them), and the data access object gets the service that can tell it who the current user is (without depending directly on the MVC layer).
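For illustration, here is a minimal sketch of that arrangement; the ICurrentUserProvider interface name comes from the paragraph above, while the Session key, the Widget type and the repository are hypothetical.

    using System;
    using System.Web;

    // Defined in the data/core class library, with no reference to the MVC project.
    public interface ICurrentUserProvider
    {
        string GetCurrentUserName();
    }

    // Implemented in the MVC project, where the Session is available.
    public class SessionUserProvider : ICurrentUserProvider
    {
        public string GetCurrentUserName()
        {
            var user = HttpContext.Current.Session["CurrentUser"] as string;
            return user ?? "unknown";
        }
    }

    // The data layer receives the provider via constructor injection and never
    // sees the MVC layer directly.
    public class WidgetRepository
    {
        private readonly ICurrentUserProvider _userProvider;

        public WidgetRepository(ICurrentUserProvider userProvider)
        {
            _userProvider = userProvider;
        }

        public void Save(Widget widget)
        {
            widget.UpdatedBy = _userProvider.GetCurrentUserName();
            widget.LastUpdated = DateTime.UtcNow;
            // ... validate and persist the widget here
        }
    }

    public class Widget
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public string UpdatedBy { get; set; }
        public DateTime LastUpdated { get; set; }
    }

The DI container registers SessionUserProvider against ICurrentUserProvider at startup, so the class library never takes a dependency on the MVC assemblies.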
I'm working on a project which has basically three layers: Presentation, business and data.
Each layer is in a different project, and all layers use DTOs defined in another project.
The business layer and data layer return DTOs or lists of DTOs when querying the database.
So far so good, but now we have to query views, and those views of course do not match an existing DTO. What we have done until now is just create a special DTO plus business- and data-layer classes, so the views are treated like normal entities (minus insert, update, etc.).
But it does not seem correct. Why should they be treated like normal entities when they clearly are not? The DTO seems necessary, but creating a "business logic" class and a data-layer class for every view seems rather awkward. So I thought I'd create one generic business- and data-layer class which holds the logic/code for all views (I would still have to create a DTO for every different view; perhaps I could use anonymous types).
What do you think about my idea, or how would you solve this issue?
EDIT (9 August 2011):
Sorry, the post may have been unclear.
By views I meant views in SQL Server.
I feel your pain completely. The fact is that in almost every non-trivial project of decent complexity, you will get to the point where the things you have to show to the users in the UI overlap with, aggregate, or are simply a subset of the data of business entities. The way I tend to approach this is to accept this fact and go even further: separate the query side from the business logic side, both logically and physically. The fact is that you need your entities only for actual business operations and for keeping the business constraints valid, and when does that happen? Only when someone changes the data. So there is no need even to build entities when you display the data.
The way I like to structure the solutions is:
User opens the view -> a query is performed only to get the specific data for the view -> the returned data is the model (although you could call it a DTO as well; in this case it's the same thing).
User changes something -> the controller (or service) builds the full entity from the repo, a business logic action is performed on the entity -> changes are persisted -> the result is returned.
What I want to say is: it is OK to treat your read side separately from your write side. It is OK to have different infrastructure for them as well. When you start to treat them differently, you will see the benefits - for example, you can tailor your queries to exactly what you need in the UI.
You can even get to the point where your infrastructure allows you to build your queries with different techniques, for example using LINQ or plain SQL queries - whatever is best for a given scenario.
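As a sketch of what the read side can look like in practice (the vw_OrderSummaries view, its columns and the class names are all assumptions), a small query class can run a plain parameterized query against a SQL view and return exactly the model the screen needs, with no entity in sight:

    using System.Collections.Generic;
    using System.Data.SqlClient;

    // Read side only: no entity is built; the query returns exactly what the view needs.
    public class OrderSummaryModel
    {
        public int OrderId { get; set; }
        public string CustomerName { get; set; }
        public decimal Total { get; set; }
    }

    public class OrderSummaryQuery
    {
        private readonly string _connectionString;

        public OrderSummaryQuery(string connectionString)
        {
            _connectionString = connectionString;
        }

        public List<OrderSummaryModel> ForCustomer(int customerId)
        {
            var results = new List<OrderSummaryModel>();
            using (var connection = new SqlConnection(_connectionString))
            using (var command = new SqlCommand(
                "SELECT OrderId, CustomerName, Total FROM vw_OrderSummaries WHERE CustomerId = @id",
                connection))
            {
                command.Parameters.AddWithValue("@id", customerId);
                connection.Open();
                using (var reader = command.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        results.Add(new OrderSummaryModel
                        {
                            OrderId = reader.GetInt32(0),
                            CustomerName = reader.GetString(1),
                            Total = reader.GetDecimal(2)
                        });
                    }
                }
            }
            return results;
        }
    }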
I would advise against using DTOs between layers. I'm not convinced that there's any benefit, but I'll be happy to take instruction if you think you have some.
The harm comes in maintaining multiple parallel hierarchies that express the same idea (business objects plus multiple DTOs between layers). It means lots more code to maintain and greater possibility of errors.
Here's how I'd layer applications:
view <--- controller <--- service <--- + <--- model
                                       + <--- persistence
This design decouples views from services; you can reuse services with different views. The service methods implement use cases, validate inputs according to business rules, own units of work and transactions, and collaborate with model and persistence objects to fulfill requests.
Controller and view are tightly coupled; change the view, change the controller. The view does nothing other than render data provided by the controller. The controller is responsible for validation, binding, choosing the appropriate services, making response data available, and routing to the next view.
Cross cutting concerns such as logging, transactions, security, etc. are applied at the appropriate layer (usually the services).
Services and persistence should be interface-based.
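A bare-bones sketch of those interface-based seams (all names here are illustrative, not taken from the question): the service method implements the use case, validates its inputs, and collaborates with persistence only through an interface, so either side can be swapped or mocked.

    using System;

    public interface ICustomerService
    {
        Customer Register(string name, string email);
    }

    public interface ICustomerRepository
    {
        void Add(Customer customer);
        Customer FindByEmail(string email);
    }

    public class CustomerService : ICustomerService
    {
        private readonly ICustomerRepository _repository;

        public CustomerService(ICustomerRepository repository)
        {
            _repository = repository;
        }

        public Customer Register(string name, string email)
        {
            // Business-rule validation belongs to the service, not the controller or view.
            if (string.IsNullOrWhiteSpace(email))
                throw new ArgumentException("Email is required.", "email");
            if (_repository.FindByEmail(email) != null)
                throw new InvalidOperationException("Email is already registered.");

            var customer = new Customer { Name = name, Email = email };
            _repository.Add(customer);
            return customer;
        }
    }

    public class Customer
    {
        public string Name { get; set; }
        public string Email { get; set; }
    }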
I've dropped most layered architectures like this, as managing all the transformations is a pain and the result is over-complicated. It's typical astronaut architecture. I've been using the following:
View models for forms/views in ASP.NET MVC. This is an important decoupling step. The UI will typically evolve separately from the model.
No service layer, instead replacing it with "command handlers" (mutating operations) and "finders" (query operations) which represent small operations and queries respectively (CQS - Command Query Separation).
Model persistence with NHibernate and ALL domain logic inside the model.
Any external services talk to the finders and command handlers as well.
This leads to a very flat manageable architecture with low coupling and all these problems go away.
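A minimal sketch of the shape this takes (the type names are made up): small command handlers for mutating operations and finders for queries, instead of one broad service interface.

    using System.Collections.Generic;

    // A command handler mutates state; a finder only reads.
    public interface ICommandHandler<TCommand>
    {
        void Handle(TCommand command);
    }

    public class RenameWidgetCommand
    {
        public int WidgetId { get; set; }
        public string NewName { get; set; }
    }

    public class RenameWidgetHandler : ICommandHandler<RenameWidgetCommand>
    {
        public void Handle(RenameWidgetCommand command)
        {
            // Load the aggregate (e.g. via NHibernate), call its domain logic, persist.
            // The handler stays thin; the business rules live inside the model.
        }
    }

    public interface IWidgetFinder
    {
        IList<WidgetListItem> FindByName(string nameFragment);
    }

    public class WidgetListItem
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }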
First off, I am using web forms without any ORM framework.
I have been struggling with how to make my domain objects as "smart" and "rich" as they can be without allowing them access to my service and repository layers. My most recent attempt was in creating a model for gift certificates for an online store.
The main recurring issues that I am seeing are:
More and more logic keeps being introduced in the service layer. All the calls to the repository must pass through the service layer, and each time the parameters are validated (e.g. exists in DB, etc.). As a result my service layer keeps growing, but my domain objects just have some simple contractual validations. Even object validation is in the service layer, since if the ID of the item is null, it will check the DB to ensure that the code is unique. IMHO, the consumer of the system should not care whether the functionality they need deals with persistence or not.
I have a separate POCO for transaction log entries for when a gift certificate is redeemed. I assume that I should put a list or collection of these transactions as a property of my Gift Certificate model, but I am still unsure of when that property should be filled. Do I add a separate method on the service for loading the transactions into an object on demand (e.g. LoadTransactions(gc object)), or should the transactions be loaded automatically any time an existing gift certificate or list of gift certificates is requested (or maybe an option in getGCs to load transactions as well)?
What about computed fields like "Available Balance"... should I even have properties like this on my object? Any time I am working with the object, I would need to keep updating that property to ensure it is up to date. Right now I simply have a service method GetBalanceByCode(gc code).
Even actions like redeeming a gift certificate are basically 100% data-centric (take some input parameters, validate them and add a transaction log entry to db).
More and more logic keeps being introduced in the service layer (...) Even object validation is in the service layer (...)
Validation is not the best candidate for a domain model element. Input (my personal preference is that it's represented as commands) should be validated at the application service level. Domain logic should model how the business works and assume that all the arguments are valid. Good candidates for domain logic are computations, for example: you want to have them in one single place and have them well tested.
I have a separate POCO for transaction log entries for when a gift certificate is redeemed.
This kind of object is known as an Event. You can learn about Events from Eric Evans' 'What I learnt since the Blue Book' presentation. An Event is basically an entity which is immutable. Events are quite often aggregates of their own, because usually there are lots of them. By making them aggregates, you don't have any problems with lazy loading them as part of another object's collection.
What about computed fields like "Available Balance"... should I even have properties like this on my object?
Computed properties are a kind of logic that naturally fits in the domain model; however, it's debatable whether the better approach is to compute the value each time or to compute it when the object changes and persist it in the DB.
Even actions like redeeming a gift certificate are basically 100% data-centric (take some input parameters, validate them and add a transaction log entry to the DB).
This action would be modeled as creating a CertificateRedeemed event. This event would probably be created by the Certificate aggregate or by some other object. This blog post by Udi Dahan can be helpful.
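A rough sketch of that shape (the property names and the event-raising style are assumptions, not the only way to do it): the redemption becomes an immutable event raised by the certificate aggregate, and the balance is computed from the redemptions instead of being a field that has to be kept in sync.

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Immutable event: set once in the constructor, never modified afterwards.
    public class CertificateRedeemed
    {
        public CertificateRedeemed(Guid certificateId, decimal amount, DateTime redeemedOnUtc)
        {
            CertificateId = certificateId;
            Amount = amount;
            RedeemedOnUtc = redeemedOnUtc;
        }

        public Guid CertificateId { get; private set; }
        public decimal Amount { get; private set; }
        public DateTime RedeemedOnUtc { get; private set; }
    }

    public class GiftCertificate
    {
        private readonly List<CertificateRedeemed> _redemptions = new List<CertificateRedeemed>();

        public GiftCertificate(Guid id, decimal originalAmount)
        {
            Id = id;
            OriginalAmount = originalAmount;
        }

        public Guid Id { get; private set; }
        public decimal OriginalAmount { get; private set; }

        // "Available Balance" as a computation rather than a stored field.
        public decimal AvailableBalance
        {
            get { return OriginalAmount - _redemptions.Sum(r => r.Amount); }
        }

        public CertificateRedeemed Redeem(decimal amount)
        {
            if (amount <= 0 || amount > AvailableBalance)
                throw new InvalidOperationException("Invalid redemption amount.");

            var redeemed = new CertificateRedeemed(Id, amount, DateTime.UtcNow);
            _redemptions.Add(redeemed);
            return redeemed;
        }
    }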
This is not an entirely easy question to answer, given that domain models are very subjective and rely a lot on your... well, domain. It sounds like you are actually creating something similar to The Onion Architecture (and Part 2) described by Jeffrey Palermo. This is not a bad pattern to use, though DDD purists will tell you it leads to "anemic" domain models (where your domain objects are basically data holders with no behavior). The thing is, that may be exactly what you need in your scenario. A "full, rich" domain model may be overkill for what you are doing (and given your last bullet point, it sounds like that could be the case).
You may not need a domain model for your system at all. You could be well served with some View Models (that is, simple data models that describe your view) and have your UI send some DTOs through your services to put the data in the database. If you find something that requires a more complex approach, then you can apply a richer domain model to that component. Also remember that you don't necessarily have one domain model in your system. There can, and in many cases should, be different models that describe things differently (often grouped into Bounded Contexts). The overall goal of DDD is to simplify otherwise complex systems. If it's causing you additional complexity, then you may be taking the long way round.
There is an approach called DCI (Data, Context and Interaction) which is supposed to be an alternative to old-school OOP. Although it does not explicitly address the issue of persistence ignorance, your question brought it to my mind, because it deals with similar issues.
In DCI, domain objects are small data holders with only a little logic, as in your case, and the interactions between them are implemented separately. The algorithm of an interaction is not spread across small methods of several objects; it sits in one place, which might make it more lucid and understandable.
I think it is still more an academic idea than a solution we should start implementing tomorrow, but someone who comes across this question might be interested.
If I have a 3-layer Web Forms application that takes user input, I know I can validate that input using validation controls in the presentation layer. Should I also validate in the business and data layers to protect against SQL injection and other issues? What validations should go in each layer?
Another example would be passing an ID to return a record. Should the data layer ensure that the ID is valid, or should that happen in the BLL / UI?
You should validate in all layers of your application.
What validation occurs at each layer is specific to the layer itself. Each layer should be safe to send "bad" requests to and get a meaningful response from, but which checks to perform at each layer will depend on your specific requirements.
Broadly:
User Interface - Should validate user input, provide helpful error messages and visual clues to correcting them; it should be protecting your lower layers against invalid user input.
Business / Domain Layer - Should check arguments to methods are valid (throwing ArgumentException and similar when they are not) and should check that operations are possible within the constraints of your business rules; it should be protecting your domain against programming mistakes.
Data Layer - Should check the data you are trying to insert or update is valid within the context of your database, that it meets all the relational constraints and check constraints; it should be protecting your database against mistakes in data-access.
Validation at each layer will ensure that only data and operations the layer believes to be correct are allowed to enter. This gives you a great deal of predictability, knowing information had to meet certain criteria to make it through to your database, that operations had to be logical to make it through your domain layer, and that user input has been sanitized and is easier to work with.
It also gives you security knowing that if any of your layers was subverted, there is another layer performing checks behind it which should prevent anything entering which you don't want to.
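As a small illustration of the business/domain-layer portion of this (all of the names below are invented for the example): argument checks guard against programming mistakes, a business rule guards the operation itself, and the data layer behind the repository still enforces its own relational and check constraints.

    using System;

    public class AccountService
    {
        private readonly IAccountRepository _repository;

        public AccountService(IAccountRepository repository)
        {
            if (repository == null) throw new ArgumentNullException("repository");
            _repository = repository;
        }

        public void Withdraw(int accountId, decimal amount)
        {
            // Protect against programming mistakes.
            if (amount <= 0)
                throw new ArgumentOutOfRangeException("amount", "Amount must be positive.");

            var account = _repository.GetById(accountId);
            if (account == null)
                throw new ArgumentException("Unknown account.", "accountId");

            // Protect the business rule.
            if (account.Balance < amount)
                throw new InvalidOperationException("Insufficient funds.");

            account.Balance -= amount;
            _repository.Update(account);
        }
    }

    public interface IAccountRepository
    {
        Account GetById(int accountId);
        void Update(Account account);
    }

    public class Account
    {
        public int Id { get; set; }
        public decimal Balance { get; set; }
    }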
Should I also validate in the business and data layers as well to protect against SQL injection and also issues?
Yes and Yes.
In your business layer code, you need to validate the input again (as the client side can be spoofed), and also validate against your business logic, making sure the entries make sense for your application.
As for the data layer - you again need to ensure the data is valid for the DB. Use parameterized queries, as this will pretty much ensure no SQL injection can happen.
As for your specific question regarding the ID - the DB will know whether an ID exists or not. Whether that is valid or not depends on whether it has meaning for your business layer. If it is purely a DB artefact (not part of your object model), then the DB should handle it; if it is part of your object model and has significance to it, the business layer should handle it.
You absolutely need to validate in your business and data layers. The UI is an untrusted layer, it is always possible for somebody to bypass your client-side validation and in some cases your server-side UI validation.
Preventing SQL injection is simply a matter of parameterizing your queries. The phrase "SQL injection" shouldn't even exist anymore; it's been a solved problem for years and years, and yet every day I see people writing queries using string concatenation. Don't do this. Parameterize the commands and you will be fine.
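For completeness, a minimal sketch of what that looks like with plain ADO.NET (the table and column names are assumptions): the input travels as a parameter value, never as part of the SQL text, so it cannot change the structure of the query.

    using System.Data.SqlClient;

    public class CustomerDataAccess
    {
        private readonly string _connectionString;

        public CustomerDataAccess(string connectionString)
        {
            _connectionString = connectionString;
        }

        public Customer GetById(int customerId)
        {
            using (var connection = new SqlConnection(_connectionString))
            using (var command = new SqlCommand(
                "SELECT Id, Name FROM Customers WHERE Id = @id", connection))
            {
                command.Parameters.AddWithValue("@id", customerId);
                connection.Open();
                using (var reader = command.ExecuteReader())
                {
                    if (!reader.Read())
                        return null; // the caller decides what a missing ID means

                    return new Customer
                    {
                        Id = reader.GetInt32(0),
                        Name = reader.GetString(1)
                    };
                }
            }
        }
    }

    public class Customer
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }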
One of the main reasons you separate your app into multiple tiers is so that each tier is reusable. If individual tiers don't do their own validation, then they are not autonomous and you don't have proper separation of concerns. You also can't do any thorough testing without individual components doing built-in validation.
I tend to relax these restrictions for classes or methods that are internal or private because they're not getting directly tested or used. As long as the public API is fully-validated, private APIs can generally assume that the class is in a valid state.
So, basically, yes, every layer, in fact every public class and method needs to validate its own data/arguments.
Semantic validation, like checking whether or not a particular Customer ID is valid, is going to depend on your design requirements. Obviously the business layer has no way of knowing whether or not an ID exists until that ID actually hits the data layer, so it can't perform this check in advance. Whether it throws an exception for a missing ID or simply returns null/ignores the error depends on exactly what the class/method is designed to do.
However, if this ID needs to be in a special format - for example, maybe you're using specially-coded account numbers ("R-12345-A-678") - then it does become the responsibility of the domain/business layer to validate the input and make sure it conforms to the correct format, especially if the consumer of your business class is trying to create a new account.
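A tiny sketch of that kind of format rule (the pattern simply mirrors the made-up example above):

    using System.Text.RegularExpressions;

    // Business-layer format check for the hypothetical "R-12345-A-678" account number,
    // enforced before the value ever reaches the data layer.
    public static class AccountNumber
    {
        private static readonly Regex Format =
            new Regex(@"^R-\d{5}-[A-Z]-\d{3}$", RegexOptions.Compiled);

        public static bool IsWellFormed(string accountNumber)
        {
            return accountNumber != null && Format.IsMatch(accountNumber);
        }
    }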
No layer should trust data coming from another layer. The analogy I use for this is one of fiefdoms. Say you want to send a message to the king. If the message is not in the proper format it will be rejected before it ever gets to his ears. You could continue to send messages until you eventually get the format right or you could use an emissary. The job of the emissary is to help you verify that your message will be in the acceptable format so that the king will hear it.
Each layer in a system is a fiefdom. Each layer acts as an emissary to the layer to which it will send data by verifying that it will be accepted. No layer trusts data coming from outside that layer (no one trusts messages from outside the fiefdom). The database does not trust the middle layer. The middle-layer does not trust the database or the presentation layer. The presentation does not trust the user or the middle layer.
So the answer is that absolutely you should check and re-check the data in each layer.
Short answer: yes.
Validate input as it is received in each new layer and before it is acted upon; generally I validate such input just before it gets used or passed on to the next layer (JavaScript checks that it's a valid email and free of malicious input; likewise the business layer does so before constructing a query with it).
To your last question: if the ID returns a record, then it is valid. You would have to look the record up by its ID anyway to confirm whether or not it is valid, so you would be making a lot of unnecessary lookups if you tried to pre-validate it.
I hope that helps.
I do all of my validation at the Presenter layer in Model-View-Presenter. Validation is somewhat tricky because it's really a cross-cutting concern a lot of the time.
I prefer to do it at the presenter layer because I can then short-circuit the call to the model.
The other approach is to do the validation in the model layer, but then there is the issue of communicating errors, because you cannot easily inform other layers of errors aside from throwing exceptions. You can always pack exceptions with data, or create your own custom exception to which you attach a list of error messages or a similar construct, but that has always seemed dirty to me.
Later, when I expose my model through a web service, I will implement double validation, checking both in the Presenter and in the Model, since it will be possible to bypass the presenter layer by calling the web service directly. The other big advantage of this is that it decouples my presenter-layer validations from the model, since the model might only require raw validation of types to match the database, whereas for users of my UI I want more granular rules about what they input, not just what they physically can enter.
Other questions: the SQL injection portion is a model concern and should not be in any middle layers. However, most SQL injection attacks are completely nullified when text fields don't allow special characters. The other part of this is that you should almost always be using parameterized SQL, which renders SQL injection unusable.
The question about the ID is a model concern as well: either it can get a record with that ID, or it should return null or throw a record-not-found exception, depending on what convention you wish to establish.