Hey, I have a Silverlight application that allows the user to modify their username, password, bio etc. This information is stored in a MySQL database and retrieved using a WCF web service.
I need to sanitize all information received from the user before it gets into the database. At the moment I can't store apostrophes in my DB. Where is the best place to sanitize the input (Silverlight or WCF methods) and how do I go about it?
BTW, I am not worried about SQL injection as I will be implementing parametrized queries in a few days.
Thanks
The correct answer here is somewhat of a matter of architectural preference. This type of user input validation is a system rule. Many would say that all rule implementation should be done on the service side. From a strict separation of concerns point of view all rules should be enforced in the business logic on the service side of the system.
But when this kind of validation is handled on the client, more immediate feedback can be given to the user, resulting in a more usable interface. It also has the added benefit of not producing any network traffic merely to tell the user that they pressed the wrong key.
In the end neither approach is wrong. The 'best' approach can really only be determined by what you want for your system: architectural purity vs. user responsiveness.
You are right to use parameterized queries. Alternatively, you could use an ORM and also get the SQL injection protection.
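As a rough illustration of the parameterized-query point, here is a minimal sketch in Python. It assumes the standard sqlite3 module as a stand-in for MySQL and for the WCF data access code; the table, column and function names are made up. The point is only that a bound value may contain apostrophes without any manual escaping.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database (assumption);
# the "users" table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT PRIMARY KEY, bio TEXT)")


def save_bio(conn, username, bio):
    # The "?" placeholders send the values separately from the SQL text,
    # so quotes in the data are never interpreted as SQL syntax.
    conn.execute(
        "INSERT OR REPLACE INTO users (username, bio) VALUES (?, ?)",
        (username, bio),
    )


def load_bio(conn, username):
    row = conn.execute(
        "SELECT bio FROM users WHERE username = ?", (username,)
    ).fetchone()
    return row[0] if row else None


save_bio(conn, "o'brien", "It's a bio with 'quotes' in it")
stored_bio = load_bio(conn, "o'brien")
```

With this approach the apostrophe problem goes away on the service side, and the client is free to validate purely for usability.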
Trying to build a sample project using DDD, I'm facing an issue:
To validate zip codes, addresses, etc., I have a set of DB tables (20 tables, hundreds of columns, 26 MB) that I would like to query.
Those tables are not related to my domain. They have their own connection string and can be stored outside of the persistence DB.
I was thinking of adding a connection string to the Core and using a simple ORM raw SQL query to validate the data.
The process is easier to write in C# than in SQL, so there is no stored procedure to do the job.
There is no modification of that data, only querying.
I think it's important to remember that DDD doesn't have to apply to everything you do. If you have a complex problem domain that is worthy of the complexities DDD brings, that's fine. However, it's also fine to have other areas of your software (other boundaries, essentially) that are CRUD. In fact, CRUD is best where you can get away with it because of the simplicity. As @D.R. said, you can load data using something more akin to a Transaction Script (I can see something like IZipCodeValidator in your future) and pass the results of that in where you need them, or you might consider allowing your Application Service to go and get that ZipCode data using CRUD (IZipCodeRepository) and pass it in to a full-on Domain Object that has complex rules for the validation.
I believe it's a DDD purist view to try and avoid passing things to methods on Domain Objects that do things (e.g. DomainObject.ValidateAddress(address, IZipCodeRepository repo)), instead preferring to pass in the values useful for the validation (e.g. DomainObject.ValidateAddress(address, IEnumerable<ZipCode> zipcodes)). I think anyone could see the potential for performance issues there, so your mileage may vary. I'll just say to resist it if you can.
This sounds like a bounded context of its own. I'd probably query it from the core domain using an anti-corruption layer in-between. So your domain simply uses an interface to a service. Your application layer would implement this interface with the anti-corruption layer to the other bounded context. The implementation can use simple DB query mechanisms like ADO.NET queries.
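The interface-plus-anti-corruption-layer idea could look roughly like the sketch below. It is in Python rather than C#, uses sqlite3 in place of ADO.NET, and every name in it (ZipCodeValidator, ReferenceDbZipCodeValidator, check_address, the zip_codes table) is hypothetical.

```python
import sqlite3
from dataclasses import dataclass
from typing import Protocol


class ZipCodeValidator(Protocol):
    """Interface owned by the core domain (hypothetical name)."""

    def is_valid(self, zip_code: str, city: str) -> bool: ...


class ReferenceDbZipCodeValidator:
    """Anti-corruption layer: turns a domain question into a raw,
    read-only query against the separate reference database."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._connection = connection

    def is_valid(self, zip_code: str, city: str) -> bool:
        row = self._connection.execute(
            "SELECT 1 FROM zip_codes WHERE code = ? AND city = ?",
            (zip_code, city),
        ).fetchone()
        return row is not None


@dataclass(frozen=True)
class Address:
    zip_code: str
    city: str


def check_address(address: Address, validator: ZipCodeValidator) -> bool:
    # Application-level code sees only the interface, never the database.
    return validator.is_valid(address.zip_code, address.city)


# Reference data with its own connection; schema and rows are made up.
reference_db = sqlite3.connect(":memory:")
reference_db.execute("CREATE TABLE zip_codes (code TEXT, city TEXT)")
reference_db.execute("INSERT INTO zip_codes VALUES ('75001', 'Paris')")

validator = ReferenceDbZipCodeValidator(reference_db)
paris_ok = check_address(Address("75001", "Paris"), validator)
```

Swapping the implementation (a cache, a remote service, a test fake) then requires no change to the domain code.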
I'm calling a web service and passing it a string of dynamically-generated SQL. This string contains user input. It is currently being built using simple string concatenation, and is therefore vulnerable to a SQL injection attack. I cannot use the normal parameterized SQL solutions, because I'm not executing the command from my application.
My first attempt was to build a parameterized SqlCommand object. However, there does not appear to be any way to extract the final SQL statement. My second attempt was to use sp_executesql, but that seems to have the same problem as my original code: concatenating a SQL command together with user input.
So, how can I generate the SQL without resorting to writing my own input sanitization logic (i.e. .Replace("'", "''"))? Is there a built-in class or a good third-party library available?
As I understand the question: how can you generate the SQL without resorting to writing your own input sanitization logic? There are several SQL injection mitigation techniques to consider.
Some of these are back-end security measures; others are business and application development rules enforced by the database engine.
Back-End Security: Always apply the “Least Privilege” rule: set up low-privileged database accounts for applications that access the DBMS.
Server-Side Sanitization: On the server side, validate user-supplied data, as well as any data obtained from a potentially unsafe source. Client-side input validation can be useful.
Client-Side: Do not return SQL error messages to users, as they contain information useful to attackers, such as the query or details about the targeted tables or even their content. This can be easily prevented in Java using exception handling.
Client-Side: Encode text input fields likely to contain problematic characters into an alphanumeric version using a two-way function such as Base64.
Client-Side: Be proactive in writing the code to prevent SQL injection. Filter all input data via a 2-step process. First, apply white-list filtering at user input collection (e.g., web forms): allow only field-relevant characters, string formats and data types; restrict string length. Then, apply black-list filtering or escaping in the data access layer before generating SQL queries: escape SQL metacharacters and keywords/operators.
Client or Middle Tier: Validate dynamically generated database object names (e.g. table names) with strict white-list filtering.
Client-Side: Avoid quoted/delimited identifiers, as they significantly complicate all white-listing, black-listing and escaping efforts.
Development: Enforce a process in which developers use a safe API that takes care of security and avoids SQL injection. Do this instead of relying on developers to implement complex defensive coding techniques.
API: Develop an API or middle tier that analyzes the database schema at compile time and writes code for a custom set of SQL query construction classes (which then integrate into the IDE and are directly called by developers to build SQL queries). The result is a tree-like structure based on a generic template, mapping the possible variations of SQL queries according to table and column definitions. There are 3 main types of classes: SQL statements, table columns and where conditions. These classes have strongly typed methods mapping the data types in the database schema. The attack surface is reduced. The proposed API would not execute queries, as you specified in your question; it only generates the SQL. First, the proposed API would check data types against its mappings upon input value submission. Second, the query would be pre-compiled by the DBMS-specific driver using JDBC's PreparedStatement interface with bound variables. Any error in either step will prevent the query's execution.
The proposed API design would address server-side validation and SQL error interception. Strong typing is directly enforced, while text input encoding and 2-step input validation are not needed, as dynamic input is injected through a separate protected data channel (bound variables) via the PreparedStatement interface. Object names are not entered by the user and are routinely validated. A low-privileged database account should, however, still be provided to the proposed API.
In the proposed API, data input entry points do not need to be identified, as protection is applied right before database interaction. Segmented queries are fully supported; their security is ensured as each query modification is validated by the API. White-list filtering and black-listing are unnecessary, as dynamic inputs are specified using bound variables.
In the proposed API, column lengths (e.g. for varchar fields) could be stored in the DB class (as names and data types are), allowing the solution to perform bounds validation for input data and therefore increase its protection level and overall accuracy.
Java-based prototypes of similar API designs are in progress and under current research efforts.
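To make the proposed API a little more concrete, here is a very small sketch of the idea in Python (not one of the Java prototypes mentioned above). The Column and Table classes and the select_where method are hypothetical. The sketch only generates SQL text with placeholders plus a separate list of values; it checks column names against a white list and checks value types and lengths before anything reaches the database.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass(frozen=True)
class Column:
    name: str
    py_type: type
    max_length: Optional[int] = None  # only meaningful for text columns

    def check(self, value: Any) -> None:
        # Strong typing: reject values that do not match the schema mapping.
        if not isinstance(value, self.py_type):
            raise TypeError(f"{self.name} expects {self.py_type.__name__}")
        # Bounds validation using the stored column length.
        if self.max_length is not None and len(value) > self.max_length:
            raise ValueError(f"{self.name} exceeds {self.max_length} characters")


class Table:
    def __init__(self, name: str, *columns: Column) -> None:
        self.name = name
        self.columns = {column.name: column for column in columns}

    def select_where(self, **equals: Any) -> Tuple[str, List[Any]]:
        """Return (sql, params); values never appear in the SQL text."""
        clauses: List[str] = []
        params: List[Any] = []
        for column_name, value in equals.items():
            column = self.columns.get(column_name)
            if column is None:  # white list: only known columns are allowed
                raise KeyError(f"unknown column: {column_name}")
            column.check(value)
            clauses.append(f"{column_name} = ?")
            params.append(value)
        sql = f"SELECT * FROM {self.name}"
        if clauses:
            sql += " WHERE " + " AND ".join(clauses)
        return sql, params


# In the proposal these definitions would be generated from the schema.
users = Table("users", Column("id", int), Column("username", str, max_length=20))
sql, params = users.select_where(username="o'brien")
```

The generated statement and the parameter list would then be handed to whatever executes the query, which binds the values through its prepared-statement mechanism.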
I'm using a three-tier architecture with C# and a SQL Server database as the data source. According to the DRY principle, validation should be done in one place only, which in my case is either the front-end data access layer or the database stored procedures.
So I was wondering whether to validate the stored procedure parameters in the data access layer or leave it to the stored procedure itself?
DRY is an important principle, but so is defence in depth.
When it comes to validating input, you must ensure it is safe - this should be done on each and every level (so both in DAL and stored procedure).
As for validating data for business logic, this should be in your business logic layer (BLL).
If you are using a three-tier architecture, I would recommend you investigate using an ORM such as NHibernate or LINQ to Entities instead. An ORM will provide you with better refactorability and hence maintainability (maintainability, to me, is the most important thing, as it leads to quality in the longer run, based on my experience).
It is not wise to put your validation into the UI, as it is safer to have your security down in your DAL (data access layer) than in your UI, where it can more easily be bypassed (accidentally or on purpose). Think about SQL injection. You should validate against this in your data access layer as opposed to only in your UI, because it is easy to miss in the UI and easy to bypass for a malicious user trying to gain access to data they are not allowed to access.
I think that it might make sense to have validation on the UI for usability, and in the data access layer for safety. I do like the DRY principle of doing validation in one place, and you can still do that. If you make a common set of rules which are propagated through to both the data access layer and the UI, then you will have a safe and usable system (through immediate feedback on data entry). Another way could be to have different rules for different layers. For example, field length rules and data entry patterns could be UI specific, while the DAL enforces that the data is valid. That is doing validation in multiple places, but as long as the layers are not independently doing the same thing, I think you will be OK. This is one of the hardest areas of consideration when designing an application, as validation is a cross-cutting concern and how you do it depends a lot on how you structure the rest of your application design.
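One way to picture the "common set of rules" option is the sketch below, written in Python rather than C#. The RULES table, the field names and limits, and the ui_on_field_changed and dal_save functions are all hypothetical. The rules are written once and called from both layers.

```python
import re

# One rule set, defined once (field names and limits are hypothetical).
RULES = {
    "username": {"max_length": 20, "pattern": r"[A-Za-z0-9_]+"},
    "email": {"max_length": 254, "pattern": r"[^@\s]+@[^@\s]+\.[^@\s]+"},
}


def validate(field, value):
    """Return a list of human-readable problems; an empty list means valid."""
    rule = RULES[field]
    problems = []
    if len(value) > rule["max_length"]:
        problems.append(f"{field} must be at most {rule['max_length']} characters")
    if not re.fullmatch(rule["pattern"], value):
        problems.append(f"{field} has an invalid format")
    return problems


def ui_on_field_changed(field, value):
    # UI layer: the messages are shown to the user immediately.
    return validate(field, value)


def dal_save(record):
    # Data access layer: refuse to persist anything that breaks the same rules.
    for field, value in record.items():
        problems = validate(field, value)
        if problems:
            raise ValueError("; ".join(problems))
    return True  # a real implementation would run the INSERT/UPDATE here
```

The UI gets immediate feedback and the DAL still rejects bad data if the UI is bypassed, yet each rule exists in only one place.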
We are creating a WinForms .NET 4 app with MS SQL Server and we are deciding between two scenarios:
1) The WinForms application connects directly to the MS SQL Server.
2) Use a 3-layer architecture and insert a web services layer in between.
Questions:
1) Is it a good practice to open SQL connection publicly to the "world"?
2) Which scenario would you recommend? The app is data oriented and quite simple, and we are not planning any other client, only the WinForms one.
Thanks in advance.
James
Definitely go with the option having a web services layer. This allows you:
to continue using your domain model (POCO and serialization).
to avoid opening your SQL Server to the internet.
to apply advanced business logic in your web services.
to remove SQL logic from your client application; all the data access belongs on the app tier.
to apply security rules/constraints as you need. Block a customer/user or IP address for various reasons.
When you say "quite simple and not planning any other client", I would take that with a grain of salt. Apps always grow and morph as people realise what they can do and what else they can include. You need to rephrase that as "it is initially going to be a small simple app".
WebServices may be overkill for you at this point in time, but if you follow a nice n-tier architecture they will be very simple to add at a later date, with minimal refactoring.
As for exposing SQL to the world: no, this is NOT a good practice. You can secure it very well, and ensure the logins that are used by the app (or users if they have their own logins) have minimal rights, just enough to run the stored procedures or execute the CRUD statements on the tables they need access to. But if you mess up the security while it is exposed to the world, then kiss your SQL Server and its data goodbye. This is a complex subject in itself, so you are better off posting individual questions when you have them.
If I have a 3-layer web forms application that takes user input, I know I can validate that input using validation controls in the presentation layer. Should I also validate in the business and data layers as well to protect against SQL injection and other issues? What validations should go in each layer?
Another example would be passing an ID to return a record. Should the data layer ensure that the ID is valid or should that happen in the BLL / UI?
You should validate in all layers of your application.
What validation will occur at each layer is specific to the layer itself. Each layer should be safe to send "bad" requests to and get a meaningful response, but which checks to perform at each layer will depend on your specific requirements.
Broadly:
User Interface - Should validate user input, provide helpful error messages and visual clues to correcting them; it should be protecting your lower layers against invalid user input.
Business / Domain Layer - Should check arguments to methods are valid (throwing ArgumentException and similar when they are not) and should check that operations are possible within the constraints of your business rules; it should be protecting your domain against programming mistakes.
Data Layer - Should check the data you are trying to insert or update is valid within the context of your database, that it meets all the relational constraints and check constraints; it should be protecting your database against mistakes in data-access.
Validation at each layer will ensure that only data and operations the layer believes to be correct are allowed to enter. This gives you a great deal of predictability, knowing information had to meet certain criteria to make it through to your database, that operations had to be logical to make it through your domain layer, and that user input has been sanitized and is easier to work with.
It also gives you security knowing that if any of your layers was subverted, there is another layer performing checks behind it which should prevent anything entering which you don't want to.
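The division of labour described above can be sketched as follows. This is Python with sqlite3, and the accounts table, the functions and the business rule are invented for illustration. Each layer performs only the checks that belong to it, and the database constraint still holds when the upper layers are skipped.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Data layer: constraints are the database's own last line of defence.
db.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " owner TEXT NOT NULL,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
db.execute("INSERT INTO accounts (id, owner, balance) VALUES (1, 'alice', 100)")


def ui_parse_amount(text):
    """UI layer: turn raw input into a value or a friendly message."""
    try:
        return int(text), None
    except ValueError:
        return None, "Please enter a whole number."


def withdraw(account_id, amount):
    """Business layer: guard the arguments, then apply the business rule."""
    if not isinstance(amount, int) or amount <= 0:
        raise ValueError("amount must be a positive integer")
    row = db.execute(
        "SELECT balance FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()
    if row is None:
        raise LookupError("no such account")
    if amount > row[0]:
        raise ValueError("insufficient funds")
    db.execute(
        "UPDATE accounts SET balance = balance - ? WHERE id = ?",
        (amount, account_id),
    )


amount, error = ui_parse_amount("30")
if error is None:
    withdraw(1, amount)
```

If some code path bypasses both functions and tries to write a negative balance directly, the CHECK constraint makes the statement fail with sqlite3.IntegrityError.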
Should I also validate in the business and data layers as well to protect against SQL injection and other issues?
Yes and Yes.
In your business layer code, you need to validate the input again (as client side can be spoofed), and also for your business logic, making sure the entries make sense for your application.
As for the data layer - you again need to ensure data is valid for the DB. Use parametrized queries as this will pretty much ensure no SQL injection will happen.
As for your specific question regarding the ID: the DB will know whether an ID exists or not. Whether that is valid or not depends on whether it has meaning for your business layer. If it is purely a DB artefact (not part of your object model), then the DB needs to handle it; if it is a part of your object model and has significance to it, the business layer should handle it.
You absolutely need to validate in your business and data layers. The UI is an untrusted layer, it is always possible for somebody to bypass your client-side validation and in some cases your server-side UI validation.
Preventing SQL injection is simply a matter of parameterizing your queries. The phrase "SQL Injection" shouldn't even exist anymore, it's been a solved problem for years and years, and yet every day I see people writing queries using string concatenation. Don't do this. Parameterize the commands and you will be fine.
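For anyone who has not seen the difference side by side, here is a small sketch in Python with sqlite3 (the users table and the malicious input are made up). It runs the same lookup once with string concatenation and once with a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

malicious = "nobody' OR '1'='1"

# Vulnerable: the input becomes part of the SQL text and rewrites the WHERE clause.
concatenated = conn.execute(
    "SELECT secret FROM users WHERE name = '" + malicious + "'"
).fetchall()

# Safe: the input is bound as a value and is only compared with the name column.
parameterized = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (malicious,)
).fetchall()
```

The concatenated query returns every row in the table, while the parameterized one returns nothing because no user has that literal name.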
One of the main reasons you separate your app into multiple tiers is so that each tier is reusable. If individual tiers don't do their own validation, then they are not autonomous and you don't have proper separation of concerns. You also can't do any thorough testing without individual components doing built-in validation.
I tend to relax these restrictions for classes or methods that are internal or private because they're not getting directly tested or used. As long as the public API is fully-validated, private APIs can generally assume that the class is in a valid state.
So, basically, yes, every layer, in fact every public class and method needs to validate its own data/arguments.
Semantic validation, like checking whether or not a particular Customer ID is valid, is going to depend on your design requirements. Obviously the business layer has no way of knowing whether or not an ID exists until said ID actually hits the data layer, so it can't perform this check in advance. Whether it throws an exception for a missing ID or simply returns null/ignores the error depends on exactly what the class/method is designed to do.
However, if this ID needs to be in a special format - for example, maybe you're using specially-coded account numbers ("R-12345-A-678") - then it does become the responsibility of the domain/business layer to validate the input and make sure it conforms to the correct format, especially if the consumer of your business class is trying to create a new account.
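A format check of that kind is cheap to do in the business layer. A minimal sketch in Python, assuming the format is "letter, five digits, letter, three digits", which is only a guess based on the single example above; the function name is hypothetical:

```python
import re

# Assumed shape, inferred from the single example "R-12345-A-678".
ACCOUNT_NUMBER = re.compile(r"[A-Z]-[0-9]{5}-[A-Z]-[0-9]{3}")


def is_well_formed_account_number(value):
    """Check the format only; whether the account exists is the data layer's job."""
    return isinstance(value, str) and ACCOUNT_NUMBER.fullmatch(value) is not None
```

This rejects malformed input before any database round trip, while the existence check stays where the data is.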
No layer should trust data coming from another layer. The analogy I use for this is one of fiefdoms. Say you want to send a message to the king. If the message is not in the proper format it will be rejected before it ever gets to his ears. You could continue to send messages until you eventually get the format right or you could use an emissary. The job of the emissary is to help you verify that your message will be in the acceptable format so that the king will hear it.
Each layer in a system is a fiefdom. Each layer acts as an emissary to the layer to which it will send data by verifying that it will be accepted. No layer trusts data coming from outside that layer (no one trusts messages from outside the fiefdom). The database does not trust the middle layer. The middle-layer does not trust the database or the presentation layer. The presentation does not trust the user or the middle layer.
So the answer is that absolutely you should check and re-check the data in each layer.
Short answer: yes.
Validate input as it is received in each new layer and before it gets acted upon. Generally I validate such input just before it gets used or passed on to the next layer (JavaScript checks that it's a valid email and free of malicious input; likewise, the business layer checks it before constructing a query using it).
To your last question: if the ID returns a record, then it is valid, and you'd have to look the record up to confirm whether or not it is valid, so you'd be making a lot of unnecessary lookups if you were to try that.
I hope that helps.
I do all of my validation at the Presenter layer in Model-View-Presenter. Validation is somewhat tricky because it's so often a cross-cutting concern.
I prefer to do it at the presenter layer because I can then short-circuit the call to the model.
The other approach is to do the validation in the model layer, but then there is the issue of communicating errors, because you cannot easily inform other layers of errors aside from exceptions. You can always pack exceptions with data, or create your own custom exception that you can attach a list of error messages or a similar construct to, but that always seems dirty to me.
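For reference, the custom-exception variant mentioned above might look like the sketch below. It is in Python, and ValidationError, save_user and presenter_submit are hypothetical names. The model collects all the problems and raises once, and the presenter unpacks the list for display.

```python
class ValidationError(Exception):
    """Carries every validation message found, not just the first one."""

    def __init__(self, errors):
        super().__init__("; ".join(errors))
        self.errors = list(errors)


def save_user(username, email):
    # Model layer: collect all problems before raising.
    errors = []
    if not username:
        errors.append("Username is required.")
    if "@" not in email:
        errors.append("Email must contain '@'.")
    if errors:
        raise ValidationError(errors)
    return {"username": username, "email": email}


def presenter_submit(username, email):
    # Presenter layer: translate the exception into messages for the view.
    try:
        save_user(username, email)
        return []
    except ValidationError as exc:
        return exc.errors
```

Whether this feels cleaner than validating in the presenter is a matter of taste; it does let the model reject bad data no matter who calls it.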
Later, when I expose my model through a web service, I will implement double validation, checking both in the Presenter and in the Model, since it will be possible to skip the presenter layer by calling the web service. The other big advantage of this is that it decouples my presenter-layer validations from the model: the model might only require raw validation of types to match the database, whereas for users of my UI I want more granular rules about what they input, not just what they physically can.
Other questions: the SQL injection portion is a model concern and should not be in any middle layers. However, most SQL injection attacks are completely nullified when text fields don't allow special characters. The other part of this is that you should almost always be using parameterized SQL, which makes SQL injection unusable.
The question on the ID: that's a model concern. Either it can get a record with that ID, or it should return null or throw an exception for record not found, depending on what convention you wish to establish.