What is the best practice to handle dangerous characters in asp.net?
see example: asp.net sign up form
Should you:
1. Use JavaScript to prevent users from entering them into the textbox in the first place?
2. Have a general function that does a find and replace on the server side?
The problem with #1 is that it will increase page load time.
ASP .NET handles potentially dangerous characters for you, by default since ASP .NET 2.0. From Request Validation in ASP.NET:
Request validation is a feature in ASP.NET that examines an HTTP
request and determines whether it contains potentially dangerous
content. In this context, potentially dangerous content is any HTML
markup or JavaScript code in the body, header, query string, or
cookies of the request. ASP.NET performs this check because markup or
code in the URL query string, cookies, or posted form values might
have been added for malicious purposes.
Request validation helps prevent this kind of attack. If ASP.NET
detects any markup or code in a request, it throws a "potentially
dangerous value was detected" error and stops page processing.
Perhaps the most important bit of this is that it happens on the server; regardless of which client is accessing your application, they cannot just turn off JavaScript to work around it.
Solution number 1 won't increase load time by much.
You should ALWAYS use solution number 2 along with solution number 1, because users can turn off JavaScript in their browsers.
You accept them like regular characters on the write-side. When rendering you encode your output. You have to encode it anyway regardless of security so that you can display special characters.
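A minimal sketch of that "store raw, encode on output" idea, written in Python only so that it runs standalone (in ASP.NET the encoding step would be something like HttpUtility.HtmlEncode). The function name render_comment is made up for illustration.

```python
import html


def render_comment(raw_comment: str) -> str:
    """Wrap a stored comment in a paragraph, encoding it at render time."""
    # The stored value is left untouched; only the rendered copy is encoded.
    return "<p>" + html.escape(raw_comment) + "</p>"


stored = "Tom & Jerry <3 O'Brien"   # saved exactly as the user typed it
rendered = render_comment(stored)   # safe to place inside an HTML page
```

The special characters survive the round trip and show up correctly in the browser, which is the point about needing the encoding regardless of security.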
What is the best practice to handle dangerous characters in asp.net?
I did not watch the screencast you link to (questions should be self-contained anyway), but there are no dangerous characters. It all depends on the context. Take Stack Overflow for example, it lets me input the characters Dangerous!'); DROP TABLE Questions--. Nothing dangerous there.
ASP.NET itself will do its best to prevent malicious input at the HTTP level: it won't let any user access files like web.config or files outside your web root.
As soon as you start doing something with user input, it's up to you. There's no silver bullet, no one rule that fits them all. If you're going to display the user input as HTML, you'll have to make sure you only allow harmless markup tags without any scriptable attributes. If you're allowing users to upload images, make sure only images get uploaded. If you're going to send input to an RDBMS, be sure to escape characters that have meaning for the database manipulation language.
And so on.
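For the database case, the usual way to "escape" is to not build the SQL string by hand at all and let a parameterized query carry the value. A small sketch, assuming an in-memory SQLite table named questions purely for illustration (in ADO.NET the same idea is a SqlCommand with SqlParameter values):

```python
import sqlite3


def save_title(conn, title):
    """Insert a title using a placeholder instead of string concatenation."""
    # The ? placeholder sends the value separately from the SQL text,
    # so quotes inside the title are never parsed as SQL.
    conn.execute("INSERT INTO questions (title) VALUES (?)", (title,))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE questions (title TEXT)")

title = "Dangerous!'); DROP TABLE Questions--"
save_title(conn, title)

stored = conn.execute("SELECT title FROM questions").fetchone()[0]
```

The "dangerous" string is stored and read back unchanged, and the table is still there afterwards.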
ALWAYS validate input on the server, this should not even be a discussion, just do it!
Client-side validation is just eye candy for the user, but the server is where it counts!
Thinking that
ASP .NET handles potentially dangerous characters for you, by default since ASP .NET 2.0. From Request Validation in ASP.NET:
is like thinking that a solid door will keep a thief out. It won't. It will only slow him down. You have to know the most common attack vectors and the possible solutions. You must understand that every EVERY EVERY variable (field/property) you write into HTML/CSS/JavaScript is a potential attack vector that must be sanitized (through the use of appropriate libraries, like some methods included in newer ASP.NET MVC, or at least the <%: %> syntax of ASP.NET 4.0), no exceptions. Every EVERY EVERY query you execute is a potential attack vector that must be sanitized through the exclusive use of an ORM and parameterized queries, no exceptions. No plain-text passwords must be saved in the db. And tons of other similar things. It isn't very difficult, but laziness, complacency and ignorance will make it harder (if not nearly impossible). If it isn't you who introduces the hole, then it's the programmer on your left, or the programmer on your right. There is no hope.
Please help me understand why this is happening. My Web Application retrieves text that contains Apostrophe from the database to display using a Label in a Web Form. However, it is getting a Potentially Dangerous Error when I submit the Web Form.
I can solve this error by using HttpUtility.HtmlDecode() on the value before I set it to Label.Text
I have been Googling, and I cannot understand the flow that causes this problem.
I understand that the Potentially Dangerous error occurs when submitting potential HTML tags, and that it also filters some encoded characters like "&#39;".
However, what I cannot understand are:
Is the string value automatically HTML-encoded when it is displayed on the Web Form? I ask because the value from my database is not HTML-encoded.
Why does decoding it solve the error when my value from the database is not encoded in the first place? If that is the case, should I decode all values from the database?
I think you are looking for something like this -> How to allow apostrophe in asp.net website
The main reason this occurs is that accepting characters such as apostrophes and angle brackets without handling them properly could lead to injection attacks. By default, Microsoft enables request validation on every request. It runs even before the normal page lifecycle and checks whether anything in the form looks like markup or script, i.e. the kind of input used for cross-site scripting (in no way am I saying Microsoft protects you from everything, SQL injection included, but it at least tries to cover the basics and make you aware). In the end it is up to you to disable that functionality (see my first link), but you should be aware of the risk and protect yourself properly.
On your questions: encoding, as the name says, makes sure the apostrophe is treated as an ordinary character rather than a special character that could still be used for a SQL query, for instance. And obviously, depending on the scenario, you might want to decode or encode it again when you present it. If you want to display rich HTML then you probably need to decode, etc.
Bottom line: I think you should go back to my first paragraph and understand where this starts; only then will you be able to work out the right solution for your case, as I assume you don't want to risk creating a vulnerable website :)
I'm working on a mini-CMS module for one of my projects, where users are allowed to edit content in markdown. I'm using markdown-it for parsing and showing a preview.
I was thinking a lot about how to send the input to the server, and also how to store it in the database. I came to a conclusion to avoid duplicating the markdown parsing at server-side, and send both markdown and the parsed HTML to the server. I think nowadays the added overhead is minimal, even on a site where edits are heavy.
So at the final stage I still need to validate the HTML sent to the server, as it can be a security weak point of the system. I've read a lot about Microsoft's implementation of AntiXSS, and how it is (or was) quite unusable for such scenarios, as it was too greedy. For example, I've found this article, which even has helper code (using HTMLAgilityPack) that gives a usable sanitizing implementation.
Unfortunately I haven't found anything newer than 2013 on this topic. I'd like to ask at present how to do a proper HTML encoding where there are allowed tags and attributes, but still safe from any kind of XSS attacks? Is such a code like in the article still needed, or are there any built-in solutions?
Also, if my choice of client-side markdown parsing is not viable, what are some other options? What I want to avoid, is duplicating all kinds of markdown logic at both client and server. For example I've prepared several custom extensions for markdown-it, etc.
If you allow html to be edited on the client and stored to the server, you are basically opening up a can of worms. This applies to client side html editors, and also to your usecase where you want to save html generated from markdown. The problem is that a malicious user may send any html to your backend, not just one that can actually be generated from the markdown. Html code in this case will be plain user input and as such must not be trusted.
Say you want to implement whitelisting of tags and tag attributes, the HTMLAgilityPack way. Consider a simple link in html. You obviously want to allow the <a> tag, and also the href attribute to that so that links are possible. But what about <a href="javascript:alert(1)"> then? It will be vulnerable to obvious XSS, and this is just one example, it would be vulnerable in numerous ways.
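To make the href problem concrete, here is a rough sketch of a scheme allowlist check. It is written in Python so it runs standalone, the names is_safe_href and ALLOWED_SCHEMES are invented for the example, and it assumes it is given the attribute value after the HTML parser has already decoded entities. It covers only this one vector, not sanitization as a whole.

```python
import re

# Schemes a link is allowed to use; everything else is rejected.
ALLOWED_SCHEMES = {"http", "https", "mailto"}

# A URL scheme is a letter followed by letters, digits, "+", "." or "-", then ":".
_SCHEME = re.compile(r"^([A-Za-z][A-Za-z0-9+.\-]*):")

# Browsers ignore leading/trailing control characters and spaces in a URL.
_EDGE_CHARS = "".join(chr(code) for code in range(0x21))


def is_safe_href(href: str) -> bool:
    """Return True if the href is relative or uses an allowlisted scheme."""
    # Browsers also drop tabs and newlines anywhere inside the URL,
    # so "java\tscript:" must be treated as "javascript:".
    cleaned = href.replace("\t", "").replace("\n", "").replace("\r", "")
    cleaned = cleaned.strip(_EDGE_CHARS)
    match = _SCHEME.match(cleaned)
    if match is None:
        return True  # no scheme at all: a relative link
    return match.group(1).lower() in ALLOWED_SCHEMES
```

Note that it allows known-good schemes rather than blocking "javascript:", so data: and anything else unexpected is rejected as well.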
Even worse is that you probably want to render user-given html on the client before a server roundtrip (something like a preview), and also save it to your database and render it after downloading it again. For this you have to turn off request validation, and also automatic encoding as those would make this impossible.
So you have a few options that could actually work to prevent XSS.
Client-side sanitization: You could use the client-side sanitizer from Google Caja (only the Javascript library, not the whole thing) to remove Javascript from any html content. The way this would work is before displaying any such html (before previewing html on the client, or before displaying html downloaded from the server), you would run it through Caja, and that would remove any Javascript, thus eliminating XSS. It works reasonably well in my experience, it removes Javascript from CSS too, and also the trivial ones like href, src, a script tag, event attributes (onclick, onmouseover, etc). Another similar library is HTML Purify, but that only works for new browsers and does not remove Javascript from CSS (because that does not work in newer browsers anyway).
Server-side sanitization: You could also use Caja on the server side properly, but that's probably way too difficult and hard to maintain for your usecase, and also if only this is implemented, preview on the client (without a server roundtrip) would still be vulnerable to DOM XSS.
Content-Security-Policy: You could use the Content-Security-Policy response header to disable all inline JavaScript on your website. One drawback is that it has implications for your client-side architecture (you cannot have inline JavaScript at all, obviously). Browser support is also limited, and in unsupported browsers your page will in fact be vulnerable to XSS. However, the latest versions of all current major browsers support Content-Security-Policy, so it is indeed a good option.
Separate frame: You could serve unsafe html from a different origin (ie. a different subdomain) and accept the risk of XSS on that origin. However, cross-frame communication would still be a problem, and so would authentication and/or CSRF depending on the solution. This is kind of the old school way, options above are probably better for your usecase.
You could also use a combination of these for defense in depth.
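As a rough illustration of the Content-Security-Policy option, the sketch below builds a header value that only allows scripts from the site's own origin. The helper name build_csp and the exact set of directives are assumptions for the example, not a recommended policy for any particular site.

```python
def build_csp(script_sources=("'self'",)):
    """Build a Content-Security-Policy header value as a string."""
    directives = {
        "default-src": ["'self'"],
        # No 'unsafe-inline' here, so inline <script> blocks, inline event
        # handlers and javascript: URLs are refused by supporting browsers.
        "script-src": list(script_sources),
        "object-src": ["'none'"],
    }
    return "; ".join(
        name + " " + " ".join(values) for name, values in directives.items()
    )


header_name = "Content-Security-Policy"
header_value = build_csp()
```

The resulting value would be sent as a response header on every page that renders user-supplied HTML.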
I ended up using the code in the article. I made an important change, I removed style attributes from the whitelist completely. I don't need them, the styling which I allow can be achieved by classes. Also, style attributes are also dangerous and hard to encode/escape properly. Now I feel that the code is safe enough for my current purposes.
What else needs to be validated apart from what I have below? This is my question.
It is important that any input to a site is properly validated:
Textboxes, etc – use .NET validators (or custom code if the validators aren’t appropriate)
Querystring or Form values – use manual validation (casting to specific types, boundary checking, etc)
This ties into the problems which XSS can reveal.
Basically you have to validate any input that someone could potentially tamper with:
Form Postbacks (mainly .NET Controls – these can be validated with .NET validation controls. Also, if you have Request Validation turned on for all pages, this reduces the risk)
QueryString Values
Cookie values
HTTP Headers
Viewstate (automatically done for you as long as you have ViewState MAC enabled)
Javascript (all JS can be viewed and changed, so need to ensure no crucial functionality is handled by JavaScript- i.e. always enable server side validation)
There is a lot that can go wrong with a web application. Your list is pretty comprehensive, although it contains some duplication. At the HTTP level there are only GET, POST, cookies and headers. There are many different types of POST, but it's all in the same part of the request.
To your list I would also add everything to do with file upload, which is a type of POST: for instance, the file name, the MIME type and the contents of the file. I would fire up a network monitoring application like Wireshark; everything you see in the request should be considered potentially harmful.
There will never be a one-size-fits-all validation function. If you are merging SQL injection and XSS sanitization functions then you may be in trouble. I recommend testing your site using automation. A free service like Sitewatch or an open source tool like skipfish will detect methods of attack that you have missed.
Also, on a side note: passing the view state around with a MAC and/or encrypted is a gross misuse of cryptography. Cryptography is a tool to use when there is no other solution. By using a MAC or encryption you are opening the door for an attacker to brute-force this value or to use something like a padding oracle attack against you. View state should be kept track of by the server, period, end of story.
I would suggest a different way of looking at the problem that is orthogonal to what you have here (and hence not incompatible, there's no reason why you can't examine it both ways in case you catch with one what you miss with another).
The two things that are important in any validation are:
Things you pay attention to.
Things you pass to another layer untouched.
Now, most of the things you've mentioned so far fit into the first category. Cookies that you ignore fit into the second, as would query and post information if you passed it to another handler with Server.Execute or similar.
The second category is the most debatable.
On the one hand, if a given handler (.aspx page, IHttpHandler, etc.) ignores a cookie that may be used by another handler at some point in the future, it's mostly up to that other handler to validate it.
On the other hand, it's always good to have an approach that assumes other layers have security holes and you shouldn't trust them to be correct, even if you wrote them yourself (especially if you wrote them yourself!)
A middle-ground position is that if there are perhaps 5 different states some persistent data could validly be in, but only 3 make sense when a particular piece of code is hit, it might verify that it is in one of those 3 states, even if that doesn't pose a risk to that particular code.
That done, we'll concentrate on the first category.
Querystrings, form-data, post-backs, headers and cookies all fall under the same category of stuff that came from the user (whether they know it or not). Indeed, they are sometimes different ways of looking at the same thing.
Of this, there is a subset that we will actually work upon in any way.
Of that there is a range of legal values for each such item.
Of that, there is a range of legal combinations of values for the items as a whole.
Validation therefore becomes a matter of:
Identify what input we will act upon.
Make sure that each component of that input is valid in its own right.
Make sure that the combinations are valid (e.g it may be valid to not send a credit card number, but invalid to not send one but set payment type to "credit card").
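The three steps above could look roughly like this. It is a sketch in Python so it can be run on its own; the field names, the allowed payment types and the card-number length check are all invented for the example.

```python
import re

ALLOWED_PAYMENT_TYPES = {"credit card", "invoice"}


def validate_order(form: dict) -> list:
    """Return a list of validation errors for the fields we act upon."""
    errors = []
    # Step 1: pick out only the input we will act upon; anything else is ignored.
    payment_type = form.get("payment_type", "")
    card_number = form.get("card_number", "")

    # Step 2: each component must be valid in its own right.
    if payment_type not in ALLOWED_PAYMENT_TYPES:
        errors.append("payment_type is not one of the legal values")
    if card_number and not re.fullmatch(r"[0-9]{12,19}", card_number):
        errors.append("card_number must be 12 to 19 digits")

    # Step 3: the combination must be valid too.
    if payment_type == "credit card" and not card_number:
        errors.append("card_number is required when paying by credit card")
    return errors
```

Leaving out the card number is fine for an invoice order but is reported as soon as the payment type is "credit card", which is the combination check from the last step.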
Now, when we come to this, it's generally best not to try to catch certain attacks. For example, it's not so good to avoid ' in values that will be passed to SQL. Rather, we have three possibilities:
It's invalid to have ' in the value because it doesn't belong there (e.g. a value that can only be "true" or "false", or from a set list of values in which none of them contain '). Here we catch the fact that it isn't in the set of legal values, and ignore the precise nature of the attack (thus being protected also from other attacks we don't even know about!).
It's valid as human input, but not as what we will use. An example here is a large number (in some cultures ' is used to separate thousands). Here we canonicalise both "123,456,789" and "123'456'789" to 123456789 and don't care what it was like before that, as long as we can meaningfully do so (the input wasn't "fish" or a number that is out of the range of legal values for the case in hand).
It's valid input. If your application blocks apostrophes in name fields in an attempt to block SQL-injection, then it's buggy because there are real names with apostrophes out there. In this case we consider "d'Eath" and "O'Grady" to be valid input and deal with the fact that ' is significant in SQL by escaping properly (ideally by using an API for data access that will do this for us).
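The second possibility (canonicalise, then stop caring what the input looked like) might be sketched as follows. The function name and the default range are assumptions chosen for the example.

```python
import re


def canonicalise_quantity(raw: str, minimum: int = 0, maximum: int = 10**9) -> int:
    """Turn human-formatted digits into an int, or raise ValueError."""
    # Drop the thousands separators people legitimately type.
    digits = raw.strip().replace(",", "").replace("'", "")
    # Anything that is not purely digits at this point is simply not a number.
    if not re.fullmatch(r"[0-9]+", digits):
        raise ValueError("not a number")
    value = int(digits)
    # The value must also be within the legal range for the case in hand.
    if not minimum <= value <= maximum:
        raise ValueError("out of range")
    return value
```

After this step the rest of the code only ever sees an integer, so whether the original text contained an apostrophe no longer matters.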
A classic example of the third point with ASP.NET is code that blocks "suspicious" input with < and > - something that makes a great number of ASP.NET pages buggy. Granted, it's better to be buggy in blocking that inappropriately than buggy by accepting it inappropriately, but the defaults are for people who haven't thought about validation and trying to stop them from hurting themselves too badly. Since you are thinking about validation, you should consider whether it's appropriate to turn that automatic validation off and then treat < and > in a manner appropriate for your given use.
Note also that I haven't said anything about JavaScript. I don't validate JavaScript (unless perhaps I was actually receiving it), I ignore it. I pretend it doesn't exist and then I won't miss a case where its validation could be tampered with. Pretend yours doesn't exist at this layer too. Ultimately client-side validation is there to save time for the good guys making honest mistakes, not to thwart the bad guys.
For similar reasons, this is best not tested through a browser. Use Fiddler to construct requests that hit the validation points you want to examine. This way all client-side validation is by-passed, and you're looking at the server the same way an attacker will.
Finally, remember that a page with 100% perfect validation is not necessarily secure. E.g. if your validation is perfect but your authentication is poor, then someone can send "valid" input to it that will be just as nasty as, perhaps nastier than, the more classic SQL-injection or XSS code. That touches on other topics that are for other questions; the point here is that validation as discussed is only part of the puzzle.
To prevent my application from crashing with the error "A potentially dangerous Request.Form value was detected...", I just turned page validation off. I want to revisit this and solve it correctly.
Is there a good strategy for this? If people are entering '<' and '>', I think the only way to save their data is to encode it via JavaScript. I have tried catching it in the code-behind, but by then it is too late. I am thinking of inheriting from the textbox and auto encoding/decoding the input with client scripts. I also have to think of all the angle brackets that are already saved in my database.
Any suggestions or experience with this?
I get from your question that you don't want your clients to send you "dangerous" content, so it's desirable to leave the page validation turned on, as a last line of defense, instead of turning it off and using Server.HtmlEncode on each user input value (you might miss one and it is a lot of work).
I would go for a javascript solution, for example you could use a library such as jQuery, and hook into the submit events of the forms, and tidy the input before submitting. Much cleaner than creating your own derived textbox.
For the users without javascript, or that try to "hack" your little script, sc#!w them, they will reach your last line of defense, and get an error.
It's best to think of the built-in page validation as a safety device that isn't applicable to all cases. There are more than a few times when it is completely impossible to do something with it turned on. In these cases we turn it off, and deal with the validation ourselves.
The most obvious case is that sometimes we actually do want to send big chunks of HTML to the server. Of course, doing so still has to be made secure, but "oh, that looks like a big chunk of HTML! throw a security exception!" obviously isn't the correct way to do that.
So, in these cases it's perfectly sensible to turn off page validation and add your own on the server side. It does mean that you have to think about just how this input will be used with a bit more scrutiny than before. Follow the path of every input datum (not just those where you expect to see characters like <), and ensure that either it will never be sent back to the client unescaped, or that it is thoroughly inspected to guarantee safety.
You can escape dangerous chars before posting the data. Like this:
// client side; encodeURIComponent replaces the deprecated escape()
string = encodeURIComponent(string);
and then on the server side:
var stringVal = Server.UrlDecode(Request["string"]);
Something like that.
Have you considered using:
Server.HtmlEncode(input)
There is no real need to do it on the client side using JavaScript. You can easily do it on the server side using the above technique.
This is also possibly a duplicate of this question.
/BB
We've got an interstitial page that warns people when they're leaving our site. The trouble is it takes querystring parameters and blindly generates a page, thus it's vulnerable to XSS attacks. I've been tasked with fixing it and I want to do it right.
You should call Server.HtmlEncode to properly escape your generated HTML.
Yes, try this:
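if (Uri.IsWellFormedUriString(url, UriKind.Absolute) &&
    (url.StartsWith("http://", StringComparison.OrdinalIgnoreCase) ||
     url.StartsWith("https://", StringComparison.OrdinalIgnoreCase)))
{
    // Only http/https targets are accepted; the value is HTML-encoded before it is written out.
    Response.Write(string.Format("{0}", HttpUtility.HtmlEncode(url)));
}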
So, things not to do:
Use regex
Use HtmlEncode without thought.
Things to do:
Treat all input as untrusted.
Encode input before it is output. However make sure you're using the right type of encoding. If you put user input in an attribute then you use HtmlAttributeEncode, if it's just HTML then you use HtmlEncode, if you put it into JavaScript then it's JavaScriptEncode. If your javascript puts it into a div then it's HtmlEncode, followed by JavaScriptEncode.
Consider using AntiXSS which provides more encoding mechanisms and uses a safe list approach which is inherently safer.
Whitelist the exit URLs so people cannot use this page as an open redirector. Do not have a parameter which contains the URL; rather, have a GUID which looks up the URL from a database, session table or whatever.
(Disclosure : I own AntiXSS)
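To show why the "right type of encoding" point matters, here is a small sketch of the same value being encoded for two different contexts. It is in Python so it runs standalone and is not the AntiXSS API; the two helper names are made up, and they only roughly correspond to HtmlAttributeEncode/HtmlEncode and JavaScriptEncode.

```python
import html
import json


def encode_for_html(value: str) -> str:
    """Encode for HTML element content or a quoted attribute value."""
    return html.escape(value, quote=True)


def encode_for_js_string(value: str) -> str:
    """Encode as a JavaScript string literal that is safe inside a script block."""
    # json.dumps adds the surrounding double quotes and escapes quotes,
    # backslashes and control characters.
    literal = json.dumps(value)
    # Also escape <, > and & so the value cannot close the script element.
    return (literal.replace("<", "\\u003c")
                   .replace(">", "\\u003e")
                   .replace("&", "\\u0026"))


user_input = '"><script>alert(1)</script>'
html_fragment = '<input value="' + encode_for_html(user_input) + '">'
script_fragment = "<script>var name = " + encode_for_js_string(user_input) + ";</script>"
```

Using the HTML encoder inside the script block, or the JavaScript encoder inside the attribute, would produce either broken output or an injectable page, which is why the context decides the encoder.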
The best way is to get rid of the page entirely and just accept that it's a website and make it act like a website. Websites link to other resources; it's why the web has over 200 million sites instead of about a dozen.
Failing that, your best bet is to start with HtmlEncode as a quick fix, and then replace it with a lookup of ids that takes one to the different sites.
But really, those "ZOMG you are leaving!" pages are horrible. They're even worse than the sites that open new tabs for every so-called "external" link.
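For what it's worth, the id lookup suggested in the answers above can be sketched like this. It is Python, only so that it runs standalone; the ExitLinkRegistry class is hypothetical and keeps the links in memory, where a real site would use a database or session table.

```python
import uuid


class ExitLinkRegistry:
    """Maps opaque ids to the external URLs the site itself has chosen to link to."""

    def __init__(self):
        self._urls = {}

    def register(self, url: str) -> str:
        """Store a trusted URL and return the id to put in the querystring."""
        link_id = str(uuid.uuid4())
        self._urls[link_id] = url
        return link_id

    def resolve(self, link_id: str):
        """Return the URL for a known id, or None for anything else."""
        return self._urls.get(link_id)


registry = ExitLinkRegistry()
link_id = registry.register("https://example.com/partner")
```

The interstitial page then only ever receives an id, so a tampered querystring resolves to nothing instead of to attacker-chosen markup or an attacker-chosen destination.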