Please help me understand why this is happening. My Web Application retrieves text that contains Apostrophe from the database to display using a Label in a Web Form. However, it is getting a Potentially Dangerous Error when I submit the Web Form.
I can solve this error by using HttpUtility.HtmlDecode() on the value before I set it to Label.Text
I have been Googling, and I cannot understand the flow that causes this problem.
I understand that Potentially Dangerous Error occurs when submitting potential HTML tags like '', and it also filters some encoded characters like "'"
However, what I cannot understand are:
Does the string value automatically HTML encodes when it displays on the Web Form? Because the value from my database is not HTML encoded.
Why does decoding it solves the error when my value from the database is not encoded in the first place? If that is the case, should I decode all values from database?
I think you are looking for something like this -> How to allow apostrophe in asp.net website
The main reason this occurs is mostly due to fact that accepting apostrophes without handling it properly could lead to SQL Injection . By Default Microsoft enables a validation that occurs with every request, this occurs even before it reaches the normal page lifecycle, and checks if there is anything in the form that could potentially lead to SQL Injection or other OWASP vulnerabilities (In no way I am saying Microsoft protects you from everything, but at least tries to cover the basics and make you aware). By the end it is up to you to disable that functionality (based on my first link) but you should be aware and protect yourself in a proper manner.
On your questions enconding as the name says will take care of the apostrophe as a proper character not a special character that could still be used for a SQL query for instance. And obviously depending on the scenario you might want to decode or encode it again when you present it. If you want to display rich html then you probably need to decode, etc, etc.
Bottom line I think you should go back to my first paragraph and understand when this starts and only then you will be able to understand what would be the right solution for your case as I assume you dont want to risk creating a vulnerable website :)
Related
I have an application where in I have stored a lot of websites without validating them. Now I am validating the URL entered. But the already stored URL's are there as it is.
I want a strict display code that allows me to correct the user typos also and just gives the a proper URL to deal with.
The data that is already in the system has a lot of typos such as ...http://example.com or htp://example.com or ttp://example.com. I want the code to tackle that and come up with the proper url either by regexing the invalid part or making it correct.
That is the best approach to establish this?
You can obviously pick out the correct ones with a regex.
However, you will need to write your own logic to fix those that are 'broken'. You could pull these and with another regex and then simply search and replace the broken element. There are going to be limitations to this as you can only really check the protocol prefix and not the domain part itself.
Here is my try:
http(s)?://(www.)?[a-zA-Z0-9\-\.\\/]+
where [a-zA-Z0-9-.\/] includes all characters that you want to allow users to use.
P.S. please be aware that if you are using RegEx under C#, do not forget to use double \\ as otherwise your expression might not work properly.
Hope it gets you started.
What is the best practice to handle dangerous characters in asp.net?
see example: asp.net sign up form
Should you:
use a JavaScript to prevent them from entering it into the textbox in the 1st place?
have a general function that does a find and replace on the server side?
The problem with #1, is it will increase page load time.
ASP .NET handles potentially dangerous characters for you, by default since ASP .NET 2.0. From Request Validation in ASP.NET:
Request validation is a feature in ASP.NET that examines an HTTP
request and determines whether it contains potentially dangerous
content. In this context, potentially dangerous content is any HTML
markup or JavaScript code in the body, header, query string, or
cookies of the request. ASP.NET performs this check because markup or
code in the URL query string, cookies, or posted form values might
have been added for malicious purposes.
Request validation helps prevent this kind of attack. If ASP.NET
detects any markup or code in a request, it throws a "potentially
dangerous value was detected" error and stops page processing.
Perhaps the most important bit of this is that it happens on the server; regardless of the client accessing your application they can not just turn of JavaScript to work around it.
Solution number 1 won't increment load time by much.
You should ALWAYS use solution number 2 along with solution number one, because users can turn off javascript in their browsers.
You accept them like regular characters on the write-side. When rendering you encode your output. You have to encode it anyway regardless of security so that you can display special characters.
What is the best practice to handle dangerous characters in asp.net?
I did not watch the screencast you link to (questions should be self-contained anyway), but there are no dangerous characters. It all depends on the context. Take Stack Overflow for example, it lets me input the characters Dangerous!'); DROP TABLE Questions--. Nothing dangerous there.
ASP.NET itself will do its best to prevent malicious input at the HTTP level: it won't let any user access files like web.config or files outside your web root.
As soon as you start doing something with user input, it's up to you. There's no silver bullet, no one rule that fits them all. If you're going to display the user input as HTML, you'll have to make sure you only allow harmless markup tags without any scriptable attributes. If you're allowing users to upload images, make sure only images get uploaded. If you're going to send input to an RDBMS, be sure to escape characters that have meaning for the database manipulation language.
And so on.
ALWAYS validate input on the server, this should not even be a discussion, just do it!
Client-side validation is just eye candy for the user, but the server is where it counts!
Thinking that
ASP .NET handles potentially dangerous characters for you, by default since ASP .NET 2.0. From Request Validation in ASP.NET:
is like thinking that a solid door will keep a thief out. It won't. It will only slow him. You have to know what are the most common vectors and what are the possible solutions. You must comprehend that every EVERY EVERY variable (field/property) you write in an HTML/CSS/Javascript is a potential attack vector that must be sanitized (through the use of appropriate libraries, like some methods included in newer MVC.NET, or at least the <%: %> of ASP.NET 4.0), no exceptions, every EVERY EVERY query you execute is a potential attach vector that must be sanitized through the exclusive use of ORM and parameterized queries, no exceptions. No passwords must be saved in the db. And tons of other similar things. It isn't very difficult, but laziness, complacence, ignorance will make it harder (if not nearly impossible). If it isn't you that will introduce the hole then it's the programmer on your left, or the programmer on your right. There is not hope.
What else needs to be validated apart from what I have below? This is my question.
It is important that any input to a site is properly validated:
Textboxes, etc – use .NET validators (or custom code if the validators aren’t appropriate)
Querystring or Form values – use manual validation (casting to specific types, boundary checking, etc)
This ties into the problems which XSS can reveal.
Basically you have to validate any input that someone could potentially tamper with:
Form Postbacks (mainly .NET Controls – these can be validated with .NET validation controls. Also if you have Request Validation turned on on all pages, this reduces the risk )
QueryString Values
Cookie values
HTTP Headers
Viewstate (automatically done for you as long as you have ViewState MAC enabled)
Javascript (all JS can be viewed and changed, so need to ensure no crucial functionality is handled by JavaScript- i.e. always enable server side validation)
There is a lot that can go wrong with a web application. Your list is pretty comprehensive, although it is duplication. The http spec only states, GET, POST, Cookie and Header. There are many different types of POST, but its all in the same part of the request.
For your list I would also add everything having to do with file upload, which is a type of POST. For instance, file name, mime type and the contents of the file. I would fire up a network monitoring application like Wireshark and everything in the request should be considered potentially harmful.
There will never be a one size fits all validation function. If you are merging sql injection and xss sanitation functions then you maybe in trouble. I recommend testing your site using automation. A free service like Sitewatch or an open source tool like skipfish will detect methods of attack that you have missed.
Also, on a side note. Passing the view state around with a MAC and/or encrypted is a gross misuse of cryptography. Cryptography is tool used when there is no other solution. By using a MAC or encryption you are opening the door for an attacker to brute force this value or use something like oracle padding attack to take advantage of you. A view state should be kept track by the server, period end of story.
I would suggest a different way of looking at the problem that is orthogonal to what you have here (and hence not incompatible, there's no reason why you can't examine it both ways in case you catch with one what you miss with another).
The two things that are important in any validation are:
Things you pay attention to.
Things you pass to another layer untouched.
Now, most of the things you've mentioned so far fit into the first cateogry. Cookies that you ignore fit into the second, as would query & post information if you passed to another handler with Server.Execute or similar.
The second category is the most debatable.
On the one hand, if a given handler (.aspx page, IHttpHandler, etc.) ignores a cookie that may be used by another handler at some point in the future, it's mostly up to that other handler to validate it.
On the other hand, it's always good to have an approach that assumes other layers have security holes and you shouldn't trust them to be correct, even if you wrote them yourself (especially if you wrote them yourself!)
A middle-ground position, is that if there are perhaps 5 different states some persistant data could validly be in, but only 3 make sense when a particular piece of code is hit, it might verify that it is in one of those 3 states, even if that doesn't pose a risk to that particular code.
That done, we'll concentrate on the first category.
Querystrings, form-data, post-backs, headers and cookies all fall under the same category of stuff that came from the user (whether they know it or not). Indeed, they are sometimes different ways of looking at the same thing.
Of this, there is a subset that we will actually work upon in any way.
Of that there is a range of legal values for each such item.
Of that, there is a range of legal combinations of values for the items as a whole.
Validation therefore becomes a matter of:
Identify what input we will act upon.
Make sure that each component of that input is valid in its own right.
Make sure that the combinations are valid (e.g it may be valid to not send a credit card number, but invalid to not send one but set payment type to "credit card").
Now, when we come to this, it's generally best not to try to catch certain attacks. For example, it's not so good to avoid ' in values that will be passed to SQL. Rather, we have three possibilities:
It's invalid to have ' in the value because it doesn't belong there (e.g. a value that can only be "true" or "false", or from a set list of values in which none of them contain '). Here we catch the fact that it isn't in the set of legal values, and ignore the precise nature of the attack (thus being protected also from other attacks we don't even know about!).
It's valid as human input, but not as what we will use. An example here is a large number (in some cultures ' is used to separate thousands). Here we canonicalise both "123,456,789" and "123'456'789" to 123456789 and don't care what it was like before that, as long as we can meaningfully do so (the input wasn't "fish" or a number that is out of the range of legal values for the case in hand).
It's valid input. If your application blocks apostrophes in name fields in an attempt to block SQL-injection, then it's buggy because there are real names with apostrophes out there. In this case we consider "d'Eath" and "O'Grady" to be valid input and deal with the fact that ' is significant in SQL by escaping properly (ideally by using an API for data access that will do this for us.
A classic example of the third point with ASP.NET is code that blocks "suspicious" input with < and > - something that makes a great number of ASP.NET pages buggy. Granted, it's better to be buggy in blocking that inappropriately than buggy by accepting it inappropriately, but the defaults are for people who haven't thought about validation and trying to stop them from hurting themselves too badly. Since you are thinking about validation, you should consider whether it's appropriate to turn that automatic validation off and then treat < and > in a manner appropriate for your given use.
Note also that I haven't said anything about javascript. I don't validate javascript (unless perhaps I was actually receiving it), I ignore it. I pretend it doesn't exist and then I won't miss a case where its validation could be tampered with. Pretend yours doesn't exist at this layer too. Ultimately client-side validation is to save the good guys making honest mistakes time, not to twart the bad guys.
For similar reasons, this is best not tested through a browser. Use Fiddler to construct requests that hit the validation points you want to examine. This way all client-side validation is by-passed, and you're looking at the server the same way an attacker will.
Finally, remember that a page with 100% perfect validation is not necessarily secure. E.g. if your validation is perfect but your authentication poor then someone can send "valid" code to it that will be just - perhaps more - nasty as the more classic SQL-injection of XSS code. That hits onto other topics that are for other questions, except that validation as discussed here is only part of the puzzle.
To prevent my application from crashing with the error "A potentially dangerous Request.Form value was detected...", I just turned page validation off. I want to revisit this and solve it correctly.
Is there a good strategy for this? If people are entering '<' and '>', I think the only way to save their data is to encode it via Javacript. I have tried catching it in the code-behind, but it becomes too late. I am thinking of inheriting the textbox and auto encode/decode the input with client scripts. I also have to think of all the angle brackets that are already saved in my database.
Any suggestions or experience with this?
I get from your answer that you don't want your client to send you "dangerous" content, so its desirable to leave the page validation turned on, as a last line of defense, instead of turning it off and using Server.HtmlEncode on each user input value (you might miss one and it is a lot of work).
I would go for a javascript solution, for example you could use a library such as jQuery, and hook into the submit events of the forms, and tidy the input before submitting. Much cleaner than creating your own derived textbox.
For the users without javascript, or that try to "hack" your little script, sc#!w them, they will reach your last line of defense, and get an error.
It's best to think of the built-in page validation as a safety device that isn't applicable to all cases. There are more than a few times when it is completely impossible to do something with it turned on. In these cases we turn it off, and deal with the validation ourselves.
The most obvious case is that sometimes we actually do want to send big chunks of HTML to the server. Of course, doing so still has to be made secure, but "oh, that looks like a big chunk of HTML! throw a security exception!" obviously isn't the correct way to do that.
So, in these cases it's perfectly sensible to turn off page-validation and add your own server-side. It does mean that you have to think about just how this input will be used with a bit more scrutiny than before. Follow through the path of every datum input (not just those where you expect to see characters like <, and ensure that either it will never be sent back to the client unescaped, or that it is thoroughly inspected to guarantee safety.
You can escape dangerous chars before posting the data. Like this:
string = escape(string);
and then on the server side:
var stringVal = Server.UrlDecode(Request["string"]);
Something like that.
Have you considered using ,
Server.HtmlEncode(input)
There is no real need to do it in the client end using javascript. You can easily do it in the server side using the above technique.
And possibly be a duplicate of this question
/BB
My company requires our ASP.NET code to pass a Fortify 360 scan before releasing the code. We use AntiXSS everywhere to sanitize HTML output. We also validate input. Unfortunately, they recently changed the "template" Fortify was using and now it's flagging all our AntiXSS calls as "Poor Validation". These calls are doing things like AntiXSS.HTMLEncode(sEmailAddress).
Anyone know exactly what would satisfy Fortify? A lot of what it's flagging is output where the value comes from a database and never from a user at all, so if HTMLEncode isn't safe enough, we have no idea what is!
I'm a member of Fortify's Security Research Group and I'm sorry for the confusion this issue has been causing you. We haven't done a very good job of presenting this type of issue. I think part of the problem is the category name -- we're not trying to say that there is anything wrong with the validation mechanism, just that we cannot tell if it is the appropriate validation for this situation.
In other words, we don't know what the right validation is for your particular context. For this reason, we do not recognize any HTML encoding functions as validating against XSS out of the box, even the ones in Microsoft's AntiXSS library.
As for what the right solution is, if you are using HtmlEncode to output a username to the body of an HTML page, your original code is fine. If the the encoded username is being used in a URL, it could be vulnerable to XSS. At Fortify, when we find similar issues in our own code, if the validation matches the context, we mark it "Not an Issue".
We are aware of the problems around these issues keep tweaking our rules to make them more precise and understandable. We release new rules every three months and expect to make a couple changes in upcoming releases. For Q4, we plan to split the issues into Inadequate Validation (for blacklisting encoding and other weak validation schemes) and Context Sensitive Validation (the type of issue you're seeing). Please let us know if we can help more.
(A link to an OWASP explanation of why HTML encoding doesn't solve all problems:
http://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet#Why_Can.27t_I_Just_HTML_Entity_Encode_Untrusted_Data.3F)
fd_dev, I would add that you shouldn't focus on squeezing your code to fit through static analysis hoops. If you are qualified and confident that the finding doesn't apply, use the Fortify GUI tools to record a comment and suppress the issue.
If you are not sure, take a little screenshot and email it to Fortify Technical Support. They are well qualified to advise you on how to interpret your Fortify security findings.
blowdart is spot on. See http://www.schneier.com/blog/archives/2008/05/random_number_b.html for the worst case of what can happen if you chase static analysis results without understanding the purpose of the code and the reason/mechanics behind the finding. (In a word, you could make the code worse instead of better)-:
We've found a solution. Believe it or not, this causes Fortify360 to accept the code.
string sSafeVal = Regex.Replace(sValue, #"[\x00-\x1F\x7F]+", "");
Response.Write AntiXSS.HTMLEncode(sSafeVal);
So where AntiXSS.HTMLEncode alone fails, replacing non-printable characters works. Nevermind the fact that the HTMLEncode would do that anyways. I'm guessing they simply trigger off the Regex.Replace and I imagine any pattern would work.