Encoded characters are not rendering correctly - c#

I am using the shorthand for HttpUtility.HtmlEncode to encode the data going into my text boxes.
<asp:TextBox ID="txtProperty" runat="server" Text='<%#: Bind("Property")%>'></asp:TextBox>
My understanding is that when a web browser renders encoded characters, it should display the characters they represent and not the encoded sequences themselves, as the example code on the MSDN website suggests.
However, my encoded characters do not behave this way.
For example, a '£' character retrieved from a database displays in the textbox as:
&#163;
And not:
£
I don't think it has anything to do with how my website is configured to handle encoding, because if I manually set the text to the encoded characters in the HTML:
<asp:TextBox ID="txtProperty" runat="server" Text="&#163;"></asp:TextBox>
It renders the encoded characters correctly as:
£
This indicates to me that it is a problem with the way I am using HtmlEncode.
Still, I tried explicitly setting the encoding to UTF-8 in my web.config, and it made no difference.
Could someone explain this behavior, or what might be the problem here?

When you use <%#: Bind("Property")%>, ASP.NET already takes care of HTML-encoding the string; if you pre-encode it yourself, you end up in the double-encoding scenario.
See ScottGu's New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2):
ASP.NET 4 introduces a new IHtmlString interface (along with a concrete implementation: HtmlString) that you can implement on types to indicate that its value is already properly encoded (or otherwise examined) for displaying as HTML, and that therefore the value should not be HTML-encoded again.
The <%: %> code-nugget syntax checks for the presence of the IHtmlString interface and will not HTML encode the output of the code expression if its value implements this interface.
This allows developers to avoid having to decide on a per-case basis whether to use <%= %> or <%: %> code-nuggets.
Instead you can always use <%: %> code nuggets, and then have any properties or data-types that are already HTML encoded implement the IHtmlString interface.
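The double-encoding effect and the IHtmlString check can be illustrated outside of ASP.NET. The following is a minimal sketch, not the framework's implementation: `IPreEncodedHtml`, `PreEncodedHtml` and `EncodingNugget.Render` are hypothetical names standing in for `IHtmlString`, `HtmlString` and the `<%: %>` nugget, and `System.Net.WebUtility.HtmlEncode` is used in place of `HttpUtility.HtmlEncode` so the code does not depend on System.Web.

```csharp
using System;
using System.Net;

// Hypothetical stand-in for System.Web.IHtmlString, declared locally
// so that the sketch compiles without a reference to System.Web.
public interface IPreEncodedHtml
{
    string ToHtmlString();
}

// Hypothetical stand-in for HtmlString: wraps text that is already encoded.
public sealed class PreEncodedHtml : IPreEncodedHtml
{
    private readonly string _html;

    public PreEncodedHtml(string html)
    {
        _html = html;
    }

    public string ToHtmlString()
    {
        return _html;
    }
}

public static class EncodingNugget
{
    // Mimics the idea behind <%: %>: plain values are HTML-encoded,
    // values marked as pre-encoded are written through unchanged.
    public static string Render(object value)
    {
        if (value == null)
        {
            return string.Empty;
        }

        var preEncoded = value as IPreEncodedHtml;
        if (preEncoded != null)
        {
            return preEncoded.ToHtmlString();
        }

        return WebUtility.HtmlEncode(value.ToString());
    }
}
```

Passing a raw string to `Render` encodes it once. Passing a string that was already encoded encodes it a second time, so the browser shows the entity text instead of the character. Wrapping the already-encoded string in `PreEncodedHtml` avoids the second pass.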

Related

Using i18next with Razor [duplicate]

Is there a nicer syntax when creating elements with hyphenated attributes instead of using:
<%= Html.TextBox ("name", value, new Dictionary<string, object> { {"data-foo", "bar"} }) %>
Looking at the specs for the proposed standards HTML 5 and WAI-ARIA, it seems hyphens in HTML attributes are going to become more common as a sort of simple namespacing.
E.g. HTML 5 proposes that custom attributes are prefixed with data-, and WAI-ARIA uses the aria- prefix for all WAI-ARIA attributes.
When using HTML helpers in ASP.NET MVC such as <%= Html.TextBox("name", value, new { attribute = attributeValue }) %> the anonymous object is converted to a dictionary.
Unfortunately C# does not support hyphens in identifiers, so the only alternative is to create a dictionary, and the syntax for that is very verbose. Has anyone seen a nicer alternative, or a simple way of altering the functionality of ASP.NET MVC's HTML extensions without having to rewrite the entire extension?
Use an underscore in the data attribute name, and the HTML helper will convert it to a hyphen when it renders the attribute (this works in ASP.NET MVC 3 and later). It treats the underscore as a hyphen because a C# identifier cannot contain a hyphen.
<%= Html.TextBox("name", value, new { @data_foo = "bar"}) %>
The answer provided at ActionLink htmlAttributes suggests using underscores instead of hyphens. MVC.Net is supposed to emit hyphens instead of the underscores when sending the page to the browser.
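The underscore-to-hyphen conversion can be sketched with a few lines of reflection. This is only an illustration of the idea, not MVC's own code: `HtmlAttributeSketch.ToHtmlAttributes` is a hypothetical helper (in MVC itself, `HtmlHelper.AnonymousObjectToHtmlAttributes` plays this role, as far as I know).

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class HtmlAttributeSketch
{
    // Hypothetical helper: copies the public properties of an (anonymous)
    // object into a dictionary, replacing '_' with '-' in each name.
    public static IDictionary<string, object> ToHtmlAttributes(object attributes)
    {
        var result = new Dictionary<string, object>(StringComparer.OrdinalIgnoreCase);
        if (attributes == null)
        {
            return result;
        }

        foreach (PropertyInfo property in attributes.GetType().GetProperties())
        {
            string name = property.Name.Replace('_', '-');
            result[name] = property.GetValue(attributes, null);
        }

        return result;
    }
}
```

Calling `ToHtmlAttributes(new { data_foo = "bar" })` yields a dictionary with the key `data-foo`, which is what the verbose dictionary syntax in the question builds by hand.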

Should I set charset/codepage when working with digits only?

I have C# code that returns coordinates:
<%@ Page Language="C#" CodePage="65001" CodeFile="codefile.aspx.cs" Inherits="codefile" %>
Response.Write(coordinates());
It outputs something like this:
77.0444687 12.9120790
Do I need to set the CodePage="65001"?
Is that appropriate?
Since the CodePage value of "65001" is the Windows implementation of UTF-8 and the default encoding for ASP.NET is UTF-8, then it is not necessary to use this CodePage value, but it is not inappropriate either. You are just restating the default value. I suppose if the default changes in newer versions of the .NET Framework, then explicitly stating this value would be more useful.
Read Code Page Identifiers for more information.
Every page has a CodePage anyway: it is first set in the machine-level web.config, can be overridden in the site's web.config, and can finally be overridden in the page declaration at the top of the page.
For numbers you do not need to change the code page from the system default, which is probably UTF-8.
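The reason digits are unaffected is that ASCII characters have the same byte representation in UTF-8. A minimal sketch to check this (the class and method names are made up for illustration):

```csharp
using System;
using System.Linq;
using System.Text;

public static class CodePageSketch
{
    // Returns true when the text encodes to identical bytes in ASCII and UTF-8,
    // which is the case for digits, spaces and full stops.
    public static bool SameBytesInAsciiAndUtf8(string text)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(text);
        byte[] ascii = Encoding.ASCII.GetBytes(text);
        return utf8.SequenceEqual(ascii);
    }
}
```

For the sample output `77.0444687 12.9120790` the two byte sequences are identical; a character such as `£` is where they would start to differ.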

Does using razor in WebMatrix mitigate an XSS threat?

I have purposefully (for testing) assigned the following variable in WebMatrix C#:
string val = "<script type='text/javascript'>alert('XSS Vector')</script>";
Later in the page I have used razor to write that value directly to the page.
<p>
@val
</p>
It writes the text, but in a safe manner (i.e., no alert scripts run)
This, coupled with the fact that if 'val' contains an html entity (e.g., &lt;) it also writes exactly "&lt;" and not "<" as I would expect the page to render.
Is this because C# runs first, then html is rendered?
More importantly, is using razor in this fashion a suitable replacement for html encoding, when used like this?
The @Variable syntax will HtmlEncode any text you pass to it; hence you are seeing literally what you set the string value to. You are correct that this is for XSS protection. It is Razor itself that does this, through the @Variable syntax.
So basically, using the @Variable syntax is not so much a 'replacement' for Html Encoding; it is HTML encoding.
Related: If you ever want some string to render as HTML, you would use this syntax in Razor:
@Html.Raw(Variable)
That causes the Html Encoding not to be done. Obviously, this is dangerous to do with user-supplied input.
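To see why no alert runs, and why an entity such as `&lt;` shows up literally, it helps to look at what the encoded string contains. This is a rough sketch using `System.Net.WebUtility`; `RazorEncodingSketch` and its two methods are hypothetical names that only approximate what `@val` and `@Html.Raw(val)` write.

```csharp
using System;
using System.Net;

public static class RazorEncodingSketch
{
    // Approximates @val: the value is HTML-encoded before it is written.
    public static string Encoded(string value)
    {
        return WebUtility.HtmlEncode(value);
    }

    // Approximates @Html.Raw(val): the value is written unchanged.
    public static string Raw(string value)
    {
        return value;
    }
}
```

The encoded form of the test string contains no `<` at all, so the browser never sees a script element. An existing entity has its ampersand encoded as well, which is why the entity text itself is displayed.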

ASP.NET automatically converts & to &amp;

Minor issue, but it's driving me nuts nonetheless.
I'm building a url for a <script> tag include to be rendered on an ASP.NET page, something like this:
<script src='<%= string.Format("http://example.com/page.aspx?a={0}&b={1}&c={2:0.00}", A, B, C)%>' type='text/javascript'></script>
Problem is when this is rendered, the & characters are replaced with &amp;:
<script src='http://example.com/page.aspx?a=xxx&amp;b=zzz&amp;c=123.45' type='text/javascript'></script>
I was expecting this, obviously:
<script src='http://example.com/page.aspx?a=xxx&b=zzz&c=123.45' type='text/javascript'></script>
However, if I render the url directly, outside the <script> tag, it looks ok! Just doing
<%= string.Format("http://example.com/page.aspx?a={0}&b={1}&c={2:0.00}", A, B, C) %>
Renders this:
http://example.com/page.aspx?a=xxx&b=zzz&c=123.45
What gives? And how do I stop this madness? My OCD can't take it!
As @Falkon and @AVD have said, ASP.NET is automatically doing the "right" thing in the <script> tag. See the W3C recommendation - C.12. Using Ampersands in Attribute Values (and Elsewhere)
In order to ensure that documents are compatible with historical HTML user agents and XML-based user agents, ampersands used in a document that are to be treated as literal characters must be expressed themselves as an entity reference (e.g. "&amp;").
I'm not entirely sure why ASP.NET doesn't do the same thing in the rest of the page (could be any number of good reasons), but at least it's correcting the ampersand in the script tag. Conclusion: While you may be cursing ASP.NET for "scrambling" your url, you may want to thank it instead for helping your webpage be standards compliant.
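The encoded attribute value is not a different URL: a browser decodes `&amp;` back to `&` before it requests the script. A small sketch under that assumption, with hypothetical helper names and `WebUtility.HtmlDecode` standing in for the browser's attribute parsing:

```csharp
using System;
using System.Globalization;
using System.Net;

public static class ScriptUrlSketch
{
    // Builds the URL from the question; the invariant culture keeps the
    // decimal separator a full stop regardless of server settings.
    public static string BuildUrl(string a, string b, decimal c)
    {
        return string.Format(
            CultureInfo.InvariantCulture,
            "http://example.com/page.aspx?a={0}&b={1}&c={2:0.00}",
            a, b, c);
    }

    // Encodes the URL for use inside an HTML attribute value.
    public static string ForAttribute(string url)
    {
        return WebUtility.HtmlEncode(url);
    }
}
```

Encoding the URL turns each `&` into `&amp;`, and decoding the attribute value returns the original URL, so the request sent to example.com is the same either way.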
Maybe MvcHtmlString.Create() or Html.Raw()? Both take a single string, so format the URL first:
<script src='<%= MvcHtmlString.Create(string.Format("http://example.com/page.aspx?a={0}&b={1}&c={2:0.00}", A, B, C)) %>' type='text/javascript'></script>
or
<script src='<%= Html.Raw(string.Format("http://example.com/page.aspx?a={0}&b={1}&c={2:0.00}", A, B, C)) %>' type='text/javascript'></script>
I can work it out. I just make a method:
public void BuildUrl(String baseUrl = "", String data = "")
{
Response.Write(baseUrl + data);
}
and use it in my html page like this:
<button type="button" class="btn_new" ref="<% this.BuildUrl(this.BaseUrl + "Master/Tanker_Edit.aspx?", "type=new&unique_id=" + Session.SessionID); %>">New</button>
The result:
<button type="button" class="btn_new" ref="http://kideco.local/Master/Tanker_Edit.aspx? type=new&unique_id=emxw1pkpnwcxpn1cl2cf04zv">
Around the comment use Server.HtmlEncode( yourString )
It will automatically escape the double or single quotes for you, as well as ampersands (&) and less-than and greater-than signs, etc.
Instead of <%=, which can sometimes automatically HTMLEncode things it writes to the response stream, try using <%:. This is a new(ish) code expression nugget syntax added with ASP.Net 4.0. This new syntax will still HTMLEncode most things by default, but there is a special IHtmlString interface you can use to explicitly tell this new nugget that you do not want to HTMLEncode this data, and thus avoid double encoding. You should pretty much always use the newer <%: and pretty much never use the older <%=... though of course there will be exceptions to this.
More details are available here:
http://weblogs.asp.net/scottgu/archive/2010/04/06/new-lt-gt-syntax-for-html-encoding-output-in-asp-net-4-and-asp-net-mvc-2.aspx
"&" is a reserved character in HTML and XML, and consequently ASP.NET. The "&" gets converted by ASP.NET to &amp; because that is the code to display that character on the web.
You might find your answer in the answers on this question : ASP.Net URLEncode Ampersand for use in Query String
Hope that helps, good luck!

Html Encoding of output in legacy ASP.NET site

I have a legacy ASP.Net site (recently upgraded to .NET 4.0) which never had Request Validation turned on and it doesn't Html encode any user input at all.
My solution was to turn on request validation and to catch the HttpRequestValidationException in Global.asax and redirect the user to an error page. I don't Html Encode the user input as I'll have to do it in thousands of places. I hope my approach will stop any XSS vectors getting saved into database.
However, in case there is already an XSS vector stored in the database, I reckon I should also Html encode all output. Unfortunately I have very limited dev and test resources to achieve this. I came up with a list of changes I need to go through:
Search and Replace all <%= %> with <%: %>.
Search and Replace all Labels with Literals and add Mode="Encode".
Wrap all eval() with HtmlEncode.
My questions are:
Is there any simpler way of turning on all output to be automatically Html encoded?
Am I missing anything from above list?
Any pitfalls I should be careful about?
Thanks.
Search and Replace all <%= %> with <%: %>.
Don't forget the <%# and Response.Write which will be harder to replace
Search and Replace all Labels with Literals and add Mode="Encode".
But you will lose all formatting on the previously generated spans, break the DOM
and the corresponding js/css
You would also have to search all Literals with Mode="PassThrough" and set them to Encode
Wrap all eval() with HtmlEncode.
Yes, it seems like a subset of the <%# matter above
Also, you could have some custom controls with funky render method
Assuming there is "only" a relational DB as back-end, if I had access to the DB, I would first identify the problematic tables and columns whose values contain markup.
I would then:
cleanup as best as I can those values in DB.
ensure HtmlEncoding of the corresponding outputs in my pages
I could then go for a basic global replace <%= becoming <%: and sanitize outputs on the long run.
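The global replace itself is mechanical. A minimal sketch, assuming the page markup has been read into a string (`NuggetRewriteSketch` is a made-up name, and reading and writing the .aspx files is left out):

```csharp
using System;

public static class NuggetRewriteSketch
{
    // Rewrites the non-encoding output nugget <%= to the encoding nugget <%:.
    // Data-binding nuggets (<%#) and Response.Write calls are not touched
    // and still need to be reviewed by hand.
    public static string EncodeOutputNuggets(string markup)
    {
        return markup.Replace("<%=", "<%:");
    }
}
```

Because the replacement is purely textual, any `<%=` that intentionally emits markup will start rendering encoded text, so the result needs testing page by page.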
