Allow and Limit some HTML characters - c#

I have built a messaging system in which users can send messages to each other and attach files (it is like a simple email system). It lets users send HTML, which the browser then renders. For example, if they enter
<b>Hello</b>
it will be rendered as
Hello
This works fine; however, I am facing one problem. If a user enters
<iframe src="anywebsite"><iframe>
then it will also be rendered by the browser.
How can I allow only some particular HTML tags to be rendered by the browser, with the rest displayed as plain text?
I am using ASP.NET MVC 3.
In my model class I have added the
[AllowHtml] attribute to allow HTML characters.

You could use the AntiXss library:
For example:
@Html.Raw(Sanitizer.GetSafeHtmlFragment("<b>Hello</b>"))
@Html.Raw(Sanitizer.GetSafeHtmlFragment("<iframe src=\"anywebsite\"><iframe>"))
The first will render the Hello text in bold whereas the second won't render anything as it is not considered safe.
You could also check out the AntiSamy project.
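If pulling in a library is not an option, the "allow a few tags, show the rest as text" idea can be sketched by hand: encode everything first, then restore only an exact allow-list of attribute-free tags. This is a minimal sketch and not the AntiXss implementation; the class name SimpleTagAllowList and the chosen tag list are assumptions made for illustration.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class SimpleTagAllowList
{
    // Hypothetical allow-list: simple formatting tags, with no attributes permitted.
    private static readonly Regex AllowedTag = new Regex(
        @"&lt;(/?)(b|i|u|em|strong)&gt;",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    public static string Render(string input)
    {
        // Encode everything first, so no markup survives by default.
        string encoded = WebUtility.HtmlEncode(input ?? string.Empty);

        // Then restore only the exact, attribute-free tags on the allow-list.
        return AllowedTag.Replace(encoded, "<$1$2>");
    }
}
```

In a Razor view the result would be written with @Html.Raw(SimpleTagAllowList.Render(Model.Body)), where Model.Body is a placeholder for your own property. Because tags carrying attributes never match the pattern, something like <b onclick="..."> stays encoded and is shown as text.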

Related

What is the best way to filter bad HTML Content from Posts using AntiXSS Library?

I want to create an ASP.NET website and prevent cross-site scripting. I have a page with Summernote (a WYSIWYG HTML editor) which, when submitted, posts HTML to an MVC action via a form or an Ajax POST.
This method saves the HTML in my database as the content/body of a message. On another page you can display the content, which shows formatting such as lists.
For security reasons I want to filter the content I receive from the client. I am using the AntiXSS library from Microsoft.
Part of my MVC code:
[ValidateInput(false), HttpPost, ValidateAntiForgeryToken]
public ActionResult CreateMessage(string subject, string body)
{
    var cleanBody = Sanitizer.GetSafeHtmlFragment(body);
    // do the database thing here
}
The major problem is that it breaks my <img> elements, because it empties the src attribute.
Should be:
<p><img src="data:image/png;base64,some/ultra/long/picture/code/here" data-filename="grafik.png"></p>
Remaining:
<p><img src="" alt=""><img src=""></p>
What can I do to prevent this?
Is there a way to add an exception rule?
Is there another, better way?
How does it work?
Thanks for help!
There is no such thing anymore as the "AntiXSS Library". It used to be a separate library, but Microsoft moved it into .NET, so it is now under System.Web.Security.AntiXss.
The reason this is important is that you need a sanitizer. The way you are using AntiXss currently takes a list of allowed HTML tags and a list of allowed attributes for those tags, and removes everything else from your HTML. That is not what you want, because you only want to remove JavaScript, regardless of tags or attributes. Take <a> with its href attribute as an example. You most probably want to allow your users to insert links, but you don't want them to be able to insert JavaScript via <a href="javascript: ...">. So you cannot filter out href for <a>, but if you leave it in unchecked, your page will be vulnerable to XSS.
So you want a sanitizer that only removes JavaScript. The original AntiXSS library had a sanitizer, but when Microsoft moved it to .NET, the sanitizer was left out.
So in short, AntiXss will not help you with your current use case.
You can find proper HTML sanitizers, for example Google Caja (which includes a client-side sanitizer), or many others. The point is that even if the sanitizer runs in JavaScript on the client, everything will be fine as long as you never insert your data into the page DOM before sanitizing it.
So you could just save the data from the HTML editor to your database as is, without any transformation (mind SQL injection of course, but current data access technologies should have that covered). When the data is displayed, send it to the client without adding it to the page DOM (as JSON data, for example, properly encoded for JSON of course), then run your sanitizer to remove any JavaScript, and only then add it to the page.
The reason this works well is that your WYSIWYG HTML editor will likely have a preview screen. Don't forget to add sanitization to previews as well, otherwise the preview will be vulnerable to XSS. If sanitization happened on the server, you would have to send the editor contents to the server, sanitize them, and send them back to the user for preview, which is not very user-friendly.
Also note that many WYSIWYG editors support hooking into their rendering and adding such a sanitizer. If an editor does not support this and does not have its own sanitizer, it cannot be made secure with regard to XSS.
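To make the href point above concrete, here is a minimal sketch of an allow-list check on attribute values, written in C# for the case where you decide to do the check on the server after all. It is not a sanitizer by itself; a real sanitizer would parse the HTML and apply checks like these to every URL-bearing attribute. The class name, method names, and the chosen schemes and image types are assumptions for illustration.

```csharp
using System;

public static class UrlSchemeCheck
{
    // Hypothetical allow-lists; adjust to what your messages really need.
    private static readonly string[] LinkSchemes = { "http", "https", "mailto" };
    private static readonly string[] ImageSchemes = { "http", "https" };
    private static readonly string[] ImageDataPrefixes =
    {
        "data:image/png;base64,", "data:image/jpeg;base64,", "data:image/gif;base64,"
    };

    // True only for absolute URLs whose scheme is on the given allow-list.
    private static bool HasAllowedScheme(string value, string[] schemes)
    {
        Uri uri;
        if (!Uri.TryCreate(value, UriKind.Absolute, out uri)) return false;
        return Array.IndexOf(schemes, uri.Scheme) >= 0; // Uri.Scheme is lower-case
    }

    // Rejects javascript: and anything else that is not explicitly allowed.
    public static bool IsAllowedLink(string href)
    {
        if (string.IsNullOrWhiteSpace(href)) return false;
        return HasAllowedScheme(href.Trim(), LinkSchemes);
    }

    // Accepts embedded raster images (as produced by the editor) or http(s) URLs.
    public static bool IsAllowedImageSource(string src)
    {
        if (string.IsNullOrWhiteSpace(src)) return false;
        string value = src.Trim();
        foreach (string prefix in ImageDataPrefixes)
        {
            // Prefix test instead of Uri parsing, since embedded images can be very long.
            if (value.StartsWith(prefix, StringComparison.OrdinalIgnoreCase)) return true;
        }
        return HasAllowedScheme(value, ImageSchemes);
    }
}
```

The check is deliberately deny-by-default: relative URLs and unknown schemes are rejected too, and SVG data URIs are left off the list because SVG can carry script.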

Pass URL with single quote and ampersand in Outlook using a .NET website

I have a web page that uses a webmail service to send emails. This is on a company intranet using a Microsoft Exchange server. My website creates an email with a link to an image handler on my website. In my code, I can print some debug messages and I see:
<img src='http://tav.target.com/VIBEHandler.ashx?id=z064441_45975&type=Amazing'/>
But in the email, when I view the source code, I see this:
<img src="http://tav.target.com/VIBEHandler.ashx?id=z064441_45975&amp;type=Amazing"/>
My single quotes changed to double quotes (no big deal).
&
changed to
&amp;
This causes the URL to not work and images appear as the red "x", indicating a missing image.
How can I preserve my URL?
Your 3rd party emailing service might be converting your HTML document to a valid XML document for compatibility reasons.
http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
Basically, in XML, an ampersand character introduces an XML entity, and cannot be used on its own unless you place the text within a CDATA node. Your 3rd party service seems to just be converting the & to &amp;, which works to safely display the value, but doesn't do much for a URL.
http://www.w3schools.com/xml/xml_cdata.asp
If I were in your situation, I would URL encode the image URL when generating the HTML document that is being sent out. This way, it is both a proper link, and a valid XML string.
HttpUtility.UrlEncode(myUrlString);
http://msdn.microsoft.com/en-us/library/4fkewx0t%28v=vs.110%29.aspx
Hope this helps!
The best solution we could come up with is to use a single variable with multiple values separated with an underscore. This eliminates the need for the '&' symbol entirely and makes everything happy and compatible.
The URL is basically a link to an image handler so we can include images in emails without the use of attachments, shared drives, etc. The image handler can also do things like merge images together to create a single image (WAY better than trying to overlap images in emails which almost NEVER works). I simply added some code to the image handler that can check for and dissect the "meta-variable" in my URL.
http://sample.com?var=ONE_TWO_THREE
http://sample.com?var1=ONE&var2=TWO&var3=THREE
The URL now looks cleaner and can have as many variables as I want, so long as I put everything in exactly the right order, read it all in using the same sequence, don't miss anything, and document everything well. I COULD go one step further and specify what each variable means:
http://sample.com?var=first-Nicolai_last-Dutka_age-34_etc-foobar
But that just tells the whole world what all my variables mean! Hypothetically, I could do:
http://sample.com?var=24154#kja&nl897q45pjkh8&&^HJ435
Then it would be up to me to determine where the breaking points are to bust that up into the variables:
24151, kja*, n1897, 45, etc
Of course, I'm not going to be that complex and will likely just stick to:
http://sample.com?var=ONE_TWO_THREE
Enjoy!
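For reference, dissecting such a "meta-variable" inside the handler could look roughly like this. It is a minimal sketch, not the handler's actual code; the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class MetaVariable
{
    // Splits "ONE_TWO_THREE" into positional values: ONE, TWO, THREE.
    public static string[] SplitPositional(string value)
    {
        return (value ?? string.Empty)
            .Split(new[] { '_' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Splits "first-Nicolai_last-Dutka" into name/value pairs.
    public static Dictionary<string, string> SplitNamed(string value)
    {
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (string pair in SplitPositional(value))
        {
            int dash = pair.IndexOf('-');
            if (dash <= 0) continue; // skip entries that have no name
            result[pair.Substring(0, dash)] = pair.Substring(dash + 1);
        }
        return result;
    }
}
```

One caveat with this scheme: a value that itself contains the separator (an id such as z064441_45975, for instance) would be split in two, so the separator has to be a character that never occurs in the data.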

Sending HTML in e-mail

I need my .NET desktop app to be able to send various HTML mails, allowing users to create custom templates, including images and possibly CSS style (if they copy/paste the HTML from other sources).
From what I've been reading, it's not that simple:
Images need to be embedded and their links replaced with content IDs
CSS styles containing images also need to be fixed
Background color/image won't work, it's better to wrap the mail in a table and apply the CSS to it
SMTP servers can interpret lines starting with a dot as "end of transmission", so at least a space must be added to all such lines
Who knows what else
My questions are:
Is there anything else I should take care of?
Is there a library which already does this so that I don't reinvent?
One thing I can think of: make use of alternate views for those recipients whose mail clients can't or won't accept HTML emails (or who have it turned off). That way they'll get a plain text version, in which you could include a link to an HTML version live on the web if they decide they want to view it.
I have also heard that not including a plain text version increases your likelihood of being marked as spam. This is because many mail filters compare the plain text and HTML versions of a message; if they differ too wildly, it's not a good sign for you :-)
Other spam indicators include html messages which have more pictures than text, and generally sloppy html - broken css, bad links, missing tags etc - consider using some sort of markup validator before sending.
I have found the following CodeProject article, which describes how to embed various image resources into the mail:
Sending the contents of a webpage with images as an HTML mail.
It has some useful examples, although it doesn't seem to include an alternate plain text view, so I will have to add that.
It's still a pity that no-one has put together a library which does this stuff automatically.
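As a starting point, the two ideas from the answers (a plain text alternate view plus an image embedded by content ID) can be combined with System.Net.Mail roughly as follows. This is a minimal sketch under assumptions: the class name, parameter names, and the PNG media type are illustrative, and it handles a single image only.

```csharp
using System.IO;
using System.Net.Mail;
using System.Net.Mime;
using System.Text;

public static class HtmlMailBuilder
{
    // htmlBody is expected to reference the image as <img src="cid:contentId">.
    public static MailMessage Build(string from, string to, string subject,
        string plainText, string htmlBody, Stream imageData, string contentId)
    {
        var message = new MailMessage(from, to) { Subject = subject };

        // Plain text version for clients that cannot or will not show HTML.
        AlternateView textView = AlternateView.CreateAlternateViewFromString(
            plainText, Encoding.UTF8, MediaTypeNames.Text.Plain);

        // HTML version, with the image attached as a linked resource.
        AlternateView htmlView = AlternateView.CreateAlternateViewFromString(
            htmlBody, Encoding.UTF8, MediaTypeNames.Text.Html);
        var image = new LinkedResource(imageData, "image/png") { ContentId = contentId };
        htmlView.LinkedResources.Add(image);

        // Add the plain text view first; clients generally prefer the last view they support.
        message.AlternateViews.Add(textView);
        message.AlternateViews.Add(htmlView);
        return message;
    }
}
```

The returned message can be passed to SmtpClient.Send. Rewriting the image links in user-supplied HTML to cid: references, and fixing images referenced from CSS, still has to be done separately.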

Send MVC actionresult to printer

I have a controller with an action:
SomeController/ActionToBePrinted
ActionToBePrinted() returns an html view.
This action is called from a normal mvc razor view when pressing a button - how would I go about sending the content of the view to a printer when the button is pressed?
Aloha,
Hugo
You can't send it directly to the printer.
I suggest you create a custom ActionResult that returns a PDF file or something similar. See: ASP.NET MVC Action Results and PDF Content
You can also show an HTML page and open the print dialog using JavaScript, like this:
<a href="#" onclick="window.print(); return false;">Click to Print This Page</a>
But the user always has to start the print process; you can't do this programmatically.
You can perform a GET request (e.g. use window.open() and pass in the URL, or use AJAX) and put the returned HTML contents into a new window. Then use
window.print(). Then simply close the window when you are done.
You could tie this directly into a single view by adding something in the body, but I prefer to use JavaScript in these cases. This keeps the design acting as a re-useable object or service that can be used across multiple views. In other words, you setup the controller-model, but no view. Instead, JavaScript steps in as the View.
Keep in mind that HTML is not a print format. So if you need to control the layout, you should be using a print technology such as PDF. XSLT provides an excellent means to create both HTML and PDF output from the same data, although it is a lot more work to create XSLT templates than it is to slap down window.print().
Personally, I have an MVC page acting as a service that takes URL parameters. The page hooks into Adobe XSL-FO and uses the params to drive the output.

how can I get html code from selected text in asp.net?

I want to provide an HTML email function in my application, but I don't know how to get the HTML code (tags like
<br /> <b>,
etc.) from the text the user enters. My application will provide a user-friendly interface that lets users enter a subject and an email body and select an attachment. The development environment is ASP.NET/C#. I use the System.Net.Mail classes to send email. I know I can send HTML email by using the IsBodyHtml property, but how do I get the HTML from the user interface? Does anyone have a solution?
Have you looked into the HTML Editor control provided by the ASP.NET AJAX Control Toolkit? It is probably the easiest route to give the user a friendly interface for generating "rich text" and for you to grab the underlying HTML that it generates.
There are also numerous jQuery plugins available if you wish to go that route.
In fact, this is quite simple. I'd recommend you use a WYSIWYG HTML editor (or google "html editor for c#").
Basically, it uses HTML and JavaScript to make the text box work as an HTML editor, much like the editor we use to write our questions and answers here on SO.
Have you looked for a Rich Text Editor that you can use for your users to enter their message (body)?
They usually have a function to get the HTML output of the text entered.
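Once the editor hands you its HTML output as a string, passing it on to System.Net.Mail is short. A minimal sketch, assuming the editor's HTML has already been read into a string (the class and parameter names are made up for illustration):

```csharp
using System.Net.Mail;

public static class EditorMail
{
    // editorHtml is whatever HTML string the rich text editor returned.
    public static MailMessage FromEditorHtml(string from, string to,
        string subject, string editorHtml)
    {
        // IsBodyHtml tells mail clients to render the body as HTML, not as text.
        return new MailMessage(from, to, subject, editorHtml) { IsBodyHtml = true };
    }
}
```

Since this HTML comes from users, it is worth sanitizing it before sending or redisplaying it anywhere.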
