Ensure safety of HTML submitted by the client, on the server side - C#

I have an MVC 3 web application project, and on one page I use NicEdit to allow the user to enter formatted text.
When the controller receives the data, it is in HTML format... perfect. NicEdit itself doesn't allow script tags or onFoo event attributes to be entered directly in elements, but a user with bad intentions can force scripts in, and that would not be safe.
What can I do to ensure the safety of the incoming data... strip out script tags, find and remove onXyz events... what else?
Also, what is the easiest way to do it? Should I use HtmlAgilityPack, or is there a simple function somewhere that will do the whole job with a single call?
Note: just encoding the whole string is not a valid solution. What I want is a way to ensure that the HTML is safe to render in another page, when someone wants to view the submitted content.
Thanks!

You could use the AntiXss library. Dangerous scripts will be removed:
string input = ...
string safeOutput = AntiXss.GetSafeHtmlFragment(input);
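If you would rather go the HtmlAgilityPack route mentioned in the question, a whitelist pass might look roughly like the sketch below. The class name HtmlWhitelist, the allowed tag and attribute sets, and the URL scheme check are illustrative assumptions, not part of either library. A hand-rolled filter like this is a starting point only and is no substitute for a maintained sanitizer:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class HtmlWhitelist
{
    // Assumed whitelist; adjust it to what your editor actually emits.
    private static readonly HashSet<string> AllowedTags =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { "b", "i", "u", "strong", "em", "p", "br", "ul", "ol", "li", "a", "span", "div" };

    private static readonly HashSet<string> AllowedAttributes =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "href", "title" };

    public static string Clean(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html ?? string.Empty);

        // Snapshot the nodes first, because we remove nodes while walking them.
        foreach (var node in doc.DocumentNode.Descendants().ToList())
        {
            if (node.NodeType == HtmlNodeType.Comment) { node.Remove(); continue; }
            if (node.NodeType != HtmlNodeType.Element) continue;

            if (!AllowedTags.Contains(node.Name))
            {
                node.Remove(); // drops the element together with its contents
                continue;
            }

            foreach (var attr in node.Attributes.ToList())
            {
                bool keep = AllowedAttributes.Contains(attr.Name);
                if (keep && attr.Name.Equals("href", StringComparison.OrdinalIgnoreCase))
                {
                    keep = IsSafeUrl(attr.Value);
                }
                if (!keep) attr.Remove(); // removes onXyz handlers, style, etc.
            }
        }
        return doc.DocumentNode.OuterHtml;
    }

    // Deliberately strict: only absolute http, https and mailto links survive.
    private static bool IsSafeUrl(string value)
    {
        Uri uri;
        return Uri.TryCreate((value ?? string.Empty).Trim(), UriKind.Absolute, out uri)
            && (uri.Scheme == Uri.UriSchemeHttp
                || uri.Scheme == Uri.UriSchemeHttps
                || uri.Scheme == Uri.UriSchemeMailto);
    }
}
```

In the controller you would call HtmlWhitelist.Clean(model.Body) before saving. Whitelisting what is allowed tends to be safer than trying to enumerate everything that is dangerous.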


conditionally run code in cshtml based on URL [duplicate]

I know that on the client side (JavaScript) you can use window.location.hash, but I could not find any way to access it from the server side. I'm using ASP.NET.
We had a situation where we needed to persist the URL hash across ASP.Net post backs. As the browser does not send the hash to the server by default, the only way to do it is to use some Javascript:
When the form submits, grab the hash (window.location.hash) and store it in a server-side hidden input field. Put this in a DIV with an id of "urlhash" so we can find it easily later.
On the server you can use this value if you need to do something with it. You can even change it if you need to.
On page load on the client, check the value of this hidden field. You will want to find it by the DIV it is contained in, as the auto-generated ID won't be known. Yes, you could do some trickery here with .ClientID, but we found it simpler to just use the wrapper DIV, as it allows all this JavaScript to live in an external file and be used in a generic fashion.
If the hidden input field has a valid value, set that as the URL hash (window.location.hash again) and/or perform other actions.
We used jQuery to simplify the selecting of the field, etc ... all in all it ends up being a few jQuery calls, one to save the value, and another to restore it.
Before submit:
$("form").submit(function() {
    $("input", "#urlhash").val(window.location.hash);
});
On page load:
var hashVal = $("input", "#urlhash").val();
if (IsHashValid(hashVal)) {
    window.location.hash = hashVal;
}
IsHashValid() can check for "undefined" or other things you don't want to handle.
Also, make sure you use $(document).ready() appropriately, of course.
[RFC 2396][1] section 4.1:
When a URI reference is used to perform a retrieval action on the
identified resource, the optional fragment identifier, separated from
the URI by a crosshatch ("#") character, consists of additional
reference information to be interpreted by the user agent after the
retrieval action has been successfully completed. As such, it is not
part of a URI, but is often used in conjunction with a URI.
(emphasis added)
[1]: https://www.rfc-editor.org/rfc/rfc2396#section-4
That's because the browser doesn't transmit that part to the server, sorry.
Probably the only choice is to read it on the client side and transfer it manually to the server (GET/POST/AJAX).
Regards
Artur
You may also want to see how to work with the back button and browser history at Malcan.
If you are not actually trying to read the fragment of an incoming GET/POST request, and instead want to know how to access that part of a Uri object you already have in your server-side code, it is available as Uri.Fragment (MSDN docs).
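As a small illustration of Uri.Fragment, assuming you already hold the full URL as a string (for example one sent manually from the client), something like the following would extract it. The class and method names are made up for the example:

```csharp
using System;

public static class FragmentDemo
{
    // Returns the fragment without the leading '#', or an empty string if there is none.
    public static string GetFragment(string url)
    {
        var uri = new Uri(url, UriKind.Absolute);
        return uri.Fragment.TrimStart('#');
    }

    public static void Main()
    {
        // Prints "video01"
        Console.WriteLine(GetFragment("http://example.com/yourDirectory#video01"));
    }
}
```

This only works on a URL string you already have. Request.Url on the server will not contain the fragment, because the browser never sends it.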
Possible solution for GET requests:
New Link format: http://example.com/yourDirectory?hash=video01
Call this function toward the top of the controller or http://example.com/yourDirectory/index.php:
function redirect()
{
    if (!empty($_GET['hash'])) {
        /** Sanitize & validate $_GET['hash'].
         *  If valid: return the string.
         *  If invalid: return empty or false.
         */
        $validHash = sanitizeAndValidateHashFunction($_GET['hash']);
        if (!empty($validHash)) {
            $url = './#' . $validHash;
        } else {
            $url = '/your404page.php';
        }
        header("Location: $url");
        exit; // stop here so nothing else is sent after the redirect
    }
}

How to display data generated by a Rich Text Editor in ASP.NET MVC? [duplicate]

I have a controller which generates a string containing html markup. When it displays on views, it is displayed as a simple string containing all tags.
I tried to use an Html helper to encode/decode to display it properly, but it is not working.
string str= "seeker has applied to Job floated by you.</br>";
On my views,
@Html.Encode(str)
You are close; you want to use @Html.Raw(str).
@Html.Encode takes strings and ensures that all the special characters are handled properly. These include characters like <, > and &.
You should be using IHtmlString instead:
IHtmlString str = new HtmlString("seeker has applied to Job floated by you.</br>");
Whenever you have model properties or variables that need to hold HTML, I feel this is generally a better practice. First of all, it is a bit cleaner. For example:
@Html.Raw(str)
Compared to:
@str
I also think it's a bit safer than using @Html.Raw(), as the concern of whether your data is HTML is kept in your controller. In an environment where you have front-end vs. back-end developers, your back-end developers may be more in tune with what data can hold HTML values, thus keeping this concern in the back-end (controller).
I generally try to avoid using Html.Raw() whenever possible.
One other thing worth noting: I'm not sure where you're assigning str, but a few things concern me about how you may be implementing this.
First, this should be done in a controller, regardless of your solution (IHtmlString or Html.Raw). You should avoid any logic like this in your view, as it doesn't really belong there.
Additionally, you should be using your ViewModel for getting values to your view (and again, ideally using IHtmlString as the property type). Seeing something like @Html.Encode(str) is a little concerning, unless you were doing this just to simplify your example.
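To make the ViewModel suggestion concrete, here is a minimal sketch. NotificationViewModel, NotificationController and the Message property are hypothetical names chosen for the example, not something from the question:

```csharp
using System.Web;
using System.Web.Mvc;

// Hypothetical view model: the property type itself says "this is markup".
public class NotificationViewModel
{
    public IHtmlString Message { get; set; }
}

public class NotificationController : Controller
{
    public ActionResult Index()
    {
        var model = new NotificationViewModel
        {
            // Only wrap strings you trust or have already sanitized.
            Message = new HtmlString("seeker has applied to Job floated by you.<br />")
        };
        return View(model);
    }
}

// In the Razor view the value is then written without Html.Raw:
//   @model NotificationViewModel
//   @Model.Message
```

Razor does not encode values that implement IHtmlString, so the decision about what counts as markup stays in the controller.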
You can use
@Html.Raw(str)
See MSDN for more:
Returns markup that is not HTML encoded.
This method wraps HTML markup using the IHtmlString class, which
renders unencoded HTML.
I had a similar problem with HTML input fields in MVC. The web page only showed the first word of the field.
Example: input field: "The quick brown fox" Displayed value: "The"
The resolution was to put the variable in quotes in the value attribute as follows:
<input class="ParmInput" type="text" id="respondingRangerUnit" name="respondingRangerUnit"
onchange="validateInteger(this.value)" value="@ViewBag.respondingRangerUnit">
I had a similar problem recently, and google landed me here, so I put this answer here in case others land here as well, for completeness.
I noticed that when I had badly formatted HTML, all my HTML tags were actually being stripped out, with just the non-tag content remaining. In particular, I had a table with a missing opening table tag, and then all the HTML tags in the entire string were ripped out completely.
So, if the above doesn't work and you're still scratching your head, also check that your HTML is valid.
I notice even after I got it working, MVC was adding tbody tags where I had none. This tells me there is clean up happening (MVC 5), and that when it can't happen, it strips out all/some tags.

How to extend Html.Raw in order to Sanitize dangerous HTML data before displaying

I inherited a web-app which already has some input fields accepting plain Html from user. (you may understand that the XSS (Cross Site Scripting) bell rings here...! )
The same input is displayed on specific view pages with the use of @Html.Raw (... the bell now rings louder)
And, to make that work, the [ValidateInput(false)] decorator on the Controller and [AllowHtml] on the Model field complete the picture... (what can I say about the bell!!!)
Now, before someone condemns some programmer to death :-) let me make clear that this dangerous input functionality is only allowed to users with a specific admin role. So this is a somewhat controlled situation.
Lately, though, we decided to add some control to this situation, as this functionality creates a risk from the inside, in case of malicious behavior by the admin user himself.
The easiest option to implement would be to disable this whole functionality and add a Markdown editor instead, which would store harmless rich-text input, BUT I would still have to convert all the existing data to Markdown so that it displays correctly.
What I need, though, is to be able to lower the inside risk - not eliminate it - by adding some sort of filter for script tags and other dangerous tags, as an extension of the existing Html.Raw helper.
Can anyone suggest a way to extend or wrap the existing HtmlHelper, please?
Here is the Metadata info:
// Summary:
// Returns markup that is not HTML encoded.
//
// Parameters:
// value:
// The HTML markup.
//
// Returns:
// The HTML markup without encoding.
public IHtmlString Raw(string value);
Using the Microsoft AntiXSS library you can avoid Cross Site Scripting attacks. Install AntiXSS 4.3.0 from NuGet: Install-Package AntiXSS.
@Html.Raw(Microsoft.Security.Application.Sanitizer.GetSafeHtmlFragment(value))
If this doesn't work, then try AjaxControlToolkit's HtmlAgilityPackSanitizerProvider. Using this you can whitelist some tags and attributes.
You can check this SO link.
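To answer the "extend or wrap" part of the question directly, one option is an extension method on HtmlHelper that sanitizes before delegating to the built-in Raw(). The name SafeRaw and the class SafeHtmlExtensions are made up for this sketch; Sanitizer.GetSafeHtmlFragment is the AntiXSS call used above:

```csharp
using System.Web;
using System.Web.Mvc;
using Microsoft.Security.Application;

public static class SafeHtmlExtensions
{
    // Hypothetical helper: sanitizes first, then hands the result to the built-in Raw().
    public static IHtmlString SafeRaw(this HtmlHelper html, string value)
    {
        if (string.IsNullOrEmpty(value))
        {
            return MvcHtmlString.Empty;
        }
        return html.Raw(Sanitizer.GetSafeHtmlFragment(value));
    }
}

// Usage in a view, replacing @Html.Raw(Model.Body):
//   @Html.SafeRaw(Model.Body)
```

Because existing views only need Html.Raw swapped for Html.SafeRaw, the stored data does not have to be converted. Sanitizing once on save, in addition to on display, would likely reduce the risk further.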

response.write writes after html closing tag, how to instead replace a string from the response inside html?

In global.asax I measure the request execution time: in OnBeginRequest I start a stopwatch, and in OnEndRequest I calculate the difference.
Then in the end request I do a Response.Write with the result.
However, it writes the result AFTER the closing html tag; basically it appends to the end.
The current line of code is:
HttpContext.Current.Response.Write(elapsedTime);
Is there an easy way for the Response.Write to REPLACE the string ::actualResult:: within the actual HTML with the actual result string?
I've tried a lot of things, including searching online, but it seems no one needs this or I suck at searching. I thought I could just get the entire response somehow and replace from there, but I'm unsure how to do that... something along the lines of ...Response.GetTheEntireResponse??.Replace... of course that is just wishful thinking ;)
Thanks
You didn't specify if you were using Web Forms, MVC, Web Pages, etc., but normally these frameworks buffer the response that is output by whatever page the user is hitting. Your code in OnEndRequest is coming to the party after all of the page contents (normally ending with an html closing tag) have been output to the buffer. So when you do a Response.Write you are appending to that HTML, and thus it lands outside the closing html tag.
If you want to have the timing be visual on the page you will have to parse into the response and inject your string. This looks hard to do outside of a Page class in ASP.NET.
Messy, and there are better alternatives. Tracing is usually the way these types of things are handled.
You may want to consider writing this information out to a Glimpse trace or somehow hooking into its display... I can't say enough about Glimpse.
Rather than writing the elapsed value to the response, you could store the result in HttpContext.Items and then access this on the view/page:
HttpContext.Current.Items.Add("elapsed", elapsed);
HttpContext.Items Property
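One caveat with this approach: EndRequest normally runs after the page has rendered, so a value stored there arrives too late for the view to read. A variation that may work better is to put the running Stopwatch itself into Items at the start of the request and read it while rendering. The key name "requestTimer" and the helper method are assumptions made for this sketch:

```csharp
using System;
using System.Collections;
using System.Diagnostics;
using System.Web;

public class Global : HttpApplication
{
    private const string TimerKey = "requestTimer"; // assumed key name

    protected void Application_BeginRequest(object sender, EventArgs e)
    {
        // Items lives for exactly one request, so the timer cannot leak between requests.
        Context.Items[TimerKey] = Stopwatch.StartNew();
    }

    // Call while rendering to get the time elapsed so far; returns -1 if no timer was stored.
    public static long ElapsedMilliseconds(IDictionary items)
    {
        var timer = items[TimerKey] as Stopwatch;
        return timer == null ? -1 : timer.ElapsedMilliseconds;
    }
}

// In a Razor view or Web Forms page, where the placeholder used to be:
//   @Global.ElapsedMilliseconds(Context.Items) ms
```

The number shown is the time up to the point where it is rendered, so anything that runs after that (the rest of the page, EndRequest handlers) is not included.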

Reading values from webpage programmatically

I don't know what it is called, but I think this is possible.
I am looking to write something (I don't know the exact name) that will
go to a webpage, select a value from a drop-down box on that page, and read values from the page after the selection. I am not sure whether it is called a crawler or something else; I am new to this, but I heard a long time ago from a friend that this can be done.
Can anyone please give me a head start?
Thanks
You need an HTTP client library (perhaps libcurl in C, or some C# wrapper for it, or some native C# HTTP client library like this).
You also need to parse the retrieved HTML content. So you probably need an HTML parsing library (maybe HTML agility pack).
If the targeted webpage is nearly fixed and has e.g. some comments to ease finding the relevant part, you might use simpler or ad-hoc parsing techniques.
Some sites might send a nearly empty static HTML client, with the actual page being dynamically constructed by Javascript scripts (Ajax). In that case, you are unlucky.
Maybe you want some web service ....
One simple way (but not the most efficient way) is to simply read the webpage as String using the WebClient, for example:
WebClient Web = new WebClient();
String Data = Web.DownloadString("Address");
If the page happens to be well-formed XHTML, you can parse the string into an XDocument and look up the tag that represents the dropdown box (ordinary HTML is often not valid XML, in which case XDocument.Parse will throw). Parsing the string to XDocument is done this way:
XDocument xdoc = XDocument.Parse(Data);
Update:
If you want to read the result of the selected value, and that result is displayed within the page do this:
Get all the items as I explained.
If the page does not make use of models, then you can pass your selected value as a query string argument, for example:
www.somepage.com/?Name=YourItem
Read the page again and find the value
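Putting the pieces from both answers together, a rough sketch of reading a drop-down's items with WebClient and HtmlAgilityPack could look like this. The URL, the select id "myDropDown" and the class name are placeholders, and the sketch assumes the options are present in the static HTML rather than built by JavaScript:

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using HtmlAgilityPack;

public static class DropDownReader
{
    // Returns (value, label) pairs for every <option> of the <select> with the given id.
    public static List<KeyValuePair<string, string>> ReadOptions(string html, string selectId)
    {
        // Some HtmlAgilityPack versions treat <option> as an empty element, which loses
        // its text; removing the flag is harmless where it is not set.
        HtmlNode.ElementsFlags.Remove("option");

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var result = new List<KeyValuePair<string, string>>();
        var options = doc.DocumentNode.SelectNodes("//select[@id='" + selectId + "']/option");
        if (options == null) return result; // SelectNodes returns null when nothing matches

        foreach (var option in options)
        {
            result.Add(new KeyValuePair<string, string>(
                option.GetAttributeValue("value", string.Empty),
                option.InnerText.Trim()));
        }
        return result;
    }

    public static void Main()
    {
        using (var web = new WebClient())
        {
            string html = web.DownloadString("http://example.com/page"); // placeholder URL
            foreach (var item in ReadOptions(html, "myDropDown"))
            {
                Console.WriteLine("{0} = {1}", item.Key, item.Value);
            }
        }
    }
}
```

Unlike XDocument, an HTML parser tolerates markup that is not well-formed XML. Once you know the option values, you can request the page again with the chosen value, as described in the update above.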
