Strip text of all formatting on paste - c#

I have an application that allows the user to create an article. The problem arises when the user pastes from something like Word which comes loaded with a bunch of markup.
I'm using a jQuery editor called tiny_mce which allows the markup. I do an HtmlEncode and HtmlDecode, obviously, but it means that I carry a huge payload of markup.
Is there a way to strip (all) markup from pasted text and just keep the text?
Or is there a way that tiny_mce can show the markup as text?

It's been a while since I used tinyMCE, but when I did I used this paste plugin that did automatic clean-up on paste, including paste from Word.

Strip all HTML markup using Regex: http://weblogs.asp.net/rosherove/archive/2003/05/13/6963.aspx
string stripped = Regex.Replace(textBox1.Text, @"<(.|\n)*?>", string.Empty);
This Regex expression can be applied to the language of choice.
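As a rough sketch of how the regex approach above could be wrapped up, assuming the pasted content reaches the server as an HTML string: the helper name PasteCleaner.StripMarkup is hypothetical, and the extra entity-decoding step is an assumption about what "just keep the text" should mean. A regex is not a full HTML parser, so treat this as a cleanup aid, not a sanitizer.

```csharp
using System.Net;
using System.Text.RegularExpressions;

public static class PasteCleaner
{
    // Same lazy pattern as in the answer: '<', then as little as possible, then '>'.
    private static readonly Regex Tag = new Regex(@"<(.|\n)*?>", RegexOptions.Compiled);

    // Hypothetical helper: removes tags, then decodes entities such as &amp;.
    public static string StripMarkup(string html)
    {
        if (string.IsNullOrEmpty(html)) return string.Empty;
        string withoutTags = Tag.Replace(html, string.Empty);
        return WebUtility.HtmlDecode(withoutTags);
    }
}
```

Calling PasteCleaner.StripMarkup on the posted editor content before saving it would keep the Word markup out of the database.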

I use a simple Windows shell add-in called Pure Text. It overloads the Windows+V key to do a plain text paste.

Related

Best method or control to display text from a file in an asp.net webpage

This may be a totally newbie question, but here it goes. I have an ASP.NET web page that needs to display text from a .txt file. I am trying to figure out what would be the best control or method to do this with. I looked at using an iframe, but this does a very poor job of displaying the text from the file (for instance, there is no word wrap in an iframe). I don't really expect anyone to solve this for me completely, but if you have any suggestions or know of any links to tutorials or explanations where someone has done this, I would be very grateful.
Thanks
You can, for example, add a Literal control, assign File.ReadAllText("yourfile.txt") to the Text property and replace \r\n with <br />. (File.ReadAllLines returns a string array, so it cannot be assigned to Text directly.)
You should just read the text-file in code (using a streamreader for example). Once you have that text, just output it to your web page.
If you're using web forms you could place a label and then set the text of that label.
If you're using MVC you could put it in the ViewBag and then in your view output the value from the ViewBag (or use a custom viewmodel)
You could use a Literal or Label control. Make sure that the control that you use encodes the text in order to avoid XSS vulnerabilities (or encode the text manually if necessary).
It might as well be necessary to substitute line endings with <br/> tags.
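The suggestions above (read the file, encode it, turn line endings into <br /> tags) could be combined roughly as follows. This is a minimal sketch; the class and method names are hypothetical, and it assumes the file is small enough to read in one go.

```csharp
using System.IO;
using System.Net;

public static class TextFileRenderer
{
    // Encodes the raw text first (to avoid XSS), then turns line endings into <br /> tags.
    public static string ToHtml(string text)
    {
        string encoded = WebUtility.HtmlEncode(text ?? string.Empty);
        return encoded.Replace("\r\n", "\n").Replace("\r", "\n").Replace("\n", "<br />");
    }

    // Reads the whole file as one string and converts it.
    public static string FileToHtml(string path)
    {
        return ToHtml(File.ReadAllText(path));
    }
}
```

In a Web Forms page the result could then be assigned to a Literal, for example myLiteral.Text = TextFileRenderer.FileToHtml(Server.MapPath("~/yourfile.txt")), where myLiteral and the file path are placeholders.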

Removing HTML tags from string except hyperlinks and line breaks?

I have strings of HTML markup with normal text in between. I want to remove all HTML tags except hyperlinks and line breaks, so that the result looks like plain Notepad-style text but stays readable (line breaks are kept) and all external links remain visible and clickable for the user.
I have tried some regex solutions but they completely eliminate all HTML markup which I don't want.
Thanks.
Use Html Agility Pack. It seems to be useful for your issue.
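For simple input, a regex with a negative lookahead can also keep only anchors and line breaks. Note that this is a swapped-in technique, not the Html Agility Pack approach suggested above; the class name is hypothetical, and a real parser is the safer choice for messy or untrusted HTML.

```csharp
using System.Text.RegularExpressions;

public static class TagFilter
{
    // Matches any tag whose name is NOT exactly "a" or "br" (opening or closing).
    private static readonly Regex UnwantedTag = new Regex(
        @"<(?!/?(?:a|br)\b)[^>]*>", RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Removes every tag except <a ...>, </a> and <br> variants; text content is left alone.
    public static string KeepLinksAndBreaks(string html)
    {
        return UnwantedTag.Replace(html ?? string.Empty, string.Empty);
    }
}
```

The \b after the tag name keeps tags such as <abbr> or <article> from being mistaken for <a>.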

Highlight a word in textarea in ASP.NET using C#

I have a TextArea control in my ASP.NET page which gets populated with a paragraph containing multiple sentences from the database. After this data gets populated in the TextArea control, I need to search for a few words in them and highlight them in different color. The words that I need to highlight are present inside a table in the database.
My question is : How do I highlight the selected words in a TextArea control using C#?
Please help. Thank you.
The HTML <textarea> tag doesn't contain the ability to add any text formatting. If you want to highlight a portion of your text, then you'll need to display it within a <div>, <span> or some other HTML element.
If you need to make the text editable and still highlight portions of it, then you could use a WYSIWYG HTML editor, such as the jHtmlArea Free, Open Source jQuery plugin.
Shameless Plug: jHtmlArea is a jQuery plugin I created a while back to fit a need I had for a light weight, easily extensible WYSIWYG HTML editor.
You can look into the DynaCloud jQuery plugin, or CodeMirror. Both provide some functionality for highlighting text.
http://johannburkard.de/blog/programming/javascript/highlight-javascript-text-higlighting-jquery-plugin.html
http://codemirror.net/
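For the read-only case mentioned above (rendering the paragraph in a <div> or <span> instead of a <textarea>), a server-side sketch could wrap each matching word in a <span>. The Highlighter class and the CSS class name are hypothetical, and the word list is assumed to have been loaded from the database already.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

public static class Highlighter
{
    // Returns HTML-encoded text with each whole-word match wrapped in <span class="...">.
    public static string Highlight(string text, IEnumerable<string> words, string cssClass)
    {
        var escaped = words.Where(w => !string.IsNullOrEmpty(w)).Select(Regex.Escape).ToList();
        if (escaped.Count == 0) return WebUtility.HtmlEncode(text);

        var pattern = new Regex(@"\b(" + string.Join("|", escaped) + @")\b", RegexOptions.IgnoreCase);
        var sb = new StringBuilder();
        int last = 0;
        foreach (Match m in pattern.Matches(text))
        {
            // Encode the plain text between matches, then emit the wrapped match.
            sb.Append(WebUtility.HtmlEncode(text.Substring(last, m.Index - last)));
            sb.Append("<span class=\"").Append(WebUtility.HtmlEncode(cssClass)).Append("\">");
            sb.Append(WebUtility.HtmlEncode(m.Value)).Append("</span>");
            last = m.Index + m.Length;
        }
        sb.Append(WebUtility.HtmlEncode(text.Substring(last)));
        return sb.ToString();
    }
}
```

Matching is done on the raw text and encoding is applied piece by piece, so entities such as &amp; are never split by a highlight. To use different colors, the method could be called once per word group with a different CSS class.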

Safe HTML in ASP.NET Controls

I'm sure this is a common question...
I want the user to be able to enter and format a description.
Right now I have a multiline textbox that they can enter plain text into. It would be nice if they could do a little html formatting. Is this something I am going to have to handle? Parse out the input and only validate if there are "safe" tags like <ul><li><b> etc?
I am saving this description in an SQL db. In order to display this HTML properly do I need to use a literal on the page and just dump it in the proper area or is there a better control for what I am doing?
Also, is there a free control like the one on SO for user input/minor editing?
Have a look at the AntiXSS library. The current release (3.1) has a method called GetSafeHtmlFragment, which can be used to do the kind of parsing you're talking about.
A Literal is probably the correct control for outputting this HTML, as the Literal just outputs what's put into it and lets the browser render any HTML. Labels will output all the markup including tags.
The AJAX Control Toolkit has a text editor.
Also, is there a free control like the one on SO for user input/minor editing?
Stackoverflow uses the WMD control and markdown as explained here:
https://blog.stackoverflow.com/2008/09/what-was-stack-overflow-built-with/
You will need to check what tags are entered to avoid cross-site scripting (XSS) attacks etc. You could use a regex to check that any tags are on a 'whitelist' you have and strip out any others.
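One conservative way to sketch the whitelist idea: encode everything first, then re-enable only a fixed set of attribute-free tags. The class name and the tag list below are assumptions for illustration, and disallowed tags are left visible as encoded text instead of being stripped. This is not a replacement for a vetted library such as AntiXSS.

```csharp
using System.Net;
using System.Text.RegularExpressions;

public static class SimpleHtmlWhitelist
{
    // Matches encoded tags without attributes, e.g. &lt;b&gt; or &lt;/li&gt;.
    private static readonly Regex AllowedTag = new Regex(
        @"&lt;(/?)(b|i|ul|ol|li)&gt;", RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Encodes the whole input, then turns only whitelisted tags back into real markup.
    public static string Sanitize(string input)
    {
        string encoded = WebUtility.HtmlEncode(input ?? string.Empty);
        return AllowedTag.Replace(encoded, "<$1$2>");
    }
}
```

Because tags with attributes never match the pattern, something like <b onclick="..."> stays encoded and is shown as text.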
You can check out this link for a list of rich text editors.
In addition to the other answers, you will need to set ValidateRequest="false" in the @Page directive of the page that contains the textbox. This turns off the standard ASP.NET validation that prevents HTML from being posted from a textbox. You should then use your own validation routine, such as the one @PhilPursglove mentions.

What do you use (free) to format C# code?

Are there any VS.NET plugins that will format a selection of code for printing or emailing and is also free?
Have you checked the inbuilt formatting provided by VS? Select code and enter key chord Ctrl+K, Ctrl+F.
Or go to (menu) Edit -> Advanced -> Format Selection or Edit -> Advanced -> Format Document.
If you copy your code from Visual Studio and paste it into Word, the syntax highlighting will be kept.
Alternatively, you could take a look at the Copy Source As HTML add-in.
Is this just a matter of using spaces instead of tabs to indent your code?
Try Artistic Style 1.22, http://astyle.sourceforge.net/
It's easy to use, has 3 or 4 predefined styles and is configurable.
Use some kind of tabs-to-spaces function, and make sure the print or email uses a monospaced (aka. typewriter or console) font.
I'm pretty sure Visual Studio had a (somewhat well-hidden) function to convert indenting from tabs to spaces and vice versa.
I'm normally using vim where you can use:
:set expandtab
:%retab
to replace tabs with spaces and:
:set noexpandtab
:%retab
to replace spaces with tabs.
Spaces are better for emailing etc. because no one can agree on the length (in spaces) of a tab.
To turn tabs to spaces, select the code and use Edit -> Advanced -> Untabify Selected Lines.
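If the conversion has to happen in code (for example before pasting a snippet into an email), the tab-to-space expansion described above can be sketched as below. The class name and the default tab width of 4 are assumptions.

```csharp
using System.Text;

public static class TabExpander
{
    // Expands each tab to the next tab stop, so columns stay aligned.
    public static string Untabify(string code, int tabWidth = 4)
    {
        var sb = new StringBuilder();
        int column = 0;
        foreach (char c in code)
        {
            if (c == '\t')
            {
                int spaces = tabWidth - (column % tabWidth);
                sb.Append(' ', spaces);
                column += spaces;
            }
            else if (c == '\n' || c == '\r')
            {
                sb.Append(c);
                column = 0; // a new line starts at column 0
            }
            else
            {
                sb.Append(c);
                column++;
            }
        }
        return sb.ToString();
    }
}
```

Expanding to tab stops, instead of replacing every tab with a fixed number of spaces, keeps trailing comments and aligned columns where they were.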
