How do I extract/insert text into RTF string in C#

In a C# console app I need to extract the text from an RTF string, add some more text to it, and then convert it back into RTF. I have been able to do this using the System.Windows.Forms.RichTextBox class, but I find it a bit odd to use a Forms control in a non-Forms app. Is there a better way to do this?

Doing anything with RTF is pretty difficult unless you're using Windows Forms. As stated above, using the Forms control is the easiest way to go.
You could write something yourself, but the RTF spec is pretty complicated.
http://www.biblioscape.com/rtf15_spec.htm
Or you could use a conversion DLL / ActiveX object, of which a large number are available.
http://www.sautinsoft.com/
Or, if you're doing this from Linux, there are also tools available. A cursory glance turns up UnRTF.
http://www.gnu.org/software/unrtf/unrtf.html
I haven't included stuff to turn text back to RTF because I think the RTF specification treats and formats text correctly.

I think you should just shake this feeling of "odd". There's nothing odd about it.

It depends on what you mean by 'better'. You are already using the simplest and easiest way of doing it.

There is nothing wrong with using a user-interface control in a console application or even in a web application. The Windows controls are part of the .NET Framework, so you might as well use them. These controls do not need to be hosted in "forms" in order to work.
Reinventing the wheel, using DLL/ActiveX/OCX, and using Linux are simply not practical answers to your question. The better way is...do what you know. There is actually a performance and maintenance benefit to using existing framework methods rather than the suggested alternatives.
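For what it's worth, here is a minimal sketch of what that looks like in a console app, assuming the project references System.Windows.Forms. The RtfTextHelper class and its method names are made up for illustration; only RichTextBox and its Rtf, Text and AppendText members come from the framework.

```csharp
using System;
using System.Windows.Forms;

static class RtfTextHelper
{
    // Hypothetical helper: returns the plain text contained in an RTF string.
    public static string ExtractText(string rtf)
    {
        using (var box = new RichTextBox())
        {
            box.Rtf = rtf;      // the control parses the RTF for us
            return box.Text;    // plain text without the formatting codes
        }
    }

    // Hypothetical helper: appends plain text and returns the result as RTF.
    public static string AppendText(string rtf, string extraText)
    {
        using (var box = new RichTextBox())
        {
            box.Rtf = rtf;
            box.AppendText(extraText);  // added at the end of the existing content
            return box.Rtf;             // serialized back to RTF by the control
        }
    }

    [STAThread]
    static void Main()
    {
        string rtf = @"{\rtf1\ansi Hello}";
        Console.WriteLine(ExtractText(rtf));
        Console.WriteLine(AppendText(rtf, " world"));
    }
}
```

The control is never shown or added to a form; it is only used as an RTF parser and writer and disposed right away.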

Related

Parsing PCL for Text

Ok, so I know this is a crappy question but it has been driving me crazy all day...
I have a bunch of files containing raw PCL6/PCL XL code from printing jobs run to our printers. What I need to be able to do is somehow parse them so I can search for specific text.
Does anyone know if this is possible or understand PCL enough to suggest a reason why even on basic prints from say notepad the raw text doesn't seem to be visible within the code?
I suppose I should mention that I need to be able to code this into my C# app. Manual converters or the ability to print the PCL are not going to do what I want.

@mcalex is correct: PCL 6 (PCL XL) is a compiled binary format, so you can't read it directly. You need something that can decompile it for you. Pagetech has some solutions for this. You could also look at converting to some other format where the data might be more readable. If the source could be generated in PCL 5 or PS, you "might" have a better chance of reading the data directly (although it's not likely).

Edit HTML files manipulating DOM in a jQuery style

I have a batch of HTML files which need some edits that are easy to perform with jQuery (mainly selecting some nodes and changing their attributes).
My approach to achieve this has been opening them one by one in Google Chrome, executing the jQuery code in the console, and then copying the resulting DOM back to my HTML editor.
Since what I'm currently doing takes a lot of time, and since every file needs the same edit (i.e., the same jQuery/JS code will work for every HTML file), I am considering writing a script/program to do this.
However, I am not completely clear about which of the following approaches (if any) I should take to accomplish this task.
1. Write a JavaScript script with jQuery using some FileSystem/File manipulation library (which one?)
2. Write a Java or C# program using some jQuery-based library (like CsQuery)
3. Find a plugin for one of my editors (Aptana, Notepad++, Eclipse, etc.) or a completely different editor that supports jQuery-like commands for editing (just as Notepad++ supports regex replacement). This would be slow with big batches, but at least it would allow me to avoid the annoying copy/paste to/from Chrome.
Is one of these approaches the right way to accomplish what I need? (Is there a right way to do this?) Which would be the most straightforward?
I think that #2 would be easier for me since I have a lot more experience in Java and C# than in JavaScript, but I think that maybe that idea would be sort of using a sledgehammer to crack a nut.

You should consider using PhantomJS. It is a headless WebKit browser which can be run from the command line. It accepts a JavaScript or CoffeeScript file as an argument, which can be used to e.g. do something with a web page. Here is an example:
// Create a headless page object.
var page = require('webpage').create();
page.open('http://m.bing.com', function(status) {
    // Run the selector inside the page and return the matching element's text.
    var title = page.evaluate(function(s) {
        return document.querySelector(s).innerText;
    }, 'title');
    console.log(title);
    phantom.exit();
});

I am not sure of the right way, but it sounds like you are familiar with C#, so I would think writing a class library would be the least overhead for automation. Here are some potential solutions:
Scripting Library (e.g., C#.NET) - You can use a library like the one you mentioned or something like ScriptSharp if you want to use DOM manipulation. If the HTML has appropriate closing tags you can also use LINQ to easily navigate the HTML (or something like the HTML Agility Pack found on CodePlex). I would even recommend using Mustache with an HTML file template in C#.
JavaScript Library - If you wanted to stay in pure JavaScript you can use Node.js. There are file manipulation libraries you can use.
Headless Browsers - I haven't thought through how to save the resulting HTML automatically, but you can use something like jsTestDriver or PhantomJS.
You can go with the plugins in editors as well, but I would stick with a Java, C#, Python, etc. library that you can potentially call from an existing application or schedule as a job/service.
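As a rough sketch of the C# route, assuming the HtmlAgilityPack NuGet package is installed: the edit below (adding target="_blank" to every link) is only a stand-in for whatever your jQuery code does, and the BatchHtmlEditor class and its method names are made up for illustration.

```csharp
using System.IO;
using HtmlAgilityPack; // NuGet package, assumed to be installed

static class BatchHtmlEditor
{
    // Hypothetical edit, roughly $("a[href]").attr("target", "_blank") in jQuery.
    public static string OpenLinksInNewTab(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes takes XPath and returns null (not an empty list) when nothing matches.
        HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
        if (links != null)
        {
            foreach (HtmlNode link in links)
                link.SetAttributeValue("target", "_blank");
        }
        return doc.DocumentNode.OuterHtml;
    }

    static void Main(string[] args)
    {
        // args[0] is assumed to be a folder of .html files; each file is overwritten in place.
        foreach (string path in Directory.GetFiles(args[0], "*.html"))
            File.WriteAllText(path, OpenLinksInNewTab(File.ReadAllText(path)));
    }
}
```

The selectors are XPath rather than CSS here; if you want jQuery-style selectors in C#, that is what CsQuery (mentioned in the question) is for. Try it on a copy of the files first, since the parser may normalize some markup when it writes the document back out.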

Perform diff on RTF and preview changes

I am looking to compare two RTF files and provide a way of highlighting the differences. Does anybody know of any way I can do this? The project is a .NET project; however, in the worst case I'm sure I could implement an unmanaged app.
Thanks

I'd say that depends.
If you just want to view the difference between the RTF "code", i.e. the plain text files, you could employ a library like DiffPlex.
If you want some WYSIWYG view, i.e. including formatting, etc., you could actually use Word (via Automation), much like TortoiseSVN (xdocdiff) does it.

How to draw a base glyph with one color and its attached diacritic with another one?

MS Word has this capability in its Hebrew and Arabic versions. I would like to achieve this in a Windows desktop application, using .NET (maybe with Win32 API calls).

As explained in the link provided by Otaku here, current rich text edit controls cannot handle this (unless you go for the hack the OP in that question used, which did not seem like a very good solution).
You could write code to do this manually yourself, ditching the text edit control completely, but that would probably mean a lot of work. It took Microsoft years to get support for combining diacritics working properly in MSWord. I would search for open source software that has this capability, and look at how other developers have done it. It might be hard to find, though, and you would likely have to step outside .NET-land. Maybe OpenOffice can do this?
This discussion might also be of help.
I am afraid that you will find, though, that you'll have to manually parse the Unicode and assign colors to the correct glyphs. If you want to be complete, that is one heck of a job.
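To give an idea of the "parse the Unicode" part only, here is a small sketch that separates base characters from combining marks using the Unicode general category. It says nothing about the actual drawing, it works per UTF-16 char (so it ignores surrogate pairs), and the DiacriticSplitter class is a made-up name.

```csharp
using System;
using System.Globalization;

static class DiacriticSplitter
{
    // True for characters that attach to a preceding base character
    // (Hebrew niqqud and Arabic harakat are non-spacing marks).
    public static bool IsCombiningMark(char c)
    {
        switch (CharUnicodeInfo.GetUnicodeCategory(c))
        {
            case UnicodeCategory.NonSpacingMark:
            case UnicodeCategory.SpacingCombiningMark:
            case UnicodeCategory.EnclosingMark:
                return true;
            default:
                return false;
        }
    }

    static void Main()
    {
        // Shin + qamats + lamed + vav + holam + final mem.
        string text = "\u05E9\u05B8\u05DC\u05D5\u05B9\u05DD";
        foreach (char c in text)
        {
            // A renderer would pick the color per character here.
            Console.WriteLine("U+{0:X4} {1}", (int)c, IsCombiningMark(c) ? "diacritic" : "base");
        }
    }
}
```

Knowing which characters are marks is the easy half; positioning a separately colored mark correctly over its base glyph is where the real work is.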

How to Syntax Highlight in a RichTextBox [C#]?

How do I syntax highlight in a RichTextBox control AS THE USER TYPES, USING A String[] of keywords? I will be publishing a lightweight notepad to the web soon and I want it to have syntax highlighting. I am using Windows Forms. Can someone post a code example?

RichTextBox syntax highlighting (talks about RichTextBox itself - minimal features but exactly what you asked for here)
A textbox/richtextbox that has syntax highlighting? [C#] (talks mostly about other ways of doing it)

The Scintilla control is an excellent source code editor that includes syntax highlighting amongst a whole range of other features. You can embed it in your own app and there is a .NET wrapper available.
With Scintilla you can specify the keywords and it will then apply the syntax highlighting as you type.

Are you using WinForms or WPF?
If WPF, you could have a look at AvalonEdit. It's free and open source, and it's used in SharpDevelop (open source IDE).

You can change the font or color of selected words in the RichTextBox. Take a look at the Select method and the SelectionFont and SelectionColor properties of the control.
Essentially, you need to iterate through the words, check whether each word is present in your keywords, and then change its font or color using the above-mentioned members.
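A minimal sketch of that loop, assuming a WinForms project: KeywordHighlighter is a made-up helper, blue is an arbitrary color choice, and a regex is used instead of splitting into words by hand. It recolors the whole text on every call, so it is only reasonable for small documents.

```csharp
using System;
using System.Drawing;
using System.Text.RegularExpressions;
using System.Windows.Forms;

static class KeywordHighlighter
{
    // Guards against re-entry in case formatting changes raise TextChanged again.
    private static bool busy;

    public static void Highlight(RichTextBox box, string[] keywords)
    {
        if (busy || keywords == null || keywords.Length == 0) return;
        busy = true;
        try
        {
            // Remember where the user was typing.
            int start = box.SelectionStart;
            int length = box.SelectionLength;

            // Reset everything to the default color first.
            box.SelectAll();
            box.SelectionColor = box.ForeColor;

            // Build \b(kw1|kw2|...)\b with each keyword escaped.
            string[] escaped = Array.ConvertAll(keywords, k => Regex.Escape(k));
            string pattern = @"\b(" + string.Join("|", escaped) + @")\b";

            foreach (Match m in Regex.Matches(box.Text, pattern))
            {
                box.Select(m.Index, m.Length);
                box.SelectionColor = Color.Blue;
            }

            // Put the caret back and make sure newly typed text is not colored.
            box.Select(start, length);
            box.SelectionColor = box.ForeColor;
        }
        finally
        {
            busy = false;
        }
    }
}
```

To run it as the user types, you could hook it up with something like box.TextChanged += (s, e) => KeywordHighlighter.Highlight(box, keywords);. Expect some flicker with this approach, which is one reason the dedicated editor controls mentioned in the other answers exist.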

Not exactly an answer to your question, but have you looked at the text editor component from SharpDevelop? It's quite lightweight (<200kB IIRC), can be easily embedded in WinForms applications and has syntax highlighting for several languages built in.
Otherwise, you might want to look at this CodeProject page. It reformats the RTF while you type, which is not very efficient for large files, and it contains a few creepy catch (Exception) { } blocks, so I'm not sure if I would use it in a life-critical application, but it's definitely a good starting point to see how it can be done.

Syntax highlighting is not an easy task to perform efficiently. Many solutions you can find (like the ones involving modification of the RTF) are one-time solutions. If you want to highlight and un-highlight words on the fly while editing, your code has to be ready for it. I would not reinvent the wheel; I would use ICSharpCode.TextEditor or the like to solve your problem.
