Word Automation Multiple Paste Problem - c#

Is there a better way to paste HTML fragments into a Word document than via the clipboard from C#?
using Word = Microsoft.Office.Interop.Word;
I'm using some code that puts HTML into the clipboard:
HtmlFragment.CopyToClipboard(changedText);
I have a selection in Word (from a form field) and I call:
word.Selection.Paste();
But sometimes it just throws a COM exception. If I add
Thread.Sleep(100);
I can get it to work, but that's not ideal.
The Insert methods look like a better option but there is no Insert from HTML.
So what's the best way to insert lots of HTML fragments into Word quickly using the automation interfaces?
Edit
Some good advice in the responses, but the issue turned out to be a simple <br> tag causing Word to fail on paste.

For interop, instead of Selection.Paste you'll want to use Selection.PasteSpecial with a WdPasteDataType of wdPasteHTML.
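A minimal sketch of that call, assuming C# 4 or later (so the optional ref parameters of the COM method can be left out) and an already running Word.Application like the `word` object in the question. The Retry helper is hypothetical and not part of Interop; it retries on COMException instead of relying on a fixed Thread.Sleep(100), on the assumption that the failure is transient.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;
using Word = Microsoft.Office.Interop.Word;

static class HtmlPaste
{
    // Hypothetical helper: run the action, retrying a few times if COM reports a failure.
    public static void Retry(Action action, int attempts = 5, int delayMs = 50)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                action();
                return;
            }
            catch (COMException) when (attempt < attempts)
            {
                Thread.Sleep(delayMs); // brief pause before the next attempt
            }
        }
    }

    // Pastes the HTML currently on the clipboard at the selection, as HTML.
    public static void PasteHtml(Word.Application word)
    {
        Retry(() => word.Selection.PasteSpecial(
            DataType: Word.WdPasteDataType.wdPasteHTML));
    }
}
```

If the exception is caused by the HTML itself rather than by timing, retrying will not help and the last COMException is rethrown to the caller.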
If you're using the new formats of Word (i.e. 2007/2010), you could give up interop altogether and just go with WordprocessingML (using the Open XML SDK, or just free-hand it with LINQ and System.IO.Packaging). Or you could use it in conjunction with Interop if that was a need.
If you're using Open XML, you could just use altChunk to import HTML. Here's an example (which includes an example for HTML) at How to Use altChunk for Document Assembly. And another (fresh off the presses - it was released today): Importing HTML that contains Numbering using altChunk.
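As a rough sketch of the altChunk approach, assuming the Open XML SDK (DocumentFormat.OpenXml) is referenced and the target .docx already exists. The class name, method name and the chunkId value are placeholders chosen for illustration; the ID only has to be unique among the relationship IDs of the main document part.

```csharp
using System.IO;
using System.Linq;
using System.Text;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static class AltChunkImporter
{
    // Appends an HTML fragment to an existing .docx as an altChunk.
    public static void AppendHtml(string docxPath, string html, string chunkId)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(docxPath, true))
        {
            MainDocumentPart main = doc.MainDocumentPart;

            // Store the HTML in its own part inside the package.
            AlternativeFormatImportPart part = main.AddAlternativeFormatImportPart(
                AlternativeFormatImportPartType.Html, chunkId);
            string page = "<html><head><meta charset=\"utf-8\"/></head><body>"
                          + html + "</body></html>";
            using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(page)))
            {
                part.FeedData(stream);
            }

            // Reference the part from the body; Word converts it when it opens the file.
            var altChunk = new AltChunk { Id = chunkId };
            Body body = main.Document.Body;
            SectionProperties sectPr = body.Elements<SectionProperties>().LastOrDefault();
            if (sectPr != null)
                body.InsertBefore(altChunk, sectPr); // keep section properties last
            else
                body.AppendChild(altChunk);

            main.Document.Save();
        }
    }
}
```

A call might look like AltChunkImporter.AppendHtml("report.docx", "<p>Hello</p>", "htmlChunk1"). The HTML stays embedded as-is until Word opens and re-saves the document, so tools that read the raw XML will not see converted paragraphs.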

+1 to Otaku's comments, though, generally speaking, I've found it best to use the various Range.* functions for inserting data rather than the Selection object or pasting through the clipboard. The main reason is that if you paste through the clipboard, you overwrite whatever was on the clipboard (which might not be what the user wants to happen).
The Selection object applies across all open Word documents, which can get you in trouble in some cases. Unfortunately, there are a few things that you just about can't do any other way.
And there are some things (like altering text at the current cursor position) that you MUST use the Selection object for.

+1 to DarinH's comments. Also worth noting: with Range you can paste at any place in the document without having to change the selection (the cursor in the document).
Sometimes PasteAndFormat throws an exception on freshly created documents. Check my reply here if that happens: https://stackoverflow.com/a/65796482/15001063

Related

Embed word document into another WITHOUT icon

How can I embed a Word document into another Word document via the OpenXML SDK so that it shows the content, not a Word icon? That is, the way we do it manually in Word: Insert object from file -> WITHOUT checking "Display as icon"?
I've found this article, but it uses an icon. I've also tried to use OpenXML SDK Productivity Tool, but shows only generated binary data.
EDITED:
I use the following code:
DrawAspect = OleDrawAspectValues.Content
and then I add the image part:
var imagePart = mainDocumentPart.AddNewPart<ImagePart>("image/x-emf", imagePartId);
GenerateImagePart(imagePart);
But my image part is just an array of bytes of Word's icon.
So, in this case the following happens: when I open the generated document, it shows the embedded document as an icon, but when I double-click the embedded document, edit it and save the changes, the embedded document is shown as content. So maybe it's possible in some way to show this content without editing the embedded document? Instead of the bytes of Word's icon, should I use the bytes of a screenshot of the document?
Not sure I described it clearly, so please ask.
I'm afraid what you are asking for is almost impossible.
The only difference as far as the word file is concerned between the icon and the embedded file, is the image.
When you don't use an icon, Word pretty much just takes a screenshot of the document you are embedding and inserts that in place of the icon graphic.
I've uploaded an example I grabbed from a Word file I made. Found this little gem in the /media folder inside the .docx file.
So basically, if you can't live with the icon, your only choice is to somehow grab a picture of the Word file you want to embed and insert that instead of the icon image.
How you'd go about that can't be pretty. First of all the open xml sdk contains no such functionality. I tried playing a bit around with office interop as well, but no luck.
I only see two possible ways to achieve this.
The first one is via Interop. You'll need to install a "pretend printer" like the ones that print to PDF instead of sending the job to a printer. This one, however, needs to print to an image format. The format of the file in the media folder was .emf, but I'm not positive that's a requirement.
Anyway, should the above somehow be possible, you could embed that picture, pretty much using the example you link from Microsoft, and just change the size of the "icon", which would now be an image of the document.
The second possibility would be to open the Word document as a process, set the zoom to 72% (or whatever makes the document the only thing visible on your desktop), then grab a print screen, crop it down to just the document and use that as your image for the embedding.
For the record, I don't recommend you do any of the above, but those are the only options I see.
Should someone have a better solution to this I'm all ears.
Finally, should you decide that you want to push on with this, I'll be happy to code up an example of option number 2 if you reply and tell me you'd like that.
Kaspar
There is a nice wrapper API (Document Builder 2.2) around open xml specially designed to merge documents, with flexibility of choosing the paragraphs to merge etc. You can download it from here.
Using this tool you can embed a paragraph of another word document or entire word document as per your requirement.
The documentation and screen casts on how to use it are here.
Hope this helps.

How to put text with headings on clipboard?

If I copy text (using the cursor etc. I don't mean programmatically) from a web page or Word document, and paste it in a Word document - Word knows what text is a heading, and what is simple text. I want to do the same thing (programmatically) - put text on the clipboard and specify that part of it is heading1, part heading2... and part is simple text.
I found this class to put html text (which can have headings) on the clipboard, but was wondering:
a) That's from January 2007. Perhaps there's a simpler way now.
b) HTML only allows up to 6 heading levels. (I actually tried h7 but Word didn't recognize it.) Perhaps there's some way to have unlimited heading levels like Word does.
I don't think clipboard handling has been updated in the latest versions of the .NET Framework.
I think that more complex updates/adding content to a word document may be achieved using ole automation or the open xml sdk.
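For reference, the HTML clipboard format (CF_HTML) that such a helper class has to produce can be sketched in a few lines. This is an illustration under assumptions, not the linked class: the CfHtml name and Build method are made up here, and the header uses 10-digit zero-padded byte offsets so that its length stays constant.

```csharp
using System.Globalization;
using System.Text;

public static class CfHtml
{
    private const string HeaderTemplate =
        "Version:0.9\r\n" +
        "StartHTML:{0:D10}\r\n" +
        "EndHTML:{1:D10}\r\n" +
        "StartFragment:{2:D10}\r\n" +
        "EndFragment:{3:D10}\r\n";

    // Wraps an HTML fragment (e.g. "<h1>Title</h1><p>Text</p>") in a CF_HTML envelope.
    public static string Build(string fragment)
    {
        const string prefix = "<html><body><!--StartFragment-->";
        const string suffix = "<!--EndFragment--></body></html>";

        // Offsets are byte counts in UTF-8, measured from the start of the data.
        int headerLength = Encoding.UTF8.GetByteCount(
            string.Format(CultureInfo.InvariantCulture, HeaderTemplate, 0, 0, 0, 0));
        int startHtml = headerLength;
        int startFragment = startHtml + Encoding.UTF8.GetByteCount(prefix);
        int endFragment = startFragment + Encoding.UTF8.GetByteCount(fragment);
        int endHtml = endFragment + Encoding.UTF8.GetByteCount(suffix);

        return string.Format(CultureInfo.InvariantCulture, HeaderTemplate,
                   startHtml, endHtml, startFragment, endFragment)
               + prefix + fragment + suffix;
    }
}
```

In a Windows Forms application the result could then be placed on the clipboard with Clipboard.SetText(CfHtml.Build(html), TextDataFormat.Html) from an STA thread. Heading levels are still limited to what HTML offers (h1 to h6), so this does not address point b).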

How to read bookmarks value in word document, C++\CLI using open xml sdk 2.0

I've created a word document using openXML SDK in C++\CLI in which I've entered Bookmarks,
I need to open that word document and search for the bookmarks present in it and replace it with some text value.
Please suggest the above with sample code or any links which I can refer.
Thanks in advance
I suggest being a lot more specific. A bookmark can have paragraphs, images, tables, textboxes, etc. all in it. It can also start in the middle of a table and end outside the table. So replacing what's inside it can be very problematic.
So I'm going to take a guess as to what you want and from that might have an answer for you. I am guessing you want something where you place tags in the document and then your program can replace those tags with data. Instead of bookmarks use fields. There are a number of mailmerge fields that work great for this.
If this will work for you, then for the actual code, Descendants is the main thing you need.
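To illustrate what a Descendants-based lookup might look like, here is a sketch in C# (the same Open XML SDK calls should be reachable from C++/CLI). It only handles the simple case flagged above as safe: a bookmark whose start and end markers sit inside the same paragraph. The class and method names are invented for this example.

```csharp
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static class BookmarkFiller
{
    // Replaces the content of a bookmark with plain text.
    // Returns false if the bookmark is missing or is not contained in one paragraph.
    public static bool SetBookmarkText(string docxPath, string bookmarkName, string text)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(docxPath, true))
        {
            Body body = doc.MainDocumentPart.Document.Body;

            BookmarkStart start = body.Descendants<BookmarkStart>()
                .FirstOrDefault(b => b.Name != null && b.Name.Value == bookmarkName);
            if (start == null || start.Id == null || !(start.Parent is Paragraph))
                return false;

            // The matching end marker must be a later sibling in the same paragraph.
            BookmarkEnd end = start.ElementsAfter().OfType<BookmarkEnd>()
                .FirstOrDefault(e => e.Id != null && e.Id.Value == start.Id.Value);
            if (end == null)
                return false; // spans paragraphs or tables: not handled in this sketch

            // Drop everything between the two markers, then insert the new run.
            foreach (OpenXmlElement node in start.ElementsAfter().TakeWhile(n => n != end).ToList())
                node.Remove();
            start.InsertAfterSelf(new Run(new Text(text)));

            doc.MainDocumentPart.Document.Save();
            return true;
        }
    }
}
```

Bookmarks that cross paragraph or table boundaries are deliberately rejected rather than guessed at, which is the problematic case described in the answer.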

How to create a word document using html written in C#

I'm creating a C# application that has to create a Word document.
I'm using Microsoft.Office.Interop.Word to do this and I've successfully managed to output some Word documents, but creating the content through code is very time-consuming work.
I noticed that Word is able to open HTML pages and show them as normal content, so I created a simple test table in HTML and inserted it into the Word document. But when I output the document the obvious happened: the tags were still there! Word did not interpret the tags as HTML. It just output exactly what I put in there.
How can I tell Word to reformat the text as HTML?
edit: (through the C# code of course)
edit 2: Please note that I'm parsing through some data to make this, so I will end up with about 4 pages of the same table/HTML, and I will need to be able to tell Word to start on the next page each time I've finished a loop. So an HTML-only method will probably not work.
If you're only wanting to output simple HTML content as a Word document, you could always cheat and write out the HTML content with a .doc extension.
Word will open that just fine.
If you need to add a page break, you can use a CSS page-break-before, like so:
<br style="page-break-before: always;"/>
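A small sketch of that trick for the looped-table case in the question: one HTML table per data set, a page break before every table except the first, and the result written to a file with a .doc extension. The HtmlDocWriter class and the shape of the input (rows as string arrays) are assumptions made for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;

static class HtmlDocWriter
{
    // Builds one table per entry, with a page break before every table but the first.
    public static string BuildHtml(IEnumerable<IEnumerable<string[]>> tables)
    {
        var sb = new StringBuilder("<html><body>");
        bool first = true;
        foreach (IEnumerable<string[]> rows in tables)
        {
            if (!first)
                sb.Append("<br style=\"page-break-before: always;\"/>");
            first = false;

            sb.Append("<table border=\"1\">");
            foreach (string[] row in rows)
            {
                sb.Append("<tr>");
                foreach (string cell in row)
                    sb.Append("<td>").Append(WebUtility.HtmlEncode(cell)).Append("</td>");
                sb.Append("</tr>");
            }
            sb.Append("</table>");
        }
        return sb.Append("</body></html>").ToString();
    }

    // Writes the HTML to disk; use a path ending in .doc so Word opens it directly.
    public static void Save(string path, IEnumerable<IEnumerable<string[]>> tables)
    {
        File.WriteAllText(path, BuildHtml(tables), Encoding.UTF8);
    }
}
```

The file is still HTML on disk, so anything that inspects the content rather than the extension will treat it as a web page.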
If you're set on using Interop, having read up a little bit, this post states that you need a converter to insert HTML, and the converters are only accessible when:
you paste HTML from the Clipboard
open/insert HTML from a file
So, this answer looks like it provides a clipboard-based solution : Adding html text to Word using Interop
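The other route mentioned above, inserting HTML from a file, avoids the clipboard entirely. A minimal sketch, assuming C# 4 or later and a Word.Range obtained elsewhere; the HtmlFileInsert class is hypothetical, while Range.InsertFile is the Interop method doing the work:

```csharp
using System.IO;
using System.Text;
using Word = Microsoft.Office.Interop.Word;

static class HtmlFileInsert
{
    // Writes the fragment to a temporary .html file and lets Word's converter import it.
    public static void InsertHtml(Word.Range range, string html)
    {
        string path = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName() + ".html");

        // Encoding.UTF8 emits a byte order mark, which helps Word detect the encoding.
        File.WriteAllText(path, "<html><body>" + html + "</body></html>", Encoding.UTF8);
        try
        {
            range.InsertFile(path, ConfirmConversions: false);
        }
        finally
        {
            File.Delete(path); // clean up even if Word rejects the file
        }
    }
}
```

Collapse the range first if the existing content should be kept, since InsertFile replaces a non-collapsed range. For the page-per-loop requirement, a call such as range.InsertBreak(Word.WdBreakType.wdPageBreak) between inserts is one option.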
However, if there's any money to spend on the project, I can heartily recommend Aspose.Words which will do all of this for you.
As requested by the OP, and to make easier for others to find this solution, here it goes the answer I posted as a comment (plus extra results from testing):
When opening an HTML file, MS Word honors the CSS properties page-break-before and page-break-after. There is a caveat, however:
In "Web design" view, page breaks are never shown (this doesn't mean that they aren't there), just like browsers don't "show" them. And Word opens HTML files in Web design view by default (which makes sense). You need to print the document or switch to some other view (typically "Print design") to see your breaks in all their glory.
So, saving an HTML file with a .doc extension is a viable solution (also tested: Word opens it properly despite the extension).
Note: all the testing was done on MS Word 2003 using this snippet: <html>asdf<br style="page-break-before: always;">new page!</html>
Don't build the document in code. Create it in Word as a template or mail merge template, then use code to merge or replace the field data.
See this answer here
MS Word Office Automation - Filling Text Form Fields And Check Box Form Fields And Mail Merge
And See this from the mothership:
http://msdn.microsoft.com/en-us/library/ff433638.aspx
If you don't want to use an external lib, Interop is too slow for you and neither pure HTML nor mail merge template are flexible enough, you could write your content as text or HTML into one or more files (using C#), create a VBA macro in a Word document which by itself creates a second Word document, reads the content files and does any formatting you want afterwards.
You can run this macro programmatically by starting Word using the command line switch /m.
Another possible approach, if your html is xhtml (i.e. XML compliant), you could use XSLT to convert it to a Word XML format. But this would take a LOOOOOOOOOOONG time to code.
If you don't have to use HTML as the starting point you could simply build the Word XML document yourself rather than using XSLT, which would be easier. Time consuming but possible - it's something I do quite a lot in my work.
If a third party component is an option I would recommend the stuff from Aspose.
I have been pretty happy with their tools so far. The API is a little messy but everything works as one would expect.

C# - Templated Printing from Object(s)

I'm in need of a solution to print or export (pdf/doc) from C#. I want to be able to design a template with place holders, bind an object (or xml) to this template, and get out a finished document.
I'm not really sure if this is a reporting solution or not.
I also don't want to have to roll my own printing / graphics code -- I'd like all display concerns handled in a template.
I initially think of this as something Crystal Reports can do (although I've never used CR), but I'm not sure if I'm abusing the system here -- I'm not really interested in binding ADO.NET datasets at the moment (screw datasets). Can Crystal deal with binding to objects?
Does SSRS or WPF play in this field too?
A subset of WPF-P is XPS which can be used to present your objects via databinding.
One of the best choices if you are already using WPF.
Google Keywords: XPS, FixedDocument, FlowDocument, WPF Printing
Might read through this thread:
http://groups.google.com/group/nhusers/browse_thread/thread/e2c2b8f834ae7ea8
Seems a lot of people like iTextSharp
http://itextsharp.sourceforge.net/
For Word docs, look into Word's Mail Merge feature and Word automation. I did this recently in a form letter printing project. Basically what I did was create a Word template file (file extension .dot) and in this template file I defined MergeFields in a standard form letter. My application queries a database for the records it needs to print and then for each record it returns it matches fields in the database with these merge fields and sends the result (the merged doc) to the printer.
It's working really well and if I had a link that gave a definitive explanation, I'd provide it (check back here, I'll see if I can't find the most useful ones). Hopefully I've provided enough keywords to let you find your own resources. I can go into more detail if you need.
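As a starting point, here is a sketch of a simpler variant than a full mail merge against a data source: it walks the merge fields of an open document and overwrites each one with a value from a dictionary. This is a swapped-in technique for illustration, not the code from the project described above; the MergeFieldFiller class is hypothetical, and it assumes C# 7 or later, that `doc` is the active document, and that Word's default "typing replaces selection" option is on.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;
using Word = Microsoft.Office.Interop.Word;

static class MergeFieldFiller
{
    // Extracts "FirstName" from a field code such as " MERGEFIELD  FirstName  \* MERGEFORMAT ".
    // Quoted names containing spaces are not handled in this sketch.
    public static string MergeFieldName(string fieldCode)
    {
        Match m = Regex.Match(fieldCode ?? "", @"^\s*MERGEFIELD\s+""?([^""\\\s]+)");
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Fill(Word.Application word, Word.Document doc,
                            IDictionary<string, string> values)
    {
        // Walk backwards because overwriting a field removes it from the collection.
        for (int i = doc.Fields.Count; i >= 1; i--)
        {
            Word.Field field = doc.Fields[i];
            if (field.Type != Word.WdFieldType.wdFieldMergeField)
                continue;

            string name = MergeFieldName(field.Code.Text);
            if (name != null && values.TryGetValue(name, out string value))
            {
                field.Select();                 // select the whole field
                word.Selection.TypeText(value); // and type the value over it
            }
        }
    }
}
```

For many records, Word's own MailMerge object with a real data source is likely the better fit, as described above; this variant is mainly useful for filling a single document from one object.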
I've never had to export PDF files, but for a project I'm working on now I'll have to. For a free solution my research has led to iTextSharp (like Will Shaver points out), but I've only done the initial investigations and I have found a few paid solutions I might end up resorting to.
