Editing Microsoft Word Documents Programmatically - c#

I want to know whether this can be done.
I am building a data dictionary for our software system (a school project), and I'm looking for an automated way to do it. I don't use Microsoft Word (2007) much; I only use it for documenting school work and the like. I want to know if it's possible to create or edit a Word document programmatically from a template.
The idea is that I will create a page in Word containing an empty form that is repeated on every page. For each entry I input into my program, it will fill in the corresponding fields in the form and then move on to the next form.
The purpose is to eliminate copy-pasting (my habit) and to speed up the documentation work.

Word automation, as suggested by others, will lead you to a world of hurt for two major reasons:
Office is not intended to be run unattended, so it can pop up message boxes at any time, and
It is (probably) not licensed to provide Office functionality to computers that don't have Office. If you generate a Word document on a web site using automation, you have to make sure that this functionality cannot be reached from computers that don't have Office installed (unless this rule has changed in recent years).
I have used Aspose.Words; it costs a little, but it works well and is intended for this.

I'm not exactly sure what you want, but creating Word documents with C# shouldn't be a problem:
http://support.microsoft.com/kb/316384

If I understand your purpose correctly, this MSDN article should help:
Manipulating Word 2007 Files with OpenXML

Definitely possible. A fairly easy way of doing it is to use Office Automation. See this KB article for a basic sample: How to automate Microsoft Word to create a new document by using Visual C#
I think the main difference to that sample will be that you'll open your template and do SaveAs instead of creating a new document, but I can't remember exactly.
However, depending on your exact requirements, there might be better alternatives. For example, it's not recommended to do Office Automation on servers (including on webservers), so if that's needed you might want to look at something else.

You can use the COM interop support in the .NET Framework.
Understanding the Word Object Model from a .NET Developer's Perspective
Building COM Objects in C#

As erikkallen mentioned, COM automation is not the best way to do this; I suggest using Open XML. It is easy to use, and document generation will be very fast.
http://blog.goyello.com/2009/08/21/how-to-generate-open-xml-file-in-c-in-4-minutes/
http://msdn.microsoft.com/en-us/library/aa338205(v=office.12).aspx
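The template idea from the question can be sketched without Word and without any third-party library, because a .docx file is a ZIP package whose main text lives in word/document.xml. The sketch below uses only System.IO.Compression and LINQ to XML rather than the Open XML SDK itself. The class name DocxTemplateFiller, the {{FieldName}} placeholder syntax, and the file names are assumptions made for illustration. One known limitation: Word may split a placeholder across several runs, in which case this simple per-run replacement will not find it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

public static class DocxTemplateFiller
{
    // WordprocessingML main namespace used by the elements in word/document.xml.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Replaces every placeholder key found inside a <w:t> text element with its value.
    public static XDocument ReplacePlaceholders(XDocument body, IDictionary<string, string> values)
    {
        // ToList() so the tree is not modified while it is being enumerated.
        foreach (XElement text in body.Descendants(W + "t").ToList())
        {
            string content = text.Value;
            foreach (KeyValuePair<string, string> pair in values)
            {
                content = content.Replace(pair.Key, pair.Value);
            }
            text.Value = content;
        }
        return body;
    }

    // Copies the template, then rewrites word/document.xml inside the copy.
    public static void Fill(string templatePath, string outputPath, IDictionary<string, string> values)
    {
        File.Copy(templatePath, outputPath, overwrite: true);
        using (ZipArchive archive = ZipFile.Open(outputPath, ZipArchiveMode.Update))
        {
            ZipArchiveEntry entry = archive.GetEntry("word/document.xml");
            if (entry == null)
            {
                throw new InvalidDataException("word/document.xml not found; is this a .docx file?");
            }

            using (Stream stream = entry.Open())
            {
                XDocument body = XDocument.Load(stream, LoadOptions.PreserveWhitespace);
                ReplacePlaceholders(body, values);

                // Rewind and truncate before writing the modified XML back.
                stream.Position = 0;
                stream.SetLength(0);
                body.Save(stream, SaveOptions.DisableFormatting);
            }
        }
    }
}
```

A call such as DocxTemplateFiller.Fill("template.docx", "dictionary.docx", values), with values mapping "{{FieldName}}" to "StudentId", would produce a filled copy and leave the template untouched. On the .NET Framework, ZipFile needs a reference to the System.IO.Compression.FileSystem assembly.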

Related

Hide sections in a word document based on users responses to a series of questions

I'm looking for the best way to achieve the following workflow.
Ask a series of questions
Capture the responses
Use the captured answers to hide or show certain parts of a Word document.
Save the completed Word document to a location (TBD).
I'm not a developer, so will need to source one who could pick this up, but before I do I wanted to know the best approach to this workflow.
Appreciate any feedback you can offer.
Cheers
There are several libraries for working with Word (.docx) documents in C#, such as NPOI and DocX. It is not conceptually complicated to programmatically populate a document based on user input and a decision tree, then save it locally or expose it for download via a web interface. Keep in mind, though, that this is only part of the solution: apps have to be hosted, secured, monitored, and so on, and that is where the "hard" part is likely to be.
If you are looking to accomplish this within an enterprise environment that uses Microsoft Office 365, you may not need a developer at all. Microsoft Flow / Microsoft Power Automate lets you build complex workflows such as the one you described. There's a very similar one listed here:
https://flow.microsoft.com/en-us/galleries/public/templates/3c651e28cded46aab2ba40a2c3116f30/create-word-and-pdf-documents-from-microsoft-forms/

ASP.NET and OpenXML - Create formatted/styled xlsx file - Best Solution?

This question might be more subjective, but I'm hoping someone with more experience can guide me in the right direction.
I'm brand new to web development, but I have been coding in C# for a couple of years. My job wants me to convert an existing app to SharePoint 2013, and part of the app generates an Excel report with custom formats and styling. In the original app we used Interop, but apparently, since it's 32-bit and our server is 64-bit, Interop won't work. I thought about just producing a CSV, but our customer is adamant about keeping the styling, so I found OpenXML.
I don't have any experience with OpenXML, but I saw the tool can convert files into code. I loaded our template into the tool and it generated about 2000 lines of code which seems very excessive. Using Interop it's a fraction of the length and seems much easier to read. I'm tempted to just copy all the code over and stick it in a region (which I know most developers hate and I agree looks bad) and put a note at the top saying that if the template ever changes to just redo that region with the new one.
Is that my best option or is there a better alternative? Unfortunately our dev network is pretty closed off (it's a pain to get approval on third party non Microsoft stuff) so I'm limited on third party libraries I can add as well. If there's an option without doing that, that'd be preferred.
If you have one or more templates, just use OpenXML to create a new workbook from a template for each request. Then use code to enter values into named ranges, datasets into rows, and so on.
By the way, ClosedXML makes simple and moderately complex tasks a lot easier.

Where will using Interop fail? Don't use Office.Interop if Office is not installed. late binding?

My Windows Forms application uses Novacode DocX to write a document from a template. The Novacode portion of the project works perfectly and the file saves. The issue is that when I load the document, the fields (the Table of Contents) are not updated after the Novacode portion adds headings and such.
I could, and did, write a macro to update fields on open. This would solve the problem, but not everyone who uses my application will have this macro. I can't save the file as a .docm file with the macro attached for various reasons (assume the file must be ".docx").
What I've found is that the Microsoft.Office.Interop.Word assembly will allow me to call "Fields.Update". My understanding is that this will do the trick, but I can't block users who don't have Word installed from running my application. My understanding is that if I am "using Microsoft.Office.Interop.Word", or have it in my references, the application won't run if someone doesn't have Word.
So I have code that checks whether Word is installed. If I run this and it is installed, can I then use "late binding" to run the interop code? Replies to other related questions point to "NetOffice" as a way to run interop without checking whether Word is installed.
I'm trying to make this as comprehensive as possible with my research. My question is very similar to this one: "how do I easily test the case where my C# application can't find an external assembly?". I would hope this issue can soon be solved for everyone, but I'm not sure it will be.
Side note: if anyone knows a way to update the fields, or even just the existing TOC, of a Word document saved in the ".docx" format without having Word installed, that would be awesome to know and would circumvent my whole issue. I would still like to know the answer to the interop question, though.
Also this is my first real question on StackOverflow, if you have tags to suggest please do so along with your answer. If you have feedback on how I ask my question, I will accept that too, but please don't close/delete the question without any answers. I linked to questions that are similar, but those questions have not gotten responses in a while. I believe I have done everything according to the rules.
This is more of an answer to your "if anyone knows a way to update the fields, or even just the existing TOC, of a word document that is saved in the ".docx" format without having word installed" question, but you might want to look into the Open XML SDK for Office.
This will let you modify .docx files without any dependency on Word being installed.
I found this tutorial, which I think does almost exactly what you want to do using the Open XML SDK.
Many things to say, but I think I found my answers
The main question was: if I add the reference to "Microsoft.Office.Interop.Word" and the client running the application does not have Word, where will the application fail? My understanding now is that it will not fail on launch if the client does not have Word. It will, however, fail when the code that uses "Office.Interop.Word" is reached.
The way to prevent this is a simple registry check. I used a variation of this method to check the registry. Before any of my code that uses "Office.Interop.Word" runs, I check whether the client has Word in the registry. If they don't have Word, I take the proper notification actions for my application. I also wrapped the "Office.Interop.Word" code in a try/catch block as an extra safety measure. In my code, the exception would mean Word is not installed. A variation of the code I used to update fields with "Office.Interop.Word" can be found here.
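The check-then-call pattern described above, combined with the late binding asked about in the question, can be sketched roughly as follows. This is not the code linked in the answer: instead of reading registry keys directly, it looks up the "Word.Application" ProgID (which is itself resolved through the registry), and it uses C# dynamic instead of a compile-time reference to Microsoft.Office.Interop.Word. The class and method names are hypothetical; Fields.Update and TablesOfContents are members of the Word object model.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

public static class WordFieldUpdater
{
    // Returns true only if Word was found and the fields were updated and saved.
    public static bool TryUpdateFields(string docxPath)
    {
        // COM automation of Word exists only on Windows.
        if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        {
            return false;
        }

        // Null when no COM class is registered for Word, i.e. Word is not installed.
        Type wordType = Type.GetTypeFromProgID("Word.Application");
        if (wordType == null)
        {
            return false;
        }

        dynamic word = null;
        try
        {
            // Late binding: members are resolved at run time, so the application
            // needs no reference to the interop assembly.
            word = Activator.CreateInstance(wordType);
            word.Visible = false;

            dynamic doc = word.Documents.Open(Path.GetFullPath(docxPath));
            doc.Fields.Update();
            foreach (dynamic toc in doc.TablesOfContents)
            {
                toc.Update();
            }
            doc.Save();
            doc.Close();
            return true;
        }
        catch (COMException)
        {
            // Word is missing, broken, or could not open the file.
            return false;
        }
        finally
        {
            if (word != null)
            {
                word.Quit(false); // do not prompt to save anything left open
                Marshal.ReleaseComObject((object)word);
            }
        }
    }
}
```

Calling WordFieldUpdater.TryUpdateFields("report.docx") after the DocX portion has saved the file lets the application fall back to a notification when the method returns false.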
Novacode DocX can support .docm files if you change its source code yourself. I did not want to do that, so I didn't use a .docm file. In addition, .docm files trigger security warnings when emailed, so an auto-updating macro is out of the question.

Windows App spellcheck

I was wondering if there is another way to spell check a Windows app instead of what I've been using: "Microsoft.Office.Interop.Word". I can't buy a spell-checking add-on. I also cannot use open source, and I would like the spell check to be dynamic. Any suggestions?
EDIT:
I have seen several similar questions, the problem is they all suggest using open source applications (which I would love) or Microsoft Word.
I am currently using Word to spell check; it slows my application down and causes several glitches. Word is not a clean solution, so I really want to find some other way. Is my only other option to recreate my app as a WPF app so I can take advantage of the SpellCheck class?
If I were you I would download the data from the English Wiktionary and parse it to obtain a list of all English words (for instance). Then you could rather easily write at least a primitive spell-checker yourself. In fact, I use a parsed version of the English Wiktionary in my own mathematical application AlgoSim. If you'd like, I could send you the data file.
Update
I have now published a parsed word list at english.zip (942 kB, 383735 entries, zip). The data originates from the English Wiktionary, and as such, is licensed under the Creative Commons Attribution/Share-Alike License.
To obtain a list like this, you can download all articles on Wiktionary as one huge XML file containing all wiki- and HTML-formatted articles, which is then more or less trivial to parse. Alternatively, you can run a bot on the site. I got help obtaining a parsed file from a user at Wiktionary (I seem to have forgotten his name, though...), and this file (english.txt in english.zip) is a further processed version of the file I got.
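A primitive spell checker built on such a word list could look like the sketch below. It assumes the list is a plain text file with one word per line, which may not match the actual layout of english.txt, and the class name WordListSpellChecker is made up for this example. It only reports unknown words; suggesting corrections would need more work.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

public sealed class WordListSpellChecker
{
    private readonly HashSet<string> words;

    public WordListSpellChecker(IEnumerable<string> wordList)
    {
        // Case-insensitive set so "Hello" matches the entry "hello".
        words = new HashSet<string>(
            wordList.Select(w => w.Trim()).Where(w => w.Length > 0),
            StringComparer.OrdinalIgnoreCase);
    }

    // Assumption: the file holds one word per line.
    public static WordListSpellChecker FromFile(string path)
    {
        return new WordListSpellChecker(File.ReadLines(path));
    }

    public bool IsKnown(string word)
    {
        return words.Contains(word);
    }

    // Splits the text into runs of letters and apostrophes and
    // returns the runs that are not in the word list.
    public IReadOnlyList<string> FindUnknownWords(string text)
    {
        return Regex.Matches(text, @"[\p{L}']+")
            .Cast<Match>()
            .Select(m => m.Value)
            .Where(token => !IsKnown(token))
            .ToList();
    }
}
```

In a Windows Forms app, FindUnknownWords could be called on a text box's contents from its TextChanged handler to get the "dynamic" checking asked for, with the word list loaded once at startup via FromFile.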
http://msdn.microsoft.com/en-us/library/system.windows.controls.spellcheck.aspx
I use Aspell-win32, it's old but it's open source, and works as well or better than the Word spell check. Came here looking for a built in solution.

Retrieve file properties

In Windows XP, if I open the properties window for a file and click the second tab, I find a window where I can add or remove attributes.
While developing, I noticed there was something I wanted to know about the file. How can I retrieve this data? It's a string named 'DESCRIPTION'.
The tab is labeled 'Custom'. I think what it shows is called metadata.
I noticed that only the files I'm looking at have that tab. It seems to be specific to SLDLFP files.
Not on an XP machine, but I think this might work
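// Requires: using System.Diagnostics;
// Note: this reads the file's version resource, not the properties shown on the Custom tab.
FileVersionInfo myFileVersionInfo = FileVersionInfo.GetVersionInfo("path.txt");
string desc = myFileVersionInfo.FileDescription;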
I think the Custom tab is only available for Office documents, and it displays custom properties (in Word: File -> Properties, Custom tab).
The best way to get the information would be by using MS Office hooks. Last time I did anything like this, it was using OLE Automation, so good luck!
Edit:
Since you added a mention of SLDLFP, I'm guessing that you are working with SolidWorks files.
There may be some standard APIs for this, but none that I have heard of.
Using SolidWorks via Automation is probably going to be your best bet.
I found a link describing how to read these kinds of values with Word 2003 and VB.NET; I would expect it to be similar to how this is done with SolidWorks.
Reading and Writing Custom Document Properties in Microsoft Office Word 2003 with Microsoft Visual Basic .NET
I think this applies to all Microsoft Office documents (and not to other files).
You might need to automate Word/Excel/PowerPoint to get that info.
Or you might need some kind of binary file reader for MS Office files to read the attributes.
