Edit Saved Microsoft Word Document in C#/Asp.net - c#

I am not sure if this is possible, and everywhere I have searched I cannot find a clear answer. I am saving a Microsoft Word document to a SQL Server 2008 table, basically just converting the file to a byte[] and writing that to the table. This Word document is a "template" file: a form that the user needs to fill out. What I am wondering is whether, after reading the file from SQL Server and before opening it for the user, there is a way to autopopulate some fields in the form. For example, if I already know the user's address, can I autopopulate the address field in the template for them?
I know that with Microsoft.Office.Interop.Word I can search the document for bookmarks and insert data at a bookmark. However, as far as I know, you cannot use Microsoft.Office.Interop.Word to open a byte[].
Is there any way to accomplish this?

If you want to use OpenXML (which applies to .docx files), then you can do it like this:
//Load your byte[] array into a memory stream and then
WordprocessingDocument doc = WordprocessingDocument.Open(stream, true);
You can do what you are trying to achieve using OpenXML without installing Word on the server side. More resources on Open XML can be found at http://openxmldeveloper.org/. The Open XML SDK can be downloaded from here.
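As a rough sketch of how the byte[] round trip might look with the Open XML SDK: the class and method names below (BookmarkFiller, FillBookmark) are made up for illustration, and the code assumes the template is a .docx whose bookmark sits inside a paragraph. It copies the bytes into an expandable MemoryStream, inserts a run of text right after the named bookmark, and returns the modified bytes.

```csharp
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper, not part of the Open XML SDK.
public static class BookmarkFiller
{
    // Returns a copy of the document with `value` inserted after the named bookmark.
    public static byte[] FillBookmark(byte[] template, string bookmarkName, string value)
    {
        using (var stream = new MemoryStream())
        {
            // Copy into an expandable stream; new MemoryStream(byte[]) cannot grow
            // when the edited package turns out larger than the original.
            stream.Write(template, 0, template.Length);
            stream.Position = 0;

            using (WordprocessingDocument doc = WordprocessingDocument.Open(stream, true))
            {
                BookmarkStart bookmark = doc.MainDocumentPart.Document.Body
                    .Descendants<BookmarkStart>()
                    .FirstOrDefault(b => b.Name != null && b.Name.Value == bookmarkName);

                if (bookmark != null)
                {
                    // Assumes the bookmark lives inside a paragraph, where a Run is valid.
                    bookmark.InsertAfterSelf(new Run(new Text(value)));
                    doc.MainDocumentPart.Document.Save();
                }
            } // Disposing the document flushes the package back to the stream.

            return stream.ToArray();
        }
    }
}
```

The returned array can then be written to the HTTP response or saved back to the table. If the bookmark is not found, the bytes come back with the content unchanged.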

I think the general steps would be to:
1) Save the file to the local hard drive of the user with a file name based on the template but with a .doc extension.
2) Open the file with interop, but keep it invisible.
3) Populate the fields with bookmarks.
4) Show it to the user.
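The steps above could be sketched roughly as follows. This assumes Word is installed on the machine running the code, a reference to Microsoft.Office.Interop.Word, and C# 4 or later (so the COM ref parameters can be passed by value). InteropFiller and OpenFilled are illustrative names only, and the dictionary maps bookmark names to the text to insert.

```csharp
using System.Collections.Generic;
using System.IO;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper illustrating the four steps above.
public static class InteropFiller
{
    public static void OpenFilled(byte[] template, IDictionary<string, string> values)
    {
        // 1) Interop needs a file on disk, so write the bytes to a temporary path.
        string path = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName() + ".doc");
        File.WriteAllBytes(path, template);

        // 2) Open the file, keeping Word invisible while it is being filled.
        var app = new Word.Application { Visible = false };
        Word.Document doc = app.Documents.Open(path);

        // 3) Populate each bookmark that exists in the document.
        foreach (KeyValuePair<string, string> pair in values)
        {
            if (doc.Bookmarks.Exists(pair.Key))
            {
                // Setting Range.Text replaces the bookmarked range with the new text.
                doc.Bookmarks[pair.Key].Range.Text = pair.Value;
            }
        }

        // 4) Show the filled document to the user.
        app.Visible = true;
    }
}
```

This only makes sense where the code runs on the user's own machine. In an ASP.NET application the code runs on the server, so Word would open there rather than in front of the user.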

Related

Is there any way to store Autotext (GlossaryDocument) in docx instead of dotx

I need to create a Word file with some AutoText entries (generated from a database).
Currently I programmatically generate a Word document (docx) and a template for it (dotx). The dotx contains the list of AutoText entries (in a GlossaryDocument), and in the docx file I add a relationship to it:
documentSettingPart1.AddExternalRelationship("http://schemas.openxmlformats.org/officeDocument/2006/relationships/attachedTemplate", new Uri($"file:./{Path.GetFileName(dotxTemplate)}", UriKind.Relative) , relationId);
So if the user saves both files in the same directory and opens the docx, the AutoText works perfectly. But I am looking for a way to do this with a single docx file, because it is inconvenient for users to keep two files and make sure they are in the same directory.
I tried adding a GlossaryDocumentPart to the docx, and changing the document type (ChangeDocumentType(WordprocessingDocumentType.Document)). After that I can see the GlossaryDocument in the Open XML SDK, but when I open the docx file in Word, none of the AutoText from this GlossaryDocument is available.
Is there any way to make a docx file that contains the AutoText itself?
A docx file cannot contain AutoText (Building Blocks). It is simply not supported. But why not save the document you're distributing as a template, so the user can use it to create a new document whenever it's required? That's what templates are for...
What is possible is to store the Word Open XML that represents the content to be re-used in one or more Custom XML Parts. You'd need to code some kind of interface to enable the user to retrieve and insert this content. If the code should travel with the document, it would have to be VBA, and the file would then need to be a docm rather than a docx.
Given Word 2013 or newer, it's also possible to map/link a content control to a node in a Custom XML Part. But, again, this would require you to develop some kind of interface for the user.
Also possible would be a VSTO or Word JS API solution rather than VBA.

editable word document attachment

It's a common scenario: we give the end user the option of attaching a file (MS .doc), and this file is stored in the DB as binary. When the user accesses this attachment the next time, we allow them to download it. Now I want to give the user a feature where they can open this doc file with a click, edit it, and save it without downloading.
.doc is a binary format and not easy to work with - a library such as Aspose, as mentioned by Christian, is definitely the way to go.
However, if .DOCX is acceptable (and that's Office 2007 and higher), then you can achieve what you want in three steps:
1) Convert the .docx to HTML (see: Convert Word to HTML then render HTML on webpage).
2) Display the HTML using any rich text control of your choice (see: What is the best rich textarea editor for jQuery?).
3) Finally, convert the HTML back to .docx (see: Convert Html to Docx in c#).
You would have to "reinvent" Microsoft Office Online (look into your SkyDrive account). I am unsure if there are any "out of the box" libraries for that, but you could build a basic editing app by leveraging Aspose.Words (or some other library). That would be far from simple, though.
Link to Aspose: http://www.aspose.com/.net/word-component.aspx
Word will only open files that are locally stored. What you are looking for is something similar to editing items that SharePoint provides using the WebDAV interface.
You may be able to use this approach to support your requirement. You should be cautious about the security aspects of the solution unless you have fully authenticated access to the shared folder on the server.
I am not sure if a standalone MS Word document editor exists. However, this can be done using a combination of a rich text editor and a converting tool (for example, the DevExpress ASPxHtmlEditor + Document Server):
Load binary data from a DB;
Import loaded data (MS Word content) as HTML content into the ASPxHtmlEditor;
Edit imported data via the WYSIWYG ASPxHtmlEditor;
Convert the edited HTML back to MS Word content;
Save the converted / edited MS Word content back to the DB.
I believe, it is possible to do something like this if you have such products (free or commercial analogs) in your project.

How to create *.docx files from a template in C#

I have a working ASP.NET MVC web application to manage projects and customers. Now I want to generate a Word file for some customers, showing some data about the customer. Every generated file should have the same fields and the same design. So I want to create a new Word template with the fields and fill in the placeholders programmatically.
My problem is that I couldn't find a clear way to do that. Does anybody know of any good learning resources?
Try this page:
Building Office Open XML Files
Open XML files (docx) are ZIP packages containing XML files. In your case, I would create a copy of your original template, then use the System.IO.Packaging API to open the copy and modify it. By opening the correct XML part (the main document body is normally word/document.xml) and replacing certain placeholders in the XML, you should be able to achieve the result you want.
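A minimal sketch of that approach, under a few assumptions: the main part is at /word/document.xml, the placeholder is a token such as {{CustomerName}} that appears unbroken in the XML (Word sometimes splits typed text across several runs, in which case a plain string replace will not find it), and TemplateFiller / Fill are illustrative names. On .NET Framework, System.IO.Packaging lives in WindowsBase; on newer .NET it is a separate NuGet package.

```csharp
using System;
using System.IO;
using System.IO.Packaging;
using System.Security;
using System.Text;

// Hypothetical helper, shown only to illustrate the idea.
public static class TemplateFiller
{
    public static void Fill(string templatePath, string outputPath, string placeholder, string value)
    {
        // Work on a copy so the original template stays untouched.
        File.Copy(templatePath, outputPath, true);

        using (Package package = Package.Open(outputPath, FileMode.Open, FileAccess.ReadWrite))
        {
            // Assumed location of the main document part.
            PackagePart part = package.GetPart(new Uri("/word/document.xml", UriKind.Relative));

            string xml;
            using (var reader = new StreamReader(part.GetStream(FileMode.Open, FileAccess.Read), Encoding.UTF8))
            {
                xml = reader.ReadToEnd();
            }

            // Escape the value so characters like & or < do not break the XML.
            xml = xml.Replace(placeholder, SecurityElement.Escape(value));

            // FileMode.Create truncates the part before the new XML is written.
            using (var writer = new StreamWriter(part.GetStream(FileMode.Create, FileAccess.Write), new UTF8Encoding(false)))
            {
                writer.Write(xml);
            }
        }
    }
}
```

A call might look like TemplateFiller.Fill("template.docx", "customer42.docx", "{{CustomerName}}", "Contoso & Co"), where both file names are placeholders.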
While trying to do the same I have found some libraries to create/edit DOC or DOCX in .Net
GemBox.Document
TemplateEngine.Docx
DocXTemplateEngine
Templater - Nuget
Spire.Doc - Nuget
Generating a new document from a template file (.dot) should be very easy. With Word automation, the template path is passed as the Template argument of Documents.Add, which tells Word to create a new .doc based on that file rather than edit the template document itself.
For filling form fields and bookmarks, there are lots of examples online on how to do it from C#, see here:
MS Word Office Automation - Filling Text Form Fields And Check Box Form Fields And Mail Merge

Inserting a Word document content (with formatting) in a RDLC report using C#

I'm creating an RDLC report in C#. Is it possible to insert the content of a Word 2003 document (with formatting) into it, either at design time or programmatically, before exporting to PDF? The final result will be a PDF file containing the initial report (fields from the database) with the Word document content following it.
Why this? I need to let the user fill out a form, attach a Word document, and export the whole thing to PDF as described above (ASP.NET). I don't have Word installed on the server, so I can't interact with its COM objects.
Thank you.
Which format does the word document use? If it's .docx, you can try going with the Open XML SDK from Microsoft.
Not sure about how to import the formatting.
This question was asked a long time ago, so I doubt this answer will be of much use to the OP, but if someone else stumbles across this as I did...
...I would think it should be fairly easy using a 3rd party component to convert the doc to an image and then use that in the RDLC without much hassle at all.

Save a binary file in SQL Server as BLOB and text (or get the text from Full-Text index)

Currently we are saving files (PDF, DOC) into the database as BLOB fields. I would like to be able to retrieve the raw text of the file to be able to manipulate it for hit-highlighting and other functions.
Does anyone know of a simple way to parse the files and store the raw text on save, either via SQL or .NET code? I have found that Adobe has a filtdump utility that will convert a PDF to text. Filtdump seems to be a command-line tool, and I don't see a way to use it with a file stream. And what would the extractor be for Office documents and other file types?
-or-
Is there a way to pull out the raw text from the SQL Full text index, without using 3rd party filters?
Note: I am trying to build a .NET and MS SQL solution without having to use a third-party tool such as Lucene.
If it isn't absolutely necessary to stream directly from SQL Server into your app, the hard part is parsing the PDF or DOC file formats.
The iTextSharp library will give you access to the innards of a PDF file:
http://itextsharp.sourceforge.net/
Here's a commercial product that claims to parse Word docs:
Aspose.Words
Edited to add:
I think you're also asking if there are ways to make SQL Server Full-text Indexing do the work for you by adding IFilters. This sounds like a good idea. I haven't done this myself, but MS has apparently supported a Word filter for a long time, and now Adobe has released a (free) PDF filter. There's a lot of information here:
Filter Central
10 Ways to Optimize SQL Server Full-text Indexing
SQL Server Full Text Search: Language Features - a little out of date but easy to understand.
The SQL Server Full-Text Search feature uses IFilters to extract plain text from PDF or Office file formats. You can install IFilters on your server, or, if your code is running on the same machine as SQL Server, you already have them.
Here is an article which shows how to use IFilters from .NET: http://www.codeproject.com/KB/cs/IFilter.aspx
You could from your C# application open the .doc file and save it as text and put both the text and .doc document into the database.
If you are using SQL 2008, then you could consider using the new FILESTREAM feature.
Your data is stored in a varbinary(max) column, but you can also access the raw data via a regular Win32 handle.
Here's some sample code showing how to get the handle.
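For illustration, a minimal sketch of reading a FILESTREAM value through SqlFileStream. It assumes a hypothetical table dbo.Documents with an int Id column and a FILESTREAM column named Content, plus a connection using integrated security (which FILESTREAM streaming access requires); those names are not from the original post.

```csharp
using System.Data.SqlClient;
using System.Data.SqlTypes;
using System.IO;

// Hypothetical helper; table and column names are assumptions.
public static class FileStreamReader
{
    public static byte[] ReadDocument(string connectionString, int id)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();
            // FILESTREAM access must happen inside a transaction.
            using (SqlTransaction tx = conn.BeginTransaction())
            {
                string path;
                byte[] txContext;
                using (var cmd = new SqlCommand(
                    "SELECT Content.PathName(), GET_FILESTREAM_TRANSACTION_CONTEXT() " +
                    "FROM dbo.Documents WHERE Id = @id", conn, tx))
                {
                    cmd.Parameters.AddWithValue("@id", id);
                    using (SqlDataReader reader = cmd.ExecuteReader())
                    {
                        if (!reader.Read()) return null;
                        path = reader.GetString(0);
                        txContext = (byte[])reader[1];
                    }
                }

                byte[] data;
                // SqlFileStream wraps the Win32 handle to the underlying file.
                using (var fs = new SqlFileStream(path, txContext, FileAccess.Read))
                using (var ms = new MemoryStream())
                {
                    fs.CopyTo(ms);
                    data = ms.ToArray();
                }

                tx.Commit();
                return data;
            }
        }
    }
}
```

The stream is closed before the transaction is committed, since the transaction context is only valid while the transaction is open.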
I had this same issue... I solved it by adding the following to my application:
EPocalipse.IFilter.dll (for everything except Office 2007 documents, due to 64-bit Windows issues)
OpenXML SDK 2.0 (for Office 2007 documents)
I use these to grab the plain text and then store it in the database alongside the binary data. Keep in mind that I am certainly not an expert, so there may be a better way to do this, but this works for everything except "Quick Save" pre-2007 Word documents, which apparently aren't read by IFilters. I just have my users resave the document if that error occurs, and everything works fine.
Let me know if you'd like some sample code... I would post it right now, but it's a bit long.
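For the Open XML half of that approach, the text extraction might look roughly like the sketch below. This is not the poster's code. DocxText and Extract are made-up names, and the code assumes a valid .docx and joins paragraph text with newlines, which is a simplification (headers, footers, and footnotes live in other parts and are not included).

```csharp
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper for pulling plain text out of a .docx stored as bytes.
public static class DocxText
{
    public static string Extract(byte[] docx)
    {
        // Read-only access, so a fixed-size MemoryStream over the bytes is enough.
        using (var stream = new MemoryStream(docx))
        using (WordprocessingDocument doc = WordprocessingDocument.Open(stream, false))
        {
            Body body = doc.MainDocumentPart.Document.Body;

            // One line per paragraph; InnerText concatenates the text of all runs.
            return string.Join("\n", body.Descendants<Paragraph>().Select(p => p.InnerText));
        }
    }
}
```

The resulting string can be stored in a text column next to the varbinary column holding the original file.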
