Fill Word template data using the Open XML SDK - C#

I have a Word file template and an XML file with data. I want to find a content control in the Word document, get the data from the XML file, and replace the text in the Word template. I'm using the following code, but it is not updating the Word file.
using (WordprocessingDocument document = WordprocessingDocument.CreateFromTemplate(txtWordFile.Text))
{
MainDocumentPart mainPart = document.MainDocumentPart;
IEnumerable<SdtBlock> block = mainPart.Document.Body.Descendants<SdtBlock>().Where
(r => r.SdtProperties.GetFirstChild<Tag>().Val == "TotalClose");
Text t = block.Descendants<Text>().Single();
t.Text = "13,450,542";
mainPart.Document.Save();
}

For anyone still struggling with this - you can check out this library https://github.com/antonmihaylov/OpenXmlTemplates
With it you can replace the text inside all content controls of the document based on a JSON object (or a basic C# dictionary) without writing specific code. Instead, you specify the variable name in the tag of the content control.
(Note - I am the maker of that library, but it is open source and licensed under LGPLv3.)

I think you should write the changes to a temporary file.
See "Save modified WordprocessingDocument to new file", or this code from one of my work projects:
MemoryStream yourDocStream = new MemoryStream();
... // populate yourDocStream with .docx bytes
using (Package package = Package.Open(yourDocStream, FileMode.Open, FileAccess.ReadWrite))
{
    // Load the document XML in the part into an XDocument instance.
    // LoadXmlPackagePart is the author's own helper and is not shown here.
    PackagePart packagePart = LoadXmlPackagePart(package);
    XDocument xDocument = XDocument.Load(XmlReader.Create(packagePart.GetStream()));
    // making changes
    // Save the XML into the package
    using (XmlWriter xw = XmlWriter.Create(packagePart.GetStream(FileMode.Create, FileAccess.Write)))
    {
        xDocument.Save(xw);
    }
}
// Read the bytes only after the package has been closed, so that all changes are flushed.
var resultDocumentBytes = yourDocStream.ToArray();

The basic approach you use works fine, but I'm surprised you're not getting any compile-time errors because
IEnumerable<SdtBlock> block = mainPart.Document
.Body
.Descendants<SdtBlock>()
.Where(r => r.SdtProperties.GetFirstChild<Tag>().Val == "TotalClose");
is not compatible with
Text t = block.Descendants<Text>().Single();
Here block is an IEnumerable<SdtBlock>, and IEnumerable<T> has no Descendants method. You either need to loop through all the items in the IEnumerable and perform this on each item, or you need to select a single item, like this:
using (WordprocessingDocument document = WordprocessingDocument.CreateFromTemplate(txtWordFile.Text))
{
    MainDocumentPart mainPart = document.MainDocumentPart;
    // FirstOrDefault yields a single SdtBlock (or null if no tag matches).
    SdtBlock block = mainPart.Document.Body.Descendants<SdtBlock>().Where
        (r => r.SdtProperties.GetFirstChild<Tag>().Val == "test1").FirstOrDefault();
    Text t = block.Descendants<Text>().Single();
    t.Text = "13,450,542";
    mainPart.Document.Save();
}
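The other option mentioned above, looping through all matching items, could look roughly like the following sketch. It builds a tiny document in memory so that it is self-contained. The names ContentControlFiller, FillByTag and Demo are made up for illustration, and the code assumes the DocumentFormat.OpenXml NuGet package is referenced.

```csharp
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper, not part of the Open XML SDK.
public static class ContentControlFiller
{
    // Sets the text of every block-level content control whose tag matches.
    // Returns the number of controls that were actually updated.
    public static int FillByTag(MainDocumentPart mainPart, string tag, string value)
    {
        var blocks = mainPart.Document.Body
            .Descendants<SdtBlock>()
            .Where(b => b.SdtProperties?.GetFirstChild<Tag>()?.Val?.Value == tag)
            .ToList();

        int updated = 0;
        foreach (SdtBlock block in blocks)
        {
            var texts = block.Descendants<Text>().ToList();
            if (texts.Count == 0)
                continue; // no text element to overwrite in this control

            texts[0].Text = value;
            // Drop any further text elements so the old placeholder does not linger.
            foreach (Text extra in texts.Skip(1))
                extra.Remove();
            updated++;
        }

        mainPart.Document.Save();
        return updated;
    }

    // Builds a one-control document in memory, fills it, and returns the body text.
    public static string Demo()
    {
        using (var stream = new MemoryStream())
        using (var doc = WordprocessingDocument.Create(stream, WordprocessingDocumentType.Document))
        {
            MainDocumentPart mainPart = doc.AddMainDocumentPart();
            mainPart.Document = new Document(new Body(
                new SdtBlock(
                    new SdtProperties(new Tag { Val = "TotalClose" }),
                    new SdtContentBlock(new Paragraph(new Run(new Text("placeholder")))))));

            FillByTag(mainPart, "TotalClose", "13,450,542");
            return mainPart.Document.Body.InnerText;
        }
    }
}
```

Unlike Single(), this version does not throw when a control contains zero or several Text elements, and the null-conditional operators skip controls that have no tag at all.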

Related

Adding multiple images to a docx file using xslt template

So I need to generate a docx file for reporting purposes. This report contains text, tables and a lot of images.
So far, I managed to add text and a table (and populate it based on the content of my xml using an xslt transform).
However, I am stuck on adding images. I found some examples of how to add images using C# but I don't think this is what I need. I need to format the document using my xslt and add the images in the right places (for instance in a table cell). Is it somehow possible to add a container using xslt which uses the filepath to display/embed the image similar to the <img> tag in html?
I know that the docx format is basically a zip containing a file structure and to embed the image I should add it to this file structure also.
Any examples or references are appreciated.
To give you an idea of my code:
XslCompiledTransform transform = new XslCompiledTransform();
transform.Load(xsltFile);
StringWriter stringWriter = new StringWriter();
XmlWriter xmlWriter = XmlWriter.Create(stringWriter);
transform.Transform(xmlFile, xmlWriter);
XmlDocument newWordContent = new XmlDocument();
newWordContent.LoadXml(stringWriter.ToString());
File.Copy(docXtemplate, outputFilename, true);
using (WordprocessingDocument myDoc = WordprocessingDocument.Open(outputFilename, true))
{
MainDocumentPart mainPart = myDoc.MainDocumentPart;
Body body = new Body(newWordContent.DocumentElement.InnerXml);
DocumentFormat.OpenXml.Wordprocessing.Document document = new DocumentFormat.OpenXml.Wordprocessing.Document(body);
document.Save(mainPart);
}
It basically replaces the body of an existing docx file. This enables me to use all the formatting, etc.
The xslt file is generated by adjusting the document.xml file from the docx.
Update
Ok, so I figured out how to add an image to the docx file directory, see below
using (WordprocessingDocument myDoc = WordprocessingDocument.Open(outputFilename, true))
{
MainDocumentPart mainPart = myDoc.MainDocumentPart;
ImagePart imagePart = mainPart.AddImagePart(ImagePartType.Png);
using (FileStream stream = new FileStream(imageFile, FileMode.Open))
{
imagePart.FeedData(stream);
}
Body body = new Body(newWordContent.DocumentElement.InnerXml);
DocumentFormat.OpenXml.Wordprocessing.Document document = new
DocumentFormat.OpenXml.Wordprocessing.Document(body);
document.Save(mainPart);
}
This will add the image to the docx structure. I also checked the relationship, and it is present in the 'document.xml.rels' file. When I take this id and use it in my xslt to add the image to the document (for testing), I do see an area where the image should be when opening the file with Word; however, it shows a red cross and says that the image cannot be displayed.
One difference I did notice is that images which were in the original docx are saved in "word\media", while the image added with the code above ends up in "media". I am not sure whether this is a problem.
Ok, So I think I figured it out.
XslCompiledTransform transform = new XslCompiledTransform();
transform.Load(xsltFile);
StringWriter stringWriter = new StringWriter();
XmlWriter xmlWriter = XmlWriter.Create(stringWriter);
transform.Transform(xmlFile, xmlWriter);
XmlDocument newWordContent = new XmlDocument();
newWordContent.LoadXml(stringWriter.ToString());
using (WordprocessingDocument myDoc = WordprocessingDocument.Open(outputFilename, true))
{
MainDocumentPart mainPart = myDoc.MainDocumentPart;
ImagePart imagePart = mainPart.AddImagePart(ImagePartType.Png, "imgId");
using (FileStream stream = new FileStream(imageFile, FileMode.Open))
{
imagePart.FeedData(stream);
}
Body body = new Body(newWordContent.DocumentElement.InnerXml);
DocumentFormat.OpenXml.Wordprocessing.Document document = new
DocumentFormat.OpenXml.Wordprocessing.Document(body);
document.Save(mainPart);
}
The above code will add an image to your docx file structure with a specific id. You can use this id to refer to in your xsl transform. In the code example from my question I didn't set the id but used the one that was generated. However, each time you run this code the image will be added to the file with a new id resulting in a "not able to display" error. Not one of my sharpest moments;-).
For my use case I have to add multiple images to a large document so that code will be different but I think that based on the above code this can be achieved.
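For the multi-image case mentioned above, one possible approach is to derive each relationship id from the image's position, so that the xslt can refer to ids that are known in advance. The following is only a sketch: the names ImageEmbedder, AddImages and Demo and the "img1", "img2", ... id scheme are assumptions made for illustration, all images are assumed to be PNG, and the DocumentFormat.OpenXml NuGet package must be referenced.

```csharp
using System.Collections.Generic;
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper, not part of the Open XML SDK.
public static class ImageEmbedder
{
    // Adds each image as an image part with a predictable relationship id
    // ("img1", "img2", ...) and returns the ids in insertion order.
    public static List<string> AddImages(MainDocumentPart mainPart, IEnumerable<Stream> pngImages)
    {
        var ids = new List<string>();
        int index = 1;
        foreach (Stream image in pngImages)
        {
            string id = "img" + index++;
            ImagePart imagePart = mainPart.AddImagePart(ImagePartType.Png, id);
            imagePart.FeedData(image);
            ids.Add(id);
        }
        return ids;
    }

    // Creates an empty in-memory document, adds two images, and returns the ids.
    public static string Demo()
    {
        using (var stream = new MemoryStream())
        using (var doc = WordprocessingDocument.Create(stream, WordprocessingDocumentType.Document))
        {
            MainDocumentPart mainPart = doc.AddMainDocumentPart();
            mainPart.Document = new Document(new Body());

            // Placeholder bytes stand in for real PNG files in this sketch.
            var images = new List<Stream>
            {
                new MemoryStream(new byte[] { 1, 2, 3 }),
                new MemoryStream(new byte[] { 4, 5, 6 })
            };

            List<string> ids = AddImages(mainPart, images);
            return string.Join(",", ids);
        }
    }
}
```

The xslt would then reference these ids wherever the generated markup points at an image relationship. As in the code above, this only works if the output file starts from a fresh copy of the template on every run, because adding a part under an id that is already in use fails.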

Inserting a doc file inplace of place holder

I have a Word document which contains many pages. One of those pages contains a placeholder instead of other content. I want to replace that placeholder with another doc file without losing formatting. The doc file to be inserted may have many pages. How can I replace that placeholder with this doc file programmatically? I searched a lot but could not find any option to insert a doc file in place of a placeholder. Thank you in advance.
Or how can we copy the contents of the doc to be inserted and then replace the placeholder with the copied content?
I found a post here. The code below is from that post.
With the library, you can do the following to replace text in a Word document, where documentByteArray is your document's byte content taken from the database:
using (MemoryStream mem = new MemoryStream())
{
mem.Write(documentByteArray, 0, (int)documentByteArray.Length);
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(mem, true))
{
string docText = null;
using (StreamReader sr = new StreamReader(wordDoc.MainDocumentPart.GetStream()))
{
docText = sr.ReadToEnd();
}
Regex regexText = new Regex("Hello world!");
docText = regexText.Replace(docText, "Hi Everyone!");
using (StreamWriter sw = new StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create)))
{
sw.Write(docText);
}
}
}
Suppose that, instead of "Hi Everyone!", we want to insert binary data, which is an array of bytes:
byte[] binarydata = File.ReadAllBytes(filepaths);
How can we modify the program?
First of all, you should get the NuGet package called Novacode.Docx; it is the best document creator and editor I have found in the last few years.
using Novacode.Docx;
void Main()
{
var doc = DocX.Load(@"c:\temp\existingDoc.docx");
var docToAdd = DocX.Load(@"c:\temp\docToAdd.docx");
// Use whichever of the two calls matches your package version:
doc.InsertDocument(docToAdd, true); //version 1.0.0.22
doc.InsertDocument(docToAdd); //version 1.0.0.19
}
This is the simplest, most basic implementation of what you're after, but it works.
For anything else, take a look at the documentation at
https://docx.codeplex.com/
or
http://cathalscorner.blogspot.co.uk/
These will be the best places to start. If you do use this library, I would also recommend version 1.0.0.19, as there are some formatting issues in 1.0.0.22.

How to insert text into a content control with the Open XML SDK

I'm trying to develop a solution which takes input from an ASP.NET web page and embeds the input values into the corresponding content controls within an MS Word document. The Word document also contains static data, with some dynamic data to be embedded into the header and footer fields.
The idea is that the solution should be web based. Can I use OpenXML for this purpose, or is there another approach that you can suggest?
Thank you very much in advance for all your valuable input. I really appreciate it.
I have a little code sample from my project, to insert a few words in a content control you've created in a Word document:
public static WordprocessingDocument InsertText(this WordprocessingDocument doc, string contentControlTag, string text)
{
SdtElement element = doc.MainDocumentPart.Document.Body.Descendants<SdtElement>()
.FirstOrDefault(sdt => sdt.SdtProperties.GetFirstChild<Tag>()?.Val == contentControlTag);
if (element == null)
throw new ArgumentException($"ContentControlTag \"{contentControlTag}\" doesn't exist.");
element.Descendants<Text>().First().Text = text;
element.Descendants<Text>().Skip(1).ToList().ForEach(t => t.Remove());
return doc;
}
It simply looks for the first content control in the document with a specific Tag (you can set that by enabling designer mode in Word and right-clicking on the content control), and replaces the current text with the text passed into the method. After this, the document will of course still contain the content controls, which may not be desired. So when I'm done editing the document I run the following method to get rid of the content controls:
internal static WordprocessingDocument RemoveSdtBlocks(this WordprocessingDocument doc, IEnumerable<string> contentBlocks)
{
List<SdtElement> SdtBlocks = doc.MainDocumentPart.Document.Descendants<SdtElement>().ToList();
if (contentBlocks == null)
return doc;
foreach(var s in contentBlocks)
{
SdtElement currentElement = SdtBlocks.FirstOrDefault(sdt => sdt.SdtProperties.GetFirstChild<Tag>()?.Val == s);
if (currentElement == null)
continue;
IEnumerable<OpenXmlElement> elements = null;
if (currentElement is SdtBlock)
elements = (currentElement as SdtBlock).SdtContentBlock.Elements();
else if (currentElement is SdtCell)
elements = (currentElement as SdtCell).SdtContentCell.Elements();
else if (currentElement is SdtRun)
elements = (currentElement as SdtRun).SdtContentRun.Elements();
// Skip content control types that are not handled above.
if (elements == null)
continue;
// Move the content out of the control, then drop the control itself.
foreach (var el in elements)
currentElement.InsertBeforeSelf(el.CloneNode(true));
currentElement.Remove();
}
return doc;
}
To open the WordProcessingDocument from a template and edit it, there is plenty of information available online.
Edit:
Here is a small code sample for opening and saving documents while working with them in a MemoryStream. In real code you should of course move this into a separate repository class that takes care of managing the document:
byte[] byteArray = File.ReadAllBytes(@"C:\...\Template.dotx");
using (var stream = new MemoryStream())
{
stream.Write(byteArray, 0, byteArray.Length);
using (WordprocessingDocument doc = WordprocessingDocument.Open(stream, true))
{
//Needed because I'm working with template dotx file,
//remove this if the template is a normal docx.
doc.ChangeDocumentType(DocumentFormat.OpenXml.WordprocessingDocumentType.Document);
doc.InsertText("contentControlName","testtesttesttest");
}
using (FileStream fs = new FileStream(@"C:\...\newFile.docx", FileMode.Create))
{
stream.WriteTo(fs);
}
}

Replace a bookmark in a Word document with the contents of another word document

I'm looking to replace a bookmark in a word document with the entire contents of another word document. I was hoping to do something along the lines of the following, but appending the xml does not seem to be enough as it does not include pictures.
using Word = Microsoft.Office.Interop.Word;
...
Word.Application wordApp = new Word.Application();
Word.Document doc = wordApp.Documents.Add(filename);
var bookmark = doc.Bookmarks.OfType<Bookmark>().First();
var doc2 = wordApp.Documents.Add(filename2);
bookmark.Range.InsertXML(doc2.Contents.XML);
The second document contains a few images and a few tables of text.
Update: Progress made by using XML, but still doesn't satisfy adding pictures as well.
You've jumped in deep.
If you're using the object model (bookmark.Range) and trying to insert a picture you can use the clipboard or bookmark.Range.InlineShapes.AddPicture(...). If you're trying to insert a whole document you can copy/paste the second document:
Object objUnit = Word.WdUnits.wdStory;
wordApp.Selection.EndKey(ref objUnit, ref oMissing);
wordApp.ActiveWindow.Selection.PasteAndFormat(Word.WdRecoveryType.wdPasteDefault);
If you're using XML there may be other problems, such as formatting, images, headers/footers not coming in correctly.
Depending on the task, it may be better to use DocumentBuilder and the Open XML SDK. If you're writing a Word add-in you can use the object API, and it will likely perform the same; if you're processing documents without Word, go with the Open XML SDK and DocumentBuilder. The issue with DocumentBuilder is that if it doesn't work, there aren't many workarounds to try. It's open source, but it is not the cleanest piece of code to troubleshoot.
You can do this with the Open XML SDK and DocumentBuilder. In outline, here is what you will need:
1> Inject an insert key into the main doc
public WmlDocument GetProcessedTemplate(string templatePath, string insertKey)
{
WmlDocument templateDoc = new WmlDocument(templatePath);
using (MemoryStream mem = new MemoryStream())
{
mem.Write(templateDoc.DocumentByteArray, 0, templateDoc.DocumentByteArray.Length);
using (WordprocessingDocument doc = WordprocessingDocument.Open([source], true))
{
XDocument xDoc = doc.MainDocumentPart.GetXDocument();
XElement bookMarkPara = [get bookmarkPara to replace];
bookMarkPara.ReplaceWith(new XElement(PtOpenXml.Insert, new XAttribute("Id", insertKey)));
doc.MainDocumentPart.PutXDocument();
}
templateDoc.DocumentByteArray = mem.ToArray();
}
return templateDoc;
}
2> Use document builder to merge
List<Source> documentSources = new List<Source>();
var insertKey = "INSERT_HERE_1";
var processedTemplate = GetProcessedTemplate([docPath], insertKey);
documentSources.Add(new Source(processedTemplate, true));
documentSources.Add(new Source(new WmlDocument([docToInsertFilePath]), insertKey));
DocumentBuilder.BuildDocument(documentSources, [outputFilePath]);

How to "Insert text from file" using OpenXml SDK 2.0

In the Word 2010 GUI, there is an option to "Insert text from file...", which does exactly that: it inserts the text from the main part of another document at the current location in your document.
I would like to do the same using C# and the OpenXml SDK 2.0
using (var mainDocument = WordprocessingDocument.Open("MainFile.docx", true))
{
var mainPart = mainDocument.MainDocumentPart;
var bookmarkStart = mainPart
.Document
.Body
.Descendants<BookmarkStart>()
.SingleOrDefault(b => b.Name == "ExtraContentBookmark");
var extraContent = GetTextFromFile("ExtraFile.docx");
bookmarkStart.InsertAfterSelf(extraContent);
}
I have tried using plain Xml (XElement), using OpenXmlElement (MainDocumentPart.Document.Body.Descendants), and using AltChunk. Every alternative so far has yielded a non-conformant docx-file.
What should the method GetTextFromFile look like?
This is how I implemented it. The solution was to use AltChunk as described by Eric White. I had already tried it, but as Bradley said in his answer, a bookmark may be anywhere in a document, and mine was inside a paragraph. As soon as I inserted the text before the containing paragraph, everything worked fine.
Here is the (simplified) code:
using (var mainDocument = WordprocessingDocument.Open("MainFile.docx", true))
{
    var mainPart = mainDocument.MainDocumentPart;
    var bookmarkStart = mainPart
        .Document
        .Body
        .Descendants<BookmarkStart>()
        .SingleOrDefault(b => b.Name == "ExtraContentBookmark");
    var altChunk = GetAltChunk("ExtraFile.docx", mainPart);
    // Insert before the paragraph that contains the bookmark, not inside it.
    var containingParagraph = bookmarkStart.Ancestors<Paragraph>().FirstOrDefault();
    containingParagraph.InsertBeforeSelf(altChunk);
}
...
private AltChunk GetAltChunk(string filename, MainDocumentPart mainDocumentPart)
{
    var altChunkId = "AltChunkId1";
    var chunk = mainDocumentPart.AddAlternativeFormatImportPart(
        AlternativeFormatImportPartType.WordprocessingML, altChunkId);
    // Dispose the file stream once its content has been copied into the part.
    using (var fileStream = File.Open(filename, FileMode.Open))
    {
        chunk.FeedData(fileStream);
    }
    var altChunk = new AltChunk { Id = altChunkId };
    return altChunk;
}
It is not as simple as inserting the descendants of the document body tag at the bookmark location. Some reasons:
The two documents may be using different styles; you would have to copy across dependent styles, or update the references to use the styles in the destination document.
The <bookmarkStart> tag can appear almost anywhere in a document, including inside a paragraph, a run, a table cell, etc. Since you cannot nest paragraphs or runs, you will have to determine where the bookmark is situated, then ascend/descend the XML tree until you find an appropriate place to insert the content.
What you're trying to do becomes quite a complicated task when using the OpenXml SDK. It requires an in-depth understanding of the format and its schema.
I would almost advise using VSTO/OLE automation instead, as it enables you to use the functionality that is built into Word.
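To illustrate the "ascend the XML tree" step described above, here is a minimal sketch that walks up from a bookmark to the element sitting directly under the document body, which is one plausible place to insert block-level content. The names BookmarkAnchor, FindBodyLevelAncestor and Demo are made up for illustration, other insertion points may be equally valid depending on the document, and the code assumes the DocumentFormat.OpenXml NuGet package is referenced.

```csharp
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper, not part of the Open XML SDK.
public static class BookmarkAnchor
{
    // Walks up from the bookmark until it reaches the element whose parent is the Body.
    // Returns null if the bookmark is not located inside a Body at all.
    public static OpenXmlElement FindBodyLevelAncestor(BookmarkStart bookmark)
    {
        OpenXmlElement current = bookmark;
        while (current.Parent != null && !(current.Parent is Body))
            current = current.Parent;
        return current.Parent is Body ? current : null;
    }

    // Places a bookmark inside a table cell, inserts a paragraph before the
    // body-level ancestor, and returns the local names of the body's children.
    public static string Demo()
    {
        var bookmark = new BookmarkStart { Id = "0", Name = "ExtraContentBookmark" };
        var body = new Body(
            new Table(new TableRow(new TableCell(
                new Paragraph(bookmark, new BookmarkEnd { Id = "0" })))));

        OpenXmlElement anchor = FindBodyLevelAncestor(bookmark);
        anchor.InsertBeforeSelf(new Paragraph(new Run(new Text("inserted"))));

        return string.Join(",", body.Elements().Select(e => e.LocalName));
    }
}
```

In the demo the bookmark sits in a paragraph inside a table cell, so the new paragraph ends up before the whole table rather than inside the cell. Whether that is the desired position depends on the document.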
