I have a Word document with a number of content controls in it, all mapped to a custom XML part. To build the document on the fly, I simply overwrite the custom XML part.
The problem is that if I don't define a particular item, its content control still takes up space in the document, pushing down everything below it and looking inconsistent with the rest of the document.
Here's a basic example of my code:
var path = HttpContext.Current.Server.MapPath("~/Classes/Word/LawyerBio.docx");
using (WordprocessingDocument myDoc = WordprocessingDocument.Open(path, true))
{
//create new XML string
//these values will populate the template word doc
string newXML = "<root>";
if (!String.IsNullOrEmpty(_lawyer["Recognition"]))
{
newXML += "<recognition>";
newXML += _text.Field("Recognition Title");
newXML += "</recognition>";
}
if (!String.IsNullOrEmpty(_lawyer["Board Memberships"]))
{
newXML += "<boards>";
newXML += _text.Field("Board Memberships Title");
newXML += "</boards>";
}
newXML += "</root>";
MainDocumentPart mainPart = myDoc.MainDocumentPart;
//delete old xml part
mainPart.DeleteParts<CustomXmlPart>(mainPart.CustomXmlParts);
//add new xml part
CustomXmlPart customXml = mainPart.AddCustomXmlPart(CustomXmlPartType.CustomXml);
using(StreamWriter ts = new StreamWriter(customXml.GetStream()))
{
ts.Write(newXML);
}
myDoc.Close();
}
Is there any way to make these content controls actually collapse/hide?
I think you will have to do either some preprocessing before the docx is opened in Word, or some postprocessing (e.g. via a macro).
As an example of the preprocessing approach, OpenDoPE defines a "condition" which you could use to exclude the undefined stuff.
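A minimal sketch of the preprocessing idea, not OpenDoPE itself: before the docx is opened in Word, drop every content control whose data binding selects nothing in the new custom XML. It works on the raw XML of document.xml with LINQ to XML. The class and method names are hypothetical, and it assumes the custom XML uses no namespaces (so the binding XPath needs no prefix mappings) and that the controls are block-level, so removing them leaves valid markup.

```csharp
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class SdtPruner // hypothetical helper, not part of any SDK
{
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Removes every w:sdt whose w:dataBinding XPath matches no element
    // in customXml. Returns the number of content controls removed.
    public static int PruneUnbound(XDocument documentXml, XDocument customXml)
    {
        var unbound = documentXml.Descendants(W + "sdt")
            .Where(sdt =>
            {
                XElement binding = sdt.Element(W + "sdtPr")?.Element(W + "dataBinding");
                string xpath = (string)binding?.Attribute(W + "xpath");
                // Controls without a binding are left alone.
                return xpath != null && customXml.XPathSelectElement(xpath) == null;
            })
            .ToList(); // materialize before modifying the tree

        foreach (XElement sdt in unbound)
            sdt.Remove();
        return unbound.Count;
    }
}
```

The document.xml content would be loaded from and written back to the main document part's stream; that plumbing is left out here.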
So I need to generate a docx file for reporting purposes. This report contains text, tables and a lot of images.
So far, I managed to add text and a table (and populate it based on the content of my xml using an xslt transform).
However, I am stuck on adding images. I found some examples of how to add images using C# but I don't think this is what I need. I need to format the document using my xslt and add the images in the right places (for instance in a table cell). Is it somehow possible to add a container using xslt which uses the filepath to display/embed the image similar to the <img> tag in html?
I know that the docx format is basically a zip containing a file structure and to embed the image I should add it to this file structure also.
Any examples or references are appreciated.
To give you an idea of my code:
XslCompiledTransform transform = new XslCompiledTransform();
transform.Load(xsltFile);
StringWriter stringWriter = new StringWriter();
XmlWriter xmlWriter = XmlWriter.Create(stringWriter);
transform.Transform(xmlFile, xmlWriter);
XmlDocument newWordContent = new XmlDocument();
newWordContent.LoadXml(stringWriter.ToString());
File.Copy(docXtemplate, outputFilename, true);
using (WordprocessingDocument myDoc = WordprocessingDocument.Open(outputFilename, true))
{
MainDocumentPart mainPart = myDoc.MainDocumentPart;
Body body = new Body(newWordContent.DocumentElement.InnerXml);
DocumentFormat.OpenXml.Wordprocessing.Document document = new DocumentFormat.OpenXml.Wordprocessing.Document(body);
document.Save(mainPart);
}
It basically replaces the body of an existing docx file. This enables me to use all the formatting, etc.
The xslt file is generated by adjusting the document.xml file from the docx.
Update
Ok, so I figured out how to add an image to the docx file directory, see below
using (WordprocessingDocument myDoc = WordprocessingDocument.Open(outputFilename, true))
{
MainDocumentPart mainPart = myDoc.MainDocumentPart;
ImagePart imagePart = mainPart.AddImagePart(ImagePartType.Png);
using (FileStream stream = new FileStream(imageFile, FileMode.Open))
{
imagePart.FeedData(stream);
}
Body body = new Body(newWordContent.DocumentElement.InnerXml);
DocumentFormat.OpenXml.Wordprocessing.Document document = new
DocumentFormat.OpenXml.Wordprocessing.Document(body);
document.Save(mainPart);
}
This adds the image to the docx structure. I also checked the relationship, and it is present in the 'document.xml.rels' file. When I take this id and use it in my XSLT to add the image to the document (for testing), Word shows an area where the image should be, but it displays a red cross and says it cannot display the image.
One difference I notice is that images which were in the original docx are saved in "word\media", while the image added with the code above ends up in "media". Not sure if this is a problem.
Ok, So I think I figured it out.
XslCompiledTransform transform = new XslCompiledTransform();
transform.Load(xsltFile);
StringWriter stringWriter = new StringWriter();
XmlWriter xmlWriter = XmlWriter.Create(stringWriter);
transform.Transform(xmlFile, xmlWriter);
XmlDocument newWordContent = new XmlDocument();
newWordContent.LoadXml(stringWriter.ToString());
using (WordprocessingDocument myDoc = WordprocessingDocument.Open(outputFilename, true))
{
MainDocumentPart mainPart = myDoc.MainDocumentPart;
ImagePart imagePart = mainPart.AddImagePart(ImagePartType.Png, "imgId");
using (FileStream stream = new FileStream(imageFile, FileMode.Open))
{
imagePart.FeedData(stream);
}
Body body = new Body(newWordContent.DocumentElement.InnerXml);
DocumentFormat.OpenXml.Wordprocessing.Document document = new
DocumentFormat.OpenXml.Wordprocessing.Document(body);
document.Save(mainPart);
}
The above code adds an image to your docx file structure with a specific id, which you can then refer to in your XSL transform. In the code example from my question I didn't set the id but used the one that was generated. However, each time you run that code the image is added to the file again with a new generated id, so the id used in the XSLT no longer matches and Word shows a "not able to display" error. Not one of my sharpest moments ;-).
For my use case I have to add multiple images to a large document so that code will be different but I think that based on the above code this can be achieved.
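One way to keep the id used in C# and the id used in the stylesheet in sync is to pass it into the transform as an XSLT parameter instead of hard-coding it. The sketch below shows only that mechanism: the stylesheet is a stand-in that emits a bare a:blip element, whereas a real template would wrap it in the full w:drawing markup copied from document.xml. The class name and the parameter name imageRelId are assumptions made for illustration.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class ImageIdTransform // hypothetical helper
{
    // Stand-in stylesheet: outputs only the element that carries the image reference.
    const string Xslt = @"<xsl:stylesheet version='1.0'
  xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
  xmlns:a='http://schemas.openxmlformats.org/drawingml/2006/main'
  xmlns:r='http://schemas.openxmlformats.org/officeDocument/2006/relationships'>
  <xsl:param name='imageRelId'/>
  <xsl:template match='/'>
    <a:blip r:embed='{$imageRelId}'/>
  </xsl:template>
</xsl:stylesheet>";

    public static string Run(string inputXml, string imageRelId)
    {
        var transform = new XslCompiledTransform();
        using (XmlReader xsltReader = XmlReader.Create(new StringReader(Xslt)))
            transform.Load(xsltReader);

        // The same id string that is passed to AddImagePart(..., id).
        var args = new XsltArgumentList();
        args.AddParam("imageRelId", "", imageRelId);

        var output = new StringWriter();
        using (XmlReader input = XmlReader.Create(new StringReader(inputXml)))
        using (XmlWriter writer = XmlWriter.Create(output, transform.OutputSettings))
            transform.Transform(input, args, writer);
        return output.ToString();
    }
}
```

For several images, the same pattern could be repeated with one parameter per image, or the ids could be placed in the input XML itself so the stylesheet reads them from the data.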
I have a Word file template and an XML file with data. I want to find a content control in the Word document, get data from the XML, and then replace the text in the Word template. I'm using the following code, but it is not updating the Word file.
using (WordprocessingDocument document = WordprocessingDocument.CreateFromTemplate(txtWordFile.Text))
{
MainDocumentPart mainPart = document.MainDocumentPart;
IEnumerable<SdtBlock> block = mainPart.Document.Body.Descendants<SdtBlock>().Where
(r => r.SdtProperties.GetFirstChild<Tag>().Val == "TotalClose");
Text t = block.Descendants<Text>().Single();
t.Text = "13,450,542";
mainPart.Document.Save();
}
For anyone still struggling with this - you can check out this library https://github.com/antonmihaylov/OpenXmlTemplates
With it you can replace the text inside all content controls of the document based on a JSON object (or a basic C# dictionary) without writing specific code; instead, you specify the variable name in the tag of the content control.
(Note: I am the maker of that library, but it is open source and licensed under LGPLv3.)
I think you should write the changes to a temporary file.
See "Save modified WordprocessingDocument to new file", or my code from a work project:
MemoryStream yourDocStream = new MemoryStream();
... // populate yourDocStream with .docx bytes
using (Package package = Package.Open(yourDocStream, FileMode.Open, FileAccess.ReadWrite))
{
// Load the document XML in the part into an XDocument instance.
PackagePart packagePart = LoadXmlPackagePart(package);
XDocument xDocument = XDocument.Load(XmlReader.Create(packagePart.GetStream()));
// making changes
// Save the XML into the package
using (XmlWriter xw = XmlWriter.Create(packagePart.GetStream(FileMode.Create, FileAccess.Write)))
{
xDocument.Save(xw);
}
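}
// Read the bytes only after the package has been disposed, so that all pending changes are flushed to the stream
var resultDocumentBytes = yourDocStream.ToArray();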
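The same in-memory round trip can also be sketched without System.IO.Packaging, since a docx is a zip archive. This variant swaps in ZipArchive from System.IO.Compression and is not the code from the answer above; the class name, method name, and the edit callback are hypothetical. It treats the part as a plain zip entry and does not touch relationships or content types.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Xml.Linq;

public static class DocxZipEditor // hypothetical helper
{
    // Loads one XML part from the docx bytes, applies edit, and returns the new docx bytes.
    public static byte[] EditPart(byte[] docxBytes, string partPath, Action<XDocument> edit)
    {
        // Expandable stream: new MemoryStream(byte[]) would be fixed-size.
        var buffer = new MemoryStream();
        buffer.Write(docxBytes, 0, docxBytes.Length);

        using (var zip = new ZipArchive(buffer, ZipArchiveMode.Update, leaveOpen: true))
        {
            ZipArchiveEntry entry = zip.GetEntry(partPath)
                ?? throw new FileNotFoundException("Part not found: " + partPath);

            XDocument xml;
            using (Stream s = entry.Open())
                xml = XDocument.Load(s);

            edit(xml);

            using (Stream s = entry.Open())
            {
                s.SetLength(0); // discard the old content before writing
                xml.Save(s);
            }
        }

        // The archive is written back to the buffer when it is disposed,
        // so the bytes are read only after the using block.
        return buffer.ToArray();
    }
}
```

A call might look like `EditPart(bytes, "word/document.xml", x => { /* change x */ })`, with the result written to a new file rather than over the template.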
The basic approach you use works fine, but I'm surprised you're not getting any compile-time errors because
IEnumerable<SdtBlock> block = mainPart.Document
.Body
.Descendants<SdtBlock>()
.Where(r => r.SdtProperties.GetFirstChild<Tag>().Val == "TotalClose");
is not compatible with
Text t = block.Descendants<Text>().Single();
because block, being an IEnumerable<SdtBlock>, has no Descendants method. You either need to loop through all the items in the IEnumerable and perform this on each item, or you need to define and instantiate a single item, like this:
using (WordprocessingDocument document = WordprocessingDocument.CreateFromTemplate(txtWordFile.Text))
{
MainDocumentPart mainPart = document.MainDocumentPart;
SdtBlock block = mainPart.Document.Body.Descendants<SdtBlock>().Where
(r => r.SdtProperties.GetFirstChild<Tag>().Val == "test1").FirstOrDefault();
Text t = block.Descendants<Text>().Single();
t.Text = "13,450,542";
mainPart.Document.Save();
}
So I'm trying to populate the content controls in a word document by matching the Tag and populating the text within that content control.
The following displays in a MessageBox all of the tags I have in my document.
//Create a copy of the template file and open the document
File.Delete(hhscDocument);
File.Copy(hhscTemplate, hhscDocument, true);
//Open the word document specified by location
using (var document = WordprocessingDocument.Open(hhscDocument, true))
{
//Change the document type from template to document
var mainDocument = document.MainDocumentPart.Document;
if (mainDocument.Body.Descendants<Tag>().Any())
{
//MessageBox.Show(mainDocument.Body.Descendants<Table>().Count().ToString());
var tags = mainDocument.Body.Descendants<Tag>().ToList();
var aString = string.Empty;
foreach(var tag in tags)
{
aString += string.Format("{0}{1}", tag.Val, Environment.NewLine);
}
MessageBox.Show(aString);
}
}
However when I try the following it doesn't work.
//Create a copy of the template file and open the document
File.Delete(hhscDocument);
File.Copy(hhscTemplate, hhscDocument, true);
//Open the word document specified by location
using (var document = WordprocessingDocument.Open(hhscDocument, true))
{
//Change the document type from template to document
var mainDocument = document.MainDocumentPart.Document;
if (mainDocument.Body.Descendants<Tag>().Any())
{
//MessageBox.Show(mainDocument.Body.Descendants<Table>().Count().ToString());
var tags = mainDocument.Body.Descendants<Tag>().ToList();
var bString = string.Empty;
foreach(var tag in tags)
{
bString += string.Format("{0}{1}", tag.Parent.GetFirstChild<Text>().Text, Environment.NewLine);
}
MessageBox.Show(bString);
}
}
My objective in the end is if I match the appropriate tag I want to populate/change the text in the content control that tag belongs to.
So I basically used FirstChild and InnerXml to pick apart the document's XML contents. From there I developed the following, which does what I need.
//Open the word document specified by location
using (var document = WordprocessingDocument.Open(hhscDocument, true))
{
var mainDocument = document.MainDocumentPart.Document;
if (mainDocument.Body.Descendants<Tag>().Any())
{
//Find all elements(descendants) of type tag
var tags = mainDocument.Body.Descendants<Tag>().ToList();
//Foreach of these tags
foreach (var tag in tags)
{
//Jump up two levels (.Parent.Parent) in the XML element and then jump down to the run level
var run = tag.Parent.Parent.Descendants<Run>().ToList();
//I access the 1st element because there is only one element in run
run[0].GetFirstChild<Text>().Text = "<new_text_value>";
}
}
mainDocument.Save();
}
This finds all the tags inside of your document and stores the elements in a list
var tags = mainDocument.Body.Descendants<Tag>().ToList();
This part of the code starts off at the tag part of the xml. From there I call parent twice to jump up two levels in the XML code so I can gain access to the Run level using descendants.
var run = tag.Parent.Parent.Descendants<Run>().ToList();
And last but not least the following code stores a new value into the text part of the PlainText Content control.
run[0].GetFirstChild<Text>().Text = "<new_text_value>";
One thing I noticed is that the XML hierarchy is a funky thing. I find it easier to access these things from the bottom up, which is why I started with the tags and moved up.
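The same bottom-up walk can be sketched on the raw XML with LINQ to XML instead of the SDK's typed classes, which also makes it easy to match one specific tag and to handle a missing control without an exception. The class and method names are hypothetical, and the sketch assumes a plain-text control whose first w:t holds the visible text.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class ContentControlText // hypothetical helper
{
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Sets the text of the first content control whose w:tag value equals tagValue.
    // Returns false when no such control, or no w:t inside it, exists.
    public static bool SetByTag(XDocument documentXml, string tagValue, string newText)
    {
        // Start at the w:tag element...
        XElement tag = documentXml.Descendants(W + "tag")
            .FirstOrDefault(t => (string)t.Attribute(W + "val") == tagValue);

        // ...go up two levels (w:tag -> w:sdtPr -> w:sdt)...
        XElement sdt = tag?.Parent?.Parent;

        // ...then back down to the first text element of the control.
        XElement text = sdt?.Descendants(W + "t").FirstOrDefault();
        if (text == null)
            return false;

        text.Value = newText;
        return true;
    }
}
```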
I'm looking to replace a bookmark in a word document with the entire contents of another word document. I was hoping to do something along the lines of the following, but appending the xml does not seem to be enough as it does not include pictures.
using Word = Microsoft.Office.Interop.Word;
...
Word.Application wordApp = new Word.Application();
Word.Document doc = wordApp.Documents.Add(filename);
var bookmark = doc.Bookmarks.OfType<Word.Bookmark>().First();
var doc2 = wordApp.Documents.Add(filename2);
bookmark.Range.InsertXML(doc2.Contents.XML);
The second document contains a few images and a few tables of text.
Update: Progress made by using XML, but still doesn't satisfy adding pictures as well.
You've jumped in deep.
If you're using the object model (bookmark.Range) and trying to insert a picture you can use the clipboard or bookmark.Range.InlineShapes.AddPicture(...). If you're trying to insert a whole document you can copy/paste the second document:
object oMissing = System.Reflection.Missing.Value;
Object objUnit = Word.WdUnits.wdStory;
wordApp.Selection.EndKey(ref objUnit, ref oMissing);
wordApp.ActiveWindow.Selection.PasteAndFormat(Word.WdRecoveryType.wdPasteDefault);
If you're using XML there may be other problems, such as formatting, images, or headers/footers not coming in correctly.
Depending on the task, it may be better to use DocumentBuilder and the Open XML SDK. If you're writing a Word add-in you can use the object API, which will likely perform about the same; if you're processing documents without Word, go with the Open XML SDK and DocumentBuilder. The issue with DocumentBuilder is that if it doesn't work, there aren't many workarounds to try. It's open source, but it's not the cleanest piece of code if you need to troubleshoot it.
You can do this with the Open XML SDK and DocumentBuilder. In outline, here is what you will need:
1> Inject insert key in main doc
public WmlDocument GetProcessedTemplate(string templatePath, string insertKey)
{
WmlDocument templateDoc = new WmlDocument(templatePath);
using (MemoryStream mem = new MemoryStream())
{
mem.Write(templateDoc.DocumentByteArray, 0, templateDoc.DocumentByteArray.Length);
using (WordprocessingDocument doc = WordprocessingDocument.Open([source], true))
{
XDocument xDoc = doc.MainDocumentPart.GetXDocument();
XElement bookMarkPara = [get bookmarkPara to replace];
bookMarkPara.ReplaceWith(new XElement(PtOpenXml.Insert, new XAttribute("Id", insertKey)));
doc.MainDocumentPart.PutXDocument();
}
templateDoc.DocumentByteArray = mem.ToArray();
}
return templateDoc;
}
2> Use document builder to merge
List<Source> documentSources = new List<Source>();
var insertKey = "INSERT_HERE_1";
var processedTemplate = GetProcessedTemplate([docPath], insertKey);
documentSources.Add(new Source(processedTemplate, true));
documentSources.Add(new Source(new WmlDocument([docToInsertFilePath]), insertKey));
DocumentBuilder.BuildDocument(documentSources, [outputFilePath]);
I'm updating a word document by rewriting the CustomXMLPart file. I've basically followed this tutorial: http://blogs.msdn.com/b/brian_jones/archive/2009/01/05/taking-advantage-of-bound-content-controls.aspx
private bool _makeDoc()
{
var path = HttpContext.Current.Server.MapPath("~/Classes/Word/template.docx");
using (WordprocessingDocument myDoc = WordprocessingDocument.Open(path, true))
{
//create new XML string
//these values will populate the template word doc
string newXML = "<root>";
newXML += "<name>";
newXML += "name goes here";
newXML += "</name>";
newXML += "<bio>";
newXML += "text" + "more text";
newXML += "</bio>";
newXML += "</root>";
MainDocumentPart mainPart = myDoc.MainDocumentPart;
//delete old xml part
mainPart.DeleteParts<CustomXmlPart>(mainPart.CustomXmlParts);
//add new xml part
CustomXmlPart customXml = mainPart.AddCustomXmlPart(CustomXmlPartType.CustomXml);
using(StreamWriter ts = new StreamWriter(customXml.GetStream()))
{
ts.Write(newXML);
}
myDoc.Close();
}
return true;
}
The problem is that I can't figure out how to add a line break between "text" and "more text". I've tried Environment.NewLine, I've tried wrapping it in <w:p><w:r><w:t> tags. I can't seem to get it to produce a valid docx file.
Any help would be appreciated.
The content control's properties have an option for "Allow carriage returns". Turning this on and using Environment.NewLine worked perfectly.
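A small sketch of building the custom XML with the line break embedded, using XElement rather than string concatenation so that characters such as & and < in the values are escaped automatically. The element names root, name, and bio come from the question; the class and method names are hypothetical.

```csharp
using System;
using System.Xml.Linq;

public static class BioXml // hypothetical helper
{
    // Builds the custom XML payload; bio lines are joined with Environment.NewLine.
    public static string Build(string name, params string[] bioLines)
    {
        var root = new XElement("root",
            new XElement("name", name),
            new XElement("bio", string.Join(Environment.NewLine, bioLines)));

        // DisableFormatting avoids indentation whitespace being added between elements.
        return root.ToString(SaveOptions.DisableFormatting);
    }
}
```

The returned string can be written to the custom XML part in place of the hand-built newXML, for example `BioXml.Build("name goes here", "text", "more text")`.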
I believe you'll have to wrap them in paragraphs in order to get the returns, as far as I know at least. So your resulting OOXML would look something like,
<w:p><w:r><w:t>Text</w:t></w:r></w:p>
<w:p><w:r><w:t>More text</w:t></w:r></w:p>
As for it not resulting in valid OOXML when you do this, have you opened the "document.xml" in the OOXML package and seen exactly where the XML is invalid?
Edit:
The OOXML SDK 2.0 comes with some validation tools you might find useful.
Via raw XML you can add:
<w:r>
<w:br />
</w:r>
Via the OOXML SDK:
Paragraph paragraph1 = new Paragraph();
Run breakRun = new Run();
breakRun.Append( new Break() );
paragraph1.Append( breakRun );
_document.MainDocumentPart.Document.Body.AppendChild<Paragraph>(paragraph1);
//where _document is the WordprocessingDocument instance
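For the "text" / "more text" case from the question, the raw-XML variant could be generated with LINQ to XML as sketched below: one run holding two text elements with a break between them. This applies to markup placed in the document body, not to the custom XML part, and the class and method names are hypothetical.

```csharp
using System.Xml.Linq;

public static class BreakRun // hypothetical helper
{
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Builds <w:r><w:t>first</w:t><w:br/><w:t>second</w:t></w:r>.
    public static XElement TwoLines(string first, string second)
    {
        return new XElement(W + "r",
            new XAttribute(XNamespace.Xmlns + "w", W), // declare the w: prefix on the run
            new XElement(W + "t", first),
            new XElement(W + "br"),
            new XElement(W + "t", second));
    }
}
```

The resulting run still has to sit inside a w:p element to be valid in the document body.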