DocumentFormat.OpenXml - create Word document with default styles - C#

I want the Word document I create to use Word's default styles, so that the user can change the styles using the built-in themes.
I have tried using:
var paragraph = new Paragraph();
var run = new Run();
run.Append(new Text(text));
paragraph.Append(run);
var header = new Header();
header.Append(paragraph);
But it's styled as "Normal".
So, how do I make it "Heading 1" when I open the document in Word?

If, like me, you found this post because you were trying to build documents with OpenXML using the default styles you get in Microsoft Word ("Heading 1", "Heading 2", "Title", etc.), here is the solution I found after a few hours.
First I tried to find the styles in the normal template, "Normal.dotm". That is the wrong place to look: the styles are not stored there. The default styles are actually defined in a "Default.dotx" file in a directory named QuickStyles.
The path is going to change depending on your version and your OS. For me I found the dotx at "C:\Program Files (x86)\Microsoft Office\Office14\1033\QuickStyles".
I found some code from this blog post to create and modify a document from a template:
void CreateWordDocumentUsingMSWordStyles(string outputPath, string templatePath)
{
    // create a copy of the template and open the copy
    System.IO.File.Copy(templatePath, outputPath, true);
    using (var document = WordprocessingDocument.Open(outputPath, true))
    {
        document.ChangeDocumentType(WordprocessingDocumentType.Document);
        var mainPart = document.MainDocumentPart;
        var settings = mainPart.DocumentSettingsPart;
        var templateRelationship = new AttachedTemplate { Id = "relationId1" };
        settings.Settings.Append(templateRelationship);
        var templateUri = new Uri("c:\\anything.dotx", UriKind.Absolute); // you can put any path you like and the document styles still work
        settings.AddExternalRelationship("http://schemas.openxmlformats.org/officeDocument/2006/relationships/attachedTemplate", templateUri, templateRelationship.Id);
        // using Title as it would appear in Microsoft Word
        var paragraphProps = new ParagraphProperties();
        paragraphProps.ParagraphStyleId = new ParagraphStyleId { Val = "Title" };
        // add some text with the "Title" style from the "Default" style set supplied by Microsoft Word
        var run = new Run();
        run.Append(new Text("My Title!"));
        var paragraph = new Paragraph();
        paragraph.Append(paragraphProps);
        paragraph.Append(run);
        mainPart.Document.Body.Append(paragraph);
        mainPart.Document.Save();
    }
}
Simply call this method with templatePath pointing to your Default.dotx file and you will be able to use the default styles as they appear in Microsoft Word.
var path = System.IO.Path.GetTempFileName();
CreateWordDocumentUsingMSWordStyles(path, "C:\\Program Files (x86)\\Microsoft Office\\Office14\\1033\\QuickStyles\\Default.dotx");
This does let the user change "Style Sets" in Word once they open the document as per the original question.
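For the "Heading 1" case from the original question, the same idea can be sketched as a small helper. This is only a sketch: it assumes the document was created from Default.dotx as above, and that the template defines "Heading 1" under the style ID "Heading1" (the usual ID in English installations, but check your own styles part). The class and method names are made up for the example.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static class HeadingExample
{
    // Builds a paragraph that references a style by its style ID.
    public static Paragraph CreateStyledParagraph(string text, string styleId)
    {
        var properties = new ParagraphProperties(
            new ParagraphStyleId { Val = styleId });
        return new Paragraph(properties, new Run(new Text(text)));
    }

    // Adds a heading to an existing document.
    // "Heading1" is an assumed style ID, not something the SDK guarantees.
    public static void AppendHeading(string documentPath, string text)
    {
        using (var document = WordprocessingDocument.Open(documentPath, true))
        {
            var body = document.MainDocumentPart.Document.Body;
            var heading = CreateStyledParagraph(text, "Heading1");

            // Keep the body-level section properties, if any, as the last child.
            var sectionProperties = body.Elements<SectionProperties>().LastOrDefault();
            if (sectionProperties != null)
                body.InsertBefore(heading, sectionProperties);
            else
                body.Append(heading);

            document.MainDocumentPart.Document.Save();
        }
    }
}
```

Calling HeadingExample.AppendHeading(path, "Introduction") after CreateWordDocumentUsingMSWordStyles should give a paragraph that Word shows as "Heading 1", provided the style ID assumption holds.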

Related

Merge PDF files with TOC element

I'm merging PDF files using GemBox.Pdf as shown here. This works great and I can easily add outlines.
I've previously done a similar thing and merged Word files with GemBox.Document as shown here.
But now my problem is that there is no TOC element in GemBox.Pdf. I want a Table of Contents to be generated automatically while merging multiple PDF files into one.
Am I missing something or is there really no such element for PDF?
Do I need to recreate it? If yes, how would I do that?
I can add a bookmark, but I don't know how to add a link to it.
There is no such element in PDF files, so we need to create this content ourselves.
Now one way would be to create text elements, outlines, and link annotations, position them appropriately, and set the link destinations to outlines.
However, this could be quite a lot of work, so perhaps it would be easier to just create the desired TOC element with GemBox.Document, save it as a PDF file, and then import it into the resulting PDF.
// Source data for creating TOC entries with specified text and associated PDF files.
var pdfEntries = new[]
{
new { Title = "First Document Title", Pdf = PdfDocument.Load("input1.pdf") },
new { Title = "Second Document Title", Pdf = PdfDocument.Load("input2.pdf") },
new { Title = "Third Document Title", Pdf = PdfDocument.Load("input3.pdf") },
};
/***************************************************************/
/* Create new document with TOC element using GemBox.Document. */
/***************************************************************/
// Create new document.
var tocDocument = new DocumentModel();
var section = new Section(tocDocument);
tocDocument.Sections.Add(section);
// Create and add TOC element.
var toc = new TableOfEntries(tocDocument, FieldType.TOC);
section.Blocks.Add(toc);
section.Blocks.Add(new Paragraph(tocDocument, new SpecialCharacter(tocDocument, SpecialCharacterType.PageBreak)));
// Create heading style.
// By default, when updating TOC element a TOC entry is created for each paragraph that has heading style.
var heading1Style = (ParagraphStyle)tocDocument.Styles.GetOrAdd(StyleTemplateType.Heading1);
// Add heading and empty (placeholder) pages.
// The number of added placeholder pages depends on the number of pages in the actual PDF file, so that TOC entries have correct page numbers.
int totalPageCount = 0;
foreach (var pdfEntry in pdfEntries)
{
section.Blocks.Add(new Paragraph(tocDocument, pdfEntry.Title) { ParagraphFormat = { Style = heading1Style } });
section.Blocks.Add(new Paragraph(tocDocument, new SpecialCharacter(tocDocument, SpecialCharacterType.PageBreak)));
int currentPageCount = pdfEntry.Pdf.Pages.Count;
totalPageCount += currentPageCount;
while (--currentPageCount > 0)
section.Blocks.Add(new Paragraph(tocDocument, new SpecialCharacter(tocDocument, SpecialCharacterType.PageBreak)));
}
// Remove last extra-added empty page.
section.Blocks.RemoveAt(section.Blocks.Count - 1);
// Update TOC element and save the document as PDF stream.
toc.Update();
var pdfStream = new MemoryStream();
tocDocument.Save(pdfStream, new GemBox.Document.PdfSaveOptions());
/***************************************************************/
/* Merge PDF files into PDF with TOC element using GemBox.Pdf. */
/***************************************************************/
// Load a PDF stream using GemBox.Pdf.
var pdfDocument = PdfDocument.Load(pdfStream);
var rootDictionary = (PdfDictionary)((PdfIndirectObject)pdfDocument.GetDictionary()[PdfName.Create("Root")]).Value;
var pagesDictionary = (PdfDictionary)((PdfIndirectObject)rootDictionary[PdfName.Create("Pages")]).Value;
var kidsArray = (PdfArray)pagesDictionary[PdfName.Create("Kids")];
var pageIds = kidsArray.Cast<PdfIndirectObject>().Select(obj => obj.Id).ToArray();
// Remove empty (placeholder) pages.
while (totalPageCount-- > 0)
pdfDocument.Pages.RemoveAt(pdfDocument.Pages.Count - 1);
// Add pages from PDF files.
foreach (var pdfEntry in pdfEntries)
foreach (var page in pdfEntry.Pdf.Pages)
pdfDocument.Pages.AddClone(page);
/*****************************************************************************/
/* Update TOC links from placeholder pages to actual pages using GemBox.Pdf. */
/*****************************************************************************/
// Create a mapping from the ID of an empty (placeholder) page indirect object to an actual page indirect object.
var pageCloneMap = new Dictionary<PdfIndirectObjectIdentifier, PdfIndirectObject>();
for (int i = 0; i < kidsArray.Count; ++i)
pageCloneMap.Add(pageIds[i], (PdfIndirectObject)kidsArray[i]);
foreach (var entry in pageCloneMap)
{
// If page was updated, it means that we passed TOC pages, so break from the loop.
if (entry.Key != entry.Value.Id)
break;
// For each TOC page, get its 'Annots' entry.
// For each link annotation from the 'Annots' get the 'Dest' entry.
// Update the first item in the 'Dest' array so that it no longer points to a removed page.
if (((PdfDictionary)entry.Value.Value).TryGetValue(PdfName.Create("Annots"), out PdfBasicObject annotsObj))
foreach (PdfIndirectObject annotObj in (PdfArray)annotsObj)
if (((PdfDictionary)annotObj.Value).TryGetValue(PdfName.Create("Dest"), out PdfBasicObject destObj))
{
var destArray = (PdfArray)destObj;
destArray[0] = pageCloneMap[((PdfIndirectObject)destArray[0]).Id];
}
}
// Save resulting PDF file.
pdfDocument.Save("Result.pdf");
pdfDocument.Close();
This way you can easily customize the TOC element by using the TOC switches and styles. For more info, see the Table Of Content example from GemBox.Document.

Open XML: We found a problem with some content in 'myfile.xlsx'

I am creating an Excel file using the Open XML SDK.
Worksheet newWs = new Worksheet()
{
    MCAttributes = new MarkupCompatibilityAttributes() { Ignorable = "x14ac" }
};
When I add a SheetViews instance as follows,
SheetViews sheetViews = new SheetViews();
SheetView sheetView = new SheetView();
Selection selection = new Selection() { ActiveCell = "B1" };
sheetView.Append(selection);
sheetViews.Append(sheetView);
newWs.Append(sheetViews);
I get an error as shown below (and also ActiveCell is not working):
We found a problem with some content in 'myfile.xlsx'. Do you want us
to try to recover as much as we can? If you trust the source of this
workbook, click Yes.
This was an issue with the ordering of the Excel XML elements.
I had appended the SheetViews after applying styles, which Open XML didn't like.
I found this out by using the Open XML SDK Productivity Tool, which helped me get the elements into the correct order.
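To illustrate the ordering point, here is a minimal sketch of a worksheet whose children are appended in schema order, with sheetViews ahead of sheetData. The class and method names are made up for the example. The WorkbookViewId and SequenceOfReferences values are assumptions about what a typical sheet needs (workbookViewId is a required attribute in the schema), not something taken from the question.

```csharp
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Spreadsheet;

static class WorksheetOrderExample
{
    // Creates a worksheet with the selection on B1.
    // Children are added in the order the schema expects:
    // sheetViews first, sheetData after it.
    public static Worksheet CreateWorksheet()
    {
        var selection = new Selection
        {
            ActiveCell = "B1",
            SequenceOfReferences = new ListValue<StringValue> { InnerText = "B1" }
        };

        // Assumes the workbook has a single workbook view with index 0.
        var sheetView = new SheetView { WorkbookViewId = 0U };
        sheetView.Append(selection);

        var worksheet = new Worksheet();
        worksheet.Append(new SheetViews(sheetView));
        worksheet.Append(new SheetData());
        return worksheet;
    }
}
```

If the worksheet already has children, inserting the SheetViews before the existing SheetData (rather than appending it at the end) keeps the same order.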

Creating word add in with OpenXML to insert new MergeField

I'm new to VSTO and OpenXML and I would like to develop a Word add-in that uses OpenXML to add a MergeField to the document. I can already add a MergeField from a console app, but I want to insert the MergeField from the Word add-in into the currently open document.
So I have this code in ButtonClick:
// take current file location
var fileFullName = Globals.ThisAddIn.Application.ActiveDocument.FullName;
Globals.ThisAddIn.Application.ActiveDocument.Close(WdSaveOptions.wdSaveChanges, WdOriginalFormat.wdOriginalDocumentFormat, true);
// function to insert new field here
OpenAndAddTextToWordDocument(fileFullName, "username");
Globals.ThisAddIn.Application.Documents.Open(fileFullName);
And I created the function that should add the new MergeField:
public static DocumentFormat.OpenXml.Wordprocessing.Paragraph OpenAndAddTextToWordDocument(string filepath, string txt)
{
// Open a WordprocessingDocument for editing using the filepath.
WordprocessingDocument wordprocessingDocument =
WordprocessingDocument.Open(filepath, true);
// Assign a reference to the existing document body.
Body body = wordprocessingDocument.MainDocumentPart.Document.Body;
// add text
string instructionText = String.Format(" MERGEFIELD {0} \\* MERGEFORMAT", txt);
SimpleField simpleField1 = new SimpleField() { Instruction = instructionText };
Run run1 = new Run();
RunProperties runProperties1 = new RunProperties();
NoProof noProof1 = new NoProof();
runProperties1.Append(noProof1);
Text text1 = new Text();
text1.Text = String.Format("«{0}»", txt);
run1.Append(runProperties1);
run1.Append(text1);
simpleField1.Append(run1);
DocumentFormat.OpenXml.Wordprocessing.Paragraph paragraph = new DocumentFormat.OpenXml.Wordprocessing.Paragraph();
paragraph.Append(new OpenXmlElement[] { simpleField1 });
return paragraph;
// Close the handle explicitly.
wordprocessingDocument.Close();
But something is not working here: when I use the add-in, it doesn't do anything.
Thanks for the help.
Add a try/catch and you'll probably find that it can't open the file because it's currently open for editing.
The OpenXML SDK is a library for writing to Office files without going through Office's interfaces. But you're trying to do so while also using Office's interfaces, so you're essentially trying to take two approaches at once. This isn't going to work unless you first close the document.
But what you probably want to do is use VSTO. In VSTO, each document has a Fields collection that you can use to add fields.
Fields.Add(Range, Type, Text, PreserveFormatting)
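As a rough sketch of that call from an add-in, something like the following could work. It assumes a reference to the Word interop assembly; the helper name InsertMergeField is made up for the example, and the field name passed as Text is just the value from the question.

```csharp
using Word = Microsoft.Office.Interop.Word;

static class MergeFieldExample
{
    // Inserts a MERGEFIELD at the given range of an open document,
    // going through Word's object model instead of the OpenXML SDK.
    public static Word.Field InsertMergeField(Word.Range range, string fieldName)
    {
        object fieldType = Word.WdFieldType.wdFieldMergeField;
        object text = fieldName;          // for a merge field, the text is the field name
        object preserveFormatting = true; // adds the \* MERGEFORMAT switch

        return range.Document.Fields.Add(range, ref fieldType, ref text, ref preserveFormatting);
    }
}
```

In the button click handler this would be called as MergeFieldExample.InsertMergeField(Globals.ThisAddIn.Application.Selection.Range, "username"), with no need to close and reopen the document.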

iTextSharp Input string was not in a correct format css error

I have been trying to get my MVC application to create PDF files based on MVC Views. I got this working with plain HTML, but I would also like to include the CSS files that I use for the browser. Some of them work, but with one I get the following error:
An exception of type 'System.FormatException' occurred in mscorlib.dll but was not handled in user code
Additional information: Input string was not in a correct format.
I am using the following code:
var data = GetHtml(new IndexModel(Context), "~\\Views\\Home\\Index.cshtml", "");
using (var document = new iTextSharp.text.Document())
{
//define output control HTML
var memStream = new MemoryStream();
TextReader xmlString = new StringReader(data);
PdfWriter writer = PdfWriter.GetInstance(document, new FileStream("c:\\tmp\\my.pdf", FileMode.OpenOrCreate));
//open doc
document.Open();
// register all fonts in current computer
FontFactory.RegisterDirectories();
// Set factories
var htmlContext = new HtmlPipelineContext(null);
htmlContext.SetTagFactory(Tags.GetHtmlTagProcessorFactory());
// Set css
ICSSResolver cssResolver = XMLWorkerHelper.GetInstance().GetDefaultCssResolver(false);
cssResolver.AddCssFile(HttpContext.Server.MapPath("~/Content/elements.css"), true);
cssResolver.AddCssFile(HttpContext.Server.MapPath("~/Content/style.css"), true);
cssResolver.AddCssFile(HttpContext.Server.MapPath("~/Content/jquery-ui.css"), true);
// Export
IPipeline pipeline = new CssResolverPipeline(cssResolver, new HtmlPipeline(htmlContext, new PdfWriterPipeline(document, writer)));
var worker = new XMLWorker(pipeline, true);
var xmlParse = new XMLParser(true, worker);
xmlParse.Parse(xmlString);
xmlParse.Flush();
document.Close();
}
The string "data" is correct and has no issues; the problem lies with AddCssFile().
If I create the PDF without any CSS files everything works, but including the CSS files triggers the error.
Help will be very much appreciated.
I don't know the exact answer, but by looking at the error you are getting back, I would try two different approaches.
Change the
cssResolver.AddCssFile(HttpContext.Server.MapPath("~/Content/elements.css"), true);
to something like
var cssPath = HttpContext.Server.MapPath("~/Content/elements.css");
cssResolver.AddCssFile(cssPath, true);
Then set a breakpoint and look at the values being returned for cssPath. Make sure they are accurate and do not contain any odd characters.
Second approach... If all else fails, try giving an absolute URL to the CSS resource such as http://yourdomain.com/cssPath instead of a file system path.
If either of those two approaches helps, you can use it to determine the actual problem and then refactor to your heart's content after that.
UPDATE ------------------------------------------------------------------>
According to the documentation, you need an absolute URL for the file, so Server.MapPath won't work.
addCssFile
void addCssFile(String href,
boolean isPersistent)
throws CssResolverException
Add a
Parameters:
href - the link to the css file ( an absolute uri )
isPersistent - true if the added css should not be deleted on a call to clear
Throws:
CssResolverException - thrown if something goes wrong
In that case, I would try using something like:
public string AbsoluteContent(string contentPath)
{
    var path = Url.Content(contentPath);
    var url = new Uri(HttpContext.Current.Request.Url, path);
    return url.AbsoluteUri;
}
and use it like this:
var cssPath = AbsoluteContent("~/Content/embeddedCss/yourcssfile.css");
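To make it clearer what the helper does, here is the URI resolution step on its own, using only System.Uri. It assumes Url.Content has already turned "~/Content/elements.css" into a root-relative path such as "/Content/elements.css" (which is what you get when the application runs at the site root). The host name and the ToAbsoluteUrl helper are made up for the example.

```csharp
using System;

static class AbsoluteUrlExample
{
    // Resolves a root-relative path against the URL of the current request.
    // Only the scheme, host and port of the request URL survive;
    // its own path is replaced by the root-relative one.
    public static string ToAbsoluteUrl(Uri requestUrl, string rootRelativePath)
    {
        return new Uri(requestUrl, rootRelativePath).AbsoluteUri;
    }

    static void Main()
    {
        var requestUrl = new Uri("http://example.com/Home/Index");
        Console.WriteLine(ToAbsoluteUrl(requestUrl, "/Content/elements.css"));
        // prints "http://example.com/Content/elements.css"
    }
}
```

The resulting string is the kind of absolute URI that the addCssFile documentation quoted above asks for.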

Heading 1, Heading 2 is not highlighted in style ribbon of document after merging docx file

I am merging a few docx files that were created using OpenXML and WordML through C#. The files contain text tagged with heading styles such as Heading 1, Heading 2, etc. When the files are created individually, clicking or selecting the tagged text highlights Heading 1, Heading 2, etc. in the style ribbon, and the navigation pane also shows those headings. But after merging the documents, clicking or selecting the same text no longer highlights Heading 1 or Heading 2 in the style ribbon. The code for the merge is given here:
MemoryStream ms = new MemoryStream();
using (WordprocessingDocument myDoc =
WordprocessingDocument.Create(ms, WordprocessingDocumentType.Document))
{
MainDocumentPart mainPart = myDoc.AddMainDocumentPart();
mainPart.Document = new Document { Body = new Body() };
int counter = 1;
foreach (var sectionOutput in sectionOutputs)
{
foreach (var outputFile in sectionOutput.Files)
{
Paragraph sectionBreakPara = null;
if (!sectionOutput.SectionType.Equals(sectionOutputs[sectionOutputs.Count - 1].SectionType))
{
if (outputFile == sectionOutput.Files.Last())
//check whether this is the last file in this section
{
using (
WordprocessingDocument pkgSourceDoc =
WordprocessingDocument.Open(outputFile.OutputStream, true))
{
var sourceBody = pkgSourceDoc.MainDocumentPart.Document.Body;
SectionProperties docSectionBreak =
sourceBody.Descendants<SectionProperties>().LastOrDefault();
if (docSectionBreak != null)
{
var clonedSectionBreak = (SectionProperties)docSectionBreak.CloneNode(true);
clonedSectionBreak.RemoveAllChildren<FooterReference>();
clonedSectionBreak.RemoveAllChildren<HeaderReference>();
sectionBreakPara = new Paragraph();
ParagraphProperties sectionParaProp = new ParagraphProperties();
sectionParaProp.AppendChild(clonedSectionBreak);
sectionBreakPara.AppendChild(sectionParaProp);
}
}
}
}
string altChunkId = string.Format("altchunkId{0}", counter);
AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(
AlternativeFormatImportPartType.WordprocessingML, altChunkId);
outputFile.OutputStream.Seek(0, SeekOrigin.Begin);
chunk.FeedData(outputFile.OutputStream);
AltChunk altChunk = new AltChunk(new AltChunkProperties(new MatchSource { Val = new OnOffValue(true) })) { Id = altChunkId };
mainPart.Document.Body.AppendChild(altChunk);
if (sectionBreakPara != null)
{
mainPart.Document
.Body
.AppendChild(sectionBreakPara);
}
counter++;
}
}
mainPart.Document.Save();
}
return ms;
In general, this symptom arises when the style definition is not present in the styles.xml part. If during the merge process the document content was carried over but the styles parts weren't, that could cause this problem.
In a new Word document, there are only a very few basic styles, like Normal. A style definition like Heading 1 is not added to the styles.xml until you assign that style to a paragraph. When a paragraph element contains a style assignment for a style not present in the package, the style is ignored.
It can also arise in table cells, where a table setting is overriding the style. For example, in a table you can say the first row (like headings) should appear in a particular font and color, and that will override a style setting.
If neither of those applies, posting a smallish amount of the generated XML, right around one of the paragraphs and its immediate context, might give some clues.
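One way to check the first possibility is to look for the style ID in the merged document's styles part. The sketch below assumes the headings reference the style ID "Heading1"; the class and method names are made up for the example.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static class StyleCheckExample
{
    // Returns true when the styles part defines a paragraph style with the given ID.
    // Returns false when the styles part is missing altogether.
    public static bool HasParagraphStyle(WordprocessingDocument document, string styleId)
    {
        var styles = document.MainDocumentPart?.StyleDefinitionsPart?.Styles;
        if (styles == null)
            return false;

        return styles.Elements<Style>().Any(style =>
            style.Type != null && style.Type.Value == StyleValues.Paragraph &&
            style.StyleId != null && style.StyleId.Value == styleId);
    }
}
```

If HasParagraphStyle(myDoc, "Heading1") returns false for the merged document, the definition was not carried over. Note that the merge code above creates the target document without any styles part at all.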
