Merge 2 Word Document using c#

Merge 2 Word Document using c# - c#

I am trying to merge 2 word documents in binary format, however, it only gets the first document. The resulting document should have all elements from each source document included with formatting.
Here is the code so far. Using OpenXml by the way.
private static byte[] Merge(byte[] dest, byte[] src)
{
string altChunkId = "AltChunkId" + DateTime.Now.Ticks.ToString();
var memoryStreamDest = new MemoryStream();
memoryStreamDest.Write(dest, 0, dest.Length);
memoryStreamDest.Seek(0, SeekOrigin.Begin);
using (WordprocessingDocument doc = WordprocessingDocument.Open(memoryStreamDest, true))
{
MainDocumentPart mainPart = doc.MainDocumentPart;
Paragraph para = new Paragraph(new Run((new Break() { Type = BreakValues.Page })));
mainPart.Document.Body.InsertAfter(para, mainPart.Document.Body.LastChild);
//Insert the source file into the target file using AltChunk
AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(
AlternativeFormatImportPartType.WordprocessingML, altChunkId);
using (MemoryStream mem = new MemoryStream())
{
mem.Write(src, 0, (int)src.Length);
mem.Seek(0, SeekOrigin.Begin);
chunk.FeedData(mem);
}
AltChunk altChunk = new AltChunk();
altChunk.Id = altChunkId;
mainPart.Document.Body.InsertAfter(altChunk, mainPart.Document.Body.Descendants<Paragraph>().Last());
mainPart.Document.Save();
return memoryStreamDest.ToArray();
}
}

Related

How to convert XML to Byte to File/PDF

I have successfully done "find and replace" which created an xml. Now I want to convert the newly created xml file to pdf which will be attached as a file and sent in a mail.
The result of the Base64String was tested on a base64 pdf file converter but the pdf cannot be opened. Got this error: Something went wrong couldn't open the file
HOW CAN I MAKE THIS WORK?
public async Task<string> CreateDocument(string PolicyNumber)
{
var policy = await _context.Policy.SingleOrDefaultAsync(p => p.PolicyNumber == PolicyNumber);
ArgumentNullException.ThrowIfNull(policy, "Policy Not Available");
//CreatePolicyDocument
//create policy document
var files = #"C:\Users\PATHTODOCUMENT\holderTest.docx";
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(files, true))
{
string docText;
using (StreamReader sr = new StreamReader(wordDoc.MainDocumentPart.GetStream()))
{
docText = sr.ReadToEnd();
}
Regex regexText = new Regex("XCONCLUSION_DATEX");
var newWordText = regexText.Replace(docText, "Hi Everyone!");
using (StreamWriter sw = new StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create)))
{
sw.Write(newWordText);
Encoding encoding = Encoding.UTF8;
byte[] docAsBytes = encoding.GetBytes(newWordText);
File.WriteAllBytes("hello.pdf", docAsBytes);
var file = Convert.ToBase64String(docAsBytes);
}
}
//send message
//
return "";
}

PDF file has its own construct,so you can't generate a pdf file with the contentstring of xml file dirctly.
If you really want to convert xml contentstring to PDF,you could try iTextSharp.
I tried with a simple demo,and here's the code:
[HttpPost]
public IActionResult XMLtoPDF([FromForm] Try T1)
{
var streamContent = new StreamContent(T1.file.OpenReadStream());
var filestring = streamContent.ReadAsStringAsync().Result;
MemoryStream outputStream = new MemoryStream();
Document doc = new Document();
PdfWriter writer = PdfWriter.GetInstance(doc, outputStream);
//PDF settings
PdfDestination pdfDest = new PdfDestination(PdfDestination.XYZ, 0, doc.PageSize.Height, 1f);
doc.Open();
var paragraph = new Paragraph(filestring);
doc.Add(paragraph);
doc.Close();
outputStream.Close();
var pdfArray = outputStream.ToArray();
return File(pdfArray, "application/pdf");
}
Result:

Convert html to pdf and merge it with existing pdfs

I have a System.Net.Mail.MailMessage which shall have it's html body and pdf attachments converted into one single pdf.
Converting the html body to pdf works for me with this answer
Converting the pdf attachments into one pdf works for me with this answer
However after ~10 hours of trying I can not come up with a combined solution which does both. All I'm getting are NullReferenceExceptions somewhere in IText source, "the document is not open", etc...
For example, this will throw no error but the resulting pdf will only contain the attachments but not the html email body:
Document document = new Document();
StringReader sr = new StringReader(mail.Body);
HTMLWorker htmlparser = new HTMLWorker(document);
using (FileStream fs = new FileStream(targetPath, FileMode.Create))
{
PdfCopy writer = new PdfCopy(document, fs);
document.Open();
htmlparser.Parse(sr);
foreach (string fileName in pdfList)
{
PdfReader reader = new PdfReader(fileName);
reader.ConsolidateNamedDestinations();
for (int i = 1; i <= reader.NumberOfPages; i++)
{
PdfImportedPage page = writer.GetImportedPage(reader, i);
writer.AddPage(page);
}
PRAcroForm form = reader.AcroForm;
if (form != null)
{
writer.CopyAcroForm(reader);
}
reader.Close();
}
writer.Close();
document.Close();
}
I'm using the LGPL licensed ITextSharp 4.1.6

From v4.1.6 fanboy to v4.1.6 fanboy :D
Looks like the HTMLWorker is closing the documents stream right after parsing. So as a workaround, you could create a pdf from your mailbody in memory. And then add this one together with the attachment to your final pdf.
Here is some code, that should do the trick:
StringReader htmlStringReader = new StringReader("<html><body>Hello World!!!!!!</body></html>");
byte[] htmlResult;
using (MemoryStream htmlStream = new MemoryStream())
{
Document htmlDoc = new Document();
PdfWriter htmlWriter = PdfWriter.GetInstance(htmlDoc, htmlStream);
htmlDoc.Open();
HTMLWorker htmlWorker = new HTMLWorker(htmlDoc);
htmlWorker.Parse(htmlStringReader);
htmlDoc.Close();
htmlResult = htmlStream.ToArray();
}
byte[] pdfResult;
using (MemoryStream pdfStream = new MemoryStream())
{
Document doc = new Document();
PdfCopy copyWriter = new PdfCopy(doc, pdfStream);
doc.Open();
PdfReader htmlPdfReader = new PdfReader(htmlResult);
AppendPdf(copyWriter, htmlPdfReader); // your foreach pdf code here
htmlPdfReader.Close();
PdfReader attachmentReader = new PdfReader("C:\\temp\\test.pdf");
AppendPdf(copyWriter, attachmentReader);
attachmentReader.Close();
doc.Close();
pdfResult = pdfStream.ToArray();
}
using (FileStream fs = new FileStream("C:\\temp\\test2.pdf", FileMode.Create, FileAccess.Write))
{
fs.Write(pdfResult, 0, pdfResult.Length);
}
private void AppendPdf(PdfCopy writer, PdfReader reader)
{
for (int i = 1; i <= reader.NumberOfPages; i++)
{
PdfImportedPage page = writer.GetImportedPage(reader, i);
writer.AddPage(page);
}
}
Ofc you could directly use a FileStream for the final document instead of a MemoryStream as well.

combine two word documents into one [duplicate]

I have around 10 word documents which I generate using open xml and other stuff.
Now I would like to create another word document and one by one I would like to join them into this newly created document.
I wish to use open xml, any hint would be appreciable.
Below is my code:
private void CreateSampleWordDocument()
{
//string sourceFile = Path.Combine("D:\\GeneralLetter.dot");
//string destinationFile = Path.Combine("D:\\New.doc");
string sourceFile = Path.Combine("D:\\GeneralWelcomeLetter.docx");
string destinationFile = Path.Combine("D:\\New.docx");
try
{
// Create a copy of the template file and open the copy
//File.Copy(sourceFile, destinationFile, true);
using (WordprocessingDocument document = WordprocessingDocument.Open(destinationFile, true))
{
// Change the document type to Document
document.ChangeDocumentType(DocumentFormat.OpenXml.WordprocessingDocumentType.Document);
//Get the Main Part of the document
MainDocumentPart mainPart = document.MainDocumentPart;
mainPart.Document.Save();
}
}
catch
{
}
}
Update( using AltChunks):
using (WordprocessingDocument myDoc = WordprocessingDocument.Open("D:\\Test.docx", true))
{
string altChunkId = "AltChunkId" + DateTime.Now.Ticks.ToString().Substring(0, 2) ;
MainDocumentPart mainPart = myDoc.MainDocumentPart;
AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(
AlternativeFormatImportPartType.WordprocessingML, altChunkId);
using (FileStream fileStream = File.Open("D:\\Test1.docx", FileMode.Open))
chunk.FeedData(fileStream);
AltChunk altChunk = new AltChunk();
altChunk.Id = altChunkId;
mainPart.Document
.Body
.InsertAfter(altChunk, mainPart.Document.Body.Elements<Paragraph>().Last());
mainPart.Document.Save();
}
Why this code overwrites the content of the last file when I use multiple files?
Update 2:
using (WordprocessingDocument myDoc = WordprocessingDocument.Open("D:\\Test.docx", true))
{
MainDocumentPart mainPart = myDoc.MainDocumentPart;
string altChunkId = "AltChunkId" + DateTime.Now.Ticks.ToString().Substring(0, 3);
AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.WordprocessingML, altChunkId);
using (FileStream fileStream = File.Open("d:\\Test1.docx", FileMode.Open))
{
chunk.FeedData(fileStream);
AltChunk altChunk = new AltChunk();
altChunk.Id = altChunkId;
mainPart.Document
.Body
.InsertAfter(altChunk, mainPart.Document.Body
.Elements<Paragraph>().Last());
mainPart.Document.Save();
}
using (FileStream fileStream = File.Open("d:\\Test2.docx", FileMode.Open))
{
chunk.FeedData(fileStream);
AltChunk altChunk = new AltChunk();
altChunk.Id = altChunkId;
mainPart.Document
.Body
.InsertAfter(altChunk, mainPart.Document.Body
.Elements<Paragraph>().Last());
}
using (FileStream fileStream = File.Open("d:\\Test3.docx", FileMode.Open))
{
chunk.FeedData(fileStream);
AltChunk altChunk = new AltChunk();
altChunk.Id = altChunkId;
mainPart.Document
.Body
.InsertAfter(altChunk, mainPart.Document.Body
.Elements<Paragraph>().Last());
}
}
This code is appending the Test2 data twice, in place of Test1 data as well.
Means I get:
Test
Test2
Test2
instead of :
Test
Test1
Test2

Using openXML SDK only, you can use AltChunk element to merge the multiple document into one.
This link the-easy-way-to-assemble-multiple-word-documents and this one How to Use altChunk for Document Assembly provide some samples.
EDIT 1
Based on your code that uses altchunk in the updated question (update#1), here is the VB.Net code I have tested and that works like a charm for me:
Using myDoc = DocumentFormat.OpenXml.Packaging.WordprocessingDocument.Open("D:\\Test.docx", True)
Dim altChunkId = "AltChunkId" + DateTime.Now.Ticks.ToString().Substring(0, 2)
Dim mainPart = myDoc.MainDocumentPart
Dim chunk = mainPart.AddAlternativeFormatImportPart(
DocumentFormat.OpenXml.Packaging.AlternativeFormatImportPartType.WordprocessingML, altChunkId)
Using fileStream As IO.FileStream = IO.File.Open("D:\\Test1.docx", IO.FileMode.Open)
chunk.FeedData(fileStream)
End Using
Dim altChunk = New DocumentFormat.OpenXml.Wordprocessing.AltChunk()
altChunk.Id = altChunkId
mainPart.Document.Body.InsertAfter(altChunk, mainPart.Document.Body.Elements(Of DocumentFormat.OpenXml.Wordprocessing.Paragraph).Last())
mainPart.Document.Save()
End Using
EDIT 2
The second issue (update#2)
This code is appending the Test2 data twice, in place of Test1 data as
well.
is related to altchunkid.
For each document you want to merge in the main document, you need to:
add an AlternativeFormatImportPart in the mainDocumentPart with an Id which must be unique. This element contains the inserted data
add in the body an Altchunk element in which you set the id to reference the previous AlternativeFormatImportPart.
In your code, you are using the same Id for all the AltChunks. It's why you see many time the same text.
I am not sure the altchunkid will be unique with your code: string altChunkId = "AltChunkId" + DateTime.Now.Ticks.ToString().Substring(0, 2);
If you don't need to set a specific value, I recommend you to not set explicitly the AltChunkId when you add the AlternativeFormatImportPart. Instead, you get the one generated by the SDK like this:
VB.Net
Dim chunk As AlternativeFormatImportPart = mainPart.AddAlternativeFormatImportPart(DocumentFormat.OpenXml.Packaging.AlternativeFormatImportPartType.WordprocessingML)
Dim altchunkid As String = mainPart.GetIdOfPart(chunk)
C#
AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(DocumentFormat.OpenXml.Packaging.AlternativeFormatImportPartType.WordprocessingML);
string altchunkid = mainPart.GetIdOfPart(chunk);

There is a nice wrapper API (Document Builder 2.2) around open xml specially designed to merge documents, with flexibility of choosing the paragraphs to merge etc. You can download it from here (update: moved to github).
The documentation and screen casts on how to use it are here.
Update: Code Sample
var sources = new List<Source>();
//Document Streams (File Streams) of the documents to be merged.
foreach (var stream in documentstreams)
{
var tempms = new MemoryStream();
stream.CopyTo(tempms);
sources.Add(new Source(new WmlDocument(stream.Length.ToString(), tempms), true));
}
var mergedDoc = DocumentBuilder.BuildDocument(sources);
mergedDoc.SaveAs(#"C:\TargetFilePath");
Types Source and WmlDocument are from Document Builder API.
You can even add the file paths directly if you choose to as:
sources.Add(new Source(new WmlDocument(#"C:\FileToBeMerged1.docx"));
sources.Add(new Source(new WmlDocument(#"C:\FileToBeMerged2.docx"));
Found this Nice Comparison between AltChunk and Document Builder approaches to merge documents - helpful to choose based on ones requirements.
You can also use DocX library to merge documents but I prefer Document Builder over this for merging documents.
Hope this helps.

The only thing missing in these answers is the for loop.
For those who just want to copy / paste it:
void MergeInNewFile(string resultFile, IList<string> filenames)
{
using (WordprocessingDocument document = WordprocessingDocument.Create(resultFile, WordprocessingDocumentType.Document))
{
MainDocumentPart mainPart = document.AddMainDocumentPart();
mainPart.Document = new Document(new Body());
foreach (string filename in filenames)
{
AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.WordprocessingML);
string altChunkId = mainPart.GetIdOfPart(chunk);
using (FileStream fileStream = File.Open(filename, FileMode.Open))
{
chunk.FeedData(fileStream);
}
AltChunk altChunk = new AltChunk { Id = altChunkId };
mainPart.Document.Body.AppendChild(altChunk);
}
mainPart.Document.Save();
}
}
All credits go to Chris and yonexbat

Easy to use in C#:
using System;
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
namespace WordMergeProject
{
public class Program
{
private static void Main(string[] args)
{
byte[] word1 = File.ReadAllBytes(#"..\..\word1.docx");
byte[] word2 = File.ReadAllBytes(#"..\..\word2.docx");
byte[] result = Merge(word1, word2);
File.WriteAllBytes(#"..\..\word3.docx", result);
}
private static byte[] Merge(byte[] dest, byte[] src)
{
string altChunkId = "AltChunkId" + DateTime.Now.Ticks.ToString();
var memoryStreamDest = new MemoryStream();
memoryStreamDest.Write(dest, 0, dest.Length);
memoryStreamDest.Seek(0, SeekOrigin.Begin);
var memoryStreamSrc = new MemoryStream(src);
using (WordprocessingDocument doc = WordprocessingDocument.Open(memoryStreamDest, true))
{
MainDocumentPart mainPart = doc.MainDocumentPart;
AlternativeFormatImportPart altPart =
mainPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.WordprocessingML, altChunkId);
altPart.FeedData(memoryStreamSrc);
var altChunk = new AltChunk();
altChunk.Id = altChunkId;
OpenXmlElement lastElem = mainPart.Document.Body.Elements<AltChunk>().LastOrDefault();
if(lastElem == null)
{
lastElem = mainPart.Document.Body.Elements<Paragraph>().Last();
}
//Page Brake einfügen
Paragraph pageBreakP = new Paragraph();
Run pageBreakR = new Run();
Break pageBreakBr = new Break() { Type = BreakValues.Page };
pageBreakP.Append(pageBreakR);
pageBreakR.Append(pageBreakBr);
return memoryStreamDest.ToArray();
}
}
}

My solution :
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
namespace TestFusionWord
{
internal class Program
{
public static void MergeDocx(List<string> ListPathFilesToMerge, string DestinationPathFile, bool OverWriteDestination, bool WithBreakPage)
{
#region Control arguments
List<string> ListError = new List<string>();
if (ListPathFilesToMerge == null || ListPathFilesToMerge.Count == 0)
{
ListError.Add("Il n'y a aucun fichier à fusionner dans la liste passée en paramètre ListPathFilesToMerge");
}
else
{
foreach (var item in ListPathFilesToMerge.Where(x => Path.GetExtension(x.ToLower()) != ".docx"))
{
ListError.Add(string.Format("Le fichier '{0}' indiqué dans la liste passée en paramètre ListPathFilesToMerge n'a pas l'extension .docx", item));
}
foreach (var item in ListPathFilesToMerge.Where(x => !File.Exists(x)))
{
ListError.Add(string.Format("Le fichier '{0}' indiqué dans la liste passée en paramètre ListPathFilesToMerge n'existe pas", item));
}
}
if (string.IsNullOrWhiteSpace(DestinationPathFile))
{
ListError.Add("Le fichier destination FinalPathFile passé en paramètre ne peut être vide");
}
else
{
if (Path.GetExtension(DestinationPathFile.ToLower()) != ".docx")
{
ListError.Add(string.Format("Le fichier destination '{0}' indiqué dans le paramètre DestinationPathFile n'a pas l'extension .docx", DestinationPathFile));
}
if (File.Exists(DestinationPathFile) && !OverWriteDestination)
{
ListError.Add(string.Format("Le fichier destination '{0}' existe déjà. Utilisez l'argument OverWriteDestination si vous souhaitez l'écraser", DestinationPathFile));
}
}
if (ListError.Any())
{
string MessageError = "Des erreurs ont été rencontrés, détail : " + Environment.NewLine + ListError.Select(x => "- " + x).Aggregate((x, y) => x + Environment.NewLine + y);
throw new ArgumentException(MessageError);
}
#endregion Control arguments
#region Merge Files
//Suppression du fichier destination (aucune erreur déclenchée si le fichier n'existe pas)
File.Delete(DestinationPathFile);
//Création du fichier destination à vide
using (WordprocessingDocument document = WordprocessingDocument.Create(DestinationPathFile, WordprocessingDocumentType.Document))
{
MainDocumentPart mainPart = document.AddMainDocumentPart();
mainPart.Document = new Document(new Body());
document.MainDocumentPart.Document.Save();
}
//Fusion des documents
using (WordprocessingDocument myDoc = WordprocessingDocument.Open(DestinationPathFile, true))
{
MainDocumentPart mainPart = myDoc.MainDocumentPart;
Body body = mainPart.Document.Body;
for (int i = 0; i < ListPathFilesToMerge.Count; i++)
{
string currentpathfile = ListPathFilesToMerge[i];
AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.WordprocessingML);
string altchunkid = mainPart.GetIdOfPart(chunk);
using (FileStream fileStream = File.Open(currentpathfile, FileMode.Open))
chunk.FeedData(fileStream);
AltChunk altChunk = new AltChunk();
altChunk.Id = altchunkid;
OpenXmlElement last = body.Elements().LastOrDefault(e => e is AltChunk || e is Paragraph);
body.InsertAfter(altChunk, last);
if (WithBreakPage && i < ListPathFilesToMerge.Count - 1) // If its not the last file, add breakpage
{
last = body.Elements().LastOrDefault(e => e is AltChunk || e is Paragraph);
last.InsertAfterSelf(new Paragraph(new Run(new Break() { Type = BreakValues.Page })));
}
}
mainPart.Document.Save();
}
#endregion Merge Files
}
private static int Main(string[] args)
{
try
{
string DestinationPathFile = #"C:\temp\testfusion\docfinal.docx";
List<string> ListPathFilesToMerge = new List<string>()
{
#"C:\temp\testfusion\fichier1.docx",
#"C:\temp\testfusion\fichier2.docx",
#"C:\temp\testfusion\fichier3.docx"
};
ListPathFilesToMerge.Sort(); //Sort for always have the same file
MergeDocx(ListPathFilesToMerge, DestinationPathFile, true, true);
#if DEBUG
Process.Start(DestinationPathFile); //open file
#endif
return 0;
}
catch (Exception Ex)
{
Console.Error.WriteLine(Ex.Message);
//Log exception here
return -1;
}
}
}
}

convert a pdfreader to a pdfdocument

Can anyone tell me how to convert a PdfReader object into a PdfDocument ?
I have read a disk file and converted to a memorystream but I need it as a PdfDocument for other methods in my C# program.
I'm converting an application to use iTextSharp instead of PdfSharp.
MemoryStream pdfstream = new MemoryStream();
/* Convert the attachment to an byte array */
byte[] pdfarray = (byte[])dr["Data"];
/* Write the attachment into the memory */
pdfstream.Write(pdfarray, 0, pdfarray.Length);
/* Set the memorystream to the beginning */
pdfstream.Seek(0, System.IO.SeekOrigin.Begin);
/* Open the pdf document */
PdfSharp.Pdf.PdfDocument document = PdfSharp.Pdf.IO.PdfReader.Open(pdfstream, PdfDocumentOpenMode.Modify);
//iTextSharp.text.Document doc1 = iTextSharp.text.pdf.PdfReader.GetStreamBytes(
//ITS.pdf.PdfReader rdr = ITS.pdf.PdfReader(
string filename = DateTime.Now.Ticks.ToString() + "_" + dr["AttachmentName"].ToString();
string path = Path.Combine(FolderName, filename);
document.Save(path);

I think you can do something like this (note code not run or tested, might need a tweak):
using (MemoryStream ms = new MemoryStream())
{
Document doc = new Document(PageSize.A4, 50, 50, 15, 15);
PdfWriter writer = PdfWriter.GetInstance(doc, ms);
using (var rdr = new PdfReader(filePath))
{
PdfImportedPage page;
for(int i = 1; i <= rdr.PageCount; i++)
{
page = writer.GetImportedPage(templateReader, i)
writer.DirectContent.AddTemplate(page, 0, 0);
doc.NewPage();
}
}
}
This will read in the PDF page by page and output it to your document.

Corrupt document after calling AddAlternativeFormatImportPart using OpenXml

I am trying to create an AddAlternativeFormatImportPart in a .docx file in order to reference it in the document via an AltChunk. the problem is that the code below causes the docx file to read as corrupted by Word and cannot be opened.
string html = "some html code."
string altChunkId = "html234";
var document = WordprocessingDocument.Open(inMemoryPackage, true);
var mainPart = document.MainDocumentPart.Document;
var mainDocumentPart = document.MainDocumentPart;
AlternativeFormatImportPart chunk = mainDocumentPart.AddAlternativeFormatImportPart
(AlternativeFormatImportPartType.Xhtml, altChunkId);
Stream contentStream = chunk.GetStream(FileMode.Open,FileAccess.ReadWrite);
StreamWriter contentWriter = new StreamWriter(contentStream);
contentWriter.Write(html);
contentWriter.Flush();
{
...
}
mainPart.Save();

I think it might be how you are handeling the stream from the AlternativeFormatImportPart. Try using FeedData instead, like in my example below.
StringBuilder xhtmlBuilder = new StringBuilder();
xhtmlBuilder.Append("<html>");
xhtmlBuilder.Append("<body>");
xhtmlBuilder.Append("<b>Hello world!</b>");
xhtmlBuilder.Append("</body>");
xhtmlBuilder.Append("</html>");
using (WordprocessingDocument doc = WordprocessingDocument.Open(inputFilePath, true))
{
string altChunkId = "chunk1";
AlternativeFormatImportPart chunk = doc.MainDocumentPart.AddAlternativeFormatImportPart
(AlternativeFormatImportPartType.Xhtml, altChunkId);
using (MemoryStream xhtmlStream = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(xhtmlBuilder.ToString())))
{
chunk.FeedData(xhtmlStream);
AltChunk altChunk = new AltChunk();
altChunk.Id = altChunkId;
doc.MainDocumentPart.Document.Body.Append(altChunk);
}
doc.MainDocumentPart.Document.Save();
}

I think it is because you cannot import an AltChunk into a document that is opened from a memory stream. I had the same issue. I was opening the template from a memory stream like so:
Private Sub UpdateDoc(templatePath As String)
Using fs As FileStream = File.OpenRead(templatePath)
Using ms As New MemoryStream
CopyStream(fs, ms)
Using doc As WordprocessingDocument = WordprocessingDocument.Open(ms, True)
'update the document
doc.MainDocumentPart.Document.Save()
End Using
End Using
End Using
End Sub
Private Sub CopyStream(source As Stream, target As Stream)
Dim buffer() As Byte
Dim bytesRead As Integer = 1
ReDim buffer(32768)
While bytesRead > 0
bytesRead = 0
bytesRead = source.Read(buffer, 0, buffer.Length)
target.Write(buffer, 0, bytesRead)
End While
End Sub
This works for normal updates of content controls etc. and document is fine when streamed back to client or saved as docx. But it corrupts doc when inserting an AltChunk.
Opening a doc from a physical file path works when inserting AltChunk like so:
Using doc As WordprocessingDocument = WordprocessingDocument.Open(strTempFile, True)
Dim altChunkId As String = "AltChunkId1"
Dim mainDocPart As MainDocumentPart = doc.MainDocumentPart
Dim chunk As AlternativeFormatImportPart = mainDocPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.Xhtml,
altChunkId)
Dim strHTML As String = "<html><head/><body><h1>Html Heading</h1><p>This is an html document in a string literal.</p></body></html>"
Using chunkStream As Stream = chunk.GetStream(FileMode.Create, FileAccess.Write)
Using sr As StreamWriter = New StreamWriter(chunkStream)
sr.Write(strHTML)
End Using
End Using
Dim altChunk As New AltChunk
altChunk.Id = altChunkId
mainDocPart.Document.Body.InsertAfter(altChunk, mainDocPart.Document.Body.Elements(Of Paragraph)().Last())
mainDocPart.Document.Save()
End Using
It seems you cannot import an AltChunk into a memory stream, you can only do it when you open the physical file for writing. Can anyone shed some light on this matter?

I know this is an old post, but i have the same issue.
When using AltChunk in file, it works but not when in MemoryStream.
It would be great if anyone knows anything about this. This is how i initiate the WordprocessingDocument
var byteArrayWithFileFrom360 = ProcessFileHandler.GetFileContent(204735);
var wordDocMemoryStream = new MemoryStream();
wordDocMemoryStream.Write(byteArrayWithFileFrom360, 0, byteArrayWithFileFrom360.Length);
var myDoc = WordprocessingDocument.Open(wordDocMemoryStream, true);

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Merge 2 Word Document using c# - c#

Related

How to convert XML to Byte to File/PDF

Convert html to pdf and merge it with existing pdfs

combine two word documents into one [duplicate]

convert a pdfreader to a pdfdocument

Corrupt document after calling AddAlternativeFormatImportPart using OpenXml

Categories

Resources