Corrupt document after calling AddAlternativeFormatImportPart using OpenXml - c#

I am trying to add an AlternativeFormatImportPart to a .docx file so that I can reference it in the document via an AltChunk. The problem is that the code below causes Word to report the .docx file as corrupt, and it cannot be opened.
string html = "some html code.";
string altChunkId = "html234";
var document = WordprocessingDocument.Open(inMemoryPackage, true);
var mainPart = document.MainDocumentPart.Document;
var mainDocumentPart = document.MainDocumentPart;
AlternativeFormatImportPart chunk = mainDocumentPart.AddAlternativeFormatImportPart
(AlternativeFormatImportPartType.Xhtml, altChunkId);
Stream contentStream = chunk.GetStream(FileMode.Open,FileAccess.ReadWrite);
StreamWriter contentWriter = new StreamWriter(contentStream);
contentWriter.Write(html);
contentWriter.Flush();
{
...
}
mainPart.Save();

I think it might be how you are handling the stream from the AlternativeFormatImportPart. Try using FeedData instead, as in my example below.
StringBuilder xhtmlBuilder = new StringBuilder();
xhtmlBuilder.Append("<html>");
xhtmlBuilder.Append("<body>");
xhtmlBuilder.Append("<b>Hello world!</b>");
xhtmlBuilder.Append("</body>");
xhtmlBuilder.Append("</html>");
using (WordprocessingDocument doc = WordprocessingDocument.Open(inputFilePath, true))
{
string altChunkId = "chunk1";
AlternativeFormatImportPart chunk = doc.MainDocumentPart.AddAlternativeFormatImportPart
(AlternativeFormatImportPartType.Xhtml, altChunkId);
using (MemoryStream xhtmlStream = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(xhtmlBuilder.ToString())))
{
chunk.FeedData(xhtmlStream);
AltChunk altChunk = new AltChunk();
altChunk.Id = altChunkId;
doc.MainDocumentPart.Document.Body.Append(altChunk);
}
doc.MainDocumentPart.Document.Save();
}

I think it is because you cannot import an AltChunk into a document that is opened from a memory stream. I had the same issue. I was opening the template from a memory stream like so:
Private Sub UpdateDoc(templatePath As String)
Using fs As FileStream = File.OpenRead(templatePath)
Using ms As New MemoryStream
CopyStream(fs, ms)
Using doc As WordprocessingDocument = WordprocessingDocument.Open(ms, True)
'update the document
doc.MainDocumentPart.Document.Save()
End Using
End Using
End Using
End Sub
Private Sub CopyStream(source As Stream, target As Stream)
Dim buffer() As Byte
Dim bytesRead As Integer = 1
ReDim buffer(32768)
While bytesRead > 0
bytesRead = 0
bytesRead = source.Read(buffer, 0, buffer.Length)
target.Write(buffer, 0, bytesRead)
End While
End Sub
This works for normal updates of content controls etc., and the document is fine when streamed back to the client or saved as a .docx. But it corrupts the document when an AltChunk is inserted.
Opening the document from a physical file path does work when inserting an AltChunk, like so:
Using doc As WordprocessingDocument = WordprocessingDocument.Open(strTempFile, True)
Dim altChunkId As String = "AltChunkId1"
Dim mainDocPart As MainDocumentPart = doc.MainDocumentPart
Dim chunk As AlternativeFormatImportPart = mainDocPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.Xhtml,
altChunkId)
Dim strHTML As String = "<html><head/><body><h1>Html Heading</h1><p>This is an html document in a string literal.</p></body></html>"
Using chunkStream As Stream = chunk.GetStream(FileMode.Create, FileAccess.Write)
Using sr As StreamWriter = New StreamWriter(chunkStream)
sr.Write(strHTML)
End Using
End Using
Dim altChunk As New AltChunk
altChunk.Id = altChunkId
mainDocPart.Document.Body.InsertAfter(altChunk, mainDocPart.Document.Body.Elements(Of Paragraph)().Last())
mainDocPart.Document.Save()
End Using
It seems you cannot import an AltChunk into a document opened from a memory stream; it only works when you open the physical file for writing. Can anyone shed some light on this matter?

I know this is an old post, but I have the same issue.
Using an AltChunk works when the document is opened from a file, but not when it is opened from a MemoryStream.
It would be great if anyone knows anything about this. This is how I initialize the WordprocessingDocument:
var byteArrayWithFileFrom360 = ProcessFileHandler.GetFileContent(204735);
var wordDocMemoryStream = new MemoryStream();
wordDocMemoryStream.Write(byteArrayWithFileFrom360, 0, byteArrayWithFileFrom360.Length);
var myDoc = WordprocessingDocument.Open(wordDocMemoryStream, true);
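
One thing the memory-stream snippets in this thread have in common is that the WordprocessingDocument is never disposed before the stream is used. As far as I understand the SDK, the package is only fully written back to the stream when the document is closed or disposed, so bytes read earlier can look corrupt to Word. The sketch below is a minimal illustration under that assumption, not a confirmed diagnosis. The class and method names (AltChunkInMemory, AppendHtml) are hypothetical, and placing the AltChunk before the final section properties is my own choice.

```csharp
using System.IO;
using System.Linq;
using System.Text;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class AltChunkInMemory
{
    // Hypothetical helper: returns a copy of the .docx bytes with an HTML AltChunk added.
    public static byte[] AppendHtml(byte[] docxBytes, string html, string altChunkId)
    {
        // Use an expandable stream; new MemoryStream(byte[]) has a fixed capacity
        // and cannot grow when parts are added.
        using (var stream = new MemoryStream())
        {
            stream.Write(docxBytes, 0, docxBytes.Length);

            using (WordprocessingDocument doc = WordprocessingDocument.Open(stream, true))
            {
                MainDocumentPart mainPart = doc.MainDocumentPart;
                AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(
                    AlternativeFormatImportPartType.Xhtml, altChunkId);

                using (var htmlStream = new MemoryStream(Encoding.UTF8.GetBytes(html)))
                {
                    chunk.FeedData(htmlStream);
                }

                var altChunk = new AltChunk { Id = altChunkId };
                Body body = mainPart.Document.Body;

                // Assumption: keep the section properties (if any) as the last body child.
                SectionProperties sectPr = body.Elements<SectionProperties>().LastOrDefault();
                if (sectPr != null)
                {
                    body.InsertBefore(altChunk, sectPr);
                }
                else
                {
                    body.Append(altChunk);
                }

                mainPart.Document.Save();
            } // Disposing the document flushes the package to the stream.

            // Read the bytes only after the document has been disposed.
            return stream.ToArray();
        }
    }
}
```

A caller would pass the original file bytes, for example AltChunkInMemory.AppendHtml(bytes, "<html><body><b>Hello</b></body></html>", "html234"), and stream the returned array to the client.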

Related

Merge 2 Word Document using c#

I am trying to merge 2 word documents in binary format, however, it only gets the first document. The resulting document should have all elements from each source document included with formatting.
Here is the code so far. Using OpenXml by the way.
private static byte[] Merge(byte[] dest, byte[] src)
{
string altChunkId = "AltChunkId" + DateTime.Now.Ticks.ToString();
var memoryStreamDest = new MemoryStream();
memoryStreamDest.Write(dest, 0, dest.Length);
memoryStreamDest.Seek(0, SeekOrigin.Begin);
using (WordprocessingDocument doc = WordprocessingDocument.Open(memoryStreamDest, true))
{
MainDocumentPart mainPart = doc.MainDocumentPart;
Paragraph para = new Paragraph(new Run((new Break() { Type = BreakValues.Page })));
mainPart.Document.Body.InsertAfter(para, mainPart.Document.Body.LastChild);
//Insert the source file into the target file using AltChunk
AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(
AlternativeFormatImportPartType.WordprocessingML, altChunkId);
using (MemoryStream mem = new MemoryStream())
{
mem.Write(src, 0, (int)src.Length);
mem.Seek(0, SeekOrigin.Begin);
chunk.FeedData(mem);
}
AltChunk altChunk = new AltChunk();
altChunk.Id = altChunkId;
mainPart.Document.Body.InsertAfter(altChunk, mainPart.Document.Body.Descendants<Paragraph>().Last());
mainPart.Document.Save();
return memoryStreamDest.ToArray();
}
}
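
A likely reason only the first document shows up is that memoryStreamDest.ToArray() runs inside the using block, before the WordprocessingDocument has been disposed and the package written back to the stream. The variant below is a sketch under that assumption: it keeps the question's logic, moves the return after the inner using block, and anchors the AltChunk to the page-break paragraph. The wrapping class name DocxMerger is hypothetical.

```csharp
using System;
using System.IO;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class DocxMerger
{
    public static byte[] Merge(byte[] dest, byte[] src)
    {
        string altChunkId = "AltChunkId" + DateTime.Now.Ticks;

        using (var destStream = new MemoryStream())
        {
            destStream.Write(dest, 0, dest.Length);

            using (WordprocessingDocument doc = WordprocessingDocument.Open(destStream, true))
            {
                MainDocumentPart mainPart = doc.MainDocumentPart;
                Body body = mainPart.Document.Body;

                // Page break between the two documents, inserted as in the question.
                var para = new Paragraph(new Run(new Break { Type = BreakValues.Page }));
                body.InsertAfter(para, body.LastChild);

                // Store the source document as an alternative format import part.
                AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(
                    AlternativeFormatImportPartType.WordprocessingML, altChunkId);
                using (var srcStream = new MemoryStream(src))
                {
                    chunk.FeedData(srcStream);
                }

                // Reference the imported part right after the page break.
                body.InsertAfter(new AltChunk { Id = altChunkId }, para);
                mainPart.Document.Save();
            } // The package is written back to destStream here.

            // Only now does the stream contain the complete merged package.
            return destStream.ToArray();
        }
    }
}
```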

How do I zip a Word document stored in a MemoryStream using ICSharpCode.SharpZipLib.Zip?

I am trying to zip a Word document that is stored in a MemoryStream like this:
MemoryStream outputMemStream = new MemoryStream();
ZipOutputStream zipStream = new ZipOutputStream(outputMemStream);
zipStream.SetLevel(3); //0-9, 9 being the highest level of compression
ZipEntry newEntry = new ZipEntry("my_document.docx");
newEntry.DateTime = DateTime.Now;
zipStream.PutNextEntry(newEntry);
//docMS is a MemoryStream that contains a Word document
StreamUtils.Copy(docMS, zipStream, new byte[4096]);
docMS.Close();
zipStream.CloseEntry();
zipStream.IsStreamOwner = false; // False stops the Close also Closing the underlying stream.
zipStream.Close(); // Must finish the ZipOutputStream before using outputMemStream.
outputMemStream.Position = 0;
context.Response.AddHeader("content-disposition", String.Format("attachment;filename={0}", "my_document.zip"));
context.Response.ContentType = "application/octet-stream";
zipStream.WriteTo(context.Response.OutputStream);
context.Response.End();
When the zip file is downloaded and unzipped, 'my_document.docx' opens up as a blank document. If I modify the code above so that the 'my_document.docx' file is downloaded directly (not zipped) via the MemoryStream 'docMS', then the document opens up fine. I cannot figure out why.
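
Two things in the code above look suspicious to me, though I have not confirmed either against the asker's setup: docMS may be positioned at its end when it is copied (so nothing is read), and WriteTo is a MemoryStream method, so the response would normally be written from outputMemStream rather than zipStream. The sketch below shows the zipping step with the position reset. The class and method names (DocxZipper, ZipDocument) are hypothetical.

```csharp
using System;
using System.IO;
using ICSharpCode.SharpZipLib.Core;
using ICSharpCode.SharpZipLib.Zip;

public static class DocxZipper
{
    // Hypothetical helper: returns a zip archive containing the document as one entry.
    public static byte[] ZipDocument(MemoryStream docMS, string entryName)
    {
        var outputMemStream = new MemoryStream();

        using (var zipStream = new ZipOutputStream(outputMemStream))
        {
            zipStream.IsStreamOwner = false; // keep outputMemStream open after the zip stream closes
            zipStream.SetLevel(3);           // 0-9, 9 being the highest level of compression

            var newEntry = new ZipEntry(entryName) { DateTime = DateTime.Now };
            zipStream.PutNextEntry(newEntry);

            // Rewind the source; a stream that was just written to sits at its end.
            docMS.Position = 0;
            StreamUtils.Copy(docMS, zipStream, new byte[4096]);

            zipStream.CloseEntry();
        } // Disposing the zip stream finishes the archive.

        return outputMemStream.ToArray();
    }
}
```

The returned bytes can then be written to the response, for example with context.Response.OutputStream.Write(bytes, 0, bytes.Length).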

Create document with WordprocessingDocument and MemoryStream

I am using the following code to create an MS Word document from a Word template using the OpenXML WordprocessingDocument class. I am using a Stream rather than a physical location for the new document. Using OpenXML, is it possible to create a document without a physical location (only with a Stream) and save it to a location at the end?
I am not getting any error and the new document is created, but it is corrupted and cannot be opened in MS Word.
using (Stream stream1 = new FileStream("c:\\TestDoc.dotx", FileMode.Open, FileAccess.ReadWrite, FileShare.ReadWrite)) {
using (WordprocessingDocument document = WordprocessingDocument.Open(stream1, true)) {
document.ChangeDocumentType(DocumentFormat.OpenXml.WordprocessingDocumentType.Document);
MainDocumentPart mainPart = document.MainDocumentPart;
DocumentSettingsPart documentSettingPart1 = mainPart.DocumentSettingsPart;
mainPart.Document.Save();
Stream mystream = mainPart.GetStream();
FileStream fileStream = File.Create("c:\\newdoc.docx", (int)mystream.Length);
byte[] bytesInStream = new byte[mystream.Length];
mystream.Read(bytesInStream, 0, bytesInStream.Length);
fileStream.Write(bytesInStream, 0, bytesInStream.Length);
document.Close();
}
}
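
Note that mainPart.GetStream() returns the stream of the main document part only (the document XML), not the whole package, so writing those bytes to a .docx file would not produce a valid document. A minimal sketch of a stream-only approach follows, assuming it is acceptable to copy the template into a MemoryStream first. The class and method names (TemplateToDocument, CreateFromTemplate) are hypothetical.

```csharp
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;

public static class TemplateToDocument
{
    // Hypothetical helper: converts a .dotx template into a .docx without modifying the template.
    public static void CreateFromTemplate(string templatePath, string outputPath)
    {
        byte[] templateBytes = File.ReadAllBytes(templatePath);

        using (var stream = new MemoryStream())
        {
            // Work on an expandable in-memory copy of the template package.
            stream.Write(templateBytes, 0, templateBytes.Length);

            using (WordprocessingDocument document = WordprocessingDocument.Open(stream, true))
            {
                document.ChangeDocumentType(WordprocessingDocumentType.Document);
                document.MainDocumentPart.Document.Save();
            } // The whole package is written to the stream on dispose.

            // Save the complete package, not just the main part's XML.
            File.WriteAllBytes(outputPath, stream.ToArray());
        }
    }
}
```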

Merge multiple word documents into one using OpenXML and XElement

As the title states, I am trying to merge multiple Word (.docx) files into one Word document. Each of these documents is one page long. I am using some of the code from this post in this implementation. The issue I am running into is that only the first document gets written properly; every other iteration appends a new document, but its contents are the same as the first.
Here is the code I am using:
//list that holds the file paths
List<String> fileNames = new List<string>();
fileNames.Add("filePath");
fileNames.Add("filePath");
fileNames.Add("filePath");
fileNames.Add("filePath");
fileNames.Add("filePath");
//get the first document
MemoryStream mainStream = new MemoryStream();
byte[] buffer = File.ReadAllBytes(fileNames[0]);
mainStream.Write(buffer, 0, buffer.Length);
using (WordprocessingDocument mainDocument = WordprocessingDocument.Open(mainStream, true))
{
//xml for the new document
XElement newBody = XElement.Parse(mainDocument.MainDocumentPart.Document.Body.OuterXml);
//iterate through each file
for (int i = 1; i < fileNames.Count; i++)
{
//read in the document
byte[] tempBuffer = File.ReadAllBytes(fileNames[i]);
WordprocessingDocument tempDocument = WordprocessingDocument.Open(new MemoryStream(tempBuffer), true);
//new documents XML
XElement tempBody = XElement.Parse(tempDocument.MainDocumentPart.Document.Body.OuterXml);
//add the new xml
newBody.Add(tempBody);
string str = newBody.ToString();
//write to the main document and save
mainDocument.MainDocumentPart.Document.Body = new Body(newBody.ToString());
mainDocument.MainDocumentPart.Document.Save();
mainDocument.Package.Flush();
tempBuffer = null;
}
//write entire stream to new file
FileStream fileStream = new FileStream("xmltest.docx", FileMode.Create);
mainStream.WriteTo(fileStream);
//ret = mainStream.ToArray();
mainStream.Close();
mainStream.Dispose();
}
Again the problem is that each new document being created has the same content as the first document. So when I run this the output will be a document with five identical pages. I've tried switching the documents order around in the list and get the same result so it is nothing specific to one document.
Could anyone suggest what I am doing wrong here? I'm looking through it and I can't explain the behavior I am seeing. Any suggestions would be appreciated. Thanks much!
Edit: I'm thinking this may have something to do with the fact that the documents I am trying to merge have been generated with custom XML parts. I suspect the XPath expressions in the documents are somehow pointing to the same content. The thing is, I can open each of these documents and see the proper content; it's only when I merge them that I see the issue.
This solution uses DocumentFormat.OpenXml
public static void Join(params string[] filepaths)
{
//filepaths = new[] { "D:\\one.docx", "D:\\two.docx", "D:\\three.docx", "D:\\four.docx", "D:\\five.docx" };
if (filepaths != null && filepaths.Length > 1)
using (WordprocessingDocument myDoc = WordprocessingDocument.Open(filepaths[0], true))
{
MainDocumentPart mainPart = myDoc.MainDocumentPart;
for (int i = 1; i < filepaths.Length; i++)
{
string altChunkId = "AltChunkId" + i;
AlternativeFormatImportPart chunk = mainPart.AddAlternativeFormatImportPart(
AlternativeFormatImportPartType.WordprocessingML, altChunkId);
using (FileStream fileStream = File.Open(filepaths[i], FileMode.Open))
{
chunk.FeedData(fileStream);
}
DocumentFormat.OpenXml.Wordprocessing.AltChunk altChunk = new DocumentFormat.OpenXml.Wordprocessing.AltChunk();
altChunk.Id = altChunkId;
//new page, if you like it...
mainPart.Document.Body.AppendChild(new Paragraph(new Run(new Break() { Type = BreakValues.Page })));
//next document
mainPart.Document.Body.InsertAfter(altChunk, mainPart.Document.Body.Elements<Paragraph>().Last());
}
mainPart.Document.Save();
myDoc.Close();
}
}
The way you are merging may not work properly in some cases. You can try one of these approaches:
Using AltChunk, as in http://blogs.msdn.com/b/ericwhite/archive/2008/10/27/how-to-use-altchunk-for-document-assembly.aspx
Using the DocumentBuilder.BuildDocument method from http://powertools.codeplex.com/
If you still face a similar issue, you can find the data-bound controls prior to the merge and assign data to these controls from the CustomXml part. You can find this approach in the method AssignContentFromCustomXmlPartForDataboundControl of the OpenXmlHelper class. The code can be downloaded from http://worddocgenerator.codeplex.com/

DocumentFormat.OpenXml Adding an Image to a word doc

I am creating a simple word doc, using the openXml SDK.
It is working so far.
Now how can I add an image from my file system to this doc? I don't care where it is in the doc just so it is there.
Thanks!
Here is what I have so far.
string fileName = "proposal"+dealerId +Guid.NewGuid().ToString()+".doc";
string filePath = @"C:\DWSApplicationFiles\Word\" + fileName;
using (WordprocessingDocument wordDoc = WordprocessingDocument.Create(filePath, WordprocessingDocumentType.Document, true))
{
MainDocumentPart mainPart = wordDoc.AddMainDocumentPart();
mainPart.Document = new Document();
//create the body
Body body = new Body();
DocumentFormat.OpenXml.Wordprocessing.Paragraph p = new DocumentFormat.OpenXml.Wordprocessing.Paragraph();
DocumentFormat.OpenXml.Wordprocessing.Run runParagraph = new DocumentFormat.OpenXml.Wordprocessing.Run();
DocumentFormat.OpenXml.Wordprocessing.Text text_paragraph = new DocumentFormat.OpenXml.Wordprocessing.Text("This is a test");
runParagraph.Append(text_paragraph);
p.Append(runParagraph);
body.Append(p);
mainPart.Document.Append(body);
mainPart.Document.Save();
}
Here is a method that can be simpler than the one described on the MSDN page linked in the other answers. This code is in C++/CLI, but of course you can write the equivalent in C#:
WordprocessingDocument^ doc = WordprocessingDocument::Open(doc_name, true);
FileStream^ img_fs = gcnew FileStream(image_path, FileMode::Open);
ImagePart^ image_part = doc->MainDocumentPart->AddImagePart(ImagePartType::Jpeg);
image_part->FeedData(img_fs);
Run^ img_run = doc->MainDocumentPart->Document->Body->AppendChild(gcnew Paragraph())->AppendChild(gcnew Run());
Vml::ImageData^ img_data = img_run->AppendChild(gcnew Picture())->AppendChild(gcnew Vml::Shape())->AppendChild(gcnew Vml::ImageData());
img_data->RelationshipId = doc->MainDocumentPart->GetIdOfPart(image_part);
doc->Close();
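
For readers who want the C# equivalent, here is a direct translation of the C++/CLI snippet above, offered as a sketch rather than a tested recipe. The class and method names (VmlImage, AppendImage) are hypothetical, and the Vml namespace alias is my own. Like the original, it sets no size on the VML shape, so the way Word renders the picture may need adjusting.

```csharp
using System.IO;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
using Vml = DocumentFormat.OpenXml.Vml;

public static class VmlImage
{
    // Hypothetical helper: appends a JPEG to the end of the document body via a VML shape.
    public static void AppendImage(string docName, string imagePath)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(docName, true))
        {
            MainDocumentPart mainPart = doc.MainDocumentPart;

            // Store the image bytes in the package.
            ImagePart imagePart = mainPart.AddImagePart(ImagePartType.Jpeg);
            using (var imgStream = new FileStream(imagePath, FileMode.Open, FileAccess.Read))
            {
                imagePart.FeedData(imgStream);
            }

            // Build paragraph -> run -> picture -> shape -> image data, as in the C++/CLI code.
            Run run = mainPart.Document.Body
                .AppendChild(new Paragraph())
                .AppendChild(new Run());
            Vml.ImageData imageData = run
                .AppendChild(new Picture())
                .AppendChild(new Vml.Shape())
                .AppendChild(new Vml.ImageData());

            // Link the markup to the image part through its relationship id.
            imageData.RelationshipId = mainPart.GetIdOfPart(imagePart);

            mainPart.Document.Save();
        }
    }
}
```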
This code worked for me: http://msdn.microsoft.com/en-us/library/bb497430.aspx
Your code adds the image to your docx package, but in order to see it in the document you have to declare it in your document.xml, i.e. link it to your physical image. That's why you have to write the long function listed in the MSDN link.
My problem is how to add effects to pictures (editing, cropping, background removal).
If you know how to do this I'd appreciate your help :)
How to: Add an Image Part to an Office Open XML Package by Using the Open XML API
http://msdn.microsoft.com/en-us/library/bb497430(v=office.12).aspx
public static void AddImagePart(string document, string fileName)
{
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(document, true))
{
MainDocumentPart mainPart = wordDoc.MainDocumentPart;
ImagePart imagePart = mainPart.AddImagePart(ImagePartType.Jpeg);
using (FileStream stream = new FileStream(fileName, FileMode.Open))
{
imagePart.FeedData(stream);
}
}
}
