I am struggling to find a concise way to do something I'd imagine would be quite simple. I have an existing PowerPoint presentation with one slide containing one image.
I want to programmatically open it with the Open XML SDK (hosted in a .NET Core web application), add a hyperlink to the image, and save it, so that when it's reopened in PowerPoint, one can Ctrl+click on the image to visit the link.
using (var ppt = PresentationDocument.Open("powerpoint.pptx", true))
{
var image = ppt.PresentationPart.SlideParts.First().ImageParts.First();
// Code to add hyperlink to image here - a bit like:
// image.HyperLink = "http://somewebpage"
ppt.Save();
}
Thanks to the comment from @Cindy Mester, I was able to strip down the suggested migration code from the Open XML SDK Productivity Tool to this:
using (var ms = new MemoryStream())
{
// Dispose the source file handle as soon as its contents are copied.
using (var original = File.OpenRead("withoutlink.pptx"))
{
original.CopyTo(ms);
}
using (var ppt = PresentationDocument.Open(ms, true))
{
var slidePart1 = ppt.PresentationPart.SlideParts.First();
var slide1 = slidePart1.Slide;
var commonSlideData1 = slide1.GetFirstChild<CommonSlideData>();
var shapeTree1 = commonSlideData1.GetFirstChild<ShapeTree>();
var picture1 = shapeTree1.GetFirstChild<Picture>();
var nonVisualPictureProperties1 = picture1.GetFirstChild<NonVisualPictureProperties>();
var nonVisualDrawingProperties1 =
nonVisualPictureProperties1.GetFirstChild<NonVisualDrawingProperties>();
var nonVisualDrawingPropertiesExtensionList1 = nonVisualDrawingProperties1
.GetFirstChild<A.NonVisualDrawingPropertiesExtensionList>();
var relationshipId = "rId" + nonVisualPictureProperties1.Count();
var hyperlinkOnClick1 = new A.HyperlinkOnClick {Id = relationshipId};
nonVisualDrawingProperties1.InsertBefore(hyperlinkOnClick1,
nonVisualDrawingPropertiesExtensionList1);
slidePart1.AddHyperlinkRelationship(new Uri("http://www.google.com/", UriKind.Absolute), true,
relationshipId);
ppt.SaveAs("withlink.pptx");
}
}
So that I could edit the file without modifying the original, I copied it to a memory stream and opened that. In my web app I can then stream this memory stream back to the client.
TL;DR: Please either confirm that the 2nd code snippet is the accepted method for creating a CustomXmlPart or show me another, less tedious method.
I'm trying to embed some application data in some of the elements in a .pptx that I'm modifying using the Open XML SDK.
To explain briefly, I need to embed a chart code into each image that is loaded into the presentation, so that the presentation can be re-uploaded and the charts can be regenerated and replaced using the newest data.
Initially I was using Extended Attributes on the OpenXmlElement itself:
//OpenXmlDrawing = DocumentFormat.OpenXml.Drawing
// there's only one image per slide for now, so I just grab the blip which contains the image
OpenXmlDrawing.Blip blip = slidePart.Slide.Descendants<OpenXmlDrawing.Blip>().FirstOrDefault();
//then apply the attribute
blip.SetAttribute(new OpenXmlAttribute("customAttribute", null, customAttributeValue));
The issue with that being, when the .pptx is edited in PowerPoint 2013, it strips out all the Extended Attributes.
SO.
I've read in multiple places now that the solution is to use a CustomXmlPart.
So I tried to find out how to do it, and it looked like it would require a separate file for each CustomXmlPart to feed into the part, e.g.:
var customXmlPart = slidePart.AddCustomXmlPart(CustomXmlPartType.CustomXml);
using (FileStream stream = new FileStream(fileName, FileMode.Open))
{
customXmlPart.FeedData(stream);
}
That would need to be repeated with a different file for each CustomXmlPart, which means I'd likely need a template file containing a skeleton custom XML part and then have to fill in its contents dynamically for each individual slide before feeding it into the custom XML part.
It seems like a heck of a lot of work just to put in a little custom attribute. But I haven't been able to find any alternative methods.
Can anyone please either confirm that this is indeed the way I should do it, or point me in another direction? Greatly appreciated.
The answer is yes! :)
public class CustomXMLPropertyClass
{
public string PropertyName { get; set; }
public string PropertyValue { get; set; }
}
private static void AddCustomXmlPartCustomPropertyToSlidePart(string propertyName, string propertyValue, SlidePart part)
{
var customXmlPart = part.AddCustomXmlPart(CustomXmlPartType.CustomXml);
var customProperty = new CustomXMLPropertyClass{ PropertyName = propertyName, PropertyValue = propertyValue };
var serializer = new System.Xml.Serialization.XmlSerializer(customProperty.GetType());
var stream = new MemoryStream();
serializer.Serialize(stream, customProperty);
var customXml = System.Text.Encoding.UTF8.GetString(stream.ToArray());
using ( var streamWriter = new StreamWriter(customXmlPart.GetStream()))
{
streamWriter.Write(customXml);
streamWriter.Flush();
}
}
and then to get it back out:
private static string GetCustomXmlPropertyFromCustomXmlPart(CustomXmlPart customXmlPart)
{
var customXmlProperty = new CustomXMLPropertyClass();
string xml = "";
using (var stream = customXmlPart.GetStream())
{
var streamReader = new StreamReader(stream);
xml = streamReader.ReadToEnd();
}
using (TextReader reader = new StringReader(xml))
{
// typeof needs the type name, not the variable name.
var serializer = new System.Xml.Serialization.XmlSerializer(typeof(CustomXMLPropertyClass));
customXmlProperty = (CustomXMLPropertyClass)serializer.Deserialize(reader);
}
var customPropertyValue = customXmlProperty.PropertyValue;
return customPropertyValue;
}
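The serialize and deserialize steps above can be checked in isolation, without the Open XML SDK. This is only a minimal sketch: `ChartTag` and `ChartTagRoundTrip` are hypothetical names standing in for `CustomXMLPropertyClass` and the two helper methods, and a byte array stands in for the contents of the custom XML part.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in for CustomXMLPropertyClass.
public class ChartTag
{
    public string PropertyName { get; set; }
    public string PropertyValue { get; set; }
}

public static class ChartTagRoundTrip
{
    // Serializes the tag to XML bytes, as would be written into the part.
    public static byte[] Serialize(ChartTag tag)
    {
        var serializer = new XmlSerializer(typeof(ChartTag));
        using (var stream = new MemoryStream())
        {
            serializer.Serialize(stream, tag);
            return stream.ToArray();
        }
    }

    // Reads the XML bytes back into a tag, as would be read from the part.
    public static ChartTag Deserialize(byte[] xml)
    {
        var serializer = new XmlSerializer(typeof(ChartTag));
        using (var stream = new MemoryStream(xml))
        {
            return (ChartTag)serializer.Deserialize(stream);
        }
    }

    public static void Main()
    {
        var xml = Serialize(new ChartTag { PropertyName = "chartCode", PropertyValue = "abc123" });
        var tag = Deserialize(xml);
        Console.WriteLine(tag.PropertyName + "=" + tag.PropertyValue); // chartCode=abc123
    }
}
```

With a real part, the bytes could presumably be handed over with `customXmlPart.FeedData(new MemoryStream(bytes))`, the same `FeedData` call the question uses with a file stream.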
You could also try custom properties. Custom XML files are meant for complex objects, and it sounds like you only need to store simple information.
I have generated a Word file using Open XML and I need to send it as an email attachment in PDF format, but I cannot save any physical PDF or Word file to disk because my application runs in a cloud environment (CRM Online).
The only way I have found is Aspose.Words for .NET:
http://www.aspose.com/docs/display/wordsnet/How+to++Convert+a+Document+to+a+Byte+Array But it is too expensive.
Then I found another approach: convert the Word document to HTML, then convert the HTML to PDF. But my Word document contains a picture, and I cannot get that to work.
The most accurate conversion from DOCX to PDF is going to be through Word. Your best option for that is setting up a server with OWAS (Office Web Apps Server) and doing your conversion through that.
You'll need to set up a WOPI endpoint on your application server and call:
/wv/WordViewer/request.pdf?WOPISrc={WopiUrl}&type=downloadpdf
OR
/wv/WordViewer/request.pdf?WOPISrc={WopiUrl}&type=printpdf
Alternatively you could try and do it using OneDrive and Word Online, but you'll need to work out the parameters Word Online uses as well as whether that's permitted within the Ts & Cs.
You can try Gnostice XtremeDocumentStudio .NET.
Converting From DOCX To PDF Using XtremeDocumentStudio .NET
http://www.gnostice.com/goto.asp?id=24900&t=convert_docx_to_pdf_using_xdoc.net
In the published article, conversion has been demonstrated to save to a physical file. You can use documentConverter.ConvertToStream method to convert a document to a Stream as shown below in the code snippet.
DocumentConverter documentConverter = new DocumentConverter();
// input can be a FilePath, Stream, list of FilePaths or list of Streams
Object input = "InputDocument.docx";
string outputFileFormat = "pdf";
ConversionMode conversionMode = ConversionMode.ConvertToSeperateFiles;
List<Stream> outputStreams = documentConverter.ConvertToStream(input, outputFileFormat, conversionMode);
Disclaimer: I work for Gnostice.
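If you want to convert a byte array, you can use Metamorphosis:
// p is assumed to be a SautinSoft.PdfMetamorphosis instance created earlier (not shown in the original snippet).
string docxPath = @"example.docx";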
string pdfPath = Path.ChangeExtension(docxPath, ".pdf");
byte[] docx = File.ReadAllBytes(docxPath);
// Convert DOCX to PDF in memory
byte[] pdf = p.DocxToPdfConvertByte(docx);
if (pdf != null)
{
// Save the PDF document to a file for a viewing purpose.
File.WriteAllBytes(pdfPath, pdf);
System.Diagnostics.Process.Start(pdfPath);
}
else
{
System.Console.WriteLine("Conversion failed!");
Console.ReadLine();
}
I have recently used SautinSoft's 'Document .Net' library to convert DOCX to PDF in my application (React frontend, .NET Core microservices backend). It takes only 15 seconds to generate a 23-page PDF. Those 15 seconds include getting the data from the database, merging the data with the DOCX template, and converting the result to PDF. The code has been deployed to an Azure Linux box and works fine.
https://sautinsoft.com/products/document/
Sample code
public string GeneratePDF(PDFDocumentModel document)
{
byte[] output = null;
using (var outputStream = new MemoryStream())
{
// Create single pdf.
DocumentCore singlePDF = new DocumentCore();
var documentCores = new List<DocumentCore>();
foreach (var section in document.Sections)
{
documentCores.Add(GenerateDocument(section));
}
foreach (var dc in documentCores)
{
// Create import session.
ImportSession session = new ImportSession(dc, singlePDF, StyleImportingMode.KeepSourceFormatting);
// Loop through all sections in the source document.
foreach (Section sourceSection in dc.Sections)
{
// Because we are copying a section from one document to another,
// it is required to import the Section into the destination document.
// This adjusts any document-specific references to styles, bookmarks, etc.
// Importing a element creates a copy of the original element, but the copy
// is ready to be inserted into the destination document.
Section importedSection = singlePDF.Import<Section>(sourceSection, true, session);
// First section start from new page.
if (dc.Sections.IndexOf(sourceSection) == 0)
importedSection.PageSetup.SectionStart = SectionStart.NewPage;
// Now the new section can be appended to the destination document.
singlePDF.Sections.Add(importedSection);
//Paging
HeaderFooter footer = new HeaderFooter(singlePDF, HeaderFooterType.FooterDefault);
// Create a new paragraph to insert a page numbering.
// So that, our page numbering looks as: Page N of M.
Paragraph par = new Paragraph(singlePDF);
par.ParagraphFormat.Alignment = HorizontalAlignment.Center;
CharacterFormat cf = new CharacterFormat() { FontName = "Consolas", Size = 11.0 };
par.Content.Start.Insert("Page ", cf.Clone());
// Page numbering is a Field.
Field fPage = new Field(singlePDF, FieldType.Page);
fPage.CharacterFormat = cf.Clone();
par.Content.End.Insert(fPage.Content);
par.Content.End.Insert(" of ", cf.Clone());
Field fPages = new Field(singlePDF, FieldType.NumPages);
fPages.CharacterFormat = cf.Clone();
par.Content.End.Insert(fPages.Content);
footer.Blocks.Add(par);
importedSection.HeadersFooters.Add(footer);
}
}
var pdfOptions = new PdfSaveOptions();
pdfOptions.Compression = false;
pdfOptions.EmbedAllFonts = false;
pdfOptions.EmbeddedImagesFormat = PdfSaveOptions.EmbImagesFormat.Png;
pdfOptions.EmbeddedJpegQuality = 100;
//dont allow editing after population, also ensures content can be printed.
pdfOptions.PreserveFormFields = false;
pdfOptions.PreserveContentControls = false;
if (!string.IsNullOrEmpty(document.PdfProperties.Title))
{
singlePDF.Document.Properties.BuiltIn[BuiltInDocumentProperty.Title] = document.PdfProperties.Title;
}
if (!string.IsNullOrEmpty(document.PdfProperties.Author))
{
singlePDF.Document.Properties.BuiltIn[BuiltInDocumentProperty.Author] = document.PdfProperties.Author;
}
if (!string.IsNullOrEmpty(document.PdfProperties.Subject))
{
singlePDF.Document.Properties.BuiltIn[BuiltInDocumentProperty.Subject] = document.PdfProperties.Subject;
}
singlePDF.Save(outputStream, pdfOptions);
output = outputStream.ToArray();
}
return Convert.ToBase64String(output);
}
I am porting an existing app from Java to C#. The original app used the iText library to fill PDF form templates and save them as new PDFs. My C# code (example) is below:
string templateFilename = @"C:\Templates\test.pdf";
string outputFilename = @"C:\Output\demo.pdf";
using (var existingFileStream = new FileStream(templateFilename, FileMode.Open))
{
using (var newFileStream = new FileStream(outputFilename, FileMode.Create))
{
var pdfReader = new PdfReader(existingFileStream);
var stamper = new PdfStamper(pdfReader, newFileStream);
var form = stamper.AcroFields;
var fieldKeys = form.Fields.Keys;
foreach (string fieldKey in fieldKeys)
{
form.SetField(fieldKey, "REPLACED!");
}
stamper.FormFlattening = true;
stamper.Close();
pdfReader.Close();
}
}
Everything works well only if I omit the
stamper.FormFlattening = true;
line, but then the forms are visible as...forms.
When I add this line, any values set on the form fields are lost, resulting in a blank form. I would really appreciate any advice.
Most likely you can resolve this when using iTextSharp 5.4.4 (or later) by forcing iTextSharp to generate appearances for the form fields. In your example code:
var form = stamper.AcroFields;
form.GenerateAppearances = true;
Resolved the issue by using a previous version of iTextSharp (5.4.3). Not sure what the cause is though...
I found a working solution for this for any of the newer iTextSharp versions.
The way we did it was:
1- Create a copy of the PDF template.
2- Populate the copy with data.
3- Set FormFlattening = true and call SetFullCompression.
4- Combine some of the PDFs into a new document.
5- Move the new combined document and then remove the temp file.
Done this way, we hit the issue with the removed input; if we skipped the form flattening, it looked OK.
However, when we moved the "FormFlattening = true" out of step 3 and applied it as a separate step after the moving etc. was complete, it worked perfectly.
Hope I explained it somewhat OK :)
In your PDF file, change the field's property to "Visible"; the default value is "Visible but not printable".
I want to create an Excel document based on a template using Open XML Format SDK 2.0.
I have followed this tutorial Creating Documents by Using the Open XML Format SDK 2.0 CT.
My problem is that the rows and cells I put into the document don't get saved. When I open the document it looks just like the template.
No exceptions are thrown when I run my code. I figure I have to force the changes to be saved in the document, but I can't figure out how.
Here's some of my code:
public static void GenerateExcelReportToDisk()
{
var factory = new Factory();
var generated = "result.xlsx";
var newFile = Util.GetReportTargetPath() + generated;
var templateFile = Util.GetReportTemplatePath() + @"template.xlsx";
File.Copy(templateFile, newFile, true);
using (var myWorkbook = SpreadsheetDocument.Open(newFile, true))
{
var workbookPart = myWorkbook.WorkbookPart;
var worksheetPart = workbookPart.WorksheetParts.First();
var sheetData = worksheetPart.Worksheet.GetFirstChild<SheetData>();
//Get data
var data = factory.GetAllFixtures().Take(20);
int rowIndex = 3;
foreach (var fixture in data)
{
var pcRate = fixture.PCRate;
var account = fixture.Charter != null ? fixture.Charter.Shortname : null;
var region = fixture.Region != null ? fixture.Region.GroupName : null;
//CreateContentRow is exactly like the tutorial linked above.
var row = CreateContentRow(rowIndex, region, pcRate, account);
rowIndex++;
sheetData.AppendChild(row);
}
//Tried to add myWorkbook.WorkbookPart.Workbook.Save(); here, but it doesn't do anything
myWorkbook.Close();
}
}
Well, I managed to figure this out by myself after a short while.
Posting the answer here in case it will help someone (including myself):
In the line above myWorkbook.Close(); add worksheetPart.Worksheet.Save();
As simple as that...
I am trying to manipulate the XML of a Word 2007 document in C#. I have managed to find and manipulate the node that I want but now I can't seem to figure out how to save it back. Here is what I am trying:
// Open the document from memoryStream
Package pkgFile = Package.Open(memoryStream, FileMode.Open, FileAccess.ReadWrite);
PackageRelationshipCollection pkgrcOfficeDocument = pkgFile.GetRelationshipsByType(strRelRoot);
foreach (PackageRelationship pkgr in pkgrcOfficeDocument)
{
if (pkgr.SourceUri.OriginalString == "/")
{
Uri uriData = new Uri("/word/document.xml", UriKind.Relative);
PackagePart pkgprtData = pkgFile.GetPart(uriData);
XmlDocument doc = new XmlDocument();
doc.Load(pkgprtData.GetStream());
NameTable nt = new NameTable();
XmlNamespaceManager nsManager = new XmlNamespaceManager(nt);
nsManager.AddNamespace("w", nsUri);
XmlNodeList nodes = doc.SelectNodes("//w:body/w:p/w:r/w:t", nsManager);
foreach (XmlNode node in nodes)
{
if (node.InnerText == "{{TextToChange}}")
{
node.InnerText = "success";
}
}
if (pkgFile.PartExists(uriData))
{
// Delete the existing "/word/document.xml" part so it can be recreated
pkgFile.DeletePart(uriData);
}
PackagePart newPkgprtData = pkgFile.CreatePart(uriData, "application/xml");
StreamWriter partWrtr = new StreamWriter(newPkgprtData.GetStream(FileMode.Create, FileAccess.Write));
doc.Save(partWrtr);
partWrtr.Close();
}
}
pkgFile.Close();
I get the error 'Memory stream is not expandable'. Any ideas?
I would recommend that you use Open XML SDK instead of hacking the format by yourself.
Using OpenXML SDK 2.0, I do this:
public void SearchAndReplace(Dictionary<string, string> tokens)
{
using (WordprocessingDocument doc = WordprocessingDocument.Open(_filename, true))
ProcessDocument(doc, tokens);
}
private string GetPartAsString(OpenXmlPart part)
{
string text = String.Empty;
using (StreamReader sr = new StreamReader(part.GetStream()))
{
text = sr.ReadToEnd();
}
return text;
}
private void SavePart(OpenXmlPart part, string text)
{
using (StreamWriter sw = new StreamWriter(part.GetStream(FileMode.Create)))
{
sw.Write(text);
}
}
private void ProcessDocument(WordprocessingDocument doc, Dictionary<string, string> tokenDict)
{
ProcessPart(doc.MainDocumentPart, tokenDict);
foreach (var part in doc.MainDocumentPart.HeaderParts)
{
ProcessPart(part, tokenDict);
}
foreach (var part in doc.MainDocumentPart.FooterParts)
{
ProcessPart(part, tokenDict);
}
}
private void ProcessPart(OpenXmlPart part, Dictionary<string, string> tokenDict)
{
string docText = GetPartAsString(part);
foreach (var keyval in tokenDict)
{
Regex expr = new Regex(_starttag + keyval.Key + _endtag);
docText = expr.Replace(docText, keyval.Value);
}
SavePart(part, docText);
}
From this you could write a GetPartAsXmlDocument, do what you want with it, and then stream it back with SavePart(part, xmlString).
Hope this helps!
You should use the OpenXML SDK to work on docx files and not write your own wrapper.
Getting Started with the Open XML SDK 2.0 for Microsoft Office
Introducing the Office (2007) Open XML File Formats
How to: Manipulate Office Open XML Formats Documents
Manipulate Docx with C# without Microsoft Word installed with OpenXML SDK
The problem appears to be doc.Save(partWrtr), which is built using newPkgprtData, which is built using pkgFile, which loads from a memory stream... Because you loaded from a memory stream it's trying to save the document back to that same memory stream. This leads to the error you are seeing.
Instead of saving it to the memory stream try saving it to a new file or to a new memory stream.
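One way to get a "new memory stream" that can grow, assuming the document bytes are already in memory: a `MemoryStream` constructed directly over a `byte[]` has a fixed capacity, while one created with the parameterless constructor can expand. The sketch below uses only the base class library; `ExpandableStreamDemo` and `ToExpandableStream` are hypothetical names, and the three-byte array merely stands in for real document bytes.

```csharp
using System;
using System.IO;

public static class ExpandableStreamDemo
{
    // Copies the bytes into a stream created with the parameterless
    // constructor, which is allowed to grow beyond its initial size.
    public static MemoryStream ToExpandableStream(byte[] bytes)
    {
        var ms = new MemoryStream();
        ms.Write(bytes, 0, bytes.Length);
        ms.Position = 0;
        return ms;
    }

    public static void Main()
    {
        byte[] bytes = { 1, 2, 3 };

        // A stream wrapped around an existing array cannot grow.
        var fixedStream = new MemoryStream(bytes);
        fixedStream.Position = fixedStream.Length;
        try
        {
            fixedStream.WriteByte(4);
        }
        catch (NotSupportedException ex)
        {
            Console.WriteLine(ex.Message);
        }

        // The copied stream accepts the extra byte.
        using (var ms = ToExpandableStream(bytes))
        {
            ms.Position = ms.Length;
            ms.WriteByte(4);
            Console.WriteLine(ms.Length); // 4
        }
    }
}
```

The stream returned by `ToExpandableStream` is the kind that could then be passed to `Package.Open` or to an Open XML SDK `Open` overload that takes a stream.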
The short and simple answer to the 'Memory stream is not expandable' issue is:
Do not open the document from a MemoryStream.
So in that respect the earlier answer is correct: simply open a file instead.
In my experience, opening a document from a MemoryStream and editing it easily leads to 'Memory stream is not expandable'.
I suppose the message appears when an edit requires the memory stream to expand.
I have found that I can make some edits, but nothing that adds to the size.
So, for example, deleting a custom XML part is OK, but adding one with some data is not.
So if you actually need to open a memory stream, you must figure out how to open an expandable MemoryStream if you want to add to it.
I have a need for this and hope to find a solution.
Stein-Tore Erdal
PS: just noticed the answer from "Jan 26 '11 at 15:18".
Don't think that is the answer in all situations.
I get the error when trying this:
var ms = new MemoryStream(bytes);
using (WordprocessingDocument wd = WordprocessingDocument.Open(ms, true))
{
...
using (MemoryStream msData = new MemoryStream())
{
xdoc.Save(msData);
msData.Position = 0;
ourCxp.FeedData(msData); // Memory stream is not expandable.