copying openXML image from one document to another - c#

We have conditional Footers that INCLUDETEXT based on the client:
IF $CLIENT = "CLIENT1" "{INCLUDETEXT "CLIENT1HEADER.DOCX"}" ""
Depending on our document, there could be a varying amount of IF/ELSE, and these all work correctly for merging the correct files in the correct place.
However, some of these documents may have client specific images/branding, which also need to be copied across from the INCLUDETEXT file.
Below is the method that is called to replace any Picture elements that exist in the IEnumerable<Run> that is copied from the Source document to the Target document.
The image is copied fine; however, it doesn't appear to update the relationship ID (rId) in my Picture or add a record to the .xml.rels files. (I even tried adding a ForEach to add the image to all the headers and footers, to see if this made any difference.)
private void InsertImagesFromOldDocToNewDoc(WordprocessingDocument source, WordprocessingDocument target, IEnumerable<Picture> pics)
{
IEnumerable<Picture> imageElements = source.MainDocumentPart.Document.Descendants<Run>().Where(x => x.Descendants<Picture>().FirstOrDefault() != null).Select(x => x.Descendants<Picture>().FirstOrDefault());
foreach (Picture pic in pics) //the new pics
{
Picture oldPic = imageElements.Where(x => x.Equals(pic)).FirstOrDefault();
if (oldPic != null)
{
string imageId = "";
ImageData shape = oldPic.Descendants<ImageData>().FirstOrDefault();
ImagePart p = source.MainDocumentPart.GetPartById(shape.RelationshipId) as ImagePart;
ImagePart newPart = target.MainDocumentPart.AddPart<ImagePart>(p);
newPart.FeedData(p.GetStream());
shape.RelId = target.MainDocumentPart.GetIdOfPart(newPart);
string relPart = target.MainDocumentPart.CreateRelationshipToPart(newPart);
}
}
}
Has anyone come across this issue before?
It appears the OpenXML SDK documentation is a 'little' sparse...

A late reaction, but this thread helped me a lot to get it working. Here is my solution for copying a document with images:
private static void CopyDocumentWithImages(string path)
{
if (!Path.GetFileName(path).StartsWith("~$"))
{
using (var source = WordprocessingDocument.Open(path, false))
{
using (var newDoc = source.CreateNew(path.Replace(".docx", "-images.docx")))
{
foreach (var e in source.MainDocumentPart.Document.Body.Elements())
{
var clonedElement = e.CloneNode(true);
clonedElement.Descendants<DocumentFormat.OpenXml.Drawing.Blip>()
.ToList().ForEach(blip =>
{
var newRelation = newDoc.CopyImage(blip.Embed, source);
blip.Embed = newRelation;
});
clonedElement.Descendants<DocumentFormat.OpenXml.Vml.ImageData>().ToList().ForEach(imageData =>
{
var newRelation = newDoc.CopyImage(imageData.RelationshipId, source);
imageData.RelationshipId = newRelation;
});
newDoc.MainDocumentPart.Document.Body.AppendChild(clonedElement);
}
newDoc.Save();
}
}
}
}
CopyImage:
public static string CopyImage(this WordprocessingDocument newDoc, string relId, WordprocessingDocument org)
{
var p = org.MainDocumentPart.GetPartById(relId) as ImagePart;
var newPart = newDoc.MainDocumentPart.AddPart(p);
newPart.FeedData(p.GetStream());
return newDoc.MainDocumentPart.GetIdOfPart(newPart);
}
CreateNew:
public static WordprocessingDocument CreateNew(this WordprocessingDocument org, string name)
{
var doc = WordprocessingDocument.Create(name, WordprocessingDocumentType.Document);
doc.AddMainDocumentPart();
doc.MainDocumentPart.Document = new Document(new Body());
using (var streamReader = new StreamReader(org.MainDocumentPart.ThemePart.GetStream()))
using (var streamWriter = new StreamWriter(doc.MainDocumentPart.AddNewPart<ThemePart>().GetStream(FileMode.Create)))
{
streamWriter.Write(streamReader.ReadToEnd());
}
using (var streamReader = new StreamReader(org.MainDocumentPart.StyleDefinitionsPart.GetStream()))
using (var streamWriter = new StreamWriter(doc.MainDocumentPart.AddNewPart<StyleDefinitionsPart>().GetStream(FileMode.Create)))
{
streamWriter.Write(streamReader.ReadToEnd());
}
return doc;
}

Stuart,
I faced the same problem when I was trying to copy numbering styles from one document to another.
I think what Word does internally is this: whenever an object is copied from one document to another, its ID is not carried over; instead, a new ID is assigned in the target document.
You'll have to get the new ID after the image has been copied and then replace the old ID everywhere your image is used.
I hope this helps; this is the approach I use to copy numbering styles.
Cheers
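The advice above (read the new ID after copying, then write it back wherever the image is referenced) could be sketched roughly as follows for VML pictures that end up in a footer. This is a minimal sketch, not the poster's code: the class and method names are hypothetical, it assumes the copied runs already live in the target footer, and it assumes the images are referenced through `ImageData.RelationshipId` (the `r:id` attribute) rather than `RelId`.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Vml;

// Hypothetical helper, for illustration only.
public static class FooterImageCopier
{
    // sourcePart: the part of the source document that owns the image relationships.
    // targetPart: the footer that now contains the copied elements.
    // copiedElements: the cloned elements that were inserted into the target footer.
    public static void RelinkVmlImages(
        OpenXmlPart sourcePart,
        FooterPart targetPart,
        IEnumerable<OpenXmlElement> copiedElements)
    {
        var imageDataElements = copiedElements
            .SelectMany(e => e.Descendants<ImageData>())
            .ToList();

        foreach (ImageData imageData in imageDataElements)
        {
            string oldId = imageData.RelationshipId?.Value;
            if (string.IsNullOrEmpty(oldId))
            {
                continue; // nothing to relink
            }

            // Note: GetPartById throws if the ID is unknown in the source part.
            var sourceImage = sourcePart.GetPartById(oldId) as ImagePart;
            if (sourceImage == null)
            {
                continue; // the relationship does not point at an image
            }

            // Create a new image part owned by the footer and copy the bytes into it.
            ImagePart targetImage = targetPart.AddImagePart(sourceImage.ContentType);
            using (Stream data = sourceImage.GetStream())
            {
                targetImage.FeedData(data);
            }

            // Point the copied element at the relationship ID issued by the footer.
            imageData.RelationshipId = targetPart.GetIdOfPart(targetImage);
        }
    }
}
```

Adding the image part to the part that actually contains the picture (here the footer, not the main document part) is what should make the relationship show up in that part's .rels file.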


PdfLayer.GetTitle() always returning null

C# itext 7.1.4 (NuGet release) doesn't seem to parse OCG/layer titles correctly.
The C# code below should read a pdf, print all layer titles, turn off the layer visibility and save it to the dest file.
Example pdf file: https://docdro.id/qI479di
using iText.Kernel.Pdf;
using System;
namespace PDFSetOCGVisibility
{
class Program
{
static void Main(string[] args)
{
var src = @"layer-example.pdf";
var dest = @"layer-example-out.pdf";
PdfDocument pdf = new PdfDocument(new PdfReader(src), new PdfWriter(dest));
var Catalog = pdf.GetCatalog();
var ocProps = Catalog.GetOCProperties(false);
var layers = ocProps.GetLayers();
foreach(var layer in layers)
{
var title = layer.GetTitle();
Console.WriteLine($"title: {title ?? "null"}");
layer.SetOn(false);
}
pdf.Close();
}
}
}
Expected output is:
title: Layer 1
title: Layer 2
Actual output is:
title: null
title: null
Writing the file with disabled layers works fine but the layer titles are always null.
Just tested the itext5 version:
using iTextSharp.text.pdf;
using System;
using System.IO;
namespace PDFSetOCGVisibility5
{
class Program
{
static void Main(string[] args)
{
var src = @"layer-example.pdf";
var dest = @"layer-example-out.pdf";
var reader = new PdfReader(src);
PdfStamper pdf = new PdfStamper(reader, new FileStream(dest, FileMode.Create));
var layers = pdf.GetPdfLayers();
foreach (var layer in layers)
{
var title = layer.Key;
Console.WriteLine($"title: {title ?? "null"}");
layer.Value.On = false;
}
pdf.Close();
reader.Close();
}
}
}
It's working as expected, so this seems to be a regression in iText 7.
I don't know what the purpose of title/GetTitle() is, but to get the Name (as displayed in the layers panel) the following code works:
var title = layer.GetPdfObject().GetAsString(PdfName.Name).ToUnicodeString();

Set programmatically created ReportBook as HTML5 ReportSource

A user can select multiple orders, and download all the reports as one PDF.
We used PdfSmartCopy to merge the reports:
protected void Print(int[] order_ids)
{
byte[] merged_reports;
using (MemoryStream ms = new MemoryStream())
using (Document doc = new Document())
using (PdfSmartCopy copy = new PdfSmartCopy(doc, ms))
{
doc.Open();
foreach (int order_id in order_ids)
{
Telerik.Reporting.InstanceReportSource reportSource = new Telerik.Reporting.InstanceReportSource();
reportSource.ReportDocument = new OrderReport();
reportSource.Parameters.Add(new Telerik.Reporting.Parameter("order_id", order_id));
RenderingResult result = new ReportProcessor().RenderReport("PDF", reportSource, new Hashtable());
using (PdfReader reader = new PdfReader(result.DocumentBytes))
{
copy.AddDocument(reader);
}
}
doc.Close();
merged_reports = ms.ToArray();
}
Response.Clear();
Response.Cache.SetCacheability(HttpCacheability.NoCache);
Response.Expires = -1;
Response.Buffer = false;
Response.ContentType = "application/pdf";
Response.OutputStream.Write(merged_reports, 0, merged_reports.Length);
}
But we started using the HTML5 ReportViewer elsewhere and we want to use it there as well to be consistent. I thought of creating a ReportBook programmatically and set it as the ReportSource of the ReportViewer, but the only thing I can set is a string. We have already used ReportBook before, but this was an actual SomeReportBook.cs that we could set through new SomeReportBook().GetType().AssemblyQualifiedName;.
Any clue? Here is what I have at the moment:
protected void Print(int[] order_ids)
{
Telerik.Reporting.ReportBook reportBook = new Telerik.Reporting.ReportBook();
foreach (int order_id in order_ids)
{
Telerik.Reporting.InstanceReportSource reportSource = new Telerik.Reporting.InstanceReportSource();
reportSource.ReportDocument = new OrderReport();
reportSource.Parameters.Add(new Telerik.Reporting.Parameter("order_id", order_id));
reportBook.ReportSources.Add(reportSource);
}
this.ReportViewer.ReportSource = new Telerik.ReportViewer.Html5.WebForms.ReportSource()
{
Identifier = // Can't use reportBook.GetType().AssemblyQualifiedName
};
}
I also struggled with this challenge for quite some time; I would like to share my solution in case someone else faces it. Do the following:
1. Create a class that inherits from Telerik.Reporting.ReportBook.
2. Create a method in your ReportBook class that loads all your reports, i.e.
this.ReportSources.Add(new TypeReportSource
{
TypeName = typeof(Report1).AssemblyQualifiedName
});
3. Call your method in your class constructor.
4. Use the following code to set the report viewer source:
var reportSource = new Telerik.ReportViewer.Html5.WebForms.ReportSource();
reportSource.IdentifierType = IdentifierType.TypeReportSource;
reportSource.Identifier = typeof(ReportCatalog).AssemblyQualifiedName; // or "namespace.class, assembly", e.g. "MyReports.Report1, MyReportsLibrary"
reportSource.Parameters.Add("Parameter1", "Parameter1");
reportSource.Parameters.Add("Parameter2", "Parameter2");
ReportsViewer1.ReportSource = reportSource;
Report1 = Newly created class that inherits from Telerik.Reporting.ReportBook

Create PDF from existing pdf with azure storage

I made a bot application with the Microsoft Botbuilder. Now I want to create a pdf-file from the user input. The file should be stored in my azure storage.
I have a "pdf-template" which should be copied and modified (this file is in the azure storage already). It has some textboxes which should be filled with the user input. I already wrote the code for that with iTextSharp.
But I need a filestream for this code. Does anybody know how to get the filestream from the file in my azure storage? Or is there maybe another way to finish my task?
Edit:
Here is the code where I need the filestream
string fileNameExisting = Path.Combine(Directory.GetCurrentDirectory(), "Some.pdf");
string fileNameNew = @"Path/Some2.pdf";
var inv = new Invention
{
Inventor = new Inventor { Firstname = "TEST!", Lastname= "TEST!" },
Date = DateTime.Now,
Title = "TEST",
Slogan = "TEST!",
Description = "TEST!",
Advantages = "TEST!s",
TaskPosition = "TEST!",
TaskSolution = "TEST!"
};
using (var existingFileStream = new FileStream(fileNameExisting, FileMode.Open))
using (var newFileStream = new FileStream(fileNameNew, FileMode.Create))
{
// Open existing PDF
var pdfReader = new PdfReader(existingFileStream);
// PdfStamper, which will create
var stamper = new PdfStamper(pdfReader, newFileStream);
var form = stamper.AcroFields;
var fieldKeys = form.Fields.Keys;
foreach (string fieldKey in fieldKeys)
{
var props = fieldKey.Split('.');
string t = GetProp(props, inv);
form.SetField(fieldKey, t);
}
stamper.Close();
pdfReader.Close();
}
}
public static string GetProp(string[] classes, object oldObj)
{
var obj = oldObj.GetType().GetProperty(classes[0]).GetValue(oldObj, null);
if(classes.Length>1)
{
classes = classes.Skip(1).ToArray();
return GetProp(classes, obj);
}
Console.WriteLine(obj.ToString());
return obj.ToString();
}
The PdfReader constructor also takes a byte array. You should be able to create the object using something like:
var pdfTemplateBytes = await new WebClient().DownloadDataTaskAsync("https://myaccount.blob.core.windows.net/templates/mytemplate.pdf");
var pdfReader = new PdfReader(pdfTemplateBytes );

Inserting json documents in DocumentDB

In DocumentDB documentation examples, I find insertion of C# objects.
// Create the Andersen family document.
Family AndersenFamily = new Family
{
Id = "AndersenFamily",
LastName = "Andersen",
Parents = new Parent[] {
new Parent { FirstName = "Thomas" },
new Parent { FirstName = "Mary Kay"}
},
IsRegistered = true
};
await client.CreateDocumentAsync(documentCollection.DocumentsLink, AndersenFamily);
In my case, I'm receiving json strings from application client and would like to insert them in DocumentDB without deserializing them. Could not find any examples of doing something similar.
Any help is sincerely appreciated..
Thanks
Copied from the published .NET Sample code -
private static async Task UseStreams(string colSelfLink)
{
var dir = new DirectoryInfo(@".\Data");
var files = dir.EnumerateFiles("*.json");
foreach (var file in files)
{
using (var fileStream = new FileStream(file.FullName, FileMode.Open, FileAccess.Read))
{
Document doc = await client.CreateDocumentAsync(colSelfLink, Resource.LoadFrom<Document>(fileStream));
Console.WriteLine("Created Document: {0}", doc);
}
}
//Read one of the documents created above directly into a Json string
Document readDoc = client.CreateDocumentQuery(colSelfLink).Where(d => d.Id == "JSON1").AsEnumerable().First();
string content = JsonConvert.SerializeObject(readDoc);
//Update a document with some Json text.
//Here we're replacing a previously created document with some new text and even introducing a new Property, Status=Cancelled
using (var memoryStream = new MemoryStream(Encoding.UTF8.GetBytes("{\"id\": \"JSON1\",\"PurchaseOrderNumber\": \"PO18009186470\",\"Status\": \"Cancelled\"}")))
{
await client.ReplaceDocumentAsync(readDoc.SelfLink, Resource.LoadFrom<Document>(memoryStream));
}
}
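For a JSON string that arrives from a client (rather than a file on disk), the only extra step is turning the string into a stream before handing it to the SDK. The following is a minimal sketch of that step using only the base class library; the helper name is hypothetical, and the DocumentDB call is left as a comment because it depends on the `client` and `colSelfLink` values from the sample above.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper, for illustration only.
public static class JsonStreamSketch
{
    // Wraps a raw JSON string in a readable, UTF-8 encoded stream.
    public static MemoryStream ToUtf8Stream(string json)
    {
        return new MemoryStream(Encoding.UTF8.GetBytes(json));
    }

    // With the DocumentDB SDK referenced, the stream would then be passed on
    // in the same way as the file stream in the sample above:
    //
    //   using (var stream = JsonStreamSketch.ToUtf8Stream(json))
    //   {
    //       await client.CreateDocumentAsync(colSelfLink, Resource.LoadFrom<Document>(stream));
    //   }
}
```

This avoids deserializing the payload into a C# type; the JSON text is passed through as bytes.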

Retrieving Data From XML File

I seem to be having a problem retrieving XML values with C#, which I know is due to my very limited knowledge of C# and XML.
I was given the following XML file
<PowerBuilderRunTimes>
<PowerBuilderRunTime>
<Version>12</Version>
<Files>
<File>EasySoap110.dll</File>
<File>exPat110.dll</File>
<File>pbacc110.dll</File>
</Files>
</PowerBuilderRunTime>
</PowerBuilderRunTimes>
I need to process the XML file and make sure that each of the files listed in it exists in the folder (that's the easy part). It's the processing of the XML file that I am having a hard time with. Here is what I have done thus far:
var runtimeXml = File.ReadAllText(string.Format("{0}\\{1}", configPath, Resource.PBRuntimes));
var doc = XDocument.Parse(runtimeXml);
var topElement = doc.Element("PowerBuilderRunTimes");
var elements = topElement.Elements("PowerBuilderRunTime");
foreach (XElement section in elements)
{
//pbVersion is grabbed earlier. It is the version of PowerBuilder
if( section.Element("Version").Value.Equals(string.Format("{0}", pbVersion ) ) )
{
var files = section.Elements("Files");
var fileList = new List<string>();
foreach (XElement area in files)
{
fileList.Add(area.Element("File").Value);
}
}
}
My issue is that the String List is only ever populated with one value, "EasySoap110.dll", and everything else is ignored. Can someone please help me, as I am at a loss.
Look at this bit:
var files = section.Elements("Files");
var fileList = new List<string>();
foreach (XElement area in files)
{
fileList.Add(area.Element("File").Value);
}
You're iterating over each Files element, and then finding the first File element within it. There's only one Files element - you need to be iterating over the File elements within that.
However, there are definitely better ways of doing this. For example:
var doc = XDocument.Load(Path.Combine(configPath, Resource.PBRuntimes));
var fileList = (from runtime in doc.Root.Elements("PowerBuilderRunTime")
where (int) runtime.Element("Version") == pbVersion
from file in runtime.Element("Files").Elements("File")
select file.Value)
.ToList();
Note that if there are multiple matching PowerBuilderRunTime elements, that will create a list with all the files of all those elements. That may not be what you want. For example, you might want:
var doc = XDocument.Load(Path.Combine(configPath, Resource.PBRuntimes));
var runtime = doc.Root
.Elements("PowerBuilderRunTime")
.Where(r => (int) r.Element("Version") == pbVersion)
.Single();
var fileList = runtime.Element("Files")
.Elements("File")
.Select(x => x.Value)
.ToList();
That will validate that there's exactly one matching runtime.
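To see the difference between `Element` (first match only) and `Elements` (all matches) in isolation, here is a small self-contained sketch that works on an XML string instead of a file. The class and method names are hypothetical, and it assumes the XML is well formed with a `Files` wrapper element, as in the question.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, for illustration only.
public static class RuntimeFileLister
{
    // Returns every File value of the single runtime whose Version matches.
    // Single() throws if there are zero or several matching runtimes.
    public static List<string> FilesForVersion(string xml, int version)
    {
        XDocument doc = XDocument.Parse(xml);
        XElement runtime = doc.Root
            .Elements("PowerBuilderRunTime")
            .Single(r => (int)r.Element("Version") == version);

        return runtime.Element("Files")
                      .Elements("File")
                      .Select(f => f.Value)
                      .ToList();
    }

    // Mirrors the loop from the question: it iterates over the Files
    // elements and takes only the first File child of each one.
    public static List<string> FirstFilePerFilesElement(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        var result = new List<string>();
        foreach (XElement files in doc.Root.Descendants("Files"))
        {
            result.Add(files.Element("File").Value);
        }
        return result;
    }
}
```

With the three-file sample from the question, the first method collects all three names, while the second collects only "EasySoap110.dll".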
The problem is that there's only one Files element in your XML, with multiple children. Your foreach loop only executes once, for that single element, not for its children.
Do something like this:
var fileSet = files.Elements("File");
foreach (var file in fileSet) {
fileList.Add(file.Value);
}
which loops over all children elements.
I have always preferred using readers for reading homegrown XML config files. If you're only doing this once it's probably overkill, but readers are faster and cheaper.
public static class PowerBuilderConfigParser
{
public static IList<PowerBuilderConfig> ReadConfigFile(String path)
{
IList<PowerBuilderConfig> configs = new List<PowerBuilderConfig>();
using (FileStream stream = new FileStream(path, FileMode.Open))
{
XmlReader reader = XmlReader.Create(stream);
reader.ReadToDescendant("PowerBuilderRunTime");
do
{
PowerBuilderConfig config = new PowerBuilderConfig();
ReadVersionNumber(config, reader);
ReadFiles(config, reader);
configs.Add(config);
reader.ReadToNextSibling("PowerBuilderRunTime");
} while (reader.ReadToNextSibling("PowerBuilderRunTime"));
}
return configs;
}
private static void ReadVersionNumber(PowerBuilderConfig config, XmlReader reader)
{
reader.ReadToDescendant("Version");
string version = reader.ReadString();
Int32 versionNumber;
if (Int32.TryParse(version, out versionNumber))
{
config.Version = versionNumber;
}
}
private static void ReadFiles(PowerBuilderConfig config, XmlReader reader)
{
reader.ReadToNextSibling("Files");
reader.ReadToDescendant("File");
do
{
string file = reader.ReadString();
if (!string.IsNullOrEmpty(file))
{
config.AddConfigFile(file);
}
} while (reader.ReadToNextSibling("File"));
}
}
public class PowerBuilderConfig
{
private Int32 _version;
private readonly IList<String> _files;
public PowerBuilderConfig()
{
_files = new List<string>();
}
public Int32 Version
{
get { return _version; }
set { _version = value; }
}
public ReadOnlyCollection<String> Files
{
get { return new ReadOnlyCollection<String>(_files); }
}
public void AddConfigFile(String fileName)
{
_files.Add(fileName);
}
}
Another way is to use a XmlSerializer.
[Serializable]
[XmlRoot]
public class PowerBuilderRunTime
{
[XmlElement]
public string Version {get;set;}
[XmlArrayItem("File")]
public string[] Files {get;set;}
public static PowerBuilderRunTime[] Load(string fileName)
{
PowerBuilderRunTime[] runtimes;
using (var fs = new FileStream(fileName, FileMode.Open, FileAccess.Read))
{
var reader = new XmlTextReader(fs);
runtimes = (PowerBuilderRunTime[])new XmlSerializer(typeof(PowerBuilderRunTime[])).Deserialize(reader);
}
return runtimes;
}
}
You can get all the runtimes strongly typed, and use each PowerBuilderRunTime's Files property to loop through all the string file names.
var runtimes = PowerBuilderRunTime.Load(string.Format("{0}\\{1}", configPath, Resource.PBRuntimes));
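One caveat with the serializer approach: as far as I know, deserializing directly into an array type makes XmlSerializer expect a root element named after the array (such as ArrayOfPowerBuilderRunTime), which does not match the PowerBuilderRunTimes root in the question. A minimal sketch of one way around that is a small wrapper class whose attributes mirror the XML; the class names below are hypothetical and not part of the answer above.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical wrapper that maps to the <PowerBuilderRunTimes> root element.
[XmlRoot("PowerBuilderRunTimes")]
public class RunTimeCatalog
{
    // Each <PowerBuilderRunTime> child becomes one entry in this list.
    [XmlElement("PowerBuilderRunTime")]
    public List<RunTimeEntry> RunTimes { get; set; } = new List<RunTimeEntry>();

    // Deserializes the catalog from an XML string.
    public static RunTimeCatalog Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(RunTimeCatalog));
        using (var reader = new StringReader(xml))
        {
            return (RunTimeCatalog)serializer.Deserialize(reader);
        }
    }
}

// Hypothetical type that maps to a single <PowerBuilderRunTime> element.
public class RunTimeEntry
{
    [XmlElement("Version")]
    public int Version { get; set; }

    // <Files> is the wrapper element, <File> is each item inside it.
    [XmlArray("Files")]
    [XmlArrayItem("File")]
    public List<string> Files { get; set; } = new List<string>();
}
```

After parsing, each entry's Files list can be checked against the folder contents in an ordinary loop.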
You should try replacing this stuff with a simple XPath query.
string configPath; // assign the path to the XML file before use
System.Xml.XPath.XPathDocument xpd = new System.Xml.XPath.XPathDocument(configPath);
System.Xml.XPath.XPathNavigator xpn = xpd.CreateNavigator();
System.Xml.XPath.XPathExpression exp = xpn.Compile(@"/PowerBuilderRunTimes/PowerBuilderRunTime/Files//File");
System.Xml.XPath.XPathNodeIterator iterator = xpn.Select(exp);
while (iterator.MoveNext())
{
System.Xml.XPath.XPathNavigator nav2 = iterator.Current.Clone();
//access value with nav2.Value
}
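The same XPath idea can be tried without a file on disk. Below is a minimal sketch that loads the XML from a string and collects the File values; the class and method names are hypothetical, and the expression assumes the element names used in the question's sample.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

// Hypothetical helper, for illustration only.
public static class XPathFileLister
{
    // Returns the text of every File element under any PowerBuilderRunTime.
    public static List<string> ListFiles(string xml)
    {
        var files = new List<string>();
        using (var reader = new StringReader(xml))
        {
            var document = new XPathDocument(reader);
            XPathNavigator navigator = document.CreateNavigator();
            XPathNodeIterator iterator = navigator.Select(
                "/PowerBuilderRunTimes/PowerBuilderRunTime/Files/File");

            while (iterator.MoveNext())
            {
                // Current is positioned on a File element; Value is its text.
                files.Add(iterator.Current.Value);
            }
        }
        return files;
    }
}
```

To restrict the result to one version, a predicate such as `PowerBuilderRunTime[Version='12']` could be added to the expression.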
