Insert HTML directly in PDF using itextsharp - c#

I have a WPF application in which the user enters some text in a rich text box (RTB). I convert the RTB content to HTML, convert that HTML to an image, and then insert the image into the PDF document:
using (Stream inputPdfStream = new FileStream("sample.pdf", FileMode.Open, FileAccess.Read, FileShare.Read))
using (Stream outputPdfStream = new FileStream("result2.pdf", FileMode.Create, FileAccess.Write, FileShare.None))
{
    var reader = new PdfReader(inputPdfStream);
    var stamper = new PdfStamper(reader, outputPdfStream);
    PdfContentByte pdfContentByte = null;
    int c = reader.NumberOfPages;
    // Render the RTB content (XAML -> HTML -> image) once and reuse it for every placement
    iTextSharp.text.Image image = iTextSharp.text.Image.GetInstance(ConvertXamltohtmltoImage(xamlstring));
    foreach (var item in lst)
    {
        image.ScaleToFit(item._Size.Width, item._Size.Height);
        image.SetAbsolutePosition(item.Location.X, item.Location.Y);
        // Draw on the layer above the existing page content
        pdfContentByte = stamper.GetOverContent(item.pageNo);
        pdfContentByte.AddImage(image);
    }
    stamper.Close();
}
My question is: can I insert the HTML directly into the PDF?

You need an extra DLL to do that: http://sourceforge.net/projects/itextsharp/files/xmlworker/
See the demo: http://demo.itextsupport.com/xmlworker/
Unfortunately, the documentation hasn't been updated recently. We're working on it.
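As a rough illustration of what the extra DLL provides, here is a minimal sketch that turns an XHTML string into a PDF in memory. It assumes iTextSharp 5.x with the XML Worker assembly referenced and well-formed XHTML input; the class and method names (HtmlToPdfSketch, Convert) are made up for this example. It creates a new PDF rather than stamping onto an existing page, so it is a starting point, not a drop-in replacement for the image approach in the question.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

public static class HtmlToPdfSketch // hypothetical helper name
{
    // Converts a well-formed XHTML string into PDF bytes using XML Worker.
    public static byte[] Convert(string xhtml)
    {
        using (var output = new MemoryStream())
        {
            var document = new Document();
            PdfWriter writer = PdfWriter.GetInstance(document, output);
            document.Open();

            using (var html = new StringReader(xhtml))
            {
                // XML Worker parses the markup and adds the resulting elements to the document
                XMLWorkerHelper.GetInstance().ParseXHtml(writer, document, html);
            }

            document.Close();
            // ToArray still works after the document has closed the stream
            return output.ToArray();
        }
    }
}
```

A call such as HtmlToPdfSketch.Convert("<html><body><p>Hello</p></body></html>") should return the bytes of a one-page PDF, which can then be written to disk or read back with PdfReader.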

Related

iTextSharp - adding stamp - stamp is not on top of content but under it

I am trying to stamp existing pdf document using ITextSharp's stamper.
I am able to open existing pdf and put an image inside on the desired position. (stamp the pdf)
The problem is that the stamp (red image) is always under the drawing: the black lines are drawn over the red image.
I am not able to do it vice versa.
My result:
The desired result is exactly the opposite - red image over the black lines
Any idea how to accomplish this properly?
Thanks for any advice.
Here is my code:
using (Stream inputPdfStream = new FileStream(@"D:\tmp\go\input.pdf", FileMode.Open, FileAccess.Read, FileShare.Read))
using (Stream outputPdfStream = new FileStream(@"D:\tmp\go\output.pdf", FileMode.Create, FileAccess.Write, FileShare.None))
using (Stream inputImageStream = new FileStream(@"D:\tmp\go\stamp.png", FileMode.Open, FileAccess.Read, FileShare.Read))
{
    var reader = new PdfReader(inputPdfStream);
    var stamper = new PdfStamper(reader, outputPdfStream);
    int lastPage = reader.NumberOfPages;
    Image image = Image.GetInstance(inputImageStream);
    image.ScalePercent(35.5f);
    image.SetAbsolutePosition(30, 30);
    PdfGState graphicsState = new PdfGState();
    graphicsState.BlendMode = PdfGState.BM_DARKEN;
    var pdfContentByte = stamper.GetOverContent(lastPage);
    pdfContentByte.SetGState(graphicsState);
    pdfContentByte.SaveState();
    pdfContentByte.AddImage(image);
    stamper.Close();
}
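One likely cause is the Darken blend mode set in the code above: it keeps the darker of the backdrop and the image at each point, so black lines show through a red stamp even though the stamp is drawn on the over-content layer. The sketch below is a variant that uses the default (normal) blend mode and also leaves out the unmatched SaveState call. The class and method names (StampSketch, StampLastPage) and the parameters are made up for this example, and it has not been run against the original files.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class StampSketch // hypothetical helper name
{
    // Places an image on top of the existing content of the last page.
    public static void StampLastPage(string inputPdf, string stampImage, string outputPdf)
    {
        var reader = new PdfReader(inputPdf);
        using (var output = new FileStream(outputPdf, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            var stamper = new PdfStamper(reader, output);

            Image image = Image.GetInstance(stampImage);
            image.ScalePercent(35.5f);
            image.SetAbsolutePosition(30, 30);

            // Over-content is painted after the page's own content.
            // No PdfGState is applied, so the default (normal) blend mode is used.
            PdfContentByte over = stamper.GetOverContent(reader.NumberOfPages);
            over.AddImage(image);

            stamper.Close();
        }
        reader.Close();
    }
}
```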

Convert html to pdf and merge it with existing pdfs

I have a System.Net.Mail.MailMessage which shall have its html body and pdf attachments converted into one single pdf.
Converting the html body to pdf works for me with this answer
Converting the pdf attachments into one pdf works for me with this answer
However after ~10 hours of trying I can not come up with a combined solution which does both. All I'm getting are NullReferenceExceptions somewhere in IText source, "the document is not open", etc...
For example, this will throw no error but the resulting pdf will only contain the attachments but not the html email body:
Document document = new Document();
StringReader sr = new StringReader(mail.Body);
HTMLWorker htmlparser = new HTMLWorker(document);
using (FileStream fs = new FileStream(targetPath, FileMode.Create))
{
    PdfCopy writer = new PdfCopy(document, fs);
    document.Open();
    htmlparser.Parse(sr);
    foreach (string fileName in pdfList)
    {
        PdfReader reader = new PdfReader(fileName);
        reader.ConsolidateNamedDestinations();
        for (int i = 1; i <= reader.NumberOfPages; i++)
        {
            PdfImportedPage page = writer.GetImportedPage(reader, i);
            writer.AddPage(page);
        }
        PRAcroForm form = reader.AcroForm;
        if (form != null)
        {
            writer.CopyAcroForm(reader);
        }
        reader.Close();
    }
    writer.Close();
    document.Close();
}
I'm using the LGPL licensed ITextSharp 4.1.6
From v4.1.6 fanboy to v4.1.6 fanboy :D
It looks like the HTMLWorker closes the document's stream right after parsing. As a workaround, you could create a pdf from your mail body in memory, and then add it together with the attachments to your final pdf.
Here is some code that should do the trick:
StringReader htmlStringReader = new StringReader("<html><body>Hello World!!!!!!</body></html>");
byte[] htmlResult;
// Step 1: render the HTML body into its own in-memory pdf
using (MemoryStream htmlStream = new MemoryStream())
{
    Document htmlDoc = new Document();
    PdfWriter htmlWriter = PdfWriter.GetInstance(htmlDoc, htmlStream);
    htmlDoc.Open();
    HTMLWorker htmlWorker = new HTMLWorker(htmlDoc);
    htmlWorker.Parse(htmlStringReader);
    htmlDoc.Close();
    htmlResult = htmlStream.ToArray();
}
byte[] pdfResult;
// Step 2: copy the pages of the HTML pdf and of each attachment into the final pdf
using (MemoryStream pdfStream = new MemoryStream())
{
    Document doc = new Document();
    PdfCopy copyWriter = new PdfCopy(doc, pdfStream);
    doc.Open();
    PdfReader htmlPdfReader = new PdfReader(htmlResult);
    AppendPdf(copyWriter, htmlPdfReader); // your foreach pdf code here
    htmlPdfReader.Close();
    PdfReader attachmentReader = new PdfReader("C:\\temp\\test.pdf");
    AppendPdf(copyWriter, attachmentReader);
    attachmentReader.Close();
    doc.Close();
    pdfResult = pdfStream.ToArray();
}
// Step 3: write the combined pdf to disk
using (FileStream fs = new FileStream("C:\\temp\\test2.pdf", FileMode.Create, FileAccess.Write))
{
    fs.Write(pdfResult, 0, pdfResult.Length);
}
private void AppendPdf(PdfCopy writer, PdfReader reader)
{
    for (int i = 1; i <= reader.NumberOfPages; i++)
    {
        PdfImportedPage page = writer.GetImportedPage(reader, i);
        writer.AddPage(page);
    }
}
Of course, you could also use a FileStream directly for the final document instead of a MemoryStream.

iTextSharp- How to create thumbnail image from first page of a pdf file

I want to create a thumbnail image from the first page of a PDF file. The code I am using is:
using (FileStream fs = new FileStream(inputFile, FileMode.Open, FileAccess.Read, FileShare.Read))
{
    using (Document doc = new Document())
    {
        using (PdfWriter w = PdfWriter.GetInstance(doc, fs))
        {
            PdfReader r = new PdfReader(inputFile);
            PdfImportedPage importedPage = w.GetImportedPage(r, 1);
            iTextSharp.text.Image PdfImage = iTextSharp.text.Image.GetInstance(importedPage);
            PdfImage.ScaleAbsolute(importedPage.Width / 2, importedPage.Height / 2);
            System.Drawing.Image img = System.Drawing.Image.FromStream(new MemoryStream(PdfImage.RawData));
            img.Save(thumbNailImagePath);
            doc.Close();
            r.Close();
        }
    }
}
Here PdfImage.RawData is returning null value. Can anyone tell me what is wrong here? I am new to iTextSharp, is it possible to create a thumbnail image of the first page of PDF content using iTextSharp?
Thanks Bruno and Amedee. Based on your comments, I have used GhostscriptSharp to create the thumbnail. It has the method GhostscriptWrapper.GeneratePageThumb(inputFile, thumbnailPath, pageNo, width, height) to create a thumbnail of a particular page.

Copying a PDF without losing the form field structure with iTextSharp

I would like to take a pdf, keep some of its pages, and then save it to another destination without losing the field structure.
Here is the code, which works perfectly for copying:
string sourceFolder = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
string sourceFile = Path.Combine(sourceFolder, "POMultiple.pdf");
string fileName = @"C:\Users\MyUser\Desktop\POMultiple.pdf";
byte[] file = System.IO.File.ReadAllBytes(fileName);
public static void removePagesFromPdf(byte[] sourceFile, String destinationFile, params int[] pagesToKeep)
{
    //Used to pull individual pages from our source
    PdfReader r = new PdfReader(sourceFile);
    //Create our destination file
    using (FileStream fs = new FileStream(destinationFile, FileMode.Create, FileAccess.Write, FileShare.None))
    {
        using (Document doc = new Document())
        {
            PdfWriter writer = PdfWriter.GetInstance(doc, fs);
            //Open the destination for writing
            doc.Open();
            //Loop through each page that we want to keep
            foreach (int page in pagesToKeep)
            {
                //Add a new blank page to destination document
                doc.NewPage();
                //Extract the given page from our reader and add it directly to the destination PDF
                writer.DirectContent.AddTemplate(writer.GetImportedPage(r, page), 0, 0);
            }
            //Close our document
            doc.Close();
        }
    }
}
But when I open the "TestOutput.pdf" file in Acrobat Reader, all my fields are empty.
Any help?
You need something like this:
PdfReader reader = new PdfReader(sourceFile);
// SelectPages takes the page ranges as a string
reader.SelectPages("2-4,8-9");
PdfStamper stp = new PdfStamper(reader, new FileStream(destinationFile, FileMode.Create));
stp.Close();
reader.Close();
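To keep the signature from the question (a byte array, a destination path and a list of pages to keep), the same idea can be sketched as follows. This assumes iTextSharp 5.x, where SelectPages also has an overload that accepts a collection of page numbers; the class and method names (KeepPagesSketch, KeepPages) are made up for this example.

```csharp
using System.Collections.Generic;
using System.IO;
using iTextSharp.text.pdf;

public static class KeepPagesSketch // hypothetical helper name
{
    // Keeps only the listed pages. PdfStamper writes the reader's (reduced) document
    // as a whole, so the form fields on the kept pages are preserved.
    public static void KeepPages(byte[] sourceFile, string destinationFile, params int[] pagesToKeep)
    {
        var reader = new PdfReader(sourceFile);

        // Assumption: the ICollection<int> overload of SelectPages (iTextSharp 5.x)
        reader.SelectPages(new List<int>(pagesToKeep));

        using (var fs = new FileStream(destinationFile, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            var stamper = new PdfStamper(reader, fs);
            stamper.Close();
        }
        reader.Close();
    }
}
```

For example, KeepPagesSketch.KeepPages(file, destinationFile, 2, 3, 4, 8, 9) selects the same pages as the range string "2-4,8-9".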

How to set copyright metadata of an existing PDF using iTextSharp for C#

How can the copyright metadata of an existing (i.e. a pdf loaded from file or memory stream) pdf file be set using iTextSharp for C#?
Thanks a lot
The native XMP structures don't have copyright implemented (or at least they don't in a way that Adobe Reader recognizes.) To do that you can reverse engineer what Adobe kicks out and write it manually:
String inputPDF = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "Services.pdf");
String outputPDF = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "Services_Out.pdf");
PdfReader reader = new PdfReader(inputPDF);
using (FileStream fs = new FileStream(outputPDF, FileMode.Create, FileAccess.Write, FileShare.Read))
{
    using (PdfStamper stamper = new PdfStamper(reader, fs))
    {
        using (MemoryStream ms = new MemoryStream())
        {
            string CopyrightName = "YOUR NAME HERE";
            string CopyrightUrl = "http://www.example.com/";
            XmpWriter xmp = new XmpWriter(ms);
            xmp.AddRdfDescription("xmlns:dc=\"http://purl.org/dc/elements/1.1/\"", String.Format("<dc:rights><rdf:Alt><rdf:li xml:lang=\"x-default\">{0}</rdf:li></rdf:Alt></dc:rights>", CopyrightName));
            xmp.AddRdfDescription("xmlns:xmpRights=\"http://ns.adobe.com/xap/1.0/rights/\"", string.Format("<xmpRights:Marked>True</xmpRights:Marked><xmpRights:WebStatement>{0}</xmpRights:WebStatement>", CopyrightUrl));
            xmp.Close();
            stamper.XmpMetadata = ms.ToArray();
            stamper.Close();
        }
    }
}
