Adding a HTML page as a last page to PDF document

I am creating a PDF document consisting of 6 images (one image per page) using iTextSharp.
I need to add an HTML page as the last page, after the 6th image.
I have tried the code below, but the HTML is not added on a new page; instead it is attached immediately below the 5th image.
Please advise how to make the HTML go on the last page.
Code for reference:
string ImagePath = HttpContext.Current.Server.MapPath("~/Images/");
string[] fileNames = System.IO.Directory.GetFiles(ImagePath);
string outputFileNames = "Test.pdf";
string outputFilePath = System.Web.Hosting.HostingEnvironment.MapPath("~/Pdf/" + outputFileNames);
Document doc = new Document(PageSize.A4, 20, 20, 20, 20);
System.IO.Stream st = new FileStream(outputFilePath, FileMode.Create, FileAccess.Write);
PdfWriter writer = PdfWriter.GetInstance(doc, st);
doc.Open();
writer.PageEvent = new Footer();
for (int i = 0; i < fileNames.Length; i++)
{
string fname = fileNames[i];
if (System.IO.File.Exists(fname) && Path.GetExtension(fname) == ".png")
{
iTextSharp.text.Image img = iTextSharp.text.Image.GetInstance(fname);
img.Border = iTextSharp.text.Rectangle.BOX;
img.BorderColor = iTextSharp.text.BaseColor.BLACK;
doc.Add(img);
}
}
byte[] pdf; // result will be here
var cssText = File.ReadAllText(MapPath("~/Style1.css"));
var html = File.ReadAllText(MapPath("~/HtmlPage1.html"));
using ( var memoryStream = new MemoryStream())
{
using (var cssMemoryStream = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(cssText)))
{
using (var htmlMemoryStream = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(html)))
{
XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, htmlMemoryStream, cssMemoryStream);
}
}
pdf = memoryStream.ToArray();
//document.Add(new Paragraph(Encoding.UTF8.GetString(pdf)));
}
doc.NewPage();
doc.Add(new Paragraph(Encoding.UTF8.GetString(pdf)));
doc.Close();
writer.Close();
Any help is appreciated

Contrary to what your code comment assumes, pdf is not where the result ends up. It remains empty because nothing ever writes to memoryStream:
byte[] pdf; // result will be here
...
using ( var memoryStream = new MemoryStream())
{
... code not accessing memoryStream ...
pdf = memoryStream.ToArray();
//document.Add(new Paragraph(Encoding.UTF8.GetString(pdf)));
}
doc.NewPage();
doc.Add(new Paragraph(Encoding.UTF8.GetString(pdf)));
Thus, you add the new page, followed by an empty paragraph, only after the converted HTML has already been added to the document.
The HTML is actually added during
XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, htmlMemoryStream, cssMemoryStream);
So you have to add the new page before that call. Replacing everything from your byte[] pdf; line onward with the following should do the job:
var cssText = File.ReadAllText(MapPath("~/Style1.css"));
var html = File.ReadAllText(MapPath("~/HtmlPage1.html"));
using (var cssMemoryStream = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(cssText)))
{
using (var htmlMemoryStream = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(html)))
{
doc.NewPage();
XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, htmlMemoryStream, cssMemoryStream);
}
}
doc.Close();
As an aside, don't close the writer! It is closed implicitly when the doc is closed. Closing it again does nothing at best and may do damage otherwise.
In a comment you claimed
but this also does not resolve the issue... the pdf content still get added after the image and then continued on new page.
So I tested the proposed change. Obviously I have neither your environment nor your image, HTML, and CSS files, so I used my own: a small screenshot and "<html><body><h1>Test</h1><p>This is a test piece of html</p></body></html>".
With your code I get: (screenshot not preserved)
With the code changed as described above I get: (screenshot not preserved)
My impression is that the proposed code change does resolve the issue: the HTML content is added on a new page.
Thus, apparently you either applied the proposed change incorrectly, executed old code, or inspected an old result.

Related

Creating a blank pdf and uploading an image to each page

I am new to file handling in ASP.NET Core 6.0. I want to create a blank PDF and load the images from the imagePath list into it. Using resources from the internet, I tried to create a blank PDF and add the images to it, but without success.
I couldn't get PdfReader to work inside PdfStamper. The only resource I found that suited my case was this question: Converting Multiple Images into Multiple Pages PDF using itextsharp
How can I do that? My code is below.
public static string MainStamping(string docname, List<string> imagePath, string mediaField) {
var config = new ConfigurationBuilder().AddJsonFile("appsettings.json").Build();
var webRootPath = config["AppSettings:urunResimPath"].ToString();
string filename = webRootPath + "\\menupdf\\" + docname + ".pdf";
// yeniisim = yeniisim + filelist.FileName;
// var fileName = "menupdf\\" + yeniisim;
FileStream pdfOutputFile = new FileStream(filename, FileMode.Create);
PdfConcatenate pdfConcatenate = new PdfConcatenate(pdfOutputFile);
PdfReader result = null;
for (int i = 0; i < imagePath.Count; i++) {
result = CreatePDFDocument1(imagePath[i], mediaField);
pdfConcatenate.AddPages(result);
}
pdfConcatenate.Close();
return filename;
}
public static PdfReader CreatePDFDocument1(string imagePath, string mediaField) {
PdfReader pdfReader = null;
//C:\Users\hilal\OneDrive\Belgeler\GitHub\Restaurant\Cafe.Web\wwwroot\assets\barkod-menu
var config = new ConfigurationBuilder().AddJsonFile("appsettings.json").Build();
var webRootPath = config["AppSettings:urunResimPath"].ToString();
string image = webRootPath + "\\barkod-menu\\" + imagePath;
iTextSharp.text.Image instanceImg = iTextSharp.text.Image.GetInstance(image);
MemoryStream inputStream = new MemoryStream();
inputStream.Seek(0, SeekOrigin.Begin); //I don't know what to do here do I need to use it?
pdfReader = new PdfReader(inputStream);
MemoryStream memoryStream = new MemoryStream();
PdfStamper pdfStamper = new PdfStamper(pdfReader, memoryStream);
AcroFields testForm = pdfStamper.AcroFields;
testForm.SetField("MediaField", mediaField);
PdfContentByte overContent = pdfStamper.GetOverContent(1);
IList<AcroFields.FieldPosition> fieldPositions = testForm.GetFieldPositions("ImageField");
if (fieldPositions == null || fieldPositions.Count <= 0) throw new ApplicationException("Error locating field");
AcroFields.FieldPosition fieldPosition = fieldPositions[0];
overContent.AddImage(instanceImg);
pdfStamper.FormFlattening = true;
pdfStamper.Close();
PdfReader resultReader = new PdfReader(memoryStream.ToArray());
pdfReader.Close();
return resultReader;
}
To explain it visually: the images should be placed into the blank PDF I created in this way (image not preserved). Thank you.

The following shows how to create a PDF using iTextSharp (v. 5.5.13.3), and add images to the PDF (one image per page). It's been tested with .NET 6.
Pre-requisite:
Download / install NuGet package iTextSharp (v. 5.5.13.3)
Add the following using statements:
using iTextSharp.text;
using iTextSharp.text.pdf;
using System.IO;
In the code below, the PDF is saved to a byte array instead of a file, which allows for more options.
CreatePdfDocument:
public static byte[] CreatePdfDocument(List<string> imagePaths)
{
byte[] pdfBytes;
using (MemoryStream ms = new MemoryStream())
{
using (Document doc = new Document(PageSize.LETTER, 1.0f, 1.0f, 1.0f, 1.0f))
{
using (PdfWriter writer = PdfWriter.GetInstance(doc, ms))
{
//open
doc.Open();
//create a new page for each image
for (int i = 0; i < imagePaths.Count; i++)
{
//add new page
doc.NewPage();
//get image
iTextSharp.text.Image img = iTextSharp.text.Image.GetInstance(imagePaths[i]);
//ToDo: set desired image size
//img.ScaleAbsolute(100.0f, 100.0f);
//center image on page
img.SetAbsolutePosition((PageSize.LETTER.Width - img.ScaledWidth) / 2, (PageSize.LETTER.Height - img.ScaledHeight) / 2);
//add image to page
doc.Add(img);
}
//close
doc.Close();
//convert MemoryStream to byte[]
pdfBytes = ms.ToArray();
}
}
}
return pdfBytes;
}
If all you want to do is write the PDF to a file, the method below does that.
CreatePdf:
public static void CreatePdf(string filename, List<string> imagePaths)
{
//create PDF
byte[] pdfBytes = CreatePdfDocument(imagePaths);
//save PDF to file
System.IO.File.WriteAllBytes(filename, pdfBytes);
}
Resources:
How to generate PDF one page one row from datatable using iTextSharp
iTextSharp - Working with images
Center image in pdf using itextsharp
Additional Resources:
iTextSharp: How to resize an image to fit a fix size?

acrofields not printing with multipage document - itextsharp

We are trying to create and print a multi-page PDF from a single template PDF containing editable AcroFields. The code seemed to work fine for a simple single page. However, for multiple pages the field values do not show up when the multi-page PDF is printed.
The code is as follows:
public ActionResult InsertUpdateFoodCourtMultiple(string FoodCourt, int EmployeeId, string EmployeeNo, DateTime FromDate, DateTime ToDate)
{
iTextSharp.text.Document document = new iTextSharp.text.Document();
String fileName = "";
fileName = "FoodToken.pdf";
string filePath = "~/Content/Files/" + fileName;
byte[] result;
//create newFileStream object which will be disposed at the end
using (MemoryStream newFileStream = new MemoryStream())
{
// step 2: we create a writer that listens to the document
PdfCopy writer = new PdfCopy(document, newFileStream);
// step 3: we open the document
document.Open();
while (FromDate <= ToDate)
{
var reader = new PdfReader(Server.MapPath(filePath));
MemoryStream output = new MemoryStream();
var stamper = new PdfStamper(reader, output);
stamper.AcroFields.SetField("TxtSerial", EmployeeId.ToString());
stamper.AcroFields.SetField("TxtEmployeeId", EmployeeNo);
stamper.AcroFields.SetField("TxtDate", FromDate.ToString("dd-MMM-yyyy"));
stamper.AcroFields.SetField("TxtIssuedBy", SessionHelper.GetLoggedInUser().FirstName + " " + SessionHelper.GetLoggedInUser().LastName);
stamper.AcroFields.SetField("TxtMeal", " - " + FoodCourt);
stamper.FormFlattening = true;
// step 4: we add content
for (int i = 1; i <= reader.NumberOfPages; i++)
{
PdfImportedPage page = writer.GetImportedPage(reader, i);
writer.AddPage(page);
}
stamper.Close();
// writer.AddDocument(reader);
reader.Close();
// step 5: we close the document and writer
//reader = new PdfReader(output);
//writer.AddDocument(reader);
//reader.Close();
FromDate = FromDate.AddDays(1);
}
result = newFileStream.ToArray();
writer.Close();
}//disposes the newFileStream object
document.Close();
Response.AddHeader("Content-Disposition", "inline; filename=" + fileName);
Response.ContentType = "application/pdf";
return File(result, "application/pdf");
}
any help appreciated

I see two major errors:
Copying the unchanged original instead of the filled in copy
In your while loop you add the document for the current FromDate to the PdfCopy:
for (int i = 1; i <= reader.NumberOfPages; i++)
{
PdfImportedPage page = writer.GetImportedPage(reader, i);
writer.AddPage(page);
}
The mistake is, though, that you add the pages from the reader containing the original "FoodToken.pdf", not the filled-in version created in the stamper. You actually ignore the result of the stamper completely.
Your code in comments shows that you probably once had tried code which attempted to add the filled-in version to the PdfCopy instance:
//reader = new PdfReader(output);
//writer.AddDocument(reader);
//reader.Close();
This is the code you should try to get working. (If your problem with that code was that the output already was closed, consider telling the stamper not to close the result stream or alternatively simply retrieve its bytes.)
Using an unfinished result
You retrieve the contents of the MemoryStream the PdfCopy writes to before you close the latter and so tell it to finalize its output:
result = newFileStream.ToArray();
writer.Close();
This causes the result to be an incomplete PDF.
Thus, try it the other way around. (If you then run into a problem that the newFileStream is closed, consider telling the writer not to close the result stream.)
Another thing you shouldn't do is close the document after disposing the MemoryStream its output is directed to. Here you might be lucky because you closed the writer beforehand, making document.Close() a NOP; in general, though, such a Close call is likely to cause an exception.
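The two fixes described above can be combined roughly as follows. This is only a sketch under assumptions: the class name TokenPdfBuilder, the method FillAndMerge, and the idea of passing the template as bytes plus one dictionary of field values per copy are hypothetical and not taken from the question's code. It assumes iTextSharp 5.x and at least one set of field values (closing a document without pages throws).

```csharp
using System.Collections.Generic;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class TokenPdfBuilder
{
    // Hypothetical helper: fills the template once per field set and
    // concatenates the flattened copies into one PDF.
    public static byte[] FillAndMerge(
        byte[] templateBytes,
        IEnumerable<IDictionary<string, string>> fieldSets)
    {
        using (var merged = new MemoryStream())
        {
            var document = new Document();
            var copy = new PdfCopy(document, merged);
            document.Open();

            foreach (var fields in fieldSets)
            {
                // 1. Fill and flatten one copy of the template in memory.
                byte[] filled;
                using (var output = new MemoryStream())
                {
                    var templateReader = new PdfReader(templateBytes);
                    var stamper = new PdfStamper(templateReader, output);
                    foreach (var field in fields)
                    {
                        stamper.AcroFields.SetField(field.Key, field.Value);
                    }
                    stamper.FormFlattening = true;
                    stamper.Close();           // finalizes the filled PDF
                    templateReader.Close();
                    filled = output.ToArray(); // ToArray works on a closed MemoryStream
                }

                // 2. Import the filled copy, not the original template.
                var filledReader = new PdfReader(filled);
                copy.AddDocument(filledReader);
                filledReader.Close();
            }

            // 3. Close the document first so the PDF is finalized,
            //    and only then retrieve the bytes.
            document.Close();
            return merged.ToArray();
        }
    }
}
```

In the question's loop, each iteration would build one dictionary holding the TxtSerial, TxtEmployeeId, TxtDate, TxtIssuedBy and TxtMeal values for the current date.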

Itextsharp Pdfwriter.getinstance returning 1kb pdf file

I'm trying to create a MemoryStream containing a PDF; for testing I'm writing it to disk. When I write the stream, I get a 1 KB PDF file. Any ideas?
UPDATE:
It looks like I wasn't calling doc.Close(); however, when I do, it disposes of my finalReportStream. Is there a way around this?
public static Stream CombinePages(Stream firstPageStream, string nextPagesString)
{
var firstPage = new PdfReader(firstPageStream);
var nextPages = new PdfReader(nextPagesString);
Stream finalReportStream = new MemoryStream();
var doc = new Document();
var w = PdfWriter.GetInstance(doc, finalReportStream);
doc.Open();
doc.SetPageSize(firstPage.GetPageSize(1));
doc.NewPage();
//Add Page 1
w.DirectContent.AddTemplate(w.GetImportedPage(firstPage, 1), 0, 0);
//Add the rest of the pages
//copy readnextpages to doc starting page2 this cuts the first page
for (var page = 2; page <= nextPages.NumberOfPages; page++)
{
doc.SetPageSize(nextPages.GetPageSize(page));
doc.NewPage();
w.DirectContent.AddTemplate(w.GetImportedPage(nextPages, page), 0, 0);
}
return finalReportStream;
}
And then write it to disk:
var fileStream = File.Create(destfilename);
finalReportStream.CopyTo(fileStream);
fileStream.Close();
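One possible way around the disposal problem raised in the update, sketched under assumptions rather than taken from an accepted answer: iTextSharp 5.x writers expose a CloseStream property, and setting it to false should keep the target stream open when the document is closed. The class PdfStreams and the method BuildPdfStream are hypothetical names, and the example writes a single paragraph instead of importing pages, just to keep it short.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class PdfStreams
{
    // Hypothetical helper: builds a one-page PDF and returns a readable
    // stream positioned at the start.
    public static Stream BuildPdfStream(string text)
    {
        var result = new MemoryStream();
        var doc = new Document();
        var writer = PdfWriter.GetInstance(doc, result);
        writer.CloseStream = false; // keep 'result' open when the document closes
        doc.Open();
        doc.Add(new Paragraph(text));
        doc.Close();                // writes the cross-reference table and trailer
        result.Position = 0;        // rewind so CopyTo starts at the first byte
        return result;
    }
}
```

The same two steps (CloseStream = false before doc.Close(), then rewinding) would apply to CombinePages. Alternatively, MemoryStream.ToArray() can still be called after the stream has been closed, so returning a byte[] avoids the issue altogether.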

Adding text to existing multipage PDF document in memorystream using iTextSharp

I am trying to add text to an existing PDF file using iTextSharp. I have been reading many posts, including the popular thread here.
I have some differences:
My PDF are X pages long
I want to keep everything in memory, and never have a file stored on my filesystem
So I tried to modify the code, so it takes in a byte array and returns a byte array. I have come this far:
The code compiles and runs
My out byte array has a different length than my in byte array
My problem:
I cannot see my added text when I later store the modified byte array and open it in my PDF reader
I don't get why. As far as I can tell, I do the same as every StackOverflow post I have seen: I use DirectContent, call BeginText and write some text. However, I cannot see it, no matter how I move the position around.
Any idea what is missing from my code?
public static byte[] WriteIdOnPdf(byte[] inPDF, string str)
{
byte[] finalBytes;
// open the reader
using (PdfReader reader = new PdfReader(inPDF))
{
Rectangle size = reader.GetPageSizeWithRotation(1);
using (Document document = new Document(size))
{
// open the writer
using (MemoryStream ms = new MemoryStream())
{
using (PdfWriter writer = PdfWriter.GetInstance(document, ms))
{
document.Open();
for (var i = 1; i <= reader.NumberOfPages; i++)
{
document.NewPage();
var baseFont = BaseFont.CreateFont(BaseFont.HELVETICA_BOLD, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
var importedPage = writer.GetImportedPage(reader, i);
var contentByte = writer.DirectContent;
contentByte.BeginText();
contentByte.SetFontAndSize(baseFont, 18);
var multiLineString = "Hello,\r\nWorld!";
contentByte.ShowTextAligned(PdfContentByte.ALIGN_LEFT, multiLineString,100, 200, 0);
contentByte.EndText();
contentByte.AddTemplate(importedPage, 0, 0);
}
document.Close();
ms.Close();
writer.Close();
reader.Close();
}
finalBytes = ms.ToArray();
}
}
}
return finalBytes;
}

The code below shows a full working example of creating a PDF in memory and then performing a second pass, also in memory. It does what @mkl says and closes all iText parts before trying to grab the raw bytes from the stream. It also uses GetOverContent() to draw "on top" of the previous PDF. See the code comments for more details.
//Bytes will hold our final PDFs
byte[] bytes;
//Create an in-memory PDF
using (var ms = new MemoryStream()) {
using (var doc = new Document()) {
using (var writer = PdfWriter.GetInstance(doc, ms)) {
doc.Open();
//Create a bunch of pages and add text, nothing special here
for (var i = 1; i <= 10; i++) {
doc.NewPage();
doc.Add(new Paragraph(String.Format("First Pass - Page {0}", i)));
}
doc.Close();
}
}
//Right before disposing of the MemoryStream grab all of the bytes
bytes = ms.ToArray();
}
//Another in-memory PDF
using (var ms = new MemoryStream()) {
//Bind a reader to the bytes that we created above
using (var reader = new PdfReader(bytes)) {
//Store our page count
var pageCount = reader.NumberOfPages;
//Bind a stamper to our reader
using (var stamper = new PdfStamper(reader, ms)) {
//Setup a font to use
var baseFont = BaseFont.CreateFont(BaseFont.HELVETICA_BOLD, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
//Loop through each page
for (var i = 1; i <= pageCount; i++) {
//Get the raw PDF stream "on top" of the existing content
var cb = stamper.GetOverContent(i);
//Draw some text
cb.BeginText();
cb.SetFontAndSize(baseFont, 18);
cb.ShowText(String.Format("Second Pass - Page {0}", i));
cb.EndText();
}
}
}
//Once again, grab the bytes before closing things out
bytes = ms.ToArray();
}
//Just to see the final results I'm writing these bytes to disk but you could do whatever
var testFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "test.pdf");
System.IO.File.WriteAllBytes(testFile, bytes);

ITextSharp Parsing HTML with Images in it: It parses correctly but wont show images

I am trying to generate a PDF from HTML using the iTextSharp library. I am able to create the PDF with the HTML text converted to PDF text/paragraphs.
My problem: the PDF does not show my images (the img elements from the HTML). Is it possible for iTextSharp to parse HTML and display images? I really hope so, otherwise I am stuffed :(
I am pointing to the correct directory where the images are (using IMG_BASEURL), but they are just not showing.
My code:
// mainContents variable is a string containing my HTML
var document = new Document(PageSize.A4, 50, 50, 80, 100);
var output = new MemoryStream();
var writer = PdfWriter.GetInstance(document, output);
document.Open();
Hashtable providers = new Hashtable();
providers.Add("img_baseurl","C:/users/xx/VisualStudio/Projects/myproject/");
var parsedHtmlElements = HTMLWorker.ParseToList(new StringReader(mainContents), null, providers);
foreach (var htmlElement in parsedHtmlElements)
document.Add(htmlElement as IElement);
document.Close();

Every time that I've encountered this the problem was that the image was too large for the canvas. More specifically, even a naked IMG tag internally will get wrapped in a Chunk that will get wrapped in a Paragraph, and I think that the image is overflowing the Paragraph but I'm not 100% sure.
The two easy fixes are to either enlarge the canvas or to specify image dimensions on the HTML IMG tag. The third, more complex route would be to use an additional provider, IMG_PROVIDER. To do this you need to implement the IImageProvider interface. Below is a very simple version of one:
public class ImageThing : IImageProvider {
//Store a reference to the main document so that we can access the page size and margins
private Document MainDoc;
//Constructor
public ImageThing(Document doc) {
this.MainDoc = doc;
}
Image IImageProvider.GetImage(string src, IDictionary<string, string> attrs, ChainedProperties chain, IDocListener doc) {
//Prepend the src tag with our path. NOTE, when using HTMLWorker.IMG_PROVIDER, HTMLWorker.IMG_BASEURL gets ignored unless you choose to implement it on your own
src = Environment.GetFolderPath(Environment.SpecialFolder.Desktop) + @"\" + src;
//Get the image. NOTE, this will attempt to download/copy the image, you'd really want to sanity check here
Image img = Image.GetInstance(src);
//Make sure we got something
if (img == null) return null;
//Determine the usable area of the canvas. NOTE, this doesn't take into account the current "cursor" position so this might create a new blank page just for the image
float usableW = this.MainDoc.PageSize.Width - (this.MainDoc.LeftMargin + this.MainDoc.RightMargin);
float usableH = this.MainDoc.PageSize.Height - (this.MainDoc.TopMargin + this.MainDoc.BottomMargin);
//If the downloaded image is bigger than either width and/or height then shrink it
if (img.Width > usableW || img.Height > usableH) {
img.ScaleToFit(usableW, usableH);
}
//return our image
return img;
}
}
To use this provider just add it to the provider collection like you did with HTMLWorker.IMG_BASEURL:
providers.Add(HTMLWorker.IMG_PROVIDER, new ImageThing(doc));
It should be noted that if you use HTMLWorker.IMG_PROVIDER that you are responsible for figuring out everything about the image. The code above assumes that all image paths need to be prepended with a constant string, you'll probably want to update this and check for HTTP at the start. Also, because we're saying that we want to completely handle the image processing pipeline the provider HTMLWorker.IMG_BASEURL is no longer needed.
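The "check for HTTP at the start" idea mentioned above could look roughly like this. It is a minimal sketch, not part of iTextSharp: the class ImageSourceResolver and its Resolve method are hypothetical names, and the base directory is whatever folder your images actually live in.

```csharp
using System;
using System.IO;

public static class ImageSourceResolver
{
    // Hypothetical helper: returns absolute URLs unchanged and resolves
    // everything else against baseDirectory.
    public static string Resolve(string src, string baseDirectory)
    {
        if (string.IsNullOrWhiteSpace(src))
        {
            throw new ArgumentException("Image source must not be empty.", nameof(src));
        }

        if (src.StartsWith("http://", StringComparison.OrdinalIgnoreCase) ||
            src.StartsWith("https://", StringComparison.OrdinalIgnoreCase))
        {
            return src;
        }

        // Path.Combine returns src unchanged when it is already a rooted path.
        return Path.Combine(baseDirectory, src);
    }
}
```

Inside GetImage, the line that prepends the desktop path could then be replaced by a call such as src = ImageSourceResolver.Resolve(src, baseDirectory), with baseDirectory supplied through the provider's constructor.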
The main code loop would now look something like this:
string html = @"<img src=""Untitled-1.png"" />";
string outputFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "HtmlTest.pdf");
using (FileStream fs = new FileStream(outputFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
using (Document doc = new Document(PageSize.A4, 50, 50, 80, 100)) {
using (PdfWriter writer = PdfWriter.GetInstance(doc, fs)) {
doc.Open();
using (StringReader sr = new StringReader(html)) {
System.Collections.Generic.Dictionary<string, object> providers = new System.Collections.Generic.Dictionary<string, object>();
providers.Add(HTMLWorker.IMG_PROVIDER, new ImageThing(doc));
var parsedHtmlElements = HTMLWorker.ParseToList(sr, null, providers);
foreach (var htmlElement in parsedHtmlElements) {
doc.Add(htmlElement as IElement);
}
}
doc.Close();
}
}
}
One last thing: make sure to specify which version of iTextSharp you are targeting when posting here. The code above targets iTextSharp 5.1.2.0, but I think you might be using the 4.X series.

I faced the same problem and tried the following proposed solutions:
string-replacing a tag, encoding the image in base64, and embedding the image in a .NET class library, but none of them worked!
So I fell back on the old-fashioned solution: adding the logo manually with doc.Add().
Here's your code updated:
string html = @"<img src=""Untitled-1.png"" />";
string outputFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "HtmlTest.pdf");
using (FileStream fs = new FileStream(outputFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
using (Document doc = new Document(PageSize.A4, 50, 50, 80, 100)) {
using (PdfWriter writer = PdfWriter.GetInstance(doc, fs)) {
doc.Open();
using (StringReader sr = new StringReader(html)) {
System.Collections.Generic.Dictionary<string, object> providers = new System.Collections.Generic.Dictionary<string, object>();
providers.Add(HTMLWorker.IMG_PROVIDER, new ImageThing(doc));
var parsedHtmlElements = HTMLWorker.ParseToList(sr, null, providers);
foreach (var htmlElement in parsedHtmlElements) {
doc.Add(htmlElement as IElement);
}
// here's the magic
var logo = iTextSharp.text.Image.GetInstance(Server.MapPath("~/HTMLTemplate/logo.png"));
logo.SetAbsolutePosition(440, 800);
doc.Add(logo);
// end
}
doc.Close();
}
}
}

string siteUrl = HttpContext.Current.Server.MapPath("/images/image/ticket/Ticket.jpg");
string HTML = "<table><tr><td><u>asdasdsadasdsa <img src='" + siteUrl + "' al='tt' /> </u></td></tr></table>";
