Divide one page PDF file in two pages PDF file

I'm using iTextSharp to handle pdf files. I'd like to know how I can split a page in half and make 2 different pages from the two pieces. I tried a lot but nothing seems to work right now.
First try
iTextSharp.text.Rectangle size = new iTextSharp.text.Rectangle(0, pdfReader.GetPageSize(1).Height / 2, pdfReader.GetPageSize(1).Width, 0);
Second try
iTextSharp.text.Rectangle size = pdfReader.GetPageSizeWithRotation(1);
iTextSharp.text.Document document = new iTextSharp.text.Document(size.GetRectangle(0, size.Height / 2));
And several others. The results are always the same: I have a file with just the second half of the original page.

I don't understand your code snippets, but then again: probably you don't understand them either, so let's not look at what you've written so far, and let's take a closer look at the TileInTwo example:
public void manipulatePdf(String src, String dest)
throws IOException, DocumentException {
// Creating a reader
PdfReader reader = new PdfReader(src);
int n = reader.getNumberOfPages();
// step 1
Rectangle mediabox = new Rectangle(getHalfPageSize(reader.getPageSizeWithRotation(1)));
Document document = new Document(mediabox);
// step 2
PdfWriter writer
= PdfWriter.getInstance(document, new FileOutputStream(dest));
// step 3
document.open();
// step 4
PdfContentByte content = writer.getDirectContent();
PdfImportedPage page;
int i = 1;
while (true) {
page = writer.getImportedPage(reader, i);
content.addTemplate(page, 0, -mediabox.getHeight());
document.newPage();
content.addTemplate(page, 0, 0);
if (++i > n)
break;
mediabox = new Rectangle(getHalfPageSize(reader.getPageSizeWithRotation(i)));
document.setPageSize(mediabox);
document.newPage();
}
// step 5
document.close();
reader.close();
}
public Rectangle getHalfPageSize(Rectangle pagesize) {
float width = pagesize.getWidth();
float height = pagesize.getHeight();
return new Rectangle(width, height / 2);
}
In this example, we ask the PdfReader instance for the page size of the first page and we create a new rectangle with the same width and only half the height.
We then import each page in the document, and we add it twice on different pages:
once on the odd pages with a negative y value to show the upper half of the original page,
once on the even pages with y = 0 to show the lower half of the original page.
As every page in the original document can have a different size, we may need to change the page size for every new couple of pages.
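The offsets used in the example generalize to any number of horizontal strips. What follows is a minimal sketch in plain Java with no iText calls; the class name `StripOffsets` and its method are made up for illustration. It computes the y value you would pass to addTemplate for each strip, counting strips from the top of the original page, assuming the new page is exactly one strip high.

```java
// Hypothetical helper, not part of iText: pure arithmetic only.
final class StripOffsets {
    /**
     * Y offset for strip k (0 = topmost) when a page is cut into
     * `strips` horizontal strips of equal height.
     */
    static float offsetY(int strips, int k, float originalHeight) {
        float stripHeight = originalHeight / strips;
        // Shift the imported page down so that strip k lands in [0, stripHeight].
        return -(strips - 1 - k) * stripHeight;
    }

    public static void main(String[] args) {
        float height = 842f; // assumed page height in points (roughly A4)
        for (int k = 0; k < 2; k++) {
            System.out.println("strip " + k + " -> y = " + offsetY(2, k, height));
        }
    }
}
```

With two strips this reproduces the values from the example: minus half the original height for the upper half and 0 for the lower half.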

Related

How can I split a long pdf page into several pages with iText7 in C# in different lengths?

I have pdf files in receipt format (e.g. 80x1000mm), one long page per document.
I want to create a new pdf and split the receipt into several printable A4 pages.
The receipt contains blocks (e.g. QR codes) which must not be cut in the middle.
So the new pages must be created based on given lengths (rectangles), measured from the top edge, e.g.:
Page 1: 0 - 272mm
Page 2: 272 - 513mm
Page 3: 513 - 783 mm
Page 4: 783 - to end
How can I split the PDF page into several pages with iText7 in C# by specifying rectangles?
I looked at several examples (TileClipped.java on git, C06E02_TheGoldenGateBridge_Tiles).
I have created a very simple example to play around with coordinates, AddXObjectAt, PdfCanvas, AffineTransform, ConcatMatrix, Clip, etc.
The example generates a 1000mm long PDF.
I have experimented with this PDF. To begin with, I just tried to cut it into 4 x 250mm.
Unfortunately without success.
using System;
using iText.Commons;
using iText.Kernel;
using iText.Kernel.Colors;
using iText.Kernel.Font;
using iText.Kernel.Geom;
using iText.Kernel.Pdf;
using iText.Kernel.Pdf.Canvas;
using iText.Kernel.Pdf.Xobject;
using iText.Layout;
using iText.Layout.Element;
namespace SplitPage
{
internal class Program
{
public const string fileReceipt = "c:\\tmp\\receipt1000mm.pdf";
public const string fileA4 = "c:\\tmp\\receiptA4.pdf";
static void Main(string[] args)
{
new Program().CreateAndSplitPdfReceipt();
}
static float mmToPt(float mm) => mm * 72.0f / 25.4f;
void CreateAndSplitPdfReceipt()
{
float recieptLength = 2834.6f; // 1000mm approx 2834.6 pt
float recieptWidth = 220f; // approx 78mm
// Create 1000mm long receipt (= 2834.6pt)
{
PdfDocument pdfDoc = new PdfDocument(new PdfWriter(fileReceipt));
pdfDoc.SetDefaultPageSize(new PageSize(recieptWidth, recieptLength));
Document doc = new Document(pdfDoc);
for (int i = 1; i <= 250; i++)
{
float y = recieptLength - mmToPt(i * 4f); // 4mm per row * 250rows = 1000mm
var p = new Paragraph(string.Format("Row {0}: y={1:f1} y'={2}mm ", i, y, i*4));
p.SetFontSize(8);
p.SetFixedPosition(10, y, 220);
p.SetMargin(0);
doc.Add(p);
}
pdfDoc.Close();
}
// split receipt1000.pdf in 4 equal parts a 250mm
{
float chunk = recieptLength / 4f;
PdfDocument srcDoc = new PdfDocument(new PdfReader(fileReceipt));
PdfDocument pdfA4 = new PdfDocument(new PdfWriter(fileA4));
pdfA4.SetDefaultPageSize(PageSize.A4);
PdfPage srcPage = srcDoc.GetPage(1); // just one page in receipt
PdfFormXObject copyObject = srcPage.CopyAsFormXObject(pdfA4);
PdfPage pageA4;
PdfCanvas canvas;
// 1st
pageA4 = pdfA4.AddNewPage(PageSize.A4);
canvas = new PdfCanvas(pageA4);
canvas.AddXObjectAt(copyObject, 0, 3 * chunk);
// 2nd
pageA4 = pdfA4.AddNewPage(PageSize.A4);
canvas = new PdfCanvas(pageA4);
canvas.AddXObjectAt(copyObject, 0, 2 * chunk);
// 3rd
pageA4 = pdfA4.AddNewPage(PageSize.A4);
canvas = new PdfCanvas(pageA4);
canvas.AddXObjectAt(copyObject, 0, 1 * chunk);
// 4th
pageA4 = pdfA4.AddNewPage(PageSize.A4);
canvas = new PdfCanvas(pageA4);
canvas.AddXObjectAt(copyObject, 0, 0);
srcDoc.Close();
pdfA4.Close();
}
}
}
}
My problem is obviously that I did not understand the basic operations.
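The coordinate arithmetic behind such a split can be worked out independently of the library. The sketch below is plain Java with no iText calls; `ReceiptSlices` and its methods are hypothetical names, and the A4 height of 842 pt is an assumption. PDF coordinates start at the bottom left, so a cut position measured from the top edge has to be converted: to bring the line lying `topMm` below the top of the source page up to the top of the target page, the source must be drawn at y = targetHeight - sourceHeight + topMm (in points). The slice then occupies the top of the target page, and everything below its lower edge would have to be clipped away.

```java
// Hypothetical helper, not part of iText: pure arithmetic only.
final class ReceiptSlices {
    static float mmToPt(float mm) {
        return mm * 72f / 25.4f;
    }

    /**
     * Y offset at which to draw the whole source page so that the line
     * lying topMm below its top edge coincides with the top edge of
     * the target page.
     */
    static float offsetY(float targetHeightPt, float sourceHeightPt, float topMm) {
        return targetHeightPt - sourceHeightPt + mmToPt(topMm);
    }

    /** Lower edge (y) of the clip rectangle on the target page. */
    static float clipBottomY(float targetHeightPt, float topMm, float bottomMm) {
        return targetHeightPt - mmToPt(bottomMm - topMm);
    }

    public static void main(String[] args) {
        float a4Height = 842f;         // assumed A4 height in points
        float receiptHeight = 2834.6f; // 1000 mm, value taken from the question
        float[] cutsMm = {0f, 272f, 513f, 783f, 1000f}; // cut positions from the question
        for (int p = 0; p + 1 < cutsMm.length; p++) {
            float dy = offsetY(a4Height, receiptHeight, cutsMm[p]);
            float clipY = clipBottomY(a4Height, cutsMm[p], cutsMm[p + 1]);
            System.out.println("page " + (p + 1) + ": draw at y=" + dy
                    + ", clip from y=" + clipY + " to y=" + a4Height);
        }
    }
}
```

Note the sign: for the first slices the offset is negative, because the long source page has to be shifted down, not up. The resulting values would serve as the y argument of AddXObjectAt and as the lower edge of a clip rectangle; the drawing calls themselves are left out here.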

Getting past the 200 inch adobe error with iTextSharp

I am converting a multi-paged (paginated) pdf document into a single page (non-paginated) pdf document.
I am looking to overcome the 200 inch limitation in adobe reader.
With iTextSharp.PdfReader each page is read to create a total height of the target document and find the maximum width.
The code that creates the document works fine, reading directly from the paginated PDF into the non-paginated PDF. The file opens fine in Chrome or Foxit. Adobe Reader truncates the page when it exceeds 200 inches. In my test file the page size is 8.25 x 814 inches.
Changing the UserUnit to 4.07 (814/200) makes Adobe show the page height as 814 in, but it still truncates the page and shows the width as 33.
If the width of the target file is set to width/userunit (8.25/4.07), only the left 2 inches are shown in the target file.
The copy part of the code:
RandomAccessFileOrArray ra = new RandomAccessFileOrArray(fn);
SizeF pageSize = new SizeF(pageWidth, pageHeight);
float USERUnitNewValue = ComputeUserUnit(pageSize);
if (pageHeight > 14400f)
{
USERUnitNewValue = pageHeight / 14400f;
}
float NewPageWidth = (pageWidth <= 14400f) ? pageWidth : pageWidth* USERUnitNewValue;
float NewPageHeight = pageHeight * USERUnitNewValue;
FileInfo file1 = new FileInfo(newfn);
DirectoryInfo directory1 = file1.Directory;
if (!directory1.Exists)
directory1.Create();
iTextSharp.text.Rectangle newPagesize = new iTextSharp.text.Rectangle(pageWidth, pageHeight);
Document newPdf = new Document(newPagesize);
PdfWriter writer = PdfWriter.GetInstance(newPdf, new FileStream(newfn, FileMode.Create));
writer.PdfVersion = PdfWriter.VERSION_1_6;
if (pageHeight > 14400)
{
writer.Userunit = USERUnitNewValue;
}
newPdf.SetMargins(0f, 0f, 0f, 0f);
newPdf.Open();
PdfContentByte cb = writer.DirectContent;
float verticalPosition = pageHeight;
for (int pagenumber = 1; pagenumber <= n1; pagenumber++)
{
if (pdfReader.NumberOfPages >= pagenumber)
{
verticalPosition = verticalPosition - pdfReader.GetPageSize(pagenumber).Height;
cb.AddTemplate(writer.GetImportedPage(pdfReader, pagenumber), 0, verticalPosition);
}
else
{
break;
}
}
newPdf.Close();
How can the original file be copied into the target where both files would keep the same size if someone sends it to a printer?
Yes there is some redundancy in this code as I have been troubleshooting this for a little while now.
The key question here is a setting that would maintain the 8.25 x 814in and still allow adobe to open the file.
Thanks,
Mike

Thank you David.
After looking through Mr Lowagie's documentation and his brief note about addTemplate:
cb.addTemplate(page, scale, 0, 0, scale, 0, 0)
the code was updated to use the new UserUnit and scaling.
It now opens fine in Adobe and reports the page length as expected.
Once again, the code is still a little ugly:
RandomAccessFileOrArray ra = new RandomAccessFileOrArray(fn);
SizeF pageSize = GetPageSize(fn);
PdfReader pdfReader = new PdfReader(fn);
float USERUnitNewValue = ComputeUserUnit(pageSize);
int n1 = pdfReader.NumberOfPages;
if (pageSize.Height > 14400f) //14400 is 72 points per inch times 200 inches. 200 inches seems to be Adobe's limit for a page
{ //determine the userunit to be used
USERUnitNewValue = pageSize.Height / 14400f;
}
float NewPageWidth = pageSize.Width / USERUnitNewValue;
float NewPageHeight = pageSize.Height / USERUnitNewValue;
FileInfo file1 = new FileInfo(newfn);
DirectoryInfo directory1 = file1.Directory;
if (!directory1.Exists)
directory1.Create();
iTextSharp.text.Rectangle newPagesize = new iTextSharp.text.Rectangle(NewPageWidth, NewPageHeight);
Document newPdf = new Document(newPagesize);
PdfWriter writer = PdfWriter.GetInstance(newPdf, new FileStream(newfn, FileMode.Create));
writer.PdfVersion = PdfWriter.VERSION_1_6;
if (pageSize.Height > 14400)
{
writer.Userunit = USERUnitNewValue;
}
newPdf.SetMargins(0f, 0f, 0f, 0f);
newPdf.Open();
PdfContentByte cb = writer.DirectContent;
float verticalPosition = NewPageHeight;
for (int pagenumber = 1; pagenumber <= n1; pagenumber++)
{
if (pdfReader.NumberOfPages >= pagenumber)
{
/* Page position: PDF coordinates put 0,0 at the bottom left of the page,
 * so the first source page has to go at the top of the target page.
 * Starting from the full height, subtract each page's height so that
 * every page starts directly below the previous one.
 */
float widthfactor = 1 / USERUnitNewValue; //Page scaling (width)
float heightfactor = 1 / USERUnitNewValue; //Page scaling (height)
//vertical position needs to take the new page height into account, with the new UserUnit in effect
verticalPosition = verticalPosition - (pdfReader.GetPageSize(pagenumber).Height / USERUnitNewValue);
//AddTemplate takes the matrix (a, b, c, d, e, f): a scales x (width), d scales y (height)
cb.AddTemplate(writer.GetImportedPage(pdfReader, pagenumber), widthfactor, 0, 0, heightfactor, 0, verticalPosition);
}
else
{
break;
}
}
newPdf.Close();
Thank you again for your help,
Mike
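The UserUnit arithmetic from this answer can be checked in isolation. The following is a minimal sketch in plain Java with no iTextSharp calls; `UserUnitMath` is a hypothetical name, and taking the longer of the two sides (instead of only the height, as the code above does) is an assumption made for illustration. The idea is that the page dimensions stored in the file are divided by the UserUnit, while a viewer multiplies them back, so the reported size stays the same.

```java
// Hypothetical helper, not part of iTextSharp: pure arithmetic only.
final class UserUnitMath {
    // 200 inches * 72 points per inch, the limit mentioned in the question
    static final float MAX_PT = 14400f;

    /** Smallest UserUnit that brings the longest side down to MAX_PT. */
    static float userUnit(float widthPt, float heightPt) {
        float longest = Math.max(widthPt, heightPt);
        return longest > MAX_PT ? longest / MAX_PT : 1f;
    }

    /** Length as it has to be written into the file for a given UserUnit. */
    static float scaled(float lengthPt, float userUnit) {
        return lengthPt / userUnit;
    }

    public static void main(String[] args) {
        float width = 8.25f * 72f; // 8.25 in, from the question
        float height = 814f * 72f; // 814 in, from the question
        float unit = userUnit(width, height);
        System.out.println("UserUnit = " + unit);
        System.out.println("stored size = " + scaled(width, unit) + " x " + scaled(height, unit));
        // The same factor 1/unit is what goes into the scaling matrix of AddTemplate.
        System.out.println("content scale = " + (1f / unit));
    }
}
```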

Split PDF page to multiple page in C#

I'm creating an application for Windows 8.1 in C#. In Windows.Data.Pdf I've found how to use PDF files in my application, but I want to know whether I can split one A3 page into multiple PDF files?

You don't want to split a page, you want to tile it.
This is explained in Chapter 6 of my book (section 6.2.3). Take a look at the TilingHero example (Java / C#). In this example, one large page (hero.pdf) is split into a PDF with several A4 pages (superman.pdf).
This is some code:
PdfReader reader = new PdfReader(resource);
Rectangle pagesize = reader.GetPageSizeWithRotation(1);
using (Document document = new Document(pagesize)) {
// step 2
PdfWriter writer = PdfWriter.GetInstance(document, ms);
// step 3
document.Open();
// step 4
PdfContentByte content = writer.DirectContent;
PdfImportedPage page = writer.GetImportedPage(reader, 1);
// adding the same page 16 times with a different offset
float x, y;
for (int i = 0; i < 16; i++) {
x = -pagesize.Width * (i % 4);
y = pagesize.Height * (i / 4 - 3);
content.AddTemplate(page, 4, 0, 0, 4, x, y);
document.NewPage();
}
}
The math is valid for an A0 page. You need to adapt it for an A3 page (meaning: the math you need is way easier to do).
You need to calculate pagesize so that it results in smaller pages, and then use something like this:
using (Document document = new Document(pagesize)) {
// step 2
PdfWriter writer = PdfWriter.GetInstance(document, ms);
// step 3
document.Open();
// step 4
PdfContentByte content = writer.DirectContent;
PdfImportedPage page = writer.GetImportedPage(reader, 1);
// adding the same page 16 times with a different offset
float x, y;
for (int i = 0; i < 16; i++) {
x = -pagesize.Width * (i % 4);
y = pagesize.Height * (i / 4 - 3);
content.AddTemplate(page, x, y); // we don't scale anymore
document.NewPage();
}
}
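To adapt the math to another grid, the two offset formulas can be written with the number of columns and rows as parameters. This is a minimal sketch in plain Java with no iText calls; `TileGrid` and its methods are hypothetical names, and the A3 size of 842 x 1191 pt is an assumed value. Tiles are numbered row by row, starting at the top left of the original page.

```java
// Hypothetical helper, not part of iText: pure arithmetic only.
final class TileGrid {
    /** X offset for tile i in a grid with `cols` columns. */
    static float offsetX(int i, int cols, float tileWidth) {
        return -tileWidth * (i % cols);
    }

    /** Y offset for tile i; row 0 is the top row of the original page. */
    static float offsetY(int i, int cols, int rows, float tileHeight) {
        return -tileHeight * (rows - 1 - i / cols);
    }

    public static void main(String[] args) {
        // Assumed A3 portrait size in points, cut into two landscape halves.
        float a3Width = 842f, a3Height = 1191f;
        int cols = 1, rows = 2;
        float tileWidth = a3Width / cols;
        float tileHeight = a3Height / rows;
        for (int i = 0; i < cols * rows; i++) {
            System.out.println("tile " + i + ": x=" + offsetX(i, cols, tileWidth)
                    + ", y=" + offsetY(i, cols, rows, tileHeight));
        }
    }
}
```

With `cols = 4` and `rows = 4` the formulas give the same values as the loop in the answer, since `-h * (3 - i / 4)` equals `h * (i / 4 - 3)`.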

itextsharp: unexpected elements on copied pages

Here is known code that splits PDF document:
try
{
FileInfo file = new FileInfo(@"d:\С.pdf");
string name = file.Name.Substring(0, file.Name.LastIndexOf("."));
// we create a reader for a certain document
PdfReader reader = new PdfReader(@"d:\С.pdf");
// we retrieve the total number of pages
int n = reader.NumberOfPages;
int digits = 1 + (n / 10);
System.Console.WriteLine("There are " + n + " pages in the original file.");
Document document;
int pagenumber;
string filename;
for (int i = 0; i < n; i++)
{
pagenumber = i + 1;
filename = pagenumber.ToString();
while (filename.Length < digits) filename = "0" + filename;
filename = "_" + filename + ".pdf";
// step 1: creation of a document-object
document = new Document(reader.GetPageSizeWithRotation(pagenumber));
// step 2: we create a writer that listens to the document
PdfWriter writer = PdfWriter.GetInstance(document, new FileStream(name + filename, FileMode.Create));
// step 3: we open the document
document.Open();
PdfContentByte cb = writer.DirectContent;
PdfImportedPage page = writer.GetImportedPage(reader, pagenumber);
int rotation = reader.GetPageRotation(pagenumber);
if (rotation == 90 || rotation == 270)
{
cb.AddTemplate(page, 0, -1f, 1f, 0, 0, reader.GetPageSizeWithRotation(pagenumber).Height);
}
else
{
cb.AddTemplate(page, 1f, 0, 0, 1f, 0, 0);
}
// step 5: we close the document
document.Close();
}
}
catch (DocumentException de)
{
System.Console.Error.WriteLine(de.Message);
}
catch (IOException ioe)
{
System.Console.Error.WriteLine(ioe.Message);
}
Here is the top-left corner of one of the split pages:
You can see unexpected lines and circles there (and in the other corners). How can I avoid them?

As explained many times before (ITextSharp include all pages from the input file, Itext pdf Merge : Document overflow outside pdf (Text truncated) page and not displaying, and so on), you should read chapter 6 of my book iText in Action (you can find the C# version of the examples here).
You are using a combination of Document, PdfWriter and PdfImportedPage to split a PDF. Please tell me who made you do it this way, so that I can curse the person who inspired you (because I've answered this question hundreds of times before, and I'm getting tired of repeating myself). These classes aren't a good choice for that job:
you lose all interactivity,
you need to rotate the content yourself if the page is in landscape (you already discovered this),
you need to take the original page size into account,
...
Your problem is similar to this one Itext pdf Merge : Document overflow outside pdf (Text truncated) page and not displaying. Apparently the original document you're trying to split contains a MediaBox and a CropBox. When you look at your original document, only the content inside the CropBox is shown. When you look at your copy, the content inside the MediaBox is shown, unveiling "printer marks". These printer marks show where the page needs to be cut in a publishing environment. When printing books or magazines, the pages on which content is printed are usually bigger than the final page. The extra content is cut off before assembling the book or magazine.
Long story short: read the documentation, replace PdfWriter with PdfCopy, replace AddTemplate() with AddPage().

Itextsharp: Adjust 2 elements on exactly one page

So, I'm having this problem using C# (.NET 4.0 + WinForms) and iTextSharp 5.1.2.
I have some scanned images stored on a DB and need to build on the fly PDF with those images. Some files have just one page and other ones hundreds. That is working just fine using:
foreach (var page in pages)
{
Image pageImage = Image.GetInstance(page.Image);
pageImage.ScaleToFit(document.PageSize.Width,document.PageSize.Height);
pageImage.Alignment = Image.ALIGN_TOP | Image.ALIGN_CENTER;
document.Add(pageImage);
document.NewPage();
//...
}
The problem is:
I need to add a small table at the bottom of the last page.
I tried:
foreach (var page in pages)
{
Image pageImage = Image.GetInstance(page.Image);
pageImage.ScaleToFit(document.PageSize.Width,document.PageSize.Height);
pageImage.Alignment = Image.ALIGN_TOP | Image.ALIGN_CENTER;
document.Add(pageImage);
document.NewPage();
//...
}
Table t = new table....
document.Add(t);
The table is added successfully, but if the image fills the page of the document, the table ends up on the next page.
I need to resize the last image of the document (or the first one, if there is only one) so that the table goes directly on that page together with the image and both occupy exactly one page.
I tried to scale the image by a percentage, but the size of the image on the last page is unknown and it must fill as much of the page as possible, so I need to do this dynamically.
Any ideas?

Let me give you a couple of things that might help you and then I'll give you a full working example that you should be able to customize.
The first thing is that the PdfPTable has a special method called WriteSelectedRows() that allows you to draw a table at an exact x,y coordinate. It has six overloads but the most commonly used one is probably:
PdfPTable.WriteSelectedRows(int rowStart,int rowEnd, float xPos, float yPos, PdfContentByte canvas)
To place a table with the upper left corner positioned at 400,400 you would call:
t.WriteSelectedRows(0, t.Rows.Count, 400, 400, writer.DirectContent);
Before calling this method you are required to set the table's width using SetTotalWidth() first:
//Set these to your absolute column width(s), whatever they are.
t.SetTotalWidth(new float[] { 200, 300 });
The second thing is that the height of the table isn't known until the entire table is rendered. This means that you can't know exactly where to place a table so that it truly is at the bottom. The solution to this is to render the table to a temporary document first and then calculate the height. Below is a method that I use to do this:
public static float CalculatePdfPTableHeight(PdfPTable table)
{
using (MemoryStream ms = new MemoryStream())
{
using (Document doc = new Document(PageSize.TABLOID))
{
using (PdfWriter w = PdfWriter.GetInstance(doc, ms))
{
doc.Open();
table.WriteSelectedRows(0, table.Rows.Count, 0, 0, w.DirectContent);
doc.Close();
return table.TotalHeight;
}
}
}
}
This can be called like this:
PdfPTable t = new PdfPTable(2);
//In order to use WriteSelectedRows you need to set the width of the table
t.SetTotalWidth(new float[] { 200, 300 });
t.AddCell("Hello");
t.AddCell("World");
t.AddCell("Test");
t.AddCell("Test");
float tableHeight = CalculatePdfPTableHeight(t);
So with all of that here's a full working WinForms example targeting iTextSharp 5.1.1.0 (I know you said 5.1.2 but this should work just the same). This sample looks for all JPEGs in a folder on the desktop called "Test". It then adds them to an 8.5"x11" PDF. Then on the last page of the PDF, or if there's only 1 JPEG to start with on the only page, it expands the height of the PDF by however tall the table that we're adding is and then places the table at the bottom left corner. See the comments in the code itself for further explanation.
using System;
using System.Text;
using System.Windows.Forms;
using iTextSharp.text;
using iTextSharp.text.pdf;
using System.IO;
namespace Full_Profile1
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
public static float CalculatePdfPTableHeight(PdfPTable table)
{
//Create a temporary PDF to calculate the height
using (MemoryStream ms = new MemoryStream())
{
using (Document doc = new Document(PageSize.TABLOID))
{
using (PdfWriter w = PdfWriter.GetInstance(doc, ms))
{
doc.Open();
table.WriteSelectedRows(0, table.Rows.Count, 0, 0, w.DirectContent);
doc.Close();
return table.TotalHeight;
}
}
}
}
private void Form1_Load(object sender, EventArgs e)
{
//Create our table
PdfPTable t = new PdfPTable(2);
//In order to use WriteSelectedRows you need to set the width of the table
t.SetTotalWidth(new float[] { 200, 300 });
t.AddCell("Hello");
t.AddCell("World");
t.AddCell("Test");
t.AddCell("Test");
//Calculate true height of the table so we can position it at the document's bottom
float tableHeight = CalculatePdfPTableHeight(t);
//Folder that we are working in
string workingFolder = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "Test");
//PDF that we are creating
string outputFile = Path.Combine(workingFolder, "Output.pdf");
//Get an array of all JPEGs in the folder
String[] AllImages = Directory.GetFiles(workingFolder, "*.jpg", SearchOption.TopDirectoryOnly);
//Standard iTextSharp PDF init
using (FileStream fs = new FileStream(outputFile, FileMode.Create, FileAccess.Write, FileShare.None))
{
using (Document document = new Document(PageSize.LETTER))
{
using (PdfWriter writer = PdfWriter.GetInstance(document, fs))
{
//We do not want any margins in the document probably
//(set before Open() so that the first page uses them as well)
document.SetMargins(0, 0, 0, 0);
//Open our document for writing
document.Open();
//Declare here, init in loop below
iTextSharp.text.Image pageImage;
//Loop through each image
for (int i = 0; i < AllImages.Length; i++)
{
//If we only have one image or we are on the last one
if ((AllImages.Length == 1) || (i == (AllImages.Length - 1)))
{
//Increase the size of the page by the height of the table
document.SetPageSize(new iTextSharp.text.Rectangle(0, 0, document.PageSize.Width, document.PageSize.Height + tableHeight));
}
//Add a new page to the PDF
document.NewPage();
//Create our image instance
pageImage = iTextSharp.text.Image.GetInstance(AllImages[i]);
pageImage.ScaleToFit(document.PageSize.Width, document.PageSize.Height);
pageImage.Alignment = iTextSharp.text.Image.ALIGN_TOP | iTextSharp.text.Image.ALIGN_CENTER;
document.Add(pageImage);
//If we only have one image or we are on the last one
if ((AllImages.Length == 1) || (i == (AllImages.Length - 1)))
{
//Draw the table to the bottom left corner of the document
t.WriteSelectedRows(0, t.Rows.Count, 0, tableHeight, writer.DirectContent);
}
}
//Close document for writing
document.Close();
}
}
}
this.Close();
}
}
}
EDIT
Below is an edit based on your comments. I'm only posting the contents of the for loop which is the only part that changed. When calling ScaleToFit you just need to take tableHeight into account.
//Loop through each image
for (int i = 0; i < AllImages.Length; i++)
{
//Add a new page to the PDF
document.NewPage();
//Create our image instance
pageImage = iTextSharp.text.Image.GetInstance(AllImages[i]);
//If we only have one image or we are on the last one
if ((AllImages.Length == 1) || (i == (AllImages.Length - 1)))
{
//Scale based on the height of document minus the table height
pageImage.ScaleToFit(document.PageSize.Width, document.PageSize.Height - tableHeight);
}
else
{
//Scale normally
pageImage.ScaleToFit(document.PageSize.Width, document.PageSize.Height);
}
pageImage.Alignment = iTextSharp.text.Image.ALIGN_TOP | iTextSharp.text.Image.ALIGN_CENTER;
document.Add(pageImage);
//If we only have one image or we are on the last one
if ((AllImages.Length == 1) || (i == (AllImages.Length - 1)))
{
//Draw the table to the bottom left corner of the document
t.WriteSelectedRows(0, t.Rows.Count, 0, tableHeight, writer.DirectContent);
}
}
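The effect of subtracting tableHeight before scaling can be checked with a few numbers. The sketch below is plain Java with no iTextSharp calls; `FitBox` is a hypothetical name and the image size and table height are made-up sample values. It mirrors what a fit-inside-a-box scaling does: the image keeps its aspect ratio and is scaled by the smaller of the two ratios, so reserving room for the table simply shrinks the box height.

```java
// Hypothetical helper, not part of iTextSharp: pure arithmetic only.
final class FitBox {
    /** Uniform scale factor that fits an image inside a box, keeping its aspect ratio. */
    static float scale(float imgWidth, float imgHeight, float boxWidth, float boxHeight) {
        return Math.min(boxWidth / imgWidth, boxHeight / imgHeight);
    }

    public static void main(String[] args) {
        float pageWidth = 612f, pageHeight = 792f; // US Letter in points
        float tableHeight = 40f;                   // sample value, normally measured from the table
        float imgWidth = 1000f, imgHeight = 2000f; // sample image size

        // Leave room for the table at the bottom of the last page.
        float s = scale(imgWidth, imgHeight, pageWidth, pageHeight - tableHeight);
        System.out.println("scaled image: " + (imgWidth * s) + " x " + (imgHeight * s));
        System.out.println("space left below image: " + (pageHeight - imgHeight * s));
    }
}
```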

Just call table.CalculateHeights() if you want to know the height of the table.
