I've created two form-fillable PDFs, one to be used as a customer order form and the other to be used in-house as a production sheet. Both PDFs have identical fields (same name and type for each field). I've written an app that (among several other things) uses iTextSharp to read all of the fields in a given customer order form, creates a new production sheet, and fills in all of the data from the order form. This all works smoothly for the text and date fields (string data). However, there is one image field on each PDF, and I need to take the image from the image field on the order form and copy it to the image field on the production sheet. This is where I'm getting hung up.
I can use pr.AcroFields.GetFieldItem("imageFieldName"); to get the image as an AcroFields.Item object, but I can't seem to get iTextSharp to let me put that into an image field using something like the PdfStamper.AcroFields.SetField() method, since it will only take a string.
Is there perhaps a way to take that image data and store it as a temporary .jpg or .bmp file, then insert that into the production sheet's image field? Or am I going about this all wrong?
As already said in a comment, the PDF format does not have any image fields. Some PDF designers let you emulate them using, for example, a button plus some JavaScript. But as the field is merely emulated, there is no image value. This is indeed the case for your two documents.
To retrieve the image from the source form button, therefore, we cannot take the button value; instead we have to extract the image from the button appearance. We do this using the iText parser namespace classes with a custom render listener class, ImageRenderListener, that collects bitmap images.
Likewise, to set the image on the target form button, we cannot simply set the button value; we have to set the button appearance. We do this using the iText AcroFields methods GetNewPushbuttonFromField and ReplacePushbuttonField.
The ImageRenderListener render listener class
All this render listener does is collect bitmap images:
using System;
using System.Collections.Generic;
using iTextSharp.text.pdf.parser;

// Collects every bitmap image the content stream processor encounters.
public class ImageRenderListener : IRenderListener
{
    public List<System.Drawing.Image> Images = new List<System.Drawing.Image>();

    public void BeginTextBlock()
    { }

    public void EndTextBlock()
    { }

    public void RenderText(TextRenderInfo renderInfo)
    { }

    public void RenderImage(ImageRenderInfo renderInfo)
    {
        PdfImageObject imageObject = renderInfo.GetImage();
        if (imageObject == null)
        {
            Console.WriteLine("Image {0} could not be read.", renderInfo.GetRef().Number);
        }
        else
        {
            Images.Add(imageObject.GetDrawingImage());
        }
    }
}
A Copy method for the image
This method retrieves the first image from the source reader form element and adds it to the target stamper form element:
void Copy(PdfReader source, string sourceButton, PdfStamper target, string targetButton)
{
    // Parse the normal appearance stream of the source button and collect its images
    PdfStream xObject = (PdfStream) PdfReader.GetPdfObjectRelease(source.AcroFields.GetNormalAppearance(sourceButton));
    PdfDictionary resources = xObject.GetAsDict(PdfName.RESOURCES);
    ImageRenderListener strategy = new ImageRenderListener();
    PdfContentStreamProcessor processor = new PdfContentStreamProcessor(strategy);
    processor.ProcessContent(ContentByteUtils.GetContentBytesFromContentObject(xObject), resources);

    // Take the first image found (First() requires System.Linq)
    System.Drawing.Image drawingImage = strategy.Images.First();
    Image image = Image.GetInstance(drawingImage, drawingImage.RawFormat);

    // Rebuild the target button with the image as its appearance
    PushbuttonField button = target.AcroFields.GetNewPushbuttonFromField(targetButton);
    button.Image = image;
    target.AcroFields.ReplacePushbuttonField(targetButton, button.Field);
}
An example
I filled an image into the source document using Adobe Acrobat Reader and saved this document as Customer Order Form-Willi.pdf.
Then I applied the above copy method:
String source = @"Customer Order Form-Willi.pdf";
String dest = @"Production Sheet.pdf";
String target = @"Production Sheet-withImage.pdf";
using (PdfReader sourceReader = new PdfReader(source))
using (PdfReader destReader = new PdfReader(dest))
using (PdfStamper targetStamper = new PdfStamper(destReader, File.Create(target), (char)0, true))
{
    Copy(sourceReader, "proofImage", targetStamper, "proofImage");
}
The result in Production Sheet-withImage.pdf:
Some words of warning
The code above is very optimistic and contains no plausibility checks. For production you should definitely make it more defensive and check for null values, empty lists, etc.
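As a rough illustration of what "more defensive" could look like, here is a sketch of a variant that returns false instead of throwing when something is missing. The class name ButtonImageCopier and the method name TryCopy are made up for this example; it relies on the ImageRenderListener class shown earlier and uses only the iTextSharp calls already used in Copy. Treat it as a starting point under those assumptions, not a tested implementation.

```csharp
using System.Linq;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

public static class ButtonImageCopier
{
    // Hypothetical defensive variant of Copy: reports failure via the return value.
    public static bool TryCopy(PdfReader source, string sourceButton, PdfStamper target, string targetButton)
    {
        // No appearance reference means the source field is missing or has no appearance
        var appearanceRef = source.AcroFields.GetNormalAppearance(sourceButton);
        if (appearanceRef == null) return false;

        // The appearance must be a stream with its own resources to be parsed
        PdfStream xObject = PdfReader.GetPdfObjectRelease(appearanceRef) as PdfStream;
        if (xObject == null) return false;
        PdfDictionary resources = xObject.GetAsDict(PdfName.RESOURCES);
        if (resources == null) return false;

        ImageRenderListener strategy = new ImageRenderListener();
        PdfContentStreamProcessor processor = new PdfContentStreamProcessor(strategy);
        processor.ProcessContent(ContentByteUtils.GetContentBytesFromContentObject(xObject), resources);

        // The button may simply not contain an image yet
        System.Drawing.Image drawingImage = strategy.Images.FirstOrDefault();
        if (drawingImage == null) return false;

        // The target field may be missing or may not be a pushbutton
        PushbuttonField button = target.AcroFields.GetNewPushbuttonFromField(targetButton);
        if (button == null) return false;

        button.Image = Image.GetInstance(drawingImage, drawingImage.RawFormat);
        target.AcroFields.ReplacePushbuttonField(targetButton, button.Field);
        return true;
    }
}
```

A caller could then log or skip the order form when TryCopy returns false rather than aborting the whole batch.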
Related
https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf
This is my first time trying Magick.NET-Q16-AnyCPU, and I've run into an issue where I think I just don't know the correct method to use. I hope someone can point me in the right direction. I have a PDF with some existing text and tables, plus redline text added with PDF editing software.
Attached picture #1: depending on which software I use to view the PNG, some viewers show a black background. I would like the background to show as white or, preferably, as transparent in all software. For a page containing the redline text, the redline text is shown correctly but the other contents are not.
Attached picture #2: if I use settings.UseMonochrome = true; this is no longer an issue, but the resulting image is not what I prefer. It may sound silly, but with settings.UseMonochrome = false; the result is easier on my eyes, especially when there are graphics and tables.
Can I set settings.UseMonochrome = false; and still get all the contents shown correctly? And can I ensure the PNG background is either removed or set to white? I already tried image.BackgroundColor = MagickColors.White; but this does not affect the result.
private void button1_Click(object sender, EventArgs e)
{
    var settings = new MagickReadSettings();
    settings.Density = new Density(100, 100);
    //Do not want to use this
    //settings.UseMonochrome = true;
    var filename = @"C:\Users\USER1\Desktop\dummy.pdf";
    using (var images = new MagickImageCollection())
    {
        if(System.IO.File.Exists(filename))
        {
            images.Read(filename, settings);
            var page = 1;
            foreach (var image in images)
            {
                image.Write("C:\\Users\\USER1\\Desktop\\dummy.PNG");
                page++;
            }
        }
    }
}
I would like to get a map from an online source (whether that be Google Maps, OpenStreetMap, or other) and be able to (1) display it on a form, and (2) save it as an image. These two functions are separate.
After a bit of research I concluded that the GMap.NET.GMapControl was probably the best way of doing this and I implemented this method. However, I have hit a snag when trying to save the image.
I am saving the image by generating a jpeg from the control using its ToImage() method. This works, but only when the map is visible on screen. In my application I need to be able to generate the image without rendering it to screen.
If the GMapControl is not visible, the jpeg is just a black rectangle. The test code below demonstrates this. The form contains two GMapControl controls. If both are visible I get two identical jpeg images. If one is hidden, the corresponding jpeg is blank.
Is there a way I can get the map image using a GMapControl without plotting it to screen? Or should I take a different approach and use something else? The more lightweight the better as far as I am concerned.
(My first attempt was using the WebBrowser control. I moved on from this because I was getting all the borders etc. as well as the map. I tried to exclude everything but the div containing the map, but then I lost everything; I suspect this may have been because the map div was nested and I was hiding its parent...)
public partial class testForm : Form
{
    public testForm()
    {
        InitializeComponent();
    }

    private void testForm_Shown(Object sender, EventArgs e)
    {
        gMapControl2.Hide(); // this results in a blank jpg image for gMapControl2
        // Plot the same map to both gMapControls...
        PlotMap(gMapControl1);
        PlotMap(gMapControl2);
        // Excuse the clunky wait method here; it was due to a 'cross-thread' error when using the event raised by the gMapControl
        // It serves the purpose here.
        Task.Factory.StartNew(() => { Task.Delay(5000).Wait(); }).Wait(); // wait for 5 seconds to give maps plenty of time to render
        WriteBitmap(gMapControl1, $@"E:\Test_gMapControl1.jpg");
        WriteBitmap(gMapControl2, $@"E:\Test_gMapControl2.jpg");
    }

    private void PlotMap(GMapControl gMapControl)
    {
        gMapControl.MapProvider = GoogleMapProvider.Instance;
        GMaps.Instance.Mode = AccessMode.ServerOnly;
        gMapControl.ShowCenter = false;
        gMapControl.MinZoom = 1;
        gMapControl.MaxZoom = 25;
        gMapControl.Zoom = 10;
        gMapControl.Position = new PointLatLng(10, 10); // centered on 10 lat, 10 long
    }

    private void WriteBitmap(GMapControl gMapControl, string filename)
    {
        Image b = gMapControl.ToImage();
        b.Save(filename, ImageFormat.Jpeg);
    }
}
I'm creating a FixedDocument by adding FixedPages to PageContents, then adding them to the FixedDocument, roughly like this:
FixedDocument fd = new FixedDocument();
// add FixedPages in PageContent to fd
Printing them with a PrintDialog, like this
pdialog.PrintDocument(fd.DocumentPaginator, "Test");
results in the correct number of pages. However, every page printed - e.g. to a PDF - is the content of the first page.
I tried testing the ImageSources I add to the FixedPages, those seem correct. I also tested the final FixedDocument with a DocumentViewer like so
Window wnd = new Window();
DocumentViewer viewer = new DocumentViewer();
viewer.Document = fd;
wnd.Content = viewer;
try
{
wnd.Show();
}
catch(Exception e)
{
Console.WriteLine(e.ToString());
}
Strangely, this shows the correct output I would expect. What's even stranger is that I get an IOException after wnd.Show(); (which is why I surrounded it with a try/catch). Even with the try/catch I can only view it for maybe 1-2 seconds before the same IOException is thrown by my MainWindow. The message is something like "Wrong username or password", which doesn't make sense, since the images I'm trying to print are local ones.
Putting the DocumentViewer aside, my problem with the Print() method printing only the first page n times (n being the number of actual pages it should print) still persists. I just thought that the exception in the DocumentViewer might give someone an idea of an underlying problem.
This might be a duplicate of "FixedDocument always print first page"; however, that question doesn't mention problems with DocumentViewer, and it remains unanswered.
Thanks in advance for any help!
I have had a similar issue when printing labels in a FixedDocument from a list of data that contains a list of image sources (user photos) and that also dynamically creates a QR code image from an integer for the user's ID.
The layout of each label comes from a custom UserControl that I used to position the text fields and images. When I viewed the created document in the DocumentViewer control, it displayed perfectly: correct photo image and correct QR code image for each label. However, when I printed the document (or saved it to a PDF or XPS file), every label had only the first image in both the photo and QR code positions.
When I came across this post, I thought I would try saving and then reloading the images as suggested, and this worked! However, the IO overhead for 30 labels per page, and many pages of labels, meant that this wasn't a very useful workaround.
I then found that simply converting the ImageSource to a byte array and back again before adding it to the FixedDocument also worked, but without the added IO overhead. Not massively elegant, but this has been a real headache for me for a week now!
Here is a snippet of code from the main body of the method that builds the labels:
var qr = GetQRCodeImage(playerId); // Gets ImageSource
var ph = LoadImage(data[dataIndex].Photo); // Gets ImageSource
var qrCode = FixDocumentCacheImageBugFix(qr); // Gets ImageSource
if (ph != null) {
    var photo = FixDocumentCacheImageBugFix(ph);
    label = new AveryBarcodeLabel(line1, line2, line3, qrCode, photo); // Calls constructor to instantiate new Label with new ImageSources
}
else {
    label = new AveryBarcodeLabel(line1, line2, line3, qrCode); // Calls constructor to instantiate new Label with new ImageSources (where photo is null)
}
and here are the methods I used to "Fix" the Images
public static ImageSource FixDocumentCacheImageBugFix(ImageSource image) {
    var bytes = ImageSourceToBytes(image);
    return ByteToImage(bytes);
}

public static ImageSource ByteToImage(byte[] imageData) {
    var biImg = new BitmapImage();
    var ms = new MemoryStream(imageData);
    biImg.BeginInit();
    biImg.StreamSource = ms;
    biImg.EndInit();
    ImageSource imgSrc = biImg;
    return imgSrc;
}

public static byte[] ImageSourceToBytes(ImageSource imageSource) {
    byte[] bytes = null;
    var bitmapSource = imageSource as BitmapSource;
    if (bitmapSource != null) {
        var encoder = new JpegBitmapEncoder();
        encoder.Frames.Add(BitmapFrame.Create(bitmapSource));
        using (var stream = new MemoryStream()) {
            encoder.Save(stream);
            bytes = stream.ToArray();
        }
    }
    return bytes;
}
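One possible variation on this round trip, offered only as a sketch: encode to PNG instead of JPEG so the copy is lossless, and load with BitmapCacheOption.OnLoad so the decoded image no longer depends on the stream and the stream can be disposed. The class and method names (ImageRoundTrip, Detach) are invented for this example, and I have not verified that this variant cures the printing problem in the same way.

```csharp
using System.IO;
using System.Windows.Media;
using System.Windows.Media.Imaging;

public static class ImageRoundTrip
{
    // Hypothetical helper: returns an in-memory copy of the image that is
    // independent of the original source.
    public static ImageSource Detach(ImageSource image)
    {
        var bitmapSource = image as BitmapSource;
        if (bitmapSource == null) return image; // nothing we can re-encode

        var encoder = new PngBitmapEncoder(); // PNG is lossless, unlike JPEG
        encoder.Frames.Add(BitmapFrame.Create(bitmapSource));

        using (var stream = new MemoryStream())
        {
            encoder.Save(stream);
            stream.Position = 0; // rewind before decoding

            var copy = new BitmapImage();
            copy.BeginInit();
            copy.CacheOption = BitmapCacheOption.OnLoad; // decode fully during EndInit
            copy.StreamSource = stream;
            copy.EndInit();
            copy.Freeze(); // make the copy immutable and usable from other threads
            return copy;
        }
    }
}
```

It would slot in where FixDocumentCacheImageBugFix is called, for example var qrCode = ImageRoundTrip.Detach(qr);.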
So, this isn't really the answer to why it happened, but I found at least the culprit: my image.
I'm loading a multipage LZW-compressed TIFF like so:
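// tiffPath is a placeholder for the absolute file path; the decoder options shown are assumptions
TiffBitmapDecoder decoder = new TiffBitmapDecoder(new Uri(tiffPath), BitmapCreateOptions.PreservePixelFormat, BitmapCacheOption.Default);
foreach (ImageSource frame in decoder.Frames)
{
    frame.Freeze();
    Images.Add(frame);
}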
where Images is a collection of ImageSource. They display fine in the application, and I can also save them again using a TiffBitmapEncoder. However, printing them using WPF produces the problem described in the question, and using a DocumentViewer produces an exception telling me about 'wrong username or password', which doesn't make sense.
I found out that the image was the problem by temporarily saving the individual ImageSources of the TIFF using a PngBitmapEncoder and immediately reloading the pages from the separate files with a PngBitmapDecoder into the same slots in my Images collection.
Since this works without any issues (no username/password exception in my DocumentViewer, and printing works correctly), I have to assume that WPF doesn't like something about the TIFF format.
This doesn't answer my underlying question of why it didn't work, but since this is at least a workaround that works, I'll put it here and won't mark the question as answered just yet.
Maybe someone knows why my TIFF ImageSource produced those strange results?
I'm trying to do something when I click the image displayed inside pictureBox48.
The PictureBox is loaded with this code:
string imgpath = @"img\256.png";
pictureBox48.Image = Image.FromFile(imgpath);
Then control is returned to me so I can see that the picture loaded correctly.
Then I click the picture:
public void pictureBox48_Click(object sender, EventArgs e)
{
    string variable1 = pictureBox48.ImageLocation;
    Form3 fo = new Form3(variable1);
    fo.ShowDialog();
}
This doesn't work. When I debug the code I see that variable1 stays null, that is, pictureBox48.ImageLocation is null. Why is that? Shouldn't it be the path to the image that is assigned there?
You can't get the image path when you set the image using the Image property because you are assigning an Image object which can come from different sources.
Set the image using ImageLocation.
string imgpath = @"img\256.png";
pictureBox48.ImageLocation = imgpath;
When you click the PictureBox you can get the path using the same property:
public void pictureBox48_Click(object sender, EventArgs e)
{
    string variable1 = pictureBox48.ImageLocation;
    Form3 fo = new Form3(variable1);
    fo.ShowDialog();
}
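If the image has to be loaded through Image.FromFile for some reason, one alternative (a sketch, not part of the answer above) is to keep the path alongside the control in its Tag property and read it back in the click handler. The form class name TagDemoForm and the message box are only for illustration.

```csharp
using System;
using System.Drawing;
using System.Windows.Forms;

// Hypothetical demo form: stores the image path in PictureBox.Tag.
public class TagDemoForm : Form
{
    private readonly PictureBox pictureBox48 = new PictureBox { Dock = DockStyle.Fill };

    public TagDemoForm(string imgPath)
    {
        Controls.Add(pictureBox48);

        // Load the image as in the question, and remember where it came from
        pictureBox48.Image = Image.FromFile(imgPath);
        pictureBox48.Tag = imgPath;

        pictureBox48.Click += (sender, e) =>
        {
            // Tag is typed as object, so convert it back to a string
            string path = pictureBox48.Tag as string;
            MessageBox.Show(path ?? "(no path stored)");
        };
    }
}
```

In the original handler this would amount to replacing pictureBox48.ImageLocation with pictureBox48.Tag as string.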
When dealing with Image or PictureBox, I would recommend not relying on something like the location or path of the image. Suppose the user removes the file from the hard drive after the image has been loaded: you are left with code that fails wherever it uses the path.
That's why you should rely on the Image itself, as it contains all the information about the image, such as pixel format, width, height and raw pixel data.
I would recommend that you just copy the image, not the path to the file.
This piece of code should give you a hint:
pictureBox48.Image = Image.FromFile(imgPath);
// The line above assumes that the image file is still on the hard drive and accessible.
// Now suppose the user deletes that file: you still have the pixel data, but the path is no longer valid.
Image copyImage = (Image)pictureBox48.Image.Clone();
Form3 fo = new Form3(copyImage); // change the Form3 constructor to accept an Image
fo.ShowDialog();
I have extracted the link values from a PDF file, such as http://google.com,
but I need to get the anchor text, for example "click here".
How can I get the anchor text of a link?
I extracted the URL values from the PDF file using the approach from this question: Reading hyperlinks from pdf file
For example:
Anchor a = new Anchor("Test Anchor");
a.Reference = "http://www.google.com";
myParagraph.Add(a);
Here I get http://www.google.com, but I need to get the anchor text, i.e. Test Anchor.
I would appreciate your suggestions.
From the PDF file you need to identify the region where the link is placed and then read the text below the link using iTextSharp.
This way you can extract the text underneath the link. The limitation of this approach is that if the link region is wider than the text, the extraction will read the full text under that region.
private void GetAllHyperlinksFromPDFDocument(string pdfFilePath)
{
    string linkTextBuilder = "";
    string linkReferenceBuilder = "";
    PdfDictionary PageDictionary = default(PdfDictionary);
    PdfArray Annots = default(PdfArray);
    PdfReader R = new PdfReader(pdfFilePath);
    List<BinaryHyperlink> ret = new List<BinaryHyperlink>();
    //Loop through each page
    for (int i = 1; i <= R.NumberOfPages; i++)
    {
        //Get the current page
        PageDictionary = R.GetPageN(i);
        //Get all of the annotations for the current page
        Annots = PageDictionary.GetAsArray(PdfName.ANNOTS);
        //Make sure we have something
        if ((Annots == null) || (Annots.Length == 0))
            continue;
        //Loop through each annotation
        foreach (PdfObject A in Annots.ArrayList)
        {
            //Convert the itext-specific object as a generic PDF object
            PdfDictionary AnnotationDictionary = (PdfDictionary)PdfReader.GetPdfObject(A);
            //Make sure this annotation has a link
            if (!AnnotationDictionary.Get(PdfName.SUBTYPE).Equals(PdfName.LINK))
                continue;
            //Make sure this annotation has an ACTION
            if (AnnotationDictionary.Get(PdfName.A) == null)
                continue;
            //Get the ACTION for the current annotation
            PdfDictionary AnnotationAction = (PdfDictionary)AnnotationDictionary.GetAsDict(PdfName.A);
            if (AnnotationAction.Get(PdfName.S).Equals(PdfName.URI))
            {
                //Get action link URL : linkReferenceBuilder
                PdfString Link = AnnotationAction.GetAsString(PdfName.URI);
                if (Link != null)
                    linkReferenceBuilder = Link.ToString();
                //Get action link text : linkTextBuilder
                var LinkLocation = AnnotationDictionary.GetAsArray(PdfName.RECT);
                List<string> linestringlist = new List<string>();
                iTextSharp.text.Rectangle rect = new iTextSharp.text.Rectangle(((PdfNumber)LinkLocation[0]).FloatValue, ((PdfNumber)LinkLocation[1]).FloatValue, ((PdfNumber)LinkLocation[2]).FloatValue, ((PdfNumber)LinkLocation[3]).FloatValue);
                RenderFilter[] renderFilter = new RenderFilter[1];
                renderFilter[0] = new RegionTextRenderFilter(rect);
                ITextExtractionStrategy textExtractionStrategy = new FilteredTextRenderListener(new LocationTextExtractionStrategy(), renderFilter);
                linkTextBuilder = PdfTextExtractor.GetTextFromPage(R, i, textExtractionStrategy).Trim();
            }
        }
    }
}
Unfortunately I don't think you're going to be able to do this, at least not without a lot of guesswork. In HTML this would be easy because a hyperlink and its text are stored together as:
<a href="...">Click here</a>
However, in a PDF these two entities are not stored with any form of relationship. What we think of as a "hyperlink" within a PDF is technically a PDF Annotation that just happens to be sitting on top of text. You can see this by opening a PDF in an editing program such as Adobe Acrobat Pro. You can change the text but the "clickable" area doesn't change. You can also move and resize the "clickable" area and put it anywhere in the document.
When creating PDFs, iText/iTextSharp abstract this away so you don't have to think about this. You can create a "hyperlink" with clickable text but when it generates a PDF it ultimately will create the text as normal text, calculate the rectangle coordinates and then put an annotation at that rectangle.
I did say that you could try to guess at this, and it might or might not work for you. To do this you'd need to get the rectangle for annotation and then find the text that's also at those coordinates. It won't be an exact match, however, because of padding issues. If you absolutely have to get the text under a hyperlink then this is the only way that I know of for doing this. Good luck!
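To make the padding point a little more concrete, here is a small sketch of a helper that normalises an annotation rectangle and grows (or, with a negative value, shrinks) it by a fixed margin before it is used as a text-extraction region. The names LinkRectangles and Inflate and the padding value are assumptions for illustration only; how much padding works, if any, depends on the PDF.

```csharp
using System;

public static class LinkRectangles
{
    // Hypothetical helper: returns { llx, lly, urx, ury } with the corners put in
    // lower-left / upper-right order and moved outward by 'padding' points on every side.
    public static float[] Inflate(float x1, float y1, float x2, float y2, float padding)
    {
        float llx = Math.Min(x1, x2) - padding;
        float lly = Math.Min(y1, y2) - padding;
        float urx = Math.Max(x1, x2) + padding;
        float ury = Math.Max(y1, y2) + padding;
        return new[] { llx, lly, urx, ury };
    }
}
```

The four values could then be passed to new iTextSharp.text.Rectangle(r[0], r[1], r[2], r[3]) and on to a region filter. A larger region is more likely to pick up neighbouring text, so this remains guesswork.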