underline portion of text using iTextSharp - c#

I have an application that uses itextsharp to fill PDF form fields.
One of these fields has some text with tags. For example:
<U>This text should be underlined</U>.
I'd like the text enclosed in <U>..</U> to be underlined.
How could I do that?
How could I approach it with HTMLWorker, for example?
Here's the portion of code where I write my description:
for (int i = 0; i < linesDescription.Count; i++)
{
    int count = linesDescription[i].Count();
    int countTrim = linesDescription[i].Trim().Count();
    Chunk cnk = new Chunk(linesDescription[i] + GeneralPurpose.ReturnChar, TextStyle);
    if (firstOpe && i > MaxLinePerPage - 1)
        LongDescWrapped_dt_extra.Add(cnk);
    else
        LongDescWrapped_dt.Add(cnk);
}

Ordinary text fields do not support rich text. If you want the fields to remain interactive, you will need RichText fields. These are fields that are flagged in a way that they accept an RV value. This is explained here: Set different parts of a form field to have different fonts using iTextSharp (Note that I didn't succeed in getting this to work, but you may have better luck.)
If it is OK for you to flatten the form (i.e. remove all interactivity), please take a look at the FillWithUnderline example:
public void manipulatePdf(String src, String dest) throws DocumentException, IOException {
    PdfReader reader = new PdfReader(src);
    PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
    stamper.setFormFlattening(true);
    AcroFields form = stamper.getAcroFields();
    FieldPosition pos = form.getFieldPositions("Name").get(0);
    ColumnText ct = new ColumnText(stamper.getOverContent(pos.page));
    ct.setSimpleColumn(pos.position);
    ElementList elements = XMLWorkerHelper.parseToElementList("<div>Bruno <u>Lowagie</u></div>", null);
    for (Element element : elements) {
        ct.addElement(element);
    }
    ct.go();
    stamper.close();
}
In this example, we don't fill out the field, but we get the field's position (a page number and a rectangle). We then use ColumnText to add content at this position. As we are inputting HTML, we use XML Worker to parse the HTML into iText objects that we can add to the ColumnText object.
This is a Java example, but it should be easy to port this to C# if you know how to code in C# (which I don't).
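Since the question is about iTextSharp, here is a rough C# port of the Java example above. Treat it as a sketch rather than a verified implementation: it assumes iTextSharp 5.5.x plus the separate XML Worker assembly (itextsharp.xmlworker), and it keeps the field name "Name" and the HTML snippet from the Java version. The class name FillWithUnderlinePort is made up for this illustration.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

public static class FillWithUnderlinePort
{
    public static void ManipulatePdf(string src, string dest)
    {
        PdfReader reader = new PdfReader(src);
        using (FileStream fs = new FileStream(dest, FileMode.Create))
        {
            PdfStamper stamper = new PdfStamper(reader, fs);
            // Flattening removes the interactive field from the output
            stamper.FormFlattening = true;
            AcroFields form = stamper.AcroFields;
            // Page number and rectangle of the first widget of field "Name"
            AcroFields.FieldPosition pos = form.GetFieldPositions("Name")[0];
            ColumnText ct = new ColumnText(stamper.GetOverContent(pos.page));
            ct.SetSimpleColumn(pos.position);
            // Parse the HTML snippet into iText elements (no extra CSS)
            ElementList elements = XMLWorkerHelper.ParseToElementList(
                "<div>Bruno <u>Lowagie</u></div>", null);
            foreach (IElement element in elements)
            {
                ct.AddElement(element);
            }
            ct.Go();
            stamper.Close();
        }
        reader.Close();
    }
}
```

In the asker's case the hard-coded HTML string would be replaced by the field's description text, provided its tags are well-formed enough for XML Worker to parse.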

You can try this:
Chunk chunk = new Chunk("Underlined Text", FontFactory.GetFont(FontFactory.TIMES_ROMAN, 12.0f, iTextSharp.text.Font.BOLD | iTextSharp.text.Font.UNDERLINE));
Paragraph reportHeadline = new Paragraph(chunk);
reportHeadline.SpacingBefore = 12.0f;
pdfDoc.Add(reportHeadline);
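The snippet above underlines a whole chunk, while the question asks for only the tagged portion to be underlined. One possible way to bridge the two is to split the description into runs first. The sketch below does only that splitting, in plain C# without iTextSharp; it assumes the markers are <U> and </U> (case-insensitive) and that they are not nested. The class name UnderlineTagSplitter is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UnderlineTagSplitter
{
    // Splits text containing <U>...</U> markers into runs.
    // Key = run text, Value = true when the run was inside the tags.
    public static List<KeyValuePair<string, bool>> Split(string input)
    {
        var runs = new List<KeyValuePair<string, bool>>();
        int last = 0;
        var matches = Regex.Matches(input, "<u>(.*?)</u>",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
        foreach (Match m in matches)
        {
            // Plain text between the previous match and this one
            if (m.Index > last)
                runs.Add(new KeyValuePair<string, bool>(
                    input.Substring(last, m.Index - last), false));
            // Text inside the tags, with the tags themselves dropped
            runs.Add(new KeyValuePair<string, bool>(m.Groups[1].Value, true));
            last = m.Index + m.Length;
        }
        // Trailing plain text after the last match
        if (last < input.Length)
            runs.Add(new KeyValuePair<string, bool>(input.Substring(last), false));
        return runs;
    }
}
```

Each run could then become its own Chunk, using an underlined font like the one in the snippet above for the runs whose Value is true and the regular text style for the others.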

Related

How to query the Base Paragraph element position? in order to add Link Annotation without saving the file

I'm creating a simple PDF file with some text and a hyperlink attached to that text:
Document pdfDocument = new Document();
Page pdfPage = pdfDocument.Pages.Add();
TextFragment textFragment = new TextFragment("My Text");
Table table = new Table();
Row row = table.Rows.Add();
Cell cell = row.Cells.Add();
cell.Paragraphs.Add(textFragment);
pdfPage.Paragraphs.Add(table);
LinkAnnotation link = new LinkAnnotation(pdfPage, textFragment.Rectangle); //[Before Save]textFragment.Rectangle: 0,0,35.56,10
link.Action = new GoToURIAction("Link1 before save");
pdfPage.Annotations.Add(link);
pdfDocument.Save(dataDir + "SimplePDFWithLink.pdf");
The problem is that the link annotation is assigned to the before-save rectangle [0,0,33.56,10] at the bottom of the page, whereas the textFragment is placed at a different rectangle (I can't set the Position property here because I don't know it; it is relative to the table cell).
To solve this, I tried saving the document first and only then searching for the textFragment using TextFragmentAbsorber:
pdfDocument.Save(dataDir + "SimplePDFWithLink.pdf");
//[After Save]textFragment.Rectangle: 0,0,90,770
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber();
pdfPage.Accept(textFragmentAbsorber);
foreach (TextFragment absorbedTextFragment in textFragmentAbsorber.TextFragments)
{
    link = new LinkAnnotation(pdfPage, absorbedTextFragment.Rectangle);
    link.Action = new GoToURIAction("Link 2 after save");
    pdfPage.Annotations.Add(link);
}
pdfDocument.Save(dataDir + "SimplePDFWithLink.pdf");
My Question:
Is it possible to add a simple link to a TextFragment (which is a BaseParagraph, not a StructureElement) without saving the document first?
Here is a simple demo of the outcome; you can see that before saving the document the link is added at the bottom left of the document instead of at the text rectangle:
Update:
If I specify the TextFragment's Position value with some arbitrary values, the link is then added exactly at the text, but I don't know what the Position value of the element will be because it is built dynamically using a Table.
Working with TextFragment and TextSegment does work and adds the link without pre-saving the file:
TextFragment textFragment = new TextFragment("My Text");
TextSegment textSegment = new TextSegment("Link to File");
textSegment.Hyperlink = new Aspose.Pdf.WebHyperlink("www.google.com");
textFragment.Segments.Add(textSegment);
It is worth mentioning that this also works well when linking to a file on the user's file system, like:
textSegment.Hyperlink = new Aspose.Pdf.WebHyperlink(@"Files\foo.png");

Replace text in Word with text from C# form

I'm trying to make an application in C#. When pressing a radio button, I'd like to open a Microsoft Word document (an invoice) and replace some text with text from my Form. The Word document also contains some textboxes with text.
I've tried to implement the code written in this link Word Automation Find and Replace not including Text Boxes but when I press the radio button, a window appears asking for "the encoding that makes the document readable" and then the Word document opens and it's full of black triangles and other things instead of my initial template for the invoice.
How my invoice looks after:
Here is what I've tried:
string documentLocation = @"C:\Documents\Visual Studio 2015\Project\Invoice.doc";
private void yes_radioBtn_CheckedChanged(object sender, EventArgs e)
{
    FindReplace(documentLocation, "HotelName", "MyHotelName");
    Process process = new Process();
    process.StartInfo.FileName = documentLocation;
    process.Start();
}
private void FindReplace(string documentLocation, string findText, string replaceText)
{
    var app = new Microsoft.Office.Interop.Word.Application();
    var doc = app.Documents.Open(documentLocation);
    var range = doc.Range();
    range.Find.Execute(FindText: findText, Replace: WdReplace.wdReplaceAll, ReplaceWith: replaceText);
    var shapes = doc.Shapes;
    foreach (Shape shape in shapes)
    {
        var initialText = shape.TextFrame.TextRange.Text;
        var resultingText = initialText.Replace(findText, replaceText);
        shape.TextFrame.TextRange.Text = resultingText;
    }
    doc.Save();
    doc.Close();
    Marshal.ReleaseComObject(app);
}
So if your Word template is the same each time, you essentially:
1. Copy the template
2. Work on the copy
3. Save in the desired format
4. Delete the template copy
For each section that you are replacing within your Word document, you have to insert a bookmark at that location (the easiest way to input text in an area).
I always create a function to accomplish this, and I end up passing in the path - as well as all of the text to replace my in-document bookmarks. The function call can get long sometimes, but it works for me.
Application app = new Application();
Document doc = app.Documents.Open("sDocumentCopyPath.docx");
if (doc.Bookmarks.Exists("bookmark_1"))
{
    object oBookMark = "bookmark_1";
    // Replace the bookmark's range with the desired text
    doc.Bookmarks.get_Item(ref oBookMark).Range.Text = "My Text To Replace bookmark_1";
}
if (doc.Bookmarks.Exists("bookmark_2"))
{
    object oBookMark = "bookmark_2";
    doc.Bookmarks.get_Item(ref oBookMark).Range.Text = "My Text To Replace bookmark_2";
}
doc.ExportAsFixedFormat("myNewPdf.pdf", WdExportFormat.wdExportFormatPDF);
((_Document)doc).Close();
((_Application)app).Quit();
This code should get you up and running unless you want to pass in all the values into a function.
EDIT: If you need more examples I'm working on a blog post as well, so I have a lot more detail if this wasn't clear enough for your use case.
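The copy / work / save / delete steps listed at the top of this answer can be sketched independently of Word. The helper below is only an illustration in plain C# (System.IO); the names TemplateWorkflow and RunOnCopy are hypothetical, and the callback is where the bookmark replacement and export would go.

```csharp
using System;
using System.IO;

public static class TemplateWorkflow
{
    // Copies the template to a temporary file, hands the copy's path to
    // the caller, and deletes the copy afterwards even if the work throws.
    public static void RunOnCopy(string templatePath, Action<string> work)
    {
        string copyPath = Path.Combine(
            Path.GetTempPath(),
            Guid.NewGuid().ToString("N") + Path.GetExtension(templatePath));
        File.Copy(templatePath, copyPath);
        try
        {
            // The caller edits the copy and saves the result elsewhere
            work(copyPath);
        }
        finally
        {
            File.Delete(copyPath);
        }
    }
}
```

Because the original template is never opened for writing, a failed run cannot leave it half-edited, which is presumably the point of working on a copy in the first place.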

Check if "Continued" or "New" page

I would like to know if there is a way to check whether a new page was started because a table overflowed the page or programmatically (by calling doc.NewPage();).
If the new page was caused by an overflowing table or text, I need to hide the header table and show some text instead; if the new page was started programmatically, I need to display the header table normally.
I tried to find a flag or something similar in the "OnStartPage" event that tells me whether the page overflowed, but I found nothing.
I hope that someone can help me here.
Thanks!
I would look at the IPdfPTableEventSplit interface that you can implement and assign to a PdfPTable.TableEvent property. It has two methods, SplitTable and TableLayout. The first method is called whenever a table split happens, the second is called whenever the table actually gets written to the canvas. In the first method you could set a flag and disable the header rows if a split happened and in the second method you could write your content out.
The SplitTable method is fired before the new page is added so you need to keep track of a trinary state, "no split", "draw on next page" and "draw on this page". I've packaged these up as an enum:
[Flags]
public enum SplitState {
    None = 0,
    DrawOnNextPage = 1,
    DrawOnThisPage = 2
}
The implemented interface would look like this:
public class SplitTableWatcher : IPdfPTableEventSplit {
    /// <summary>
    /// The current table split state
    /// </summary>
    private SplitState currentSplitState = SplitState.None;

    public void SplitTable(PdfPTable table) {
        //Disable header rows for automatic splitting (per OP's request)
        table.HeaderRows = 0;
        //We now need to split on the next page, so append the flag
        this.currentSplitState |= SplitState.DrawOnNextPage;
    }

    public void TableLayout(PdfPTable table, float[][] widths, float[] heights, int headerRows, int rowStart, PdfContentByte[] canvases) {
        //If a split happened and we're on the next page
        if (this.currentSplitState.HasFlag(SplitState.DrawOnThisPage)) {
            //Draw something, nothing too special here
            var cb = canvases[PdfPTable.TEXTCANVAS];
            cb.BeginText();
            cb.SetFontAndSize(BaseFont.CreateFont(BaseFont.HELVETICA, BaseFont.CP1252, false), 18);
            //Use the table's widths and heights to find a spot, this probably could use some tweaking
            cb.SetTextMatrix(widths[0][0], heights[0]);
            cb.ShowText("A Split Happened!");
            cb.EndText();
            //Unset the draw on this page flag, it will be reset below if needed
            this.currentSplitState ^= SplitState.DrawOnThisPage;
        }
        //If we previously had the next page flag set change it to this page
        if (currentSplitState.HasFlag(SplitState.DrawOnNextPage)) {
            this.currentSplitState = SplitState.DrawOnThisPage;
        }
    }
}
And finally the actual implementation of that class with some simple test data:
var testFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "test.pdf");
using (var fs = new FileStream(testFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
    using (var doc = new Document()) {
        using (var writer = PdfWriter.GetInstance(doc, fs)) {
            doc.Open();
            var t = new PdfPTable(1);
            //Implement our class
            t.TableEvent = new SplitTableWatcher();
            //Add a single header row
            t.HeaderRows = 1;
            t.AddCell("Header");
            //Create 100 test cells
            for (var i = 1; i <= 100; i++) {
                t.AddCell(i.ToString());
            }
            doc.Add(t);
            doc.Close();
        }
    }
}
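The flag transitions in SplitTableWatcher can be traced without iTextSharp. The sketch below repeats the enum and replays the same logic in a small stand-alone class; SplitStateTracker, OnSplit and OnLayout are hypothetical names standing in for the SplitTable and TableLayout callbacks, and the call order used in the trace (a split is reported before the layout of the part that stays on the current page) is an assumption taken from the description above.

```csharp
using System;

[Flags]
public enum SplitState
{
    None = 0,
    DrawOnNextPage = 1,
    DrawOnThisPage = 2
}

public class SplitStateTracker
{
    private SplitState state = SplitState.None;

    // Stands in for SplitTable: remember that the next page is a continuation
    public void OnSplit()
    {
        state |= SplitState.DrawOnNextPage;
    }

    // Stands in for TableLayout: returns true when the "continued" text
    // should be drawn for the part of the table being laid out now
    public bool OnLayout()
    {
        bool draw = state.HasFlag(SplitState.DrawOnThisPage);
        if (draw)
        {
            // Clear the flag; it is set again below if another split is pending
            state &= ~SplitState.DrawOnThisPage;
        }
        if (state.HasFlag(SplitState.DrawOnNextPage))
        {
            state = SplitState.DrawOnThisPage;
        }
        return draw;
    }
}

public static class SplitStateDemo
{
    public static void Main()
    {
        var tracker = new SplitStateTracker();
        // A table spanning three pages: split, layout, split, layout, layout
        tracker.OnSplit();
        Console.WriteLine(tracker.OnLayout()); // first page
        tracker.OnSplit();
        Console.WriteLine(tracker.OnLayout()); // second page
        Console.WriteLine(tracker.OnLayout()); // third page
    }
}
```

Under that assumed call order, the first layout reports no continuation and the two following ones do, which matches the intent of showing the replacement text only on pages that continue a table.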

How to read hyperlinks with anchor text from a PDF file in C#

I have extracted the link values from a PDF file, such as http://google.com,
but I need the anchor text, for example "click here".
How can I get the anchor text of a link?
I got the URL values from the PDF file by following this question: Reading hyperlinks from pdf file
For example:
Anchor a = new Anchor("Test Anchor");
a.Reference = "http://www.google.com";
myParagraph.Add(a);
Here I get the http://www.google.com but I need to get anchor value i.e. Test Anchor
Need your suggestions.
From the PDF file you need to identify the region where the link is placed and then read the text below the link using iTextSharp.
This way you can extract the text underneath the link. The limitation of this approach is that if the link region is wider than the text, the extraction will read the full text under that region.
private void GetAllHyperlinksFromPDFDocument(string pdfFilePath)
{
    string linkTextBuilder = "";
    string linkReferenceBuilder = "";
    PdfDictionary PageDictionary = default(PdfDictionary);
    PdfArray Annots = default(PdfArray);
    PdfReader R = new PdfReader(pdfFilePath);
    List<BinaryHyperlink> ret = new List<BinaryHyperlink>();
    //Loop through each page
    for (int i = 1; i <= R.NumberOfPages; i++)
    {
        //Get the current page
        PageDictionary = R.GetPageN(i);
        //Get all of the annotations for the current page
        Annots = PageDictionary.GetAsArray(PdfName.ANNOTS);
        //Make sure we have something
        if ((Annots == null) || (Annots.Length == 0))
            continue;
        //Loop through each annotation
        foreach (PdfObject A in Annots.ArrayList)
        {
            //Convert the itext-specific object as a generic PDF object
            PdfDictionary AnnotationDictionary = (PdfDictionary)PdfReader.GetPdfObject(A);
            //Make sure this annotation has a link
            if (!AnnotationDictionary.Get(PdfName.SUBTYPE).Equals(PdfName.LINK))
                continue;
            //Make sure this annotation has an ACTION
            if (AnnotationDictionary.Get(PdfName.A) == null)
                continue;
            //Get the ACTION for the current annotation
            PdfDictionary AnnotationAction = (PdfDictionary)AnnotationDictionary.GetAsDict(PdfName.A);
            if (AnnotationAction.Get(PdfName.S).Equals(PdfName.URI))
            {
                //Get action link URL : linkReferenceBuilder
                PdfString Link = AnnotationAction.GetAsString(PdfName.URI);
                if (Link != null)
                    linkReferenceBuilder = Link.ToString();
                //Get action link text : linkTextBuilder
                var LinkLocation = AnnotationDictionary.GetAsArray(PdfName.RECT);
                List<string> linestringlist = new List<string>();
                iTextSharp.text.Rectangle rect = new iTextSharp.text.Rectangle(((PdfNumber)LinkLocation[0]).FloatValue, ((PdfNumber)LinkLocation[1]).FloatValue, ((PdfNumber)LinkLocation[2]).FloatValue, ((PdfNumber)LinkLocation[3]).FloatValue);
                RenderFilter[] renderFilter = new RenderFilter[1];
                renderFilter[0] = new RegionTextRenderFilter(rect);
                ITextExtractionStrategy textExtractionStrategy = new FilteredTextRenderListener(new LocationTextExtractionStrategy(), renderFilter);
                linkTextBuilder = PdfTextExtractor.GetTextFromPage(R, i, textExtractionStrategy).Trim();
            }
        }
    }
}
Unfortunately I don't think you're going to be able to do this, at least not without a lot of guess-work. In HTML this would be easy because a hyperlink and its text are stored together as:
<a href="...">Click here</a>
However, in a PDF these two entities are not stored with any form of relationship. What we think of as a "hyperlink" within a PDF is technically a PDF Annotation that just happens to be sitting on top of text. You can see this by opening a PDF in an editing program such as Adobe Acrobat Pro. You can change the text but the "clickable" area doesn't change. You can also move and resize the "clickable" area and put it anywhere in the document.
When creating PDFs, iText/iTextSharp abstract this away so you don't have to think about this. You can create a "hyperlink" with clickable text but when it generates a PDF it ultimately will create the text as normal text, calculate the rectangle coordinates and then put an annotation at that rectangle.
I did say that you could try to guess at this, and it might or might not work for you. To do this you'd need to get the rectangle for annotation and then find the text that's also at those coordinates. It won't be an exact match, however, because of padding issues. If you absolutely have to get the text under a hyperlink then this is the only way that I know of for doing this. Good luck!
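The guessing described above boils down to a geometric test: which pieces of text sit inside the annotation's rectangle, allowing some slack for padding. The sketch below shows only that test in plain C#. It assumes the text boxes and the link rectangle have already been extracted by some other means; TextBox, LinkTextMatcher and the tolerance parameter are hypothetical names, not part of iTextSharp.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Bounding box of one extracted piece of text, in PDF user-space units
public class TextBox
{
    public string Text;
    public float Left, Bottom, Right, Top;
}

public static class LinkTextMatcher
{
    // Returns the text whose box centers fall inside the link rectangle,
    // enlarged by 'tolerance' on every side to absorb padding differences.
    public static string TextUnder(float left, float bottom, float right, float top,
                                   IEnumerable<TextBox> boxes, float tolerance)
    {
        var hits = boxes
            .Where(b =>
            {
                float cx = (b.Left + b.Right) / 2f;
                float cy = (b.Bottom + b.Top) / 2f;
                return cx >= left - tolerance && cx <= right + tolerance
                    && cy >= bottom - tolerance && cy <= top + tolerance;
            })
            // Left-to-right order; good enough for single-line links only
            .OrderBy(b => b.Left)
            .Select(b => b.Text);
        return string.Join(" ", hits);
    }
}
```

Testing box centers rather than full containment is a deliberate choice here: it tolerates a link rectangle that is slightly smaller than the glyph boxes, at the cost of occasionally picking up a neighbouring word when the rectangle is much wider than the text.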

Show/Hide AcroFields in iTextSharp

I have the following code:
PdfStamper pst = null;
try
{
    PdfReader reader = new PdfReader(GetTemplateBytes());
    pst = new PdfStamper(reader, Response.OutputStream);
    var acroFields = pst.AcroFields;
    pst.FormFlattening = true;
    pst.FreeTextFlattening = true;
    pst.SetFullCompression();
    SetFieldsInternal(acroFields);
    pst.Close();
}
protected override void SetFieldsInternal(iTextSharp.text.pdf.AcroFields acroFields)
{
    acroFields.SetFieldProperty("txtForOffer", "setflags", PdfAnnotation.FLAGS_PRINT, null);
}
How do I show / hide the acrofields in the SetFieldsInternal function ?
The point is that the user may want to download 2 versions of the PDF, one with some text showing, one without text showing.
The template PDF is generated using OpenOffice. I just fill in the acrofields.
You can set an AcroField as readonly like this:
form.setFieldProperty("companyFld", "setfflags", PdfFormField.FF_READ_ONLY, null);
It is "setfflags" BTW not "setflags"
EDIT: MY BAD!!! You asked to make a field visible or not. You would use the "setflags" argument in this case and you can pass any of the PdfAnnotation FLAGS_ constants to adjust visibility.
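To make the flag arithmetic concrete, here is a small stand-alone sketch of what setting and clearing annotation flags amounts to at the bit level. It does not call iTextSharp. The bit values are the ones the PDF specification assigns to the annotation /F entry (Hidden = 2, Print = 4); that the PdfAnnotation FLAGS_ constants carry the same values, and that "clrflags" is the clearing counterpart of "setflags", are assumptions here. The class name AnnotationFlags is made up.

```csharp
using System;

public static class AnnotationFlags
{
    // Bits of the annotation /F entry as defined by the PDF specification
    public const int Hidden = 2;
    public const int Print = 4;

    // What "setflags" is assumed to do: switch the given bits on
    public static int Set(int current, int bits)
    {
        return current | bits;
    }

    // What "clrflags" is assumed to do: switch the given bits off
    public static int Clear(int current, int bits)
    {
        return current & ~bits;
    }

    public static bool IsHidden(int flags)
    {
        return (flags & Hidden) != 0;
    }
}
```

For the asker's two PDF versions, this would presumably translate to passing PdfAnnotation.FLAGS_HIDDEN with "setflags" for txtForOffer in the version without the text, and leaving the field as it is in the other; this has not been verified against a flattened form.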
