.NET C# - MigraDoc - How to change document charset?


I've searched for a solution to this problem but still cannot find an answer. Any help would be appreciated.
Document document = new Document();
Section section = document.AddSection();
Paragraph paragraph = section.AddParagraph();
paragraph.Format.Font.Color = Color.FromCmyk(100, 30, 20, 50);
paragraph.AddText("ąčęėįųųūū");
paragraph.Format.Font.Size = 9;
paragraph.Format.Alignment = ParagraphAlignment.Center;
<...>
In the example above, the characters "ąčęėįųųūū" are not displayed in the exported PDF.
How can I set the MigraDoc character set?

Just tell the renderer to create a Unicode document:
PdfDocumentRenderer renderer = new PdfDocumentRenderer(true, PdfSharp.Pdf.PdfFontEmbedding.Always);
renderer.Document = document;
The first parameter of PdfDocumentRenderer must be true to get Unicode.
Please note that not all TrueType fonts include all Unicode characters (but it should work with Arial, Verdana, etc.).
See here for a complete sample:
http://www.pdfsharp.net/wiki/HelloMigraDoc-sample.ashx
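For reference, here is a minimal end-to-end sketch that combines the paragraph from the question with the Unicode renderer described above. It assumes a MigraDoc version that still offers the two-argument PdfDocumentRenderer constructor; the output file name and the choice of Arial are illustrative assumptions, not part of the original answer.

```csharp
using MigraDoc.DocumentObjectModel;
using MigraDoc.Rendering;
using PdfSharp.Pdf;

class UnicodeSample
{
    static void Main()
    {
        Document document = new Document();

        // Assumption: Arial is installed and contains the glyphs used below.
        document.Styles["Normal"].Font.Name = "Arial";

        Section section = document.AddSection();
        Paragraph paragraph = section.AddParagraph();
        paragraph.Format.Font.Size = 9;
        paragraph.Format.Alignment = ParagraphAlignment.Center;
        paragraph.AddText("ąčęėįųųūū");

        // First argument true = Unicode encoding; fonts are always embedded.
        PdfDocumentRenderer renderer =
            new PdfDocumentRenderer(true, PdfFontEmbedding.Always);
        renderer.Document = document;
        renderer.RenderDocument();

        // Hypothetical output path, chosen for this sketch only.
        renderer.PdfDocument.Save("unicode-sample.pdf");
    }
}
```

The source file has to be saved in an encoding that preserves the string literal (UTF-8, for example); otherwise the characters are already lost before MigraDoc sees them.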

If you are mixing PDFsharp and MigraDoc, as I do (that is, you have a PDFsharp PdfDocument object, document, and a MigraDoc Document object, doc, which is rendered as part of document), things are not that simple. The example given by the PDFsharp team works only when you use MigraDoc on its own.
So you should do it this way:
1. Make sure you render your MigraDoc doc before rendering the MigraDoc object to the PDFsharp XGraphics gfx.
2. Use the hack to set the encoding for the gfx object.
XGraphics gfx = XGraphics.FromPdfPage(page);
// HACK²
gfx.MUH = PdfFontEncoding.Unicode;
gfx.MFEH = PdfFontEmbedding.Always;
// HACK²
Document doc = new Document();
PdfDocumentRenderer pdfRenderer = new PdfDocumentRenderer(true, PdfFontEmbedding.Always);
pdfRenderer.Document = doc;
pdfRenderer.RenderDocument();
MigraDoc.Rendering.DocumentRenderer docRenderer = new DocumentRenderer(doc);
docRenderer.PrepareDocument();
docRenderer.RenderObject(gfx, XUnit.FromCentimeter(5), XUnit.FromCentimeter(10), "12cm", para);
For 1.5.x-betax (F#):
let gfx = XGraphics.FromPdfPage(page)
gfx.MUH <- PdfFontEncoding.Unicode
let doc = new Document()
let pdfRenderer = new PdfDocumentRenderer(true, PdfFontEmbedding.Always)
pdfRenderer.Document <- doc
pdfRenderer.RenderDocument()
let docRenderer = new DocumentRenderer(doc)
docRenderer.PrepareDocument()
docRenderer.RenderObject(gfx, XUnit.FromCentimeter 5, XUnit.FromCentimeter 10, "12cm", para)

Related

Add multiple Checkboxes in ITextsharp html to pdf

I'm using the iTextSharp library to convert my html to pdf. The issue is I'm trying to add checkbox appearance using the below code:
public static String FONT = "c:/windows/fonts/WINGDING.TTF";
public static String TEXT = "o";
public void HTMLToPdf(string FileName)
{
    string HTML = @"<!DOCTYPE html>
<html>
<head><title></title><meta charset='UTF-8'></head>
<body><div class='mystyle'>Here i want to print many checkbox lik appearances</div></body>
</html>";
    Document pdfDoc = new Document(PageSize.A4, 30f, 30f, 10f, 10f);
    BaseFont bf = BaseFont.CreateFont(FONT, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
    Font f = new Font(bf, 12);
    Paragraph p = new Paragraph(TEXT, f);
    pdfDoc.Add(p);
}
The problem is that this method adds the checkbox at the beginning of the PDF. Please help me attach the paragraph containing the checkbox value to my HTML.
Simply put, I'm getting the value at pdfDoc.Add(p), but I want it in a variable so I can print it many times in the HTML.

It is not entirely clear what was meant by this part of the question:
please help me to attach the paragraph containing the checkbox value to my html
I assume that you want to convert HTML to PDF and then add paragraphs after the converted content.
In general, it's a bad idea to use iTextSharp for this, since that library is outdated and no longer supported. I can suggest my own way of solving your problem with pdfHTML, which is an iText 7 add-on. My code is in Java, but the C# version is not much different. The main idea is not to close the document after the HTML conversion: if you try to write a paragraph to a closed document, it ends up at the very beginning, as in your example.
String FONT = "c:/windows/fonts/WINGDING.TTF";
String TEXT = "o";
File htmlSource = new File("checkBoxHtml.html");
File pdfDest = new File("output.pdf");
ConverterProperties converterProperties = new ConverterProperties();
Document document = HtmlConverter.convertToDocument(new FileInputStream(htmlSource),
new PdfDocument(new PdfWriter(pdfDest)), converterProperties);
PdfFont font = PdfFontFactory.createFont(FONT);
Text text = new Text(TEXT);
text.setFont(font);
Paragraph paragraph = new Paragraph();
// Adding text to the paragraph
paragraph.add(text);
// Adding paragraph to the document
document.add(paragraph);
document.close();
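Since the question is about C#, a rough translation of the Java snippet above to the .NET version of iText 7 with pdfHTML might look like this. It is a sketch, not the answerer's code: the file names are carried over from the Java example, and the class name is made up for illustration.

```csharp
using System.IO;
using iText.Html2pdf;
using iText.Kernel.Font;
using iText.Kernel.Pdf;
using iText.Layout;
using iText.Layout.Element;

class CheckBoxAfterHtml
{
    static void Main()
    {
        const string fontPath = "c:/windows/fonts/WINGDING.TTF";
        const string text = "o";

        using (FileStream htmlSource = File.OpenRead("checkBoxHtml.html"))
        {
            ConverterProperties converterProperties = new ConverterProperties();

            // ConvertToDocument returns a Document that is still open,
            // so further content is appended after the converted HTML.
            Document document = HtmlConverter.ConvertToDocument(
                htmlSource,
                new PdfDocument(new PdfWriter("output.pdf")),
                converterProperties);

            PdfFont font = PdfFontFactory.CreateFont(fontPath);
            Paragraph paragraph = new Paragraph();
            paragraph.Add(new Text(text).SetFont(font));

            document.Add(paragraph);

            // Close only after everything has been added.
            document.Close();
        }
    }
}
```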

C#: Create PDF Form (AcroForm) using PDFsharp

How does one add a PDF Form element to a PDFsharp PdfPage object?
I understand that AcroForm is the best format for form-fillable PDF elements, but the PDFsharp library doesn't seem to allow you to create instances of the AcroForm objects.
I have been able to use PDFsharp to generate simple documents, as here:
static void Main(string[] args) {
    PdfDocument document = new PdfDocument();
    document.Info.Title = "Created with PDFsharp";
    // Create an empty page
    PdfPage page = document.AddPage();
    // Draw Text
    XGraphics gfx = XGraphics.FromPdfPage(page);
    XFont font = new XFont("Verdana", 20, XFontStyle.BoldItalic);
    gfx.DrawString("Hello, World!", font, XBrushes.Black,
        new XRect(0, 0, page.Width, page.Height), XStringFormats.Center);
    // Save document
    const string filename = "HelloWorld.pdf";
    document.Save(filename);
}
But I cannot work out how to add a fillable form element. I gather it would likely use the page.Elements.Add(string key, PdfItem item) method, but how do you make an AcroForm PdfItem? (As classes like PdfTextField do not seem to have a public constructor)
The PDFsharp forums and documentation have not helped with this, and the closest answer I found on Stack Overflow was this one, which is answering with the wrong library.
So, in short: How would I convert the "Hello World" text above into a text field?
Is it possible to do this in PDFsharp, or should I be using a different C# PDF library? (I would very much like to stick with free - and preferably open-source - libraries)

Most of the classes in PDFsharp are sealed or lack public constructors, which makes it difficult to create new PDF objects. However, you can use its low-level classes to add PDF elements directly.
Below is an example of creating a text field.
Please refer to the PDF specification, starting on page 432, for the definitions of the key elements:
https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf
public static void AddTextBox()
{
    using (PdfDocument pdf = new PdfDocument())
    {
        PdfPage page1 = pdf.AddPage();
        double left = 50;
        double right = 200;
        double bottom = 750;
        double top = 725;
        PdfArray rect = new PdfArray(pdf);
        rect.Elements.Add(new PdfReal(left));
        rect.Elements.Add(new PdfReal(bottom));
        rect.Elements.Add(new PdfReal(right));
        rect.Elements.Add(new PdfReal(top));
        pdf.Internals.AddObject(rect);
        PdfDictionary form = new PdfDictionary(pdf);
        form.Elements.Add("/Filter", new PdfName("/FlateDecode"));
        form.Elements.Add("/Length", new PdfInteger(20));
        form.Elements.Add("/Subtype", new PdfName("/Form"));
        form.Elements.Add("/Type", new PdfName("/XObject"));
        pdf.Internals.AddObject(form);
        PdfDictionary appearanceStream = new PdfDictionary(pdf);
        appearanceStream.Elements.Add("/N", form);
        pdf.Internals.AddObject(appearanceStream);
        PdfDictionary textfield = new PdfDictionary(pdf);
        textfield.Elements.Add("/FT", new PdfName("/Tx"));
        textfield.Elements.Add("/Subtype", new PdfName("/Widget"));
        textfield.Elements.Add("/T", new PdfString("fldHelloWorld"));
        textfield.Elements.Add("/V", new PdfString("Hello World!"));
        textfield.Elements.Add("/Type", new PdfName("/Annot"));
        textfield.Elements.Add("/AP", appearanceStream);
        textfield.Elements.Add("/Rect", rect);
        textfield.Elements.Add("/P", page1);
        pdf.Internals.AddObject(textfield);
        PdfArray annotsArray = new PdfArray(pdf);
        annotsArray.Elements.Add(textfield);
        pdf.Internals.AddObject(annotsArray);
        page1.Elements.Add("/Annots", annotsArray);
        // draw rectangle around text field
        //XGraphics gfx = XGraphics.FromPdfPage(page1);
        //gfx.DrawRectangle(new XPen(XColors.DarkOrange, 2), left, 40, right, bottom - top);
        // Save document
        const string filename = @"C:\Downloads\HelloWorld.pdf";
        pdf.Save(filename);
        pdf.Close();
        Process.Start(filename);
    }
}

MigraDoc and UTF characters

I am trying to force MigraDoc to render a PDF with Unicode text (currently Chinese/Japanese characters) in C#.
Here is the code I use:
public void Render()
{
    var doc = new MigraDoc.DocumentObjectModel.Document();
    doc.AddSection();
    Style style = doc.Styles["Normal"];
    style.Font.Name = "Lucida Sans Unicode";
    var paragraph = GetLastSection().AddParagraph();
    paragraph.AddText("彤");
    var pdfRenderer = new PdfDocumentRenderer(true, PdfFontEmbedding.Always);
    pdfRenderer.Document = doc;
    pdfRenderer.RenderDocument();
    pdfRenderer.PdfDocument.Save(@"c:\temp\test.pdf");
}
The PDF itself gets generated, but unfortunately the only thing I see is a square.
The MigraDoc version is 1.32.4334.0.
Thank you for any help.

converting HTML to a multi-column PDF

I am trying to generate a multi-column PDF from HTML using iText for .NET.
I am using CSS3 syntax to generate two columns.
The code below is not working for me.
CSS
column-count:2;
C# Code
StringReader html = new StringReader(@"
<div style='column-count:2;'>Sample Text. Sample Text. Sample Text. Sample Text.
Sample Text. Sample Text. Sample Text. Sample Text. Sample Text. Sample Text.
Sample Text. Sample Text. Sample Text. Sample Text. Sample Text. Sample Text.
Sample Text. Sample Text. </div>
");
Document document = new Document();
PdfWriter writer = PdfWriter.GetInstance(document, new FileStream(@"d:\temp\xyz.pdf", FileMode.Create));
document.Open();
XMLWorkerHelper.GetInstance().ParseXHtml(
writer, document, html
);
document.Close();
Please suggest what the issue is in this code, or whether there is another HTML-to-PDF library that handles this.

The CSS property column-count is not supported in XML Worker, and it probably never will be.
However, this doesn't mean that you can't display HTML in columns.
If you go to the official XML Worker documentation, you'll find the ParseHtmlObjects example, where we parse a large HTML file and render it to a PDF with two columns: walden5.pdf
This is done by parsing the HTML into an ElementList first:
// CSS
CSSResolver cssResolver =
XMLWorkerHelper.getInstance().getDefaultCssResolver(true);
// HTML
HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
htmlContext.autoBookmark(false);
// Pipelines
ElementList elements = new ElementList();
ElementHandlerPipeline end = new ElementHandlerPipeline(elements, null);
HtmlPipeline html = new HtmlPipeline(htmlContext, end);
CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
// XML Worker
XMLWorker worker = new XMLWorker(css, true);
XMLParser p = new XMLParser(worker);
Once we have the list of Element objects, we can add them to a ColumnText object:
// step 1
Document document = new Document(PageSize.LEGAL.rotate());
// step 2
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(file));
// step 3
document.open();
// step 4
Rectangle left = new Rectangle(36, 36, 486, 586);
Rectangle right = new Rectangle(522, 36, 972, 586);
ColumnText column = new ColumnText(writer.getDirectContent());
column.setSimpleColumn(left);
boolean leftside = true;
int status = ColumnText.START_COLUMN;
for (Element e : elements) {
    if (ColumnText.isAllowedElement(e)) {
        column.addElement(e);
        status = column.go();
        while (ColumnText.hasMoreText(status)) {
            if (leftside) {
                leftside = false;
                column.setSimpleColumn(right);
            }
            else {
                document.newPage();
                leftside = true;
                column.setSimpleColumn(left);
            }
            status = column.go();
        }
    }
}
// step 5
document.close();
As you can see, you need to make some decisions here: you need to define the rectangles on the pages. You need to introduce new pages, etc...
Note: there is currently no C# port of this documentation. Please think of the Java code as if it were pseudo code.
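For readers working in C#, the following is an unofficial sketch of how the Java steps above might translate to iTextSharp 5 with XML Worker. It is a hand translation, not taken from the documentation; the input and output file names and the class name are assumptions made for illustration, and the Parse call is added here because the Java excerpt stops after creating the parser.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;
using iTextSharp.tool.xml.html;
using iTextSharp.tool.xml.parser;
using iTextSharp.tool.xml.pipeline.css;
using iTextSharp.tool.xml.pipeline.end;
using iTextSharp.tool.xml.pipeline.html;

class TwoColumnHtml
{
    static void Main()
    {
        // Parse the HTML into a list of elements instead of writing it directly.
        ICSSResolver cssResolver =
            XMLWorkerHelper.GetInstance().GetDefaultCssResolver(true);
        HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
        htmlContext.SetTagFactory(Tags.GetHtmlTagProcessorFactory());
        htmlContext.AutoBookmark(false);

        ElementList elements = new ElementList();
        ElementHandlerPipeline end = new ElementHandlerPipeline(elements, null);
        HtmlPipeline html = new HtmlPipeline(htmlContext, end);
        CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
        XMLParser parser = new XMLParser(new XMLWorker(css, true));

        // Hypothetical input file.
        using (StreamReader reader = new StreamReader("input.html"))
        {
            parser.Parse(reader);
        }

        // Lay the parsed elements out in two columns per page.
        using (FileStream output = new FileStream("columns.pdf", FileMode.Create))
        {
            Document document = new Document(PageSize.LEGAL.Rotate());
            PdfWriter writer = PdfWriter.GetInstance(document, output);
            document.Open();

            Rectangle left = new Rectangle(36, 36, 486, 586);
            Rectangle right = new Rectangle(522, 36, 972, 586);
            ColumnText column = new ColumnText(writer.DirectContent);
            column.SetSimpleColumn(left);
            bool leftSide = true;

            foreach (IElement element in elements)
            {
                if (!ColumnText.IsAllowedElement(element)) continue;

                column.AddElement(element);
                int status = column.Go();
                while (ColumnText.HasMoreText(status))
                {
                    if (leftSide)
                    {
                        // Left column is full: continue in the right column.
                        leftSide = false;
                        column.SetSimpleColumn(right);
                    }
                    else
                    {
                        // Both columns are full: start a new page on the left.
                        document.NewPage();
                        leftSide = true;
                        column.SetSimpleColumn(left);
                    }
                    status = column.Go();
                }
            }

            document.Close();
        }
    }
}
```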

Unable to Set Font Family in iTextSharp XMLWorker

I am trying to parse some HTML to PDF using itextsharp XMLWorker library. It is working fine but I am unable to render some Unicode characters (Turkish) into my pdf.
I have read several blogs about the problem and they all propose registering a font which supports unicode characters. Then in external css file, I need to specify the font family to use.
html
{
font-family: 'Arial Unicode MS';
}
I also tried plain Arial as the family, and I tried setting the family in the HTML as well.
<body face = 'Arial'>
None of them are working. Font is registered without problems and external CSS file is working too.
This is how I convert HTML to PDF,
string arialuniTff = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts), "ARIALUNI.TTF");
FontFactory.Register(arialuniTff);
// Resolve CSS
var cssResolver = new StyleAttrCSSResolver();
var cssFile = XMLWorkerHelper.GetCSS(new FileStream(Server.MapPath("~/Content/Editor.css"), FileMode.Open));
cssResolver.AddCss(cssFile);
// HTML
CssAppliers ca = new CssAppliersImpl();
HtmlPipelineContext hpc = new HtmlPipelineContext(ca);
hpc.SetTagFactory(Tags.GetHtmlTagProcessorFactory());
// PIPELINES
PdfWriterPipeline pdf = new PdfWriterPipeline(document, writer);
HtmlPipeline htmlPipe = new HtmlPipeline(hpc, pdf);
CssResolverPipeline css = new CssResolverPipeline(cssResolver, htmlPipe);
XMLWorker worker = new XMLWorker(css, true);
XMLParser p = new XMLParser(worker);
StringReader sr = new StringReader("<html><head></head><body>" + topMessage.Replace("<br>", "<br></br>") + "</body></html>");
p.Parse(sr);

I see that you create your CssAppliersImpl instance without a parameter. If you want to deal with fonts, you should create a FontProvider implementation and pass an instance of it to the CssAppliersImpl constructor. For instance, create a TestFontProvider class that shows you which font names are requested while your HTML is parsed. That will help you understand whether the right fonts are registered. If all the necessary fonts turn out to be registered, the problem may be caused by something else; for instance, the HTML may be parsed using the wrong encoding.
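As an illustration of the TestFontProvider idea, a logging provider could be sketched as follows. This is an assumption-laden example rather than code from the answer: it derives from XMLWorkerFontProvider and overrides the six-argument GetFont method, which is the signature I would expect XML Worker to call, so verify it against the iTextSharp version you use.

```csharp
using System;
using iTextSharp.text;
using iTextSharp.tool.xml;

// Hypothetical helper: logs every font that the HTML/CSS requests,
// then delegates to the normal lookup.
class TestFontProvider : XMLWorkerFontProvider
{
    public override Font GetFont(string fontname, string encoding, bool embedded,
                                 float size, int style, BaseColor color)
    {
        Console.WriteLine("Requested font: " + fontname +
                          " (encoding: " + encoding + ")");
        return base.GetFont(fontname, encoding, embedded, size, style, color);
    }
}
```

It would be plugged in with CssAppliers ca = new CssAppliersImpl(new TestFontProvider()); in place of the parameterless constructor call from the question. The console output then shows whether the names requested by the CSS match the names under which the fonts were registered.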

Here is the working solution after so many attempts:
string fontPath = Path.Combine(@"fonts\Gaegu-Regular.ttf");
var fontProvider = new XMLWorkerFontProvider(XMLWorkerFontProvider.DONTLOOKFORFONTS);
fontProvider.Register(fontPath);
CssAppliers ca = new CssAppliersImpl(fontProvider);
HtmlPipelineContext htmlContext = new HtmlPipelineContext(ca);
var pipeline = new CssResolverPipeline(cssResolver, new HtmlPipeline(htmlContext, new PdfWriterPipeline(document, writer)));
Thanks.
