CSS not applying while creating PDF using itextsharp.xmlworker.dll - C#

I want to generate a PDF, but not all of the CSS I need is applied: margin, padding, alignment and so on have no effect. I also want to put an image in the PDF, but I don't know how. This is my code:
MemoryStream memoryStream = new MemoryStream();
Document doc = new Document(iTextSharp.text.PageSize.LETTER, 10, 10, 42, 35);
PdfWriter writer = PdfWriter.GetInstance(doc, memoryStream);
HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
htmlContext.SetTagFactory(Tags.GetHtmlTagProcessorFactory());
//create a cssresolver to apply css
ICSSResolver cssResolver = XMLWorkerHelper.GetInstance().GetDefaultCssResolver(false);
cssResolver.AddCss("div{color: red; text-align:center; font-size:30px;}", true);
cssResolver.AddCss("h1{color: green;}", true);
//Create and attach the pipeline; without a pipeline the parser will not apply the CSS
IPipeline pipeline = new CssResolverPipeline(cssResolver, new HtmlPipeline(htmlContext, new PdfWriterPipeline(doc, writer)));
//Create XMLWorker and attach a parser to it
XMLWorker worker = new XMLWorker(pipeline, true);
XMLParser xmlParser = new XMLParser(worker);
//All is well: open the document and start writing.
doc.Open();
string htmltext = "<html><body><h1>This is Heading </h1><div>This is a div content.</div></body></html>";
xmlParser.Parse(new StringReader(htmltext));
//Done! Close the document
doc.Close();
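For the image part of the question above, one possible approach is to give the HtmlPipelineContext an image provider so that relative img src values can be resolved. The following is only a sketch under assumptions: FolderImageProvider is a hypothetical helper class (not part of iTextSharp), the folder and file names are placeholders, and the exact way the root path is combined with the src value may differ between XML Worker versions.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;
using iTextSharp.tool.xml.html;
using iTextSharp.tool.xml.parser;
using iTextSharp.tool.xml.pipeline.css;
using iTextSharp.tool.xml.pipeline.end;
using iTextSharp.tool.xml.pipeline.html;

// Hypothetical helper (not part of iTextSharp): resolves relative <img src> values against one folder.
class FolderImageProvider : AbstractImageProvider
{
    private readonly string root;
    public FolderImageProvider(string root) { this.root = root; }
    public override string GetImageRootPath() { return root; }
}

class ImageDemo
{
    static void Main()
    {
        // Assumptions: C:\temp\images\ contains logo.png; both paths are placeholders.
        string html = "<html><body><h1>With image</h1><img src=\"logo.png\" /></body></html>";
        using (var fs = new FileStream(@"C:\temp\with-image.pdf", FileMode.Create))
        {
            var doc = new Document(PageSize.LETTER, 10, 10, 42, 35);
            PdfWriter writer = PdfWriter.GetInstance(doc, fs);
            doc.Open();

            var htmlContext = new HtmlPipelineContext(null);
            htmlContext.SetTagFactory(Tags.GetHtmlTagProcessorFactory());
            // Tell the HTML pipeline where to look for images.
            htmlContext.SetImageProvider(new FolderImageProvider(@"C:\temp\images\"));

            ICSSResolver cssResolver = XMLWorkerHelper.GetInstance().GetDefaultCssResolver(true);
            IPipeline pipeline = new CssResolverPipeline(cssResolver,
                new HtmlPipeline(htmlContext, new PdfWriterPipeline(doc, writer)));

            var parser = new XMLParser(new XMLWorker(pipeline, true));
            parser.Parse(new StringReader(html));
            doc.Close();
        }
    }
}
```

If the image still does not show up, an absolute file path or URL in the src attribute is worth trying before changing anything else.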

Related

iTextSharp - Fit PDFcontent within a page

I'm trying to convert simple HTML to PDF using iTextSharp. Consider the scenario below: when the width of an HTML div does not fit within the PDF page size, the content comes out blank in the PDF.
TextReader reader = new StringReader(html);
using (MemoryStream memoryStream = new MemoryStream())
{
using (Document document = new Document(PageSize.A4, 0, 0, 36, 0))
{
PdfWriter writer = PdfWriter.GetInstance(document, new FileStream(destPath, FileMode.Create));
writer.ViewerPreferences = PdfWriter.PageModeUseOutlines | PdfWriter.PageLayoutSinglePage;
writer.CloseStream = false;
document.Open();
//writer.PageEvent = new HeaderFooterHelper();
HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
htmlContext.SetTagFactory(Tags.GetHtmlTagProcessorFactory());
ICSSResolver cssResolver = XMLWorkerHelper.GetInstance().GetDefaultCssResolver(true);
IPipeline pipeline = new CssResolverPipeline(cssResolver, new HtmlPipeline(htmlContext, new PdfWriterPipeline(document, writer)));
XMLWorker worker = new XMLWorker(pipeline, true);
XMLParser parser = new XMLParser(true, worker, Encoding.Unicode);
parser.Parse(reader);
worker.Close();
document.Close();
}
}
HTML input:
<html>
<head>
</head>
<body>
<div style="width:1000px">
Blood is needed for emergencies and for people who have cancer, blood disorders,sickle cell, anemia and other illnesses. Some people need regular blood transfusions to live. For nearly 5 million people who receive blood transfusions every year, your donation can make the difference between life and death
</div>
</body>
</html>
We are migrating PDFs generated with EVO to iTextSharp, and this is one of the issues we hit during the migration. The content cannot be changed or manipulated to suit the PDF page size. Specifically, I am looking for a solution that works with the PageSize.A4 option.

Hebrew content not displayed when converting html to PDF using iTextSharp 5.5.8?

I am using the code below to convert an HTML file to PDF using iTextSharp:
Document doc = new Document(iTextSharp.text.PageSize.A4, 10, 20, 5, 35);
var writer = PdfWriter.GetInstance(doc, new FileStream(savePath, FileMode.Create));
var xmlWorkerFontProvider = new XMLWorkerFontProvider();
var cssAppliers = new CssAppliersImpl(new MyFontProvider());
CssFilesImpl cssFiles = new CssFilesImpl();
StyleAttrCSSResolver cssResolver = new StyleAttrCSSResolver(cssFiles);
HtmlPipelineContext htmlContext = new HtmlPipelineContext(cssAppliers);
htmlContext.SetTagFactory(Tags.GetHtmlTagProcessorFactory());
htmlContext.SetImageProvider(new ITextImageHandler());
IPipeline pipeline = new CssResolverPipeline(cssResolver, new HtmlPipeline(htmlContext, new PdfWriterPipeline(doc, writer)));
XMLWorker worker = new XMLWorker(pipeline, true);
XMLParser xmlParser = new XMLParser(true, worker, Encoding.Unicode);
doc.Open();
doc.NewPage();
xmlParser.Parse(new StringReader(htmlString.ToString()));
doc.Close();
For English content this works fine, but if the content is in Hebrew, the text is not displayed in the PDF.
I have checked other answers related to this on Stack Overflow, but they seem to use HtmlParser, which is deprecated, so I don't want to use that.
Please let me know if anything else is required. Thanks for your time.
Edit: After reading the comments I have tried setting the fonts as well, but still no luck. Below is the updated code.
Document document = new Document();
PdfWriter writer =
PdfWriter.GetInstance(document, new FileStream(savePath, FileMode.Create));
document.Open();
var cssResolver = new StyleAttrCSSResolver();
XMLWorkerFontProvider fontProvider =
new XMLWorkerFontProvider(XMLWorkerFontProvider.DONTLOOKFORFONTS);
fontProvider.Register(@"E:\fonts\NotoSansHebrew-Regular.ttf");
CssAppliers cssAppliers = new CssAppliersImpl(fontProvider);
HtmlPipelineContext htmlContext = new HtmlPipelineContext(cssAppliers);
htmlContext.SetTagFactory(Tags.GetHtmlTagProcessorFactory());
htmlContext.SetImageProvider(new ITextImageHandler());
PdfWriterPipeline pdf = new PdfWriterPipeline(document, writer);
HtmlPipeline html = new HtmlPipeline(htmlContext, pdf);
CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
XMLWorker worker = new XMLWorker(css, true);
XMLParser p = new XMLParser(worker);
p.Parse(new StringReader(htmlString.ToString()));
document.Close();
Below is an adaptation of Bruno's code with some actual HTML. To run it you just need to download the font Noto Sans Hebrew and place it on your desktop. Try running this code without any modifications (except possibly file paths); it works for me. (I tested this against 5.5.5, so 5.5.8 should absolutely work.)
var file = System.IO.Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "test.pdf");
var fontFile = System.IO.Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "NotoSansHebrew-Regular.ttf");
var htmlText = @"<div dir=""rtl"" style=""font-family: Noto Sans Hebrew;"">שלום עולם</div>";
using (var FS = new System.IO.FileStream(file, FileMode.Create, FileAccess.Write, FileShare.None)) {
using (var document = new Document()) {
using (var writer = PdfWriter.GetInstance(document, FS)) {
document.Open();
var cssResolver = new StyleAttrCSSResolver();
var fontProvider = new XMLWorkerFontProvider(XMLWorkerFontProvider.DONTLOOKFORFONTS);
fontProvider.Register(fontFile);
var cssAppliers = new CssAppliersImpl(fontProvider);
var htmlContext = new HtmlPipelineContext(cssAppliers);
htmlContext.SetTagFactory(Tags.GetHtmlTagProcessorFactory());
var pdf = new PdfWriterPipeline(document, writer);
var html = new HtmlPipeline(htmlContext, pdf);
var css = new CssResolverPipeline(cssResolver, html);
var worker = new XMLWorker(css, true);
var p = new XMLParser(worker);
using (var ms = new System.IO.MemoryStream(System.Text.Encoding.UTF8.GetBytes(htmlText))) {
using (var sr = new StreamReader(ms)) {
p.Parse(sr);
}
}
document.Close();
}
}
}
The trick to this whole thing is to use exactly the same font name in your HTML as the one in the font file. What's confusing sometimes is that fonts can actually have a bunch of names inside of them, and the older the font, the more likely it is to have several. If I remember correctly, iText has some heuristics for determining the font name, but if you want to play it safe you can also just use an alias and call it whatever you want. For instance, you can change the HTML to:
var htmlText = @"<div dir=""rtl"" style=""font-family: Gerp;"">שלום עולם</div>";
And everything will work just fine as long as you alias your font when registering it:
fontProvider.Register(fontFile, "Gerp");
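If you would rather match the real name than use an alias, it can help to print the family names the font provider picked up after registration and compare them with the font-family value in the HTML. This is a minimal sketch, assuming the same desktop font file as in the code above and assuming that XMLWorkerFontProvider exposes RegisteredFamilies (inherited from FontFactoryImp) in your version.

```csharp
using System;
using System.IO;
using iTextSharp.tool.xml;

class ListFontNames
{
    static void Main()
    {
        // Assumption: the font file sits on the desktop, as in the answer above.
        var fontFile = Path.Combine(
            Environment.GetFolderPath(Environment.SpecialFolder.Desktop),
            "NotoSansHebrew-Regular.ttf");

        // Do not scan the system font folders; register only this one file.
        var fontProvider = new XMLWorkerFontProvider(XMLWorkerFontProvider.DONTLOOKFORFONTS);
        fontProvider.Register(fontFile);

        // Print every family name the provider now knows about.
        foreach (string family in fontProvider.RegisteredFamilies)
        {
            Console.WriteLine(family);
        }
    }
}
```

Whatever this prints is a reasonable candidate for the font-family value in the HTML; if none of the names seems to match, the alias approach above avoids the guesswork.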

iTextSharp with XMLWorker 5.5.3 vs 5.5.7: missing Polish characters on the newest one

For now I use version 5.5.3 and it works without problems, but when I try to update to the newest version I have a problem with Polish characters (they are simply missing).
I convert from RTF to HTML and from HTML to PDF like this:
private ElementList htmlToElementList(string htmlText)
{
ICSSResolver cssResolver = XMLWorkerHelper.GetInstance().GetDefaultCssResolver(true);
// HTML
HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
htmlContext.SetTagFactory(Tags.GetHtmlTagProcessorFactory());
htmlContext.AutoBookmark(false);
// Pipelines
ElementList elements = new ElementList();
ElementHandlerPipeline end = new ElementHandlerPipeline(elements, null);
HtmlPipeline html = new HtmlPipeline(htmlContext, end);
CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
// XML Worker
XMLWorker worker = new XMLWorker(css, true);
XMLParser p = new XMLParser(worker);
p.Parse(new StringReader(htmlText));
return elements;
}
It works as it should on 5.5.3. I investigated and found one difference between 5.5.3 and 5.5.7: on 5.5.3, the font of each chunk in elements has a non-null BaseFont ({iTextSharp.text.pdf.TrueTypeFontUnicode}), while on 5.5.7 BaseFont is null.
I use only the Century Gothic font in the HTML (registered in FontFactory).
What is missing to get it to work in the new version?
I had the same issue: Turkish characters were missing in my PDF.
I fixed it like this:
String htmlText = html.ToString();
Document document = new Document();
string filePath = HostingEnvironment.MapPath("~/Content/Pdf/");
PdfWriter.GetInstance(document, new FileStream(filePath + "\\pdf-"+Name+".pdf", FileMode.Create));
document.Open();
iTextSharp.text.html.simpleparser.HTMLWorker hw = new iTextSharp.text.html.simpleparser.HTMLWorker(document);
FontFactory.Register(Path.Combine(_webHelper.MapPath("~/App_Data/Pdf/arial.ttf")), "Garamond"); // just give a path of arial.ttf
StyleSheet css = new StyleSheet();
css.LoadTagStyle("body", "face", "Garamond");
css.LoadTagStyle("body", "encoding", "Identity-H");
css.LoadTagStyle("body", "size", "12pt");
hw.SetStyleSheet(css);
hw.Parse(new StringReader(htmlText));
document.Close(); // close the document so the PDF is finalized and written out
Please also see: Missing Character issue in PDF using Itext
Regards,
Vinit patel

How to add bootstrap.css to pdf document itextsharp

I took this as a basis:
http://dangtrung87.blogspot.com/2013/07/asp-generate-pdf-with-itextsharp.html
and I have the following code:
string htmlText = RenderViewToString(this.ControllerContext, "report", null, true);
htmlText = System.Text.RegularExpressions.Regex.Replace(htmlText, @"\s+", " ");
htmlText = htmlText.Replace("\n", "").Replace("\r","").Trim();
//Generate PDF
using (var document = new Document(PageSize.A4, 40, 40, 40, 40))
{
htmlText = FormatImageLinks(htmlText);
//define output control HTML
var memStream = new MemoryStream();
TextReader xmlString = new StringReader(htmlText);
PdfWriter writer = PdfWriter.GetInstance(document, memStream);
//open doc
document.Open();
string arialuniTff = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts), "ARIALUNI.TTF");
// Set factories
var htmlContext = new HtmlPipelineContext(null);
htmlContext.SetTagFactory(Tags.GetHtmlTagProcessorFactory());
// Set css
ICSSResolver cssResolver = XMLWorkerHelper.GetInstance().GetDefaultCssResolver(false);
IPipeline pipeline = new CssResolverPipeline(cssResolver, new HtmlPipeline(htmlContext, new PdfWriterPipeline(document, writer)));
cssResolver.AddCssFile(HttpContext.Server.MapPath("~/Content/bootstrap.css"), true);
cssResolver.AddCss(".shadow {background-color: #ebdddd; }", true);
var worker = new XMLWorker(pipeline, true);
var xmlParse = new XMLParser(true, worker);
xmlParse.Parse(xmlString);
xmlParse.Flush();
document.Close();
document.Dispose();
return File(memStream.ToArray(), "application/pdf", "test.pdf");
}
I get an error at xmlParse.Parse(xmlString);
Additional information: Input string was invalid.
If I change the call to
cssResolver.AddCssFile(HttpContext.Server.MapPath("~/Content/bootstrap.css"), false); I get no error.
The PDF file is then generated, but only cssResolver.AddCss(".shadow {background-color: #ebdddd; }", true); takes effect.
The Bootstrap styles are not applied.
How do I add this correctly?
You get this error because cssResolver must be fully set up before it is used in the pipeline. Try changing these lines
// Set css
ICSSResolver cssResolver = XMLWorkerHelper.GetInstance().GetDefaultCssResolver(false);
IPipeline pipeline = new CssResolverPipeline(cssResolver, new HtmlPipeline(htmlContext, new PdfWriterPipeline(document, writer)));
cssResolver.AddCssFile(HttpContext.Server.MapPath("~/Content/bootstrap.css"), true);
cssResolver.AddCss(".shadow {background-color: #ebdddd; }", true);
to
// Set css
ICSSResolver cssResolver = XMLWorkerHelper.GetInstance().GetDefaultCssResolver(false);
cssResolver.AddCssFile(HttpContext.Server.MapPath("~/Content/bootstrap.css"), true);
cssResolver.AddCss(".shadow {background-color: #ebdddd; }", true);
IPipeline pipeline = new CssResolverPipeline(cssResolver, new HtmlPipeline(htmlContext, new PdfWriterPipeline(document, writer)));
Try it; it works fine for me.
Regards

setting style sheets to html tags

I have been trying out the following code in C# using iTextSharp:
Document doc = new Document(iTextSharp.text.PageSize.LETTER, 10, 10, 42, 35);
PdfWriter.GetInstance(doc, new FileStream(Server.MapPath("~/Test.pdf"), FileMode.Create));
doc.Open();
HTMLWorker html = new HTMLWorker(doc);
StyleSheet css = new StyleSheet();
css.LoadTagStyle("div", "color", "red");
html.Parse(new StringReader("<div>Sample text</div>"));
css.LoadTagStyle("div", "color", "red");
html.SetStyleSheet(css);
doc.Close();
The text, however, is displayed in plain black.
First, the string should be in HTML format.
Second, HTMLWorker does not support CSS in this way.
You can use XMLWorker to achieve your goal.
public static void pdfWithCSS()
{
Document doc = new Document(iTextSharp.text.PageSize.LETTER, 10, 10, 42, 35);
PdfWriter writer = PdfWriter.GetInstance(doc, new FileStream(Server.MapPath("~/TestWithCSS.pdf"), FileMode.Create));
HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
htmlContext.SetTagFactory(Tags.GetHtmlTagProcessorFactory());
//create a cssresolver to apply css
ICSSResolver cssResolver = XMLWorkerHelper.GetInstance().GetDefaultCssResolver(false);
cssResolver.AddCss("div{color: red;}", true);
cssResolver.AddCss("h1{color: green;}", true);
//Create and attach the pipeline; without a pipeline the parser will not apply the CSS
IPipeline pipeline = new CssResolverPipeline(cssResolver, new HtmlPipeline(htmlContext, new PdfWriterPipeline(doc, writer)));
//Create XMLWorker and attach a parser to it
XMLWorker worker = new XMLWorker(pipeline, true);
XMLParser xmlParser = new XMLParser(worker);
//All is well: open the document and start writing.
doc.Open();
string htmltext = "<html><body><h1>Heading in Green</h1><div>This is a div content. It should look red.</div></body></html>";
xmlParser.Parse(new StringReader(htmltext));
//Done! Close the document
doc.Close();
}
If you still want to use HTMLWorker, you have to put the CSS on the element itself, in its style attribute.
See the example below:
public static void pdfInLineCSS()
{
Document doc = new Document(iTextSharp.text.PageSize.LETTER, 10, 10, 42, 35);
PdfWriter.GetInstance(doc, new FileStream(Server.MapPath("~/Test.pdf"), FileMode.Create));
doc.Open();
HTMLWorker html = new HTMLWorker(doc);
/*StyleSheet css = new StyleSheet();*/ //Not supported
/*css.LoadTagStyle("div", "color", "red");*/
//css.LoadStyle("div", "color", "green");
string simple = "<html><body><h1 style='color: green;'>Heading in Green</h1><div style='color: red;'>Sample text in red color.</div></body></html>";
html.Parse(new StringReader(simple));
//css.LoadTagStyle("DIV", "color", "red");
/*html.SetStyleSheet(css);*/
doc.Close();
}
Resources: xmlworker-5.4.5
iTextSharp, a .NET PDF library
xmlworker demo
And many more that I cannot list here, but a BIG THANKS to ALL
Happy Coding :)
