Fast, multi-threaded and free HTML to PDF converter in C# for A4 documents [closed] - c#

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 2 days ago.
I would like to ask for your advice. I need a converter that creates A4 PDF documents from HTML. It must be possible to define the document margins, and CSS rendering must work (the CSS is defined in the HTML string). I don't need to save anything to a file: I send the HTML as a string and the converter returns the PDF as a byte array. It has to be as fast as possible, fast enough to convert 5000 HTML strings, each one or two pages long, into PDF documents in a reasonable time. The converter must work in a C# ASP.NET Core application.
So far I have tried these converters:
TuesPechkin
Dink
Both work well but very slowly: the convert method takes a very long time. Unfortunately, it can't be called in parallel either. Even if I create multiple threads, the method is always executed serially.
HtmlRenderer.PdfSharp
It is very fast (several times faster than TuesPechkin and Dink) and runs in parallel, but the rendered PDF looks bad: some important CSS styles are ignored, and even when I choose the A4 format, some text runs beyond the edge of the page.
I also read this entire thread but found nothing helpful to solve my problem: Convert HTML to PDF in .NET
Thanks for every reply
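For reference, the batch shape described in the question (HTML strings in, PDF byte arrays out, converted in parallel) could be sketched as below. This is only a sketch under stated assumptions: `convertOne` is a hypothetical stand-in for whichever library call produces the PDF bytes, and it is assumed to be thread-safe. With a converter that serializes calls internally, as observed above for TuesPechkin and Dink, this pattern would not give any speedup.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class BatchPdfConverter
{
    // convertOne is a hypothetical delegate wrapping the actual HTML-to-PDF
    // library call; it is assumed to be safe to call from several threads.
    public static byte[][] ConvertAll(
        IReadOnlyList<string> htmlDocuments,
        Func<string, byte[]> convertOne,
        int maxParallelism)
    {
        var results = new byte[htmlDocuments.Count][];
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallelism };

        // Each iteration writes only to its own slot, so no locking is needed.
        Parallel.For(0, htmlDocuments.Count, options, i =>
        {
            results[i] = convertOne(htmlDocuments[i]);
        });

        return results;
    }
}
```

Capping `MaxDegreeOfParallelism` (for example at `Environment.ProcessorCount`) keeps a 5000-document batch from oversubscribing the server that also handles web requests.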

Related

Content-wise not page-wise pdf comparison library [closed]

Closed 5 years ago.
I'm looking for a library that I can use in a C# Windows application for comparing PDF files. I have seen a lot of tools that do page-wise PDF comparison (e.g., http://www.inetsoftware.de/other-products/pdf-content-comparer). However, I want content-wise comparison: if content is added or removed and everything after the change is shifted as a result, I do not want the shifted content to be considered changed.
One option is to extract the text from the PDF files and then do a text comparison using an algorithm like the one proposed by Eugene W. Myers in his paper "An O(ND) Difference Algorithm and Its Variations". However, I wonder if there is a tool or library that I can use in C# to do this. Ideally, the tool would show the entire original document and highlight the changes. It would also detect other content changes, such as image changes.
Thanks.
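As an illustration of the text-comparison option above, here is a minimal sketch of a line-level diff. It assumes the text has already been extracted from both PDFs into lists of lines, and it uses a plain longest-common-subsequence table rather than Myers' O(ND) algorithm, which produces an equivalent kind of edit script more efficiently. The class and method names are made up for the example.

```csharp
using System;
using System.Collections.Generic;

public static class TextDiff
{
    // Returns one entry per line: "  " unchanged, "- " removed, "+ " inserted.
    public static List<string> Diff(IReadOnlyList<string> oldLines, IReadOnlyList<string> newLines)
    {
        int n = oldLines.Count, m = newLines.Count;

        // lcs[i, j] = length of the longest common subsequence of
        // oldLines[i..] and newLines[j..], filled from the end backwards.
        var lcs = new int[n + 1, m + 1];
        for (int i = n - 1; i >= 0; i--)
            for (int j = m - 1; j >= 0; j--)
                lcs[i, j] = oldLines[i] == newLines[j]
                    ? lcs[i + 1, j + 1] + 1
                    : Math.Max(lcs[i + 1, j], lcs[i, j + 1]);

        // Walk the table from the start to emit the edit script.
        var output = new List<string>();
        int a = 0, b = 0;
        while (a < n && b < m)
        {
            if (oldLines[a] == newLines[b]) { output.Add("  " + oldLines[a]); a++; b++; }
            else if (lcs[a + 1, b] >= lcs[a, b + 1]) { output.Add("- " + oldLines[a]); a++; }
            else { output.Add("+ " + newLines[b]); b++; }
        }
        while (a < n) output.Add("- " + oldLines[a++]);
        while (b < m) output.Add("+ " + newLines[b++]);
        return output;
    }
}
```

Comparing `A, B, C` against `A, X, B, C` marks only `X` as inserted; `B` and `C` stay unchanged even though they have shifted down, which is the content-wise behaviour asked for.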
A commercial option is DocsCorp compareDocs SDK (also known as DocuComp) http://www.docscorp.com/public/products/publicProductsDocuCompServer.cfm
It is a content based comparison solution. For example shifting of content due to insertion of a new paragraph will not cause all subsequent text to be considered 'changed'. The inserted paragraph will be marked as 'inserted' while the subsequent text will still be considered 'same'.
It does PDF-to-PDF comparison with the output as a single PDF. Changes are shown as annotations: insertions are shown as underlined text, and deletions are represented by PDF comments (yellow sticky notes) anchored to the point where the deletion took place. The output can be a single PDF illustrating the changes, based on the modified PDF, or it can show a side-by-side view representing both PDFs in one PDF.
The comparison is text based only. It does not currently attempt to show changes in images or other graphical elements in PDFs.
For full disclosure, I am employed by and part-own this company. My position is R&D VP.
Regards
Shane

How to convert PDF to WORD in c# [closed]

Closed 4 years ago.
Does anyone know a .NET component to convert PDF to Word or RTF programmatically? I don't want to use OCR or Adobe-dependent solutions.
I tried several libraries:
PDF Focus .NET: https://sautinsoft.com/products/pdf-focus/index.php
Aspose.PDF: https://products.aspose.com/pdf/net
Gembox: https://www.gemboxsoftware.com/document
Spire.PDF: https://www.e-iceblue.com/Introduce/pdf-for-net-introduce.html
I also considered using Word via COM automation to open and save to PDF programmatically.
Among all of them I liked PDF Focus .NET best of all, and I will explain why:
They try to keep the structure of the document EDITABLE, so that when I continue editing the text, the paragraph is smoothly prolonged. Other libraries take a "minimalistic" approach by inserting absolutely positioned shapes, so that if you continue editing the text, it overlaps with the next piece of text.
They do their best to recognize tables, so that tables in the output document are REAL TABLES and not a collection of shapes and texts with absolute positioning (as produced by other libraries).
A customer of ours is now evaluating different libraries, and I will recommend PDF Focus .NET first of all.
P.S. I AM NOT INVOLVED IN ANY KIND OF RELATIONSHIP WITH THIS SOFTWARE PRODUCER. As a former .NET developer I simply see high-quality components which really work fine.
Use PDF Focus.
Nice and easy.
EDIT: And also
How to convert DOC into other formats using C#
http://dotnetf1.blogspot.com/2008/07/convert-word-doc-into-pdf-using-c-code.html
You need something like GemBox.Document. It's a simple .NET component that enables you to manipulate and convert all kinds of document files.
You should have read this: C# and PDF. There are ways to convert, like the aforementioned PDF Focus, but be warned: it is a buggy and crash-prone process. PDF is not intended to be machine-readable.

Correct way of parsing XMP XML metadata attached to the end of a PDF file? [closed]

Closed 5 years ago.
I have a PDF with some metadata in XMP XML format attached to the end. What is the correct way of parsing and using this metadata?
At the moment I have a working solution written in C99. It reads the file character by character from the beginning, loops until it reaches a tag I'm after, and then records the contents until it reaches the closing tag. I can't see this being the best way of doing things.
I'm now rewriting this program using C# + Mono (not .NET), and I wonder if there is a magic framework class for this task instead of just imitating the C99 version. (Also, I can only rely on third-party libraries if they don't contain any P/Invoke stuff, etc.)
I'm using Mono because I need this app to be cross-platform.
Adobe has published the XMP specification. Give it a try. You need to find out what XMP schema the XML uses and parse it accordingly.
If you can get the complete XML as a string, you can use XmlDocument.LoadXml to load it into memory for querying. (XmlDocument.Load expects a file name, stream, or reader rather than the XML text itself.)
You can then use XPath with the XmlDocument.SelectNodes method in order to get to your data.
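A minimal sketch of that approach follows. It assumes the XMP packet is stored uncompressed in the file, is UTF-8 encoded, and is wrapped in an `x:xmpmeta` element (the conventional prefix, but not guaranteed). The class and method names are made up for the example, and `dc:title` is just one property chosen to show the namespace handling. Only `System.Xml` and `System.Text` are used, so there is no P/Invoke dependency.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Xml;

public static class XmpReader
{
    // Returns the XMP packet as a string, or null if no packet is found.
    public static string ExtractXmpPacket(byte[] pdfBytes)
    {
        // ISO-8859-1 maps every byte to exactly one char, so string indexes
        // found here are also valid byte offsets into pdfBytes.
        string raw = Encoding.GetEncoding("ISO-8859-1").GetString(pdfBytes);

        const string closeTag = "</x:xmpmeta>";
        int start = raw.IndexOf("<x:xmpmeta", StringComparison.Ordinal);
        if (start < 0) return null;
        int end = raw.IndexOf(closeTag, start, StringComparison.Ordinal);
        if (end < 0) return null;

        // Decode only the packet itself, assuming it is UTF-8.
        return Encoding.UTF8.GetString(pdfBytes, start, end + closeTag.Length - start);
    }

    // Reads every dc:title entry from the packet using XPath.
    public static List<string> ReadDcTitles(string xmpXml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xmpXml); // LoadXml parses XML text; Load expects a path or stream

        // XPath needs the namespace prefixes registered explicitly.
        var ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("rdf", "http://www.w3.org/1999/02/22-rdf-syntax-ns#");
        ns.AddNamespace("dc", "http://purl.org/dc/elements/1.1/");

        var titles = new List<string>();
        foreach (XmlNode node in doc.SelectNodes("//dc:title/rdf:Alt/rdf:li", ns))
            titles.Add(node.InnerText);
        return titles;
    }
}
```

Which XPath expressions are useful depends on the XMP schema the file uses, so the namespaces and paths above would need adjusting to the actual metadata.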

Looking for a reporting tool that will allow vector graphics in output file (PDF) [closed]

Closed 8 years ago.
I have been tasked with evaluating the system that we currently use for creating and outputting reports.
Currently we are using Crystal Reports 2008 (I know that this is an old version), together with a custom command-line app that we wrote in C# to execute a report for a given parameter passed on the command line.
We like Crystal because it's easy to set up and design the report. It's also easy to print and create a PDF file from Crystal using our custom command-line program.
One of the problems/complaints that we have is that Crystal does not appear to offer a way to create a PDF file containing vector images, such as our company logo. Crystal Reports always converts an image into a bitmap. When the PDF is printed, the results are less than flattering, and the PDF file size is increased.
Does anyone have any recommendations for a reporting product that we should consider?
iTextSharp supports importing WMF as vector image. Maybe other formats too.
See sample here. N.B.: it seems, it's a bit outdated... you'll need to replace 'getInstance' with 'GetInstance'.
www.hagridsolutions.com/xtraction
Offers easier use than Crystal and a rich export that can cater for exporting data into a MS Word template (that could contain vector images, headers, table of contents) and also export this into PDF or HTML format.
Design is drag-and-drop with no coding or dependence on specialized staff whatsoever.
You can define the reports once and have them scheduled to output to PDF, saved to the system to be viewed online or to a file system.
The dates can be rolling (as in Last Week, Last Month) and so always deliver based on what you need.
The design is drag-and-drop, the dashboards are interactive, the reports are available when you need them, and there is security to lock down access to the dashboards/reports and control of who can design dashboards/reports. The flexibility is surely there for whatever combination is needed.
I think that combit's List & Label will fit this requirement.
www.combit.de
However, the support for EMFs is not perfect; it works well for small and medium complexity.

Drawing SVG in .NET/C#? [closed]

Closed 4 years ago.
I'd like to generate an SVG file using C#. I already have code to draw them in PNG and EMF formats (using framework's standard class System.Drawing.Imaging.Metafile and ZedGraph). What could you recommend to do to adapt that code to SVG? Preferably I'd like to find some library (free or not) that would mimic System.Drawing.Graphics interface.
Check out the SVG framework in C# and an SVG-GDI+ bridge project.
From the above web page...
The SvgGdi bridge is a set of classes that use SvgNet to translate between SVG and GDI+. What this means is that any code that uses GDI+ to draw graphics can easily output SVG as well, simply by plugging in the SvgGraphics object. This object is exactly the same as a regular .NET Graphics object, but creates an SVG tree. Even things like hatched fills and line anchors are implemented.
We have made a public fork of the C# .NET SVG library on Github.
It is much improved over the one you find on Codeplex, please have a look and fork it as you like:
https://github.com/svg-net/SVG
Edit:
Just to let you know, as of January 2021:
While others seem dead for years, this is still active. But we could definitely use some help from other developers.
I used this one http://svg.codeplex.com/ and I am quite satisfied with it. Still has some bugs so you should have a look at the patches in http://svg.codeplex.com/SourceControl/PatchList.aspx.
When I discover mistakes I can solve I post them directly there. But it takes some time to be evaluated by the guys there. It's a better idea to have a look at the patches and apply them yourself.
The library is reasonably sufficient for most usual needs. For really fancy stuff, it needs to be improved, though...
As SVG is basically an XML document, you can implement "drawing" yourself. Check the specs at W3C SVG spec. I did it once to generate SVG signature images; all it took was a couple of hours and Firefox to test the generated image.
Of course this applies if you are generating image from user input or if you do not mind spending some time doing conversion from another vector image format.
P.S. you can create your own wrapper to mimic System.Drawing.Graphics, e.g. DrawLine() to append to the internal buffer and so on.
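A minimal sketch of such a wrapper is shown below. `SvgCanvas` and its members are made-up names for the example, not part of any library, and only `DrawLine` is covered. The stroke value is written into the markup unescaped, so it is assumed to be a trusted colour name or hex code.

```csharp
using System.Globalization;
using System.Text;

public sealed class SvgCanvas
{
    private readonly StringBuilder _body = new StringBuilder();
    private readonly int _width;
    private readonly int _height;

    public SvgCanvas(int width, int height)
    {
        _width = width;
        _height = height;
    }

    // Appends one <line> element to the internal buffer.
    // InvariantCulture keeps "." as the decimal separator on every locale.
    public void DrawLine(string stroke, float x1, float y1, float x2, float y2)
    {
        _body.AppendFormat(CultureInfo.InvariantCulture,
            "  <line x1=\"{0}\" y1=\"{1}\" x2=\"{2}\" y2=\"{3}\" stroke=\"{4}\" />\n",
            x1, y1, x2, y2, stroke);
    }

    // Wraps the buffered elements in the root <svg> element.
    public string ToSvg()
    {
        return string.Format(CultureInfo.InvariantCulture,
            "<svg xmlns=\"http://www.w3.org/2000/svg\" width=\"{0}\" height=\"{1}\">\n{2}</svg>",
            _width, _height, _body);
    }
}
```

Further methods such as a rectangle or ellipse call would follow the same pattern, each appending the matching SVG element, and the result of `ToSvg()` can be written to a `.svg` file and opened in a browser to check it.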
