Making a scanned PDF - C#

I need to convert a bulk of PDF documents into a non-editable (scanned) format. Can someone help me achieve this using C#.NET?

Assuming that Chris's comment is correct and you're trying to convert PDF documents to pictures, I'd suggest taking a look at ImageMagick.NET, a .NET wrapper around ImageMagick, which is an open source library for this kind of image conversion.
I've never used it myself, but it looks interesting.

Related

Generate image from PowerPoint slide

I use the Open XML SDK to parse pptx files. Now I want to develop my own .NET library/tool to generate an image from a PowerPoint slide. The Open XML SDK is not really meant for such tasks, and I do not know where to start my research.
Maybe it would be better to solve this in another programming language, for example C++ (which I also know), with some library?
Or might it be necessary to convert the pptx into an intermediate format first, for example HTML, and only then into an image?
I also tried to investigate the NuGet dependencies of the Aspose.Slides and Spire.Presentation libraries to find out what they use for image generation, but these attempts did not succeed.
This looks like a good starting point: Presentation to image conversion
The cited versions are old, but the syntax is still the same: PPT slides to images
My mistake, I misunderstood the end goal.

How to convert office file to image

I have been searching for the last two days but did not find anything.
My requirement is to create a document viewer in my web application (C#.NET), and I don't want to use any third party tool for this. Can I convert the files to images, PDF, or any common format that can be easily rendered on a web page? I also cannot use Interop objects.
Any help will be highly appreciated.
You mention in one of your comments that you'd like to write all the code yourself but don't know where to start. Here's how I would go about it...
First, you'll need to familiarize yourself with the Microsoft Office Format specification. You can find that here (there's a link to the technical specification). Office documents are actually .zip files with XML files inside, along with any binary data representing attachments. Just rename a .docx file to .zip and you'll be able to open it up and see the XML and any other supporting documents inside (the same is true for xlsx, etc.).
Then you'll need to become intimately familiar with either PDF or HTML, as your job will now be to convert the various Office document structures into PDF or HTML structures, being sure to respect page layout, margins, order, etc.
As others have said, this is a large task, which is why third party tools exist today. Also, each third party toolset has its limitations, as this is really hard to "get right" in all situations, and there will be edge cases that work for one document and not another (because maybe the author didn't use Microsoft Word to save the .docx; maybe they used OpenOffice, and OpenOffice interpreted the standard slightly differently).
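To make the ".docx is a .zip" point concrete, here is a minimal sketch that lists the parts inside an Office file using only System.IO.Compression from the framework. The class and method names are made up for illustration, and "word/document.xml" is assumed to be where the main document part lives, which is the usual location in a Word file but is worth checking against the specification.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;

// Hypothetical helper, not part of any library.
public static class OfficePackageInspector
{
    // Returns the names of all parts stored in the package (.docx, .xlsx, .pptx).
    public static List<string> ListParts(Stream package)
    {
        using (var archive = new ZipArchive(package, ZipArchiveMode.Read, leaveOpen: true))
        {
            return archive.Entries.Select(e => e.FullName).ToList();
        }
    }

    // Reads one part as text, or returns null if the part does not exist.
    public static string ReadPart(Stream package, string partName)
    {
        using (var archive = new ZipArchive(package, ZipArchiveMode.Read, leaveOpen: true))
        {
            ZipArchiveEntry entry = archive.GetEntry(partName);
            if (entry == null)
            {
                return null;
            }
            using (var reader = new StreamReader(entry.Open()))
            {
                return reader.ReadToEnd();
            }
        }
    }

    public static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("Usage: pass the path of a .docx file");
            return;
        }
        using (FileStream file = File.OpenRead(args[0]))
        {
            foreach (string name in ListParts(file))
            {
                Console.WriteLine(name);
            }
            // Assumption: the main body of a Word document is in this part.
            string xml = ReadPart(file, "word/document.xml");
            Console.WriteLine(xml == null ? "No main document part found." : xml.Length + " characters of XML");
        }
    }
}
```

This only gets you to the raw XML. Turning that XML into a correctly laid out page is the hard part described above.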
If you cannot use COM/Interop technologies in your solution, you can take a look at the specialized 3rd party options. I see that you prefer not to use them, however, there are no existing built-in solutions in the .NET Framework. Check out my answer in a similar thread that describes how to accomplish exactly the same task using 3rd party libraries (for example, DevExpress, since I have experience with it). In addition, take a look at the Documents demo, where you can see how to create images/thumbnails from different types of MS Office documents.
I believe what you need is an intermediate representation of the documents, which can then be converted into an image for the viewer to display.
Let me try to explain with the diagram below:
You can use tools like smallpdf or OfficeToPDF to do that. Just integrate them into your application.
Small PDF (https://smallpdf.com/library-detail)
officetopdf (https://officetopdf.codeplex.com/)

How to write a DLL file to access PDF

Hi, I'm new to programming and I have written a few applications that access PDF content using some DLL files. Now my question is: how can we write our own DLL to access PDF files? I know it's a big process, but I'm very much interested in learning about this. Can anyone please help me?
You can start by reading the PDF specification (warning: 32MB behind this link) in order to understand how the PDF file format is implemented. This is necessary if you want to be able to parse it and extract the information you are interested in.
In the meantime (as this reading might occupy you for quite some time), if you have pressing project deadlines, you probably want to use an existing library such as iTextSharp.
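As a first taste of what parsing the format involves, here is a small sketch of the two things a reader typically looks at first: the "%PDF-x.y" header at the start of the file and the "startxref" keyword near the end, which is followed by the byte offset of the cross-reference data. The class and method names are invented for illustration, and scanning only the last 1024 bytes is an assumption that will not suit every file.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration; not a real PDF parser.
public static class PdfProbe
{
    // Returns the version from a "%PDF-1.7" style header, or null if there is none.
    public static string ReadVersion(byte[] data)
    {
        string head = Encoding.ASCII.GetString(data, 0, Math.Min(data.Length, 16));
        if (!head.StartsWith("%PDF-", StringComparison.Ordinal))
        {
            return null;
        }
        int end = 5;
        while (end < head.Length && (char.IsDigit(head[end]) || head[end] == '.'))
        {
            end++;
        }
        return head.Substring(5, end - 5);
    }

    // Returns the offset written after the last "startxref" keyword, or -1 if not found.
    public static long ReadStartXref(byte[] data)
    {
        // Assumption: the keyword sits within the last 1024 bytes of the file.
        int tailStart = Math.Max(0, data.Length - 1024);
        string tail = Encoding.ASCII.GetString(data, tailStart, data.Length - tailStart);
        int index = tail.LastIndexOf("startxref", StringComparison.Ordinal);
        if (index < 0)
        {
            return -1;
        }
        string rest = tail.Substring(index + "startxref".Length).TrimStart();
        int end = 0;
        while (end < rest.Length && char.IsDigit(rest[end]))
        {
            end++;
        }
        long offset;
        if (end > 0 && long.TryParse(rest.Substring(0, end), out offset))
        {
            return offset;
        }
        return -1;
    }

    public static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("Usage: pass the path of a PDF file");
            return;
        }
        byte[] data = File.ReadAllBytes(args[0]);
        Console.WriteLine("Version: " + (ReadVersion(data) ?? "not a PDF"));
        Console.WriteLine("Cross-reference offset: " + ReadStartXref(data));
    }
}
```

From that offset a real parser would go on to read the cross-reference data and then the individual objects, which is where the specification becomes essential.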
"I know it's a big process but i'm very much interested to learn about this."
That's true. I'd suggest studying some open source APIs (such as iTextSharp) and PDF SDKs.

HTML to PDF - Bad performance

I'm using ExpertPDF (a library for .NET C#) to convert HTML to PDF, and my problem is that the conversion takes a lot of time.
Are there any customizations that will improve the conversion?
The HTML page contains table data with just a few images, so it is not that complex.
Has anyone else ever experienced this problem, or do you recommend another library for doing this?
I'm thankful for any hints I can get; there must be a way to improve the performance of this action.
If you really need to build from HTML, I suggest having a look at websupergoo; it is not a free or open source library, but it can export PDF from HTML.
There are a lot of questions on Stack Overflow about this subject:
Generate PDF from ASP.NET from raw HTML/CSS content?
Printing a PDF in .NET
How do I programmatically create a PDF in my .NET application?
see search result

What is a good way to output an asp.net, C# GridView into a PDF

I tried using the Microsoft ReportingControls but found them overly cumbersome, with too little documentation. I'd like a simple control that would convert a GridView control into a PDF document. I've started looking into PDFsharp and am running into dead ends with the documentation. Same thing with iTextSharp. I'm willing to dig into them further if they have worked for others in the past.
You could iterate over the data in your gridview and write it to a PDF table using iTextSharp. Have a look here: http://www.codeproject.com/KB/cs/iTextSharpPdfTables.aspx
I also recommend getting the book iText In Action.
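A rough sketch of that idea, assuming iTextSharp 5.x is referenced in the project. To keep it independent of ASP.NET it takes the DataTable behind the grid rather than the GridView itself; the class and method names are made up for illustration.

```csharp
using System.Data;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

// Hypothetical helper; assumes the iTextSharp 5.x assembly is referenced.
public static class TablePdfExporter
{
    // Writes the column names and every row of 'data' into a single PDF table.
    // Assumes 'data' has at least one column.
    public static void Export(DataTable data, Stream output)
    {
        var document = new Document(PageSize.A4);
        PdfWriter.GetInstance(document, output);
        document.Open();

        // One PDF column per data column; cells fill the table left to right.
        var table = new PdfPTable(data.Columns.Count);
        foreach (DataColumn column in data.Columns)
        {
            table.AddCell(column.ColumnName);
        }
        foreach (DataRow row in data.Rows)
        {
            foreach (object value in row.ItemArray)
            {
                table.AddCell(value == null ? "" : value.ToString());
            }
        }

        document.Add(table);
        // Closing the document also finishes writing to the output stream.
        document.Close();
    }
}
```

If only the GridView is available, the same loops can walk gridView.HeaderRow.Cells and gridView.Rows and read each cell's Text property instead. In a page, the output stream would typically be Response.OutputStream with the content type set to application/pdf.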
I can recommend the ceTe DynamicPDF product if you just need to create PDF files. It is well documented and pretty easy to use. The only caveat I would have is that your reports will all be built in code. If you plan on adding a lot of reports, then you might want to explore an alternative like Telerik's new reporting tool (which will export to PDF).
Well, make that two caveats: DynamicPDF is a bit expensive if you are just doing the one GridView export.
