File size converting pdf to tiff - c#

I'm using ghostscriptSharp to convert PDF files to TIFF files for faxing. The PDF files sometimes contain photocopies of receipts.
I'm using the tiffg3 driver with a height x width of 400 x 400. I've noticed that the PDFs that contain photocopies tend to expand in size when converting to TIFFs, while the ones without those shrink in size. A typical increase that I'm seeing is going from 1 MB to 25 MB.
I've tried adding compression to the TIFF, but then the fax process can't read it. Is there a way to reduce the output size in ghostscriptSharp without reducing the resolution?

Creating a bitmap, even a low resolution monochrome bitmap, is likely to be larger than a vector-based description language.
Consider:
(Hello World) Tj
That's 16 bytes in a PDF file, and it doesn't change if you change the font size. If you turn it into a bitmap, even at low resolution and compressed, it probably exceeds that size.
That's why rendering a page description language to a bitmap produces larger files and is one of the reasons for using a page description language for printing, instead of sending large bitmaps around.
The tiffg3 and tiffg4 devices in Ghostscript only produce monochrome output, because that's all you can encode with G3 and G4 encoding. TIFF G3 is already compressed using the CCITT Group 3 fax compression scheme (Group 3 = g3). If you try to compress that using some other scheme, then your fax software won't be able to read it.
You could try using CCITT Group 4 fax compression instead (the tiffg4 device), but if that doesn't help then basically that's what you get. Your only other option is to create the TIFF at a lower resolution. You don't say what resolution you are currently using. Fax normally supports 3 resolutions: 408x391, 204x196 and 204x98. If you are using superfine (408x391) then you could switch to a lower resolution.
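To put rough numbers on the resolution trade-off, here is a small sketch of how large the uncompressed 1-bit raster is at each of the three fax resolutions. The page size (US Letter, 8.5 x 11 in), the resolution labels other than superfine, and the function name are assumptions for illustration. G3/G4 output will be smaller than these raw figures, by an amount that depends on the page content.

```python
import math

# Fax resolutions mentioned above, as (horizontal dpi, vertical dpi).
FAX_RESOLUTIONS = {
    "superfine": (408, 391),
    "fine": (204, 196),
    "standard": (204, 98),
}

def raw_bilevel_bytes(x_dpi, y_dpi, width_in=8.5, height_in=11.0):
    """Bytes needed for an uncompressed 1-bit-per-pixel page raster.

    Assumes each scanline is padded to a whole number of bytes.
    """
    width_px = round(width_in * x_dpi)
    height_px = round(height_in * y_dpi)
    bytes_per_row = math.ceil(width_px / 8)
    return bytes_per_row * height_px

sizes = {name: raw_bilevel_bytes(x, y) for name, (x, y) in FAX_RESOLUTIONS.items()}
for name, size in sizes.items():
    print(f"{name:10s} {size / 1024:8.0f} KiB per page before compression")
```

Superfine carries roughly four times the raw pixel data of fine, so dropping one resolution step is the biggest single lever on the size of a page that compresses poorly.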
I'm at a loss to see why this is a problem. Since you are sending the files by fax anyway, why do you care how large an intermediate TIFF file you get?

If compression won't work and you can't reduce resolution, then the only remaining option would be color depth. It's plausible that the conversion could be using more colors when a photocopy is attached (because of gradients in shadows, or the particular color of the paper, or whatever); yet the receipt might be totally readable without all the colors (as long as the "ink" shows up as distinct from the "paper").
If your conversion tool has a setting for selecting a color depth, tinkering with that is likely your best bet.

If your toolkit allows encoding options, your best bet for faxing will be to produce a bitonal (black and white) TIFF with Group 4 encoding. The downside of that compression scheme is that the more "gray" you have (typical with color pictures converted to grayscale), the bigger your file will be. Otherwise, for most things, the compression ratio will be just fine.
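A rough way to see why "gray" hurts: the CCITT fax schemes are built around the positions where the pixel color changes along a scanline, so the number of color runs per row is a reasonable proxy for how well that row will compress. The sketch below only counts runs. It is not a CCITT encoder, and the two sample rows are made up for illustration.

```python
from itertools import groupby

def count_runs(row):
    """Number of maximal runs of identical pixel values in one scanline."""
    return sum(1 for _ in groupby(row))

width = 1728  # assumed scanline width in pixels, for illustration

# Clean text-like row: white margin, two black strokes, white margin.
clean_row = [0] * 800 + [1] * 12 + [0] * 40 + [1] * 12 + [0] * 864

# Dithered "gray" row: alternating black and white pixels.
dithered_row = [i % 2 for i in range(width)]

print(count_runs(clean_row), count_runs(dithered_row))
```

The clean row has a handful of runs while the dithered row has one run per pixel, which is the kind of input (photocopied shadows, halftoned photos) that makes a bitonal fax file grow.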

Related

C# PDF compression / recompress JBIG2 to JPEG

I have PDF compressed with JBIG2. How can I recompress it to JPEG or any other compression algorithms?
I want to use open source solution like Itextsharp/PDFSharp or any other c# .net open source project.
You would need to decompress the image data, convert it from 1 bit per component to 8 bits per component, then apply JPEG compression.
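The middle step, going from 1 bit per component to 8 bits per component, can be sketched as follows. This assumes you already have the decoded image as packed 1-bit rows (most significant bit first, each row padded to a whole byte). Decoding JBIG2 and encoding JPEG are not shown, and the function name and the bit polarity default are assumptions.

```python
def expand_1bpp_to_8bpp(packed, width, height, one_is_white=True):
    """Expand packed 1-bit rows (MSB first, rows padded to a byte) to 8-bit gray."""
    bytes_per_row = (width + 7) // 8
    on, off = (255, 0) if one_is_white else (0, 255)
    out = bytearray(width * height)
    for y in range(height):
        row = packed[y * bytes_per_row:(y + 1) * bytes_per_row]
        for x in range(width):
            bit = (row[x // 8] >> (7 - x % 8)) & 1
            out[y * width + x] = on if bit else off
    return bytes(out)

# One 10-pixel row: bits 1010101010, padded to two bytes.
gray = expand_1bpp_to_8bpp(bytes([0b10101010, 0b10000000]), width=10, height=1)
print(list(gray))
```

The expanded buffer holds one byte per pixel instead of one bit, so the raw data handed to the JPEG encoder is about eight times larger than the bilevel original.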
But it's a bit unusual to convert monochrome images to color. This should actually increase the size. JBIG2 is pretty good for many monochrome images, so why do you want to use JPEG compression on it?

Reduce & Optimize Scanned Documents File Size

My customer has about 100,000 scanned documents (jpg) which they work with everyday. I want to know how can I reduce the file size of those images for faster file transfer and browsing.
The documents are scanned in black/white, saved in jpg format. They have a resolution of 150dpi and size of 1275x1753 (width x height). The main problem is their size which is between ~150kb and ~500kb which I think is too high for a black/white picture.
Is there a chance that I can reduce their size with changing the resolution, changing some color mode or something? Tried playing around with Photoshop but no luck.
The scanned documents are just for the sole purpose of Reviewing. So I don't think they need much detail or the original pic size.
Gonna write the program in c#, So tell me if there is a good image library for this purpose.
If your images are JPEG-compressed then they are either grayscale (8 bits per pixel) or full color (24 or 32 bits per pixel). I am not aware of any other JPEG types out there.
Given that, you probably won't get much benefit if you try to convert these images to other formats without changes to their size (number of pixels in both directions) and/or color space.
There is a possibility that JPEG 2000 might compress your images better than JPEG, but another lossy compression will introduce some more artifacts. You might try for yourself and see if this approach is acceptable for you. I can't recommend you any tools for this approach, though.
I would recommend you to try and convert your images to bilevel ones (i.e. with only two colors) and compress them with one of the FAX compression schemes (Group 3 or Group 4). You might try to reduce images sizes at the same time, too. This can be easily achieved using Docotic.Pdf library (Disclaimer: I work for the vendor of the library).
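As a library-independent illustration of the bilevel conversion, the sketch below thresholds 8-bit grayscale samples and packs them to 1 bit per pixel, and compares raw sizes for the 1275x1753 scans from the question. The fixed threshold of 128 and the function name are assumptions (real scans may need a different or adaptive threshold), and the Group 3 / Group 4 encoding step itself is not shown.

```python
def threshold_and_pack(gray, width, height, threshold=128):
    """Turn 8-bit grayscale samples into packed 1-bit rows (MSB first).

    Pixels at or above `threshold` become 1 (white); darker pixels become 0.
    """
    bytes_per_row = (width + 7) // 8
    packed = bytearray(bytes_per_row * height)
    for y in range(height):
        for x in range(width):
            if gray[y * width + x] >= threshold:
                packed[y * bytes_per_row + x // 8] |= 0x80 >> (x % 8)
    return bytes(packed)

# Raw (uncompressed) sizes for the 1275x1753 scans described in the question.
width, height = 1275, 1753
raw_gray_bytes = width * height
bilevel_bytes = ((width + 7) // 8) * height
print(raw_gray_bytes, bilevel_bytes)

# Tiny demo: a 4x1 row of dark, light, dark, light pixels.
demo = threshold_and_pack(bytes([10, 200, 90, 255]), width=4, height=1)
```

Even before any fax compression is applied, the bilevel raster is about an eighth of the grayscale one, and Group 4 typically shrinks clean text pages much further.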
Please take a look at my answer to a question similar to yours. The answer shows how to use RecompressWithGroup4Fax and/or Scale methods to recompress existing images in PDF.
There is also valuable advice from @plinth about JBIG2 compression and other stuff. Well worth reading.

Tiff Conversion Inverted

I am converting an 8bpp TIFF file to 8bpp grayscale, but the file size gets larger. I referred to the following URL (wischik). I have also tried Atalasoft: the file size is fine, but some of the files show up as negatives in IrfanView.
Can you tell me how to solve both problems (file size, negative)?
Image size depends on many factors like bpp, compression, colors, etc.
Since you have two tiff 8bpp images I believe the most important factor is the compression since both images will contain palettes.
Ideally in your case the two images should be close in size (memory required) when using the same compression, but the colored image will most likely be smaller. I have an SDK called LEADTOOLS that I use in my development and it gave me the result above.
Also a small note: if you can provide a sample of the image you are having the problem with, we can help you more.

resize picture c# for web

Goal:
I have lots of pictures in many sizes (both dimensions and file size)
I'd like to convert these files twice:
thumbnail-size pictures
pictures that will look OK on a web page and will be as close to a full screen as possible... and keeping the file size under 500KB.
HTML Questions:
A. What is the best file format to use (jpg, png or other) ?
B. What is the best configuration for web ... as small as possible file size with reasonable quality?
C# Questions
A. Is there a good way to achieve this conversion using C# code (if yes, how)?
Try the code in this small C# app for resizing and compressing the graphics. I have reused this code for use in an ASP.NET site without too much work, hopefully you can make use of it. You can run the app to check quality fits your needs etc.
http://blog.bombdefused.com/2010/08/bulk-image-optimizer-in-c-full-source.html
You can pass the image twice, specifying dimensions for a thumbnail, and then again for your display image. It can handle multiple formats (jpg, png, bmp, tiff, gif), and reduce file size significantly without losing noticeable quality.
On .jpg vs .png, generally jpg is better as you will get a smaller file size than with png. I've generally used this code passing a quality of 90%, which reduces file size significantly, but still looks perfect.
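The "pass the image twice" idea comes down to computing two target sizes that keep the aspect ratio. A minimal sketch of that calculation is below. The bounding boxes (150x150 for thumbnails, 1920x1080 for display) and the function name are assumptions, and the actual resampling and JPEG encoding are not shown.

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) down to fit a bounding box, keeping aspect ratio.

    Images that already fit are returned unchanged (no upscaling).
    """
    if width <= max_width and height <= max_height:
        return width, height
    # Compare aspect ratios by cross-multiplying to stay in integer arithmetic.
    if width * max_height >= height * max_width:
        # Width is the limiting side.
        return max_width, max(1, height * max_width // width)
    # Height is the limiting side.
    return max(1, width * max_height // height), max_height

original = (4000, 3000)
thumbnail = fit_within(*original, 150, 150)
display = fit_within(*original, 1920, 1080)
print(thumbnail, display)
```

If the display image still exceeds the 500KB budget at the chosen quality, lowering the JPEG quality a few points or shrinking the display box are the two knobs to try.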
I think PNG is a better format for the web than JPEG. JPEG always uses lossy compression, although its degree is selectable: higher quality and larger files, or lower quality and smaller files. PNG uses ZIP compression, which is lossless and slightly more effective than LZW (slightly smaller files).
In C# you can use System.Drawing namespace types to load, resize and convert images. This namespace wraps the GDI+ API.
A. For graphics I would use png and for photos jpg.
B. Configuration?
C. There are tons of posts that explain that:
http://www.codeproject.com/KB/GDI-plus/imgresizoutperfgdiplus.aspx
Resizing an Image without losing any quality

Image Steganography

I'm working on a steganography application. I need to hide a message inside an image file and secure it with a password, with not much difference in the file size. I am using the Least Significant Bit algorithm and could do it successfully with BMP files, but it does not work with JPEG, PNG or TIFF files. Does this algorithm work with these files at all? Is there a better way to achieve this? Thanks.
This heavily depends on the way the particular image format works. You'll need to dive into the internals of the format you want to use.
For JPEG, you could fiddle with the last bits of the DCT coefficients for each block.
For palette-based files (GIFs, and some PNGs), you could add extra colours to the palette that look identical to the existing ones, and encode information based on which one you use.
You'll have to distinguish between pixel-based (Bitmap) and palette-based formats (GIF) for which the steganographic technique is quite different. Also be aware that there are image formats like JPG that lose information in the compression process.
I'd also advise reading a general introduction to steganography that covers the different formats.
The Least Significant Bit approach does not work with JPEG and GIF images because you are using the pixel data (raw image) to store hidden information before compression. A pixel p with data 0x123456 will probably not have this value after compression, because its value depends on the compression rate and neighbouring pixels. In this case we are talking about algorithms that do not only compact the image (like a ZIP, which keeps the content), but change the color distribution, texture, and quality in order to decrease the number of bits needed to represent it.
However, PNG can be used just to compact the image in the same sense as a ZIP file, keeping the content. Therefore, you can use the Least Significant Bit approach for PNG images, which is why the Wikipedia Steganography page shows an example in this format.
As long as the image format is lossless, you can use the LSB steganography in pixels (BMP, PNG, TIFF, PPM). If it is lossy, you have to try something else, as compression and subsequent decompression cause small changes in the pixels and the message is gone. In GIF, you can embed your message into the palette. In JPEG you change the DCT coefficients, a low-level frequency representation of the image, which can be read from and saved as JPEG file losslessly.
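For the lossless case, LSB embedding on raw pixel bytes can be sketched as below. This is a minimal illustration, not a secure scheme: there is no password or encryption, the 4-byte length prefix is an assumption made for this example, and `cover` is a stand-in for pixel data you would decode from a BMP/PNG/TIFF yourself. The result only survives if it is saved back in a lossless format.

```python
def embed_lsb(pixels, message):
    """Hide `message` bytes in the least significant bits of `pixels`.

    `pixels` is raw, uncompressed sample data (one byte per channel value).
    A 4-byte big-endian length prefix is stored first so extraction knows
    where to stop.
    """
    payload = len(message).to_bytes(4, "big") + message
    if len(payload) * 8 > len(pixels):
        raise ValueError("message too long for this image")
    out = bytearray(pixels)
    for i in range(len(payload) * 8):
        bit = (payload[i // 8] >> (7 - i % 8)) & 1
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)

def extract_lsb(pixels):
    """Recover a message written by embed_lsb."""
    def read_bytes(start_bit, count):
        value = bytearray(count)
        for i in range(count * 8):
            value[i // 8] |= (pixels[start_bit + i] & 1) << (7 - i % 8)
        return bytes(value)
    length = int.from_bytes(read_bytes(0, 4), "big")
    return read_bytes(32, length)

cover = bytes(range(256)) * 2          # stand-in for decoded pixel data
stego = embed_lsb(cover, b"hi there")
print(extract_lsb(stego))
```

Each carrier byte changes by at most 1, which is why the image looks the same, and also why any lossy re-encoding that nudges pixel values wipes the message out.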
There is extensive research on steganography in JPEG. For an introduction, I personally recommend Steganography in Digital Media: Principles, Algorithms, and Applications by Jessica Fridrich - must-read material for serious attempts at steganography. The approaches for various image formats are discussed in depth there.
Also, LSB is inefficient and very easily detectable, so you should not use it. There are better algorithms, though they are usually complex and heavy on math. Look for "steganography embedding distortion" and "steganography codes".
