Extract object (signature) from white background image (A4 paper) using C# / .NET? - c#

How can I extract an object (Signature) from a white background image (A4 Paper) taken using a mobile camera in C#/.NET and crop it if possible?
I am trying the ImageMagick library, but the output is not 100% correct. I tried adjusting the values without luck:
string GetSignature(string signature) {
string withoutBackground = "signature_no_bg.png";
using (var image = new MagickImage(signature))
{
image.Transparent(MagickColors.White);
// -alpha set
image.Alpha(AlphaOption.Set);
// -channel RGBA (don't think you need this)
// -fuzz 50%
image.ColorFuzz = new Percentage(40);
// -fill none
image.Settings.FillColor = MagickColors.None;
// -floodfill +0+0 white
image.FloodFill(MagickColors.White, 0, 0);
image.Write(Server.MapPath(withoutBackground));
}
return withoutBackground;
}
Using the above code, the following image: (image not included)
was converted to: (image not included)
Another option was to use https://www.remove.bg/tools-api; it worked perfectly, but it's a bit expensive.
Any suggestions to improve my ImageMagick code, or to use a different library?

I'm not experienced with ImageMagick, but I would try the following steps as a simple(st) alternative to filling:
Convert to greyscale
Threshold the image
Use the thresholded image as mask to extract the signature from the original RGB Image.
You will have to test how well it performs on your own images, though. To reduce noise, you can also try blurring the greyscale image before thresholding.
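The three steps above can be sketched on raw pixel arrays, independent of any imaging library. This is a minimal illustration, not ImageMagick code: the class name `SignatureExtractor`, the integer luma weights, and the fixed threshold are all assumptions, and a real photo will likely need a tuned or adaptive threshold.

```csharp
using System;

// Hypothetical helper: greyscale -> threshold mask -> masked pixel / crop box.
public static class SignatureExtractor
{
    // Step 1: approximate luma with integer weights (assumed 299/587/114).
    public static byte ToGray(byte r, byte g, byte b)
    {
        return (byte)((299 * r + 587 * g + 114 * b) / 1000);
    }

    // Step 2: a pixel counts as "ink" when it is darker than the threshold.
    public static bool[,] ThresholdMask(byte[,] gray, byte threshold)
    {
        int h = gray.GetLength(0), w = gray.GetLength(1);
        var mask = new bool[h, w];
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                mask[y, x] = gray[y, x] < threshold;
        return mask;
    }

    // Step 3: keep ink pixels unchanged, make background pixels fully transparent.
    public static uint ApplyMask(uint argb, bool ink)
    {
        return ink ? argb : (argb & 0x00FFFFFFu);
    }

    // Crop rectangle around the ink as { left, top, width, height }; null if no ink.
    public static int[] BoundingBox(bool[,] mask)
    {
        int h = mask.GetLength(0), w = mask.GetLength(1);
        int minX = w, minY = h, maxX = -1, maxY = -1;
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                if (mask[y, x])
                {
                    minX = Math.Min(minX, x); maxX = Math.Max(maxX, x);
                    minY = Math.Min(minY, y); maxY = Math.Max(maxY, y);
                }
        if (maxX < 0) return null;
        return new[] { minX, minY, maxX - minX + 1, maxY - minY + 1 };
    }
}
```

The bounding box gives the crop rectangle asked about in the question; the pixel values themselves would still have to be read from and written back to the image with whatever library is in use.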

Related

BitMiracle.LibTiff.Net Converting oJPEG tiff to Bitmap results in a negative color image

I'm using the BitMiracle.LibTiff v2.4.560.0 to convert oJPEG tiffs to Bitmap. This has worked out great until just recently. A Tiff, that I tried converting, is a document with a white background and black text. After converting the tiff, the result ends up with a black background and white text.
I'm using this Convert from Tiff to Bitmap sample for my conversion.
Is this a bug with the BitMiracle.LibTiff library or does there need to be modifications to the sample code? I made quite a few attempts of modifying the sample code, but with no success.
It turns out the image that causes the issue has a TiffTag.PHOTOMETRIC of Photometric.MINISWHITE. Changing that property to Photometric.MINISBLACK resolves the issue.
I added this snippet to the Convert from Tiff to Bitmap sample:
FieldValue[] value = tif.GetField(TiffTag.PHOTOMETRIC);
// GetField returns null when the tag is absent, so guard before indexing
if (value != null && value[0].ToInt() == (int)Photometric.MINISWHITE)
{
tif.SetField(TiffTag.PHOTOMETRIC, Photometric.MINISBLACK);
}
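For context, the two photometric values differ only in which end of the sample range means white: with MINISWHITE a sample of 0 is white, with MINISBLACK a sample of 0 is black. Reading one as the other therefore yields a negative. The sketch below shows the equivalent pixel-level inversion for 8-bit greyscale samples; `PhotometricFix` is a hypothetical name and this is not part of LibTiff.Net.

```csharp
// Hypothetical helper, not a LibTiff.Net API.
public static class PhotometricFix
{
    // Inverts 8-bit greyscale samples in place, so 0 becomes 255 and vice versa.
    public static void InvertInPlace(byte[] samples)
    {
        for (int i = 0; i < samples.Length; i++)
            samples[i] = (byte)(255 - samples[i]);
    }
}
```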

C# - Working with high resolution screenshots

I need to capture an area of my desktop, and I need the capture to be very high resolution (at least a few thousand pixels horizontally, and the same vertically). Is it possible to get a screen capture with a high pixel density? How can I do this? I tried capturing the screen with an AutoIt script and got very good results (images that were 350MB); now I would like to do the same using C#.
Edit:
I am doing my read/write of a .tif file like that, and it already loses most of the data:
using (Bitmap bitmap = (Bitmap)Image.FromFile(@"ScreenShot.tif")) //this file has 350MB
{
using (Bitmap newBitmap = new Bitmap(bitmap))
{
newBitmap.Save("TESTRES.TIF", ImageFormat.Tiff); //now this file has about 60MB, Why?
}
}
I am trying to capture my screen like that, but the best I can get from this is few megabytes (nowhere near 350MB):
using (var bmpScreenCapture = new Bitmap(window[2], window[3], PixelFormat.Format32bppArgb))
{
using (var i = Graphics.FromImage(bmpScreenCapture))
{
i.InterpolationMode = InterpolationMode.High;
i.CopyFromScreen(window[0], window[1], 0, 0, bmpScreenCapture.Size, CopyPixelOperation.SourceCopy);
}
bmpScreenCapture.Save("test2.tif", ImageFormat.Tiff);
}
You can't gather more information than the source has.
This is a basic truth and it does apply here, too.
So you can't capture more than your 1920x1080 pixels at their color depth.
OTOH, since you want to feed the captured image into OCR, there are a few more things to consider and, in fact, to do.
OCR is very happy if you help it by optimizing the image. This should involve
reducing colors and adding contrast
enlarging to the recommended dpi resolution
adding even more contrast
Funnily, this will help OCR although the real information cannot increase above the original source. But a good resizing algorithm will add invented data and these often will be just what the OCR software needs.
You should also take care to use a good, i.e. lossless, format such as png or tif when you store the image to a file, and never jpg.
The best way will have to be adjusted by trial and error until the OCR results are good enough.
Hint: Due to font antialiasing, most text on screenshots is surrounded by a halo of colorful pixels. Getting rid of it by reducing or even removing saturation is one way; alternatively, you may want to turn antialiasing off in your display properties. (Check out ClearType!)
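As an illustration of the "adding contrast" step, here is a minimal sketch of a linear contrast stretch on 8-bit greyscale values. `OcrPrep` and `StretchContrast` are hypothetical names, and a linear stretch is only one of several ways to raise contrast; whether it helps a given OCR engine has to be found by trial and error, as noted above.

```csharp
using System;

// Hypothetical helper for OCR preprocessing.
public static class OcrPrep
{
    // Linearly rescales values so the darkest becomes 0 and the brightest 255.
    public static byte[] StretchContrast(byte[] gray)
    {
        var result = new byte[gray.Length];
        if (gray.Length == 0) return result;

        byte min = 255, max = 0;
        foreach (byte v in gray)
        {
            if (v < min) min = v;
            if (v > max) max = v;
        }

        // A flat image has no contrast to stretch; return an unchanged copy.
        if (max == min)
        {
            Array.Copy(gray, result, gray.Length);
            return result;
        }

        for (int i = 0; i < gray.Length; i++)
            result[i] = (byte)((gray[i] - min) * 255 / (max - min));
        return result;
    }
}
```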

Compress existing PDF using C# programming using freeware libraries

I have been searching a lot on Google about how to compress existing pdf (size).
My problem is
I can't use any application, because it needs to be done by a C# program.
I can't use any paid library as my clients don't want to go out of Budget. So a PAID library is certainly a NO
I did my homework for the last 2 days and came upon solutions using iTextSharp and BitMiracle, but to no avail: the former reduces the file size by just 1%, and the latter is a paid library.
I also came across PDFcompressNET and pdftk, but I wasn't able to find their .dll.
Actually the pdf is insurance policy with 2-3 images (black and white) and around 70 pages accounting to size of 5 MB.
I need the output in pdf only (it can't be in any other format).
Here's an approach to do this (and this should work without regard to the toolkit you use):
If you have a 24-bit rgb or 32 bit cmyk image do the following:
determine if the image is really what it is. If it's cmyk, convert to rgb. If it's rgb and really gray, convert to gray. If it's gray or paletted and only has 2 real colors, convert to 1-bit. If it's gray and there is relatively little in the way of gray variations, consider converting to 1 bit with a suitable binarization technique.
measure the image dimensions in relation to how it is being placed on the page - if it's 300 dpi or greater, consider resampling the image to a smaller size depending on the bit depth of the image - for example, you can probably go from 300 dpi gray or rgb to 200 dpi and not lose too much detail.
if you have an rgb image that is really color, consider palettizing it.
Examine the contents of the image to see if you can help make it more compressible. For example, if you run through a color/gray image and find a lot of colors that cluster, consider smoothing them. If it's gray or black and white and contains a number of specks, consider despeckling.
choose your final compression wisely. JPEG2000 can do better than JPEG. JBIG2 does much better than G4. Flate is probably the best non-destructive compression for gray. Most implementations of JPEG2000 and JBIG2 are not free.
if you're a rock star, you want to try to segment the image and break it into areas that are really black and white and really color.
That said, if you can do all of this well in an unsupervised manner, you have a commercial product in its own right.
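The resolution check in the list above comes down to simple arithmetic: effective dpi is the pixel size divided by the placed size in inches. A minimal sketch follows, assuming the placed width is known in PDF points at the default 72 points per inch; `PdfImageDpi` and its methods are hypothetical names, not part of any PDF toolkit.

```csharp
using System;

// Hypothetical helper for deciding whether to resample an image.
public static class PdfImageDpi
{
    // Default PDF user space has 72 points per inch.
    public static double EffectiveDpi(int pixelWidth, double placedWidthPoints)
    {
        return pixelWidth / (placedWidthPoints / 72.0);
    }

    // Scale factor to apply to the pixel dimensions; 1.0 means leave as-is.
    public static double DownsampleFactor(double currentDpi, double targetDpi)
    {
        return currentDpi <= targetDpi ? 1.0 : targetDpi / currentDpi;
    }
}
```

For example, a 2550-pixel-wide scan placed across a 612-point (8.5 inch) page works out to 300 dpi, and going to 200 dpi means scaling the pixel dimensions by about 0.67.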
I will say that you can do most of this with Atalasoft dotImage (disclaimers: it's not free; I work there; I've written nearly all the PDF tools; I used to work on Acrobat).
One particular way to do that with dotImage is to pull out all the pages that are image only, recompress them, and save them out to a new PDF. Then build a new PDF by taking all the pages from the original document, replacing the image-only pages with the recompressed ones, and saving again. It's not that hard.
List<int> pagesToReplace = new List<int>();
PdfImageCollection pagesToEncode = new PdfImageCollection();
using (Document doc = new Document(sourceStream, password)) {
for (int i=0; i < doc.Pages.Count; i++) {
Page page = doc.Pages[i];
if (page.SingleImageOnly) {
pagesToReplace.Add(i);
// a PdfImage encapsulates an image and its compression parameters
PdfImage image = ProcessImage(sourceStream, doc, page, i);
pagesToEncode.Add(image);
}
}
}
PdfEncoder encoder = new PdfEncoder();
encoder.Save(tempOutStream, pagesToEncode, null); // re-encoded pages
tempOutStream.Seek(0, SeekOrigin.Begin);
sourceStream.Seek(0, SeekOrigin.Begin);
PdfDocument finalDoc = new PdfDocument(sourceStream, password);
PdfDocument replacementPages = new PdfDocument(tempOutStream);
for (int i=0; i < pagesToReplace.Count; i++) {
finalDoc.Pages[pagesToReplace[i]] = replacementPages.Pages[i];
}
finalDoc.Save(finalOutputStream);
What's missing here is ProcessImage(). ProcessImage will rasterize the page (and you wouldn't need to understand that the image might have been scaled to be on the PDF) or extract the image (and track the transformation matrix on the image), and go through the steps listed above. This is non-trivial, but it's doable.
I think you might want to make your clients aware that none of the libraries you mentioned is completely free:
iTextSharp is AGPL-licensed, so you must release source code of your solution or buy a commercial license.
PDFcompressNET is a commercial library.
pdftk is GPL-licensed, so you must release source code of your solution or buy a commercial license.
Docotic.Pdf is a commercial library.
Given all of the above, I assume I can drop the freeware requirement.
Docotic.Pdf can reduce size of compressed and uncompressed PDFs to different degrees without introducing any destructive changes.
Gains depend on the size and structure of a PDF: For small files or files that are mostly scanned images the reduction might not be that great, so you should try the library with your files and see for yourself.
If you are most concerned about size, there are many images in your files, and you are fine with losing some of the quality of those images, then you can easily recompress existing images using Docotic.Pdf.
Here is the code that makes all images bilevel and compressed with fax compression:
static void RecompressExistingImages(string fileName, string outputName)
{
using (PdfDocument doc = new PdfDocument(fileName))
{
foreach (PdfImage image in doc.Images)
image.RecompressWithGroup4Fax();
doc.Save(outputName);
}
}
There are also RecompressWithFlate, RecompressWithGroup3Fax and RecompressWithJpeg methods.
The library will convert color images to bilevel ones if needed. You can specify deflate compression level, JPEG quality etc.
Docotic.Pdf can also resize big images (and recompress them at the same time) in a PDF. This might be useful if images in a document are actually bigger than needed or if the quality of the images is not that important.
Below is code that scales all images whose width or height is greater than or equal to 256. Scaled images are then encoded using JPEG compression.
public static void RecompressToJpeg(string path, string outputPath)
{
using (PdfDocument doc = new PdfDocument(path))
{
foreach (PdfImage image in doc.Images)
{
// image that is used as mask or image with attached mask are
// not good candidates for recompression
if (!image.IsMask && image.Mask == null && (image.Width >= 256 || image.Height >= 256))
image.Scale(0.5, PdfImageCompression.Jpeg, 65);
}
doc.Save(outputPath);
}
}
Images can be resized to specified width and height using one of the ResizeTo methods. Please note that ResizeTo method won't try to preserve aspect ratio of images. You should calculate proper width and height yourself.
Disclaimer: I work for Bit Miracle.
Using PdfSharp:
public static void CompressPdf(string targetPath)
{
using (var stream = new MemoryStream(File.ReadAllBytes(targetPath)) {Position = 0})
using (var source = PdfReader.Open(stream, PdfDocumentOpenMode.Import))
using (var document = new PdfDocument())
{
var options = document.Options;
options.FlateEncodeMode = PdfFlateEncodeMode.BestCompression;
options.UseFlateDecoderForJpegImages = PdfUseFlateDecoderForJpegImages.Automatic;
options.CompressContentStreams = true;
options.NoCompression = false;
foreach (var page in source.Pages)
{
document.AddPage(page);
}
document.Save(targetPath);
}
}
GhostScript is AGPL licensed software that can compress PDFs. There is also an AGPL licensed C# wrapper for it on github here.
You could use the GhostscriptProcessor class from that wrapper to pass custom commands to GhostScript, like the ones found in this AskUbuntu answer describing PDF compression.

C# Converting 32bpp image to 8bpp

I'm trying to convert a 32bpp screenshot image to an 8bpp (or 4bpp, or 1bpp) format using C#. I've already looked at several stackoverflow answers on similar subjects and most suggest variations using the following code:
public static Bitmap Convert(Bitmap oldbmp)
{
Bitmap newbmp = new Bitmap(oldbmp.Width, oldbmp.Height, PixelFormat.Format8bppIndexed);
Graphics gr = Graphics.FromImage(newbmp);
gr.PageUnit = GraphicsUnit.Pixel;
gr.DrawImageUnscaled(oldbmp, 0, 0);
return newbmp;
}
However, when this executes, I get the exception: A graphics object cannot be created from an image that has an indexed pixel format. I understand that 8, 4 and 1bpp images have colour table mappings rather than the actual colour pixels themselves (as in 32 or 16bpp images), so I assume I'm missing some conversion step somewhere. I'm fairly new to C# (coming from a C++ background) and would prefer to be able to do this using native C# calls rather than resorting to PInvoking BitBlt and GetDIBits etc. Anybody able to help me solve this? Thanks.
EDIT: I should point out that I need this to be backwardly compatible to .NET framework 2.0
GDI+ in general has very poor support for indexed pixel formats. There is no simple way to convert an image with 65536 or 16 million colors into one that only has 2, 16 or 256. Colors have to be removed from the source image and that is a lossy conversion that can have very poor results. There are multiple algorithms available to accomplish this, none of them are perfect for every kind of image. This is a job for a graphics editor.
There is one trick I found. GDI+ has an image encoder for GIF files. That's a graphics format that has only 256 colors, the encoder must limit the number of colors. It uses a dithering algorithm that's suitable for photos. It does have a knack for generating a grid pattern, you'll be less than thrilled when it does. Use it like this:
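public static Image Convert(Bitmap oldbmp) {
// The stream is deliberately not disposed here: GDI+ requires it to
// stay open for the lifetime of the image created from it.
var ms = new MemoryStream();
oldbmp.Save(ms, ImageFormat.Gif);
ms.Position = 0;
return Image.FromStream(ms);
}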
The returned image has an 8bpp pixel format with the Palette entries calculated by the encoder. You can cast it to Bitmap if necessary. By far the best thing to do is to simply not bother with indexed formats. They date from the stone age of computing, back when memory was severely constrained. Or use a professional graphics editor.
The AForge library does this well with its Grayscale filter, if an 8bpp grayscale result is acceptable:
var bmp8bpp = Grayscale.CommonAlgorithms.BT709.Apply(bmp);
This class is the base class for image grayscaling [...]
The filter accepts 24, 32, 48 and 64 bpp color images and produces 8
(if source is 24 or 32 bpp image) or 16 (if source is 48 or 64 bpp
image) bpp grayscale image.
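To make clear what such a filter computes per pixel, here is a minimal sketch of a weighted grayscale conversion. The coefficients 0.2125, 0.7154 and 0.0721 are the ones AForge lists for its BT709 algorithm as far as I know; treat them, and the `Bt709Gray` name, as assumptions rather than a copy of the library's implementation.

```csharp
using System;

// Hypothetical stand-alone equivalent of a BT709-weighted grayscale step.
public static class Bt709Gray
{
    // Weighted sum of the channels, rounded and clamped to the 8-bit range.
    public static byte ToGray(byte r, byte g, byte b)
    {
        double y = 0.2125 * r + 0.7154 * g + 0.0721 * b;
        return (byte)Math.Min(255.0, Math.Round(y));
    }
}
```

Note that the result is a grayscale image, so colour information is discarded rather than mapped onto a colour palette.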
A negative stride signifies that the image is bottom-up (inverted). Just use the absolute value of the stride if you don't care. I know that works for 24bpp images; I'm not sure whether it works for others.
You can use System.Windows.Media.Imaging in the PresentationCore assembly; see its documentation for more information.

C# - Copy an Image into an 8-bit Indexed Image

I want to create an 8-bit indexed image from a regular 32-bit Image object.
Bitmap img = new Bitmap(imgPath); // 32-bit
Bitmap img8bit = new Bitmap(imgW, imgH, PixelFormat.Format8bppIndexed); // 8-bit
// copy img to img8bit -- HOW?
img8bit.Save(imgNewPath, ImageFormat.Png);
I cannot use SetPixel to copy it over pixel-by-pixel since Graphics doesn't work with indexed images.
How else?
I found a C# library that converts a bitmap into a palettized (8-bit) image. The technique is fast because it calls GDI32 (the windows graphics system) directly.
To convert to an 8bpp (palettized) image with a greyscale palette, do
System.Drawing.Bitmap b0 = CopyToBpp(b,8);
If you want to convert to an image with a different palette, look at the comments in the source code of CopyToBpp for suggestions. Note that when you convert to a 1bpp or 8bpp palettized copy, Windows will look at each pixel one by one and choose the palette entry that's closest to that pixel. Depending on your source image and choice of palette, you may very well end up with a resulting image that uses only half of the colours available in the palette.
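The "closest palette entry" selection described above can be sketched as follows. This is only an illustration of the idea, using squared RGB distance as an assumed measure of closeness; it is not the CopyToBpp source and not necessarily the metric Windows uses. `PaletteMapper` is a hypothetical name and colours are packed as 0xRRGGBB integers.

```csharp
// Hypothetical helper illustrating nearest-palette-entry mapping.
public static class PaletteMapper
{
    // Builds a 256-entry greyscale palette (0x000000 .. 0xFFFFFF).
    public static int[] GrayscalePalette()
    {
        var palette = new int[256];
        for (int v = 0; v < 256; v++)
            palette[v] = (v << 16) | (v << 8) | v;
        return palette;
    }

    // Returns the index of the palette colour with the smallest squared RGB distance.
    public static int NearestIndex(int[] palette, int rgb)
    {
        int best = -1, bestDist = int.MaxValue;
        for (int i = 0; i < palette.Length; i++)
        {
            int dr = ((palette[i] >> 16) & 0xFF) - ((rgb >> 16) & 0xFF);
            int dg = ((palette[i] >> 8) & 0xFF) - ((rgb >> 8) & 0xFF);
            int db = (palette[i] & 0xFF) - (rgb & 0xFF);
            int dist = dr * dr + dg * dg + db * db;
            if (dist < bestDist) { bestDist = dist; best = i; }
        }
        return best;
    }
}
```

Because each pixel is mapped independently, a palette that does not match the image's colours leaves many entries unused, which is the effect noted above.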
Converting an arbitrary RGBA image to an 8-bit indexed bitmap is a non-trivial operation; you have to do some math to determine the top 256 colors and round the rest (or do dithering, etc).
http://support.microsoft.com/kb/319061 has the details of everything except for a good algorithm, and it should give you an idea of how to get started.
