PNG extracted from PDF via Flate decoding is unrecognisable - C#

The C# software I'm involved with writing has a component that involves the reading of barcodes from scanned documents. The PDFs themselves are opened using PDFSharp.
Unfortunately we're encountering an issue with the process when it involves Flate Decoding of PDFs. Basically, all we get is a bunch of fuzz, which means there is no barcode to check and the document is not recognised.
Our code (which we shamelessly "borrowed" from another Stack Overflow case!) is as follows:
private FileInfo ExportAsPngImage(PdfDictionary image, string sourceFileName, ref int count)
{
//This code basically comes from http://forum.pdfsharp.net/viewtopic.php?f=2&t=2338#p6755
//and http://stackoverflow.com/questions/10024908/how-to-extract-flatedecoded-images-from-pdf-with-pdfsharp
string tempFile = string.Format("{0}_Image{1}.png", sourceFileName, count);
int width = image.Elements.GetInteger(PdfImage.Keys.Width);
int height = image.Elements.GetInteger(PdfImage.Keys.Height);
int bitsPerComponent = image.Elements.GetInteger(PdfImage.Keys.BitsPerComponent);
var pixelFormat = new PixelFormat();
switch (bitsPerComponent)
{
case 1:
pixelFormat = System.Drawing.Imaging.PixelFormat.Format1bppIndexed;
break;
case 8:
pixelFormat = System.Drawing.Imaging.PixelFormat.Format8bppIndexed;
break;
case 24:
pixelFormat = System.Drawing.Imaging.PixelFormat.Format24bppRgb;
break;
default:
throw new Exception("Unknown pixel format " + bitsPerComponent);
}
var fd = new FlateDecode();
byte[] decodedBytes = fd.Decode(image.Stream.Value);
byte[] resultBytes = null;
int newWidth = width;
int alignment = 4;
if (newWidth % alignment != 0)
//Image data in BMP files always starts at a DWORD boundary, in PDF it starts at a BYTE boundary.
//Most images have a width that is a multiple of 4, so there is no problem with them.
//You must copy the image data line by line and start each line at the DWORD boundary.
{
while (newWidth % alignment != 0)
{
newWidth++;
}
var copy_dword_boundary = new byte[height, newWidth];
for (int y = 0; y < height; y++)
{
for (int x = 0; x < newWidth; x++)
{
if (x <= width && (x + (y * width) < decodedBytes.Length))
// while not at end of line, take orignal array
copy_dword_boundary[y, x] = decodedBytes[x + (y * width)];
else //fill new array with ending 0
copy_dword_boundary[y, x] = 0;
}
}
resultBytes = new byte[newWidth * height];
int counter = 0;
for (int x = 0; x < copy_dword_boundary.GetLength(0); x++)
{
for (int y = 0; y < copy_dword_boundary.GetLength(1); y++)
{ //put 2dim array back in 1dim array
resultBytes[counter] = copy_dword_boundary[x, y];
counter++;
}
}
}
else
{
resultBytes = new byte[decodedBytes.Length];
decodedBytes.CopyTo(resultBytes, 0);
}
//Create a new bitmap and shove the bytes into it
var bitmap = new Bitmap(newWidth, height, pixelFormat);
BitmapData bitmapData = bitmap.LockBits(new Rectangle(0, 0, bitmap.Width, bitmap.Height), ImageLockMode.WriteOnly, bitmap.PixelFormat);
int length = (int)Math.Ceiling(width * bitsPerComponent / 8.0);
for (int i = 0; i < height; i++)
{
int offset = i * length;
int scanOffset = i * bitmapData.Stride;
Marshal.Copy(resultBytes, offset, new IntPtr(bitmapData.Scan0.ToInt32() + scanOffset), length);
}
bitmap.UnlockBits(bitmapData);
//Now save the bitmap to memory
using (var fs = new FileStream(String.Format(tempFile, count++), FileMode.Create, FileAccess.Write))
{
bitmap.Save(fs, ImageFormat.Png);
}
return new FileInfo(tempFile);
}
Unfortunately, all we get out of it is this http://i.stack.imgur.com/FwatQ.png
Any ideas on where we're going wrong, or suggestions for things we might try would be greatly appreciated.
Cheers

Thanks for the suggestions guys. One of the other developers managed to crack it - it was (as Jongware suggested) a JPEG, but it was actually zipped as well! Once unzipped it could be processed and recognised as normal.
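
A minimal sketch of that check, assuming the stream bytes are zlib-wrapped deflate data (the form a /FlateDecode stream normally takes) and that the inflated result may turn out to be a complete JPEG file. The class and method names are hypothetical and not part of PDFsharp. The sketch inflates the buffer and then looks for the JPEG start-of-image marker before deciding how to treat the bytes:

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class FlateJpegSketch
{
    // Inflates a zlib-wrapped buffer (hypothetical helper, not a PDFsharp API).
    public static byte[] Inflate(byte[] zlibData)
    {
        if (zlibData == null || zlibData.Length < 2)
            throw new ArgumentException("Not a zlib buffer.", nameof(zlibData));

        // Skip the 2-byte zlib header; DeflateStream expects raw deflate data.
        using (var input = new MemoryStream(zlibData, 2, zlibData.Length - 2))
        using (var deflate = new DeflateStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            deflate.CopyTo(output);
            return output.ToArray();
        }
    }

    // A JPEG file starts with the start-of-image marker FF D8.
    public static bool LooksLikeJpeg(byte[] data)
    {
        return data != null && data.Length >= 2 && data[0] == 0xFF && data[1] == 0xD8;
    }
}
```

If LooksLikeJpeg returns true for the inflated bytes, they can be written straight to a .jpg file (for example with File.WriteAllBytes) instead of being copied into a Bitmap pixel by pixel.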

Related

Show an array of bytes as an image on a form

I wrote some code to show an array of bytes as an image. Every element of the array represents one pixel of an 8-bit grayscale image: 0 is black and 255 is white. My goal is to convert this w*w-pixel grayscale image to something accepted by pictureBox1.Image.
This is my code:
namespace ShowRawImage
{
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private void button1_Click(object sender, EventArgs e)
{
int i = 0, j = 0, w = 256;
byte[] rawIm = new byte[256 * 256];
for(i = 0; i < w; ++i)
{
for (j = 0; j < w; ++j)
{
rawIm[i * w + j] = (byte)j; // BitConverter.GetBytes(j);
}
}
MemoryStream mStream = new MemoryStream();
mStream.Write(rawIm, 0, Convert.ToInt32(rawIm.Length));
Bitmap bm = new Bitmap(mStream, false);// the error occurs here
mStream.Dispose();
pictureBox1.Image = bm;
}
}
}
However I get this error:
Parameter is not valid.
Where is my mistake?
EDIT:
In next step I am going to display 16-bit grayscale images.

The Bitmap(Stream, bool) constructor expects a stream containing an actual image file format (e.g. PNG, GIF, etc.) along with its header, palette, and possibly compressed image data.
To create a Bitmap from raw data, you need to use the Bitmap(int width, int height, int stride, PixelFormat format, IntPtr scan0) constructor, but that is also quite inconvenient because you need pinned raw data that you can pass as scan0.
It is best to just create an 8bpp bitmap with a grayscale palette and set the pixels manually:
var bmp = new Bitmap(256, 256, PixelFormat.Format8bppIndexed);
// making it grayscale
var palette = bmp.Palette;
for (int i = 0; i < 256; i++) // all 256 entries, so that index 255 maps to white
    palette.Entries[i] = Color.FromArgb(i, i, i);
bmp.Palette = palette;
Now you can access its raw content as bytes where 0 is black and 255 is white:
var bitmapData = bmp.LockBits(new Rectangle(Point.Empty, bmp.Size), ImageLockMode.WriteOnly, PixelFormat.Format8bppIndexed);
for (int y = 0; y < bitmapData.Height; y++)
{
for (int x = 0; x < bitmapData.Width; x++)
{
unsafe
{
((byte*) bitmapData.Scan0)[y * bitmapData.Stride + x] = (byte)x;
}
}
}
bmp.UnlockBits(bitmapData);
But if you don't want to use unsafe code, or you want to set pixels by colors, you can use this library (disclaimer: written by me) that supports efficient manipulation regardless of the actual PixelFormat. Using that library the last block can be rewritten like this:
using (IWritableBitmapData bitmapData = bmp.GetWritableBitmapData())
{
IWritableBitmapDataRow row = bitmapData.FirstRow;
do
{
for (int x = 0; x < bitmapData.Width; x++)
row[x] = Color32.FromGray((byte)x); // this works for any pixel format
// row.SetColorIndex(x, x); // for the grayscale 8bpp bitmap created above
} while (row.MoveNextRow());
}
Or like this, using Parallel.For (this works only because in your example all rows are the same so the image is a horizontal gradient):
using (IWritableBitmapData bitmapData = bmp.GetWritableBitmapData())
{
Parallel.For(0, bitmapData.Height, y =>
{
var row = bitmapData[y];
for (int x = 0; x < bitmapData.Width; x++)
row[x] = Color32.FromGray((byte)x); // this works for any pixel format
// row.SetColorIndex(x, x); // for the grayscale 8bpp bitmap created above
});
}

As said in the comments, a bitmap is not just an array. So to reach your goal you can create a bitmap of the needed size and set the pixels with Bitmap.SetPixel:
Bitmap bm = new Bitmap(w, w);
for (var i = 0; i < w; ++i)
{
    for (var j = 0; j < w; ++j)
    {
        // SetPixel takes (x, y): j is the column and i is the row, matching rawIm[i * w + j]
        bm.SetPixel(j, i, Color.FromArgb(j, j, j));
    }
}

PixelFormat for PNG image in PDF

I am trying to extract images using the PDFsharp library. As mentioned in the sample program, the library does not support the extraction of the non-JPEG images, therefore, I am trying to do it myself.
I found a non-working sample program for the same purpose. I am using the following code to extract a 400 x 400 PNG image embedded in a PDF file (the image was first inserted in a MS Word file, which was saved as a PDF file then).
PDF File Link:
https://drive.google.com/open?id=1aB-SrMB3eu00BywliOBC8AW0JqRa0Hbd
EXTRACTION CODE:
static void ExportAsPngImage(PdfDictionary image, ref int count)
{
int width = image.Elements.GetInteger(PdfSharp.Pdf.Advanced.PdfImage.Keys.Width);
int height = image.Elements.GetInteger(PdfSharp.Pdf.Advanced.PdfImage.Keys.Height);
System.Drawing.Imaging.PixelFormat pixelFormat = System.Drawing.Imaging.PixelFormat.Format8bppIndexed;
byte[] original_byte_boundary = image.Stream.UnfilteredValue;
byte[] result_byte_boundary = null;
//Image data in BMP files always starts at a DWORD boundary, in PDF it starts at a BYTE boundary.
//You must copy the image data line by line and start each line at the DWORD boundary.
byte[, ,] copy_dword_boundary = new byte[3, height, width];
for (int y = 0; y < height; y++)
{
for (int x = 0; x < width; x++)
{
if (x <= width && (x + (y * width) != original_byte_boundary.Length))
// while not at end of line, take orignale array
{
copy_dword_boundary[0, y, x] = original_byte_boundary[3*x + (y * width)];
copy_dword_boundary[1, y, x] = original_byte_boundary[3*x + (y * width) + 1];
copy_dword_boundary[2, y, x] = original_byte_boundary[3*x + (y * width) + 2];
}
else //fill new array with ending 0
{
copy_dword_boundary[0, y, x] = 0;
copy_dword_boundary[1, y, x] = 0;
copy_dword_boundary[2, y, x] = 0;
}
}
}
result_byte_boundary = new byte[3 * width * height];
int counter = 0;
int n_width = copy_dword_boundary.GetLength(2);
int n_height = copy_dword_boundary.GetLength(1);
for (int x = 0; x < width; x++)
{
for (int y = 0; y < height; y++)
{ //put 3dim array back in 1dim array
result_byte_boundary[counter] = copy_dword_boundary[0, x, y];
result_byte_boundary[counter + 1] = copy_dword_boundary[1, x, y];
result_byte_boundary[counter + 2] = copy_dword_boundary[2, x, y];
//counter++;
counter = counter + 3;
}
}
Bitmap bmp = new Bitmap(width, height, pixelFormat);
System.Drawing.Imaging.BitmapData bmd = bmp.LockBits(new Rectangle(0, 0, bmp.Width, bmp.Height), ImageLockMode.WriteOnly, bmp.PixelFormat);
System.Runtime.InteropServices.Marshal.Copy(result_byte_boundary, 0, bmd.Scan0, result_byte_boundary.Length);
bmp.UnlockBits(bmd);
using (FileStream fs = new FileStream(@"D:\TestPdf\" + String.Format("Image{0}.png", count), FileMode.Create, FileAccess.Write))
{
bmp.Save(fs, ImageFormat.Png);
count++;
}
}
PROBLEM:
Whatever PixelFormat format I choose, the saved PNG image does not look correct.
Original PNG IMAGE (Bit Depth-32):
Result of PixelFormat = Format24bppRgb

You can get the pixelformat from the PDF file. Since you did not include the PDF in your post, I cannot tell you which format would be correct.
PDF files do not contain PNG images, instead images use a special PDF image format which is somewhat similar to the BMP files used by Windows, but without any headers in the binary data. Instead the "header" information can be found with the properties of the Image object. See the PDF Reference for further details.
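
To make the BMP comparison concrete, here is a rough sketch of the row handling it implies. It assumes the decoded PDF image data has rows packed to a byte boundary, while a GDI+ bitmap expects each row to start on a 4-byte (DWORD) boundary. The helper names are hypothetical, and bitsPerPixel is assumed to be BitsPerComponent multiplied by the number of colour components (for example 8 x 3 = 24 for DeviceRGB):

```csharp
using System;

public static class RowPaddingSketch
{
    // Bytes per row when rows are packed to a byte boundary.
    public static int PackedRowLength(int width, int bitsPerPixel)
    {
        return (width * bitsPerPixel + 7) / 8;
    }

    // Bytes per row when each row starts on a 4-byte (DWORD) boundary.
    public static int AlignedStride(int width, int bitsPerPixel)
    {
        return ((width * bitsPerPixel + 31) / 32) * 4;
    }

    // Copies byte-packed rows into a buffer whose rows are DWORD-aligned.
    public static byte[] PadRows(byte[] packed, int width, int height, int bitsPerPixel)
    {
        int rowLength = PackedRowLength(width, bitsPerPixel);
        int stride = AlignedStride(width, bitsPerPixel);
        if (packed.Length < rowLength * height)
            throw new ArgumentException("Buffer is smaller than the dimensions imply.", nameof(packed));

        var padded = new byte[stride * height]; // padding bytes stay zero
        for (int y = 0; y < height; y++)
            Buffer.BlockCopy(packed, y * rowLength, padded, y * stride, rowLength);
        return padded;
    }
}
```

This only addresses row alignment. Channel order is a separate concern: a 24bpp GDI+ bitmap stores bytes as blue, green, red, so RGB data from a PDF would presumably also need its first and third byte swapped per pixel.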

Convert image to byte array C# in a proper way

My problem is that I need to convert an image to a byte array to obtain its pixels.
My image size is 268x188 and the PixelFormat property returns Format24bppRgb, so I understand that each pixel takes 3 bytes.
If this is true, the pixel data should be 268*188*3 = 151152 bytes, but the byte array that I am creating has a size of 4906 bytes, which is the size of the image file on my computer.
I don't know whether there is another way to obtain these pixels, or whether you can only obtain the image file size.

If you want to ignore the header and the compression of the file you can do the following.
var path = ...
using(var image = Image.FromFile(path))
using(var bitmap = new Bitmap(image))
{
var bitmapData = bitmap.LockBits(new Rectangle(0, 0, bitmap.Width, bitmap.Height), ImageLockMode.ReadOnly, bitmap.PixelFormat);
var bytesPerPixel = 4; // bitmapData.PixelFormat (image.PixelFormat and bitmapData.PixelFormat can be different)
var ptr = bitmapData.Scan0;
var imageSize = bitmapData.Width * bitmapData.Height * bytesPerPixel;
var data = new byte[imageSize];
for (int x = 0; x < imageSize; x += bytesPerPixel)
{
for(var y = 0; y < bytesPerPixel; y++)
{
data[x + y] = Marshal.ReadByte(ptr);
ptr += 1;
}
}
bitmap.UnlockBits(bitmapData);
}

To get the image's pixels, try this:
public static byte[] GetImageRaw(Bitmap image)
{
if (image == null)
{
throw new ArgumentNullException(nameof(image));
}
if (image.PixelFormat != PixelFormat.Format24bppRgb)
{
throw new NotSupportedException("Invalid pixel format.");
}
const int PixelSize = 3;
var data = image.LockBits(
new Rectangle(Point.Empty, image.Size),
ImageLockMode.ReadWrite,
image.PixelFormat);
try
{
var bytes = new byte[data.Width * data.Height * PixelSize];
for (var y = 0; y < data.Height; ++y)
{
var source = (IntPtr)((long)data.Scan0 + y * data.Stride);
// copy row without padding
Marshal.Copy(source, bytes, y * data.Width * PixelSize, data.Width * PixelSize);
}
return bytes;
}
finally
{
image.UnlockBits(data);
}
}
Take a look at Bitmap.LockBits

I use this code in ASP.NET application. Very simple:
var imagePath = GetFilePathToYourImage();
using (var img = System.IO.File.OpenRead(imagePath))
{
var imageBytes = new byte[img.Length];
img.Read(imageBytes, 0, (int)img.Length);
}

How to change 12-bits and 10-bits raw file to bitmap image using c#

I have 10-bit and 12-bit raw Bayer (.raw) files and I want to convert them to a Bitmap image, but I am unable to do so. 8-bit or 16-bit raw files convert easily, but 10-bit or 12-bit ones do not.
Here is the code that converts an 8-bit raw file to a Bitmap:
private void DisplayImage08(string fileName) //Raw file name
{
// Open a binary reader to read in the pixel data.
// We cannot use the usual image loading mechanisms since this is raw
// image data.
try
{
BinaryReader br = new BinaryReader(File.Open(fileName, FileMode.Open));
byte pixByte;
int i;
int iTotalSize = (int)br.BaseStream.Length;
// Get the dimensions of the image from the user
ID = new ImageDimensions(iTotalSize);
width = Convert.ToInt32(ID.txtwidth);
height = Convert.ToInt32(ID.txtheight);
//panel1.Width = width;
//panel1.Height = height;
pictureBox1.Width = width;
pictureBox1.Height = height;
pix08 = new byte[iTotalSize];
//pix08 = new byte[iTotalSize];
for (i = 0; i < iTotalSize; ++i)
{
pixByte = (byte)(br.ReadByte());
pix08[i] = pixByte;
}
br.Close();
int bitsPerPixel = 8;
stride = (width * bitsPerPixel + 7) / 8;
// Single step creation of the image
bmps = BitmapSource.Create(width, height, 96, 96, PixelFormats.Gray8, null,
pix08, stride);
//Bitmap bmp = new Bitmap();
// img.Source = bmps;
Bitmap bt = BitmapFromSource(bmps); //Change Bitmap source to Bitmap
pictureBox1.Image = bt; //Display on pictureBox
pictureBox1.SizeMode = System.Windows.Forms.PictureBoxSizeMode.AutoSize;
}
catch (Exception e)
{
}
}
But I want to convert a 12-bit or 10-bit raw file to a Bitmap.
How can I do that?
Thanks

This isn't a complete answer by any means, but this code will convert a 12- or 10-bit image, depending on how the data is packed, to an 8-bit image.
// Input data read from file
var pixels = ...
// Output
var bytes = new byte[width * height];
// Sort data
for (var i = 0; i < bytes.Length; i++)
{
if (i % 2 == 0)
{
var index = i * 3 / 2;
bytes[i] = pixels[index];
}
else
{
var index = (i * 3 + 1) / 2;
bytes[i] = pixels[index];
}
}
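
As a more explicit variant, here is a sketch for one specific 12-bit layout. It assumes two pixels are packed into three bytes, most significant bits first: byte 0 holds the top 8 bits of pixel 0, byte 1 holds the low 4 bits of pixel 0 followed by the top 4 bits of pixel 1, and byte 2 holds the low 8 bits of pixel 1. Real sensors and file formats use several different packings, so this layout and the names below are assumptions to check against your data:

```csharp
using System;

public static class Packed12Sketch
{
    // Unpacks 12-bit samples (assumed layout: 2 pixels per 3 bytes, MSB first)
    // and reduces each one to 8 bits by dropping the 4 least significant bits.
    public static byte[] Unpack12To8(byte[] packed, int pixelCount)
    {
        if (pixelCount % 2 != 0)
            throw new ArgumentException("This sketch handles an even pixel count only.", nameof(pixelCount));
        if (packed.Length < pixelCount / 2 * 3)
            throw new ArgumentException("Packed buffer is too short.", nameof(packed));

        var result = new byte[pixelCount];
        for (int i = 0, src = 0; i < pixelCount; i += 2, src += 3)
        {
            int p0 = (packed[src] << 4) | (packed[src + 1] >> 4);
            int p1 = ((packed[src + 1] & 0x0F) << 8) | packed[src + 2];
            result[i] = (byte)(p0 >> 4);
            result[i + 1] = (byte)(p1 >> 4);
        }
        return result;
    }
}
```

The resulting array has one byte per pixel and could be passed to the 8-bit display path from the question in place of pix08.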

unsafe image noise removal in c# (error : Bitmap region is already locked)

public unsafe Bitmap MedianFilter(Bitmap Img)
{
int Size =2;
List<byte> R = new List<byte>();
List<byte> G = new List<byte>();
List<byte> B = new List<byte>();
int ApetureMin = -(Size / 2);
int ApetureMax = (Size / 2);
BitmapData imageData = Img.LockBits(new Rectangle(0, 0, Img.Width, Img.Height), ImageLockMode.ReadOnly, PixelFormat.Format32bppRgb);
byte* start = (byte*)imageData.Scan0.ToPointer ();
for (int x = 0; x < imageData.Width; x++)
{
for (int y = 0; y < imageData.Height; y++)
{
for (int x1 = ApetureMin; x1 < ApetureMax; x1++)
{
int valx = x + x1;
if (valx >= 0 && valx < imageData.Width)
{
for (int y1 = ApetureMin; y1 < ApetureMax; y1++)
{
int valy = y + y1;
if (valy >= 0 && valy < imageData.Height)
{
Color tempColor = Img.GetPixel(valx, valy);// error come from here
R.Add(tempColor.R);
G.Add(tempColor.G);
B.Add(tempColor.B);
}
}
}
}
}
}
R.Sort();
G.Sort();
B.Sort();
Img.UnlockBits(imageData);
return Img;
}
I tried to do this, but I got an error saying "Bitmap region is already locked". Can anyone help me solve this? (The error position is marked with a comment in the code.)

GetPixel is the slooow way to access the image, and (as you noticed) it no longer works once someone else starts messing with the image buffer directly. Why would you want to do that?
Check "Using the LockBits method to access image data" for some good insight into fast image manipulation.
In this case, use something like this instead:
int pixelSize = 4; /* Check below or the site I linked to and make sure this is correct */
byte* color = (byte*)imageData.Scan0 + (y * imageData.Stride) + x * pixelSize;
Note that this gives you the first byte for that pixel. Depending on the color format you are looking at (ARGB? RGB? ..) you need to access the following bytes as well. It seems to suit your use case anyway, since you just care about byte values, not the Color value.
So, after having some spare minutes, this is what I came up with (please take your time to understand and check it, I just made sure it compiles):
public void SomeStuff(Bitmap image)
{
    var imageWidth = image.Width;
    var imageHeight = image.Height;
    var imageData = image.LockBits(new Rectangle(0, 0, imageWidth, imageHeight), ImageLockMode.ReadOnly, PixelFormat.Format32bppRgb);
    var imageStride = imageData.Stride;
    var imageByteCount = imageStride * imageData.Height;
    var imageBuffer = new byte[imageByteCount];
    Marshal.Copy(imageData.Scan0, imageBuffer, 0, imageByteCount);
    // The pixels now live in imageBuffer, so the bitmap can be unlocked again
    image.UnlockBits(imageData);
    for (int x = 0; x < imageWidth; x++)
    {
        for (int y = 0; y < imageHeight; y++)
        {
            var pixelColor = GetPixel(imageBuffer, imageStride, x, y);
            // Do your stuff
        }
    }
}
private static Color GetPixel(byte[] imageBuffer, int imageStride, int x, int y)
{
    // Format32bppRgb stores 4 bytes per pixel: blue, green, red, unused
    int pixelBase = y * imageStride + x * 4;
    byte blue = imageBuffer[pixelBase];
    byte green = imageBuffer[pixelBase + 1];
    byte red = imageBuffer[pixelBase + 2];
    return Color.FromArgb(red, green, blue);
}
This relies on the PixelFormat you used in your sample (regarding both the pixel size/bytes per pixel and the order of the values). If you change the PixelFormat this will break.
It also doesn't need the unsafe keyword. I doubt that it makes a lot of difference, but you are free to use pointer-based access instead; the method would be the same.
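
For the median filter from the question, the same buffer-based access can be used to collect the neighbourhood values. The following is only a sketch: it assumes a buffer copied from a bitmap locked with 4 bytes per pixel (such as Format32bppRgb, ordered blue, green, red, unused), and the method name and its radius and channel parameters are hypothetical:

```csharp
using System;
using System.Collections.Generic;

public static class MedianSketch
{
    // Median of one channel in a (2 * radius + 1) square window around (x, y),
    // clipped at the image edges. channel: 0 = blue, 1 = green, 2 = red.
    public static byte MedianAt(byte[] buffer, int stride, int width, int height,
                                int x, int y, int channel, int radius)
    {
        const int BytesPerPixel = 4; // assumption: 32 bits per pixel
        var values = new List<byte>();
        for (int dy = -radius; dy <= radius; dy++)
        {
            int yy = y + dy;
            if (yy < 0 || yy >= height) continue;
            for (int dx = -radius; dx <= radius; dx++)
            {
                int xx = x + dx;
                if (xx < 0 || xx >= width) continue;
                values.Add(buffer[yy * stride + xx * BytesPerPixel + channel]);
            }
        }
        values.Sort();
        return values[values.Count / 2];
    }
}
```

The results should be written to a separate output buffer rather than back into the one being read, otherwise already-filtered pixels would influence their neighbours.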
