Extract colors from jpeg file (without Bitmap) - c#

Out of interest, I'm currently trying to figure out in depth how JPEGs are put together. I found documents on the different sections (SOI, SOF, SOS, EOI etc.), which are pretty straightforward, but not how to get a single pixel out of there.
My first thought was to make a small image, 2x2 for example, but with all the headers and sections it's still too big to isolate the pixel information without knowing the exact location and method to extract it. I'm sure it's compressed, but is there a way to get it out manually (as RGB)?
Does anyone have a clue on how to do this?

Getting the value of a single pixel of a JPEG image requires parsing some (if not most) of those sections anyway.
There's a good step-by-step guide available at https://www.imperialviolet.org/binary/jpeg/ (though the code is in Haskell, so it might be moderately inscrutable to mere mortals) that explains the concepts behind turning a JPEG into a bunch of RGB values.

This is the only source I know that explains JPEG end-to-end:
https://www.amazon.com/gp/product/B01JXRY4R0/ref=dbs_a_def_rwt_bibl_vppi_i4
Parsing the structure of a JPEG stream is easy. Decoding a JPEG scan is very difficult and involves several compression steps. Plus there are two different types of scan that are commonly in use (progressive & sequential).
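As a rough illustration of the easy part, the marker structure can be walked with a few lines of code. This is a minimal sketch in Python (chosen for brevity; the logic ports directly to C#), not a decoder: the names list_segments and MARKER_NAMES are made up for this example, and it stops at the SOS marker because the entropy-coded scan data that follows is the hard part described above.

```python
import struct

# Names for a few common markers; anything else is reported as hex.
MARKER_NAMES = {0xD8: "SOI", 0xE0: "APP0", 0xDB: "DQT", 0xC0: "SOF0",
                0xC2: "SOF2", 0xC4: "DHT", 0xDA: "SOS", 0xD9: "EOI"}


def list_segments(data):
    """Return (name, offset, payload_length) tuples up to SOS or EOI."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    segments = [("SOI", 0, 0)]
    pos = 2
    while pos + 2 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        code = data[pos + 1]
        if code == 0xFF:
            # Fill byte before a marker: skip it and look again.
            pos += 1
            continue
        name = MARKER_NAMES.get(code, f"0x{code:02X}")
        if code == 0xD9:
            # EOI carries no length field.
            segments.append((name, pos, 0))
            break
        if pos + 4 > len(data):
            raise ValueError("truncated segment header")
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segments.append((name, pos, length - 2))
        if code == 0xDA:
            # Compressed scan data follows; this sketch stops here.
            break
        pos += 2 + length
    return segments
```

For a baseline file, the SOF0 payload starts with the sample precision (1 byte) followed by the height and the width (2 bytes each, big-endian), which is enough to confirm that you are looking at your 2x2 test image.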

Related

Reduce & Optimize Scanned Documents File Size

My customer has about 100,000 scanned documents (jpg) which they work with every day. I want to know how I can reduce the file size of those images for faster file transfer and browsing.
The documents are scanned in black/white and saved in jpg format. They have a resolution of 150dpi and a size of 1275x1753 (width x height). The main problem is their file size, which is between ~150kb and ~500kb, which I think is too high for a black/white picture.
Is there a chance that I can reduce their size by changing the resolution, changing the color mode or something similar? I tried playing around with Photoshop but had no luck.
The scanned documents are for the sole purpose of reviewing, so I don't think they need much detail or the original picture size.
I'm going to write the program in C#, so tell me if there is a good image library for this purpose.
If your images are JPEG-compressed then they are either grayscale (8 bits per pixel) or full color (24 or 32 bits per pixel). I am not aware of any other JPEG types out there.
Given that, you probably won't get much benefit if you try to convert these images to other formats without changes to their size (number of pixels in both directions) and/or color space.
There is a possibility that JPEG 2000 might compress your images better than JPEG, but another lossy compression will introduce some more artifacts. You might try for yourself and see if this approach is acceptable for you. I can't recommend you any tools for this approach, though.
I would recommend you to try and convert your images to bilevel ones (i.e. with only two colors) and compress them with one of the FAX compression schemes (Group 3 or Group 4). You might try to reduce images sizes at the same time, too. This can be easily achieved using Docotic.Pdf library (Disclaimer: I work for the vendor of the library).
Please take a look at my answer to a question similar to yours. The answer shows how to use RecompressWithGroup4Fax and/or Scale methods to recompress existing images in PDF.
There is also valuable advice from @plinth about JBIG2 compression and other stuff. Well worth reading.
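To make the bilevel step suggested above concrete, here is a minimal sketch in Python of thresholding 8-bit grayscale samples and packing them at 1 bit per pixel. It is only an illustration: the function name to_bilevel and the fixed threshold of 128 are assumptions for this example, it is not how Docotic.Pdf does it, and the Group 3/Group 4 fax coding that would follow is not shown.

```python
def to_bilevel(gray, width, height, threshold=128):
    """Pack row-major 8-bit grayscale samples into 1-bit-per-pixel rows.

    A set bit means the sample was at or above the threshold. Rows are
    padded to a whole number of bytes, most significant bit first.
    """
    if len(gray) < width * height:
        raise ValueError("not enough samples for the given dimensions")
    row_bytes = (width + 7) // 8
    out = bytearray(row_bytes * height)
    for y in range(height):
        for x in range(width):
            if gray[y * width + x] >= threshold:
                out[y * row_bytes + x // 8] |= 0x80 >> (x % 8)
    return bytes(out)
```

Even before any fax compression, the packed data is roughly an eighth of the raw 8-bit size, and clean text pages usually compress much further from there.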

Improving the quality of TIFF images

We have around 600,000 images that were converted from JPEG to TIFF files and uploaded to our FileNet repository. These TIFF images are multi-page, made by stitching multiple JPEGs together.
This was done a couple of years ago. Now we have started getting complaints from users that the quality of the TIFF images is not the same as it was when they were JPEGs.
Is there any way we can improve the quality of the TIFF files? If I have to re-migrate this data, can JPEGs have multiple pages? Please advise.
You can't just add quality to an image, so you can either try improving the appearance of the current information or you'll need to re-create the images to get better information.
To me, it sounds like the initial creation process is the most likely cause of the quality issue. How you create the image is important.
For example, I had a large number of photos I needed to re-size, so I used irfanview's batch convert and the results were horrible. Perhaps I had the settings wrong, I don't know.
I then tried using ImageMagick, and the results were great.
The point being, the conversion process isn't trivial.
If I were you, I'd look at how the images were created, experiment with different settings to determine what gives the best appearance, then re-create your photo gallery.
For photographic material, there's no real reason to use anything other than a jpeg if the target market is the general consumer.
Both TIFF and JPEG support lossless and lossy storage of your images. You mentioned that there was a previous conversion. That conversion was probably lossy, so you probably won't be able to recover the data to the way it was previously.
That said, if you have the original source images you might be able to get back to where you were. Regarding multi-image JPEGs, there is such a format (*.mpo), but I haven't seen it used before, so your mileage may vary.
You probably converted grayscale or color JPEG to TIFF. The most common is TIFF G4, which is only 1 bit per pixel. So 24 or 8 bits were converted to 1 bit, and you will see a lot of image quality loss. There are multiple methods to improve image quality, but I would have to see the images first to suggest a method.

Manipulating Individual Color Information Per Pixel Of A Video File

I am comfortable with several programming languages (stronger in C#, C, Java than the others) so please feel free to suggest whichever would provide me with a way to read in a (preferably uncompressed) video file and look at the color of each pixel in a frame, for every frame. So what I mean is, say in a 1 pixel display of a trivially small video that runs for 5 frames, are there standard library classes or ways I can access the 5 colors that one pixel will show during the video?
Having never worked with video properly I am not too clued up on the data structure a video file would use to represent the color information, or how one would manipulate this!
Many thanks
For processing uncompressed video data (as it might come off a camera) you get an array of pixel data per-frame; you probably want to read up about pixel formats and how frames are defined within the array, which will depend entirely on what is producing the video. The YUV444, YUV422 and YUV420 formats are quite common; they're expressed in the YUV colour space but you can readily convert between them and RGB (or indeed HSV) if that's what you want to do.
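As a sketch of what "an array of pixel data per frame" means in practice, the Python below reads the colour of one pixel from every frame of a raw planar YUV420 (I420) buffer. It rests on assumptions you would need to check against your actual source: frames are stored back to back with no headers, the plane order is Y, then U, then V, and the samples are full-range BT.601 (the JFIF convention). The function names are made up for this example.

```python
def yuv_to_rgb(y, u, v):
    """Convert one full-range BT.601 YUV sample to RGB, clamped to 0..255."""
    r = y + 1.402 * (v - 128)
    g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
    b = y + 1.772 * (u - 128)
    return tuple(max(0, min(255, round(c))) for c in (r, g, b))


def pixel_per_frame(raw, width, height, x, y):
    """Yield the RGB colour of pixel (x, y) for each I420 frame in raw."""
    y_size = width * height
    chroma_w, chroma_h = (width + 1) // 2, (height + 1) // 2
    chroma_size = chroma_w * chroma_h
    frame_size = y_size + 2 * chroma_size
    for start in range(0, len(raw) - frame_size + 1, frame_size):
        luma = raw[start + y * width + x]
        # In 4:2:0 one chroma sample is shared by a 2x2 block of pixels.
        chroma_index = (y // 2) * chroma_w + (x // 2)
        u = raw[start + y_size + chroma_index]
        v = raw[start + y_size + chroma_size + chroma_index]
        yield yuv_to_rgb(luma, u, v)
```

For a 5-frame clip, list(pixel_per_frame(raw, width, height, 0, 0)) gives the five colours that the top-left pixel shows over the course of the video.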
Compressed video formats are a nightmare unto themselves, but you can decompress them into a raw format with ffmpeg or a similar tool. (Be careful - uncompressed video quickly produces vast quantities of data!) Indeed, I would use ffmpeg's libraries to manipulate video, but they're written in C for speed, and I don't know whether bindings are available for Java or C#.

Image Steganography

I'm working on Steganography application. I need to hide a message inside an image file and secure it with a password, with not much difference in the file size. I am using Least Significant Bit algorithm and could do it successfully with BMP files but it does not work with JPEG, PNG or TIFF files. Does this algorithm work with these files at all? Is there a better way to achieve this? Thanks.
This heavily depends on the way the particular image format works. You'll need to dive into the internals of the format you want to use.
For JPEG, you could fiddle with the last bits of the DCT coefficients for each block.
For palette-based files (GIFs, and some PNGs), you could add extra colours to the palette that look identical to the existing ones, and encode information based on which one you use.
You'll have to distinguish between pixel-based (Bitmap) and palette-based formats (GIF) for which the steganographic technique is quite different. Also be aware that there are image formats like JPG that lose information in the compression process.
I'd also advise reading a general introduction to steganography that covers the different formats.
The Least Significant Bit approach does not work with JPEG and GIF images because you are using the pixel data (raw image) to store hidden information before compression. A pixel p with data 0x123456 will probably not have this value after compression, because its value depends on the compression rate and the neighbouring pixels. In this case we are talking about algorithms that do not just compact the image (like a ZIP, which keeps the content) but change the color distribution, texture, and quality in order to decrease the number of bits needed to represent it.
However, PNG only compacts the image in the same sense as a ZIP file, keeping the content. Therefore you can use the Least Significant Bit approach for PNG images, which is why the Wikipedia Steganography page shows an example in this format.
As long as the image format is lossless, you can use the LSB steganography in pixels (BMP, PNG, TIFF, PPM). If it is lossy, you have to try something else, as compression and subsequent decompression cause small changes in the pixels and the message is gone. In GIF, you can embed your message into the palette. In JPEG you change the DCT coefficients, a low-level frequency representation of the image, which can be read from and saved as JPEG file losslessly.
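For the lossless case, the LSB idea can be sketched in a few lines of Python operating on the decoded pixel bytes. This is a bare illustration under stated assumptions: the names embed and extract are made up, the caller is assumed to know the message length, and password protection (for example, encrypting the message first) is left out entirely.

```python
def embed(pixels, message):
    """Return a copy of pixels with message bits stored in the lowest bit
    of successive bytes (most significant message bit first)."""
    bits = [(byte >> shift) & 1
            for byte in message
            for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("message does not fit in the cover data")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)


def extract(pixels, length):
    """Read back length message bytes from the lowest bits of pixels."""
    if length * 8 > len(pixels):
        raise ValueError("not enough data for the requested length")
    message = bytearray()
    for i in range(length):
        value = 0
        for sample in pixels[i * 8:(i + 1) * 8]:
            value = (value << 1) | (sample & 1)
        message.append(value)
    return bytes(message)
```

Each cover byte changes by at most 1, which is why the file size stays the same and why the message only survives if the container stores those bytes exactly.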
There is extensive research on steganography in JPEG. For an introduction, I personally recommend Steganography in Digital Media: Principles, Algorithms, and Applications by Jessica Fridrich, which is must-read material for serious attempts at steganography. The approaches for various image formats are discussed in depth there.
Also, LSB is inefficient and very easily detectable, so you should not use it. There are better algorithms, although they are usually complex and heavy on math. Look for "steganography embedding distortion" and "steganography codes".

How to create an image from a raw data of DICOM image

I have a raw pixel data in a byte[] from a DICOM image.
Now I would like to convert this byte[] to an Image object.
I tried:
Image img = Image.FromStream(new MemoryStream(byteArray));
but this is not working for me. What else should I be using ?
One thing to be aware of is that a dicom "image" is not necessarily just image data. The dicom file format contains much more than raw image data. This may be where you're getting hung up. Consider checking out the dicom file standard which you should be able to find linked on the wikipedia article for dicom. This should help you figure out how to parse out the information you're actually interested in.
You have to do the following:
1. Identify the PIXEL DATA tag in the file. You may use FileStream to read byte by byte.
2. Read the pixel data.
3. Convert it to RGB.
4. Create a Bitmap object from the RGB data.
5. Use the Graphics class to draw the Bitmap on a panel.
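The "convert it to RGB" step depends on how the samples are stored. As one hedged example, the Python sketch below assumes uncompressed, unsigned 16-bit little-endian grayscale samples (a common case, but not the only one) and stretches them linearly to 8-bit grey RGB triplets. The function name gray16_to_rgb24 is made up, and the sketch deliberately ignores the rescale and window attributes that a real viewer would apply.

```python
import struct


def gray16_to_rgb24(pixel_bytes):
    """Map unsigned 16-bit little-endian grey samples to 8-bit RGB triplets
    by stretching the observed minimum..maximum range to 0..255."""
    count = len(pixel_bytes) // 2
    if count == 0:
        return b""
    samples = struct.unpack(f"<{count}H", pixel_bytes[:count * 2])
    low, high = min(samples), max(samples)
    span = (high - low) or 1  # avoid dividing by zero on a flat image
    out = bytearray()
    for sample in samples:
        grey = (sample - low) * 255 // span
        out += bytes((grey, grey, grey))
    return bytes(out)
```

The resulting buffer has 3 bytes per pixel in row order, which is the kind of layout a 24-bit bitmap expects (subject to channel order and row padding on the C# side).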
The pixel data usually (if not always) ends up at the end of the DICOM data. If you can figure out width, height, stride and color depth, it should be doable to skip to the (7FE0,0010) data element value and just grab the succeeding bytes. This is the trick that most normal image viewers use when they show DICOM images.
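The skip-ahead trick described above could look roughly like this in Python. It is a naive sketch with loud caveats: it assumes little-endian encoding and uncompressed pixel data, it simply searches for the first occurrence of the tag bytes (which can misfire if the same bytes appear earlier, for instance inside an embedded icon), and the function name find_pixel_data is made up for this example.

```python
import struct

# Tag (7FE0,0010) as it appears in a little-endian DICOM stream.
PIXEL_DATA_TAG = b"\xe0\x7f\x10\x00"


def find_pixel_data(data):
    """Return the raw bytes of the Pixel Data element value."""
    pos = data.find(PIXEL_DATA_TAG)
    if pos < 0:
        raise ValueError("Pixel Data tag not found")
    vr = data[pos + 4:pos + 6]
    if vr in (b"OB", b"OW"):
        # Explicit VR: tag, 2-byte VR, 2 reserved bytes, 4-byte length.
        header = 12
        length_bytes = data[pos + 8:pos + 12]
    else:
        # Implicit VR: tag followed directly by a 4-byte length.
        header = 8
        length_bytes = data[pos + 4:pos + 8]
    if len(length_bytes) < 4:
        raise ValueError("truncated Pixel Data element")
    (length,) = struct.unpack("<I", length_bytes)
    if length == 0xFFFFFFFF:
        # Undefined length means encapsulated (compressed) pixel data.
        raise ValueError("encapsulated pixel data is not handled here")
    start = pos + header
    return data[start:start + length]
```

You still need the width, height and bit depth from the other data elements to interpret the returned bytes, so for anything beyond a quick experiment one of the libraries mentioned in the other answers is the safer route.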
There is a C# library called EvilDicom (http://rexcardan.com/evildicom/) that can be used to pull the image out of a DICOM file. It has a tutorial on how to do it on the website.
You should use GDCM.
Grassroots DiCoM is a C++ library for DICOM medical files. It is automatically wrapped to python/C#/Java (using swig). It supports RAW, JPEG 8/12/16bits (lossy/lossless), JPEG 2000, JPEG-LS, RLE and deflated (zlib).
It is portable and is known to run on most systems (Win32, Linux, MacOSX).
http://gdcm.sourceforge.net/wiki/index.php/GDCM_Release_2.4
See for example:
http://gdcm.sourceforge.net/html/DecompressImage_8cs-example.html
Are you working with a pure, standard DICOM file? I've been maintaining a DICOM parser for over two years and I've come across some really strange DICOM files that didn't completely follow the standard (companies implementing their "own" twisted variant of DICOM). Flush your byte array into a file and test whether your image viewer (IrfanView, Picasa or whatever) can show it. If your code works with a normal JPEG stream then, from my experience, there is a 99.9999% chance that this is simply because the file violates the standard in some strange way (and believe me, medical companies do that a lot).
Also note that the DICOM standard supports several variants of the JPEG standard. It could be that the Bitmap class doesn't support the data you get from the DICOM file. Can you please write down the transfer syntax?
You are welcome to send me the file (if it's not big) at yossi1981#gmail.com and I can check it out. There was a time when I was hex-editing DICOM files for half a year.
DICOM is a ridiculous specification and I sincerely hope it gets overhauled in the near future. That said Offis has a software suite "DCMTK" which is fairly good at converting dicoms with the various popular encodings. Just trying to skip ahead in the file x-bytes will probably be fine for a single file but if you have a volume or several volumes a more robust strategy is in order. I used DCMTK's conversion code and just grabbed the image bits before they went into a pnm. The file you'll be looking for in DCMTK is dcm2pnm or possibly dcmj2pnm depending on the encoding scheme.
I had a problem with the scale window that I fixed with one of the runtime flags. DCMTK is open source and comes with fairly simple build instructions.
