I'm working on extracting images that were saved in an Access database. Some were saved as Bitmap Image, and some were saved as Microsoft Photo Editor (which shows up as MSPhotoEd.3). For each record containing a bitmap image, I obtain the hexadecimal data that is saved, use the file signatures to locate the image data, and convert it to binary. This works perfectly and as expected.
I'm currently attempting to do the same thing for the Microsoft Photo Editor images; however, I'm having a difficult time finding documentation on how this file format is structured. Going through the hex, I see signatures that would correlate to a JPEG file, but there are several of them (3 FFD8FF markers) and they do not match up with each other (22 FFD9 trailers). It could be that this format is completely different and those just happen to show up... I'm not sure.
Has anyone attempted to do this before? Is it possible to extract a JPEG image from this file format (Microsoft Photo Editor)? If not, does anyone know the best way to convert this file format to any other usable image format programmatically?
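For anyone experimenting with the signature approach described above, here is a minimal Python sketch that lists where the FFD8FF markers occur and carves one candidate per marker. The function names `find_all` and `carve_jpeg_candidates` are made up for this illustration, and ending every candidate at the last FFD9 in the blob is only a heuristic (it assumes the decoder tolerates extra bytes inside or after the image). Nothing here is specific to the MSPhotoEd.3 layout, which remains undocumented as far as the question goes.

```python
def find_all(data: bytes, pattern: bytes) -> list:
    """Return every offset at which pattern occurs in data."""
    offsets = []
    pos = data.find(pattern)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(pattern, pos + 1)
    return offsets


def carve_jpeg_candidates(data: bytes) -> list:
    """Return one candidate byte string per FFD8FF marker.

    Heuristic (an assumption of this sketch): each candidate runs from
    its start marker up to and including the LAST FFD9 trailer in the blob.
    """
    last_end = data.rfind(b"\xff\xd9")
    candidates = []
    for start in find_all(data, b"\xff\xd8\xff"):
        if last_end > start:
            candidates.append(data[start:last_end + 2])
    return candidates
```

Each candidate would still need to be handed to a real image decoder to see whether it opens; the extra start markers may turn out to be embedded thumbnails or coincidental byte sequences.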
Related
If you search for "add image to pdf" on the Internet, you will find many useful articles. However, none of them meet my requirements.
I want to add an image to a certain place inside an existing PDF file, for instance inside a text box.
I am not certain exactly how you need the image added to your PDF, but there are a number of approaches you can consider:
1- Load the PDF as a rasterized image and draw the image at your desired location.
2- Add the image as an annotation to the PDF.
3- Convert the PDF to a format that allows easy modification of text and insertion of images.
Loading the PDF as a rasterized image is the most direct approach. However, your text will no longer be searchable and any other PDF objects (Annotations, Hyperlinks) will all become part of one image (no longer objects). But using this approach you can simply draw the image at the exact place you need. If you want to restore text searchability after doing this, you can use an OCR engine to process the text in the resulting image.
The ImageMagick library uses the Ghostscript common engine for dealing with PDF, and it can convert PDF pages to images. There's a .NET wrapper for ImageMagick to use with C#. For OCR, there are free engines like MODI or Tesseract.
Adding the image as an annotation allows you to maintain the original format and text in the PDF, though the image will be treated as a separate object from the text and will not be “in-line”. Annotations can also be drawn at the exact location you need without too much difficulty.
LibreOffice Draw and Okular are options you can consider for drawing annotations.
Finally, you could simply convert the PDF to a format that is easier to process and edit, like DOC, add your image, then convert it back to PDF.
A special camera that I am using returns images only in bitmap format.
However, the image processing that I apply to those images fails every time because of the image format.
If the image is in ".jpeg" format, everything works perfectly.
My question is:
Is there any way to convert a bitmap to JPEG without saving to the file system?
I saw a few answers that do the conversion by saving the image to the file system. That's not what I want. I need to convert and return those images.
I saw that C# has a class (below), but I could not use it because I didn't know how:
System.Drawing.ImageFormatConverter
Thanks in advance for your time and valuable help.
You probably want to look at imageresizing.net.
FreeImage could also be an alternative, though I've found it a bit buggy at times.
I need to store XML data on a server that only accepts JPEG images. I thought of writing my XML data inside a valid JPEG file. After all, other than the JPEG header, the content of the image file is arbitrary data, right?
Is it possible to produce a valid jpeg file, but have its "body" filled with custom bytes?
Of course, I also need to be able to decode the custom jpeg file and restore the data.
I'm not familiar with the jpeg file format, so I'd appreciate an explicit example.
Perhaps just appending the data to a small jpeg will work?
Create a small jpeg.
Append your (obfuscated/encrypted) XML to the file.
Upload to server.
FWIW, you can easily see this works using a Hex editor. Just create a small jpeg and append your xml to the end. Then open it using any image editor.
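The append approach above can be sketched in Python roughly as follows. The `MAGIC` delimiter, the trailer layout, and the function names `embed` and `extract` are all invented for this illustration; the only thing taken from the answer is the idea of tacking the payload onto the end of a small JPEG. It also assumes the server stores the upload byte for byte, since a server that re-encodes or strips images would lose the appended data.

```python
import struct

# Hypothetical delimiter chosen for this sketch; any fixed byte string works.
MAGIC = b"XMLPAYLOAD1"


def embed(jpeg: bytes, payload: bytes) -> bytes:
    """Append payload to a JPEG, followed by a small trailer.

    Layout: original JPEG | payload | MAGIC | 4-byte big-endian payload length
    """
    if not jpeg.startswith(b"\xff\xd8"):
        raise ValueError("not a JPEG: missing FFD8 start marker")
    return jpeg + payload + MAGIC + struct.pack(">I", len(payload))


def extract(blob: bytes) -> bytes:
    """Recover the payload written by embed()."""
    trailer_len = len(MAGIC) + 4
    if len(blob) < trailer_len or blob[-trailer_len:-4] != MAGIC:
        raise ValueError("no embedded payload found")
    (length,) = struct.unpack(">I", blob[-4:])
    start = len(blob) - trailer_len - length
    if start < 0:
        raise ValueError("corrupt trailer: payload length exceeds file size")
    return blob[start:start + length]
```

Putting the length at the very end means extraction does not have to search for the end of the image data, and it still works if the payload happens to contain the delimiter bytes.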
This is a perfectly valid thing to do to a jpeg file:
Will random data appended to a JPG make it unusable?
Uhmmm it is a strange architecture... but anyway I think this post would be useful:
How to Add 'Comments' to a JPEG File Using C#
So the proposal is to add the data as metadata of a blank JPEG image.
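As a language-neutral illustration of the metadata idea (the linked post uses C#; this is not its code), here is a minimal Python sketch that writes the data into a JPEG comment (COM, marker FFFE) segment and reads it back. The function names are hypothetical. A segment length field is 16 bits and counts its own two bytes, so one comment holds at most 65533 bytes; larger data would need several segments. The sketch places the comment after any leading APPn segments and assumes a simple, well-formed header.

```python
import struct


def _skip_app_segments(jpeg: bytes, pos: int) -> int:
    """Advance pos past consecutive APPn (FFE0..FFEF) segments."""
    while (pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF
           and 0xE0 <= jpeg[pos + 1] <= 0xEF):
        (seg_len,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        pos += 2 + seg_len
    return pos


def add_comment(jpeg: bytes, data: bytes) -> bytes:
    """Insert data as one COM segment after the leading APPn segments."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing FFD8 start marker")
    if len(data) > 65533:
        raise ValueError("a single COM segment holds at most 65533 bytes")
    segment = b"\xff\xfe" + struct.pack(">H", len(data) + 2) + data
    pos = _skip_app_segments(jpeg, 2)
    return jpeg[:pos] + segment + jpeg[pos:]


def read_comments(jpeg: bytes) -> list:
    """Return the body of every COM segment found before the scan data."""
    comments = []
    pos = 2
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF:
        marker = jpeg[pos + 1]
        if marker in (0xDA, 0xD9):  # start of scan or end of image: stop
            break
        (seg_len,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        if marker == 0xFE:
            comments.append(jpeg[pos + 4:pos + 2 + seg_len])
        pos += 2 + seg_len
    return comments
```

Unlike appended bytes, a comment segment is part of the JPEG structure itself, though tools that strip metadata would still remove it.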
If you want to add your data as the actual JPEG data, you first create a BitmapSource with BitmapSource.Create and put your data in the buffer parameter. Then use the JpegBitmapEncoder to save it as a JPEG file (an example is here).
However, as far as I know, the .NET JPEG encoder is not lossless (even if you set its quality to 100%), so you will need a third-party library that can encode JPEG losslessly.
I don't know of a JPEG-specific way, but there is a PNG/GIF method to encode arbitrary data as pixels. Check out this post. Some sites allow you to upload PNGs and GIFs renamed to JPEG, so you could try that.
http://blog.nihilogic.dk/2008/05/compression-using-canvas-and-png.html
He's saving JavaScript, but you could use any text, really.
My C# application receives image files from KOFAX VRS TWAIN driver in TWSX_FILE mode, but neither my own .NET based application nor Windows default image viewer can open these files. However, Adobe Photoshop can open them without any problem.
I tried FreeImage library and although it detects their dimensions correctly it renders black images.
It seems that KOFAX has some kind of bitmap format of its own, whose header is different from normal BMP files:
http://www.fileformat.info/mirror/egff/ch03_03.htm
I have uploaded one of these files here:
http://www.box.net/shared/aby42aagz4
I would like to know how I can open these images in my applications. Does anybody know of a lightweight open source/free library, or a C++/C# code snippet, that supports this image format?
You've basically answered your own question: The file is neither a Windows bitmap file nor is it in the documented Kofax Raster Format.
As you pointed out, the first two bytes are 'BM', which would indicate the file is purporting to be a Windows bitmap. However, if that were truly the case the next four bytes would contain the file size. In your sample file, the next four bytes contain a value much bigger than the actual file size so it can't be correctly interpreted as a Windows bitmap file.
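The size check described above can be reproduced with a few lines of Python. This is only a sketch: the function name `check_bmp_header` and the returned dictionary keys are invented here, and it looks at nothing beyond the first six bytes (the 'BM' signature followed by the 4-byte little-endian file size). Some readers may ignore the size field, so a mismatch is best treated as a strong hint rather than proof.

```python
import struct


def check_bmp_header(data: bytes) -> dict:
    """Compare the file size declared in a BMP header with the real size."""
    if len(data) < 6:
        raise ValueError("too short to contain a BMP file header")
    # "<2sI": 2 signature bytes, then an unsigned 32-bit little-endian size.
    magic, declared = struct.unpack("<2sI", data[:6])
    return {
        "is_bm": magic == b"BM",
        "declared_size": declared,
        "actual_size": len(data),
        "size_matches": declared == len(data),
    }
```

Typical use would be `check_bmp_header(open(path, "rb").read())`, where `path` is whatever file you want to inspect.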
As the fileformat.info site you linked to states, if the file was truly in Kofax Raster Format, it would start with the bytes '68464B2Eh'. Thus, your file isn't in Kofax Raster Format either. In fact, I tried opening it with Kofax's VCDemo software and got the following error: "Error 20204 - Internal invalid state"
Thus, Kofax's own software thinks the file is corrupt.
The fact that Photoshop can open it and display something doesn't necessarily mean it's a valid image file format. Image processing software packages will often simply try to guess at interpreting the raw bytes of the file. Sometimes they get lucky, sometimes not.
Trying to find something that can read the files assumes that the file is in a standard format, which it isn't. Thus, I wouldn't search for something that could read the file but instead search for why the VRS/TWAIN configuration you are using is producing a non-standard format.
I have raw pixel data in a byte[] from a DICOM image.
Now I would like to convert this byte[] to an Image object.
I tried:
Image img = Image.FromStream(new MemoryStream(byteArray));
but this is not working for me. What else should I be using?
One thing to be aware of is that a DICOM "image" is not necessarily just image data. The DICOM file format contains much more than raw image data. This may be where you're getting hung up. Consider checking out the DICOM standard, which you should be able to find linked from the Wikipedia article on DICOM. This should help you figure out how to parse out the information you're actually interested in.
You have to do the following:
1- Identify the PIXEL DATA tag in the file. You may use FileStream to read byte by byte.
2- Read the pixel data.
3- Convert it to RGB.
4- Create a Bitmap object from the RGB data.
5- Use the Graphics class to draw the Bitmap on a panel.
The pixel data usually (if not always) ends up at the end of the DICOM data. If you can figure out width, height, stride and color depth, it should be doable to skip to the (7FE0,0010) data element value and just grab the succeeding bytes. This is the trick that most normal image viewers use when they show DICOM images.
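A rough Python sketch of that "skip to (7FE0,0010)" trick is shown below. It is a heuristic, not a DICOM parser: the function name is made up, it assumes a little-endian transfer syntax, it guesses explicit versus implicit VR from the two bytes after the tag, and it refuses encapsulated (compressed) pixel data, which is signalled by an undefined length of FFFFFFFF. It returns the raw bytes only; width, height, stride and color depth still have to come from elsewhere.

```python
import struct

# (7FE0,0010) as it appears on disk in little-endian byte order.
PIXEL_DATA_TAG = b"\xe0\x7f\x10\x00"


def extract_pixel_data(dicom: bytes) -> bytes:
    """Return the value of the first plausible Pixel Data element."""
    pos = dicom.find(PIXEL_DATA_TAG)
    while pos != -1:
        # Explicit VR: tag(4) + VR(2) + reserved(2) + length(4).
        # Implicit VR: tag(4) + length(4).
        explicit = dicom[pos + 4:pos + 6] in (b"OB", b"OW")
        length_at = pos + (8 if explicit else 4)
        field = dicom[length_at:length_at + 4]
        if len(field) == 4:
            (length,) = struct.unpack("<I", field)
            if length == 0xFFFFFFFF:
                raise ValueError("encapsulated (compressed) Pixel Data is not handled here")
            start = length_at + 4
            # Accept only a length that fits in the remaining bytes.
            if 0 < length <= len(dicom) - start:
                return dicom[start:start + length]
        pos = dicom.find(PIXEL_DATA_TAG, pos + 1)
    raise ValueError("no usable Pixel Data element found")
```

Because the same four bytes can also occur by chance, or inside a nested icon image, a library that walks the data elements properly is the safer choice for anything beyond a quick experiment.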
There is a C# library called EvilDicom (http://rexcardan.com/evildicom/) that can be used to pull the image out of a DICOM file. It has a tutorial on how to do it on the website.
You should use GDCM.
Grassroots DiCoM is a C++ library for DICOM medical files. It is automatically wrapped to Python/C#/Java (using SWIG). It supports RAW, JPEG 8/12/16 bits (lossy/lossless), JPEG 2000, JPEG-LS, RLE and deflated (zlib).
It is portable and is known to run on most systems (Win32, Linux, Mac OS X).
http://gdcm.sourceforge.net/wiki/index.php/GDCM_Release_2.4
See for example:
http://gdcm.sourceforge.net/html/DecompressImage_8cs-example.html
Are you working with a pure, standard DICOM file? I've been maintaining a DICOM parser for over two years, and I have come across some really strange DICOM files that didn't completely comply with the standard (companies implementing their "own" twisted version of standard DICOM files). Flush your byte array into a file and test whether your image viewer (IrfanView, Picasa or whatever) can show it. If your code works with a normal JPEG stream, then in my experience there is a 99.9999% chance that this is simply because the file violates the standard in some strange way (and believe me, medical companies do that a lot).
Also note that the DICOM standard supports several variants of the JPEG standard. It could be that the Bitmap class doesn't support the data you get from the DICOM file. Can you please write down the transfer syntax?
You are welcome to send me the file (if it's not big) at yossi1981#gmail.com and I can check it out. There was a time when I was hex-editing DICOM files for half a year.
DICOM is a ridiculous specification and I sincerely hope it gets overhauled in the near future. That said, OFFIS has a software suite, "DCMTK", which is fairly good at converting DICOM files with the various popular encodings. Just skipping ahead x bytes in the file will probably be fine for a single file, but if you have a volume or several volumes, a more robust strategy is in order. I used DCMTK's conversion code and just grabbed the image bits before they went into a PNM. The tool you'll be looking for in DCMTK is dcm2pnm, or possibly dcmj2pnm, depending on the encoding scheme.
I had a problem with the scale window that I fixed with one of the runtime flags. DCMTK is open source and comes with fairly simple build instructions.