Ghostscript: get CMYK values for rendering from PDF

I need to get the CMYK values used for rendering from the PDF.
I think they are the values in the range 0 to 1.0 under the C1 key.
Does anyone know how to get them?

The CMYK values have nothing to do with the 'C1' key. There may be a colour space defined as /C1, but it will not contain CMYK values.
Any object may be defined in a variety of colour spaces (Gray, RGB, CMYK, sRGB, Separation, DeviceN, NChannel, ICC and some special spaces). For those spaces which are not a device space (i.e. not Gray, RGB or CMYK) the colour is first converted into one of the device spaces. There are then rules defined in the PDF reference on how the device spaces are converted between themselves.
The actual colour components of an object will be defined in the content stream of the object (for vector objects in a page or Form context) or in the binary data (for images), or calculated from a function (Shading dictionaries).
In order to find any of these you will need to read the PDF file, decompressing streams as required, locate the object you want the information for and then determine the current colour space. Then you can convert the colour components from whatever colour space the object is defined in into CMYK.
Perhaps if you explained what your actual goal is it might be possible to be more helpful.
[UPDATE]
You could simply use Ghostscript to create a new, Grayscale, PDF by setting ColorConversionStrategy=Gray.
This has the advantage of working with all elements of the PDF not only images.
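As a rough sketch of that suggestion, the command line can be assembled like this. The executable name `gs` (on Windows it is typically something like `gswin64c`) and the file names are assumptions, and some older Ghostscript releases reportedly also want -dProcessColorModel=/DeviceGray, so check the documentation for your version.

```python
import subprocess


def gray_pdf_command(src, dst, gs_executable="gs"):
    """Build a Ghostscript command that rewrites src as a grayscale PDF.

    gs_executable is an assumption: the binary name differs per platform.
    """
    return [
        gs_executable,
        "-sDEVICE=pdfwrite",               # write a new PDF
        "-sColorConversionStrategy=Gray",  # convert every colour to gray
        "-o", dst,                         # output file (also implies batch mode)
        src,
    ]


# To actually run it (requires Ghostscript to be installed):
# subprocess.run(gray_pdf_command("input.pdf", "gray.pdf"), check=True)
```

The same argument list can be passed to a process launcher from C#.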
You do realise that a PDF file does not normally consist solely of a raster image? There can be text, linework and shadings, and transparency groups can also be defined as operating in a given colour space. This is not a simple task.
If you are really only dealing with images then the ColorSpace is defined in the image dictionary (it may be an indirect reference). You will have to parse the PDF file (potentially decompressing it) to find the colour space definition. The sample values for each component are then given by the image data. These range from 0 up to 2^BPC - 1, so at most 65535 (BPC is the BitsPerComponent value in the image dictionary: 1, 2, 4, 8 or 16), and you will have to apply the Decode array to map the values into a range suitable for the colour space.
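The Decode mapping is a linear interpolation. A minimal sketch in Python, assuming you have already pulled the raw sample, the BitsPerComponent value and one (Dmin, Dmax) pair of the Decode array out of the image dictionary yourself (the function and parameter names are made up for illustration):

```python
def decode_sample(raw, bits_per_component, d_min, d_max):
    """Map a raw image sample onto the range given by one Decode pair.

    raw ranges from 0 to 2**bits_per_component - 1; the result lies
    between d_min and d_max (which may be reversed, e.g. 1.0 and 0.0).
    """
    max_raw = (1 << bits_per_component) - 1
    return d_min + raw * (d_max - d_min) / max_raw
```

For an 8-bit DeviceCMYK image with the default Decode array [0 1 0 1 0 1 0 1], each of the four components is simply raw / 255.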
If you then want gray scale, you will have to apply a conversion to Gray. Complex spaces will include a method to map to a device space, and the conversion between device spaces is covered in the PDF reference manual. For ICCBased spaces you will need an ICC colour management engine. You might like to consider LCMS, or you could write your own.
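For the device spaces only, the conversions to gray given in the PDF reference can be sketched as below. All components are assumed to be in the range 0.0 to 1.0. This does nothing for ICCBased, Separation or other complex spaces, which need their own mapping first.

```python
def rgb_to_gray(r, g, b):
    """DeviceRGB to DeviceGray, components in 0.0..1.0."""
    return 0.3 * r + 0.59 * g + 0.11 * b


def cmyk_to_gray(c, m, y, k):
    """DeviceCMYK to DeviceGray, components in 0.0..1.0."""
    return 1.0 - min(1.0, 0.3 * c + 0.59 * m + 0.11 * y + k)
```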

Related

Counting and identifying colours in a vector image using ImageMagick

Customers upload image files, typically logos, to a web site, and I would like to be able to identify which colours the images contain.
I have more or less given up on bitmap images, since anti-aliasing introduces so many variations of each colour, but for vector images (eps, svg and ai, to mention a few that could occur) I want to believe it should be doable.
The ideal solution would enable me to produce a list of colours which the user can verify; "Your image contains 3 colours: 111c, 222c and 333c, are these the colours you would like to use for printing?"
I am using Magick.net and C#. I am able to read the files into "MagickImage" instances, but I am lost on how to proceed in identifying the colours.

Let's say you start with this image, which is made from random colours then reduced down to just 18 colours:
convert -size 256x256 xc:black +noise random -colors 18 image.png
Now, you can get a list of the colours like this:
convert image.png -unique-colors -depth 8 txt:
Sample Output
# ImageMagick pixel enumeration: 18,1,65535,srgb
0,0: (7967,24415,7967) #1F5F1F srgb(31,95,31)
1,0: (24415,24672,8481) #5F6021 srgb(95,96,33)
2,0: (16191,12336,16448) #3F3040 srgb(63,48,64)
3,0: (8224,8224,24158) #20205E srgb(32,32,94)
4,0: (24672,24415,24415) #605F5F srgb(96,95,95)
5,0: (49344,16191,16448) #C03F40 srgb(192,63,64)
6,0: (16448,46260,13878) #40B436 srgb(64,180,54)
7,0: (8224,57311,24415) #20DF5F srgb(32,223,95)
8,0: (24415,57568,24415) #5FE05F srgb(95,224,95)
9,0: (49087,49344,16448) #BFC040 srgb(191,192,64)
10,0: (13364,16448,46517) #3440B5 srgb(52,64,181)
11,0: (24415,8224,57311) #5F20DF srgb(95,32,223)
12,0: (24415,24415,57054) #5F5FDE srgb(95,95,222)
13,0: (40863,24672,40863) #9F609F srgb(159,96,159)
14,0: (50372,15163,50115) #C43BC3 srgb(196,59,195)
15,0: (16448,49087,49344) #40BFC0 srgb(64,191,192)
16,0: (41120,40863,41120) #A09FA0 srgb(160,159,160)
17,0: (50372,50372,50372) #C4C4C4 grey77
And maybe you would like a swatch too:
convert image.png -unique-colors -scale 400x40 swatch.png
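To turn that text output into a list you can show to the user, something like the following sketch would do. It assumes the `txt:` format shown in the sample output above; the function name is made up, and the regular expression may need adjusting for other ImageMagick versions.

```python
import re

# Matches lines such as "0,0: (7967,24415,7967) #1F5F1F srgb(31,95,31)"
LINE_RE = re.compile(r"^\d+,\d+:\s*\([^)]*\)\s+(#[0-9A-Fa-f]+)")


def parse_unique_colours(txt):
    """Return the hex colour of every pixel line in ImageMagick txt: output."""
    colours = []
    for line in txt.splitlines():
        match = LINE_RE.match(line.strip())
        if match:  # the "# ImageMagick pixel enumeration" header is skipped
            colours.append(match.group(1).upper())
    return colours
```

The length of the returned list is the number of distinct colours in the image.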

"OCR" characters/short sequences of text/numbers

Here is my problem: I have to identify numbers (such as 853 / 52) and some text (containing around 8 letters of the alphabet) from a bitmap, and I have to do that really fast.
Tesseract does the trick, but its execution time is a bit too slow for my liking. Since I have such a limited set of characters that are always the same font and font size, I was thinking I could just extract them all and build a lookup table of certain characteristics for each character.
Yet to achieve this I would have to be able to "split" up a bitmap containing, let's say, 853 into its individual characters (kind of box them, as some of those OCR trainers do).
Unfortunately I have no idea how to start boxing/separating them. Any suggestions would be appreciated.

Thank you for the article.
I kind of solved half my problem. If I use AForge I can run the images through a set of filters (in my case I increase contrast before I grayscale and binarize) and then run blob extraction on them, which allows me to chop up the picture. Now I have a clean set of character images that I will "only" have to match against comparative ones.
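For a fixed font with gaps between characters, a simpler alternative to blob extraction is a column projection. This is not what AForge does; it is only a sketch of the idea, assuming the image has already been binarized into rows of 0 (background) and 1 (ink):

```python
def split_columns(bitmap):
    """Return (start, end) column ranges, end exclusive, that contain ink.

    bitmap is a list of equal-length rows holding 0 or 1. Characters are
    assumed to be separated by at least one completely empty column.
    """
    if not bitmap:
        return []
    width = len(bitmap[0])
    has_ink = [any(row[x] for row in bitmap) for x in range(width)]

    spans = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                  # a character begins
        elif not ink and start is not None:
            spans.append((start, x))   # a character ends
            start = None
    if start is not None:
        spans.append((start, width))   # character touching the right edge
    return spans
```

Each span can then be cropped out and compared against the reference images.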

Identify RGB and CMYK in a PDF file

I know this question has been asked before but it doesn't explain much and as I don't have a reputation to comment there I am asking this question.
The answer that was provided in the aforementioned thread retrieves the r g and b values but I don't know what tells if the values that are found show what part is CMYK (as I understand that after rendering all values are converted into RGB).
I need to first identify what color system is used in a pdf file, I understand now that CMYK and RGB can be simultaneously used in a single file. So I need to analyze the pdf file in my C# application and find a way to convert the CMYK parts to RGB if need be.
I learned that conversion can be done using ABCDpdf.

This is a very broad question and it would be better for you if you read at least part of the PDF specification. To give you a taste of why I'm saying this...
PDF and color spaces
1) PDF can contain a wide range of color spaces
- Device color spaces such as RGB, CMYK and Gray
- Abstract color spaces such as Lab
- ICC profile based color spaces such as ICC-based RGB, ICC-based Lab, ...
- Named or special color spaces such as Separation, Device-N and N-Channel
(and I'm omitting some charming ones such as patterns and shadings)
2) All of the above color spaces can be used throughout a single PDF file. When your PDF file is compliant to certain ISO standards (such as PDF/A, PDF/X...) it has to obey rules that restrict the number of color spaces, but in general all color spaces are allowed in a single PDF.
3) Where a PDF file is used determines how these color spaces need to be handled. If you want to print to a desktop printer with CMYK inks, something is going to convert all of these color spaces to CMYK. If you're viewing the PDF file on screen, something is going to convert all of these color spaces to RGB.
Converting colors
Yes, you can convert from CMYK (and all of these other color spaces I mentioned) to RGB. But that is also much more difficult than it may sound if you want to do it correctly. As an example have a look at this site: http://www.rapidtables.com/convert/color/cmyk-to-rgb.htm
It contains quick and easy to use formulas for this conversion:
R = 255 × (1-C) × (1-K)
G = 255 × (1-M) × (1-K)
B = 255 × (1-Y) × (1-K)
This will work, but in practice you need an engine (for example LittleCMS) that uses ICC profiles (which characterise color spaces) to do a proper conversion.
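Written out as code, the quick formulas look like this (a sketch of the naive conversion only, with C, M, Y and K assumed to be in the range 0.0 to 1.0; it ignores ICC profiles entirely):

```python
def cmyk_to_rgb_naive(c, m, y, k):
    """Naive CMYK to 8-bit RGB using R = 255 * (1 - C) * (1 - K), etc."""
    return tuple(round(255 * (1 - x) * (1 - k)) for x in (c, m, y))
```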

16bit greyscale image to heatmap

I'm working on scientific imaging software for my university, and I've encountered a major problem. The scientific camera (Apogee Alta U57) at my lab provides images as a 16bpp array, so each pixel has a value from 0 to 65535. We want to keep this range, but we can't display it on a monitor (0-255 grayscale range). So I found a way to resolve this problem: make use of colors and display the whole image as a heatmap (from black, through blue, green and red, to pure white).
I mean something like this: [image: example heatmap I want to achieve]
My only question is: how do I efficiently convert a 16bpp array of pixel values to a complete heatmap bitmap in C#? Are there any libraries for doing that? If not, how do I achieve that using .NET resources?
My idea was to create a function that maps the 65536 values to (R, G, B) triples with 0-255 per channel, but it's a tough job, especially without using the HSV model.
I would be much obliged for any help provided!

Your question consists of several parts:
1) reading in the 16 bit pixel data values
2) mapping them to 24 bit rgb colors
3) writing them out to an image file
I'll skip parts 1 and 3 and give you a few ideas about part 2.
It is in fact harder than it seems. A unique mapping that doesn't lose any information is simple, in fact trivial: a little bit-shifting will do.
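For illustration, one such lossless mapping puts the high byte in red and the low byte in green. The channel assignment is an arbitrary choice for this sketch, and the result is not visually meaningful:

```python
def pack16(value):
    """Split a 16-bit value into an (R, G, B) triple without losing bits."""
    return (value >> 8, value & 0xFF, 0)


def unpack16(rgb):
    """Recover the original 16-bit value from a triple made by pack16."""
    r, g, _ = rgb
    return (r << 8) | g
```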
But you also want the result to work visually, meaning not so much that it should be visually appealing but that it should make sense to the human eye. So we need a mapping that has a credible yet large enough gradient.
For this you should experiment a little. I suggest making use of the LinearGradientBrush, as I show here. Have a look at the interpolateColors function! It uses only 6 colors in the example, way too few for your case!
You should pick many more; you may need to go through the color space in a spiral..
The trick for you will be to choose both nice and enough stop colors to create a 64k large set of unique colors, best going from blueish to reddish..
You will need to test the result for uniqueness; in fact you may want to create a pair of Dictionary objects, one for each direction of the mapping.
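As a starting point for experimenting, here is a sketch that builds a 65536-entry lookup table by plain linear interpolation between stop colors. It is not the LinearGradientBrush approach, and the five stops are just the black, blue, green, red, white sequence from the question. With so few stops the table is far from unique, which is exactly why more stops are needed.

```python
def build_heatmap_lut(stops, size=65536):
    """Return a list of (R, G, B) tuples, one per input value 0..size-1."""
    segments = len(stops) - 1
    lut = []
    for value in range(size):
        pos = value / (size - 1) * segments   # position along the gradient
        i = min(int(pos), segments - 1)       # index of the left stop
        t = pos - i                           # 0.0..1.0 within the segment
        a, b = stops[i], stops[i + 1]
        lut.append(tuple(round(a[ch] + (b[ch] - a[ch]) * t) for ch in range(3)))
    return lut


STOPS = [(0, 0, 0), (0, 0, 255), (0, 255, 0), (255, 0, 0), (255, 255, 255)]
lut = build_heatmap_lut(STOPS)
unique_colors = len(set(lut))  # far fewer than 65536 with only five stops
```

Converting an image is then one table lookup per pixel, which also answers the efficiency part of the question.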

image conversion

I need to convert an RGB image (jpg) to grayscale CMYK using only the black channel (K).
I'm trying to do this with ImageGlue, but the result is not what I'm looking for, since it builds the grays from the C, M and Y channels and leaves the black channel at 0%.
Does anyone have experience with another .NET library or API that could do this?

I would start by looking at the ColorConvertedBitmap class in WPF. Here is a link to the docs and a basic example:
http://msdn.microsoft.com/en-us/library/system.windows.media.imaging.colorconvertedbitmap(VS.85).aspx

Have you tried AForge.NET?
There is also ImageMagick, a C++ framework for image processing, with a .NET wrapper (google for MagickNet).

Here is an RGB to/from CMYK question which is related to this one:
How is 1-bit bitmap data converted to 8bit (24bpp)?

I found the bitmap transform classes useful when trying to do some image format conversions, but CMYK is one of the most complicated conversions you can tackle because there is more than one way to represent some colours. In particular, equal C, M and Y percentages give you shades of grey which are equivalent to the same percentage of K. Printers often use undercolour removal/transformation, which normalises CMYK so that a large common percentage is taken from C, M and Y and transferred to K. This is supposed to give purer blacks and grey tones. So even if you have a greyscale image represented using nothing but C, M and Y with a zero black channel, it could still print using nothing but K when you get it to a printer using undercolour removal.
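Both ideas can be sketched in a few lines. The luminance weights (0.3, 0.59, 0.11) and the "move the whole common part to K" rule are assumptions for illustration; real printer undercolour removal is device specific and usually driven by ICC profiles.

```python
def rgb_to_k_only(r, g, b):
    """Convert 8-bit RGB to CMYK (0.0..1.0) using only the K channel."""
    gray = (0.3 * r + 0.59 * g + 0.11 * b) / 255.0   # 0.0 black .. 1.0 white
    k = min(1.0, max(0.0, 1.0 - gray))
    return (0.0, 0.0, 0.0, k)


def undercolour_removal(c, m, y, k=0.0):
    """Move the percentage common to C, M and Y into K (full removal)."""
    common = min(c, m, y)
    return (c - common, m - common, y - common, min(1.0, k + common))
```

A grey built from equal C, M and Y, such as (0.5, 0.5, 0.5, 0.0), comes out as (0.0, 0.0, 0.0, 0.5): nothing but K.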