How can I store 12-bit values in a ushort? - c#

I've got a stream coming from a camera that is set to a 12-bit pixel format.
My question is: how can I store the pixel values in an array?
Previously I was taking pictures with a 16-bit pixel format and stored the values in a ushort array. Now that I have changed to 12 bit, the same full-size image is displayed as four images next to one another on the screen.
When I have the camera set to an 8-bit pixel format I store the data in a byte array, but what should I use when it is set to 12 bit?

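Following on from my comment, we can process the incoming stream in 3-byte "chunks", each of which gives 2 pixels.
// for a "chunk" of the incoming byte array: a[0], a[1], a[2]
// pixel1 = all 8 bits of a[0] followed by the high 4 bits of a[1]
ushort pixel1 = (ushort)((a[0] << 4) | (a[1] >> 4));
// pixel2 = the low 4 bits of a[1] followed by all 8 bits of a[2]
ushort pixel2 = (ushort)(((a[1] & 0x0F) << 8) | a[2]);
(Assuming big-endian packing; check the camera's documentation for the actual bit layout.)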
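To apply this to a whole frame, the chunk logic can be wrapped in a loop. The following is only a sketch: the class and method names (`Packed12`, `Unpack`) are made up for illustration, and it assumes the layout where the first byte holds the top 8 bits of the first pixel, the second byte is shared between the two pixels, and the third byte holds the low 8 bits of the second pixel. Real cameras may pack the bits differently.

```csharp
using System;

public static class Packed12
{
    // Unpacks 12-bit pixels stored as 3 bytes per 2 pixels (assumed layout):
    //   byte0 = pixel1 bits 11..4
    //   byte1 = pixel1 bits 3..0 (high nibble), pixel2 bits 11..8 (low nibble)
    //   byte2 = pixel2 bits 7..0
    public static ushort[] Unpack(byte[] packed)
    {
        if (packed.Length % 3 != 0)
            throw new ArgumentException("Length must be a multiple of 3.", nameof(packed));

        var pixels = new ushort[packed.Length / 3 * 2];
        for (int i = 0, p = 0; i < packed.Length; i += 3, p += 2)
        {
            pixels[p]     = (ushort)((packed[i] << 4) | (packed[i + 1] >> 4));
            pixels[p + 1] = (ushort)(((packed[i + 1] & 0x0F) << 8) | packed[i + 2]);
        }
        return pixels;
    }
}
```

Each resulting ushort then holds a value in the range 0 to 4095, with the top 4 bits left at zero.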

The smallest unit of memory you can allocate is one byte (8 bits). That means that if you need 12 bits of data to store one pixel in your frame array, you should use a ushort and leave the remaining 4 bits unused. That's why it's more efficient to design this kind of thing with sizes that are powers of two (1, 2, 4, 8, 16, 32, 64, 128, etc.).

Related

What is actually contained in data chunk in wav file?

For example take the case of a stereo channel wav file with sample rate as 44100 and a bit depth of 16 bits.
Exactly how is the 16 bits divided up?
In the audio clip that I was using, the first 4 bytes had data about the first audio channel; the next 4 bits I have no idea about (even when replaced with 0, there is no effect on the final audio file).
The next 4 bytes had data about the second audio channel; the next 4 bits I again have no idea about (even when replaced with 0, there is no effect on the final audio file).
So I would like to figure out what those 4 bits are.
A WAV File contains several chunks.
The FMT chunk specifies the format of the audio data.
The actual audio data are within the data chunk.
It depends on the actual format. But let's assume the following format as example:
PCM, 16 bit, 2 channels with a samplerate of 44100Hz.
Audio data is represented as samples. In this case each sample takes 16 bits = 2 Bytes.
If we have multiple channels (in this example 2 = stereo), it will look like this:
left sample, right sample, left sample, right sample, ...
Since each sample takes 2 Bytes (16 bits), we get something like this:
Byte 1 | Byte 2 | Byte 3 | Byte 4 | Byte 5 | Byte 6 | Byte 7 | Byte 8 | ...
  left sample   |  right sample   |   left sample   |  right sample   | ...
Each second of audio contains 44100 samples for EACH channel.
So in total, one second of audio takes 44100 * (16 / 8) * 2 = 176400 Bytes.
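The layout described above can be sketched in C#. The names `PcmLayout`, `BytesPerSecond` and `SplitStereo16` are hypothetical, and the code assumes the data chunk has already been read into a byte array holding 16-bit little-endian stereo PCM.

```csharp
using System;

public static class PcmLayout
{
    // Bytes needed for one second of uncompressed PCM audio.
    public static int BytesPerSecond(int sampleRate, int bitsPerSample, int channels)
        => sampleRate * (bitsPerSample / 8) * channels;

    // Splits interleaved 16-bit little-endian stereo data into left and right channels.
    public static (short[] Left, short[] Right) SplitStereo16(byte[] data)
    {
        int frames = data.Length / 4; // 2 bytes per sample * 2 channels
        var left = new short[frames];
        var right = new short[frames];
        for (int f = 0; f < frames; f++)
        {
            int o = f * 4;
            // Low byte comes first in the file, high byte second.
            left[f]  = unchecked((short)(data[o]     | (data[o + 1] << 8)));
            right[f] = unchecked((short)(data[o + 2] | (data[o + 3] << 8)));
        }
        return (left, right);
    }
}
```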
A WAV format audio file typically starts with a 44 byte header followed by the payload which is the uncompressed raw PCM audio data ... in the payload area as you walk across the PCM data each sample (point on audio curve) will contain data for all channels ... header will tell you number of channels ... for stereo using bit depth of 16 you will see two bytes (16 bits == bit depth) for a given channel immediately followed by the two bytes of the next channel etc...
For a given channel a given set of bytes (2 bytes in your case) will appear in two possible layouts determined by choice of endianness ... 1st byte followed by 2nd byte ... ordering of endianness is important here ... header also tells you what endianness you are using ... typically WAV format is little endian
each channel will generate its own audio curve
in your code to convert from PCM data into a usable audio curve data point you must combine all bytes of a given sample for a given channel into a single value ... typically it's an integer and not floating point ... again the header defines which ... if integer it could be signed or unsigned ... little endian means as you read the file the first (left most) byte will become the least significant byte followed by each subsequent byte which becomes the next most significant byte
in pseudo code :
int mydatapoint // allocate your integer audio curve data point
step 0
mydatapoint = most-significant-byte
stop here for bit depth of 8
... if you have bit depth greater than 8 bits now left shift this to make room for the following byte if any
step 1
mydatapoint = mydatapoint << 8 // shove data to the left by 8 bits
// which effectively jacks up its value
// and leaves empty those right most 8 bits
step 2
// following operation is a bit wise OR operation
mydatapoint = mydatapoint OR next-most-significant-byte
now repeat steps 1 & 2 for each subsequent byte of PCM data in order from most significant to least significant (in a little endian file that means walking backwards from the last byte of the sample to the first) ... this is essential for any bit depth beyond 16, so for 24 bit or 32 bit audio you will need to combine 3 or 4 bytes of PCM data into your single integer output audio curve data point
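A minimal C# sketch of the shift-and-OR steps above. `SampleReader.ReadSample` is a hypothetical helper, and it assumes signed little-endian PCM with 2, 3 or 4 bytes per sample (8 bit WAV data is normally unsigned, so the sign extension at the end would not apply there).

```csharp
public static class SampleReader
{
    // Combines byteCount little-endian bytes, starting at offset, into one signed integer.
    public static int ReadSample(byte[] data, int offset, int byteCount)
    {
        int value = 0;
        // Little endian: the last byte of the sample is the most significant,
        // so start there and walk backwards towards the first byte.
        for (int i = byteCount - 1; i >= 0; i--)
        {
            value = (value << 8) | data[offset + i]; // step 1 (shift) and step 2 (OR)
        }
        // Sign-extend from the sample's top bit to the full 32-bit int.
        int shift = 32 - 8 * byteCount;
        return (value << shift) >> shift;
    }
}
```

For 16 bit audio, `ReadSample(data, offset, 2)` combines two bytes; for 24 bit audio, pass 3.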
Why are we doing this bit shifting nonsense?
The level of audio fidelity when converting from analog to digital is driven by how accurately are you recording the audio curve ... analog audio is a continuous curve however to become digital it must be sampled into discrete points along the curve ... two factors determine the fidelity when sampling the analog curve to create its digital representation ... the left to right distance along the analog audio curve is determined by sample rate and the up and down distance along the audio curve is determined by bit depth ... higher sample rate gives you more samples per second and a greater bit depth gives you more vertical points to approximate the instantaneous height of the analog audio curve
bit depth 8 == 2^8 == 256 distinct vertical values to record curve height
bit depth 16 == 2^16 == 65536 distinct vertical values to record curve height
so to more accurately record into digital the height of our analog audio curve we want to become as granular as possible ... so the resultant audio curve is as smooth as possible and not jagged which would happen if we only allocated 2 bits which would give us 2^2 which is 4 distinct values ... try to connect the dots when your audio curve only has 4 vertical values to choose from on your plot ... the bit shifting is simply building up a single integer value from many bytes of data ... numbers greater than 255 cannot fit into one byte and so must be spread across multiple bytes of PCM data
http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/WAVE.html

Get the 2 most significant bits

How can I get the 2 most significant bits of a byte in C#?
I have something like this: (value >> 6) & 7, but I'm unsure if this is correct.
For example, for 01011100 I just want to return the leading two bits (01).
If you want two bits, then you need to and by 3 (binary 11), not by 7 (binary 111).
So if value is a byte, something like:
byte twobits = (byte)((value >> 6) & 3);
However, as the comments stated, this is redundant. It would suffice to right shift by 6 (since the other bits would already be 0).
Just for fun, if you want to have the two most significant bits of any data type, you could have:
byte twobits = (byte)(value >> (System.Runtime.InteropServices.Marshal.SizeOf(value)*8-2));
Just as a warning, Marshal.SizeOf gives the byte size of the variable type after marshalling, but it "usually" works.
If I read your question correctly, you want to have the two most significant bits of a byte in another byte, as the least significant bits, with the other bits set to zero.
In that case, you can just return myByte >> 6 as it will fill the rest of the bits with zeroes (in C# at least). The & 7 operation seems redundant, the layout of 7 is 00000111. This means you ensure that the 5 left most bits in the resulting byte are set to zero... You might have intended to ensure the 6 left most bits are zero, in that case it should be 3.
If you want to return only the left most two bits and keep them in place, then you should return myByte & 0b11000000.
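Both variants discussed in the answers can be sketched as small helpers. The names `TopBits`, `TopTwo` and `TopTwoInPlace` are invented for illustration.

```csharp
public static class TopBits
{
    // Returns the two most significant bits of b, moved down to bits 1..0.
    public static byte TopTwo(byte b) => (byte)(b >> 6);

    // Returns the two most significant bits of b, left in place (bits 7..6).
    public static byte TopTwoInPlace(byte b) => (byte)(b & 0b11000000);
}
```

For the byte 01011100 from the question, the first helper gives 00000001 and the second gives 01000000.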

Do you need to mask a number before retrieving the value stored at a certain byte?

I want to know if one needs to mask a number before retrieving the value stored at a certain byte when doing bit shifting.
Take, for example, this code:
short b1 = 1;
short b2 = 2;
short b0 = (short)((b1 << 8) | b2); //store two values in one variable
Console.WriteLine(b0); //b1 and b2 combined
Console.WriteLine((b0 & (255 << 8)) >> 8); //gets the value of b1
As far as I understand, a right shift drops the low-order bits that are shifted out. Therefore, right shifting b0 by 8 bits will drop the 8 bits of b2, leaving just b1.
Console.WriteLine(b0 >> 8); //this also gets b1!!
I want to find out, is there any need to mask b0 with 255 << 8 before shifting to get the value of b1?
NB:
The only need I can think of for masking before retrieving a value is if there is something else stored at a higher byte, such as trying to get back the value of b2, in which this code would be used:
Console.WriteLine(b0 & 255); //gets the value of b2
I want to find out, is there any need to mask b0 with 255 << 8 before shifting to get the value of b1?
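No, for non-negative values like the ones in your example there's no need: the shift alone already discards the low byte. Some people think the mask makes the code easier to understand or protects them against some imagined failure scenario. It's completely harmless. The one case where the two forms differ is a negative b0: the short is sign-extended to int before the shift, so b0 >> 8 yields a negative number while the masked form yields a value from 0 to 255.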
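A small sketch comparing the two expressions from the question. The class and method names are hypothetical. For a non-negative short both return the same value; for a negative short they differ, because C# promotes the short to int with sign extension and >> on int is an arithmetic shift.

```csharp
public static class HighByte
{
    // Shift only: keeps the sign of b0.
    public static int Unmasked(short b0) => b0 >> 8;

    // Mask first, then shift: always yields a value in 0..255.
    public static int Masked(short b0) => (b0 & (255 << 8)) >> 8;
}
```

With b0 = 0x0102 both helpers return 1. With the bit pattern 0x8001 stored in a short, the unmasked version returns -128 and the masked version returns 128.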

Working with depth data - Kinect

I just started learning about Kinect through some quick start videos and was trying out the code to work with depth data.
However, I am not able to understand how the distance is being calculated using bit-shifting and various other formulas that are being employed to calculate other stuff too while working with this depth data.
http://channel9.msdn.com/Series/KinectSDKQuickstarts/Working-with-Depth-Data
Are these the particulars which are Kinect-specifics explained in the documentation etc.? Any help would be appreciated.
Thanks
Pixel depth
When you don't have the Kinect set up to detect players, it is simply an array of bytes, with two bytes representing a single depth measurement.
So, just like in a 16 bit color image, each sixteen bits represent a depth rather than a color.
If the array were for a hypothetical 2x2 pixel depth image, you might see: [0x12 0x34 0x56 0x78 0x91 0x23 0x45 0x67] which would represent the following four pixels:
AB
CD
A = (0x34 << 8) + 0x12
B = (0x78 << 8) + 0x56
C = (0x23 << 8) + 0x91
D = (0x67 << 8) + 0x45
The << 8 simply moves that byte into the upper 8 bits of a 16 bit number. It's the same as multiplying it by 256. (The parentheses matter in C#, where + binds more tightly than <<.) The whole 16 bit numbers become 0x3412, 0x7856, 0x2391, 0x6745. You could instead do A = 0x34 * 256 + 0x12. In simpler terms, it's like saying I have 329 items and 456 thousands of items. To get the total number of items, I can multiply the 456 by 1,000 and add it to the 329. The Kinect has broken the whole number up into two pieces, and you simply have to add them together. I could "shift" the 456 over to the left by 3 decimal digits, which is the same as multiplying by 1,000. It would then be 456000. So the shift and the multiplication are the same thing for powers of 10. In computers, the same holds for powers of 2: 2^8 is 256, so multiplying by 256 is the same as shifting left by 8 bits.
And that would be your four pixel depth image - each resulting 16 bit number represents the depth at that pixel.
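The byte-pair combination above, written as a loop over the raw array. This is a sketch only: `DepthFrame.ToPixels` is a hypothetical helper that assumes the low byte of each pixel comes first, as in the example array, and it does not use any Kinect SDK types.

```csharp
public static class DepthFrame
{
    // Combines each pair of bytes (low byte first) into one 16-bit value.
    public static ushort[] ToPixels(byte[] raw)
    {
        var pixels = new ushort[raw.Length / 2];
        for (int i = 0; i < pixels.Length; i++)
        {
            pixels[i] = (ushort)((raw[2 * i + 1] << 8) | raw[2 * i]);
        }
        return pixels;
    }
}
```

Feeding in the 8 bytes from the 2x2 example produces 0x3412, 0x7856, 0x2391 and 0x6745.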
Player depth
When you select to show player data it becomes a little more interesting. The bottom three bits of the whole 16 bit number tell you the player that number is part of.
To simplify things, ignore the complicated method they use to get the remaining 13 bits of depth data, and just do the above, and steal the lower three bits:
A = (0x34 << 8) + 0x12
B = (0x78 << 8) + 0x56
C = (0x23 << 8) + 0x91
D = (0x67 << 8) + 0x45
Ap = A % 8
Bp = B % 8
Cp = C % 8
Dp = D % 8
A = A / 8
B = B / 8
C = C / 8
D = D / 8
Now the pixel A has player Ap and depth A. The % gets the remainder of the division - so take A, divide it by 8, and the remainder is the player number. The result of the division is the depth, the remainder is the player, so A now contains the depth since we got rid of the player by A=A/8.
If you don't need player support, at least at the beginning of your development, skip this and just use the first method. If you do need player support, though, this is one of many ways to get it. There are faster methods, but the compiler usually turns the above division and remainder (modulus) operations into more efficient bitwise logic operations so you don't need to worry about it, generally.
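The bitwise equivalents mentioned above might look like the following. The names are made up for illustration, and the code assumes the layout described here: player number in the bottom three bits, depth in the remaining 13 bits.

```csharp
public static class PlayerDepth
{
    // Bottom three bits: same result as pixel % 8.
    public static int PlayerIndex(ushort pixel) => pixel & 0x07;

    // Remaining 13 bits: same result as pixel / 8.
    public static int Depth(ushort pixel) => pixel >> 3;
}
```

For pixel A = 0x3412, this gives player 2 and depth 1666, matching the remainder and the quotient of dividing by 8.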

Packing record length in 2 bytes

I want to create a ASCII string which will have a number of fields. For e.g.
string s = f1 + "|" + f2 + "|" + f3;
f1, f2, f3 are fields and "|"(pipe) is the delimiter. I want to avoid this delimiter and keep the field count at the beginning like:
string s = f1.Length + f2.Length + f3.Length + f1 + f2 + f3;
All lengths are going to be packed in 2 chars, so the length range is 00-99 in this case. I was wondering if I can pack the length of each field in 2 bytes by extracting bytes out of a short. This would allow me to have a range of 0-65535 (treating the value as unsigned) using only 2 bytes. E.g.
short length = 20005;
byte b1 = (byte)length;
byte b2 = (byte)(length >> 8);
// Save bytes b1 and b2
// Read bytes b1 and b2
short decoded = 0;
decoded = b2;
decoded = (short)(decoded << 8);
decoded = (short)(decoded | b1);
// Now decoded is 20005
What do you think about the above code, Is this a good way to keep the record lengths?
I cannot see what you are trying to achieve. short aka Int16 is 2 bytes - yes, so you can happily use it. But creating a string does not make sense.
ushort sh = 56100; // 2 bytes; 56100 does not fit in a signed short, so use ushort (UInt16)
I believe you mean, being able to output the short to a stream. For this there are ways:
BinaryWriter.Write(sh) which writes 2 bytes straight to the stream
BitConverter.GetBytes(sh) which gives you bytes of a short
Reading back you can use the same classes.
If you want ascii, i.e. "00" as characters, then just:
byte[] bytes = Encoding.ASCII.GetBytes(length.ToString("00"));
or you could optimise it if you want.
But IMO, if you are storing 0-99, 1 byte is plenty:
byte b = (byte)length;
If you want the range 0-65535, then just:
bytes[0] = (byte)length;
bytes[1] = (byte)(length >> 8);
or swap index 0 and 1 for endianness.
But if you are using the full range (of either single or double byte), then it isn't ascii nor a string. Anything that tries to read it as a string might fail.
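A round trip of the two-byte form can be sketched as follows. `LengthPrefix` and its methods are hypothetical names, and the byte order chosen here (least significant byte first) is an assumption; swap the indices for the opposite order.

```csharp
public static class LengthPrefix
{
    // Writes length as two bytes, least significant byte first.
    public static byte[] Encode(ushort length)
        => new[] { (byte)length, (byte)(length >> 8) };

    // Rebuilds the length from two bytes written by Encode.
    public static ushort Decode(byte[] bytes)
        => (ushort)(bytes[0] | (bytes[1] << 8));
}
```

Using ushort rather than short here avoids sign issues for lengths above 32767.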
Whether it's a good idea depends on the details of what it's for, but it's not likely to be good.
If you do this then you're no longer creating an "ASCII string". Those were your words, but maybe you don't really care whether it's ASCII.
You will sometimes get bytes with a value of 0 in your "string". If you're handling the strings with anything written in C, this is likely to cause trouble. You'll also get all sorts of other characters -- newlines, tabs, commas, etc. -- that may confuse software that's trying to work with your data.
The original plan of separating with (say) | characters will be more compact and easier for humans and software to read. The only obvious downsides are (1) you can't allow field values with a | in (or else you need some sort of escaping) and (2) parsing will be marginally slower.
If you want to get clever you could use a variable-length encoding: a value <= 127 takes 1 byte, and a value >= 128 takes 2 bytes. This technique loses you 1 bit per byte that you are using, but if you normally have small values and only occasionally have larger ones, it dynamically grows to accommodate the value.
All you need to do is set bit 8 (the top bit) of a byte to indicate that the 2nd byte must also be read.
If bit 8 of the active byte is not set, it means you have completed your value.
EG
If you have a value of 4 then you use this
|8|7|6|5|4|3|2|1|
|0|0|0|0|0|1|0|0|
If you have a value of 128, you read the 1st byte, see that bit 8 is high, and keep the remaining 7 bits of the 1st byte; then you do the same with the 2nd byte, shifting its 7 bits left by 7 bits.
|BYTE 0 |BYTE 1 |
|8|7|6|5|4|3|2|1|8|7|6|5|4|3|2|1|
|1|0|0|0|0|0|0|0|0|0|0|0|0|0|0|1|
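The scheme in the diagrams can be sketched in C# as below. The names `VarLength`, `Encode` and `Decode` are invented for illustration, and this version is a generalisation: it keeps adding bytes while the continuation bit is needed, rather than stopping at 2 bytes.

```csharp
using System.Collections.Generic;

public static class VarLength
{
    // Encodes value using 7 data bits per byte, low bits first;
    // the top bit of a byte marks "another byte follows".
    public static byte[] Encode(uint value)
    {
        var bytes = new List<byte>();
        while (value >= 0x80)
        {
            bytes.Add((byte)((value & 0x7F) | 0x80)); // low 7 bits plus continuation flag
            value >>= 7;
        }
        bytes.Add((byte)value); // final byte, top bit clear
        return bytes.ToArray();
    }

    // Reads bytes until one with the top bit clear is found.
    public static uint Decode(byte[] bytes)
    {
        uint value = 0;
        int shift = 0;
        foreach (byte b in bytes)
        {
            value |= (uint)(b & 0x7F) << shift;
            if ((b & 0x80) == 0) break;
            shift += 7;
        }
        return value;
    }
}
```

A value of 4 encodes to the single byte 00000100, and 128 encodes to the two bytes 10000000 00000001, as in the diagrams.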
