Optimization of conversion from OpenCV Mat/Array to OnnxRuntime Tensor? - c#

I am using ONNXRuntime to run inference on a UNet model, and as part of preprocessing I have to convert an Emgu CV matrix to an OnnxRuntime.Tensor.
I achieved it using two nested for loops, which is unfortunately quite slow:
var data = new DenseTensor<float>(new[] { 1, 3, WIDTH, HEIGHT });
for (int y = 0; y < HEIGHT; y++)
{
    for (int x = 0; x < WIDTH; x++)
    {
        // The source image is BGR; the tensor channels are written in RGB order.
        data[0, 0, x, y] = (float)(image.GetValue(2, y, x) / 255.0);
        data[0, 1, x, y] = (float)(image.GetValue(1, y, x) / 255.0);
        data[0, 2, x, y] = (float)(image.GetValue(0, y, x) / 255.0);
    }
}
Then I found out that there exists a method which converts Array to DenseTensor. I wanted to use this method as follows:
var imgToPredictFloat = new Mat(image.Height, image.Width, DepthType.Cv32F, 3);
image.ConvertTo(imgToPredictFloat, DepthType.Cv32F, 1/255.0);
CvInvoke.CvtColor(imgToPredictFloat, imgToPredictFloat, ColorConversion.Bgr2Rgb);
var data = imgToPredictFloat.GetData().ToTensor<float>();
var reshaped = data.Reshape(new int[] { 1, 3, WIDTH, HEIGHT});
This would greatly improve the performance; however, the layout of the output tensor is not correct (it should be the same as the one from the for loop), so the model obviously won't work. Any suggestions on how to reshape the array to the correct layout?
The code also converts the 0-255 integer values to 0-1 floats and the BGR channel order to RGB.

This is how I have used cv::Mat with ONNX Runtime (C++):
const wchar_t* model_path = L"C:/data/DNN/ONNX/ResNet/resnet152v2/resnet152-v2-7.onnx";
printf("Using Onnxruntime C++ API\n");
Ort::Session session(env, model_path, session_options);
// print model input layer (node names, types, shape etc.)
Ort::AllocatorWithDefaultOptions allocator;
size_t num_output_nodes = session.GetOutputCount();
std::vector<char*> outputNames;
for (size_t i = 0; i < num_output_nodes; ++i)
{
    char* name = session.GetOutputName(i, allocator);
    std::cout << "output: " << name << std::endl;
    outputNames.push_back(name); // collected here because outputNames.front() is used below
}
// print number of model input nodes
size_t num_input_nodes = session.GetInputCount();
std::vector<const char*> input_node_names(num_input_nodes);
std::vector<int64_t> input_node_dims; // simplify... this model has only 1 input node {1, 3, 224, 224}.
// Otherwise need vector<vector<>>
printf("Number of inputs = %zu\n", num_input_nodes);
// iterate over all input nodes
for (int i = 0; i < num_input_nodes; i++) {
// print input node names
char* input_name = session.GetInputName(i, allocator);
printf("Input %d : name=%s\n", i, input_name);
input_node_names[i] = input_name;
// print input node types
Ort::TypeInfo type_info = session.GetInputTypeInfo(i);
auto tensor_info = type_info.GetTensorTypeAndShapeInfo();
ONNXTensorElementDataType type = tensor_info.GetElementType();
printf("Input %d : type=%d\n", i, type);
// print input shapes/dims
input_node_dims = tensor_info.GetShape();
printf("Input %d : num_dims=%zu\n", i, input_node_dims.size());
for (int j = 0; j < input_node_dims.size(); j++)
    printf("Input %d : dim %d=%jd\n", i, j, input_node_dims[j]);
}
cv::Size dnnInputSize;
cv::Scalar mean;
cv::Scalar std;
bool rgb = true;
//cv::Mat inputImage = cv::imread("C:/TestImages/kitten_01.jpg");
cv::Mat inputImage = cv::imread("C:/TestImages/slug_01.jpg");
rgb = true;
dnnInputSize = cv::Size(224, 224);
mean[0] = 0.485;
mean[1] = 0.456;
mean[2] = 0.406;
std[0] = 0.229;
std[1] = 0.224;
std[2] = 0.225;
cv::Mat blob;
// ONNX: (N x 3 x H x W)
cv::dnn::blobFromImage(inputImage, blob, 1.0 / 255.0, dnnInputSize, mean, rgb, false);
size_t input_tensor_size = blob.total();
std::vector<float> input_tensor_values(input_tensor_size);
for (size_t i = 0; i < input_tensor_size; ++i)
input_tensor_values[i] = blob.at<float>(i);
std::vector<const char*> output_node_names = { outputNames.front() };
// create input tensor object from data values
auto memory_info = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
Ort::Value input_tensor = Ort::Value::CreateTensor<float>(memory_info, input_tensor_values.data(), input_tensor_size, input_node_dims.data(), 4);
// score model & input tensor, get back output tensor
auto output_tensors = session.Run(Ort::RunOptions{ nullptr }, input_node_names.data(), &input_tensor, 1, output_node_names.data(), 1);
assert(output_tensors.size() == 1 && output_tensors.front().IsTensor());
// Get pointer to output tensor float values
float* floatarr = output_tensors.front().GetTensorMutableData<float>();
assert(abs(floatarr[0] - 0.000045) < 1e-6);
cv::Mat1f result = cv::Mat1f(1000, 1, floatarr);
cv::Point classIdPoint;
double confidence = 0;
minMaxLoc(result, 0, &confidence, 0, &classIdPoint);
int classId = classIdPoint.y;
std::cout << "confidence: " << confidence << std::endl;
std::cout << "class: " << classId << std::endl;
The actual conversion part that you need is imho (adjust size and mean/std according to your network):
cv::Mat inputImage = cv::imread("C:/TestImages/slug_01.jpg");
rgb = true;
dnnInputSize = cv::Size(224, 224);
mean[0] = 0.485;
mean[1] = 0.456;
mean[2] = 0.406;
std[0] = 0.229;
std[1] = 0.224;
std[2] = 0.225;
cv::Mat blob;
// ONNX: (N x 3 x H x W)
cv::dnn::blobFromImage(inputImage, blob, 1.0 / 255.0, dnnInputSize, mean, rgb, false);
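For the layout part of the question: a plain Reshape only reinterprets the dimensions and does not move any values, so an interleaved H x W x 3 buffer still has to be reordered into planes. Below is a minimal sketch of that reordering in plain C++, assuming an 8-bit interleaved BGR input as OpenCV stores it. The function name bgrHwcToRgbChw is made up for illustration, and the sketch does no resizing or mean subtraction, so it is not a full replacement for blobFromImage.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert an interleaved 8-bit BGR image (H x W x 3) into a planar float
// buffer laid out as 3 x H x W in RGB order, scaled to the range [0, 1].
// Hypothetical helper for illustration only.
std::vector<float> bgrHwcToRgbChw(const std::vector<std::uint8_t>& bgr,
                                  std::size_t height, std::size_t width)
{
    assert(bgr.size() == height * width * 3);
    const std::size_t plane = height * width;  // elements per channel plane
    std::vector<float> chw(3 * plane);
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            const std::size_t src = (y * width + x) * 3;  // B, G, R triple
            const std::size_t dst = y * width + x;        // offset inside a plane
            chw[0 * plane + dst] = bgr[src + 2] / 255.0f; // R plane
            chw[1 * plane + dst] = bgr[src + 1] / 255.0f; // G plane
            chw[2 * plane + dst] = bgr[src + 0] / 255.0f; // B plane
        }
    }
    return chw;
}
```

The resulting buffer can be handed to a tensor of shape {1, 3, H, W} without copying element by element through an indexer, which is where most of the time goes in the nested-loop version.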


C# - Padding image bytes with white bytes to fill 512 x 512

I'm using the Digital Persona SDK to scan fingerprints in WSQ format. As a requirement I need a 512 x 512 image, but the SDK only exports a 357 x 392 image.
The SDK provides a method that compresses the image captured from the device to WSQ format and returns a byte array that I can write to disk.
- I've tried to allocate a buffer of 262144 bytes for a 512 x 512 image.
- Fill the new buffer with white pixel data by setting each byte to 255.
- Copy the original image buffer into the new image buffer. The original image doesn't need to be centered, but it's important to copy it without corrupting the image data.
To summarize, I've tried to copy the old image into the upper right corner of the new image.
DPUruNet.Compression.SetWsqBitrate(95, 0);
Fid capturedImage = captureResult.Data;
//Fill the new buffer with white pixel data each byte to value 255.
byte[] bytesWSQ512 = new byte[262144];
for (int i = 0; i < bytesWSQ512.Length; i++)
bytesWSQ512[i] = 255;
//Compress capturedImage and get bytes (357 x 392)
byte[] bytesWSQ = DPUruNet.Compression.CompressRaw(capturedImage.Views[0].Width, capturedImage.Views[0].Height, 500, 8, capturedImage.Views[0].RawImage, CompressionAlgorithm.COMPRESSION_WSQ_NIST);
//Copy the original image buffer into the new image buffer
for (int i = 0; i < capturedImage.Views[0].Height; i++)
for (int j = 0; j < capturedImage.Views[0].Width; j++)
bytesWSQ512[i * bytesWSQ512.Length + j ] = bytesWSQ[i * capturedImage.Views[0].Width + j];
//Write bytes to disk
File.WriteAllBytes(@"C:\Users\Admin\Desktop\bytesWSQ512.wsq", bytesWSQ512);
When running that snippet I get an IndexOutOfRangeException. I don't know whether the loop or the index calculation for the new array is right.
Here is a representation of what I'm trying to do.
If someone is trying to achieve something like this or padding a raw image, I hope this will help.
DPUruNet.Compression.SetWsqBitrate(75, 0);
Fid ISOFid = captureResult.Data;
byte[] paddedImage = PadImage8BPP(captureResult.Data.Views[0].RawImage, captureResult.Data.Views[0].Width, captureResult.Data.Views[0].Height, 512, 512, 255);
byte[] bytesWSQ512 = Compression.CompressRaw(512, 512, 500, 8, paddedImage, CompressionAlgorithm.COMPRESSION_WSQ_NIST);
And the method to resize (pad) the image is:
public byte[] PadImage8BPP(byte[] original, int original_width, int original_height, int desired_width, int desired_height, byte pad_color)
{
    byte[] canvas_8bpp = new byte[desired_width * desired_height];
    for (int i = 0; i < canvas_8bpp.Length; i++)
        canvas_8bpp[i] = pad_color; //Fill background. Note this type of fill will fail histogram checks.
    int clamp_y_begin = 0;
    int clamp_y_end = original_height;
    int clamp_x_begin = 0;
    int clamp_x_end = original_width;
    int pad_y = 0;
    int pad_x = 0;
    if (original_height > desired_height)
    {
        //Source is taller than the canvas: crop it symmetrically.
        int crop_distance = (int)Math.Ceiling((original_height - desired_height) / 2.0);
        clamp_y_begin = crop_distance;
        clamp_y_end = original_height - crop_distance;
    }
    else
    {
        //Source fits: center it vertically.
        pad_y = (desired_height - original_height) / 2;
    }
    if (original_width > desired_width)
    {
        int crop_distance = (int)Math.Ceiling((original_width - desired_width) / 2.0);
        clamp_x_begin = crop_distance;
        clamp_x_end = original_width - crop_distance;
    }
    else
    {
        pad_x = (desired_width - original_width) / 2;
    }
    //We traverse the captured image (either whole image or subset)
    for (int y = clamp_y_begin; y < clamp_y_end; y++)
    {
        for (int x = clamp_x_begin; x < clamp_x_end; x++)
        {
            byte image_pixel = original[y * original_width + x];
            canvas_8bpp[(pad_y + y - clamp_y_begin) * desired_width + pad_x + x - clamp_x_begin] = image_pixel;
        }
    }
    return canvas_8bpp;
}
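Two things go wrong in the question's loop: the destination row offset is multiplied by the whole buffer length instead of the canvas width (512), and the copy is done on the already compressed WSQ bytes, which are not a pixel grid. The code above pads the raw pixels first and compresses afterwards. The row-stride indexing on its own can be sketched like this in C++, assuming 8 bits per pixel and a source that fits inside the canvas. The name padTopLeft is hypothetical, and unlike PadImage8BPP it places the image in the top-left corner rather than centering or cropping it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copy an 8-bit-per-pixel image into the top-left corner of a larger canvas
// that is pre-filled with `fill`. Hypothetical helper for illustration only.
std::vector<std::uint8_t> padTopLeft(const std::vector<std::uint8_t>& src,
                                     std::size_t srcW, std::size_t srcH,
                                     std::size_t dstW, std::size_t dstH,
                                     std::uint8_t fill)
{
    assert(src.size() == srcW * srcH);
    assert(srcW <= dstW && srcH <= dstH);
    std::vector<std::uint8_t> dst(dstW * dstH, fill);
    for (std::size_t y = 0; y < srcH; ++y) {
        for (std::size_t x = 0; x < srcW; ++x) {
            // Each row is addressed with the width of the buffer it lives in.
            dst[y * dstW + x] = src[y * srcW + x];
        }
    }
    return dst;
}
```

For the fingerprint case the call would be something like padTopLeft(raw, 357, 392, 512, 512, 255), with the result passed to the compressor as a 512 x 512 image.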

Converting CDF of histogram to c# from matlab?

How can I convert this MATLAB code to AForge.NET + C# code?
cdf1 = cumsum(hist1) / numel(aa);
I found that Accord.NET has a Histogram.cumulative method,
but I don't know how to use it.
Please show me how to convert it.
% Histogram Matching
close all
pkg load image
% Load images
figure(1); imshow(aa); colormap(gray)
figure(2); imshow(ref); colormap(gray)
M = zeros(256,1,'uint8'); % Store mapping - Cast to uint8 to respect data type
hist1 = imhist(aa); % Compute histograms
hist2 = imhist(ref);
cdf1 = cumsum(hist1) / numel(aa); % Compute CDFs
cdf2 = cumsum(hist2) / numel(ref);
% Compute the mapping
for idx = 1 : 256
    [~,ind] = min(abs(cdf1(idx) - cdf2));
    M(idx) = ind-1;
end
% Now apply the mapping to get first image to make
% the image look like the distribution of the second image
out = M(double(aa)+1);
figure(3); imshow(out); colormap(gray)
Actually, I don't have much experience with Accord.NET, but from reading the documentation I think the ImageStatistics class is what you are looking for (reference here). The problem is that it cannot build a single histogram for the whole image, so you have to do that yourself. imhist in Matlab just merges the three channels and then counts the overall pixel occurrences, so this is what you should do:
Bitmap image = new Bitmap(@"C:\Path\To\Image.bmp");
ImageStatistics statistics = new ImageStatistics(image);
Int32[] histR = statistics.Red.Values.ToArray();
Int32[] histG = statistics.Green.Values.ToArray();
Int32[] histB = statistics.Blue.Values.ToArray();
Int32[] histImage = new Int32[256];
for (Int32 i = 0; i < 256; ++i)
    histImage[i] = histR[i] + histG[i] + histB[i];
Double[] cdf = new Double[256];
cdf[0] = (Double)histImage[0];
for (Int32 i = 1; i < 256; ++i)
    cdf[i] = (Double)histImage[i] + cdf[i - 1]; // running sum, like cumsum
// The merged histogram counts every pixel once per channel, so normalize by
// its total (3 * statistics.PixelsCount) so that the CDF ends at 1.
Double total = cdf[255];
for (Int32 i = 0; i < 256; ++i)
    cdf[i] = cdf[i] / total;
In C#, an RGB value can be built from R, G and B channel values as follows:
public static int ChannelsToRGB(Int32 red, Int32 green, Int32 blue)
{
    return ((red << 0) | (green << 8) | (blue << 16));
}
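To make the two steps of the MATLAB script concrete (the cumsum-based CDF and the nearest-CDF mapping loop), here is a small library-free sketch in C++. It assumes 8-bit grayscale pixel data in a flat vector. The names cdfOf and matchingMap are invented for this example and are not part of AForge.NET or Accord.NET.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Normalized cumulative histogram: cumsum(imhist(img)) / numel(img).
std::array<double, 256> cdfOf(const std::vector<std::uint8_t>& pixels)
{
    assert(!pixels.empty());
    std::array<double, 256> cdf{};
    for (std::uint8_t p : pixels) cdf[p] += 1.0;                 // histogram
    for (std::size_t i = 1; i < 256; ++i) cdf[i] += cdf[i - 1];  // cumulative sum
    for (double& v : cdf) v /= static_cast<double>(pixels.size());
    return cdf;
}

// For every grey level, pick the level whose reference CDF value is closest.
// On ties the first (lowest) level wins, like MATLAB's min.
std::array<std::uint8_t, 256> matchingMap(const std::array<double, 256>& cdf1,
                                          const std::array<double, 256>& cdf2)
{
    std::array<std::uint8_t, 256> map{};
    for (std::size_t i = 0; i < 256; ++i) {
        std::size_t best = 0;
        double bestDiff = std::fabs(cdf1[i] - cdf2[0]);
        for (std::size_t j = 1; j < 256; ++j) {
            const double diff = std::fabs(cdf1[i] - cdf2[j]);
            if (diff < bestDiff) { bestDiff = diff; best = j; }
        }
        map[i] = static_cast<std::uint8_t>(best);
    }
    return map;
}
```

Applying the mapping is then a lookup per pixel, out[k] = map[in[k]], which corresponds to out = M(double(aa)+1) in the script.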

Capture screenshot of fullscreen DX11 program using SharpDX and EasyHook

Before anybody mentions it, I referred to this link to find out how to copy the backbuffer to a bitmap.
Current situation
I am injected to the target process
Target process' FeatureLevel = Level_11_0
Target SwapChain is being made with DXGI_SWAP_CHAIN_FLAG_ALLOW_MODE_SWITCH flag.
SwapChain::Present function is hooked.
The screenshot turns out black and the target process crashes. Without the screenshot, the process runs fine.
Desired situation
Make the screenshot properly and let the target process continue with its normal execution.
NOTE Hook class is the same as in the link. I only added an UnmodifiableHook version of it which does what its name says. I left out all unimportant bits.
using System;
using System.Runtime.InteropServices;
namespace Test
public sealed class TestSwapChainHook : IDisposable
private enum IDXGISwapChainVirtualTable
QueryInterface = 0,
AddRef = 1,
Release = 2,
SetPrivateData = 3,
SetPrivateDataInterface = 4,
GetPrivateData = 5,
GetParent = 6,
GetDevice = 7,
Present = 8,
GetBuffer = 9,
SetFullscreenState = 10,
GetFullscreenState = 11,
GetDesc = 12,
ResizeBuffers = 13,
ResizeTarget = 14,
GetContainingOutput = 15,
GetFrameStatistics = 16,
GetLastPresentCount = 17,
public static readonly int VIRTUAL_METHOD_COUNT_LEVEL_DEFAULT = 18;
[UnmanagedFunctionPointer(CallingConvention.StdCall, CharSet = CharSet.Unicode, SetLastError = true)]
public delegate int DXGISwapChainPresentDelegate(IntPtr thisPtr, uint syncInterval, SharpDX.DXGI.PresentFlags flags);
public delegate int DXGISwapChainPresentHookDelegate(UnmodifiableHook<DXGISwapChainPresentDelegate> hook, IntPtr thisPtr, uint syncInterval, SharpDX.DXGI.PresentFlags flags);
private DXGISwapChainPresentHookDelegate _present;
private Hook<DXGISwapChainPresentDelegate> presentHook;
static TestSwapChainHook()
SharpDX.DXGI.Rational rational = new SharpDX.DXGI.Rational(60, 1);
SharpDX.DXGI.ModeDescription modeDescription = new SharpDX.DXGI.ModeDescription(100, 100, rational, SharpDX.DXGI.Format.R8G8B8A8_UNorm);
SharpDX.DXGI.SampleDescription sampleDescription = new SharpDX.DXGI.SampleDescription(1, 0);
using (SharpDX.Windows.RenderForm renderForm = new SharpDX.Windows.RenderForm())
SharpDX.DXGI.SwapChainDescription swapChainDescription = new SharpDX.DXGI.SwapChainDescription();
swapChainDescription.BufferCount = 1;
swapChainDescription.Flags = SharpDX.DXGI.SwapChainFlags.None;
swapChainDescription.IsWindowed = true;
swapChainDescription.ModeDescription = modeDescription;
swapChainDescription.OutputHandle = renderForm.Handle;
swapChainDescription.SampleDescription = sampleDescription;
swapChainDescription.SwapEffect = SharpDX.DXGI.SwapEffect.Discard;
swapChainDescription.Usage = SharpDX.DXGI.Usage.RenderTargetOutput;
SharpDX.Direct3D11.Device device = null;
SharpDX.DXGI.SwapChain swapChain = null;
SharpDX.Direct3D11.Device.CreateWithSwapChain(SharpDX.Direct3D.DriverType.Hardware, SharpDX.Direct3D11.DeviceCreationFlags.BgraSupport, swapChainDescription, out device, out swapChain);
IntPtr swapChainVirtualTable = Marshal.ReadIntPtr(swapChain.NativePointer);
for (int x = 0; x < VIRTUAL_METHOD_COUNT_LEVEL_DEFAULT; x++)
SWAP_CHAIN_VIRTUAL_TABLE_ADDRESSES[x] = Marshal.ReadIntPtr(swapChainVirtualTable, x * IntPtr.Size);
catch (Exception)
if (device != null)
if (swapChain != null)
public TestSwapChainHook()
this._present = null;
this.presentHook = new Hook<DXGISwapChainPresentDelegate>(
new DXGISwapChainPresentDelegate(hookPresent),
public void activate()
public void deactivate()
private int hookPresent(IntPtr thisPtr, uint syncInterval, SharpDX.DXGI.PresentFlags flags)
lock (this.presentHook)
if (this._present == null)
return this.presentHook.original(thisPtr, syncInterval, flags);
return this._present(new UnmodifiableHook<DXGISwapChainPresentDelegate>(this.presentHook), thisPtr, syncInterval, flags);
public DXGISwapChainPresentHookDelegate present
lock (this.presentHook)
return this._present;
lock (this.presentHook)
this._present = value;
Using code
private TestSwapChainHook swapChainHook;
private bool capture = false;
private object captureLock = new object();
this.swapChainHook = new TestSwapChainHook();
this.swapChainHook.present = presentHook;
I used a different method to capture a screenshot described in this link. However my screenshot turns out like this:
Now this seems to be a problem with my conversion settings or whatever but I'm unable to find out what exactly I need to do to fix it. I know that the surface I'm converting to a bitmap uses the DXGI_FORMAT_R10G10B10A2_UNORM format (32-bits, 10 bits per color and 2 for alpha I think?). But I'm not sure how this even works in the for loops (skipping bytes and stuff). I just plain copy pasted it.
new hook function
private int presentHook(UnmodifiableHook<IDXGISwapChainHook.DXGISwapChainPresentDelegate> hook, IntPtr thisPtr, uint syncInterval, SharpDX.DXGI.PresentFlags flags)
lock (this.captureLock)
if (this.capture)
SharpDX.DXGI.SwapChain swapChain = (SharpDX.DXGI.SwapChain)thisPtr;
using (SharpDX.Direct3D11.Texture2D backBuffer = swapChain.GetBackBuffer<SharpDX.Direct3D11.Texture2D>(0))
SharpDX.Direct3D11.Texture2DDescription texture2DDescription = backBuffer.Description;
texture2DDescription.CpuAccessFlags = SharpDX.Direct3D11.CpuAccessFlags.Read;
texture2DDescription.Usage = SharpDX.Direct3D11.ResourceUsage.Staging;
texture2DDescription.OptionFlags = SharpDX.Direct3D11.ResourceOptionFlags.None;
texture2DDescription.BindFlags = SharpDX.Direct3D11.BindFlags.None;
using (SharpDX.Direct3D11.Texture2D texture = new SharpDX.Direct3D11.Texture2D(backBuffer.Device, texture2DDescription))
backBuffer.Device.ImmediateContext.CopyResource(backBuffer, texture);
using (SharpDX.DXGI.Surface surface = texture.QueryInterface<SharpDX.DXGI.Surface>())
SharpDX.DataStream dataStream;
SharpDX.DataRectangle map = surface.Map(SharpDX.DXGI.MapFlags.Read, out dataStream);
byte[] pixelData = new byte[surface.Description.Width * surface.Description.Height * 4];
int lines = (int)(dataStream.Length / map.Pitch);
int dataCounter = 0;
int actualWidth = surface.Description.Width * 4;
for (int y = 0; y < lines; y++)
for (int x = 0; x < map.Pitch; x++)
if (x < actualWidth)
pixelData[dataCounter++] = dataStream.Read<byte>();
GCHandle handle = GCHandle.Alloc(pixelData, GCHandleType.Pinned);
using (Bitmap bitmap = new Bitmap(surface.Description.Width, surface.Description.Height, map.Pitch, PixelFormat.Format32bppArgb, handle.AddrOfPinnedObject()))
if (handle.IsAllocated)
this.capture = false;
catch(Exception ex)
return hook.original(thisPtr, syncInterval, flags);
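It turns out the DXGI_FORMAT_R10G10B10A2_UNORM format packs each pixel into a 32-bit value with this bit layout: red in bits 0-9, green in bits 10-19, blue in bits 20-29 and alpha in bits 30-31.
And Format32bppArgb stores each pixel in memory in this byte order: blue, green, red, alpha.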
So the final loop code would be:
while (pixelIndex < pixelData.Length)
uint currentPixel = dataStream.Read<uint>();
uint r = (currentPixel & 0x3FF);
uint g = (currentPixel & 0xFFC00) >> 10;
uint b = (currentPixel & 0x3FF00000) >> 20;
uint a = (currentPixel & 0xC0000000) >> 30;
pixelData[pixelIndex++] = (byte)(b >> 2);
pixelData[pixelIndex++] = (byte)(g >> 2);
pixelData[pixelIndex++] = (byte)(r >> 2);
pixelData[pixelIndex++] = (byte)(a << 6);
while ((pixelIndex % map.Pitch) >= actualWidth)
That screenshot does look like R10G10B10A2 is getting stuffed into R8G8B8A8. I haven't tested your code but we should have this bit layout
xxxxxxxx yyyyyyyy zzzzzzzz wwwwwwww
and you can extract them as follows
byte x = data[ptr++];
byte y = data[ptr++];
byte z = data[ptr++];
byte w = data[ptr++];
int r = x << 2 | y >> 6;
int g = (y & 0x3F) << 4 | z >> 4;
int b = (z & 0xF) << 6 | w >> 2;
int a = w & 0x3;
where r, g, b now have 10 bit resolution. If you want to scale them back to bytes you can do that with (byte)(r >> 2).
This would replace your double for loop. I have no way of testing this so I don't want to push it further, but I believe the idea is correct. The last check should skip the padding bytes in each row.
while(dataCounter < pixelData.Length)
byte x = dataStream.Read<byte>();
byte y = dataStream.Read<byte>();
byte z = dataStream.Read<byte>();
byte w = dataStream.Read<byte>();
int r = x << 2 | y >> 6;
int g = (y & 0x3F) << 4 | z >> 4;
int b = (z & 0xF) << 6 | w >> 2;
int a = w & 0x3;
pixelData[dataCounter++] = (byte)(r >> 2);
pixelData[dataCounter++] = (byte)(g >> 2);
pixelData[dataCounter++] = (byte)(b >> 2);
pixelData[dataCounter++] = (byte)(a << 6);
while((dataCounter % map.Pitch) >= actualWidth)
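As a cross-check of the bit arithmetic, here is a self-contained C++ sketch of the conversion. It assumes a little-endian host, the pixel read as one 32-bit value with red in the lowest 10 bits (the layout used by the mask-and-shift loop further up), and rows that are pitch bytes apart in the mapped memory. The names unpackPixel and convertSurface are made up for the example. One deliberate difference: alpha is expanded with a * 85 so that the 2-bit maximum maps to 255, instead of a << 6.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Unpack one R10G10B10A2 pixel into B, G, R, A bytes (Format32bppArgb order).
void unpackPixel(std::uint32_t p, std::uint8_t* bgra)
{
    const std::uint32_t r = p & 0x3FFu;
    const std::uint32_t g = (p >> 10) & 0x3FFu;
    const std::uint32_t b = (p >> 20) & 0x3FFu;
    const std::uint32_t a = (p >> 30) & 0x3u;
    bgra[0] = static_cast<std::uint8_t>(b >> 2);  // 10 bits down to 8
    bgra[1] = static_cast<std::uint8_t>(g >> 2);
    bgra[2] = static_cast<std::uint8_t>(r >> 2);
    bgra[3] = static_cast<std::uint8_t>(a * 85);  // 0..3 scaled to 0..255
}

// Convert a mapped surface to a tightly packed BGRA buffer, skipping the
// padding bytes at the end of each row.
std::vector<std::uint8_t> convertSurface(const std::uint8_t* mapped,
                                         std::size_t width, std::size_t height,
                                         std::size_t pitch)
{
    assert(pitch >= width * 4);
    std::vector<std::uint8_t> out(width * height * 4);
    for (std::size_t y = 0; y < height; ++y) {
        const std::uint8_t* row = mapped + y * pitch;  // start of source row
        for (std::size_t x = 0; x < width; ++x) {
            std::uint32_t p;
            std::memcpy(&p, row + x * 4, sizeof p);    // little-endian host assumed
            unpackPixel(p, &out[(y * width + x) * 4]);
        }
    }
    return out;
}
```

Because the output is tightly packed, a bitmap built from it should use width * 4 as its stride rather than map.Pitch.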

Saving a vector as a single number?

I was wondering if it would be possible to get a vector with an X and a Y value as a single number, knowing that both X and Y can range from -65000 to +65000.
Is this possible in any way?
Code examples on how to convert from this kind of number and to it would be nice.
Store it in a ulong:
ulong rslt = (uint)x;
rslt = rslt << 32;
rslt |= ((uint)y);
To get it out:
int x = (int)(rslt >> 32);
int y = (int)(rslt & 0xFFFFFFFF);
Assuming X and Y are both integer values and there is no overflow (32-bit values are not enough), you can use e.g. (pseudocode)
V = fromXY(X, Y) = (Y+65000)*130001+(X+65000)
(X,Y) = toXY(V) = (V%130001-65000, V/130001-65000) // <= / is integer division
(130001 is the number of distinct values for X or Y)
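The pseudocode can be written out as a short C++ sketch. The constant names kLimit and kSpan and the XY struct are choices made for this example. A 64-bit integer is used because the largest packed value, 130000 * 130001 + 130000 = 16900260000, does not fit in 32 bits.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::int64_t kLimit = 65000;           // |x| and |y| are at most this
constexpr std::int64_t kSpan = 2 * kLimit + 1;   // 130001 distinct values per axis

struct XY { std::int64_t x, y; };

// Shift both coordinates into 0..130000, then combine them row-major.
std::int64_t fromXY(std::int64_t x, std::int64_t y)
{
    assert(x >= -kLimit && x <= kLimit);
    assert(y >= -kLimit && y <= kLimit);
    return (y + kLimit) * kSpan + (x + kLimit);
}

// Inverse: remainder gives x, integer quotient gives y, then undo the shift.
XY toXY(std::int64_t v)
{
    return XY{ v % kSpan - kLimit, v / kSpan - kLimit };
}
```

For example, fromXY(-65000, -65000) is 0, and toXY(fromXY(-123, 456)) gives back x = -123, y = 456.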
To combine:
var limit = 65000;
var x = 1;
var y = 2;
var single = x * (limit + 1) + y;
And then:
y = single % (limit + 1);
x = (single - y) / (limit + 1);
See it in action.
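Of course, you have to assume that the maximum value for single fits within the size of the data type that stores it (which in this case it does). This form also assumes x and y are non-negative and no larger than limit; for values that can be negative, add limit to each one first and use 2 * limit + 1 as the multiplier.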
A union does what you want very easily.
See also: http://www.cplusplus.com/doc/tutorial/other_data_types/
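#include <cstdint>
#include <iostream>
using namespace std;
typedef int64_t int64; // plain long is only 32 bits on some platforms
typedef int32_t int32;
union {
    struct { int32 a, b; }; // anonymous struct: a widely supported compiler extension
    int64 a_and_b;
} stacker;
int main ()
{
    stacker.a = -1000;
    stacker.b = 2000;
    cout << stacker.a << ", " << stacker.b << endl;
    cout << stacker.a_and_b << endl;
}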
this will output:
-1000, 2000 <-- a and b read as two int32
8594229558296 <-- a and b interpreted as a single int64

How to reverse that function

I've asked before about the opposite of bitwise AND (&) and was told it's impossible to reverse.
Well, this is the situation: the server sends an image, which is encoded with the function I want to reverse and then compressed with zlib.
This is how I get the image from the server:
UInt32[] image = new UInt32[200 * 64];
int imgIndex = 0;
byte[] imgdata = new byte[compressed];
byte[] imgdataout = new byte[uncompressed];
Array.Copy(data, 17, imgdata, 0, compressed);
imgdataout = zlib.Decompress(imgdata);
for (int h = 0; h < height; h++)
{
    for (int w = 0; w < width; w++)
    {
        imgIndex = (int)((height - 1 - h) * width + w);
        image[imgIndex] = 0xFF000000;
        if (((1 << (Int32)(0xFF & (w & 0x80000007))) & imgdataout[((h * width + w) >> 3)]) > 0)
            image[imgIndex] = 0xFFFFFFFF;
    }
}
Width, height, and the decompressed and compressed image lengths are always the same.
When this function is done I put image (the UInt32[] array) in a Bitmap and I've got it.
Now I want to be the server and send that image. I have to do two things:
Reverse that function and then compress the result with zlib.
How do I reverse that function so I can encode the picture?
for (int h = 0; h < height; h++)
{
    for (int w = 0; w < width; w++)
    {
        imgIndex = (int)((height - 1 - h) * width + w);
        image[imgIndex] = 0xFF000000;
        if (((1 << (Int32)(0xFF & (w & 0x80000007))) & imgdataout[((h * width + w) >> 3)]) > 0)
            image[imgIndex] = 0xFFFFFFFF;
    }
}
EDIT:The format is 32bppRGB
The assumption that the & operator is always irreversible is incorrect.
Yes, in general if you have
c = a & b
and all you know is the value of c, then you cannot know what values a or b had beforehand.
However it's very common for & to be used to extract certain bits from a longer value, where those bits were previously combined together with the | operator and where each 'bit field' is independent of every other. The fundamental difference with the generic & or | operators that makes this reversible is that the original bits were all zero beforehand, and the other bits in the word are left unchanged. i.e:
0xc0 | 0x03 = 0xc3 // combine two nybbles
0xc3 & 0xf0 = 0xc0 // extract the top nybble
0xc3 & 0x0f = 0x03 // extract the bottom nybble
In this case your current function appears to be extracting a 1 bit-per-pixel (monochrome image) and converting it to 32-bit RGBA.
You'll need something like:
int source_image[];
byte dest_image[];
for (int h = 0; h < height; ++h) {
    for (int w = 0; w < width; ++w) {
        int offset = (h * width) + w;
        if (source_image[offset] == 0xffffffff) {
            int mask = w % 8; // these two lines convert from one int-per-pixel
            offset /= 8; // offset to one-bit-per-pixel
            dest_image[offset] |= (1 << mask); // only changes _one_ bit
        }
    }
}
NB: assumes the image is a multiple of 8 pixels wide, that the dest_image array was previously all zeroes. I've used % and / in that inner test because it's easier to understand and the compiler should convert to mask / shift itself. Normally I'd do the masking and shifting myself.
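One detail worth adding: the decoding loop in the question also flips the image vertically, because it reads bit (h * width + w) but writes pixel ((height - 1 - h) * width + w). An encoder that round-trips with it has to apply the same flip. Here is a hedged C++ sketch of both directions, assuming the width is a multiple of 8 and that white is exactly 0xFFFFFFFF. The names decode1bpp and encode1bpp are invented for the example, and the zlib step is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mirror of the question's loop: 1 bit per pixel in, 32-bit pixels out,
// with the rows flipped vertically.
std::vector<std::uint32_t> decode1bpp(const std::vector<std::uint8_t>& bits,
                                      std::size_t width, std::size_t height)
{
    assert(bits.size() * 8 >= width * height);
    std::vector<std::uint32_t> image(width * height, 0xFF000000u);
    for (std::size_t h = 0; h < height; ++h) {
        for (std::size_t w = 0; w < width; ++w) {
            const std::size_t bit = h * width + w;
            if (bits[bit >> 3] & (1u << (w & 7)))
                image[(height - 1 - h) * width + w] = 0xFFFFFFFFu;
        }
    }
    return image;
}

// Inverse: set one bit for every white pixel, undoing the vertical flip.
std::vector<std::uint8_t> encode1bpp(const std::vector<std::uint32_t>& image,
                                     std::size_t width, std::size_t height)
{
    assert(image.size() == width * height);
    assert(width % 8 == 0);
    std::vector<std::uint8_t> bits(width * height / 8, 0);
    for (std::size_t h = 0; h < height; ++h) {
        for (std::size_t w = 0; w < width; ++w) {
            if (image[(height - 1 - h) * width + w] == 0xFFFFFFFFu) {
                const std::size_t bit = h * width + w;
                bits[bit >> 3] |= static_cast<std::uint8_t>(1u << (w & 7));
            }
        }
    }
    return bits;
}
```

With these two functions, encode1bpp(decode1bpp(bits, w, h), w, h) returns the original bits, which is a quick way to check the encoder before adding compression.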
