I have an application which uses a C++ DLL to communicate with a Canon camera; methods in this C++ DLL are invoked from a C# application. What I've seen is that memory usage increases while taking photos, as expected. But after I close the "Image Capture Window", the application still holds the same amount of memory as it did when all of the images had been captured.
Since my application consists of many layers of WPF UserControls, I thought that the "Image Preview UserControl" could not be garbage collected because other controls were subscribed to an event fired from this control. After some googling, I decided to implement the Weak Reference Pattern on the events.
//Source code found here: http://paulstovell.com/blog/weakevents
public sealed class WeakEventHandler<TEventArgs> where TEventArgs : EventArgs
{
private readonly WeakReference _targetReference;
private readonly MethodInfo _method;
public WeakEventHandler(EventHandler<TEventArgs> callback)
{
_method = callback.Method;
_targetReference = new WeakReference(callback.Target, true);
}
public void Handler(object sender, TEventArgs eventArgs)
{
var target = _targetReference.Target;
if (target != null)
{
var callback =
(Action<object, TEventArgs>)
Delegate.CreateDelegate(typeof (Action<object, TEventArgs>), target, _method, true);
if (callback != null)
{
callback(sender, eventArgs);
}
}
}
}
So, if I forget to unsubscribe from some events, the GC will collect the subscribers anyway. After some more testing, this approach did not work, so I decided to use the Redgate ANTS Memory Profiler.
I took three snapshots:
Before taking images
After I took 4 images
After destruction of the WPF controls
The result when comparing snapshot 1 and 3:
As you can see the amount of allocated Unmanaged Memory is the big problem here. My first thought would be that the C++ DLL isn't deallocating the allocated memory when the "Image Capture Window" is closed.
Am I correct that the problem is in the C++ plugin? Can I rule out the C# application? As far as I know, all code written in .NET uses managed memory.
Based upon a comment here is how the image arrives from the C++ plugin to the C# plugin:
From the C++ plugin there is a callback like this:
_resultcallback(img->GetImageInfo().Data, img->GetImageInfo().Width, img->GetImageInfo().Height, img->GetImageInfo().BPP);
And the method which receives the image on the C# side:
private void OnResultImageCallback(IntPtr imagePtr, int width, int height, int bitsPerPixel)
{
_state = CameraState.InitializedStandby;
_cbResultData.Width = width;
_cbResultData.Height = height;
_cbResultData.BitsPerPixel = bitsPerPixel;
int memSize = bitsPerPixel * width * height / 8;
_cbResultData.data = new byte[memSize];
Marshal.Copy(imagePtr, _cbResultData.data, 0, memSize);
_deleteAllocatedImageFunction(imagePtr);
if (ImageCaptured != null)
ImageCaptured(_cbResultData.data, _cbResultData.Width, _cbResultData.Height, _cbResultData.BitsPerPixel);
_cbResultData.data = null;
}
I also have a method to clear the allocated memory in my C++ which takes in a byte-pointer like this:
BOOL CanonEDSDKWnd::ClearImageBuffer(BYTE* img) {
_debug->Write(_T("CanonEDSDKWnd::ClearImageBuffer"));
delete[] img;
return TRUE;
}
Which is called from the C# code with the IntPtr from the callback
_deleteAllocatedImageFunction(imagePtr);
I think your callback function should look like the following:
C++ side:
_resultcallback(
img, // extend the signature
img->GetImageInfo().Data,
img->GetImageInfo().Width,
img->GetImageInfo().Height,
img->GetImageInfo().BPP
);
C# side:
private void OnResultImageCallback(IntPtr img, IntPtr imagePtr, int width, int height, int bitsPerPixel)
{
_state = CameraState.InitializedStandby;
_cbResultData.Width = width;
_cbResultData.Height = height;
_cbResultData.BitsPerPixel = bitsPerPixel;
int memSize = bitsPerPixel * width * height / 8;
_cbResultData.data = new byte[memSize];
Marshal.Copy(imagePtr, _cbResultData.data, 0, memSize);
_deleteAllocatedImageFunction(img);
if (ImageCaptured != null)
ImageCaptured(_cbResultData.data, _cbResultData.Width, _cbResultData.Height, _cbResultData.BitsPerPixel);
_cbResultData.data = null;
}
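To illustrate the idea of passing the owning object along with the pixel pointer, here is a minimal, self-contained C++ sketch. The names `Image`, `ResultCallback`, `deliver`, `on_result` and `release_image` are hypothetical; they are not part of the Canon EDSDK or of the plugin in the question. It assumes the plugin allocates an image object that owns its pixel data, so the release call has to receive that object rather than the raw data pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical image object that owns its pixel buffer.
struct Image {
    std::vector<std::uint8_t> data;
    int width;
    int height;
    int bpp;
};

static int g_live_images = 0;                  // image objects not yet released
static std::vector<std::uint8_t> g_last_copy;  // the consumer's own copy of the pixels

// Callback signature extended with an opaque handle to the owning object.
using ResultCallback = void (*)(void* handle, const std::uint8_t* pixels,
                                int width, int height, int bpp);

// Releases the whole image object, not just the pixel pointer.
void release_image(void* handle) {
    assert(handle != nullptr);
    delete static_cast<Image*>(handle);
    --g_live_images;
}

// Producer side: allocate an image and pass both the handle and the pixels on.
void deliver(ResultCallback cb, int width, int height, int bpp) {
    std::size_t size = static_cast<std::size_t>(width) * height * bpp / 8;
    Image* img = new Image{std::vector<std::uint8_t>(size, 0x7F), width, height, bpp};
    ++g_live_images;
    cb(img, img->data.data(), img->width, img->height, img->bpp);
}

// Consumer side: copy the pixels first, then release through the handle.
void on_result(void* handle, const std::uint8_t* pixels, int width, int height, int bpp) {
    std::size_t size = static_cast<std::size_t>(width) * height * bpp / 8;
    g_last_copy.assign(pixels, pixels + size);
    release_image(handle);
}
```

After `deliver(on_result, 4, 2, 24)` the consumer holds its own 24-byte copy and no image object is left alive on the producer side.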
private void RunEveryTenFrames(Color32[] pixels, int width, int height)
{
var thread = new Thread(() =>
{
Perform super = new HeavyOperation();
if (super != null)
{
Debug.Log("Result: " + super);
ResultHandler.handle(super);
}
});
thread.Start();
}
I'm running this function every 10 frames in Unity. Is this a bad idea? Also, when I try to call thread.Abort() inside the thread, I get an error saying that thread is not defined and that a local variable can't be used before it's defined.
Is it a good idea to create a new thread every 10 frames in Unity?
No. Ten frames is too short an interval for repeatedly creating a new Thread.
Creating a new Thread incurs overhead each time. That's not a problem when done once in a while, but it is when done every 10 frames. Remember, this is not every 10 seconds; it is every 10 frames.
Use ThreadPool. With ThreadPool.QueueUserWorkItem you reuse threads that already exist in the system instead of creating new ones each time.
Your new RunEveryTenFrames function with ThreadPool should look something like this:
private void RunEveryTenFrames(Color32[] pixels, int width, int height)
{
//Prepare parameter to send to the ThreadPool
Data data = new Data();
data.pixels = pixels;
data.width = width;
data.height = height;
ThreadPool.QueueUserWorkItem(new WaitCallback(ExtractFile), data);
}
private void ExtractFile(object a)
{
//Retrieve the parameters
Data data = (Data)a;
Perform super = new HeavyOperation();
if (super != null)
{
Debug.Log("Result: " + super);
ResultHandler.handle(super);
}
}
public struct Data
{
public Color32[] pixels;
public int width;
public int height;
}
If you ever need to call into Unity's API from this thread, see my other post on how to do that.
In my code I retrieve frames from a camera with a pointer to an unmanaged object, make some calculations on it and then I make it visualized on a picturebox control.
Before I go further in this application with all the details, I want to be sure that the base code for this process is good.
In particular I would like to:
- keep execution time minimal and avoid unnecessary operations, such as
copying more images than necessary. I want to keep only essential
operations
- understand if a delay in the calculation process on every frame could have detrimental effects on the way images are shown (i.e. if it is not printed what I expect) or some image is skipped
- prevent more serious errors, such as ones due to memory or thread management, or to image display.
For this purpose, I set up a few experimental lines of code (below), but I'm not able to explain the results I found. If you have the OpenCV binaries, you can try it yourself.
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Windows.Forms;
using System.Runtime.InteropServices;
using System.Threading;
public partial class FormX : Form
{
private delegate void setImageCallback();
Bitmap _bmp;
Bitmap _bmp_draw;
bool _exit;
double _x;
IntPtr _ImgBuffer;
bool buffercopy;
bool copyBitmap;
bool refresh;
public FormX()
{
InitializeComponent();
_x = 10.1;
// set experimental parameters
buffercopy = false;
copyBitmap = false;
refresh = true;
}
private void buttonStart_Click(object sender, EventArgs e)
{
Thread camThread = new Thread(new ThreadStart(Cycle));
camThread.Start();
}
private void buttonStop_Click(object sender, EventArgs e)
{
_exit = true;
}
private void Cycle()
{
_ImgBuffer = IntPtr.Zero;
_exit = false;
IntPtr vcap = cvCreateCameraCapture(0);
while (!_exit)
{
IntPtr frame = cvQueryFrame(vcap);
if (buffercopy)
{
UnmanageCopy(frame);
_bmp = SharedBitmap(_ImgBuffer);
}
else
{ _bmp = SharedBitmap(frame); }
// make calculations
int N = 1000000; /*1000000*/
for (int i = 0; i < N; i++)
_x = Math.Sin(0.999999 * _x);
ShowFrame();
}
cvReleaseImage(ref _ImgBuffer);
cvReleaseCapture(ref vcap);
}
private void ShowFrame()
{
if (pbCam.InvokeRequired)
{
this.Invoke(new setImageCallback(ShowFrame));
}
else
{
Pen RectangleDtPen = new Pen(Color.Azure, 3);
if (copyBitmap)
{
if (_bmp_draw != null) _bmp_draw.Dispose();
//_bmp_draw = new Bitmap(_bmp); // deep copy
_bmp_draw = _bmp.Clone(new Rectangle(0, 0, _bmp.Width, _bmp.Height), _bmp.PixelFormat);
}
else
{
_bmp_draw = _bmp; // add reference to the same object
}
Graphics g = Graphics.FromImage(_bmp_draw);
String drawString = _x.ToString();
Font drawFont = new Font("Arial", 56);
SolidBrush drawBrush = new SolidBrush(Color.Red);
PointF drawPoint = new PointF(10.0F, 10.0F);
g.DrawString(drawString, drawFont, drawBrush, drawPoint);
drawPoint = new PointF(10.0F, 300.0F);
g.DrawString(drawString, drawFont, drawBrush, drawPoint);
g.DrawRectangle(RectangleDtPen, 12, 12, 200, 400);
g.Dispose();
pbCam.Image = _bmp_draw;
if (refresh) pbCam.Refresh();
}
}
public void UnmanageCopy(IntPtr f)
{
if (_ImgBuffer == IntPtr.Zero)
_ImgBuffer = cvCloneImage(f);
else
cvCopy(f, _ImgBuffer, IntPtr.Zero);
}
// only works with 3 channel images from camera! (to keep code minimal)
public Bitmap SharedBitmap(IntPtr ipl)
{
// gets unmanaged data from pointer to IplImage:
IntPtr scan0;
int step;
Size size;
cvGetRawData(ipl, out scan0, out step, out size); // extern declared in this class
return new Bitmap(size.Width, size.Height, step, PixelFormat.Format24bppRgb, scan0);
}
// based on older version of OpenCv. Change dll name if different
[DllImport( "opencv_highgui246", CallingConvention = CallingConvention.Cdecl)]
public static extern IntPtr cvCreateCameraCapture(int index);
[DllImport("opencv_highgui246", CallingConvention = CallingConvention.Cdecl)]
public static extern void cvReleaseCapture(ref IntPtr capture);
[DllImport("opencv_highgui246", CallingConvention = CallingConvention.Cdecl)]
public static extern IntPtr cvQueryFrame(IntPtr capture);
[DllImport("opencv_core246", CallingConvention = CallingConvention.Cdecl)]
public static extern void cvGetRawData(IntPtr arr, out IntPtr data, out int step, out Size roiSize);
[DllImport("opencv_core246", CallingConvention = CallingConvention.Cdecl)]
public static extern void cvCopy(IntPtr src, IntPtr dst, IntPtr mask);
[DllImport("opencv_core246", CallingConvention = CallingConvention.Cdecl)]
public static extern IntPtr cvCloneImage(IntPtr src);
[DllImport("opencv_core246", CallingConvention = CallingConvention.Cdecl)]
public static extern void cvReleaseImage(ref IntPtr image);
}
Results [Core 2 Duo T6600, 2.2 GHz]:
A. buffercopy = false; copyBitmap = false; refresh = false;
This is the simplest configuration. Each frame is retrieved in turn, operations are performed (in reality they would be based on the frame itself; here they are just calculations), then the result of the calculations is drawn on top of the image, and finally the image is displayed in a picturebox.
OpenCv documentation says:
OpenCV 1.x functions cvRetrieveFrame and cv.RetrieveFrame return image
stored inside the video capturing structure. It is not allowed to
modify or release the image! You can copy the frame using
cvCloneImage() and then do whatever you want with the copy.
But this doesn’t prevent us from doing experiments.
If the calculations are not intense (low number of iterations, N), everything is fine, and the fact that we manipulate the image buffer owned by the unmanaged frame retriever doesn't pose a problem here.
The reason is probably that they advise leaving the buffer untouched in case people modify its structure (not its values) or operate on it asynchronously without realizing it. Here we retrieve frames and modify their content in turn.
If N is increased (N = 1000000 or more) and the frame rate is low, for example with artificial light and low exposure, everything seems fine at first, but after a while the video lags and the graphics drawn on it flicker. With a higher frame rate the flickering appears from the beginning, even while the video is still fluid.
Is this because the mechanism of displaying images on the control (or refreshing or whatever else) is somehow asynchronous and when the picturebox is fetching its buffer of data it is modified in the meanwhile by the camera, deleting the graphics?
Or is there some other reason?
Why does the image lag in that way? I would expect the delay due to the calculations to have one of two effects: either frames arriving from the camera while a calculation is still running are skipped, which in effect only reduces the frame rate; or all frames are received and queued, so that the system ends up processing images captured minutes earlier as the queue grows over time.
Instead, the observed behavior seems to be a hybrid of the two: there is a delay of a few seconds, but it does not seem to grow much as capturing goes on.
B. buffercopy = true; copyBitmap = false; refresh = false;
Here I make a deep copy of the buffer into a second buffer, following the advice of the OpenCv documentation.
Nothing changes. The second buffer doesn’t change its address in memory during the run.
C. buffercopy = false; copyBitmap = true; refresh = false;
Now a (deep) copy of the bitmap is made, allocating new memory every time.
The flickering is gone, but the lag still appears after a certain time.
D. buffercopy = false; copyBitmap = false; refresh = true;
As before.
Please help me explain these results!
If I may be so frank, it is a bit tedious to understand all the details of your questions, but let me make a few points to help you analyse your results.
In case A, you say you perform calculations directly on the buffer. The documentation says you shouldn't do this, so if you do, you can expect undefined results. OpenCV assumes you won't touch it, so it might do stuff like suddenly delete that part of memory, let some other app process it, etc. It might look like it works, but you can never know for sure, so don't do it *slaps your wrist* In particular, if your processing takes a long time, the camera might overwrite the buffer while you're in the middle of processing it.
The way you should do it is to copy the buffer before doing anything. This will give you a piece of memory that is yours to do with whatever you wish. You can create a Bitmap that refers to this memory, and manually free the memory when you no longer need it.
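As a language-neutral illustration of "copy first, then process", here is a small C++ sketch. The function `copy_frame` is a hypothetical helper, not an OpenCV call; it assumes the frame is a contiguous block of `height` rows of `step` bytes each, which is the layout the question reads via cvGetRawData.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copies `height` rows of `step` bytes each from a buffer owned by someone else
// (for example the capture structure) into a vector owned by the caller.
std::vector<std::uint8_t> copy_frame(const std::uint8_t* src, int height, int step) {
    assert(src != nullptr && height >= 0 && step >= 0);
    std::vector<std::uint8_t> owned(static_cast<std::size_t>(height) *
                                    static_cast<std::size_t>(step));
    if (!owned.empty()) {
        std::memcpy(owned.data(), src, owned.size());
    }
    return owned;
}
```

Once the copy exists, later writes to the source buffer (such as the camera delivering the next frame) no longer affect the data being processed or drawn on.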
If your processing rate (frames processed per second) is less than the number of frames captured per second by the camera, you have to expect some frames will be dropped. If you want to show a live view of the processed images, it will lag and there's no simple way around it. If it is vital that your application processes a fluid video (e.g. this might be necessary if you're tracking an object), then consider storing the video to disk so you don't have to process in real-time. You can also consider multithreading to process several frames at once, but the live view would have a latency.
By the way, is there any particular reason why you're not using EmguCV? It has abstractions for the camera and a system that raises an event whenever the camera has captured a new frame. This way, you don't need to continuously call cvQueryFrame on a background thread.
I think that you still have a problem with your UnmanageCopy method in that you only clone the image the first time this is called and you subsequently copy it. I believe that you need to do a cvCloneImage(f) every time as copy performs only a shallow copy, not a deep copy as you seem to think.
I asked a few questions about a project I am working on, got some good feedback, and made some progress. The idea is to create an application that generates images of fractals, accelerated by CUDA. I am creating the UI in C# and having a DLL do the heavy lifting.
Basically, I am allocating a byte array in C#, passing it to the DLL to fill with pixel data, and then using it to create a Bitmap that is displayed in a Windows Forms PictureBox in the UI. Previous questions have helped (I was using the DLL to allocate memory before, and I now use a consistent calling convention between the DLL and C#), but the code still throws a System.ArgumentException at "img = new Bitmap(...)".
Relevant Code:
C++
extern "C" __declspec(dllexport) void __cdecl generateBitmap(void *bitmap)
{
int width = 1920;
int height = 1080;
int *dev_bmp;
gpuErrchk(cudaMalloc((void**)&dev_bmp, (3*width*height*sizeof(int))));
kernel<<<BLOCKS_PER_GRID, THREADS_PER_BLOCK>>>(dev_bmp, width, height);
gpuErrchk(cudaPeekAtLastError());
gpuErrchk(cudaDeviceSynchronize());
gpuErrchk(cudaMemcpy(bitmap, dev_bmp, (width*height*3), cudaMemcpyDeviceToHost));
cudaFree(dev_bmp);
}
C#
public unsafe class NativeMethods
{
[DllImport(@"C:\Users\Bill\Documents\Visual Studio 2012\Projects\FractalMaxUnmanaged\Debug\FractalMaxUnmanaged.dll", CallingConvention=CallingConvention.Cdecl)]
public static extern void generateBitmap(void *bitmap);
public static Bitmap create()
{
byte[] buf = new byte[1920 * 1080 * 3];
fixed (void* pBuffer = buf)
{
generateBitmap(pBuffer);
}
IntPtr unmanagedPtr = Marshal.AllocHGlobal(buf.Length);
Marshal.Copy(buf, 0, unmanagedPtr, buf.Length);
Bitmap img = new Bitmap(1920, 1080, 3, PixelFormat.Format24bppRgb, unmanagedPtr);
Marshal.FreeHGlobal(unmanagedPtr);
return img;
}
}
//...
private unsafe void mandlebrotButton_Click(object sender, EventArgs e)
{
FractalBox1.Image = (Image)NativeMethods.create();
}
What am I still doing wrong? As far as I can tell, all the parameters are valid, but I get an invalid parameter exception in System.Drawing when I try to create the bitmap.
I am not sure exactly what happens in your case, because you didn't say which parameter the exception reports as invalid. I can see, though, that your stride is not correct: you pass 3 (the number of bytes per pixel) where the number of bytes per scan line is expected.
stride Type: System.Int32
Integer that specifies the byte offset between the beginning of one
scan line and the next. This is usually (but not necessarily) the
number of bytes in the pixel format (for example, 2 for 16 bits per
pixel) multiplied by the width of the bitmap. The value passed to this
parameter must be a multiple of four.
So your constructor should be like this:
Bitmap img = new Bitmap(1920, 1080, 1920 * 3, PixelFormat.Format24bppRgb, unmanagedPtr);
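The "multiple of four" rule quoted above can be written as a small calculation. The following C++ sketch uses a hypothetical helper `aligned_stride` (not a GDI+ or .NET function) that rounds the row size up to the next multiple of four bytes.

```cpp
#include <cassert>

// Smallest row size in bytes that holds `width` pixels of `bits_per_pixel` bits
// each and is a multiple of four.
int aligned_stride(int width, int bits_per_pixel) {
    assert(width > 0 && bits_per_pixel > 0);
    return ((width * bits_per_pixel + 31) / 32) * 4;
}
```

For 1920 pixels at 24 bits per pixel this gives 5760, which equals 1920 * 3, so no padding is needed here. A width of 1921 would give 5764 rather than 5763, and the pixel buffer would then have to be laid out with that padded row size.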
Perhaps the problem I have is a bit specific, but I'm sure the solution would be interesting to a lot of people.
Now to the point. I have an ActiveX control that plays streaming video. My goal is to get to every frame it plays and display them in external c# application over some windows control, a panel, for instance.
Here is the sample DirectShow transform filter:
STDMETHODIMP CTransform::Transform(BSTR bsResource, struct U_VideoFrame *pInFrame, struct U_VideoFrameData **pOutFrameData)
{
//Must allocate memory this way, the output size must be equal to input size
*pOutFrameData = (U_VideoFrameData*)CoTaskMemAlloc(sizeof(U_VideoFrameData));
(*pOutFrameData)->pFrame = (BYTE*)CoTaskMemAlloc(pInFrame->Frame.nLength);
(*pOutFrameData)->nLength = pInFrame->Frame.nLength;
//Now transform data contained in (*pOutFrameData)->pFrame;
//We simply copy data here
memcpy((*pOutFrameData)->pFrame, pInFrame->Frame.pFrame, pInFrame->Frame.nLength);
return S_OK;
}
My idea is that somewhere inside this method I should place a callback function that will call my managed code and pass pInFrame to it. How can I do it? Please help
P.S. I have read the great article Howto implement callback interface from unmanaged DLL to .net app. It works as described (of course). However, when I modify the code above to this:
typedef int (__stdcall * Callback)(const char* text);
static Callback Handler = 0;
extern "C" __declspec(dllexport)
void __stdcall SetCallback(Callback handler) {
Handler = handler;
}
extern "C" __declspec(dllexport)
void __stdcall TestCallback() {
int retval = Handler("hello world");
}
// CTransform
STDMETHODIMP CTransform::Transform(BSTR bsResource, struct U_VideoFrame *pInFrame, struct U_VideoFrameData **pOutFrameData)
{
//Must allocate memory this way, the output size must be equal to input size
*pOutFrameData = (U_VideoFrameData*)CoTaskMemAlloc(sizeof(U_VideoFrameData));
(*pOutFrameData)->pFrame = (BYTE*)CoTaskMemAlloc(pInFrame->Frame.nLength);
(*pOutFrameData)->nLength = pInFrame->Frame.nLength;
//Now transform data contained in (*pOutFrameData)->pFrame;
//We simply copy data here
memcpy((*pOutFrameData)->pFrame, pInFrame->Frame.pFrame, pInFrame->Frame.nLength);
if (Handler != 0)
int retval = Handler("Transform");
return S_OK;
}
then the callback does not fire from the Transform method, although the TestCallback() method works.
I'm stuck. Any help will be greatly appreciated.
I'm wondering how the allocation and disposal of memory for bitmaps works in .NET.
When I create a lot of bitmaps in loops inside a function and call that function repeatedly, it works up to a point; then Bitmap can no longer allocate memory and throws an "Invalid parameter" exception for the specified size.
If I call the garbage collector from time to time, it works.
With the following code you can reproduce the error:
class BitmapObject {
public bool Visible {
get { return enb; }
set { enb = value; }
}
private bool enb;
private Bitmap bmp;
public BitmapObject(int i, bool en)
{
enb = en;
bmp = new Bitmap(i, i);
}
}
class Pool<T> where T : BitmapObject
{
List<T> preallocatedBitmaps = new List<T>();
public void Fill() {
Random r = new Random();
for (int i = 0; i < 500; i++) {
BitmapObject item = new BitmapObject(500, r.NextDouble() > 0.5);
preallocatedBitmaps.Add(item as T);
}
}
public IEnumerable<T> Objects
{
get
{
foreach (T component in this.preallocatedBitmaps)
{
if (component.Visible)
{
yield return (T)component;
}
}
}
}
}
static class Program
{
/// <summary>
/// The main entry point for the application.
/// </summary>
[STAThread]
static void Main()
{
for (int i = 0; i < 10; i++ )
{
Test();
// without this it breaks
//GC.Collect();
//GC.WaitForPendingFinalizers();
}
Console.ReadKey();
}
private static void Test() {
Pool<BitmapObject> pool = new Pool<BitmapObject>();
pool.Fill();
for (int i = 0; i < 100; i++)
{
var visBitmaps = pool.Objects;
// do something
}
}
}
The Bitmap class is inevitably the one where you have to stop ignoring that IDisposable exists. It is a small wrapper class around a GDI+ object. GDI+ is unmanaged code. The bitmap occupies unmanaged memory. A lot of it when the bitmap is large.
The .NET garbage collector ensures that unmanaged system resources are released with the finalizer thread. Problem is, it only kicks into action when you create sufficient amounts of managed objects to trigger a garbage collection. That won't work well for the Bitmap class, you can create many thousands of them before generation #0 of the garbage collected heap fills up. You will run out of unmanaged memory before you can get there.
Managing the lifetime of the bitmaps you use is required. Call the Dispose() method when you no longer have a use for it. That's not always the golden solution, you may have to re-think your approach if you simply have too many live bitmaps. A 64-bit operating system is the next solution.
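To get a feel for the numbers in the question's repro, here is a rough back-of-the-envelope calculation as a C++ sketch. The helper names are hypothetical. It assumes 4 bytes per pixel, which matches the default 32-bit ARGB format of `new Bitmap(width, height)`, and it ignores any per-object overhead.

```cpp
#include <cassert>
#include <cstdint>

// Bytes of pixel data for a square bitmap with the given side length.
std::int64_t bitmap_bytes(std::int64_t side, std::int64_t bytes_per_pixel) {
    assert(side >= 0 && bytes_per_pixel > 0);
    return side * side * bytes_per_pixel;
}

// Total pixel data when `count` such bitmaps are alive at the same time.
std::int64_t pool_bytes(std::int64_t count, std::int64_t side,
                        std::int64_t bytes_per_pixel) {
    return count * bitmap_bytes(side, bytes_per_pixel);
}
```

One 500 x 500 bitmap holds 1,000,000 bytes of pixel data, so one pool of 500 bitmaps holds about 500 MB. If none of the pools from earlier calls to Test() has been finalized yet, ten calls add up to about 5 GB of unmanaged memory, far more than a 32-bit process can address.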
The .NET Bitmap class "encapsulates a GDI+ bitmap", which means you should call Dispose on a Bitmap when you are finished with it:
"Always call Dispose before you
release your last reference to the
Image. Otherwise, the resources it is
using will not be freed until the
garbage collector calls the Image
object's Finalize method."
Why don't you use the using keyword? Just wrap your Bitmap object in a using statement and the compiler will ensure that the Dispose method is called.
It's simply a syntactic shortcut for
try
{
...
}
finally
{
...Dispose();
}