C# - Multithreaded Processing of a single Image (Webcam Frames)

I have made a program which can capture webcam frames and display them after running different per-pixel algorithms on them - for example, converting the image to grayscale.
At the moment I am using the FastBitmap class (can't find the link at the moment), which uses pointers to get and set pixels within a bitmap.
However, I wanted to make my program multithreaded so that multiple threads work on the same image. For that I split the image into several sections via its BitmapData (one section per thread) and let each thread work on its own BitmapData section. In the end a "manager" waits until all threads are done (join) and hands back the resulting image.
That's the theory, but in practice it isn't working for me.
When I run this program I get various strange errors: that I have to release the HDC before reusing it, that I am accessing illegal memory, external exceptions, and so on. It's a different error every time and I can't understand why, but I think the BitmapData sections are the main problem - and I don't want to fall back to the slower Marshal copy!
So my questions are the following:
Is it possible to have sectioned multithreaded image processing in C# with unsafe pointer methods?
If yes - how?
As for image processing libraries:
I don't need filters or standard image processing algorithms; I need my own "per pixel" algorithms - I've even thought about adding a pixel shader to my program. xD
As my program is built around converting webcam frames, I need the fastest approach possible.
I've read every forum post and tutorial I could find and still have no idea how to do this correctly with unsafe code, so I finally made this account to ask the question here.
Robbepop

Of course it is possible :)
Take a look at:
https://github.com/dajuric/accord-net-extensions
This library contains exactly what you want:
a parallel processor which is used to execute parallel operations on an image (e.g. color conversion) - and yes, the functions which operate on an image are unsafe (they use pointers).
NuGet packages are ready.
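If you want to do it by hand instead, the usual trick is to lock the bitmap once and let each worker process a disjoint range of rows through the same BitmapData, unlocking only after every worker has finished. The sketch below is only an illustration (not code from that library); it assumes a 24bpp frame and a project compiled with "Allow unsafe code":

// Minimal sketch: one LockBits call, per-row parallel work through pointers,
// one UnlockBits call after all workers are done.
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Threading.Tasks;

static class PerPixelProcessor
{
    public static void ToGrayscale(Bitmap bmp)
    {
        var rect = new Rectangle(0, 0, bmp.Width, bmp.Height);
        BitmapData data = bmp.LockBits(rect, ImageLockMode.ReadWrite,
                                       PixelFormat.Format24bppRgb);
        try
        {
            IntPtr scan0 = data.Scan0;
            int stride = data.Stride;
            int width = data.Width;

            // Each iteration owns exactly one row, so no two threads
            // ever touch the same bytes.
            Parallel.For(0, data.Height, y =>
            {
                unsafe
                {
                    byte* row = (byte*)scan0 + y * stride;
                    for (int x = 0; x < width; x++)
                    {
                        byte* px = row + x * 3;                       // B, G, R
                        byte gray = (byte)((px[0] + px[1] + px[2]) / 3);
                        px[0] = px[1] = px[2] = gray;
                    }
                }
            });
        }
        finally
        {
            bmp.UnlockBits(data);   // unlock once, after every worker has finished
        }
    }
}

The errors described in the question usually come from locking/unlocking per thread or from threads touching overlapping regions; a single lock plus disjoint row ranges avoids both.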

Related

Fast copying of GUI in C#?

Right now I'm copying window graphics from one window to another via BitBlt from WinApi. I wonder if there is any other, faster way to do the same in C#.
The keyword here is performance. If I stay with WinApi I would keep the HDCs in memory for quick drawing, and if the .NET Framework has other possibilities I would probably hold on to Graphics objects instead. Right now it's somewhat too slow when I have to copy ~1920x1080 windows.
So how can I boost the performance of GUI copying in C#?
I just want to know if I can do better than this. Explicit hardware acceleration (OpenGL, DirectX) is out of scope here; I've decided to stay with pure .NET + WinApi.
// example to copy desktop to window
Graphics g = Graphics.FromHwnd(Handle);
IntPtr dc = g.GetHdc();
IntPtr dc0 = Windows.GetWindowDC(Windows.GetDesktopWindow());
Windows.BitBlt(dc, 0, 0, Width, Height, dc0, 0, 0, Windows.SRCCOPY);
// clean up of DCs and Graphics left out
Hans's questions:
How slow is it?
Too slow. Feels (very) stiff.
How fast does it need to be?
The software mustn't feel slow.
Why is it important?
User friendliness of software.
What does the hardware look like?
Any random PC.
Why can't you solve it with better hardware?
It's just software that runs on Windows machines. You wouldn't go out and buy a new PC just because some random piece of software runs slowly on your old one, would you?
Get a better video card!
The whole point of GDI, whether you access it from native code or via .NET, is that it abstracts the details of the graphics subsystem. The downside is that low-level operations such as blitting are in the hands of the graphics driver writers. You can safely assume that these are as optimised as it's possible to get (after all, a video card manufacturer wants to make their card look the best).
The overhead of using the .NET wrappers instead of calling the native functions directly pales into insignificance compared to the time spent doing the operation itself, so you're not going to gain much there.
Your code is doing a non-scaling, non-blending copy, which is probably the fastest way to copy images.
Of course, you should be profiling the code to see what effect any changes you make are actually having.
The question then is why are you copying such large images from one window to another? Do you control the contents of both windows?
Update
If you control both windows, why not draw to a single surface and then blit that to both windows? This would replace the often costly operation of reading data back from the video card (i.e. blitting from one window to another) with two write operations. It's just a thought and might not work, but you'd need timing data to see whether it makes any difference.
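Something along these lines, in plain GDI+ (just a sketch; the window handles, sizes and method name are placeholders): render the shared content once into an off-screen bitmap, then write it to both windows instead of reading pixels back from one of them.

// Hypothetical sketch: draw once to a back buffer, present it to two windows.
using System;
using System.Drawing;

static void PresentToBothWindows(IntPtr hwndA, IntPtr hwndB, int width, int height)
{
    using (var backBuffer = new Bitmap(width, height))
    {
        using (var g = Graphics.FromImage(backBuffer))
        {
            // ... render the shared content once into the back buffer ...
            g.Clear(Color.CornflowerBlue);
        }

        // Two writes to the screen instead of a read-back from one window
        // followed by a write to the other.
        using (var gA = Graphics.FromHwnd(hwndA))
            gA.DrawImageUnscaled(backBuffer, 0, 0);
        using (var gB = Graphics.FromHwnd(hwndB))
            gB.DrawImageUnscaled(backBuffer, 0, 0);
    }
}

Whether that actually beats a straight BitBlt between the two windows is exactly the kind of thing the timing data would have to show.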

Mitigating suspension and/or termination while encoding video in an app

I have a Windows Store app with an option to export certain data in a video file format. My app is in C#, but the encoding itself is handled by dropping into a C++ library adapted from this sample by David Catuhe, and it is working well. The problem I have found is that the encoding process can take a long time when run at high quality, and if the screen times out (say, on a Surface RT) or the user switches apps, the process fails. I'm not entirely sure what the source of the failure is and am working to verify it, but even if the process were able to survive suspension without changes, I don't know how to handle being tombstoned.
I can live with the encoding being interrupted in certain situations. What I don't want is to have to start over from scratch if the app goes away for some reason.
As far as I can tell, it isn't feasible to simply close the stream without finalizing the video and resume writing to it later. In light of this, I have considered a few options, but I can't tell which, if any, might actually work. I'd be very grateful for some direction.
1) If possible, it'd be great to be able to simply close the stream and reopen it later, picking up where I left off. At the moment I haven't been able to get this to work, but if it SHOULD work I'd love to know.
2) Push the encode process to a background task, either from the start or only when tombstoned. But is there a way to pass an open stream from my app to a background task? If not, is there a way to get my app's background task to run without CPU/memory limitations at least while my app is in the foreground? Because doing a whole encode within the very tight constraints that normally bind background tasks would take years.
3) Render segments of the video progressively while the app is in the foreground and then stitch the parts together at the end. This way, if the encode is interrupted I can pick up at the most recent segment. From my reading this should be possible in theory (I think it falls under the category of remuxing, which would avoid the need to re-encode the video). But I haven't found any samples that cover this scenario, not even in C++ (which I have almost no experience with). The Transcode API doesn't seem to cover joining multiple samples. I've looked into using SharpDX to do it, but the most likely candidate for what I'd want to use (a Media Session) is only exposed for desktop apps.
4) Push the work off to either a desktop or server app. The problem is I want to have this run on Windows RT (so desktop is out) and I don't currently have a business model that can support servers capable of handling such intensive work on my customers' behalf.
So my question is, what is my best line of attack here? Is there any way to hold onto my stream across suspension? And if, as I suspect, option #3 is my best bet, do you know of any samples or guides on how to do it? Obviously C# options would be very much preferred, so I hope I am overlooking one. C++ might be OK (as it was with Mr. Catuhe's sample that got me this far), but I'm afraid I'd need some pretty specific guidance. The MSDN documentation on this, incidentally, is so high-level that I have only a vague idea of even which pieces I would need to assemble and what each requires, let alone how to write the actual program in C++.
Any help you could offer would be very much appreciated.
Unfortunately I don't have enough reputation points on SO to just comment so I have to give this as an answer.
You could consider a combination of #3 and #4. Render in segments within your app and then upload the segments for stitching together. This would bring you back into the realms of using a commodity solution to create your final output.

Save multiple images to disk rapidly in C#

I have a program in C# which saves a large number of images to disk after processing them. This seems to be taking quite a bit of time due to the fact that so many images need to be saved.
Now, I was wondering: is there any way to speed up saving images in C#? At the moment, I'm using the standard bmp.Save(filename) approach.
If it helps, part of the image generation process involves using LockBits to access and modify the pixel values more rapidly, so perhaps while I'm doing this, the images could be saved to disk at the same time? Apologies if this idea is daft, but I'm still somewhat new to C#.
You could certainly start a new thread for each image save. That would reduce the time taken a bit, though the disk would then become the bottleneck.
One other option would be to save the images to a temporary buffer list and then return control to the program. Then have a thread to write each one to disk in the background. Of course, that would only give the appearance of this happening quickly. It could possibly serve your needs though.
I am sure that .NET has some sort of asynchronous I/O to do this for you. I know Windows does, so it makes sense that it would be in .NET.
This may be helpful.
http://msdn.microsoft.com/en-us/library/kztecsys(v=vs.71).aspx
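A minimal sketch of the "buffer now, write in the background" idea, assuming it's acceptable to hand each finished bitmap off to a single background writer (the class and member names are made up):

// Processing threads enqueue finished bitmaps; one background task drains the
// queue and saves them, so disk writes overlap with further processing.
using System;
using System.Collections.Concurrent;
using System.Drawing;
using System.Drawing.Imaging;
using System.Threading.Tasks;

class BackgroundImageSaver : IDisposable
{
    private readonly BlockingCollection<Tuple<Bitmap, string>> _queue =
        new BlockingCollection<Tuple<Bitmap, string>>();
    private readonly Task _writer;

    public BackgroundImageSaver()
    {
        _writer = Task.Run(() =>
        {
            foreach (var item in _queue.GetConsumingEnumerable())
            {
                item.Item1.Save(item.Item2, ImageFormat.Png);   // or Bmp, Jpeg, ...
                item.Item1.Dispose();
            }
        });
    }

    public void Enqueue(Bitmap image, string path)
    {
        _queue.Add(Tuple.Create(image, path));
    }

    public void Dispose()
    {
        _queue.CompleteAdding();   // stop accepting work and let the writer drain
        _writer.Wait();
        _queue.Dispose();
    }
}

As noted above, the disk is still the bottleneck, so this mostly hides the write latency from the rest of the program rather than reducing the total I/O time.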

Translating C to C# and HLSL: will this be possible?

I've taken on quite a daunting challenge for myself. In my XNA game, I want to implement Blargg's NTSC filter. This is a C library that transforms a bitmap to make it look like it was output on a CRT TV with the NTSC standard. It's quite accurate, really.
The first thing I tried, a while back, was to just use the C library itself by calling it as a DLL. Here I had two problems: 1. I couldn't get some of the data to copy correctly, so the image was messed up, but more importantly, 2. it was extremely slow. It required getting the XNA Texture2D bitmap data, passing it through the filter, and then setting the data on the texture again. The framerate was ruined, so I couldn't go down this route.
Now I'm trying to translate the filter into a pixel shader. The problem here (if you're adventurous enough to look at the code - I'm using the SNES one because it's the simplest) is that it handles very large arrays and relies on interesting pointer operations. I've done a lot of work rewriting the algorithm to work independently per pixel, as a pixel shader requires. But I don't know if this will ever work. I've come to you to see if finishing this is even possible.
There's a precalculated array involved containing 1,048,576 integers. Is this alone beyond any limits for the pixel shader? It only needs to be set once, not once per frame.
Even if that's ok, I know that HLSL cannot index arrays by a variable. It has to unroll it into a million if statements to get the correct array element. Will this kill the performance and make it a fruitless endeavor again? There are multiple array accesses per pixel.
Is there any chance that my original plan to use the library as is could work? I just need it to be fast.
I've never written a shader before. Is there anything else I should be aware of?
edit: Addendum to #2. I just read somewhere that not only can hlsl not access arrays by variable, but even to unroll it, the index has to be calculable at compile time. Is this true, or does the "unrolling" solve this? If it's true I think I'm screwed. Any way around that? My algorithm is basically a glorified version of "the input pixel is this color, so look up my output pixel values in this giant array."
From my limited understanding of shader languages, your problem can easily be solved by using a texture instead of an array.
Pregenerate it on the CPU and then save it as a texture - 1024x1024 in your case.
Use the standard texture access functions as if the texture were the array, possibly with nearest-neighbor (point) sampling so that individual texels are not blended together.
I don't think this is possible if you want speed.
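To make the lookup-texture idea above concrete on the C#/XNA side (the method name, parameter names and effect parameter name below are made up): pack the 1,048,576 precalculated entries into a 1024x1024 texture once at startup and let the pixel shader read it with point sampling.

// Rough sketch: build a 1024x1024 lookup texture from the precalculated table.
using Microsoft.Xna.Framework;           // Color
using Microsoft.Xna.Framework.Graphics;  // Texture2D, SurfaceFormat, SamplerState

Texture2D BuildLookupTexture(GraphicsDevice device, int[] table)   // table.Length == 1024 * 1024
{
    var texels = new Color[table.Length];
    for (int i = 0; i < table.Length; i++)
    {
        int v = table[i];
        texels[i] = new Color((v >> 16) & 0xFF, (v >> 8) & 0xFF, v & 0xFF);
    }

    var lut = new Texture2D(device, 1024, 1024, false, SurfaceFormat.Color);
    lut.SetData(texels);
    return lut;
}

// When drawing, bind the texture and force point (nearest-neighbor) sampling so
// texels are read back exactly rather than blended. In the shader, an index i
// becomes the texture coordinate ((i % 1024 + 0.5) / 1024, (i / 1024 + 0.5) / 1024).
//   effect.Parameters["LookupTable"].SetValue(lut);
//   graphicsDevice.SamplerStates[1] = SamplerState.PointClamp;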

Rendering to a single Bitmap object from multiple threads

What I'm doing is rendering a number of bitmaps onto a single bitmap. There could be hundreds of images, and the bitmap being rendered to could be over 1000x1000 pixels.
I'm hoping to speed this process up by using multiple threads, but since the Bitmap object is not thread-safe it can't be rendered to concurrently. What I'm thinking is to split the large bitmap into sections, one per CPU, render them separately, then join them back together at the end. I haven't done this yet in case you guys/girls have any better suggestions.
Any ideas? Thanks
You could use LockBits and work on individual sections of the image.
For an example of how this is done you can look at the Paint.Net source code, especially the BackgroundEffectsRenderer (yes that is a link to the mono branch, but the Paint.Net main code seems to be only available in zip files).
Lee, if you're going to use the GDI+ Image object, you may just end up doing all the work twice. The sections that you generate in multiple threads will need to be reassembled at the end of your divide-and-conquer approach, and wouldn't that defeat the purpose of dividing in the first place?
This is only worth doing if the work in each bitmap section is complex enough to cost much more processing time than simply redrawing the image subparts onto the large bitmap.
Hope that helps. What kind of image rendering are you planning out?
You could have each thread write to a byte array, then when they are all finished, use a single thread to create a bitmap object from the byte arrays. If all other processing has been done before hand, that should be pretty quick.
I've done something similar; in my case I had each thread lock x rows of bits in the image (x depended on the size of the image and the number of threads) and write to those bits such that no threads ever overlapped their writes.
One approach would be to render all the small bitmaps onto an ersatz bitmap, which would just be a two-dimensional int array (which is kind of all a Bitmap really is anyway). Once all the small bitmaps are combined in the big array, you do a one-time copy from the big array into a real Bitmap of the same dimensions.
I use this approach (not including the multi-threaded aspect) all the time for complex graphics on Windows Mobile devices, since the memory available for creating "real" GDI+ Bitmaps is severely limited.
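A rough sketch of that approach (the helper name is made up): the threads compose their results into a flat int[] of ARGB values, each staying in its own region, and one bulk copy at the end turns it into a real Bitmap.

// Runs once, after all threads have finished writing into the int[].
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

static Bitmap ToBitmap(int[] pixels, int width, int height)
{
    // For a 32bpp image the stride is width * 4, so a single Marshal.Copy
    // moves the whole array into the bitmap's buffer.
    var bmp = new Bitmap(width, height, PixelFormat.Format32bppArgb);
    BitmapData data = bmp.LockBits(new Rectangle(0, 0, width, height),
                                   ImageLockMode.WriteOnly,
                                   PixelFormat.Format32bppArgb);
    try
    {
        Marshal.Copy(pixels, 0, data.Scan0, pixels.Length);
    }
    finally
    {
        bmp.UnlockBits(data);
    }
    return bmp;
}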
You could also just use a Bitmap as you originally intended. Bitmap is not guaranteed to be thread-safe, but I'm not sure that would be a problem as long as you could ensure that no two threads ever write to the same portion of the bitmap. I'd give it a try, at least.
Update: I just re-read your question, and I realized that you're probably not going to see much (if any) improvement in the overall speed of these operations by making them multi-threaded. It's the classic nine-women-can't-make-a-baby-in-one-month problem.
