Fast copying of GUI in C#?

Right now I'm copying window graphics from one window to another via BitBlt from the WinAPI. I wonder if there is any other fast, or faster, way to do the same in C#.
Performance is the keyword here. If I should stay with the WinAPI I would hold an HDC in memory for quick drawing, and if the .NET Framework has other possibilities I would probably hold Graphics objects. Right now it's somewhat too slow when I have to copy ~1920x1080 windows.
So how can I boost the performance of GUI copying in C#?
I just want to know if I can do better than this. Explicit hardware acceleration (OpenGL, DirectX) is out of scope here; I've decided to stay with pure .NET + WinAPI.
// example to copy desktop to window
Graphics g = Graphics.FromHwnd(Handle);
IntPtr dc = g.GetHdc();
IntPtr dc0 = Windows.GetWindowDC(Windows.GetDesktopWindow());
Windows.BitBlt(dc, 0, 0, Width, Height, dc0, 0, 0, Windows.SRCCOPY);
// clean up of DCs and Graphics left out
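For reference, a self-contained version of that snippet might look like the sketch below. The question's `Windows` wrapper class isn't shown, so the usual P/Invoke declarations are assumed here, and the DC cleanup the comment mentions is restored:

```csharp
using System;
using System.Runtime.InteropServices;

static class NativeMethods
{
    public const int SRCCOPY = 0x00CC0020; // raster op: straight source copy

    [DllImport("user32.dll")]
    public static extern IntPtr GetDesktopWindow();

    [DllImport("user32.dll")]
    public static extern IntPtr GetWindowDC(IntPtr hWnd);

    [DllImport("user32.dll")]
    public static extern int ReleaseDC(IntPtr hWnd, IntPtr hDC);

    [DllImport("gdi32.dll")]
    public static extern bool BitBlt(IntPtr hdcDest, int xDest, int yDest,
        int width, int height, IntPtr hdcSrc, int xSrc, int ySrc, int rop);

    // Copy a region of the desktop into the window identified by hwnd,
    // releasing both DCs even if the blit fails.
    public static void CopyDesktopTo(IntPtr hwnd, int width, int height)
    {
        using (var g = System.Drawing.Graphics.FromHwnd(hwnd))
        {
            IntPtr dest = g.GetHdc();
            IntPtr desktop = GetDesktopWindow();
            IntPtr src = GetWindowDC(desktop);
            try
            {
                BitBlt(dest, 0, 0, width, height, src, 0, 0, SRCCOPY);
            }
            finally
            {
                ReleaseDC(desktop, src);
                g.ReleaseHdc(dest);
            }
        }
    }
}
```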
Hans's questions:
How slow is it?
Too slow. Feels (very) stiff.
How fast does it need to be?
The software mustn't feel slow.
Why is it important?
User friendliness of software.
What does the hardware look like?
Any random PC.
Why can't you solve it with better hardware?
It's just software that runs on Windows machines. You won't go and buy a new PC for some random piece of software that runs slow on your old one, will you?

Get a better video card!
The whole point of GDI, whether you access it from native code or via .NET, is that it abstracts the details of the graphics subsystem. The downside is that low-level operations such as blitting are in the hands of the graphics driver writers. You can safely assume that these are as optimised as it's possible to get (after all, a video card manufacturer wants to make their card look the best).
The overhead of using the .NET wrappers instead of the native calls directly pales into insignificance compared to the time spent doing the operation itself, so you're not going to gain much there.
Your code is doing a non-scaling, non-blending copy, which is probably the fastest way to copy images.
Of course, you should profile the code to see what effect any changes you make are actually having.
The question then is why are you copying such large images from one window to another? Do you control the contents of both windows?
Update
If you control both windows, why not draw to a single surface and then blit that to both windows? This should replace the often costly operation of reading data from the video card (i.e. blitting from one window to another) with two write operations. It's just a thought and might not work but you'd need timing data to see if it makes any difference.
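To make that suggestion concrete, a minimal sketch (hypothetical class and method names) of the single-surface approach: all drawing goes into one off-screen buffer, which each window then presents, so the costly window-to-window read-back is replaced by two plain writes:

```csharp
using System;
using System.Drawing;

// Hypothetical sketch: draw the scene once into a shared off-screen buffer,
// then push that buffer to each window.
class SharedSurface : IDisposable
{
    private readonly Bitmap _buffer;

    public SharedSurface(int width, int height)
    {
        _buffer = new Bitmap(width, height);
    }

    // All drawing goes to the buffer, never directly to a window.
    public void Render(Action<Graphics> draw)
    {
        using (var g = Graphics.FromImage(_buffer))
            draw(g);
    }

    // Called once per window, e.g. from each window's Paint handler.
    public void PresentTo(Graphics windowGraphics)
    {
        windowGraphics.DrawImageUnscaled(_buffer, 0, 0);
    }

    public void Dispose() => _buffer.Dispose();
}
```

Whether this actually wins depends on how often the scene changes versus how often it is presented, which is exactly what the timing data would tell you.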

Related

C++ AMP calculations and WPF rendering graphics card dual use performance

Situation:
In an application that has both the need for calculation as well as rendering images (image preprocessing and then displaying) I want to use both AMP and WPF (with AMP doing some filters on the images and WPF not doing much more than displaying scaled/rotated images and some simple overlays, both running at roughly 30fps, new images will continuously stream in).
Question:
Is there any way to find out how the 2 will influence each other?
I am wondering whether the hopefully nice speed-up I will see in an isolated AMP-only environment will carry over to the actual application later on as well.
Additional Info:
I will be able to, and am going to, measure the AMP performance separately, since it is low-level, new functionality that I am going to set up in a separate project anyway. The WPF rendering part already exists in a complex application, though, so it would be difficult to isolate.
I am not planning on doing the filters etc. for rendering only, since the results will be needed at intermediate stages as well (other algorithms, e.g. edge detection, saving, ...).
There are a couple of things you should consider here;
Is there any way to find out how the 2 will influence each other?
Directly no, but indirectly yes. Both WPF and AMP make use of the GPU for rendering. If the AMP portion of your application uses too much of the GPU's resources it will interfere with your frame rate. The Cartoonizer case study from the C++ AMP book uses MFC and C++ AMP to do exactly what you describe. On slower hardware with high image-processing loads you can see the application's responsiveness suffer. However, in almost all cases cartoonizing images on the GPU is much faster and can achieve video frame rates.
I am wondering whether I will see the hopefully nice speed-up
With any GPU application the key to seeing performance improvements is that the speedup from running compute on the GPU, rather than the CPU, must make up for the additional overhead of copying data to and from the GPU.
In this case there is additional overhead, as you must also marshal data from the native (C++ AMP) to the managed (WPF) environment. You need to take care to do this efficiently by ensuring that your data types are blittable and do not require explicit marshaling. I implemented an N-body modeling application that used WPF and native code.
Ideally you would want to render the results of the GPU calculation without moving them through the CPU. This is possible, but not if you explicitly involve WPF. The N-body example achieves this by embedding a DirectX render area directly into the WPF UI and then rendering data directly from the AMP arrays. This was largely because the WPF Viewport3D really didn't meet my performance needs. For rendering images, WPF may be fine.
Unless things have changed with VS 2013 you definitely want your C++ AMP code in a separate DLL as there are some limitations when using native code in a C++/CLI project.
As @stijn suggests, I would build a small prototype to make sure that the gains you get by moving some of the compute to the GPU are not lost to the overhead of moving data to and from the GPU, and also into WPF.
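On the marshaling point above, a minimal C# sketch of keeping the shared data blittable and pinning it for native code (the `Body` layout and helper names are hypothetical, not from the N-body sample):

```csharp
using System;
using System.Runtime.InteropServices;

// A blittable struct: only primitive value types, sequential layout, so the
// marshaler can hand native code a pinned pointer instead of copying
// field by field.
[StructLayout(LayoutKind.Sequential)]
struct Body
{
    public float X, Y, Z;
    public float Vx, Vy, Vz;
}

static class Interop
{
    // Pin an array of blittable structs so native (e.g. C++ AMP) code can
    // read it in place; pinning a non-blittable type would throw here.
    public static IntPtr PinForNative(Body[] bodies, out GCHandle handle)
    {
        handle = GCHandle.Alloc(bodies, GCHandleType.Pinned);
        return handle.AddrOfPinnedObject();
    }
}
```

The caller must `Free()` the handle once the native side is done, or the array can never be collected or moved.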

C# - Multithreaded Processing of a single Image (Webcam Frames)

I have made a program which can capture webcam frames and display them after running different per-pixel algorithms - for example making the image grayscale.
At the moment I am using the FastBitmap class (can't find the link atm), which uses pointers to set and get a pixel within a bitmap.
However, I wanted to make my program multithreaded so that multiple threads work on the same image. For that I split the image into several sections via its BitmapData (one section per thread) and let the different threads work on their given BitmapData section. At the end a "manager" waits until all threads are done (join) and hands back the resulting image.
That's the theory, but in practice this isn't working for me.
When I run this program I get strange errors: that I have to release the HDC before reusing it, that I am accessing illegal memory, external exceptions, etc. Every time a different one, and I can't understand why, but I think the BitmapData sections are the main problem. I don't want to fall back to the slower Marshal copy!
So my questions are the following:
Is it possible to do sectioned, multithreaded image processing in C# with unsafe pointer methods?
If yes - how?
As for image processing libraries:
I don't need filters or default image processing algorithms; I need my own per-pixel algorithms - I even thought about adding a pixel shader to my program. xD
As my program is based around converting webcam frames, I need the fastest algorithm possible.
I've read all the forum posts, tutorials, etc. that I could find and still have no idea how to do this correctly with unsafe code, so I finally made this account to ask this question here.
Robbepop
Of course it is possible :)
Take a look at:
https://github.com/dajuric/accord-net-extensions
This library contains exactly what you want:
a parallel processor which is used to execute parallel operations on an image (e.g. color conversion), and yes, those functions which operate on an image are unsafe (they use pointers).
NuGet packages are ready.
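For illustration, here is a sketch of the single-lock pattern that avoids the errors described above (assuming 32bpp ARGB frames; the class and method names are made up): lock the bitmap once and hand each thread its own rows via pointer arithmetic, rather than creating one BitmapData per thread.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Threading.Tasks;

static class GrayscaleFilter
{
    public static void ApplyGrayscale(Bitmap bmp)
    {
        var rect = new Rectangle(0, 0, bmp.Width, bmp.Height);
        // One LockBits for the whole image; threads share it read/write.
        BitmapData data = bmp.LockBits(rect, ImageLockMode.ReadWrite,
                                       PixelFormat.Format32bppArgb);
        try
        {
            IntPtr scan0 = data.Scan0; // IntPtr captures fine in a lambda
            int stride = data.Stride;
            int width = bmp.Width;

            // Each iteration owns one full row, so threads never overlap.
            Parallel.For(0, bmp.Height, y =>
            {
                unsafe
                {
                    byte* p = (byte*)scan0 + y * stride;
                    for (int x = 0; x < width; x++, p += 4)
                    {
                        // Memory order is BGRA for Format32bppArgb.
                        byte gray = (byte)((p[2] * 299 + p[1] * 587 + p[0] * 114) / 1000);
                        p[0] = p[1] = p[2] = gray; // alpha (p[3]) untouched
                    }
                }
            });
        }
        finally
        {
            bmp.UnlockBits(data);
        }
    }
}
```

Partitioning by row (or by row range) is the key point: each thread gets a disjoint slice of one locked buffer, so there is nothing to release or re-lock per thread.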

GPU access on Windows Mobile

I am building an app for Windows Mobile 6.5 and I was wondering if there is any way to hardware accelerate various calculations. I would like to have the GPU do some of the work for the app, instead of relying on the CPU to do everything.
I would like to use C#, but if that is not possible, then C++ is just fine.
Thanks for any guidance!
EDIT-
An example of the types of calculations I want to offload to the GPU would be things like calculating the locations of 25-100 different rectangles so they can be placed on the screen. This is just a simple example, but I've currently been doing these kinds of calculations on a separate thread, so I figured (since it's geometry calculations) it would be a prime candidate for the GPU.
To fully answer your question I would need more details about what calculations you are trying to perform, but the short answer is no: the GPUs in Windows Mobile devices, and the SDK Microsoft exposes, are not suitable for GPGPU (General-Purpose Computation on Graphics Hardware).
GPGPU really only became practical when GPUs started providing programmable vertex and pixel shaders with DirectX 9 (and limited support in 8). The GPUs used in Windows Mobile 6.5 devices are much closer to those of the DirectX 8 era and do not have programmable vertex and pixel shaders:
http://msdn.microsoft.com/en-us/library/aa920048.aspx
Even on modern desktop graphics cards with GPGPU libraries such as CUDA, getting performance increases when offloading calculations to the GPU is not a trivial task. The calculations must be inherently suited to GPUs (i.e. able to run massively in parallel, with enough computation performed on the data to offset the cost of transferring it to the GPU and back).
That does not mean it is impossible to speed up calculations with the GPU on Windows Mobile 6.5, however. There is a small set of problems that can be mapped to a fixed-function pipeline without shaders. If you can figure out how to solve your problem by rendering polygons and reading back the resulting image, then you can use the GPU to do it, but it is unlikely that the calculations you need to do would be suitable, or that it would be worth the effort of attempting.

Image resizing efficiency in C# and .NET 3.5

I have written a web service to resize user uploaded images and all works correctly from a functional point of view, but it causes CPU usage to spike every time it is used. It is running on Windows Server 2008 64 bit. I have tried compiling to 32 and 64 bit and get about the same results.
The heart of the service is this function:
private Image CreateReducedImage(Image imgOrig, Size newSize)
{
    var newBM = new Bitmap(newSize.Width, newSize.Height);
    using (var newGraphics = Graphics.FromImage(newBM))
    {
        newGraphics.CompositingQuality = CompositingQuality.HighSpeed;
        newGraphics.SmoothingMode = SmoothingMode.HighSpeed;
        newGraphics.InterpolationMode = InterpolationMode.HighQualityBicubic;
        newGraphics.DrawImage(imgOrig, new Rectangle(0, 0, newSize.Width, newSize.Height));
    }
    return newBM;
}
I put a profiler on the service and it seemed to indicate the vast majority of the time is spent in the GDI+ library itself and there is not much to be gained in my code.
Questions:
Am I doing something glaringly inefficient in my code here? It seems to conform to the example I have seen.
Are there gains to be had in using libraries other than GDI+? The benchmarks I have seen seem to indicate that GDI+ compares well with other libraries, but I didn't find enough of them to be confident.
Are there gains to be had by using "unsafe code" blocks?
Please let me know if I have not included enough of the code...I am happy to put as much up as requested but don't want to be obnoxious in the post.
Image processing is usually an expensive operation. You have to remember that a 32-bit color image is expanded in memory to 4 bytes * pixel width * pixel height before your app even starts any kind of processing. A spike is definitely to be expected, especially when doing any kind of per-pixel processing.
That being said, the only place I can see to speed up the process, or lower the impact on your processor, is to try a lower-quality interpolation mode.
You could try
newGraphics.InterpolationMode = InterpolationMode.Low;
as HighQualityBicubic will be the most processor-intensive of the resampling operations, but of course you will then lose image quality.
Apart from that, I can't really see anything that can be done to speed up your code. GDI+ will almost certainly be the fastest option on a Windows machine (no code written in C# is going to surpass a pure C library), and using other image libraries carries the potential risk of unsafe and/or buggy code.
The bottom line is that resizing an image is an expensive operation no matter what you do. The simplest solution in your case might simply be to replace your server's CPU with a faster model.
I know that the DirectX being released with Windows 7 is said to provide 2D hardware acceleration. Whether this implies it will beat out GDI+ on this kind of operation, I don't know. MS has a pretty unflattering description of GDI here which implies it is slower than it should be, among other things.
If you really want to try to do this kind of stuff yourself, there is a great GDI Tutorial that shows it. The author makes use of both SetPixel and "unsafe blocks," in different parts of his tutorials.
As an aside, multi-threading will probably help you here, assuming your server has more than one CPU. That is, you can process more than one image at once and probably get faster results.
When you write
I have written a web service to resize user uploaded images
it sounds to me like the user uploads an image to a (web?) server, and the server then calls a web service to do the scaling?
If that is the case, I would simply move the scaling directly to the server. Imho, scaling an image doesn't justify its own web service, and you get quite a bit of unnecessary traffic going from the server to the web service and back, in particular because the image is probably base64 encoded, which makes the data traffic even bigger.
But I'm just guessing here.
p.s. Unsafe blocks by themselves don't give any gain; they just allow unsafe code to be compiled. So unless you write your own scaling routine, an unsafe block isn't going to help.
You may want to try ImageMagick. It's free, and there are also .NET wrappers available for it.
Or you can send a command to a DOS Shell.
We have used ImageMagick on Windows Servers now and then, for batch processing and sometimes for a more flexible image conversion.
Of course, there are commercial components as well, like those by Leadtools and Atalasoft. We have never tried those.
I suspect the spike is because you have the interpolation mode cranked right up. All interpolation modes work per pixel, and HighQualityBicubic is about as high as you can go with GDI+, so I suspect the per-pixel calculations are chewing up your CPU.
As a test, try dropping the interpolation mode down to InterpolationMode.NearestNeighbor and see if the CPU spike drops; if so, that's your culprit.
If it is, do some trial and error to weigh cost against quality; chances are you won't need HighQualityBicubic to get decent results.
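A quick way to run that trial and error is to time one resize pass per interpolation mode. A minimal sketch (the helper name is made up):

```csharp
using System;
using System.Diagnostics;
using System.Drawing;
using System.Drawing.Drawing2D;

static class ResizeBenchmark
{
    // Time one resize pass under a given interpolation mode, so the
    // cost/quality trade-off can be measured instead of guessed.
    public static long TimeResize(Image source, Size newSize, InterpolationMode mode)
    {
        var sw = Stopwatch.StartNew();
        using (var target = new Bitmap(newSize.Width, newSize.Height))
        using (var g = Graphics.FromImage(target))
        {
            g.InterpolationMode = mode;
            g.DrawImage(source, new Rectangle(0, 0, newSize.Width, newSize.Height));
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}
```

Running this once per mode over a representative source image gives the numbers needed to decide how far down the quality ladder is acceptable.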

Unsafe C# and pointers for 2D rendering, good or bad?

I am writing a C# control that wraps DirectX 9 and provides a simplified interface to perform 2D pixel level drawing. .NET requires that I wrap this code in an unsafe code block and compile with the allow unsafe code option.
I'm locking the entire surface which then returns a pointer to the locked area of memory. I can then write pixel data directly using "simple" pointer arithmetic. I have performance tested this and found a substantial speed improvement over other "safe" methods I know of.
Is this the fastest way to manipulate individual pixels in a C# .NET application? Is there a better, safer way? If there was an equally fast approach that does not require pointer manipulation it would be my preference to use that.
(I know this is 2008 and we should all be using Direct3D, OpenGL, etc.; however, this control is to be used exclusively for 2D pixel rendering and simply does not require 3D rendering.)
Using unsafe pointers is the fastest way to do direct memory manipulation in C# (definitely faster than using the Marshal wrapper functions).
Just out of curiosity, what sort of 2D drawing operations are you trying to perform?
I ask because locking a DirectX surface to do pixel level manipulations will defeat most of the hardware acceleration benefits that you would hope to gain from using DirectX. Also, the DirectX device will fail to initialize when used over terminal services (remote desktop), so the control will be unusable in that scenario (this may not matter to you).
DirectX will be a big win when drawing large triangles and transforming images (texture mapped onto a quad), but it won't really perform that great with single pixel manipulation.
Staying in .NET land, one alternative is to keep around a Bitmap object to act as your surface, using LockBits and directly accessing the pixels through the unsafe pointer in the returned BitmapData object.
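A minimal sketch of that LockBits pattern (the helper name is made up, and 32bpp ARGB is assumed):

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;

static class Surface
{
    // Write one ARGB pixel directly into a locked bitmap's memory.
    public static void SetPixelFast(Bitmap bmp, int x, int y, Color color)
    {
        var rect = new Rectangle(0, 0, bmp.Width, bmp.Height);
        BitmapData data = bmp.LockBits(rect, ImageLockMode.ReadWrite,
                                       PixelFormat.Format32bppArgb);
        try
        {
            unsafe
            {
                // Scan0 points at row 0; Stride is the byte width of one
                // row (may include padding); 4 bytes per 32bpp pixel.
                byte* p = (byte*)data.Scan0 + y * data.Stride + x * 4;
                p[0] = color.B; // memory order is BGRA
                p[1] = color.G;
                p[2] = color.R;
                p[3] = color.A;
            }
        }
        finally
        {
            bmp.UnlockBits(data);
        }
    }
}
```

In a real renderer you would lock once per frame and loop over all pixels inside the lock; locking per pixel as above is only for illustrating the pointer arithmetic.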
Yes, that is probably the fastest way.
A few years ago I had to compare two 1024x1024 images at the pixel level; the get-pixel methods took 2 minutes, and the unsafe scan took 0.01 seconds.
I have also used unsafe to speed up things of that nature. The performance improvements are dramatic, to say the least. The point here is that unsafe turns off a bunch of safety checks that you might not need, as long as you know what you're doing.
Also, check out DirectDraw. It is the 2D graphics component of DirectX. It is really fast.
I recently was tasked with creating a simple histogram control for one of our thin client apps (C#). The images that I was analyzing were about 1200x1200 and I had to go the same route. I could make the thing draw itself once with no problem, but the control needed to be re-sizable. I tried to avoid it, but I had to get at the raw memory itself.
I'm not saying it is impossible using the standard .NET classes, but I couldn't get it to work in the end.
