SwapChainPanel performance issues - C#

I'm using a SwapChainPanel to render a control. The render method attaches to the CompositionTarget.Rendering event.
Also, RenderTarget.CreateCompatibleTarget is called to create an offscreen target. The compatibleTarget.Bitmap property is read to obtain a cached bitmap that can be blitted onscreen.
During each frame:
1. BeginDrawing() is called on the onscreen target.
2. If the scene has been invalidated by program logic, it is redrawn to the offscreen target.
3. The onscreen target is cleared using the background color. Without this, successive frames are somehow blended into each other.
4. The offscreen bitmap (cached above) is drawn onto the onscreen target using onscreenTarget.DrawBitmap(cachedBitmap), with opacity set to 1.
5. onScreenTarget.Flush() is called to flush the contents.
6. EndDrawing() is called on the onscreen target.
I find that this gives a very low frame rate.
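For reference, here is a rough sketch of that per-frame loop in SharpDX Direct2D terms. The field names (onscreenTarget, offscreenTarget, cachedBitmap, backgroundColor, sceneInvalidated, swapChain) and the DrawScene helper are illustrative, not the exact code from the question:
void OnRendering(object sender, object e)
{
    onscreenTarget.BeginDraw();                                   // step 1

    if (sceneInvalidated)                                         // step 2: redraw offscreen only when needed
    {
        offscreenTarget.BeginDraw();
        DrawScene(offscreenTarget);                               // hypothetical scene-drawing helper
        offscreenTarget.EndDraw();
        sceneInvalidated = false;
    }

    onscreenTarget.Clear(backgroundColor);                        // step 3
    onscreenTarget.DrawBitmap(cachedBitmap, 1.0f,                 // step 4: opacity = 1
        SharpDX.Direct2D1.BitmapInterpolationMode.Linear);
    onscreenTarget.Flush();                                       // step 5
    onscreenTarget.EndDraw();                                     // step 6

    swapChain.Present(1, SharpDX.DXGI.PresentFlags.None);         // present the frame to the panel
}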
Comparison with WindowRenderTarget
For comparison, I tested the exact same scene code in a WinForms app using a WindowRenderTarget. (SharpDX makes this possible since it works on UWP as well as desktop.) This gives a much higher frame rate, and zero steady-state CPU consumption.
Questions:
Why does SwapChainPanel produce such a low frame rate compared to WindowRenderTarget?
Why is it necessary to clear the onscreen target each frame before drawing the bitmap in step 4 even when the opacity is 1?
Can I avoid steps 1-6 if nothing has changed? This consumes around 7% CPU.

I don't think there should be much performance difference between SwapChainPanel and WindowRenderTarget, because both are built almost directly on DirectX components. You could compare their settings to investigate the difference: Is the SwapChainPanel associated with a D2D device? Are there any differences between the device context configuration, the swap chain description, and the WindowRenderTarget's RenderTargetProperties? Are there any differences between the bitmap processing methods?
The opacity is set for the image you are drawing. However, the Clear() is for the overall rendering view.
When you say "if nothing changed", I suppose you mean you have nothing to draw. Then of course you can do nothing. Otherwise, steps 1 to 6 are necessary.
Besides, you can find a well-documented SwapChainPanel example here. Some settings in that example are aimed at improving performance.
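One common way to get the steady-state cost down to zero when nothing has changed is to stop driving the loop at all: unhook CompositionTarget.Rendering while the scene is clean and hook it again on invalidation. A minimal sketch (the Invalidate and DrawFrame methods and the flags are illustrative, not part of the question's code):
bool isRenderingHooked;

void Invalidate()                       // call whenever program logic changes the scene
{
    sceneInvalidated = true;
    if (!isRenderingHooked)
    {
        CompositionTarget.Rendering += OnRendering;
        isRenderingHooked = true;
    }
}

void OnRendering(object sender, object e)
{
    DrawFrame();                        // the per-frame steps 1-6 from the question

    // Once the latest scene has been presented, detach until the next invalidation
    // so no CPU is spent while the picture is static.
    CompositionTarget.Rendering -= OnRendering;
    isRenderingHooked = false;
}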

Related

Copy OpenGL back buffer directly onto GDI DC pixel data

I'm writing a GUI which uses OpenGL via OpenTK and the GLControl in C#, and I'm trying to use dirty rectangles so that only the controls that need redrawing are drawn. Obviously it's not wise to redraw an entire maximized form just to refresh a mouse-hover button.
My first attempt was to use glScissor, but this doesn't limit SwapBuffers, which on my platform, I suspect (because performance depends almost entirely on the window size), doesn't 'swap' but does a full copy of the back buffer onto the front buffer.
The second attempt was glAddSwapHintRectWIN, which in theory should limit the swapped (in this case copied) area of SwapBuffers, but it is only a hint and it doesn't do anything at all.
The third attempt was using glDrawBuffer to copy a part of the back buffer onto the front buffer; for some unknown reason, even when I copy only a part of the buffer, performance still decreases the same way as before when the window size increases.
It seems that a full-area refresh is still happening no matter what I do.
So I'm trying to use glReadPixels() and somehow get a pointer to draw directly onto the hDC pixel data obtained from the control's CreateGraphics(). Is this possible?
EDIT:
I think something is wrong with the GLControl. Why does the performance of this code depend on the window size? I'm not doing any SwapBuffers or clearing, just drawing a constant-size triangle on the front buffer. A driver problem, maybe?
// Draw straight to the front buffer; no SwapBuffers, no Clear.
GL.DrawBuffer(DrawBufferMode.Front);

// R is a Random instance; pick a random red intensity each time.
Vector4 color = new Vector4((float)R.NextDouble(), 0, 0, 0.3F);

GL.Begin(BeginMode.Triangles);
GL.Color4(color.X, color.Y, color.Z, color.W);
GL.Vertex3(50F, 50F, 0F);
GL.Vertex3(150F, 50F, 0F);
GL.Vertex3(50F, 150F, 0F);
GL.End();
GL.Finish();   // block until the triangle has actually been drawn
EDIT 2
These solutions are not viable:
Drawing onto a texture and using glGetTexImage to draw onto a GDI bitmap and then drawing that bitmap onto the window hDC
Reading buffer pixels using glReadPixels onto a GDI bitmap and then drawing that bitmap onto the window hDC.
Splitting the window into a grid of viewports and updating only the cells that contain the dirty rectangle
First of all, what platform (GPU and OS) are you using? What kind of performance are we talking about?
Keep in mind that there are several limitations when trying to combine GDI and OpenGL on the same hDC. Indeed, in most cases this will turn off hardware acceleration and give you OpenGL 1.1 through Microsoft's software renderer.
Hardware accelerated OpenGL is optimized for redrawing the entire window every frame. SwapBuffers() invalidates the contents of the backbuffer, which makes dirty rectangles impossible to implement when double buffering on the default framebuffer.
There are two solutions:
do not call SwapBuffers(). Set GL.DrawBuffer(DrawBufferMode.Front) and use single-buffering to update the rectangles that are dirty. This has severe drawbacks, including turning off desktop composition on Windows.
do not render directly to the default framebuffer. Instead, allocate and render into a framebuffer object. This way, you can update only the regions of the FBO that have been modified. (You will still need to copy the FBO to screen every frame, so it may or may not be a performance win depending on your GUI complexity.)
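A rough OpenTK sketch of that second approach, assuming an OpenGL 3.0+ context where framebuffer objects and glBlitFramebuffer are available (the sizes and names here are placeholders):
// One-time setup: an FBO with a texture colour attachment the size of the control.
int fbo = GL.GenFramebuffer();
int tex = GL.GenTexture();
GL.BindTexture(TextureTarget.Texture2D, tex);
GL.TexImage2D(TextureTarget.Texture2D, 0, PixelInternalFormat.Rgba8, width, height, 0,
              PixelFormat.Bgra, PixelType.UnsignedByte, IntPtr.Zero);
GL.BindFramebuffer(FramebufferTarget.Framebuffer, fbo);
GL.FramebufferTexture2D(FramebufferTarget.Framebuffer, FramebufferAttachment.ColorAttachment0,
                        TextureTarget.Texture2D, tex, 0);

// When a control is dirtied, bind the FBO, restrict the viewport/scissor to the dirty
// rectangle and redraw just that control into the texture. Then, each frame:
GL.BindFramebuffer(FramebufferTarget.ReadFramebuffer, fbo);
GL.BindFramebuffer(FramebufferTarget.DrawFramebuffer, 0);         // default framebuffer
GL.BlitFramebuffer(0, 0, width, height, 0, 0, width, height,
                   ClearBufferMask.ColorBufferBit, BlitFramebufferFilter.Nearest);
glControl.SwapBuffers();                                           // still a full-window present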
Edit:
40-60ms for a single triangle indicates that you are not getting any hardware acceleration. Check GL.GetString(StringName.Renderer) - does it give the name of your GPU or does it return "Microsoft GDI renderer"?
If it is the latter, then you must install OpenGL drivers from the website of your GPU vendor. Do that and the performance problem will disappear.
After several tests with OpenTK, it appears that in single- or double-buffered mode, the slowdown observed as the control size increases remains, even with a constant-size scissor enabled. Whether or not GL.Clear() is used doesn't affect the slowdown either.
(Note that only height changes have a significant impact.)
Testing with an ANSI C example, I had the same results.
Running the same couple of tests under Linux gave the same results too.
Under Linux I noticed that the frame rate changes when I move from one display to the other, even with vsync disabled.
The next step would be to check whether DirectX has the same behaviour. If yes, then the limitation lies on the bus between the display and the graphics card.
EDIT: conclusion:
This behaviour is giving you a false impression. Consider building your interface in an FBO with a dirty-rect mechanism, rendering it on a quad (made of triangles is better), and swapping as usual, without assuming you can speed up the swap for a given window size by clipping some operations.

Redraw unchanging background on every Draw?

This might be a very simple question, but I searched and found no other way to do it. It doesn't make sense to redraw the background on every Draw. Is there a way to draw some things and leave them on the screen?
I've tried to comment-out the
GraphicsDevice.Clear(Color.CornflowerBlue);
But that doesn't help. (What is its purpose?)
The dark purple colour you are seeing is used by XNA and DirectX to indicate an uninitialised buffer. XNA will also clear buffers to this colour to emulate the behaviour of the Xbox 360 or Windows Phone, so that if you build a game on Windows, it "just works" on those other platforms (or, rather, so it fails in the same way, so you can debug it).
XNA is double-buffered. You don't draw directly to the screen, but to a "backbuffer". The screen only displays the "front buffer". Every time GraphicsDevice.Present gets called (Game calls it for you in EndDraw), those two buffers get swapped, and what you were drawing gets displayed (and you get a fresh buffer to draw on).
I'm not sure why XNA marks the buffer as uninitialised when it gets swapped. I haven't come across this behaviour before - mostly because it's very unusual to want to swap buffers and preserve their contents.
Usually what you want to do is call Game.SuppressDraw when you know you're not going to modify the contents of the screen (saving both a call to Draw and a swap). See also answers here and here.
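For reference, a minimal sketch of that inside your Game subclass (the sceneChanged flag is made up for the example):
protected override void Update(GameTime gameTime)
{
    if (!sceneChanged)
        SuppressDraw();     // skip Draw and the buffer swap for this frame
    base.Update(gameTime);
}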
Keep in mind that clearing the screen with GraphicsDevice.Clear is extremely fast. And that XNA has no concept of "background" or "foreground" (you're always drawing on top of whatever is already in the buffer).
If you do have some expensive-to-render content that you want to re-use between frames, generally you would draw it into to a render target once, and then draw that to the screen each frame. But, as always, avoid premature optimisation! Graphics cards are designed specifically to redraw scenes every frame - they're pretty damn fast!
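As a rough illustration of that render-target approach (XNA 4.0 style; DrawExpensiveContent and the sizes are made up for the sketch):
// Created once, e.g. in LoadContent:
RenderTarget2D cachedScene = new RenderTarget2D(GraphicsDevice, 800, 600);

// Redraw the expensive content only when it actually changes:
GraphicsDevice.SetRenderTarget(cachedScene);
GraphicsDevice.Clear(Color.CornflowerBlue);
DrawExpensiveContent();                        // hypothetical helper
GraphicsDevice.SetRenderTarget(null);          // back to the backbuffer

// Then, every frame in Draw, just blit the cached result:
spriteBatch.Begin();
spriteBatch.Draw(cachedScene, Vector2.Zero, Color.White);
spriteBatch.End();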
See this; if you want to just prevent it from clearing the image, you can do:
GraphicsDevice.GetType().GetField("lazyClearFlags", BindingFlags.NonPublic | BindingFlags.Instance).SetValue(GraphicsDevice, ClearOptions.DepthBuffer);

Quickly scale/crop a bitmap image stream for display in WPF

Question:
What is a fast way to scale and/or crop a bitmap provided from a WriteableBitmap for display in the UI?
Requirements:
Have Low CPU usage
Handle large images (5 megapixels, about 2500x2000 pixels)
Resize and/or crop to the same resolution/area as the UI element the bitmap is displayed in.
Use WPF
Specifically, it must allow a 14FPS 5 Megapixel camera image stream to be displayed in a WPF UI element at full speed.
Update:
I have been able to speed up the drawing quite a bit by painting to a Canvas control using an ImageBrush as follows, where m_bitmap is my WriteableBitmap:
ImageBrush brush = new ImageBrush();
brush.ImageSource = m_bitmap;
brush.Stretch = Stretch.Uniform;
canvas.Background = brush;
I'm now able to get the full 14FPS, though it is still using about 20% CPU, so I'm not sure how well it will perform if I add another camera or two (the plan is to have 4 or so running).
Another thing I think might be slowing down the drawing is that the images are in a mono (Gray8) format, not the standard RGB32 (or is it Bgra32 for WPF?) format. If I understand correctly, the image has to be converted to the standard format to be displayed, which would add significant overhead to each frame's drawing time.
Some background:
I'm currently working with a 5 Megapixel, 14 FPS, video camera and am trying to get the frames to render to the screen at full speed. I would like to do this using WPF.
I currently have an example in WinForms that runs full speed for an unscaled image, but (as I would expect) it has major trouble if I set the pictureBox.SizeMode = Zoom;. The example reads raw data directly from the camera stream to a buffer and then copies the data from the buffer into the bitmap set to the PictureBox control. The copy algorithm uses LockBits to speed things up.
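For context, that LockBits copy looks roughly like this (a sketch only; cameraBuffer, bitmap and pictureBox are placeholders, and it assumes the buffer stride matches the bitmap stride):
// Copy the raw camera buffer into the Bitmap shown by the PictureBox.
BitmapData data = bitmap.LockBits(
    new Rectangle(0, 0, bitmap.Width, bitmap.Height),
    ImageLockMode.WriteOnly, bitmap.PixelFormat);
Marshal.Copy(cameraBuffer, 0, data.Scan0, cameraBuffer.Length);
bitmap.UnlockBits(data);
pictureBox.Invalidate();      // repaint with the new frame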
I converted that example into WPF, rewriting the parts using Bitmap objects to use WriteableBitmap objects instead, and an Image control instead of a PictureBox. Unfortunately this is not able to render the stream to the screen at any decent rate, scaled or unscaled. Both have significant CPU load and very slow updates.
When rendering to the screen is turned off, the performance is great. It is able to process the image stream at full speed and resolution while using around 3% CPU and less than 100MB of memory.
Note: when I say rendering to the screen is turned off, the WriteableBitmap is still being continuously updated; it is just not set on the Image control.
I've seen a lot of discussion about getting fast bitmap updating in WPF, but have been unsuccessful in getting it to work at a reasonable speed/CPU load. Also, I would like to have the image scaled in such a way that I can see the whole image.
I imagine the key will lie in some sort of scaling/crop combination that needs to be done so that WPF will not try to render(cache?) all 5 million pixels, but only those on the screen, and only at the current screen resolution. I imagine/hope this can be done fairly easily and without too much hit to memory or CPU, but currently have no idea how to do so. I have found the DecodePixelWidth and DecodePixelHeight properties, but those are only applicable when loading an image from a file to a BitmapImage.
Did you have a look at the following post?
Resizing WritableBitmap
If it does not solve your problem, I have more questions for you:
What is the resolution of your image?
Is the size of your UI element constant? What's its size?
Edit:
After your edit, I noticed that you want to display the image in the Gray8 PixelFormat, so why don't you try specifying that format when creating your WriteableBitmap (m_bitmap)? The Format property itself is read-only, so the format has to be passed to the constructor:
m_bitmap = new WriteableBitmap(width, height, 96, 96, PixelFormats.Gray8, null); // could not test
I am certain that taking your 8 bits per pixel and quadrupling the bits needed per pixel while not gaining any quality is slowing down your application, especially because you then run operations on 32-bits-per-pixel images when you could be running them on 8-bits-per-pixel images.
While its interface is a bit old-fashioned, I believe that convert (see http://en.wikipedia.org/wiki/ImageMagick) is very often used (and may in fact be the industry standard).
Edit: StackOverflow has about 2,300 questions tagged with imagemagick here. See for example What is the difference for sample/resample/scale/resize/adaptive-resize/thumbnail operators in ImageMagick convert?
The OP for https://apple.stackexchange.com/a/41531 decided to go with ImageMagick. And the accepted answer to Efficient JPEG Image Resizing in PHP also suggests ImageMagick, with 19 votes.
However, I don't know whether ImageMagick is capable of meeting your requirements of 14FPS, 5 Megapixels.
The only answer to Recommendation for real time image processing tools on Linux suggests a fork, GraphicsMagick, which seems to be available for Windows as well.

High CPU load when changing background image of Canvas containing overlay elements

I am working on an application that loads live video images from a camera and displays an overlay on top of said image. The overlay does not change often, so it can be considered still. However, it usually contains about 1,000 to 10,000 lines.
When the video image is updated, there is a notable impact on the CPU load depending on whether the overlay is visible or not. The overlay neither gets invalidated nor changed; just the image behind it is changing.
My setup is this:
<Canvas>
<Image/>
<Canvas>
<OverlayElement 1/>
<OverlayElement 2/>
<OverlayElement 3/>
<.../>
</Canvas>
</Canvas>
The Image's Source is a WriteableBitmap. Every time a new camera image (type byte[]) is available, the main Canvas' Dispatcher is invoked to write the image data by using WriteableBitmap.WritePixels().
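That update roughly looks like this (a sketch; frameData, width, height and stride are placeholders):
// Called from the camera thread whenever a new frame (byte[]) arrives.
void OnFrame(byte[] frameData)
{
    canvas.Dispatcher.Invoke(new Action(() =>
    {
        // Write the whole frame into the WriteableBitmap used as the Image's Source.
        writeableBitmap.WritePixels(
            new Int32Rect(0, 0, width, height),
            frameData, stride, 0);
    }));
}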
The inner Canvas contains all the overlay elements, namely:
- a contour (PolyLine),
- a circle (Path with EllipseGeometry), and
- a set of rays (Path with one Figure containing LineSegments).
The number n of points in the contour equals the number of line segments in the last-mentioned Path; n is usually around 1,000 - 3,000.
Depending on the count and length of the lines shown in the overlay, the CPU load for showing a live image varies (it increases as length or count go up) even if the overlay does not change. At some point this affects the frame rate and makes the program unusable. Line length is mostly correlated with line intersection, so maybe the Path is struggling to calculate its fill area even though it is not painted?
So how could I improve the performance here?
What bugs me most is that even if the overlay does not change, the render time increases with its primitive count. I would expect constant render time once the overlay has been drawn in its last set state. What could I do to achieve that, aside from rendering the whole overlay to a bitmap?
I am also open-minded about suggestions on how to get the byte[] onto the screen more efficiently. Just keep in mind that this problem is part of a bigger application and I cannot change all of its paradigms; I'm concentrating on how to get the image drawn.
What I have tried so far:
Override the OnRender() method of the inner Canvas, drawing the overlay myself. This works fine but has the performance issue that brings me here ;)
Use Shapes (PolyLine, Ellipse, Path) as the inner Canvas' children to hold the overlay elements. This works, too. It is faster to redraw the overlay when it changes but on the other hand worsens the performance issue when updating the background image.
Like 2, but using Freeze() on Geometries wherever possible (a small sketch of this is shown below). This has little or no performance impact.
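For reference, item 3 amounts to something like this (a sketch with made-up geometry, not the actual overlay code):
// Freeze the geometry so WPF can treat it as immutable (read-only, shareable).
var contourGeometry = new PathGeometry();
// ... build the contour's figures and segments here ...
if (contourGeometry.CanFreeze)
    contourGeometry.Freeze();
var contourPath = new Path { Data = contourGeometry, Stroke = Brushes.Red, StrokeThickness = 1 };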
Thanks for your help in advance.

Frames per second using WPF's DrawingContext class

I've got an example app which draws rectangles, lines etc. using the DrawingContext instance in the OnRender override of a Control class. This control is repainted every 10 milliseconds by calling InvalidateVisual (I can post the source to anybody who's interested). I calculate the frames per second (FPS) by measuring the time between each call of OnRender.
However, this figure for FPS is incorrect. Just by looking at the app I can see that the figure given for FPS is higher than the number of times per second that the app repaints itself. This is because "When you use a DrawingContext object's draw commands, you are actually storing a set of rendering instructions (although the exact storage mechanism depends on the type of object that supplies the DrawingContext) that will later be used by the graphics system; you are not drawing to the screen in real-time."
So what I would like to know is if there is any event I can subscribe to, or any other way, to ascertain how many times per second my WPF app/control is generating a new bitmap and drawing it to the screen? Is there any bitmap buffer held by the "graphics system" which we can access?
Many thanks!
What you need to use is the CompositionTarget.Rendering event that is called every frame.
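A minimal sketch of measuring it that way (WPF; the counter fields and writing the result to the window Title are illustrative):
// Count how many times the composition engine calls back per second.
int frameCount;
DateTime lastSample = DateTime.Now;

CompositionTarget.Rendering += (sender, e) =>
{
    // Note: Rendering can fire more than once per composed frame; comparing
    // ((RenderingEventArgs)e).RenderingTime against the previous value filters duplicates.
    frameCount++;
    TimeSpan elapsed = DateTime.Now - lastSample;
    if (elapsed.TotalSeconds >= 1)
    {
        Title = string.Format("FPS: {0:F1}", frameCount / elapsed.TotalSeconds);
        frameCount = 0;
        lastSample = DateTime.Now;
    }
};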
