I am creating a WPF mapping program which will potentially load and draw hundreds of files to the screen at any one time, and a user may want to zoom and pan this display. Some of these file types may contain thousands of points, which would most likely be connected as some kind of path. Other supported formats will include TIFF files.
Is it better for performance to have a single DrawingVisual to which all data is drawn, or should I be creating a new DrawingVisual for each file loaded?
If anyone can offer any advice on this it would be much appreciated.
You will find lots of related questions on Stack Overflow, however not all of them mention that one of the most high-performance ways to draw large amounts of data to the screen is to use the WriteableBitmap API. I suggest taking a look at the WriteableBitmapEx open source project on CodePlex. Disclosure: I have contributed to this project once, but it is not my library.
Having experimented with DrawingVisual, StreamGeometry, OnRender, and Canvas, all of these fall over once you have to draw 1,000 or more "objects" to the screen. There are techniques that deal with virtualizing a canvas (there's a "million items" demo with a virtualized canvas), but even that is limited to roughly 1,000 items visible at one time before it slows down. WriteableBitmap lets you access a bitmap directly and draw on it (old-school style), meaning you can draw tens of thousands of objects at speed. You are free to implement your own optimisations (multi-threading, level of detail), but note that you don't get many frills with that API. You are literally doing the work yourself.
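By way of illustration, here is a minimal sketch of the direct-to-bitmap approach using only the stock WriteableBitmap API (no WriteableBitmapEx extensions). The PixelLayer class and its Plot/Flush helpers are just illustrative names; it assumes a Bgra32 back buffer that you fill on the CPU and push once per frame.

```csharp
// Minimal sketch: drawing raw pixels into a WriteableBitmap (plain WPF API,
// no WriteableBitmapEx extensions). Names and sizes are illustrative.
using System.Windows;
using System.Windows.Media;
using System.Windows.Media.Imaging;

public class PixelLayer
{
    private readonly WriteableBitmap _bitmap;
    private readonly int _stride;
    private readonly byte[] _pixels;   // CPU-side back buffer we fill each frame

    public PixelLayer(int width, int height)
    {
        _bitmap = new WriteableBitmap(width, height, 96, 96, PixelFormats.Bgra32, null);
        _stride = width * 4;
        _pixels = new byte[_stride * height];
    }

    public ImageSource Source => _bitmap;

    // Write a single BGRA pixel into the CPU-side buffer.
    public void Plot(int x, int y, byte b, byte g, byte r, byte a = 255)
    {
        int i = y * _stride + x * 4;
        _pixels[i] = b; _pixels[i + 1] = g; _pixels[i + 2] = r; _pixels[i + 3] = a;
    }

    // Push the whole buffer to the bitmap once per frame instead of per point.
    public void Flush()
    {
        var rect = new Int32Rect(0, 0, _bitmap.PixelWidth, _bitmap.PixelHeight);
        _bitmap.WritePixels(rect, _pixels, _stride, 0);
    }
}
```

You would assign Source to an Image element once and call Flush after each batch of drawing; batching the upload like this avoids paying the bitmap update cost per point.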
There is one caveat though. While WPF uses the CPU for tessellation and the GPU for rendering, WriteableBitmap will use the CPU for everything. Therefore the fill rate (number of pixels rendered per frame) becomes the bottleneck, depending on your CPU power.
Failing that, if you really need high-performance rendering, I'd suggest taking a look at SharpDX (managed DirectX) and its interop with WPF. This will give you the highest performance as it uses the GPU directly.
Using many small DrawingVisuals with few details rendered per visual gave better performance in my experience than fewer DrawingVisuals with more details rendered per visual. I also found that deleting all of the visuals and rendering new visuals was faster than reusing existing visuals when a redraw was required. Breaking each map into a number of visuals may help performance.
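For what it's worth, the usual way to host one DrawingVisual per file is a small FrameworkElement subclass backed by a VisualCollection. The sketch below is illustrative (MapLayerHost, AddFile and Clear are made-up names), not a drop-in implementation.

```csharp
// Sketch of a host element that keeps one DrawingVisual per loaded file,
// so individual files can be added or removed without re-rendering everything.
using System.Windows;
using System.Windows.Media;

public class MapLayerHost : FrameworkElement
{
    private readonly VisualCollection _visuals;

    public MapLayerHost()
    {
        _visuals = new VisualCollection(this);
    }

    // WPF discovers the visual children through these two overrides.
    protected override int VisualChildrenCount => _visuals.Count;
    protected override Visual GetVisualChild(int index) => _visuals[index];

    public void AddFile(Point[] points)
    {
        var visual = new DrawingVisual();
        using (DrawingContext dc = visual.RenderOpen())
        {
            var pen = new Pen(Brushes.Black, 1);
            pen.Freeze();                       // frozen pens and brushes render faster
            for (int i = 1; i < points.Length; i++)
                dc.DrawLine(pen, points[i - 1], points[i]);
        }
        _visuals.Add(visual);
    }

    public void Clear() => _visuals.Clear();    // drop everything and re-add on redraw
}
```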
As with anything performance related, conducting timing tests with your own scenarios is the best way to be sure.
Related
Situation:
In an application that needs both computation and image rendering (image preprocessing and then display), I want to use both AMP and WPF: AMP doing some filters on the images, and WPF not doing much more than displaying scaled/rotated images and some simple overlays, both running at roughly 30 fps, with new images continuously streaming in.
Question:
Is there any way to find out how the two will influence each other?
I am wondering whether I will see the hopefully nice speed-up from an isolated AMP-only environment in the actual application later on as well.
Additional Info:
I can and will measure the AMP performance separately, since it is low-level, new functionality that I am going to set up in a separate project anyway. The WPF rendering part already exists in a complex application, though, so it would be difficult to isolate.
I am not planning on doing the filters etc. for rendering only, since the results will be needed at intermediate stages as well (other algorithms, e.g. edge detection, saving, ...).
There are a couple of things you should consider here:
Is there any way to find out how the two will influence each other?
Directly no, but indirectly yes. Both WPF and AMP make use of the GPU for rendering. If the AMP portion of your application uses too much of the GPU's resources, it will interfere with your frame rate. The Cartoonizer case study from the C++ AMP book uses MFC and C++ AMP to do exactly what you describe. On slower hardware with high image-processing loads you can see the application's responsiveness suffer. However, in almost all cases cartoonizing images on the GPU is much faster and can achieve video frame rates.
I am wondering whether I will see the hopefully nice speed-up
With any GPU application the key to seeing performance improvements is that the speedup from running compute on the GPU, rather than the CPU, must make up for the additional overhead of copying data to and from the GPU.
In this case there is additional overhead, as you must also marshal data from the native (C++ AMP) to the managed (WPF) environment. You need to take care to do this efficiently by ensuring that your data types are blittable and do not require explicit marshaling. I implemented an N-body modeling application that used WPF and native code.
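To illustrate the point about blittable types, here is a rough interop sketch. The DLL name, exported function and Particle layout are hypothetical placeholders, not part of the N-body sample: the idea is simply that an array of a sequential struct with only primitive fields can cross the managed/native boundary without per-element marshaling.

```csharp
// Hypothetical interop sketch: a blittable struct can be passed to a native
// C++ AMP DLL without per-element marshaling. "NBodyAmp.dll" and
// "UpdateParticles" are placeholder names, not a real API.
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct Particle          // only blittable fields: float, int, etc.
{
    public float X, Y, Z;
    public float Vx, Vy, Vz;
}

public static class AmpInterop
{
    [DllImport("NBodyAmp.dll", CallingConvention = CallingConvention.Cdecl)]
    private static extern void UpdateParticles(
        [In, Out] Particle[] particles, int count, float deltaTime);

    public static void Step(Particle[] particles, float dt)
    {
        // The array is pinned and passed as a raw pointer; no field-by-field copy.
        UpdateParticles(particles, particles.Length, dt);
    }
}
```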
Ideally you would want to render the results of the GPU calculation without moving them through the CPU. This is possible, but not if you explicitly involve WPF. The N-body example achieves this by embedding a DirectX render area directly into the WPF window and then rendering data directly from the AMP arrays. This was largely because WPF's Viewport3D really didn't meet my performance needs. For rendering images, WPF may be fine.
Unless things have changed with VS 2013 you definitely want your C++ AMP code in a separate DLL as there are some limitations when using native code in a C++/CLI project.
As @stijn suggests, I would build a small prototype to make sure that the gains you get by moving some of the compute to the GPU are not lost to the overhead of moving data to and from the GPU, and then into WPF.
Does rendering performance degrade seriously when a WPF application's XAML contains a lot of nested Grid, StackPanel, DockPanel and other containers?
Really the answer is simply "yes". More of anything will use more processor time. SURPRISE!
In the case of WPF, elements are arranged into a hierarchical scene graph. Adding levels of depth to this graph will slow your application more than adding siblings to existing elements. You should always strive to keep the depth of the graph low. Consider using a Grid instead of nesting StackPanels.
So why is depth more important than raw element count? Well, depth generally implies:
layout dependency - if a parent is resized, a child is likely to be re-rendered.
occlusion - if two elements overlap, invalidating one will often invalidate the other.
recursion - most graph operations are CPU-bound; they depend entirely on CPU speed and have no dedicated hardware support (the renderer uses your graphics chip where possible). Cycling through levels of the graph for resources and layout updates is expensive.
Concerning occlusion, the BitmapCache class can help greatly!
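As a rough illustration, applying a BitmapCache is essentially a one-liner on the element you want cached; the element name and wrapper class below are hypothetical.

```csharp
// Sketch: cache a complex, frequently-overlapped element as a bitmap so that
// invalidating a sibling does not force it to be re-rendered vector-by-vector.
using System.Windows.Controls;
using System.Windows.Media;

public static class CachingExample
{
    public static void EnableCache(Canvas complexOverlay)
    {
        complexOverlay.CacheMode = new BitmapCache
        {
            // Render the cache at 1:1; raise this if the element is usually scaled up.
            RenderAtScale = 1.0,
            SnapsToDevicePixels = false
        };
    }
}
```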
When you create a very complex UI, with lots of nested objects and DataTemplates with many elements, you can seriously impact the performance of the app, because the bigger the UI tree, the longer it takes to render; if the framework cannot render at 30 FPS you will start to see performance drops. You should use the most lightweight panels that meet your needs in order to avoid extra layout logic you don't need. Here are some performance tips to make your app faster:
http://msdn.microsoft.com/en-us/library/bb613542(v=vs.110).aspx
WPF uses the MeasureOverride and ArrangeOverride methods in order to lay out and render UIElements. MeasureOverride measures a UIElement's size based on the parent control's available width and height. ArrangeOverride then arranges the UIElements at runtime based on those measurements. These methods are optimized for performance and should not normally cause rendering performance issues.
But there is a limit to how many UIElements these methods can handle in a reasonable time. If that limit is exceeded, you will see performance issues.
e.g. suppose a bike can carry 2 people; what happens if it is overloaded with 5? :)
JetBrains dotTrace is a profiling tool that can help you analyze the performance issue and see how much time is spent in these two methods.
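To make the two layout passes concrete, here is a minimal, illustrative custom panel that simply stacks its children vertically; it only shows where MeasureOverride and ArrangeOverride fit in, and real panels do more work per child, which is why deep trees with many elements get expensive.

```csharp
// Minimal custom panel sketch showing WPF's two layout passes.
using System.Windows;
using System.Windows.Controls;

public class SimpleStackPanel : Panel
{
    protected override Size MeasureOverride(Size availableSize)
    {
        double width = 0, height = 0;
        foreach (UIElement child in InternalChildren)
        {
            // Ask each child how big it wants to be, given unlimited height.
            child.Measure(new Size(availableSize.Width, double.PositiveInfinity));
            width = System.Math.Max(width, child.DesiredSize.Width);
            height += child.DesiredSize.Height;
        }
        return new Size(width, height);
    }

    protected override Size ArrangeOverride(Size finalSize)
    {
        double y = 0;
        foreach (UIElement child in InternalChildren)
        {
            // Place each child directly below the previous one.
            child.Arrange(new Rect(0, y, finalSize.Width, child.DesiredSize.Height));
            y += child.DesiredSize.Height;
        }
        return finalSize;
    }
}
```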
There are some GDI objects that can be used to do some work with images in WPF, but these objects easily generate memory leaks and other errors (e.g. MILERR_WIN32ERROR).
What would be high-level alternatives to do the same work without using GDI?
Would GDI be bad for performance in a WPF application, given that WPF uses DirectX underneath?
What would be high-level alternatives to do the same work without using GDI?
It really depends, but ideally you'd do the work using WPF's API instead.
Would GDI be bad for performance in a WPF application, given that WPF uses DirectX underneath?
There's always going to be extra conversion between WPF's image formats and System.Drawing, as WPF doesn't use GDI. This is going to cause some extra overhead to map back and forth.
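As an illustration of that overhead, the usual conversion path looks something like the sketch below; the helper name is made up, and note that the GDI handle returned by GetHbitmap() must be released explicitly, which is one common source of the leaks mentioned in the question.

```csharp
// Sketch of the usual GDI -> WPF conversion path. The HBITMAP returned by
// GetHbitmap() is a GDI handle that WPF does not own; forgetting to call
// DeleteObject on it is a classic source of leaks.
using System;
using System.Runtime.InteropServices;
using System.Windows;
using System.Windows.Interop;
using System.Windows.Media.Imaging;

public static class GdiInterop
{
    [DllImport("gdi32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool DeleteObject(IntPtr hObject);

    public static BitmapSource ToBitmapSource(System.Drawing.Bitmap bitmap)
    {
        IntPtr hBitmap = bitmap.GetHbitmap();
        try
        {
            return Imaging.CreateBitmapSourceFromHBitmap(
                hBitmap, IntPtr.Zero, Int32Rect.Empty,
                BitmapSizeOptions.FromEmptyOptions());
        }
        finally
        {
            DeleteObject(hBitmap);   // release the GDI handle explicitly
        }
    }
}
```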
I'm working on a "falling sand" style of game.
I've tried many ways of drawing the sand to the screen, however, each way seems to produce some problem in one form or another.
List of things I've worked through:
Drawing each pixel individually, one at a time from a pixel sized texture. Problem: Slowed down after about 100,000 pixels were changing per update.
Drawing each pixel to one big Texture2D, drawing the Texture2D, then clearing the data. Problems: using texture.SetPixel() is very slow, and even with disposing of the old texture it caused a small memory leak (about 30 KB per second, which added up quickly). I simply could not figure out how to stop it. Overall, however, this has been the best method (so far). If there is a way to stop that leak, I'd like to hear it.
Using LockBits on a Bitmap. This worked wonderfully from the bitmap's perspective, but unfortunately I still had to convert the bitmap back to a Texture2D, which caused the frame rate to drop to less than one. So this has the potential to work very well, if I can find a way to draw the bitmap in XNA without converting it (or something similar).
Setting each pixel into a Texture2D with SetPixel, replacing the 'old' position of a pixel with a transparent pixel, then setting the new position with the proper color. This doubled the number of pixel sets necessary to finish the job, and was much, much slower than number 2.
So, my question is, any better ideas? Or ideas on how to fix styles 2 or 3?
My immediate thought is that you are stalling the GPU pipeline. The GPU can have a pipeline that lags several frames behind the commands that you are issuing.
So if you issue a command to set data on a texture, and the GPU is currently using that texture to render an old frame, it must finish all of its rendering before it can accept the new texture data. So it waits, killing your performance.
The workaround for this might be to use several textures in a double- (or even triple- or quad-) buffer arrangement. Don't attempt to write to a texture that you have just used for rendering.
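A rough sketch of that buffering idea in XNA might look like this; SandSurface and its members are illustrative names, and it assumes you rebuild a reused Color[] array on the CPU each frame and upload it in a single SetData call.

```csharp
// Sketch of round-robin texture buffering in XNA: write into a texture that
// was NOT used for drawing in the last couple of frames, so SetData does not
// have to wait for the GPU to finish with it.
using Microsoft.Xna.Framework;
using Microsoft.Xna.Framework.Graphics;

public class SandSurface
{
    private const int BufferCount = 3;
    private readonly Texture2D[] _textures = new Texture2D[BufferCount];
    private readonly Color[] _pixels;          // reused CPU-side buffer
    private int _current;

    public SandSurface(GraphicsDevice device, int width, int height)
    {
        for (int i = 0; i < BufferCount; i++)
            _textures[i] = new Texture2D(device, width, height);
        _pixels = new Color[width * height];
    }

    // Fill _pixels on the CPU elsewhere, then push the whole frame in one call.
    public void Upload()
    {
        _current = (_current + 1) % BufferCount;   // rotate to a texture the GPU is done with
        _textures[_current].SetData(_pixels);
    }

    public void Draw(SpriteBatch spriteBatch)
    {
        spriteBatch.Draw(_textures[_current], Vector2.Zero, Color.White);
    }
}
```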
Also - you can write to textures from a thread other than your rendering thread. This might come in handy, particularly for clearing textures.
As you seem to have discovered, it's actually quicker to SetData in large chunks rather than issue many small SetData calls. Determining the ideal size for a "chunk" differs between GPUs, but it is a fair bit bigger than a single pixel.
Also, creating a texture is much slower than reusing one, in raw performance terms (if you ignore the pipeline effect I just described); so reuse that texture.
It's worth mentioning that a "pixel sprite" requires sending maybe 30 times as much data per pixel to the GPU as a texture does.
See also this answer, which has a few more details and some in-depth links if you want to go deeper.
I have a WPF application where I need to add a feature that will display a series of full-screen bitmaps very fast. In most cases it will be only two images, essentially toggling between them. The rate at which they are displayed should be constant, around 10-20 ms per image. I tried doing this directly in WPF using a timer, but the display rate appeared to vary quite a bit. I also tried using SharpGL (a .NET wrapper on OpenGL), but it was very slow when using large images (I may not have been doing it the best way). I will have all the bitmaps upfront, before compile time, so the format could be changed as long as the pixels are not altered.
What would be the best way to do this?
I'm already behind schedule so I don't have time to learn lots of APIs or experiment with lots of options.
"I tried doing this directly in WPF using a timer, but the display rate appeared to vary quit a bit."
Instead of using a Timer, use Thread.Sleep(20), as it won't hog as many system resources. This should give you an immediate improvement.
It also sounds as though there will be user interaction with the application while the images are rapidly toggling; in this case, put the code that toggles the images in a background thread. Remember, though, that the UI is not thread-safe.
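As a rough sketch of that combination (ImageToggler is a made-up name; it assumes two pre-loaded bitmaps and an Image element already in the visual tree), you could do something like:

```csharp
// Sketch: toggle between two pre-loaded, frozen bitmaps from a background
// thread, marshaling only the Source assignment back to the UI thread.
using System.Threading;
using System.Windows.Controls;
using System.Windows.Media.Imaging;

public class ImageToggler
{
    private readonly Image _imageControl;
    private readonly BitmapImage[] _frames;
    private volatile bool _running = true;

    public ImageToggler(Image imageControl, BitmapImage frameA, BitmapImage frameB)
    {
        _imageControl = imageControl;
        _frames = new[] { frameA, frameB };
        frameA.Freeze();                  // frozen bitmaps can be used across threads
        frameB.Freeze();
    }

    public void Start()
    {
        var thread = new Thread(() =>
        {
            int i = 0;
            while (_running)
            {
                int frame = i++ % 2;
                _imageControl.Dispatcher.BeginInvoke(
                    new System.Action(() => _imageControl.Source = _frames[frame]));
                Thread.Sleep(20);         // roughly the 10-20 ms target from the question
            }
        }) { IsBackground = true };
        thread.Start();
    }

    public void Stop() => _running = false;
}
```

Note that Thread.Sleep plus Dispatcher dispatch will still jitter somewhat; it is a quick win rather than a hard real-time guarantee.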
These are just quick wins, but you might need to use DirectX for hardware acceleration to get around the HAL:
The Windows Hardware Abstraction Layer (HAL) is implemented in Hal.dll. The HAL implements a number of functions that are implemented in different ways by different hardware platforms, which in this context refers mostly to the chipset. Other components in the operating system can then call these functions in the same way on all platforms, without regard for the actual implementation.