Unexplainable performance issues with BitmapSource in WPF - c#

I have in my application a 3D world and data for this 3D world. The UI around the application is done with WPF and so far it seems to be working OK. But now I am implementing the following functionality: if you click on the terrain in the 3D view, it will show the textures used in this chunk of terrain in a WPF control. The image data of the textures is compressed (S3TC) and I handle creation of the BGRA8 data in a separate thread. Once it's ready I'm using the main window's dispatcher to do the WPF-related tasks. Now to show you this in code:
foreach (var pair in loadTasks)
{
    var img = pair.Item2;
    var loadInfo = TextureLoader.LoadToArgbImage(pair.Item1);
    if (loadInfo == null)
        continue;

    EditorWindowController.Instance.WindowDispatcher.BeginInvoke(new Action(() =>
    {
        var watch = Stopwatch.StartNew();
        var source = BitmapSource.Create(loadInfo.Width, loadInfo.Height, 96, 96, PixelFormats.Bgra32,
            null, loadInfo.Layers[0], loadInfo.Width * 4);
        watch.Stop();
        img.Source = source;
        Log.Debug(watch.ElapsedMilliseconds);
    }));
}
While I can't argue with the visual output, there is a weird performance issue. As you can see, I have added a stopwatch to check where the time is consumed, and I found the culprit: BitmapSource.Create.
Typically I have 5-6 elements in loadTasks and the images are 256x256 pixels. Interestingly, the first invocation shows 280-285 ms for BitmapSource.Create; the next 4-5 are all below 1 ms. This consistently happens every time I click the terrain and the loop is started. The only way to avoid the penalty on the first element is to click on the terrain constantly, but as soon as I don't click the terrain (and therefore do not invoke the code above) for 1-2 seconds, the next call to BitmapSource.Create gets the 280 ms penalty again.
Since anything above 5 ms is far beyond any reasonable or acceptable time to create a 256x256 bitmap (my S3TC decompression does all 10(!) mip layers in less than 2 ms), I guess there has to be something else going on here?
FYI: All properties of loadInfo are static properties and do not perform any calculations you can't see in the code.
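One way to keep that cost off the UI thread entirely would be to create and freeze the BitmapSource on the worker thread and only assign it in the dispatcher callback. A minimal sketch using the same names as above (frozen Freezables can be read from any thread):

// Sketch: build the bitmap on the loading thread and Freeze() it, so the dispatcher callback only assigns it.
var source = BitmapSource.Create(loadInfo.Width, loadInfo.Height, 96, 96, PixelFormats.Bgra32,
    null, loadInfo.Layers[0], loadInfo.Width * 4);
source.Freeze(); // frozen bitmaps are read-only and safe to hand to the UI thread

EditorWindowController.Instance.WindowDispatcher.BeginInvoke(new Action(() =>
{
    img.Source = source;
}));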

Related

Achieving lower frame rates without screen tearing

I'd like to achieve 24fps in a game. I am able to do this with the following code:
IsFixedTimeStep = info.fixedTimeStep;
TargetElapsedTime = TimeSpan.FromSeconds(1f / 24f);
However this produces either a stuttering frame rate or screen tearing, depending on whether vsync is enabled or not. I would expect this to be the case because of the mismatch between 24fps and the 60fps of the monitor.
I decided to try instead to achieve 30fps:
IsFixedTimeStep = info.fixedTimeStep;
TargetElapsedTime = TimeSpan.FromSeconds(1f / 30f);
However this also produces either a stuttering image or screen tearing. I can't understand why this happens when 30fps is exactly half of the monitor's 60fps refresh. Perhaps this is because the frame rates involved are not precise?
A bit of Googling caused me to discover that I can get a far better result by telling Monogame to sync every other screen refresh:
graphics = new GraphicsDeviceManager(game);
graphics.PreparingDeviceSettings += (sender, e) =>
{
    e.GraphicsDeviceInformation.PresentationParameters.PresentationInterval = PresentInterval.Two;
};
[EDIT 1: It has been brought to my attention that this is a bad idea because it assumes a monitor refresh rate of 60, so I need a better method even more!]
[EDIT 1.1: I discovered that this line specifies 60fps as a basis for then using the above technique of hitting 30fps:
game.TargetElapsedTime = TimeSpan.FromTicks(166666);
]
This gives me something around 30fps but with a smooth result. This is an acceptable result for me, but I wondered if anyone knows of any way at all of achieving something closer to 24fps? Or is this just impossible without the jittering/tearing?
Many thanks!
[EDIT 2: I get the same results whether in exclusive fullscreen or borderless window mode.]
Well I figured out what I think is the best achievable solution. This gives me 24fps with a pretty smooth (as smooth as 24 fps is likely to give) result.
IsFixedTimeStep = true;
TargetElapsedTime = TimeSpan.FromTicks(208333); // 48fps (will be halved to 24fps).
// The following effectively halves the frame rate to 24fps by syncing every other refresh.
graphics.PreparingDeviceSettings += (sender, e) =>
{
    e.GraphicsDeviceInformation.PresentationParameters.PresentationInterval = PresentInterval.Two;
};
It gives a better result than just using TimeSpan.FromTicks(416666) to set it at 24fps. I'm not sure why this is.
30fps can be achieved by adjusting the FromTicks value to give 60fps, which then gets halved to 30fps. Again, it produces a better result for me than just aiming for 30fps in the first place.
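For reference, a small sketch of the arithmetic behind this trick (HalfRateStep is a hypothetical helper, not part of MonoGame): the fixed-step update runs at twice the rate you want presented, and PresentInterval.Two then halves what actually reaches the screen.

// Hypothetical helper: update step for "run at 2x, present every other refresh".
static TimeSpan HalfRateStep(int presentedFps)
{
    // TimeSpan.TicksPerSecond is 10,000,000, so 24 -> 10,000,000 / 48 = 208333 ticks,
    // and 30 -> 10,000,000 / 60 = 166666 ticks, matching the values used above.
    return TimeSpan.FromTicks(TimeSpan.TicksPerSecond / (presentedFps * 2));
}

// Usage:
// TargetElapsedTime = HalfRateStep(24); // ~24fps presented
// TargetElapsedTime = HalfRateStep(30); // ~30fps presented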

C# How to Improve Efficiency in Direct2D Drawing

Good morning,
I have been teaching myself a bit of Direct2D programming in C#, utilizing the native wrappers that are available (currently using d2dSharp, but I have also tried SharpDX). I'm running into problems with efficiency, though: the basic Direct2D drawing methods are taking approximately 250 ms to draw 45,000 basic polygons. The performance I am seeing is on par with, or even slower than, Windows GDI+. I'm hoping that someone can take a look at what I've done and propose a way (or ways) that I can dramatically improve the time it takes to draw.
The background to this is that I have a personal project in which I am developing a basic but functional CAD interface capable of performing a variety of tasks, including 2D finite element analysis. In order to make it at all useful, the interface needs to be able to display tens-of-thousands of primitive elements (polygons, circles, rectangles, points, arcs, etc.).
I initially wrote the drawing methods using Windows GDI+ (System.Drawing), and performance is pretty good until I reach about 3,000 elements on screen at any given time. The screen must be updated any time the user pans, zooms, draws new elements, deletes elements, moves, rotates, etc. Now, in order to improve efficiency, I utilize a quad tree data structure to store my elements, and I only draw elements that actually fall within the bounds of the control window. This helped significantly when zoomed in, but obviously, when fully zoomed out and displaying all elements, it makes no difference. I also use a timer and tick events to update the screen at the refresh rate (60 Hz), so I'm not trying to update thousands of times per second or on every mouse event.
This is my first time programming with DirectX and Direct2D, so I'm definitely learning here. That being said, I've spent days reviewing tutorials, examples, and forums, and could not find much that helped. I've tried a dozen different methods of drawing, pre-processing, multi-threading, etc. My code is below.
Code to Loop Through and Draw Elements
List<IDrawingElement> elementsInBounds = GetElementsInDraftingWindow();

_d2dContainer.Target.BeginDraw();
_d2dContainer.Target.Clear(ColorD2D.FromKnown(Colors.White, 1));

if (elementsInBounds.Count > 0)
{
    Stopwatch watch = new Stopwatch();
    watch.Start();

    #region Using Drawing Element DrawDX Method
    foreach (IDrawingElement elem in elementsInBounds)
    {
        elem.DrawDX(ref _d2dContainer.Target, ref _d2dContainer.Factory, ZeroPoint, DrawingScale, _selectedElementBrush, _selectedElementPointBrush);
    }
    #endregion

    watch.Stop();
    double drawingTime = watch.ElapsedMilliseconds;
    Console.WriteLine("DirectX drawing time = " + drawingTime);

    watch.Reset();
    watch.Start();

    Matrix3x2 scale = Matrix3x2.Scale(new SizeFD2D((float)DrawingScale, (float)DrawingScale), new PointFD2D(0, 0));
    Matrix3x2 translate = Matrix3x2.Translation((float)ZeroPoint.X, (float)ZeroPoint.Y);
    _d2dContainer.Target.Transform = scale * translate;

    watch.Stop();
    double transformTime = watch.ElapsedMilliseconds;
    Console.WriteLine("DirectX transform time = " + transformTime);
}
DrawDX Function for Polygon
public override void DrawDX(ref WindowRenderTarget rt, ref Direct2DFactory fac, Point zeroPoint, double drawingScale, SolidColorBrush selectedLineBrush, SolidColorBrush selectedPointBrush)
{
    if (_pathGeometry == null)
    {
        CreatePathGeometry(ref fac);
    }

    float brushWidth = (float)(Layer.Width / (drawingScale));
    brushWidth = (float)(brushWidth * 2);

    if (Selected == false)
    {
        rt.DrawGeometry(Layer.Direct2DBrush, brushWidth, _pathGeometry);
        // Note that _pathGeometry is a PathGeometry
    }
    else
    {
        rt.DrawGeometry(selectedLineBrush, brushWidth, _pathGeometry);
    }
}
Code to Create Direct2D Factory & Render Target
private void CreateD2DResources(float dpiX, float dpiY)
{
    Factory = Direct2DFactory.CreateFactory(FactoryType.SingleThreaded, DebugLevel.None, FactoryVersion.Auto);

    RenderTargetProperties props = new RenderTargetProperties(
        RenderTargetType.Default, new PixelFormat(DxgiFormat.B8G8R8A8_UNORM,
        AlphaMode.Premultiplied), dpiX, dpiY, RenderTargetUsage.None, FeatureLevel.Default);

    Target = Factory.CreateWindowRenderTarget(_targetPanel, PresentOptions.None, props);
    Target.AntialiasMode = AntialiasMode.Aliased;

    if (_selectionBoxLeftStrokeStyle != null)
    {
        _selectionBoxLeftStrokeStyle.Dispose();
    }
    _selectionBoxLeftStrokeStyle = Factory.CreateStrokeStyle(new StrokeStyleProperties1(LineCapStyle.Flat,
        LineCapStyle.Flat, LineCapStyle.Flat, LineJoin.Bevel, 10, DashStyle.Dash, 0, StrokeTransformType.Normal), null);
}
I create a Direct2D factory and render target once and keep references to them at all times (that way I'm not recreating each time). I also create all of the brushes when the drawing layer (which describes color, width, etc.) is created. As such, I am not creating a new brush every time I draw, simply referencing a brush that already exists. Same with the geometry, as can be seen in the second code-snippet. I create the geometry once, and only update the geometry if the element itself is moved or rotated. Otherwise, I simply apply a transform to the render target after drawing.
Based on my stopwatches, the time taken to loop through and call the elem.DrawDX methods takes about 225-250 ms (for 45,000 polygons). The time taken to apply the transform is 0-1 ms, so it appears that the bottleneck is in the RenderTarget.DrawGeometry() function.
I've done the same tests with RenderTarget.DrawEllipse() or RenderTarget.DrawRectangle(), as I've read that using DrawGeometry is slower than DrawRectangle or DrawEllipse as the rectangle / ellipse geometry is known beforehand. However, in all of my tests, it hasn't mattered which draw function I use, the time for the same number of elements is always about equal.
I've tried building a multi-threaded Direct2D factory and running the draw functions through tasks, but that is much slower (about two times slower). The Direct2D methods appear to be utilizing my graphics card (hardware accelerated is enabled), as when I monitor my graphics card usage, it spikes when the screen is updating (my laptop has an NVIDIA Quadro mobile graphics card).
Apologies for the long-winded post. I hope this was enough background and description of things I've tried. Thanks in advance for any help!
Edit #1
So I changed the code from iterating over a list using foreach to iterating over an array using for, and that cut the drawing time down by half! I hadn't realized how much slower lists were than arrays (I knew there was some performance advantage, but didn't realize it was this much!). It still, however, takes 125 ms to draw. This is much better, but still not smooth. Any other suggestions?
Direct2D can be used with P/Invoke. See the sample "VB Direct2D Pixel Perfect Collision"
from https://social.msdn.microsoft.com/Forums/en-US/cea42526-4b82-454d-9d79-2e1d94083552/collisions?forum=vbgeneral
The animation is perfect, even though it is done in VB.
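One other mitigation, sketched here only (SharpDX names, since the question mentions trying it; d2dSharp should have close equivalents): when thousands of static elements share the same brush and stroke width, their path geometries can be batched into a single GeometryGroup that is rebuilt only when elements change, so each frame issues one DrawGeometry call per layer instead of one per element. Selected elements would still be drawn individually. The elementsInLayer collection and PathGeometry accessor below are hypothetical.

using System.Linq;
using SharpDX.Direct2D1;

// Built once when the layer's elements are added, moved, or rotated; not on every redraw.
Geometry[] outlines = elementsInLayer.Select(e => e.PathGeometry).ToArray(); // hypothetical accessor
GeometryGroup layerGroup = new GeometryGroup(factory, FillMode.Winding, outlines);

// Per frame: a single call draws the entire layer.
renderTarget.DrawGeometry(layerGroup, layerBrush, strokeWidth);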

usage of kinect toolbox record and replay

I am new to Stack Overflow and the Kinect SDK. I am currently working on my final-year project, which involves recording/replaying colour/depth and skeleton data from a Kinect. I found the Kinect Toolbox, which enables this, and I am integrating the Toolbox with the SDK sample projects (Colour/Depth/Skeleton Basics C# WPF) to make a program that can display all those streams from the .replay file recorded previously.
The problem I have for now is due to the differences between the KinectReplay class from the Toolbox and the KinectSensor class in the SDK. In the Depth Basics sample code, in order to display the streams, the following lines in WindowLoaded() allocate space for the data retrieved from the Kinect:
// Allocate space to put the depth pixels we'll receive
this.depthPixels = new DepthImagePixel[this.sensor.DepthStream.FramePixelDataLength];
// Allocate space to put the color pixels we'll create
this.colorPixels = new byte[this.sensor.DepthStream.FramePixelDataLength * sizeof(int)];
// This is the bitmap we'll display on-screen
this.colorBitmap = new WriteableBitmap(this.sensor.DepthStream.FrameWidth, this.sensor.DepthStream.FrameHeight, 96.0, 96.0, PixelFormats.Bgr32, null);
//The code below came from "Skeleton basics C# WPF", which I need to find the correspondence of "CoordinateMapper" in KinectReplay Class
// We are not using depth directly, but we do want the points in our 640x480 output resolution.
DepthImagePoint depthPoint = this.sensor.CoordinateMapper.MapSkeletonPointToDepthPoint(skelpoint, DepthImageFormat.Resolution640x480Fps30);
In the original sample code, the parameters for the sizes of the above objects were retrieved from the KinectSensor object. I need to do something similar, but taking the data from a KinectReplay object. For example, how do I get the equivalent of "this.sensor.DepthStream.FramePixelDataLength" from a KinectReplay object created as "this.replay = new KinectReplay(recordStream);"?
The only solution I can think of is to call "this.depthPixels = new DepthImagePixel[e.FramePixelDataLength];" in replay_DepthImageFrameReady(object sender, ReplayDepthImageFrameReadyEventArgs e), which is invoked each time a depth image frame is received from the KinectReplay. But then an array of DepthImagePixel would be initialised many times, which is inefficient; in the sample code this is done only once.
One solution would be to simply get the number of pixels in a frame once during initialization and always use this value since it's unlikely that the number of pixels in a recorded frame will change.
For example, assuming you have a method called OnNewDepthReplayFrame, you would do something like this (not tested, syntax might be off):
public void OnNewDepthReplayFrame(DepthReplayFrameEventArgs e)
{
    if (depthPixels == null)
    {
        depthPixels = new DepthImagePixel[e.FramePixelDataLength];
    }
    // code that uses your depthPixels here
}
However, using the record/replay capabilities that come with the Kinect 1.5 and 1.6 SDKs might actually be a better option than using the Kinect Toolbox. I used to use the Kinect Toolbox for its recording/replay, but then moved to Kinect Studio myself when Kinect for Windows v1.5 came out. Here's a video on how to use Kinect Studio as well as a guide on MSDN.

Saving a non-pow-2 dimension screen capture in Managed DirectX

I'm attempting to capture a rendered screen from a Managed DirectX application. Typically, the way to do this is as follows:
Surface renderTarget = device.GetRenderTarget(0);
SurfaceLoader.Save(snapshotName, ImageFileFormat.Bmp, renderTarget);
Which is (in my understanding) shorthand for something like:
Surface renderTarget = device.GetRenderTarget(0);
Surface destTarget = device.CreateOffscreenPlainSurface(ClientRectangle.Width, ClientRectangle.Height, graphicsSettings.WindowedDisplayMode.Format, Pool.SystemMemory);
device.GetRenderTargetData(renderTarget,destTarget);
SurfaceLoader.Save(snapshotName,ImageFileFormat.Bmp, destTarget);
The problem is that on older video cards which don't support non-power-of-two dimension textures, the above fails. I've tried a number of workarounds, but nothing seems to accomplish this seemingly simple task of saving arbitrary-dimensioned screen captures. For example, the following fails on new Bitmap() with an invalid parameter exception (note that this requires creating the device with PresentFlag.LockableBackBuffer):
Surface surf = m_device.GetRenderTarget(0);
GraphicsStream gs = surf.LockRectangle(LockFlags.ReadOnly);
Bitmap bmp = new Bitmap(gs);
bmp.Save(snapshotName, ImageFormat.Png);
surf.UnlockRectangle();
Any tips would be greatly appreciated...I've pretty much exhausted everything I can think of (or turn up on Google)...
Why not create a texture which is the next highest power of 2 and then copy a sub rect? It would get round your issues even if the image saved has a whole load of blank space.
I'm surprised Bitmap has issues, to be honest. However, if that's the case, then the above will work even if it's not ideal.
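If Bitmap(Stream) keeps throwing, another workaround worth sketching (assumptions: an A8R8G8B8 back buffer, a lockable render target, and a surface pitch equal to Width * 4; real code should use the pitch returned by LockRectangle) is to copy the locked pixels into a Bitmap manually instead of handing the GraphicsStream to the Bitmap constructor:

// Uses System.Drawing.Imaging (BitmapData, ImageLockMode) and System.Runtime.InteropServices (Marshal).
Surface surf = m_device.GetRenderTarget(0);
SurfaceDescription desc = surf.Description;
GraphicsStream gs = surf.LockRectangle(LockFlags.ReadOnly);

Bitmap bmp = new Bitmap(desc.Width, desc.Height, System.Drawing.Imaging.PixelFormat.Format32bppArgb);
BitmapData data = bmp.LockBits(new Rectangle(0, 0, desc.Width, desc.Height),
    ImageLockMode.WriteOnly, System.Drawing.Imaging.PixelFormat.Format32bppArgb);

byte[] row = new byte[desc.Width * 4];
for (int y = 0; y < desc.Height; y++)
{
    gs.Read(row, 0, row.Length); // one scanline from the locked surface
    Marshal.Copy(row, 0, new IntPtr(data.Scan0.ToInt64() + (long)y * data.Stride), row.Length);
}

bmp.UnlockBits(data);
surf.UnlockRectangle();
bmp.Save(snapshotName, ImageFormat.Png);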

EmguCV/OpenCV QueryFrame slow/buffers

We have an application, where we get a message from an external system and then we take a picture, do some processing and return something back to the external system. Doing some performance testing, I found two problems (they are somewhat related). I was hoping someone will be able to explain this to me.
1) Does _capture.QueryFrame() buffer frames?
What we see is that if there is a gap between the queries for two frames from a web camera, the second frame is often an older picture and not one from the moment QueryFrame was called.
We were able to mitigate this problem to some extent by discarding some frames, i.e. calling _capture.QueryFrame() 2-3 times and discarding the results.
2) The second issue is when we timed different parts of the application, we found that clearing the buffer (calling QueryFrame() 2-3 times and not using the results) takes about 65ms and then this line: Image<Bgr, Byte> source = _capture.QueryFrame() takes about 80ms. These two parts take the biggest chunk of processing time, our actual processing takes just about 20-30ms more.
Is there a faster way (a) to clear the buffer (b) to capture the frame?
If you have experience with OpenCV and know of something related, please do let me know.
I answered a similar question, System.TypeInitializationException using Emgu.CV in C#, and having tested the various possibilities for acquiring an up-to-date frame, I found the method below to be the best.
1) Yes, when you set up a Capture from a webcam, a ring buffer is created to store the images; this allows efficient allocation of memory.
2) Yes, there is a faster way. Set your Capture device up globally, set it recording, and call ProcessFrame to get an image from the buffer whenever it can. Now change your QueryFrame simply to copy whatever frame it has just acquired. This should stop your problem of getting the previous frame, and you will now have the most recent frame out of the buffer.
private Capture _capture;
private Image<Bgr, Byte> frame;
private Image<Gray, Byte> grayFrame;

public CameraCapture()
{
    InitializeComponent();
    _capture = new Capture();
    _capture.SetCaptureProperty(Emgu.CV.CvEnum.CAP_PROP.CV_CAP_PROP_FRAME_HEIGHT, height);
    _capture.SetCaptureProperty(Emgu.CV.CvEnum.CAP_PROP.CV_CAP_PROP_FRAME_WIDTH, width);
    Application.Idle += ProcessFrame;
}

private void ProcessFrame(object sender, EventArgs arg)
{
    frame = _capture.QueryFrame();
    grayFrame = frame.Convert<Gray, Byte>();
}

public Image<Bgr, Byte> QueryFrame()
{
    return frame.Copy();
}
I hope this helps; if not, let me know and I'll try to tailor a solution to your requirements. Don't forget you can always have your acquisition running on a different thread and invoke the new QueryFrame method.
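A rough sketch of that threaded variant (hypothetical field and method names, same Capture API as above; requires System.Threading):

private Capture _capture;
private Image<Bgr, Byte> _latestFrame;
private readonly object _frameLock = new object();

private void StartAcquisition()
{
    _capture = new Capture();
    Thread worker = new Thread(() =>
    {
        while (true)
        {
            Image<Bgr, Byte> next = _capture.QueryFrame(); // pull the next frame from the driver's buffer
            lock (_frameLock)
            {
                _latestFrame = next;
            }
        }
    });
    worker.IsBackground = true; // don't keep the process alive on shutdown
    worker.Start();
}

public Image<Bgr, Byte> QueryLatestFrame()
{
    lock (_frameLock)
    {
        return _latestFrame == null ? null : _latestFrame.Copy();
    }
}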
Cheers
Chris
This could also be due to the refresh rate of the web camera you are using. My camera works at 60 Hz, so I have a timer that captures a frame every 15 milliseconds.
