Opengl - "fullscreen" texture rendering performance issue

Opengl - "fullscreen" texture rendering performance issue - c#

I am writing a 2D game in openGL and I ran into some performance problems while rendering a couple of textures covering the whole window.
What I do is actually create a texture with the size of the screen, render my scene onto that texture using FBO and then render the texture a couple of times with a different offsets to get a kind of "shadow" going. But when I do that I get a massive performance drop while using my integrated video card.
So all in all I render 7 quads onto the whole screen(background image, 5 "shadow images" with a black "tint" and the same texture with its true colors). I am using RGBA textures which are 1024x1024 in size and fit them in a 900x700 window. I am getting 200 FPS when I am not rendering the textures and 34 FPS when I do (in both scenarios I actually create the texture and render the scene onto it). I find this quite odd because I am only rendering 7 quads essentially. A strange thing is also that when I run a CPU profiler it doesn't suggest that this is the bottleneck(I know that opengl uses a pipeline architecture and this thing can happen but most of the times it doesn't).
When I use my external video card I get consistent 200 FPS when I do the tests above. But when I disable the scene rendering onto the texture and disable the texture rendering onto the screen I get ~1000 FPS. This happens only to my external video card - when I disable the FBO using the integrated one I get the same 200 FPS. This really confuses me.
Can anyone explain what's going on and if the above numbers sound right?
Integrated video card - Intel HD Graphics 4000
External video card - NVIDIA GeForce GTX 660M
P.S. I am writing my game in C# - so I use OpenTK if that is of any help.
Edit:
First of all thanks for all of the responses - they were all very helpful in a way, but unfortunately I think there is just a little bit more to it than just "simplify/optimize your code". Let me share some of my rendering code:
//fields defined when the program is initialized
Rectangle viewport;
//Texture with the size of the viewport
Texture fboTexture;
FBO fbo;
//called every frame
public void Render()
{
//bind the texture to the fbo
GL.BindFramebuffer(FramebufferTarget.Framebuffer, fbo.handle);
GL.FramebufferTexture2D(FramebufferTarget.Framebuffer, fboTexture,
TextureTarget.Texture2D, texture.TextureID, level: 0);
//Begin rendering in Ortho 2D space
GL.MatrixMode(MatrixMode.Projection);
GL.PushMatrix();
GL.LoadIdentity();
GL.Ortho(viewport.Left, viewport.Right, viewport.Top, viewport.Bottom, -1.0, 1.0);
GL.MatrixMode(MatrixMode.Modelview);
GL.PushMatrix();
GL.LoadIdentity();
GL.PushAttrib(AttribMask.ViewportBit);
GL.Viewport(viewport);
//Render the scene - this is really simple I render some quads using shaders
RenderScene();
//Back to Perspective
GL.PopAttrib(); // pop viewport
GL.MatrixMode(MatrixMode.Projection);
GL.PopMatrix();
GL.MatrixMode(MatrixMode.Modelview);
GL.PopMatrix();
//Detach the texture
GL.FramebufferTexture2D(FramebufferTarget.Framebuffer, fboTexture, 0,
0, level: 0);
//Unbind the fbo
GL.BindFramebuffer(FramebufferTarget.Framebuffer, 0);
GL.PushMatrix();
GL.Color4(Color.Black.WithAlpha(128)); //Sets the color to (0,0,0,128) in a RGBA format
for (int i = 0; i < 5; i++)
{
GL.Translate(-1, -1, 0);
//Simple Draw method which binds the texture and draws a quad at (0;0) with
//its size
fboTexture.Draw();
}
GL.PopMatrix();
GL.Color4(Color.White);
fboTexture.Draw();
}
So I don't think there is actually anything wrong with the fbo and rendering onto the texture, because this is not causing the program to slow down on both of my cards. Previously I was initializing the fbo every frame and that might have been the reason for my Nvidia card to slow down, but now when I am pre-initializing everything I get the same FPS both with and without fbo.
I think the problem is not with the textures in general because if I disable textures and just render the untextured quads I get the same result. And still I think that my integrated card should run faster than 40 FPS when rendering only 7 quads on the screen, even if they cover the whole of it.
Can you give me some tips on how can I actually profile this and post back the result? That would be really useful.
Edit 2:
Ok I experimented a bit and managed to get much better performance. First I tried rendering the final quads with a shader - this didn't have any impact on performance as I expected.
Then I tried to run a profiler. But I far as I know SlimTune is just a CPU profiler and it didn't give me the results I wanted. Then I tried gDEBugger. It has an integration with visual studio which I later found out that it does not support .NET projects. I tried running the external version but it didn't seem to work (but maybe I just haven't played with it enough).
The thing that really did the trick was that rather than rendering the 7 quads directly to the screen I first render them on a texture, again using fbo, and then render the final texture once onto the screen. This got my fps from 40 to 120. Again this seem kind of curios to say the least. Why is rendering to a texture way faster than directly rendering to the screen? Nevertheless thanks for the help everyone - it seems that I have fixed my problem. I would really appreciate if someone come up with reasonable explanation of the situation.

Obviously this is a guess since I haven't seen or profiled your code, but I would guess that integrated cards are just struggling with your post-processing (drawing the texture several times to achieve your "shadow" effect).
I don't know your level of familiarity with these concepts, so sorry if I'm a bit verbose here.
About Post-Processing
Post-processing is the process of taking your completed scene, rendered to a texture, and applying effects to the image before displaying it on the screen. Typical uses of post-processing include:
Bloom - Simulate brightness more naturally by "bleeding" bright pixels into neighboring darker ones.
High Dynamic Range rendering - Bloom's big brother. The scene is rendered to a floating-point texture, allowing greater color ranges (as opposed to the usual 0 for black and 1 for full brightness). The final colors displayed on the screen are calculated using the average luminance of all the pixels on the screen. The effect of all of this is that the camera acts somewhat like the human eye - in a dark room, a bright light (say, through a window) looks extremely bright, but once you get outside, the camera adjusts and light only looks that bright if you stare directly at the sun.
Cel-shading - Colors are modified to give a cartoon-like appearance.
Motion blur
Depth of field - The in-game camera approximates a real one (or your eyes), where only objects at a certain distance are in-focus and the rest are blurry.
Deferred shading - A fairly advanced application of post-processing where lighting is calculated after the scene has been rendered. This costs a lot of video RAM (it usually uses several fullscreen textures) but allows a large number of lights to be added to the scene quickly.
In short, you can use post-processing for a lot of neat tricks. Unfortunately...
Post Processing Has a Cost
The cool thing about post-processing is that its cost is independent of the scene's geometric complexity - it will take the same amount of time whether you drew a million triangles or whether you drew a dozen. That's also its drawback, however. Even though you're only rendering a quad over and over to do post-processing, there is a cost for rendering each pixel. If you were to use a larger texture, the cost would be larger.
A dedicated graphics card obviously has far more computing resources to apply post-processing, whereas an integrated card usually has much fewer resources it can apply. It is for this reason that "low" graphics settings on video games often disable many post-processing effects. This wouldn't show up as a bottleneck on a CPU profiler because the delay happens on the graphics card. The CPU is waiting for the graphics card to finish before continuing your program (or, more accurately, the CPU is running another program while it waits for the graphics card to finish).
How Can You Speed Things Up?
Use fewer passes. If you halve the passes, you halve the time it takes to do post-processing. To that end,
Use shaders. Since I didn't see you mention them anywhere, I'm not sure if you're using shaders for your post-processing. Shaders essentially allow you to write a function in a C-like language (since you're in OpenGL, you can use either GLSL or Cg) which is run on every rendered pixel of an object. They can take any parameters you like, and are extremely useful for post-processing. You set the quad to be drawn using your shader, and then you can insert whatever algorithm you'd like to be run on every pixel of your scene.

Seeing some code would be nice. If the only difference between the two is using an external GPU or not, the difference could be in memory management (ie how and when you're creating an FBO, etc.), since streaming data to the GPU can be slow. Try moving anything that creates any sort of OpenGL buffer or sends any sort of data to it to initialization. I can't really give any more detailed advice without seeing exactly what you're doing.

It isn't just about number of quads you render, and I believe in your case it's got more to do with amout of triangle filling your video card has to do.
As was mentioned, the common way to do fullscreen post-processing is with shaders. If you want better performance on your integrated card and can't use shaders, then you should simplify your rendering routine.
Make sure you really need alpha blending. On some cards/drivers rendering textures with alpha channel can significantly reduce performance.
A somewhat low-quality way to reduce the amount of fullscreen filling would be to first perform all of your shadow draws on another, smaller texture (say, 256x256 instead of 1024x1024). Then you would draw a quad with that compound shadow texture onto your buffer. This way instead of 7 1024x1024 quads you would only need 6 256x256 and one 1024x1024. But you will lose in resolution.
Another technique, and I'm not sure it can be applied in your case, is to pre-render your complex background so you'll have to do less drawing in your rendering loop.

Related

Getting "giggly" effect when slowly moving a sprite

How do I remove this "giggly" effect when slowly moving a sprite?
I have tried adjusting Antialiasing values in QualitySettings and Filter Mode in ImportSettings in the Unity Editor but that doesn't change anything.
Ideally, I would like to keep the Filter Mode to Point (no filter) and anti aliasing turned on to 2x
The sprite is located inside a Sprite Renderer component of a GameObject.
I have uploaded my Unity Project here: http://www.filedropper.com/sprite
I really don't know how to fix the problem... Can anyone help with my personal project?

I cooked up a quick animation to demonstrate what's happening here:
The grid represents the output pixels of your display. I've overlaid on top of it the sliding sprite we want to sample, if we could render it with unlimited sub-pixel resolution.
The dots in the center of each grid cell represent their sampling point. Because we're using Nearest-Nieghbour/Point filtering, that's the only point in the texture they pay attention to. When the edge of a new colour crosses that sampling point, the whole pixel changes colour at once.
The trouble arises when the source texel grid doesn't line up with our output pixels. In the example above, the sprite is 16x16 texels, but I've scaled it to occupy 17x17 pixels on the display. That means, somewhere in every frame, some texels must get repeated. Where this happens changes as we move the sprite around.
Because each texel is rendered slightly larger than a pixel, there's a moment where it completely bridges the sampling points of two adjacent pixels. Both sampling points land within the same enlarged texel, so both pixels see that texel as the nearest one to sample from, and the texel gets output to the screen in two places.
In this case, since there's only a 1/16th scale difference, each texel is only in this weird situation for a frame or two, then it shifts to its neighbour, creating a ripple of doubled pixels that appears to slide across the image.
(One could view this as a type of moiré pattern resulting from the interaction of the texel grid and the sampling grid when they're dissimilar)
The fix is to ensure that you scale your pixel art so each texel is displayed at the size of an integer multiple of pixels.
Either 1:1
Or 2:1, 3:1...
Using a higher multiple lets the sprite move in increments shorter than its own texel size, without localized stretching that impacts the intended appearance of the art.
So: pay close attention to the resolution of your output and the scaling applied to your assets, to ensure you keep an integer multiple relationship between them. The blog post that CAD97 links has practical steps you can take to achieve this.
Edit: To demonstrate this in the Unity project you've uploaded, I modified the camera settings to match your pixels to units setting, and laid out the following test. The Mario at the top has a slightly non-integer texel-to-pixel ratio (1.01:1), while the Mario at the bottom has 1:1. You can see only the top Mario exhibits rippling artifacts:

You might be interested in this blog post about making "pixel-perfect" 2D games in Unity.
Some relevant excerpts:
If you start your pixel game with all the default settings in Unity, it will look terrible!
The secret to making your pixelated game look nice is to ensure that your sprite is rendered on a nice pixel boundary. In other words, ensure that each pixel of your sprite is rendered on one screen pixel.
These other settings are essential to make things as crisp as possible.
On the sprite:
Ensure your sprites are using lossless compression e.g. True Color
Turn off mipmapping
Use Point sampling
In Render Quality Settings:
Turn off anisotropic filtering
Turn off anti aliasing
Turn on pixel snapping in the sprite shader by creating a custom material that uses the Sprite/Default shader and attaching it to the SpriteRenderer.
Also, I'd just like to point out that Unless you are applying Physics, Never Use FixedUpdate. Also, if your sprite has a Collider and is moving, it should have a Kinematic RigidBody attached even if you're never going to use physics, to tell the engine that the Collider is going to move.

Same problem here. I noticed that the camera settings and scale are also rather important to fix the rippling problem.

Here is What Worked for me:
Go to Project Settings > Quality
Under Quality Make the default Quality as High for all.
Set the Anistropic Texture to "Disabled"
Done, And the issue is resolved for me.
Image Reference:
enter image description here

WinForms Draw parts of image at a rotated rectangle frame

I'm working on image transitions for my digital photo frame and am trying to achieve this transition:
It's more of a radar-style transition with the wiping effect going from one side to another in a 180 degree angle. Although, it doesn't appear that "blocky", I just spaced out the rectangles for illustration purposes. The entire thing should be a smooth transitions without any FPS stuttering effects.
My logic is to draw the specific part of the image at (theta) rotation angle like my drawing above - but that will end up drawing 100's of rectangles that sweeps across the screen.
Is there a more efficient way to do this? If not, could I have a few code tips to point me in the right direction?

It is practically impossible to have without any FPS shuttering especially in bigger screens because WinForms uses CPU only rendering. You will have to embed OpenTK (if you want to use OpenGL) or Direct3D frame inside, or maybe WPF where you do the animation.
If you use any of them (for example OpenGL), you have to load it as a texture, and the animation would be done on the triangle level (dragging the corners only) not on the image itself.
If you want to have a curved surface, like a real page transition, I recommend to use a bezier patch as is found here: http://nehe.gamedev.net/tutorial/bezier_patches__fullscreen_fix/18003/
This coding takes a lot of time, and is much more over the purpose of StackOverflow (to setup a full OpenGL/DirectX control + how to do a Bezier patch if you want to set it up).
If you don't want to embed anything, you may look to this transformations tutorial using WPF, but I'm not 100% sure that this is what you need:
http://www.codeproject.com/Articles/14895/WPF-Tutorial-Part-Transformations

(Unity C#) Texture2D scaling mystery

Alright, I've seen there's a number of threads about scaling a Texture2D both on here and the Unity forums. That line of searching brought me to now being able to scale Texture2Ds by using this class: http://wiki.unity3d.com/index.php/TextureScale
Now, what I'm currently working on is applying decals directly to the targeted texture. That's working fine. BUT, depending on the texture size and how much it's being scaled over it's geometry, the decal appears at different sizes (without any scaling, see attached image). That's what lead me to even look into scaling a Texture2D.
The mystery is, I'm not sure what kind of math to throw at the scale function to make the decal appear the same size visually no matter which mesh and texture it's on.
Any help is much appreciated : ]

If the texture is twice as large, the decal will be twice as large. So, scale the decal by half, and it will be normal size.
In general, every place you multiply the scale for the texture, divide the scale for the decal by the same amount.

C# XNA 2D trail effect optimization

Currently as a trail effect in my game I have for every 5 frames a translucent texture copy of a sprite is added to a List<> of trails.
The alpha values of these trails is decremented every frame and a draw function iterates through the list and draws each texture. Once they hit 0 alpha they are removed from the List<>.
The result is a nice little trail effect behind moving entities. The problem is for about 100+ entities, the frame rate begins to drop drastically.
All trail textures come from the same sprite sheet so i dont think it's batching issue. I profiled the code and the CPU intensity is lower during the FPS drop spikes then it is at normal FPS so I assume that means its a GPU limitation?
Is there any way to achieve this effect more efficiently?
Heres the general code im using:
// fade alpha
m_alpha -= (int)(gameTime.ElapsedGameTime.TotalMilliseconds / 10.0f);
// draw
if (m_alpha > 0) {
// p is used to alter RGB of the trails color (m_tint) depending on alpha value
float p = (float)m_alpha/255.0f;
Color blend = new Color((int)(m_tint.R*p), (int)(m_tint.G*p), (int)(m_tint.B*p), m_alpha);
// draw texture to sprite batch
Globals.spriteBatch.Draw(m_texture, getOrigin(), m_rectangle, blend, getAngle(), new Vector2(m_rectangle.Width/2, m_rectangle.Height/2), m_scale, SpriteEffects.None, 0.0f);
} else {
// flag to remove from List<>
m_isDone = true;
}
I guess i should note, the m_texture given to the trail class is a reference to a global texture shared by all trails. Im note creating a hard copy for each trail.
EDIT: If I simply comment out the SpriteBatch.Draw call, even when im allocating a new trail every single frame for hundreds of objects there is no drop in frames... there has got to be a better way to do this.

Usually for trails, instead of clearing the screen on every frame, you simply draw a transparent screen-sized rectangle before drawing the current frame. Thus the previous frame is "dimmed" or "color blurred" while the newer frame is fully "clear" and "bright". As this is repeated, a trail is generated from all the previous frames, which are never cleared but rather "dimmed".
This technique is VERY efficient and it is used in the famous Flurry screensaver (www.youtube.com/watch?v=pKPEivA8x4g).
In order to make the trails longer, you simply increase the transparency of the rectangle that you use to clear the screen. Otherwise, you make it more opaque to make the trail shorter. Note, however, that if you make the trails too long by making the rectangle too transparent, you risk leaving some light traces of the trail that due to alpha blending, might not completely erase even after a long time. The Flurry screensaver suffers from this kind of artifact, but there are ways to compensate for it.
Depending on your situation, you might have to adapt the technique. For instance, you might want to have several drawing layers that allow certain objects to leave a trail while others don't generate trails.
This technique is more efficient for long trails than trying to redraw a sprite thousands of times as your current approach.
On the other hand, I think the bottleneck in your code is the following line:
Globals.spriteBatch.Draw(m_texture, getOrigin(), m_rectangle, blend, getAngle(), new Vector2(m_rectangle.Width/2, m_rectangle.Height/2), m_scale, SpriteEffects.None, 0.0f);
It is inefficient to have thousands of GPU calls like Draw(). It would be more efficient if you had a list of polygons in a buffer, where each polygon is located in the correct position and it has transparency information stored with it. Then, with a SINGLE call to Draw(), you can then render all polygons with the correct texture and transparency. Sorry I cannot provide you with code for this, but if you want to continue with your approach, this might be the direction you are headed. In short, your GPU can certainly draw millions of polygons at a time, but it can't call Draw() that many times...

isometric tile engine

I am making an RPG game using an isometric tile engine that I found here:
http://xnaresources.com/default.asp?page=TUTORIALS
However after completing the tutorial I found myself wanting to do some things with the camera that I am not sure how to do.
Firstly I would like to zoom the camera in more so that it is displaying a 1 to 1 pixel ratio.
Secondly, would it be possible to make this game 2.5d in the way that when the camera moves, the sprite trees and things alike, move properly. By this I mean that the bottom of the sprite is planted while the top moves against the background, making a very 3d like experience. This effect can best be seen in games like diablo 2.
Here is the source code off their website:
http://www.xnaresources.com/downloads/tileengineseries9.zip
Any help would be great, Thanks

Games like Diablo or Sims 1, 2, SimCity 1-3, X-Com 1,2 etc. were actually just 2D games. The 2.5D effect requires that tiles further away are exactly the same size as tiles nearby. Your rotation around these games are restricted to 90 degrees.
How they draw is basically painters algorithm. Drawing what is furthest away first and overdrawing things that are nearer. Diablo is actually pretty simple, it didn't introduce layers or height differences as far as I remember. Just a flat map. So you draw the floor tiles first (in this case back to front isn't too necessary since they are all on the same elevation.) Then drawing back to front the walls, characters effects etc.
Everything in these games were rendered to bitmaps and rendered as bitmaps. Even though their source may have been a 3D textured model.
If you want to add perspective or free rotation then you need everything to be a 3D model. Your rendering will be simpler because depth or render order isn't as critical as you would use z-buffering to solve your issues. The only main issue is to properly render transparent bits in the right order or else you may end up with some odd results. However even if your rendering is simpler, your animation or in memory storage is a bit more difficult. You need to animate 3D models instead of just having an array of bitmaps to do the animation. Selection of items on the screen requires a little more work since position and size of the elements are no longer consistent or easily predictable.
So it depends on which features you want that will dictate which sort of solution you can use. Either way has it's plusses and minuses.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.