I'm working on a project in XNA that uses some rather large textures, which I load into the game as Texture2D objects and draw on screen much smaller than their native size. The reason is that I need to draw them at various sizes in different places, and while I could do this with multiple textures, that would be impractical.
My problem is that XNA does not seem to antialias these Texture2D objects when scaling them down. I have set:
graphics.PreferMultiSampling = true;
and also
spriteBatch.Begin(SpriteBlendMode.AlphaBlend);
But to no avail. I have also tried various things involving the filters on the GraphicsDevice, but, to be honest, without really knowing what I was doing.
Obviously I'm drawing this with a SpriteBatch, but aside from that there's nothing particularly interesting about the way I'm drawing it.
Here is an (enlarged) example of what's happening, and what I'm trying to make happen:
As you can see, XNA is not providing any anti-aliasing when rescaling the image. Is there a way I could force it to do so, in an effort to make the edges look cleaner?
Are you calling graphics.PreferMultiSampling = true; before the GraphicsDevice is created? You must turn it on before the device is created, either in the Game constructor or in Initialize().
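For illustration, a minimal sketch of the constructor placement, assuming the default Game1 template:

public Game1()
{
    graphics = new GraphicsDeviceManager(this);
    Content.RootDirectory = "Content";
    // set before the GraphicsDevice exists, so the backbuffer
    // is created with multisampling enabled
    graphics.PreferMultiSampling = true;
}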
I have a Unity 3D scene with several cameras looking at the same object (a huge brain mesh, ~100k triangles), but not necessarily from the same point of view.
The same scene also contains a huge number of spherical plot meshes (from 100 to 30,000).
Every camera has to display the brain mesh together with a subset of the plot meshes.
Depending on the camera view, each plot can have a different size (mesh filter and spherical collider), a different material (opaque or transparent), and can be visible or not.
The spherical collider must have the same size as the mesh.
All the spherical plots share a common mesh.
Their material can be one of the several shared materials I have defined.
Before the scene is rendered, for each camera view, I have to determine in the OnPreCull function which plots are visible and how they look.
This part can be very costly. I have tried several things:
setting the GameObject inactive: too costly
setting the local scale to Vector3(0,0,0): better, but I can see in the profiler that the rendering is still done
setting a fully transparent material: same result, but in the profiler the rendering now shows as transparent instead of opaque
setting a layer that is not in the cameras' layer masks: huge script cost
I don't know if I can build an efficient culling system with all these cameras looking at the same point...
I welcome any new ideas.
First issue:
Regarding your specific question with the four dots.
Simply set renderer.enabled = false; that's all there is to it.
Note however that as I mention in a comment, you would never try to "cull yourself" in Unity (unless I have misunderstood your description).
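To illustrate the renderer.enabled approach, here is a minimal sketch, attached to each camera; the plot array and the visibility rule are hypothetical stand-ins for your own:

using UnityEngine;

public class PlotCuller : MonoBehaviour
{
    public Renderer[] plots;   // cached Renderer of every plot sphere

    void OnPreCull()
    {
        foreach (Renderer r in plots)
            // toggling the renderer skips the draw entirely, without
            // the cost of SetActive, scale tricks, or layer changes
            r.enabled = IsVisibleFromThisCamera(r);
    }

    bool IsVisibleFromThisCamera(Renderer r)
    {
        // your per-camera visibility rule goes here
        return true;
    }
}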
Second issue:
Regarding the small spheres: I suspect you have very many in the scene. You simply can't do that. In video games (the most difficult of all 3D engineering), you do this with billboarding. It's how, say, "grass" is done in a scene. You can achieve this nicely with the particle system in Unity, or with other techniques. An implementation is beyond the scope of this answer, but you will have to fully investigate billboarding. Simply put, it's a small flat image that always faces the camera during the render pass.
Issue 2B:
Note however that sphere colliders are wonderful, and you can use as many as you want, for basic mathematical reasons. Side tip: folks often try to "write their own", thinking it will be faster. It's impossible to out-write the hundred-odd person-years of spatial-culling research in PhysX; moreover, it runs on the metal (the GPU), so you can't beat it.
Issue three:
Is there a chance you're using a mesh collider somewhere in the project? Never use mesh colliders, at all. (It's extremely confusing that they are mentioned or used in Unity; they have only one or two very specific, limited uses.)
Issue four:
I'm confused about why you are turning things on and off. I have a guess.
I suspect you are not using more than one "stage"!
There's an amazing trick in video games when you have more than one camera: you can have entire "offscreen" scenes! So you may have players in a dungeon or the like, and off "to the side" an entirely duplicate or triplicate setup of the whole thing running (you could "see it if the camera turned the wrong way") for the other cameras. (In the example, you would give the doppelgangers different qualities, coloring, map-style or whatever the case is.) Sometimes you make a whole double just to run physics calculations or address other problems.
Fascinating extreme example of that sort of thing.
In short, in your situation, you likely need one whole "stage", with its own camera and brain, for each of the camera views!
Again, this is covered at http://answers.unity3d.com/answers/299823/view.html, but it is indeed the everyday thing. In your overall scene you will see eight happy brains sitting in a row, each with its own camera. In each one you would display whatever items/angle etc. are relevant. (Obviously, if certain items are "identical, other than the viewing angle", you could use the "same brain with more than one camera"; but I would not do that - best to have one brain and one camera per view.)
I believe that could be the fundamental issue you're having!
I am writing a 2D game in OpenGL, and I ran into some performance problems while rendering a couple of textures covering the whole window.
What I actually do is create a texture the size of the screen, render my scene onto that texture using an FBO, and then render the texture a couple of times with different offsets to get a kind of "shadow" effect. But when I do that, I get a massive performance drop on my integrated video card.
So all in all I render 7 quads covering the whole screen (a background image, 5 "shadow" images with a black "tint", and the same texture with its true colors). I am using RGBA textures 1024x1024 in size, fitted into a 900x700 window. I get 200 FPS when I am not rendering the textures and 34 FPS when I do (in both scenarios I still create the texture and render the scene onto it). I find this quite odd, because I am essentially only rendering 7 quads. Also strange: a CPU profiler doesn't suggest that this is the bottleneck (I know that OpenGL uses a pipelined architecture and this can happen, but most of the time it doesn't).
When I use my external video card I get a consistent 200 FPS in the tests above. But when I disable both the scene-to-texture rendering and the texture-to-screen rendering, I get ~1000 FPS. This happens only on my external video card - when I disable the FBO on the integrated one, I still get the same 200 FPS. This really confuses me.
Can anyone explain what's going on and if the above numbers sound right?
Integrated video card - Intel HD Graphics 4000
External video card - NVIDIA GeForce GTX 660M
P.S. I am writing my game in C#, so I use OpenTK, if that is of any help.
Edit:
First of all, thanks for all of the responses - they were all very helpful in a way, but unfortunately I think there is a little bit more to it than just "simplify/optimize your code". Let me share some of my rendering code:
//fields defined when the program is initialized
Rectangle viewport;
//Texture with the size of the viewport
Texture fboTexture;
FBO fbo;

//called every frame
public void Render()
{
    //attach the texture to the fbo
    GL.BindFramebuffer(FramebufferTarget.Framebuffer, fbo.handle);
    GL.FramebufferTexture2D(FramebufferTarget.Framebuffer, FramebufferAttachment.ColorAttachment0,
        TextureTarget.Texture2D, fboTexture.TextureID, level: 0);

    //Begin rendering in Ortho 2D space
    GL.MatrixMode(MatrixMode.Projection);
    GL.PushMatrix();
    GL.LoadIdentity();
    GL.Ortho(viewport.Left, viewport.Right, viewport.Top, viewport.Bottom, -1.0, 1.0);
    GL.MatrixMode(MatrixMode.Modelview);
    GL.PushMatrix();
    GL.LoadIdentity();
    GL.PushAttrib(AttribMask.ViewportBit);
    GL.Viewport(viewport);

    //Render the scene - this is really simple, I render some quads using shaders
    RenderScene();

    //Back to perspective
    GL.PopAttrib(); // pop viewport
    GL.MatrixMode(MatrixMode.Projection);
    GL.PopMatrix();
    GL.MatrixMode(MatrixMode.Modelview);
    GL.PopMatrix();

    //Detach the texture
    GL.FramebufferTexture2D(FramebufferTarget.Framebuffer, FramebufferAttachment.ColorAttachment0,
        TextureTarget.Texture2D, 0, level: 0);
    //Unbind the fbo
    GL.BindFramebuffer(FramebufferTarget.Framebuffer, 0);

    GL.PushMatrix();
    GL.Color4(Color.Black.WithAlpha(128)); //Sets the color to (0,0,0,128) in an RGBA format
    for (int i = 0; i < 5; i++)
    {
        GL.Translate(-1, -1, 0);
        //Simple Draw method which binds the texture and draws a quad at (0;0) with its size
        fboTexture.Draw();
    }
    GL.PopMatrix();

    GL.Color4(Color.White);
    fboTexture.Draw();
}
So I don't think there is actually anything wrong with the FBO and rendering onto the texture, because that part is not what slows the program down on either of my cards. Previously I was initializing the FBO every frame, which might have been why my NVIDIA card slowed down; now that I pre-initialize everything, I get the same FPS both with and without the FBO.
I don't think the problem is with the textures in general either, because if I disable texturing and just render the untextured quads, I get the same result. And I still think my integrated card should run faster than 40 FPS when rendering only 7 quads, even if they cover the whole screen.
Can you give me some tips on how I can actually profile this and post back the results? That would be really useful.
Edit 2:
OK, I experimented a bit and managed to get much better performance. First I tried rendering the final quads with a shader - as I expected, this had no impact on performance.
Then I tried to run a profiler. But as far as I know, SlimTune is just a CPU profiler, and it didn't give me the results I wanted. Then I tried gDEBugger. It has Visual Studio integration, which I later found out does not support .NET projects. I tried running the standalone version, but it didn't seem to work (though maybe I just haven't played with it enough).
The thing that really did the trick: rather than rendering the 7 quads directly to the screen, I first render them onto a texture (again using an FBO) and then render that final texture once onto the screen. This took my FPS from 40 to 120. Again, this seems curious to say the least. Why is rendering to a texture so much faster than rendering directly to the screen? Nevertheless, thanks for the help everyone - it seems my problem is fixed. I would really appreciate it if someone could come up with a reasonable explanation of the situation.
Obviously this is a guess, since I haven't seen or profiled your code, but I would guess that integrated cards simply struggle with your post-processing (drawing the texture several times to achieve your "shadow" effect).
I don't know your level of familiarity with these concepts, so sorry if I'm a bit verbose here.
About Post-Processing
Post-processing is the process of taking your completed scene, rendered to a texture, and applying effects to the image before displaying it on the screen. Typical uses of post-processing include:
Bloom - Simulate brightness more naturally by "bleeding" bright pixels into neighboring darker ones.
High Dynamic Range rendering - Bloom's big brother. The scene is rendered to a floating-point texture, allowing greater color ranges (as opposed to the usual 0 for black and 1 for full brightness). The final colors displayed on the screen are calculated using the average luminance of all the pixels on the screen. The effect of all of this is that the camera acts somewhat like the human eye - in a dark room, a bright light (say, through a window) looks extremely bright, but once you get outside, the camera adjusts and light only looks that bright if you stare directly at the sun.
Cel-shading - Colors are modified to give a cartoon-like appearance.
Motion blur
Depth of field - The in-game camera approximates a real one (or your eyes), where only objects at a certain distance are in-focus and the rest are blurry.
Deferred shading - A fairly advanced application of post-processing where lighting is calculated after the scene has been rendered. This costs a lot of video RAM (it usually uses several fullscreen textures) but allows a large number of lights to be added to the scene quickly.
In short, you can use post-processing for a lot of neat tricks. Unfortunately...
Post-Processing Has a Cost
The cool thing about post-processing is that its cost is independent of the scene's geometric complexity - it takes the same amount of time whether you drew a million triangles or a dozen. That's also its drawback, however. Even though you're only rendering a quad over and over to do post-processing, there is a cost for rendering each pixel; if you use a larger texture, the cost grows accordingly.
A dedicated graphics card obviously has far more computing resources to apply post-processing, whereas an integrated card usually has far fewer. This is why "low" graphics settings in video games often disable many post-processing effects. This wouldn't show up as a bottleneck in a CPU profiler because the delay happens on the graphics card: the CPU is waiting for the graphics card to finish before continuing your program (or, more accurately, the CPU runs another program while it waits for the graphics card to finish).
How Can You Speed Things Up?
Use fewer passes. If you halve the passes, you halve the time it takes to do post-processing. To that end,
Use shaders. Since I didn't see you mention them anywhere, I'm not sure if you're using shaders for your post-processing. Shaders essentially allow you to write a function in a C-like language (since you're in OpenGL, you can use either GLSL or Cg) which is run on every rendered pixel of an object. They can take any parameters you like, and are extremely useful for post-processing. You set the quad to be drawn using your shader, and then you can insert whatever algorithm you'd like to be run on every pixel of your scene.
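For instance, here is a rough OpenTK sketch of collapsing the five shadow draws into one fullscreen pass. The GLSL blend math and offsets are illustrative only, not your exact effect, and it reuses your fboTexture.Draw() helper:

//rough sketch: one fragment shader pass instead of six fullscreen quads
const string fragSrc = @"
    uniform sampler2D scene;
    void main() {
        vec2 uv = gl_TexCoord[0].st;
        float shadow = 0.0;
        for (int i = 1; i <= 5; i++)
            shadow = max(shadow, texture2D(scene, uv + float(i) * vec2(0.01, 0.01)).a);
        vec4 base = texture2D(scene, uv);
        gl_FragColor = mix(vec4(0.0, 0.0, 0.0, 0.5 * shadow), base, base.a);
    }";

//done once at initialization
int program = GL.CreateProgram();
int fs = GL.CreateShader(ShaderType.FragmentShader);
GL.ShaderSource(fs, fragSrc);
GL.CompileShader(fs);
GL.AttachShader(program, fs);
GL.LinkProgram(program);

//done every frame: a single fullscreen quad
GL.UseProgram(program);
GL.Uniform1(GL.GetUniformLocation(program, "scene"), 0); //texture unit 0
fboTexture.Draw();
GL.UseProgram(0);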
Seeing some code would be nice. If the only difference between the two runs is whether the external GPU is used, the difference could be in memory management (i.e. how and when you create the FBO, etc.), since streaming data to the GPU can be slow. Try moving anything that creates any sort of OpenGL buffer, or sends any sort of data to one, into initialization. I can't really give more detailed advice without seeing exactly what you're doing.
It isn't just about the number of quads you render; I believe in your case it has more to do with the amount of triangle filling (fill rate) your video card has to do.
As was mentioned, the common way to do fullscreen post-processing is with shaders. If you want better performance on your integrated card and can't use shaders, then you should simplify your rendering routine.
Make sure you really need alpha blending. On some cards/drivers, rendering textures with an alpha channel can significantly reduce performance.
A somewhat low-quality way to reduce the amount of fullscreen filling is to first perform all of your shadow draws on another, smaller texture (say, 256x256 instead of 1024x1024). Then you draw a single quad with that compound shadow texture onto your buffer. This way, instead of seven 1024x1024 quads, you only need six 256x256 ones and one 1024x1024 - but you lose some resolution.
Another technique, though I'm not sure it applies in your case, is to pre-render your complex background so you have less drawing to do in your rendering loop.
So I'm working on a project with a 3D cube-based world. I got all of that to work, and I'm starting on the user interface. The moment I started using SpriteBatch to draw a cursor texture, I discovered that XNA no longer layers my models correctly: some models that are further away are drawn in front of nearer ones instead of behind them. When I take out all the SpriteBatch code, which is just this:
spriteBatch.Begin();
cursor.draw(spriteBatch);
spriteBatch.End();
I find that the problem is fixed immediately. The cursor is an object whose draw method just calls spriteBatch.Draw().
The way I see it, there are two solutions: find a way to draw my cursor and other interface elements without using SpriteBatch, or find a parameter I can pass to spriteBatch.Begin() to fix the issue. I'm not sure how to do either of these - has anyone else encountered this problem and knows how to fix it?
Thanks in advance.
I'm not sure if you can (or should) draw 2D without a SpriteBatch; however, I've had the same problem with 3D model rendering when using a 2D SpriteBatch, and a solution I found on GameDev helped me solve it:
Your device states are probably wrong. This often happens when mixing 2D and 3D (for example, the overload of SpriteBatch.Begin() which takes no arguments sets some device states that are incompatible with 3D rendering). No worries though; all you have to do is make sure that the following device states are set the way you want them:
GraphicsDevice.BlendState = BlendState.Opaque;
GraphicsDevice.DepthStencilState = DepthStencilState.Default;
GraphicsDevice.RasterizerState = RasterizerState.CullCounterClockwise;
GraphicsDevice.SamplerStates[0] = SamplerState.LinearWrap;
Basically, you first call the SpriteBatch methods for the 2D draws, then the above code (which should restore proper 3D rendering), and then draw your 3D models. In fact, I only used the first two lines - BlendState and DepthStencilState - and it worked as it should.
Have you had a look at this overload?
You can use the last parameter, layerDepth, to control the order in which sprites are drawn. If you use it, make sure to check the sprite sort mode (in the SpriteBatch.Begin(...) call) as well. Does this cover what you need to do?
Edit: Also note that this implies using the correct perspective matrix, drawing in 2D (I'm assuming you want your cursor displayed in 2D on top of everything else), and drawing after all the 3D work (it's quite possible to draw sprites at Z=0, for example, so that objects in front of that obstruct the sprite).
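For example, a minimal sketch of the layerDepth overload; the texture and position names are placeholders:

//with SpriteSortMode.BackToFront, layerDepth 1.0 is drawn first (back), 0.0 last (front)
spriteBatch.Begin(SpriteSortMode.BackToFront, BlendState.AlphaBlend);
spriteBatch.Draw(cursorTexture, cursorPosition, null, Color.White,
    0f, Vector2.Zero, 1f, SpriteEffects.None, 0f); //cursor on the front-most layer
spriteBatch.End();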
Currently, as a trail effect in my game, every 5 frames a translucent texture copy of a sprite is added to a List<> of trails.
The alpha value of each trail is decremented every frame, and a draw function iterates through the list and draws each texture. Once they hit 0 alpha, they are removed from the List<>.
The result is a nice little trail effect behind moving entities. The problem is that with 100+ entities, the frame rate begins to drop drastically.
All trail textures come from the same sprite sheet, so I don't think it's a batching issue. I profiled the code, and CPU usage is lower during the FPS drop spikes than at normal FPS, so I assume that means it's a GPU limitation?
Is there any way to achieve this effect more efficiently?
Here's the general code I'm using:
// fade alpha
m_alpha -= (int)(gameTime.ElapsedGameTime.TotalMilliseconds / 10.0f);

// draw
if (m_alpha > 0) {
    // p is used to alter the RGB of the trail's color (m_tint) depending on the alpha value
    float p = (float)m_alpha / 255.0f;
    Color blend = new Color((int)(m_tint.R * p), (int)(m_tint.G * p), (int)(m_tint.B * p), m_alpha);
    // draw texture to sprite batch
    Globals.spriteBatch.Draw(m_texture, getOrigin(), m_rectangle, blend, getAngle(),
        new Vector2(m_rectangle.Width / 2, m_rectangle.Height / 2), m_scale, SpriteEffects.None, 0.0f);
} else {
    // flag to remove from List<>
    m_isDone = true;
}
I should note that the m_texture given to the trail class is a reference to a global texture shared by all trails; I'm not creating a hard copy for each trail.
EDIT: If I simply comment out the SpriteBatch.Draw call, there is no drop in frames, even when I'm allocating a new trail every single frame for hundreds of objects... there has got to be a better way to do this.
Usually for trails, instead of clearing the screen every frame, you simply draw a translucent screen-sized rectangle over it before drawing the current frame. Thus the previous frame is "dimmed" or "color blurred" while the newer frame is fully "clear" and "bright". As this repeats, a trail is generated from all the previous frames, which are never cleared but merely "dimmed".
This technique is VERY efficient and it is used in the famous Flurry screensaver (www.youtube.com/watch?v=pKPEivA8x4g).
To make the trails longer, you increase the transparency of the rectangle you clear the screen with; to make them shorter, you make it more opaque. Note, however, that if you make the trails too long by making the rectangle too transparent, you risk leaving light traces of the trail that, due to alpha blending, might never completely fade even after a long time. The Flurry screensaver suffers from this kind of artifact, but there are ways to compensate for it.
Depending on your situation, you might have to adapt the technique. For instance, you might want to have several drawing layers that allow certain objects to leave a trail while others don't generate trails.
For long trails, this technique is far more efficient than redrawing a sprite thousands of times, as in your current approach.
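In XNA terms, a minimal sketch of the idea might look like this, assuming a persistent render target created with RenderTargetUsage.PreserveContents, a 1x1 white 'pixel' Texture2D made at load time, and a hypothetical DrawEntities helper for your sprites:

//'accum' keeps last frame's image; dim it instead of clearing
GraphicsDevice.SetRenderTarget(accum);
spriteBatch.Begin();
spriteBatch.Draw(pixel, GraphicsDevice.Viewport.Bounds,
    new Color(0, 0, 0, 25)); //higher alpha = shorter trails
DrawEntities(spriteBatch);   //this frame's sprites, full brightness
spriteBatch.End();
GraphicsDevice.SetRenderTarget(null);

//present the accumulated image
spriteBatch.Begin();
spriteBatch.Draw(accum, Vector2.Zero, Color.White);
spriteBatch.End();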
On the other hand, I think the bottleneck in your code is the following line:
Globals.spriteBatch.Draw(m_texture, getOrigin(), m_rectangle, blend, getAngle(), new Vector2(m_rectangle.Width/2, m_rectangle.Height/2), m_scale, SpriteEffects.None, 0.0f);
It is inefficient to issue thousands of GPU calls like Draw(). It would be more efficient to keep a list of polygons in a buffer, where each polygon sits at the correct position and stores its transparency information; then a SINGLE call to Draw() can render all the polygons with the correct texture and transparency. Sorry I cannot provide you with code for this, but if you want to continue with your approach, this might be the direction you are headed. In short, your GPU can certainly draw millions of polygons at a time, but it can't be called with Draw() that many times...
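For illustration, a rough XNA sketch of that direction; 'trails' and its WriteQuad helper are hypothetical, and a BasicEffect with the trail texture and alpha blending is assumed to be set up beforehand:

//build one vertex/index array for all trail quads...
var verts = new VertexPositionColorTexture[trails.Count * 4];
var indices = new short[trails.Count * 6];
for (int i = 0; i < trails.Count; i++)
{
    trails[i].WriteQuad(verts, i * 4); //corner positions, tint and UVs
    int v = i * 4, n = i * 6;          //two triangles per quad
    indices[n] = (short)v; indices[n + 1] = (short)(v + 1); indices[n + 2] = (short)(v + 2);
    indices[n + 3] = (short)v; indices[n + 4] = (short)(v + 2); indices[n + 5] = (short)(v + 3);
}
//...then issue a single draw call for the whole list
basicEffect.CurrentTechnique.Passes[0].Apply();
GraphicsDevice.DrawUserIndexedPrimitives(PrimitiveType.TriangleList,
    verts, 0, verts.Length, indices, 0, trails.Count * 2);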
I am trying to write a custom Minecraft Classic multiplayer client in XNA 4.0, but I am completely stumped when it comes to actually drawing the world in the game. Each block is a cube in 3D space, and it is possible for it to have a different texture on each side. From reading around the Internet, I found out that for a cube to have a different texture on each side, each face needs its own set of vertices. That makes a total of 24 vertices per cube, and with a world of 64*64*64 cubes (or possibly even more!), that is a lot of vertices.
In my original code, I split the texture map into separate textures and applied them before drawing each side of every cube. I was told that this is a very expensive approach, and that I should keep the textures in one file and simply use UV coordinates to map subtextures onto the cube. This didn't do much for performance though, since the sheer number of vertices is simply too great. I was also told to collect the vertices in a VertexBuffer and draw them all at once, but this didn't help much either, and occasionally causes an exception when the number of vertices exceeds the maximum size of the buffer. Every attempt I've made to have cubes share vertices has also failed, resulting in massive slowdown and glitchy cubes.
I have no idea what to do with this. I am pretty good at programming in general, but any kind of 3D programming or game development completely escapes me.
Here is the method I use to draw the cubes. I have two global lists, List<VertexPositionTexture> and List<int>: one for vertices and one for indices. When drawing, I iterate through all of the cubes in the world and call RenderShape on the ones that aren't empty (like air). The shape class I use is pasted below. The commented-out code in the AddVertices method is my attempt to make cubes share vertices. When all of the cubes' vertices have been added to the lists, the data is copied into a VertexBuffer and IndexBuffer, and DrawIndexedPrimitives is called.
To be honest, I am probably doing it completely wrong, but I really have no idea how to do it properly, and there are no tutorials that actually describe how to draw lots of objects, only extremely simple ones. I had to figure out for myself how to redo the BasicShape to support several textures.
The shape:
http://pastebin.com/zNUFPygP
You can get a copy of the code I wrote with a few other devs, called TechCraft:
http://techcraft.codeplex.com
It's free and open source. It should show you how to create an engine similar to Minecraft's.
There are a lot of things you can do to speed this up:
What you want to do is bake a region of cubes into a vertex buffer. What I mean by this is to take all of the cubes in a small area and put them all into one vertex buffer, and only update this buffer when a cube changes.
In a world like Minecraft's, LOTS of faces occlude each other. The biggest thing you can do is hide the faces shared between two cubes. Imagine two cubes sitting right next to each other: you never need to draw the face in between, since it can never be seen anyway. In our engine, this resulted in 20 times fewer vertices (see the sketch after the diagram below):
 _ _      _ _
|_|_| == |_ _|
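A rough sketch of both ideas, baking a region while skipping shared faces; IsSolid and AddFace are hypothetical helpers, where AddFace would append one face's four vertices (with atlas UVs) and six indices to the chunk's lists:

for (int x = 0; x < SIZE; x++)
    for (int y = 0; y < SIZE; y++)
        for (int z = 0; z < SIZE; z++)
        {
            if (!IsSolid(x, y, z)) continue;
            //emit a face only when the neighbouring cell is empty
            if (!IsSolid(x + 1, y, z)) AddFace(x, y, z, CubeFace.Right);
            if (!IsSolid(x - 1, y, z)) AddFace(x, y, z, CubeFace.Left);
            if (!IsSolid(x, y + 1, z)) AddFace(x, y, z, CubeFace.Top);
            if (!IsSolid(x, y - 1, z)) AddFace(x, y, z, CubeFace.Bottom);
            if (!IsSolid(x, y, z + 1)) AddFace(x, y, z, CubeFace.Front);
            if (!IsSolid(x, y, z - 1)) AddFace(x, y, z, CubeFace.Back);
        }
//rebuild this chunk's VertexBuffer/IndexBuffer only when a cube changes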
As for your textures, it is a good idea, as you said, to use a texture atlas. This greatly reduces your draw calls.
Good luck! And if you feel like cheating, look at Infiniminer, the game Minecraft was based on. It's written in XNA and is open source!
You need to think about reducing the size of the problem. How can you produce the same image by doing less work?
If your cubes are spaced at regular intervals and are all the same size, you may not need to store the vertices at all - your shader may be able to calculate the vertex positions as it runs. If they are different sizes or not spaced at regular intervals, you may still be able to use some form of instancing (where you supply the position and size of a cube to a shader and it works out where to render the vertices to make a cube appear at that location).
If your cubes obscure anything behind them, then you only need to draw the front-most cubes - anything behind them is simply not visible. A natural approach is an octree data structure, which divides 3D space into voxels (cubes). Using an octree you can quickly determine which cubes are visible and draw just those - so rather than drawing 64x64x64 cubes, you may find you only have to draw a few hundred per frame. You will also find that as the camera moves, the set of visible cubes changes little, so you may be able to use this "temporal coherence" to update your data structures and minimise the work needed to decide which cubes are visible. A sketch of the per-node visibility test follows.
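Not a full octree implementation, but a hedged sketch of the per-node visibility test in XNA terms; OctreeNode and Cube are hypothetical types, with each node storing a BoundingBox:

void CollectVisible(OctreeNode node, BoundingFrustum frustum, List<Cube> visible)
{
    //if the camera frustum misses this node, its whole subtree is off-screen
    if (frustum.Contains(node.Bounds) == ContainmentType.Disjoint)
        return;
    if (node.IsLeaf)
        visible.AddRange(node.Cubes);
    else
        foreach (OctreeNode child in node.Children)
            CollectVisible(child, frustum, visible);
}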
I don't think Minecraft draws all the cubes all the time. Most of them are interior: you only need to draw the ones on the surface. So basically, you need an efficient voxel renderer.
I recently wrote an engine to do this in XNA. The technique you want to look into is called hardware instancing: it allows you to pass one model to the shader along with a stream of world positions, to "instance" that model hundreds (even thousands) of times all over your game world.
I built my engine on top of this example, replacing the instanced model with my own.
http://xbox.create.msdn.com/en-US/education/catalog/sample/mesh_instancing
Once you make it into a reusable class, it and its accompanying shaders become very useful for rendering thousands of pretty much anything you want (bushes, trees, cubes, swarms of birds, etc.).
Once you have a base model (it could be one face of the block), its mesh will have an associated texture that you can then replace with whatever you want, allowing you to change the block texturing dynamically for each side and for differing types of blocks.
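For reference, a condensed sketch of the XNA 4.0 (HiDef profile) instancing calls that the sample builds on; the buffer and count names here are placeholders:

//one world matrix per cube instance
instanceBuffer.SetData(instanceTransforms);
GraphicsDevice.SetVertexBuffers(
    new VertexBufferBinding(cubeVertexBuffer, 0, 0),
    new VertexBufferBinding(instanceBuffer, 0, 1)); //advance once per instance
GraphicsDevice.Indices = cubeIndexBuffer;
GraphicsDevice.DrawInstancedPrimitives(PrimitiveType.TriangleList,
    0, 0, cubeVertexCount, 0, cubePrimitiveCount, instanceTransforms.Length);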