I'm working on a simple .NET renderer and have been pushing the same stone uphill for a month now:
I'm using the Effects framework, as compiled by the SlimDX developers for DX11, and I keep having trouble updating EffectVariables inside the rendering loop (I need a vector value updated after every DrawIndexed call). The code looks like this (filtered):
public Vector4 wireframeColor = new Vector4();
public Vector4 gridColor = new Vector4();
wireColorVar = effect.GetVariableByName("wireframeColor").AsVector();
wireColorVar.Set(gridColor);
DrawIndexed(GridDomain);
foreach (var model in scene)
{
wireColorVar.Set(wireframeColor);
DrawIndexed(modelDomain);
}
The .fx file looks like:
cbuffer ColorBuffer
{
float4 wireframeColor;
float4 diffuseColor;
};
float4 PVertex_Wire_PShader(PVertex_PInput input) : SV_TARGET
{
return wireframeColor;
}
The problem is that whenever all those rendering passes are applied, the shader only ever sees the last value set on the variable, i.e. wireframeColor, and never the value of gridColor. I've had this problem for a while, so I can say with confidence that it applies to every type of EffectVariable, from buffers to shader resources (UAViews, for example), and it is really frustrating. Effect variables always end up with the value from the last Set call. DeviceContext.Flush() changes nothing, but it really does look like some sort of queuing of GPU commands.
Only source of info for now, and it didn't work
It looks like applying effect passes doesn't flush variable changes. When I needed a compute shader to work, I had to manually bind resource views to explicitly placed variables (by register index) of the shader stage.
Is this a problem with the Effects implementation? It's not a dead end, since I can still fall back to low-level constant buffer assignments, but then there is no point in using effects at all.
P.S. Oh, and please don't suggest using the diffuseColor field, or any other way of simply multiplying the number of variables. I need to change the value of one and the same variable many times between swapChain.Present() calls.
Thank you for your attention.
I switched to the native DX11 shader mechanism; the Microsoft Effects framework seems to follow strange logic when applying GPU commands.
I'm developing a role-playing game in C# (Unity) with a Lua scripting front-end for game logic and modding. I have a design question I've been thinking about and can't seem to find the answer. I'm implementing an Effect class, which provides a common interface to define and handle effects that affect champions or creatures, whether due to a spell, an enchanted item, or a condition (paralyzed, afraid...). The objective is to be as flexible as possible and decouple effects code from the actual champion components/classes.
I want the effect to have access to callbacks, so that it can alter what happens to the entity. If the character health changes for example, active effects can kick in and change that change before it's applied. Here are two examples in Lua, the API should be self-explanatory:
Ring of Health Loss Halving:
onHealthAdjustment = function(entity, val)
if val < 0 then val = math.floor(val / 2); end
return val;
end
Mana Shield spell:
onHealthAdjustment = function(entity, val)
if val < 0 then
local championProperties = entity.championProperties;
if championProperties then
championProperties:adjustMana(val);
end
return 0;
else
return val;
end
end
That's fine, but how do I handle the execution order of the callbacks?
Let's say the champion loses 10 health. If the ring gets processed first, it lowers that to 5, then the spell reduces health loss to 0 and removes 5 mana instead.
If the spell gets processed first, it reduces health loss to 0, removes 10 mana, and then the ring callback gets a 0 and does nothing.
I can add an effect priority variable, but there would always end up some with the same value. Processing in last-applied-first order, or processing only the last-applied effect, leads to stupid exploits where, for example, the player picks up items and clicks them back into the inventory to control the order in which the effects are processed... I don't see a way to call callbacks in parallel instead of sequentially...
I'm either not seeing an obvious way to fix the current pattern, or I need to change to another design pattern. I've read about Strategy and Observer patterns but can't seem to find a clear answer. How are these cases usually handled?
Thanks!
there would always end up some with the same value
So? If you get a collision, fix it. The order in which the effects are applied is not arbitrary; it's part of your design.
Presumably in your code you have a list of event handlers per event type which you iterate through when the event happens. Just make sure that list is in the right order (e.g. by controlling the order they are registered) and you're done.
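To make the "keep the list in the right order" idea concrete, here is a minimal C# sketch. All the names in it (HealthEffect, HealthAdjustmentPipeline, the Priority and Name fields) are made up for illustration and are not part of Unity or of any Lua binding. It assumes each effect type gets a design-time priority and a unique name, and it breaks priority ties by name, so the order never depends on when an effect was applied.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration only.
public sealed class HealthEffect
{
    public string Name;                        // assumed unique per effect type; breaks ties
    public int Priority;                       // lower values run first
    public Func<int, int> OnHealthAdjustment;  // takes the pending change, returns the new one
}

public sealed class HealthAdjustmentPipeline
{
    private readonly List<HealthEffect> _effects = new List<HealthEffect>();

    public void Add(HealthEffect effect)
    {
        _effects.Add(effect);
        // The order depends only on (Priority, Name), never on application time.
        _effects.Sort((a, b) =>
        {
            int byPriority = a.Priority.CompareTo(b.Priority);
            return byPriority != 0 ? byPriority : string.CompareOrdinal(a.Name, b.Name);
        });
    }

    public int Apply(int delta)
    {
        // Each handler sees the value produced by the previous one.
        foreach (HealthEffect effect in _effects)
            delta = effect.OnHealthAdjustment(delta);
        return delta;
    }
}
```

With the ring at priority 10 and the mana shield at priority 20, a health loss of 10 is halved to 5 first and then converted into 5 mana, no matter which of the two was equipped or cast first.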
Side note. In case you didn't know, this:
onHealthAdjustment = function(entity, val) end
Can be written like this:
function onHealthAdjustment(entity, val) end
We have a project that currently uses DirectX11 SlimDX and would like to move it to SharpDX. However, this project uses the Effect framework from SlimDX which I understand is no longer properly supported in DirectX11. However I can't find definitive information about how I should transition the effects.
The effects in use are relatively simple pixel shaders, contained in .fx files.
Should I move away from .fx files? What to? To plain .hlsl files? Or should I use the SharpDX Toolkit? Does this use a different format from .fx files?
I can't find any documentation on this. Is there anyone who has made this transition who could give me some advice, or any documentation on the SharpDX Toolkit effects framework?
The move to SharpDX is pretty simple. There are a couple of changes in naming and in resource descriptions, but apart from the fact that it's relatively cumbersome (depending on the size of your code base), there's nothing too complex.
As for the effects framework, the SharpDX.Direct3D11.Effects library wraps it, so it is of course still supported.
It's pretty much the same as its SlimDX counterpart, so you should not have any major issues moving over.
If you want to transition away from the fx framework to plainer HLSL, you can keep the same .fx file, but the compilation steps change: instead of compiling the whole file, you need to compile each shader separately.
So for example, to compile and create a VertexShader:
CompilationResult result = ShaderBytecode.Compile(content, "VS", "vs_5_0", flags, EffectFlags.None, null, null);
VertexShader shader = new VertexShader(device, result.Bytecode);
You also need to be careful with all constant buffer and resource registers. It's generally good to set them explicitly, for example:
cbuffer cbData : register(b0)
{
float4x4 tW;
float4x4 tColor;
float4 cAmb;
};
Of course, you no longer have EffectVariable or lookup by name/semantic, so instead you need to map your cbuffer to a struct in C# (you can also use a DataStream directly) and create constant buffer resources.
[StructLayout(LayoutKind.Sequential,Pack=16)]
public struct cbData
{
public Matrix tW;
public Matrix tColor;
public Vector4 cAmb;
}
BufferDescription bd = new BufferDescription()
{
BindFlags = BindFlags.ConstantBuffer,
CpuAccessFlags = CpuAccessFlags.Write,
OptionFlags = ResourceOptionFlags.None,
SizeInBytes = 144, //Matches the struct
Usage = ResourceUsage.Dynamic
};
var cbuffer = new SharpDX.Direct3D11.Buffer(device, bd);
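Use either UpdateSubresource (for a buffer created with ResourceUsage.Default and no CPU access) or MapSubresource (for a dynamic buffer like the one above) to update the data, and deviceContext.VertexShader.SetConstantBuffer to bind it to the pipeline.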
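Rather than hard-coding 144, the size can be derived from the struct. The sketch below is only an illustration of that idea: it uses System.Numerics.Matrix4x4 and Vector4 as stand-ins for the SharpDX Matrix and Vector4 types (an assumption, made so that the snippet needs no SharpDX reference), and ConstantBufferSize.Of is a made-up helper name. It also rounds up to a multiple of 16 bytes, which Direct3D 11 requires for constant buffer sizes.

```csharp
using System.Numerics;
using System.Runtime.InteropServices;

// Stand-in for the cbData struct above, using System.Numerics types
// instead of SharpDX ones (assumption for the sake of a self-contained sketch).
[StructLayout(LayoutKind.Sequential, Pack = 16)]
public struct CbDataLayout
{
    public Matrix4x4 tW;      // 64 bytes
    public Matrix4x4 tColor;  // 64 bytes
    public Vector4 cAmb;      // 16 bytes
}

public static class ConstantBufferSize
{
    // Hypothetical helper: marshalled size of T, rounded up to a multiple of 16.
    public static int Of<T>() where T : struct
    {
        int raw = Marshal.SizeOf<T>();
        return (raw + 15) & ~15;
    }
}
```

ConstantBufferSize.Of<CbDataLayout>() then gives the value to put into SizeInBytes, and it keeps matching if a field is added to the struct later.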
If you need to inspect a shader with reflection, it is done this way (note that this is actually what the effects framework does; it's just a layer on top of d3dcompiler):
ShaderReflection refl = new ShaderReflection(result.Bytecode);
You then need to set up all API calls manually (which is what Effects does for you when you call EffectPass.Apply).
Also, since you compile shaders individually, there is no more layout validation between stages (the effects compiler telling you: No valid VertexShader-PixelShader combination....). So you need to be careful not to set up your pipeline with non-matching shaders (you can use reflection data to validate manually, or watch a black screen with the debug runtime spamming your output window in Visual Studio).
So transitioning can be a bit tedious, but it can also be beneficial, since it's easier to minimize pipeline state changes (in my use case this is not a concern, so the effects framework does just fine, but with a high number of draw calls it can become significant).
I'm working on a personal project that, like many XNA projects, started with a terrain displacement map which is used to generate a collection of vertices which are rendered in a Device.DrawIndexedPrimitives() call.
I've updated to a custom VertexDeclaration, but I don't have access to that code right now, so I will post the slightly older, but paradigmatically identical (?) code.
I'm defining a VertexBuffer as:
VertexBuffer = new VertexBuffer(device, VertexPositionNormalTexture.VertexDeclaration, vertices.Length, BufferUsage.WriteOnly);
VertexBuffer.SetData(vertices);
where 'vertices' is defined as:
VertexPositionNormalTexture[] vertices
I've also got two index buffers that are swapped on each Update() iteration. In the Draw() call, I set the GraphicsDevice buffers:
Device.SetVertexBuffer(_buffers.VertexBuffer);
Device.Indices = _buffers.IndexBuffer;
Ignoring what I hope are irrelevant lines of code, I've got a method that checks within a bounding shape to determine whether a vertex is within a certain radius of the mouse cursor and raises or lowers those vertex positions depending upon which key is pressed. My problem is that the VertexBuffer.SetData() is only called once at initialization of the container class.
Modifying the VertexPositionNormalTexture[] array's vertex positions doesn't get reflected to the screen, though the values of the vertex positions are changed. I believe this to be tied to the VertexBuffer.SetData() call, but you can't simply call SetData() with the vertex array after modifying it.
After re-examining how the IndexBuffer is handled (2 buffers, swapped and passed into SetData() at Update() time), I'm thinking this should be the way to handle VertexBuffer manipulations, but does this work? Is there a more appropriate way? I saw another reference to a similar question on here, but the link to source was on MegaUpload, so...
I'll try my VertexBuffer.Swap() idea out, but I have also seen references to DynamicVertexBuffer and wonder what the gain there is? Performance supposedly suffers, but for a terrain editor, I don't see that as being too huge a trade-off if I can manipulate the vertex data dynamically.
I can post more code, but I think this is probably a lack of understanding of how the device buffers are set or data is streamed to them.
EDIT: The solution proposed below is correct. I will post my code shortly.
First: I am assuming you are not adding or subtracting vertices from the terrain. If you aren't, you won't need to alter the indexbuffer at all.
Second: you are correct in recognizing that simply editing your array of vertices will not change what is displayed on screen. A VertexBuffer is entirely separate from the vertices it is created from and does not keep a reference to the original array of them. It is a 'snapshot' of your vertices when you set the data.
I'm not sure about some of what seem to be assumptions you have made. You can, as far as I am aware, call VertexBuffer.SetData() at any time. If you are not changing the number of vertices in your terrain, only their positions, this is good. Simply re-set the data in the buffer every time you change the position of a vertex. [Note: if I am wrong and you can only set the data on a buffer once, then just replace the old instance of the buffer with a new one and set the data on that. I don't think you need to, though, unless you've changed the number of vertices]
Calling SetData is fairly expensive for a large buffer, though. You may consider 'chunking' your terrain into many smaller buffers to avoid the overhead required to set the data upon changing the terrain.
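A rough sketch of the bookkeeping side of chunking, assuming a square grid of vertices split into square chunks with one VertexBuffer per chunk. DirtyChunkTracker and its members are hypothetical names, and the actual per-chunk SetData call is left out because it depends on how your buffers are organised.

```csharp
using System.Collections.Generic;

// Hypothetical helper: remembers which terrain chunks were edited since the
// last upload, so only their vertex buffers need a SetData call.
public sealed class DirtyChunkTracker
{
    private readonly int _chunkSize;     // vertices per chunk along one axis
    private readonly int _chunksPerRow;  // chunks along the X axis
    private readonly HashSet<int> _dirty = new HashSet<int>();

    public DirtyChunkTracker(int terrainWidth, int chunkSize)
    {
        _chunkSize = chunkSize;
        _chunksPerRow = (terrainWidth + chunkSize - 1) / chunkSize;
    }

    // Call this whenever the editor raises or lowers the vertex at (x, z).
    public void MarkVertex(int x, int z)
    {
        _dirty.Add((z / _chunkSize) * _chunksPerRow + (x / _chunkSize));
    }

    // Returns the chunk indices that need re-uploading and resets the set.
    public List<int> TakeDirtyChunks()
    {
        var result = new List<int>(_dirty);
        result.Sort();
        _dirty.Clear();
        return result;
    }
}
```

Once per frame you would loop over TakeDirtyChunks() and re-set the data only on those chunk buffers. If neighbouring chunks duplicate the vertices along their shared border, an edit on the border has to mark both chunks.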
I do not know much about the DynamicVertexBuffer class, but I don't think it's optimal for this situation (even if it sounds like it is). I think it's more used for particle vertices. I could be wrong, though. Definitely research it.
Out of curiosity, why do you need two index buffers? If your vertices are the same, why would you use different indices per frame?
Edit: Your code for creating the VertexBuffer uses BufferUsage.WriteOnly. Good practice is to make the BufferUsage match that of the GraphicsDevice. If you haven't set the BufferUsage of the device, you probably just want to use BufferUsage.None. Try both and check performance differences if you like.
I have the following code:
public class Character
{
public Vector2 WorldPixelPosition
{
get { return Movement.Position; }
}
public Vector2 WorldPosition
{
get { return new Vector2(Movement.Position.X / Tile.Width, Movement.Position.Y / Tile.Height); }
}
public Vector2 LevelPosition
{
get { return new Vector2(WorldPosition.X % Level.Width, WorldPosition.Y % Level.Height); }
}
}
Now somewhere else in my code, I make about 2500 calls in a loop to Character.LevelPosition.
This means that per update-cycle, 5000 'new' Vector2s are being made, and on my laptop, it really drops the framerate.
I have temporarily fixed it by creating
var levelPosition = Character.LevelPosition;
before I start the loop, but I kinda feel it's ugly code to do this every time I come across a similar situation. Maybe it -is- the way to go, but I want to make sure.
Is there a better or commonly accepted way to do this?
I'm using the XNA-Framework, which uses Vector2's.
From what I understand, you should avoid allocating lots of objects from the heap in XNA, because that causes bad performance. But since Vector2 is a struct, we're not allocating anything on the heap here, so that shouldn't be the problem here.
Now, if you have tight loop, like you do, in a performance-critical application, like a game, you will always have to think about performance, there is no going around that.
If we look at the code for LevelPosition, you call the getter for WorldPosition twice and probably some more getters. The getter for WorldPosition probably calls few other getters. (It's hard to say what exactly is going on without having the source, because getter call and field access look exactly the same.)
Call to a getter, which is actually just a call to a special method, is usually pretty fast and can be even faster if the compiler decides to use inlining. But all the calls add up together, especially if you call them in a loop.
The solution for this is some sort of caching. One option would be to make LevelPosition a field and devise a system to update it when necessary. This could work, but it could also actually hurt performance if you need to update it more often than you read it.
Another solution is, as you discovered, to cache the result in a local variable. If you know that this is correct, i.e. that the value of the property won't change during the execution of the loop, then that's awesome! You solved your performance problem and you did it with only a single line of code that's easily understandable to any programmer. What more do you want?
Let me restate that. You found a solution to your performance problem that:
works
is simple to implement
is easy to understand
I think such a solution would be very hard to beat.
Creating many objects in a loop may be an expensive operation (*). Maybe it would help to create the Vector2 in advance (for example when the coordinates change) and from then on just change its coordinates.
Example:
public class Character
{
private Vector2 m_worldPosition = new Vector2(0, 0);
private Vector2 m_levelPosition = new Vector2(0, 0);
....
public Vector2 WorldPosition
{
get
{
m_worldPosition.X = ...;
m_worldPosition.Y = ...;
return m_worldPosition;
}
}
public Vector2 LevelPosition
{
get
{
m_levelPosition.X = ...;
m_levelPosition.Y = ...;
return m_levelPosition;
}
}
}
EDIT
The same should be done for the LevelPosition property as well. See modified source code.
(*)
Tim Schmelter pointed me to this question with a detailed discussion about the impact of instantiating objects. I have rephrased my initial sentence that object creation is always expensive. While creating objects is not always an expensive operation, it may still slow down performance in certain cases.
You can make a private field to store the value and not compute it each time. You can make a method to update the private fields and subscribe for the Movement.Position changes in some way. This way the value will be computed only once when position changes.
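One possible shape for this, as a sketch only: it assumes the position is changed through a setter on the character itself (the original code goes through Movement.Position, so the subscription would look different there), and it uses System.Numerics.Vector2 in place of the XNA Vector2. CachedCharacter and RecomputeCount are made-up names; the counter exists only to show how often the calculation runs.

```csharp
using System.Numerics;

// Hypothetical variant of Character that recomputes LevelPosition only
// when the pixel position actually changes.
public sealed class CachedCharacter
{
    private readonly float _tileWidth, _tileHeight, _levelWidth, _levelHeight;
    private Vector2 _position;
    private Vector2 _levelPosition;

    public int RecomputeCount { get; private set; }

    public CachedCharacter(float tileWidth, float tileHeight, float levelWidth, float levelHeight)
    {
        _tileWidth = tileWidth;
        _tileHeight = tileHeight;
        _levelWidth = levelWidth;
        _levelHeight = levelHeight;
    }

    public Vector2 Position
    {
        get { return _position; }
        set
        {
            if (value == _position) return;  // nothing changed, keep the cache
            _position = value;
            Recompute();
        }
    }

    // Reading this is now just a field read, however often it is called.
    public Vector2 LevelPosition
    {
        get { return _levelPosition; }
    }

    private void Recompute()
    {
        float worldX = _position.X / _tileWidth;
        float worldY = _position.Y / _tileHeight;
        _levelPosition = new Vector2(worldX % _levelWidth, worldY % _levelHeight);
        RecomputeCount++;
    }
}
```

Whether this beats a local variable in front of the loop depends on how often the position changes compared to how often LevelPosition is read, so it is worth measuring both.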
I'm working on a game and am in the middle of a bit of AI number crunching and I want to optimize the code as much as I can. I have several structs in my code like Circle, Vector, etc. I want to reduce to a minimum the load on the GC as a result of this code since it will run many times a second and in each invocation will generate several large arrays and perform a lot of computations.
My question is, if I have a small method that can "return" multiple value types (i.e intersection of circle and vector, etc), what is the most performant way to transfer its result back to the main method?
I can think of a few ways to do it:
Return an array of the results i.e Vector[ ]
Return a List<Vector>
Pass in a List<Vector>
Any other way..?
What would be the best strategy to avoid a lot of small unnecessary objects / arrays on the heap that the GC then has to collect?
If you're in a situation where:
You're calling this method very frequently
You'll always be dealing with the same size of collection (or similar, at least)
You don't need the previous results by the time you next call it
... then you may be able to improve performance by repeatedly passing in the same array (or List<T>) and modifying it. On the other hand, you should absolutely measure performance before making any changes. You should also determine what's going to be "good enough" so you don't bend your code away from the most natural design any more than you have to.
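As an illustration of the "pass in the same list and modify it" approach, here is a small sketch that writes the intersection points of a line segment and a circle into a caller-owned list. SegmentCircle.Intersect is a hypothetical helper, and System.Numerics.Vector2 stands in for whatever vector struct the game uses.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public static class SegmentCircle
{
    // Fills 'results' with the points where segment [a, b] crosses the circle
    // and returns how many there are. The list is cleared, not reallocated,
    // so its backing array is reused from call to call.
    public static int Intersect(Vector2 a, Vector2 b, Vector2 center, float radius, List<Vector2> results)
    {
        results.Clear();

        Vector2 d = b - a;       // segment direction
        Vector2 f = a - center;  // from circle center to segment start
        float qa = Vector2.Dot(d, d);
        if (qa == 0f) return 0;  // degenerate segment, treated as no hit

        // Solve |a + t*d - center|^2 = radius^2 for t.
        float qb = 2f * Vector2.Dot(f, d);
        float qc = Vector2.Dot(f, f) - radius * radius;
        float disc = qb * qb - 4f * qa * qc;
        if (disc < 0f) return 0;  // the line misses the circle

        float root = (float)Math.Sqrt(disc);
        float t1 = (-qb - root) / (2f * qa);
        float t2 = (-qb + root) / (2f * qa);

        // Keep only the solutions that lie on the segment (0 <= t <= 1).
        if (t1 >= 0f && t1 <= 1f) results.Add(a + d * t1);
        if (root > 0f && t2 >= 0f && t2 <= 1f) results.Add(a + d * t2);
        return results.Count;
    }
}
```

The caller creates one List<Vector2> up front (for example with a capacity of 2) and hands the same instance to every call, so the inner loop produces no new heap objects for the GC to collect.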
This depends a lot on the type of data being passed. Most of the time games use structs for vectors, intersection data, etc...
When the data are structs, you should avoid List<T> and passing by value, because the data is copied each time. But this depends a lot on the code; sometimes passing by value might be faster, sometimes not. I would run some performance tests. You can use this method for simple tests without a profiler:
public static Int64 MeasureTime(Action myAction)
{
var stopWatch = new Stopwatch();
stopWatch.Start();
myAction();
stopWatch.Stop();
return stopWatch.ElapsedMilliseconds;
}
When dealing with structs, using out or ref is often a good approach.
More information
None of these situations necessarily causes a performance issue; they just illustrate best practices I learned when working with structs. To determine whether ref and out are useful in a given case, you should run a performance test.
ref is used to avoid such situations:
Vector input = new Vector(x, y, z);
input = ModifyVector(input); // This line causes copies of the input vector and it's slow
The performance hit here depends a lot on the size of the Vector type, and it's not good practice to use the following approach every time. When returning a simple Int32 it's not necessary and should be avoided to keep the code readable.
Right way:
Vector input = new Vector(x, y, z);
ModifyVector(ref input);
Of course, the ref keyword can also be used to speed up methods that return nothing, but such methods must be careful with the data passed to them and should avoid modifying it. The speed benefit can be more than 50% in some situations (I have a high-performance vector library and I tested many cases).
out is used to avoid such situations:
Ray ray = ...;
CollisionData data = CastRay(ref ray); // Note the ref here to pass the Ray which contains 6 floats
CollisionData contains at least the point where the ray hits the ground and a normal. Using out here to get the result should be much faster.
Right way:
Ray ray = ...;
CollisionData data;
CastRay(ref ray, out data);
When using arrays..
..you should know that an array is already a reference, so you don't need the ref or out keyword to handle it. But when working with structs, you should know that the array holds the struct values themselves, not references to them. So when modifying an element of an array, you can use ref myArray[0] to pass the element to a method and modify the struct at index zero in place without copying it. Also avoid creating new instances of them.
I don't know what your platform/framework is (XNA?), but I have encountered a problem with the GC under Windows Phone 7 (7.5 got a generational GC, but I haven't tested it). To avoid the so-called GC freezes, I made a collection where I pre-load all the necessary data into a Dictionary.
But first: measure, measure, measure.
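A small sketch of the ref myArray[i] pattern, using a made-up Particle struct and ParticleOps helper. Note that this works for arrays only: the List<T> indexer returns a copy of the struct, so ref list[i] does not compile.

```csharp
// Hypothetical struct for illustration.
public struct Particle
{
    public float X, Y;
    public float VelocityX, VelocityY;
}

public static class ParticleOps
{
    // Takes the struct by reference, so the caller's storage is modified directly.
    public static void Advance(ref Particle p, float dt)
    {
        p.X += p.VelocityX * dt;
        p.Y += p.VelocityY * dt;
    }

    public static void AdvanceAll(Particle[] particles, float dt)
    {
        // 'ref particles[i]' passes the array slot itself; no struct copy is made.
        for (int i = 0; i < particles.Length; i++)
            Advance(ref particles[i], dt);
    }
}
```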