Suppose you have two structs that have exactly the same memory layout. Is it possible to do a very fast unchecked memory cast from one to the other in C#/.NET?
//my code base
[StructLayout(LayoutKind.Sequential)]
public struct VectorA
{
float x;
float y;
float z;
}
//defined by a third party library
[StructLayout(LayoutKind.Sequential)]
public struct VectorB
{
float a;
float b;
float c;
}
//somewhere else in my code
var vectorA = new VectorA();
//then calling a method from the library
MethodFromThirdPartyLibrary((VectorB)vectorA); //compiler error
Of course, it should be faster than a method that assigns the data fields and creates a new copy in memory.
Also: the 3D vector is only an example. The same problem applies to matrices (16 floats), Vector2, Vector4, ...
EDIT: Improved code with more comments and better usage example.
Why would it be faster? Would it be faster in C++ than writing the copy explicitly as in C#? Remember, you only have 3 x 32-bit numbers you want to copy from one place to another, so it's not exactly a good fit for vectorization.
It's likely if you had an array of these structures that you could get some speed up using vectorized load/stores in an unrolled loop in assembler. But you've not stated that in the question.
The main overhead here is probably the method call, rather than the assignment:
static void VecAToB(ref VectorA vectorA, ref VectorB vectorB)
{
// VectorA declares x, y, z; VectorB declares a, b, c.
vectorB.a = vectorA.x;
vectorB.b = vectorA.y;
vectorB.c = vectorA.z;
}
You might like to try:
[MethodImpl(MethodImplOptions.AggressiveInlining)]
static void VecAToB(ref VectorA vectorA, ref VectorB vectorB)
{
// VectorA declares x, y, z; VectorB declares a, b, c.
vectorB.a = vectorA.x;
vectorB.b = vectorA.y;
vectorB.c = vectorA.z;
}
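For completeness, a reinterpreting cast of the kind the question asks about can be sketched with Unsafe.As and MemoryMarshal.Cast. This is a minimal sketch, not a recommendation: it assumes both structs contain only unmanaged fields and really do share the same layout, because neither call verifies that the layouts match. The fields are made public here for illustration only, and VectorCast and AsVectorB are hypothetical names.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct VectorA { public float x, y, z; }

[StructLayout(LayoutKind.Sequential)]
public struct VectorB { public float a, b, c; }

// Hypothetical helper; the names are assumptions, not part of any library.
public static class VectorCast
{
    // Reinterprets the storage of a VectorA as a VectorB without copying fields.
    // Nothing here checks that the two layouts actually agree.
    public static ref VectorB AsVectorB(ref VectorA v)
        => ref Unsafe.As<VectorA, VectorB>(ref v);

    // Reinterprets a whole span (for example, an array) of VectorA as VectorB.
    public static Span<VectorB> AsVectorB(Span<VectorA> vs)
        => MemoryMarshal.Cast<VectorA, VectorB>(vs);
}
```

With this in place, a call such as MethodFromThirdPartyLibrary(VectorCast.AsVectorB(ref vectorA)) would still copy the struct when passing it by value, so for a single 12-byte vector the gain over explicit field assignment is likely to be negligible. The span overload is where a cast of this kind may pay off.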
Related
I need to implement a mutable polygon that behaves like a struct, that it is copied by value and changes to the copy have no side effects on the original.
Consider my attempt at writing a struct for such a type:
public unsafe struct Polygon : IEnumerable<System.Drawing.PointF>
{
private int points;
private fixed float xPoints[64];
private fixed float yPoints[64];
public PointF this[int i]
{
get => new PointF(xPoints[i], yPoints[i]);
set
{
xPoints[i] = value.X;
yPoints[i] = value.Y;
}
}
public IEnumerator<PointF> GetEnumerator()
{
return new PolygonEnumerator(ref this);
}
}
I have a requirement that a Polygon must be copied by value so it is a struct.
(Rationale: Modifying a copy shouldn't have side effects on the original.)
I would also like it to implement IEnumerable<PointF>.
(Rationale: Being able to write foreach (PointF p in poly))
As far as I am aware, C# does not allow you to override the copy/assignment behaviour for value types. If that is possible then there's the "low hanging fruit" that would answer my question.
My approach to implementing the copy-by-value behaviour of Polygon is to use unsafe and fixed arrays to allow a polygon to store up to 64 points in the struct itself, which prevents the polygon from being indirectly modified through its copies.
I am running into a problem when I go to implement PolygonEnumerator : IEnumerator<PointF> though.
Another requirement (wishful thinking) is that the enumerator will return PointF values that match the Polygon's fixed arrays, even if those points are modified during iteration.
(Rationale: Iterating over arrays works like this, so this polygon should behave in line with the user's expectations.)
public class PolygonEnumerator : IEnumerator<PointF>
{
private int position = -1;
private ??? poly;
public PolygonEnumerator(ref Polygon p)
{
// I know I need the ref keyword to ensure that the Polygon
// passed into the constructor is not a copy
// However, the class can't have a struct reference as a field
poly = ???;
}
public PointF Current => poly[position];
// the rest of the IEnumerator implementation seems straightforward to me
}
What can I do to implement the PolygonEnumerator class according to my requirements?
It seems to me that I can't store a reference to the original polygon, so I have to make a copy of its points into the enumerator itself. But that means changes to the original polygon can't be visited by the enumerator!
I am completely OK with an answer that says "It's impossible".
Maybe I've dug a hole for myself here while missing a useful language feature or conventional solution to the original problem.
Your Polygon type should not be a struct because ( 64 + 64 ) * sizeof(float) == 512 bytes. That means every value-copy operation will require a copy of 512 bytes, which is very inefficient (not least because of locality-of-reference, which strongly favours the use of objects that exist in a single location in memory).
I have a requirement that a Polygon must be copied by value so it is a struct.
(Rationale: Modifying a copy shouldn't have side effects on the original.)
Your "requirement" is wrong. Instead define an immutable class with an explicit copy operation - and/or use a mutable "builder" object for efficient construction of large objects.
I would also like it to implement IEnumerable<PointF>.
(Rationale: Being able to write foreach (PointF p in poly))
That's fine - but you hardly ever need to implement IEnumerator<T> directly yourself because C# can do it for you when using yield return (and the generated CIL is very optimized!).
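As a rough illustration of the yield return point, here is a minimal sketch of an iterator method. PointBag is a hypothetical container invented for this example (it is not the Polygon type from the question), and it uses value tuples instead of PointF to stay self-contained.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical container used only to illustrate an iterator method.
public sealed class PointBag : IEnumerable<(float X, float Y)>
{
    private readonly List<(float X, float Y)> points = new List<(float X, float Y)>();

    public void Add(float x, float y) => points.Add((x, y));

    // The compiler turns this iterator method into an IEnumerator<T>
    // implementation; no hand-written enumerator class is needed.
    public IEnumerator<(float X, float Y)> GetEnumerator()
    {
        for (int i = 0; i < points.Count; i++)
        {
            yield return points[i];
        }
    }

    // Non-generic enumerator required by IEnumerable.
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

Because the loop indexes into the list on each step, an element that is overwritten during iteration is returned with its current value, which is close to the array-like behaviour the question asks for.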
My approach to implementing the copy-by-value behaviour of Polygon is to use unsafe and fixed arrays to allow a polygon to store up to 64 points in the struct itself, which prevents the polygon from being indirectly modified through its copies.
This is not how C# should be written. unsafe should be avoided wherever possible (because it breaks the CLR's built-in guarantees and safeguards).
Another requirement (wishful thinking) is that the enumerator will return PointF values that match the Polygon's fixed arrays, even if those points are modified during iteration.
(Rationale: Iterating over arrays works like this, so this polygon should behave in line with the user's expectations.)
Who are your users/consumers in this case? If you're so concerned about not breaking user's expectations then you shouldn't use unsafe!
Consider this approach instead:
(Update: I just realised that the class Polygon I defined below is essentially just a trivial wrapper around ImmutableList<T> - so you don't even need class Polygon, so just use ImmutableList<Point> instead)
public struct Point
{
public Point( Single x, Single y )
{
this.X = x;
this.Y = y;
}
public Single X { get; }
public Single Y { get; }
// TODO: Implement IEquatable<Point>
}
public class Polygon : IEnumerable<Point>
{
private readonly ImmutableList<Point> points;
public Point this[int i] => this.points[i];
public Int32 Count => this.points.Count;
public Polygon()
{
// ImmutableList<T> has no public constructor; start from the shared empty instance.
this.points = ImmutableList<Point>.Empty;
}
private Polygon( ImmutableList<Point> points )
{
this.points = points;
}
public IEnumerator<Point> GetEnumerator()
{
return this.points.GetEnumerator();
}
// Non-generic enumerator required by IEnumerable.
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator() => this.GetEnumerator();
public Polygon AddPoint( Single x, Single y ) => this.AddPoint( new Point( x, y ) );
public Polygon AddPoint( Point p )
{
ImmutableList<Point> nextList = this.points.Add( p );
return new Polygon( points: nextList );
}
}
Is the assignment of a value type considered to be atomic in .NET?
For example, consider the following program:
struct Vector3
{
public float X { get; private set; }
public float Y { get; private set; }
public float Z { get; private set; }
public Vector3(float x, float y, float z)
{
this.X = x;
this.Y = y;
this.Z = z;
}
public Vector3 Clone()
{
return new Vector3(X, Y, Z);
}
public override String ToString()
{
return "(" + X + "," + Y + "," + Z + ")";
}
}
class Program
{
private static Vector3 pos = new Vector3(0,0,0);
private static void ReaderThread()
{
for (int i = 0; i < int.MaxValue; i++)
{
Vector3 v = pos;
Console.WriteLine(v.ToString());
Thread.Sleep(200);
}
}
private static void WriterThread()
{
for (int i = 1; i < int.MaxValue; i++)
{
pos = new Vector3(i, i, i);
Thread.Sleep(200);
}
}
static void Main(string[] args)
{
Thread w = new Thread(WriterThread);
Thread r = new Thread(ReaderThread);
w.Start();
r.Start();
}
}
Can a program like this suffer from a High-Level data race? Or even a Data Race?
What I want to know here is: is there any possibility that v will either contain:
Garbage values due to a possible data race
Mixed components X, Y or Z that refer to both pos before assignment and pos after assignment. For example, if pos = (1,1,1) and then pos is assigned the new value of (2,2,2), can v = (1,2,2)?
Structs are value types. If you assign a struct to a variable/field/method parameter, the whole struct content will be copied from the source storage location to the storage location of the variable/field/method parameter (the storage location in each case being the size of the struct itself).
Copying a struct is not guaranteed to be an atomic operation. As written in the C# language specification:
Atomicity of variable references
Reads and writes of the following data types are atomic: bool, char,
byte, sbyte, short, ushort, uint, int, float, and reference types. In
addition, reads and writes of enum types with an underlying type in
the previous list are also atomic. Reads and writes of other types,
including long, ulong, double, and decimal, as well as user-defined
types, are not guaranteed to be atomic. Aside from the library
functions designed for that purpose, there is no guarantee of atomic
read-modify-write, such as in the case of increment or decrement.
So yes, it can happen that while one thread is in the process of copying the data from a struct storage location, another thread comes along and starts copying new data from another struct to that storage location. The thread copying from the storage location thus can end up copying a mix of old and new data.
As a side note, your code can also suffer from other concurrency problems due to how one of your threads writes to a variable that is read by another thread. (An answer by user acelent to another question explains this rather well in technical detail, so I will just refer to it: https://stackoverflow.com/a/46695456/2819245) You can avoid such problems by wrapping every access to such "thread-crossing" variables in a lock block. As an alternative to lock, and for basic data types, you could also use the methods provided by the Interlocked class to access thread-crossing variables/fields in a thread-safe manner. (Mixing lock and Interlocked methods for the same thread-crossing variable is not a good idea, though.)
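A minimal sketch of the lock approach, assuming a small holder class (SharedPosition is a hypothetical name, and Vector3 is simplified to public fields for brevity). The idea is that every read and every write of the shared struct goes through the same lock object, so a reader cannot observe a half-written value.

```csharp
public struct Vector3
{
    public float X, Y, Z;

    public Vector3(float x, float y, float z)
    {
        X = x;
        Y = y;
        Z = z;
    }
}

// Hypothetical holder; the name and shape are assumptions for illustration.
public sealed class SharedPosition
{
    private readonly object gate = new object();
    private Vector3 pos;

    public Vector3 Read()
    {
        // The whole struct is copied while the lock is held.
        lock (gate) { return pos; }
    }

    public void Write(Vector3 value)
    {
        // Writers take the same lock, so the copy in Read cannot interleave with this store.
        lock (gate) { pos = value; }
    }
}
```

In the program from the question, the writer thread would call Write(new Vector3(i, i, i)) and the reader thread would call Read() instead of touching the static field directly.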
In C / Objective-C, when I have a lot of data in an array of structs and need to do repeatable things with various indexes of the array, I can pass the address of the element and access its members using ->. To simplify things, say I have this struct.
typedef struct _Particle {
float x;
float y;
float z;
int somethingCool;
} Particle;
Particle particles[100];
I had a function/method like this:
-(void) resetParticle: (Particle *) thisParticle {
thisParticle->x = 0;
thisParticle->y = 0;
thisParticle->z = 0;
thisParticle->somethingCool = 1234;
}
Then I could call it like this:
[self resetParticle:& particles[20]];
How do I replicate this in C#? I want to know how to do it with array of structs for particle systems.
But also in C# I'm using Vector3[] arrays for procedural meshes (Unity), which I think is an object and not a struct. I keep having to type the same code over and over to build quads. Seems like it should go in a method, but I don't want to copy data all over the place times several thousand indexes per frame.
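One way this is commonly approached in C# is a ref parameter: indexing into an array of structs yields the element's storage location, so ref particles[20] passes the element itself rather than a copy. The sketch below is a minimal illustration under that assumption; ParticleOps is a hypothetical name. Unity's Vector3 is itself a struct, so the same pattern should apply to a Vector3[] as well, although it does not work through the List<T> indexer, which returns a copy.

```csharp
public struct Particle
{
    public float X, Y, Z;
    public int SomethingCool;
}

// Hypothetical helper class; the name is an assumption for illustration.
public static class ParticleOps
{
    // 'ref' makes p an alias for the caller's array element, not a copy of it.
    public static void ResetParticle(ref Particle p)
    {
        p.X = 0;
        p.Y = 0;
        p.Z = 0;
        p.SomethingCool = 1234;
    }
}
```

The call site then mirrors the Objective-C version: ParticleOps.ResetParticle(ref particles[20]);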
I need to return a list of points I have from a C DLL to a C# application using P/Invoke. These are points in 3 dimensions [x,y,z]. The number of points varies depending on what kind of model it is. In C, I handle this with a linked list of structs. But I don't see how I can pass this on to C#.
The way I see it, I have to return a flexible two dimensional array, probably in a struct.
Any suggestions to how this can be done? Both ideas on how to return it in C and how to access it in C# are highly appreciated.
A linked list of structs could be passed back, but it would be quite a hassle to deal with, as you would have to write code to loop through the pointers, reading and copying the data from native memory into managed memory space. I would recommend a simple array of structs instead.
If you have a C struct like the following (assuming 32-bit ints)...
struct Point
{
int x;
int y;
int z;
};
... then you'd represent it nearly the same way in C#:
[StructLayout(LayoutKind.Sequential)]
struct Point
{
public int x;
public int y;
public int z;
}
Now to pass an array back, it would be easiest to have your native code allocate the array and pass it back as a pointer, along with another pointer specifying the size in elements.
Your C prototype might look like this:
// Return value would represent an error code
// (in case something goes wrong or the caller
// passes some invalid pointer, e.g. a NULL).
// Caller must pass in a valid pointer-to-pointer to
// capture the array and a pointer to capture the size
// in elements.
int GetPoints(Point ** array, int * arraySizeInElements);
The P/Invoke declaration would then be this:
[DllImport("YourLib.dll")]
static extern int GetPoints(
[MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 1)] out Point[] array,
out int arraySizeInElements);
The MarshalAs attribute specifies that the array should be marshaled using the size specified in the second parameter (you can read more about this at MSDN, "Default Marshaling for Arrays").
If you use this approach, note that you must use CoTaskMemAlloc to allocate the native buffer as this is what the .NET marshaler expects. Otherwise, you will get memory leaks and/or other errors in your application.
Here is a snippet from the simple example I compiled while verifying my answer:
struct Point
{
int x;
int y;
int z;
};
extern "C"
int GetPoints(Point ** array, int * arraySizeInElements)
{
// Always return 3 items for this simple example.
*arraySizeInElements = 3;
// MUST use CoTaskMemAlloc to allocate (from ole32.dll)
int bytesToAlloc = sizeof(Point) * (*arraySizeInElements);
Point * a = static_cast<Point *>(CoTaskMemAlloc(bytesToAlloc));
*array = a;
Point p1 = { 1, 2, 3 };
a[0] = p1;
Point p2 = { 4, 5, 6 };
a[1] = p2;
Point p3 = { 7, 8, 9 };
a[2] = p3;
return 0;
}
The managed caller can then deal with the data very simply (in this example, I put all the interop code inside a static class called NativeMethods):
NativeMethods.Point[] points;
int size;
int result = NativeMethods.GetPoints(out points, out size);
if (result == 0)
{
Console.WriteLine("{0} points returned.", size);
foreach (NativeMethods.Point point in points)
{
Console.WriteLine("({0}, {1}, {2})", point.x, point.y, point.z);
}
}
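For comparison, the manual copying mentioned at the start of this answer can be sketched as follows. It assumes the native function were instead declared to hand back a raw pointer (for example, via an out IntPtr parameter) together with the element count. PointMarshaling and ReadPoints are hypothetical names, and the caller would remain responsible for freeing the native buffer with the deallocator that matches however it was allocated.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper; the names are assumptions for illustration.
public static class PointMarshaling
{
    [StructLayout(LayoutKind.Sequential)]
    public struct Point
    {
        public int x;
        public int y;
        public int z;
    }

    // Copies 'count' Point structs from native memory into a managed array.
    public static Point[] ReadPoints(IntPtr nativeArray, int count)
    {
        var result = new Point[count];
        int stride = Marshal.SizeOf<Point>();
        for (int i = 0; i < count; i++)
        {
            // Step through the native buffer one struct at a time.
            IntPtr element = IntPtr.Add(nativeArray, i * stride);
            result[i] = Marshal.PtrToStructure<Point>(element);
        }
        return result;
    }
}
```

This is more code than the LPArray declaration above and copies element by element, which is why the automatic marshaling is usually the more convenient choice when it applies.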
This is just to satisfy my own curiosity.
Is there an implementation of this:
float InvSqrt (float x)
{
float xhalf = 0.5f*x;
int i = *(int*)&x;
i = 0x5f3759df - (i>>1);
x = *(float*)&i;
x = x*(1.5f - xhalf*x*x);
return x;
}
in C#? If it exists, post the code.
I guess I should have mentioned I was looking for a "safe" implementation... Either way, the BitConverter code solves the problem. The union idea is interesting. I'll test it and post my results.
Edit:
As expected, the unsafe method is the quickest, followed by using a union (inside the function), followed by the BitConverter. The functions were executed 10000000 times, and I used the System.Diagnostics.Stopwatch class for timing. The results of the calculations are shown in brackets.
Input: 79.67
BitConverter Method: 00:00:01.2809018 (0.1120187)
Union Method: 00:00:00.6838758 (0.1120187)
Unsafe Method: 00:00:00.3376401 (0.1120187)
For completeness, I tested the built-in Math.Pow method, and the "naive" method (1/Sqrt(x)).
Math.Pow(x, -0.5): 00:00:01.7133228 (0.112034710535584)
1 / Math.Sqrt(x): 00:00:00.3757084 (0.1120347)
The difference between 1 / Math.Sqrt() and the unsafe method is so small that I don't think one needs to resort to the Unsafe Fast InvSqrt() method in C# (or any other unsafe method), unless one really needs to squeeze out that last bit of juice from the CPU. 1 / Math.Sqrt() is also much more accurate.
You should be able to use the StructLayout and FieldOffset attributes to fake a union for plain old data like floats and ints.
[StructLayout(LayoutKind.Explicit, Size=4)]
private struct IntFloat {
[FieldOffset(0)]
public float floatValue;
[FieldOffset(0)]
public int intValue;
// redundant assignment to avoid any complaints about uninitialized members
IntFloat(int x) {
floatValue = 0;
intValue = x;
}
IntFloat(float x) {
intValue = 0;
floatValue = x;
}
public static explicit operator float (IntFloat x) {
return x.floatValue;
}
public static explicit operator int (IntFloat x) {
return x.intValue;
}
public static explicit operator IntFloat (int i) {
return new IntFloat(i);
}
public static explicit operator IntFloat (float f) {
return new IntFloat(f);
}
}
Then translating InvSqrt is easy.
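A minimal sketch of that translation, assuming a trimmed-down union struct declared inside a hypothetical FastMath class (the conversion operators from the struct above are left out to keep the example short):

```csharp
using System.Runtime.InteropServices;

// Hypothetical wrapper class; the name is an assumption for illustration.
public static class FastMath
{
    // Two fields at offset 0 overlap, emulating a C union of float and int.
    [StructLayout(LayoutKind.Explicit, Size = 4)]
    private struct IntFloat
    {
        [FieldOffset(0)] public float FloatValue;
        [FieldOffset(0)] public int IntValue;
    }

    public static float InvSqrt(float x)
    {
        float xhalf = 0.5f * x;
        IntFloat u = default(IntFloat);
        u.FloatValue = x;                             // store the float bits
        u.IntValue = 0x5f3759df - (u.IntValue >> 1);  // initial guess on the raw bits
        float y = u.FloatValue;                       // read the bits back as a float
        return y * (1.5f - xhalf * y * y);            // one Newton step, as in the C code
    }
}
```

This follows the same steps as the C function in the question, with the pointer casts replaced by reads and writes of the overlapping fields.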
Use BitConverter if you want to avoid unsafe code.
float InvSqrt(float x)
{
float xhalf = 0.5f * x;
int i = BitConverter.SingleToInt32Bits(x);
i = 0x5f3759df - (i >> 1);
x = BitConverter.Int32BitsToSingle(i);
x = x * (1.5f - xhalf * x * x);
return x;
}
The code above uses new methods introduced in .NET Core 2.0. For .NET Framework, you have to fall back to the following (which performs allocations):
float InvSqrt(float x)
{
float xhalf = 0.5f * x;
int i = BitConverter.ToInt32(BitConverter.GetBytes(x), 0);
i = 0x5f3759df - (i >> 1);
x = BitConverter.ToSingle(BitConverter.GetBytes(i), 0);
x = x * (1.5f - xhalf * x * x);
return x;
}
Otherwise, the C# code is exactly the same as the C code you gave, except that the method needs to be marked as unsafe:
unsafe float InvSqrt(float x) { ... }
Definitely possible in unsafe mode. Note that even though in the Quake 3 source code the constant 0x5f3759df was used, numerical research showed that the constant 0x5f375a86 actually yields better results for Newton Approximations.
I don't see why it wouldn't be possible using the unsafe compiler option.