Implementing IEnumerator<T> for Fixed Arrays - c#

I need to implement a mutable polygon that behaves like a struct, in that it is copied by value and changes to the copy have no side effects on the original.
Consider my attempt at writing a struct for such a type:
public unsafe struct Polygon : IEnumerable<System.Drawing.PointF>
{
private int points;
private fixed float xPoints[64];
private fixed float yPoints[64];
public PointF this[int i]
{
get => new PointF(xPoints[i], yPoints[i]);
set
{
xPoints[i] = value.X;
yPoints[i] = value.Y;
}
}
public IEnumerator<PointF> GetEnumerator()
{
return new PolygonEnumerator(ref this);
}
}
I have a requirement that a Polygon must be copied by value so it is a struct.
(Rationale: Modifying a copy shouldn't have side effects on the original.)
I would also like it to implement IEnumerable<PointF>.
(Rationale: Being able to write foreach (PointF p in poly))
As far as I am aware, C# does not allow you to override the copy/assignment behaviour for value types. If that is possible then there's the "low hanging fruit" that would answer my question.
My approach to implementing the copy-by-value behaviour of Polygon is to use unsafe and fixed arrays to allow a polygon to store up to 64 points in the struct itself, which prevents the polygon from being indirectly modified through its copies.
I am running into a problem when I go to implement PolygonEnumerator : IEnumerator<PointF> though.
Another requirement (wishful thinking) is that the enumerator will return PointF values that match the Polygon's fixed arrays, even if those points are modified during iteration.
(Rationale: Iterating over arrays works like this, so this polygon should behave in line with the user's expectations.)
public class PolygonEnumerator : IEnumerator<PointF>
{
private int position = -1;
private ??? poly;
public PolygonEnumerator(ref Polygon p)
{
// I know I need the ref keyword to ensure that the Polygon
// passed into the constructor is not a copy
// However, the class can't have a struct reference as a field
poly = ???;
}
public PointF Current => poly[position];
// the rest of the IEnumerator implementation seems straightforward to me
}
What can I do to implement the PolygonEnumerator class according to my requirements?
It seems to me that I can't store a reference to the original polygon, so I have to make a copy of its points into the enumerator itself. But that means changes to the original polygon can't be visited by the enumerator!
I am completely OK with an answer that says "It's impossible".
Maybe I've dug a hole for myself here while missing a useful language feature or conventional solution to the original problem.

Your Polygon type should not be a struct because ( 64 + 64 ) * sizeof(float) == 512 bytes. That means every value-copy operation will require a copy of 512 bytes, which is very inefficient (not least because of locality-of-reference, which strongly favours the use of objects that exist in a single location in memory).
I have a requirement that a Polygon must be copied by value so it is a struct.
(Rationale: Modifying a copy shouldn't have side effects on the original.)
Your "requirement" is wrong. Instead define an immutable class with an explicit copy operation - and/or use a mutable "builder" object for efficient construction of large objects.
I would also like it to implement IEnumerable<PointF>.
(Rationale: Being able to write foreach (PointF p in poly))
That's fine - but you hardly ever need to implement IEnumerator<T> directly yourself because C# can do it for you when using yield return (and the generated CIL is very optimized!).
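As a rough illustration of the yield return approach (this is not the asker's Polygon; PointBuffer and its members are hypothetical names chosen for the sketch), a class that keeps its coordinates in two arrays could expose an iterator like this. Because the iterator reads the arrays lazily, a point that is changed during iteration is seen with its new value if it has not been visited yet.

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Drawing;

// Hypothetical container, used only to illustrate yield return.
public sealed class PointBuffer : IEnumerable<PointF>
{
    private readonly float[] xs;
    private readonly float[] ys;

    public PointBuffer(int count)
    {
        xs = new float[count];
        ys = new float[count];
    }

    public int Count => xs.Length;

    public PointF this[int i]
    {
        get => new PointF(xs[i], ys[i]);
        set { xs[i] = value.X; ys[i] = value.Y; }
    }

    // The compiler generates the IEnumerator<PointF> state machine for us.
    public IEnumerator<PointF> GetEnumerator()
    {
        for (int i = 0; i < xs.Length; i++)
        {
            // Read at the moment of the yield, not copied up front.
            yield return new PointF(xs[i], ys[i]);
        }
    }

    // The non-generic overload is required by IEnumerable<T>.
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

With this shape, foreach (PointF p in buffer) works without a hand-written enumerator class.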
My approach to implementing the copy-by-value behaviour of Polygon is to use unsafe and fixed arrays to allow a polygon to store up to 64 points in the struct itself, which prevents the polygon from being indirectly modified through its copies.
This is not how C# should be written. unsafe should be avoided wherever possible (because it breaks the CLR's built-in guarantees and safeguards).
Another requirement (wishful thinking) is that the enumerator will return PointF values that match the Polygon's fixed arrays, even if those points are modified during iteration.
(Rationale: Iterating over arrays works like this, so this polygon should behave in line with the user's expectations.)
Who are your users/consumers in this case? If you're so concerned about not breaking users' expectations then you shouldn't use unsafe!
Consider this approach instead:
(Update: I just realised that the class Polygon I defined below is essentially just a trivial wrapper around ImmutableList<T> - so you don't even need class Polygon, so just use ImmutableList<Point> instead)
public struct Point
{
public Point( Single x, Single y )
{
this.X = x;
this.Y = y;
}
public Single X { get; }
public Single Y { get; }
// TODO: Implement IEquatable<Point>
}
// Requires: using System; using System.Collections; using System.Collections.Generic; using System.Collections.Immutable;
public class Polygon : IEnumerable<Point>
{
private readonly ImmutableList<Point> points;
public Point this[int i] => this.points[i];
public Int32 Count => this.points.Count;
public Polygon()
{
// ImmutableList<T> has no public constructor; start from the shared empty instance.
this.points = ImmutableList<Point>.Empty;
}
private Polygon( ImmutableList<Point> points )
{
this.points = points;
}
public IEnumerator<Point> GetEnumerator()
{
return this.points.GetEnumerator();
}
// The non-generic overload is required by IEnumerable<T>.
IEnumerator IEnumerable.GetEnumerator() => this.GetEnumerator();
public Polygon AddPoint( Single x, Single y ) => this.AddPoint( new Point( x, y ) );
public Polygon AddPoint( Point p )
{
// Add returns a new list; the current instance is left unchanged.
ImmutableList<Point> nextList = this.points.Add( p );
return new Polygon( points: nextList );
}
}
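To make the "no side effects on the original" property concrete, here is a small sketch that uses ImmutableList<T> directly, as the update above suggests. The class name ImmutableListDemo and the use of value tuples for points are assumptions made for the example.

```csharp
using System.Collections.Immutable;

public static class ImmutableListDemo
{
    public static (int OriginalCount, int CopyCount, float OriginalFirstX) Run()
    {
        var original = ImmutableList.Create((X: 0f, Y: 0f), (X: 1f, Y: 0f));

        // Add and SetItem each return a new list; 'original' is never modified.
        var copy = original.Add((X: 1f, Y: 1f)).SetItem(0, (X: 5f, Y: 5f));

        return (original.Count, copy.Count, original[0].X);
    }
}
```

Run() returns (2, 3, 0): the original still has two points and its first point is untouched, while the derived list has three.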

Related

How to extend/modify behaviour of vector type structs in Unity to achieve position wrapping?

I'm creating a grid based game using Vector2Ints to represent positions on the grid. The world needs to be wrapped on the x and y axes, meaning that if the world is 100x100 cells big then for an entity with position = new Vector2Int(99, 99) and displacement = new Vector2Int(1,1), we'd have newPosition = position + displacement equal to Vector2Int(0, 0). The obvious way initially seemed to somehow override the set method behaviour of the Vector2Int struct in Unity because then I can continue to benefit from all the other methods on the struct like addition, multiplication with ints etc. while still getting the "wrapping" functionality with every operation. This is key so I don't have to remember to keep calling a helper function.
The way I thought about achieving this would be to somehow extend the Vector2Int so I can set a mapWidth and mapHeight, and modify the set methods to x = xIn % mapWidth and y = yIn % mapHeight.
I would appreciate suggestions on how best to achieve the above without just duplicating the code, albeit with the minor modifications, of the whole Vector2Int struct.
You can add an extension method to any type (class or struct) like so:
// the class name here doesn't matter. Just some static class.
public static class Helpers
{
public static Vector2Int Wrapped(this Vector2Int v, int wrapX, int wrapY)
{
return new Vector2Int(v.x % wrapX, v.y % wrapY);
}
}
Usage:
Vector2Int vec = new Vector2Int(15,15);
Vector2Int vecWrapped = vec.Wrapped(10, 10);
vecWrapped will now be 5, 5.
It's the "this Vector2Int" in the method parameters that makes it an extension method on that type.
EDIT:
On the question of whether it's possible to override the Set method, not directly no afaik. You can add an extension method called Set, but it can't have the same signature, ie. it can't be Set(x, y). You could add a method called something else though.
And afaik, there's no way to make it automatically 'gridify' after any operation on the vector.
As suggested by aybe, using "this ref" can make this much more usable though, removing the need to assign the return value.
public static class Helpers
{
public static int gridX = 10;
public static int gridY = 10;
public static void SetGrid(this ref Vector2Int v, int x, int y)
{
v.Set(x % gridX, y % gridY);
}
}
Usage:
Vector2Int v = Vector2Int.zero;
v.SetGrid(15, 15);
v will now be 5, 5.
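If wrapping really has to happen on every operation, one possible workaround is a small wrapper type of your own that wraps in its constructor and operators. The sketch below is an assumption-laden illustration, not Unity API: GridPos, Width and Height are hypothetical names, and the map size is hard-coded. It also handles negative values, which a plain % does not (in C#, -1 % 100 is -1).

```csharp
using System;

// Hypothetical stand-in for a grid position; not part of Unity's API.
public readonly struct GridPos : IEquatable<GridPos>
{
    public const int Width = 100;   // assumed map size
    public const int Height = 100;  // assumed map size

    public int X { get; }
    public int Y { get; }

    public GridPos(int x, int y)
    {
        X = Mod(x, Width);
        Y = Mod(y, Height);
    }

    // % keeps the sign of the dividend, so shift negative remainders into range.
    private static int Mod(int value, int size) => ((value % size) + size) % size;

    // Every arithmetic result goes through the wrapping constructor.
    public static GridPos operator +(GridPos a, GridPos b) => new GridPos(a.X + b.X, a.Y + b.Y);
    public static GridPos operator -(GridPos a, GridPos b) => new GridPos(a.X - b.X, a.Y - b.Y);

    public bool Equals(GridPos other) => X == other.X && Y == other.Y;
    public override bool Equals(object obj) => obj is GridPos other && Equals(other);
    public override int GetHashCode() => unchecked((X * 397) ^ Y);
    public override string ToString() => $"({X}, {Y})";
}
```

With these assumptions, new GridPos(99, 99) + new GridPos(1, 1) equals new GridPos(0, 0). In a Unity project you would probably also want conversions to and from Vector2Int, which are left out here.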

Should I use Int32[,] or System.Drawing.Point when all I want is the x,y coordinates?

I am building an app that lets me control my Android devices from my PC. It's running great so now I want to start cleaning up my code for release. I'm trying to clean up solution references that I don't need so I took a look at the using System.Drawing; that I have for implementing the Point class. The thing is, I don't really need it if I switch to using a two-dimensional Int32 array.
So I could have: new Int32[,] {{200, 300}}; instead of new Point(200, 300); and get rid of the System.Drawing namespace altogether. The question is: does it really matter? Am I realistically introducing bloat in my app by keeping the System.Drawing namespace? Is Int32[,] meaningfully more lightweight?
Or, should I not use either and just keep track of the x,y coordinates in individual Int32 variables?
EDIT: I got rid of the original idea I wrote: Int32[200, 300] and replaced it with new Int32[,] {{200, 300}}; because as @Martin Mulder pointed out Int32[200, 300] "creates a two-dimensional array with 60000 integers, all of them are 0."
EDIT2: So I'm dumb. First of all I was trying to fancify too much by using the multi-dimensional array. Utter, overboard silliness. Secondly, I took the advice to use a struct and it all worked flawlessly, so thank you to the first four answers; every one of them was correct. But, after all that, I couldn't end up removing the System.Drawing reference because I was working on a WinForms app and the System.Drawing is being used all over in the designer of the app! I suppose I could further refactor it but I got the size down to 13KB so it's good enough. Thank you all!
Just create your own:
public struct Point : IEquatable<Point>
{
private int _x;
private int _y;
public int X
{
get { return _x; }
set { _x = value; }
}
public int Y
{
get { return _y; }
set { _y = value; }
}
public Point(int x, int y)
{
_x = x;
_y = y;
}
public bool Equals(Point other)
{
return X == other.X && Y == other.Y;
}
public override bool Equals(object other)
{
return other is Point && Equals((Point)other);
}
public override int GetHashCode()
{
return unchecked(X * 1021 + Y);
}
}
Better yet, make it immutable (make the fields readonly and remove the setters), though if you'd depended on the mutability of the two options you consider in your question then that'll require more of a change to how you do things. But really, immutability is the way to go here.
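A minimal sketch of what the immutable variant could look like, assuming C# 7.2 or later for readonly struct. The type is named ImmutablePoint here only to keep it distinct from the mutable version above, and WithX/WithY are hypothetical helpers, not something the answer prescribes.

```csharp
using System;

public readonly struct ImmutablePoint : IEquatable<ImmutablePoint>
{
    public int X { get; }
    public int Y { get; }

    public ImmutablePoint(int x, int y)
    {
        X = x;
        Y = y;
    }

    // "Changing" a coordinate produces a new value instead of mutating this one.
    public ImmutablePoint WithX(int x) => new ImmutablePoint(x, Y);
    public ImmutablePoint WithY(int y) => new ImmutablePoint(X, y);

    public bool Equals(ImmutablePoint other) => X == other.X && Y == other.Y;
    public override bool Equals(object obj) => obj is ImmutablePoint other && Equals(other);
    public override int GetHashCode() => unchecked(X * 1021 + Y);
}
```

Code that previously wrote p.X = 5 would instead write p = p.WithX(5), which is the "change to how you do things" mentioned above.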
What you are suggesting is very ill-advised:
new Point(200, 300) creates a new point with two integers: The X and Y property with values 200 and 300.
new Int32[200,300] creates a two-dimensional array with 60000 integers, all of them are 0.
(After your edit) new Int32[,] {{200, 300}} also creates a two-dimensional array, this time with 2 integers. To retrieve the first value (200), you can access it like this: array[0,0] and the second value (300) like array[0,1]. The second dimension is not required or needed or desired.
If you want to get rid of the reference to the library, there are a few other options:
1. new Int32[] {200, 300} creates a one-dimensional array of two integers with values 200 and 300. You can access them with array[0] and array[1].
2. As Ron Beyer suggested, you could use Tuple<int, int>.
3. Create your own Point struct (pointed out by Jon Hanna). It makes your application a bit larger, but you avoid the reference and you prevent the System.Drawing library from being loaded into memory.
If I wanted to remove that reference, I would go for the last option since it makes clearer what I am doing (a Point is more readable than an Int32 array or a Tuple). Solutions 2 and 3 are slightly faster than solution 1.
Nothing gets "embedded" in your application by just referencing a library. However, if the Point class really is all you need, you could just remove the reference and implement your own Point struct. That may be more intuitive to read than an int array.
Int32[,] is something different by the way. It's a two-dimensional array, not a pair of two int values. You'll be making things worse by using that.
You could use Tuple<int, int>, but I'd go for creating your own structure.
Some people have already suggested implementations here. To simply wrap your two integers, I'd just use this:
public class MyPoint
{
public int X;
public int Y;
}
Add all other features only if needed.
As @Glorin Oakenfoot said, you should implement your own Point class. Here's an example:
public class MyPoint // Give it a unique name to avoid collisions
{
public int X { get; set; }
public int Y { get; set; }
public MyPoint() {} // Default constructor allows you to use object initialization.
public MyPoint(int x, int y) { X = x; Y = y; }
}

Substitute the GetHashCode() Method of System.Drawing.Point

System.Drawing.Point has a really, really bad GetHashCode method if you intend to use it to describe 'pixels' in an Image/Bitmap: it is just an XOR between the X and Y coordinates.
So for an image with, say, 2000x2000 size, it has an absurd number of collisions, since only the numbers in the diagonal are going to have a decent hash.
It's quite easy to create a decent GetHashCode method using unchecked multiplication, as some people already mentioned here.
But what can I do to use this improved GetHashCode method in a HashSet?
I know I could create my own class/struct MyPoint and implement it using this improved methods, but then I'd break all other pieces of code in my project that use a System.Drawing.Point.
Is it possible to "overwrite" the method from System.Drawing.Point using some sort of extension method or the like? Or to "tell" the HashSet to use another function instead of the GetHashCode?
Currently I'm using a SortedSet<System.Drawing.Point> with a custom IComparer<Point> to store my points. When I want to know if the set contains a Point I call BinarySearch. It's faster than the HashSet<System.Drawing.Point>.Contains method in a set with 10000 collisions, but it's not as fast as a HashSet with a good hash could be.
You can create your own class that implements IEqualityComparer<Point>, then give that class to the HashSet constructor.
Example:
public class MyPointEqualityComparer : IEqualityComparer<Point>
{
public bool Equals(Point p1, Point p2)
{
return p1 == p2; // defer to Point's existing operator==
}
public int GetHashCode(Point obj)
{
return /* your favorite hashcode function here */;
}
}
class Program
{
static void Main(string[] args)
{
// Create hashset with custom hashcode algorithm
HashSet<Point> myHashSet = new HashSet<Point>(new MyPointEqualityComparer());
// Same thing also works for dictionary
Dictionary<Point, string> myDictionary = new Dictionary<Point, string>(new MyPointEqualityComparer());
}
}
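For the placeholder hash function above, one possible choice is sketched below. It is an assumption, not the answerer's code: the class name and the multiplier 65537 are picked for illustration. For coordinates in the range 0 to 32767 the expression X * 65537 + Y does not overflow and gives a distinct value for every (X, Y) pair, which comfortably covers a 2000x2000 image.

```csharp
using System.Collections.Generic;
using System.Drawing;

// Hypothetical comparer name; pass an instance to the HashSet<Point> constructor.
public sealed class PixelPointComparer : IEqualityComparer<Point>
{
    public bool Equals(Point a, Point b) => a.X == b.X && a.Y == b.Y;

    public int GetHashCode(Point p)
    {
        // unchecked: let the arithmetic wrap instead of throwing for large coordinates.
        unchecked
        {
            return p.X * 65537 + p.Y;
        }
    }
}
```

Usage is the same as in the example above: new HashSet<Point>(new PixelPointComparer()).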

Serializing a dictionary in C#

I have a class named serializableVector2:
[Serializable]
class serializableVector2
{
public float x, y;
public serializableVector2(int x, int y)
{
this.x = x;
this.y = y;
}
}
and I have a struct named savedMapTile:
[Serializable]
struct savedMapTile
{
public oreInstance ore;
public int backgroundTileId;
public int playerId;
public tree tree;
}
and I have a dictionary using these two classes:
[SerializeField]
Dictionary<serializableVector2, savedMapTile> savedTiles;
I am trying to load this dictionary, modify it, and then save it again, all using serialization.
I am deserializing the dictionary like so:
FileStream f = File.Open(saveFileName, FileMode.Open);
BinaryFormatter b = new BinaryFormatter();
savedTiles = (Dictionary<serializableVector2, savedMapTile>)b.Deserialize(f);
f.Close();
and I am serializing it like so:
FileStream f = File.Open(saveFileName, FileMode.Create);
BinaryFormatter b = new BinaryFormatter();
b.Serialize(f, savedTiles);
f.Close();
However, when I try to access an element in the dictionary that I know should exist I get the following error:
System.Collections.Generic.KeyNotFoundException: The given key was not
present in the dictionary.
I get this error from running this code:
id = (savedTiles[new serializableVector2(-19,13)].backgroundTileId);
What I find really strange is that I am able to print out the entirety of the dictionaries keys and its values as well. This is where I am getting the values -19 and 13 for the Vector2. I print the keys and values like so:
for (int i = 0; i < 100; i++ )
{
UnityEngine.Debug.Log(vv[i].x +" "+vv[i].y);
UnityEngine.Debug.Log(x[i].backgroundTileId);
}
At this point I'm really stumped; I have no clue what is going on. I can see the file being saved in Windows Explorer, I can access keys and values in the dictionary, but I can't seem to use it properly. It is also important to note that when I use the .Contains() method on the dictionary in a similar way to how I am trying to access a value, it always returns false.
This is for a Unity 5 project, using C# in visual studio running on windows 8.1.
Change your serializableVector2 from a class to a struct and you should be able to find things in your dictionary. Someone may correct me if I have this wrong, but to the best of my knowledge the Dictionary calls GetHashCode on the key and uses that code to store and locate the item. If you create two instances of your class with the same x and y coordinates and call GetHashCode on each, you will see that the two instances yield different hash codes. If you change it to a struct then they will produce the same hash code. I believe this is what is causing your "Key not found" issues. On a somewhat related note, it does seem strange that the constructor takes int for x and y and then stores them as floats. You may want to consider changing the constructor to take float.
[Serializable]
struct serializableVector2
{
public float x, y;
public serializableVector2(float x, float y)
{
this.x = x;
this.y = y;
}
}
You have two issues:
Your dictionary key serializableVector2 is a class relying on the default equality and hashing methods. The defaults use reference equality such that only variables pointing to the same object will be equal and return the same hash.
If that were not the case you would still be relying on floating point equality. Unless your serialiser can guarantee precise storage and retrieval of floating point values, the deserialised serializableVector2 may NOT be equal to the original.
Suggested solution:
Override GetHashCode and Equals for your serializableVector2 class. When performing comparisons and hashing, round your floats to within 32-bit floating point precision of your expected range of values. You can rely on 6+ significant digits of precision (within the same range), so if your world is ±1000 units I believe you can safely round to 3 decimal places.
Example for GetHashCode (without testing):
public override int GetHashCode() {
return Math.Round(x,3).GetHashCode() ^ Math.Round(y,3).GetHashCode();
}
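Putting the two overrides together, a sketch of the key class could look like the following. It is untested against Unity or BinaryFormatter; the class is renamed SerializableVector2 and the Key helper is hypothetical. The 3-decimal rounding is the assumption stated above. Equals and GetHashCode both work on the same rounded values, so two keys that compare equal always hash the same.

```csharp
using System;

[Serializable]
public sealed class SerializableVector2 : IEquatable<SerializableVector2>
{
    public float x, y;

    public SerializableVector2(float x, float y)
    {
        this.x = x;
        this.y = y;
    }

    // Quantise to 3 decimal places (assumed precision) for both equality and hashing.
    private static double Key(float v) => Math.Round(v, 3);

    public bool Equals(SerializableVector2 other) =>
        other != null && Key(x) == Key(other.x) && Key(y) == Key(other.y);

    public override bool Equals(object obj) => Equals(obj as SerializableVector2);

    public override int GetHashCode() =>
        unchecked((Key(x).GetHashCode() * 397) ^ Key(y).GetHashCode());
}
```

One caveat: x and y are public mutable fields, so changing them while the object is in use as a dictionary key would make the entry unreachable.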

Efficient implementation of flyweight pattern

Background
One of the most used data-structures in our application is a custom Point struct. Recently we have been running into memory issues, mostly caused by an excessive number of instances of this struct.
Many of these instances contain the same data. Sharing a single instance would significantly help to reduce memory usage. However, since we are using structs, instances cannot be shared. It is also not possible to change it to a class, because the struct semantics are important.
Our workaround for this is to have a struct containing a single reference to a backing class, which contains the actual data. These flyweight dataclasses are stored in and retrieved from a factory to ensure no duplicates exist.
A narrowed down version of the code looks something like this:
public struct PointD
{
//Factory
private static class PointDatabase
{
private static readonly Dictionary<PointData, PointData> _data = new Dictionary<PointData, PointData>();
public static PointData Get(double x, double y)
{
var key = new PointData(x, y);
if (!_data.ContainsKey(key))
_data.Add(key, key);
return _data[key];
}
}
//Flyweight data
private class PointData
{
private double pX;
private double pY;
public PointData(double x, double y)
{
pX = x;
pY = y;
}
public double X
{
get { return pX; }
}
public double Y
{
get { return pY; }
}
public override bool Equals(object obj)
{
var other = obj as PointData;
if (other == null)
return false;
return other.X == this.X && other.Y == this.Y;
}
public override int GetHashCode()
{
return X.GetHashCode() * Y.GetHashCode();
}
}
//Public struct
private PointData _data;
public PointD(double x, double y)
{
_data = PointDatabase.Get(x, y);
}
public double X
{
get { return _data == null ? 0 : _data.X; }
set { _data = PointDatabase.Get(value, Y); }
}
public double Y
{
get { return _data == null ? 0 : _data.Y; }
set { _data = PointDatabase.Get(X, value); }
}
}
This implementation ensures that the struct semantics are maintained, while ensuring only one instance of the same data is kept around.
(Please don't mention memory leaks or such, this is simplified example code)
The Problem
Although above approach works to lower our memory usage, the performance is horrendous. A project in our application can easily contain a million different points or more. As a result, the lookup of a PointData instance is very costly. And this lookup has to be done whenever a Point is manipulated, which, as you can probably guess, is what our application is all about. As a result, this approach is not suitable for us.
As an alternative, we could make two versions of the Point class: one with backing flyweight as above, and one containing its own data (with possible duplicates). All (short-lived) calculations could be done in the second class, while when storing the Point for longer durations they could be converted to the first, memory-efficient class. However, this means that all the users of the Point class have to be inspected and adjusted to this scheme, something which is not feasible for us.
What we are looking for is an approach which meets below criteria:
When there are multiple Points with the same data, the memory usage should be lower than having a different struct instance for each of these.
Performance should not be much worse than working directly on primitive data in the struct.
Struct semantics should be maintained.
The 'Point' interface should remain the same (i.e. classes that use 'Point' should not have to be changed).
Is there any way we can improve our approach towards these criteria? Or can anyone suggest a different approach we can attempt?
Rather than re-work an entire data structure and programming model, my go-to solution for performance and memory issues is to cache, pre-fetch and, most importantly, cull your data when it is not needed.
Think of it this way. On a graph, you cannot display a few million points at once because you run out of pixels (you should occlusion-cull these points). Similarly, in a table, there isn't enough vertical space on screen (you need data set truncation). Consider streaming data from your source file as you need it. If your source data structure is not appropriate for dynamic retrieval, consider an intermediate, temporary file format. The .NET JIT compiler takes a similar on-demand approach: it compiles a method only when it is first called.
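Independently of culling, the factory in the question performs up to three hash lookups per call (ContainsKey, Add, and the indexer) and allocates a temporary PointData as the key. The sketch below shows one way to cut that down with TryGetValue and a value-tuple key. PointInterner is a hypothetical name, the code is not thread-safe, and it does not by itself solve the cost of hashing millions of entries; it only removes the redundant work per call.

```csharp
using System.Collections.Generic;

public sealed class PointData
{
    public double X { get; }
    public double Y { get; }

    public PointData(double x, double y)
    {
        X = x;
        Y = y;
    }
}

// Hypothetical replacement for the PointDatabase factory in the question.
public static class PointInterner
{
    // Keyed by a value tuple, so a lookup allocates no temporary PointData.
    private static readonly Dictionary<(double, double), PointData> Cache =
        new Dictionary<(double, double), PointData>();

    public static PointData Get(double x, double y)
    {
        var key = (x, y);
        // One lookup on a hit; a second one (Add) only on a miss.
        if (!Cache.TryGetValue(key, out PointData data))
        {
            data = new PointData(x, y);
            Cache.Add(key, data);
        }
        return data;
    }
}
```

The trade-off is that each dictionary entry now also stores the 16-byte tuple key, so this exchanges some memory for fewer lookups and allocations.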
