Value semantics in C#: struct vs Tuple

So I'm making my first steps in C# and was making a simple tile puzzle. When I was modeling the position of a tile I wanted to have value semantics. So, as far as I can see there are basically two ways of doing this, with a struct or with a Tuple.
In the case of the Tuple my code looks like this:
public class TilePosition : Tuple<int,int>
{
public int HComponent{get { return Item1; }}
public int VComponent{get { return Item2; }}
public TilePosition(int horizontalPosition, int verticalPosition)
: base(horizontalPosition, verticalPosition)
{
}
}
The struct solution would look like this:
public struct TilePosition
{
private readonly int hComponent;
private readonly int vComponent;
public int HComponent { get { return hComponent; } }
public int VComponent { get { return vComponent; } }
public TilePosition(int hComponent, int vComponent)
{
this.hComponent = hComponent;
this.vComponent = vComponent;
}
public static bool operator ==(TilePosition position1, TilePosition position2)
{
return position1.Equals(position2);
}
public static bool operator !=(TilePosition position1, TilePosition position2)
{
return !(position1 == position2);
}
}
The tuple version is more concise, but it exposes Item1 and Item2, which would be confusing in a public API even though I have wrapped them in the HComponent and VComponent properties.
The struct needs more code, and I get a compiler warning telling me to override Equals and GetHashCode because I'm overloading == and !=. But if I do that, I gain nothing from using a struct (from the semantic and syntactic point of view), because it would be exactly the same code as with a conventional class.
So are there any benefits to using a struct over a subclassed Tuple, aside from not having the noise of the Item properties?
Would both of my solutions behave the way I expect, or are there nuances I should be aware of?

(As an aside, it would be good to implement IEquatable<TilePosition> in both cases too - particularly in the struct case, to avoid boxing.)
So are there any benefits from using a struct over a subclassed Tuple aside from not having the noise of the Item properties?
Given that both are immutable, you get roughly "value semantics" in both cases, but there are still differences...
An instance of the class type requires space on the heap (assuming no escape detection etc by the CLR); a value of the struct type may in some cases only use the stack
Passing a value of the class type just means passing a reference (4 bytes or 8 bytes depending on CLR architecture); passing a value of the struct type really passes the values (so 8 bytes)
In the class type version null is a valid value; in the struct type version you'd need to use TilePosition? to indicate a possibly-absent value
In the struct version new TilePosition() is valid and will have values of 0 for both fields (and this will be the default value, e.g. for fields and array elements)
As you haven't sealed your class, someone could create a mutable subclass; it's therefore not safe for clients to assume it's fully immutable. (You should probably seal it...)
You can use your class type with any code which uses Tuple<,>, whereas that's clearly not the case for the struct type
The meaning of == will differ between the two types. Even if you overload == in the class type, a caller could still end up just comparing references. And in the struct case, you could still end up comparing boxed references, unhelpfully.
These are just differences of course - whether they count as benefits for one approach or the other depends on your requirements.
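A few of these differences can be seen in a minimal sketch. The names PositionClass, PositionStruct and DifferencesDemo are made up for illustration and are not the asker's types; each helper returns true when the behaviour described in its comment holds.

```csharp
using System;

// Hypothetical class version: a sealed Tuple subclass, so no mutable subclass can exist.
public sealed class PositionClass : Tuple<int, int>
{
    public PositionClass(int h, int v) : base(h, v) { }
}

// Hypothetical struct version with value equality.
public struct PositionStruct : IEquatable<PositionStruct>
{
    private readonly int h;
    private readonly int v;
    public PositionStruct(int h, int v) { this.h = h; this.v = v; }
    public int H { get { return h; } }
    public int V { get { return v; } }
    public bool Equals(PositionStruct other) { return h == other.h && v == other.v; }
    public override bool Equals(object obj) { return obj is PositionStruct && Equals((PositionStruct)obj); }
    public override int GetHashCode() { unchecked { return (h * 397) ^ v; } }
    public static bool operator ==(PositionStruct a, PositionStruct b) { return a.Equals(b); }
    public static bool operator !=(PositionStruct a, PositionStruct b) { return !a.Equals(b); }
}

public static class DifferencesDemo
{
    // Tuple overrides Equals, but == on the class type still compares references.
    public static bool ClassOperatorComparesReferences()
    {
        var a = new PositionClass(1, 2);
        var b = new PositionClass(1, 2);
        return a.Equals(b) && !(a == b);
    }

    // Once boxed, == no longer sees the struct's overload and compares the two boxes.
    public static bool BoxedStructOperatorComparesReferences()
    {
        object a = new PositionStruct(1, 2);
        object b = new PositionStruct(1, 2);
        return a.Equals(b) && !(a == b);
    }

    // new PositionStruct() is always valid and has both fields set to 0.
    public static bool DefaultStructIsZero()
    {
        var p = new PositionStruct();
        return p.H == 0 && p.V == 0 && p == default(PositionStruct);
    }

    // An absent struct value needs the nullable form.
    public static bool AbsentStructNeedsNullable()
    {
        PositionStruct? p = null;
        return !p.HasValue;
    }
}
```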

How about using a two-dimensional array and holding an item on each field? This would more closely follow the tile puzzle:
Tiles and space for them is a 1:1 mapping. So each tile/space can only have one space/tile.
No need to compare tile components.
Moving tiles is easy, for example:
if (fields[x, y] == null)
{
fields[x, y] = fields[oldX, oldY];
fields[oldX, oldY] = null;
}
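As a slightly fuller sketch of the same idea, the move could be wrapped in a small class. Board, TryMove and the use of string labels for tiles are assumptions made here for illustration, not part of the original suggestion.

```csharp
// Hypothetical board: each cell holds a tile label, or null when the cell is empty.
public sealed class Board
{
    private readonly string[,] fields;

    public Board(int width, int height)
    {
        fields = new string[width, height];
    }

    // Direct access to a cell, used here to set up the board.
    public string this[int x, int y]
    {
        get { return fields[x, y]; }
        set { fields[x, y] = value; }
    }

    // Moves the tile at (oldX, oldY) to (x, y) when there is a tile to move
    // and the target cell is empty; otherwise leaves the board unchanged.
    public bool TryMove(int oldX, int oldY, int x, int y)
    {
        if (fields[oldX, oldY] == null || fields[x, y] != null)
            return false;
        fields[x, y] = fields[oldX, oldY];
        fields[oldX, oldY] = null;
        return true;
    }
}
```

With this layout a position is just a pair of array indices, so no position type (and no equality code) is needed at all.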

Your best bet is to do it properly and put in all the work. If you want a struct rather than a class (which may be appropriate for you), here's a sample implementation:
public struct TilePosition: IEquatable<TilePosition>
{
public TilePosition(int horizontalPosition, int verticalPosition)
{
_x = horizontalPosition;
_y = verticalPosition;
}
public int HComponent
{
get
{
return _x;
}
}
public int VComponent
{
get
{
return _y;
}
}
public static bool operator == (TilePosition lhs, TilePosition rhs)
{
return lhs.Equals(rhs);
}
public static bool operator != (TilePosition lhs, TilePosition rhs)
{
return !lhs.Equals(rhs);
}
public bool Equals(TilePosition other)
{
return (_x == other._x) && (_y == other._y);
}
public override bool Equals(object obj)
{
return obj is TilePosition && Equals((TilePosition)obj);
}
public override int GetHashCode()
{
unchecked
{
return (_x*397) ^ _y;
}
}
private readonly int _x;
private readonly int _y;
}

Related

HashSet with a custom struct allocates heavy with Contains function

I am using the HashSet collection type which has already significantly improved the performance of my algorithm. It seems that each time I invoke myHashSet.Contains(someValue) the internal implementation is boxing the value type immediately before invoking Equals.
Is there a way to avoid these wasteful allocations when using value types?
Sample Code:
public struct TestStruct {
public int a;
public int b;
public override int GetHashCode() {
return a ^ b;
}
public override bool Equals(object obj) {
if (!(obj is TestStruct))
return false;
TestStruct other = (TestStruct)obj;
return a == other.a && b == other.b;
}
}
var hashset = new HashSet<TestStruct>();
PopulateSet(hashset);
// About to go crazy on the allocations...
if (hashset.Contains(someValue)) { ... }
// Lots of allocations just happened :(
After a lucky guess, it looks like the answer is simply to implement the IEquatable<T> interface, as demonstrated below. HashSet<T> (or at least the Mono implementation) then takes an allocation-free approach to its Contains method by using a different comparer implementation.
public struct TestStruct : IEquatable<TestStruct> {
...
public bool Equals(TestStruct other) {
return a == other.a && b == other.b;
}
}
// No more pain!
if (hashset.Contains(someValue)) { ... }
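One way to check which Equals overload the set actually calls is to count the calls. The sketch below does that with a made-up CountingKey struct and static counters; the counters exist only for the experiment. It assumes the set is created without a custom comparer, so that EqualityComparer<T>.Default is used.

```csharp
using System;
using System.Collections.Generic;

public struct CountingKey : IEquatable<CountingKey>
{
    // Counters used only to observe which overload the collection calls.
    public static int TypedEqualsCalls;
    public static int ObjectEqualsCalls;

    public readonly int A;
    public readonly int B;

    public CountingKey(int a, int b) { A = a; B = b; }

    // Strongly typed comparison: no boxing of the argument.
    public bool Equals(CountingKey other)
    {
        TypedEqualsCalls++;
        return A == other.A && B == other.B;
    }

    // Object-based comparison: a struct argument has to be boxed to get here.
    public override bool Equals(object obj)
    {
        ObjectEqualsCalls++;
        return obj is CountingKey && Equals((CountingKey)obj);
    }

    public override int GetHashCode() { return A ^ B; }
}

public static class ContainsDemo
{
    // Adds one key and looks up an equal key built separately.
    public static bool Lookup()
    {
        var set = new HashSet<CountingKey>();
        set.Add(new CountingKey(1, 2));
        return set.Contains(new CountingKey(1, 2));
    }
}
```

After calling ContainsDemo.Lookup(), ObjectEqualsCalls should still be 0 while TypedEqualsCalls is at least 1, which matches the observation that the boxing path is no longer taken.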

Is there any way to implicitly construct a type in C#?

I read of a useful trick about how you can avoid using the wrong domain data in your code by creating a data type for each domain type you're using. By doing this the compiler will prevent you from accidentally mixing your types.
For example, defining these:
public struct Meter
{
public int Value;
public Meter(int value)
{
this.Value = value;
}
}
public struct Second
{
public int Value;
public Second(int value)
{
this.Value = value;
}
}
allows me to not mix up meters and seconds because they're separate data types. This is great and I can see how useful it can be. I'm aware you'd still need to define operator overloads to handle any kind of arithmetic with these types, but I'm leaving that out for simplicity.
The problem I'm having with this approach is that in order to use these types I need to use the full constructor every time, like this:
Meter distance = new Meter(5);
Is there any way in C# I can use the same mode of construction that a System.Int32 uses, like this:
Meter distance = 5;
I tried creating an implicit conversion but it seems this would need to be part of the Int32 type, not my custom types. I can't add an Extension Method to Int32 because it would need to be static, so is there any way to do this?
You can specify an implicit conversion directly in the structs themselves.
public struct Meter
{
public int Value;
public Meter(int value)
{
this.Value = value;
}
public static implicit operator Meter(int a)
{
return new Meter(a);
}
}
public struct Second
{
public int Value;
public Second(int value)
{
this.Value = value;
}
public static implicit operator Second(int a)
{
return new Second(a);
}
}
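A short usage sketch, under a few assumptions of my own: the field is made readonly, an explicit conversion back to int is added, and a + operator stands in for the arithmetic the question left out. MeterDemo is a hypothetical helper.

```csharp
public struct Meter
{
    public readonly int Value;

    public Meter(int value) { Value = value; }

    // Lets callers write "Meter distance = 5;".
    public static implicit operator Meter(int value) { return new Meter(value); }

    // Explicit on the way out, so a Meter is never silently treated as a bare int.
    public static explicit operator int(Meter m) { return m.Value; }

    public static Meter operator +(Meter a, Meter b) { return new Meter(a.Value + b.Value); }
}

public static class MeterDemo
{
    public static int TotalLength()
    {
        Meter first = 5;          // implicit conversion from int
        Meter total = first + 3;  // the int 3 is converted to Meter, then + is applied
        return (int)total;        // explicit conversion back to int
    }
}
```

Note the trade-off: with an implicit conversion from int, any int can become a Meter (or a Second) without ceremony, which weakens the protection somewhat. Assigning a Second to a Meter still fails to compile, because no conversion between the two structs is defined.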

Using Enums that are in an external dll

I have a project I am working on that will involve creating one DLL to be used across multiple other sites. Inside this DLL we need to reference about 10 enums. The values of these enums, however, will be different for each site the DLL is used on. For example:
MyBase.dll may have a class MyClass with an attribute of type MyEnum.
MyBase.dll is then referenced in MySite. MySite will also reference MyEnums.dll, which will contain the values for the MyEnum type.
Is there any way to accomplish this? While building MyBase.dll, I know which enums will exist inside MyEnums.dll. The problem is that I cannot build MyBase.dll without specifically referencing MyEnums.dll, which is not created until MyBase.dll is used in a specific project.
I hope that makes sense and hope I can find an answer here.
Thanks.
Edit:
Thanks for all the comments. It will take a few reads to completely understand, but let me try to give a better example of what I am looking at here.
Let's say the following code is in my DLL that will be put into various projects. Status is an enum.
public class MyClass
{
private Status _currentStatus;
public Status CurrentStatus
{
get
{
return _currentStatus;
}
}
public void ChangeStatus(Status newStatus)
{
_currentStatus = newStatus;
}
}
What I want to be able to do is define the possible values for Status in the individual projects. So in this DLL, I will never reference what values might be in the Status enum; I just have to know that it exists.
I hope that is a bit more clear on what I am trying to do.
If you want each client to see different enum values (in a different assembly version), then using an enum is a bad solution - changes will break client code...
Using an enum might work (as long as the enum names and assembly name are the same and the assembly isn't signed) - you could just swap the assembly. However, if the code uses a value that is missing from the swapped-in assembly, you'll end up with an exception. Also, you may have to explicitly number the values, to make sure different subsets of the values won't end up with the same number for different values or different numbers for the same value.
Instead consider using a dynamically built collection, e.g. a list, a dictionary or a database table. Or just give the same assembly with the same superset of enum values to everyone and let the users decide which values are relevant to them (perhaps use significant prefixes for values as a convention).
Or you could use a combination of the two...
Generate a different structure (different type name (or namespace) and assembly name) per site with different properties (according to site's profile) and one master structure for the service that accepts the structures. Have all the structures implement the same interface, which you expect to receive...
public interface IStatus
{
string GetKey();
}
public struct ClientXStatus : IStatus
{
private readonly string _key;
private ClientXStatus(string key)
{
_key = key;
}
// Don't forget default for structs is 0,
// therefore all structs should have a "0" property.
public static ClientXStatus Default
{
get
{
return new ClientXStatus();
}
}
public static ClientXStatus OptionB
{
get
{
// The constructor takes a string key
return new ClientXStatus("10");
}
}
string IStatus.GetKey()
{
return _key;
}
public override bool Equals(object obj)
{
return (obj is IStatus) && ((IStatus)obj).GetKey() == _key;
}
public override int GetHashCode()
{
// _key is null for the default value, so guard against it
return _key == null ? 0 : _key.GetHashCode();
}
public static bool operator==(ClientXStatus x, IStatus y)
{
return x.Equals(y);
}
public static bool operator==(IStatus x, ClientXStatus y)
{
return y.Equals(x);
}
public static bool operator!=(ClientXStatus x, IStatus y)
{
return !x.Equals(y);
}
public static bool operator!=(IStatus x, ClientXStatus y)
{
return !y.Equals(x);
}
// Override Equals(), GetHashCode() and operators ==, !=
// So clients can compare structures to each other (to interface)
}
Use a master struct for the service:
public struct MasterStatus : IStatus
{
private readonly string _key;
private MasterStatus(string key)
{
_key = key;
}
// Don't forget default for structs is 0,
// therefore all structs should have a "0" property.
public static MasterStatus Default
{
get
{
return new MasterStatus();
}
}
// You should have all the options here
public static MasterStatus OptionB
{
get
{
// The constructor takes a string key
return new MasterStatus("10");
}
}
// Here use implicit interface implementation instead of explicit implementation
public string GetKey()
{
return _key;
}
// C# does not allow user-defined conversions from an interface type,
// so convert from IStatus with a static factory method instead.
public static MasterStatus FromStatus(IStatus value)
{
return new MasterStatus(value.GetKey());
}
public static implicit operator string(MasterStatus value)
{
return value._key;
}
// Don't forget to implement Equals, GetHashCode,
// == and != like in the client structures
}
Demo service code:
public void ServiceMethod(IStatus status)
{
// Case labels must be compile-time constants, so compare the keys with if
if (status.GetKey() == (string)MasterStatus.OptionB)
{
DoSomething();
}
}
Or:
public void ChangeStatus(IStatus status)
{
_status = MasterStatus.FromStatus(status);
}
This way you:
Use code generation to prevent collision of values.
Force users to use compile time checks (no int values or string values) by hiding values (as private) and only accepting your structures.
Use real polymorphism in the service's code (an interface) and not an error-prone hack.
Use immutable value types (like enums) and not reference types.
First you have to decide WHERE to put your constants. Then you can transform your enum to static properties.
For example:
public enum MyEnum
{
Value1,
Value2
}
Can be changed to (first naive approach):
public static class MyFakeEnum
{
public static int Value1
{
get { return GetActualValue("Value1"); }
}
public static int Value2
{
get { return GetActualValue("Value2"); }
}
private static int GetActualValue(string name)
{
// Put here the code to read the actual value
// from your favorite source. It can be a database, a configuration
// file, the registry or whatever else. Consider caching the result.
throw new System.NotImplementedException();
}
}
This simply provides the required constants, but you lose the compile-time type check if you need MyFakeEnum as a parameter type. For a better solution you can follow, for example, what Microsoft did (more or less) for System.Drawing.Color.
public sealed class MyFakeEnum
{
public static readonly MyFakeEnum Value1 = new MyFakeEnum("Value1");
public static readonly MyFakeEnum Value2 = new MyFakeEnum("Value2");
private MyFakeEnum(string name)
{
_name = name;
}
public static implicit operator int(MyFakeEnum value)
{
return GetActualValue(value._name);
}
private string _name;
}
Of course you should provide proper overrides at least for Equals, GetHashCode and ToString.
Pro
It can be an upgrade from an existing enum. Code won't break and you may just need to recompile.
You can use it as a strongly typed parameter. For example: void DoSomething(MyFakeEnum value) is valid and callers can't pass anything else (note that this is one of the reasons enums are considered weak).
If you implement all the required operators you can use the normal syntax for comparison: value == MyFakeEnum.Value1.
With a little bit of code you may even implement the FlagsAttribute syntax.
You do not change the normal syntax of enums: MyFakeEnum.Value1.
You can implement any number of implicit/explicit conversion operators to/from your type and any conversion will be safe and checked in the point it's done (this is not true again with standard enums).
You do not have hard-coded strings that can be broken by changes and won't be caught until they cause a run-time error (yes, run-time). With a dictionary, for example, if you change the definitions you'll have to search all your code for that string.
Cons
The first implementation is longer because you have to write support code (but for any new value you'll simply add a new line).
The value list is fixed and must be known at compile time (this is not an issue if you're looking for a replacement for an enum, because an enum is fixed too).
With this solution you can keep more or less the same syntax you had with standard enums.
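The overrides mentioned above (Equals, GetHashCode and ToString) might look roughly like this. TicketState and its two values are hypothetical names, and the lookup of the actual numeric value is left out to keep the sketch short.

```csharp
using System;

public sealed class TicketState : IEquatable<TicketState>
{
    // The only instances that can ever exist, since the constructor is private.
    public static readonly TicketState Open = new TicketState("Open");
    public static readonly TicketState Closed = new TicketState("Closed");

    private readonly string name;

    private TicketState(string name)
    {
        this.name = name;
    }

    // Two states are equal when they carry the same name.
    public bool Equals(TicketState other)
    {
        return !ReferenceEquals(other, null) && name == other.name;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as TicketState);
    }

    public override int GetHashCode()
    {
        return name.GetHashCode();
    }

    // Makes logging and debugging output readable, like an enum's ToString.
    public override string ToString()
    {
        return name;
    }
}
```

Because every value is a single shared instance, the built-in reference comparison already makes value == TicketState.Open behave as expected in this sketch, even without overloading the operators.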

Comparing objects

I have a class that contains some string members, some double members and some array objects.
I create two objects of this class. Is there a simple, efficient way of comparing these objects and saying whether they're equal? Any suggestions?
I know how to write a compare function, but will it be time consuming?
The only way you can really do this is to override bool Object.Equals(object other) to return true when your conditions for equality are met, and return false otherwise. You must also override int Object.GetHashCode() to return an int computed from all of the data that you consider when overriding Equals().
As an aside, note that the contract for GetHashCode() specifies that the return value must be equal for two objects when Equals() would return true when comparing them. This means that return 0; is a valid implementation of GetHashCode() but it will cause inefficiencies when objects of your class are used as dictionary keys, or stored in a HashSet<T>.
The way I implement equality is like this:
public class Foo : IEquatable<Foo>
{
public bool Equals(Foo other)
{
if (other == null)
return false;
if (other == this)
return true; // Same object reference.
// Compare this to other and return true/false as appropriate.
}
public override bool Equals(Object other)
{
return Equals(other as Foo);
}
public override int GetHashCode()
{
// Compute and return hash code.
}
}
A simple way of implementing GetHashCode() is to XOR together the hash codes of all of the data you consider for equality in Equals(). So if, for example, the properties you compare for equality are string FirstName; string LastName; int Id;, your implementation might look like:
public override int GetHashCode()
{
return (FirstName != null ? FirstName.GetHashCode() : 0) ^
(LastName != null ? LastName.GetHashCode() : 0) ^
Id; // Primitives of <= 4 bytes are their own hash codes
}
I typically do not override the equality operators, as most of the time I'm concerned with equality only for the purposes of dictionary keys or collections. I would only consider overriding the equality operators if you are likely to do more comparisons by value than by reference, as it is syntactically less verbose. However, you have to remember to change all places where you use == or != on your object (including in your implementation of Equals()!) to use Object.ReferenceEquals(), or to cast both operands to object. This nasty gotcha (which can cause infinite recursion in your equality test if you are not careful) is one of the primary reasons I rarely override these operators.
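For completeness, here is a sketch of what overloading the operators safely could look like, using the FirstName/LastName/Id example from above. The Person class itself is an assumption made for illustration; the point is that every null or identity check goes through ReferenceEquals rather than ==.

```csharp
using System;

public sealed class Person : IEquatable<Person>
{
    public string FirstName { get; private set; }
    public string LastName { get; private set; }
    public int Id { get; private set; }

    public Person(string firstName, string lastName, int id)
    {
        FirstName = firstName;
        LastName = lastName;
        Id = id;
    }

    public bool Equals(Person other)
    {
        // ReferenceEquals bypasses the overloaded == below, so there is no recursion.
        if (ReferenceEquals(other, null)) return false;
        if (ReferenceEquals(other, this)) return true;
        return Id == other.Id
            && FirstName == other.FirstName
            && LastName == other.LastName;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Person);
    }

    public override int GetHashCode()
    {
        return (FirstName != null ? FirstName.GetHashCode() : 0) ^
               (LastName != null ? LastName.GetHashCode() : 0) ^
               Id;
    }

    public static bool operator ==(Person left, Person right)
    {
        // Writing "left == null" here would call this operator again, forever.
        if (ReferenceEquals(left, null)) return ReferenceEquals(right, null);
        return left.Equals(right);
    }

    public static bool operator !=(Person left, Person right)
    {
        return !(left == right);
    }
}
```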
The 'proper' way to do it in .NET is to implement the IEquatable<T> interface for your class:
public class SomeClass : IEquatable<SomeClass>
{
public string Name { get; set; }
public double Value { get; set; }
public int[] NumberList { get; set; }
public bool Equals(SomeClass other)
{
// whatever your custom equality logic is
return other.Name == Name &&
other.Value == Value &&
other.NumberList == NumberList;
}
}
However, if you really want to do it right, this isn't all you should do. You should also override the Equals(object) and GetHashCode() methods so that, no matter how your calling code is comparing equality (perhaps in a Dictionary or perhaps in some loosely-typed collection), your code and not reference-type equality will be the determining factor:
public class SomeClass : IEquatable<SomeClass>
{
public string Name { get; set; }
public double Value { get; set; }
public int[] NumberList { get; set; }
/// <summary>
/// Explicitly implemented IEquatable method.
/// </summary>
bool IEquatable<SomeClass>.Equals(SomeClass other)
{
return other.Name == Name &&
other.Value == Value &&
other.NumberList == NumberList;
}
public override bool Equals(object obj)
{
var other = obj as SomeClass;
if (other == null)
return false;
return ((IEquatable<SomeClass>)(this)).Equals(other);
}
public override int GetHashCode()
{
// Determine some consistent way of generating a hash code, such as...
return Name.GetHashCode() ^ Value.GetHashCode() ^ NumberList.GetHashCode();
}
}
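One caveat with the code above: == on two arrays compares references, and an array's GetHashCode is likewise based on identity, so two objects holding separate arrays with the same numbers would not be considered equal. If element-wise comparison is what you want, a variant along the following lines may help. Reading and SameElements are hypothetical names, and the hash formula is only one reasonable choice.

```csharp
using System;
using System.Linq;

public sealed class Reading : IEquatable<Reading>
{
    public string Name { get; set; }
    public double Value { get; set; }
    public int[] NumberList { get; set; }

    public bool Equals(Reading other)
    {
        if (ReferenceEquals(other, null)) return false;
        return Name == other.Name
            && Value == other.Value
            && SameElements(NumberList, other.NumberList);
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Reading);
    }

    // Built from the same data Equals looks at, including each array element.
    // The properties are mutable, so avoid changing an object while it is a dictionary key.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = Name != null ? Name.GetHashCode() : 0;
            hash = (hash * 397) ^ Value.GetHashCode();
            if (NumberList != null)
            {
                foreach (int n in NumberList)
                    hash = (hash * 397) ^ n;
            }
            return hash;
        }
    }

    // Two arrays match when both are null, or both hold the same elements in the same order.
    private static bool SameElements(int[] a, int[] b)
    {
        if (a == null || b == null) return a == b;
        return a.SequenceEqual(b);
    }
}
```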
I just spent the whole day writing an extension method that loops over an object's properties via reflection, with various complex bits of logic to deal with different property types, and actually got it close to good. Then at 16:55 it dawned on me that if you serialize the two objects, you simply need to compare the two strings... duh.
So here is a simple serializer extension method that even works on dictionaries:
public static class TExtensions
{
public static string Serialize<T>(this T thisT)
{
var serializer = new DataContractSerializer(thisT.GetType());
using (var writer = new StringWriter())
using (var stm = new XmlTextWriter(writer))
{
serializer.WriteObject(stm, thisT);
return writer.ToString();
}
}
}
Now your test can be as simple as:
Assert.AreEqual(objA.Serialize(), objB.Serialize());
I haven't done extensive testing yet, but it looks promising and, more importantly, simple. Either way, it's still a useful method to have in your utility set, right?
The best answer is to implement IEquatable for your class - it may not be the answer you want to hear, but that's the best way to implement value equivalence in .NET.
Another option would be computing a unique hash of all of the members of your class and then doing value comparisons against those, but that's even more work than writing a comparison function ;)
Since these are objects, my guess is that you will have to override the Equals method. Otherwise Equals will return true only if both variables refer to the same object.
I know this is not the answer you want, but since your class has only a small number of properties, you can easily override the method.

C#: optimizing dictionary access (hash in key structures)

So, I need to create a struct in C# that will act as a key into a (quite large) dictionary, which will look like this:
private readonly IDictionary<KeyStruct, string> m_Invitations;
The problem is, I REALLY need a struct to use as a key, because entries can only be identified via two separate data items, one of which can be a null (not only empty!) string.
What will I need to implement on the struct? How would you go about creating the hash? Would an (occasional) hash collision hurt performance heavily, or would that be negligible?
I'm asking because this is "inner loop" code.
If you have ReSharper, you can generate these methods with Alt-Ins -> Equality members.
Here is the generated code for your KeyStruct:
public struct KeyStruct : IEquatable<KeyStruct>
{
public string Value1 { get; private set; }
public long Value2 { get; private set; }
public KeyStruct(string value1, long value2)
: this()
{
Value1 = value1;
Value2 = value2;
}
public bool Equals(KeyStruct other)
{
return Equals(other.Value1, Value1) && other.Value2 == Value2;
}
public override bool Equals(object obj)
{
if (ReferenceEquals(null, obj)) return false;
if (obj.GetType() != typeof (KeyStruct)) return false;
return Equals((KeyStruct) obj);
}
public override int GetHashCode()
{
unchecked
{
return ((Value1 != null ? Value1.GetHashCode() : 0)*397) ^ Value2.GetHashCode();
}
}
public static bool operator ==(KeyStruct left, KeyStruct right)
{
return left.Equals(right);
}
public static bool operator !=(KeyStruct left, KeyStruct right)
{
return !left.Equals(right);
}
}
If KeyStruct is a structure (declared with the C# struct keyword), don't forget to override the Equals and GetHashCode methods, or provide a custom IEqualityComparer to the dictionary constructor, because the default implementation of ValueType.Equals can use reflection to compare the contents of two structure instances.
It is preferable to make KeyStruct immutable; if you do, you can calculate the hash of a structure instance once and then simply return it from GetHashCode. But that may be premature optimization, depending on how often you need to look up a value by key.
Generally, it is OK to use structure as a dictionary key.
Or maybe you are asking how to implement GetHashCode method?
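The idea of computing the hash once could be sketched as follows. InvitationKey, its field names and InvitationDemo are assumptions made for illustration; the hash formula is borrowed from the generated code shown earlier on this page.

```csharp
using System;
using System.Collections.Generic;

public struct InvitationKey : IEquatable<InvitationKey>
{
    private readonly string name;   // may be null
    private readonly long number;
    private readonly int hash;      // computed once, in the constructor

    public InvitationKey(string name, long number)
    {
        this.name = name;
        this.number = number;
        unchecked
        {
            // For (null, 0) this yields 0, which matches default(InvitationKey),
            // where the constructor never runs and hash stays 0.
            hash = ((name != null ? name.GetHashCode() : 0) * 397) ^ number.GetHashCode();
        }
    }

    public bool Equals(InvitationKey other)
    {
        // Comparing the cached hashes first is a cheap way to reject most mismatches.
        return hash == other.hash
            && number == other.number
            && string.Equals(name, other.name);
    }

    public override bool Equals(object obj)
    {
        return obj is InvitationKey && Equals((InvitationKey)obj);
    }

    public override int GetHashCode()
    {
        return hash;
    }
}

public static class InvitationDemo
{
    // Stores a value under a key with a null string part and reads it back.
    public static string Lookup()
    {
        var invitations = new Dictionary<InvitationKey, string>();
        invitations[new InvitationKey(null, 42L)] = "pending";
        return invitations[new InvitationKey(null, 42L)];
    }
}
```

The cached field makes each key four bytes larger (more with padding), so whether this pays off depends on how expensive the string hash is compared with copying a bigger struct; measuring is the only reliable way to tell.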
You need to implement (override) two methods.
1. bool Equals(object)
2. int GetHashCode()
The hash code need not be unique, but the fewer distinct objects that share a hash code, the better the performance will be.
You can use something like:
public override int GetHashCode()
{
int strHash = str == null ? 0 : str.GetHashCode();
return ((int)lng*397) ^ strHash;
}
