I have a problem using a class made of structs.
Here's the basic definition:
using System;
struct Real
{
public double real;
public Real(double real)
{
this.real = real;
}
}
class Record
{
public Real r;
public Record(double r)
{
this.r = new Real(r);
}
public void Test(double origval, double newval)
{
if (this.r.real == newval)
Console.WriteLine("r = newval-test passed\n");
else if (this.r.real == origval)
Console.WriteLine("r = origval-test failed\n");
else
Console.WriteLine("r = neither-test failed\n");
}
}
When I create a non-dynamic (static?) Record, setting the Real works.
When I create a dynamic Record, setting the real doesn't work.
When I create a dynamic Record, replacing the real works.
And here's the test program
class Program
{
static void Main(string[] args)
{
double origval = 8.0;
double newval = 5.0;
// THIS WORKS - create fixed type Record, print, change value, print
Record record1 = new Record(origval);
record1.r.real = newval; // change value ***
record1.Test(origval, newval);
// THIS DOESN'T WORK. change value is not making any change!
dynamic dynrecord2 = new Record(origval);
dynrecord2.r.real = newval; // change value
dynrecord2.Test(origval, newval);
// THIS WORKS - create dynamic type Record, print, change value, print
dynamic dynrecord3 = new Record(origval);
dynamic r = dynrecord3.r; // copy out value
r.real = newval; // change copy
dynrecord3.r = r; // copy in modified value
dynrecord3.Test(origval, newval);
}
}
And here's the output:
r = newval-test passed
r = origval-test failed
r = newval-test passed
When I change the struct Real to class Real, all three cases work.
So what's going on?
Thanks,
Max
dynamic is really a fancy word for object as far as the core CLI is concerned, so you are mutating a boxed copy. This is prone to craziness. Mutating a struct in the first place is really, really prone to error. I would simply make the struct immutable - otherwise you are going to get this over and over.
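The boxed-copy effect can be reproduced without dynamic at all. The following is a minimal sketch, not the code the compiler actually emits for a dynamic call site; the BoxingDemo class and its method name are made up for illustration. It boxes a Real into an object, changes an unboxed copy, and returns the value of the original, which is untouched.

```csharp
using System;

struct Real
{
    public double real;
    public Real(double real) { this.real = real; }
}

static class BoxingDemo
{
    // Hypothetical helper: returns the original's value after a copy was changed.
    public static double ChangeUnboxedCopy(double origval, double newval)
    {
        Real original = new Real(origval);
        object boxed = original;   // boxing copies the struct into a heap object
        Real copy = (Real)boxed;   // unboxing copies it out again
        copy.real = newval;        // only the local copy changes
        return original.real;      // still origval
    }

    static void Main()
    {
        Console.WriteLine(ChangeUnboxedCopy(8.0, 5.0)); // prints 8
    }
}
```

If Real were a class, boxed and copy would both refer to the same object as original, and the change would be visible.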
I dug a little deeper into this problem. Here's an answer from Mads Torgersen of Microsoft.
From Mads:
This is a little unfortunate but by design. In
dynrecord2.r.real = newval; // change value
The value of dynrecord2.r gets boxed, which means copied into its own heap object. That copy is the one getting modified, not the original that you subsequently test.
This is a consequence of the very “local” way in which C# dynamic works. Think about a statement like the above – there are two fundamental ways that we could attack that:
1) Realize at compile time that something dynamic is going on, and essentially move the whole statement to be bound at runtime
2) Bind individual operations at runtime when their constituents are dynamic, returning something dynamic that may in turn cause things to be bound at runtime
In C# we went with the latter, which is nicely compositional, and makes it easy to describe dynamic in terms of the type system, but has some drawbacks – such as boxing of resulting value types for instance.
So what you are seeing is a result of this design choice.
I took another look at the MSIL. It essentially takes
dynrecord2.r.real = newval;
and turns it into:
Real temp = dynrecord2.r;
temp.real = newval;
If dynrecord2.r is a class, it just copies the handle so the change affects the internal field. If dynrecord2.r is a struct, a copy is made, and the change doesn't affect the original.
I'll leave it up to the reader to decide if this is a bug or a feature.
Max
Make your struct immutable and you won't have problems.
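struct Real
{
    private readonly double real;
    // A member cannot share the name of its enclosing type, so the property is called Value.
    public double Value { get { return real; } }
    public Real(double real)
    {
        this.real = real;
    }
}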
Mutable structs can be useful in native interop or some high performance scenarios, but then you better know what you're doing.
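With an immutable struct, the only way to "change" the value is to replace the whole struct, which is the case that already worked through dynamic in the question. A minimal sketch, assuming a compiler that supports readonly struct (C# 7.2 or later); the ImmutableDemo class and the Value property name are illustrative choices, not part of the original code.

```csharp
using System;

readonly struct Real
{
    public double Value { get; }
    public Real(double value) { Value = value; }
}

class Record
{
    public Real r;
    public Record(double r) { this.r = new Real(r); }
}

static class ImmutableDemo
{
    // Hypothetical helper: replaces the struct through a dynamic reference.
    public static double ReplaceThroughDynamic(double origval, double newval)
    {
        dynamic dynrecord = new Record(origval);
        dynrecord.r = new Real(newval);     // assign a whole new struct to the field
        return (double)dynrecord.r.Value;   // reads the replaced value
    }

    static void Main()
    {
        Console.WriteLine(ReplaceThroughDynamic(8.0, 5.0)); // prints 5
    }
}
```

Because there is no setter, a statement like dynrecord.r.Value = newval cannot silently modify a boxed copy; it fails instead of appearing to succeed.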
I am fairly new to programming and C#, and I am creating a game using C# 9.0 in which all instances of Entity have certain stats. I want to be able to change their private data fields using properties, though I'm not entirely sure how properties work. I know they are useful in encapsulation as getters and setters.
Context:
I am trying to optimize code and decrease memory usage where possible
The byte field str should be variable (through events, training, etc.), but have a "ceiling" and "floor"
If dog.str = 253, then dog.Str += 5; should result in dog.str being 255
If dog.str = 2, then dog.Str -= 5; should result in dog.str being 0
private byte str;
public short Str
{
get => str;
set
{
if (value > byte.MaxValue) str = byte.MaxValue; //Pos Overflow
else if (value < byte.MinValue) str = byte.MinValue; //Neg Overflow
else str = (byte)value;
}
}
Questions:
Since the property is of datatype Short, does it create a new private backing field that consumes memory? Or is value/Str{set;} just a local variable that later disappears?
Does the property public float StrMod {get => (float)(str*Effects.Power);} create a backing field? Would it be better to just create a method like public float getStrMod() instead?
Is this code optimal for what I'm trying to achieve? Is there some better way to do this, considering the following?
If for some reason the Short overflowed (unlikely in this scenario, but there may be a similar situation), then I would end up with the same problem. However, if extra memory allocation isn't an issue, then I could use an int.
The {get;} will return a Short, which may or may not be an issue.
Question 1:
No it doesn't, its backing field is str.
Question 2:
Profile your code first instead of making random changes in the hope of reducing memory usage.
"Premature optimization is the root of all evil"; do you really have such issues at this point?
Personally I'd use int and use same type for property and backing field for simplicity.
This would avoid wrap-around; for example, casting 32768 to short yields -32768.
Side note, don't think that using byte necessarily results in 1 byte, if you have tight packing requirements then you need to look at StructLayoutAttribute.Pack.
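To illustrate the packing remark, here is a small sketch that compares the marshaled size of two structs with identical fields. The struct and field names (DefaultLayout, PackedLayout, Str, Hp) are invented for the example, and Marshal.SizeOf reports the unmanaged size, which for simple structs like these matches the layout you would expect on typical platforms.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
struct DefaultLayout
{
    public byte Str;   // followed by padding so that Hp is 4-byte aligned
    public int Hp;
}

[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct PackedLayout
{
    public byte Str;   // no padding with Pack = 1
    public int Hp;
}

static class LayoutDemo
{
    public static int DefaultSize() { return Marshal.SizeOf<DefaultLayout>(); }
    public static int PackedSize() { return Marshal.SizeOf<PackedLayout>(); }

    static void Main()
    {
        Console.WriteLine(DefaultSize());
        Console.WriteLine(PackedSize());
    }
}
```

The single byte field only saves memory in the packed variant; with the default layout the struct is as large as if Str had been an int.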
Other than that I see nothing wrong with your code, just get it to work first then optimize it!
Here's how I'd write your code, maybe you'll get some ideas from it:
class Testing
{
private int _value;
public int Value
{
get => _value;
set => _value = Clamp(value, byte.MinValue, byte.MaxValue);
}
private static int Clamp(int value, int min, int max)
{
return Math.Max(min, Math.Min(max, value));
}
}
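If you do want to keep the byte backing field from the question, a variant along the same lines could look like this. It is a sketch that assumes a runtime where Math.Clamp is available (.NET Core 2.0 or later, so it fits a C# 9.0 project); the Entity class name is taken from the question's description, and the property is widened to int so that += and -= cannot overflow before clamping in any realistic case.

```csharp
using System;

class Entity
{
    private byte str;

    public int Str
    {
        get => str;
        // Clamp to the byte range first, then narrow; 'value' is just the setter's parameter.
        set => str = (byte)Math.Clamp(value, (int)byte.MinValue, (int)byte.MaxValue);
    }
}

static class ClampDemo
{
    static void Main()
    {
        var dog = new Entity();
        dog.Str = 253;
        dog.Str += 5;
        Console.WriteLine(dog.Str); // prints 255
        dog.Str = 2;
        dog.Str -= 5;
        Console.WriteLine(dog.Str); // prints 0
    }
}
```

dog.Str += 5 expands to a get followed by a set, so the setter sees 258 and stores 255.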
EDIT:
Different scenarios:
class Testing
{
private int _value1;
public int Value1 // backing field is _value1
{
get => _value1;
set => _value1 = value;
}
public int Value2 { get; set; } // adds a backing field
public int Value3 { get; } // adds a backing field
public int Value4 => 42; // no backing field
}
As you might have guessed, properties are syntactic sugar for methods; they can do 'whatever' under the hood, whereas a field can only have a value assigned to it.
Also, one difference with a method is that you can browse its value in the debugger, that's handy.
Suggested reading:
https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/classes-and-structs/properties
Finally, properties are expected to return quickly, else write a method, and possibly async if it's going to take a while (advantage to method in this case as properties can't be async).
@aybe's answer covers the main point of your question. I would like to add some information regarding your 2nd question. You should consider which platform you are writing the application for. There is the term "word":
In computing, a word is the natural unit of data used by a particular
processor design. A word is a fixed-sized piece of data handled as a
unit by the instruction set or the hardware of the processor. The
number of bits in a word (the word size, word width, or word length)
is an important characteristic of any specific processor design or
computer architecture.
If a processor has a 64-bit word, a variable whose type is narrower than 64 bits can still end up occupying a full word, for example when it is held in a register or when a field is padded for alignment (array elements, by contrast, are packed at their declared size). Keep in mind that a variable is always handled as its declared type: its size in memory does not affect range, overflow, or underflow, and arithmetic is performed for the declared type.
In short: on a 64-bit desktop processor, replacing a handful of int variables with short variables is unlikely to give you any observable memory savings.
I feel like I'm missing something completely obvious, so I apologise in advance if (when?) that is the case.
I'm trying to do something really simple, change a bool value in a struct from false to true. Obviously I can't change it directly, so I created a method within the struct that I can call which should change the value there. It doesn't seem to be the case. Here is the code and I'd appreciate any insight;
public Dictionary<int, List<ScanLineNode>> allScanLineNodes = new Dictionary<int, List<ScanLineNode>>();
public void MethodName(ScanLineNode node [...])
{
//This will perform a raycast from the Node's position in the specified direction. If the raycast hits nothing, it will return Vector3.zero ('Row is complete'), otherwise will return the hit point
Vector3 terminationPoint = node.RaycastDirection(node, direction, maxDist, targetRaycast, replacementColour, backgroundColour);
ScanLineNode terminationNode = new ScanLineNode();
//Previously attempted to store a local reference to this row being used, but also did not work
//List<ScanLineNode> rowNodes = allScanLineNodes[node.rowNumber];
[...]
if (terminationPoint == Vector3.zero)
{
//Definitely reaches this point, and executes this function along the row; I have added breakpoints and checked what happens in this for loop. After running 'RowCompleted' (which just changes 'rowComplete' from false to true) 'rowComplete' is still false. Just in case, I've included the RowCompleted() function below.
Debug.Log("Row Complete: " + node.rowNumber);
for (int i = 0; i < allScanLineNodes[node.rowNumber].Count; i++)
{
allScanLineNodes[node.rowNumber][i].RowCompleted();
}
}
}
ScanLineNode struct. Most of it is hidden (the parts I don't believe affect this), but I have included the RowCompleted() function.
public struct ScanLineNode
{
[...]
public bool rowComplete;
[...]
public ScanLineNode([...])
{
[...]
rowComplete = false;
[...]
}
public void RowCompleted()
{
rowComplete = true;
}
}
I have also confirmed that RowCompleted() does not get called anywhere aside from the above location, and 'rowComplete' is only set from the RowCompleted() function.
(from comments) allScanLineNodes is a Dictionary<int, List<ScanLineNode>>
Right; the indexer for a List<ScanLineNode> returns a copy of the struct. So when you call the method - you are calling it on a disconnected value on the stack that evaporates a moment later (is overwritten on the stack - this isn't the garbage collector).
This is a common error with mutable structs. Your best bet is probably: don't make mutable structs. But... you could copy it out, mutate it, and then push the mutated value back in:
var list = allScanLineNodes[node.rowNumber];
var val = list[i];
val.RowCompleted();
list[i] = val; // push value back in
But immutable is usually more reliable.
Note: you can get away with this with arrays, since the indexer from an array provides access to a reference to the in-place struct - rather than a copy of the value. But: this isn't a recommendation, as relying on this subtle difference can cause confusion and bugs.
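The three cases described in this answer (calling through the List<T> indexer, copying out and back in, and using an array) can be compared side by side. This is a minimal sketch with a stand-in Node struct rather than the real ScanLineNode; all names in it are hypothetical.

```csharp
using System;
using System.Collections.Generic;

struct Node
{
    public bool Done;
    public void MarkDone() { Done = true; }
}

static class IndexerDemo
{
    public static bool ViaListIndexer()
    {
        var list = new List<Node> { new Node() };
        list[0].MarkDone();       // runs on the temporary copy returned by the indexer
        return list[0].Done;      // false: the stored element never changed
    }

    public static bool ViaCopyBack()
    {
        var list = new List<Node> { new Node() };
        Node val = list[0];       // copy out
        val.MarkDone();           // mutate the copy
        list[0] = val;            // push the value back in
        return list[0].Done;      // true
    }

    public static bool ViaArray()
    {
        var array = new Node[1];
        array[0].MarkDone();      // array element access refers to the element in place
        return array[0].Done;     // true
    }

    static void Main()
    {
        Console.WriteLine(ViaListIndexer()); // prints False
        Console.WriteLine(ViaCopyBack());    // prints True
        Console.WriteLine(ViaArray());       // prints True
    }
}
```

Note that list[0].Done = true would not even compile, because the compiler can see that an assignment to a member of a temporary is pointless; a method call gives it no such hint.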
This one's really an offshoot of this question, but I think it deserves its own answer.
According to section 15.13 of the ECMA-334 (on the using statement, below referred to as resource-acquisition):
Local variables declared in a
resource-acquisition are read-only, and shall include an initializer. A
compile-time error occurs if the
embedded statement attempts to modify
these local variables (via assignment
or the ++ and -- operators) or
pass them as ref or out
parameters.
This seems to explain why the code below is illegal.
struct Mutable : IDisposable
{
public int Field;
public void SetField(int value) { Field = value; }
public void Dispose() { }
}
using (var m = new Mutable())
{
// This results in a compiler error.
m.Field = 10;
}
But what about this?
using (var e = new Mutable())
{
// This is doing exactly the same thing, but it compiles and runs just fine.
e.SetField(10);
}
Is the above snippet undefined and/or illegal in C#? If it's legal, what is the relationship between this code and the excerpt from the spec above? If it's illegal, why does it work? Is there some subtle loophole that permits it, or is the fact that it works attributable only to mere luck (so that one shouldn't ever rely on the functionality of such seemingly harmless-looking code)?
I would read the standard in such a way that
using( var m = new Mutable() )
{
m = new Mutable();
}
is forbidden, for reasons that seem obvious.
Why it is not allowed for the struct Mutable beats me, because for a class the code is legal and compiles fine (object type, I know).
Also, I do not see why changing the contents of the value type endangers the resource acquisition. Would someone care to explain?
Maybe someone doing the syntax checking just misread the standard ;-)
Mario
I suspect the reason it compiles and runs is that SetField(int) is a function call, not an assignment or ref or out parameter call. The compiler has no way of knowing (in general) whether SetField(int) is going to mutate the variable or not.
This appears completely legal according to the spec.
And consider the alternatives. Static analysis to determine whether a given function call is going to mutate a value is clearly cost prohibitive in the C# compiler. The spec is designed to avoid that situation in all cases.
The other alternative would be for C# to not allow any method calls on value type variables declared in a using statement. That might not be a bad idea, since implementing IDisposable on a struct is just asking for trouble anyway. But when the C# language was first developed, I think they had high hopes for using structs in lots of interesting ways (as the GetEnumerator() example that you originally used demonstrates).
To sum it up
struct Mutable : IDisposable
{
public int Field;
public void SetField( int value ) { Field = value; }
public void Dispose() { }
}
class Program
{
protected static readonly Mutable xxx = new Mutable();
static void Main( string[] args )
{
//not allowed by compiler
//xxx.Field = 10;
xxx.SetField( 10 );
//prints out 0 !!!! <--- I do think that this is pretty bad
System.Console.Out.WriteLine( xxx.Field );
using ( var m = new Mutable() )
{
// This results in a compiler error.
//m.Field = 10;
m.SetField( 10 );
//This prints out 10 !!!
System.Console.Out.WriteLine( m.Field );
}
System.Console.In.ReadLine();
}
}
So, in contrast to what I wrote above, I would recommend NOT using a method to modify a struct within a using block. This seems to work, but may stop working in the future.
Mario
This behavior is undefined. In The C# Programming language at the end of the C# 4.0 spec section 7.6.4 (Member Access) Peter Sestoft states:
The two bulleted points stating "if the field is readonly...then
the result is a value" have a slightly surprising effect when the
field has a struct type, and that struct type has a mutable field (not
a recommended combination--see other annotations on this point).
He provides an example. I created my own example which displays more detail below.
Then, he goes on to say:
Somewhat strangely, if instead s were a local variable of struct type
declared in a using statement, which also has the effect of making s
immutable, then s.SetX() updates s.x as expected.
Here we see one of the authors acknowledge that this behavior is inconsistent. Per section 7.6.4, readonly fields are treated as values and do not change (copies change). Because section 8.13 tells us using statements treat resources as read-only:
the resource variable is read-only in the embedded statement,
resources in using statements should behave like readonly fields. Per the rules of 7.6.4 we should be dealing with a value not a variable. But surprisingly, the original value of the resource does change as demonstrated in this example:
using System;
//Sections relate to C# 4.0 spec
class Test
{
readonly S readonlyS = new S();
static void Main()
{
Test test = new Test();
test.readonlyS.SetX();//valid: we are modifying a copy of readonlyS. This is per the rules defined in 7.6.4
Console.WriteLine(test.readonlyS.x);//outputs 0 because readonlyS is a value not a variable
//test.readonlyS.x = 0;//invalid
using (S s = new S())
{
s.SetX();//valid, changes the original value.
Console.WriteLine(s.x);//Surprisingly...outputs 2. Although s is supposed to behave like a readonly field...the behavior diverges.
//s.x = 0;//invalid
}
}
}
struct S : IDisposable
{
public int x;
public void SetX()
{
x = 2;
}
public void Dispose()
{
}
}
The situation is bizarre. Bottom line: avoid readonly fields of mutable struct types.
I'm looking for the C# equivalent of Java's final. Does it exist?
Does C# have anything like the following:
public Foo(final int bar);
In the above example, bar is a read only variable and cannot be changed by Foo(). Is there any way to do this in C#?
For instance, maybe I have a long method that will be working with x, y, and z coordinates of some object (ints). I want to be absolutely certain that the function doesn't alter these values in any way, thereby corrupting the data. Thus, I would like to declare them readonly.
public void Foo(int x, int y, int z) {
// do stuff
x++; // oops. This corrupts the data. Can this be caught at compile time?
// do more stuff, assuming x is still the original value.
}
Unfortunately you cannot do this in C#.
The const keyword can only be used for local variables and fields.
The readonly keyword can only be used on fields.
NOTE: The Java language also supports having final parameters to a method. This functionality is non-existent in C#.
from http://www.25hoursaday.com/CsharpVsJava.html
EDIT (2019/08/13):
I'm throwing this in for visibility since this is accepted and highest on the list. It's now kind of possible with in parameters. See the answer below this one for details.
This is now possible in C# version 7.2:
You can use the in keyword in the method signature. MSDN documentation.
The in keyword should be added before specifying a method's argument.
Example, a valid method in C# 7.2:
public long Add(in long x, in long y)
{
return x + y;
}
While the following is not allowed:
public long Add(in long x, in long y)
{
x = 10; // It is not allowed to modify an in-argument.
return x + y;
}
The following error is shown when trying to modify either x or y, since they are marked with in:
Cannot assign to variable 'in long' because it is a readonly variable
Marking an argument with in means:
This method does not modify the value of the argument used as this parameter.
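One caveat ties back to the mutable-struct discussions above: in stops assignments, but it does not stop you from calling a mutating method on a struct argument. For a struct that is not declared readonly, the compiler calls such methods on a defensive copy, so the call compiles and the change is simply lost. A minimal sketch, assuming C# 7.2 or later; the Counter struct and InDemo class are made up for illustration.

```csharp
using System;

struct Counter
{
    public int Count;
    public void Increment() { Count++; }
}

static class InDemo
{
    // Hypothetical helper: tries to mutate an 'in' argument through a method call.
    public static int IncrementViaIn(in Counter c)
    {
        // c.Count++;      // would not compile: c is a readonly variable
        c.Increment();     // compiles, but runs on a defensive copy of c
        return c.Count;    // the argument itself is unchanged
    }

    static void Main()
    {
        var counter = new Counter();
        Console.WriteLine(IncrementViaIn(counter)); // prints 0
        Console.WriteLine(counter.Count);           // prints 0
    }
}
```

Declaring the struct as readonly struct avoids both the hidden copy and the surprise, because mutating methods then cannot exist.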
The answer: C# doesn't have the const functionality like C++.
I agree with Bennett Dill.
The const keyword is very useful. In the example you used an int, so people don't get your point. But what if your parameter is a huge, complex user object that must not be changed inside that function? That's the use of the const keyword: the parameter can't change inside that method because [whatever reason here], which doesn't matter for that method. The const keyword is very powerful and I really miss it in C#.
Here's a short and sweet answer that will probably get a lot of down votes. I haven't read all of the posts and comments, so please forgive me if this has been previously suggested.
Why not take your parameters and pass them into an object that exposes them as immutable and then use that object in your method?
I realize this is probably a very obvious work around that has already been considered and the OP is trying to avoid doing this by asking this question, but I felt it should be here none-the-less...
Good luck :-)
I'll start with the int portion. int is a value type, and in .Net that means you really are dealing with a copy. It's a really weird design constraint to tell a method "You can have a copy of this value. It's your copy, not mine; I'll never see it again. But you can't change the copy." It's implicit in the method call that copying this value is okay, otherwise we couldn't have safely called the method. If the method needs the original, leave it to the implementer to make a copy to save it. Either give the method the value or do not give the method the value. Don't go all wishy-washy in between.
Let's move on to reference types. Now it gets a little confusing. Do you mean a constant reference, where the reference itself cannot be changed, or a completely locked, unchangeable object? If the former, references in .Net by default are passed by value. That is, you get a copy of the reference. So we have essentially the same situation as for value types. If the implementor will need the original reference they can keep it themselves.
That just leaves us with constant (locked/immutable) object. This might seem okay from a runtime perspective, but how is the compiler to enforce it? Since properties and methods can all have side effects, you'd essentially be limited to read-only field access. Such an object isn't likely to be very interesting.
Create an interface for your class that has only readonly property accessors. Then have your parameter be of that interface rather than the class itself. Example:
public interface IExample
{
int ReadonlyValue { get; }
}
public class Example : IExample
{
public int Value { get; set; }
public int ReadonlyValue { get { return this.Value; } }
}
public void Foo(IExample example)
{
// Now only has access to the get accessors for the properties
}
For structs, create a generic const wrapper.
public struct Const<T>
{
public T Value { get; private set; }
public Const(T value)
{
this.Value = value;
}
}
public void Foo(Const<float> X, Const<float> Y, Const<float> Z)
{
// Can only read these values
}
It's worth noting, though, that it's strange that you want to do what you're asking regarding structs; as the writer of the method, you should be expected to know what's going on in that method. Modifying the values within the method won't affect the values passed in, so your only concern is making sure you behave yourself in the method you're writing. There comes a point where vigilance and clean code are the key, rather than enforcing const and other such rules.
I know this might be a little late.
But for people who are still looking for other ways to do this, there may be another way around this limitation of the C# standard.
We could write a wrapper class ReadOnly<T> where T : struct,
with an implicit conversion to the underlying type T
but only an explicit conversion to the wrapper ReadOnly<T>.
This produces compiler errors if a developer tries to assign implicitly to a value of type ReadOnly<T>.
I will demonstrate two possible uses below.
USAGE 1 requires the caller's code to change. It is only useful for checking the correctness of the code in your "TestCalled" functions; you shouldn't use it in release builds, since in large-scale mathematical operations the conversions might be overkill and make your code slow. I wouldn't use it, and have posted it for demonstration purposes only.
USAGE 2, which I would suggest, has its Debug vs Release use demonstrated in the TestCalled2 function. With this approach there is no conversion in the TestCall function, but it requires a little more coding in the TestCalled2 definitions using conditional compilation. You will get compiler errors in the debug configuration, while in the release configuration all the code in the TestCalled2 function compiles successfully.
using System;
using System.Collections.Generic;
public class ReadOnly<VT>
where VT : struct
{
private VT value;
public ReadOnly(VT value)
{
this.value = value;
}
public static implicit operator VT(ReadOnly<VT> rvalue)
{
return rvalue.value;
}
public static explicit operator ReadOnly<VT>(VT rvalue)
{
return new ReadOnly<VT>(rvalue);
}
}
public static class TestFunctionArguments
{
static void TestCall()
{
long a = 0;
// CALL USAGE 1.
// explicit cast must exist in call to this function
// and clearly states it will be readonly inside TestCalled function.
TestCalled(a); // invalid call, we must explicit cast to ReadOnly<T>
TestCalled((ReadOnly<long>)a); // explicit cast to ReadOnly<T>
// CALL USAGE 2.
// Debug vs Release call has no difference - no compiler errors
TestCalled2(a);
}
// ARG USAGE 1.
static void TestCalled(ReadOnly<long> a)
{
// invalid operations, compiler errors
a = 10L;
a += 2L;
a -= 2L;
a *= 2L;
a /= 2L;
a++;
a--;
// valid operations
long l;
l = a + 2;
l = a - 2;
l = a * 2;
l = a / 2;
l = a ^ 2;
l = a | 2;
l = a & 2;
l = a << 2;
l = a >> 2;
l = ~a;
}
// ARG USAGE 2.
#if DEBUG
static void TestCalled2(long a2_writable)
{
ReadOnly<long> a = new ReadOnly<long>(a2_writable);
#else
static void TestCalled2(long a)
{
#endif
// invalid operations
// compiler will have errors in debug configuration
// compiler will compile in release
a = 10L;
a += 2L;
a -= 2L;
a *= 2L;
a /= 2L;
a++;
a--;
// valid operations
// compiler will compile in both, debug and release configurations
long l;
l = a + 2;
l = a - 2;
l = a * 2;
l = a / 2;
l = a ^ 2;
l = a | 2;
l = a & 2;
l = a << 2;
l = a >> 2;
l = ~a;
}
}
If you often run into trouble like this then you should consider "apps hungarian". The good kind, as opposed to the bad kind. While this doesn't normally try to express the constant-ness of a method parameter (that's just too unusual), there is certainly nothing that stops you from tacking an extra "c" before the identifier name.
To all those aching to slam the downvote button now, please read the opinions of these luminaries on the topic:
Eric Lippert
Larry Osterman
Joel Spolsky
If a struct is passed into a method, unless it's passed by ref, it will not be changed by the method it's passed into. So in that sense, yes.
Can you create a parameter whose value can't be assigned within the method or whose properties cannot be set while within the method? No. You cannot prevent the value from being assigned within the method, but you can prevent its properties from being set by creating an immutable type.
The question isn't whether the parameter or its properties can be assigned to within the method. The question is what it will be when the method exits.
The only time any outside data is going to be altered is if you pass a class in and change one of its properties, or if you pass a value by using the ref keyword. The situation you've outlined does neither.
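The three situations mentioned above (a struct passed by value, a struct passed by ref, and a class instance) can be checked with a short sketch. The type and method names here are hypothetical and exist only for the demonstration.

```csharp
using System;

struct PointStruct { public int X; }
class PointClass { public int X; }

static class PassingDemo
{
    static void ChangeStruct(PointStruct p) { p.X = 99; }          // changes the callee's copy only
    static void ChangeStructByRef(ref PointStruct p) { p.X = 99; } // changes the caller's variable
    static void ChangeClass(PointClass p) { p.X = 99; }            // changes the shared object

    public static int StructByValue()
    {
        var p = new PointStruct { X = 1 };
        ChangeStruct(p);
        return p.X;   // 1
    }

    public static int StructByRef()
    {
        var p = new PointStruct { X = 1 };
        ChangeStructByRef(ref p);
        return p.X;   // 99
    }

    public static int ClassByValue()
    {
        var p = new PointClass { X = 1 };
        ChangeClass(p);
        return p.X;   // 99
    }

    static void Main()
    {
        Console.WriteLine(StructByValue()); // prints 1
        Console.WriteLine(StructByRef());   // prints 99
        Console.WriteLine(ClassByValue());  // prints 99
    }
}
```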
The recommended approach (well, recommended by me) is to use an interface that provides read-only access to the members. Remember that if the "real" member is a reference type, you should only provide access to an interface supporting read operations for that type, recursing down the entire object hierarchy.
Inspired by Units of Measure in F#, and despite asserting (here) that you couldn't do it in C#, I had an idea the other day which I've been playing around with.
namespace UnitsOfMeasure
{
public interface IUnit { }
public static class Length
{
public interface ILength : IUnit { }
public class m : ILength { }
public class mm : ILength { }
public class ft : ILength { }
}
public class Mass
{
public interface IMass : IUnit { }
public class kg : IMass { }
public class g : IMass { }
public class lb : IMass { }
}
public class UnitDouble<T> where T : IUnit
{
public readonly double Value;
public UnitDouble(double value)
{
Value = value;
}
public static UnitDouble<T> operator +(UnitDouble<T> first, UnitDouble<T> second)
{
return new UnitDouble<T>(first.Value + second.Value);
}
//TODO: minus operator/equality
}
}
Example usage:
var a = new UnitDouble<Length.m>(3.1);
var b = new UnitDouble<Length.m>(4.9);
var d = new UnitDouble<Mass.kg>(3.4);
Console.WriteLine((a + b).Value);
//Console.WriteLine((a + d).Value); <-- Compiler says no
The next step is trying to implement conversions (snippet):
public interface IUnit { double toBase { get; } }
public static class Length
{
public interface ILength : IUnit { }
public class m : ILength { public double toBase { get { return 1.0;} } }
public class mm : ILength { public double toBase { get { return 0.001; } } }
public class ft : ILength { public double toBase { get { return 0.3048; } } }
public static UnitDouble<R> Convert<T, R>(UnitDouble<T> input) where T : ILength, new() where R : ILength, new()
{
double mult = (new T() as IUnit).toBase;
double div = (new R() as IUnit).toBase;
return new UnitDouble<R>(input.Value * mult / div);
}
}
(I would have liked to avoid instantiating objects by using static, but as we all know you can't declare a static method in an interface)
You can then do this:
var e = Length.Convert<Length.mm, Length.m>(c);
//var f = Length.Convert<Length.mm, Mass.kg>(d); <-- but not this
Obviously, there is a gaping hole in this, compared to F# Units of measure (I'll let you work it out).
Oh, the question is: what do you think of this? Is it worth using? Has someone else already done better?
UPDATE for people interested in this subject area, here is a link to a paper from 1997 discussing a different kind of solution (not specifically for C#)
You are missing dimensional analysis. For example (from the answer you linked to), in F# you can do this:
let g = 9.8<m/s^2>
and it will generate a new unit of acceleration, derived from meters and seconds (you can actually do the same thing in C++ using templates).
In C#, it is possible to do dimensional analysis at runtime, but it adds overhead and doesn't give you the benefit of compile-time checking. As far as I know there's no way to do full compile-time units in C#.
Whether it's worth doing depends on the application of course, but for many scientific applications, it's definitely a good idea. I don't know of any existing libraries for .NET, but they probably exist.
If you are interested in how to do it at runtime, the idea is that each value has a scalar value and integers representing the power of each basic unit.
using System;

class Unit
{
    double scalar;
    int kg;
    int m;
    int s;
    // ... one exponent field for each further basic unit
    public Unit(double scalar, int kg, int m, int s)
    {
        this.scalar = scalar;
        this.kg = kg;
        this.m = m;
        this.s = s;
        // ... and so on for each further basic unit
    }
    // For addition/subtraction, exponents must match
    public static Unit operator +(Unit first, Unit second)
    {
        if (UnitsAreCompatible(first, second))
        {
            return new Unit(
                first.scalar + second.scalar,
                first.kg,
                first.m,
                first.s
                // ... remaining exponents
            );
        }
        else
        {
            throw new Exception("Units must match for addition");
        }
    }
    // For multiplication/division, add/subtract the exponents
    public static Unit operator *(Unit first, Unit second)
    {
        return new Unit(
            first.scalar * second.scalar,
            first.kg + second.kg,
            first.m + second.m,
            first.s + second.s
            // ... remaining exponents
        );
    }
    public static bool UnitsAreCompatible(Unit first, Unit second)
    {
        return
            first.kg == second.kg &&
            first.m == second.m &&
            first.s == second.s;
            // ... && the remaining exponents
    }
}
If you don't allow the user to change the value of the units (a good idea anyways), you could add subclasses for common units:
class Speed : Unit
{
    public Speed(double x) : base(x, 0, 1, -1) // m/s => m^1 * s^-1
    {
    }
}
class Acceleration : Unit
{
    public Acceleration(double x) : base(x, 0, 1, -2) // m/s^2 => m^1 * s^-2
    {
    }
}
You could also define more specific operators on the derived types to avoid checking for compatible units on common types.
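To show the exponent bookkeeping in action, here is a compact, self-contained variant of the same idea with a usage example. It is only a sketch: the Quantity struct, its property names, and the restriction to three base units (kg, m, s) are assumptions made for the demonstration.

```csharp
using System;

readonly struct Quantity
{
    public double Scalar { get; }
    public int Kg { get; }
    public int M { get; }
    public int S { get; }

    public Quantity(double scalar, int kg, int m, int s)
    {
        Scalar = scalar; Kg = kg; M = m; S = s;
    }

    // Multiplication adds the exponents of each base unit.
    public static Quantity operator *(Quantity a, Quantity b)
    {
        return new Quantity(a.Scalar * b.Scalar, a.Kg + b.Kg, a.M + b.M, a.S + b.S);
    }

    // Addition requires identical exponents and is checked at runtime.
    public static Quantity operator +(Quantity a, Quantity b)
    {
        if (a.Kg != b.Kg || a.M != b.M || a.S != b.S)
            throw new InvalidOperationException("Units must match for addition");
        return new Quantity(a.Scalar + b.Scalar, a.Kg, a.M, a.S);
    }
}

static class UnitDemo
{
    public static Quantity Force()
    {
        var mass = new Quantity(2.0, 1, 0, 0);           // 2 kg
        var acceleration = new Quantity(9.8, 0, 1, -2);  // 9.8 m/s^2
        return mass * acceleration;                      // kg^1 * m^1 * s^-2
    }

    public static bool MismatchedAdditionThrows()
    {
        try
        {
            var unused = new Quantity(1.0, 1, 0, 0) + new Quantity(1.0, 0, 1, 0); // kg + m
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    static void Main()
    {
        Quantity f = Force();
        Console.WriteLine("kg^" + f.Kg + " m^" + f.M + " s^" + f.S); // prints kg^1 m^1 s^-2
        Console.WriteLine(MismatchedAdditionThrows());               // prints True
    }
}
```

The mismatch is only detected when the addition runs, which is exactly the overhead and loss of compile-time checking mentioned above.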
Using separate classes for different units of the same measure (e.g., cm, mm, and ft for Length) seems kind of weird. Based on the .NET Framework's DateTime and TimeSpan classes, I would expect something like this:
Length length = Length.FromMillimeters(n1);
decimal lengthInFeet = length.Feet;
Length length2 = length.AddFeet(n2);
Length length3 = length + Length.FromMeters(n3);
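A rough sketch of what such a TimeSpan-style type could look like follows. It stores the value once in a base unit (meters) and exposes the other units as factories and properties. The member names mirror the calls above where possible; everything else is an assumption, including the use of double rather than decimal.

```csharp
using System;

readonly struct Length
{
    private readonly double meters;   // single internal representation

    private Length(double meters) { this.meters = meters; }

    public static Length FromMeters(double value) { return new Length(value); }
    public static Length FromMillimeters(double value) { return new Length(value / 1000.0); }
    public static Length FromFeet(double value) { return new Length(value * 0.3048); }

    public double Meters { get { return meters; } }
    public double Millimeters { get { return meters * 1000.0; } }
    public double Feet { get { return meters / 0.3048; } }

    public Length AddFeet(double feet) { return new Length(meters + feet * 0.3048); }

    public static Length operator +(Length a, Length b) { return new Length(a.meters + b.meters); }
}

static class LengthDemo
{
    static void Main()
    {
        Length length = Length.FromMillimeters(1500);
        Length total = length + Length.FromMeters(1);
        Console.WriteLine(length.Meters); // prints 1.5
        Console.WriteLine(total.Meters);  // prints 2.5
    }
}
```

Unlike the UnitDouble<T> approach, the unit here is not part of the static type, so mixing millimeters and feet is always allowed and always correct; the trade-off is one type per dimension rather than one generic type.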
You could add extension methods on numeric types to generate measures. It'd feel a bit DSL-like:
var mass = 1.Kilogram();
var length = (1.2).Kilometres();
It's not really .NET convention and might not be the most discoverable feature, so perhaps you'd add them in a devoted namespace for people who like them, as well as offering more conventional construction methods.
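A sketch of how those extension methods might be declared is shown below. The Mass and Distance structs and the MeasureExtensions class are placeholders invented for the example; in a real library the extension class would live in its own namespace, as suggested above, so that it is opt-in.

```csharp
using System;

readonly struct Mass
{
    public double Kilograms { get; }
    public Mass(double kilograms) { Kilograms = kilograms; }
}

readonly struct Distance
{
    public double Meters { get; }
    public Distance(double meters) { Meters = meters; }
}

static class MeasureExtensions
{
    // Extension methods on the numeric types produce the measures.
    public static Mass Kilogram(this int value) { return new Mass(value); }
    public static Distance Kilometres(this double value) { return new Distance(value * 1000.0); }
}

static class FluentDemo
{
    static void Main()
    {
        var mass = 1.Kilogram();          // member access on the int literal 1
        var length = (1.2).Kilometres();  // parentheses keep the double literal readable
        Console.WriteLine(mass.Kilograms);
        Console.WriteLine(length.Meters);
    }
}
```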
I recently released Units.NET on GitHub and on NuGet.
It gives you all the common units and conversions. It is light-weight, unit tested and supports PCL.
Example conversions:
Length meter = Length.FromMeters(1);
double cm = meter.Centimeters; // 100
double yards = meter.Yards; // 1.09361
double feet = meter.Feet; // 3.28084
double inches = meter.Inches; // 39.3701
Now such a C# library exists:
http://www.codeproject.com/Articles/413750/Units-of-Measure-Validator-for-Csharp
It has almost the same features as F#'s unit compile time validation, but for C#.
The core is an MSBuild task that parses the code and looks for unit validations.
The unit information is stored in comments and attributes.
Here's my concern with creating units in C#/VB. Please correct me if you think I'm wrong. Most implementations I've read about seem to involve creating a structure that pieces together a value (int or double) with a unit. Then you try to define basic functions (+-*/,etc) for these structures that take into account unit conversions and consistency.
I find the idea very attractive, but every time I balk at what a huge step for a project this appears to be. It looks like an all-or-nothing deal. You probably wouldn't just change a few numbers into units; the whole point is that all data inside a project is appropriately labeled with a unit to avoid any ambiguity. This means saying goodbye to ordinary doubles and ints; every variable is now defined as a "Unit" or "Length" or "Meters", etc. Do people really do this on a large scale? Even if you have a large array, every element should be marked with a unit. This will obviously have both size and performance ramifications.
Despite all the cleverness in trying to push the unit logic into the background, some cumbersome notation seems inevitable with C#. F# does some behind-the-scenes magic that better reduces the annoyance factor of the unit logic.
Also, how successfully can we make the compiler treat a unit just like an ordinary double when we so desire, without using CType or ".Value" or any additional notation? As with nullables, where the code knows to treat a double? just like a double (of course if your double? is null then you get an error).
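On that last question, one option in C# is a user-defined implicit conversion. The Meters type below is a hypothetical example, not taken from any library mentioned here.

```csharp
using System;

public readonly struct Meters
{
    public double Value { get; }
    public Meters(double value) { Value = value; }

    // Unit -> double is implicit, so a Meters can be passed anywhere a
    // double is expected without writing ".Value".
    public static implicit operator double(Meters m) => m.Value;

    // double -> unit is explicit, so attaching a unit stays a deliberate act.
    public static explicit operator Meters(double value) => new Meters(value);
}
```

With this, Math.Sqrt(new Meters(9.0)) compiles as is. The price is safety: if a second unit type had the same implicit conversion and neither type defined its own operator +, adding the two would compile as plain double addition.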
Thanks for the idea. I have implemented units in C# many different ways, and there always seems to be a catch. Now I can try one more time using the ideas discussed above. My goal is to be able to define new units based on existing ones, like
Unit lbf = 4.44822162*N;
Unit fps = feet/sec;
Unit hp = 550*lbf*fps;
and for the program to figure out the proper dimensions, scaling and symbol to use. In the end I need to build a basic algebra system that can convert things like (m/s)*(m*s)=m^2 and try to express the result based on existing units defined.
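A rough sketch of how the dimension and scaling part could work, assuming only three base dimensions (kg, m, s) and ignoring symbols entirely. The Unit struct and the Units class are illustrative names, not an existing API.

```csharp
using System;

// Hypothetical unit: a scale factor relative to SI plus exponents for kg, m, s.
public readonly struct Unit
{
    public double Scale { get; }
    public int Kg { get; }
    public int M { get; }
    public int S { get; }

    public Unit(double scale, int kg, int m, int s)
    {
        Scale = scale;
        Kg = kg;
        M = m;
        S = s;
    }

    // Multiplying units multiplies scales and adds exponents.
    public static Unit operator *(Unit a, Unit b) =>
        new Unit(a.Scale * b.Scale, a.Kg + b.Kg, a.M + b.M, a.S + b.S);

    // Dividing units divides scales and subtracts exponents.
    public static Unit operator /(Unit a, Unit b) =>
        new Unit(a.Scale / b.Scale, a.Kg - b.Kg, a.M - b.M, a.S - b.S);

    // A plain number only changes the scale.
    public static Unit operator *(double k, Unit u) =>
        new Unit(k * u.Scale, u.Kg, u.M, u.S);

    public bool SameDimensions(Unit other) =>
        Kg == other.Kg && M == other.M && S == other.S;
}

public static class Units
{
    public static readonly Unit Kg = new Unit(1, 1, 0, 0);
    public static readonly Unit M = new Unit(1, 0, 1, 0);
    public static readonly Unit Sec = new Unit(1, 0, 0, 1);

    public static readonly Unit N = Kg * M / (Sec * Sec);
    public static readonly Unit Feet = 0.3048 * M;
    public static readonly Unit Lbf = 4.44822162 * N;
    public static readonly Unit Fps = Feet / Sec;
    public static readonly Unit Hp = 550 * Lbf * Fps;
}
```

Here Units.Hp ends up with exponents (1, 2, -3), the dimensions of a watt, and (M / Sec) * (M * Sec) reduces to m^2. Choosing a symbol for the result by matching against the defined units is the part this sketch leaves out.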
Another requirement is to be able to serialize the units so that new units do not need to be coded, just declared in an XML file like this:
<DefinedUnits>
<DirectUnits>
<!-- Base Units -->
<DirectUnit Symbol="kg" Scale="1" Dims="(1,0,0,0,0)" />
<DirectUnit Symbol="m" Scale="1" Dims="(0,1,0,0,0)" />
<DirectUnit Symbol="s" Scale="1" Dims="(0,0,1,0,0)" />
...
<!-- Derived Units -->
<DirectUnit Symbol="N" Scale="1" Dims="(1,1,-2,0,0)" />
<DirectUnit Symbol="R" Scale="1.8" Dims="(0,0,0,0,1)" />
...
</DirectUnits>
<IndirectUnits>
<!-- Composite Units -->
<IndirectUnit Symbol="m/s" Scale="1" Lhs="m" Op="Divide" Rhs="s"/>
<IndirectUnit Symbol="km/h" Scale="1" Lhs="km" Op="Divide" Rhs="hr"/>
...
<IndirectUnit Symbol="hp" Scale="550.0" Lhs="lbf" Op="Multiply" Rhs="fps"/>
</IndirectUnits>
</DefinedUnits>
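Reading the DirectUnit entries from such a file could look roughly like this with LINQ to XML. The UnitDefinition and UnitCatalog types are made up for the sketch, and resolving the IndirectUnit entries (looking up Lhs and Rhs and combining them) is not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

// Hypothetical in-memory form of one <DirectUnit> element.
public sealed class UnitDefinition
{
    public string Symbol { get; }
    public double Scale { get; }
    public int[] Dims { get; }

    public UnitDefinition(string symbol, double scale, int[] dims)
    {
        Symbol = symbol;
        Scale = scale;
        Dims = dims;
    }
}

public static class UnitCatalog
{
    // Reads every <DirectUnit> element and indexes the result by symbol.
    public static Dictionary<string, UnitDefinition> LoadDirectUnits(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        var result = new Dictionary<string, UnitDefinition>();
        foreach (XElement e in doc.Descendants("DirectUnit"))
        {
            string symbol = (string)e.Attribute("Symbol");
            // Invariant culture so "1.8" parses the same on every machine.
            double scale = double.Parse((string)e.Attribute("Scale"), CultureInfo.InvariantCulture);
            int[] dims = ParseDims((string)e.Attribute("Dims"));
            result[symbol] = new UnitDefinition(symbol, scale, dims);
        }
        return result;
    }

    // Turns "(1,1,-2,0,0)" into { 1, 1, -2, 0, 0 }.
    private static int[] ParseDims(string text)
    {
        return text.Trim('(', ')')
                   .Split(',')
                   .Select(p => int.Parse(p.Trim(), CultureInfo.InvariantCulture))
                   .ToArray();
    }
}
```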
There is JScience: http://jscience.org/, and here is a Groovy DSL for units: http://groovy.dzone.com/news/domain-specific-language-unit-. IIRC, C# has closures, so you should be able to cobble something up.
Why not use CodeDom to generate all possible permutations of the units automatically? I know it's not the best, but it will definitely work!
You could use QuantitySystem instead of implementing it on your own. It builds on F# and drastically improves unit handling in F#. It's the best implementation I have found so far, and it can be used in C# projects.
http://quantitysystem.codeplex.com
Is it worth using?
Yes. If I have "a number" in front of me, I want to know what it is, any time of the day. Besides, that's what we usually do: we organize data into a meaningful entity (class, struct, you name it). Doubles into coordinates, strings into names and addresses, etc. Why should units be any different?
Has someone else already done better?
Depends on how one defines "better". There are some libraries out there but I haven't tried them so I don't have an opinion. Besides it spoils the fun of trying it myself :)
Now about the implementation. I would like to start with the obvious: it's futile to try to replicate the [<Measure>] system of F# in C#. Why? Because once F# allows you to use / ^ (or anything else for that matter) directly on another type, the game is lost. Good luck doing that in C# on a struct or class. The level of metaprogramming required for such a task does not exist, and in my opinion it is not going to be added any time soon. That's why you lack the dimensional analysis that Matthew Crumley mentioned in his answer.
Let's take the example from fsharpforfunandprofit.com: you have Newtons defined as [<Measure>] type N = kg m/sec^2. Now you have the square function that the author created, which returns an N^2. That sounds "wrong", absurd and useless, unless you want to perform arithmetic where an intermediate result is "meaningless" until you multiply it with some other unit and get a meaningful result. Or even worse, you might want to use constants, for example the gas constant R, which is 8.31446261815324 J/(K mol). If you define the appropriate units, then F# is ready to consume the R constant. C# is not. You need to specify another type just for that, and you still won't be able to do any operation you want on that constant.
That doesn't mean that you shouldn't try. I did, and I am quite happy with the results. I started SharpConvert around 3 years ago, after I got inspired by this very question. The trigger was this story: I once had to fix a nasty bug in the RADAR simulator that I develop, where an aircraft was plunging into the earth instead of following the predefined glide path. That didn't make me happy, as you can guess, and after 2 hours of debugging I realized that somewhere in my calculations I was treating kilometers as nautical miles. Until that point I was like "oh well, I will just be 'careful'", which is naive at best for any non-trivial task.
In your code there are a couple of things I would do differently.
First I would turn UnitDouble<T> and IUnit implementations into structs. A unit is just that, a number, and if you want it to be treated like a number, a struct is a more appropriate approach.
Then I would avoid the new T() in the methods. For a generic type parameter the compiler does not emit a direct constructor call; it emits a call to Activator.CreateInstance<T>(), and that overhead is bad for number crunching. How much it matters depends on the implementation: for a simple units converter application it won't do harm, but in a time-critical context avoid it like the plague. Don't get me wrong, I used it myself as I didn't know better. I ran some simple benchmarks the other day, and such a call might double the execution time, at least in my case. More details in "Dissecting the new() constraint in C#: a perfect example of a leaky abstraction".
I would also change Convert<T, R>() and make it a member function. I prefer writing
var c = new Unit<Length.mm>(123);
var e = c.Convert<Length.m>();
rather than
var e = Length.Convert<Length.mm, Length.m>(c);
Last but not least, I would use specific unit "shells" for each physical quantity (length, time, etc.) instead of the UnitDouble, as it will be easier to add quantity-specific functions and operator overloads. It will also allow you to create a Speed<TLength, TTime> shell instead of another Unit<T1, T2> or even Unit<T1, T2, T3> class. So it would look like this:
public readonly struct Length<T> where T : struct, ILength
{
private static readonly double SiFactor = new T().ToSiFactor;
public Length(double value)
{
if (value < 0) throw new ArgumentOutOfRangeException(nameof(value));
Value = value;
}
public double Value { get; }
public static Length<T> operator +(Length<T> first, Length<T> second)
{
return new Length<T>(first.Value + second.Value);
}
public static Length<T> operator -(Length<T> first, Length<T> second)
{
// I don't know any application where negative length makes sense,
// if it does feel free to remove Abs() and the exception in the constructor
return new Length<T>(System.Math.Abs(first.Value - second.Value));
}
// You can add more like
// public static Area<T> operator *(Length<T> x, Length<T> y)
// or
// public static Volume<T> operator *(Area<T> x, Length<T> y)
// etc
public Length<R> To<R>() where R : struct, ILength
{
//notice how I got rid of the Activator invocations by moving them in a static field;
//double mult = new T().ToSiFactor;
//double div = new R().ToSiFactor;
return new Length<R>(Value * SiFactor / Length<R>.SiFactor);
}
}
Notice also that, in order to save us from the dreaded Activator call, I stored the result of new T().ToSiFactor in SiFactor. It might seem awkward at first, but as Length is generic, Length<mm> will have its own copy, Length<Km> its own, and so on and so forth. Please note that ToSiFactor is the toBase of your approach.
The problem that I see is that as long as you are in the realm of simple units and up to the first derivative of time, things are simple. If you try to do something more complex, then you can see the drawbacks of this approach. Typing
var accel = new Acceleration<m, s, s>(1.2);
will not be as clear and "smooth" as
let accel = 1.2<m/sec^2>
And regardless of the approach, you will have to specify every math operation you will need with hefty operator overloading, while in F# you have this for free, even if the results are not meaningful as I was writing at the beginning.
The last drawback (or advantage, depending on how you see it) of this design is that it can't be unit-agnostic. If there are cases where you need "just a Length", you can't have it. You need to know each time whether your Length is in millimeters, statute miles or feet. I took the opposite approach in SharpConvert: LengthUnit derives from UnitBase, and Meters, Kilometers, etc. derive from LengthUnit. That's why I couldn't go down the struct path, by the way. That way you can have:
LengthUnit l1 = new Meters(12);
LengthUnit l2 = new Feet(15.4);
LengthUnit sum = l1 + l2;
sum will be in meters, but one shouldn't care as long as they only want to use it in the next operation. If they want to display it, they can call sum.To<Kilometers>() or whatever unit. To be honest, I don't know if not "locking" the variable to a specific unit has any advantages. It might be worth investigating at some point.
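To make the unit-agnostic idea concrete, here is a minimal sketch. It is not the actual SharpConvert code: UnitBase is left out, and the SiFactor and InMeters members and the Feet.From helper are names invented for the illustration.

```csharp
using System;

// Unit-agnostic base: callers can hold "just a length" without
// knowing which concrete unit is behind it.
public abstract class LengthUnit
{
    // How many meters one of this unit is worth.
    protected abstract double SiFactor { get; }

    public double Value { get; }

    protected LengthUnit(double value) { Value = value; }

    public double InMeters => Value * SiFactor;

    // Mixed-unit arithmetic goes through the SI value; the result is in meters.
    public static LengthUnit operator +(LengthUnit a, LengthUnit b) =>
        new Meters(a.InMeters + b.InMeters);
}

public sealed class Meters : LengthUnit
{
    public Meters(double value) : base(value) { }
    protected override double SiFactor => 1.0;
}

public sealed class Feet : LengthUnit
{
    private const double MetersPerFoot = 0.3048;

    public Feet(double value) : base(value) { }
    protected override double SiFactor => MetersPerFoot;

    // Explicit conversion for when a specific unit is needed, e.g. for display.
    public static Feet From(LengthUnit other) => new Feet(other.InMeters / MetersPerFoot);
}
```

Since the concrete units share an abstract base, they have to be classes, which is the struct trade-off mentioned above.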
I would like the compiler to help me as much as possible. So maybe you could have a TypedInt<T>, where T contains the actual unit.
public struct TypedInt<T>
{
public int Value { get; }
public TypedInt(int value) => Value = value;
public static TypedInt<T> operator -(TypedInt<T> a, TypedInt<T> b) => new TypedInt<T>(a.Value - b.Value);
public static TypedInt<T> operator +(TypedInt<T> a, TypedInt<T> b) => new TypedInt<T>(a.Value + b.Value);
public static TypedInt<T> operator *(int a, TypedInt<T> b) => new TypedInt<T>(a * b.Value);
public static TypedInt<T> operator *(TypedInt<T> a, int b) => new TypedInt<T>(a.Value * b);
public static TypedInt<T> operator /(TypedInt<T> a, int b) => new TypedInt<T>(a.Value / b);
// todo: m² or m/s
// todo: more than just ints
// todo: other operations
public override string ToString() => $"{Value} {typeof(T).Name}";
}
You could have an extension method to set the type (or just use new):
public static class TypedInt
{
public static TypedInt<T> Of<T>(this int value) => new TypedInt<T>(value);
}
The actual units can be anything. That way, the system is extensible.
(There are multiple ways of handling conversions. What do you think is best?)
public class Mile
{
// todo: conversion from mile to/from meter
// maybe define an interface like ITypedConvertible<Meter>
// conversion probably needs reflection, but there may be
// a faster way
};
public class Second
{
}
This way, you can use:
var distance1 = 10.Of<Mile>();
var distance2 = 15.Of<Mile>();
var timespan1 = 4.Of<Second>();
Console.WriteLine(distance1 + distance2);
//Console.WriteLine(distance1 + 5); // this will be blocked by the compiler
//Console.WriteLine(distance1 + timespan1); // this will be blocked by the compiler
Console.WriteLine(3 * distance1);
Console.WriteLine(distance1 / 3);
//Console.WriteLine(distance1 / timespan1); // todo!
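For the conversion todo, one way to avoid reflection is to make the unit tags structs that carry their own factor, so the generic code can read it through default(T). This is only a sketch under that assumption: it uses double rather than int so conversions don't truncate, and ILengthTag, TypedDouble and the struct versions of Meter and Mile are hypothetical names.

```csharp
using System;

// A tag type that knows how it relates to the base unit (meters).
public interface ILengthTag
{
    double MetersPerUnit { get; }
}

public struct Meter : ILengthTag
{
    public double MetersPerUnit => 1.0;
}

public struct Mile : ILengthTag
{
    public double MetersPerUnit => 1609.344;
}

public readonly struct TypedDouble<T> where T : struct, ILengthTag
{
    public double Value { get; }

    public TypedDouble(double value) { Value = value; }

    // default(T) gives a tag instance without reflection or Activator,
    // because T is constrained to be a struct.
    public TypedDouble<R> To<R>() where R : struct, ILengthTag =>
        new TypedDouble<R>(Value * default(T).MetersPerUnit / default(R).MetersPerUnit);

    public override string ToString() => $"{Value} {typeof(T).Name}";
}
```

The constraint on ILengthTag also means a length cannot be converted to a Second, since Second would not implement that interface.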
See Boo Ometa (which will be available for Boo 1.0):
Boo Ometa and Extensible Parsing
I really liked reading through this Stack Overflow question and its answers.
I have a pet project that I've tinkered with over the years. I recently started rewriting it and have released it as open source at https://github.com/MafuJosh/NGenericDimensions
It happens to be somewhat similar to many of the ideas expressed in the question and answers of this page.
It basically is about creating generic dimensions, with the unit of measure and the native datatype as the generic type placeholders.
For example:
Dim myLength1 as New Length(of Miles, Int16)(123)
With also some optional use of Extension Methods like:
Dim myLength2 = 123.miles
And
Dim myLength3 = myLength1 + myLength2
Dim myArea1 = myLength1 * myLength2
This would not compile:
Dim myValue = 123.miles + 234.kilograms
New units can be extended in your own libraries.
These datatypes are structures that contain only 1 internal member variable, making them lightweight.
Basically, the operator overloads are restricted to the "dimension" structures, so that every unit of measure doesn't need operator overloads.
Of course, a big downside is the longer declaration of the generics syntax that requires 3 datatypes. So if that is a problem for you, then this isn't your library.
The main purpose was to be able to decorate an interface with units in a compile-time checking fashion.
There is a lot that needs to be done to the library, but I wanted to post it in case it was the kind of thing someone was looking for.