Are public fields ever OK? - c#

Before you react from the gut, as I did initially, read the whole question please. I know they make you feel dirty, I know we've all been burned before and I know it's not "good style" but, are public fields ever ok?
I'm working on a fairly large-scale engineering application that creates and works with an in-memory model of a structure (anything from a high-rise building to a bridge to a shed, it doesn't matter). There is a TON of geometric analysis and calculation involved in this project. To support this, the model is composed of many tiny immutable read-only structs to represent things like points, line segments, etc. Some of the values of these structs (like the coordinates of the points) are accessed tens or hundreds of millions of times during a typical program execution. Because of the complexity of the models and the volume of calculation, performance is absolutely critical.
I feel that we're doing everything we can to optimize our algorithms, performance test to determine bottlenecks, use the right data structures, etc. I don't think this is a case of premature optimization. Performance tests show order-of-magnitude (at least) performance boosts when accessing fields directly rather than through a property on the object. Given this information, and the fact that we can also expose the same information as properties to support data binding and other situations... is this OK? Remember, read-only fields on immutable structs. Can anyone think of a reason I'm going to regret this?
Here's a sample test app:
struct Point {
public Point(double x, double y, double z) {
_x = x;
_y = y;
_z = z;
}
public readonly double _x;
public readonly double _y;
public readonly double _z;
public double X { get { return _x; } }
public double Y { get { return _y; } }
public double Z { get { return _z; } }
}
class Program {
static void Main(string[] args) {
const int loopCount = 10000000;
var point = new Point(12.0, 123.5, 0.123);
var sw = new Stopwatch();
double x, y, z;
double calculatedValue;
sw.Start();
for (int i = 0; i < loopCount; i++) {
x = point._x;
y = point._y;
z = point._z;
calculatedValue = point._x * point._y / point._z;
}
sw.Stop();
double fieldTime = sw.ElapsedMilliseconds;
Console.WriteLine("Direct field access: " + fieldTime);
sw.Reset();
sw.Start();
for (int i = 0; i < loopCount; i++) {
x = point.X;
y = point.Y;
z = point.Z;
calculatedValue = point.X * point.Y / point.Z;
}
sw.Stop();
double propertyTime = sw.ElapsedMilliseconds;
Console.WriteLine("Property access: " + propertyTime);
double totalDiff = propertyTime - fieldTime;
Console.WriteLine("Total difference: " + totalDiff);
double averageDiff = totalDiff / loopCount;
Console.WriteLine("Average difference: " + averageDiff);
Console.ReadLine();
}
}
result:
Direct field access: 3262
Property access: 24248
Total difference: 20986
Average difference: 0.00020986
It's only 21 seconds, but why not?

Your test isn't really being fair to the property-based versions. The JIT is smart enough to inline simple properties so that they have a runtime performance equivalent to that of direct field access, but it doesn't seem smart enough (today) to detect when the properties access constant values.
In your example, the entire loop body of the field access version is optimized away, becoming just:
for (int i = 0; i < loopCount; i++)
00000025 xor eax,eax
00000027 inc eax
00000028 cmp eax,989680h
0000002d jl 00000027
}
whereas the second version is actually performing the floating-point division on each iteration:
for (int i = 0; i < loopCount; i++)
00000094 xor eax,eax
00000096 fld dword ptr ds:[01300210h]
0000009c fdiv qword ptr ds:[01300218h]
000000a2 fstp st(0)
000000a4 inc eax
000000a5 cmp eax,989680h
000000aa jl 00000096
}
Making just two small changes to your application to make it more realistic makes the two operations practically identical in performance.
First, randomize the input values so that they aren't constants and the JIT isn't smart enough to remove the division entirely.
Change from:
Point point = new Point(12.0, 123.5, 0.123);
to:
Random r = new Random();
Point point = new Point(r.NextDouble(), r.NextDouble(), r.NextDouble());
Secondly, ensure that the results of each loop iteration are used somewhere:
Before each loop, set calculatedValue = 0 so they both start at the same point. After each loop call Console.WriteLine(calculatedValue.ToString()) to make sure that the result is "used" so the compiler doesn't optimize it away. Finally, change the body of the loop from "calculatedValue = ..." to "calculatedValue += ..." so that each iteration is used.
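Putting those changes together, the field-access loop of the test would look something like this (a sketch; the property loop gets the same three changes):
calculatedValue = 0;
sw.Start();
for (int i = 0; i < loopCount; i++) {
    x = point._x;
    y = point._y;
    z = point._z;
    calculatedValue += point._x * point._y / point._z; // accumulate so every iteration matters
}
sw.Stop();
Console.WriteLine(calculatedValue); // "use" the result so the JIT can't discard the work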
On my machine, these changes (with a release build) yield the following results:
Direct field access: 133
Property access: 133
Total difference: 0
Average difference: 0
Just as we expect, the x86 for each of these modified loops is identical (except for the loop address)
000000dd xor eax,eax
000000df fld qword ptr [esp+20h]
000000e3 fmul qword ptr [esp+28h]
000000e7 fdiv qword ptr [esp+30h]
000000eb fstp st(0)
000000ed inc eax
000000ee cmp eax,989680h
000000f3 jl 000000DF (This loop address is the only difference)

Given that you deal with immutable objects with readonly fields, I would say that you have hit the one case when I don't find public fields to be a dirty habit.

IMO, the "no public fields" rule is one of those rules which are technically correct, but unless you are designing a library intended to be used by the public it is unlikely to cause you any problem if you break it.
Before I get too massively downvoted, I should add that encapsulation is a good thing. Given the invariant "the Value property must be null if HasValue is false", this design is flawed:
class A {
public bool HasValue;
public object Value;
}
However, given that invariant, this design is equally flawed:
class A {
public bool HasValue { get; set; }
public object Value { get; set; }
}
The correct design is
class A {
public bool HasValue { get; private set; }
public object Value { get; private set; }
public void SetValue(bool hasValue, object value) {
if (!hasValue && value != null)
throw new ArgumentException();
this.HasValue = hasValue;
this.Value = value;
}
}
(and even better would be to provide an initializing constructor and make the class immutable).
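For example, a sketch of that immutable variant (using get-only auto-properties, which need C# 6 or later; on older compilers you'd use readonly fields behind read-only properties):
class A {
    public bool HasValue { get; }
    public object Value { get; }
    public A(bool hasValue, object value) {
        if (!hasValue && value != null)
            throw new ArgumentException("value must be null when hasValue is false");
        HasValue = hasValue;
        Value = value;
    }
}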

I know you feel kind of dirty doing this, but it isn't uncommon for rules and guidelines to get shot to hell when performance becomes an issue. For example, quite a few high traffic websites using MySQL have data duplication and denormalized tables. Others go even crazier.
Moral of the story - it may go against everything you were taught or advised, but the benchmarks don't lie. If it works better, just do it.

If you really need that extra performance, then it's probably the right thing to do. If you don't need the extra performance then it's probably not.
Rico Mariani has a couple of related posts:
Ten Questions on Value-Based Programming
Ten Questions on Value-Based Programming : Solution

Personally, the only time I would consider using public fields is in a very implementation-specific private nested class.
Other times it just feels too "wrong" to do it.
The CLR will take care of performance by optimising out the method/property (in release builds) so that shouldn't be an issue.

Not that I disagree with the other answers, or with your conclusion... but I'd like to know where you got the order-of-magnitude performance difference from. As I understand it, any simple property (with no code other than direct access to the field) should get inlined by the JIT compiler as a direct field access anyway.
The advantage of using properties even in these simple cases (in most situations) is that writing it as a property allows for future changes that might modify the property. (Although in your case there wouldn't be any such changes, of course.)

Try compiling a release build and running directly from the exe instead of through the debugger. If the application was run through a debugger then the JIT compiler will not inline the property accessors. I was not able to replicate your results. In fact, each test I ran indicated that there was virtually no difference in execution time.
But, like the others, I am not completely opposed to direct field access, especially because it is easy to make the field private and add a public property accessor at a later time without needing any other code modifications to get the application to compile.
Edit: Okay, my initial tests used an int data type instead of double. I see a huge difference when using doubles. With ints the direct vs. property is virtually the same. With doubles property access is about 7x slower than direct access on my machine. This is somewhat puzzling to me.
Also, it is important to run the tests outside of the debugger. Even in release builds the debugger adds overhead which skews the results.

Here's some scenarios where it is OK (from the Framework Design Guidelines book):
DO use constant fields for constants that will never change.
DO use public static readonly fields for predefined object instances.
And where it is not:
DO NOT assign instances of mutable types to readonly fields.
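A quick sketch of those guidelines (the names here are purely illustrative; Point is the struct from the question):
public static class PhysicalConstants {
    public const double SpeedOfLightMetresPerSecond = 299792458.0; // DO: a constant that will never change
    public static readonly Point Origin = new Point(0.0, 0.0, 0.0); // DO: a predefined instance of an immutable type
    // DO NOT: public static readonly List<double> Cache = new List<double>(); // readonly doesn't stop callers mutating the list
}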
From what you have stated, I don't understand why your trivial properties don't get inlined by the JIT.

If you modify your test to use the temp variables you assign, rather than directly accessing the properties in your calculation, you will see a large performance improvement:
sw.Start();
for (int i = 0; i < loopCount; i++)
{
x = point._x;
y = point._y;
z = point._z;
calculatedValue = x * y / z;
}
sw.Stop();
double fieldTime = sw.ElapsedMilliseconds;
Console.WriteLine("Direct field access: " + fieldTime);
sw.Reset();
sw.Start();
for (int i = 0; i < loopCount; i++)
{
x = point.X;
y = point.Y;
z = point.Z;
calculatedValue = x * y / z;
}
sw.Stop();

Perhaps I'll repeat someone else, but here's my point too if it may help.
Teachings are there to give you the tools you need to achieve a certain level of ease when encountering such situations.
The Agile software development methodology says that you first have to deliver the product to your client, no matter what your code might look like. Second, you may optimize and make your code "beautiful", or bring it in line with the state of the art.
Here, either you or your client require performance. Within your project, PERFORMANCE is CRUCIAL, if I understand correctly.
So I guess you'll agree with me that we don't care what the code might look like or whether it respects the "art". Do what you have to do to make it performant and powerful! Properties allow your code to "format" the data I/O if required, but a property getter is still a method call, and if it isn't inlined you pay that call on every access. If performance is that critical, just do it, and make your immutable members public. :-)
This reflects some other points of view too, if I read correctly. :)
Have a good day!

Types which encapsulate functionality should use properties. Types which only serve to hold data should use public fields, except in the case of immutable classes (where wrapping fields in read-only properties is the only way to reliably protect them against modification). Exposing members as public fields essentially proclaims "these members may be freely modified at any time without regard for anything else". If the type in question is a class type, it further proclaims "anyone who exposes a reference to this thing will be allowing the recipient to change these members at any time in any fashion they see fit." While one shouldn't expose public fields in cases where such a proclamation would be inappropriate, one should expose public fields in cases where such a proclamation would be appropriate and client code could benefit from the assumptions enabled thereby.

Related

How to use properties with an integral overflow (byte)?

I am fairly new to programming and C#, and I am creating a game using C# 9.0 in which all instances of Entity have certain stats. I want to be able to change their private data fields using properties, though I'm not entirely sure how properties work. I know they are useful in encapsulation as getters and setters.
Context:
I am trying to optimize code and decrease memory usage where possible
The byte field str should be variable (through events, training, etc.), but have a "ceiling" and "floor"
If dog.str = 253, then dog.Str += 5; should result in dog.str being 255
If dog.str = 2, then dog.Str -= 5; should result in dog.str being 0
private byte str;
public short Str
{
get => str;
set
{
if (value > byte.MaxValue) str = byte.MaxValue; //Pos Overflow
else if (value < byte.MinValue) str = byte.MinValue; //Neg Overflow
else str = (byte)value;
}
}
Questions:
Since the property is of datatype Short, does it create a new private backing field that consumes memory? Or is value/Str{set;} just a local variable that later disappears?
Does the property public float StrMod {get => (float)(str*Effects.Power);} create a backing field? Would it be better to just create a method like public float getStrMod() instead?
Is this code optimal for what I'm trying to achieve? Is there some better way to do this, considering the following?
If for some reason the Short overflowed (unlikely in this scenario, but there may be a similar situation), then I would end up with the same problem. However, if extra memory allocation isn't an issue, then I could use an int.
The {get;} will return a Short, which may or may not be an issue.
Question 1:
No it doesn't, its backing field is str.
Question 2:
Profile your code first instead of making random changes in the hope of reducing memory usage.
"Premature optimization is the root of all evil" - do you really have such issues at this point?
Personally, I'd use int, and use the same type for the property and the backing field, for simplicity.
This also avoids wrapping, such as assigning 32768 and ending up with -32768 for a short.
Side note: don't assume that using byte necessarily results in 1 byte of storage; if you have tight packing requirements, you need to look at StructLayoutAttribute.Pack.
Other than that I see nothing wrong with your code, just get it to work first then optimize it!
Here's how I'd write your code, maybe you'll get some ideas from it:
class Testing
{
private int _value;
public int Value
{
get => _value;
set => _value = Clamp(value, byte.MinValue, byte.MaxValue);
}
private static int Clamp(int value, int min, int max)
{
return Math.Max(min, Math.Min(max, value));
}
}
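As an aside, on .NET Core 2.0 or later (and C# 9 normally means .NET 5), Math.Clamp does the same thing as the helper above, so the setter could simply be:
set => _value = Math.Clamp(value, byte.MinValue, byte.MaxValue);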
EDIT:
Different scenarios:
class Testing
{
private int _value1;
public int Value1 // backing field is _value1
{
get => _value1;
set => _value1 = value;
}
public int Value2 { get; set; } // adds a backing field
public int Value3 { get; } // adds a backing field
public int Value4 => 42; // no backing field
}
As you might have guessed, properties are syntactic sugar for methods; they can do whatever they like under the hood, whereas a field can only be read or assigned.
Also, unlike a method, you can browse a property's value in the debugger, which is handy.
Suggested reading:
https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/classes-and-structs/properties
Finally, properties are expected to return quickly; otherwise write a method, possibly an async one if it's going to take a while (an advantage of methods here, since properties can't be async).
Aybe's answer covers the main thing about your question. I would like to add additional info for your 2nd question. You should consider which platform you are writing the application for. There is the term word:
In computing, a word is the natural unit of data used by a particular
processor design. A word is a fixed-sized piece of data handled as a
unit by the instruction set or the hardware of the processor. The
number of bits in a word (the word size, word width, or word length)
is an important characteristic of any specific processor design or
computer architecture.
If a processor has a 64-bit word, then local variables of types smaller than 64 bits typically still occupy a full word on the stack or in a register; it's mainly packed fields and arrays that actually save memory. Keep in mind that a variable of a given type will still be handled as that type: its size in memory doesn't affect its range, overflow, or underflow - arithmetic is still performed for the given type.
In short: if you have a 64-bit desktop processor and you use only short local variables, you will not observe any memory savings compared to declaring int variables.

Difference between copying reference of a class and copying value

For an experiment, I tried this:
(1) Create 100000 classes, each of them wrapping a double variable
---This is the experiment part---
(2) Measured the performance of two methods, each run 100000 times:
create a double[] and assign the values of the wrapped variables.
create a class[] and assign references to the wrapping classes.
The above may confuse you, so I am attaching the code:
static void Main(string[] args)
{
int length = 100000;
Test test = new Test(length);
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
for (int i = 0; i < length; i++)
test.CopyValue();
//test.CopyReference(); //test.CopyValue(); or test.CopyReference();
stopwatch.Stop();
Console.WriteLine("RunTime : {0} ", stopwatch.ElapsedMilliseconds);
}
class DoubleWrapper
{
public double value = 0.0;
}
class Test
{
DoubleWrapper[] wrapper;
public void CopyValue()
{
double[] x = new double[wrapper.Length];
for (int i = 0; i < wrapper.Length; i++)
x[i] = wrapper[i].value;
}
public void CopyReference()
{
DoubleWrapper[] x = new DoubleWrapper[wrapper.Length];
for (int i = 0; i < wrapper.Length; i++)
x[i] = wrapper[i];
}
public Test(int length)
{
wrapper = new DoubleWrapper[length];
for (int i = 0; i < length; i++)
wrapper[i] = new DoubleWrapper();
}
}
The result is as follows:
test.CopyValue() : 56890 (millisec)
test.CopyReference() : 66688 (millisec)
(built with release configuration and ran exe)
I tried several times, but the result doesn't change much.
So I concluded that CopyReference() takes longer.
But I can hardly understand why. Here is the question:
I thought that, regardless of CopyValue() or CopyReference(), what my machine does is "copy a number in memory", though one is a double value and the other is a reference to a class. So there should not be a meaningful difference in performance, but in fact there is.
Then what is the difference between copying a value and copying a reference?
Does copying a reference do more thing than copying a value?
(When passing a reference without the ref keyword, isn't it true that the reference is copied as if it were a value? What I am saying is that,
ClassA x = new ClassA();
ClassA y = x;
means making a copy of "reference of x" and then assigning to variable y, consequently y = null doesn't affect x at all. Is this true?)
If I am working with wrong assumptions, please let me know what I am wrong with.
I appreciate your help and advice.
-
I guessed that GC might have some impact, but turning off GC by TryStartNoGCRegion(Int64) doesn't change the conclusion.
(both become faster, but still CopyReference() is slower.)
means making a copy of "reference of x" and then assigning to variable
y, consequently y = null doesn't affect x at all. Is this true?
That's correct - you made a copy of the reference.
Now, about why your implementation of CopyReference is slower.
You have to do real performance analysis to get solid insight, but off the top of my head:
CopyReference allocates a new DoubleWrapper[], an array of references, and storing a reference into a heap array is more than a plain 8-byte write - the runtime also performs a GC write barrier (and a covariance check for the array store), which a plain double store does not need.
Remember that C# is not a "zero-cost abstraction" language the way C/C++/Rust can be.
Even a simple reference type that does nothing more than wrap a primitive costs more than the primitive itself: because of the object header, a DoubleWrapper object is larger than 8 bytes, and the wrapped values end up scattered across the heap rather than packed contiguously like a double[].
Reading about the stack and the heap may help you understand. If you copy a reference, you essentially copy the pointer to the actual object on the heap, and if that actual object changes, those changes are visible through every reference to that object.
If you do a "deep copy", or a clone (as when implementing ICloneable), you duplicate the data itself and get a reference to the new copy, so you are no longer dependent on the original object.
I hope this explanation helps you a little bit. See this for some information on the stack and heap: https://www.c-sharpcorner.com/article/C-Sharp-heaping-vs-stacking-in-net-part-i/

What is the best way to calculate a formula when using an int as a divisor

Often I find myself with an expression where a division by an int is part of a larger formula. I will give you a simple example that illustrates the problem:
int a = 2;
int b = 4;
int c = 5;
int d = a * (b / c);
In this case, d equals 0 as expected, but I would like it to be 1, since 4/5 multiplied by 2 is 1 3/5, which when converted to int gets "rounded" to 1. So I find myself having to cast c to double, and then, since that makes the expression a double as well, casting the entire expression back to int. This code looks like this:
int a = 2;
int b = 4;
int c = 5;
int d = (int)(a * (b / (double)c));
In this small example it's not that bad, but in a big formula it gets quite messy.
Also, I guess that the casting will incur a (small) performance hit.
So my question is basically if there is any better approach to this than casting both divisor and result.
I know that in this example, changing a*(b/c) to (a*b)/c would solve the problem, but in larger real-life scenarios, making this change will not be possible.
EDIT (added a case from an existing program):
In this case I'm calculating the position of a scrollbar based on the size of the scrollbar and the size of its container. So if there are twice as many elements as fit on the page, the scrollbar will be half the height of the container, and if we have scrolled through half of the elements, the scroller should be moved 1/4 of the way down so it sits in the middle of the container. The calculations work as they should, and it displays fine. I just don't like how the expression looks in my code.
The important parts of the code is put and appended here:
int scrollerheight = (menusize.Height * menusize.Height) / originalheight;
int maxofset = originalheight - menusize.Height;
int scrollerposition = (int)((menusize.Height - scrollerheight) * (_overlayofset / (double)maxofset));
originalheight here is the height of all elements, so in the case described above this will be twice menusize.Height.
Disclaimer: I typed all this out, and then I thought, Should I even post this? I mean, it's a pretty bad idea and therefore doesn't really help the OP... In the end I figured, hey, I already typed it all out; I might as well go ahead and click "Post Your Answer." Even though it's a "bad" idea, it's kind of interesting (to me, anyway). So maybe you'll benefit in some strange way by reading it.
For some reason I have a suspicion the above disclaimer's not going to protect me from downvotes, though...
Here's a totally crazy idea.
I would actually not recommend putting this into any sort of production environment, at all, because I literally thought of it just now, which means I haven't really thought it through completely, and I'm sure there are about a billion problems with it. It's just an idea.
But the basic concept is to create a type that can be used for arithmetic expressions, internally using a double for every term in the expression, only to be evaluated as the desired type (in this case: int) at the end.
You'd start with a type like this:
// Probably you'd make this implement IEquatable<Term>, IEquatable<double>, etc.
// Probably you'd also give it a more descriptive, less ambiguous name.
// Probably you also just flat-out wouldn't use it at all.
struct Term
{
readonly double _value;
internal Term(double value)
{
_value = value;
}
public override bool Equals(object obj)
{
// You would want to override this, of course...
}
public override int GetHashCode()
{
// ...as well as this...
return _value.GetHashCode();
}
public override string ToString()
{
// ...as well as this.
return _value.ToString();
}
}
Then you'd define implicit conversions to/from double and the type(s) you want to support (again: int). Like this:
public static implicit operator Term(int x)
{
return new Term((double)x);
}
public static implicit operator int(Term x)
{
return (int)x._value;
}
// ...and so on.
Next, define the operations themselves: Plus, Minus, etc. In the case of your example code, we'd need Times (for *) and DividedBy (for /):
public Term Times(Term multiplier)
{
// This would work because you would've defined an implicit conversion
// from double to Term.
return _value * multiplier._value;
}
public Term DividedBy(Term divisor)
{
// Same as above.
return _value / divisor._value;
}
Lastly, write a static helper class to enable you to perform Term-based operations on whatever types you want to work with (probably just int for starters):
public static class TermHelper
{
public static Term Times(this int number, Term multiplier)
{
return ((Term)number).Times(multiplier);
}
public static Term DividedBy(this int number, Term divisor)
{
return ((Term)number).DividedBy(divisor);
}
}
What would all of this buy you? Practically nothing! But it would clean up your expressions, hiding away all those unsightly explicit casts, making your code significantly more attractive and considerably more impossible to debug. (Once again, this is not an endorsement, just a crazy-ass idea.)
So instead of this:
int d = (int)(a * (b / (double)c)); // Output: 1
You'd have this:
int d = a.Times(b.DividedBy(c)); // Output: 1
Is it worth it?
Well, if having to write casting operations were the worst thing in the world, like, even worse than relying on code that's too clever for its own good, then maybe a solution like this would be worth pursuing.
Since the above is clearly not true... the answer is a pretty emphatic NO. But I just thought I'd share this idea anyway, to show that such a thing is (maybe) possible.
First of all, C# truncates the result of int division, and also truncates when casting to int. There's no rounding.
There's no way to do b / c first without any conversions.
Multiply b times 100. Then divide by 100 at the end.
In this case, I would suggest using double instead, because you don't need 'exact' precision.
However, if you really want to do it all without floating-point operations, I would suggest creating some kind of fraction class. It's more complex and less efficient, but it lets you keep track of every dividend and divisor and perform the division once at the end.
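A minimal sketch of what such a fraction type might look like (a hypothetical type, just to illustrate the idea; a real one would need normalization, overflow checks and more operators):
struct Fraction
{
    public readonly long Numerator;
    public readonly long Denominator;
    public Fraction(long numerator, long denominator)
    {
        Numerator = numerator;
        Denominator = denominator;
    }
    // Only the operators needed for the a * (b / c) example above.
    public static Fraction operator *(int a, Fraction f) => new Fraction(a * f.Numerator, f.Denominator);
    public static Fraction operator /(Fraction f, int c) => new Fraction(f.Numerator, f.Denominator * c);
    // The single truncation happens here, at the very end.
    public int ToInt() => (int)(Numerator / Denominator);
}
Usage:
int a = 2, b = 4, c = 5;
int d = (a * (new Fraction(b, 1) / c)).ToInt(); // 2 * (4/5) = 8/5, truncated to 1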

Find size of object instance in bytes in c#

For any arbitrary instance (collections of different objects, compositions, single objects, etc)
How can I determine its size in bytes?
(I've currently got a collection of various objects and i'm trying to determine the aggregated size of it)
EDIT: Has someone written an extension method for Object that could do this? That'd be pretty neat imo.
First of all, a warning: what follows is strictly in the realm of ugly, undocumented hacks. Do not rely on this working - even if it works for you now, it may stop working tomorrow, with any minor or major .NET update.
You can use the information in this article on CLR internals MSDN Magazine Issue 2005 May - Drill Into .NET Framework Internals to See How the CLR Creates Runtime Objects - last I checked, it was still applicable. Here's how this is done (it retrieves the internal "Basic Instance Size" field via TypeHandle of the type).
// Note: this requires an unsafe context (compile with /unsafe).
object obj = new List<int>(); // whatever you want to get the size of
RuntimeTypeHandle th = obj.GetType().TypeHandle;
int size = *(*(int**)&th + 1);
Console.WriteLine(size);
This works on 3.5 SP1 32-bit. I'm not sure if field sizes are the same on 64-bit - you might have to adjust the types and/or offsets if they are not.
This will work for all "normal" types, for which all instances have the same, well-defined size. That isn't true for arrays and strings for sure, and I believe also StringBuilder; for those you'll have to add the size of all contained elements to their base instance size.
You may be able to approximate the size by pretending to serialize it with a binary serializer (but routing the output to oblivion), if you're working with serializable objects.
class Program
{
static void Main(string[] args)
{
A parent;
parent = new A(1, "Mike");
parent.AddChild("Greg");
parent.AddChild("Peter");
parent.AddChild("Bobby");
System.Runtime.Serialization.Formatters.Binary.BinaryFormatter bf =
new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();
SerializationSizer ss = new SerializationSizer();
bf.Serialize(ss, parent);
Console.WriteLine("Size of serialized object is {0}", ss.Length);
}
}
[Serializable()]
class A
{
int id;
string name;
List<B> children;
public A(int id, string name)
{
this.id = id;
this.name = name;
children = new List<B>();
}
public B AddChild(string name)
{
B newItem = new B(this, name);
children.Add(newItem);
return newItem;
}
}
[Serializable()]
class B
{
A parent;
string name;
public B(A parent, string name)
{
this.parent = parent;
this.name = name;
}
}
class SerializationSizer : System.IO.Stream
{
private int totalSize;
public override void Write(byte[] buffer, int offset, int count)
{
this.totalSize += count;
}
public override bool CanRead
{
get { return false; }
}
public override bool CanSeek
{
get { return false; }
}
public override bool CanWrite
{
get { return true; }
}
public override void Flush()
{
// Nothing to do
}
public override long Length
{
get { return totalSize; }
}
public override long Position
{
get
{
throw new NotImplementedException();
}
set
{
throw new NotImplementedException();
}
}
public override int Read(byte[] buffer, int offset, int count)
{
throw new NotImplementedException();
}
public override long Seek(long offset, System.IO.SeekOrigin origin)
{
throw new NotImplementedException();
}
public override void SetLength(long value)
{
throw new NotImplementedException();
}
}
This doesn't directly answer the question, but for those who are interested in investigating object sizes while debugging:
Start debugging in VS, make sure the Diagnostics Tools window is shown (Debug > Windows > Show Diagnostic Tools)
Set a breakpoint (optional)
Click Take Snapshot in the Memory Usage while paused
Explore the snapshot (optionally sort the object list alphabetically to find the type you're interested in)
For unmanaged types aka value types, structs:
Marshal.SizeOf(object);
For managed objects, the closest I got is an approximation.
long start_mem = GC.GetTotalMemory(true);
aclass[] array = new aclass[1000000];
for (int n = 0; n < 1000000; n++)
array[n] = new aclass();
double used_mem_median = (GC.GetTotalMemory(false) - start_mem)/1000000D;
Do not use serialization. A binary formatter adds headers and type metadata (that's what lets you change your class and still load an old serialized file into the modified class), so it won't tell you the real size in memory, nor will it take memory alignment into account.
[Edit]
By using BitConverter.GetBytes(propertyValue) recursively on every property of your class you would get the contents in bytes. That doesn't count the weight of the class itself or of references, but it is much closer to reality.
If size matters, I would recommend using a byte array for the data and an unmanaged proxy class to access values using pointer casting. Note that this means unaligned memory access, which is slow on older hardware, but for HUGE datasets in modern RAM it can be considerably faster, because minimizing the amount of data read from RAM has a bigger impact than the alignment penalty.
A safe solution, with some optimizations, is the CyberSaving/MemoryUsage code. Some example cases:
/* test nullable type */
TestSize<int?>.SizeOf(null) //-> 4 B
/* test StringBuilder */
StringBuilder sb = new StringBuilder();
for (int i = 0; i < 100; i++) sb.Append("わたしわたしわたしわ");
TestSize<StringBuilder>.SizeOf(sb ) //-> 3132 B
/* test Simple array */
TestSize<int[]>.SizeOf(new int[100]); //-> 400 B
/* test Empty List<int>*/
var list = new List<int>();
TestSize<List<int>>.SizeOf(list); //-> 205 B
/* test List<int> with 100 items*/
for (int i = 0; i < 100; i++) list.Add(i);
TestSize<List<int>>.SizeOf(list); //-> 717 B
It works also with classes:
class twostring
{
public string a { get; set; }
public string b { get; set; }
}
TestSize<twostring>.SizeOf(new twostring() { a="0123456789", b="0123456789" } //-> 28 B
This doesn't apply to the current .NET implementation, but one thing to keep in mind with garbage collected/managed runtimes is the allocated size of an object can change throughout the lifetime of the program. For example, some generational garbage collectors (such as the Generational/Ulterior Reference Counting Hybrid collector) only need to store certain information after an object is moved from the nursery to the mature space.
This makes it impossible to create a reliable, generic API to expose the object size.
This is impossible to do at runtime.
There are various memory profilers that display object size, though.
EDIT: You could write a second program that profiles the first one using the CLR Profiling API and communicates with it through remoting or something.
For anyone looking for a solution that doesn't require [Serializable] classes and where the result is an approximation instead of exact science: the best method I could find is JSON serialization into a MemoryStream using UTF-32 encoding.
private static long? GetSizeOfObjectInBytes(object item)
{
if (item == null) return 0;
try
{
// hackish solution to get an approximation of the size
var jsonSerializerSettings = new JsonSerializerSettings
{
DateFormatHandling = DateFormatHandling.IsoDateFormat,
DateTimeZoneHandling = DateTimeZoneHandling.Utc,
MaxDepth = 10,
ReferenceLoopHandling = ReferenceLoopHandling.Ignore
};
var formatter = new JsonMediaTypeFormatter { SerializerSettings = jsonSerializerSettings };
using (var stream = new MemoryStream()) {
formatter.WriteToStream(item.GetType(), item, stream, Encoding.UTF32);
return stream.Length / 4; // 32 bits per character = 4 bytes per character
}
}
catch (Exception)
{
return null;
}
}
No, this won't give you the exact size that would be used in memory. As previously mentioned, that is not possible. But it'll give you a rough estimation.
Note that this is also pretty slow.
Use Son of Strike (SOS), which has an ObjSize command.
Note that the actual memory consumed is always larger than ObjSize reports, due to a syncblk which resides directly before the object data.
Read more about both here MSDN Magazine Issue 2005 May - Drill Into .NET Framework Internals to See How the CLR Creates Runtime Objects.
AFAIK, you cannot, without actually deep-counting the size of each member in bytes. But again, does the size of a member (like the elements inside a collection) count towards the size of the object, or does only a pointer to that member count towards the size of the object? It depends on how you define it.
I have run into this situation before where I wanted to limit the objects in my cache based on the memory they consumed.
Well, if there is some trick to do that, I'd be delighted to know about it!
For value types, you can use Marshal.SizeOf. Of course, it returns the number of bytes required to marshal the structure in unmanaged memory, which is not necessarily what the CLR uses.
I have created benchmark test for different collections in .NET: https://github.com/scholtz/TestDotNetCollectionsMemoryAllocation
Results are as follows for .NET Core 2.2 with 1,000,000 objects with 3 properties allocated:
Testing with string: 1234567
Hashtable<TestObject>: 184 672 704 B
Hashtable<TestObjectRef>: 136 668 560 B
Dictionary<int, TestObject>: 171 448 160 B
Dictionary<int, TestObjectRef>: 123 445 472 B
ConcurrentDictionary<int, TestObject>: 200 020 440 B
ConcurrentDictionary<int, TestObjectRef>: 152 026 208 B
HashSet<TestObject>: 149 893 216 B
HashSet<TestObjectRef>: 101 894 384 B
ConcurrentBag<TestObject>: 112 783 256 B
ConcurrentBag<TestObjectRef>: 64 777 632 B
Queue<TestObject>: 112 777 736 B
Queue<TestObjectRef>: 64 780 680 B
ConcurrentQueue<TestObject>: 112 784 136 B
ConcurrentQueue<TestObjectRef>: 64 783 536 B
ConcurrentStack<TestObject>: 128 005 072 B
ConcurrentStack<TestObjectRef>: 80 004 632 B
For memory tests, I found GC.GetAllocatedBytesForCurrentThread() to be the best option.
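A rough sketch of how that can be used (it measures bytes allocated on the current thread, so it needs .NET Core / modern .NET; the JIT itself can allocate, so warm up first and average over many instances):
long before = GC.GetAllocatedBytesForCurrentThread();
var instance = new List<int>(100); // whatever you want to measure
long after = GC.GetAllocatedBytesForCurrentThread();
Console.WriteLine("Allocated roughly " + (after - before) + " bytes");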
For arrays of structs/values, I have different results with:
first = Marshal.UnsafeAddrOfPinnedArrayElement(array, 0).ToInt64();
second = Marshal.UnsafeAddrOfPinnedArrayElement(array, 1).ToInt64();
arrayElementSize = second - first;
(oversimplified example)
Whatever the approach, you really need to understand how .Net works to correctly interpret the results.
For instance, the returned element size is the "aligned" element size, with some padding.
The overhead and thus the size is different depending on the usage of a type: "boxed" on the GC heap, on the stack, as a field, as an array element.
(I wanted to know what would be the memory impact of using "dummy" empty structs (without any field) to mimic "optional" arguments of generics; making tests with different layouts involving empty structs, I can see that an empty struct uses (at least) 1 byte per element; I vaguely remember it is because .Net needs a different address for each field, which wouldn't work if a field really was empty/0-sized).
You can use reflection to gather all the public member or property information (given the object's type). There is no way to determine the size without walking through each individual piece of data on the object, though.
From Pavel and jnm2:
private int DumpApproximateObjectSize(object toWeight)
{
return Marshal.ReadInt32(toWeight.GetType().TypeHandle.Value, 4);
}
On a side note, be careful: it only works with contiguous-memory objects.
Simplest way is: int size = *((int*)type.TypeHandle.Value + 1)
I know this is an implementation detail, but the GC relies on it, and it needs to stay close to the start of the method table for efficiency; also, considering how complex the GC code is, nobody is likely to dare change it in the future. In fact it works for every minor/major version of .NET Framework and .NET Core. (Currently unable to test for 1.0.)
If you want a more reliable way: emit a struct in a dynamic assembly with [StructLayout(LayoutKind.Auto)] with exactly the same fields in the same order, and take its size with the sizeof IL instruction. You may want to emit a static method within the struct which simply returns this value. Then add 2 * IntPtr.Size for the object header. This should give you the exact value.
But if your class derives from another class, you need to find the size of each base class separately and add them, plus 2 * IntPtr.Size again for the header. You can do this by getting fields with the BindingFlags.DeclaredOnly flag.
Arrays and strings just add length * element size to that.
For the cumulative size of aggregate objects you need to implement a more sophisticated solution, one that involves visiting every field and inspecting its contents.
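A very rough sketch of that field-visiting idea (a hypothetical helper, not exact science: it ignores object headers, padding and alignment, uses marshaling sizes for primitives - which differ from managed sizes for bool and char - skips private fields declared on base classes, and only guards against reference cycles):
// uses System, System.Collections.Generic, System.Reflection, System.Runtime.InteropServices
static long ApproximateSize(object obj, HashSet<object> visited)
{
    if (obj == null) return 0;
    Type type = obj.GetType();
    // Track only reference types for cycle detection; boxed value types are always fresh objects.
    if (!type.IsValueType && !visited.Add(obj)) return 0;
    if (type.IsPrimitive) return Marshal.SizeOf(obj);
    if (type.IsEnum) return Marshal.SizeOf(Enum.GetUnderlyingType(type));
    if (type == typeof(string)) return 2L * ((string)obj).Length;
    if (type.IsArray)
    {
        long total = 0;
        foreach (object element in (Array)obj)
            total += ApproximateSize(element, visited);
        return total;
    }
    long size = 0;
    foreach (FieldInfo field in type.GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic))
        size += ApproximateSize(field.GetValue(obj), visited);
    return size;
}
// Usage: long bytes = ApproximateSize(someObject, new HashSet<object>());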
For anyone looking for a rough approximation comparing the sizes of disparate object graphs/collections, just serialize to JSON - e.g.:
Console.WriteLine($"Size1:\t{(JsonConvert.SerializeObject(someBusyObject)).Length}")); Console.WriteLine($"Size2:\t{(JsonConvert.SerializeObject(someOtherObject)).Length}"));
In my case I have a bunch of IEnumerable's being pulled during a login I'm benchmarking, and I just wanted to roughly size them to see their relative weight.
They're expensive operations and won't give you direct heap allocation size or anything like that, but it was good enough for my use case and was readily available.

Units of measure in C# - almost

Inspired by Units of Measure in F#, and despite asserting (here) that you couldn't do it in C#, I had an idea the other day which I've been playing around with.
namespace UnitsOfMeasure
{
public interface IUnit { }
public static class Length
{
public interface ILength : IUnit { }
public class m : ILength { }
public class mm : ILength { }
public class ft : ILength { }
}
public class Mass
{
public interface IMass : IUnit { }
public class kg : IMass { }
public class g : IMass { }
public class lb : IMass { }
}
public class UnitDouble<T> where T : IUnit
{
public readonly double Value;
public UnitDouble(double value)
{
Value = value;
}
public static UnitDouble<T> operator +(UnitDouble<T> first, UnitDouble<T> second)
{
return new UnitDouble<T>(first.Value + second.Value);
}
//TODO: minus operator/equality
}
}
Example usage:
var a = new UnitDouble<Length.m>(3.1);
var b = new UnitDouble<Length.m>(4.9);
var c = new UnitDouble<Length.mm>(123);
var d = new UnitDouble<Mass.kg>(3.4);
Console.WriteLine((a + b).Value);
//Console.WriteLine((a + c).Value); <-- Compiler says no
The next step is trying to implement conversions (snippet):
public interface IUnit { double toBase { get; } }
public static class Length
{
public interface ILength : IUnit { }
public class m : ILength { public double toBase { get { return 1.0;} } }
public class mm : ILength { public double toBase { get { return 0.001; } } }
public class ft : ILength { public double toBase { get { return 0.3048; } } }
public static UnitDouble<R> Convert<T, R>(UnitDouble<T> input) where T : ILength, new() where R : ILength, new()
{
double mult = (new T() as IUnit).toBase;
double div = (new R() as IUnit).toBase;
return new UnitDouble<R>(input.Value * mult / div);
}
}
(I would have liked to avoid instantiating objects by using static, but as we all know you can't declare a static method in an interface)
You can then do this:
var e = Length.Convert<Length.mm, Length.m>(c);
var f = Length.Convert<Length.mm, Mass.kg>(d); <-- but not this
Obviously, there is a gaping hole in this, compared to F# Units of measure (I'll let you work it out).
Oh, the question is: what do you think of this? Is it worth using? Has someone else already done better?
UPDATE for people interested in this subject area, here is a link to a paper from 1997 discussing a different kind of solution (not specifically for C#)
You are missing dimensional analysis. For example (from the answer you linked to), in F# you can do this:
let g = 9.8<m/s^2>
and it will generate a new unit of acceleration, derived from meters and seconds (you can actually do the same thing in C++ using templates).
In C#, it is possible to do dimensional analysis at runtime, but it adds overhead and doesn't give you the benefit of compile-time checking. As far as I know there's no way to do full compile-time units in C#.
Whether it's worth doing depends on the application of course, but for many scientific applications, it's definitely a good idea. I don't know of any existing libraries for .NET, but they probably exist.
If you are interested in how to do it at runtime, the idea is that each value has a scalar value and integers representing the power of each basic unit.
class Unit
{
double scalar;
int kg;
int m;
int s;
// ... for each basic unit
public Unit(double scalar, int kg, int m, int s)
{
this.scalar = scalar;
this.kg = kg;
this.m = m;
this.s = s;
...
}
// For addition/subtraction, exponents must match
public static Unit operator +(Unit first, Unit second)
{
if (UnitsAreCompatible(first, second))
{
return new Unit(
first.scalar + second.scalar,
first.kg,
first.m,
first.s,
...
);
}
else
{
throw new Exception("Units must match for addition");
}
}
// For multiplication/division, add/subtract the exponents
public static Unit operator *(Unit first, Unit second)
{
return new Unit(
first.scalar * second.scalar,
first.kg + second.kg,
first.m + second.m,
first.s + second.s,
...
);
}
public static bool UnitsAreCompatible(Unit first, Unit second)
{
return
first.kg == second.kg &&
first.m == second.m &&
first.s == second.s
...;
}
}
If you don't allow the user to change the value of the units (a good idea anyways), you could add subclasses for common units:
class Speed : Unit
{
public Speed(double x) : base(x, 0, 1, -1, ...) // m/s => m^1 * s^-1
{
}
}
class Acceleration : Unit
{
public Acceleration(double x) : base(x, 0, 1, -2, ...) // m/s^2 => m^1 * s^-2
{
}
}
You could also define more specific operators on the derived types to avoid checking for compatible units on common types.
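A quick usage sketch for the runtime approach above (hypothetical values; for brevity it assumes kg, m and s are the only basic units, i.e. the "..." placeholders are removed):
var distance = new Unit(100.0, 0, 1, 0);        // 100 m  (kg^0 m^1 s^0)
var time = new Unit(9.58, 0, 0, 1);             // 9.58 s
var total = distance + new Unit(20.0, 0, 1, 0); // fine: exponents match
var area = distance * distance;                 // exponents add: m^2
// var broken = distance + time;                // throws: "Units must match for addition"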
Using separate classes for different units of the same measure (e.g., cm, mm, and ft for Length) seems kind of weird. Based on the .NET Framework's DateTime and TimeSpan classes, I would expect something like this:
Length length = Length.FromMillimeters(n1);
decimal lengthInFeet = length.Feet;
Length length2 = length.AddFeet(n2);
Length length3 = length + Length.FromMeters(n3);
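A minimal sketch of what such a DateTime/TimeSpan-style Length might look like (hypothetical; it stores one canonical value internally and converts only at the edges, and a real version would add more units, operators and formatting):
public struct Length
{
    private readonly double _meters;
    private Length(double meters) { _meters = meters; }
    public static Length FromMeters(double m) { return new Length(m); }
    public static Length FromMillimeters(double mm) { return new Length(mm / 1000.0); }
    public static Length FromFeet(double ft) { return new Length(ft * 0.3048); }
    public double Meters { get { return _meters; } }
    public double Millimeters { get { return _meters * 1000.0; } }
    public double Feet { get { return _meters / 0.3048; } }
    public Length AddFeet(double ft) { return new Length(_meters + ft * 0.3048); }
    public static Length operator +(Length a, Length b) { return new Length(a._meters + b._meters); }
}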
You could add extension methods on numeric types to generate measures. It'd feel a bit DSL-like:
var mass = 1.Kilogram();
var length = (1.2).Kilometres();
It's not really .NET convention and might not be the most discoverable feature, so perhaps you'd add them in a devoted namespace for people who like them, as well as offering more conventional construction methods.
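A sketch of what those extension methods might look like (assuming measure types along the lines of the Length sketched above; Mass.FromKilograms here is just an assumed analogue):
public static class MeasureExtensions
{
    public static Length Kilometres(this double value) { return Length.FromMeters(value * 1000.0); }
    public static Length Kilometres(this int value) { return Length.FromMeters(value * 1000.0); }
    public static Mass Kilogram(this int value) { return Mass.FromKilograms(value); } // Mass is assumed to exist
}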
I recently released Units.NET on GitHub and on NuGet.
It gives you all the common units and conversions. It is light-weight, unit tested and supports PCL.
Example conversions:
Length meter = Length.FromMeters(1);
double cm = meter.Centimeters; // 100
double yards = meter.Yards; // 1.09361
double feet = meter.Feet; // 3.28084
double inches = meter.Inches; // 39.3701
Now such a C# library exists:
http://www.codeproject.com/Articles/413750/Units-of-Measure-Validator-for-Csharp
It has almost the same features as F#'s unit compile time validation, but for C#.
The core is an MSBuild task which parses the code, looking for validations.
The unit information is stored in comments and attributes.
Here's my concern with creating units in C#/VB. Please correct me if you think I'm wrong. Most implementations I've read about seem to involve creating a structure that pieces together a value (int or double) with a unit. Then you try to define basic functions (+-*/,etc) for these structures that take into account unit conversions and consistency.
I find the idea very attractive, but every time I balk at what a huge step for a project this appears to be. It looks like an all-or-nothing deal. You probably wouldn't just change a few numbers into units; the whole point is that all data inside a project is appropriately labeled with a unit to avoid any ambiguity. This means saying goodbye to using ordinary doubles and ints, every variable is now defined as a "Unit" or "Length" or "Meters", etc. Do people really do this on a large scale? So even if you have a large array, every element should be marked with a unit. This will obviously have both size and performance ramifications.
Despite all the cleverness in trying to push the unit logic into the background, some cumbersome notation seems inevitable with C#. F# does some behind-the-scenes magic that better reduces the annoyance factor of the unit logic.
Also, how successfully can we make the compiler treat a unit just like an ordinary double when we so desire, w/o using CType or ".Value" or any additional notation? Such as with nullables, the code knows to treat a double? just like a double (of course if your double? is null then you get an error).
Thanks for the idea. I have implemented units in C# many different ways, and there always seems to be a catch. Now I can try one more time using the ideas discussed above. My goal is to be able to define new units based on existing ones, like
Unit lbf = 4.44822162*N;
Unit fps = feet/sec;
Unit hp = 550*lbf*fps
and for the program to figure out the proper dimensions, scaling and symbol to use. In the end I need to build a basic algebra system that can convert things like (m/s)*(m*s)=m^2 and try to express the result based on existing units defined.
Also a requirement must be to be able to serialize the units in a way that new units do not need to be coded, but just declared in a XML file like this:
<DefinedUnits>
<DirectUnits>
<!-- Base Units -->
<DirectUnit Symbol="kg" Scale="1" Dims="(1,0,0,0,0)" />
<DirectUnit Symbol="m" Scale="1" Dims="(0,1,0,0,0)" />
<DirectUnit Symbol="s" Scale="1" Dims="(0,0,1,0,0)" />
...
<!-- Derived Units -->
<DirectUnit Symbol="N" Scale="1" Dims="(1,1,-2,0,0)" />
<DirectUnit Symbol="R" Scale="1.8" Dims="(0,0,0,0,1)" />
...
</DirectUnits>
<IndirectUnits>
<!-- Composite Units -->
<IndirectUnit Symbol="m/s" Scale="1" Lhs="m" Op="Divide" Rhs="s"/>
<IndirectUnit Symbol="km/h" Scale="1" Lhs="km" Op="Divide" Rhs="hr"/>
...
<IndirectUnit Symbol="hp" Scale="550.0" Lhs="lbf" Op="Multiply" Rhs="fps"/>
</IndirectUnits>
</DefinedUnits>
there is jscience: http://jscience.org/, and here is a groovy dsl for units: http://groovy.dzone.com/news/domain-specific-language-unit-. iirc, c# has closures, so you should be able to cobble something up.
Why not use CodeDom to generate all possible permutations of the units automatically? I know it's not the best approach - but it will definitely work!
You could use QuantitySystem instead of implementing it on your own. It builds on F# and drastically improves unit handling in F#. It's the best implementation I have found so far and can be used in C# projects.
http://quantitysystem.codeplex.com
Is it worth using?
Yes. If I have "a number" in front of me, I want to know what that is. Any time of the day. Besides, that's what we usually do. We organize data into a meaningful entity -class, struct, you name it. Doubles into coordinates, strings into names and address etc. Why units should be any different?
Has someone else already done better?
Depends on how one defines "better". There are some libraries out there but I haven't tried them so I don't have an opinion. Besides it spoils the fun of trying it myself :)
Now about the implementation. I would like to start with the obvious: it's futile to try replicate the [<Measure>] system of F# in C#. Why? Because once F# allows you to use / ^ (or anything else for that matter) directly on another type, the game is lost. Good luck doing that in C# on a struct or class. The level of metaprogramming required for such a task does not exist and I'm afraid it is not going to be added any time soon -in my opinion. That's why you lack the dimensional analysis that Matthew Crumley mentioned in his answer.
Let's take the example from fsharpforfunandprofit.com: you have Newtons defined as [<Measure>] type N = kg m/sec^2. Now consider the square function that the author created, which returns an N^2 - something that sounds "wrong", absurd and useless. Unless you want to perform arithmetic operations where, at some point during the evaluation, you might get something "meaningless" until you multiply it by some other unit and get a meaningful result. Or even worse, you might want to use constants. For example the gas constant R, which is 8.31446261815324 J/(K mol). If you define the appropriate units, then F# is ready to consume the R constant. C# is not. You need to specify another type just for that, and still you won't be able to do any operation you want on that constant.
That doesn't mean that you shouldn't try. I did and I am quite happy with the results. I started SharpConvert around 3 years ago, after I got inspired by this very question. The trigger was this story: once I had to fix a nasty bug for the RADAR simulator that I develop: an aircraft was plunging in the earth instead of following the predefined glide path. That didn't make me happy as you could guess and after 2 hours of debugging, I realized that somewhere in my calculations, I was treating kilometers as nautical miles. Until that point I was like "oh well I will just be 'careful'" which is at least naive for any non trivial task.
In your code there would be a couple of things I would do different.
First I would turn UnitDouble<T> and IUnit implementations into structs. A unit is just that, a number and if you want them to be treated like numbers, a struct is a more appropriate approach.
Then I would avoid the new T() in the methods. It does not invoke the constructor, it uses Activator.CreateInstance<T>() and for number crunching it will be bad as it will add overhead. That depends though on the implementation, for a simple units converter application it won't harm. For time critical context avoid like the plague. And don't take me wrong, I used it myself as I didn't know better and I run some simple benchmarks the other day and such a call might double the execution time -at least in my case. More details in Dissecting the new() constraint in C#: a perfect example of a leaky abstraction
I would also change Convert<T, R>() and make it a member function. I prefer writing
var c = new Unit<Length.mm>(123);
var e = c.Convert<Length.m>();
rather than
var e = Length.Convert<Length.mm, Length.m>(c);
Last but not least I would use specific unit "shells" for each physical quantity (length time etc) instead of the UnitDouble, as it will be easier to add physical quantity specific functions and operator overloads. It will also allow you to create a Speed<TLength, TTime> shell instead of another Unit<T1, T2> or even Unit<T1, T2, T3> class. So it would look like that:
public readonly struct Length<T> where T : struct, ILength
{
private static readonly double SiFactor = new T().ToSiFactor;
public Length(double value)
{
if (value < 0) throw new ArgumentException(nameof(value));
Value = value;
}
public double Value { get; }
public static Length<T> operator +(Length<T> first, Length<T> second)
{
return new Length<T>(first.Value + second.Value);
}
public static Length<T> operator -(Length<T> first, Length<T> second)
{
// I don't know any application where negative length makes sense,
// if it does feel free to remove Abs() and the exception in the constructor
return new Length<T>(System.Math.Abs(first.Value - second.Value));
}
// You can add more like
// public static Area<T> operator *(Length<T> x, Length<T> y)
// or
//public static Volume<T> operator *(Length<T> x, Length<T> y, Length<T> z)
// etc
public Length<R> To<R>() where R : struct, ILength
{
//notice how I got rid of the Activator invocations by moving them in a static field;
//double mult = new T().ToSiFactor;
//double div = new R().ToSiFactor;
return new Length<R>(Value * SiFactor / Length<R>.SiFactor);
}
}
Notice also that, in order to save us from the dreaded Activator call, I stored the result of new T().ToSiFactor in SiFactor. It might seem awkward at first, but as Length is generic, Length<mm> will have its own copy, Length<Km> its own, and so on and so forth. Please note that ToSiFactor is the toBase of your approach.
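As a side note, if you can target C# 11 / .NET 7 or later, static abstract interface members give yet another way to avoid both the Activator call and the cached instance field - the factor lives on the type itself. A minimal sketch (an alternative I'm adding here, not part of the original design):
public interface ILength
{
    static abstract double ToSiFactor { get; }
}
public readonly struct Metres : ILength { public static double ToSiFactor => 1.0; }
public readonly struct Millimetres : ILength { public static double ToSiFactor => 0.001; }
public readonly struct Length<T> where T : struct, ILength
{
    public Length(double value) { Value = value; }
    public double Value { get; }
    public Length<R> To<R>() where R : struct, ILength
    {
        return new Length<R>(Value * T.ToSiFactor / R.ToSiFactor); // no instances, no reflection
    }
}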
The problem that I see is that as long as you are in the realm of simple units and up to the first derivative of time, things are simple. If you try to do something more complex, then you can see the drawbacks of this approach. Typing
var accel = new Acceleration<m, s, s>(1.2);
will not be as clear and "smooth" as
let accel = 1.2<m/sec^2>
And regardless of the approach, you will have to specify every math operation you will need with hefty operator overloading, while in F# you have this for free, even if the results are not meaningful as I was writing at the beginning.
The last drawback (or advantage depending on how you see it) of this design, is that it can't be unit agnostic. If there are cases that you need "just a Length" you can't have it. You need to know each time if your Length is millimeters, statute mile or foot. I took the opposite approach in SharpConvert and LengthUnit derives from UnitBase and Meters Kilometers etc derive from this. That's why I couldn't go down the struct path by the way. That way you can have:
LengthUnit l1 = new Meters(12);
LengthUnit l2 = new Feet(15.4);
LengthUnit sum = l1 + l2;
sum will be meters, but one shouldn't care as long as they want to use it in the next operation. If they want to display it, then they can call sum.To<Kilometers>() or whatever unit. To be honest, I don't know if not "locking" the variable to a specific unit has any advantages. It might be worth investigating at some point.
I would like the compiler to help me as much as possible. So maybe you could have a TypedInt where T contains the actual unit.
public struct TypedInt<T>
{
public int Value { get; }
public TypedInt(int value) => Value = value;
public static TypedInt<T> operator -(TypedInt<T> a, TypedInt<T> b) => new TypedInt<T>(a.Value - b.Value);
public static TypedInt<T> operator +(TypedInt<T> a, TypedInt<T> b) => new TypedInt<T>(a.Value + b.Value);
public static TypedInt<T> operator *(int a, TypedInt<T> b) => new TypedInt<T>(a * b.Value);
public static TypedInt<T> operator *(TypedInt<T> a, int b) => new TypedInt<T>(a.Value * b);
public static TypedInt<T> operator /(TypedInt<T> a, int b) => new TypedInt<T>(a.Value / b);
// todo: m² or m/s
// todo: more than just ints
// todo: other operations
public override string ToString() => $"{Value} {typeof(T).Name}";
}
You could have an extension method to set the type (or just use new):
public static class TypedInt
{
public static TypedInt<T> Of<T>(this int value) => new TypedInt<T>(value);
}
The actual units can be anything. That way, the system is extensible.
(There are multiple ways of handling conversions. What do you think is best?)
public class Mile
{
// todo: conversion from mile to/from meter
// maybe define an interface like ITypedConvertible<Meter>
// conversion probably needs reflection, but there may be
// a faster way
};
public class Second
{
}
This way, you can use:
var distance1 = 10.Of<Mile>();
var distance2 = 15.Of<Mile>();
var timespan1 = 4.Of<Second>();
Console.WriteLine(distance1 + distance2);
//Console.WriteLine(distance1 + 5); // this will be blocked by the compiler
//Console.WriteLine(distance1 + timespan1); // this will be blocked by the compiler
Console.WriteLine(3 * distance1);
Console.WriteLine(distance1 / 3);
//Console.WriteLine(distance1 / timespan1); // todo!
See Boo Ometa (which will be available for Boo 1.0):
Boo Ometa and Extensible Parsing
I really liked reading through this stack overflow question and its answers.
I have a pet project that I've tinkered with over the years, and have recently started rewriting it and have released it as open source at https://github.com/MafuJosh/NGenericDimensions
It happens to be somewhat similar to many of the ideas expressed in the question and answers of this page.
It basically is about creating generic dimensions, with the unit of measure and the native datatype as the generic type placeholders.
For example:
Dim myLength1 as New Length(of Miles, Int16)(123)
With also some optional use of Extension Methods like:
Dim myLength2 = 123.miles
And
Dim myLength3 = myLength1 + myLength2
Dim myArea1 = myLength1 * myLength2
This would not compile:
Dim myValue = 123.miles + 234.kilograms
New units can be extended in your own libraries.
These datatypes are structures that contain only 1 internal member variable, making them lightweight.
Basically, the operator overloads are restricted to the "dimension" structures, so that every unit of measure doesn't need operator overloads.
Of course, a big downside is the longer declaration of the generics syntax that requires 3 datatypes. So if that is a problem for you, then this isn't your library.
The main purpose was to be able to decorate an interface with units in a compile-time checking fashion.
There is a lot that needs to be done to the library, but I wanted to post it in case it was the kind of thing someone was looking for.
