Best approach to programming highly complex business/math rules - c#

I have to take a piece of data, and apply a large number of possible variables to it. I really don't like the idea of using a gigantic set of if statements, so i'm looking for help in an approach to simplify, and make it easier to maintain.
As an example:
if (isSoccer)
val = soccerBaseVal;
else if (isFootball)
val = footballBaseVal;
.... // 20 different sports
if (isMale)
val += 1;
else
val += 5;
switch(dayOfWeek)
{
case DayOfWeek.Monday:
val += 12;
...
}
etc.. etc.. etc.. with possibly in the range of 100-200 different tests and formula variations.
This just seems like a maintenance nightmare. Any suggestions?
EDIT:
To further add to the problem, many variables are only used in certain situations, so it's more than just a fixed set of logic with different values. The logic itself has to change based on conditions, possibly conditions applied from previous variables (if val > threshold, for instance).
So yes, i agree about using lookups for many of the values, but I also have to have variable logic.

A common way to avoid large switching structures is to put the information into data structures. Create an enumeration SportType and a Dictionary<SportType, Int32> containing the associated values. The you can simply write val += sportTypeScoreMap[sportType] and you are done.
Variations of this pattern will help you in many similar situations.
public enum SportType
{
Soccer, Football, ...
}
public sealed class Foo
{
private static readonly IDictionary<SportType, Int32> sportTypeScoreMap =
new Dictionary<SportType, Int32>
{
{ Soccer, 30 },
{ Football, 20 },
...
}
private static readonly IDictionary<DayOfWeek, Int32> dayOfWeekScoreMap =
new Dictionary<DayOfWeek, Int32>
{
{ DayOfWeek.Monday, 12 },
{ DayOfWeek.Tuesday, 20 },
...
}
public Int32 GetScore(SportType sportType, DayOfWeek dayOfWeek)
{
return Foo.sportTypeScoreMap[sportType]
+ Foo.dayOfWeekScoreMap[dayOfWeek];
}
}

Use either a switch statement or filter function.
By filter function, I mean something like:
func filter(var object, var value)
{
if(object == value)
object = valueDictionary['value'];
}
Then apply the filter with:
filter(theObject, soccer)
filter(theObject, football)
Note that the filter works much better using a dictionary, but it is not required.

Cribbing from The Pragmatic Programmer, you could use a DSL to encapsulate the rules and write a process engine. For your presented problem, a solution might look like:
MATCH{
Soccer soccerBaseVal
IsMale 5
!IsMale 1
}
SWITCH{
Monday 12
Tuesday 13
}
Then match everything in the first col of MATCH, and the first item in each SWITCH you come to. You can make whatever syntax you feel like, then just write a bit of script to cram that into code (or use Xtext because it looks pretty cool).

Here are a few ideas:
1 Use lookup tables:
var val = 0;
SportType sportType = GetSportType();
val += sportvalues[sportType];
You can load the table from the database.
2 Use the factory pattern:
var val = 0;
val += SportFactory.Create(sportType).CalculateValue();
The Dynamic Factory Pattern is useful in situations were new (sport) types are added frequently to the code. This pattern uses reflection to prevent the factory class (or any global configuration) from being changed. It allows you to simply add a new class to your code.
Of course the use of an dynamic factory, or even a factory can be overkill in your situation. You're the only one who can tell.

As a first step I would probably break up each logical processing area into its own method: (May not be the best names as a first pass)
EnforceSportRules
ProcessSportDetails
EnforceGenderRules
Next, depending on how complex the rules are, I may break each section into its own class and have them processed by a main class (like a factory).
GenderRules
GenderContext

I have nothing special to offer you than to first recommend not to just leave it as a big block-- break it into sections, make comment dividers between important parts.
Another suggestion is if you are going to have many very short tests as in your example, break from convention and put the val incrementors on the same line as the evaluatation and indent so they align with eachother.
if (isSoccer) val = soccerBaseVal;
if (isMale) val += 1;
else val += 5;
switch(dayOfWeek){
case DayOfWeek.Monday: val += 12;
...
}
Excess whitespace can make those hundred things into several hundred lines, making vertical scrolling excessive and difficult to get an overall view of the thing.

If you are really just adding values in this sort, I would either create an enumeration with defined indices that correspond to stored values in an array. Then you can do something like this:
enum Sport
{
football = 0,
soccer = 1,
//...
}
int sportValues[] = {
/* footballValue */,
/* soccerValue */,
/* ...Values */
};
int ApplyRules(Sport sport, /* other params */)
{
int value = startingValue;
value += sportValues[(int)sport];
value += /* other rules in same fashion */;
}

Consider implementing the Strategy Pattern which utilizes inheritance/polymorphism to make managing individual functions sane. By seperating each function into its own dedicated class you can forego the nightmare of having miles-long case blocks or if statements.
Not sure if C# supports it yet (or ever will) but VB.NET integrates XML Comment CompletionList directives into intellisense, which--when combined with the Strategy Pattern--can give you the ease of use of an Enum with the open-ended extensibility of OO.

Related

if check nesting c sharp [duplicate]

Any easier way to write this if statement?
if (value==1 || value==2)
For example... in SQL you can say where value in (1,2) instead of where value=1 or value=2.
I'm looking for something that would work with any basic type... string, int, etc.
How about:
if (new[] {1, 2}.Contains(value))
It's a hack though :)
Or if you don't mind creating your own extension method, you can create the following:
public static bool In<T>(this T obj, params T[] args)
{
return args.Contains(obj);
}
And you can use it like this:
if (1.In(1, 2))
:)
A more complicated way :) that emulates SQL's 'IN':
public static class Ext {
public static bool In<T>(this T t,params T[] values){
foreach (T value in values) {
if (t.Equals(value)) {
return true;
}
}
return false;
}
}
if (value.In(1,2)) {
// ...
}
But go for the standard way, it's more readable.
EDIT: a better solution, according to #Kobi's suggestion:
public static class Ext {
public static bool In<T>(this T t,params T[] values){
return values.Contains(t);
}
}
C# 9 supports this directly:
if (value is 1 or 2)
however, in many cases: switch might be clearer (especially with more recent switch syntax enhancements). You can see this here, with the if (value is 1 or 2) getting compiled identically to if (value == 1 || value == 2).
Is this what you are looking for ?
if (new int[] { 1, 2, 3, 4, 5 }.Contains(value))
If you have a List, you can use .Contains(yourObject), if you're just looking for it existing (like a where). Otherwise look at Linq .Any() extension method.
Using Linq,
if(new int[] {1, 2}.Contains(value))
But I'd have to think that your original if is faster.
Alternatively, and this would give you more flexibility if testing for values other than 1 or 2 in future, is to use a switch statement
switch(value)
{
case 1:
case 2:
return true;
default:
return false
}
If you search a value in a fixed list of values many times in a long list, HashSet<T> should be used. If the list is very short (< ~20 items), List could have better performance, based on this test
HashSet vs. List performance
HashSet<int> nums = new HashSet<int> { 1, 2, 3, 4, 5 };
// ....
if (nums.Contains(value))
Generally, no.
Yes, there are cases where the list is in an Array or List, but that's not the general case.
An extensionmethod like this would do it...
public static bool In<T>(this T item, params T[] items)
{
return items.Contains(item);
}
Use it like this:
Console.WriteLine(1.In(1,2,3));
Console.WriteLine("a".In("a", "b"));
You can use the switch statement with pattern matching (another version of jules's answer):
if (value switch{1 or 3 => true,_ => false}){
// do something
}
Easier is subjective, but maybe the switch statement would be easier? You don't have to repeat the variable, so more values can fit on the line, and a line with many comparisons is more legible than the counterpart using the if statement.
In vb.net or C# I would expect that the fastest general approach to compare a variable against any reasonable number of separately-named objects (as opposed to e.g. all the things in a collection) will be to simply compare each object against the comparand much as you have done. It is certainly possible to create an instance of a collection and see if it contains the object, and doing so may be more expressive than comparing the object against all items individually, but unless one uses a construct which the compiler can explicitly recognize, such code will almost certainly be much slower than simply doing the individual comparisons. I wouldn't worry about speed if the code will by its nature run at most a few hundred times per second, but I'd be wary of the code being repurposed to something that's run much more often than originally intended.
An alternative approach, if a variable is something like an enumeration type, is to choose power-of-two enumeration values to permit the use of bitmasks. If the enumeration type has 32 or fewer valid values (e.g. starting Harry=1, Ron=2, Hermione=4, Ginny=8, Neville=16) one could store them in an integer and check for multiple bits at once in a single operation ((if ((thisOne & (Harry | Ron | Neville | Beatrix)) != 0) /* Do something */. This will allow for fast code, but is limited to enumerations with a small number of values.
A somewhat more powerful approach, but one which must be used with care, is to use some bits of the value to indicate attributes of something, while other bits identify the item. For example, bit 30 could indicate that a character is male, bit 29 could indicate friend-of-Harry, etc. while the lower bits distinguish between characters. This approach would allow for adding characters who may or may not be friend-of-Harry, without requiring the code that checks for friend-of-Harry to change. One caveat with doing this is that one must distinguish between enumeration constants that are used to SET an enumeration value, and those used to TEST it. For example, to set a variable to indicate Harry, one might want to set it to 0x60000001, but to see if a variable IS Harry, one should bit-test it with 0x00000001.
One more approach, which may be useful if the total number of possible values is moderate (e.g. 16-16,000 or so) is to have an array of flags associated with each value. One could then code something like "if (((characterAttributes[theCharacter] & chracterAttribute.Male) != 0)". This approach will work best when the number of characters is fairly small. If array is too large, cache misses may slow down the code to the point that testing against a small number of characters individually would be faster.
Using Extension Methods:
public static class ObjectExtension
{
public static bool In(this object obj, params object[] objects)
{
if (objects == null || obj == null)
return false;
object found = objects.FirstOrDefault(o => o.GetType().Equals(obj.GetType()) && o.Equals(obj));
return (found != null);
}
}
Now you can do this:
string role= "Admin";
if (role.In("Admin", "Director"))
{
...
}
public static bool EqualsAny<T>(IEquatable<T> value, params T[] possibleMatches) {
foreach (T t in possibleMatches) {
if (value.Equals(t))
return true;
}
return false;
}
public static bool EqualsAny<T>(IEquatable<T> value, IEnumerable<T> possibleMatches) {
foreach (T t in possibleMatches) {
if (value.Equals(t))
return true;
}
return false;
}
I had the same problem but solved it with a switch statement
switch(a value you are switching on)
{
case 1:
the code you want to happen;
case 2:
the code you want to happen;
default:
return a value
}

Objects with many value checks c#

I want to see your ideas on a efficient way to check values of a newly serialized object.
Example I have an xml document I have serialized into an object, now I want to do value checks. First and most basic idea I can think of is to use nested if statments and checks each property, could be from one value checking that it has he correct url format, to checking another proprieties value that is a date but making sue it is in the correct range etc.
So my question is how would people do checks on all values in an object? Type checks are not important as this is already taken care of it is more to do with the value itself. It needs to be for quite large objects this is why I did not really want to use nested if statements.
Edit:
I want to achieve complete value validation on all properties in a given object.
I want to check the value it self not that it is null. I want to check the value for specific things if i have, an object with many properties one is of type string and named homepage.
I want to be able to check that the string in the in the correct URL format if not fail. This is just one example in the same object I could check that a date is in a given range if any are not I will return false or some form of fail.
I am using c# .net 4.
Try to use Fluent Validation, it is separation of concerns and configure validation out of your object
public class Validator<T>
{
List<Func<T,bool>> _verifiers = new List<Func<T, bool>>();
public void AddPropertyValidator(Func<T, bool> propValidator)
{
_verifiers.Add(propValidator);
}
public bool IsValid(T objectToValidate)
{
try {
return _verifiers.All(pv => pv(objectToValidate));
} catch(Exception) {
return false;
}
}
}
class ExampleObject {
public string Name {get; set;}
public int BirthYear { get;set;}
}
public static void Main(string[] args)
{
var validator = new Validator<ExampleObject>();
validator.AddPropertyValidator(o => !string.IsNullOrEmpty(o.Name));
validator.AddPropertyValidator(o => o.BirthYear > 1900 && o.BirthYear < DateTime.Now.Year );
validator.AddPropertyValidator(o => o.Name.Length > 3);
validator.Validate(new ExampleObject());
}
I suggest using Automapper with a ValueResolver. You can deserialize the XML into an object in a very elegant way using autommaper and check if the values you get are valid with a ValueResolver.
You can use a base ValueResolver that check for Nulls or invalid casts, and some CustomResolver's that check if the Values you get are correct.
It might not be exacly what you are looking for, but I think it's an elegant way to do it.
Check this out here: http://dannydouglass.com/2010/11/06/simplify-using-xml-data-with-automapper-and-linqtoxml
In functional languages, such as Haskell, your problem could be solved with the Maybe-monad:
The Maybe monad embodies the strategy of combining a chain of
computations that may each return Nothing by ending the chain early if
any step produces Nothing as output. It is useful when a computation
entails a sequence of steps that depend on one another, and in which
some steps may fail to return a value.
Replace Nothing with null, and the same thing applies for C#.
There are several ways to try and solve the problem, none of them are particularly pretty. If you want a runtime-validation that something is not null, you could use an AOP framework to inject null-checking code into your type. Otherwise you would really have to end up doing nested if checks for null, which is not only ugly, it will probably violate the Law of Demeter.
As a compromise, you could use a Maybe-monad like set of extension methods, which would allow you to query the object, and choose what to do in case one of the properties is null.
Have a look at this article by Dmitri Nesteruk: http://www.codeproject.com/Articles/109026/Chained-null-checks-and-the-Maybe-monad
Hope that helps.
I assume your question is: How do I efficiently check whether my object is valid?
If so, it does not matter that your object was just deserialized from some text source. If your question regards checking the object while deserializing to quickly stop deserializing if an error is found, that is another issue and you should update your question.
Validating an object efficiently is not often discussed when it comes to C# and administrative tools. The reason is that it is very quick no matter how you do it. It is more common to discuss how to do the checks in a manner that is easy to read and easily maintained.
Since your question is about efficiency, here are some ideas:
If you have a huge number of objects to be checked and performance is of key importance, you might want to change your objects into arrays of data so that they can be checked in a consistent manner. Example:
Instead of having MyObject[] MyObjects where MyObject has a lot of properties, break out each property and put them into an array like this:
int[] MyFirstProperties
float[] MySecondProperties
This way, the loop that traverses the list and checks the values, can be as quick as possible and you will not have many cache misses in the CPU cache, since you loop forward in the memory. Just be sure to use regular arrays or lists that are not implemented as linked lists, since that is likely to generate a lot of cache misses.
If you do not want to break up your objects into arrays of properties, it seems that top speed is not of interest but almost top speed. Then, your best bet is to keep your objects in a serial array and do:
.
bool wasOk = true;
foreach (MyObject obj in MyObjects)
{
if (obj.MyFirstProperty == someBadValue)
{
wasOk = false;
break;
}
if (obj.MySecondProperty == someOtherBadValue)
{
wasOk = false;
break;
}
}
This checks whether all your objects' properties are ok. I am not sure what your case really is but I think you get the point. Speed is already great when it comes to just checking properties of an object.
If you do string compares, make sure that you use x = y where possible, instead of using more sophisticated string compares, since x = y has a few quick opt outs, like if any of them is null, return, if the memory address is the same, the strings are equal and a few more clever things if I remember correctly. For any Java guy reading this, do not do this in Java!!! It will work sometimes but not always.
If I did not answer your question, you need to improve your question.
I'm not certain I understand the depth of your question but, wouldn't you just do somthing like this,
public SomeClass
{
private const string UrlValidatorRegex = "http://...
private const DateTime MinValidSomeDate = ...
private const DateTime MaxValidSomeDate = ...
public string SomeUrl { get; set; }
public DateTime SomeDate { get; set; }
...
private ValidationResult ValidateProperties()
{
var urlValidator = new RegEx(urlValidatorRegex);
if (!urlValidator.IsMatch(this.Someurl))
{
return new ValidationResult
{
IsValid = false,
Message = "SomeUrl format invalid."
};
}
if (this.SomeDate < MinValidSomeDate
|| this.SomeDate > MinValidSomeDate)
{
return new ValidationResult
{
IsValid = false,
Message = "SomeDate outside permitted bounds."
};
}
...
// Check other fields and properties here, return false on failure.
...
return new ValidationResult
{
IsValid = true,
};
}
...
private struct ValidationResult
{
public bool IsValid;
public string Message;
}
}
The exact valdiation code would vary depending on how you would like your class to work, no? Consider a property of a familar type,
public string SomeString { get; set; }
What are the valid values for this property. Both null and string.Empty may or may not be valid depending on the Class adorned with the property. There may be maximal length that should be allowed but, these details would vary by implementation.
If any suggested answer is more complicated than code above without offering an increase in performance or functionality, can it be more efficient?
Is your question actually, how can I check the values on an object without having to write much code?

Non Static/Constant values in case statements within a Switch

I'd like to do something like this in c sharp:
int i = 0;
foreach ( Item item in _Items )
{
foreach (Field theField in doc.Form.Fields)
{
switch (theField.Name)
{
case "Num" + i++.ToString(): // Number of Packages
theField.Value = string.Empty;
break;
}
}
}
I have 20 or so fields named Num1, Num2, etc. If I can do this all in one statement/block, I'd prefer to do so.
But the compiler complains that the case statements need to be constant values. Is there a way to use dynamic variables in the case statement so I can avoid repeating code?
I just thought I'd mention, the purpose of this method is to populate the fields in PDF form, with naming conventions which I can not control. There are 20 rows of fields, with names like "Num1" - "Num20". This is why string concatenation would be helpful in my scenario.
No. This is simply part of the language. If the values aren't constants, you'll have to use if/else if or a similar solution. (If we knew more details about what you were trying to achieve, we may be able to give more details of the solution.)
Fundamentally, I'd question a design which has a naming convention for the fields like this - it sounds like really you should have a collection to start with, which is considerably easier to work with.
Yes case value must be able to be evaluated at compile time. How about this instead
foreach (Field theField in doc.Form.Fields)
{
if(theField.Name == ("Num" + i++))
{
theField.Value = string.Empty;
}
}
How about:
int i = 0;
foreach ( Item item in _Items )
doc.Form.Fields.First(f=>f.Name == "Num" + i++.ToString()).Value = string.Empty;
Not sure what the purpose of item is in your code though.
No. There is no way.
How about replacing the switch with:
if (theField.Name.StartsWith("Num"))
theField.Value = string.Empty;
or some similar test?
var matchingFields =
from item in _Items
join field in doc.Form.Fields
on "Num" + item.PackageCount equals field.Name
select field;
foreach (var field in matchingFields)
{
field.Value = string.Empty;
}
For more efficiency include a DistinctBy on field Name after getting the matching fields (from MoreLINQ or equivalent).
Also consider that every time you concatenate two or more strings together the compiler will create memory variables for each of the component strings and then another one for the final string. This is memory intensive and for very large strings like error reporting it can even be a performance problem, and can also lead to memory fragmentation within your programs running, for long-running programs. Perhaps not so much in this case, but it is good to develop those best-practices into your usual development routines.
So instead of writing like this:
"Num" + i++.ToString()
Consider writing like this:
string.Format("{0}{1}", "Num", i++.ToString());
Also you may want to consider putting strings like "Num" into a separate constants class. Having string constants in your code can lead to program rigidity, and limit program flexibility over time as your program grows.
So you might have something like this at the beginning of your program:
using SysConst = MyNamespace.Constants.SystemConstants;
Then your code would look like this:
string.Format("{0}{1}", SysConst.Num, i++.ToString());
And in your SystemConstants class, you'd have something like this:
/// <summary>System Constants</summary>
public static class SystemConstants
{
/// <summary>The Num string.</summary>
public static readonly string Num = #"Num";
}
That way if you need to use the "Num" string any place else in your program, then you can just use the 'SysConst.Num'
Furthermore, any time you decide to change "Num" to "Number", say perhaps per a customer request, then you only need to change it in one place, and not a big find-replace in your system.

Enum and performance

My app has a lot of different lookup values, these values don't ever change, e.g. US States. Rather than putting them into database tables, I'd like to use enums.
But, I do realize doing it this way involves having a few enums and a lot of casting from "int" and "string" to and from my enums.
Alternative, I see someone mentioned using a Dictionary<> as a lookup tables, but enum implementation seems to be cleaner.
So, I'd like to ask if keeping and passing around a lot of enums and casting them be a problem to performance or should I use the lookup tables approach, which performs better?
Edit: The casting is needed as ID to be stored in other database tables.
Casting from int to an enum is extremely cheap... it'll be faster than a dictionary lookup. Basically it's a no-op, just copying the bits into a location with a different notional type.
Parsing a string into an enum value will be somewhat slower.
I doubt that this is going to be a bottleneck for you however you do it though, to be honest... without knowing more about what you're doing, it's somewhat hard to recommendation beyond the normal "write the simplest, mode readable and maintainable code which will work, then check that it performs well enough."
You're not going to notice a big difference in performance between the two, but I'd still recommend using a Dictionary because it will give you a little more flexibility in the future.
For one thing, an Enum in C# can't automatically have a class associated with it like in Java, so if you want to associate additional information with a state (Full Name, Capital City, Postal abbreviation, etc.), creating a UnitedState class will make it easier to package all of that information into one collection.
Also, even though you think this value will never change, it's not perfectly immutable. You could conceivably have a new requirement to include Territories, for example. Or maybe you'll need to allow Canadian users to see the names of Canadian Provinces instead. If you treat this collection like any other collection of data (using a repository to retrieve values from it), you will later have the option to change your repository implementation to pull values from a different source (Database, Web Service, Session, etc.). Enums are much less versatile.
Edit
Regarding the performance argument: Keep in mind that you're not just casting an Enum to an int: you're also running ToString() on that enum, which adds considerable processing time. Consider the following test:
const int C = 10000;
int[] ids = new int[C];
string[] names = new string[C];
Stopwatch sw = new Stopwatch();
sw.Start();
for (int i = 0; i< C; i++)
{
var id = (i % 50) + 1;
names[i] = ((States)id).ToString();
}
sw.Stop();
Console.WriteLine("Enum: " + sw.Elapsed.TotalMilliseconds);
var namesById = Enum.GetValues(typeof(States)).Cast<States>()
.ToDictionary(s => (int) s, s => s.ToString());
sw.Restart();
for (int i = 0; i< C; i++)
{
var id = (i % 50) + 1;
names[i] = namesById[id];
}
sw.Stop();
Console.WriteLine("Dictionary: " + sw.Elapsed.TotalMilliseconds);
Results:
Enum: 26.4875
Dictionary: 0.7684
So if performance really is your primary concern, a Dictionary is definitely the way to go. However, we're talking about such fast times here that there are half a dozen other concerns I'd address before I would even care about the speed issue.
Enums in C# were not designed to provide mappings between values and strings. They were designed to provide strongly-typed constant values that you can pass around in code. The two main advantages of this are:
You have an extra compiler-checked clue to help you avoid passing arguments in the wrong order, etc.
Rather than putting "magical" number values (e.g. "42") in your code, you can say "States.Oklahoma", which renders your code more readable.
Unlike Java, C# does not automatically check cast values to ensure that they are valid (myState = (States)321), so you don't get any runtime data checks on inputs without doing them manually. If you don't have code that refers to the states explicitly ("States.Oklahoma"), then you don't get any value from #2 above. That leaves us with #1 as the only real reason to use enums. If this is a good enough reason for you, then I would suggest using enums instead of ints as your key values. Then, when you need a string or some other value related to the state, perform a Dictionary lookup.
Here's how I'd do it:
public enum StateKey{
AL = 1,AK,AS,AZ,AR,CA,CO,CT,DE,DC,FM,FL,GA,GU,
HI,ID,IL,IN,IA,KS,KY,LA,ME,MH,MD,MA,MI,MN,MS,
MO,MT,NE,NV,NH,NJ,NM,NY,NC,ND,MP,OH,OK,OR,PW,
PA,PR,RI,SC,SD,TN,TX,UT,VT,VI,VA,WA,WV,WI,WY,
}
public class State
{
public StateKey Key {get;set;}
public int IntKey {get {return (int)Key;}}
public string PostalAbbreviation {get;set;}
}
public interface IStateRepository
{
State GetByKey(StateKey key);
}
public class StateRepository : IStateRepository
{
private static Dictionary<StateKey, State> _statesByKey;
static StateRepository()
{
_statesByKey = Enum.GetValues(typeof(StateKey))
.Cast<StateKey>()
.ToDictionary(k => k, k => new State {Key = k, PostalAbbreviation = k.ToString()});
}
public State GetByKey(StateKey key)
{
return _statesByKey[key];
}
}
public class Foo
{
IStateRepository _repository;
// Dependency Injection makes this class unit-testable
public Foo(IStateRepository repository)
{
_repository = repository;
}
// If you haven't learned the wonders of DI, do this:
public Foo()
{
_repository = new StateRepository();
}
public void DoSomethingWithAState(StateKey key)
{
Console.WriteLine(_repository.GetByKey(key).PostalAbbreviation);
}
}
This way:
you get to pass around strongly-typed values that represent a state,
your lookup gets fail-fast behavior if it is given invalid input,
you can easily change where the actual state data resides in the future,
you can easily add state-related data to the State class in the future,
you can easily add new states, territories, districts, provinces, or whatever else in the future.
getting a name from an int is still about 15 times faster than when using Enum.ToString().
[grunt]
You could use TypeSafeEnum s
Here's a base class
Public MustInherit Class AbstractTypeSafeEnum
Private Shared ReadOnly syncroot As New Object
Private Shared masterValue As Integer = 0
Protected ReadOnly _name As String
Protected ReadOnly _value As Integer
Protected Sub New(ByVal name As String)
Me._name = name
SyncLock syncroot
masterValue += 1
Me._value = masterValue
End SyncLock
End Sub
Public ReadOnly Property value() As Integer
Get
Return _value
End Get
End Property
Public Overrides Function ToString() As String
Return _name
End Function
Public Shared Operator =(ByVal ats1 As AbstractTypeSafeEnum, ByVal ats2 As AbstractTypeSafeEnum) As Boolean
Return (ats1._value = ats2._value) And Type.Equals(ats1.GetType, ats2.GetType)
End Operator
Public Shared Operator <>(ByVal ats1 As AbstractTypeSafeEnum, ByVal ats2 As AbstractTypeSafeEnum) As Boolean
Return Not (ats1 = ats2)
End Operator
End Class
And here's an Enum :
Public NotInheritable Class EnumProcType
Inherits AbstractTypeSafeEnum
Public Shared ReadOnly CREATE As New EnumProcType("Création")
Public Shared ReadOnly MODIF As New EnumProcType("Modification")
Public Shared ReadOnly DELETE As New EnumProcType("Suppression")
Private Sub New(ByVal name As String)
MyBase.New(name)
End Sub
End Class
And it gets easier to add Internationalization.
Sorry about the fact that it's in VB and french though.
Cheers !
Alternatively you can use constants
If the question was "is casting enum faster than accessing a dictionary item?" then the other answers addressing the various aspects of the performance would make sense.
But here the question seems to be "is casting enum when I need to store their value to a database table going to negatively affect the application performance?".
If that is the case, I don't need to run any test to say that storing data in a database table is always going to be orders of magnitude slower than casting an enum or executing its ToString().
In this case I would say the important thing is readability and maintainability of the code. In simple cases enums will do the job cleanly, but I agree with other answers that dictionaries are more flexible in the long term.
Enums will greatly outperform almost anything, especially dictionary's. Enums only use single byte. But why would you be casting? Seems like you should be using the enums everywhere.
Avoid enum as you can: enums should be replaced by singletons deriving from a base class or implementing an interface.
The practice of using enum comes from an old style programming in C.
You start to use an enum for the US States, then you will need the number of inhabitants, the capitol..., and you will need a lot of big switches to get all of this infos.

Units of measure in C# - almost

Inspired by Units of Measure in F#, and despite asserting (here) that you couldn't do it in C#, I had an idea the other day which I've been playing around with.
namespace UnitsOfMeasure
{
public interface IUnit { }
public static class Length
{
public interface ILength : IUnit { }
public class m : ILength { }
public class mm : ILength { }
public class ft : ILength { }
}
public class Mass
{
public interface IMass : IUnit { }
public class kg : IMass { }
public class g : IMass { }
public class lb : IMass { }
}
public class UnitDouble<T> where T : IUnit
{
public readonly double Value;
public UnitDouble(double value)
{
Value = value;
}
public static UnitDouble<T> operator +(UnitDouble<T> first, UnitDouble<T> second)
{
return new UnitDouble<T>(first.Value + second.Value);
}
//TODO: minus operator/equality
}
}
Example usage:
var a = new UnitDouble<Length.m>(3.1);
var b = new UnitDouble<Length.m>(4.9);
var d = new UnitDouble<Mass.kg>(3.4);
Console.WriteLine((a + b).Value);
//Console.WriteLine((a + c).Value); <-- Compiler says no
The next step is trying to implement conversions (snippet):
public interface IUnit { double toBase { get; } }
public static class Length
{
public interface ILength : IUnit { }
public class m : ILength { public double toBase { get { return 1.0;} } }
public class mm : ILength { public double toBase { get { return 1000.0; } } }
public class ft : ILength { public double toBase { get { return 0.3048; } } }
public static UnitDouble<R> Convert<T, R>(UnitDouble<T> input) where T : ILength, new() where R : ILength, new()
{
double mult = (new T() as IUnit).toBase;
double div = (new R() as IUnit).toBase;
return new UnitDouble<R>(input.Value * mult / div);
}
}
(I would have liked to avoid instantiating objects by using static, but as we all know you can't declare a static method in an interface)
You can then do this:
var e = Length.Convert<Length.mm, Length.m>(c);
var f = Length.Convert<Length.mm, Mass.kg>(d); <-- but not this
Obviously, there is a gaping hole in this, compared to F# Units of measure (I'll let you work it out).
Oh, the question is: what do you think of this? Is it worth using? Has someone else already done better?
UPDATE for people interested in this subject area, here is a link to a paper from 1997 discussing a different kind of solution (not specifically for C#)
You are missing dimensional analysis. For example (from the answer you linked to), in F# you can do this:
let g = 9.8<m/s^2>
and it will generate a new unit of acceleration, derived from meters and seconds (you can actually do the same thing in C++ using templates).
In C#, it is possible to do dimensional analysis at runtime, but it adds overhead and doesn't give you the benefit of compile-time checking. As far as I know there's no way to do full compile-time units in C#.
Whether it's worth doing depends on the application of course, but for many scientific applications, it's definitely a good idea. I don't know of any existing libraries for .NET, but they probably exist.
If you are interested in how to do it at runtime, the idea is that each value has a scalar value and integers representing the power of each basic unit.
class Unit
{
double scalar;
int kg;
int m;
int s;
// ... for each basic unit
public Unit(double scalar, int kg, int m, int s)
{
this.scalar = scalar;
this.kg = kg;
this.m = m;
this.s = s;
...
}
// For addition/subtraction, exponents must match
public static Unit operator +(Unit first, Unit second)
{
if (UnitsAreCompatible(first, second))
{
return new Unit(
first.scalar + second.scalar,
first.kg,
first.m,
first.s,
...
);
}
else
{
throw new Exception("Units must match for addition");
}
}
// For multiplication/division, add/subtract the exponents
public static Unit operator *(Unit first, Unit second)
{
return new Unit(
first.scalar * second.scalar,
first.kg + second.kg,
first.m + second.m,
first.s + second.s,
...
);
}
public static bool UnitsAreCompatible(Unit first, Unit second)
{
return
first.kg == second.kg &&
first.m == second.m &&
first.s == second.s
...;
}
}
If you don't allow the user to change the value of the units (a good idea anyways), you could add subclasses for common units:
class Speed : Unit
{
public Speed(double x) : base(x, 0, 1, -1, ...); // m/s => m^1 * s^-1
{
}
}
class Acceleration : Unit
{
public Acceleration(double x) : base(x, 0, 1, -2, ...); // m/s^2 => m^1 * s^-2
{
}
}
You could also define more specific operators on the derived types to avoid checking for compatible units on common types.
Using separate classes for different units of the same measure (e.g., cm, mm, and ft for Length) seems kind of weird. Based on the .NET Framework's DateTime and TimeSpan classes, I would expect something like this:
Length length = Length.FromMillimeters(n1);
decimal lengthInFeet = length.Feet;
Length length2 = length.AddFeet(n2);
Length length3 = length + Length.FromMeters(n3);
You could add extension methods on numeric types to generate measures. It'd feel a bit DSL-like:
var mass = 1.Kilogram();
var length = (1.2).Kilometres();
It's not really .NET convention and might not be the most discoverable feature, so perhaps you'd add them in a devoted namespace for people who like them, as well as offering more conventional construction methods.
I recently released Units.NET on GitHub and on NuGet.
It gives you all the common units and conversions. It is light-weight, unit tested and supports PCL.
Example conversions:
Length meter = Length.FromMeters(1);
double cm = meter.Centimeters; // 100
double yards = meter.Yards; // 1.09361
double feet = meter.Feet; // 3.28084
double inches = meter.Inches; // 39.3701
Now such a C# library exists:
http://www.codeproject.com/Articles/413750/Units-of-Measure-Validator-for-Csharp
It has almost the same features as F#'s unit compile time validation, but for C#.
The core is a MSBuild task, which parses the code and looking for validations.
The unit information are stored in comments and attributes.
Here's my concern with creating units in C#/VB. Please correct me if you think I'm wrong. Most implementations I've read about seem to involve creating a structure that pieces together a value (int or double) with a unit. Then you try to define basic functions (+-*/,etc) for these structures that take into account unit conversions and consistency.
I find the idea very attractive, but every time I balk at what a huge step for a project this appears to be. It looks like an all-or-nothing deal. You probably wouldn't just change a few numbers into units; the whole point is that all data inside a project is appropriately labeled with a unit to avoid any ambiguity. This means saying goodbye to using ordinary doubles and ints, every variable is now defined as a "Unit" or "Length" or "Meters", etc. Do people really do this on a large scale? So even if you have a large array, every element should be marked with a unit. This will obviously have both size and performance ramifications.
Despite all the cleverness in trying to push the unit logic into the background, some cumbersome notation seems inevitable with C#. F# does some behind-the-scenes magic that better reduces the annoyance factor of the unit logic.
Also, how successfully can we make the compiler treat a unit just like an ordinary double when we so desire, w/o using CType or ".Value" or any additional notation? Such as with nullables, the code knows to treat a double? just like a double (of course if your double? is null then you get an error).
Thanks for the idea. I have implemented units in C# many different ways there always seems to be a catch. Now I can try one more time using the ideas discussed above. My goal is to be able to define new units based on existing ones like
Unit lbf = 4.44822162*N;
Unit fps = feet/sec;
Unit hp = 550*lbf*fps
and for the program to figure out the proper dimensions, scaling and symbol to use. In the end I need to build a basic algebra system that can convert things like (m/s)*(m*s)=m^2 and try to express the result based on existing units defined.
Also a requirement must be to be able to serialize the units in a way that new units do not need to be coded, but just declared in a XML file like this:
<DefinedUnits>
<DirectUnits>
<!-- Base Units -->
<DirectUnit Symbol="kg" Scale="1" Dims="(1,0,0,0,0)" />
<DirectUnit Symbol="m" Scale="1" Dims="(0,1,0,0,0)" />
<DirectUnit Symbol="s" Scale="1" Dims="(0,0,1,0,0)" />
...
<!-- Derived Units -->
<DirectUnit Symbol="N" Scale="1" Dims="(1,1,-2,0,0)" />
<DirectUnit Symbol="R" Scale="1.8" Dims="(0,0,0,0,1)" />
...
</DirectUnits>
<IndirectUnits>
<!-- Composite Units -->
<IndirectUnit Symbol="m/s" Scale="1" Lhs="m" Op="Divide" Rhs="s"/>
<IndirectUnit Symbol="km/h" Scale="1" Lhs="km" Op="Divide" Rhs="hr"/>
...
<IndirectUnit Symbol="hp" Scale="550.0" Lhs="lbf" Op="Multiply" Rhs="fps"/>
</IndirectUnits>
</DefinedUnits>
there is jscience: http://jscience.org/, and here is a groovy dsl for units: http://groovy.dzone.com/news/domain-specific-language-unit-. iirc, c# has closures, so you should be able to cobble something up.
Why not use CodeDom to generate all possible permutations of the units automatically? I know it's not the best - but I will definitely work!
you could use QuantitySystem instead of implementing it by your own. It builds on F# and drastically improves unit handling in F#. It's the best implementation I found so far and can be used in C# projects.
http://quantitysystem.codeplex.com
Is it worth using?
Yes. If I have "a number" in front of me, I want to know what that is. Any time of the day. Besides, that's what we usually do. We organize data into a meaningful entity -class, struct, you name it. Doubles into coordinates, strings into names and address etc. Why units should be any different?
Has someone else already done better?
Depends on how one defines "better". There are some libraries out there but I haven't tried them so I don't have an opinion. Besides it spoils the fun of trying it myself :)
Now about the implementation. I would like to start with the obvious: it's futile to try replicate the [<Measure>] system of F# in C#. Why? Because once F# allows you to use / ^ (or anything else for that matter) directly on another type, the game is lost. Good luck doing that in C# on a struct or class. The level of metaprogramming required for such a task does not exist and I'm afraid it is not going to be added any time soon -in my opinion. That's why you lack the dimensional analysis that Matthew Crumley mentioned in his answer.
Let's take the example from fsharpforfunandprofit.com: you have Newtons defined as [<Measure>] type N = kg m/sec^2. Now you have the square function that that the author created that will return a N^2 which sounds "wrong", absurd and useless. Unless you want to perform arithmetic operations where at some point during the evaluation process, you might get something "meaningless" until you multiply it with some other unit and you get a meaningful result. Or even worse, you might want to use constants. For example the gas constant R which is 8.31446261815324 J /(K mol). If you define the appropriate units, then F# is ready to consume the R constant. C# is not. You need to specify another type just for that and still you won't be able to do any operation you want on that constant.
That doesn't mean that you shouldn't try. I did and I am quite happy with the results. I started SharpConvert around 3 years ago, after I got inspired by this very question. The trigger was this story: once I had to fix a nasty bug for the RADAR simulator that I develop: an aircraft was plunging in the earth instead of following the predefined glide path. That didn't make me happy as you could guess and after 2 hours of debugging, I realized that somewhere in my calculations, I was treating kilometers as nautical miles. Until that point I was like "oh well I will just be 'careful'" which is at least naive for any non trivial task.
In your code there would be a couple of things I would do different.
First I would turn UnitDouble<T> and IUnit implementations into structs. A unit is just that, a number and if you want them to be treated like numbers, a struct is a more appropriate approach.
Then I would avoid the new T() in the methods. It does not invoke the constructor, it uses Activator.CreateInstance<T>() and for number crunching it will be bad as it will add overhead. That depends though on the implementation, for a simple units converter application it won't harm. For time critical context avoid like the plague. And don't take me wrong, I used it myself as I didn't know better and I run some simple benchmarks the other day and such a call might double the execution time -at least in my case. More details in Dissecting the new() constraint in C#: a perfect example of a leaky abstraction
I would also change Convert<T, R>() and make it a member function. I prefer writing
var c = new Unit<Length.mm>(123);
var e = c.Convert<Length.m>();
rather than
var e = Length.Convert<Length.mm, Length.m>(c);
Last but not least I would use specific unit "shells" for each physical quantity (length time etc) instead of the UnitDouble, as it will be easier to add physical quantity specific functions and operator overloads. It will also allow you to create a Speed<TLength, TTime> shell instead of another Unit<T1, T2> or even Unit<T1, T2, T3> class. So it would look like that:
public readonly struct Length<T> where T : struct, ILength
{
private static readonly double SiFactor = new T().ToSiFactor;
public Length(double value)
{
if (value < 0) throw new ArgumentException(nameof(value));
Value = value;
}
public double Value { get; }
public static Length<T> operator +(Length<T> first, Length<T> second)
{
return new Length<T>(first.Value + second.Value);
}
public static Length<T> operator -(Length<T> first, Length<T> second)
{
// I don't know any application where negative length makes sense,
// if it does feel free to remove Abs() and the exception in the constructor
return new Length<T>(System.Math.Abs(first.Value - second.Value));
}
// You can add more like
// public static Area<T> operator *(Length<T> x, Length<T> y)
// or
//public static Volume<T> operator *(Length<T> x, Length<T> y, Length<T> z)
// etc
public Length<R> To<R>() where R : struct, ILength
{
//notice how I got rid of the Activator invocations by moving them in a static field;
//double mult = new T().ToSiFactor;
//double div = new R().ToSiFactor;
return new Length<R>(Value * SiFactor / Length<R>.SiFactor);
}
}
Notice also that, in order to save us from the dreaded Activator call, I stored the result of new T().ToSiFactor in SiFactor. It might seem awkward at first, but as Length is generic, Length<mm> will have its own copy, Length<Km> its own, and so on and so forth. Please note that ToSiFactor is the toBase of your approach.
The problem that I see is that as long as you are in the realm of simple units and up to the first derivative of time, things are simple. If you try to do something more complex, then you can see the drawbacks of this approach. Typing
var accel = new Acceleration<m, s, s>(1.2);
will not be as clear and "smooth" as
let accel = 1.2<m/sec^2>
And regardless of the approach, you will have to specify every math operation you will need with hefty operator overloading, while in F# you have this for free, even if the results are not meaningful as I was writing at the beginning.
The last drawback (or advantage depending on how you see it) of this design, is that it can't be unit agnostic. If there are cases that you need "just a Length" you can't have it. You need to know each time if your Length is millimeters, statute mile or foot. I took the opposite approach in SharpConvert and LengthUnit derives from UnitBase and Meters Kilometers etc derive from this. That's why I couldn't go down the struct path by the way. That way you can have:
LengthUnit l1 = new Meters(12);
LengthUnit l2 = new Feet(15.4);
LengthUnit sum = l1 + l2;
sum will be meters but one shouldn't care as long as they want to use it in the next operation. If they want to display it, then they can call sum.To<Kilometers>() or whatever unit. To be honest, I don't know if not "locking" the variable to a specific unit has any advantages. It might worth investigating it at some point.
I would like the compiler to help me as much as possible. So maybe you could have a TypedInt where T contains the actual unit.
public struct TypedInt<T>
{
public int Value { get; }
public TypedInt(int value) => Value = value;
public static TypedInt<T> operator -(TypedInt<T> a, TypedInt<T> b) => new TypedInt<T>(a.Value - b.Value);
public static TypedInt<T> operator +(TypedInt<T> a, TypedInt<T> b) => new TypedInt<T>(a.Value + b.Value);
public static TypedInt<T> operator *(int a, TypedInt<T> b) => new TypedInt<T>(a * b.Value);
public static TypedInt<T> operator *(TypedInt<T> a, int b) => new TypedInt<T>(a.Value * b);
public static TypedInt<T> operator /(TypedInt<T> a, int b) => new TypedInt<T>(a.Value / b);
// todo: m² or m/s
// todo: more than just ints
// todo: other operations
public override string ToString() => $"{Value} {typeof(T).Name}";
}
You could have an extensiom method to set the type (or just new):
public static class TypedInt
{
public static TypedInt<T> Of<T>(this int value) => new TypedInt<T>(value);
}
The actual units can be anything. That way, the system is extensible.
(There's multiple ways of handling conversions. What do you think is best?)
public class Mile
{
// todo: conversion from mile to/from meter
// maybe define an interface like ITypedConvertible<Meter>
// conversion probably needs reflection, but there may be
// a faster way
};
public class Second
{
}
This way, you can use:
var distance1 = 10.Of<Mile>();
var distance2 = 15.Of<Mile>();
var timespan1 = 4.Of<Second>();
Console.WriteLine(distance1 + distance2);
//Console.WriteLine(distance1 + 5); // this will be blocked by the compiler
//Console.WriteLine(distance1 + timespan1); // this will be blocked by the compiler
Console.WriteLine(3 * distance1);
Console.WriteLine(distance1 / 3);
//Console.WriteLine(distance1 / timespan1); // todo!
See Boo Ometa (which will be available for Boo 1.0):
Boo Ometa and Extensible Parsing
I really liked reading through this stack overflow question and its answers.
I have a pet project that I've tinkered with over the years, and have recently started re-writing it and have released it to the open source at https://github.com/MafuJosh/NGenericDimensions
It happens to be somewhat similar to many of the ideas expressed in the question and answers of this page.
It basically is about creating generic dimensions, with the unit of measure and the native datatype as the generic type placeholders.
For example:
Dim myLength1 as New Length(of Miles, Int16)(123)
With also some optional use of Extension Methods like:
Dim myLength2 = 123.miles
And
Dim myLength3 = myLength1 + myLength2
Dim myArea1 = myLength1 * myLength2
This would not compile:
Dim myValue = 123.miles + 234.kilograms
New units can be extended in your own libraries.
These datatypes are structures that contain only 1 internal member variable, making them lightweight.
Basically, the operator overloads are restricted to the "dimension" structures, so that every unit of measure doesn't need operator overloads.
Of course, a big downside is the longer declaration of the generics syntax that requires 3 datatypes. So if that is a problem for you, then this isn't your library.
The main purpose was to be able to decorate an interface with units in a compile-time checking fashion.
There is a lot that needs to be done to the library, but I wanted to post it in case it was the kind of thing someone was looking for.

Categories