Using structs in C# for simple domain values

I am writing a financial application where the concept of 'Price' is used a lot. It's currently represented by the C# decimal type. I would like to make it more explicit and be able to change it to maybe double in the future, so I was thinking of creating a 'Price' struct that would basically act exactly the same as the decimal type (maybe add a bit of validation like must be greater than 0).
What do you think are the pros and cons of doing this?

Please don't use double for money. You'll have to remember to round it for display everywhere you use it, and you risk accuracy problems when you divide or multiply by large numbers. Decimal will give overflow errors; double will just lose accuracy. I'm not sure about you, but with money I'd prefer an error and an aborted operation to silently proceeding with a loss of accuracy.
If anything, based on projects I've been on, you may want to consider using a struct that has a decimal and some indication of what currency it is.
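A minimal sketch of what such a struct might look like, assuming an immutable value type with the "greater than zero" check from the question; the names and the validation rule are illustrative, not prescriptive:

using System;

// Illustrative only: a Price value type wrapping a decimal plus a currency code.
public struct Price
{
    private readonly decimal amount;
    private readonly string currencyCode;

    public Price(decimal amount, string currencyCode)
    {
        if (amount <= 0m)
            throw new ArgumentOutOfRangeException("amount", "Price must be greater than zero.");
        this.amount = amount;
        this.currencyCode = currencyCode;
    }

    public decimal Amount { get { return amount; } }
    public string CurrencyCode { get { return currencyCode; } }

    public override string ToString()
    {
        return Amount.ToString("0.00") + " " + CurrencyCode;
    }
}

Swapping the internal representation later then only touches this one type rather than every call site.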

Structs should be used for small types that are (in my opinion) immutable, i.e., true value types. I am not sure what you mean by "used a lot", but if these structs will be passed around a great deal in performance-critical code, you will have to weigh the cost of copying them against the cost of heap allocation. I doubt you will need to worry about that, but it is something to keep in mind. I rarely find the need to use structs in my day-to-day work.
Also, as Jonathan points out, using the double type for money is a bad idea. The decimal type is much better suited to financial calculations.
Yet another aside; you will probably hear a lot of responses which talk about stack v heap allocation, so this article may interest you:
http://blogs.msdn.com/ericlippert/archive/2009/04/27/the-stack-is-an-implementation-detail.aspx

There shouldn't be a reason to change the data type for a quantity like this; however, you may decide to add other information such as the currency or the number of decimal places to keep track of in calculations, so using a struct at this point will save you a LOT of time down the road.

Structs may not be so accessible from .NET languages other than C#. Rounding errors could be a problem too. Why not just create a Money class that stores the value as a decimal along with the currency used?

Related

Is it okay to hard-code complex math logic inside my code?

Is there a generally accepted best approach to coding complex math? For example:
double someNumber = .123 + .456 * Math.Pow(Math.E, .789 * Math.Pow((homeIndex + .22), .012));
Is this a point where hard-coding the numbers is okay? Or should each number have a constant associated with it? Or is there even another way, like storing the calculations in config and invoking them somehow?
There will be a lot of code like this, and I'm trying to keep it maintainable.
Note: The example shown above is just one line. There would be tens or hundreds of these lines of code. And not only could the numbers change, but the formula could as well.
Generally, there are two kinds of constants: ones with meaning to the implementation, and ones with meaning to the business logic.
It is OK to hard-code the constants of the first kind: they are private to understanding your algorithm. For example, if you are using a ternary search and need to divide the interval in three parts, dividing by a hard-coded 3 is the right approach.
Constants with meaning outside the code of your program, on the other hand, should not be hard-coded: giving them explicit names gives whoever maintains your code after you leave the company a non-zero chance of making correct modifications without having to rewrite things from scratch or e-mail you for help.
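A short illustration of the distinction; the class, the method names and the 20% rate below are all made up for the example:

static class ConstantsExample
{
    // Business constant: the rate has meaning to the business, so it gets a name
    // (and a single place to change it). The value here is purely illustrative.
    private const decimal StandardVatRate = 0.20m;

    public static decimal VatFor(decimal netAmount)
    {
        return netAmount * StandardVatRate;
    }

    // Implementation constant: the 3 belongs to the ternary-search idea itself,
    // so hard-coding it is perfectly reasonable.
    public static double ThirdOfInterval(double lowerBound, double upperBound)
    {
        return (upperBound - lowerBound) / 3;
    }
}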
"Is it okay"? Sure. As far as I know, there's no paramilitary police force rounding up those who sin against the one true faith of programming. (Yet.).
Is it wise?
Well, there are all sorts of ways of deciding that - performance, scalability, extensibility, maintainability etc.
On the maintainability scale, this is pure evil. It makes extensibility very hard; performance and scalability are probably not a huge concern.
If you left behind a single method with loads of lines similar to the above, your successor would have no chance of maintaining the code. He'd be right to recommend a rewrite.
If you broke it down like
public float calculateTax(Person person)
{
    float taxFreeAmount = calcTaxFreeAmount(person);
    float taxableAmount = calcTaxableAmount(person, taxFreeAmount);
    float taxAmount = calcTaxAmount(person, taxableAmount);
    return taxAmount;
}
and each of the inner methods is a few lines long, but you left some hardcoded values in there - well, not brilliant, but not terrible.
However, if some of those hardcoded values are likely to change over time (like the tax rate), leaving them as hardcoded values is not okay. It's awful.
The best advice I can give is:
Spend an afternoon with Resharper, and use its automatic refactoring tools.
Assume the guy picking this up from you is an axe-wielding maniac who knows where you live.
I usually ask myself whether I can maintain and fix the code at 3 AM being sleep deprived six months after writing the code. It has served me well. Looking at your formula, I'm not sure I can.
Ages ago I worked in the insurance industry. Some of my colleagues were tasked with converting the actuarial formulas into code, first FORTRAN and later C. Mathematical and programming skills varied from colleague to colleague. What I learned from reviewing their code was the following:
document the actual formula in code; without it, years later you'll have trouble remembering what it was. External documentation goes missing, becomes dated, or simply may not be accessible.
break the formula into discrete components that can be documented, reused and tested.
use constants to document equations; magic numbers have very little context and often require existing knowledge for other developers to understand.
rely on the compiler to optimize code where possible. A good compiler will inline methods, reduce duplication and optimize the code for the particular architecture. In some cases it may duplicate portions of the formula for better performance.
That said, there are times where hard-coding just simplifies things, especially if those values are well understood within a particular context. For example, dividing (or multiplying) something by 100 or 1000 because you're converting a value to dollars. Another one is multiplying something by 3600 when you'd like to convert hours to seconds. Their meaning is often implied by the greater context. The following doesn't say much about the magic number 100:
public static double a(double b, double c)
{
    return (b - c) * 100;
}
but the following may give you a better hint:
public static double calculateAmountInCents(double amountDue, double amountPaid)
{
    return (amountDue - amountPaid) * 100;
}
As the above comment states, this is far from complex.
You can, however, store the magic numbers in constants or app.config values, so as to make it easier for the next developer to maintain your code.
When storing such constants, make sure to explain to the next developer (read: yourself in a month) what your thoughts were and what they need to keep in mind.
Also explain what the actual calculation is for and what it is doing.
Do not leave it in-line like this.
Constant so you can reuse, easily find, easily change and provides for better maintaining when someone comes looking at your code for the first time.
You can do a config if it can/should be customized. What is the impact of a customer altering the value(s)? Sometimes it is best to not give them that option. They could change it on their own then blame you when things don't work. Then again, maybe they have it in flux more often than your release schedules.
It's worth noting that the C# compiler (or rather the CLR's JIT) will usually inline one-line methods, so if you can extract certain formulas as one-liners you can turn them into methods without any performance loss.
EDIT:
Constants and such more or less depend on the team and the quantity of use. Obviously if you're using the same hard-coded number more than once, make it a constant. However, if you're writing a formula that likely only you will ever edit (small team), then hard-coding the values is fine. It all depends on your team's views on documentation and maintenance.
If the calculation in your line explains something to the next developer then you can leave it; otherwise it's better to have a calculated constant value in your code or configuration files.
I found one line in production code which was like:
int interval = 1 * 60 * 60 * 1000;
Even without any comment, it wasn't hard to tell that the original developer meant 1 hour in milliseconds, which is much clearer than seeing a bare 3600000.
IMO, maybe leaving the calculation written out is better for scenarios like that.
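For instance, either of the following reads better than the bare literal; MillisecondsPerHour is just a suggested name, not anything from the original code:

int interval = 1 * 60 * 60 * 1000;               // the calculation itself spells out "1 hour in ms"

const int MillisecondsPerHour = 60 * 60 * 1000;  // or: name it once and reuse it
int pollingInterval = MillisecondsPerHour;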
Names can be added for documentation purposes. The amount of documentation needed depends largely on the purpose.
Consider following code:
float e = m * 8.98755179e16f;
And contrast it with the following one:
const float c = 299792458;
float e = m * c * c;
Even though the variable names are not very 'descriptive' in the latter, you'll have a much better idea of what the code is doing than in the first one - arguably there is no need to rename c to speedOfLight, m to mass and e to energy, as the names are self-explanatory in their domain.
const float speedOfLight = 299792458;
float energy = mass * speedOfLight * speedOfLight;
I would argue that the second snippet is the clearest one - especially if the programmer can expect to find STR in the code (an LHC simulator or something similar). To sum up: you need to find an optimal point. The more verbose the code, the more context you provide - which might both help to understand the meaning (what are e and c vs. we do something with mass and the speed of light) and obscure the big picture (we square c and multiply by m vs. needing to scan the whole line to get the equation).
Most constants have some deeper meaning and/or established notation, so I would consider at least naming them by convention (c for the speed of light, R for the gas constant, sPerH for seconds in an hour). If the notation is not clear, longer names should be used (sPerH in a class named Date or Time is probably fine, while it is not in Paginator). The really obvious constants can be hard-coded (say, dividing by 2 when calculating the new array length in merge sort).

Do datatype choices affect performance?

I have an object model that I use to fill results from a query and that I then pass along to a gridview.
Something like this:
public class MyObjectModel
{
    public int Variable1 { get; set; }
    public int VariableN { get; set; }
}
Let's say Variable1 holds the value of a count and I know that the count will never get very large (i.e. the number of upcoming appointments for a certain day). For now, I've declared these as int. Let's say it's safe to assume someone will book fewer than 255 appointments per day. Will changing the data type from int to byte affect performance much? Is it worth the trouble?
Thanks
No, performance will not be affected much at all.
For each int you will be saving 3 bytes, or 6 in total for the specific example. Unless you have many millions of these, the savings in memory are very small.
Not worth the trouble.
Edit:
Just to clarify - my answer is specifically about the example code. In many cases the choices will make a difference, but it is a matter of scale and will require performance testing to ensure correct results.
To answer @Filip's comment - there is a difference between compiling an application as 64-bit and selecting an isolated data type.
Using an integer variable smaller than an int (System.Int32) will not provide any performance benefits. This is because most integer operations in the CLR will promote the variable to an int prior to performing the operation. int is considered the "natural" integer size on the systems for which the CLR was developed.
Consider the following code:
for (byte appointmentIndex = 0; appointmentIndex < Variable1; appointmentIndex++)
    ProcessAppointment(appointmentIndex);
In the compiled code, the comparison (appointmentIndex < Variable1) and the increment (appointmentIndex++) will (most likely) be performed using 32-bit integers. Even if the optimizer uses a smaller data type, the CPU itself will require additional work to use the smaller data type.
If you are storing an array of values, then using a smaller data type could help save space, which might give a performance advantage in some scenarios.
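A rough sketch of where it does start to matter; the element counts here are invented purely to show the scale:

byte[] countsAsBytes = new byte[10000000]; // roughly 10 MB of element data
int[] countsAsInts = new int[10000000];    // roughly 40 MB for the same number of elements

For a handful of properties on a model object, though, the difference is noise.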
It will affect the amount of memory allocated for that variable. In my personal opinion, I don't think it's worth the trouble in the example case.
If there were a huge number of variables, or a database table where you could really save, then yes, but not in this case.
Besides, after years of maintenance programming, I can safely say that it's rarely safe to assume an upper limit on anything. If there's even a remote chance that some poor maintenance programmer is going to have to re-write the app because of trying to save a trivial amount of resources, it's not worth the pay-off.
The .NET runtime optimizes the use of Int32 especially for counters etc.
.NET Integer vs Int16?
Contrary to popular belief, making your data type smaller does not make access faster. In fact, it's slower. Look at bool, it's implemented as an int.
This is because internally, your CPU works with native-word-sized registers (32/64 bit these days), and you're forcing it to convert your data back and forth for no reason (well only when writing the result in memory, but it's still a penalty you could easily avoid).
Fiddling with integer widths only affects memory access, and caching specifically. This is the kind of stuff you can only figure out by profiling your application and looking at page fault counters in particular.
I agree with the other answers that the performance gain won't be worth it. But if you're going to do it at all, go with a short instead of a byte. My rule of thumb is to pick the highest number you can imagine, multiply it by 10, then use that as the basis to pick your data type. So if you can't possibly imagine a value higher than 200, then use 2000 as your basis, which would mean you'd need a short.

Suggest a good method for having a number with high precision in C#

I am working on a project in which I need to store the user's key in the initial configuration of a machine; I want to write it in C#.
I have an initial configuration which consists of two numbers, R and X0, where R = 3.9988 and X0 = 0.5. I want to add the user key to these numbers. For example:
Key: hos110 =>
R = 3.9988104111115049049048
X0 = 0.5104111115049049048
104111115049049048 are the ASCII codes of the key, concatenated.
How can I store these numbers?
Is there a better method for doing this?
Update: How about MATLAB?
You're not really "adding" numbers. You are concatenating strings.
Store them as strings. You can't get much more precise than that.
If you need to perform any arithmetic operations, it is easy enough to convert them to a decimal number on the fly.
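A minimal sketch of that approach, using the key from the question; the variable names are made up here, and the decimal.Parse step only works because these particular values fit within decimal's 28-29 significant digits:

using System.Globalization;
using System.Linq;

string key = "hos110";
// Concatenate the ASCII code of each character: "104111115049049048".
string keyDigits = string.Join("", key.Select(ch => ((int)ch).ToString()).ToArray());

string r = "3.9988" + keyDigits;  // "3.9988104111115049049048"
string x0 = "0.5" + keyDigits;    // "0.5104111115049049048"

// Convert on the fly only when arithmetic is needed; longer keys would
// exceed decimal's precision and lose trailing digits.
decimal rValue = decimal.Parse(r, CultureInfo.InvariantCulture);
decimal x0Value = decimal.Parse(x0, CultureInfo.InvariantCulture);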
I don't really follow why you're using a key as part of a number, but leaving that aside... System.Decimal (aka decimal) seems like the right tool for the job here.
If you need infinite precision you need something called BigInteger. However, these classes are usually only used for scientific calculations (and are usually unsuited for storing the data), which doesn't really seem to match your code sample. If you only need to do general calculations, use strings and then convert them to decimal for the calculations.
However, if you are looking for such a BigInteger class, you can find one here.
.NET 4.0 will have a built-in BigInteger class in the class libraries, named System.Numerics.BigInteger.
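For completeness, a tiny sketch with that type; note it only handles integers, so the fractional values in the question would still need strings or decimal:

using System;
using System.Numerics; // .NET 4.0, with a reference to System.Numerics.dll

class BigIntegerExample
{
    static void Main()
    {
        // Arbitrary-precision integer arithmetic.
        BigInteger big = BigInteger.Parse("104111115049049048");
        Console.WriteLine(big * big); // no overflow, no digits lost
    }
}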
Well, depending on the precision you are trying to achieve, you can probably save these as a pair of decimal values.
However, if this is an ASCII code, you may just want to save these as a string directly. This will avoid the numerical precision issues, especially if you're going to pull off the 104111... prior to using this information.
It seems that you are storing a "key", so why not use a String then?
Floating point numbers are inherently imprecise. I'm not sure what this 'initial configuration' is or why it's a float, but you're not going to be able to tack on a 'user key' (whatever that may be) and recover it later. Store the user key separately, in a string or something.
If these 'numbers' have no numeric value, i.e. you will not use them for mathematical computation then there is no need to store them in a numeric datatype. You can store them as strings.

Looking for an easier way to use floats in C#

For a number of reasons, I have to use floats in my code instead of doubles. To use a literal in my code, I have to write something like:
float f = 0.75F;
or the compiler will barf since it treats just "0.75" as a double. Is there anything I can put in my code or set in Visual Studio that will make it treat a literal like "0.75" as a float without having to append an "F" every time?
No - fortunately, IMO. Literals are treated the same way everywhere.
This is a good thing - imagine some maintenance developer comes and looks at your code in a year's time. He sees "0.75" and thinks "I know C# - that's a double! Hang on, how is it being assigned to a float variable?" Ick.
Is it really so painful to add the "F" everywhere? Do you really have that many constants? Could you extract them as constant values, so all your "F-suffixed" literals are in the same place?
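If you do extract them, something along these lines keeps the suffix in one place; the class and field names are invented for the example:

// Illustrative only: gather the F-suffixed literals so the suffix is written once.
static class RenderConstants
{
    public const float DefaultOpacity = 0.75F;
    public const float FadeStep = 0.05F;
}

// Usage elsewhere needs no suffix at the call site:
// float f = RenderConstants.DefaultOpacity;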
FYI -- you can find all of the compiler options for C# at http://msdn.microsoft.com/en-us/library/6ds95cz0.aspx. If you check there, you'll see that there isn't any option that allows this -- and rightly so, for the reasons that @Jon Skeet noted.
The language interprets floating point precision literals as doubles everywhere. This is not a configurable feature of the compiler - and with good reason.
Configuring how the language interprets your code would lead to problems with both compatibility and the ability of maintenance developers to understand what the code means.
While not advisable generally, you can reduce the pain a little in C# 3 by using:
var f = 0.75F;
Just be careful, because forgetting the 'F' suffix with this syntax WILL cause the compiler to create a double, not a float.
Float comes with an F :-)
I would advise you to always use
var meaning = 1f;
because the "var" keyword saves a lot of human interpretation and maintenance time.
The proper behavior would not be for a compiler to interpret non-suffixed literals as single-precision floats, but rather to recognize that conversions from double to float should be regarded as widening conversions since, for every double value, there is either precisely one unambiguously-correct float representation, or (in a few rare edge cases) there will be precisely two equally-good values, neither of which will be more than a part per quadrillion from being the unambiguously-correct value. Semantically, conversions from float to double should be regarded as narrowing conversions (since they require that the compiler "guess" at information it doesn't have), but the practical difficulties that would cause might justify making conversions in that direction 'widening'.
Perhaps one should petition Microsoft to add a widening conversion from double to float? There's no good reason why code which calculates graphics coordinates as double should be cluttered with typecasts when calling drawing functions.

Numbers that exceed basic types in C#

I'm solving problems in Project Euler. Most of the problems call for:
big numbers that exceed ulong,
Ex: ulong number = 81237146123746237846293567465365862854736263874623654728568263582;
very precise decimal numbers with more than 30 significant digits,
Ex: decimal dec = 0.3242342543573894756936576474978265726385428569234753964340653;
arrays that must have index values exceeding the biggest int value.
Ex: bool[] items = new bool[213192471235494658346583465340673475263842864836];
I found a library called IntX for handling these big numbers. But I wonder how I can solve these problems with basic .NET types?
Thanks for the replies!
Well, for the third item there you really don't want to use an array, since it needs to be allocated that big as well.
Let me rephrase that.
By the time you can afford, and get access to, that much memory, the big-number problem will be solved!
To answer your last question there: there is no way you can solve this using only basic types, unless you do what the makers of IntX did and implement big-number support yourself.
Might I suggest you try a different programming language for the Euler problems? I've had better luck with Python, since it has support for big numbers out of the box and integrated into everything else. Well, except for that array - you really can't do that in any language these days.
Maybe this could give you ideas on how to solve part of your problem:
http://www.codeproject.com/csharp/BigInteger.asp
Wikipedia also has a good article about arbitrary-precision arithmetic, and that article links to CodePlex and W3b.sine, which is an arbitrary-precision real-number C# library.
Well, I suggest you take a look at this other answer to see how I solved the big numbers problem. Basically, you need to represent numbers in another way ...
Most of the problems call for:
big numbers that exceed ulong,
very precise decimal numbers with more than 30 significant digits,
arrays that must have index values exceeding the biggest int value.
Most of the problems are designed to fit within 64 bit longs. There are one or two which require bigger integers, but not many. None I've seen require decimal numbers with more than 30 digits, and none require arrays larger than a few thousand entries.
Remember that the correct solutions to the problems should run in a few seconds at most, and populating an array of 213192471235494658346583465340673475263842864836 bits will take 10^30 years.
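As a rough sanity check, assuming you could set something like 10^10 bits per second: 2.1 x 10^47 bits divided by 10^10 bits per second is about 2.1 x 10^37 seconds, which works out to roughly 10^30 years.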
Another option might be to use the BigInt type that is available in F#: http://cs.hubfs.net/forums/thread/887.aspx
