Should structs always be immutable? - c#

I was reading a couple of threads on here about structs, and one of them argued that structs should represent immutable values (e.g. a digit such as 1) because of their value-type semantics.
But on the other hand, structs represent things like phone numbers, which can change for the same household.
Is this a hard and fast rule?

A phone number does not change; you just get a different one and discard the old one. The old one is still the same as it always was. The same goes for dates, numbers, etc. Think of this when approaching structs: they are a way to encapsulate a value, which simply is, not the usage of the value, which changes.

Yes, structs should always be immutable! Mutable structs can cause terrible headaches, as their usage can create very strange behavior.

Yes, structs should almost always be immutable. For example, in your phone number case, the phone number itself doesn't mutate: what happens is that the household is allocated a new phone number. The phone number 555-555-1234 is still the phone number 555-555-1234, but the household's phone number is the different number 555-555-5678.
Note that you can find violations of this guideline in the .NET Framework. For example, the WPF Point and Size structs are mutable. This is not a good practice to follow, as one finds out when one tries to write something.Location.X = newX.
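To make the something.Location.X = newX problem concrete, here is a minimal sketch. MutablePoint, ImmutablePoint, Shape and the demo methods are hypothetical names invented for illustration; they are not the WPF types. The point it tries to show is that a property getter hands back a copy of a struct, so mutating that copy never reaches the original.

```csharp
using System;

// Hypothetical mutable struct, similar in spirit to a mutable Point.
public struct MutablePoint
{
    public double X;
    public double Y;
}

// Hypothetical immutable alternative: "changing" it produces a new value.
public readonly struct ImmutablePoint
{
    public double X { get; }
    public double Y { get; }

    public ImmutablePoint(double x, double y) { X = x; Y = y; }

    public ImmutablePoint WithX(double x) => new ImmutablePoint(x, Y);
}

public class Shape
{
    // A property getter returns a copy of the struct, not a reference to it.
    public MutablePoint Location { get; set; }
    public ImmutablePoint Position { get; set; }
}

public static class MutableStructDemo
{
    public static double MutateCopy(Shape shape, double newX)
    {
        // shape.Location.X = newX; does not compile (error CS1612),
        // so the tempting workaround is to go through a local:
        MutablePoint copy = shape.Location;
        copy.X = newX;            // changes only the local copy
        return shape.Location.X;  // still the old value
    }

    public static double ReplaceValue(Shape shape, double newX)
    {
        // With the immutable struct the intent is explicit: assign a new value.
        shape.Position = shape.Position.WithX(newX);
        return shape.Position.X;
    }
}
```

For a freshly constructed Shape, MutateCopy(shape, 5.0) returns 0 because only the local copy was changed, while ReplaceValue(shape, 5.0) returns 5.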

Immutable value != Immutable variable
What I mean is, even if your variable contains a value that can't be changed, you can still change your variable to have different contents. int x = 5; x++; is legal. 5++; is not.
If your struct contains an integer, and you assign a new value to that integer (e.g. myStruct.MyInt++), you might think you are changing the value of MyInt. Really, you're storing a new value that's one greater than the old value.
Why does it matter? Because there might be another thread accessing myStruct.MyInt concurrently, and you don't want the value it's working with to suddenly change in the middle of being used.

Related

How to handle null when overloading operator + for a class value object?

I want to have a value object that represents length. I would prefer to use a struct given that it is a value type, but since zero length does not make sense I am forced to use a class. Adding two lengths together seems like a reasonable operation, so I want to overload the + operator. I am curious though, how should I handle adding null?
Adding null to an existing string returns a string with the same content as the existing string. Adding null to an int? that has a value returns null.
I can see a case where adding null to an existing length simply returns a new length with the same value as the existing length. At the same time, I can see a case where adding null would be considered a bug. I have been trying to find some guidance but have not been able to find any. Is there a common guideline for this or is it different for each application?
I would highly recommend using struct for your length, and treating the default representation as zero length.
"since zero length does not make sense I am forced to use a class"
It is up to your code to treat the default representation of length struct as a representation of some specific length. In addition to treating it as zero length, you have at least two options:
You can treat default length as an unknown, in which case any operation with it would produce an unknown, or
You can treat it as a "trap representation" of length, in which case any operation with it would produce an exception.
It is probably a design mistake to not treat zero in a uniform way with all other numbers. Specifically, zero length becomes handy when you subtract length values: without it, subtracting two equal lengths would have nothing to produce.
As far as "unknown" length is concerned, using struct gives you a convenient standard representation of Nullable<length> immediately familiar to users of your length structure.
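A minimal sketch of this suggestion follows. The Length type, its Meters property and the LengthDemo helper are assumptions made for illustration; nothing in the question specifies units or member names. It shows default(Length) acting as zero length and Nullable<Length> (written Length?) standing in for an unknown length.

```csharp
using System;

// Hypothetical Length value type; the name and the choice of metres are assumptions.
public readonly struct Length
{
    public double Meters { get; }

    public Length(double meters)
    {
        if (meters < 0) throw new ArgumentOutOfRangeException(nameof(meters));
        Meters = meters;
    }

    // default(Length) has Meters == 0, so it naturally reads as "zero length".
    public static Length operator +(Length a, Length b) => new Length(a.Meters + b.Meters);
}

public static class LengthDemo
{
    // Length? is Nullable<Length>. The compiler lifts the + operator,
    // so an unknown (null) operand yields an unknown (null) result.
    public static Length? Add(Length? a, Length? b) => a + b;
}
```

With this shape, adding null behaves the way int? does: the result is null rather than an exception or a silently unchanged value. Whether that is the right policy still depends on the application.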
Simple Answer:
If you're allowed to add nulls in your system, then you should probably keep the existing value and treat the null like a 0, like so:
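public static NullNumber operator +(NullNumber b, NullNumber c)
{
    // Treat a null operand as zero. Value and the constructor are assumed members of NullNumber.
    return new NullNumber((b?.Value ?? 0) + (c?.Value ?? 0));
}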
Advanced Answer:
You are probably correct that a length of 0 does not make sense, and you are right that adding nulls seems like a bug.
I can't see where the field is populated, but I suspect either:
you don't have a constructor that requires you to pass in a length if it's required,
or you have a faulty class that sometimes has a length and sometimes doesn't, which sounds closer to two classes.
Strictly speaking, a null length doesn't exist in reality, everything has length. Getting a null return or a NullReferenceException when working with your struct would lead me to think I messed up the constructor or instantiation. In other words, the null reference would be employed in the scope of the application and not exposed to the client.
MyStruct length = new MyStruct(); // no!
MyStruct length = new MyStruct(feet, inches); // better...
MyStruct length = 34.5; // ok... (needs an implicit conversion from double)

Named numbers as variables [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
I've seen this a couple of times recently in high profile code, where constant values are defined as variables, named after the value, then used only once. I wondered why it gets done?
E.g. Linux Source (resize.c)
unsigned five = 5;
unsigned seven = 7;
E.g. C#.NET Source (Quaternion.cs)
double zero = 0;
double one = 1;
Naming numbers is terrible practice: one day something will need to change, and you'll end up with unsigned five = 7.
If it has some meaning, give it a meaningful name. The 'magic number' five is no improvement over the magic number 5, it's worse because it might not actually equal 5.
This kind of thing generally arises from cargo-cult style guidelines, where someone heard that "magic numbers are bad" and forbade their use without fully understanding why.
Well named variables
Giving proper names to variables can dramatically clarify code, such as
constant int MAXIMUM_PRESSURE_VALUE=2;
This gives two key advantages:
The value MAXIMUM_PRESSURE_VALUE may be used in many different places, if for whatever reason that value changes you need to change it in only one place.
Where used it immediately shows what the function is doing, for example the following code obviously checks if the pressure is dangerously high:
if (pressure>MAXIMUM_PRESSURE_VALUE){
//without me telling you you can guess there'll be some safety protection in here
}
Poorly named variables
However, everything has a counterargument, and what you have shown looks very much like a good idea taken so far that it makes no sense. Defining TWO as 2 doesn't add any value:
constant int TWO=2;
The value TWO may be used in many different places, perhaps to double things, perhaps to access an index. If in the future you need to change the index, you cannot just change it to int TWO=3; because that would affect all the other (completely unrelated) ways you've used TWO; now you'd be tripling instead of doubling, etc.
Where used it gives you no more information than if you just used "2". Compare the following two pieces of code:
if (pressure>2){
//2 might be good, I have no idea what happens here
}
or
if (pressure>TWO){
//TWO means 2, 2 might be good, I still have no idea what happens here
}
Worse still (as seems to be the case here) TWO may not equal 2, if so this is a form of obfuscation where the intention is to make the code less clear: obviously it achieves that.
The usual reason for this is a coding standard which forbids magic numbers but doesn't count TWO as a magic number; which of course it is! 99% of the time you want to use a meaningful variable name but in that 1% of the time using TWO instead of 2 gains you nothing (Sorry, I mean ZERO).
This code is inspired by Java but is intended to be language-agnostic.
Short version:
A constant five that just holds the number five is pretty useless. Don't go around making these for no reason (sometimes you have to because of syntax or typing rules, though).
The named variables in Quaternion.cs aren't strictly necessary, but you can make the case for the code being significantly more readable with them than without.
The named variables in ext4/resize.c aren't constants at all. They're tersely-named counters. Their names obscure their function a bit, but this code actually does correctly follow the project's specialized coding standards.
What's going on with Quaternion.cs?
This one's pretty easy.
Right after this:
double zero = 0;
double one = 1;
The code does this:
return zero.GetHashCode() ^ one.GetHashCode();
Without the local variables, what does the alternative look like?
return 0.0.GetHashCode() ^ 1.0.GetHashCode(); // doubles, not ints!
What a mess! Readability is definitely on the side of creating the locals here. Moreover, I think explicitly naming the variables indicates "We've thought about this carefully" much more clearly than just writing a single confusing return statement would.
What's going on with resize.c?
In the case of ext4/resize.c, these numbers aren't actually constants at all. If you follow the code, you'll see that they're counters and their values actually change over multiple iterations of a while loop.
Note how they're initialized:
unsigned three = 1;
unsigned five = 5;
unsigned seven = 7;
Three equals one, huh? What's that about?
See, what actually happens is that update_backups passes these variables by reference to the function ext4_list_backups:
/*
* Iterate through the groups which hold BACKUP superblock/GDT copies in an
* ext4 filesystem. The counters should be initialized to 1, 5, and 7 before
* calling this for the first time. In a sparse filesystem it will be the
* sequence of powers of 3, 5, and 7: 1, 3, 5, 7, 9, 25, 27, 49, 81, ...
* For a non-sparse filesystem it will be every group: 1, 2, 3, 4, ...
*/
static unsigned ext4_list_backups(struct super_block *sb, unsigned *three,
unsigned *five, unsigned *seven)
They're counters that are preserved over the course of multiple calls. If you look at the function body, you'll see that it's juggling the counters to find the next power of 3, 5, or 7, creating the sequence you see in the comment: 1, 3, 5, 7, 9, 25, 27, &c.
Now, for the weirdest part: the variable three is initialized to 1 because 3^0 = 1. The power 0 is a special case, though, because it's the only time 3^x = 5^x = 7^x. Try your hand at rewriting ext4_list_backups to work with all three counters initialized to 1 (3^0, 5^0, 7^0) and you'll see how much more cumbersome the code becomes. Sometimes it's easier to just tell the caller to do something funky (initialize the list to 1, 5, 7) in the comments.
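The counter juggling for the sparse case can be sketched roughly as follows. This is a C# illustration of the idea described in the kernel comment, not the kernel's C code; the BackupSequence class and the Take helper are hypothetical names.

```csharp
using System.Collections.Generic;

public static class BackupSequence
{
    // Returns the smallest of the three counters and advances that counter
    // to its next power. Ties go to three, then five, then seven.
    public static uint Next(ref uint three, ref uint five, ref uint seven)
    {
        uint result;
        if (three <= five && three <= seven) { result = three; three *= 3; }
        else if (five <= seven) { result = five; five *= 5; }
        else { result = seven; seven *= 7; }
        return result;
    }

    // Convenience helper (hypothetical): the first `count` values of the sequence,
    // with the counters initialized to 1, 5, 7 unless the caller says otherwise.
    public static List<uint> Take(int count, uint three = 1, uint five = 5, uint seven = 7)
    {
        var values = new List<uint>();
        for (int i = 0; i < count; i++)
            values.Add(Next(ref three, ref five, ref seven));
        return values;
    }
}
```

BackupSequence.Take(9) yields 1, 3, 5, 7, 9, 25, 27, 49, 81, matching the kernel comment. BackupSequence.Take(6, 1, 1, 1) yields 1, 1, 1, 3, 5, 7, which shows why initializing all three counters to 1 would need extra code to skip the duplicates.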
So, is five = 5 good coding style?
Is "five" a good name for the thing that the variable five represents in resize.c? In my opinion, it's not a style you should emulate in just any random project you take on. The simple name five doesn't communicate much about the purpose of the variable. If you're working on a web application or rapidly prototyping a video chat client or something and decide to name a variable five, you're probably going to create headaches and annoyance for anyone else who needs to maintain and modify your code.
However, this is one example where generalities about programming don't paint the full picture. Take a look at the kernel's coding style document, particularly the chapter on naming.
GLOBAL variables (to be used only if you really need them) need to
have descriptive names, as do global functions. If you have a function
that counts the number of active users, you should call that
"count_active_users()" or similar, you should not call it "cntusr()".
...
LOCAL variable names should be short, and to the point. If you have
some random integer loop counter, it should probably be called "i".
Calling it "loop_counter" is non-productive, if there is no chance of it
being mis-understood. Similarly, "tmp" can be just about any type of
variable that is used to hold a temporary value.
If you are afraid to mix up your local variable names, you have another
problem, which is called the function-growth-hormone-imbalance syndrome.
See chapter 6 (Functions).
Part of this is C-style coding tradition. Part of it is purposeful social engineering. A lot of kernel code is sensitive stuff, and it's been revised and tested many times. Since Linux is a big open-source project, it's not really hurting for contributions — in most ways, the bigger challenge is checking those contributions for quality.
Calling that variable five instead of something like nextPowerOfFive is a way to discourage contributors from meddling in code they don't understand. It's an attempt to force you to really read the code you're modifying in detail, line by line, before you try to make any changes.
Did the kernel maintainers make the right decision? I can't say. But it's clearly a purposeful move.
My organisation has certain programming guidelines, one of which forbids the use of magic numbers...
eg:
if (input == 3) //3 what? Elephants?....3 really is the magic number here...
This would be changed to:
#define INPUT_1_VOLTAGE_THRESHOLD 3u
if (input == INPUT_1_VOLTAGE_THRESHOLD) //Not elephants :(
We also have a source file with -200,000 -> 200,000 #defined in the format:
#define MINUS_TWO_ZERO_ZERO_ZERO_ZERO_ZERO -200000
which can be used in place of magic numbers, for example when referencing a specific index of an array.
I imagine this has been done for "Readability".
The numbers 0, 1, ... are integers. Here, the 'named variables' give the integer a different type. It might be more reasonable to declare these as constants (const unsigned five = 5;).
I've used something akin to that a couple times to write values to files:
const int32_t zero = 0 ;
fwrite( &zero, sizeof(zero), 1, myfile );
fwrite accepts a const pointer, but if some function needs a non-const pointer, you'll end up using a non-const variable.
P.S.: That always keeps me wondering what the sizeof zero may be.
How do you come to the conclusion that it is used only once? It is public, so it could be used any number of times from any assembly.
public static readonly Quaternion Zero = new Quaternion();
public static readonly Quaternion One = new Quaternion(1.0f, 1.0f, 1.0f, 1.0f);
The same thing applies to the .NET Framework decimal type, which also exposes public constants like this:
public const decimal One = 1m;
public const decimal Zero = 0m;
Numbers are often given a name when these numbers have special meaning.
For example in the Quaternion case the identity quaternion and unit length quaternion have special meaning and are frequently used in a special context. Namely Quaternion with (0,0,0,1) is an identity quaternion so it's a common practice to define them instead of using magic numbers.
For example
// define as static
static Quaternion Identity = new Quaternion(0,0,0,1);
Quaternion Q1 = Quaternion.Identity;
//or
if ( Q1.Length == Unit ) // not considering floating point error
One of my first programming jobs was on a PDP 11 using Basic. The Basic interpreter allocated memory to every number required, so every time the program mentioned 0, a byte or two would be used to store the number 0. Of course back in those days memory was a lot more limited than today and so it was important to conserve.
Every program in that work place started with:
10 U0%=0
20 U1%=1
That is, for those who have forgotten their Basic:
Line number 10: create an integer variable called U0 and assign it the number 0
Line number 20: create an integer variable called U1 and assign it the number 1
These variables, by local convention, never held any other value, so they were effectively constants. They allowed 0 and 1 to be used throughout the program without wasting any memory.
Aaaaah, the good old days!
Sometimes it's more readable to write:
double pi=3.14; //Constant or even not constant
...
CircleArea=pi*r*r;
instead of:
CircleArea=3.14*r*r;
You may also end up using pi again (you are not sure yet, but it could happen later, or in other classes if it is public),
and then if you want to change pi=3.14 into pi=3.141593 it's easier.
The same goes for other constants such as e=2.71, Avogadro's number, etc.

do datatype choices affect performance?

I have an object model that I use to fill results from a query and that I then pass along to a gridview.
Something like this:
public class MyObjectModel
{
public int Variable1 {get;set;}
public int VariableN {get;set;}
}
Let's say variable1 holds the value of a count and I know that the count will never get to become very large (ie. number of upcoming appointments for a certain day). For now, I've put these data types as int. Let's say it's safe to say that someone will book less than 255 appointments per day. Will changing the datatype from int to byte affect performance much? Is it worth the trouble?
Thanks
No, performance will not be affected much at all.
For each int you will be saving 3 bytes, or 6 in total for the specific example. Unless you have many millions of these, the savings in memory are very small.
Not worth the trouble.
Edit:
Just to clarify - my answer is specifically about the example code. In many cases the choices will make a difference, but it is a matter of scale and will require performance testing to ensure correct results.
To answer @Filip's comment - there is a difference between compiling an application to 64-bit and selecting an isolated data type.
Using an integer variable smaller than an int (System.Int32) will not provide any performance benefits. This is because most integer operations in the CLR will promote the variable to an int prior to performing the operation. int is considered the "natural" integer size on the systems for which the CLR was developed.
Consider the following code:
for (byte appointmentIndex = 0; appointmentIndex < Variable1; appointmentIndex++)
ProcessAppointment(appointmentIndex);
In the compiled code, the comparison (appointmentIndex < Variable1) and the increment (appointmentIndex++) will (most likely) be performed using 32-bit integers. Even if the optimizer uses a smaller data type, the CPU itself will require additional work to use the smaller data type.
If you are storing an array of values, then using a smaller data type could help save space, which might give a performance advantage in some scenarios.
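The promotion to int is visible in C# itself. A minimal sketch, where SmallIntegerDemo and its methods are hypothetical names used only for illustration:

```csharp
using System;

public static class SmallIntegerDemo
{
    public static byte AddBytes(byte a, byte b)
    {
        // a + b has type int: C# promotes byte operands before adding.
        // byte sum = a + b;   // would not compile without a cast
        int wide = a + b;
        return unchecked((byte)wide); // narrowing back may silently wrap
    }

    // Bytes saved by storing `count` values as byte instead of int
    // (element payload only, ignoring array headers and alignment).
    public static long SavedBytes(long count) => count * (sizeof(int) - sizeof(byte));
}
```

AddBytes(200, 100) returns 44, because 300 does not fit in a byte and wraps. SavedBytes(1000000) is 3000000, so the saving only starts to matter for large arrays, and the narrowing cast is a new place for bugs to hide.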
It will affect the amount of memory allocated for that variable. In my personal opinion, I don't think it's worth the trouble in the example case.
If there were a huge number of variables, or a database table where you could really save, then yes, but not in this case.
Besides, after years of maintenance programming, I can safely say that it's rarely safe to assume an upper limit on anything. If there's even a remote chance that some poor maintenance programmer is going to have to rewrite the app because of an attempt to save a trivial amount of resources, it's not worth the pay-off.
The .NET runtime optimizes the use of Int32 especially for counters etc.
.NET Integer vs Int16?
Contrary to popular belief, making your data type smaller does not make access faster. In fact, it can be slower. Look at bool: it is stored as a single byte, yet the CLR widens it to a 32-bit int whenever it is loaded for an operation.
This is because internally, your CPU works with native-word-sized registers (32/64 bit these days), and you're forcing it to convert your data back and forth for no reason (well only when writing the result in memory, but it's still a penalty you could easily avoid).
Fiddling with integer widths only affects memory access, and caching specifically. This is the kind of stuff you can only figure out by profiling your application and looking at page fault counters in particular.
I agree with the other answers that performance won't be worth it. But if you're going to do it at all, go with a short instead of a byte. My rule of thumb is to pick the highest number you can imagine, multiply by 10, then use that as the basis to pick your value. So if you can't possibly imagine a value higher than 200, then use 2000 as your basis, which would mean you'd need a short.

can int store string value in c#

I am working on the front end of an application. I have to introduce one more filter criterion, LoanNumber. A loan number looks like E-100. The business layer and the domain object are not under my control, so I cannot change them. The domain object field that holds the loan number is an integer, but I have to do
ingeoFilterData.intLoanNumber="E-100"
ingeoFilterData is the domain object. intLoanNumber is declared as a nullable Int32. This domain object is very critical and it goes to some external engine, so I cannot change it.
Please suggest some workaround.
Edit-
I am copying down loannumber from database table.
RT1
RT2
PT1
pt10
PT11
PT12
PT13
PT14
PT15
pt16
pt17
pt8
pt9
MDR1
MDR2
MDR3
If you have only one character, you can do this:
multiply your int by 100. (for example E-51 -> 5100)
Then keep the char as int in the rest of the number (for example 5106).
Do the reverse when you need to show the UI id (E-51).
If you have no limitations (as you mentioned) then you can use your int as a protocol (in my opinion that is even harder, because you are limited by Int32, whose maximum value is 2,147,483,647).
You can set your number to something like
<meaning><number><meaning><number>
and meaning is: 1 - number, 2 - letter, 3 - hyphen.
Then 11 will mean 1, 201 will mean A, 3 will mean hyphen, and 113201 will mean 1-A.
It's complicated and not very likely to be usable...
This solution limits your id to a length of 5 numbers or 3 letters and 1 number. You can squeeze out some more by using your int bitwise and optimizing your "protocol" as much as possible.
I hope this helps,
Danail
Is "E-100" a string, i.e. E is not a variable?
No, you can't set an int to a string value.
No, an int type cannot store a string. But you can parse your value to an int, before passing this to your domain object for filtering.
If the "prefix" of the loan number is always "E-" you could just exclude it.
Otherwise maybe you could add a property "LoanNumberPrefix" and store the "E-" in it.
Unfortunately at some point, bad design will give you unsolvable problems.
I don't know if this is one of them, but if the domain model has specified that loan numbers are integers, then either you or the people who made that model clearly haven't done their job.
Why the E in there? What does it signify? Is it just a prefix, can you remove it when storing it and put it back before displaying it?
Unfortunately, if the prefix can change, so that at some point you will have F-100 and so on, then you need to find a way to encode that into the integer you send to the domain model and business logic.
If you can't do that, you need to find a different place to store that prefix, or possibly the entire code.
If you can't do that, well, then you're screwed.
But to be blunt, this smells badly of someone who has been asleep while designing.
"Yeah, that's a good idea, we'll make the loan identification number an integer. I know somewhere, someplace, that someone has an example of what those loan identification numbers look like, but it's just numbers right? I mean, what could go wrong...?"
I think that's possible if you convert each character into its ASCII code.
string --- ASCII
0-9----48-57
A-Z----65-90
a-z----97-122
Check out an ASCII table for more info.
Conversion:
so you can convert
RT1 to 082084049
RT2 to 082084050 and
MDR3 to 077068082051
I just prepend 0s to each character code if the value is not a 3-digit one (because the maximum possible ASCII value here, for z, has 3 digits). R is actually 82, so it becomes 082. The number of digits in the final integer is therefore a multiple of 3.
Extraction:
This helps to extract the info at the other end. Just split the number into separate 3-digit values, convert each one to a char, and append them. You will get the final string.
082,084,049 - R,T,1. That's all.
P.S.: this method may run into arithmetic overflow for large strings.
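A minimal sketch of this encoding, assuming ASCII-only input; AsciiCodec and its method names are hypothetical. It also makes the overflow caveat measurable for a Nullable Int32 field.

```csharp
using System;
using System.Linq;
using System.Text;

public static class AsciiCodec
{
    // "RT1" -> "082084049": each character becomes its 3-digit ASCII code.
    public static string Encode(string text) =>
        string.Concat(text.Select(c => ((int)c).ToString("D3")));

    // "082084049" -> "RT1": read the digits back three at a time.
    public static string Decode(string digits)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < digits.Length; i += 3)
            sb.Append((char)int.Parse(digits.Substring(i, 3)));
        return sb.ToString();
    }

    // True when the encoded digits can be stored in an Int32.
    public static bool FitsInInt32(string text) => int.TryParse(Encode(text), out _);
}
```

FitsInInt32("RT1") is true, but FitsInInt32("MDR3") and FitsInInt32("pt10") are false: four characters produce twelve digits, which is more than an Int32 can hold. Note also that storing 082084049 as an int drops the leading zero, so the digits would need to be left-padded to a multiple of 3 before decoding.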
I suggest that you talk to someone in the business/domain layer, or whoever is responsible for the design of the system, and point out to them that the loan number needs to be changed to a string. No one will thank you for bodging your code to get around what is a design flaw; it can only lead to trouble and confusion later.

Using structs in C# for simple domain values

I am writing a financial application where the concept of 'Price' is used a lot. It's currently represented by the C# decimal type. I would like to make it more explicit and be able to change it to maybe double in the future, so I was thinking of creating a 'Price' struct that would basically act exactly the same as the decimal type (maybe add a bit of validation like must be greater than 0).
What do you think are the pros and cons of doing this?
Please don't use double for money. You'll have to remember to round it for display everywhere you use it at, and you have potential accuracy issues if you divide or multiply by large numbers. Decimal will give overflow errors, double will just lose accuracy. I'm not sure about you, but with money, I'd prefer an error and aborted operation to silently proceeding with a loss of accuracy.
If anything, based on projects I've been on, you may want to consider using a struct that has a decimal and some indication of what currency it is.
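One way such a struct could look is sketched below. The Money name, its members and the rule of rejecting mixed-currency addition are assumptions for illustration, not something the question or the answers prescribe.

```csharp
using System;

// Hypothetical immutable value type pairing a decimal amount with a currency code.
public readonly struct Money : IEquatable<Money>
{
    public decimal Amount { get; }
    public string Currency { get; }

    public Money(decimal amount, string currency)
    {
        Currency = currency ?? throw new ArgumentNullException(nameof(currency));
        Amount = amount;
    }

    public static Money operator +(Money a, Money b)
    {
        // Assumed policy: adding different currencies is an error, not a conversion.
        if (a.Currency != b.Currency)
            throw new InvalidOperationException("Cannot add amounts in different currencies.");
        return new Money(a.Amount + b.Amount, a.Currency);
    }

    public bool Equals(Money other) => Amount == other.Amount && Currency == other.Currency;
    public override bool Equals(object obj) => obj is Money other && Equals(other);
    public override int GetHashCode() => (Amount, Currency).GetHashCode();
    public override string ToString() => $"{Amount} {Currency}";
}
```

With decimal, new Money(0.1m, "USD") + new Money(0.2m, "USD") equals new Money(0.3m, "USD"), whereas 0.1 + 0.2 == 0.3 is false for double. One caveat of the struct approach: default(Money) bypasses the constructor and has a null Currency, so code that receives a Money may still need to handle that case.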
Structs should be used for small types that will (in my opinion) be immutable, i.e., value types. I am not sure what you mean by "used a lot", but if these structs will be passed around a lot in performance critical operations you will have to take into account the price of copying them versus the price of heap allocation. I doubt you will need to take that into account, but it is something to think about. I rarely find the need to use structs in my daily activities.
Also, as Jonathan points out, using the double type for money is a bad idea. The decimal type is much better suited to financial calculations.
Yet another aside; you will probably hear a lot of responses which talk about stack v heap allocation, so this article may interest you:
http://blogs.msdn.com/ericlippert/archive/2009/04/27/the-stack-is-an-implementation-detail.aspx
There shouldn't be a reason to change the data type for a quantity like this; however, you may decide to add other information such as the currency or the number of decimal places to keep track of in calculations, so using a struct at this point will save you a LOT of time down the road.
Structs may not be so accessible from .NET languages other than C#. Rounding errors could be a problem too. Why not just create a Money class that stores the value as a Decimal along with the currency used?
