I am currently studying C# and I really want to get a good coding style from the beginning, so I would like to hear opinions from you professionals on this matter.
Should you always (or mostly) use local variables for conditions/calculations (example 2), or is it just as good or better to use statements directly (example 1)?
Example 1.
if (double.TryParse(stringToParse, out dblValue)) ...
Example 2.
bool parseSuccess = double.TryParse(stringToParse, out dblValue);
if (parseSuccess) ...
It would be interesting to hear your thoughts and reasoning on this example.
You should use the more verbose style if putting it all in one line would make it too long or complicated.
You should also use a separate variable if the variable's name would make it easier to understand the code:
bool mustWait = someCommand.ConflictsWith(otherCommand);
if (mustWait) {
...
}
In such cases, you should consider using an enum for additional readability.
I see a lot of example 1 in production code. As long as the expression is simple, and it's easy to understand the logic of what's happening, I don't think you'll find many people who think it is bad style.
Though you will probably find a lot of people with different preferences. :)
Here's the rule I use: keep it on one line if you can quickly glance over it and know exactly what it's saying. If it's too complicated to read as quickly as you could read any other text, give it a local variable. In any case, though, you don't want a really long if statement header. So if it's too long, split it up.
I suggest you use a local variable like here:
bool parseSuccess = double.TryParse(stringToParse, out dblValue);
if (parseSuccess) ...
For two reasons:
1. You can reuse the variable later without parsing the double again.
2. It makes the code more readable.
Consider this:
if(double.TryParse(string1, out Value1) && double.TryParse(string2, out Value2) && double.TryParse(string3, out Value3) && double.TryParse(string4, out Value4))
{
//some stuff
}
It's too long, and it makes the code hard to read.
So sometimes local variables make the code a lot more readable.
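One way to break up a condition like the one above is to give each parse result its own named local. This is only a sketch: the class and method names (ParseExample, TryParseAll) are made up for illustration, and the invariant culture is an assumption so that "2.5" parses the same everywhere.

```csharp
using System;
using System.Globalization;

public static class ParseExample
{
    // Hypothetical helper: parses four strings and reports whether all of them succeeded.
    public static bool TryParseAll(string s1, string s2, string s3, string s4, out double[] values)
    {
        values = new double[4];

        // Each result gets a name, so the final condition stays short.
        // Unlike a chained &&, all four calls run even if an earlier one fails.
        bool firstOk = double.TryParse(s1, NumberStyles.Float, CultureInfo.InvariantCulture, out values[0]);
        bool secondOk = double.TryParse(s2, NumberStyles.Float, CultureInfo.InvariantCulture, out values[1]);
        bool thirdOk = double.TryParse(s3, NumberStyles.Float, CultureInfo.InvariantCulture, out values[2]);
        bool fourthOk = double.TryParse(s4, NumberStyles.Float, CultureInfo.InvariantCulture, out values[3]);

        return firstOk && secondOk && thirdOk && fourthOk;
    }
}
```

The trade-off is that the short-circuiting of && is lost, which only matters if the individual calls are expensive or have side effects.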
Clarity of the source code is an important factor, especially in application maintenance, but so is performance.
Insignificant as it may seem, using simple syntax "tricks" of the programming language can sometimes give very good results.
If I think I'll use the result somewhere later in the code, I use a variable; otherwise I give priority to direct statements.
There's no right option. Both are perfectly acceptable.
Most people choose the first option if there aren't a lot of conditions to concatenate, because it results in fewer lines of code.
As you said, you are studying C#, so my vote for you would be this style:
bool parseSuccess = double.TryParse(stringToParse, out dblValue);
if (parseSuccess) ...
While you are studying you will have a lot to learn, and the style above clearly tells you that TryParse returns a bool, so you won't have to wonder or look up what the return type of TryParse is.
I have an array of strings and I wish to find out if that array does not contain a certain string. I can use the not operator (!) in conjunction with the Contains method like so:
if (!stringArray.Contains(searchString))
{
//do something
}
The not operator (!) might be overlooked when scanning the code so I was wondering if it was considered bad practice to create an Extension method in an attempt to enhance readability:
public static bool DoesNotContain<T>(this IEnumerable<T> source, T value)
{
return !source.Contains<T>(value);
}
So now the code could read:
if (stringArray.DoesNotContain(searchString))
{
//do something
}
Is this sort of thing frowned upon?
Keep the !. This is where a comment above the line would help readability.
(I suspect ! is more efficient)
//If the word is NOT in the array then...
Another point is whether you are dead-set on using an array.
There is something (that you may or may not know about) called a HashSet.
If your sole purpose is to examine whether or not a string is in a list, you are essentially looking at set arithmetic.
Unless you are using the array for something other than finding out whether a certain term is in it or not, try using a HashSet...much faster.
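A minimal sketch of the HashSet suggestion, assuming the array is only used for membership tests. The names WordLookup, BuildSet and IsMissing are hypothetical, and the ordinal comparer is an assumption about how the strings should be matched.

```csharp
using System;
using System.Collections.Generic;

public static class WordLookup
{
    // Build the set once; later Contains calls are hash lookups
    // rather than a scan over every element of the array.
    public static HashSet<string> BuildSet(string[] words)
    {
        return new HashSet<string>(words, StringComparer.Ordinal);
    }

    // The negation is kept as a plain ! in one place.
    public static bool IsMissing(HashSet<string> set, string searchString)
    {
        return !set.Contains(searchString);
    }
}
```

Building the set costs one pass over the array, so this mostly pays off when the same collection is searched many times.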
Personally, I wouldn't make an extension method for something so simple. I understand that you're trying to keep it readable but most C# developers should catch the ! operator. It's heavily used and even beginners usually recognize it.
Seems unnecessary, !source.Contains<T>(value); is pretty readable. Also, using the existing Contains function means that your code will be more portable (i.e., it won't be dependent on your extension method being present).
I would definitely use !stringArray.Contains(string). This is what 99.9% of all developers use. DoesNotContain would confuse me at least.
I think your question is based on a bit of a faulty premise. Namely that developers will read past the ! in your code. The ! boolean operator is a very well known operator in a large number of popular programming languages (C, C++, C#, Java, etc ...). Anyone who is likely to read past the ! on a regular basis probably shouldn't be checking in code without a heavy review beforehand.
It feels like you're saying the following
I want people to code in C# but I don't trust them to read it hence I'm going to create a new dialect in my code base with extension methods.
Why stop with the ! operator? It seems just as likely that they would miss the + in += expression or read a | as a ||.
Never seen DoesNot* methods in .NET framework, so I think your problem with ! is overestimated.
I guess this is a personal choice more than a good/bad practice. IMO I like the extension methods since it's more declarative and thus more readable; at first glance you know exactly what it does. Just my 2 cents
This sounds like a bad idea, now the consumers of your code have to know about two methods (DoesNotContain and Contains) instead of just one. In general I would avoid XXNotXX methods.
I would personally make an extension method for that if I was going to be using it quite frequently within the project. If it is a one-off then I wouldn't bother, but it's not really bad practice.
The reason I would do it is because the if() has more context at a glance as to what is going on. Ok granted anyone with a brain cell would know what the current statement is doing, but it just reads nicer. Everyone will have their own preference then...
I made an extension method for formatting strings just to make the code flow better...
I prefer option 1 over option 2. Extension methods are very cool and are great to use for such things as conversions or comparisons that are used frequently. However, Microsoft does recommend to use extension methods sparingly.
I really would consider extension methods which do nothing else than negate an expression bad practice.
What about:
if (stringArray.Contains(searchString) == false)
{
//do something
}
When !something doesn't work, then fall back to something == false.
I have run across a bunch of code in a few C# projects that have the following constants:
const int ZERO_RECORDS = 0;
const int FIRST_ROW = 0;
const int DEFAULT_INDEX = 0;
const int STRINGS_ARE_EQUAL = 0;
Has anyone ever seen anything like this? Is there any way to rationalize using constants to represent language constructs? IE: C#'s first index in an array is at position 0. I would think that if a developer needs to depend on a constant to tell them that the language is 0 based, there is a bigger issue at hand.
The most common usage of these constants is in handling Data Tables or within 'for' loops.
Am I out of place thinking these are a code smell? I feel that these aren't a whole lot better than:
const int ZERO = 0;
const string A = "A";
Am I out of place thinking these are a code smell? I feel that these aren't a whole lot better than:
Compare the following:
if(str1.CompareTo(str2) == STRINGS_ARE_EQUAL) ...
with
if(str1.CompareTo(str2) == ZERO) ...
if(str1.CompareTo(str2) == 0) ...
Which one makes more immediate sense?
Abuse, IMHO. "Zero" is just one of the basics.
Although the STRINGS_ARE_EQUAL could be easy, why not ".Equals"?
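For the ".Equals" suggestion, a small sketch of the two spellings side by side. The class and method names are made up for illustration. One thing to keep in mind: CompareTo on strings is a culture-sensitive comparison, while the Equals call below is told explicitly to compare ordinally, so the two are not guaranteed to agree on every input.

```csharp
using System;

public static class StringEqualityExample
{
    // Equality spelled as "comparison result is 0" (culture-sensitive ordering).
    public static bool SameByCompareTo(string a, string b)
    {
        return a.CompareTo(b) == 0;
    }

    // Equality spelled as equality; the comparison kind is stated explicitly.
    public static bool SameByEquals(string a, string b)
    {
        return string.Equals(a, b, StringComparison.Ordinal);
    }
}
```

With Equals there is no 0 left to name, so the STRINGS_ARE_EQUAL constant is not needed at all.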
Accepted limited use of magic numbers?
That's definitely a code smell.
The intent may have been to 'add readability' to the code, however things like that actually decrease the readability of code in my opinion.
Some people consider any raw number within a program to be a 'magic number'. I have seen coding standards that basically said that you couldn't just write an integer into a program, it had to be a const int.
Am I out of place thinking these are a code smell? I feel that these aren't a whole lot better than:
const int ZERO = 0;
const int A = 'A';
Probably a bit of smell, but definitely better than ZERO=0 and A='A'. In the first case they're defining logical constants, i.e. some abstract idea (string equality) with a concrete value implementation.
In your example, you're defining literal constants -- the variables represent the values themselves. If this is the case, I would think that an enumeration is preferred since they rarely are singular values.
That is definitely bad coding.
I say constants should be used only where needed, where things could possibly change sometime later. For instance, I have a lot of "configuration" options like SESSION_TIMEOUT defined where it should stay the same, but maybe it could be tweaked later on down the road. I do not think ZERO can ever be tweaked down the road.
Also, for magic numbers zero should not be included.
I think I'm a bit strange in that belief, though, because I would say something like this is going too far:
//input is FIELD_xxx where xxx is a number
input.Substring(LENGTH_OF_FIELD_NAME); //cut out the FIELD_ to give us the number
You should have a look at some of the things at thedailywtf
One2Pt20462262185th
and
Enterprise SQL
I think sometimes people blindly follow 'Coding standards' which say "Don't use hardcoded values, define them as constants so that it's easier to manage the code when it needs to be updated", which is fair enough for stuff like:
const int MAX_NUMBER_OF_ELEMENTS_I_WILL_ALLOW = 100;
But does not make sense for:
if(str1.CompareTo(str2) == STRINGS_ARE_EQUAL)
Because every time I see this code I need to search for what STRINGS_ARE_EQUAL is defined as and then check with the docs whether that is correct.
Instead if I see:
if(str1.CompareTo(str2) == 0)
I skip step 1 (search what STRINGS_ARE... is defined as) and can check specs for what value 0 means.
You would correctly feel like replacing this with Equals() and using CompareTo() in cases where you are interested in more than just one case, e.g.:
switch (bla.CompareTo(bla1))
{
case IS_EQUAL:
case IS_SMALLER:
case IS_BIGGER:
default:
}
using if/else statements if appropriate (no idea what CompareTo() returns ...)
I would still check if you defined the values correctly according to specs.
This is of course different if the specs defines something like ComparisonClass::StringsAreEqual value or something like that (I've just made that one up) then you would not use 0 but the appropriate variable.
So it depends. When you specifically need to access the first element of an array, arr[0] is better than arr[FIRST_ELEMENT], because I will still go and check what you have defined FIRST_ELEMENT as. I will not trust you, and it might be something other than 0: for example, your element 0 is a dud and the real first element is stored at 1. Who knows.
I'd go for code smell. If these kinds of constants are necessary, put them in an enum:
enum StringEquality
{
Equal,
NotEqual
}
(However I suspect STRINGS_ARE_EQUAL is what gets returned by string.Compare, so hacking it to return an enum might be even more verbose.)
Edit: Also SHOUTING_CASE isn't a particularly .NET-style naming convention.
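A rough sketch of what wrapping a comparison result in an enum could look like. Everything here (ComparisonResult, ComparisonExample, Classify) is a hypothetical name, and string.CompareOrdinal is used only to keep the example independent of the current culture.

```csharp
using System;

// Hypothetical enum giving names to the three possible outcomes.
public enum ComparisonResult
{
    Smaller,
    Equal,
    Bigger
}

public static class ComparisonExample
{
    // Translates the raw int (negative, zero, positive) into a named value,
    // so callers can switch on the enum instead of on magic numbers.
    public static ComparisonResult Classify(string left, string right)
    {
        int raw = string.CompareOrdinal(left, right);
        if (raw < 0) return ComparisonResult.Smaller;
        if (raw > 0) return ComparisonResult.Bigger;
        return ComparisonResult.Equal;
    }
}
```

As noted above, this is more verbose than comparing against 0 directly, so it is probably only worth it when all three outcomes are handled.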
I don't know if I would call them smells, but they do seem redundant. Though DEFAULT_INDEX could actually be useful.
The point is to avoid magic numbers and zeros aren't really magical.
Is this code something in your office or something you downloaded?
If it's in the office, I think it's a problem with management if people are randomly placing constants around. Globally, there shouldn't be any constants unless everyone has a clear idea or agreement of what those constants are used for.
In C# ideally you'd want to create a class that holds constants that are used globally by every other class. For example,
class MathConstants
{
public const int ZERO=0;
}
Then in later classes something like:
....
if(something==MathConstants.ZERO)
...
At least that's how I see it. This way everyone can understand what those constants are without even reading anything else. It would reduce confusion.
There are generally four reasons I can think of for using a constant:
As a substitute for a value that could reasonably change in the future (e.g., IdColumnNumber = 1).
As a label for a value that may not be easy to understand or meaningful on its own (e.g. FirstAsciiLetter = 65),
As a shorter and less error-prone way of typing a lengthy or hard to type value (e.g., LongSongTitle = "Supercalifragilisticexpialidocious")
As a memory aid for a value that is hard to remember (e.g., PI = 3.14159265)
For your particular examples, here's how I'd judge each example:
const int ZERO_RECORDS = 0;
// almost definitely a code smell
const int FIRST_ROW = 0;
// first row could be 1 or 0, so this potentially fits reason #2,
// however, doesn't make much sense for standard .NET collections
// because they are always zero-based
const int DEFAULT_INDEX = 0;
// this fits reason #2, possibly #1
const int STRINGS_ARE_EQUAL = 0;
// this very nicely fits reason #2, possibly #4
// (at least for anyone not intimately familiar with string.CompareTo())
So, I would say that, no, these are not worse than Zero = 0 or A = "A".
If the zero indicates something other than zero (in this case STRINGS_ARE_EQUAL) then that IS Magical. Creating a constant for it is both acceptable and makes the code more readable.
Creating a constant called ZERO is pointless and a waste of finger energy!
Smells a bit, but I could see cases where this would make sense, especially if you have programmers switching from language to language all the time.
For instance, MATLAB is one-indexed, so I could imagine someone getting fed up with making off-by-one mistakes whenever they switch languages, and defining DEFAULT_INDEX in both C++ and MATLAB programs to abstract the difference. Not necessarily elegant, but if that's what it takes...
Right you are to question this smell young code warrior. However, these named constants derive from coding practices much older than the dawn of Visual Studio. They probably are redundant but you could do worse than to understand the origin of the convention. Think NASA computers, way back when...
You might see something like this in a cross-platform situation where you would use the file with the set of constants appropriate to the platform. But probably not with these actual examples. This looks like a COBOL coder was trying to make his C# look more like the English language (no offence intended to COBOL coders).
It's all right to use constants to represent abstract values, but quite another to represent constructs in your own language.
const int FIRST_ROW = 0 doesn't make sense.
const int MINIMUM_WIDGET_COUNT = 0 makes more sense.
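To make the contrast concrete, here is a small sketch. WidgetRules and its methods are invented for illustration; the idea is that the constant names a rule of the application, while the array index stays a plain literal.

```csharp
public static class WidgetRules
{
    // A domain rule (hypothetical): it could plausibly change, and the name says what it means.
    public const int MINIMUM_WIDGET_COUNT = 0;

    public static bool IsValidCount(int widgetCount)
    {
        return widgetCount >= MINIMUM_WIDGET_COUNT;
    }

    // A language construct: arrays are zero-based, so the literal 0 is left as-is.
    public static string FirstRow(string[] rows)
    {
        return rows[0];
    }
}
```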
The presumption that you should follow a coding standard makes sense. (That is, coding standards are presumptively correct within an organization.) Slavishly following it when the presumption isn't met doesn't make sense.
So I agree with the earlier posters that some of the smelly constants probably resulted from following a coding standard ("no magic numbers") to the letter without exception. That's the problem here.
Let us say for a moment that C# allowed multiple return values in the most pure sense, where we would expect to see something like:
string sender = message.GetSender();
string receiver = message.GetReceiver();
compacted to:
string sender, receiver = message.GetParticipants();
In that case, I do not have to understand the return values of the method until I actually make the method call. Perhaps I rely on Intellisense to tell me what return value(s) I'm dealing with, or perhaps I'm searching for a method that returns what I want from a class I am unfamiliar with.
Similarly, we have something like this, currently, in C#:
string receiver;
string sender = message.GetParticipants(out receiver);
where the argument to GetParticipants is an out string parameter. However, this is a bit different than the above because it means I have to preempt with, or at least go back and write, code that creates a variable to hold the result of the out parameter. This is a little counterintuitive.
My question is, is there any syntactic sugar in current C#, that allows a developer to make this declaration in the same line as the method call? I think it would make development a (tiny) bit more fluid, and also make the code more readable if I were doing something like:
string sender = message.GetParticipants(out string receiver);
to show that receiver was being declared and assigned on the spot.
No, there isn't currently any syntactic sugar around this. I haven't heard of any intention to introduce any either.
I can't say I use out parameters often enough for it really to be a significant concern for me (there are other features I'd rather the C# team spent their time on) but I agree it's a bit annoying.
.NET 4 will be adding a Tuple concept, which deals with this. Unfortunately, the C# language isn't going to provide any language support for "destructuring bind".
Personally, I like the inconvenience introduced when using out parameters. It helps me to think about whether my method is really doing what it should be or if I've crammed too much functionality into it. That said, perhaps dynamic typing in C# 4.0/.NET 4 will address some of your concerns.
dynamic participant = message.GetParticipants();
var sender = participant.Sender;
var recipient = participant.Recipient;
where
public object GetParticipants()
{
return new { Sender = ..., Recipient = ... };
}
You can also return a Tuple<T,U> or something similar. However, since you want to return two strings, it might get confusing.
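A minimal sketch of the Tuple approach using System.Tuple from .NET 4. The MessageWithTuple class is a hypothetical stand-in for the message type in the question.

```csharp
using System;

// Hypothetical message type, for illustration only.
public class MessageWithTuple
{
    private readonly string sender;
    private readonly string receiver;

    public MessageWithTuple(string sender, string receiver)
    {
        this.sender = sender;
        this.receiver = receiver;
    }

    // Both values come back in one return value; no out parameter is needed.
    public Tuple<string, string> GetParticipants()
    {
        return Tuple.Create(sender, receiver);
    }
}
```

A caller would read participants.Item1 and participants.Item2, and this is where the confusion mentioned above comes from: with two strings, nothing in the names says which item is the sender.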
I use the Tuples structs of the BclExtras library which is very handy (found it on SO, thank you JaredPar!).
I don't think such functionality exists, but if it were implemented in a way similar to arrays in perl that could be useful actually.
In Perl you can assign an array to a list of variables in parentheses. So you can, for example, do this:
($user, $password) = split(/:/,$data);
Where this bugs me the most: since there's no overload of (say) DateTime.TryParse that doesn't take an out parameter, you can't write
if (!DateTime.TryParse(s, out d))
{
return new ValidationError("{0} isn't a valid date", s);
}
without declaring d. I don't know if this is a problem with out parameters or just with how the TryParse method is implemented, but it's annoying.
This syntactic sugar is now available in the Roslyn preview as seen here (called Declaration expressions).
int.TryParse(s, out var x);
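For readers on newer compilers: inline out declarations shipped in C# 7.0 as "out variables". A short sketch, assuming C# 7.0 or later; the helper name ParseOrDefault is made up for illustration.

```csharp
using System;
using System.Globalization;

public static class OutVarExample
{
    // 'value' is declared in the argument list itself (C# 7.0 or later),
    // so no separate declaration line is needed before the call.
    public static int ParseOrDefault(string s, int fallback)
    {
        return int.TryParse(s, NumberStyles.Integer, CultureInfo.InvariantCulture, out var value)
            ? value
            : fallback;
    }
}
```

An explicit type works in the same position too, as in out int value, which is the shape the question asked for.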
At best you would have to use var rather than an explicit type, unless you want to restrict all multiple return values to be of the same type (not likely practical). You would also be limiting the scope of the variable; currently you can declare a variable at a higher scope and initialize it in an out parameter. With this approach, the variable would go out of scope in the same block as its assignment. Obviously this is usable in some cases, but I wouldn't want to enforce this as the general rule. Obviously you could leave the 'out' option in place, but chances are people are going to code for one approach or the other.
I think this is not what you want.
You may have come across a piece of code where you would have liked that. But variables popping out of nowhere because they have been introduced in the parameter list would be a personal nightmare ( to me :) )
Multiple return values have grave downsides from the point of portability/maintainability. If you make a function that returns two strings and you now want it to return three, you will have to change all the code that uses this function.
A returned record type however usually plays nice in such common scenarios.
You may be opening Pandora's box ;-)
For line compacting:
string s1, s2; s1 = foo.bar(out s2);
Lines can be any length, so you could pack some common stuff into one.
Just try to live with the semicolons.
Try the following code
Participants p = message.GetParticipants();
log(p.sender,p.receiver);
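The snippet above leaves the Participants type undefined; one possible shape is sketched below. Both classes are hypothetical, and the property names use PascalCase (Sender, Receiver) rather than the lowercase names in the snippet.

```csharp
// Hypothetical small result type holding both values by name.
public sealed class Participants
{
    public Participants(string sender, string receiver)
    {
        Sender = sender;
        Receiver = receiver;
    }

    public string Sender { get; private set; }
    public string Receiver { get; private set; }
}

// Hypothetical message type, for illustration only.
public class MessageWithRecord
{
    private readonly string sender;
    private readonly string receiver;

    public MessageWithRecord(string sender, string receiver)
    {
        this.sender = sender;
        this.receiver = receiver;
    }

    public Participants GetParticipants()
    {
        return new Participants(sender, receiver);
    }
}
```

If a third value is needed later, it becomes one more property on Participants, and existing callers that only read Sender and Receiver keep compiling.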
Imagine someone coding the following:
string s = "SomeString";
s.ToUpper();
We all know that in the example above, the call to the “ToUpper()” method is meaningless because the returned string is not handled at all. But yet, many people make that mistake and spend time trying to troubleshoot what the problem is by asking themselves “Why aren’t the characters on my ‘s’ variable capitalized”????
So wouldn’t it be great if there was an attribute that could be applied to the “ToUpper()” method that would yield a compiler error if the return object is not handled? Something like the following:
[MustHandleReturnValueAttribute]
public string ToUpper()
{
…
}
In order for this code to compile correctly the user would have to handle the return value like this:
string s = "SomeString";
string uppers = s.ToUpper();
I think this would make it crystal clear that you must handle the return value; otherwise there is no point in calling that function.
In the case of the string example this may not be a big deal but I can think of other more valid reasons why this would come in handy.
What do you guys think?
Thanks.
Does one call a method for its side-effects, for its return value, or for both? "Pure" functions (which have no effects and only serve to compute a return value) would be good to annotate as such, both to eliminate the type of error you describe, as well as to enable some potential optimizations/analyses. Maybe in another 5 years we'll see this happen.
(Note that the F# compiler will warn any time you implicitly ignore a return value. The 'ignore' function can be used when you want to explicitly ignore it.)
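On the C# side, .NET 4 and later include a [Pure] attribute in System.Diagnostics.Contracts that can mark a method as side-effect free. The sketch below assumes that namespace is available; TextHelpers and its methods are invented names. As far as I know the C# compiler does not itself reject an ignored result of such a method, so treat the attribute as documentation for readers and for analysis tools that choose to honor it.

```csharp
using System;
using System.Diagnostics.Contracts;

public static class TextHelpers
{
    // Marked as pure: calling it only makes sense if the result is used.
    [Pure]
    public static string Shout(string text)
    {
        return text.ToUpperInvariant() + "!";
    }

    // Reproduces the mistake from the question: the result of ToUpper is dropped,
    // and because strings are immutable, 's' is returned unchanged.
    public static string IgnoredResult()
    {
        string s = "SomeString";
        s.ToUpper();
        return s;
    }
}
```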
If you have ReSharper it will highlight things like this for you. I can't recommend ReSharper highly enough; it has lots of useful IDE additions, especially around refactoring.
http://www.jetbrains.com/resharper/
I am not sure that I like this. I have called many a method that returns a value that I choose not to capture. Adding some type of default (the compiler generates a warning when a return value is not handled) just seems wrong to me.
I do agree that something along the suggested lines might help out new programmers, but adding an attribute at this point in the game will only affect a very small number of methods relative to the large existing body. That same junior programmer will never get their head around the issue when most of their unhandled return values are not flagged by the compiler.
Might have been nice way back when but the horses are out of the barn.
I'd actually prefer a way to flag a struct or class as [Immutable], and have this handled automatically (with a warning) for methods called without using their return values on immutable objects. This could also protect the object by the compiler from changes after creation.
If the object is truly an immutable object, there really would be no way to handle it. It also could potentially be used by compilers to catch other common mistakes.
Tagging the method itself seems less useful to me, though. I agree with most of the other comments regarding that. If the object is mutable, calling a method could have other side-effects, so the above code could be perfectly valid.
I'm having flashbacks to putting (void) casts on all printf() calls because I couldn't get Lint to shut up about the return value being ignored.
That said, it seems like this functionality should be in some code checker tool rather than the compiler itself.
At least a compiler warning would be helpful. Perhaps they will add something similar for C# 4.0 (Design-By-Contract).
This doesn't warrant a warning or pragma. There are too many places where it is intended to discard the result, and I'd be quite annoyed getting a warning/error from the compiler just because the method was decorated with some dodgy attribute.
This kind of 'warning' should be annotated in the IDE's Editor, like a small icon on the gutter "Warning: Discarding return value" or similar.
Resharper certainly thinks so, and out of the box it will nag you to convert
Dooberry dooberry = new Dooberry();
to
var dooberry = new Dooberry();
Is that really considered the best style?
It's of course a matter of style, but I agree with Dare: C# 3.0 Implicit Type Declarations: To var or not to var?. I think using var instead of an explicit type makes your code less readable. In the following code:
var result = GetUserID();
What is result? An int, a string, a GUID? Yes, it matters, and no, I shouldn't have to dig through the code to know. It's especially annoying in code samples.
Jeff wrote a post on this, saying he favors var. But that guy's crazy!
I'm seeing a pattern for stackoverflow success: dig up old CodingHorror posts and (Jeopardy style) phrase them in terms of a question.
I use it only when it's clearly obvious what var is.
clear to me:
XmlNodeList itemList = rssNode.SelectNodes("item");
var rssItems = new RssItem[itemList.Count];
not clear to me:
var itemList = rssNode.SelectNodes("item");
var rssItems = new RssItem[itemList.Count];
The best summary of the answer I've seen to this is Eric Lippert's comment, which essentially says you should use the concrete type if it's important what the type is, but not to otherwise. Essentially type information should be reserved for places where the type is important.
The standard at my company is to use var everywhere, which we came to after reading various recommendations and then spending some time trying it out to see whether the lack of annotated type information was a help or a hindrance. We felt it was a help.
Most of the recommendations people have linked to (e.g. Dare's one) are recommendations made by people who have never tried coding using var instead of the concrete type. This makes the recommendations all but worthless because they aren't speaking from experience, they're merely extrapolating.
The best advice I can give you is to try it for yourself, and see what works for you and your team.
@jongalloway - var doesn't necessarily make your code more unreadable.
var myvariable = DateTime.Now;
DateTime myvariable = DateTime.Now;
The first is just as readable as the second and requires less work
var myvariable = ResultFromMethod();
Here you have a point: var could make the code less readable. I like var because if I change a decimal to a double, I don't have to go change it in a bunch of places (and don't say refactor, sometimes I forget, just let me var!)
EDIT: just read the article, I agree. lol.
There was a good discussion on this at Coding Horror
Personally I try to keep its use to a minimum, I have found it hurts readability especially when assigning a variable from a method call.
I have a feeling this will be one of the most popular questions asked over time on Stack Overflow. It boils down to preference. Whatever you think is more readable. I prefer var when the type is defined on the right side because it is terser. When I'm assigning a variable from a method call, I use the explicit type declaration.
It only makes sense when you don't know the type in advance.
In C# 9.0 there is a new way to initialize a class by Target-typed new expressions.
You can initialize the class like this:
Dooberry dooberry = new();
Personally, I like it more than using a var and it is more readable for me.
Regarding calling a method I think it is up to you. Personally, I prefer to specify the type because I think it is more readable this way:
Dooberry dooberry = GetDooberry();
In some cases, it is very clear what the type is, in this case, I use var:
var now = DateTime.Now;
One of the advantages of a tool like ReSharper is that you can write the code however you like and have it reformat to something more maintainable afterward. I have R# set to always reformat such that the actual type in use is visible, however, when writing code I nearly always type 'var'.
Good tools let you have the best of both worlds.
John.
"Best style" is subjective and varies depending on context.
Sometimes it is way easier to use 'var' instead of typing out some hugely long class name, or if you're unsure of the return type of a given function. I find I use 'var' more when mucking about with Linq, or in for loop declarations.
Other times, using the full class name is more helpful as it documents the code better than 'var' does.
I feel that it's up to the developer to make the decision. There is no silver bullet. No "one true way".
Cheers!
I'm seeing a pattern for stackoverflow success: dig up old CodingHorror posts and (Jeopardy style) phrase them in terms of a question.
I plead innocent! But you're right, this seemed to be a relatively popular little question.
There's a really good MSDN article on this topic and it outlines some cases where you can't use var:
The following restrictions apply to implicitly-typed variable declarations:
var can only be used when a local variable is declared and initialized in the same statement; the variable cannot be initialized to null, or to a method group or an anonymous function.
var cannot be used on fields at class scope.
Variables declared by using var cannot be used in the initialization expression. In other words, this expression is legal: int i = (i = 20); but this expression produces a compile-time error: var i = (i = 20);
Multiple implicitly-typed variables cannot be initialized in the same statement.
If a type named var is in scope, then the var keyword will resolve to that type name and will not be treated as part of an implicitly typed local variable declaration.
I would recommend checking it out to understand the full implications of using var in your code.
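The restrictions quoted above can be sketched in a few lines. The class name VarRules is made up, and the rejected forms are left as comments so the example still compiles.

```csharp
using System;
using System.Collections.Generic;

public static class VarRules
{
    // private static var field = 1;        // error: var is not allowed on fields

    public static int Demo()
    {
        var count = 3;                       // fine: inferred as int
        var names = new List<string>();      // fine: inferred as List<string>

        // var nothing = null;               // error: no type to infer from null
        // var a = 1, b = 2;                 // error: multiple implicitly-typed variables
        // var self = (self = 20);           // error: variable used in its own initializer

        string text = null;                  // an explicit type is needed to start from null
        names.Add(text ?? "none");

        return count + names.Count;          // 3 + 1
    }
}
```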
No, not always, but I would go as far as to say a lot of the time. Type declarations aren't much more useful than Hungarian notation ever was. You still have the same problem that types are subject to change, and as much as refactoring tools are helpful for that, it's not ideal compared to only having to change a type in the single place where it is specified, which follows the Don't Repeat Yourself principle.
Any single-line statement where a type's name can be specified for both a variable and its value should definitely use var, especially when it's a long Generic<OtherGeneric<T, U, V>, Dictionary<X, Y>>.
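To show the long-generic case side by side, here is a small sketch; the class name and the particular nested type are invented for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class LongTypeExample
{
    public static int CountEntries()
    {
        // Explicit: the full type name is written twice on the same line.
        Dictionary<string, List<Tuple<int, string>>> explicitIndex = new Dictionary<string, List<Tuple<int, string>>>();

        // var: the type is still visible on the right-hand side, just once.
        var index = new Dictionary<string, List<Tuple<int, string>>>();
        index["a"] = new List<Tuple<int, string>> { Tuple.Create(1, "one") };

        return explicitIndex.Count + index.Count;
    }
}
```

Both variables have exactly the same static type; only the amount of repetition differs.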