I have a very quick question about the best way to use two variables. Essentially I have an enum and an int, the value for which I want to get within several ifs. Should I declare them outside the if's or inside - consider the following examples:
e.g.1:
public void test() {
EnumType? value = null;
int distance = 0;
if(anotherValue == something) {
distance = 10;
value = getValue(distance);
}
else if(anotherValue == somethingElse) {
distance = 20;
value = getValue(distance);
}
if (value == theValueWeWant){
//Do something
}
}
OR
e.g.2:
public void test() {
if(anotherValue == something) {
int distance = 10;
EnumType value = getValue(distance);
if (value == theValueWeWant){
//Do something
}
}
else if(anotherValue == somethingElse) {
int distance = 20;
EnumType value = getValue(distance);
if (value == theValueWeWant){
//Do something
}
}
}
I am just curious which is best? or if there is a better way?
Purely in terms of maintenance, the first code block is better as it does not duplicate code (assuming that "Do something" is the same in both cases).
In terms of performance, the difference should be negligible. The second case does generate twice as many locals in the compiled IL, but the JIT should notice that their usage does not overlap and optimize them away. The second case is also going to cause emission of the same code twice (if (value == theValueWeWant) { ...), but this should also not cause any significant performance penalty.
(Though both aspects of the second example will cause the compiled assembly to be very slightly larger, more IL does not always imply worse performance.)
Both examples do two different things:
Version 1 will run the same code if you get the desired value, whereas Version 2 will potentially run different code even if you get the desired value.
There are a lot of possible (micro)optimizations you could do.
For example, if distance is only ever used in getValue(distance), you could get rid of it entirely:
/*Warning, micro-optimization!*/
public void test() {
EnumType value = getValue((anotherValue == something) ? 10 : (anotherValue == somethingElse) ? 20 : 0);
if (value == theValueWeWant){
//Do something
}
}
If you wish to use those later on, then go for the first method (declaring them before the ifs). Variables declared inside the ifs will be lost as soon as they're out of scope.
Even if you don't want to use them later, declaring them before the if's is something you should do, to avoid code repetition.
This question is purely a matter of style and hence has no correct answer, only opinions.
The C# best practice is generally to declare variables in the scope where they are used. This would point to the second example as the answer. Even though the types and names are the same, they represent different uses and should be constrained to the blocks in which they are created.
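One way to keep variables in the narrowest scope without repeating the "Do something" check is to move the branching into a small helper. This is only a sketch: EnumType's members, GetValue's body, DistanceFor and the string inputs are all made up for illustration and are not from the question.

```csharp
using System;

public enum EnumType { None, Near, Far }

public static class ScopeSketch
{
    // Hypothetical stand-in for the question's getValue(distance).
    public static EnumType GetValue(int distance) =>
        distance >= 20 ? EnumType.Far : distance > 0 ? EnumType.Near : EnumType.None;

    // Hypothetical helper: maps the input to a distance; null means no branch matched.
    public static int? DistanceFor(string anotherValue)
    {
        switch (anotherValue)
        {
            case "something": return 10;
            case "somethingElse": return 20;
            default: return null;
        }
    }

    public static bool Test(string anotherValue, EnumType theValueWeWant)
    {
        int? distance = DistanceFor(anotherValue);
        if (distance == null)
            return false;

        // 'value' is declared exactly where it is first needed, and only once.
        EnumType value = GetValue(distance.Value);
        return value == theValueWeWant; // "Do something" would go here
    }
}
```

Both locals are still declared once, but neither lives longer than it has to.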
Any easier way to write this if statement?
if (value==1 || value==2)
For example... in SQL you can say where value in (1,2) instead of where value=1 or value=2.
I'm looking for something that would work with any basic type... string, int, etc.
How about:
if (new[] {1, 2}.Contains(value))
It's a hack though :)
Or if you don't mind creating your own extension method, you can create the following:
public static bool In<T>(this T obj, params T[] args)
{
return args.Contains(obj);
}
And you can use it like this:
if (1.In(1, 2))
:)
A more complicated way :) that emulates SQL's 'IN':
public static class Ext {
public static bool In<T>(this T t,params T[] values){
foreach (T value in values) {
if (t.Equals(value)) {
return true;
}
}
return false;
}
}
if (value.In(1,2)) {
// ...
}
But go for the standard way, it's more readable.
EDIT: a better solution, according to @Kobi's suggestion:
public static class Ext {
public static bool In<T>(this T t,params T[] values){
return values.Contains(t);
}
}
C# 9 supports this directly:
if (value is 1 or 2)
However, in many cases a switch might be clearer (especially with the more recent switch syntax enhancements). Note that if (value is 1 or 2) gets compiled identically to if (value == 1 || value == 2).
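To make the pattern-matching option a bit more concrete, here is a small sketch (C# 9 or later). The class and method names are made up for illustration; only the `is ... or ...` and switch expression syntax come from the language itself.

```csharp
using System;

public static class PatternSketch
{
    // Constant 'or' pattern: reads like SQL's IN for a fixed list of constants.
    public static bool IsOneOrTwo(int value) => value is 1 or 2;

    // Works for strings too, as long as the candidates are compile-time constants.
    public static bool IsPrivilegedRole(string role) => role is "Admin" or "Director";

    // Relational patterns can be mixed in.
    public static bool IsSmallOrHuge(int value) => value is < 10 or > 1000;

    // Switch expression form, which tends to scale better as cases are added.
    public static string Describe(int value) => value switch
    {
        1 or 2 => "one or two",
        > 2 and < 10 => "single digit",
        _ => "something else"
    };
}
```

The limitation is that the candidates must be constants; for values only known at run time, Contains or one of the In extension methods is still needed.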
Is this what you are looking for?
if (new int[] { 1, 2, 3, 4, 5 }.Contains(value))
If you have a List, you can use .Contains(yourObject), if you're just looking for it existing (like a where). Otherwise look at Linq .Any() extension method.
Using Linq,
if(new int[] {1, 2}.Contains(value))
But I'd have to think that your original if is faster.
Alternatively, you could use a switch statement, which would give you more flexibility if you need to test for values other than 1 or 2 in the future:
switch(value)
{
case 1:
case 2:
return true;
default:
return false;
}
If you look up values in a fixed set many times (for example, once per item of a long list), HashSet<T> should be used. If the set is very short (< ~20 items), List<T> could have better performance, based on this test:
HashSet vs. List performance
HashSet<int> nums = new HashSet<int> { 1, 2, 3, 4, 5 };
// ....
if (nums.Contains(value))
Generally, no.
Yes, there are cases where the list is in an Array or List, but that's not the general case.
An extension method like this would do it...
public static bool In<T>(this T item, params T[] items)
{
return items.Contains(item);
}
Use it like this:
Console.WriteLine(1.In(1,2,3));
Console.WriteLine("a".In("a", "b"));
You can use the switch statement with pattern matching (another version of jules's answer):
if (value switch{1 or 3 => true,_ => false}){
// do something
}
Easier is subjective, but maybe the switch statement would be easier? You don't have to repeat the variable, so more values can fit on the line, and a line with many comparisons is more legible than the counterpart using the if statement.
In vb.net or C# I would expect that the fastest general approach to compare a variable against any reasonable number of separately-named objects (as opposed to e.g. all the things in a collection) will be to simply compare each object against the comparand much as you have done. It is certainly possible to create an instance of a collection and see if it contains the object, and doing so may be more expressive than comparing the object against all items individually, but unless one uses a construct which the compiler can explicitly recognize, such code will almost certainly be much slower than simply doing the individual comparisons. I wouldn't worry about speed if the code will by its nature run at most a few hundred times per second, but I'd be wary of the code being repurposed to something that's run much more often than originally intended.
An alternative approach, if a variable is something like an enumeration type, is to choose power-of-two enumeration values to permit the use of bitmasks. If the enumeration type has 32 or fewer valid values (e.g. starting Harry=1, Ron=2, Hermione=4, Ginny=8, Neville=16), one could store them in an integer and check for multiple bits at once in a single operation: if ((thisOne & (Harry | Ron | Neville | Beatrix)) != 0) /* Do something */. This will allow for fast code, but is limited to enumerations with a small number of values.
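A minimal sketch of that bitmask idea, using the character names from the paragraph above. The enum name, the [Flags] attribute and the helper method are my own choices for illustration.

```csharp
using System;

[Flags]
public enum Character
{
    None = 0,
    Harry = 1,
    Ron = 2,
    Hermione = 4,
    Ginny = 8,
    Neville = 16
}

public static class BitmaskSketch
{
    // True if 'thisOne' is any of Harry, Ron or Neville: one AND plus one compare.
    public static bool IsHarryRonOrNeville(Character thisOne)
    {
        const Character mask = Character.Harry | Character.Ron | Character.Neville;
        return (thisOne & mask) != 0;
    }
}
```

Adding another name to the test only changes the mask constant, not the number of comparisons.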
A somewhat more powerful approach, but one which must be used with care, is to use some bits of the value to indicate attributes of something, while other bits identify the item. For example, bit 30 could indicate that a character is male, bit 29 could indicate friend-of-Harry, etc. while the lower bits distinguish between characters. This approach would allow for adding characters who may or may not be friend-of-Harry, without requiring the code that checks for friend-of-Harry to change. One caveat with doing this is that one must distinguish between enumeration constants that are used to SET an enumeration value, and those used to TEST it. For example, to set a variable to indicate Harry, one might want to set it to 0x60000001, but to see if a variable IS Harry, one should bit-test it with 0x00000001.
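Roughly, the attribute-bits idea could look like the sketch below. The bit positions follow the paragraph above, but the constant names, the 16-bit id mask and the second character are assumptions of mine. Note one deliberate variation: instead of bit-testing against 0x00000001, this version masks off the id bits and compares them, which allows more than a handful of ids.

```csharp
using System;

public static class TaggedIdSketch
{
    // Attribute bits (bit 30 and bit 29, as in the text above).
    public const int Male = 1 << 30;           // 0x40000000
    public const int FriendOfHarry = 1 << 29;  // 0x20000000

    // Assumed layout: the low 16 bits identify the individual character.
    public const int IdMask = 0x0000FFFF;
    public const int HarryId = 0x00000001;

    // Constant used to SET a variable to Harry: id plus his attributes.
    public const int HarryValue = Male | FriendOfHarry | HarryId; // 0x60000001

    // Hypothetical character added later; the friend-of-Harry check needs no change.
    public const int LunaValue = FriendOfHarry | 0x00000002;

    // Constants used to TEST a value look only at the relevant bits.
    public static bool IsFriendOfHarry(int value) => (value & FriendOfHarry) != 0;
    public static bool IsHarry(int value) => (value & IdMask) == HarryId;
}
```

The SET constant (HarryValue) and the TEST constants (HarryId with IdMask) are kept separate on purpose, which is the caveat the paragraph above warns about.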
One more approach, which may be useful if the total number of possible values is moderate (e.g. 16-16,000 or so), is to have an array of flags associated with each value. One could then code something like "if ((characterAttributes[theCharacter] & characterAttribute.Male) != 0)". This approach will work best when the number of characters is fairly small. If the array is too large, cache misses may slow down the code to the point that testing against a small number of characters individually would be faster.
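A small sketch of the lookup-table variant. The ids and the attribute assignments in the table are invented for the example; only the shape of the test expression comes from the paragraph above.

```csharp
using System;

[Flags]
public enum CharacterAttribute : byte
{
    None = 0,
    Male = 1,
    FriendOfHarry = 2
}

public static class AttributeTableSketch
{
    // Hypothetical ids: 0 = Harry, 1 = Hermione, 2 = Draco.
    private static readonly CharacterAttribute[] characterAttributes =
    {
        CharacterAttribute.Male | CharacterAttribute.FriendOfHarry,
        CharacterAttribute.FriendOfHarry,
        CharacterAttribute.Male
    };

    // One array read plus one AND, regardless of how many characters are male.
    public static bool IsMale(int theCharacter) =>
        (characterAttributes[theCharacter] & CharacterAttribute.Male) != 0;

    public static bool IsFriendOfHarry(int theCharacter) =>
        (characterAttributes[theCharacter] & CharacterAttribute.FriendOfHarry) != 0;
}
```

Using byte as the underlying type keeps the table compact, which matters for the cache-miss concern mentioned above.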
Using Extension Methods:
public static class ObjectExtension
{
public static bool In(this object obj, params object[] objects)
{
if (objects == null || obj == null)
return false;
object found = objects.FirstOrDefault(o => o.GetType().Equals(obj.GetType()) && o.Equals(obj));
return (found != null);
}
}
Now you can do this:
string role= "Admin";
if (role.In("Admin", "Director"))
{
...
}
public static bool EqualsAny<T>(IEquatable<T> value, params T[] possibleMatches) {
foreach (T t in possibleMatches) {
if (value.Equals(t))
return true;
}
return false;
}
public static bool EqualsAny<T>(IEquatable<T> value, IEnumerable<T> possibleMatches) {
foreach (T t in possibleMatches) {
if (value.Equals(t))
return true;
}
return false;
}
I had the same problem but solved it with a switch statement
switch(a value you are switching on)
{
case 1:
the code you want to happen;
break;
case 2:
the code you want to happen;
break;
default:
return a value;
}
A strange case popped up. Two properties of two objects are being cast to the same primitive type and (seemingly) have the same value. However, the equality operator (==) returns false. If we use the Equals method (or other means of comparing the two values), then we get the correct result.
Even stranger, actually placing the result of the cast into a new variable also seems to work.
Below is a VERY simplified code example, and it will NOT yield the same results when copied and compiled. It's just used to illustrate the general setting where the problem occurs.
class Program
{
static void Main(string[] args)
{
var v1 = new Object1 { SomeValue = (short)-1d };
var invalidResult = (int)v1.SomeValue == (int)SomeEnum.Value1; //for some reason this returns false
var validResult = ((int)v1.SomeValue).CompareTo((int)SomeEnum.Value1) == 0; //this works
var extraValidResult = ((int)v1.SomeValue).Equals((int)SomeEnum.Value1);
var cast1 = (int)v1.SomeValue;
var cast2 = (int)SomeEnum.Value1;
var otherValidResult = cast1 == cast2; //this also works
}
}
public class Object1
{
public short SomeValue { get; set; }
}
public enum SomeEnum : short
{
Value1 = -1,
Value2 = 0,
Value3 = 1
}
Here's a screenshot of the VS watch window as proof of what we're seeing:
I know sometimes VS can show invalid values in the "watches" window, however the effects aren't limited to that window and a case actually fails an if check where it should not in one of our tests. AFAIK there's no trickery in the code (like overriding == or Equals).
What could possibly be happening here?
(We've obviously "fixed" the issue using the CompareTo method, but we're all still scratching our heads wondering what exactly had happened...)
EDIT:
I realize the code example above is... a tad bit useless. However posting the actual code in question might prove to be very difficult; there's a lot of it. Finally, the "live" values of some objects are populated from a SQL server (using Entity Framework), which complicates sharing of the code even further. I'm happy to try and answer any additional questions to try and narrow down the issue, but sharing of the FULL code is, unfortunately, not possible (a specific block of it is possible, but it won't compile for obvious reasons). The example code was provided to show how strange the issue is.
EDIT 2:
Sorry for the delay. Here's the particular method in question:
public bool IsLocalizationBlockedByMagPElem(int localizationId)
{
IEnumerable<MagPElem> magPElems = MagPElemRepository.GetByLocalizationIdAndStatusesOrderedByIdDescThenSubLpAsc(localizationId, DocumentStatus.InBuffer, DocumentStatus.InBufferReedition);
if (magPElems.Count() != 0)
{
var magPElem = magPElems.First();
//this commented out code did not return the expected value due to the strange comparison issue
//return (magPElem.MaP_GIDTyp == (int)ExternalSystemType.PM_GIDTyp || (magPElem.MaP_GIDTyp == (int)ExternalSystemType.MP_GIDTyp && magPElem.MaP_SubGIDLp == (short)LocalizationDirection.Destination));
//to avoid the issue CompareTo is being used, but Equals would work just as well
return (magPElem.MaP_GIDTyp == (int)ExternalSystemType.PM_GIDTyp || (magPElem.MaP_GIDTyp == (int)ExternalSystemType.MP_GIDTyp && magPElem.MaP_SubGIDLp.CompareTo((short)LocalizationDirection.Destination) == 0));
}
return false;
}
I've also updated the proof screenshot to encompass more of the screen. In the screenshot there are some minor discrepancies with the above code as we were testing to see what's going on (like trying out different casts or assigning the cast result to a variable). But the gist of the problem is there.
I have been writing:
if(Class.HasSomething() == true/false)
{
// do something
}
else
{
// do something else
}
but I've also seen people that do the opposite:
if(true/false == Class.HasSomething())
{
// do something
}
else
{
// do something else
}
Is there any advantage in doing one or the other in terms of performance and speed? I'm NOT talking about coding style here.
They're both equivalent, but my preference is
if(Class.HasSomething())
{
// do something
}
else
{
// do something else
}
...for simplicity.
Certain older-style C programmers prefer "Yoda Conditions", because if you accidentally use a single-equals sign instead, you'll get a compile time error about assigning to a constant:
if (true = Foo()) { ... } /* Compile time error! Stops typo-mistakes */
if (Foo() = true) { ... } /* Will actually compile for certain Foo() */
Even though that mistake will no longer compile in C#, old habits die hard, and many programmers stick to the style developed in C.
Personally, I like the very simple form for True statements:
if (Foo()) { ... }
But for False statements, I like an explicit comparison.
If I write the shorter !Foo(), it is easy to overlook the ! when reviewing code later.
if (false == Foo()) { ... } /* Obvious intent */
if (!Foo()) { ... } /* Easy to overlook or misunderstand */
The second example is what I've heard called "Yoda conditions"; "False, this method's return value must be". It's not the way you'd say it in English and so among English-speaking programmers it's generally looked down on.
Performance-wise, there's really no difference. The first example is generally better grammatically (and thus for readability), but given the name of your method the "grammar" involved (and the fact you're comparing bool to bool) would make the equality check redundant anyway. So, for a true statement, I would simply write:
if(Class.HasSomething())
{
// do something
}
else
{
// do something else
}
This would be incrementally faster, as the if() block basically has a built-in equality comparison, so if you code if(Class.HasSomething() == true) the CLR will evaluate if((Class.HasSomething() == true) == true). But, we're talking a gain of maybe a few clocks here (not milliseconds, not ticks, but clocks; the ones that happen 2 billion times a second in modern processors).
For a false condition, it's a toss-up between using the not operator: if(!Class.HasSomething()) and using a comparison to false: if(Class.HasSomething() == false). The first is more concise, but it can be easy to miss that little exclamation point in a complex expression (especially since it occurs before the entire expression) and so I'd consider equating with false to ensure that the code is readable.
You will not see any performance difference.
The correct option is
if (Whatever())
The only time you should write == false or != true is when dealing with bool?s (in which case all four options have different meanings).
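To spell out the bool? case, here is a tiny sketch (the class and method names are mine). It returns the result of all four comparisons for a given nullable value.

```csharp
using System;

public static class NullableBoolSketch
{
    // Order of results: b == true, b != false, b == false, b != true
    public static bool[] Compare(bool? b) =>
        new[] { b == true, b != false, b == false, b != true };
}
```

For null the results are false, true, false, true, so == true and != false (and likewise == false and != true) stop being interchangeable as soon as null is a possible value.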
You will not see any performance difference, either comparison is translated into the same IL...
if(Class.HasSomething())
{
// do something
}
is my way. But try to avoid calling HasSomething() multiple times; it is better to store the return value once and reuse it.
You should write neither.
Write
if(Class.HasSomething())
{
// do something
}
else
{
// do something else
}
instead. If Class.HasSomething() is already a bool, it's pointless to compare it to another boolean.
There is no perf advantage here. This coding style is used to guard against the situation where the programmer types = instead of ==. The compiler will catch this because true/false are constants and cannot be assigned a new value.
For the case of booleans, I'd recommend neither: just use if (method()) and if (!method()). For the case of things besides booleans, the convention of using yoda-speak, e.g. if (1 == x) came about to prevent mistakes, because if (1 = x) will throw a compiler error while if (x = 1) will not (it is valid code in C, but is probably not what you intended). In C#, such a statement is only valid if the variable was a boolean, which reduces the need to do that.
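A short sketch of what that means in C# (names are made up). The int version of the typo is left as a comment because it does not compile; the bool version does compile, though as far as I know the compiler flags it with warning CS0665.

```csharp
using System;

public static class AssignmentTypoSketch
{
    public static string IntCase(int x)
    {
        // if (x = 1) { }   // would not compile: an int cannot be used as a condition
        return x == 1 ? "one" : "not one";
    }

    public static string BoolCase(bool flag)
    {
        // The = typo compiles for bool (expect warning CS0665 here),
        // and the branch is taken no matter what was passed in.
        if (flag = true)
            return "always taken";
        return "never reached";
    }
}
```

So the only situation where a Yoda condition still buys anything in C# is a bool compared against a literal, which is exactly the comparison that is better dropped altogether.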
I mostly develop using C#, but I think this question might be suitable for other languages as well. Also, it seems like there is a lot of code here, but the question is very simple.
Inlining, as I understand it, is the compiler (in the case of C# the Virtual Machine) replacing a method call by inserting the body of the method in every place the method was called from.
Let's say I have the following program:
static void Main()
{
int number = 7;
bool a;
a = IsEven(number);
Console.WriteLine(a);
}
... the body of the method IsEven:
bool IsEven(int n)
{
if (n % 2 == 0) // Two conditional return paths
return true;
else
return false;
}
I can understand what the code will look like after inlining the method:
static void Main()
{
int number = 7;
bool a;
if (number % 2 == 0)
a = true;
else
a = false;
Console.WriteLine(a); // Will print true if 'number' is even, otherwise false
}
An obviously simple and correct program.
But if I tweak the body of IsEven a little bit to include an absolute return path...
bool IsEven(int n)
{
if (n % 2 == 0)
return true;
return false; // <- Absolute return path!
}
I personally like this style a bit more in some situations. Some refactoring tools might even suggest that I change the first version to look like this one - but when I tried to imagine what this method would look like when it's inlined, I was stumped.
If we inline the second version of the method:
static void Main()
{
int number = 7;
bool a;
if (number % 2 == 0)
a = true;
a = false;
Console.WriteLine(a); // Will always print false!
}
The question to be asked:
How does the compiler / virtual machine deal with inlining a method that has an absolute return path? It seems extremely unlikely that something like this would really prevent method inlining, so I wonder how such things are dealt with. Perhaps the process of inlining isn't as simple as this? Maybe one version is more likely to be inlined by the VM?
Edit:
Profiling both methods (and manual inlining of the first one) showed no difference in performance, so I can only assume that both methods get inlined and work in the same or similar manner (at least on my VM).
Also, these methods are extremely simple and seem almost interchangeable, but complex methods with absolute return paths might be much more difficult to change into versions without absolute return paths.
It can be kind of hard to explain what the JITter does when it inlines - it does not change the C# code to do the inlining - it will (always?) work on the produced bytes (the compiled version) - and the "tools" you have when producing assembly code (the actual machine code bytes) are much more fine-grained than what you have in C# (or IL for that matter).
That said, you can get an idea of how it works in C# terms by considering the break keyword.
Consider the possibility that every inlined function that's not trivial is enclosed in a while (true) loop (or do while(false) loop) - and that every source return is translated to a localVar = result; break; set of statements. Then you get something like this:
static void Main()
{
int number = 7;
bool a;
while (true)
{
if (number % 2 == 0)
{
a = true;
break;
}
a = false;
break;
}
Console.WriteLine(a); // Will always print the right thing! Yey!
}
Similarly, when producing assembly, you will see A LOT of jmps being generated - those are the moral equivalent of the break statements, but they are MUCH more flexible (think about them as anonymous gotos or something).
So you can see, the jitter (and any compiler that compiles to native) has A LOT of tools at hand it can use to do "the right thing".
The return statement indicates:
The result value
Destruction of local variables (this applies to C++, not C#). In C#, finally blocks and Dispose calls in using blocks will run.
A jump out of the function
All these things still happen after inlining. The jump will be local instead of cross-functions after inlining, but it will still be there.
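As a small illustration that a return is more than a jump, here is a sketch in which a finally block runs on both return paths. The Log list and the try/finally are additions of mine for demonstration, not part of the question's IsEven. Whether the JIT would actually inline a method with a try/finally is a separate question that this sketch does not try to answer.

```csharp
using System;
using System.Collections.Generic;

public static class ReturnSketch
{
    public static readonly List<string> Log = new List<string>();

    public static bool IsEven(int n)
    {
        try
        {
            if (n % 2 == 0)
                return true;    // the result is fixed here...
            return false;
        }
        finally
        {
            Log.Add("cleanup"); // ...but this still runs before control leaves the method
        }
    }
}
```

Any correct inlining has to preserve all of that, which is one reason it cannot be a plain copy-and-paste of the method body.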
Inlining is NOT textual substitution like C/C++ macros are.
Another thing that differs between inlining and textual substitution is the treatment of variables with the same name.
The machine code that the CPU executes is a very simple language. It doesn't have the notion of a return statement; a subroutine has a single point of entry and a single point of exit. So an IsEven() method like this:
bool IsEven(int n)
{
if (n % 2 == 0)
return true;
return false;
}
needs to be rewritten by the jitter to something that resembles this (not valid C#):
void IsEven(int n)
{
if (n % 2 == 0) {
$retval = true;
goto exit;
}
$retval = false;
exit:
} // $retval becomes the function return value
The $retval variable might look fake here. It is not, it is the EAX register on an x86 core. You'll now see that this code is simple to inline, it can be transplanted directly into the body of Main(). The $retval variable can be equated to the a variable with a simple logical substitution.
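The same single-exit shape can be written in valid C# with goto, which makes the "transplant into the caller" step easier to picture. This is a hand-written sketch of the idea, not what the jitter literally produces; the method name is made up, and the local a plays the role of $retval.

```csharp
using System;

public static class InlineSketch
{
    public static bool IsEvenInlined(int number)
    {
        bool a; // stands in for $retval

        // --- begin hand-inlined body of IsEven(number) ---
        if (number % 2 == 0)
        {
            a = true;
            goto exit;  // was: return true;
        }
        a = false;      // was: return false;
    exit:
        // --- end of hand-inlined body ---
        return a;
    }
}
```

Because the early return became a jump to the end of the inlined body, the a = false line is skipped for even numbers, which is exactly what the naive textual substitution in the question got wrong.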
Question: Is there a significant performance hit when assigning a variable to itself?
Although the code can be easily changed to avoid assigning a variable to itself or to use flags to detect how something should be assigned .. I like the use of a null coalescing operator to make cleaner initialization code. Consider the following code:
public class MyObject
{
public string MyString { get; set; }
}
public class Worker
{
Dictionary<string, string> m_cache = new Dictionary<string, string>();
public void Assign(ref MyObject obj)
{
string tmp = null;
if (some condition)
{
//can't pass in MyObject.MyString to out param, so we need to use tmp
m_cache.TryGetValue("SomeKey", out tmp); //assumption: key is guaranteed to exist if condition is met
}
else
{
obj.MyString = MagicFinder(); //returns the string we are after
}
//finally, assign MyString property
obj.MyString = tmp ?? obj.MyString;
//is there a significant performance hit here that is worth changing
//it to the following?
if (!string.IsNullOrEmpty(tmp))
{
obj.MyString = tmp;
}
}
}
Your question seems to boil down to this section:
//finally, assign MyString property
obj.MyString = tmp ?? obj.MyString;
To:
//is there a significant performance hit here that is worth changing
//it to the following?
if (!string.IsNullOrEmpty(tmp))
{
obj.MyString = tmp;
}
Realize, however, that the two versions are not the same. In the first case (using the null coalescing operation), obj.MyString can get assigned string.Empty. The second case prevents this explicitly, since it's checking against string.IsNullOrEmpty, which will prevent the assignment in the case of a non-null, empty string.
Since the behavior is different, I would say this is not an optimization, but rather a behavioral change. Whether this is appropriate depends on the behavior this method is specified to demonstrate. How should empty strings be handled? That should be the determining factor here.
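The difference is easy to see when the two versions are pulled out into small functions. This is a sketch with made-up names; current stands in for the existing obj.MyString.

```csharp
using System;

public static class CoalesceSketch
{
    // Version 1 from the question: only null falls back to the current value.
    public static string WithCoalesce(string tmp, string current) => tmp ?? current;

    // Version 2 from the question: both null and "" fall back to the current value.
    public static string WithIsNullOrEmpty(string tmp, string current) =>
        !string.IsNullOrEmpty(tmp) ? tmp : current;
}
```

With tmp = "" and current = "old", the first returns "" and the second returns "old"; for null or any non-empty string the two agree.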
That being said, if you were to do an explicit null check, the performance would be near identical - I would suggest going for maintainability and readability here - to me, the first version is simpler, as it's short and clear.
Also, looking at your code, I think it would be far simpler to put the assignment into each branch of the if statement. This would make the code far more maintainable (and more efficient, to the delight of your boss...):
public void Assign(ref MyObject obj)
{
if (some condition)
{
string tmp = null;
//can't pass in MyObject.MyString to out param, so we need to use tmp
m_cache.TryGetValue("SomeKey", out tmp); //assumption: key is guaranteed to exist if condition is met
obj.MyString = tmp;
}
else
{
obj.MyString = MagicFinder(); //returns the string we are after
}
}
Given the assumption in the code (that the key will always exist in cache if your condition is true), you can even simplify this down to:
public void Assign(MyObject obj) // No reason to pass by ref...
{
obj.MyString = someCondition ? m_cache["SomeKey"] : MagicFinder();
}
You shouldn't worry about that; the impact should be minimal. However, think about how many people are actually aware of the null coalescing operator. Personally, I find the latter more readable at a glance; you don't have to think about what it means.
The performance worry of checking if a string is null or empty is of no immediate concern, but you could alter your code to be more concise, I think:
string tmp = null;
if (some condition)
{
m_cache.TryGetValue("SomeKey", out tmp);
}
if (string.IsNullOrEmpty(tmp))
{
tmp = MagicFinder();
}
obj.MyString = tmp;
"Premature optimization is the root of all evil."
In other words, I doubt that any theoretical answer will be representative of real world performance, unless you happen to know how it compiles down to bytecode... How about you create a quick and dirty test framework that'll run it 1000 times with random values (null or valid strings) and measure the performance yourself?
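A quick and dirty harness along those lines might look like the sketch below. Everything here (class, method names, the input array) is made up for illustration. I've used an explicit null check rather than IsNullOrEmpty so that both loops do the same thing, and the numbers from something this crude should be taken with a grain of salt; a dedicated tool such as BenchmarkDotNet would be more trustworthy.

```csharp
using System;
using System.Diagnostics;

public static class AssignBenchmarkSketch
{
    // Times: current = tmp ?? current;
    public static long TimeCoalesce(string[] inputs, int iterations)
    {
        string current = "initial";
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            string tmp = inputs[i % inputs.Length];
            current = tmp ?? current;
        }
        sw.Stop();
        GC.KeepAlive(current); // keep the result observable so the loop is not trivially dead
        return sw.ElapsedTicks;
    }

    // Times: if (tmp != null) current = tmp;
    public static long TimeExplicitCheck(string[] inputs, int iterations)
    {
        string current = "initial";
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            string tmp = inputs[i % inputs.Length];
            if (tmp != null)
                current = tmp;
        }
        sw.Stop();
        GC.KeepAlive(current);
        return sw.ElapsedTicks;
    }
}
```

Call each method a few times with something like new[] { null, "a", null, "b" } and a few million iterations, discard the first run (JIT warm-up), and compare the rest.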
EDIT: To add, I'd be rather surprised if the performance was significantly different enough that it would matter in the grand scheme of things. Go for legibility first (although you could argue in favor of both :))