These are two ways of incrementing a value by one:
if (condition) i++;
and
i += Convert.ToInt32(condition);
So is there any benefit to writing it one way or the other, or are they basically the same?
Adding a Boolean to an integer doesn't make any sense.
Yes, it works, because of the conversion. But it still doesn't make any sense. It's illogical.
Programs should be obvious and clear, not puzzles to be solved.
I get 7527ms and 5888ms on my machine from the benchmark below. The first approach in the benchmark (the boolean conversion), besides being just awful from a code readability point of view, is also slower. That makes sense: that approach ALWAYS has the overhead of 1) performing a conversion from bool to int, and 2) performing an addition. Yes, there may be shortcuts for adding 0, but that is still ANOTHER check that has to be made.
int sum = 0;
var sw = Stopwatch.StartNew();
for (int i = 0; i < Int32.MaxValue; i++) {
    bool condition = i < Int32.MaxValue / 2;
    sum += Convert.ToInt32(condition);
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);

sum = 0;
sw = Stopwatch.StartNew();
for (int i = 0; i < Int32.MaxValue; i++) {
    bool condition = i < Int32.MaxValue / 2;
    if (condition) {
        sum++;
    }
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
There are many, many ways to write code that does the same thing, but it all comes down to readability and maintainability.
You could choose to write it in binary and optimize it in the most efficient way possible, but you won't find many people able to maintain that code. I bet not even you would want to read your own code in binary when there is a bug.
So which way do you want to do it? Considering that the performance difference between the two is not that large, I would say definitely go for the first form (if (condition) i++;) for the sake of the people who might be reading your code later.
I'm thinking the clarity of the code depends on the context.
For almost all ordinary cases,
if (condition) i++;
...is going to be easier to read.
But there may be some situations like this one where the alternative makes it easier to follow. Imagine if this list were very long:
var errorCount = 0;
errorCount += Convert.ToInt32(o.HasAProblem);
errorCount += Convert.ToInt32(o.HasSomeOtherProblem);
errorCount += Convert.ToInt32(p.DoesntWork);
On the other hand, for the above, maybe I'd find a different way of structuring the code entirely, e.g.
var errorFlags = new[] { o.HasAProblem,
                         o.HasSomeOtherProblem,
                         p.DoesntWork };
var errorCount = errorFlags.Count(a => a);
Also, the construct
i += Convert.ToInt32(condition);
...may result in a cleaner pipeline, since there is no branch to predict and therefore no misprediction penalty. The key word is may.
Related
I'm currently coding battleships as a part of a college project. The game works perfectly fine but I'd like to implement a way to check if a ship has been completely sunk. This is the method I'm currently using:
public static bool CheckShipSunk(string[,] board, string ship) {
    for (int i = 0; i < board.GetLength(0); i++) {
        for (int j = 0; j < board.GetLength(1); j++) {
            if (board[i, j] == ship) { return false; }
        }
    }
    return true;
}
The problem with this is that there are 5 ships, and this is very inefficient when checking hundreds of elements 5 times over, not to mention the sub-par quality of college computers. Is there an easier way of checking if a 2D array contains an element?
Use an arithmetic approach to loop through with just one loop:
public static bool CheckShipSunk(string[,] board, string ship) {
    int rows = board.GetLength(0);
    int cols = board.GetLength(1);
    for (int i = 0; i < rows * cols; i++) {
        int row = i / cols;
        int col = i % cols;
        if (board[row, col] == ship)
            return false;
    }
    return true;
}
But I am with Nysand on just caching and storing that information in the cells. The above code, although it might work, is not recommended: it still visits every cell, so it is no more efficient than the nested loops.
this is very inefficient when checking hundreds of elements 5 times over
Have you done any profiling? Computers are fast, even your old college computers. Checking hundreds of elements should take microseconds. As Donald Knuth's famous quote goes:
There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.
So if you feel your program is slow, I would recommend starting with profiling. If you are at university, this might be a very valuable skill to learn.
There are also better algorithms/data structures that could be employed. I would, for example, expect each ship to know which locations it occupies, along with various other information, such as whether it is sunk at all. Selecting appropriate data structures is also a very important skill to learn, but a difficult one. Also, try not to get stuck in analysis paralysis: a terribly inefficient, ugly, working solution is still better than the most beautiful code that does not work.
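A rough sketch of that idea (the Ship class, its members and RegisterHit are invented here for illustration, they are not part of the question's code): each ship tracks the cells it still occupies, so checking whether it is sunk becomes a constant-time property instead of a board scan.
using System.Collections.Generic;

public class Ship
{
    // Cells of the board this ship occupies that have not been hit yet.
    private readonly HashSet<(int row, int col)> remainingCells;

    public Ship(IEnumerable<(int row, int col)> occupiedCells)
    {
        remainingCells = new HashSet<(int row, int col)>(occupiedCells);
    }

    // No board scan needed: the ship is sunk once all of its cells have been hit.
    public bool IsSunk => remainingCells.Count == 0;

    // Call this from the firing logic; returns true if the shot hit this ship.
    // Repeated hits on the same cell are not counted twice.
    public bool RegisterHit(int row, int col) => remainingCells.Remove((row, col));
}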
However, a very easy thing to fix is moving .GetLength out of the loop. This is a very slow call, and doing it only once should make your loop several times faster for almost no effort. You might also consider replacing the strings with some other identifier, like an int.
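For instance, a minimal rewrite of the original method with the lengths hoisted out of the loops (same logic otherwise) might look like this:
public static bool CheckShipSunk(string[,] board, string ship) {
    int rows = board.GetLength(0);   // fetched once instead of on every iteration
    int cols = board.GetLength(1);
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            if (board[i, j] == ship) { return false; }
        }
    }
    return true;
}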
Willy-nilly, you have to scan either the entire array or everything up to the first cell still occupied by the ship.
You can simplify the code by querying the array with LINQ, but that will not increase performance: the time complexity is still O(length * width).
using System.Linq;
...
// No explicit loops, just a query (.NET will loop for you).
// Cast<string>() flattens the 2D array; the ship is sunk when none of its cells remain.
public static bool CheckShipSunk(string[,] board, string ship) => board is null
    ? throw new ArgumentNullException(nameof(board))
    : !board.Cast<string>().Any(item => item == ship);
If you are looking for performance (say, you have a really huge array, many ships to test, etc.), I suggest changing the data structure: use a Dictionary<string, (int row, int col)> instead of the string[,] array:
Dictionary<string, (int row, int col)> sunkShips =
    new Dictionary<string, (int row, int col)>(StringComparer.OrdinalIgnoreCase) {
        { "Yamato", (15, 46) },
        { "Bismark", (11, 98) },
        { "Prince Of Wales", (23, 55) },
    };
and then the check is as easy as
public static bool CheckShipSunk(IDictionary<string, (int row, int col)> sunkShips,
                                 string ship) =>
    sunkShips?.Keys?.Contains(ship) ?? false;
Note that the time complexity is O(1), which means it doesn't depend on the board's length and width.
I'm tweaking some code in a RationalNumber implementation. In particular, inside the equality logic, I'm considering the following:
public bool Equals(RationalNumber other)
{
    if (RationalNumber.IsInfinity(this) ||
        RationalNumber.IsInfinity(other) ||
        RationalNumber.IsNaN(this) ||
        RationalNumber.IsNaN(other))
    {
        return false;
    }

    try
    {
        checked
        {
            return this.numerator * other.Denominator == this.Denominator * other.numerator;
        }
    }
    catch (OverflowException)
    {
        var thisReduced = RationalNumber.GetReducedForm(this);
        var otherReduced = RationalNumber.GetReducedForm(other);
        return (thisReduced.numerator == otherReduced.numerator) && (thisReduced.Denominator == otherReduced.Denominator);
    }
}
As you can see, I'm using exceptions as a flow-control mechanism. The reasoning behind this is that I do not want to incur the penalty of evaluating the greatest common divisor of both fractions on every equality check, so I only do it in the least probable case: when one or both cross products overflow.
Is this an acceptable practice? I've always read that exceptions should never be used as a flow-control mechanism in your code, but I don't really see another way to achieve what I want.
Any alternative approaches are welcome.
The reasoning behind this is that I do not want to incur the penalty of evaluating the greatest common divisor of both fractions on every equality check.
This is sound reasoning. The total cost of this code is
{probability of fast-path} * {fast-path cost}
+ ((1.0 - {probability of fast-path}) * {slow-path cost})
Depending on the three constants involved, this will be a good or a bad choice. You need a good understanding of what data will be processed in practice.
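For example, with purely illustrative numbers: if the fast path costs about 10 ns, the slow path (a thrown exception plus the GCD reduction) about 100,000 ns, and 99.9% of comparisons stay on the fast path, the expected cost is roughly 0.999 * 10 + 0.001 * 100,000 ≈ 110 ns per call. That is still far cheaper than always reducing, but it degrades quickly as the overflow rate grows.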
Note that exceptions are very slow. I once benchmarked them at about 10,000 per second per CPU core, and I'm not sure they would scale to multiple cores due to the internal CLR locks involved.
Maybe you can add runtime profiling: track the rate of exceptions, and if it gets too high, switch the optimization off.
You probably should document why you did this.
It's also not an architectural problem, because if you change your mind later you can easily switch to a different algorithm.
As an alternative, you could first compute and compare unchecked. If the result is "not equal", it is guaranteed that the exact result would be "not equal" too, even if overflow occurred. So that could be an exception-free fast path if many numbers turn out to be not equal.
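A minimal sketch of that alternative, reusing the member names from the question (numerator, Denominator, GetReducedForm) and keeping the infinity/NaN guard; it trades a little speed on the "equal" case for never throwing:
public bool Equals(RationalNumber other)
{
    if (RationalNumber.IsInfinity(this) || RationalNumber.IsInfinity(other) ||
        RationalNumber.IsNaN(this) || RationalNumber.IsNaN(other))
    {
        return false;
    }

    unchecked
    {
        // Wrap-around products: if they differ modulo 2^64, the exact products differ too,
        // so "not equal" is returned without any exception or GCD work.
        if (this.numerator * other.Denominator != this.Denominator * other.numerator)
            return false;
    }

    // The wrapped products matched, which could be a coincidence caused by overflow,
    // so fall back to the exact comparison of the reduced forms.
    var thisReduced = RationalNumber.GetReducedForm(this);
    var otherReduced = RationalNumber.GetReducedForm(other);
    return thisReduced.numerator == otherReduced.numerator
        && thisReduced.Denominator == otherReduced.Denominator;
}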
Usually catching exceptions has high overhead, and you should only catch exceptions you can do something about.
In your case you can do something about the exception, so using it for control flow is not a problem in my opinion. However, I suggest you also implement the logic that checks the relevant conditions to prevent the exception, then benchmark both options and compare the performance. Catching exceptions usually has high overhead, but if checking in order to prevent the exception takes more time, then handling the exception is the better way.
Update due to OP's comment ("It's a new implementation, we are not using the .NET framework's Rational. The type of Numerator and Denominator is long"):
You can use bigger types, like decimal or BigInteger, to prevent the overflow exception:
decimal thisNumerator = this.numerator;
decimal thisDenominator = this.Denominator;
decimal otherNumerator = other.numerator;
decimal otherDenominator = other.Denominator;
checked
{
    return thisNumerator * otherDenominator == thisDenominator * otherNumerator;
}
Update due to comments:
A simple example to show the overhead of exceptions:
const int Iterations = 100000;

var sw = new Stopwatch();
var sum1 = 0;
sw.Start();
for (int i = 0; i < Iterations; i++)
{
    try
    {
        var s = int.Parse("s" + i);
        sum1 += s;
    }
    catch (Exception)
    {
    }
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
Console.WriteLine(sum1);

var sw2 = new Stopwatch();
var sum2 = 0;
sw2.Start();
for (int i = 0; i < Iterations; i++)
{
    try
    {
        int s;
        if (int.TryParse("s" + i, out s))
            sum2 += s;
    }
    catch (Exception)
    {
    }
}
sw2.Stop();
Console.WriteLine(sw2.ElapsedMilliseconds);
Console.WriteLine(sum2);
The result: handling exceptions is at least 170 times slower.
5123
0
30
0
This approach is described on MSDN:
https://msdn.microsoft.com/en-Us/library/74b4xzyw.aspx
But catching an exception has high overhead, possibly because the process switches from user mode to kernel mode at that point.
In C#, suppose I have a foreach loop and it is possible that the iterator will be empty. Following the loop, more actions need to be taken only if the iterator was not empty, so I declare bool res = false; before the loop. Is it faster to just set res = true; in each loop iteration, or to test whether it has been set yet, as in if (!res) res = true;? I suppose the question could more succinctly be stated as: is it faster to set a bool's value or to test its value?
In addition, even if one is slightly faster than the other, is it feasible to have so many iterations in the loop that the impact on performance is not negligible?
To kill a few minutes:
static void Main(string[] args)
{
    bool test = false;
    Stopwatch sw = new Stopwatch();
    sw.Start();
    for (long i = 0; i < 100000000; i++)
    {
        if (!test)
            test = true;
    }
    sw.Stop();
    Console.WriteLine(sw.ElapsedMilliseconds + ". Hi, I'm just using test somehow:" + test);

    sw.Reset();
    bool test2 = false;
    sw.Start();
    for (long i = 0; i < 100000000; i++)
    {
        test2 = true;
    }
    sw.Stop();
    Console.WriteLine(sw.ElapsedMilliseconds + ". Hi, I'm just using test2 somehow:" + test2);
    Console.ReadKey();
}
Output:
448
379
So, unless I missed something, just setting the value is faster than checking and then setting it. Is that what you wanted to test?
EDIT:
Fixed an error pointed out in the comments. As a side note, I did indeed run this test a few times, and even when the milliseconds changed, the second case was always slightly faster.
if (!res) res = true is redundant.
The compiler should be smart enough to know that res will always end up being true, and to remove your if statement and/or remove the assignment altogether if you compile with Release / Optimize Code.
As to your question itself: it should be faster to set a primitive value than to compare and then set it. I highly doubt you would be able to measure the time difference accurately at all on a primitive; just thinking about this has already consumed more time than the process will over any exaggerated number of iterations.
In the process of writing an "Off By One" mutation tester for my favourite mutation testing framework (NinjaTurtles), I wrote the following code to provide an opportunity to check the correctness of my implementation:
public int SumTo(int max)
{
    int sum = 0;
    for (var i = 1; i <= max; i++)
    {
        sum += i;
    }
    return sum;
}
Now this seems simple enough, and it didn't strike me that there would be a problem trying to mutate all the literal integer constants in the IL. After all, there are only three (the 0, the 1, and the ++).
WRONG!
It became very obvious on the first run that it was never going to work in this particular instance. Why? Because changing the code to
public int SumTo(int max)
{
    int sum = 0;
    for (var i = 0; i <= max; i++)
    {
        sum += i;
    }
    return sum;
}
only adds 0 (zero) to the sum, which obviously has no effect. It would be a different story if this were a product rather than a sum, but in this instance it was not.
Now there's a fairly easy formula for the sum of the first max integers,
sum = max * (max + 1) / 2;
which would fail the mutations easily, since adding or subtracting 1 from either of the constants there changes the result (given that max >= 0).
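For example, for max = 4 the correct value is 4 * 5 / 2 = 10, while the off-by-one mutants give 4 * 4 / 2 = 8, 4 * 6 / 2 = 12, 4 * 5 / 1 = 20 and 4 * 5 / 3 = 6 (integer division), so a test asserting the result for max = 4 kills all of them.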
So, problem solved for this particular case. Although it did not do what I wanted for the test of the mutation, which was to check what would happen when I lost the ++ (effectively an infinite loop). But that's another problem.
So, my question: are there any trivial or non-trivial cases where a loop starting from 0 or 1 may result in a "mutation off by one" test failure that cannot be refactored away (in either the code under test or the test) in a similar way? (Examples, please.)
Note: Mutation tests fail when the test suite passes after a mutation has been applied.
Update: an example of something less trivial, but where the test could still be refactored so that the mutation is caught, would be the following:
public int SumArray(int[] array)
{
    int sum = 0;
    for (var i = 0; i < array.Length; i++)
    {
        sum += array[i];
    }
    return sum;
}
Mutation testing against this code would fail when changing var i = 0 to var i = 1 if the test input you gave it was new[] {0,1,2,3,4,5,6,7,8,9}, because skipping the leading 0 does not change the sum. However, change the test input to new[] {9,8,7,6,5,4,3,2,1,0} and the mutant is killed, so the mutation test passes. So a successful refactor of the test proves the testing.
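As a sketch of such a test (assuming NUnit-style assertions; ArraySummer is a made-up name for whatever class holds SumArray):
[Test]
public void SumArray_CountsTheFirstElement()
{
    // 9 + 8 + ... + 1 + 0 = 45; the i = 1 mutant skips the leading 9 and returns 36.
    var sut = new ArraySummer();  // hypothetical class containing SumArray
    Assert.AreEqual(45, sut.SumArray(new[] { 9, 8, 7, 6, 5, 4, 3, 2, 1, 0 }));
}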
I think with this particular method there are two choices: you either admit that it's not suitable for mutation testing because of this mathematical anomaly, or you try to write it in a way that makes it safe for mutation testing, either by refactoring to the form you give or in some other way (possibly recursively?).
Your question really boils down to this: is there a real life situation where we care about whether the element 0 is included in or excluded from the operation of a loop, and for which we cannot write a test around that specific aspect? My instinct is to say no.
Your trivial example may be an example of a lack of what I referred to as test-drivenness in my blog post about NinjaTurtles, meaning that you have not refactored this method as far as you should have.
One natural case of a "mutation test failure" is an algorithm for matrix transposition. To make it more suitable for a single for loop, add some constraints to the task: let the matrix be non-square and require the transposition to be in place. These constraints make a one-dimensional array the most suitable place to store the matrix, and a for loop (usually starting from index 1) may be used to process it. If you start it from index 0, nothing changes, because the top-left element of the matrix always transposes to itself.
For an example of such code, see the answer to another question (not in C#, sorry).
Here the "mutation off by one" test fails, and refactoring the test does not change that. I don't know whether the code itself may be refactored to avoid this. In theory it may be possible, but it would be too difficult.
The code snippet I referenced earlier is not a perfect example. It could still be refactored if the for loop were replaced by two nested loops (as if for rows and columns) and those rows and columns were then recalculated back to a one-dimensional index. Still, it gives an idea of how to construct an algorithm that cannot be refactored (though not a very meaningful one).
Iterate through an array of positive integers in order of increasing index, for each index compute its pair as i + i % a[i], and if that pair is not outside the bounds, swap the two elements:
for (var i = 1; i < a.Length; i++)
{
    var j = i + i % a[i];
    if (j < a.Length)
        (a[i], a[j]) = (a[j], a[i]);  // swap the two elements
}
Here again a[0] is "unmovable", refactoring the test does not change this, and refactoring the code itself is practically impossible.
One more "meaningful" example. Let's implement an implicit Binary Heap. It is usually placed to some array, starting from index '1' (this simplifies many Binary Heap computations, compared to starting from index '0'). Now implement a copy method for this heap. "Off-by-one" problem in this copy method is undetectable because index zero is unused and C# zero-initializes all arrays. This is similar to OP's array summation, but cannot be refactored.
Strictly speaking, you can refactor the whole class and start everything from '0'. But changing only 'copy' method or the test does not prevent "mutation off by one" test failure. Binary Heap class may be treated just as a motivation to copy an array with unused first element.
int[] dst = new int[src.Length];
for (var i = 1; i < src.Length; i++)
{
    dst[i] = src[i];
}
Yes, there are many, assuming I have understood your question.
One similar to your case is:
public int MultiplyTo(int max)
{
    int product = 1;
    for (var i = 1; i <= max; i++)
    {
        product *= i;
    }
    return product;
}
Here, if it starts from 0, the result will be 0, but if it starts from 1 the result should be correct. (Although it won't tell the difference between starting at 1 and starting at 2!)
Not quite sure what you are looking for exactly, but it seems to me that if you change/mutate the initial value of sum from 0 to 1, you should fail the test:
public int SumTo(int max)
{
    int sum = 1; // Now we are off-by-one from the beginning!
    for (var i = 0; i <= max; i++)
    {
        sum += i;
    }
    return sum;
}
Update based on comments:
The mutation will only survive when processing index 0 (or omitting it) does not violate the loop invariant. Most such special cases can be refactored out of the loop, but consider a summation of 1/x:
for (var i = 1; i <= max; i++) {
    sum += 1 / i;
}
This works fine, but if you mutate the initial boundary from 1 to 0, the test will fail because 1/0 is an invalid operation (it throws a DivideByZeroException).
I'm making a small game that has a lot of loops, all of which use a certain variable adjacentSquares. After every loop, however, this should be set back to 0. What would be faster: creating this variable again every time, or just setting it to 0? Is there maybe a certain 'exotic' approach that will perform even better?
The associated (unfinished) code:
void Update()
{
    int adjacentSquares = 0;
    for (int x = 0; x <= gridX; x++)
    {
        for (int y = 0; y <= gridY; y++)
        {
            if (grid[x - 1, y - 1] == true)
                adjacentSquares += 1;
            //and some more logic
        }
    }
}
Why not experiment and measure the time elapsed using the System.Diagnostics.Stopwatch class? http://msdn.microsoft.com/en-us/library/system.diagnostics.stopwatch.aspx
Set up a Stopwatch object before that loop and then measure elapsed time after it. Then, report back with your findings :D
The real answer here is: try it out and see!
But I would not expect there to be a difference in speed. If anything, your stack will use 4 bytes more memory (per variable), but even that is not guaranteed. There's a good chance that (if there is any performance benefit here) either the C# compiler or the JIT compiler will recognize that the first variable is no longer used and simply reuse that same memory for the subsequent variables. But I'll echo what I said before: run some tests; that's the only true answer to your question.
If you really want to improve performance here, you could look at a parallel solution, depending on whether each individual calculation relies on all the previous ones.
You can probably even do this with LINQ depending on the "some more logic" you are doing.
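For instance (assuming grid is a bool[,] and ignoring the "more logic" part), the counting alone collapses to a one-liner:
// Cast<bool>() (from System.Linq) flattens the 2D array; Count tallies the true cells.
int adjacentSquares = grid.Cast<bool>().Count(cell => cell);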
Just to improve the performance a little bit more:
void Update()
{
    int adjacentSquares = 0;
    for (int x = -1; x < gridX; x++)
    {
        for (int y = -1; y < gridY; y++)
        {
            if (grid[x, y])
                adjacentSquares++;
            //and some more logic
        }
    }
}
I don't know exactly why you need to start from -1 (0 - 1), but if you do, then put that in the for loop bounds instead of performing the same subtraction on every iteration.