Fastest way to compare two integer arrays for value comparison - c#

I know that array comparison questions have been asked multiple times but I am curious to know if there is a better method than the ones I have tried so far.
The requirement is to compare two integer arrays of same length to check if one of them has values which are always greater than or equal to the ones in the second one when mapped one to one. So, if you were to sort the arrays (which I did for comparing them), the values at the same indices in the first array should always be greater than or equal to the ones at the same indices in the second one.
I tried the check in two ways so far:
Sort the arrays. Run a for loop and return a false value if any of the values in the second array turns out to be greater than the corresponding value in the first array:
Array.Sort(Array1);
Array.Sort(Array2);
//
for (int i = 0; i < Array1.length; i++)
{
if (Array2[i] > Array1[i])
{
return false;
}
}
return true;
In the alternate method, I tried to be cute and used the chaining of FindIndex and IndexOf methods to get the solution in one go. I think it's actually less efficient since, as far as I can tell and please correct me if I am wrong, it is traversing the array twice instead of once as is the case in the previous method:
Array.Sort(Array1);
Array.Sort(Array2);
//
return Array.FindIndex(Array2, i => i > Array1[Array.IndexOf(Array2, i)]) >= 0
? false
: true;
I think the first method works better but is there a faster/better way to achieve the same result? The number of elements can range from 10 to 10000.
Thanks.

Related

SequenceEqual always returning false

I making a lottery simulation and i have 2 different arrays with 6 numbers to hold firstly the numbers the user wishes to play and secondly the numbers that get generated each run.
The user enters their numbers into a textbox and its saved as a string and placed into respective spots in the aray, the randomly generated nums are also saved as a string into the string array.
After this i have a SequenceEqual for comparison
bool equal = lotteryNums.SequenceEqual(playerNums);
This always returns false, i have set all the generated array elements manually to 1-6 and then the players nums accordingly through the textboxes yet it will always return a false.
The generated array is currently filled like this for testing
lotteryNums[0] = "1";
lotteryNums[1] = "2";
lotteryNums[2] = "3";
lotteryNums[3] = "4";
lotteryNums[4] = "5";
lotteryNums[5] = "6";
The player array is filled like this using the next array position for the next number
string inputNum = inputBox_txt.Text;
playerNums[0] = inputNum;
Why is this always returning false?
Since people are asking the arrays are both in the exact same order and do not appear to contain anything more or anything less than the numbers in the arrays
SequenceEqual returns true if the two source sequences are of equal length and their corresponding elements are equal according to the default equality comparer for their type; otherwise, false.
Since the two arrays you provide are not identical then you are getting false.
In addition to what Athanasios Emmanouilidis said: it seems that the collections need to have the same order. So you should order them:
bool equal = playerNums.OrderBy(n => n).SequenceEqual(lotteryNums.OrderBy(n => n));
Another thing to consider: SequenceEquals requires that the numbers be in the exact same order in both arrays. Even if the arrays contain the same numbers, but they are in a different order, you will get false. You can solve this by sorting each list prior to comparing (just make sure you sort both in the same way).
If that doesn't work, verify that the strings from the textbox are actually just the numbers, and do not include any white space or special characters.
SeqqnceEqual will also take the order of the elemnts into account, which you probably don´t want in a lottery. What you want instead is to check if all the expected values are in the inout as well:
var sixCorrects = lotteryNums.All(x => playerNums.Contains(x));
All will stop iterating as soon as an element of lotteryNums was not found within playerNums.

Need help understanding the syntax in an array

I am having trouble understanding what the for loop is doing.
To me, I see it as:
int i = 0; //Declaring i to become 0. i is the value in myArray?
i < myArray.Length; //When i is less than any value in myArray keep looping?
i++; //Every time this loop goes through increase i by 1?
//Making an array called myArray that contains 20,5,7,2,55
int[] myArray = { 20, 5, 7, 2, 55 };
//Using the built in feature, Array.Sort(); to sort out myArray
Array.Sort(myArray);
for (int i = 0; i < myArray.Length; i++)
{
Console.WriteLine(myArray[i]);
}
I'm going to make some assumptions about your knowledge of programming, so forgive me if this explanation covers topics you're already familiar with, but they are all important for understanding what a for loop does, what it's use is and what the semantics are going to be when someone comes behind you and reads your code. Your question demonstrates that you're super close to understanding it, so hopefully it'll hit you like a ton of bricks once you have a good explanation.
Consider an array of strings of length 5. You would initialize it in C# like so:
string[] arr = new string[5];
What this means is that you have an array that has allocated 5 slots for strings. The names of these slots are the indexes of the array. Unfortunately for those who are new to programming, like yourself, indexes start at 0 (this is called zero-indexing) instead of 1. What that means is that the first slot in our new string[] has the name or index of 0, the second of 1, the third of 3 and so on. That means that they length of the array will always be a number equal to the index of the final slot plus one; to put it another way, because arrays are 0 indexed and the first (1st) slot's index is 0, we know what the index of any given slot is n - 1 where n is what folks who are not programmers (or budding programmers!) would typically consider to be the position of that slot in the array as a whole.
We can use the index to pick out the value from an array in the slot that corresponds to the index. Using your example:
int[] myArray = { 20, 5, 7, 2, 55 };
bool first = myArray[0] == 20: //=> true
bool second = myArray[1] == 5; //=> true
bool third = myArray[2] == 7; //=> true
// and so on...
So you see that the number we are passing into the indexer (MSDN) (the square brackets []) corresponds to the location in the array that we are trying to access.
for loops in C syntax languages (C# being one of them along with C, C++, Java, JavaScript, and several others) generally follow the same convention for the "parameters":
for (index_initializer; condition; index_incrementer)
To understand the intended use of these fields it's important to understand what indexes are. Indexes can be thought of as the names or locations for each of the slots in the array (or list or anything that is list-like).
So, to explain each of the parts of the for loop, lets go through them one by one:
Index Initializer
Because we're going to use the index to access the slots in the array, we need to initialize it to a starting value for our for loop. Everything before the first semicolon in the for loop statement is going to run exactly once before anything else in the for loop is run. We call the variable initialized here the index as it keeps track of the current index we're on in the scope of the for loop's life. It is typical (and therefore good practice) to name this variable i for index with nested loops using the subsequent letters of the Latin alphabet. Like I said, this initializing statement happens exactly once so we assign 0 to i to represent that we want to start looping on the first element of the array.
Condition
The next thing that happens when you declare a for loop is that the condition is checked. This check will be the first thing that is run each time the loop runs and the loop will immediately stop if the check returns false. This condition can be anything as long as it results in a bool. If you have a particularly complicated for loop, you might delegate the condition to a method call:
for (int i = 0; ShouldContinueLooping(i); i++)
In the case of your example, we're checking against the length of the array. What we are saying here from an idiomatic standpoint (and what most folks will expect when they see that as the condition) is that you're going to do something with each of the elements of the array. We only want to continue the loop so long as our i is within the "bounds" of the array, which is always defined as 0 through length - 1. Remember how the last index of an array is equal to its length minus 1? That's important here because the first time this condition is going to be false (that is, i will not be less than the length) is when it is equal to the length of the array and therefore 1 greater than the final slot's index. We need to stop looping because the next part of the for statement increases i by one and would cause us to try to access an index outside the bounds of our array.
Index incrementer
The final part of the for loop is executed once as the last thing that happens each time the loop runs. Your comment for this part is spot on.
To recap the order in which things happen:
Index initializer
Conditional check ("break out" or stop lopping if the check returns false)
Body of loop
Index incrementer
Repeat from step 2
To make this clearer, here's your example with a small addition to make things a little more explicit:
// Making an array called myArray that contains 20,5,7,2,55
int[] myArray = { 20, 5, 7, 2, 55 };
// Using the built in feature, Array.Sort(); to sort out myArray
Array.Sort(myArray);
// Array is now [2, 5, 7, 20, 55]
for (int i = 0; i < myArray.Length; i++)
{
int currentNumber = myArray[i];
Console.WriteLine($"Index {i}; Current number {currentNumber}");
}
The output of running this will be:
Index 0; Current number 2
Index 1; Current number 5
Index 2; Current number 7
Index 3; Current number 20
Index 4; Current number 55
I am having trouble understanding what the for loop is doing.
Then let's take a big step back.
When you see
for (int i = 0; i < myArray.Length; i++)
{
Console.WriteLine(myArray[i]);
}
what you should mentally think is:
int i = 0;
while (i < myArray.Length)
{
Console.WriteLine(myArray[i]);
i++;
}
Now we have rewritten the for in terms of while, which is simpler.
Of course, this requires that you understand "while". We can understand while by again, breaking it down into something simpler. When you see while, think:
int i = 0;
START:
if (i < myArray.Length)
goto BODY;
else
goto END;
BODY:
Console.WriteLine(myArray[i]);
i++;
goto START;
END:
// the rest of your program here.
Now we have broken down your loop into its fundamental parts and the control flow is laid bare to your understanding. Walk through it.
We start with i equal to 0. Suppose the length of the array is 3.
Is 0 less than 3? Yes. So we go to BODY next. We write the 0th element of the array and increment i to 1. Now we go back to START.
Is 1 less than 3? Yes. So we go to BODY next. We write the 1th element of the array and increment i to 2. Now we go back to START.
Is 2 less than 3? Yes. So we go to BODY next. We write the 2th element of the array and increment i to 3. Now we go back to START.
Is 3 less than 3? No. So we go to END, and the rest of your program executes.
Now, you probably have noticed that the "goto" form is incredibly ugly and hard to read and reason about. That's why we invented while and for loops, so that you don't have to write awful code that uses gotos. But you can always reason about simple control flow by going back to the goto form mentally.
i < myArray.Length;
This is not testing against the values inside myArray but against the length (how many items the array contains). Therefore it means: When i is less than the length of the array.
So the loop will keep going, adding 1 to i (as you correctly said) each time it loops, when i is equal to the length of the array, meaning it has gone through all the values, it will exit the loop.
As Nicolás Straub pointed out, i is the index of the array, meaning the location of an item in an array, you have initialised it with the value of 0, this is correct because the first value in an array would have an index of 0.
To directly answer your question about for loops:
A for loop is executing lines of code iteratively (multiple times), the amount depends on its control statement:
for (int i = 0; i < myArray.Length; i++)
For loops are generally pre-condition (the condition to loop is before the code) and have loop counters, being i (i is actually a counter but can be seen as the index because you are going through every element, if you wanted to skip some then i would only be a counter). For is great for when you know how many times you want to loop before you start looping.
You are correct in your thinking except as what the others have stated. Think of an array as a sequence of data. You can even use the Reverse() method to apply that to your array. I would research more about arrays so you will understand different things you can do with an array and most importantly if you need to read or write them on the console, in a listbox, or a gridview from the text or csv file.
I suggest you add:
Console.ReadLine();
When you do this the application then will read like this:
2
5
7
20
55

How to get number of elements in an array, Extension Method Count() and Length gives the size of Array.

Hi I was trying to find the number of elements in an array
byte[] salt = new byte[32];
now I only have mentioned size 32 so the Length Property of Array and Enumerable's Count Method will give me 32.
Even if I will iterate on this Array salt using for or foreach or any other looping construct it will iterate 32 times and on each index the value is 0 (i.e default value of byte)
Now suppose I do:
for (int i = 0; i < 5 ; i++)
{
salt[i] = 4-i;
}
And I want to know how many elements are inserted sequentially in Array starting from index 0, Here you may say are you fool you iterating it 5 times and you know the number is 5 , but I am having heavy looping and other logic (appending prepending to another arrays of byte) in it. *My question Is there any other inbuilt function that could give me this number 5 ? * Even if I iterate and check for first default value and break the loop there and get the count but there might be the chance last value inserted is 0 like above salt[4] is 0 so that iterating will give me the count 4 which is incorrect . If I am not wrong I think when we declare Array with size like 32 above 32 consecutive memory bytes are reserved and you are free to insert at any index from 0-31 and its your responsibility to keep the track where and how and how many elements are assigned to Array .I hope you got my point And thanks in advance for help.
An array is an array, and in .NET is initialized when it is allocated. Once it's initialized, the question of whether a given value is uninitialized or simply 0 isn't something that's possible to check. A 0 is a 0.
However, you can bypass that in several ways. You can use a List<int>, like #SLaks suggested, to have a dynamically allocated list that's only initialized with the elements you want.
You can also use, instead of an array of int, and array of int?, so a null value isn't the same as a 0.
Short answer is you can't, the array contains 32 integers, .net framework doesn't care if some of them are 0, so you can create your own function that counts how many integers from an array are different than 0, or keep a "count" when you assign values for array elements or something like that.
Or you can use another container, example a list and dynamically add or remove integers from it.
Ok, when you define an array as int[] myArray = int[32]; you are saying, I HAVE 32 ints. Not, create me space for 32 ints that I will fill in later. That's why count is giving you 32.
If you want something which you can genuinly add to and resize, you need to use a List (or one of it's relatives.
If you want to have a "cap" for a list, I found this :Maximum capacity collection in c#

make sure array is sequential in C#

I've got an array of integers we're getting from a third party provider. These are meant to be sequential but for some reason they miss a number (something throws an exception, its eaten and the loop continues missing that index). This causes our system some grief and I'm trying to ensure that the array we're getting is indeed sequential.
The numbers start from varying offsets (sometimes 1000, sometimes 5820, others 0) but whatever the start, its meant to go from there.
What's the fastest method to verify the array is sequential? Even though its a required step it seems now, I also have to make sure it doesn't take too long to verify. I am currently starting at the first index, picking up the number and adding one and making sure the next index contains that etc.
EDIT:
The reason why the system fails is because of the way people use the system it may not always be returning the tokens the way it was picked initially - long story. The data can't be corrected until it gets to our layer unfortunately.
If you're sure that the array is sorted and has no duplicates, you can just check:
array[array.Length - 1] == array[0] + array.Length - 1
I think it's worth addressing the bigger issue here: what are you going to do if the data doesn't meet your requriements (sequential, no gaps)?
If you're still going to process the data, then you should probably invest your time in making your system more resilient to gaps or missing entries in the data.
**If you need to process the data and it must be clean, you should work with the vendor to make sure they send you well-formed data.
If you're going to skip processing and report an error, then asserting the precondition of no gaps may be the way to go. In C# there's a number of different things you could do:
If the data is sorted and has no dups, just check if LastValue == FirstValue + ArraySize - 1.
If the data is not sorted but dup free, just sort it and do the above.
If the data is not sorted, has dups and you actually want to detect the gaps, I would use LINQ.
List<int> gaps = Enumerable.Range(array.Min(), array.Length).Except(array).ToList();
or better yet (since the high-end value may be out of range):
int minVal = array.Min();
int maxVal = array.Max();
List<int> gaps = Enumerable.Range(minVal, maxVal-minVal+1).Except(array).ToList();
By the way, the whole concept of being passed a dense, gapless, array of integers is a bit odd for an interface between two parties, unless there's some additional data that associated with them. If there's no other data, why not just send a range {min,max} instead?
for (int i = a.Length - 2; 0 <= i; --i)
{
if (a[i] >= a[i+1]) return false; // not in sequence
}
return true; // in sequence
Gabe's way is definitely the fastest if the array is sorted. If the array is not sorted, then it would probably be best to sort the array (with merge/shell sort (or something of similar speed)) and then use Gabe's way.

Dictionary of Primes

I was trying to create this helper function in C# that returns the first n prime numbers. I decided to store the numbers in a dictionary in the <int,bool> format. The key is the number in question and the bool represents whether the int is a prime or not. There are a ton of resources out there calculating/generating the prime numbers(SO included), so I thought of joining the masses by crafting another trivial prime number generator.
My logic goes as follows:
public static Dictionary<int,bool> GetAllPrimes(int number)
{
Dictionary<int, bool> numberArray = new Dictionary<int, bool>();
int current = 2;
while (current <= number)
{
//If current has not been marked as prime in previous iterations,mark it as prime
if (!numberArray.ContainsKey(current))
numberArray.Add(current, true);
int i = 2;
while (current * i <= number)
{
if (!numberArray.ContainsKey(current * i))
numberArray.Add(current * i, false);
else if (numberArray[current * i])//current*i cannot be a prime
numberArray[current * i] = false;
i++;
}
current++;
}
return numberArray;
}
It will be great if the wise provide me with suggestions,optimizations, with possible refactorings. I was also wondering if the inclusion of the Dictionary helps with the run-time of this snippet.
Storing integers explicitly needs at least 32 bits per prime number, with some overhead for the container structure.
At around 231, the maximal value a signed 32 bit integer can take, about every 21.5th number is prime. Smaller primes are more dense, about 1 in ln(n) numbers is prime around n.
This means it is more memory efficient to use an array of bits than to store numbers explicitly. It will also be much faster to look up if a number is prime, and reasonably fast to iterate through the primes.
It seems this is called a BitArray in C# (in Java it is BitSet).
The first thing that bothers is that, why are you storing the number itself ?
Can't you just use the index itself which will represent the number?
PS: I'm not a c# developer so maybe it is not possible with a dictionary, but it can be done with the appropriate structure.
First, you only have to loop untill the square root of the number. Make all numbers false by default and have a simple flag that you set true at the beginning of every iteration.
Further, don't store it in a dictionary. Make it a bool array and have the index be the number you're looking for. Only 0 won't make any sense, but that doesn't matter. You don't have to init either; bools are false by default. Just declare an bool[] of number length.
Then, I would init like this:
primes[2] = true;
for(int i = 3; i < sqrtNumber; i += 2) {
}
So you skip all the even numbers automatically.
By the way, never declare a variable (i) in a loop, it makes it slower.
So that's about it. For more info see this page.
I'm pretty sure the Dictionary actually hurts performance, since it doesn't enable you to perform the trial divisions in an optimal order. Traditionally, you would store the known primes so that they could be iterated from smallest to largest, since smaller primes are factors of more composite numbers than larger primes. Additionally, you never need to try division with any prime larger than the square root of the candidate prime.
Many other optimizations are possible (as you yourself point out, this problem has been studied to death) but those are the ones that I can see off the top of my head.
The dictionary really doesn't make sense here -- just store all primes up to a given number in a list. Then follow these steps:
Is given number in the list?
Yes - it's prime. Done.
Not in list
Is given number larger than the list maximum?
No - it's not prime. Done.
Bigger than maximum; need to fill list up to maximum.
Run a sieve up to given number.
Repeat.
1) From the perspective of the client to this function, wouldn't it be better if the return type was bool[] (from 0 to number perhaps)? Internally, you have three states (KnownPrime, KnownComposite, Unknown), which could be represented by an enumeration. Storing an an array of this enumeration internally, prepopulated with Unknown, will be faster than a dictionary.
2) If you stick with the dictionary, the part of the sieve that marks multiples of the current number as composite could be replaced with a numberArray.TryGetValue() pattern rather than multiple checks for ContainsKey and subsequent retrieval of the value by key.
The trouble with returning an object that holds the primes is that unless you're careful to make it immutable, client code is free to mess up the values, in turn meaning you're not able to cache the primes you've already calculated.
How about having a method such as:
bool IsPrime(int primeTest);
in your helper class that can hide the primes it's already calculated, meaning you don't have to re-calculate them every time.

Categories