Find intersection group of sorted integer arrays - c#

Let's we have some integer short sorted arrays and we need to find intersection equal or more then predefined constant.
Here is code and it demonstrates what i want to do better then i can explain it in words.
The problem is SPEED. My code is working very slow. It takes about 15 sec on 2000 elements array(on my slow machine). Ofcourse i can implement my own intersection method and parallize code but it give a very limited improvement. Execution time growing as N^2 or something and already for 500k arrays it takes a very very long time. So how can i rewrite algorithm for better perfomance? I am not limited c# language maybe CPU or GPU has good special instructions for such job.
Example:
Input:
1,3,7,8
2,3,8,10
3,10,11,12,13,14
minSupport = 1
Output:
1 and 2: 2, 8
1 and 3: 3
2 and 3: 3, 10
var minSupport = 2;
var random = new Random(DateTime.Now.Millisecond);
// Numbers is each array are unique
var sortedArrays = Enumerable.Range(0,2000)
.Select(x => Enumerable.Range(0,30).Select(t => random.Next(1000)).Distinct()
.ToList()).ToList();
var result = new List<int[]>();
var resultIntersection = new List<List<int>>();
foreach (var array in sortedArrays)
{
array.Sort();
}
var sw = Stopwatch.StartNew();
//****MAIN PART*****//
for (int i = 0; i < sortedArrays.Count-1; i++)
{
for (int j = i+1; j < sortedArrays.Count; j++)
{
var intersect = sortedArrays[i].Intersect(sortedArrays[j]).ToList();
if(intersect.Count()>=minSupport)
{
result.Add( new []{i,j});
resultIntersection.Add(intersect);
}
}
}
//*****************//
sw.Stop();
Console.WriteLine(sw.Elapsed);
EDIT:
Now it takes about 9 sec vs 15 sec with old algorithm on 2000 elements. Well...ofcourse it is not fast enough.
//****MAIN PART*****//
// This number(max value which array can contains) is known
var maxValue = 1000;
var reverseIndexDict = new Dictionary<int,List<int>>();
for (int i = 0; i < maxValue; i++)
{
reverseIndexDict[i] = new List<int>();
}
for (int i = 0; i < sortedArrays.Count; i++)
{
for (int j = 0; j < sortedArrays[i].Count; j++)
{
reverseIndexDict[sortedArrays[i][j]].Add(i);
}
}
var tempArr = new List<int>();
for (int i = 0; i < sortedArrays.Count; i++)
{
tempArr.Clear();
for (int j = 0; j < sortedArrays[i].Count; j++)
{
tempArr.AddRange(reverseIndexDict[j]);
}
result.AddRange(tempArr.GroupBy(x => x).Where(x => x.Count()>=minSupport).Select(x => new[]{i,x.Key}).ToList());
}
result = result.Where(x => x[0]!=x[1]).ToList();
for (int i = 0; i < result.Count; i++)
{
resultIntersection.Add(sortedArrays[result[i][0]].Intersect(sortedArrays[result[i][1]]).ToList());
}
//*****************//
EDIT:
Some improvent.
//****MAIN PART*****//
// This number(max value which array can contains) is known
var maxValue = 1000;
var reverseIndexDict = new List<int>[maxValue];
for (int i = 0; i < maxValue; i++)
{
reverseIndexDict[i] = new List<int>();
}
for (int i = 0; i < sortedArrays.Count; i++)
{
for (int j = 0; j < sortedArrays[i].Count; j++)
{
reverseIndexDict[sortedArrays[i][j]].Add(i);
}
}
for (int i = 0; i < sortedArrays.Count; i++)
{
var tempArr = new Dictionary<int, List<int>>();
for (int j = 0; j < sortedArrays[i].Count; j++)
{
var sortedArraysij = sortedArrays[i][j];
for (int k = 0; k < reverseIndexDict[sortedArraysij].Count; k++)
{
if(!tempArr.ContainsKey(reverseIndexDict[sortedArraysij][k]))
{
tempArr[reverseIndexDict[sortedArraysij][k]] = new[]{sortedArraysij}.ToList();
}
else
{
tempArr[reverseIndexDict[sortedArraysij][k]].Add(sortedArrays[i][j]);
}
}
}
for (int j = 0; j < reverseIndexDict.Length; j++)
{
if(reverseIndexDict[j].Count>=minSupport)
{
result.Add(new[]{i,j});
resultIntersection.Add(reverseIndexDict[j]);
}
}
}
// and here we are filtering collections
//*****************//

There are two solutions:
Let us suppose you have 3 sorted arrays and you have to find the intersection between them. Traverse the first array and run a binary search on the rest of the two arrays for the element in first array. If the respective binary search on two list gave positive, then increment the counter of intersection.
result = List
for element in Array1:
status1 = binarySearch(element, Array2)
status2 = binarySearch(element, Array2)
status = status & status
if status == True:
count++
if count == MAX_INTERSECTION:
result.append(element)
break
Time Complexity : N * M * Log(N),
where,
N = Number of element in the array
M = Number of arrays
This solution works only if the number in the arrays are positive integers. Calculate the maximum and the minimum number out of the total elements in all the sorted arrays. As it is sorted, we can determine it by surveying the start and end element of the sorted arrays given. Let the greatest number be max and the lowest number be min. Create an array of size max - min and fill it with zero. Let us suppose you have 3 Arrays, now start traversing the first array and and go to the respective index and increment the value in the previously created array. As mentioned below:
element is 5 in Array 1, the New_array[5]+=1
Traverse all the three sorted list and perform the operation mentioned above. At the end traverse the new_array and look for value equal to 3, these indexes are the intersection result.
Time Complexity : O(N) + O(N) + .. = O(N)
Space Complexity : O(maximum_element - minimum_element)
where,
N = number of elements in the array.

Related

Fill 2d array with unique random numbers from 0 to 15 C#

I can't fill it with numbers to 0 - 15 then shuffle the array, so that's not the solution
I used this code in C but now in c# it doesn't work, for some reason this code let some numbers pass the do while.
Random r = new Random();
bool unique;
int rand_num;
for (int i = 0; i < 4; i++)
{
for (int j = 0; j < 4; j++)
{
do
{
unique = true;
rand_num = r.Next(16);
for (int k = 0; k < 4; k++)
{
for (int l = 0; l < 4; l++)
{
if (numbers[k, j] == rand_num)
{
unique = false;
}
}
}
} while (!unique);
numbers[i, j] = rand_num;
}
}
}
If the list of possible numbers is small, as in this case, just create the full list and randomise it first, then take the items in the order they appear. In your case, you can put the randomised numbers into a queue, then dequeue as required.
var r = new Random();
var numberQueue = new Queue<int>(Enumerable.Range(0, 16).OrderBy(n => r.NextDouble()));
var numbers = new int[4, 4];
for (var i = 0; i <= numbers.GetUpperBound(0); i++)
{
for (var j = 0; j <= numbers.GetUpperBound(1); j++)
{
numbers[i, j] = numberQueue.Dequeue();
}
}
I suggest you to use the Fisher-Yates algorithm to generate your non-repeatable sequence of random numbers.
It would be very straight-forward to implement a code to fill in a 2d array with those numbers, then.
List<int> seq = Enumerable.Range(0,16).ToList();
int[,] numbers = new int[4,4];
Random r = new();
for (int i = 0; i < 4; i++) {
for (int j = 0; j < 4; j++) {
int n = r.Next(0, seq.Count);
numbers[i,j] = seq[n];
seq.RemoveAt(n);
}
}
The approach you have taken may end up in continuous looping and take lot of time to complete.
Also checking for value in 2D using nested for loop is not efficient.
You can use HashSet to keep track of unique value. Searching in HashSet is fast.
following is the code approach I suggest.
var hashSet = new HashSet<int>();
var r = new Random();
var arr = new int[4, 4];
for(var i = 0;i<4;i++)
{
for(var j = 0;j<4;j++)
{
// Generate random value between 0 and 16.
var v = r.Next(0, 16);
// Check if the hashSet has the newly generated random value.
while(hashSet.Contains(v))
{
// generate new random value if the hashSet has the earlier generated value.
v = r.Next(0, 16);
}
//Add value to the hashSet.
hashSet.Add(v);
// add value to the 2D array.
arr[i, j] = v;
}
}
I hope this will help solving your issue.
The problem with your current approach is that as you get closer to the end of the array, you have to work harder and harder to get the next random value.
Imagine you roll a die, and each time you want to get a unique value. The first time you roll, any result will be unique. The next time, you have a 1/6 chance of getting a number that has already been obtained. And then a 2/6 chance, etc. and in the end most of your rolls will be non-unique.
In your example, you have 16 places that you want to fill with numbers 0 to 15. This is not a case of randomly generating numbers, but randomly placing them. How do we do this with a deck of cards? We shufle them!
My proposal is that you fill the array with unique sequential values and then shuffle them:
Random random = new Random();
int dim1 = array.GetLength(0);
int dim2 = array.GetLength(1);
int length = dim1 * dim2;
for (int i = 0; i < length; ++i)
{
int x = i / dim1;
int y = i % dim1;
array[x, y] = i; // set the initial values for each cell
}
// shuffle the values randomly
for (int i = 0; i < length; ++i)
{
int x1 = i / dim1;
int y1 = i % dim1;
int randPos = random.Next(i, length);
int x2 = randPos / dim1;
int y2 = randPos % dim1;
int tmp = array[x1, y1];
array[x1, y1] = array[x2, y2];
array[x2, y2] = tmp;
}
The shuffle in this code is based on the shuffle found here
int[,] numbers = new int[4, 4];
Random r = new Random();
bool unique;
int rand_num;
List<int> listRandom = new List<int> { };
for ( int i = 0; i < 4; i++ )
{
for ( int j = 0; j < 4; j++ )
{
do
{
unique = false;
if (!listRandom.Contains( rand_num = r.Next( 0, 16 )))
{
listRandom.Add( rand_num );
numbers[i, j] = rand_num;
unique = true;
}
} while ( !unique );
}
}

C# Resorting an array in a unique way

I am wanting to create multiple arrays of ints(in C#). However they all must have a unique number in the index, which no other array has that number in that index. So let me try show you what I mean:
int[] ints_array = new int[30];
for (int i = 0; i < ints_array.Count(); i++)
ints_array[i] = i;
//create a int array with 30 elems with each value increment by 1
List<int[]> arrayList = new List<int[]>();
for(int i = 0; i < ints_array.Count(); i++)
arrayList.Add(ints_array[i]. //somehow sort the array here randomly so it will be unique
So I am trying to get the arrayList have 30 int[] arrays and each is sorted so no array has the same int in the same index as another.
Example:
arrayList[0] = {5,2,3,4,1,6,7,8,20,21... etc }
arrayList[1] = {1,0,5,2,9,10,29,15,29... etc }
arrayList[2] = {0,28,4,7,29,23,22,17... etc }
So would this possible to sort the array in this unique kind of way? If you need anymore information just ask and ill fill you in :)
Wouldn't it be easier to create the arrays iteratively using an offset pattern?
What I mean is that if you created the first array using 1-30 where 1 is at index 0, the next array could repeat this using 2-30 where 2 is at index 0 and then wrap back to 1 and start counting forward again as soon as you go past 30. It would be an easy and repeatable way to make sure no array shared the same value/index pair.
You can do it like that:
List<int[]> arrayList = new List<int[]>();
Random rnd = new Random();
for (int i = 0; i < ints_array.Length; i++)
{
ints_array = ints_array.OrderBy(x => rnd.Next()).ToArray();
var isDuplicate = arrayList.Any(x => x.SequenceEqual(ints_array));
if (isDuplicate)
{
while (arrayList.Any(x => x.SequenceEqual(ints_array)))
{
ints_array = ints_array.OrderBy(x => rnd.Next()).ToArray();
}
}
arrayList.Add(ints_array);
}
I think, this wouldn't be so efficient for bigger numbers than 30.But in this case it shouldn't be a problem, in my machine it takes 7 milliseconds.
Jesse's idea would be best unless you needed a pure random pattern. In that case I would recommend generating a random number, checking all your previous arrays, and then placing it in an array if it did not match any other arrays current index. Otherwise, generate a new random number until you find a fresh one. Put that into a loop until all your arrays are filled.
Use a matrix (2D-array). It is easier to handle than a list of arrays. Create a random number generator. Make sure to initialize it only once, otherwise random number generator may create bad random numbers, if created in too short time intervals, since the slow PC-clock might not have ticked in between. (The actual time is used as seed value).
private static Random random = new Random();
Create two helper arrays with shuffeled indexes for rows and columns:
const int N = 30;
int[] col = CreateUniqueShuffledValues(N);
int[] row = CreateUniqueShuffledValues(N);
Then create and initialize the matrix by using the shuffeled row and column indexes:
// Create matrix
int[,] matrix = new int[N, N];
for (int i = 0; i < N; i++) {
for (int j = 0; j < N; j++) {
matrix[row[i], col[j]] = (i + j) % N;
}
}
The code uses these two helper methods:
private static int[] CreateUniqueShuffledValues(int n)
{
// Create and initialize array with indexes.
int[] array = new int[n];
for (int i = 0; i < n; i++) {
array[i] = i;
}
// Shuffel array using one variant of Fisher–Yates shuffle
// http://en.wikipedia.org/wiki/Fisher-Yates_shuffle#The_modern_algorithm
for (int i = 0; i < n; i++) {
int j = random.Next(i, n);
Swap(array, i, j);
}
return array;
}
private static void Swap(int[] array, int i, int j)
{
int temp = array[i];
array[i] = array[j];
array[j] = temp;
}
int size = 10;
// generate table (no duplicates in rows, no duplicates in columns)
// 0 1 2
// 1 2 0
// 2 0 1
int[,] table = new int[size, size];
for (int y = 0; y < size; y++)
for (int x = 0; x < size; x++)
table[y, x] = (y + x) % size;
// shuffle rows
Random rnd = new Random();
for (int i = 0; i < size; i++)
{
int y1 = rnd.Next(0, size);
int y2 = rnd.Next(0, size);
for (int x = 0; x < size; x++)
{
int tmp = table[y1, x];
table[y1, x] = table[y2, x];
table[y2, x] = tmp;
}
}
// shuffle columns
for (int i = 0; i < size; i++)
{
int x1 = rnd.Next(0, size);
int x2 = rnd.Next(0, size);
for (int y = 0; y < size; y++)
{
int tmp = table[y, x1];
table[y, x1] = table[y, x2];
table[y, x2] = tmp;
}
}
// sample output
for (int y = 0; y < size; y++)
{
for (int x = 0; x < size; x++)
Console.Write("{0} ", table[y, x]);
Console.WriteLine();
}

filling multidimensional array with unique numbers in C#

I'm trying to write a code that will fill array with unique numbers.
I could write the code separately for 1, 2 and 3 dimensional arrays but number of for cycles grow to "infinity".
this is the code for 2D array:
static void fillArray(int[,] array)
{
Random rand = new Random();
for (int i = 0; i < array.GetLength(0); i++)
{
for (int j = 0; j < array.GetLength(1); j++)
{
array[i, j] = rand.Next(1, 100);
for (int k = 0; k < j; k++)
if (array[i, k] == array[i, j])
j--;
}
}
print_info(array);
}
Is it possible to do something like this for n-dimensional arrays?
My approach is to start with a 1-d array of unique numbers, which you can shuffle, and then slot into appropriate places in your real array.
Here is the main function:
private static void Initialize(Array array)
{
var rank = array.Rank;
var dimensionLengths = new List<int>();
var totalSize = 1;
int[] arrayIndices = new int[rank];
for (var dimension = 0; dimension < rank; dimension++)
{
var upperBound = array.GetLength(dimension);
dimensionLengths.Add(upperBound);
totalSize *= upperBound;
}
var singleArray = new int[totalSize];
for (int i = 0; i < totalSize; i++) singleArray[i] = i;
singleArray = Shuffle(singleArray);
for (var i = 0; i < singleArray.Length; i++)
{
var remainingIndex = i;
for (var dimension = array.Rank - 1; dimension >= 0; dimension--)
{
arrayIndices[dimension] = remainingIndex%dimensionLengths[dimension];
remainingIndex /= dimensionLengths[dimension];
}
// Now, set the appropriate cell in your real array:
array.SetValue(singleArray[i], arrayIndices);
}
}
The key in this example is the array.SetValue(value, params int[] indices) function. By building up the correct list of indices, you can use this function to set an arbitrary cell in your array.
Here is the Shuffle function:
private static int[] Shuffle(int[] singleArray)
{
var random = new Random();
for (int i = singleArray.Length; i > 1; i--)
{
// Pick random element to swap.
int j = random.Next(i); // 0 <= j <= i-1
// Swap.
int tmp = singleArray[j];
singleArray[j] = singleArray[i - 1];
singleArray[i - 1] = tmp;
}
return singleArray;
}
And finally a demonstration of it in use:
var array1 = new int[2,3,5];
Initialize(array1);
var array2 = new int[2,2,3,4];
Initialize(array2);
My strategy assigns sequential numbers to the original 1-d array to ensure uniqueness, but you can adopt a different strategy for this as you see fit.
You can use Rank property to get the total number of dimentions in your array
To insert use SetValue method
In the first two for loops you are analysing the array properly (i and j go from the start to the end of the corresponding dimension). The problem comes in the most internal part where you introduce a "correction" which actually provokes an endless loop for j.
First iteration:
- First loop: i = 0;
- Second loop: j = 0;
- Third loop: j = -1
Second iteration
- First loop: i = 0;
- Second loop: j = 0;
- Third loop: j = -1
. etc., etc.
(I start my analysis in the moment when the internal loop is used for the first time. Also bear in mind that the exact behaviour cannot be predicted as far as random numbers are involved. But the idea is that you are making the j counter back over and over by following an arbitrary rule).
What you want to accomplish exactly? What is this last correction (the one provoking the endless loop) meant to do?
If the only thing you intend to do is checking the previously stored values, you have to rely on a different variable (j2, for example) which will not affect any of the loops above:
int j2 = j;
for (int k = 0; k < j2; k++)
if (array[i, k] == array[i, j2])
j2--;

How to code for dynamic for-loop levels?

My problem is like this:
I have several lists need to be permuted, but the list numbers are unknowable. And every element numbers in every list are also unknowable. Sicne I would like to traverse all list element combination, like 1) pick A from list 1, A from list 2, A from list 3; 2) ick A from list 1, A from list 2, B from list 3 ... for ALL permutation.
I use nested for-loop to traverse, like if I have two lists, then:
for (int i = 0; i < list[0].EnergyParameters.ListEnergyLevelCandidates.Count; i++)
{
for (int j = 0; j < list[1].EnergyParameters.ListEnergyLevelCandidates.Count; j++)
{
// Do sth
}
}
If I have three lists, then:
for (int i = 0; i < list[0].EnergyParameters.ListEnergyLevelCandidates.Count; i++)
{
for (int j = 0; j < list[1].EnergyParameters.ListEnergyLevelCandidates.Count; j++)
{
for (int k = 0; k < list[2].EnergyParameters.ListEnergyLevelCandidates.Count; k++)
{
// Do sth
}
}
}
Because the list numbers are unknowable, so the nest numbers are unknowable, which means, I don't know how many levels of for-loop needs to be written.
Under this kind of circumstance, how can I write code for dynamic for-loop levels? I don't want to write 10 loops for 10 lists.
In case you do not know how many lists there are, you do not write nested loops: instead, you write recursion. At each level of the invocation you loop a single list, like this:
void AllCombos(List<string>[] lists, int level, string[] current) {
if (level == lists.Length) {
// Do somthing; items of current[] contain elements of the combination
} else {
foreach (var s in lists[level]) {
current[level] = s;
AllCombos(lists, level+1, current);
}
}
}
Call AllCombos as follows:
var lists = new List<string>[10];
for (int i = 0 ; i != 10 ; i++) {
lists[i] = PopulateMyList(i);
}
string[] current = new string[lists.Length];
AllCombos(lists, 0, current);

Multi-Dimensional Data Structures in C#

How do you create a multi-dimensional data structure in C#?
In my mind it works like so:
List<List<int>> results = new List<List<int>>();
for (int i = 0; i < 10; i++)
{
for (int j = 0; j < 10; j++)
{
results[i][j] = 0;
}
}
This doesn't work (it throws an ArgumentOutOfRangeException). Is there a multi-dimensional structure in C# that allows me to access members through their indexes?
The problem here is that List doesn't automatically create elements. To initialise a List<List<T>> you need something like this:
List<List<int>> results = new List<List<int>>();
for (int i = 0; i < 10; i++)
{
results.Add(new List<int>());
for (int j = 0; j < 10; j++)
{
results[i].Add(0);
}
}
Note that setting Capacity is not sufficient, you need to call Add the number of times you need. Alternatively, you can simplify things by using Linq's Enumerable class:
List<List<int>> results = new List<List<int>>();
for (int i = 0; i < 10; i++)
{
results.Add(new List<int>());
results[i].AddRange(Enumerable.Repeat(0, 10));
}
Again, note that Enumerable.Repeat(new List<int>(), 10) will not work, since it will add 10 references to the same list.
Another approach using Linq to the extreme:
List<List<int>> results = Enumerable.Repeat(0, 10)
.Select(i => Enumerable.Repeat(0, 10).ToList())
.ToList();
(The unused parameter i is necessary to ensure that you don't reference the same list ten times as discussed above.)
Finally, to access elements, you can use exactly the notation you used before. Once the elements have been added, they can be read or modified as shown:
for (int i = 0; i < 10; i++)
{
for (int j = 0; j < 10; j++)
{
results[i][j] = 2;
int x = results[i][j];
}
}
If you know the dimensions of your structure in advance, and you do not plan to add or remove elements, then a 2D array sounds like your thing:
int[,] n = new int[10, 20];
for (int i = 0; i < 10; ++i) {
for (int j = 0; j < 10; ++j) {
n[i, j] = ...
};
};
You have to create the lists and initialize them with zeros before you can't start indexing into them.
List<List<int>> results = new List<List<int>>();
for (int i = 0; i < 10; i++)
{
results.Add(new List<int>(Enumerable.Repeat(0, 10)));
}
You have to actually 1) create each of the inner lists, and 2) set them to that size.
var Results = Enumerable.Range(0, 10).Select(i => Enumerable.Repeat(0, 10).ToList()).ToList();
I'm a bit of a Linq addict, though.
You could use a datatable and add columns and rows? You would then be able to reference them by name or index.

Categories