How to use SkipWhile() to skip duplicate elements in two arrays?

How to use SkipWhile() to skip duplicate elements in two arrays? - c#

I am trying to get a list of numbers (upto 9) that don't contain any numbers already in a 3x3 cell from which is contained within a 81 large 2d array of integers. I'm attempting to make sudoku. There are likely better solutions that have nothing to with skipwhile or something similar, but they all seem to rely on duplicate numbers having the same index in both array which doesn't work.
int[] inCellNumbers = thisCell.Intersect(Numbers).ToArray();
int[] availableNumbers = Numbers.SkipWhile(Numbers.Contains<int>(inCellNumbers) == true);
This is the code that I tried, numbers is an array of integers and I get this error:
'MemoryExtensions.Contains(ReadOnlySpan, int)' requires a receiver of type 'ReadOnlySpan'
I was attempting to skip all numbers that are in 'inCellNumbers' and have them in 'availableNumbers' from 'Numbers'

You can use the Enumerable.Except Method:
int[] availableNumbers = Numbers.Except(inCellNumbers).ToArray();
The Enumerable.SkipWhile Method stops skipping at the first element that does not fulfill the condition. Even if others follow later that do fulfill it.

I'm not sure of the algorithm for checking around a sudoku grid but it sounds like you would want to use some hashsets.
A hashset may only have unique entries. An array or list can be dumped to a hashset using its constructor new HashSet(arrOrList), and you can evaluate immediately to one from LINQ using .ToHashSet().
As for the error, it looks like you are using List<T>.Contains<T>(T item) (notice all of the T here), which is going to check if the List contains the entire parameterized array as a single element/item. Unless Numbers is of type List<int[]> this is going to be a problem, but I'm unsure of what type you give it so this may not be the true issue. I see you are using .Contains<int> but this won't work as inCellNumbers is int[]

Related

C# Hashset of char is sorting itself automatically when using add method on it

I am trying to solve a string manipulation problem where I need to store characters in a collection that doesn't allow duplicates. So I am using HashSet, however I have noticed that when I am adding characters to it, hashset is getting sorted automatically.
For ex: I had characters 'b' and 'c' in the set already at index 0 and 1 but when I added character 'a' it got added to the index 0 shifting characters 'b' and 'c' to the right at indexes 1 and 2 respectively.
I was under the impression that Hashset are suppose to be not sorted.
please explain the reason behind this behavior and what else should I use if HashSet is working as it is suppose to be.
Thanks

From HashSet<T> Class:
A HashSet collection is not sorted and cannot contain duplicate elements. If order or element duplication is more important than performance for your application, consider using the List class together with the Sort method.
The change in order is due to hashing. Consider using a Dictionary to store the indices along with the characters, Dictionary keys can't be duplicates.

I found this:
This is what I found on google

Storing lots of Arrays into a List on every Loop

I have a question about storing lots of arrays into a list.
First, I initialize the array and list:
int[] arr = new int[9];
List<int[]> forkarr = new List<int[]>();
then I run through a lot of for loops and modify the array each time in order to produce all possible tic tac toe variations. While looping through all those variations, if the array is a possible board then I print it, and additionally if the array meets certain criteria for being a 'fork', then I will add it to the list like this:
if (ForkCheck.fork(arr)) { forkarr.Add(arr); Console.WriteLine("It's a Fork!");}
which also prints that message letting you know that particular array is a fork.
Now, when I am printing all of the arrays, the ones that are forks are printed properly and labeled as such.
However, when I go to print out all of the int[] elements of my list forkarr like so:
foreach (int[] arry in forkarr)
{
PrintGame.print(arry);//this is my method that prints the array
Console.WriteLine();
}
for some reason each array becomes an identical:
222
222
222
(I'm using 2 as 'X' and 1 as 'O' and 0 as 'empty' btw)
And it just prints that out over and over for all the forks.
So, something is going wrong when I am adding each modification of the array as a new element in the list I think, but I'm not sure why.
Thanks for any help.

Because you are modifying the same array and adding it to the list. Each time you done with one array array you need to create a new instance like this:
arr = new int[9];
This will create a new reference which will be independent from other arrays. And modifying it's elements wont affect the others.
For more information about value vs reference types you can refer to the question below:
What is the difference between a reference type and value type in c#?

The use of LinkedList<BigInteger>

I've been struggling for a while now with an error I can't fix.
I searched the Internet without any success and started wandering if it is possible what I want to accomplish.
I want the create an array with the a huge amount of nodes, so huge that it I need BigInteger.
I founded that LinkedList would fit my solution the best, so I started with this code.
BigInteger[] numberlist = { 0, 1 };
LinkedList<BigInteger> number = new LinkedList<BigInteger>(numberlist);
for(c = 2; c <= b; c++)
{
numberlist [b] = 1; /* flag numbers to 1 */
}
The meaning of this is to set all nodes in the linkedlist to active (1).
The vars c and b are bigintegers too.
The error I get from VS10 is :
Cannot implicitly convert type 'System.Numerics.BigInteger' to 'int'. An explicit conversion exists (are you missing a cast?)
The questions:
Is it possible to accomplish?
How can I flag all nodes in number with the use of BigInteger (not int)?
Is there an other better method for accomplishing the thing?
UPDATE
In the example I use c++ as the counter. This is variable though...
The node list could look like this:
numberlist[2]
numberlist[3]
numberlist[200]
numberlist[20034759044900]
numberlist[23847982344986350]
I'll remove processed nodes. At the maximum I'll use 1,5gb of memory.
Please reply on this update, I want to know whether my ideas are correct or not.
Also I would like too learn from my mistakes!

The generic argument of LinkedList<T> describes the element type and has nothing to do with the number of elements you can put in the collection.
Indexing into a linked list is a bad idea too. It's an O(n) operation.
And I can't imagine how you can have more elements than what fits into an Int64. There is simply not enough memory to back that.
You can have more than 2^31-1 elements in a 64bit process, but most likely you need to create your own collection type for that, since most built in collections have lower limits.
If you need more than 2^31 flags I'd create my own collection type that's backed by multiple arrays and bitpacks the flags. That way you get about 8*2^31 = 16 billion flags into a 2GB array.
If your data is sparse you could consider using a HashSet<Int64> or Dictionary<Int64,Node>.
If your data has long sequences with the same value you could use some tree structure or perhaps some variant of run-length-encoding.
If you don't need the indexes at all, you could just use a Queue<T> and dequeue from the beginning.

From your update it seems you don't want to have huge amount of data, you just want to index them using huge numbers. If that's the case, you can use Dictionary<BigInteger, int> or Dictionary<BigInteger, bool> if you only want true/false values. Alternatively, you could use HashSet<BigInteger>, if you don't need to distinguish between false and “not in collection”.

LinkedList<BigInteger> is a small number of elements, where each element is a BigInteger.
.NET doesn't allow any single array to be larger than 2GB (even on 64-bit), so there's no point in having an index larger than an int.
Try breaking your big array into smaller segments, where each segment can be addressed by an int.

If I may read your mind, it sounds like what you want a sparse array which is indexed by a BigInteger. As others have mentioned, LinkedList<BigInteger> is entirely the wrong data structure for this. I suggest something entirely different, namely a Dictionary<BigInteger, int>. This allows you to do the following:
Dictionary<BigInteger, int> data = new Dictionary<BigInteger, int>();
BigInteger b = GetBigInteger();
data[b] = 1; // the BigInteger is the *index*, and the integer is the *value*

What is the fast way of getting an index of an element in an array? [duplicate]

This question already has answers here:
How to find the index of an element in an array in Java?
(15 answers)
Closed 6 years ago.
I was asked this question in an interview. Although the interview was for dot net position, he asked me this question in context to java, because I had mentioned java also in my resume.
How to find the index of an element having value X in an array ?
I said iterating from the first element till last and checking whether the value is X would give the result. He asked about a method involving less number of iterations, I said using binary search but that is only possible for sorted array. I tried saying using IndexOf function in the Array class. But nothing from my side answered that question.
Is there any fast way of getting the index of an element having value X in an array ?

As long as there is no knowledge about the array (is it sorted? ascending or descending? etc etc), there is no way of finding an element without inspecting each one.
Also, that is exactly what indexOf does (when using lists).

How to find the index of an element having value X in an array ?
This would be fast:
int getXIndex(int x){
myArray[0] = x;
return 0;
}

A practical way of finding it faster is by parallel processing.
Just divide the array in N parts and assign every part to a thread that iterates through the elements of its part until value is found. N should preferably be the processor's number of cores.

If a binary search isn't possible (beacuse the array isn't sorted) and you don't have some kind of advanced search index, the only way I could think of that isn't O(n) is if the item's position in the array is a function of the item itself (like, if the array is [10, 20, 30, 40], the position of an element n is (n / 10) - 1).

Maybe he wants to test your knowledge about Java.
There is Utility Class called Arrays, this class contains various methods for manipulating arrays (such as sorting and searching)
http://download.oracle.com/javase/6/docs/api/java/util/Arrays.html
In 2 lines you can have a O(n * log n) result:
Arrays.sort(list); //O(n * log n)
Arrays.binarySearch(list, 88)); //O(log n)

Puneet - in .net its:
string[] testArray = {"fred", "bill"};
var indexOffset = Array.IndexOf(testArray, "fred");
[edit] - having read the question properly now, :) an alternative in linq would be:
string[] testArray = { "cat", "dog", "banana", "orange" };
int firstItem = testArray.Select((item, index) => new
{
ItemName = item,
Position = index
}).Where(i => i.ItemName == "banana")
.First()
.Position;
this of course would find the FIRST occurence of the string. subsequent duplicates would require additional logic. but then so would a looped approach.
jim

It's a question about data structures and algorithms (altough a very simple data structure). It goes beyond the language you are using.
If the array is ordered you can get O(log n) using binary search and a modified version of it for border cases (not using always (a+b)/2 as the pivot point, but it's a pretty sophisticated quirk).
If the array is not ordered then... good luck.
He can be asking you about what methods you have in order to find an item in Java. But anyway they're not faster. They can be olny simpler to use (than a for-each - compare - return).
There's another solution that's creating an auxiliary structure to do a faster search (like a hashmap) but, OF COURSE, it's more expensive to create it and use it once than to do a simple linear search.

Take a perfectly unsorted array, just a list of numbers in memory. All the machine can do is look at individual numbers in memory, and check if they are the right number. This is the "password cracker problem". There is no faster way than to search from the beginning until the correct value is hit.

Are you sure about the question? I have got a questions somewhat similar to your question.
Given a sorted array, there is one element "x" whose value is same as its index find the index of that element.
For example:
//0,1,2,3,4,5,6,7,8,9, 10
int a[10]={1,3,5,5,6,6,6,8,9,10,11};
at index 6 that value and index are same.
for this array a, answer should be 6.
This is not an answer, in case there was something missed in the original question this would clarify that.

If the only information you have is the fact that it's an unsorted array, with no reletionship between the index and value, and with no auxiliary data structures, then you have to potentially examine every element to see if it holds the information you want.
However, interviews are meant to separate the wheat from the chaff so it's important to realise that they want to see how you approach problems. Hence the idea is to ask questions to see if any more information is (or could be made) available, information that can make your search more efficient.
Questions like:
1/ Does the data change very often?
If not, then you can use an extra data structure.
For example, maintain a dirty flag which is initially true. When you want to find an item and it's true, build that extra structure (sorted array, tree, hash or whatever) which will greatly speed up searches, then set the dirty flag to false, then use that structure to find the item.
If you want to find an item and the dirty flag is false, just use the structure, no need to rebuild it.
Of course, any changes to the data should set the dirty flag to true so that the next search rebuilds the structure.
This will greatly speed up (through amortisation) queries for data that's read far more often than written.
In other words, the first search after a change will be relatively slow but subsequent searches can be much faster.
You'll probably want to wrap the array inside a class so that you can control the dirty flag correctly.
2/ Are we allowed to use a different data structure than a raw array?
This will be similar to the first point given above. If we modify the data structure from an array into an arbitrary class containing the array, you can still get all the advantages such as quick random access to each element.
But we gain the ability to update extra information within the data structure whenever the data changes.
So, rather than using a dirty flag and doing a large update on the next search, we can make small changes to the extra information whenever the array is changed.
This gets rid of the slow response of the first search after a change by amortising the cost across all changes (each change having a small cost).
3. How many items will typically be in the list?
This is actually more important than most people realise.
All talk of optimisation tends to be useless unless your data sets are relatively large and performance is actually important.
For example, if you have a 100-item array, it's quite acceptable to use even the brain-dead bubble sort since the difference in timings between that and the fastest sort you can find tend to be irrelevant (unless you need to do it thousands of times per second of course).
For this case, finding the first index for a given value, it's probably perfectly acceptable to do a sequential search as long as your array stays under a certain size.
The bottom line is that you're there to prove your worth, and the interviewer is (usually) there to guide you. Unless they're sadistic, they're quite happy for you to ask them questions to try an narrow down the scope of the problem.
Ask the questions (as you have for the possibility the data may be sorted. They should be impressed with your approach even if you can't come up with a solution.
In fact (and I've done this in the past), they may reject all your possibile approaches (no, it's not sorted, no, no other data structures are allowed, and so on) just to see how far you get.
And maybe, just maybe, like the Kobayashi Maru, it may not be about winning, it may be how you deal with failure :-)

i need to use large size of array

My requirement is to find a duplicate number in an array of integers of length 10 ^ 15.
I need to find a duplicate in one pass. I know the method (logic) to find a duplicate number from an array, but how can I handle such a large size.

An array of 10^15 of integers would require more than a petabyte to store. You said it can be done in a single pass, so there's no need to store all the data. But even reading this amount of data would take a lot of time.
But wait, if the numbers are integers, they fall into a certain range, let's say N = 2^32. So you only need to search at most N+1 numbers to find a duplicate. Now that's feasible.

You can use a BitVector array with length = 2^(32-5) = 0x0800000
This has a bit for each posible int32 number.
Note: easy solution (BitArray) do´nt support adecuate constructor.
BitVector32[] bv = new BitVector32[0x8000000];
int[] ARR = ....; // Your array
foreach (int I in ARR)
{
int Element = I >> 5;
int Bit = I & 0x1f;
if (bv[Element ][Bit])
{
// Option for First Duplicate Found
}
else
{
bv[I][Bit] = true;
}
}

You'll need a different data structure. I suspect the requirement isn't really to use an array - I'd hope not, as arrays can only hold up to Int32.MaxValue elements, i.e. 2,147,483,647... much less than 10^15. Even on a 64-bit machine, I believe the CLR requires that arrays have at most that many elements. (See the docs for Array.CreateInstance for example - even though you can specify the bounds as 64-bit integers, it will throw an exception if they're actually that big.)
Now, if you can explain what the real requirement is, we may well be able to suggest alternative data structures.
If this is a theoretical problem rather than a practical one, it would be helpful if you could tell us those constraints, too.
For example, if you've got enough memory for the array itself, then asking for 2^24 bytes to store which numbers you've seen already (one bit per value) isn't asking for much. This is assuming the values themselves are 32-bit integers, of course. Start with an empty array, and set the relevant bit for each number you find. If you find you're about to set one that's already set, then you've found the first duplicate.

You can declare it in the normal way: new int[1000000000000000]. However this will only work on a 64-bit machine; the largest you can expect to store on a 32-bit machine is a little over 2GB.
Realistically you won't be able to store the whole array in memory. You'll need to come up with a way of generating it in smaller chunks, and checking those chunks individually.
What's the data in the array? Maybe you don't need to generate it all in one go. Or maybe you can store the data in a file.

You cannot declare an array of size greater than Int32.MaxValue (2^31, or approx. 2*10^9), so you will have to either chain arrays together or use a List<int> to hold all of the values.

Your algorithm should really be the same regardless of the array size. The best time complexity you'll get has got to be (ideally) O(n) of course.
Consider the following pseudo-code for the algorithm:
Create a HashSet<int> of capacity equal to the range of numbers in your array.
Loop over each number in the array and check if it already exists in the hashset.
if no, add it to the hashset now.
if yes, you've found a duplicate.
Memory usage here is far from trivial, but if you want speed, it will do the job.

You don't need to do anything. By definition, there will be a duplicate because 2^32 < 10^15 - there aren't enough numbers to fill a 10^15 array uniquely. :)
Now if there is an additional requirement that you know where the duplicates are... thats another story, but it wasn't in the original problem.

question,
1) is the number of items in the array 10^15
2) or can the value of the items be 10^15?
if it is #1:
where are you pulling the nummbers from? if its a file you can step through it.
are there more than 2,147,483,647 unique numbers?
if its #2:
a int64 can handle the number
if its #1 and #2:
are there more than 2,147,483,647 unique numbers?
if there are less then 2,147,483,647 unique numbers you can use a List<bigint>

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.