Determining the first available value in a list of integers - C#

I've got a simple List of ints:
List<int> myInts = new List<int>();
myInts.Add(0);
myInts.Add(1);
myInts.Add(4);
myInts.Add(6);
myInts.Add(24);
My goal is to get the first unused (available) value from the List.
(the first positive value that's not already present in the collection)
In this case, the answer would be 2.
Here's my current code:
int GetFirstFreeInt()
{
    for (int i = 0; i < int.MaxValue; ++i)
    {
        if (!myInts.Contains(i))
            return i;
    }
    throw new InvalidOperationException("All integers are already used.");
}
Is there a better way? Maybe using LINQ? How would you do this?
Of course here I used ints for simplicity but my question could apply to any type.

You basically want the first element from the sequence 0..int.MaxValue that is not contained in myInts:
int firstAvailable = Enumerable.Range(0, int.MaxValue)
                               .Except(myInts)
                               .FirstOrDefault();
Edit in response to comment:
There is no performance penalty here for iterating up to int.MaxValue. What LINQ does internally is create a hash table for myInts and then begin iterating over the sequence created by Enumerable.Range() - once the first item not contained in the hash table is found, that integer is yielded by the Except() method and returned by FirstOrDefault(), after which the iteration stops. This means the overall effort is O(n) for creating the hash table and then worst-case O(n) for iterating over the sequence, where n is the number of integers in myInts.
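To make that streaming behaviour concrete, here is a rough sketch of how an Except-like iterator can work - an illustration only, not the actual BCL implementation:
static IEnumerable<int> ExceptSketch(IEnumerable<int> first, IEnumerable<int> second)
{
    var banned = new HashSet<int>(second); // built once, lazily, on first enumeration
    foreach (var item in first)
    {
        if (banned.Add(item)) // Add returns false for banned or already-yielded items
            yield return item;
    }
}
Because it is an iterator, nothing past the first missing integer is ever examined once FirstOrDefault() stops pulling.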
For more on Except(), see e.g. Jon Skeet's EduLinq series: Reimplementing LINQ to Objects: Part 17 - Except

Well, if the list is sorted from smallest to largest and contains distinct values from 0 upward, you could simply check the i-th element: if (myInts[i] != i) return i;. That would be essentially the same approach, but it doesn't require iterating through the list for each and every Contains check (the Contains method scans the list, turning your algorithm into O(n^2) rather than O(n)).
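A minimal sketch of that idea (assuming the list is sorted ascending with distinct non-negative values - these assumptions are not checked):
int GetFirstFreeIntSorted(List<int> myInts)
{
    for (int i = 0; i < myInts.Count; i++)
    {
        if (myInts[i] != i)
            return i; // first gap: the value i is missing
    }
    return myInts.Count; // no gaps, so the next free value follows the last element
}
For the example list {0, 1, 4, 6, 24} this returns 2.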

Related

Would this function be linear, quadratic, or neither? (C#)

Treating the length of the list as the input (n), would the time complexity of this code be linear because there is only one loop, or quadratic because Any technically loops through the new list (though not through every item on every iteration)? Or would it be neither?
public static List<Item> RemoveDuplicated(List<Item> listToFilter)
{
    var newItemList = new List<Item>();
    foreach (var item in listToFilter)
    {
        if (!newItemList.Any(i => i.ItemId == item.ItemId))
        {
            newItemList.Add(item);
        }
    }
    return newItemList;
}
Algorithm complexity is the asymptotic behaviour as n grows large.
If unspecified, we assume the worst-case behaviour.
Here, that worst case is where every item is new to the list, such that Any has to traverse the entire existing list.
You nailed those parts: the outer loop executes n times; the inner loop has to traverse that list until it finds the element (we might assume checking m elements, where m is the current list size) or doesn't find it (checking all m elements).
In the worst case, the Any calls perform 1 + 2 + 3 + ... + (n-1) comparisons in total as the items are added to the list. I'm sure you recognize this as O(n^2).
Even assuming that duplicates are some fixed or bounded proportion of the original list, that expression still grows quadratically in n.
Does that help your understanding?
Clarification:
The sum of the sequence 1 .. n is n(n+1) / 2, or (n^2 + n) / 2. This is dominated by the n^2 term.
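For contrast, here is a hypothetical O(n) rewrite of the method above, using a HashSet of the ids seen so far (it assumes Item exposes an int ItemId, as in the question):
public static List<Item> RemoveDuplicatedFast(List<Item> listToFilter)
{
    var seenIds = new HashSet<int>();
    var newItemList = new List<Item>();
    foreach (var item in listToFilter)
    {
        if (seenIds.Add(item.ItemId)) // O(1); returns false for a duplicate id
            newItemList.Add(item);
    }
    return newItemList;
}
The 1 + 2 + ... + (n-1) scan disappears because each membership test is constant time.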

Why is the "ElapsedTicks" of a List not equal to the "ElapsedTicks" of an Array?

For example, I have the following code that uses Stopwatch:
var list = new List<int>();
var array = new ArrayList();
Stopwatch listStopwatch = new Stopwatch(), arrayStopwatch = new Stopwatch();
listStopwatch.Start();
for (int i = 0; i <= 10000; i++)
{
    list.Add(10);
}
listStopwatch.Stop();
arrayStopwatch.Start();
for (int i = 0; i <= 10000; i++)
{
    list.Add(10); // note: this still adds to list, not array - the answer below addresses both readings
}
arrayStopwatch.Stop();
Console.WriteLine(listStopwatch.ElapsedTicks > arrayStopwatch.ElapsedTicks);
Why are these values not equal?
Different code is expected to produce different timing.
If the second loop adds to the array, as the question implies: the most obvious difference is boxing in ArrayList - each int is stored as a boxed value (created on the heap, instead of inline as in List<int>).
If the second loop adds to the list, as the sample actually shows: growing a list requires re-allocating and copying all existing elements, which may be slower for the second set of elements if that particular range happens to hit more re-allocations (each copy operation has to copy far more elements than before).
Note that on average (as hinted by Adam Houldsworth) re-allocations cost the same, as they happen far less often as the array grows; but one can find a count of items that triggers an extra re-allocation in one of the two runs, making one number consistently differ from the other. One would need a much higher number of items for the difference to be consistent.
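If you want to see the re-allocations yourself, one way (a quick diagnostic sketch, not from the original answer) is to watch Capacity change as items are added:
var list = new List<int>();
int lastCapacity = -1;
for (int i = 0; i < 10000; i++)
{
    list.Add(10);
    if (list.Capacity != lastCapacity)
    {
        Console.WriteLine("Count={0}, Capacity={1}", list.Count, list.Capacity);
        lastCapacity = list.Capacity;
    }
}
Each capacity jump (typically a doubling) is a fresh array allocation plus a copy of every existing element, which is exactly the cost described above.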

How to find an item in an array such that the sum of all values before it is a specific value? (C++ and C#)

Assume that I have an array of integers with a specific size (say 1000)
I want to find the index of an item in this array, given the sum of all items in the array before this item (or including this item).
For example, assume that I have the following array:
int[] values={1,2,3,1,3,6,4,8,2,11}
If the input value is 6, I need to return index 2 (zero-based indexing; the 3 in the example above), and when given 10, I should return index 4.
What is the fastest way to do this, in C++ and C#?
If you need to do it only once, then the naive way is also the fastest way: walk the array, and keep the running total. Once you reach the target sum, return the current index.
If you need to run multiple queries for different sums, create an array of prefix sums first, like this:
var sums = new int[values.Length];
sums[0] = values[0];
for (int i = 1; i < sums.Length; i++) {
    sums[i] = sums[i-1] + values[i];
}
If all values are positive, you can run binary search on sums to get to the index in O(log(n)).
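A small sketch of that lookup (assuming all values are positive, so sums is strictly increasing):
int IndexForSum(int[] sums, int target)
{
    int index = Array.BinarySearch(sums, target);
    return index >= 0 ? index : -1; // -1 means no prefix sums exactly to target
}
With the example array, IndexForSum(sums, 6) returns 2 and IndexForSum(sums, 10) returns 4.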
Learn the binary search tree (BST); it will be the fastest algorithm for your problem:
http://en.wikipedia.org/wiki/Binary_search_tree

How can I get a random x number of decimals from a list of unique decimals that total up to y?

Say I have a sorted list of 1000 or so unique decimals, arranged by value.
List<decimal> decList
How can I get a random x number of decimals from a list of unique decimals that total up to y?
private List<decimal> getWinningValues(int xNumberToGet, decimal yTotalValue)
{
}
Is there any way to avoid a long processing time on this? My idea so far is to take xNumberToGet random numbers from the pool, something like this (a cool way to get a random selection from a list):
foreach (decimal d in decList.OrderBy(x => randomInstance.Next()).Take(xNumberToGet))
{
}
Then I might check the total of those, and if the total is less, I might slowly shift the numbers up (to the next available number). If the total is more, I might shift the numbers down. I'm still not sure how to implement this, or whether there is a better design readily available. Any help would be much appreciated.
OK, start with a little extension method I got from this answer:
public static IEnumerable<IEnumerable<T>> Combinations<T>(
    this IEnumerable<T> source,
    int k)
{
    if (k == 0)
    {
        return new[] { Enumerable.Empty<T>() };
    }
    return source.SelectMany((e, i) =>
        source.Skip(i + 1).Combinations(k - 1)
              .Select(c => (new[] { e }).Concat(c)));
}
this gives you a pretty efficient method to yield all the combinations with k members, without repetition, from a given IEnumerable. You could make good use of this in your implementation.
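For instance (a quick illustration, not part of the original answer), the 2-element combinations of a 3-element source:
foreach (var combo in new[] { 1m, 2m, 3m }.Combinations(2))
{
    Console.WriteLine(string.Join(", ", combo)); // prints "1, 2", then "1, 3", then "2, 3"
}
Note the extension must live in a static class, and the recursion over Skip re-enumerates the source, so it is best suited to in-memory collections.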
Bear in mind, if the IEnumerable and k are sufficiently large this could take some time, i.e. much longer than you have. So, I've modified your function to take a CancellationToken.
private static IEnumerable<decimal> GetWinningValues(
    IEnumerable<decimal> allValues,
    int numberToGet,
    decimal targetValue,
    CancellationToken canceller)
{
    IList<decimal> currentBest = null;
    var currentBestGap = decimal.MaxValue;
    var locker = new object();
    allValues.Combinations(numberToGet)
        .AsParallel()
        .WithCancellation(canceller)
        .TakeWhile(c => currentBestGap != decimal.Zero)
        .ForAll(c =>
        {
            var gap = Math.Abs(c.Sum() - targetValue);
            if (gap < currentBestGap)
            {
                lock (locker)
                {
                    if (gap < currentBestGap) // re-check: another thread may have improved it
                    {
                        currentBestGap = gap;
                        currentBest = c.ToList();
                    }
                }
            }
        });
    return currentBest;
}
I had an idea that you could sort the initial list and quit iterating the combinations at a certain point, once the sum must exceed the target. After some consideration, it's not trivial to identify that point, and the cost of checking may exceed the benefit. This benefit would have to be balanced against some function of the target value and the mean of the set.
I still think further optimization is possible but I also think that this work has already been done and I'd just need to look it up in the right place.
There are k such subsets of decList (k might be 0).
Assuming that you want to select each one with uniform probability 1/k, I think you basically need to do the following:
1. iterate over all the matching subsets
2. select one
Step 1 is potentially a big task, you can look into the various ways of solving the "subset sum problem" for a fixed subset size, and adapt them to generate each solution in turn.
Step 2 can be done either by making a list of all the solutions and choosing one or (if that might take too much memory) by using the clever streaming random selection algorithm.
If your data is likely to have lots of such subsets, then generating them all might be incredibly slow. In that case you might try to identify groups of them at a time. You'd have to know the size of the group without visiting its members one by one, then you can choose which group to use weighted by its size, then you've reduced the problem to selecting one of that group at random.
If you don't need to select with uniform probability then the problem might become easier. At the best case, if you don't care about the distribution at all then you can return the first subset-sum solution you find -- whether you'd call that "at random" is another matter...
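The "clever streaming random selection algorithm" mentioned above is presumably reservoir sampling; here is a sketch of picking one element uniformly from a sequence of unknown length in O(1) memory (a hypothetical helper, not from the original answer):
static T PickUniform<T>(IEnumerable<T> solutions, Random rng)
{
    T chosen = default(T);
    int seen = 0;
    foreach (var s in solutions)
    {
        seen++;
        if (rng.Next(seen) == 0) // keep the new element with probability 1/seen
            chosen = s;
    }
    if (seen == 0)
        throw new InvalidOperationException("No matching subsets.");
    return chosen;
}
Each of the k solutions ends up chosen with probability exactly 1/k, without ever materializing the full solution list.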

exclude items of one list from another with C#

I have a rather specific question about how to exclude items of one list from another. Common approaches such as Except() won't do and here is why:
If the duplicate within a list has an "even" index - I need to remove THIS element and the NEXT element AFTER it.
If the duplicate within a list has an "odd" index - I need to remove THIS element AND one element BEFORE it.
There might be many appearances of the same duplicate within a list, i.e. one occurrence might be at an "even" index and another at an "odd" one.
I'm not asking for a solution, since I've created one myself. However, after performing this method many times, ANTS performance profiler shows that the method accounts for 75% of the whole execution time (30 seconds out of 40). The question is: is there a faster method to perform the same operation? I've tried to optimize my current code but it still lacks performance. Here it is:
private void removedoubles(List<int> exclude, List<int> listcopy)
{
    for (int j = 0; j < exclude.Count(); j++)
    {
        for (int i = 0; i < listcopy.Count(); i++)
        {
            if (listcopy[i] == exclude[j])
            {
                if (i % 2 == 0) // even
                {
                    //listcopy.RemoveRange(i, i + 1);
                    listcopy.RemoveAt(i);
                    listcopy.RemoveAt(i);
                    i = i - 1;
                }
                else // odd
                {
                    //listcopy.RemoveRange(i - 1, i);
                    listcopy.RemoveAt(i - 1);
                    listcopy.RemoveAt(i - 1);
                    i = i - 2;
                }
            }
        }
    }
}
where:
exclude - a list that contains duplicates only. This list might contain up to 30 elements.
listcopy - a list that should be checked for duplicates. If a duplicate from "exclude" is found -> perform the removal operation. This list might contain up to 2000 elements.
I think that the LINQ might be some help but I don't understand its syntax well.
A faster way (O(n) overall) would be to do the following:
1. Go through the exclude list and turn it into a HashSet (O(n)).
2. When checking, test whether the current element is in the set (O(n) across the whole scan), since a presence test against a HashSet is O(1).
Maybe you can even change your algorithm so that the exclude collection is a HashSet from the very beginning; that way you can omit step 1 and gain even more speed. A sketch follows below.
(Your current way is O(n^2).)
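A minimal sketch of that suggestion applied to the method from the question (the name RemoveDoublesFast is illustrative; the pairing logic is unchanged, and like the original it assumes matches come in removable pairs):
private static void RemoveDoublesFast(HashSet<int> exclude, List<int> listcopy)
{
    for (int i = 0; i < listcopy.Count; i++)
    {
        if (exclude.Contains(listcopy[i])) // O(1) instead of scanning exclude
        {
            int start = (i % 2 == 0) ? i : i - 1; // even: this + next; odd: previous + this
            listcopy.RemoveRange(start, 2);
            i = start - 1; // the loop increment re-examines the element now at start
        }
    }
}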
Edit:
Another idea is the following: you are perhaps creating a copy of some list and making this method modify it? (A guess based on the parameter name.) Then you can change it as follows: pass the original list to the method, and have the method allocate and return a new list (your method signature would then be something like private List<int> getWithoutDoubles(HashSet<int> exclude, List<int> original)).
Edit:
It could be even faster if you reorganized the input data in the following way: as the items are always removed in pairs (an even index plus the following odd index), you should pair them up in advance! Your list of ints then becomes a list of pairs of ints, and your method might look something like this:
private List<Tuple<int, int>> getWithoutDoubles(
    HashSet<int> exclude, List<Tuple<int, int>> original)
{
    return original.Where(xy => !exclude.Contains(xy.Item1) &&
                                !exclude.Contains(xy.Item2))
                   .ToList();
}
(you remove the pairs where either the first or the second item is in the exclude collection). Instead of Tuple, perhaps you can pack the items into your custom type.
Here is yet another way to get the results.
var a = new List<int> {1, 2, 3, 4, 5};
var b = new List<int> {1, 2, 3};
var c = (from i in a let found = b.Any(j => j == i) where !found select i).ToList();
c will contain 4,5
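Incidentally, b.Any(j => j == i) rescans b for every element of a, which is O(n*m); the set-based equivalent does the same filtering in roughly O(n + m):
var c = a.Except(b).ToList(); // builds a hash set from b internally; c contains 4, 5
(Though, as the question explains, plain Except() does not cover the paired-removal requirement.)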
Reverse your loops so they start at Count - 1 and go down to 0; then you don't have to adjust i in one of the cases, and Count is only evaluated once per collection.
Can you convert the List to a LinkedList and give it a try? List.RemoveAt() is more expensive than LinkedList.Remove().
