C# LINQ - SkipWhile() in reverse, without calling Reverse()?

In this code:
for (e = 0; e <= collection.Count - 2; e++)
{
var itm = collection.Read();
var itm_price = itm.Price;
var forwards_satisfied_row = collection
.Skip(e + 1)
.SkipWhile(x => x.Price < ex_price)
.FirstOrDefault();
var backwards_satisfied_row = collection
.Reverse()
.Skip(collection.Count - e)
.SkipWhile(x => x.Price < ex_price)
.FirstOrDefault();
}
Suppose the collection contains millions of items and calling Reverse() is too expensive. What would be the best way to achieve the same outcome as 'backwards_satisfied_row'?
Edit:
For each item in the collection, it should find the first preceding item that matches the SkipWhile predicate.
For context, I'm finding the distance from a price extremum (minimum or maximum) to the nearest horizontal clash with the price. This gives a 'strength' value for each minimum and maximum, which determines its importance and helps to match it up with extrema of similar strength.
Edit 2
This chart shows the data in the repro code below. Note the dip in the middle at item #22; this item has a distance of 18.
Bear in mind this operation will be iterated millions of times.
So I'm trying not to read into memory, and to only evaluate the items needed.
When I run this on a large dataset r_ex takes 5 ms per row, whereas l_ex takes up to a second.
It might be tempting to iterate backwards and check that way, but there could be millions of previous records, being read from a binary file.
Many types of searches like Binary search wouldn't be practical here, since the values aren't ordered.
static void Main(string[] args)
{
var dict_dists = new Dictionary<Int32, Int32>();
var dict = new Dictionary<Int32, decimal> {
{1, 410},{2, 474},{3, 431},
{4, 503},{5, 461},{6, 535},
{7, 488},{8, 562},{9, 508},
{10, 582},{11, 522},{12, 593},
{13, 529},{14, 597},{15, 529},
{16, 593},{17, 522},{18, 582},
{19, 510},{20, 565},{21, 492},
{22, 544},{23, 483},{24, 557},
{25, 506},{26, 580},{27, 524},
{28, 598},{29, 537},{30, 609},
{31, 543},{32, 612},{33, 542},
{34, 607},{35, 534},{36, 594},
{37, 518},{38, 572},{39, 496},
{40, 544},{41, 469},{42, 511},
{43, 437},{44, 474},{45, 404},
{46, 462},{47, 427},{48, 485},
{49, 441},{50, 507}};
var i = 0;
for (i = 0; i <= dict.Count - 2; i++)
{
var ele = dict.ElementAt(i);
var current_time = ele.Key;
var current_price = ele.Value;
var is_maxima = current_price > dict.ElementAt(i + 1).Value;
//' If ele.Key = 23 Then here = True
var shortest_dist = Int32.MaxValue;
var l_ex = new KeyValuePair<int, decimal>();
var r_ex = new KeyValuePair<int, decimal>();
if (is_maxima)
{
l_ex = dict.Reverse().Skip(dict.Count - 1 - i + 1).SkipWhile(x => x.Value < current_price).FirstOrDefault();
r_ex = dict.Skip(i + 1).SkipWhile(x => x.Value < current_price).FirstOrDefault();
}
else
{ // 'Is Minima
l_ex = dict.Reverse().Skip(dict.Count - 1 - i + 1).SkipWhile(x => x.Value > current_price).FirstOrDefault();
r_ex = dict.Skip(i + 1).SkipWhile(x => x.Value > current_price).FirstOrDefault();
}
if (l_ex.Key > 0)
{
var l_dist = (current_time - l_ex.Key);
if ( l_dist < shortest_dist ) {
shortest_dist = l_dist;
};
}
if (r_ex.Key > 0)
{
var r_dist = (r_ex.Key - current_time);
if ( r_dist < shortest_dist ) {
shortest_dist = r_dist;
};
}
dict_dists.Add(current_time, shortest_dist);
}
var dist = dict_dists[23];
}
Edit: As a workaround I'm writing a reversed temp file for the left-seekers.
for (i = file.count - 1; i >= 0; i--)
{
file.SetPointerToItem(i);
temp_file.Write(file.Read());
}

You could make it more efficient by selecting the precedent of each item in one pass. Let's make an extension method for enumerables that selects a precedent for each element:
public static IEnumerable<T> SelectPrecedent<T>(this IEnumerable<T> source,
Func<T, bool> selector)
{
T selectedPrecedent = default;
foreach (var item in source)
{
if (selector(item)) selectedPrecedent = item;
yield return selectedPrecedent;
}
}
You could then use this method, and select the precedent and the subsequent of each element by doing only two Reverse operations in total:
var precedentArray = collection.SelectPrecedent(x => x.Price < ex_price).ToArray();
var subsequentArray = collection.Reverse()
.SelectPrecedent(x => x.Price < ex_price).Reverse().ToArray();
for (int i = 0; i < collection.Count; i++)
{
var current = collection[i];
var precedent = precedentArray[i];
var subsequent = subsequentArray[i];
// Do something with the current, precedent and subsequent
}

There is no need to call .Reverse() and then FirstOrDefault(); just use LastOrDefault(). Instead of Skip(collection.Count - e), use .Take(e).
var backwards_satisfied_row = collection
.SkipWhile(x => x.Price < ex_price) //Skip while x.Price < ex_price
.Skip(e+1) //Skip first e+1 elements
.LastOrDefault(); //Get Last or default value
You can make your code more efficient by storing the filtered collection once and then taking FirstOrDefault() and LastOrDefault() for forwards_satisfied_row and backwards_satisfied_row respectively, like this:
for (e = 0; e <= collection.Count - 2; e++)
{
var itm = collection.Read();
var itm_price = itm.Price;
var satisfied_rows = collection
.SkipWhile(x => x.Price < ex_price)
.Skip(e + 1)
.ToList();
var forwards_satisfied_row = satisfied_rows.FirstOrDefault();
var backwards_satisfied_row = satisfied_rows.LastOrDefault();
}
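For the requirement stated in the question's first edit (for each item, find the nearest earlier item that stops the SkipWhile, that is, the nearest earlier item whose price is not lower), a monotonic stack is one alternative. This is a different technique from both answers above, not a variation of them, and it is only a minimal sketch: it assumes the prices can be visited once in order, and the class and method names are hypothetical. Each index is pushed and popped at most once, so the whole pass is O(n) instead of one backwards scan per item.

```csharp
using System;
using System.Collections.Generic;

public static class NearestPreceding
{
    // For each index i, returns the index of the nearest j < i with
    // prices[j] >= prices[i], or -1 when no such earlier item exists.
    public static int[] PreviousNotLowerIndex(IReadOnlyList<decimal> prices)
    {
        var result = new int[prices.Count];
        // Indices whose prices are non-increasing from bottom to top.
        var stack = new Stack<int>();
        for (int i = 0; i < prices.Count; i++)
        {
            // Lower prices can never be the answer for i or for any later item
            // that is at least as high as prices[i], so discard them.
            while (stack.Count > 0 && prices[stack.Peek()] < prices[i])
                stack.Pop();
            result[i] = stack.Count > 0 ? stack.Peek() : -1;
            stack.Push(i);
        }
        return result;
    }
}
```

For minima the comparison would be flipped (pop while the stacked price is greater), and running the same loop from the last index down to the first would give the right-hand neighbour.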

Related

Find values which sum to 0 in Excel with many items

I have to find every subset that sums to 0 in a fairly large list of 500 to 1000 decimal values, both positive and negative. I'm not an expert, so I read many articles and solutions and then wrote my own code. The data comes from an Excel worksheet, and I would like to mark the sums I find there.
The code works this way:
Initially I find all pairs that sum to 0.
Then I put the remaining sums into a list and take combinations of up to 20 items, because I know that no larger combination can sum to 0.
Within these combinations I check whether a combination sums to 0 and, if so, save it in the result list; otherwise I save its sum as a key in a dictionary and later check whether the dictionary contains the negation of a subsequent sum (so I also check pairs of these subsets).
I keep track of the indexes so I can reach and modify the cells.
Finding the solutions is fast enough, but processing the results in Excel becomes really slow. I don't need to find all solutions, but I want to find as many as possible in a short time.
What do you think about this solution? How can I improve the speed? How can I easily skip the sums that have already been used? And how can I mark the cells in my worksheet quickly, since that is now the bottleneck of the program?
I hope this is clear enough :) Thanks to everybody for any help.
Here is my code for the combination part:
List<decimal> listDecimal = new List<decimal>();
List<string> listRange = new List<string>();
List<decimal> resDecimal = new List<decimal>();
List<IEnumerable<decimal>> resDecimal2 = new List<IEnumerable<decimal>>();
List<IEnumerable<string>> resIndex = new List<IEnumerable<string>>();
Dictionary<decimal, int> dicSumma = new Dictionary<decimal, int>();
foreach (TarkistaSummat.CellsRemain el in list)
{
decimal sumDec = Convert.ToDecimal(el.Summa.Value);
listDecimal.Add(sumDec);
string row = el.Summa.Cells.Row.ToString();
string col = el.Summa.Cells.Column.ToString();
string range = el.Summa.Cells.Row.ToString() + ":" + el.Summa.Cells.Column.ToString();
listRange.Add(range);
}
var subsets = new List<IEnumerable<decimal>> { new List<decimal>() };
var subsetsIndex = new List<IEnumerable<string>> { new List<string>() };
for (int i = 0; i < list.Count; i++)
{
if (i > 20)
{
List<IEnumerable<decimal>> parSubsets = subsets.GetRange(i, i + 20);
List<IEnumerable<string>> parSubsetsIndex = subsetsIndex.GetRange(i, i + 20);
var Z = parSubsets.Select(x => x.Concat(new[] { listDecimal[i] }));
//var Zfound = Z.Select(x => x).Where(w => w.Sum() ==0);
subsets.AddRange(Z.ToList());
var Zr = parSubsetsIndex.Select(x => x.Concat(new[] { listRange[i] }));
subsetsIndex.AddRange(Zr.ToList());
}
else
{
var T = subsets.Select(y => y.Concat(new[] { listDecimal[i] }));
//var Tfound = T.Select(x => x).Where(w => w.Sum() == 0);
//resDecimal2.AddRange(Tfound);
//var TnotFound = T.Except(Tfound);
subsets.AddRange(T.ToList());
var Tr = subsetsIndex.Select(y => y.Concat(new[] { listRange[i] }));
subsetsIndex.AddRange(Tr.ToList());
}
}
for (int i = 0; i < subsets.Count; i++)
{
decimal sumDec = subsets[i].Sum();
if (sumDec == 0m)
{
resDecimal2.Add(subsets[i]);
resIndex.Add(subsetsIndex[i]);
continue;
}
else
{
if(dicSumma.ContainsKey(sumDec * -1))
{
dicSumma.TryGetValue(sumDec * -1, out int index);
IEnumerable<decimal> addComb = subsets[i].Union(subsets[index]);
resDecimal2.Add(addComb);
var indexComb = subsetsIndex[i].Union(subsetsIndex[index]);
resIndex.Add(indexComb);
}
else
{
if(!dicSumma.ContainsKey(sumDec))
{
dicSumma.Add(sumDec, i);
}
}
}
}
for (int i = 0; i < resIndex.Count; i++)
{
//List<Range> ranges = new List<Range>();
foreach(string el in resIndex[i])
{
string[] split = el.Split(':');
Range cell = actSheet.Cells[Convert.ToInt32(split[0]), Convert.ToInt32(split[1])];
cell.Interior.ColorIndex = 6;
}
}
}
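The first step described in the question (pairing up values that sum to 0) can be done in a single pass with a dictionary. The following is only a sketch under assumptions: the values are already in a list of decimals, each index may be used in at most one pair, and the names ZeroPairs and FindZeroSumPairs are hypothetical rather than taken from the question's code.

```csharp
using System;
using System.Collections.Generic;

public static class ZeroPairs
{
    // Returns index pairs (First, Second) with values[First] + values[Second] == 0.
    // Each index appears in at most one pair.
    public static List<(int First, int Second)> FindZeroSumPairs(IReadOnlyList<decimal> values)
    {
        // Maps a value to the indices holding it that are still unpaired.
        var unmatched = new Dictionary<decimal, Stack<int>>();
        var pairs = new List<(int First, int Second)>();
        for (int i = 0; i < values.Count; i++)
        {
            decimal v = values[i];
            if (unmatched.TryGetValue(-v, out var partners) && partners.Count > 0)
            {
                // An earlier, still unpaired value cancels this one.
                pairs.Add((partners.Pop(), i));
            }
            else
            {
                if (!unmatched.TryGetValue(v, out var same))
                    unmatched[v] = same = new Stack<int>();
                same.Push(i);
            }
        }
        return pairs;
    }
}
```

The indices left in the dictionary afterwards are the "remaining" values that the combination step would have to work on.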

Check if a string is sorted

I have a string, simplified to "12345", which is sorted. The string could contain digits (0-9) or letters (a-z). For mixed content, use the natural sort order. I need a method to verify that the string is sorted.
Attempt with linq technique:
string items1 = "2349"; //sorted
string items2 = "2476"; //not sorted, 6<>7
bool sorted1 = Enumerable.SequenceEqual(items1.OrderBy(x => x), items1); //true
bool sorted2 = Enumerable.SequenceEqual(items2.OrderBy(x => x), items2); //false
But the string could also be sorted in descending order.
Is there a better way than
string items3 = "4321";
bool sorted3 = Enumerable.SequenceEqual(items3.OrderBy(x => x), items3) || Enumerable.SequenceEqual(items3.OrderByDescending(x => x), items3);
to check if a string is sorted? Maybe some built in solution?
Your solution is fine and very readable. One problem with it is that it requires ordering the string, which is O(n * log(n)); this can be avoided by iterating over the string without sorting it.
For example:
var firstDifs = items1.Zip(items1.Skip(1), (x, y) => y - x);
This LINQ expression projects every pair of adjacent items in the string to a number that indicates their difference. So if you have items1 = "1245", the output will be:
firstDifs: {1, 2, 1}
Now all you need to do is to validate that firstDifs is either ascending or descending:
bool firstSorted = firstDifs.All(x => x > 0) || firstDifs.All(x => x < 0); //true
Now:
Skip is O(1), since the number of operations required to skip 1 item is constant.
Zip is O(n).
All is O(n).
So the whole solution is O(n).
Note that a simple loop will be more efficient. Also, because LINQ evaluates lazily and there are two calls to All, the Zip may be enumerated twice: if the first All returns false because the direction changes late in the string (say at the 3487th item, or at the last character as in 1234567891), the second All still runs over a fresh enumeration of the Zip, up to the point where it fails, for no benefit.
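The simple loop mentioned above could look roughly like this. It is a minimal sketch with hypothetical names (SortedCheck, IsMonotonic), and it deliberately differs from the Zip version in one respect: equal neighbours are allowed, so "1123" counts as sorted.

```csharp
using System;

public static class SortedCheck
{
    // True when the characters never decrease, or never increase, from left to right.
    public static bool IsMonotonic(string s)
    {
        bool ascending = true, descending = true;
        // Stop as soon as both directions have been ruled out.
        for (int i = 1; i < s.Length && (ascending || descending); i++)
        {
            if (s[i] < s[i - 1]) ascending = false;
            if (s[i] > s[i - 1]) descending = false;
        }
        return ascending || descending;
    }
}
```

The string is walked once, and the walk ends early on the first character that contradicts both directions.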
This requires a reducer. In C#, that is Enumerable.Aggregate. It is an O(n) algorithm.
var query = "123abc".Aggregate(new { ascending = true, descending = true, prev = (char?)null },
(result, currentChar) =>
new
{
ascending = result.prev == null || result.ascending && currentChar >= result.prev,
descending = result.prev == null || result.descending && currentChar <= result.prev,
prev = (char?)currentChar
}
);
Console.WriteLine(query.ascending || query.descending);
I once had to check something similar to your case but with huge data streams, so performance was important. I came up with this small extension method, which performs very well:
public static bool IsOrdered<T>(this IEnumerable<T> enumerable) where T: IComparable<T>
{
using (var enumerator = enumerable.GetEnumerator())
{
if (!enumerator.MoveNext())
return true; //empty enumeration is ordered
var left = enumerator.Current;
int previousUnequalComparison = 0;
while (enumerator.MoveNext())
{
var right = enumerator.Current;
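var currentComparison = Math.Sign(left.CompareTo(right)); // normalize to -1/0/1: CompareTo only guarantees the sign, and char.CompareTo returns the difference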
if (currentComparison != 0)
{
if (previousUnequalComparison != 0
&& currentComparison != previousUnequalComparison)
return false;
previousUnequalComparison = currentComparison;
left = right;
}
}
}
return true;
}
Using it is obviously very simple:
var items1 = "2349";
var items2 = "2476"; //not sorted, 6<>7
items1.IsOrdered(); //true
items2.IsOrdered(); //false
You can do much better than the accepted answer by not having to compare all of the elements:
var s = "2349";
var r = Enumerable.Range(1, s.Length - 1);
//var isAscending = r.All(i => s[i - 1] <= s[i]);
//var isDescending = r.All(i => s[i - 1] >= s[i]);
var isOrdered = r.All(i => s[i - 1] <= s[i]) || r.All(i => s[i - 1] >= s[i]);
var items = "4321";
var sortedItems = items.OrderBy(i => i).ToList(); // Materialize so the sort runs only once
var sorted = sortedItems.SequenceEqual(items) || sortedItems.SequenceEqual(items.Reverse()); // Reverse using yield return
I would go for simple iteration over all elements:
string str = "whatever123";
Func<char, char, bool> pred;
bool? asc = str.TakeWhile((q, i) => i < str.Length - 1)
.Select((q, i) => str[i] == str[i+1] ? (bool?)null : str[i] < str[i+1])
.FirstOrDefault(q => q.HasValue);
if (!asc.HasValue)
return true; //all chars are the same
if (asc.Value)
pred = (c1, c2) => c1 <= c2;
else
pred = (c1, c2) => c1 >= c2;
for (int i = 0; i < str.Length - 1; ++i)
{
if (!pred(str[i], str[i + 1]))
return false;
}
return true;

Find values that appear in all lists (or arrays or collections)

Given the following:
List<List<int>> lists = new List<List<int>>();
lists.Add(new List<int>() { 1,2,3,4,5,6,7 });
lists.Add(new List<int>() { 1,2 });
lists.Add(new List<int>() { 1,2,3,4 });
lists.Add(new List<int>() { 1,2,5,6,7 });
What is the best/fastest way of identifying which numbers appear in all lists?
You can use the .NET 3.5 .Intersect() extension method:
List<int> a = new List<int>() { 1, 2, 3, 4, 5 };
List<int> b = new List<int>() { 0, 4, 8, 12 };
List<int> common = a.Intersect(b).ToList();
To do it for two lists one would use x.Intersect(y).
To do it for several we would want to do something like:
var intersection = lists.Aggregate((x, y) => x.Intersect(y));
But this won't work because the result of the lambda isn't List<int> and so it can't be fed back in. This might tempt us to try:
var intersection = lists.Aggregate((x, y) => x.Intersect(y).ToList());
But then this makes n-1 needless calls to ToList() which is relatively expensive. We can get around this with:
var intersection = lists.Aggregate(
(IEnumerable<int> x, IEnumerable<int> y) => x.Intersect(y));
Which applies the same logic, but in using explicit types in the lambda, we can feed the result of Intersect() back in without wasting time and memory creating a list each time, and so gives faster results.
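Applied to the four lists from the question, the explicitly typed Aggregate call can be wrapped up as follows. This is only a usage sketch; the wrapper names IntersectDemo and CommonValues are hypothetical, and the single ToList() at the end is there just to materialize the result once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IntersectDemo
{
    // Intersects all inner lists; throws InvalidOperationException if 'lists' is empty,
    // because Aggregate without a seed needs at least one element.
    public static List<int> CommonValues(List<List<int>> lists) =>
        lists.Aggregate((IEnumerable<int> x, IEnumerable<int> y) => x.Intersect(y))
             .ToList();

    public static void Main()
    {
        var lists = new List<List<int>>
        {
            new List<int> { 1, 2, 3, 4, 5, 6, 7 },
            new List<int> { 1, 2 },
            new List<int> { 1, 2, 3, 4 },
            new List<int> { 1, 2, 5, 6, 7 }
        };
        Console.WriteLine(string.Join(", ", CommonValues(lists))); // prints "1, 2"
    }
}
```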
If this came up a lot we can get further (slight) performance improvements by rolling our own rather than using Linq:
public static IEnumerable<T> IntersectAll<T>(this IEnumerable<IEnumerable<T>> source)
{
using(var en = source.GetEnumerator())
{
if(!en.MoveNext()) return Enumerable.Empty<T>();
var set = new HashSet<T>(en.Current);
while(en.MoveNext())
{
var newSet = new HashSet<T>();
foreach(T item in en.Current)
if(set.Remove(item))
newSet.Add(item);
set = newSet;
}
return set;
}
}
This assumes it's for internal use only. If it could be called from another assembly it should have error checks, and perhaps should be defined so as to only perform the intersect operations on the first MoveNext() of the calling code:
public static IEnumerable<T> IntersectAll<T>(this IEnumerable<IEnumerable<T>> source)
{
if(source == null)
throw new ArgumentNullException("source");
return IntersectAllIterator(source);
}
public static IEnumerable<T> IntersectAllIterator<T>(IEnumerable<IEnumerable<T>> source)
{
using(var en = source.GetEnumerator())
{
if(en.MoveNext())
{
var set = new HashSet<T>(en.Current);
while(en.MoveNext())
{
var newSet = new HashSet<T>();
foreach(T item in en.Current)
if(set.Remove(item))
newSet.Add(item);
set = newSet;
}
foreach(T item in set)
yield return item;
}
}
}
(In these final two versions there's an opportunity to short-circuit if we end up emptying the set, but it only pays off if this happens relatively often, otherwise it's a nett loss).
Conversely, if these aren't concerns, and if we know that we're only ever going to want to do this with lists, we can optimise a bit further with the use of Count and indices:
public static IEnumerable<T> IntersectAll<T>(this List<List<T>> source)
{
if (source.Count == 0) return Enumerable.Empty<T>();
if (source.Count == 1) return source[0];
var set = new HashSet<T>(source[0]);
for(int i = 1; i != source.Count; ++i)
{
var newSet = new HashSet<T>();
var list = source[i];
for(int j = 0; j != list.Count; ++j)
{
T item = list[j];
if(set.Remove(item))
newSet.Add(item);
}
set = newSet;
}
return set;
}
And further if we know we're always going to want the results in a list, and we know that either we won't mutate the list, or it won't matter if the input list got mutated, we can optimise for the case of there being zero or one lists (but this costs more if we might ever not need the output in a list):
public static List<T> IntersectAll<T>(this List<List<T>> source)
{
if (source.Count == 0) return new List<T>(0);
if (source.Count == 1) return source[0];
var set = new HashSet<T>(source[0]);
for(int i = 1; i != source.Count; ++i)
{
var newSet = new HashSet<T>();
var list = source[i];
for(int j = 0; j != list.Count; ++j)
{
T item = list[j];
if(set.Remove(item))
newSet.Add(item);
}
set = newSet;
}
return new List<T>(set);
}
Again though, as well as making the method less widely applicable, this has risks in terms of how it could be used, so it is only appropriate for internal code where you can know either that you won't change the input or the output after the fact, or that this won't matter.
Linq already offers Intersect and you can exploit Aggregate as well:
var result = lists.Aggregate((a, b) => a.Intersect(b).ToList());
If you don't trust the Intersect method or you just prefer to see what's going on, here's a snippet of code that should do the trick:
// Output goes here
List<int> output = new List<int>();
// Make sure lists are sorted
for (int i = 0; i < lists.Count; ++i) lists[i].Sort();
// Maintain array of indices so we can step through all the lists in parallel
int[] index = new int[lists.Count];
while(index[0] < lists[0].Count)
{
// Search for each value in the first list
int value = lists[0][index[0]];
// No. lists that value appears in, we want this to equal lists.Count
int count = 1;
// Search all the other lists for the value
for (int i = 1; i < lists.Count; ++i)
{
while (index[i] < lists[i].Count)
{
// Stop if we've passed the spot where value would have been
if (lists[i][index[i]] > value) break;
// Stop if we find value
if (lists[i][index[i]] == value)
{
++count;
break;
}
++index[i];
}
// If we reach the end of any list there can't be any more matches so end the search now
if (index[i] >= lists[i].Count) goto done;
}
// Store the value if we found it in all the lists
if (count == lists.Count) output.Add(value);
// Skip multiple occurrences of the same value
while (index[0] < lists[0].Count && lists[0][index[0]] == value) ++index[0];
}
done: ; // a label must be followed by a statement
Edit:
I got bored and did some benchmarks on this vs. Jon Hanna's version. His is consistently faster, typically by around 50%. Mine wins by about the same margin if you happen to have presorted lists, though. Also you can gain a further 20% or so with unsafe optimisations. Just thought I'd share that.
You can also get it with SelectMany and Distinct:
List<int> result = lists
.SelectMany(x => x.Where(e => lists.All(l => l.Contains(e))))
.Distinct().ToList();
Edit:
List<int> result2 = lists.First().Where(e => lists.Skip(1).All(l => l.Contains(e)))
.ToList();
Edit 2:
List<int> result3 = lists
.Select(l => l.OrderBy(n => n).Take(lists.Min(x => x.Count()))).First()
.TakeWhile((n, index) => lists.Select(l => l.OrderBy(x => x)).Skip(1).All(l => l.ElementAt(index) == n))
.ToList();

Does Linq provide a way to easily spot gaps in a sequence?

I am managing a directory of files. Each file will be named similarly to Image_000000.png, with the numeric portion being incremented for each file that is stored.
Files can also be deleted, leaving gaps in the number sequence. The reason I am asking is that I recognize that at some point in the future, the user could use up the number sequence unless I take steps to reuse numbers when they become available. I realize that it is a million, and that's a lot, but we have 20-plus year users, so "someday" is not out of the question.
So, I am specifically asking whether or not there exists a way to easily determine the gaps in the sequence without simply looping. I realize that because it's a fixed range, I could simply loop over the expected range.
And I will unless there is a better/cleaner/easier/faster alternative. If so, I'd like to know about it.
This method is called to obtain the next available file name:
public static String GetNextImageFileName()
{
String retFile = null;
DirectoryInfo di = new DirectoryInfo(userVars.ImageDirectory);
FileInfo[] fia = di.GetFiles("*.*", SearchOption.TopDirectoryOnly);
String lastFile = fia.Where(i => i.Name.StartsWith("Image_") && i.Name.Substring(6, 6).ContainsOnlyDigits()).OrderBy(i => i.Name).Last().Name;
if (!String.IsNullOrEmpty(lastFile))
{
Int32 num;
String strNum = lastFile.Substring(6, 6);
String strExt = lastFile.Substring(13);
if (!String.IsNullOrEmpty(strNum) &&
!String.IsNullOrEmpty(strExt) &&
strNum.ContainsOnlyDigits() &&
Int32.TryParse(strNum, out num))
{
num++;
retFile = String.Format("Image_{0:D6}.{1}", num, strExt);
while (num <= 999999 && File.Exists(retFile))
{
num++;
retFile = String.Format("Image_{0:D6}.{1}", num, strExt);
}
}
}
return retFile;
}
EDIT: in case it helps anyone, here is the final method, incorporating Daniel Hilgarth's answer:
public static String GetNextImageFileName()
{
DirectoryInfo di = new DirectoryInfo(userVars.ImageDirectory);
FileInfo[] fia = di.GetFiles("Image_*.*", SearchOption.TopDirectoryOnly);
List<Int32> fileNums = new List<Int32>();
foreach (FileInfo fi in fia)
{
Int32 i;
if (Int32.TryParse(fi.Name.Substring(6, 6), out i))
fileNums.Add(i);
}
var result = fileNums.Select((x, i) => new { Index = i, Value = x })
.Where(x => x.Index != x.Value)
.Select(x => (Int32?)x.Index)
.FirstOrDefault();
Int32 index;
if (result == null)
index = fileNums.Count - 1;
else
index = result.Value - 1;
var nextNumber = fileNums[index] + 1;
if (nextNumber >= 0 && nextNumber <= 999999)
return String.Format("Image_{0:D6}", nextNumber);
return null;
}
A very simple approach to find the first number of the first gap would be the following:
int[] existingNumbers = /* extract all numbers from all filenames and order them */
var allNumbers = Enumerable.Range(0, 1000000);
var result = allNumbers.Where(x => !existingNumbers.Contains(x)).First();
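If the numbers 0 through n - 1 are all in use and no gaps exist, this returns n. If every number in the range is taken, First() throws an InvalidOperationException instead of returning 1,000,000.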
This approach has the drawback that it performs rather badly, as it iterates existingNumbers multiple times.
A somewhat better approach would be to use Zip:
allNumbers.Zip(existingNumbers, (a, e) => new { Number = a, ExistingNumber = e })
.Where(x => x.Number != x.ExistingNumber)
.Select(x => x.Number)
.First();
An improved version of DuckMaestro's answer that actually returns the first value of the first gap - and not the first value after the first gap - would look like this:
var tmp = existingNumbers.Select((x, i) => new { Index = i, Value = x })
.Where(x => x.Index != x.Value)
.Select(x => (int?)x.Index)
.FirstOrDefault();
int index;
if(tmp == null)
index = existingNumbers.Length - 1;
else
index = tmp.Value - 1;
var nextNumber = existingNumbers[index] + 1;
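The cost noted earlier in this answer, where Contains rescans existingNumbers for every candidate, can also be avoided by loading the numbers into a HashSet first. This is a minimal sketch of that idea, not the approach the answer itself takes; the names GapFinder and FirstFreeNumber and the limit parameter are hypothetical. It does not need the input to be sorted, and it signals "no free number" with null rather than an exception.

```csharp
using System;
using System.Collections.Generic;

public static class GapFinder
{
    // Returns the smallest number in [0, limit) that is not in 'existing',
    // or null when every number in that range is taken.
    public static int? FirstFreeNumber(IEnumerable<int> existing, int limit)
    {
        var taken = new HashSet<int>(existing); // O(1) average lookups
        for (int n = 0; n < limit; n++)
        {
            if (!taken.Contains(n))
                return n;
        }
        return null;
    }
}
```

For example, FirstFreeNumber(new[] { 0, 1, 2, 4, 5, 6 }, 1000000) returns 3, the first number of the first gap.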
Improving over the other answer, use the alternate version of Where.
int[] existingNumbers = ...
var result = existingNumbers.Where( (x,i) => x != i ).FirstOrDefault();
The value i is a counter starting at 0.
This version of where is supported in .NET 3.5 (http://msdn.microsoft.com/en-us/library/bb549418(v=vs.90).aspx).
var firstnonexistingfile = Enumerable.Range(0, 1000000).Select(x => String.Format("Image_{0:D6}.{1}", x, strExt)).FirstOrDefault(x => !File.Exists(x));
This will iterate from 0 to 999999, then output the result of the String.Format() as an IEnumerable<string> and then find the first string out of that sequence that returns false for File.Exists().
It's an old question, but it has been suggested (in the comments) that you could use .Except() instead. I tend to like this solution a little better since it will give you the first missing number (the gap) or, if there is no gap, the next number in the sequence. Here's an example:
var allNumbers = Enumerable.Range(0, 999999); //999999 is arbitrary. You could use int.MaxValue, but it would degrade performance
var existingNumbers = new int[] { 0, 1, 2, 4, 5, 6 };
int result;
var missingNumbers = allNumbers.Except(existingNumbers);
if (missingNumbers.Any())
result = missingNumbers.First();
else //no missing numbers -- you've reached the max
result = -1;
Running the above code would set result to:
3
Additionally, if you changed existingNumbers to:
var existingNumbers = new int[] { 0, 1, 3, 2, 4, 5, 6 };
So there isn't a gap, you would get 7 back.
Anyway, that's why I prefer Except over the Zip solution -- just my two cents.
Thanks!

Advanced Remove Array Duplicated

I have 3 arrays.
Array 1 = {1,1,1,1,2,2,3,3}
Array 2 = {a,a,a,a,e,e,b,b}
Array 3 = {z,z,z,z,z,z,z,z}
I would like to remove all duplicates from array 1 and also remove the same element at said duplicate in the other arrays to keep them all properly linked.
I know you can use .Distinct().ToArray() to do this for one array, but then the other arrays would not have the elements removed as well.
The result would look like this.
Array 1 = {1,2,3}
Array 2 = {a,e,b}
Array 3 = {z,z,z}
I'm guessing the only way to solve this would be the following.
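// Assumes Array1, Array2 and Array3 are List<T> instances, since arrays have no RemoveAt
for (int a = 0; a < Array1.Count; a++) {
for (int b = a + 1; b < Array1.Count; b++) {
if (Array1[a] == Array1[b]) {
Array1.RemoveAt(b);
Array2.RemoveAt(b);
Array3.RemoveAt(b);
b--; // the next element has shifted into position b, so check it again
}
}
}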
Would be nice to find a simple predefined function however!
var distinctIndexes = array1
.Select((item, idx) => new { Item = item, Index = idx })
.GroupBy(p => p.Item)
.Select(grp => grp.First().Index);
var result1 = distinctIndexes.Select(i => array1[i]).ToArray();
var result2 = distinctIndexes.Select(i => array2[i]).ToArray();
var result3 = distinctIndexes.Select(i => array3[i]).ToArray();
Note this won't necessarily use the first unique element from the first array. If you need to do that you can calculate the indexes as
var distinctIndexes = array1
.Select((item, idx) => new { Item = item, Index = idx })
.Aggregate(new Dictionary<int, int>(), (dict, i) =>
{
if (! dict.ContainsKey(i.Item))
{
dict[i.Item] = i.Index;
}
return dict;
})
.Values;
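The same first-occurrence indexes can also be collected with a HashSet, whose Add method returns false for a value that is already present. This is a minimal sketch under the assumption that all three arrays have the same length; the names ParallelDistinct and FirstOccurrenceIndexes are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ParallelDistinct
{
    // Returns, in ascending order, the index of the first occurrence of each distinct key.
    public static int[] FirstOccurrenceIndexes<T>(IReadOnlyList<T> keys)
    {
        var seen = new HashSet<T>();
        var indexes = new List<int>();
        for (int i = 0; i < keys.Count; i++)
        {
            if (seen.Add(keys[i])) // true only the first time a key is seen
                indexes.Add(i);
        }
        return indexes.ToArray();
    }

    public static void Main()
    {
        int[] array1 = { 1, 1, 1, 1, 2, 2, 3, 3 };
        string[] array2 = { "a", "a", "a", "a", "e", "e", "b", "b" };
        string[] array3 = { "z", "z", "z", "z", "z", "z", "z", "z" };

        int[] keep = FirstOccurrenceIndexes(array1);
        Console.WriteLine(string.Join(",", keep.Select(i => array1[i]))); // prints "1,2,3"
        Console.WriteLine(string.Join(",", keep.Select(i => array2[i]))); // prints "a,e,b"
        Console.WriteLine(string.Join(",", keep.Select(i => array3[i]))); // prints "z,z,z"
    }
}
```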
You should consider what data structure you're using carefully. Is this "remove" operation likely to happen all at once? How often? (I'm not challenging your use of Array necessarily, just a general tip, but your scenario seems weird). Also, you did not explain if this is an index-based removal or an element based removal. If I was implementing this, I would be tempted to create a new Array and add all remaining elements to the new Array in a loop, ignoring the elements you want to remove. Then simply reassign the reference with '='. Of course, that depends on the maximum expected size of the Array, since a copy like I suggested would take up more memory (usually wouldn't be a problem).
I don't really know of a clean way to do what you're asking, but here is a generic example that does it:
static void RemoveDupes(ref Array a1, ref Array a2, ref Array a3)
{
Type a1t, a2t, a3t;
int newLength, ni, oi;
int[] indices;
a1t = a1.GetType().GetElementType();
a2t = a2.GetType().GetElementType();
a3t = a3.GetType().GetElementType();
Dictionary<object, List<int>> buckets = new Dictionary<object, List<int>>();
for (int i = 0; i < a1.Length; i++)
{
object val = a1.GetValue(i);
if (buckets.ContainsKey(val))
buckets[val].Add(i);
else
buckets.Add(val, new List<int> { i });
}
indices = buckets.Where(kvp => kvp.Value.Count > 1).SelectMany(kvp => kvp.Value.Skip(1)).OrderBy(i => i).ToArray();
newLength = a1.Length - indices.Length;
Array na1 = Array.CreateInstance(a1t, newLength);
Array na2 = Array.CreateInstance(a2t, newLength);
Array na3 = Array.CreateInstance(a3t, newLength);
oi = 0;
ni = 0;
for (int i = 0; i < indices.Length; i++)
{
while (oi < indices[i])
{
na1.SetValue(a1.GetValue(oi), ni);
na2.SetValue(a2.GetValue(oi), ni);
na3.SetValue(a3.GetValue(oi), ni);
oi++;
ni++;
}
oi++;
}
while (ni < newLength)
{
na1.SetValue(a1.GetValue(oi), ni);
na2.SetValue(a2.GetValue(oi), ni);
na3.SetValue(a3.GetValue(oi), ni);
oi++;
ni++;
}
a1 = na1;
a2 = na2;
a3 = na3;
}
