LINQ alternative to creating collection by looping over array of fields - c#

I have an array which contains fields for a data structure in the following format:
[0] = Record 1 (Name Field)
[1] = Record 1 (ID Field)
[2] = Record 1 (Other Field)
[3] = Record 2 (Name Field)
[4] = Record 2 (ID Field)
[5] = Record 2 (Other Field)
etc.
I'm processing this into a collection as follows:
for (int i = 0; i < components.Length; i = i + 3)
{
results.Add(new MyObj
{
Name = components[i],
Id = components[i + 1],
Other = components[i + 2],
});
}
This works fine, but I was wondering if there is a nice way to achieve the same output with LINQ? There's no functional requirement here, I'm just curious if it can be done or not.
I did do some experimenting with grouping by an index (after ToList()'ing the array):
var groupings = components
.GroupBy(x => components.IndexOf(x) / 3)
.Select(g => g.ToArray())
.Select(a => new
{
Name = a[0],
Id = a[1],
Other = a[2]
});
This works, but I think it's a bit overkill for what I'm trying to do. Is there a simpler way to achieve the same output as the for loop?

Looks like a perfect candidate for Josh Einstein's IEnumerable.Batch extension. It slices an enumerable into chunks of a certain size and feeds them out as an enumeration of arrays:
public static IEnumerable<T[]> Batch<T>(this IEnumerable<T> self, int batchSize)
In the case of this question, you'd do something like this:
var results =
from batch in components.Batch(3)
select new MyObj { Name = batch[0], Id = batch[1], Other = batch[2] };
Update: 2 years on and the Batch extension I linked to seems to have disappeared. Since it was considered the answer to the question, and just in case someone else finds it useful, here's my current implementation of Batch:
public static partial class EnumExts
{
/// <summary>Split sequence into blocks of specified size.</summary>
/// <typeparam name="T">Type of items in sequence</typeparam>
/// <param name="sequence"><see cref="IEnumerable{T}"/> sequence to split</param>
/// <param name="batchLength">Number of items per returned array</param>
/// <returns>Arrays of <paramref name="batchLength"/> items, with last array smaller if sequence count is not a multiple of <paramref name="batchLength"/></returns>
public static IEnumerable<T[]> Batch<T>(this IEnumerable<T> sequence, int batchLength)
{
if (sequence == null)
throw new ArgumentNullException("sequence");
if (batchLength < 2)
throw new ArgumentException("Batch length must be at least 2", "batchLength");
using (var iter = sequence.GetEnumerator())
{
var bfr = new T[batchLength];
while (true)
{
for (int i = 0; i < batchLength; i++)
{
if (!iter.MoveNext())
{
if (i == 0)
yield break;
Array.Resize(ref bfr, i);
break;
}
bfr[i] = iter.Current;
}
yield return bfr;
bfr = new T[batchLength];
}
}
}
}
This operation is deferred, single enumeration and executes in linear time. It is relatively quick compared to a few other Batch implementations I've seen, even though it is allocating a new array for each result.
Which just goes to show: you never can tell until you profile, and you should always quote the code in case it disappears.
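For illustration, here's roughly how it behaves on the question's field array (hypothetical sample values; nothing is enumerated until ToList(), since Batch is deferred):
var components = new[] { "Alice", "1", "foo", "Bob", "2", "bar" };
var results = components
    .Batch(3)
    .Select(b => new MyObj { Name = b[0], Id = b[1], Other = b[2] })
    .ToList();
// results[0]: Name = "Alice", Id = "1", Other = "foo"
// results[1]: Name = "Bob",   Id = "2", Other = "bar"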

I would say stick with your for-loop. However, this should work with Linq:
List<MyObj> results = components
.Select((c ,i) => new{ Component = c, Index = i })
.GroupBy(x => x.Index / 3)
.Select(g => new MyObj{
Name = g.First().Component,
Id = g.ElementAt(1).Component,
Other = g.Last().Component
})
.ToList();

Maybe an iterator could be appropriate.
Declare a custom iterator:
static IEnumerable<Tuple<int, int, int>> ToPartitions(int count)
{
for (var i = 0; i < count; i += 3)
yield return new Tuple<int, int, int>(i, i + 1, i + 2);
}
Prepare the following LINQ:
var results = from partition in ToPartitions(components.Length)
select new {Name = components[partition.Item1], Id = components[partition.Item2], Other = components[partition.Item3]};

This method may give you an idea on how to make the code more expressive.
public static IEnumerable<MyObj> AsComponents(this IEnumerable<string> serialized)
{
    using (var it = serialized.GetEnumerator())
    {
        // Returns the next field, or null once the input runs out.
        Func<string> next = () => it.MoveNext() ? it.Current : null;
        while (true)
        {
            var obj = new MyObj
            {
                Name = next(),
                Id = next(),
                Other = next()
            };
            if (obj.Name == null)
                yield break;
            yield return obj;
        }
    }
}
As it stands, I dislike the way I detect the end of the input, but you might have domain specific information on how to do this better.
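A minimal usage sketch, assuming components is the string array from the question:
var results = components.AsComponents().ToList();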

Related

Split a list of objects into sub-lists of contiguous elements using LINQ?

I have a simple class Item:
public class Item
{
public int Start { get; set;}
public int Stop { get; set;}
}
Given a List<Item> I want to split this into multiple sublists of contiguous elements. e.g. a method
List<Item[]> GetContiguousSequences(Item[] items)
Each element of the returned list should be an array of Item such that list[i].Stop == list[i+1].Start for each element
e.g.
{[1,10], [10,11], [11,20], [25,30], [31,40], [40,45], [45,100]}
=>
{{[1,10], [10,11], [11,20]}, {[25,30]}, {[31,40],[40,45],[45,100]}}
Here is a simple (and not guaranteed bug-free) implementation that simply walks the input data looking for discontinuities:
List<Item[]> GetContiguousSequences(Item []items)
{
var ret = new List<Item[]>();
var i1 = 0;
for(var i2=1;i2<items.Length;++i2)
{
//discontinuity
if(items[i2-1].Stop != items[i2].Start)
{
var num = i2 - i1;
ret.Add(items.Skip(i1).Take(num).ToArray());
i1 = i2;
}
}
//end of array
ret.Add(items.Skip(i1).Take(items.Length-i1).ToArray());
return ret;
}
It's not the most intuitive implementation, and I wonder if there is a way to have a neater LINQ-based approach. I was looking at Take and TakeWhile, hoping to find the indices where the discontinuities occur, but couldn't see an easy way to do this.
Is there a simple way to use IEnumerable LINQ algorithms to do this in a more descriptive (not necessarily performant) way?
I set up a simple test case here: https://dotnetfiddle.net/wrIa2J
I'm really not sure this is much better than your original, but for the purposes of another solution, the general process is:
Use Select to project a list working out a grouping
Use GroupBy to group by the above
Use Select again to project the grouped items to an array of Item
Use ToList to project the result to a list
public static List<Item[]> GetContiguousSequences2(Item []items)
{
var currIdx = 1;
return items.Select( (item,index) => new {
item = item,
index = index == 0 || items[index-1].Stop == item.Start ? currIdx : ++currIdx
})
.GroupBy(x => x.index, x => x.item)
.Select(x => x.ToArray())
.ToList();
}
Live example: https://dotnetfiddle.net/mBfHru
Another way is to do an aggregation using Aggregate. This means maintaining a final Result list and a Curr list where you can aggregate your sequences, adding them to the Result list as you find discontinuities. This method looks a little closer to your original
public static List<Item[]> GetContiguousSequences3(Item []items)
{
var res = items.Aggregate(new {Result = new List<Item[]>(), Curr = new List<Item>()}, (agg, item) => {
if(!agg.Curr.Any() || agg.Curr.Last().Stop == item.Start) {
agg.Curr.Add(item);
} else {
agg.Result.Add(agg.Curr.ToArray());
agg.Curr.Clear();
agg.Curr.Add(item);
}
return agg;
});
res.Result.Add(res.Curr.ToArray()); // Remember to add the last group
return res.Result;
}
Live example: https://dotnetfiddle.net/HL0VyJ
You can implement ContiguousSplit as a coroutine: loop over the source and either add the item to the current range, or yield the range and start a new one.
private static IEnumerable<Item[]> ContiguousSplit(IEnumerable<Item> source) {
List<Item> current = new List<Item>();
foreach (var item in source) {
if (current.Count > 0 && current[current.Count - 1].Stop != item.Start) {
yield return current.ToArray();
current.Clear();
}
current.Add(item);
}
if (current.Count > 0)
yield return current.ToArray();
}
Then, if you want materialization:
List<Item[]> GetContiguousSequences(Item []items) => ContiguousSplit(items).ToList();
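A quick check against the sample data from the question (a sketch; the Start/Stop values are the pairs shown there):
var items = new[]
{
    new Item { Start = 1,  Stop = 10 },
    new Item { Start = 10, Stop = 11 },
    new Item { Start = 11, Stop = 20 },
    new Item { Start = 25, Stop = 30 },
    new Item { Start = 31, Stop = 40 },
    new Item { Start = 40, Stop = 45 },
    new Item { Start = 45, Stop = 100 },
};
List<Item[]> sequences = GetContiguousSequences(items);
// sequences[0]: [1,10] [10,11] [11,20]
// sequences[1]: [25,30]
// sequences[2]: [31,40] [40,45] [45,100]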
Your solution is okay. I don't think that LINQ adds any simplification or clarity in this situation. Here is a fast solution that I find intuitive:
static List<Item[]> GetContiguousSequences(Item[] items)
{
var result = new List<Item[]>();
int start = 0;
while (start < items.Length) {
int end = start + 1;
while (end < items.Length && items[end].Start == items[end - 1].Stop) {
end++;
}
int len = end - start;
var a = new Item[len];
Array.Copy(items, start, a, 0, len);
result.Add(a);
start = end;
}
return result;
}

How to find the placement of a List within another List?

I am working with two lists. The first contains a large sequence of strings. The second contains a smaller list of strings. I need to find where the second list exists in the first list.
I worked with enumeration, but due to the large size of the data this is very slow, so I was hoping for a faster way.
List<string> first = new List<string>() { "AAA","BBB","CCC","DDD","EEE","FFF" };
List<string> second = new List<string>() { "CCC","DDD","EEE" };
int x = SomeMagic(first,second);
And I would need x to = 2.
OK, here is my variant with a good old foreach loop:
private int SomeMagic(IEnumerable<string> source, IEnumerable<string> target)
{
/* Some obvious checks for `source` and `target` length / nullity are omitted */
// searched pattern
var pattern = target.ToArray();
// candidates in form `candidate index` -> `checked length`
var candidates = new Dictionary<int, int>();
// iteration index
var index = 0;
// so, let the magic begin
foreach (var value in source)
{
// check candidates
foreach (var candidate in candidates.Keys.ToArray()) // <- we are going to change this collection
{
var checkedLength = candidates[candidate];
if (value == pattern[checkedLength]) // <- here `checkedLength` is used in the sense of `nextPositionToCheck`
{
// candidate has matched the next value
checkedLength += 1;
// check if we are done here
if (checkedLength == pattern.Length) return candidate; // <- exit point
candidates[candidate] = checkedLength;
}
else
// candidate has failed
candidates.Remove(candidate);
}
// check for new candidate
if (value == pattern[0])
candidates.Add(index, 1);
index++;
}
// we did everything we could
return -1;
}
We use dictionary of candidates to handle situations like:
var first = new List<string> { "AAA","BBB","CCC","CCC","CCC","CCC","EEE","FFF" };
var second = new List<string> { "CCC","CCC","CCC","EEE" };
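Calling it on exactly that tricky input shows why the multiple candidates matter: candidates open at indices 2, 3, 4 and 5, the one starting at 2 dies when pattern position 3 wants "EEE", and the one starting at 3 completes:
var start = SomeMagic(first, second);
// start == 3  ("CCC","CCC","CCC","EEE" begins at first[3])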
If you are willing to use MoreLinq then consider using Window:
var windows = first.Window(second.Count);
var result = windows
.Select((subset, index) => new { subset, index = (int?)index })
.Where(z => Enumerable.SequenceEqual(second, z.subset))
.Select(z => z.index)
.FirstOrDefault();
Console.WriteLine(result);
Console.ReadLine();
Window will allow you to look at 'slices' of the data in chunks (based on the length of your second list). Then SequenceEqual can be used to see if the slice is equal to second. If it is, the index can be returned. If it doesn't find a match, null will be returned.
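For the sample lists from the question, Window(3) produces these slices (shown here as comments; assumes the MoreLinq package is referenced):
// first.Window(3) yields:
//   index 0: { "AAA", "BBB", "CCC" }
//   index 1: { "BBB", "CCC", "DDD" }
//   index 2: { "CCC", "DDD", "EEE" }  <- SequenceEqual with second, so result == 2
//   index 3: { "DDD", "EEE", "FFF" }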
I implemented the SomeMagic method as below; it will return -1 if no match is found, otherwise the index of the start element in the first list.
private int SomeMagic(List<string> first, List<string> second)
{
if (first.Count < second.Count)
{
return -1;
}
for (int i = 0; i <= first.Count - second.Count; i++)
{
List<string> partialFirst = first.GetRange(i, second.Count);
if (Enumerable.SequenceEqual(partialFirst, second))
return i;
}
return -1;
}
You can use the Intersect extension method from the System.Linq namespace:
var commonList = first.Intersect(second);
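Note, though, that Intersect only tells you which elements the two lists have in common, not where the second list starts inside the first, so by itself it can't produce the x = 2 the question asks for:
// commonList: { "CCC", "DDD", "EEE" } -- shared elements only, no positional information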

Find values that appear in all lists (or arrays or collections)

Given the following:
List<List<int>> lists = new List<List<int>>();
lists.Add(new List<int>() { 1,2,3,4,5,6,7 });
lists.Add(new List<int>() { 1,2 });
lists.Add(new List<int>() { 1,2,3,4 });
lists.Add(new List<int>() { 1,2,5,6,7 });
What is the best/fastest way of identifying which numbers appear in all lists?
You can use the .NET 3.5 Intersect() extension method:
List<int> a = new List<int>() { 1, 2, 3, 4, 5 };
List<int> b = new List<int>() { 0, 4, 8, 12 };
List<int> common = a.Intersect(b).ToList();
To do it for two lists one would use x.Intersect(y).
To do it for several we would want to do something like:
var intersection = lists.Aggregate((x, y) => x.Intersect(y));
But this won't work because the result of the lambda isn't List<int> and so it can't be fed back in. This might tempt us to try:
var intersection = lists.Aggregate((x, y) => x.Intersect(y).ToList());
But then this makes n-1 needless calls to ToList() which is relatively expensive. We can get around this with:
var intersection = lists.Aggregate(
(IEnumerable<int> x, IEnumerable<int> y) => x.Intersect(y));
Which applies the same logic, but in using explicit types in the lambda, we can feed the result of Intersect() back in without wasting time and memory creating a list each time, and so gives faster results.
If this came up a lot we can get further (slight) performance improvements by rolling our own rather than using Linq:
public static IEnumerable<T> IntersectAll<T>(this IEnumerable<IEnumerable<T>> source)
{
using(var en = source.GetEnumerator())
{
if(!en.MoveNext()) return Enumerable.Empty<T>();
var set = new HashSet<T>(en.Current);
while(en.MoveNext())
{
var newSet = new HashSet<T>();
foreach(T item in en.Current)
if(set.Remove(item))
newSet.Add(item);
set = newSet;
}
return set;
}
}
This assumes it's for internal use only. If it could be called from another assembly it should have error checks, and perhaps should be defined so as to only perform the intersect operations on the first MoveNext() of the calling code:
public static IEnumerable<T> IntersectAll<T>(this IEnumerable<IEnumerable<T>> source)
{
if(source == null)
throw new ArgumentNullException("source");
return IntersectAllIterator(source);
}
public static IEnumerable<T> IntersectAllIterator<T>(IEnumerable<IEnumerable<T>> source)
{
using(var en = source.GetEnumerator())
{
if(en.MoveNext())
{
var set = new HashSet<T>(en.Current);
while(en.MoveNext())
{
var newSet = new HashSet<T>();
foreach(T item in en.Current)
if(set.Remove(item))
newSet.Add(item);
set = newSet;
}
foreach(T item in set)
yield return item;
}
}
}
(In these final two versions there's an opportunity to short-circuit if we end up emptying the set, but it only pays off if this happens relatively often, otherwise it's a net loss).
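A sketch of that short-circuit, for what it's worth (a hypothetical addition, not part of the code above):
// in the iterator version, immediately after `set = newSet;`:
if (set.Count == 0)
    yield break; // nothing can be common to all lists any more, so stop enumerating the rest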
Conversely, if these aren't concerns, and if we know that we're only ever going to want to do this with lists, we can optimise a bit further with the use of Count and indices:
public static IEnumerable<T> IntersectAll<T>(this List<List<T>> source)
{
if (source.Count == 0) return Enumerable.Empty<T>();
if (source.Count == 1) return source[0];
var set = new HashSet<T>(source[0]);
for(int i = 1; i != source.Count; ++i)
{
var newSet = new HashSet<T>();
var list = source[i];
for(int j = 0; j != list.Count; ++j)
{
T item = list[j];
if(set.Remove(item))
newSet.Add(item);
}
set = newSet;
}
return set;
}
And further if we know we're always going to want the results in a list, and we know that either we won't mutate the list, or it won't matter if the input list got mutated, we can optimise for the case of there being zero or one lists (but this costs more if we might ever not need the output in a list):
public static List<T> IntersectAll<T>(this List<List<T>> source)
{
if (source.Count == 0) return new List<T>(0);
if (source.Count == 1) return source[0];
var set = new HashSet<T>(source[0]);
for(int i = 1; i != source.Count; ++i)
{
var newSet = new HashSet<T>();
var list = source[i];
for(int j = 0; j != list.Count; ++j)
{
T item = list[j];
if(set.Remove(item))
newSet.Add(item);
}
set = newSet;
}
return new List<T>(set);
}
Again though, as well as making the method less widely applicable, this has risks in terms of how it could be used, so it is only appropriate for internal code where you can know either that you won't change the input or the output after the fact, or that this won't matter.
Linq already offers Intersect and you can exploit Aggregate as well:
var result = lists.Aggregate((a, b) => a.Intersect(b).ToList());
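For the lists at the top of the question this folds pairwise from left to right and ends with the values common to all four:
// result: { 1, 2 }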
If you don't trust the Intersect method or you just prefer to see what's going on, here's a snippet of code that should do the trick:
// Output goes here
List<int> output = new List<int>();
// Make sure lists are sorted
for (int i = 0; i < lists.Count; ++i) lists[i].Sort();
// Maintain array of indices so we can step through all the lists in parallel
int[] index = new int[lists.Count];
while(index[0] < lists[0].Count)
{
// Search for each value in the first list
int value = lists[0][index[0]];
// No. lists that value appears in, we want this to equal lists.Count
int count = 1;
// Search all the other lists for the value
for (int i = 1; i < lists.Count; ++i)
{
while (index[i] < lists[i].Count)
{
// Stop if we've passed the spot where value would have been
if (lists[i][index[i]] > value) break;
// Stop if we find value
if (lists[i][index[i]] == value)
{
++count;
break;
}
++index[i];
}
// If we reach the end of any list there can't be any more matches so end the search now
if (index[i] >= lists[i].Count) goto done;
}
// Store the value if we found it in all the lists
if (count == lists.Count) output.Add(value);
// Skip multiple occurrences of the same value
while (index[0] < lists[0].Count && lists[0][index[0]] == value) ++index[0];
}
done: ;
Edit:
I got bored and did some benchmarks on this vs. Jon Hanna's version. His is consistently faster, typically by around 50%. Mine wins by about the same margin if you happen to have presorted lists, though. Also you can gain a further 20% or so with unsafe optimisations. Just thought I'd share that.
You can also get it with SelectMany and Distinct:
List<int> result = lists
.SelectMany(x => x.Where(e => lists.All(l => l.Contains(e))))
.Distinct().ToList();
Edit:
List<int> result2 = lists.First().Where(e => lists.Skip(1).All(l => l.Contains(e)))
.ToList();
Edit 2:
List<int> result3 = lists
.Select(l => l.OrderBy(n => n).Take(lists.Min(x => x.Count()))).First()
.TakeWhile((n, index) => lists.Select(l => l.OrderBy(x => x)).Skip(1).All(l => l.ElementAt(index) == n))
.ToList();

Get previous and next item in a IEnumerable using LINQ

I have an IEnumerable of a custom type (that I've gotten from a SelectMany).
I also have an item (myItem) in that IEnumerable, and I want the previous and next items from the IEnumerable relative to it.
Currently, I'm doing the desired like this:
var previousItem = myIEnumerable.Reverse().SkipWhile(
i => i.UniqueObjectID != myItem.UniqueObjectID).Skip(1).FirstOrDefault();
I can get the next item by simply omitting the .Reverse.
or, I could:
int index = myIEnumerable.ToList().FindIndex(
i => i.UniqueObjectID == myItem.UniqueObjectID)
and then use .ElementAt(index +/- 1) to get the previous or next item.
Which is better between the two options?
Is there an even better option available?
"Better" includes a combination of performance (memory and speed) and readability; with readability being my primary concern.
First off
"Better" includes a combination of performance (memory and speed)
In general you can't have both, the rule of thumb is, if you optimise for speed, it'll cost memory, if you optimise for memory, it'll cost you speed.
There is a better option that performs well on both the memory and speed fronts, and can be used in a readable manner (I'm not delighted with the function name, however; FindItemReturningPreviousItemFoundItemAndNextItem is a bit of a mouthful).
So, it looks like it's time for a custom find extension method, something like...
public static IEnumerable<T> FindSandwichedItem<T>(this IEnumerable<T> items, Predicate<T> matchFilling)
{
if (items == null)
throw new ArgumentNullException("items");
if (matchFilling == null)
throw new ArgumentNullException("matchFilling");
return FindSandwichedItemImpl(items, matchFilling);
}
private static IEnumerable<T> FindSandwichedItemImpl<T>(IEnumerable<T> items, Predicate<T> matchFilling)
{
using(var iter = items.GetEnumerator())
{
T previous = default(T);
while(iter.MoveNext())
{
if(matchFilling(iter.Current))
{
yield return previous;
yield return iter.Current;
if (iter.MoveNext())
yield return iter.Current;
else
yield return default(T);
yield break;
}
previous = iter.Current;
}
}
// If we get here nothing has been found so return three default values
yield return default(T); // Previous
yield return default(T); // Current
yield return default(T); // Next
}
You can cache the result of this to a list if you need to refer to the items more than once, but it returns the found item, preceded by the previous item, followed by the following item. e.g.
var sandwichedItems = myIEnumerable.FindSandwichedItem(item => item.objectId == "MyObjectId").ToList();
var previousItem = sandwichedItems[0];
var myItem = sandwichedItems[1];
var nextItem = sandwichedItems[2];
The defaults to return if it's the first or last item may need to change depending on your requirements.
Hope this helps.
For readability, I'd load the IEnumerable into a linked list:
var e = Enumerable.Range(0,100);
var itemIKnow = 50;
var linkedList = new LinkedList<int>(e);
var listNode = linkedList.Find(itemIKnow);
var next = listNode.Next.Value; //probably a good idea to check for null
var prev = listNode.Previous.Value; //ditto
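One caveat, as a sketch (assuming a LinkedList of the question's custom type): LinkedList<T>.Find uses the default equality comparer, so if the element has to be located by UniqueObjectID rather than by value, walk the nodes instead:
var node = linkedList.First;
while (node != null && node.Value.UniqueObjectID != myItem.UniqueObjectID)
    node = node.Next;
var prevItem = node?.Previous?.Value; // null when the match is the first element (or not found)
var nextItem = node?.Next?.Value;     // null when the match is the last element (or not found)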
By creating an extension method for establishing context to the current element you can use a Linq query like this:
var result = myIEnumerable.WithContext()
.Single(i => i.Current.UniqueObjectID == myItem.UniqueObjectID);
var previous = result.Previous;
var next = result.Next;
The extension would be something like this:
public class ElementWithContext<T>
{
public T Previous { get; private set; }
public T Next { get; private set; }
public T Current { get; private set; }
public ElementWithContext(T current, T previous, T next)
{
Current = current;
Previous = previous;
Next = next;
}
}
public static class LinqExtensions
{
public static IEnumerable<ElementWithContext<T>>
WithContext<T>(this IEnumerable<T> source)
{
T previous = default(T);
T current = source.FirstOrDefault();
foreach (T next in source.Concat(new[] { default(T) }).Skip(1)) // Concat rather than Union, so duplicates and default values aren't silently dropped
{
yield return new ElementWithContext<T>(current, previous, next);
previous = current;
current = next;
}
}
}
You could cache the enumerable in a list
var myList = myIEnumerable.ToList()
iterate over it by index
for (int i = 0; i < myList.Count; i++)
then the current element is myList[i], the previous element is myList[i-1], and the next element is myList[i+1]
(Don't forget about the special cases of the first and last elements in the list.)
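A short sketch of that boundary handling, matching on UniqueObjectID as in the question (and assuming the element type is a class, so null can mean "no neighbour"):
int i = myList.FindIndex(x => x.UniqueObjectID == myItem.UniqueObjectID);
var previousItem = i > 0 ? myList[i - 1] : null;                      // the first element has no previous
var nextItem = i >= 0 && i < myList.Count - 1 ? myList[i + 1] : null; // the last element has no next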
You are really overcomplicating things.
Sometimes a plain for loop is simply the better tool, and I think it gives a clearer implementation of what you are trying to do:
var myList = myIEnumerable.ToList();
for (int i = 0; i < myList.Count; i++)
{
    if (myList[i].UniqueObjectID == myItem.UniqueObjectID)
    {
        previousItem = myList[(i - 1 + myList.Count) % myList.Count]; // wraps to the last item when i == 0
        nextItem = myList[(i + 1) % myList.Count];                    // wraps to the first item at the end
    }
}
Here is a LINQ extension method that returns the current item, along with the previous and the next. It yields ValueTuple<T, T, T> values to avoid allocations. The source is enumerated once.
/// <summary>
/// Projects each element of a sequence into a tuple that includes the previous
/// and the next element.
/// </summary>
public static IEnumerable<(T Previous, T Current, T Next)> WithPreviousAndNext<T>(
this IEnumerable<T> source, T firstPrevious = default, T lastNext = default)
{
ArgumentNullException.ThrowIfNull(source);
(T Previous, T Current, bool HasPrevious) queue = (default, firstPrevious, false);
foreach (var item in source)
{
if (queue.HasPrevious)
yield return (queue.Previous, queue.Current, item);
queue = (queue.Current, item, true);
}
if (queue.HasPrevious)
yield return (queue.Previous, queue.Current, lastNext);
}
Usage example:
var source = Enumerable.Range(1, 5);
Console.WriteLine($"Source: {String.Join(", ", source)}");
var result = source.WithPreviousAndNext(firstPrevious: -1, lastNext: -1);
Console.WriteLine($"Result: {String.Join(", ", result)}");
Output:
Source: 1, 2, 3, 4, 5
Result: (-1, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, -1)
To get the previous and the next of a specific item, you could use tuple deconstruction:
var (previous, current, next) = myIEnumerable
.WithPreviousAndNext()
.First(e => e.Current.UniqueObjectID == myItem.UniqueObjectID);
CPU
Depends entirely on where the object is in the sequence. If it is located at the end I would expect the second to be faster, by more than a factor of 2 (but only a constant factor). If it is located at the beginning the first will be faster because you don't traverse the whole list.
Memory
The first is iterating the sequence without saving the sequence so the memory hit will be very small. The second solution will take as much memory as the length of the list * references + objects + overhead.
I thought I would try to answer this using Zip from Linq.
string[] items = {"nought","one","two","three","four"};
var item = items[2];
var sandwiched =
items
.Zip( items.Skip(1), (previous,current) => new { previous, current } )
.Zip( items.Skip(2), (pair,next) => new { pair.previous, pair.current, next } )
.FirstOrDefault( triplet => triplet.current == item );
This will return an anonymous type {previous, current, next}.
Unfortunately this will only work for indexes 1,2 and 3.
string[] items = {"nought","one","two","three","four"};
var item = items[4];
var pad1 = Enumerable.Repeat( "", 1 );
var pad2 = Enumerable.Repeat( "", 2 );
var padded = pad1.Concat( items );
var next1 = items.Concat( pad1 );
var next2 = items.Skip(1).Concat( pad2 );
var sandwiched =
padded
.Zip( next1, (previous,current) => new { previous, current } )
.Zip( next2, (pair,next) => new { pair.previous, pair.current, next } )
.FirstOrDefault( triplet => triplet.current == item );
This version will work for all indexes.
Both versions use lazy evaluation courtesy of Linq.
Here are some extension methods as promised. The names are generic and reusable with any type, and there are lookup overloads to get at the item whose next or previous neighbour you need. I would benchmark the solutions and then see where you could squeeze cycles out.
public static class ExtensionMethods
{
public static T Previous<T>(this List<T> list, T item) {
var index = list.IndexOf(item) - 1;
return index > -1 ? list[index] : default(T);
}
public static T Next<T>(this List<T> list, T item) {
var index = list.IndexOf(item) + 1;
return index < list.Count() ? list[index] : default(T);
}
public static T Previous<T>(this List<T> list, Func<T, Boolean> lookup) {
var item = list.SingleOrDefault(lookup);
var index = list.IndexOf(item) - 1;
return index > -1 ? list[index] : default(T);
}
public static T Next<T>(this List<T> list, Func<T,Boolean> lookup) {
var item = list.SingleOrDefault(lookup);
var index = list.IndexOf(item) + 1;
return index < list.Count() ? list[index] : default(T);
}
public static T PreviousOrFirst<T>(this List<T> list, T item) {
if(list.Count() < 1)
throw new Exception("No array items!");
var previous = list.Previous(item);
return previous == null ? list.First() : previous;
}
public static T NextOrLast<T>(this List<T> list, T item) {
if(list.Count() < 1)
throw new Exception("No array items!");
var next = list.Next(item);
return next == null ? list.Last() : next;
}
public static T PreviousOrFirst<T>(this List<T> list, Func<T,Boolean> lookup) {
if(list.Count() < 1)
throw new Exception("No array items!");
var previous = list.Previous(lookup);
return previous == null ? list.First() : previous;
}
public static T NextOrLast<T>(this List<T> list, Func<T,Boolean> lookup) {
if(list.Count() < 1)
throw new Exception("No array items!");
var next = list.Next(lookup);
return next == null ? list.Last() : next;
}
}
And you can use them like this.
var previous = list.Previous(obj);
var next = list.Next(obj);
var previousWithLookup = list.Previous((o) => o.LookupProperty == otherObj.LookupProperty);
var nextWithLookup = list.Next((o) => o.LookupProperty == otherObj.LookupProperty);
var previousOrFirst = list.PreviousOrFirst(obj);
var nextOrLast = list.NextOrLast(ob);
var previousOrFirstWithLookup = list.PreviousOrFirst((o) => o.LookupProperty == otherObj.LookupProperty);
var nextOrLastWithLookup = list.NextOrLast((o) => o.LookupProperty == otherObj.LookupProperty);
I use the following technique:
var items = new[] { "Bob", "Jon", "Zac" };
var sandwiches = items
.Sandwich()
.ToList();
Which produces this result:
// (null,  "Bob", "Jon")
// ("Bob", "Jon", "Zac")
// ("Jon", "Zac", null)
Notice that there are nulls for the first Previous value, and the last Next value.
It uses the following extension method:
public static IEnumerable<(T Previous, T Current, T Next)> Sandwich<T>(this IEnumerable<T> source, T beforeFirst = default, T afterLast = default)
{
var sourceList = source.ToList();
T previous = beforeFirst;
T current = sourceList.FirstOrDefault();
foreach (var next in sourceList.Skip(1))
{
yield return (previous, current, next);
previous = current;
current = next;
}
yield return (previous, current, afterLast);
}
If you need it for every element in myIEnumerable, I'd just iterate through it keeping references to the two previous elements. In the body of the loop I'd do the processing for the second-previous element; the current element would be its successor and the first-previous element its predecessor.
If you need it for only one element I'd choose your first approach.
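A rough sketch of the every-element variant described above (MyType and Process are hypothetical stand-ins for the custom element type and whatever work needs doing):
MyType prev2 = null, prev1 = null;
foreach (var current in myIEnumerable)
{
    if (prev1 != null)
        Process(prev2, prev1, current); // prev1 is handled here, with prev2 before it and current after it
    prev2 = prev1;
    prev1 = current;
}
if (prev1 != null)
    Process(prev2, prev1, null); // the last element has no successor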

How do I compare items from a list to all others without repetition?

I have a collection of objects (lets call them MyItem) and each MyItem has a method called IsCompatibleWith which returns a boolean saying whether it's compatible with another MyItem.
public class MyItem
{
...
public bool IsCompatibleWith(MyItem other) { ... }
...
}
A.IsCompatibleWith(B) will always be the same as B.IsCompatibleWith(A). If for example I have a collection containing 4 of these, I am trying to find a LINQ query that will run the method on each distinct pair of items in the same collection. So if my collection contains A, B, C and D I wish to do the equivalent of:
A.IsCompatibleWith(B); // A & B
A.IsCompatibleWith(C); // A & C
A.IsCompatibleWith(D); // A & D
B.IsCompatibleWith(C); // B & C
B.IsCompatibleWith(D); // B & D
C.IsCompatibleWith(D); // C & D
The code initially used was:
var result = from item in myItems
from other in myItems
where item != other &&
item.IsCompatibleWith(other)
select item;
but of course this will still do both A & B and B & A (which is not required and not efficient). Also it's probably worth noting that in reality these lists will be a lot bigger than 4 items, hence the desire for an optimal solution.
Hopefully this makes sense... any ideas?
Edit:
One possible query -
MyItem[] items = myItems.ToArray();
bool compatible = (from item in items
from other in items
where
Array.IndexOf(items, item) < Array.IndexOf(items, other) &&
!item.IsCompatibleWith(other)
select item).FirstOrDefault() == null;
Edit 2: In the end I switched to using the custom solution from LukeH, as it was more efficient for bigger lists.
public bool AreAllCompatible()
{
using (var e = myItems.GetEnumerator())
{
var buffer = new List<MyItem>();
while (e.MoveNext())
{
if (buffer.Any(item => !item.IsCompatibleWith(e.Current)))
return false;
buffer.Add(e.Current);
}
}
return true;
}
Edit...
Judging by the "final query" added to your question, you need a method to determine if all the items in the collection are compatible with each other. Here's how to do it reasonably efficiently:
bool compatible = myItems.AreAllItemsCompatible();
// ...
public static bool AreAllItemsCompatible(this IEnumerable<MyItem> source)
{
using (var e = source.GetEnumerator())
{
var buffer = new List<MyItem>();
while (e.MoveNext())
{
foreach (MyItem item in buffer)
{
if (!item.IsCompatibleWith(e.Current))
return false;
}
buffer.Add(e.Current);
}
}
return true;
}
Original Answer...
I don't think there's an efficient way to do this using only the built-in LINQ methods.
It's easy enough to build your own though. Here's an example of the sort of code you'll need. I'm not sure exactly what results you're trying to return so I'm just writing a message to the console for each compatible pair. It should be easy enough to change it to yield the results that you need.
using (var e = myItems.GetEnumerator())
{
var buffer = new List<MyItem>();
while (e.MoveNext())
{
foreach (MyItem item in buffer)
{
if (item.IsCompatibleWith(e.Current))
{
Console.WriteLine(item + " is compatible with " + e.Current);
}
}
buffer.Add(e.Current);
}
}
(Note that although this is reasonably efficient, it does not preserve the original ordering of the collection. Is that an issue in your situation?)
This should do it:
var result = from item in myItems
from other in myItems
where item != other &&
myItems.IndexOf(item) < myItems.IndexOf(other) &&
item.IsCompatibleWith(other)
select item;
But I don't know whether it actually makes it faster, because the query has to look up the indices of the rows for each row.
Edit:
If you have an index property on MyItem you should use that one instead of IndexOf. And you can remove the "item != other" from the where clause; it's a little redundant now.
Here's an idea:
Implement IComparable so that your MyItem becomes sortable, then run this LINQ query:
var result = from item in myItems
from other in myItems
where item.CompareTo(other) < 0 &&
item.IsCompatibleWith(other)
select item;
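A minimal sketch of what that might look like, assuming MyItem gets some unique, orderable key to sort by (the Id property here is hypothetical):
public class MyItem : IComparable<MyItem>
{
    public int Id { get; set; } // hypothetical unique ordering key

    public int CompareTo(MyItem other) => other == null ? 1 : Id.CompareTo(other.Id);

    public bool IsCompatibleWith(MyItem other) { /* domain-specific check */ return true; }
}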
If your MyItem collection is small enough, you can store the results of item.IsCompatibleWith(otherItem) in a boolean array:
var itemCount = myItems.Count();
var compatibilityTable = new bool[itemCount, itemCount];
var itemsToCompare = new List<MyItem>();
var i = 0;
var j = 0;
foreach (var item in myItems)
{
    compatibilityTable[i, i] = true; // an item counts as compatible with itself
    j = 0;
    foreach (var other in itemsToCompare)
    {
        compatibilityTable[i, j] = item.IsCompatibleWith(other);
        compatibilityTable[j, i] = compatibilityTable[i, j];
        j++;
    }
    itemsToCompare.Add(item);
    i++;
}
var result = myItems.Where((item, index) =>
{
    var compatible = true;
    var k = 0;
    while (compatible && k < itemCount)
    {
        compatible = compatibilityTable[index, k];
        k++; // advance inside the loop, otherwise it never terminates
    }
    return compatible;
});
So, we have
IEnumerable<MyItem> MyItems;
To get all the combinations we could use a function like this.
//returns all the k sized combinations from a list
public static IEnumerable<IEnumerable<T>> Combinations<T>(IEnumerable<T> list,
int k)
{
if (k == 0) return new[] {new T[0]};
return list.SelectMany((l, i) =>
Combinations(list.Skip(i + 1), k - 1).Select(c => (new[] {l}).Concat(c))
);
}
We can then apply this function to our problem like this.
var combinations = Combinations(MyItems, 2).Select(c => c.ToList<MyItem>());
var result = combinations.Where(c => c[0].IsCompatibleWith(c[1]));
This will perform IsCompatibleWith on all the combinations without repetition.
You could of course perform the checking inside the Combinations function. For further work you could make the Combinations function into an extension that takes a delegate, with a variable number of parameters for several lengths of k.
EDIT: As I suggested above, if you wrote these extension methods
public static class Extensions
{
    public static IEnumerable<IEnumerable<T>> Combinations<T>(this IEnumerable<T> list, int k)
    {
        if (k == 0) return new[] { new T[0] };
        return list.SelectMany((l, i) =>
            list.Skip(i + 1).Combinations<T>(k - 1)
                .Select(c => (new[] { l }).Concat(c)));
    }
    public static IEnumerable<Tuple<T, T>> Combinations<T>(this IEnumerable<T> list,
        Func<T, T, bool> filter)
    {
        return list.Combinations(2).Where(c =>
            filter(c.First(), c.Last())).Select(c =>
            Tuple.Create<T, T>(c.First(), c.Last()));
    }
}
Then in your code you could do the rather more elegant (IMO)
var compatibleTuples = myItems.Combinations((a, b) => a.IsCompatibleWith(b));
then get at the compatible items with
foreach(var t in compatibleTuples)
{
t.Item1 // or t.Item2
}
