I have an IEnumerable<T> and an IEnumerable<U> that I want merged into an IEnumerable<KeyValuePair<T,U>> where the indexes of the elements joined together in the KeyValuePair are the same. Note I'm not using IList, so I don't have a count or an index for the items I'm merging. How best can I accomplish this? I would prefer a LINQ answer, but anything that gets the job done in an elegant fashion would work as well.
Note: As of .NET 4.0, the framework includes a .Zip extension method on IEnumerable, documented here. The following is maintained for posterity and for use with .NET Framework versions earlier than 4.0.
I use these extension methods:
// From http://community.bartdesmet.net/blogs/bart/archive/2008/11/03/c-4-0-feature-focus-part-3-intermezzo-linq-s-new-zip-operator.aspx
public static IEnumerable<TResult> Zip<TFirst, TSecond, TResult>(this IEnumerable<TFirst> first, IEnumerable<TSecond> second, Func<TFirst, TSecond, TResult> func) {
if (first == null)
throw new ArgumentNullException("first");
if (second == null)
throw new ArgumentNullException("second");
if (func == null)
throw new ArgumentNullException("func");
using (var ie1 = first.GetEnumerator())
using (var ie2 = second.GetEnumerator())
while (ie1.MoveNext() && ie2.MoveNext())
yield return func(ie1.Current, ie2.Current);
}
public static IEnumerable<KeyValuePair<T, R>> Zip<T, R>(this IEnumerable<T> first, IEnumerable<R> second) {
return first.Zip(second, (f, s) => new KeyValuePair<T, R>(f, s));
}
EDIT: after the comments I'm obliged to clarify and fix some things:
I originally took the first Zip implementation verbatim from Bart De Smet's blog
Added enumerator disposing (which was also noted on Bart's original post)
Added null parameter checking (also discussed in Bart's post)
As an update for anyone stumbling across this question: .NET 4.0 supports this natively. Example from Microsoft:
int[] numbers = { 1, 2, 3, 4 };
string[] words = { "one", "two", "three" };
var numbersAndWords = numbers.Zip(words, (first, second) => first + " " + second);
Documentation:
The method merges each element of the first sequence with an element that has the same index in the second sequence. If the sequences do not have the same number of elements, the method merges sequences until it reaches the end of one of them. For example, if one sequence has three elements and the other one has four, the result sequence will have only three elements.
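As a rough sketch of how the built-in Zip answers the original question, the selector can build the KeyValuePair directly. The class and method names below (ZipToPairsDemo, Run) are made up for illustration; only Enumerable.Zip itself comes from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipToPairsDemo
{
    // Pairs each number with the word at the same position.
    public static List<KeyValuePair<int, string>> Run()
    {
        int[] numbers = { 1, 2, 3, 4 };
        string[] words = { "one", "two", "three" };

        // Zip stops as soon as either sequence runs out, so the 4 is dropped.
        return numbers
            .Zip(words, (n, w) => new KeyValuePair<int, string>(n, w))
            .ToList();
    }
}
```

Calling ZipToPairsDemo.Run() should give three pairs: (1, one), (2, two), (3, three).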
Think about what you're asking a bit more closely here:
You want to combine two IEnumerables in which "the indexes of the elements joined together in the KeyValuePair are the same", but you "don't have a count or an index for the items I'm merging".
There's no guarantee your IEnumerables are even sorted or unsorted. There's no correlation between your two IEnumerable objects, so how can you expect to correlate them?
Look at nextension:
Currently Implemented Methods
IEnumerable
ForEach Performs a specified action on each element of the IEnumerable.
Clump Groups items into same size lots.
Scan Creates a list by applying a delegate to pairs of items in the IEnumerable.
AtLeast Checks there are at least a certain amount of items in the IEnumerable.
AtMost Checks there are no more than a certain amount of items in the IEnumerable.
Zip Creates a list by combining two other lists into one.
Cycle Creates a list by repeating another list.
I would use something along the lines of -
IEnumerable<KeyValuePair<T,U>> Merge<T,U>(IEnumerable<T> keyCollection, IEnumerable<U> valueCollection)
{
var keys = keyCollection.GetEnumerator();
var values = valueCollection.GetEnumerator();
try
{
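// No Reset() calls: a freshly obtained enumerator already sits before the first element, and compiler-generated iterators throw NotSupportedException from Reset().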
while (keys.MoveNext() && values.MoveNext())
{
yield return new KeyValuePair<T,U>(keys.Current,values.Current);
}
}
finally
{
keys.Dispose();
values.Dispose();
}
}
This should work correctly, and clean up properly afterwards.
Untested, but should work:
IEnumerable<KeyValuePair<T, U>> Zip<T, U>(IEnumerable<T> t, IEnumerable<U> u) {
    // Dispose both enumerators when iteration finishes or is abandoned.
    using (IEnumerator<T> et = t.GetEnumerator())
    using (IEnumerator<U> eu = u.GetEnumerator()) {
        for (;;) {
            bool bt = et.MoveNext();
            bool bu = eu.MoveNext();
            // One sequence ended before the other: the lengths differ.
            if (bt != bu)
                throw new ArgumentException("Different number of elements in t and u");
            if (!bt)
                break;
            yield return new KeyValuePair<T, U>(et.Current, eu.Current);
        }
    }
}
You could use the Zip methods in MoreLINQ.
MSDN has the following Custom Sequence Operators example. And Welbog is right; if you have no index on the underlying data, you have no guarantee that the operation does what you expect.
Another implementation from the functional-dotnet project by Alexey Romanov:
/// <summary>
/// Takes two sequences and returns a sequence of corresponding pairs.
/// If one sequence is short, excess elements of the longer sequence are discarded.
/// </summary>
/// <typeparam name="T1">The type of the elements of the first sequence.</typeparam>
/// <typeparam name="T2">The type of the elements of the second sequence.</typeparam>
/// <param name="sequence1">The first sequence.</param>
/// <param name="sequence2">The second sequence.</param>
/// <returns></returns>
public static IEnumerable<Tuple<T1, T2>> Zip<T1, T2>(
this IEnumerable<T1> sequence1, IEnumerable<T2> sequence2) {
using (
IEnumerator<T1> enumerator1 = sequence1.GetEnumerator())
using (
IEnumerator<T2> enumerator2 = sequence2.GetEnumerator()) {
while (enumerator1.MoveNext() && enumerator2.MoveNext()) {
yield return
Pair.New(enumerator1.Current, enumerator2.Current);
}
}
//
//zip :: [a] -> [b] -> [(a,b)]
//zip (a:as) (b:bs) = (a,b) : zip as bs
//zip _ _ = []
}
Replace Pair.New with new KeyValuePair<T1, T2> (and the return type) and you're good to go.
JaredPar has a library with a lot of useful stuff in it, including Zip, which will enable what you want to do.
Related
What is the difference between Contains and Any in LINQ?
Contains takes an object, Any takes a predicate.
You use Contains like this:
listOfInts.Contains(1);
and Any like this:
listOfInts.Any(i => i == 1);
listOfInts.Any(i => i % 2 == 0); // Check if any element is an Even Number
So if you want to check for a specific condition, use Any. If you want to check for the existence of an element, use Contains.
MSDN for Contains, Any
Contains checks if the sequence contains a specified element.
Enumerable.Any checks whether any element of a sequence satisfies a condition.
Consider the following example:
List<int> list = new List<int> { 1, 2, 3, 4, 5 };
bool contains = list.Contains(1); //true
bool condition = list.Any(r => r > 2 && r < 5); //true (3 and 4 match)
Contains cares about whether the source collection is an ICollection, Any does not.
Enumerable.Contains
http://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs#f60bab4c5e27a849
public static bool Contains<TSource>(this IEnumerable<TSource> source, TSource value)
{
ICollection<TSource> collection = source as ICollection<TSource>;
if (collection != null)
{
return collection.Contains(value);
}
return source.Contains<TSource>(value, null);
}
Enumerable.Any
http://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs#6a1af7c3d17845e3
public static bool Any<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)
{
foreach (TSource local in source)
{
if (predicate(local))
{
return true;
}
}
return false;
}
Another difference, as mentioned here, is performance:
Contains is O(n) for a List and O(1) for a HashSet
Any is simply O(n)
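One way to see the ICollection shortcut in action is a small sketch like the following, assuming a HashSet built with a case-insensitive comparer. The class and method names are invented for the example. Enumerable.Contains hands the lookup to the set (and so uses the set's own comparer and hashing), while Any just runs the predicate over every element.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContainsDelegationDemo
{
    private static HashSet<string> MakeSet()
    {
        // The set itself treats "a" and "A" as equal.
        return new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "a", "b" };
    }

    // Enumerable.Contains sees an ICollection<string> and forwards to HashSet.Contains.
    public static bool ViaContains()
    {
        IEnumerable<string> source = MakeSet();
        return source.Contains("A");
    }

    // Any walks the elements and applies the predicate; the set's comparer is not involved.
    public static bool ViaAny()
    {
        IEnumerable<string> source = MakeSet();
        return source.Any(s => s == "A");
    }
}
```

Here ViaContains() should return true and ViaAny() false, because the stored element is "a" and the predicate compares case-sensitively.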
Contains
Determines whether a sequence contains a specified element by using the default equality comparer.
Any
Determines whether a sequence contains any elements.
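The description quoted for Any above is that of the parameterless overload; the overload that takes a predicate is the one usually compared with Contains. A minimal sketch of both, with illustrative class and method names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyOverloadsDemo
{
    // Any() with no arguments: true as soon as one element exists.
    public static bool IsNonEmpty(IEnumerable<int> source)
    {
        return source.Any();
    }

    // Any(predicate): true as soon as one element satisfies the condition.
    public static bool HasNegative(IEnumerable<int> source)
    {
        return source.Any(x => x < 0);
    }
}
```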
As for the documentation:
I can't seem to find any documentation on it.
All (most?) LINQ extension methods: here
I have an enumerator like this
IEnumerable<System.Windows.Documents.FixedPage> page;
How can I add a page (e.g. D:\newfile.txt) to it? I have tried Add, Append, Concat, etc., but nothing worked for me.
Yes, it is possible
It is possible to concatenate sequences (IEnumerables) together and assign the concatenated result to a new sequence. (You cannot change the original sequence.)
The built-in Enumerable.Concat() will only concatenate another sequence; however, it is easy to write an extension method that will let you concatenate a scalar to a sequence.
The following code demonstrates:
using System;
using System.Collections.Generic;
using System.Linq;
namespace Demo
{
public class Program
{
[STAThread]
private static void Main()
{
var stringList = new List<string> {"One", "Two", "Three"};
IEnumerable<string> originalSequence = stringList;
var newSequence = originalSequence.Concat("Four");
foreach (var text in newSequence)
{
Console.WriteLine(text); // Prints "One" "Two" "Three" "Four".
}
}
}
public static class EnumerableExt
{
/// <summary>Concatenates a scalar to a sequence.</summary>
/// <typeparam name="T">The type of elements in the sequence.</typeparam>
/// <param name="sequence">a sequence.</param>
/// <param name="item">The scalar item to concatenate to the sequence.</param>
/// <returns>A sequence which has the specified item appended to it.</returns>
/// <remarks>
/// The standard .Net IEnumerable extensions includes a Concat() operator which concatenates a sequence to another sequence.
/// However, it does not allow you to concat a scalar to a sequence. This operator provides that ability.
/// </remarks>
public static IEnumerable<T> Concat<T>(this IEnumerable<T> sequence, T item)
{
return sequence.Concat(new[] { item });
}
}
}
IEnumerable<T> does not contain a way to modify the collection.
You will need a type that implements either ICollection<T> or IList<T>, as these provide Add and Remove methods.
If you have an idea of what the original type of the IEnumerable is, you can modify it...
List<string> stringList = new List<string>();
stringList.Add("One");
stringList.Add("Two");
IEnumerable<string> stringEnumerable = stringList.AsEnumerable();
List<string> stringList2 = stringEnumerable as List<string>;
if (stringList2 != null)
stringList2.Add("Three");
foreach (var s in stringList)
Console.WriteLine(s);
This outputs:
One
Two
Three
If you change the foreach statement to iterate over stringList2 or stringEnumerable, you'll get the same thing.
Reflection might be useful to determine the real type of the IEnumerable.
This probably isn't a good practice, though... Whatever gave you the IEnumerable is probably not expecting the collection to be modified that way.
IEnumerable<T> is a readonly interface. You should use an IList<T> instead, which provides methods for adding and removing items.
IEnumerable is a read-only view: you can't add items and you can't delete items through it.
The classes from System.Collections.Generic return this interface so you can iterate over the items contained in the collection.
From MSDN
Exposes the enumerator, which supports a simple iteration over a collection of a specified type.
See here for MSDN reference.
Try
IEnumerable<System.Windows.Documents.FixedPage> page = new List<System.Windows.Documents.FixedPage>(your items list here)
or
IList<System.Windows.Documents.FixedPage> page = new List<System.Windows.Documents.FixedPage>(1);
page.Add(your item Here);
You cannot add elements to IEnumerable<T>, since it does not support addition operations. You either have to use an implementation of ICollection<T>, or cast the IEnumerable<T> to ICollection<T> if possible.
IEnumerable<System.Windows.Documents.FixedPage> page;
....
ICollection<System.Windows.Documents.FixedPage> pageCollection
= (ICollection<System.Windows.Documents.FixedPage>) page;
If the cast is impossible, use for instance
ICollection<System.Windows.Documents.FixedPage> pageCollection
= new List<System.Windows.Documents.FixedPage>(page);
You can do it like this:
ICollection<System.Windows.Documents.FixedPage> pageCollection
= (page as ICollection<System.Windows.Documents.FixedPage>) ??
new List<System.Windows.Documents.FixedPage>(page);
The latter will almost guarantee that you have a collection that is modifiable. It is possible, though, for the cast to succeed and for all modification operations to then throw NotSupportedException. This is the case for read-only collections. In such cases the approach with the constructor is the only option.
The ICollection<T> interface extends IEnumerable<T>, so you can use pageCollection wherever you are currently using page.
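To guard against the read-only case, one possible sketch checks ICollection<T>.IsReadOnly before reusing the cast result. The helper name WithItem is hypothetical. Note that when the source is already a modifiable collection, this adds to the original rather than to a copy.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class AddToEnumerableDemo
{
    // Returns a modifiable collection containing the source items plus the new item.
    // Copies into a List<T> when the source is not an ICollection<T> or is read-only.
    public static ICollection<T> WithItem<T>(IEnumerable<T> source, T item)
    {
        var collection = source as ICollection<T>;
        if (collection == null || collection.IsReadOnly)
            collection = new List<T>(source);

        collection.Add(item);
        return collection;
    }
}
```

Arrays and ReadOnlyCollection<T> both report IsReadOnly as true through the interface, so they take the copying path instead of throwing NotSupportedException.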
I have two sets of datarows. They are each IEnumerable. I want to append/concatenate these two lists into one list. I'm sure this is doable. I don't want to do a for loop and noticed that there is a Union method and a Join method on the two Lists. Any ideas?
Assuming your objects are of the same type, you can use either Union or Concat. Note that, like the SQL UNION keyword, the Union operation will ensure that duplicates are eliminated, whereas Concat (like UNION ALL) will simply add the second list to the end of the first.
IEnumerable<T> first = ...;
IEnumerable<T> second = ...;
IEnumerable<T> combined = first.Concat(second);
or
IEnumerable<T> combined = first.Union(second);
If they are of different types, then you'll have to Select them into something common. For example:
IEnumerable<TOne> first = ...;
IEnumerable<TTwo> second = ...;
IEnumerable<T> combined = first.Select(f => ConvertToT(f)).Concat(
second.Select(s => ConvertToT(s)));
Where ConvertToT(TOne f) and ConvertToT(TTwo s) represent an operation that somehow converts an instance of TOne (and TTwo, respectively) into an instance of T.
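To make the difference between the two concrete, here is a small sketch with overlapping integer sequences. The class and method names are only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConcatVsUnionDemo
{
    private static readonly int[] First = { 1, 2, 3 };
    private static readonly int[] Second = { 3, 4 };

    // Concat keeps every element, duplicates included: 1, 2, 3, 3, 4
    public static List<int> Concatenated()
    {
        return First.Concat(Second).ToList();
    }

    // Union keeps only the first occurrence of each value: 1, 2, 3, 4
    public static List<int> Unioned()
    {
        return First.Union(Second).ToList();
    }
}
```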
I just encountered a similar situation where I need to concatenate multiple sequences.
Naturally I searched for existing solutions on Google/StackOverflow, but did not find anything that did not evaluate the enumerable (e.g. convert to an array, then use Array.Copy(), etc.), so I wrote an extension and a static utility method called ConcatMultiple.
Hope this helps anyone that needs to do the same.
/// <summary>
/// Concatenates multiple sequences
/// </summary>
/// <typeparam name="TSource">The type of the elements of the input sequences.</typeparam>
/// <param name="first">The first sequence to concatenate.</param>
/// <param name="source">The other sequences to concatenate.</param>
/// <returns></returns>
public static IEnumerable<TSource> ConcatMultiple<TSource>(this IEnumerable<TSource> first, params IEnumerable<TSource>[] source)
{
if (first == null)
throw new ArgumentNullException("first");
if (source.Any(x => (x == null)))
throw new ArgumentNullException("source");
return ConcatIterator<TSource>(first, source); // pass first along, otherwise its elements are dropped
}
private static IEnumerable<TSource> ConcatIterator<TSource>(IEnumerable<TSource> first, params IEnumerable<TSource>[] source)
{
foreach (var iteratorVariable in first)
yield return iteratorVariable;
foreach (var enumerable in source)
{
foreach (var iteratorVariable in enumerable)
yield return iteratorVariable;
}
}
/// <summary>
/// Concatenates multiple sequences
/// </summary>
/// <typeparam name="TSource">The type of the elements of the input sequences.</typeparam>
/// <param name="source">The sequences to concatenate.</param>
/// <returns></returns>
public static IEnumerable<TSource> ConcatMultiple<TSource>(params IEnumerable<TSource>[] source)
{
if (source.Any(x => (x == null)))
throw new ArgumentNullException("source");
return ConcatIterator<TSource>(source);
}
private static IEnumerable<TSource> ConcatIterator<TSource>(params IEnumerable<TSource>[] source)
{
foreach (var enumerable in source)
{
foreach (var iteratorVariable in enumerable)
yield return iteratorVariable;
}
}
The Join method is like a SQL join: the lists are cross-referenced based on a condition; it is not string concatenation or adding to a list. Both Union and Concat do what you want, but both are lazily evaluated and require non-null arguments. They return a ConcatIterator or a UnionIterator, which re-enumerates the sources every time it is iterated, so iterating the result repeatedly can cause problems. Eager evaluation behaves differently; if that is what you want, an extension method like the one below could be used.
public static IEnumerable<T> myEagerConcat<T>(this IEnumerable<T> first,
IEnumerable<T> second)
{
return (first ?? Enumerable.Empty<T>()).Concat(
(second ?? Enumerable.Empty<T>())).ToList();
}
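A quick way to see the lazy/eager difference is to modify a source list after building both results. This is only a sketch; the class and method names are invented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyVsEagerDemo
{
    // Returns { lazy count, eager count } after the first source list is modified.
    public static int[] Run()
    {
        var first = new List<int> { 1, 2 };
        var second = new List<int> { 3 };

        IEnumerable<int> lazy = first.Concat(second);    // nothing is enumerated yet
        List<int> eager = first.Concat(second).ToList(); // snapshot taken right now

        first.Add(99); // change the source afterwards

        // The lazy sequence sees the new element; the eager list does not.
        return new[] { lazy.Count(), eager.Count };
    }
}
```

Run() should return { 4, 3 }.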
Delayed invocation of the second and subsequent enumerables
I usually use LINQ's IEnumerable<T>.Concat(), but today I needed to be 100% sure that the second enumeration was not started until the first one had been processed to the end (e.g. two db queries that I didn't want to run simultaneously). The following function did the trick of delaying the enumerations.
IEnumerable<T> DelayedConcat<T>(params Func<IEnumerable<T>>[] enumerableList)
{
foreach(var enumerable in enumerableList)
{
foreach (var item in enumerable())
{
yield return item;
}
}
}
Usage:
return DelayedConcat(
() => GetEnumerable1(),
() => GetEnumerable2()
// and so on: , () => GetEnumerable3()
);
In this example the call to GetEnumerable2 is delayed until GetEnumerable1 has been enumerated to the end.
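The ordering can be checked with a small logging sketch. DelayedConcat is repeated here so the example stands alone; the log messages and the Run method are made up for the demonstration.

```csharp
using System;
using System.Collections.Generic;

public static class DelayedConcatDemo
{
    private static IEnumerable<T> DelayedConcat<T>(params Func<IEnumerable<T>>[] factories)
    {
        foreach (var factory in factories)
        {
            // The factory is only invoked once the previous sequence is exhausted.
            foreach (var item in factory())
                yield return item;
        }
    }

    // Records the order in which sources are opened and items are produced.
    public static List<string> Run()
    {
        var log = new List<string>();
        Func<IEnumerable<int>> a = () => { log.Add("open A"); return new[] { 1, 2 }; };
        Func<IEnumerable<int>> b = () => { log.Add("open B"); return new[] { 3 }; };

        foreach (var item in DelayedConcat(a, b))
            log.Add("item " + item);

        return log;
    }
}
```

The log should read: open A, item 1, item 2, open B, item 3.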
If I have two sequences and I want to process them both together, I can union them and away we go.
Now let's say I have a single item I want to process between the two sequences. I can get it in by creating an array with a single item, but is there a neater way? i.e.
var top = new string[] { "Crusty bread", "Mayonnaise" };
string filling = "BTL";
var bottom = new string[] { "Mayonnaise", "Crusty bread" };
// Will not compile, filling is a string, therefore is not Enumerable
//var sandwich = top.Union(filling).Union(bottom);
// Compiles and works, but feels grungy (looks like it might be smelly)
var sandwich = top.Union(new string[]{filling}).Union(bottom);
foreach (var item in sandwich)
Process(item);
Is there an approved way of doing this, or is this the approved way?
Thanks
One option is to overload it yourself:
public static IEnumerable<T> Union<T>(this IEnumerable<T> source, T item)
{
return source.Union(Enumerable.Repeat(item, 1));
}
That's what we did with Concat in MoreLINQ.
The new way of doing this, supported in .NET Core and .NET Framework from version 4.7.1, is using the Append extension method.
This will make your code as easy and elegant as
var sandwich = top.Append(filling).Union(bottom);
Consider using an even more flexible approach:
public static IEnumerable<T> Union<T>(this IEnumerable<T> source, params T[] items)
{
return source.Union((IEnumerable<T>)items);
}
Works for single as well as multiple items.
You may also accept null source values:
public static IEnumerable<T> Union<T>(this IEnumerable<T> source, params T[] items)
{
return source != null ? source.Union((IEnumerable<T>)items) : items;
}
I tend to have the following somewhere in my code:
public static IEnumerable<T> EmitFromEnum<T>(this T item)
{
yield return item;
}
While it's not as neat to call col.Union(obj.EmitFromEnum()); as col.Union(obj) it does mean that this single extension method covers all other cases I might want such a single-item enumeration.
Update: With .NET Core you can now use .Append() or .Prepend() to add a single element to an enumerable. The implementation is optimised to avoid generating too many IEnumerator implementations behind the scenes.
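A short sketch of Append and Prepend, assuming .NET Core or .NET Framework 4.7.1 or later. It uses Concat rather than Union so that duplicates are kept; the class and method names are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AppendPrependDemo
{
    // top + single filling + bottom, keeping duplicates (Concat, not Union).
    public static List<string> BuildSandwich()
    {
        var top = new[] { "Crusty bread", "Mayonnaise" };
        var bottom = new[] { "Mayonnaise", "Crusty bread" };
        return top.Append("BTL").Concat(bottom).ToList();
    }

    // Prepend adds at the front, Append at the back: 1, 2, 3, 4
    public static List<int> Wrap()
    {
        return new[] { 2, 3 }.Prepend(1).Append(4).ToList();
    }
}
```

Neither method changes the original arrays; each call returns a new lazy sequence that ToList() then materialises.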
I could convert them to lists and just use a regular for loop with indexes, but I'm wondering if there's a way to do it that keeps them as IEnumerables.
I think you want the new Zip feature from .NET 4.0. Eric Lippert blogged about it recently and included a simple form of the implementation.
It's also in MoreLINQ, in Zip.cs, which allows for different options if the sequences aren't the same length. The "default" is to act like .NET 4.0, stopping when either sequence runs out of elements. Alternatives are to pad the shorter sequence or throw an exception.
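For illustration only, the "pad the shorter sequence" behaviour could be hand-rolled roughly like this. ZipPadded is a hypothetical name, not MoreLINQ's API, and padding with default(T) is an assumption made for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PaddedZipExtensions
{
    // Pairs elements by position; once one sequence ends, its side is filled with default(T).
    public static IEnumerable<KeyValuePair<T1, T2>> ZipPadded<T1, T2>(
        this IEnumerable<T1> first, IEnumerable<T2> second)
    {
        using (var e1 = first.GetEnumerator())
        using (var e2 = second.GetEnumerator())
        {
            while (true)
            {
                bool has1 = e1.MoveNext();
                bool has2 = e2.MoveNext();
                if (!has1 && !has2)
                    yield break; // both sequences are exhausted

                yield return new KeyValuePair<T1, T2>(
                    has1 ? e1.Current : default(T1),
                    has2 ? e2.Current : default(T2));
            }
        }
    }
}
```

For example, zipping { 1, 2, 3 } with { "a" } this way would yield (1, a), (2, null), (3, null).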
There is no built-in way, but it's not difficult to add an extension method to make it a bit easier. For brevity, I left out the error checking that would ensure both sequences are the same length.
public static void ForEachPair<T1,T2>(
this IEnumerable<T1> source1,
IEnumerable<T2> source2,
Action<T1,T2> del) {
using ( var e1 = source1.GetEnumerator() )
using ( var e2 = source2.GetEnumerator() ) {
while ( e1.MoveNext() && e2.MoveNext() ) {
del(e1.Current, e2.Current);
}
}
}
Now you can do the following
var list = GetSomeList();
var otherList = GetSomeOtherList();
list.ForEachPair(otherList, (x,y) =>
{
// Loop code here
});
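On .NET 4.0 or later, a similar loop can be written with the built-in Zip and a plain foreach, which keeps both inputs as IEnumerables. The sketch below sums pairwise products; the class and method names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipForeachDemo
{
    // Walks both sequences in step and accumulates x * y for each pair.
    public static int SumOfProducts(IEnumerable<int> xs, IEnumerable<int> ys)
    {
        int total = 0;
        foreach (var pair in xs.Zip(ys, (x, y) => new { X = x, Y = y }))
        {
            // Loop code here; both elements are available on the pair.
            total += pair.X * pair.Y;
        }
        return total;
    }
}
```

As with ForEachPair above, iteration stops when the shorter sequence ends.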