Appending/concatenating two IEnumerable sequences - c#

I have two sets of datarows. They are each IEnumerable. I want to append/concatenate these two lists into one list. I'm sure this is doable. I don't want to do a for loop and noticed that there is a Union method and a Join method on the two Lists. Any ideas?

Assuming your objects are of the same type, you can use either Union or Concat. Note that, like the SQL UNION keyword, the Union operation will ensure that duplicates are eliminated, whereas Concat (like UNION ALL) will simply add the second list to the end of the first.
IEnumerable<T> first = ...;
IEnumerable<T> second = ...;
IEnumerable<T> combined = first.Concat(second);
or
IEnumerable<T> combined = first.Union(second);
If they are of different types, then you'll have to Select them into something common. For example:
IEnumerable<TOne> first = ...;
IEnumerable<TTwo> second = ...;
IEnumerable<T> combined = first.Select(f => ConvertToT(f)).Concat(
second.Select(s => ConvertToT(s)));
Where ConvertToT(TOne f) and ConvertToT(TTwo s) represent an operation that somehow converts an instance of TOne (and TTwo, respectively) into an instance of T.
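To make the Concat/Union difference concrete, here is a small sketch. The class and method names (ConcatVersusUnion, Concatenated, United) are made up for illustration; only Concat, Union and ToList come from LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConcatVersusUnion
{
    // Concat keeps every element from both sequences, in order.
    public static List<int> Concatenated(IEnumerable<int> first, IEnumerable<int> second)
    {
        return first.Concat(second).ToList();
    }

    // Union yields each distinct value once, in order of first appearance.
    public static List<int> United(IEnumerable<int> first, IEnumerable<int> second)
    {
        return first.Union(second).ToList();
    }

    public static void Main()
    {
        var first = new[] { 1, 2, 2, 3 };
        var second = new[] { 3, 4 };

        Console.WriteLine(string.Join(", ", Concatenated(first, second))); // 1, 2, 2, 3, 3, 4
        Console.WriteLine(string.Join(", ", United(first, second)));       // 1, 2, 3, 4
    }
}
```

Note that Union also drops the duplicate that was already inside the first sequence, not only values shared between the two.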

I just encountered a similar situation where I need to concatenate multiple sequences.
I searched Google/StackOverflow for existing solutions, but did not find anything that did not evaluate the enumerable (e.g. converting to an array and then using Array.Copy()), so I wrote an extension method and a static utility method called ConcatMultiple.
Hope this helps anyone that needs to do the same.
/// <summary>
/// Concatenates multiple sequences
/// </summary>
/// <typeparam name="TSource">The type of the elements of the input sequences.</typeparam>
/// <param name="first">The first sequence to concatenate.</param>
/// <param name="source">The other sequences to concatenate.</param>
/// <returns></returns>
public static IEnumerable<TSource> ConcatMultiple<TSource>(this IEnumerable<TSource> first, params IEnumerable<TSource>[] source)
{
if (first == null)
throw new ArgumentNullException("first");
if (source.Any(x => (x == null)))
throw new ArgumentNullException("source");
return ConcatIterator<TSource>(first, source); // pass first as well, otherwise its elements are silently dropped
}
private static IEnumerable<TSource> ConcatIterator<TSource>(IEnumerable<TSource> first, params IEnumerable<TSource>[] source)
{
foreach (var iteratorVariable in first)
yield return iteratorVariable;
foreach (var enumerable in source)
{
foreach (var iteratorVariable in enumerable)
yield return iteratorVariable;
}
}
/// <summary>
/// Concatenates multiple sequences
/// </summary>
/// <typeparam name="TSource">The type of the elements of the input sequences.</typeparam>
/// <param name="source">The sequences to concatenate.</param>
/// <returns></returns>
public static IEnumerable<TSource> ConcatMultiple<TSource>(params IEnumerable<TSource>[] source)
{
if (source.Any(x => (x == null)))
throw new ArgumentNullException("source");
return ConcatIterator<TSource>(source);
}
private static IEnumerable<TSource> ConcatIterator<TSource>(params IEnumerable<TSource>[] source)
{
foreach (var enumerable in source)
{
foreach (var iteratorVariable in enumerable)
yield return iteratorVariable;
}
}
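As a possible alternative to hand-written iterators, the built-in SelectMany can flatten an array of sequences without evaluating them up front. This is only a sketch: Flatten and FlattenExample are hypothetical names, and unlike ConcatMultiple it does not check the individual sequences for null, so a null entry would only fail once enumeration reaches it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FlattenExample
{
    // Hypothetical helper: lazily concatenates any number of sequences.
    public static IEnumerable<T> Flatten<T>(params IEnumerable<T>[] sources)
    {
        if (sources == null) throw new ArgumentNullException("sources");
        // SelectMany is deferred: each inner sequence is enumerated only when reached.
        return sources.SelectMany(sequence => sequence);
    }

    public static void Main()
    {
        var combined = Flatten(new[] { 1, 2 }, new List<int> { 3 }, Enumerable.Range(4, 2));
        Console.WriteLine(string.Join(", ", combined)); // 1, 2, 3, 4, 5
    }
}
```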

The Join method is like a SQL join, where the lists are cross-referenced based on a condition; it is not string concatenation or adding to a list. The Union method does what you want, as does the Concat method, but both are lazily evaluated and require their parameters to be non-null. They return an iterator (a ConcatIterator or a UnionIterator) rather than a list, so every time the result is enumerated the source sequences are enumerated again, which can cause problems if that is expensive or if the sources change in between. Eager evaluation behaves differently; if that is what you want, an extension method like the one below could be used.
public static IEnumerable<T> myEagerConcat<T>(this IEnumerable<T> first,
IEnumerable<T> second)
{
return (first ?? Enumerable.Empty<T>()).Concat(
(second ?? Enumerable.Empty<T>())).ToList();
}
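The difference between the lazy result and an eager snapshot shows up when a source changes after the call. The sketch below uses made-up names (LazyVersusEager, Describe) and plain List<int> sources as an assumption; it is an illustration, not part of the answer's extension method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyVersusEager
{
    // Returns the lazy result and the eager snapshot, both rendered as text.
    public static string[] Describe()
    {
        var first = new List<int> { 1, 2 };
        var second = new List<int> { 3 };

        IEnumerable<int> lazy = first.Concat(second);       // nothing enumerated yet
        List<int> snapshot = first.Concat(second).ToList(); // enumerated right here

        first.Add(99); // change a source after building both results

        return new[] { string.Join(", ", lazy), string.Join(", ", snapshot) };
    }

    public static void Main()
    {
        string[] lines = Describe();
        Console.WriteLine(lines[0]); // 1, 2, 99, 3
        Console.WriteLine(lines[1]); // 1, 2, 3
    }
}
```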

Delayed invocation of the second and subsequent enumerables
I usually use LINQ's IEnumerable<T>.Concat(), but today I needed to be 100% sure that the second enumeration was not started until the first one had been processed to the end (e.g. two db queries that I didn't want to run simultaneously). The following function did the trick of delaying the enumerations.
IEnumerable<T> DelayedConcat<T>(params Func<IEnumerable<T>>[] enumerableList)
{
foreach(var enumerable in enumerableList)
{
foreach (var item in enumerable())
{
yield return item;
}
}
}
Usage:
return DelayedConcat(
() => GetEnumerable1(),
() => GetEnumerable2()
// and so on, e.g. a further argument () => GetEnumerable3()
// (C# does not allow a trailing comma in an argument list)
);
In this example GetEnumerable2 function invocation will be delayed until GetEnumerable1 has been enumerated till the end.
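The ordering can be checked with a small logging sketch. Query and Log are hypothetical stand-ins for the real data sources (they are not from the answer); DelayedConcat is repeated here only so the example is self-contained.

```csharp
using System;
using System.Collections.Generic;

public static class DelayedConcatDemo
{
    public static readonly List<string> Log = new List<string>();

    public static IEnumerable<T> DelayedConcat<T>(params Func<IEnumerable<T>>[] factories)
    {
        foreach (var factory in factories)
            foreach (var item in factory()) // the factory is invoked only when reached
                yield return item;
    }

    // Hypothetical stand-in for a query; records the moment it is invoked.
    private static IEnumerable<int> Query(string name, params int[] rows)
    {
        Log.Add("open " + name);
        return rows;
    }

    public static List<string> Run()
    {
        Log.Clear();
        foreach (var row in DelayedConcat<int>(() => Query("A", 1, 2), () => Query("B", 3)))
            Log.Add("row " + row);
        return Log;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join("; ", Run()));
        // open A; row 1; row 2; open B; row 3
    }
}
```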

Related

How to join two list in c# [duplicate]

I have an IEnumerable<T> and an IEnumerable<U> that I want merged into an IEnumerable<KeyValuePair<T,U>> where the indexes of the elements joined together in the KeyValuePair are the same. Note I'm not using IList, so I don't have a count or an index for the items I'm merging. How best can I accomplish this? I would prefer a LINQ answer, but anything that gets the job done in an elegant fashion would work as well.
Note: As of .NET 4.0, the framework includes a .Zip extension method on IEnumerable, documented here. The following is maintained for posterity and for use in .NET Framework versions earlier than 4.0.
I use these extension methods:
// From http://community.bartdesmet.net/blogs/bart/archive/2008/11/03/c-4-0-feature-focus-part-3-intermezzo-linq-s-new-zip-operator.aspx
public static IEnumerable<TResult> Zip<TFirst, TSecond, TResult>(this IEnumerable<TFirst> first, IEnumerable<TSecond> second, Func<TFirst, TSecond, TResult> func) {
if (first == null)
throw new ArgumentNullException("first");
if (second == null)
throw new ArgumentNullException("second");
if (func == null)
throw new ArgumentNullException("func");
using (var ie1 = first.GetEnumerator())
using (var ie2 = second.GetEnumerator())
while (ie1.MoveNext() && ie2.MoveNext())
yield return func(ie1.Current, ie2.Current);
}
public static IEnumerable<KeyValuePair<T, R>> Zip<T, R>(this IEnumerable<T> first, IEnumerable<R> second) {
return first.Zip(second, (f, s) => new KeyValuePair<T, R>(f, s));
}
EDIT: after the comments I'm obliged to clarify and fix some things:
I originally took the first Zip implementation verbatim from Bart De Smet's blog
Added enumerator disposing (which was also noted on Bart's original post)
Added null parameter checking (also discussed in Bart's post)
As an update for anyone stumbling across this question: .NET 4.0 supports this natively. Example from Microsoft:
int[] numbers = { 1, 2, 3, 4 };
string[] words = { "one", "two", "three" };
var numbersAndWords = numbers.Zip(words, (first, second) => first + " " + second);
Documentation:
The method merges each element of the first sequence with an element that has the same index in the second sequence. If the sequences do not have the same number of elements, the method merges sequences until it reaches the end of one of them. For example, if one sequence has three elements and the other one has four, the result sequence will have only three elements.
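A short sketch of that truncation behaviour, producing the KeyValuePair shape asked for in the question. ZipLengths and Pair are names invented for this example; Zip itself is the built-in .NET 4.0 method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipLengths
{
    // Pairs elements by position; stops as soon as either sequence runs out.
    public static List<KeyValuePair<int, string>> Pair(IEnumerable<int> keys, IEnumerable<string> values)
    {
        return keys.Zip(values, (k, v) => new KeyValuePair<int, string>(k, v)).ToList();
    }

    public static void Main()
    {
        var numbers = new[] { 1, 2, 3, 4 };
        var words = new[] { "one", "two", "three" };

        var pairs = Pair(numbers, words);
        Console.WriteLine(pairs.Count); // 3 (the unmatched 4 is dropped)
        foreach (var pair in pairs)
            Console.WriteLine(pair.Key + " = " + pair.Value);
    }
}
```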
Think about what you're asking a bit more closely here:
You want to combine two IEnumerables in which "the indexes of the elements joined together in the KeyValuePair are the same", but you "don't have a count or an index for the items I'm merging".
There's no guarantee your IEnumerables are in any particular order. There's no correlation between your two IEnumerable objects, so how can you expect to correlate them?
Look at nextension:
Currently Implemented Methods
IEnumerable
ForEach Performs a specified action on each element of the IEnumerable.
Clump Groups items into same size lots.
Scan Creates a list by applying a delegate to pairs of items in the IEnumerable.
AtLeast Checks there are at least a certain amount of items in the IEnumerable.
AtMost Checks there are no more than a certain amount of items in the IEnumerable.
Zip Creates a list by combining two other lists into one.
Cycle Creates a list by repeating another list.
I would use something along the lines of -
IEnumerable<KeyValuePair<T,U>> Merge<T,U>(IEnumerable<T> keyCollection, IEnumerable<U> valueCollection)
{
// No Reset() calls: a fresh enumerator already starts before the first element,
// and enumerators produced by iterator blocks throw NotSupportedException on Reset().
using (var keys = keyCollection.GetEnumerator())
using (var values = valueCollection.GetEnumerator())
{
while (keys.MoveNext() && values.MoveNext())
{
yield return new KeyValuePair<T,U>(keys.Current,values.Current);
}
}
}
This should work correctly and clean up properly afterwards.
Untested, but should work:
IEnumerable<KeyValuePair<T, U>> Zip<T, U>(IEnumerable<T> t, IEnumerable<U> u) {
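// using ensures both enumerators are disposed, even when the exception below is thrown
using (IEnumerator<T> et = t.GetEnumerator())
using (IEnumerator<U> eu = u.GetEnumerator())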
for (;;) {
bool bt = et.MoveNext();
bool bu = eu.MoveNext();
if (bt != bu)
throw new ArgumentException("Different number of elements in t and u");
if (!bt)
break;
yield return new KeyValuePair<T, U>(et.Current, eu.Current);
}
}
You could use the Zip methods in MoreLINQ.
The MSDN has the following Custom Sequence Operators example. And Welbog is right; if you have no index on the underlying data, you have no guarantee that the operation does what you expect.
Another implementation from the functional-dotnet project by Alexey Romanov:
/// <summary>
/// Takes two sequences and returns a sequence of corresponding pairs.
/// If one sequence is short, excess elements of the longer sequence are discarded.
/// </summary>
/// <typeparam name="T1">The type of the 1.</typeparam>
/// <typeparam name="T2">The type of the 2.</typeparam>
/// <param name="sequence1">The first sequence.</param>
/// <param name="sequence2">The second sequence.</param>
/// <returns></returns>
public static IEnumerable<Tuple<T1, T2>> Zip<T1, T2>(
this IEnumerable<T1> sequence1, IEnumerable<T2> sequence2) {
using (
IEnumerator<T1> enumerator1 = sequence1.GetEnumerator())
using (
IEnumerator<T2> enumerator2 = sequence2.GetEnumerator()) {
while (enumerator1.MoveNext() && enumerator2.MoveNext()) {
yield return
Pair.New(enumerator1.Current, enumerator2.Current);
}
}
//
//zip :: [a] -> [b] -> [(a,b)]
//zip (a:as) (b:bs) = (a,b) : zip as bs
//zip _ _ = []
}
Replace Pair.New with new KeyValuePair<T1, T2> (and the return type) and you're good to go.
JaredPar has a library with a lot of useful stuff in it, including Zip, which will enable what you want to do.

What is the difference between Contains and Any in LINQ?

What is the difference between Contains and Any in LINQ?
Contains takes an object, Any takes a predicate.
You use Contains like this:
listOfInts.Contains(1);
and Any like this:
listOfInts.Any(i => i == 1);
listOfInts.Any(i => i % 2 == 0); // Check if any element is an Even Number
So if you want to check for a specific condition, use Any. If you want to check for the existence of an element, use Contains.
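The two overlap when the "condition" is really just a custom notion of equality. As a sketch (ContainsVersusAny and the sample names are made up), Enumerable.Contains has an overload taking an IEqualityComparer<T>, which can express the same check as an Any predicate:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContainsVersusAny
{
    public static readonly List<string> Names = new List<string> { "alice", "bob" };

    // Element lookup with the default comparer (case-sensitive for strings).
    public static bool Exact(string name)
    {
        return Names.Contains(name);
    }

    // Element lookup with a custom comparer (LINQ overload of Contains).
    public static bool IgnoringCase(string name)
    {
        return Names.Contains(name, StringComparer.OrdinalIgnoreCase);
    }

    // The same check written as a predicate for Any.
    public static bool ViaAny(string name)
    {
        return Names.Any(n => string.Equals(n, name, StringComparison.OrdinalIgnoreCase));
    }

    public static void Main()
    {
        Console.WriteLine(Exact("Alice"));        // False
        Console.WriteLine(IgnoringCase("Alice")); // True
        Console.WriteLine(ViaAny("Alice"));       // True
    }
}
```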
MSDN for Contains, Any
Contains checks if the sequence contains a specified element.
Enumerable.Any checks if any element of a sequence satisfies a condition.
Consider the following example:
List<int> list = new List<int> { 1, 2, 3, 4, 5 };
bool contains = list.Contains(1); //true
bool condition = list.Any(r => r > 2 && r < 5); //true (3 and 4 satisfy the condition)
Contains cares about whether the source collection is an ICollection, Any does not.
Enumerable.Contains
http://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs#f60bab4c5e27a849
public static bool Contains<TSource>(this IEnumerable<TSource> source, TSource value)
{
ICollection<TSource> collection = source as ICollection<TSource>;
if (collection != null)
{
return collection.Contains(value);
}
return source.Contains<TSource>(value, null);
}
Enumerable.Any
http://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs#6a1af7c3d17845e3
public static bool Any<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)
{
foreach (TSource local in source)
{
if (predicate(local))
{
return true;
}
}
return false;
}
Another difference, as mentioned here, is performance:
Contains is O(n) for a List and O(1) for a HashSet
Any is simply O(n)
Contains
Determines whether a sequence contains a specified element by using the default equality comparer.
Any
Determines whether a sequence contains any elements.
As for the documentation:
Can't seem to find any documentation on it.
All (most?) LINQ extension methods: here

Code for adding to IEnumerable

I have an enumerator like this
IEnumerable<System.Windows.Documents.FixedPage> page;
How can I add a page (e.g. D:\newfile.txt) to it? I have tried Add, Append, Concat, etc., but nothing worked for me.
Yes, it is possible
It is possible to concatenate sequences (IEnumerables) together and assign the concatenated result to a new sequence. (You cannot change the original sequence.)
The built-in Enumerable.Concat() will only concatenate another sequence; however, it is easy to write an extension method that will let you concatenate a scalar to a sequence.
The following code demonstrates:
using System;
using System.Collections.Generic;
using System.Linq;
namespace Demo
{
public class Program
{
[STAThread]
private static void Main()
{
var stringList = new List<string> {"One", "Two", "Three"};
IEnumerable<string> originalSequence = stringList;
var newSequence = originalSequence.Concat("Four");
foreach (var text in newSequence)
{
Console.WriteLine(text); // Prints "One" "Two" "Three" "Four".
}
}
}
public static class EnumerableExt
{
/// <summary>Concatenates a scalar to a sequence.</summary>
/// <typeparam name="T">The type of elements in the sequence.</typeparam>
/// <param name="sequence">a sequence.</param>
/// <param name="item">The scalar item to concatenate to the sequence.</param>
/// <returns>A sequence which has the specified item appended to it.</returns>
/// <remarks>
/// The standard .Net IEnumerable extensions includes a Concat() operator which concatenates a sequence to another sequence.
/// However, it does not allow you to concat a scalar to a sequence. This operator provides that ability.
/// </remarks>
public static IEnumerable<T> Concat<T>(this IEnumerable<T> sequence, T item)
{
return sequence.Concat(new[] { item });
}
}
}
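On newer versions of .NET a custom extension may not be needed, since LINQ itself provides Enumerable.Append (and Prepend); it is not available on older .NET Framework releases, so treat the sketch below as an assumption about your target framework. AppendExample and WithFour are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AppendExample
{
    // Returns a new lazy sequence; the original sequence is left untouched.
    public static IEnumerable<string> WithFour(IEnumerable<string> original)
    {
        return original.Append("Four");
    }

    public static void Main()
    {
        IEnumerable<string> original = new List<string> { "One", "Two", "Three" };
        IEnumerable<string> extended = WithFour(original);

        Console.WriteLine(string.Join(" ", extended)); // One Two Three Four
        Console.WriteLine(original.Count());           // 3
    }
}
```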
IEnumerable<T> does not contain a way to modify the collection.
You will need something that implements either ICollection<T> or IList<T>, as these provide Add and Remove methods.
If you have an idea of what the original type of the IEnumerable is, you can modify it...
List<string> stringList = new List<string>();
stringList.Add("One");
stringList.Add("Two");
IEnumerable<string> stringEnumerable = stringList.AsEnumerable();
List<string> stringList2 = stringEnumerable as List<string>;
if (stringList2 != null)
stringList2.Add("Three");
foreach (var s in stringList)
Console.WriteLine(s);
This outputs:
One
Two
Three
Change the foreach statement to iterate over stringList2 or stringEnumerable and you'll get the same thing.
Reflection might be useful to determine the real type of the IEnumerable.
This probably isn't a good practice, though... Whatever gave you the IEnumerable is probably not expecting the collection to be modified that way.
IEnumerable<T> is a readonly interface. You should use an IList<T> instead, which provides methods for adding and removing items.
IEnumerable is read-only: you can't add items and you can't delete items through it.
The classes from System.Collections.Generic return this interface so you can iterate over the items contained in the collection.
From MSDN
Exposes the enumerator, which supports a simple iteration over a collection of a specified type.
See here for MSDN reference.
Try
IEnumerable<System.Windows.Documents.FixedPage> page = new List<System.Windows.Documents.FixedPage>(your items list here)
or
IList<System.Windows.Documents.FixedPage> page = new List<System.Windows.Documents.FixedPage>(1);
page.Add(your item Here);
You cannot add elements to IEnumerable<T>, since it does not support addition operations. You either have to use an implementation of ICollection<T>, or cast the IEnumerable<T> to ICollection<T> if possible.
IEnumerable<System.Windows.Documents.FixedPage> page;
....
ICollection<System.Windows.Documents.FixedPage> pageCollection
= (ICollection<System.Windows.Documents.FixedPage>) page;
If the cast is impossible, use for instance
ICollection<System.Windows.Documents.FixedPage> pageCollection
= new List<System.Windows.Documents.FixedPage>(page);
You can do it like this:
ICollection<System.Windows.Documents.FixedPage> pageCollection
= (page as ICollection<System.Windows.Documents.FixedPage>) ??
new List<System.Windows.Documents.FixedPage>(page);
The latter will almost guarantee that you have a collection that is modifiable. It is possible, though, for the cast to succeed and still yield a collection whose modification operations all throw NotSupportedException. This is the case for read-only collections. In such cases the approach with the constructor is the only option.
The ICollection<T> interface implements IEnumerable<T>, so you can use pageCollection wherever you are currently using page.
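One way to cover the read-only case is to consult ICollection<T>.IsReadOnly before reusing the instance. This is a sketch under that assumption; AsModifiable and ModifiableCopy are made-up names, and int stands in for FixedPage to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class ModifiableCopy
{
    // Hypothetical helper: returns a collection that is safe to call Add on.
    public static ICollection<T> AsModifiable<T>(IEnumerable<T> source)
    {
        var collection = source as ICollection<T>;
        // Reuse the instance only when it exists and accepts modifications.
        if (collection != null && !collection.IsReadOnly)
            return collection;
        return new List<T>(source); // otherwise copy into a fresh list
    }

    public static void Main()
    {
        ReadOnlyCollection<int> readOnly = new List<int> { 1, 2 }.AsReadOnly();

        ICollection<int> pages = AsModifiable(readOnly);
        pages.Add(3); // works, because a copy was made

        Console.WriteLine(pages.Count);    // 3
        Console.WriteLine(readOnly.Count); // 2
    }
}
```

When the source is already a plain List<T>, the same instance is returned, so additions are visible to whoever else holds that list.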

what is the fastest way to check IEnumerable Count is greater than zero without loop through all records

I know everyone says to avoid doing something like this because it's very slow (just to find out if there are 0 items)
IEnumerable<MyObject> list;
if (list.Count() > 0)
{
}
but what is the best alternative when all I need to do is find out whether the list is empty or has items in it?
Use list.Any(). It returns true if it finds an element. Implementation wise, it would be:
using (var enumerator = list.GetEnumerator())
{
return enumerator.MoveNext();
}
Something like this should work for you:
public static bool IsEmpty(this IEnumerable list)
{
IEnumerator en = list.GetEnumerator();
return !en.MoveNext();
}
Just start enumerating, and if you can move onto the first item, it's not empty. Also, you can check if the IEnumerable also implements ICollection, and if so, call its .Count property.
Also check for null and count as if (!list.IsNullOrEmpty()) { ... }
/// <summary>
/// Returns true if collection is null or empty.
/// </summary>
public static bool IsNullOrEmpty<T>(this IEnumerable<T> source)
{
return source == null || !source.Any();
}
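To see why Any() is cheaper than Count() > 0 on a lazy sequence, the sketch below counts how many elements each check pulls from the source. Numbers, Pulled and PulledBy are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyVersusCount
{
    public static int Pulled; // how many elements the source has produced

    // Hypothetical lazy source that counts each element it yields.
    public static IEnumerable<int> Numbers(int count)
    {
        for (int i = 0; i < count; i++)
        {
            Pulled++;
            yield return i;
        }
    }

    // Runs the given emptiness check and reports how many elements it consumed.
    public static int PulledBy(Func<IEnumerable<int>, bool> check)
    {
        Pulled = 0;
        check(Numbers(1000));
        return Pulled;
    }

    public static void Main()
    {
        Console.WriteLine(PulledBy(s => s.Any()));       // 1
        Console.WriteLine(PulledBy(s => s.Count() > 0)); // 1000
    }
}
```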

C# Difference between First() and Find()

So I know that Find() is only a List<T> method, whereas First() is an extension for any IEnumerable<T>. I also know that First() will return the first element if no parameter is passed, whereas Find() will throw an exception. Lastly, I know that First() will throw an exception if the element is not found, whereas Find() will return the type's default value.
I hope that clears up confusion about what I'm actually asking. This is a computer science question and deals with these methods at the computational level. I've come to understand that IEnumerable<T> extensions do not always operate as one would expect under the hood. So here's the Q, and I mean from a "close to the metal" standpoint: What is the difference between Find() and First()?
Here's some code to provide basic assumptions to operate under for this question.
var l = new List<int> { 1, 2, 3, 4, 5 };
var x = l.First(i => i == 3);
var y = l.Find(i => i == 3);
Is there any actual computational difference between how First() and Find() discover their values in the code above?
Note: Let us ignore things like AsParallel() and AsQueryable() for now.
Here's the code for List<T>.Find (from Reflector):
public T Find(Predicate<T> match)
{
if (match == null)
{
ThrowHelper.ThrowArgumentNullException(ExceptionArgument.match);
}
for (int i = 0; i < this._size; i++)
{
if (match(this._items[i]))
{
return this._items[i];
}
}
return default(T);
}
And here's Enumerable.First:
public static TSource First<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)
{
if (source == null)
{
throw Error.ArgumentNull("source");
}
if (predicate == null)
{
throw Error.ArgumentNull("predicate");
}
foreach (TSource local in source)
{
if (predicate(local))
{
return local;
}
}
throw Error.NoMatch();
}
So both methods work roughly the same way: they iterate all items until they find one that matches the predicate. The only noticeable difference is that Find uses a for loop because it already knows the number of elements, and First uses a foreach loop because it doesn't know it.
First will throw an exception when it finds nothing, FirstOrDefault however does exactly the same as Find (apart from how it iterates through the elements).
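The behaviour in the no-match case can be compared side by side. FindVersusFirst and Describe are names invented for this sketch; the three calls are the standard List<T>.Find, Enumerable.FirstOrDefault and Enumerable.First.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FindVersusFirst
{
    // Reports what Find, FirstOrDefault and First do for the wanted value.
    public static string Describe(List<int> list, int wanted)
    {
        int found = list.Find(i => i == wanted);
        int firstOrDefault = list.FirstOrDefault(i => i == wanted);

        string first;
        try { first = list.First(i => i == wanted).ToString(); }
        catch (InvalidOperationException) { first = "throws"; }

        return found + " " + firstOrDefault + " " + first;
    }

    public static void Main()
    {
        var list = new List<int> { 1, 2, 3, 4, 5 };
        Console.WriteLine(Describe(list, 3));  // 3 3 3
        Console.WriteLine(Describe(list, 42)); // 0 0 throws
    }
}
```

For a value type such as int, the default returned by Find and FirstOrDefault is 0, which cannot be told apart from a genuine match on 0.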
BTW, Find is closer to FirstOrDefault() than to First(), because if no list element satisfies the predicate of First() you will get an exception.
Here is what dotPeek returns (another great free Reflector replacement, with some ReSharper features).
Here are the Enumerable.First(...) and Enumerable.FirstOrDefault(...) extension methods:
public static TSource FirstOrDefault<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate) {
if (source == null) throw Error.ArgumentNull("source");
if (predicate == null) throw Error.ArgumentNull("predicate");
foreach (TSource element in source) {
if (predicate(element)) return element;
}
return default(TSource);
}
public static TSource First<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate) {
if (source == null) throw Error.ArgumentNull("source");
if (predicate == null) throw Error.ArgumentNull("predicate");
foreach (TSource element in source) {
if (predicate(element)) return element;
}
throw Error.NoMatch();
}
and here is List<>.Find:
/// <summary>
/// Searches for an element that matches the conditions defined by the specified predicate, and returns the first occurrence within the entire <see cref="T:System.Collections.Generic.List`1"/>.
/// </summary>
///
/// <returns>
/// The first element that matches the conditions defined by the specified predicate, if found; otherwise, the default value for type <paramref name="T"/>.
/// </returns>
/// <param name="match">The <see cref="T:System.Predicate`1"/> delegate that defines the conditions of the element to search for.</param><exception cref="T:System.ArgumentNullException"><paramref name="match"/> is null.</exception>
[__DynamicallyInvokable]
public T Find(Predicate<T> match)
{
if (match == null)
ThrowHelper.ThrowArgumentNullException(ExceptionArgument.match);
for (int index = 0; index < this._size; ++index)
{
if (match(this._items[index]))
return this._items[index];
}
return default (T);
}
1- Find() returns null if the entity is not in the context, but First() will throw an exception (this describes Entity Framework's DbSet.Find rather than List<T>.Find)
2- Find() returns entities that have been added to the context but have not yet been saved to the database
Since List<> is not indexed in any way, it has to go through all values to find a specific value. Therefore it doesn't make much of a difference compared to traversing the list via an enumerable (apart from the creation of an enumerator helper object).
That said, keep in mind that the Find function was created way earlier than the First extension method (Framework V2.0 vs. V3.5), and I doubt that they would have implemented Find if the List<> class had been implemented at the same time as the extension methods.
