Suppose you want to select every element of a sequence all, except the elements contained in the sequence exceptions and the single element otherException.
Is there a better way to do this than the following? I'd like to avoid creating a new array, but I couldn't find a method on the sequence that concatenates it with a single element.
all.Except(exceptions.Concat(new int[] { otherException }));
Complete source code, for completeness' sake:
var all = Enumerable.Range(1, 5);
int[] exceptions = { 1, 3 };
int otherException = 2;
var result = all.Except(exceptions.Concat(new int[] { otherException }));
An alternative (perhaps more readable) would be:
all.Except(exceptions).Except(new int[] { otherException });
You can also create an extension method that converts any object to an IEnumerable, thus making the code even more readable:
public static IEnumerable<T> ToEnumerable<T>(this T item)
{
return new T[] { item };
}
all.Except(exceptions).Except(otherException.ToEnumerable());
Or if you really want a reusable way to easily get a collection plus one item:
public static IEnumerable<T> Plus<T>(this IEnumerable<T> collection, T item)
{
return collection.Concat(new T[] { item });
}
all.Except(exceptions.Plus(otherException))
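On newer framework versions (as far as I know, .NET Framework 4.7.1 and .NET Core onward) the Plus helper above has a built-in counterpart, Enumerable.Append. The sketch below assumes that method is available on your target; the class and method names AppendExample and ExceptWithExtra are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class AppendExample
{
    // Returns the elements of 'all' that are neither in 'exceptions'
    // nor equal to 'otherException'.
    public static IEnumerable<int> ExceptWithExtra(
        IEnumerable<int> all, IEnumerable<int> exceptions, int otherException)
    {
        // Append is deferred: it wraps 'exceptions' and yields the extra
        // element last, so the caller does not build a temporary array.
        return all.Except(exceptions.Append(otherException));
    }

    static void Main()
    {
        var all = Enumerable.Range(1, 5);
        int[] exceptions = { 1, 3 };
        int otherException = 2;

        var result = ExceptWithExtra(all, exceptions, otherException);
        Console.WriteLine(string.Join(", ", result)); // 4, 5
    }
}
```

On older targets where Append does not exist, the Plus extension shown above does the same job.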
I have the following Linq
var seq =
GetCollectionA()
.Concat(GetCollectionB())
.Concat(GetCollectionC())
.FirstOrDefault();
If GetCollectionA() returned some objects, the other two methods wrapped in Concat would still run, and for nothing. Each of those methods also returns an actual array, not a lazily evaluated, LINQ-friendly enumerable. My goal is to have the arguments to Concat evaluated only when they are actually needed. Wouldn't it be nice if Concat accepted lazily evaluated lambda expressions, like so?
var seq =
GetCollectionA()
.Concat(() => GetCollectionB())
.Concat(() => GetCollectionC())
.FirstOrDefault();
I am thinking about the following workaround. Will it work, and will it avoid calling the subsequent collection methods if the element is found in the first collection?
var seq =
GetCollectionA()
.Concat(Enumerable.Range(1, 1).SelectMany(_ => GetCollectionB()))
.Concat(Enumerable.Range(1, 1).SelectMany(_ => GetCollectionC()))
.FirstOrDefault();
Is Concat going to iterate the sequence immediately anyway, instead of putting it on the iteration pipeline?
Is there a better way?
Consider using an approach like this to enumerate the collections one at a time:
The key bit is that SmartConcat takes Func delegates rather than the results of the method calls (which is what you are currently passing). Each delegate is invoked only when enumeration reaches it, so execution can stop as soon as a match is found.
using System;
using System.Collections.Generic;
using System.Linq;
namespace Test
{
static class ExtraLINQ
{
public static IEnumerable<T> SmartConcat<T>(this IEnumerable<T> source, params Func<IEnumerable<T>>[] extras)
{
foreach (var entry in source)
yield return entry;
foreach (var laterEntries in extras)
{
foreach (var laterEntry in laterEntries())
{
yield return laterEntry;
}
}
}
}
class Program
{
static void Main(string[] args)
{
// Executes both functions
var first = GetCollectionA().Concat(GetCollectionB()).FirstOrDefault();
Console.WriteLine(first);
// Executes only the first
var otherFirst = GetCollectionA().SmartConcat(GetCollectionB).FirstOrDefault();
Console.WriteLine(otherFirst);
Console.ReadLine();
}
private static IEnumerable<int> GetCollectionA()
{
var results = new int[] { 1, 2, 3 };
Console.WriteLine("GetBob");
return results;
}
private static IEnumerable<int> GetCollectionB()
{
var results = new int[] { 4,5,6 };
Console.WriteLine("GetBob4");
return results;
}
}
}
Alternatively, if you are dealing with a reference type, consider:
var result = GetCollectionA().FirstOrDefault() ?? GetCollectionB().FirstOrDefault();
Apparently my ugly looking hack works.
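using System;
using System.Linq;

public class Program
{
    public static void Main()
    {
        Console.WriteLine("Hello World");
        // The lambda passed to SelectMany only runs if enumeration reaches the second sequence.
        var seq = GetNumbers().Concat(Enumerable.Range(1, 1).SelectMany(_ => GetNumbers())).FirstOrDefault();
        Console.WriteLine(seq);
    }

    static int[] GetNumbers()
    {
        Console.WriteLine("GetNumbers called");
        return new[] { 1, 2, 3 };
    }
}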
GetNumbers was called only once
Hello World
GetNumbers called
1
Here is the fiddle
https://dotnetfiddle.net/VDlL79
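The hack works because Concat does not start enumerating its second argument until the first is exhausted, and the SelectMany selector only runs once that enumeration begins. The same effect can be had more directly with a small iterator-based helper. This is only a sketch: Defer, LazyConcat and the Calls counter are hypothetical names introduced here, not framework members.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LazyConcat
{
    // Counts how often GetNumbers runs (demo only).
    public static int Calls;

    // Hypothetical helper. Because this is an iterator block, 'factory'
    // is not invoked until the first MoveNext on the returned sequence.
    public static IEnumerable<T> Defer<T>(Func<IEnumerable<T>> factory)
    {
        foreach (var item in factory())
            yield return item;
    }

    public static IEnumerable<int> GetNumbers()
    {
        Calls++;
        return new[] { 1, 2, 3 };
    }

    static void Main()
    {
        // The first GetNumbers call is eager; the second is wrapped and
        // never runs because FirstOrDefault stops after one element.
        var first = GetNumbers().Concat(Defer(() => GetNumbers())).FirstOrDefault();
        Console.WriteLine(first); // 1
        Console.WriteLine(Calls); // 1
    }
}
```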
So I'd like to ask: why do we only have a Select that returns an enumerable? I frequently need to transform each value of an array, for example:
int[] a = {1,2,3,4,5};
a = a.Select(x=>x*2).ToArray();
Here we get an enumerable first, and only afterwards can we convert it back into an array.
We could try Array.ForEach, but only if we are allowed to modify the source. If we have an array of reference types and can't modify them, we still have to write something like this:
SomeClass[] a = FillSomeClassArray();
SomeClass[] b = a.Select(x=> ((SomeClass)x.Clone()).Modify()).ToArray();
In my case I'm using my own helper class:
public static class CollectionHelper
{
public static TResult[] SelectToArray<T, TResult>(this ICollection<T> source, Func<T, TResult> selector)
{
if (source == null)
throw new ArgumentNullException("source");
if (selector == null)
throw new ArgumentNullException("selector");
var result = new TResult[source.Count];
int i = 0;
foreach (T t in source)
{
result[i] = selector(t);
i++;
}
return result;
}
}
Here there is no double conversion: when there is no predicate we know the length of the result, and we should use that information. I know that MS shouldn't do all the work for me, but as far as I know this functionality is standard enough.
The biggest problem with adding SelectToArray to the framework is consistency. If you add SelectToArray, you should also add each of the following:
CastToArray<T>
ConcatToArray<T>
RepeatToArray<T>
ReverseToArray<T>
SkipToArray<T>
OfTypeToArray<T>
TakeToArray<T>
While we're on the subject of adding new methods, what's wrong with adding the same optimization to lists? Now we also need
SelectToList<T> (similar to the one that started it all)
CastToList<T>
ConcatToList<T>
... and so on - I'm sure you got the idea.
Considering the minuscule savings from knowing the size of the target array or the target list, such major refactoring is impractical. You would be able to achieve the same effect with a simple method like this:
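// Extension method: place it inside a static class (requires using System;).
// Copies items from source into result, starting at pos, and stops after
// lengthOrNull items or at the end of result, whichever comes first.
public static T[] CopyToArray<T>(
    this IEnumerable<T> source
,   T[] result
,   int pos = 0
,   int? lengthOrNull = null
) {
    // Exclusive upper bound for pos, clamped to the size of the array.
    int end = Math.Min(result.Length, pos + (lengthOrNull ?? result.Length - pos));
    foreach (var item in source) {
        if (pos >= end) break;
        result[pos++] = item;
    }
    return result;
}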
Now the caller can combine the existing LINQ functionality with this method to compose all of the above XyzToArray methods, like this:
IList<MyClass> data = ...
int[] res = data.Select(x => x.IntProperty).CopyToArray(new int[data.Count]);
You would also be able to write results of LINQ queries into different parts of an existing array, like this:
IList<MyClass> data1 = ...
IList<MyClass> data2 = ...
int[] res = new int[data1.Count+data2.Count];
data1.Select(x => x.IntProperty).CopyToArray(res, 0, data1.Count);
data2.Select(x => x.IntProperty).CopyToArray(res, data1.Count, data2.Count);
If I want an empty enumeration, I can call Enumerable.Empty<T>(). But what if I want to convert a scalar type to an enumeration?
Normally I'd write new List<string> {myString} to pass myString to a function that accepts IEnumerable<string>. Is there a more LINQ-y way?
You can use Repeat:
var justOne = Enumerable.Repeat(value, 1);
Or just an array of course:
var singleElementArray = new[] { value };
The array version is mutable of course, whereas Enumerable.Repeat isn't.
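To make the mutability point concrete: a callee that receives the array as an IEnumerable<T> can cast it back and overwrite the element, which is not possible with the sequence returned by Enumerable.Repeat. A minimal sketch follows; SingleElementDemo and TryOverwriteFirst are illustrative names, not framework members.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SingleElementDemo
{
    // Tries to overwrite the first element by casting the sequence back
    // to an array. Returns true only if the cast and the write succeeded.
    public static bool TryOverwriteFirst(IEnumerable<int> sequence, int newValue)
    {
        var array = sequence as int[];
        if (array == null || array.Length == 0) return false;
        array[0] = newValue;
        return true;
    }

    static void Main()
    {
        IEnumerable<int> fromArray = new[] { 13 };
        IEnumerable<int> fromRepeat = Enumerable.Repeat(13, 1);

        Console.WriteLine(TryOverwriteFirst(fromArray, 99));  // True
        Console.WriteLine(fromArray.First());                  // 99
        Console.WriteLine(TryOverwriteFirst(fromRepeat, 99)); // False
        Console.WriteLine(fromRepeat.First());                 // 13
    }
}
```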
Perhaps the shortest form is
var sequence = new[] { value };
There is, but it's less efficient than using a List or Array:
// an enumeration containing only the number 13.
var oneIntEnumeration = Enumerable.Repeat(13, 1);
You can also write your own extension method:
public static class Extensions
{
public static IEnumerable<T> AsEnumerable<T>(this T item)
{
yield return item;
}
}
Now I haven't done that, and now that I know about Enumerable.Repeat, I probably never will (learn something new every day). But I have done this:
public static IEnumerable<T> MakeEnumerable<T>(params T[] items)
{
return items;
}
And this, of course, works if you call it with a single argument. But maybe there's something like this in the framework already, that I haven't discovered yet.
I would like to call FindLast on a collection which implements IEnumerable, but FindLast is only available for List. What is the best solution?
The equivalent to:
var last = list.FindLast(predicate);
is
var last = sequence.Where(predicate).LastOrDefault();
(The latter will have to check all items in the sequence, however...)
Effectively, Where() is the "Find" part and LastOrDefault() is the "Last" part of FindLast. Similarly, Find(predicate) would map to sequence.Where(predicate).FirstOrDefault() and FindAll(predicate) would be sequence.Where(predicate).
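If the full scan is a concern, one option is an extension method that walks backwards when the source happens to be indexable and falls back to the Where/LastOrDefault form otherwise. The sketch below is one possible shape; FindLastOrDefault and SequenceExtensions are hypothetical names, and the fast path assumes the IList<T> indexer is cheap.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SequenceExtensions
{
    // Hypothetical helper that mirrors List<T>.FindLast for any IEnumerable<T>.
    public static T FindLastOrDefault<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (predicate == null) throw new ArgumentNullException("predicate");

        var list = source as IList<T>;
        if (list != null)
        {
            // Indexable source (List<T>, arrays, ...): scan from the end
            // and stop at the first match.
            for (int i = list.Count - 1; i >= 0; i--)
            {
                if (predicate(list[i])) return list[i];
            }
            return default(T);
        }

        // Anything else can only be read forwards, so every item is checked.
        return source.Where(predicate).LastOrDefault();
    }

    static void Main()
    {
        int[] data = { 1, 2, 3, 4, 5, 6 };
        Console.WriteLine(data.FindLastOrDefault(i => i % 2 == 1));             // 5
        Console.WriteLine(data.Select(i => i).FindLastOrDefault(i => i > 100)); // 0
    }
}
```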
How about with LINQ-to-Objects:
var item = data.LastOrDefault(x=>x.Whatever == "abc"); // etc
If you only have C# 2, you can use a utility method instead:
using System;
using System.Collections.Generic;
static class Program {
static void Main() {
int[] data = { 1, 2, 3, 4, 5, 6 };
int lastOdd = SequenceUtil.Last<int>(
data, delegate(int i) { return (i % 2) == 1; });
}
}
static class SequenceUtil {
public static T Last<T>(IEnumerable<T> data, Predicate<T> predicate) {
T last = default(T);
foreach (T item in data) {
if (predicate(item)) last = item;
}
return last;
}
}
You can add your collection to a new List by passing it to the List<T> constructor:
List<MyClass> myList = new List<MyClass>(MyCol);
myList.FindLast....
Use the extension method Last(), which is located in the namespace System.Linq.
Your question is invalid because a collection has no last element. A more specialized collection that does have a complete ordering is a list. A more specialized collection that does not have an ordering is a dictionary.
I have two arrays built while parsing a text file. The first contains the column names, the second contains the values from the current row. I need to iterate over both lists at once to build a map. Right now I have the following:
var currentValues = currentRow.Split(separatorChar);
var valueEnumerator = currentValues.GetEnumerator();
foreach (String column in columnList)
{
valueEnumerator.MoveNext();
valueMap.Add(column, (String)valueEnumerator.Current);
}
This works just fine, but it doesn't quite satisfy my sense of elegance, and it gets really hairy if the number of arrays is larger than two (as I have to do occasionally). Does anyone have another, terser idiom?
You've got a non-obvious pseudo-bug in your initial code - IEnumerator<T> extends IDisposable so you should dispose it. This can be very important with iterator blocks! Not a problem for arrays, but would be with other IEnumerable<T> implementations.
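To illustrate why disposal matters with iterator blocks: a finally clause inside the iterator only runs when the enumerator is disposed or iteration runs to completion. The sketch below uses a made-up ReadValues iterator and a CleanedUp flag standing in for real cleanup such as closing a file.

```csharp
using System;
using System.Collections.Generic;

static class DisposeDemo
{
    // Set by the iterator's finally block (stands in for real cleanup).
    public static bool CleanedUp;

    // Iterator block with cleanup logic.
    public static IEnumerable<string> ReadValues()
    {
        try
        {
            yield return "a";
            yield return "b";
        }
        finally
        {
            CleanedUp = true;
        }
    }

    static void Main()
    {
        // Abandoned half-way without Dispose: the finally block never runs.
        var leaked = ReadValues().GetEnumerator();
        leaked.MoveNext();
        Console.WriteLine(CleanedUp); // False

        // Wrapped in using: Dispose triggers the finally block.
        using (var disposed = ReadValues().GetEnumerator())
        {
            disposed.MoveNext();
        }
        Console.WriteLine(CleanedUp); // True
    }
}
```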
I'd do it like this:
public static IEnumerable<TResult> PairUp<TFirst,TSecond,TResult>
(this IEnumerable<TFirst> source, IEnumerable<TSecond> secondSequence,
Func<TFirst,TSecond,TResult> projection)
{
using (IEnumerator<TSecond> secondIter = secondSequence.GetEnumerator())
{
foreach (TFirst first in source)
{
if (!secondIter.MoveNext())
{
throw new ArgumentException
("First sequence longer than second");
}
yield return projection(first, secondIter.Current);
}
if (secondIter.MoveNext())
{
throw new ArgumentException
("Second sequence longer than first");
}
}
}
Then you can reuse this whenever you have the need:
foreach (var pair in columnList.PairUp(currentRow.Split(separatorChar),
(column, value) => new { column, value }))
{
// Do something
}
Alternatively you could create a generic Pair type, and get rid of the projection parameter in the PairUp method.
EDIT:
With the Pair type, the calling code would look like this:
foreach (var pair in columnList.PairUp(currentRow.Split(separatorChar)))
{
// column = pair.First, value = pair.Second
}
That looks about as simple as you can get. Yes, you need to put the utility method somewhere, as reusable code. Hardly a problem in my view. Now for multiple arrays...
If the arrays are of different types, we have a problem. You can't express an arbitrary number of type parameters in a generic method/type declaration - you could write versions of PairUp for as many type parameters as you wanted, just like there are Action and Func delegates for up to 4 delegate parameters - but you can't make it arbitrary.
If the values will all be of the same type, however - and if you're happy to stick to arrays - it's easy. (Non-arrays is okay too, but you can't do the length checking ahead of time.) You could do this:
public static IEnumerable<T[]> Zip<T>(params T[][] sources)
{
// (Insert error checking code here for null or empty sources parameter)
int length = sources[0].Length;
if (!sources.All(array => array.Length == length))
{
throw new ArgumentException("Arrays must all be of the same length");
}
for (int i=0; i < length; i++)
{
// Could do this bit with LINQ if you wanted
T[] result = new T[sources.Length];
for (int j=0; j < result.Length; j++)
{
result[j] = sources[j][i];
}
yield return result;
}
}
Then the calling code would be:
foreach (var array in Zip(columns, row, whatevers))
{
// column = array[0]
// value = array[1]
// whatever = array[2]
}
This involves a certain amount of copying, of course - you're creating an array each time. You could change that by introducing another type like this:
public struct Snapshot<T>
{
readonly T[][] sources;
readonly int index;
public Snapshot(T[][] sources, int index)
{
this.sources = sources;
this.index = index;
}
public T this[int element]
{
get { return sources[element][index]; }
}
}
This would probably be regarded as overkill by most though ;)
I could keep coming up with all kinds of ideas, to be honest... but the basics are:
With a little bit of reusable work, you can make the calling code nicer
For arbitrary combinations of types you'll have to do each number of parameters (2, 3, 4...) separately due to the way generics works
If you're happy to use the same type for each part, you can do better
If there are the same number of column names as there are elements in each row, could you not use a for loop?
var currentValues = currentRow.Split(separatorChar);
for(var i=0;i<columnList.Length;i++){
// use i to index both (or all) arrays and build your map
}
In a functional language you would usually find a "zip" function, which will hopefully be part of C# 4.0. Bart de Smet provides a funny implementation of zip based on existing LINQ functions:
public static IEnumerable<TResult> Zip<TFirst, TSecond, TResult>(
this IEnumerable<TFirst> first,
IEnumerable<TSecond> second,
Func<TFirst, TSecond, TResult> func)
{
return first.Select((x, i) => new { X = x, I = i })
.Join(second.Select((x, i) => new { X = x, I = i }),
o => o.I,
i => i.I,
(o, i) => func(o.X, i.X));
}
Then you can do:
int[] s1 = new [] { 1, 2, 3 };
int[] s2 = new[] { 4, 5, 6 };
var result = s1.Zip(s2, (i1, i2) => new {Value1 = i1, Value2 = i2});
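Since .NET Framework 4.0 a Zip method with this shape ships in System.Linq, so on current frameworks the column/value map from the question can be built without any custom helper. A minimal sketch follows; ZipDemo and BuildMap are illustrative names. Note that the built-in Zip silently stops at the shorter sequence rather than throwing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ZipDemo
{
    // Pairs each column name with the value at the same position in the row.
    public static Dictionary<string, string> BuildMap(string[] columns, string row, char separator)
    {
        return columns
            .Zip(row.Split(separator),
                 (column, value) => new KeyValuePair<string, string>(column, value))
            .ToDictionary(pair => pair.Key, pair => pair.Value);
    }

    static void Main()
    {
        var map = BuildMap(new[] { "id", "name" }, "7;Ann", ';');
        Console.WriteLine(map["id"]);   // 7
        Console.WriteLine(map["name"]); // Ann
    }
}
```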
If you're really using arrays, the best way is probably just to use the conventional for loop with indices. Not as nice, granted, but as far as I know .NET doesn't offer a better way of doing this.
You could also encapsulate your code into a method called zip – this is a common higher-order list function. However, C# lacking a suitable Tuple type, this is quite crufty. You'd end up returning an IEnumerable<KeyValuePair<T1, T2>> which isn't very nice.
By the way, are you really using IEnumerable instead of IEnumerable<T> or why do you cast the Current value?
Using an IEnumerator for both would be nice:
var currentValues = currentRow.Split(separatorChar);
using (IEnumerator<string> valueEnum = ((IEnumerable<string>)currentValues).GetEnumerator(), columnEnum = ((IEnumerable<string>)columnList).GetEnumerator()) {
while (valueEnum.MoveNext() && columnEnum.MoveNext())
valueMap.Add(columnEnum.Current, valueEnum.Current);
}
Or create an extension method:
public static IEnumerable<TResult> Zip<T1, T2, TResult>(this IEnumerable<T1> source, IEnumerable<T2> other, Func<T1, T2, TResult> selector) {
using (IEnumerator<T1> sourceEnum = source.GetEnumerator()) {
using (IEnumerator<T2> otherEnum = other.GetEnumerator()) {
while (sourceEnum.MoveNext() && otherEnum.MoveNext())
yield return selector(sourceEnum.Current, otherEnum.Current);
}
}
}
Usage
var currentValues = currentRow.Split(separatorChar);
foreach (var valueColumnPair in currentValues.Zip(columnList, (a, b) => new { Value = a, Column = b })) {
valueMap.Add(valueColumnPair.Column, valueColumnPair.Value);
}
Instead of creating two separate arrays you could make a two-dimensional array, or a dictionary (which would be better). But really, if it works I wouldn't try to change it.