LINQ Aggregate special attention for last element of list - c#

I would like to pass in a different function to Aggregate for the last element in the collection.
A use for this would be:
List<string> listString = new List<string> {"1", "2", "3"};
string joined = listString.Aggregate(new StringBuilder(),
(sb,s) => sb.Append(s).Append(", "),
(sb,s) => sb.Append(s)).ToString();
//joined => "1, 2, 3"
What would be a custom implementation if no other exists?
P.S. I would like to do this w/ composable functions iterating once through the collection. In other words, I do not want to do a Select wrapped in a String.Join

Aggregate does not allow that in a natural way.
You can carry the previous element along and do your final handling after Aggregate. That said, I think your best bet is to write a custom method that does the special handling for the last (and possibly first) element.
Some approximate code to special-case the last item with Aggregate (it does not handle most edge cases, such as an empty or short list):
var firstLast = seq.Aggregate(
Tuple.Create(new StringBuilder(), default(string)),
(sum, cur) =>
{
if (sum.Item2 != null)
{
sum.Item1.Append(sum.Item2);
sum.Item1.Append(",");
}
return Tuple.Create(sum.Item1, cur);
});
firstLast.Item1.Append(SpecialProcessingForLast(firstLast.Item2));
return firstLast.Item1.ToString();
Aggregate with a special case for "last". The sample is ready to copy/paste into LINQPad or a console app; uncomment "this" to turn it into an extension method. Main aggregates an array by summing all but the last element, which is subtracted from the result:
void Main()
{
Console.WriteLine(AggregateWithLast(new[] {1,1,1,-3}, 0, (s,c)=>s+c, (s,c)=>s-c));
Console.WriteLine(AggregateWithLast(new[] {1,1,1,+3}, 0, (s,c)=>s+c, (s,c)=>s-c));
}
public static TAccumulate AggregateWithLast<TSource, TAccumulate>(
/*this */ IEnumerable<TSource> source,
TAccumulate seed,
Func<TAccumulate, TSource, TAccumulate> funcAll,
Func<TAccumulate, TSource, TAccumulate> funcLast)
{
using (IEnumerator<TSource> sourceIterator = source.GetEnumerator())
{
if (!sourceIterator.MoveNext())
{
return seed;
}
TSource last = sourceIterator.Current;
TAccumulate total = seed;
while (sourceIterator.MoveNext())
{
total = funcAll(total, last);
last = sourceIterator.Current;
}
return funcLast(total, last);
}
}
Note: if all you need is String.Join, the overload in .NET 4.0+ takes IEnumerable<T>, so it iterates the sequence only once with no need for ToList/ToArray.
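To connect this back to the question, here is a minimal sketch of the joining example written against AggregateWithLast. It is a self-contained C# 9+ top-level program; the class name AggregateExtensions is my own placeholder, and the method body is the one from this answer with "this" uncommented.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

var items = new List<string> { "1", "2", "3" };

// Every element except the last gets a trailing separator;
// the last element is appended without one.
string joined = items.AggregateWithLast(
    new StringBuilder(),
    (sb, s) => sb.Append(s).Append(", "),
    (sb, s) => sb.Append(s)).ToString();

Console.WriteLine(joined); // 1, 2, 3

// Placeholder class name; the method is the one shown above as an extension.
static class AggregateExtensions
{
    public static TAccumulate AggregateWithLast<TSource, TAccumulate>(
        this IEnumerable<TSource> source,
        TAccumulate seed,
        Func<TAccumulate, TSource, TAccumulate> funcAll,
        Func<TAccumulate, TSource, TAccumulate> funcLast)
    {
        using (IEnumerator<TSource> e = source.GetEnumerator())
        {
            // Empty source: nothing to aggregate, return the seed untouched.
            if (!e.MoveNext()) return seed;

            TSource last = e.Current;
            TAccumulate total = seed;
            while (e.MoveNext())
            {
                // "last" turned out not to be the final element.
                total = funcAll(total, last);
                last = e.Current;
            }
            return funcLast(total, last);
        }
    }
}
```

The source is enumerated exactly once, which is what the question asked for.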

Another approach for your particular example is to skip the comma for the first element and prepend it to the tail elements, like this:
List<string> listString = new() { "1", "2", "3" };
string joined = listString
.Select((value, index) => (value, index))
.Aggregate(new StringBuilder(), (sb, s) =>
s.index == 0
? sb.Append(s.value)
: sb.Append(", ").Append(s.value))
.ToString();
I know this does not address the question in the title, but for most "join things with some infix" problems, this works well. That is, when string.Join is not the solution.

Related

C# & LINQ, Select two (consecutive) items at once [duplicate]

This question already has answers here:
Linq to Objects - return pairs of numbers from list of numbers
(12 answers)
Closed 7 years ago.
Using LINQ on an ordered set (array, list), is there a way to select or otherwise use two consecutive items? I am imagining the syntax:
list.SelectTwo((x, y) => ...)
Where x and y are the items at index i and i + 1 in the list/array.
There may be no way to do this, which I accept as a possibility, but I would at least like to say I tried to find an answer.
I am aware that I could use something other than LINQ to achieve this.
Thank you in advance.
Another answer presents a nice and clean solution using LINQ's Skip and Zip.
It is absolutely correct, but I'd like to point out that it enumerates the source twice. That may or may not matter, depending on each individual use case. If it matters for your case, here's a longer alternative that is functionally equivalent but enumerates the source once:
static class EnumerableUtilities
{
public static IEnumerable<TResult> SelectTwo<TSource, TResult>(this IEnumerable<TSource> source,
Func<TSource, TSource, TResult> selector)
{
if (source == null) throw new ArgumentNullException(nameof(source));
if (selector == null) throw new ArgumentNullException(nameof(selector));
return SelectTwoImpl(source, selector);
}
private static IEnumerable<TResult> SelectTwoImpl<TSource, TResult>(this IEnumerable<TSource> source,
Func<TSource, TSource, TResult> selector)
{
using (var iterator = source.GetEnumerator())
{
var item2 = default(TSource);
var i = 0;
while (iterator.MoveNext())
{
var item1 = item2;
item2 = iterator.Current;
i++;
if (i >= 2)
{
yield return selector(item1, item2);
}
}
}
}
}
Example:
var seq = new[] {"A", "B", "C", "D"}.SelectTwo((a, b) => a + b);
The resulting sequence contains "AB", "BC", "CD".
System.Linq.Enumerable.Zip combines two IEnumerables by pairing up the i-th element for each i. So you just need to Zip your list with a shifted version of it.
As a nice extension method:
using System.Collections.Generic;
using System.Linq;
static class ExtMethods
{
public static IEnumerable<TResult> SelectTwo<TSource, TResult>(this IEnumerable<TSource> source,
Func<TSource, TSource, TResult> selector)
{
return Enumerable.Zip(source, source.Skip(1), selector);
}
}
Example:
Enumerable.Range(1,5).SelectTwo((a,b) => $"({a},{b})");
Results in:
(1,2) (2,3) (3,4) (4,5)
You can do
list.Skip(i).Take(2)
This will return an IEnumerable<T> with only the two consecutive items.
I think you can select an item together with data from the next item in an ordered list like this:
var theList = new List<T>();
theList
.Take(theList.Count - 1) // the last item has no successor
.Select((item, index) => new { CurrIndex = index, item.Prop1, item.Prop2, NextProp1 = theList[index + 1].Prop1 })
.Where(newItem => /* some condition on the item */ true);
The Take call keeps the index of each selected item below list size - 1, so theList[index + 1] never goes out of range.
If the source sequence has an indexer, i.e. it is at minimum an IReadOnlyList<T> (array, list as mentioned in the question), and the idea is to split the sequence into consecutive non-overlapping pairs (which is not quite clear from the question), then it can be done simply like this:
var pairs = Enumerable.Range(0, list.Count / 2)
.Select(i => Tuple.Create(list[2 * i], list[2 * i + 1]));
You can use a special overload of Select that allows you to use the index of the item, and the GroupBy method to split the list into groups. Each group would have two items. Here is an extension method that does that:
public static class ExtensionMethods
{
public static IEnumerable<TResult> SelectTwo<TSource, TResult>(this IEnumerable<TSource> source,
Func<TSource, TSource, TResult> selector)
{
return source.Select((item, index) => new {item, index})
.GroupBy(x => x.index/2)
.Select(g => g.Select(i => i.item).ToArray())
.Select(x => selector(x[0], x[1]));
}
}
And you can use it like this:
var list = new[] {1, 2, 3, 4, 5, 6};
var result = list.SelectTwo((x, y) => x + y).ToList();
This would return {3,7,11}
Please note that the above method groups the data in memory before starting to yield results. If you have large data sets, you might want to have a streaming approach (yield data as the data from the source is being enumerated), here is an example:
public static class ExtensionMethods
{
public static IEnumerable<TResult> SelectTwo<TSource, TResult>(this IEnumerable<TSource> source,
Func<TSource, TSource, TResult> selector)
{
bool first_item_got = false;
TSource first_item = default(TSource);
foreach (var item in source)
{
if (first_item_got)
{
yield return selector(first_item, item);
}
else
{
first_item = item;
}
first_item_got = !first_item_got;
}
}
}

LINQ: Collapsing a series of strings into a set of "ranges"

I have an array of strings similar to this (shown on separate lines to illustrate the pattern):
{ "aa002","aa003","aa004","aa005","aa006","aa007", // note that aa008 is missing
"aa009",
"ba023","ba024","ba025",
"bb025",
"ca002","ca003",
"cb004",
...}
...and the goal is to collapse those strings into this comma-separated string of "ranges":
"aa002-aa007,aa009,ba023-ba025,bb025,ca002-ca003,cb004, ... "
I want to collapse them so I can construct a URL. There are hundreds of elements, but I can still convey all the information if I collapse them this way - putting them all into a URL "longhand" (it has to be a GET, not a POST) isn't feasible.
I've had the idea to separate them into groups using the first two characters as the key - but does anyone have any clever ideas for collapsing those sequences (without gaps) into ranges? I'm struggling with it, and everything I've come up with looks like spaghetti.
So the first thing that you need to do is parse the strings. It's important to have the alphabetic prefix and the integer value separately.
Next you want to group the items on the prefix.
For each of the items in that group, you want to order them by number, and then group items while the previous value's number is one less than the current item's number. (Or, put another way, while the previous item plus one is equal to the current item.)
Once you've grouped all of those items you want to project that group out to a value based on that range's prefix, as well as the first and last number. No other information from these groups is needed.
We then flatten the list of strings for each group into just a regular list of strings, since once we're all done there is no need to separate out ranges from different groups. This is done using SelectMany.
When that's all said and done, that, translated into code, is this:
public static IEnumerable<string> Foo(IEnumerable<string> data)
{
return data.Select(item => new
{
Prefix = item.Substring(0, 2),
Number = int.Parse(item.Substring(2))
})
.GroupBy(item => item.Prefix)
.SelectMany(group => group.OrderBy(item => item.Number)
.GroupWhile((prev, current) =>
prev.Number + 1 == current.Number)
.Select(range =>
RangeAsString(group.Key,
range.First().Number,
range.Last().Number)));
}
The GroupWhile method can be implemented like so:
public static IEnumerable<IEnumerable<T>> GroupWhile<T>(
this IEnumerable<T> source, Func<T, T, bool> predicate)
{
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
yield break;
List<T> list = new List<T>() { iterator.Current };
T previous = iterator.Current;
while (iterator.MoveNext())
{
if (!predicate(previous, iterator.Current))
{
yield return list;
list = new List<T>();
}
list.Add(iterator.Current);
previous = iterator.Current;
}
yield return list;
}
}
And then the simple helper method to convert each range into a string:
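private static string RangeAsString(string prefix, int start, int end)
{
// int.Parse drops the leading zeros; "D3" restores them.
// This assumes three-digit numbers, as in the sample data.
if (start == end)
return prefix + start.ToString("D3");
else
return string.Format("{0}{1:D3}-{0}{2:D3}", prefix, start, end);
}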
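Putting the three pieces together, a sketch of the whole pipeline might look like the following. The names RangeCollapser and Collapse are placeholders of mine (the answer calls the method Foo), and the "D3" format is an assumption based on the three-digit, zero-padded numbers in the sample data; adjust it if your identifiers have a different width.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

string[] data =
{
    "aa002", "aa003", "aa004", "aa005", "aa006", "aa007",
    "aa009", "ba023", "ba024", "ba025", "bb025",
    "ca002", "ca003", "cb004",
};

// Join the collapsed ranges into the comma-separated form needed for the URL.
string collapsed = string.Join(",", RangeCollapser.Collapse(data));
Console.WriteLine(collapsed);
// aa002-aa007,aa009,ba023-ba025,bb025,ca002-ca003,cb004

static class RangeCollapser
{
    public static IEnumerable<string> Collapse(IEnumerable<string> data)
    {
        return data
            // Split each string into its two-character prefix and its number.
            .Select(item => new
            {
                Prefix = item.Substring(0, 2),
                Number = int.Parse(item.Substring(2))
            })
            .GroupBy(item => item.Prefix)
            .SelectMany(group => group
                .OrderBy(item => item.Number)
                // Stay in the same run while numbers are consecutive.
                .GroupWhile((prev, current) => prev.Number + 1 == current.Number)
                .Select(range => RangeAsString(
                    group.Key, range.First().Number, range.Last().Number)));
    }

    public static IEnumerable<IEnumerable<T>> GroupWhile<T>(
        this IEnumerable<T> source, Func<T, T, bool> predicate)
    {
        using (var iterator = source.GetEnumerator())
        {
            if (!iterator.MoveNext())
                yield break;

            var list = new List<T> { iterator.Current };
            T previous = iterator.Current;
            while (iterator.MoveNext())
            {
                if (!predicate(previous, iterator.Current))
                {
                    // The run is broken: emit it and start a new one.
                    yield return list;
                    list = new List<T>();
                }
                list.Add(iterator.Current);
                previous = iterator.Current;
            }
            yield return list;
        }
    }

    // Assumption: numbers are three digits wide, so "D3" restores the padding.
    private static string RangeAsString(string prefix, int start, int end)
    {
        return start == end
            ? prefix + start.ToString("D3")
            : string.Format("{0}{1:D3}-{0}{2:D3}", prefix, start, end);
    }
}
```

GroupBy yields groups in the order their keys first appear, so the output keeps the prefix order of the input.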
Here's a LINQ version without the need to add new extension methods:
var data2 = data.Skip(1).Zip(data, (d1, d0) => new
{
value = d1,
jump = d1.Substring(0, 2) == d0.Substring(0, 2)
? int.Parse(d1.Substring(2)) - int.Parse(d0.Substring(2))
: -1,
});
var agg = new { f = data.First(), t = data.First(), };
var query2 =
data2
.Aggregate(new [] { agg }.ToList(), (a, x) =>
{
var last = a.Last();
if (x.jump == 1)
{
a.RemoveAt(a.Count() - 1);
a.Add(new { f = last.f, t = x.value, });
}
else
{
a.Add(new { f = x.value, t = x.value, });
}
return a;
});
var query3 =
from q in query2
select (q.f) + (q.f == q.t ? "" : "-" + q.t);
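For the sample data in the question, query3 yields "aa002-aa007", "aa009", "ba023-ba025", "bb025", "ca002-ca003", "cb004".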

How do I select everything after the last instance of?

I asked yesterday how to select everything after the first instance of a flag in a collection, and the answer was to use SkipWhile, which works great. But now the logic has changed, and I need a way to select the last instance and everything after it.
A bit more detail:
The list contains an ordered list with a number of configurations, and each has a flag called IsTop. What I need to do is find the last instance of IsTop == true, grab that and everything after it.
Can this be done in LINQ or do I have to ToArray() it and do it by hand, so to speak?
You can use Reverse to handle this, and swap out SkipWhile for TakeWhile.
var query = sequence.Reverse()
.TakeWhile(item => !item.IsTop)
.Reverse(); //to get back in the original order; remove if not needed
Unfortunately, the above method doesn't include the last item where IsTop is true. Including it is a tad more complex, and the "easiest" ways of doing so iterate the sequence several times, so they should really only be used on a List, array, or other data structure that can access items by index (i.e., an IList). Here is a method that handles it:
public static IEnumerable<T> Foo<T>(IList<T> data, Func<T, bool> isDivisor)
{
int itemsToTake = data.Reverse()
.TakeWhile(item => !isDivisor(item)) // count the trailing items after the last divisor
.Count() + 1;
return data.Skip(data.Count - itemsToTake);
}
Another approach that is more "proper" relies on a helper method. It groups items for as long as a predicate says the current item belongs to the current group: if the predicate returns true, the item is added to the current group; if it returns false, the current group is finished and a new group is started with that item. The helper method is as follows:
public static IEnumerable<IEnumerable<T>> GroupWhile<T>(
this IEnumerable<T> source, Func<T, bool> predicate)
{
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
yield break;
List<T> list = new List<T>() { iterator.Current };
while (iterator.MoveNext())
{
if (predicate(iterator.Current))
{
list.Add(iterator.Current);
}
else
{
yield return list;
list = new List<T>() { iterator.Current };
}
}
yield return list;
}
}
Using this it's actually rather straightforward:
var query = sequence.GroupWhile(item => !item.IsTop)
.Last();
Conceptually this models what we're doing the best. We're creating groups in which each group goes from one IsTop item to the next, and then we just want the last group (or the first group, for your other question).
You can write your own simple extension method to do this:
// takes items until the first one where predicate is true;
// includes the first item where predicate is true
public static IEnumerable<TSource> TakeUntil<TSource>(
this IEnumerable<TSource> source,
Func<TSource, bool> predicate
)
{
foreach (var item in source)
{
yield return item;
if (predicate(item))
break;
}
}
public static IEnumerable<TSource> TakeLastUntil<TSource>(
this IEnumerable<TSource> source,
Func<TSource, bool> predicate
)
{
return source.Reverse().TakeUntil(predicate).Reverse();
}
Use like:
var myList = new[]
{
new { IsTop = false, S = 'a' },
new { IsTop = true, S = 'b' },
new { IsTop = false, S = 'c' },
new { IsTop = true, S = 'd' },
new { IsTop = false, S = 'e' },
}.ToList();
myList.TakeLastUntil(x => x.IsTop); // has d and e
This might iterate the list more than is necessary. If that's a problem (e.g. because you have a very long list) and you are working with some sort of IList<T> instead of just an IEnumerable<T>, it should be possible to write these methods more efficiently for IList<T>.
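As a rough sketch of what an IList<T>-specific version could look like: walk backwards by index to find the last match, then yield from that position to the end. The names TakeFromLast and ListTailExtensions are hypothetical, not part of LINQ. I have assumed the same behaviour as TakeLastUntil above when nothing matches, namely that the whole list is returned.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var flags = new List<(bool IsTop, char S)>
{
    (false, 'a'), (true, 'b'), (false, 'c'), (true, 'd'), (false, 'e'),
};

var tail = flags.TakeFromLast(x => x.IsTop);
Console.WriteLine(string.Join(",", tail.Select(x => x.S))); // d,e

// Hypothetical helper class; not part of System.Linq.
static class ListTailExtensions
{
    // Yields the last item matching the predicate and everything after it.
    // If no item matches, the whole list is yielded (assumed behaviour,
    // chosen to mirror TakeLastUntil).
    public static IEnumerable<T> TakeFromLast<T>(
        this IList<T> source, Func<T, bool> predicate)
    {
        // Scan from the end; stops at the last matching index or at -1.
        int start = source.Count - 1;
        while (start >= 0 && !predicate(source[start]))
            start--;

        if (start < 0)
            start = 0;

        for (int i = start; i < source.Count; i++)
            yield return source[i];
    }
}
```

This touches each trailing item at most twice and never copies or reverses the list.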

LINQ, simplifying expression - take while sum of taken does not exceed given value

Given a setup like this ..
class Product {
public int Cost { get; set; }
// other properties unimportant
}
var products = new List<Product> {
new Product { Cost = 5 },
new Product { Cost = 10 },
new Product { Cost = 15 },
new Product { Cost = 20 }
};
var credit = 15;
Assume that the list will be sorted in the given order. I wish to basically iterate over each item in the list, keep a summed value of cost, and keep getting products out as long as the total cost does not exceed credit.
I can do this with some loops and stuff, but I was wondering if there is a way to compact it into a simpler LINQ query.
Others have pointed out the captured variable approach, and there are arguably correct viewpoints that this approach is bad because it mutates state. Additionally, the captured variable approaches can only be iterated once, and are dangerous because a. you might forget that fact and try to iterate twice; b. the captured variable does not reflect the sum of the items taken.
To avoid these problems, just create an extension method:
public static IEnumerable<TSource> TakeWhileAggregate<TSource, TAccumulate>(
this IEnumerable<TSource> source,
TAccumulate seed,
Func<TAccumulate, TSource, TAccumulate> func,
Func<TAccumulate, bool> predicate
) {
TAccumulate accumulator = seed;
foreach (TSource item in source) {
accumulator = func(accumulator, item);
if (predicate(accumulator)) {
yield return item;
}
else {
yield break;
}
}
}
Usage:
var taken = products.TakeWhileAggregate(
0,
(cost, product) => cost + product.Cost,
cost => cost <= credit
);
Note that NOW you can iterate twice (although be careful if your TAccumulate is a mutable reference type).
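For the data in the question, a self-contained run might look like this. It is only a sketch: the class name TakeWhileAggregateExtensions is a placeholder, and Product is reduced to a single public Cost property.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var products = new List<Product>
{
    new Product { Cost = 5 },
    new Product { Cost = 10 },
    new Product { Cost = 15 },
    new Product { Cost = 20 },
};
int credit = 15;

var taken = products
    .TakeWhileAggregate(0, (cost, product) => cost + product.Cost, cost => cost <= credit)
    .ToList();

// Running totals are 5, 15, 30: the third product exceeds the credit,
// so enumeration stops before yielding it.
Console.WriteLine(string.Join(", ", taken.Select(p => p.Cost))); // 5, 10

class Product
{
    public int Cost { get; set; }
}

// Placeholder class name; the method is the one from this answer.
static class TakeWhileAggregateExtensions
{
    public static IEnumerable<TSource> TakeWhileAggregate<TSource, TAccumulate>(
        this IEnumerable<TSource> source,
        TAccumulate seed,
        Func<TAccumulate, TSource, TAccumulate> func,
        Func<TAccumulate, bool> predicate)
    {
        // The accumulator is local to each enumeration, so the query
        // can safely be iterated more than once.
        TAccumulate accumulator = seed;
        foreach (TSource item in source)
        {
            accumulator = func(accumulator, item);
            if (predicate(accumulator))
                yield return item;
            else
                yield break;
        }
    }
}
```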
Not "fully" linq, because it needs one extra variable, but it is the easiest I could think of:
int current=0;
var selection = products.TakeWhile(p => (current = current + p.Cost) <= credit);
You can do this if you want a solution without an external variable
var indexQuery = products.Select((x,index) => new { Obj = x, Index = index });
var query = from p in indexQuery
let RunningTotal = indexQuery.Where(x => x.Index <= p.Index)
.Sum(x => x.Obj.Cost)
where credit >= RunningTotal
select p.Obj;
OK, re my comment on @Aducci's answer above, here's a version using Scan:
var result=products.Scan(new {Product=(Product)null, RunningSum=0},
(self, next) => new {Product=next, RunningSum=self.RunningSum+next.Cost})
.Where(x=>x.RunningSum<=credit)
.Select(x => x.Product);
And this is my implementation of Scan (which I assume is similar to what's in the Rx Framework, but I haven't checked)
public static IEnumerable<TAccumulate> Scan<TSource, TAccumulate>(this IEnumerable<TSource> source,
TAccumulate seed, Func<TAccumulate, TSource, TAccumulate> accumulator) {
foreach(var item in source) {
seed=accumulator(seed, item);
yield return seed;
}
}
Use a captured variable to track the amount taken so far.
int sum = 0;
IEnumerable<Product> query = products.TakeWhile(p =>
{
bool canAfford = (sum + p.Cost) <= credit;
sum = canAfford ? sum + p.Cost : sum;
return canAfford;
});

How take each two items from IEnumerable as a pair?

I have IEnumerable<string> which looks like {"First", "1", "Second", "2", ... }.
I need to iterate through the list and create IEnumerable<Tuple<string, string>> where Tuples will look like:
"First", "1"
"Second", "2"
So I need to turn the list I have into pairs as shown above.
A lazy extension method to achieve this is:
public static IEnumerable<Tuple<T, T>> Tupelize<T>(this IEnumerable<T> source)
{
using (var enumerator = source.GetEnumerator())
while (enumerator.MoveNext())
{
var item1 = enumerator.Current;
if (!enumerator.MoveNext())
throw new ArgumentException();
var item2 = enumerator.Current;
yield return new Tuple<T, T>(item1, item2);
}
}
Note that this will throw if the number of elements is not even. Another way would be to use this extension method to split the source collection into chunks of 2:
public static IEnumerable<IEnumerable<T>> Chunk<T>(this IEnumerable<T> list, int batchSize)
{
var batch = new List<T>(batchSize);
foreach (var item in list)
{
batch.Add(item);
if (batch.Count == batchSize)
{
yield return batch;
batch = new List<T>(batchSize);
}
}
if (batch.Count > 0)
yield return batch;
}
Then you can do:
var tuples = items.Chunk(2)
.Select(x => new Tuple<string, string>(x.First(), x.Skip(1).First()))
.ToArray();
Finally, to use only existing extension methods:
var tuples = items.Where((x, i) => i % 2 == 0)
.Zip(items.Where((x, i) => i % 2 == 1),
(a, b) => new Tuple<string, string>(a, b))
.ToArray();
morelinq contains a Batch extension method which can do what you want:
var str = new string[] { "First", "1", "Second", "2", "Third", "3" };
var tuples = str.Batch(2, r => new Tuple<string, string>(r.FirstOrDefault(), r.LastOrDefault()));
You could do something like:
var pairs = source.Select((value, index) => new {Index = index, Value = value})
.GroupBy(x => x.Index / 2)
.Select(g => new Tuple<string, string>(g.ElementAt(0).Value,
g.ElementAt(1).Value));
This will get you an IEnumerable<Tuple<string, string>>. It works by grouping the elements by index / 2, so that each group holds one consecutive pair, and then expanding each group into a Tuple. The benefit of this approach over the Zip approach suggested by BrokenGlass is that it only enumerates the original enumerable once.
It is however hard for someone to understand at first glance, so I would either do it another way (ie. not using linq), or document its intention next to where it is used.
You can make this work using the LINQ .Zip() extension method:
IEnumerable<string> source = new List<string> { "First", "1", "Second", "2" };
var tupleList = source.Zip(source.Skip(1),
(a, b) => new Tuple<string, string>(a, b))
.Where((x, i) => i % 2 == 0)
.ToList();
Basically the approach is zipping up the source Enumerable with itself, skipping the first element so the second enumeration is one off. That will give you the pairs ("First", "1"), ("1", "Second"), ("Second", "2").
Then we filter out the odd-indexed tuples since we don't want those, and end up with the right tuple pairs ("First", "1"), ("Second", "2") and so on.
Edit:
I actually agree with the sentiment of the comments - this is what I would consider "clever" code - looks smart, but has obvious (and not so obvious) downsides:
1. Performance: the Enumerable has to be traversed twice - for the same reason it cannot be used on Enumerables that consume their source, i.e. data from network streams.
2. Maintenance: It's not obvious what the code does - if someone else is tasked to maintain the code there might be trouble ahead, especially given point 1.
Having said that, I'd probably use a good old foreach loop myself given the choice, or with a list as source collection a for loop so I can use the index directly.
IEnumerable<T> items = ...;
using (var enumerator = items.GetEnumerator())
{
while (enumerator.MoveNext())
{
T first = enumerator.Current;
bool hasSecond = enumerator.MoveNext();
Trace.Assert(hasSecond, "Collection must have even number of elements.");
T second = enumerator.Current;
var tuple = new Tuple<T, T>(first, second);
//Now you have the tuple
}
}
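When the source is a list, the index-based for loop mentioned above could be sketched as follows. This is an illustration only; the variable names are mine. I have chosen to silently drop a trailing unpaired element here instead of asserting, which may or may not suit your case.

```csharp
using System;
using System.Collections.Generic;

var source = new List<string> { "First", "1", "Second", "2" };
var pairs = new List<Tuple<string, string>>();

// Step through the list two items at a time. The condition i + 1 < Count
// means a trailing element without a partner is skipped, not paired.
for (int i = 0; i + 1 < source.Count; i += 2)
{
    pairs.Add(Tuple.Create(source[i], source[i + 1]));
}

foreach (var pair in pairs)
{
    Console.WriteLine($"{pair.Item1}, {pair.Item2}");
}
```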
Starting with .NET 6.0, you can use
Enumerable.Chunk<TSource>(IEnumerable<TSource>, Int32):
var tuples = new[] {"First", "1", "Second", "2", "Incomplete" }
.Chunk(2)
.Where(chunk => chunk.Length == 2)
.Select(chunk => (chunk[0], chunk[1]));
If you are using .NET 4.0, then you can use the Tuple class (see http://mutelight.org/articles/finally-tuples-in-c-sharp.html). Together with LINQ it should give you what you need. If not, then you probably need to define your own tuple type, or encode the strings as, for example, "First:1", "Second:2" and then decode them (also with LINQ).
