Zip N IEnumerable<T>s together? Iterate over them simultaneously? - c#

I have:-
IEnumerable<IEnumerable<T>> items;
and I'd like to create:-
IEnumerable<IEnumerable<T>> results;
where the first item in "results" is an IEnumerable of the first item of each of the IEnumerables of "items", the second item in "results" is an IEnumerable of the second item of each of "items", etc.
The IEnumerables aren't necessarily the same lengths. If some of the IEnumerables in items don't have an element at a particular index, then I'd expect the matching IEnumerable in results to have fewer items in it.
For example:-
items = { "1", "2", "3", "4" } , { "a", "b", "c" };
results = { "1", "a" } , { "2", "b" }, { "3", "c" }, { "4" };
Edit: Another example (requested in comments):-
items = { "1", "2", "3", "4" } , { "a", "b", "c" }, { "p", "q", "r", "s", "t" };
results = { "1", "a", "p" } , { "2", "b", "q" }, { "3", "c", "r" }, { "4", "s" }, { "t" };
I don't know in advance how many sequences there are, nor how many elements are in each sequence. I might have 1,000 sequences with 1,000,000 elements in each, and I might only need the first ~10, so I'd like to use the (lazy) enumeration of the source sequences if I can. In particular I don't want to create a new data structure if I can help it.
Is there a built-in method (similar to IEnumerable.Zip) that can do this?
Is there another way?

Now lightly tested and with working disposal.
public static class Extensions
{
public static IEnumerable<IEnumerable<T>> JaggedPivot<T>(
this IEnumerable<IEnumerable<T>> source)
{
List<IEnumerator<T>> originalEnumerators = source
.Select(x => x.GetEnumerator())
.ToList();
try
{
List<IEnumerator<T>> enumerators = originalEnumerators
.Where(x => x.MoveNext()).ToList();
while (enumerators.Any())
{
List<T> result = enumerators.Select(x => x.Current).ToList();
yield return result;
enumerators = enumerators.Where(x => x.MoveNext()).ToList();
}
}
finally
{
originalEnumerators.ForEach(x => x.Dispose());
}
}
}
public class TestExtensions
{
public void Test1()
{
IEnumerable<IEnumerable<int>> myInts = new List<IEnumerable<int>>()
{
Enumerable.Range(1, 20).ToList(),
Enumerable.Range(21, 5).ToList(),
Enumerable.Range(26, 15).ToList()
};
foreach(IEnumerable<int> x in myInts.JaggedPivot().Take(10))
{
foreach(int i in x)
{
Console.Write("{0} ", i);
}
Console.WriteLine();
}
}
}

It's reasonably straightforward to do if you can guarantee how the results are going to be used. However, if the results might be used in an arbitrary order, you may need to buffer everything. Consider this:
var results = MethodToBeImplemented(sequences);
var iterator = results.GetEnumerator();
iterator.MoveNext();
var first = iterator.Current;
iterator.MoveNext();
var second = iterator.Current;
foreach (var x in second)
{
// Do something
}
foreach (var x in first)
{
// Do something
}
In order to get at the items in "second" you'll have to iterate over all of the subsequences, past the first items. If you then want it to be valid to iterate over the items in first you either need to remember the items or be prepared to re-evaluate the subsequences.
Likewise you'll either need to buffer the subsequences as IEnumerable<T> values or reread the whole lot each time.
Basically it's a whole can of worms which is difficult to do elegantly in a way which will work pleasantly for all situations :( If you have a specific situation in mind with appropriate constraints, we may be able to help more.

Based on David B's answer, this code should perform better:
public static IEnumerable<IEnumerable<T>> JaggedPivot<T>(
this IEnumerable<IEnumerable<T>> source)
{
var originalEnumerators = source.Select(x => x.GetEnumerator()).ToList();
try
{
var enumerators =
new List<IEnumerator<T>>(originalEnumerators.Where(x => x.MoveNext()));
while (enumerators.Any())
{
yield return enumerators.Select(x => x.Current).ToList();
enumerators.RemoveAll(x => !x.MoveNext());
}
}
finally
{
originalEnumerators.ForEach(x => x.Dispose());
}
}
The difference is that the enumerators variable isn't re-created all the time.

Here's one that is a bit shorter, but no doubt less efficient:
Enumerable.Range(0,items.Select(x => x.Count()).Max())
.Select(x => items.SelectMany(y => y.Skip(x).Take(1)));

What about this?
List<string[]> items = new List<string[]>()
{
new string[] { "a", "b", "c" },
new string[] { "1", "2", "3" },
new string[] { "x", "y" },
new string[] { "y", "z", "w" }
};
var x = from i in Enumerable.Range(0, items.Max(a => a.Length))
select from z in items
where z.Length > i
select z[i];

You could compose existing operators like this,
IEnumerable<IEnumerable<int>> myInts = new List<IEnumerable<int>>()
{
Enumerable.Range(1, 20).ToList(),
Enumerable.Range(21, 5).ToList(),
Enumerable.Range(26, 15).ToList()
};
myInts.SelectMany(item => item.Select((number, index) => Tuple.Create(index, number)))
.GroupBy(item => item.Item1)
.Select(group => group.Select(tuple => tuple.Item2));

Related

How can I create a list of each item in another list, and the count of how many times that item appears?

So I have a list of items -> A, B, C, D.
C and D are included more than once, and A and B, more than twice. This list can go on and on, so we do not know how many times an item will be included.
I need to create a new list that will have the item in one column and the number of instances of that item in another column, but I do not know how to do this. I may need to use a tuple or a class, but I am not fully sure how to implement either...
What you actually need is to Group the items of your list and perform a group operation, which is Count in your case to calculate how many times does it exist.
This is how you may initialize your list:
List<string> myList = new List<string>() { "A", "B", "C", "D", "A", "B", "C", "D", "A", "B" };
and then you will group it using GroupBy function and apply the Count aggregate function on each group.
myList
.GroupBy(item => item)
.Select(g => new {Key = g.Key, Count = g.Count()})
.ToList();
This will result in the table you need.
You can try like this:
var myList = new List<String>() { "A","B", "C", "D","A","B", "C", "D", "A","B"};
var grp = myList.GroupBy( x => x );
foreach( var g in grp )
{
Console.WriteLine( "{0} {1}", g.Key, g.Count() );
}
DOTNET FIDDLE
char[] items = new[] { 'A', 'B', 'C', 'D', 'A', 'B', 'C', 'D', 'A', 'B' };
Dictionary<char, int> counts = new();
foreach(char c in items)
{
if (counts.TryGetValue(c, out int n))
{
counts[c] = n + 1;
}
else
{
counts.Add(c, 1);
}
}
While not a one liner, a simple and fast option.
I may need to use a tuple or a class, but I am not fully sure how to implement either...
Since you mentioned you may want to use a class, here is an example:
public class TextCount
{
public string Text { get; set; }
public int Count { get; set; }
}
public class Program
{
public static void Main()
{
// Initialize the list of strings
List<string> data = new List<string> { "A", "B", "C", "D", "A", "B", "C", "D", "A", "B" };
// Use LINQ to group the strings by their value and count the number of occurrences of each string
List<TextCount> result = data
.GroupBy(s => s)
.Select(g => new TextCount { Text = g.Key, Count = g.Count() })
.ToList();
// Print the results
foreach (TextCount sc in result)
{
Console.WriteLine("{0}: {1}", sc.Text, sc.Count);
}
}
}
Demo: https://dotnetfiddle.net/2FRBbK

Group array of string arrays with LINQ

I have array like this, values are string:
var arr1 = new [] { "H", "item1", "item2" };
var arr2 = new [] { "T", "thing1", "thing2" };
var arr3 = new [] { "T", "thing1", "thing2" };
var arr4 = new [] { "END", "something" };
var arr5 = new [] { "H", "item1", "item2" };
var arr6 = new [] { "T", "thing1", "thing2" };
var arr7 = new [] { "T", "thing1", "thing2" };
var arr8 = new [] { "END", "something" };
var allArrays = new [] { arr1, arr2, arr3, arr4, arr5, arr6, arr7, arr8 };
I need to group this in to a new array of arrays, so that one array has arrays that start with H or T. The END records (not included in the results) are the delimiters between each section; each new array starts after an END array.
In the end I would like to have somethng like this:
[
[ [H, item1, item2], [T, thing1, thing2], [T, thing1, thing2] ]
[ [H, item1, item2], [T, thing1, thing2], [T, thing1, thing2] ]
]
I know how I can do this with for each loop, but I'm looking for a cleaner way, possibly using linq. All suggestions are much valued, thank you!
you can try this
List<string[]> list = new List<string[]>();
var newArr = allArrays.Select(a => AddToArr(list, a)).Where(a => a != null);
and helper (this code can be put inline, but it easier to read this way)
private static string[][] AddToArr(List<string[]> list, string[] arr)
{
if (arr[0] != "END")
{
list.Add(arr);
return null;
}
var r = list.ToArray();
list.Clear();
return r;
}
result
[
[["H","item1","item2"],["T","thing1","thing2"],["T","thing1","thing2"]],
[["H","item3","item4"],["T","thing3","thing4"],["T","thing5","thing6"]]
]
So arr1, arr2, etc are string[].
allArrays is a string[][].
I hope you gave a meaningful example. From this example it seems that you want all string[] from allArrays, except the string[] that have a [0] that equals the word "END".
If this is what you want, your result is:
string[][] result = allArrays.Where(stringArray => stringArray[0] != "END");
I need to group this in to a new array of arrays, so that one array has arrays that start with H or T. The END records (not included in the results) are the delimiters between each section; each new array starts after an END array.
This is not exactly the same as I see in your example: what if one of the string arrays in allArrays is an empty array, or if it has the value null values. What if one of the the arrays of strings is empty (= length 0), and what if one of the string arrays doesn't start with "H", nor "T", nor "END"?
Literally you say that you only want the string arrays that start with "H" or "T", no other ones. You don't want string arrays that are null, nor empty string arrays. You also don't want string arrays that start with "END", nor the ones that start with String.Empty, or "A" or "B" or anything else than "H" or "T".
If I take your requirement literally, your code should be:
string[] requiredStringAtIndex0 = new string[] {"H", "T"};
string[][] result = allArrays.Where(stringArray => stringArray != null
&& stringArray.Length != 0
&& requiredStringAtIndex0.Contains(stringArray[0]));
In words: from allArrays, keep only those arrays of strings, that are not null, AND that have at least one element AND where the element at index 0 contains either "H" or "T"
Normally I would use an extension method for grouping runs of items based on a predicate, in this case GroupByEndingWith and then throw away the "END" record, like so:
var ans = allArrays.GroupByEndingWith(r => r[0] == "END")
.Select(g => g.Drop(1).ToArray())
.ToArray();
But, in general, you can use Aggregate to collect items based on a predicate at the expense of comprehension. It often helps to use a tuple to track an overall accumulator and a sub-accumulator. Unfortunately, there is no + operator or Append for List<T> that returns the original list (helpful for expression based accumulation) and since C# doesn't yet have a comma operator equivalent, you need an extension method again or you can use ImmutableList.
Using Aggregate and ImmutableList, you can do:
var ans = allArrays.Aggregate(
(ans: ImmutableList<ImmutableList<string[]>>.Empty, curr: ImmutableList<string[]>.Empty),
(ac, r) => r[0] == "END"
? (ac.ans.Add(ac.curr), ImmutableList<string[]>.Empty)
: (ac.ans, ac.curr.Add(r))
).ans
.Select(l => l.ToArray())
.ToArray();
NOTE: You can also do this with List if you are willing to create new Lists a lot:
var ans = allArrays.Aggregate(
(ans: new List<List<string[]>>(), curr: new List<string[]>()),
(ac, r) => r[0] == "END"
? (ac.ans.Concat(new[] { ac.curr }).ToList(), new List<string[]>())
: (ac.ans, ac.curr.Concat(new[] { r }).ToList())
).ans
.Select(l => l.ToArray())
.ToArray();
Here is a simple implementation.
public static void Main(string[] args)
{
var data = ConvertToArrayOfArray(arr1, arr2, arr3, arrr4, arr5, arr6, arr7, arr8);
}
private string[][] ConvertToArrayOfArray(params string[][] arrs)
{
List<string[]> yoList = new List<string[]>();
arrs.ToList().ForEach(x =>
{
if(!x[0] == "END") yoList.Add(x);
});
return yoList.ToArray();
}

linq query in list<list<T>>

I want to extract only the elements of List corresponding to a specific index from List<List>.
Tried to do this using LINQ Query and not For or Foreach, but I wasn't lucky
I want the result like below
List<List<string>> list = new List<List<string>>();
list.Add(new List<string> { "1", "2", "3" });
list.Add(new List<string> { "A", "B", "C" });
// expectation result List : index 1
// { "2", "B" }
Use Select.
int index;
list.Select(l => l[index]);

How to split the elements ​of an array of strings if they are equal?

Good day,
I currently have a string array like this:
string[] array = {"aa","bb","cc","dd","aa","cc","ee","ff","aa","bb"}
I would like to be able to get the positions that are the same from the same string [], example:
string[] a = {"aa","aa","aa"}
string[] b = {"bb","bb"}
string[] c = {"cc","cc"}
string[] d = {"dd"}
string[] e = {"ee"}
string[] f = {"ff"}
It should be noted that the elements of the parent matrix always change and are not always the same.
I tried with linq, but I don't get what I'm looking for.
this was my attempt with linq:
array.Where(x => array.Contains(x)).ToArray();
Thanks for help me!
Despite seeing what you ask for, the result you want is pretty limited an not useful to work with later on. You should take advantage of using the GroupBy in linq and then when you need something find it in that collection.
// your array
string[] array = {"aa","bb","cc","dd","aa","cc","ee","ff","aa","bb"};
// group by value
var groupedValues = array.GroupBy(x => x).ToList();
// get the "aa" group if exist
var aa = groupedValues.FirstOrDefault(x => x.Key == "aa");
// check if the group was found
if(aa != null)
{
// get all "aa" values in that group. This return this collection based on your inpit{ "aa", "aa", "aa" }
var allaaValues = aa.ToList();
}
Hope this is a solution you were looking for, good luck!
string[] array = { "aa", "bb", "cc", "dd", "aa", "cc", "ee", "ff", "aa", "bb" };
var splittedArray = new List<string[]>();
foreach (var strItem in array)
{
//Don't iterating duplicates
if (splittedArray.Any(si => si.Contains(strItem))) continue;
//if more then one item exists in the array getting those identic items and adding to the array list
if (array.Count(si => si.Equals(strItem)) > 0)
{
var identicItems = array
.Where(i => i.Equals(strItem))
.ToArray();
splittedArray.Add(identicItems);
}
else // Adding single item as a new array with this item
{
splittedArray.Add(new string[] { strItem });
}
}

Find common items in list of lists of strings

Hi I have allLists that contains lists of string I want to find common items among these string lists
i have tried
var intersection = allLists
.Skip(1)
.Aggregate(
new HashSet<string>(allLists.First()),
(h, e) => { h.IntersectWith(e); return h);`
and also intersection ( hard code lists by index) all of them did not work when I tried
var inter = allLists[0].Intersect(allLists[1]).Intersect(allLists[2])
.Intersect(allLists[3]).ToList();
foreach ( string s in inter) Debug.WriteLine(s+"\n ");
So how am I going to do this dynamically and get common string items in the lists;
is there a way to avoid Linq?
Isn't this the easiest way?
var stringLists = new List<string>[]
{
new List<string>(){ "a", "b", "c" },
new List<string>(){ "d", "b", "c" },
new List<string>(){ "a", "e", "c" }
};
var commonElements =
stringLists
.Aggregate((xs, ys) => xs.Intersect(ys).ToList());
I get a list with just "c" in it.
This also handles the case if elements within each list can be repeated.
I'd do it like this:
class Program
{
static void Main(string[] args)
{
List<string>[] stringLists = new List<string>[]
{
new List<string>(){ "a", "b", "c" },
new List<string>(){ "d", "b", "c" },
new List<string>(){ "a", "e", "c" }
};
// Will contian only 'c' because it's the only common item in all three groups.
var commonItems =
stringLists
.SelectMany(list => list)
.GroupBy(item => item)
.Select(group => new { Count = group.Count(), Item = group.Key })
.Where(item => item.Count == stringLists.Length);
foreach (var item in commonItems)
{
Console.WriteLine(String.Format("Item: {0}, Count: {1}", item.Item, item.Count));
}
Console.ReadKey();
}
}
An item is a common item if it occurs in all groups hence the condition that its count must be equal to the number of groups:
.Where(item => item.Count == stringLists.Length)
EDIT:
I should have used the HashSet like in the question. For lists you can replace the SelectMany line with this one:
.SelectMany(list => list.Distinct())

Categories