Group array of string arrays with LINQ - c#

I have arrays like this; all values are strings:
var arr1 = new [] { "H", "item1", "item2" };
var arr2 = new [] { "T", "thing1", "thing2" };
var arr3 = new [] { "T", "thing1", "thing2" };
var arr4 = new [] { "END", "something" };
var arr5 = new [] { "H", "item1", "item2" };
var arr6 = new [] { "T", "thing1", "thing2" };
var arr7 = new [] { "T", "thing1", "thing2" };
var arr8 = new [] { "END", "something" };
var allArrays = new [] { arr1, arr2, arr3, arr4, arr5, arr6, arr7, arr8 };
I need to group this into a new array of arrays, so that one array has arrays that start with H or T. The END records (not included in the results) are the delimiters between each section; each new array starts after an END array.
In the end I would like to have something like this:
[
[ [H, item1, item2], [T, thing1, thing2], [T, thing1, thing2] ]
[ [H, item1, item2], [T, thing1, thing2], [T, thing1, thing2] ]
]
I know how I can do this with a foreach loop, but I'm looking for a cleaner way, possibly using LINQ. All suggestions are much valued, thank you!

You can try this:
List<string[]> list = new List<string[]>();
var newArr = allArrays.Select(a => AddToArr(list, a)).Where(a => a != null).ToArray();
And the helper (this code could be inlined, but it is easier to read this way):
private static string[][] AddToArr(List<string[]> list, string[] arr)
{
    // Collect records until an "END" record closes the current group.
    if (arr[0] != "END")
    {
        list.Add(arr);
        return null;
    }
    // "END" reached: hand back the finished group and start a new one.
    var r = list.ToArray();
    list.Clear();
    return r;
}
Result:
[
[["H","item1","item2"],["T","thing1","thing2"],["T","thing1","thing2"]],
[["H","item3","item4"],["T","thing3","thing4"],["T","thing5","thing6"]]
]

So arr1, arr2, etc are string[].
allArrays is a string[][].
I hope you gave a meaningful example. From this example it seems that you want every string[] from allArrays except those whose element [0] equals "END".
If this is what you want, your result is:
string[][] result = allArrays.Where(stringArray => stringArray[0] != "END").ToArray();
I need to group this in to a new array of arrays, so that one array has arrays that start with H or T. The END records (not included in the results) are the delimiters between each section; each new array starts after an END array.
This is not exactly the same as what I see in your example. What if one of the string arrays in allArrays is null or empty (length 0)? And what if one of them starts with something other than "H", "T", or "END"?
Literally, you say that you only want the string arrays that start with "H" or "T", and no others. You don't want string arrays that are null or empty. You also don't want string arrays that start with "END", String.Empty, "A", "B", or anything other than "H" or "T".
If I take your requirement literally, your code should be:
string[] requiredStringAtIndex0 = new string[] {"H", "T"};
string[][] result = allArrays
    .Where(stringArray => stringArray != null
        && stringArray.Length != 0
        && requiredStringAtIndex0.Contains(stringArray[0]))
    .ToArray();
In words: from allArrays, keep only those arrays of strings that are not null, AND that have at least one element, AND whose element at index 0 is either "H" or "T".

Normally I would use a custom extension method (not part of LINQ itself) for grouping runs of items based on a predicate, in this case GroupByEndingWith, and then throw away the "END" record, like so:
var ans = allArrays.GroupByEndingWith(r => r[0] == "END")
.Select(g => g.Drop(1).ToArray())
.ToArray();
But, in general, you can use Aggregate to collect items based on a predicate at the expense of comprehension. It often helps to use a tuple to track an overall accumulator and a sub-accumulator. Unfortunately, there is no + operator or Append for List<T> that returns the original list (helpful for expression based accumulation) and since C# doesn't yet have a comma operator equivalent, you need an extension method again or you can use ImmutableList.
Using Aggregate and ImmutableList, you can do:
var ans = allArrays.Aggregate(
        (ans: ImmutableList<ImmutableList<string[]>>.Empty, curr: ImmutableList<string[]>.Empty),
        (ac, r) => r[0] == "END"
            ? (ac.ans.Add(ac.curr), ImmutableList<string[]>.Empty)
            : (ac.ans, ac.curr.Add(r))
    ).ans
    .Select(l => l.ToArray())
    .ToArray();
NOTE: You can also do this with List if you are willing to create new Lists a lot:
var ans = allArrays.Aggregate(
        (ans: new List<List<string[]>>(), curr: new List<string[]>()),
        (ac, r) => r[0] == "END"
            ? (ac.ans.Concat(new[] { ac.curr }).ToList(), new List<string[]>())
            : (ac.ans, ac.curr.Concat(new[] { r }).ToList())
    ).ans
    .Select(l => l.ToArray())
    .ToArray();
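The GroupByEndingWith extension mentioned at the start of this answer is not shown, so here is a minimal sketch of what such a helper might look like. The exact behaviour is an assumption: this version closes a group at each element that matches the predicate, keeps that element as the last item of its group, and also returns a trailing group that has no closing element. GroupingSketch and SplitOnEnd are hypothetical names, and the delimiter is removed with a plain Where rather than the Drop call used above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupingSketch
{
    // Hypothetical helper: splits a sequence into runs, closing a run at each
    // element for which isEnd returns true. The closing element stays in its run.
    public static IEnumerable<List<T>> GroupByEndingWith<T>(
        this IEnumerable<T> source, Func<T, bool> isEnd)
    {
        var current = new List<T>();
        foreach (var item in source)
        {
            current.Add(item);
            if (isEnd(item))
            {
                yield return current;
                current = new List<T>();
            }
        }
        // Assumption: a trailing run without a closing element is still returned.
        if (current.Count > 0)
            yield return current;
    }

    // Groups the records and strips the "END" delimiter from each group.
    public static string[][][] SplitOnEnd(string[][] records)
    {
        return records
            .GroupByEndingWith(r => r[0] == "END")
            .Select(g => g.Where(r => r[0] != "END").ToArray())
            .ToArray();
    }
}
```

Calling GroupingSketch.SplitOnEnd(allArrays) on the arrays from the question should give two groups of three records each. Because the helper is an iterator, groups are produced lazily, one per delimiter.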

Here is a simple implementation.
public static void Main(string[] args)
{
    // arr1 ... arr8 are the arrays from the question.
    var data = ConvertToArrayOfArray(arr1, arr2, arr3, arr4, arr5, arr6, arr7, arr8);
}
private static string[][] ConvertToArrayOfArray(params string[][] arrs)
{
    List<string[]> yoList = new List<string[]>();
    arrs.ToList().ForEach(x =>
    {
        // Keep every record except the "END" delimiters.
        if (x[0] != "END") yoList.Add(x);
    });
    return yoList.ToArray();
}

Related

How to split the elements of an array of strings if they are equal?

Good day,
I currently have a string array like this:
string[] array = {"aa","bb","cc","dd","aa","cc","ee","ff","aa","bb"};
I would like to collect the elements that are equal into separate arrays, for example:
string[] a = {"aa","aa","aa"};
string[] b = {"bb","bb"};
string[] c = {"cc","cc"};
string[] d = {"dd"};
string[] e = {"ee"};
string[] f = {"ff"};
It should be noted that the elements of the source array change and are not always the same.
I tried with LINQ, but I didn't get what I was looking for.
This was my attempt with LINQ:
array.Where(x => array.Contains(x)).ToArray();
Thanks for your help!
I see what you are asking for, but the result you want is pretty limited and not very useful to work with later on. You should take advantage of GroupBy in LINQ and then, when you need something, find it in that collection.
// your array
string[] array = {"aa","bb","cc","dd","aa","cc","ee","ff","aa","bb"};
// group by value
var groupedValues = array.GroupBy(x => x).ToList();
// get the "aa" group if it exists
var aa = groupedValues.FirstOrDefault(x => x.Key == "aa");
// check if the group was found
if (aa != null)
{
    // get all "aa" values in that group; for your input this is { "aa", "aa", "aa" }
    var allaaValues = aa.ToList();
}
Hope this is a solution you were looking for, good luck!
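If the separate arrays from the question are still wanted, the groups can be turned into a jagged array directly. This is a small sketch rather than part of the answer above; SplitSketch and SplitByValue are hypothetical names.

```csharp
using System.Linq;

public static class SplitSketch
{
    // Groups equal strings together. GroupBy yields the groups in the order
    // in which each key first appears in the source.
    public static string[][] SplitByValue(string[] source)
    {
        return source
            .GroupBy(s => s)
            .Select(g => g.ToArray())
            .ToArray();
    }
}
```

For the array in the question this gives six inner arrays, starting with the three "aa" entries and followed by the "bb" and "cc" pairs.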
string[] array = { "aa", "bb", "cc", "dd", "aa", "cc", "ee", "ff", "aa", "bb" };
var splittedArray = new List<string[]>();
foreach (var strItem in array)
{
    // Skip values that have already been collected
    if (splittedArray.Any(si => si.Contains(strItem))) continue;
    // If the value occurs more than once, collect all identical items into one array
    if (array.Count(si => si.Equals(strItem)) > 1)
    {
        var identicalItems = array
            .Where(i => i.Equals(strItem))
            .ToArray();
        splittedArray.Add(identicalItems);
    }
    else // Otherwise add the single item as a new one-element array
    {
        splittedArray.Add(new string[] { strItem });
    }
}

Removing strings with duplicate letters from string array

I have array of strings like
string[] A = { "abc", "cccc", "fgaeg", "def" };
I would like to obtain a list or array of the strings in which every letter appears only once. I mean that "cccc" and "fgaeg" will be removed from the input array.
I managed to do this, but I feel that my way is very messy, unnecessarily complicated, and inefficient.
Do you have any ideas for improving this algorithm (possibly replacing it with a single LINQ query)?
My code:
var goodStrings = new List<string>();
int i = 0;
foreach (var str in A)
{
    var tempArr = str.GroupBy(x => x)
        .Select(x => new
        {
            Cnt = x.Count(),
            Str = x.Key
        }).ToArray();
    var resultArr = tempArr.Where(g => g.Cnt > 1).Select(f => f.Str).ToArray();
    if (resultArr.Length == 0) goodStrings.Add(A[i]);
    i++;
}
You can call Distinct on each string and keep the strings whose count of distinct characters equals the original string length:
string[] A = { "abc", "cccc", "fgaeg", "def" };
var result = A.Where(a => a.Distinct().Count() == a.Length).ToList();
You'll get a list with the values abc and def, as expected.
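As a variation on the Distinct approach, a HashSet can stop scanning a string at its first repeated character instead of always walking the whole string. This is only a sketch under that idea; UniqueLettersSketch and its method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class UniqueLettersSketch
{
    // True when no character occurs twice. HashSet<char>.Add returns false
    // for a character that is already present, so All stops at the first repeat.
    public static bool HasUniqueLetters(string word)
    {
        var seen = new HashSet<char>();
        return word.All(seen.Add);
    }

    // Keeps only the words in which every character is unique.
    public static List<string> KeepUniqueLetterWords(IEnumerable<string> words)
    {
        return words.Where(w => HasUniqueLetters(w)).ToList();
    }
}
```

For short strings like the ones in the question the difference is negligible; the early exit only matters for long inputs.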

Iterate and select over two dimensional string array with LINQ

I wrote a console application that must iterate over a two-dimensional array of strings, select the values that are contained in the user input, and show these values in the order set by "row".
Unfortunately, I get the error System.Collections.Generic.List`1[System.String]
Here is the code of application:
static void Main(string[] args)
{
string[,] words = new string[,]
{
{ "5", "" },
{ "10", "kare" },
{ "20", "kanojo" },
{ "1", "karetachi" },
{ "7", "korosu" },
{ "3", "sakura" },
{ "3", "" }
};
try
{
var pre = Console.ReadLine();
var r = Enumerable
.Range(0, words.GetLength(0))
.Where(i => words[i, 1] == pre)
.Select(i => words[i, 1])
.OrderBy(i => words[Int32.Parse(i), 0])
.ToList();
Console.Write(r);
}
catch (Exception ex)
{
TextWriter errorWriter = Console.Error;
errorWriter.WriteLine(ex.Message);
}
Console.ReadLine();
}
Your query is incorrect: you try to match each word from the list to the entirety of the user input, which means that you would always pick a single word (assuming there are no duplicates in the 2D array). Since you are sorting the results, however, it appears that you expect there to be more than one word.
To fix this, change your selection criterion to use Contains, like this:
var r = Enumerable
.Range(0, words.GetLength(0))
.Where(i => pre.Contains(words[i, 1]))
.Select(i => new {i, w=words[i, 1]})
.OrderBy(p => Int32.Parse(words[p.i, 0]))
.Select(p=>p.w)
.ToList();
To display the results in a single line you could use string.Join:
Console.WriteLine("Results: {0}", string.Join(", ", r));
Note: I assume that the exercise requires you to use a 2D array. If there is no such requirement, you could use an array of tuples or anonymous types, letting you avoid parsing of the integer:
var words = new[] {
new { Priority = 5, Word = "" }
, new { Priority = 10, Word = "kare" }
, new { Priority = 20, Word = "kanojo" }
, ... // and so on
};
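To make the last suggestion concrete, here is a sketch of the whole lookup using an array of value tuples, so the priority is an int from the start and nothing needs parsing. WordLookupSketch, Match, and the field names are hypothetical. Skipping the empty entries is an added assumption: every string contains the empty string, so without that check both empty entries would always match.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class WordLookupSketch
{
    // The data from the question, with the priority stored as an int.
    private static readonly (int Priority, string Word)[] Words =
    {
        (5, ""), (10, "kare"), (20, "kanojo"), (1, "karetachi"),
        (7, "korosu"), (3, "sakura"), (3, "")
    };

    // Returns the words that occur in the input, ordered by priority.
    public static List<string> Match(string input)
    {
        return Words
            // Assumption: empty entries are skipped, since Contains("") is always true.
            .Where(w => w.Word.Length > 0 && input.Contains(w.Word))
            .OrderBy(w => w.Priority)
            .Select(w => w.Word)
            .ToList();
    }
}
```

The result can then be printed with string.Join(", ", WordLookupSketch.Match(pre)).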
That's not an error; it is what you get when you display the result of calling ToString on a List, which returns the type name rather than the contents.
(I.e. your statement ran correctly, you just aren't displaying the result the way you think.)
Try:
Console.Write(r.Aggregate((a,b) => a + "," + b));
instead of
Console.Write(r);
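The difference between the two ways of printing can be checked in isolation. The sketch below uses hypothetical names (PrintSketch, DefaultText, JoinedText) and uses string.Join for the second case, because Aggregate without a seed throws InvalidOperationException when the list is empty, which happens here whenever nothing matches.

```csharp
using System.Collections.Generic;

public static class PrintSketch
{
    // What Console.Write(list) ends up printing: List<T> does not override
    // ToString, so the runtime type name is returned.
    public static string DefaultText(List<string> items) => items.ToString();

    // string.Join formats the elements and returns "" for an empty list.
    public static string JoinedText(List<string> items) => string.Join(",", items);
}
```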
The following code creates a 2D list, as though we had myList[][] consisting of [0] = {0,1,2,3} and [1] = {4,5,6,7,8}:
List<List<int>> a2DList = new List<List<int>>()
{
new List<int>()
{
0,1,2,3
},
new List<int>()
{
4,5,6,7,8
}
};
The LINQ code
a2DList.SelectMany(s => s).ToArray().Select(s => s)
returns a copy of the 2D list flattened into 1D form.
SelectMany takes each inner list and yields its members one after another.
You could then say
var myObj = a2DList.SelectMany(s => s).ToArray().Select(s => s);
IEnumerable<int> myEnumerable = a2DList.SelectMany(s => s).ToArray().Select(s => s);
int[] myArray = a2DList.SelectMany(s => s).ToArray().Select(s => s).ToArray();
List<int> myList = a2DList.SelectMany(s => s).ToArray().Select(s => s).ToList();
etc.
The result is joined with string.Join for printing to the Console:
Console.WriteLine(string.Join(",",a2DList.SelectMany(s => s).ToArray().Select(s => s)));
// Output will be "0,1,2,3,4,5,6,7,8"

Comparing two string arrays in C#

Say we have 5 string arrays as such:
string[] a = {"The","Big", "Ant"};
string[] b = {"Big","Ant","Ran"};
string[] c = {"The","Big","Ant"};
string[] d = {"No","Ants","Here"};
string[] e = {"The", "Big", "Ant", "Ran", "Too", "Far"};
Is there a method to compare these strings to each other without looping through them in C# such that only a and c would yield the boolean true? In other words, all elements must be equal and the array must be the same size? Again, without using a loop if possible.
You can use Linq:
bool areEqual = a.SequenceEqual(b);
Try using Enumerable.SequenceEqual:
var equal = Enumerable.SequenceEqual(a, b);
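As a quick illustration of what SequenceEqual checks, the sketch below wraps it in two hypothetical helpers (SequenceEqualSketch is not part of any library). The second one passes a comparer, which is one way to ignore case if that is ever needed.

```csharp
using System;
using System.Linq;

public static class SequenceEqualSketch
{
    // True only when both arrays have the same length and equal elements
    // at every position.
    public static bool SameSequence(string[] x, string[] y) => x.SequenceEqual(y);

    // Variant that ignores case by supplying a comparer.
    public static bool SameSequenceIgnoreCase(string[] x, string[] y) =>
        x.SequenceEqual(y, StringComparer.OrdinalIgnoreCase);
}
```

With the arrays from the question, only the pair a and c compares as equal; e is rejected against a because the lengths differ even though it starts with the same three words.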
If you want the elements of one array that do not appear in another, you can try Except:
string[] array1 = { "aa", "bb", "cc" };
string[] array2 = { "aa" };
string[] DifferArray = array1.Except(array2).ToArray();
Output:
{"bb","cc"}
If you want to compare them all in one go:
string[] a = { "The", "Big", "Ant" };
string[] b = { "Big", "Ant", "Ran" };
string[] c = { "The", "Big", "Ant" };
string[] d = { "No", "Ants", "Here" };
string[] e = { "The", "Big", "Ant", "Ran", "Too", "Far" };
// Add the strings to an IEnumerable (just used List<T> here)
var strings = new List<string[]> { a, b, c, d, e };
// Find all string arrays which match the sequence in a list of string arrays
// that doesn't contain the original string array (by ref)
var eq = strings.Where(toCheck =>
strings.Where(x => x != toCheck)
.Any(y => y.SequenceEqual(toCheck))
);
This returns both matches, a and c (you could probably expand this to exclude items which have already matched, I suppose).
if (a.Length == d.Length)
{
    // Note: Except is a set difference, so this check ignores element order and duplicates.
    var result = a.Except(d).ToArray();
    if (result.Length == 0)
    {
        Console.WriteLine("OK");
    }
    else
    {
        Console.WriteLine("NO");
    }
}
else
{
    Console.WriteLine("NO");
}
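One caveat with the length-plus-Except check above: Except computes a set difference, so it cannot see element order or repeated values. The sketch below (hypothetical names) puts the two checks side by side so the difference can be tried out.

```csharp
using System.Linq;

public static class ExceptCaveatSketch
{
    // The check from the snippet above: same length and nothing left after Except.
    public static bool ExceptSaysEqual(string[] x, string[] y) =>
        x.Length == y.Length && !x.Except(y).Any();

    // Position-by-position comparison, for contrast.
    public static bool SequenceSaysEqual(string[] x, string[] y) => x.SequenceEqual(y);
}
```

For example, { "The", "Big", "Ant" } and { "Ant", "Big", "The" } pass the Except-based check but not SequenceEqual, and the same goes for { "a", "a", "b" } against { "a", "b", "b" }.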

Zip N IEnumerable<T>s together? Iterate over them simultaneously?

I have:-
IEnumerable<IEnumerable<T>> items;
and I'd like to create:-
IEnumerable<IEnumerable<T>> results;
where the first item in "results" is an IEnumerable of the first item of each of the IEnumerables of "items", the second item in "results" is an IEnumerable of the second item of each of "items", etc.
The IEnumerables aren't necessarily the same lengths. If some of the IEnumerables in items don't have an element at a particular index, then I'd expect the matching IEnumerable in results to have fewer items in it.
For example:-
items = { "1", "2", "3", "4" } , { "a", "b", "c" };
results = { "1", "a" } , { "2", "b" }, { "3", "c" }, { "4" };
Edit: Another example (requested in comments):-
items = { "1", "2", "3", "4" } , { "a", "b", "c" }, { "p", "q", "r", "s", "t" };
results = { "1", "a", "p" } , { "2", "b", "q" }, { "3", "c", "r" }, { "4", "s" }, { "t" };
I don't know in advance how many sequences there are, nor how many elements are in each sequence. I might have 1,000 sequences with 1,000,000 elements in each, and I might only need the first ~10, so I'd like to use the (lazy) enumeration of the source sequences if I can. In particular I don't want to create a new data structure if I can help it.
Is there a built-in method (similar to Enumerable.Zip) that can do this?
Is there another way?
Now lightly tested and with working disposal.
public static class Extensions
{
    public static IEnumerable<IEnumerable<T>> JaggedPivot<T>(
        this IEnumerable<IEnumerable<T>> source)
    {
        List<IEnumerator<T>> originalEnumerators = source
            .Select(x => x.GetEnumerator())
            .ToList();
        try
        {
            List<IEnumerator<T>> enumerators = originalEnumerators
                .Where(x => x.MoveNext()).ToList();
            while (enumerators.Any())
            {
                List<T> result = enumerators.Select(x => x.Current).ToList();
                yield return result;
                enumerators = enumerators.Where(x => x.MoveNext()).ToList();
            }
        }
        finally
        {
            originalEnumerators.ForEach(x => x.Dispose());
        }
    }
}
public class TestExtensions
{
public void Test1()
{
IEnumerable<IEnumerable<int>> myInts = new List<IEnumerable<int>>()
{
Enumerable.Range(1, 20).ToList(),
Enumerable.Range(21, 5).ToList(),
Enumerable.Range(26, 15).ToList()
};
foreach(IEnumerable<int> x in myInts.JaggedPivot().Take(10))
{
foreach(int i in x)
{
Console.Write("{0} ", i);
}
Console.WriteLine();
}
}
}
It's reasonably straightforward to do if you can guarantee how the results are going to be used. However, if the results might be used in an arbitrary order, you may need to buffer everything. Consider this:
var results = MethodToBeImplemented(sequences);
var iterator = results.GetEnumerator();
iterator.MoveNext();
var first = iterator.Current;
iterator.MoveNext();
var second = iterator.Current;
foreach (var x in second)
{
// Do something
}
foreach (var x in first)
{
// Do something
}
In order to get at the items in "second" you'll have to iterate over all of the subsequences, past the first items. If you then want it to be valid to iterate over the items in first you either need to remember the items or be prepared to re-evaluate the subsequences.
Likewise you'll either need to buffer the subsequences as IEnumerable<T> values or reread the whole lot each time.
Basically it's a whole can of worms which is difficult to do elegantly in a way which will work pleasantly for all situations :( If you have a specific situation in mind with appropriate constraints, we may be able to help more.
Based on David B's answer, this code should perform better:
public static IEnumerable<IEnumerable<T>> JaggedPivot<T>(
    this IEnumerable<IEnumerable<T>> source)
{
    var originalEnumerators = source.Select(x => x.GetEnumerator()).ToList();
    try
    {
        var enumerators =
            new List<IEnumerator<T>>(originalEnumerators.Where(x => x.MoveNext()));
        while (enumerators.Any())
        {
            yield return enumerators.Select(x => x.Current).ToList();
            enumerators.RemoveAll(x => !x.MoveNext());
        }
    }
    finally
    {
        originalEnumerators.ForEach(x => x.Dispose());
    }
}
The difference is that the enumerators variable isn't re-created all the time.
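Since the question asks for lazy enumeration, it may help to see that this approach only reads as far into each source as the consumer asks. The sketch below repeats the method so that it compiles on its own and pivots one endless sequence with a short one; JaggedPivotSketch, CountFrom, and FirstRows are hypothetical names added for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JaggedPivotSketch
{
    // Same approach as the method above, repeated so the sketch is self-contained.
    public static IEnumerable<IEnumerable<T>> JaggedPivot<T>(
        this IEnumerable<IEnumerable<T>> source)
    {
        var all = source.Select(x => x.GetEnumerator()).ToList();
        try
        {
            var live = all.Where(x => x.MoveNext()).ToList();
            while (live.Count > 0)
            {
                yield return live.Select(x => x.Current).ToList();
                live.RemoveAll(x => !x.MoveNext());
            }
        }
        finally
        {
            all.ForEach(x => x.Dispose());
        }
    }

    // Hypothetical endless source: start, start + 1, start + 2, ...
    public static IEnumerable<int> CountFrom(int start)
    {
        while (true) yield return start++;
    }

    // Pivots one endless and one two-element sequence, reading only `rows` rows.
    public static List<string> FirstRows(int rows)
    {
        var sources = new List<IEnumerable<int>> { CountFrom(1), new[] { 10, 20 } };
        return sources
            .JaggedPivot()
            .Take(rows)
            .Select(row => string.Join(",", row))
            .ToList();
    }
}
```

FirstRows(3) finishes even though one source never ends, because Take stops the pivot after three rows and the finally block then disposes the enumerators. Note that the outer sequence of sources is still read completely up front, so it has to be finite.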
Here's one that is a bit shorter, but no doubt less efficient:
Enumerable.Range(0,items.Select(x => x.Count()).Max())
.Select(x => items.SelectMany(y => y.Skip(x).Take(1)));
What about this?
List<string[]> items = new List<string[]>()
{
new string[] { "a", "b", "c" },
new string[] { "1", "2", "3" },
new string[] { "x", "y" },
new string[] { "y", "z", "w" }
};
var x = from i in Enumerable.Range(0, items.Max(a => a.Length))
select from z in items
where z.Length > i
select z[i];
You could compose existing operators like this:
IEnumerable<IEnumerable<int>> myInts = new List<IEnumerable<int>>()
{
Enumerable.Range(1, 20).ToList(),
Enumerable.Range(21, 5).ToList(),
Enumerable.Range(26, 15).ToList()
};
myInts.SelectMany(item => item.Select((number, index) => Tuple.Create(index, number)))
.GroupBy(item => item.Item1)
.Select(group => group.Select(tuple => tuple.Item2));
