Compare size (Count) of many lists - c#

I was wondering if I can compare the size of many lists in an elegant and fast way.
Basically this is my problem, I need to assert that 6 lists have the same size. So the usual way is something like (warning ugly code..):
if (list1.Count == list2.Count && list1.Count == list3.Count && .....) {
//ok, so here they have same size.
}
Some Jedi alternatives here?

The All() method should do the trick:
http://msdn.microsoft.com/en-us/library/bb548541.aspx
Code should look like this, I think:
(new[] {list1, list2, list3, list4, list5, list6}).
All(list => list.Count == list1.Count);

Using Enumerable.All you can check that all lists match the same criteria:
var allLists = new[] { list1, list2, list3 };
bool result = allLists.All(l => l.Count == allLists[0].Count);
Or as a one-liner, but you would then need to refer to a particular list:
bool result = (new[] { list1, list2, list3 }).All(l => l.Count == list1.Count);

How about with LINQ:
bool allSameSize = new[] { list1, list2, list3, list4, list5, list6 }
.Select(list => list.Count)
.Distinct()
.Take(2) // Optimization, not strictly necessary
.Count() == 1;
This idea works for any kind of sequence (not just lists), and will quick-reject as soon as two distinct counts are found.
On another note, is there any reason that the lists aren't part of a "list of lists" collection?
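To illustrate the "any kind of sequence" point, here is a minimal sketch that wraps the same Distinct/Take idea in a helper. The names SizeChecks and AllSameCount are hypothetical, and treating zero sequences as "all the same" is an assumption of this sketch, not something stated in the answer.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SizeChecks
{
    // Hypothetical helper: true when every sequence yields the same number of elements.
    // Assumption: an empty set of sequences counts as "all the same".
    public static bool AllSameCount<T>(params IEnumerable<T>[] sequences)
    {
        return sequences
            .Select(s => s.Count())   // enumerates sequences that are not collections
            .Distinct()
            .Take(2)                  // stop once a second distinct count shows up
            .Count() <= 1;
    }
}
```

It can be called with mixed sequence types, for example SizeChecks.AllSameCount(new List<int> { 1, 2 }, new[] { 3, 4 }, new HashSet<int> { 5, 6 }).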

If you make this kind of comparison at just one place, then it is probably not worth trying to make it shorter (especially if it impacts the performance).
However, if you compare list lengths at more than one place, it is perhaps worthwhile putting it in a function then reusing it many times:
static bool SameLength<T>(params IList<T>[] lists) {
int len = -1;
foreach (var list in lists) {
int list_len = list.Count;
if (len >= 0 && len != list_len)
return false;
len = list_len;
}
return true;
}
static void Main(string[] args) {
// All of these lists have same length (2):
var list1 = new List<int> { 1, 2 };
var list2 = new List<int> { 3, 4 };
var list3 = new List<int> { 5, 6 };
var list4 = new List<int> { 7, 8 };
var list5 = new List<int> { 9, 10 };
var list6 = new List<int> { 11, 12 };
if (SameLength(list1, list2, list3, list4, list5, list6)) {
// Executed.
}
// But this one is different (length 3):
var list7 = new List<int> { 11, 22, 33 };
if (SameLength(list1, list2, list3, list7, list4, list5, list6)) {
// Not executed.
}
}
--- EDIT ---
Based on Dean Barnes' idea, you could even do this for extra-short implementation:
static bool SameLength<T>(params IList<T>[] lists) {
return lists.All(list => list.Count == lists[0].Count);
}

var lists = new [] { list1, list2, list3 ... };
bool diffLengths = lists.Select(list => list.Count).Distinct().Skip(1).Any();
Or
bool sameLen = new HashSet<int>(lists.Select(list => list.Count)).Count <= 1;

Related

How to add List<T> items dynamically to IEnumerable<T>

Code
public static void Main()
{
List<int> list1 = new List<int> {1, 2, 3, 4, 5, 6 };
List<int> list2 = new List<int> {1, 2, 3 };
List<int> list3 = new List<int> {1, 2 };
var lists = new IEnumerable<int>[] { list1, list2, list3 };
var commons = GetCommonItems(lists);
Console.WriteLine("Common integers:");
foreach (var c in commons)
Console.WriteLine(c);
}
static IEnumerable<T> GetCommonItems<T>(IEnumerable<T>[] lists)
{
HashSet<T> hs = new HashSet<T>(lists.First());
for (int i = 1; i < lists.Length; i++)
hs.IntersectWith(lists[i]);
return hs;
}
The sample shows only list1, list2, and list3, but I may have more than 50 lists, each generated inside a foreach loop. How can I programmatically add each list to the IEnumerable lists so that I can compare their data?
I tried many approaches, such as converting to a list, Add, Append, and Concat, but nothing worked.
Is there a better way to compare N lists?
Output of the code: 1 2
You can create a list of lists and add lists to that list dynamically. Something like this:
var lists = new List<List<int>>();
lists.Add(new List<int> {1, 2, 3, 4, 5, 6 });
lists.Add(new List<int> {1, 2, 3 });
lists.Add(new List<int> {1, 2 });
// listSources stands for wherever your generated lists come from
foreach (var list in listSources)
lists.Add(list);
var commons = GetCommonItems(lists);
To find intersections you can use this solution for example: Intersection of multiple lists with IEnumerable.Intersect() (actually looks like that's what you are using already).
Also make sure to change the signature of the GetCommonItems method:
static IEnumerable<T> GetCommonItems<T>(List<List<T>> lists)
What you could do is allow the GetCommonItems method to accept a variable number of parameters using the params keyword. This way, you avoid needing to create a new collection of lists.
It goes without saying, however, that if the number of lists in your source is variable as well, this could be trickier to use.
I've also amended the GetCommonItems method to work like the code from https://stackoverflow.com/a/1676684/9945524
public static void Main()
{
List<int> list1 = new List<int> { 1, 2, 3, 4, 5, 6 };
List<int> list2 = new List<int> { 1, 2, 3 };
List<int> list3 = new List<int> { 1, 2 };
var commons = GetCommonItems(list1, list2, list3); // pass the lists here
Console.WriteLine("Common integers:");
foreach (var c in commons)
Console.WriteLine(c);
}
static IEnumerable<T> GetCommonItems<T>(params List<T>[] lists)
{
return lists.Skip(1).Aggregate(
new HashSet<T>(lists.First()),
(hs, lst) =>
{
hs.IntersectWith(lst);
return hs;
}
);
}
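If the lists are produced in a loop, a params method can still be used, because a params parameter also accepts an array directly. The sketch below assumes a hypothetical BuildLists source and adds a guard for the zero-list case, which is not part of the answer above.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CommonItemsDemo
{
    // Same shape as the params-based GetCommonItems; the empty-input guard is an addition.
    public static IEnumerable<T> GetCommonItems<T>(params List<T>[] lists)
    {
        if (lists.Length == 0)
            return Enumerable.Empty<T>();
        return lists.Skip(1).Aggregate(
            new HashSet<T>(lists[0]),
            (hs, lst) => { hs.IntersectWith(lst); return hs; });
    }

    // Hypothetical source: builds 'count' lists in a loop, list i holding 1..(i + 2).
    public static List<List<int>> BuildLists(int count)
    {
        var lists = new List<List<int>>();
        for (int i = 0; i < count; i++)
            lists.Add(Enumerable.Range(1, i + 2).ToList());
        return lists;
    }

    public static List<int> CommonOfGenerated(int count)
    {
        // ToArray() turns the dynamically built collection into the params array.
        return GetCommonItems(BuildLists(count).ToArray()).ToList();
    }
}
```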
Alternate solution using your existing Main method.
EDIT: changed the type of lists to IEnumerable<IEnumerable<T>> as per comment in this answer.
public static void Main()
{
List<int> list1 = new List<int> { 1, 2, 3, 4, 5, 6 };
List<int> list2 = new List<int> { 1, 2, 3 };
List<int> list3 = new List<int> { 1, 2 };
var lists = new List<List<int>> { list1, list2, list3 };
var commons = GetCommonItems(lists);
Console.WriteLine("Common integers:");
foreach (var c in commons)
Console.WriteLine(c);
}
static IEnumerable<T> GetCommonItems<T>(IEnumerable<IEnumerable<T>> enumerables)
{
return enumerables.Skip(1).Aggregate(
new HashSet<T>(enumerables.First()),
(hs, lst) =>
{
hs.IntersectWith(lst);
return hs;
}
);
}
IEnumerable<T> is read-only (it has no Add method), so you should build and return a concrete implementation of IEnumerable<T>, such as List<T>, depending on your needs.
If I understand correctly you want to get common items of N lists. I would use LINQ for this.
My proposition:
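1. Make one list that contains all of the items:
var lists = new List<List<int>>(); // assume this is populated with your lists
var allElements = new List<int>();
foreach (var list in lists)
allElements.AddRange(list);
2. Take the items that occur more than once:
var repeated = allElements.GroupBy(x => x).Where(g => g.Count() > 1).Select(g => g.Key).ToList();
Note that this returns items that appear at least twice overall, which equals "common to all lists" only when there are two lists and no list contains duplicates.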

Move list elements meeting condition to the top of the list

I want to move specific number to the top of this list.
int numberToBeMovedOnTop = 4;
List<int> lst = new List<int>(){1, 2, 3, 4, 5, 5, 4, 7, 9, 4, 2, 1};
List<int> lstOdd = lst.FindAll(l => l == numberToBeMovedOnTop);
lstOdd.AddRange(lst.FindAll(l => l != numberToBeMovedOnTop));
Where numberToBeMovedOnTop is a variable.
This gives me the desired result, but is there a better solution? I could iterate the list once and swap the first occurrence of numberToBeMovedOnTop with the first element, the second occurrence with the second element, and so on. But can this be done with some built-in C# function without iterating the list twice?
You could use LINQ:
List<int> lstOdd = lst.OrderByDescending(i => i == numberToBeMovedOnTop).ToList();
Why OrderByDescending? Because the comparison returns a bool and true is higher than false. You could also use:
List<int> lstOdd = lst.OrderBy(i => i == numberToBeMovedOnTop ? 0 : 1).ToList();
Note that this works because OrderBy and OrderByDescending are performing a stable sort. That means that the original order remains for all equal items.
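As a worked example of the stable-sort behaviour, the sketch below wraps the OrderByDescending call in a hypothetical helper (MoveToTop.Move is a name chosen here for illustration).

```csharp
using System.Collections.Generic;
using System.Linq;

public static class MoveToTop
{
    // Moves every element equal to 'value' to the front. The remaining elements keep
    // their relative order because OrderByDescending in LINQ to Objects is stable.
    public static List<int> Move(List<int> source, int value)
    {
        return source.OrderByDescending(i => i == value).ToList();
    }
}
```

For the list from the question and the value 4, this yields 4, 4, 4, 1, 2, 3, 5, 5, 7, 9, 2, 1.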
For what it's worth, here is an extension method that works with any type and predicate and is a little bit more efficient:
public static List<T> PrependAll<T>(this List<T> list, Func<T, bool> predicate)
{
var returnList = new List<T>();
var listNonMatch = new List<T>();
foreach (T item in list)
{
if (predicate(item))
returnList.Add(item);
else
listNonMatch.Add(item);
}
returnList.AddRange(listNonMatch);
return returnList;
}
Usage: List<int> lstOdd = lst.PrependAll(i => i == numberToBeMovedOnTop);
Aside from using LINQ, it might be just as efficient and understandable to do this with a plain loop:
var listToAdd = new List<int>();
var listOdd = new List<int>();
for(int i = 0; i < lst.Count; i++)
{
if(lst[i] == numberToBeMovedOnTop)
{
listToAdd.Add(numberToBeMovedOnTop);
}
else
{
listOdd.Add(lst[i]);
}
}
// Matches first, then everything else in its original order.
listToAdd.AddRange(listOdd);
Keep track of the matching items separately, then append the remaining items after them; listToAdd then holds the result.
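Group by the predicate, then concatenate the groups?
var nums = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
// Order the groups so that the one matching the predicate (key == true) comes first.
// SelectMany keeps duplicates, whereas Union would silently drop them.
var changed = nums.GroupBy(x => x % 2 == 0).OrderByDescending(g => g.Key).SelectMany(g => g).ToList();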

LINQ swap columns into rows

Is there a fancy LINQ expression that would let me do the following in a much simpler fashion? I have a List<List<double>>. Assuming the inner lists are columns in a 2D matrix, I want to swap the list of columns into a list of rows. I have the following obvious solution:
int columns = 5;
var values = new List<List<double>>(); // assume this is populated with the columns
var listOfRows = new List<List<double>>();
for (int i = 0; i < columns ; i++)
{
List<double> newRow = new List<double>();
foreach (List<double> value in values)
{
newRow.Add(value[i]);
}
listOfRows.Add(newRow);
}
You could LINQify the inner loop pretty easily:
newRow.AddRange(values.Select(value => value[i]));
Whether or not that improves the readability is left entirely up to you!
Here's a Linq expression that would do what you want - looking at it I'd personally stick with the nested foreach loops though - much easier to read:
var columnList= new List<List<double>>();
columnList.Add(new List<double>() { 1, 2, 3 });
columnList.Add(new List<double>() { 4, 5, 6 });
columnList.Add(new List<double>() { 7, 8, 9 });
columnList.Add(new List<double>() { 10, 11, 12 });
int columnCount = columnList[0].Count;
var rowList = columnList.SelectMany(x => x)
.Select((x, i) => new { V = x, Index = i })
.GroupBy(x => (x.Index + 1) % columnCount)
.Select(g => g.Select( x=> x.V).ToList())
.ToList();
This example also would only work on a matrix with a fixed column count. Basically it's flattening the matrix into a list, then creating the list of rows by grouping by the index of the element in the list modulo the column count.
Edit:
A different approach, much closer to a nested loop and probably similar performance besides the overhead.
int columnCount = columnList[0].Count;
int rowCount = columnList.Count;
var rowList = Enumerable.Range(0, columnCount)
.Select( x => Enumerable.Range(0, rowCount)
.Select(y => columnList[y][x])
.ToList())
.ToList();
var inverted = Enumerable.Range(0, columnCount)
.Select(index => columnList.Select(list => list[index]));
In short, we enumerate the column index from a range and use it to collect the nth element of each list.
Please note that you'll need to check that every list has the same number of columns.
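The length check mentioned above could be sketched as follows. TransposeChecked is a hypothetical name, and throwing ArgumentException on ragged input is a design choice of this sketch rather than something the answer prescribes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MatrixHelpers
{
    // Hypothetical helper: swaps outer and inner indexes, rejecting ragged input.
    public static List<List<T>> TransposeChecked<T>(List<List<T>> columnList)
    {
        if (columnList.Count == 0)
            return new List<List<T>>();

        int length = columnList[0].Count;
        if (columnList.Any(list => list.Count != length))
            throw new ArgumentException("All inner lists must have the same length.", nameof(columnList));

        // For each inner index, collect that element from every inner list.
        return Enumerable.Range(0, length)
            .Select(index => columnList.Select(list => list[index]).ToList())
            .ToList();
    }
}
```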
Here's one that works for rectangular (non-ragged) matrices. The C# code here works cut-and-paste into LinqPad, a free, interactive C# programming tool.
I define a postfix operator (that is, an extension method) "Transpose." Use the operator as follows:
var rand = new Random();
var xss = new [] {
new [] {rand.NextDouble(), rand.NextDouble()},
new [] {rand.NextDouble(), rand.NextDouble()},
new [] {rand.NextDouble(), rand.NextDouble()},
};
xss.Dump("Original");
xss.Transpose().Dump("Transpose");
resulting in something like this:
Original
0.843094345109116
0.981432441613373
0.649207864724662
0.00594645645746331
0.378864820291691
0.336915332515219
Transpose
0.843094345109116
0.649207864724662
0.378864820291691
0.981432441613373
0.00594645645746331
0.336915332515219
The gist of the implementation of this operator is the following
public static IEnumerable<IEnumerable<T>> Transpose<T>(this IEnumerable<IEnumerable<T>> xss)
{
var heads = xss.Heads();
var tails = xss.Tails();
var empt = new List<IEnumerable<T>>();
if (heads.IsEmpty())
return empt;
empt.Add(heads);
return empt.Concat(tails.Transpose());
}
Here is the full implementation, with some lines commented out that you can uncomment to monitor how the function works.
void Main()
{
var rand = new Random();
var xss = new [] {
new [] {rand.NextDouble(), rand.NextDouble()},
new [] {rand.NextDouble(), rand.NextDouble()},
new [] {rand.NextDouble(), rand.NextDouble()},
};
xss.Dump("Original");
xss.Transpose().Dump("Transpose");
}
public static class Extensions
{
public static IEnumerable<T> Heads<T>(this IEnumerable<IEnumerable<T>> xss)
{
Debug.Assert(xss != null);
if (xss.Any(xs => xs.IsEmpty()))
return new List<T>();
return xss.Select(xs => xs.First());
}
public static bool IsEmpty<T>(this IEnumerable<T> xs)
{
return !xs.Any(); // same result as Count() == 0, without enumerating the whole sequence
}
public static IEnumerable<IEnumerable<T>> Tails<T>(this IEnumerable<IEnumerable<T>> xss)
{
return xss.Select(xs => xs.Skip(1));
}
public static IEnumerable<IEnumerable<T>> Transpose<T>(this IEnumerable<IEnumerable<T>> xss)
{
// xss.Dump("xss in Transpose");
var heads = xss.Heads()
// .Dump("heads in Transpose")
;
var tails = xss.Tails()
// .Dump("tails in Transpose")
;
var empt = new List<IEnumerable<T>>();
if (heads.IsEmpty())
return empt;
empt.Add(heads);
return empt.Concat(tails.Transpose())
// .Dump("empt")
;
}
}
I am combining some of the answers above, which sometimes had columns and rows inverted from the original question or from the convention I am used to: row refers to the first (outer) index and column to the second (inner) index, e.g. values[row][column].
public static List<List<T>> Transpose<T>(this List<List<T>> values)
{
if (values.Count == 0 || values[0].Count == 0)
{
return new List<List<T>>();
}
int ColumnCount = values[0].Count;
var listByColumns = new List<List<T>>();
foreach (int columnIndex in Enumerable.Range(0, ColumnCount))
{
List<T> valuesByColumn = values.Select(value => value[columnIndex]).ToList();
listByColumns.Add(valuesByColumn);
}
return listByColumns;
}
Actually, the words row and column are just our convention for thinking about the data, and they sometimes add more confusion than they resolve.
We are really just swapping the inner index for the outer index (flipping the indexes around), so one could also define the following extension method. Again, I borrowed from the solutions above and just put it into a form I find readable and fairly compact.
Checks that the inner lists are of equal size are still required.
public static List<List<T>> InsideOutFlip<T>(this List<List<T>> values)
{
if (values.Count == 0 || values[0].Count == 0)
{
return new List<List<T>>();
}
int innerCount = values[0].Count;
var flippedList = new List<List<T>>();
foreach (int innerIndex in Enumerable.Range(0, innerCount))
{
List<T> valuesByOneInner = values.Select(value => value[innerIndex]).ToList();
flippedList.Add(valuesByOneInner);
}
return flippedList;
}

find common items across multiple lists in C#

I have two generic lists:
List<string> TestList1 = new List<string>();
List<string> TestList2 = new List<string>();
TestList1.Add("1");
TestList1.Add("2");
TestList1.Add("3");
TestList2.Add("3");
TestList2.Add("4");
TestList2.Add("5");
What is the fastest way to find common items across these lists?
Assuming you use a version of .Net that has LINQ, you can use the Intersect extension method:
var CommonList = TestList1.Intersect(TestList2);
If you have lists of objects and want to get the common objects for some property then use;
var commons = TestList1.Select(s1 => s1.SomeProperty).ToList().Intersect(TestList2.Select(s2 => s2.SomeProperty).ToList()).ToList();
Note: SomeProperty refers to some criteria you want to implement.
Assuming you have LINQ available. I don't know if it's the fastest, but a clean way would be something like:
var distinctStrings = TestList1.Union(TestList2).Distinct(); // original version
var distinctStrings = TestList1.Union(TestList2); // Distinct() is redundant here
Update: never mind my answer, I've just learnt about Intersect as well! Note that Union returns the items found in either list, not only the common ones.
According to an update in the comments, Union already applies a distinct, which makes sense now that I think about it.
You can do this by counting occurrences of all items in all lists - those items whose occurrence count is equal to the number of lists, are common to all lists:
static List<T> FindCommon<T>(IEnumerable<List<T>> lists)
{
Dictionary<T, int> map = new Dictionary<T, int>();
int listCount = 0; // number of lists
foreach (IEnumerable<T> list in lists)
{
listCount++;
foreach (T item in list)
{
// Item encountered, increment count
int currCount;
if (!map.TryGetValue(item, out currCount))
currCount = 0;
currCount++;
map[item] = currCount;
}
}
List<T> result= new List<T>();
foreach (KeyValuePair<T,int> kvp in map)
{
// Items whose occurrence count is equal to the number of lists are common to all the lists
if (kvp.Value == listCount)
result.Add(kvp.Key);
}
return result;
}
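One caveat of the counting approach is that an item repeated inside a single list is counted more than once, so it can reach the threshold without being in every list. A possible variant, sketched here under the hypothetical name FindCommonDistinct, counts each item at most once per list.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CommonByCounting
{
    // Variant of the counting approach: each list contributes at most one count per item.
    public static List<T> FindCommonDistinct<T>(IEnumerable<IEnumerable<T>> lists)
    {
        var counts = new Dictionary<T, int>();
        int listCount = 0;
        foreach (var list in lists)
        {
            listCount++;
            foreach (T item in new HashSet<T>(list))   // drop repeats within one list
            {
                counts.TryGetValue(item, out int current); // current is 0 when the key is missing
                counts[item] = current + 1;
            }
        }
        // Items seen in every list have a count equal to the number of lists.
        return counts.Where(kvp => kvp.Value == listCount)
                     .Select(kvp => kvp.Key)
                     .ToList();
    }
}
```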
Sort both arrays and start from the top of both and compare if they are equal.
Using a hash is even faster: Put the first array in a hash, then compare every item of the second array if it is already in the hash.
I don't know how Intersect and Union are implemented. Try to find out their running time if you care about performance. Of course, they are better suited if you need clean code.
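The sort-then-compare idea could look roughly like the sketch below. SortedIntersect.Common is a hypothetical name, and sorting copies with ordinal comparison is an assumption made so that the inputs stay untouched and the ordering is predictable.

```csharp
using System;
using System.Collections.Generic;

public static class SortedIntersect
{
    // Sorts copies of both lists, then walks them in step to collect common values.
    public static List<string> Common(List<string> first, List<string> second)
    {
        var a = new List<string>(first);
        var b = new List<string>(second);
        a.Sort(StringComparer.Ordinal);
        b.Sort(StringComparer.Ordinal);

        var result = new List<string>();
        int i = 0, j = 0;
        while (i < a.Count && j < b.Count)
        {
            int cmp = string.CompareOrdinal(a[i], b[j]);
            if (cmp < 0) i++;          // a[i] is smaller, advance the first list
            else if (cmp > 0) j++;     // b[j] is smaller, advance the second list
            else
            {
                // Record each common value once, then skip repeats on both sides.
                string value = a[i];
                result.Add(value);
                while (i < a.Count && a[i] == value) i++;
                while (j < b.Count && b[j] == value) j++;
            }
        }
        return result;
    }
}
```

Sorting costs O(n log n) per list, whereas the hash-based approach is typically linear, which is consistent with the remark above that a hash is faster.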
Use the Intersect method:
IEnumerable<string> result = TestList1.Intersect(TestList2);
Using HashSet for fast lookup. Here is the solution:
using System;
using System.Linq;
using System.Collections.Generic;
public class Program
{
public static void Main()
{
List<int> list1 = new List<int> {1, 2, 3, 4, 5, 6 };
List<int> list2 = new List<int> {1, 2, 3 };
List<int> list3 = new List<int> {1, 2 };
var lists = new IEnumerable<int>[] {list1, list2, list3 };
var commons = GetCommonItems(lists);
Console.WriteLine("Common integers:");
foreach (var c in commons)
Console.WriteLine(c);
}
static IEnumerable<T> GetCommonItems<T>(IEnumerable<T>[] lists)
{
HashSet<T> hs = new HashSet<T>(lists.First());
for (int i = 1; i < lists.Length; i++)
hs.IntersectWith(lists[i]);
return hs;
}
}
Following the lead of @logicnp on counting the number of lists containing each member, once you have your list of lists, it's pretty much one line of code:
List<int> l1, l2, l3, cmn;
List<List<int>> all;
l1 = new List<int>() { 1, 2, 3, 4, 5 };
l2 = new List<int>() { 1, 2, 3, 4 };
l3 = new List<int>() { 1, 2, 3 };
all = new List<List<int>>() { l1, l2, l3 };
cmn = all.SelectMany(x => x).Distinct()
.Where(x => all .Select(y => (y.Contains(x) ? 1 : 0))
.Sum() == all.Count).ToList();
Or, if you prefer:
public static List<T> FindCommon<T>(IEnumerable<List<T>> Lists)
{
return Lists.SelectMany(x => x).Distinct()
.Where(x => Lists.Select(y => (y.Contains(x) ? 1 : 0))
.Sum() == Lists.Count()).ToList();
}

LINQ intersect, multiple lists, some empty

I'm trying to find an intersect with LINQ.
Sample:
List<int> int1 = new List<int>() { 1,2 };
List<int> int2 = new List<int>();
List<int> int3 = new List<int>() { 1 };
List<int> int4 = new List<int>() { 1, 2 };
List<int> int5 = new List<int>() { 1 };
Want to return: 1 as it exists in all lists.. If I run:
var intResult= int1
.Intersect(int2)
.Intersect(int3)
.Intersect(int4)
.Intersect(int5).ToList();
It returns nothing, as 1 obviously isn't in the int2 list. How do I get this to work regardless of whether a list is empty?
Use the above example or:
List<int> int1 = new List<int>() { 1,2 };
List<int> int2 = new List<int>();
List<int> int3 = new List<int>();
List<int> int4 = new List<int>();
List<int> int5 = new List<int>();
How do I return 1 & 2 in this case.. I don't know ahead of time if the lists are populated...
If you need it in a single step, the simplest solution is to filter out empty lists:
public static IEnumerable<T> IntersectNonEmpty<T>(this IEnumerable<IEnumerable<T>> lists)
{
var nonEmptyLists = lists.Where(l => l.Any());
return nonEmptyLists.Aggregate((l1, l2) => l1.Intersect(l2));
}
You can then use it on a collection of lists or other IEnumerables:
IEnumerable<int>[] lists = new[] { l1, l2, l3, l4, l5 };
var intersect = lists.IntersectNonEmpty();
You may prefer a regular static method:
public static IEnumerable<T> IntersectNonEmpty<T>(params IEnumerable<T>[] lists)
{
return lists.IntersectNonEmpty();
}
var intersect = ListsExtensionMethods.IntersectNonEmpty(l1, l2, l3, l4, l5);
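One thing to watch: Aggregate without a seed throws InvalidOperationException when the sequence it runs on is empty, which happens here if every input list is empty. A guarded variant might look like this sketch; the name IntersectNonEmptyOrDefault is hypothetical, and returning an empty result in that case is an assumption about the desired behaviour.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class IntersectHelpers
{
    // Intersects only the non-empty sequences. Returns an empty result when
    // every input is empty (a choice made for this sketch).
    public static IEnumerable<T> IntersectNonEmptyOrDefault<T>(IEnumerable<IEnumerable<T>> lists)
    {
        // Materialize the filtered sequences so the Any() check runs once per list.
        var nonEmpty = lists.Where(l => l.Any()).ToList();
        if (nonEmpty.Count == 0)
            return Enumerable.Empty<T>();
        return nonEmpty.Aggregate((l1, l2) => l1.Intersect(l2));
    }
}
```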
You could write an extension method to define that behaviour. Something like
static class MyExtensions
{
public static IEnumerable<T> IntersectAllIfEmpty<T>(this IEnumerable<T> list, IEnumerable<T> other)
{
if (other.Any())
return list.Intersect(other);
else
return list;
}
}
So the code below would print 1.
List<int> list1 = new List<int>() { 1, 2 };
List<int> list2 = new List<int>();
List<int> list3 = new List<int>() { 1 };
foreach (int i in list1.IntersectAllIfEmpty(list2).IntersectAllIfEmpty(list3))
Console.WriteLine(i);
Update:
Anon brings up a good point in the comments to the question. The above function will result in an empty set if list itself is empty, which should be desirable. This means if the first list in the method chain or the result set of any intersection is empty, the final result will be empty.
To allow for an empty first list but not for empty result sets, you could take a different approach. This is a method which is not an extension method, but rather takes a params array of IEnumerables and first filters out the empty sets and then attempts to intersect the rest.
public static IEnumerable<T> IntersectAllIfEmpty<T>(params IEnumerable<T>[] lists)
{
IEnumerable<T> results = null;
lists = lists.Where(l => l.Any()).ToArray();
if (lists.Length > 0)
{
results = lists[0];
for (int i = 1; i < lists.Length; i++)
results = results.Intersect(lists[i]);
}
else
{
results = new T[0];
}
return results;
}
You would use it like this
List<int> list0 = new List<int>();
List<int> list1 = new List<int>() { 1, 2 };
List<int> list2 = new List<int>() { 1 };
List<int> list3 = new List<int>() { 1,2,3 };
foreach (int i in IntersectAllIfEmpty(list0, list1, list2, list3))
{
Console.WriteLine(i);
}
