Related
I have two C# Lists of different sizes e.g.
List<int> list1 = new List<int>{1,2,3,4,5,6,7};
List<int> list2 = new List<int>{4,5,6,7,8,9};
I want to use the linq Zip method to combine these two into a list of tuples that is of the size list1. Here is the resulting list I am looking for
{(1,4), (2,5), (3,6), (4,7), (5,8), (6,9), (7,0)} //this is of type List<(int,int)
Since the last item of list1 does not has a counterpart in list2, I fill up my last item of the resulting list with a default value (in this case 0 as in my case it will never appear in any of the original lists).
Is there a way I can use the linq Zip method alone to achieve this?
You can use Concat to make them both the same size, and then zip it:
var zipped = list1.Concat(Enumerable.Repeat(0,Math.Max(list2.Count-list1.Count,0)))
.Zip(list2.Concat(Enumerable.Repeat(0,Math.Max(list1.Count-list2.Count,0))),
(a,b)=>(a,b));
Or create an extension method:
public static class ZipExtension{
public static IEnumerable<TResult> Zip<TFirst,TSecond,TResult>(
this IEnumerable<TFirst> first,
IEnumerable<TSecond> second,
Func<TFirst,TSecond,TResult> func,
TFirst padder1,
TSecond padder2)
{
var firstExp = first.Concat(
Enumerable.Repeat(
padder1,
Math.Max(second.Count()-first.Count(),0)
)
);
var secExp = second.Concat(
Enumerable.Repeat(
padder2,
Math.Max(first.Count()-second.Count(),0)
)
);
return firstExp.Zip(secExp, (a,b) => func(a,b));
}
}
So you can use like this:
//last 2 arguments are the padder values for list1 and list2
var zipped = list1.Zip(list2, (a,b) => (a,b), 0, 0);
There is a useful and popular MoreLinq library. Install it and use.
using MoreLinq;
var result = list1.ZipLongest(list2, (x, y) => (x, y));
Try this using Zip function-
static void Main(string[] args)
{
List<int> firstList = new List<int>() { 1, 2, 3, 4, 5, 6, 0, 34, 56, 23 };
List<int> secondList = new List<int>() { 4, 5, 6, 7, 8, 9, 1 };
int a = firstList.Count;
int b = secondList.Count;
for (int k = 0; k < (a - b); k++)
{
if(a>b)
secondList.Add(0);
else
firstList.Add(0);
}
var zipArray = firstList.Zip(secondList, (c, d) => c + " " + d);
foreach(var item in zipArray)
{
Console.WriteLine(item);
}
Console.Read();
}
Or you can try this using ZipLongest Function by installing MoreLinq nuget package-
static void Main(string[] args)
{
List<int> firstList = new List<int>() { 1, 2, 3, 4, 5, 6, 0, 34, 56, 23 };
List<int> secondList = new List<int>() { 4, 5, 6, 7, 8, 9, 1 };
var zipArray = firstList.ZipLongest(secondList, (c, d) => (c,d));
foreach (var item in zipArray)
{
Console.WriteLine(item);
}
Console.Read();
}
Try this code-
static void Main(string[] args)
{
List<int> firstList=new List<int>() { 1, 2, 3, 4, 5, 6,0,34,56,23};
List<int> secondList=new List<int>() { 4, 5, 6, 7, 8, 9,1};
int a = firstList.Count;
int b = secondList.Count;
if (a > b)
{
for(int k=0;k<(a-b);k++)
secondList.Add(0);
}
else
{
for (int k = 0; k < (b-a); k++)
firstList.Add(0);
}
for(int i=0;i<firstList.Count;i++)
{
for(int j=0;j<=secondList.Count;j++)
{
if(i==j)
Console.Write($"({Convert.ToInt32(firstList[i])},{ Convert.ToInt32(secondList[j])})" + "");
}
}
Console.Read();
}
Given two Lists of integer arrays of the form:
genData = { {1,2,3,4}, {1,2,3,4}, {1,2,3,4}, {1,2,3,4}};
orgData = {{1,2,3,4}, {1,2,3,4}, {2,4,6,8}, {1,2,3,4}};
I'd like to determine if the sum of two subarrays at the same index in both lists don't match. If the sums match, do nothing. If the sums don't match, convert every integer in both subarrays into a 0.
For example, in the two lists above the subarrays at index 2 have a non-matching sum (10 vs 20). I'd like to convert the lists to
genData = { {1,2,3,4}, {1,2,3,4}, {0,0,0,0}, {1,2,3,4} };
orgData = { {1,2,3,4}, {1,2,3,4}, {0,0,0,0}, {1,2,3,4} };
I'm trying to first create a list if sums by trying
var genDataSum = genDataList.ForEach(x => x.Sum());
But obviously, that's throwing up errors..."Cannot assign void to an implicitly typed value".
Any help or guidance will be greatly appreciated.
You need to use select to get the sum. list.foreach works like normal for loop.
List<int[]> genData = new List<int[]>
{
new int[] { 1, 2, 3, 4 },
new int[] { 1, 2, 3, 4 },
new int[] { 1, 2, 3, 4 },
new int[] { 1, 2, 3, 4 }
};
List<int[]> orgData = new List<int[]>
{
new int[] { 1, 2, 3, 4 },
new int[] { 1, 2, 3, 4 },
new int[] { 2, 4, 6, 8 },
new int[] { 1, 2, 3, 4 }
};
var sumsGenData = genData.Select(a => a.Sum()).ToList();
var sumsOrgData = orgData.Select(a => a.Sum()).ToList();
for (int i = 0; i < sumsGenData.Count; i++)
{
if (sumsGenData[i] != sumsOrgData[i])
{
orgData[i] = new int[] { 0, 0, 0, 0 };
}
}
ForEach doesn't return anything. Use Select.
var orgData = { {1,2,3,4}, {1,2,3,4}, {0,0,0,0}, {1,2,3,4} };
var sums = orgData.Select( a => a.Sum() );
Given a list of pairs such as
List<int> pair1 = new List<int>() { 1, 3};
List<int> pair2 = new List<int>() { 1, 2 };
List<int> pair3 = new List<int>() { 5, 3 };
List<int> pair4 = new List<int>() { 7, 8 };
List<int> pair5 = new List<int>() { 8, 11 };
List<int> pair6 = new List<int>() { 6, 9 };
List<int> pair7 = new List<int>() { 2, 13 };
List<int> pair8 = new List<int>() { 13, 16 };
How can I find all of the unions where the pairs intersect?
Output should be something like the following:
1,2,3,5,13,16
7,8,11
6,9
// create lists of pairs to sort through
static void links2XML(SQLiteConnection m_dbConnection)
{
List<int> pair1 = new List<int>() { 1, 3};
List<int> pair2 = new List<int>() { 1, 2 };
List<int> pair3 = new List<int>() { 5, 3 };
List<int> pair4 = new List<int>() { 7, 8 };
List<int> pair5 = new List<int>() { 8, 11 };
List<int> pair6 = new List<int>() { 6, 9 };
List<int> pair7 = new List<int>() { 2, 13 };
List<int> pair8 = new List<int>() { 13, 16 };
var pairs = new List<List<int>>();
pairs.Add(pair1);
pairs.Add(pair2);
pairs.Add(pair3);
pairs.Add(pair4);
pairs.Add(pair5);
pairs.Add(pair6);
pairs.Add(pair7);
pairs.Add(pair8);
var output = new List<int>();
foreach (var pair in pairs)
{
foreach (int i in followLinks(pair, pairs))
{
Console.Write(i + ",");
}
Console.WriteLine();
}
}
// loop through pairs to find intersections and recursively call function to
//build full list of all such ints
static List<int> followLinks(List<int> listA, List<List<int>> listB)
{
var links = listA;
var listC = listB.ToList();
bool added = false;
foreach (var l in listB)
{
var result = listA.Intersect(l);
if (result.Count<int>() > 0)
{
links = links.Union<int>(l).ToList();
listC.Remove(l); // remove pair for recursion after adding
added = true;
}
}
if (added)
{
followLinks(links, listC); //recursively call function with updated
//list of pairs and truncated list of lists of pairs
return links;
}
else return links;
}
Code should query the lists of pairs and output groups. I've tried this a few different ways, and this seems to have gotten the closest. I'm sure it requires a recursive loop, but figuring out the structure of it is just not making sense to me at the moment.
To clarify for some of the questions the number pairs are random that I chose for this question. The actual data set will be far larger and pulled from a database. That part is irrelevant to my question, though, and it's already solved anyway. It's really just this sorting that is giving me trouble.
To further clarify, the output will find a list of all of the integers from each pair that had an intersection... given pairs 1,2 and 1,3 the output would be 1,2,3. Given pairs 1,2 and 3,5, the output would be 1,2 for one list and 3,5 for the other. Hopefully that makes it clearer what I'm trying to find.
I used this function to return the full hash set of all links then loop through all of the initial pairs to feed this function. I filter the results to remove duplicates, and it solves the issue. Suggestion was from user Sven.
// Follow links to build hashset of all linked tags
static HashSet<int> followLinks(HashSet<int> testHs, List<HashSet<int>> pairs)
{
while (true)
{
var tester = new HashSet<int>(testHs);
bool keepGoing = false;
foreach (var p in pairs)
{
if (testHs.Overlaps(p))
{
testHs.UnionWith(p);
keepGoing = true;
}
}
for (int i = pairs.Count - 1; i == 0; i-- )
{
if (testHs.Overlaps(pairs[i]))
{
testHs.UnionWith(pairs[i]);
keepGoing = true;
}
}
if (!keepGoing) break;
if (testHs.SetEquals(tester)) break;
}
return testHs;
}
How can I find the set of items that occur in 2 or more sequences in a sequence of sequences?
In other words, I want the distinct values that occur in at least 2 of the passed in sequences.
Note:
This is not the intersect of all sequences but rather, the union of the intersect of all pairs of sequences.
Note 2:
The does not include the pair, or 2 combination, of a sequence with itself. That would be silly.
I have made an attempt myself,
public static IEnumerable<T> UnionOfIntersects<T>(
this IEnumerable<IEnumerable<T>> source)
{
var pairs =
from s1 in source
from s2 in source
select new { s1 , s2 };
var intersects = pairs
.Where(p => p.s1 != p.s2)
.Select(p => p.s1.Intersect(p.s2));
return intersects.SelectMany(i => i).Distinct();
}
but I'm concerned that this might be sub-optimal, I think it includes intersects of pair A, B and pair B, A which seems inefficient. I also think there might be a more efficient way to compound the sets as they are iterated.
I include some example input and output below:
{ { 1, 1, 2, 3, 4, 5, 7 }, { 5, 6, 7 }, { 2, 6, 7, 9 } , { 4 } }
returns
{ 2, 4, 5, 6, 7 }
and
{ { 1, 2, 3} } or { {} } or { }
returns
{ }
I'm looking for the best combination of readability and potential performance.
EDIT
I've performed some initial testing of the current answers, my code is here. Output below.
Original valid:True
DoomerOneLine valid:True
DoomerSqlLike valid:True
Svinja valid:True
Adricadar valid:True
Schmelter valid:True
Original 100000 iterations in 82ms
DoomerOneLine 100000 iterations in 58ms
DoomerSqlLike 100000 iterations in 82ms
Svinja 100000 iterations in 1039ms
Adricadar 100000 iterations in 879ms
Schmelter 100000 iterations in 9ms
At the moment, it looks as if Tim Schmelter's answer performs better by at least an order of magnitude.
// init sequences
var sequences = new int[][]
{
new int[] { 1, 2, 3, 4, 5, 7 },
new int[] { 5, 6, 7 },
new int[] { 2, 6, 7, 9 },
new int[] { 4 }
};
One-line way:
var result = sequences
.SelectMany(e => e.Distinct())
.GroupBy(e => e)
.Where(e => e.Count() > 1)
.Select(e => e.Key);
// result is { 2 4 5 7 6 }
Sql-like way (with ordering):
var result = (
from e in sequences.SelectMany(e => e.Distinct())
group e by e into g
where g.Count() > 1
orderby g.Key
select g.Key);
// result is { 2 4 5 6 7 }
May be fastest code (but not readable), complexity O(N):
var dic = new Dictionary<int, int>();
var subHash = new HashSet<int>();
int length = array.Length;
for (int i = 0; i < length; i++)
{
subHash.Clear();
int subLength = array[i].Length;
for (int j = 0; j < subLength; j++)
{
int n = array[i][j];
if (!subHash.Contains(n))
{
int counter;
if (dic.TryGetValue(n, out counter))
{
// duplicate
dic[n] = counter + 1;
}
else
{
// first occurance
dic[n] = 1;
}
}
else
{
// exclude duplucate in sub array
subHash.Add(n);
}
}
}
This should be very close to optimal - how "readable" it is depends on your taste. In my opinion it is also the most readable solution.
var seenElements = new HashSet<T>();
var repeatedElements = new HashSet<T>();
foreach (var list in source)
{
foreach (var element in list.Distinct())
{
if (seenElements.Contains(element))
{
repeatedElements.Add(element);
}
else
{
seenElements.Add(element);
}
}
}
return repeatedElements;
You can skip already Intesected sequences, this way will be a little faster.
public static IEnumerable<T> UnionOfIntersects<T>(this IEnumerable<IEnumerable<T>> source)
{
var result = new List<T>();
var sequences = source.ToList();
for (int sequenceIdx = 0; sequenceIdx < sequences.Count(); sequenceIdx++)
{
var sequence = sequences[sequenceIdx];
for (int targetSequenceIdx = sequenceIdx + 1; targetSequenceIdx < sequences.Count; targetSequenceIdx++)
{
var targetSequence = sequences[targetSequenceIdx];
var intersections = sequence.Intersect(targetSequence);
result.AddRange(intersections);
}
}
return result.Distinct();
}
How it works?
Input: {/*0*/ { 1, 2, 3, 4, 5, 7 } ,/*1*/ { 5, 6, 7 },/*2*/ { 2, 6, 7, 9 } , /*3*/{ 4 } }
Step 0: Intersect 0 with 1..3
Step 1: Intersect 1 with 2..3 (0 with 1 already has been intersected)
Step 2: Intersect 2 with 3 (0 with 2 and 1 with 2 already has been intersected)
Return: Distinct elements.
Result: { 2, 4, 5, 6, 7 }
You can test it with the below code
var lists = new List<List<int>>
{
new List<int> {1, 2, 3, 4, 5, 7},
new List<int> {5, 6, 7},
new List<int> {2, 6, 7, 9},
new List<int> {4 }
};
var result = lists.UnionOfIntersects();
You can try this approach, it might be more efficient and also allows to specify the minimum intersection-count and the comparer used:
public static IEnumerable<T> UnionOfIntersects<T>(this IEnumerable<IEnumerable<T>> source
, int minIntersectionCount
, IEqualityComparer<T> comparer = null)
{
if (comparer == null) comparer = EqualityComparer<T>.Default;
foreach (T item in source.SelectMany(s => s).Distinct(comparer))
{
int containedInHowManySequences = 0;
foreach (IEnumerable<T> seq in source)
{
bool contained = seq.Contains(item, comparer);
if (contained) containedInHowManySequences++;
if (containedInHowManySequences == minIntersectionCount)
{
yield return item;
break;
}
}
}
}
Some explaining words:
It enumerates all unique items in all sequences. Since Distinct is using a set this should be pretty efficient. That can help to speed up in case of many duplicates in all sequences.
The inner loop just looks into every sequence if the unique item is contained. Thefore it uses Enumerable.Contains which stops execution as soon as one item was found(so duplicates are no issue).
If the intersection-count reaches the minum intersection count this item is yielded and the next (unique) item is checked.
That should nail it:
int[][] test = { new int[] { 1, 2, 3, 4, 5, 7 }, new int[] { 5, 6, 7 }, new int[] { 2, 6, 7, 9 }, new int[] { 4 } };
var result = test.SelectMany(a => a.Distinct()).GroupBy(x => x).Where(g => g.Count() > 1).Select(y => y.Key).ToList();
First you make sure, there are no duplicates in each sequence. Then you join all sequences to a single sequence and look for duplicates as e.g. here.
Each key in dictionary has list of MANY integers. I need to iterate through each key and each time to take n items from list and do it until I iterate through all items in all lists. What is the best way to implement it? Do I need to implement some Enumerator?
The code:
enum ItemType { Type1=1, Type2=2, Type3=3 };
var items = new Dictionary<ItemType, List<int>>();
items[ItemType.Type1] = new List<int> { 1, 2, 3, 4, 5 };
items[ItemType.Type2] = new List<int> { 11, 12, 13, 15 };
items[ItemType.Type3] = new List<int> { 21, 22, 23, 24, 25, 26 };
For example: n=2.
1st iteration returns 1,2,11,12,21,22
2nd iteration returns 3,4,13,15,23,24
3rd iteration returns 5,25,26
UPDATED:
at the end I have to get list of this items in that order : 1,2,11,12,21,22, 3,4,13,15,23,24, 5,25,26
Here is how it might be done:
enum ItemType { Type1 = 1, Type2 = 2, Type3 = 3 };
Dictionary<ItemType, List<int>> items = new Dictionary<ItemType, List<int>>();
items[ItemType.Type1] = new List<int> { 1, 2, 3, 4, 5 };
items[ItemType.Type2] = new List<int> { 11, 12, 13, 15 };
items[ItemType.Type3] = new List<int> { 21, 22, 23, 24, 25, 26 };
// Define upper boundary of iteration
int max = items.Values.Select(v => v.Count).Max();
int i = 0, n = 2;
while (i + n <= max)
{
// Skip and Take - to select only next portion of elements, SelectMany - to merge resulting lists of portions
List<int> res = items.Values.Select(v => v.Skip(i).Take(n)).SelectMany(v => v).ToList();
i += n;
// Further processing of res
}
You don't need to define your custom enumerator, just use the MoveNext manually:
Step 1, convert you Dictionary<ItemType, List<int>> into Dictionary<ItemType, List<IEnumerator<int>>:
var iterators = items.ToDictionary(p => p.Key, p => (IEnumerator<int>)p.Value.GetEnumerator());
Step 2: handle MoveNext manually:
public List<int> Get(Dictionary<ItemType, IEnumerator<int>> iterators, int n)
{
var result = new List<int>();
foreach (var itor in iterators.Values)
{
for (var i = 0; i < n && itor.MoveNext(); i++)
{
result.Add(itor.Current);
}
}
return result;
}
Calling Get multiple times will give you the expected result. The enumerator itself will keep the current position.
This will do it for you:
var resultList = new List<int>();
items.ToList().ForEach(listInts => resultList.AddRange(listInts.Take(n));
This is letting LINQ extensions do the hard work for you. Take() will take as much as it can without throwing an exception if you request more than what there is. In this case I'm adding the results to another list, but you could just as easily tag another ForEach() on the end of the Take() in order to iterate the results.
I notice from the example sequences that you are retriving n number of items from x starting point - if you edit your question to include how the starting point is decided then I will adjust my example.
Edit:
Because you want to take n number of items from each list each iteration until there are no more elements returned, this will do it:
class Program
{
static void Main(string[] args)
{
var items = new Dictionary<ItemType, List<int>>();
items[ItemType.Type1] = new List<int> { 1, 2, 3, 4, 5 };
items[ItemType.Type2] = new List<int> { 11, 12, 13, 15 };
items[ItemType.Type3] = new List<int> { 21, 22, 23, 24, 25, 26 };
int numItemsTaken = 0;
var resultsList = new List<int>();
int n = 2, startpoint = 0, previousListSize = 0;
do
{
items.ToList().ForEach(x => resultsList.AddRange(x.Value.Skip(startpoint).Take(n)));
startpoint += n;
numItemsTaken = resultsList.Count - previousListSize;
previousListSize = resultsList.Count;
}
while (numItemsTaken > 0);
Console.WriteLine(string.Join(", ", resultsList));
Console.ReadKey();
}
enum ItemType { Type1 = 1, Type2 = 2, Type3 = 3 };
}
This is one of the few times you'll use a do while loop, and it will work regardless of the size of n or the size of your lists or how many lists there are.
The "best way" depends on your goal, e.g. readability or performance.
Here's one way:
var firstIter = items.Values.SelectMany(list => list.Take(2));
var secondIter = items.Values.SelectMany(list => list.Skip(2).Take(2));
var thirdIter = items.Values.SelectMany(list => list.Skip(4).Take(2));
var finalResult = firstIter.Concat(secondIter).Concat(thirdIter);
Edit: Here's a more general version:
var finalResult = Flatten(items, 0, 2);
IEnumerable<int> Flatten(
Dictionary<ItemType, List<int>> items,
int skipCount,
int takeCount)
{
var iter = items.Values.SelectMany(list => list.Skip(skipCount).Take(takeCount));
return
iter.Count() == 0 ? // a bit inefficient here
iter :
iter.Concat(Flatten(items, skipCount + takeCount, takeCount));
}