I have a situation that is well explained in this question:
Range intersection / union
I need a C# implementation (a collection, maybe) that takes a list of ranges (of ints) and computes their union.
Then I need to iterate through all the ints in this collection (including the numbers between each range's endpoints).
Is there any library/implementation so that I don't have to write everything myself?
You might take a look at this implementation and see if it will fit your needs.
Combine ranges with Range.Coalesce:
var range1 = Range.Create(0, 5, "Range 1");
var range2 = Range.Create(11, 41, "Range 2");
var range3 = Range.Create(34, 50, "Range 3");
var ranges = new List<Range> { range1, range2, range3 };
var unioned = Range.Coalesce(ranges);
Iterate over ranges with .Iterate:
foreach (var range in unioned)
{
foreach (int i in range.Iterate(x => x + 1))
{
Debug.WriteLine(i);
}
}
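The Range type used above comes from the linked implementation. For readers who only need the coalescing idea, here is a minimal self-contained sketch using value tuples (C# 7 or later). The names RangeSketch, Coalesce and Values are hypothetical, and the sketch assumes inclusive bounds with Start <= End; it is not the linked library's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RangeSketch
{
    // Merges overlapping or adjacent inclusive ranges into a minimal sorted list.
    // Assumption: every input range satisfies Start <= End.
    public static List<(int Start, int End)> Coalesce(IEnumerable<(int Start, int End)> ranges)
    {
        var merged = new List<(int Start, int End)>();
        foreach (var r in ranges.OrderBy(x => x.Start))
        {
            // Extend the last merged range when the next one overlaps or touches it.
            // The long casts avoid overflow when End is int.MaxValue.
            if (merged.Count > 0 && (long)r.Start <= (long)merged[merged.Count - 1].End + 1)
            {
                var last = merged[merged.Count - 1];
                merged[merged.Count - 1] = (last.Start, Math.Max(last.End, r.End));
            }
            else
            {
                merged.Add(r);
            }
        }
        return merged;
    }

    // Lazily yields every integer covered by the union, in ascending order.
    public static IEnumerable<int> Values(IEnumerable<(int Start, int End)> ranges)
    {
        foreach (var (start, end) in Coalesce(ranges))
            for (long i = start; i <= end; i++)
                yield return (int)i;
    }
}
```

For the three ranges in the example, (0, 5), (11, 41) and (34, 50), Coalesce returns (0, 5) and (11, 50), and Values walks 0..5 followed by 11..50 without materializing the numbers up front.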
The simplest thing that comes to my mind is to use Enumerable.Range and then combine the resulting IEnumerable<int> sequences with the standard LINQ operators. Note that the second argument of Enumerable.Range is a count, not an end value. Something like:
var list = Enumerable.Range(1, 5) // 1..5
.Concat(Enumerable.Range(7, 5)) // 7..11
.Concat(Enumerable.Range(13, 10)); // 13..22
foreach(var number in list)
// Do something
Obviously you can use Union and Intersect as well. You can also put your ranges in a List<IEnumerable<int>> or something similar and then iterate over them to produce a single sequence of the elements:
var ranges = new List<IEnumerable<int>>
{
Enumerable.Range(1, 5), // 1..5 (the second argument is a count)
Enumerable.Range(7, 5), // 7..11
Enumerable.Range(10, 13) // 10..22
};
var unionOfRanges = Enumerable.Empty<int>();
foreach(var range in ranges)
unionOfRanges = unionOfRanges.Union(range);
foreach(var item in unionOfRanges)
// Do something
The following is a vanilla LINQ implementation:
var r1 = Enumerable.Range(1,10);
var r2 = Enumerable.Range(20,5);
var r3 = Enumerable.Range(-5,10);
var union = r1.Union(r2).Union(r3); // Union already removes duplicates
foreach(var n in union.OrderBy(n=>n))
Console.WriteLine(n);
System.Collections.Generic.HashSet has just the thing:
UnionWith( IEnumerable<T> other ). Modifies the current HashSet object to contain all elements that are present in itself, the specified collection, or both.
IntersectWith( IEnumerable<T> other ). Modifies the current HashSet object to contain only elements that are present in that object and in the specified collection.
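As a rough sketch of how UnionWith could be applied to the ranges in the question (the class and method names are hypothetical, and each range is assumed to be given as a start plus a count, matching Enumerable.Range):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HashSetRangeSketch
{
    // Builds the union of several (start, count) ranges as a sorted list of ints.
    public static List<int> UnionOf(params (int Start, int Count)[] ranges)
    {
        var set = new HashSet<int>();
        foreach (var (start, count) in ranges)
            set.UnionWith(Enumerable.Range(start, count)); // values already present are ignored

        // HashSet<T> does not guarantee enumeration order, so sort explicitly.
        return set.OrderBy(n => n).ToList();
    }
}
```

Calling HashSetRangeSketch.UnionOf((1, 5), (7, 5), (10, 13)) yields 1..5 followed by 7..22. Keep in mind that this stores every individual int, so memory grows with the total size of the ranges rather than with the number of ranges.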
The data structure you are looking for is called an "interval tree".
You can find different implementations on the net.
For example here's one: http://www.emilstefanov.net/Projects/RangeSearchTree.aspx
Suppose I have a list of lists. I want to create a new list from it such that the elements are in the order shown in the example below.
Inputs:-
List<List<int>> l = new List<List<int>>();
List<int> a = new List<int>();
a.Add(1);
a.Add(2);
a.Add(3);
a.Add(4);
List<int> b = new List<int>();
b.Add(11);
b.Add(12);
b.Add(13);
b.Add(14);
b.Add(15);
b.Add(16);
b.Add(17);
b.Add(18);
l.Add(a);
l.Add(b);
Output(list):-
1
11
2
12
3
13
4
14
15
16
The output list must not contain more than 10 elements.
I am currently doing this with a foreach inside a while loop, but I want to know how to do it using LINQ.
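int index = 0;
List<int> o = new List<int>();
// Stop at 10 items, or once index is past the end of every list
while (o.Count < 10 && l.Any(x => index < x.Count))
{
foreach (List<int> x in l)
{
// Skip lists that have no element at this index
if (o.Count < 10 && index < x.Count)
o.Add(x[index]);
}
index++;
}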
Thanks.
Use the SelectMany and Select overloads that receive the item's index; the indexes are then used to apply the desired ordering. SelectMany flattens the nested collections. Finally, apply Take to retrieve only the desired number of items:
var result = l.SelectMany((nested, index) =>
nested.Select((item, nestedIndex) => (index, nestedIndex, item)))
.OrderBy(i => i.nestedIndex)
.ThenBy(i => i.index)
.Select(i => i.item)
.Take(10);
Or in query syntax:
var result = (from c in l.Select((nestedCollection, index) => (nestedCollection, index))
from i in c.nestedCollection.Select((item, index) => (item, index))
orderby i.index, c.index
select i.item).Take(10);
If you are using C# 6.0 or earlier, project to an anonymous type instead:
var result = l.SelectMany((nested, index) =>
nested.Select((item, nestedIndex) => new {index, nestedIndex, item}))
.OrderBy(i => i.nestedIndex)
.ThenBy(i => i.index)
.Select(i => i.item)
.Take(10);
To explain why Zip alone is not enough: Zip is equivalent to performing an inner join of the second collection to the first, where the attribute to join by is the index. Therefore only items that exist in the first collection and have a match in the second will appear in the result.
The next option is to think about a left join, which returns all items of the first collection with a match (if one exists) in the second. In the case described, the OP is looking for the functionality of a full outer join: get all items of both collections and match them when possible.
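To make the truncation concrete, here is a small sketch (the class and method names are hypothetical) that interleaves two lists with Zip alone:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipSketch
{
    // Interleaves two lists using only Zip; pairing stops at the end of the shorter list.
    public static List<int> InterleaveWithZip(List<int> a, List<int> b)
    {
        return a.Zip(b, (x, y) => new[] { x, y }) // one pair per shared index
                .SelectMany(pair => pair)         // flatten the pairs into a single sequence
                .ToList();
    }
}
```

With a = { 1, 2, 3, 4 } and b = { 11, ..., 18 } from the question, this returns 1, 11, 2, 12, 3, 13, 4, 14. The trailing 15 to 18 are dropped because a has no elements at those indexes, which is why the index-based ordering above is needed.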
I know you asked for LINQ, but I often feel that LINQ is a hammer, and as soon as a developer finds it, every problem is a nail. I wouldn't have done this one with LINQ, from a readability/maintainability point of view, because I think something like this is simpler, easier to understand, and more self-documenting:
List<int> r = new List<int>(10);
for(int i = 0; i < 10; i++){
// the Count checks keep the output at 10 items at most
if(i < a.Count && r.Count < 10)
r.Add(a[i]);
if(i < b.Count && r.Count < 10)
r.Add(b[i]);
}
You don't need to stop the loop early if a and b collectively have only, e.g., 8 items, but you could by extending the test of the for loop.
I also think this may be more performant than LINQ because it's doing a lot less.
If your mandate to use LINQ is academic (this is homework that must use LINQ), then go ahead, but if it's a normal everyday system that some other poor sucker will have to maintain one day, I implore you to consider whether this is a good application for LINQ.
This will handle 2 or more inner lists in a List<List<int>>. It returns an IEnumerable<int> via yield, so you have to call .ToList() on it to make it a list. LINQ's Any is used for the break criterion.
It will throw if any list is null. Add checks to your liking.
static IEnumerable<int> FlattenZip (List<List<int>> ienum, int maxLength = int.MaxValue)
{
int done = 0;
int index = 0;
int yielded = 0;
while (yielded < maxLength && ienum.Any (list => index < list.Count))
foreach (var l in ienum)
{
done++;
if (index < l.Count)
{
// this list is big enough, we will take one out
yielded++;
yield return l[index];
}
if (yielded >= maxLength)
break; // we are done
if (done % (ienum.Count) == 0)
index += 1; // checked all lists, advancing index
}
}
public static void Main ()
{
// other testcases to consider:
// in total too few elements
// one list empty (but not null)
// too many lists (11 for 10 elements)
var l1 = new List<int> { 1, 2, 3, 4 };
var l2 = new List<int> { 11, 12, 13, 14, 15, 16 };
var l3 = new List<int> { 21, 22, 23, 24, 25, 26 };
var l = new List<List<int>> { l1, l2, l3 };
var zipped = FlattenZip (l, 10);
Console.WriteLine (string.Join (", ", zipped));
Console.ReadLine ();
}
Given a list of lists (let's say 5 lists, to have a real number with which to work), I can find items that are common to all 5 lists with relative ease (see Intersection of multiple lists with IEnumerable.Intersect()) using a variation of the following code:
var list1 = new List<int>() { 1, 2, 3 };
var list2 = new List<int>() { 2, 3, 4 };
var list3 = new List<int>() { 3, 4, 5 };
var listOfLists = new List<List<int>>() { list1, list2, list3 };
var intersection = listOfLists.Aggregate((previousList, nextList) => previousList.Intersect(nextList).ToList());
Now let's say that intersection ends up containing 0 items. It's quite possible that there are some objects that are common to 4/5 lists. How would I go about finding them in the most efficient way?
I know I could just run through all the combinations of 4 lists and save all the results, but that method doesn't scale very well (this will eventually have to be done on approx. 40 lists).
If no item is common to 4 lists, then the search would be repeated looking for items common to 3/5 lists, etc. Visually, this could be represented by lists of grid points and we're searching for the points that have the most overlap.
Any ideas?
EDIT:
Maybe it would be better to look at each point and keep track of how many times it appears in each list, then create a list of the points with the highest occurrence?
You can select all numbers (points) from all lists and group them by value. Then sort the result by group size (i.e. the number of lists in which the point is present) and select the most common item:
var mostCommon = listOfLists.SelectMany(l => l)
.GroupBy(i => i)
.OrderByDescending(g => g.Count())
.Select(g => g.Key)
.First();
// outputs 3
Instead of taking only first item, you can take several top items by replacing First() with Take(N).
Returning items with number of lists (ordered by number of lists):
var mostCommonItems = from l in listOfLists
from i in l
group i by i into g
orderby g.Count() descending
select new {
Item = g.Key,
NumberOfLists = g.Count()
};
Usage (item is a strongly-typed anonymous object):
var topItem = mostCommonItems.First();
var item = topItem.Item;
var listsCount = topItem.NumberOfLists;
foreach(var entry in mostCommonItems.Take(3))
// iterate over top three items
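One caveat: the group size equals the number of lists only if no list contains the same item twice. A sketch that guards against this by applying Distinct to each list first (the names OverlapSketch and RankByListCount are hypothetical, and value tuples require C# 7 or later):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OverlapSketch
{
    // Returns each item with the number of lists that contain it, most widely shared first.
    public static List<(T Item, int ListCount)> RankByListCount<T>(IEnumerable<IEnumerable<T>> lists)
    {
        return lists
            .SelectMany(list => list.Distinct()) // each list contributes an item at most once
            .GroupBy(item => item)
            .Select(g => (Item: g.Key, ListCount: g.Count()))
            .OrderByDescending(x => x.ListCount)
            .ToList();
    }
}
```

The first entry is the item shared by the most lists. To get every item at the best overlap level, take the entries whose ListCount equals the first entry's ListCount.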
You can first combine all the lists, then find the Mode of the list using a dictionary strategy as follows. This makes it pretty fast:
/// <summary>
/// Gets the element that occurs most frequently in the collection.
/// </summary>
/// <param name="list"></param>
/// <returns>Returns the element that occurs most frequently in the collection.
/// If several elements are tied for the highest count, one of them is
/// returned; which one is not specified.</returns>
public static T Mode<T>(this IEnumerable<T> list)
{
// Initialize the return value
T mode = default(T);
// Test for a null reference and an empty list
if (list != null && list.Count() > 0)
{
// Store the number of occurrences for each element
Dictionary<T, int> counts = new Dictionary<T, int>();
// Add one to the count for each occurrence of an element
foreach (T element in list)
{
if (counts.ContainsKey(element))
counts[element]++;
else
counts.Add(element, 1);
}
// Loop through the counts of each element and find the
// element that occurred most often
int max = 0;
foreach (KeyValuePair<T, int> count in counts)
{
if (count.Value > max)
{
// Update the mode
mode = count.Key;
max = count.Value;
}
}
}
return mode;
}
I want to compare two lists with the same number of elements, and find the number of differences between them. Right now, I have this code (which works):
public static int CountDifferences<T> (this IList<T> list1, IList<T> list2)
{
if (list1.Count != list2.Count)
throw new ArgumentException ("Lists must have the same number of elements", "list2");
int count = 0;
for (int i = 0; i < list1.Count; i++) {
if (!EqualityComparer<T>.Default.Equals (list1[i], list2[i]))
count++;
}
return count;
}
This feels messy to me, and it seems like there must be a more elegant way to achieve it. Is there a way, perhaps, to combine the two lists into a single list of tuples, then simply examine each element of the new list to see if both elements are equal?
Since order in the list does count this would be my approach:
public static int CountDifferences<T>(this IList<T> list1, IList<T> list2)
{
if (list1.Count != list2.Count)
throw new ArgumentException("Lists must have the same number of elements", "list2");
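// EqualityComparer<T>.Default handles null elements, unlike a.Equals(b)
int count = list1.Zip(list2, (a, b) => EqualityComparer<T>.Default.Equals(a, b) ? 0 : 1).Sum();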
return count;
}
This simply merges the lists using Enumerable.Zip() and then sums up the differences. It is still O(n), but it enumerates each list just once.
Also this approach would work on any two IEnumerable of the same type since we do not use the list indexer (besides obviously in your count comparison in the guard check).
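A sketch of what that IEnumerable-based variant could look like (the class name and the optional comparer parameter are assumptions added for illustration, not part of the answer above):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DifferenceSketch
{
    // Counts the positions at which two sequences differ.
    public static int CountDifferences<T>(IEnumerable<T> first, IEnumerable<T> second,
                                          IEqualityComparer<T> comparer = null)
    {
        comparer = comparer ?? EqualityComparer<T>.Default;

        // Zip stops at the shorter sequence, so a length mismatch is NOT detected here.
        return first.Zip(second, (a, b) => comparer.Equals(a, b))
                    .Count(equal => !equal);
    }
}
```

Because there is no Count to compare up front, this version silently ignores any extra trailing elements. If equal lengths matter, keep the guard from the IList<T> version.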
I think your approach is fine, but you could use LINQ to simplify your function:
public static int CountDifferences<T>(this IList<T> list1, IList<T> list2)
{
if(list1.Count != list2.Count)
throw new ArgumentException("Lists must have same # elements", "list2");
return list1.Where((t, i) => !Equals(t, list2[i])).Count();
}
The way you have it written in the question, I don't think Intersect does what you're looking for. For example, say you have:
var list1 = new List<int> { 1, 2, 3, 4, 6, 8 };
var list2 = new List<int> { 1, 2, 4, 5, 6, 8 };
If you run list1.CountDifferences(list2), I'm assuming that you want to get back 2, since the elements at indexes 2 and 3 are different. Intersect in this case will return 5 elements, since the lists have 5 elements in common. So, if you're looking for 5, then Intersect is the way to go. If you're looking to return 2, then you could use the LINQ statement above.
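The two numbers can be checked side by side with a small sketch (the class and method names are hypothetical; the lists are assumed to have the same length):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IntersectVsPositional
{
    // Returns the number of shared values and the number of positions that differ.
    public static (int Common, int PositionalDifferences) Compare(IList<int> list1, IList<int> list2)
    {
        int common = list1.Intersect(list2).Count();                  // order is ignored
        int positional = list1.Where((t, i) => t != list2[i]).Count(); // index by index
        return (common, positional);
    }
}
```

For { 1, 2, 3, 4, 6, 8 } and { 1, 2, 4, 5, 6, 8 } this gives Common = 5 and PositionalDifferences = 2.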
Try something like this:
var result = list1.Intersect(list2);
var differences = list1.Count - result.Count();
If order counts:
var result = list1.Where((x, i) => !Equals(x, list2[i]));
var differences = result.Count();
You want the Intersect extension method of Enumerable.
public static int CountDifferences<T> (this IList<T> list1, IList<T> list2)
{
if (list1.Count != list2.Count)
throw new ArgumentException ("Lists must have the same number of elements", "list2");
return list1.Count - list1.Intersect(list2).Count();
}
You can use LINQ's Zip extension method (Enumerable.Zip).
List<int> lst1 = new List<int> { 1, 2, 3, 4, 5 };
List<int> lst2 = new List<int> { 6, 2, 9, 4, 5 };
int cntDiff = lst1.Zip(lst2, (a, b) => a != b).Count(a => a);
// Output is 2
I am trying to create 3 different lists (1,2,3) from 2 existing lists (A,B).
The 3 lists need to identify the following relationships.
List 1 - the items that are in list A and not in list B
List 2 - the items that are in list B and not in list A
List 3 - the items that are in both lists.
I then want to join all the lists together into one list.
My problem is that I want to identify the differences by adding an enum identifying the relationship to the items of each list. But once the enum is set, the Except LINQ function (obviously) no longer recognizes the items as the same. Because the LINQ queries are deferred, I cannot resolve this by changing the order of my statements, i.e. identifying the lists first and then adding the enums.
This is the code I have so far (it doesn't work properly).
There might be a better approach.
List<ManufactorListItem> manufactorItemList =
manufactorRepository.GetManufactorList();
// Get the Manufactors from the Families repository
List<ManufactorListItem> familyManufactorList =
this.familyRepository.GetManufactorList(familyGuid);
// Identify Manufactors that are only found in the Manufactor Repository
List<ManufactorListItem> inManufactorsOnly =
manufactorItemList.Except(familyManufactorList).ToList();
// Mark them as (Parent Only)
foreach (ManufactorListItem manOnly in inManufactorsOnly) {
manOnly.InheritanceState = EnumInheritanceState.InParent;
}
// Identify Manufactors that are only found in the Family Repository
List<ManufactorListItem> inFamiliesOnly =
familyManufactorList.Except(manufactorItemList).ToList();
// Mark them as (Child Only)
foreach (ManufactorListItem famOnly in inFamiliesOnly) {
famOnly.InheritanceState = EnumInheritanceState.InChild;
}
// Identify Manufactors that are found in both Repositories
List<ManufactorListItem> sameList =
manufactorItemList.Intersect(familyManufactorList).ToList();
// Mark them Accordingly
foreach (ManufactorListItem same in sameList) {
same.InheritanceState = EnumInheritanceState.InBoth;
}
// Create an output List
List<ManufactorListItem> manufactors = new List<ManufactorListItem>();
// Join all of the lists together.
manufactors = sameList.Union(inManufactorsOnly).
Union(inFamiliesOnly).ToList();
Any ideas how to get around this?
Thanks in advance
You can make it much simpler:
List<ManufactorListItem> manufactorItemList = ...;
List<ManufactorListItem> familyManufactorList = ...;
var allItems = manufactorItemList.ToDictionary(i => i, i => EnumInheritanceState.InParent);
foreach (var familyManufactor in familyManufactorList)
{
allItems[familyManufactor] = allItems.ContainsKey(familyManufactor) ?
EnumInheritanceState.InBoth :
EnumInheritanceState.InChild;
}
//that's all, now we can get any subset of items:
var inFamiliesOnly = allItems.Where(p => p.Value == EnumInheritanceState.InChild).Select(p => p.Key);
var inManufactorsOnly = allItems.Where(p => p.Value == EnumInheritanceState.InParent).Select(p => p.Key);
var allManufactors = allItems.Keys;
This seems like the simplest way to me:
(I'm using the following Enum for simplicity:
public enum ContainedIn
{
AOnly,
BOnly,
Both
}
)
var la = new List<int> {1, 2, 3};
var lb = new List<int> {2, 3, 4};
var l1 = la.Except(lb)
.Select(i => new Tuple<int, ContainedIn>(i, ContainedIn.AOnly));
var l2 = lb.Except(la)
.Select(i => new Tuple<int, ContainedIn>(i, ContainedIn.BOnly));
var l3 = la.Intersect(lb)
.Select(i => new Tuple<int, ContainedIn>(i, ContainedIn.Both));
var combined = l1.Union(l2).Union(l3);
This works so long as you have access to the Tuple<T1, T2> class (I think it's a .NET 4 addition).
If the problem is with the Except() statement, then I suggest you use the overload of Except that takes a custom IEqualityComparer<ManufactorListItem>. The comparer should test the appropriate ManufactorListItem fields, but not the InheritanceState.
e.g. your equality comparer might look like:
public class ManufactorComparer : IEqualityComparer<ManufactorListItem> {
public bool Equals(ManufactorListItem x, ManufactorListItem y) {
// you need to write a method here that tests all the fields except InheritanceState
}
public int GetHashCode(ManufactorListItem obj) {
// you need to write a simple hash code generator here using any/all the fields except InheritanceState
}
}
and then you would call this using code a bit like
// Identify Manufactors that are only found in the Manufactor Repository
List<ManufactorListItem> inManufactorsOnly =
manufactorItemList.Except(familyManufactorList, new ManufactorComparer()).ToList();
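Since the question does not show ManufactorListItem, here is one possible filled-in comparer under stated assumptions: the Id and Name properties and the ManufactorIdComparer name are hypothetical, and identity is assumed to be determined by Id alone.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum EnumInheritanceState { InParent, InChild, InBoth }

// Hypothetical shape: the real ManufactorListItem is not shown in the question.
public class ManufactorListItem
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public EnumInheritanceState InheritanceState { get; set; }
}

public class ManufactorIdComparer : IEqualityComparer<ManufactorListItem>
{
    public bool Equals(ManufactorListItem x, ManufactorListItem y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Id == y.Id; // InheritanceState is deliberately ignored
    }

    public int GetHashCode(ManufactorListItem obj)
    {
        // Must agree with Equals: hash only the fields that Equals compares.
        return obj is null ? 0 : obj.Id.GetHashCode();
    }
}
```

With such a comparer, two items that share an Id are treated as equal by Except, Intersect and Union even after their InheritanceState values have been set differently.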
I have an interesting problem, and I can't seem to figure out the lambda expression to make this work.
I have the following code:
List<string[]> list = GetSomeData(); // Returns large number of string[]'s
List<string[]> list2 = GetSomeData2(); // similar data, but smaller subset
List<string[]> newList = list.FindAll(predicate(string[] line){
return (???);
});
I want to return only those records in list in which element 0 of each string[] is equal to one of the element 0's in list2.
list contains data like this:
"000", "Data", "more data", "etc..."
list2 contains data like this:
"000", "different data", "even more different data"
Fundamentally, I could write this code like this:
List<string[]> newList = new List<string[]>();
foreach(var e in list)
{
foreach(var e2 in list2)
{
if (e[0] == e2[0])
newList.Add(e);
}
}
return newList;
But I'm trying to use generics and lambdas more, so I'm looking for a nice clean solution. This one is frustrating me, though. Maybe a Find inside of a Find?
EDIT:
Marc's answer below led me to experiment with a variation that looks like this:
var z = list.Where(x => list2.Select(y => y[0]).Contains(x[0])).ToList();
I'm not sure how efficient this is, but it works and is sufficiently succinct. Does anyone else have any suggestions?
You could join? I'd use two steps myself, though:
var keys = new HashSet<string>(list2.Select(x => x[0]));
var data = list.Where(x => keys.Contains(x[0]));
If you only have .NET 2.0, then either install LINQBridge and use the above (or similar with a Dictionary<> if LINQBridge doesn't include HashSet<>), or perhaps use nested Find:
var data = list.FindAll(arr => list2.Find(arr2 => arr2[0] == arr[0]) != null);
Note, though, that the Find approach is O(n*m), whereas the HashSet<> approach is O(n+m).
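For completeness, the join mentioned at the start could be sketched as follows (the class and method names are hypothetical). The keys from list2 are made distinct first, because a plain join would otherwise repeat a row from list once for every matching row in list2:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinSketch
{
    // Keeps the rows of 'list' whose element 0 equals element 0 of some row in 'list2'.
    public static List<string[]> FilterByFirstElement(List<string[]> list, List<string[]> list2)
    {
        return list.Join(
                list2.Select(row => row[0]).Distinct(), // distinct keys avoid duplicate output rows
                row => row[0],                          // key taken from each row of 'list'
                key => key,                             // the inner sequence already holds keys
                (row, key) => row)
            .ToList();
    }
}
```

Enumerable.Join builds a hash-based lookup over the inner sequence, so this should behave much like the HashSet<> version in terms of complexity, and it preserves the order of list.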
You could use the Intersect extension method in System.Linq, but you would need to provide an IEqualityComparer to do the work.
static void Main(string[] args)
{
List<string[]> data1 = new List<string[]>();
List<string[]> data2 = new List<string[]>();
var result = data1.Intersect(data2, new Comparer());
}
class Comparer : IEqualityComparer<string[]>
{
#region IEqualityComparer<string[]> Members
bool IEqualityComparer<string[]>.Equals(string[] x, string[] y)
{
return x[0] == y[0];
}
int IEqualityComparer<string[]>.GetHashCode(string[] obj)
{
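// Must agree with Equals, which compares element 0; the array's own hash code is reference-based
return obj[0].GetHashCode();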
}
#endregion
}
Intersect may work for you.
Intersect finds all the items that are in both lists.
OK, I re-read the question. Intersect doesn't take the order into account.
I have written a slightly more complex linq expression that will return a list of items that are in the same position (index) with the same value.
List<String> list1 = new List<String>() {"000","33", "22", "11", "111"};
List<String> list2 = new List<String>() {"000", "22", "33", "11"};
List<String> subList = list1.Select ((value, index) => new { Value = value, Index = index})
.Where(w => list2.Skip(w.Index).FirstOrDefault() == w.Value )
.Select (s => s.Value).ToList();
Result: {"000", "11"}
Explanation of the query:
Select a set of values and position of that value.
Filter that set where the item in the same position in the second list has the same value.
Select just the value (not the index as well).
Note I used:
list2.Skip(w.Index).FirstOrDefault()
//instead of
list2[w.Index]
So that it will handle lists of different lengths.
If you know the lists will be the same length, or that list1 will always be the shorter one, then list2[w.Index] would probably be a bit faster.
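An alternative sketch that also copes with lists of different lengths is to pair the elements with Enumerable.Zip, which stops at the shorter list and avoids re-walking list2 with Skip for every index. The class and method names are hypothetical, and value tuples require C# 7 or later:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PositionalMatchSketch
{
    // Returns the values that appear at the same index in both sequences.
    public static List<string> SamePositionMatches(IEnumerable<string> first, IEnumerable<string> second)
    {
        return first.Zip(second, (a, b) => (Left: a, Right: b)) // pair elements by position
                    .Where(pair => pair.Left == pair.Right)
                    .Select(pair => pair.Left)
                    .ToList();
    }
}
```

For the two lists above this returns "000" and "11", the same result as the Skip-based query.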