How to compare lists efficiently? - C#

I'm currently working on a web application in ASP.NET. In certain API calls it is necessary to compare ListA with a ListB of lists to determine whether ListA has the same elements as any list in ListB. In other words: whether ListA is included in ListB.
Both collections are queried with LINQ from an EF Code-First database. ListB has either one matching list or none, never more than one. In the worst case ListB has millions of elements, so the comparison needs to scale.
Instead of doing nested foreach loops, I'm looking for a pure LINQ query which will let the db do the work (before I consider a multi-column index).
To illustrate the structure:
// In reality the lists are queried from EF
var ListA = new List<Element>();
var ListB = new List<List<Element>>();
List<Element> solution = null;
bool flag = false;

foreach (List<Element> e1 in ListB)
{
    foreach (Element e2 in ListA)
    {
        if (e1.Any(e => e.id == e2.id))
        {
            flag = true;
        }
        else
        {
            flag = false;
            break;
        }
    }
    if (flag)
    {
        solution = e1;
        break;
    }
}
Update Structure
Since it's an EF database, I'll provide the relevant object structure. I'm not sure if I'm allowed to post real code, so this example is still generic.
// List B
class Result {
    ...
    public int Id;
    public virtual ICollection<Curve> curves;
    ...
}

class Curve {
    ...
    public int Id;
    public virtual Result result;
    public int resultId;
    public virtual ICollection<Point> points;
    ...
}

public class Point {
    ...
    public int Id;
    ...
}
The controller (for the API call) needs to serve the right Curve object. To identify it, a filter (ListA) is provided, which is in fact a Curve object.
Now the filter (ListA) needs to be compared to the list of Curves in Result (ListB).
The only way to compare two Curves is by comparing the Points they contain
(so in fact comparing lists).
Curves have around 1-50 Points.
A Result can have around 500,000,000 Curves.
It's possible to compare by object identity here, because all objects (even the filter) are re-queried from the db.
I'm looking for a way to implement this mechanism, not for a way around the situation (e.g. using a multi-column index, which would mean altering the table).
(for illustration purposes):
class controller {
    ...
    public Response serveRequest(Curve filter) {
        foreach (Curve c in db.Result.curves) {
            if (compare(filter.points, c.points)) return c;
        }
        return null; // no matching curve
    }
}

Use Except:
public static bool ContainsAllItems<T>(IList<T> listA, IList<T> listB)
{
    return !listB.Except(listA).Any();
}
The above method tells you whether listA contains all the elements of listB. Because Except builds a hash set of listA internally, the complexity is roughly O(n + m), much faster than the O(n * m) nested-loop approach.
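As a sketch of how this helper might be applied to the lists from the question (assuming ids are unique within each list; projecting to ids avoids relying on reference equality):
List<int> idsA = ListA.Select(e => e.id).ToList();
List<Element> solution = ListB.FirstOrDefault(candidate =>
    candidate.Count == idsA.Count &&
    ContainsAllItems(idsA, candidate.Select(e => e.id).ToList()));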

Try this:
bool isIn = ListB.Any(x => x.Count == ListA.Count && ListA.All(y => x.Contains(y)));
or, if you want the element
var solution = ListB.FirstOrDefault(x => x.Count == ListA.Count && ListA.All(y => x.Contains(y)));

I have something for you:
var db = new MyContext();
var a = db.LoadList(); // or whatever
var b = new List<IQueryable<Entities>>(db.LoadListOfLists()/*or whatever*/);
b.Any(x => x.Count() == a.Count && x.All(y => a.Any(z => z.Id == y.Id)));

Because performance is a concern, I would suggest converting your listA to a lookup/dictionary before comparing, e.g.:
var listALookup = listA.ToLookup(item => item.Id);
var result = listB.FirstOrDefault(childList =>
    childList.Count == listA.Count &&
    childList.All(childListItem => listALookup.Contains(childListItem.Id)));
Lookup.Contains is O(1), while List.Contains is O(n).
An even better option is to perform this comparison at the db level, to avoid loading unnecessary data.
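For illustration, a sketch of what such a db-level filter might look like with the entities from the question (db.Curves and the resultId scoping are assumptions, and whether the whole predicate translates to SQL depends on the EF version/provider):
// Sketch only: assumes a DbSet<Curve> named Curves and that the relevant Result is known.
var filterPointIds = filter.points.Select(p => p.Id).ToList();

var match = db.Curves
    .Where(c => c.resultId == resultId)                            // hypothetical scoping to one Result
    .Where(c => c.points.Count() == filterPointIds.Count
             && c.points.All(p => filterPointIds.Contains(p.Id)))  // same size and every point id matches
    .FirstOrDefault();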

Related

Balancing oriented graph

I have a set of edges looking like this:
public class Edge<T>
{
    public T From { get; set; }
    public T To { get; set; }
}
Now I would like to check whether my graph is balanced. By "balanced" I mean that every vertex has an equal count of incoming and outgoing edges. My current code is:
public static bool IsGraphBalanced<T>(List<Edge<T>> edges)
{
    var from = new Dictionary<T, int>();
    var to = new Dictionary<T, int>();
    foreach (var edge in edges)
    {
        if (!from.ContainsKey(edge.From))
            from.Add(edge.From, 0);
        if (!to.ContainsKey(edge.To))
            to.Add(edge.To, 0);
        from[edge.From] += 1;
        to[edge.To] += 1;
    }
    foreach (var kv in from)
    {
        if (!to.ContainsKey(kv.Key))
            return false;
        if (to[kv.Key] != kv.Value)
            return false;
    }
    // mirrored check with foreach on the "to" dictionary
    return true;
}
Can I replace it with LINQ?
P.S. The size of edges is under 100-150 items, so I care about readability rather than performance.
Here is a more concise implementation utilizing the Enumerable extension methods ToLookup, All, Count and Any (I'll let you decide whether it's more readable or not):
public static bool IsGraphBalanced<T>(List<Edge<T>> edges)
{
    var from = edges.ToLookup(e => e.From);
    var to = edges.ToLookup(e => e.To);
    return from.All(g => g.Count() == to[g.Key].Count())
        && to.All(g => from[g.Key].Any());
}
The ToLookup method is similar to GroupBy, but creates a reusable data structure (because we'll need 2 passes).
Then from.All(g => g.Count() == to[g.Key].Count()) checks that every From has a corresponding To and that their counts match. Note that if the key doesn't exist, the ILookup<TKey, TElement> indexer does not throw an exception or return null, but returns an empty IEnumerable<TElement>, which allows us to combine the checks.
Finally, to.All(g => from[g.Key].Any()) checks that every To has a corresponding From. There is no need to check the counts here because they were already checked in the previous step.
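For example, the indexer behaves like this:
var lookup = new[] { "a", "bb", "cc" }.ToLookup(s => s.Length);
Console.WriteLine(lookup[2].Count()); // 2 ("bb" and "cc")
Console.WriteLine(lookup[9].Count()); // 0 -- a missing key yields an empty sequence, no exception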

Merge two Lists of different types

I add data to the objects of a List from another List:
public void MergeLsts(List<A> lstA, List<B> lstB)
{
    foreach (A dataA in lstA)
    {
        foreach (B dataB in lstB)
        {
            if (dataA.ItemNo == dataB.ItemNo)
            {
                // dataA.ItemDescription is up to this point empty!
                dataA.ItemDescription = dataB.ItemDescription;
            }
        }
    }
    DoSomethingWithTheNewLst(lstA);
}
This works perfectly fine. However, it takes pretty long because both lists get quite large (around 70k items in lstA and 20k items in lstB).
I was wondering whether there is a faster or more efficient way to accomplish this. Maybe with LINQ?
You can do it with O(n) complexity instead of O(n²) with Join():
var joinedData = lstA.Join(lstB, dA => dA.ItemNo, dB => dB.ItemNo, (dA, dB) => new { dA, dB });
foreach (var pair in joinedData)
{
    pair.dA.ItemDescription = pair.dB.ItemDescription;
}
Distinct, GroupBy and Join operations use hashing, so they should be close to O(N) instead of O(N²).
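If you prefer to keep an explicit loop, roughly the same O(n) behaviour can be had with a dictionary. A sketch, assuming ItemNo is unique within lstB (use ToLookup if it is not) and that ItemDescription is a string:
var descriptionsByItemNo = lstB.ToDictionary(b => b.ItemNo, b => b.ItemDescription);
foreach (var dataA in lstA)
{
    string description;
    if (descriptionsByItemNo.TryGetValue(dataA.ItemNo, out description))
        dataA.ItemDescription = description;
}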

LINQ with Querying "Memory"

Does LINQ have a way to "memorize" its previous query results while querying?
Consider the following case:
public class Foo {
    public int Id { get; set; }
    public ICollection<Bar> Bars { get; set; }
}

public class Bar {
    public int Id { get; set; }
}
Now, if two or more Foo have the same collection of Bar (no matter the order), they are considered similar Foo.
Example:
foo1.Bars = new List<Bar>() { bar1, bar2 };
foo2.Bars = new List<Bar>() { bar2, bar1 };
foo3.Bars = new List<Bar>() { bar3, bar1, bar2 };
In the above case, foo1 is similar to foo2, but both foo1 and foo2 are not similar to foo3.
Given a query result consisting of an IEnumerable or IOrderedEnumerable of Foo, we are to find the first N Foo which are not similar.
This task seems to require a memory of the collections of Bars which have been chosen before.
With partial LINQ we could do it like this:
private bool areBarsSimilar(ICollection<Bar> bars1, ICollection<Bar> bars2) {
    return bars1.Count == bars2.Count && // have the same amount of bars
        !bars1.Select(x => x.Id)
              .Except(bars2.Select(y => y.Id))
              .Any(); // and when excepted does not return any element, the bars are similar
}

public void somewhereWithQueryResult() {
    ...
    List<Foo> topNFoos = new List<Foo>(); // this serves as a memory for the previous query
    int N = 50; // can be any number
    foreach (var q in query) { // query is IOrderedEnumerable or IEnumerable
        if (topNFoos.Count == 0 || !topNFoos.Any(foo => areBarsSimilar(foo.Bars, q.Bars)))
            topNFoos.Add(q);
        if (topNFoos.Count >= N) // we have had enough Foo
            break;
    }
}
The topNFoos list serves as a memory of the previous query results, and in the foreach loop we can skip any Foo q whose Bars are already identical to those of any Foo in topNFoos.
My question is: is there any way to do that in LINQ (fully LINQ)?
var topNFoos = from q in query
               // put something
               select q;
If the "memory" required comes from a particular query item q or from a variable outside of the query, then we could use a let variable to cache it:
int index = 0;
var topNFoos = from q in query
               let qc = index++ + q.Id // depends on q or on an outside variable like index, so it is OK
               select q;
But if it must come from the previous results of the query itself, then things start to get more troublesome.
Is there any way to do that?
Edit:
(I am currently creating a test case (github link) for the answers. Still figuring out how I can test all the answers fairly.)
(Most of the answers below are aimed at solving my particular question and are in themselves good (Rob's, spender's, and David B's answers which use IEqualityComparer are particularly awesome). Nevertheless, if there is anyone who can answer my more general question, "does LINQ have a way to memorize its previous query results while querying", I would also be glad.)
(Apart from the significant difference in performance between fully and partially LINQ solutions for the particular case I presented above, one answer aimed at my general question about LINQ memory is Ivan Stoev's. Another one with a good combination is Rob's. To make myself clearer, I am looking for a general and efficient solution, if there is any, using LINQ.)
I'm not going to answer your question directly, but rather, propose a method that will be fairly optimally efficient for filtering the first N non-similar items.
First, consider writing an IEqualityComparer<Foo> that uses the Bars collection to measure equality. Here, I'm assuming that the lists might contain duplicate entries, so have quite a strict definition of similarity:
public class FooSimilarityComparer : IEqualityComparer<Foo>
{
    public bool Equals(Foo a, Foo b)
    {
        // called infrequently
        return a.Bars.OrderBy(bar => bar.Id).SequenceEqual(b.Bars.OrderBy(bar => bar.Id));
    }
    public int GetHashCode(Foo foo)
    {
        // called frequently
        unchecked
        {
            return foo.Bars.Sum(b => b.GetHashCode());
        }
    }
}
You can really efficiently get the top N non-similar items by using a HashSet with the IEqualityComparer above:
IEnumerable<Foo> someFoos; //= some list of Foo
var hs = new HashSet<Foo>(new FooSimilarityComparer());
foreach (var f in someFoos)
{
    hs.Add(f); // hash sets don't add duplicates, as measured by the FooSimilarityComparer
    if (hs.Count >= 50)
    {
        break;
    }
}
@Rob's approach is broadly similar, and shows how you can use the comparer directly in LINQ, but pay attention to the comments I made on that answer.
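For completeness, a sketch of plugging the comparer above straight into LINQ: Distinct streams lazily, so Take stops enumerating the source as soon as 50 dissimilar items have been found.
var firstDissimilar = someFoos
    .Distinct(new FooSimilarityComparer())
    .Take(50)
    .ToList();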
So, it's ... possible. But this is far from performant code.
var res = query.Select(q => new {
        original = q,
        matches = query.Where(innerQ => areBarsSimilar(q.Bars, innerQ.Bars))
    })
    .Select(g => new { original = g, joinKey = string.Join(",", g.matches.Select(m => m.Id)) })
    .GroupBy(g => g.joinKey)
    .Select(g => g.First().original.original)
    .Take(N);
This assumes that the Ids are unique for each Foo (you could also use their GetHashCode(), I suppose).
A much better solution is to either keep what you've done, or implement a custom comparer, as follows:
Note: As pointed out in the comments by @spender, the Equals and GetHashCode below will not work for collections with duplicates. Refer to their answer for a better implementation; however, the usage code would remain the same.
class MyComparer : IEqualityComparer<Foo>
{
    public bool Equals(Foo left, Foo right)
    {
        return left.Bars.Count() == right.Bars.Count() && // have the same amount of bars
            left.Bars.Select(x => x.Id)
                .Except(right.Bars.Select(y => y.Id))
                .ToList().Count == 0; // and when excepted returns 0 elements, the bars are similar
    }

    public int GetHashCode(Foo foo)
    {
        unchecked {
            int hc = 0;
            if (foo.Bars != null)
                foreach (var p in foo.Bars)
                    hc ^= p.GetHashCode();
            return hc;
        }
    }
}
And then your query becomes simply:
var res = query
    .GroupBy(q => q, new MyComparer())
    .Select(g => g.First())
    .Take(N);
IEnumerable<Foo> dissimilarFoos =
    from foo in query
    let key = string.Join("|",
        from bar in foo.Bars
        orderby bar.Id
        select bar.Id.ToString())
    group foo by key into g
    select g.First();

IEnumerable<Foo> firstDissimilarFoos =
    dissimilarFoos.Take(50);
Sometimes, you may not like the behavior of groupby in the above queries. At the time the query is enumerated, groupby will enumerate the entire source. If you only want partial enumeration, then you should switch to Distinct and a Comparer:
class FooComparer : IEqualityComparer<Foo>
{
    private string keyGen(Foo foo)
    {
        return string.Join("|",
            from bar in foo.Bars
            orderby bar.Id
            select bar.Id.ToString());
    }
    public bool Equals(Foo left, Foo right)
    {
        if (left == null || right == null) return false;
        return keyGen(left) == keyGen(right);
    }
    public int GetHashCode(Foo foo)
    {
        return keyGen(foo).GetHashCode();
    }
}
then write:
IEnumerable<Foo> dissimilarFoos = query.Distinct(new FooComparer());
IEnumerable<Foo> firstDissimilarFoos = dissimilarFoos.Take(50);
Idea. You might be able to hack something by devising your own fluent interface of mutators over a cache that you'd capture in "let x = ..." clauses, along the lines of,
from q in query
let qc = ... // your cache mechanism here
select ...
but I suspect you'll have to be careful to limit the updates to your cache to those "let ..." clauses only, as I doubt the implementations of the standard LINQ operators and extension methods will be happy if you allow such side effects to happen behind their back through predicates applied in the "where", "join", "group by", etc. clauses.
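A minimal sketch of that idea, with the cache living in a closure rather than inside the query (LINQ to Objects only; it relies on the where predicate being evaluated exactly once per element, in order, which is precisely the kind of side effect warned about above):
var seenKeys = new HashSet<string>();
var firstDissimilar =
    (from q in query
     let key = string.Join("|", q.Bars.Select(b => b.Id).OrderBy(id => id))
     where seenKeys.Add(key)   // Add returns false for a key that has already been seen
     select q)
    .Take(50);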
HTH
I guess by "full LINQ" you mean standard LINQ operators/Enumerable extension methods.
I don't think this can be done with LINQ query syntax. Of the standard methods, the only one that supports mutable processing state is Enumerable.Aggregate, but it gives you nothing more than a LINQ flavor over a plain foreach:
var result = query.Aggregate(new List<Foo>(), (list, next) =>
{
    if (list.Count < 50 && !list.Any(item => areBarsSimilar(item.Bars, next.Bars)))
        list.Add(next);
    return list;
});
Since it looks like we are allowed to use helper methods (like areBarsSimilar), the best we can do is make it at least look more LINQ-ish by defining and using a custom extension method:
var result = query.Aggregate(new List<Foo>(), (list, next) =>
    list.Count < 50 && !list.Any(item => areBarsSimilar(item.Bars, next.Bars))
        ? list.Concat(next) : list);
where the custom method is
public static class Utils
{
    public static List<T> Concat<T>(this List<T> list, T item) { list.Add(item); return list; }
}
But note that compared to a vanilla foreach, Aggregate has an additional drawback: it cannot exit early, so it will consume the whole input sequence (which, besides the performance cost, also means it doesn't work with infinite sequences).
Conclusion: While this should answer your original question, i.e. it's technically possible to do what you are asking for, LINQ (like standard SQL) is not well suited for this type of processing.

C# LINQ deferred execution challenge - help needed in creating 3 different lists

I am trying to create 3 different lists (1,2,3) from 2 existing lists (A,B).
The 3 lists need to identify the following relationships.
List 1 - the items that are in list A and not in list B
List 2 - the items that are in list B and not in list A
List 3 - the items that are in both lists.
I then want to join all the lists together into one list.
My problem is that I want to identify the differences by adding an enum to the items of each list identifying the relationship. But by adding the enum, the LINQ Except function (obviously) no longer identifies the lists as the same. Because the LINQ queries are deferred, I cannot resolve this by changing the order of my statements, i.e. identifying the lists first and then adding the enums.
This is the code that I have got to (it doesn't work properly).
There might be a better approach.
List<ManufactorListItem> manufactorItemList =
    manufactorRepository.GetManufactorList();

// Get the Manufactors from the Families repository
List<ManufactorListItem> familyManufactorList =
    this.familyRepository.GetManufactorList(familyGuid);

// Identify Manufactors that are only found in the Manufactor Repository
List<ManufactorListItem> inManufactorsOnly =
    manufactorItemList.Except(familyManufactorList).ToList();

// Mark them as (Parent Only)
foreach (ManufactorListItem manOnly in inManufactorsOnly) {
    manOnly.InheritanceState = EnumInheritanceState.InParent;
}

// Identify Manufactors that are only found in the Family Repository
List<ManufactorListItem> inFamiliesOnly =
    familyManufactorList.Except(manufactorItemList).ToList();

// Mark them as (Child Only)
foreach (ManufactorListItem famOnly in inFamiliesOnly) {
    famOnly.InheritanceState = EnumInheritanceState.InChild;
}

// Identify Manufactors that are found in both Repositories
List<ManufactorListItem> sameList =
    manufactorItemList.Intersect(familyManufactorList).ToList();

// Mark them accordingly
foreach (ManufactorListItem same in sameList) {
    same.InheritanceState = EnumInheritanceState.InBoth;
}

// Create an output List
List<ManufactorListItem> manufactors = new List<ManufactorListItem>();

// Join all of the lists together.
manufactors = sameList.Union(inManufactorsOnly)
    .Union(inFamiliesOnly).ToList();
Any ideas how to get around this?
Thanks in advance
You can make it much simpler:
List<ManufactorListItem> manufactorItemList = ...;
List<ManufactorListItem> familyManufactorList = ...;

var allItems = manufactorItemList.ToDictionary(i => i, i => InheritanceState.InParent);
foreach (var familyManufactor in familyManufactorList)
{
    allItems[familyManufactor] = allItems.ContainsKey(familyManufactor) ?
        InheritanceState.InBoth :
        InheritanceState.InChild;
}

// that's all, now we can get any of the subsets:
var inFamiliesOnly = allItems.Where(p => p.Value == InheritanceState.InChild).Select(p => p.Key);
var inManufactorsOnly = allItems.Where(p => p.Value == InheritanceState.InParent).Select(p => p.Key);
var allManufactors = allItems.Keys;
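If the items themselves need their InheritanceState set, as in the original code, the dictionary values can be copied back in one pass (a sketch, using the property name from the question):
foreach (var pair in allItems)
{
    pair.Key.InheritanceState = pair.Value; // assumes the property type matches the enum used above
}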
This seems like the simplest way to me:
(I'm using the following Enum for simplicity:
public enum ContainedIn
{
    AOnly,
    BOnly,
    Both
}
)
var la = new List<int> { 1, 2, 3 };
var lb = new List<int> { 2, 3, 4 };

var l1 = la.Except(lb)
    .Select(i => new Tuple<int, ContainedIn>(i, ContainedIn.AOnly));
var l2 = lb.Except(la)
    .Select(i => new Tuple<int, ContainedIn>(i, ContainedIn.BOnly));
var l3 = la.Intersect(lb)
    .Select(i => new Tuple<int, ContainedIn>(i, ContainedIn.Both));

var combined = l1.Union(l2).Union(l3);
So long as you have access to the Tuple<T1, T2> class (I think it's a .NET 4 addition).
If the problem is with the Except() statement, then I suggest you use the overload of Except that takes an IEqualityComparer<ManufactorListItem>, so you can provide a custom comparer which tests the appropriate ManufactorListItem fields, but not the InheritanceState.
e.g. your equality comparer might look like:
public class ManufactorComparer : IEqualityComparer<ManufactorListItem> {
    public bool Equals(ManufactorListItem x, ManufactorListItem y) {
        // you need to write a method here that tests all the fields except InheritanceState
    }
    public int GetHashCode(ManufactorListItem obj) {
        // you need to write a simple hash code generator here using any/all the fields except InheritanceState
    }
}
and then you would call this using code a bit like
// Identify Manufactors that are only found in the Manufactor Repository
List<ManufactorListItem> inManufactorsOnly =
manufactorItemList.Except(familyManufactorList, new ManufactorComparer()).ToList();

LINQ Combine Queries

I have two collections of objects of different types. Let's call them type ALPHA and type BRAVO. Each of these types has a property that is the "ID" for the object. No ID is duplicated within a class, so for any given ID there is at most one ALPHA and one BRAVO instance. What I need to do is divide them into 3 categories:
Instances of the ID in ALPHA which do not appear in the BRAVO collection;
Instances of the ID in BRAVO which do not appear in the ALPHA collection;
Instances of the ID which appear in both collections.
In all 3 cases, I need to have the actual objects from the collections at hand for subsequent manipulation.
I know for the #3 case, I can do something like:
var myCorrelatedItems = myAlphaItems.Join(myBravoItems, alpha => alpha.Id, beta => beta.Id, (inner, outer) => new
{
alpha = inner,
beta = outer
});
I can also write code for the #1 and #2 cases which look something like
var myUnmatchedAlphas = myAlphaItems.Where(alpha=>!myBravoItems.Any(bravo=>alpha.Id==bravo.Id));
And similarly for unMatchedBravos. Unfortunately, this would result in iterating the collection of alphas (which may be very large!) many times, and the collection of bravos (which may also be very large!) many times as well.
Is there any way to unify these query concepts so as to minimize iteration over the lists? These collections can have thousands of items.
If you are only interested in the IDs,
var alphaIds = myAlphaItems.Select(alpha => alpha.ID);
var bravoIds = myBravoItems.Select(bravo => bravo.ID);
var alphaIdsNotInBravo = alphaIds.Except(bravoIds);
var bravoIdsNotInAlpha = bravoIds.Except(alphaIds);
If you want the alphas and bravos themselves,
var alphaIdsSet = new HashSet<int>(alphaIds);
var bravoIdsSet = new HashSet<int>(bravoIds);
var alphasNotInBravo = myAlphaItems
.Where(alpha => !bravoIdsSet.Contains(alpha.ID));
var bravosNotInAlpha = myBravoItems
.Where(bravo => !alphaIdsSet.Contains(bravo.ID));
EDIT:
A few other options:
The ExceptBy method from MoreLinq.
The Enumerable.ToDictionary method.
If both types inherit from a common type (e.g. an IHasId interface), you could write your own IEqualityComparer<T> implementation; Enumerable.Except has an overload that accepts an equality-comparer as a parameter.
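A sketch of that last option, with IHasId and the comparer as illustrative names only (they do not exist in the question's code):
public interface IHasId
{
    int Id { get; }
}

public class IdComparer : IEqualityComparer<IHasId>
{
    public bool Equals(IHasId x, IHasId y)
    {
        return x != null && y != null && x.Id == y.Id;
    }
    public int GetHashCode(IHasId obj)
    {
        return obj.Id;
    }
}

// usage, assuming Alpha and Bravo both implement IHasId:
// var alphasNotInBravo = myAlphaItems.Except<IHasId>(myBravoItems, new IdComparer());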
Sometimes LINQ is not the answer. This is the kind of problem where I would consider using a HashSet<T> with a custom comparer to reduce the work of performing set operations. HashSets are much more efficient at performing set operations than lists - and (depending on the data) can reduce the work considerably:
// create a wrapper class that can accommodate either an Alpha or a Bravo
class ABItem {
    public Object Instance { get; private set; }
    public int Id { get; private set; }
    public ABItem( Alpha a ) { Instance = a; Id = a.Id; }
    public ABItem( Bravo b ) { Instance = b; Id = b.Id; }
}

// comparer that compares ABItems (and hence Alphas and Bravos) by id
class ABItemComparer : IEqualityComparer<ABItem> {
    public bool Equals( ABItem a, ABItem b ) {
        return a.Id == b.Id;
    }
    public int GetHashCode( ABItem x ) {
        return x.Id;
    }
}

// create the hash sets, comparing ABItems by their IDs
var comparer = new ABItemComparer();
var hashAlphas =
    new HashSet<ABItem>(myAlphaItems.Select(x => new ABItem(x)), comparer);
var hashBravos =
    new HashSet<ABItem>(myBravoItems.Select(x => new ABItem(x)), comparer);

// items with common IDs in the Alpha and Bravo sets:
var hashCommon = new HashSet<ABItem>(hashAlphas, comparer);
hashCommon.IntersectWith(hashBravos);

hashAlphas.ExceptWith(hashCommon); // items only in Alpha
hashBravos.ExceptWith(hashCommon); // items only in Bravo
Dictionary<int, Alpha> alphaDictionary = myAlphaItems.ToDictionary(a => a.Id);
Dictionary<int, Bravo> bravoDictionary = myBravoItems.ToDictionary(b => b.Id);
ILookup<string, int> keyLookup = alphaDictionary.Keys
.Union(bravoDictionary.Keys)
.ToLookup(x => alphaDictionary.ContainsKey(x) ?
(bravoDictionary.ContainsKey(x) ? "both" : "alpha") :
"bravo");
List<Alpha> alphaBoth = keyLookup["both"].Select(x => alphaDictionary[x]).ToList();
List<Bravo> bravoBoth = keyLookup["both"].Select(x => bravoDictionary[x]).ToList();
List<Alpha> alphaOnly = keyLookup["alpha"].Select(x => alphaDictionary[x]).ToList();
List<Bravo> bravoOnly = keyLookup["bravo"].Select(x => bravoDictionary[x]).ToList();
Here is one possible LINQ solution that performs a full outer join on both sets and appends a property to them showing which group they belong to. This solution might lose its luster, however, when you try to separate the groups into different variables. It all really depends on what kind of actions you need to perform on these objects. At any rate this ran at (I thought) an acceptable speed (.5 seconds) for me on lists of 5000 items:
var q =
    from g in
        (from id in myAlphaItems.Select(a => a.ID).Union(myBravoItems.Select(b => b.ID))
         join a in myAlphaItems on id equals a.ID into ja
         from a in ja.DefaultIfEmpty()
         join b in myBravoItems on id equals b.ID into jb
         from b in jb.DefaultIfEmpty()
         select (a == null ?
             new { ID = b.ID, Group = "Bravo Only" } :
             (b == null ?
                 new { ID = a.ID, Group = "Alpha Only" } :
                 new { ID = a.ID, Group = "Both" }
             )
         )
        )
    group g.ID by g.Group;
You can remove the 'group by' query or create a dictionary from this (q.ToDictionary(x => x.Key, x => x.Select(y => y))), or whatever! This is simply a way of categorizing your items. I'm sure there are better solutions out there, but this seemed like a truly interesting question so I thought I might as well give it a shot!
I think LINQ is not the best answer to this problem if you want to traverse and compare the minimum number of times. I think the following iterative solution is more performant, and I believe code readability doesn't suffer.
var dictUnmatchedAlphas = myAlphaItems.ToDictionary(a => a.Id);
var myCorrelatedItems = new List<AlphaAndBravo>();
var myUnmatchedBravos = new List<Bravo>();

foreach (Bravo b in myBravoItems)
{
    var id = b.Id;
    if (dictUnmatchedAlphas.ContainsKey(id))
    {
        var a = dictUnmatchedAlphas[id];
        dictUnmatchedAlphas.Remove(id); // to get just the unmatched alphas
        myCorrelatedItems.Add(new AlphaAndBravo { a = a, b = b });
    }
    else
    {
        myUnmatchedBravos.Add(b);
    }
}
Definition of AlphaAndBravo:
public class AlphaAndBravo {
    public Alpha a { get; set; }
    public Bravo b { get; set; }
}
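After the loop, the three requested categories are then available roughly as follows (a sketch):
var myUnmatchedAlphas = dictUnmatchedAlphas.Values.ToList(); // alphas with no matching bravo
// myCorrelatedItems holds the matched pairs, myUnmatchedBravos the bravos with no matching alpha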
