LINQ join: Flexibility on list of structs - c#

Apologies if the question is dumb, but I'd like to know how to simplify my code:
I have a few lists:
public struct structure
{
public string item1;
public string item2;
public string item3;
public string item4;
}
public List <structure> listOfStruct=new List <structure>();
public List <string> lista;
public List <string> listb;
With the code below I use a LINQ join to find matches in both lists:
query = from x in listOfStruct
join y2 in lista on x.item1 equals y2
select new
{
x.action
};
foreach (var item in query)
{
Answer = item.answer;
}
The variable 'Answer' then is used for further processing.
The thing is that somewhere else in my code I need to match another list with listOfStruct, but with another item of the struct:
query = from x in listOfStruct
join y2 in listb on x.item2 equals y2
select new
{
x.action
};
foreach (var item in query)
{
Answer = item.answer;
}
I would like to write a function something like this
public string matchList(string actionListItem, string [] second_list){
query = from x in listOfStruct
join y2 in second_list on x.actionListItem equals y2
select new
{
x.action
};
foreach (var item in query)
{
Answer = item.answer;
}
return Answer;
}
and call it from anywhere in my program with something like the line below, to be flexible and not write the same piece of code over and over again:
var action= matchList(string itemInlistOfStruct,string []secondList)
where "itemInListOfStruct" could be item1, item2... or item4 and "secondList" could be lista or listb.
Is this possible?

You must define how your structure compares with a string. You can do that like this:
public struct structure
{
public string item1;
public string item2;
public string item3;
public string item4;
// An overload, not an override: object does not declare Equals(string).
public bool Equals(string obj)
{
return obj==item1;// put your own equality check here
}
}

Whatever you do with the matching elements of listOfStruct (which is very unclear in your question), filtering out those matches is independent of processing them afterwards. So I'd suggest splitting the matching and the further processing into two simpler steps.
I'd prefer to use a class instead of a struct in this case, so I renamed your struct structure to the class Thing, and also changed the type of Item2 for demonstration purposes:
public sealed class Thing
{
public string Item1 { get; set; }
public int Item2 { get; set; }
// ...
}
Now make some extension methods as you need:
public static class MyExtensions
{
public static IEnumerable<TSource> MatchBy<TSource, TItem>(this IEnumerable<TSource> source, Func<TSource, TItem> keySelector, IEnumerable<TItem> keys)
{
var keySet = keys as ISet<TItem> ?? new HashSet<TItem>(keys);
return source.Where(x => keySet.Contains(keySelector(x)));
// or equivalently, if you prefer
// return from x in source where keySet.Contains(keySelector(x)) select x;
}
public static string FurtherProcessing(this IEnumerable<Thing> things) => $"Number of things: {things.Count()}.";
}
And use them like this:
var things = new List<Thing>
{
new Thing { Item1 = "Foo", Item2 = 0 },
new Thing { Item1 = "Bar", Item2 = 1 },
new Thing { Item1 = "Bla", Item2 = 2 },
};
var answerByItem1 = things.MatchBy(thing => thing.Item1, new[] { "Foo", "Bar" }).FurtherProcessing();
var answerByItem2 = things.MatchBy(thing => thing.Item2, new[] { 1 }).FurtherProcessing();
If all of your ItemXXX are strings and you really need to identify them by name, you can add another overload of MatchBy:
public static IEnumerable<Thing> MatchBy(this IEnumerable<Thing> source, string keyName, IEnumerable<string> keys)
{
switch (keyName)
{
case nameof(Thing.Item1): return source.MatchBy(x => x.Item1, keys);
// case ...
default: throw new ArgumentOutOfRangeException(nameof(keyName));
}
}
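If you would rather keep the original join than switch to a HashSet lookup, the member to join on can be passed as a delegate in the same way. The following is only a minimal sketch. The Action field is an assumption (the question selects x.action, but the struct shown does not declare it), and JoinSketch and MatchActions are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public struct Structure
{
    public string Item1;
    public string Item2;
    public string Action; // assumed field; the question's struct does not declare it
}

public static class JoinSketch
{
    // Joins source against secondList on whichever member keySelector picks
    // and returns the Action of every matching element.
    public static IEnumerable<string> MatchActions(
        IEnumerable<Structure> source,
        Func<Structure, string> keySelector,
        IEnumerable<string> secondList)
    {
        return from x in source
               join y in secondList on keySelector(x) equals y
               select x.Action;
    }
}
```

A call such as JoinSketch.MatchActions(listOfStruct, x => x.Item2, listb) then matches on Item2. Unlike the HashSet version, a join returns an element once per matching entry in the second list, so duplicates in that list produce duplicate results.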

Related

recursive function call in foreach throwing system.stackoverflowexception

I am getting a System.StackOverflowException: 'Exception of type 'System.StackOverflowException' was thrown.' message.
My code is as follows. Here I want to assign values recursively based on the condition and return the list.
public class FancyTree
{
public string title { get; set; }
public string key { get; set; }
public List<FancyTree> children { get; set; }
}
For example, the FancyTree class produces output like parent->child, parent->parent->child, or parent->parent->parent->child, just like a TreeView structure.
public JsonResult EmployeesTree()
{
var output = converttoFancyTree(db.Database.GetEmployees(true));
return Json(output, JsonRequestBehavior.AllowGet);
}
public List<FancyTree> converttoFancyTree(List<EmpTable> emps)
{
var output = new List<FancyTree>();
foreach (var emp in emps)
{
var fancyTreeItem = new FancyTree();
fancyTreeItem.key = emp.EMP_ID.ToString();
fancyTreeItem.title = emp.EMP_NAME;
if (!string.IsNullOrEmpty(emp.TEAM))
{
//var empIDs = emp.TEAM?.Split(',')?.Select(Int32.Parse)?.ToList();
var tms = emp.TEAM.Split(',');
if (tms.Length > 0) {
var empIDs = new List<int>();
foreach (var t in tms)
{
empIDs.Add(int.Parse(t));
}
var TeamMembers = emps.Where(x => empIDs.Contains(x.EMP_ID)).ToList();
if (TeamMembers.Count > 0)
{
var childrens = converttoFancyTree(TeamMembers);
fancyTreeItem.children = childrens;
}
}
}
output.Add(fancyTreeItem);
}
return output;
}
I would assume your input is in the form of a plain list of objects, where each object contains the IDs of all the children, and you want to convert this to an object representation, i.e. something like:
public class Employee{
public int Id {get;}
public List<int> SubordinateIds {get;}
}
public class EmployeeTreeNode{
public IReadOnlyList<EmployeeTreeNode> Subordinates {get;}
public int Id {get;}
public EmployeeTreeNode(int id, IEnumerable<EmployeeTreeNode> subordinates){
Id = id;
// Materialize the sequence once; an IEnumerable cannot be assigned to IReadOnlyList directly.
Subordinates = subordinates.ToList();
}
}
To convert this to a tree representation we can start by finding the roots of the tree, i.e. employees that are not subordinate to anyone.
var allSubordinates = allEmployees.SelectMany(e => e.SubordinateIds).ToList();
var allRoots = allEmployees.Select(e => e.Id).Except(allSubordinates);
We then need an efficient way to find a specific employee by the Id, i.e. a dictionary:
var employeeById = allEmployees.ToDictionary(e => e.Id, e => e.SubordinateIds);
We can then finally do the actual recursion, and we can create a generic helper method to assist:
public static TResult MapChildren<T, TResult>(
T root,
Func<T, IEnumerable<T>> getChildren,
Func<T, IEnumerable<TResult>, TResult> map)
{
return RecurseBody(root);
TResult RecurseBody(T item) => map(item, getChildren(item).Select(RecurseBody));
}
...
var tree = allRoots.Select(r => MapChildren(
r,
id => employeeById[id],
(id, subordinates) => new EmployeeTreeNode(id, subordinates)));
This will recurse down to any employee without any subordinates, create EmployeeTreeNode for these, and then eventually traverse up the tree, creating node objects as it goes.
This assumes that there are no loops/cycles. If there are cycles, you do not have a tree, since trees are by definition acyclic, and the code will crash. You will instead need to handle the more general case of a graph, which is a harder problem, and you will need to decide how the cycles should be handled.
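One way to make that failure explicit is to track the ids on the current path and throw as soon as one reappears below itself. The following is only a sketch of that idea, not part of the answer's code. TreeSketch and MapChildrenChecked are hypothetical names, and throwing is just one of several possible policies (skipping the repeated node would be another).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TreeSketch
{
    // Same idea as MapChildren, but remembers the items on the current path
    // and throws instead of recursing forever when an item reappears below itself.
    public static TResult MapChildrenChecked<T, TResult>(
        T root,
        Func<T, IEnumerable<T>> getChildren,
        Func<T, IReadOnlyList<TResult>, TResult> map)
    {
        var path = new HashSet<T>();
        return Recurse(root);

        TResult Recurse(T item)
        {
            if (!path.Add(item))
                throw new InvalidOperationException($"Cycle detected at '{item}'.");
            // ToList forces the children to be built while 'item' is still on the path.
            var children = getChildren(item).Select(Recurse).ToList();
            path.Remove(item);
            return map(item, children);
        }
    }
}
```

Because an item is removed from the path once its subtree is finished, the same employee may still appear under two different managers; only a genuine loop triggers the exception.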

How to make two list have the same items (but their own sequence)

I have two list objects, A and B, like this:
A=[
{item:1, value:"ex1"},
{item:2, value:"ex1"},
{item:3, value:"ex1"},
{item:4, value:"ex1"}]
B=[{item:1, value:"ex2"},
{item:4, value:"ex2"},
{item:2, value:"ex3"},
{item:5, value:"ex3"}]
How can I make B have the same items/values as A while keeping its own sequence for the items it already has?
I want B to drop item 5, because A doesn't have it, and to get item 3 added at the end.
I don't want to clone A; I want to modify B to become like A.
What I want is:
Remove items in B when A doesn't have them: item5
Update items in B when both A and B have them: item1, item2, item4
Add non-existing items to the end of B when A has them: item3
So, the result should be like this:
B = [ {item:1, value:"ex1"},
{item:4, value:"ex1"},
{item:2, value:"ex1"},
{item:3, value:"ex1"} ]
My code (this is what I have now):
foreach (myClass itm in A)
{
foreach (myClass fd in B)
{
if (itm.item == fd.item)
{
fd.value = itm.value;
}
}
}
You can write an extension method that merges the lists by iterating over the keys and checking for existence.
static class ExtensionMethods
{
static public void MergeInto<TKey,TValue>(this Dictionary<TKey,TValue> rhs, Dictionary<TKey,TValue> lhs)
{
foreach (var key in rhs.Keys.Union(lhs.Keys).Distinct().ToList())
{
if (!rhs.ContainsKey(key))
{
lhs.Remove(key);
continue;
}
if (!lhs.ContainsKey(key))
{
lhs.Add(key, rhs[key]);
continue;
}
lhs[key] = rhs[key];
}
}
}
Test program:
public class Program
{
public static Dictionary<int,string> A = new Dictionary<int,string>
{
{ 1,"ex1" },
{ 2,"EX2" },
{ 3,"ex3" },
};
public static Dictionary<int,string> B = new Dictionary<int,string>
{
{ 1,"ex1" },
{ 2,"ex2" },
{ 4,"ex4" }
};
public static void Main()
{
A.MergeInto(B);
foreach (var entry in B )
{
Console.WriteLine("{0}={1}", entry.Key, entry.Value);
}
}
}
Output:
1=ex1
2=EX2
3=ex3
Code on DotNetFiddle
Without preserving order
If all you want to do is keep the instance of B, but make it so all its elements match A, you can just do this:
B.Clear();
B.AddRange(A);
Preserving order
If you want to preserve order, you can still use the solution above, but you will need to sort the list that is passed to AddRange(). This is only a little more work.
First, create a lookup table which tells you the order in which the Item values originally appeared. A generic C# Dictionary uses a hash table for the keys, so this ends up being more efficient than scanning the List repeatedly. Note that we pass B.Count to the constructor so that it only needs to allocate space once rather than repeatedly as it grows.
var orderBy = new Dictionary<int,int>(B.Count);
for (int i=0; i<B.Count; i++) orderBy.Add(B[i].Item, i);
Now we use our solution, sorting the input list:
B.Clear();
B.AddRange
(
A.OrderBy( item => orderBy.GetValueOrFallback(item.Item, int.MaxValue) )
);
GetValueOrFallback is a simple extension method on Dictionary<,> that makes it simpler to deal with keys that may or may not exist. You pass in the key you want, plus a value to return if the key is not found. In our case we pass int.MaxValue so that new items will be appended to the end.
static public TValue GetValueOrFallback<TKey,TValue>(this Dictionary<TKey,TValue> This, TKey keyToFind, TValue fallbackValue)
{
TValue result;
return This.TryGetValue(keyToFind, out result) ? result : fallbackValue;
}
Example
Put it all together with a test program:
public class MyClass
{
public int Item { get; set; }
public string Value { get; set; }
public override string ToString() { return Item.ToString() + "," + Value; }
}
static public class ExtensionMethods
{
static public TValue ValueOrFallback<TKey,TValue>(this Dictionary<TKey,TValue> This, TKey keyToFind, TValue fallbackValue)
{
TValue result;
return This.TryGetValue(keyToFind, out result) ? result : fallbackValue;
}
static public void MergeInto(this List<MyClass> mergeFrom, List<MyClass> mergeInto)
{
var orderBy = new Dictionary<int,int>(mergeInto.Count);
for (int i=0; i<mergeInto.Count; i++) orderBy.Add(mergeInto[i].Item, i);
mergeInto.Clear();
mergeInto.AddRange
(
mergeFrom.OrderBy( item => orderBy.ValueOrFallback(item.Item, int.MaxValue) )
);
}
}
public class Program
{
public static List<MyClass> A = new List<MyClass>
{
new MyClass { Item = 2,Value = "EX2" },
new MyClass { Item = 3,Value = "ex3" },
new MyClass { Item = 1,Value = "ex1" }
};
public static List<MyClass> B = new List<MyClass>
{
new MyClass { Item = 1,Value = "ex1" },
new MyClass { Item = 2,Value = "ex2" },
new MyClass { Item = 4,Value = "ex3" },
};
public static void Main()
{
A.MergeInto(B);
foreach (var b in B) Console.WriteLine(b);
}
}
Output:
1,ex1
2,EX2
3,ex3
Code on DotNetFiddle
This is what you specify in the question. I tested it and it works:
class myClass
{
public int item;
public string value;
//ctor:
public myClass(int item, string value) { this.item = item; this.value = value; }
}
static void updateList()
{
var listA = new List<myClass> { new myClass(1, "A1"), new myClass(2, "A2"), new myClass(3, "A3"), new myClass(4, "A4") };
var listB = new List<myClass> { new myClass(1, "B1"), new myClass(4, "B4"), new myClass(2, "B2"), new myClass(5, "B5") };
for (int i = 0; i < listB.Count; i++) //use index to be able to use RemoveAt which is faster
{
var b = listB[i];
var j = listA.FindIndex(x => x.item == b.item);
if (j >= 0) //A has this item, update its value
{
var v = listA[j].value;
if (b.value != v) b.value = v;
}
else //A does not have this item
{
listB.RemoveAt(i);
i--; //the next element has moved into slot i, so visit this index again
}
}
foreach (var a in listA)
{
//if (!listB.Contains(a)) listB.Add(a);
if (!listB.Any(b => b.item == a.item)) listB.Add(a);
}
}
You can do something similar to this:
var listA = new List<int> { 1, 3, 5 };
var listB = new List<int> { 1, 4, 3 };
//Removes items in B that aren't in A.
//This will remove 4, leaving the sequence of B as 1,3
listB.RemoveAll(x => !listA.Contains(x));
//Gets items in A that aren't in B
//This will return the number 5
var items = listA.Where(y => !listB.Any(x => x == y));
//Add the items in A that aren't in B to the end of the list
//This adds 5 to the end of the list
foreach(var item in items)
{
listB.Add(item);
}
//List B should be 1,3,5
//Join the elements; passing the list itself would only print its type name
Console.WriteLine(string.Join(",", listB));

Linq group by except column

I have a class with a large number of properties that I need to group by almost all columns.
class Sample {
public string S1 { get; set; }
public string S2 { get; set; }
public string S3 { get; set; }
public string S4 { get; set; }
// ... all the way to this:
public string S99 { get; set; }
public decimal? N1 { get; set; }
public decimal? N2 { get; set; }
public decimal? N3 { get; set; }
public decimal? N4 { get; set; }
// ... all the way to this:
public decimal? N99 { get; set; }
}
From time to time I need to group by all columns except one or two decimal columns and return some result based on this (namely object with all the fields, but with some decimal value as a sum or max).
Is there any extension method that would allow me to do something like this:
sampleCollection.GroupByExcept(x => x.N2, x => x.N5).Select(....);
instead of specifying all columns in object?
You won't find anything built-in that handles such a case. You'd have to create one yourself. Depending on how robust you need this to be, you could take a number of approaches.
The main hurdle you'll come across is how you'll generate the key type. In an ideal situation, the new keys that are generated would have their own distinct type. But it would have to be dynamically generated.
Alternatively, you could use another type that could hold multiple distinct values and still could be suitably used as the key. Problem here is that it will still have to be dynamically generated, but you will be using existing types.
A different approach you could take that doesn't involve generating new types, would be to use the existing source type, but reset the excluded properties to their default values (or not set them at all). Then they would have no effect on the grouping. This assumes you can create instances of this type and modify its values.
public static class Extensions
{
public static IQueryable<IGrouping<TSource, TSource>> GroupByExcept<TSource, TXKey>(this IQueryable<TSource> source, Expression<Func<TSource, TXKey>> exceptKeySelector) =>
GroupByExcept(source, exceptKeySelector, s => s);
public static IQueryable<IGrouping<TSource, TElement>> GroupByExcept<TSource, TXKey, TElement>(this IQueryable<TSource> source, Expression<Func<TSource, TXKey>> exceptKeySelector, Expression<Func<TSource, TElement>> elementSelector)
{
return source.GroupBy(BuildKeySelector(), elementSelector);
Expression<Func<TSource, TSource>> BuildKeySelector()
{
var exclude = typeof(TXKey).GetProperties()
.Select(p => (p.PropertyType, p.Name))
.ToHashSet();
var itemExpr = Expression.Parameter(typeof(TSource));
var keyExpr = Expression.MemberInit(
Expression.New(typeof(TSource).GetConstructor(Type.EmptyTypes)),
from p in typeof(TSource).GetProperties()
where !exclude.Contains((p.PropertyType, p.Name))
select Expression.Bind(p, Expression.Property(itemExpr, p))
);
return Expression.Lambda<Func<TSource, TSource>>(keyExpr, itemExpr);
}
}
}
Then to use it you would do this:
sampleCollection.GroupByExcept(x => new { x.N2, x.N5 })...
But alas, this approach won't work under normal circumstances. You won't be able to create new instances of the type within a query (unless you're using Linq to Objects).
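For the LINQ to Objects case there is one more thing to watch: GroupBy compares keys with the default equality comparer, which for a class that does not override Equals and GetHashCode means reference equality, so freshly created key instances would never group together. A sketch that avoids this by using the remaining property values themselves as the key, with a comparer over that list, is shown below. It is only an illustration under assumptions: GroupByExceptSketch and ValueListComparer are hypothetical names, excluded properties are passed by name rather than as an expression, and reflection on every element will be slow for large inputs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class GroupByExceptSketch
{
    // Groups by the values of every public property except the named ones.
    public static IEnumerable<IGrouping<IReadOnlyList<object>, T>> GroupByExcept<T>(
        this IEnumerable<T> source, params string[] excludedProperties)
    {
        PropertyInfo[] keyProperties = typeof(T).GetProperties()
            .Where(p => !excludedProperties.Contains(p.Name))
            .ToArray();

        return source.GroupBy(
            item => (IReadOnlyList<object>)keyProperties.Select(p => p.GetValue(item)).ToArray(),
            new ValueListComparer());
    }

    // Two keys are equal when they hold equal values in the same positions.
    private sealed class ValueListComparer : IEqualityComparer<IReadOnlyList<object>>
    {
        public bool Equals(IReadOnlyList<object> x, IReadOnlyList<object> y) => x.SequenceEqual(y);

        public int GetHashCode(IReadOnlyList<object> values)
        {
            int hash = 17;
            foreach (var value in values)
                hash = unchecked(hash * 31 + (value == null ? 0 : value.GetHashCode()));
            return hash;
        }
    }
}
```

With this in place, sampleCollection.GroupByExcept(nameof(Sample.N2), nameof(Sample.N5)) would group on all other properties, and each group can then be projected with Sum or Max over the excluded columns.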
If you're using Roslyn, you could generate that type as needed, then use that object as your key. That does mean you'll need to generate the type asynchronously, so you will probably want to separate this from your query altogether and just generate the key selector.
public static async Task<Expression<Func<TSource, object>>> BuildExceptKeySelectorAsync<TSource, TXKey>(Expression<Func<TSource, TXKey>> exceptKeySelector)
{
var exclude = typeof(TXKey).GetProperties()
.Select(p => (p.PropertyType, p.Name))
.ToHashSet();
var properties =
(from p in typeof(TSource).GetProperties()
where !exclude.Contains((p.PropertyType, p.Name))
select p).ToList();
var targetType = await CreateTypeWithPropertiesAsync(
properties.Select(p => (p.PropertyType, p.Name))
);
var itemExpr = Expression.Parameter(typeof(TSource));
var keyExpr = Expression.New(
targetType.GetConstructors().Single(),
properties.Select(p => Expression.Property(itemExpr, p)),
targetType.GetProperties()
);
return Expression.Lambda<Func<TSource, object>>(keyExpr, itemExpr);
async Task<Type> CreateTypeWithPropertiesAsync(IEnumerable<(Type type, string name)> properties) =>
(await CSharpScript.EvaluateAsync<object>(
AnonymousObjectCreationExpression(
SeparatedList(
properties.Select(p =>
AnonymousObjectMemberDeclarator(
NameEquals(p.name),
DefaultExpression(ParseTypeName(p.type.FullName))
)
)
)
).ToFullString()
)).GetType();
}
To use this:
sampleCollection.GroupBy(
await BuildExceptKeySelectorAsync((CollectionType x) => new { x.N2, x.N5 })
).Select(....);
Borrowing from this answer here:
Create a class EqualityComparer
public class EqualityComparer<T> : IEqualityComparer<T>
{
public bool Equals(T x, T y)
{
IDictionary<string, object> xP = x as IDictionary<string, object>;
IDictionary<string, object> yP = y as IDictionary<string, object>;
if (xP.Count != yP.Count)
return false;
if (xP.Keys.Except(yP.Keys).Any())
return false;
if (yP.Keys.Except(xP.Keys).Any())
return false;
foreach (var pair in xP)
if (!object.Equals(pair.Value, yP[pair.Key])) // static Equals also copes with null values
return false;
return true;
}
public int GetHashCode(T obj)
{
return obj.ToString().GetHashCode();
}
}
Then create your GroupContent method:
private void GroupContent<T>(List<T> dataList, string[] columns, string[] columnsToExclude)
{
string[] columnsToGroup = columns.Except(columnsToExclude).ToArray();
EqualityComparer<IDictionary<string, object>> equalityComparer = new EqualityComparer<IDictionary<string, object>>();
var groupedList = dataList.GroupBy(x =>
{
var groupByColumns = new System.Dynamic.ExpandoObject();
((IDictionary<string, object>)groupByColumns).Clear();
foreach (string column in columnsToGroup)
((IDictionary<string, object>)groupByColumns).Add(column, GetPropertyValue(x, column));
return groupByColumns;
}, equalityComparer);
foreach (var item in groupedList)
{
Console.WriteLine("Group : " + string.Join(",", item.Key));
foreach (object obj in item)
Console.WriteLine("Item : " + obj);
Console.WriteLine();
}
}
private static object GetPropertyValue(object obj, string propertyName)
{
return obj.GetType().GetProperty(propertyName).GetValue(obj, null);
}
I extended the code above borrowing another answer.
public static class IEnumerableExt {
public static IEnumerable<T> GroupBye<T, C>(this IEnumerable<T> query, Func<IGrouping<IDictionary<string, object>, T>, C> grouping) where T : class
{
var cProps = typeof(C).GetProperties().Select(prop => prop.Name).ToArray();
var columnsToGroup = typeof(T).GetProperties().Select(prop => prop.Name).Except(cProps).ToArray();
var equalityComparer = new EqualityComparer<IDictionary<string, object>>();
return query
.GroupBy(x => ExpandoGroupBy(x, columnsToGroup), equalityComparer)
.Select(x => MergeIntoNew(x, grouping, cProps));
}
private static IDictionary<string, object> ExpandoGroupBy<T>(T x, string[] columnsToGroup) where T : class
{
var groupByColumns = new System.Dynamic.ExpandoObject() as IDictionary<string, object>;
groupByColumns.Clear();
foreach (string column in columnsToGroup)
groupByColumns.Add(column, typeof(T).GetProperty(column).GetValue(x, null));
return groupByColumns;
}
private static T MergeIntoNew<T, C>(IGrouping<IDictionary<string, object>, T> x, Func<IGrouping<IDictionary<string, object>, T>, C> grouping, string[] cProps) where T : class
{
var tCtor = typeof(T).GetConstructors().Single();
var tCtorParams = tCtor.GetParameters().Select(param => param.Name).ToArray();
//Calling grouping lambda function
var grouped = grouping(x);
var paramsValues = tCtorParams.Select(p => cProps.Contains(p) ? typeof(C).GetProperty(p).GetValue(grouped, null) : x.Key[p]).ToArray();
return (T)tCtor.Invoke(paramsValues);
}
private class EqualityComparer<T> : IEqualityComparer<T>
{
public bool Equals(T x, T y)
{
var xDict = x as IDictionary<string, object>;
var yDict = y as IDictionary<string, object>;
if (xDict.Count != yDict.Count)
return false;
if (xDict.Keys.Except(yDict.Keys).Any())
return false;
if (yDict.Keys.Except(xDict.Keys).Any())
return false;
foreach (var pair in xDict)
if (pair.Value == null && yDict[pair.Key] == null)
continue;
else if (pair.Value == null || !pair.Value.Equals(yDict[pair.Key]))
return false;
return true;
}
public int GetHashCode(T obj)
{
return obj.ToString().GetHashCode();
}
}
}
Which can be used in the following way:
var list = enumerable.GroupBye(grp => new
{
Value = grp.Sum(val => val.Value)
});
The result is like grouping by all columns except Value, which is set to the sum of the grouped elements' values.

What is the cleanest way to do an outer join without equals?

I have two lists, I need to find the items in the first list that are missing from the second, but I can only compare them with a Boolean function.
class A
{
internal bool Matching(A a)
{
return true;
}
}
class OuterMatch
{
List<A> List1 = new List<A>();
List<A> List2 = new List<A>();
void BasicOuterJoin()
{
// textbook example of an outer join, but it does not use my Matching function
var missingFrom2 = from one in List1
join two in List2
on one equals two into matching
from match in matching.DefaultIfEmpty()
where match == null
select one;
}
void Matching()
{
// simple use of the matching function, but this is an inner join.
var matching = from one in List1
from two in List2
where one.Matching(two)
select one;
}
void MissingBasedOnMatching()
{
// a reasonable substitute for what I'm after
var missingFrom2 = from one in List1
where (from two in List2
where two.Matching(one)
select two)
.Count() == 0
select one;
}
}
MissingBasedOnMatching gives me the right results, but it's not visually obviously an outer join like BasicOuterJoin is. Is there a clearer way to do this?
There's a form of GroupJoin that takes a comparison operator, but I'm not clear if there is a way to use it to make an outer join.
I've been using some useful (and short!) code from a blog by Ed Khoze.
He's posted a handy class which provides an adapter so that you can use Enumerable.Except() with a lambda.
Once you have that code, you can use Except() to solve your problem like so:
var missing = list1.Except(list2, (a, b) => a.Matching(b));
Here's a complete compilable sample. Credit to Ed Khoze for the LINQHelper class:
using System;
using System.Collections.Generic;
using System.Linq;
namespace Demo
{
class A
{
public int Value;
public bool Matching(A a)
{
return a.Value == Value;
}
public override string ToString()
{
return Value.ToString();
}
}
class Program
{
void test()
{
var list1 = new List<A>();
var list2 = new List<A>();
for (int i = 0; i < 20; ++i) list1.Add(new A {Value = i});
for (int i = 4; i < 16; ++i) list2.Add(new A {Value = i});
var missing = list1.Except(list2, (a, b) => a.Matching(b));
missing.Print(); // Prints 0 1 2 3 16 17 18 19
}
static void Main()
{
new Program().test();
}
}
static class MyEnumerableExt
{
public static void Print<T>(this IEnumerable<T> sequence)
{
foreach (var item in sequence)
Console.WriteLine(item);
}
}
public static class LINQHelper
{
private class LambdaComparer<T>: IEqualityComparer<T>
{
private readonly Func<T, T, bool> _lambdaComparer;
private readonly Func<T, int> _lambdaHash;
public LambdaComparer(Func<T, T, bool> lambdaComparer) :
this(lambdaComparer, o => 0)
{
}
private LambdaComparer(Func<T, T, bool> lambdaComparer, Func<T, int> lambdaHash)
{
if (lambdaComparer == null)
throw new ArgumentNullException("lambdaComparer");
if (lambdaHash == null)
throw new ArgumentNullException("lambdaHash");
_lambdaComparer = lambdaComparer;
_lambdaHash = lambdaHash;
}
public bool Equals(T x, T y)
{
return _lambdaComparer(x, y);
}
public int GetHashCode(T obj)
{
return _lambdaHash(obj);
}
}
public static IEnumerable<TSource> Except<TSource>
(
this IEnumerable<TSource> enumerable,
IEnumerable<TSource> second,
Func<TSource, TSource, bool> comparer
)
{
return enumerable.Except(second, new LambdaComparer<TSource>(comparer));
}
}
}
If your problem statement is actually
Find all members of X that do not exist in Y
And given a class Foo that implements IEquatable<Foo> (pretty much what your Matching method does):
class Foo : IEquatable<Foo>
{
public bool Equals( Foo other )
{
throw new NotImplementedException();
}
}
Then this code should give you what you want:
List<Foo> x = GetFirstList() ;
List<Foo> y = GetSecondList() ;
List<Foo> xNotInY = x.Where( xItem => ! y.Any( yItem => xItem.Equals(yItem) ) ).ToList() ;
You should bear in mind that this runs in O(N²) time. Consequently, you might want to implement an IEqualityComparer<Foo> and put your second list in a HashSet<Foo>:
class FooComparer : IEqualityComparer<Foo>
{
public bool Equals(Foo x, Foo y)
{
if ( x == null )
{
return y == null ;
}
else if ( y == null ) return false ;
else
{
return x.Equals(y) ;
}
}
public int GetHashCode(Foo obj)
{
return obj.GetHashCode() ;
}
}
Then do something like
List<Foo> x = GetFirstList() ;
List<Foo> y = GetSecondList() ;
HashSet<Foo> yLookup = new HashSet<Foo>( y , new FooComparer() ) ;
List<Foo> xNotInY = x.Where( xItem => !yLookup.Contains(xItem) ).ToList() ;
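You'll incur some overhead in constructing the hash set (one pass through the second list), but subsequent lookups via Contains() are O(1) on average. For this to work, Foo must also override GetHashCode() so that equal instances return equal hash codes.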
If you look at the sources for the Linq join operation, this is close to what it does.
It wouldn't be difficult to take the LINQ sources for Join() and its helpers and tweak them to produce left and right join operators instead of the stock inner join.
Does this work for your purposes?
var missing = List1.Except(List2);
If you need custom comparison logic you can build a custom IEqualityComparer. Note however that Except treats both lists as sets, so it will eliminate duplicates from List1.
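The duplicate-elimination point is easy to miss, so here is a small sketch contrasting Except with a filter that keeps repeated elements. It uses plain ints for brevity; ExceptSketch and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExceptSketch
{
    // Set difference: each distinct value of first appears at most once.
    public static List<int> MissingAsSet(IEnumerable<int> first, IEnumerable<int> second)
        => first.Except(second).ToList();

    // Filter: keeps every element of first that is absent from second,
    // including repeated ones.
    public static List<int> MissingKeepingDuplicates(IEnumerable<int> first, IEnumerable<int> second)
    {
        var lookup = new HashSet<int>(second);
        return first.Where(x => !lookup.Contains(x)).ToList();
    }
}
```

For first = { 1, 1, 2, 3 } and second = { 3 }, MissingAsSet returns 1, 2, while MissingKeepingDuplicates returns 1, 1, 2.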

Given two collections A & B: want to output the inner join, elements in A that were not in B, elements in B that were not in A

Given two collections A & B, I want to output:
1. their inner join (say on a field called Id)
2. those elements in A that could not be found in B
3. those elements in B that could not be found in A
What is the most efficient way to do this?
When I say those elements in A that could not be found in B, I mean those elements that could not be "inner-joined" with B
For the inner join, have a look at the .Join() extension method: http://msdn.microsoft.com/en-us/library/bb344797.aspx
For the second 2 outputs, have a look at the .Except() extension method. http://msdn.microsoft.com/en-us/library/bb300779.aspx
For examples of most of the LINQ queries, have a look at this page: http://msdn.microsoft.com/en-us/vcsharp/aa336746
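As a rough sketch of how those pieces fit together: Join produces the matched pairs, and the two "missing" sets are obtained by filtering on the Ids seen on the other side. Except compares whole elements, so it would need a custom IEqualityComparer to compare by Id; filtering on Id sets avoids that. The Item class, ThreeWaySketch and Split are hypothetical names, and an int Id is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Item
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

public static class ThreeWaySketch
{
    // Returns the inner join on Id plus the elements unique to each side.
    public static (List<(Item A, Item B)> Matched, List<Item> OnlyInA, List<Item> OnlyInB)
        Split(IReadOnlyCollection<Item> a, IReadOnlyCollection<Item> b)
    {
        var matched = a.Join(b, x => x.Id, y => y.Id, (x, y) => (A: x, B: y)).ToList();

        // Hash sets make each "is it on the other side?" check O(1) on average.
        var idsInA = new HashSet<int>(a.Select(x => x.Id));
        var idsInB = new HashSet<int>(b.Select(y => y.Id));

        var onlyInA = a.Where(x => !idsInB.Contains(x.Id)).ToList();
        var onlyInB = b.Where(y => !idsInA.Contains(y.Id)).ToList();
        return (matched, onlyInA, onlyInB);
    }
}
```

Each collection is traversed a small, fixed number of times, so the cost grows roughly linearly with the total number of elements rather than with the product of the two sizes.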
I guess I'd write this:
public class DeltaSet<T>
{
public ISet<T> FirstItems { get; private set; }
public ISet<T> SecondItems { get; private set; }
public ISet<Tuple<T, T>> IntersectedItems { get; private set; }
// T is the type of the objects, U is the key used to determine equality
public static DeltaSet<T> GetDeltaSet<T, U>(IDictionary<U, T> first,
IDictionary<U, T> second)
{
var firstUniques = new HashSet<T>(
first.Where(x => !second.ContainsKey(x.Key)).Select(x => x.Value));
var secondUniques = new HashSet<T>(
second.Where(x => !first.ContainsKey(x.Key)).Select(x => x.Value));
var intersection = new HashSet<Tuple<T, T>>(
second.Where(x => first.ContainsKey(x.Key)).Select(x =>
Tuple.Create(first[x.Key], x.Value)));
return new DeltaSet<T> { FirstItems = firstUniques,
SecondItems = secondUniques,
IntersectedItems = intersection };
}
public static DeltaSet<IDClass> GetDeltas(IEnumerable<IDClass> first,
IEnumerable<IDClass> second)
{
return GetDeltaSet(first.ToDictionary(x => x.ID),
second.ToDictionary(x => x.ID));
}
}
Assuming you have class A for elements in collection A and class B in collection B
class AB {
public A PartA;
public B PartB;
// Constructor
};
public void ManyJoin (List<A> colA, List<B> colB)
{
List<AB> innerJoin = new List<AB>();
List<A> leftJoin = new List<A>();
List<B> rightJoin = new List<B>();
bool[] foundB = new bool[colB.Count];
foreach (A itemA in colA)
{
int i = colB.FindIndex(itemB => itemB.ID == itemA.ID);
if (i >= 0)
{
innerJoin.Add (new AB(itemA, colB[i]));
foundB[i] = true;
}
else
leftJoin.Add(itemA);
}
for (int j = 0; j < foundB.Length; j++)
{
if (!foundB[j])
rightJoin.Add(colB[j]);
}
}
This is one possible way. I'm not sure whether it is optimal, but it does the job.
