How do I merge records using LINQ? - c#

I'd like to merge two records using a condition for each column in the row. I'd give you a code sample but I don't know where to start.
class Foo
{
public int i {get;set;}
public int b{get;set;}
public string first{get;set;}
public string last{get;set;}
}
//...
var list = new List<Foo>() {
new Foo () { i=1, b=0, first="Vince", last="P"},
new Foo () { i=1, b=1, first="Vince", last="P"},
new Foo () { i=1, b=0, first="Bob", last="Z"},
new Foo () { i=0, b=1, first="Bob", last="Z"},
} ;
// This is how I'd like my result to look like
// Record 1 - i = 1, b = 1, first="Vince", last = "P"
// Record 2 - i = 1, b = 1, first="Bob", last = "Z"

You can group the result, then aggregate the fields from the items in the group:
var result = list.GroupBy(f => f.first).Select(
g => new Foo() {
b = g.Aggregate(0, (a, f) => a | f.b),
i = g.Aggregate(0, (a, f) => a | f.i),
first = g.Key,
last = g.First().last
}
);

You could use the Aggregate method in LINQ.
First add a method to Foo, say Merge that returns a new Foo based on your merging rules.
public Foo Merge (Foo other)
{
// Implement merge rules here ...
return new Foo {..., b=Math.Max(this.b, other.b), ...};
}
You could also, instead, create a helper method outside the Foo class that does the merging.
Now use Aggregate over your list, using the first element as the seed, merging each record with the current aggregate value as you go. Or, instead of using Aggregate (since it's a somewhat contrived use of LINQ in this case), just do:
Foo result = list.First();
foreach (var item in list.Skip(1)) result = result.Merge(item);
How are your merge rules specified?
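To make the Merge idea concrete, here is a minimal sketch. It assumes the merge rule is "keep the larger i and b, keep the first record's names", which matches the expected output in the question but is otherwise a guess. The names FooMerger and MergeByFirstName are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Foo
{
    public int i { get; set; }
    public int b { get; set; }
    public string first { get; set; }
    public string last { get; set; }

    // Assumed merge rule: take the larger i and b, keep this record's names.
    public Foo Merge(Foo other) => new Foo
    {
        i = Math.Max(i, other.i),
        b = Math.Max(b, other.b),
        first = first,
        last = last
    };
}

// Hypothetical helper, not part of the original answer.
public static class FooMerger
{
    // Groups by first name, then folds each group into a single record.
    public static List<Foo> MergeByFirstName(IEnumerable<Foo> records) =>
        records
            .GroupBy(f => f.first)
            .Select(g => g.Aggregate((acc, next) => acc.Merge(next)))
            .ToList();
}
```

Calling FooMerger.MergeByFirstName(list) on the list from the question should give one record per first name. The seedless Aggregate overload is safe here because a group is never empty.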

I found an inelegant solution that works:
var merged = new List<Foo>();
foreach (IGrouping<string, Foo> grp in list.GroupBy(i => i.first))
{
// Fold each group into one record and keep the result
merged.Add(grp.Aggregate((f1, f2) => new Foo() {
b = f1.b | f2.b,
i = f1.i | f2.i,
first = f1.first,
last = f1.last
}));
}

Related

Join multiple lists of objects in c#

I have three lists that contain objects with the following structure:
List1
- Status
- ValueA
List2
- Status
- ValueB
List3
- Status
- ValueC
I want to join the lists by status to get a final list that contains objects with the following structure:
- Status
- ValueA
- ValueB
- ValueC
Not every list contains every status, so a simple (left) join won't do it. Any ideas how to achieve the desired result? I tried:
var result = from first in list1
join second in list2 on first.Status equals second.Status into tmp1
from second in tmp1.DefaultIfEmpty()
join third in list3 on first.Status equals third.Status into tmp2
from third in tmp2.DefaultIfEmpty()
select new { ... };
But the result is missing a status. Here is a full MRE:
using System;
using System.Linq;
using System.Collections.Generic;
public class Program
{
public static void Main()
{
List<A> first = new List<A>() { new A("FOO", 1), new A("BAR", 2) };
List<B> second = new List<B>() { new B("FOO", 6), new B("BAR", 3) };
List<C> third = new List<C>() { new C("BAZ", 5) };
var result = from f in first
join s in second on f.Status equals s.Status into tmp1
from s in tmp1.DefaultIfEmpty()
join t in third on f.Status equals t.Status into tmp2
from t in tmp2.DefaultIfEmpty()
select new
{
Status = f.Status,
ValueA = f.ValueA,
ValueB = s.ValueB,
ValueC = t.ValueC,
};
}
}
public record A(string Status, int ValueA);
public record B(string Status, int ValueB);
public record C(string Status, int ValueC);
Unfortunately, it is unclear what should happen if a status occurs multiple times within one list, because your result can only hold one value per status.
One possible solution would be:
using System;
using System.Linq;
using System.Collections.Generic;
public class Program
{
public static void Main()
{
List<A> first = new List<A>() { new A("FOO", 1), new A("BAR", 2) };
List<B> second = new List<B>() { new B("FOO", 6), new B("BAR", 3) };
List<C> third = new List<C>() { new C("BAZ", 5) };
var allStates = first.Select(a => a.Status)
.Concat(second.Select(b => b.Status))
.Concat(third.Select(c => c.Status))
.Distinct();
var result = allStates
.Select(Status => new
{
Status,
ValueA = first.FirstOrDefault(a => a.Status == Status)?.ValueA,
ValueB = second.FirstOrDefault(b => b.Status == Status)?.ValueB,
ValueC = third.FirstOrDefault(c => c.Status == Status)?.ValueC,
});
foreach (var item in result)
{
Console.WriteLine(item);
}
}
}
public record A(string Status, int ValueA);
public record B(string Status, int ValueB);
public record C(string Status, int ValueC);
Depending on the number of items to aggregate, and provided that each status occurs at most once per list, it could make sense to convert your lists to a Dictionary<string, A>, Dictionary<string, B>, etc. to speed up the lookup, and then do something like this in the projection:
ValueA = firstDict.ContainsKey(Status) ? firstDict[Status] : null
That line performs the lookup twice. To avoid that, you could factor out a method like this:
private static T GetValueOrDefault<T>(IReadOnlyDictionary<string, T> dict, string status)
{
dict.TryGetValue(status, out T value);
return value;
}
Then call it inside the .Select() projection:
ValueA = GetValueOrDefault(firstDict, Status),
The dictionary for each list could be created with:
var firstDict = first.ToDictionary(a => a.Status);
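Putting the dictionary pieces together, a rough sketch could look like the following. It assumes each status occurs at most once per list (ToDictionary throws on duplicates). The Row record and the StatusJoin.FullJoin helper are names invented for this example.

```csharp
using System.Collections.Generic;
using System.Linq;

public record A(string Status, int ValueA);
public record B(string Status, int ValueB);
public record C(string Status, int ValueC);

// Hypothetical result type; a missing value is represented as null.
public record Row(string Status, int? ValueA, int? ValueB, int? ValueC);

public static class StatusJoin
{
    public static List<Row> FullJoin(List<A> first, List<B> second, List<C> third)
    {
        // Assumption: statuses are unique within each list.
        var firstDict = first.ToDictionary(x => x.Status);
        var secondDict = second.ToDictionary(x => x.Status);
        var thirdDict = third.ToDictionary(x => x.Status);

        return firstDict.Keys
            .Concat(secondDict.Keys)
            .Concat(thirdDict.Keys)
            .Distinct()
            // One TryGetValue per dictionary, so each lookup happens only once.
            .Select(status => new Row(
                status,
                firstDict.TryGetValue(status, out var a) ? a.ValueA : (int?)null,
                secondDict.TryGetValue(status, out var b) ? b.ValueB : (int?)null,
                thirdDict.TryGetValue(status, out var c) ? c.ValueC : (int?)null))
            .ToList();
    }
}
```

With the lists from the MRE this should yield three rows (FOO, BAR, BAZ), where BAZ only has ValueC set.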
Assuming that status names are unique per list, here is a solution
in a single query with the help of switch expressions (available since C# 8.0):
using System;
using System.Linq;
using System.Collections.Generic;
List<A> first = new List<A>() { new A("FOO", 1), new A("BAR", 2) };
List<B> second = new List<B>() { new B("FOO", 6), new B("BAR", 3) };
List<C> third = new List<C>() { new C("BAZ", 5) };
var result = first
// concat lists together
.Cast<object>()
.Concat(second)
.Concat(third)
// group on Status value with help of switch expression
.GroupBy(el => el switch {
A a => a.Status,
B b => b.Status,
C c => c.Status,
},
// project groups with anonymous type
(Status, group) => new {
Status,
ValueA = group.OfType<A>().Select(a => a.ValueA).Cast<int?>().FirstOrDefault(),
ValueB = group.OfType<B>().Select(b => b.ValueB).Cast<int?>().FirstOrDefault(),
ValueC = group.OfType<C>().Select(c => c.ValueC).Cast<int?>().FirstOrDefault()
});
public record A(string Status, int ValueA);
public record B(string Status, int ValueB);
public record C(string Status, int ValueC);
A left join alone can't do this. First collect all the keys, then left join each list against them:
var keys = first.Select(item => item.Status).ToList();
keys.AddRange(second.Select(item => item.Status));
keys.AddRange(third.Select(item => item.Status));
keys = keys.Distinct().ToList();
var result = (from k in keys
join f in first on k equals f.Status into tmp0
from f in tmp0.DefaultIfEmpty()
join s in second on k equals s.Status into tmp1
from s in tmp1.DefaultIfEmpty()
join t in third on k equals t.Status into tmp2
from t in tmp2.DefaultIfEmpty()
select new {
Status = k,
ValueA = f?.ValueA,
ValueB = s?.ValueB,
ValueC = t?.ValueC,
}
).ToList();

How to use dictionary in c# to compare two lists

Currently, I have implemented two lists with a double for loop to find matches between the two lists so I can join on them.
I have a list A which contains an ID and some other columns, and a list B which contains an ID and some other columns. I have currently implemented a for loop within a for loop to compare all the IDs, find the ones that match, and return the joined results. I now want to understand how to use a dictionary here, as that should be a more efficient way to solve this problem.
public IEnumerable<Details> GetDetails(string ID)
{
// there are two lists defined up here
foreach (var item in listA)
{
foreach (var item2 in listB)
{
if (item.ID == item2.ID)
{
item.Name = item2.name;
}
}
}
return results;
}
Instead of this double for loop, which is very inefficient, I want to learn how to use a dictionary to solve this problem.
The dictionaries would use the IDs as keys:
Dictionary<string, object> myListA = new Dictionary<string, object>();
Dictionary<string, object> myListB = new Dictionary<string, object>();
public object GetDetails(string ID)
{
object a = myListA[ID];
object b = myListB[ID];
// Combine them here however you want; the tuple is only a placeholder
object c = (a, b);
return c;
}
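For the loop from the question specifically, a dictionary-based version might look like this sketch. The ItemA and ItemB classes and the FillNames helper are stand-ins I made up, since the question does not show the real types. It also assumes that when an ID repeats in list B, the last entry should win.

```csharp
using System.Collections.Generic;

// Hypothetical item types standing in for the rows of list A and list B.
public class ItemA { public string ID { get; set; } public string Name { get; set; } }
public class ItemB { public string ID { get; set; } public string Name { get; set; } }

public static class DictionaryJoin
{
    // Copies Name from listB into the matching listA items and returns the matches.
    public static List<ItemA> FillNames(List<ItemA> listA, List<ItemB> listB)
    {
        // Index listB by ID once; if an ID repeats, the last entry wins.
        var byId = new Dictionary<string, ItemB>();
        foreach (var itemB in listB)
        {
            byId[itemB.ID] = itemB;
        }

        var matched = new List<ItemA>();
        foreach (var itemA in listA)
        {
            // A single hash lookup replaces the inner loop.
            if (byId.TryGetValue(itemA.ID, out var match))
            {
                itemA.Name = match.Name;
                matched.Add(itemA);
            }
        }
        return matched;
    }
}
```

This does one pass over each list, so the work grows with the sum of the list sizes rather than their product.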
How about using linq to achieve your actual requirement? Something like:
public IEnumerable<A> GetDetails(int ID)
{
var listA = new List<A>
{
new A(){ ID = 1, Name = 2 },
new A(){ ID = 3, Name = 4 },
new A(){ ID = 5, Name = 6 },
};
var listB = new List<B>
{
new B(){ ID = 1, name = 0 },
new B(){ ID = 3, name = 1 }
};
return listA.Join(listB, k => k.ID, k => k.ID, (item, item2) =>
{
item.Name = item2.name;
return item;
}).Where(w => w.ID == ID);
}
If you just want the common IDs in the two lists, you can achieve that like this:
var commonIds = listA.Select(o => o.ID).Intersect(listB.Select(o => o.ID));

Build a Where clause using a loop and concatenating each iteration with an OR

I have a list of N pairs of integers, e.g.:
2, 4
5, 7
9, 10
11, 12
And I need to build a query like:
WHERE
(foo = 2 AND bar = 4) OR
(foo = 5 AND bar = 7) OR
(foo = 9 AND bar = 10) OR
(foo = 11 AND bar = 12)
If it was a constant length list, I could write something like:
var query = myClass.Where(x =>
(x.foo == values[0][0] && x.bar == values[0][1]) ||
(x.foo == values[1][0] && x.bar == values[1][1]) ||
(x.foo == values[2][0] && x.bar == values[2][1]) ||
(x.foo == values[3][0] && x.bar == values[3][1]));
But the length of the list varies, and I am looking for a way to create the query using a loop.
I found I can use Queryable.Union() for a similar result, but considering there are more conditions in the query, and the list of pairs can be long, I would prefer to avoid the union.
Is there any solution for this problem?
You can use one trick: concatenate the fields you are filtering on (foo and bar) into a single string, then use the Contains method:
var filters = new int[][] {
new int[] { 2, 4 },
new int[] { 5, 7 },
new int[] { 9, 10 },
new int[] { 11, 12 }
};
var newFilter = filters.Select(x => x[0] + "-" + x[1]).ToList();
var answer = dbContext.myClass.Where(x => newFilter.Contains(x.foo + "-" + x.bar)).ToList();
Assuming that values is a jagged array, and that myClass is an IEnumerable<T> of an object that has foo and bar properties:
var query = myClass.Where(x => values.Any(y => x.foo == y[0] && x.bar == y[1]));
The inner Any statement, which is run against each object in myClass, looks for any "row" in values whose contents matches the foo and bar properties of myClass. In essence, the Any clause iterates over each row in the table, while the Where clause iterates over (and filters) each object in myClass.
However, I don't know that it will be any more efficient than using a Union.
As noted in the comments, this method doesn't work with LINQ to Entities. This could still be used in conjunction with Entity Framework by pulling all of the records from the database and filtering them in memory, but obviously this is not an efficient solution.
Try this:
var query = myClass.Where(x => values.Any(p => p[0] == x.foo && p[1] == x.bar));
Take your integer set and make it into a map. Then test for .Any using LINQ.
var map = new Dictionary<int, int>()
{
{2, 4},
{5, 7},
{9, 10},
{11, 12}
};
var foo = 2;
var bar = 4;
var q = map.Any(kv => foo == kv.Key && bar == kv.Value);
Alternatively, you can take the list of int pairs and make them into a list of Tuple<int, int> and test for foo and bar like this:
var q = listOfTuples.Any(tp => foo == tp.Item1 && bar == tp.Item2);
Main point here is that you need to take your "list of pairs of ints" and make a decision on how that information will be structured. Once you make that call, everything else falls into place. One step at a time, right? :-)
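As one way of "making that call", the pairs could be stored as value tuples in a HashSet, which gives a constant-time membership test. This is only a sketch for in-memory collections (LINQ to Objects); I would not expect it to translate to SQL. Item, PairFilter and Filter are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type with the two properties being filtered on.
public record Item(int Foo, int Bar);

public static class PairFilter
{
    // Keeps only the items whose (Foo, Bar) combination is in the wanted pairs.
    public static List<Item> Filter(IEnumerable<Item> items, IEnumerable<(int Foo, int Bar)> pairs)
    {
        // Value tuples compare by value, so they work as set members out of the box.
        var wanted = new HashSet<(int, int)>(pairs);
        return items.Where(x => wanted.Contains((x.Foo, x.Bar))).ToList();
    }
}
```

Unlike the Dictionary<int, int> above, a set of tuples also allows several pairs that share the same first number, such as (2, 4) and (2, 5).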
I found a solution for this problem based on a similar solution described at Dynamic Queries in Entity Framework using Expression Trees
Here is the solution.
First, a class to hold a foo and bar pair:
public class FooBarPair
{
public int Foo { get; set; }
public int Bar { get; set; }
}
Then, the collection of foos and bars:
var pairs = new FooBarPair[]
{
new FooBarPair() { Foo = 10537, Bar = 1034 },
new FooBarPair() { Foo = 999, Bar = 999 },
new FooBarPair() { Foo = 888, Bar = 888 },
new FooBarPair() { Foo = 10586, Bar = 63 },
};
And here is the code that builds the query expression:
public static void Main()
{
Expression<Func<MyClass, bool>> whereClause =
BuildOrExpressionTree<MyClass, int>(pairs, m => m.Foo + m.Bar);
var myClasss = model.Set<MyClass>();
IQueryable<MyClass> query = myClasss.Where(whereClause);
}
/// <summary>
/// Starts a recursion to build WHERE (m.Foo = X1 AND m.Bar = Y1) [OR (m.Foo = X2 AND m.Bar = Y2) [...]].
/// </summary>
private static Expression<Func<TValue, bool>> BuildOrExpressionTree<TValue, TCompareAgainst>(
IEnumerable<FooBarPair> wantedItems,
Expression<Func<TValue, TCompareAgainst>> convertBetweenTypes1)
{
ParameterExpression inputParam1 = convertBetweenTypes1.Parameters[0];
BinaryExpression binaryExpression = convertBetweenTypes1.Body as BinaryExpression;
Expression binaryExpressionTree = BuildBinaryOrTree<FooBarPair>(
wantedItems.GetEnumerator(),
binaryExpression.Left,
binaryExpression.Right,
null);
return Expression.Lambda<Func<TValue, bool>>(binaryExpressionTree, new[] { inputParam1 });
}
/// <summary>
/// Recursive function to append one 'OR (m.Foo = X AND m.Bar = Y)' expression.
/// </summary>
private static Expression BuildBinaryOrTree<T>(
IEnumerator<FooBarPair> itemEnumerator,
Expression expressionToCompareTo1,
Expression expressionToCompareTo2,
Expression prevExpression)
{
if (itemEnumerator.MoveNext() == false)
{
return prevExpression;
}
ConstantExpression fooConstant = Expression.Constant(itemEnumerator.Current.Foo, typeof(int));
ConstantExpression barConstant = Expression.Constant(itemEnumerator.Current.Bar, typeof(int));
BinaryExpression fooComparison = Expression.Equal(expressionToCompareTo1, fooConstant);
BinaryExpression barComparison = Expression.Equal(expressionToCompareTo2, barConstant);
BinaryExpression newExpression = Expression.AndAlso(fooComparison, barComparison);
if (prevExpression != null)
{
newExpression = Expression.OrElse(prevExpression, newExpression);
}
return BuildBinaryOrTree<FooBarPair>(
itemEnumerator,
expressionToCompareTo1,
expressionToCompareTo2,
newExpression);
}
Thanks everybody!
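For comparison, the same OR chain can also be built with a plain loop instead of recursion. The sketch below is my own simplification, not the code from the linked article: it hard-codes the Foo and Bar properties of an assumed MyClass and takes the pairs as value tuples. OrFilterBuilder and Build are made-up names. Whether a given LINQ provider translates the resulting tree is something to verify against your EF version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Assumed entity shape; only Foo and Bar matter for the filter.
public class MyClass
{
    public int Foo { get; set; }
    public int Bar { get; set; }
}

public static class OrFilterBuilder
{
    // Builds m => (m.Foo == f1 && m.Bar == b1) || (m.Foo == f2 && m.Bar == b2) || ...
    public static Expression<Func<MyClass, bool>> Build(IEnumerable<(int Foo, int Bar)> pairs)
    {
        ParameterExpression m = Expression.Parameter(typeof(MyClass), "m");
        Expression body = null;

        foreach (var (foo, bar) in pairs)
        {
            Expression pairMatch = Expression.AndAlso(
                Expression.Equal(Expression.Property(m, nameof(MyClass.Foo)), Expression.Constant(foo)),
                Expression.Equal(Expression.Property(m, nameof(MyClass.Bar)), Expression.Constant(bar)));

            // The first pair starts the chain; later pairs are OR-ed onto it.
            body = body == null ? pairMatch : Expression.OrElse(body, pairMatch);
        }

        // An empty pair list produces a filter that matches nothing.
        return Expression.Lambda<Func<MyClass, bool>>(body ?? Expression.Constant(false), m);
    }
}
```

The result can be passed to Where on an IQueryable<MyClass>, or compiled with .Compile() to filter an in-memory list.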

Applying multiple group functions on IQueryable<T>

Given the following simple object
public class Foo {
public int PrimaryKey;
public int ForeignKey;
public bool FlagOne;
public bool FlagTwo;
}
Suppose I have received an IQueryable<Foo>. Generally, if I want to do a count operation on each flag I would do this:
IQueryable<Foo> foos = GetFoos();
var total = foos.Count();
var flagOneTotal = foos.Count(p => p.FlagOne);
var flagTwoTotal = foos.Count(p => p.FlagTwo);
In EF, the above would execute 3 queries in the database. I would like to retrieve all these in a single query.
For grouping, I can do this to execute single query:
var q = from foo in foos
group foo by foo.ForeignKey into g
select new {
ForeignKey = g.Key,
Total = g.Count(),
FlagOneTotal = g.Count(p => p.FlagOne),
FlagTwoTotal = g.Count(p => p.FlagTwo)
};
var list = q.ToList();
But how would I do the same if I want to get the totals for all elements, regardless of foreign key, in a single query and a single anonymous object?
In other words, how would I tell .NET that all elements in foos should be treated as one group so I can do Count operations on them?
This should do the job:
var q = from foo in foos
group foo by 1 into g
select new {
Total = g.Count(),
FlagOneTotal = g.Count(p => p.FlagOne),
FlagTwoTotal = g.Count(p => p.FlagTwo)
};
var list = q.ToList();
Cheers
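One detail worth knowing about the constant-key grouping: if foos is empty there is no group at all, so the query returns no rows rather than a row of zeros. Here is a small sketch that handles that case, using method syntax and an in-memory queryable. FlagTotals and Compute are hypothetical names, and the SQL that EF generates for this is not something I have checked.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Foo
{
    public int PrimaryKey;
    public int ForeignKey;
    public bool FlagOne;
    public bool FlagTwo;
}

public static class FlagTotals
{
    public static (int Total, int FlagOne, int FlagTwo) Compute(IQueryable<Foo> foos)
    {
        var totals = foos
            .GroupBy(f => 1) // constant key: every element lands in the same group
            .Select(g => new
            {
                Total = g.Count(),
                FlagOneTotal = g.Count(p => p.FlagOne),
                FlagTwoTotal = g.Count(p => p.FlagTwo)
            })
            .SingleOrDefault(); // null when the source is empty

        return totals == null
            ? (0, 0, 0)
            : (totals.Total, totals.FlagOneTotal, totals.FlagTwoTotal);
    }
}
```

To try it without a database, pass something like new List<Foo> { ... }.AsQueryable().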

Elegant way to create a nested Dictionary in C#

I realized that I didn't give enough information for most people to read my mind and understand all my needs, so I changed this somewhat from the original.
Say I've got a list of items of a class like this:
public class Thing
{
public int Foo;
public int Bar;
public string Baz;
}
And I want to categorize the Baz string based on the values of Foo, then Bar. There will be at most one Thing for each possible combination of Foo and Bar values, but I'm not guaranteed to have a value for each one. It may help to conceptualize it as cell information for a table: Foo is the row number, Bar is the column number, and Baz is the value to be found there, but there won't necessarily be a value present for every cell.
IEnumerable<Thing> things = GetThings();
List<int> foos = GetAllFoos();
List<int> bars = GetAllBars();
Dictionary<int, Dictionary<int, string>> dict = // what do I put here?
foreach(int foo in foos)
{
// I may have code here to do something for each foo...
foreach(int bar in bars)
{
// I may have code here to do something for each bar...
if (dict.ContainsKey(foo) && dict[foo].ContainsKey(bar))
{
// I want to have O(1) lookups
string baz = dict[foo][bar];
// I may have code here to do something with the baz.
}
}
}
What's an easy, elegant way to generate the nested dictionary? I've been using C# long enough that I'm getting used to finding simple, one-line solutions for all of the common stuff like this, but this one has me stumped.
Here's a solution using Linq:
Dictionary<int, Dictionary<int, string>> dict = things
.GroupBy(thing => thing.Foo)
.ToDictionary(fooGroup => fooGroup.Key,
fooGroup => fooGroup.ToDictionary(thing => thing.Bar,
thing => thing.Baz));
An elegant way would be to not create the dictionaries yourself but use LINQ GroupBy and ToDictionary to generate it for you.
var things = new[] {
new Thing { Foo = 1, Bar = 2, Baz = "ONETWO!" },
new Thing { Foo = 1, Bar = 3, Baz = "ONETHREE!" },
new Thing { Foo = 1, Bar = 2, Baz = "ONETWO!" }
}.ToList();
var bazGroups = things
.GroupBy(t => t.Foo)
.ToDictionary(gFoo => gFoo.Key, gFoo => gFoo
.GroupBy(t => t.Bar)
.ToDictionary(gBar => gBar.Key, gBar => gBar.First().Baz));
Debug.Fail("Inspect the bazGroups variable.");
I assume that by categorizing Baz using Foo and Bar you mean that if two things have equal Foo and Bar values, then their Baz values are the same as well. Please correct me if I'm wrong.
You basically group by the Foo property first...
then for each resulting group, you group on the Bar property...
then for each resulting group you take the first Baz value as the dictionary value.
If you noticed, the method names matched exactly what you are trying to do. :-)
EDIT: Here's another way using query comprehensions. They are longer, but quite a bit easier to read and grok:
var bazGroups =
(from t1 in things
group t1 by t1.Foo into gFoo
select new
{
Key = gFoo.Key,
Value = (from t2 in gFoo
group t2 by t2.Bar into gBar
select gBar)
.ToDictionary(g => g.Key, g => g.First().Baz)
})
.ToDictionary(g => g.Key, g => g.Value);
Unfortunately, there is no query comprehension counterpart for ToDictionary, so it's not as elegant as the lambda expressions.
...
Hope this helps.
Define your own custom generic NestedDictionary class
public class NestedDictionary<K1, K2, V>:
Dictionary<K1, Dictionary<K2, V>> {}
then in your code you write
NestedDictionary<int, int, string> dict =
new NestedDictionary<int, int, string> ();
If you use the int, int, string one a lot, define a custom class for that too:
public class NestedIntStringDictionary:
NestedDictionary<int, int, string> {}
and then write:
NestedIntStringDictionary dict =
new NestedIntStringDictionary();
EDIT: To add capability to construct specific instance from provided List of items:
public class NestedIntStringDictionary:
NestedDictionary<int, int, string>
{
public NestedIntStringDictionary(IEnumerable<Thing> items)
{
foreach(Thing t in items)
{
Dictionary<int, string> innrDict;
if (!TryGetValue(t.Foo, out innrDict))
{
// First Thing for this Foo: create and register the inner dictionary
innrDict = new Dictionary<int, string>();
Add(t.Foo, innrDict);
}
if (innrDict.ContainsKey(t.Bar))
throw new ArgumentException(
string.Format(
"key value: {0} is already in dictionary", t.Bar));
else innrDict.Add(t.Bar, t.Baz);
}
}
}
and then write:
NestedIntStringDictionary dict =
new NestedIntStringDictionary(GetThings());
Another approach would be to key your dictionary using an anonymous type based on both the Foo and Bar values.
var things = new List<Thing>
{
new Thing {Foo = 3, Bar = 4, Baz = "quick"},
new Thing {Foo = 3, Bar = 8, Baz = "brown"},
new Thing {Foo = 6, Bar = 4, Baz = "fox"},
new Thing {Foo = 6, Bar = 8, Baz = "jumps"}
};
var dict = things.ToDictionary(thing => new {thing.Foo, thing.Bar},
thing => thing.Baz);
var baz = dict[new {Foo = 3, Bar = 4}];
This effectively flattens your hierarchy into a single dictionary.
Note that this dictionary cannot be exposed externally since it is based on an anonymous type.
If the Foo and Bar value combination isn't unique in your original collection, then you would need to group them first.
var dict = things
.GroupBy(thing => new {thing.Foo, thing.Bar})
.ToDictionary(group => group.Key,
group => group.Select(thing => thing.Baz));
var bazes = dict[new {Foo = 3, Bar = 4}];
foreach (var baz in bazes)
{
//...
}
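If the anonymous-type limitation is a problem, a value tuple key (C# 7.0 or later) is one possible alternative, since a tuple type can appear in method signatures. This is a sketch under the question's assumption that each Foo/Bar combination occurs at most once. ThingIndex and Build are names invented for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Thing
{
    public int Foo { get; set; }
    public int Bar { get; set; }
    public string Baz { get; set; }
}

public static class ThingIndex
{
    // Flattens the two-level lookup into one dictionary keyed by (Foo, Bar).
    // ToDictionary throws if the same (Foo, Bar) combination appears twice.
    public static Dictionary<(int Foo, int Bar), string> Build(IEnumerable<Thing> things) =>
        things.ToDictionary(t => (t.Foo, t.Bar), t => t.Baz);
}
```

A lookup then reads index.TryGetValue((3, 4), out var baz), which also covers the "no value for this cell" case without a separate ContainsKey call.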
You may be able to use a KeyedCollection where you define:
class ThingCollection
: KeyedCollection<Dictionary<int,int>,Employee>
{
...
}
Use BeanMap's two key Map class. There is also a 3 key map, and it is quite extensible in case you need n keys.
http://beanmap.codeplex.com/
Your solution would then look like:
class Thing
{
public int Foo { get; set; }
public int Bar { get; set; }
public string Baz { get; set; }
}
[TestMethod]
public void ListToMapTest()
{
var things = new List<Thing>
{
new Thing {Foo = 3, Bar = 3, Baz = "quick"},
new Thing {Foo = 3, Bar = 4, Baz = "brown"},
new Thing {Foo = 6, Bar = 3, Baz = "fox"},
new Thing {Foo = 6, Bar = 4, Baz = "jumps"}
};
var thingMap = Map<int, int, string>.From(things, t => t.Foo, t => t.Bar, t => t.Baz);
Assert.IsTrue(thingMap.ContainsKey(3, 4));
Assert.AreEqual("brown", thingMap[3, 4]);
thingMap.DefaultValue = string.Empty;
Assert.AreEqual("brown", thingMap[3, 4]);
Assert.AreEqual(string.Empty, thingMap[3, 6]);
thingMap.DefaultGeneration = (k1, k2) => (k1.ToString() + k2.ToString());
Assert.IsFalse(thingMap.ContainsKey(3, 6));
Assert.AreEqual("36", thingMap[3, 6]);
Assert.IsTrue(thingMap.ContainsKey(3, 6));
}
I think the simplest approach would be to use the LINQ extension methods. Note that I haven't tested this code for performance.
var items = new[] {
new Thing { Foo = 1, Bar = 3, Baz = "a" },
new Thing { Foo = 1, Bar = 3, Baz = "b" },
new Thing { Foo = 1, Bar = 4, Baz = "c" },
new Thing { Foo = 2, Bar = 4, Baz = "d" },
new Thing { Foo = 2, Bar = 5, Baz = "e" },
new Thing { Foo = 2, Bar = 5, Baz = "f" }
};
var q = items
.ToLookup(i => i.Foo) // first key
.ToDictionary(
i => i.Key,
i => i.ToLookup(
j => j.Bar, // second key
j => j.Baz)); // value
foreach (var foo in q) {
Console.WriteLine("{0}: ", foo.Key);
foreach (var bar in foo.Value) {
Console.WriteLine(" {0}: ", bar.Key);
foreach (var baz in bar) {
Console.WriteLine(" {0}", baz.ToUpper());
}
}
}
Console.ReadLine();
Output:
1:
3:
A
B
4:
C
2:
4:
D
5:
E
F
Dictionary<int, Dictionary<string, int>> nestedDictionary =
new Dictionary<int, Dictionary<string, int>>();
