Join multiple lists of objects in C#

I have three lists that contain objects with following structure:
List1
- Status
- ValueA
List2
- Status
- ValueB
List3
- Status
- ValueC
I want to join the lists by status to get a final list that contains objects with the following structure:
- Status
- ValueA
- ValueB
- ValueC
Not every list contains every status, so a simple (left) join won't do it. Any ideas how to achieve the desired result? I tried:
var result = from first in list1
join second in list2 on first.Status equals second.Status into tmp1
from second in tmp1.DefaultIfEmpty()
join third in list3 on first.Status equals third.Status into tmp2
from third in tmp2.DefaultIfEmpty()
select new { ... };
But the result is missing a status. Here is a full MRE:
using System;
using System.Linq;
using System.Collections.Generic;
public class Program
{
public static void Main()
{
List<A> first = new List<A>() { new A("FOO", 1), new A("BAR", 2) };
List<B> second = new List<B>() { new B("FOO", 6), new B("BAR", 3) };
List<C> third = new List<C>() { new C("BAZ", 5) };
var result = from f in first
join s in second on f.Status equals s.Status into tmp1
from s in tmp1.DefaultIfEmpty()
join t in third on f.Status equals t.Status into tmp2
from t in tmp2.DefaultIfEmpty()
select new
{
Status = f.Status,
ValueA = f.ValueA,
ValueB = s.ValueB,
ValueC = t.ValueC,
};
}
}
public record A(string Status, int ValueA);
public record B(string Status, int ValueB);
public record C(string Status, int ValueC);

Unfortunately, it is unclear what should happen if a status occurs multiple times within one list, because your aggregate can only hold one value per status.
One way to solve this would be:
using System;
using System.Linq;
using System.Collections.Generic;
public class Program
{
public static void Main()
{
List<A> first = new List<A>() { new A("FOO", 1), new A("BAR", 2) };
List<B> second = new List<B>() { new B("FOO", 6), new B("BAR", 3) };
List<C> third = new List<C>() { new C("BAZ", 5) };
var allStates = first.Select(a => a.Status)
.Concat(second.Select(b => b.Status))
.Concat(third.Select(c => c.Status))
.Distinct();
var result = allStates
.Select(Status => new
{
Status,
ValueA = first.FirstOrDefault(a => a.Status == Status),
ValueB = second.FirstOrDefault(b => b.Status == Status),
ValueC = third.FirstOrDefault(c => c.Status == Status),
});
foreach (var item in result)
{
Console.WriteLine(item);
}
}
}
public record A(string Status, int ValueA);
public record B(string Status, int ValueB);
public record C(string Status, int ValueC);
Depending on the number of items to aggregate, and provided that each status occurs at most once per list, it could make sense to convert your lists to a Dictionary<string, A>, Dictionary<string, B>, etc. to speed up the lookup, and do something like this in the aggregate:
ValueA = firstDict.ContainsKey(Status) ? firstDict[Status] : null
As a further improvement (this line performs the lookup twice) you could also factor out a method like this:
private static T GetValueOrDefault<T>(IReadOnlyDictionary<string, T> dict, string status)
{
dict.TryGetValue(status, out T value);
return value;
}
And within the .Select() method call it with
ValueA = GetValueOrDefault(firstDict, Status);
Creating the dictionary for the list could be done with:
var firstDict = first.ToDictionary(a => a.Status);
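Putting the dictionary pieces above together, a minimal sketch could look like the following. It assumes each status occurs at most once per list (ToDictionary throws otherwise). The helper name Lookup is made up for this example, and the values are projected to nullable ints so a missing status shows up as null.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

var first = new List<A> { new A("FOO", 1), new A("BAR", 2) };
var second = new List<B> { new B("FOO", 6), new B("BAR", 3) };
var third = new List<C> { new C("BAZ", 5) };

// Assumption: a status is unique within each list, otherwise ToDictionary throws.
var firstDict = first.ToDictionary(a => a.Status);
var secondDict = second.ToDictionary(b => b.Status);
var thirdDict = third.ToDictionary(c => c.Status);

// Every status that appears in at least one of the lists.
var allStates = firstDict.Keys
    .Concat(secondDict.Keys)
    .Concat(thirdDict.Keys)
    .Distinct();

var result = allStates
    .Select(status => new
    {
        Status = status,
        // ?. turns a missing record into a null int? instead of throwing.
        ValueA = Lookup(firstDict, status)?.ValueA,
        ValueB = Lookup(secondDict, status)?.ValueB,
        ValueC = Lookup(thirdDict, status)?.ValueC,
    })
    .ToList();

foreach (var item in result)
{
    Console.WriteLine(item);
}

// Hypothetical helper: one hash lookup, null when the key is absent.
static T? Lookup<T>(IReadOnlyDictionary<string, T> dict, string status) where T : class
    => dict.TryGetValue(status, out var value) ? value : null;

public record A(string Status, int ValueA);
public record B(string Status, int ValueB);
public record C(string Status, int ValueC);
```

Each dictionary is built once, so the per-status work is three hash lookups rather than three list scans.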

Assuming that status names are unique per list, here is a solution
in a single query with the help of switch expressions (available since C# 8.0):
using System;
using System.Linq;
using System.Collections.Generic;
List<A> first = new List<A>() { new A("FOO", 1), new A("BAR", 2) };
List<B> second = new List<B>() { new B("FOO", 6), new B("BAR", 3) };
List<C> third = new List<C>() { new C("BAZ", 5) };
var result = first
// concat lists together
.Cast<object>()
.Concat(second)
.Concat(third)
// group on Status value with help of switch expression
.GroupBy(el => el switch {
A a => a.Status,
B b => b.Status,
C c => c.Status,
},
// project groups with anonymous type
(Status, group) => new {
Status,
ValueA = group.OfType<A>().Select(a => a.ValueA).Cast<int?>().FirstOrDefault(),
ValueB = group.OfType<B>().Select(b => b.ValueB).Cast<int?>().FirstOrDefault(),
ValueC = group.OfType<C>().Select(c => c.ValueC).Cast<int?>().FirstOrDefault()
});
public record A(string Status, int ValueA);
public record B(string Status, int ValueB);
public record C(string Status, int ValueC);

This can't be done with a left join alone. First collect all keys, then left join the other lists against those keys:
var keys = first.Select(item => item.Status).ToList();
keys.AddRange(second.Select(item => item.Status));
keys.AddRange(third.Select(item => item.Status));
keys = keys.Distinct().ToList();
var result = (from k in keys
join f in first on k equals f.Status into tmp0
from f in tmp0.DefaultIfEmpty()
join s in second on k equals s.Status into tmp1
from s in tmp1.DefaultIfEmpty()
join t in third on k equals t.Status into tmp2
from t in tmp2.DefaultIfEmpty()
select new {
Status = k,
ValueA = f?.ValueA,
ValueB = s?.ValueB,
ValueC = t?.ValueC,
}
).ToList();
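If a status can occur more than once in a list (the open question raised in the first answer), one possible variation is to collect all keys as above and read the values through ToLookup, which accepts duplicate keys and yields an empty sequence for a missing key. This is only a sketch: the duplicated "FOO" entry and the ValuesA/ValuesB/ValuesC property names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample data with a duplicate status ("FOO" twice in the first list).
var first = new List<A> { new A("FOO", 1), new A("FOO", 7), new A("BAR", 2) };
var second = new List<B> { new B("FOO", 6), new B("BAR", 3) };
var third = new List<C> { new C("BAZ", 5) };

// ToLookup groups values by key; indexing with an unknown key gives an empty sequence.
var lookupA = first.ToLookup(a => a.Status, a => a.ValueA);
var lookupB = second.ToLookup(b => b.Status, b => b.ValueB);
var lookupC = third.ToLookup(c => c.Status, c => c.ValueC);

// Union removes duplicate keys while keeping first-seen order.
var keys = first.Select(a => a.Status)
    .Union(second.Select(b => b.Status))
    .Union(third.Select(c => c.Status));

var result = keys
    .Select(k => new
    {
        Status = k,
        ValuesA = lookupA[k].ToList(),
        ValuesB = lookupB[k].ToList(),
        ValuesC = lookupC[k].ToList(),
    })
    .ToList();

foreach (var row in result)
{
    Console.WriteLine(
        $"{row.Status}: A=[{string.Join(", ", row.ValuesA)}] " +
        $"B=[{string.Join(", ", row.ValuesB)}] C=[{string.Join(", ", row.ValuesC)}]");
}
// FOO: A=[1, 7] B=[6] C=[]
// BAR: A=[2] B=[3] C=[]
// BAZ: A=[] B=[] C=[5]

public record A(string Status, int ValueA);
public record B(string Status, int ValueB);
public record C(string Status, int ValueC);
```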

Related

How to use dictionary in c# to compare two lists

Currently, I use a nested for loop over two lists to find matches between them so I can join on them.
I have a list A which contains an ID and some other columns, and a list B which contains an ID and some other columns. The nested loop compares all the IDs to find the ones that match and then returns the joined results. I now want to understand how to use a dictionary in this case, as that should be a more efficient way to solve this problem.
public IEnumerable<Details> GetDetails(string ID)
{
// there are two lists defined up here
foreach (var item in listA)
{
foreach (var item2 in listB)
{
if (item.ID == item2.ID)
{
item.Name = item2.name;
}
}
}
return results;
}
I want to replace this very inefficient double for loop and learn how to use a dictionary to fix this problem.
The dictionary would use the IDs as keys (or indexes), so:
Dictionary<string, object> myListA = new Dictionary<string, object>();
Dictionary<string, object> myListB = new Dictionary<string, object>();
public object GetDetails(string ID)
{
object a = myListA[ID];
object b = myListB[ID];
// combine them here however you want; a tuple is only a placeholder
object c = (a, b);
return c;
}
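To make the dictionary idea concrete, here is a minimal sketch that replaces the nested loop with one dictionary build and one pass over list A. The types ItemA and ItemB and the sample data are hypothetical, since the question does not show the real classes, and IDs are assumed to be unique in list B.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data standing in for the two lists from the question.
var listA = new List<ItemA>
{
    new ItemA { ID = "1" },
    new ItemA { ID = "2" },
    new ItemA { ID = "3" },
};
var listB = new List<ItemB>
{
    new ItemB { ID = "1", Name = "first" },
    new ItemB { ID = "3", Name = "third" },
};

// Build the index once instead of scanning listB for every item in listA.
// Assumption: IDs are unique in listB, otherwise ToDictionary throws.
Dictionary<string, ItemB> byId = listB.ToDictionary(b => b.ID);

foreach (var item in listA)
{
    // TryGetValue does a single hash lookup and reports whether the ID exists.
    if (byId.TryGetValue(item.ID, out var match))
    {
        item.Name = match.Name;
    }
}

foreach (var item in listA)
{
    Console.WriteLine($"{item.ID} -> '{item.Name}'");
}

// Hypothetical element types; only ID and Name matter for the example.
public class ItemA
{
    public string ID { get; set; } = "";
    public string Name { get; set; } = "";
}

public class ItemB
{
    public string ID { get; set; } = "";
    public string Name { get; set; } = "";
}
```

With n items in A and m in B this does roughly n + m operations instead of n * m comparisons. Items without a match (ID "2" here) keep their original Name.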
How about using linq to achieve your actual requirement? Something like:
public IEnumerable<A> GetDetails(int ID)
{
var listA = new List<A>
{
new A(){ ID = 1, Name = 2 },
new A(){ ID = 3, Name = 4 },
new A(){ ID = 5, Name = 6 },
};
var listB = new List<B>
{
new B(){ ID = 1, name = 0 },
new B(){ ID = 3, name = 1 }
};
return listA.Join(listB, k => k.ID, k => k.ID, (item, item2) =>
{
item.Name = item2.name;
return item;
}).Where(w => w.ID == ID);
}
If you just want the common IDs in the two lists, you can achieve that like this:
var commonIds = listA.Select(o => o.ID).Intersect(listB.Select(o => o.ID));

How to union two LINQ queries but the second query need have more fields

I need to union two LINQ queries, but the second query needs to have more fields than the first. How can I do it?
Example:
public static void Dummy()
{
var query1 = this.Db.Table1.Select(s => new MyObject() { A = s.Field1, B = s.Field2 });
var query2 = this.Db.Table2.Select(s => new MyObject() { A = s.Field1, B = s.Field2, C = s.Field3 });
var result = query1.Union(query2);
}
When I call result.ToList(), the following error occurs:
The type 'MyObject' appears in two structurally incompatible
initializations within a single LINQ to Entities query. A type can be
initialized in two places in the same query, but only if the same
properties are set in both places and those properties are set in the
same order.
How can I resolve this problem?
Note: I can't put Field3 in query1 (I don't have access to that query, so I can't change it).
You don't have to put Field3 in first query but Union requires same number of columns and in same order. Specify a dummy value for third column/field C like:
var query1 = this.Db.Table1.Select(s => new MyObject()
{ A = s.Field1, B = s.Field2 , C= ""});
Assign C whatever the default value of Field3 is: null for a reference type, 0 for numbers, etc.
If you can't modify query1, create a new query from query1 like:
var newQuery = query1.Select(s=> new MyObject()
{ A = s.A, B = s.B , C= ""});
and then use that in the Union:
var result = newQuery.Union(query2);
As-is, you can't. You can only union 2 sets that have the same structure. If you don't mind modifying query1, however:
var query1 = this.Db.Table1.Select(s => new MyObject()
{ A = s.Field1, B = s.Field2, C = null });
This would allow them to union properly, as they have the same structure.
You can do it, like this:
Create an object derived from MyObject:
class MyObjectUnion : MyObject{
}
So, the method goes like this:
public static void Dummy()
{
var query1 = this.Db.Table1.Select(s => new MyObject() { A = s.Field1, B = s.Field2 });
var query1modified = this.Db.Table1.Select(s => new MyObjectUnion() { A = s.Field1, B = s.Field2, C = null });
var query2 = this.Db.Table2.Select(s => new MyObjectUnion() { A = s.Field1, B = s.Field2, C = s.Field3 });
var result = query1modified.Union(query2);
}
It works
Because records in query1 will never have a property "C", and all records in query2 will have a property "C", it is unlikely that a record in query1 will be equivalent to a record in query2. The only reason for using Union over Concat is to remove duplicates and since you can't have any, you should likely be using Concat instead of Union.
public static void Dummy()
{
var query1 = this.Db.Table1.Select(s => new MyObject() { A = s.Field1, B = s.Field2 });
var query2 = this.Db.Table2.Select(s => new MyObject() { A = s.Field1, B = s.Field2, C = s.Field3 });
var result = query1.ToList().Concat(query2);
}
There are exceptions: if you have a custom IEqualityComparer for MyObject that ignores the "C" property, if the default value of "C" may occur in a record from Table2 and you want that duplicate removed, or if duplicates may exist within query1 or query2 themselves and you want them removed. In those cases you can still use Concat, but you need to combine it with Distinct.
Edited to force query1 to be materialized before concatenation via .ToList().
Double-checked with LINQPad: the following ran without issues against a data source whose Categories and Cities tables have completely different schemas:
void Main()
{
var query1 = Categories.Select(s => new MyObject { A = s.id, B = s.name });
var query2 = Cities.Select(s => new MyObject { A = s.id, B = s.city_name, C = s.location });
var result = query1.ToList().Concat(query2);
result.Dump();
}
public class MyObject
{
public int A {get;set;}
public string B {get;set;}
public object C {get;set;}
}
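The difference between Union and Concat described above can be seen on plain in-memory sequences. This is only an illustration with made-up data and a hypothetical Row record (records compare by value, which is what Union and Distinct rely on here). It does not reproduce the LINQ to Entities translation from the question.

```csharp
#nullable enable
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for the two queries.
var query1 = new[] { new Row(1, "x", null), new Row(2, "y", null) };
var query2 = new[] { new Row(2, "y", null), new Row(3, "z", "extra") };

// Concat keeps every element, including the repeated (2, "y", null).
var concatenated = query1.Concat(query2).ToList();

// Union removes elements that compare equal.
var unioned = query1.Union(query2).ToList();

// Concat followed by Distinct yields the same elements as Union.
var distinctAfterConcat = query1.Concat(query2).Distinct().ToList();

Console.WriteLine(concatenated.Count);        // 4
Console.WriteLine(unioned.Count);             // 3
Console.WriteLine(distinctAfterConcat.Count); // 3

// A record gets value-based equality over A, B and C.
public record Row(int A, string B, string? C);
```

If no element of the first sequence can ever equal an element of the second, Union only adds comparison work, which is the argument for Concat made above.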

Join and subtract values from 2 lists using linq

I have 2 lists that have objects of { DT (date), Value (double) }.
I want to join on date and subtract the 2 values. However, sometimes one list won't have any records for a given DT in which case I'd want to just use the value from the list that does. However, because I'm joining what ends up happening is I get no record at all for that DT. Is there any way to represent this using sql like linq?
I know I could loop over 1 list myself and search for that date in the other, but if I could do it all in 1 linq line it just seems cleaner.
I believe this is what you can do:
var result = (from x in list1 select new Item() { date = x.date, value = x.value - (from y in list2 where x.date.Equals(y.date) select y.value).FirstOrDefault() }).ToList();
Feel free to run the test ConsoleApp I wrote:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace StackOverFlowConsoleApplication
{
class Program
{
static void Main(string[] args)
{
List<Item> list1 = new List<Item>()
{
new Item(){date = DateTime.Today, value=100},
new Item(){date = DateTime.Today.AddDays(-1), value=100}
};
List<Item> list2 = new List<Item>()
{
new Item(){date = DateTime.Today, value=50}
};
var result = (from x in list1 select new Item() { date = x.date, value = x.value - (from y in list2 where x.date.Equals(y.date) select y.value).FirstOrDefault() }).ToList();
}
class Item
{
public DateTime date { get; set; }
public double value { get; set; }
}
}
}
Say your class is named Blub and looks something like this:
public class Blub
{
public DateTime DT { get; set; }
public double Value { get; set; }
}
And you have two lists of it:
var list1 = new List<Blub>();
var list2 = new List<Blub>();
Then you can find the difference for each date using this LINQ query:
var differences = from x1 in list1
join x2 in list2 on x1.DT equals x2.DT into temp
from x2 in temp.DefaultIfEmpty()
select new Blub
{
DT = x1.DT,
Value = x1.Value - (x2 != null ? x2.Value : 0.0)
};
The DefaultIfEmpty() method turns the join into an outer join, ensuring you get a join pair of (x1, null) if there is no matching x2 for any given DT.
PS: Surely a matter of personal taste, but I think this is quite readable.
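Note that the left join only produces rows for dates that occur in list1. If dates that occur only in list2 should appear as well, one possible sketch is to negate the list2 values, concatenate both lists, and sum per date. This assumes a missing side counts as 0, so a date present only in list2 ends up with a negative value; adjust the projection if "use the value from the list that has it" should mean something else. The sample dates are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Fixed sample dates so the example does not depend on the current day.
var day = new DateTime(2024, 1, 2);
var list1 = new List<Blub>
{
    new Blub { DT = day, Value = 100 },
    new Blub { DT = day.AddDays(-1), Value = 100 },   // only in list1
};
var list2 = new List<Blub>
{
    new Blub { DT = day, Value = 50 },
    new Blub { DT = day.AddDays(1), Value = 30 },     // only in list2
};

// list1 values count positively, list2 values negatively; summing per date
// gives Value1 - Value2, with a missing side contributing 0.
var differences = list1
    .Select(x => new { x.DT, x.Value })
    .Concat(list2.Select(x => new { x.DT, Value = -x.Value }))
    .GroupBy(x => x.DT)
    .Select(g => new Blub { DT = g.Key, Value = g.Sum(x => x.Value) })
    .OrderBy(b => b.DT)
    .ToList();

foreach (var d in differences)
{
    Console.WriteLine($"{d.DT:yyyy-MM-dd}: {d.Value}");
}

public class Blub
{
    public DateTime DT { get; set; }
    public double Value { get; set; }
}
```

Unlike the join, this also sums repeated entries for the same date within one list, which may or may not be what you want.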

How do I merge records using LINQ?

I'd like to merge two records using a condition for each column in the row. I'd give you a code sample but I don't know where to start.
class Foo
{
public int i {get;set;}
public int b{get;set;}
public string first{get;set;}
public string last{get;set;}
}
//...
var list = new List<Foo>() {
new Foo () { i=1, b=0, first="Vince", last="P"},
new Foo () { i=1, b=1, first="Vince", last="P"},
new Foo () { i=1, b=0, first="Bob", last="Z"},
new Foo () { i=0, b=1, first="Bob", last="Z"},
} ;
// This is how I'd like my result to look like
// Record 1 - i = 1, b = 1, first="Vince", last = "P"
// Record 2 - i = 1, b = 1, first="Bob", last = "Z"
You can group the result, then aggregate the fields from the items in the group:
var result = list.GroupBy(f => f.first).Select(
g => new Foo() {
b = g.Aggregate(0, (a, f) => a | f.b),
i = g.Aggregate(0, (a, f) => a | f.i),
first = g.Key,
last = g.First().last
}
);
You could use the Aggregate method in LINQ.
First add a method to Foo, say Merge that returns a new Foo based on your merging rules.
public Foo Merge (Foo other)
{
// Implement merge rules here ...
return new Foo {..., b=Math.Max(this.b, other.b), ...};
}
You could also, instead, create a helper method outside the Foo class that does the merging.
Now use Aggregate over your list, using the first element as the seed, merging each record with the current aggregate value as you go. Or, instead of using Aggregate (since it's a somewhat contrived use of LINQ in this case), just do:
Foo result = list.First();
foreach (var item in list.Skip(1)) result = result.Merge(item);
How are your merge rules specified?
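As a sketch of the Merge-plus-Aggregate idea, the following groups the records by name and folds each group into one record. The merge rule used here (take the larger value of each numeric column, keep the names of the left-hand record) is an assumption, since the question does not state the actual rules.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var list = new List<Foo>
{
    new Foo { i = 1, b = 0, first = "Vince", last = "P" },
    new Foo { i = 1, b = 1, first = "Vince", last = "P" },
    new Foo { i = 1, b = 0, first = "Bob", last = "Z" },
    new Foo { i = 0, b = 1, first = "Bob", last = "Z" },
};

// Group by name, then fold each group into a single record.
// Aggregate without a seed starts from the group's first element.
var merged = list
    .GroupBy(f => (f.first, f.last))
    .Select(g => g.Aggregate((acc, next) => acc.Merge(next)))
    .ToList();

foreach (var f in merged)
{
    Console.WriteLine($"i = {f.i}, b = {f.b}, first = {f.first}, last = {f.last}");
}
// i = 1, b = 1, first = Vince, last = P
// i = 1, b = 1, first = Bob, last = Z

public class Foo
{
    public int i { get; set; }
    public int b { get; set; }
    public string first { get; set; } = "";
    public string last { get; set; } = "";

    // Assumed merge rule: larger value per numeric column, names from this record.
    public Foo Merge(Foo other) => new Foo
    {
        i = Math.Max(this.i, other.i),
        b = Math.Max(this.b, other.b),
        first = this.first,
        last = this.last,
    };
}
```

Merge returns a new Foo, so the items in the original list are left unchanged.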
I found a non-elegant solution that works
var result = list.GroupBy(i=>i.first);
foreach (IGrouping<string, Foo> grp in result)
{
// Keep the merged record; Aggregate returns it rather than modifying the group
Foo merged = grp.Aggregate ((f1, f2) => {
return new Foo() {
b = f1.b | f2.b,
i = f1.i | f2.i,
first = f1.first,
last = f1.last
};
});
}

Can a single LINQ Query Expression be framed in this scenario?

I am facing a scenario where I have to filter a single object based on many objects.
For sake of example, I have a Grocery object which comprises of both Fruit and Vegetable properties. Then I have the individual Fruit and Vegetable objects.
My objective is this:
var groceryList = from grocery in Grocery.ToList()
from fruit in Fruit.ToList()
from veggie in Vegetable.ToList()
where (grocery.fruitId == fruit.fruitId)
where (grocery.vegId == veggie.vegId)
select (grocery);
The problem I am facing is when Fruit and Vegetable objects are empty.
By empty, I mean their list count is 0 and I want to apply the filter only if the filter list is populated.
I am also NOT able to use something like the following, since the objects are null:
var groceryList = from grocery in Grocery.ToList()
from fruit in Fruit.ToList()
from veggie in Vegetable.ToList()
where (grocery.fruitId == fruit.fruitId || fruit.fruitId == String.Empty)
where (grocery.vegId == veggie.vegId || veggie.vegId == String.Empty)
select (grocery);
So, I intend to check for Fruit and Vegetable list count...and filter them as separate expressions on successively filtered Grocery objects.
But is there a way to still get the list in case of null objects in a single query expression?
I think the LINQ GroupJoin operator will help you here. It's similar to the TSQL LEFT OUTER JOIN
IEnumerable<Grocery> query = Grocery;
if (Fruit != null)
{
query = query.Where(grocery =>
Fruit.Any(fruit => fruit.FruitId == grocery.FruitId));
}
if (Vegetable != null)
{
query = query.Where(grocery =>
Vegetable.Any(veggie => veggie.VegetableId == grocery.VegetableId));
}
List<Grocery> results = query.ToList();
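The question asks for the filter to be skipped when a filter list is empty, not only when it is null. A minimal sketch of that variation follows. The Grocery record, its property names and the sample IDs are hypothetical, and the filter lists are reduced to plain lists of IDs to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data.
var groceries = new List<Grocery>
{
    new Grocery("apple salad", "f1", "v1"),
    new Grocery("pear soup", "f2", "v2"),
    new Grocery("plum stew", "f3", "v1"),
};
var fruitIds = new List<string> { "f1", "f3" };
var vegIds = new List<string>();   // empty: no vegetable filter should be applied

IEnumerable<Grocery> query = groceries;

// Apply each filter only when its list is non-null and has at least one entry.
if (fruitIds != null && fruitIds.Count > 0)
{
    var allowedFruit = new HashSet<string>(fruitIds);
    query = query.Where(g => allowedFruit.Contains(g.FruitId));
}
if (vegIds != null && vegIds.Count > 0)
{
    var allowedVeg = new HashSet<string>(vegIds);
    query = query.Where(g => allowedVeg.Contains(g.VegId));
}

List<Grocery> results = query.ToList();
Console.WriteLine(results.Count); // 2 ("apple salad" and "plum stew")

public record Grocery(string Name, string FruitId, string VegId);
```

The HashSet is optional. For short filter lists, Any() as in the answer above works just as well.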
Try something like the following:
var joined = grocery.Join(fruit, g => g.fruitId,
f => f.fruitId,
(g, f) => new Grocery() { /*set grocery properties*/ }).
Join(veggie, g => g.vegId,
v => v.vegId,
(g, v) => new Grocery() { /*set grocery properties*/ });
Where I have said "set grocery properties" you can set the properties of the grocery object from the g, f, v variables of the selector. Of interest will obviously be setting g.fruitId = f.fruitId and g.vegId = v.vegId.
var groceryList =
from grocery in Grocery.ToList()
join fruit in Fruit.ToList()
on grocery.fruitId equals fruit.fruitId
into groceryFruits
join veggie in Vegetable.ToList()
on grocery.vegId equals veggie.vegId
into groceryVeggies
where ... // filter as needed
select new
{
Grocery = grocery,
GroceryFruits = groceryFruits,
GroceryVeggies = groceryVeggies
};
You have to use a left outer join (as in T-SQL) for this. The query below does the trick:
private void test()
{
var grocery = new List<groceryy>() { new groceryy { fruitId = 1, vegid = 1, name = "s" }, new groceryy { fruitId = 2, vegid = 2, name = "a" }, new groceryy { fruitId = 3, vegid = 3, name = "h" } };
var fruit = new List<fruitt>() { new fruitt { fruitId = 1, fname = "s" }, new fruitt { fruitId = 2, fname = "a" } };
var veggie = new List<veggiee>() { new veggiee { vegid = 1, vname = "s" }, new veggiee { vegid = 2, vname = "a" } };
//var fruit= new List<fruitt>();
//var veggie = new List<veggiee>();
var result = from g in grocery
join f in fruit on g.fruitId equals f.fruitId into tempFruit
join v in veggie on g.vegid equals v.vegid into tempVegg
from joinedFruit in tempFruit.DefaultIfEmpty()
from joinedVegg in tempVegg.DefaultIfEmpty()
select new { g.fruitId, g.vegid, fname = ((joinedFruit == null) ? string.Empty : joinedFruit.fname), vname = ((joinedVegg == null) ? string.Empty : joinedVegg.vname) };
foreach (var outt in result)
Console.WriteLine(outt.fruitId + " " + outt.vegid + " " + outt.fname + " " + outt.vname);
}
public class groceryy
{
public int fruitId;
public int vegid;
public string name;
}
public class fruitt
{
public int fruitId;
public string fname;
}
public class veggiee
{
public int vegid;
public string vname;
}
EDIT:
this is the sample result
1 1 s s
2 2 a a
3 3
