Merge contents of multiple lists of custom objects - C#

I have a class Project as
public class Project
{ public int ProjectId { get; set; }
public string ProjectName { get; set; }
public string Customer { get; set; }
public string Address{ get; set; }
}
and I have 3 lists
List<Project> lst1; List<Project> lst2; List<Project> lst3;
lst1 contains Project objects with ProjectId and ProjectName set:
ProjectId =1, ProjectName = "X", Customer = null, Address = null
ProjectId =2, ProjectName = "Y", Customer = null, Address = null
lst2 contains Project objects with ProjectId and Customer set:
ProjectId =1,ProjectName = null, Customer = "c1", Address = null
ProjectId =2,ProjectName = null, Customer = "c2", Address = null
and lst3 contains Project objects with ProjectId and Address set:
ProjectId = 1, ProjectName = null, Customer =null, Address = "a1"
ProjectId = 2, ProjectName = null, Customer =null, Address = "a2".
Considering there are multiple such records in each list and ProjectId is unique for each project, how can I merge/combine these lists to get one list with merged objects?
ProjectId=1, ProjectName="X", Customer="c1", address="a1"
ProjectId=2, ProjectName="Y", Customer="c2", address="a2"
I found these similar links and tried them, but could not get the desired result:
Create a list from two object lists with linq
How to merge two lists using LINQ?
Thank You.

This could be done in a multi-step approach pretty simply. First, define a Func<Project, Project, Project> to handle the actual record merging. That is, you are defining a method with a signature equivalent to public Project SomeMethod(Project p1, Project p2). This method implements the merging logic you outlined above. Next, we concatenate the elements of the lists together before grouping them by ProjectId, using our merge delegate as the aggregate function in the overload of GroupBy which accepts a result selector:
Func<Project, Project, Project> mergeFunc = (p1,p2) => new Project
{
ProjectId = p1.ProjectId,
ProjectName = p1.ProjectName == null ? p2.ProjectName : p1.ProjectName,
Customer = p1.Customer == null ? p2.Customer : p1.Customer,
Address = p1.Address == null ? p2.Address : p1.Address
};
var output = lst1.Concat(lst2).Concat(lst3)
.GroupBy(x => x.ProjectId, (k, g) => g.Aggregate(mergeFunc));
Here's a quick and dirty test of the above logic along with output:
List<Project> lst1; List<Project> lst2; List<Project> lst3;
lst1 = new List<Project>
{
new Project { ProjectId = 1, ProjectName = "P1" },
new Project { ProjectId = 2, ProjectName = "P2" },
new Project { ProjectId = 3, ProjectName = "P3" }
};
lst2 = new List<Project>
{
new Project { ProjectId = 1, Customer = "Cust1"},
new Project { ProjectId = 2, Customer = "Cust2"},
new Project { ProjectId = 3, Customer = "Cust3"}
};
lst3 = new List<Project>
{
new Project { ProjectId = 1, Address = "Add1"},
new Project { ProjectId = 2, Address = "Add2"},
new Project { ProjectId = 3, Address = "Add3"}
};
Func<Project, Project, Project> mergeFunc = (p1,p2) => new Project
{
ProjectId = p1.ProjectId,
ProjectName = p1.ProjectName == null ? p2.ProjectName : p1.ProjectName,
Customer = p1.Customer == null ? p2.Customer : p1.Customer,
Address = p1.Address == null ? p2.Address : p1.Address
};
var output = lst1
.Concat(lst2)
.Concat(lst3)
.GroupBy(x => x.ProjectId, (k, g) => g.Aggregate(mergeFunc));
IEnumerable<bool> assertedCollection = output.Select((x, i) =>
x.ProjectId == (i + 1)
&& x.ProjectName == "P" + (i+1)
&& x.Customer == "Cust" + (i+1)
&& x.Address == "Add" + (i+1));
Debug.Assert(output.Count() == 3);
Debug.Assert(assertedCollection.All(x => x == true));
--- output ---
IEnumerable<Project> (3 items)
ProjectId ProjectName Customer Address
1 P1 Cust1 Add1
2 P2 Cust2 Add2
3 P3 Cust3 Add3

Using a Lookup you can do it like this:
List<Project> lst = lst1.Union(lst2).Union(lst3).ToLookup(x => x.ProjectId).Select(x => new Project()
{
ProjectId = x.Key,
ProjectName = x.Select(y => y.ProjectName).Aggregate((z1,z2) => z1 ?? z2),
Customer = x.Select(y => y.Customer).Aggregate((z1, z2) => z1 ?? z2),
Address = x.Select(y => y.Address).Aggregate((z1, z2) => z1 ?? z2)
}).ToList();

I believe the following is how a LINQ Join would handle this:
var mergedProjects =
    lst1
    .Join(lst2,
        proj1 => proj1.ProjectId,
        proj2 => proj2.ProjectId,
        (proj1, proj2) => new { Proj1 = proj1, Proj2 = proj2 })
    .Join(lst3,
        pair => pair.Proj1.ProjectId,
        proj3 => proj3.ProjectId,
        (pair, proj3) => new Project
        {
            ProjectId = proj3.ProjectId,
            ProjectName = pair.Proj1.ProjectName,
            Customer = pair.Proj2.Customer,
            Address = proj3.Address
        });
This will not return any results where the ProjectId is not found in all three lists.
If this is a problem, I think you'd be better off doing this manually rather than using LINQ.
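One way to do the merge "manually", as suggested above, is to collect the merged records in a dictionary keyed by ProjectId. The following is only a sketch under assumptions: the ManualMerge class and its Merge method are hypothetical names, and "the first non-null value wins" is an assumed rule that the question does not spell out. Unlike the chained Join, it keeps projects that are missing from one or more lists.

```csharp
using System;
using System.Collections.Generic;

public class Project
{
    public int ProjectId { get; set; }
    public string ProjectName { get; set; }
    public string Customer { get; set; }
    public string Address { get; set; }
}

public static class ManualMerge
{
    // Hypothetical helper: merges any number of lists by ProjectId,
    // keeping the first non-null value seen for each property (assumed rule).
    public static List<Project> Merge(params IEnumerable<Project>[] lists)
    {
        var byId = new Dictionary<int, Project>();
        foreach (var list in lists)
        {
            foreach (var p in list)
            {
                Project merged;
                if (!byId.TryGetValue(p.ProjectId, out merged))
                {
                    // First time this ProjectId is seen: start an empty record.
                    merged = new Project { ProjectId = p.ProjectId };
                    byId.Add(p.ProjectId, merged);
                }
                merged.ProjectName = merged.ProjectName ?? p.ProjectName;
                merged.Customer = merged.Customer ?? p.Customer;
                merged.Address = merged.Address ?? p.Address;
            }
        }
        return new List<Project>(byId.Values);
    }

    public static void Main()
    {
        var lst1 = new List<Project> { new Project { ProjectId = 1, ProjectName = "X" } };
        var lst2 = new List<Project> { new Project { ProjectId = 1, Customer = "c1" } };
        // ProjectId 2 appears only in lst3 and is still kept.
        var lst3 = new List<Project> { new Project { ProjectId = 2, Address = "a2" } };

        foreach (var p in Merge(lst1, lst2, lst3))
        {
            Console.WriteLine("{0} {1} {2} {3}", p.ProjectId, p.ProjectName, p.Customer, p.Address);
        }
    }
}
```

Each input record is visited once, so the cost grows linearly with the total number of records rather than with the product of the list sizes.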

I assume that the lists contain the same number of items and are sorted by ProjectId.
List<Project> lst1; List<Project> lst2; List<Project> lst3;
If the lists are not sorted, you can sort them first.
lst1.Sort((a, b) => a.ProjectId.CompareTo(b.ProjectId));
lst2.Sort((a, b) => a.ProjectId.CompareTo(b.ProjectId));
lst3.Sort((a, b) => a.ProjectId.CompareTo(b.ProjectId));
Then merge the objects by index:
List<Project> lst4 = new List<Project>();
for (int i = 0; i < lst1.Count; i++)
{
    lst4.Add(new Project
    {
        ProjectId = lst1[i].ProjectId,
        ProjectName = lst1[i].ProjectName,
        Customer = lst2[i].Customer,
        Address = lst3[i].Address
    });
}

Although overkill, I was tempted to make this an extension method:
public static List<T> MergeWith<T,TKey>(this List<T> list, List<T> other, Func<T,TKey> keySelector, Func<T,T,T> merge)
{
var newList = new List<T>();
foreach(var item in list)
{
var otherItem = other.SingleOrDefault((i) => keySelector(i).Equals(keySelector(item)));
if(otherItem != null)
{
newList.Add(merge(item,otherItem));
}
}
return newList;
}
Usage would then be:
var merged = list1
.MergeWith(list2, i => i.ProjectId,
(lhs,rhs) => new Project{ProjectId=lhs.ProjectId,ProjectName=lhs.ProjectName, Customer=rhs.Customer})
.MergeWith(list3,i => i.ProjectId,
(lhs,rhs) => new Project{ProjectId=lhs.ProjectId,ProjectName=lhs.ProjectName, Customer=lhs.Customer,Address=rhs.Address});
Live example: http://rextester.com/ETIVB14254

This is assuming that you want to take the first non-null value, or revert to the default value - in this case null for a string.
private static IEnumerable<Project> GetMergedProjects(IEnumerable<List<Project>> projects)
{
var projectGrouping = projects.SelectMany(p => p).GroupBy(p => p.ProjectId);
foreach (var projectGroup in projectGrouping)
{
yield return new Project
{
ProjectId = projectGroup.Key,
ProjectName =
projectGroup.Select(p => p.ProjectName).FirstOrDefault(
p => !string.IsNullOrEmpty(p)),
Customer =
projectGroup.Select(c => c.Customer).FirstOrDefault(
c => !string.IsNullOrEmpty(c)),
Address =
projectGroup.Select(a => a.Address).FirstOrDefault(
a => !string.IsNullOrEmpty(a)),
};
}
}
You could also make this an extension method if needed.
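As a rough illustration of the extension-method variant mentioned above, the sketch below wraps the same grouping logic. The names ProjectListExtensions, MergeByProjectId and FirstNonEmpty are hypothetical and not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Project
{
    public int ProjectId { get; set; }
    public string ProjectName { get; set; }
    public string Customer { get; set; }
    public string Address { get; set; }
}

public static class ProjectListExtensions
{
    // Hypothetical extension method: flattens the lists, groups by ProjectId,
    // and takes the first non-empty value of each property per group.
    public static IEnumerable<Project> MergeByProjectId(this IEnumerable<List<Project>> projects)
    {
        return projects
            .SelectMany(list => list)
            .GroupBy(p => p.ProjectId)
            .Select(g => new Project
            {
                ProjectId = g.Key,
                ProjectName = FirstNonEmpty(g.Select(p => p.ProjectName)),
                Customer = FirstNonEmpty(g.Select(p => p.Customer)),
                Address = FirstNonEmpty(g.Select(p => p.Address))
            });
    }

    // Returns null when every value is null or empty.
    private static string FirstNonEmpty(IEnumerable<string> values)
    {
        return values.FirstOrDefault(v => !string.IsNullOrEmpty(v));
    }
}

public static class Program
{
    public static void Main()
    {
        var lst1 = new List<Project> { new Project { ProjectId = 1, ProjectName = "X" } };
        var lst2 = new List<Project> { new Project { ProjectId = 1, Customer = "c1" } };
        var lst3 = new List<Project> { new Project { ProjectId = 1, Address = "a1" } };

        foreach (var p in new[] { lst1, lst2, lst3 }.MergeByProjectId())
        {
            Console.WriteLine("{0} {1} {2} {3}", p.ProjectId, p.ProjectName, p.Customer, p.Address);
        }
    }
}
```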

Related

LINQ Query to filter Data?

I have a table which contains Branch Ids and Department Ids. I have three branches and 1st branch has only 1 Department, the 2nd branch has two departments and 3rd branch has three Departments.
Now, I need to write a query to find branches which have department 1 but don't have dept. 2 and dept. 3.
This is just an example, I have a much more complex scenario which is very dynamic. I am using this example to put forward my question.
I am attaching the picture to understand the problem.
Here's query:
db.ConnectedBRDE.Where(x => x.DeptId == 1 && x.DeptId != 2)
.Select(x => x.BranchId)
.ToList();
This query is giving me all three branches, whereas I only need branch 1, because it is the only branch which doesn't have department 2.
This part && x.DeptId != 2 is wrong, I guess. What should I write here to make my filter work?
Stephen Muecke's comment does indeed work.
I have tested it in DotNetFiddle.
using System;
using System.Collections.Generic;
using System.Linq;
public class Program
{
public static void Main(string[] args)
{
List<TestClass> lstOfItems = new List<TestClass>();
var itemOne = new TestClass(){BranchName = "Branch One", BranchId = 1, DeptId = 1};
var itemTwo = new TestClass(){BranchName = "Branch Two", BranchId = 2, DeptId = 1};
var itemThree = new TestClass(){BranchName = "Branch Two", BranchId = 2, DeptId = 2};
var itemFour = new TestClass(){BranchName = "Branch Three", BranchId = 3, DeptId = 1};
var itemFive = new TestClass(){BranchName = "Branch Three", BranchId = 3, DeptId = 2};
var itemSix = new TestClass(){BranchName = "Branch Three", BranchId = 3, DeptId = 3};
lstOfItems.Add(itemOne);
lstOfItems.Add(itemTwo);
lstOfItems.Add(itemThree);
lstOfItems.Add(itemFour);
lstOfItems.Add(itemFive);
lstOfItems.Add(itemSix);
var myList = lstOfItems.GroupBy(x => x.BranchName).Where(y => y.Count() == 1 && y.First().DeptId == 1).ToList();
foreach(var item in myList){
Console.WriteLine(item.Key);
}
// Output
// Branch One
}
}
public class TestClass
{
public string BranchName {get;set;}
public int BranchId {get;set;}
public int DeptId {get;set;}
}
Basically, once all of the records are grouped by BranchName property, then we want to count all of the records under each branch name.. and if the count equals 1 then that means that branch only has 1 record.. and then we find the DeptId of that record and if it equals 1 then that satisfies your condition.
I think the code below is what you are looking for:
var list = new List<Model>();
list.Add(new Model(1, 1));
list.Add(new Model(2, 1));
list.Add(new Model(2, 2));
list.Add(new Model(3, 1));
list.Add(new Model(3, 2));
list.Add(new Model(3, 3));
var notValidBranchIds = list.Where(x => x.DeptId == 2 || x.DeptId == 3).Select(x => x.BranchId);
var result = list.Where(x => x.DeptId == 1 && !notValidBranchIds.Contains(x.BranchId)).Select(x => x.BranchId);
// Alternatively, group by branch and filter the groups in a single query
var betterResult = list.GroupBy(x => x.BranchId)
    .Where(g => g.Any(a => a.DeptId == 1) && !g.Any(a => a.DeptId == 2 || a.DeptId == 3))
    .Select(g => g.Key)
    .ToList();
For the sample data this returns only the first branch's ID.
Hope it helps.
If you have access to the Branch and Department models, I suggest using this query: Branches.Where(b=>b.Departments.All(d=>d.Id != 2) && b.Departments.Any(d=>d.Id==1))
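A minimal sketch of that idea follows, assuming hypothetical Branch and Department classes (the question does not show the real models) and a hypothetical BranchFilter.Filter helper. Because the question says the real scenario is dynamic, the required and excluded department IDs are passed in as parameters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical models; the real entity classes are not shown in the question.
public class Department
{
    public int Id { get; set; }
}

public class Branch
{
    public int Id { get; set; }
    public List<Department> Departments { get; set; } = new List<Department>();
}

public static class BranchFilter
{
    // Keeps branches that contain every required department
    // and none of the excluded departments.
    public static List<Branch> Filter(
        IEnumerable<Branch> branches,
        ICollection<int> required,
        ICollection<int> excluded)
    {
        return branches
            .Where(b => required.All(r => b.Departments.Any(d => d.Id == r))
                     && b.Departments.All(d => !excluded.Contains(d.Id)))
            .ToList();
    }

    public static void Main()
    {
        // Same shape as the question's data: branch 1 has dept 1,
        // branch 2 has depts 1-2, branch 3 has depts 1-3.
        var branches = new List<Branch>
        {
            new Branch { Id = 1, Departments = { new Department { Id = 1 } } },
            new Branch { Id = 2, Departments = { new Department { Id = 1 }, new Department { Id = 2 } } },
            new Branch { Id = 3, Departments = { new Department { Id = 1 }, new Department { Id = 2 }, new Department { Id = 3 } } }
        };

        var result = Filter(branches, new[] { 1 }, new[] { 2, 3 });
        Console.WriteLine(string.Join(", ", result.Select(b => b.Id))); // prints: 1
    }
}
```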

Select Last non null-able item per product?

Let's say I have,
class Product
{
public int Id {get; set;}
public string Name {get; set;}
public int Order {get; set;}
}
and my data have,
products[0] = new Product { Id = 1, Name = "P1", Order = 1 };
products[1] = new Product { Id = 1, Name = "P2", Order = 2 };
products[2] = new Product { Id = 1, Name = null, Order = 3 };
products[3] = new Product { Id = 2, Name = "P3", Order = 4 };
products[4] = new Product { Id = 2, Name = null, Order = 5 };
products[5] = new Product { Id = 2, Name = null, Order = 6 };
What I need is the last (order by Order desc) non-null value of Name per Product.Id. So my final output will look like,
items[0] = new { Id = 1, Name = "P2"};
items[1] = new { Id = 2, Name = "P3"};
For Id=1 there are 3 Names (P1, P2, null); the non-null Names are P1 and P2, and the last one is P2.
This should get the last products in order.
var lastOrders = products
.Where(x => x.Name != null) // Remove inapplicable data
.OrderBy(x => x.Order) // Order by the Order
.GroupBy(x => x.Id) // Group the sorted Products
.Select(x => x.Last()); // Get the last products in the groups
var result = products
.GroupBy(p => p.Id)
.Select(g => g.OrderBy(x => x.Order).Last(x => x.Name != null));
This will give you your desired output:
products.GroupBy(p => p.Id)
    .Select(g => g.OrderByDescending(gg => gg.Order)
        .Where(gg => gg.Name != null)
        .Select(gg => new { gg.Id, gg.Name })
        .First());
The following LINQ statement returns the single latest item with a non-null Name across all products (it does not group by Id):
var Result = products.OrderByDescending(iProduct => iProduct.Order).Where(iProduct => null != iProduct.Name).First();
This requires products to contain at least one item where Name is not null, otherwise an exception will be thrown. Alternatively,
var Result = products.OrderByDescending(iProduct => iProduct.Order).Where(iProduct => null != iProduct.Name).FirstOrDefault();
will return null if products contains no such item.
Try with:
var expectedProduct = products.Where(p => p.Name != null).OrderByDescending(p => p.Order).GroupBy(p => p.Id).Select(g => g.First());
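The answers above differ in how they behave when an Id has no non-null Name at all: some throw, some return null. The sketch below is one possible way to make that case explicit, assuming it is acceptable to leave such Ids out of the result. LastNamePerProduct and LastNonNullNames are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int Order { get; set; }
}

public static class LastNamePerProduct
{
    // Maps each Id to the Name of its highest-Order item whose Name is not null.
    // Ids whose items all have a null Name are omitted (assumed behaviour).
    public static Dictionary<int, string> LastNonNullNames(IEnumerable<Product> products)
    {
        return products
            .Where(p => p.Name != null)
            .GroupBy(p => p.Id)
            .ToDictionary(
                g => g.Key,
                g => g.OrderByDescending(p => p.Order).First().Name);
    }

    public static void Main()
    {
        var products = new[]
        {
            new Product { Id = 1, Name = "P1", Order = 1 },
            new Product { Id = 1, Name = "P2", Order = 2 },
            new Product { Id = 1, Name = null, Order = 3 },
            new Product { Id = 2, Name = "P3", Order = 4 },
            new Product { Id = 2, Name = null, Order = 5 },
            new Product { Id = 2, Name = null, Order = 6 }
        };

        var names = LastNonNullNames(products);
        Console.WriteLine(names[1]); // prints: P2
        Console.WriteLine(names[2]); // prints: P3
    }
}
```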

Linq: find similar objects from two different lists

I've got two separate lists of custom objects. In these two separate lists, there may be some objects that are identical between the two lists, with the exception of one field ("id"). I'd like to know a smart way to query these two lists to find this overlap. I've attached some code to help clarify. Any suggestions would be appreciated.
namespace ConsoleApplication1
{
class userObj
{
public int id;
public DateTime BirthDate;
public string FirstName;
public string LastName;
}
class Program
{
static void Main(string[] args)
{
List<userObj> list1 = new List<userObj>();
list1.Add(new userObj()
{
BirthDate=DateTime.Parse("1/1/2000"),
FirstName="John",
LastName="Smith",
id=0
});
list1.Add(new userObj()
{
BirthDate = DateTime.Parse("2/2/2000"),
FirstName = "Jane",
LastName = "Doe",
id = 1
});
list1.Add(new userObj()
{
BirthDate = DateTime.Parse("3/3/2000"),
FirstName = "Sam",
LastName = "Smith",
id = 2
});
List<userObj> list2 = new List<userObj>();
list2.Add(new userObj()
{
BirthDate = DateTime.Parse("1/1/2000"),
FirstName = "John",
LastName = "Smith",
id = 3
});
list2.Add(new userObj()
{
BirthDate = DateTime.Parse("2/2/2000"),
FirstName = "Jane",
LastName = "Doe",
id = 4
});
List<int> similarObjectsFromTwoLists = null;
//Would like this equal to the overlap. It could be the IDs on either side that have a "buddy" on the other side: (3,4) or (0,1) in the above case.
}
}
}
I don't know why you want a List<int>; I assume this is what you want:
var intersectingUser = from l1 in list1
join l2 in list2
on new { l1.FirstName, l1.LastName, l1.BirthDate }
equals new { l2.FirstName, l2.LastName, l2.BirthDate }
select new { ID1 = l1.id, ID2 = l2.id };
foreach (var bothIDs in intersectingUser)
{
Console.WriteLine("ID in List1: {0} ID in List2: {1}",
bothIDs.ID1, bothIDs.ID2);
}
Output:
ID in List1: 0 ID in List2: 3
ID in List1: 1 ID in List2: 4
You can implement your own IEqualityComparer<T> for your userObj class and use that to run a comparison between the two lists. This will be the most performant approach.
public class NameAndBirthdayComparer : IEqualityComparer<userObj>
{
public bool Equals(userObj x, userObj y)
{
return x.FirstName == y.FirstName && x.LastName == y.LastName && x.BirthDate == y.BirthDate;
}
public int GetHashCode(userObj obj)
{
unchecked
{
var hash = (int)2166136261;
hash = hash * 16777619 ^ obj.FirstName.GetHashCode();
hash = hash * 16777619 ^ obj.LastName.GetHashCode();
hash = hash * 16777619 ^ obj.BirthDate.GetHashCode();
return hash;
}
}
}
You can use this comparer like this:
list1.Intersect(list2, new NameAndBirthdayComparer()).Select(obj => obj.id).ToList();
You could simply join the lists on those 3 properties:
var result = from l1 in list1
join l2 in list2
on new {l1.BirthDate, l1.FirstName, l1.LastName}
equals new {l2.BirthDate, l2.FirstName, l2.LastName}
select new
{
fname = l1.FirstName,
name = l1.LastName,
bday = l1.BirthDate
};
Instead of doing a simple join on just one property (column), two anonymous objects are created new { prop1, prop2, ..., propN}, on which the join is executed.
In your case we are taking all properties, except the Id, which you want to be ignored and voila:
Output:
And Tim beat me to it by a minute
var similarObjectsFromTwoLists = list1.Where(x =>
list2.Exists(y => y.BirthDate == x.BirthDate && y.FirstName == x.FirstName && y.LastName == x.LastName)
).ToList();
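This is shorter, but for large lists "Intersect" or "Join" is more efficient:
var similarObjectsFromTwoLists =
list1.Join(list2, x => x.GetHashCode(), y => y.GetHashCode(), (x, y) => x).ToList();
(supposing GetHashCode() is overridden for userObj so that it hashes only these three fields; distinct objects can still share a hash code, so this may report false matches)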
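For large lists, a hash-based lookup that does not rely on GetHashCode() alone can be sketched with a HashSet of composite keys. This is only an illustration: OverlapFinder, Key and IdsWithBuddy are hypothetical names, and value tuples assume C# 7 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class userObj
{
    public int id;
    public DateTime BirthDate;
    public string FirstName;
    public string LastName;
}

public static class OverlapFinder
{
    // Value tuples compare element by element, so they work as a composite key.
    private static (string, string, DateTime) Key(userObj u)
    {
        return (u.FirstName, u.LastName, u.BirthDate);
    }

    // Returns the ids from list1 that have a "buddy" in list2
    // (same first name, last name and birth date; id is ignored).
    public static List<int> IdsWithBuddy(List<userObj> list1, List<userObj> list2)
    {
        var keys = new HashSet<(string, string, DateTime)>(list2.Select(u => Key(u)));
        return list1.Where(u => keys.Contains(Key(u))).Select(u => u.id).ToList();
    }

    public static void Main()
    {
        var list1 = new List<userObj>
        {
            new userObj { id = 0, FirstName = "John", LastName = "Smith", BirthDate = new DateTime(2000, 1, 1) },
            new userObj { id = 1, FirstName = "Jane", LastName = "Doe", BirthDate = new DateTime(2000, 2, 2) },
            new userObj { id = 2, FirstName = "Sam", LastName = "Smith", BirthDate = new DateTime(2000, 3, 3) }
        };
        var list2 = new List<userObj>
        {
            new userObj { id = 3, FirstName = "John", LastName = "Smith", BirthDate = new DateTime(2000, 1, 1) },
            new userObj { id = 4, FirstName = "Jane", LastName = "Doe", BirthDate = new DateTime(2000, 2, 2) }
        };

        Console.WriteLine(string.Join(", ", IdsWithBuddy(list1, list2))); // prints: 0, 1
    }
}
```

The set is built once from list2, so each lookup from list1 is a hash probe followed by a full equality check on the three fields.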
var query = list1.Join (list2,
obj => new {FirstName=obj.FirstName,LastName=obj.LastName, BirthDate=obj.BirthDate},
innObj => new {FirstName=innObj.FirstName, LastName=innObj.LastName, BirthDate=innObj.BirthDate},
(obj, userObj) => (new {List1Id = obj.id, List2Id = userObj.id}));
foreach (var item in query)
{
Console.WriteLine(item.List1Id + " " + item.List2Id);
}

Combine two lists of entities with a condition

Say I have a class defined as
class Object
{
public int ID { get;set; }
public string Property { get; set; }
public override bool Equals(object obj)
{
Object Item = obj as Object;
return Item.ID == this.ID;
}
public override int GetHashCode()
{
int hash = 13;
hash = (hash * 7) + ID.GetHashCode();
return hash;
}
}
And two lists, defined like so:
List<Object> List1;
List<Object> List2;
These two lists contain objects whose ID fields could be the same, but whose Property fields may or may not be. I want a result containing all objects in List1 together with all objects in List2, with the condition that the Property field must be set to "1" if it is set to "1" in either of those lists. The result must contain distinct values (distinct IDs).
For example, if we have 2 lists like this:
List1
-----
ID = 0, Property = "1"
ID = 1, Property = ""
ID = 2, Property = "1"
ID = 3, Property = ""
List2
-----
ID = 1, Property = "1"
ID = 2, Property = ""
ID = 3, Property = ""
I need a result to look like this:
Result
-------
ID = 0, Property = "1"
ID = 1, Property = "1"
ID = 2, Property = "1"
ID = 3, Property = ""
Currently it works like this:
var Result = List1.Except(List2).Concat(List2.Except(List1));
var Intersection = List1.Intersect(List2).ToList();
Intersection.ForEach(x => {
x.Property = List1.Single(y => y.ID == x.ID).Property == "1" ? "1" : List2.Single(y => y.ID == x.ID).Property == "1" ? "1" : "";
});
Result = Result.Concat(Intersection);
...but ForEach is very slow. Can someone suggest a faster way?
var result = List1.Concat(List2)
.GroupBy(o => o.ID)
.Select(g => new Object() {
ID=g.Key,
Property=g.Any(o=>o.Property=="1")?"1":""
})
.ToList();
var result = List1.Concat(List2)
.OrderByDescending(o => o.Property)
.GroupBy(o => o.ID)
.Select(g => g.First())
.ToList();

Assign values from one list to another using LINQ

Hello, I have a little problem with assigning property values from one list's items to another's. I know I could solve it "the old way" by iterating through both lists etc., but I am looking for a more elegant solution using LINQ.
Let's start with the code ...
class SourceType
{
public int Id;
public string Name;
// other properties
}
class DestinationType
{
public int Id;
public string Name;
// other properties
}
List<SourceType> sourceList = new List<SourceType>();
sourceList.Add(new SourceType { Id = 1, Name = "1111" });
sourceList.Add(new SourceType { Id = 2, Name = "2222" });
sourceList.Add(new SourceType { Id = 3, Name = "3333" });
sourceList.Add(new SourceType { Id = 5, Name = "5555" });
List<DestinationType> destinationList = new List<DestinationType>();
destinationList.Add(new DestinationType { Id = 1, Name = null });
destinationList.Add(new DestinationType { Id = 2, Name = null });
destinationList.Add(new DestinationType { Id = 3, Name = null });
destinationList.Add(new DestinationType { Id = 4, Name = null });
I would like to achieve the following:
destinationList should be filled with Names of corresponding entries (by Id) in sourceList
destinationList should not contain entries that are not present in both lists at once (eg. Id: 4,5 should be eliminated) - something like inner join
I would like to avoid creating new destinationList with updated entries because both lists already exist and are very large,
so no "convert" or "select new".
In the end destinationList should contain:
1 "1111"
2 "2222"
3 "3333"
Is there some kind of elegant (one line Lambda? ;) solution to this using LINQ ?
Any help will be greatly appreciated! Thanks!
I would just build up a dictionary and use that:
Dictionary<int, string> map = sourceList.ToDictionary(x => x.Id, x => x.Name);
foreach (var item in destinationList)
if (map.ContainsKey(item.Id))
item.Name = map[item.Id];
destinationList.RemoveAll(x=> x.Name == null);
Hope this gives you the desired result. First join the two lists on the key (Id), then set the property value from sourceList.
var result = destinationList.Join(sourceList, d => d.Id, s => s.Id, (d, s) =>
{
d.Name = s.Name;
return d;
}).ToList();
Barring the last requirement of "avoid creating new destinationList" this should work
var newList = destinationList.Join(sourceList, d => d.Id, s => s.Id, (d, s) => s);
To take care of "avoid creating new destinationList", below can be used, which is not any different than looping thru whole list, except that it probably is less verbose.
destinationList.ForEach(d => {
var si = sourceList
.Where(s => s.Id == d.Id)
.FirstOrDefault();
d.Name = si != null ? si.Name : "";
});
destinationList.RemoveAll(d => string.IsNullOrEmpty(d.Name));
Frankly, this is the simplest:
var dictionary = sourceList.ToDictionary(x => x.Id, x => x.Name);
foreach(var item in destinationList) {
if(dictionary.ContainsKey(item.Id)) {
item.Name = dictionary[item.Id];
}
}
destinationList = destinationList.Where(x => x.Name != null).ToList();
You could do something ugly with Join but I wouldn't bother.
I hope this will be useful for you. At the end, destinationList has the correct data, without creating any new list of any kind. Entries without a match are removed first, because removing items from a list inside its own ForEach is not safe.
destinationList.RemoveAll(d => !sourceList.Exists(s => s.Id == d.Id));
destinationList.ForEach(x =>
{
    // Every remaining entry has a match in sourceList.
    SourceType newSource = sourceList.Find(s => s.Id == x.Id);
    x.Name = newSource.Name;
});
