How to merge two lists using LINQ? - c#

How to merge two lists using LINQ like the following:
class Person
{
public int ID { get; set;}
public string Name { get; set;}
public Person Merge( Person p)
{
return new Person { ID = this.ID, Name = this.Name + " " + p.Name };
}
}
I have two lists of Person objects:
list1:
1, A
2, B
list2:
2, C
3, D
I want the result like the following
result:
1, A
2, B C
3, D
Any help!

I would strongly recommend against using string-concatenation to represent this information; you will need to perform unnecessary string-manipulation if you want to get the original data back later from the merged list. Additionally, the merged version (as it stands) will become lossy if you ever decide to add additional properties to the class.
Preferably, get rid of the Merge method and use an appropriate data structure such as a multimap, which maps each key to one or more values. The Lookup<TKey, TElement> class can serve this purpose:
var personsById = list1.Concat(list2)
.ToLookup(person => person.ID);
Anyway, to answer the question as asked, you can concatenate the two sequences, then group persons by their ID and then aggregate each group into a single person with the provided Merge method:
var mergedList = list1.Concat(list2)
.GroupBy(person => person.ID)
.Select(group => group.Aggregate(
(merged, next) => merged.Merge(next)))
.ToList();
EDIT: Upon re-reading, just realized that a concatenation is required since there are two lists.
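For reference, here is a minimal self-contained sketch of the Concat/GroupBy/Aggregate approach applied to the sample data from the question. Person is copied from the question; the MergeDemo class and its MergeLists helper are names invented for this illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public int ID { get; set; }
    public string Name { get; set; }

    // Same Merge as in the question: keeps this ID and joins the two names.
    public Person Merge(Person p)
    {
        return new Person { ID = this.ID, Name = this.Name + " " + p.Name };
    }
}

public static class MergeDemo
{
    // Hypothetical helper: concatenate, group by ID, then fold each group with Merge.
    public static List<Person> MergeLists(IEnumerable<Person> list1, IEnumerable<Person> list2)
    {
        return list1.Concat(list2)
                    .GroupBy(person => person.ID)
                    .Select(group => group.Aggregate((merged, next) => merged.Merge(next)))
                    .ToList();
    }

    public static void Main()
    {
        var list1 = new List<Person>
        {
            new Person { ID = 1, Name = "A" },
            new Person { ID = 2, Name = "B" }
        };
        var list2 = new List<Person>
        {
            new Person { ID = 2, Name = "C" },
            new Person { ID = 3, Name = "D" }
        };

        // Prints "1, A", "2, B C" and "3, D", one per line.
        foreach (var p in MergeLists(list1, list2))
            Console.WriteLine(p.ID + ", " + p.Name);
    }
}
```

A group with a single element is returned unchanged by Aggregate, so persons that appear in only one list pass through without Merge being called.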

Related

Filter a list of address objects by a list of string postcodes [duplicate]

I have a list of parameters like this:
public class parameter
{
public string name {get; set;}
public string paramtype {get; set;}
public string source {get; set;}
}
IEnumerable<Parameter> parameters;
And an array of strings I want to check it against:
string[] myStrings = new string[] { "one", "two"};
I want to iterate over the parameter list and check whether the source property is equal to any of the strings in the myStrings array. I can do this with nested foreach loops, but I would like to learn a nicer way: I have been playing around with LINQ and like the extension methods on Enumerable such as Where, so nested foreach loops just feel wrong. Is there a more elegant, preferred LINQ/lambda/delegate way to do this?
Thanks
You could use a nested Any() for this check which is available on any Enumerable:
bool hasMatch = myStrings.Any(x => parameters.Any(y => y.source == x));
On larger collections it is faster to project parameters to their source values and then use Intersect, which internally uses a hash set. Instead of O(n^2) for the first approach (the equivalent of two nested loops), the check then runs in O(n):
bool hasMatch = parameters.Select(x => x.source)
.Intersect(myStrings)
.Any();
Also as a side comment you should capitalize your class names and property names to conform with the C# style guidelines.
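If the matching parameters themselves are needed rather than just a yes/no answer, the same hashing idea can be written out explicitly. The following is only a sketch: the Parameter class uses the capitalized names suggested above, and SourceFilter.WhereSourceIn is a hypothetical helper, not part of LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Parameter
{
    public string Name { get; set; }
    public string ParamType { get; set; }
    public string Source { get; set; }
}

public static class SourceFilter
{
    // Hypothetical helper: returns the parameters whose Source appears in allowedSources.
    public static List<Parameter> WhereSourceIn(
        IEnumerable<Parameter> parameters, IEnumerable<string> allowedSources)
    {
        // Build the set once so that each lookup is O(1) on average.
        var allowed = new HashSet<string>(allowedSources);
        return parameters.Where(p => allowed.Contains(p.Source)).ToList();
    }

    public static void Main()
    {
        var parameters = new List<Parameter>
        {
            new Parameter { Name = "p1", Source = "one" },
            new Parameter { Name = "p2", Source = "three" },
            new Parameter { Name = "p3", Source = "two" }
        };
        string[] myStrings = { "one", "two" };

        // Prints "p1" and "p3", one per line.
        foreach (var p in WhereSourceIn(parameters, myStrings))
            Console.WriteLine(p.Name);
    }
}
```

HashSet<string> compares strings ordinally by default; pass a StringComparer to its constructor if case-insensitive matching is wanted.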
Here is a sample that checks whether one list has any elements that match elements in another list:
List<int> nums1 = new List<int> { 2, 4, 6, 8, 10 };
List<int> nums2 = new List<int> { 1, 3, 6, 9, 12};
if (nums1.Any(x => nums2.Any(y => y == x)))
{
Console.WriteLine("There are equal elements");
}
else
{
Console.WriteLine("No Match Found!");
}
If both lists are large, the nested Any() lambda approach takes a long time. In that case it is better to use a LINQ join to fetch the matching parameters:
var items = (from x in parameters
join y in myStrings on x.Source equals y
select x)
.ToList();
list1.Select(l1 => l1.Id).Intersect(list2.Select(l2 => l2.Id)).ToList();
var list1 = await _service1.GetAll();
var list2 = await _service2.GetAll();
// Create a list of Ids from list1
var list1_Ids = list1.Select(l => l.Id).ToList();
// Filter list2 according to the list1 Ids (a new name is needed because list2 cannot be declared twice)
var filteredList2 = list2.Where(l => list1_Ids.Contains(l.Id)).ToList();

How to find duplicate fields in 2 lists of different types and remove them from both lists, in C#

This is not a duplicate of: Given 2 C# Lists how to merge them and get only the non duplicated elements from both lists since he's looking at lists of the same type.
I have this scenario:
class A
{
string id;
.... some other stuff
}
class B
{
string id;
.... some other stuff
}
I would like to remove, both from A and B, elements that share an id field between the two lists.
I can do it in 3 steps: find the common ids, and then delete the records from both lists, but I'm wondering if there is something more elegant.
Edit: expected output
var A = [ 1, 3, 5, 7, 9 ]
var B = [ 1, 2, 3, 4, 5 ]
output:
A = [ 7, 9 ]
B = [ 2, 4 ]
but this is showing only the id field; as stated above, the lists are of different types, they just share ids.
You will require three steps, but you can use Linq to simplify the code.
Given two classes which have a property of the same (equatable) type, named "ID":
class Test1
{
public string ID { get; set; }
}
class Test2
{
public string ID { get; set; }
}
Then you can find the duplicates and remove them from both lists like so:
var dups =
(from item1 in list1
join item2 in list2 on item1.ID equals item2.ID
select item1.ID)
.ToArray();
list1.RemoveAll(item => dups.Contains(item.ID));
list2.RemoveAll(item => dups.Contains(item.ID));
But that is still three steps.
See .Net Fiddle example for a runnable example.
You can use LINQ Lambda expression for elegance:
var intersectValues = list2.Select(r => r.Id).Intersect(list1.Select(r => r.Id)).ToList();
list1.RemoveAll(r => intersectValues.Contains(r.Id));
list2.RemoveAll(r => intersectValues.Contains(r.Id));
Building on @Matthew Watson's answer you can move all of it to a single LINQ expression with
(from item1 in list1
join item2 in list2 on item1.ID equals item2.ID
select item1.ID)
.ToList()
.ForEach(d =>
{
list1.RemoveAll(i1 => d == i1.ID);
list2.RemoveAll(i2 => d == i2.ID);
}
);
I don't know where you land on the performance scale. The compiler might actually split this up into the three steps you already mentioned.
You also lose some readability as the from ... select result does not have a 'speaking' name like duplicates, to directly tell you what you will be working with in the ForEach.
Complete code example at https://gist.github.com/msdeibel/d2f8a97b754cca85fe4bcac130851597
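O(n)
// listA and listB stand for the two input lists (List<A> and List<B>).
// Collect the IDs of each list into a hash set for O(1) lookups.
var aIds = new HashSet<string>(listA.Select(x => x.ID));
var bIds = new HashSet<string>(listB.Select(x => x.ID));
var result1 = new List<A>(listA.Count);
var result2 = new List<B>(listB.Count);
// Keep only the items of A whose ID does not occur in B.
foreach (A item in listA)
{
    if (!bIds.Contains(item.ID))
        result1.Add(item);
}
// Keep only the items of B whose ID does not occur in A.
foreach (B item in listB)
{
    if (!aIds.Contains(item.ID))
        result2.Add(item);
}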

How to group a list with Linq

I have a list which I get from a database. The structure looks like (which I'm representing with JSON as it's easier for me to visualise)
{id:1
value:"a"
},
{id:1
value:"b"
},
{id:1
value:"c"
},
{id:2
value:"t"
}
As you can see, I have 2 unique IDs, 1 and 2. I want to group by the ID. The end result I'd like is
{id:1,
values:["a","b","c"],
},
{id:2,
values:["t"]
}
Is this possible with Linq? At the moment, I have a massive, complex foreach, which first sorts the list (by ID) and then detects if an ID has already been added etc., but this monstrous loop made me realise I'm doing it wrong and honestly, it's too embarrassing to share.
You can group by the item Id and have the resulting type be a Dictionary<int, List<string>>
var result = myList.GroupBy(item => item.Id)
.ToDictionary(item => item.Key,
item => item.Select(i => i.Value).ToList());
You can either use the GroupBy method on IEnumerable to create IGrouping objects that contain a key and the grouped objects, or you can use ToLookup to create exactly what you want in the result:
yourList.ToLookup(m => m.id, m => m.value);
This creates a hashed collection of keys with their values.
For more information please see below post:
https://www.c-sharpcorner.com/UploadFile/d3e4b1/practical-usage-of-using-tolookup-method-in-linq-C-Sharp/
Just a little more detail to emphasize the difference between the ToLookup approach and the GroupBy approach:
// class definition
public class Item
{
public long Id { get; set; }
public string Value { get; set; }
}
// create your list
var items = new List<Item>
{
new Item{Id = 0, Value = "value0a"},
new Item{Id = 0, Value = "value0b"},
new Item{Id = 1, Value = "value1"}
};
// this approach results in a List<string> (a collection of the values)
var lookup = items.ToLookup(i => i.Id, i => i.Value);
var groupOfValues = lookup[0].ToList();
// this approach results in a List<Item> (a collection of the objects)
var itemsGroupedById = items.GroupBy(i => i.Id).ToList();
var groupOfItems = itemsGroupedById[0].ToList();
So, if you want to work with values only after grouping, then you could take the first approach; if you want to work with objects after grouping, you could take the second approach. These are just a couple of example implementations; there are plenty of ways to accomplish your goal.
First convert to a Lookup then select into a list, like so:
var groups = list
.ToLookup
(
item => item.ID,
item => item.Value
)
.Select
(
item => new
{
ID = item.Key,
Values = item.ToList()
}
)
.ToList();
The resulting JSON looks like this:
[{"ID":1,"Values":["a","b","c"]},{"ID":2,"Values":["t"]}]
Link to working example on DotNetFiddle.

C# LINQ deferred execution challenge - help needed in creating 3 different lists

I am trying to create 3 different lists (1,2,3) from 2 existing lists (A,B).
The 3 lists need to identify the following relationships.
List 1 - the items that are in list A and not in list B
List 2 - the items that are in list B and not in list A
List 3 - the items that are in both lists.
I then want to join all the lists together into one list.
My problem is that I want to identify the differences by adding an enum identifying the relationship to the items of each list. But by adding the enum, the Except LINQ function (obviously) no longer recognises that the items are the same. Because the LINQ queries are deferred, I cannot resolve this by changing the order of my statements, i.e. identifying the lists first and then adding the enums.
This is the code that I have got to (Doesn't work properly)
There might be a better approach.
List<ManufactorListItem> manufactorItemList =
manufactorRepository.GetManufactorList();
// Get the Manufactors from the Families repository
List<ManufactorListItem> familyManufactorList =
this.familyRepository.GetManufactorList(familyGuid);
// Identify Manufactors that are only found in the Manufactor Repository
List<ManufactorListItem> inManufactorsOnly =
manufactorItemList.Except(familyManufactorList).ToList();
// Mark them as (Parent Only)
foreach (ManufactorListItem manOnly in inManufactorsOnly) {
manOnly.InheritanceState = EnumInheritanceState.InParent;
}
// Identify Manufactors that are only found in the Family Repository
List<ManufactorListItem> inFamiliesOnly =
familyManufactorList.Except(manufactorItemList).ToList();
// Mark them as (Child Only)
foreach (ManufactorListItem famOnly in inFamiliesOnly) {
famOnly.InheritanceState = EnumInheritanceState.InChild;
}
// Identify Manufactors that are found in both Repositories
List<ManufactorListItem> sameList =
manufactorItemList.Intersect(familyManufactorList).ToList();
// Mark them Accordingly
foreach (ManufactorListItem same in sameList) {
same.InheritanceState = EnumInheritanceState.InBoth;
}
// Create an output List
List<ManufactorListItem> manufactors = new List<ManufactorListItem>();
// Join all of the lists together.
manufactors = sameList.Union(inManufactorsOnly).
Union(inFamiliesOnly).ToList();
Any ideas how to get around this?
Thanks in advance
You can make it much simpler:
List<ManufactorListItem> manufactorItemList = ...;
List<ManufactorListItem> familyManufactorList = ...;
var allItems = manufactorItemList.ToDictionary(i => i, i => InheritanceState.InParent);
foreach (var familyManufactor in familyManufactorList)
{
allItems[familyManufactor] = allItems.ContainsKey(familyManufactor) ?
InheritanceState.InBoth :
InheritanceState.InChild;
}
//that's all, now we can get any subset items:
var inFamiliesOnly = allItems.Where(p => p.Value == InheritanceState.InChild).Select(p => p.Key);
var inManufactorsOnly = allItems.Where(p => p.Value == InheritanceState.InParent).Select(p => p.Key);
var allManufactors = allItems.Keys;
This seems like the simplest way to me:
(I'm using the following Enum for simplicity:
public enum ContainedIn
{
AOnly,
BOnly,
Both
}
)
var la = new List<int> {1, 2, 3};
var lb = new List<int> {2, 3, 4};
var l1 = la.Except(lb)
.Select(i => new Tuple<int, ContainedIn>(i, ContainedIn.AOnly));
var l2 = lb.Except(la)
.Select(i => new Tuple<int, ContainedIn>(i, ContainedIn.BOnly));
var l3 = la.Intersect(lb)
.Select(i => new Tuple<int, ContainedIn>(i, ContainedIn.Both));
var combined = l1.Union(l2).Union(l3);
This works so long as you have access to the Tuple<T1, T2> class, which was added in .NET Framework 4.
If the problem is with the Except() statement, then I suggest you use the 3-parameter overload of Except in order to provide a custom IEqualityComparer<ManufactorListItem> comparer which tests the appropriate ManufactorListItem fields, but not the InheritanceState.
e.g. your equality comparer might look like:
public class ManufactorComparer : IEqualityComparer<ManufactorListItem> {
public bool Equals(ManufactorListItem x, ManufactorListItem y) {
// you need to write a method here that tests all the fields except InheritanceState
}
public int GetHashCode(ManufactorListItem obj) {
// you need to write a simple hash code generator here using any/all the fields except InheritanceState
}
}
and then you would call this using code a bit like
// Identify Manufactors that are only found in the Manufactor Repository
List<ManufactorListItem> inManufactorsOnly =
manufactorItemList.Except(familyManufactorList, new ManufactorComparer()).ToList();
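To make the comparer idea concrete, here is a sketch with the two methods filled in. The real ManufactorListItem is not shown in the question, so the Id and Name properties below are assumptions made purely for illustration; substitute whichever fields identify an item in your model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum EnumInheritanceState { InParent, InChild, InBoth }

// Hypothetical shape: Id and Name are assumed, the question does not show this class.
public class ManufactorListItem
{
    public int Id { get; set; }
    public string Name { get; set; }
    public EnumInheritanceState InheritanceState { get; set; }
}

public class ManufactorComparer : IEqualityComparer<ManufactorListItem>
{
    public bool Equals(ManufactorListItem x, ManufactorListItem y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        // Compare every identifying field, but deliberately not InheritanceState.
        return x.Id == y.Id && x.Name == y.Name;
    }

    public int GetHashCode(ManufactorListItem obj)
    {
        // Hashing Id alone is valid: items that are equal always share an Id.
        return obj.Id.GetHashCode();
    }
}

public static class ComparerDemo
{
    public static void Main()
    {
        var parent = new List<ManufactorListItem>
        {
            new ManufactorListItem { Id = 1, Name = "Acme" },
            new ManufactorListItem { Id = 2, Name = "Bolt" }
        };
        var child = new List<ManufactorListItem>
        {
            new ManufactorListItem { Id = 2, Name = "Bolt", InheritanceState = EnumInheritanceState.InChild },
            new ManufactorListItem { Id = 3, Name = "Cog" }
        };
        var comparer = new ManufactorComparer();

        var parentOnly = parent.Except(child, comparer).ToList();
        var childOnly = child.Except(parent, comparer).ToList();
        var both = parent.Intersect(child, comparer).ToList();

        Console.WriteLine("Parent only: " + string.Join(",", parentOnly.Select(m => m.Id))); // Parent only: 1
        Console.WriteLine("Child only: " + string.Join(",", childOnly.Select(m => m.Id)));   // Child only: 3
        Console.WriteLine("Both: " + string.Join(",", both.Select(m => m.Id)));              // Both: 2
    }
}
```

Because the comparer ignores InheritanceState, the item with Id 2 is treated as the same in both lists even though the two copies carry different states.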

Overlay/Join two collections with Linq

I have the following scenario:
List 1 has 20 items of type TItem, List 2 has 5 items of the same type. List 1 already contains the items from List 2 but in a different state. I want to overwrite the 5 items in List 1 with the items from List 2.
I thought a join might work, but I want to overwrite the items in List 1, not join them together and have duplicates.
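There is a unique key that can be used to find which items to overwrite in List 1; the key is of type int.
You could use the built-in LINQ .Except(), but it requires an IEqualityComparer, so use a fluent version of .Except() that takes a key selector instead. (Neither an Except overload with a key selector nor ForEach on IEnumerable<T> is part of the standard library, so the code below assumes such extension methods are available.)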
Assuming an object with an integer key as you indicated:
public class Item
{
public int Key { get; set; }
public int Value { get; set; }
public override string ToString()
{
return String.Format("{{{0}:{1}}}", Key, Value);
}
}
The original list of objects can be merged with the changed one as follows:
IEnumerable<Item> original = new[] { 1, 2, 3, 4, 5 }.Select(x => new Item
{
Key = x,
Value = x
});
IEnumerable<Item> changed = new[] { 2, 3, 5 }.Select(x => new Item
{
Key = x,
Value = x * x
});
IEnumerable<Item> result = original.Except(changed, x => x.Key).Concat(changed);
result.ForEach(Console.WriteLine);
output:
{1:1}
{4:4}
{2:4}
{3:9}
{5:25}
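The key-selector version of Except used above is not spelled out in the answer. One possible way to write it is sketched below under the name ExceptByKey, which is a hypothetical name chosen here to avoid any clash with the built-in overloads; a plain foreach stands in for the ForEach extension.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public int Key { get; set; }
    public int Value { get; set; }

    public override string ToString()
    {
        return String.Format("{{{0}:{1}}}", Key, Value);
    }
}

public static class KeyedSetExtensions
{
    // Hypothetical extension: yields the items of first whose key does not occur in second.
    // Unlike Enumerable.Except, it does not remove duplicates within first.
    public static IEnumerable<T> ExceptByKey<T, TKey>(
        this IEnumerable<T> first, IEnumerable<T> second, Func<T, TKey> keySelector)
    {
        var excluded = new HashSet<TKey>(second.Select(keySelector));
        return first.Where(item => !excluded.Contains(keySelector(item)));
    }
}

public static class OverlayByKeyDemo
{
    public static void Main()
    {
        // ToList() materializes the items so that each Item is created only once.
        List<Item> original = new[] { 1, 2, 3, 4, 5 }
            .Select(x => new Item { Key = x, Value = x }).ToList();
        List<Item> changed = new[] { 2, 3, 5 }
            .Select(x => new Item { Key = x, Value = x * x }).ToList();

        // Unchanged originals first, then the changed items.
        IEnumerable<Item> result = original.ExceptByKey(changed, x => x.Key).Concat(changed);

        // Prints {1:1}, {4:4}, {2:4}, {3:9}, {5:25}, one per line.
        foreach (Item item in result)
            Console.WriteLine(item);
    }
}
```

Note that the original ordering is not preserved: the replaced items move to the end of the sequence, exactly as in the output shown above.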
LINQ isn't used to perform actual modifications to the underlying data sources; it's strictly a query language. You could, of course, do an outer join on List2 from List1 and select List2's entity if it's not null and List1's entity if it is, but that is going to give you an IEnumerable<> of the results; it won't actually modify the collection. You could do a ToList() on the result and assign it to List1, but that would change the reference; I don't know if that would affect the rest of your application.
Taking your question literally, in that you want to REPLACE the items in List1 with those from List2 if they exist, then you'll have to do that manually in a for loop over List1, checking for the existence of a corresponding entry in List2 and replacing the List1 entry by index with that from List2.
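The manual replacement described here could look roughly like the following sketch. Item with an int Key is an assumed shape based on the question's description, and OverlayDemo.Overlay is a hypothetical helper name; a dictionary is used for the existence check so the loop stays linear.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public int Key { get; set; }
    public int Value { get; set; }
}

public static class OverlayDemo
{
    // Hypothetical helper: overwrites entries of target, in place and by index,
    // with the entry from updates that has the same Key.
    public static void Overlay(List<Item> target, IEnumerable<Item> updates)
    {
        // ToDictionary throws if updates contains duplicate keys; the question states keys are unique.
        Dictionary<int, Item> byKey = updates.ToDictionary(u => u.Key);

        for (int i = 0; i < target.Count; i++)
        {
            Item replacement;
            if (byKey.TryGetValue(target[i].Key, out replacement))
                target[i] = replacement;
        }
    }

    public static void Main()
    {
        var list1 = new List<Item>
        {
            new Item { Key = 1, Value = 10 },
            new Item { Key = 2, Value = 20 },
            new Item { Key = 3, Value = 30 }
        };
        var list2 = new List<Item> { new Item { Key = 2, Value = 99 } };

        Overlay(list1, list2);

        // Prints "1=10", "2=99" and "3=30", one per line.
        foreach (Item item in list1)
            Console.WriteLine(item.Key + "=" + item.Value);
    }
}
```

Unlike the query-based alternatives, this keeps the same List<T> instance and the original ordering, so other code holding a reference to the list sees the updated items.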
As Adam says, LINQ is about querying. However, you can create a new collection in the right way using Enumerable.Union. You'd need to create an appropriate IEqualityComparer though - it would be nice to have UnionBy. (Another one for MoreLINQ perhaps?)
Basically:
var list3 = list2.Union(list1, keyComparer);
Where keyComparer would be an implementation to compare the two keys. MiscUtil contains a ProjectionEqualityComparer which would make this slightly easier.
Alternatively, you could use DistinctBy from MoreLINQ after concatenation:
var list3 = list2.Concat(list1).DistinctBy(item => item.Key);
Here's a solution with GroupJoin.
List<string> source = new List<string>() { "1", "22", "333" };
List<string> modifications = new List<string>() { "4", "555"};
//alternate implementation
//List<string> result = source.GroupJoin(
// modifications,
// s => s.Length,
// m => m.Length,
// (s, g) => g.Any() ? g.First() : s
//).ToList();
List<string> result =
(
from s in source
join m in modifications
on s.Length equals m.Length into g
select g.Any() ? g.First() : s
).ToList();
foreach (string s in result)
Console.WriteLine(s);
Hmm, how about a re-usable extension method while I'm at it:
public static IEnumerable<T> UnionBy<T, U>
(
this IEnumerable<T> source,
IEnumerable<T> otherSource,
Func<T, U> selector
)
{
return source.GroupJoin(
otherSource,
selector,
selector,
(s, g) => g.Any() ? g.First() : s
);
}
Which is called by:
List<string> result = source
.UnionBy(modifications, s => s.Length)
.ToList();
