Comparing two lists of different objects - C#

I have the following lists.
One list holds Person objects with Id and Name properties. The other list holds People objects with Id, Name and Address properties.
List<Person> p1 = new List<Person>();
p1.Add(new Person() { Id = 1, Name = "a" });
p1.Add(new Person() { Id = 2, Name = "b" });
p1.Add(new Person() { Id = 3, Name = "c" });
p1.Add(new Person() { Id = 4, Name = "d" });
List<People> p2 = new List<People>();
p2.Add(new People() { Id = 1, Name = "a", Address=100 });
p2.Add(new People() { Id = 3, Name = "x", Address=101 });
p2.Add(new People() { Id = 4, Name = "y", Address=102 });
p2.Add(new People() { Id = 8, Name = "z", Address=103 });
I want to filter the list, so I used the code below. But it returns a list of Ids. I want a list of People objects whose Ids match.
var filteredList = p2.Select(y => y.Id).Intersect(p1.Select(z => z.Id));

You're better off with Join
var filteredList = p2.Join(p1,
people => people.Id,
person => person.Id,
(people, _) => people)
.ToList();
The method will match items from both lists by the key you provide - Id of the People class and Id of Person class.
For each pair where people.Id == person.Id it applies the selector function (people, _) => people. The function says for each pair of matched people and person just give me the people instance; I don't care about person.
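Here is a minimal, self-contained sketch of the same Join, in case you want to try it outside your project. The class shapes are taken from the question; the JoinDemo class and the MatchById name are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class People
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int Address { get; set; }
}

// Hypothetical helper, not part of any library.
public static class JoinDemo
{
    // Returns the People items whose Id also appears in the Person sequence.
    public static List<People> MatchById(IEnumerable<People> people, IEnumerable<Person> persons)
    {
        return people.Join(persons,
                           p => p.Id,      // key taken from each People item
                           q => q.Id,      // key taken from each Person item
                           (p, _) => p)    // keep only the People side of each matched pair
                     .ToList();
    }
}
```

With the lists from the question, JoinDemo.MatchById(p2, p1) should return the People items with Ids 1, 3 and 4, in the order they appear in p2. One thing to keep in mind: Join produces one result per matching pair, so if p1 contained the same Id twice, the matching People item would appear twice as well.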

Something like this should do the trick :
var result= p1.Join(p2, person => person.Id, people => people.Id, (person, people) => people);

If your lists are large, you should use a hashed collection to do the filtering and improve performance:
var hashedIds = new HashSet<int>(p1.Select(p => p.Id));
var filteredList = p2.Where(p => hashedIds.Contains(p.Id)).ToList();
This works, and works fast, because hashed collections like Dictionary or HashSet perform lookups with almost O(1) complexity: the hash of an element tells the collection at run time which bucket to look in. With List<T>, finding a certain element means looping over the collection until it is found.
For example, the line: p2.Where(p => p1.Any(x => x.Id == p.Id)).ToList();
has O(N*M) complexity (roughly O(N^2) for lists of similar size), because using both .Where and .Any forms nested loops.
Do not use the simplest answer (and method), use the one that better suits your needs.
Simple performance test against .Join() ...
And the larger the collection is, the more difference it makes.
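If you need this kind of filtering in several places, the HashSet approach can be wrapped in a small generic helper. This is only a sketch; HashFilter and WhereKeyIn are names I made up, nothing from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of any library.
public static class HashFilter
{
    // Keeps the items of source whose key appears in allowedKeys.
    // Building the set is O(M); each Contains check is close to O(1),
    // so the whole filter is roughly O(N + M) instead of O(N * M).
    public static List<T> WhereKeyIn<T, TKey>(
        IEnumerable<T> source,
        Func<T, TKey> keySelector,
        IEnumerable<TKey> allowedKeys)
    {
        var keys = new HashSet<TKey>(allowedKeys);
        return source.Where(item => keys.Contains(keySelector(item))).ToList();
    }
}
```

For the question's data the call would look like HashFilter.WhereKeyIn(p2, p => p.Id, p1.Select(p => p.Id)). As far as I know, Enumerable.Join in LINQ to Objects also builds a hash-based lookup of the inner sequence internally, so it is worth measuring both on your own data before deciding.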

Related

How do I sort a List<Type> by List<int>?

In my C# MVC project I have a list of items that I want to sort in the order given by another list
var FruitTypes = new List<Fruit> {
new Fruit { Id = 1, Name = "Banana"},
new Fruit { Id = 2, Name = "Apple" },
new Fruit { Id = 3, Name = "Orange" },
new Fruit { Id = 4, Name = "Plum"},
new Fruit { Id = 5, Name = "Pear" },
};
var SortValues = new List<int> {5,4,3,1,2};
Currently my list is shown in the default order of the fruit types.
How can I sort the Fruit list by SortValues?
It's unclear if you are sorting by the indexes in SortValues or whether SortValues contains corresponding Id values that should be joined.
In the first case:
First you have to Zip your two lists together, then you can sort the composite type that Zip generates, then select the Fruit back out.
IEnumerable<Fruit> sortedFruitTypes = FruitTypes
.Zip(SortValues, (ft, idx) => new {ft, idx})
.OrderBy(x => x.idx)
.Select(x => x.ft);
However, this is simply sorting the first list by the ordering indicated in SortValues, not joining the ids.
In the second case, a simple join will suffice:
IEnumerable<Fruit> sortedFruitTypes = SortValues
.Join(FruitTypes, sv => sv, ft => ft.Id, (_, ft) => ft);
This works because Enumerable.Join maintains the order of the "left" hand side of the join.
While there is almost certainly a more LINQ-y way, if you tend towards verbosity, you could accomplish this with an iterator function. For example:
public IEnumerable<Fruit> SortFruits(IEnumerable<Fruit> unordered, IEnumerable<int> sortValues)
{
foreach (var value in sortValues)
yield return unordered.Single(f => f.Id == value);
}
I like that it's explicit about what it's doing. You may consider throwing an exception when the number of items in each list is different, or maybe you just don't return an item if there is no sort value for it. You'll have to decide what the behaviour should be for "missing" values in either collection. I think that having to handle these scenarios is a good reason to put it all in a single method this way, instead of a longer LINQ query.
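To make the "missing values" decision concrete, here is one possible variant of that iterator. It is a sketch under my own assumptions: Ids are unique, sort values without a matching fruit are skipped silently, and fruits whose Id is not listed are left out. FruitSorter and SortBy are illustrative names. It also looks fruits up in a dictionary, so each sort value costs one hash lookup rather than a scan of the whole list.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Fruit
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical helper, not part of any library.
public static class FruitSorter
{
    // Yields fruits in the order their Ids appear in sortValues.
    // Assumptions: Ids are unique (ToDictionary throws otherwise),
    // unknown sort values are skipped, unlisted fruits are dropped.
    public static IEnumerable<Fruit> SortBy(IEnumerable<Fruit> unordered, IEnumerable<int> sortValues)
    {
        var byId = unordered.ToDictionary(f => f.Id);
        foreach (var id in sortValues)
        {
            if (byId.TryGetValue(id, out var fruit))
                yield return fruit;
        }
    }
}
```

If you would rather fail loudly on a bad sort value, replacing the TryGetValue check with a plain byId[id] lookup makes it throw a KeyNotFoundException instead.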
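Time complexity: O(n) iterations over SortValues, plus the cost of the LINQ lookup inside each iteration (FirstOrDefault is a linear scan of FruitTypes).
Declare a list of fruits to store the result.
Iterate through each sort value.
Use LINQ FirstOrDefault to get the fruit whose Id equals that sort value.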
List<int> SortValues = new List<int> { 5, 4, 3, 1, 2 };
List<Fruit> result = new List<Fruit>();
foreach (var element in SortValues)
{
Fruit f = FruitTypes.FirstOrDefault(fruitElement => fruitElement.Id == element);
result.Add(f);
}
Implementation: DotNetFiddler

How to Avoid Recreating Object When Using Let with LINQ

Here is my data:
private List<Department> Data
{
get
{
return new List<Department>
{
new Department{
Id = 1,
Name = "Tech",
Employees = new List<Employee>{
new Employee{Name = "x", Id = 1 },
new Employee{ Name = "y", Id = 2}
}
},
new Department{
Id = 2,
Name = "Sales",
Employees = new List<Employee>{
new Employee{Name = "a", Id = 3},
new Employee {Name = "b", Id = 4}
}
}
};
}
}
and here I am getting a list of all employees with their appropriate departments:
List<Employee> employees = (from department in Departments
let d = department
from e in d.Employees
select new Employee{
Id = e.Id,
Name = e.Name,
Department = d
}).ToList();
What is bothering me is that I have to recreate my Employee object in order to attach the appropriate department to it. Is there a way that I could write my LINQ statement where I don't have to recreate the Employee?
There might be a better way to phrase this question, so feel free to let me know if there is.
Edit
The reason I'm going down this path is that I'm storing my data by serializing my department:
[
{
"Id":1,
"Name":"Sales",
"Employees":[{"Id":2,"Name":"x"},{"Id":1,"Name":"y"}]
},
{
"Id":2,
"Name":"Tech",
"Employees":[{"Id":3,"Name":"d"},{"Id":4,"Name":"f"}]
}
]
It looks like you want to use LINQ to update an instance. This is not the intended use. Use LINQ to query the instances you want to have, and then loop over the results to update. (non-nested) Loops are not evil.
var query =
from d in Departments
from e in d.Employees
select new { Employee = e, Department = d };
foreach(var x in query)
{
x.Employee.Department = x.Department;
}
You should not have this problem in the first place - You should fully construct your Employee instances when you initially create them, not sometime later - if an employee needs a department to be used, you should add a constructor that allows/enforces providing it:
public Employee(int id, string name, Department department)
{
...
}
You could, if you really, really want, use a let-clause for a side-effect, since assignment expressions return a value:
List<Employee> employees = (from department in Departments
from e in department.Employees
let _ = e.Department = department
select e).ToList();
Also I fully agree with BrokenGlass...
Using let is redundant and not useful in your example query.
Besides, LINQ is not the right tool here. You want to affect the state of the objects you're querying (i.e. creating side-effects), which is generally not recommended.
By direct comparison, this is a better alternative to what you're trying to do:
foreach(var department in Departments)
foreach(var employee in department.Employees)
employee.Department = department;
If you can, however, you should do the department assignment at the time you add the employees to the department, either in an AddEmployee method in the Department class, or maybe in an Employee.Department property setter.
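A rough sketch of the AddEmployee idea follows. The property names come from the question; the private list and the read-only Employees view are my own assumptions about how you might want to lock it down.

```csharp
using System.Collections.Generic;

public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
    public Department Department { get; set; }
}

public class Department
{
    // Assumption: employees are only ever added through AddEmployee.
    private readonly List<Employee> _employees = new List<Employee>();

    public int Id { get; set; }
    public string Name { get; set; }
    public IReadOnlyList<Employee> Employees => _employees;

    // Sets the back-reference at the moment the employee joins the department,
    // so no later fix-up pass over the data is needed.
    public void AddEmployee(Employee employee)
    {
        employee.Department = this;
        _employees.Add(employee);
    }
}
```

Since you serialize departments, note that the Employee.Department back-reference creates a cycle in the object graph. Depending on the serializer you may need to tell it to ignore that property, and objects created by deserialization will probably not go through AddEmployee, so the foreach fix-up above may still be needed after loading.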

Grouping by property value and writing group members

I need to group the following list by the department value but am having trouble with the LINQ syntax. Here's my list of objects:
var people = new List<Person>
{
new Person { name = "John", department = new List<fields> {new fields { name = "department", value = "IT"}}},
new Person { name = "Sally", department = new List<fields> {new fields { name = "department", value = "IT"}}},
new Person { name = "Bob", department = new List<fields> {new fields { name = "department", value = "Finance"}}},
new Person { name = "Wanda", department = new List<fields> {new fields { name = "department", value = "Finance"}}},
};
I've toyed around with grouping. This is as far as I've got:
var query = from p in people
from field in p.department
where field.name == "department"
group p by field.value into departments
select new
{
Department = departments.Key,
Name = departments
};
So I can iterate over the groups, but I'm not sure how to list the Person names:
foreach (var department in query)
{
Console.WriteLine("Department: {0}", department.Department);
foreach (var foo in department.Department)
{
// ??
}
}
Any ideas on what to do better or how to list the names of the relevant departments?
Ah, should have been:
foreach (Person p in department.Name) Console.WriteLine(p.name);
Thanks for the extra set of eyes, Fyodor!
Your department property seems like an awkward implementation, particularly if you want to group by department. Grouping with a List as your key is going to lead to a ton of complexity, and it's unnecessary since you only care about one element in the List.
Also, you seem to have created the fields class as a way of simulating either dynamic/anonymous types, or just the Dictionary<string, string> class, I can't really tell. I suggest not doing that; C# already has those types baked in, and working around them will just be inefficient and stop you from using Intellisense. Whatever led you to do that, there's probably a better, more C#-ish way. Besides--and this is key--your code looks like you can just forget all that and make department a simple string.
If you have control over the data structure, I'd suggest reorganizing it:
var people = new List<Person> {
new Person { name = "John", department = "IT"},
new Person { name = "Sally", department = "IT"},
new Person { name = "Bob", department = "Finance"},
new Person { name = "Wanda", department = "Finance"},
};
Suddenly, grouping all that becomes simple:
var departments = from p in people
group p by p.department into dept
select dept;
foreach (var dept in departments)
{
Console.WriteLine("Department: {0}", dept.Key);
foreach (var person in dept)
{
Console.WriteLine("Person: {0}", person.name);
}
}
If you must leave the data structure as it is, you could try this:
var departments = from p in people
from field in p.department
where field.name == "department"
group p by field.value into dept
select dept;
That should work with the above nested loop.
The list of persons for each department can be accessed via department.Name. Simply iterate over it:
foreach( var person in department.Name ) Console.WriteLine( person.name );
The value of department.Department, on the other hand, is of type string. This value comes from departments.Key, which in turn comes from field.value - because that's the key that you group by.
The foreach statement over department.Department still compiles fine, because string implements IEnumerable<char>. Consequently, your foo variable is of type char.
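If you keep the nested data structure, one way to avoid mixing up the key and the group members is to project straight to the person names while grouping. This is a sketch in method syntax, using the class shapes from the question; DepartmentGrouping and NamesByDepartment are names I invented for it.

```csharp
using System.Collections.Generic;
using System.Linq;

public class fields
{
    public string name { get; set; }
    public string value { get; set; }
}

public class Person
{
    public string name { get; set; }
    public List<fields> department { get; set; }
}

// Hypothetical helper, not part of any library.
public static class DepartmentGrouping
{
    // Maps each department value to the names of the people in it.
    public static Dictionary<string, List<string>> NamesByDepartment(IEnumerable<Person> people)
    {
        return people
            // pair every person with each of their "department" fields
            .SelectMany(p => p.department.Where(f => f.name == "department"),
                        (p, f) => new { Person = p, Department = f.value })
            // key = department value, element = person name
            .GroupBy(x => x.Department, x => x.Person.name)
            .ToDictionary(g => g.Key, g => g.ToList());
    }
}
```

With the four people from the question, the result should map "IT" to John and Sally, and "Finance" to Bob and Wanda. Iterating the dictionary then gives you a string key and a list of names, so there is no char-by-char surprise.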

how to query LIST using linq

Suppose if I add person class instance to list and then I need to query the list using linq.
List lst=new List();
lst.add(new person{ID=1,Name="jhon",salary=2500});
lst.add(new person{ID=2,Name="Sena",salary=1500});
lst.add(new person{ID=3,Name="Max",salary=5500});
lst.add(new person{ID=4,Name="Gen",salary=3500});
Now I want to query the above list with linq. Please guide me with sample code.
I would also suggest LINQPad as a convenient way to experiment with LINQ, for both beginners and advanced users.
Well, the code you've given is invalid to start with - List is a generic type, and it has an Add method instead of add etc.
But you could do something like:
List<Person> list = new List<Person>
{
new Person{ID=1,Name="jhon",salary=2500},
new Person{ID=2,Name="Sena",salary=1500},
new Person{ID=3,Name="Max",salary=5500},
new Person{ID=4,Name="Gen",salary=3500}
};
// The "Where" LINQ operator filters a sequence
var highEarners = list.Where(p => p.salary > 3000);
foreach (var person in highEarners)
{
Console.WriteLine(person.Name);
}
If you want to learn details of what all the LINQ operators do, and how they can be implemented in LINQ to Objects, you might be interested in my Edulinq blog series.
var persons = new List<Person>
{
new Person {ID = 1, Name = "jhon", Salary = 2500},
new Person {ID = 2, Name = "Sena", Salary = 1500},
new Person {ID = 3, Name = "Max", Salary = 5500},
new Person {ID = 4, Name = "Gen", Salary = 3500}
};
var acertainperson = persons.Where(p => p.Name == "jhon").First();
Console.WriteLine("{0}: {1} points",
acertainperson.Name, acertainperson.Salary);
jhon: 2500 points
var doingprettywell = persons.Where(p => p.Salary > 2000);
foreach (var person in doingprettywell)
{
Console.WriteLine("{0}: {1} points",
person.Name, person.Salary);
}
jhon: 2500 points
Max: 5500 points
Gen: 3500 points
var astupidcalc = from p in persons
where p.ID > 2
select new
{
Name = p.Name,
Bobos = p.Salary*p.ID,
Bobotype = "bobos"
};
foreach (var person in astupidcalc)
{
Console.WriteLine("{0}: {1} {2}",
person.Name, person.Bobos, person.Bobotype);
}
Max: 16500 bobos
Gen: 14000 bobos
Since you haven't given any indication to what you want, here is a link to 101 LINQ samples that use all the different LINQ methods: 101 LINQ Samples
Also, you should really really really change your List into a strongly typed list (List<T>), properly define T, and add instances of T to your list. It will really make the queries much easier since you won't have to cast everything all the time.
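To illustrate the casting point, here is a small comparison. I am assuming the untyped list in the question behaves like an ArrayList; the Person shape and the TypedQueries class are illustrative only.

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public int ID { get; set; }
    public string Name { get; set; }
    public int Salary { get; set; }
}

// Hypothetical helper, not part of any library.
public static class TypedQueries
{
    // Untyped collection: elements are object, so Cast<Person>() is needed
    // before any property can be used (it throws if an element is not a Person).
    public static List<string> HighEarnersUntyped(ArrayList list, int minSalary)
    {
        return list.Cast<Person>()
                   .Where(p => p.Salary > minSalary)
                   .Select(p => p.Name)
                   .ToList();
    }

    // Strongly typed list: the compiler already knows the element type.
    public static List<string> HighEarners(List<Person> list, int minSalary)
    {
        return list.Where(p => p.Salary > minSalary)
                   .Select(p => p.Name)
                   .ToList();
    }
}
```

Both methods return the same names for the same data; the typed version just gets there without the cast and with compile-time checking.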

Filtering collections in C#

I am looking for a very fast way to filter down a collection in C#. I am currently using generic List<object> collections, but am open to using other structures if they perform better.
Currently, I am just creating a new List<object> and looping through the original list. If an item matches the filtering criteria, I put a copy into the new list.
Is there a better way to do this? Is there a way to filter in place so there is no temporary list required?
If you're using C# 3.0 you can use linq, which is way better and way more elegant:
List<int> myList = GetListOfIntsFromSomewhere();
// This will filter ints that are not > 7 out of the list; Where returns an
// IEnumerable<T>, so call ToList to convert back to a List<T>.
List<int> filteredList = myList.Where(x => x > 7).ToList();
If you can't find the .Where, that means you need to import using System.Linq; at the top of your file.
Here is a code block / example of some list filtering using three different methods that I put together to show Lambdas and LINQ based list filtering.
#region List Filtering
static void Main(string[] args)
{
ListFiltering();
Console.ReadLine();
}
private static void ListFiltering()
{
var PersonList = new List<Person>();
PersonList.Add(new Person() { Age = 23, Name = "Jon", Gender = "M" }); //Non-Constructor Object Property Initialization
PersonList.Add(new Person() { Age = 24, Name = "Jack", Gender = "M" });
PersonList.Add(new Person() { Age = 29, Name = "Billy", Gender = "M" });
PersonList.Add(new Person() { Age = 33, Name = "Bob", Gender = "M" });
PersonList.Add(new Person() { Age = 45, Name = "Frank", Gender = "M" });
PersonList.Add(new Person() { Age = 24, Name = "Anna", Gender = "F" });
PersonList.Add(new Person() { Age = 29, Name = "Sue", Gender = "F" });
PersonList.Add(new Person() { Age = 35, Name = "Sally", Gender = "F" });
PersonList.Add(new Person() { Age = 36, Name = "Jane", Gender = "F" });
PersonList.Add(new Person() { Age = 42, Name = "Jill", Gender = "F" });
//Logic: Show me all males that are less than 30 years old.
Console.WriteLine("");
//Iterative Method
Console.WriteLine("List Filter Normal Way:");
foreach (var p in PersonList)
if (p.Gender == "M" && p.Age < 30)
Console.WriteLine(p.Name + " is " + p.Age);
Console.WriteLine("");
//Lambda Filter Method
Console.WriteLine("List Filter Lambda Way");
foreach (var p in PersonList.Where(p => (p.Gender == "M" && p.Age < 30))) //.Where is an extension method
Console.WriteLine(p.Name + " is " + p.Age);
Console.WriteLine("");
//LINQ Query Method
Console.WriteLine("List Filter LINQ Way:");
foreach (var v in from p in PersonList
where p.Gender == "M" && p.Age < 30
select new { p.Name, p.Age })
Console.WriteLine(v.Name + " is " + v.Age);
}
private class Person
{
public Person() { }
public int Age { get; set; }
public string Name { get; set; }
public string Gender { get; set; }
}
#endregion
List<T> has a FindAll method that will do the filtering for you and return a subset of the list.
MSDN has a great code example here: http://msdn.microsoft.com/en-us/library/aa701359(VS.80).aspx
EDIT: I wrote this before I had a good understanding of LINQ and the Where() method. If I were to write this today I would probably use the method Jorge mentions above. The FindAll method still works if you're stuck in a .NET 2.0 environment though.
You can use IEnumerable to eliminate the need of a temp list.
public IEnumerable<T> GetFilteredItems<T>(IEnumerable<T> collection)
{
foreach (T item in collection)
if (Matches<T>(item))
{
yield return item;
}
}
where Matches is the name of your filter method. And you can use this like:
IEnumerable<MyType> filteredItems = GetFilteredItems(myList);
foreach (MyType item in filteredItems)
{
// do sth with your filtered items
}
This runs the GetFilteredItems body only as items are requested, so in cases where you do not consume all items in the filtered collection, it may provide a good performance gain.
To do it in place, you can use the RemoveAll method of the List<T> class along with a custom Predicate<T> delegate. All that does is clean up the code; under the hood it's doing the same thing you are. But yes, it does it in place, so you do save the temp list.
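A minimal sketch of the RemoveAll approach, for reference. RemoveAll takes the condition for items to remove, so the helper below (InPlaceFilter and KeepWhere are made-up names) negates a "keep" predicate to get filter semantics.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of any library.
public static class InPlaceFilter
{
    // Removes every element that does NOT satisfy keep, mutating the list itself.
    // Returns the number of elements removed (this is what List<T>.RemoveAll returns).
    public static int KeepWhere<T>(List<T> list, Predicate<T> keep)
    {
        return list.RemoveAll(item => !keep(item));
    }
}
```

For example, calling InPlaceFilter.KeepWhere(myList, x => x > 7) on a list containing 3, 9, 7, 12, 1 leaves 9 and 12 in the list and returns 3. Keep in mind that the original contents are gone afterwards, which may or may not be what you want.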
You can use the FindAll method of the List, providing a delegate to filter on. Though, I agree with @IainMH that it's not worth worrying yourself too much unless it's a huge list.
If you're using C# 3.0 you can use linq
Or, if you prefer, use the special query syntax provided by the C# 3 compiler:
var filteredList = from x in myList
where x > 7
select x;
Using LINQ can be noticeably slower than passing a predicate to the List's FindAll method. Also be careful with LINQ, as the enumeration of the list is not actually executed until you access the result. This can mean that, when you think you have created a filtered list, the content may differ from what you expected by the time you actually read it.
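The deferred execution point is easy to see in a few lines. This sketch (DeferredDemo is an illustrative name) builds one lazy query and one ToList snapshot, then changes the source list before reading either.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, not part of any library.
public static class DeferredDemo
{
    public static (int LazyCount, int SnapshotCount) Run()
    {
        var numbers = new List<int> { 5, 8, 10 };

        // Nothing is evaluated here; the query is only a description of the filter.
        IEnumerable<int> lazy = numbers.Where(x => x > 7);

        // ToList forces evaluation now, so this holds 8 and 10.
        List<int> snapshot = numbers.Where(x => x > 7).ToList();

        numbers.Add(20);

        // The lazy query runs at this point and sees 8, 10 and 20;
        // the snapshot still holds the two items captured earlier.
        return (lazy.Count(), snapshot.Count);
    }
}
```

Run() returns (3, 2). If you need the result to stay fixed, materialize it with ToList or ToArray at the point where you build the filter.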
If your list is very big and you are filtering repeatedly - you can sort the original list on the filter attribute, binary search to find the start and end points.
Initial time O(n*log(n)) then O(log(n)).
Standard filtering will take O(n) each time.
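Here is a rough sketch of the sort-then-binary-search idea for a range filter on ints. It assumes the list has already been sorted ascending (for example with list.Sort()) and is not modified between queries; SortedRangeFilter, LowerBound and InRange are names I made up.

```csharp
using System.Collections.Generic;

// Hypothetical helper, not part of any library.
public static class SortedRangeFilter
{
    // Index of the first element >= value in an ascending sorted list
    // (sorted.Count if there is no such element). O(log n).
    public static int LowerBound(List<int> sorted, int value)
    {
        int lo = 0, hi = sorted.Count;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (sorted[mid] < value) lo = mid + 1;
            else hi = mid;
        }
        return lo;
    }

    // All elements in the inclusive range [min, max] of an ascending sorted list.
    public static List<int> InRange(List<int> sorted, int min, int max)
    {
        int start = LowerBound(sorted, min);
        // max + 1 would overflow for int.MaxValue, so treat that case as "to the end".
        int end = max == int.MaxValue ? sorted.Count : LowerBound(sorted, max + 1);
        return end > start ? sorted.GetRange(start, end - start) : new List<int>();
    }
}
```

Finding the two boundaries is O(log n), but copying the k matching items out with GetRange adds O(k) on top, so this pays off mainly when the matches are a small slice of a large list.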
