I am looking for a very fast way to filter down a collection in C#. I am currently using generic List<object> collections, but am open to using other structures if they perform better.
Currently, I am just creating a new List<object> and looping through the original list. If an item matches the filtering criteria, I put a copy into the new list.
Is there a better way to do this? Is there a way to filter in place so there is no temporary list required?
If you're using C# 3.0 you can use LINQ, which is way better and way more elegant:
List<int> myList = GetListOfIntsFromSomewhere();
// This will filter ints that are not > 7 out of the list; Where returns an
// IEnumerable<T>, so call ToList to convert back to a List<T>.
List<int> filteredList = myList.Where(x => x > 7).ToList();
If you can't find .Where, you need to add using System.Linq; at the top of your file.
Here is a code block / example of some list filtering using three different methods that I put together to show Lambdas and LINQ based list filtering.
#region List Filtering
static void Main(string[] args)
{
ListFiltering();
Console.ReadLine();
}
private static void ListFiltering()
{
var PersonList = new List<Person>();
PersonList.Add(new Person() { Age = 23, Name = "Jon", Gender = "M" }); //Non-Constructor Object Property Initialization
PersonList.Add(new Person() { Age = 24, Name = "Jack", Gender = "M" });
PersonList.Add(new Person() { Age = 29, Name = "Billy", Gender = "M" });
PersonList.Add(new Person() { Age = 33, Name = "Bob", Gender = "M" });
PersonList.Add(new Person() { Age = 45, Name = "Frank", Gender = "M" });
PersonList.Add(new Person() { Age = 24, Name = "Anna", Gender = "F" });
PersonList.Add(new Person() { Age = 29, Name = "Sue", Gender = "F" });
PersonList.Add(new Person() { Age = 35, Name = "Sally", Gender = "F" });
PersonList.Add(new Person() { Age = 36, Name = "Jane", Gender = "F" });
PersonList.Add(new Person() { Age = 42, Name = "Jill", Gender = "F" });
//Logic: Show me all males that are less than 30 years old.
Console.WriteLine("");
//Iterative Method
Console.WriteLine("List Filter Normal Way:");
foreach (var p in PersonList)
if (p.Gender == "M" && p.Age < 30)
Console.WriteLine(p.Name + " is " + p.Age);
Console.WriteLine("");
//Lambda Filter Method
Console.WriteLine("List Filter Lambda Way");
foreach (var p in PersonList.Where(p => (p.Gender == "M" && p.Age < 30))) //.Where is an extension method
Console.WriteLine(p.Name + " is " + p.Age);
Console.WriteLine("");
//LINQ Query Method
Console.WriteLine("List Filter LINQ Way:");
foreach (var v in from p in PersonList
where p.Gender == "M" && p.Age < 30
select new { p.Name, p.Age })
Console.WriteLine(v.Name + " is " + v.Age);
}
private class Person
{
public Person() { }
public int Age { get; set; }
public string Name { get; set; }
public string Gender { get; set; }
}
#endregion
List<T> has a FindAll method that will do the filtering for you and return a subset of the list.
MSDN has a great code example here: http://msdn.microsoft.com/en-us/library/aa701359(VS.80).aspx
EDIT: I wrote this before I had a good understanding of LINQ and the Where() method. If I were to write this today I would probably use the method Jorge mentions above. The FindAll method still works if you're stuck in a .NET 2.0 environment though.
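For reference, a minimal sketch of the FindAll approach described above. The sample numbers and the GreaterThan helper are made up for illustration; only List<T>.FindAll itself comes from the framework. The anonymous-delegate syntax is used because it is available without LINQ or lambdas.

```csharp
using System;
using System.Collections.Generic;

public static class FindAllSketch
{
    // Returns a NEW list holding only the values greater than the threshold.
    // The source list is left untouched.
    public static List<int> GreaterThan(List<int> source, int threshold)
    {
        // FindAll takes a Predicate<T>; here it is an anonymous delegate.
        return source.FindAll(delegate(int x) { return x > threshold; });
    }

    public static void Main()
    {
        List<int> numbers = new List<int>(new int[] { 3, 9, 7, 12, 5 });
        List<int> big = GreaterThan(numbers, 7);

        foreach (int n in big)
        {
            Console.WriteLine(n); // prints 9, then 12
        }
    }
}
```

Note that FindAll still allocates a second list, so it tidies up the code rather than removing the temporary collection.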
You can use IEnumerable to eliminate the need for a temp list.
public IEnumerable<T> GetFilteredItems<T>(IEnumerable<T> collection)
{
    foreach (T item in collection)
    {
        if (Matches(item))
        {
            yield return item;
        }
    }
}
where Matches is the name of your filter method. And you can use this like:
IEnumerable<MyType> filteredItems = GetFilteredItems(myList);
foreach (MyType item in filteredItems)
{
    // do something with your filtered items
}
The body of GetFilteredItems only runs as the result is enumerated, so in cases where you do not consume every item in the filtered collection it can give a worthwhile performance gain.
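To make the "only when needed" point concrete, here is a small sketch under a few assumptions: the filter is passed in as a Func<T, bool> (instead of the Matches method above), and the counter and the CallsNeededForFirstMatch helper exist purely for the demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyFilterSketch
{
    // Yields matching items one at a time; no intermediate list is built.
    public static IEnumerable<T> GetFilteredItems<T>(
        IEnumerable<T> collection, Func<T, bool> matches)
    {
        foreach (T item in collection)
        {
            if (matches(item))
            {
                yield return item;
            }
        }
    }

    // Returns how many times the filter ran before the first match was found.
    public static int CallsNeededForFirstMatch()
    {
        int calls = 0;
        IEnumerable<int> filtered = GetFilteredItems(
            Enumerable.Range(1, 1000),
            x => { calls++; return x > 3; });

        // First() stops enumerating as soon as one item has been yielded.
        int first = filtered.First();
        Console.WriteLine("first match: " + first);
        return calls;
    }

    public static void Main()
    {
        Console.WriteLine("filter calls: " + CallsNeededForFirstMatch());
    }
}
```

With 1000 source items the filter runs only four times here, because enumeration stops at the first match.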
To do it in place, you can use the RemoveAll method of the List<T> class along with a custom Predicate<T> delegate. All that does is clean up the code, though: under the hood it's doing the same thing you are. But yes, it does it in place, so you do save the temp list. Note that RemoveAll removes the items that match, so the predicate has to describe what you want to drop.
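A minimal sketch of the in-place variant, assuming a list of ints and a made-up KeepGreaterThan helper. The only framework call is List<T>.RemoveAll, which removes every element the predicate matches and returns the number removed.

```csharp
using System;
using System.Collections.Generic;

public static class RemoveAllSketch
{
    // Keeps only values greater than the threshold, mutating the list itself.
    // Returns how many elements were removed.
    public static int KeepGreaterThan(List<int> list, int threshold)
    {
        // The predicate is the inverse of the "keep" condition.
        return list.RemoveAll(x => x <= threshold);
    }

    public static void Main()
    {
        var numbers = new List<int> { 3, 9, 7, 12, 5 };
        int removed = KeepGreaterThan(numbers, 7);

        Console.WriteLine("removed: " + removed);        // removed: 3
        Console.WriteLine("remaining: " + numbers.Count); // remaining: 2
    }
}
```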
You can use the FindAll method of the List, providing a delegate to filter on. Though, I agree with @IainMH that it's not worth worrying yourself too much unless it's a huge list.
If you're using C# 3.0 you can use linq
Or, if you prefer, use the special query syntax provided by the C# 3 compiler:
var filteredList = from x in myList
where x > 7
select x;
Using LINQ is relatively much slower than using a predicate supplied to the List's FindAll method. Also be careful with LINQ, as the enumeration of the list is not actually executed until you access the result. This can mean that, when you think you have created a filtered list, the content may differ from what you expected by the time you actually read it, for example if the source list has changed in the meantime.
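The deferred-execution caveat can be seen in a few lines. This is only an illustrative sketch with made-up data; the Run helper is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredSketch
{
    // Returns { count seen by the lazy query, count stored in the snapshot }.
    public static int[] Run()
    {
        var source = new List<int> { 5, 8, 10 };

        // Nothing is filtered yet; this only describes the query.
        IEnumerable<int> lazy = source.Where(x => x > 7);

        // ToList() runs the query immediately and stores the result.
        List<int> snapshot = source.Where(x => x > 7).ToList();

        // Changing the source afterwards affects the lazy query only.
        source.Add(20);

        return new[] { lazy.Count(), snapshot.Count };
    }

    public static void Main()
    {
        int[] counts = Run();
        Console.WriteLine("lazy: " + counts[0]);     // lazy: 3
        Console.WriteLine("snapshot: " + counts[1]); // snapshot: 2
    }
}
```

Calling ToList() (or ToArray()) at the point where you want the result fixed avoids the surprise.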
If your list is very big and you are filtering repeatedly - you can sort the original list on the filter attribute, binary search to find the start and end points.
Initial time O(n*log(n)) then O(log(n)).
Standard filtering will take O(n) each time.
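A rough sketch of the sort-then-binary-search idea, assuming the filter is a range check on an int key. LowerBound and Range are hypothetical helper names, and the list must already be sorted ascending (for example with list.Sort()) before Range is called.

```csharp
using System;
using System.Collections.Generic;

public static class SortedRangeSketch
{
    // Index of the first element >= target in an ascending list: O(log n).
    private static int LowerBound(List<int> sorted, int target)
    {
        int lo = 0, hi = sorted.Count;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (sorted[mid] < target) lo = mid + 1;
            else hi = mid;
        }
        return lo;
    }

    // Returns the values in [min, max]. Assumes max < int.MaxValue,
    // because max + 1 is used as the exclusive upper bound.
    public static List<int> Range(List<int> sorted, int min, int max)
    {
        int start = LowerBound(sorted, min);
        int end = LowerBound(sorted, max + 1);
        if (end <= start) return new List<int>();
        return sorted.GetRange(start, end - start);
    }

    public static void Main()
    {
        var values = new List<int> { 9, 3, 12, 1, 7, 3 };
        values.Sort(); // one-off O(n log n) cost

        foreach (int v in Range(values, 3, 9))
        {
            Console.WriteLine(v); // prints 3, 3, 7, 9
        }
    }
}
```

Finding the two boundaries is O(log n); copying the k matching items out with GetRange still costs O(k), so the saving is in not testing the items that fall outside the range.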
I have a given linq-sql like this:
var erg = from p in m_session.Query<LovTestData>()
select new
{
SomeString = p.SomeString,
SomeOtherString = p.SomeOtherString
};
This should be the "base"-query for a Lov-Dialog. So this is the query which defines the content of the Lov.
But there are fields in the LOV to search. So this is the query I have to use to fill the Lov at runtime:
var erg = from p in m_session.Query<LovTestData>()
where ((string.IsNullOrEmpty(someStringValueFilter) || p.SomeString.ToLower().Contains(someStringValueFilter.ToLower())) &&
(string.IsNullOrEmpty(someOtherStringFilter) || p.SomeOtherString.ToLower().Contains(someOtherStringFilter.ToLower())))
select new
{
SomeString = p.SomeString,
SomeOtherString = p.SomeOtherString
};
So I wonder how it's possible to "inject" the where clause afterwards into the given query? This is how I think it should look:
var erg = from p in m_session.Query<LovTestData>()
select new
{
SomeString = p.SomeString,
SomeOtherString = p.SomeOtherString
};
var additionalWhere = ... //Some way to define this part: ((string.IsNullOrEmpty(someStringValueFilter) || p.SomeString.ToLower().Contains(someStringValueFilter.ToLower())) && (string.IsNullOrEmpty(someOtherStringFilter) || p.SomeOtherString.ToLower().Contains(someOtherStringFilter.ToLower())))
erg = InjectWhere(erg, additionalWhere); //In this function the where is inserted into the linq so the result is the second query.
Updated:
The additionalWhere should be constructed out of the original query. So it's not possible for me to write "p.SomeString", because the construction of the additionalWhere is universal. This is the way I get the fields:
Type elementType = erg.ElementType;
foreach (PropertyInfo pi in elementType.GetProperties())
{
//pi.name...
}
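One possible way to build such a clause from property names is with expression trees. The sketch below is an assumption-laden illustration, not a tested answer for this particular provider: ContainsFilter and this InjectWhere signature are hypothetical, only string properties are handled, and whether the ToLower/Contains calls translate to SQL depends on the LINQ provider behind m_session. It is demonstrated here against an in-memory IQueryable.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class DynamicWhereSketch
{
    // Builds: x => x.<propertyName>.ToLower().Contains(filter.ToLower())
    public static Expression<Func<T, bool>> ContainsFilter<T>(
        string propertyName, string filter)
    {
        ParameterExpression x = Expression.Parameter(typeof(T), "x");
        MemberExpression property = Expression.Property(x, propertyName);

        MethodCallExpression lowered = Expression.Call(
            property, typeof(string).GetMethod("ToLower", Type.EmptyTypes));

        MethodCallExpression contains = Expression.Call(
            lowered,
            typeof(string).GetMethod("Contains", new[] { typeof(string) }),
            Expression.Constant(filter.ToLower()));

        return Expression.Lambda<Func<T, bool>>(contains, x);
    }

    // Hypothetical InjectWhere: an empty filter leaves the query unchanged.
    public static IQueryable<T> InjectWhere<T>(
        IQueryable<T> query, string propertyName, string filter)
    {
        if (string.IsNullOrEmpty(filter)) return query;
        return query.Where(ContainsFilter<T>(propertyName, filter));
    }

    public static void Main()
    {
        // Anonymous types work because T is inferred from the query.
        var rows = new[]
        {
            new { SomeString = "Alpha", SomeOtherString = "One" },
            new { SomeString = "Beta", SomeOtherString = "Two" }
        }.AsQueryable();

        foreach (var row in InjectWhere(rows, "SomeString", "ALP"))
        {
            Console.WriteLine(row.SomeString); // prints Alpha
        }
    }
}
```

Calling InjectWhere once per property found via elementType.GetProperties() chains the conditions with AND, which matches the shape of the second query above.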
If Query returns IQueryable then there is no problem at all.
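List<LovTestData> GetTestData(Expression<Func<LovTestData, bool>> where)
{
    // Nothing is executed yet; this only describes the query.
    IQueryable<LovTestData> erg = m_session.Query<LovTestData>();
    // Still deferred: Where just extends the query. A projection
    // (select new { ... }) can be applied after the filter in the same way.
    IQueryable<LovTestData> result = erg.Where(where);
    // The SQL runs here.
    return result.ToList();
}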
Now, an IQueryable will NOT EXECUTE until you really use it. So you can do select, where, union and so on, but until you enumerate the IQueryable it won't do anything. Here the real SQL runs at result.ToList(). That's why you can build all these conditions earlier. This of course assumes that m_session.Query returns IQueryable, but as far as I remember it does.
So you can even do this without the result variable that I have created. Just operate on erg.
Waayd's comment will also work.
OK, now about creating filter dynamically.
Let's take a simple class:
public class Person
{
public string Name { get; set; }
public int Age { get; set; }
}
Now, let's create a list of records to stand in for the database.
List<Person> list = new List<Person>
{
new Person {Name = "Adam Abc", Age = 15},
new Person {Name = "John Abc", Age = 23},
new Person {Name = "Steven Abc", Age = 26},
new Person {Name = "Adam Bca", Age = 21},
new Person {Name = "Adam Xyz", Age = 26},
};
Now, let's prepare a filter. You have to get filter data from a view, now let's just simulate this:
string nameIs = "Adam";
bool createAgeFilter = true;
int ageFilterMin = 20;
int ageFilterMax = 25;
So we need all Adams aged between 20 and 25. Let's create this condition:
First condition on name:
Func<Person, bool> whereName = new Func<Person, bool>((p) =>
{
if (!string.IsNullOrWhiteSpace(nameIs))
return p.Name.Contains(nameIs);
else
return true;
}
);
Next condition on age:
Func<Person, bool> whereAge = new Func<Person, bool>((p) =>
{
if (createAgeFilter)
return p.Age >= ageFilterMin && p.Age <= ageFilterMax;
else
return true;
}
);
Next, let's have our IQueryable:
IQueryable<Person> q = list.AsQueryable();
And finally let's add where clauses:
List<Person> filteredList = q.Where(whereName)
.Where(whereAge)
.ToList();
That's it. The idea behind this is that you create several partial where clauses, one for each thing you want to filter on. Chaining them as I've done at the end combines the filters with "AND". If you would like to "OR" them, you have to do it inside a single filter, the same way the age filter combines its two comparisons.
I just came up with this idea, so there may be a better solution. Maybe even some one-liner.
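For the "OR" case, one option is a small combinator that wraps two delegates in a single filter. This is a sketch with hypothetical names (Or, AgesMatchingEither) and made-up ages. It works on Func<T, bool>, so it is suitable for in-memory lists like the one above; a database-backed provider would need the conditions combined as expression trees instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrFilterSketch
{
    // Combines two filters so that an item passes if either one accepts it.
    public static Func<T, bool> Or<T>(Func<T, bool> left, Func<T, bool> right)
    {
        return item => left(item) || right(item);
    }

    public static List<int> AgesMatchingEither(IEnumerable<int> ages)
    {
        Func<int, bool> isMinor = age => age < 18;
        Func<int, bool> isSenior = age => age >= 65;

        // One Where call with the combined filter gives OR semantics;
        // two chained Where calls would give AND.
        return ages.Where(Or(isMinor, isSenior)).ToList();
    }

    public static void Main()
    {
        foreach (int age in AgesMatchingEither(new[] { 15, 23, 70, 40 }))
        {
            Console.WriteLine(age); // prints 15, then 70
        }
    }
}
```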
Edit
If you can't use LINQ like that, there is another way, but it's not so simple.
There MUST be a point somewhere in your application where you build the filter in LINQ style, for example in your view. So take this expression and call ToString(). You will get a string representation of the LINQ expression.
The next thing you have to do is install the Roslyn scripting package (Microsoft.CodeAnalysis.CSharp.Scripting).
Finally you can turn the string representation back into a LINQ expression using some Roslyn magic:
public async static Task<Expression<Func<T, bool>>> ExpressionFromStr<T>(string expressionStr)
{
var options = ScriptOptions.Default.AddReferences(typeof(T).Assembly);
return await CSharpScript.EvaluateAsync<Expression<Func<T, bool>>>(expressionStr, options);
}
Usings:
using Microsoft.CodeAnalysis.CSharp.Scripting; //roslyn
using Microsoft.CodeAnalysis.Scripting; //roslyn
using System;
using System.Linq.Expressions;
using System.Threading.Tasks; //for async and Task.
I have a class like this:
class Person
{
string Name;
string Job;
}
And I have a list of Person:
List<Person> persons;
Now I would like to write a function to construct a sequence based on persons, and having as elements the alternate values of Name and Job. For example, if I had:
persons.Add(new Person("Alice", "Accountant"));
persons.Add(new Person("Bob", "Butler"));
persons.Add(new Person("Chris", "Cleaner"));
Then the result of my function would be a sequence of strings like this:
"Alice", "Accountant", "Bob", "Butler", "Chris", "Cleaner"
Of course, I can do this by using a loop, but I'd like to find a way to do it in a single LINQ line, if possible.
Try this
persons.SelectMany(p => new[] { p.Name, p.Job });
You have two options:
Override ToString in the Person class like this:
public override string ToString()
{
return Name + "," + Job;
}
Then if you want to have the result as a List of String:
List<string> result = persons.Select(c => c.ToString()).ToList();
Or use SelectMany to flatten your list:
List<string> result2 = persons.SelectMany(p => new[] {p.Name, p.Job}).ToList();
I have following list.
One list with Person object has Id & Name property. Other list with People object has Id, Name & Address property.
List<Person> p1 = new List<Person>();
p1.Add(new Person() { Id = 1, Name = "a" });
p1.Add(new Person() { Id = 2, Name = "b" });
p1.Add(new Person() { Id = 3, Name = "c" });
p1.Add(new Person() { Id = 4, Name = "d" });
List<People> p2 = new List<People>();
p2.Add(new People() { Id = 1, Name = "a", Address=100 });
p2.Add(new People() { Id = 3, Name = "x", Address=101 });
p2.Add(new People() { Id = 4, Name = "y", Address=102 });
p2.Add(new People() { Id = 8, Name = "z", Address=103 });
I want to filter the list, so I used the code below. But it returns a list of Ids. I want a list of People objects with the matched Ids.
var filteredList = p2.Select(y => y.Id).Intersect(p1.Select(z => z.Id));
You're better off with Join
var filteredList = p2.Join(p1,
people => people.Id,
person => person.Id,
(people, _) => people)
.ToList();
The method will match items from both lists by the key you provide - Id of the People class and Id of Person class.
For each pair where people.Id == person.Id it applies the selector function (people, _) => people. The function says for each pair of matched people and person just give me the people instance; I don't care about person.
Something like this should do the trick :
var result= p1.Join(p2, person => person.Id, people => people.Id, (person, people) => people);
If your list is large enough you should use hashed collection to filter it and improve performance:
var hashedIds = new HashSet<int>(p1.Select(p => p.Id));
var filteredList = p2.Where(p => hashedIds.Contains(p.Id)).ToList();
This will work, and work extremely fast, because hashed collections like Dictionary or HashSet perform lookups with almost O(1) complexity: the collection uses the element's hash code to go more or less straight to where the element is stored. With a List<T>, finding an element means scanning the collection item by item until a match turns up.
For example the line: p2.Where(p => p1.Any(x => x.Id == p.Id)).ToList();
has complexity O(N*M), which is O(N²) for lists of similar size, because .Where and .Any together form nested loops.
Do not use the simplest answer (and method), use the one that better suits your needs.
Simple performance test against .Join() ...
And the larger the collections are, the more difference it makes.
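A rough way to see the difference yourself is a Stopwatch comparison like the sketch below. The collection sizes and helper names are arbitrary assumptions, plain ints stand in for the Id values, and the timings will vary by machine, so no numbers are claimed here.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class LookupComparisonSketch
{
    // List.Contains scans the list for every candidate: O(N*M) overall.
    public static List<int> FilterWithList(List<int> wantedIds, List<int> candidates)
    {
        return candidates.Where(id => wantedIds.Contains(id)).ToList();
    }

    // HashSet.Contains is close to O(1) per candidate: roughly O(N+M) overall.
    public static List<int> FilterWithHashSet(List<int> wantedIds, List<int> candidates)
    {
        var lookup = new HashSet<int>(wantedIds);
        return candidates.Where(id => lookup.Contains(id)).ToList();
    }

    public static void Main()
    {
        List<int> wanted = Enumerable.Range(0, 20000).ToList();
        List<int> candidates = Enumerable.Range(10000, 20000).ToList();

        Stopwatch listTimer = Stopwatch.StartNew();
        int listCount = FilterWithList(wanted, candidates).Count;
        listTimer.Stop();

        Stopwatch hashTimer = Stopwatch.StartNew();
        int hashCount = FilterWithHashSet(wanted, candidates).Count;
        hashTimer.Stop();

        // Both approaches find the same 10000 overlapping ids.
        Console.WriteLine("List:    " + listCount + " items, " + listTimer.ElapsedMilliseconds + " ms");
        Console.WriteLine("HashSet: " + hashCount + " items, " + hashTimer.ElapsedMilliseconds + " ms");
    }
}
```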
I am trying to get an idea of what C# code looked like before LINQ came out.
I have tried searching for this for several weeks and came up empty. I understand how LINQ works, but say you have a list of objects and you are trying to locate just a few of them. How would you have done this before LINQ?
Example of LINQ (excuse my syntax error, I'm still learning) :)
list<employee> newlist = new List<employee> {john, smith, 30}
newlist.add{jane, smith, 28}
newlist.add{greg, lane, 24}
var last
from name in newlist
where name.last.equals("smith")
select name
foreach(var name in last)
{
Console.WriteLine(last);
}
How would you be able to sort through and locate the name of employees by last name and display them?
It's really the same number of lines of code, just more curly braces.
Here's a translation:
List<employee> newList = new List<employee>
{
    new employee {First = "john", Last = "smith", Age = 30},
    new employee {First = "jane", Last = "smith", Age = 28},
    new employee {First = "greg", Last = "lane", Age = 24},
};
// Original code:                 // Pre-Linq translation:
var last                          // Becomes: var last = new List<employee>();
from name in newList              // Becomes: foreach (var name in newList) {
where name.Last.Equals("smith")   // Becomes: if (name.Last.Equals("smith")) {
select name                       // Becomes: last.Add(name); } }
// Pre-Linq code:
var last = new List<employee>();
foreach (var name in newList)
{
    if (name.Last.Equals("smith"))
    {
        last.Add(name);
    }
}
Just the traditional way. Loop through and filter.
var smiths = new List<Employee>();
foreach (var employee in newlist)
{
if(employee.Last == "smith")
{
smiths.Add(employee);
}
}
return smiths;
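For sorting, you can pass a delegate to the Sort() method. LINQ's OrderBy is essentially a more convenient way to express the same comparison, although it returns a new sorted sequence rather than sorting the list in place.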
newlist.Sort(delegate(Employee e1, Employee e2)
{
//your comparison logic here that compares two employees
});
Another way to sort is to create a class that implements IComparer<T> and pass that to the Sort method:
newlist.Sort(new LastNameComparer());
class LastNameComparer: IComparer<Employee>
{
public int Compare(Employee e1, Employee e2)
{
// your comparison logic here that compares two employees
return String.Compare(e1.Last, e2.Last);
}
}
Looking at all this code, LINQ is such a time saver :)
Pretty easy. You loop through all the items in the list and take the ones you are looking for (that is, copy their references to another list):
var smiths = new List<employee>();
// get all Smiths and print them
foreach(var item in newlist)
{
if(item.last == "smith")
smiths.Add(item);
}
foreach(var item in smiths)
{
Console.WriteLine(item);
}
Suppose I add instances of a person class to a list and then need to query the list using LINQ.
List lst=new List();
lst.add(new person{ID=1,Name="jhon",salary=2500});
lst.add(new person{ID=2,Name="Sena",salary=1500});
lst.add(new person{ID=3,Name="Max",salary=5500});
lst.add(new person{ID=4,Name="Gen",salary=3500});
Now I want to query the above list with linq. Please guide me with sample code.
I would also suggest LinqPad as a convenient way to experiment with LINQ, for both advanced users and beginners.
Well, the code you've given is invalid to start with - List is a generic type, and it has an Add method instead of add etc.
But you could do something like:
List<Person> list = new List<Person>
{
    new Person{ID=1,Name="jhon",salary=2500},
    new Person{ID=2,Name="Sena",salary=1500},
    new Person{ID=3,Name="Max",salary=5500},
    new Person{ID=4,Name="Gen",salary=3500}
};
// The "Where" LINQ operator filters a sequence
var highEarners = list.Where(p => p.salary > 3000);
foreach (var person in highEarners)
{
Console.WriteLine(person.Name);
}
If you want to learn details of what all the LINQ operators do, and how they can be implemented in LINQ to Objects, you might be interested in my Edulinq blog series.
var persons = new List<Person>
{
new Person {ID = 1, Name = "jhon", Salary = 2500},
new Person {ID = 2, Name = "Sena", Salary = 1500},
new Person {ID = 3, Name = "Max", Salary = 5500},
new Person {ID = 4, Name = "Gen", Salary = 3500}
};
var acertainperson = persons.Where(p => p.Name == "jhon").First();
Console.WriteLine("{0}: {1} points",
acertainperson.Name, acertainperson.Salary);
jhon: 2500 points
var doingprettywell = persons.Where(p => p.Salary > 2000);
foreach (var person in doingprettywell)
{
Console.WriteLine("{0}: {1} points",
person.Name, person.Salary);
}
jhon: 2500 points
Max: 5500 points
Gen: 3500 points
var astupidcalc = from p in persons
where p.ID > 2
select new
{
Name = p.Name,
Bobos = p.Salary*p.ID,
Bobotype = "bobos"
};
foreach (var person in astupidcalc)
{
Console.WriteLine("{0}: {1} {2}",
person.Name, person.Bobos, person.Bobotype);
}
Max: 16500 bobos
Gen: 14000 bobos
Since you haven't given any indication of what you want, here is a link to samples that use all the different LINQ methods: 101 LINQ Samples
Also, you should really really really change your List into a strongly typed list (List<T>), properly define T, and add instances of T to your list. It will really make the queries much easier since you won't have to cast everything all the time.
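To illustrate the casting point, here is a small sketch comparing a non-generic ArrayList with a List<Person>. The Person class below is an assumed shape based on the properties used in the question, and both helper methods are hypothetical.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public int ID { get; set; }
    public string Name { get; set; }
    public int Salary { get; set; }
}

public static class TypedListSketch
{
    // Non-generic collection: every query has to start with Cast<Person>(),
    // and a wrong element type only fails at run time.
    public static List<string> HighEarnersUntyped(ArrayList people, int minSalary)
    {
        return people.Cast<Person>()
                     .Where(p => p.Salary > minSalary)
                     .Select(p => p.Name)
                     .ToList();
    }

    // Strongly typed list: no cast, and the compiler checks the element type.
    public static List<string> HighEarnersTyped(List<Person> people, int minSalary)
    {
        return people.Where(p => p.Salary > minSalary)
                     .Select(p => p.Name)
                     .ToList();
    }

    public static void Main()
    {
        var typed = new List<Person>
        {
            new Person { ID = 1, Name = "jhon", Salary = 2500 },
            new Person { ID = 2, Name = "Sena", Salary = 1500 },
            new Person { ID = 3, Name = "Max", Salary = 5500 },
            new Person { ID = 4, Name = "Gen", Salary = 3500 }
        };
        var untyped = new ArrayList(typed);

        Console.WriteLine(string.Join(", ", HighEarnersTyped(typed, 3000)));     // Max, Gen
        Console.WriteLine(string.Join(", ", HighEarnersUntyped(untyped, 3000))); // Max, Gen
    }
}
```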