What's the point of lambda expression in OrderBy? - c#

I have a list of strings and I'd like to order them.
IEnumerable<String> strings = ...;
strings = strings.OrderBy(a => a);
What I don't get is the point of the lambda expression a => a in there. At first I thought I could pull out a property and order at the same time, like this.
IEnumerable<Something> somethings = ...;
IEnumerable<String> strings = somethings.OrderBy(a => a.StringProperty);
But that doesn't compile. So I'll have to go like this.
IEnumerable<Something> somethings = ...;
IEnumerable<String> strings = somethings.Select(a => a.StringProperty).OrderBy(a => a);
So why am I forced to use the lambda expression in the OrderBy call?!

The lambda indicates the "what you want to order by".
If you take a set of people, and order them by their birthday, you still have a set of people - not a set of birthdays; i.e.
IEnumerable<Person> people = ...;
IEnumerable<Person> sorted = people.OrderBy(a => a.DateOfBirth);
so similarly, ordering a set of Somethings by StringProperty still results in a set of Somethings:
IEnumerable<Something> somethings = ...;
IEnumerable<Something> sorted = somethings.OrderBy(a => a.StringProperty);
In some (very few) cases, you do actually mean "and order it by the thing itself". This usually applies only to things like IEnumerable<string> or IEnumerable<int> - so the minor inconvenience of .OrderBy(x => x) is trivial. If it bothers you, you could always write an extension method to hide this detail.
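The point above can be sketched in a few lines. This is only an illustration: the (Name, DateOfBirth) tuples and the sample values are made up and stand in for a real Person class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: (Name, DateOfBirth) tuples stand in for Person objects.
var people = new List<(string Name, DateTime DateOfBirth)>
{
    ("Carol", new DateTime(1990, 5, 1)),
    ("Alice", new DateTime(1985, 3, 12)),
    ("Bob",   new DateTime(1995, 7, 30)),
};

// Ordering by a key keeps the element type: this is still a sequence of people.
List<(string Name, DateTime DateOfBirth)> byBirthday =
    people.OrderBy(p => p.DateOfBirth).ToList();

// To end up with just one property, project with Select after ordering.
List<string> namesByBirthday = byBirthday.Select(p => p.Name).ToList();

// When the elements are their own sort key, the identity lambda is the key selector.
List<string> alphabetical = namesByBirthday.OrderBy(n => n).ToList();

Console.WriteLine(string.Join(", ", namesByBirthday)); // Alice, Carol, Bob
Console.WriteLine(string.Join(", ", alphabetical));    // Alice, Bob, Carol
```

The key selector decides the order only; the shape of the result changes only when you add a Select.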

When you order a collection, it doesn't change its type; hence
IEnumerable<Something> somethings = ...;
var strings = somethings.OrderBy(a => a.StringProperty);
results in an IEnumerable<Something>. You have to select the property to change the type:
IEnumerable<String> strings = somethings
.OrderBy(s => s.StringProperty)
.Select(s => s.StringProperty);
"So why am I forced to use the lambda expression in the OrderBy command?!"
Because Enumerable.OrderBy is a method that needs an argument.

Because you're not selecting it, you're ordering by it.
Try this:
Console.WriteLine(string.Join(", ",
    new[] { new { Int = 1 }, new { Int = 2 }, new { Int = 0 } }
        .OrderBy(a => a.Int)));
This prints the items ordered by their Int property rather than in their original order.
This means that you can order by any property of the object, instead of just the object itself.

OrderBy<TSource, TKey> has two type parameters: the type of the source items (TSource) and the type of the key to sort by (TKey). The lambda is saying "order the TSource items by this TKey", or in your case, "order a by a".

The purpose of the lambda parameter in OrderBy is precisely to state the criterion used for ordering. It takes an object you're sorting and returns another "thing" (of the same type or not) to sort by, in effect extracting a sort key from the original element.
Your first sample is really trivial, and your rant is somewhat justified there, since if you start from a list of strings, you most likely will want to sort by those strings precisely. Which makes me wonder too, why we can't have a parameterless OrderBy for those trivial cases.
For the second snippet:
IEnumerable<Something> somethings = ...;
IEnumerable<Something> strings = somethings.OrderBy(a => a.StringProperty);
Here is where the "sorting criteria" make sense, as you order the objects by some property value derived from them, and not by the objects themselves (which generally aren't comparable). The reason your version doesn't compile is the second declaration: it should be an IEnumerable<Something> instead of IEnumerable<string>, because ordering returns a sequence of the very same type as it received, just in a different order, regardless of the sorting criteria.
In the third snippet, you solve that by Selecting the string property, which effectively yields a list of strings, but you lose all the input objects in the process. The lambda parameter is more or less pointless and trivial here, as you're starting from plain strings to begin with, the very same as in the first sample.
Another way to use it would be to specify a sorting criterion other than the trivial one for strings. Say you want to sort not alphabetically, but by the third letter instead:
IEnumerable<String> strings = ...;
strings = strings.OrderBy(a => a.Substring(2, 1));

Related

How to query an array of objects in C#

I want to query an array of objects "sortedData", where each object has two values (ItemId, Sort), for a specific ItemId and set the 'Sort' value. Like this below but this isn't the correct linq syntax.
var sortedData = db.Fetch<object>("SELECT ItemId, Sort FROM CollectionItems WHERE CollectionId = @0", collectionId);
dataWithSort = db.Fetch<OrganizationForExportWithSort>(TpShared.DAL.StoredProcedures.GetOrganizationsForTargetListUI(clientId, organizationIdList));
foreach(OrganizationForExportWithSort export in dataWithSort)
{
export.Sort = sortedData.Select("Sort").Where(sortedData.ItemId == export.Id);
}
As I understand it, you want the Sort property from the item that matches that particular ID. This being the case, you have a few problems with what you've written:
"Where" and "Select" both take Lambda expressions, not property names and expressions, so the code snippet you provide shouldn't compile.
"Where" and "Select" both return collections (even if there's only one item that actually matches the "Where" filter; in fact, even if no items in the collection match the condition in the "Where" clause it'll still return a collection, albeit an empty one). Think of LINQ Select more in terms of running a transform on a collection and LINQ "Where" as applying a filter to one.
As a general rule for LINQ queries, if possible, you should actually run "where" before "select" (filter first, then apply some kind of transform to the remaining items).
In this case, I think you actually just want one item, so you can actually use "FirstOrDefault" instead of "Where." This will leave you with a single .NET object. This is analogous to the TOP 1 restriction in SQL. Once you have the .NET object you can retrieve the property from the object itself.
Try this:
foreach(OrganizationForExportWithSort export in dataWithSort)
{
export.Sort = sortedData.FirstOrDefault(data => data.ItemId == export.Id)?.Sort;
}
The "?." is the null-conditional operator (added in C# 6). It reads .Sort from the object if (and only if) the query succeeded in finding an item with that ID. If it didn't, the whole expression just evaluates to null.
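As a small sketch of how FirstOrDefault and "?." behave together, assuming the rows are reference-type objects with ItemId and Sort properties (the anonymous objects and values below are invented for illustration):

```csharp
using System;
using System.Linq;

// Hypothetical rows standing in for the (ItemId, Sort) query results.
var sortedData = new[]
{
    new { ItemId = 1, Sort = 10 },
    new { ItemId = 2, Sort = 20 },
};

// FirstOrDefault returns the first match, or null (for a reference type) when nothing matches.
// "?." then reads Sort only if an item was found, so the result is a nullable int.
int? found = sortedData.FirstOrDefault(d => d.ItemId == 2)?.Sort;
int? missing = sortedData.FirstOrDefault(d => d.ItemId == 99)?.Sort;

Console.WriteLine(found.HasValue ? found.Value.ToString() : "null");     // 20
Console.WriteLine(missing.HasValue ? missing.Value.ToString() : "null"); // null
```

If the property being assigned is a plain int rather than int?, a fallback such as "?? 0" would be needed after the expression.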
Have you tried Linq sorting?
var sortedData = db.Fetch<object>("SELECT ItemId, Sort FROM CollectionItems WHERE CollectionId = @0", collectionId);
dataWithSort = db.Fetch<OrganizationForExportWithSort>(TpShared.DAL.StoredProcedures.GetOrganizationsForTargetListUI(clientId, organizationIdList));
// create a list ordered by fields
var sorted = dataWithSort.OrderBy(o => o.SomeField).ThenBy(o => o.OtherField);
The o in the lambda stands for object...
I will add my voice to the chorus of folks saying to read up on some good LINQ tutorials.

Finding the list of common objects between two lists

I have list of objects of a class for example:
class MyClass
{
    public string id;
    public string name;
    public string lastname;
}
so for example: List<MyClass> myClassList;
and also I have list of string of some ids, so for example:
List<string> myIdList;
Now I am looking for a method that accepts these two as parameters and returns a List<MyClass> of the objects whose id appears in myIdList.
NOTE: Always the bigger list is myClassList and always myIdList is a smaller subset of that.
How can we find this intersection?
So you're looking to find all the elements in myClassList where myIdList contains the ID? That suggests:
var query = myClassList.Where(c => myIdList.Contains(c.id));
Note that if you could use a HashSet<string> instead of a List<string>, each Contains test will potentially be more efficient - certainly if your list of IDs grows large. (If the list of IDs is tiny, there may well be very little difference at all.)
It's important to consider the difference between a join and the above approach in the face of duplicate elements in either myClassList or myIdList. A join will yield every matching pair - the above will yield either 0 or 1 element per item in myClassList.
Which of those you want is up to you.
EDIT: If you're talking to a database, it would be best if you didn't use a List<T> for the entities in the first place - unless you need them for something else, it would be much more sensible to do the query in the database than fetching all the data and then performing the query locally.
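The difference between filtering with Contains and joining shows up as soon as the ID list contains a duplicate. The sketch below uses made-up (id, name) tuples in place of MyClass and a deliberately duplicated "a":

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data: (id, name) tuples stand in for MyClass; note the duplicate "a" in the ID list.
var myClassList = new List<(string id, string name)>
{
    ("a", "Ann"), ("b", "Bob"), ("c", "Cy"),
};
var myIdList = new List<string> { "a", "a", "b" };

// Filter: each item of myClassList appears at most once, however often its ID is listed.
var idSet = new HashSet<string>(myIdList); // hashed lookups instead of scanning the list each time
var filtered = myClassList.Where(c => idSet.Contains(c.id)).ToList();

// Join: one result per matching (item, id) pair, so "Ann" appears twice.
var joined = myClassList
    .Join(myIdList, item => item.id, id => id, (item, id) => item)
    .ToList();

Console.WriteLine(string.Join(", ", filtered.Select(c => c.name))); // Ann, Bob
Console.WriteLine(string.Join(", ", joined.Select(c => c.name)));   // Ann, Ann, Bob
```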
That isn't strictly an intersection (unless the ids are unique), but you can simply use Contains, i.e.
var sublist = myClassList.Where(x => myIdList.Contains(x.id));
You will, however, get significantly better performance if you create a HashSet<T> first:
var hash = new HashSet<string>(myIdList);
var sublist = myClassList.Where(x => hash.Contains(x.id));
You can use a join between the two lists:
return myClassList.Join(
        myIdList,
        item => item.id,
        id => id,
        (item, id) => item)
    .ToList();
This is a kind of intersection between the two lists: read it as "I want the items from one list that are present in the second list". The ToList() call executes the query immediately.
var lst = myClassList.Where(x => myIdList.Contains(x.id)).ToList();
You have to use the code below:
var samedata = myClassList.Where(p => myIdList.Any(q => q == p.id));
myClassList.Where(x => myIdList.Contains(x.id));
Try
List<MyClass> GetMatchingObjects(List<MyClass> classList, List<string> idList)
{
return classList.Where(myClass => idList.Any(x => myClass.id == x)).ToList();
}
var q = myClassList.Where(x => myIdList.Contains(x.id));

Running a simple LINQ query in parallel

I'm still very new to LINQ and PLINQ. I generally just use loops and List.BinarySearch in a lot of cases, but I'm trying to get out of that mindset where I can.
public class Staff
{
// ...
public bool Matches(string searchString)
{
// ...
}
}
Using "normal" LINQ - sorry, I'm unfamiliar with the terminology - I can do the following:
var matchedStaff = from s in allStaff
                   where s.Matches(searchString)
                   select s;
But I'd like to do this in parallel:
var matchedStaff = allStaff.AsParallel().Select(s => s.Matches(searchString));
When I check the type of matchedStaff, it's a list of bools, which isn't what I want.
First of all, what am I doing wrong here, and secondly, how do I return a List<Staff> from this query?
public List<Staff> Search(string searchString)
{
return allStaff.AsParallel().Select(/* something */).AsEnumerable();
}
returns IEnumerable<type>, not List<type>.
For your first question, you should just replace Select with Where:
var matchedStaff = allStaff.AsParallel().Where(s => s.Matches(searchString));
Select is a projection operator, not a filtering one. That's why you are getting an IEnumerable<bool>: it is the projection of every Staff object in the input sequence to the bool returned by your Matches call.
I understand it can be counterintuitive not to use Select at all, as you seem more familiar with the "query syntax", where the select keyword is mandatory. That is not the case with the "lambda syntax" (or "fluent syntax", whatever the naming), but that's how it is ;)
Projection operators, such as Select, take an element from the sequence as input and transform (project) it into another kind of element (here, a bool). Filtering operators, such as Where, take an element from the sequence as input and, based on a predicate, either pass it through unchanged to the output sequence or drop it.
As for your second question, AsEnumerable returns an IEnumerable, as its name indicates ;)
If you want to get a List<Staff>, you should call ToList() instead (as its name indicates ;)):
return allStaff.AsParallel().Where(s => s.Matches(searchString)).ToList();
Hope this helps.
There is no need to abandon normal LINQ syntax to achieve parallelism. You can rewrite your original query:
var matchedStaff = from s in allStaff
where s.Matches(searchString)
select s;
The parallel LINQ (“PLINQ”) version would be:
var matchedStaff = from s in allStaff.AsParallel()
where s.Matches(searchString)
select s;
To understand where the bools are coming from, when you write the following:
var matchedStaff = allStaff.AsParallel().Select(s => s.Matches(searchString));
That is equivalent to the following query syntax:
var matchedStaff = from s in allStaff.AsParallel() select s.Matches(searchString);
As stated by darkey, if you want to use the method syntax instead of the query syntax, you should use Where():
var matchedStaff = allStaff.AsParallel().Where(s => s.Matches(searchString));
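To see the Select/Where difference side by side, here is a minimal sketch. It assumes plain strings in place of Staff objects, with string.Contains standing in for the Matches method; the names are invented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Staff list: plain names, with Contains playing the role of Matches.
var allStaff = new List<string> { "Alice", "Brian", "Joanne", "Mark", "Diana" };
string searchString = "an";

// Select projects every element to a bool: one result per input element.
List<bool> flags = allStaff.AsParallel()
    .Select(s => s.Contains(searchString))
    .ToList();

// Where filters: the elements themselves come back, and only those that match.
// AsOrdered asks PLINQ to preserve the source order in the output.
List<string> matched = allStaff.AsParallel().AsOrdered()
    .Where(s => s.Contains(searchString))
    .ToList();

Console.WriteLine(flags.Count);                 // 5
Console.WriteLine(string.Join(", ", matched));  // Brian, Joanne, Diana
```

Without AsOrdered, PLINQ is free to return the matches in any order, which is usually fine for a search but worth knowing about.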

How to force my lambda expressions to evaluate early? Fix lambda expression weirdness?

I have written the following C# code:
_locationsByRegion = new Dictionary<string, IEnumerable<string>>();
foreach (string regionId in regionIds)
{
IEnumerable<string> locationIds = Locations
.Where(location => location.regionId.ToUpper() == regionId.ToUpper())
.Select(location => location.LocationId); //If I cast to an array here, it works.
_locationsByRegion.Add(regionId, locationIds);
}
This code is meant to create a dictionary with my "region ids" as keys and lists of "location ids" as values.
However, what actually happens is that I get a dictionary with the "region ids" as keys, but the value for each key is identical: it is the list of locations for the last region id in regionIds!
It looks like this is a product of how lambda expressions are evaluated. I can get the correct result by casting the list of location ids to an array, but this feels like a kludge.
What is a good practice for handling this situation?
You're using LINQ, which evaluates lazily. You need to perform an eager operation to make it execute the Select; ToList() is a good operator for that. Since List<T> implements IEnumerable<T>, the result can be assigned to an IEnumerable<string> directly.
LINQ uses lazy evaluation by default. ToList and other eager operations force the Select to run; until you call one of them, nothing is executed. It is somewhat like SQL in ADO.NET: having the statement "Select * from users" doesn't run the query until you execute it. ToList makes the Select execute.
You're closing over the variable, not the value.
Make a local copy of the variable so you capture the current value from the foreach loop instead:
_locationsByRegion = new Dictionary<string, IEnumerable<string>>();
foreach (string regionId in regionIds)
{
var regionToUpper = regionId.ToUpper();
IEnumerable<string> locationIds = Locations
.Where(location => location.regionId.ToUpper() == regionToUpper)
.Select(location => location.LocationId); //If I cast to an array here, it works.
_locationsByRegion.Add(regionId, locationIds);
}
Then read this:
http://msdn.microsoft.com/en-us/vcsharp/hh264182
edit - Forcing an eager evaluation would also work, as others have suggested, but most of the time eager evaluations end up being much slower.
Call ToList() or ToArray() after the Select(...). That way the entire collection is evaluated right there.
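The interaction between deferred execution and captured variables can be reproduced in isolation. The sketch below is an illustration with made-up numbers; it mutates an ordinary local rather than a foreach variable, because compilers from C# 5 onward give foreach a fresh variable on each iteration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new[] { 1, 2, 3, 4, 5, 6 };
int threshold = 2;

// Deferred: nothing is filtered yet; the lambda holds the variable "threshold", not the value 2.
IEnumerable<int> lazy = numbers.Where(n => n > threshold);

// Eager: ToList runs the filter now, while threshold is still 2.
List<int> eager = numbers.Where(n => n > threshold).ToList();

threshold = 4; // changing the captured variable affects queries that have not run yet

Console.WriteLine(string.Join(", ", lazy));  // 5, 6 (filter runs here, with threshold == 4)
Console.WriteLine(string.Join(", ", eager)); // 3, 4, 5, 6
```

Both fixes suggested in the answers appear here: copying the value into its own variable keeps the lambda from seeing later changes, and ToList pins the result at the point of the call.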
Actually the question is about lookup creation, which could be achieved simpler with standard LINQ group join:
var query = from regionId in regionIds
join location in Locations
on regionId.ToLower() equals location.regionId.ToLower() into g
select new { RegionID = regionId,
Locations = g.Select(location => location.LocationId) };
In this case all locations will be downloaded at once, and grouped in-memory. Also this query will not be executed until you try to access results, or until you convert it to dictionary:
var locationsByRegion = query.ToDictionary(x => x.RegionID, x => x.Locations);
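A related option, not used in the answers above, is Enumerable.ToLookup, which builds the one-key-to-many-values structure in a single eager call. This is a sketch under assumptions: the (regionId, LocationId) tuples are invented stand-ins for the Locations collection.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data: (regionId, LocationId) tuples stand in for the Locations collection.
var locations = new List<(string regionId, string LocationId)>
{
    ("eu", "L1"), ("US", "L2"), ("EU", "L3"),
};

// ToLookup executes immediately and groups every location ID under its upper-cased region ID.
ILookup<string, string> locationsByRegion =
    locations.ToLookup(l => l.regionId.ToUpper(), l => l.LocationId);

Console.WriteLine(string.Join(", ", locationsByRegion["EU"])); // L1, L3
Console.WriteLine(locationsByRegion["ASIA"].Count());          // 0: a missing key yields an empty sequence
```

Unlike the group join, a lookup only contains keys that actually occur in the data, so regions without locations are not listed, although indexing them still returns an empty sequence.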

C# - AsEnumerable Example

What is the exact use of AsEnumerable? Will it change a non-enumerable collection into an enumerable collection? Please give me a simple example.
From the "Remarks" section of the MSDN documentation:
The AsEnumerable<TSource> method has no effect other than to change the compile-time type of source from a type that implements IEnumerable<T> to IEnumerable<T> itself.
AsEnumerable<TSource> can be used to choose between query implementations when a sequence implements IEnumerable<T> but also has a different set of public query methods available. For example, given a generic class Table that implements IEnumerable<T> and has its own methods such as Where, Select, and SelectMany, a call to Where would invoke the public Where method of Table. A Table type that represents a database table could have a Where method that takes the predicate argument as an expression tree and converts the tree to SQL for remote execution. If remote execution is not desired, for example because the predicate invokes a local method, the AsEnumerable<TSource> method can be used to hide the custom methods and instead make the standard query operators available.
If you take a look in reflector:
public static IEnumerable<TSource> AsEnumerable<TSource>(this IEnumerable<TSource> source)
{
return source;
}
It basically does nothing more than upcast something that implements IEnumerable<T> to IEnumerable<T> itself.
Nobody has mentioned this for some reason, but observe that something.AsEnumerable() is equivalent to (IEnumerable<TSomething>) something. The difference is that the cast requires the type of the elements to be specified explicitly, which is, of course, inconvenient. For me, that's the main reason to use AsEnumerable() instead of the cast.
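The equivalence with a cast is easy to check. In this small sketch (the list contents are arbitrary), both forms hand back the very same object and only the compile-time type changes:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var list = new List<int> { 4, 2, 7 };

IEnumerable<int> viaMethod = list.AsEnumerable();   // element type is inferred
IEnumerable<int> viaCast = (IEnumerable<int>)list;  // element type must be spelled out

// Both are the same object as the list; no copy or wrapper is created.
Console.WriteLine(object.ReferenceEquals(viaMethod, list)); // True
Console.WriteLine(object.ReferenceEquals(viaCast, list));   // True
```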
AsEnumerable() converts an array (or list, or collection) into an IEnumerable<T> of the collection.
See http://msdn.microsoft.com/en-us/library/bb335435.aspx for more information.
From the above article:
The AsEnumerable<TSource>(IEnumerable<TSource>) method has no effect other than to change the compile-time type of source from a type that implements IEnumerable<T> to IEnumerable<T> itself.
After reading the answers, I guess you are still missing a practical example.
I use this to enable me to use LINQ on a DataTable:
var mySelect = from table in myDataSet.Tables[0].AsEnumerable()
where table["myColumn"].ToString() == "Some text"
select table;
AsEnumerable can only be used on enumerable collections. It just changes the type of the collection to IEnumerable<T> to access more easily the IEnumerable extensions.
No, it doesn't change a non-enumerable collection into an enumerable one. What it does is return the collection back to you typed as IEnumerable<T>, so that you can use it with the IEnumerable extension methods and have it treated as such.
Here's example code which may illustrate LukeH's correct explanation.
IEnumerable<Order> orderQuery = dataContext.Orders
.Where(o => o.Customer.Name == "Bob")
.AsEnumerable()
.Where(o => MyFancyFilterMethod(o, MyFancyObject));
The first Where is Queryable.Where, which is translated into SQL and run in the database (o.Customer is not loaded into memory).
The second Where is Enumerable.Where, which calls an in-memory method with an instance of something I don't want to send into the database.
Without the AsEnumerable method, I'd have to write it like this:
IEnumerable<Order> orderQuery =
((IEnumerable<Order>)
(dataContext.Orders.Where(o => o.Customer.Name == "Bob")))
.Where(o => MyFancyFilterMethod(o, MyFancyObject));
Or
IEnumerable<Order> orderQuery =
Enumerable.Where(
dataContext.Orders.Where(o => o.Customer.Name == "Bob"),
o => MyFancyFilterMethod(o, MyFancyObject));
Neither of which flow well at all.
static void Main()
{
    /*
    The purpose of "AsEnumerable" is to cast an IQueryable<T> sequence to IEnumerable<T>,
    forcing the remainder of the query to execute locally instead of on the database
    (it binds Enumerable operators instead of Queryable ones), which can hurt performance.
    In the example below we have a Cars table in SQL Server and want to filter red cars,
    then filter their equipment with a regex:
    */
    Regex wordCounter = new Regex(@"\w+");
    var query = dataContext.Cars.Where(car => car.Color == "red" && wordCounter.Matches(car.Equipment).Count < 10);
    /*
    SQL Server doesn’t support regular expressions, therefore the LINQ-to-db providers will throw an exception: the query cannot be translated to SQL.
    To solve this, we can first get all red cars using a LINQ to SQL query,
    and then filter locally for Equipment of less than 10 words:
    */
    IEnumerable<Car> sqlQuery = dataContext.Cars
        .Where(car => car.Color == "red");
    IEnumerable<Car> localQuery = sqlQuery
        .Where(car => wordCounter.Matches(car.Equipment).Count < 10);
    /*
    Because sqlQuery is of type IEnumerable<Car>, the second query binds to the local query operators,
    so that part of the filtering runs on the client.
    With AsEnumerable, we can do the same in a single query:
    */
    var combinedQuery = dataContext.Cars
        .Where(car => car.Color == "red")
        .AsEnumerable()
        .Where(car => wordCounter.Matches(car.Equipment).Count < 10);
    /*
    An alternative to calling AsEnumerable is ToArray or ToList.
    */
}
The Enumerable.AsEnumerable method can be used to hide a type's custom implementation of a standard query operator.
Consider the following example. We have a custom list called MyList:
public class MyList<T> : List<T>
{
public string Where()
{
return $"This is the first element {this[0]}";
}
}
MyList has a method called Where, the exact same name as Enumerable.Where(). When I use it, I am actually calling my version of Where, not Enumerable's version:
MyList<int> list = new MyList<int>();
list.Add(4);
list.Add(2);
list.Add(7);
string result = list.Where();
// the result is "This is the first element 4"
Now how can I find the elements which are less than 5 with the Enumerable's version of Where?
The answer is: Use AsEnumerable() method and then call Where
IEnumerable<int> result = list.AsEnumerable().Where(e => e < 5);
This time the result contains the list of elements that are less than 5
