ordering of OrderBy, Where, Select in the Linq query - c#

Considering this sample code
System.Collections.ArrayList fruits = new System.Collections.ArrayList();
fruits.Add("mango");
fruits.Add("apple");
fruits.Add("lemon");
IEnumerable<string> query = fruits.Cast<string>()
.OrderBy(fruit => fruit)
.Where(fruit => fruit.StartsWith("m"))
.Select(fruit => fruit);
I have two questions:
Do I need to write the last Select clause if Where already returns the same type? The example is from MSDN. Why do they always write it?
What is the correct order of these methods? Does the order affect something? What if I swap Select and Where, or OrderBy?

No, the Select is not necessary if you are not actually transforming the returned type.
In this case, the ordering of the method calls can have an impact on performance. Sorting all the objects before filtering generally takes longer than filtering first and then sorting the smaller data set.

The .Select is unnecessary in this case because .Cast already guarantees that you're working with IEnumerable<string>.
The ordering of .OrderBy and .Where doesn't affect the results of the query, but in general if you use .Where first you'll get better performance because there will be fewer elements to sort.
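A minimal sketch of the point above, written as C# 9 top-level statements. The extra "melon" entry and the variable names are assumptions added for illustration; they are not part of the MSDN sample. Both pipelines produce the same sequence, but the second one sorts only the elements that pass the filter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// "melon" is a hypothetical extra entry so that the filter keeps two items.
var fruits = new List<string> { "mango", "apple", "lemon", "melon" };

// Sort everything, then filter.
List<string> sortThenFilter = fruits
    .OrderBy(fruit => fruit)
    .Where(fruit => fruit.StartsWith("m"))
    .ToList();

// Filter first, so only the matching elements are sorted.
// No trailing Select is needed because the element type does not change.
List<string> filterThenSort = fruits
    .Where(fruit => fruit.StartsWith("m"))
    .OrderBy(fruit => fruit)
    .ToList();

Console.WriteLine(string.Join(", ", sortThenFilter));            // mango, melon
Console.WriteLine(string.Join(", ", filterThenSort));            // mango, melon
Console.WriteLine(sortThenFilter.SequenceEqual(filterThenSort)); // True
```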

Related

C# GroupBy: Will the GroupBy clause keep the original order of the list [duplicate]

I use LINQ to Objects instructions on an ordered array.
Which operations shouldn't I do to be sure the order of the array is not changed?
I examined the methods of System.Linq.Enumerable, discarding any that returned non-IEnumerable results. I checked the remarks of each to determine how the order of the result would differ from order of the source.
Preserves Order Absolutely. You can map a source element by index to a result element
AsEnumerable
Cast
Concat
Select
ToArray
ToList
Preserves Order. Elements are filtered or added, but not re-ordered.
Distinct
Except
Intersect
OfType
Prepend (new in .NET 4.7.1)
Skip
SkipWhile
Take
TakeWhile
Where
Zip (new in .NET 4)
Destroys Order - we don't know what order to expect results in.
ToDictionary
ToLookup
Redefines Order Explicitly - use these to change the order of the result
OrderBy
OrderByDescending
Reverse
ThenBy
ThenByDescending
Redefines Order according to some rules.
GroupBy - The IGrouping objects are yielded in an order based on the order of the elements in source that produced the first key of each IGrouping. Elements in a grouping are yielded in the order they appear in source.
GroupJoin - GroupJoin preserves the order of the elements of outer, and for each element of outer, the order of the matching elements from inner.
Join - preserves the order of the elements of outer, and for each of these elements, the order of the matching elements of inner.
SelectMany - for each element of source, selector is invoked and a sequence of values is returned.
Union - When the object returned by this method is enumerated, Union enumerates first and second in that order and yields each element that has not already been yielded.
Edit: I've moved Distinct to Preserving order based on this implementation.
private static IEnumerable<TSource> DistinctIterator<TSource>
    (IEnumerable<TSource> source, IEqualityComparer<TSource> comparer)
{
    Set<TSource> set = new Set<TSource>(comparer);
    foreach (TSource element in source)
        if (set.Add(element)) yield return element;
}
Are you actually talking about SQL, or about arrays? To put it another way, are you using LINQ to SQL or LINQ to Objects?
The LINQ to Objects operators don't actually change their original data source - they build sequences which are effectively backed by the data source. The only operations which change the ordering are OrderBy/OrderByDescending/ThenBy/ThenByDescending - and even then, those are stable for equally ordered elements. Of course, many operations will filter out some elements, but the elements which are returned will be in the same order.
If you convert to a different data structure, e.g. with ToLookup or ToDictionary, I don't believe order is preserved at that point - but that's somewhat different anyway. (The order of values mapping to the same key is preserved for lookups though, I believe.)
If you are working on an array, it sounds like you are using LINQ-to-Objects, not SQL; can you confirm? Most LINQ operations don't re-order anything (the output will be in the same order as the input) - so don't apply another sort (OrderBy[Descending]/ThenBy[Descending]).
[edit: as Jon put more clearly; LINQ generally creates a new sequence, leaving the original data alone]
Note that pushing the data into a Dictionary<,> (ToDictionary) will scramble the data, as dictionary does not respect any particular sort order.
But most common things (Select, Where, Skip, Take) should be fine.
I found a great answer in a similar question which references official documentation. To quote it:
For Enumerable methods (LINQ to Objects, which applies to List<T>), you can rely on the order of elements returned by Select, Where, or GroupBy. This is not the case for things that are inherently unordered like ToDictionary or Distinct.
From Enumerable.GroupBy documentation:
The IGrouping<TKey, TElement> objects are yielded in an order based on the order of the elements in source that produced the first key of each IGrouping<TKey, TElement>. Elements in a grouping are yielded in the order they appear in source.
This is not necessarily true for IQueryable extension methods (other LINQ providers).
Source: Do LINQ's Enumerable Methods Maintain Relative Order of Elements?
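To illustrate the GroupBy behaviour quoted above for LINQ to Objects, here is a small sketch written as C# 9 top-level statements. The word list is made up for the example. Groups come out in the order in which each key first appears in the source, and the elements inside a group keep their source order.

```csharp
using System;
using System.Linq;

// Hypothetical sample data.
string[] words = { "pear", "apple", "plum", "avocado", "banana" };

// Group by first letter. Keys are yielded in order of first occurrence.
var groups = words.GroupBy(w => w[0]).ToList();

foreach (var g in groups)
    Console.WriteLine($"{g.Key}: {string.Join(", ", g)}");
// p: pear, plum
// a: apple, avocado
// b: banana
```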
Any 'group by' or 'order by' will possibly change the order.
The question here is specifically referring to LINQ-to-Objects.
If you're using LINQ-to-SQL instead, there is no order unless you impose one with something like:
mysqlresult.OrderBy(e=>e.SomeColumn)
If you do not do this with LINQ-to-SQL, then the order of results can vary between subsequent queries, even over the same data, which could cause an intermittent bug.

How to use Orderby Clause with IEnumerable

I have written following code:
IEnumerable<Models.bookings> search = new List<bookings>();
search = new available_slotsRepositories().GetAvailableSlot(param1,param2);
var data = from s in search.AsEnumerable().
OrderByDescending(c => c.BookingDate)
select s;
I have also tried this, and it does not work:
search.OrderByDescending(c => c.BookingDate);
The third line gives me the following error:
Expression cannot contain lambda expressions
Can anyone guide me on how I can fix this issue?
Any help would be appreciated.
Thank you!
Why are you using new List()?
Follow the pattern below:
IEnumerable<Step> steps = allsteps.Where(step => step.X <= Y);
steps = steps.OrderBy(step => step.X);
NOTE:
IEnumerable makes no guarantees about ordering, but the implementations that use IEnumerable may or may not guarantee ordering.
For instance, if you enumerate List, order is guaranteed, but if you enumerate HashSet no such guarantee is provided, yet both will be enumerated using the IEnumerable interface
Perhaps you are looking for the IOrderedEnumerable interface? It is returned by extension methods like OrderBy() and allows for subsequent sorting with ThenBy().
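One thing worth checking in code like the question's second attempt is whether the result of the ordering call is kept. A rough sketch, written as C# 9 top-level statements, with a plain list of dates standing in for the bookings (the data and variable names are assumptions, not the asker's types): LINQ operators return a new sequence and leave the source untouched, so the return value has to be assigned.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the booking dates returned by GetAvailableSlot.
IEnumerable<DateTime> bookingDates = new List<DateTime>
{
    new DateTime(2013, 1, 5),
    new DateTime(2013, 3, 1),
    new DateTime(2013, 2, 10),
};

// No visible effect: the ordered sequence is created and immediately discarded.
bookingDates.OrderByDescending(d => d);

// Keep the result. OrderByDescending returns an IOrderedEnumerable<T>.
IOrderedEnumerable<DateTime> sorted = bookingDates.OrderByDescending(d => d);

Console.WriteLine(bookingDates.First().ToString("yyyy-MM-dd")); // 2013-01-05
Console.WriteLine(sorted.First().ToString("yyyy-MM-dd"));       // 2013-03-01
```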
Have you tried this?
var data = (from s in search
orderby s.BookingDate descending
select s).ToList();
That will make a List which is IEnumerable.
I'm not sure why you need "new" if as you say GetAvailableSlot returns an IEnumerable. What I think your code should look like assuming GetAvailableSlot returns IEnumerable is this:
var data = new available_slotsRepositories().GetAvailableSlot(param1, param2).ToList().OrderByDescending(c => c.BookingDate);
All you're doing to your recordset is ordering the results there is no need to have multiple variables declared. If this still doesn't work then we need to see more of the code in order to see what the problem is...

linq: separate orderby and thenby statements

I'm coding through the 101 Linq tutorials from here:
http://code.msdn.microsoft.com/101-LINQ-Samples-3fb9811b
Most of the examples are simple, but this one threw me for a loop:
[Category("Ordering Operators")]
[Description("The first query in this sample uses method syntax to call OrderBy and ThenBy with a custom comparer to " +
"sort first by word length and then by a case-insensitive sort of the words in an array. " +
"The second two queries show another way to perform the same task.")]
public void Linq36()
{
string[] words = { "aPPLE", "AbAcUs", "bRaNcH", "BlUeBeRrY", "ClOvEr", "cHeRry", "b1" };
var sortedWords =
words.OrderBy(a => a.Length)
.ThenBy(a => a, new CaseInsensitiveComparer());
// Another way. TODO is this use of ThenBy correct? It seems to work on this sample array.
var sortedWords2 =
from word in words
orderby word.Length
select word;
var sortedWords3 = sortedWords2.ThenBy(a => a, new CaseInsensitiveComparer());
No matter which combination of words I throw at it the length is always the first ordering criteria ... even though I don't know how the second statement (with no orderby!) knows what the original orderby clause was.
Am I going crazy? Can anyone explain how Linq "remembers" what the original ordering was?
The return type of OrderBy is not IEnumerable<T>. It's IOrderedEnumerable<T>. This is an object that "remembers" all of the orderings it's been given, and as long as you don't call another method that turns the variable back into an IEnumerable it will retain that knowledge.
See Jon Skeet's wonderful blog series Edulinq, in which he re-implements LINQ to Objects, for more info. The key entries on OrderBy/ThenBy are:
IOrderedEnumerable
OrderBy, OrderByDescending, ThenBy, ThenByDescending
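A short sketch of how the static type carries the ordering, written as C# 9 top-level statements. It uses StringComparer.OrdinalIgnoreCase in place of the sample's custom CaseInsensitiveComparer class, which is a substitution made here to keep the example self-contained. The query-syntax form with orderby also compiles down to a plain OrderBy call, which is why ThenBy can be applied to it afterwards.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

string[] words = { "aPPLE", "AbAcUs", "bRaNcH", "BlUeBeRrY", "ClOvEr", "cHeRry", "b1" };

// Query syntax with orderby yields an IOrderedEnumerable<string>.
IOrderedEnumerable<string> byLength =
    from word in words
    orderby word.Length
    select word;

// ThenBy is defined on IOrderedEnumerable<T>; it adds a secondary key
// to the ordering that byLength already describes.
IOrderedEnumerable<string> byLengthThenWord =
    byLength.ThenBy(w => w, StringComparer.OrdinalIgnoreCase);

// Once the value is typed as plain IEnumerable<string>, ThenBy is unavailable.
IEnumerable<string> plain = byLength;
// plain.ThenBy(w => w);   // would not compile

Console.WriteLine(string.Join(", ", byLengthThenWord));
// b1, aPPLE, AbAcUs, bRaNcH, cHeRry, ClOvEr, BlUeBeRrY
```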
This is because LINQ is lazy: all the evaluation only happens when you enumerate the sequence, at which point the query that has been constructed gets executed.
Your question really doesn't make much sense on the surface because you're not considering the nature of the deferred execution. It doesn't "remember" in either case truthfully, it simply isn't executed until it's really needed. If you run over your examples in the debugger you will find that these generate identical (structurally anyway) statements. Consider:
var sortedWords =
words.OrderBy(a => a.Length)
.ThenBy(a => a, new CaseInsensitiveComparer());
You've explicitly told it to OrderBy, ThenBy. Each statement is stacked on until they're all complete, and the final query is constructed to look like (pseudo):
Select from sorted words, order by length, order by comparer
Then, once that is all ready to go, it is executed when sortedWords is enumerated. Now consider:
var sortedWords2 =
from word in words
orderby word.Length // You're telling it to sort here
select word;
// Now you're telling it to ThenBy here
var sortedWords3 = sortedWords2.ThenBy(a => a, new CaseInsensitiveComparer());
And then once those queries are stacked up it will be executed. However, it WON'T be executed until you NEED them. sortedWords3 won't really have any value until you act on it because the need for it is deferred. So in both cases, you're basically saying to the compiler:
Wait until I'm done building my query
Select from source
Order by length
Then by comparer
Ok do your stuff.
Note: To sum up, LINQ doesn't "remember", it simply doesn't execute until you're done giving it instructions to execute. Then it stacks them up into a query and runs them all at once when they're needed.
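Deferred execution can be observed directly. In this sketch (C# 9 top-level statements, with made-up numbers), the source list is modified after the query is defined but before it is enumerated, and the added element still shows up in the sorted result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 3, 1, 2 };

// Building the query does not sort anything yet.
IOrderedEnumerable<int> query = numbers.OrderBy(n => n);

// Change the source after the query was defined but before it is enumerated.
numbers.Add(0);

// Enumeration (here via ToList) runs the sort over the current contents.
List<int> result = query.ToList();

Console.WriteLine(string.Join(", ", result)); // 0, 1, 2, 3
```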

LINQ Intersection of two different types

I have two lists of different types. I need to remove the elements from list1 that are not in list2, keeping only those whose list2 element satisfies certain criteria.
Here is what I tried, seems to work but each element is listed twice.
var filteredTracks =
from mtrack in mTracks
join ftrack in tracksFileStatus on mtrack.id equals ftrack.Id
where mtrack.id == ftrack.Id && ftrack.status == "ONDISK" && ftrack.content_type_id == 234
select mtrack;
Ideally I don't want to create a new copy of the filteredTracks, is it possible modify mTracks in place?
If you're getting duplicates, it's because your id fields are not unique in one or both of the two sequences. Also, you don't need to say where mtrack.id == ftrack.Id since that condition already has to be met for the join to succeed.
I would probably use loops here, but if you are dead set on LINQ, you may need to group tracksFileStatus by its Id field. It's hard to tell by what you posted.
As far as "modifying mTracks in place", this is probably not possible or worthwhile (I'm assuming that mTracks is some type derived from IEnumerable<T>). If you're worried about the efficiency of this approach, then you may want to consider using another kind of data structure, like a dictionary with Id values as the keys.
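One way to sketch the keyed-lookup idea, written as C# 9 top-level statements. The tuple shapes and sample rows are hypothetical stand-ins for the asker's types. The qualifying ids are collected into a HashSet, which holds each id once, so duplicate ids in tracksFileStatus no longer duplicate tracks. If mTracks happens to be a List<T>, which is an assumption here, RemoveAll can then drop the non-matching tracks in place.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the two lists in the question.
var mTracks = new List<(int Id, string Title)>
{
    (1, "one"), (2, "two"), (3, "three"),
};
var tracksFileStatus = new List<(int Id, string Status, int ContentTypeId)>
{
    (1, "ONDISK", 234),
    (1, "ONDISK", 234),   // duplicate id: a join would yield track 1 twice
    (2, "MISSING", 234),
    (3, "ONDISK", 234),
};

// Collect the ids that satisfy the criteria; a set stores each id only once.
var wantedIds = new HashSet<int>(
    tracksFileStatus
        .Where(f => f.Status == "ONDISK" && f.ContentTypeId == 234)
        .Select(f => f.Id));

// Assumes mTracks is a List<T>: RemoveAll filters the existing list in place.
mTracks.RemoveAll(m => !wantedIds.Contains(m.Id));

Console.WriteLine(string.Join(", ", mTracks.Select(m => m.Id))); // 1, 3
```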
Since the Q was about lists primarily...
this is probably better linq wise...
var test = (from m in mTracks
from f in fTracks
where m.Id == f.Id && ...
select m);
However you should optimize, e.g.
Are your lists sorted? If they are, see e.g. Best algorithm for synchronizing two IList in C# 2.0
If it's coming from Db (it's not clear here), then you need to build your linq query based on the SQL / relations and indexes you have in the Db and go a bit different route.
If I were you, I'd make a query (for each of the lists, presuming it's not Db bound) so that tracks are sorted in the first place (and sort on whatever is used to compare them, usually),
then enumerate in parallel (using enumerators), comparing other things in the process (like in that link).
that's likely the most efficient way.
if/when it comes from database, optimize at the 'source' - i.e. fetch data already sorted and filtered as much as you can. And basically, build an SQL first, or inspect the returned SQL from the linq query (let me know if you need the link).

linq orderby.tolist() performance

I have an ordering query on a List that I call many times.
list = list.OrderBy().ToList();
In this code, the ToList() call consumes a lot of resources and takes a very long time. How can I speed this up with another ordering method that does not convert back to a list? Should I use the .Sort extension for arrays?
First of all, try to sort the list once, and keep it sorted.
To speed up things you can use Parallel LINQ.
see: http://msdn.microsoft.com/en-us/magazine/cc163329.aspx
An OrderBy() Parallel looks like this:
var query = data.AsParallel().Where(x => p(x)).OrderBy(x => k(x)).ToList();
You only need to call ToList() once to get your sorted list. All future actions should use sortedList.
sortedList = list.OrderBy().ToList();
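On the question of Sort: if the data is already held in a List<T>, List<T>.Sort reorders the existing list instead of building a new one. A minimal sketch as C# 9 top-level statements, with made-up integers and a simple comparison chosen for illustration; whether it is actually faster for a given workload would need measuring. Note that, unlike OrderBy, List<T>.Sort is not a stable sort.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var list = new List<int> { 5, 3, 8, 1 };

// OrderBy + ToList allocates a new list on every call.
List<int> copy = list.OrderBy(x => x).ToList();

// List<T>.Sort reorders the existing list in place; no new list is created.
list.Sort((a, b) => a.CompareTo(b));

Console.WriteLine(string.Join(", ", copy)); // 1, 3, 5, 8
Console.WriteLine(string.Join(", ", list)); // 1, 3, 5, 8
```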
