linq: separate orderby and thenby statements - c#

I'm coding through the 101 Linq tutorials from here:
http://code.msdn.microsoft.com/101-LINQ-Samples-3fb9811b
Most of the examples are simple, but this one threw me for a loop:
[Category("Ordering Operators")]
[Description("The first query in this sample uses method syntax to call OrderBy and ThenBy with a custom comparer to " +
    "sort first by word length and then by a case-insensitive sort of the words in an array. " +
    "The second two queries show another way to perform the same task.")]
public void Linq36()
{
    string[] words = { "aPPLE", "AbAcUs", "bRaNcH", "BlUeBeRrY", "ClOvEr", "cHeRry", "b1" };
    var sortedWords =
        words.OrderBy(a => a.Length)
             .ThenBy(a => a, new CaseInsensitiveComparer());
    // Another way. TODO is this use of ThenBy correct? It seems to work on this sample array.
    var sortedWords2 =
        from word in words
        orderby word.Length
        select word;
    var sortedWords3 = sortedWords2.ThenBy(a => a, new CaseInsensitiveComparer());
No matter which combination of words I throw at it, the length is always the first ordering criterion, even though I don't know how the second statement (with no orderby!) knows what the original orderby clause was.
Am I going crazy? Can anyone explain how Linq "remembers" what the original ordering was?

The return type of OrderBy is not just IEnumerable<T>; it's IOrderedEnumerable<T>. This is an object that "remembers" all of the orderings it's been given, and as long as you don't call another method that turns it back into a plain IEnumerable<T>, it will retain that knowledge.
See Jon Skeet's wonderful blog series Edulinq, in which he re-implements LINQ to Objects, for more info. The key entries on OrderBy/ThenBy are:
IOrderedEnumerable
OrderBy, OrderByDescending, ThenBy, ThenByDescending
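As a minimal sketch of the point above, the following shows that the query expression from the question has the static type IOrderedEnumerable<string>, which is what makes the later ThenBy call legal. The class and method names are hypothetical, and StringComparer.OrdinalIgnoreCase stands in for the sample's custom CaseInsensitiveComparer, which is not shown in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderedEnumerableDemo
{
    public static List<string> SortWords(string[] words)
    {
        // The query expression compiles to words.OrderBy(word => word.Length),
        // so its static type is IOrderedEnumerable<string>.
        IOrderedEnumerable<string> byLength =
            from word in words
            orderby word.Length
            select word;

        // ThenBy is only defined on IOrderedEnumerable<T>; it adds a secondary key
        // to the existing ordering instead of starting a new sort.
        // Assumption: OrdinalIgnoreCase replaces the sample's CaseInsensitiveComparer.
        IOrderedEnumerable<string> byLengthThenText =
            byLength.ThenBy(word => word, StringComparer.OrdinalIgnoreCase);

        // Once the static type is plain IEnumerable<string>, ThenBy is unavailable:
        // IEnumerable<string> plain = byLength;
        // plain.ThenBy(word => word);   // does not compile

        return byLengthThenText.ToList();
    }

    public static void Main()
    {
        string[] words = { "aPPLE", "AbAcUs", "bRaNcH", "BlUeBeRrY", "ClOvEr", "cHeRry", "b1" };
        Console.WriteLine(string.Join(", ", SortWords(words)));
        // b1, aPPLE, AbAcUs, bRaNcH, cHeRry, ClOvEr, BlUeBeRrY
    }
}
```

The commented-out lines are the interesting part: the "memory" lives in the ordered object and its static type, so widening the variable to IEnumerable<string> hides ThenBy from the compiler.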

This is because LINQ is lazy: all the evaluation happens only when you enumerate the sequence, at which point the query that has been built up gets executed.

Your question really doesn't make much sense on the surface because you're not considering the nature of the deferred execution. It doesn't "remember" in either case truthfully, it simply isn't executed until it's really needed. If you run over your examples in the debugger you will find that these generate identical (structurally anyway) statements. Consider:
var sortedWords =
    words.OrderBy(a => a.Length)
         .ThenBy(a => a, new CaseInsensitiveComparer());
You've explicitly told it to OrderBy, then ThenBy. Each statement is stacked on until they're all complete, and the final query is constructed to look like (pseudo):
Select from sorted words, order by length, order by comparer
Then once that is all ready to go it is executed and placed into sortedWords. Now consider:
var sortedWords2 =
    from word in words
    orderby word.Length // You're telling it to sort here
    select word;
// Now you're telling it to ThenBy here
var sortedWords3 = sortedWords2.ThenBy(a => a, new CaseInsensitiveComparer());
And then once those queries are stacked up it will be executed. However, it WON'T be executed until you NEED them. sortedWords3 won't really have any value until you act on it because the need for it is deferred. So in both cases, you're basically saying to the compiler:
Wait until I'm done building my query
Select from source
Order by length
Then by comparer
Ok do your stuff.
Note: To sum up, LINQ doesn't "remember", it simply doesn't execute until you're done giving it instructions to execute. Then it stacks them up into a query and runs them all at once when they're needed.
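To illustrate the deferral described above, here is a small sketch that counts how often the key selector runs before and after the query is enumerated. The class and method names are hypothetical, and StringComparer.OrdinalIgnoreCase is again an assumed stand-in for the sample's CaseInsensitiveComparer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Returns the number of key-selector calls seen before and after enumeration.
    public static (int Before, int After) CountKeySelectorCalls(string[] words)
    {
        int calls = 0;

        // Building the query only records what to do; no sorting happens here.
        IOrderedEnumerable<string> query = words
            .OrderBy(w => { calls++; return w.Length; })
            .ThenBy(w => w, StringComparer.OrdinalIgnoreCase);

        int before = calls;

        // Enumerating (here via ToList) is what actually runs the sort.
        _ = query.ToList();
        int after = calls;

        return (before, after);
    }

    public static void Main()
    {
        string[] words = { "aPPLE", "AbAcUs", "bRaNcH", "BlUeBeRrY", "ClOvEr", "cHeRry", "b1" };
        var (before, after) = CountKeySelectorCalls(words);
        Console.WriteLine($"key selector calls before enumeration: {before}");
        Console.WriteLine($"key selector calls after enumeration:  {after}");
    }
}
```

The count stays at zero until ToList runs, which is consistent with the query being a description of work rather than a result.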

Related

Orderby C# string record

I have the following OrderBy for records read from the database, which are then used to build a string.
The following code works fine, but I know it can be improved. Any suggestion is highly appreciated.
result.Sites.ForEach(x =>
{
    result.SiteDetails +=
        string.Concat(ICMSRepository.Instance.GetSiteInformationById(x.SiteInformationId).SiteCode,
            ",");
});
//Sort(Orderby) sites by string value NOT by numerical order
result.SiteDetails = result.SiteDetails.Trim(',');
List<string> siteCodes = result.SiteDetails.Split(',').ToList();
var siteCodesOrder = siteCodes.OrderBy(x => x).ToArray();
string siteCodesSorted = string.Join(", ", siteCodesOrder);
result.SiteDetails = siteCodesSorted;
That's a little convoluted, yeah.
All we need to do is select out the SiteCode as a string, sort with OrderBy, then join the results. Since String.Join has an overload that works with IEnumerable<string>, we don't need to convert to an array in the middle.
What we end up with is a single statement for assigning to your SiteDetails member:
result.SiteDetails = string.Join(", ",
    result.Sites
        .Select(x => $"{ICMSRepository.Instance.GetSiteInformationById(x.SiteInformationId).SiteCode}")
        .OrderBy(x => x)
);
(Or you could use .ToString() instead of $"{...}")
This is the general process for most transforms in LINQ. Figure out what your inputs are, what you need to do with them, and how the outputs should look.
If you're using LINQ it's uncommon that you will have to build and manipulate intermediary lists unless you're doing something quite complex. For simple tasks like sorting a sequence of values there is almost never a reason to put them into transitional collections, since the framework handles all of that for you.
And the best part is it enumerates the collection one time to get the full set of data. No more loops to pull the data out, then process, then rebuild.
One thing that will improve performance is to get rid of the .ToList() and the .ToString(). Neither is necessary, and both just take up extra processing time and memory.
Go with Corey's answer, which this is a variant of, but I thought I'd offer a slightly clearer way to express the query:
result.SiteDetails =
    String.Join(", ",
        from x in result.Sites
        let sc = ICMSRepository.Instance.GetSiteInformationById(x.SiteInformationId).SiteCode
        orderby sc
        select sc);
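For anyone who wants to try the single-statement version without the original repository, here is a self-contained sketch. The Site class and the dictionary lookup are hypothetical stand-ins for result.Sites and ICMSRepository.Instance.GetSiteInformationById(...).SiteCode, and the sample codes are made up; StringComparer.Ordinal is used so the "string value, not numerical order" behaviour from the question is deterministic.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the items in result.Sites.
public sealed class Site
{
    public int SiteInformationId { get; set; }
}

public static class SiteCodeDemo
{
    // Hypothetical replacement for the repository lookup used in the question.
    private static readonly Dictionary<int, string> SiteCodes = new Dictionary<int, string>
    {
        { 1, "9" }, { 2, "10" }, { 3, "100" }, { 4, "2" }
    };

    public static string BuildSiteDetails(IEnumerable<Site> sites)
    {
        // Project to the code, sort as strings, and join; no intermediate list or array.
        return string.Join(", ",
            sites.Select(s => SiteCodes[s.SiteInformationId])
                 .OrderBy(code => code, StringComparer.Ordinal));
    }

    public static void Main()
    {
        var sites = new[] { 1, 2, 3, 4 }.Select(id => new Site { SiteInformationId = id });
        Console.WriteLine(BuildSiteDetails(sites));
        // 10, 100, 2, 9
    }
}
```

Note that "10" sorts before "9" here because the comparison is character by character, which matches the comment in the original code about sorting by string value.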

ordering of OrderBy, Where, Select in the Linq query

Considering this sample code
System.Collections.ArrayList fruits = new System.Collections.ArrayList();
fruits.Add("mango");
fruits.Add("apple");
fruits.Add("lemon");
IEnumerable<string> query = fruits.Cast<string>()
    .OrderBy(fruit => fruit)
    .Where(fruit => fruit.StartsWith("m"))
    .Select(fruit => fruit);
I have two questions:
Do I need to write the last Select clause if Where returns the same type by itself? The example is from msdn, why do they always write it?
What is the correct order of these methods? Does the order affect something? What if I swap Select and Where, or OrderBy?
No, the Select is not necessary if you are not actually transforming the returned type.
In this case, the ordering of the method calls could have an impact on performance. Sorting all the objects before filtering is sure to take longer than filtering and then sorting a smaller data set.
The .Select is unnecessary in this case because .Cast already guarantees that you're working with IEnumerable<string>.
The ordering of .OrderBy and .Where doesn't affect the results of the query, but in general if you use .Where first you'll get better performance because there will be fewer elements to sort.
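A quick way to convince yourself of both answers is to run the two orderings side by side. This is only a sketch: the class and method names are hypothetical, "melon" is added to the sample data so the filter keeps more than one element, and ordinal comparisons are used to keep the result independent of the current culture.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FilterThenSortDemo
{
    public static (List<string> SortFirst, List<string> FilterFirst) Run(IEnumerable<string> fruits)
    {
        // Sort everything, then throw most of it away.
        List<string> sortFirst = fruits
            .OrderBy(f => f, StringComparer.Ordinal)
            .Where(f => f.StartsWith("m", StringComparison.Ordinal))
            .ToList();

        // Filter first, so only the surviving elements are sorted.
        List<string> filterFirst = fruits
            .Where(f => f.StartsWith("m", StringComparison.Ordinal))
            .OrderBy(f => f, StringComparer.Ordinal)
            .ToList();

        return (sortFirst, filterFirst);
    }

    public static void Main()
    {
        var fruits = new List<string> { "mango", "apple", "lemon", "melon" };
        var (sortFirst, filterFirst) = Run(fruits);
        Console.WriteLine(string.Join(", ", sortFirst));   // mango, melon
        Console.WriteLine(string.Join(", ", filterFirst)); // mango, melon
    }
}
```

Both queries return the same sequence; the difference is only in how many elements reach the sort.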

How to use Orderby Clause with IEnumerable

I have written following code:
IEnumerable<Models.bookings> search = new List<bookings>();
search = new available_slotsRepositories().GetAvailableSlot(param1,param2);
var data = from s in search.AsEnumerable().
               OrderByDescending(c => c.BookingDate)
           select s;
I have also tried this, and it does not work:
search.OrderByDescending(c => c.BookingDate);
The third line gives me the following error:
Expression cannot contain lambda expressions
Can anyone guide me on how I can fix this issue?
Any help would be appreciated.
Thank you!
Why are you using new List()?
Follow the pattern below:
IEnumerable<Step> steps = allsteps.Where(step => step.X <= Y);
steps = steps.OrderBy(step => step.X);
NOTE:
IEnumerable itself makes no guarantees about ordering; a given implementation of IEnumerable may or may not guarantee it.
For instance, if you enumerate a List, order is guaranteed, but if you enumerate a HashSet no such guarantee is provided, yet both are enumerated through the IEnumerable interface.
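The reassignment in the pattern above is the important part: OrderBy and OrderByDescending return a new sequence and leave the source untouched, so a bare call whose result is ignored appears to do nothing. A minimal sketch, with a hypothetical Booking class standing in for the question's bookings model:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for Models.bookings.
public sealed class Booking
{
    public DateTime BookingDate { get; set; }
}

public static class OrderingResultDemo
{
    public static List<DateTime> NewestFirst(IEnumerable<Booking> search)
    {
        // A bare call like this has no visible effect; the sorted sequence is discarded:
        // search.OrderByDescending(b => b.BookingDate);

        // Keep the returned sequence instead.
        IEnumerable<Booking> ordered = search.OrderByDescending(b => b.BookingDate);
        return ordered.Select(b => b.BookingDate).ToList();
    }

    public static void Main()
    {
        var search = new List<Booking>
        {
            new Booking { BookingDate = new DateTime(2020, 1, 15) },
            new Booking { BookingDate = new DateTime(2021, 6, 1) },
            new Booking { BookingDate = new DateTime(2019, 3, 9) },
        };

        foreach (DateTime date in NewestFirst(search))
        {
            Console.WriteLine(date.ToString("yyyy-MM-dd"));
        }
        // 2021-06-01
        // 2020-01-15
        // 2019-03-09
    }
}
```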
Perhaps you are looking for the IOrderedEnumerable interface? It is returned by extension methods like OrderBy() and allows for subsequent sorting with ThenBy().
Have you tried:
var data = (from s in search
            orderby s.BookingDate descending
            select s).ToList();
That will make a List, which is an IEnumerable.
I'm not sure why you need the extra "new List" if, as you say, GetAvailableSlot returns an IEnumerable. Assuming GetAvailableSlot returns IEnumerable, I think your code should look like this:
var data = new available_slotsRepositories().GetAvailableSlot(param1,param2).ToList().OrderByDescending(c => c.BookingDate);
All you're doing to your recordset is ordering the results, so there is no need to declare multiple variables. If this still doesn't work, then we need to see more of the code in order to see what the problem is.

Get indexes of all matching values from list using Linq

Hey Linq experts out there,
I just asked a very similar question and know the solution is probably SUPER easy, but still find myself not being able to wrap my head around how to do this fairly simple task in the most efficient manner using linq.
My basic scenario is that I have a list of values, for example, say:
Lst1:
a
a
b
b
c
b
a
c
a
And I want to create a new list that will hold all the indexes from Lst1 where, say, the value = "a".
So, in this example, we would have:
LstIndexes:
0
1
6
8
Now, I know I can do this with Loops (which I would rather avoid in favor of Linq) and I even figured out how to do this with Linq in the following way:
LstIndexes= Lst1.Select(Function(item As String, index As Integer) index) _
.Where(Function(index As Integer) Lst1(index) = "a").ToList
My challenge with this is that it iterates over the list twice and is therefore inefficient.
How can I get my result in the most efficient way using Linq?
Thanks!!!!
First off, your code doesn't actually iterate over the list twice; it only iterates it once.
That said, your Select is really just getting a sequence of all of the indexes; that is more easily done with Enumerable.Range:
var result = Enumerable.Range(0, lst1.Count)
    .Where(i => lst1[i] == "a")
    .ToList();
Understanding why the list isn't actually iterated twice will take some getting used to. I'll try to give a basic explanation.
You should think of most of the LINQ methods, such as Select and Where, as a pipeline. Each method does some tiny bit of work. In the case of Select, you give it a method, and it essentially says, "Whenever someone asks me for my next item, I'll first ask my input sequence for an item, then use the method I have to convert it into something else, and then give that item to whoever is using me." Where, more or less, is saying, "Whenever someone asks me for an item, I'll ask my input sequence for an item; if the function says it's good I'll pass it on, and if not I'll keep asking for items until I get one that passes."
So when you chain them, what happens is that ToList asks for the first item; it goes to Where to ask it for its first item, Where goes to Select and asks it for its first item, and Select goes to the list to ask it for its first item. The list then provides its first item. Select then transforms that item into what it needs to spit out (in this case, just the int 0) and gives it to Where. Where takes that item and runs its function, which determines that it's true, and so spits out 0 to ToList, which adds it to the list. That whole thing then happens 8 more times, once for each remaining item. This means that Select will end up asking for each item from the list exactly once, and it will feed each of its results directly to Where, which will feed the results that "pass the test" directly to ToList, which stores them in a list. All of the LINQ methods are carefully designed to only ever iterate the source sequence once (when they are iterated once).
Note that, while this seems complicated at first to you, it's actually pretty easy for the computer to do all of this. It's not actually as performance intensive as it may seem at first.
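The pipeline described above can be made visible by logging each stage. The sketch below is only an illustration: the class, the method, and the log messages are hypothetical, and the source is wrapped in a local iterator so that each request for an item is recorded.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PipelineTraceDemo
{
    public static (List<int> Indexes, List<string> Log) FindIndexes(List<string> items, string target)
    {
        var log = new List<string>();

        // Wraps the list so that every item handed to the pipeline is logged.
        IEnumerable<string> Source()
        {
            foreach (string item in items)
            {
                log.Add($"source yields {item}");
                yield return item;
            }
        }

        List<int> indexes = Source()
            .Select((item, index) => { log.Add($"Select -> {index}"); return index; })
            .Where(index => { log.Add($"Where tests {index}"); return items[index] == target; })
            .ToList();

        return (indexes, log);
    }

    public static void Main()
    {
        var lst1 = new List<string> { "a", "a", "b", "b", "c", "b", "a", "c", "a" };
        var (indexes, log) = FindIndexes(lst1, "a");

        Console.WriteLine(string.Join(", ", indexes)); // 0, 1, 6, 8
        foreach (string line in log.Take(6))
        {
            Console.WriteLine(line);
        }
        // source yields a
        // Select -> 0
        // Where tests 0
        // source yields a
        // Select -> 1
        // Where tests 1
    }
}
```

The log entries interleave stage by stage for each item, rather than showing one full pass for Select followed by another for Where.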
This works, but arguably not as neat.
var result = list1.Select((x, i) => new {x, i})
    .Where(x => x.x == "a")
    .Select(x => x.i);
How about this one? It works fine for me.
static void Main(string[] args)
{
    List<char> Lst1 = new List<char>();
    Lst1.Add('a');
    Lst1.Add('a');
    Lst1.Add('b');
    Lst1.Add('b');
    Lst1.Add('c');
    Lst1.Add('b');
    Lst1.Add('a');
    Lst1.Add('c');
    Lst1.Add('a');
    var result = Lst1.Select((c, i) => new { character = c, index = i })
                     .Where(item => item.character == 'a')
                     .Select(item => item.index) // keep only the matching indexes
                     .ToList();
}

How to use LINQ to SQL to create ranked search results?

I am looking for a way to use L2S to return ranked results based on keywords.
I would like to take a keyword and be able to search the table for that keyword using .Contains(). The trick that I haven't been able to figure out is how to get a count of how many times that keyword appears, and then .OrderByDescending() based on that count.
So if I had something like:
string keyword = "SomeKeyword";
IQueryable<Article> searchResults = from a in GenesisRepository.Article
                                    where a.Body.Contains(keyword)
                                    select a;
What is the best way to order searchResults based on the number of times keyword appears in a.Body?
Thanks for any help.
Try inserting orderby a.Body.Split(' ').Count(w => w == keyword). That should allow you to see that the concept works. However, I STRONGLY recommend that the final version include this as part of the select projection, possibly using a key-value pair, and order by the property name:
string keyword = "SomeKeyword";
//EDIT: restructured query to force the ordering to be done on the projection,
//not the source.
IQueryable<Article> searchResults = (from a in GenesisRepository.Article
                                     where a.Body.Contains(keyword)
                                     select new KeyValuePair<int, Article>(
                                         a.Body.Split(' ').Count(w => w == keyword), a))
                                    .OrderByDescending(kvp => kvp.Key) // most occurrences first
                                    .Select(kvp => kvp.Value);         // project back to Article to match the declared type
The reason is performance; the Split().Count() method chain is linear-complexity, and will be evaluated for every comparison of two values, making the overall sort N^2logN complexity (slow).
EDIT: Also, understand that a.Body.Contains(keyword) will not search by whole words, and so will return articles that contain "SomeKeywordLongerThanSearch" and "ThisIsSomeKeyword" as well as "SomeKeyword". You can avoid this with a Regex match on the pattern "\bSomeKeyword\b", which will only match instances of SomeKeyword with a word boundary immediately before and after.
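Here is a rough in-memory sketch of the whole-word counting idea. It runs against LINQ to Objects, not LINQ to SQL; I would not assume that a Regex call can be translated into SQL, so with a real L2S source you would probably need to bring the candidate rows into memory first. The Article class, its Title property, and the sample data are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the Article entity; only Body comes from the question.
public sealed class Article
{
    public string Title { get; set; } = "";
    public string Body { get; set; } = "";
}

public static class RankedSearchDemo
{
    public static List<string> RankTitles(IEnumerable<Article> articles, string keyword)
    {
        // \b anchors the match to word boundaries; Regex.Escape guards against
        // keywords that contain regex metacharacters.
        var wholeWord = new Regex(@"\b" + Regex.Escape(keyword) + @"\b");

        return articles
            .Select(a => new { Article = a, Hits = wholeWord.Matches(a.Body).Count })
            .Where(x => x.Hits > 0)            // drop articles with no whole-word match
            .OrderByDescending(x => x.Hits)    // most occurrences first
            .Select(x => x.Article.Title)
            .ToList();
    }

    public static void Main()
    {
        var articles = new List<Article>
        {
            new Article { Title = "A", Body = "SomeKeyword appears once" },
            new Article { Title = "B", Body = "SomeKeyword and SomeKeyword again, plus SomeKeywordLonger" },
            new Article { Title = "C", Body = "ThisIsSomeKeyword only" },
        };

        Console.WriteLine(string.Join(", ", RankTitles(articles, "SomeKeyword")));
        // B, A
    }
}
```

Article C is excluded because its only occurrence is embedded in a longer word, which is the case the EDIT above warns about.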
This is a little hack I came up with, pretty simple but definitely not a "best practices" one.
IQueryable<Article> searchResults = from a in GenesisRepository.Article
                                    where a.Body.Contains(keyword)
                                    orderby a.Body.Split(new string[] { keyword }, StringSplitOptions.RemoveEmptyEntries).Count() descending
                                    select a;
Maybe this will work...
IQueryable<Article> searchResults = from a in GenesisRepository.Article
                                    where a.Body.Contains(keyword)
                                    select a;
// OrderByDescending returns a new query, so assign the result back.
searchResults = searchResults.OrderByDescending(a => Regex.Matches(a.Body, keyword).Count);
