C# LINQ - Order by alphabetical, then by certain value [duplicate]

Can anyone explain what the difference is between:
tmp = invoices.InvoiceCollection
.OrderBy(sort1 => sort1.InvoiceOwner.LastName)
.OrderBy(sort2 => sort2.InvoiceOwner.FirstName)
.OrderBy(sort3 => sort3.InvoiceID);
and
tmp = invoices.InvoiceCollection
.OrderBy(sort1 => sort1.InvoiceOwner.LastName)
.ThenBy(sort2 => sort2.InvoiceOwner.FirstName)
.ThenBy(sort3 => sort3.InvoiceID);
Which is the correct approach if I wish to order by 3 items of data?

You should definitely use ThenBy rather than multiple OrderBy calls.
I would suggest this:
tmp = invoices.InvoiceCollection
.OrderBy(o => o.InvoiceOwner.LastName)
.ThenBy(o => o.InvoiceOwner.FirstName)
.ThenBy(o => o.InvoiceID);
Note how you can use the same name each time. This is also equivalent to:
tmp = from o in invoices.InvoiceCollection
orderby o.InvoiceOwner.LastName,
o.InvoiceOwner.FirstName,
o.InvoiceID
select o;
If you call OrderBy multiple times, it will effectively reorder the sequence completely three times... so the final call will effectively be the dominant one. You can (in LINQ to Objects) write
foo.OrderBy(x).OrderBy(y).OrderBy(z)
which would be equivalent to
foo.OrderBy(z).ThenBy(y).ThenBy(x)
as the sort order is stable, but you absolutely shouldn't:
It's hard to read
It doesn't perform well (because it reorders the whole sequence)
It may well not work in other providers (e.g. LINQ to SQL)
It's basically not how OrderBy was designed to be used.
The point of OrderBy is to provide the "most important" ordering projection; then use ThenBy (repeatedly) to specify secondary, tertiary etc ordering projections.
Effectively, think of it this way: OrderBy(...).ThenBy(...).ThenBy(...) allows you to build a single composite comparison for any two objects, and then sort the sequence once using that composite comparison. That's almost certainly what you want.
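To make the difference concrete, here is a minimal sketch for LINQ to Objects. The Invoice class, its flattened properties, and the sample data are hypothetical and exist only for illustration; they are not the types from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified invoice type used only for this illustration.
public class Invoice
{
    public string LastName { get; set; } = "";
    public string FirstName { get; set; } = "";
    public int InvoiceID { get; set; }
}

public static class OrderingDemo
{
    // Made-up sample data.
    public static List<Invoice> Sample() => new List<Invoice>
    {
        new Invoice { LastName = "Smith", FirstName = "Zoe",  InvoiceID = 3 },
        new Invoice { LastName = "Jones", FirstName = "Adam", InvoiceID = 2 },
        new Invoice { LastName = "Smith", FirstName = "Adam", InvoiceID = 1 },
    };

    // One sort with a composite key: LastName, then FirstName, then InvoiceID.
    public static List<int> WithThenBy() => Sample()
        .OrderBy(o => o.LastName)
        .ThenBy(o => o.FirstName)
        .ThenBy(o => o.InvoiceID)
        .Select(o => o.InvoiceID)
        .ToList();

    // Three separate sorts: the last call (InvoiceID) is the dominant one.
    public static List<int> WithRepeatedOrderBy() => Sample()
        .OrderBy(o => o.LastName)
        .OrderBy(o => o.FirstName)
        .OrderBy(o => o.InvoiceID)
        .Select(o => o.InvoiceID)
        .ToList();

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", WithThenBy()));          // 2, 1, 3
        Console.WriteLine(string.Join(", ", WithRepeatedOrderBy())); // 1, 2, 3
    }
}
```

With ThenBy the result is grouped by last name first; with repeated OrderBy the result is simply in InvoiceID order, because the final call wins.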

I found this distinction annoying in trying to build queries in a generic manner, so I made a little helper to produce OrderBy/ThenBy in the proper order, for as many sorts as you like.
public class EFSortHelper
{
public static EFSortHelper<TModel> Create<TModel>(IQueryable<TModel> query)
{
return new EFSortHelper<TModel>(query);
}
}
public class EFSortHelper<TModel> : EFSortHelper
{
protected IQueryable<TModel> unsorted;
protected IOrderedQueryable<TModel> sorted;
public EFSortHelper(IQueryable<TModel> unsorted)
{
this.unsorted = unsorted;
}
public void SortBy<TCol>(Expression<Func<TModel, TCol>> sort, bool isDesc = false)
{
if (sorted == null)
{
sorted = isDesc ? unsorted.OrderByDescending(sort) : unsorted.OrderBy(sort);
unsorted = null;
}
else
{
sorted = isDesc ? sorted.ThenByDescending(sort) : sorted.ThenBy(sort);
}
}
public IOrderedQueryable<TModel> Sorted
{
get
{
return sorted;
}
}
}
There are a lot of ways you might use this depending on your use case, but if you were for example passed a list of sort columns and directions as strings and bools, you could loop over them and use them in a switch like:
var query = db.People.AsNoTracking();
var sortHelper = EFSortHelper.Create(query);
foreach(var sort in sorts)
{
switch(sort.ColumnName)
{
case "Id":
sortHelper.SortBy(p => p.Id, sort.IsDesc);
break;
case "Name":
sortHelper.SortBy(p => p.Name, sort.IsDesc);
break;
// etc
}
}
var sortedQuery = sortHelper.Sorted;
The result in sortedQuery is sorted in the desired order, instead of resorting over and over as the other answer here cautions.

If you want to sort by more than one field, use ThenBy, like this:
list.OrderBy(person => person.LastName)
.ThenBy(person => person.FirstName)

Yes, you should never chain multiple OrderBy calls when sorting by multiple keys.
ThenBy is the safer bet, since it is applied as a secondary ordering within the result of OrderBy.

Related

how to compare every value in a list with an other list by linq

I have two lists:
ListA ListB
Name Name
Marc Marc
micheal micheal
Jolie Jolie
I want to compare these two lists, A and B. If they contain the same names, it should return true. If one list has different values than the other, it should return false; if one has 2 values and the other has 3, it should return false too.
This code always returns false, even when the lists have the same values:
var SameNames = ListA.All(x => ListB.All(y => y.Name.Equals(x.Name)));
Am I doing it the wrong way?
var SameNames = ListA.All(x => ListB.All(y => y.Name.Equals(x.Name)));
What you wrote here is a requirement that all names in A match all names in B. That clearly conflicts with your example data, as the "Marc" from A does not equal "Jolie" from B.
Your current code would only return true if A and B in total only contain one distinct name. Which makes no sense for the problem you're trying to solve.
What you're actually trying to solve is if all names from A have one match in B. That would be achieved by doing:
var sameNames = listA.All(a => listB.Any(b => b.Name.Equals(a.Name)));
Note the Any. It returns true if it finds (at least) one match. This is different from All, which only returns true if all values in B are a match.
However, this is not enough yet. You've now ascertained that all names from A appear in B, but it's still possible that B contains more names that aren't in A. Take the following example:
var listA = new List<string>() { "Andy", "Bobby", "Cindy" };
var listB = new List<string>() { "Andy", "Bobby", "Cindy", "David" };
var SameNames = listA.All(a => listB.Any(b => b.Equals(a))); // the elements are plain strings here, so compare them directly
SameNames will be true, but that's because you only checked one direction. You need to check in both directions:
var listA_in_listB = listA.All(a => listB.Any(b => b.Name.Equals(a.Name)));
var listB_in_listA = listB.All(b => listA.Any(a => a.Name.Equals(b.Name)));
var sameNames = listA_in_listB && listB_in_listA;
This gives you the result you want.
Note that there are a few other variations on how to approach this.
If you can guarantee that each list does not contain any duplicates, then instead of doing the same check twice, you can simply do one of the checks and then confirm that the lists are of equal length:
var listA_in_listB = listA.All(a => listB.Any(b => b.Name.Equals(a.Name)));
var sameLength = listA.Count == listB.Count;
var sameNames = listA_in_listB && sameLength;
This is more efficient, but it does require knowing that your names are unique within each list.
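The two variants above can be sketched as follows. This is only an illustration under assumptions: the Person class, the People helper, and the sample names are hypothetical stand-ins for whatever item type the lists actually hold.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type with a Name property.
public class Person
{
    public string Name { get; set; } = "";
}

public static class NameComparison
{
    // True when every name in a occurs in b and every name in b occurs in a.
    // Duplicates are ignored: {"Andy", "Andy"} and {"Andy"} count as the same.
    public static bool SameNames(List<Person> a, List<Person> b)
    {
        bool aInB = a.All(x => b.Any(y => y.Name.Equals(x.Name)));
        bool bInA = b.All(y => a.Any(x => x.Name.Equals(y.Name)));
        return aInB && bInA;
    }

    // Shortcut: one-directional check plus equal counts.
    // Only reliable when neither list contains duplicate names.
    public static bool OneWayPlusCount(List<Person> a, List<Person> b)
    {
        bool aInB = a.All(x => b.Any(y => y.Name.Equals(x.Name)));
        return aInB && a.Count == b.Count;
    }

    // Helper to build a list from names (illustration only).
    public static List<Person> People(params string[] names) =>
        names.Select(n => new Person { Name = n }).ToList();

    public static void Main()
    {
        var listA = People("Andy", "Bobby", "Cindy");
        var listB = People("Cindy", "Andy", "Bobby");
        var listC = People("Andy", "Bobby", "Cindy", "David");
        Console.WriteLine(SameNames(listA, listB)); // True
        Console.WriteLine(SameNames(listA, listC)); // False

        // With duplicates, the shortcut gives the wrong answer.
        var dupes = People("Andy", "Andy", "Bobby");
        var other = People("Andy", "Bobby", "Cindy");
        Console.WriteLine(OneWayPlusCount(dupes, other)); // True (misleading)
        Console.WriteLine(SameNames(dupes, other));       // False
    }
}
```

The last two lines show why the count-based shortcut needs the uniqueness guarantee: both lists have three items and every name in the first appears in the second, yet "Cindy" is missing from the first.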
If order should be relevant (assuming both lists contain plain strings):
bool result = Enumerable.SequenceEqual<string>(ListA, ListB);
If order should be irrelevant:
bool result = new HashSet<string>(ListA).SetEquals(ListB);
Doing this with more elaborate LINQ is overcomplicating it; at that point you could just use foreach loops. If you want a simple and clean solution, these are for you.
Following is the code to achieve your requirement.
var isEqual = ListA.OrderBy(o => o.Name).Select(x => x.Name).SequenceEqual(ListB.OrderBy(o => o.Name).Select(x => x.Name));
You can use var SameNames = ListA.SequenceEqual(ListB). That uses the default equality of the element type (its Equals method) and compares the lists element by element, so order matters. It throws an ArgumentNullException if either list is null, so make sure your lists are not null.
Equals implementation
class YourListItem {
public string Name { get; set; }
public override bool Equals(object obj)
{
if(obj is YourListItem other) {
return this.Name == other.Name;
}
return false;
}
public override int GetHashCode() => this.Name.GetHashCode();
}
If you need custom comparison logic that does not really fit into the object's Equals method, or you need several different equality implementations, you can create another class that implements IEqualityComparer for your type and pass it as the second parameter: var SameNames = ListA.SequenceEqual(ListB, new MyEqualityComparer()).
Equality comparer
class MyEqualityComparer: IEqualityComparer<YourListItem> {
public bool Equals(YourListItem first, YourListItem second) {
return first.Name == second.Name;
}
public int GetHashCode(YourListItem obj) => obj.Name.GetHashCode();
}
Note: You should always override GetHashCode as well when you override Equals. A hash code acts like a cheap first-pass equality check: if two hash codes differ, the objects cannot be equal, so Equals is redundant and does not need to run. A lot of built-in code that uses equality comparers or Equals (hash-based collections in particular) will not call Equals at all if the hash codes differ. The default hash code is based on the object reference, so if you do not override GetHashCode, such code would treat two items as different even though they really should be equal. For your GetHashCode implementation, pick a field that you expect to differ between most objects. For a tiny object like this one it looks pointless, but you can have gigantic objects with complex Equals implementations, or generic reflection-based equality checks that are slow. Being able to skip the equality check entirely most of the time, whenever the hash codes differ, then becomes relevant.
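As a rough illustration of why the hash code matters, the sketch below puts an item into a HashSet and looks up an equal-by-name item, once with default equality and once with a name-based comparer. The Item and NameComparer types are hypothetical and are not the classes from the answer above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical item type that does NOT override Equals or GetHashCode.
public class Item
{
    public string Name { get; set; } = "";
}

// Compares items by Name only; Equals and GetHashCode agree with each other.
public class NameComparer : IEqualityComparer<Item>
{
    public bool Equals(Item x, Item y) => x?.Name == y?.Name;
    public int GetHashCode(Item obj) => obj.Name.GetHashCode();
}

public static class HashCodeDemo
{
    // Default equality is reference-based, so a second instance is not found.
    public static bool ContainsWithDefaultEquality()
    {
        var set = new HashSet<Item> { new Item { Name = "Marc" } };
        return set.Contains(new Item { Name = "Marc" });
    }

    // With the comparer, the lookup hashes by Name and then confirms with Equals.
    public static bool ContainsWithNameComparer()
    {
        var set = new HashSet<Item>(new NameComparer()) { new Item { Name = "Marc" } };
        return set.Contains(new Item { Name = "Marc" });
    }

    public static void Main()
    {
        Console.WriteLine(ContainsWithDefaultEquality()); // False
        Console.WriteLine(ContainsWithNameComparer());    // True
    }
}
```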

OrderBy on nested collections

I'm trying to sort this complex object:
Order _sut = new Order
{
OrderDataArray = new[]
{
new OrderData
{
OrderHeaderArray = new[]
{
new OrderHeader
{
SequenceNumber = 1,
OrderPositionArray = new[]
{
new OrderPositions
{
LineNumber = 3
},
new OrderPositions
{
LineNumber = 2
},
new OrderPositions
{
LineNumber = 1
}
}
}
}
}
}
};
Using the code:
[Fact]
public void Sorts_By_Sequence_Number()
{
var ordered = _sut.OrderDataArray
.OrderBy(o => o.OrderHeaderArray
.OrderBy(a => a.OrderPositionArray
.OrderBy(p => p.LineNumber)))
.ToArray();
_sut.OrderDataArray = ordered;
OutputHelper(_sut);
}
I don't understand why this doesn't work, meaning the sorting routine simply keeps initial order of LineNumber object. I've tried various things with OrderBy, but looks like it doesn't sort.
EDIT
Thank you for responses, both are correct. I have accepted poke's response as it provides a bit more detailed information on the inner workings of the OrderBy method. Basically I was missing the assignment within the loop, I was trying to sort all objects at once.
You should consider what OrderBy does. It orders a collection by the value you determine in the lambda expression and then returns an enumerable.
Your outer call is good for that:
_sut.OrderDataArray.OrderBy(o => something).ToArray();
You sort by something, and then convert the result into a (then sorted) array. There are two things that matter here: First of all, at least in your example, there is only one object in OrderDataArray, so there is no sort happening. Second, it depends on the return value of something how those objects are sorted.
So in that case, what is something? It’s the following:
o.OrderHeaderArray.OrderBy(a => somethingElse)
So regardless of somethingElse, what does this return? An IEnumerable<OrderHeader>. How do multiple enumerables compare to each other? They are not really comparable; and they especially don’t tell you anything about the order based on their content (you’d have to enumerate it first). So essentially, you order that OrderHeaderArray by “something else”, use the result which does not tell you anything about an order as the key to order the OrderDataArray. Then, you throw the sorted OrderHeaderArray away.
You do the same exactly one level deeper with the OrderPositionArray, which again will not do anything useful. The only actual useful ordering happens to the OrderPositionArray itself but that result is again thrown away.
Now, if you want to order your structure, you should do so properly, by reassigning the sorted structure to the array. So at some point, you would have to do the following:
a.OrderPositionArray = a.OrderPositionArray.OrderBy(p => p.LineNumber).ToArray();
But apart from the OrderPositionArray itself and the OrderHeader, you don’t really have anything that can be sorted (because you can’t really sort a collection by the order of a subcollection). So you could solve it like this:
foreach (OrderData data in _sut.OrderDataArray)
{
foreach (OrderHeader header in data.OrderHeaderArray)
{
header.OrderPositionArray = header.OrderPositionArray.OrderBy(p => p.LineNumber).ToArray();
}
data.OrderHeaderArray = data.OrderHeaderArray.OrderBy(h => h.SequenceNumber).ToArray();
}
Instead of Linq, you can also sort the arrays in-place, which maybe makes it a bit nicer since you are not creating new inner arrays:
var c = Comparer<int>.Default;
foreach (OrderData data in _sut.OrderDataArray)
{
foreach (OrderHeader header in data.OrderHeaderArray)
{
Array.Sort(header.OrderPositionArray, new Comparison<OrderPositions>((x, y) => c.Compare(x.LineNumber, y.LineNumber)));
}
Array.Sort(data.OrderHeaderArray, new Comparison<OrderHeader>((x, y) => c.Compare(x.SequenceNumber, y.SequenceNumber)));
}
Here,
var ordered = _sut.OrderDataArray.OrderBy(o => ...
expects the Func<OrderData, TKey>, and the values will be sorted by comparing the result of this function execution.
At the same time, you pass the result of another OrderBy, which is IOrderedEnumerable. It simply doesn't make much sense.
In order to sort all the nested collections, you can do the following:
foreach (var orderData in _sut.OrderDataArray)
{
foreach (var orderHeader in orderData.OrderHeaderArray)
{
orderHeader.OrderPositionArray = orderHeader.OrderPositionArray
.OrderBy(x => x.LineNumber).ToArray();
}
orderData.OrderHeaderArray = orderData.OrderHeaderArray
.OrderBy(x => x.SequenceNumber).ToArray();
}
_sut.OrderDataArray = _sut.OrderDataArray
.OrderBy(x => ...).ToArray();
It sorts every OrderPositionArray item by its items' LineNumber.
It sorts every OrderHeaderArray by headers' SequenceNumber.
However, it is pretty unclear how you want to sort _sut.OrderDataArray - it is marked as x => ... in the example.
It has no comparable properties which can be used for sorting.
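For reference, here is a self-contained sketch of the loop-based approach. The class definitions are assumptions reconstructed from the object initializer in the question (the original classes are not shown), and the sample values and the Describe helper are made up for the illustration.

```csharp
using System;
using System.Linq;

// Assumed shapes of the question's types; the real classes are not shown.
public class OrderPositions
{
    public int LineNumber { get; set; }
}

public class OrderHeader
{
    public int SequenceNumber { get; set; }
    public OrderPositions[] OrderPositionArray { get; set; } = Array.Empty<OrderPositions>();
}

public class OrderData
{
    public OrderHeader[] OrderHeaderArray { get; set; } = Array.Empty<OrderHeader>();
}

public static class NestedSortDemo
{
    // Sorts each inner array and assigns the sorted copy back to its owner.
    public static void SortNested(OrderData[] orderDataArray)
    {
        foreach (var data in orderDataArray)
        {
            foreach (var header in data.OrderHeaderArray)
            {
                header.OrderPositionArray = header.OrderPositionArray
                    .OrderBy(p => p.LineNumber).ToArray();
            }
            data.OrderHeaderArray = data.OrderHeaderArray
                .OrderBy(h => h.SequenceNumber).ToArray();
        }
    }

    // Renders headers as "sequence:line,line,..." joined by " | ".
    public static string Describe(OrderData data) =>
        string.Join(" | ", data.OrderHeaderArray.Select(h =>
            h.SequenceNumber + ":" +
            string.Join(",", h.OrderPositionArray.Select(p => p.LineNumber))));

    // Made-up sample: two headers out of order, each with unsorted positions.
    public static OrderData[] Sample() => new[]
    {
        new OrderData
        {
            OrderHeaderArray = new[]
            {
                new OrderHeader
                {
                    SequenceNumber = 2,
                    OrderPositionArray = new[]
                    {
                        new OrderPositions { LineNumber = 3 },
                        new OrderPositions { LineNumber = 1 },
                        new OrderPositions { LineNumber = 2 }
                    }
                },
                new OrderHeader
                {
                    SequenceNumber = 1,
                    OrderPositionArray = new[]
                    {
                        new OrderPositions { LineNumber = 5 },
                        new OrderPositions { LineNumber = 4 }
                    }
                }
            }
        }
    };

    public static void Main()
    {
        var data = Sample();
        SortNested(data);
        Console.WriteLine(Describe(data[0])); // 1:4,5 | 2:1,2,3
    }
}
```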

Find Max/Min element without using IComparable<T>

Say I have the following:
public class BooClass
{
public int field1;
public double field2;
public DateTime field3;
}
public List<BooClass> booList;
So for example how do I get the element with the earliest time in field3 using booList.Find()
Edit Apologies, I meant to make all the fields public for simplicity of the example. I know can do it in linq, I wondered if there is a simple single line condition for the Find method.
F# has handy minBy and maxBy operators, which I like to implement as C# extension methods, since the Linq library omits them. It's a bit of work, but only a bit, and it allows you to avoid complex expressions such as
var earliest = booList.First(b => b.Field3 == booList.Min(e => e.Field3));
Instead, you can type this:
var earliest = booList.MinBy(b => b.Field3);
A simple implementation:
static T MinBy<T, C>(this IEnumerable<T> sequence, Func<T, C> keySelector)
{
bool first = true;
T result = default(T);
C minKey = default(C);
IComparer<C> comparer = Comparer<C>.Default; //or you can pass this in as a parameter
foreach (var item in sequence)
{
if (first)
{
result = item;
minKey = keySelector.Invoke(item);
first = false;
continue;
}
C key = keySelector.Invoke(item);
if (comparer.Compare(key, minKey) < 0)
{
result = item;
minKey = key;
}
}
return result;
}
This is also more efficient than the complex expression at the top: MinBy iterates the sequence exactly once, while the expression re-evaluates booList.Min for every element that First tests, so it can iterate the list many times. And, of course, sorting and then taking the first item requires sorting, which is O(n log n), while this is just O(n).
As noted by Saeed Amiri, this approach doesn't work if you are relying on Linq to SQL or any other IQueryable<> provider. (More precisely, it works inefficiently because it pulls the objects from the database and works on them locally.) For a solution that doesn't do this, see Saeed's answer.
You could also make an extension method based on that approach, but as I am on my phone at the moment I'll leave the implementation as the proverbial "exercise for the reader."
You'll need to expose field3 through a public property (we'll call it Field3), but you could use this:
var earliest = booList.First(b => b.Field3 == booList.Min(e => e.Field3));
Take a look at Enumerable.First and Enumerable.Min
NOTE: This has a worst-case time complexity of O(n^2) (quadratic time), because Min traverses the whole list again for each element that First tests. A large enough collection will see serious performance issues compared to Saeed Amiri's answer, which runs in O(n) (linear time).
Use OrderBy, then get the first element:
var result = booList.OrderBy(p => p.field3).FirstOrDefault();
The O(n) approach is as follows. First find min date (for field3), then find first object with this min date:
var minDate = booList.Min(x=>x.field3);
var item = booList.First(x=>x.field3 == minDate);
Just make your property public.
As far as I can tell, there is no way to retrieve the BooClass object with the minimal date by just using List<T>.Find. Of course you can do this:
void Main()
{
List<BooClass> booList = new List<BooClass> {
new BooClass { field3 = DateTime.MaxValue},
new BooClass { field3 = DateTime.Now },
new BooClass { field3 = DateTime.MinValue }};
var pred = GetPredicate(booList);
var result = booList.Find(pred);
}
public Predicate<BooClass> GetPredicate(List<BooClass> boos)
{
var minDate = boos.Min(boo => boo.field3);
return bc => bc.field3 == minDate;
}
(which - just like Saeed's solution - also has O(n) time complexity), but I guess that would be considered cheating...
If you don't want to define a MinBy method, you can use aggregate like so:
booList.Aggregate((currMin, test) => currMin.field3 < test.field3 ? currMin : test);
To support empty lists, seed the aggregate with null, like so:
booList.Aggregate((BooClass)null, (currMin, test) => currMin == null || currMin.field3 > test.field3 ? test : currMin);
This solution is O(n)
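The O(n) approaches discussed above can be compared side by side in a small sketch. The sample data and the EarliestDemo class are hypothetical. The last method assumes .NET 6 or later, where Enumerable.MinBy is part of LINQ; on older frameworks that method will not compile and a hand-written MinBy is still needed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class BooClass
{
    public int field1;
    public double field2;
    public DateTime field3;
}

public static class EarliestDemo
{
    // Made-up sample data; field1 is used only to identify the elements.
    public static List<BooClass> Sample() => new List<BooClass>
    {
        new BooClass { field1 = 1, field3 = new DateTime(2020, 5, 1) },
        new BooClass { field1 = 2, field3 = new DateTime(2019, 1, 15) },
        new BooClass { field1 = 3, field3 = new DateTime(2021, 3, 9) },
    };

    // Two passes: compute the minimum date once, then find the first match.
    public static BooClass TwoPass(List<BooClass> list)
    {
        var minDate = list.Min(x => x.field3);
        return list.First(x => x.field3 == minDate);
    }

    // Single pass, keeping whichever element has the earlier field3.
    // Throws InvalidOperationException on an empty list.
    public static BooClass WithAggregate(List<BooClass> list) =>
        list.Aggregate((currMin, test) => test.field3 < currMin.field3 ? test : currMin);

    // Single pass with the built-in operator (assumption: .NET 6 or later).
    public static BooClass WithMinBy(List<BooClass> list) =>
        list.MinBy(x => x.field3);

    public static void Main()
    {
        var list = Sample();
        Console.WriteLine(TwoPass(list).field1);       // 2
        Console.WriteLine(WithAggregate(list).field1); // 2
        Console.WriteLine(WithMinBy(list).field1);     // 2
    }
}
```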

In LINQ, do projections off an IOrderedEnumerable<T> preserve the order?

If I have an IOrderedEnumerable<Car>, I sort it and then do a projecting query...
is the order preserved in the projection?
For example, does this scenario work?
IOrderedEnumerable<Car> allCarsOrderedFastestToSlowest =
GetAllCars()
.OrderByDescending(car=>car.TopSpeed);
var top3FastestCarManufacturers =
allCarsOrderedFastestToSlowest
.Select(car=>car.Manufacturer)
.Distinct()
.Take(3);
Does the name of the top3FastestCarManufacturers variable convey the meaning of what has really happened in the code?
The documentation for the Distinct method doesn't say anything about whether the order is preserved or not. This is probably because it depends on the underlying implementation of the source.
You can use grouping to get the desired result, by getting the fastest car from each manufacturer, and then get the three fastest from that:
var topThreeFastestCarManufacturers =
GetAllCars()
.GroupBy(c => c.Manufacturer)
.Select(g => g.OrderByDescending(c => c.TopSpeed).First())
.OrderByDescending(c => c.TopSpeed)
.Take(3);
I suspect what is going to mess you up is the Distinct. This will likely reorder the results by manufacturer to produce the distinct results. I'd likely just iterate through the list until I had three distinct manufacturers.
The selection will retain the ordering but the remarks on Distinct indicate that it returns an unordered result set and that it is implementation dependent. To be sure, I wouldn't rely on it retaining the ordering and simply do it using the iteration.
var top3 = new List<string>();
foreach (var manufacturer in allCarsOrderedFastestToSlowest
.Select(car=>car.Manufacturer))
{
if (!top3.Contains(manufacturer))
{
top3.Add(manufacturer);
if (top3.Count == 3)
{
break;
}
}
}
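The same iteration can be wrapped in a reusable helper that makes the ordering guarantee explicit instead of relying on Distinct. This is a sketch under assumptions: TakeDistinctInOrder is a hypothetical extension method, not part of LINQ, and the Car class and sample values are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical car type with made-up sample values below.
public class Car
{
    public string Manufacturer { get; set; } = "";
    public int TopSpeed { get; set; }
}

public static class OrderedDistinctExtensions
{
    // Yields the first `count` distinct items in the order they are encountered.
    public static IEnumerable<T> TakeDistinctInOrder<T>(this IEnumerable<T> source, int count)
    {
        if (count <= 0) yield break;
        var seen = new HashSet<T>();
        foreach (var item in source)
        {
            if (seen.Add(item)) // Add returns false for items already seen
            {
                yield return item;
                if (seen.Count == count) yield break;
            }
        }
    }
}

public static class CarDemo
{
    public static List<string> Top3Manufacturers()
    {
        var cars = new List<Car>
        {
            new Car { Manufacturer = "Beta",  TopSpeed = 410 },
            new Car { Manufacturer = "Alpha", TopSpeed = 420 },
            new Car { Manufacturer = "Alpha", TopSpeed = 400 },
            new Car { Manufacturer = "Delta", TopSpeed = 330 },
            new Car { Manufacturer = "Gamma", TopSpeed = 350 },
        };

        return cars
            .OrderByDescending(car => car.TopSpeed)
            .Select(car => car.Manufacturer)
            .TakeDistinctInOrder(3)
            .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Top3Manufacturers())); // Alpha, Beta, Gamma
    }
}
```

Using a HashSet instead of List.Contains keeps each membership check cheap, which only matters when many distinct values are requested.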
