OrderBy on nested collections - c#

I'm trying to sort this complex object:
Order _sut = new Order
{
OrderDataArray = new[]
{
new OrderData
{
OrderHeaderArray = new[]
{
new OrderHeader
{
SequenceNumber = 1,
OrderPositionArray = new[]
{
new OrderPositions
{
LineNumber = 3
},
new OrderPositions
{
LineNumber = 2
},
new OrderPositions
{
LineNumber = 1
}
}
}
}
}
}
};
Using the code:
[Fact]
public void Sorts_By_Sequence_Number()
{
var ordered = _sut.OrderDataArray
.OrderBy(o => o.OrderHeaderArray
.OrderBy(a => a.OrderPositionArray
.OrderBy(p => p.LineNumber)))
.ToArray();
_sut.OrderDataArray = ordered;
OutputHelper(_sut);
}
I don't understand why this doesn't work: the sort simply keeps the initial order of the LineNumber values. I've tried various things with OrderBy, but it doesn't seem to sort anything.
EDIT
Thank you for responses, both are correct. I have accepted poke's response as it provides a bit more detailed information on the inner workings of the OrderBy method. Basically I was missing the assignment within the loop, I was trying to sort all objects at once.

You should consider what OrderBy does. It orders a collection by the value you determine in the lambda expression and then returns an enumerable.
Your outer call is good for that:
_sut.OrderDataArray.OrderBy(o => something).ToArray();
You sort by something, and then convert the result into a (then sorted) array. There are two things that matter here: First of all, at least in your example, there is only one object in OrderDataArray, so there is no sort happening. Second, it depends on the return value of something how those objects are sorted.
So in that case, what is something? It’s the following:
o.OrderHeaderArray.OrderBy(a => somethingElse)
So regardless of somethingElse, what does this return? An IEnumerable<OrderHeader>. How do multiple enumerables compare to each other? They are not really comparable; and they especially don’t tell you anything about the order based on their content (you’d have to enumerate it first). So essentially, you order that OrderHeaderArray by “something else”, use the result which does not tell you anything about an order as the key to order the OrderDataArray. Then, you throw the sorted OrderHeaderArray away.
You do the same exactly one level deeper with the OrderPositionArray, which again will not do anything useful. The only actual useful ordering happens to the OrderPositionArray itself but that result is again thrown away.
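The "thrown away" part is easy to check in isolation. The following is a minimal sketch, assuming C# 9 or later top-level statements; the `original`, `sortedView` and `sortedCopy` names are made up for illustration. It shows that OrderBy never touches its source and only produces a new, lazily evaluated sequence:

```csharp
using System;
using System.Linq;

int[] original = { 3, 2, 1 };

// OrderBy builds a deferred query; the source array is left as it is.
var sortedView = original.OrderBy(n => n);

// Only materializing the query (ToArray) produces sorted data, and it is a new array.
int[] sortedCopy = sortedView.ToArray();

Console.WriteLine(string.Join(", ", original));   // 3, 2, 1
Console.WriteLine(string.Join(", ", sortedCopy)); // 1, 2, 3
```

Unless the result is assigned somewhere, as `sortedCopy` is here, the sorted data is simply lost.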
Now, if you want to order your structure, you should do so properly, by reassigning the sorted result to the array. So at some point, you would have to do the following:
a.OrderPositionArray = a.OrderPositionArray.OrderBy(p => p.LineNumber).ToArray();
But apart from the OrderPositionArray itself and the OrderHeader, you don’t really have anything that can be sorted (because you can’t really sort a collection by the order of a subcollection). So you could solve it like this:
foreach (OrderData data in _sut.OrderDataArray)
{
foreach (OrderHeader header in data.OrderHeaderArray)
{
header.OrderPositionArray = header.OrderPositionArray.OrderBy(p => p.LineNumber).ToArray();
}
data.OrderHeaderArray = data.OrderHeaderArray.OrderBy(h => h.SequenceNumber).ToArray();
}
Instead of Linq, you can also sort the arrays in-place, which maybe makes it a bit nicer since you are not creating new inner arrays:
var c = Comparer<int>.Default;
foreach (OrderData data in _sut.OrderDataArray)
{
foreach (OrderHeader header in data.OrderHeaderArray)
{
Array.Sort(header.OrderPositionArray, new Comparison<OrderPositions>((x, y) => c.Compare(x.LineNumber, y.LineNumber)));
}
Array.Sort(data.OrderHeaderArray, new Comparison<OrderHeader>((x, y) => c.Compare(x.SequenceNumber, y.SequenceNumber)));
}
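Both loops above mutate the existing objects. If you would rather leave the original structure untouched, a nested Select can build a sorted copy in a single expression. This is only a sketch of that alternative, not something from the question: anonymous types stand in for the OrderData and OrderHeader classes, and `Headers` and `LineNumbers` are hypothetical property names. It assumes C# 9 or later top-level statements.

```csharp
using System;
using System.Linq;

// Stand-in data: anonymous types replace the Order* classes from the question.
var orderData = new[]
{
    new
    {
        Headers = new[]
        {
            new { SequenceNumber = 2, LineNumbers = new[] { 6, 5, 4 } },
            new { SequenceNumber = 1, LineNumbers = new[] { 3, 2, 1 } },
        }
    }
};

// Each level is projected into a new object that holds its own sorted children.
var sorted = orderData
    .Select(d => new
    {
        Headers = d.Headers
            .OrderBy(h => h.SequenceNumber)
            .Select(h => new
            {
                h.SequenceNumber,
                LineNumbers = h.LineNumbers.OrderBy(n => n).ToArray()
            })
            .ToArray()
    })
    .ToArray();

foreach (var h in sorted[0].Headers)
    Console.WriteLine($"{h.SequenceNumber}: {string.Join(", ", h.LineNumbers)}");
// 1: 1, 2, 3
// 2: 4, 5, 6
```

The key point is the same as before: every OrderBy result is stored (here, in a property of the projected object) instead of being used as a sort key.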

Here,
var ordered = _sut.OrderDataArray.OrderBy(o => ...
expects the Func<OrderData, TKey>, and the values will be sorted by comparing the result of this function execution.
At the same time, you pass it the result of another OrderBy, which is an IOrderedEnumerable<OrderHeader>. Sequences like that are not comparable to each other, so using one as a sort key simply doesn't make much sense.
In order to sort all the nested collections, you can do the following:
foreach (var orderData in _sut.OrderDataArray)
{
foreach (var orderHeader in orderData.OrderHeaderArray)
{
orderHeader.OrderPositionArray = orderHeader.OrderPositionArray
.OrderBy(x => x.LineNumber).ToArray();
}
orderData.OrderHeaderArray = orderData.OrderHeaderArray
.OrderBy(x => x.SequenceNumber).ToArray();
}
_sut.OrderDataArray = _sut.OrderDataArray
.OrderBy(x => ...).ToArray();
It sorts every OrderPositionArray by its items' LineNumber.
It sorts every OrderHeaderArray by headers' SequenceNumber.
However, it is pretty unclear how you want to sort _sut.OrderDataArray - it is marked as x => ... in the example.
It has no comparable properties which can be used for sorting.
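If the outer array does need an order, you have to pick a scalar key derived from each element. Purely as an illustration, the sketch below assumes you want to order by the smallest SequenceNumber found in each element. That choice is an assumption, not something the question states, and the `Name` and `SequenceNumbers` properties are hypothetical stand-ins (C# 9 or later top-level statements).

```csharp
using System;
using System.Linq;

// Stand-in for OrderDataArray: each element only carries its headers' sequence numbers.
var orderData = new[]
{
    new { Name = "B", SequenceNumbers = new[] { 5, 4 } },
    new { Name = "A", SequenceNumbers = new[] { 2, 9 } },
};

// The key is a single int per element, which OrderBy can compare.
// Note: Min() throws InvalidOperationException if an element has no sequence numbers.
var sorted = orderData
    .OrderBy(d => d.SequenceNumbers.Min())
    .ToArray();

Console.WriteLine(string.Join(", ", sorted.Select(d => d.Name))); // A, B
```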

Related

C# linq - Order by alphabetical, then by certain value [duplicate]

Can anyone explain what the difference is between:
tmp = invoices.InvoiceCollection
.OrderBy(sort1 => sort1.InvoiceOwner.LastName)
.OrderBy(sort2 => sort2.InvoiceOwner.FirstName)
.OrderBy(sort3 => sort3.InvoiceID);
and
tmp = invoices.InvoiceCollection
.OrderBy(sort1 => sort1.InvoiceOwner.LastName)
.ThenBy(sort2 => sort2.InvoiceOwner.FirstName)
.ThenBy(sort3 => sort3.InvoiceID);
Which is the correct approach if I wish to order by 3 items of data?
You should definitely use ThenBy rather than multiple OrderBy calls.
I would suggest this:
tmp = invoices.InvoiceCollection
.OrderBy(o => o.InvoiceOwner.LastName)
.ThenBy(o => o.InvoiceOwner.FirstName)
.ThenBy(o => o.InvoiceID);
Note how you can use the same name each time. This is also equivalent to:
tmp = from o in invoices.InvoiceCollection
orderby o.InvoiceOwner.LastName,
o.InvoiceOwner.FirstName,
o.InvoiceID
select o;
If you call OrderBy multiple times, it will effectively reorder the sequence completely three times... so the final call will effectively be the dominant one. You can (in LINQ to Objects) write
foo.OrderBy(x).OrderBy(y).OrderBy(z)
which would be equivalent to
foo.OrderBy(z).ThenBy(y).ThenBy(x)
as the sort order is stable, but you absolutely shouldn't:
It's hard to read
It doesn't perform well (because it reorders the whole sequence)
It may well not work in other providers (e.g. LINQ to SQL)
It's basically not how OrderBy was designed to be used.
The point of OrderBy is to provide the "most important" ordering projection; then use ThenBy (repeatedly) to specify secondary, tertiary etc ordering projections.
Effectively, think of it this way: OrderBy(...).ThenBy(...).ThenBy(...) allows you to build a single composite comparison for any two objects, and then sort the sequence once using that composite comparison. That's almost certainly what you want.
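To see the equivalence claimed above on concrete data, here is a small sketch for LINQ to Objects. The tuples are made-up stand-ins for the invoice objects, and only two keys are used to keep it short (C# 9 or later top-level statements).

```csharp
using System;
using System.Linq;

var invoices = new[]
{
    (Last: "Smith", First: "Bob",   Id: 1),
    (Last: "Jones", First: "Bob",   Id: 2),
    (Last: "Smith", First: "Alice", Id: 3),
    (Last: "Jones", First: "Alice", Id: 4),
};

// Primary key Last, secondary key First: one sort with a composite comparison.
var thenBy = invoices.OrderBy(i => i.Last).ThenBy(i => i.First)
    .Select(i => i.Id).ToArray();

// Two full sorts: the last OrderBy dominates, the earlier one only breaks ties.
var repeated = invoices.OrderBy(i => i.Last).OrderBy(i => i.First)
    .Select(i => i.Id).ToArray();

// Because the sort is stable, this is the same as the "repeated" variant.
var reversed = invoices.OrderBy(i => i.First).ThenBy(i => i.Last)
    .Select(i => i.Id).ToArray();

Console.WriteLine(string.Join(", ", thenBy));   // 4, 2, 3, 1
Console.WriteLine(string.Join(", ", repeated)); // 4, 3, 2, 1
Console.WriteLine(string.Join(", ", reversed)); // 4, 3, 2, 1
```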
I found this distinction annoying in trying to build queries in a generic manner, so I made a little helper to produce OrderBy/ThenBy in the proper order, for as many sorts as you like.
public class EFSortHelper
{
public static EFSortHelper<TModel> Create<TModel>(IQueryable<TModel> query)
{
return new EFSortHelper<TModel>(query);
}
}
public class EFSortHelper<TModel> : EFSortHelper
{
protected IQueryable<TModel> unsorted;
protected IOrderedQueryable<TModel> sorted;
public EFSortHelper(IQueryable<TModel> unsorted)
{
this.unsorted = unsorted;
}
public void SortBy<TCol>(Expression<Func<TModel, TCol>> sort, bool isDesc = false)
{
if (sorted == null)
{
sorted = isDesc ? unsorted.OrderByDescending(sort) : unsorted.OrderBy(sort);
unsorted = null;
}
else
{
sorted = isDesc ? sorted.ThenByDescending(sort) : sorted.ThenBy(sort);
}
}
public IOrderedQueryable<TModel> Sorted
{
get
{
return sorted;
}
}
}
There are a lot of ways you might use this depending on your use case, but if you were for example passed a list of sort columns and directions as strings and bools, you could loop over them and use them in a switch like:
var query = db.People.AsNoTracking();
var sortHelper = EFSortHelper.Create(query);
foreach(var sort in sorts)
{
switch(sort.ColumnName)
{
case "Id":
sortHelper.SortBy(p => p.Id, sort.IsDesc);
break;
case "Name":
sortHelper.SortBy(p => p.Name, sort.IsDesc);
break;
// etc
}
}
var sortedQuery = sortHelper.Sorted;
The result in sortedQuery is sorted in the desired order, instead of resorting over and over as the other answer here cautions.
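The same "OrderBy first, ThenBy afterwards" idea can be tried without a database. The following is a rough in-memory sketch, not the helper above: it works on IEnumerable rather than IQueryable, `Apply` is a hypothetical local function, and the `people` and `sorts` data are invented for the example (C# 9 or later top-level statements).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var people = new[]
{
    (Id: 2, Name: "Bob"),
    (Id: 1, Name: "Bob"),
    (Id: 3, Name: "Alice"),
};
var sorts = new[]
{
    (ColumnName: "Name", IsDesc: false),
    (ColumnName: "Id", IsDesc: true),
};

// Stays null until the first sort column has been applied.
IOrderedEnumerable<(int Id, string Name)>? sorted = null;
foreach (var sort in sorts)
{
    switch (sort.ColumnName)
    {
        case "Id":
            sorted = Apply(sorted, people, p => p.Id, sort.IsDesc);
            break;
        case "Name":
            sorted = Apply(sorted, people, p => p.Name, sort.IsDesc);
            break;
    }
}

if (sorted != null)
    Console.WriteLine(string.Join(", ", sorted.Select(p => $"{p.Name}#{p.Id}")));
// Alice#3, Bob#2, Bob#1

// First call: OrderBy on the source. Later calls: ThenBy on the ordered result.
static IOrderedEnumerable<T> Apply<T, TKey>(
    IOrderedEnumerable<T>? current, IEnumerable<T> source, Func<T, TKey> key, bool desc) =>
    current == null
        ? (desc ? source.OrderByDescending(key) : source.OrderBy(key))
        : (desc ? current.ThenByDescending(key) : current.ThenBy(key));
```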
If you want to sort by more than one field, use ThenBy, like this:
list.OrderBy(person => person.LastName)
.ThenBy(person => person.FirstName)
Yes, you should never chain multiple OrderBy calls when sorting by multiple keys.
ThenBy is the safer bet, since it refines the existing ordering instead of replacing it.

How to keep initializer list order within Select and/or SelectMany

I hope this is not a duplicate but I wasn't able to find an answer on this.
It either seems to be an undesired behavior or missing knowledge on my part.
I have a list of platform objects and a list of configuration objects. Both types contain a string member CodeName.
The list of CodeNames look like this:
dbContext.Platforms.Select(x => x.CodeName) => {"test", "PC", "Nintendo"}
dbContext.Configurations.Select(x => x.CodeName) => {"debug", "release"}
They are obtained from a MySQL database hence the dbContext object.
Here is some simple code that I want to translate to LINQ, because two nested foreach loops are a thing of the past:
var choiceList = new List<List<string>>();
foreach (Platform platform in dbContext.Platforms.ToList())
{
foreach (Configuration configuration in dbContext.Configurations.ToList())
{
choiceList.Add(new List<string>() { platform.CodeName, configuration.CodeName });
}
}
This code gives me exactly what I want, keeping the platform name first, which looks like:
var results = new List<List<string>>() {
new List<string> {"test", "debug"},
new List<string> {"test", "release"},
new List<string> {"PC", "debug"},
new List<string> {"PC", "release"},
new List<string> {"Nintendo", "debug"},
new List<string> {"Nintendo", "release"}};
But if I translate it to this, my list contains the items in a different order:
var choiceList = dbContext.Platforms.SelectMany(p => dbContext.Configurations.Select(t => new List<string>() { p.CodeName, t.CodeName })).ToList();
I will end up with this, where the platform name isn't always first, which is not what is desired:
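var results = new List<List<string>>() {
new List<string> {"debug", "test"},
new List<string> {"release", "test"},
new List<string> {"debug", "PC"},
new List<string> {"PC", "release"},
new List<string> {"debug", "Nintendo"},
new List<string> {"Nintendo", "release"}};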
My question is, is it possible to obtain the desired result using LINQ?
Let me know if I'm not clear or my question lacks certain details.
Thanks
EDIT: So Ivan found the explanation and I modified my code in consequence.
In fact, only the Enumerable in front of the SelectMany needed the .ToList().
I should also have mentioned that I was stuck with the need for a List<List<string>>.
Thanks everyone for the fast input, this was really appreciated.
When you use
var choiceList = dbContext.Platforms.SelectMany(p => dbContext.Configurations.Select(t => new List<string>() { p.CodeName, t.CodeName })).ToList();
it's really translated to a SQL query, and the order of the returned records is not defined unless you use ORDER BY.
To get the same results as your nested loops, execute and materialize both queries, and then do SelectMany in memory:
var platforms = dbContext.Platforms.ToList();
var configurations = dbContext.Configurations.ToList();
var choiceList = platforms.SelectMany(p => configurations,
(p, c) => new List<string>() { p.CodeName, c.CodeName })
.ToList();
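Once both sides are in memory, LINQ to Objects enumerates the outer sequence in order and, for each outer element, the inner sequence in order, so the result matches the nested loops. A minimal sketch follows, with plain string lists standing in for the materialized `dbContext` queries (that substitution is an assumption made for the example; C# 9 or later top-level statements).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-ins for dbContext.Platforms.ToList() and dbContext.Configurations.ToList().
var platforms = new List<string> { "test", "PC", "Nintendo" };
var configurations = new List<string> { "debug", "release" };

// In-memory SelectMany: outer order first, inner order second, platform always first.
List<List<string>> choiceList = platforms
    .SelectMany(p => configurations, (p, c) => new List<string> { p, c })
    .ToList();

foreach (var pair in choiceList)
    Console.WriteLine(string.Join(", ", pair));
// test, debug
// test, release
// PC, debug
// PC, release
// Nintendo, debug
// Nintendo, release
```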
Rather than projecting to an array, project to a new object with two fields (potentially an anonymous object). Then, if you really do need the values in a two-element array, project into one after you have retrieved the objects from the database.
Try this-
var platforms= dbContext.Platforms.Select(x=>x.CodeName);
var configurations=dbContext.Configurations.Select(x=>x.CodeName);
var mix=platforms.SelectMany(num => configurations, (n, a) => new { n, a });
If you want to learn more in detail- Difference between Select and SelectMany

Categorical sorting optimization

Question: What is the best way to sort items(T) into buckets(ConcurrentBag)?
Ok, so I have not yet taken an Algorithms class, so I am unsure of the best approach to the problem I have come across.
Preconditions:
Each bucket has a unique identifier (within each sBucket).
Each sBucket has a unique identifier.
Each item has a unique identifier.
Each item has a property (bucketId) corresponding to the bucket it
belongs to.
Each item has a property (sBucketId) corresponding to the
superBucket it belongs to.
Bucket and sBucket id's are unique.
I have a ConcurrentBag of items I wish to sort into these
buckets.
There are several hundred items.
There are several dozen buckets.
There are 3 super-buckets which contain the buckets.
Each super-bucket contains the same buckets, though with different
items within the buckets.
I am currently using brute force via a Parallel.ForEach loop on the collection of items to compare the item's bucketId to each individual bucket using LINQ. This is incredibly slow and cumbersome though, so I'd like to find a better method.
I have thought about sorting the items based on their superBucket then Bucket, and then iterating through each superbucket->bucket to insert the items. Should this be the path I take?
Thanks for any help you can provide.
Example of current code
ConcurrentBag<Item> items ...
List<SuperBuckets> ListOfSuperBuckets ...
Parallel.ForEach(items, item =>
{
ListOfSuperBuckets
.Where(sBucket => sBucket.id == item.sBucketId)
.First()
.buckets
.Where(bucket => bucket.id == item.bucketId)
.First()
.items
.Add(item);
});
I wouldn't use parallelism for this, but there are a bunch of options.
var groupedBySBucket = ListOfSuperBuckets
.GroupJoin(items, a => a.id, b => b.sBucketId, (a,b) => new
{
sBucket = a,
buckets = a.buckets
.GroupJoin(b, c => c.id, x => x.bucketId, (c, x) => new
{
bucket = c,
items = x
})
});
foreach (var g in groupedBySBucket)
{
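// This works because buckets are reference types: adding to b.bucket changes the original bucket.
foreach (var b in g.buckets)
{
// Assumes Bucket.items is a List<Item> (ConcurrentBag has no AddRange).
b.bucket.items.AddRange(b.items);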
}
}
Or if that's too much code for you, this is comparable.
var groupedByBucket = ListOfSuperBuckets
.SelectMany(c => c.buckets, (a,b) => new { sBucketId = a.id, bucket = b })
.GroupJoin(items, a => new { a.sBucketId, bucketId = a.bucket.id }, b => new { b.sBucketId, b.bucketId }, (a, b) => new
{
bucket = a.bucket,
items = b
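});
foreach (var g in groupedByBucket)
{
// This works because buckets are reference types: adding to g.bucket changes the original bucket.
// Assumes Bucket.items is a List<Item> (ConcurrentBag has no AddRange).
g.bucket.items.AddRange(g.items);
}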
This also assumes that ListOfSuperBuckets is a given. If it was simply an artifact of your implementation, there is an even simpler way, which builds the list directly from the items.
Beware, of course, because these are different--this one won't have any empty buckets for no data, but the first implementation could. We're also creating new buckets, which the first implementation doesn't; good if we need to, bad if you've already created them elsewhere. The first one could easily be modified to create them, of course.
var ListOfSuperBuckets = items
.GroupBy(c => new { c.bucketId, c.sBucketId })
.GroupBy(c => c.Key.sBucketId)
.Select(c => new SuperBucket
{
id = c.Key,
buckets = c.Select(b => new Bucket
{
id = b.Key.bucketId,
items = b.ToList()
}).ToList()
})
.ToList();
For what it's worth, all these ToList calls are meant to preserve the contract I assume you have. If you don't need them, you could benefit from LINQ's deferred execution by leaving them off. It's really a matter of how you're using the code, but that's worth consideration.
You should use Dictionary so you can look up buckets and SuperBuckets by ID instead of searching for them.
SuperBucket should have a Dictionary<id_type,Bucket> that you can use to look up buckets by ID, and you should keep the SuperBuckets in a Dictionary<id_type,SuperBucket>. (id_type is the type of your IDs, probably string or int, but I can't tell from your code.)
If you don't want to modify the existing classes, then build a Dictionary<id_type, Dictionary<id_type, Bucket>> and use that.
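As a rough sketch of the lookup idea, the example below flattens the two levels into a single dictionary keyed by a (super-bucket ID, bucket ID) tuple, which is a variation on the nested dictionary suggested above. The int IDs, the string item names and the tuple shapes are all assumptions made for illustration (C# 9 or later top-level statements).

```csharp
using System;
using System.Collections.Generic;

// One dictionary entry per bucket: 3 super-buckets with the same 2 buckets each.
var buckets = new Dictionary<(int SBucketId, int BucketId), List<string>>();
for (int s = 1; s <= 3; s++)
    for (int b = 1; b <= 2; b++)
        buckets[(s, b)] = new List<string>();

var items = new[]
{
    (Name: "item-a", SBucketId: 1, BucketId: 2),
    (Name: "item-b", SBucketId: 3, BucketId: 1),
    (Name: "item-c", SBucketId: 1, BucketId: 2),
};

// Each item finds its bucket with one hash lookup instead of two linear searches.
foreach (var item in items)
    buckets[(item.SBucketId, item.BucketId)].Add(item.Name);

Console.WriteLine(string.Join(", ", buckets[(1, 2)])); // item-a, item-c
Console.WriteLine(buckets[(3, 1)].Count);              // 1
```

With a few hundred items and a few dozen buckets, a single sequential pass like this is likely to be fast enough that parallelism is not needed.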

Sort one list by another

I have two list objects: one is just a list of ints, the other is a list of objects, each of which has an ID property.
What I want to do is sort the list of objects by ID, in the same order as the list of ints.
I've been playing around for a while now trying to get it working, so far no joy.
Here is what I have so far...
//**************************
//*** Randomize the list ***
//**************************
if (Session["SearchResultsOrder"] != null)
{
// save the session as a int list
List<int> IDList = new List<int>((List<int>)Session["SearchResultsOrder"]);
// the saved list session exists, make sure the list is ordered by this
foreach(var i in IDList)
{
SearchData.ReturnedSearchedMembers.OrderBy(x => x.ID == i);
}
}
else
{
// before any sorts randomize the results - this mixes it up a bit as before it would order the results by member registration date
List<Member> RandomList = new List<Member>(SearchData.ReturnedSearchedMembers);
SearchData.ReturnedSearchedMembers = GloballyAvailableMethods.RandomizeGenericList<Member>(RandomList, RandomList.Count).ToList();
// save the order of these results so they can be restored back during postback
List<int> SearchResultsOrder = new List<int>();
SearchData.ReturnedSearchedMembers.ForEach(x => SearchResultsOrder.Add(x.ID));
Session["SearchResultsOrder"] = SearchResultsOrder;
}
The whole point of this is so when a user searches for members, initially they display in a random order, then if they click page 2, they remain in that order and the next 20 results display.
I have been reading about the IComparer I can pass as a parameter to the LINQ OrderBy method, but I can’t find any simple examples.
I’m hoping for an elegant, very simple LINQ style solution, well I can always hope.
Any help is most appreciated.
Another LINQ-approach:
var orderedByIDList = from i in ids
join o in objectsWithIDs
on i equals o.ID
select o;
One way of doing it:
List<int> order = ....;
List<Item> items = ....;
Dictionary<int,Item> d = items.ToDictionary(x => x.ID);
List<Item> ordered = order.Select(i => d[i]).ToList();
Not an answer to this exact question, but if you have two arrays, there is an overload of Array.Sort that takes the array to sort, and an array to use as the 'key'
https://msdn.microsoft.com/en-us/library/85y6y2d3.aspx
Array.Sort Method (Array, Array)
Sorts a pair of one-dimensional Array objects (one contains the keys
and the other contains the corresponding items) based on the keys in
the first Array using the IComparable implementation of each key.
Join is the best candidate if you want to match on the exact integer (if no match is found you get an empty sequence). If you want to merely get the sort order of the other list (and provided the number of elements in both lists are equal), you can use Zip.
var result = objects.Zip(ints, (o, i) => new { o, i})
.OrderBy(x => x.i)
.Select(x => x.o);
Pretty readable.
Here is an extension method which encapsulates Simon D.'s response for lists of any type.
public static IEnumerable<TResult> SortBy<TResult, TKey>(this IEnumerable<TResult> sortItems,
IEnumerable<TKey> sortKeys,
Func<TResult, TKey> matchFunc)
{
return sortKeys.Join(sortItems,
k => k,
matchFunc,
(k, i) => i);
}
Usage is something like:
var sorted = toSort.SortBy(sortKeys, i => i.Key);
One possible solution:
myList = myList.OrderBy(x => Ids.IndexOf(x.Id)).ToList();
Note: use this only when you are working with in-memory lists. It does not work on an IQueryable source, where a call to IndexOf typically can't be translated by the query provider.
docs = docs.OrderBy(d => docsIds.IndexOf(d.Id)).ToList();
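IndexOf scans the ID list once per element, which is fine for a page of 20 results but grows quadratically with list size. A possible refinement, sketched here with invented data and a hypothetical `position` dictionary, is to record each ID's position once and sort by that (C# 9 or later top-level statements).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var ids = new List<int> { 30, 10, 20 };
var members = new[]
{
    (Id: 10, Name: "Ann"),
    (Id: 20, Name: "Ben"),
    (Id: 30, Name: "Cy"),
};

// Map each ID to its index in the reference list, built in a single pass.
Dictionary<int, int> position = ids
    .Select((id, index) => (id, index))
    .ToDictionary(x => x.id, x => x.index);

// Sort by the stored position. A member whose ID is missing from the
// dictionary would make the indexer throw KeyNotFoundException.
var ordered = members.OrderBy(m => position[m.Id]).ToList();

Console.WriteLine(string.Join(", ", ordered.Select(m => m.Name))); // Cy, Ann, Ben
```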
