Select() decline in performance

I'm working on a small app written in C# on .NET Core, and I'm populating one property in code because that information is not available in the database. The code looks like this:
public async Task<IEnumerable<ProductDTO>> GetData(Request request)
{
    IQueryable<Product> query = _context.Products;
    var products = await query.ToListAsync();
    // WARNING - THIS SOLUTION LOOKS EXPENSIVE TO ME!
    return MapDataAsDTO(products).Select(c =>
    {
        // For every DTO this scans the whole products list again.
        c.HasBrandStock = products.Any(cc => cc.ParentProductId == c.Id);
        return c;
    });
}
private IEnumerable<ProductDTO> MapDataAsDTO(IEnumerable<Product> products)
{
    return products.Select(p => MapData(p)).ToList();
}
What is bothering me here is this code:
return MapDataAsDTO(products).Select(c =>
{
    c.HasBrandStock = products.Any(cc => cc.ParentProductId == c.Id);
    return c;
});
I've tested it on about 300k rows and it seems slow. Is there a better solution for this situation?
Thanks guys!
Cheers

First up, this method is loading all products, and generally that is a bad idea unless you can guarantee that both the total number of records and the total size of those records will remain reasonable. If the system can grow, add support for server-side pagination now (page number and page size, leveraging Skip & Take). 300k products is not a reasonable amount of data to load in one hit. Any way you skin this cat, without paging it will be slow, expensive, and error prone due to server load. For a single request, the DB server has to allocate for and load up 300k rows and transmit that data over the wire to the app server, which allocates memory for those 300k rows and then transmits them over the wire to a client who literally does not need those 300k rows at once. What do you think happens when 10 users hit this page? 100? And what happens when it's "too slow" and they start hammering the F5 key a few times? >:)
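As a rough illustration of the Skip & Take paging mentioned above, here is a minimal sketch. The helper name GetPage, the zero-based pageNumber, and the sort key parameter are assumptions made for this example, not part of the original code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class PagingSketch
{
    // Hypothetical helper: returns a single page from any IQueryable source.
    // pageNumber is zero-based here (an assumption of this sketch).
    public static List<TItem> GetPage<TItem, TKey>(
        IQueryable<TItem> source,
        Expression<Func<TItem, TKey>> orderKey,
        int pageNumber,
        int pageSize)
    {
        return source
            .OrderBy(orderKey)              // a stable order keeps page boundaries consistent
            .Skip(pageNumber * pageSize)    // skip the rows that belong to earlier pages
            .Take(pageSize)                 // read only one page worth of rows
            .ToList();
    }
}
```

Against an EF context this might be called as PagingSketch.GetPage(_context.Products, p => p.ProductId, pageNumber, pageSize), so that only one page of rows leaves the database per request.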
Second, async is not a silver bullet. It doesn't make queries faster, it actually makes them a bit slower. What it does do is allow your web server to be more responsive to other requests while those slower queries are running. Default to synchronous queries, get them running as efficiently as possible, then for the larger ones that are justified, switch them to asynchronous. MS made async extremely easy to implement, perhaps too easy to treat as a default. Keep it simple and synchronous to start, then re-factor methods to async as needed.
From what I can see you want to load all products into DTOs, and for products that are recognized as being a "parent" of at least one other product, you want to set their DTO's HasBrandStock to True. So given product IDs 1 and 2, where 2's parent ID is 1, the DTO for Product ID 1 would have a HasBrandStock True while Product ID 2 would have HasBrandStock = False.
One option would be to tackle this operation in 2 queries:
var parentProductIds = _context.Products
    .Where(x => x.ParentProductId != null)
    .Select(x => x.ParentProductId)
    .Distinct()
    .ToList();
var dtos = _context.Products
    .Select(x => new ProductDTO
    {
        ProductId = x.ProductId,
        ProductName = x.ProductName,
        // ...
        HasBrandStock = parentProductIds.Contains(x.ProductId)
    }).ToList();
I'm using a manual Select here because I don't know what your MapDataAsDTO method is actually doing. I'd highly recommend using AutoMapper and its ProjectTo<T> method if you want to simplify the mapping code. Custom mapping functions can too easily hide expensive bugs, like ToList calls added when someone hits a scenario that EF cannot translate.
The first query gets a distinct list of just the Product IDs that are the parent ID of at least one other product. The second query maps out all products into DTOs, setting the HasBrandStock based on whether each product appears in the parentProductIds list or not.
This option will work if a relatively limited number of products are recognized as "parents". That first list can only get so big before the query risks failing because there are too many items to translate into an IN clause.
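If the rows do have to be processed in memory (as in the question), the same "collect the parent IDs first" idea avoids the repeated Any() scan. This is only a sketch: ProductRow and ProductFlagDto are hypothetical stand-ins for the real entity and DTO, which I haven't seen:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the real Product entity and ProductDTO.
public sealed class ProductRow
{
    public int Id { get; set; }
    public int? ParentProductId { get; set; }
}

public sealed class ProductFlagDto
{
    public int Id { get; set; }
    public bool HasBrandStock { get; set; }
}

public static class BrandStockSketch
{
    public static List<ProductFlagDto> Map(IReadOnlyCollection<ProductRow> products)
    {
        // One pass to collect every ID that is used as a parent.
        var parentIds = new HashSet<int>(
            products.Where(p => p.ParentProductId.HasValue)
                    .Select(p => p.ParentProductId.Value));

        // Second pass: each lookup is a hash probe instead of a scan of the whole list.
        return products
            .Select(p => new ProductFlagDto
            {
                Id = p.Id,
                HasBrandStock = parentIds.Contains(p.Id)
            })
            .ToList();
    }
}
```

This turns the roughly n * n comparisons of the nested Any() into two linear passes, but it does not address the cost of loading every row in the first place.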
The better option would be to look at your mapping. You have a ParentProductId, does a product entity have an associated ChildProducts collection?
public class Product
{
public int ProductId { get; set; }
public string ProductName { get; set; }
// ...
public virtual Product ParentProduct { get; set; }
public virtual ICollection<Product> ChildProducts { get; set; } = new List<Product>();
}
public class ProductConfiguration : EntityTypeConfiguration<Product>
{
public ProductConfiguration()
{
HasKey(x => x.ProductId);
HasOptional(x => x.ParentProduct)
.WithMany(x => x.ChildProducts)
.Map(x => x.MapKey("ParentProductId"));
}
}
This example maps the ParentProductId without exposing a field in the entity (recommended). Otherwise, if you do expose a ParentProductId, substitute the .Map(...) call with .HasForeignKey(x => x.ParentProductId).
This assumes EF6 as per your tags. If you're using EF Core, use HasForeignKey("ParentProductId") in place of Map(...) to establish a shadow property for the FK without exposing a property. The entity configuration is a bit different with Core.
This allows your queries to leverage the relationship between parent products and any related children products. Populating the DTOs can be accomplished with one query:
var dtos = _context.Products
    .Select(x => new ProductDTO
    {
        ProductId = x.ProductId,
        ProductName = x.ProductName,
        // ...
        HasBrandStock = x.ChildProducts.Any()
    }).ToList();
This leverages the relationship to populate your DTO and its flag in one pass. The caveat here is that there is now a cyclical relationship between product and itself represented in the entity. This means you shouldn't feed entities to something like a serializer. That includes avoiding adding entities as members of DTOs/ViewModels.

Related

C# LINQ executing the same work over and over

Came across some legacy code where the logic attempts to prevent unnecessary multiple calls to an expensive query, GetStudentsOnCourse(), but fails due to a misunderstanding of deferred execution.
var students = studentsToRemoveRecords.Select(x => x.CourseId)
.Distinct()
.SelectMany(c => studentRepository.GetStudentsOnCourse(c.Value));
var studentsToRemove = new List<Student>();
foreach (var record in studentsToRemoveRecords)
{
studentsToRemove.Add(
students.Single(s => s.Id == record.StudentId));
}
Here, if there are 2 records for the same course in studentsToRemoveRecords, the query GetStudentsOnCourse() will needlessly be called twice (with the same course id) instead of once.
You can solve this by converting students to a list beforehand and forcing it to memory (preventing the execution from being deferred). Or by simply rewriting the logic into something a bit simpler.
But I then realised I actually struggle to put into words exactly why GetStudentsOnCourse() is called twice in the scenario above... is it that LINQ is repeating the same work every time studentsToRemoveRecords is iterated over, even though the resulting input values are identical each time?

is it that LINQ is repeating the same work every time studentsToRemoveRecords is iterated over, even though the resulting input values are identical each time?
Yes, that's the nature of LINQ. Some Visual Studio Extensions, like ReSharper, give you warnings when you create code that might lead to multiple iterations of a LINQ Query.
If you want to avoid it, do this:
var students = studentsToRemoveRecords.Select(x => x.CourseId)
.Distinct()
.SelectMany(c => studentRepository.GetStudentsOnCourse(c.Value))
.ToList();
With ToList() the Query is executed immediately and the resulting entities are stored in a List<T>. Now you can iterate several times over students without having performance issues.
Edit to include comments:
Here is a link to some good documentation about it (thank you Sergio): LINQ Documentation
And some thoughts about your question how to handle this in a large code base:
Well, there are reasons for both scenarios - direct execution and storing the result into a new list, and deferred execution.
If you are familiar with SQL databases, you can think of a LINQ Query like a View or a Stored Procedure. You define what filtering/altering you want to execute on a base table to get the resulting entities. And each time you query that View/execute that Stored Procedure, it runs based on the current data in the base table.
Same for LINQ. Your Query (without ToList()) was just like the definition of the View. And each time you iterate over it, that definition gets executed based on the current Entities in studentsToRemoveRecords at that moment.
And maybe that's your intention. Maybe you know that this base list is changing and you want to execute your query several times, expecting different results. Then do it without ToList().
But when you want to execute your query only once and then expect an immutable result list over which you can iterate multiple times, do it with ToList().
So both Scenarios are valid. And when you iterate only once, both scenarios are equal (disclaimer: when you iterate directly after defining the query). Maybe that's why you saw it so many times like this. It depends what you want.
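A small sketch of the difference, using a plain in-memory list instead of the repository from the question (the class and variable names are made up for the example):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredSketch
{
    // Returns how many items each variant sees after the source list has changed.
    public static (int DeferredCount, int SnapshotCount) Run()
    {
        var source = new List<int> { 1, 2, 3 };

        // Only a definition: nothing is evaluated yet (like a view).
        IEnumerable<int> deferred = source.Where(n => n > 1);

        // Evaluated right now and stored (like a snapshot of the view).
        List<int> snapshot = source.Where(n => n > 1).ToList();

        source.Add(4);

        // The deferred query re-runs against the current list; the snapshot does not.
        return (deferred.Count(), snapshot.Count);
    }
}
```

Run() returns (3, 2): the deferred query picks up the newly added 4, while the list created by ToList() still holds only 2 and 3.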

Unclear exactly how your classes are done, BUT:
public class Student
{
public int Id { get; set; }
}
public class StudentCourse
{
public int StudentId { get; set; }
public int? CourseId { get; set; }
}
public class StudentRepository
{
public StudentCourse[] StudentCourses = new[]
{
new StudentCourse { CourseId = 1, StudentId = 100 },
new StudentCourse { CourseId = 2, StudentId = 200 },
new StudentCourse { CourseId = 3, StudentId = 300 },
new StudentCourse { CourseId = 4, StudentId = 400 },
};
public Student[] GetStudentsOnCourse(int courseId)
{
Console.WriteLine($"{nameof(GetStudentsOnCourse)}({courseId})");
return StudentCourses.Where(x => x.CourseId == courseId).Select(x => new Student { Id = x.StudentId }).ToArray();
}
}
and then
static void Main(string[] args)
{
var studentRepository = new StudentRepository();
var studentsToRemoveRecords = studentRepository.StudentCourses.ToArray();
var students = studentsToRemoveRecords.Select(x => x.CourseId)
.Distinct()
.SelectMany(c => studentRepository.GetStudentsOnCourse(c.Value));
//.ToArray();
var studentsToRemove = new List<Student>();
foreach (var record in studentsToRemoveRecords)
{
studentsToRemove.Add(
students.Single(s => s.Id == record.StudentId));
}
}
Without .ToArray() the method is called 16 times; with .ToArray() it is called 4 times. Note that .Single() scans the full students collection to check that exactly one student has the "right" Id. Compare it with First(), which stops after finding the first record with the right Id (10 total calls of the method). As I've said in my comment, the method is called studentsToRemoveRecords.Count() * studentsToRemoveRecords.Select(x => x.CourseId).Distinct().Count() times, so something like x ^ 2. Doing a .ToArray() "memoizes" the result of GetStudentsOnCourse.
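The counts above can be reproduced without the repository classes. The following sketch uses a counting delegate as a stand-in for GetStudentsOnCourse, and assumes (as in the sample data) four courses with exactly one student each, where the student id is the course id times 100:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CallCountSketch
{
    // Returns how often the stand-in for GetStudentsOnCourse gets invoked.
    public static int CountCalls(bool materialize, bool useFirst)
    {
        int calls = 0;
        int[] courseIds = { 1, 2, 3, 4 };

        // Stand-in for the expensive repository call.
        Func<int, int[]> getStudents = courseId =>
        {
            calls++;
            return new[] { courseId * 100 };
        };

        IEnumerable<int> students = courseIds
            .Distinct()
            .SelectMany(c => getStudents(c));

        if (materialize)
        {
            students = students.ToArray(); // runs the query exactly once
        }

        foreach (var courseId in courseIds)
        {
            int wanted = courseId * 100;
            // Single() must scan everything; First() stops at the first match.
            _ = useFirst
                ? students.First(s => s == wanted)
                : students.Single(s => s == wanted);
        }

        return calls;
    }
}
```

CountCalls(false, false) gives 16, CountCalls(true, false) gives 4, and CountCalls(false, true) gives 10 (1 + 2 + 3 + 4), matching the numbers in the answer.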
Just out of curiosity, you can add this class to your code:
public static class Tools
{
public static IEnumerable<T> DebugEnumeration<T>(this IEnumerable<T> enu)
{
Console.WriteLine("Begin Enumeration");
foreach (var res in enu)
{
yield return res;
}
}
}
and then do:
.SelectMany(c => studentRepository.GetStudentsOnCourse(c.Value))
.DebugEnumeration();
This will show you when the SelectMany is enumerated.

Using Include with Intersect/Union/Exclude in Linq

What seemed that it should be a relatively straight-forward task has turned into something of a surprisingly complex issue. To the point that I'm starting to think that my methodology perhaps is simply out of scope with the capabilities of Linq.
What I'm trying to do is piece-together a Linq query and then invoke .Include() in order to pull-in values from a number of child entities. For example, let's say I have these entities:
public class Parent
{
public int Id { get; set; }
public string Name { get; set; }
public string Location { get; set; }
public ISet<Child> Children { get; set; }
}
public class Child
{
public int Id { get; set; }
public int ParentId { get; set; }
public Parent Parent { get; set; }
public string Name { get; set; }
}
And let's say I want to perform a query to retrieve records from Parent, where Name is some value and Location is some other value, and then include Child records, too. But for whatever reason I don't know the query values for Name and Location at the same time, so I have to take two separate queryables and join them, such:
MyDbContext C = new MyDbContext();
var queryOne = C.Parent.Where(p => p.Name == myName);
var queryTwo = C.Parent.Where(p => p.Location == myLocation);
var finalQuery = queryOne.Intersect(queryTwo);
That works fine, producing results exactly as if I had just done:
var query = C.Parent.Where(p => p.Name == myName && p.Location == myLocation);
And similarly, I can:
var finalQuery = queryOne.Union(queryTwo);
To give me results just as if I had:
var query = C.Parent.Where(p => p.Name == myName || p.Location == myLocation);
What I cannot do, however, once the Intersect() or Union() is applied, is then go about mapping the Child using Include(), as in:
finalQuery.Include(p => p.Children);
This code will compile, but produces results as follows:
In the case of a Union(), a result set will be produced, but no Child entities will be enumerated.
In the case of an Intersect(), a run-time error is generated upon attempt to apply Include(), as follows:
Expression of type
'System.Collections.Generic.IEnumerable`1[Microsoft.EntityFrameworkCore.Query.Internal.AnonymousObject]'
cannot be used for parameter of type
'System.Collections.Generic.IEnumerable`1[System.Object]' of method
'System.Collections.Generic.IEnumerable`1[System.Object]
Intersect[Object](System.Collections.Generic.IEnumerable`1[System.Object],
System.Collections.Generic.IEnumerable`1[System.Object])'
The thing that baffles me is that this code will work exactly as expected:
var query = C.Parent.Where(p => p.Name == myName).Where(p => p.Location == myLocation);
query.Include(p => p.Children);
I.e., with the results as desired, including the Child entities enumerated.

my methodology perhaps is simply out of scope with the capabilities of Linq
The problem is not LINQ, but EF Core query translation, and specifically the lack of Intersect / Union / Concat / Except method SQL translation, tracked by #6812 Query: Translate IQueryable.Concat/Union/Intersect/Except/etc. to server.
In short, such queries currently use client evaluation, which, combined with how EF Core handles Include, leads to many unexpected runtime exceptions (like your case #2) or wrong behaviors (like the ignored Includes in your case #1).
So while your approach technically perfectly makes sense, according to the EF Core team leader response
Changing this to producing a single SQL query on the server isn't currently a top priority
so this currently is not even planned for 3.0 release, although there are plans to change (rewrite) the whole query translation pipeline, which might allow implementing that as well.
For now, you have no options. You may try processing the query expression trees yourself, but that's a complicated task and you'll probably find why it is not implemented yet :) If you can convert your queries to the equivalent single query with combined Where condition, then applying Include will be fine.
P.S. Note that even though your approach technically "works" without Include, performance-wise the way it is evaluated client side makes it absolutely not equivalent to the corresponding single query.
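When the goal is the AND case (the Intersect of two filters over the same set), one way to get a single query with a combined condition is to chain Where calls as each value becomes known, as the question's own working example does. A minimal sketch, with a hypothetical ParentRow type and a Filter helper that treats null as "no filter" (both are assumptions of this example):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Parent entity.
public sealed class ParentRow
{
    public string Name { get; set; }
    public string Location { get; set; }
}

public static class ComposeSketch
{
    // Builds one query step by step; each Where narrows the previous one (logical AND).
    public static IQueryable<ParentRow> Filter(
        IQueryable<ParentRow> source, string name, string location)
    {
        IQueryable<ParentRow> query = source;

        if (name != null)
        {
            query = query.Where(p => p.Name == name);
        }

        if (location != null)
        {
            query = query.Where(p => p.Location == location);
        }

        return query; // still a single IQueryable, nothing has executed yet
    }
}
```

Because the result is still one IQueryable over the original set, an Include applied to it is in the same situation as the chained Where example from the question. The OR case (Union) is not covered by this sketch; it would need the two predicates combined into one expression.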

A long time has gone by, but this .Include problem still exists in EF 6. However, there is a workaround: Append every child request with .Include before intersecting/Unionizing.
MyDbContext db = new MyDbContext();
var queryOne = db.Parents.Where(p => p.Name == parent.Name).Include("Children");
var queryTwo = db.Parents.Where(p => p.Location == parent.Location).Include("Children");
var finalQuery = queryOne.Intersect(queryTwo);
As stated by @Ivan Stoev, the Intersect/Union is performed on the already-fetched data, while .Include is fine when applied at request time.
So, as of now, you have this one option available.

Entity Framework Include directive not getting all expected related rows

While debugging some performance issues I discovered that Entity framework was loading a lot of records via lazy loading (900 extra query calls ain't fast!) but I was sure I had the correct include. I've managed to get this down to quite a small test case to demonstrate the confusion I'm having, the actual use case is more complex so I don't have a lot of scope to re-work the signature of what I'm doing but hopefully this is a clear example of the issue I'm having.
Documents have Many MetaInfo rows related. I want to get all documents grouped by MetaInfo rows with a specific value, but I want all the MetaInfo rows included so I don't have to fire off a new request for all the Documents MetaInfo.
So I've got the following Query.
ctx.Configuration.LazyLoadingEnabled = false;
var DocsByCreator = ctx.Documents
.Include(d => d.MetaInfo) // Load all the metaInfo for each object
.SelectMany(d => d.MetaInfo.Where(m => m.Name == "Author") // For each Author
.Select(m => new { Doc = d, Creator = m })) // Create an object with the Author and the Document they authored.
.ToList(); // Actualize the collection
I expected this to have all the Document / Author pairs, and have all the Document MetaInfo property filled.
That's not what happens, I get the Document objects, and the Authors just fine, but the Documents MetaInfo property ONLY has MetaInfo objects with Name == "Author"
If I move the where clause out of the select many it does the same, unless I move it to after the actualisation (which while here might not be a big deal, it is in the real application as it means we're getting a huge amount more data than we want to deal with.)
After playing with a bunch of different ways to do this I think it really looks like the issue is when you do a select(...new...) as well as the where and the include. Doing the select, or the Where clause after actualisation makes the data appear the way I expected it to.
I figured it was an issue with the MetaInfo property of Document being filtered, so I rewrote it as follows to test the theory and was surprised to find that this also gives the same (I think wrong) result.
ctx.Configuration.LazyLoadingEnabled = false;
var DocsByCreator = ctx.Meta
.Where(m => m.Name == "Author")
.Include(m => m.Document.MetaInfo) // Load all the metaInfo for Document
.Select(m => new { Doc = m.Document, Creator = m })
.ToList(); // Actualize the collection
Since we're not putting the where on the Document.MetaInfo property I expected this to bypass the problem, but strangely it doesn't: the documents still only appear to have "Author" MetaInfo objects.
I've created a simple test project and uploaded it to GitHub with a bunch of test cases in it. As far as I can tell they should all pass, but only the ones with premature actualisation pass.
https://github.com/Robert-Laverick/EFIncludeIssue
Anyone got any theories? Am I abusing EF / SQL in some way I'm missing? Is there anything I can do differently to get the same organisation of results? Is this a bug in EF that's just been hidden from view by the LazyLoad being on by default, and it being a bit of an odd group type operation?

This is a limitation in EF in that Includes will be ignored if the scope of the entities returned is changed from where the include was introduced.
I couldn't find the reference to this for EF6, but it is documented for EF Core. (https://learn.microsoft.com/en-us/ef/core/querying/related-data) (see "ignore includes") I suspect it is a limit in place to stop EF's SQL generation from going completely AWOL in certain scenarios.
So while var docs = context.Documents.Include(d => d.Metas) would return the metas eager loaded against the document; As soon as you .SelectMany() you are changing what EF is supposed to return, so the Include statement is ignored.
If you want to return all documents, and include a property that is their author:
var DocsByCreator = ctx.Documents
.Include(d => d.MetaInfo)
.ToList() // Materialize the documents and their Metas.
.SelectMany(d => d.MetaInfo.Where(m => m.Name == "Author") // For each Author
.Select(m => new { Doc = d, Creator = m })) // Create an object with the Author and the Document they authored.
.ToList(); // grab your collection of Doc and Author.
If you only want documents that have authors:
var DocsByCreator = ctx.Documents
    .Include(d => d.MetaInfo)
    .Where(d => d.MetaInfo.Any(m => m.Name == "Author"))
    .ToList() // Materialize the documents and their Metas.
    .SelectMany(d => d.MetaInfo.Where(m => m.Name == "Author") // For each Author
        .Select(m => new { Doc = d, Creator = m })) // Create an object with the Author and the Document they authored.
    .ToList(); // grab your collection of Doc and Author.
This means you will want to be sure that all of your filtering logic is done above that first ToList() call. Alternatively, you can consider resolving the Author meta after the query, such as when view models are populated, or via an unmapped "Author" property on Document that resolves it. I generally avoid unmapped properties, though, because if their use slips into an EF query you get a nasty error at runtime.
Edit: Based on the requirement to skip & take, I would recommend utilizing view models to return data rather than returning entities. Using a view model you can instruct EF to return just the raw data you need, and compose the view models either with simple filler code or with AutoMapper, which plays nicely with IQueryable and EF and can handle most deferred cases like this.
For example:
public class DocumentViewModel
{
public int DocumentId { get; set; }
public string Name { get; set; }
public ICollection<MetaViewModel> Metas { get; set; } = new List<MetaViewModel>();
[NotMapped]
public string Author // This could be updated to be a Meta, or a specialized view model.
{
get { return Metas.SingleOrDefault(x => x.Name == "Author")?.Value; }
}
}
public class MetaViewModel
{
public int MetaId { get; set; }
public string Name { get; set; }
public string Value { get; set; }
}
Then the query:
var viewModels = context.Documents
    .OrderBy(x => x.DocumentId) // Paging needs a deterministic order before Skip/Take.
    .Select(x => new DocumentViewModel
    {
        DocumentId = x.DocumentId,
        Name = x.Name,
        Metas = x.Metas.Select(m => new MetaViewModel
        {
            MetaId = m.MetaId,
            Name = m.Name,
            Value = m.Value
        }).ToList()
    })
    .Skip(pageNumber * pageSize)
    .Take(pageSize)
    .ToList();
The relationship of an "author" to a document is implied, not enforced, at the data level. This solution keeps the entity models "pure" to the data representation and lets the code handle transforming that implied relationship into exposing a document's author.
The .Select() population can be handled by Automapper using .ProjectTo<TViewModel>().
By returning view models rather than entities you can avoid issues like this where .Include() operations get invalidated, plus avoid issues due to the temptation of detaching and reattaching entities between different contexts, plus improve performance and resource usage by only selecting and transmitting the data needed, and avoiding lazy load serialization issues if you forget to disable lazy-load or unexpected #null data with it.

"Group By" on entities from document-based storage?

If this is a duplicate, I apologize; I have done my share of searching, but I haven't figured out what to search for.
Let's say you have a student database and you want to average their scores based on gender. With your standard issue relational database, this is pretty trivial. It might require a query with an explicit join, or you may just use navigation properties or something, but it's going to look a little like this:
var averageScore = db.Grades
.Where(grade => grade.Student.Gender == selectedGender)
.Average();
But what if you're connected to a document-based system and your data structure is, instead, just a Student object with a collection of Grade objects embedded in it?
var averageScore = db.Students.GroupBy(student => student.Gender)
.ThisDoesNotWork(no => matter.What);
I have tried three dozen different ways to do a GroupBy that manages to transform collections of values into a single collection of values sharing a common key, but none of them have worked. Most of my attempts have involved attempting a SelectMany inside the GroupBy, and--if that's possible--let's just say that the compiler doesn't like my bedside manner.
Edit: Not sure what you mean by "format." The data structure we're talking about is just a class with a collection as one of its members.
class Student
{
public string Name { get; set; }
public Gender Gender { get; set; }
public ICollection<int> Grades { get; set; }
}

SelectMany will flatten the collection for you.
var average = db.Students
.Where(s => s.Gender == selectedGender)
.SelectMany(s => s.Grades)
.Average();
GroupBy, on the other hand, will group specific elements together. So, if you want to group all by gender:
var averages = db.Students
.GroupBy(
s => s.Gender,
(gender, group) => group
.SelectMany(s => s.Grades)
.Average());
"group" is an IEnumerable, ie. all the students that fit each group.

Rich's answer got me thinking a little harder about SelectMany() (plus I got off work and was bored), so I put in a little more work and here's what I've got:
var averagesByGender = db.Students
.SelectMany(
student => student.Grades,
(student, grade) => new { Gender = student.Gender, Grade = grade })
.GroupBy(
record => record.Gender,
record => record.Grade)
.Select(group => new { group.Key, Average = group.Average() });
The SelectMany() works pretty much exactly like a join statement would in any SQL database: you get one record per grade with the associated student information, and from there you can query whatever you want to in the old-fashioned way (or, as in my example, you can get one result for each of the genders represented).
The only wrinkle is that, apparently, this is too relational... As in RavenDB refuses to try to translate it into a query. Luckily enough, that's irrelevant if you hide it behind .ToList(). Wonder if it will work the same way with MongoDB.
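For reference, once the data is in memory (for example after the ToList() mentioned above), the grouping can be written directly against LINQ to Objects. This is a sketch only: StudentDoc and Gender mirror the class from the question, and AverageByGender is a made-up helper name:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum Gender { Female, Male }

// Mirrors the Student class from the question: grades are embedded in the document.
public sealed class StudentDoc
{
    public string Name { get; set; }
    public Gender Gender { get; set; }
    public ICollection<int> Grades { get; set; } = new List<int>();
}

public static class GroupingSketch
{
    // In-memory only: group students by gender, flatten their grades, average them.
    public static Dictionary<Gender, double> AverageByGender(IEnumerable<StudentDoc> students)
    {
        return students
            .GroupBy(s => s.Gender)
            .ToDictionary(
                g => g.Key,
                g => g.SelectMany(s => s.Grades).Average());
    }
}
```

Note that Average() throws on an empty sequence, so a gender whose students have no grades at all would need to be filtered out first.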

LINQ Query - Only get Order and MAX Date from Child Collection

I'm trying to get a list that displays 2 values in a label from a parent and child (1-*) entity collection model.
I have 3 entities:
[Customer]: CustomerId, Name, Address, ...
[Order]: OrderId, OrderDate, EmployeeId, Total, ...
[OrderStatus]: OrderStatusId, StatusLevel, StatusDate, ...
A Customer can have MANY Order, which in turn an Order can have MANY OrderStatus, i.e.
[Customer] 1--* [Order] 1--* [OrderStatus]
Given a CustomerId, I want to get all of the Orders (just OrderId) and the LATEST (MAX?) OrderStatus.StatusDate for that Order.
I've made a couple of attempts, but can't seem to get the results I want.
private IQueryable<Customer> GetOrderData(string customerId)
{
var ordersWithLatestStatusDate = Context.Customers
// Note: I am not sure if I should add the .Expand() extension methods here for the other two entity collections since I want these queries to be as performant as possible and since I am projecting below (only need to display 2 fields for each record in the IQueryable<T>, but thinking I should now after some contemplation.
.Where(x => x.CustomerId == SelectedCustomer.CustomerId)
.Select(x => new Custom
{
CustomerId = x.CustomerId,
...
// I would like to project my Child and GrandChild Collections, i.e. Orders and OrderStatuses here but don't know how to do that. I learned that by projecting, one does not need to "Include/Expand" these extension methods.
});
return ordersWithLatestStatusDate ;
}
---- UPDATE 1 ----
After the great solution from User: lazyberezovsky, I tried the following:
var query = Context.Customers
.Where(c => c.CustomerId == SelectedCustomer.CustomerId)
.Select(o => new Customer
{
Name = c.Name,
LatestOrderDate = o.OrderStatus.Max(s => s.StatusDate)
});
In my haste with the initial posting, I didn't paste everything in correctly, since it was mostly from memory and I didn't have the exact code for reference at the time. My method is a strongly-typed IQueryable where I need it to return a collection of items of type T, due to a constraint within a rigid API that I have to go through that has an IQueryable query as one of its parameters. I am aware I can add other entities/attributes by using the extension methods .Expand() and/or .Select(). One will notice that my latest UPDATED query above has an added "new Customer" within the .Select() where it was once anonymous. I'm positive that is why the query failed: it couldn't be turned into a valid Uri because LatestOrderDate is not a property of Customer at the server level. FYI, upon seeing the first answer below, I had added that property to my client-side Customer class with a simple { get; set; }. So given this, can I somehow still have a Customer collection while only bringing back those 2 fields from 2 different entities? The solution below looked so promising and ingenious!
---- END UPDATE 1 ----
FYI, the technologies I'm using are OData (WCF), Silverlight, C#.
Any tips/links will be appreciated.

This will give you list of { OrderId, LatestDate } objects
var query = Context.Customers
.Where(c => c.CustomerId == SelectedCustomer.CustomerId)
.SelectMany(c => c.Orders)
.Select(o => new {
OrderId = o.OrderId,
LatestDate = o.Statuses.Max(s => s.StatusDate) });
UPDATE: construct the objects in memory
var query = Context.Customers
.Where(c => c.CustomerId == SelectedCustomer.CustomerId)
.SelectMany(c => c.Orders)
.AsEnumerable() // goes in-memory
.Select(o => new {
OrderId = o.OrderId,
LatestDate = o.Statuses.Max(s => s.StatusDate) });
Also grouping could help here.
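As a sketch of the grouping idea, assuming the status rows have already been brought into memory as flat (OrderId, StatusDate) pairs (that shape and the helper name LatestByOrder are assumptions of this example, not something the service provides):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LatestStatusSketch
{
    // In-memory only: one entry per order, holding its most recent status date.
    public static Dictionary<int, DateTime> LatestByOrder(
        IEnumerable<(int OrderId, DateTime StatusDate)> statuses)
    {
        return statuses
            .GroupBy(s => s.OrderId)
            .ToDictionary(
                g => g.Key,
                g => g.Max(s => s.StatusDate));
    }
}
```

Orders without any status row simply do not appear in the result, which also sidesteps calling Max on an empty collection.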

If I read this correctly you want a Customer entity and then a single value computed from its Orders property. Currently this is not supported in OData. OData doesn't support computed values in the queries. So no expressions in the projections, no aggregates and so on.
Unfortunately even with two queries this is currently not possible since OData doesn't support any way of expressing the MAX functionality.
If you have control over the service, you could write a server side function/service operation to execute this kind of query.
