"Group By" on entities from document-based storage? - c#

If this is a duplicate, I apologize; I have done my share of searching, but I haven't figured out what to search for.
Let's say you have a student database and you want to average their scores based on gender. With your standard issue relational database, this is pretty trivial. It might require a query with an explicit join, or you may just use navigation properties or something, but it's going to look a little like this:
var averageScore = db.Grades
    .Where(grade => grade.Student.Gender == selectedGender)
    .Average(grade => grade.Score); // assumes Grade exposes a numeric Score property
But what if you're connected to a document-based system and your data structure is, instead, just a Student object with a collection of Grade objects embedded in it?
var averageScore = db.Students.GroupBy(student => student.Gender)
.ThisDoesNotWork(no => matter.What);
I have tried three dozen different ways to do a GroupBy that manages to transform collections of values into a single collection of values sharing a common key, but none of them have worked. Most of my attempts have involved attempting a SelectMany inside the GroupBy, and--if that's possible--let's just say that the compiler doesn't like my bedside manner.
Edit: Not sure what you mean by "format." The data structure we're talking about is just a class with a collection as one of its members.
class Student
{
    public string Name { get; set; }
    public Gender Gender { get; set; }
    public ICollection<int> Grades { get; set; }
}

SelectMany will flatten the collection for you.
var average = db.Students
.Where(s => s.Gender == selectedGender)
.SelectMany(s => s.Grades)
.Average();
GroupBy, on the other hand, will group specific elements together. So, if you want to group all by gender:
var averages = db.Students
    .GroupBy(
        s => s.Gender,
        (gender, group) => group
            .SelectMany(s => s.Grades)
            .Average());
"group" is an IEnumerable<Student>, i.e. all the students that share that gender.

Rich's answer got me thinking a little harder about SelectMany() (plus I got off work and was bored), so I put in a little more work and here's what I've got:
var averagesByGender = db.Students
.SelectMany(
student => student.Grades,
(student, grade) => new { Gender = student.Gender, Grade = grade })
.GroupBy(
record => record.Gender,
record => record.Grade)
.Select(group => new { group.Key, Average = group.Average() });
The SelectMany() works pretty much exactly like a join statement would in any SQL database: you get one record per grade with the associated student information, and from there you can query whatever you want to in the old-fashioned way (or, as in my example, you can get one result for each of the genders represented).
The only wrinkle is that, apparently, this is too relational: RavenDB refuses to try to translate it into a query. Luckily enough, that's irrelevant if you hide it behind .ToList(), since everything after that call runs in memory. I wonder if it will work the same way with MongoDB.
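For anyone who wants to try the flatten-then-group approach without a database, here is a minimal in-memory sketch. The Gender enum values, the GradeStats class and the AverageByGender method are names made up for illustration; only the Student shape comes from the question.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical enum; the question only says that Student has a Gender member.
public enum Gender { Female, Male }

public class Student
{
    public string Name { get; set; }
    public Gender Gender { get; set; }
    public ICollection<int> Grades { get; set; }
}

public static class GradeStats
{
    // Flattens every student's grades into (Gender, Grade) records,
    // groups the records by gender and averages each group.
    public static Dictionary<Gender, double> AverageByGender(IEnumerable<Student> students)
    {
        return students
            .SelectMany(
                student => student.Grades,
                (student, grade) => new { student.Gender, Grade = grade })
            .GroupBy(record => record.Gender, record => record.Grade)
            .ToDictionary(group => group.Key, group => group.Average());
    }
}
```

Because this runs in LINQ to Objects, a gender whose students have no grades simply does not appear in the result.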

Related

How to implement nested search in a list

I'm trying to solve a problem where I need to filter a list which holds my custom reference objects. The search criteria are based on nested properties. For reference, let's consider Student and Subject classes.
public class Student
{
    public String Name {get;set;}
    public List<Subject> Subjects {get;set;}
}
public class Subject
{
    public String Name {get;set;}
}
Not only do I want to search Students by their names, but the same search should also work with subject names. I have a single field where the text can be entered. To search students by name, I've done:
FilteredList = Students.Where(s => s.Name.Contains(searchQuery));
Now, I also want to search students by the subject names but only want to show the matching results. A student can take many courses but a query of "Chemistry" should only show students who are taking this course but the rest of the courses they're taking should be ignored.
Basically my FilteredList is bound with ListView and Subjects list should only contain matching results. I'm keeping original source aside as Students. Any help implementing this search is highly appreciated.
You could use the LINQ function Any in this case. Something like this should give you the intended results:
FilteredList = Students.Where(s => s.Subjects.Any(subs => subs.Name.Contains(subjectSearchQuery)));
If you want to use both filters at the same time, you can chain the two Where calls:
FilteredList = Students.Where(s => s.Name.Contains(searchQuery))
    .Where(s => s.Subjects.Any(subs => subs.Name.Contains(subjectSearchQuery)));
EDIT: It seems I misunderstood the question; here is what I think is the right answer (see the comments on this answer).
In this case you want to use Select, in a fashion like this:
FilteredList = Students.Where(s => s.Name.Contains(studentNameFilter))
    .Select(s => new Student()
    {
        Name = s.Name,
        Subjects = s.Subjects.Where(sub => sub.Name.Contains(subjectNameFilter)).ToList()
    });
This should give you the results you want.
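As a rough sketch of the subject-name half of the search, the following keeps only students who take a matching subject and trims each copy's Subjects list down to the matches. StudentSearch and BySubject are hypothetical names; the original Students list is left untouched because new Student objects are created.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Subject
{
    public string Name { get; set; }
}

public class Student
{
    public string Name { get; set; }
    public List<Subject> Subjects { get; set; }
}

public static class StudentSearch
{
    // Projects each student to a copy whose Subjects contain only the matches,
    // then drops the copies that ended up with no matching subject.
    public static List<Student> BySubject(IEnumerable<Student> students, string query)
    {
        return students
            .Select(s => new Student
            {
                Name = s.Name,
                Subjects = s.Subjects.Where(sub => sub.Name.Contains(query)).ToList()
            })
            .Where(s => s.Subjects.Count > 0)
            .ToList();
    }
}
```

Note that string.Contains is case-sensitive here, so "chemistry" would not match "Chemistry" in this sketch.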

Combine Two Properties From Entity In List And Flatten It With Linq

I have a list of entities in the structure below. How can I create a distinct List<int> of all the AwayTeamId and HomeTeamId values with LINQ in one call? I know I can do a SelectMany on the HomeTeamId and get all of those, but I also need the AwayTeamId included.
class Game {
    public int AwayTeamId;
    public int HomeTeamId;
}
Uriil's answer may be even shorter:
var result = games
    .SelectMany(game => new[] { game.AwayTeamId, game.HomeTeamId })
    .Distinct();
There's no need for an additional .Select or for creating a List for every Game record.
Assuming you are just after a flat list of all the team ids (home or away), then how about UNIONing two SELECTs?
var teamIds = games.Select(g => g.HomeTeamId).Union(games.Select(g => g.AwayTeamId));
[games being the list of Game entities in my example above]
This will work:
var result = entityContext.ListOfGames
.Select(p=>new List<int>{p.AwayTeamId, p.HomeTeamId})
.SelectMany(p=>p).Distinct();
If it's LINQ to Entities, you will need to call .ToList() after ListOfGames to make this solution work.
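To compare the SelectMany and Union suggestions side by side, here is a small in-memory sketch. The TeamIds class and its method names are invented for the example, and Game uses properties instead of the bare fields in the question.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Game
{
    public int AwayTeamId { get; set; }
    public int HomeTeamId { get; set; }
}

public static class TeamIds
{
    // One pass: emit both ids per game, then remove duplicates.
    public static List<int> ViaSelectMany(IEnumerable<Game> games) =>
        games.SelectMany(g => new[] { g.AwayTeamId, g.HomeTeamId }).Distinct().ToList();

    // Two projections combined with Union, which removes duplicates itself.
    public static List<int> ViaUnion(IEnumerable<Game> games) =>
        games.Select(g => g.HomeTeamId).Union(games.Select(g => g.AwayTeamId)).ToList();
}
```

Both return the same set of ids, although the order of the elements may differ between the two.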

LINQ Query - Only get Order and MAX Date from Child Collection

I'm trying to get a list that displays 2 values in a label from a parent and child (1-*) entity collection model.
I have 3 entities:
[Customer]: CustomerId, Name, Address, ...
[Order]: OrderId, OrderDate, EmployeeId, Total, ...
[OrderStatus]: OrderStatusId, StatusLevel, StatusDate, ...
A Customer can have MANY Orders, and each Order can in turn have MANY OrderStatuses, i.e.
[Customer] 1--* [Order] 1--* [OrderStatus]
Given a CustomerId, I want to get all of the Orders (just OrderId) and the LATEST (MAX?) OrderStatus.StatusDate for that Order.
I've made a couple of attempts, but can't seem to get the results I want.
private IQueryable<Customer> GetOrderData(string customerId)
{
var ordersWithLatestStatusDate = Context.Customers
// Note: I am not sure if I should add the .Expand() extension methods here for the other two entity collections, since I want these queries to be as performant as possible and I am projecting below (I only need to display 2 fields for each record in the IQueryable<T>), but after some contemplation I think I should.
.Where(x => x.CustomerId == SelectedCustomer.CustomerId)
.Select(x => new Custom
{
CustomerId = x.CustomerId,
...
// I would like to project my Child and GrandChild Collections, i.e. Orders and OrderStatuses here but don't know how to do that. I learned that by projecting, one does not need to "Include/Expand" these extension methods.
});
return ordersWithLatestStatusDate ;
}
---- UPDATE 1 ----
After the great solution from User: lazyberezovsky, I tried the following:
var query = Context.Customers
.Where(c => c.CustomerId == SelectedCustomer.CustomerId)
.Select(o => new Customer
{
Name = c.Name,
LatestOrderDate = o.OrderStatus.Max(s => s.StatusDate)
});
In my haste with the initial post, I didn't paste everything in correctly, since it was mostly from memory and I didn't have the exact code for reference at the time. My method returns a strongly typed IQueryable<T>: I need it to return a collection of items of type T because a rigid API that I have to go through takes an IQueryable query as one of its parameters. I am aware I can add other entities/attributes by using the extension methods .Expand() and/or .Select(). Note that my updated query above has an added "new Customer" within the .Select() where it was once anonymous. I'm positive that is why the query failed: it couldn't be turned into a valid URI because LatestOrderDate is not a property of Customer at the server level. FYI, upon seeing the first answer below, I had added that property to my client-side Customer class with a simple { get; set; }. Given this, can I somehow still have a Customer collection while only bringing back those 2 fields from 2 different entities? The solution below looked so promising and ingenious!
---- END UPDATE 1 ----
FYI, the technologies I'm using are OData (WCF), Silverlight, C#.
Any tips/links will be appreciated.
This will give you a list of { OrderId, LatestDate } objects:
var query = Context.Customers
.Where(c => c.CustomerId == SelectedCustomer.CustomerId)
.SelectMany(c => c.Orders)
.Select(o => new {
OrderId = o.OrderId,
LatestDate = o.Statuses.Max(s => s.StatusDate) });
UPDATE: construct the objects in memory:
var query = Context.Customers
.Where(c => c.CustomerId == SelectedCustomer.CustomerId)
.SelectMany(c => c.Orders)
.AsEnumerable() // goes in-memory
.Select(o => new {
OrderId = o.OrderId,
LatestDate = o.Statuses.Max(s => s.StatusDate) });
Also grouping could help here.
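Once the orders are in memory, the projection itself is plain LINQ to Objects. The sketch below is one way it might look. The class shapes are trimmed down to the two members that matter, and OrderQueries and LatestStatusDates are hypothetical names. It says nothing about what an OData provider can translate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class OrderStatus
{
    public DateTime StatusDate { get; set; }
}

public class Order
{
    public int OrderId { get; set; }
    public List<OrderStatus> Statuses { get; set; }
}

public static class OrderQueries
{
    // Pairs each order id with the most recent status date.
    // The cast to DateTime? makes Max return null for an order without
    // statuses instead of throwing InvalidOperationException.
    public static List<(int OrderId, DateTime? LatestDate)> LatestStatusDates(IEnumerable<Order> orders)
    {
        return orders
            .Select(o => (
                OrderId: o.OrderId,
                LatestDate: o.Statuses.Max(s => (DateTime?)s.StatusDate)))
            .ToList();
    }
}
```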
If I read this correctly you want a Customer entity and then a single value computed from its Orders property. Currently this is not supported in OData. OData doesn't support computed values in the queries. So no expressions in the projections, no aggregates and so on.
Unfortunately even with two queries this is currently not possible since OData doesn't support any way of expressing the MAX functionality.
If you have control over the service, you could write a server side function/service operation to execute this kind of query.

Distinct elements in LINQ

I have a situation where I display a list of products for a customer. There are two kinds of products, so if a customer is registered to two products, then both products get displayed. I need to display distinct rows. I did this:
var queryProducts = DbContext.CustomerProducts
    .Where(p => p.Customers_Id == customerID)
    .ToList()
    .Select(r => new
    {
        r.Id,
        r.Products_Id,
        ProductName = r.Product.Name,
        ShortName = r.Product.ShortName,
        Description = r.Product.Description,
        IsActive = r.Product.IsActive
    }).Distinct();
Here, customerID is the value that I get from a dropdown list. However, it still displays the same row twice. Can you please let me know how I can display only distinct records?
The most likely reason is that Distinct, when called with no comparer, compares all the properties of the anonymous type for equality. I suspect your Id is unique for each row, so no two rows are ever considered equal and Distinct removes nothing.
You can try something like
myCustomerList.GroupBy(product => product.Products_Id).Select(grp => grp.First());
I found this as answers to
How to get distinct instance from a list by Lambda or LINQ
Distinct() with lambda?
Have a look at LINQ Select Distinct with Anonymous Types
I'm guessing r.ID is varying between the two products that are the same, but you have the same Products_Id?
You can write an implementation of IEqualityComparer<CustomerProduct>. Once you've got that, then you can use this:
DbContext.CustomerProducts.Where(p => p.Customers_Id == customerId)
.ToList()
.Distinct(new MyComparer())
.Select(r => new {
// etc.
public class MyComparer : IEqualityComparer<CustomerProduct>
{
    // implement Equals and GetHashCode here
}
Note, using this anonymous comparer might work better for you, but it compares all properties in the anonymous type, not just the customer ID as specified in the question.
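For reference, a comparer along those lines could look roughly like this. It treats two rows as equal when their Products_Id matches, which is an assumption about what "distinct" should mean here. ProductIdComparer and ProductFilters are made-up names, and CustomerProduct is cut down to two properties.

```csharp
using System.Collections.Generic;
using System.Linq;

public class CustomerProduct
{
    public int Id { get; set; }
    public int Products_Id { get; set; }
}

// Two rows are considered equal when they point at the same product.
public class ProductIdComparer : IEqualityComparer<CustomerProduct>
{
    public bool Equals(CustomerProduct x, CustomerProduct y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Products_Id == y.Products_Id;
    }

    // Must agree with Equals: equal rows have to produce equal hash codes.
    public int GetHashCode(CustomerProduct obj) => obj.Products_Id.GetHashCode();
}

public static class ProductFilters
{
    public static List<CustomerProduct> DistinctByProduct(IEnumerable<CustomerProduct> rows) =>
        rows.Distinct(new ProductIdComparer()).ToList();
}
```

This works on in-memory sequences, which is why the answer above calls .ToList() before .Distinct(...).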

In LINQ, how can I do an .OrderBy() on data that came from my .Include()?

Here's what I'm doing:
List<Category> categories =
db.Categories.Include("SubCategories").OrderBy(c => c.Order).ToList();
I have a column on my categories table called "Order" which simply holds an integer that gives the table some kind of sorting order.
I have the same column on my "SubCategories" table...
I want to know the simplest solution to add the sort on my subcategories table... something like:
List<Category> categories =
db.Categories.Include("SubCategories").OrderBy(c => c.Order)
.ThenBy(c => c.SubCategories as x => x.Order).ToList();
I'd like to keep it in this type of LINQ format... (method format)...
Keep in mind, I'm working in MVC and need to return it to a view as a model. I've been having trouble with errors because of anonymous types...
I'm not sure if this is supported, but here's how it might be done:
List<Category> categories =
    db.Categories.Include(c => c.SubCategories.OrderBy(s => s.Order)).OrderBy(c => c.Order).ToList();
The Include method now supports Expressions like this, but I'm not certain if it supports ordering too.
You might be better off sorting the subcategories when you use them, probably in your view.
For example:
@foreach (var cat in Model.Categories) {
    @cat.Name
    @foreach (var sub in cat.SubCategories.OrderBy(c => c.Order)) {
        @sub.Name
    }
}
You can split the single query into 2 queries which just fill up the context:
IQueryable<Category> categoryQuery = db.Categories.Where(c=> /*if needed*/);
List<Category> categories = categoryQuery.OrderBy(c => c.Order).ToList();
categoryQuery.SelectMany(c => c.SubCategories)
.OrderBy(sub => sub.Order)
.AsEnumerable().Count(); // will just iterate (and add to context) all results
You don't even need the error-prone string "SubCategories" anymore then.
If Category.SubCategories is a collection in itself, then you won't be able to order using the existing extension methods (and c => c.SubCategories as x => x.Order translates to almost nothing, basically saying that SubCategories is a Func<SubCategory, bool>)
If you're content to have the sorting done in memory (which shouldn't really be a problem since you're already fetching them from the database anyway, provided you don't have thousands of the things) you can implement your own custom IComparer<Category> which interrogates the SubCategories of each Category to determine whether one Category should be placed above or below another Category in a sort operation.
Your statement would then be:
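// AsEnumerable() switches to LINQ to Objects so the custom comparer runs in memory
var categories = db.Categories.Include("SubCategories").AsEnumerable().OrderBy(x => x, new CategorySubCategoryComparer()).ToList();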
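If the goal is just to hand the view a model in which both levels are ordered, a simpler in-memory pass may be enough. This sketch assumes the categories have already been loaded and that SubCategories is a settable List; CategorySorting and SortWithSubCategories are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

public class SubCategory
{
    public string Name { get; set; }
    public int Order { get; set; }
}

public class Category
{
    public string Name { get; set; }
    public int Order { get; set; }
    public List<SubCategory> SubCategories { get; set; }
}

public static class CategorySorting
{
    // Returns the categories sorted by Order. Each category's SubCategories
    // list is replaced with a copy sorted by Order, so the passed-in
    // Category objects are modified.
    public static List<Category> SortWithSubCategories(IEnumerable<Category> categories)
    {
        var sorted = categories.OrderBy(c => c.Order).ToList();
        foreach (var category in sorted)
        {
            category.SubCategories = category.SubCategories.OrderBy(s => s.Order).ToList();
        }
        return sorted;
    }
}
```

With tracked entities, reassigning a navigation collection like this may have side effects, so it is probably safer on detached objects or view models.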
