Counting in a Linq Query - c#

I have a fairly complicated join query that I use with my database. Upon running it I end up with results that contain a baseID and a bunch of other fields. I then want to take this baseID and determine how many times it occurs in a table like this:
TableToBeCounted (One to Many)
{
baseID,
childID
}
How do I perform a linq query that still uses the query I already have and then JOINs the count() with the baseID?
Something like this in untested linq code:
from k in db.Kingdom
join p in db.Phylum on k.KingdomID equals p.KingdomID
where p.PhylumID == "Something"
join c in db.Class on p.PhylumID equals c.PhylumID
select new {c.ClassID, c.Name};
I then want to take that code and count how many orders are nested within each class. I then want to append a column using linq so that my final select looks like this:
select new {c.ClassID, c.Name, o.Count()}//Or something like that.
The entire example is based upon the Biological Classification system.
Assume for the example that I have multiple tables:
Kingdom
|--Phylum
|--Class
|--Order
Each Phylum has a Phylum ID and a Kingdom ID, meaning that every phylum is a subset of a kingdom. Likewise, every Order is a subset of a Class. I want to count how many Orders belong to each Class.

select new {c.ClassID, c.Name, OrderCount = (from o in orders where o.ClassID == c.ClassID select o).Count()}
Is this possible for you? Best I can do without knowing more of the arch.

If the relationships are as you describe:
var foo = db.Class.Where(c=>c.Phylum.PhylumID == "something")
.Select(x=> new { ClassID = x.ClassID,
ClassName = x.Name,
NumOrders= x.Order.Count})
.ToList();
Side question: why are you joining those entities? Shouldn't they naturally be FK'd, thereby not requiring an explicit join?
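For anyone who wants to try the counting idea without a database, here is a minimal in-memory sketch using a group join (`join ... into`). The `ClassRow` and `OrderRow` types and the sample data are hypothetical stand-ins for the question's tables, and a real LINQ provider may translate the query differently than LINQ to Objects evaluates it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins for the Class and Order tables.
public record ClassRow(int ClassID, string Name);
public record OrderRow(int OrderID, int ClassID);

public static class OrderCountDemo
{
    // Returns one row per class, including classes that have no orders.
    public static List<(int ClassID, string Name, int OrderCount)> CountOrders(
        IEnumerable<ClassRow> classes, IEnumerable<OrderRow> orders)
    {
        var query =
            from c in classes
            // Group join: classOrders holds every order whose ClassID matches c.
            join o in orders on c.ClassID equals o.ClassID into classOrders
            select (c.ClassID, c.Name, OrderCount: classOrders.Count());
        return query.ToList();
    }

    public static void Main()
    {
        var classes = new[]
        {
            new ClassRow(1, "Mammalia"),
            new ClassRow(2, "Aves"),
            new ClassRow(3, "Insecta")
        };
        var orders = new[]
        {
            new OrderRow(10, 1),
            new OrderRow(11, 1),
            new OrderRow(12, 2)
        };

        foreach (var row in CountOrders(classes, orders))
            Console.WriteLine($"{row.ClassID} {row.Name}: {row.OrderCount}");
    }
}
```

Unlike a plain inner join followed by grouping, the group join keeps classes with zero orders in the result.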

Related

Linq join on two values

Suppose I have a list of {City, State}. It originally came from the database, and I have LocationID, but by now I loaded it into memory. Suppose I also have a table of fast food restaurants that has City and State as part of the record. I need to get a list of establishments that match city and state.
NOTE: I try to describe a simplified scenario; my business domain is completely different.
I came up with the following LINQ solution:
var establishments = from r in restaurants
from l in locations
where l.LocationId == id &&
l.City == r.City &&
l.State == r.State
select r
and I feel there must be something better. For starters, I already have City/State in memory - so to go back to the database only to have a join seems very inefficient. I am looking for some way to say {r.City, r.State} match Any(MyList) where MyList is my collection of City/State.
UPDATE
I tried to update based on suggestion below:
List<CityState> myCityStates = ...;
var establishments =
from r in restaurants
join l in myCityStates
on new { r.City, r.State } equals new { l.City, l.State } into gls
select r;
and I got the following compile error:
Error CS1941 The type of one of the expressions in the join clause is incorrect. Type inference failed in the call to 'Join'.
UPDATE 2
Compiler didn't like anonymous class in the join. I made it explicit and it stopped complaining. I'll see if it actually works in the morning...
It seems to me that you need this:
var establishments =
from r in restaurants
join l in locations.Where(x => x.LocationId == id)
on new { r.City, r.State } equals new { l.City, l.State }
select r;
Well, there isn't a lot more that you can do. As long as you rely on a table lookup, the only way to speed things up is to put an index on City and State.
The linq statement has to translate into a valid SQL Statement, where "Any" would translate to something like :
SELECT * FROM Restaurants where City in ('...all cities')
I don't know if other ORMs give better performance than EF for these types of scenarios, but it might be worth investigating. EF has never had a reputation for being fast on reads.
Edit: You can also do this:
List<string> names = new List<string> { "John", "Max", "Pete" };
bool has = customers.Any(cus => names.Contains(cus.FirstName));
This will produce the necessary IN('value1', 'value2' ...) functionality that you were looking for.
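On the CS1941 error from the update: a composite-key join only compiles when both anonymous types have the same property names and types in the same order, so a name or type mismatch between the two sides is a common cause. The sketch below runs against in-memory lists; `Restaurant` and `CityState` are hypothetical types, and only the City and State members come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types; only City and State come from the question.
public record Restaurant(string Name, string City, string State);
public record CityState(string City, string State);

public static class CompositeKeyDemo
{
    public static List<Restaurant> MatchLocations(
        IEnumerable<Restaurant> restaurants, IEnumerable<CityState> locations)
    {
        var query =
            from r in restaurants
            join l in locations
                // Both sides name their members explicitly, so the two
                // anonymous types are guaranteed to be the same type.
                on new { City = r.City, State = r.State }
                equals new { City = l.City, State = l.State }
            select r;

        // Distinct guards against duplicates if the location list repeats a pair.
        return query.Distinct().ToList();
    }

    public static void Main()
    {
        var restaurants = new[]
        {
            new Restaurant("Burger Hut", "Austin", "TX"),
            new Restaurant("Taco Stop", "Dallas", "TX"),
            new Restaurant("Fry Shack", "Austin", "MN")
        };
        var locations = new[] { new CityState("Austin", "TX") };

        foreach (var r in MatchLocations(restaurants, locations))
            Console.WriteLine(r.Name);
    }
}
```

Note that this uses a plain `join` without `into`. A group join (`join ... into gls`) followed by `select r` would return every restaurant, matched or not.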

Query returns all results quickly, but the call to Json then times out while loading unnecessary properties

I have the following query that gets a list of schools based on the criteria provided. Note: This database is very, very large with 10,000+ records. The end result is a list of 188 schools, which is exactly as we need.
return (from s in Context.Schools
join d in Context.Districts on s.DistrictID equals d.DistrictID
join r in Context.Rosters on s.SchoolID equals r.SchoolID
join te in Context.TestEvents on r.TestEventID equals te.TestEventID
join ta in Context.TestAdministrations on te.TestAdministrationID equals ta.TestAdministrationID
join sr in Context.ScoreResults on r.RosterID equals sr.RosterID into exists
from any in exists.DefaultIfEmpty()
where d.DistrictID == DistrictID
&& ta.SchoolYearID == SchoolYearID.Value
select s)
.Distinct()
.OrderBy(x => x.Name)
.ToList();
The problem is that when we call return Json(Schools, JsonRequestBehavior.AllowGet); to send our schools back to the client, the operation times out. When stepping through the code, it appears that for some reason the DbContext is trying to pull in ALL of the properties for this result set, including the ones we don't need. I already have everything I need from the database in this Schools object. Why does it go back and start creating all the associated objects? Is there a way to stop this?
This is an MVC application using EF 5 Code First.
Instead of selecting the whole entity, select a projection of only what you need:
var results = from s in Context.Schools
...
select new MyClassContainingOnlyAFewProperties {
Prop1 = s.Prop1,
Prop2 = s.Prop2,
//etc.
};
return results;
See also: What does Query Projection mean in Entity Framework?
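A likely explanation for the timeout is that the serializer walks every property of each School, and with lazy loading each navigation property it touches can trigger another query. Here is a minimal in-memory sketch of the projection idea. The `School`, `District` and `SchoolSummary` shapes are hypothetical, not the asker's actual model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity shapes; the navigation properties form a cycle
// (School -> District -> Schools), which is what a serializer would walk.
public class District
{
    public int DistrictID { get; set; }
    public List<School> Schools { get; } = new List<School>();
}

public class School
{
    public School(int schoolId, string name, District district)
    {
        SchoolID = schoolId;
        Name = name;
        District = district;
        district.Schools.Add(this);
    }

    public int SchoolID { get; }
    public string Name { get; }
    public District District { get; }
}

// Flat shape with no navigation properties, safe to hand to a serializer.
public record SchoolSummary(int SchoolID, string Name);

public static class ProjectionDemo
{
    public static List<SchoolSummary> Summarize(IEnumerable<School> schools) =>
        schools.Select(s => new SchoolSummary(s.SchoolID, s.Name))
               .Distinct()
               .OrderBy(x => x.Name)
               .ToList();

    public static void Main()
    {
        var district = new District { DistrictID = 7 };
        var schools = new[]
        {
            new School(2, "Westside", district),
            new School(1, "Eastside", district)
        };

        foreach (var s in Summarize(schools))
            Console.WriteLine($"{s.SchoolID}: {s.Name}");
    }
}
```

Because `SchoolSummary` holds only scalar values, nothing in the result can reach back into the entity graph.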

one-to-many projected LINQ query executes repeatedly

I am projecting LINQ to SQL results to strongly typed classes: Parent and Child. The performance difference between these two queries is large:
Slow Query - logging from the DataContext shows that a separate call to the db is being made for each parent
var q = from p in parenttable
select new Parent()
{
id = p.id,
Children = (from c in childtable
where c.parentid == p.id
select c).ToList()
};
return q.ToList(); //SLOW
Fast Query - logging from the DataContext shows a single db hit query that returns all required data
var q = from p in parenttable
select new Parent()
{
id = p.id,
Children = from c in childtable
where c.parentid == p.id
select c
};
return q.ToList(); //FAST
I want to force LINQ to use the single-query style of the second example, but populate the Parent classes with their Children objects directly. Otherwise, the Children property is an IQueryable<Child> that has to be queried to expose the Child objects.
The referenced questions do not appear to address my situation. Using db.LoadOptions does not work. Perhaps it requires the type to be a TEntity registered with the DataContext.
DataLoadOptions options = new DataLoadOptions();
options.LoadWith<Parent>(p => p.Children);
db.LoadOptions = options;
Please note: Parent and Child are simple types, not Table<TEntity> types, and there is no contextual relationship between Parent and Child. The subqueries are ad hoc.
The crux of the issue: in the second LINQ example I build IQueryable statements and do not call ToList(), and for some reason LINQ knows how to generate a single query that can retrieve all the required data. How do I populate my ad hoc projection with the actual data, as is accomplished in the first query? Also, if anyone could help me phrase my question better, I would appreciate it.
It's important to remember that LINQ queries rely on deferred execution. In your second query you aren't actually fetching any information about the children. You've created the queries, but you haven't actually executed them to get the results of those queries. If you were to iterate the list, and then iterate the Children collection of each item, you'd see it taking as much time as the first query.
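Deferred execution is easy to observe in memory. In this small sketch a counter records how often the source is actually read; the `Source` iterator and the `Evaluations` counter are made up for illustration and stand in for database round trips.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how many items have actually been read from the source.
    public static int Evaluations;

    private static IEnumerable<int> Source()
    {
        foreach (var n in new[] { 1, 2, 3 })
        {
            Evaluations++;
            yield return n;
        }
    }

    public static (int AfterDefine, int AfterToList, int AfterSecondPass) Run()
    {
        Evaluations = 0;

        // Defining the query reads nothing.
        var query = Source().Where(n => n > 1);
        int afterDefine = Evaluations;

        // ToList() enumerates the source once.
        var list = query.ToList();
        int afterToList = Evaluations;

        // Enumerating the query again re-reads the source from scratch.
        var count = query.Count();
        int afterSecondPass = Evaluations;

        return (afterDefine, afterToList, afterSecondPass);
    }

    public static void Main()
    {
        var r = Run();
        Console.WriteLine($"{r.AfterDefine} {r.AfterToList} {r.AfterSecondPass}"); // prints "0 3 6"
    }
}
```

The same thing happens with an unexecuted Children query on each parent: the cost is not avoided, only postponed until something enumerates it.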
Your query is also inherently very inefficient. You're using a nested query in order to represent a Join relationship. If you use a Join instead the query will be able to be optimized appropriately by both the query provider as well as the database to execute much more quickly. You may also need to adjust the indexes on your database to improve performance. Here is how the join might look:
var q = from p in parenttable
join child in childtable
on p.id equals child.parentid into children
select new Parent()
{
id = p.id,
Children = children.ToList(),
};
return q.ToList(); //SLOW
The fastest way I found to accomplish this is to do a query that returns all the results then group all the results. Make sure you do a .ToList() on the first query, so that the second query doesn't do many calls.
Here r should have what you want to accomplish with only a single db query.
var q = (from p in parenttable
join c in childtable on p.id equals c.parentid
select c).ToList();
var r = q.GroupBy(x => x.parentid).Select(x => new { id = x.Key, Children=x });
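The fetch-once-then-group idea can also be sketched with ToLookup, which groups the already materialized children in memory. The `ChildRow` and `ParentDto` types here are hypothetical, and the two input sequences stand in for the results of two simple queries.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical plain types, not entities registered with a DataContext.
public record ChildRow(int Id, int ParentId);

public class ParentDto
{
    public int Id { get; set; }
    public List<ChildRow> Children { get; set; } = new List<ChildRow>();
}

public static class GroupInMemoryDemo
{
    public static List<ParentDto> Build(IEnumerable<int> parentIds, IEnumerable<ChildRow> children)
    {
        // One pass over the children, standing in for a single round trip.
        var byParent = children.ToLookup(c => c.ParentId);

        // A lookup returns an empty sequence for a missing key, so parents
        // without children end up with an empty list rather than being dropped.
        return parentIds
            .Select(id => new ParentDto { Id = id, Children = byParent[id].ToList() })
            .ToList();
    }

    public static void Main()
    {
        var children = new[] { new ChildRow(1, 10), new ChildRow(2, 10), new ChildRow(3, 20) };
        foreach (var p in Build(new[] { 10, 20, 30 }, children))
            Console.WriteLine($"{p.Id}: {p.Children.Count} children");
    }
}
```

One difference from the GroupBy version above: grouping the output of an inner join omits parents that have no children, while this sketch keeps them.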
You must set correct options for your data load.
options.LoadWith<Document>(d => d.Metadata);
Look at this
P.S. Include is for LINQ to Entities only.
The second query is fast precisely because Children is not being populated.
And the first one is slow just because Children is being populated.
Choose the one that fits your needs best, you simply can't have their features together!
EDIT:
As @Servy says:
In your second query you aren't actually fetching any information about the children. You've created the queries, but you haven't actually executed them to get the results of those queries. If you were to iterate the list, and then iterate the Children collection of each item you'd see it taking as much time as the first query.

Filtering a graph of entity framework objects

I'm trying to filter down the results returned by EF into only those relevant - in the example below to those in a year (formattedYear) and an ordertype (filtOrder)
I have a simple set of objects
PEOPLE 1-M ORDERS 1-M ORDERLINES
with these relationships already defined in the Model.edmx.
In SQL I would do something like...
select * from PEOPLE inner join ORDERS on ORDERS.PEOPLE_RECNO=PEOPLE.RECORD_NUMBER
inner join ORDERLINE on ORDERLINE.ORDER_RECNO=ORDERS.RECORD_NUMBER
where ORDERLINE.SERVICE_YEAR=@formattedYear
and ORDERS.ORDER_KEY=@filtOrder
I've tried a couple of approaches...
var y = _entities.PEOPLE.Include("ORDERS").Where("it.ORDERS.ORDER_KEY=" + filtOrder.ToString()).Include("ORDERLINEs").Where("it.ORDERS.ORDERLINEs.SERVICE_YEAR='" + formattedYear + "'");
var x = (from hp in _entities.PEOPLE
join ho in _entities.ORDERS on hp.RECORD_NUMBER equals ho.PEOPLE_RECNO
join ol in _entities.ORDERLINEs on ho.RECORD_NUMBER equals ol.ORDERS_RECNO
where (formattedYear == ol.SERVICE_YEAR) && (ho.ORDER_KEY==filtOrder)
select hp
);
y fails with ORDER_KEY is not a member of transient.collection...
and x returns the right PEOPLE but they have all of their orders attached - not just those I am after.
I guess I'm missing something simple?
Imagine you have a person with 100 orders. Now you filter those orders down to 10. Finally you select the person who has those orders. Guess what? The person still has 100 orders!
What you're asking for is not the entity, because you don't want the whole entity. What you seem to want is a subset of the data from the entity. So project that:
var x = from hp in _entities.PEOPLE
let ho = hp.ORDERS.Where(o => o.ORDER_KEY == filtOrder
&& o.ORDERLINEs.Any(ol => ol.SERVICE_YEAR == formattedYear))
where ho.Any()
select new
{
Id = hp.ID,
Name = hp.Name, // etc.
Orders = from o in ho
select new { /* whatever */ }
};
I am not exactly sure what your question is but the following might be helpful.
In Entity Framework, if you want to load an object graph and filter the children, you might first query for the child objects and enumerate the query (i.e. call ToList()) so the children are fetched into memory.
Then, when you fetch the parent objects (without using .Include), Entity Framework will be able to construct the graph on its own (but note that you might have to disable lazy loading first, or it will take a long time to load).
Here is an example (assuming your context is "db"):
db.ContextOptions.LazyLoadingEnabled = false;
var childQuery = (from o in db.orders.Take(10) select o).ToList();
var q = (from p in db.people select p).ToList();
Now you will find that every people object has ten order objects.
EDIT: I wrote the sample code in a hurry and have not tested it, and I was probably wrong to claim that .Take(10) brings back ten orders for every people object. I believe .Take(10) brings back only ten orders overall when lazy loading is disabled (I have yet to test what happens when lazy loading is enabled). To bring back ten orders for every people object you might have to do more extensive filtering.
But the idea is simple, you first fetch all children objects and entity framework constructs the graph on its own.

Trying to create some dynamic linq

I'm trying to create a linq query based on some dynamic/optional arguments passed into a method.
User [Table] -> zero to many -> Vehicles [Table]
User [Table] -> zero to many -> Pets
So we want all users (including any vehicle and/or pet info). Optional filters are
Vehicle numberplate
Pet name
Because the vehicle and pet tables are zero-to-many, I usually have outer joins between the user table and the vehicle|pet table.
To speed up the query, I was trying to create the dynamic LINQ and, if an optional argument is provided, redefine the outer join as an inner join.
(The context diagram will have the two tables linked as an outer join by default.)
Can this be done?
I'm also not sure if this SO post can help me, either.
I think you are heading in the wrong direction. You can easily use the fact that LINQ queries are composable here.
First, you would always use the outer join, and get all users with the appropriate vehicles and pets:
// Get all the users.
IQueryable<User> users = dbContext.Users;
Then you would add the filters if necessary:
// If a filter on the pet name is required, filter.
if (!string.IsNullOrEmpty(petNameFilter))
{
// Filter on pet name.
users = users.Where(u => u.Pets.Where(
p => p.Name == petNameFilter).Any());
}
// Add a filter on the license plate number.
if (!string.IsNullOrEmpty(licensePlateFilter))
{
// Filter on the license plate.
users = users.Where(
u => u.Cars.Where(c => c.LicensePlate == licensePlateFilter).Any());
}
Note that this will not filter out the pets or cars that don't meet the filter, as it is simply looking for the users that have pets with that name, or cars with that plate.
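The composition pattern can be tried without a database by wrapping an in-memory list with AsQueryable(). The `User`, `Pet` and `Car` records below are hypothetical, and an empty string is used to mean "no filter".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model; only the pet name and license plate filters come from the thread.
public record Pet(string Name);
public record Car(string LicensePlate);
public record User(string UserName, List<Pet> Pets, List<Car> Cars);

public static class ComposableFilterDemo
{
    public static List<string> Find(
        IEnumerable<User> source, string petNameFilter, string licensePlateFilter)
    {
        // Start with everything; each optional filter narrows the same query.
        IQueryable<User> users = source.AsQueryable();

        if (!string.IsNullOrEmpty(petNameFilter))
            users = users.Where(u => u.Pets.Any(p => p.Name == petNameFilter));

        if (!string.IsNullOrEmpty(licensePlateFilter))
            users = users.Where(u => u.Cars.Any(c => c.LicensePlate == licensePlateFilter));

        // Nothing is evaluated until ToList() runs the composed query.
        return users.Select(u => u.UserName).ToList();
    }

    public static void Main()
    {
        var data = new[]
        {
            new User("ann", new List<Pet> { new Pet("Rex") }, new List<Car> { new Car("ABC123") }),
            new User("bob", new List<Pet>(), new List<Car> { new Car("XYZ789") }),
            new User("cat", new List<Pet> { new Pet("Rex") }, new List<Car>())
        };

        Console.WriteLine(string.Join(",", Find(data, "", "")));          // prints "ann,bob,cat"
        Console.WriteLine(string.Join(",", Find(data, "Rex", "")));       // prints "ann,cat"
        Console.WriteLine(string.Join(",", Find(data, "Rex", "ABC123"))); // prints "ann"
    }
}
```

Each Where call returns a new query, so the filters stack in whatever combination the caller supplies.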
If you are trying to change tables or joins of a LINQ to SQL query at runtime you need to do that with reflection. LINQ expressions are not special; same as working with any other object call - you can change the value of properties and variables at runtime, but choosing which properties to change or which methods to call requires reflecting.
I would add to that by pointing out dynamically creating LINQ expressions via reflection is probably a little silly for most (all?) cases, since under the hood the expression is essentially reflected back into SQL statements. Might as well write the SQL yourself if you are doing it on-the-fly. The point of LINQ is to abstract the data source from the developer, not the end-user.
This is how I do what you are asking...
var results = from u in dc.Users
join veh in dc.vehicles on u.userId equals veh.userId into vtemp from v in vtemp.DefaultIfEmpty()
join pet in dc.pets on u.userId equals pet.userId into ptemp from p in ptemp.DefaultIfEmpty()
select new { user = u, vehicle = v, pet = p };
if ( !string.IsNullOrEmpty(petName) )
{
results = results.Where(r => r.pet.PetName == petName);
}
if ( !string.IsNullOrEmpty(licNum) )
{
results = results.Where(r => r.vehicle.LicNum == licNum);
}
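One thing to keep in mind with this approach: filtering on the outer-joined side discards the rows where that side is null, so the filtered query behaves like an inner join. The in-memory sketch below shows this. `UserRow` and `PetRow` are hypothetical, and the explicit null check is needed in LINQ to Objects, where a user without pets yields a null `pet`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rows standing in for the Users and Pets tables.
public record UserRow(int UserId, string Name);
public record PetRow(int UserId, string PetName);

public static class LeftJoinDemo
{
    public static List<(string User, string Pet)> Query(
        IEnumerable<UserRow> users, IEnumerable<PetRow> pets, string petName)
    {
        var results =
            from u in users
            join pt in pets on u.UserId equals pt.UserId into ptemp
            from p in ptemp.DefaultIfEmpty()   // p is null when the user has no pets
            select new { user = u, pet = p };

        if (!string.IsNullOrEmpty(petName))
        {
            // In memory the null check is required; it also removes the
            // users without pets, which is what an inner join would do.
            results = results.Where(r => r.pet != null && r.pet.PetName == petName);
        }

        return results
            .Select(r => (User: r.user.Name, Pet: r.pet == null ? "(none)" : r.pet.PetName))
            .ToList();
    }

    public static void Main()
    {
        var users = new[] { new UserRow(1, "Ann"), new UserRow(2, "Bob") };
        var pets = new[] { new PetRow(1, "Rex") };

        foreach (var row in Query(users, pets, ""))
            Console.WriteLine($"{row.User}: {row.Pet}");
        foreach (var row in Query(users, pets, "Rex"))
            Console.WriteLine($"filtered -> {row.User}: {row.Pet}");
    }
}
```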
