Querying multiple parameters using Linq with Marten in Visual Studio - c#

I am learning document databases, and we are using Marten with LINQ in Visual Studio. The database runs on Postgres, managed through pgAdmin. My database covers football (not American) leagues, teams, players and managers. I am trying to construct queries based on multiple parameters. I have single-parameter queries down pretty well.
List<Player> englishPlayers = finalDb.Query<Player>().Where(x => x.Nationality.Contains("English")).ToList();
This query would create a list of all players whose Nationality is set to "English" in the Document Database.
Player is my class/table with a Nationality "field". What I am trying to do is query based on multiple parameters. For instance, I have multiple "fields" that are either an int or bool. As an example, if I wanted to create a query to show all players of a certain Nationality with lifetimeGoals > 100, how would I accomplish this?
I have searched Google for about an hour, and read through the suggested similar questions, but most of those questions don't account for Marten being used.
I've tried breaking it down into individual queries first and then combining them, for instance:
Player m4 = finalDb.Query<Player>().SelectMany(e => e.lifetimeGoals
.Where(p => e.lifetimeGoals >= 0));
However, this throws an error stating
int does not contain a definition for where, and no extension method
'Where' accepting a first argument of type 'int'.
My terminology with this is a little off, but hopefully this is clear enough to find guidance.
Class for Player:
class Player
{
public int Id { get; set; }
public string Name { get; set; }
public int Age { get; set; }
public string Team { get; set; }
public string prefPosition { get; set; }
public string Nationality { get; set; }
public int yearsAtCurrentClub { get; set; }
public int lifetimeGoals { get; set; }
public int domesticTitles { get; set; }
public int europeanTitles { get; set; }
}//Class Player
Main class
static void Main(string[] args)
{
string connectionString = ConfigurationManager.ConnectionStrings
["FinalProjectDB"].ConnectionString;
IDocumentStore store = DocumentStore.For(connectionString);
using (var finalDb = store.OpenSession())
{
Player m4 = finalDb.Query<Player>().SelectMany(p => p.lifetimeGoals)
.Where(p => p.lifetimeGoals >= 0 && p.myString.Equals("valueToCheck"));
Console.ReadLine();
}

You can't use .Where() on an integer. Instead use it like this:
Player m4 = finalDb.Query<Player>().SelectMany(e => e.lifetimeGoals)
.Where(p => p.lifetimeGoals >= 0);
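The query above closes the bracket at the end of SelectMany, so the Where clause is applied to the query result instead of to the int lifetimeGoals.
Because that bracket is now closed after SelectMany, the additional closing bracket at the end of the query is no longer needed. Note that SelectMany expects a selector that returns a collection, so it is still the wrong operator for an int property; see the second edit below.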
Edit: You can simply add another clause to your .Where()
Player m4 = finalDb.Query<Player>().SelectMany(e => e.lifetimeGoals)
.Where(p => p.lifetimeGoals >= 0 && p.myString.Equals("valueToCheck"));
Use && for a logical AND and || for a logical OR.
Second edit: I don't see why you are using .SelectMany(). You should be able to use your query like this:
Player m4 = finalDb.Query<Player>().Where(p => p.lifetimeGoals >= 0 && p.myString.Equals("valueToCheck")).FirstOrDefault();
Or use .ToList() when you want a list of players.
List<Player> players = finalDb.Query<Player>().Where(p => p.lifetimeGoals >= 0 && p.myString.Equals("valueToCheck")).ToList();
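As a minimal sketch of the asker's original example (players of a given nationality with more than 100 lifetime goals), the same two-condition predicate can be tried against an in-memory list first. The PlayerFilters class and its method name are hypothetical, and the list stands in for finalDb.Query<Player>(); whether Marten translates a particular predicate depends on its LINQ provider, so treat this as an illustration of the predicate shape only.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Player
{
    public string Name { get; set; }
    public string Nationality { get; set; }
    public int lifetimeGoals { get; set; }
}

public static class PlayerFilters
{
    // Hypothetical helper: both conditions live in a single Where predicate,
    // combined with && so a player must satisfy both to be returned.
    public static List<Player> ByNationalityAndGoals(
        IEnumerable<Player> players, string nationality, int minGoals)
    {
        return players
            .Where(p => p.Nationality == nationality && p.lifetimeGoals > minGoals)
            .ToList();
    }
}
```

Calling PlayerFilters.ByNationalityAndGoals(players, "English", 100) keeps only English players with more than 100 goals; swapping && for || would keep players that satisfy either condition.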

Related

How could I make this EF Core query better?

I need to fetch from the database this:
rack
its type
single shelf with all its boxes and their box types
single shelf above the previous shelf without boxes and with shelf type
Shelves have VerticalPosition which is in centimeters from the ground - when I am querying for e.g. second shelf in rack, I need to order them and select shelf on index 1.
I have this ugly EF query now:
var targetShelf = await _warehouseContext.Shelves
.Include(s => s.Rack)
.ThenInclude(r => r.Shelves)
.ThenInclude(s => s.Type)
.Include(s => s.Rack)
.ThenInclude(r => r.Type)
.Include(s => s.Rack)
.ThenInclude(r => r.Shelves)
.Include(s => s.Boxes)
.ThenInclude(b => b.BoxType)
.Where(s => s.Rack.Aisle.Room.Number == targetPosition.Room)
.Where(s => s.Rack.Aisle.Letter == targetPosition.Aisle)
.Where(s => s.Rack.Position == targetPosition.Rack)
.OrderBy(s => s.VerticalPosition)
.Skip(targetPosition.ShelfNumber - 1)
.FirstOrDefaultAsync();
but this gets all boxes from all shelves and it also shows warning
Compiling a query which loads related collections for more than one collection navigation, either via 'Include' or through projection, but no 'QuerySplittingBehavior' has been configured. By default, Entity Framework will use 'QuerySplittingBehavior.SingleQuery', which can potentially result in slow query performance.
Also I would like to use AsNoTracking(), because I don't need change tracker for these data.
First thing: for AsNoTracking() I would need to query Racks, because it complains about circular include.
Second thing: I tried conditional include like this:
.Include(r => r.Shelves)
.ThenInclude(s => s.Boxes.Where(b => b.ShelfId == b.Shelf.Rack.Shelves.OrderBy(sh => sh.VerticalPosition).Skip(shelfNumberFromGround - 1).First().Id))
but this won't even translate to SQL.
I have also thought of two queries - one will retrieve rack with shelves and second only boxes, but I still wonder if there is some single call command for this.
Entities:
public class Rack
{
public Guid Id { get; set; }
public Guid RackTypeId { get; set; }
public RackType Type { get; set; }
public ICollection<Shelf> Shelves { get; set; }
}
public class RackType
{
public Guid Id { get; set; }
public ICollection<Rack> Racks { get; set; }
}
public class Shelf
{
public Guid Id { get; set; }
public Guid ShelfTypeId { get; set; }
public Guid RackId { get; set; }
public int VerticalPosition { get; set; }
public ShelfType Type { get; set; }
public Rack Rack { get; set; }
public ICollection<Box> Boxes { get; set; }
}
public class ShelfType
{
public Guid Id { get; set; }
public ICollection<Shelf> Shelves { get; set; }
}
public class Box
{
public Guid Id { get; set; }
public Guid ShelfId { get; set; }
public Guid BoxTypeId { get; set; }
public BoxType BoxType { get; set; }
public Shelf Shelf { get; set; }
}
public class BoxType
{
public Guid Id { get; set; }
public ICollection<Box> Boxes { get; set; }
}
I hope I explained it well enough.
Query Splitting
First, I'd recommend benchmarking the query as-is before deciding whether to attempt any optimization.
It can be faster to perform multiple queries than one large query with many joins. While you avoid a single complex query, you have additional network round-trips if your DB isn't on the same machine, and some databases (e.g. SQL Server without MARS enabled) only support one active query at a time. Your mileage may vary in terms of actual performance.
Databases do not generally guarantee consistency between separate queries (SQL Server allows you to mitigate that with the performance-expensive options of serializable or snapshot transactions). You should be cautious using a multiple-query strategy if intervening data modifications are possible.
To split a specific query, use the AsSplitQuery() extension method.
To use split queries for all queries against a given DB context, configure the behavior in OnConfiguring:
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
optionsBuilder
.UseSqlServer(
@"Server=(localdb)\mssqllocaldb;Database=EFQuerying;Trusted_Connection=True;ConnectRetryCount=0",
o => o.UseQuerySplittingBehavior(QuerySplittingBehavior.SplitQuery));
}
Query that won't translate
.Include(r => r.Shelves)
.ThenInclude(s => s.Boxes.Where(b => b.ShelfId == b.Shelf.Rack.Shelves.OrderBy(sh => sh.VerticalPosition).Skip(shelfNumberFromGround - 1).First().Id))
Your expression
s.Boxes.Where(b => b.ShelfId == b.Shelf.Rack.Shelves.OrderBy(sh => sh.VerticalPosition).Skip(shelfNumberFromGround - 1).First().Id
resolves to an Id. ThenInclude() expects an expression that ultimately specifies a collection navigation (in other words, a table).
Ok, from your question I'm assuming you have a method where you need these bits of information:
single shelf with all its boxes and their box types
single shelf above the previous shelf without boxes and with shelf type
rack and its type
Whether EF breaks up the queries or you do doesn't really make much of a difference performance-wise. What matters is how well the code is later understood and can adapt if/when requirements change.
The first step I would recommend is to identify the scope of detail you actually need. You mention that you don't need tracking, so I would expect you intend to deliver these results or otherwise consume the information without persisting changes. Project this down to just the details from the various tables that you need to be served by a DTO or ViewModel, or an anonymous type if the data doesn't really need to travel. For instance you will have a shelf & shelf type which is effectively a many-to-one so the shelf type details can probably be part of the shelf results. Same with the Box and BoxType details. A shelf would then have an optional set of applicable box details. The Rack & Racktype details can come back with one of the shelf queries.
[Serializable]
public class RackDTO
{
// The question's entities use Guid keys, so the DTO ids are Guid as well.
public Guid RackId { get; set; }
public Guid RackTypeId { get; set; }
public string RackTypeName { get; set; }
}
[Serializable]
public class ShelfDTO
{
public Guid ShelfId { get; set; }
public int VerticalPosition { get; set; }
public Guid ShelfTypeId { get; set; }
public string ShelfTypeName { get; set; }
public ICollection<BoxDTO> Boxes { get; set; } = new List<BoxDTO>();
public RackDTO Rack { get; set; }
}
[Serializable]
public class BoxDTO
{
public Guid BoxId { get; set; }
public Guid BoxTypeId { get; set; }
public string BoxTypeName { get; set; }
}
Then when reading the information, I'd probably split it into two queries. One to get the "main" shelf, then a second optional one to get the "previous" one if applicable.
ShelfDTO shelf = await _warehouseContext.Shelves
.Where(s => s.Rack.Aisle.Room.Number == targetPosition.Room
&& s.Rack.Aisle.Letter == targetPosition.Aisle
&& s.Rack.Position == targetPosition.Rack)
// Order and skip on the entity first, then project the one remaining shelf.
.OrderBy(s => s.VerticalPosition)
.Skip(targetPosition.ShelfNumber - 1)
.Select(s => new ShelfDTO
{
ShelfId = s.Id,
VerticalPosition = s.VerticalPosition,
ShelfTypeId = s.ShelfTypeId,
ShelfTypeName = s.Type.Name, // assumes ShelfType has a Name property (not shown in the question)
// Rack is a single reference navigation, so project it directly rather than with Select().
Rack = new RackDTO
{
RackId = s.Rack.Id,
RackTypeId = s.Rack.RackTypeId,
RackTypeName = s.Rack.Type.Name // assumed property
},
Boxes = s.Boxes.Select(b => new BoxDTO
{
BoxId = b.Id,
BoxTypeId = b.BoxTypeId,
BoxTypeName = b.BoxType.Name // assumed property
}).ToList()
})
.FirstOrDefaultAsync();
ShelfDTO previousShelf = null;
if (targetPosition.ShelfNumber > 1 && shelf != null)
{
previousShelf = await _warehouseContext.Shelves
// The rack id lives on the nested RackDTO; ShelfDTO has no RackId of its own.
.Where(s => s.RackId == shelf.Rack.RackId
&& s.VerticalPosition < shelf.VerticalPosition)
.OrderByDescending(s => s.VerticalPosition)
.Select(s => new ShelfDTO
{
ShelfId = s.Id,
VerticalPosition = s.VerticalPosition,
ShelfTypeId = s.ShelfTypeId,
ShelfTypeName = s.Type.Name, // assumes ShelfType has a Name property
Rack = new RackDTO
{
RackId = s.Rack.Id,
RackTypeId = s.Rack.RackTypeId,
RackTypeName = s.Rack.Type.Name // assumed property
}
})
.FirstOrDefaultAsync();
}
These are two fairly simple-to-read queries that should return what you need without much problem. Because we project down to a DTO, we don't need to worry about eager loading or the cyclical references we could run into if we loaded an entire detached graph. Obviously this would need to be fleshed out to include the details from the shelf, box, and rack that are relevant to the consuming code/view. It can be trimmed down even more by using AutoMapper and its ProjectTo method, which replaces the whole Select projection with a one-liner.
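The "order by height, then skip to the n-th shelf" step is easy to check in isolation. The sketch below runs it over an in-memory list with LINQ to Objects; ShelfPicker and the trimmed-down Shelf class are hypothetical stand-ins, not the EF entities. It returns the requested shelf together with the one directly above it, which is the pair the question asks for.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Shelf
{
    public int VerticalPosition { get; set; } // centimeters from the ground
}

public static class ShelfPicker
{
    // Hypothetical helper: shelfNumberFromGround is 1-based.
    // Either element of the result is null when no such shelf exists.
    public static (Shelf Target, Shelf Above) Pick(
        IEnumerable<Shelf> rackShelves, int shelfNumberFromGround)
    {
        var pair = rackShelves
            .OrderBy(s => s.VerticalPosition)   // lowest shelf first
            .Skip(shelfNumberFromGround - 1)    // jump to the requested shelf
            .Take(2)                            // requested shelf plus the next one up
            .ToList();
        return (pair.ElementAtOrDefault(0), pair.ElementAtOrDefault(1));
    }
}
```

The same OrderBy/Skip/Take chain can be placed in front of a Select projection in a database query, though whether it translates to a single SQL statement depends on the provider.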
In raw SQL it could look like this:
WITH x AS(
SELECT
r.*, s.Id as ShelfId, s.Type as ShelfType,
ROW_NUMBER() OVER(ORDER BY s.verticalposition) as shelfnum
FROM
rooms
JOIN aisles on aisles.RoomId = rooms.Id
JOIN racks r on r.AisleId = aisles.Id
JOIN shelves s ON s.RackId = r.Id
WHERE
rooms.Number = @roomnum AND
aisles.Letter = @let AND
r.Position = @pos
)
SELECT *
FROM
x
LEFT JOIN boxes b
ON
b.ShelfId = x.ShelfId AND x.ShelfNum = @shelfnum
WHERE
x.ShelfNum BETWEEN @shelfnum AND @shelfnum+1
The WITH uses room/aisle/rack joins to locate the rack; you seem to have these identifiers. Shelves are numbered in order of increasing height off the ground. Outside the WITH, boxes are left joined only if they are on the shelf you want, but two shelves are returned: the shelf you want with all its boxes, and the shelf above it, whose box columns will be null because the left join condition does not match.
As an opinion, if your query is getting this level of depth, you might want to consider either using views as a shortcut in your database or use No-SQL as a read store.
Having to do lots of joins, and doing taxing operations like order by during runtime with LINQ is something I'd try my best to avoid.
So I'd approach this as a design problem, rather than a code/query problem.
In EF, all related entities loaded with Include, ThenInclude etc. produce joins on the database end. This means that when we load related master tables, their values get duplicated across all joined rows, causing what is called "cartesian explosion". Because of this, huge queries needed to be split into multiple calls, and eventually .AsSplitQuery() was introduced.
Eg:
var query = Context.DataSet<Transactions>()
.Include(x => x.Master1)
.Include(x => x.Master2)
.Include(x => x.Master3)
// Inside ThenInclude the lambda parameter is already the Master3 entity.
.ThenInclude(x => x.Masterx)
.Where(expression).ToListAsync();
Here we can introduce AsSplitQuery():
var query = Context.DataSet<Transactions>()
.Include(x => x.Master1)
.Include(x => x.Master2)
.Include(x => x.Master3)
.ThenInclude(x => x.Masterx)
.Where(expression).AsSplitQuery().ToListAsync();
As an alternative to adding this to every existing query, which could be time consuming, we can configure it globally:
services.AddDbContextPool<EntityDataLayer.ApplicationDbContext>(options =>
{
options.EnableSensitiveDataLogging(true);
options.UseMySql(mySqlConnectionStr,
ServerVersion.AutoDetect(mySqlConnectionStr), x =>
x.UseQuerySplittingBehavior(QuerySplittingBehavior.SplitQuery)
.EnableRetryOnFailure(
maxRetryCount: 10,
maxRetryDelay: TimeSpan.FromSeconds(30),
errorNumbersToAdd: null));
});
This will ensure that all queries are called as split queries.
If a particular query needs to run as a single query, override the global setting by calling AsSingleQuery() on that query. The reverse also works: keep single-query as the default and call AsSplitQuery() only where needed.
var data = await query.AsSingleQuery().ToListAsync();

Filter list of entity objects by string child object property using Linq Lambda

I am trying to return an IQueryable of lands filtered by a child object property, Owner.Name. It works well with query syntax, but I want to use lambda (method) syntax instead.
In short, these are my classes mapped by Entity Framework:
public class Land
{
public int Id { get; set; }
public virtual ICollection<Owner> Owners { get; set; }
}
public class Owner
{
public int Id { get; set; }
public string Name { get; set; }
public int LandId { get; set; }
public virtual Land Lands { get; set; }
}
The query which is working fine:
var list = from land in db.Lands
join owner in db.Owners on land.Id equals owner.LandId
where owner.Name.Contains("Smit")
select land;
I was trying using this:
var list = db.Lands.Where(lnd => lnd.Owners.Count() > 0 &&
lnd.Owners.Where(own => own.Name.Contains("Smit")).Count() > 0);
It works only for small lists, but for some with thousands of records it gives timeout.
Well, one issue that may be causing the speed problem is that your lambda and non-lambda versions do very different things. Your non-lambda version does a join with a where on one side of the join.
Why not just write the lambda equivalent of it?
var list = db.Lands.Join(db.Owners.Where(x => x.Name.Contains("Smit")), a => a.Id, b => b.LandId, (a, b) => a).ToList();
I mean, that is the more direct equivalent of your non lambda
I think you can use this one:
var list = db.Lands.Where(lnd => lnd.Owners.Any(x => x.Name.Contains("Smit")));
Try something more straightforward:
var lands = db.Owners.Where(o => o.Name.Contains("Smit")).Select(o => o.Lands);
You just need to make sure that Owner.Name is not null and LINQ will do the rest.
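To compare the shapes of these answers, here is a small in-memory sketch of the Any() form. LandFilters is a hypothetical helper and the classes are trimmed-down stand-ins for the mapped entities. One behavioral difference worth knowing: a join (or selecting from Owners) yields a land once per matching owner, while Any() yields each land at most once.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Owner
{
    public string Name { get; set; }
}

public class Land
{
    public int Id { get; set; }
    public List<Owner> Owners { get; set; } = new List<Owner>();
}

public static class LandFilters
{
    // Any() only asks "is there at least one matching owner?",
    // so the two Count() > 0 checks from the question are not needed.
    public static List<Land> WithOwnerNameContaining(IEnumerable<Land> lands, string text)
    {
        return lands
            .Where(l => l.Owners.Any(o => o.Name != null && o.Name.Contains(text)))
            .ToList();
    }
}
```

The explicit null check matters for in-memory lists; against a database the provider decides how null names are handled.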

Select The Record With the Lowest Payment for Each Customer using EntityFramework Core

I have an application that matches customers to vehicles in inventory. There are 3 main tables: Customer, Match, and Inventory. The match record contains the estimated monthly payment for a specific Customer and Inventory record. A customer can be matched to multiple vehicles in Inventory.
The match record contains a CustomerId and an InventoryId along with a MonthlyPayment field and a few other miscellaneous fields.
There is a 1 to Many relationship between Customer and Match.
There is a 1 to Many relationship between Inventory and Match.
For each customer, I want to select the Customer record, the match record with the lowest monthly payment, and the Inventory record for that match.
What is the best way to do this? Can it be done with a single query?
I tried this code, but entity framework can't evaluate it and it executes it locally which kills the performance.
var bestMatches = _matchRepository.GetAll(customerMatchSummaryRequest)
.Where(match =>
(_matchRepository.GetAll(customerMatchSummaryRequest)
.GroupBy(m => new { m.Bpid, m.BuyerId, m.CurrentVehicleId })
.Select(g => new
{
g.Key.Bpid,
g.Key.BuyerId,
g.Key.CurrentVehicleId,
LowestMonthlyPayment = g.Min(m => m.MonthlyPayment)
})
.Where(m => m.Bpid == match.Bpid
&& m.BuyerId == match.BuyerId
&& m.CurrentVehicleId == match.CurrentVehicleId
&& m.LowestMonthlyPayment == match.MonthlyPayment)
).Any())
.Include(m => m.Buyer)
.Include(m => m.Inventory);
I receive the following Output when stepping through the debugger:
Microsoft.EntityFrameworkCore.Query:Warning: The LINQ expression 'GroupBy(new <>f__AnonymousType2`3(Bpid = [<generated>_2].Bpid, BuyerId = [<generated>_2].BuyerId, CurrentVehicleId = [<generated>_2].CurrentVehicleId), [<generated>_2])' could not be translated and will be evaluated locally.
Microsoft.EntityFrameworkCore.Query:Warning: The LINQ expression 'GroupBy(new <>f__AnonymousType2`3(Bpid = [<generated>_2].Bpid, BuyerId = [<generated>_2].BuyerId, CurrentVehicleId = [<generated>_2].CurrentVehicleId), [<generated>_2])' could not be translated and will be evaluated locally.
Assuming your model is something like this
public class Customer
{
public int Id { get; set; }
public string Name { get; set; }
public ICollection<Match> Matches { get; set; }
}
public class Inventory
{
public int Id { get; set; }
public string Name { get; set; }
public ICollection<Match> Matches { get; set; }
}
public class Match
{
public int CustomerId { get; set; }
public Customer Customer { get; set; }
public int InventoryId { get; set; }
public Inventory Inventory { get; set; }
public decimal MonthlyPayment { get; set; }
}
the query in question could be something like this:
var query =
from customer in db.Set<Customer>()
from match in customer.Matches
// Keep a match only if no other match for this customer has a lower payment.
where !customer.Matches.Any(m => m.MonthlyPayment < match.MonthlyPayment)
select new
{
Customer = customer,
Match = match,
Inventory = match.Inventory
};
Note that it could return more than one match for a customer if more than one inventory record shares the lowest payment. If the data allows that and you want to get exactly 0 or 1 result per customer, change the
m.MonthlyPayment < match.MonthlyPayment
criteria to
m.MonthlyPayment < match.MonthlyPayment ||
(m.MonthlyPayment == match.MonthlyPayment && m.InventoryId < match.InventoryId)
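The "no other match beats this one" pattern can be verified on in-memory data before trusting it in a database query. In this sketch a match is kept only when no other match for the same customer has a lower payment, with ties broken by the lower InventoryId. BestMatches and the flat Match class are hypothetical stand-ins for the EF model, and this says nothing about how a given EF Core version translates the query.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Match
{
    public int CustomerId { get; set; }
    public int InventoryId { get; set; }
    public decimal MonthlyPayment { get; set; }
}

public static class BestMatches
{
    // Returns exactly one match per customer: the lowest payment,
    // and on equal payments the one with the lowest InventoryId.
    public static List<Match> LowestPerCustomer(IEnumerable<Match> matches)
    {
        var all = matches.ToList();
        return all
            .Where(match => !all.Any(m =>
                m.CustomerId == match.CustomerId &&
                (m.MonthlyPayment < match.MonthlyPayment ||
                 (m.MonthlyPayment == match.MonthlyPayment &&
                  m.InventoryId < match.InventoryId))))
            .OrderBy(m => m.CustomerId)
            .ToList();
    }
}
```

For in-memory data a GroupBy with an ordered First() would be the more usual choice; the anti-join form is shown here because it mirrors the query above.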
P.S. The above LINQ query is currently the only form that translates to a single SQL query. Unfortunately, the more natural ways like
from customer in db.Set<Customer>()
let match = customer.Matches.OrderBy(m => m.MonthlyPayment).FirstOrDefault()
...
or
from customer in db.Set<Customer>()
from match in customer.Matches.OrderBy(m => m.MonthlyPayment).Take(1)
...
lead to client evaluation.

I need help speeding up this EF LINQ query

I am using EntityFramework 6 and running into some major speed issues -- this query is taking over two seconds to run. I have spent the better part of the day using LinqPad in order to speed up the query but I could only get it down from 4 to two seconds. I have tried grouping, joins, etc. but the generated SQL looks overly complicated to me. I am guessing that I am just taking the wrong approach to writing the LINQ.
Here is what I am attempting to do
Find all A where Valid is null and AccountId isn't the current user
Make sure the Collection of B does not contain any B where AccountId is the current user
Order the resulting A by the number of B in its collection in descending order
Any A that doesn't have any B should be at the end of the returned results.
I have two models which look like this:
public class A
{
public int Id { get; set; }
public bool? Valid { get; set; }
public string AccountId { get; set; }
public virtual ICollection<B> Collection { get; set; }
}
public class B
{
public int Id { get; set; }
public bool Valid { get; set; }
public string AccountId { get; set; }
public DateTime CreatedDate { get; set; }
public virtual A Property { get; set; }
}
The table for A has about one million rows and B will eventually have around ten million. Right now B is sitting at 50,000.
Here is what the query currently looks like. It gives me the expected results but I have to run an orderby multiple times and do other unnecessary steps:
var filterA = this.context.A.Where(gt => gt.Valid == null && !gt.AccountId.Contains(account.Id));
var joinedQuery = from b in this.context.B.Where(gv => !gv.AccountId.Contains(account.Id))
join a in filterA on b.Property equals a
where !a.Collection.Any(v => v.AccountId.Contains(account.Id))
let count = a.Collection.Count()
orderby count descending
select new { A = a, Count = count };
IQueryable<A> output = joinedQuery
.Where(t => t.A != null)
.Select(t => t.A)
.Distinct()
.Take(20)
.OrderBy(t => t.Collection.Count);
Thanks
Well, you could always try removing these two lines from joinedQuery:
where !a.Collection.Any(v => v.AccountId.Contains(account.Id))
and
orderby count descending
The filter in the first line has already been applied in the first query.
As for the orderby line, you already order the results in the last query, so there is no point in doing it twice.
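For reference, the four requirements listed in the question can be written as one filter-then-order chain. The sketch below runs against in-memory lists with LINQ to Objects; AQueries is a hypothetical helper, it compares AccountId with equality rather than Contains (an assumption about the intent), and it makes no claim about the SQL or timing EF6 would produce for the same shape.

```csharp
using System.Collections.Generic;
using System.Linq;

public class B
{
    public string AccountId { get; set; }
}

public class A
{
    public int Id { get; set; }
    public bool? Valid { get; set; }
    public string AccountId { get; set; }
    public List<B> Collection { get; set; } = new List<B>();
}

public static class AQueries
{
    public static List<A> Candidates(IEnumerable<A> source, string accountId, int take)
    {
        return source
            // 1. Valid is null and the A does not belong to the current user.
            .Where(a => a.Valid == null && a.AccountId != accountId)
            // 2. None of its B entries belong to the current user.
            .Where(a => !a.Collection.Any(b => b.AccountId == accountId))
            // 3 and 4. Most B entries first; a count of zero naturally sorts last.
            .OrderByDescending(a => a.Collection.Count)
            .Take(take)
            .ToList();
    }
}
```

Ordering before Take also avoids the situation in the question's last query, where Take(20) runs before the final OrderBy and so picks 20 rows first and only then sorts them.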

LINQ Lambda: Getting Last item (where == ) of sub collection

I have the following (which I've used in CodeFirst):
public class MyForm
{
public int MyFormId { get; set; }
//Many Form Properties
public virtual ICollection<AuditItem> AuditItems { get; set; }
}
public class AuditItem
{
public int AuditItemId { get; set; }
public int MyFormId { get; set; }
public IAppUser User { get; set; }
public string ActionByUserStr { get; set; }
public string ActionByUserDisplayName { get; set; }
public DateTime DateOfAction { get; set; }
public string TypeOfAction { get; set; }
public int Step { get; set; }
public virtual MyForm MyForm { get; set; }
}
The AuditItem tracks many actions carried out on MyForm
I'm presenting an index page showing a table where the AuditItem's last Step = 1.
I do an orderby DateOfAction (because the Step can get setback to 0, so I wouldn't want it to equal 1 unless it is also the last AuditItem Record for that MyForm), then Select the last record's step, then query on that. I want to query as early as I can so that I'm not pulling back unrequired MyForms records.
This is the query I have:
var myFormsAtStep1 =
context.MyForms.Include("AuditItems").Where(
mf => mf.AuditItems.OrderBy(ai => ai.DateOfAction).Last().Step == 1);
It gives this error:
LINQ to Entities does not recognize the method 'MyForms.Models.AuditItem Last[AuditItem](System.Collections.Generic.IEnumerable`1[MyForms.Models.AuditItem])' method, and this method cannot be translated into a store expression.
Have a look at the list of Supported and Unsupported LINQ Methods.
Last() is not supported, but First() is, so you can just go on and reverse your ordering and use First() instead of Last().
mf => mf.AuditItems.OrderByDescending(ai => ai.DateOfAction).First().Step == 1);
You simply sort in descending order and get the first item.
var myFormsAtStep1 =
context.MyForms.Include("AuditItems").Where(
mf => mf.AuditItems.OrderByDescending(
ai => ai.DateOfAction).First().Step == 1);
I would also use the query syntax, which is more readable, together with the strongly typed Include (add using System.Data.Entity):
from mf in context.MyForms.Include(x => x.AuditItems)
let items = from ai in mf.AuditItems
orderby ai.DateOfAction descending
select ai
where items.First().Step == 1
select mf
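The reversal trick (descending order plus the first element instead of ascending order plus the last) can be checked on in-memory data. In this sketch FormQueries is a hypothetical helper and the classes are trimmed-down versions of the ones in the question. It projects Step to a nullable int and uses FirstOrDefault, so a form with no audit items is simply excluded rather than causing an exception.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class AuditItem
{
    public DateTime DateOfAction { get; set; }
    public int Step { get; set; }
}

public class MyForm
{
    public int MyFormId { get; set; }
    public List<AuditItem> AuditItems { get; set; } = new List<AuditItem>();
}

public static class FormQueries
{
    // Keeps forms whose most recent audit item has the given step.
    public static List<MyForm> AtStep(IEnumerable<MyForm> forms, int step)
    {
        return forms
            .Where(f => f.AuditItems
                .OrderByDescending(ai => ai.DateOfAction) // newest first
                .Select(ai => (int?)ai.Step)              // null when there are no items
                .FirstOrDefault() == step)
            .ToList();
    }
}
```

A form that was at step 1 earlier but has since been set back to 0 is not returned, because only its newest audit item is inspected.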
