EF Core not ignoring columns in generated SQL - C#

I am using EF for .NET Core.
I have a table for which I don't want to bring back all the columns in a query:
(from ae in Table1.Select(x => new {x.CreatedDate, x.EntryId, x.Amount})
join si in Table2.Select(x => new {x.SessionId, x.IdToFind})
.Where(y => y.SessionId == Guid.Parse("52F0C862-15D0-4C7A-975D-285C618342B0")) on ae.EntryId equals si.IdToFind
select new
{
// create output
}).ToList();
As you can see, I am explicitly including only two columns from Table2. This is the generated SQL:
SELECT [a].[EntryId], [a].[Amount]
FROM [Table1] AS [a]
INNER JOIN (
SELECT [s].[Id], [s].[CreatedBy], [s].[CreatedDate], [s].[IdToFind], [s].[SessionId], [s].[UpdatedDate], [s].[UpdateddBy]
FROM [Table2] AS [s]
WHERE [s].[SessionId] = '52f0c862-15d0-4c7a-975d-285c618342b0'
) AS [t] ON [a].[EntryId] = [t].[IdToFind]
When I use LinqPad to evaluate the generated SQL, I can see that all the columns from Table2 have been included.
How can I prevent this? I don't want to use the Ignore option on the model, because in other situations I may want the other columns.

You have to use this syntax:
List<ResultClass> result = (from ae in Table1
                            join si in Table2
                            on ae.EntryId equals si.IdToFind
                            where si.SessionId == Guid.Parse("52F0C862-15D0-4C7A-975D-285C618342B0")
                            select new ResultClass
                            {
                                // A named class needs explicit property assignments
                                CreatedDate = ae.CreatedDate,
                                EntryId = ae.EntryId,
                                Amount = ae.Amount,
                                SessionId = si.SessionId,
                                IdToFind = si.IdToFind
                            }).ToList();
In this case you need to create a new class that contains the properties of both classes; otherwise you get an anonymous object that you have to use immediately.
When I want to avoid creating a new class, I usually add some [NotMapped] properties to the Table1 or Table2 class in a partial class and use that class instead of ResultClass.

Your approach should work, however mixing the Linq query syntax and Fluent methods isn't making it very readable at all and I suspect that may be responsible for the unexpected query generation. Particularly this:
(from ae in Table1.Select(x => new {x.CreatedDate, x.EntryId, x.Amount})
join si in Table2.Select(x => new {x.SessionId, x.IdToFind}).Where(y => y.SessionId == Guid.Parse("52F0C862-15D0-4C7A-975D-285C618342B0"))
Which has the Where condition on the Table2 Select.
From what I can deduce from your approach, you are being too eager with your optimization:
With EF queries, Select projections do not need to be done at a per-entity level, they are done at the end of the expression and EF will work out exactly what columns from what table need to be included. That said, you can pre-select columns and EF should work it all out, but it will do so from your final projection.
Provided you have an actual FK between Table1.EntryId and Table2.IdToFind, ideally you would want that relationship mapped out in the entities rather than relying on an explicit join. Navigation properties make it much easier to organize queries, but there can be more generic cases where Table2 can serve multiple other tables etc.
Assuming that you want to just build an output from the CreatedDate, EntryId, Amount, and SessionId from these two loosely related tables:
var results = context.Table1
.Join(context.Table2.Where(t2 => t2.SessionId == sessionId),
t1 => t1.EntryId,
t2 => t2.IdToFind,
(t1, t2) => new { Table1 = t1, Table2 = t2 })
.Select(x => new ResultViewModel
{
EntryId = x.Table1.EntryId,
CreatedDate = x.Table1.CreatedDate,
Amount = x.Table1.Amount,
// ...
}).ToList();
Assuming you don't even need anything from Table2 then:
// ...
.Join(context.Table2.Where(t2 => t2.SessionId == sessionId),
t1 => t1.EntryId,
t2 => t2.IdToFind,
(t1, t2) => t1)
.Select(x => new ResultViewModel
{
EntryId = x.EntryId,
CreatedDate = x.CreatedDate,
Amount = x.Amount
}).ToList();
When you read the Linq it may seem like EF would be selecting all columns from the related tables so you'd want to add Select projections there, but they are not needed. The final query should select columns strictly by the final Select projection. I would avoid mixing Linq query language (from Table1 join Table2 select ....) with the Fluent builder methods. IMO the Fluent methods are easier to read from a C# code perspective, but ultimately I'd use one or the other, not both.
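A minimal in-memory sketch of the "one style, one final projection" shape described above, written purely in query syntax. The arrays `table1` and `table2`, and the extra `Note` and `CreatedBy` fields, are hypothetical stand-ins for the real DbSets. This runs as LINQ to Objects, so it only illustrates the query shape; it does not show the SQL that EF Core would generate.

```csharp
using System;
using System.Linq;

var sessionId = Guid.Parse("52F0C862-15D0-4C7A-975D-285C618342B0");

// Hypothetical stand-ins for the Table1 and Table2 DbSets.
var table1 = new[]
{
    new { EntryId = 1, CreatedDate = new DateTime(2021, 1, 1), Amount = 10m, Note = "unused" },
    new { EntryId = 2, CreatedDate = new DateTime(2021, 1, 2), Amount = 20m, Note = "unused" },
};
var table2 = new[]
{
    new { IdToFind = 1, SessionId = sessionId, CreatedBy = "unused" },
    new { IdToFind = 2, SessionId = Guid.Empty, CreatedBy = "unused" },
};

// No per-table Select: the filter and the single projection come at the end.
var results =
    (from ae in table1
     join si in table2 on ae.EntryId equals si.IdToFind
     where si.SessionId == sessionId
     select new { ae.CreatedDate, ae.EntryId, ae.Amount, si.SessionId }).ToList();

Console.WriteLine(results.Count); // 1
```

Only the fields named in the final `select` appear in the result; `Note` and `CreatedBy` are never projected.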

Related

C# LINQ INNER JOIN is returning wrong count

I have the following SQL with a JOIN that returns a record count of 50, but when I convert it to LINQ I do not get a matching count. I noticed that when I add the ON clause, the Visual Studio IntelliSense dropdown does not show the ID property for the second table. I am wondering if that is an issue.
Here is the simple SQL
SELECT * FROM Table1 T1 JOIN Table2 T2 ON T1.MyId = T2.MyId WHERE T1.IsCompleted
Here is my lambda LINQ, where the Visual Studio IntelliSense is not working right. For table2, the IntelliSense dropdown only shows Equals, GetHashCode, GetType, and ToString. If I just type MyId manually, everything builds successfully, but the count is too high. Thanks
var test = this.myDbContent.Table1
    .Join(this.myDbContent.Table2,
        table1 => table1.MyId,
        table2 => table2.MyId,
        (table1, table2) => new { table1, table2 })
    .Where(joinedTable => joinedTable.table1.IsCompleted == 1);
Join is a simple LINQ operation, but it is easier to understand in LINQ query syntax.
var query =
from t1 in myDbContent.Table1
join t2 in myDbContent.Table2 on t1.MyId equals t2.MyId
where t1.IsCompleted
select new
{
t1,
t2
};
If you still get different results, MyId is probably a nullable field, and EF also matches nulls.
This can be disabled in the options configuration:
builder.UseSqlServer(ceConnection, x => x.UseRelationalNulls(true));

Group child list into single parent object - AdventureWorks query - Lambda Expression

I am working with the AdventureWorks database and I am using the following query :
SELECT c.CustomerID, c.AccountNumber, soh.SalesOrderID, soh.SalesOrderNumber, sod.UnitPrice, sod.ProductID, p.Name
FROM Sales.SalesOrderHeader soh
inner join sales.SalesOrderDetail sod on soh.SalesOrderID = sod.SalesOrderID
inner join sales.Customer c on c.CustomerID = soh.CustomerID
inner join Production.Product p on p.ProductID = sod.ProductID
I have code already that populates the data like this just as it comes back:
Based on that I am wanting to merge the child list objects into a single parent to end up with a one to many structure if that makes sense.
As of right now I am hoping to use a lambda expression to do this. I have tried the following code, but I don't want to have to specify o.CustomerID and o.AccountNumber in the group by, because I would then have to specify ALL the properties of my Customer class to have access to them in the final select, as you can see in the code.
.GroupBy(o => new { o.CustomerID, o.AccountNumber }).Select(group => new Customer2
{
CustomerID = group.Key.CustomerID,
AccountNumber = group.Key.AccountNumber,
Orders = group.SelectMany(x => x.Orders).ToList()
}).ToList();
I would like the end result to look like this for each customer:
I hope this makes sense and thanks in advance for the help.

Linq group join and where statement on property of the joined table

I believe I have found out that a group join is a left outer join, and that is what I need. But I also need to check whether a property of the joined table is null, and I haven't got that working yet.
So I basically need the equivalent of this query in LINQ with Entity Framework:
SELECT
id, test, test2
FROM Table1
LEFT OUTER JOIN Table2 ON
table1.id = table2.id
WHERE table2.example IS NULL;
I have tried to do this with lambdas but without any success yet. I can't seem to get hold of the table2 property example for the where statement.
You can follow this example using the LINQ extension method GroupJoin:
Table1.GroupJoin(Table2,
    x => x.ID,
    y => y.ID,
    (tbl1, tbl2) => new { Table1 = tbl1, Table2 = tbl2.DefaultIfEmpty() })
    .SelectMany(
        tbl => tbl.Table2.Where(t2 => t2.example == null).Select(x => new
        {
            id = tbl.Table1.ID,
            test = tbl.Table1.Test,
            test2 = x.Test // x is the joined Table2 row; tbl.Table2 is the whole group
        })).ToList();
You might want to check out: http://www.sqltolinq.com/
Linqer is a SQL to LINQ converter tool. It helps you to learn LINQ and convert your existing SQL statements.
Not every SQL statement can be converted to LINQ, but Linqer covers many different types of SQL expressions.
Let's assume you have Table1 and Table2 in an EF DbContext.
var query =
    from t1 in context.Table1
    from t2 in context.Table2
        .Where(x => x.ID == t1.ID && x.example == null).DefaultIfEmpty()
    select new
    {
        id = t1.ID,
        test = t1.Test,
        test2 = t2.Test
    };
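As a hedged illustration of the left outer join with a null check, here is a small in-memory sketch in query syntax. The arrays `table1` and `table2` and their values are made up for the example. Because this runs as LINQ to Objects rather than through EF, the unmatched case has to be checked explicitly (`m == null`) before reading `m.Example`.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for Table1 and Table2.
var table1 = new[]
{
    new { ID = 1, Test = "first" },
    new { ID = 2, Test = "second" },
    new { ID = 3, Test = "third" },
};
var table2 = new[]
{
    new { ID = 1, Example = (int?)10 },   // matched, Example set
    new { ID = 2, Example = (int?)null }, // matched, Example is null
    // no row for ID 3: unmatched on the left join
};

// "join ... into" followed by "from ... DefaultIfEmpty()" is the left outer join pattern.
var ids =
    (from t1 in table1
     join t2 in table2 on t1.ID equals t2.ID into matches
     from m in matches.DefaultIfEmpty()
     where m == null || m.Example == null
     select t1.ID).ToList();

Console.WriteLine(string.Join(", ", ids)); // 2, 3
```

Row 1 is dropped because its Example has a value; row 2 is kept because Example is null; row 3 is kept because it has no match at all, which is what `WHERE table2.example IS NULL` returns after a LEFT OUTER JOIN.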

Does Entity Framework query the database multiple times if I use different fields of the same Linq query at different times?

I searched the Internet and SO but couldn't locate a helpful resource. Perhaps I am not using the correct wording to search. If I have missed any previous questions for this reason, please let me know and I will take this question down.
I am dealing with a busy database, so I need to send fewer queries to it.
If I access different columns of the same Linq query from different levels of the code then is Entity Framework smart enough to foresee the required columns and bring them all or does it call the db twice?
eg.
var query = from t1 in table_1
join t2 in table_2 on t1.col1 equals t2.col1
where t1.EmployeeId == EmployeeId
group new { t1, t2 } by t1.col2 into grouped
orderby grouped.Count() descending
select new { Column1 = grouped.Key, Column2 = grouped.Sum(g=>g.t2.col4) };
var records = query.Take(10);
// point x
var x = records.Select(a => a.Column1).ToArray();
var y = records.Select(a => a.Column2).ToArray();
Does EF query the database twice to facilitate x and y (send one query to get Column1, and then another to get Column2), or is it smart enough to know that both columns need to be materialised and bring them both back at point x?
Added to clarify the intention of the question:
I understand I can simply add a greedy method to the end of query.Take(10) and be done with it, but I am trying to understand whether the approach I tried (in my opinion, the more elegant one) works, or if not, what makes EF issue two queries.
Yes, your current code will generate two queries that are executed against the database, because two different SQL statements are generated:
The first is the top query, taking only 10 records and then only Column1.
The second is the top query, taking only 10 records and then only Column2.
These are two queries because you call ToArray over different Select statements, each of which generates different SQL. Most LINQ queries use deferred execution and run only when you call something like ToArray()/ToList()/FirstOrDefault(), that is, the methods that actually give you concrete data. In your original query you have two different ToArray calls on data that has not yet been retrieved, which means two queries (one for the first field and one for the second).
The following code will result in a single query to the database
var records = (from t1 in table_1
join t2 in table_2 on t1.col1 equals t2.col1
where t1.EmployeeId == EmployeeId
group new { t1, t2 } by t1.col2 into grouped
orderby grouped.Count() descending
select new { Column1 = grouped.Key, Column2 = grouped.Sum(g=>g.t2.col4) })
.Take(10).ToList();
var x = records.Select(a => a.Column1).ToArray();
var y = records.Select(a => a.Column2).ToArray();
In my solution above I added a ToList() after narrowing the query to only the data you need (Take(10)), and that is the point at which it executes against the database. From then on all the data is in memory, and you can run any other LINQ operation over it without going to the database again.
Add ToString() calls to your code so you can check the generated SQL at different points (in EF Core 5 and later, use ToQueryString() instead). Then you will understand when and what is being executed:
var query = from t1 in table_1
join t2 in table_2 on t1.col1 equals t2.col1
where t1.EmployeeId == EmployeeId
group new { t1, t2 } by t1.col2 into grouped
orderby grouped.Count() descending
select new { Column1 = grouped.Key, Column2 = grouped.Sum(g=>g.t2.col4) };
var generatedSql = query.ToString(); // Here you will see a query that brings all records
var records = query.Take(10);
generatedSql = records.ToString(); // Here you will see it taking only 10 records
// point x
var xQuery = records.Select(a => a.Column1);
generatedSql = xQuery.ToString(); // Here you will see only 1 column in query
// Still nothing has been executed to DB at this point
var x = xQuery.ToArray(); // And that is what will be executed here
// Now you are before second execution
var yQuery = records.Select(a => a.Column2);
generatedSql = yQuery.ToString(); // Here you will see only the second column in query
// Finally, second execution, now with the other column
var y = yQuery.ToArray();
When you run a LINQ statement on an entity in EF, it only prepares the Select statement (that's why the type is IQueryable). Execution is deferred: the query is evaluated through an enumerator only when you try to use a value from it.
So when you explicitly turn it into a collection (.ToList() etc.), EF fetches the data to populate the list, and that is when the SQL command is fired.
It is designed this way to improve performance. If only a particular property of an entity is used, EF doesn't fetch the values of all the columns from that table.
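The deferred-execution behaviour described in these answers can be sketched without a database. In the hypothetical example below, the `Source()` iterator and its `enumerations` counter stand in for "a round trip to the database"; this is plain LINQ to Objects, not EF, but the rule is the same: each terminal call such as ToArray() re-runs a query that has not been materialised.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Counts how often the source is actually enumerated
// (a stand-in for the number of database round trips).
int enumerations = 0;

IEnumerable<(string Column1, int Column2)> Source()
{
    enumerations++; // runs only when enumeration starts, not when Source() is called
    yield return ("a", 1);
    yield return ("b", 2);
    yield return ("c", 3);
}

// Deferred: building the query does not run it.
var records = Source().Take(2);
Console.WriteLine(enumerations); // 0

// Each ToArray() enumerates the underlying source again.
var x = records.Select(r => r.Column1).ToArray();
var y = records.Select(r => r.Column2).ToArray();
Console.WriteLine(enumerations); // 2

// Materialise once with ToList(), then project from memory.
enumerations = 0;
var materialized = Source().Take(2).ToList();
var x2 = materialized.Select(r => r.Column1).ToArray();
var y2 = materialized.Select(r => r.Column2).ToArray();
Console.WriteLine(enumerations); // 1
```

The second half mirrors the single-query fix above: one ToList() after Take, then any number of in-memory projections.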

Select multiple objects using one LINQ query in Entity Framework 6

I have the following LINQ queries:
var q1 = (from t1 in context.Table1
          where t1.column == value
          select t1).FirstOrDefault();
var q2 = (from t2 in context.Table2
          where t2.column == value
          select t2).FirstOrDefault();
As far as I understand, the above LINQ statements will call the database twice to get the table data, but I want to write the LINQ query so that it gets the data from both tables in a single database call. How can I achieve this?
You can achieve this by selecting into an anonymous type:
var q = (from t1 in context.Table1
         where t1.column == value
         select t1)
    .Select(t1 => new
    {
        t1 = t1,
        t2 = context.Table2
            .FirstOrDefault(t2 => t2.column == value)
    })
    .FirstOrDefault();
var t1 = q.t1;
var t2 = q.t2;
This way it will all be executed as one query. I simplified the part of the query that obtains the t2 item a bit, but there is no obstacle to using the one you wrote.
Short answer: this is not possible.
Although both queries hit the same DB, they work on different tables.
Think of it as writing a normal SQL statement: you would not be able to combine these two queries into one.
Correct.
You can, however, use EntityFramework.Extended to do bulk queries if you don't mind adding an extension.
See: github Link
It is possible to load related entities using EF's Include() method.
// Load all blogs, all related posts, and all related comments
var blogs1 = context.Blogs
.Include(b => b.Posts.Select(p => p.Comments))
.ToList();
