Entity Framework under the hood

I have some legacy code that uses Entity Framework.
When I debug the code, I can see that the EF DbContext contains the whole table. It was passed by OData to the frontend, where Angular processed it.
So I tried to find out whether it is possible to get only a single record with EF.
Everywhere I see the SingleOrDefault method or other IQueryable methods, but as I understand it, these operate on collections.
Microsoft says: Sometimes the value of default(TSource) is not the default value that you want to use if the collection contains no elements.
Does that mean EF always gets all the data from the table so that I can use it later?
Or is there a way to force the underlying query to fetch one, and only one, row?
We are using PostgreSQL.

With Entity Framework, you can use LINQ to run queries and get single records or limited sets. However, in your .NET project the controller should be parsing OData query parameters and filtering the dataset before returning results to the client application. Please check your Controller code against this tutorial to see if you might be missing something.
If you are somehow bypassing the built-in OData framework, what might help is understanding which queries execute immediately vs which ones are deferred. See this list to understand exactly which operations will force a trip to the database and try to hold off on anything with immediate execution until as late as possible.

No, EF will not SELECT the entire table into memory if you use it correctly. By correctly, I mean:
context.Table.First();
This translates into a SQL query that returns only one row, which is then mapped to an object and returned to the calling code. This is because the above code uses LINQ-to-Entities. If you did something like this instead:
context.Table.ToList().First();
Then the entire table is selected to make the ToList work, and LINQ-to-Objects handles the First. So as long as you write your queries with lazy enumeration (not materializing the result ahead of time), you'll be fine.
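
The difference between the two call shapes can be reproduced without a database. The following is a minimal sketch that uses plain LINQ to Objects as a stand-in for the table; the DeferredDemo class, the Source method, and the Pulled counter are hypothetical names for illustration, not EF APIs. It counts how many items each call shape pulls from a lazy source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how many items the consumer actually pulls from the source.
    public static int Pulled;

    // Hypothetical stand-in for a table with 1000 rows, produced lazily.
    public static IEnumerable<int> Source()
    {
        for (int i = 1; i <= 1000; i++)
        {
            Pulled++;
            yield return i;
        }
    }

    // Same shape as context.Table.First(): stops after the first item.
    public static int PulledByFirst()
    {
        Pulled = 0;
        Source().First();
        return Pulled;
    }

    // Same shape as context.Table.ToList().First(): materializes everything first.
    public static int PulledByToListFirst()
    {
        Pulled = 0;
        Source().ToList().First();
        return Pulled;
    }

    public static void Main()
    {
        Console.WriteLine($"First(): pulled {PulledByFirst()}");                   // pulled 1
        Console.WriteLine($"ToList().First(): pulled {PulledByToListFirst()}");   // pulled 1000
    }
}
```

With a real EF query the effect is the same in kind: the position of ToList() decides whether one row or the whole table crosses the wire.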

Related

In Entity Framework Core, how can I query data on joined tables with a condition at DB level (not local LINQ)?

As part of a small .NET Core 3 project, I'm trying to use a data model based on Entity Framework, but I'm having some trouble with queries on joined tables.
When looking for data matching a condition in a single table, the model is easy to understand:
List<Element> listOfElements = context.Elements.Where(predicate).ToList();
However, when the query requires joined tables, I'm not sure how to do it efficiently. After some investigation, it seems that the Include (and ThenInclude) methods are the way to go, but I have the impression that the Where clause after the Include is not executed at DB level but only after all the data has been retrieved. This might work with small datasets, but I don't think it's a good idea for a production system with millions of rows.
List<Element> listOfElements = context.Elements
    .Include(x => x.SubElement)
    .Where(predicate)
    .ToList();
I've seen some examples using the EF+ library, but I'm looking for a solution using plain EF Core. Is there a clean, elegant way to do it?
Thank you.
Thank you.

There are two ways the data from the DB gets loaded:
Deferred query execution: the query runs only when you access its results, for example in a foreach statement.
Immediate query execution: the query runs as soon as you call ToList() (or another conversion to a collection, such as ToArray()).
I think the answer to your question:
... but I have the impression that the Where clause after the include is not executed at DB level but after all the data has been retrieved.
is that your assumption is wrong, because you are calling ToList() at the end, not before the Where method. The Where is therefore part of the query that is sent to the database.
For more information please also check here.
Another suggestion: to be sure about what exactly is executed at the DB level, run SQL Server Profiler while executing your query.
Hope this helps.
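
Deferred execution can be observed with an in-memory IQueryable. This is only a sketch: the ComposeDemo class and its list are hypothetical, and AsQueryable() over a List stands in for a DbSet. It shows that building the query with Where runs nothing, and that the query only executes when ToList() is called.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ComposeDemo
{
    public static List<int> Run()
    {
        var rows = new List<int> { 1, 2, 3 };

        // Building the query executes nothing; it only records the Where call.
        IQueryable<int> query = rows.AsQueryable().Where(x => x > 1);

        // The source changes after the query was defined...
        rows.Add(4);

        // ...and the change is visible, because execution happens here.
        return query.ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Run())); // 2,3,4
    }
}
```

Because the Where is recorded before ToList() runs, a database provider sees the whole chain at once and can turn the predicate into a SQL WHERE clause.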

Entity Framework performance of include

This is more of a technical (behind the scenes of EF) question, to help me better understand Include.
Does it make the query faster to Include another table when using a Select statement at the end?
ctx.tableOne.Include("tableTwo").Where(t1 => t1.Value1 == "SomeValueFor").Select(res => new {
res.Value1,
res.tableTwo.Value1,
res.tableTwo.Value2,
res.tableTwo.Value3,
res.tableTwo.Value4
});
Might it depend on the number of values included from the other table?
In the example above, 4 out of 5 values are from the included table. I wonder whether that has any performance impact, good or bad.
So, my question is: what is EF doing behind the scenes and is there any preferred way to use Include when knowing all the values I will select before?

In your case it doesn't matter whether you use Include(<relation-property-name>) or not, because you don't materialize the values before the Select(<mapping-expression>). If you use SQL Server Profiler (or another profiler), you can see that EF generates exactly the same query in both cases.
The reason is that the data is not materialized in memory before the Select. You are working on an IQueryable, which means EF generates the SQL query only when the result is requested (when you call First(), Single(), FirstOrDefault(), SingleOrDefault(), or ToList(), or enumerate the query in a foreach statement). If you call ToList() before the Select(), the entities are materialized from the database into memory, and that is where Include() comes in handy: it avoids N+1 queries when you access navigation properties that point to other tables.

It is about how you want EF to load your data. If you want the related table's data to be prepopulated, then use Include. This is handy if the included table is going to be used frequently, but it will be a little slower because EF has to load all the relevant data beforehand. Read up on the difference between lazy and eager loading. Using Include gives you eager loading, where the data is prepopulated; without it, EF queries the secondary table only when the related data is accessed, i.e. lazy loading.

I agree with @Karamfilov's general discussion, but in your example your query may not be the most performant. Performance can be affected by many factors, such as the indexes present on the table, but you should always help EF generate good SQL. The Include method can produce SQL that selects all columns of the included table, so you should always check the generated SQL and verify whether you can get a better query using a Join.
This article explains the techniques that can be used and what impact they have on performance: https://msdn.microsoft.com/it-it/library/bb896272(v=vs.110).aspx

Entity Framework query ToString won't produce SQL query

Through my research I've discovered that, since at least EF 4.1, the .ToString() method on an EF query will return the SQL to be run. Indeed, this very frequently works for me, using Entity Framework 5 and 6.
However, occasionally I call this method and get the runtime type of the query object.
Here is my specific example:
Entity input = ...;
IQueryable<Entity> query = dbContext.SetOfEntity.Where(e => e.Prop == input.Prop);
More specifically, I'm setting a breakpoint in VS2013 and hovering over the query object, and seeing System.Data.Entity.Infrastructure.DbQuery<Namespace.Entity> instead of the SQL to run that query. Interestingly enough, if I hover over the DbSet property (dbContext.SetOfEntity), I do see the basic select SQL for the associated table. It's only when I filter the results that I lose the SQL.
Obviously, this is a pretty simple query and I could work out the SQL for myself, but this problem has happened on more complex queries, and it would be nice to be able to debug the SQL being sent to the server without running a database trace.
Some background
A while back, I was using EF5 and the ToString() seemed to work. Shortly before switching to EF6, it seemed that none of the queries showed me SQL, but after switching to EF6, it went back to the correct behavior.
Additionally, whenever I hover over IQueryable queries and try to use the IDE's "Results View" feature, it tells me "Children could not be evaluated". This may be a separate issue, but I figure I'd include it in case it had a common cause.

If you do not need the SQL BEFORE it is executed against the DB you can do the following:
dbContext.Database.Log = s => Debug.WriteLine(s);
This would print the SQL (and some additional data) to the debug output.
See the following link for details: http://msdn.microsoft.com/de-DE/data/dn469464
Also check, as martin_costello suggested, that you are not querying the DB before you try to get the SQL via ToString(). It has happened to me too: I already had objects because I switched to IEnumerable<> too early (instead of staying on IQueryable<>), so I pulled far too many entities from the DB and did part of the filtering "in code" instead of "in SQL".
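
The IEnumerable<> versus IQueryable<> trap comes from overload resolution: the static type of the variable decides whether Where binds to Queryable.Where (an expression tree a provider can translate) or Enumerable.Where (a delegate that runs in memory). A minimal sketch, with a hypothetical QueryableVsEnumerable class and an in-memory array standing in for a DbSet:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryableVsEnumerable
{
    public static bool StaysQueryable()
    {
        IQueryable<int> q = new[] { 1, 2, 3 }.AsQueryable();
        // Binds to Queryable.Where: the filter stays part of the query.
        var filtered = q.Where(x => x > 1);
        return filtered is IQueryable<int>;
    }

    public static bool StaysQueryableAfterUpcast()
    {
        IQueryable<int> q = new[] { 1, 2, 3 }.AsQueryable();
        // Same object, but typed as IEnumerable<int> "too early".
        IEnumerable<int> e = q;
        // Binds to Enumerable.Where: the filter now runs in memory.
        var filtered = e.Where(x => x > 1);
        return filtered is IQueryable<int>;
    }

    public static void Main()
    {
        Console.WriteLine(StaysQueryable());            // True
        Console.WriteLine(StaysQueryableAfterUpcast()); // False
    }
}
```

A method parameter or return type declared as IEnumerable<T> has the same effect as the explicit upcast here, which is one common way this happens by accident.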

Why I can't see my T-SQL string generated by a queryable

I've noticed that some queries in my application are very slow, so I want to know what SQL my LINQ queries actually send to the database through Entity Framework.
On some sites I read that if you hover the mouse over the IQueryable variable, you can see the generated T-SQL, but at the moment I can't see it.
I'd like to know whether I have misconfigured my Entity Framework model.

For Entity Framework, you can see the generated SQL query by hooking the context's Database.Log property (EF6 and later), or you can cast your IQueryable to System.Data.Objects.ObjectQuery (System.Data.Entity.Core.Objects.ObjectQuery in EF6) and call the ToTraceString() method.

I want to suggest a different approach: Look at the real query in SQL Profiler. You can see all queries executed including parameter values. You can copy the query including parameter assignments to SSMS to debug it.

Mixing database and object queries in LINQ and providing paged results

I need to build a query that provides paged results. Part of filtering occurs in the database and part of it occurs in objects that are in memory.
Below is a simplified sample that shows what I could do: run a LINQ query against the database, filter the result further using custom code, and then use Skip/Take for paging. This would be very inefficient, however, because it needs to load all items that match the first part of my query.
Things.Where(e=>e.field1==1 && e.field2>1).ToList()
.Where(e=>Helper.MyFilter(e,param1,param2)).Skip(m*pageSize).Take(pageSize);
The MyFilter function uses additional data that is not located in the database, and it is run with additional parameters (paramX in the above example).
Is there a preferred way to handle this situation without loading the initial result fully into memory?

Yes: query and page at the database level. Whatever logic is in Helper.MyFilter needs to be in the SQL query.
The other option, which is more intrusive to your code base, is to save the view model as well as the domain entity whenever the entity changes. Part of the view model would contain the result of Helper.MyFilter(e), so you can quickly and efficiently query for it.

To support Jason's answer above - entity framework supports .Skip().Take(). So send it all down to the db level and convert your where into something EF can consume.
If your where helper is complicated use Albahari's predicate builder:
http://www.albahari.com/nutshell/predicatebuilder.aspx
or the slightly easier to use Universal Predicate Builder:
http://petemontgomery.wordpress.com/2011/02/10/a-universal-predicatebuilder/ based on the above.

.ToList()
By calling it, you convert your query into an in-memory object (a list), which causes the query to execute; the paging is then applied to data that is already in memory.
You can put it all in one Where clause:
Things.Where(e => e.field1 == 1 && e.field2 > 1
    && Helper.MyFilter(e)).Skip(m * pageSize).Take(pageSize);
and then call .ToList().
That way you give LINQ to SQL a chance to generate a query that fetches only the data you want. Note that this works only if the provider can translate Helper.MyFilter into SQL; an arbitrary C# method usually cannot be translated.
Or is there a particular reason why you want to do just that, converting to a memory object and then filtering? I don't see the point. You should be able to filter out the results you don't want in the LINQ to SQL query before you actually execute it against the database.
EDIT
As I can see from the discussion you have several options.
If you have a lot of data and do more reads than writes, it might be wise to save the results of Helper.MyFilter into the database on insert, if that is possible. That way you increase performance on select, since you will not pull all the data from the database, and the SELECT itself can filter on the stored result.
Or you can take another approach. You can put Helper class in a separate assembly and reference that assembly from SQL Server. This will enable you to put the paging logic in your database and use your code as well.
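
If MyFilter cannot be moved into the database at all, one possible middle ground (an assumption, not something the answers above propose) is to stream the server-side result instead of calling ToList(), so that enumeration stops once the requested page is full. With a real query this would typically mean AsEnumerable() in place of the first ToList(). The sketch below uses hypothetical stand-ins: ServerSideRows plays the role of the database part of the query, and MyFilter here is a made-up in-memory predicate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StreamingPageDemo
{
    // Number of rows pulled from the stand-in "database" source.
    public static int RowsRead;

    // Hypothetical stand-in for the server-side part of the query,
    // yielding rows lazily the way a data reader would.
    public static IEnumerable<int> ServerSideRows()
    {
        for (int id = 1; id <= 10000; id++)
        {
            RowsRead++;
            yield return id;
        }
    }

    // Hypothetical stand-in for Helper.MyFilter: logic that exists only in C#.
    public static bool MyFilter(int id, int divisor) => id % divisor == 0;

    public static List<int> GetPage(int pageIndex, int pageSize, int divisor)
    {
        RowsRead = 0;
        return ServerSideRows()                  // streamed, never fully listed
            .Where(id => MyFilter(id, divisor))  // in-memory filter
            .Skip(pageIndex * pageSize)
            .Take(pageSize)                      // stops pulling once the page is full
            .ToList();
    }

    public static void Main()
    {
        List<int> page = GetPage(pageIndex: 1, pageSize: 5, divisor: 3);
        Console.WriteLine(string.Join(",", page)); // 18,21,24,27,30
        Console.WriteLine($"rows read: {RowsRead} of 10000");
    }
}
```

The cost still grows with the page number, because every skipped row has to be read and filtered on the client, so this only helps for early pages or selective server-side filters.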
