Entity Framework performance issues with a 500+ entity model - C#

I'm having trouble with the high cost of the initial model load when the model contains 500+ tables. I've put together a small test program to demonstrate this.
For the test, the database is AdventureWorks with 72 tables, where the largest table by row count is [Sales].[SalesOrderDetail] (121,317 records):
On EF5 without pre-generated views, a basic query (select * from SalesOrderDetails where condition) takes 4.65 seconds.
On EF5 with pre-generated views, the same query takes 4.30 seconds.
Now, on EF6 without pre-generated views, the same query takes 6.49 seconds.
Finally, on EF6 with pre-generated views, the same query takes 4.12 seconds.
The source code has been uploaded to my TFS online at [https://diegotrujillor.visualstudio.com/DefaultCollection/EntityFramework-PerformanceTest]; please let me know which user(s) or email(s) should be granted access so you can download and examine it. The test was performed on a local server pointing to .\SQLEXPRESS.
So far the differences are subtle and the picture doesn't look very daunting; however, the same scenario in a real production environment with 538 tables definitely goes in the wrong direction. I can't attach the original code and a database backup due to size and privacy constraints (I can send some screenshots or even share my desktop in a conference call to show it running live). I've executed hundreds of queries, comparing the generated output in the SQL Server Profiler trace, and when I paste and execute the captured statement in a SQL Server query editor it takes 0.00 seconds.
On the live environment, EF5 without pre-generated views can take up to 259.8 seconds executing a very similar query to the one above against a table with 8,049 records and 104 columns. It goes better with pre-generated views: 21.9 seconds. Once more, the statement captured in SQL Server Profiler takes 0.00 seconds to execute.
Nevertheless, on the live environment EF6 can take up to 49.3 seconds executing the same query without pre-generated views, and 47.9 seconds with them. It looks like pre-generated views have no effect in EF6, or EF6 already pre-generates views as part of its core functionality or something else; I don't know.
Thus, I had to downgrade to EF5, as mentioned in my recent post [http://blogs.msdn.com/b/adonet/archive/2014/05/19/ef7-new-platforms-new-data-stores.aspx?CommentPosted=true#10561183].
I've already performed the same tests with database first and code first, with the same results. I'm using the "Entity Framework Power Tools" add-in to pre-generate the views. Both the real and test projects target .NET Framework 4.0; Visual Studio 2013 is the IDE and SQL Server 2008 R2 SP2 is the DBMS.
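For anyone who wants to reproduce the measurement, here is a simplified sketch of the kind of timing involved (the context and entity set names are placeholders, not my real model); it separates the one-time model load from the query itself:
using System;
using System.Diagnostics;
using System.Linq;
// "AdventureWorksContext" / "SalesOrderDetails" are placeholder names.
class ColdStartTest
{
    static void Main()
    {
        using (var context = new AdventureWorksContext())
        {
            var warmup = Stopwatch.StartNew();
            context.Database.Initialize(force: false); // one-time model load/validation (view generation happens here if views aren't pre-generated)
            warmup.Stop();

            var query = Stopwatch.StartNew();
            var rows = context.SalesOrderDetails
                              .Where(d => d.SalesOrderID == 43659)
                              .ToList();
            query.Stop();

            Console.WriteLine("Model load: {0} ms, query: {1} ms",
                              warmup.ElapsedMilliseconds, query.ElapsedMilliseconds);
        }
    }
}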
Any help would be appreciated. Thanks in advance.

Related

Extremely inconsistent AWS remote database query times

I am querying for values from a database in AWS Sydney (I am in New Zealand). Using Stopwatch I measured the query time; it is wildly inconsistent, sometimes in the tens of milliseconds and sometimes in the hundreds of milliseconds, for the exact same query. I have no idea why.
var device = db.things.AsQueryable().FirstOrDefault(p => p.ThingName == model.thingName);
The things table only has 5 entries. I have tried it without AsQueryable and it seems to make no difference. I am using Visual Studio 2013 and Entity Framework version 6.1.1.
EDIT:
Because this is for a business, I cannot put a lot of code up. Another timing example is that it went from 34 ms to 400 ms.
thanks
This can be related to cold vs. warm query execution.
The very first time any query is made against a given model, the Entity Framework does a lot of work behind the scenes to load and validate the model. We frequently refer to this first query as a "cold" query. Further queries against an already loaded model are known as "warm" queries, and are much faster.
You can find more information about this in the following article:
https://msdn.microsoft.com/en-us/library/hh949853(v=vs.113).aspx
One way to make sure this is the problem is to write a stored procedure and fetch the data through it (still using Entity Framework) to see whether the problem is in the connection or in the query (Entity Framework) itself.
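If the cold first query does turn out to be the problem, one common mitigation is to warm the model up once at application startup so that user-facing queries hit an already loaded model. A rough sketch, assuming a hypothetical DbContext called ThingsContext:
using System.Threading.Tasks;
// "ThingsContext" is a placeholder for your actual DbContext type.
public static class EfWarmup
{
    public static Task WarmUpAsync()
    {
        return Task.Run(() =>
        {
            using (var db = new ThingsContext())
            {
                // Loading and validating the model is the expensive part of the first "cold" query;
                // doing it once here makes later queries "warm".
                db.Database.Initialize(force: false);
            }
        });
    }
}
// e.g. call EfWarmup.WarmUpAsync() from Application_Start or Main.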

How can I identify which functions and SQL procedures are causing high CPU usage on the host server

I have a C#/ASP.NET web application (non-MVC) that has some pages which access a SQL database many times, using stored procedures and views.
The application now has 5-10 users, and the hosts have informed me that it's causing 95%+ cpu usage on the server.
My question, how can I identify which functions/procedures/threads are causing the high cpu use so I can cache or optimise them?
Note that the hosts do not give me access to ANY server logs, stats, or server/system database tables, only my application's database, which causes a major headache!
You can use SQL Profiler to trace the performance and behavior of any SQL procedure, function, etc.
Check out SqlProfiler; it's a very helpful tool and it has really helped me a lot in improving SQL stored procedure performance.
Run your application, but before that open SQL Profiler and configure it to listen only for the events you need, such as 'procedure completed', and set your filtering criteria, such as your database or execution user. Then, whenever your app performs any database-related action, Profiler will track it so you can analyze it.
You can check its step-by-step usage here.
For your C# functions that are not related to data access, you can measure their performance using the Stopwatch class to calculate their execution time:
var watch = Stopwatch.StartNew(); // requires: using System.Diagnostics;
// the code that you want to measure comes here
watch.Stop();
var elapsedMs = watch.ElapsedMilliseconds;
Or you can Analyze Performance While Debugging in Visual Studio 2015
You will need to profile .NET code and SQL separately. I'll tackle the SQL part.
Here's a modified version of Pinal Dave's query which finds recent expensive queries using sys.dm_exec_query_stats. I've tweaked it to include the hot path to show you the bottleneck. The query plan will help you further break it down.
SELECT TOP 50
st.text AS [SQL Definition]
, SUBSTRING(
st.text,
qs.statement_start_offset / 2 + 1,
( CASE
WHEN qs.statement_end_offset = -1
THEN LEN(CONVERT(NVARCHAR(MAX), st.text)) * 2
ELSE qs.statement_end_offset
END - qs.statement_start_offset ) / 2
) AS [Hot Path]
, qs.execution_count AS [Execution Count]
, qs.total_worker_time / 1000000 AS [Total CPU (s)]
, (qs.total_worker_time / 1000000) / qs.execution_count AS [Average CPU (s)]
, qs.total_elapsed_time / 1000000 AS [Total Time (s)]
, qs.total_logical_reads / qs.execution_count AS [Average Logical Reads]
, qs.total_logical_writes / qs.execution_count AS [Average Logical Writes]
, qs.total_physical_reads / qs.execution_count AS [Average Physical Reads]
, ISNULL(qp.query_plan, '') AS [Query Plan]
FROM sys.dm_exec_query_stats AS qs WITH (NOLOCK)
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
ORDER BY [Total CPU (s)] DESC
This is obviously just a starting point, but if your problem is due to an inefficient query or missing index, it'll point you in the right direction.
Deploy the application locally and diagnose away.
If your application can choke a server, it will most likely kill your machine too.
In case you call the stored procedure from your application:
Add an option for verbose trace logging to your application (enable it only when you want to check scenarios like the one you described).
Trace the time, CPU, etc. for each method that you suspect is causing the problem
(if you can't get at the log files, trace the timings to your database).
From the traced method you can then find the stored procedure.
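A minimal sketch of such an opt-in timing log (the class, flag and method names here are made up for illustration, not an existing API):
using System;
using System.Diagnostics;
public static class PerfTrace
{
    public static bool Enabled { get; set; } // switch on only while diagnosing

    public static T Measure<T>(string name, Func<T> action)
    {
        if (!Enabled) return action();
        var sw = Stopwatch.StartNew();
        try
        {
            return action();
        }
        finally
        {
            sw.Stop();
            Trace.WriteLine(string.Format("{0} took {1} ms", name, sw.ElapsedMilliseconds));
        }
    }
}
// usage around a data-access call that ends up in a stored procedure:
// var orders = PerfTrace.Measure("GetOrders", () => repository.GetOrders(customerId));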
SQL Profiler, as mentioned previously, is your best bet. However, I think you would need the Developer edition / tools for SQL Server before you can use it.
More often than not, SQL Server will take a long time to load and/or use high CPU if it keeps doing table scans.
Adding indexes to the tables will help.
You should also check that you haven't accidentally missed creating primary key(s) for the tables.
In case you think some fields might be unique, specify unique constraints appropriately. For example, the email address field in a Customer table might be unique.
Additionally, try to avoid "like" on nvarchar fields. If you use XML fields, see if you can store the most important field separately in a column. For example, on one of the SQL Servers I worked on, one of the unique "id" fields was inside an XML column, making SQL queries slow.

Profiling shows slow performance when explicitly loading related entities

I have tried to profile my WPF application, concentrating on the non-visual parts that do some calculations and evaluations; I have used the Visual Studio 2012 built-in profiler.
There is quite a lot of code (tens of thousands of lines) in that application, so I was surprised that it showed 46.3% of the time spent on a single line:
db.Entry(qzv.ZkouskaVzorku).Collection(p => p.VyhodnoceniZkouskies).Load();
This line should just explicitly load related entities as specified here.
I have checked this line using SQL Express profiler and it showed only this SQL command:
exec sp_executesql N'SELECT
[Extent1].[VyhodnoceniZkouskyID] AS [VyhodnoceniZkouskyID],
[Extent1].[Kontext] AS [Kontext],
[Extent1].[NormaVlastnostiID] AS [NormaVlastnostiID],
[Extent1].[ZkouskaVzorkuID] AS [ZkouskaVzorkuID],
[Extent1].[ZkouskaTypuID] AS [ZkouskaTypuID],
[Extent1].[JeShodaITT] AS [JeShodaITT],
[Extent1].[JeITT] AS [JeITT],
[Extent1].[JeStorno] AS [JeStorno]
FROM [dbo].[VyhodnoceniZkousky] AS [Extent1]
WHERE [Extent1].[ZkouskaVzorkuID] = @EntityKeyValue1',N'@EntityKeyValue1 int',@EntityKeyValue1=1816601
go
And this command executes very quickly, in 0 ms, as it is just selecting several rows using the primary clustered index.
I am using Entity Framework 6.1.0 with SQL Server 2014 LocalDB.
I have commented out this line, since it is only important for the ViewModels, and the calculations really do run roughly 2x faster.
What could be the issue and is there any workaround to fix it?
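One workaround I am considering (just a sketch, and it assumes the overhead is in change tracking during materialization rather than in the SQL itself) is to run the same collection query without change tracking; the results are then not attached to the navigation property, which is acceptable for my read-only calculations:
using System.Data.Entity; // for AsNoTracking()
using System.Linq;
// Same related-entity query as the Load() call above, materialized without change tracking.
var vyhodnoceni = db.Entry(qzv.ZkouskaVzorku)
                    .Collection(p => p.VyhodnoceniZkouskies)
                    .Query()
                    .AsNoTracking()
                    .ToList();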

Why does the same Linq-to-SQL query consume much more CPU time on the database server for a different project?

I have a legacy .NET 4 project (call it "A") which uses LINQ to SQL to query a database, and
I have another .NET 4 project (call it "B") with similar but not identical code which queries the same database as "A".
Both projects:
are C# projects {FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}
use the same assemblies (version v4.0.30319, same folder)
System.dll
System.Data.dll
System.Data.Linq.dll
The auto-generated DataContext is specific to each project but instantiated the same way:
same connection string using SQL authentication
both DataContext set their CommandTimeout from the default to 60 seconds
all other configuration options for the DataContext are the defaults
The way the LINQ query is constructed is not exactly the same in the two projects, but the resulting LINQ query is the same.
The generated (T-)SQL select statement is the same as well! (monitored and verified the SQL handles on the db server)
The database server is:
Microsoft SQL Server Enterprise 2005 x64 (9.00.4035.00)
Operating System: Microsoft Server 2003 R2 SP2 x64
When run, the monitored CPU time (on the db server) increased drastically for the query from project "A", and a command timeout exception was thrown.
(System.Data.SqlClient.SqlException: Timeout expired)
On the other hand, the query from "B" executed within seconds (around 3).
I was able to reproduce the behavior by calling the code of "A" with the same parameters again (no changes to code or database).
"B" even executed within seconds at the same time "A" was increasing its CPU time.
Regrettably, after a co-worker recreated the indices I can no longer reproduce the behavior.
The same co-worker mentioned that the query ran fast "last month" (although no code has changed since "last month"...).
I debugged the code for both projects - both DataContext instances looked alike.
The db server process' sql handle contains the same SQL statement.
But "A" threw a timeout exception and "B" executed within seconds - repetitive!
Why does the same Linq-to-SQL query consume much more CPU time on the database server for project "A" as for "B"?
To be precise: if the query runs "slow" - repeatedly - for whatever reason, how can the same query run faster just because it is called from different LINQ-to-SQL code?
Can there be side effects I do not know of (yet)?
Are there some instance values of the DataContext I have to look at runtime specifically?
By the way: the SQL statement - via SSMS - does use the same query plan on each run.
For the sake of completeness I have linked a sample of:
the C# code fragments of project "B" (the SqlRequest.GetQuery part looks alike for both projects)
the SQL file containing the appropriate database schema
the database execution plan
Please keep in mind that I cannot disclose the full db schema nor the code nor the actual data I am querying against.
(The SQL tables have other columns beside the named ones and the C# code is a bit more complex because the Linq query is constructed conditionally.)
Update - more insight at run-time
Some properties of both DataContext instances:
Log = null;
Transaction = null;
CommandTimeout = 60;
Connection: System.Data.SqlClient.SqlConnection;
The SqlConnection was created from a connection string like that (both cases):
"Data Source=server;Initial Catalog=sourceDb;Persist Security Info=True;User ID=user;Password=password"
There are no explicit SqlCommands being run to pass SET options to the database session, nor does the inline TVF contain SET options.
You need to run a trace on SQL Server instead of debugging this from the C# side. This will show you everything both A and B are executing on the server. The execution plan does you no good because it's precisely that - just a plan. You want to see the exact statements and their actual performance metrics.
In the rare event you were to tell me that both SELECT statements are exactly the same but had vastly different performance I would be virtually certain they are running under different transaction isolation levels. A single SQL command is an implicit transaction even if you aren't explicitly creating any.
If for whatever reason the trace doesn't make it clear, you should post the commands being run along with their metrics.
Note: running a trace has some performance overhead cost to it so I would try to keep the timeframe small or run during off-peak if possible.
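If you want to rule out isolation level differences from the application side, you can also pin the isolation level explicitly for the LINQ to SQL query. A rough sketch, where MyDataContext, the connection string variable and the query stand in for the real generated DataContext and statement:
using System.Data;
using System.Data.SqlClient;
using System.Linq;
using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    using (var transaction = connection.BeginTransaction(IsolationLevel.ReadCommitted))
    using (var db = new MyDataContext(connection))
    {
        db.Transaction = transaction;   // LINQ to SQL runs the query inside this transaction
        db.CommandTimeout = 60;
        var result = db.SomeTable.Where(x => x.SomeColumn == someValue).ToList(); // placeholder query
        transaction.Commit();
    }
}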
I think you should check whether LazyLoadingEnabled="true" in your "A" project's edmx file.
If LazyLoadingEnabled="true": with lazy loading, related objects (child objects) are not automatically loaded with their parent object until they are requested. By default, LINQ supports lazy loading.
If LazyLoadingEnabled="false": with eager loading, related objects (child objects) are loaded automatically with their parent object. To use eager loading you need to use the Include() method.
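Note that the question uses LINQ to SQL rather than Entity Framework, so there is no edmx; the eager-loading equivalent of Include() there is DataLoadOptions.LoadWith. A short sketch with placeholder entity and context names:
using System.Data.Linq;
using System.Linq;
// "MyDataContext", "Order" and "OrderLines" are placeholders for the real generated types.
using (var db = new MyDataContext(connectionString))
{
    var loadOptions = new DataLoadOptions();
    loadOptions.LoadWith<Order>(o => o.OrderLines); // fetch children together with their parents
    db.LoadOptions = loadOptions;                   // must be assigned before the first query runs
    var orders = db.Orders.Where(o => o.CustomerId == 42).ToList();
}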

Any way to speed up CreateIfNotExists in Entity Framework?

I'm using EF 4.3 Code First on SQL Server 2008. I run several test suites that delete and recreate the database with CreateIfNotExists. This works fine but is dog slow. It can take up to 15 seconds to create the database on the first call, and typically 3-6 seconds after that. I have several places where this is called. I've already optimized to call this as few times as I can. Is there something I can do to speed up database creation programmatically? I'm willing to go around EF to do this if that helps, but I would like to keep my database build in code and not go back to a SQL script. Thanks!
This works fine but is dog slow.
Yes. The point is to use the real database only for integration tests, which don't have to be executed so often, and the whole set of integration tests is usually executed only on the build server.
It can take up to 15 seconds to create the database on the first call
This is because of the slow initialization of EF when unit testing (you can try switching to x86). The time is also consumed by view generation. Views can be pre-generated, which is usually done to reduce startup and initialization of the real system, but for speeding up unit tests view pre-generation will not help much because you just move the time from the test to the build.
I'm willing to go around EF to do this if that helps, but I would like to keep my database build in code and not go back to a SQL
Going around EF would just mean using a plain old SQL script. The additional time needed for this operation may be spent in generating that SQL. I think the SQL is not cached, because normal application execution doesn't need it more than once, but you can ask EF to give you at least the most important part of that SQL, cache it somewhere, and execute it yourself every time you need it. EF is able to give you the SQL for tables and constraints:
var dbSql = ((IObjectContextAdapter) context).ObjectContext.CreateDatabaseScript();
You just need your own small SQL script to create the database, and then use them together. Even something like the following script should be enough:
CREATE DATABASE YourDatabaseName
USE YourDatabaseName
You must also turn off database generation in Code First to make this work and to take control of the process:
Database.SetInitializer<YourContextType>(null);
When executing the database creation SQL you will need a separate connection string pointing to the master database.
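Putting the pieces together, a rough sketch might look like the following (YourContextType / YourDatabaseName are placeholders as above, and the context is assumed to expose a constructor that accepts a connection string):
using System.Data.Entity;
using System.Data.Entity.Infrastructure;
using System.Data.SqlClient;
public static class FastDbCreator
{
    public static void CreateDatabase(string masterConnectionString, string dbConnectionString)
    {
        Database.SetInitializer<YourContextType>(null); // take database creation away from Code First

        // Create the empty database through the master connection.
        using (var master = new SqlConnection(masterConnectionString))
        {
            master.Open();
            using (var cmd = master.CreateCommand())
            {
                cmd.CommandText = "IF DB_ID('YourDatabaseName') IS NULL CREATE DATABASE YourDatabaseName";
                cmd.ExecuteNonQuery();
            }
        }

        // Create tables and constraints from the script EF generates; cache dbSql if this runs often.
        using (var context = new YourContextType(dbConnectionString))
        using (var connection = new SqlConnection(dbConnectionString))
        {
            string dbSql = ((IObjectContextAdapter)context).ObjectContext.CreateDatabaseScript();
            connection.Open();
            using (var cmd = connection.CreateCommand())
            {
                cmd.CommandText = dbSql;
                cmd.ExecuteNonQuery();
            }
        }
    }
}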
