On my website, I am using ASP.NET MVC 5 with EF6.
I'm seeing slow performance on the first call, with only 20K records.
For example, I need to get all rows from the Person table.
First invocation: 7500 ms (after that, the second call on the next line takes only 1000 ms)
List<Person> persons = await _context.Person.ToListAsync(); // Time : 7500ms
List<Person> persons2 = await _context.Person.ToListAsync(); // Time : 1000ms
What I tried:
Disabled lazy loading in the EDMX schema
Refreshed the schema
The same query in SQL Server Management Studio takes 400 ms (and it's a really simple query, without joins or conditions).
This happens every time a client visits the person page.
I would have posted this in a comment, but it's too long.
There are many things that can factor into that time difference, in order from least to most likely/impactful:
The first query sometimes has to "warm up" SQL Server (if that's the underlying engine). I doubt this is the actual problem, since SQL Server probably doesn't have enough time to spin down between your tries. Also, the execution plan shouldn't be problematic for a query that simple.
The first query has to open the communication channel. For example, if it has to route through VPNs, or simply open a SQL connection, that adds delay.
Migrations: unless you force them manually, EF6 doesn't run migrations (or seeding) when you create the DbContext. It waits for the first time it actually has to query, then builds the configuration and executes the migrations.
If you want to investigate, put a breakpoint in the OnModelCreating method and see when it's called. You can also run a query against an unrelated entity before these two queries, and you'll see that the speed-up is not caused by caching (AFAIK, caching only kicks in when using DbSet&lt;T&gt;.Find(...)). A sketch of that check follows.
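A minimal sketch of that diagnostic, assuming EF6 conventions (the context and the AuditLog entity are placeholders; use any small, unrelated table you have):

public class AppDbContext : DbContext
{
    public DbSet<Person> Person { get; set; }
    public DbSet<AuditLog> AuditLog { get; set; } // placeholder: any small, unrelated table

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Breakpoint here: in EF6 this fires on the first query,
        // not when the DbContext is constructed.
        base.OnModelCreating(modelBuilder);
    }
}

// The first query pays the one-time model-building cost; the Person query
// right after it should then run at normal speed, which rules out result
// caching as the explanation.
var warmUp = await _context.AuditLog.FirstOrDefaultAsync();
List<Person> persons = await _context.Person.ToListAsync();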
Related
I'm using Entity Framework 6 with a PostgreSQL database (via the Npgsql connector). Everything works fine except for the poor performance of this setup. When I try to insert a not-so-large number of objects into the database (about 20K records), it takes much longer than it should. As this is my first time using Entity Framework, I was rather confused about why inserting 20K records into a database on my local machine would take more than a minute.
To optimize the inserts I followed every tip I found. I tried setting AutoDetectChangesEnabled to false, calling SaveChanges() every 100 or 1000 records, re-creating the database context object, and using DbContextTransaction objects (by calling dbContext.Database.BeginTransaction() and committing the transaction at the end of the operation, or every 100/1000 records). Nothing improved insert performance even a little bit. The pattern looks roughly like the sketch below.
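For reference, the batching pattern described above (context and entity names are placeholders):

using (var db = new MyDbContext())
{
    db.Configuration.AutoDetectChangesEnabled = false; // skip change-tracking scans on every Add
    db.Configuration.ValidateOnSaveEnabled = false;    // skip per-entity validation

    for (int i = 0; i < records.Count; i++)
    {
        db.MyEntities.Add(records[i]);
        if (i > 0 && i % 1000 == 0)
        {
            db.SaveChanges(); // flush in batches of 1000
        }
    }
    db.SaveChanges(); // flush the remainder
}

Even with this, EF6 still sends one INSERT statement per row, which is exactly the behavior the logs described below show.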
By logging the SQL queries generated by Entity Framework, I finally discovered that no matter what I do, every object is inserted separately, and every insert takes 2-4 ms. Without re-creating the DB context object and without transactions, there is just one commit after over 20K inserts. When I use transactions and commit every few records, there are more commits and new transaction creations (same when I re-create the DB context object, just with the connection being re-established as well). If I use transactions and commit them every few records, I should notice a performance boost, no? But in the end there is no difference in performance, no matter whether I use multiple transactions or not. I know transactions won't improve performance drastically, but they should help at least a little. Instead, every insert still takes at least 2 ms to execute against my local DB.
The database on my local machine is one thing, but creating 20K objects on a remote database takes much, much, MUCH longer than one minute - the logs indicate that a single insert can take as much as 30 ms (!), with transactions being committed and re-created every 100 or 1000 records. On the other hand, if I execute a single insert manually (taken straight from the log), it takes less than 1 ms. It seems like Entity Framework takes its sweet time inserting every single object into the database, even though it wraps larger numbers of inserts in transactions. I don't really get it...
What can I do to speed it up for real?
In case anyone's interested, I found a solution to my problem. Entity Framework 6 is unable to provide fast bulk inserts without additional third-party libraries (as mentioned in the comments to my question), which are either expensive or don't support databases other than SQL Server. Entity Framework Core, on the other hand, is another story. It supports fast bulk insertion and can replace EF6 in a project with just a handful of code changes: https://learn.microsoft.com/pl-pl/ef/core/index
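If you do migrate, a minimal sketch of the same insert in EF Core (names are placeholders); unlike EF6, SaveChanges here batches many INSERTs into far fewer round trips:

using (var db = new MyDbContext())
{
    db.MyEntities.AddRange(records); // records: IEnumerable<MyEntity>
    db.SaveChanges();                // EF Core batches the inserts automatically
}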
I'm having an issue with a really simple password-hash retrieval LINQ query. The problem is that if the user logs out and then tries to log back in, it just uses the cached values of the query without querying the database again. The query in question is the following:
using (var db = new DataModel.DatabaseContext())
{
    return (from emp in db.Employees
            where emp.Username == username
            select emp.Password).SingleOrDefault();
}
But when I break, it seems that EF IS executing a reader on a separate thread! Then why do I think it isn't really querying the database? Well, the execution time is just too short. It messes up my async methods; it basically doesn't leave enough time for a MessageBox to be shown (it works properly when I call the method for the first time). Maybe the database itself has some transient options set up?
EDIT: I thought I had found out what the problem was, but this is just unreal. It executes the query on a remote server faster than a ping request, <0.001 s. I'm stumped.
It is because the first time you create a DbContext in your AppDomain (probably the first call to new YourDbContext() in your application), a lot of initialization and configuration happens under the hood. That takes some time, but after that (while the application is running) the process is fast enough that you can't feel it.
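If that first-hit cost matters, a common workaround is to pay it at application startup instead of on the first user request. A minimal sketch, assuming EF6 (the context name is a placeholder):

// e.g. somewhere in application startup
using (var db = new YourDbContext())
{
    // Forces EF to build the model and run any initializers now,
    // so the first real query doesn't pay that cost.
    db.Database.Initialize(force: false);
}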
I have a performance problem. We have done a bunch of analysis and are stuck. Hopefully one of you has seen this before.
I'm calling DbContext.Database.SqlQuery; the database portion takes 3 ms, but the full execution takes 9 seconds.
We've used EF Profiler to discover this, and we also ran the SQL directly in SQL Server Management Studio, where it is instantaneous.
We also used Glimpse and couldn't see deep enough into the process.
The result type is not an entity from the model, so we are confident that tracking is not involved.
We also know that this is not the first query executed against the context, so we are not paying EF startup cost on this query.
We tried the .NET profiler and had so many problems running it that we decided we should just ask.
Any tips on how to dig in and figure this out?
EDIT: The result set for this query is 1 row with 4 columns (decimal)
The line of code is just:
var list = contextInstance.Database.SqlQuery<nonEntityType>(sqstring).ToList();
The SQL itself is not a very long string. We will use a more detailed profiler to find out where in the process this is getting hung up.
We've used EF profiler to discover this and we also run the SQL directly in SQL server management studio and it is instantaneous.
This doesn't prove anything. The query might run fast, but the data might amount to 100 MB, which then has to be transported to the client and materialized into objects. That might take more time than you think.
The query in SSMS might appear instantaneous because SSMS shows only the first part of the data. You didn't say what the data was.
Use a real .NET profiler, like dotTrace or ANTS. That way you can see exactly which line the time is lost on. EF Prof (or my own ORM Profiler: http://www.ormprofiler.com) will tell you which part of the total route (ORM->DB->ORM) takes what time.
If the client for some reason can't use a profiler as Frans suggests, you will have to play the guessing game and exclude possibilities.
First of all, I think a critical piece of information is missing: does it always take around 9 seconds, or does it vary?
First step:
Decide whether the delay occurs before or after the query hits the database. It should be possible to do this with EF Profiler and by looking at the timestamps in SQL Server Profiler.
Either way, you will have narrowed the possibilities a bit.
Second step:
Exclude as much as possible:
Indexes (No, the query is fast)
Returning too much data (No, according to the info you have)
Slow query compilation (No, raw sql query is used)
Slow data transfer (No, the other queries works well)
Slow DbContext initialization (No, you said it's not the first query)
Row or table locks (Not likely; that would probably show up as a long-running query in the profiler)
Slow materialization (No, too few fields unless there is a serious edge-case bug)
Third step:
What's left? That depends on the answer to #1 and also on whether it's always 9 seconds.
My prime suspects here are either a connection issue (another call is blocking, so this one has to wait for a connection) or some second-level cache or similar that doesn't play well with this query.
To exclude some more alternatives, I would try to run the same query using plain old ADO.NET (a sketch follows). If the problem persists, you know it's not an EF problem and is very likely a connection issue. If it goes away, it could still be either issue, though.
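A rough sketch of that ADO.NET comparison, reusing the same SQL string (connection string and variable names are placeholders):

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(sqstring, conn))
{
    conn.Open();
    var sw = System.Diagnostics.Stopwatch.StartNew();
    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            // materialize the 4 decimal columns by hand here
        }
    }
    sw.Stop();
    Console.WriteLine(sw.ElapsedMilliseconds);
    // ~9 seconds here too: connection/network issue, not EF.
    // A few milliseconds: the time is being lost inside EF.
}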
Not so much an answer as some rants, but hopefully something you didn't think of already.
I have a project that uses Entity Framework (v1 with .NET 3.5). It's been in use for a few years, but it's now being used by more people. We started getting timeout errors and have tracked it down to a few things. For simplicity's sake, let's say my database has three tables: product, part, and product_part. There are ~1400 parts and a handful of products.
The user has the ability to add any number of parts to a product. My problem is that when many parts are added to the product, the inserts take a long time. I think it's mostly due to network traffic/delay, but inserting all 1400 takes around a minute. If someone goes in and tries to view the details of a part while those records are being inserted, I get a timeout and can see a block in the Activity Monitor of SQL Server.
What can I do to avoid this? My apologies if this has been asked before and I missed it.
Thanks,
Nick
I think the root problem is that your write transaction is taking so long. EF is not good at executing mass DML: it executes each insert in a separate statement and network round trip.
If you want to insert 1400 rows and performance matters, do the insert in one single statement using a TVP (INSERT ... SELECT * FROM @tvp), as sketched below. Or switch to bulk copy, but I don't think that will be advantageous at only 1400 rows.
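A hedged sketch of the TVP approach (type, table, and column names are placeholders; the table type must be created in the database once beforehand, and TVPs require SQL Server 2008 or later):

// One-time setup in the database:
//   CREATE TYPE dbo.ProductPartTvp AS TABLE (product_id INT, part_id INT);

var table = new DataTable();
table.Columns.Add("product_id", typeof(int));
table.Columns.Add("part_id", typeof(int));
foreach (var pp in productParts)
    table.Rows.Add(pp.ProductId, pp.PartId);

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(
    "INSERT INTO product_part (product_id, part_id) " +
    "SELECT product_id, part_id FROM @tvp", conn))
{
    var p = cmd.Parameters.AddWithValue("@tvp", table);
    p.SqlDbType = SqlDbType.Structured;
    p.TypeName = "dbo.ProductPartTvp";
    conn.Open();
    cmd.ExecuteNonQuery(); // all ~1400 rows in a single round trip
}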
If your read transactions are getting blocked and this is a problem, switch on snapshot isolation. That takes care of the readers 100%, as readers never block under snapshot isolation.
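For completeness, snapshot isolation is a one-time, per-database setting; something along these lines turns it on (the database name is a placeholder, and the connection needs sufficient permissions):

using (var conn = new SqlConnection(adminConnectionString))
using (var cmd = new SqlCommand(
    "ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON; " +
    "ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;", conn))
{
    conn.Open();
    cmd.ExecuteNonQuery(); // readers now see row versions instead of blocking on writers
}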
We have a website which uses LINQ to Entities, and we recently found that it's very slow. After troubleshooting, I found that whenever we use LINQ to Entities to fetch data from the database (with ToList(), for example), it consumes a lot of CPU time. I know it might be because we have lots of data in the database, which leads to slow responses, but I just wonder if there are any other reasons that might cause this problem?
What should I do to optimize this kind of problem? The following are possible reasons I can think of:
ToList() might load each object's related objects (via foreign keys); how can I force it to load only the object itself?
Is my connection pool too small?
Please let me know if there are any other possible reasons, and point me in the right direction to solve this issue.
In LINQ, a query returns the results of a sequence of manipulations applied to its sources when the query is enumerated.
IQueryable<Customer> myQuery = ...

foreach (Customer c in myQuery) // enumerating the query causes it to be executed
{
}

List<Customer> customers = myQuery.ToList();
// ToList will enumerate the query and put the results in a list.
// Enumerating the query causes it to be executed.
Executing a query requires a few things (in no particular order):
A database connection is drawn from the pool.
The query is interpreted by the query provider (in this case the provider is LINQ to Entities, and the interpretation is some form of SQL).
The interpreted form is transmitted to the database, where it does what it does and returns data objects.
Some method must be generated to translate the incoming data objects into the desired query output.
The database connection is returned to the pool.
The desired query output may have state tracking done to it before it is returned to your code.
Additionally, the database has a few steps of its own, here listed from the point of view of querying SQL Server:
The query text is received and checked against the query plan cache for an existing plan.
If no plan exists, the query optimizer creates a new one and puts it into the plan cache.
The query plan is executed - IO/locks/CPU/memory - any of these may be a bottleneck.
Query results are returned - the network may be a bottleneck, particularly if the result set is large.
So, to find out where the problem with your query is, you need to start measuring. I'll list these targets in the order I'd check them. This is not a complete list.
Get the translated SQL text of the query. You can use SQL Server Profiler for this, or the debugger; there are many ways to go about it (one way is sketched just after these first four steps). Make sure the query text returns what you require for your objects, no more, no less. Make sure the tables queried match your expectations. Run the query a couple of times.
Look at the result set. Is it reasonable, or are we looking at 500 gigs of results? Was a whole table queried when the whole thing wasn't needed? Was a Cartesian product generated unexpectedly?
Get the execution plan of the query (in SQL Server Management Studio, click the "show estimated execution plan" button). Does the query use the indexes you expect it to? Does the plan look weird (possibly a bad plan came from the cache)? Does the query work on tables in the order you expect, and perform nested/merge/hash joins in the way you expect? Is parallelization kicking in when the query doesn't deserve it (a sign of bad indexes/tons of IO)?
Measure the IO of the query (in SQL Server, issue SET STATISTICS IO ON). Examine the logical IO per table. Which table stands out? Again, look for a wrong order of table access or for a missing index that could support the query.
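For the first and last of those steps, a minimal sketch, assuming EF6 or later (older versions need SQL Server Profiler instead, as noted above):

// Capture the exact SQL that EF sends to the server:
context.Database.Log = s => System.Diagnostics.Debug.Write(s);
var customers = myQuery.ToList(); // the generated SQL now appears in the debug output

// Then paste the captured SQL into SSMS and run it after
//   SET STATISTICS IO ON;
// to get the logical IO per table.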
If you've made it this far, you've likely found and fixed the problem. I'll keep going though, in case you haven't.
Compare the execution time of the query to the execution time of the enumeration. If there's a large difference, it may be that the code which interprets the data objects is slow, or that generating that code was slow. It could also be that the translation of the query took a while. These are tricky problems to solve (in LINQ to SQL we use compiled queries to sort them out; see the sketch after this list).
Measure memory and CPU on the machine the code is running on. If you are capped there, use a code profiler or memory profiler to identify and resolve the issue.
Look at the network stats on the machine; in particular, you may want to use TCPView to see the TCP socket connections. Socket resources may be misused (such as opening and closing thousands in a minute).
Examine the database for locks held by other connections.
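Since compiled queries came up above, here is a minimal LINQ to SQL sketch (context and entity names are placeholders); compiling once avoids re-translating the expression tree on every execution:

static readonly Func<MyDataContext, string, IQueryable<Customer>> customersByCity =
    System.Data.Linq.CompiledQuery.Compile(
        (MyDataContext db, string city) =>
            db.Customers.Where(c => c.City == city));

// usage: var londoners = customersByCity(db, "London").ToList();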
I guess that's enough. Hope I didn't forget any obvious things to check.
You might find the solution to your problem in Performance Considerations (Entity Framework) on MSDN. In particular:
Return the correct amount of data
In some scenarios, specifying a query path using the Include method is much faster because it requires fewer round trips to the database. However, in other scenarios, additional round trips to the database to load related objects may be faster, because the simpler queries with fewer joins result in less redundancy of data. Because of this, we recommend that you test the performance of various ways to retrieve related objects. For more information, see Loading Related Objects.
To avoid returning too much data in a single query, consider paging the results of the query into more manageable groups. For more information, see How to: Page Through Query Results.
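As a hedged illustration of that paging advice (entity names, page size, and the ordering column are placeholders):

int pageSize = 100;
int pageIndex = 0; // zero-based page number

var page = context.Customers
    .OrderBy(c => c.Id)          // Skip/Take needs a stable ordering
    .Skip(pageIndex * pageSize)
    .Take(pageSize)
    .ToList();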