This is hard to explain so bear with me.
I have an Entity Framework context being used by a view model. Essentially, it is a search box backed by a service that uses the context to run queries based on the search criteria.
The problem is that when the first search is performed, the DbContext kicks into action and inspects the database to generate the entities and relationships (at least, this is what I think is happening).
To demonstrate: the first search takes a few seconds, as Entity Framework is doing its thing. After the first search is performed, all other searches happen pretty much instantaneously. It's just the first search which takes a long time.
Now, onto my question.
Is it possible to force the DbContext to load the relationships and generally do its thing (asynchronously) before any action is performed on the context, i.e. a query?
Ideally, the first search should be as quick as the other searches.
Yes - simply query the entities, but do nothing with them; the DbContext then caches the results.
What takes a lot of time on first use depends on the size of your db schema (building EF's mapping views) and is done once at runtime, on first instantiation.
Just initialise a context on another thread at startup and run any query on it, and it will take that performance hit asynchronously.
Don't try to keep a reference to that context either: creating contexts is cheap and they are meant to be short-lived; what is expensive is only the first time you create one in your process.
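A minimal sketch of that warm-up, assuming a context class named MyDbContext with some entity set SomeEntities (both names are placeholders for your own):

using System.Linq;
using System.Threading.Tasks;

// At startup: take the one-time model-building hit on a background
// thread so the user's first search is fast. The context is disposed
// immediately; only the warm-up matters, not the result.
Task.Run(() =>
{
    using (var ctx = new MyDbContext())    // placeholder context class
    {
        ctx.SomeEntities.FirstOrDefault(); // any query will do
    }
});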
If the slowdown is an issue even when done asynchronously, you can have EF do this work at compile time (pre-generating the views), but it is somewhat involved.
I was recently exposed to the Entity Framework 6 caching mechanism.
As we might gather from this article, it does this in a first-level manner.
Our system uses EF 6 (code first) along with MemoryCache to improve performance.
The main reason we use MemoryCache is that we need to execute an intense query on every page request, and we execute this query up to three times per request (in the worst case) since there are client callbacks.
I wonder if we still need the MemoryCache mechanism if EF 6 already uses one.
It is worth saying that we don't use any special caching feature or cache dependencies. Just a simple MemoryCache with timeouts.
The fact that EF caches entities in context is in no way a replacement for "real" cache, for various reasons:
You should not reuse an EF context for more than one logical operation, because an EF context represents a unit of work and should be used according to that pattern. Also, even if you for some reason reuse a context across multiple operations, you absolutely cannot do that in a multi-threaded environment, like a web server application.
It does not prevent you from making multiple queries for the same data to your database, for example:
var entity1 = ctx.Entities.Where(c => c.Id == 1).First();
var entity2 = ctx.Entities.Where(c => c.Id == 1).First();
This will still execute two queries against your database, despite the fact that the query is the same and returns the same entity. So nothing is really "cached" in the usual sense here. Note however that both queries will return the same entity, even if the database row has changed between the two queries. That is what is meant by EF context "caching": it will execute the database query twice, but the second time, while evaluating the result, it will notice that there is already an entity with the same key attached to the context. So it will return this existing ("cached") entity instead, and will ignore any new values returned by the second query. That behaviour is an additional reason not to reuse the context between multiple operations (though you should not do it anyway).
So if you want to reduce the load on your database, you have to use second-level caching, using whatever suits your needs (from a simple in-memory cache, to a caching EF provider, to a distributed memcached instance).
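For illustration, a minimal hand-rolled second-level cache over the same Entities set from the example above, using MemoryCache (MyContext, the cache key format, and the five-minute timeout are all assumptions):

using System;
using System.Linq;
using System.Runtime.Caching;

public static class EntityCache
{
    // Returns the entity from MemoryCache when present; otherwise hits
    // the database once and caches a detached copy for five minutes.
    public static Entity GetById(int id)
    {
        string key = "Entity:" + id;
        var cached = MemoryCache.Default.Get(key) as Entity;
        if (cached != null)
            return cached;

        using (var ctx = new MyContext()) // placeholder context class
        {
            // AsNoTracking: we only want a detached snapshot to cache.
            var entity = ctx.Entities.AsNoTracking().First(c => c.Id == id);
            MemoryCache.Default.Set(key, entity, DateTimeOffset.Now.AddMinutes(5));
            return entity;
        }
    }
}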
EF only implements what is called a first-level cache for entities: it stores the entities that have been retrieved during the lifetime of a context, so when you ask for that entity a second time it is returned from the context. What you need is a second-level cache, but EF doesn't implement this feature. NCache, for example, implements a solid caching architecture and an out-of-the-box second-level cache provider for EF - though not in its open source version.
I'm using Entity Framework 6 on a SQL Server database to query an existing database (database first, so there's an EDMX in my project).
I've noticed that the first time I request an entity, it can take up to thirty seconds for the query to be executed. Subsequent queries to the same object then get completed in a matter of milliseconds. The actual SQL being executed is very fast so it's not a slow query.
I've found that Entity Framework generates views in the background and that this is the most likely culprit. What I haven't found, however, is a good solution for this. There's a NuGet package that can handle the view generation (EFInteractiveViews), but it hasn't been updated since 2014 and I can hardly find any information on how to use it.
What options do I have nowadays? I've tried initializing Entity Framework on Application_Start by doing a few queries, but this doesn't seem to help much at all. It's also quite difficult to perform the real queries on Application_Start, because most queries use data from the current user (who is not yet logged on at that point), so it's hard to run these in advance.
I've thought about creating an ashx file that constantly polls the application by calling the API to keep it alive. I've also set the Application Pool to "AlwaysRunning" so that EF doesn't restart when the app pool is recycled.
Does anyone have any tips or ideas on how I can resolve this or things I can try?
Thanks a lot in advance. I've spent the better part of two days already searching for a viable solution.
There are many practices to speed up Entity Framework; I will mention some of them:
Turn off lazy loading (EDMX: open the file, right-click anywhere => Properties => set Lazy Loading Enabled to false).
Use AsNoTracking().ToList() for reads, and when you want to update, use Attach and set the object's state to EntityState.Modified (see the sketch after this list).
Use Indexes on your table
Use Paging, do not load all the data at once
Split your EDMX into several smaller ones, and only include the entities you need on your page (this will affect performance in a good way).
If you want to load related objects, "be eager and not lazy": use Include. You may need a using System.Data.Entity directive to use the lambda Include overloads.
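A short sketch of the AsNoTracking read and Attach update mentioned above, with an eager Include (MyEntities, People and Country are placeholder names):

using System.Data.Entity; // EntityState + the lambda Include overloads
using System.Linq;

using (var db = new MyEntities())
{
    // Read without change tracking - cheaper when the data is only displayed.
    var people = db.People.AsNoTracking()
                          .Include(p => p.Country) // eager, not lazy
                          .ToList();

    // Update a detached entity without re-querying it: attach it,
    // then mark it modified so SaveChanges issues an UPDATE.
    var person = people.First();
    person.Name = "New name";
    db.People.Attach(person);
    db.Entry(person).State = EntityState.Modified;
    db.SaveChanges();
}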
An example of splitting your EDMX:
If you have the following objects for a rent-a-car app: Country, City, Person, Car, Rent, Gender, Engine, Manufacturer, etc.
Now
If you are working on a screen to manage (CRUD) a person, you don't need Car, Rent, or Manufacturer, so create a ManagePerson.edmx containing (Country, City, Person, Gender).
If you are working on managing (CRUD) a car, then you don't need (Person, City, Gender, Rent), so you can create a ManageCar.edmx containing (Car, Manufacturer, Country, Engine).
Entity Framework must first compile and translate your LINQ queries into SQL, but after this it caches them. The first hit to a query is always going to take a long time, but as you mention, after that the query will run very quickly.
When I first used EF it was constantly an issue brought up by testers, but when the system went live and was used frequently (and queries were cached) it wasn't an issue.
See Hadi Hassan's answer for general speed-up tips.
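On the EFInteractiveViews package mentioned in the question: as far as I can tell, usage amounts to registering a view cache factory before the context is used for the first time, something like the sketch below. Treat the exact API and file path as assumptions and check the package's readme.

using EFInteractiveViews; // NuGet: EFInteractiveViews (API assumed below)

// First run: EF generates the mapping views and the factory saves them
// to the XML file. Later runs load the views from that file instead of
// regenerating them, skipping the expensive first-query hit.
using (var ctx = new MyContext()) // placeholder context class
{
    InteractiveViews.SetViewCacheFactory(
        ctx, new FileViewCacheFactory(@"C:\MyViews.xml"));
}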
I recently moved my entity model from an ObjectContext using 4.1 to a DbContext using 5.0. I'm starting to regret doing that, because I'm noticing some very poor performance on queries using the DbContext vs the ObjectContext. Here's the test scenario:
Both contexts use the same database, with about 600 tables. LazyLoading and ProxyCreation are turned off for both (not shown in the code example). Both have pre-generated views.
The test first makes 1 call to load up the metadata workspace. Then in a for loop that gets executed 100 times, I new up a context and make one call that takes the first 10. (I'm creating the context inside the for loop because this simulates being used in a WCF service, which would create the context every time)
for (int i = 0; i < 100; i++)
{
    using (MyEntities db = new MyEntities())
    {
        var a = db.MyObject.Take(10).ToList();
    }
}
When I run this with the ObjectContext it takes about 4.5 seconds. When I run it using the DbContext it takes about 17 seconds. I profiled this using RedGate's performance profiler. For the DbContext it seems the major culprit is a method called UpdateEntitySetMappings. This is called on every query and appears to retrieve the metadataworkspace and cycle through every item in the OSpace. AsNoTracking did not help.
EDIT: To give some better detail, the problem has to do with the creation/initialization of a DbSet vs an ObjectSet, not the actual query. When I make a call with the ObjectContext, it takes on average 42 ms to create the ObjectSet. When I make a call with the DbContext, it takes about 140 ms to create the internal DbSet. Both ObjectSet and DbSet do some entity-set mapping lookups from the MetadataWorkspace. What I've noticed is that the DbSet does it for ALL the types in the workspace, while the ObjectSet does not. I'm guessing (I haven't tried it) that with a model with fewer tables, the performance difference would be smaller.
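For reference, the numbers above can be reproduced without a profiler with a rough Stopwatch harness like this (the same warm-up call plus the loop from the scenario):

using System;
using System.Diagnostics;
using System.Linq;

// One call first to load the metadata workspace, so only the
// per-context cost is measured in the loop.
using (var warm = new MyEntities())
{
    warm.MyObject.Take(1).ToList();
}

var sw = Stopwatch.StartNew();
for (int i = 0; i < 100; i++)
{
    using (var db = new MyEntities())
    {
        var a = db.MyObject.Take(10).ToList();
    }
}
sw.Stop();
Console.WriteLine("Average per context: {0:F1} ms",
    sw.Elapsed.TotalMilliseconds / 100);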
I've also been concerned by the underperformance of the code-first approach, and I've performed some benchmarks in a scenario similar to yours:
http://netpl.blogspot.com/2013/05/yet-another-orm-micro-benchmark-part-23_15.html
The results were no surprise: since the DbContext is a wrapper over the ObjectContext, it has to sacrifice performance for simplicity. However, my tests show that:
the more records you retrieve, the smaller the difference
the more records you retrieve, the more important it is to turn off tracking if you want to be faster
For example, when retrieving just 10 records (charted in the blog post), code first is significantly slower than model first, and there is no noticeable difference between tracking and no tracking - both observations are exactly like yours.
However, when retrieving 10000 rows (also charted there), there is almost no difference between code first and model first in the no-tracking version. Also, both perform surprisingly well, almost as fast as the raw ADO.NET data reader.
Please follow my blog entry for more details.
That simple benchmark helped me accept the nature of code first. I still prefer it for smaller projects because of two features: POCO entities and migrations. On the other hand, I would never pick either of the two for a project where performance is a critical requirement. This effectively means that I will probably never use the model-first approach again.
(A side note: my benchmark also reveals that there is something wrong with NHibernate. I still haven't found anyone to help me explain this, even though I've consulted two independent developers who use NH daily.)
DbContext is a wrapper for ObjectContext. Here is a good answer to your question. It is possible that, to make it easier to use, they sacrificed performance.
I use Simple.Data to query millions of records and it works quite well and fast.
I'm using .NET Entity Framework 4.1 with the code-first approach to effectively solve the following problem, here simplified.
There's a database table with tens of thousands of entries.
Several users of my program need to be able to
View the (entire) table in a grid, which implies that the entire table has to be downloaded.
Modify values of any random row; changes are frequent but need not be persisted immediately. It's expected that different users will modify different rows, but this is not always true. Some loss of changes is permitted, as users will most likely update the same rows to the same values.
On occasion add new rows.
Sounds simple enough. My initial approach was to use a long-running DbContext instance. This one DbContext was supposed to track changes to the entities, so that when SaveChanges() is called, most of the legwork is done automatically. However, many have pointed out that this is not an optimal solution in the long run, notably here. I'm still not sure I understand the reasons, and I don't see what a unit of work is in my scenario either. The user chooses herself when to persist changes, and let's say, for simplicity, that the client always wins. It's also important to note that objects that have not been touched don't overwrite any data in the database.
Another approach would be to track changes manually, or use objects that track changes for me; however, I'm not too familiar with such techniques, and I would welcome a nudge in the right direction.
What's the correct way to solve this problem?
I understand that this question is a bit wishy-washy, but think of it as more fundamental. I lack a fundamental understanding of how to solve this class of problems. It seems to me that a long-living DbContext is the right way, but knowledgeable people tell me otherwise, which leads me to confusion and imprecise questions.
EDIT1
Another point of confusion is the existence of the Local property on the DbSet<> object. It invites me to use a long-running context, as another user has posted here.
The problem with a long-running context is that it doesn't refresh data - I discussed these problems in more detail here. So if your user opens the list and modifies data for half an hour, she doesn't know about other changes. But in the case of WPF, if your business action is:
Open the list
Do as many actions as you want
Trigger saving changes
Then this whole sequence is a unit of work, and you can use a single context instance for it. If you have a scenario where the last edit wins, you should not have problems with this, unless somebody else deletes the record the current user is editing. Additionally, after saving or cancelling changes you should dispose of the current context and load the data again - this will ensure that you really have fresh data for the next unit of work.
The context offers some features to refresh data, but it only refreshes data that was previously loaded (without relations), so, for example, new unsaved records will still be included.
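A minimal sketch of that pattern - one context per open-edit-save cycle, disposed and recreated afterwards (MyContext and Item are placeholder names):

using System;
using System.Collections.Generic;
using System.Linq;

public class EditSession : IDisposable
{
    // One context for the whole unit of work: entities loaded here are
    // tracked, so the user's edits are picked up by SaveChanges.
    private readonly MyContext _ctx = new MyContext();

    public List<Item> Load()
    {
        return _ctx.Items.ToList();
    }

    public void Save()
    {
        _ctx.SaveChanges();
    }

    public void Dispose()
    {
        _ctx.Dispose();
    }
}

// Usage: a fresh session per cycle guarantees fresh data next time.
// using (var session = new EditSession())
// {
//     var items = session.Load();
//     // ... user edits for as long as needed ...
//     session.Save();
// }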
Perhaps you can also read about MS Sync framework and local data cache.
It sounds to me like your users could have a copy (cache) of the data for an indefinite period of time. The longer the users work with cached data, the greater the odds that they become disconnected from the database connection in the DbContext. My guess is that EF doesn't handle this well, and you probably want to deal with that (e.g. an occasionally connected architecture). I would expect that implementing this may solve many of your issues.
With LINQ, do you create a single DbContext per request, like NHibernate requires? (For performance reasons: from what I understand, creating sessions in NHibernate is an expensive call.)
I.e. in my asp.net-mvc application, for a given action I may hit the database 5-10 times in separate calls. Do I need to create a context and re-use it for the entire request?
DataContexts are intended to be used for a single set of actions interacting with your database. I know, that's vague. Their usage is situational. If you are doing related, or specifically sequential activities, then one DataContext is probably good for you. If you are doing unrelated or parallel activities, consider using a DataContext for each activity.
Consider a few guidelines:
Entities retrieved by one DataContext can only be used (read: updated, deleted, etc.) by that same DataContext. If you need to match up objects across separate DataContexts, you'll have to do something such as running a LINQ query to select objects with the same primary key.
LINQ to SQL uses optimistic concurrency.
Dispose of the DataContext when you are done with it (letting it go out of scope and be garbage collected is fine)
Do not use a static or shared DataContext.
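Applied to the MVC scenario in the question, that typically looks like one short-lived context per action (a sketch; the controller, context, and entity names are made up):

using System.Linq;
using System.Web.Mvc;

public class ProductsController : Controller
{
    public ActionResult Index()
    {
        // One context for this action's related set of calls,
        // disposed as soon as the request's work is done.
        using (var db = new MyDataContext())
        {
            var products = db.Products.Take(50).ToList();
            return View(products);
        }
    }
}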
When I did a small app using LINQ to SQL, I found the app was very sluggish when I did a create-use-dispose of a DataContext object each time I had to hit the database.
When I moved to sharing the DataContext across multiple requests... the app suddenly came back to life w.r.t. responsiveness.
Here's a question that I posted which is relevant.