To improve the performance of an Entity Framework application, it is often suggested to set AutoDetectChangesEnabled = false.
The following tutorial on MSDN states:
An alternative to disabling and re-enabling is to leave automatic detection of changes turned off at all times and either call context.ChangeTracker.DetectChanges explicitly or use change tracking proxies diligently. Both of these options are advanced and can easily introduce subtle bugs into your application so use them with care.
https://msdn.microsoft.com/en-us/data/jj556205.aspx
The last part is what concerns me.
What are the most common problems that can occur with this optimization approach?
What are good measures to prevent unintended consequences?
My experience with change tracking is that you should leave it on if at all possible.
I ran into two subtle problems with change tracking (in our case it was disabled globally).
Firstly, when adding or removing entities, you WILL have to set the entity state manually, because it is normally change tracking that sets the state to modified/added (you have to set deleted manually anyway). This applies to every single entity, including those in navigation properties. In many cases you also have to set FKs manually.
Secondly, when EDITING related entities, you have to call DetectChanges or set the state of the related entities manually, which in my experience is quite complicated. This is because EF keeps a snapshot of related entities in its context graph and checks that snapshot for referential integrity, not the actual related entries in your DbSets.
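To make the pitfall above concrete, here is a minimal sketch. It assumes EF6's DbContext API and plain POCO entities (no change tracking proxies); Blog, BlogContext and BlogEditor are hypothetical names invented for this example. With automatic detection turned off, an edit to a tracked entity goes unnoticed unless you call DetectChanges or set the state yourself.

```csharp
using System.Data.Entity;
using System.Linq;

// Hypothetical entity and context, for illustration only.
public class Blog
{
    public int Id { get; set; }
    public string Title { get; set; }
}

public class BlogContext : DbContext
{
    public BlogContext()
    {
        // Automatic change detection is off for the lifetime of this context.
        Configuration.AutoDetectChangesEnabled = false;
    }

    public DbSet<Blog> Blogs { get; set; }
}

public static class BlogEditor
{
    public static void Rename(int blogId, string newTitle)
    {
        using (var context = new BlogContext())
        {
            var blog = context.Blogs.Single(b => b.Id == blogId);
            blog.Title = newTitle;

            // Without one of the two options below, the entry stays Unchanged
            // and SaveChanges has nothing to write.

            // Option 1: scan all tracked entities for changes.
            context.ChangeTracker.DetectChanges();

            // Option 2 (instead of option 1): mark just this entity.
            // context.Entry(blog).State = EntityState.Modified;

            context.SaveChanges();
        }
    }
}
```

Forgetting that one call does not throw; the update is just silently dropped, which is probably the kind of subtle bug the MSDN page warns about.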
For further reference, I found an interesting series of articles on change tracking by one of the EF developers, Arthur Vickers.
Part 1
Part 2
Part 3 - possibly most interesting to you
Part 4
Part 5
Always make sure that you haven't accidentally disabled the Entity Framework proxy types. I had this problem and spent plenty of time fixing it. EF's change tracking is somehow related to this: when I disabled change tracking, it also disabled proxy types.
EF uses its own proxy types that mimic your types in order to apply lazy loading to them. When proxy types, and hence lazy loading, are disabled, EF simply stops loading related entities. So if you have a MyClass with a property myClass.MyAnotherClass, it will always be null (unless you load it explicitly, e.g. with Include).
Personally, I would recommend leaving change tracking enabled if you're not proficient with it. I tried to work with it disabled, spent a few days trying to make it work, and then turned it back on. It definitely affects performance, but it's pretty intelligent and gives you a lot of advantages in exchange.
Related
I have a question regarding the .AsNoTracking() extension, as this is all quite new and quite confusing.
I'm using a per-request context for a website.
A lot of my entities don't change so don't need to be tracked, but I have the following scenario where I'm unsure of what's going to the database, or even whether it makes a difference in this case.
This example is what I'm currently doing:
context.Set<User>().AsNoTracking()
// Step 1) Get user
context.Set<User>()
// Step 2) Update user
This is the same as above but removing the .AsNoTracking() from Step 1:
context.Set<User>();
// Step 1) Get user
context.Set<User>()
// Step 2) Update user
The Steps 1 & 2 use the same context but occur at different times. What I can't work out is whether there is any difference. As Step 2 is an update I'm guessing both will hit the database twice anyway.
Can anyone tell me what the difference is?
The difference is that in the first case the retrieved user is not tracked by the context, so when you save the user back to the database you must attach it and set its state correctly so that EF knows it should update the existing user instead of inserting a new one. In the second case you don't need to do that, provided you load and save the user with the same context instance, because the tracking mechanism handles it for you.
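The two cases can be sketched roughly as follows. This assumes EF6's DbContext API; User, its Name property, SiteContext and UserUpdates are hypothetical names made up for the illustration.

```csharp
using System.Data.Entity;
using System.Linq;

// Hypothetical entity and context, for illustration only.
public class User
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class SiteContext : DbContext
{
    public DbSet<User> Users { get; set; }
}

public static class UserUpdates
{
    // Case 1: loaded with AsNoTracking, so the context knows nothing about the entity.
    public static void RenameUntracked(SiteContext context, int userId, string newName)
    {
        var user = context.Set<User>().AsNoTracking().Single(u => u.Id == userId);
        user.Name = newName;

        context.Set<User>().Attach(user);                 // start tracking it as Unchanged
        context.Entry(user).State = EntityState.Modified; // request an UPDATE for this entity
        context.SaveChanges();
    }

    // Case 2: tracked query, so change detection picks up the edit during SaveChanges.
    public static void RenameTracked(SiteContext context, int userId, string newName)
    {
        var user = context.Set<User>().Single(u => u.Id == userId);
        user.Name = newName;
        context.SaveChanges();
    }
}
```

One caveat with the first variant: if the same context is already tracking another instance with the same key, Attach throws, which is easy to hit with a per-request context.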
See this page: Entity Framework and AsNoTracking
What AsNoTracking Does
Entity Framework exposes a number of performance tuning options to help you optimise the performance of your applications. One of these tuning options is .AsNoTracking(). This optimisation allows you to tell Entity Framework not to track the results of a query. This means that Entity Framework performs no additional processing or storage of the entities which are returned by the query. However, it also means that you can't update these entities without reattaching them to the tracking graph.
There are significant performance gains to be had by using AsNoTracking.
No Tracking LINQ to Entities queries
Usage of AsNoTracking() is recommended when your query is meant for read operations. In these scenarios, you get back your entities but they are not tracked by your context. This ensures minimal memory usage and optimal performance.
Pros
Improved performance over regular LINQ queries.
Fully materialized objects.
Simplest to write with syntax built into the programming language.
Cons
Not suitable for CUD operations.
Certain technical restrictions, such as: Patterns using DefaultIfEmpty for OUTER JOIN queries result in more complex queries than simple OUTER JOIN statements in Entity SQL.
You still can’t use LIKE with general pattern matching.
More info available here:
Performance considerations for Entity Framework
Entity Framework and NoTracking
Disabling tracking will also cause your result sets to be streamed into memory. This is more efficient when you're working with large sets of data and don't need the entire set of data all at once.
References:
How to avoid memory overflow when querying large datasets with Entity Framework and LINQ
Entity framework large data set, out of memory exception
AsNoTracking() allows the "unique key per record" requirement in EF to be bypassed (not mentioned explicitly by other answers).
This is extremely helpful when reading a View that does not support a unique key because perhaps some fields are nullable or the nature of the view is not logically indexable.
For these cases the "key" can be set to any non-nullable column, but then AsNoTracking() must be used with every query; otherwise records that are duplicates by key will be skipped.
If something else is altering the DB (say, another process) and you need to be sure you see those changes, use AsNoTracking(); otherwise EF may give you the copy your context already has in memory. This is also why it is usually a good idea to use a new context for every query:
http://codethug.com/2016/02/19/Entity-Framework-Cache-Busting/
Essentially, we have a database with a recurring template pattern and instances of this template. Templates live indefinitely, while the instances are bound in time. One group of users work only with templates and one group of users work only with "answer" entities connected to the instances. When a change is made to the template, the instances that are currently active automatically receive the changes from the templates (including cloning related entities or bringing existing clones into sync), while older instances are left alone "as you left them", which is an absolute requirement in order to not retroactively change history. When you go back to 2013, you want to see the data that was current as of the last change in 2013, not anything newer. Thus the cloning.
This all sounds good, except that making the clone involves cloning an involved graph of entities, sometimes including many-to-many relationships. Making sure that the information of the just-updated version of the template is used involves passing around that specific as-yet-unsaved entity object or saving at every step, forgetting all objects and making a new context every time. This code is hard to write, harder to get right and a nightmare to maintain.
I have desperately been looking for suitable literature about this and have been unable to even find something written up about the database modelling pattern (or for that matter better alternatives), never mind what to do in EF to work as efficiently as possible. Am I missing something, or is this just a case of it being a problem with inherent complexity?
There is nothing built in to help with this specific scenario. I'd consider a solution based on reflection and on the Entity Framework metadata model to automate a lot of this. That makes it easier to get right as well.
Cloning a graph of objects should be automatable and has little inherent complexity. But if you want to clone only specific parts, I can see complexity creeping in easily. That's likely going to be inherent complexity. On the other hand, if you find yourself writing the same cloning code and copy loops all over the place, that's a missed abstraction and is artificial complexity.
Making sure that the information of the just-updated version of the template is used involves passing around that specific as-yet-unsaved entity object or saving at every step, forgetting all objects and making a new context every time.
I did not quite understand what you mean here. But talking about multiple contexts makes me very alert because that's a common anti-pattern. Normally, you want to have one context per logical unit of work. Often, that UOW is an HTTP request or a WCF request or a user interaction. When all entities are part of the same context many issues go away.
Also, it's not necessary to keep objects unsaved. Generally, the database should be synchronized with the in-memory entity state. So when you create fresh objects as part of your template cloning procedure there should be no reason to not save them. It's not necessary to save after each new entity. For performance reasons try not to save too often.
If you elaborate more on specific issues I can add commentary.
I've been tasked with speeding up a giant codebase. One of the things I have noticed is that the team uses lazy loading everywhere. So much so that I think there's a lot to be gained by disabling it. There would be too much of an impact if I disabled it entirely so I'd rather do this in phases.
This got me thinking: is there a way (an event?) to detect when EF is doing something lazily?
In case it matters, we're using EF6, but the context is based on ObjectContext instead of DbContext.
Due to the mess of the codebase it's not an option to just find references on the navigation properties.
I recommend using Glimpse. It's a powerful tool for many things and includes an EF profiler. You can see how your queries are translated and how long each one takes.
What is the difference between SaveOptions.AcceptAllChangesAfterSave and SaveOptions.DetectChangesBeforeSave in Entity Framework? When to use SaveOptions.None?
These options are provided in objectContext.SaveChanges(SaveOptions options).
Can any of these options, in any way, be used to reverse the changes made by objectContext.SaveChanges()?
They're two entirely different things. Note how SaveOptions has the Flags attribute: this indicates you can combine multiple flags, in this case to make SaveOptions.AcceptAllChangesAfterSave | SaveOptions.DetectChangesBeforeSave.
And if you're wondering about something like SaveOptions.None | SaveOptions.AcceptAllChangesAfterSave, then keep in mind that SaveOptions.None is the zero value, so this is just a long-winded way of writing SaveOptions.AcceptAllChangesAfterSave.
So you use SaveOptions.None when you want neither SaveOptions.AcceptAllChangesAfterSave nor SaveOptions.DetectChangesBeforeSave.
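The flag arithmetic can be checked with a small stand-alone sketch. DemoSaveOptions is a hypothetical local enum that mirrors what I understand the EF6 SaveOptions values to be (None = 0, AcceptAllChangesAfterSave = 1, DetectChangesBeforeSave = 2); it is not the EF type itself, so the example runs without referencing Entity Framework.

```csharp
using System;

// Hypothetical mirror of EF's SaveOptions, declared locally so the sample is self-contained.
[Flags]
public enum DemoSaveOptions
{
    None = 0,
    AcceptAllChangesAfterSave = 1,
    DetectChangesBeforeSave = 2
}

public static class SaveOptionsDemo
{
    // True when every bit of 'flag' is set in 'value'.
    // Note: for flag == None this is always true, because None has no bits.
    public static bool Has(DemoSaveOptions value, DemoSaveOptions flag)
    {
        return (value & flag) == flag;
    }

    // Combining the two real flags with bitwise OR.
    public static DemoSaveOptions Both()
    {
        return DemoSaveOptions.AcceptAllChangesAfterSave
             | DemoSaveOptions.DetectChangesBeforeSave;
    }

    // OR-ing with the zero value changes nothing.
    public static DemoSaveOptions NoneOrAccept()
    {
        return DemoSaveOptions.None | DemoSaveOptions.AcceptAllChangesAfterSave;
    }
}
```

Here SaveOptionsDemo.NoneOrAccept() equals AcceptAllChangesAfterSave, and SaveOptionsDemo.Both() has both flags set when tested with Has.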
Can any of these options, in any way, be used to reverse the changes made by objectContext.SaveChanges()?
In the context? If you don't include SaveOptions.AcceptAllChangesAfterSave, then all changes will be preserved locally as unsaved. All added entities will remain in the "added" state, all modified entities will remain in the "modified" state, and all deleted entities will still be available by explicitly requesting your context's deleted entities. Attempting to save again will likely fail, as the database has already been updated. You can then use the regular methods for reverting unsaved changes, but that requires a lot of manual work on your part: you have to look up the original value of every property and restore it. A detailed example of how to do this is, I think, beyond the scope of this question, but see Undo changes in entity framework entities.
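As a rough sketch of the "in the context" case, assuming EF6's ObjectContext API (the DeferredAccept class and its method name are made up for the example): saving without AcceptAllChangesAfterSave leaves the entries in their pending states until you accept them yourself.

```csharp
using System.Data.Entity.Core.Objects;

public static class DeferredAccept
{
    // Hypothetical helper: writes to the database but postpones accepting the changes.
    public static void SaveThenAccept(ObjectContext context)
    {
        // Detect changes and send them to the database, but do NOT accept them locally.
        context.SaveChanges(SaveOptions.DetectChangesBeforeSave);

        // At this point tracked entries are still Added/Modified/Deleted.
        // Any follow-up work that needs the pending states would go here.

        // Mark everything as saved so the next SaveChanges does not resend it.
        context.AcceptAllChanges();
    }
}
```

Note that skipping AcceptAllChanges does not undo anything in the database; it only keeps the context's view of what was changed.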
In the database? This requires even more work on your part, and may not even be possible at all: once an entity with a server-generated column (e.g. an auto-increment key or a row version field) has been changed or deleted, it is generally impossible to restore it with the exact same values it originally had.