How can I tell an EF4 DbContext to clear its internal cache/state?
I have a database updater program which executes a large number of inserts on a variety of tables via EF4 within a transaction. I find the inserts to a common Permissions table get slower and slower as the update proceeds.
There are the following constraints at play:
1) Everything must occur within a single (giant) transaction.
2) Can't introduce dependency on MSDTC - so I can't have a cross-connection transaction.
3) Can't seem to open a new DbContext for a SqlConnection which is already open - it fails with an "EntityConnection can only be constructed with a closed DbConnection" error. (Note that I am already sharing a SqlConnection between multiple DbContexts, but I only open the connection after they are all initialized.)
Given these constraints, I can't create a new DbContext for each chunk of the work, as it breaks the transaction.
I've satisfied these functional constraints, but the performance is poor. I suspect the DbContext is struggling to handle the volume of data being inserted into the DbSet.
How can I tell the DbContext to reset its internal cache (eg. the rows I inserted recently and don't care about any more)?
IIRC, you get a decent speedup on insert if you:
myDbcontext.Configuration.AutoDetectChangesEnabled = false;
myDbcontext.Configuration.ValidateOnSaveEnabled = false;
It might also be worth reading this: http://patrickdesjardins.com/blog/entity-framework-4-3-with-poco-and-track-change
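For the "reset its internal cache" part of the question, here is a rough sketch of one way to combine those two flags with chunked saves. The entity and DbSet names (Permission, Permissions), the chunk size and the helper method are placeholders for your own model, and it assumes the shared connection and transaction are already set up:
// Requires: using System.Collections.Generic; using System.Data.Entity.Infrastructure; using System.Linq;
public void InsertPermissionsInChunks(MyDbContext context, IEnumerable<Permission> rows)
{
    context.Configuration.AutoDetectChangesEnabled = false;
    context.Configuration.ValidateOnSaveEnabled = false;

    var objectContext = ((IObjectContextAdapter)context).ObjectContext;
    int count = 0;

    foreach (var row in rows)
    {
        context.Permissions.Add(row);
        if (++count % 500 != 0)
            continue;

        context.SaveChanges(); // flush this chunk inside the existing transaction

        // Detach the rows that were just saved so the change tracker stays small.
        foreach (var entry in context.ChangeTracker.Entries<Permission>().ToList())
            objectContext.Detach(entry.Entity);
    }

    context.SaveChanges();
}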
I'd probably abandon EF for a gigantic insert and go with SqlBulkCopy instead. The relevant section is here: http://msdn.microsoft.com/en-us/library/tchktcdk.aspx#sectionSection2
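A rough sketch of how that could still satisfy the single-transaction constraint from the question, by reusing the already-open connection and transaction (table and column names are placeholders):
// Requires: using System.Data; using System.Data.SqlClient;
var table = new DataTable();
table.Columns.Add("Name", typeof(string));
table.Columns.Add("GroupId", typeof(int));
foreach (var p in permissionsToInsert)
    table.Rows.Add(p.Name, p.GroupId);

// Passing the existing SqlConnection and SqlTransaction avoids a second connection (and MSDTC).
using (var bulk = new SqlBulkCopy(sharedConnection, SqlBulkCopyOptions.Default, sharedTransaction))
{
    bulk.DestinationTableName = "dbo.Permissions";
    bulk.BatchSize = 1000;
    bulk.WriteToServer(table);
}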
In your application you can use a mix: Entity Framework for reading your data and doing small inserts and updates, and ADO.NET DataAdapters for bulk inserts and updates http://msdn.microsoft.com/en-us/library/aadf8fk2.aspx
Alternatively you could use ExecuteSqlCommand in EF5 http://msdn.microsoft.com/en-us/library/gg679456(v=vs.103).aspx to do your inserts in combination with stored procedures, passing a table-valued parameter with the bulk data. In EF4 the equivalent is ExecuteStoreCommand http://msdn.microsoft.com/en-us/library/system.data.objects.objectcontext.executestorecommand.aspx
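For the EF4 route, a rough sketch of handing the bulk rows to a stored procedure through a table-valued parameter (the procedure, table type and columns are placeholders, and a matching user-defined table type is assumed to exist in the database):
// Requires: using System.Data; using System.Data.SqlClient;
var rows = new DataTable();
rows.Columns.Add("Name", typeof(string));
foreach (var name in permissionNames)
    rows.Rows.Add(name);

var tvp = new SqlParameter("@rows", SqlDbType.Structured)
{
    TypeName = "dbo.PermissionRowType", // user-defined table type
    Value = rows
};

// ObjectContext.ExecuteStoreCommand passes DbParameter instances straight through to the command.
objectContext.ExecuteStoreCommand("EXEC dbo.InsertPermissions @rows", tvp);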
Related
We have a lot of data that needs to be loaded into a number of tables. As far as I can see we have two options:
Include the data as part of the Configuration class seed method
Problems
1.a. This would be slow and involve writing a lot of C# code.
Use bulk insert with code first migrations - a lot quicker and probably a better solution.
Problems
2.a. It may prove tricky working with other data that gets loaded into the same tables as part of the seed.
2.b. It requires SQL Identity Insert to be switched on.
Which solution is best? And if it is 2, how do I go about bulk insert with code first migrations, and how can I address the problems?
Bypassing EF and using ADO.NET/SQL is definitely a good approach for bulk data upload. The best approach depends on whether you want the data to be inserted as part of a migration or as logic that runs on app start.
If you want it inserted as part of a migration (which may be nice since then you don't have to worry about defensive checks for whether the data already exists, etc.) then you can use the Sql(string) method to execute SQL in whatever format and with whatever SQL features you want (including switching IDENTITY_INSERT on/off). In EF6.1 there is also an overload that allows you to run a .sql file rather than having everything in code as a string.
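A rough sketch of what such a migration could look like (the table, columns and values are placeholders):
// Requires: using System.Data.Entity.Migrations;
public partial class SeedReferenceData : DbMigration
{
    public override void Up()
    {
        // One batch: switch identity insert on, load rows with explicit IDs, switch it back off.
        Sql(@"SET IDENTITY_INSERT dbo.Products ON;
              INSERT INTO dbo.Products (Id, Name) VALUES (1, 'Widget'), (2, 'Gadget');
              SET IDENTITY_INSERT dbo.Products OFF;");
    }

    public override void Down()
    {
        Sql("DELETE FROM dbo.Products WHERE Id IN (1, 2)");
    }
}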
If you want to do it on app start, then just create an instance of your context and then access Database.Connection to get the raw SqlConnection and use ADO.NET directly to insert the data.
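And a rough sketch of the app-start variant (the context, table and DataTable are placeholders):
// Requires: using System.Data.SqlClient;
using (var context = new MyAppContext())
{
    var connection = (SqlConnection)context.Database.Connection;
    connection.Open();

    using (var bulk = new SqlBulkCopy(connection) { DestinationTableName = "dbo.Products" })
    {
        bulk.WriteToServer(productsTable); // a DataTable built elsewhere
    }
}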
Has anyone been able to create temporary tables or triggers using Microsoft's Entity Framework and SQLite? I have a working application that can create permanent tables and triggers, but not temporary ones. Listing sqlite_temp_master turns up no entries, and any attempts to interact with the temporary tables fail with "no table" errors. These interactions are taking place through a single SQLiteConnection though there is at least one other connection active in the application at the time.
I am using Database.ExecuteSqlCommand() to create the tables and triggers. If the TEMPORARY keyword is supplied, there are no errors and no tables/triggers. If the TEMPORARY keyword is not supplied, there are no errors, and permanent tables/triggers are created.
Any help would be appreciated.
The System.Data.Entity.Database object opens and closes the connection as it deems appropriate. The way I was using ExecuteSqlCommand, it opened and closed the connection for each command, so temporary tables were discarded as soon as they were created.
Manually opening and closing Database.Connection won't work because of an apparent problem in the DbContext class. However, the internal ObjectContext object can do the job.
Here's the best summary I've found of the solution (many thanks to Brent McKendrick):
(dbContext as IObjectContextAdapter).ObjectContext.Connection.Open();
using (TransactionScope scope = new TransactionScope(TransactionScopeOption.Required))
{
    // perform a list of queries
    // The connection will not close!
    scope.Complete();
    (dbContext as IObjectContextAdapter).ObjectContext.Connection.Close();
}
I used that technique in conjunction with SQLiteCommand to create a set of temporary tables and triggers, perform my operations, use LINQ to get the results, then end the transaction, and close the connection. The temporary objects were dropped only at the end, as expected.
I did not check to see if Database.ExecuteSqlCommand can be used in place of SQLiteCommand when using this technique.
Edit: The TransactionScope is not necessary and only adds overhead to the operation. The critical part is opening and closing the connection through the dbContext's ObjectContext.
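A rough sketch of that simplified version, assuming the System.Data.SQLite provider (the table name and DDL are placeholders):
// Requires: using System.Data.Entity.Infrastructure; using System.Data.SQLite;
var objectContext = ((IObjectContextAdapter)dbContext).ObjectContext;
objectContext.Connection.Open(); // keep the connection open so temporary objects survive
try
{
    var storeConnection = (SQLiteConnection)dbContext.Database.Connection;
    using (var cmd = new SQLiteCommand("CREATE TEMP TABLE TempIds (Id INTEGER)", storeConnection))
        cmd.ExecuteNonQuery();

    // ... populate TempIds, run LINQ queries against dbContext, etc. ...
}
finally
{
    objectContext.Connection.Close(); // temporary tables/triggers are dropped here
}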
I am working on a legacy application, and we have poor performance with Entity Framework (4.0.0) and massive inserts.
When I tried the POCO Generator (T4), the issue got worse: SaveChanges took three times as long. This is huge; if you have any idea why, I am interested.
I don't have any performance metrics for the different generators, but the bottleneck should not be in your context anyway. You should know that EF generates one SQL statement per insert, update and delete, and if you didn't explicitly open the connection first, it will log on to and off from SQL Server once per SQL statement.
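A rough sketch of the explicit-open pattern for the EF4 ObjectContext API (EF will not close a connection that the caller opened, so all the generated INSERTs reuse it):
context.Connection.Open();
try
{
    // ... add/attach the entities to insert ...
    context.SaveChanges(); // every generated INSERT goes over the one open connection
}
finally
{
    context.Connection.Close();
}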
Also, the context must maintain states and relationships, so performance degrades as your context gets larger and larger. SaveChanges must figure out what has happened in the context first, which is probably why the POCO Generator and the EntityObject generator end up with different execution times. As for it being three times longer, more details would be needed to figure that out.
PS, if you are stuck with the legacy code, you should look into using bulk copy with EF.
I'm writing an app using WPF, Entity Framework and SQL Server, all very run-of-the-mill stuff. I was having a look at what calls get made to the database using SQL Profiler and found quite a few unnecessary calls. The first one was solved pretty easily, but I have included it for anyone reading this thread in the future. Assume I have a table structure with three tables like this: Invoice -> InvoiceDetail -> Product
1) When I load up an Invoice object, it will then execute a separate statement to retrieve each InvoiceDetail item. This is solved pretty easily by using the Include statement, eg
context.Invoices.Include("InvoiceDetails").Where(i => i.Something == somethingelse);
2) When I delete an Invoice, the database has a cascade delete which automatically deletes all of the InvoiceDetails. However, EF still insists on issuing a delete for each of the InvoiceDetail objects that it has in memory. If an invoice has 100 items on it then it will execute 101 statements instead of 1. This is bad.
3) In addition to the extra statements executed in point 2, assuming each InvoiceDetail object points to a Product and I have caused the products to be loaded into memory (this would happen if I showed the invoice before I deleted it), then EF executes a useless update statement on every product! In fact this update statement is worse than useless, because if someone else has changed something about the product in the meantime then this code will change the data back! If I'm logging changes then we get useless log entries. I suspect it is doing this because the Product had an InvoiceDetails collection which has had some items removed, but the Product itself has not changed, so why the update?
Thanks for reading
Cheers,
Michael
The initial behavior is known as lazy loading. You have replaced it with eager loading, which is exactly the right solution for this problem.
For Entity Framework this is the only possible behavior, because EF doesn't support batch modifications. Every record must be deleted with its own statement and its own round trip to the database. Once you load entities into memory you simply have to delete them one by one, otherwise you will get an exception before any database call is made (so the database's cascade delete will not help you). The only workaround is a custom stored procedure for the deletion, and disposing the current context after running the stored procedure, because its internal state will no longer be consistent with the database.
This is interesting. It would require a little more investigation, but it may simply be a design flaw/bug in EF and you will most probably not avoid it (unless you use a stored procedure as described in 2). If you want to avoid overwriting changes to Product you must involve optimistic concurrency. In that case your changes will not be overwritten, but your delete will fail with an OptimisticConcurrencyException. I will check this behavior later and let you know if I'm able to reproduce it and find any workaround.
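For reference, in the designer you would set the property's Concurrency Mode to Fixed; with the code-first/DbContext API the rough equivalent (class and property names here are placeholders, not from the question) is a rowversion concurrency token:
// Requires: using System.ComponentModel.DataAnnotations;
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }

    [Timestamp] // rowversion column used as the concurrency token
    public byte[] RowVersion { get; set; }
}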
I've been using this as a solution to let SQL Server handle the cascading deletes without the EF hit.
Public Sub DeleteCheckedOutByUser(ByVal username As String)
    ' Parameterized so the username is passed as a DbParameter rather than concatenated into the SQL.
    _context.ExecuteStoreCommand("delete Maintenance.CheckoutManager where CheckOutTo = {0}", username)
End Sub
Sorry it's in VB, that's what my current client is using. If you have any trouble translating what I'm saying just let me know.
To remove the cascading deletes (and presumably rely on SQL Server to do the deletes), see the approach here: http://geekswithblogs.net/danemorgridge/archive/2010/12/17/ef4-cpt5-code-first-remove-cascading-deletes.aspx
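In code first that amounts to removing the cascade-delete conventions in the model configuration (a rough sketch; requires using System.Data.Entity.ModelConfiguration.Conventions):
protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    // Stop EF from configuring its own cascading deletes; the database's ON DELETE CASCADE does the work.
    modelBuilder.Conventions.Remove<OneToManyCascadeDeleteConvention>();
    modelBuilder.Conventions.Remove<ManyToManyCascadeDeleteConvention>();
}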
As the question asks really.
The EF modeller tool allows us to map the Insert/Update/Delete functions to a sproc; is there any benefit to overriding them?
If it requires some custom validation, then obviously yes, but if I'm happy with how it is now, is it worth creating sprocs for them all?
I can't remember how to view the SQL it's executing for them to find out the exact query, but I should imagine it'd be pretty similar to a standard Insert/Update/Delete query.
I can think of a few cases where it could be useful:
You're working with a legacy database which doesn't quite map to your EF model precisely.
You need extra queries to be executed on insert/update/delete, but you don't have the rights to create triggers on your database.
Soft deletes in your database which you want to abstract away, so a regular delete will actually perform a soft delete.
Not quite sure how viable these options are, as I personally am more of a NHibernate guy. These are theoretical options.
As for viewing the executed queries, there are a few ways to do that. You could attach a profiler to your SQL Server instance and look at the raw queries that are executed. There's also Entity Framework Profiler (by Ayende / Oren Eini), which isn't free, but it does make reading and debugging the queries a lot easier.
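For SELECTs specifically, the generated SQL can also be dumped from code; a rough sketch for the ObjectContext API (the entity set and filter are placeholders):
// Requires: using System.Data.Objects;
// ObjectSet<T>.Where(...) still returns an ObjectQuery<T> underneath.
var query = context.Invoices.Where(i => i.CustomerId == 42) as ObjectQuery<Invoice>;
Console.WriteLine(query.ToTraceString()); // the SQL that will be sent to the server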
Yes. There is a benefit to overriding them.
Not everybody actually updates or deletes a row of data when an update or delete happens.
In some cases, deleting a record simply means setting an EffectiveUntil date on an existing record and keeping it in the database for historical purposes.
The same can go for an Update. Instead of updating an existing row, the current row gets the EffectiveUntil date set and a brand new row gets inserted with the new data with a null EffectiveUntil date (or similar mechanism).
By providing Insert/Update/Delete logic to Entity Framework, you are allowed to specify exactly what those operations mean in terms of your database rather than what they mean in the scope of an RDBMS.
As for the second question (which I apparently originally missed): if you're happy with what is currently being generated, then no, it's not worth creating them. You'd just add the extra headache of having to remember to update your stored procedures whenever you change the table structure.