I recently upgraded from Entity Framework 5 to Entity Framework 6.1.3.
The code below, which uses multiple contexts over the same connection, worked fine in EF5:
var Ids = MyDbContext.MyObject.Select(x => x.Id).Take(5).AsEnumerable();
var myObjects = MyDbContext2.MyObject.Where(x => Ids.Contains(x.Id)).ToList();
In EF6, I receive:
The specified LINQ expression contains references to queries that are
associated with different contexts. Description: An unhandled
exception occurred during the execution of the current web request.
Please review the stack trace for more information about the error and
where it originated in the code.
Exception Details: System.NotSupportedException: The specified LINQ
expression contains references to queries that are associated with
different contexts.
What in Entity Framework changed to stop this from working? Is there any way I can get this to work without changing code?
Change the first line from .AsEnumerable() to .ToList().
https://msdn.microsoft.com/en-us/data/hh949853.aspx#_Query_Plan_Caching
According to this documentation, there were changes made to Contains processing in EF 6 to optimize the way the underlying SQL query is generated.
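Applied to the code from the question, the fix looks like this:

var Ids = MyDbContext.MyObject.Select(x => x.Id).Take(5).ToList();
var myObjects = MyDbContext2.MyObject.Where(x => Ids.Contains(x.Id)).ToList();

ToList() executes the first query immediately, so the second query only sees a list of IDs in memory rather than a query tied to another context.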
Just a shot in the dark without looking at EF6's code:
IEnumerable is generally deferred execution that doesn't hit the database until you reference the data in some way. From the framework's point of view, that isn't a list of integers or longs, but a query that needs to be performed against a different context. Since it appears in the middle of a query on another context, the SQL parser is probably having trouble resolving it with the new way of doing things. IEnumerable is a sort of halfway state between IQueryable and loaded. I'd guess whatever changes they made for optimization no longer perform the outstanding queries; it just immediately short-circuits to an exception if the referenced object isn't part of the context, no matter what it is.
This is also why changing it to ToList() allows it to work: you're working off a list of primitives and not an unresolved query.
Why did they make the change? I suppose they have their reasons (even beyond optimization). One I can think of is that it prevents the query-generation code from modifying the loaded state of that IEnumerable, removing a possibly unwanted side effect.
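A quick way to see the difference (a minimal sketch using the contexts from the question):

// AsEnumerable keeps the underlying query alive; nothing has hit the database yet,
// and EF6 will reject it when it's embedded in a query on another context.
var deferred = MyDbContext.MyObject.Select(x => x.Id).Take(5).AsEnumerable();

// ToList() runs the query now and hands back a plain List<int>,
// which the second context can treat as a simple list of primitives.
var materialized = MyDbContext.MyObject.Select(x => x.Id).Take(5).ToList();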
You can materialize the IDs with ToDictionary():
var Ids = (
    from x in MyDbContext.MyObject
    select x.Id
).ToDictionary(id => id).Keys.ToList();
var myObjects = (
    from y in MyDbContext2.MyObject
    where Ids.Contains(y.Id)
    select y
).ToList();
return myObjects;
Related
I am wondering why BulkDelete does not throw an exception if there is no such entity to delete in the database. I am forced to query the database for matching entities first and then call BulkDelete with the matched entities I got back. Does EF Extensions have an option to automate this?
You don't have to check the database. I am assuming you are using EF Extensions' BulkDelete. When you call BulkDelete, you just mark the delete, but to apply the changes to the database you have to call .SaveChanges(); this method, however, returns the number of rows affected.
So, if the number is 0, then you know your DELETE matched nothing. If the number is above 0, then you know the DELETE was successful.
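A minimal sketch of that check with plain EF (in standard EF, SaveChanges returns the number of state entries written to the database; the IsObsolete filter is illustrative):

// Fetch the matching rows first, then delete them.
var customersToDelete = context.Customers.Where(c => c.IsObsolete).ToList();
context.Customers.RemoveRange(customersToDelete);
int affected = context.SaveChanges(); // rows affected
if (affected == 0)
{
    // nothing matched, so nothing was deleted
}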
The BulkDelete has been optimized for performance.
So indeed, there is no check on whether the entity exists first; we just perform the DELETE operation.
Something you could do on your side to make this easier is to use the BulkRead method before calling BulkDelete:
var customers = context.Customers.BulkRead(deserializedCustomers);
Docs: https://entityframework-extensions.net/bulk-read
By using this method, you can easily compare against your current list to get, for example, the list of customers that don't exist in the database.
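For example, a sketch of that comparison (CustomerID here is an assumed key property; adjust it to your model):

// BulkRead returns the entities that DO exist in the database.
var existing = context.Customers.BulkRead(deserializedCustomers);
var existingIds = new HashSet<int>(existing.Select(c => c.CustomerID));

// Anything not returned by BulkRead is missing from the database.
var missing = deserializedCustomers
    .Where(c => !existingIds.Contains(c.CustomerID))
    .ToList();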
You could also get the rows-affected count and compare it with your list count: https://entityframework-extensions.net/rows-affected
I am wondering why BulkDelete does not throw an exception if there is no such entity to delete in the database?
Because there is nothing wrong with having a filter that ends up yielding 0 results. That is not inherently an erroneous state.
Sure, for a specific use case you may have been expecting to find at least something, but a generalized library cannot account for your specific expectation in your specific use case. There are plenty of cases where ending up not deleting something is not a problematic scenario, e.g. a daily cleanup job that removes old entries if there are any. If there are no old entries, that's not a reason to throw an exception. It just means that nothing needed to be deleted.
This is no different from why a foreach doesn't throw an exception when you pass it an empty collection. The foreach is still doing its job, i.e. performing the work for each item in the collection. There are 0 items, and therefore it performs the work 0 times.
I'm running into some speed issues in my project and it seems like the primary cause is calls to the database using Entity Framework. Every time I call the database, it is always done as
database.Include(...).Where(...)
and I'm wondering if that is different than
database.Where(...).Include(...)?
My thinking is that the first way includes everything for all the elements in the target table, then filters out the ones I want, while the second one filters out the ones I want, then only includes everything for those. I don't fully understand entity framework, so is my thinking correct?
Entity Framework delays its querying as long as it can, up until the point where your code starts working on the data. To demonstrate:
var query = db.People
.Include(p => p.Cars)
.Where(p => p.Employer.Name == "Globodyne")
.Select(p => p.Employer.Founder.Cars);
With all these chained calls, EF has not yet called the database. Instead, it has kept track of what you're trying to fetch, and it knows what query to run if you start working with the data. If you never do anything else with query after this point, then you will never hit the database.
However, if you do any of the following:
var result = query.ToList();
var firstCar = query.FirstOrDefault();
var founderHasCars = query.Any();
Now EF is forced to look at the database, because it cannot answer your question without actually fetching the data. Only at this point, not before, does EF hit the database.
For reference, this trigger to fetch the data is often referred to as "enumerating the collection", i.e. turning a query into an actual result set.
By deferring the execution of that query for as long as possible, EF is able to wait and see if you're going to filter/order/paginate/transform/... the result set, which could lead to EF needing to return less data than if it executed every command immediately.
This also means that when you call Include, you're not actually hitting the database yet; as long as you haven't enumerated the collection, you won't load data for items that your Where clause will later filter out.
Take these two examples:
var list1 = db.People
.Include(p => p.Cars)
.ToList() // <= enumeration
.Where(p => p.Name == "Bob");
var list2 = db.People
.Include(p => p.Cars)
.Where(p => p.Name == "Bob")
.ToList(); // <= enumeration
These lists will eventually yield the same result. However, the first list will fetch data before you filter it because you called ToList before Where. This means you're going to be loading all people and their cars in memory, only to then filter that list in memory.
The second list, however, will only enumerate the collection when it already knows about the Where clause, and therefore EF will only load people named Bob and their cars into memory. The filtering happens in the database before the results are sent back to your runtime.
You did not show enough code for me to verify whether you are prematurely enumerating the collection. I hope this answer helps you in determining whether this is the cause of your performance issues.
database.Include(...).Where(...) and I'm wondering if that is different than database.Where(...).Include(...)?
Assuming this code is verbatim (except the missing db set) and there is nothing happening in between the Include and the Where, the order does not change the execution and is therefore not the source of your performance issue.
I generally advise putting your Include statements before anything else (i.e. right after db.MyTable) as a matter of readability. Where the other operations go depends on the specific query you're trying to construct.
Most of the time, the order of the clauses will not make any difference.
An Include statement tells SQL to JOIN one table with another, while Where results in... yes, a SQL WHERE.
When you write database.Include(...).Where(...) you are building an IQueryable object that is translated to SQL once you access it, e.g. with .ToList() or .FirstOrDefault(), and those queries are already optimized.
So if you still have performance issues, you should use a profiler to look for bottlenecks and maybe consider using stored procedures (these can be integrated with EF).
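For instance, you can confirm that both orderings translate to the same SQL by dumping the generated query; in EF6, calling ToString() on a query returns its SQL (a small sketch reusing the People/Cars example from the other answer):

var q1 = db.People.Include(p => p.Cars).Where(p => p.Name == "Bob");
var q2 = db.People.Where(p => p.Name == "Bob").Include(p => p.Cars);

// Both should print the same SELECT with a JOIN and a WHERE.
Console.WriteLine(q1.ToString());
Console.WriteLine(q2.ToString());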
We are in the process of switching over our codebase to have the Check for arithmetic overflow/underflow option turned on by default and we've run into problems with our DevForce queries.
I'm able to reproduce the problem with a very basic query such as this against the NorthwindIB database:
var coolProducts = em.Products.Where(p => p.UnitsInStock == 42).Execute();
By doing some debugging, it looks like DevForce is trying to add that query to the cache which involves making a hash code for the query. The class that does that hash code generation (ExpressionHashCodeCalculator) is missing a switch case for the ConvertChecked ExpressionType and so it throws an ArgumentException saying "Unknown Expression type".
It seems the compiler sprinkles ConvertChecked all over the place in expression trees when you are running in a checked context.
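You can see this with a small expression-tree sketch; a checked conversion compiles to a ConvertChecked node instead of a Convert node:

using System;
using System.Linq.Expressions;

// The checked short-to-int conversion becomes a ConvertChecked node:
Expression<Func<short, bool>> expr = x => checked((int)x) == 42;
var body = (BinaryExpression)expr.Body;
Console.WriteLine(body.Left.NodeType); // prints "ConvertChecked"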
Thanks for reporting this. It will be fixed in the next release, due in March.
var ret = (from f in context.foo
           join b in context.bar on f.barid equals b.barid
           select f).ToList();
my returned list contains all foos that have a barId, and it also has all the navigation properties populated. What I mean by that is, context.foo.mark is populated even though I did not explicitly Include it, nor did I access it during the query. I have lazy loading turned on; why is this occurring?
To elaborate on my question: somehow my related entities are getting loaded by the above query. I am curious how that is happening, given that I have lazy loading enabled and I am not accessing any of the related objects.
The lazy loading inspection is kind of a catch-22 problem. With lazy loading turned on, even a call to the property from the debugger will load the results, as long as your context is still hanging around. Furthermore, if your context is still open from other queries, EF will maintain the state of those objects automatically and fix up navigation properties between entities it is already tracking (relationship fixup).
The only real way I can think of to determine if it is being lazily loaded or not is to inspect the SQL code sent to your database.
First, add this line to your DbContext constructor:
this.Database.Log = s => System.Diagnostics.Debug.WriteLine(s); //remove in production
Then, run your code as normal (but don't stop in the debugger to inspect your object). Look at your debug console and inspect the SQL calls made. My bet is that the SQL will not include the related properties.
If you run the code again, and stop the debugger to inspect the object properties, you should see another SQL call in the debug console fetching the related entities.
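Alternatively, a sketch of a check that doesn't trigger lazy loading at all (assuming mark is a collection navigation property on foo; use .Reference(...) for a single-entity navigation):

// IsLoaded reports whether EF has already loaded the navigation property,
// without touching the property itself (which would trigger lazy loading).
bool markLoaded = context.Entry(someFoo).Collection(f => f.mark).IsLoaded;
Console.WriteLine(markLoaded);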
We currently have a production application that runs as a windows service. Many times this application will end up in a loop that can take several hours to complete. We are using Entity Framework for .net 4.0 for our data access.
I'm looking for confirmation that if we load new data into the system after this loop is initialized, it will not result in items being added to the loop itself. When the loop is initialized, we are looking for data "as of" that moment. Although I'm relatively certain this will work exactly like using ADO and looping over the data (the loop only cycles through data that was present at initialization), I am looking for confirmation for co-workers.
Thanks in advance for your help.
//update: here's some sample code in C#. The question is the same: will the enumeration change if new items are added to the table that EF is querying?
IEnumerable<myobject> myobjects = (from o in db.theobjects where o.id == myID select o);
foreach (myobject obj in myobjects)
{
//perform action on obj here
}
It depends on your precise implementation.
Once a query has been executed against the database, the results of that query will not change (assuming you aren't using lazy loading). To ensure this you can dispose of the context after retrieving the query results; this effectively "cuts the cord" between the retrieved data and the database.
Lazy loading can result in a mix of "initial" and "new" data; however once the data has been retrieved it will become a fixed snapshot and not susceptible to updates.
You mention this is a long-running process, which implies there may be a very large amount of data involved. If you aren't able to fully retrieve all the data to be processed (due to memory limitations or other bottlenecks), then you likely can't ensure that you are working against the original data. The results are not fixed until a query is executed, and any updates made before execution will appear in the results.
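A sketch of that "retrieve everything, then cut the cord" approach, using the names from the sample code above (MyContext stands in for your actual context type):

List<myobject> snapshot;
using (var db = new MyContext())
{
    // ToList() executes the query now; disposing the context afterwards
    // guarantees lazy loading can't mix in rows added later.
    snapshot = (from o in db.theobjects
                where o.id == myID
                select o).ToList();
}

foreach (myobject obj in snapshot)
{
    // perform action on obj here
}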
I think your best bet is to change the logic of your application so that, when the loop logic is deciding whether to do another iteration or exit, you take the opportunity to load any newly added items into the list. See the pseudo code below:
var repo = new Repository();
while (repo.HasMoreItemsToProcess())
{
    var entity = repo.GetNextItem();
    // process entity here
}
Let me know if this makes sense.
The easiest way to assure that this happens - if the data itself isn't too big - is to convert the data you retrieve from the database to a List<>, e.g., something like this (pulled at random from my current project):
var sessionIds = room.Sessions.Select(s => s.SessionId).ToList();
And then iterate through the list, not through the IEnumerable<> that would otherwise be returned. Converting it to a list triggers the enumeration, and then throws all the results into memory.
If there's too much data to fit into memory, and you need to stick with an IEnumerable<>, then the answer to your question depends on various database and connection settings.
I'd take a snapshot of the IDs to be processed, quickly and as a transaction, then work through that list the way you're doing today.
In addition to accomplishing the goal of not changing the sample mid-stream, this also gives you the ability to extend your solution to track the status of each item as it's processed. For a long-running process, this can be very helpful for progress reporting, restart/retry capabilities, etc.
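A sketch of that idea (the ownerId filter and context names are illustrative; the point is that the ID snapshot is taken in one quick query, then each item is processed and its status can be recorded):

// Snapshot the IDs up front in a single short-lived context.
List<int> idSnapshot;
using (var db = new MyContext())
{
    idSnapshot = db.theobjects
        .Where(o => o.ownerId == myID) // illustrative filter
        .Select(o => o.id)
        .ToList();
}

foreach (var id in idSnapshot)
{
    using (var db = new MyContext())
    {
        var obj = db.theobjects.FirstOrDefault(o => o.id == id);
        if (obj == null) continue; // row removed since the snapshot
        // process obj, then record per-item status here to support
        // progress reporting and restart/retry.
        db.SaveChanges();
    }
}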