This is my first time working with Entity Framework (EF) and I'm trying to learn what exactly executes a query on my database and what doesn't.
This is the code I'm working with. Don't mind the functionality, it isn't important for this question.
using (var db = new Context())
{
    //Check if any reviews have been given.
    if (combinedReviews.Any())
    {
        var restaurantsReviewedIds = combinedReviews.Select(rev => rev.RestaurantId);
        //(1)
        ratedRestaurants = db.Restaurants.Where(rest => restaurantsReviewedIds.Contains(rest.Id))
                                         .DistinctBy(rest => rest.Id)
                                         .ToList();
    }
    //(2)
    var restsClose = db.Restaurants.Where(rest => db.Reviews.Any(rev => rev.RestaurantId == rest.Id))
                                   .OrderBy(rest => rest.Location.Distance(algorithmParams.Location))
                                   .Take(algorithmParams.AmountOfRecommendations);
    //(3)
    tempList = ratedRestaurants.Union(restsClose).ToList();
    var tempListIds = tempList.Select(rest => rest.Id); //Temporary list.
    //(4)
    restsWithAverage = db.Reviews.Where(rev => tempListIds.Contains(rev.RestaurantId))
                                 .GroupBy(rev => rev.RestaurantId)
                                 .ToList();
}
I have marked each piece of code with numbers, so I'll refer to them with that. Below is what I think is what happens.
(1) This executes a query since I'm calling .ToList() here.
(2) This returns an IQueryable, so this won't execute a query against the database.
(3) This executes the query from (2).
(4) This executes another query since I'm calling .ToList().
How close to the truth am I? Is all of this correct? If this doesn't make sense, could you give an example what executes a query and what doesn't?
I'm sorry for asking so many questions in one question, but I thought I wouldn't need to create so many questions since all of this is about a single topic.
If you don't want to execute a query you can use AsEnumerable.
ToList vs AsEnumerable
ToList converts an IEnumerable<T> to a List<T>. The advantage of AsEnumerable over ToList is that AsEnumerable does not execute the query: it preserves deferred execution and does not build an often useless intermediate list. Operators applied after AsEnumerable run in memory (LINQ to Objects) once the sequence is enumerated; they are not translated to SQL.
On the other hand, when forced execution of a LINQ query is desired, ToList can be a way to do that.
You could also force execution by putting a foreach loop immediately after the query expression, but by calling ToList or ToArray you cache all the data in a single collection object.
ToLookup and ToDictionary also execute the query.
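The difference between deferred and forced execution can be observed without a database. The following is a minimal sketch using LINQ to Objects only: Source() is a hypothetical stand-in for a table, and the Enumerations counter plays the role of "number of queries executed". It illustrates the principle and is not Entity Framework code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Incremented each time the source sequence starts being enumerated.
    public static int Enumerations;

    // Hypothetical stand-in for a database table (assumption for illustration).
    public static IEnumerable<int> Source()
    {
        Enumerations++;
        yield return 1;
        yield return 2;
        yield return 3;
    }

    // Returns the counter value observed after each step.
    public static int[] Run()
    {
        Enumerations = 0;
        var observed = new List<int>();

        var query = Source().Where(n => n > 1);    // builds the query only
        observed.Add(Enumerations);

        var stillDeferred = query.AsEnumerable();  // still nothing enumerated
        observed.Add(Enumerations);

        var list = query.ToList();                 // forces one enumeration
        observed.Add(Enumerations);

        foreach (var n in stillDeferred) { }       // enumerates the source again
        observed.Add(Enumerations);

        int total = list.Sum();                    // works on the cached list
        observed.Add(Enumerations);

        return observed.ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Run())); // prints 0,0,1,2,2
    }
}
```

Where, Select and AsEnumerable leave the counter untouched, while ToList and foreach each trigger a full enumeration. Working with the cached list afterwards does not touch the source again.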
Here you can find a list of operators and whether each one executes the query:
https://msdn.microsoft.com/en-us/library/mt693095.aspx.
LINQ query execution differs from query to query. I recommend reading the following page: https://msdn.microsoft.com/en-us/library/bb738633(v=vs.110).aspx
I have two objects that are connected in such a way that ObjectA contains an ICollection of ObjectB. I would like to be able to determine the index of an ObjectB inside the ICollection<ObjectB> that is stored in ObjectA.
One of the solutions was to convert the ICollection to a list and then use the built-in IndexOf. However, when multiple threads access the ICollection, in many cases I get the same IndexOf value for multiple ObjectBs.
Question: Is there any way to have a certain field (of type int) that stores the index of ObjectB inside the ICollection? If not, is there any way to ensure that IndexOf (when multiple threads attempt to access it) gives the right index (i.e. regardless of the thread)?
Possible solutions: I've tried using a new context for each lookup, as well as GetDatabaseValues() and Reload(). This has worked better (especially in Debug mode), but when debug mode is turned off, the same index value is given to several ObjectBs.
Edit
I tried to add an OrderBy statement, but it seems like none of the approaches work.
// var objectB = new ObjectB();
using(var context = new ContextDb())
{
    var objectA = context.ObjectAs.Single(x => x.Id == 1);
    objectA.objectBs.Add(objectB);
    context.SaveChanges();
    context.Entry(objectB).Reload();
    context.Entry(objectA).Reload();
    var list = objectA.objectBs.Select(x => x.Id).OrderBy(x => x).ToList(); // order by primary key.
    sb.AppendLine( string.Join(",", list.ToArray())); // for testing
    objectB.LocalId= list.IndexOf(objectB.Id) + 1; // the "local id"
    context.SaveChanges();
}
The result is quite strange, although I seem to be able to see a pattern. Note that the code above is in a for loop that runs a certain number of times. The first iteration (first line in the string builder) gives the following:
2932 2932,2933 2932,2933,2934 2932,2933,2934,2935,2936 2932,2933,2934,2935,2936,2937,2938 2932,2933,2934,2935,2936,2937,2938,2939,2940 2932,2933,2934,2935,2936,2937,2938,2939,2940,2941,2942
The second line:
2932,2933,2934,2935,2936,2937,2938,2939,2940,2941,2942,2943,2944,2945 2932,2933,2934,2935,2936,2937,2938,2939,2940,2941,2942,2943,2944,2945,2946,2947
The last line:
2932,2933,2934,2935,2936,2937,2938,2939,2940,2941,2942,2943,2944,2945,2946,2947,2948,2949,2950,2951,2952,2953,2954,2955,2956,2957,2958,2959,2960,2961,2962,2963,2964,2965,2966,2967,2968,2969,2970
Does anyone know if there is a built-in way in Entity Framework to avoid these duplicates?
Honestly, I'm still not entirely clear on what you're looking for. However, there are a couple of things I can say based on your expanded question. First, when you pull anything from a database, there's no inherent order. By default, rows will generally come back ordered by PK, if possible, or more accurately in "insert order". However, that's risky to rely on if an exact order is necessary. If you need a truly exact and reproducible order, then you need to issue an ORDER BY clause with the order you want.
Especially if you're relying on navigation properties filled by Entity Framework through either eager or lazy loading of a foreign key, you can't rely on the implicit order at all. Again, if order is important, then you need to use OrderBy or OrderByDescending with some property on the entity to make sure that you get a true apples-to-apples order comparison.
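A minimal in-memory sketch of that point: once the IDs are sorted explicitly, the position of a given ID no longer depends on the order in which the rows happened to arrive. LocalIdOf is a hypothetical helper, not an Entity Framework API, and the ID values are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LocalIndexDemo
{
    // Hypothetical helper: 1-based position of 'id' among 'ids' sorted
    // ascending. Returns 0 when 'id' is not present.
    public static int LocalIdOf(IEnumerable<int> ids, int id)
    {
        var ordered = ids.OrderBy(x => x).ToList(); // explicit, reproducible order
        return ordered.IndexOf(id) + 1;
    }

    public static void Main()
    {
        // The same rows, arriving in two different orders.
        var arrivalA = new[] { 2934, 2932, 2933 };
        var arrivalB = new[] { 2933, 2934, 2932 };

        Console.WriteLine(LocalIdOf(arrivalA, 2934)); // prints 3
        Console.WriteLine(LocalIdOf(arrivalB, 2934)); // prints 3
    }
}
```

This only makes the ordering reproducible. It does not by itself address concurrent inserts, where two threads may each read the collection before the other has saved.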
Use an IDictionary, in which you can store the index as well as the object.
Any decent compiler should eliminate dead code, at least to a certain extent. However, I am curious how a compiler (specifically MSBuild) handles a situation like the following:
// let's assume LazyLoadingEnabled = false;
var users = db.Users.ToList();
// more code that never touches 'users'
Since LazyLoadingEnabled = false, will the compiled code:
1. Eagerly load the results from the database call
2. Make the call to the database without storing the results
or
3. Never make the call to begin with?
I was cleaning up some old code at work and I found several cases of this occurring, so I'm curious as to whether we've been wasting resources or not.
It feels like the right answer is number 3, but I haven't found any solid evidence to back up my claims. Thank you for your help!
The answer is #1.
The compiler cannot remove the call, because it has no way to know that ToList() is free of side effects (and here it is not: it runs a query). Not only will this execute the database query to select all the records from the Users table, but it will fetch all those records and construct an entity for each of them. That is very expensive if you have many records. Of course, the GC will eventually collect the wasted resources.
If you want to prove the above for yourself, just add the following line after you create your DbContext to log the SQL being executed:
db.Database.Log = s => Console.WriteLine(s);
BTW, the LazyLoadingEnabled setting has no effect on the observed behavior. LazyLoadingEnabled determines whether navigation properties are loaded automatically the first time they are accessed. In this case, db.Users is not a navigation property, so the setting has no effect.
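That an unused ToList() result still costs a full enumeration can be reproduced without a database. A minimal sketch, where Users() is a hypothetical in-memory stand-in for db.Users and RowsRead counts the items actually produced:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnusedResultDemo
{
    // Number of items the source has actually produced.
    public static int RowsRead;

    // Hypothetical stand-in for db.Users (assumption for illustration).
    private static IEnumerable<string> Users()
    {
        foreach (var name in new[] { "ann", "bob", "cy" })
        {
            RowsRead++;
            yield return name;
        }
    }

    public static int Run()
    {
        RowsRead = 0;
        var users = Users().ToList(); // result is never used afterwards
        return RowsRead;              // the whole source was read anyway
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints 3
    }
}
```

The call is kept and every item is read even though nothing touches the users variable afterwards.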
I have a few mongo queries.
var threads = postCollection.AsQueryable<PostMongoEntity>()
    .Select(w => w.ThreadId);
var entities = threadCollection.AsQueryable<ThreadMongoEntity>()
    .Where(e => e.ThreadId.In(threads))
    .OrderBy(e => e.Time)
    .Skip(page * ThreadPageSize)
    .Take(ThreadPageSize);
The first query finds all thread IDs from a posts collection; the second gets all threads with those IDs. I wanted to know if this will do everything on the actual database. This isn't the complete query, but most of the important stuff is here. The part I'm worried about is Where(e => e.ThreadId.In(threads)). Will it send the thread list to the database or will it get all threads and do the filtering locally?
It will send the list of thread IDs to MongoDB. It will NOT pull all the records back and do the filtering locally. I assume this is what you want.
Well, from a type-compatibility standpoint this looks legal: threads is an IQueryable, which implements IEnumerable, and the In operation accepts exactly an IEnumerable (http://api.mongodb.org/csharp/1.9.2/).
Sorry, I have now read your question more carefully.
But!
You obviously need to pass values of type long (or whatever type is used for the ID in PostMongoEntity). So it is legal only if In accepts an IEnumerable of primitive values rather than of entities.
P.S. This method has some restriction on the number of PostMongoEntity keys; I cannot quickly find the exact reference.
I'm trying to load a lot of records from the DB and I would like to run them in parallel to speed things up.
Below is some example code which breaks when it tries to access the Applicants property which is null. However in a non-parallel loop, Applicants property is either populated or is an empty list, but is never null. Lazy loading is definitely enabled.
var testList = new List<string>();
Context.Jobs
    .AsParallel()
    .WithDegreeOfParallelism(5)
    .ForAll(
        x => testList.Add(x.Applicants.Count().ToString())
    );
Can I do something like this? Is it related to the entity framework connection? Can I make it parallel friendly and pass an instance of it into the task or something? I'm just shooting out ideas but really I haven't a clue.
Edit:
Is this post related to mine? My issue sounds kind of similar: "Entity Framework lazy loading doesn't work from other thread"
PLINQ does not offer a way to parallelize LINQ-to-SQL and LINQ-to-Entities queries. So when you call AsParallel, EF first has to materialize the query.
Furthermore, it doesn't make any sense to parallelize a query that executes on the database, because the database can do that itself.
But if you want to parallelize client-side work, the code below may help:
Context.Jobs
    .Select(x => x.Applicants.Count().ToString())
    .AsParallel()
    .WithDegreeOfParallelism(5)
    .ForAll(
        // List<T> is not thread-safe, so serialize the Add calls.
        x => { lock (testList) { testList.Add(x); } }
    );
Note that you can access navigation properties only before the query is materialized (in your case, before the AsParallel() call). So use Select to fetch everything you need.
Context.Jobs
    .Select(x => new { Job = x, Applicants = x.Applicants })
    .AsParallel()
    .WithDegreeOfParallelism(5)
    .ForAll(
        // List<T> is not thread-safe, so serialize the Add calls.
        x => { lock (testList) { testList.Add(x.Applicants.Count().ToString()); } }
    );
You can also use the Include method to include navigation properties in the results of the query:
Context.Jobs
    .Include("Applicants")
    .AsParallel()
    .WithDegreeOfParallelism(5)
    .ForAll(
        // List<T> is not thread-safe, so serialize the Add calls.
        x => { lock (testList) { testList.Add(x.Applicants.Count().ToString()); } }
    );
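Because List<T> is not safe for concurrent writes, one option is to avoid the shared list entirely and let PLINQ collect the results. This is a minimal sketch, assuming the data has already been materialized: the in-memory int[] arrays are a hypothetical stand-in for each job's Applicants collection, and no Entity Framework context is involved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ParallelProjectionDemo
{
    // Projects each (already loaded) applicant collection to its count in
    // parallel and lets PLINQ build the result list. No shared mutable
    // state is touched inside the parallel part.
    public static List<string> CountApplicants(IEnumerable<int[]> applicantsPerJob)
    {
        return applicantsPerJob
            .AsParallel()
            .AsOrdered()                     // keep the source order in the result
            .WithDegreeOfParallelism(5)
            .Select(applicants => applicants.Length.ToString())
            .ToList();
    }

    public static void Main()
    {
        // Hypothetical data: three jobs with 2, 0 and 5 applicants.
        var jobs = new List<int[]> { new int[2], new int[0], new int[5] };
        Console.WriteLine(string.Join(",", CountApplicants(jobs))); // prints 2,0,5
    }
}
```

AsOrdered is optional. Without it the counts are the same, but they may come back in a different order.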