Is this entity framework call actually making two trips to the database?
var result = from p in entities.people where p.id == 6 select p;
entities.DeleteObject(result);
It strikes me that maybe DeleteObject would force the first trip to get results and THEN, having the object to work with, would execute the delete function.
If that is the case, how do I avoid the second trip? How does one do a remove-by-query in entity framework with a single database trip?
Thanks!
EDIT
My original example was misleading, because it was a query by primary key. I guess the real question is whether there is a way to have a single-trip function that can delete items from an IQueryable. For example:
var result = from p in entities.people where p.cityid == 6 select p;
foreach (var r in result)
{
entities.DeleteObject(r);
}
(Notice that the query is of a foreign key, so there may be multiple results).
You can do it like this:
entities.ExecuteStoreCommand("DELETE FROM people WHERE people.cityid={0}", 6);
This is definitely a single trip to the database, and it is about as efficient as it can be.
EDIT:
Also, take a look here; they suggest the same solution. To answer the question: this is the only way to delete entities that are not referenced by primary key through Entity Framework without fetching them first (short of writing helper extension methods like those suggested in this answer).
Direct delete:
var people = new people() { id = 6 };
entities.people.Attach(people);
entities.people.Remove(people);
entities.SaveChanges();
If you want to see it for yourself, fire up a profiler.
EDIT:
This will allow you to use LINQ, but it won't be a single trip: the query fetches the rows first, and SaveChanges then issues the deletes.
var peopleToDelete = entities.people.Where(p => p.id == 6);
foreach (var person in peopleToDelete)
{
entities.people.DeleteObject(person);
}
entities.SaveChanges();
There's no easy way to do that out of the box in EF, a big annoyance indeed (as long as one does not want to resort to using direct SQL, which personally I don't). One of the other posters links to an answer that in turn links to this article, which describes a way to make your own function for this, using ToTraceString.
I have a parent entity with a navigation property to a child entity. The parent entity may not be removed as long as there are associated records in the child entity. The child entity can contain hundreds of thousands of records.
I'm wondering which of the following is the most efficient way to do this check in Entity Framework:
var parentRecord = _context.Where(x => x.Id == request.Id)
.Include(x => x.ChildTable)
.FirstOrDefault();
// check if parentRecord exists
if (parentRecord.ChildTable.Any()) {
// cannot remove
}
or
var parentRecord = _context.Where(x => x.Id == request.Id)
.Select(x => new {
ParentRecord = x,
HasChildRecords = x.ChildTable.Any()
})
.FirstOrDefault();
// check if parentRecord exists
if (parentRecord.HasChildRecords) {
// cannot remove
}
The first query may load thousands of child records, while the second will not. However, the second one is more complex.
Which is the best way to do this?
I would say it depends. It depends on which DBMS you're using, on how well its optimizer works, and so on.
So one single statement with a JOIN could be far faster than a lot of SELECT statements.
In general I would say when you need the rows from your Child table use .Include(). Otherwise don't include them.
Or in simple words, just read the data you need.
The answer depends on your database design. Which columns are indexed? How much data is in the table?
Include() offloads work to your C# layer, but it means a simpler query. It's probably the better choice here, but you should consider extracting the SQL that Entity Framework generates and running each query through an optimisation check.
You can output the SQL generated by Entity Framework to your Visual Studio console, as noted here.
This example might create a better SQL query that suits your needs.
In a one to many relationship situation which of the following has better performance.
1st approach
public Order GetOrder(long orderId) {
var orderDetails =
(from o in Orders
from d in OrderDetails
where d.OrderId == o.Id && o.Id == orderId
select new {
Order = o,
Detail = d
}).ToList();
var order = orderDetails.First().Order;
order.Details = orderDetails.Select(od => od.Detail).ToList();
return order;
}
2nd approach
public Order GetOrder(long orderId) {
var order = Orders.First(o => o.Id == orderId);
order.Details = OrderDetails.Where(od => od.OrderId == orderId).ToList();
return order;
}
The point I am trying to figure out (in terms of performance) is this: the first approach issues a single query but selects repeated data, whereas the second approach issues two separate queries but selects only the data that is needed.
You can assume Orders and OrderDetails are IQueryable<T> from Entity Framework (dbContext.Set<T>()) or NHibernate (session.Query<T>()). I tried both, and they create very similar SQL queries. Also, as far as I know, these ORMs' built-in one-to-many queries use something like the first approach.
UPDATE, to clarify what I am asking: which one (a single query with repeated data, or multiple queries that return only the required data) performs better, and under which circumstances? There may be many situations I haven't thought of, which is why I haven't tried benchmarking yet. As some answers already state, column count or additional joins are the kinds of answers I expected (row counts of the tables and/or the result set may matter too). Based on these kinds of answers I may try benchmarking. And of course I am asking why. I am not trying to solve the Order/OrderDetail problem, or to solve anything at all; I am trying to learn and understand when to use a single query with repeated data and when to use multiple queries with only the required data.
A single one-to-many query is pretty straightforward for ORMs. It's when you need to make several interrelated one-to-many queries that performance considerations start making themselves known.
Always measure performance for your particular case. If the order table has only a few small columns, getting all the data in one round trip may be better. If the order table has many columns, or blob columns, issuing two separate queries may outperform the single query.
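To make the trade-off concrete, here is a minimal LINQ to Objects sketch of the two result shapes. The `OrderRow` and `DetailRow` types and the sample data are hypothetical stand-ins for the real tables, and the code assumes C# 9 or later. It only counts rows; it says nothing about actual timings, which still have to be measured against a real database.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins for the Orders / OrderDetails tables.
public record OrderRow(long Id, string Customer);
public record DetailRow(long Id, long OrderId, string Product);

public static class JoinShapeDemo
{
    public static readonly List<OrderRow> Orders = new()
    {
        new OrderRow(1, "Alice"),
        new OrderRow(2, "Bob"),
    };

    public static readonly List<DetailRow> Details = new()
    {
        new DetailRow(10, 1, "Pen"),
        new DetailRow(11, 1, "Ink"),
        new DetailRow(12, 1, "Paper"),
        new DetailRow(13, 2, "Stapler"),
    };

    // Approach 1: one joined result set. Every row carries the order again,
    // so the order's columns travel once per detail row.
    public static int OrderCopiesInJoin(long orderId) =>
        (from o in Orders
         join d in Details on o.Id equals d.OrderId
         where o.Id == orderId
         select new { Order = o, Detail = d }).Count();

    // Approach 2: two result sets. The order travels once, followed by
    // only its own detail rows.
    public static (int OrderRows, int DetailRows) RowsInTwoQueries(long orderId) =>
        (Orders.Count(o => o.Id == orderId),
         Details.Count(d => d.OrderId == orderId));
}
```

For order 1 the join carries the order three times, while the two-query version carries it once at the cost of a second round trip. The wider the order row, the more that repetition costs.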
Using Entity Framework, you can call Include on the query so that the details are loaded together with the order:
var order = context.Orders.Include(x => x.Details).First(x => x.Id == orderId);
Loading Related Objects
So I've read a lot about using AsNoTracking() when performing a query in EF, specifically when it returns entities, so as not to keep references around to things you will not be updating.
But I've also read that AsNoTracking may also speed up the queries themselves as EF does not have to map each item queried to an entity in the map.
The question is, if my Linq query simply returns values from the columns/rows but not an entity type, does using AsNoTracking() make a difference to speed up the query? And if not obviously I shouldn't use it because it just clutters the code?
Example 1 (where I would expect to use AsNoTracking()):
var result = (from p in context.Pogs
select p).AsNoTracking();
example 2 (My question... I'm thinking it doesn't make sense to use here, but I don't know the answer):
var result = (from p in context.Pogs
select p.Name); // assuming p.Name is a string or something
versus
var result = (from p in context.Pogs.AsNoTracking()
select p.Name);
No, it does not since the entities won't be loaded, as evidenced by examining context.Pogs.Local which won't contain the entities whose properties were retrieved through LINQ.
You can check the entities being tracked through DbContext.ChangeTracker. So if you retrieve the entries of the tracker for your Pogs DbSet through context.ChangeTracker.Entries<Pogs>() you'll see that for your first example there are entries tracking the corresponding entities, while for the second example there are none.
In my schema I have two database tables. relationships and relationship_memberships. I am attempting to retrieve all the entries from the relationship table that have a specific member in it, thus having to join it with the relationship_memberships table. I have the following method in my business object:
public IList<DBMappings.relationships> GetRelationshipsByObjectId(int objId)
{
var results = from r in _context.Repository<DBMappings.relationships>()
join m in _context.Repository<DBMappings.relationship_memberships>()
on r.rel_id equals m.rel_id
where m.obj_id == objId
select r;
return results.ToList<DBMappings.relationships>();
}
_context is my generic repository, using code based on the code outlined here.
The problem is I have 3 records in the relationships table, and 3 records in the memberships table, each membership tied to a different relationship. 2 membership records have an obj_id value of 2 and the other is 3. I am trying to retrieve a list of all relationships related to object #2.
When this linq runs, _context.Repository<DBMappings.relationships>() returns the correct 3 records and _context.Repository<DBMappings.relationship_memberships>() returns 3 records. However, when the results.ToList() executes, the resulting list has 2 issues:
1) The resulting list contains 6 records, all of type DBMappings.relationships(). Upon further inspection there are 2 for each real relationship record, both are an exact copy of each other.
2) All relationships are returned, even if m.obj_id == 3, even though objId variable is correctly passed in as 2.
Can anyone see what's going on because I've spent 2 days looking at this code and I am unable to understand what is wrong. I have joins in other linq queries that seem to be working great, and my unit tests show that they are still working, so I must be doing something wrong with this. It seems like I need an extra pair of eyes on this one :)
Edit: Ok so it seems like the whole issue was the way I designed my unit test, since the unit test didn't actually assign ID values to the records since it wasn't hitting sql (for unit testing).
Marking the answer below as the answer though as I like the way he joins it all together better.
Just try like this
public IList<DBMappings.relationships> GetRelationshipsByObjectId(int objId)
{
var results = (from m in _context.Repository<DBMappings.relationship_memberships>()
where m.obj_id == objId
select m.relationships).ToList();
return results.ToList<DBMappings.relationships>();
}
How about setting _context.Log = Console.Out just to see the generated SQL query? Share the output with us (maybe use a StreamWriter instead of Console.Out so that you can copy it easily and without mistakes).
Pz, the TaskConnect developer
I might have this backwards, but I don't think you need a join here. If you've set up your foreign keys correctly, this should work, right?
public IList<DBMappings.relationships> GetRelationshipsByObjectId(int objId)
{
var mems = _context.Repository<DBMappings.relationship_memberships>();
var results = mems.Where(m => m.obj_id == objId).Select(m => m.relationships);
return results.ToList<DBMappings.relationships>();
}
Here's the alternative (if I've reversed the mapping in my brain):
public IList<DBMappings.relationships> GetRelationshipsByObjectId(int objId)
{
var mems = _context.Repository<DBMappings.relationship_memberships>();
var results = mems.Where(m => m.obj_id == objId).SelectMany(m => m.relationships);
return results.ToList<DBMappings.relationships>();
}
Let me know if I'm way off with this, and I can take another stab at it.
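The difference between the two versions comes down to the shape of the navigation property. The following is a minimal in-memory sketch of both cases. The class and property names are hypothetical and do not come from the generated DBMappings classes. Select fits a navigation property that points at a single relationship, while SelectMany flattens one that exposes a collection.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes; which one applies depends on how the mapping was generated.
public class Relationship
{
    public int RelId { get; set; }
}

// A membership that points at exactly one relationship (reference navigation).
public class MembershipWithReference
{
    public int ObjId { get; set; }
    public Relationship Relationship { get; set; }
}

// A membership that exposes a collection of relationships (collection navigation).
public class MembershipWithCollection
{
    public int ObjId { get; set; }
    public List<Relationship> Relationships { get; set; } = new List<Relationship>();
}

public static class NavigationDemo
{
    // One relationship per membership: Select is enough.
    public static List<Relationship> ViaSelect(
        IEnumerable<MembershipWithReference> mems, int objId) =>
        mems.Where(m => m.ObjId == objId)
            .Select(m => m.Relationship)
            .ToList();

    // Many relationships per membership: SelectMany flattens them into one list.
    public static List<Relationship> ViaSelectMany(
        IEnumerable<MembershipWithCollection> mems, int objId) =>
        mems.Where(m => m.ObjId == objId)
            .SelectMany(m => m.Relationships)
            .ToList();
}
```

Using Select on the collection shape would compile too, but it would return a list of lists rather than a flat list of relationships.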
I'm having trouble building an Entity Framework LINQ query whose select clause contains method calls to non-EF objects.
The code below is part of an app used to transform data from one DBMS into a different schema on another DBMS. In the code below, Role is my custom class unrelated to the DBMS, and the other classes are all generated by Entity Framework from my DB schema:
// set up ObjectContexts for the old and new DB schemas
var New = new NewModel.NewEntities();
var Old = new OldModel.OldEntities();
// cache all Role names and IDs in the new-schema roles table into a dictionary
var newRoles = New.roles.ToDictionary(row => row.rolename, row => row.roleid);
// create a list of Role objects where Name is the name in the old DB, while
// ID is the ID corresponding to that name in the new DB
var roles = from rl in Old.userrolelinks
join r in Old.roles on rl.RoleID equals r.RoleID
where rl.UserID == userId
select new Role { Name = r.RoleName, ID = newRoles[r.RoleName] };
var list = roles.ToList();
But calling ToList gives me this NotSupportedException:
LINQ to Entities does not recognize the method 'Int32 get_Item(System.String)' method, and this method cannot be translated into a store expression
Sounds like LINQ-to-Entities is barfing on my call to pull the value out of the dictionary given the name as a key. I admittedly don't understand enough about EF to know why this is a problem.
I'm using devart's dotConnect for PostgreSQL entity framework provider, although I assume at this point that this is not a DBMS-specific issue.
I know I can make it work by splitting up my query into two queries, like this:
var roles = from rl in Old.userrolelinks
join r in Old.roles on rl.RoleID equals r.RoleID
where rl.UserID == userId
select r;
var roles2 = from r in roles.AsEnumerable()
select new Role { Name = r.RoleName, ID = newRoles[r.RoleName] };
var list = roles2.ToList();
But I was wondering if there was a more elegant and/or more efficient way to solve this problem, ideally without splitting it in two queries.
Anyway, my question is two parts:
First, can I transform this LINQ query into something that Entity Framework will accept, ideally without splitting into two pieces?
Second, I'd also love to understand a little about EF so I can understand why EF can't layer my custom .NET code on top of the DB access. My DBMS has no idea how to call a method on a Dictionary class, but why can't EF simply make those Dictionary method calls after it's already pulled data from the DB? Sure, if I wanted to compose multiple EF queries together and put custom .NET code in the middle, I'd expect that to fail, but in this case the .NET code is only at the end, so why is this a problem for EF? I assume the answer is something like "that feature didn't make it into EF 1.0" but I am looking for a bit more explanation about why this is hard enough to justify leaving it out of EF 1.0.
The problem is that in using Linq's delayed execution, you really have to decide where you want the processing and what data you want to traverse the pipe to your client application. In the first instance, Linq resolves the expression and pulls all of the role data as a precursor to
New.roles.ToDictionary(row => row.rolename, row => row.roleid);
At that point, the data moves from the DB into the client and is transformed into your dictionary. So far, so good.
The problem is that your second Linq expression is asking Linq to do the transform on the second DB using the dictionary on the DB to do so. In other words, it is trying to figure out a way to pass the entire dictionary structure to the DB so that it can select the correct ID value as part of the delayed execution of the query. I suspect that it would resolve just fine if you altered the second half to
var roles = from rl in Old.userrolelinks
join r in Old.roles on rl.RoleID equals r.RoleID
where rl.UserID == userId
select r.RoleName;
var list = roles.ToDictionary(roleName => roleName, roleName => newRoles[roleName]);
That way, it resolves your select on the DB (selecting just the role name) as a precursor to processing the ToDictionary call (which it should do on the client, as you'd expect). This is essentially what you are doing in your second example, because everything after AsEnumerable runs on the client once ToList enumerates the query. You could just as easily change it to something like
var roles = from rl in Old.userrolelinks
join r in Old.roles on rl.RoleID equals r.RoleID
where rl.UserID == userId
select r;
var list = roles.AsEnumerable().Select(r => new Role { Name = r.RoleName, ID = newRoles[r.RoleName] });
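and it'd work out the same. The call to AsEnumerable() marks the boundary: everything before it is translated to SQL, and the Select that follows it runs on the client over the returned rows. Nothing is actually fetched until the result is enumerated, for example by ToList.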
Note that I haven't tested this, but as far as I understand Entity Framework, that's my best explanation for what's going on under the hood.
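As a rough illustration of where the work happens, here is a LINQ to Objects sketch in which a lazy iterator stands in for the database query. `RoleNames`, the `Produced` counter and the sample names are all hypothetical. It shows that building the pipeline with AsEnumerable and Select reads nothing, and that the rows are only pulled when ToList enumerates the result.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how many items the source has actually produced so far.
    public static int Produced;

    // Hypothetical stand-in for a database query: yields role names lazily.
    public static IEnumerable<string> RoleNames()
    {
        foreach (var name in new[] { "admin", "editor" })
        {
            Produced++;
            yield return name;
        }
    }

    public static List<int> Run(Dictionary<string, int> newRoles, out int producedBeforeToList)
    {
        Produced = 0;

        // Building the pipeline does not read anything from the source yet.
        var query = RoleNames()
            .AsEnumerable()
            .Select(name => newRoles[name]); // plain .NET code, runs on the client

        producedBeforeToList = Produced;

        // Enumeration happens here: the source is read and the lookups run.
        return query.ToList();
    }
}
```

With a real Entity Framework query in place of `RoleNames()`, the part before AsEnumerable would be sent to the database as SQL at the point of enumeration, and the dictionary lookup would then run in memory, one row at a time.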
Jacob is totally right.
You cannot transform the desired query without splitting it into two parts, because Entity Framework is unable to translate the get_Item call into SQL.
The only way is to write the LINQ to Entities query and then run a LINQ to Objects query over its result, just as Jacob advised.
The problem is specific to Entity Framework; it does not arise from our implementation of Entity Framework support.
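For anyone wondering where get_Item comes from: a lambda passed to an IQueryable operator is compiled into an expression tree, which is a description of the code rather than executable code, and the provider has to translate every node of that tree into SQL. The sketch below uses only System.Linq.Expressions, with no Entity Framework involved, and a hypothetical dictionary. It shows that the dictionary indexer appears in the tree as a call to a method named get_Item, which is the node the provider has no SQL equivalent for.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public static class ExpressionDemo
{
    public static string DescribeIndexerCall()
    {
        // Hypothetical lookup table, standing in for newRoles in the question.
        var newRoles = new Dictionary<string, int> { ["admin"] = 1 };

        // Assigning the lambda to Expression<...> makes the compiler emit a data
        // structure describing the code instead of a compiled delegate.
        Expression<Func<string, int>> lookup = name => newRoles[name];

        // The indexer access is represented as a method call node.
        var call = (MethodCallExpression)lookup.Body;
        return call.Method.Name;
    }
}
```

Calling `ExpressionDemo.DescribeIndexerCall()` returns the string "get_Item", matching the method named in the exception message.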