EF: db.SaveChanges() vs dbTransaction.Commit - c#

I am fairly new to Entity Framework and I have a question about EF's db.SaveChanges(). From some posts and MSDN I learned that db.SaveChanges() by default applies all of its changes inside a transaction. There is also a way to create a transaction of our own using db.Database.BeginTransaction(), "db" being my context class object. So I have two questions:
What to use and when?
If I am inserting data into one table whose @@IDENTITY is a foreign key in the next table I insert into, is there any way other than db.SaveChanges() to get that @@IDENTITY (db.SaveChanges() is inside a user-defined transaction scope)? And will db.SaveChanges() commit my changes to the DB?

Yes, if you explicitly wrap your context within a transaction such as .NET's TransactionScope, you can retrieve auto-generated IDs from entities after a .SaveChanges() call without committing the scoped transaction.
using (var tx = new TransactionScope())
{
    using (var context = new MyDbContext())
    {
        var newEntity = populateNewEntity();
        context.MyEntities.Add(newEntity);
        context.SaveChanges();
        int entityId = newEntity.EntityId; // Fetches the identity value.
    }
} // Rolls back the transaction. Entity not committed.
However, operations like this should be avoided unless absolutely necessary, and used cautiously. Firstly, the above is a common use of TransactionScope, and the default isolation level of TransactionScope is Serializable, which is the most pessimistic in terms of locking. Even moderate use of this pattern on systems with a number of concurrent operations/users will result in deadlocks and performance hits due to lock waits. So if you use a TransactionScope, be sure to specify an isolation level.
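For example, a minimal sketch of constructing a TransactionScope with an explicit isolation level (Read Committed here, but pick whatever suits your workload):

using System.Transactions;

var txOptions = new TransactionOptions
{
    IsolationLevel = IsolationLevel.ReadCommitted, // Avoid the Serializable default.
    Timeout = TransactionManager.DefaultTimeout
};
using (var tx = new TransactionScope(TransactionScopeOption.Required, txOptions))
{
    // ... context work ...
    tx.Complete();
}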
DTC is useful in scenarios where you want to coordinate commits between databases or other Tx-bound operations. For instance, system A is saving changes and needs to coordinate an update/insert with system B through an API. A and B need to be configured to use DTC, but once that is done, A can start a transaction, register it with DTC, and append the DTC token to the header of B's API call; B can find that token, create a TransactionScope linked to it, and commit/rollback based on what A signals. This has an overhead cost, meaning transactions on both systems are open longer than usual. If it's necessary, that is a cost of doing business. If it's not necessary, it is a waste and a potential source of headaches.
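Roughly, the token hand-off might look like the sketch below. This assumes both systems are configured for MSDTC; the header name and the request object are hypothetical, but TransactionInterop is the real System.Transactions API for propagation tokens.

// System A: capture a propagation token for the ambient transaction.
byte[] token = TransactionInterop.GetTransmitterPropagationToken(Transaction.Current);
request.Headers.Add("X-Tx-Token", Convert.ToBase64String(token)); // Hypothetical header name.

// System B: rebuild the transaction from the token and enlist in it.
Transaction tx = TransactionInterop.GetTransactionFromTransmitterPropagationToken(
    Convert.FromBase64String(tokenHeader));
using (var scope = new TransactionScope(tx))
{
    // ... B's work, committed or rolled back with A's transaction ...
    scope.Complete();
}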
One other reason someone might look at using an explicit Tx is when they want to update FKs in a related entity. Creating an order has an option to create a new customer; the order has a customer ID, so we need to create the customer, get its ID to set on the Order, then save the order. If the order save fails, the customer creation should roll back.
using (var tx = new TransactionScope())
{
    using (var context = new MyDbContext())
    {
        var newCustomer = createNewCustomer(); // dummy method to indicate creating a customer entity.
        context.Customers.Add(newCustomer);
        context.SaveChanges();

        var newOrder = createNewOrder();
        newOrder.CustomerId = newCustomer.CustomerId;
        context.Orders.Add(newOrder);
        context.SaveChanges();
    }
    tx.Complete(); // TransactionScope commits via Complete(), not Commit().
}
With EF this scenario should be mitigated by using navigation properties with a relationship between order and customer. In this way you can create a customer, create the order, set the order's Customer reference to the new customer, add the order to the DbContext, and .SaveChanges(). This lets EF take care of going through the order, seeing the referenced customer, inserting that, associating the FK in the order, and committing the changes in one implicit Tx.
using (var context = new MyDbContext())
{
    var newCustomer = createNewCustomer();
    var newOrder = createNewOrder();
    newOrder.Customer = newCustomer;
    context.Orders.Add(newOrder);
    context.SaveChanges();
}
Update: To outline avoiding FK references in your entities... (many-to-one)
EntityTypeConfiguration for Order With FK in entity:
HasRequired(x => x.Customer)
    .WithMany(x => x.Orders) // Links to an element in the Orders collection of the Customer. If Customer does not have/need an Orders collection, use .WithMany().
    .HasForeignKey(x => x.CustomerId); // Maps Order.Customer to use the CustomerId property on the Order entity.
EntityTypeConfiguration for Order With No FK in entity:
HasRequired(x => x.Customer)
    .WithMany(x => x.Orders)
    .Map(x => x.MapKey("CustomerId")); // Maps Order.Customer to use the CustomerId column on the underlying Order table. The Order entity does not expose a CustomerId.
With EF Core, which replaces HasRequired with HasOne:
HasOne(x => x.Customer)
    .WithMany(x => x.Orders) // If Customer does not have/need an Orders collection, use .WithMany().
    .HasForeignKey("CustomerId") // Creates a shadow property where the entity does not have a CustomerId property.
    .IsRequired();
Both approaches (with or without a mapped FK) work the same. The benefit of the second approach is that there is no confusion in the code about how to update or read the customer reference for the order. For example, if you have both a Customer and a CustomerId on the Order, changing the CustomerId and calling SaveChanges does not move the order to a new customer; only setting the Customer reference does. And setting the Customer reference does not automatically update the CustomerId, so any code "getting" the customer ID via the CustomerId property on the order would still retrieve the old value until the entity is refreshed.
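To illustrate the ambiguity described above (a sketch, assuming both members are mapped on Order):

var order = context.Orders.Include(x => x.Customer).First();

// Two ways to express "change the customer", and they don't stay in sync
// until EF refreshes the entity:
order.CustomerId = newCustomer.CustomerId; // updates the FK value only
order.Customer = newCustomer;              // updates the tracked reference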
The important thing with navigation properties is to leverage them with deferred execution, or to eager-load them efficiently. For example, if you want to load a list of orders and include their customer name:
using (var myContext = new MyDbContext())
{
    var orders = myContext.Orders
        .Where(x => x.OrderDate >= startDate && x.OrderDate < endDate)
        .ToList();
    return orders;
}
** Bad: If this is MVC/Web API, the serializer will take the orders collection and, in serializing it, hit every navigation property and attempt to load it. This triggers lazy-load calls one by one. So if Order has a Customer, that is a hit to the DB with "SELECT * FROM Customers WHERE CustomerId = 42". If Order has order lines, then "SELECT * FROM OrderLines WHERE OrderLineId = 121", "SELECT * FROM OrderLines WHERE OrderLineId = 122", and so on. (You might think it would know to fetch order lines by OrderId, but nope!) Returning entities has a huge performance impact; just don't do it.
using (var myContext = new MyDbContext())
{
    var orders = myContext.Orders
        .Include(x => x.Customer)
        .Include(x => x.OrderLines)
        .Where(x => x.OrderDate >= startDate && x.OrderDate < endDate)
        .ToList();
    return orders;
}
** Better, but still bad. You might only include the items you think you'll need, but the serializer will still fetch everything on the order. This comes back to bite you as entities are revised to include new links to data. Even if you Include everything, this is wasteful if all you wanted was the customer name.
using (var myContext = new MyDbContext())
{
    var orders = myContext.Orders
        .Where(x => x.OrderDate >= startDate && x.OrderDate < endDate)
        .Select(x => new OrderLineViewModel
        {
            OrderId = x.OrderId,
            OrderNumber = x.OrderNumber,
            OrderAmount = x.OrderAmount,
            CustomerName = x.Customer.Name
        }).ToList();
    return orders;
}
** This is the sweet spot with navigation properties and deferred execution. The SQL that runs against the DB returns just those four columns from the related data. No lazy-load hits, and you send just the amount of data you need across the wire.
Some might argue that if you commonly need a CustomerId reference from an Order, having a CustomerId on the Order entity saves referencing the Customer. But as outlined above, that ID may not be reliable. By using deferred execution and letting EF work through the entities to populate the data you want, getting the customer IDs of orders is just a matter of including/selecting x.Customer.CustomerId, which fetches just that desired column rather than loading the entire entity to get it.
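A one-line sketch of that projection:

var customerIds = myContext.Orders
    .Where(x => x.OrderDate >= startDate && x.OrderDate < endDate)
    .Select(x => x.Customer.CustomerId) // Reads just the FK column; no Customer entity is loaded.
    .ToList();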

Related

Updating nested list without AsNoTracking

Simplified model:
Profile {Guid Id, string Name, List<Property> Properties}
Property {Guid Id, string Name, List<Type> Types}
Type {Guid Id, string Key, string Value}
DbContext:
{
    public DbSet<Profile> Profiles { get; set; }
}
I didn't include Properties and Types in the DbContext so I used the ModelBuilder:
modelBuilder.Entity<Property>().HasMany<Type>();
In the Update service:
public async Task<Response> Update([FromBody] Profile profile)
{
    var entity = await _context.Profiles
        .Include(x => x.Properties)
        .ThenInclude(x => x.Types)
        .FirstOrDefaultAsync(x => x.Id == profile.Id);

    foreach (var prop in profile.Properties)
    {
        var existingProp = entity.Properties.SingleOrDefault(a => a.Id == prop.Id);
        // Update
        if (existingProp != null)
        {
            var entry = _context.Entry(existingProp);
            entry.State = EntityState.Modified;
            existingProp.ChargeFrom(prop); // maps the new values to the db entity
            _context.SaveChanges();
        }
    }
}
But the above code throws this exception at SaveChanges:
The instance of entity type 'Type' cannot be tracked because another
instance with the same key value for {'Id'} is already being tracked.
When attaching existing entities, ensure that only one entity instance
with a given key value is attached. Consider using
'DbContextOptionsBuilder.EnableSensitiveDataLogging' to see the
conflicting key values.
I marked the Types entity AsNoTracking:
.ThenInclude(x => x.Types).AsNoTracking()
and the problem is solved, but I don't know why this exception is thrown. Some other threads mention that the DbContext might be used by another process, or might be registered as a singleton, but in my case it's registered as scoped.
The reason for the error will be this line:
existingProp.ChargeFrom(prop); // maps the new values to the db entity
... which will be attempting to copy the untracked Types from prop into existingProp. Using AsNoTracking will remove the exception, but it will most likely result in duplicated data on SaveChanges (where Type is set up with an identity key) or in duplicate-row exceptions. If you received no exception, I would check the Types collection to see if duplicate rows are appearing there.
When copying data across from an untracked entity to a tracked entity, you will want to ensure that only values, and not references, are copied across. If you copy an untracked reference across, EF will treat it as a new entity by default. Even if you force its state over to Modified, the DbContext could already be tracking an entity with that ID.
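As a sketch, a safe version of a method like ChargeFrom (the name comes from the question; its real contents aren't shown) would copy scalar values only and leave navigation properties alone:

public void ChargeFrom(Property source)
{
    Name = source.Name;
    // ... copy other scalar properties only.
    // Deliberately do NOT assign Types = source.Types: those are untracked
    // references, and attaching them is what triggers the tracking conflict.
}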
If Property.Types is a collection of references, such as associations to a lookup, and these can change as associations are added and removed, then to apply changes you need to load the associated types from the database, then use those to remove associations that are no longer valid and add ones that aren't currently associated.
For example: given a Property (PropertyA) with Types (Type1) and (Type2), if we edit it to have (Type1) and (Type3), we need to fetch Type1 and Type3 from the DbContext (tracked), then compare against the tracked PropertyA to determine that we should remove Type2 and add Type3.
var entity = await _context.Profiles
    .Include(x => x.Properties)
    .ThenInclude(x => x.Types)
    .SingleAsync(x => x.Id == profile.Id);

// Get the IDs for all Types we want to associate... In the above example this would
// ask for Type1 and Type3 if there were only the one property. We get a Distinct list
// because multiple properties might reference the same TypeId(s).
var neededTypeIds = profile.Properties
    .SelectMany(x => x.Types.Select(t => t.Id))
    .Distinct()
    .ToList();

// Load tracked references to all Types that will be needed. Where new types are
// associated, these will be the references used.
var existingTypes = _context.Types
    .Where(x => neededTypeIds.Contains(x.Id))
    .ToList();

foreach (var prop in profile.Properties)
{
    var existingProp = entity.Properties.SingleOrDefault(x => x.Id == prop.Id);
    if (existingProp == null)
        continue;

    var updatedTypeIds = prop.Types.Select(x => x.Id).ToList();
    var existingTypeIds = existingProp.Types.Select(x => x.Id).ToList();
    var addedTypeIds = updatedTypeIds.Except(existingTypeIds).ToList();
    var removedTypeIds = existingTypeIds.Except(updatedTypeIds).ToList();

    var addedTypes = existingTypes
        .Where(x => addedTypeIds.Contains(x.Id))
        .ToList();
    var removedTypes = existingProp.Types
        .Where(x => removedTypeIds.Contains(x.Id))
        .ToList();

    foreach (var removedType in removedTypes)
        existingProp.Types.Remove(removedType);
    foreach (var addedType in addedTypes)
        existingProp.Types.Add(addedType);
}
// Save once, after the whole graph has been reconciled (see the note below).
_context.SaveChanges();
If instead Type is a child row containing properties that can be updated, then those values should be copied across from the updated data to the existing data state. This adds a considerable amount of work, though tools like AutoMapper can be configured to help. You still need to manage cases where Types can be added, removed, or have their contents changed. That applies to Properties as well, as your example only handles the case where a property is updated, not added or removed.
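For the child-row case, EF's change tracker can do the value copy for you. A minimal sketch, assuming updatedType is the untracked incoming data and existingType is the tracked entity:

// Copies scalar property values from the untracked object onto the tracked
// entity without touching references; EF marks only the changed columns as modified.
_context.Entry(existingType).CurrentValues.SetValues(updatedType);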
Ultimately it can be beneficial to structure update scenarios to be as atomic as possible: rather than one update that makes changes to an entire object graph of entities, properties, and types, have one update for just entity values, one for property values, and one for a single type update. The same applies to adding a property, adding a type, removing a property, and removing a type (see the sketch below). While it may look like more code to break operations up like this, it keeps each one simple and straightforward, rather than one big, complex method trying to compare before and after to figure out what to add, remove, and update. Bugs hide in complex code, not simple methods. :)
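For instance, a narrowly-scoped operation like the following stays trivial to reason about (a sketch; the method name and parameters are illustrative):

// Associates a single existing Type with a single Property; nothing else.
public async Task AddTypeToProperty(Guid propertyId, Guid typeId)
{
    var property = await _context.Set<Property>()
        .Include(p => p.Types)
        .SingleAsync(p => p.Id == propertyId);
    var type = await _context.Set<Type>().SingleAsync(t => t.Id == typeId);

    if (!property.Types.Any(t => t.Id == typeId))
        property.Types.Add(type);
    await _context.SaveChangesAsync();
}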
When editing an object graph you should also avoid calling SaveChanges more than once. Instead of calling it within the loop over properties, call it once after the loop completes. The reason is that something like an exception on one of the properties would otherwise persist an incomplete/invalid data state. If you have four properties in the object being saved and the third fails with an exception for any reason, the first two will be updated while the last two never persist. Generally, an update operation should follow an "all or nothing" approach to persistence.
Hopefully that helps explain the behavior you are seeing and gives you something to consider moving forward.

Why is this LINQ query not working correctly?

I'm having trouble making this query work correctly:
var result = await db.Set<Employee>()
    .Include(ca => ca.Person)
    .Include(ca => ca.Person).ThenInclude(x => x.GenderType)
    .Include(ca => ca.Position).ThenInclude(x => x.Department)
    .AsNoTracking()
    .ToListAsync();
When this gets executed, the Person entity is null, but only for new records. By this I mean there are already employees in the DB that were inserted directly with SQL, and for those records it works fine.
I thought the problem might be with how the records are saved, since two different entities are saved at the same time, with PersonId being the key correlating the employee with the person. Here is how it's done:
var person = await db.Set<Person>().AddAsync(obj.Person);
await db.SaveChangesAsync();
obj.PersonId = person.PersonId;
db.Entry(obj.Person).State = EntityState.Detached;
await db.Set<Employee>().AddAsync(obj);
await db.SaveChangesAsync();
I use EntityState.Detached since Employee.Person is already saved. This works fine for saving, but when I try to get all the entities from Employee, the query returns a null Person even when Employee.PersonId is correct.
If I make a more "direct" query it works:
var query = from e in db.Set<Employee>()
            join person in db.Set<Person>().AsNoTracking()
                on new { e.PersonId, e.SubscriptionId }
                equals new { person.PersonId, person.SubscriptionId }
            select person;
So I'm sure the record is there; that's why I can't seem to find the problem.
PS: Sorry for the ambiguous question
UPDATE:
I realized why this was happening: the one-to-one FK correlation had an error. Since I didn't write that part of the code, I didn't notice it earlier.
This was the problem:
modelBuilder.Entity<Employee>(entity =>
{
    entity.HasKey(e => new { e.EmployeeId, e.SubscriptionId });
    entity.HasOne(d => d.Person)
        .WithOne(p => p.Employee)
        .HasForeignKey<Person>(d => new { d.PersonId, d.SubscriptionId })
        .OnDelete(DeleteBehavior.ClientSetNull)
        .HasConstraintName("FK_Employee_PersonId_SubscriptionId");
});
When it should have been like this:
modelBuilder.Entity<Employee>(entity =>
{
    entity.HasKey(e => new { e.EmployeeId, e.SubscriptionId });
    entity.HasOne(d => d.Person)
        .WithOne(p => p.Employee)
        .HasForeignKey<Employee>(d => new { d.PersonId, d.SubscriptionId })
        .OnDelete(DeleteBehavior.ClientSetNull)
        .HasConstraintName("FK_Employee_PersonId_SubscriptionId");
});
As you can see, .HasForeignKey<Employee>(d => new { d.PersonId, d.SubscriptionId }) was HasForeignKey<Person>... I hope this can help someone facing the same problem.
I believe you may be over-complicating things with your detach-then-add, trying to manually ensure referenced entities are saved first. Under most normal scenarios, when allowed to track entities normally, EF can manage this perfectly fine on its own. I also highly recommend that you define your entities to use navigation properties or FK fields, not both: i.e. navigation properties plus shadow properties for the FK, or simply FK fields if you don't need any of the related entity's properties (or they are something like cached lookups). If you do use both, rely on the tracked navigation properties and do not set relationships by FK.
By using db.Entry(obj.Person).State = EntityState.Detached; you've basically told the DbContext to forget about that entity. I agree that if you later tell the DbContext to load an Employee with .Include(x => x.Person), it would be quite strange for that Person entity to be null. But perhaps you can avoid this "bug"/behavior entirely.
This code here is a smell:
var person = await db.Set<Person>().AddAsync(obj.Person);
await db.SaveChangesAsync();
obj.PersonId = person.PersonId;
EF manages FK assignments 100% automatically. When I see code like this, it hints at a SQL/ADO developer not trusting EF to manage the associations.
Taking the following simplified example code:
var person = new Person { Name = "Steve" };
var employee = new Employee { Title = "Developer", Person = person };
In SQL land, Employee has a Person ID, so we'd typically need to ensure the Person record is saved first, get its ID, and assign that to our Employee.PersonId column. Hence code like this:
context.Persons.Add(person);
context.SaveChanges(); // Generate PersonId.
employee.PersonId = person.PersonId;
context.Employees.Add(employee);
context.SaveChanges();
However, if the relationship is mapped in EF, this is completely unnecessary and can potentially lead to "already referenced" errors, which might be behind the reasons for messing with detaching entities. In reality, all you would need instead of all of the above is:
context.Employees.Add(employee);
context.SaveChanges();
When the employee is added, EF goes through all the related entities. It finds a Person that it doesn't know about so it will treat that as an added entity too. Because of the relationship mapping it will know that the Person needs to be inserted first, and the Employee PersonId will be updated as a result before the Employee is inserted.
Where people typically get tripped up with relationships is untracked instances. Let's say the "Person" record already exists, and we're creating an employee we want to associate with Person ID #14. The most common example I see is when the Person was loaded from the DbContext, sent to the client, then passed back to the server, and developers assume it's still an "entity" rather than a deserialized POCO that the DbContext has no knowledge of. For instance:
public void CreateEmployeeForPerson(Person person)
{
    var employee = new Employee { Title = "Developer", Person = person };
    Context.Employees.Add(employee);
    Context.SaveChanges();
}
This ends up raising a confusing error that a row already exists. It is due to the person reference being treated as a new entity because it isn't tracked, so EF wants to generate an INSERT statement for the Person as well as the Employee. By tinkering with attached state, or with AsNoTracking() on entity references you intend to reuse for updates and such, you can run into issues like this as well. It can be solved by using Attach to associate the instance with the context, though that can be risky: if anything sets a modified state on it and the data has been tampered with by the client, unintentional changes could be persisted. Instead, we should look to always deal with tracked instances:
public void CreateEmployeeForPerson(int personId)
{
    var person = Context.Persons.Single(x => x.PersonId == personId);
    var employee = new Employee { Title = "Developer", Person = person };
    Context.Employees.Add(employee);
    Context.SaveChanges();
}
The person reference is known/tracked by the DbContext, and we've asserted that the PersonId actually exists in the DB via the Single call. Now when the Employee is added, its .Person reference points at an instance known to the DbContext, so EF generates the appropriate INSERT statement for the Employee only.
While this might not pinpoint why .Include was not including your newly created Person instances, hopefully this can help simplify your persistence code overall and avoid weird behaviour around detached entities.

Select() decline in performance

I'm working on a small app written in C# .NET Core, and I'm populating one property in code because that information is not available in the database. The code looks like this:
public async Task<IEnumerable<ProductDTO>> GetData(Request request)
{
    IQueryable<Product> query = _context.Products;
    var products = await query.ToListAsync();
    // WARNING - THIS SOLUTION LOOKS EXPENSIVE TO ME!
    return MapDataAsDTO(products).Select(c =>
    {
        c.HasBrandStock = products.Any(cc => cc.ParentProductId == c.Id);
        return c;
    });
}

private IEnumerable<ProductDTO> MapDataAsDTO(IEnumerable<Product> products)
{
    return products.Select(p => MapData(p)).ToList();
}
What is bothering me here is this code:
return MapDataAsDTO(products).Select(c =>
{
    c.HasBrandStock = products.Any(cc => cc.ParentProductId == c.Id);
    return c;
});
I've tested it on about 300k rows and it seems slow. I'm wondering, is there a better solution in this situation?
Thanks guys!
Cheers
First up, this method is loading all products, and generally that is a bad idea unless you can guarantee that the total number and size of the records will remain reasonable. If the system can grow, add support for server-side pagination now (page number and page size, leveraging Skip and Take). 300k products is not a reasonable number to load in one hit. Any way you skin this cat, without paging it will be slow, expensive, and error-prone due to server load. One user making a request requires the DB server to allocate for and load up 300k rows, transmit that data over the wire to the app server, which allocates memory for those 300k rows, then transmits them over the wire to a client that literally does not need 300k rows at once. What do you think happens when 10 users hit this page? 100? And what happens when it's "too slow" and they start hammering the F5 key a few times? >:)
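A minimal paging sketch (the parameter names are illustrative); note that Skip/Take needs a deterministic OrderBy:

public async Task<IEnumerable<ProductDTO>> GetData(int pageNumber, int pageSize)
{
    return await _context.Products
        .OrderBy(x => x.ProductId)          // Paging requires a stable order.
        .Skip((pageNumber - 1) * pageSize)  // e.g. page 3 with size 50 skips 100 rows.
        .Take(pageSize)
        .Select(x => new ProductDTO { /* ... */ })
        .ToListAsync();
}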
Second, async is not a silver bullet. It doesn't make queries faster; it actually makes them a bit slower. What it does do is allow your web server to remain responsive to other requests while slower queries are running. Default to synchronous queries, get them running as efficiently as possible, then switch the larger, justified ones to asynchronous. MS made async extremely easy to implement, perhaps too easy to treat as a default. Keep it simple and synchronous to start, then refactor methods to async as needed.
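Concretely, that guidance might look like this (a sketch, not a rule):

// Fast, simple lookup: keep it synchronous.
var product = _context.Products.Single(x => x.ProductId == id);

// Known-expensive query: worth freeing the request thread while it runs.
var slowReport = await _context.Products
    .Where(x => x.ParentProductId == null)
    .ToListAsync();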
From what I can see, you want to load all products into DTOs, and for products recognized as the "parent" of at least one other product, set the DTO's HasBrandStock to true. So given product IDs 1 and 2, where 2's parent ID is 1, the DTO for product 1 would have HasBrandStock = true while product 2 would have HasBrandStock = false.
One option would be to tackle this operation in 2 queries:
var parentProductIds = _context.Products
    .Where(x => x.ParentProductId != null)
    .Select(x => x.ParentProductId)
    .Distinct()
    .ToList();

var dtos = _context.Products
    .Select(x => new ProductDTO
    {
        ProductId = x.ProductId,
        ProductName = x.ProductName,
        // ...
        HasBrandStock = parentProductIds.Contains(x.ProductId)
    }).ToList();
I'm using a manual Select here because I don't know what your MapDataAsDTO method is actually doing. I'd highly recommend using AutoMapper and its ProjectTo<T> method if you want to simplify the mapping code. Custom mapping functions can too easily hide expensive bugs like ToList calls when someone hits a scenario that EF cannot translate.
The first query gets a distinct list of just the Product IDs that are the parent ID of at least one other product. The second query maps out all products into DTOs, setting the HasBrandStock based on whether each product appears in the parentProductIds list or not.
This option will work if a relatively limited number of products are recognized as "parents". That first list can only get so big before the query risks failing with too many items to translate into an IN clause.
The better option would be to look at your mapping. You have a ParentProductId; does the product entity have an associated ChildProducts collection?
public class Product
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
    // ...
    public virtual Product ParentProduct { get; set; }
    public virtual ICollection<Product> ChildProducts { get; set; } = new List<Product>();
}

public class ProductConfiguration : EntityTypeConfiguration<Product>
{
    public ProductConfiguration()
    {
        HasKey(x => x.ProductId);
        HasOptional(x => x.ParentProduct)
            .WithMany(x => x.ChildProducts)
            .Map(x => x.MapKey("ParentProductId"));
    }
}
This example maps the ParentProductId without exposing a field on the entity (recommended). Otherwise, if you do expose a ParentProductId, substitute the .Map(...) call with .HasForeignKey(x => x.ParentProductId), as sketched below.
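For instance, with an exposed FK property (EF6):

HasOptional(x => x.ParentProduct)
    .WithMany(x => x.ChildProducts)
    .HasForeignKey(x => x.ParentProductId); // Uses the exposed Product.ParentProductId property.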
This assumes EF6 as per your tags. If you're using EF Core, you use HasForeignKey("ParentProductId") in place of Map(...) to establish a shadow property for the FK without exposing a property; the entity configuration syntax is a bit different with Core.
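A rough EF Core equivalent (inside an IEntityTypeConfiguration<Product>.Configure method; treat this as a sketch):

builder.HasKey(x => x.ProductId);
builder.HasOne(x => x.ParentProduct)
    .WithMany(x => x.ChildProducts)
    .HasForeignKey("ParentProductId") // Shadow FK; no property exposed on Product.
    .IsRequired(false);               // Optional parent, matching HasOptional above.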
This allows your queries to leverage the relationship between parent products and any related child products. Populating the DTOs can be accomplished with one query:
var dtos = _context.Products
    .Select(x => new ProductDTO
    {
        ProductId = x.ProductId,
        ProductName = x.ProductName,
        // ...
        HasBrandStock = x.ChildProducts.Any()
    }).ToList();
This leverages the relationship to populate your DTO and its flag in one pass. The caveat is that there is now a cyclical relationship between Product and itself represented in the entity. This means: don't feed these entities to something like a serializer, which includes avoiding entities as members of DTOs/ViewModels.

EF: Clear Collection Property without First Getting Data

I have a class called Facility. Facility has a collection property on it called Employees. I'm using the disconnected layer of EF. I want to clear the Employees collection of a specific facility, but I don't want to make two trips to the DB: (1) getting all the employees, and then (2) clearing them. How can I do this?
Here's what I've tried...
Facility f = new Facility()
{
    Id = 4,
    Employees = new List<Employee>()
};
context.Facilities.Attach(f);
context.Entry<Facility>(f).Collection(fac => fac.Employees).IsLoaded = true;
context.SaveChanges();
I think I'm close, but it doesn't work. Thanks for the advice.
If you want to use EF only, you're always going to need some roundtrip. In the end, EF needs to generate DELETE ... WHERE Id = x statements. How would it know the values for x without first grabbing them from the database?
But of course you can do this in a more efficient way than fetching the complete Employee objects. It's enough to get the Id values. Then you can use these Ids to create stub entities that you mark as Deleted:
var ids = context.Employees.Where(e => e.FacilityId == 4)
                 .Select(e => e.Id).ToArray();
foreach (int id in ids)
{
    var emp = new Employee { Id = id }; // Stub entity
    context.Entry(emp).State = System.Data.Entity.EntityState.Deleted;
}
context.SaveChanges();
This is pure EF. But you can also use EntityFramework.Extended. This allows you to execute a statement like
context.Employees.Where(e => e.FacilityId == 4)
       .Delete();

LINQ Query - Only get Order and MAX Date from Child Collection

I'm trying to get a list that displays two values in a label, from a parent/child (1-*) entity collection model.
I have 3 entities:
[Customer]: CustomerId, Name, Address, ...
[Order]: OrderId, OrderDate, EmployeeId, Total, ...
[OrderStatus]: OrderStatusId, StatusLevel, StatusDate, ...
A Customer can have MANY Orders, and in turn an Order can have MANY OrderStatuses, i.e.
[Customer] 1--* [Order] 1--* [OrderStatus]
Given a CustomerId, I want to get all of the Orders (just OrderId) and the LATEST (MAX?) OrderStatus.StatusDate for that Order.
I've tried a couple of attempts, but can't seem to get the results I want.
private IQueryable<Customer> GetOrderData(string customerId)
{
    var ordersWithLatestStatusDate = Context.Customers
        // Note: I am not sure if I should add the .Expand() extension methods here for the
        // other two entity collections, since I want these queries to be as performant as
        // possible, and since I am projecting below (I only need to display 2 fields for each
        // record in the IQueryable<T>) - but I'm thinking I should now, after some contemplation.
        .Where(x => x.CustomerId == SelectedCustomer.CustomerId)
        .Select(x => new Custom
        {
            CustomerId = x.CustomerId,
            ...
            // I would like to project my child and grandchild collections, i.e. Orders and
            // OrderStatuses, here, but don't know how to do that. I learned that by projecting,
            // one does not need to "Include/Expand" these extension methods.
        });
    return ordersWithLatestStatusDate;
}
---- UPDATE 1 ----
After the great solution from User: lazyberezovsky, I tried the following:
var query = Context.Customers
    .Where(c => c.CustomerId == SelectedCustomer.CustomerId)
    .Select(c => new Customer
    {
        Name = c.Name,
        LatestOrderDate = c.Orders.SelectMany(o => o.OrderStatus).Max(s => s.StatusDate)
    });
In my hastiness with the initial posting, I didn't paste everything in correctly; it was mostly from memory and I didn't have the exact code for reference at the time. My method is a strongly-typed IQueryable<T>: I need it to return a collection of items of type T due to a constraint within a rigid API that takes an IQueryable query as one of its parameters. I am aware I can add other entities/attributes by using the .Expand() extension method and/or .Select(). You will notice that my latest UPDATED query above has "new Customer" within the .Select() where it was once anonymous. I'm positive that is why the query failed: it couldn't be turned into a valid URI because LatestOrderDate is not a property of Customer at the server level. FYI, upon seeing the first answer below, I had added that property to my client-side Customer class with a simple { get; set; }. So given this, can I somehow still get back a Customer collection while only bringing back those 2 fields from 2 different entities? The solution below looked so promising and ingenious!
---- END UPDATE 1 ----
FYI, the technologies I'm using are OData (WCF), Silverlight, C#.
Any tips/links will be appreciated.
This will give you a list of { OrderId, LatestDate } objects:
var query = Context.Customers
    .Where(c => c.CustomerId == SelectedCustomer.CustomerId)
    .SelectMany(c => c.Orders)
    .Select(o => new {
        OrderId = o.OrderId,
        LatestDate = o.Statuses.Max(s => s.StatusDate) });
UPDATE: construct the objects in-memory
var query = Context.Customers
    .Where(c => c.CustomerId == SelectedCustomer.CustomerId)
    .SelectMany(c => c.Orders)
    .AsEnumerable() // goes in-memory
    .Select(o => new {
        OrderId = o.OrderId,
        LatestDate = o.Statuses.Max(s => s.StatusDate) });
Also grouping could help here.
If I read this correctly, you want a Customer entity and then a single value computed from its Orders property. This is currently not supported in OData: OData doesn't support computed values in queries, so no expressions in projections, no aggregates, and so on.
Unfortunately, even with two queries this is currently not possible, since OData doesn't support any way of expressing the MAX functionality.
If you have control over the service, you could write a server-side function/service operation to execute this kind of query.
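A rough sketch of such a service operation on a WCF Data Service (this assumes a DataService<T>-based service, where operations are exposed with [WebGet]; the result type, names, and navigation properties here are illustrative assumptions):

// Runs the aggregate on the server, where LINQ is free to use Max(),
// and returns a flat summary shape the client can consume.
[WebGet]
public IEnumerable<OrderStatusSummary> GetLatestOrderDates(int customerId)
{
    return CurrentDataSource.Orders
        .Where(o => o.CustomerId == customerId)
        .Select(o => new OrderStatusSummary
        {
            OrderId = o.OrderId,
            LatestDate = o.Statuses.Max(s => s.StatusDate)
        })
        .ToList();
}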
