As an example let's say my database has a table with thousands of ships with every ship potentially having thousands of passengers as a navigation property:
public DbSet<Ship> Ship { get; set; }
public DbSet<Passenger> Passenger { get; set; }
public class Ship
{
public List<Passenger> passengers { get; set; }
//properties omitted for example
}
public class Passenger
{
//properties omitted for example
}
The example use case is that someone fetches all ships via the API and would like to know, for each ship, whether it is empty (0 passengers), so the returned JSON will contain a list of ships, each with a bool indicating whether it is empty.
My current code seems very inefficient (including all passengers just to determine if a ship is empty):
List<Ship> ships = dbContext.Ship
.Include(x => x.passengers)
.ToList();
and later when the ships are serialized to JSON:
jsonShip.isEmpty = !ship.passengers.Any();
I would like a more performant (and not bloated) alternative to including all passengers. What options do I have?
I have looked at computed columns, but they only seem to support SQL as a string. If possible I would like to stay in the C# code world, so for example a property that is automatically woven into the SQL query and set correctly would be optimal.
Create a Data Transfer Object for Ship that reflects the shape of your JSON result, like -
public class ShipDto
{
public int Id { get; set; }
public string Name { get; set; }
public bool IsEmpty { get; set; }
}
Then use projection in your query -
var ships = dbCtx.Ships
.Select(p => new ShipDto
{
Id = p.Id,
Name = p.Name,
IsEmpty = !p.Passengers.Any()
})
.ToList();
Usually, APIs need to produce responses of various shapes and DTOs give you well defined models to represent the shape of your API response. Domain entities are not always suitable for this.
If your domain entity (Ship) has a lot of properties, then copying all those properties in the .Select() method might be cumbersome. You can use AutoMapper to map them for you. AutoMapper has a ProjectTo<T>() method that can generate the SQL and return the projected result. For example, you can achieve the above result with a mapping configuration -
CreateMap<Ship, ShipDto>()
.ForMember(d => d.IsEmpty, opt => opt.MapFrom(s => !s.Passengers.Any()));
and a query -
var ships = dbCtx.Ships.ProjectTo<ShipDto>().ToList();
assuming all other properties in ShipDto are named the same as in the Ship entity.
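For reference, the CreateMap call above normally lives in an AutoMapper Profile; a minimal sketch (ShipProfile is just an illustrative name):
public class ShipProfile : Profile
{
    public ShipProfile()
    {
        // properties with matching names (Id, Name, ...) are mapped by convention
        CreateMap<Ship, ShipDto>()
            .ForMember(d => d.IsEmpty, opt => opt.MapFrom(s => !s.Passengers.Any()));
    }
}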
EDIT :
If you don't want a DTO model, you can add a [NotMapped] property to the Ship model -
public class Ship
{
public int Id { get; set; }
public string Name { get; set; }
[NotMapped]
public bool IsEmpty { get; set; }
public List<Passenger> passengers { get; set; }
}
and then do the query like -
var ships = dbCtx.Ships
.Select(p => new Ship
{
Id = p.Id,
Name = p.Name,
IsEmpty = !p.Passengers.Any()
})
.ToList();
Or you can return an anonymous type -
var ships = dbCtx.Ships
.Select(p => new
{
Id = p.Id,
Name = p.Name,
IsEmpty = !p.Passengers.Any()
})
.ToList();
If I understand your intention correctly...
One way is to store the number of passengers inside each Ship entity. This can work well if you use Domain Driven Design, treat the Ship as an aggregate root, and only add or remove passengers through methods exposed on the given Ship entity, e.g. RegisterPassenger() / RemovePassenger(). Inside these methods, increment or decrement the passenger number along with adding or removing the passenger.
Then you can query the Ships DbSet with a PassengerCount == 0 projection to get the bool you need. And, again obviously, it won't even touch the Passengers table.
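A minimal sketch of that idea, assuming an Id property on Ship (the method and property names below are illustrative additions, not from the original post):
public class Ship
{
    public int Id { get; set; }
    public int PassengerCount { get; private set; }
    public List<Passenger> passengers { get; set; } = new List<Passenger>();

    public void RegisterPassenger(Passenger passenger)
    {
        passengers.Add(passenger);
        PassengerCount++;   // keep the redundant counter in sync with the collection
    }

    public void RemovePassenger(Passenger passenger)
    {
        if (passengers.Remove(passenger))
            PassengerCount--;
    }
}

// the query then never touches the Passengers table
var ships = dbContext.Ship
    .Select(s => new { s.Id, IsEmpty = s.PassengerCount == 0 })
    .ToList();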
In traditional anemic domain ASP.NET systems this sort of data redundancy might be a bit more risky, because properties are usually publicly mutable, and you have multiple services that 'massage' the entities, which is a potential source of data integrity loss.
In .NET Core 2.2 I'm stuck with filtering IQueryable built as:
_context.Ports.Include(p => p.VesselsPorts)
.ThenInclude(p => p.Arrival)
.Include(p => p.VesselsPorts)
.ThenInclude(p => p.Departure)
.OrderBy(p => p.PortLocode);
in many-to-many relation. And the entity models are such as:
public class PortModel
{
[Key]
public string PortLocode { get; set; }
public double? MaxKnownLOA { get; set; }
public double? MaxKnownBreadth { get; set; }
public double? MaxKnownDraught { get; set; }
public virtual ICollection<VesselPort> VesselsPorts { get; set; }
}
public class VesselPort
{
public int IMO { get; set; }
public string PortLocode { get; set; }
public DateTime? Departure { get; set; }
public DateTime? Arrival { get; set; }
public VesselModel VesselModel { get; set; }
public PortModel PortModel { get; set; }
}
Based on this SO answer I managed to create LINQ like that:
_context.Ports.Include(p => p.VesselsPorts).ThenInclude(p => p.Arrival).OrderBy(p => p.PortLocode)
.Select(
p => new PortModel
{
PortLocode = p.PortLocode,
MaxKnownBreadth = p.MaxKnownBreadth,
MaxKnownDraught = p.MaxKnownDraught,
MaxKnownLOA = p.MaxKnownLOA,
VesselsPorts = p.VesselsPorts.Where(vp => vp.Arrival > DateTime.UtcNow.AddDays(-1)).ToList()
}).AsQueryable();
BUT what I need is to find all port records where the number of VesselsPorts entries with Arrival > DateTime.UtcNow.AddDays(-1) is greater than some value, e.g. int x = 5. And I have no clue how to do it :/
Thanks to @GertArnold's comment, I ended up with this query:
ports = ports.Where(p => p.VesselsPorts.Where(vp => vp.Arrival > DateTime.UtcNow.AddDays(-1)).Count() > x);
When using Entity Framework, people tend to use Include instead of Select to save themselves some typing. It is seldom wise to do so.
The DbContext holds a ChangeTracker. Every complete row from any table that you fetch during the lifetime of the DbContext is stored in the ChangeTracker, along with a copy. You get a reference to one of them; if you change properties of the data you got, they are changed in the tracked copy. During SaveChanges, the original is compared to the copy to see whether the data must be saved.
So if you fetch quite a lot of data and use Include, every fetched item is cloned. This can slow down your queries considerably.
Apart from this cloning, you will probably fetch more properties than you actually plan to use. Database management systems are extremely optimized in combining tables, and searching rows within tables. One of the slower parts is the transfer of the selected data to your local process.
For example, if you have a database with Schools and Students, with the obvious one-to-many relation, then every Student will have a foreign key to the School he attends.
So if you ask for School [10] with his 2000 Students, then every Student will have a foreign key value of [10]. If you use Include, then you will be transferring this same value 10 over 2000 times. What a waste of processing power!
In entity framework, when querying data, always use Select to select the properties, and Select only the properties that you actually plan to use. Only use Include if you plan to change the fetched items.
Certainly don't use Include to save you some typing!
Requirement: Give me the Ports with their Vessels
var portsWithTheirVessels = dbContext.Ports
.Where(port => ...) // if you don't want all Ports
.Select(port => new
{
// only select the properties that you want:
PortLocode = port.PortLocode,
MaxKnownLOA = port.MaxKnownLOA,
MaxKnownBreadth = port.MaxKnownBreadth,
MaxKnownDraught = port.MaxKnownDraught,
// The Vessels in this port:
Vessels = port.VesselsPorts.Select(vessel => new
{
// again: only the properties that you plan to use
IMO = vessel.IMO,
...
// do not select the foreign key, you already know the value!
// PortLocode = vessel.PortLocode,
})
.ToList(),
});
Entity Framework knows your one-to-many relation, and knows that if you use the virtual ICollection it should do a (Group-)Join.
Some people prefer to do the Group-Join themselves, or they use a version of entity framework that does not support using the ICollection.
var portsWithTheirVessels = dbContext.Ports.GroupJoin(dbContext.VesselPorts,
port => port.PortLocode, // from every Port take the primary key
vessel => vessel.PortLocode, // from every Vessel take the foreign key to Port
// parameter resultSelector: take every Port with its zero or more Vessels to make one new
(port, vesselsInThisPort) => new
{
PortLocode = port.PortLocode,
...
Vessels = vesselsInThisPort.Select(vessel => new
{
...
})
.ToList(),
});
Alternative:
var portsWithTheirVessels = dbContext.Ports.Select(port => new
{
PortLocode = port.PortLocode,
...
Vessels = dbContext.VesselPorts.Where(vessel => vessel.PortLocode == port.PortLocode)
.Select(vessel => new
{
...
})
.ToList(),
});
Entity framework will translate this also to a GroupJoin.
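Applied to the original requirement (ports where the number of recent arrivals exceeds x), the filter can be combined with such a projection so everything is still evaluated in the database; a sketch:
int x = 5;
var busyPorts = dbContext.Ports
    .Where(port => port.VesselsPorts
        .Count(vp => vp.Arrival > DateTime.UtcNow.AddDays(-1)) > x)
    .Select(port => new
    {
        PortLocode = port.PortLocode,
        RecentArrivals = port.VesselsPorts
            .Count(vp => vp.Arrival > DateTime.UtcNow.AddDays(-1)),
    })
    .ToList();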
I have two PostgreSQL tables, "stock_appreciation_rights" and "sar_vest_schedule", which are mapped to the classes "StockAppreciationRights" and "SarVestingUnit". The relationship is this: one SarVestingUnit is associated with one StockAppreciationRights record, and one StockAppreciationRights record is associated with many SarVestingUnits. Here is the SarVestingUnit class:
public class SarVestingUnit
{
public string UbsId { get; set; }
public short Units { get; set; }
public DateTime VestDate { get; set; }
public virtual StockAppreciationRights Sar { get; set; }
}
Here is the StockAppreciationRights class:
public class StockAppreciationRights
{
public StockAppreciationRights()
{
SarVestingUnits = new HashSet<SarVestingUnit>();
}
public string UbsId { get; set; }
public DateTime GrantDate { get; set; }
public DateTime ExpirationDate { get; set; }
public short UnitsGranted { get; set; }
public decimal GrantPrice { get; set; }
public short UnitsExercised { get; set; }
public virtual ICollection<SarVestingUnit> SarVestingUnits { get; set; }
}
And a snippet from my dbContext:
modelBuilder.Entity<SarVestingUnit>(entity =>
{
entity.HasKey(e => new { e.UbsId, e.VestDate })
.HasName("sar_vest_schedule_pkey");
entity.ToTable("sar_vest_schedule");
entity.Property(e => e.UbsId)
.HasColumnName("ubs_id")
.HasColumnType("character varying");
entity.Property(e => e.VestDate)
.HasColumnName("vest_date")
.HasColumnType("date");
entity.Property(e => e.Units).HasColumnName("units");
entity.HasOne(d => d.Sar)
.WithMany(p => p.SarVestingUnits)
.HasForeignKey(d => d.UbsId)
.OnDelete(DeleteBehavior.ClientSetNull)
.HasConstraintName("sar_vest_schedule_ubs_id_fkey")
.IsRequired();
});
Here is the behavior I don't understand. If I get just the StockAppreciationRights from my dbcontext like this:
var x = new personalfinanceContext();
//var test = x.SarVestingUnit.ToList();
var pgSARS = x.StockAppreciationRights.ToList();
The SarVestingUnits collection is empty in my StockAppreciationRights objects.
Likewise, if I just get the SarVestingUnits and not the StockAppreciationRights like this:
var x = new personalfinanceContext();
var test = x.SarVestingUnit.ToList();
//var pgSARS = x.StockAppreciationRights.ToList();
The Sar property in my SarVestingUnit objects is null.
However, if I get them both like this:
var x = new personalfinanceContext();
var test = x.SarVestingUnit.ToList();
var pgSARS = x.StockAppreciationRights.ToList();
Then everything is populated as it should be. Can someone explain this behavior? Obviously I'm new to entity framework and PostgreSQL.
From the behaviour you describe, it seems that lazy loading is either disabled or not available (i.e. EF Core earlier than 2.1, or the lazy-loading proxies are not enabled).
Lazy loading would assign proxies for related references that it hasn't loaded so that if those are later accessed, a DB query against the DbContext would be made to retrieve them. This allows data to be retrieved "as required" but can result in a significant performance penalty.
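For completeness, in EF Core 2.1+ lazy loading has to be opted in explicitly; a minimal sketch, assuming the Npgsql provider and the Microsoft.EntityFrameworkCore.Proxies package:
// inside personalfinanceContext (keep your existing provider configuration)
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
    optionsBuilder
        .UseLazyLoadingProxies()                  // from Microsoft.EntityFrameworkCore.Proxies
        .UseNpgsql("<your connection string>");   // hypothetical placeholder
}
// Navigation properties must be virtual for the proxies to intercept them,
// which they already are in the posted classes (Sar, SarVestingUnits).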
Alternatively you can eager-load related data. For example:
var pgSARS = context.StockAppreciationRights.Include(x => x.SarVestingUnits).ToList();
would tell EF to load the StockAppreciationRights and pre-fetch the vesting units for each of those entries. You can use Include and ThenInclude to drill down and pre-fetch any number of dependencies.
The reason your final example appears to work is that EF will auto-populate relations that the context already knows about. When you load the first set, it won't resolve any of the related entities; however, when you then load the second set, EF already knows about the related entities and will fix up those references for you.
The real power of EF, though, is not treating entities as a 1-to-1 mapping to tables (outside of editing/inserting), but rather leveraging projection to pull the relational data you need. Using Select or AutoMapper's ProjectTo methods means you can extract whatever data you want through the mapped EF relationships, and EF can build an efficient query to fetch it, rather than you worrying about lazy or eager loading. For instance, if you want a list of vesting units with their associated right details:
var sars = context.SarVestingUnit.Select(x => new SARSummary
{
UbsId = x.UbsId,
Units = x.Units,
VestDate = x.VestDate,
GrantDate = x.Sar.GrantDate,
ExpirationDate = x.Sar.ExpirationDate,
UnitsGranted = x.Sar.UnitsGranted,
GrantPrice = x.Sar.GrantPrice
}).ToList();
From within the Select statement you can access any of the related details to flatten into a view model that serves your immediate needs, or even compose a hierarchy of view models simplified for what the view/consumer needs.
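SARSummary here is just a hand-rolled view model; a minimal sketch of its shape, inferred from the properties used in the projection above:
public class SARSummary
{
    public string UbsId { get; set; }
    public short Units { get; set; }
    public DateTime VestDate { get; set; }
    public DateTime GrantDate { get; set; }
    public DateTime ExpirationDate { get; set; }
    public short UnitsGranted { get; set; }
    public decimal GrantPrice { get; set; }
}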
I have the following method which is meant to build me up a single object instance, where its properties are built via recursively calling the same method:
public ChannelObjectModel GetChannelObject(Guid id, Guid crmId)
{
var result = (from channelObject in _channelObjectRepository.Get(x => x.Id == id)
select new ChannelObjectModel
{
Id = channelObject.Id,
Name = channelObject.Name,
ChannelId = channelObject.ChannelId,
ParentObjectId = channelObject.ParentObjectId,
TypeId = channelObject.TypeId,
ChannelObjectType = channelObject.ChannelObjectTypeId.HasValue ? GetChannelObject(channelObject.ChannelObjectTypeId.Value, crmId) : null,
ChannelObjectSearchType = channelObject.ChannelObjectSearchTypeId.HasValue ? GetChannelObject(channelObject.ChannelObjectSearchTypeId.Value, crmId) : null,
ChannelObjectSupportingObject = channelObject.ChannelObjectSupportingObjectId.HasValue ? GetChannelObject(channelObject.ChannelObjectSupportingObjectId.Value, crmId) : null,
Mapping = _channelObjectMappingRepository.Get().Where(mapping => mapping.ChannelObjectId == channelObject.Id && mapping.CrmId == crmId).Select(mapping => new ChannelObjectMappingModel
{
CrmObjectId = mapping.CrmObjectId
}).ToList(),
Fields = _channelObjectRepository.Get().Where(x => x.ParentObjectId == id).Select(field => GetChannelObject(field.Id, crmId)).ToList()
}
);
return result.First();
}
public class ChannelObjectModel
{
public ChannelObjectModel()
{
Mapping = new List<ChannelObjectMappingModel>();
Fields = new List<ChannelObjectModel>();
}
public Guid Id { get; set; }
public Guid ChannelId { get; set; }
public string Name { get; set; }
public List<ChannelObjectMappingModel> Mapping { get; set; }
public int TypeId { get; set; }
public Guid? ParentObjectId { get; set; }
public ChannelObjectModel ParentObject { get; set; }
public List<ChannelObjectModel> Fields { get; set; }
public Guid? ChannelObjectTypeId { get; set; }
public ChannelObjectModel ChannelObjectType { get; set; }
public Guid? ChannelObjectSearchTypeId { get; set; }
public ChannelObjectModel ChannelObjectSearchType { get; set; }
public Guid? ChannelObjectSupportingObjectId { get; set; }
public ChannelObjectModel ChannelObjectSupportingObject { get; set; }
}
this is connecting to a SQL database using Entity Framework Core 2.1.1
Whilst it technically works, it causes loads of database queries to be made - I realise it's because of the ToList() and First() etc. calls.
However, because of the nature of the object, I can make one huge IQueryable<anonymous> object with a from ... select new {...} and call First on it, but that code was over 300 lines long going just 5 tiers deep in the hierarchy, so I am trying to replace it with something like the code above, which is much cleaner, albeit much slower.
ChannelObjectType, ChannelObjectSearchType and ChannelObjectSupportingObject are all ChannelObjectModel instances, and Fields is a list of ChannelObjectModel instances.
The query takes about 30 seconds to execute currently, which is far too slow and it is on a small localhost database too, so it will only get worse with a larger number of db records, and generates a lot of database calls when I run it.
The 300+ lines code generates a lot less queries and is reasonably quick, but is obviously horrible, horrible code (which I didn't write!)
Can anyone suggest a way I can recursively build up an object in a similar way to the above method, but drastically cut the number of database calls so it's quicker?
I work with EF6, not Core, but as far as I know, same things apply here.
First of all, move this function to your repository, so that all calls share the DbContext instance.
Secondly, use Include on your DbSet to eager-load the navigation properties:
ctx.Set<ChannelObjectModel>()
.Include(x => x.Fields)
.Include(x => x.Mapping)
.Include(x => x.ParentObject)
...
Good practice is to make this a function of the context (or an extension method), called for example BuildChannelObject(), which returns the IQueryable with just the includes.
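A minimal sketch of such an extension method, following the same assumption as the snippet above that ChannelObjectModel is mapped in the context (the class name is illustrative):
public static class ChannelObjectQueryExtensions
{
    // bundles the eager-loading includes; the Include extension comes from
    // System.Data.Entity in EF6 or Microsoft.EntityFrameworkCore in EF Core
    public static IQueryable<ChannelObjectModel> BuildChannelObject(this DbContext ctx)
    {
        return ctx.Set<ChannelObjectModel>()
            .Include(x => x.Fields)
            .Include(x => x.Mapping)
            .Include(x => x.ParentObject)
            .Include(x => x.ChannelObjectType)
            .Include(x => x.ChannelObjectSearchType)
            .Include(x => x.ChannelObjectSupportingObject);
    }
}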
Then you can start the recursive part:
public ChannelObjectModel GetChannelObjectModel(Guid id)
{
var set = ctx.BuildChannelObject(); // ctx is this
var channelModel = set.FirstOrDefault(x => x.Id == id); // this loads the first level
LoadRecursive(channelModel, set);
return channelModel;
}
private void LoadRecursive(ChannelObjectModel c, IQueryable<ChannelObjectModel> set)
{
if(c == null)
return; // recursion end condition
c.ParentObject = set.FirstOrDefault(x => x.Id == c.ParentObjectId);
// all other properties
LoadRecursive(c.ParentObject, set);
// all other properties
}
If all this code uses the same instance of DbContext, it should be quite fast. If not, you can use another trick:
ctx.BuildChannelObject().Load();
This loads all objects into the memory cache of your DbContext. Unfortunately, that cache dies with the context instance, but it makes those recursive calls much faster, since no database round trip is made.
If this is still too slow, you can add AsNoTracking() as the last instruction of BuildChannelObject().
If this is still too slow, just implement an application-wide memory cache of those objects and use that instead of querying the database every time - this works great if your app is a service that can afford a long startup but then works fast.
A whole other approach is to enable lazy loading by marking navigation properties as virtual - but remember that the returned type will be a derived proxy type, not your original ChannelObjectModel! Also, properties will load only as long as you don't dispose the context - after that you get an exception. Loading all properties while the context is alive and then returning the complete object is also a little bit tricky - the easiest (but not the best!) way to do it is to serialize the object to JSON (watch out for circular references) before returning it.
If that does not satisfy you, switch to NHibernate, which I hear has an application-wide cache by default.
I have 2 classes like this:
Parent.cs
public class Parent
{
public int Id {get;set;}
public virtual ICollection<Child> Children { get; set; }
}
Child.cs
public class Child
{
public int Id {get;set;}
public ItemStatusType ItemStatusTyp { get; set; }
public int ParentId {get;set;}
[ForeignKey("ParentId")]
public virtual Parent Parent { get; set; }
}
ItemStatusType.cs
public enum ItemStatusType
{
Active = 1,
Deactive = 2,
Deleted = 3
}
What I want is to somehow retrieve always the active ones and not the deleted ones. Since I am not deleting the record physically, I'm merely updating the ItemStatusType to Deleted status.
So, when I say ParentObj.Children, I only wish to retrieve the active ones, without using a further Where condition.
Here is what I've done so far, but it throws a runtime exception, which I've included below:
public class ParentConfiguration : EntityTypeConfiguration<Parent>
{
public ParentConfiguration()
{
HasMany(c => c.Children.Where(p => p.ItemStatusTyp != ItemStatusType.Deleted).ToList())
.WithRequired(c => c.Parent)
.HasForeignKey(c => c.ParentId)
;
}
}
Runtime Exception:
The expression 'c => c.Children.Where(p => (Convert(p.ItemStatusTyp)
!= 3)).ToList()' is not a valid property expression. The expression
should represent a property: C#: 't => t.MyProperty' VB.Net:
'Function(t) t.MyProperty'.
I had to use ToList after the expression, otherwise it does not compile.
What is the proper way to do what I want?
Thanks in advance,
You cannot use Where or any other logic in fluent property mapping - that's just configuration.
Basically, you cannot solve what you need in a declarative way.
There are some workarounds you can use for first-level entities, like implementing your own extension(s) MySet<T> that return .Set<T>().Where(x => x.ItemStatusTyp != ItemStatusType.Deleted) and using them everywhere, but that won't solve the filtering issue for child collections.
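For example, a minimal sketch of that workaround for the Child entity (ActiveChildren is an illustrative name; a generic MySet<T> would additionally need a shared interface or base class exposing the status):
public static class QueryExtensions
{
    // use this instead of ctx.Set<Child>() to get only non-deleted children
    public static IQueryable<Child> ActiveChildren(this DbContext ctx)
    {
        return ctx.Set<Child>().Where(c => c.ItemStatusTyp != ItemStatusType.Deleted);
    }
}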
You can go the hard way and prepare a separate set of entities to use for selecting data, which basically should be based on database views or stored procedures; you will have to create a separate view for every entity, so that you will be able to combine selects over any relations based on these view-entities.
For inserting, though, you will need to have entities mapped over the "real" tables. Not sure if it's worth it, but it might be in some cases.
I have a database table which represents accounts with a multi-level hierarchy. Each row has an "AccountKey" which represents the current account and possibly a "ParentKey" which represents the "AccountKey" of the parent.
My model class is "AccountInfo" which contains some information about the account itself, and a List of child accounts.
What's the simplest way to transform this flat database structure into a hierarchy? Can it be done directly in LINQ or do I need to loop through after the fact and build it manually?
Model
public class AccountInfo
{
public int AccountKey { get; set; }
public int? ParentKey { get; set; }
public string AccountName { get; set; }
public List<AccountInfo> Children { get; set; }
}
LINQ
var accounts =
from a in context.Accounts
select new AccountInfo
{
AccountKey = a.AccountKey,
AccountName = a.AccountName,
ParentKey = a.ParentKey
};
The structure you currently have is actually a hierarchy (an adjacency list model). The question is, do you want to keep this hierarchical model? If you do, there's a Nuget package called MVCTreeView. This package works directly with the table structure you describe - in it, you can create a Tree View for your UI, implement CRUD operations at each level, etc. I had to do exactly this and I wrote an article on CodeProject that shows how to cascade delete down an adjacency list model table in SQL via C#. If you need more specifics, leave a comment, and I'll edit this post.
http://www.codeproject.com/Tips/668199/How-to-Cascade-Delete-an-Adjace
You can simply create an association property for the parent key:
public class AccountInfo {
... // stuff you already have
public virtual AccountInfo Parent { get; set; }
}
// in the configuration (this is using Code-first configuration)
conf.HasOptional(a => a.Parent).WithMany(p => p.Children).HasForeignKey(a => a.ParentKey);
With this setup, you can traverse the hierarchy in either direction, in queries or outside of queries via lazy loading. If you want lazy loading of the children, make sure the collection property is virtual as well.
To select all children for a given parent, you might run the following query:
var children = context.Accounts
.Where(a => a.AccountKey == someKey)
.SelectMany(a => a.Children)
.ToArray();
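If you want the whole tree materialized into the AccountInfo model from the question, one straightforward option is to load the flat list once and wire up the Children lists in memory; a sketch:
// load the flat list once
var accounts = (from a in context.Accounts
                select new AccountInfo
                {
                    AccountKey = a.AccountKey,
                    AccountName = a.AccountName,
                    ParentKey = a.ParentKey
                }).ToList();

// index by key and give every node an empty Children list
var byKey = accounts.ToDictionary(a => a.AccountKey);
foreach (var account in accounts)
    account.Children = new List<AccountInfo>();

// attach each node to its parent; nodes without a parent are the roots
foreach (var account in accounts)
{
    if (account.ParentKey.HasValue && byKey.ContainsKey(account.ParentKey.Value))
        byKey[account.ParentKey.Value].Children.Add(account);
}

var roots = accounts.Where(a => a.ParentKey == null).ToList();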