I'm using EF 6 to work with a somewhat shoddily constructed database. I'm using a code-first model.
Many of the logical relations in it are not implemented with proper keys. Instead they use various other strategies (such as character-separated lists of IDs or strings) that were previously handled with complex SQL queries.
(Changing the schema is not an option)
I really want to capture those relations as properties. It's possible to do this by using explicit queries instead of defining actual relations using the fluent/attribute syntax.
I'm planning to do this by having IQueryable<T> properties that perform a query. For example:
partial class Product {
public IQueryable<tblCategory> SubCategories {
get {
//SubCategoriesID is a string like "1234, 12351, 12" containing a list of IDs.
var ids = SubCategoriesID.Split(',').Select(x => int.Parse(x.Trim()));
return from category in this.GetContext().tblCategories
where ids.Contains(category.CategoryID)
select category;
}
}
}
(The GetContext() method is an extension method that somehow acquires an appropriate DbContext)
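The parsing step in the getter above can be looked at on its own. The following is a minimal sketch, assuming the column always holds comma-separated integers; the names `IdListParser` and `ParseIdList` are hypothetical and not part of the original code. It materializes the IDs into a list once, so the parsing stays outside the query expression, and it skips empty entries.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IdListParser
{
    // Hypothetical helper: turns a string such as "1234, 12351, 12" into a list of ints.
    // Empty entries (for example from a trailing comma) are skipped;
    // non-numeric entries make int.Parse throw a FormatException.
    public static List<int> ParseIdList(string raw)
    {
        if (string.IsNullOrWhiteSpace(raw))
            return new List<int>();

        return raw.Split(',')
                  .Select(part => part.Trim())
                  .Where(part => part.Length > 0)
                  .Select(part => int.Parse(part))
                  .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        List<int> ids = IdListParser.ParseIdList("1234, 12351, 12");
        Console.WriteLine(string.Join("|", ids));
    }
}
```

The resulting list could then be used in the `ids.Contains(category.CategoryID)` filter in place of the lazily evaluated sequence.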
However, is there a better way to do this that I'm not familiar with?
Furthermore, if I do do this, what's the best way of getting the DbContext for the operation? It could be:
Just create a new one. I'm a bit leery of doing this, since I don't know much about how they work.
Use some tricks to get the context that was used to create this specific instance.
Do something else?
First, I would recommend not returning an IQueryable, as that retains a relationship to the original DbContext. Instead, I'd call ToList() on the results of the query and return that as an IEnumerable<tblCategory>.
Try not to keep DbContext instances hanging around; there's a lot of state management baked into them, and since they are not thread-safe you don't want to have multiple threads hitting the same instance. The pattern I personally tend to follow on data access methods is to use a new DbContext in a using block:
using (var ctx = new YourDbContextTypeHere()) {
return (from category in ctx.tblCategories
where ids.Contains(category.CategoryID)
select category).ToList();
}
Beware that .Contains() on a list of ids can be very slow in EF, so try to avoid it. I'd use subqueries, such as
var subcategories = context.SubCategories.Where(...);
var categories = context.Categories.Where(c => subcategories.Select(s => s.Id).Contains(c.CategoryId));
In this setup, the ids never have to be loaded into memory and sent to the server; the whole filter runs as one SQL query, which is fast.
Related
Imagine we have the following db structure
Organization
{
Guid OrganizationId
//....
}
User
{
Guid UserId
}
OrganizationUsers
{
Guid OrganizationId
Guid UserId
}
When the edmx generates the classes, it abstracts the OrganizationUsers table away into a many-to-many navigation property, so no POCO class is generated for it.
Say I'm loading data from my context, but to avoid a Cartesian product I don't use Include; instead I make two separate queries.
using(var context = new EntitiesContext())
{
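// Assumes the DbSet on the context is named Organizations
var organizationsQuery = context.Organizations.Where(FilterByParent);
var organizations = organizationsQuery.ToList();
// Load() returns void, so its result cannot be assigned to a variable
organizationsQuery.SelectMany(x => x.Users).Load();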
}
Is it safe to assume that the connected entities are loaded?
Would this make any difference if I loaded the users directly from the DBSet?
From the database's point of view:
Is it safe to assume that the connected entities are loaded?
Yes, it's safe. The organizations are already tracked by EF's change tracker, so when Load is called in the next statement, EF knows that the results should be attached to those tracked entities.
Would this make any difference if I loaded the users directly from the DBSet?
In fact, using Load this way is no better than Include!
If you use Include, EF translates it to a LEFT JOIN; if you use Load, it is translated to an INNER JOIN; and if you fetch Users directly by their ids using the Contains method, it is translated to IN on the SQL side.
In the Load and Contains cases you execute two queries (two passes) against SQL, whereas Include does it in one pass, so overall it outperforms your approach.
You can compare these approaches yourself using the SQL Profiler tool.
Update:
Based on the conversation, I realized that Johnny's main issue is simply that no OrganizationUsers object exists. So I suggest changing your approach from DB-first to code-first, so that this object can exist explicitly. See this for help with that.
Another approach that might work is customizing the T4 template, which seems harder but not impossible.
I'm using Entity Framework with a code-first approach to store data for a C# application that works with a SQL Server database. A challenge I'm currently running into involves a structure (approximately) like this:
public class MainEntity
{
// Data
public List<SubEntity> SubEntities { get; private set; }
}
public class SubEntity
{
// More Data
public bool DoNotLoad { get; set; }
}
Now, I know that Entity Framework is able to "see" private property setters and populate the entities using reflection. That's why this works:
IEnumerable<MainEntity> Entities = MainEntities.Include(m => m.SubEntities).ToList();
And it will retrieve the MainEntity and all of its SubEntities from the database even though the setter for SubEntities is private.
I also know that Entity Framework supports more free-form projections, like so:
var projectedEntities = MainEntities.Select(m =>
new {
Main = m,
Sub = m.SubEntities.Where(s => !s.DoNotLoad)
}
);
And then I'll have an anonymous type with the main entity and its sub entities, with a filter applied to the sub entities.
However, I would like to combine the two methods and end up with MainEntity objects that have their SubEntity property populated, but filtered.
Unfortunately, this doesn't work:
var invalidEntities = MainEntities.Select(m =>
new MainEntity{
SubEntities = m.SubEntities.Where(s => !s.DoNotLoad)
}
);
C# doesn't let me use property initialization that way because SubEntities has a private setter, even though Entity Framework would work around that. Is there a way to make this work how I want it? My first priority is to avoid making two queries (e.g. get MainEntity, get filtered SubEntities, use specialized code to insert it), but I would also like to actually do the filtering in the database rather than getting everything and then filtering locally (e.g. MainEntity.FilterSubEntities()). Making the setter public isn't entirely impossible, but in order to use Property initialization I think I would need to change EVERY setter to public, which I would rather avoid.
I've been told that this is possible by projecting into an Anonymous type and if I name things in a certain way Entity Framework will "recognize" that it should project into MainEntity instead, but I haven't been able to find any references to this anywhere else. If that is possible then that would be my preferred method since it seems flexible enough to apply in various other situations where I need to filter in other ways.
I've found it
I've finally found an example of what I was told, and my testing indicates that it works. A few notes about the method:
It does not work to arbitrarily project into private properties or properties with private setters. It only allows for more complex filtering of sub-entities, which fortunately was the primary use I was looking for.
This may be undocumented behavior, or abuse of some other feature, so for all I know it could break at any time, and I can't necessarily state what would happen in any edge-cases that arise.
I don't have any evidence that this is better by any metric than simply filtering locally, and it could be significantly worse. I haven't measured it, and I recognize that this seems to be "odd behavior" that might mess with Entity Framework's normal optimizations. I think it's pretty cool though.
With that out of the way, this is the technique using the sample classes described in the question:
var queryResult = MainEntities.Select(m =>
new {
MainEntity = m,
// An anonymous-type member built from a method call needs an explicit name
SubEntities = m.SubEntities.Where(s => !s.DoNotLoad)
})
.ToList();
var finalList = queryResult.Select(q => q.MainEntity).ToList();
(Doing this on two lines and using a temporary variable isn't strictly necessary, but I think it clarifies that a DB query is executed at the first ToList() and then additional operations are applied locally.)
I believe that this works because Entity Framework populates navigation properties in a particularly eager manner. Essentially it populates the SubEntities List purely by adding every loaded SubEntity that has a foreign key to that MainEntity, regardless of what caused those entities to be loaded. That is speculation though, all I definitely know is that it currently works how I need it to.
If you query the DbSet of a DbContext, the query is valid until the DbContext is disposed. The following will lead to an exception:
IQueryable<Video> allVideos = null;
using (var context = new MyDbContext())
{
allVideos = context.Videos;
}
var firstVideo = allVideos.First();
Apparently the used DbSet is stored somewhere in the returned object that implements the IQueryable.
However, MSDN advises:
When working with Web applications, use a context instance per request.
Of course I could use ToList() and return the result as a list of objects, but this is rather undesirable because I don't know the reason for the query.
Example: Suppose my database has a collection countries, which have cities, which have streets, which have houses, which have families which have persons which have names.
If someone asks for the IQueryable, then it could be that he wants to search for the name of the oldest person living on Downing Street nr 10 in London in the United Kingdom.
If I returned the sequence with a ToList(), all cities, streets, houses, persons, etc would be returned, which would be quite a waste if he only needed the name of this one person. That's the nice thing about deferred execution of Linq.
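Deferred execution can be observed without a database. The sketch below uses LINQ to Objects as a stand-in (an assumption; it is not EF code), and the names `DeferredExecutionDemo`, `NamesStartingWith` and `Inspected` are hypothetical. A counter records how many elements the filter has actually looked at, which shows that building the query does no work and that `First()` stops at the first match.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Counts how many elements the filter predicate has inspected so far.
    public static int Inspected;

    public static IEnumerable<string> NamesStartingWith(IEnumerable<string> names, string prefix)
    {
        // Building the query does not run the predicate yet.
        return names.Where(n =>
        {
            Inspected++;
            return n.StartsWith(prefix, StringComparison.Ordinal);
        });
    }
}

public static class Program
{
    public static void Main()
    {
        var names = new List<string> { "Alice", "Bob", "Anna", "Carl" };
        IEnumerable<string> query = DeferredExecutionDemo.NamesStartingWith(names, "A");

        // Nothing has been inspected yet: the query is only a description of work.
        Console.WriteLine(DeferredExecutionDemo.Inspected);

        // Enumeration starts here and stops as soon as the first match is found.
        string first = query.First();
        Console.WriteLine(first);
        Console.WriteLine(DeferredExecutionDemo.Inspected);
    }
}
```

With an EF-backed IQueryable the analogous effect is that no SQL is sent until the query is enumerated, and the final shape of the query decides how much data is fetched.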
So I can't return ToList(), I have to return the IQueryable.
So what I'd like to do, is open a new DbContext, and somehow tell the query that it should use the new DbContext:
IQueryable<Video> allVideos = null;
using (var context = new MyDbContext())
{
allVideos = context.Videos;
}
// do something else
using (var context = new MyDbContext())
{
// here some code to attach the query to the new context
var firstVideo = allVideos.First();
}
How to do this?
The local guru happened to pass by. He explained to me that the error in my design was that I already use a DbContext while I am only composing the query. My interface should be such that I only need the DbContext when actually materializing the requested objects.
The question was a simplified version of the following:
I have a DbContext with several public DbSet properties. These properties mirror the actual database. I want to hide the actual database implementation behind my abstract database layer in order to protect my data: I don't want to give anyone access to change the contents of the database without a check that the new contents are correct.
This is easy: just don't expose your actual DbContext to the outside world, but expose a facade that hides the actually used DbContext. This facade communicates with the actual DbContext.
With most functions that return an IQueryable I need the DbContext to access the DbSets. That's why I thought to create a context, construct the query and Dispose the context. But because of the deferred execution the context is still needed.
The solution
The solution is not to create your own context, but let the caller construct the DbContext. This constructed DbContext will be one of the parameters of the function. In that case the external user can call several functions of my facade to concatenate the query, even mix with his own Linq queries on the DbContext without creating and disposing the context. So like others suggested:
Caller creates the dbContext
Caller calls several of my functions that return a query, passing the dbContext as a parameter
Caller executes the query by using ToList() / ToArray() / First() / Count() etc.
Caller disposes the context
To make it even nicer, the dbContext parameter is used in an extension method:
public static IQueryable<Video> GetObsoleteVideos(this MyDbContext dbContext)
{
// perform several difficult Linq statements on context
// that will return all obsolete videos
return ...
}
// Extends the query rather than the context, so it can be chained onto another query
public static IQueryable<Video> GetThrillerVideos(this IQueryable<Video> videos)
{
return videos.Where(video => video.Genre == VideoGenre.Thriller);
}
usage:
using (var myContext = new MyDbContext())
{
var difficultQuery = myContext.GetObsoleteVideos()
.Where(video => video.Name == ...)
.GetThrillerVideos()
.Take(10);
// Note: the query is still deferred; execute it now, before disposing myContext
var result = difficultQuery.ToList();
}
This way (and especially if I create an interface) I am able to prohibit access to my DbSets. I am even free to internally reorganize my Db and DbContext without external users noticing anything.
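The chaining idea can be tried without a database. The following is a minimal sketch under stated assumptions: an in-memory list exposed through `AsQueryable()` stands in for `dbContext.Videos`, the `Year` property and the "released before a given year" rule for obsolete videos are invented for illustration, and the method names `WhereObsolete` and `WhereThriller` are hypothetical. Each step extends `IQueryable<Video>` so that the steps can be combined in any order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum VideoGenre { Thriller, Comedy }

public class Video
{
    public string Name { get; set; }
    public VideoGenre Genre { get; set; }
    public int Year { get; set; }   // hypothetical property, used only for this sketch
}

public static class VideoQueries
{
    // Hypothetical rule: a video counts as obsolete if it was released before the given year.
    public static IQueryable<Video> WhereObsolete(this IQueryable<Video> videos, int beforeYear)
    {
        return videos.Where(v => v.Year < beforeYear);
    }

    public static IQueryable<Video> WhereThriller(this IQueryable<Video> videos)
    {
        return videos.Where(v => v.Genre == VideoGenre.Thriller);
    }
}

public static class Program
{
    public static void Main()
    {
        // In-memory stand-in for dbContext.Videos.
        IQueryable<Video> videos = new List<Video>
        {
            new Video { Name = "Old thriller", Genre = VideoGenre.Thriller, Year = 1990 },
            new Video { Name = "New thriller", Genre = VideoGenre.Thriller, Year = 2015 },
            new Video { Name = "Old comedy", Genre = VideoGenre.Comedy, Year = 1985 }
        }.AsQueryable();

        // The query is only composed here; ToList() is what executes it.
        List<Video> result = videos.WhereObsolete(2000).WhereThriller().Take(10).ToList();

        foreach (Video video in result)
            Console.WriteLine(video.Name);
    }
}
```

Against a real context, the caller would start the chain from a DbSet inside its own using block and materialize the result before the context is disposed.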
There are methods in the object context to do this:
var objectContext = ((IObjectContextAdapter)context).ObjectContext;
objectContext.Detach(entity);
objectContext.Attach(entity);
However, as the quote from MSDN says, you should use one instance of the EF context per request. This refers to the HTTP request, not to a single query. When you do several operations in one request, you should not put a using block around each of them; instead, extend the context's lifetime to the whole request. For new requests, it is advisable not to keep state across requests but rather to follow this protocol:
Query the item again and reload (another request might have modified it in the meantime)
Make the modifications
Save
I have two tables in my database: TPM_AREAS and TPM_WORKGROUPS. There exists a many-to-many relationship between these two tables, and these relationships are stored in a table called TPM_AREAWORKGROUPS. This table looks like this:
What I need to do is load all these mappings into memory at once, in the quickest way possible. As TPM_AREAWORKGROUPS is an association, I can't just say:
var foo = (from aw in context.TPM_AREAWORKGROUPS select aw);
I can think of three ways to possibly do this, however I'm not quite sure how to accomplish each of them nor which one is the best.
1) Load in every workgroup, including the associated areas:
Something like:
var allWG = (from w in context.TPM_WORKGROUPS.Include("TPM_AREAS")
where w.TPM_AREAS.Count > 0
select w);
// Loop through this enumeration and manually build a mapping of distinct AREAID/WORKGROUPID combinations.
Pros: This is probably the standard EntityFramework way of doing things, and doesn't require me to change any of the database structure or mappings.
Cons: Could potentially be slow, since the TPM_WORKGROUPS table is rather large and the TPM_AREAWORKGROUPS table only has 13 rows. Plus, there's no TPM_AREAWORKGROUPS class, so I'd have to return a collection of Tuples or make a new class for this.
2) Change my model
Ideally, I'd like a TPM_AREAWORKGROUP class, and a context.TPM_AREAWORKGROUP property. I used the designer to create this model directly from the database, so I'm not quite sure how to force this association to be an actual model. Is there an easy way to do this?
Pros: It would allow me to select directly against this table, done in one line of code. Yay!
Cons: Forces me to change my model, but is this a bad thing?
3) Screw it, use raw SQL to get what I want.
I can get the StoreConnection property of the context, and call CreateCommand() directly. I can then just do:
using (DbCommand cmd = conn.CreateCommand())
{
cmd.CommandText = "SELECT AreaId, WorkgroupId FROM TPM_AREAWORKGROUPS";
var reader = cmd.ExecuteReader();
// Loop through and get each mapping
}
Pros: Fast, easy, doesn't require me to change my model.
Cons: Seems kind of hacky. Everywhere else in the project, we're just using standard Entity Framework code so this deviates from the norm. Also, it has the same issues as the first option; there's still no TPM_AREAWORKGROUPS class.
Question: What's the best solution for this problem?
Ideally, I'd like to do #2 however I'm not quite sure how to adjust my model. Or, perhaps someone knows of a better way than my three options.
You could do:
var result = context
.TPM_WORKGROUPS
.SelectMany(z => z.TPM_AREAS.Select(z2 => new
{
z2.AREAID,
z.WORKGROUPID
}));
The translated SQL will be a simple SELECT AREAID, WORKGROUPID FROM TPM_AREAWORKGROUPS.
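The shape of that SelectMany projection can be checked in memory. This is a sketch only, using LINQ to Objects and simplified hypothetical classes (`Area`, `Workgroup`, `AreaWorkgroupPair`) in place of the generated TPM_ entities; it shows how each workgroup's areas are flattened into one list of ID pairs and how workgroups without areas drop out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Area
{
    public int AreaId { get; set; }
}

public class Workgroup
{
    public int WorkgroupId { get; set; }
    public List<Area> Areas { get; } = new List<Area>();
}

// Hypothetical result type standing in for the missing association class.
public class AreaWorkgroupPair
{
    public int AreaId { get; set; }
    public int WorkgroupId { get; set; }
}

public static class MappingQueries
{
    // Flattens every (workgroup, area) combination into one pair per association row.
    public static List<AreaWorkgroupPair> GetPairs(IEnumerable<Workgroup> workgroups)
    {
        return workgroups
            .SelectMany(w => w.Areas.Select(a => new AreaWorkgroupPair
            {
                AreaId = a.AreaId,
                WorkgroupId = w.WorkgroupId
            }))
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var workgroups = new List<Workgroup>
        {
            new Workgroup { WorkgroupId = 1, Areas = { new Area { AreaId = 10 }, new Area { AreaId = 11 } } },
            new Workgroup { WorkgroupId = 2 }   // no areas, so it contributes no pairs
        };

        foreach (AreaWorkgroupPair pair in MappingQueries.GetPairs(workgroups))
            Console.WriteLine(pair.AreaId + " / " + pair.WorkgroupId);
    }
}
```

Workgroups with no associated areas produce nothing, which matches the goal of reading only the rows of the association table.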
About other options:
I wouldn't use option 3) because I personally avoid raw SQL as much as possible when using Entity Framework (see https://stackoverflow.com/a/8880157/870604 for some reasons).
I wouldn't use option 2) because you would have to change your model, and there is a simple and efficient way that avoids changing it.
What about using a projection to load the data?
You could use it to fill an anonymous object and then work with that however you like.
I have an ASP.NET MVC application coded with C#. The application is structured this way:
Controller
Repository
LINQ to Entities (Entity Framework)
View
I use the Repository (_ProductRep) to query LINQ to Entities and give the Controller actual entities or a List<T>, not an IQueryable<T>.
I would like to have some help about a situation where I have more than a doubt. I have the following code:
List<Monthly_Report> lproduct_monthlyReport = _ProductRep.GetArchiveReport(product.Prod_ID, lmonth, lyear);
After I get this lproduct_monthlyReport I need to query it inside a foreach and get a specific record. Currently I implemented the solution like this:
foreach (var item in litemList)
{
var lproductItem_monthlyReport = lproduct_monthlyReport.Single(m => m.Item_ID == item.Item_ID);
// Other code
}
Where litemList is the list of all the possible items a product can have.
I wanted to know whether this solution significantly increases the coupling (and violates the Law of Demeter), or whether it is acceptable because I am actually querying a List<T> and not an IQueryable<T>. Correct me if I am wrong, but I guess that since the List does not need to access the EF DataContext, there is no coupling between the Controller and EF.
In case I am wrong, the only solution I can think about is to substitute the query with a Repository method (that still I have to implement):
var lproductItem_monthlyReport = _ProductRep.GetArchiveReport(product.Prod_ID, lmonth, lyear, item.Item_ID);
With this solution, however, the Repository makes one query with four conditions on every loop cycle, whereas in the previous solution the repository was making a query with just one condition.
May you please enlighten me on this issue? Thanks.
PS: I need both variables lproduct_monthlyReport and lproductItem_monthlyReport inside the loop, I cannot just use one of them
PPS: I know that I should have a Business Service Layer between Controller and Repository, it is my next step.
Returning Lists from your repository can give you awful performance, because you lose the deferred execution behaviour. Basically your repository will retrieve every single record (but not related entities) into memory and turn them into a List, which then gets processed in memory. If you want to access a related entity, it'll need another database hit. If you stick with IQueryable, further filtering is still translated to SQL, and you keep advantages like lazy loading and deferred execution. (An IEnumerable is also deferred, but any operators applied to it afterwards run in memory.)
Ignoring the specifics of your Repository for now, if you do this:
List<Product> products = MyEntities.Products.ToList();
Product product1 = products.Single(p => p.Id == 1);
it will perform much worse than this:
IQueryable<Product> products = MyEntities.Products;
Product product1 = products.Single(p => p.Id == 1);
The first one will perform a SELECT in the database with no WHERE clause, then instantiate .NET objects for every result, then query that in-memory list. The second does nothing until Single is called, and at that point issues a database command that retrieves just the one product and instantiates only that product. Note that the variable must be typed as IQueryable<Product> for this to hold: if it is typed as IEnumerable<Product>, Single binds to the LINQ to Objects version and the filter runs in memory over the whole table.
The difference between the two may not be noticeable with small data sets, but it gets worse and worse as the data set grows. Throw in a connected entity (or worse still, an entity collection) and you'll potentially get thousands of database hits, where with IQueryable you'd get one.
I would probably have a function like GetArchiveReport(int prodID, int lmonth, int lyear, IEnumerable<int> itemIDs) that does an itemIDs.Contains(tbl.ID) inside the query:
var SelectedReports = _ProductRep.GetArchiveReport(product.Prod_ID, lmonth, lyear, litemList.Select(item => item.Item_ID));
foreach(var prodItem in SelectedReports)
{
//Do code
}
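The batched overload suggested above might look roughly like this. It is a sketch under stated assumptions: it filters an in-memory sequence instead of an EF query, and the `MonthlyReport` class, its property names, and the extra `source` parameter are hypothetical stand-ins for the real entity and repository. The point is that all item IDs are passed in one call, so the loop afterwards needs no further lookups.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for the Monthly_Report entity.
public class MonthlyReport
{
    public int ProdId { get; set; }
    public int Month { get; set; }
    public int Year { get; set; }
    public int ItemId { get; set; }
}

public static class ArchiveReports
{
    // Returns the reports for one product and period, restricted to the requested items.
    public static List<MonthlyReport> GetArchiveReport(
        IEnumerable<MonthlyReport> source, int prodId, int month, int year, IEnumerable<int> itemIds)
    {
        // A set makes the in-memory membership test cheap.
        var wanted = new HashSet<int>(itemIds);

        return source
            .Where(r => r.ProdId == prodId && r.Month == month && r.Year == year)
            .Where(r => wanted.Contains(r.ItemId))
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var reports = new List<MonthlyReport>
        {
            new MonthlyReport { ProdId = 1, Month = 5, Year = 2012, ItemId = 100 },
            new MonthlyReport { ProdId = 1, Month = 5, Year = 2012, ItemId = 101 },
            new MonthlyReport { ProdId = 2, Month = 5, Year = 2012, ItemId = 100 }
        };

        var selected = ArchiveReports.GetArchiveReport(reports, 1, 5, 2012, new[] { 100, 999 });
        foreach (MonthlyReport report in selected)
            Console.WriteLine(report.ProdId + " / " + report.ItemId);
    }
}
```

In the real repository the same two filters would be applied to the EF query, where Contains on the ID collection is translated to IN on the SQL side, as mentioned earlier.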