I'm new to Orchard and this must be something involving how the underlying data is stored.
The joining with CommonPart seems fast enough, like this:
var items = _contentManager.Query<MyUserPart, MyUserPartRecord>("someTypeName")
.ForVersion(VersionOptions.Published)
.Join<CommonPartRecord>().List().ToList();
That runs fairly fast. But whenever I try to access a field in CommonPart, it becomes extremely slow, like this:
var items = _contentManager.Query<MyUserPart, MyUserPartRecord>("someTypeName")
.ForVersion(VersionOptions.Published)
.Join<CommonPartRecord>().List()
//access some field from commonpart
.Select(e => new {
User = e.As<CommonPart>().Owner.UserName
}).ToList();
There are only about 1200 items in total, yet this takes about 5 seconds, which seems far too slow. A simple SQL query run in the background should take about 0.5 seconds or even less.
I've tried investigating Orchard's source code but found nothing that could be the issue. Everything seems to go into a black box at the point where IContent is accessed. I hope someone here can suggest how to diagnose and solve this. Thanks!
Update:
I've tried debugging a bit and seen that the following method is hit inside the DefaultContentManager:
ContentItem New(string contentType) { ... }
Well, that's really interesting: the query is just asking for data, without modifying, inserting, or updating anything. That method being hit suggests that something is wrong here.
Update:
Following @Bertrand Le Roy's comment, I've tried the following code with QueryHints, but it does not seem to change anything:
var items = _contentManager.Query<MyUserPart, MyUserPartRecord>("someTypeName")
.ForVersion(VersionOptions.Published)
.Join<CommonPartRecord>()
.WithQueryHints(new QueryHints().ExpandParts<CommonPart>())
.List()
//access some field from commonpart
.Select(e => new {
User = e.As<CommonPart>().Owner.UserName
}).ToList();
and this (without .Join)
var items = _contentManager.Query<MyUserPart, MyUserPartRecord>("someTypeName")
.ForVersion(VersionOptions.Published)
.WithQueryHints(new QueryHints().ExpandParts<CommonPart>())
.List()
//access some field from commonpart
.Select(e => new {
User = e.As<CommonPart>().Owner.UserName
}).ToList();
Accessing the Owner property from your Select causes the lazy loader in CommonPartHandler to ask the content manager to load the user content item: _contentManager.Get<IUser>(part.Record.OwnerId). This happens once per content item returned by your query, so it results in a select n+1 problem, where n = 1200 according to your question.
There are at least two ways of avoiding that:
You can use HQL and craft a query that gives you everything you need up front in 1 operation.
You can make a first content manager query to get the set of owner ids, and then make a second content manager query for those ids, getting everything you need with a total of 2 queries instead of 1201.
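The second option can be sketched without Orchard at all. The snippet below is a minimal illustration, not Orchard code: `Item`, `User`, `LoadUsers` and the `RoundTrips` counter are hypothetical stand-ins for the first query's results and for whatever id-based bulk load your content manager offers. It only shows the shape of the fix: collect the distinct owner ids, load them in one call, then resolve owners from an in-memory dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OwnerBatchingSketch
{
    // Hypothetical stand-ins for the query results; these are not Orchard types.
    public record Item(int Id, int OwnerId);
    public record User(int Id, string UserName);

    // Counts simulated database round trips for user loading.
    public static int RoundTrips;

    // Fake "user table" used only for this illustration.
    static readonly Dictionary<int, User> UserTable = new()
    {
        [1] = new User(1, "alice"),
        [2] = new User(2, "bob"),
    };

    // Stand-in for a single bulk query: one round trip for any number of ids.
    public static List<User> LoadUsers(IEnumerable<int> ids)
    {
        RoundTrips++;
        return ids.Where(id => UserTable.ContainsKey(id))
                  .Select(id => UserTable[id])
                  .ToList();
    }

    public static List<string> OwnerNames(IReadOnlyList<Item> items)
    {
        // Step 1: collect the distinct owner ids from the first query's results.
        var ownerIds = items.Select(i => i.OwnerId).Distinct().ToList();
        // Step 2: load all owners at once and index them by id.
        var owners = LoadUsers(ownerIds).ToDictionary(u => u.Id);
        // Step 3: resolve each item's owner in memory (throws if an owner is missing).
        return items.Select(i => owners[i.OwnerId].UserName).ToList();
    }

    public static void Main()
    {
        var items = new List<Item> { new(10, 1), new(11, 2), new(12, 1) };
        Console.WriteLine(string.Join(", ", OwnerNames(items)));
        Console.WriteLine($"user queries: {RoundTrips}");
    }
}
```

The important part is that the lazy `Owner` property is never touched inside the projection, so the number of user lookups stays at one regardless of how many items the first query returns.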
Related
Alright, this one will have quite a bit of code in it, and may be somewhat long. That said, I'm really having trouble finding the best way to do this, and any help would be immensely appreciated. I am here to learn, specifically the most efficient way to write these types of queries.
NOTE: The system I'm working with right now is an ASP.NET MVC site written in VB.NET, using Entity Framework 6.
With all that out of the way, I believe my best bet here is to just show an example, and see what options are available. My issue comes in to play when I need to populate an object's properties utilizing a LINQ to Entities query hitting the database, but some of those properties are lists themselves of both simple integers, as well as more complex objects. For example, take this query:
Note that in the code example below I removed a complex WHERE clause and some other irrelevant to the question properties I populate to make it smaller and easier to read.
//NOTE: If any context is needed to better understand this, I can provide it. I do provide a bit in the comments. Actual code uses ' for comments (VB).
/*Context: db is an instance of the EF context. Code is representing threads
and replies in an "activity feed". Threads can have private recipients and
uploads, both of which are grabbed from the database. These are the two
properties I'm populating which this question addresses further down.*/
//Get all my activity feed threads
model.ActivityFeedThreads = (From thds In db.tblThreads
From thps In db.tblThreadParticipants.Where(Function(w) w.thpThread_thdID = thds.thdID).DefaultIfEmpty()
Join prns In db.tblPersons On thds.thdOwner_prnID Equals prns.prnID
Join emps In db.tblEmployees On prns.prnID Equals emps.empPerson_prnID
Where (thps Is Nothing And thds.thdPrivacy_lvlID = ActivityThread.ActivityThreadPrivacy.PublicThread) _
Select New ActivityThread With {.thdID = thds.thdID,
.Content = thds.thdContent,
.ThreadOwner_prnID = thds.thdOwner_prnID,
.PrivacyOption_lvlID = thds.thdPrivacy_lvlID,
.ThreadDate = thds.thdDateCreated}).Distinct().Take(20).OrderByDescending(Function(o) o.ThreadDate).ToList()
Now, one property which I need to populate that is not included in that query is "AttachedFiles", which is a list of type "AppFile", a custom class in our system. In addition, there is a second property called "PrivateRecipients", which is a list of simple Int32s. I have tried to work those in to the above query without success, so I have to loop through the resulting list and populate them, resulting in numerous hits to the database. Below is my current code:
//I need to get the private recipients for this post.. Is there a way to work this in to the above query? What's the most efficient solution here?
For Each threadToEdit In model.ActivityFeedThreads
threadToEdit.PrivateRecipients = db.tblThreadParticipants.Where(Function(w) w.thpThread_thdID = threadToEdit.thdID).Select(Function(s) s.thpPerson_prnID).ToList()
Next
//Again, loop through, grab all attached files. Similar situation as above.. Can I work it in to the query?
For Each threadToEdit In model.ActivityFeedThreads
threadToEdit.AttachedFiles = (From flks In db.tblFileLinks
Join fils In db.tblFiles On flks.flkFile_filID Equals fils.filID
Where flks.flkTarget_ID = threadToEdit.thdID And flks.flkLinkType_lvlID = AppFile.FileLinkType.Thread_Upload
Select New AppFile With {.filID = fils.filID,
.Location = fils.filLocation,
.FileURL = fils.filLocation,
.Name = fils.filName,
.FileName = fils.filName,
.MIMEType = fils.filMIMEType}).ToList()
Next
As you can see, in both situations I'm having to loop through and hit the database a bunch of times, and I have to imagine that as the data grows this will become a bit of a bottleneck. Is there a better way to go about this with LINQ to Entities? Should I change my approach altogether?
Thank you in advance for your time/help, it's greatly appreciated!
We have an object with nested properties which we want to make easily searchable. This has been simple enough to achieve, but we also want to aggregate information based on multiple fields. In terms of the domain, we have multiple deals which have the same details with the exception of the seller. We need to consolidate these as a single result and show seller options on the following page. However, we still need to be able to filter based on the seller on the initial page.
We attempted something like the below, to try to collect multiple sellers on a row but it contains duplicates and the creation of the index takes forever.
Map = deals => deals.Select(deal => new
{
Id = deal.ProductId,
deal.ContractLength,
Provider = deal.Provider.Id,
Amount = deal.Amount
});
Reduce = deals => deals.GroupBy(result => new
{
result.ProductId,
result.ContractLength,
result.Amount
}).Select(result => new
{
result.Key.ProductId,
result.Key.ContractLength,
Provider = result.Select(x => x.Provider).Distinct(),
result.Key.Amount
});
I'm not sure this is the best way to handle this problem, but I'm fairly new to Raven and struggling for ideas. If we keep the index simple and group on the client side, then we can't keep paging consistent.
Any ideas?
You are grouping on the document id, deal.Id, so you'll never actually generate a reduction across multiple documents.
I don't think that this is intended.
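To see why the grouping key matters, here is a small LINQ to Objects sketch. It is not a Raven index, and the `Deal` record and sample values are made up for illustration; it only shows that grouping on a value that is unique per document produces one group per document, while grouping on the shared fields consolidates deals that differ only by seller.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupingKeySketch
{
    // Hypothetical shape of a deal; field names loosely follow the question.
    public record Deal(string Id, string ProductId, int ContractLength, decimal Amount, string Provider);

    public static readonly List<Deal> Deals = new()
    {
        new Deal("deals/1", "p1", 12, 10m, "sellerA"),
        new Deal("deals/2", "p1", 12, 10m, "sellerB"),
        new Deal("deals/3", "p2", 24, 15m, "sellerA"),
    };

    // Grouping on a unique id: every group contains exactly one deal.
    public static int GroupCountByDocumentId() =>
        Deals.GroupBy(d => d.Id).Count();

    // Grouping on the shared fields: deals/1 and deals/2 collapse into one group.
    public static int GroupCountBySharedFields() =>
        Deals.GroupBy(d => new { d.ProductId, d.ContractLength, d.Amount }).Count();

    public static void Main()
    {
        Console.WriteLine(GroupCountByDocumentId());   // 3
        Console.WriteLine(GroupCountBySharedFields()); // 2
    }
}
```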
This is not a good approach here! Can anyone say why?
var dbc= new SchoolContext();
var a=dbc.Menus.ToList().Select(x=> new {
x.Type.Name,
ListOfChildmenus = x.ChildMenu.Select(cm=>cm.Name),
ListOfSettings = x.Settings.SelectMany(set=>set.Role)
});
Because when you call .ToList() or .FirstOrDefault() and so on (when you enumerate), your query will get executed.
So when you do dbc.Menus.ToList() you bring in memory from the database all your Menus, and you didn't want that.
You want to bring into memory only what you select (the list of child menus and the list of settings).
Relevant further reading: http://www.codeproject.com/Articles/652556/Can-you-explain-Lazy-Loading - you are probably using lazy loading
And if you want to add a filter to your IQueryable, you may want to read about the difference between IEnumerable and IQueryable: http://blog.micic.ch/net/iqueryable-vs-ienumerable-vs-ihaveheadache
And some dynamic filtering: https://codereview.stackexchange.com/questions/3560/is-there-a-better-way-to-do-dynamic-filtering-and-sorting-with-entity-framework
Actually Razvan's answer isn't totally accurate. What happens in your query is this:
When you call ToList() the contents of the entire table get dumped into memory.
When you access navigation properties such as ChildMenu and Settings a new query is generated and run for each element in that table.
If you'd done it like so:
dbc.Menus
.Select(x=> new {
x.Type.Name,
ListOfChildmenus = x.ChildMenu.Select(m=>m.Name),
ListOfSettings = x.Settings.SelectMany(z=>z.Role)
})
.ToList()
your whole structure would have been generated in one query and one round trip to the database.
Also, as Alex said in his comment, it's not necessarily a bad approach. For instance if your database is under a lot of load it's sometimes better to just dump things in the web application's memory and work with them there.
I am using ASP NET MVC 4.5 and EF6, code first migrations.
I have this code, which takes about 6 seconds.
var filtered = _repository.Requests.Where(r => some conditions); // this is fast, conditions match only 8 items
var list = filtered.ToList(); // this takes 6 seconds, has 8 items inside
I thought that this is because of relations, it must build them inside memory, but that is not the case, because even when I return 0 fields, it is still as slow.
var filtered = _repository.Requests.Where(r => some conditions).Select(e => new {}); // this is fast, conditions match only 8 items
var list = filtered.ToList(); // this takes still around 5-6 seconds, has 8 items inside
Now the Requests table is quite complex, lots of relations and has ~16k items. On the other hand, the filtered list should only contain proxies to 8 items.
Why is ToList() method so slow? I actually think the problem is not in ToList() method, but probably EF issue, or bad design problem.
Anyone has had experience with anything like this?
EDIT:
These are the conditions:
_repository.Requests.Where(r => ids.Any(a => a == r.Student.Id) && r.StartDate <= cycle.EndDate && r.EndDate >= cycle.StartDate)
So basically, I am checking whether the Student id is in my id list and whether the dates overlap.
Your filtered variable contains a query which is a question, and it doesn't contain the answer. If you request the answer by calling .ToList(), that is when the query is executed. And that is the reason why it is slow, because only when you call .ToList() is the query executed by your database.
This is called deferred execution. A quick search should give you more information about it.
If you show some of your conditions, we might be able to say why it is slow.
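Deferred execution is easy to observe with plain LINQ to Objects. The sketch below is only an illustration (the `PredicateCalls` counter and the sample range are made up, and there is no Entity Framework involved): building the query runs nothing, and all the work is charged to the `ToList()` call. With an EF `IQueryable` the work happens in the database rather than in a lambda, but the timing effect is the same.

```csharp
using System;
using System.Linq;

public static class DeferredExecutionSketch
{
    // Counts how many times the filter predicate actually runs.
    public static int PredicateCalls;

    public static (int before, int after, int count) Run()
    {
        PredicateCalls = 0;
        var source = Enumerable.Range(1, 100);

        // Defining the query does not evaluate the predicate.
        var filtered = source.Where(n => { PredicateCalls++; return n % 10 == 0; });
        int before = PredicateCalls;

        // Enumerating the query (here via ToList) is what executes it.
        var list = filtered.ToList();
        return (before, PredicateCalls, list.Count);
    }

    public static void Main()
    {
        var (before, after, count) = Run();
        Console.WriteLine($"calls before ToList: {before}"); // 0
        Console.WriteLine($"calls after ToList: {after}");   // 100
        Console.WriteLine($"matching items: {count}");       // 10
    }
}
```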
In addition to Maarten's answer, I think the problem is one of two different situations:
The condition is complex and results in complex, heavy joins or queries in your database.
The condition filters on a column that does not have an index, which causes a full table scan and makes your query slow.
I suggest you start by monitoring the query generated by Entity Framework. It's very simple: you just need to set the Log property of your context and look at the output:
using (var context = new MyContext())
{
context.Database.Log = Console.Write;
// Your code here...
}
If you see something strange in the generated query, try to improve it by breaking it into parts; sometimes the queries generated by Entity Framework are not very good.
If the query is okay, then the problem lies in your database (assuming there is no network problem).
Run your query with a SQL profiler and check what's wrong.
UPDATE
I suggest you:
add an index on the StartDate and EndDate columns in your table (one for each, not one covering both)
ToList executes the query against the DB, while the first line does not.
Can you show some conditions code here?
To increase the performance you need to optimize query/create indexes on the DB tables.
Your first line of code only returns an IQueryable. This is a representation of a query that you want to run, not the result of the query. The query itself only runs on the database when you call .ToList() on your IQueryable, because that is the first point at which you have actually asked for data.
Your adjustment to add the .Select only adds to the existing IQueryable query definition. It doesn't change what conditions have to execute. You have essentially changed the following, where you get back 8 records:
select * from Requests where [some conditions];
to something like:
select '' from Requests where [some conditions];
You will still have to perform the full query with the conditions giving you 8 records, but for each one, you only asked for an empty string, so you get back 8 empty strings.
The long and the short of this is that any performance problem you are having is coming from your "some conditions". Without seeing them, it is difficult to know. But I have seen people in the past add .Where clauses inside a loop before calling .ToList(), inadvertently creating a massively complicated query.
Jaanus, the most likely cause of this issue is the complexity of the SQL query generated by Entity Framework. I guess that your filter condition contains some checks against other tables.
Try checking the generated query with "SQL Server Profiler". Then copy this query to "Management Studio" and check the "Estimated execution plan". As a rule, "Management Studio" generates index recommendations for your query; try to follow these recommendations.
I thought I would be clever and write something like this code sample. It also seemed like a clean and efficient way to fill an array without enumerating a second time.
int i = 0;
var tickers = new List<string>();
var resultTable = results.Select(result => new Company
{
Ticker = tickers[i++] = result.CompanyTicker,
});
I don't really care for an alternative way to do this, because I can obviously accomplish this easily with a for loop. I'm more interested in why this snippet doesn't work, i.e., tickers.Count = 0 after the code runs, despite there being 100+ results. Can anyone tell me why I'm getting this unexpected behavior?
You need to iterate your query, for example use .ToArray() or ToList() at the end. Currently you just created a query, it hasn't been executed yet.
You may see: LINQ and Deferred Execution
Plus, I believe your code should throw an exception (an ArgumentOutOfRangeException from the List indexer) once it does run, since your List doesn't have any items.
This is due to LINQ's lazy execution. When the query gets executed (i.e. when you iterate over it), the list should have your results. An easy way to do this is to use ToArray or ToList.
LINQ should ideally not have side effects.
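A small sketch of why side effects inside a query are fragile. This is an illustration with made-up sample data, and it uses `tickers.Add` rather than the indexer from the question so that it runs without throwing: the side effect does not happen until the query is enumerated, and it happens again every time the query is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SideEffectSketch
{
    public static (int beforeEnumeration, int afterOnePass, int afterTwoPasses) Run()
    {
        var tickers = new List<string>();
        var source = new[] { "AAA", "BBB", "CCC" };

        // The lambda mutates 'tickers' as a side effect of projecting each element.
        var query = source.Select(s => { tickers.Add(s); return s.ToLowerInvariant(); });

        int before = tickers.Count; // nothing has run yet
        query.ToList();             // first enumeration runs the side effect 3 times
        int one = tickers.Count;
        query.ToList();             // second enumeration runs it 3 more times
        int two = tickers.Count;
        return (before, one, two);
    }

    public static void Main()
    {
        var (before, one, two) = Run();
        Console.WriteLine($"{before}, {one}, {two}"); // 0, 3, 6
    }
}
```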
I don't see what would prevent this from being a two step process:
var tickers = results.Select(r => r.CompanyTicker).ToList();
var resultTable = tickers.Select(t => new Company { Ticker = t }).ToList();