Most Efficient EF Ordering for Multiple Columns in LocalDb - c#

What is the most efficient way to order a LocalDb table in descending order by four columns? I have a table that tracks a file storage hierarchy. Four folders act like an odometer (one digit for each folder). The table reflects this as a "storage item." I need to find the highest number using all four folders.
Here is the code I am currently using. I am worried that it is not efficient or accurate for a LocalDb database...
public StorageItem GetLastItem()
{
var item = _context.StorageItems.AsNoTracking()
.OrderByDescending(x => x.LevelA) // int
.OrderByDescending(x => x.LevelB) // int
.OrderByDescending(x => x.LevelC) // int
.OrderByDescending(x => x.ItemNumber) // int
.Where(x => !x.AuditDateDeleted.HasValue) // DateTime?
.FirstOrDefault();
// Caching logic here
return item;
}

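I don't think it'll be inefficient, but chaining several OrderByDescending calls is probably not what you intended. Each OrderByDescending starts a new ordering, so the last one wins as the primary key. Currently, this should sort as if by ItemNumber DESC, LevelC DESC, LevelB DESC, LevelA DESC (or, depending on the provider, by ItemNumber DESC alone), which puts the least significant "digit" first. I think you want to use ThenByDescending...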
var item = _context.StorageItems.AsNoTracking()
.Where(x => !x.AuditDateDeleted.HasValue)
.OrderByDescending(x => x.LevelA)
.ThenByDescending(x => x.LevelB)
.ThenByDescending(x => x.LevelC)
.ThenByDescending(x => x.ItemNumber)
.FirstOrDefault();
I also moved the Where clause higher up, although I think the database should be smart enough to optimize that either way.
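To make the difference concrete, here is a minimal in-memory sketch (LINQ to Objects, not EF or LocalDb). StorageItemRow and OrderingDemo are hypothetical stand-ins that model only the four sort keys; a database provider may translate the chained version differently, but ItemNumber still ends up as the leading key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the StorageItem entity; only the sort keys are modeled.
public record StorageItemRow(int LevelA, int LevelB, int LevelC, int ItemNumber);

public static class OrderingDemo
{
    // Chained OrderByDescending: every call re-sorts the sequence,
    // so the LAST key (ItemNumber) becomes the primary sort key.
    public static StorageItemRow LastByChainedOrderBy(IEnumerable<StorageItemRow> rows) =>
        rows.OrderByDescending(x => x.LevelA)
            .OrderByDescending(x => x.LevelB)
            .OrderByDescending(x => x.LevelC)
            .OrderByDescending(x => x.ItemNumber)
            .FirstOrDefault();

    // OrderByDescending + ThenByDescending: LevelA is the primary key and
    // the later keys only break ties, like the digits of an odometer.
    public static StorageItemRow LastByThenBy(IEnumerable<StorageItemRow> rows) =>
        rows.OrderByDescending(x => x.LevelA)
            .ThenByDescending(x => x.LevelB)
            .ThenByDescending(x => x.LevelC)
            .ThenByDescending(x => x.ItemNumber)
            .FirstOrDefault();

    public static void Main()
    {
        var rows = new List<StorageItemRow>
        {
            new(1, 9, 9, 99), // high ItemNumber, but in a lower top-level folder
            new(2, 0, 0, 1),  // the real "highest" item in odometer order
        };

        Console.WriteLine(LastByChainedOrderBy(rows)); // picks the (1, 9, 9, 99) row
        Console.WriteLine(LastByThenBy(rows));         // picks the (2, 0, 0, 1) row
    }
}
```

With these two rows the chained version returns the item with the largest ItemNumber, while the ThenByDescending version returns the item in the highest top-level folder, which is what the odometer analogy calls for.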

Related

How to sort something in LINQ based on many dates?

Hello, this is a LINQ query, but it doesn't sort properly because four different dates are involved.
var EventReportRemarks = (from i in _context.pm_main_repz
.Include(a => a.PM_Evt_Cat)
.Include(b => b.department)
.Include(c => c.employees)
.Include(d => d.provncs)
where i.department.DepartmentName == "Finance"
orderby i.English_seen_by_executive_on descending
orderby i.Brief_seen_by_executive_on descending
orderby i.French_seen_by_executive_on descending
orderby i.Russian_seen_by_executive_on descending
select i).ToList();
All I want is for it to combine the four dates and sort on them together, not one by one.
For example, at the moment it sorts all English reports by the date the executive saw them, then the Brief reports, and so on.
But I want it to check which one was seen first, and so on. For example, if the first report seen is French, then Brief, then English, then Russian, it should sort accordingly.
Is that possible?
You need to have them all in one column. Here is the approach I would take, assuming the respective cells are null when you don't want them to take part in the ordering:
var EventReportRemarks = (from i in _context.pm_main_repz
.Include(a => a.PM_Evt_Cat)
.Include(b => b.department)
.Include(c => c.employees)
.Include(d => d.provncs)
where i.department.DepartmentName == "Finance"
select new
{
Date =
(
i.English_seen_by_executive_on != null ? i.English_seen_by_executive_on :
i.Brief_seen_by_executive_on != null ? i.Brief_seen_by_executive_on :
i.French_seen_by_executive_on != null ? i.French_seen_by_executive_on :
i.Russian_seen_by_executive_on
)
}).ToList().OrderBy(a => a.Date);
In the select clause you can add more columns if you wish.
Reference taken from here.
Why not just use .Min() or .Max() on the dates and then .OrderBy() or .OrderByDescending() based on that?
The idea is to create a new enumerable (here, an array) holding the four dates of the current row and compute its Max/Min, which gives the latest/earliest of the four. Then order the records by that value.
var EventReportRemarks = (from i in _context.pm_main_repz
.Include(a => a.PM_Evt_Cat)
.Include(b => b.department)
.Include(c => c.employees)
.Include(d => d.provncs)
where i.department.DepartmentName == "Finance"
select i)
.OrderBy(i => new[]{
i.English_seen_by_executive_on,
i.Brief_seen_by_executive_on,
i.French_seen_by_executive_on,
i.Russian_seen_by_executive_on
}.Max())
.ToList();
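As a rough in-memory illustration of this idea, the sketch below orders rows by the earliest of four nullable dates. ReportRow, FirstSeen and OrderByFirstSeen are hypothetical names, and the choice to push rows without any date to the end is an assumption of the sketch. Whether an array-based Min/Max translates to SQL depends on the provider, so this version runs on objects already in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a report row with four nullable "seen on" dates.
public record ReportRow(string Name, DateTime? English, DateTime? Brief, DateTime? French, DateTime? Russian);

public static class EarliestSeenDemo
{
    // Earliest non-null date of the four, or null when none is set.
    // Enumerable.Min over DateTime? values skips nulls.
    public static DateTime? FirstSeen(ReportRow r) =>
        new[] { r.English, r.Brief, r.French, r.Russian }.Min();

    // Orders by the earliest date; rows with no date at all go last (assumption).
    public static List<ReportRow> OrderByFirstSeen(IEnumerable<ReportRow> rows) =>
        rows.OrderBy(r => FirstSeen(r) ?? DateTime.MaxValue).ToList();

    public static void Main()
    {
        var rows = new List<ReportRow>
        {
            new("B", null, new DateTime(2020, 2, 1), null, null),
            new("C", null, null, null, null),
            new("A", new DateTime(2020, 3, 1), null, new DateTime(2020, 1, 5), null),
        };

        foreach (var row in OrderByFirstSeen(rows))
            Console.WriteLine(row.Name); // A, then B, then C
    }
}
```

Swapping Min for Max (and OrderBy for OrderByDescending) gives the "most recently seen first" variant.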
Your problem is not a problem if you use method syntax for your LINQ query instead of query syntax.
var EventReportRemarks = _context.pm_main_repz
.Where(rep => rep.Department.DepartmentName == "Finance")
.OrderByDescending(rep => rep.English_seen_by_executive_on)
.ThenByDescending(rep => rep.Brief_seen_by_executive_on)
.ThenByDescending(rep => rep.French_seen_by_executive_on)
.ThenByDescending(rep => rep.Russian_seen_by_executive_on)
.Select(rep => ...);
Optimization
One of the slower parts of a database query is the transport of selected data from the DBMS to your local process. Hence it is wise to limit the transported data to values you actually plan to use.
You transport way more data than you need to.
For example: every pm_main_repz (my, you do love to use easy identifiers for your items, don't you?) has zero or more Employees. Every Employee belongs to exactly one pm_main_repz using a foreign key like pm_main_repzId.
If you use Include to transport pm_main_repz 4 with its 1000 Employees, every Employee will have a pm_main_repzId with value 4. You'll transport this value 1001 times, while once would have been enough.
Always use Select to select data from the database, and select only the properties you actually plan to use. Only use Include if you plan to update the fetched objects.
Consider using a proper Select where you only select the items that you actually plan to use:
.Select(rep => new
{
// only Select the rep properties you actually plan to use:
Id = rep.Id,
Name = rep.Name,
...
Employees = rep.Employees.Select(employee => new
{
// again: select only the properties you plan to use
Id = employee.Id,
Name = employee.Name,
// not needed: foreign key to pm_main_repz
// pm_main_repzId = employee.pm_main_repzId,
})
.ToList(),
Department = new
{
Id = rep.Department.Id,
...
}
// etc. for PM_Evt_Cat and provncs
});

C# change from groupby

Suppose Myclass has two properties: Date and Symbol.
I frequently need to switch between grouping by one property and grouping by the other, but I find that
for a List<Myclass> vector,
if I use
vector.GroupBy(o => o.Date).Select(o => o)
the result is no longer a List<Myclass>; it is a sequence of IGrouping<string, Myclass>.
And if I want to convert GroupBy(o => o.Date) to GroupBy(o => o.Symbol),
I have to use
vector.GroupBy(o => o.Date).SelectMany(o => o).GroupBy(o => o.Symbol)
I tried to use SortedList<Date, Myclass>, but I am not familiar with SortedList (actually, I don't know the difference between SortedList and GroupBy).
Is there an efficient way to achieve this? I depend heavily on running speed.
int volDay = 100;
DateTime today = new DateTime(2012, 1, 1);
//choose the effective database used today, that is the symbol with data more than volDay
var todayData = dataBase.Where(o => o.Date <= today).OrderByDescending(o => o.Date)
.GroupBy(o => o.Symbol).Select(o => o.Take(volDay))
.Where(o => o.Count() == volDay).SelectMany(o => o);
//Select symbols we want today
var symbolList = todayData
.Where(o => o.Date == today && o.Eqy_Dvd_Yld_12M > 0)
.OrderByDescending(o => o.CUR_MKT_CAP)
.Take((int)(1.5 * volDay)).Where(o => o.Close > o.DMA10)
.OrderBy(o => o.AnnualizedVolatility10)
.Take(volDay).Select(o => o.Symbol).ToList();
//Select the database again only for the symbols in symbolList
var portfolios = todayData.GroupBy(o => o.Symbol)
.Where(o=>symbolList.Contains(o.Key)).ToList();
This is my real code. dataBase is the full data set, and I will run this loop day by day (here a fixed day is given). The last list, portfolios, is the final result I want to obtain. You can ignore the other properties, which are only used for the selections within the Date and Symbol collections.
It may be faster, or at least easier to read, if you performed a .Distinct().
To get distinct Dates:
var distinctDates = vector.Select(o => o.Date).Distinct()
To get distinct Symbols:
var distinctSymbols = vector.Select(o => o.Symbol).Distinct()
I asked what you were trying to accomplish so that I can provide you with a useful answer. Do you need both values together? E.g., the unique set of symbols and dates? You should only need a single group by statement depending on what you are ultimately trying to achieve.
E.g., the question "Group By Multiple Columns" would be relevant if you want to group by multiple properties and track the two unique pieces of data. A .Distinct() after the grouping should still work.
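A small sketch of grouping by both properties at once, next to the Distinct() calls above. Quote and CompositeGroupingDemo are hypothetical names standing in for Myclass and its surrounding code; the anonymous type used as the key compares by value, so each (Date, Symbol) pair forms one group.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for Myclass with the two properties from the question.
public record Quote(DateTime Date, string Symbol, decimal Close);

public static class CompositeGroupingDemo
{
    // Groups by the (Date, Symbol) pair and counts the rows in each group.
    public static List<(DateTime Date, string Symbol, int Count)> CountByDateAndSymbol(IEnumerable<Quote> quotes) =>
        quotes.GroupBy(q => new { q.Date, q.Symbol })
              .Select(g => (g.Key.Date, g.Key.Symbol, g.Count()))
              .ToList();

    // Distinct symbols, as in the Distinct() suggestion.
    public static List<string> DistinctSymbols(IEnumerable<Quote> quotes) =>
        quotes.Select(q => q.Symbol).Distinct().ToList();

    public static void Main()
    {
        var day1 = new DateTime(2012, 1, 1);
        var day2 = new DateTime(2012, 1, 2);
        var quotes = new List<Quote>
        {
            new(day1, "AAA", 10m),
            new(day1, "AAA", 11m),
            new(day1, "BBB", 20m),
            new(day2, "AAA", 12m),
        };

        foreach (var row in CountByDateAndSymbol(quotes))
            Console.WriteLine($"{row.Date:yyyy-MM-dd} {row.Symbol}: {row.Count}");

        Console.WriteLine(string.Join(", ", DistinctSymbols(quotes))); // AAA, BBB
    }
}
```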

order by descending bug LINQ

This is mostly a curiosity rather than a real problem, as I've already fixed the bug. I would be glad to have an in-depth answer that explains the LINQ mechanics behind this wizardry. So I have this query:
List<List<IMS_CMM_Measurement>> imsCMMMeasurements =
(from i in context.IMS_CMM_Measurement
where i.Job_FK == jobId
&& currentOperationCharacteristics.Contains(i.Characteristic_FK)
select i)
.GroupBy(elm => elm.Characteristic_FK)
.Select(CharacGroup => CharacGroup.GroupBy(elm => elm.Part_Number))
.Select(CharacGroup => CharacGroup.Select(PartGroup => PartGroup.OrderByDescending(measure => measure.Timestamp)))
.Select(CharacGroup => CharacGroup.Select(PartGroup => PartGroup.FirstOrDefault()))
.Select(CharacGroup => CharacGroup.OrderBy(measure => measure.Part_Number))
.Select(CharacGroup => CharacGroup.ToList()).ToList();
Basically, it takes measurements from a database and groups them by characteristic. For each characteristic there are several measures that correspond to different parts, and some parts are measured more than once. In the latter case, we only take the most recent measure (the one with the greatest Timestamp, which is a date-format entry). In order to do that, I have to order the measures for each part by timestamp in decreasing order. Unfortunately, this does the exact opposite: it takes the oldest measure.
I managed to get the appropriate result by doing this instead (adding .ToList() right after the select i of the LINQ query):
List<List<IMS_CMM_Measurement>> imsCMMMeasurements =
(from i in context.IMS_CMM_Measurement
where i.Job_FK == jobId
&& currentOperationCharacteristics.Contains(i.Characteristic_FK)
select i).ToList()
.GroupBy(elm => elm.Characteristic_FK)
.Select(CharacGroup => CharacGroup.GroupBy(elm => elm.Part_Number))
.Select(CharacGroup => CharacGroup.Select(PartGroup => PartGroup.OrderByDescending(measure => measure.Timestamp)))
.Select(CharacGroup => CharacGroup.Select(PartGroup => PartGroup.FirstOrDefault()))
.Select(CharacGroup => CharacGroup.OrderBy(measure => measure.Part_Number))
.Select(CharacGroup => CharacGroup.ToList()).ToList();
Why does this work now? ;)
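For reference, the intended "latest measurement per part, per characteristic" selection can be sketched in memory as below. Measurement and LatestMeasurementDemo are hypothetical names, not the IMS_CMM_Measurement entity, and the sketch runs on LINQ to Objects only. It keeps the OrderByDescending and the First in the same projection, so the result does not depend on an ordering being carried across separate Select calls.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the measurement entity.
public record Measurement(int CharacteristicId, string PartNumber, DateTime Timestamp, double Value);

public static class LatestMeasurementDemo
{
    // One inner list per characteristic; each inner list holds the most
    // recent measurement of every part, ordered by part number.
    public static List<List<Measurement>> LatestPerPart(IEnumerable<Measurement> measurements) =>
        measurements
            .GroupBy(m => m.CharacteristicId)
            .Select(byCharacteristic => byCharacteristic
                .GroupBy(m => m.PartNumber)
                // Order and pick in one step: newest timestamp first.
                .Select(byPart => byPart.OrderByDescending(m => m.Timestamp).First())
                .OrderBy(m => m.PartNumber)
                .ToList())
            .ToList();

    public static void Main()
    {
        var data = new List<Measurement>
        {
            new(1, "P2", new DateTime(2020, 1, 1), 5.0),
            new(1, "P1", new DateTime(2020, 1, 1), 1.0),
            new(1, "P1", new DateTime(2020, 1, 3), 1.2), // newer re-measurement of P1
        };

        foreach (var m in LatestPerPart(data)[0])
            Console.WriteLine($"{m.PartNumber}: {m.Value}");
    }
}
```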

linq-to-sql SelectMany() troubles

I'm trying to find a more elegant way of pulling information from my db to my web application. Currently I pull all data in my table and use only two columns' data. It was suggested that I look into using SelectMany() to accomplish this by being able to select only the columns I need.
I'm not entirely sure how to translate the msdn example to a linq statement using a linq-to-sql db.
My current statement is this:
return db.document_library_sitefiles
.Where(item => item.SiteID == siteId)
.Select(item => item.document_library)
.GroupBy(item => item.Filename)
.Select(group => group.OrderByDescending(p=>p.Version).First())
.Where(item => !item.Filename.Contains("*")).ToList();
My current attempt, which I know is wrong, looks like this:
return db.document_library_sitefiles
.Where(item => item.SiteID == siteId)
.SelectMany(item => item.document_library, (filename, filesize)
=> new { filename, filesize })
.Select(item => new { filename = item.document_library.filename,
filesize = item.document_library.filesize })
.ToList();
Am I remotely close to getting my intended results?
Basically I want to get the data in my filename and filesize columns without pulling the rest of the data which includes file content (not my design or idea) so I'm not flooding my server with needless information just to show a simple data table of the files currently in this db.
I think you're going in the right direction. It looks like you're just changing the second query in an undesirable way. Give this a try:
return db.document_library_sitefiles
.Where(item => item.SiteID == siteId)
.Select(item => item.document_library)
.GroupBy(item => item.Filename)
.Select(group => group.OrderByDescending(p=>p.Version).First())
.Where(item => !item.Filename.Contains("*"))
.Select(item => new { filename = item.Filename,
filesize = item.Filesize }).ToList();
Basically you want to keep all of the logic exactly the same as in the first query then just tack on one more select where you initialize the anonymous object to return.
In your attempt at the query you altered some of the underlying logic. You want all of the early operations to remain exactly the same (otherwise the results you return will be from a different set), you only want to transform objects in the resulting set which is why you add a select after the final where.
Can't you just append a select to your first statement?
....Where(item => !item.Filename.Contains("*"))
.Select(item => new {
item.Filename,
item.Filesize
}).ToList();
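The "keep the query, append one projection" idea can be sketched in memory like this. DocumentRow and ProjectionDemo are hypothetical names, with a Content byte array standing in for the large file-content column. On a real LINQ to SQL query the final Select is what lets the provider leave the unused columns out of the generated SQL; this in-memory version only shows the shape of the query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a document_library row, including the large Content column.
public record DocumentRow(string Filename, int Version, long Filesize, byte[] Content);

public static class ProjectionDemo
{
    // Same logic as the original query (latest version per filename, wildcard
    // names filtered out), with one projection to the two needed values at the end.
    public static List<(string Filename, long Filesize)> LatestFileSummaries(IEnumerable<DocumentRow> docs) =>
        docs.GroupBy(d => d.Filename)
            .Select(g => g.OrderByDescending(d => d.Version).First())
            .Where(d => !d.Filename.Contains("*"))
            .Select(d => (d.Filename, d.Filesize))
            .ToList();

    public static void Main()
    {
        var docs = new List<DocumentRow>
        {
            new("a.txt", 1, 100, new byte[100]),
            new("a.txt", 2, 150, new byte[150]),
            new("b*.txt", 1, 10, new byte[10]),
        };

        foreach (var (filename, filesize) in LatestFileSummaries(docs))
            Console.WriteLine($"{filename}: {filesize}"); // a.txt: 150
    }
}
```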

Counting grouped data with Linq to Sql

I have a database of documents in an array, each with an owner and a document type, and I'm trying to get a list of the five most common document types for a specific user.
var docTypes = _documentRepository.GetAll()
.Where(x => x.Owner.Id == LoggedInUser.Id)
.GroupBy(x => x.DocumentType.Id);
This returns all the documents belonging to a specific owner, grouped as I need them. I now need a way to extract the ids of the most common document types. I'm not too familiar with Linq to Sql, so any help would be great.
This orders the groups by count descending and then takes the top 5 of them. You could adapt it to another number or take out the Take() completely if it's not needed in your case:
var mostCommon = docTypes.OrderByDescending( x => x.Count()).Take(5);
To just select the top document keys:
var mostCommonDocTypes = docTypes.OrderByDescending( x => x.Count())
.Select( x=> x.Key)
.Take(5);
You can also of course combine this with your original query by appending/chaining it, just separated for clarity in this answer.
Using the Select you can get the value from the Key of the Grouping (the Id) and then a count of each item in the grouping.
var docTypes = _documentRepository.GetAll()
.Where(x => x.Owner.Id == LoggedInUser.Id)
.GroupBy(x => x.DocumentType.Id)
.Select(groupingById=>
new
{
Id = groupingById.Key,
Count = groupingById.Count(),
})
.OrderByDescending(x => x.Count);
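Putting the pieces together, a minimal in-memory sketch of "top N document type ids for one owner" could look like this. Document and TopDocumentTypesDemo are hypothetical names with flattened OwnerId and DocumentTypeId properties, and the ThenBy on the key is an added assumption that makes ties deterministic.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, flattened stand-in for a document with an owner and a type.
public record Document(int OwnerId, int DocumentTypeId);

public static class TopDocumentTypesDemo
{
    // Ids of the most common document types for one owner, most frequent first.
    public static List<int> MostCommonTypeIds(IEnumerable<Document> docs, int ownerId, int take = 5) =>
        docs.Where(d => d.OwnerId == ownerId)
            .GroupBy(d => d.DocumentTypeId)
            .OrderByDescending(g => g.Count())
            .ThenBy(g => g.Key) // tie-breaker (assumption): lower id first
            .Select(g => g.Key)
            .Take(take)
            .ToList();

    public static void Main()
    {
        var docs = new List<Document>
        {
            new(1, 7), new(1, 7), new(1, 7),
            new(1, 3), new(1, 3),
            new(1, 9),
            new(2, 9), new(2, 9),
        };

        Console.WriteLine(string.Join(", ", MostCommonTypeIds(docs, ownerId: 1, take: 2))); // 7, 3
    }
}
```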
