I have three sets of data representing a counted value, grouped by country code.
select distinct m.CountryCode, count(m.MetricId) as 'Impressions'
from Metrics m
inner join impressions i on m.MetricId = i.MetricId
where ...
group by m.CountryCode
select distinct m.CountryCode, count(m.MetricId) as 'Conversions'
from Metrics m
inner join Conversions c on m.MetricId = c.MetricId
where ...
group by m.CountryCode
...and there's a third one that joins with a table called "Leads".
So each of these gives me a nice set of distinct country codes and a corresponding number.
CountryCode Impressions
AU 25
DE 34
US 264
CountryCode Conversions
AU 11
US 140
Something like that. So my goal is to get all three recordsets merged into one that looks like this:
CountryCode Impressions Conversions Leads
US 264 140 98
I'd like to learn how to do this with LINQ and without doing three queries. There's gotta be a more straightforward approach but I've been working on it too long and my eyes aren't seeing it. Would appreciate a nudge in the proper direction, thanks
var qry1 = (from m in Db.Metrics
join i in Db.Impressions on m.MetricId equals i.MetricId
//where
group m by m.CountryCode into grp
select new
{
CountryCode = grp.Key,
Impressions = grp.Count()
});
var qry2 = (from m in Db.Metrics
join c in Db.Conversions on m.MetricId equals c.MetricId
//where
group m by m.CountryCode into grp
select new
{
CountryCode = grp.Key,
Conversions = grp.Count()
});
var result = (from x in qry1
join y in qry2 on x.CountryCode equals y.CountryCode
select new
{
CountryCode = x.CountryCode,
Impressions = x.Impressions,
Conversions = y.Conversions
});
var lst = result.ToList();
The first two queries are lazy; they do not execute yet. The result variable just joins them together, and the final ToList() call executes the combined query and materializes the objects.
Splitting these into separate queries can be helpful in keeping things simpler.
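One thing to watch with the inner join above: a country that appears in only one of the sets (DE in the sample data) is dropped from the result. A minimal sketch of a left outer join that keeps it, using hypothetical in-memory arrays in place of qry1 and qry2 (the values are the ones from the question; a third set such as Leads would be joined the same way):

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for the two grouped queries.
var impressions = new[]
{
    new { CountryCode = "AU", Count = 25 },
    new { CountryCode = "DE", Count = 34 },
    new { CountryCode = "US", Count = 264 },
};
var conversions = new[]
{
    new { CountryCode = "AU", Count = 11 },
    new { CountryCode = "US", Count = 140 },
};

// "join ... into" followed by DefaultIfEmpty() is the LINQ idiom for a left outer join.
var merged = (from i in impressions
              join c in conversions on i.CountryCode equals c.CountryCode into cs
              from c in cs.DefaultIfEmpty()
              select new
              {
                  i.CountryCode,
                  Impressions = i.Count,
                  Conversions = c == null ? 0 : c.Count   // no match: count as 0
              }).ToList();

foreach (var row in merged)
    Console.WriteLine($"{row.CountryCode} {row.Impressions} {row.Conversions}");
// AU 25 11
// DE 34 0
// US 264 140
```

Whether a missing count should show up as 0 or as null is a design choice; 0 is assumed here.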
I can't figure it out, and I have tested some answers from similar questions.
Now I need a LINQ pro. :)
The following LINQ query returns each device.storeID and device.deviceID with the max date in DeviceList:
var query = (from device in db.DeviceList
join store in db.stores on device.storeID equals store.id
join type in db.devices on device.deviceID equals type.id
group device by new { device.storeID, device.deviceID } into g
select new
{
deviceID = g.Key.deviceID,
storeID = g.Key.storeID,
MaxDate = g.Max(d => d.Date)
});
In device there is also device.amount, but I can't access it like this:
var query = (from device in db.DeviceList
join store in db.stores on device.storeID equals store.id
join type in db.devices on device.deviceID equals type.id
group device by new { device.storeID, device.deviceID } into g
select new
{
amount = device.amount,
deviceID = g.Key.deviceID,
storeID = g.Key.storeID,
MaxDate = g.Max(d => d.Date)
});
Because it's not in the group. But if I add it to the group:
group device by new { device.storeID, device.deviceID, device.amount } into g
select new
{
amount = g.Key.amount,
deviceID = g.Key.deviceID,
storeID = g.Key.storeID,
MaxDate = g.Max(d => d.Date)
});
I get more results back than I need, which seems logical to me, but can I get amount without grouping by it? I don't need the max date of each device.storeID and device.deviceID for each amount in DeviceList.
Oh, and by the way: as you can see, I tried to join store and type to get the store/device name. How can I add:
select new
{
deviceName = type.name,
deviceID = g.Key.deviceID,
storeID = g.Key.storeID,
MaxDate = g.Max(d => d.Date)
});
???
Thank you for every useful hint!
@Rafal: Thank you! Perhaps there is an easier solution for my query, so I should explain what I want to read from the database.
The table device contains:
device.id device.deviceID device.storeID device.amount device.Date
123 52 20 10 2021-11-11
124 57 20 5 2021-12-01
125 57 20 2 2021-12-02
126 52 20 8 2021-12-03
127 52 21 3 2021-12-03
So, I need every distinct deviceID for every distinct storeID, with the amount from the last (max/highest) date:
device.id device.deviceID device.storeID device.amount device.Date
125 57 20 2 2021-12-02
126 52 20 8 2021-12-03
127 52 21 3 2021-12-03
If you stop for a second and write the same query in SQL, you will come to the same realization: amount cannot be selected from those groups, simply because there are multiple values within each store & device group. As pointed out by @Juharr, you can aggregate over amount, since it most likely makes sense to want the sum of those amounts rather than a random one from the group. Or maybe you know which one you need, e.g. the one with the max date?
If the one with the max date is what you are after, then you need to look up the device row after the group by and select it:
var query = from device in db.DeviceList
            join store in db.stores on device.storeID equals store.id
            group device by new { device.storeID, device.deviceID } into g
            select new
            {
                deviceID = g.Key.deviceID,
                storeID = g.Key.storeID,
                MaxDate = g.Max(d => d.Date)
            } into s
            // look up the DeviceList row that carries the max date for this group
            let dev = db.DeviceList.FirstOrDefault(x =>
                x.deviceID == s.deviceID
                && x.storeID == s.storeID
                && x.Date == s.MaxDate)
            join type in db.devices on dev.deviceID equals type.id
            select new
            {
                s.deviceID,
                s.storeID,
                s.MaxDate,
                dev.amount,
                type.name
            };
This is also not perfect: multiple device records can share the same date that happens to be the max, so it will pick one at (semi) random unless you add some ordering before that FirstOrDefault.
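To make the "row with the max date per group" idea concrete, here is a minimal LINQ to Objects sketch over the sample rows from the question. The tie-breaker on id is an assumption added for determinism, and whether EF can translate this exact shape to SQL depends on the provider and version, so treat it as an illustration of the logic rather than a drop-in query:

```csharp
using System;
using System.Linq;

// Sample rows copied from the question's table.
var deviceList = new[]
{
    new { id = 123, deviceID = 52, storeID = 20, amount = 10, Date = new DateTime(2021, 11, 11) },
    new { id = 124, deviceID = 57, storeID = 20, amount = 5,  Date = new DateTime(2021, 12, 1) },
    new { id = 125, deviceID = 57, storeID = 20, amount = 2,  Date = new DateTime(2021, 12, 2) },
    new { id = 126, deviceID = 52, storeID = 20, amount = 8,  Date = new DateTime(2021, 12, 3) },
    new { id = 127, deviceID = 52, storeID = 21, amount = 3,  Date = new DateTime(2021, 12, 3) },
};

// For each (storeID, deviceID) group keep the whole row with the latest date.
// ThenByDescending(id) is an assumed tie-breaker for rows sharing the max date.
var latest = deviceList
    .GroupBy(d => new { d.storeID, d.deviceID })
    .Select(g => g.OrderByDescending(d => d.Date)
                  .ThenByDescending(d => d.id)
                  .First())
    .OrderBy(d => d.id)
    .ToList();

foreach (var d in latest)
    Console.WriteLine($"{d.id} {d.deviceID} {d.storeID} {d.amount} {d.Date:yyyy-MM-dd}");
// 125 57 20 2 2021-12-02
// 126 52 20 8 2021-12-03
// 127 52 21 3 2021-12-03
```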
The short answer is that you did not properly define what you want to read from the database with regard to that amount or type.
Also, as a pro tip, do not use join syntax until you have to. Writing queries in this manner by default reduces EF's ability to generate queries for you and increases your workload, because you have to think about relations and foreign keys instead of the requirements. Use navigation properties by default, and when EF fails to create proper SQL, fall back to join syntax and fix the query. More often than not, the EF query is good enough.
I am trying to find the names of the 5 companies that bring in the most profit.
Here is what I have written so far:
var result = from c in db.Customers
join o in db.Orders
on c.CustomerID equals o.CustomerID
join od in db.Order_Details
on o.OrderID equals od.OrderID
select new
{
CompanyName = c.CompanyName,
Profit = (float)od.UnitPrice * (float)od.Quantity * (1 - od.Discount)
};
However, it contains neither the group by nor the top 5 companies part, which is what I'm actually looking for. I tried
group c by c.CompanyName into CompanyName
but it doesn't work, and I couldn't figure out the top 5 companies query.
I think you need to group your result by company name and sum the profit, then order by that total descending and take the top 5.
Something like the below (the syntax may not be exactly right):
// result is the query from the question
var grouped = from p in result
              group p by p.CompanyName into g
              select new
              {
                  CompanyName = g.Key,
                  TotalProfit = g.Sum(x => x.Profit)
              };
var top5 = grouped.OrderByDescending(x => x.TotalProfit).Take(5);
var totalCompanies = db.Order_Details
    .GroupBy(x => x.Order.Customer.CompanyName)
    // the discount applies to the whole line total, so parenthesize (1 - Discount)
    .OrderByDescending(x => x.Sum(y => (float)y.Quantity * (float)y.UnitPrice * (1 - y.Discount)))
    .Take(5)
    .ToList();
List<string> bestCompanies = new List<string>();
foreach (var item in totalCompanies)
{
    bestCompanies.Add(item.Key);
}
Here is a similar solution to the question.
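For reference, the group / sum / order / take pattern from both answers can be tried out in memory. The sketch below uses made-up order lines and company names (everything in the array is hypothetical) and decimal instead of float to sidestep rounding noise:

```csharp
using System;
using System.Linq;

// Hypothetical order lines; the real data would come from Customers/Orders/Order_Details.
var lines = new[]
{
    new { CompanyName = "Alpha", UnitPrice = 10m,  Quantity = 2, Discount = 0.0m },
    new { CompanyName = "Alpha", UnitPrice = 5m,   Quantity = 4, Discount = 0.5m },
    new { CompanyName = "Beta",  UnitPrice = 100m, Quantity = 1, Discount = 0.1m },
    new { CompanyName = "Gamma", UnitPrice = 3m,   Quantity = 5, Discount = 0.0m },
};

var top = lines
    .GroupBy(l => l.CompanyName)
    .Select(g => new
    {
        CompanyName = g.Key,
        // Note the parentheses: price * quantity * (1 - discount).
        TotalProfit = g.Sum(l => l.UnitPrice * l.Quantity * (1 - l.Discount))
    })
    .OrderByDescending(x => x.TotalProfit)
    .Take(2)            // the question asks for 5; 2 keeps the sample small
    .ToList();

foreach (var c in top)
    Console.WriteLine($"{c.CompanyName}: {c.TotalProfit}");
```

With this data Beta (90) ranks ahead of Alpha (30), and Gamma (15) is cut off by Take.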
I am trying to join the products and attributes tables and group them by product id in order to get a list of products with their attributes. So I tried the following Entity Framework query:
var productsWithAttributes = (from product in ctx.products
join attribute in ctx.attributes on product.id equals attribute.productId
select new
{
product = product,
a1 = attribute.a1,
a2 = attribute.a2,
a3 = attribute.a3,
a4 = attribute.a4
} into t
group t by t.product.id into g
select new
{
product = g.Select(p => p.product).FirstOrDefault(),
attributes = g.Select(r => new Attr()
{
a1 = r.a1,
a2 = r.a2,
a3 = r.a3,
a4 = r.a4
}).ToList()
}
).ToList();
But this took around 70 minutes and when I looked into the SQL query it produced, I saw tens of subqueries with tens of joins.
Then I tried doing only the grouping on the SQL server and the projection into the desired structure on the application server. This is the EF code for that:
var productsWithAttributes = (from product in ctx.products
join attribute in ctx.attributes on product.id equals attribute.productId
select new
{
product = product,
a1 = attribute.a1,
a2 = attribute.a2,
a3 = attribute.a3,
a4 = attribute.a4
} into t
group t by t.product.id
).ToList();
This took around 3 minutes. But the SQL produced by this query still looked complex, with multiple subqueries and joins. I would expect something along the lines of:
select product.*, attribute.a1, attribute.a2, attribute.a3, attribute.a4
from product
join attribute on product.id = attribute.productId
group by product.id
Then I tried just the join without grouping:
var productsWithAttributes = (from product in ctx.products
join attribute in ctx.attributes on product.id equals attribute.productId
select new
{
product = product,
a1 = attribute.a1,
a2 = attribute.a2,
a3 = attribute.a3,
a4 = attribute.a4,
}
).ToList();
This took 1.5 minutes and the SQL code produced by EF was as expected.
In short, adding grouping to the join creates a convoluted SQL query that takes longer but is still acceptable in terms of performance. Adding the final projection after this grouping, however, produces an incredibly convoluted SQL query that takes an unacceptable amount of time.
What is the correct way of creating this query with EF?
If you want to create joined tables, then all you have to do is create another table with both PKs (primary keys) and full join them instead of inner joining them or just joining.
The recommended way of creating such query in LINQ to Entities is to use collection navigation property, or in case it is missing - Group Join construct (join ... into):
A group join produces a hierarchical result sequence, which associates elements in the left source sequence with one or more matching elements in the right side source sequence. A group join has no equivalent in relational terms; it is essentially a sequence of object arrays.
Something like this:
var productsWithAttributes = (
from product in ctx.products
join attribute in ctx.attributes on product.id equals attribute.productId
into attributes // <-- emulate product.attributes property
select new
{
product = product,
attributes = attributes.Select(attribute => new Attr()
{
a1 = attribute.a1,
a2 = attribute.a2,
a3 = attribute.a3,
a4 = attribute.a4
}).ToList(),
}).ToList();
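The shape a group join produces is easy to see in LINQ to Objects. This is a minimal sketch with hypothetical in-memory products and attributes (names and values are invented); it says nothing about the SQL that EF generates, only about the result structure:

```csharp
using System;
using System.Linq;

// Hypothetical in-memory data standing in for ctx.products and ctx.attributes.
var products = new[]
{
    new { id = 1, name = "chair" },
    new { id = 2, name = "table" },
};
var attributes = new[]
{
    new { productId = 1, a1 = "wood" },
    new { productId = 1, a1 = "oak" },
};

// "join ... into" yields one element per product, with its matches as a nested sequence.
var productsWithAttributes = (
    from product in products
    join attribute in attributes on product.id equals attribute.productId
        into attrs
    select new
    {
        product,
        attributes = attrs.Select(a => a.a1).ToList()
    }).ToList();

foreach (var p in productsWithAttributes)
    Console.WriteLine($"{p.product.name}: [{string.Join(", ", p.attributes)}]");
// chair: [wood, oak]
// table: []
```

Unlike a plain join, a product without attributes still appears, with an empty list.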
I have several tables: the main one is called DefectRecord, others are called DefectArea, DefectLevel, etc., and one is called DefectAttachment. This problem is about joining DefectRecord with the other tables to get a ViewModel for further use. The hard part I am facing concerns the DefectAttachment table.
DefectRecord has a 1-to-many relation with DefectAttachment. There may be no attachment at all for a defect record, or there may be multiple attachments.
Logically I tried to perform a left join between DefectRecord and DefectAttachment, but there is one more requirement:
If there are multiple attachments, select ONLY the oldest one (i.e. the one with the oldest CreatedDate field value)
I am stuck at this requirement. How can I perform this with LINQ-to-Entities? Below is the code I have now:
var ret = (from dr in defectRecordQuery
join ft in filterQuery on dr.FilterID equals ft.FilterID
join l in levelQuery on dr.LevelID equals l.LevelID
join a in attachmentQuery on dr.DefectRecordID equals a.DefectRecordID into drd
from g in drd.DefaultIfEmpty()
select new DefectRecordViewModel
{
DefectRecordCode = dr.Code,
DefectAttachmentContent = g == null ? null : g.FileContent,
LookupFilterName = ft.FilterName,
}).ToList();
The *Query variables are IQueryable objects that return the full contents of the corresponding tables.
Group your results by the Code and FilterName, and then for the content take that of the item in the group that has the oldest date:
var ret = (from dr in defectRecordQuery
           join ft in filterQuery on dr.FilterID equals ft.FilterID
           join l in levelQuery on dr.LevelID equals l.LevelID
           join d in attachmentQuery on dr.DefectRecordID equals d.DefectRecordID into drd
           from g in drd.DefaultIfEmpty()
           group g by new { dr.Code, ft.FilterName } into gg
           select new DefectRecordViewModel
           {
               DefectRecordCode = gg.Key.Code,
               // ascending order puts the oldest attachment first
               DefectAttachmentContent = gg.OrderBy(x => x.CreateDateTime).FirstOrDefault() == null ? null : gg.OrderBy(x => x.CreateDateTime).FirstOrDefault().FileContent,
               LookupFilterName = gg.Key.FilterName,
           }).ToList();
If using C# 6.0 or higher, then you can do:
DefectAttachmentContent = gg.OrderBy(x => x.CreateDateTime)
                            .FirstOrDefault()?.FileContent,
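The "oldest attachment or null" logic can be checked in memory. Below is a minimal LINQ to Objects sketch with hypothetical records and attachments (all values invented); it uses a group join instead of the flattened left join plus group by, which is an alternative shape rather than the answer's exact query:

```csharp
using System;
using System.Linq;

// Hypothetical in-memory data standing in for defectRecordQuery and attachmentQuery.
var records = new[]
{
    new { DefectRecordID = 1, Code = "D-1" },
    new { DefectRecordID = 2, Code = "D-2" },   // has no attachments
};
var attachments = new[]
{
    new { DefectRecordID = 1, FileContent = "newer",  CreatedDate = new DateTime(2020, 2, 1) },
    new { DefectRecordID = 1, FileContent = "oldest", CreatedDate = new DateTime(2020, 1, 1) },
};

var rows = (from dr in records
            join a in attachments on dr.DefectRecordID equals a.DefectRecordID into drd
            select new
            {
                dr.Code,
                // Ascending order puts the oldest attachment first;
                // FirstOrDefault() yields null when there are none.
                Content = drd.OrderBy(a => a.CreatedDate)
                             .Select(a => a.FileContent)
                             .FirstOrDefault()
            }).ToList();

foreach (var r in rows)
    Console.WriteLine($"{r.Code}: {r.Content ?? "(none)"}");
// D-1: oldest
// D-2: (none)
```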
I have a fairly long LINQ query and everything works as it should, but in the final join I am doing an inner join on a log table. The log returns more than 50 records, and I just want the latest one.
Here is an example
var tst = from w in context.storage
          join p in context.products on w.id equals p.wid
          join l in context.logger on p.id equals l.pid
          select new
          {
              storageid = w.id,
              productid = p.id,
              productname = p.name,
              bought = l.when
          };
A quick explanation of what happens: each product is stored in a storage center, and there is a log entry each time that product was bought. If it was bought 100 times, there are 100 records in the logger.
So currently it returns 50 records for productid = 5, because it was bought 50 times. But I only want 1 record, i.e. only the latest date/time from the logger.
Can anyone help? I am a little stuck.
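Use result.GroupBy(x => x.Prop).Select(g => g.First()) (or result.DistinctBy(x => x.Prop) on .NET 6 and later) to get unique entries only; Distinct() itself does not take a key selector.
Use result.Max(x => x.Prop) to get the latest date, and Min() to get the earliest.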
This is a case where you want to restrict the collection of records on which to join, which you can do by coding the join manually (sort of):
from w in context.storage
join p in context.products on w.id equals p.wid
// "manual" join: only the latest log entry per product
from l in context.logger.Where(l => l.pid == p.id).OrderByDescending(l => l.when).Take(1)
select new
{
    storageid = w.id,
    productid = p.id,
    productname = p.name,
    bought = l.when
};
In fluent linq syntax this is a SelectMany with a result selector.
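As a rough illustration of that fluent form, here is a LINQ to Objects sketch with hypothetical in-memory products and log entries (the storage join is left out to keep it short, and all values are invented):

```csharp
using System;
using System.Linq;

// Hypothetical in-memory data standing in for context.products and context.logger.
var products = new[]
{
    new { id = 5, name = "widget" },
    new { id = 6, name = "gadget" },
};
var logger = new[]
{
    new { pid = 5, when = new DateTime(2021, 1, 1) },
    new { pid = 5, when = new DateTime(2021, 3, 1) },
    new { pid = 6, when = new DateTime(2021, 2, 1) },
};

// SelectMany with a result selector: the first lambda picks the (restricted)
// inner sequence per product, the second combines each pair into the output row.
var latest = products
    .SelectMany(
        p => logger.Where(l => l.pid == p.id)
                   .OrderByDescending(l => l.when)
                   .Take(1),
        (p, l) => new { productid = p.id, productname = p.name, bought = l.when })
    .ToList();

foreach (var row in latest)
    Console.WriteLine($"{row.productid} {row.productname} {row.bought:yyyy-MM-dd}");
// 5 widget 2021-03-01
// 6 gadget 2021-02-01
```

Because the inner sequence is empty for a product with no log entries, such a product drops out of the result, just as with an inner join.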