Grouping Joined Tables - c#

I am trying to join the products and attributes tables and group them by product id in order to get a list of products with their attributes. So I tried the following Entity Framework query:
var productsWithAttributes = (from product in ctx.products
join attribute in ctx.attributes on product.id equals attribute.productId
select new
{
product = product,
a1 = attribute.a1,
a2 = attribute.a2,
a3 = attribute.a3,
a4 = attribute.a4
} into t
group t by t.product.id into g
select new
{
product = g.Select(p => p.product).FirstOrDefault(),
attributes = g.Select(r => new Attr()
{
a1 = r.a1,
a2 = r.a2,
a3 = r.a3,
a4 = r.a4
}).ToList()
}
).ToList();
But this took around 70 minutes and when I looked into the SQL query it produced, I saw tens of subqueries with tens of joins.
Then I tried doing only the grouping on the SQL server and projecting into the desired structure on the application server. This is the EF code for that:
var productsWithAttributes = (from product in ctx.products
join attribute in ctx.attributes on product.id equals attribute.productId
select new
{
product = product,
a1 = attribute.a1,
a2 = attribute.a2,
a3 = attribute.a3,
a4 = attribute.a4
} into t
group t by t.product.id
).ToList();
This took around 3 minutes. But the SQL produced by this query still looked complex, with multiple subqueries and joins. I would expect something along the lines of:
select product.*, attribute.a1, attribute.a2, attribute.a3, attribute.a4
from product
join attribute on product.id = attribute.productId
group by product.id
Then I tried just the join without grouping:
var productsWithAttributes = (from product in ctx.products
join attribute in ctx.attributes on product.id equals attribute.productId
select new
{
product = product,
a1 = attribute.a1,
a2 = attribute.a2,
a3 = attribute.a3,
a4 = attribute.a4,
}
).ToList();
This took 1.5 minutes and the SQL code produced by EF was as expected.
In short, adding grouping to the join creates a convoluted SQL query which takes longer, but it is still acceptable in terms of performance. Adding the final projection after this grouping, however, produces an incredibly convoluted SQL query that takes an unacceptable amount of time.
What is the correct way of creating this query with EF?

If you want to create joined tables, all you have to do is create another table with both PKs (primary keys) and full join them instead of inner joining or just joining them.

The recommended way of creating such a query in LINQ to Entities is to use a collection navigation property or, if it is missing, the Group Join construct (join ... into):
A group join produces a hierarchical result sequence, which associates elements in the left source sequence with one or more matching elements in the right side source sequence. A group join has no equivalent in relational terms; it is essentially a sequence of object arrays.
Something like this:
var productsWithAttributes = (
from product in ctx.products
join attribute in ctx.attributes on product.id equals attribute.productId
into attributes // <-- emulate product.attributes property
select new
{
product = product,
attributes = attributes.Select(attribute => new Attr()
{
a1 = attribute.a1,
a2 = attribute.a2,
a3 = attribute.a3,
a4 = attribute.a4
}).ToList(),
}).ToList();
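To see the shape a group join produces, here is a minimal sketch that runs over in-memory arrays (LINQ to Objects) rather than an EF context. The sample products and attribute values are made up for illustration, and only a single attribute column (a1) is kept to stay short:

```csharp
using System;
using System.Linq;

// In-memory stand-ins for ctx.products and ctx.attributes (hypothetical sample data).
var products = new[]
{
    new { id = 1, name = "Lamp" },
    new { id = 2, name = "Desk" },
    new { id = 3, name = "Chair" },
};
var attributes = new[]
{
    new { productId = 1, a1 = "red" },
    new { productId = 1, a1 = "blue" },
    new { productId = 2, a1 = "oak" },
};

// join ... into yields one result per product, each carrying its own
// (possibly empty) collection of matching attributes.
var productsWithAttributes = (
    from product in products
    join attribute in attributes on product.id equals attribute.productId
        into productAttributes
    select new
    {
        product,
        attributes = productAttributes.Select(a => a.a1).ToList(),
    }).ToList();

foreach (var entry in productsWithAttributes)
{
    Console.WriteLine($"{entry.product.name}: {string.Join(", ", entry.attributes)}");
}
```

Note that, unlike an inner join followed by a group by, the group join keeps products that have no attributes at all (the third product here ends up with an empty list).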

Related

How to perform a left join with an additional filtering in LINQ to entities?

I have several tables. The main one is called DefectRecord; others are called DefectArea, DefectLevel, etc., and one is called DefectAttachment. This problem is about joining DefectRecord with the other tables to get a ViewModel for further use. The hard part I am facing concerns the DefectAttachment table.
DefectRecord has a 1-to-many relation with DefectAttachment. While there may be NO attachment at all for one defect record, there may be multiple attachments.
Logically I tried to perform a left join between DefectRecord and DefectAttachment, but there is one more requirement:
If there are multiple attachments, select ONLY the oldest one (i.e. the one with the oldest CreatedDate field value).
I am stuck at this requirement, how can I perform this with LINQ-to-Entities? Below is the code of what I have now:
var ret = (from dr in defectRecordQuery
join ft in filterQuery on dr.FilterID equals ft.FilterID
join l in levelQuery on dr.LevelID equals l.LevelID
join a in attachmentQuery on dr.DefectRecordID equals a.DefectRecordID into drd
from g in drd.DefaultIfEmpty()
select new DefectRecordViewModel
{
DefectRecordCode = dr.Code,
DefectAttachmentContent = g == null ? null : g.FileContent,
LookupFilterName = ft.FilterName,
}).ToList();
The *Query variables are IQueryable objects that return the full contents of the corresponding tables.
Group your results by the Code and FilterName, and then for the content take that of the item in the group that has the oldest date:
var ret = (from dr in defectRecordQuery
join ft in filterQuery on dr.FilterID equals ft.FilterID
join l in levelQuery on dr.LevelID equals l.LevelID
join d in attachmentQuery on dr.DefectRecordID equals d.DefectRecordID into drd
from g in drd.DefaultIfEmpty()
group g by new { dr.Code, ft.FilterName } into gg
select new DefectRecordViewModel
{
DefectRecordCode = gg.Key.Code,
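// OrderBy sorts ascending, so FirstOrDefault() returns the oldest attachment
DefectAttachmentContent = gg.OrderBy(x => x.CreateDateTime).FirstOrDefault() == null ? null : gg.OrderBy(x => x.CreateDateTime).FirstOrDefault().FileContent,
LookupFilterName = gg.Key.FilterName,
}).ToList();
If using C# 6.0 or higher, you can shorten this with the null-conditional operator. Note that the operator is not allowed inside expression trees, so this form compiles only when the query runs over in-memory collections (LINQ to Objects), not over IQueryable sources:
DefectAttachmentContent = gg.OrderBy(x => x.CreateDateTime)
.FirstOrDefault()?.FileContent,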
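As an alternative sketch of the "oldest attachment or none" requirement, a group join can pick the first attachment by ascending date directly, without flattening and regrouping. This version runs over in-memory arrays, so it is only an illustration of the idea and not a tested LINQ-to-Entities query; the sample records and the date field name (CreatedDate, as in the question) are assumptions:

```csharp
using System;
using System.Linq;

// Hypothetical sample data standing in for defectRecordQuery and attachmentQuery.
var defectRecords = new[]
{
    new { DefectRecordID = 1, Code = "DR-1" },
    new { DefectRecordID = 2, Code = "DR-2" }, // has no attachments
};
var attachments = new[]
{
    new { DefectRecordID = 1, CreatedDate = new DateTime(2020, 5, 1), FileContent = "newer" },
    new { DefectRecordID = 1, CreatedDate = new DateTime(2019, 1, 1), FileContent = "oldest" },
};

var ret = (
    from dr in defectRecords
    join a in attachments on dr.DefectRecordID equals a.DefectRecordID into drd
    select new
    {
        DefectRecordCode = dr.Code,
        // Ascending order puts the oldest attachment first;
        // FirstOrDefault() returns null when the record has none.
        DefectAttachmentContent = drd.OrderBy(x => x.CreatedDate).FirstOrDefault()?.FileContent,
    }).ToList();

foreach (var row in ret)
{
    Console.WriteLine($"{row.DefectRecordCode}: {row.DefectAttachmentContent ?? "(no attachment)"}");
}
```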

LINQ SQL Trying to merge 3 recordsets into one

I have three sets of data representing a counted value, grouped by country code.
select distinct m.CountryCode, count(m.MetricId) as 'Impressions'
from Metrics m
inner join impressions i on m.MetricId = i.MetricId
where ...
group by m.CountryCode
select distinct m.CountryCode, count(m.MetricId) as 'Conversions'
from Metrics m
inner join Conversions c on m.MetricId = c.MetricId
where ...
group by m.CountryCode
..and there's a third one that joins with a table called "Leads"
So each of these give me a nice set of distinct country codes and a corresponding number.
CountryCode Impressions
AU 25
DE 34
US 264
CountryCode Conversions
AU 11
US 140
Something like that. So my goal is to get all three recordsets merged into one that looks like this:
CountryCode Impressions Conversions Leads
US 264 140 98
I'd like to learn how to do this with LINQ and without doing three queries. There's gotta be a more straightforward approach but I've been working on it too long and my eyes aren't seeing it. Would appreciate a nudge in the proper direction, thanks
var qry1 = (from m in Db.Metrics
join i in Db.Impressions on m.MetricId equals i.MetricId
//where
group m by m.CountryCode into grp
select new
{
CountryCode = grp.Key,
Impressions = grp.Count()
});
var qry2 = (from m in Db.Metrics
join c in Db.Conversions on m.MetricId equals c.MetricId
//where
group m by m.CountryCode into grp
select new
{
CountryCode = grp.Key,
Conversions = grp.Count()
});
var result = (from x in qry1
join y in qry2 on x.CountryCode equals y.CountryCode
select new
{
CountryCode = x.CountryCode,
Impressions = x.Impressions,
Conversions = y.Conversions
});
var lst = result.ToList();
The first two queries are lazy; they do not execute yet. The result variable just joins them together, and the final ToList() call executes the combined query and materializes the objects.
Splitting these into separate queries can be helpful in keeping things simpler.
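One thing to watch with the inner join between the two result sets: a country that appears in only one of them (DE in the sample data from the question) drops out of the merged result. A minimal in-memory sketch of a merge that keeps such countries, using group joins and treating a missing count as 0, could look like this. It assumes the impressions set contains every country of interest, and the single Leads row is taken from the merged example in the question:

```csharp
using System;
using System.Linq;

// Per-country counts as they would come out of the three grouped queries.
var impressions = new[] { (CountryCode: "AU", Count: 25), (CountryCode: "DE", Count: 34), (CountryCode: "US", Count: 264) };
var conversions = new[] { (CountryCode: "AU", Count: 11), (CountryCode: "US", Count: 140) };
var leads = new[] { (CountryCode: "US", Count: 98) };

var merged = (
    from i in impressions
    join c in conversions on i.CountryCode equals c.CountryCode into cs
    join l in leads on i.CountryCode equals l.CountryCode into ls
    select new
    {
        CountryCode = i.CountryCode,
        Impressions = i.Count,
        // Sum over an empty group is 0, so countries without a match are kept.
        Conversions = cs.Sum(x => x.Count),
        Leads = ls.Sum(x => x.Count),
    }).ToList();

foreach (var row in merged)
{
    Console.WriteLine($"{row.CountryCode} {row.Impressions} {row.Conversions} {row.Leads}");
}
```

Countries that appear only in conversions or leads would still be missing here; covering those would need the union of all country codes as the starting set.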

Selecting the whole data after joining in linq

When I need to join some tables using linq, and when those tables consist of a lot of fields, it takes a lot of work to get all the data that I need. For instance:
var result = from i in Person
join y in Works
on i.PID equals y.PID
join z in Groups
on y.GID equals z.GID
select new {Name = i.Name, Work = y.work, WG = z.GroupName};
How can I make the query return all the fields of all the tables?
I guess what you need is simply this:
var Query = from x in Table_1
join y in Table_2
on x.id equals y.id
where x.Country.Equals("X Country")
select new {x,y};
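Applied to the three-table query from the question, the same idea can be sketched over in-memory arrays as follows. The sample rows and the property names Person, Work and Group in the projection are made up for illustration:

```csharp
using System;
using System.Linq;

// Hypothetical sample rows for the three tables.
var people = new[] { new { PID = 1, Name = "Ann" } };
var works = new[] { new { PID = 1, GID = 10, work = "Welding" } };
var groups = new[] { new { GID = 10, GroupName = "Night shift" } };

// Project the joined rows themselves instead of listing every field.
var result = (
    from i in people
    join y in works on i.PID equals y.PID
    join z in groups on y.GID equals z.GID
    select new { Person = i, Work = y, Group = z }).ToList();

foreach (var row in result)
{
    // Every field of every joined row stays reachable through the row objects.
    Console.WriteLine($"{row.Person.Name} / {row.Work.work} / {row.Group.GroupName}");
}
```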

Grouping a many relationship in LINQ

I've got the following code:
var preGroup =
from l in e.WorkOrderRequests
join w in e.WorkOrders on l.WorkOrderRequestKey equals w.WorkOrderRequestKey into lw
from l2 in lw.DefaultIfEmpty()
select new { l.WorkOrderRequestStatusKey,
l.WorkOrderRequest_Status.Status,
w.Property.Address.StateKey,
w.Property.Address.MetroKey };
The relationship structure is that a Work Order Request can have many Work Orders, so it is a 1 to many relationship. I want to do a select so that each work order request will show up for as many times as there are work orders attached to it. But also it still should appear if there are no work orders attached. So a left outer join makes sense. The select above doesn't work, but that shows the data that I want to select. I need to then group this data:
var workOrderRequests =
(from l in preGroup.ToList()
group l by new { l.WorkOrderRequestStatusKey, l.Status, l.StateKey, l.MetroKey } into g
select new DashboardView
{
StatusKey = g.Key.WorkOrderRequestStatusKey,
StatusName = g.Key.Status,
StateKey = g.Key.StateKey,
MetroKey = g.Key.MetroKey,
Count = g.Count()
});
I can't get to the grouping because my preGroup query is not working. I have been able to accomplish this for other things, but those are all 1 to 1 relationships. Any help is much appreciated.
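One likely reason the first query does not compile: after join ... into lw, the range variable w is no longer in scope, so the joined work order is only reachable through l2, and l2 is null for requests without work orders. A minimal in-memory sketch of the left outer join followed by the grouping, with made-up sample data and a reduced set of columns, might look like this:

```csharp
using System;
using System.Linq;

// Hypothetical sample data: request 1 has two work orders, request 2 has none.
var requests = new[]
{
    new { WorkOrderRequestKey = 1, Status = "Open" },
    new { WorkOrderRequestKey = 2, Status = "Open" },
};
var workOrders = new[]
{
    new { WorkOrderRequestKey = 1, StateKey = 5 },
    new { WorkOrderRequestKey = 1, StateKey = 7 },
};

var preGroup = (
    from l in requests
    join w in workOrders on l.WorkOrderRequestKey equals w.WorkOrderRequestKey into lw
    from l2 in lw.DefaultIfEmpty()
    select new
    {
        l.Status,
        // Use l2 (not w) and guard against the "no work order" case.
        StateKey = l2 == null ? (int?)null : l2.StateKey,
    }).ToList();

var counts = (
    from row in preGroup
    group row by new { row.Status, row.StateKey } into g
    select new { g.Key.Status, g.Key.StateKey, Count = g.Count() }).ToList();

foreach (var c in counts)
{
    Console.WriteLine($"{c.Status} {c.StateKey?.ToString() ?? "(none)"} {c.Count}");
}
```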

Get the "latest" datetime from a large linq query that currently returns every record that has a datetime

I have a fairly long LINQ query and everything works as it should, but in the final join I am doing an inner join on a log table. The log returns more than 50 records, and I just want the latest record.
Here is an example
var tst = from w in context.storage
join p in context.products on w.id equals p.wid
join l in context.logger on p.id equals l.pid
select new
{
storageid = w.id,
productid = p.id,
productname = p.name,
bought = l.when
};
A quick explanation of what happens: each product is stored in a storage center, and there is a log entry each time that product was bought. If it was bought 100 times, there are 100 records in the logger.
So currently it returns 50 records for productid = 5. Why? Because it was bought 50 times. But I only want 1 record, so I only want the latest datetime from the logger.
Can anyone help? I am a little stuck.
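Use Distinct() to get unique entries only. Note that the standard Distinct() does not take a selector, so result.Distinct(x => x.Prop) will not compile; on .NET 6 or later, DistinctBy(x => x.Prop) does this for in-memory sequences.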
Use result.Max(x => x.Prop) to get latest date, and Min() to get earliest.
This is a case where you want to restrict the collection of records on which to join, which you can do by coding the join manually (sort of):
from w in context.storage
join p in context.products on w.id equals p.wid
// "manual" join: only the most recent log entry per product
from l in context.logger.Where(l => l.pid == p.id).OrderByDescending(l => l.when).Take(1)
select new
{
storageid = w.id,
productid = p.id,
productname = p.name,
bought = l.when
};
In fluent linq syntax this is a SelectMany with a result selector.
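For illustration, the fluent form might look like the following sketch over in-memory arrays. The storage join is left out to keep it short, and the sample products and log entries are made up:

```csharp
using System;
using System.Linq;

// Hypothetical sample data standing in for context.products and context.logger.
var products = new[] { new { id = 5, name = "Widget" }, new { id = 6, name = "Gadget" } };
var logger = new[]
{
    new { pid = 5, when = new DateTime(2021, 1, 1) },
    new { pid = 5, when = new DateTime(2021, 6, 1) },
    new { pid = 6, when = new DateTime(2020, 3, 1) },
};

var latest = products
    .SelectMany(
        // Collection selector: at most one log entry per product, the newest.
        p => logger.Where(l => l.pid == p.id).OrderByDescending(l => l.when).Take(1),
        // Result selector: combine the product with its chosen log entry.
        (p, l) => new { productid = p.id, productname = p.name, bought = l.when })
    .ToList();

foreach (var row in latest)
{
    Console.WriteLine($"{row.productid} {row.productname} {row.bought:yyyy-MM-dd}");
}
```

Because Take(1) yields nothing for a product without log entries, such products are dropped from the result, just as with an inner join.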
