LINQ to SQL: Grouping and limiting a record set - c#

I'm using LINQ to SQL and I have a stored procedure which brings back a result set that looks like so:
Type Field1 Field2
5 1 1
6 2 0
21 0 0
I'm hoping to do a few things with this record set:
1) Have 3 groups of results, one that has values in both field1 and field2, one that has a value in field1 but not field2 and one that has zeros in field1 and field2.
2) I'm only interested in a subset of types. I have a list of type id's I'm looking for (say 5 and 21 for this example). Right now the type values are stored in an enumeration but I can move them to a better data type if appropriate.
I've gotten to where I can group each set but I'm unsure of how to limit the types so I only bring back those I'm interested in.
Here's what I have:
var result = from all in dataContext.sp(..variables...)
group all by all into list
let grp1 = (from a in list
where a.field1 != 0 && a.field2 != 0
select a)
let grp2 = (from b in list
where b.field1 == 0 && b.field2 != 0
select b)
let grp3 = (from c in list
where c.field1 == 0 && c.field2 == 0
select c)
select new { grp1, grp2, grp3 };
Any help is appreciated.

Do you actually want all the data for those groups? If so, you might as well do the grouping back in .NET - just filter in SQL:
// Assuming interestingTypes is an array or list of the interesting types
var query = db.Whatever.Where(entry => interestingTypes.Contains(entry.Type)
// I understand you're not interested in this group
&& !(entry.Field1==0 && entry.Field2==1));
var grouped = query.AsEnumerable()
.GroupBy(entry => new { entry.Field1, entry.Field2 });
An alternative to GroupBy is to use ToLookup:
var lookup = query.AsEnumerable()
.ToLookup(entry => new { entry.Field1, entry.Field2 });
Then:
var values00 = lookup[new { Field1=0, Field2=0 }];
(Then values00 will be an IEnumerable<T> of your entry type.)
If you're only interested in the types for each field group, you could change the lookup to:
var lookup = query.AsEnumerable()
.ToLookup(entry => new { entry.Field1, entry.Field2 },
entry => entry.Type);
You'd fetch values00 in the same way, but each entry would be the type rather than the whole record.
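To make the lookup idea concrete, here is a minimal in-memory sketch. The `Entry` class, the `LookupSketch` helper and its method name are hypothetical stand-ins for whatever your stored procedure returns; only `Where`, `ToLookup` and the lookup indexer are real LINQ calls.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for the stored procedure's result.
public class Entry
{
    public int Type { get; set; }
    public int Field1 { get; set; }
    public int Field2 { get; set; }
}

public static class LookupSketch
{
    // Returns the Type values of the interesting rows whose Field1 and Field2 are both zero.
    public static List<int> TypesWithZeroFields(IEnumerable<Entry> rows, ICollection<int> interestingTypes)
    {
        var lookup = rows
            .Where(e => interestingTypes.Contains(e.Type))
            .ToLookup(e => new { e.Field1, e.Field2 }, e => e.Type);

        // Anonymous types compare by value, so a freshly built key finds the group.
        // A key with no matching rows yields an empty sequence instead of throwing.
        return lookup[new { Field1 = 0, Field2 = 0 }].ToList();
    }
}
```

Note that this keys on the exact field values. If you want "zero versus non-zero" buckets, as in the question, you would probably key on something like `new { HasField1 = e.Field1 != 0, HasField2 = e.Field2 != 0 }` instead.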

I don't think you will be able to do it in a single query (maybe, but not without it being ugly).
I would recommend storing the result of the stored proc, and just use 3 queries. Or modify the stored proc to return the 3 resultsets you are looking for.

Related

EF Core 2.1 GroupBy with Where and take first item

I seem to have the simplest problem but can't seem to get it right. I have a SQL table which looks something like this:
Code | Period | PeriodVersion | SourceId
Foo 201810 1 Source1
Foo 201810 2 Source1
Foo 201811 1 Source1
Bar 201810 1 Source1
Foo 201809 2 Source1
Foo 201809 1 Source1
Foo 201808 1 Source1
The query has the following requirements:
Period should be grouped by 201809, 201810 and 201811 and only the highest PeriodVersion should be returned. (in some cases there are 6 periods as well)
Code should be equal to Foo
SourceId should be equal to Source1
If all works well I would like to have the following result:
Code | Period | PeriodVersion | SourceId
Foo 201810 2 Source1
Foo 201811 1 Source1
Foo 201809 2 Source1
I've tried the following:
var query = from item in context.MyTable
            orderby item.PeriodVersion descending
            where item.Code == item.ISINCode &&
                  item.SourceID == "Source1" &&
                  (
                      "201810" == item.Period ||
                      "201811" == item.Period ||
                      "201809" == item.Period
                  )
            group item by item.Period into g
            select g.FirstOrDefault();
It translates to:
SELECT * // Selected columns here....
FROM [MyTable] AS [table]
WHERE ((([table].[Code] = 'Foo') AND ([table].[SourceID] = 'Source1'))) AND [table].[Period] IN ('201209', '201208', '201207')
ORDER BY [table].[Period], [table].[PeriodVersion] DESC
This will return the "correct" results, but it executes the groupby in memory which fetches all PeriodVersion from the database. In some cases I have >50 PeriodVersion for each Period which makes the query above very inefficient. Is there any way to make this more efficient?
I have also tried this based on this answer:
var query = context.MyTable
    .GroupBy(x => x.Period)
    .Select(g => g
        .OrderByDescending(p => p.PeriodVersion)
        .FirstOrDefault(x => x.Code == "Foo" &&
                             x.SourceID == "Source1" &&
                             (
                                 "201810" == x.Period ||
                                 "201811" == x.Period ||
                                 "201809" == x.Period
                             )));
It gave an even worse result, since it executed the where in memory as well:
Select *
FROM MyTable AS [x]
ORDER BY [x].[Period]
My actual table has a lot more columns than the ones listed here. I'm using EF Core 2.1. I can upgrade to a newer version but it would require some major overhaul. According to the documentation groupby is supported. But when reading about it here on SO it seems to be tricky.
Maybe you can try something like this:
IList<string> list = new List<string>
{
    "201810",
    "201811",
    "201809"
};
var result = context.MyTable
    .Where(m => list.Contains(m.Period)
             && m.Code == "Foo"
             && m.SourceID == "Source1")
    .GroupBy(m => m.Period)
    .Select(g => g.OrderByDescending(s => s.PeriodVersion).FirstOrDefault())
    .ToList();
GROUP BY in SQL cannot return the whole records you want, only the grouping columns and aggregates over them. If you group by something and then select columns that are not part of the GROUP BY clause, everything is loaded into memory and evaluated there. That is how it worked until EF Core 3.0, I think. From EF Core 3.0 on, you need to switch to client evaluation explicitly (for example with .ToList() or .AsEnumerable()) before any LINQ that cannot be translated to SQL.
Anyway, this won't be fully translated to SQL, so performance will not be great. If you are still not satisfied, I recommend trying a raw SQL query, which will definitely perform much better.
Edit: I think I originally misunderstood this question, so I'm updating my code.
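One shape that keeps the grouping a pure aggregate is to compute the highest PeriodVersion per Period first and then join that back to the filtered rows. The sketch below runs against an in-memory IQueryable; `PeriodRow`, `LatestVersionSketch` and `LatestPerPeriod` are hypothetical names, and whether your EF Core version translates the whole thing to a single SQL statement is an assumption you would need to verify in the generated SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity mirroring the columns shown in the question.
public class PeriodRow
{
    public string Code { get; set; }
    public string Period { get; set; }
    public int PeriodVersion { get; set; }
    public string SourceId { get; set; }
}

public static class LatestVersionSketch
{
    public static List<PeriodRow> LatestPerPeriod(
        IQueryable<PeriodRow> table, string code, string sourceId, IEnumerable<string> periods)
    {
        var filtered = table.Where(r => r.Code == code
                                     && r.SourceId == sourceId
                                     && periods.Contains(r.Period));

        // Step 1: key plus aggregate only, the shape SQL GROUP BY can express.
        var latest = filtered
            .GroupBy(r => r.Period)
            .Select(g => new { Period = g.Key, Version = g.Max(r => r.PeriodVersion) });

        // Step 2: join back to pick up the full row for each (Period, max version) pair.
        var query = from r in filtered
                    join l in latest
                        on new { r.Period, Version = r.PeriodVersion }
                        equals new { l.Period, l.Version }
                    select r;

        return query.ToList();
    }
}
```

With the sample data from the question this returns the three Foo rows for 201809 (version 2), 201810 (version 2) and 201811 (version 1).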

how to take 100 records from linq query based on a condition

I have a query that returns a result set. Based on a condition, I want to take only the first 100 records: I have a variable x, and if the value of x is 100 I have to call .Take(100); otherwise I need the complete set of records.
var abc=(from st in Context.STopics
where st.IsActive==true && st.StudentID == 123
select new result()
{
name = st.name }).ToList().Take(100);
Because LINQ returns an IQueryable which has deferred execution, you can create your query, then restrict it to the first 100 records if your condition is true and then get the results. That way, if your condition is false, you will get all results.
var abc = (from st in Context.STopics
where st.IsActive && st.StudentID == 123
select new result
{
name = st.name
});
if (x == 100)
abc = abc.Take(100);
var list = abc.ToList(); // a List<result> cannot be assigned back to the IQueryable variable
Note that it is important to do the Take before the ToList, otherwise, it would retrieve all the records, and then only keep the first 100 - it is much more efficient to get only the records you need, especially if it is a query on a database table that could contain hundreds of thousands of rows.
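The deferred-execution pattern can be tried out without a database. This is only a sketch over an in-memory IQueryable; `ConditionalTakeSketch` and `TopicNames` are hypothetical names, and the OrderBy is added here so that "the first 100" is deterministic.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConditionalTakeSketch
{
    // Builds the query step by step; nothing is enumerated until ToList().
    public static List<string> TopicNames(IQueryable<string> names, int x)
    {
        IQueryable<string> query = names.OrderBy(n => n);

        if (x == 100)
        {
            // Still part of the query definition, so a database provider
            // could turn this into TOP/LIMIT instead of fetching everything.
            query = query.Take(100);
        }

        return query.ToList();
    }
}
```

Calling `TopicNames(source, 100)` returns at most 100 names, while any other value of `x` returns them all.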
One of the most important concepts with the SQL TOP command is ORDER BY. You should not use TOP without ORDER BY, because it may return different results in different situations.
The same concept applies to LINQ too.
var results = Context.STopics.Where(st => st.IsActive && st.StudentID == 123)
.Select(st => new result(){name = st.name})
.OrderBy(r => r.name)
.Take(100).ToList();
Take and Skip operations are well defined only against ordered sets.
Although the other users are correct in giving you the results you want...
This is NOT how you should be using Entity Framework.
This is the better way to use EF.
var query = from student in Context.Students
where student.Id == 123
from topic in student.Topics
orderby topic.Name
select topic;
Notice how the structure more closely follows the logic of the business requirements.
You can almost read the code in English.

Creating count based upon counts of Group By Statement in LINQ / lambda

What I want to do is group a huge bunch of records together by Employer. Then, I want to return an integer variable which will have a count of only those groups with at least 30 records.
I.E. I have 100 subscribers at Employer A, 20 at Employer B, and 30 at Employer C.
I group the records together and come up with
Employer A - 100
Employer B - 20
Employer C - 30
I want to return the Scalar Variable of 2.
Here is what I currently have:
var Step1 =
(from y in recordsActivstJoin
where y.q.Market.Contains(market) && y.x.ActivistCodeID.Equals(activismCode)
select new {y}).ToList();
//this groups my previous query
var Step2 = (from z in Step1 group z by z.y.q.Employer into f select new {f}).ToList();
When I watch the locals I can see that it does in fact group down from 34 rows in Step 1 to 17 in Step 2. Now I want to narrow it to only those groups where the count is >= 30.
Any suggestions?
I'm not the best at writing LINQ blindly, but I'm fairly certain you are looking for something very close to the following:
var Step1 =
(from y in recordsActivstJoin
where y.q.Market.Contains(market) && y.x.ActivistCodeID.Equals(activismCode)
select new {y}).ToList();
//this groups my previous query
var Step2 = (from i in Step1 group i by i.y.q.Employer into groupedEmployees
select new
{
EmployeeCount = groupedEmployees.Count()
}).Where(n=>n.EmployeeCount >= 30).Count();
Patrick pointed out that this could be shortened to:
var Step2 = Step1.GroupBy(i => i.y.q.Employer).Count(g => g.Count() >= 30);
Step2 should be 2 in your example. Hope this helps!
As an alternative:
Query to group records by employer:
Code:
var groupedRecords = recordsActivstJoin
.Where(y => y.q.Market.Contains(market) && y.x.ActivistCodeID.Equals(activismCode))
.ToLookup(y => y.q.Employer);
Count of the groups with at least 30 entries:
Code:
Int32 count = groupedRecords.Count(g => g.Count() >= 30);
Notes:
ToLookup is used because it is most likely avalanche-safe, whereas GroupBy typically is not. This depends on the provider used to query your data: for example, there is no difference in LINQ to Objects, while for LINQ to SQL there is a massive difference on big, varied data sets.
ToLookup executes immediately, though, so if you want to defer execution of the grouping you will need to go down a different path.
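Both variants boil down to "count the groups whose own count meets the threshold". A minimal in-memory sketch, using a plain list of employer names in place of the joined records (`GroupCountSketch` and its method names are hypothetical):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupCountSketch
{
    // Deferred grouping: the groups are built when Count enumerates them.
    public static int CountLargeGroups(IEnumerable<string> employers, int minimum)
    {
        return employers.GroupBy(e => e).Count(g => g.Count() >= minimum);
    }

    // Immediate grouping: ToLookup materializes every group up front.
    public static int CountLargeGroupsWithLookup(IEnumerable<string> employers, int minimum)
    {
        return employers.ToLookup(e => e).Count(g => g.Count() >= minimum);
    }
}
```

With 100 records for Employer A, 20 for Employer B and 30 for Employer C, both methods return 2 for a minimum of 30.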

Displaying specific elements of a collection's subcollection

I have a List collection that contains a List subcollection as a property within it, and I want to filter out items in that subcollection based on the value of certain properties.
To simplify, I'll call the main collection THING and the subcollection SUBTHING. They are different types. THINGS can have 1 to many SUBTHINGS. SUBTHING has 2 properties I want to filter by, PROP1 should equal 1 (it can equal 1,2,3) and PROP2 should not be NULL (it can contain a string).
So when I use a query like the one below it seems to give me what I want (though I'm not sure All() is doing what I expect):
search = from c in search
where c.SUBTHING.All(s=>s.PROP1==1)
select c;
Then I get suspicious when I add the other property:
search = from c in search
where c.SUBTHING.All(s=>s.PROP1==1 && s.PROP2 != NULL)
select c;
And I get THINGS that have PROP2 as Null.
When I switch to Any() I lose all filtering on SUBTHING and it shows SUBTHINGS where PROP1 = 1,2,3 and where PROP2 is NULL and not NULL.
What I'm trying to get is a collection that lists all THING IDs and then lists the Name of all SUBTHINGS, sort of like this:
THING.ID
SUBTHING.Name
SUBTHING.Name
THING.ID
SUBTHING.Name
SUBTHING.Name
Is it also possible to filter SUBTHINGS while filtering THINGS with LINQ, given that THING and SUBTHING are two different types?
Try something like this:
var result =
    from c in search
    where c.SUBTHING.All(s => s.PROP1 == 1 && s.PROP2 != null)
    select new {
        ThingId = c.ThingID,
        Something = c.SUBTHING.Select(x => x.Name)
    };
To apply filter on subitems try:
from product in products
where product.productid == 1
from image in product.productimages
where image.ismainimage
select image.imagename
From : 101 linq queries
One way is to flatten the subcollection with a second from clause, filter with where, and project to an anonymous type:
var result = from thing in search
from subthing in thing.subthings
where subthing.prop1 == 1 && subthing.prop2 != null
select new {ID = thing.ID, Name = subthing.Name};
foreach(var x in result)
{
    Console.WriteLine("ID={0} Name={1}", x.ID, x.Name);
}
You need a projection as you are querying over the parent entity (THING) but in the result set you want to only have a subset of its SUBTHINGS.
You can do it e.g. in the following way:
class Thing
{
    // Must be accessible from the query, so it is public here
    public Thing(Thing original, IEnumerable<Subthing> subthings)
    {
        // Initialize based on original and set the collection
        //
        ...
    }
}
and then run the query like this:
var filtered = from c in search
               select new Thing(c, c.Subthings.Where(x => x.PROP1 == 1 && x.PROP2 != null));
I'm not sure any of these answers really give you what you want (although they're close). From my understanding, you want a list of THINGs in which at least 1 SUBTHING has the values you're interested in (in this case, Prop1 == 1 and Prop2 != null). There are a few options here, just depends on whether you're working from a THING or a SUBTHING perspective.
Option 1: THING approach.
You're looking at any THING that has a SUBTHING with your condition. So:
var result = from thing in search
where thing.Subthings.Any(tr => tr.Prop1 == 1 && tr.Prop2 != null)
select new { ID = thing.ID, Names = thing.Subthings.Where(tr => tr.Prop1 == 1 && tr.Prop2 != null).Select(tr => tr.Name) };
Option 2: SUBTHING approach.
You're looking at ALL SUBTHINGs and finding the ones where the condition is met, grouping by the ID at that point.
var result = from thing in search
from sub in thing.Subthings
where sub.Prop1 == 1 && sub.Prop2 != null
group sub by thing.ID into sg
select new { ID = sg.Key, Names = sg.Select(tr => tr.Name) };
I like this approach just a little better, but still room for improvement. The reason I like this is because you find the SUBTHINGs first, and only then will it pull the THING that's associated with it (instead of first having to find if any SUBTHING matches the criteria, and THEN selecting it).
Option 3: Hybrid approach.
This is a little of both. We're going to select from SUBTHINGs either way, so might as well just perform the select. Then, if any of the projected subcollections have any elements, then we return our THING with the Names.
var result = from thing in search
let names = thing.Subthings
.Where(sub => sub.Prop1 == 1 && sub.Prop2 != null)
.Select(sub => sub.Name)
where names.Any()
select new { ID = thing.ID, Names = names };
Cleanest option, in my opinion. The Any extension method on a collection without any parameters will return true if there are any items in the collection. Perfect for our situation.
Hope that helps, let us know what you came up with.
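For anyone who wants to try the hybrid approach end to end, here is a small self-contained sketch. The `Thing` and `Subthing` classes and the `SubthingFilterSketch` helper are hypothetical stand-ins for the real types. It also shows why All() can surprise: it only passes THINGs where every SUBTHING matches, and it is true for a THING with no SUBTHINGs at all.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types standing in for THING and SUBTHING.
public class Subthing
{
    public string Name { get; set; }
    public int Prop1 { get; set; }
    public string Prop2 { get; set; }
}

public class Thing
{
    public int ID { get; set; }
    public List<Subthing> Subthings { get; set; } = new List<Subthing>();
}

public static class SubthingFilterSketch
{
    // Maps each THING id to the names of its matching SUBTHINGs,
    // skipping THINGs that have no matching SUBTHING.
    public static Dictionary<int, List<string>> MatchingNames(IEnumerable<Thing> things)
    {
        var query = from thing in things
                    let names = thing.Subthings
                        .Where(sub => sub.Prop1 == 1 && sub.Prop2 != null)
                        .Select(sub => sub.Name)
                        .ToList()
                    where names.Any()
                    select new { thing.ID, Names = names };

        return query.ToDictionary(x => x.ID, x => x.Names);
    }

    // True only if every SUBTHING matches; also true when there are none.
    public static bool AllMatch(Thing thing)
    {
        return thing.Subthings.All(sub => sub.Prop1 == 1 && sub.Prop2 != null);
    }
}
```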

Linq Union: How to add a literal value to the query?

I need to add a literal value to a query. My attempt
var aa = new List<long>();
aa.Add(0);
var a = Products.Select(p => p.sku).Distinct().Union(aa);
a.ToList().Dump(); // LinqPad's way of showing the values
In the above example, I get an error:
"Local sequence cannot be used in LINQ to SQL implementation
of query operators except the Contains() operator."
If I am using Entity Framework 4 for example, what could I add to the Union statement to always include the "seed" ID?
I am trying to produce SQL code like the following:
select distinct ID
from product
union
select 0 as ID
So later I can join the list to itself so I can find all values where the next highest value is not present (finding the lowest available ID in the set).
Edit: Original Linq Query to find lowest available ID
var skuQuery = Context.Products
.Where(p => p.sku > skuSeedStart &&
p.sku < skuSeedEnd)
.Select(p => p.sku).Distinct();
var lowestSkuAvailableList =
(from p1 in skuQuery
from p2 in skuQuery.Where(a => a == p1 + 1).DefaultIfEmpty()
where p2 == 0 // zero is default for long where it would be null
select p1).ToList();
var Answer = (lowestSkuAvailableList.Count == 0
? skuSeedStart :
lowestSkuAvailableList.Min()) + 1;
This code creates two SKU sets offset by one, then selects the SKU where the next highest doesn't exist. Afterward, it selects the minimum of that (lowest SKU where next highest is available).
For this to work, the seed must be in the set joined together.
Your problem is that your query is being turned entirely into a LINQ-to-SQL query, when what you need is a LINQ-to-SQL query with local manipulation on top of it.
The solution is to tell the compiler that you want to use LINQ-to-Objects after processing the query (in other words, change the extension method resolution to look at IEnumerable<T>, not IQueryable<T>). The easiest way to do this is to tack AsEnumerable() onto the end of your query, like so:
var aa = new List<long>();
aa.Add(0);
var a = Products.Select(p => p.sku).Distinct().AsEnumerable().Union(aa);
a.ToList().Dump(); // LinqPad's way of showing the values
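The same switch from IQueryable to IEnumerable can be tried against an in-memory queryable. This is only a sketch (`UnionSeedSketch` and `DistinctSkusWithSeed` are hypothetical names); with a real provider, everything before AsEnumerable() would run in the database and the Union would run locally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnionSeedSketch
{
    public static List<long> DistinctSkusWithSeed(IQueryable<long> skus, long seed)
    {
        return skus
            .Distinct()              // still part of the provider query
            .AsEnumerable()          // from here on, LINQ to Objects
            .Union(new[] { seed })   // local sequence is fine now
            .ToList();
    }
}
```

The trade-off is that the distinct values are pulled into memory first, so any later join on the result also happens locally.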
Up front: not answering exactly the question you asked, but solving your problem in a different way.
How about this:
var a = Products.Select(p => p.sku).Distinct().ToList();
a.Add(0);
a.Dump(); // LinqPad's way of showing the values
You could create a database table for storing constant values and pass a query on that table to the Union operator.
For example, let's imagine table "Defaults" with fields "Name" and "Value" with only one record ("SKU", 0).
Then you can rewrite your expression like this:
var zero = context.Defaults.Where(_=>_.Name == "SKU").Select(_=>_.Value);
var result = context.Products.Select(p => p.sku).Distinct().Union(zero).ToList();
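If the SKUs in the seed range end up in memory anyway, the "lowest available ID" search from the question can be sketched without a self-join. This is an in-memory alternative, not the original query; `LowestFreeSkuSketch` and `LowestFree` are hypothetical names, and the parameters mirror `skuSeedStart` and `skuSeedEnd` from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LowestFreeSkuSketch
{
    // Returns the smallest value above seedStart that is not already taken.
    // If every value in the range is used, the result equals seedEnd,
    // so callers should check for that case.
    public static long LowestFree(IEnumerable<long> skus, long seedStart, long seedEnd)
    {
        var taken = new HashSet<long>(skus.Where(s => s > seedStart && s < seedEnd));

        // The seed plays the role of the literal added via Union:
        // it guarantees that seedStart + 1 is considered as a candidate.
        taken.Add(seedStart);

        // The lowest value whose successor is missing marks the first gap.
        return taken.Where(s => !taken.Contains(s + 1)).Min() + 1;
    }
}
```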
