Creating count based upon counts of Group By Statement in LINQ / lambda - c#

What I want to do is group a huge bunch of records together by Employer. Then, I want to return an integer variable which will have a count of only those groups with at least 30 records.
I.E. I have 100 subscribers at Employer A, 20 at Employer B, and 30 at Employer C.
I group the records together and come up with
Employer A - 100
Employer B - 20
Employer C - 30
I want to return the Scalar Variable of 2.
Here is what I currently have:
var Step1 =
(from y in recordsActivstJoin
where y.q.Market.Contains(market) && y.x.ActivistCodeID.Equals(activismCode)
select new {y}).ToList();
//this groups my previous query
var Step2 = (from z in Step1 group z by z.y.q.Employer into f select new {f}).ToList();
When I watch the locals I can see that it does in fact group down from 34 rows in Step 1 to 17 in Step 2. Now I want to narrow this to only those groups with a count >= 30.
Any suggestions?

I'm not the best at writing LINQ blindly, but I'm fairly certain you are looking for something very close to the following:
var Step1 =
(from y in recordsActivstJoin
where y.q.Market.Contains(market) && y.x.ActivistCodeID.Equals(activismCode)
select new {y}).ToList();
//this groups my previous query
var Step2 = (from i in Step1 group i by i.y.q.Employer into groupedEmployees
select new
{
EmployeeCount = groupedEmployees.Count()
}).Where(n=>n.EmployeeCount >= 30).Count();
Patrick pointed out that this could be shortened to:
var Step2 = Step1.GroupBy(i => i.y.q.Employer).Count(g => g.Count() >= 30);
Step2 should be 2 in your example. Hope this helps!
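As a quick check of the shortened form, here is a minimal in-memory sketch using the numbers from the question. The employers list is made-up sample data standing in for the filtered records, not the asker's actual types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: one employer name per subscriber record.
List<string> employers = Enumerable.Repeat("Employer A", 100)
    .Concat(Enumerable.Repeat("Employer B", 20))
    .Concat(Enumerable.Repeat("Employer C", 30))
    .ToList();

// Group by employer, then count only the groups with at least 30 records.
int largeGroups = employers
    .GroupBy(e => e)
    .Count(g => g.Count() >= 30);

Console.WriteLine(largeGroups); // prints 2
```

Employer A (100) and Employer C (30) pass the >= 30 test, Employer B (20) does not.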

As an alternative:
Query to group records by employer:
Code:
var groupedRecords = recordsActivstJoin
.Where(y => y.q.Market.Contains(market) && y.x.ActivistCodeID.Equals(activismCode))
.ToLookup(y => y.q.Employer);
Count of the groups with at least 30 entries:
Code:
Int32 count = groupedRecords.Count(g => g.Count() >= 30);
Notes:
ToLookup is used as it is most likely avalanche-safe, whereas GroupBy typically is not. It depends on the provider used to query your data: there is no difference with LINQ to Objects, whilst for LINQ to SQL there is a massive difference on big, varied data sets.
ToLookup executes immediately, though, so if you want to defer execution of the grouping you will need to go down a different path.
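The immediate-versus-deferred point can be seen with LINQ to Objects alone. This is only a sketch on a made-up list; it says nothing about how a particular database provider behaves.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var names = new List<string> { "A", "A", "B" };

// GroupBy is deferred: nothing is grouped until the result is enumerated.
IEnumerable<IGrouping<string, string>> deferred = names.GroupBy(n => n);

// ToLookup runs immediately and captures the source as it is right now.
ILookup<string, string> immediate = names.ToLookup(n => n);

names.Add("C"); // added after both calls

int deferredGroups = deferred.Count();  // grouping happens here, so it sees "C"
int immediateGroups = immediate.Count;  // built earlier, so it does not see "C"

Console.WriteLine($"{deferredGroups} {immediateGroups}"); // prints "3 2"
```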

Related

What is the difference between group n by vs group n by into g in LINQ?

I notice that both LINQ queries produce the same output. What is the difference between these two queries for grouping? Is it because into can group by two elements?
var groupBy = from n in numbers
group n by n;
And:
var groupBy = from n in numbers
group n by n into g
select g;
The difference stands out immediately in method syntax:
var groupBy = numbers.GroupBy(n => n);
vs. (with into)
var groupBy = numbers.GroupBy(n => n).Select(g => g);
Now your example isn't too useful to demonstrate the practical differences because each group is just one item, so let's take this example:
var group = from c in Company
group c by c.City;
If this is all we need, listing companies by cities, we're done. But if we want to do anything with the results of the grouping we need into and select, for example:
var group = from c in Company
group c by c.City
into cg
select new
{
City = cg.Key,
NumberOfCompanies = cg.Count()
};
In method syntax:
var group = Company
    .GroupBy(c => c.City)
    .Select(cg => new
    {
        City = cg.Key,
        NumberOfCompanies = cg.Count()
    });
https://codeblog.jonskeet.uk/2010/09/15/query-expression-syntax-continuations/
When "into" is used after either a "group x by y" or "select x"
clause, it’s called a query continuation. (Note that "join … into"
clauses are not query continuations; they’re very different.) A query
continuation effectively says, "I’ve finished one query, and I want to
do another one with the results… but all in one expression."
The into keyword makes your query a continuation: it effectively starts a new query with the results of the old one in a new range variable.
You can also see how they are compiled.
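To make the continuation concrete, here is a small sketch on a made-up array. The first query stops at the group clause; the second continues with g as the new range variable and projects each group.

```csharp
using System;
using System.Linq;

int[] numbers = { 1, 1, 2, 3, 3, 3 };

// Without "into": the query ends at the group clause.
var plain = from n in numbers
            group n by n;

// With "into": g ranges over the groups, so they can be projected further.
var counted = from n in numbers
              group n by n into g
              select new { Value = g.Key, Count = g.Count() };

foreach (var item in counted)
    Console.WriteLine($"{item.Value}: {item.Count}");
```

Both queries produce three groups; only the second one turns each group into a value with its count.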

Please help me write LINQ statement for following SQL query

My Db named MyDB has 5 tables: MSize, MModel, Met, MResult, SResult. They are connected as follows:
MSize has a common field MSizeId with MModel.
MModel links with Met with MModelId.
Met can be linked with MResult on basis of MId.
Similarly SResult can be linked with MResult on SResultId.
My aim is to get the average accuracy of all the items (the Description field in the MSize table) with Acc (decimal data type) >= 70 and <= 130, grouped by description.
Here is my SQL query:
use MyDB;
SELECT a.[Description],AVG(CASE WHEN d.[Acc] >= 70 AND d.[Acc] <= 130 THEN d.[Acc] END)
FROM MSize a
INNER JOIN MModel b ON a.MSizeId = b.MSizeId
INNER JOIN Met c ON b.MModelId = c.MModelId
INNER JOIN MResult d ON c.MId = d.MId
INNER JOIN SResult e ON d.SResultId = e.SResultId
GROUP BY a.Description
This query gives me the correct result on SQL server.
I have been struggling to write a LINQ query for the same. The problem comes with the SQL CASE statement. I don't want to specify the false result of the CASE, meaning, if d.acc doesn't fall in the range specified in SQL query, discard it.
Assuming all Model classes and fields have the same name as these DBtables and columns. What can be the LINQ query for the given SQL statement?
You can fill up the code here in curly braces:
using (var db = new MyDBContext()){ }
here MyDBContext refers to Partial Class Data Model template generated by LINQ
You didn't bother to write the classes, and I'm not going to do that for you.
Apparently you have a sequence of MSizes, where every MSize has zero or more MModels. Every MModel has zero or more Mets, every Met has zero or more MResults, and every MResult has an Acc.
You also forgot to write your requirements in words, so I had to extract them from your SQL query.
It seems that you want the Description of every MSize together with the average of all its Accs that have a value between 70 and 130.
If you use Entity Framework, you can use the virtual ICollection, which makes life fairly easy. I'll do it in two steps, because below I do the same with a GroupJoin without using the ICollection. The second part is the same for both methods.
First I'll fetch the Description of every MSize, together with all its deeper Acc that are in the MResults of the Mets of the MModels of this MSize:
var descriptionsWithTheirAccs = dbContext.MSizes.Select(msize => new
{
    Description = msize.Description,
    // SelectMany the inner collections until you see the Accs
    Accs = msize.MModels
        // flatten the Mets of every MModel
        .SelectMany(model => model.Mets)
        // flatten the MResults of every Met
        .SelectMany(met => met.MResults)
        // from every MResult take the Acc
        .Select(mResult => mResult.Acc),
});
Now that we have the Description of every MSize with all Accs that it has deep inside it,
we can throw away all Accs that we don't want and Average the remaining ones:
var result = descriptionsWithTheirAccs.Select(descriptionWithItsAccs => new
{
    Description = descriptionWithItsAccs.Description,
    Average = descriptionWithItsAccs.Accs
        .Where(acc => 70 <= acc && acc <= 130)
        // and the average of all remaining Accs
        .Average(),
});
If you don't have access to the ICollections, you'll have to do the GroupJoin yourself, which looks pretty horrible if you have so many tables:
var descriptionsWithTheirAccs = dbContext.MSizes.GroupJoin(dbContext.MModels,
    msize => msize.MSizeId,
    mmodel => mmodel.MSizeId,
    (msize, mmodels) => new
    {
        Description = msize.Description,
        Accs = mmodels.GroupJoin(dbContext.Mets,
            mmodel => mmodel.MModelId,
            met => met.MModelId,
            (mmodel, metsOfThisModel) => metsOfThisModel
                .GroupJoin(dbContext.MResults,
                    met => met.MId,
                    mresult => mresult.MId,
                    // result selector: the Accs of this Met's MResults
                    (met, mresults) => mresults.Select(mresult => mresult.Acc))
                // flatten: one sequence of Accs per MModel
                .SelectMany(accs => accs))
            // flatten again: one sequence of Accs per MSize
            .SelectMany(accs => accs),
    });
Now that you have the descriptionsWithTheirAccs, you can use the Select above to calculate the averages.
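The "discard it" behaviour of the SQL CASE without an ELSE can also be mimicked with nullable values. The following is an in-memory sketch only: the rows array is made-up data standing in for the joined tables, and whether a given provider translates the same expression to SQL is an assumption you would need to check.

```csharp
using System;
using System.Linq;

// Hypothetical flattened rows (Description, Acc), as the joins would produce.
var rows = new[]
{
    (Description: "Small", Acc: 60m),
    (Description: "Small", Acc: 80m),
    (Description: "Small", Acc: 100m),
    (Description: "Large", Acc: 150m),
};

var averages = rows
    .GroupBy(r => r.Description)
    .Select(g => new
    {
        Description = g.Key,
        // Out-of-range values become null. Average() over decimal? skips nulls
        // and returns null when nothing is left, like AVG(CASE WHEN ... END).
        Average = g.Select(r => r.Acc >= 70m && r.Acc <= 130m ? (decimal?)r.Acc : null)
                   .Average(),
    })
    .ToList();

foreach (var a in averages)
    Console.WriteLine($"{a.Description}: {(a.Average.HasValue ? a.Average.ToString() : "NULL")}");
```

Here "Small" averages 80 and 100 (60 is out of range), and "Large" has no value in range, so its average is null rather than an exception.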

Group list and perform a function on the resulting matching items

I want to retrieve data from a database by a grouping query and then calculate their ratio. Take for instance positions in a warehouse where you retrieve stock values for 2 days and you want to know the change ratio.
E.g.:
var query = from o in dbContext.Orders
where (o.Date == firstDate) || (o.Date == secondDate)
group o by o.Date into g
select g;
How can I now (inside or outside of the query) calculate the change ratio of the matching order items? (The ratio is defined as (newOrder.Stock / oldOrder.Stock) - 1.) I know how to do it in a simple, somewhat verbose way, but I was hoping there is a more elegant solution in LINQ.
Edit: An example of the data queried and the desired result.
ID Date InStock ItemID
1 15.01 5000 1
2 16.01 7000 1
3 15.01 9000 2
4 16.01 2000 2
This would show a 40% increase for item 1 and a 78% decrease for item 2.
I already did achieve this by separating the groups into two lists and then checking each list for the corresponding items in the other list. This way you can easily calculate the ratios but you create some new variables and nested foreach loops which seem unnecessary. I'm simply searching for a more elegant solution.
var query = from o in orders
where (o.Date == firstDate || o.Date == secondDate)
group o by o.ItemID into g
select new
{
ItemID = g.Key,
DateStart = g.ElementAt(0).Date,
DateEnd = g.ElementAt(1).Date,
Ratio = g.ElementAt(1).InStock / (float)g.ElementAt(0).InStock
};
We use the fact that we know that each grouping will only contain two items (one for each date) to simply select the result of the division in a new anonymously typed item.
My example used int as the type for InStock, so feel free to remove the (float) cast if it's not needed.
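A variation on the query above, as a sketch: it sorts each group by date so the old and new rows cannot be swapped, and subtracts 1 to match the ratio definition in the question. The rows are the question's sample data; the year is a made-up placeholder, and the statement lambda means this form is for in-memory data only.

```csharp
using System;
using System.Linq;

// Sample rows from the question; the year is a made-up placeholder.
var orders = new[]
{
    (Date: new DateTime(2020, 1, 15), InStock: 5000, ItemID: 1),
    (Date: new DateTime(2020, 1, 16), InStock: 7000, ItemID: 1),
    (Date: new DateTime(2020, 1, 16), InStock: 2000, ItemID: 2), // newer row listed first
    (Date: new DateTime(2020, 1, 15), InStock: 9000, ItemID: 2),
};

var changes = orders
    .GroupBy(o => o.ItemID)
    .Select(g =>
    {
        // Sort so index 0 is the old row and index 1 is the new one.
        var byDate = g.OrderBy(o => o.Date).ToList();
        return new
        {
            ItemID = g.Key,
            Change = (double)byDate[1].InStock / byDate[0].InStock - 1,
        };
    })
    .ToList();

foreach (var c in changes)
    Console.WriteLine($"Item {c.ItemID}: {c.Change:P0}");
```

Like the original, this still assumes exactly two rows per item; a group with only one row would throw at byDate[1].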

how to take 100 records from linq query based on a condition

I have a query that returns a result set, and I want to limit it to 100 records based on a condition. I have a variable x: if x is 100, I need to apply .Take(100); otherwise I need all the records.
var abc=(from st in Context.STopics
where st.IsActive==true && st.StudentID == 123
select new result()
{
name = st.name }).ToList().Take(100);
Because LINQ returns an IQueryable which has deferred execution, you can create your query, then restrict it to the first 100 records if your condition is true and then get the results. That way, if your condition is false, you will get all results.
var abc = (from st in Context.STopics
where st.IsActive && st.StudentID == 123
select new result
{
name = st.name
});
if (x == 100)
abc = abc.Take(100);
var results = abc.ToList(); // a List cannot be assigned back to the IQueryable variable
Note that it is important to do the Take before the ToList, otherwise, it would retrieve all the records, and then only keep the first 100 - it is much more efficient to get only the records you need, especially if it is a query on a database table that could contain hundreds of thousands of rows.
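The same pattern can be tried without a database. In this sketch an in-memory range exposed through AsQueryable() stands in for Context.STopics, and x is the hypothetical condition variable from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for Context.STopics: an in-memory source exposed as IQueryable.
IQueryable<int> source = Enumerable.Range(1, 250).AsQueryable();

int x = 100; // hypothetical condition variable from the question

// Build the query first; nothing runs yet.
IQueryable<int> query = source.Where(n => n % 2 == 0).OrderBy(n => n);

// Optionally narrow the query before it is executed.
if (x == 100)
    query = query.Take(100);

// Execution happens here, with or without the Take.
List<int> results = query.ToList();

Console.WriteLine(results.Count); // prints 100
```

With any other value of x the list would contain all 125 even numbers.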
One of the most important things to pair with the SQL TOP command is ORDER BY. You should not use TOP without ORDER BY, because it may return different results in different situations.
The same applies to LINQ.
var results = Context.STopics.Where(st => st.IsActive && st.StudentID == 123)
.Select(st => new result(){name = st.name})
.OrderBy(r => r.name)
.Take(100).ToList();
Take and Skip operations are well defined only against ordered sets.
Although the other users are correct in giving you the results you want...
This is NOT how you should be using Entity Framework.
This is the better way to use EF.
var query = from student in Context.Students
where student.Id == 123
from topic in student.Topics
orderby topic.Name
select topic;
Notice how the structure more closely follows the logic of the business requirements.
You can almost read the code in English.

Linq Union: How to add a literal value to the query?

I need to add a literal value to a query. My attempt
var aa = new List<long>();
aa.Add(0);
var a = Products.Select(p => p.sku).Distinct().Union(aa);
a.ToList().Dump(); // LinqPad's way of showing the values
In the above example, I get an error:
"Local sequence cannot be used in LINQ to SQL implementation
of query operators except the Contains() operator."
If I am using Entity Framework 4 for example, what could I add to the Union statement to always include the "seed" ID?
I am trying to produce SQL code like the following:
select distinct ID
from product
union
select 0 as ID
So later I can join the list to itself so I can find all values where the next highest value is not present (finding the lowest available ID in the set).
Edit: Original Linq Query to find lowest available ID
var skuQuery = Context.Products
.Where(p => p.sku > skuSeedStart &&
p.sku < skuSeedEnd)
.Select(p => p.sku).Distinct();
var lowestSkuAvailableList =
(from p1 in skuQuery
from p2 in skuQuery.Where(a => a == p1 + 1).DefaultIfEmpty()
where p2 == 0 // zero is default for long where it would be null
select p1).ToList();
var Answer = (lowestSkuAvailableList.Count == 0
? skuSeedStart :
lowestSkuAvailableList.Min()) + 1;
This code creates two SKU sets offset by one, then selects the SKU where the next highest doesn't exist. Afterward, it selects the minimum of that (lowest SKU where next highest is available).
For this to work, the seed must be in the set joined together.
Your problem is that your query is being turned entirely into a LINQ-to-SQL query, when what you need is a LINQ-to-SQL query with local manipulation on top of it.
The solution is to tell the compiler that you want to use LINQ-to-Objects after processing the query (in other words, change the extension method resolution to look at IEnumerable<T>, not IQueryable<T>). The easiest way to do this is to tack AsEnumerable() onto the end of your query, like so:
var aa = new List<long>();
aa.Add(0);
var a = Products.Select(p => p.sku).Distinct().AsEnumerable().Union(aa);
a.ToList().Dump(); // LinqPad's way of showing the values
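A self-contained sketch of the same shape, without LINQPad or a database: a list exposed through AsQueryable() stands in for the Products sku column. An in-memory queryable would accept the Union either way, so this only illustrates where the switch to LINQ to Objects happens, not the provider error itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for Products.Select(p => p.sku); the values are made up.
IQueryable<long> skus = new List<long> { 3, 1, 3, 2 }.AsQueryable();
var seed = new List<long> { 0 };

// Distinct() is still part of the IQueryable query. After AsEnumerable(),
// Union binds to Enumerable.Union and runs in memory.
List<long> withSeed = skus.Distinct().AsEnumerable().Union(seed).ToList();

Console.WriteLine(string.Join(", ", withSeed)); // prints "3, 1, 2, 0"
```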
Up front: not answering exactly the question you asked, but solving your problem in a different way.
How about this:
var a = Products.Select(p => p.sku).Distinct().ToList();
a.Add(0);
a.Dump(); // LinqPad's way of showing the values
You could create a database table for storing constant values and pass a query on this table to the Union operator.
For example, let's imagine table "Defaults" with fields "Name" and "Value" with only one record ("SKU", 0).
Then you can rewrite your expression like this:
var zero = context.Defaults.Where(_=>_.Name == "SKU").Select(_=>_.Value);
var result = context.Products.Select(p => p.sku).Distinct().Union(zero).ToList();
