Group on Two Columns and Create Two Separate Counts


Question
We have a table of StudentId and LectureId, and we want to know two things.
1. CountStudentId: how many times each StudentId occurs.
2. CountStudentIdLectureId: how many times each StudentId + LectureId pair occurs.
(2) is done in the query below; (1) is not.
In other words, how can we count two different groups in a single query?
Another way of thinking about this would be to count the StudentId + LectureId group and also sum that count for each StudentId.
What we've tried
The following query groups on StudentId + LectureId. It does count how many times the StudentId + LectureId group occurs. It doesn't count how many times the StudentId group occurs. That's what we also want.
var query = joinTable
    .GroupBy(jt => new { jt.StudentId, jt.LectureId } )
    .Select(g => new {
        StudentId = g.Key.StudentId,
        LectureId = g.Key.LectureId,
        CountStudentId = -1, // Count all StudentId (i.e. 10)?
        CountStudentIdLectureId = g.Count()
    });
This is the result we're currently receiving. In each row, the -1 value should be 10 (because we seeded the JoinTable with ten of each StudentId) and we haven't achieved that.
Results we want to achieve
...but with 10 instead of -1 in each case.
StudentId  LectureId  CountStudentId  CountStudentLectureId
0          0          -1              3
0          1          -1              3
0          2          -1              3
0          3          -1              1
1          0          -1              2
1          1          -1              3
1          2          -1              3
1          3          -1              2
In those results, we need CountStudentId to be 10 not -1 (the latter is just a placeholder for now.)
That's the expected result, because each StudentId occurs 10 times and because the sum of CountStudentLectureId for each StudentId is 10, which is just two ways of saying the same thing.
Full demo code
This is the full Fiddle code for reference.
using System;
using System.Linq;
using System.Collections.Generic;

public static class Program
{
    public static void Main()
    {
        var joinTable = SeedJoinTable();
        var query = joinTable
            .GroupBy(jt => new { jt.StudentId, jt.LectureId } )
            .Select(g => new {
                StudentId = g.Key.StudentId,
                LectureId = g.Key.LectureId,
                CountStudentId = -1, // Count all StudentId (i.e. 10)?
                CountStudentIdLectureId = g.Count()
            });

        // this is just the printing of the results
        Console.WriteLine(
            "StudentId".PadRight(15) +
            "LectureId".PadRight(15) +
            "CountStudentId".PadRight(17) +
            "CountStudentLectureId".PadRight(15));
        foreach(var x in query)
        {
            Console.WriteLine(string.Format("{0}{1}{2}{3}",
                x.StudentId.ToString().PadRight(15),
                x.LectureId.ToString().PadRight(15),
                x.CountStudentId.ToString().PadRight(17),
                x.CountStudentIdLectureId.ToString().PadRight(15)));
        }
    }

    public static List<JoinTable> SeedJoinTable()
    {
        var list = new List<JoinTable>();
        var studentId = 0;
        var lectureId = 0;
        // insert 20 records
        for(int i = 0; i < 20; ++i)
        {
            if(i != 0)
            {
                if(i % 10 == 0)
                {
                    // 10 of each studentId
                    ++studentId;
                    lectureId = 0;
                }
                if(i % 3 == 0)
                {
                    // 3 of each lectureId per student
                    ++lectureId;
                }
            }
            list.Add(new JoinTable() {
                StudentId = studentId,
                LectureId = lectureId
            });
        }
        return list;
    }

    public class JoinTable
    {
        public int StudentId { get; set; }
        public int LectureId { get; set; }
    }
}

Here is a working DotNetFiddle that produces the results you want to achieve.
Group by StudentId and use LectureId as the element of each group. This gives you the count for each StudentId as well as the count for each StudentId + LectureId pair.
var query = joinTable
    .GroupBy(jt => jt.StudentId, jt => jt.LectureId)
    .Select(x =>
        new {
            StudentId = x.Key,
            CountStudentId = x.Count(),
            LectureIds = x.GroupBy(y => y),
        });
This changes how you loop through the final list, but it provides the same data with the same number of iterations:
foreach(var x in query)
{
    foreach(var lectureId in x.LectureIds)
    {
        Console.WriteLine(string.Format("{0}{1}{2}{3}",
            x.StudentId.ToString().PadRight(15),
            lectureId.Key.ToString().PadRight(15),
            x.CountStudentId.ToString().PadRight(17),
            lectureId.Count().ToString().PadRight(15)));
    }
}
If you want to include anything with the lectureId (lecture name, professor, etc.) you can do so like this:
var query = joinTable
    .GroupBy(jt => jt.StudentId, jt => new {LectureId = jt.LectureId, ProfessorId = jt.ProfessorId})
    .Select(x =>
        new {
            StudentId = x.Key,
            CountStudentId = x.Count(),
            LectureIds = x.GroupBy(y => y),
        });

Related

Editing a property from within a foreach is not affecting the collection

I made up a test case:
I have a table like this
Id | IdA | IdW | Quantity
1  | 1   | 3   | 5
2  | 1   | 4   | 2
3  | 2   | 5   | 3
Id is the primary key, IdA is the article id, IdW is the box id, and Quantity is the number of articles.
Now I have to group by IdA summing the quantities, so I'm doing:
var groups =
    models
        .GroupBy(x => new { x.IdA })
        .Select
        (
            x =>
                new Model
                {
                    Id = (x.Select(y => y.Id).Count() > 1) ? 0 : x.Select(y => y.Id).First(),
                    IdA = x.Key.IdA,
                    Qty = x.Sum(y => y.Qty)
                }
        );
Here models is the table above. It works fine; I also managed to keep the primary key when no grouping is done (there is only one row for that IdA).
Now I want to keep the IdW for the rows that haven't been grouped. The ideal result would be:
Id | IdA | IdW | Quantity
0  | 1   | 0   | 7
3  | 2   | 5   | 3
I tried to do a foreach on the groups, retrieving the row using the primary key, and then setting the IdW to the group, like this:
foreach(var e in groups)
{
    var nonGroupedRow = models.Where(x => e.Id != 0 && x.Id == e.Id).FirstOrDefault();
    var targetModel = groups.FirstOrDefault(x => x.Id == e.Id);
    if(nonGroupedRow != null && targetModel != null)
    {
        targetModel.IdW = nonGroupedRow.IdW;
    }
}
Surprisingly, this is not working: both groups still have IdW = 0. I also wrote another test to be sure:
void Main()
{
    var a = new List<A> { new A { Id = 1 }, new A { Id = 2 } };
    a.FirstOrDefault(x => x.Id == 1).Id = 2;
    // both have Id = 2
}

class A
{
    public long Id {get;set;}
}
In my head this just has to work, especially given the example above, yet it doesn't. Where am I going wrong?

First, x.Count() already counts the entries in the group, so you do not need to select the Ids before counting; the result is the same.
Second, the same applies to retrieving the Id: take the first entry of the group and read its Id property.
You basically had the answer already:
var groups =
    models
        .GroupBy(x => new { x.IdA })
        .Select
        (
            x =>
                new Model
                {
                    Id = (x.Count() > 1) ? 0 : x.First().Id,
                    IdA = x.Key.IdA,
                    IdW = (x.Count() > 1) ? 0 : x.First().IdW,
                    Qty = x.Sum(y => y.Qty)
                }
        );
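As for why the original foreach appeared to have no effect: a likely cause, assuming groups was left as a deferred query (no ToList() or ToArray()), is that every enumeration re-runs the Select and builds fresh objects, so the assignment lands on an instance that is thrown away. The sketch below illustrates this with StrongBox<int>, a small built-in class with a mutable Value field that stands in for Model here.

```csharp
using System;
using System.Linq;
using System.Runtime.CompilerServices;

var ids = new[] { 1, 2 };

// Deferred query: each enumeration re-runs Select and allocates new objects.
var deferred = ids.Select(id => new StrongBox<int>(id));
deferred.First().Value = 99;                    // mutates a temporary object
Console.WriteLine(deferred.First().Value);      // 1: the change was lost

// Materialized once: the list owns its objects, so changes persist.
var materialized = ids.Select(id => new StrongBox<int>(id)).ToList();
materialized.First().Value = 99;
Console.WriteLine(materialized.First().Value);  // 99
```

The List<A> test in the question behaves like the second case, which is why it worked there.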

linq Contains but less

I have two lists that I use to search a table:
List<long> searchListIds = new List<long>();
searchListIds.Add(1);
searchListIds.Add(2);
List<long> searchListFieldValues = new List<long>();
searchListFieldValues.Add(100);
searchListFieldValues.Add(50);
and my query is:
var adsWithRelevantadFields =
from adField in cwContext.tblAdFields
group adField by adField.adId into adAdFields
where searchListIds.All(i => adAdFields.Select(co => co.listId).Contains(i))
&& searchListFieldValues.All(i => adAdFields.Select(co => co.listFieldValue).Contains(i))
select adAdFields.Key;
Everything works, but now I need to get all records whose values are less than those in searchListFieldValues. I mean:
all adId that have (listId == 1)&(listFieldValue <100) AND (listId == 2)&(listFieldValue <50)
The Contains part must change to something like a "contains less than".
example:
cwContext.tblAdFields:
id              1    2    3    4    5    6    7
adId            1    2    1    2    3    3    3
listId          1    1    2    2    1    2    3
listfieldValue  100  100  50   50   100  49   10
Now, if I want to get (listId == 1)&(listFieldValue ==100) AND (listId == 2)&(listFieldValue ==50), my code works and returns adId: 1,2
but I can't get
all adId that have (listId == 1)&(listFieldValue ==100) AND (listId == 2)&(listFieldValue <50)
It must return 3.

You should try changing Contains to Any, but I'm not sure whether LINQ to Entities will translate it into a proper SQL statement.
var adsWithRelevantadFields =
from adField in cwContext.tblAdFields
group adField by adField.adId into adAdFields
where searchListIds.All(i => adAdFields.Select(co => co.listId).Contains(i))
&& searchListFieldValues.All(i => adAdFields.Select(co => co.listFieldValue).Any(x => x < i))
select adAdFields.Key;

Here is a full example that should work if I understood you correctly:
class Program
{
static void Main(string[] args)
{
List<int> searchListIds = new List<int>
{
1,
2,
};
List<int> searchListFieldValues = new List<int>
{
100,
50,
};
List<Tuple<int, int>> searchParameters = new List<Tuple<int,int>>();
for (int i = 0; i < searchListIds.Count; i++)
{
searchParameters.Add(new Tuple<int,int>(searchListIds[i], searchListFieldValues[i]));
}
List<AdField> adFields = new List<AdField>
{
new AdField(1, 1, 1, 100),
new AdField(2, 2, 1, 100),
new AdField(3, 1, 2, 50),
new AdField(4, 2, 2, 50),
new AdField(5, 3, 1, 100),
new AdField(6, 3, 2, 49),
new AdField(7, 3, 3, 10)
};
var result = adFields.Where(af => searchParameters.Any(sp => af.ListId == sp.Item1 && af.ListFieldValue < sp.Item2)).Select(af => af.AdId).Distinct();
foreach (var item in result)
{
Console.WriteLine(item);
}
Console.Read();
}
public class AdField
{
public int Id { get; private set; }
public int AdId { get; private set; }
public int ListId { get; private set; }
public int ListFieldValue { get; private set; }
public AdField(int id, int adId, int listId, int listFieldValue)
{
Id = id;
AdId = adId;
ListId = listId;
ListFieldValue = listFieldValue;
}
}
}

First, you're probably looking for the functionality of Any() instead of Contains(). Second, if each search criterion consists of two items, use one list of Tuple<int,int> instead of two lists. That way you will be able to search efficiently by the combination of listId and fieldValue:
var result = from adField in cwContext.tblAdFields
             where searchParams.Any(sp => adField.listId == sp.Item1 && adField.listFieldValue < sp.Item2)
             group adField by adField.adId into adAdFields
             select adAdFields.Key;
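The last example in the question mixes an equality test with a less-than test and requires every criterion to hold for the same adId. One way to sketch that in memory is to pair each listId with its own predicate and check all pairs per group. The criteria array and its Matches delegate are hypothetical names introduced for this illustration, and the data is the question's sample table as tuples.

```csharp
using System;
using System.Linq;

// Sample rows from the question: (AdId, ListId, Value).
var adFields = new[]
{
    (AdId: 1, ListId: 1, Value: 100),
    (AdId: 2, ListId: 1, Value: 100),
    (AdId: 1, ListId: 2, Value: 50),
    (AdId: 2, ListId: 2, Value: 50),
    (AdId: 3, ListId: 1, Value: 100),
    (AdId: 3, ListId: 2, Value: 49),
    (AdId: 3, ListId: 3, Value: 10),
};

// Hypothetical criteria: each listId carries its own test on the field value.
var criteria = new (int ListId, Func<int, bool> Matches)[]
{
    (1, v => v == 100),
    (2, v => v < 50),
};

var adIds = adFields
    .GroupBy(f => f.AdId)
    // Keep an ad only if EVERY criterion is satisfied by at least one of its rows.
    .Where(g => criteria.All(c => g.Any(f => f.ListId == c.ListId && c.Matches(f.Value))))
    .Select(g => g.Key)
    .ToList();

Console.WriteLine(string.Join(", ", adIds)); // 3
```

Because the predicates are delegates, this works on in-memory data only; a database provider such as LINQ to Entities would most likely need the conditions expressed differently.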

Count of flattened parent child association in LINQ

I'm trying to get a count of parents with no children plus parents' children. As I write this, I realize it is better explained with code, so here it goes:
With these example types:
public class Customer
{
public int Id { get; set; }
public string Name { get; set; }
public List<Order> Orders { get; set; }
}
public class Order
{
public int Id { get; set; }
public string Description { get; set; }
}
And this data:
var customers = new List<Customer>
{
new Customer
{
Id = 2,
Name = "Jane Doe"
},
new Customer
{
Id = 1,
Name = "John Doe",
Orders = new List<Order>
{
new Order { Id = 342, Description = "Ordered a ball" },
new Order { Id = 345, Description = "Ordered a bat" }
}
}
};
// I'm trying to get a count of customer orders added with customers with no orders
// In the above data, I would expect a count of 3 as detailed below
//
// CId Name OId
// ---- -------- ----
// 2 Jane Doe
// 1 John Doe 342
// 1 John Doe 345
int customerAndOrdersCount = {linq call here}; // equals 3
I am trying to get a count of 3 back.
Thank you in advance for your help.
-Jessy Houle
ADDED AFTER:
I was truly impressed with all the great (and quick) answers. For others coming to this question, looking for a few options, here is a Unit Test with a few of the working examples from below.
[TestMethod]
public void TestSolutions()
{
var customers = GetCustomers(); // data from above
var count1 = customers.Select(customer => customer.Orders).Sum(orders => (orders != null) ? orders.Count() : 1);
var count2 = (from c in customers from o in (c.Orders ?? Enumerable.Empty<Order>() ).DefaultIfEmpty() select c).Count();
var count3 = customers.Sum(c => c.Orders == null ? 1 : c.Orders.Count());
var count4 = customers.Sum(c => c.Orders==null ? 1 : Math.Max(1, c.Orders.Count()));
Assert.AreEqual(3, count1);
Assert.AreEqual(3, count2);
Assert.AreEqual(3, count3);
Assert.AreEqual(3, count4);
}
Again, thank you all for your help!

How about
int customerAndOrdersCount = customers.Sum(c => c.Orders==null ? 1 : Math.Max(1, c.Orders.Count()));

If you would initialize that Order property with an empty list instead of a null, you could do:
int count =
(
from c in customers
from o in c.Orders.DefaultIfEmpty()
select c
).Count();
If you decide to keep the uninitialized property around, then instead do:
int count =
(
from c in customers
from o in (c.Orders ?? Enumerable.Empty<Order>() ).DefaultIfEmpty()
select c
).Count();

customers
    .Select(customer => customer.Orders)
    .Sum(orders => (orders != null) ? orders.Count() : 1)

This works if you want to count "no orders" as 1 and count the orders otherwise:
int customerOrders = customers.Sum(c => c.Orders == null ? 1 : c.Orders.Count());
By the way, the question is very exemplary.
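The answers above differ in how they treat a customer whose Orders list is empty rather than null. A minimal sketch of that difference, assuming a hypothetical third customer with an empty list and using plain lists of order ids in place of the Customer class:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// One entry per customer: null = list never initialized, empty = no orders.
var orderLists = new List<List<int>>
{
    null,                        // Jane Doe
    new List<int> { 342, 345 },  // John Doe
    new List<int>(),             // hypothetical customer with an empty list
};

// Counts a null list as 1, but an empty list contributes 0.
int nullOnly = orderLists.Sum(o => o == null ? 1 : o.Count);

// Counts both null and empty lists as 1.
int nullOrEmpty = orderLists.Sum(o => o == null ? 1 : Math.Max(1, o.Count));

// DefaultIfEmpty yields one placeholder element for an empty sequence.
int viaDefaultIfEmpty = orderLists
    .SelectMany(o => (o ?? Enumerable.Empty<int>()).DefaultIfEmpty())
    .Count();

Console.WriteLine($"{nullOnly} {nullOrEmpty} {viaDefaultIfEmpty}"); // 3 4 4
```

With the question's two customers all variants return 3; they only diverge once an empty list appears.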

You're probably looking for something like this:
customers.GroupBy(customer => customer) // group by the object itself
    .Select(c =>                        // select
        new
        {
            ID = c.Key.Id,
            Name = c.Key.Name,
            Count = (c.Key.Orders != null) ? c.Key.Orders.Count() : 0
        }
    );

var orderFreeCustomers = customers.Count(c => c.Orders == null || !c.Orders.Any());
var totalOrders = customers.Where(c => c.Orders != null)
                           .Aggregate(0, (v, e) => v + e.Orders.Count);
The result is the sum of those two values.

C# Linq Average

I have a table with data similar to below:
Group  TimePoint  Value
1      0          1
1      0          2
1      0          3
1      1          3
1      1          5
I want to project a table as such:
Group  TimePoint  AverageValue
1      0          2
1      1          4
EDIT: The data is in a datatable.
Anybody any ideas how this can be done with LINQ or otherwise?
Thanks.

You need to perform a group by.
The LINQ you need is something like:
var query = from item in inputTable
group item by new { Group = item.Group, TimePoint = item.TimePoint } into grouped
select new
{
Group = grouped.Key.Group,
TimePoint = grouped.Key.TimePoint,
AverageValue = grouped.Average(x => x.Value)
} ;
For more Linq samples, I highly recommend the 101 Linq samples page - http://msdn.microsoft.com/en-us/vcsharp/aa336747#avgGrouped

Here's a more function-oriented approach (the way I prefer it). The first line won't compile, so fill it in with your data instead.
var items = new[] { new { Group = 1, TimePoint = 0, Value = 1} ... };
var answer = items.GroupBy(x => new { TimePoint = x.TimePoint, Group = x.Group })
.Select(x => new {
Group = x.Key.Group,
TimePoint = x.Key.TimePoint,
AverageValue = x.Average(y => y.Value),
}
);

You can do:
IEnumerable<MyClass> table = ...
var query = from item in table
group item by new { item.Group, item.TimePoint } into g
select new
{
g.Key.Group,
g.Key.TimePoint,
AverageValue = g.Average(i => i.Value)
};

Assuming a class like this:
public class Record
{
public int Group {get;set;}
public int TimePoint {get;set;}
public int Value {get;set;}
}
var groupAverage = from r in records
group r by new { r.Group, r.TimePoint } into groups
select new
{
Group = groups.Key.Group,
TimePoint = groups.Key.TimePoint,
AverageValue = groups.Average(rec => rec.Value)
};
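Since the question's edit says the data lives in a DataTable, the same grouping can be sketched over DataRow objects using AsEnumerable() and Field<T>(). The column names and int column types below are assumptions taken from the sample table; on .NET Framework these extension methods need a reference to System.Data.DataSetExtensions.

```csharp
using System;
using System.Data;
using System.Linq;

// Build a DataTable matching the question's sample (column types assumed to be int).
var table = new DataTable();
table.Columns.Add("Group", typeof(int));
table.Columns.Add("TimePoint", typeof(int));
table.Columns.Add("Value", typeof(int));
table.Rows.Add(1, 0, 1);
table.Rows.Add(1, 0, 2);
table.Rows.Add(1, 0, 3);
table.Rows.Add(1, 1, 3);
table.Rows.Add(1, 1, 5);

var averages = table.AsEnumerable()
    // Composite key: rows with equal Group and TimePoint fall into one group.
    .GroupBy(r => new { Group = r.Field<int>("Group"), TimePoint = r.Field<int>("TimePoint") })
    .Select(g => new
    {
        g.Key.Group,
        g.Key.TimePoint,
        AverageValue = g.Average(r => r.Field<int>("Value"))
    })
    .ToList();

foreach (var a in averages)
{
    Console.WriteLine($"{a.Group} {a.TimePoint} {a.AverageValue}");
}
// Prints:
// 1 0 2
// 1 1 4
```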

Find 2nd max salary using linq

I have following sql query for finding 2nd max salary.
Select * From Employee E1 Where
(2) = (Select Count(Distinct(E2.Salary)) From Employee E2 Where
E2.Salary > E1.Salary)
I want to convert it into Linq statement.

I think what you're asking is to find the employee with the second-highest salary?
If so, that would be something like
var employee = Employees
.OrderByDescending(e => e.Salary)
.Skip(1)
.First();
If multiple employees may have equal salary and you wish to return an IEnumerable of all the employees with the second-highest salary you could do:
var employees = Employees
.GroupBy(e => e.Salary)
.OrderByDescending(g => g.Key)
.Skip(1)
.First();
(kudos to @diceguyd30 for suggesting this latter enhancement)

List<Employee> employees = new List<Employee>()
{
new Employee { Id = 1, UserName = "Anil" , Salary = 5000},
new Employee { Id = 2, UserName = "Sunil" , Salary = 6000},
new Employee { Id = 3, UserName = "Lokesh" , Salary = 5500},
new Employee { Id = 4, UserName = "Vinay" , Salary = 7000}
};
var emp = employees.OrderByDescending(x => x.Salary).Skip(1).FirstOrDefault();

You can define an equality comparer class as below:
public class EqualityComparer : IEqualityComparer<Employee >
{
#region IEqualityComparer<Employee> Members
bool IEqualityComparer<Employee>.Equals(Employee x, Employee y)
{
// Check whether the compared objects reference the same data.
if (Object.ReferenceEquals(x, y))
return true;
// Check whether any of the compared objects is null.
if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
return false;
return x.Salary == y.Salary;
}
int IEqualityComparer<Employee>.GetHashCode(Employee obj)
{
return obj.Salary.GetHashCode();
}
#endregion
}
and use it as below:
var outval = lst.OrderByDescending(p => p.Salary)
                .Distinct(new EqualityComparer()).Skip(1).First();
or do it without an equality comparer (in two lines):
var sorted = lst.OrderByDescending(p => p.Salary);
var result = sorted.SkipWhile(p => p.Salary == sorted.First().Salary).First();
Edit: As Ani said, to make this work against SQL you should use: var lst = myDataContext.Employees.AsEnumerable(); but for commercial software it's better to use T-SQL or find another LINQ approach.

Using LINQ, you can find the 3rd highest salary like this:
// take the distinct salaries, sort them descending, then skip the first 2 and get the next
var thirdHighestSalary = (from n in db.Employee select n.salary)
    .Distinct().OrderByDescending(s => s).Skip(2).FirstOrDefault();
// write the result to console
Console.WriteLine("Third Highest Salary is : {0}", thirdHighestSalary);

This works with duplicate records and for the nth highest salary; you just need to adjust Take and Skip. For example, the following finds the 3rd highest salary with duplicate records present in the table:
emplist.OrderByDescending(x => x.Salary).Select(x=>x.Salary).Distinct().Take(3).Skip(2).First();

using System;
using System.Collections.Generic;
using System.Linq;

public class Program
{
    public static void Main()
    {
        IList<int> intList = new List<int>() { 10, 21, 91, 30, 91, 45, 51, 87, 87 };
        var largest = intList.Max();
        Console.WriteLine("Largest Element: {0}", largest);
        // Map the largest value to 0 so that Max picks the next largest
        // (this relies on all values being positive).
        var secondLargest = intList.Max(i => {
            if(i != largest)
                return i;
            return 0;
        });
        Console.WriteLine("second highest element in list: {0}", secondLargest);
    }
}
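The approaches above can be generalized to the nth highest distinct value. The following is a sketch over plain integers; the NthHighest helper and the sample salaries are hypothetical, and returning null when there are fewer than n distinct values is a design choice of this example, not something taken from the answers.

```csharp
using System;
using System.Linq;

// Hypothetical sample salaries, including a duplicate of the top value.
var salaries = new[] { 5000, 6000, 5500, 7000, 7000 };

// n is 1-based: NthHighest(1) is the maximum.
// Returns null when there are fewer than n distinct salaries.
int? NthHighest(int n) =>
    salaries.Distinct()
            .OrderByDescending(s => s)
            .Skip(n - 1)
            .Select(s => (int?)s)
            .FirstOrDefault();

Console.WriteLine(NthHighest(2));                          // 6000
Console.WriteLine(NthHighest(3));                          // 5500
Console.WriteLine(NthHighest(10)?.ToString() ?? "none");   // none
```

Applying Distinct before the ordering means duplicates of the top salary do not push the second-highest value out of position.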
