Multiple group by with aggregate in Linq - c#

I currently have this code:
foreach (var newsToPolitician in news.NewsToPoliticians)
{
var politician = newsToPolitician.Politician;
var votes = (from o in db.Scores
where o.IDPolitician == politician.IDPolitician
&& o.IDNews == IDNews
group o by o.IDAtribute
into g
select new{
Atribute= g.Key,
TotalScore= g.Sum(x => x.Score)
}).ToList();
}
It works alright, but I want to avoid making multiple queries to my database in a foreach loop.
My table Scores looks like this:
IDScore | IDNews | IDUser | IDPolitician | IDAtribute | Score
1 40 1010 35 1 1
2 40 1010 35 2 -1
3 40 1002 35 1 1
4 40 1002 35 2 1
5 40 1002 40 1 -1
...
My goal is to aggregate all the scores for all politicians in a news item. A news item can have up to 7 politicians.
Is it expensive to call my database up to seven times in a foreach loop? I know that isn't best practice, so I'm interested in whether there is a way to avoid it in this particular case: make one call to the database and then process the result on the server side.

Update - Following user comments, I have re-jigged this to try to ensure the aggregation happens on the server.
In this case we can group on the server by both IDPolitician and IDAtribute and then pull the groups in locally with ToLookup, like so:
var result = db.Scores.Where(s => s.IDNews == IDNews)
.Where(s => news.NewsToPoliticians
.Select(n => n.Politician.IDPolitician)
.Contains(s.IDPolitician))
.GroupBy(s => new
{
s.IDPolitician,
s.IDAtribute
},
(k, g) => new
{
k.IDPolitician,
k.IDAtribute,
Sum = g.Sum(x => x.Score)
})
.ToLookup(anon => anon.IDPolitician,
anon => new { anon.IDAtribute, anon.Sum });
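To see what shape the lookup has, here is a minimal in-memory sketch of the same composite-key grouping. It assumes the rows from the question's Scores table held in a local array of tuples (a stand-in for db.Scores, so nothing here shows how a provider would translate the query to SQL), and the news ID 40 is hard-coded for illustration.

```csharp
using System;
using System.Linq;

// In-memory stand-in for the Scores table (rows taken from the question).
var scores = new[]
{
    (IDNews: 40, IDUser: 1010, IDPolitician: 35, IDAtribute: 1, Score: 1),
    (IDNews: 40, IDUser: 1010, IDPolitician: 35, IDAtribute: 2, Score: -1),
    (IDNews: 40, IDUser: 1002, IDPolitician: 35, IDAtribute: 1, Score: 1),
    (IDNews: 40, IDUser: 1002, IDPolitician: 35, IDAtribute: 2, Score: 1),
    (IDNews: 40, IDUser: 1002, IDPolitician: 40, IDAtribute: 1, Score: -1),
};

// Group by the (politician, attribute) pair, sum each group,
// then index the sums by politician.
var totals = scores
    .Where(s => s.IDNews == 40)
    .GroupBy(s => new { s.IDPolitician, s.IDAtribute },
             (k, g) => new { k.IDPolitician, k.IDAtribute, Sum = g.Sum(x => x.Score) })
    .ToLookup(t => t.IDPolitician, t => new { t.IDAtribute, t.Sum });

foreach (var politician in totals)
{
    foreach (var t in politician)
    {
        Console.WriteLine($"Politician {politician.Key}, attribute {t.IDAtribute}: {t.Sum}");
    }
}
```

With the sample rows, politician 35 ends up with a total of 2 for attribute 1 and 0 for attribute 2, and politician 40 with -1 for attribute 1. Indexing the lookup with a politician ID that has no scores returns an empty sequence rather than throwing, which is convenient when looping over news.NewsToPoliticians.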
Legacy -
You want to use GroupJoin here, it would be something along the lines of:
var result = news.NewsToPoliticians
.GroupJoin(db.Scores.Where(s => s.IDNews == IDNews),
p => p.IDPolitician,
s => s.IDPolitician,
(k,g) => new
{
PoliticianId = k.IDPolitician,
GroupedVotes = g.GroupBy(s => s.IDAtribute,
(id, group) => new
{
Atribute = id,
TotalScore = group.Sum(x => x.Score)
})
})
.ToList();
However, you are at the mercy of your provider as to how it translates this, so it might still result in multiple queries. To get around this, you could use something like:
var politicianIds = news.NewsToPoliticians.Select(p => p.IDPolitician).ToList();
var result = db.Scores.Where(s => s.IDNews == IDNews)
.Where(s => politicianIds.Contains(s.IDPolitician))
.GroupBy(p => p.IDPolitician,
(k,g) => new
{
PoliticianId = k,
GroupedVotes = g.GroupBy(s => s.IDAtribute,
(id, group) => new
{
Atribute = id,
TotalScore = group.Sum(x => x.Score)
})
})
.ToList();
Which hopefully should be at most two queries (depending on whether NewsToPoliticians is database-dependent). You'll just have to try it out and see.

Use a stored procedure and let the SQL server engine do all the work. You can still call the stored procedure from LINQ, and this will minimize the number of calls to the database.

Related

Simple LINQ query to count by month

This should be a fairly basic LINQ query but I am confused on how to get this in the format I need.
Here's what I have right now, which I realize isn't even close:
var accidents = DataParser
.Parser
.ParseData()
.Where(w => w.EventDate.Year == year)
.GroupBy(g => g.EventDate.Month)
.Count()
.ToList();
The ToList isn't going to work here due to the Count.
I need the data in the simple format of:
Month Number | EventCount
1 | 45
2 | 62
3 | 42
... etc through month 12, preferably as a List<Events> (the DataParser returns Events objects).
You need to use your group like this:
var accidents = DataParser
.Parser
.ParseData()
.Where(w => w.EventDate.Year == year)
.GroupBy(g => g.EventDate.Month)
.Select(g => new { Month = g.Key, Count = g.Count() })
.ToList();
That will give a list of anonymous objects. If you need a list of Events, create an Event object instead: .Select(g => new Event() { Month = g.Key, Count = g.Count() })
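The question asks for rows "through month 12", but a GroupBy only produces groups for months that actually contain events. A minimal sketch of filling in the missing months with zero follows. The eventDates array and the value of year are made-up stand-ins for DataParser.Parser.ParseData() and the question's year variable, and tuples are used in place of the Events class.

```csharp
using System;
using System.Linq;

// Hypothetical event dates standing in for the parsed data.
var eventDates = new[]
{
    new DateTime(2014, 1, 5),
    new DateTime(2014, 1, 20),
    new DateTime(2014, 3, 2),
    new DateTime(2013, 3, 9), // different year, filtered out below
};
int year = 2014;

// Count events per month; only months that have events get an entry.
var countsByMonth = eventDates
    .Where(d => d.Year == year)
    .GroupBy(d => d.Month)
    .ToDictionary(g => g.Key, g => g.Count());

// Emit all twelve months, using 0 where no events were found.
var monthly = Enumerable.Range(1, 12)
    .Select(m => (Month: m, Count: countsByMonth.TryGetValue(m, out var c) ? c : 0))
    .ToList();

foreach (var row in monthly)
{
    Console.WriteLine($"{row.Month} | {row.Count}");
}
```

If the months with no events are not needed, the GroupBy/Select from the answer is enough on its own.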

how to aggregate a linq query by different groupings

How do you perform multiple separate aggregations on different groupings in LINQ?
For example, I have a table:
UNO YOS Ranking Score
123456 1 42 17
645123 3 84 20
I want to perform a set of aggregations on this data, both grouped and ungrouped, like:
var grouped = table.GroupBy(x => x.Score)
.Select(x => new
{
Score = x.Key.ToString(),
OverallAverageRank = x.Average(y => y.Ranking),
Year1RankAvg = x.Where(y => y.YOS == 1).Average(y => y.Ranking),
Year2RankAvg = x.Where(y => y.YOS == 2).Average(y => y.Ranking)
//...etc
});
I also want to perform different aggregations (standard deviation) on the same slices and whole-set data.
I can't figure out how to both group and not group by YOS at the same time. While this compiles fine, at runtime I get "Sequence contains no elements" if any of the YOS averages are included.
As with anything in programming, when you have a sequence of similar items, use a collection. In this case I left it as an IEnumerable, but you could make it a List, or a Dictionary keyed by YOS, if desired.
var ans = table.GroupBy(t => t.Score)
.Select(tg => new {
Score = tg.Key,
OverallAverageRank = tg.Average(t => t.Ranking),
YearRankAvgs = tg.GroupBy(t => t.YOS).Select(tyg => new { YOS = tyg.Key, RankAvg = tyg.Average(t => t.Ranking) })
});
If you need the range of years from 1 to max (or some other number) filled in, you can modify the answer:
var ans2 = ans.Select(soryr => new {
soryr.Score,
soryr.OverallAverageRank,
YearRankDict = soryr.YearRankAvgs.ToDictionary(yr => yr.YOS),
YearMax = soryr.YearRankAvgs.Max(yr => yr.YOS)
})
.Select(soryr => new {
Score = soryr.Score,
OverallAverageRank = soryr.OverallAverageRank,
YearRankAvgs = Enumerable.Range(1, soryr.YearMax).Select(yos => soryr.YearRankDict.ContainsKey(yos) ? soryr.YearRankDict[yos] : new { YOS = yos, RankAvg = 0.0 }).ToList()
});
If you preferred, you could modify the original ans to return RankAvg as double? and put null in place of 0.0 when adding missing years.
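As for the "Sequence contains no elements" error in the question: in LINQ to Objects, Average over a non-nullable selector throws when the filtered sequence is empty, while the nullable overload returns null instead. A minimal sketch of keeping the original per-year properties with that change, assuming the table is an in-memory collection (a tuple array stands in for it here):

```csharp
using System;
using System.Linq;

// In-memory stand-in for the table from the question.
var table = new[]
{
    (UNO: 123456, YOS: 1, Ranking: 42, Score: 17),
    (UNO: 645123, YOS: 3, Ranking: 84, Score: 20),
};

var grouped = table
    .GroupBy(x => x.Score)
    .Select(g => new
    {
        Score = g.Key,
        OverallAverageRank = g.Average(y => y.Ranking),
        // The cast to double? selects the nullable overload of Average,
        // which yields null for an empty sequence instead of throwing.
        Year1RankAvg = g.Where(y => y.YOS == 1).Average(y => (double?)y.Ranking),
        Year2RankAvg = g.Where(y => y.YOS == 2).Average(y => (double?)y.Ranking)
    })
    .ToList();

foreach (var row in grouped)
{
    string year1 = row.Year1RankAvg.HasValue ? row.Year1RankAvg.Value.ToString() : "n/a";
    string year2 = row.Year2RankAvg.HasValue ? row.Year2RankAvg.Value.ToString() : "n/a";
    Console.WriteLine($"Score {row.Score}: overall {row.OverallAverageRank}, year 1 {year1}, year 2 {year2}");
}
```

This keeps one property per year, so it only suits a small, fixed set of years; the nested GroupBy above scales better when the set of years is open-ended.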

Determine Duplicate data using LINQ to EF

I have a dataset that I want to group in order to detect duplicate data.
For example, I have a dataset that looks like this:
id | Number | ContactID
1 | 1234 | 5
2 | 9873 | 6
3 | 1234 | 7
4 | 9873 | 6
Now I want to select the data that has more than one occurrence of Number, but only if the ContactID is not the same.
So basically return:
Number | Count
1234 | 2
Any help would be appreciated using LINQ to EF, thanks.
Update:
All thanks to @DrCopyPaste, who pointed out that I had misunderstood your problem. Here is the correct solution:
var result = from c in db.list
group c by c.Number into g
let count = g.GroupBy(x => x.ContactID).Where(x => x.Count() == 1).Count()
where count != 0
select new
{
Number = g.Key,
Count = count
};
Sample Fiddle.
This query avoids writing a custom IEqualityComparer since, if I remember correctly, they don't play well with EF.
var results = data.GroupBy(number => number.Number)
.Where(number => number.Count() > 1)
.Select(number => new
{
Number = number.Key,
Count = number.GroupBy(contactId => contactId.ContactId).Count(x => x.Count() == 1)
})
.Where(x => x.Count > 0).ToList();
Fiddle
It does an initial GroupBy to get all Numbers that are duplicated. It then selects a new type that contains the number and a second GroupBy that groups by ContactId then counts all groups with exactly one entry. Then it takes all results whose count is greater than zero.
I have not tested it against EF, but the query uses only standard LINQ operators, so EF shouldn't have any issues translating it.
Another way of doing this (using one level of grouping):
var results = data
.Where(x => data.Any(y => y.Id != x.Id && y.Number == x.Number && y.ContactId != x.ContactId))
.GroupBy(x => x.Number)
.Select(grp => new { Number = grp.Key, Count = grp.Count() })
.ToList();
Fiddle
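For comparison, here is a sketch of a slightly different reading of the requirement: count the distinct ContactIDs per Number and keep the Numbers that have more than one. This is a variation, not any of the queries above, and it runs against an in-memory tuple array holding the question's sample rows, so it says nothing about how EF would translate it.

```csharp
using System;
using System.Linq;

// Sample rows from the question, held in memory.
var data = new[]
{
    (Id: 1, Number: 1234, ContactId: 5),
    (Id: 2, Number: 9873, ContactId: 6),
    (Id: 3, Number: 1234, ContactId: 7),
    (Id: 4, Number: 9873, ContactId: 6),
};

// For each Number, count how many different contacts use it,
// and keep only the Numbers shared by more than one contact.
var results = data
    .GroupBy(d => d.Number)
    .Select(g => new
    {
        Number = g.Key,
        Count = g.Select(d => d.ContactId).Distinct().Count()
    })
    .Where(r => r.Count > 1)
    .ToList();

foreach (var r in results)
{
    Console.WriteLine($"{r.Number} | {r.Count}");
}
```

On the sample data this keeps only 1234 with a count of 2, since 9873 appears twice but always with ContactID 6. It differs from the queries above when a Number is shared by contacts that each repeat it: those queries count only contacts with exactly one row, while this one counts every distinct contact.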

Filter some unique Data with LINQ and C#

I am very new to C# and MVC.
My problem:
I have a list of IDs:
int[] mylist = {10, 23};
I try to query some data from the DB:
var result = db.tableName.Where(o => mylist.Any(y => y == o.item_ID && o.readed)).ToList();
This is what I get with the query:
item_ID Product_ID readed
277 1232 1
277 1233 1
277 1235 1
280 1235 1
What I need is:
item_ID Product_ID readed
277 1235 1
280 1235 1
If I change "Any" to "All" I don't get any results, but there is definitely one item that fits the condition.
I think it is more like making a query with ID 277, then a query with ID 280, and then merging the lists and returning only the rows where Product_ID matches.
Any ideas?
I assume that what you need is this:
var temp = db.tableName.Where(o => mylist.Any(y => y == o.item_ID && o.readed))
.ToList();
// Find the Product_ID that appears more than once.
// Note: this assumes at least one Product_ID appears more than once; otherwise First() throws.
var neededProductID = temp.GroupBy(x => x.Product_ID)
.Where(x => x.Count() > 1)
.First()
.Key;
// Filter the result by neededProductID
var result = temp.Where(x => x.Product_ID == neededProductID).ToList();
Also, if there could be more than one Product_ID that appears more than once, you can consider this:
var neededProductID = temp.GroupBy(x => x.Product_ID)
.Where(x => x.Count() > 1)
.Select(x => x.Key)
.ToList();
var result = temp.Where(x => neededProductID.Any(y => y == x.Product_ID)).ToList();
By the way, you don't need All(). It tells you if all the elements in a collection match a certain condition.
You can use the following:
var result = db.tableName.Where(o => mylist.Contains(o.item_ID)
&& o.readed).ToList();

C# Retrieve Common Records by LINQ

I have a database with the table that keeps user_ids and tag_ids. I want to write a function which takes two user_ids and returns the tag_ids that both users have in common.
These are the sample rows from the database:
User_id Tag_id
1 100
1 101
2 100
3 100
3 101
3 102
What I want from my function is that when I call my function like getCommonTagIDs(1, 3), it should return (100,101). What I did so far is that I keep the rows which are related to user_id in two different lists and then using for loops, return the common tag_ids.
using (TwitterDataContext database = TwitterDataContext.CreateTwitterDataContextWithNoLock())
{
IEnumerable<Usr_Tag> tags_1 = database.Usr_Tags.Where(u => u.User_id == userID1).ToList();
IEnumerable<Usr_Tag> tags_2 = database.Usr_Tags.Where(u => u.User_id == userID2).ToList();
foreach (var x in tags_1)
{
foreach (var y in tags_2) {
if (x.Tag_id == y.Tag_id) {
var a =database.Hashtags.Where(u => u.Tag_id==x.Tag_id).SingleOrDefault();
Console.WriteLine(a.Tag_val);
}
}
}
}
What I want to ask is that, instead of getting all rows from database and searching for the common tag_ids in the function, I want to get the common tag_ids directly from database with LINQ by making the calculations on the database side. I would be grateful if you could help me.
This is the SQL that I wrote:
SELECT [Tag_id]
FROM [BitirME].[dbo].[User_Tag]
WHERE USER_ID = '1' AND Tag_id IN (
SELECT [Tag_id]
FROM [BitirME].[dbo].[User_Tag]
where USER_ID = '3')
What you want is the "Intersection" of those two sets:
var commonTags = database.Usr_Tags.Where(u => u.User_id == userID1).Select(u => u.Tag_id)
.Intersect(database.Usr_Tags.Where(u => u.User_id == userID2).Select(u => u.Tag_id));
And voila, you're done.
Or, to clean it up a bit:
public static IQueryable<int> GetUserTags(int userId)
{
return database.Usr_Tags
.Where(u => u.User_id == userId)
.Select(u => u.Tag_id);
}
var commonTags = GetUserTags(userID1).Intersect(GetUserTags(userID2));
Here's one way to do it:
int[] users = new int[] {1,3}; // for testing
database.Usr_Tags.Where(t => users.Contains(t.User_id))
.GroupBy(t => t.Tag_id)
.Where(g => users.All(u => g.Any(gg=>gg.User_id == u))) // all tags where all required users are tagged
.Select(g => g.Key);
One benefit of this one is it can be used for any number of users (not just 2).
If I got it right, a query like this may be what you need:
var q = from t in database.Usr_Tags
//all Usr_Tags for userID1
where t.User_id == userID1 &&
//and there is a Usr_Tag for userID2 with the same Tag_id
database.Usr_Tags.Any(t2 => t2.User_id == userID2 && t2.Tag_id == t.Tag_id)
select t.Tag_id;
var commonTags = q.ToList();
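To check the Intersect approach against the sample rows from the question, here is a minimal in-memory sketch. The usrTags array is a stand-in for database.Usr_Tags, and GetUserTags is a local function mirroring the helper above; against a real data context the same shape would be built on IQueryable so the work stays on the database side.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample rows from the question, held in memory.
var usrTags = new[]
{
    (User_id: 1, Tag_id: 100),
    (User_id: 1, Tag_id: 101),
    (User_id: 2, Tag_id: 100),
    (User_id: 3, Tag_id: 100),
    (User_id: 3, Tag_id: 101),
    (User_id: 3, Tag_id: 102),
};

// Tag IDs belonging to one user.
IEnumerable<int> GetUserTags(int userId) =>
    usrTags.Where(u => u.User_id == userId).Select(u => u.Tag_id);

// Tags that both user 1 and user 3 have.
var commonTags = GetUserTags(1).Intersect(GetUserTags(3)).ToList();

Console.WriteLine(string.Join(",", commonTags)); // prints 100,101
```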
