Determine Duplicate data using LINQ to EF - c#

I have a dataset that I want to group in order to find duplicate data.
For example, the dataset looks like this:
id | Number | ContactID
1  | 1234   | 5
2  | 9873   | 6
3  | 1234   | 7
4  | 9873   | 6
Now I want to select the rows whose Number occurs more than once, but only if the ContactID is not the same.
So basically return
Number | Count
1234   | 2
Any help would be appreciated using LINQ to EF, thanks.

Update:
Thanks to #DrCopyPaste, who pointed out that I had misunderstood your problem. Here is the corrected solution:
var result = from c in db.list
             group c by c.Number into g
             let count = g.GroupBy(x => x.ContactID).Where(x => x.Count() == 1).Count()
             where count != 0
             select new
             {
                 Number = g.Key,
                 Count = count
             };

This query avoids writing a custom IEqualityComparer because, if I remember correctly, they don't play well with EF.
var results = data.GroupBy(number => number.Number)
    .Where(number => number.Count() > 1)
    .Select(number => new
    {
        Number = number.Key,
        Count = number.GroupBy(contactId => contactId.ContactId).Count(x => x.Count() == 1)
    })
    .Where(x => x.Count > 0).ToList();
It does an initial GroupBy to get all Numbers that are duplicated. It then projects each group to an anonymous type that contains the Number and a Count; the Count comes from a second GroupBy on ContactId that counts the groups with exactly one entry. Finally it keeps only the results whose Count is greater than zero.
I have not tested it against EF, but the query uses only standard LINQ operators, so EF shouldn't have any issues translating it.
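To see the two-level grouping at work, here is a minimal in-memory sketch (LINQ to Objects, not EF) that runs the same query shape over the four sample rows from the question. The tuple array `data` is a stand-in for the real table and is an assumption made for illustration; whether EF translates the query to SQL still has to be checked against your provider.

```csharp
using System;
using System.Linq;

// Sample rows from the question as (Id, Number, ContactId) tuples.
// This in-memory array is an assumption standing in for the EF set.
var data = new[]
{
    (Id: 1, Number: 1234, ContactId: 5),
    (Id: 2, Number: 9873, ContactId: 6),
    (Id: 3, Number: 1234, ContactId: 7),
    (Id: 4, Number: 9873, ContactId: 6),
};

var results = data
    .GroupBy(row => row.Number)                 // one group per Number
    .Where(g => g.Count() > 1)                  // keep Numbers that occur more than once
    .Select(g => new
    {
        Number = g.Key,
        // count the ContactIds that appear exactly once for this Number
        Count = g.GroupBy(row => row.ContactId).Count(c => c.Count() == 1)
    })
    .Where(x => x.Count > 0)
    .ToList();

foreach (var r in results)
    Console.WriteLine($"{r.Number} {r.Count}"); // prints "1234 2"
```

Number 9873 is dropped because both of its rows share ContactId 6, so no ContactId occurs exactly once in that group.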

Another way of doing this (using one level of grouping):
var results = data
    .Where(x => data.Any(y => y.Id != x.Id && y.Number == x.Number && y.ContactId != x.ContactId))
    .GroupBy(x => x.Number)
    .Select(grp => new { Number = grp.Key, Count = grp.Count() })
    .ToList();

Related

Getting the count of most repeated records in Linq

I am working on an application in which I have to store play history of a song in the data table. I have a table named PlayHistory which has four columns.
Id | SoundRecordingId(FK) | UserId(FK) | DateTime
Now I have to implement a query that returns the songs that are trending, i.e. the ones played most often. I have written the following query in SQL Server, which returns data close to what I want.
select COUNT(*) as High,SoundRecordingId
from PlayHistory
where DateTime >= GETDATE()-30
group by SoundRecordingId
Having COUNT(*) > 1
order by SoundRecordingId desc
It returned me following data:
High SoundRecordingId
2 5
2 3
This means the songs with Ids 5 and 3 were played the most, i.e. 2 times each.
How can I implement this with LINQ in C#?
I have done this so far:
DateTime d = DateTime.Now;
var monthBefore = d.AddMonths(-1);
var list =
_db.PlayHistories
.OrderByDescending(x=>x.SoundRecordingId)
.Where(t => t.DateTime >= monthBefore)
.GroupBy(x=>x.SoundRecordingId)
.Take(20)
.ToList();
It returns the whole table grouped by SoundRecordingId, but I want just the counts of the most repeated records.
Thanks
There is an overload of the .GroupBy method which will solve your problem.
DateTime d = DateTime.Now;
var monthBefore = d.AddMonths(-1);
var list =
    _db.PlayHistories
        .OrderByDescending(x => x.SoundRecordingId)
        .Where(t => t.DateTime >= monthBefore)
        .GroupBy(x => x.SoundRecordingId, (key, values) => new { SoundRecordingID = key, High = values.Count() })
        .Take(20)
        .ToList();
I have simply added a result selector to the GroupBy call, which performs the same transformation as the SELECT list in your SQL.
The overload in question is the one that takes a key selector and a result selector; it is covered in the GroupBy documentation.
To go further into your problem, you will probably want to do another OrderByDescending to get your results in popularity order. To match the SQL statement you also have to filter for only counts > 1.
DateTime d = DateTime.Now;
var monthBefore = d.AddMonths(-1);
var list =
    _db.PlayHistories
        .Where(t => t.DateTime >= monthBefore)
        .GroupBy(x => x.SoundRecordingId, (key, values) => new { SoundRecordingID = key, High = values.Count() })
        .Where(x => x.High > 1)
        .OrderByDescending(x => x.High)
        .ToList();
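The following is a minimal in-memory sketch of that last query, run with LINQ to Objects instead of EF. The play history rows, the fixed `now` value, and the tuple element name `PlayedAt` (standing in for the DateTime column) are all assumptions made so the example is deterministic and self-contained.

```csharp
using System;
using System.Linq;

// A fixed "now" keeps the sketch deterministic; real code would use DateTime.Now.
var now = new DateTime(2024, 6, 30);
var monthBefore = now.AddMonths(-1);

// Hypothetical play history as (SoundRecordingId, PlayedAt) tuples.
var playHistories = new[]
{
    (SoundRecordingId: 5, PlayedAt: now.AddDays(-1)),
    (SoundRecordingId: 5, PlayedAt: now.AddDays(-2)),
    (SoundRecordingId: 3, PlayedAt: now.AddDays(-3)),
    (SoundRecordingId: 3, PlayedAt: now.AddDays(-4)),
    (SoundRecordingId: 3, PlayedAt: now.AddDays(-5)),
    (SoundRecordingId: 7, PlayedAt: now.AddDays(-6)),  // played once: removed by High > 1
    (SoundRecordingId: 5, PlayedAt: now.AddDays(-90)), // older than a month: removed by the date filter
};

var trending = playHistories
    .Where(t => t.PlayedAt >= monthBefore)
    // result-selector overload: project each group straight to (id, count)
    .GroupBy(t => t.SoundRecordingId,
             (key, values) => new { SoundRecordingId = key, High = values.Count() })
    .Where(x => x.High > 1)          // same role as HAVING COUNT(*) > 1
    .OrderByDescending(x => x.High)  // most played first
    .ToList();

foreach (var song in trending)
    Console.WriteLine($"{song.High} {song.SoundRecordingId}"); // prints "3 3" then "2 5"
```

Filtering on the date before grouping keeps old plays out of the counts, and filtering on `High` after grouping plays the role of the SQL HAVING clause.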
I like the LINQ query syntax because it is similar to SQL:
var query = from history in _db.PlayHistories
            where history.DateTime >= monthBefore
            group history by history.SoundRecordingId into historyGroup
            where historyGroup.Count() > 1
            orderby historyGroup.Key
            select new { High = historyGroup.Count(), SoundRecordingId = historyGroup.Key };
var data = query.Take(20).ToList();
You're almost done. Just order your groups by their count and take the first:
var max =
    _db.PlayHistories
        .OrderByDescending(x => x.SoundRecordingId)
        .Where(t => t.DateTime >= monthBefore)
        .GroupBy(x => x.SoundRecordingId)
        .OrderByDescending(x => x.Count())
        .First();
This gives you a single group (an IGrouping) whose Key is your SoundRecordingId; calling Count() on it gives the number of its occurrences in your input list.
EDIT: To get all records with that count, choose this instead:
var grouped =
    _db.PlayHistories
        .OrderByDescending(x => x.SoundRecordingId)
        .Where(t => t.DateTime >= monthBefore)
        .GroupBy(x => x.SoundRecordingId)
        .Select(x => new { Id = x.Key, Count = x.Count() })
        .OrderByDescending(x => x.Count)
        .ToList();
var maxCount = grouped.First().Count;
var result = grouped.Where(x => x.Count == maxCount);
This gives you what you asked for: your query in LINQ, returning just the play counts.
var list = _db.PlayHistories.Where(x => x.DateTimeProp > (DateTime.Now).AddMonths(-1))
    .GroupBy(x => x.SoundRecordingId)
    .OrderByDescending(g => g.Count()).ThenBy(g => g.Key)
    .Select(g => g.Count()).Take(20).ToList();

LINQ: How to get the Max Id with a group by clause?

I am looking for a way in LINQ to get a max Id record by using 'Group By' clause
Consider the following Sample data
Table: ProcessAud
ProcessSeq ProjectSeq ProjectValue Active
11 1 50000 Y
12 1 10000 Y
13 2 70000 Y
14 2 90000 Y
From this I want to get two records as a list: the second and fourth
records, i.e. ProcessSeq 12 and 14. I tried the following:
var ProcessAudList = ProcessAudService.FilterBy(x => x.Active == "Y"
    ).GroupBy(x => x.ProjectSeq).Max().ToList();
It is not working properly. How can I do this in LINQ? Any help is appreciated.
You want to pick top record from each group.
var ProcessAudList = ProcessAudService.Where(x => x.Active == "Y")
    .GroupBy(x => x.ProjectSeq, (key, g) => g.OrderByDescending(e => e.ProcessSeq).First());
When you use the GroupBy extension method, it returns IGrouping instances, and you should query each IGrouping like this:
var ProcessAudList = collection
    .Where(x => x.Active == "Y")
    .GroupBy(x => x.ProjectSeq)
    .Select(x => x.OrderByDescending(a => a.ProcessSeq).FirstOrDefault())
    .ToList();
Hope this helps
You're most of the way there, but Max is the wrong term to use.
Each IGrouping is an IEnumerable (or IQueryable) sequence of its own, so you can use OrderBy and First clauses to get the answer you need:
var ProcessAudList = ProcessAudService
.FilterBy(x => x.Active == "Y")
.GroupBy(x => x.ProjectSeq)
.Select(grp => grp.OrderByDescending(x => x.ProcessSeq).First())
.ToList();
The Select clause processes each group, orders that group's rows descending by ProcessSeq and selects the first one. For the data you provided this will select the rows with ProcessSeq equal to 12 and 14.
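As a quick check of that claim, here is a minimal in-memory sketch over the sample table from the question. The tuple array `processAud` replaces `ProcessAudService` and uses plain `Where` in place of `FilterBy`; both substitutions are assumptions made to keep the example self-contained.

```csharp
using System;
using System.Linq;

// Sample rows from the question as tuples; a stand-in for the ProcessAud table.
var processAud = new[]
{
    (ProcessSeq: 11, ProjectSeq: 1, ProjectValue: 50000, Active: "Y"),
    (ProcessSeq: 12, ProjectSeq: 1, ProjectValue: 10000, Active: "Y"),
    (ProcessSeq: 13, ProjectSeq: 2, ProjectValue: 70000, Active: "Y"),
    (ProcessSeq: 14, ProjectSeq: 2, ProjectValue: 90000, Active: "Y"),
};

var processAudList = processAud
    .Where(x => x.Active == "Y")
    .GroupBy(x => x.ProjectSeq)
    // within each project, keep the row with the highest ProcessSeq
    .Select(grp => grp.OrderByDescending(x => x.ProcessSeq).First())
    .ToList();

foreach (var row in processAudList)
    Console.WriteLine($"{row.ProcessSeq} {row.ProjectSeq} {row.ProjectValue}");
// prints "12 1 10000" then "14 2 90000"
```

Note that ordering by ProjectValue instead would pick ProcessSeq 11 for project 1, so the ordering key has to be the Id you want the maximum of.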
With this code you can get the maximum Id of every group and iterate over them with foreach:
var res = from pa in ProcessAud
          group pa by pa.ProjectSeq into Cm1
          select new
          {
              _max = Cm1.Max(x => x.ProcessSeq)
          };
foreach (var item in res)
{
    // item._max holds the largest ProcessSeq in the group
}

Filter some unique Data with LINQ and C#

I am very new to C# and MVC.
My problem:
I have a list of IDs
int[] mylist = {10, 23};
I try to query some data from DB
var result = db.tableName.Where(o => mylist.Any(y => y == o.item_ID && o.readed)).ToList();
This is what I get with the query:
item_ID Product_ID readed
277 1232 1
277 1233 1
277 1235 1
280 1235 1
What I need is:
item_ID Product_ID readed
277 1235 1
280 1235 1
If I change Any to All I don't get any results, although there is definitely one item that fits the condition.
I think what I need is more like running one query for ID 277 and another for 280, then merging the lists and returning only the rows where Product_ID matches.
Any ideas?
I assume that what you need is this:
var temp = db.tableName.Where(o => mylist.Any(y => y == o.item_ID && o.readed))
    .ToList();
// Find the Product_ID that appears more than once.
// Note: this assumes that at least one Product_ID appears more than once.
var neededProductID = temp.GroupBy(x => x.Product_ID)
    .Where(x => x.Count() > 1)
    .First()
    .Key;
// Filter the result by neededProductID
var result = temp.Where(x => x.Product_ID == neededProductID).ToList();
Also, if there could be more than one Product_ID that appears more than once, you can consider this:
var neededProductID = temp.GroupBy(x => x.Product_ID)
    .Where(x => x.Count() > 1)
    .Select(x => x.Key)
    .ToList();
var result = temp.Where(x => neededProductID.Any(y => y == x.Product_ID)).ToList();
By the way, you don't need All(). It tells you if all the elements in a collection match a certain condition.
You can use the following
var result = db.tableName.Where(o => mylist.Contains(o.item_ID)
    && o.readed).ToList();
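Putting the answers above together, here is a minimal in-memory sketch that filters with Contains and then keeps only the Product_IDs that occur more than once. It assumes `mylist` holds the item IDs 277 and 280 seen in the question's output (the question's own list shows 10 and 23), and the row with item_ID 300 is a hypothetical extra added to show the filter working.

```csharp
using System;
using System.Linq;

// Assumption: the wanted item IDs are the ones visible in the sample output.
var mylist = new[] { 277, 280 };

// Sample rows as (item_ID, Product_ID, readed) tuples; a stand-in for db.tableName.
var rows = new[]
{
    (item_ID: 277, Product_ID: 1232, readed: true),
    (item_ID: 277, Product_ID: 1233, readed: true),
    (item_ID: 277, Product_ID: 1235, readed: true),
    (item_ID: 280, Product_ID: 1235, readed: true),
    (item_ID: 300, Product_ID: 1235, readed: true), // hypothetical row, not in mylist
};

// Step 1: keep read rows whose item_ID is in the list.
var temp = rows.Where(o => mylist.Contains(o.item_ID) && o.readed).ToList();

// Step 2: collect the Product_IDs that appear more than once among those rows.
var neededProductIDs = temp
    .GroupBy(x => x.Product_ID)
    .Where(g => g.Count() > 1)
    .Select(g => g.Key)
    .ToList();

// Step 3: keep only the rows with one of those Product_IDs.
var result = temp.Where(x => neededProductIDs.Contains(x.Product_ID)).ToList();

foreach (var r in result)
    Console.WriteLine($"{r.item_ID} {r.Product_ID}"); // prints "277 1235" then "280 1235"
```

The count check treats any Product_ID seen twice as shared, which only holds if an item_ID never lists the same Product_ID twice.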

Entity framework Group by query

Let's say I have a table with the following data:
Id | title  | image | page
--------------------------
1  | test   | a.jpg | 1
2  | test   | b.jpg | 2
3  | test 1 | c.jpg | 1
4  | test 1 | d.jpg | 2
How would I go about grouping the data by title and retrieving the first results. Like so:
Id | title  | image | page
--------------------------
1  | test   | a.jpg | 1
3  | test 1 | c.jpg | 1
What I have tried so far but without luck is:
var result = _db.Records.Select(r => new Records
{
Id = r.Id,
title = r.title,
image = r.image,
page = r.page
}).OrderByDescending(x => x.Id)
.GroupBy(x => x.title)
.Select(x => x.First()).AsQueryable();
Am I going about this the right way? Any help appreciated.
Why order by and why return AsQueryable? This is what I have done. If you must return a queryable, appending AsQueryable() will still work.
Records.GroupBy (r => r.Title)
.Select (r =>r.First ())
The first Select doesn't seem to do anything. You already have Records and you're selecting Records.
The second Select is also not needed. Instead of calling x => x.First() why not call First() ? Or am I missing something?
var result = _db.Records
//.OrderByDescending(x => x.Id)
.GroupBy(x => x.title)
.First();
Edit: the OrderBy is doing work that is negated (somewhat) by the GroupBy
Edit 2: The above will only get the first group. So the x => x.First() was correct:
var result = _db.Records
.GroupBy(x => x.title)
.Select(group => group.First());
var results = _db.Records.GroupBy(
    i => i.title,
    (key, group) => group.First()
);
Hope this helps
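For reference, a minimal in-memory sketch of the group-then-first pattern over the sample table. The tuple array `records` stands in for `_db.Records`, and the explicit `OrderBy` inside each group is an assumption that "first" means the lowest Id; without an ordering, which row a database returns first is not guaranteed.

```csharp
using System;
using System.Linq;

// Sample rows from the question as tuples; a stand-in for _db.Records.
var records = new[]
{
    (Id: 1, title: "test",   image: "a.jpg", page: 1),
    (Id: 2, title: "test",   image: "b.jpg", page: 2),
    (Id: 3, title: "test 1", image: "c.jpg", page: 1),
    (Id: 4, title: "test 1", image: "d.jpg", page: 2),
};

var firstPerTitle = records
    .GroupBy(r => r.title)
    // make "first" explicit: the row with the lowest Id in each title group
    .Select(g => g.OrderBy(r => r.Id).First())
    .ToList();

foreach (var r in firstPerTitle)
    Console.WriteLine($"{r.Id} {r.title} {r.image} {r.page}");
// prints "1 test a.jpg 1" then "3 test 1 c.jpg 1"
```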

Selecting distinct with certain condition

I have values in a list:
List1
ID groupID testNo
1 123 0
2 653 1
3 776 6
4 653 0
I want to write a T-SQL, LINQ, or lambda expression so that whenever there is a duplicate groupID, it picks the record whose testNo != 0.
I am using this expression but it is basically not giving the results I want.
var list2 = list1.GroupBy(x => x.testNo).Select(y => y.First());
How can I get the results so groupID 653 is chosen with testNo 1 with rest of the records?
There are a few approaches you could take. I don't know if any of them are foolproof. One would be to order by testNo descending, so that items with a non-zero testNo come before those with 0, and then take the first item of each groupID.
var list2 = list1.OrderByDescending(y => y.testNo).GroupBy(x => x.groupID).Select(z => z.FirstOrDefault());
If you can guarantee that testNo = 0 only occurs for dupes then the easiest way is just to use a Where.
var list2 = list1.Where(x => x.testNo > 0).ToList();
This should give you the desired results:
var list2 = list1.GroupBy(x => x.groupID)
    .Select(x => list1.Single(item => item.groupID == x.Key
                                      && item.testNo == x.Max(y => y.testNo)))
    .ToList();
Basically, group by groupID and then select each item from the original list1 that matches the distinct groupID and has the max value for testNo for that groupID.
var result = list.GroupBy(x => x.groupID).Select(g => g.Count() == 1 ? g.First() : g.First(x => x.testNo != 0));
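Here is a minimal in-memory sketch of the same idea over the sample list. It is a variant rather than any of the answers verbatim: inside each groupID it sorts rows with a non-zero testNo to the front and takes the first, which is an assumption about the intended tie-breaking and avoids an exception when every row in a group has testNo 0.

```csharp
using System;
using System.Linq;

// Sample rows from the question as (ID, groupID, testNo) tuples.
var list1 = new[]
{
    (ID: 1, groupID: 123, testNo: 0),
    (ID: 2, groupID: 653, testNo: 1),
    (ID: 3, groupID: 776, testNo: 6),
    (ID: 4, groupID: 653, testNo: 0),
};

var list2 = list1
    .GroupBy(x => x.groupID)
    // true sorts after false, so descending order puts non-zero testNo rows first;
    // a group with only zero rows still yields its first row
    .Select(g => g.OrderByDescending(x => x.testNo != 0).First())
    .ToList();

foreach (var item in list2)
    Console.WriteLine($"{item.ID} {item.groupID} {item.testNo}");
// prints "1 123 0", "2 653 1", "3 776 6"
```

For groupID 653 the row with testNo 1 wins over the row with testNo 0, while groupID 123 keeps its only row even though its testNo is 0.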
