Get duplicate values count using LINQ in C#

I want to get the count of duplicate values from one table. The input values look like this:
SUB -xxx-20160721
SUB -xxx-20160721
SUB -125-20160022
Here the first two rows, (1) and (2), have the same value. If a Name occurs more than once, it should be returned once as a result, with its count (2 in this case).
var numberOfDuplicates = this.UnitOfWork.Repository<Models.SUB>()
.Queryable().GroupBy(x => x.Name)
.Where(x => x.Count() > 1)
.Select(x => x.Count());
The result is
2
2
2
2
2
3
2
2
4
Please guide me on this..

The code below returns a sequence of anonymous objects, each with 2 properties:
Value: the value that is duplicated
Amount: the number of times it occurs
var numberOfDuplicates = this.UnitOfWork.Repository<Models.SUB>()
.Queryable().GroupBy(x => x.Name)
.Where(x => x.Count() > 1)
.Select(x => new { Value = x.Key, Amount = x.Count() } );

The problem with your code is that Select(x => x.Count()) returns only the count of each group, so you cannot tell which Name each count belongs to.
You can return the Name (the grouping key) together with the count by using an anonymous type:
var numberOfDuplicates = this.UnitOfWork.Repository<Models.SUB>()
.Queryable().GroupBy(x => x.Name)
.Where(x => x.Count() > 1)
.Select(x => new { Name = x.Key, Count = x.Count() });
foreach(var dup in numberOfDuplicates)
{
Console.WriteLine($"Name = {dup.Name} ** Count = {dup.Count}");
}
Results:
Name = 1.SUB -xxx-20160721 ** Count = 2
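
For anyone who wants to try the grouping without a database, here is a minimal in-memory sketch. The `names` array is a hypothetical stand-in for the Name column of the SUB table, and `duplicatedNames` / `duplicatedRows` are illustrative variable names, not part of the original code. It shows two numbers the question might be after: how many distinct names are duplicated, and how many rows are involved.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-in for the Name column of the SUB table.
var names = new[] { "SUB -xxx-20160721", "SUB -xxx-20160721", "SUB -125-20160022" };

// One entry per duplicated name, together with how often it occurs.
var duplicates = names
    .GroupBy(n => n)
    .Where(g => g.Count() > 1)
    .Select(g => new { Name = g.Key, Count = g.Count() })
    .ToList();

// Number of distinct names that occur more than once.
int duplicatedNames = duplicates.Count;

// Total number of rows that belong to a duplicated name.
int duplicatedRows = duplicates.Sum(d => d.Count);

Console.WriteLine($"{duplicatedNames} duplicated name(s), {duplicatedRows} row(s) involved");
```

With the three sample values this reports 1 duplicated name and 2 rows. Against EF the same operators should apply to the Queryable() source, but that part is not exercised here.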


Getting the count of most repeated records in Linq

I am working on an application in which I have to store play history of a song in the data table. I have a table named PlayHistory which has four columns.
Id | SoundRecordingId(FK) | UserId(FK) | DateTime
Now I have to implement a query that returns the songs that are trending, i.e. the most played. I have written the following query in SQL Server, which returns data close to what I want:
select COUNT(*) as High,SoundRecordingId
from PlayHistory
where DateTime >= GETDATE()-30
group by SoundRecordingId
Having COUNT(*) > 1
order by SoundRecordingId desc
It returned the following data:
High | SoundRecordingId
2    | 5
2    | 3
This means the songs with Ids 5 and 3 were played the most times, i.e. 2.
How can I implement this with LINQ in C#?
I have done this so far:
DateTime d = DateTime.Now;
var monthBefore = d.AddMonths(-1);
var list =
_db.PlayHistories
.OrderByDescending(x=>x.SoundRecordingId)
.Where(t => t.DateTime >= monthBefore)
.GroupBy(x=>x.SoundRecordingId)
.Take(20)
.ToList();
It returns the whole table as a list of groups of play-history records, but I want just the counts of the most repeated records.
Thanks
There is an overload of the .GroupBy method which will solve your problem.
DateTime d = DateTime.Now;
var monthBefore = d.AddMonths(-1);
var list =
_db.PlayHistories
.OrderByDescending(x=>x.SoundRecordingId)
.Where(t => t.DateTime >= monthBefore)
.GroupBy(x=>x.SoundRecordingId, (key,values) => new {SoundRecordingID=key, High=values.Count()})
.Take(20)
.ToList();
I have simply added the result selector to the GroupBy method call here which does the same transformation you have written in your SQL.
The method overload in question is documented here
To go further into your problem, you will probably want to do another OrderByDescending to get your results in popularity order. To match the SQL statement you also have to filter for only counts > 1.
DateTime d = DateTime.Now;
var monthBefore = d.AddMonths(-1);
var list =
_db.PlayHistories
.Where(t => t.DateTime >= monthBefore)
.GroupBy(x=>x.SoundRecordingId, (key,values) => new {SoundRecordingID=key, High=values.Count()})
.Where(x=>x.High>1)
.OrderByDescending(x=>x.High)
.ToList();
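
To see the shape of that last query without a database, here is a small LINQ-to-Objects sketch. The `playHistories` array and its rows are made up for illustration (a hypothetical stand-in for `_db.PlayHistories`); only the operator chain mirrors the query above.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory rows standing in for the PlayHistory table.
var now = DateTime.Now;
var playHistories = new[]
{
    new { Id = 1, SoundRecordingId = 5, UserId = 1, DateTime = now.AddDays(-1) },
    new { Id = 2, SoundRecordingId = 5, UserId = 2, DateTime = now.AddDays(-2) },
    new { Id = 3, SoundRecordingId = 3, UserId = 1, DateTime = now.AddDays(-3) },
    new { Id = 4, SoundRecordingId = 3, UserId = 2, DateTime = now.AddDays(-4) },
    new { Id = 5, SoundRecordingId = 3, UserId = 3, DateTime = now.AddDays(-5) },
    new { Id = 6, SoundRecordingId = 7, UserId = 1, DateTime = now.AddDays(-6) },
    new { Id = 7, SoundRecordingId = 5, UserId = 3, DateTime = now.AddMonths(-3) }, // too old, filtered out
};

var monthBefore = now.AddMonths(-1);

var trending = playHistories
    .Where(t => t.DateTime >= monthBefore)                 // WHERE DateTime >= ...
    .GroupBy(x => x.SoundRecordingId,                      // GROUP BY SoundRecordingId
             (key, values) => new { SoundRecordingId = key, High = values.Count() })
    .Where(x => x.High > 1)                                // HAVING COUNT(*) > 1
    .OrderByDescending(x => x.High)                        // most played first
    .ToList();

foreach (var t in trending)
    Console.WriteLine($"{t.High} {t.SoundRecordingId}");
```

With this sample data, recording 3 comes first with 3 plays, followed by recording 5 with 2 plays; recording 7 is dropped because it was played only once.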
I like the LINQ query syntax; it's similar to SQL:
var query = from history in _db.PlayHistories
where history.DateTime >= monthBefore
group history by history.SoundRecordingId into historyGroup
where historyGroup.Count() > 1
orderby historyGroup.Key
select new { High = historyGroup.Count(), SoundRecordingId = historyGroup.Key };
var data = query.Take(20).ToList();
You're almost done. Just order your groups by their count and take the first:
var max =
_db.PlayHistories
.OrderByDescending(x=>x.SoundRecordingId)
.Where(t => t.DateTime >= monthBefore)
.GroupBy(x=>x.SoundRecordingId)
.OrderByDescending(x => x.Count())
.First();
This gives you a single group (an IGrouping) whose Key is your SoundRecordingId; calling Count() on it gives the number of its occurrences in your input list.
EDIT: To get all records with that count, choose this instead:
var grouped =
_db.PlayHistories
.OrderByDescending(x => x.SoundRecordingId)
.Where(t => t.DateTime >= monthBefore)
.GroupBy(x => x.SoundRecordingId)
.Select(x => new { Id = x.Key, Count = x.Count() })
.OrderByDescending(x => x.Count)
.ToList();
var maxCount = grouped.First().Count;
var result = grouped.Where(x => x.Count == maxCount);
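
Here is a rough in-memory sketch of the same "all entries tied for the top count" idea. The `plays` array is hypothetical (one SoundRecordingId per play), and the empty-list guard is my addition, since `First()` would throw on an empty result.

```csharp
using System;
using System.Linq;

// Hypothetical SoundRecordingId values, one entry per play.
var plays = new[] { 5, 3, 5, 3, 8 };

var grouped = plays
    .GroupBy(id => id)
    .Select(g => new { Id = g.Key, Count = g.Count() })
    .OrderByDescending(x => x.Count)
    .ToList();

// Guard against an empty list before reading the top entry.
var maxCount = grouped.Count > 0 ? grouped[0].Count : 0;

// Keep every recording that shares the highest count.
var result = grouped.Where(x => x.Count == maxCount).ToList();

foreach (var r in result)
    Console.WriteLine($"Id = {r.Id}, Count = {r.Count}");
```

Here recordings 5 and 3 are both returned, each with a count of 2.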
This gives you what you asked for: your query in LINQ, returning just the most-played SoundRecordingIds.
var list = _db.PlayHistories.Where(x => x.DateTimeProp > DateTime.Now.AddMonths(-1))
.GroupBy(x => x.SoundRecordingId)
.OrderByDescending(g => g.Count())
.ThenBy(g => g.Key)
.Select(g => g.Key).Take(20).ToList();

Determine Duplicate data using LINQ to EF

I have a dataset that I want to group to find duplicate data.
For example, the dataset looks like this:
| id | Number | ContactID |
| 1  | 1234   | 5         |
| 2  | 9873   | 6         |
| 3  | 1234   | 7         |
| 4  | 9873   | 6         |
Now I want to select the rows where a Number occurs more than once, but only if the ContactID is not the same.
So basically it should return:
| Number | Count |
| 1234   | 2     |
Any help using LINQ to EF would be appreciated, thanks.
Update:
All thanks to @DrCopyPaste, who pointed out that I had misunderstood your problem. Here is the correct solution:
var result = from c in db.list
group c by c.Number into g
let count = g.GroupBy(x => x.ContactID).Where(x => x.Count() == 1).Count()
where count != 0
select new
{
Number = g.Key,
Count = count
};
Sample Fiddle.
This query avoids a custom IEqualityComparer; if I remember correctly, those don't play well with EF.
var results = data.GroupBy(number => number.Number)
.Where(number => number.Count() > 1)
.Select(number => new
{
Number = number.Key,
Count = number.GroupBy(contactId => contactId.ContactId).Count(x => x.Count() == 1)
})
.Where(x => x.Count > 0).ToList();
Fiddle
It does an initial GroupBy to get all Numbers that are duplicated. It then selects a new type that contains the number and a second GroupBy that groups by ContactId then counts all groups with exactly one entry. Then it takes all results whose count is greater than zero.
I have not tested it against EF, but the query uses only standard LINQ operators, so EF shouldn't have any issues translating it.
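
For reference, here is the two-level grouping run against the sample rows from the question as plain LINQ to Objects. The anonymous-typed `data` array is a hypothetical stand-in for the EF set, so this only checks the logic, not the SQL translation.

```csharp
using System;
using System.Linq;

// The four sample rows from the question, as in-memory objects.
var data = new[]
{
    new { Id = 1, Number = 1234, ContactId = 5 },
    new { Id = 2, Number = 9873, ContactId = 6 },
    new { Id = 3, Number = 1234, ContactId = 7 },
    new { Id = 4, Number = 9873, ContactId = 6 },
};

var results = data
    .GroupBy(row => row.Number)
    .Where(g => g.Count() > 1)                       // Numbers that occur more than once
    .Select(g => new
    {
        Number = g.Key,
        // Count the ContactIds that appear exactly once for this Number.
        Count = g.GroupBy(row => row.ContactId).Count(c => c.Count() == 1)
    })
    .Where(x => x.Count > 0)
    .ToList();

foreach (var r in results)
    Console.WriteLine($"{r.Number} {r.Count}");
```

This yields the single row 1234 with a count of 2; 9873 is dropped because both of its rows share ContactId 6.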
Another way of doing this (using one level of grouping):
var results = data
.Where(x => data.Any(y => y.Id != x.Id && y.Number == x.Number && y.ContactId != x.ContactId))
.GroupBy(x => x.Number)
.Select(grp => new { Number = grp.Key, Count = grp.Count() })
.ToList();
Fiddle

Filter some unique Data with LINQ and C#

I am very new to C# and MVC.
My problem:
I have a list of IDs:
int[] mylist = {10, 23};
I try to query some data from the DB:
var result = db.tableName.Where(o => mylist.Any(y => y == o.item_ID && o.readed)).ToList();
This is what I get with the query:
item_ID Product_ID readed
277 1232 1
277 1233 1
277 1235 1
280 1235 1
What I need is:
item_ID Product_ID readed
277 1235 1
280 1235 1
If I change Any to All I don't get any results, but there is definitely one item that fits the condition.
I think it is more like running a query with ID 277, then a query with ID 280, then merging the lists and returning only the rows where Product_ID matches.
Any ideas?
I assume that what you need is this:
var temp = db.tableName.Where(o => mylist.Any(y => y == o.item_ID && o.readed))
.ToList();
// Find the Product_ID that appears more than once.
// Note: this assumes at least one Product_ID appears more than once; otherwise First() throws.
var neededProductID = temp.GroupBy(x => x.Product_ID)
.Where(x => x.Count() > 1)
.First()
.Key;
// Filter the result by neededProductID
var result = temp.Where(x => x.Product_ID == neededProductID).ToList();
Also, if more than one Product_ID could appear more than once, consider this:
var neededProductID = temp.GroupBy(x => x.Product_ID)
.Where(x => x.Count() > 1)
.Select(x => x.Key)
.ToList();
var result = temp.Where(x => neededProductID.Any(y => y == x.Product_ID)).ToList();
By the way, you don't need All(). It tells you if all the elements in a collection match a certain condition.
You can use the following
var result = db.tableName.Where(o => mylist.Contains(o.item_ID)
&& o.readed).ToList();
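
Putting the pieces together, here is a small in-memory sketch of "filter by the ID list, then keep only the Product_IDs that occur more than once". The `rows` array is hypothetical, and I set `mylist` to 277 and 280 so that it lines up with the sample output in the question; against EF you would start from `db.tableName` instead.

```csharp
using System;
using System.Linq;

// Hypothetical rows; mylist is chosen to match the item_IDs in the sample output.
int[] mylist = { 277, 280 };
var rows = new[]
{
    new { item_ID = 277, Product_ID = 1232, readed = true },
    new { item_ID = 277, Product_ID = 1233, readed = true },
    new { item_ID = 277, Product_ID = 1235, readed = true },
    new { item_ID = 280, Product_ID = 1235, readed = true },
    new { item_ID = 300, Product_ID = 1235, readed = true }, // item_ID not in mylist
};

// Step 1: keep only read rows whose item_ID is in the list.
var temp = rows.Where(o => mylist.Contains(o.item_ID) && o.readed).ToList();

// Step 2: Product_IDs that occur more than once among those rows.
var neededProductIds = temp
    .GroupBy(x => x.Product_ID)
    .Where(g => g.Count() > 1)
    .Select(g => g.Key)
    .ToList();

// Step 3: keep only the rows with one of those Product_IDs.
var result = temp.Where(x => neededProductIds.Contains(x.Product_ID)).ToList();

foreach (var r in result)
    Console.WriteLine($"{r.item_ID} {r.Product_ID} {r.readed}");
```

With this data the result contains the two rows for Product_ID 1235 (item_IDs 277 and 280).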

Entity Framework incremented index property for groups

I'd like to have an int property that is incremented for each item in a group independently (as described here), because quotes need to be accessible like /person/quote/1..2..3, not /person/quote/1..5..10:
Quote Person Index
Lorem Smith 1
Ipsum Smith 2
Loremi Lewis 1
Ipsumi Lewis 2
Using code in that question with EF:
var query = _data.Quotes
.GroupBy(x => x.Person.Name)
.Select
(
x => x.Select((y, i) => new { y.Text, y.Person.Name, Index = i + 1 })
)
.SelectMany(x => x);
But EF cannot translate it and throws a NotSupportedException:
LINQ to Entities does not recognize the method
System.Collections.Generic.IEnumerable`1[<>f__AnonymousType9`2[System.String,System.Int32]] Select[Quote,<>f__AnonymousType9`2](System.Collections.Generic.IEnumerable`1[App.Models.Quote], System.Func`3[App.Models.Quote,System.Int32,<>f__AnonymousType9`2[System.String,System.Int32]]) and this method cannot be translated into a store expression
Thanks to Dabblernl's comment this code is working:
var query = _data.Quotes
.GroupBy(x => x.Person.Name)
.ToList()
.Select
(
x => x.Select((y, i) => new { y.Person.Name, y.Text, Index = i + 1 })
)
.SelectMany(x => x);
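
The per-group numbering itself can be tried out on in-memory data. This is only a sketch: the `quotes` array is a hypothetical stand-in for `_data.Quotes`, with the person's name flattened into a `Person` string. The indexed `Select((item, index) => ...)` overload is the part that has to run in memory, after the data has been materialized.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory quotes matching the table in the question.
var quotes = new[]
{
    new { Text = "Lorem",  Person = "Smith" },
    new { Text = "Ipsum",  Person = "Smith" },
    new { Text = "Loremi", Person = "Lewis" },
    new { Text = "Ipsumi", Person = "Lewis" },
};

var indexed = quotes
    .GroupBy(q => q.Person)
    // The indexed Select overload restarts at 0 for every group.
    .SelectMany(g => g.Select((q, i) => new { q.Text, q.Person, Index = i + 1 }))
    .ToList();

foreach (var q in indexed)
    Console.WriteLine($"{q.Text} {q.Person} {q.Index}");
```

Each person's quotes are numbered 1, 2 independently. Note that the numbering follows the order in which rows arrive, so with a real database you would probably want an explicit OrderBy to keep the indexes stable.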
Query:
var query = _data.Quotes
.GroupBy(x => x.Person.Name.ToLower())
.Select
(
x => x.Select((y, i) => new { Text = y.Text.ToLower(), Name = y.Person.Name.ToLower(), Index = i + 1 })
)
.SelectMany(x => x);

Find MAX/MIN list item using LINQ?

I have a list with multiple items, each with 3 properties: ID, DATE, COMMENT. The ID field is auto-incremented in the database.
Let's say the list contains:
2,16AUG,CommentMODIFIED
1,15AUG,CommentFIRST
3,18AUG,CommentLASTModified
I want to get a single item: the one with the minimum DATE, but carrying the latest comment. In this case:
1,15AUG,CommentLASTModified
Is there an easy way to do this using LINQ?
var orderedItems = items.OrderBy(x => x.Date).ToList();
var result = orderedItems.First();
result.Comment = orderedItems.Last().Comment;
To get a single item out of the list, you can order the items then take the first one, like this:
var result = items
.OrderByDescending(x => x.Date)
.First();
But First will throw an exception if the items collection is empty. This is a bit safer:
var result = items
.OrderByDescending(x => x.Date)
.FirstOrDefault();
To get the min / max of different columns you can do this:
var result =
new Item {
Id = 1,
Date = items.Min(x => x.Date),
Comment = items.Max(x => x.Comment)
};
But this will require two trips to the database. This might be a bit more efficient:
var result =
(from x in items
group x by 1 into g
select new Item {
Id = 1,
Date = g.Min(x => x.Date),
Comment = g.Max(x => x.Comment)
})
.First();
Or in fluent syntax:
var result = items
.GroupBy(x => 1)
.Select(g => new Item {
Id = 1,
Date = g.Min(x => x.Date),
Comment = g.Max(x => x.Comment)
})
.First();
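
One caveat worth keeping in mind: Max over a string column compares the text itself, so it does not necessarily pick the most recent comment. If "latest comment" means the comment of the row with the latest date, a sketch along these lines may be closer to what the question asks for. The `items` array is hypothetical and the year is made up, since the question only gives day and month.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory items; the year is invented for illustration.
var items = new[]
{
    new { Id = 2, Date = new DateTime(2015, 8, 16), Comment = "CommentMODIFIED" },
    new { Id = 1, Date = new DateTime(2015, 8, 15), Comment = "CommentFIRST" },
    new { Id = 3, Date = new DateTime(2015, 8, 18), Comment = "CommentLASTModified" },
};

// Order once by date; First() and Last() throw if the list is empty.
var ordered = items.OrderBy(x => x.Date).ToList();
var oldest = ordered.First();
var newest = ordered.Last();

// Combine the oldest item's Id and Date with the newest item's Comment.
var result = new { oldest.Id, oldest.Date, newest.Comment };

Console.WriteLine($"{result.Id},{result.Date:yyyy-MM-dd},{result.Comment}");
```

For the sample list this combines Id 1 and its date (15 August) with "CommentLASTModified", which matches the row the question expects.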
