I tried to use LINQ to remove duplicate items:
var MyItems = (from b in this.result
select new Item{ Name = b.Name, ID = b.ID }).Distinct();
Then I checked the result, and the duplicate items had not been removed.
How can I resolve this problem?
By default, Distinct() uses EqualityComparer<T>.Default, which has the following rules:
The default equality comparer, Default, is used to compare values of the types that implement the IEquatable generic interface. To compare a custom data type, you need to implement this interface and provide your own GetHashCode and Equals methods for the type.
In your case, this means Item needs to implement IEquatable<Item>.
Alternatively, you can use the overload of Distinct which takes an IEqualityComparer<T> directly.
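As a rough illustration of the first option, here is a minimal sketch of Item implementing IEquatable<Item>. The shape of Item (an int ID and a string Name) is an assumption, since the question does not show the class; the Demo class is only there to exercise it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Item shape: an int ID and a string Name (assumed, not shown in the question).
public class Item : IEquatable<Item>
{
    public int ID { get; set; }
    public string Name { get; set; }

    // Two items are equal when both ID and Name match.
    public bool Equals(Item other)
    {
        if (other is null) return false;
        return ID == other.ID && Name == other.Name;
    }

    // Keep the untyped Equals consistent with the typed one.
    public override bool Equals(object obj) => Equals(obj as Item);

    // Equal items must produce equal hash codes.
    public override int GetHashCode()
    {
        int nameHash = Name == null ? 0 : Name.GetHashCode();
        return nameHash ^ ID.GetHashCode();
    }
}

public static class Demo
{
    public static void Main()
    {
        var items = new List<Item>
        {
            new Item { ID = 1, Name = "a" },
            new Item { ID = 1, Name = "a" },
            new Item { ID = 2, Name = "b" },
        };

        // Distinct() picks up Item's IEquatable<Item> implementation via EqualityComparer<Item>.Default.
        Console.WriteLine(items.Distinct().Count()); // 2
    }
}
```

With this in place, the original query needs no comparer argument, because the default comparer finds the IEquatable<Item> implementation.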
You can pass Distinct() a comparer object:
var MyItems = (from b in this.result
select new Item{ Name = b.Name, ID = b.ID }).Distinct(new ItemComparer());
Here is an example of a custom comparer class:
// Custom comparer for the Item class
class ItemComparer : IEqualityComparer<Item>
{
// Items are equal if their names and IDs are equal.
public bool Equals(Item x, Item y)
{
//Check whether the compared objects reference the same data.
if (Object.ReferenceEquals(x, y)) return true;
//Check whether any of the compared objects is null.
if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
return false;
//Check whether the items' properties are equal.
return x.ID == y.ID && x.Name == y.Name;
}
// If Equals() returns true for a pair of objects
// then GetHashCode() must return the same value for these objects.
public int GetHashCode(Item item)
{
//Check whether the object is null
if (Object.ReferenceEquals(item, null)) return 0;
//Get hash code for the Name field if it is not null.
int hashItemName = item.Name == null ? 0 : item.Name.GetHashCode();
//Get hash code for the ID field.
int hashItemID = item.ID.GetHashCode();
//Calculate the hash code for the item.
return hashItemName ^ hashItemID;
}
}
As you're comparing objects, not primitives, you're going to have to do some work to define what Distinct means.
Have a look at the Distinct overload that takes an IEqualityComparer:
http://msdn.microsoft.com/en-us/library/bb338049.aspx
Regular Distinct() returns elements from a collection by using the default equality comparer.
You can use a custom comparer for this:
// modified example from docs, not tested
class MyComparer : IEqualityComparer<Item>
{
// Items are equal if their ids are equal.
public bool Equals(Item x, Item y)
{
// Check whether the compared objects reference the same data.
if (Object.ReferenceEquals(x, y)) return true;
// Check whether any of the compared objects is null.
if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
return false;
//Check whether the items properties are equal.
return x.ID == y.ID;
}
// If Equals() returns true for a pair of objects
// then GetHashCode() must return the same value for these objects.
public int GetHashCode(Item item)
{
//Check whether the object is null
if (Object.ReferenceEquals(item, null)) return 0;
//Get hash code for the ID field.
int hashItemId = item.ID.GetHashCode();
return hashItemId;
}
}
var myItems = (from b in this.result
select new Item{ Name = b.Name, ID = b.ID }).Distinct(new MyComparer());
Since I don't know how you're using the Items after this, I'm gambling here.
If you really only need the ID-Name pair, you can use an anonymous type and get the comparison for free:
var MyItems = (from b in this.result
select new { b.Name, b.ID }).Distinct();
After this (and once again assuming all you need is the Name-ID pair), the resulting objects will have the properties you need:
foreach(var item in MyItems)
Console.WriteLine("{0} -> {1}", item.ID, item.Name);
Quoting MSDN on C# Anonymous Types:
Because the Equals and GetHashCode methods on anonymous types are defined in terms of the Equals and GetHashcode methods of the properties, two instances of the same anonymous type are equal only if all their properties are equal.
Enumerable.Distinct Method (IEnumerable): returns distinct elements from a sequence by using the default equality comparer to compare values.
Please check this:
https://msdn.microsoft.com/en-us/library/bb348436.aspx
If you need to add the distinct items to a new List, use foreach, for example:
foreach(var _item in result.Distinct()){
//Code here
}
ok :)
Related
I have a list that sometimes contains duplicate rows. To remove the duplicates I used the code below:
num = numDetailsTemp.Distinct().ToList();
var query = num.GroupBy(o => new { o.Number })
.Select(group =>
new
{
Name = group.Key,
Numbers = group.OrderByDescending(x => x.Date)
})
.OrderBy(group => group.Numbers.First().Date);
List<NumberDetails> numTemp = new List<NumberDetails>();
foreach (var group in query)
{
foreach (var numb in group.Numbers)
{
numTemp.Add(numb);
break;
}
}
num = numTemp;
The below image shows the duplicate value from the list.
And when I remove the duplicates, I get this output:
But I want to remove the row that does not contain AlterNo, ID Proof and Date. In the first image, the first row has no AlterNo, ID Proof or Date while the second row does, so I want to remove the first row and display only the second. The Date must be checked first, and after that AlterNo and ID Proof.
You can try the following:
var group =
list
.GroupBy(r => r.Number)
.SelectMany(g => g) //flatten your grouping and filter where you have alterno and id
.Where(r => !string.IsNullOrEmpty(r.AlterNo) && !string.IsNullOrEmpty(r.Id))
.OrderByDescending(r=>r.Date)
.ToList();
You can eliminate duplicates using the Distinct operator. First define a comparer class that implements the IEqualityComparer interface, then pass an instance of it to Distinct in your method.
internal class NumberDetailsComparer : IEqualityComparer<NumberDetails>
{
public bool Equals(NumberDetails x, NumberDetails y)
{
if (/* Set of conditions for equality matching */)
{
return true;
}
return false;
}
public int GetHashCode(NumberDetails obj)
{
return obj.Number.GetHashCode(); // Number, or whichever property identifies a record
}
}
And here is how to use it:
var distinctRecords = source.Distinct(new NumberDetailsComparer());
All you need to do is define the equality criteria in the comparer class.
Hope this solves your problem.
This link could be useful for a fully working example:
http://dotnetpattern.com/linq-distinct-operator
So you have a sequence of NumberDetails, and a definition of when you would consider two NumberDetails equal.
Once you have found which NumberDetails are equal, you want to eliminate the duplicates, except one: a duplicate that has values for AlterNo and IdProof.
Alas you didn't specify what you want if there are no duplicates with values for AlterNo and IdProof. Nor what you want if there are several duplicates with values for AlterNo and IdProof.
But let's assume that if there are several of these items, you don't care: just pick one, because they are duplicates anyway.
In your requirement you speak about duplicates. So let's write a class that implements your requirements of equality:
class NumberDetailEqualityComparer : IEqualityComparer<NumberDetail>
{
public static IEqualityComparer<NumberDetail> Default {get;} = new NumberDetailEqualityComparer();
public bool Equals(NumberDetail x, NumberDetail y)
{
if (x == null) return y == null; // true if both null
if (y == null) return false; // because x not null and y null
if (Object.ReferenceEquals(x, y)) return true; // because same object
if (x.GetType() != y.GetType()) return false; // because not same type
// by now we are out of quick checks, we need a value check
return x.Number == y.Number
&& x.FullName == y.FullName
&& ...
// etc, such that this returns true if, according to your definition,
// x and y are equal
}
You also need to implement GetHashCode. You can return anything you want, as long as you are certain that if x and y are equal, they return the same hash code. Furthermore, it is more efficient if unequal x and y have a high probability of returning different hash codes.
Something like:
public int GetHashCode(NumberDetail numberDetail)
{
const int prime1 = 12654365;
const int prime2 = 54655549;
if (numberDetail == null) return prime1;
int hash = prime1;
unchecked // let the multiplications overflow silently instead of throwing
{
hash = prime2 * hash + numberDetail.Number.GetHashCode();
hash = prime2 * hash + numberDetail.FullName.GetHashCode();
hash = prime2 * hash + numberDetail.Date.GetHashCode();
...
}
return hash;
}
}
Of course you have to check whether any of the properties is null before asking for its hash code.
Note that in your equality (and thus in GetHashCode) you don't look at AlterNo or IdProof.
Once you've defined precisely when you consider two NumberDetails equal, you can make groups of equal NumberDetails:
var groupsEqualNumberDetails = numberDetails.GroupBy(
// keySelector: make groups with equal NumberDetails:
numberDetail => numberDetail,
// ResultSelector: take the key and all NumberDetails that equal this key,
// and keep the first one that has values for AlterNo and IdProof
(key, numberDetailsEqualToKey) => numberDetailsEqualToKey
.Where(numberDetail => numberDetail.AlterNo != null
&& numberDetail.IdProof != null)
.FirstOrDefault(),
// KeyComparer: when do you consider two NumberDetails equal?
NumberDetailEqualityComparer.Default);
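The open question above (what to return when no duplicate in a group has AlterNo and IdProof) can be handled by ranking rather than filtering. The following is only a sketch under assumptions: the property names and types of NumberDetail are guesses, rows are treated as duplicates when their Number matches (as in the question's own GroupBy), and KeepBestPerNumber is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shape; the property names and types are assumptions.
public class NumberDetail
{
    public string Number { get; set; }
    public DateTime? Date { get; set; }
    public string AlterNo { get; set; }
    public string IdProof { get; set; }
}

public static class Demo
{
    // Number of optional text fields (AlterNo, IdProof) that are filled in.
    private static int FilledFields(NumberDetail d) =>
        (string.IsNullOrEmpty(d.AlterNo) ? 0 : 1) + (string.IsNullOrEmpty(d.IdProof) ? 0 : 1);

    // Keeps one row per Number: rows with a Date win, then rows with more
    // filled fields, then the most recent Date.
    public static List<NumberDetail> KeepBestPerNumber(IEnumerable<NumberDetail> rows) =>
        rows.GroupBy(r => r.Number)
            .Select(g => g.OrderByDescending(r => r.Date.HasValue)
                          .ThenByDescending(r => FilledFields(r))
                          .ThenByDescending(r => r.Date)
                          .First())
            .ToList();

    public static void Main()
    {
        var rows = new List<NumberDetail>
        {
            new NumberDetail { Number = "123" },
            new NumberDetail { Number = "123", Date = new DateTime(2020, 1, 1), AlterNo = "A1", IdProof = "P1" },
            new NumberDetail { Number = "456" },
        };

        foreach (var row in KeepBestPerNumber(rows))
            Console.WriteLine($"{row.Number} {row.AlterNo}");
    }
}
```

Because First() is applied to every group, a Number whose rows are all incomplete still yields one row instead of null.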
I have managed to remove most of the duplicate values in my list, but I still have lower-case duplicates, and empty string values in my list that I want to remove.
CategoriesList yield returns about 1000 records; noDuplicateCategories reduces this number to 20 removing most of the duplicates:
var CSVCategories = from line in File.ReadAllLines(path).Skip(1)
let columns = line.Split(',')
select new Category
{
Name = columns[9]
};
var CategoriesList = CSVCategories.ToList();
var noDuplicateCategories = CategoriesList.Distinct(new CategoryComparer()).ToList();
This is my comparer class, which implements the IEqualityComparer interface methods:
class CategoryComparer : IEqualityComparer<Category>
{
// Products are equal if their names and product numbers are equal.
public bool Equals(Category x, Category y)
{
//Check whether the compared objects reference the same data.
if (Object.ReferenceEquals(x, y)) return true;
//Check whether any of the compared objects is null.
if (Object.ReferenceEquals(x, null ) || Object.ReferenceEquals(y, null))
return false;
//Check whether the products' properties are equal.
return string.Compare(x.Name, y.Name, true) == 0;
}
// If Equals() returns true for a pair of objects
// then GetHashCode() must return the same value for these objects.
public int GetHashCode(Category category)
{
//Check whether the object is null
if (Object.ReferenceEquals(category, null)) return 0;
//Get hash code for the Name field if it is not null.
int hashCategoryName = category.Name == null ? 0 : category.Name.GetHashCode();
//Get hash code for the Code field.
int hashCategoryCode = category.Name.GetHashCode();
//Calculate the hash code for the product.
return hashCategoryName;
}
}
What do I need to change here to remove empty string values and also ignore casing?
My data:
Why deal with Category objects if all that needs to be unique is the name? You can prepare the names before converting them to categories:
var categories = File.ReadLines(path).Skip(1)
.Select(l => l.Split(',')) // keep empty columns so index 9 is still the tenth column
.Where(parts => parts.Length >= 10)
.Select(parts => parts[9].Trim())
.Where(name => name.Length > 0) // drop empty category names
.Distinct(StringComparer.InvariantCultureIgnoreCase)
.Select(s => new Category { Name = s });
Of course, if you are sure that the data in your file is reliable (no empty lines, every line has at least 10 parts, no empty names, and no whitespace around any part), then you can simplify the query to
var categories = File.ReadLines(path).Skip(1)
.Select(l => l.Split(',')[9])
.Distinct(StringComparer.InvariantCultureIgnoreCase)
.Select(s => new Category { Name = s });
NOTE: Use ReadLines instead of ReadAllLines to avoid dumping all file content into in-memory array.
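If you would rather keep the CategoryComparer, the likely cause of the surviving lower-case duplicates is that Equals ignores case while GetHashCode does not, so "Books" and "books" land in different hash buckets and are never compared. A minimal sketch of one way to fix that, assuming an ordinal case-insensitive comparison is acceptable for your data:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Category
{
    public string Name { get; set; }
}

public class CategoryComparer : IEqualityComparer<Category>
{
    // One comparer drives both methods so they cannot disagree about casing.
    private static readonly StringComparer NameComparer = StringComparer.OrdinalIgnoreCase;

    public bool Equals(Category x, Category y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return NameComparer.Equals(x.Name, y.Name);
    }

    public int GetHashCode(Category category)
    {
        // A null category or null name hashes to 0.
        if (category?.Name == null) return 0;
        return NameComparer.GetHashCode(category.Name);
    }
}

public static class Demo
{
    public static void Main()
    {
        var categories = new List<Category>
        {
            new Category { Name = "Books" },
            new Category { Name = "books" },
            new Category { Name = "" },
            new Category { Name = "Toys" },
        };

        var result = categories
            .Where(c => !string.IsNullOrWhiteSpace(c.Name)) // drop empty names first
            .Distinct(new CategoryComparer())
            .ToList();

        Console.WriteLine(result.Count); // 2
    }
}
```

Empty names are filtered with Where rather than inside the comparer, since a comparer can only decide whether two items are equal, not whether an item belongs in the result.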
I'm trying to build a list of items based on their presence in a list.
itemsAll contains all products
itemsNew contains only new products
I'd like itemsOld to contain only old products (i.e. itemsAll -
itemsNew)
This was my approach, which doesn't return the correct number of items.
var itemsAll = objProductStagingRepository.AllImports(fileId, cid).ToList();
var itemsNew = objProductStagingRepository.DetectNonPresentProductNames(fileId, cid).ToList();
var itemsOld = from t1 in itemsAll where !(from o in itemsNew select o.Id).Contains(t1.Id)
select t1; // this does not work
Does anybody have any suggestions as to how I should be approaching this? I have tried itemsAll.Except(itemsNew), which also doesn't yield the correct results!
I think you probably could use the Except method, but you would need to provide an equality comparer for the method to know when two items are equal.
http://msdn.microsoft.com/en-us/library/bb336390.aspx
In your question it looks like you're not using your own comparer, so it's comparing the items to see if they are the same object in memory (most likely), which is not what you're trying to do.
You want to compare the objects by database identity, which means you need to provide your own comparer.
Example:
public class Item
{
public int Id { get; set; }
}
class ItemComparer : IEqualityComparer<Item>
{
public bool Equals(Item x, Item y)
{
if (Object.ReferenceEquals(x, y)) return true;
if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
return false;
return x.Id == y.Id;
}
public int GetHashCode(Item value)
{
if (Object.ReferenceEquals(value, null)) return 0;
int hash = value.Id.GetHashCode();
return hash;
}
}
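For completeness, a minimal sketch of passing such a comparer to Except. The sample lists are made up for illustration, and Item here has only the Id property from the example above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public int Id { get; set; }
}

public class ItemComparer : IEqualityComparer<Item>
{
    public bool Equals(Item x, Item y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Id == y.Id;
    }

    public int GetHashCode(Item value) => value is null ? 0 : value.Id.GetHashCode();
}

public static class Demo
{
    public static void Main()
    {
        var itemsAll = new List<Item> { new Item { Id = 1 }, new Item { Id = 2 }, new Item { Id = 3 } };
        var itemsNew = new List<Item> { new Item { Id = 3 } };

        // Set difference by Id rather than by object reference.
        var itemsOld = itemsAll.Except(itemsNew, new ItemComparer()).ToList();

        Console.WriteLine(itemsOld.Count); // 2
    }
}
```

Note that Except has set semantics: if itemsAll itself contains two items with the same Id, only one of them appears in the result.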
itemsOld.AddRange(itemsAll.Where(p => !itemsNew.Any(a => a.Id == p.Id)));
I prefer the fluent syntax so:
var itemsOld = itemsAll.Where(x => !itemsNew.Any(y => y.Id == x.Id));
or
var itemsOld = itemsAll.Where(x => !itemsNew.Exists(y => y.Id == x.Id));
This might work
var itemsOld = from a in itemsAll
join n in itemsNew on a.Id equals n.Id into ng
where !ng.Any()
select a;
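The Where/Any variants above rescan itemsNew for every element of itemsAll. For larger lists, collecting the new Ids into a HashSet first avoids that. A sketch, assuming Id is an int and both lists are already in memory; OldItems is a hypothetical helper name:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public int Id { get; set; }
}

public static class Demo
{
    // Returns the items of `all` whose Id does not occur in `fresh`.
    public static List<Item> OldItems(IEnumerable<Item> all, IEnumerable<Item> fresh)
    {
        var newIds = new HashSet<int>(fresh.Select(i => i.Id));
        return all.Where(i => !newIds.Contains(i.Id)).ToList();
    }

    public static void Main()
    {
        var itemsAll = new List<Item> { new Item { Id = 1 }, new Item { Id = 2 }, new Item { Id = 3 } };
        var itemsNew = new List<Item> { new Item { Id = 3 } };

        Console.WriteLine(OldItems(itemsAll, itemsNew).Count); // 2
    }
}
```

Unlike Except, this keeps every element of itemsAll that passes the filter, including repeated Ids.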
I have this comparer for my object Tenant
public class TenantComparer : IEqualityComparer<Tenant>
{
// Products are equal if their names and product numbers are equal.
public bool Equals(Tenant x, Tenant y)
{
//Check whether the compared objects reference the same data.
if (Object.ReferenceEquals(x, y)) return true;
//Check whether any of the compared objects is null.
if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
return false;
//Check whether the products' properties are equal.
return x.Name == y.Name;
}
// If Equals() returns true for a pair of objects
// then GetHashCode() must return the same value for these objects.
public int GetHashCode(Tenant tenant)
{
//Check whether the object is null
if (Object.ReferenceEquals(tenant, null)) return 0;
//Get hash code for the Name field if it is not null.
int hashProductName = tenant.Name == null ? 0 : tenant.Name.GetHashCode();
//Calculate the hash code for the product.
return hashProductName;
}
}
Now, I have a table with some tenants, and some of them have the same name. I want to fetch the distinct ones, ordered by name:
public static List<Tenant> GetTenantListOrderyByNameASC()
{
DataClassesDataContext db = new DataClassesDataContext();
var tenantsList = (from t in db.Tenants
select t).Distinct().OrderBy( x => x.Name ).ToList();
return tenantsList;
}
But it still shows the tenants with the same names...
Can you please tell me where I am wrong?
You need to provide the comparer explicitly, which at the moment you do not:
var tenantsList = (from t in db.Tenants
select t)
.Distinct(new TenantComparer())
.OrderBy( x => x.Name )
.ToList();
See the documentation.
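One caveat, stated as an assumption rather than a verified fact about your setup: a database-backed query provider generally cannot translate a custom IEqualityComparer into SQL, so with db.Tenants you may need to switch to in-memory evaluation with AsEnumerable() before calling Distinct. The sketch below uses an in-memory list as a stand-in for the table, and a trimmed-down Tenant with only a Name property.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the mapped entity; only Name matters for this sketch.
public class Tenant
{
    public string Name { get; set; }
}

public class TenantComparer : IEqualityComparer<Tenant>
{
    public bool Equals(Tenant x, Tenant y) =>
        ReferenceEquals(x, y) || (x != null && y != null && x.Name == y.Name);

    public int GetHashCode(Tenant tenant) =>
        tenant?.Name == null ? 0 : tenant.Name.GetHashCode();
}

public static class Demo
{
    // Distinct tenants by name, ordered by name, compared in memory.
    public static List<Tenant> DistinctByName(IEnumerable<Tenant> tenants) =>
        tenants.AsEnumerable()              // from here on LINQ to Objects does the work
               .Distinct(new TenantComparer())
               .OrderBy(t => t.Name)
               .ToList();

    public static void Main()
    {
        var tenants = new List<Tenant>
        {
            new Tenant { Name = "Zeta" },
            new Tenant { Name = "Acme" },
            new Tenant { Name = "Acme" },
        };

        foreach (var tenant in DistinctByName(tenants))
            Console.WriteLine(tenant.Name);
    }
}
```

The trade-off is that all rows are loaded before duplicates are dropped; grouping by Name inside the query is an alternative if the table is large.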
I have a list with two or more objects of class Agent.
Name = "A"
Priority = 0
ResultCount = 100
;
Name = "B"
Priority = 1
ResultCount = 100
;
Both objects have the same ResultCount. In that case I only need one object, not two or more. I did this with a LINQ query using Distinct and a custom-made comparer.
IEnumerable<Agent> distinctResultsAgents =
(from agt in distinctUrlsAgents select agt).Distinct(comparerResultsCount);
With this query I get only one object from the list but I never know which one.
But I don't want just any object; I want object "B", because its Priority is higher than that of object "A".
How can I do that?
My custom Comparer is very simple and has a method like this:
public bool Equals(Agent x, Agent y)
{
if (x == null || y == null)
return false;
if (x.ResultCount == y.ResultCount)
return true;
return false;
}
First group the elements by ResultCount so that you only get one result for each distinct value of ResultCount. Then for each group select the element in that group with the highest priority.
Try this query:
IEnumerable<Agent> distinctResultsAgents =
from d in distinctUrlsAgents
group d by d.ResultCount into g
select g.OrderByDescending(x => x.Priority).First();
If you use morelinq there is a function called MaxBy that you could use instead of the last line, but note that it only works for LINQ To Objects.
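As a variation that does not need morelinq: .NET 6 and later ship their own Enumerable.MaxBy, which can replace the OrderByDescending/First pair for in-memory sequences. A sketch assuming .NET 6+, with an Agent class reconstructed from the property names in the question (their types are assumptions):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Reconstructed from the question; the property types are assumptions.
public class Agent
{
    public string Name { get; set; }
    public int Priority { get; set; }
    public int ResultCount { get; set; }
}

public static class Demo
{
    // One agent per distinct ResultCount: the one with the highest Priority.
    public static List<Agent> BestPerResultCount(IEnumerable<Agent> agents) =>
        agents.GroupBy(a => a.ResultCount)
              .Select(g => g.MaxBy(a => a.Priority)) // Enumerable.MaxBy, .NET 6 or later
              .ToList();

    public static void Main()
    {
        var agents = new List<Agent>
        {
            new Agent { Name = "A", Priority = 0, ResultCount = 100 },
            new Agent { Name = "B", Priority = 1, ResultCount = 100 },
        };

        foreach (var agent in BestPerResultCount(agents))
            Console.WriteLine(agent.Name); // B
    }
}
```

MaxBy scans each group once instead of sorting it, which matters only when groups are large.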