Remove Duplicate value from List<T> - c#

I have a list that sometimes contains duplicate rows, and I want to remove those duplicates. For that I used the code below:
num = numDetailsTemp.Distinct().ToList();
var query = num.GroupBy(o => new { o.Number })
.Select(group =>
new
{
Name = group.Key,
Numbers = group.OrderByDescending(x => x.Date)
})
.OrderBy(group => group.Numbers.First().Date);
List<NumberDetails> numTemp = new List<NumberDetails>();
foreach (var group in query)
{
foreach (var numb in group.Numbers)
{
numTemp.Add(numb);
break;
}
}
num = numTemp;
The image below shows the duplicate values in the list.
When I remove duplicates with the code above, it gives me this output.
However, I want to remove the row that does not contain AlterNo, ID Proof and Date. In the first image, the first row has no AlterNo, ID Proof or Date and the second row has them, so I want to remove the first row and display only the second row. The Date must be checked first, and after that AlterNo and ID Proof.

You can try the following:
var group =
list
.GroupBy(r => r.Number)
.SelectMany(g => g) //flatten your grouping and filter where you have alterno and id
.Where(r => !string.IsNullOrEmpty(r.AlterNo) && !string.IsNullOrEmpty(r.Id))
.OrderByDescending(r=>r.Date)
.ToList();

You can eliminate duplicates using the Distinct operator. First define a comparer class that implements the IEqualityComparer<T> interface, then pass an instance of it to Distinct.
internal class NumberDetailsComparer : IEqualityComparer<NumberDetails>
{
public bool Equals(NumberDetails x, NumberDetails y)
{
if (/* Set of conditions for equality matching */)
{
return true;
}
return false;
}
public int GetHashCode(NumberDetails obj)
{
return obj.Number.GetHashCode(); // Number, or whichever properties Equals compares
}
}
And here is how to use it:
var distinctRecords = source.Distinct(new NumberDetailsComparer());
All you need to do is define the criteria for comparer class.
Hope this solves your problem.
This link could be useful for a fully working example:
http://dotnetpattern.com/linq-distinct-operator

So you have a sequence of NumberDetails, and a definition of when you consider two NumberDetails equal.
Once you have found which NumberDetails are equal, you want to eliminate the duplicates, except one: a duplicate that has values for AlterNo and IdProof.
Alas you didn't specify what you want if there are no duplicates with values for AlterNo and IdProof. Nor what you want if there are several duplicates with values for AlterNo and IdProof.
But let's assume that if there are several of these items, you don't care: just pick one, because they are duplicates anyway.
In your requirement you speak about duplicates. So let's write a class that implements your requirements of equality:
class NumberDetailEqualityComparer : IEqualityComparer<NumberDetail>
{
public static IEqualityComparer<NumberDetail> Default {get;} = new NumberDetailEqualityComparer();
public bool Equals(NumberDetail x, NumberDetail y)
{
if (x == null) return y == null; // true if both null
if (y == null) return false; // because x not null and y null
if (Object.ReferenceEquals(x, y)) return true; // because same object
if (x.GetType() != y.GetType()) return false; // because not same type
// by now we are out of quick checks, we need a value check
return x.Number == y.Number
&& x.FullName == y.FullName
&& ...
// etc, such that this returns true if according your definition
// x and y are equal
}
You also need to implement GetHashCode. You can return anything you want, as long as you are certain that if x and y are equal, they return the same hash code.
Furthermore, it is more efficient if unequal x and y have a high probability of returning different hash codes.
Something like:
public int GetHashCode(NumberDetail numberDetail)
{
const int prime1 = 12654365;
const int prime2 = 54655549;
if (numberDetail == null) return prime1;
int hash = prime1;
unchecked // let the multiplication overflow silently
{
hash = prime2 * hash + numberDetail.Number.GetHashCode();
hash = prime2 * hash + numberDetail.FullName.GetHashCode();
hash = prime2 * hash + numberDetail.Date.GetHashCode();
...
}
return hash;
}
}
Of course you have to check whether any of the properties is null before asking for its hash code.
Obviously your equality (and thus GetHashCode) does not look at AlterNo or IdProof.
Once you've defined precisely when you consider two NumberDetails equal, you can make groups of equal NumberDetails:
var groupsEqualNumberDetails = numberDetails.GroupBy(
// keySelector: make groups with equal NumberDetails:
numberDetail => numberDetail,
// resultSelector: take the key and all NumberDetails that equal this key,
// and keep the first one that has values for AlterNo and IdProof
(key, numberDetailsEqualToKey) => numberDetailsEqualToKey
.Where(numberDetail => numberDetail.AlterNo != null
&& numberDetail.IdProof != null)
.FirstOrDefault(),
// comparer: when do you consider two NumberDetails equal?
NumberDetailEqualityComparer.Default);
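As a rough sketch of the requirement in the question (one row per Number, preferring the row that has a Date, then AlterNo and IdProof), grouping and ranking inside each group may be enough. The NumberDetails class below is a hypothetical stand-in: the property names come from the question, but their types, the Completeness scoring and the KeepBestPerNumber helper are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model: property names follow the question, types are assumed.
public class NumberDetails
{
    public string Number { get; set; }
    public DateTime? Date { get; set; }
    public string AlterNo { get; set; }
    public string IdProof { get; set; }
}

public static class NumberDetailsDeduplicator
{
    // Scores a row so that having a Date outweighs AlterNo and IdProof combined.
    private static int Completeness(NumberDetails d)
    {
        return (d.Date.HasValue ? 4 : 0)
             + (string.IsNullOrEmpty(d.AlterNo) ? 0 : 2)
             + (string.IsNullOrEmpty(d.IdProof) ? 0 : 1);
    }

    // Keeps one row per Number: the most complete one, newest Date as tie-breaker.
    public static List<NumberDetails> KeepBestPerNumber(IEnumerable<NumberDetails> source)
    {
        return source
            .GroupBy(d => d.Number)
            .Select(g => g.OrderByDescending(d => Completeness(d))
                          .ThenByDescending(d => d.Date)
                          .First())
            .ToList();
    }
}
```

Unlike the FirstOrDefault variant, this always returns one row per Number, even when no duplicate has AlterNo or IdProof filled in. Whether that is the desired behaviour is not stated in the question.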

Related

Remove empty string values and ignore casing in C# List when using Distinct() method

I have managed to remove most of the duplicate values in my list, but I still have lower-case duplicates, and empty string values in my list that I want to remove.
CategoriesList yield returns about 1000 records; noDuplicateCategories reduces this number to 20 removing most of the duplicates:
var CSVCategories = from line in File.ReadAllLines(path).Skip(1)
let columns = line.Split(',')
select new Category
{
Name = columns[9]
};
var CategoriesList = CSVCategories.ToList();
var noDuplicateCategories = CategoriesList.Distinct(new CategoryComparer()).ToList();
This is my comparer class, which implements the IEqualityComparer interface:
class CategoryComparer : IEqualityComparer<Category>
{
// Products are equal if their names and product numbers are equal.
public bool Equals(Category x, Category y)
{
//Check whether the compared objects reference the same data.
if (Object.ReferenceEquals(x, y)) return true;
//Check whether any of the compared objects is null.
if (Object.ReferenceEquals(x, null ) || Object.ReferenceEquals(y, null))
return false;
//Check whether the products' properties are equal.
return string.Compare(x.Name, y.Name, true) == 0;
}
// If Equals() returns true for a pair of objects
// then GetHashCode() must return the same value for these objects.
public int GetHashCode(Category category)
{
//Check whether the object is null
if (Object.ReferenceEquals(category, null)) return 0;
//Get hash code for the Name field if it is not null.
int hashCategoryName = category.Name == null ? 0 : category.Name.GetHashCode();
//Get hash code for the Code field.
int hashCategoryCode = category.Name.GetHashCode();
//Calculate the hash code for the product.
return hashCategoryName;
}
}
What do I need to change here to remove empty string values and also ignore casing?
My data:
Why deal with Category objects if all you need to be unique is the name? You can prepare the names before converting them to categories:
var categories = File.ReadLines(path).Skip(1)
.Select(l => l.Split(new [] {','}, StringSplitOptions.RemoveEmptyEntries))
.Where(parts => parts.Length >= 10)
.Select(parts => parts[9].Trim())
.Distinct(StringComparer.InvariantCultureIgnoreCase)
.Select(s => new Category { Name = s });
Of course, if you are pretty sure that the data in your file is reliable (no empty lines, every line has at least 10 parts, and no part has surrounding whitespace), then you can simplify the query to:
var categories = File.ReadLines(path).Skip(1)
.Select(l => l.Split(',')[9])
.Distinct(StringComparer.InvariantCultureIgnoreCase)
.Select(s => new Category { Name = s });
NOTE: Use ReadLines instead of ReadAllLines to avoid loading the whole file into an in-memory array.
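If you would rather keep working with Category objects and a custom comparer, the point to watch is that Equals and GetHashCode must agree: a case-insensitive Equals paired with a case-sensitive hash code can let lower-case duplicates slip through Distinct. The following is a minimal sketch under that assumption. The Category class is a stand-in for the one in the question, and CaseInsensitiveCategoryComparer and DistinctNonEmpty are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's Category class.
public class Category
{
    public string Name { get; set; }
}

public class CaseInsensitiveCategoryComparer : IEqualityComparer<Category>
{
    // One comparer drives both Equals and GetHashCode so they stay consistent.
    private static readonly StringComparer NameComparer = StringComparer.OrdinalIgnoreCase;

    public bool Equals(Category x, Category y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (ReferenceEquals(x, null) || ReferenceEquals(y, null)) return false;
        return NameComparer.Equals(x.Name, y.Name);
    }

    public int GetHashCode(Category category)
    {
        if (category == null || category.Name == null) return 0;
        return NameComparer.GetHashCode(category.Name);
    }
}

public static class CategoryCleaner
{
    // Drops null, empty and whitespace-only names, then removes case-insensitive duplicates.
    public static List<Category> DistinctNonEmpty(IEnumerable<Category> categories)
    {
        return categories
            .Where(c => c != null && !string.IsNullOrWhiteSpace(c.Name))
            .Distinct(new CaseInsensitiveCategoryComparer())
            .ToList();
    }
}
```

Ordinal comparison is used here as an assumption; StringComparer.InvariantCultureIgnoreCase, as in the queries above, would plug in the same way.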

What to do to get only one List?

Hello, I have a method that compares the objects of two lists for differences. Right now this works, but only for one property at a time.
Here is the Method:
public SPpowerPlantList compareTwoLists(string sqlServer, string database, DateTime timestampCurrent, string noteCurrent, DateTime timestampOld, string noteOld)
{
int count = 0;
SPpowerPlantList powerPlantListCurrent = loadProjectsAndComponentsFromSqlServer(sqlServer, database, timestampCurrent, noteCurrent);
SPpowerPlantList powerPlantListOld = loadProjectsAndComponentsFromSqlServer(sqlServer, database, timestampOld, noteOld);
SPpowerPlantList powerPlantListDifferences = new SPpowerPlantList();
count = powerPlantListOld.Count - powerPlantListCurrent.Count;
var differentObjects = powerPlantListCurrent.Where(p => !powerPlantListOld.Any(l => p.mwWeb == l.mwWeb)).ToList();
foreach (var differentObject in differentObjects)
{
powerPlantListDifferences.Add(differentObject);
}
return powerPlantListDifferences;
}
This works, and I get 4 objects in the new list. The problem is that I have a few other properties that I need to compare, for example name instead of mwWeb. When I try to change it, I need a new list and a new foreach loop for every new property.
e.g.
int count = 0;
SPpowerPlantList powerPlantListCurrent = loadProjectsAndComponentsFromSqlServer(sqlServer, database, timestampCurrent, noteCurrent);
SPpowerPlantList powerPlantListOld = loadProjectsAndComponentsFromSqlServer(sqlServer, database, timestampOld, noteOld);
SPpowerPlantList powerPlantListDifferences = new SPpowerPlantList();
SPpowerPlantList powerPlantListDifferences2 = new SPpowerPlantList();
count = powerPlantListOld.Count - powerPlantListCurrent.Count;
var differentObjects = powerPlantListCurrent.Where(p => !powerPlantListOld.Any(l => p.mwWeb == l.mwWeb)).ToList();
var differentObjects2 = powerPlantListCurrent.Where(p => !powerPlantListOld.Any(l => p.shortName == l.shortName)).ToList();
foreach (var differentObject in differentObjects)
{
powerPlantListDifferences.Add(differentObject);
}
foreach (var differentObject in differentObjects2)
{
powerPlantListDifferences2.Add(differentObject);
}
return powerPlantListDifferences;
Is there a way to prevent this, or to run more queries and get back only one list with all the different objects?
I tried Except and Intersect, but that didn't work.
Any help or advice would be great. Thanks for your time.
PS: If there is something wrong with the style of my question, please tell me, because I am trying to learn to ask better questions.
You may be able to simply chain the properties that you wanted to compare within your Where() clause using OR statements :
// This should get you any elements that have different A properties, B properties, etc.
var different = current.Where(p => !old.Any(l => p.A == l.A || p.B == l.B))
.ToList();
If that doesn't work and you really want to use the Except() or Intersect() methods to properly compare the objects, you could write your own custom IEqualityComparer<YourPowerPlant> to use to properly compare them :
class PowerPlantComparer : IEqualityComparer<YourPowerPlant>
{
// Power plants are equal if specific properties are equal.
public bool Equals(YourPowerPlant x, YourPowerPlant y)
{
// Check whether the compared objects reference the same data.
if (Object.ReferenceEquals(x, y)) return true;
//Check whether any of the compared objects is null.
if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
return false;
// Checks the other properties to compare (examples using mwWeb and shortName)
return x.mwWeb == y.mwWeb && x.shortName == y.shortName;
}
// If Equals() returns true for a pair of objects
// then GetHashCode() must return the same value for these objects.
public int GetHashCode(YourPowerPlant powerPlant)
{
// Check whether the object is null
if (Object.ReferenceEquals(powerPlant, null)) return 0;
// Get hash code for the mwWeb field if it is not null.
int hashA = powerPlant.mwWeb == null ? 0 : powerPlant.mwWeb.GetHashCode();
// Get hash code for the shortName field if it is not null.
int hashB = powerPlant.shortName == null ? 0 : powerPlant.shortName.GetHashCode();
// Calculate the hash code for the product.
return hashA ^ hashB;
}
}
and then you could likely use something like one of the following depending on your needs :
var different = current.Except(old,new PowerPlantComparer());
or :
var different = current.Intersect(old,new PowerPlantComparer());
One way is to use IEqualityComparer, as Rion Williams suggested. If you'd like a more flexible solution, you can split the logic into two parts. First create a helper method that accepts two lists and a function in which you define which properties you wish to compare. For example:
public static class Helper
{
public static SPpowerPlantList GetDifference(this SPpowerPlantList current, SPpowerPlantList old, Func<PowerPlant, PowerPlant, bool> func)
{
var diff = current.Where(p => old.All(l => func(p, l))).ToList();
var result = new SPpowerPlantList();
foreach (var item in diff) result.Add(item);
return result;
}
}
And use it :
public SPpowerPlantList compareTwoLists(string sqlServer, string database,
DateTime timestampCurrent, string noteCurrent,
DateTime timestampOld, string noteOld)
{
var powerPlantListCurrent = ...;
var powerPlantListOld = ...;
var diff = powerPlantListCurrent.GetDifference(
powerPlantListOld,
(x, y) => x.mwWeb != y.mwWeb ||
x.shortName != y.shortName);
return diff;
}
P.S. if it better suits your needs, you could move method inside of existing class :
public class MyClass
{
public SPpowerPlantList GetDifference(SPpowerPlantList current, SPpowerPlantList old, Func<PowerPlant, PowerPlant, bool> func)
{
...
}
}
And call it (inside of class) :
var result = GetDifference(currentValues, oldValues, (x, y) => x.mwWeb != y.mwWeb);
The easiest way to do this would be to compare some unique identifier (ID)
var differentObjects = powerPlantListCurrent
.Where(p => !powerPlantListOld.Any(l => p.Id == l.Id))
.ToList();
If the other properties might have been updated and you want to check that too, you'll have to compare all of them to detect changes made to existing elements:
Implement a comparison method (IComparable, IEquatable, IEqualityComparer, or override Equals) or, if that's not possible because you didn't write the class yourself (generated code or an external assembly), write a method to compare two of those SPpowerPlantList elements and use that instead of comparing every single property in LINQ. For example:
public bool AreThoseTheSame(SPpowerPlantList a,SPpowerPlantList b)
{
if(a.mwWeb != b.mwWeb) return false;
if(a.shortName != b.shortName) return false;
//etc.
return true;
}
Then replace your difference call with this:
var differentObjects = powerPlantListCurrent
.Where(p => !powerPlantListOld.Any(l => AreThoseTheSame(p,l)))
.ToList();
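Another way to get a single list while comparing several properties at once is to build a composite key per element and look it up in a HashSet, which avoids scanning the old list for every current element. This is only a sketch: the PowerPlant class, the string types of its properties and the FindChanged helper are assumptions for illustration, and it relies on value tuples (C# 7 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type: property names come from the question, types are assumed.
public class PowerPlant
{
    public string mwWeb { get; set; }
    public string shortName { get; set; }
}

public static class PowerPlantDiff
{
    // Returns the elements of 'current' for which no element of 'old'
    // has the same combination of mwWeb and shortName.
    public static List<PowerPlant> FindChanged(IEnumerable<PowerPlant> current, IEnumerable<PowerPlant> old)
    {
        // Value tuples compare element by element, so they work as composite keys.
        var oldKeys = new HashSet<(string, string)>(old.Select(p => (p.mwWeb, p.shortName)));

        return current
            .Where(p => !oldKeys.Contains((p.mwWeb, p.shortName)))
            .ToList();
    }
}
```

Adding another property to the comparison then means extending the tuple in both places rather than adding a new list and loop.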

Return objects which contain Lists, based on the those lists matching

I have Cell objects which contain a List<int> called PossibleValues. I'm trying to find a way to get a list of cells in which all members have matching PossibleValues. I currently have:
foreach (var cell in group)
{
var cellsWithMatchingPossibleValues = group.Where(c => c.PossibleValues == cell.PossibleValues);
}
Unfortunately this isn't working, I suspect my linq statement isn't comparing the contents of PossibleValues, but instead comparing a reference of some kind, so that even in the case where both lists are composed of 3 and nothing else, cellsWithMatchingPossibleValues ends up only containing one cell, although I'm not certain, or sure how to get around that.
To formalise the question:
How can I return objects which contain Lists based on the those lists matching?
You can implement your own IEqualityComparer<Cell> for your Cell class that states equality when the PossibleValues are equal like this:
public class CellComparer : IEqualityComparer<Cell>
{
public bool Equals(Cell x, Cell y)
{
if (ReferenceEquals(x, null)) return ReferenceEquals(y, null);
if (ReferenceEquals(y, null)) return false;
return x.PossibleValues.SequenceEqual(y.PossibleValues);
}
public int GetHashCode(Cell obj)
{
if (obj == null) return 0;
unchecked
{
int hash = 1;
foreach (int v in obj.PossibleValues) // int is not nullable, so no null check is needed
hash = (hash * 397) ^ v;
return hash;
}
}
}
Then you can use this for a simple LINQ grouping like this:
var cellsGroupedByEqualValues = group.GroupBy(c => c, new CellComparer());
This returns an IEnumerable<IGrouping<Cell,Cell>> and you can iterate through it and receive the number of matching cells:
foreach(var groupedCells in cellsGroupedByEqualValues)
Console.WriteLine(groupedCells.Count());
But these contain duplicates since GroupBy generates a IGrouping for every Cell and adds all matching cells to that. (still trying to find a good way around that)
But for now you can tell for every Cell how many other Cells with the same list of values there are.
You can implement IEqualityComparer interface and use GroupBy method.
Here you can find a good GetHashCode for a List, and here how to compare lists.
public class PossibleValuesCellComparer : IEqualityComparer<Cell>
{
public bool Equals(Cell x, Cell y)
{
return Enumerable.SequenceEqual(x.PossibleValues.OrderBy(t => t), y.PossibleValues.OrderBy(t => t));
}
public int GetHashCode(Cell cell)
{
var list = cell.PossibleValues.OrderBy(t => t);
unchecked
{
int hash = 19;
foreach (var obj in list)
{
hash = hash * 31 + obj.GetHashCode();
}
return hash;
}
}
}
....
var g2 = group.GroupBy(x => x, new PossibleValuesCellComparer());
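If a full comparer feels heavy, a simpler variation is to group on a canonical key built from the sorted values. The sketch below assumes a minimal Cell class with only the PossibleValues property from the question, and treats lists with the same values in a different order as matching, as the comparer above does. GroupByPossibleValues is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal version of the question's Cell class.
public class Cell
{
    public List<int> PossibleValues { get; set; } = new List<int>();
}

public static class CellGrouping
{
    // Groups cells whose PossibleValues contain the same numbers, ignoring order.
    public static List<List<Cell>> GroupByPossibleValues(IEnumerable<Cell> cells)
    {
        return cells
            // "1,2,3" style key: sorting makes {2,1,3} and {1,2,3} produce the same string.
            .GroupBy(c => string.Join(",", c.PossibleValues.OrderBy(v => v)))
            .Select(g => g.ToList())
            .ToList();
    }
}
```

Building a string per cell costs some allocations, so for large inputs the comparer-based GroupBy is probably the better fit.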

C# Distinct with ability to choose which object to save, which ones to remove

I implemented this comparer which works OK.
class ReservationDatesDistinctComparer : IEqualityComparer<ReservationModel>
{
public bool Equals(ReservationModel x, ReservationModel y)
{
return x.FromDate.Date == y.FromDate.Date && x.ToDate.Date == y.ToDate.Date && x.UnitId == y.UnitId;
}
public int GetHashCode(ReservationModel product)
{
int hashProductCode = 1;
return hashProductCode;
}
}
But ReservationModel has another property, let's call it ReservationType. I would like Distinct to filter out models with the same dates but keep only the ReservationModel that has Type A, not Type B.
How can I influence which model Distinct keeps?
Distinct keeps the elements it encounters first, so a possible solution is to order those which have ReservationType A first:
reservationModels.OrderByDescending(m => m.ReservationType == ReservationType.A)
.Distinct(new ReservationDatesDistinctComparer());
I don't think you can use Distinct for this. (Unless you want to rely on undocumented implementation details, as per Lukazoid's answer.)
Something similar to this might do the trick. (Group the elements that your comparer deems to be equal, then order each group so that Type A is prioritised, then take the first element from each group.)
var result = source.GroupBy(x => x, new ReservationDatesDistinctComparer())
.Select(g => g.OrderBy(x => (x.ReservationType == "Type A") ? 1 : 2)
.First());
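The same grouping idea can be written without a custom comparer by grouping on a tuple of the compared values, which also gives a well-distributed hash for free. This is a sketch under assumptions: the ReservationModel shape, the ReservationType enum and the DistinctPreferringTypeA helper are hypothetical, since the question does not show the real types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types: the question only names the properties.
public enum ReservationType { A, B }

public class ReservationModel
{
    public DateTime FromDate { get; set; }
    public DateTime ToDate { get; set; }
    public int UnitId { get; set; }
    public ReservationType ReservationType { get; set; }
}

public static class ReservationDeduplicator
{
    // One reservation per (from date, to date, unit); Type A wins when both types exist.
    public static List<ReservationModel> DistinctPreferringTypeA(IEnumerable<ReservationModel> source)
    {
        return source
            .GroupBy(r => (From: r.FromDate.Date, To: r.ToDate.Date, Unit: r.UnitId))
            // OrderBy is stable, so among equal types the first one in the source is kept.
            .Select(g => g.OrderBy(r => r.ReservationType == ReservationType.A ? 0 : 1).First())
            .ToList();
    }
}
```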

How to Check All Values in Dictionary is same in C#?

I have a Dictionary, I want to write a method to check whether all values are same in this Dictionary.
Dictionary Type:
Dictionary<string, List<string>>
The lists {1,2,3} and {2,1,3} are the same in my case.
I have done this previously for simple datatype values, but I can not find logic for new requirement, please help me.
For simple values:
MyDict.GroupBy(x => x.Value).Where(x => x.Count() > 1)
I have also written a Generic Method to compare two datatypes in this way.
// 1
// Require that the counts are equal
if (a.Count != b.Count)
{
return false;
}
// 2
// Initialize new Dictionary of the type
Dictionary<T, int> d = new Dictionary<T, int>();
// 3
// Add each key's frequency from collection A to the Dictionary
foreach (T item in a)
{
int c;
if (d.TryGetValue(item, out c))
{
d[item] = c + 1;
}
else
{
d.Add(item, 1);
}
}
// 4
// Add each key's frequency from collection B to the Dictionary
// Return early if we detect a mismatch
foreach (T item in b)
{
int c;
if (d.TryGetValue(item, out c))
{
if (c == 0)
{
return false;
}
else
{
d[item] = c - 1;
}
}
else
{
// Not in dictionary
return false;
}
}
// 5
// Verify that all frequencies are zero
foreach (int v in d.Values)
{
if (v != 0)
{
return false;
}
}
// 6
// We know the collections are equal
return true;
Implement an IEqualityComparer for List<string> that compares two list based on their content. Then just use Distinct on Values and check the count:
dictionary.Values.Distinct(new ListEqualityComparer()).Count() == 1
This should do the trick
var lists = dic.Select(kv => kv.Value.OrderBy(x => x)).ToList();
var first = lists.First();
var areEqual = lists.Skip(1).All(hs => hs.SequenceEqual(first));
You'll need to add some checks to make this work for the empty case.
...or if you want to take @Selman's approach, here's an implementation of the IEqualityComparer:
class SequenceComparer<T>:IEqualityComparer<IEnumerable<T>>
{
public bool Equals(IEnumerable<T> left, IEnumerable<T> right)
{
return left.OrderBy(x => x).SequenceEqual(right.OrderBy(x => x));
}
public int GetHashCode(IEnumerable<T> item)
{
//no need to sort because XOR is commutative
return item.Aggregate(0, (acc, val) => val.GetHashCode() ^ acc);
}
}
You could make a variant of this combining the best of both approaches using a HashSet<T> that might be considerably more efficient in the case that you have many candidates to test:
HashSet<IEnumerable<string>> hs = new HashSet<IEnumerable<string>>(new SequenceComparer<string>());
hs.Add(dic.First().Value);
var allEqual = dic.All(kvp => !hs.Add(kvp.Value));
This uses the feature of HashSets that disallows adding more than one item that is considered equal with an item already in the set. We make the HashSet use the custom IEqualityComparer above...
So we insert an arbitrary item from the dictionary before we start, then the moment another item is allowed into the set (i.e. hs.Add(kvp.Value) is true), we can say that there's more than one item in the set and bail out early. .All does this automatically.
Selman22's answer works perfectly - you can also do this for your Dictionary<string, List<string>> without having to implement an IEqualityComparer yourself:
var firstValue = dictionary.Values.First().OrderBy(x => x);
return dictionary.Values.All (x => x.OrderBy(y => y).SequenceEqual(firstValue));
We compare the first value to every other value, and check equality in each case. Note that List<string>.OrderBy(x => x) simply sorts the list of strings alphabetically.
It's not the fastest solution, but it works for me:
bool AreEqual = l1.Intersect(l2).ToList().Count() == l1.Count() && l1.Count() == l2.Count();
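Pulling the sort-and-compare idea together with the empty-dictionary check mentioned earlier, a small helper might look like the sketch below. The name AllValuesEquivalent is made up, and treating an empty dictionary as "all values the same" is an assumption, so adjust that if your case needs the opposite. Sorting keeps duplicates significant, so {1,1,2} and {1,2,2} are not considered the same.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DictionaryValueComparison
{
    // True when every list holds the same strings, ignoring order but respecting duplicates.
    public static bool AllValuesEquivalent(Dictionary<string, List<string>> dictionary)
    {
        // Assumption: with nothing to compare, the values are trivially "all the same".
        if (dictionary.Count == 0) return true;

        // Sort each list once so that equal contents become equal sequences.
        var sorted = dictionary.Values
            .Select(list => list.OrderBy(s => s, StringComparer.Ordinal).ToList())
            .ToList();

        var first = sorted[0];
        return sorted.Skip(1).All(list => list.SequenceEqual(first));
    }
}
```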
