Duplicate values in generic list c#


I'm adding values to a C# generic list while trying to prevent duplicates, but without success. Does anyone know why the code below wouldn't work?
I have a simple class here:
public class DrivePairs
{
public int Start { get; set; }
public int End { get; set; }
}
And here is my method which tries to return a generic list of the above class:
ArrayList found = DriveRepository.GetDriveArray(9, 138);
List<DrivePairs> drivePairs = new List<DrivePairs>();
foreach (List<int> item in found)
{
int count = item.Count;
if (count > 1)
{
for (int i = 0; i < (count - 1); i++)
{
DrivePairs drivePair = new DrivePairs();
drivePair.Start = item[i];
drivePair.End = item[i + 1];
if (!drivePairs.Contains(drivePair))
drivePairs.Add(drivePair);
}
}
}
drivePairs = drivePairs.Distinct().ToList();
As you can maybe see, I have an ArrayList, and each row contains a List<int>. What I'm doing is going through each and adding to a list which contains only pairs. E.g. if my List<int> contains [1,3,6,9] I want to add three entries to my pairs list:
[1,3]
[3,6]
[6,9]
It all works fine apart from not recognising duplicates. I thought this line would be enough:
if (!drivePairs.Contains(drivePair))
drivePairs.Add(drivePair);
but it continues to add them all. Even when I add a Distinct() at the end, it still doesn't remove them. I've also tried adding them to a HashSet, but it still includes all the duplicates.
Anyone know of a reason why the duplicates might not be getting picked up?

Your DrivePairs class does not define equality, so the Contains method falls back to reference equality. Add an Equals method (and a matching GetHashCode) that uses both Start and End to determine equality and you will probably find your code works.
See: Equality Comparisons (C# Programming Guide)
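The difference can be seen in a small sketch. The class names PairByReference and PairByValue are made up for illustration and do not appear in the question, and the hash combination is just one simple choice among many.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class: no equality members, so List.Contains compares references.
public class PairByReference
{
    public int Start { get; set; }
    public int End { get; set; }
}

// Hypothetical class: Equals and GetHashCode use both Start and End.
public class PairByValue
{
    public int Start { get; set; }
    public int End { get; set; }

    public override bool Equals(object obj)
    {
        var other = obj as PairByValue;
        return other != null && Start == other.Start && End == other.End;
    }

    public override int GetHashCode()
    {
        // Simple combination of both fields; equal pairs always hash the same.
        return (Start * 397) ^ End;
    }
}

public static class EqualityDemo
{
    public static bool ContainsByReference()
    {
        var list = new List<PairByReference> { new PairByReference { Start = 1, End = 3 } };
        // A new instance with the same values is still a different reference.
        return list.Contains(new PairByReference { Start = 1, End = 3 });
    }

    public static bool ContainsByValue()
    {
        var list = new List<PairByValue> { new PairByValue { Start = 1, End = 3 } };
        // The overridden Equals is used, so the values decide.
        return list.Contains(new PairByValue { Start = 1, End = 3 });
    }
}
```

Here ContainsByReference() returns false and ContainsByValue() returns true. Overriding GetHashCode alongside Equals matters for HashSet and Distinct(), which look at the hash code before calling Equals.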

List<T>.Contains Method
This method determines equality by using the default equality
comparer, as defined by the object's implementation of the
IEquatable<T>.Equals method for T (the type of values in the list).
Change your DrivePairs class
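public class DrivePairs : IEquatable<DrivePairs>
{
    public int Start { get; set; }
    public int End { get; set; }
    // Two pairs are equal when both endpoints match.
    public bool Equals(DrivePairs other)
    {
        return other != null && this.Start == other.Start && this.End == other.End;
    }
    // Keep Equals(object) and GetHashCode consistent so Distinct() and HashSet work too.
    public override bool Equals(object obj) { return Equals(obj as DrivePairs); }
    public override int GetHashCode() { return (Start * 397) ^ End; }
}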
See: http://msdn.microsoft.com/en-us/library/bhkz42b3.aspx
Hope this helps

You are creating new List<int> objects - these are different objects and when compared to each other, even if they contain identical values (in the same or in different orders), will be evaluated as different as the default comparison method on reference types is a reference comparison.
You need to write a custom comparer that will identify equal lists in the manner your application requires.

I've marked Colin's as the answer, but here was the code just in case it's any use to anyone:
Equality comparer:
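public class EqualityComparer : IEqualityComparer<DrivePairs>
{
    public bool Equals(DrivePairs x, DrivePairs y)
    {
        // Compare both ends; comparing Start alone would treat [1,3] and [1,6] as duplicates.
        return x.Start == y.Start && x.End == y.End;
    }
    public int GetHashCode(DrivePairs obj)
    {
        // Hash both fields so equal pairs always produce the same hash code.
        return (obj.Start * 397) ^ obj.End;
    }
}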
and in the controller:
IEqualityComparer<DrivePairs> customComparer = new EqualityComparer();
IEnumerable<DrivePairs> distinctDrivePairs = drivePairs.Distinct(customComparer);
drivePairs = distinctDrivePairs.ToList();
Thanks for all the help and comments

I have not tested it but I think the default equality test is if it is the same instance. Try overriding the Equals method and make it use your properties.

DrivePairs is a reference type (remember the distinction between reference types and value types). So when you check whether a DrivePairs variable has already been added to the List, it returns false, because every DrivePairs variable refers to a different memory location.
Try using a Dictionary, a StringDictionary or another key-value collection instead, with a key that compares by value (for example a string built from Start and End).

Related

Sort (Array array, System.Collections.IComparer? comparer) - parameter or implementation?

The method Array.Sort() has the following signature:
public static void Sort (Array array, System.Collections.IComparer? comparer);
It looks like you need to pass an IComparer reference. But isn't what is really needed that the elements of the array implement IComparable?
This is the first time I've seen this syntax. Is it common? How can I tell it apart from a real parameter? Is there more information about this topic somewhere (in general)?
Important/Edit: ATM I'm reading a C# book and it says about Array.Sort (translated from German to English):
To the first parameter we pass the array to be sorted, in our case
arr. The second parameter is of type IComparer interface. Of course,
you can't pass an instance of type IComparer to the method call,
because interfaces are not instantiable. This is not how the type
specification of the second parameter should be understood. Instead,
the second parameter simply requires that the first argument passed to
it be an object that implements the interface IComparer - whether the
object is of type DemoClass, Circle,
Basically he says that the second parameter is kind of a description for the first parameter. Is he correct or maybe that's just wrong and the source for my confusion?
https://openbook.rheinwerk-verlag.de/visual_csharp_2012/1997_04_008.html
I just implemented the following snippet. So this could be a way how to pass the second parameter, right?
Array.Sort(shapes, (a, b) => {
if (a.GetArea() < b.GetArea()) return -1;
else if (a.GetArea() > b.GetArea()) return 1;
return 0;
});

If you do not pass the comparer it will use the default comparer implementation for the Array items. But if you have a special comparer then you can pass your own custom Comparer to sort the elements.
Suppose you have a Class of Students (Array of Students), and your default Student comparer can be based on total marks. However, a maths teacher may want to sort the Students based on marks for the Maths only, in that case maths teacher can write his custom MathsRankComparer and pass it to the Sort method so that he will get the Students ordered by marks in Maths.
Similarly, English or Science teacher can pass the respective comparers to get their required ranking/ordering/sorting.
Hope this helps in understanding use of that overload.
Update: some examples to understand details.
public class Student: IComparable<Student>
{
public int ID { get; set; }
public string Name { get; set; }
public float TotalMarks { get; set; }
public float ScienceMarks { get; set; }
public float MathsMarks { get; set; }
public float EnglishMarks { get; set; }
public int CompareTo(Student other)
{
if (this.TotalMarks == other.TotalMarks)
return 0;
if (this.TotalMarks < other.TotalMarks)
return -1;
return 1;
}
}
public class MathsMarksBasedComparer : System.Collections.Generic.IComparer<Student>
{
public int Compare(Student a, Student b)
{
if (a.MathsMarks == b.MathsMarks)
return 0;
if (a.MathsMarks < b.MathsMarks)
return -1;
return 1;
}
}
public class EnglishMarksBasedComparer : System.Collections.Generic.IComparer<Student>
{
public int Compare(Student a, Student b)
{
if (a.EnglishMarks == b.EnglishMarks)
return 0;
if (a.EnglishMarks < b.EnglishMarks)
return -1;
return 1;
}
}
And finally, you can use them like this.
Student[] arr = new Student[100]; // Ignore this, you can use other styles of declaration
Array.Sort(arr, new EnglishMarksBasedComparer());
Array.Sort(arr, new MathsMarksBasedComparer());
Array.Sort(arr);

Basically he says that the second parameter is kind of a description for the first parameter. Is he correct or maybe that's just wrong and the source for my confusion?
It's not wrong it's just worded a bit confusingly.
The question mark at the end of IComparer? marks the parameter as a nullable reference type. This states that the comparer is optional: null may be passed instead of an IComparer instance. However, as Mahesh Bongani already mentioned in his reply, if you do not provide a comparer, the method internally falls back to the default comparison of the elements themselves.
So for this particular function, if you pass no comparer and an array whose elements do not implement IComparable, the function cannot sort the elements and throws an InvalidOperationException.
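A rough sketch of both situations follows. The types Circle and Blob and the helper class SortDemo are invented for this example. It also uses the generic overload Array.Sort<T>(T[]), which looks for IComparable<T>, rather than the non-generic signature quoted in the question, which looks for the non-generic IComparable.

```csharp
using System;

// Hypothetical type that implements IComparable<T>, so Array.Sort can order it unaided.
public class Circle : IComparable<Circle>
{
    public double Radius { get; set; }

    public int CompareTo(Circle other)
    {
        // By convention any instance sorts after null.
        return other == null ? 1 : Radius.CompareTo(other.Radius);
    }
}

// Hypothetical type with no comparison support at all.
public class Blob
{
    public int Size { get; set; }
}

public static class SortDemo
{
    public static double SmallestRadius()
    {
        var circles = new[] { new Circle { Radius = 3 }, new Circle { Radius = 1 } };
        Array.Sort(circles); // no comparer passed: falls back to Circle.CompareTo
        return circles[0].Radius;
    }

    public static bool SortingBlobsThrows()
    {
        var blobs = new[] { new Blob { Size = 2 }, new Blob { Size = 1 } };
        try
        {
            Array.Sort(blobs); // no comparer and no IComparable implementation
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```

SmallestRadius() returns 1 and SortingBlobsThrows() returns true. Passing a comparer as the second argument is what makes a type like Blob sortable without changing the type itself.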

I have seen IComparer a few times and am unsure what implements it as standard, as far as lists, arrays and things go. I do know that numbers implement the related IComparable interface, and string does too.
You can, though, implement this interface yourself. If memory serves me correctly, it provides just one method, Compare (it's an interface, so you have to write the logic yourself), which returns an int: a negative value means the first argument ranks lower, a positive value means it ranks higher, and 0 means the two are the same.

Is my syncfunctions hashCode usage approach correct?

Please read my previous question first (Previous question); this one comes from my fear of getting collisions when using hashCode for strings!
I have a database table with items in a repo, and an "incoming" function with items from a model that should be synced to the database table.
I'm using Intersect and Except to make this possible.
The class I use for my sync purpose:
private class syncItemModel
{
public override int GetHashCode()
{
return this.ItemLookupCode.GetHashCode();
}
public override bool Equals(object other)
{
if (other is syncItemModel)
return ((syncItemModel)other).ItemLookupCode == this.ItemLookupCode;
return false;
}
public string Description { get; set; }
public string ItemLookupCode { get; set; }
public int ItemID { get; set; }
}
Then i use this in my method:
1) Convert datatable items to syncmodel:
var DbItemsInCampaignDiscount_SyncModel =
DbItemsInCampaignDiscount(dbcampaignDiscount, datacontext)
.Select(i => new syncItemModel { Description = i.Description,
ItemLookupCode = i.ItemLookupCode,
ItemID = i.ID}).ToList();
2) Convert my incoming item model to syncmodel:
var ItemsInCampaignDiscountModel_SyncModel = modelItems
.Select(i => new syncItemModel { Description =
i.Description, ItemLookupCode = i.ItemLookUpCode, ItemID =0 }).ToList();
3) Make an intersect:
var CommonItemInDbAndModel =
ItemsInCampaignDiscountModel_SyncModel.Intersect(DbItemsInCampaignDiscount_SyncModel).ToList();
4) Pick out the items to be deleted from the database (those that do not exist in the incoming model items):
var SyncModel_OnlyInDb =
DbItemsInCampaignDiscount_SyncModel.Except(CommonItemInDbAndModel).ToList();
5) Take out items to be added to database, items that exist in incoming model but not in db:
var SyncModel_OnlyInModel =
ItemsInCampaignDiscountModel_SyncModel.Except(CommonItemInDbAndModel).ToList();
My question is then: can there be a collision? Can two different ItemLookupCode values in my example be treated as the same ItemLookupCode, because Intersect and Except use the hash code? Or will the Equals function "double check" it, so this approach is safe to use? If there is a chance of collision, how big is that chance?

Yes, a hash collision is always possible; that's why identity should be confirmed by calling Equals(). GetHashCode() and Equals() must both be implemented correctly.
Except() in LINQ to Objects internally uses a hash set; in case of a hash collision it calls Equals to guarantee identity. As you are using a single property, you are fine to proxy calls to its GetHashCode and Equals methods.
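To see that Equals really has the last word, here is a sketch that forces the worst case: every instance returns the same hash code. The class CollidingCode and the sample codes "A", "B", "C" are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical key type whose hash code is deliberately identical for every instance,
// so every lookup is a forced hash collision.
public class CollidingCode
{
    public string Code { get; set; }

    public override int GetHashCode()
    {
        return 1; // worst case: all instances collide
    }

    public override bool Equals(object obj)
    {
        var other = obj as CollidingCode;
        return other != null && Code == other.Code;
    }
}

public static class CollisionDemo
{
    private static CollidingCode[] Db()
    {
        return new[] { new CollidingCode { Code = "A" }, new CollidingCode { Code = "B" } };
    }

    private static CollidingCode[] Model()
    {
        return new[] { new CollidingCode { Code = "B" }, new CollidingCode { Code = "C" } };
    }

    public static List<string> CommonCodes()
    {
        // Intersect still decides via Equals, despite identical hash codes.
        return Model().Intersect(Db()).Select(x => x.Code).ToList();
    }

    public static List<string> OnlyInDb()
    {
        return Db().Except(Model()).Select(x => x.Code).ToList();
    }
}
```

CommonCodes() yields only "B" and OnlyInDb() yields only "A". A collision therefore costs performance, because more Equals calls are needed, but it does not make two different codes count as the same item.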
Please find some comments below about your implementation:
comparison with ==
It is fine to compare strings with ==, but if the type is changed to one that does not overload ==, you'll get issues, because the object references will be compared instead of the content. Proxy the call to Equals() instead of using ==.
mutability of the object
It is very error-prone to bind GetHashCode/Equals logic to mutable state. I'd strongly recommend encapsulating your state so that the object cannot be changed once it has been created; make the setter private for the sake of safety.
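Both comments could be combined roughly as follows. The class name SyncItemKey is hypothetical, and the sketch assumes ItemLookupCode is never null and should be compared ordinally.

```csharp
using System;

// Sketch of an immutable variant of the sync class; the name SyncItemKey is hypothetical.
public sealed class SyncItemKey : IEquatable<SyncItemKey>
{
    public SyncItemKey(string itemLookupCode)
    {
        if (itemLookupCode == null) throw new ArgumentNullException("itemLookupCode");
        ItemLookupCode = itemLookupCode;
    }

    // Private setter: the value that drives Equals/GetHashCode cannot change after construction.
    public string ItemLookupCode { get; private set; }

    public bool Equals(SyncItemKey other)
    {
        // Proxy to string.Equals rather than using ==.
        return other != null
            && string.Equals(ItemLookupCode, other.ItemLookupCode, StringComparison.Ordinal);
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as SyncItemKey);
    }

    public override int GetHashCode()
    {
        // Proxy to the hash code of the single property that defines identity.
        return ItemLookupCode.GetHashCode();
    }
}
```

Two instances built from the same code compare equal and share a hash code, and nothing outside the class can alter that after construction.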

List.BinarySearch Windows Phone

I search in my list with already sorted data like this:
public class ShortWord: IComparable<ShortWord>
{
public int id { get; set; }
public string Word { get; set; }
public int CompareTo(ShortWord obj)
{
return this.Word.CompareTo(obj.Word);
}
}
List<ShortWord> words;
words.Where(t => t.Word.IndexOf(text.ToUpper()) == 0).Take(30).ToList();
It works very slowly. I think I need to use List.BinarySearch, but I don't understand how to use it in my example.
I tried to implement something, but it isn't working.

Since compare is based on the word, you can create new instance with input word and pass it to the BinarySearch method:
List<ShortWord> words;
int index = words.BinarySearch(new ShortWord() {
    Word = text,
});
if (index >= 0) {
ShortWord result = words[index];
}
According to MSDN, BinarySearch will use the implemented IComparable<T>.CompareTo method:
This method uses the default comparer Comparer<T>.Default for type T
to determine the order of list elements. The Comparer<T>.Default
property checks whether type T implements the IComparable<T> generic
interface and uses that implementation, if available. If not,
Comparer<T>.Default checks whether type T implements the IComparable
interface. If type T does not implement either interface,
Comparer<T>.Default throws an InvalidOperationException.
Edit:
If you may have multiple items with the same word in the list, you should iterate the list from index until you get an item with different word.
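Since the question looks for words that start with the typed text rather than exact matches, a prefix lookup along the following lines may be closer to what is needed. This is only a sketch: it works on plain strings instead of ShortWord, the helper name PrefixSearch.StartingWith is invented, and it assumes the list was sorted with StringComparer.Ordinal.

```csharp
using System;
using System.Collections.Generic;

public static class PrefixSearch
{
    // Returns up to maxResults words that start with prefix.
    // Assumes sortedWords is already sorted with StringComparer.Ordinal.
    public static List<string> StartingWith(List<string> sortedWords, string prefix, int maxResults)
    {
        int index = sortedWords.BinarySearch(prefix, StringComparer.Ordinal);

        // A negative result is the bitwise complement of the index of the first
        // element larger than prefix, which is where any matches would begin.
        if (index < 0)
            index = ~index;

        // If the list holds duplicates, BinarySearch may land in the middle of a run; step back to its start.
        while (index > 0 && sortedWords[index - 1].StartsWith(prefix, StringComparison.Ordinal))
            index--;

        // Collect consecutive matches; in an ordinally sorted list they are contiguous.
        var result = new List<string>();
        while (index < sortedWords.Count
               && result.Count < maxResults
               && sortedWords[index].StartsWith(prefix, StringComparison.Ordinal))
        {
            result.Add(sortedWords[index]);
            index++;
        }
        return result;
    }
}
```

For a list containing APPLE, APPLY, BANANA and BAND, searching for "APP" returns APPLE and APPLY. The binary search finds the start of the range in logarithmic time, and only the matching entries are then scanned.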

Weird dictionary ContainsKey issue

Before I start, I'd like to clarify that this is not like all the other somewhat "similar" questions out there. I've tried implementing each approach, but the phenomena I am getting here are really weird.
I have a dictionary where ContainsKey always returns false, even if their GetHashCode functions return the same output, and even if their Equals method returns true.
What could this mean? What am I doing wrong here?
Additional information
The two elements I am inserting are both of type Owner, with no GetHashCode or Equals method. These inherit from a type Storable, which then implements an interface, and also has GetHashCode and Equals defined.
Here's my Storable class. You are probably wondering if the two Guid properties are indeed equal - and yes, they are. I double-checked. See the sample code afterwards.
public abstract class Storable : IStorable
{
public override int GetHashCode()
{
return Id == default(Guid) ? 0 : Id.GetHashCode();
}
public override bool Equals(object obj)
{
var other = obj as Storable;
return other != null && (other.Id == Id || ReferenceEquals(obj, this));
}
public Guid Id { get; set; }
protected Storable()
{
Id = Guid.NewGuid();
}
}
Now, here's the relevant part of my code where the dictionary stuff occurs. It takes in a Supporter object which has a link to an Owner.
public class ChatSession : Storable, IChatSession
{
static ChatSession()
{
PendingSupportSessions = new Dictionary<IOwner, LinkedList<IChatSession>>();
}
private static readonly IDictionary<IOwner, LinkedList<IChatSession>> PendingSupportSessions;
public static ChatSession AssignSupporterForNextPendingSession(ISupporter supporter)
{
var owner = supporter.Owner;
if (!PendingSupportSessions.ContainsKey(owner)) //always returns false
{
var hashCode1 = owner.GetHashCode();
var hashCode2 = PendingSupportSessions.First().Key.GetHashCode();
var equals = owner.Equals(PendingSupportSessions.First().Key);
//here, equals is true, and the two hashcodes are identical,
//and there is only one element in the dictionary according to the debugger.
//however, calling two "Add" calls after eachother does indeed crash.
PendingSupportSessions.Add(owner, new LinkedList<IChatSession>());
PendingSupportSessions.Add(owner, new LinkedList<IChatSession>()); //crash
}
...
}
}
If you need additional information, let me know. I am not sure what kind of information would be sufficient, so it was hard for me to include more.

Guillaume was right. It appears that I was changing the value of one of my keys after it is added to the dictionary. Doh!
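For anyone hitting the same symptom, the effect can be reproduced in a few lines. This is a minimal sketch with a made-up MutableKey type, not the Owner class from the question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type whose Equals/GetHashCode depend on a mutable property.
public class MutableKey
{
    public int Id { get; set; }

    public override int GetHashCode()
    {
        return Id;
    }

    public override bool Equals(object obj)
    {
        var other = obj as MutableKey;
        return other != null && other.Id == Id;
    }
}

public static class MutatedKeyDemo
{
    public static bool FoundAfterMutation()
    {
        var key = new MutableKey { Id = 1 };
        var map = new Dictionary<MutableKey, string>();
        map.Add(key, "value");

        // The entry was stored under the hash code for Id == 1;
        // after this change the key hashes differently.
        key.Id = 2;

        // The lookup uses the new hash code and no longer finds the entry,
        // even though it is the very same object.
        return map.ContainsKey(key);
    }
}
```

FoundAfterMutation() returns false, although comparing the stored key and the lookup key directly with Equals and GetHashCode shows them as identical, which matches the confusing picture in the debugger.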

Make sure you are passing the same object that is stored as the key in the dictionary. If you create a new object each time and look it up, assuming it is already stored because it has similar values, then ContainsKey returns false. Object comparisons are different from value comparisons.

Quick Question: C# Linq "Single" statement vs "for" loop

I need some clarification. Are these two methods the same or different? I get a little bit confused about when the object behind a reference passed as a parameter by value is updated and when a new one is created. I know that assignment creates a new reference, but what about changing a property? Will both of these methods update the field "_someObjectList" the same way?
public class SomeObject{
public Guid UniqueKey { get; set; }
public object SomeProperty{ get; set; }
}
public class SomeObjectListWrapper{
public SomeObjectListWrapper(List<SomeObject> someObjectList){
_someObjectList = someObjectList;
}
private readonly List<SomeObject> _someObjectList;
public void ReplaceItemPropertyValue1(Guid itemUniqueKey, object propertyValue)
{
List<int> resultIndices = new List<int>();
for (var i = 0; i < _someObjectList.Count(); i++)
{
if (_someObjectList[i].UniqueKey == itemUniqueKey)
resultIndices.Add(i);
}
if (resultIndices.Count != 1)
throw new Exception(
"just pretend this is the same exception as Single() throws when it can't find anything");
_someObjectList[resultIndices[0]].SomeProperty = propertyValue;
}
public void ReplaceItemPropertyValue2(Guid itemUniqueKey, object propertyValue)
{
_someObjectList.Single(x=>x.UniqueKey==itemUniqueKey).SomeProperty=propertyValue;
}
}

Because SomeObject is a class (ie. a reference type), both ReplaceItemPropertyValue methods are updating the same object as was inserted into the list and will be retrieved from the list later. (If SomeObject was a struct/value type, the compiler would prevent you from updating an rvalue/return value [1].)
As a minor side-note, your two methods are not actually identical. The Single method raises an exception if there is more than one matching item in the sequence. To properly match the behaviour, use First instead.
[1] "rvalue" is not actually short for "return value"; it just happens that in this case your rvalue is a return value, which is why I specified both options.

They may do the same thing depending on the data in your list.
ReplaceItemPropertyValue2 uses the Single method, which will throw an exception if itemUniqueKey is not found or is found more than once.
But as long as itemUniqueKey can't appear more than once in the list, the two functions should accomplish the same task.
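The behaviour of Single described above can be checked with a short sketch. The Item class and the SingleDemo helpers are invented for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for SomeObject.
public class Item
{
    public int Key { get; set; }
    public string Value { get; set; }
}

public static class SingleDemo
{
    // Returns true when Single throws for the given key.
    public static bool SingleThrows(List<Item> items, int key)
    {
        try
        {
            items.Single(x => x.Key == key);
            return false;
        }
        catch (InvalidOperationException)
        {
            return true; // thrown for zero matches and for more than one match
        }
    }

    public static string UpdateThroughSingle()
    {
        var items = new List<Item> { new Item { Key = 1, Value = "old" } };
        // Single returns a reference to the object held by the list,
        // so setting the property changes the element in the list.
        items.Single(x => x.Key == 1).Value = "new";
        return items[0].Value;
    }
}
```

With exactly one match, Single returns that element and UpdateThroughSingle() gives "new". With no match, or with two items sharing a key, SingleThrows returns true.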

Both may be same.
The algorithm in the for loop set the object when key matches and then breaks out.
While the LINQ statement will set the object to all entries whose key match. It depends if your collection has same key entered more than once.
