C#: Dictionary indexed by List


I am trying to write a program in which a Dictionary is indexed by a List. (Trust me, I do; yes, there are other options, but I like indexing by list.) Here is a minimal example. It does not actually work, and only the last line is the problem:
using System;
using System.Collections.Generic;
namespace test
{
class Program
{
static void Main(string[] args)
{
Dictionary<List<String>, int> h = new Dictionary<List<string>,int>();
List<String> w = new List<string> {"a"};
h.Add(w, 1);
w = new List<string>{"b"};
h.Add(w,2);
w = new List<string>{"a"};
int value = 0;
h.TryGetValue(w, out value);
Console.WriteLine(value+" "+h[w]);
}
}
}
If one debugs this program, one will clearly see that there are two elements in h, but these elements are still not accessible via what look like the correct indexes, such as h[w]. Am I wrong, or is something weird going on?

The problem with your app stems from the fact that:
new List<String> { "a" } != new List<String> { "a" }
Equality for lists checks to see if the two references refer to the same instance. In this case, they don't. You've instead created two Lists with the same elements...which doesn't make them equal.
You can fix the problem by creating a custom Equality Comparer:
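// Requires: using System.Collections.Generic; using System.Linq;
public class ListEqualityComparer<T> : IEqualityComparer<List<T>>
{
    public bool Equals(List<T> list1, List<T> list2)
    {
        // The same instance (or two nulls) is trivially equal.
        if (ReferenceEquals(list1, list2)) return true;
        if (list1 == null || list2 == null) return false;
        // Element-by-element comparison, in order.
        return list1.SequenceEqual(list2);
    }
    public int GetHashCode(List<T> list)
    {
        if (list != null && list.Count > 0)
        {
            // XOR ignores element order; that is still a valid (if weak) hash,
            // because lists that are equal always produce the same value.
            var hashcode = list[0].GetHashCode();
            for (var i = 1; i < list.Count; i++)
                hashcode ^= list[i].GetHashCode();
            return hashcode;
        }
        return 0;
    }
}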
And then passing that to the Dictionary constructor:
Dictionary<List<String>, int> h =
new Dictionary<List<string>,int>(new ListEqualityComparer<String>());
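To make the effect concrete, here is a minimal, self-contained sketch of the lookup from the question once a content-based comparer is supplied. The names `SequenceComparer` and `ListKeyDemo` and the particular hash combination are assumptions made for illustration; they are not taken from the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: two lists are equal when their elements match in order.
public class SequenceComparer<T> : IEqualityComparer<List<T>>
{
    public bool Equals(List<T> x, List<T> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.SequenceEqual(y);
    }

    public int GetHashCode(List<T> list)
    {
        if (list == null) return 0;
        int hash = 17;
        foreach (T item in list)
        {
            // Order-sensitive combination; unchecked lets the arithmetic wrap.
            unchecked { hash = hash * 31 + (item == null ? 0 : item.GetHashCode()); }
        }
        return hash;
    }
}

public static class ListKeyDemo
{
    public static int Lookup()
    {
        var h = new Dictionary<List<string>, int>(new SequenceComparer<string>());
        h.Add(new List<string> { "a" }, 1);
        h.Add(new List<string> { "b" }, 2);
        // A different List instance with the same contents now finds the entry.
        return h[new List<string> { "a" }];
    }

    public static void Main()
    {
        Console.WriteLine(Lookup()); // prints 1
    }
}
```

One caveat with any content-based comparer: a list that is used as a key should not be modified while it is in the dictionary, because its hash code would change and the entry could no longer be found.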

The problem is indexing by List: what you are indexing by isn't the data in the list. You are essentially indexing by the reference to the List (i.e. the identity of the object in memory).
You created one list at one memory location, and then created a totally different list at a different memory location (which is what happens whenever you create a new instance). The two lists are different even though they contain the same data, which means you can add as many of them as you want to the dictionary.
One solution, rather than indexing by List, would be to index by String and use a comma-separated string containing all the data in your list as the key.

This won't ever work for you, because List<T>'s Equals and GetHashCode methods don't consider the contents of the list. If you want to use a collection of objects as a key, you'll need to implement your own collection type that overrides Equals in such a way as to check the equality of the objects in the collection (perhaps using Enumerable.SequenceEqual.)

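By default, the Dictionary class looks up a key using the key type's Equals and GetHashCode methods. List<T> does not override them, so keys are compared by reference. That is why two lists are treated as different keys even if they contain the same items.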

Related

How to merge two HashSet of complex types and keep duplicates from second set?

I have a complex type as:
class Row : IEquatable<Row>
{
public Type Type1 { get; }
public Type Type2 { get; }
public int dummy;
public override int GetHashCode()
{
var type1HashCode = Type1.GetHashCode();
//djb2 hash
unchecked
{
return ((type1HashCode << 5) + type1HashCode) ^ Type2.GetHashCode();
}
}
// Equals method is also overridden
}
I have a HashSet<Row> and I want to merge it with another HashSet using two different strategies. First, I want to merge and keep the duplicates from the main HashSet; for that I tried main.UnionWith(second). Now I want to merge main with second (with the result in main) and keep the duplicates from the second one. How can I do that? (This is performance-critical code.)
My code:
var main = new HashSet<Row>()
{
new Row(typeof(int), typeof(long))
{
dummy = 10
}
};
var second = new HashSet<Row>()
{
new Row(typeof(int), typeof(long))
{
dummy = 20
}
};
// Merge here.
Trace.Write(main.First().dummy); // I want 20
I expect main.First().dummy to be 20.

The second strategy can be implemented by calling main.ExceptWith(second); first and then main.UnionWith(second) like the first strategy.
Since the UnionWith is basically a shortcut for
foreach (var element in second)
main.Add(element);
and ExceptWith - a shortcut for
foreach (var element in second)
main.Remove(element);
the second strategy can also be implemented with a single loop:
foreach (var element in second)
{
main.Remove(element);
main.Add(element);
}
But I think the performance gain would be negligible compared to ExceptWith + UnionWith approach.
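The ExceptWith + UnionWith approach can be sketched as below. The question does not show the constructor or the Equals implementation of Row, so the versions here are assumptions (equality on Type1 and Type2 only, ignoring the dummy value), and `MergeDemo` and `MergePreferSecond` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the question's Row: equality ignores Dummy.
public sealed class Row : IEquatable<Row>
{
    public Type Type1 { get; }
    public Type Type2 { get; }
    public int Dummy;

    public Row(Type type1, Type type2) { Type1 = type1; Type2 = type2; }

    public bool Equals(Row other) =>
        other != null && Type1 == other.Type1 && Type2 == other.Type2;

    public override bool Equals(object obj) => Equals(obj as Row);

    public override int GetHashCode()
    {
        unchecked { return (Type1.GetHashCode() * 33) ^ Type2.GetHashCode(); }
    }
}

public static class MergeDemo
{
    // Merges second into main; for equal elements the instance from second wins.
    public static void MergePreferSecond(HashSet<Row> main, HashSet<Row> second)
    {
        main.ExceptWith(second); // drop main's copies of anything second also has
        main.UnionWith(second);  // then add everything from second
    }

    public static int Run()
    {
        var main = new HashSet<Row> { new Row(typeof(int), typeof(long)) { Dummy = 10 } };
        var second = new HashSet<Row> { new Row(typeof(int), typeof(long)) { Dummy = 20 } };
        MergePreferSecond(main, second);
        return main.First().Dummy;
    }

    public static void Main() => Console.WriteLine(Run()); // prints 20
}
```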

If I'm reading this correctly, you want to keep duplicated values after merging. In this scenario, HashSet is the wrong data structure for your objective.
From the MSDN documentation for HashSet(T):
A HashSet collection is not sorted and cannot contain duplicate elements. If order or element duplication is more important than performance for your application, consider using the List class together with the Sort method.

C# - List of Lists

I am coding in C# and I have a class with a property of type List<List<T>> that gets initialized in the constructor. The code is as follows:
public class MyMatrix<T>
{
private readonly List<List<T>> matrix;
public MyMatrix(int dimension)
{
this.matrix = new List<List<T>>();
for (int i = 0; i < dimension; i++)
{
List<T> subList = new List<T>();
this.matrix.Add(subList);
}
}
.....
The problem is that if I create a new object of type MyMatrix, the sublists are empty, so if I invoke the ToString() method of the class or any other method that returns the values contained in the sublists, I get an ArgumentOutOfRangeException, as expected.
Get and Set methods are as follows:
public T Get(int row, int column)
{
return this.matrix[row][column];
}
public void Set(int row, int column, T value)
{
this.matrix[row].Insert(column, value);
}
If I initialize the sublists with a Set method then everything is fine obviously.
I can't change the constructor as it is up to the user of the class to initialize the sublists so it is not possible to know in advance what they are going to contain.
How would you manage the exceptions in the class methods or would you bother at all?

There are several approaches on managing exceptions in your case, and it depends on how you want to use the matrix class:
If you expect users to set values without initializing the row/column, then in the Set method I would just resize the list if necessary to accommodate the row/column arguments. You can always insert empty items in the list by using default(T) (this works with both value types and reference types). In this scenario, the Get method should check whether the matrix coordinates exist and otherwise return default(T), so that no exceptions occur.
If you expect users to always initialize the matrix, then just leave it as it is and throw exceptions. This is a clear hint that the application is misbehaving and the programmer must take care of this.
If you are trying to implement something like a sparse matrix, then using List<T> is not the best way and you should try another approach, for example using Dictionary<int, Dictionary<int, T>> or some sort of linked list. In this scenario, if you go for the Dictionary approach, you still need to make the same decisions as above (i.e. throw when accessing a nonexistent coordinate or just return default(T)).
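The first option above could look roughly like this. It is only a sketch: `LenientMatrix` is a hypothetical name, and unlike the question's Set (which calls Insert and therefore shifts later elements), this version overwrites the cell at the given position after padding with default(T).

```csharp
using System;
using System.Collections.Generic;

// Sketch of the "lenient" option: Set grows the lists on demand,
// and Get returns default(T) for cells that were never set.
public class LenientMatrix<T>
{
    private readonly List<List<T>> rows = new List<List<T>>();

    public T Get(int row, int column)
    {
        if (row < 0 || column < 0) throw new ArgumentOutOfRangeException();
        if (row >= rows.Count || column >= rows[row].Count) return default(T);
        return rows[row][column];
    }

    public void Set(int row, int column, T value)
    {
        if (row < 0 || column < 0) throw new ArgumentOutOfRangeException();
        // Add empty rows until the requested row exists.
        while (rows.Count <= row) rows.Add(new List<T>());
        List<T> target = rows[row];
        // Pad the row with default(T) until the requested column exists.
        while (target.Count <= column) target.Add(default(T));
        target[column] = value;
    }
}

public static class MatrixDemo
{
    public static void Main()
    {
        var m = new LenientMatrix<string>();
        m.Set(2, 3, "x");
        Console.WriteLine(m.Get(2, 3));         // x
        Console.WriteLine(m.Get(5, 5) == null); // True
    }
}
```

Negative indexes still throw here, on the assumption that they always indicate a programming error rather than an uninitialized cell.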

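What about C# 6.0? (Note that the null-conditional operator only guards against a null row; an out-of-range row or column index still throws.)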
public T Get(int row, int column)
{
return this.matrix[row]?[column] ?? default(T);
}

Remove specific entry from list (beginner in c#)

I have a simple static inventory class which holds a list of a custom class, Item. I am working on a crafting system, and when I craft something I need to remove the required Items from my inventory list.
I tried to create a method that I can call which takes an array of the items to remove as a parameter, but it's not working.
I think it's because the foreach loop doesn't know which items to remove? I am not getting any error messages; it just doesn't work. How can I accomplish this?
public class PlayerInventory: MonoBehaviour
{
public Texture2D tempIcon;
private static List<Item> _inventory=new List<Item>();
public static List<Item> Inventory
{
get { return _inventory; }
}
public static void RemoveCraftedMaterialsFromInventory(Item[] items)
{
foreach(Item item in items)
{
PlayerInventory._inventory.Remove(item);
}
}
}
Here is the function that shows what items will be removed:
public static Item[] BowAndArrowReqs()
{
Item requiredItem1 = ObjectGenerator.CreateItem(CraftingMatType.BasicWood);
Item requiredItem2 = ObjectGenerator.CreateItem(CraftingMatType.BasicWood);
Item requiredItem3 = ObjectGenerator.CreateItem(CraftingMatType.String);
Item[] arrowRequiredItems = new Item[]{requiredItem1, requiredItem2, requiredItem3};
return arrowRequiredItems;
}
And here is where that is called:
This is within the RecipeCheck static class:
PlayerInventory.RemoveCraftedMaterialsFromInventory(RecipeCheck.BowAndArrowReqs());

While I like James's answer (and it sufficiently covers the contracts), I will talk about how one might implement this equality and make several observations.
For starters, the returned list may contain multiple objects of the same type, e.g. BasicWood, String. So there needs to be a discriminator for each new object.
It would be bad if RemoveCraftedMaterialsFromInventory(new [] { aWoodBlock }) removed just any wood piece merely because two wood pieces compare as "equal". Being "compatible for crafting" isn't necessarily the same as "being equal".
One simple approach is to assign a unique ID (see Guid.NewGuid) to each specific object. This field would be used (and it could be used exclusively) in the Equals method. However, now we're back at the initial problem, where each new object is different from any other!
So, what's the solution? Make sure to use equivalent (or identical) objects when removing them!
List<Item> items = new List<Item> {
    new Wood { Condition = Wood.Rotten },
    new Wood { Condition = Wood.Epic },
};
// We find the EXISTING objects that we already have ..
// (ToList takes a snapshot, so items is not modified while it is being enumerated)
var woodToBurn = items.OfType<Wood>()
    .Where(w => w.Condition == Wood.Rotten)
    .ToList();
// .. so we can remove them
foreach (var wood in woodToBurn) {
    items.Remove(wood);
}
Well, okay, that's out of the way, but then we say: "How can we do this with a Recipe such that Equals isn't butchered and yet it will remove any items of the given type?"
Well, we can either do this by using LINQ or a List method that supports predicates (i.e. List.FindIndex) or we can implement a special Equatable to only be used in this case.
An implementation that uses a predicate might look like:
foreach (var recipeItem in recipeItems) {
// List sort of sucks; this implementation also has bad bounds
var index = items.FindIndex((item) => {
return recipeItem.MaterialType == item.MaterialType;
});
if (index >= 0) {
items.RemoveAt(index);
} else {
// Missing material :(
}
}

If class Item doesn't implement IEquatable<Item> and the bool Equals(Item other) method, then by default it will use Object.Equals, which checks whether they are the same object (not two objects with the same value, but the same object).
Since you don't say how Item is implemented, I can't suggest how to write its Equals(). However, you should also override GetHashCode() so that two Items that are equal return the same hash code.
UPDATE (based on comments):
Essentially, List.Remove works like this:
foreach(var t in theList)
{
if (t.Equals(itemToBeRemove))
PerformSomeMagicToRemove(t);
}
So, you don't have to do anything to the code you've given in your question. Just add the Equals() method to Item.
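As a rough illustration of what adding Equals() could look like: the question does not show Item's fields, so the single `MaterialType` string below is purely an assumption, as are the names `InventoryDemo` and `RemainingAfterCrafting`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Item: a single MaterialType string is assumed for illustration.
public sealed class Item : IEquatable<Item>
{
    public string MaterialType { get; }

    public Item(string materialType) { MaterialType = materialType; }

    public bool Equals(Item other) => other != null && MaterialType == other.MaterialType;

    public override bool Equals(object obj) => Equals(obj as Item);

    // Equal items must return equal hash codes.
    public override int GetHashCode() => MaterialType == null ? 0 : MaterialType.GetHashCode();
}

public static class InventoryDemo
{
    public static int RemainingAfterCrafting()
    {
        var inventory = new List<Item>
        {
            new Item("BasicWood"), new Item("BasicWood"), new Item("BasicWood"), new Item("String")
        };
        var required = new[] { new Item("BasicWood"), new Item("BasicWood"), new Item("String") };
        foreach (Item item in required)
        {
            // Remove deletes only the first element for which Equals returns true.
            inventory.Remove(item);
        }
        return inventory.Count;
    }

    public static void Main() => Console.WriteLine(RemainingAfterCrafting()); // prints 1
}
```

With this definition every piece of BasicWood equals every other one, which is convenient for crafting but is exactly the trade-off the other answer warns about.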

Dictionary ContainsKey does not seem to work with string[] key

I am trying to have a data structure with multiple string keys. To do this, I tried to create a Dictionary with a string[] key. But ContainsKey does not seem to work as I expect:
Dictionary<string[], int> aaa = new Dictionary<string[], int>();
int aaaCount = 0;
aaa.Add(new string[] { string1, string2 }, aaaCount++);
if (!aaa.ContainsKey(new string[] { string1, string2 }))
{
aaa.Add(new string[] { string1, string2 }, aaaCount++);
}
I see that there are two entries in aaa after the execution of the code above, while I was expecting only one. Is this the expected behaviour? How can I ensure that there are no duplicate entries in the Dictionary?
Note: I tried the same with a list as well (List<string[]>) and the result is the same; the Contains method does not really work with string[] either.

If you want to use string[] as TKey, you should pass an IEqualityComparer<string[]> to the constructor of Dictionary. Otherwise, Dictionary uses the default comparison for TKey, and in the case of string[] it just compares references, since string[] is a reference type. You have to implement IEqualityComparer yourself. It can be done in the following way (Enumerable.SequenceEqual and Aggregate require using System.Linq):
(The implementation is quite naive, I provide it just as the starting point)
public class StringArrayComparer : IEqualityComparer<string[]>
{
public bool Equals(string[] left, string[] right)
{
if (ReferenceEquals(left, right))
{
return true;
}
if ((left == null) || (right == null))
{
return false;
}
return left.SequenceEqual(right);
}
public int GetHashCode(string[] obj)
{
return obj.Aggregate(17, (res, item) => unchecked(res * 23 + item.GetHashCode()));
}
}

You need to create an IEqualityComparer<string[]> and pass it to the dictionary's constructor.
This tells the dictionary how to compare keys.
By default, it compares them by reference.

Because an array is a reference type, i.e., you are checking reference (identity) equality, not equality based on the values within the array. When you create a new array with the same values the arrays themselves are still two distinct objects, so ContainsKey returns false.
Using an array as a Dictionary key is a bit... odd. What are you trying to map here? There is probably a better way to do it.

You may be better off, if your application supports it, combining the string array into a single string.
We have numerous cases where two pieces of information uniquely identify a record in a collection, and in these cases we join the two strings using a value that should never be in either string (e.g. Char(1)).
Since it is usually a class instance that is being added, we let the class specify how the key is generated, so that the code adding to the collection only has to check a single property (e.g. CollectionKey).
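A minimal sketch of that idea, assuming the character U+0001 never occurs inside any key part (`CompositeKeyDemo` and `MakeKey` are hypothetical names):

```csharp
using System;
using System.Collections.Generic;

public static class CompositeKeyDemo
{
    // Assumed separator: U+0001, which must never occur inside a key part.
    private const char Separator = '\u0001';

    public static string MakeKey(params string[] parts) =>
        string.Join(Separator.ToString(), parts);

    public static int CountAfterDuplicateAdd()
    {
        var map = new Dictionary<string, int>();
        map.Add(MakeKey("first", "second"), 0);
        // Strings compare by value, so the second key is recognised as a duplicate.
        if (!map.ContainsKey(MakeKey("first", "second")))
        {
            map.Add(MakeKey("first", "second"), 1);
        }
        return map.Count;
    }

    public static void Main() => Console.WriteLine(CountAfterDuplicateAdd()); // prints 1
}
```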

ToList()-- does it create a new list?

Let's say I have a class
public class MyObject
{
public int SimpleInt{get;set;}
}
And I have a List<MyObject>, and I call ToList() on it and then change one of the SimpleInt values. Will my change be propagated back to the original list? In other words, what would be the output of the following method?
public void RunChangeList()
{
var objs = new List<MyObject>(){new MyObject(){SimpleInt=0}};
var whatInt = ChangeToList(objs );
}
public int ChangeToList(List<MyObject> objects)
{
var objectList = objects.ToList();
objectList[0].SimpleInt=5;
return objects[0].SimpleInt;
}
Why?
P/S: I'm sorry if it seems obvious to find out. But I don't have compiler with me now...

Yes, ToList will create a new list, but because in this case MyObject is a reference type then the new list will contain references to the same objects as the original list.
Updating the SimpleInt property of an object referenced in the new list will also affect the equivalent object in the original list.
(If MyObject was declared as a struct rather than a class then the new list would contain copies of the elements in the original list, and updating a property of an element in the new list would not affect the equivalent element in the original list.)
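The difference between the class and struct cases can be checked with a short sketch like the following. `RefObject`, `ValObject` and `ToListDemo` are hypothetical names introduced only for this comparison.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class RefObject { public int SimpleInt { get; set; } }
public struct ValObject { public int SimpleInt { get; set; } }

public static class ToListDemo
{
    public static int ChangeClassCopy()
    {
        var original = new List<RefObject> { new RefObject { SimpleInt = 0 } };
        var copy = original.ToList();
        copy[0].SimpleInt = 5; // same object, so the original list sees the change
        return original[0].SimpleInt;
    }

    public static int ChangeStructCopy()
    {
        var original = new List<ValObject> { new ValObject { SimpleInt = 0 } };
        var copy = original.ToList();
        // copy[0].SimpleInt = 5 would not compile: the indexer returns a copy of the struct.
        copy[0] = new ValObject { SimpleInt = 5 };
        return original[0].SimpleInt;
    }

    public static void Main()
    {
        Console.WriteLine(ChangeClassCopy());  // prints 5
        Console.WriteLine(ChangeStructCopy()); // prints 0
    }
}
```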

From the Reflector'd source:
public static List<TSource> ToList<TSource>(this IEnumerable<TSource> source)
{
if (source == null)
{
throw Error.ArgumentNull("source");
}
return new List<TSource>(source);
}
So yes, your original list won't be updated (i.e. additions or removals) however the referenced objects will.

ToList will always create a new list, which will not reflect any subsequent changes to the collection.
However, it will reflect changes to the objects themselves (Unless they're mutable structs).
In other words, if you replace an object in the original list with a different object, the ToList will still contain the first object.
However, if you modify one of the objects in the original list, the ToList will still contain the same (modified) object.

Yes, it creates a new list. This is by design.
The list will contain the same results as the original enumerable sequence, but materialized into a persistent (in-memory) collection. This allows you to consume the results multiple times without incurring the cost of recomputing the sequence.
The beauty of LINQ sequences is that they are composable. Often, the IEnumerable<T> you get is the result of combining multiple filtering, ordering, and/or projection operations. Extension methods like ToList() and ToArray() allow you to convert the computed sequence into a standard collection.

The accepted answer correctly addresses the OP's question based on his example. However, it only applies when ToList is applied to a concrete collection; it does not hold when the elements of the source sequence have yet to be instantiated (due to deferred execution). In case of the latter, you might get a new set of items each time you call ToList (or enumerate the sequence).
Here is an adaptation of the OP's code to demonstrate this behaviour:
public static void RunChangeList()
{
var objs = Enumerable.Range(0, 10).Select(_ => new MyObject() { SimpleInt = 0 });
var whatInt = ChangeToList(objs); // whatInt gets 0
}
public static int ChangeToList(IEnumerable<MyObject> objects)
{
var objectList = objects.ToList();
objectList.First().SimpleInt = 5;
return objects.First().SimpleInt;
}
Whilst the above code may appear contrived, this behaviour can appear as a subtle bug in other scenarios. See my other example for a situation where it causes tasks to get spawned repeatedly.

A new list is created, but the items in it are references to the original items (just like in the original list). Changes to the list itself are independent, but changes to the items will be visible through both lists.

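I just stumbled upon this old post and thought of adding my two cents. Generally, if I am in doubt, I quickly use the GetHashCode() method on an object to check its identity (this works for types that don't override GetHashCode; Object.ReferenceEquals is the precise check). So for the above: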
public class MyObject
{
public int SimpleInt { get; set; }
}
class Program
{
public static void RunChangeList()
{
var objs = new List<MyObject>() { new MyObject() { SimpleInt = 0 } };
Console.WriteLine("objs: {0}", objs.GetHashCode());
Console.WriteLine("objs[0]: {0}", objs[0].GetHashCode());
var whatInt = ChangeToList(objs);
Console.WriteLine("whatInt: {0}", whatInt.GetHashCode());
}
public static int ChangeToList(List<MyObject> objects)
{
Console.WriteLine("objects: {0}", objects.GetHashCode());
Console.WriteLine("objects[0]: {0}", objects[0].GetHashCode());
var objectList = objects.ToList();
Console.WriteLine("objectList: {0}", objectList.GetHashCode());
Console.WriteLine("objectList[0]: {0}", objectList[0].GetHashCode());
objectList[0].SimpleInt = 5;
return objects[0].SimpleInt;
}
private static void Main(string[] args)
{
RunChangeList();
Console.ReadLine();
}
}
And answer on my machine -
objs: 45653674
objs[0]: 41149443
objects: 45653674
objects[0]: 41149443
objectList: 39785641
objectList[0]: 41149443
whatInt: 5
So essentially the object that the list carries remains the same in the above code, while the list itself is a new one. Hope the approach helps.

I think that this is equivalent to asking if ToList does a deep or shallow copy. As ToList has no way to clone MyObject, it must do a shallow copy, so the created list contains the same references as the original one, so the code returns 5.

ToList will create a brand new list.
If the items in the list are value types, the new list holds copies, so changing an item in one list does not affect the other. If they are reference types, both lists refer to the same objects, so any change to an object is visible through both lists.

In the case where the source object is a true IEnumerable (i.e. not just a collection packaged as an enumerable), ToList() may NOT return the same object references as in the original IEnumerable. It will return a new List of objects, but those objects may not be the same as, or even Equal to, the objects yielded by the IEnumerable when it is enumerated again.

var objectList = objects.ToList();
objectList[0].SimpleInt=5;
This will update the original object as well. The new list will contain references to the same objects, just like the original list. You can change an element through either list and the update will be reflected in the other.
Now, if you update a list itself (adding or deleting an item), that will not be reflected in the other list.

I don't see anywhere in the documentation that ToList() is always guaranteed to return a new list. If an IEnumerable is a List, it may be more efficient to check for this and simply return the same List.
The worry is that sometimes you may want to be absolutely sure that the returned List is != to the original List. Because Microsoft doesn't document that ToList will return a new List, we can't be sure (unless someone found that documentation). It could also change in the future, even if it works now.
new List<T>(IEnumerable<T> enumerablestuff) is guaranteed to return a new List. I would use this instead.
