Dictionary<string, Dictionary<string, ... >...> nestedDictionary;
The dictionary above has a one-to-many relationship at each level from top to bottom. Adding an item is easy: we have the leaf object, so we start from the bottom, creating dictionaries and adding each one to its parent.
My problem is when I want to find an item at the inner Dictionaries. There are two options:
1. Nested foreach and find the item, then snapshot all the loops at the moment we found the item and exit all loops. Then we know the item's pedigree is string1->string2->...->stringN. Problems with this solution are A) performance and B) thread-safety (since I want to remove the item, then its parent if it has no children, then that parent's parent if it has no children...).
2. Creating a reverse look-up dictionary and indexing added items. Something like a Tuple for all outer dictionaries: add the item as key and all the outer parents as Tuple members. Problems: A) redundancy and B) keeping the reverse look-up Dictionary synchronized with the main Dictionary.
Any idea for a fast and thread-safe solution?
It looks like you actually have more than two levels of Dictionary. Since you cannot support a variable number of dictionaries using this type syntax:
Dictionary<string, Dictionary<string, ... >...> nestedDictionary;
I can only assume that it is some number greater than two. Let's say that it's three. For any data structure you construct, you have an intended use and operations that you want to perform efficiently.
I'm going to assume you need calls like this:
var dictionary = new ThreeLevelDictionary();
dictionary.Add(string1, string2, string3, value);
var value = dictionary[string1, string2, string3];
dictionary.Remove(string1, string2, string3);
And (critical to the question) the reverse lookup you are describing:
var strings = dictionary.FindKeys(value);
If these are the operations that you need to perform and to perform quickly, then one data structure that you can use is a Dictionary with a Tuple key:
public class ThreeLevelDictionary<TValue> : Dictionary<Tuple<string, string, string>, TValue>
{
public void Add(string s1, string s2, string s3, TValue value)
{
Add(Tuple.Create(s1, s2, s3), value);
}
public TValue this[string s1, string s2, string s3]
{
get { return this[Tuple.Create(s1, s2, s3)]; }
set { this[Tuple.Create(s1, s2, s3)] = value; } // store the new value under the composite key
}
public void Remove(string s1, string s2, string s3)
{
Remove(Tuple.Create(s1, s2, s3));
}
public IEnumerable<string> FindKeys(TValue value)
{
foreach (var key in Keys)
{
if (EqualityComparer<TValue>.Default.Equals(this[key], value))
return new string[] { key.Item1, key.Item2, key.Item3 };
}
throw new InvalidOperationException("missing value");
}
}
Now you are perfectly positioned to create a reverse-lookup hashtable using another Dictionary if performance indicates that this is a bottleneck.
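As a rough illustration of that reverse-lookup idea, here is a minimal sketch that keeps a second Dictionary from value to tuple key and guards both maps with one lock. The class name IndexedThreeLevelDictionary and its members are hypothetical, and the sketch assumes values are unique, non-null, and usable as dictionary keys.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: a tuple-keyed dictionary plus a reverse index, guarded by one lock.
// Assumes every value is unique, non-null, and has a sensible Equals/GetHashCode.
public class IndexedThreeLevelDictionary<TValue>
{
    private readonly object _gate = new object();
    private readonly Dictionary<Tuple<string, string, string>, TValue> _forward =
        new Dictionary<Tuple<string, string, string>, TValue>();
    private readonly Dictionary<TValue, Tuple<string, string, string>> _reverse =
        new Dictionary<TValue, Tuple<string, string, string>>();

    public int Count
    {
        get { lock (_gate) { return _forward.Count; } }
    }

    public void Add(string s1, string s2, string s3, TValue value)
    {
        var key = Tuple.Create(s1, s2, s3);
        lock (_gate)
        {
            // Check both maps first so a failed Add cannot leave them out of sync.
            if (_forward.ContainsKey(key) || _reverse.ContainsKey(value))
                throw new ArgumentException("Duplicate key or value.");
            _forward.Add(key, value);
            _reverse.Add(value, key);
        }
    }

    // Reverse lookup: hash-based instead of scanning every key.
    public bool TryFindKeys(TValue value, out Tuple<string, string, string> key)
    {
        lock (_gate) { return _reverse.TryGetValue(value, out key); }
    }

    // Removes the value and its key from both maps in one critical section.
    public bool Remove(TValue value)
    {
        lock (_gate)
        {
            Tuple<string, string, string> key;
            if (!_reverse.TryGetValue(value, out key)) return false;
            _reverse.Remove(value);
            return _forward.Remove(key);
        }
    }
}
```

Because the key is flat, there are no empty parent dictionaries to clean up after a removal. The cost is the redundancy the question mentions: every entry is stored twice.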
If the previously listed operations are not the ones you want to perform, then this data structure might not meet your needs. Either way, if you first describe the interface that summarizes what you want the data structure to do, then it's easier to see if there are other alternatives.
Although I have little direct experience with the C5 collection library, it sounds like you could use their TreeDictionary class. It comes with a whole suite of useful methods for finding, iterating and modifying the tree, and is surprisingly well documented.
Another option would be to use the QuickGraph library (which you can find in NuGet or on codeplex). This involves some knowledge of graph theory but is otherwise a very useful library.
Both libraries require you to handle concurrency, just like the standard BCL collections.
Related
I have dictionary indices and want to add several keys to it from another dictionary using LINQ.
var indices = new Dictionary<string, int>();
var source = new Dictionary<string, int> { { "1", 1 }, { "2", 2 } };
source.Select(name => indices[name.Key] = 0); // doesn't work
var res = indices.Count; // returns 0
Then I replace Select with Min and everything works as expected, LINQ creates new keys in my dictionary.
source.Min(name => indices[name.Key] = 0); // works!!!
var res = indices.Count; // returns 2
Question
All I want to do is initialize a dictionary without foreach. Why do the dictionary keys disappear when the LINQ query is executed? What iterator or aggregator could I use instead of Min to create keys for a dictionary declared outside of the LINQ query?
Update #1
Decided to go with System.Interactive extension.
Update #2
I appreciate and upvote all answers, but need to clarify that, purpose of the question is not to copy a dictionary, but to execute some code in a LINQ query. To add more sense to it, I actually have hierarchical structure of classes with dictionaries and at some point they need to be synchronized, so I want to create flat, non-hierarchical dictionary, used for tracking, that includes all hierarchical keys.
class Account
{
Dictionary<string, User> Users;
}
class User
{
Dictionary<string, Activity> Activities;
}
class Activity
{
string Name;
DateTime Time;
}
Now I want to sync all actions by time, so I need a tracker that will help me to align all actions by time, and I don't want to create 3 loops for Account, User, and Activity. Because that would be considered a hierarchical hell of loops, the same as async or callback hell. With LINQ I don't have to create loop inside loop, inside loop, etc.
Accounts.ForEach(
account => account.Value.Users.ForEach(
user => user.Value.Activities.ForEach(
activity => indices[account.Key + user.Key + activity.Key] = 0)));
Also, having loops that could be replaced with LINQ can be considered a code smell. That is not my own opinion, but I totally agree with it, because with too many loops you will probably end up with duplicated code.
https://jasonneylon.wordpress.com/2010/02/23/refactoring-to-linq-part-1-death-to-the-foreach/
You can say that LINQ is used for querying, not for setting a variable, I would say I'm querying ... the KEYS.
Linq is not intended to be used to mutate the elements of a sequence. Rather, it is intended to be used to traverse, filter and project elements of a sequence. In this respect, it is intended to be used more in a "functional programming" style.
As you have discovered, Linq can be used in other than a functional programming style - but by using it in that way you are really misusing it.
Technically, the reason that source.Min() has the effect you were looking for is that it has to visit each of the elements of your sequence in order to determine the minimum element.
Because your selector for Min() has a side-effect (i.e. indices[name.Key] = 0) then a side-effect of finding the minimum value is to add each element's key to indices, but with a value of zero rather than the original value.
(I suspect you might have meant to put indices[name.Key] = name.Value...)
The reason that your use of Select() has no effect is that it has not been used to traverse the sequence - it uses "deferred execution".
You can force it to traverse the sequence by counting the elements, like so:
source.Select(name => indices[name.Key] = 0).Count();
However, that is also counter-intuitive and is a misuse of Linq.
The correct solution is to use foreach. This expresses your intent clearly and unambiguously.
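For the hierarchical case in the question's second update, one possible middle ground is to let LINQ do the querying (flattening the three levels into one sequence of keys) and keep the mutation in a single foreach. This is only a sketch: the Tracker class and BuildIndices method are made-up names, and the fields are made public here purely so the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified copies of the question's classes; public fields are an assumption of this sketch.
public class Activity { public string Name; public DateTime Time; }
public class User { public Dictionary<string, Activity> Activities = new Dictionary<string, Activity>(); }
public class Account { public Dictionary<string, User> Users = new Dictionary<string, User>(); }

public static class Tracker
{
    public static Dictionary<string, int> BuildIndices(Dictionary<string, Account> accounts)
    {
        // Pure query: flattens the hierarchy. Nothing runs until the foreach enumerates it.
        IEnumerable<string> keys =
            from account in accounts
            from user in account.Value.Users
            from activity in user.Value.Activities
            select account.Key + user.Key + activity.Key;

        var indices = new Dictionary<string, int>();
        foreach (string key in keys) // the single, explicit side effect
        {
            indices[key] = 0;
        }
        return indices;
    }
}
```

The nested loops disappear into the query expression, while the one remaining foreach makes it obvious where the dictionary is modified.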
An alternative approach is to write an AddRange() extension method for Dictionary like so:
public static class DictionaryExt
{
public static Dictionary<TKey, TValue> AddRange<TKey, TValue>(
this Dictionary<TKey, TValue> self,
IEnumerable<KeyValuePair<TKey, TValue>> items)
{
foreach (var item in items)
{
self[item.Key] = item.Value;
}
return self;
}
}
Then you can just call indices.AddRange(source); to achieve your aim.
Interestingly, the ImmutableDictionary type does already have an AddRange() method that you could use like so:
var indices = ImmutableDictionary.Create<string, int>();
var source = new Dictionary<string, int> { { "1", 1 }, { "2", 2 } };
indices = indices.AddRange(source);
Console.WriteLine(indices.Count);
But I wouldn't recommend you change over to using ImmutableDictionary just so you can use its AddRange().
Also note that ImmutableDictionary is, well, immutable - so you can't just do indices.AddRange(source);; you have to assign the result back as in indices = indices.AddRange(source); (like when you modify a string using ToUpper()).
You wrote:
All I want to do is to initialize dictionary without foreach
Do you want to replace the values in your indices dictionary with the values in source? Use Enumerable.ToDictionary
indices = source // a Dictionary is already a sequence of KeyValuePairs; no cast is needed
.ToDictionary(pair => pair.Key, // the key is the key from original dictionary
pair => pair.Value); // the value is the value from the original
Or do you want to add the values from source to the already existing values in indices? If you don't want a foreach, you'll have to Concat the current contents of indices with the contents of source, then use ToDictionary to create a new Dictionary.
indices = indices
.Concat(source) // ToDictionary will throw if a key occurs in both dictionaries
.ToDictionary(... etc)
However this would be a waste of processing power.
Consider creating extension functions for Dictionary. See Extension Methods Demystified
public static Dictionary<TKey, TValue> Copy<TKey, TValue>(
this Dictionary<TKey, TValue> source)
{
return source.ToDictionary(x => x.Key, x => x.Value);
}
public static void AddRange<TKey, TValue>(
this Dictionary<TKey, TValue> destination,
Dictionary<TKey, TValue> source)
{
foreach (var keyValuePair in source)
{
destination.Add(keyValuePair.Key, keyValuePair.Value);
// TODO: decide what to do if Key already in Destination
}
}
Usage:
// initialize:
var indices = source.Copy();
// add values:
indices.AddRange(otherDictionary);
I know how to make a new dictionary case insensitive with the code below:
var caseInsensitiveDictionary = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
But I'm using WebApi which serializes JSON objects into a class we've created.
public class Notification : Common
{
public Notification();
[JsonProperty("substitutionStrings")]
public Dictionary<string, string> SubstitutionStrings { get; set; }
}
So besides rebuilding the dictionary after receiving the "Notification" object, is there a way to set this dictionary to case insensitive in the first place or after it's been created?
Thanks
So besides rebuilding the dictionary after receiving the "Notification" object, is there a way to set this dictionary to case insensitive in the first place or after it's been created?
No, it is impossible. You need to create a new dictionary.
Currently the dictionary has all of the keys in various different buckets; changing the comparer would mean that a bunch of keys would all suddenly be in the wrong buckets. You'd need to go through each key and re-compute where it needs to go and move it, which is basically the same amount of work as creating a new dictionary would be.
Whenever an item is added to a dictionary, the dictionary will compute its hash code and make note of it. Whenever a dictionary is asked to look up an item, the dictionary will compute the hash code on the item being sought and assume that any item in the dictionary which had returned a different hash code cannot possibly match it, and thus need not be examined.
In order for a dictionary to regard "FOO", "foo", and "Foo" as equal, the hash code function it uses must yield the same value for all of them. If a dictionary was built using a hash function which returns different values for "FOO", "foo", and "Foo", changing to a hash function which yielded the same value for all three strings would require that the dictionary re-evaluate the hash value of every item contained therein. Doing this would require almost as much work as building a new dictionary from scratch, and for that reason .NET does not support any means of changing the hash function associated with a dictionary other than copying all the items from the old dictionary to a new dictionary, abandoning the old one.
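A small sketch can make the hash-code argument concrete. The ComparerDemo class and its method names are invented for illustration; the comparers and the Dictionary copy constructor are standard BCL members.

```csharp
using System;
using System.Collections.Generic;

public static class ComparerDemo
{
    // True when the comparer gives "FOO" and "foo" the same hash code
    // and also reports them as equal.
    public static bool TreatsAsSame(IEqualityComparer<string> comparer)
    {
        return comparer.GetHashCode("FOO") == comparer.GetHashCode("foo")
            && comparer.Equals("FOO", "foo");
    }

    // Copying into a new dictionary is what re-hashes every key with the new comparer.
    // This throws ArgumentException if two existing keys differ only in case.
    public static Dictionary<string, string> Rebuild(Dictionary<string, string> source)
    {
        return new Dictionary<string, string>(source, StringComparer.OrdinalIgnoreCase);
    }
}
```

TreatsAsSame returns false for StringComparer.Ordinal and true for StringComparer.OrdinalIgnoreCase, which is why a dictionary built with the former cannot simply be relabeled as the latter.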
Note that one could design a SwitchablyCaseSensitiveComparator whose GetHashCode() method would always return a case-insensitive hash value, but whose Equals method could be switched between case-sensitive and case-insensitive operation. If one were to implement such a thing, one could add items to a dictionary and then switch between case-sensitive and case-insensitive modes. The biggest problem with doing that is that if the dictionary is in case-sensitive mode when two items are added which differ only in case, attempts to retrieve either of those items when the dictionary is in case-insensitive mode might not behave as expected. Populating a dictionary in case-insensitive mode and performing some look-ups in case-sensitive mode should be relatively safe, however.
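A minimal sketch of such a comparer might look like the following. The class name SwitchableCaseComparer and the IgnoreCase property are assumptions made for this example, not an existing .NET type.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of a comparer whose equality mode can be switched at run time.
public sealed class SwitchableCaseComparer : IEqualityComparer<string>
{
    public bool IgnoreCase { get; set; }

    public bool Equals(string x, string y)
    {
        return IgnoreCase
            ? StringComparer.OrdinalIgnoreCase.Equals(x, y)
            : StringComparer.Ordinal.Equals(x, y);
    }

    // Always case-insensitive, so stored hash codes stay valid in either mode.
    public int GetHashCode(string s)
    {
        return StringComparer.OrdinalIgnoreCase.GetHashCode(s);
    }
}
```

If a dictionary is created with this comparer while IgnoreCase is true and "Foo" is added, ContainsKey("FOO") succeeds. After setting IgnoreCase to false, ContainsKey("FOO") fails while ContainsKey("Foo") still succeeds. The price is that all case variants of a key share one hash code even in case-sensitive mode, so they can never be told apart by hash alone.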
Try changing your class definition to something like this
public class Notification : Common
{
public Notification()
{
this.substitutionStringsBackingStore =
new Dictionary<string,string>( StringComparer.OrdinalIgnoreCase )
;
}
[JsonProperty("substitutionStrings")]
public Dictionary<string, string> SubstitutionStrings
{
get { return substitutionStringsBackingStore ; }
set { substitutionStringsBackingStore = value ; }
}
private Dictionary<string,string> substitutionStringsBackingStore ;
}
You do have to re-create the dictionary, but this can be done with extensions:
public static class extensions
{
public static Dictionary<string, T> MakeCI<T>(this Dictionary<string, T> dictionary)
{
return dictionary.ToDictionary(kvp => kvp.Key, kvp => kvp.Value, StringComparer.OrdinalIgnoreCase);
}
}
I've specified string type for the key as this is what we want to be CI, but the value can be any type.
You would use it like so:
myDict = myDict.MakeCI();
I'm having issues finding the most efficient way to remove duplicates from a list of strings (List<string>).
My current implementation is a dual foreach loop checking the instance count of each object being only 1, otherwise removing the second.
I know there are MANY other questions out there, but all the best solutions require something above .NET 2.0, which is the current build environment I'm working in. (GM and Chrysler are very resistant to changes ... :) )
This limits the possible results by not allowing any LINQ, or HashSets.
The code I'm using is Visual C++, but a C# solution will work just fine as well.
Thanks!
This probably isn't what you're looking for, but if you have control over this, the most efficient way would be to not add them in the first place...
Do you have control over this? If so, all you'd need to do is a myList.Contains(currentItem) call before you add the item and you're set
You could do the following.
List<string> list = GetTheList();
Dictionary<string,object> map = new Dictionary<string,object>();
int i = 0;
while ( i < list.Count ) {
string current = list[i];
if ( map.ContainsKey(current) ) {
list.RemoveAt(i);
} else {
i++;
map.Add(current,null);
}
}
This has the overhead of building a Dictionary<TKey,TValue> object which will duplicate the list of unique values in the list. But it's fairly efficient speed wise.
I'm no Comp Sci PhD, but I'd imagine using a dictionary, with the items in your list as the keys would be fast.
Since a dictionary doesn't allow duplicate keys, you'd only have unique strings at the end of iteration.
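That idea could be sketched roughly as follows, sticking to what C# 2.0 offers (no LINQ, no HashSet, no var). The UniqueStrings class and FromList method are hypothetical names, and the sketch assumes the list contains no null strings, since a Dictionary rejects null keys.

```csharp
using System.Collections.Generic;

public static class UniqueStrings
{
    // C# 2.0-compatible: only List<T> and Dictionary<TKey, TValue> are used.
    public static List<string> FromList(List<string> input)
    {
        Dictionary<string, bool> seen = new Dictionary<string, bool>();
        foreach (string s in input)
        {
            // The indexer overwrites an existing entry, so a repeated string is simply absorbed.
            seen[s] = true;
        }
        // Note: Dictionary does not promise any enumeration order for its keys.
        return new List<string>(seen.Keys);
    }
}
```

If the original order of the list matters, this shortcut is not enough on its own, because the key collection's order is not part of the Dictionary contract.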
Just remember when providing a custom class to override the Equals() method in order for the Contains() to function as required.
Example
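List<CustomClass> clz = new List<CustomClass>();
public class CustomClass{
// Contains() calls Equals(object), so this must be an override, not a new overload
public override bool Equals(object param){
//Put equal code here...
}
// Override GetHashCode() as well so it stays consistent with Equals()
}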
If you're going the route of "just don't add duplicates", then checking "List.Contains" before adding an item works, but it's O(n^2) where n is the number of strings you want to add. It's no different from your current solution using two nested loops.
You'll have better luck using a hashset to store items you've already added, but since you're using .NET 2.0, a Dictionary can substitute for a hash set:
static List<T> RemoveDuplicates<T>(List<T> input)
{
List<T> result = new List<T>(input.Count);
Dictionary<T, object> hashSet = new Dictionary<T, object>();
foreach (T s in input)
{
if (!hashSet.ContainsKey(s))
{
result.Add(s);
hashSet.Add(s, null);
}
}
return result;
}
This runs in O(n) time and uses O(2n) space; it will generally work very well for up to 100K items. Actual performance depends on the average length of the strings. If you really need maximum performance, you can exploit more powerful data structures such as tries to make inserts even faster.
I've got a dictionary, something like
Dictionary<Foo,String> fooDict
I step through everything in the dictionary, e.g.
foreach (Foo foo in fooDict.Keys)
MessageBox.Show(fooDict[foo]);
It does that in the order the foos were added to the dictionary, so the first item added is the first foo returned.
How can I change the cardinality so that, for example, the third foo added will be the second foo returned? In other words, I want to change its "index."
If you read the documentation on MSDN you'll see this:
"The order in which the items are returned is undefined."
You can't guarantee the order, because a Dictionary is not a list or an array. It's meant to look up a value by the key, and any ability to iterate values is just a convenience but the order is not behavior you should depend on.
You may be interested in the OrderedDictionary class that comes in the System.Collections.Specialized namespace.
If you look at the comments at the very bottom, someone from MSFT has posted this interesting note:
This type is actually misnamed; it is not an 'ordered' dictionary as such, but rather an 'indexed' dictionary. Although, today there is no equivalent generic version of this type, if we add one in the future it is likely that we will name such as type 'IndexedDictionary'.
I think it would be trivial to derive from this class and make a generic version of OrderedDictionary.
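To show how the non-generic OrderedDictionary could answer the original question (making the third item added come back second), here is a small sketch. The OrderedDictionaryDemo class and its Move helper are invented for the example; RemoveAt and Insert are existing OrderedDictionary members.

```csharp
using System.Collections;
using System.Collections.Specialized;
using System.Linq;

public static class OrderedDictionaryDemo
{
    // Moves the entry at oldIndex to newIndex (both zero-based).
    public static void Move(OrderedDictionary dict, int oldIndex, int newIndex)
    {
        // Enumerating an OrderedDictionary yields DictionaryEntry items in index order.
        DictionaryEntry entry = dict.Cast<DictionaryEntry>().ElementAt(oldIndex);
        dict.RemoveAt(oldIndex);
        dict.Insert(newIndex, entry.Key, entry.Value);
    }
}
```

With entries added in the order "first", "second", "third", calling Move(dict, 2, 1) leaves them in the order "first", "third", "second". Keys and values are typed as object, so callers still have to cast when reading them back.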
I am not fully educated in the domain to properly answer the question, but I have a feeling that the dictionary sorts the values according to the key, in order to perform quick key search. This would suggest that the dictionary is sorted by key values according to key comparison. However, looking at object methods, I assume they are using hash codes to compare different objects considering there is no requirement on the type used for keys. This is only a guess. Someone more knowledgey should fill in with more detail.
Why are you interested in manipulating the "index" of a dictionary when its purpose is to index with arbitrary types?
I don't know if anyone will find this useful, but here's what I ended up figuring out. It seems to work (by which I mean it doesn't throw any exceptions), but I'm still a ways away from being able to test that it works as I hope it does. I have done a similar thing before, though.
public void sortSections()
{
//OMG THIS IS UGLY!!!
KeyValuePair<ListViewItem, TextSection>[] sortable = textSecs.ToArray();
IOrderedEnumerable<KeyValuePair<ListViewItem, TextSection>> sorted = sortable.OrderBy(kvp => kvp.Value.cardinality);
foreach (KeyValuePair<ListViewItem, TextSection> kvp in sorted)
{
TextSection sec = kvp.Value;
ListViewItem key = kvp.Key;
textSecs.Remove(key);
textSecs.Add(key, sec);
}
}
The short answer is that there shouldn't be a way since a Dictionary "Represents a collection of keys and values." which does not imply any sort of ordering. Any hack you might find is outside the definition of the class and may be liable to change.
You should probably first ask yourself if a Dictionary is really called for in this situation, or if you can get away with using a List of KeyValuePairs.
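If a List of KeyValuePairs is enough, reordering becomes ordinary list manipulation. The following is a minimal sketch under that assumption; PairListDemo and Move are hypothetical names.

```csharp
using System.Collections.Generic;

public static class PairListDemo
{
    // Moves the pair at oldIndex to newIndex (zero-based).
    // The list's order is exactly the order a foreach will see.
    public static void Move<TKey, TValue>(
        List<KeyValuePair<TKey, TValue>> pairs, int oldIndex, int newIndex)
    {
        KeyValuePair<TKey, TValue> pair = pairs[oldIndex];
        pairs.RemoveAt(oldIndex);
        pairs.Insert(newIndex, pair);
    }
}
```

The trade-off is that finding a pair by key now means scanning the list, and nothing stops the same key from being added twice.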
Otherwise, something like this might be useful:
public class IndexableDictionary<T1, T2> : Dictionary<T1, T2>
{
private SortedDictionary<int, T1> _sortedKeys;
public IndexableDictionary()
{
_sortedKeys = new SortedDictionary<int, T1>();
}
public new void Add(T1 key, T2 value)
{
_sortedKeys.Add(_sortedKeys.Count + 1, key);
base.Add(key, value);
}
private IEnumerable<KeyValuePair<T1, T2>> Enumerable()
{
foreach (T1 key in _sortedKeys.Values)
{
yield return new KeyValuePair<T1, T2>(key, this[key]);
}
}
public new IEnumerator<KeyValuePair<T1, T2>> GetEnumerator()
{
return Enumerable().GetEnumerator();
}
public KeyValuePair<T1, T2> this[int index]
{
get
{
return new KeyValuePair<T1, T2> (_sortedKeys[index], base[_sortedKeys[index]]);
}
set
{
_sortedKeys[index] = value.Key;
base[value.Key] = value.Value;
}
}
}
With client code looking something like this:
static void Main(string[] args)
{
IndexableDictionary<string, string> fooDict = new IndexableDictionary<string, string>();
fooDict.Add("One", "One");
fooDict.Add("Two", "Two");
fooDict.Add("Three", "Three");
// Print One, Two, Three
foreach (KeyValuePair<string, string> kvp in fooDict)
Console.WriteLine(kvp.Value);
KeyValuePair<string, string> temp = fooDict[1];
fooDict[1] = fooDict[2];
fooDict[2] = temp;
// Print Two, One, Three
foreach (KeyValuePair<string, string> kvp in fooDict)
Console.WriteLine(kvp.Value);
Console.ReadLine();
}
UPDATE: For some reason it won't let me comment on my own answer.
Anyways, IndexableDictionary is different from OrderedDictionary in that:
1. "The elements of an OrderedDictionary are not sorted in any way." So foreach's would not pay attention to the numerical indices.
2. It is strongly typed, so you don't have to mess around with casting things out of DictionaryEntry structs.
So I've been poking around with C# a bit lately, and all the Generic Collections have me a little confused. Say I wanted to represent a data structure where the head of a tree was a key value pair, and then there is one optional list of key value pairs below that (but no more levels than these). Would this be suitable?
public class TokenTree
{
public TokenTree()
{
/* I must admit to not fully understanding this,
* I got it from msdn. As far as I can tell, IDictionary is an
* interface, and Dictionary is the default implementation of
* that interface, right?
*/
SubPairs = new Dictionary<string, string>();
}
public string Key;
public string Value;
public IDictionary<string, string> SubPairs;
}
It's only really a simple shunt for passing around data.
There is an actual Data Type called KeyValuePair, use like this
KeyValuePair<string, string> myKeyValuePair = new KeyValuePair<string,string>("defaultkey", "defaultvalue");
One possible thing you could do is use the Dictionary object straight out of the box and then just extend it with your own modifications:
public class TokenTree : Dictionary<string, string>
{
public IDictionary<string, string> SubPairs;
}
This gives you the advantage of not having to enforce the rules of IDictionary for your Key (e.g., key uniqueness, etc).
And yup you got the concept of the constructor right :)
I think what you might be after (as a literal implementation of your question), is:
public class TokenTree
{
public TokenTree()
{
tree = new Dictionary<string, IDictionary<string,string>>();
}
IDictionary<string, IDictionary<string, string>> tree;
}
You did actually say a "list" of key-values in your question, so you might want to swap the inner IDictionary with a:
IList<KeyValuePair<string, string>>
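Put together, that variant might look something like the sketch below. The Add and GetSubPairs members are assumptions added so the structure can actually be filled and read; they are not part of the original suggestion.

```csharp
using System.Collections.Generic;

public class TokenTree
{
    // Each head key maps to an ordered list of sub-pairs (duplicate sub-keys are allowed).
    private readonly IDictionary<string, IList<KeyValuePair<string, string>>> tree =
        new Dictionary<string, IList<KeyValuePair<string, string>>>();

    public void Add(string headKey, string subKey, string subValue)
    {
        IList<KeyValuePair<string, string>> list;
        if (!tree.TryGetValue(headKey, out list))
        {
            // First sub-pair for this head: create its list lazily.
            list = new List<KeyValuePair<string, string>>();
            tree.Add(headKey, list);
        }
        list.Add(new KeyValuePair<string, string>(subKey, subValue));
    }

    // Returns the sub-pairs for a head key, or an empty list if the key is unknown.
    public IList<KeyValuePair<string, string>> GetSubPairs(string headKey)
    {
        IList<KeyValuePair<string, string>> list;
        return tree.TryGetValue(headKey, out list)
            ? list
            : new List<KeyValuePair<string, string>>();
    }
}
```

Unlike the inner IDictionary, the list keeps insertion order and does not enforce unique sub-keys, which may or may not be what is wanted.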
There is a KeyValuePair built-in type. As a matter of fact, this is what the IDictionary is giving you access to when you iterate in it.
Also, this structure is hardly a tree, finding a more representative name might be a good exercise.
Just one thing to add to this (although I do think you have already had your question answered by others). In the interests of extensibility (since we all know it will happen at some point) you may want to check out the Composite Pattern This is ideal for working with "Tree-Like Structures"..
Like I said, I know you are only expecting one sub-level, but this could really be useful for you if you later need to extend ^_^
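For a feel of what that could look like, here is a deliberately simplified take on the Composite idea in which every node is a key/value pair that may hold child nodes of the same type. A textbook Composite usually separates a component interface from leaf and composite classes; the single TokenNode class and its members here are assumptions made to keep the sketch short.

```csharp
using System.Collections.Generic;

// Simplified composite: a node and a subtree share one type, so depth is unlimited.
public class TokenNode
{
    private readonly List<TokenNode> _children = new List<TokenNode>();

    public string Key { get; private set; }
    public string Value { get; private set; }
    public IList<TokenNode> Children { get { return _children; } }

    public TokenNode(string key, string value)
    {
        Key = key;
        Value = value;
    }

    // Adds a child and returns it so further levels can be chained.
    public TokenNode Add(string key, string value)
    {
        TokenNode child = new TokenNode(key, value);
        _children.Add(child);
        return child;
    }

    // Counts this node and everything below it, however deep the tree goes.
    public int CountNodes()
    {
        int count = 1;
        foreach (TokenNode child in _children)
        {
            count += child.CountNodes();
        }
        return count;
    }
}
```

A single sub-level works exactly as in the question, and a second or third level needs no new types.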
@Jay Mooney: A generic Dictionary class in .NET is actually a hash table, just with fixed types.
The code you've shown shouldn't convince anyone to use Hashtable instead of Dictionary, since both code pieces can be used for both types.
For hashtable:
foreach(object key in h.Keys)
{
string keyAsString = key.ToString(); // btw, this is unnecessary
string valAsString = h[key].ToString();
System.Diagnostics.Debug.WriteLine(keyAsString + " " + valAsString);
}
For dictionary:
foreach(string key in d.Keys)
{
string valAsString = d[key].ToString();
System.Diagnostics.Debug.WriteLine(key + " " + valAsString);
}
And just the same for the other one with KeyValuePair, just use the non-generic version for Hashtable, and the generic version for Dictionary.
So it's just as easy both ways, but Hashtable uses Object for both key and value, which means you will box all value types, and you don't have type safety, and Dictionary uses generic types and is thus better.
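The difference shows up as soon as the values are value types. In this small sketch (TypeSafetyDemo and its methods are made-up names), the Hashtable version needs a cast that can fail at run time, while the Dictionary version does not.

```csharp
using System.Collections;
using System.Collections.Generic;

public static class TypeSafetyDemo
{
    public static int SumHashtable(Hashtable h)
    {
        int sum = 0;
        foreach (DictionaryEntry entry in h)
        {
            // Values are stored as object, so each int was boxed on the way in and must be
            // unboxed here; this throws InvalidCastException if a value is not an int.
            sum += (int)entry.Value;
        }
        return sum;
    }

    public static int SumDictionary(Dictionary<string, int> d)
    {
        int sum = 0;
        foreach (KeyValuePair<string, int> pair in d)
        {
            // No cast and no boxing: the compiler already knows the value type.
            sum += pair.Value;
        }
        return sum;
    }
}
```

A Hashtable will happily accept a string value next to the ints and only fail later inside SumHashtable, whereas the equivalent mistake with Dictionary<string, int> does not compile.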
Dictionary Class is exactly what you want, correct.
You can declare the field directly as Dictionary, instead of IDictionary, but that's up to you.
Use something like this:
class Tree<T> : Dictionary<T, IList<Tree<T>>>
{
}
It's ugly, but I think it will give you what you want. Too bad KeyValuePair is sealed.