IList<KeyValuePair> vs IDictionary to serve as [DataMember] in WCF - c#

I have a dictionary data structure that must be passed around using WCF. To do that I created a member property with get and set methods. I can basically achieve the same functionality with this property being either a:
IDictionary<keyType, valueType>
or a
IList<KeyValuePair<keyType, valueType>>
I can see no strong reason for choosing one over the other. One mild reason I could think of is:
IDictionary - People reading the code will think that IDictionary makes more sense, since the data structure is a dictionary, but in terms of what is passed through WCF they are all the same.
Can anyone think of a reason to choose IList? If there is none I'll just go with IDictionary.

Design your interfaces based on use, not on implementation.
If the consumer of a class needs to iterate through the entire set, use IEnumerable. If they should be able to modify the result, and need index-based access, return IList. If they want specific items, and there is a single useful key value, return IDictionary.
Write your internal code this way, too :)
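For illustration, a rough sketch of that guidance (the IOrderSource and Order names here are invented, not from the question):
using System.Collections.Generic;

public class Order { /* ... */ }

public interface IOrderSource
{
    // Consumers only iterate the whole set.
    IEnumerable<Order> GetAllOrders();

    // Consumers need index-based access and may modify the result.
    IList<Order> GetPendingOrders();

    // Consumers look up specific items by a single useful key.
    IDictionary<string, Order> GetOrdersByNumber();
}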

It depends on your consumers. I would cater for the most likely use case and make their API as simple as possible. Edge cases can always iterate the dictionary via the Values collection.
Don't make them think about it. If a dictionary is what they'd think of as the result of the operation, then a type with that name is a very useful thing to use.

If the collection of key/value pairs expects unique keys, you can use a dictionary.
If the same key can appear in more than one pair, use IList/IEnumerable.
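A quick sketch of that difference (the string/int types are just for illustration):
using System.Collections.Generic;

var asDictionary = new Dictionary<string, int>();
asDictionary.Add("a", 1);
// asDictionary.Add("a", 2);  // would throw ArgumentException: duplicate key

var asList = new List<KeyValuePair<string, int>>();
asList.Add(new KeyValuePair<string, int>("a", 1));
asList.Add(new KeyValuePair<string, int>("a", 2));  // fine, duplicates allowed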


Does sorting a ConcurrentDictionary make any sense?

At first my thought was "this is a hash-based data type, so it is unsorted".
Then, since I was about to use it, I examined the matter in depth and found out that this class implements IEnumerable, and also this post confirmed that it is possible to iterate over this kind of data.
So, my question is: if I use foreach over a ConcurrentDictionary, in which order do I read the elements?
Then, as a second question, I'd like to know whether the sorting methods inherited from its interfaces are of any use. If I call a sorting method on a ConcurrentDictionary, will the new order persist (for example for a subsequent foreach)?
Hope I've made myself clear
The current implementation makes no promises whatsoever regarding the order of the elements.
A future implementation can easily change the order by which the elements are enumerated.
As such, your code should not depend on that order.
From the Dictionary<TKey, TValue> MSDN docs:
The order in which the items are returned is undefined.
(I couldn't find any reference regarding the ConcurrentDictionary, but the same principle applies.)
When you refer to "the sorting methods inherited by its interfaces", do you mean LINQ extensions? Like OrderBy? If so, these extensions are purely functional and always return a new collection. So, to answer your question "the new order will persist?": no, it won't. You can however use it like this:
foreach(KeyValuePair<T1, T2> kv in dictionary.OrderBy(...))
{
}
if I use foreach over a ConcurrentDictionary, in which order do I read the elements?
You get them in the order of buckets they belong to, and if a bucket contains multiple items, the items are in the order in which they've been added.
But as others have said, this is an implementation detail you shouldn't rely on.
I'd like to know whether the sorting methods inherited from its interfaces are of any use. If I call a sorting method on a ConcurrentDictionary, will the new order persist (for example for a subsequent foreach)?
I assume you're referring to the OrderBy() extension method on the IEnumerable<KeyValuePair<TKey, TValue>> interface. No, nothing will persist. This method returns another IEnumerable<KeyValuePair<TKey, TValue>>. The dictionary remains as it is.
Sounds like you might be asking for trouble if you aren't particularly careful. As dcastro mentioned, the order of elements is not guaranteed. A more troublesome issue is that a ConcurrentDictionary can be changed at any time by other threads. This means that even if order were guaranteed, items added by other threads while you iterate could still be missed. Unless you know you can prevent other threads from changing the dictionary, it's probably not a good idea to iterate over it.
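If you do need a stable, sortable view, one approach (a sketch, not from the answers above) is to take a point-in-time snapshot first and work with that:
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

var dictionary = new ConcurrentDictionary<string, int>();
// ... other threads may be adding or removing items concurrently ...

// ToArray() takes a snapshot; ordering the copy does not touch the dictionary,
// whose own enumeration order remains undefined.
KeyValuePair<string, int>[] snapshot = dictionary.ToArray();
foreach (var kv in snapshot.OrderBy(p => p.Key))
{
    // process kv.Key / kv.Value
}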

'Don't expose generic list': why use Collection<T> instead of List<T> in a method parameter

I am using FxCop and it shows a warning for "Don't expose generic list", which suggests using Collection<T> instead of List<T>. I know the reasons why this is preferred, as mentioned in this SO post, on MSDN, and in many more articles I have been through.
But my question is this: I have a few methods which do a lot of heavy calculation and accept parameters of type List<T>, which is supposed to be faster and better in terms of performance. But FxCop issues the warning for these as well. So one option is that I declare the parameter as Collection<T>, then call ToList() inside the method and use that.
So which one is optimized?
"Suppress the warning for this case" OR "use Collection<T> in parameter and then use ToList() inside the method itself".
The code analysis/FxCop rules have been written to support framework creators (Microsoft creates a lot of frameworks). A framework is consumed by external parties and you should be careful when you design the public interface. Provided that you are not writing a framework to be consumed by external parties, you can simply ignore rules that don't provide value to you.
However, one of the reasons that this rule exists is that exposing collections on a class is somewhat difficult. Often the elements in the collection are owned by the containing class and in that case you violate encapsulation if you allow clients to modify the collection used to store the aggregated items. By returning List<T> you allow the clients to modify the collection in many different ways. But often you want to keep track of the items in the collection. E.g. adding a new element might require some additional bookkeeping in the containing class etc. You lose this kind of control when you return a List<T>, unless of course you make a copy when you return it (but then the client should understand that they only get a copy of the collection and modifications will be ignored).
All in all you can probably improve your class design by avoiding exposing classes like List<T> and being more explicit about how aggregated elements can be added, modified and removed. But if you are in a hurry and just want to crank out some code then using List<T> may be exactly what you need to get the job done.
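As a sketch of that "keep control" point (OrderBook and Order are invented names): exposing Collection<T> lets the owning class hook mutations, which a raw List<T> does not.
using System.Collections.ObjectModel;

public class Order { /* ... */ }

public class OrderBook
{
    private readonly OrderCollection _orders = new OrderCollection();

    public Collection<Order> Orders
    {
        get { return _orders; }
    }

    private class OrderCollection : Collection<Order>
    {
        protected override void InsertItem(int index, Order item)
        {
            base.InsertItem(index, item);
            // bookkeeping hook: update totals, raise events, validate, etc.
        }
    }
}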
Don't bother using generic lists in public properties as long as you are not coding a framework somebody else wants to extend in the near future.
I suggest to suppress the warning. You can refactor your classes later if requirements change.
IMHO your interpretation of "Don't expose generic list" as "use Collection<T> instead of List<T>" is invalid.
The critical difference between a collection and a list is that the elements in a list are ordered. Some methods may require that the passed elements have an order; then we must use a list as the parameter.
The key to understanding this warning is that you should use the interface IList<T> instead of the concrete class List<T>.
As the method operates on a list, it is not so important what kind of list it is. The key factor is that it is a list.
In conclusion, method parameters should be as abstract as possible.
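A minimal sketch of that suggestion (Calculator, HeavyCalculation, Item, and Weight are invented names); callers can still pass a List<Item>, since List<T> implements IList<T>:
using System.Collections.Generic;

public class Item { public int Weight; }

public class Calculator
{
    // Accept the interface instead of the concrete List<T>; indexed access still works.
    public int HeavyCalculation(IList<Item> items)
    {
        int total = 0;
        for (int i = 0; i < items.Count; i++)
        {
            total += items[i].Weight;
        }
        return total;
    }
}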
You should use the type that is most appropriate for your purposes (and suppress the warning if appropriate). If you're passing a bunch of items, and order and uniqueness don't matter, use a collection. If you're passing an ordered collection of items, use a list. If you're passing data such that every item is unique but order doesn't matter, use a set. Use the type that has the semantic meaning appropriate for the exchange. In a few cases where the semantics and the methods that you need don't necessarily align (suppose you need AddRange), make an exception, or use the conversion methods.

Why refactor argument of List<Term> to IEnumerable<Term>?

I have a method that looks like this:
public void UpdateTermInfo(List<Term> termInfoList)
{
    foreach (Term termInfo in termInfoList)
    {
        UpdateTermInfo(termInfo);
    }
    m_xdoc.Save(FileName.FullName);
}
Resharper advises me to change the method signature to IEnumerable<Term> instead of List<Term>. What is the benefit of doing this?
The other answers point out that by choosing a "larger" type you permit a broader set of callers to call you. Which is a good enough reason in itself to make this change. However, there are other reasons. I would recommend that you make this change because when I see a method that takes a list or an array, the first thing I think is "what if that method tries to change an item in my list/array?"
You want the contents of a bucket, but you are requiring not just the bucket but also the ability to change its contents. Why would you require that if you're not going to use that ability? When you say "this method cannot take any old sequence; it has to take a mutable list that is indexed by integers" I think that you're making that requirement on the caller because you're going to take advantage of that power.
If "I'm planning on messing up your data structure" is not what you intend to communicate to the caller of the method then don't communicate that. A method that takes a sequence communicates "The most I'm going to do is read from this sequence in order".
Simply put, accepting an enumerable allows your function to be compatible with a broader scope of input arguments, such as arrays and LINQ queries.
To expound on accepting LINQ queries, one could do:
UpdateTermInfo(myTermList.Where(x => somefilter));
Additionally, specifying an interface rather than a concrete class allows others to provide their own implementation of that interface. In this way, you are being "subscriptive" rather than "proscriptive." (Yes, I did just make up a word.)
In general (with many exceptions relating to what sort of abilities you want to reserve for potential later modifications), it is a best-practice to implement functions using arguments that are the most general that they can be. This gives maximum flexibility to the consumer of your function.
As a result, if you are dead-set on using a list for this function (perhaps because at some later date you expect you might want to use properties such as Count or the index operator), I would strongly urge you to consider using IList<Term> instead of List<Term> for the reasons mentioned above.
List implements IEnumerable, so using IEnumerable makes things more flexible. If an instance came along where you didn't want to use a List and wanted to use a different collection object, it would still work, because anything enumerable satisfies IEnumerable.
For instance, IEnumerable allows you to pass in arrays and many other types, as opposed to always using a List.
IEnumerable is simply a sequence of items, unlike a List, which additionally lets you add, remove, sort, index, count, etc.
The main idea behind that refactor is that you make the method more general. You don't say what data structure you want, only what you need from it: that you can iterate through its elements.
So later, when you decide that O(n) search is not good enough for you, you only have to change one line and move along.
If you use List then you are confining yourself to that one concrete implementation, whereas with IEnumerable you can pass in arrays, lists, and collections, as they all implement that interface.

Why doesn't/couldn't IDictionary<TKey,TValue> implement ILookup<TKey,TValue>?

I suppose this doesn't really matter, I'm just curious.
If the difference between dictionary and lookup is that one is one-to-one and the other one-to-many, wouldn't a dictionary be a more specific/derived version of the other?
A lookup is a collection of key/value pairs where the key can be repeated.
A dictionary is a collection of key/value pairs where the key cannot be repeated.
Why couldn't IDictionary implement ILookup?
I suspect this is mainly because the intention is different.
ILookup<T,U> is designed specifically to work with a collection of values. IDictionary<T,U> is intended to work with a single value (that could, of course, be a collection).
While you could, of course, have IDictionary<T,U> implementations implement this via returning an IEnumerable<U> with a single value, this would be confusing, especially if your "U" is a collection itself (ie: List<int>). In that case, would ILookup<T,U>.Item return an IEnumerable<List<int>>, or should it do some type of check for an IEnumerable<T> value type, and then "flatten" it? Either way, it'd look confusing, and add questionable value.
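A small sketch of that difference in intent (the example data is invented):
using System.Collections.Generic;
using System.Linq;

string[] words = { "apple", "avocado", "banana" };

// Dictionary: exactly one value per key; a duplicate key would throw on Add.
Dictionary<char, string> firstByLetter =
    words.GroupBy(w => w[0]).ToDictionary(g => g.Key, g => g.First());

// Lookup: each key maps to a sequence of values.
ILookup<char, string> allByLetter = words.ToLookup(w => w[0]);

foreach (string w in allByLetter['a'])  // "apple", "avocado"
{
    // ...
}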
Interfaces IDictionary<T,U> and ILookup<T,U> both inherit IEnumerable. If an IDictionary<T,U> is cast to IEnumerable and GetEnumerator() is called on it, the resulting enumerator should return instances of KeyValuePair<T,U>. If an ILookup<T,U> is cast to IEnumerable and GetEnumerator() is called upon it, the resulting enumerator should return instances of IGrouping<T,U>. If the KeyValuePair<T,U> struct were modified to implement IGrouping<T,U> that might be workable, but hardly clean.
I suspect it's because the IDictionary<TKey, TValue> interface came out long before ILookup<TKey, TValue> did. Going back and modifying it is unnecessary; concrete implementations can implement ILookup<TKey, TValue> themselves. I don't see what would be gained by modifying an interface people have been using for years.

C# associative array

I've been using a Hashtable, but by nature, hashtables are not ordered, and I need to keep everything in order as I add them (because I want to pull them out in the same order). For example, if I do:
pages["date"] = new FreeDateControl("Date:", false, true, false);
pages["plaintiff"] = new FreeTextboxControl("Primary Plaintiff:", true, true, false);
pages["loaned"] = new FreeTextboxControl("Amount Loaned:", true, true, false);
pages["witness"] = new FreeTextboxControl("EKFG Witness:", true, true, false);
And when I do a foreach I want to be able to get it in the order of:
pages["date"]
pages["plaintiff"]
pages["loaned"]
pages["witness"]
How can I do this?
I believe that .NET has the OrderedDictionary class to deal with this. It is not generic, but it can serve as a decent Hashtable substitute - if you don't care about strict type safety.
I've written a generic wrapper around this class, which I would be willing to share.
http://msdn.microsoft.com/en-us/library/system.collections.specialized.ordereddictionary.aspx
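A sketch of how that could look with the question's controls (non-generic, so the values come back as object):
using System.Collections;
using System.Collections.Specialized;

OrderedDictionary pages = new OrderedDictionary();
pages.Add("date", new FreeDateControl("Date:", false, true, false));
pages.Add("plaintiff", new FreeTextboxControl("Primary Plaintiff:", true, true, false));

foreach (DictionaryEntry entry in pages)
{
    // entry.Key comes back in insertion order: "date", then "plaintiff";
    // entry.Value needs a cast back to the control type.
}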
EDIT: LBushkin is right - OrderedDictionary looks like it does the trick, albeit in a non-generic way. It's funny how many specialized collections there are which don't have generic equivalents :( (It would make sense for Malfist to change the accepted answer to LBushkin's.)
(I thought that...) .NET doesn't have anything built-in to do this.
Basically you'll need to keep a List<string> as well as a Dictionary<string,FreeTextboxControl>. When you add to the dictionary, add the key to the list. Then you can iterate through the list and find the keys in insertion order. You'll need to be careful when you remove or replace items though.
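A minimal sketch of that idea (the wrapper name is invented; removal and replacement are deliberately left out):
using System.Collections.Generic;

public class InsertionOrderedMap<TKey, TValue>
{
    private readonly List<TKey> _order = new List<TKey>();
    private readonly Dictionary<TKey, TValue> _map = new Dictionary<TKey, TValue>();

    public void Add(TKey key, TValue value)
    {
        _map.Add(key, value); // throws on duplicate keys
        _order.Add(key);      // remembers insertion order
    }

    public TValue this[TKey key]
    {
        get { return _map[key]; }
    }

    public IEnumerable<KeyValuePair<TKey, TValue>> InInsertionOrder()
    {
        foreach (TKey key in _order)
            yield return new KeyValuePair<TKey, TValue>(key, _map[key]);
    }
}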
Use SortedList; I think it will solve your problem.
A SortedList object internally maintains two arrays to store the elements of the list; that is, one array for the keys and another array for the associated values. Each element is a key/value pair that can be accessed as a DictionaryEntry object.
SortedList sl = new SortedList();
foreach (DictionaryEntry x in sl)
{
}
Use the KeyedCollection
Its underlying base is a List, but it provides dictionary lookup based on a key. In this case your keys are the strings. So as long as you aren't adding the same key twice you are fine.
http://msdn.microsoft.com/en-us/library/ms132438.aspx
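A sketch of that approach, assuming you can wrap each control together with its key (PageEntry and PageCollection are invented names):
using System.Collections.ObjectModel;

public class PageEntry
{
    public string Key;
    public object Control;
}

public class PageCollection : KeyedCollection<string, PageEntry>
{
    // Tell the collection how to extract the key from an item;
    // iteration order stays insertion order, and this["date"] works as a keyed lookup.
    protected override string GetKeyForItem(PageEntry item)
    {
        return item.Key;
    }
}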
There's no perfect solution before .NET 4.0. In < 3.5 you can:
Use a generic SortedList with an integer key type, and a value type of the most-derived common type of your items. Define an integer value (i, let's say) and, as you add each item to the SortedList, make the key i++, incrementing its value as you go. Later, iterate over the GetValueList property of the sorted list. This IList property will yield your objects in the order you put them in, because they will be sorted by the key you used.
This is not lightning-fast, but pretty good, and generic. If you want to also access by key, you need to do something else, but I don't see that in your requirements. If you don't need to retrieve by key, and you add items in key order so the collection doesn't actually have to do its sorting, this is it.
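A quick sketch of that counter-keyed idea, reusing the question's control type and the generic SortedList<TKey, TValue> with its Values property:
using System.Collections.Generic;

SortedList<int, FreeTextboxControl> list = new SortedList<int, FreeTextboxControl>();
int i = 0;
list.Add(i++, new FreeTextboxControl("Primary Plaintiff:", true, true, false));
list.Add(i++, new FreeTextboxControl("Amount Loaned:", true, true, false));

// Values enumerate in key order, i.e. the order the items were added.
foreach (FreeTextboxControl control in list.Values)
{
    // ...
}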
In .NET 4.0 you'll have the generic SortedSet<T>, which will be absolutely perfect for you. No tradeoffs.
The best way is to use C# indexers. They are configurable to anything we like: we can use an int, enum, long, double or anything else as the key.
You just have to create a class, give it indexers, and configure the input and output parameters. It is a little more work, but I think this is the only right way.
Please see this MSDN link for more information on how to use them.
See Indexers: http://msdn.microsoft.com/en-us/library/6x16t2tx.aspx
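A hypothetical sketch of that indexer idea, backed by a list of pairs so enumeration stays in insertion order (OrderedPages is an invented name, and the setter naively appends):
using System.Collections.Generic;

public class OrderedPages
{
    private readonly List<KeyValuePair<string, object>> _items =
        new List<KeyValuePair<string, object>>();

    public object this[string key]
    {
        get
        {
            // linear scan; fine for small collections
            foreach (KeyValuePair<string, object> kv in _items)
                if (kv.Key == key) return kv.Value;
            return null;
        }
        set
        {
            // naive: always appends; a real implementation would replace an existing key
            _items.Add(new KeyValuePair<string, object>(key, value));
        }
    }

    public IEnumerable<KeyValuePair<string, object>> Items
    {
        get { return _items; }
    }
}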
One alternative is to keep your ordered key values in an ordered structure like a List, with the rest still stored in a dictionary.
Then, when you need to access your data, just go through your ordered List and query your dictionary along the way.
Look at SortedList:
http://msdn.microsoft.com/en-us/library/system.collections.sortedlist.aspx
As Haxelit suggests, you might derive from KeyedCollection<TKey, TValue>. It actually uses a List underneath until you hit a certain threshold value, and then it maintains both a List and a Dictionary. If you can use a function to derive one of your keys from one of your values, then this is an easy solution. If not, then it gets pretty messy.
