This StackOverflow answer explains that a HashSet is unordered: its enumeration order is undefined and should not be relied upon.
However, this raises another question: should I rely on the enumeration order staying the same between two or more subsequent enumerations, given that there are no insertions or removals in between?
For example, let's say I have added some items to a HashSet:
HashSet<int> set = new HashSet<int>();
set.Add(1);
set.Add(2);
set.Add(3);
set.Add(4);
set.Add(5);
Now, when I enumerate this set via foreach, let us say I receive this sequence:
// Result: 1, 3, 4, 5, 2.
The question is: will the order be preserved if I enumerate the set time and time again, given that I make no modifications? Will it always be the same?
Practically speaking, it might always be the same between enumerations, but that guarantee is not provided in the documentation of IEnumerable, and the implementer could decide to return the items in whichever order it wants.
Who knows what it is doing under the hood, and whether it will keep doing it the same way in the future. For example, a future implementation of HashSet might be optimized to detect low memory conditions and rearrange its contents in memory, thereby affecting the order in which they are returned. So 99.9% of the time the items would come back in the same order, but if you started exhausting memory resources, they would suddenly come back in a different order.
Bottom line: I would not rely on the order of enumeration being consistent over time. If the order is important to you, then do your foreach over set.OrderBy(x => x) so that you can be sure it is in the order you want.
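For illustration, a minimal sketch of that approach (assuming a using System.Linq directive for OrderBy):
HashSet<int> set = new HashSet<int> { 1, 2, 3, 4, 5 };
// The set's own enumeration order is undefined; sorting first guarantees
// a stable order regardless of the set's internal layout.
foreach (int item in set.OrderBy(x => x))
{
    Console.WriteLine(item); // always prints 1, 2, 3, 4, 5
}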
Why do C# arrays and collections use different names (Length and Count) for the same property?
It just causes headaches for people who are not familiar with this distinction.
Length generally refers to a fixed size, whereas Count generally refers to content which could change. (I say generally because there are some exceptions to this, such as an IReadOnlyList, which isn't going to change but still has a Count, since it derives from the more general IReadOnlyCollection interface.)
Besides #McGuireV10's answer, part of the reason is historical. C# has its roots in C, which uses the term "length" when talking about arrays and strings. There was no compelling reason not to use "length".
Over the years, collections have been refined and genericized, and they now hold all kinds of different, countable objects, so "count" also makes sense.
I think another part of this is how we talk about our data structures. It is more natural to say "what is the length of the array?" than "what is the count of the array?"; the former sounds natural, and the latter is ambiguous (did you mean the count of items in the array, or the number of arrays?).
Similarly, when answering "how many widgets are in the dictionary?", you are going to express your answer in terms of a count, not a length.
For something like a string, it's not wrong to think of it in terms of both count and length:
This string has (a count of) 40 characters
This string has a length of 40
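A small sketch of the naming in practice:
int[] numbers = { 1, 2, 3 };              // fixed size: Length
List<int> items = new List<int> { 1, 2 }; // resizable: Count
Console.WriteLine(numbers.Length); // 3 -- never changes for this array
Console.WriteLine(items.Count);    // 2
items.Add(3);
Console.WriteLine(items.Count);    // 3 -- the count changed with the content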
I'm currently examining some code I'm going to maintain.
I see a few occurrences of
.Take(1).SingleOrDefault()
Why would you use that instead of simply
.First() or .FirstOrDefault()
(I'm not sure whether .Take(1) would throw an exception if the result set is empty, which IMHO would make the difference between the two .First... methods?)
It is impossible for us to know for sure the inner implementation of whatever LINQ provider you may be using. They all vary in how they do it. Some may be more performant in cases like this, whereas others may be less so. You should get the same result either way.
It is also not possible for us to read someone's mind to determine why they did it this way in this case.
With that said, if you want to dig deeper and it is a SQL provider, you can look at the SQL it generates and compare the two cases.
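For example, if the code happens to use Entity Framework Core 5 or later, you could inspect the generated SQL with ToQueryString (dbContext and Widgets here are hypothetical placeholders):
// First/FirstOrDefault translate to TOP(1) and Single/SingleOrDefault to TOP(2),
// so comparing Take(1) and Take(2) shows the shape of each query.
Console.WriteLine(dbContext.Widgets.Take(1).ToQueryString());
Console.WriteLine(dbContext.Widgets.Take(2).ToQueryString());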
The main objection is that .Take(1).SingleOrDefault() defeats the purpose of SingleOrDefault, which is to throw an exception when the LINQ query returns more than one element.
To illustrate this, when running LINQ against a SQL Server backend, Single(OrDefault) will translate into SELECT TOP (2) ... in order to determine whether there is in fact more than one record. Preceding this with Take(1) means the query can never return more than one record, so the "multiple results" exception will never occur, even though the code appears to ask for that check. This code looks like a (premature) optimization by someone who was worried about getting two objects back instead of one.
So the answer to your question "Why would you use that?" is: there's absolutely no reason to do it this way. There are only reasons not to do it.
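To make that concrete, here is a minimal LINQ-to-Objects sketch (assuming using System.Linq; the values are illustrative):
var many = new List<int> { 1, 2 };
int a = many.Take(1).SingleOrDefault(); // 1 -- Take(1) hides the second element, so no exception
int b = many.FirstOrDefault();          // 1 -- same result, clearer intent
// int c = many.SingleOrDefault();      // throws InvalidOperationException: sequence contains more than one element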
I would like to know which is more optimal: a number N of private integer variables, or a single array containing N integer values.
When you use an array, that adds an extra indirection. The array is allocated in a separate part of memory, so when you access it, your code first obtains the array's address and only then can it read the contents. Indexing also takes some work, but that is done extremely fast by the CPU. However, .NET is a safe environment, and it checks whether you use a valid array index, which adds additional time.
When you use separate variables, these are stored inside your object instance and no indirection is needed. Also, no index bounds check is needed.
Moreover, you cannot give a nice name to the Nth element of an array, but you can give good names to individual variables, so your code will be more readable.
As others mentioned, you shouldn't do this kind of optimization; the compiler and the JIT take care of it. The compiler knows several common use cases and has optimization strategies for them. If you start doing tricky things, the compiler will not recognize your intention and cannot make the optimization for you.
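For reference, a minimal sketch of the two layouts under discussion (the field names are illustrative):
class WithFields
{
    // Separate variables live inline in the object: no extra indirection,
    // no bounds check, and each value gets a meaningful name.
    private int width, height, depth;
}
class WithArray
{
    // The array lives in a separate allocation: one extra indirection,
    // plus an index bounds check on every access.
    private int[] dimensions = new int[3];
}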
I'm refactoring some legacy code that uses a 2D string array:
/// <summary>Array of valid server messages</summary>
private static string[,] serverRsp =
{
{"JOIN", "RSP" },
{"SETTING", "RSP" },
. . .
I want to modernize this, but don't know if I should use a Dictionary, a List of list of string, or something else. Is there a standard correlation between the "olden" way and the golden way (legacy vs. refactored)?
IWBN (it would be nice) if there was a chart somewhere that showed the olden vs. the golden for data types and structures, etc.
[,] is not an "old" data structure, and hopefully it never will be.
Keep using it whenever appropriate.
For example, in this case a List<List<T>> would be much more confusing than a simple two-dimensional array, and the array is lighter than a List<T> in terms of memory consumption (at least in my measurements).
In short: if there is no real reason or new requirement to change it, such as needing a key-value store with fast O(1) lookup by key rather than by index, do not change it. It is clear and it is readable.
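If key-based lookup ever does become a requirement, a dictionary version might look like this (a sketch using the message names from the snippet above):
// Only worth switching to if lookups are by message name rather than by index.
private static readonly Dictionary<string, string> serverRsp =
    new Dictionary<string, string>
    {
        { "JOIN", "RSP" },
        { "SETTING", "RSP" },
        // . . .
    };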
I'm writing a service where performance is essential, and I'm not sure what the fastest option is. I have a few objects (50-200), each of which has an ID (an int, e.g. 84397 or 23845). Would it be faster to use a Dictionary, a List of KeyValuePairs, a List with the indexes set to the IDs (and the rest left null), or an array with the same idea?
It depends on which operation you want to execute. Let's assume that you want to find an object with a given ID.
The huge array approach is fastest: Accessing myArray[84397] is a constant-time operation O(1). Of course, this approach requires the most memory.
The dictionary is almost as fast but requires less memory, since it uses a hash table internally.
The list of pairs approach is the slowest, since you might have to traverse the whole list to find your entry, which yields O(n) complexity.
Thus, in your situation, I would choose the dictionary, unless the marginally better performance of the huge array is really relevant in your case.
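A rough sketch of the three candidates (MyObject and obj are placeholders; assuming using System.Linq):
// Huge array: O(1) lookup, but sized to the largest possible ID.
MyObject[] byIndex = new MyObject[100000];
byIndex[84397] = obj;
MyObject fromArray = byIndex[84397];
// Dictionary: near-O(1) lookup, memory proportional to the 50-200 entries.
var byKey = new Dictionary<int, MyObject> { [84397] = obj };
MyObject fromDict = byKey[84397];
// List of pairs: O(n) linear scan.
var pairs = new List<KeyValuePair<int, MyObject>> { new KeyValuePair<int, MyObject>(84397, obj) };
MyObject fromList = pairs.First(p => p.Key == 84397).Value;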
Dictionary<TKey, TValue> uses a hash table internally so I think it would be the fastest one.
Dictionary versus List Lookup time
Also, for a more detailed explanation of the different collections, check out this question.
You can use Hashtable as well; Dictionary is hash-based internally anyway.
But Dictionary has the advantage of being a generic type, which gives you type safety.
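A small sketch of what that type safety buys (assuming using System.Collections and System.Collections.Generic):
Hashtable table = new Hashtable();
table[84397] = "widget";
string fromTable = (string)table[84397]; // cast required; a wrong cast only fails at runtime
var dict = new Dictionary<int, string> { [84397] = "widget" };
string fromDict = dict[84397];           // no cast; type mistakes are caught at compile time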
Here is a different thread:
Dictionary Vs HashTable
I hope it helps you decide.
Praveen