(Sorry for the vague title; couldn't think of anything better. Feel free to rephrase.)
So let's say my function or property returns an IEnumerable<T>:
public IEnumerable<Person> Adults
{
get
{
return _Members.Where(i => i.Age >= 18);
}
}
If I run a foreach on this property without actually materializing the returned enumerable:
foreach(var Adult in Adults)
{
//...
}
Is there a rule that governs whether IEnumerable<Person> will be materialized to array or list or something else?
Also is it safe to cast Adults to List<Person> or Array without calling ToList() or ToArray()?
Edit
Many people have put a lot of effort into answering this question. Thanks to all of them. However, the gist of this question still remains unanswered. Let me add some more details:
I understand that foreach doesn't require the target object to be an array or list. It doesn't even need to be a collection of any kind. All it needs the target object to do is to implement enumeration. However, if I inspect the value of the target object, it reveals that the actual underlying object is List<T> (just like it shows object (string) when you inspect a boxed string object). This is where the confusion starts. Who performed this materialization? I inspected the underlying layers (the Where() function's source) and it doesn't look like those functions are doing this.
So my problem lies at two levels.
The first one is purely theoretical. Unlike many other disciplines like physics and biology, in computer science we always know precisely how something works (answering @zzxyz's last comment); so I was trying to find out which agent created the List<T>, how it decided to choose a List and not an Array, and whether there is a way of influencing that decision from our code.
My second reason was practical. Can I rely on the type of actual underlying object and cast it to List<T>? I need to use some List<T> functionality and I was wondering if for example ((List<Person>)Adults).BinarySearch() is as safe as Adults.ToList().BinarySearch()?
I also understand that it isn't going to create any performance penalty even if I do call ToList() explicitly. I was just trying to understand how it is working. Anyway, thanks again for the time; I guess I have spent just too much time on it.
In general terms, all you need for a foreach to work is an object with an accessible GetEnumerator() method that returns an object with the following members:
bool MoveNext()
T Current { get; } // where `T` is some type.
A Reset() method is not required by foreach; the example below includes one only because IEnumerator declares it.
You don't even need an IEnumerable or IEnumerable<T>.
This code works as the compiler figures out everything it needs:
void Main()
{
foreach (var adult in new Adults())
{
Console.WriteLine(adult.ToString());
}
}
public class Adult
{
public override string ToString() => "Adult!";
}
public class Adults
{
public class Enumerator
{
public Adult Current { get; private set; }
public bool MoveNext()
{
if (this.Current == null)
{
this.Current = new Adult();
return true;
}
this.Current = null;
return false;
}
public void Reset() { this.Current = null; }
}
public Enumerator GetEnumerator() { return new Enumerator(); }
}
Having a proper enumerable makes the process work more easily and more robustly. The more idiomatic version of the above code is:
public class Adults
{
private class Enumerator : IEnumerator<Adult>
{
public Adult Current { get; private set; }
object IEnumerator.Current => this.Current;
public void Dispose() { }
public bool MoveNext()
{
if (this.Current == null)
{
this.Current = new Adult();
return true;
}
this.Current = null;
return false;
}
public void Reset()
{
this.Current = null;
}
}
public IEnumerator<Adult> GetEnumerator()
{
return new Enumerator();
}
}
This enables the Enumerator to be a private class, i.e. private class Enumerator. The interface then does all of the hard work - it's not even possible to get a reference to the Enumerator class outside of Adults.
The point is that you do not know at compile-time what the concrete type of the class is - and if you did you may not even be able to cast to it.
The interface is all you need, and even that isn't strictly true if you consider my first example.
If you want a List<Adult> or an Adult[] you must call .ToList() or .ToArray() respectively.
There is no such thing as a default concrete type for any interface.
The entire point of an interface is to guarantee properties, methods, events or indexers, without the user needing any knowledge of the concrete type that implements it.
When using an interface, all you can know is the properties, methods, events and indexers this interface declares, and that's all you actually need to know. That's just another aspect of encapsulation - same as when you are using a method of a class you don't need to know the internal implementation of that method.
To answer your question in the comments:
who decides that concrete type in case we don't, just as I did above?
That's the code that created the instance that's implementing the interface.
Since you can't do var Adults = new IEnumerable<Person> - it has to be a concrete type of some sort.
As far as I can see in the source code for LINQ's Enumerable extensions, Where returns either an instance of Iterator<TSource> or an instance of WhereEnumerableIterator<TSource>. I didn't bother checking further what exactly those types are, but I can pretty much guarantee they both implement IEnumerable, or the guys at Microsoft are using a different C# compiler than the rest of us... :-)
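A small sketch of this point, assuming LINQ to Objects. The Person class, the WhereDemo helper and the sample data are made up for illustration and are not from the question. The object returned by Where is an internal iterator, so a type test against List<Person> fails, while ToList() always works:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public int Age { get; set; }
}

public static class WhereDemo
{
    // Hypothetical backing list standing in for _Members from the question.
    private static readonly List<Person> _members = new List<Person>
    {
        new Person { Age = 12 },
        new Person { Age = 30 },
        new Person { Age = 45 },
    };

    public static IEnumerable<Person> Adults => _members.Where(p => p.Age >= 18);

    // True only if the returned object really is a List<Person>.
    public static bool AdultsIsList() => Adults is List<Person>;

    // ToList() copies the filtered items into a new list, whatever the source was.
    public static int CountViaToList() => Adults.ToList().Count;

    // The concrete type name is an implementation detail and may differ between runtimes.
    public static string ConcreteTypeName() => Adults.GetType().Name;
}
```

Printing WhereDemo.ConcreteTypeName() shows whichever iterator class the runtime picked; code should not depend on that name.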
The following code hopefully highlights why neither you nor the compiler can assume an underlying collection:
public class OneThroughTen : IEnumerable<int>
{
private static int bar = 0;
public IEnumerator<int> GetEnumerator()
{
while (true)
{
yield return ++bar;
if (bar == 10)
{ yield break; }
}
}
IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}
class Program
{
static void Main(string[] args)
{
IEnumerable<int> x = new OneThroughTen();
foreach (int i in x)
{ Console.Write("{0} ", i); }
}
}
Output being, of course:
1 2 3 4 5 6 7 8 9 10
Note, the code above behaves extremely poorly in the debugger. I don't know why. This code behaves just fine:
public IEnumerator<int> GetEnumerator()
{
while (bar < 10)
{
yield return ++bar;
}
bar = 0;
}
(I used static for bar to highlight that not only does the OneThroughTen not have a specific collection, it doesn't have any collection, and in fact has no instance data whatsoever. We could just as easily return 10 random numbers, which would've been a better example, now that I think on it :))
From your edited question and comments it sounds like you understand the general concept of using IEnumerable, and that you cannot assume that "a list object backs all IEnumerable objects". Your real question is about something that has confused you in the debugger, but we've not really been able to understand exactly what it is you are seeing. Perhaps a screenshot would help?
Here I have 5 IEnumerable<int> variables which I assign in various ways, along with how the "Watch" window describes them. Does this show the confusion you are having? If not, can you construct a similarly short program and screenshot that does?
Coming a bit late into the party here :)
Actually Linq's "Where" decides what's going to be the underlying implementation of IEnumerable's GetEnumerator.
Look at the source code:
https://github.com/dotnet/runtime/blob/918e6a9a278bc66fb191c43d4db4a71e63ffad31/src/libraries/System.Linq/src/System/Linq/Where.cs#L59
You'll see that, based on the "source" type, the method returns an iterator specialized for arrays, one specialized for List<T>, or a more generic one for any other IEnumerable<T> (named along the lines of "WhereArrayIterator", "WhereListIterator" and "WhereEnumerableIterator"; the exact names vary between versions).
Each of these objects implements GetEnumerator over an Array, a List or a general enumerable, so I'm pretty sure that's why you see the underlying object type being one of these in the VS inspector.
Hope this helps clarify.
I have been digging into this myself. I believe the 'underlying type' is an iterator method, not an actual data structure type.
An iterator method defines how to generate the objects in a sequence
when requested.
https://learn.microsoft.com/en-us/dotnet/csharp/iterators#enumeration-sources-with-iterator-methods
In my usecase/testing, the iterator is System.Linq.Enumerable.SelectManySingleSelectorIterator. I don't think this is a collection data type. It is a method that can enumerate IEnumerables.
Here is a snippet:
public IEnumerable<Item> ItemsToBuy { get; set; }
...
ItemsToBuy = Enumerable.Range(1, rng.Next(1, 20))
.Select(RandomItem(rng, market))
.SelectMany(e => e);
The property is IEnumerable and .SelectMany returns IEnumerable. So what is the actual collection data structure? I don't think there is one in how I am interpreting 'collection data structure'.
Also is it safe to cast Adults to List or Array without
calling ToList() or ToArray()?
Not for me. When attempting to cast ItemsToBuy collection in a foreach loop I get the following runtime exception:
{"Unable to cast object of type
'SelectManySingleSelectorIterator`2[System.Collections.Generic.IEnumerable`1[CashMart.Models.Item],CashMart.Models.Item]'
to type 'CashMart.Models.Item[]'."}
So I could not cast, but I could .ToArray(). I do suspect there is a performance hit as I would think that the IEnumerable would have to 'do things' to make it an array, including memory allocation for the array even if the entities are already in memory.
However if I place inspect the value of target object, it reveals that
the actual underlying object is List
This was not my experience, and I think it may depend on the IEnumerable source as well as the LINQ provider. If I add a Where, the returned iterator is:
System.Linq.Enumerable.WhereEnumerableIterator
I am unsure what your _Members source is, but using LINQ to Objects, I get an iterator. LINQ to Entities must call the database and store the result set in memory somehow and then enumerate on that result. I would doubt that it internally makes it a List, but I don't know much. I suspect instead that _Members may be a List somewhere else in your code, so that even after the .Where it shows as a List.
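If List<T> functionality such as BinarySearch is needed, one defensive option is to test the type and fall back to ToList(). This is only a sketch under the assumption of LINQ to Objects; SequenceHelpers, AsListOrCopy and IndexOfAge are hypothetical names, not part of the BCL:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SequenceHelpers
{
    // Hypothetical helper: returns the same instance when 'source' already is a List<T>,
    // and falls back to ToList() (which copies) for anything else.
    public static List<T> AsListOrCopy<T>(IEnumerable<T> source)
    {
        return source as List<T> ?? source.ToList();
    }

    // Example: filter a sorted list of ages, then binary-search the result.
    public static int IndexOfAge(List<int> sortedAges, int age)
    {
        IEnumerable<int> adults = sortedAges.Where(a => a >= 18);
        List<int> list = AsListOrCopy(adults); // 'adults' is an iterator, so this copies
        return list.BinarySearch(age);
    }
}
```

Note that when the helper hands back the original list, any changes made through it also change the source, so it is only appropriate for read-only use.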
This is a question about design/best practices.
I have the following class:
class MyClass
{
public bool IsNextAvailablle()
{
// implementation
}
public SomeObject GetNext()
{
return nextObject;
}
}
I consider this a bad design because the users of this class need to be aware that they need to call IsNextAvailable() before calling GetNext().
However, this "hidden contract" is the only thing I can see wrong with it: the user can call GetNext() when there is nothing available. (I would be happy if anyone can point out other scenarios in which this implementation is bad.)
A second implementation I thought of is that GetNext() throws an exception if nextObject is not available. Client will have to handle the exception, plus it can have a small impact on performance and cpu usage due to the exception handling mechanism in .net (I expect this exception to be thrown quite often). Is the exception-driven way a better approach than the previous one? Which is the best way?
That's just fine. In fact, this two-step process is a common idiom for a bunch of .NET BCL classes. See, for example, an IEnumerable:
using(var enumerator = enumerable.GetEnumerator())
{
while(enumerator.MoveNext())
{
// Do stuff with enumerator.Current
}
}
Or DbDataReader:
using(var dbDataReader = dbCommand.ExecuteReader())
{
while(dbDataReader.Read())
{
// Do stuff with dbDataReader
}
}
Or Stream, for that matter:
var buffer = new byte[1024];
using(var stream = GetStream())
{
var read = 0;
while((read = stream.Read(buffer, 0, buffer.Length)) > 0)
{
// Do stuff with buffer
}
}
Now, your entire IsNextAvailable()/GetNext() could very well be replaced by implementing an IEnumerable<SomeObject> and thusly your API will be immediately familiar to any .NET developer.
Neither of them is an ideal solution, though I personally prefer the exception because it allows a single point of entry.
You could opt to implement the IEnumerable<SomeObject> interface. In that way you can provide an enumerator that actually does all the checking for you.
class MyClass : IEnumerable<SomeObject>
{
private bool IsNextAvailablle()
{
// implementation
}
private SomeObject GetNext()
{
return nextObject;
}
public IEnumerator<SomeObject> GetEnumerator()
{
while (IsNextAvailablle())
{
yield return GetNext();
}
}
IEnumerator IEnumerable.GetEnumerator()
{
return this.GetEnumerator();
}
}
Disclaimer: This question is in hindsight asking for opinions so I'm torn between closing it (and deleting my answer) or leaving my answer here.
In any case, this is my opinion, and only my opinion.
You should always strive for "pit of success".
The "pit of success" is best described by Jeff Atwood: Falling into the Pit of Success:
The Pit of Success: in stark contrast to a summit, a peak, or a journey across a desert to find victory through many trials and surprises, we want our customers to simply fall into winning practices by using our platform and frameworks. To the extent that we make it easy to get into trouble we fail.
The term was coined by Rico Mariani but I am unable to find a clear source for this term.
Basically, make an API that invites correct use and makes it hard to use wrong.
Or let me rephrase that: Make the correct usage of your API the only way to use your API.
In your case, you haven't done that.
Broad Explanation
In the case of "is it bad design to require consumers of my API to call methods in the right order, otherwise bad/incorrect things will happen?" - the answer is yes. This is bad.
Instead you should try to restructure your API so that the consumer "falls into the pit of success". In other words, make the API behave in the way that the consumer would assume it would by default.
The problem with this is that it invariably comes down to what people consider "by default". Different people might be used to different behavior.
For instance, let's say we get rid of IsNextAvailablle [sic] altogether and make GetNext return null in the case of no next available.
Some purists might say that then perhaps the method should be called TryGetNext. It may "fail" to produce a next item.
So here's your revised class:
class MyClass
{
public SomeObject TryGetNext()
{
return nextObject; // or null if none is available
}
}
There should no longer be any doubt as to what this method does. It attempts to get the next object from "something". It may fail, but you should also document that in the case where it fails, the consumer gets null in return.
An example API in the .NET framework that behaves like this is the TextReader.ReadLine method:
Return Value:
The next line from the reader, or null if all characters have been read.
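A related variant, used by BCL methods such as Dictionary<TKey, TValue>.TryGetValue and int.TryParse, returns a bool and hands the item back through an out parameter, which also works when null is a legitimate item or the item is a value type. A minimal sketch, assuming a hypothetical ItemSource<T> class backed by a queue (not the original MyClass):

```csharp
using System.Collections.Generic;

public class ItemSource<T>
{
    private readonly Queue<T> _items;

    public ItemSource(IEnumerable<T> items)
    {
        _items = new Queue<T>(items);
    }

    // Returns true and sets 'next' when an item is available;
    // returns false and sets 'next' to default(T) otherwise.
    public bool TryGetNext(out T next)
    {
        if (_items.Count > 0)
        {
            next = _items.Dequeue();
            return true;
        }
        next = default(T);
        return false;
    }
}
```

A consumer can then loop with while (source.TryGetNext(out var item)) { ... } and never needs a separate availability check.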
HOWEVER, if the question "is there anything else" can easily be answered, but "give me the next something" is an expensive operation then perhaps this is the wrong approach. For instance, if the output of GetNext is an expensive and large data structure that can be produced if one has an index into something, and the IsNextAvailablle can be answered by simply looking at the index and seeing that it is still less than 10, then perhaps this should be rethought.
Additionally, this type of "simplification" may not always be possible. For instance, the Stopwatch class requires the consumer to start the stopwatch before reading time elapsed.
A better restructuring of such a class would be that you either have a stopped stopwatch or a started stopwatch. A started stopwatch cannot be started. Let me show this class:
public class StoppedStopwatch
{
public RunningStopwatch Start()
{
return new RunningStopwatch(...);
}
}
public class RunningStopwatch
{
public PausedStopwatch Pause()
{
return new PausedStopwatch(...);
}
public TimeSpan Elapsed { get; }
}
public class PausedStopwatch
{
public RunningStopwatch Unpause()
{
return new RunningStopwatch(...);
}
public TimeSpan Elapsed { get; }
}
This API doesn't even allow you to do the wrong things. You cannot stop a stopped stopwatch and since it has never been started you can't even read the time elapsed.
A running stopwatch however can be paused, or you can read the elapsed time. If you pause it, you can unpause it to get it running again, or you can read the elapsed time (as of when you paused it).
This API invites correct usage because it doesn't make incorrect usage available.
So in the broad sense, your class is bad design (in my opinion).
Try to restructure the API so that the correct way to use it is the only way to use it.
Specific Case
Now, let's deal with your specific code example. Is that bad design, and how do you improve it?
Well, as I said in a comment, if you squint slightly and replace some of the names in the class you have reimplemented IEnumerable:
class MyClass                          interface IEnumerable
{                                      {
    public bool IsNextAvailablle()         public bool MoveNext()
    {                                      {
        // implementation
    }                                      }
    public SomeObject GetNext()            public SomeObject Current
    {                                      {
        return nextObject;                     get { ... }
    }                                      }
}                                      }
So your example class looks a lot like a collection. I can start enumerating over it, I can move to the next item, one item at a time, and at some point I reach the end.
In this case I would simply say "don't reinvent the wheel". Implement IEnumerable because, as a consumer of your class, this is what I would expect you to do.
So your class should look like this:
class MyClass : IEnumerable<SomeObject>
{
public IEnumerator<SomeObject> GetEnumerator()
{
while (... is next available ...)
yield return ... get next ...;
}
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
Again, this is "pit of success". If the class is in reality a collection of things, use the tools built into .NET to make it behave like any other .NET collection.
If you were to document your class as "a collection of SomeObject instances" I would grab my LINQ and foreach toolkit by default. When I get a compiler error I would start looking at the members to find the actual collection because I have a very strong sense of what a collection should be in .NET. I would be very puzzled if you reimplemented all the tools that would be able to handle IEnumerable but simply didn't make it implement this interface.
So, instead of me having to write code like this:
var c = new MyClass();
while (c.IsNextAvailablle())
{
var item = c.GetNext();
// process item
}
I can write this:
var c = new MyClass();
foreach (var item in c)
// process item
Why should the users even have to call IsNextAvailable? By rights, IsNextAvailable should be private and GetNext should be the one calling it, and then throw an exception or a warning or return a null if there is nothing available.
public SomeObject GetNext()
{
if(IsNextAvailable())
return nextObject;
else
throw new Exception("there is no next"); // this is just illustrative. By rights exceptions shouldn't be used for this scenario
}
private bool IsNextAvailable()
{
// implementation
}
I have a few methods which accept collections of a fixed size (e.g. 2, 3, 5), and I can't decide which way is better:
public void Foo(IEnumerable<Object> objects)
{
if(objects.Count() != 3)
{
throw new Exception();
}
// actions
}
public void Foo(Object objectA, Object objectB, Object objectC)
{
// actions
}
Are there any definitive pros and cons of each option?
The second is much better in my view:
It's obvious from the signature that it's expecting 3 values
Failures are flagged at compile time instead of execution time
If you have a specific number of members that are required, use your second option. It is confusing to the consumer of your method if a collection is allowed but then an exception is thrown at run time. This may or may not be caught if proper testing is not utilized and it is misleading. Always design for the person who will consume your code, never assuming that you will always be the one to maintain it.
I would go for this:
public class Bar
{
public Object object1;
public Object object2;
public Object object3;
// add a constructor if you want
}
...
public void Foo(Bar b)
{
// actions
}
I'm writing an iterator that needs to pass around a mutable integer.
public IEnumerable<T> Foo(ref int valueThatMeansSomething)
{
// Stuff
yield return ...;
}
This nets me "Error 476 Iterators cannot have ref or out parameters".
What I need is this integer value to be modified in the iterator and usable by the caller of the iterator. In other words, whatever calls Foo() above wants to know the end value of valueThatMeansSomething and Foo() may use it itself. Really, I want an integer that is a reference type not a value type.
Only thing I can think of is to write a class that encapsulates my integer and permits me to modify it.
public class ValueWrapper<T>
where T : struct
{
public ValueWrapper(T item)
{
this.Item = item;
}
public T Item { get; set; }
}
So:
ValueWrapper<int> w = new ValueWrapper<int>(0);
foreach(T item in Foo(w))
{
// Do stuff
}
if (w.Item < 0) { /* Do stuff */ }
Is there any class or mechanism to handle this already in the BCL? Any flaws with ValueWrapper<T> proposed above?
(My actual use is more complicated than the example above so handling the variable inside my foreach loop that calls Foo() is not an option. Period.)
If you only need to write the value then another technique would be:
public IEnumerable<whatever> Foo(Action<int> setter) { ... }
int value = 0;
foreach(var item in Foo(x => {value=x;})) { ... }
Coincidentally, I'll be doing a series on the reasons why there are so many goofy restrictions on iterator blocks in my blog in July. "Why no ref parameters?" will be early in the series.
http://blogs.msdn.com/ericlippert/archive/tags/Iterators/default.aspx
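A runnable sketch of the setter technique above. The body of Foo, the count parameter and the SetterDemo class are invented for illustration; only the Action<int> parameter comes from the answer:

```csharp
using System;
using System.Collections.Generic;

public static class SetterDemo
{
    // Hypothetical iterator: yields 1..count and reports the running total through 'setter'.
    public static IEnumerable<int> Foo(int count, Action<int> setter)
    {
        int total = 0;
        for (int i = 1; i <= count; i++)
        {
            total += i;
            setter(total);
            yield return i;
        }
    }

    public static int Run()
    {
        int value = 0;
        foreach (var item in Foo(3, x => { value = x; }))
        {
            // 'value' holds the running total after each item is produced
        }
        return value; // 1 + 2 + 3 = 6
    }
}
```

Because iterators run lazily, the setter is only invoked as the caller enumerates; if the loop exits early, value reflects the last item actually produced.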
Nope, I'm pretty confident there's nothing existing in the BCL that can do this. Your best option is precisely what you have proposed I think. The implementation of ValueWrapper really need not be any more complicated than what you have proposed.
Of course, it's not guaranteed to be thread-safe, but if you need that you can simply convert the automatic property into a standard one with a backing variable and mark the field as volatile (to ensure the value is up-to-date at all times).
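For completeness, a self-contained sketch of the wrapper approach from the question. ValueWrapper<T> is the question's class; the body of Foo, the WrapperDemo class and the "count the negatives" behaviour are assumptions made up for this example:

```csharp
using System.Collections.Generic;

public class ValueWrapper<T> where T : struct
{
    public ValueWrapper(T item) { Item = item; }
    public T Item { get; set; }
}

public static class WrapperDemo
{
    // Hypothetical iterator: yields each input and counts the negative ones in the wrapper.
    public static IEnumerable<int> Foo(IEnumerable<int> inputs, ValueWrapper<int> negatives)
    {
        foreach (int n in inputs)
        {
            if (n < 0) { negatives.Item++; }
            yield return n;
        }
    }

    public static int CountNegatives(IEnumerable<int> inputs)
    {
        var w = new ValueWrapper<int>(0);
        foreach (int item in Foo(inputs, w))
        {
            // Do stuff with item
        }
        return w.Item; // final value, available once enumeration has finished
    }
}
```

The wrapper is only updated while the sequence is being enumerated, so reading w.Item before the foreach has run gives the initial value.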
I have long thought that the BCL really should have a class and interface something like the following:
public delegate void ActByRef<T1>(ref T1 p1);
public delegate void ActByRefRef<T1,T2>(ref T1 p1, ref T2 p2);
public interface IReadWriteActUpon<T>
{
T Value {get; set;}
void ActUpon(ActByRef<T> proc);
void ActUpon<TExtraParam>(ActByRefRef<T, TExtraParam> proc,
ref TExtraParam ExtraParam);
}
public sealed class MutableWrapper<T> : IReadWriteActUpon<T>
{
public T Value;
public MutableWrapper(T value) { this.Value = value; }
T IReadWriteActUpon<T>.Value {get {return this.Value;} set {this.Value = value;} }
public void ActUpon(ActByRef<T> proc)
{
proc(ref Value);
}
public void ActUpon<TExtraParam>(ActByRefRef<T, TExtraParam> proc,
ref TExtraParam ExtraParam)
{
proc(ref Value, ref ExtraParam);
}
}
Although many people instinctively wrap fields in auto-properties, fields often allow cleaner and more efficient code, especially when using value types. In many situations, the increased encapsulation one can gain by using properties may be worth the cost in efficiency and semantics, but when the whole purpose of a type is to be a class object whose state is completely exposed and mutable, such encapsulation is counterproductive.
The interface is included not because many users of a MutableWrapper<T> would want to use the interface instead, but rather because an IReadWriteActUpon<T> could be useful in a variety of situations, some of which would entail encapsulation, and someone who has an instance of MutableWrapper<T> might wish to pass it to code which is designed to work with data encapsulated in an IReadWriteActUpon<T> interface.
I need to validate an integer to know if it is a valid enum value.
What is the best way to do this in C#?
You got to love these folk who assume that data not only always comes from a UI, but a UI within your control!
IsDefined is fine for most scenarios, you could start with:
public static bool TryParseEnum<TEnum>(this int enumValue, out TEnum retVal)
{
retVal = default(TEnum);
bool success = Enum.IsDefined(typeof(TEnum), enumValue);
if (success)
{
retVal = (TEnum)Enum.ToObject(typeof(TEnum), enumValue);
}
return success;
}
(Obviously just drop the ‘this’ if you don’t think it’s a suitable int extension)
IMHO the post marked as the answer is incorrect.
Parameter and data validation is one of the things that was drilled into me decades ago.
WHY
Validation is required because essentially any integer value can be assigned to an enum without throwing an error.
I spent many days researching C# enum validation because it is a necessary function in many cases.
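A short sketch of why validation is needed. The Color enum and the EnumCastDemo class are hypothetical names used only for this example:

```csharp
using System;

public enum Color { Red, Green, Blue }

public static class EnumCastDemo
{
    // The cast never throws, even when no member of Color has this value.
    public static Color FromInt(int raw) => (Color)raw;

    // Non-generic check: true only when 'raw' matches a declared member.
    public static bool IsValid(int raw) => Enum.IsDefined(typeof(Color), raw);
}
```

Here EnumCastDemo.FromInt(42) succeeds and yields a Color whose numeric value is 42, while EnumCastDemo.IsValid(42) returns false.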
WHERE
The main purpose in enum validation for me is in validating data read from a file: you never know if the file has been corrupted, or was modified externally, or was hacked on purpose.
And with enum validation of application data pasted from the clipboard: you never know if the user has edited the clipboard contents.
That said, I spent days researching and testing many methods including profiling the performance of every method I could find or design.
Making calls into anything in System.Enum is so slow that it was a noticeable performance penalty on functions that contained hundreds or thousands of objects that had one or more enums in their properties that had to be validated for bounds.
Bottom line, stay away from everything in the System.Enum class when validating enum values, it is dreadfully slow.
RESULT
The method that I currently use for enum validation will probably draw rolling eyes from many programmers here, but it is imho the least evil for my specific application design.
I define one or two constants that are the upper and (optionally) lower bounds of the enum, and use them in a pair of if() statements for validation.
One downside is that you must be sure to update the constants if you change the enum.
This method also only works if the enum is an "auto" style where each enum element is an incremental integer value such as 0,1,2,3,4,.... It won't work properly with Flags or enums that have values that are not incremental.
Also note that this method is almost as fast as regular if "<" ">" on regular int32s (which scored 38,000 ticks on my tests).
For example:
public const MyEnum MYENUM_MINIMUM = MyEnum.One;
public const MyEnum MYENUM_MAXIMUM = MyEnum.Four;
public enum MyEnum
{
One,
Two,
Three,
Four
};
public static MyEnum Validate(MyEnum value)
{
if (value < MYENUM_MINIMUM) { return MYENUM_MINIMUM; }
if (value > MYENUM_MAXIMUM) { return MYENUM_MAXIMUM; }
return value;
}
PERFORMANCE
For those who are interested, I profiled the following variations on an enum validation, and here are the results.
The profiling was performed on a release build in a loop of one million iterations for each method with a random integer input value. Each test was run more than 10 times and averaged. The tick results include the total time to execute, which will include the random number generation etc., but those will be constant across the tests. 1 tick = 10ns.
Note that the code here isn't the complete test code, it is only the basic enum validation method. There were also a lot of additional variations on these that were tested, and all of them with results similar to those shown here that benched 1,800,000 ticks.
Listed slowest to fastest with rounded results, hopefully no typos.
Bounds determined in Method = 13,600,000 ticks
public static T Clamp<T>(T value)
{
int minimum = Enum.GetValues(typeof(T)).GetLowerBound(0);
int maximum = Enum.GetValues(typeof(T)).GetUpperBound(0);
if (Convert.ToInt32(value) < minimum) { return (T)Enum.ToObject(typeof(T), minimum); }
if (Convert.ToInt32(value) > maximum) { return (T)Enum.ToObject(typeof(T), maximum); }
return value;
}
Enum.IsDefined = 1,800,000 ticks
Note: this code version doesn't clamp to Min/Max but returns Default if out of bounds.
public static T ValidateItem<T>(T eEnumItem)
{
if (Enum.IsDefined(typeof(T), eEnumItem) == true)
return eEnumItem;
else
return default(T);
}
System.Enum Convert Int32 with casts = 1,800,000 ticks
public static Enum Clamp(this Enum value, Enum minimum, Enum maximum)
{
if (Convert.ToInt32(value) < Convert.ToInt32(minimum)) { return minimum; }
if (Convert.ToInt32(value) > Convert.ToInt32(maximum)) { return maximum; }
return value;
}
if() Min/Max Constants = 43,000 ticks = the winner by 42x and 316x faster.
public static MyEnum Clamp(MyEnum value)
{
if (value < MYENUM_MINIMUM) { return MYENUM_MINIMUM; }
if (value > MYENUM_MAXIMUM) { return MYENUM_MAXIMUM; }
return value;
}
-eol-
As others have mentioned, Enum.IsDefined is slow, something you have to be aware of if it's in a loop.
When doing multiple comparisons, a speedier method is to first put the values into a HashSet. Then simply use Contains to check whether the value is valid, like so:
int userInput = 4;
// below, Enum.GetValues converts enum to array. We then convert the array to hashset.
HashSet<int> validVals = new HashSet<int>((int[])Enum.GetValues(typeof(MyEnum)));
// the following could be in a loop, or do multiple comparisons, etc.
if (validVals.Contains(userInput))
{
// is valid
}
Update 2022-09-27
As of .NET 5, a fast, generic overload is available: Enum.IsDefined<TEnum>(TEnum value).
The generic overload alleviates the performance issues of the non-generic one.
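A minimal usage sketch of the generic overload, assuming .NET 5 or later. The Level enum and the GenericIsDefinedDemo class are hypothetical:

```csharp
using System;

public enum Level { Low = 1, High = 5 }

public static class GenericIsDefinedDemo
{
    // Cast the raw integer first, then ask whether that value is a declared member.
    // TEnum is inferred as Level from the argument.
    public static bool IsValidLevel(int raw) => Enum.IsDefined((Level)raw);
}
```

With this enum, IsValidLevel(1) and IsValidLevel(5) return true, while IsValidLevel(3) returns false because no member has that value.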
Original Answer
Here is a fast generic solution, using a statically-constructed HashSet<T>.
You can define this once in your toolbox, and then use it for all your enum validation.
public static class EnumHelpers
{
/// <summary>
/// Returns whether the given enum value is a defined value for its type.
/// Throws if the type parameter is not an enum type.
/// </summary>
public static bool IsDefined<T>(T enumValue)
{
if (typeof(T).BaseType != typeof(System.Enum)) throw new ArgumentException($"{nameof(T)} must be an enum type.");
return EnumValueCache<T>.DefinedValues.Contains(enumValue);
}
/// <summary>
/// Statically caches each defined value for each enum type for which this class is accessed.
/// Uses the fact that static things exist separately for each distinct type parameter.
/// </summary>
internal static class EnumValueCache<T>
{
public static HashSet<T> DefinedValues { get; }
static EnumValueCache()
{
if (typeof(T).BaseType != typeof(System.Enum)) throw new Exception($"{nameof(T)} must be an enum type.");
DefinedValues = new HashSet<T>((T[])System.Enum.GetValues(typeof(T)));
}
}
}
Note that this approach is easily extended to enum parsing as well, by using a dictionary with string keys (minding case-insensitivity and numeric string representations).
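One possible shape for that parsing extension, as a rough sketch only. EnumNameCache<T> and TryParseName are hypothetical names, and the design choices (case-insensitive names, numeric strings rejected) are assumptions rather than part of the answer above:

```csharp
using System;
using System.Collections.Generic;

public static class EnumNameCache<T> where T : struct, Enum
{
    // Built once per enum type, because static members exist separately per type argument.
    private static readonly Dictionary<string, T> ByName = Build();

    private static Dictionary<string, T> Build()
    {
        // Case-insensitive keys; members whose names differ only by case would collide here.
        var map = new Dictionary<string, T>(StringComparer.OrdinalIgnoreCase);
        foreach (string name in Enum.GetNames(typeof(T)))
        {
            map[name] = (T)Enum.Parse(typeof(T), name);
        }
        return map;
    }

    // Accepts declared member names only; numeric strings such as "3" are rejected.
    // 'text' must not be null.
    public static bool TryParseName(string text, out T value)
    {
        return ByName.TryGetValue(text, out value);
    }
}
```

Rejecting numeric strings is deliberate in this sketch, since Enum.Parse would otherwise accept any number, including ones that match no member.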
Brad Abrams specifically warns against Enum.IsDefined in his post The Danger of Oversimplification.
The best way to get rid of this requirement (that is, the need to validate enums) is to remove ways where users can get it wrong, e.g., an input box of some sort. Use enums with drop downs, for example, to enforce only valid enums.
This answer is in response to deegee's answer which raises the performance issues of System.Enum so should not be taken as my preferred generic answer, more addressing enum validation in tight performance scenarios.
If you have a mission critical performance issue where slow but functional code is being run in a tight loop then I personally would look at moving that code out of the loop if possible instead of solving by reducing functionality. Constraining the code to only support contiguous enums could be a nightmare to find a bug if, for example, somebody in the future decides to deprecate some enum values. Simplistically you could just call Enum.GetValues once, right at the start to avoid triggering all the reflection, etc thousands of times. That should give you an immediate performance increase. If you need more performance and you know that a lot of your enums are contiguous (but you still want to support 'gappy' enums) you could go a stage further and do something like:
public abstract class EnumValidator<TEnum> where TEnum : struct, IConvertible
{
protected static bool IsContiguous
{
get
{
int[] enumVals = Enum.GetValues(typeof(TEnum)).Cast<int>().ToArray();
int lowest = enumVals.Min();
int highest = enumVals.Max();
// Enumerable.Range takes (start, count), not (start, end).
return !Enumerable.Range(lowest, highest - lowest + 1).Except(enumVals).Any();
}
}
public static EnumValidator<TEnum> Create()
{
if (!typeof(TEnum).IsEnum)
{
throw new ArgumentException("Please use an enum!");
}
return IsContiguous ? (EnumValidator<TEnum>)new ContiguousEnumValidator<TEnum>() : new JumbledEnumValidator<TEnum>();
}
public abstract bool IsValid(int value);
}
public class JumbledEnumValidator<TEnum> : EnumValidator<TEnum> where TEnum : struct, IConvertible
{
private readonly int[] _values;
public JumbledEnumValidator()
{
_values = Enum.GetValues(typeof (TEnum)).Cast<int>().ToArray();
}
public override bool IsValid(int value)
{
return _values.Contains(value);
}
}
public class ContiguousEnumValidator<TEnum> : EnumValidator<TEnum> where TEnum : struct, IConvertible
{
private readonly int _highest;
private readonly int _lowest;
public ContiguousEnumValidator()
{
int[] enumVals = Enum.GetValues(typeof(TEnum)).Cast<int>().ToArray();
_lowest = enumVals.Min();
_highest = enumVals.Max();
}
public override bool IsValid(int value)
{
return value >= _lowest && value <= _highest;
}
}
Where your loop becomes something like:
//Pre import-loop
EnumValidator<MyEnum> enumValidator = EnumValidator<MyEnum>.Create();
while(import) //Tight RT loop.
{
bool isValid = enumValidator.IsValid(theValue);
}
I'm sure the EnumValidator classes could be written more efficiently (it's just a quick hack to demonstrate), but quite frankly, who cares what happens outside the import loop? The only bit that needs to be super-fast is within the loop. This was the reason for taking the abstract class route: to avoid an unnecessary if-enumContiguous-then-else in the loop (the factory Create essentially does this upfront).
You will note a bit of hypocrisy: for brevity, this code constrains functionality to int enums. I should be making use of IConvertible rather than using ints directly, but this answer is already wordy enough!
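For what it's worth, here is a rough sketch of what the IConvertible route might look like. Cast<int>() throws on enums whose underlying type is not int (the boxed value cannot be unboxed as int), whereas Convert.ToInt64 goes through IConvertible and widens the value. The names ByteBacked, EnumUnderlyingValues and AsInt64 are hypothetical, and a ulong-backed enum with values above long.MaxValue would still overflow:

```csharp
using System;
using System.Globalization;
using System.Linq;

// Hypothetical example enum with a non-int underlying type.
public enum ByteBacked : byte { Low = 1, High = 200 }

public static class EnumUnderlyingValues
{
    // Returns every defined value widened to long, whatever the underlying type.
    public static long[] AsInt64<TEnum>() where TEnum : struct, IConvertible
    {
        if (!typeof(TEnum).IsEnum)
        {
            throw new ArgumentException("Please use an enum!");
        }
        return Enum.GetValues(typeof(TEnum))
            .Cast<object>() // keep values boxed; Convert uses their IConvertible implementation
            .Select(v => Convert.ToInt64(v, CultureInfo.InvariantCulture))
            .ToArray();
    }
}
```

The validators above could then store long values and take a long in IsValid, at the cost of one conversion per call.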
Building upon Timo's answer, here is an even faster, safer and simpler solution, provided as an extension method.
public static class EnumExtensions
{
/// <summary>Whether the given value is defined on its enum type.</summary>
public static bool IsDefined<T>(this T enumValue) where T : Enum
{
return EnumValueCache<T>.DefinedValues.Contains(enumValue);
}
private static class EnumValueCache<T> where T : Enum
{
public static readonly HashSet<T> DefinedValues = new HashSet<T>((T[])Enum.GetValues(typeof(T)));
}
}
Usage:
if (myEnumValue.IsDefined()) { ... }
Update - it's now even cleaner in .NET 5:
public static class EnumExtensions
{
/// <summary>Whether the given value is defined on its enum type.</summary>
public static bool IsDefined<T>(this T enumValue) where T : struct, Enum
{
return EnumValueCache<T>.DefinedValues.Contains(enumValue);
}
private static class EnumValueCache<T> where T : struct, Enum
{
public static readonly HashSet<T> DefinedValues = new(Enum.GetValues<T>());
}
}
This is how I do it, based on multiple posts online. The reason for doing it this way is to make sure that enums marked with the Flags attribute can also be validated successfully.
public static TEnum ParseEnum<TEnum>(string valueString, string parameterName = null)
{
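var parsed = (TEnum)Enum.Parse(typeof(TEnum), valueString, true);
// Enum.Parse also accepts numeric strings that match no member. ToString() on such a value
// returns the number itself, so a result that does not parse as a number is a named value
// (or, for a Flags enum, a combination of named flags).
decimal d;
if (!decimal.TryParse(parsed.ToString(), out d))
{
return parsed;
}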
if (!string.IsNullOrEmpty(parameterName))
{
throw new ArgumentException(string.Format("Bad parameter value. Name: {0}, value: {1}", parameterName, valueString), parameterName);
}
else
{
throw new ArgumentException("Bad value. Value: " + valueString);
}
}
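To illustrate why the ToString() check matters for Flags enums, here is a small sketch. The Permissions enum and the IsNamedCombination helper are hypothetical names used only for this example. Enum.IsDefined rejects a combination such as Read | Write because no single member has the value 3, while the ToString() check accepts it:

```csharp
using System;

// Hypothetical flags enum for the demonstration.
[Flags]
public enum Permissions { None = 0, Read = 1, Write = 2, Execute = 4 }

public static class FlagsCheckDemo
{
    // True when the value is a named member or a combination of named flags.
    public static bool IsNamedCombination(Permissions value)
    {
        // For a [Flags] enum, ToString() yields names such as "Read, Write" when every
        // set bit belongs to a defined flag, and falls back to the plain number otherwise.
        return !decimal.TryParse(value.ToString(), out _);
    }
}
```

Here IsNamedCombination(Permissions.Read | Permissions.Write) is true and IsNamedCombination((Permissions)8) is false, whereas Enum.IsDefined(typeof(Permissions), Permissions.Read | Permissions.Write) returns false.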
You can use FluentValidation in your project. Here is a simple example of enum validation.
Let's create an EnumValidator class using FluentValidation:
public class EnumValidator<TEnum> : AbstractValidator<TEnum> where TEnum : struct, IConvertible, IComparable, IFormattable
{
public EnumValidator(string message)
{
RuleFor(a => a).Must(a => typeof(TEnum).IsEnum).IsInEnum().WithMessage(message);
}
}
Now that we have created our EnumValidator class, let's create a class that will use it:
public class Customer
{
public string Name { get; set; }
public Address address { get; set; }
public AddressType type { get; set; }
}
public class Address
{
public string Line1 { get; set; }
public string Line2 { get; set; }
public string Town { get; set; }
public string County { get; set; }
public string Postcode { get; set; }
}
public enum AddressType
{
HOME,
WORK
}
It's time to call our enum validator for the address type in the Customer class.
public class CustomerValidator : AbstractValidator<Customer>
{
public CustomerValidator()
{
RuleFor(x => x.type).SetValidator(new EnumValidator<AddressType>("errormessage"));
}
}
To expound on performance scaling, specifically for Timo's/Matt Jenkins's method:
Consider the following code:
//System.Diagnostics - Stopwatch
//System - ConsoleColor
//System.Linq - Enumerable
Stopwatch myTimer = Stopwatch.StartNew();
int myCyclesMin = 0;
int myCyclesCount = 10000000;
long myExt_IsDefinedTicks;
long myEnum_IsDefinedTicks;
foreach (int lCycles in Enumerable.Range(myCyclesMin, myCyclesCount))
{
Console.WriteLine(string.Format("Cycles: {0}", lCycles));
myTimer.Restart();
foreach (int _ in Enumerable.Range(0, lCycles)) { ConsoleColor.Green.IsDefined(); }
myExt_IsDefinedTicks = myTimer.ElapsedTicks;
myTimer.Restart();
foreach (int _ in Enumerable.Range(0, lCycles)) { Enum.IsDefined(typeof(ConsoleColor), ConsoleColor.Green); }
myEnum_IsDefinedTicks = myTimer.ElapsedTicks;
Console.WriteLine(string.Format("object.IsDefined() Extension Elapsed: {0}", myExt_IsDefinedTicks.ToString()));
Console.WriteLine(string.Format("Enum.IsDefined(Type, object): {0}", myEnum_IsDefinedTicks.ToString()));
if (myExt_IsDefinedTicks == myEnum_IsDefinedTicks) { Console.WriteLine("Same"); }
else if (myExt_IsDefinedTicks < myEnum_IsDefinedTicks) { Console.WriteLine("Extension"); }
else if (myExt_IsDefinedTicks > myEnum_IsDefinedTicks) { Console.WriteLine("Enum"); }
}
Output starts out like the following:
Cycles: 0
object.IsDefined() Extension Elapsed: 399
Enum.IsDefined(Type, object): 31
Enum
Cycles: 1
object.IsDefined() Extension Elapsed: 213654
Enum.IsDefined(Type, object): 1077
Enum
Cycles: 2
object.IsDefined() Extension Elapsed: 108
Enum.IsDefined(Type, object): 112
Extension
Cycles: 3
object.IsDefined() Extension Elapsed: 9
Enum.IsDefined(Type, object): 30
Extension
Cycles: 4
object.IsDefined() Extension Elapsed: 9
Enum.IsDefined(Type, object): 35
Extension
This seems to indicate there is a steep setup cost for the static hashset object (in my environment, approximately 15-20ms).
Reversing which method is called first doesn't change that the first call to the extension method (to set up the static hashset) is quite lengthy. Enum.IsDefined(typeof(T), object) is also longer than normal for the first cycle, but, interestingly, much less so.
Based on this, it appears Enum.IsDefined(typeof(T), object) is actually faster until lCycles = 50000 or so.
I'm unsure why Enum.IsDefined(typeof(T), object) gets faster at both 2 and 3 lookups before it starts rising. Clearly there's some process going on internally as object.IsDefined() also takes markedly longer for the first 2 lookups before settling in to be bleeding fast.
Another way to phrase this: if you need to do lots of lookups alongside any other remotely long activity (perhaps a file operation like an open) that adds a few milliseconds, the initial setup for object.IsDefined() will be swallowed up (especially if async) and become mostly unnoticeable. At that point, Enum.IsDefined(typeof(T), object) takes roughly 5x longer to execute.
Basically, if you don't have literally thousands of calls to make for the same Enum, I'm not sure how hashing the contents is going to save you time over your program execution. Enum.IsDefined(typeof(T), object) may have conceptual performance problems, but ultimately, it's fast enough until you need it thousands of times for the same enum.
As an interesting side note, implementing the ValueCache as a hybrid dictionary yields a startup time that reaches parity with Enum.IsDefined(typeof(T), object) within ~1500 iterations. Of course, using a HashSet passes both at ~50k.
So, my advice: if your entire program validates the same enum fewer than 1500 times (validating different enums causes the same level of startup delay, once for each different enum), use Enum.IsDefined(typeof(T), object). If you're between 1500 and 50k, use a HybridDictionary for your cache; the initial cache population is roughly 10x faster. For anything over 50k iterations, HashSet is a pretty clear winner.
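Also keep in mind that we are talking in ticks. For TimeSpan and DateTime in .NET, 10,000 ticks is 1 ms; Stopwatch.ElapsedTicks, which the code above records, is counted in units of 1/Stopwatch.Frequency seconds, so the conversion can differ from machine to machine.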
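As a hedged aside on converting these numbers: Stopwatch.ElapsedTicks is measured against Stopwatch.Frequency (ticks per second), which is not guaranteed to match the 10,000-ticks-per-millisecond unit used by TimeSpan. A small helper along these lines (TickConversion is a made-up name) does the conversion without assuming a particular frequency:

```csharp
using System;
using System.Diagnostics;

public static class TickConversion
{
    // Converts a Stopwatch.ElapsedTicks reading to milliseconds.
    // Stopwatch.Frequency is the number of stopwatch ticks per second on this machine.
    public static double StopwatchTicksToMilliseconds(long stopwatchTicks)
    {
        return stopwatchTicks * 1000.0 / Stopwatch.Frequency;
    }
}
```

Alternatively, reading myTimer.Elapsed.TotalMilliseconds directly avoids the conversion altogether.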
For full disclosure, I also tested List as a cache. It takes about 1/3 of the time to populate that a HashSet does; however, for any enum with more than 9 or so elements, lookups are way slower than with any other method. If all your enums have fewer than 9 elements, it may be the fastest approach.
The cache defined as a HybridDictionary (yes, the keys and values are the same. Yes, it's quite a bit harder to read than the simpler answers referenced above):
//System.Collections.Specialized - HybridDictionary
private static class EnumHybridDictionaryValueCache<T> where T : Enum
{
static T[] enumValues = (T[])Enum.GetValues(typeof(T));
static HybridDictionary PopulateDefinedValues()
{
HybridDictionary myDictionary = new HybridDictionary(enumValues.Length);
foreach (T lEnumValue in enumValues)
{
//Has to be unique, values are actually based on the int value. Enums with multiple aliases for one value will fail without checking.
//Check implicitly by using assignment.
myDictionary[lEnumValue] = lEnumValue;
}
return myDictionary;
}
public static readonly HybridDictionary DefinedValues = PopulateDefinedValues();
}
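For completeness, a sketch of how such a cache might be wired into an extension method, mirroring the HashSet version referenced above. The names EnumHybridExtensions and IsDefinedHybrid are illustrative only. Because HybridDictionary is non-generic, each lookup boxes the enum value, which is worth keeping in mind when comparing timings:

```csharp
using System;
using System.Collections.Specialized;

public static class EnumHybridExtensions
{
    /// <summary>Whether the given value is defined on its enum type.</summary>
    public static bool IsDefinedHybrid<T>(this T enumValue) where T : Enum
    {
        // Contains(object) boxes enumValue; boxed enums compare by type and value.
        return Cache<T>.DefinedValues.Contains(enumValue);
    }

    private static class Cache<T> where T : Enum
    {
        public static readonly HybridDictionary DefinedValues = Build();

        private static HybridDictionary Build()
        {
            T[] values = (T[])Enum.GetValues(typeof(T));
            var dictionary = new HybridDictionary(values.Length);
            foreach (T value in values)
            {
                // Assignment rather than Add, so aliases sharing one value do not throw.
                dictionary[value] = value;
            }
            return dictionary;
        }
    }
}
```

Usage is the same as before, for example ConsoleColor.Green.IsDefinedHybrid().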
I found this link that answers it quite well. It uses:
(ENUMTYPE)Enum.ToObject(typeof(ENUMTYPE), INT)
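Note that Enum.ToObject on its own only converts: it happily produces a value for an integer that matches no member. A minimal sketch that pairs the conversion with a definedness check (EnumConversion and TryConvert are hypothetical names, and the Enum constraint assumes C# 7.3 or later):

```csharp
using System;

public static class EnumConversion
{
    // Converts raw to T and reports whether the result is a defined member.
    public static bool TryConvert<T>(int raw, out T value) where T : struct, Enum
    {
        // Enum.ToObject never fails for an out-of-range integer; it just wraps the number.
        value = (T)Enum.ToObject(typeof(T), raw);
        return Enum.IsDefined(typeof(T), value);
    }
}
```

For example, TryConvert<DayOfWeek>(1, out var day) returns true with day set to Monday, while TryConvert<DayOfWeek>(99, out var bad) returns false.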
To validate if a value is a valid value in an enumeration, you only need to call the static method Enum.IsDefined.
int value = 99; // Your int value
if (Enum.IsDefined(typeof(your_enum_type), value))
{
// TODO: handle a valid value
}
else
{
// TODO: handle an invalid value
}