Significance of AsEnumerable? - c#

var query =
    from dt1 in dtStudent.AsEnumerable()
    join dt2 in dtMarks.AsEnumerable()
        on dt1.Field<int>("StudentID")
        equals dt2.Field<int>("StudentID")
    select new StudentMark
    {
        StudentName = dt1.Field<string>("StudentName"),
        Mark = dt2.Field<int>("Mark")
    };
In the code above, what is the significance of AsEnumerable? If AsEnumerable didn't exist in the .NET Framework, how would developers perform the above task?

Assuming I'm interpreting it correctly, it's calling DataTableExtensions.AsEnumerable(). Without that (or something similar), you can't use LINQ to Objects, as DataTable doesn't implement IEnumerable<T>; only its Rows collection is enumerable, and that only through the non-generic IEnumerable.
Note that an alternative would be to use dtStudent.Rows.Cast<DataRow>(), but that would be subtly different, as Cast would use the row collection's GetEnumerator directly, whereas I believe EnumerableRowCollection<TRow> does slightly more funky things with the data table. It's unlikely to show up any real changes, except possibly a slight performance difference.
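To make that alternative concrete, here is a minimal sketch of the same join written without AsEnumerable, going through each table's Rows collection and Cast<DataRow>() instead. The sample rows, the shape of StudentMark, and the helper names (BuildStudents, BuildMarks, Join) are assumptions made up for illustration; the column indexer is used instead of Field<T> so the sketch needs nothing from DataSetExtensions.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public class StudentMark
{
    public string StudentName { get; set; }
    public int Mark { get; set; }
}

public static class JoinWithoutAsEnumerable
{
    // Hypothetical sample data standing in for dtStudent.
    public static DataTable BuildStudents()
    {
        var table = new DataTable("Student");
        table.Columns.Add("StudentID", typeof(int));
        table.Columns.Add("StudentName", typeof(string));
        table.Rows.Add(1, "Ann");
        table.Rows.Add(2, "Bob");
        return table;
    }

    // Hypothetical sample data standing in for dtMarks.
    public static DataTable BuildMarks()
    {
        var table = new DataTable("Marks");
        table.Columns.Add("StudentID", typeof(int));
        table.Columns.Add("Mark", typeof(int));
        table.Rows.Add(1, 90);
        table.Rows.Add(2, 75);
        return table;
    }

    public static List<StudentMark> Join(DataTable dtStudent, DataTable dtMarks)
    {
        // Rows is a non-generic collection; Cast<DataRow>() turns it into
        // an IEnumerable<DataRow> that query expressions can work with.
        var query =
            from dt1 in dtStudent.Rows.Cast<DataRow>()
            join dt2 in dtMarks.Rows.Cast<DataRow>()
                on (int)dt1["StudentID"] equals (int)dt2["StudentID"]
            select new StudentMark
            {
                StudentName = (string)dt1["StudentName"],
                Mark = (int)dt2["Mark"]
            };
        return query.ToList();
    }

    public static void Main()
    {
        foreach (var sm in Join(BuildStudents(), BuildMarks()))
            Console.WriteLine(sm.StudentName + ": " + sm.Mark);
    }
}
```

For ordinary tables this should behave the same as the AsEnumerable version; any difference would come from whatever extra work EnumerableRowCollection<TRow> does internally.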

The .AsEnumerable() extension is just shorthand for casting something that implements IEnumerable<T> to IEnumerable<T>.
So, if xs is int[], you can call xs.AsEnumerable() instead of (xs as IEnumerable<int>). It uses type inference so that you don't have to spell out the element type of xs explicitly.
Here's the code extracted by Reflector.NET:
public static IEnumerable<TSource> AsEnumerable<TSource>(
    this IEnumerable<TSource> source)
{
    return source;
}
But in this case I think I have to agree with Jon. It's probably from the System.Data.DataSetExtensions assembly.

Related

Casting to custom type, Enumerable.Cast<T> and the as keyword

This is more a question out of curiosity than necessity and came about having had to deal with Active Directory (MS) types such as SearchResultCollection (in the System.DirectoryServices namespace).
Frequently when dealing with AD in code, I find that I'm having to check values for null, Count, [0] .. whatever and convert what I get out .. all the while hoping that the underlying AD object via COM doesn't go poof etc.
After having a play about with Parallel.ForEach recently - and having to pass in an IEnumerable<T>, I thought, maybe it would be fun to see how I could cast a SearchResultCollection to an IEnumerable of my own custom type. In this type I would pull out all the values from the SearchResult object and stick them in my own, .NET managed code. Then I'd do away with the DirectoryEntry, DirectorySearcher etc. etc.
So, having already worked out that it's a good idea to do searchResultCollection.Cast<SearchResult>() in order to supply Parallel.ForEach with its source, I added an explicit operator for casting to my own type (let's just call it 'Item').
I tested this out within the Parallel.ForEach: var myItem = (Item)currentSearchResult.
Good times, my cast operator is called and it all works. I then thought it would be nice to do something like searchResultCollection.Cast<Item>(). Sadly this didn't work; it didn't hit any breakpoints in the cast operator.
I did some Googling and discovered a helpful post which Jon Skeet had answered:
IEnumerable.Cast<>
The crux of it, use .Select(...) to force the explicit cast operation. OK, but, hmmm.
I went off and disassembled System.Core -> System.Linq.Enumerable.Cast<TResult>, and noticed that this 'cast' is actually doing an 'as' keyword conversion under the hood:
public static IEnumerable<TResult> Cast<TResult>(this IEnumerable source)
{
    IEnumerable<TResult> enumerable = source as IEnumerable<TResult>;
    if (enumerable != null)
    {
        return enumerable;
    }
    if (source == null)
    {
        throw Error.ArgumentNull("source");
    }
    return CastIterator<TResult>(source);
}
I read some more and found this:
Implicit/Explicit conversion with respect to the "as" keyword
The top answer here states that 'as' doesn't invoke any conversion operators .. use a (cast). Semantically I find this a little weird, since the extension method is called cast.. Shouldn't it be casting? There will no doubt be a really good reason why this doesn't happen, anyone know what it is?
Even if it had used a cast instead of as, it still wouldn't invoke user-defined explicit operators, so it wouldn't matter. The only difference would be the type of exception thrown.
Explicit operators aren't known at all by the runtime. According to the CLR there is no way to cast your search result to Item. When the compiler notices that there is a cast that matches a given explicit operator it injects at compile time a call to that explicit operator (which is basically a static method) into the code. Once you get to runtime there is no remaining knowledge of the cast, there is simply a method call in place to handle the conversion.
Because this is how explicit operators are implemented, rather than providing knowledge to the runtime of how to do the conversion, there is no way for Cast to inject the explicit operator's call into the code. It's already been compiled. When it was compiled there was no knowledge of any explicit operator to inject, so none was injected.
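A minimal sketch of that behaviour, under stated assumptions: RawResult is a hypothetical stand-in for SearchResult, and Item carries a user-defined explicit operator like the one described in the question. Cast<Item>() fails at runtime because the CLR knows of no conversion between the two classes, while a cast written inside a Select lambda works because the compiler can bind it to the operator at compile time.

```csharp
using System;
using System.Collections;
using System.Linq;

// Hypothetical stand-in for SearchResult.
public class RawResult
{
    public string Name { get; set; }
}

public class Item
{
    public string Name { get; set; }

    // User-defined explicit conversion. The compiler turns (Item)raw
    // into a call to this static method; the CLR never sees a "cast".
    public static explicit operator Item(RawResult raw)
    {
        return new Item { Name = raw.Name };
    }
}

public static class CastDemo
{
    // True when Cast<Item>() throws, i.e. the operator was not used.
    public static bool CastThrows(IEnumerable source)
    {
        try
        {
            source.Cast<Item>().ToList();
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    // The cast inside the lambda is compiled against RawResult,
    // so the explicit operator is found and called.
    public static string[] ConvertWithSelect(IEnumerable source)
    {
        return source.Cast<RawResult>()
                     .Select(raw => ((Item)raw).Name)
                     .ToArray();
    }

    public static void Main()
    {
        IEnumerable source = new ArrayList
        {
            new RawResult { Name = "a" },
            new RawResult { Name = "b" }
        };
        Console.WriteLine(CastThrows(source));
        Console.WriteLine(string.Join(",", ConvertWithSelect(source)));
    }
}
```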
Semantically I find this a little weird, since the extension method is called cast.. Shouldn't it be casting?
It's casting each element if it needs to, within CastIterator... although using a generic cast, which won't use any explicit conversion operator you've defined anyway. You should think of the explicit conversion operator as a custom method with syntactic sugar over the top, and not something the CLR cares about in most cases.
As for the as operator: that's just used to say "If this is already a sequence of the right type, we can just return it." It's used on the sequence as a whole, not on each element.
This can actually cause problems in some slightly bizarre situations where the C# conversion and the CLR conversions aren't aligned, although I can't remember the example I first came upon immediately.
See the Cast/OfType post within Edulinq for more details.
If I understand correctly, you need a DynamicCast, which I wrote some time ago.
The runtime doesn't know about implicit and explicit conversion operators; applying them is the job of the compiler. With Enumerable.Cast<> you can't get that, because Enumerable.Cast involves casting from System.Object to Item, for which no conversion is available (you have a conversion from X to Item, not from Object to Item).
Take advantage of dynamic in .NET 4.0:
public static class DynamicEnumerable
{
    public static IEnumerable<T> DynamicCast<T>(this IEnumerable source)
    {
        foreach (dynamic current in source)
        {
            yield return (T)(current);
        }
    }
}
Use it as
var result = collection.DynamicCast<Item>();
It is casting. See the implementation of CastIterator.
static IEnumerable<TResult> CastIterator<TResult>(IEnumerable source) {
    foreach (object obj in source) yield return (TResult)obj;
}
The use of the as keyword here is only to check whether the entire collection can be cast to your target type, instead of casting item by item. If as returns something that isn't null, the entire collection is returned as-is and we skip iterating to cast every item.
For example, this trivial case returns the original collection instead of iterating over every item
int?[] collection = ...;
var castedCollection = collection.Cast<int?>();
because the as succeeds right off the bat, so there is no need to iterate over every item.
In this example, the as gives a null result and we have to use the CastIterator to go over every object
int?[] collection = ...;
var castedCollection = collection.Cast<object>();

IQueryable OrderBy with Func<TModel, TValue> : what's happening

With the code given from this question
OrderBy is not translated into SQL when passing a selector function
Func<Table1, string> f = x => x.Name;
var t = db.Table1.OrderBy(f).ToList();
The translated SQL is:
SELECT
[Extent1].[ID] AS [ID],
[Extent1].[Name] AS [Name]
FROM [dbo].[Table1] AS [Extent1]
OK.
I can understand that the code compiles: IQueryable<T> inherits from IEnumerable<T>, which has an OrderBy extension method taking a Func<TModel, TValue> as parameter.
I can understand that the ORDER BY clause is not generated in SQL, as we didn't pass an Expression<Func<TModel, TValue>> as the OrderBy parameter (the one for IQueryable).
But what happens behind the scenes? What happens to the "wrong" OrderBy method? Nothing? I can't see how and why... Any light in my night?
Because f is a delegate rather than an expression, the compiler picks the IEnumerable OrderBy extension method instead of the IQueryable one.
This then means that all the results are fetched from the database, because the ordering is then done in memory as if it were Linq to Objects. That is, in-memory, the ordering can only be done by fetching all the records.
Of course, in reality this still doesn't actually happen until you start enumerating the result - which in your case you do straight away because you eager-load the result with your call to ToList().
Update in response to your comment
It seems that your question is as much about the IQueryable/IEnumerable duality being 'dangerous' from the point of view of introducing ambiguity. It really isn't:
t.OrderBy(r => r.Field)
C# sees the lambda as an Expression<> first and foremost so if t is an IQueryable then the IQueryable extension method is selected. It's the same as a variable of string being passed to an overloaded method with a string and object overload - the string version will be used because it's the best representation.
As Jeppe has pointed out, it's actually because the immediate interface is used before inherited interfaces.
t.AsEnumerable().OrderBy(r => r.Field)
C# can't see an IQueryable any more, so it treats the lambda as a Func<A, B>, because that's its next-best representation. (This is the equivalent of only an object overload being available in my string/object analogy above.)
And then finally your example:
Func<t, string> f = r => r.Field;
t.OrderBy(f);
There is no possible way that a developer writing this code can expect this to be treated as an expression for a lower-level component to translate to SQL, unless the developer fundamentally doesn't understand the difference between a delegate and an expression. If that's the case, then a little bit of reading up solves the problem.
I don't think it's unreasonable to require a developer to do a little bit of reading before they embark on using a new technology; especially when, in MSDN's defence, this particular subject is covered so well.
I realise now that by adding this edit I've nullified the comment by @IanNewson below - but I hope it provides a compelling argument that makes sense :)
But what happens behind the scene?
Assuming db.Table1 returns a Table<Table1>, the compiler will:
Check whether Table<T> has an OrderBy method - nope
Check whether any of the base classes or interfaces it implements has an OrderBy method - nope
Start looking at extension methods
It will find both Queryable.OrderBy and Enumerable.OrderBy as extension methods which match the target type, but the Queryable.OrderBy method isn't applicable, so it uses Enumerable.OrderBy instead.
So you can think of it as if the compiler has rewritten your code into:
List<Table1> t = Enumerable.ToList(Enumerable.OrderBy(db.Table1, f));
Now at execution time, Enumerable.OrderBy will iterate over its source (db.Table1) and perform the appropriate ordering based on the key extraction function. (Strictly speaking, it will immediately return an IEnumerable<T> which will iterate over the source when it's asked for the first result.)
The queryable returns all records (hence no WHERE clause in the SQL statement), and then the Func is applied to the objects in the client's memory, through Enumerable.OrderBy. More specifically, the OrderBy call resolves to Enumerable.OrderBy because the parameter is a Func. You can therefore rewrite the statement using static method call syntax, to make it a bit clearer what's going on:
Func<Table1, string> f = x => x.Name;
var t = Enumerable.OrderBy(db.Table1, f).ToList();
The end result is that the sort specified by OrderBy is done by the client process rather than by the database server.
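The binding can be observed without a database. The sketch below is an assumption-laden stand-in: it uses an in-memory array wrapped with AsQueryable() instead of db.Table1, and the class and method names are made up for illustration. It only shows which OrderBy the compiler picks, by checking whether the result is still a queryable.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class OrderByBinding
{
    private static bool IsQueryable(object o)
    {
        return o is IQueryable;
    }

    public static bool DelegateResultIsQueryable()
    {
        IQueryable<string> source = new[] { "b", "a" }.AsQueryable();
        Func<string, string> f = x => x;
        // f is a delegate, so only Enumerable.OrderBy is applicable:
        // the result is a plain in-memory ordered sequence.
        var ordered = source.OrderBy(f);
        return IsQueryable(ordered);
    }

    public static bool ExpressionResultIsQueryable()
    {
        IQueryable<string> source = new[] { "b", "a" }.AsQueryable();
        Expression<Func<string, string>> e = x => x;
        // e is an expression tree, so Queryable.OrderBy is chosen and
        // the query provider still gets a chance to translate it.
        var ordered = source.OrderBy(e);
        return IsQueryable(ordered);
    }

    public static void Main()
    {
        Console.WriteLine(DelegateResultIsQueryable());   // False
        Console.WriteLine(ExpressionResultIsQueryable()); // True
    }
}
```

With a real LINQ to SQL or Entity Framework source, the first form is the one that pulls every row to the client before sorting.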
This answer is a kind of comment to Andras Zoltan's answer (but this is too long to fit in the comment format).
Zoltan's answer is interesting and mostly correct, except the phrase C# sees the lambda as an Expression<> first and foremost [...].
C# sees a lambda (and any anonymous function) as equally "close" to a delegate and the Expression<> (expression tree) of that same delegate. According to the C# specification, neither is a "better conversion target".
So consider this code:
class C
{
    public void Overloaded(Expression<Func<int, int>> e)
    {
        Console.WriteLine("expression tree");
    }
    public void Overloaded(Func<int, int> d)
    {
        Console.WriteLine("delegate");
    }
}
Then:
var c = new C();
c.Overloaded(i => i + 1); // will not compile! "The call is ambiguous ..."
So the reason why it works with IQueryable<> is something else. The method defined by the direct interface type is preferred over the method defined in the base interface.
To illustrate, change the above code to this:
interface IBase
{
    void Overloaded(Expression<Func<int, int>> e);
}
interface IDerived : IBase
{
    void Overloaded(Func<int, int> d);
}
class C : IDerived
{
    public void Overloaded(Expression<Func<int, int>> e)
    {
        Console.WriteLine("expression tree");
    }
    public void Overloaded(Func<int, int> d)
    {
        Console.WriteLine("delegate");
    }
}
Then:
IDerived x = new C();
x.Overloaded(i => i + 1); // compiles! At runtime, writes "delegate" to the console
As you see, the member defined in IDerived is chosen, not the one defined in IBase. Note that I reversed the situation (compared to IQueryable<>) so in my example the delegate overload is defined in the most derived interface and is therefore preferred over the expression tree overload.
Note: In the IQueryable<> case the OrderBy methods in question are not ordinary instance methods. Instead, one is an extension method to the derived interface, and the other is an extension method to the base interface. But the explanation is similar.

When is ObjectQuery really an IOrderedQueryable?

Applied to Entity Framework, the extension methods Select() and OrderBy() both return an ObjectQuery, which is defined as:
public class ObjectQuery<T> : ObjectQuery, IOrderedQueryable<T>,
IQueryable<T>, <... more interfaces>
The return type of Select() is IQueryable<T> and that of OrderBy is IOrderedQueryable<T>. So you could say that both return the same type but in a different wrapper. Luckily so, because now we can apply ThenBy after OrderBy was called.
Now my problem.
Let's say I have this:
var query = context.Plots.Where(p => p.TrialId == 21);
This gives me an IQueryable<Plot>, which is an ObjectQuery<Plot>. But it is also an IOrderedQueryable:
var b = query is IOrderedQueryable<Plot>; // True!
But still:
var query2 = query.ThenBy(p => p.Number); // Does not compile.
// 'IQueryable<Plot>' does not contain a definition for 'ThenBy'
// and no extension method 'ThenBy' ....
When I do:
var query2 = ((IOrderedQueryable<Plot>)query).ThenBy(p => p.Number);
It compiles, but gives a runtime exception:
Expression of type 'IQueryable`1[Plot]' cannot be used for parameter of type 'IOrderedQueryable`1[Plot]' of method 'IOrderedQueryable`1[Plot] ThenBy[Plot,Nullable`1](IOrderedQueryable`1[Plot], Expressions.Expression`1[System.Func`2[Plot,System.Nullable`1[System.Int32]]])'
The cast is carried out (I checked), but the parameter of ThenBy is still seen as IQueryable (which puzzles me a bit).
Now suppose some method returns an ObjectQuery<Plot> to me as IQueryable<Plot> (like Select()). What if I want to know whether it is safe to call ThenBy on the returned object? How can I figure out whether the ObjectQuery is a "real" or a "fake" IOrderedQueryable without catching exceptions?
Expression Trees are genuinely good fun! (or perhaps I'm a little bit of a freak) and will likely become useful in many a developer's future if Project Roslyn is anything to go by! =)
In your case, simply inherit from ExpressionVisitor (as documented on MSDN) and override the VisitMethodCall method in the derived class to compare m.Method with OrderBy (if you're not too fussy, simply check the name; if you want to be fussy, use reflection to grab the actual OrderBy MethodInfo to compare with).
Let me know if/what you need examples of, but honestly, after copy/pasting the ExpressionVisitor you'll probably need no more than 10 lines of non-expression-tree code ;-)
Hope that helps
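As a rough illustration of the idea, here is a simplified sketch that skips the full visitor and only inspects the outermost node of the query's expression tree. The helper name EndsWithOrdering is hypothetical, the check is by method name only, and it is an assumption that a given provider (Entity Framework included) leaves the Queryable ordering call as the outermost node; the demo uses an in-memory AsQueryable() source rather than an ObjectQuery.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class OrderingCheck
{
    // True when the last operator applied to the query is one of the
    // Queryable ordering methods, so a following ThenBy has an
    // ordering to extend.
    public static bool EndsWithOrdering(IQueryable query)
    {
        var call = query.Expression as MethodCallExpression;
        if (call == null || call.Method.DeclaringType != typeof(Queryable))
            return false;

        string name = call.Method.Name;
        return name == "OrderBy" || name == "OrderByDescending"
            || name == "ThenBy" || name == "ThenByDescending";
    }

    public static void Main()
    {
        IQueryable<int> numbers = new[] { 3, 1, 2 }.AsQueryable();

        IQueryable<int> filtered = numbers.Where(n => n > 1);
        IQueryable<int> ordered = numbers.OrderBy(n => n);

        Console.WriteLine(EndsWithOrdering(filtered)); // False
        Console.WriteLine(EndsWithOrdering(ordered));  // True
    }
}
```

A full ExpressionVisitor would be needed to look deeper into the tree, for example to find an OrderBy buried under later operators.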
Although Expression Trees are good fun, wouldn't in this case the simple solution be to use OrderBy rather than ThenBy?
OrderBy is an extension on IQueryable and returns an IOrderedQueryable.
ThenBy is an extension on IOrderedQueryable and returns an IOrderedQueryable.
So if you have a IQueryable (as in your case above, where query is an IQueryable) and you want to apply an initial ordering to it, use OrderBy. ThenBy is only intended to apply additional ordering to an already ordered query.
If you have a LINQ result of some kind, but you aren't sure whether it is an IQueryable or an IOrderedQueryable and want to apply additional ordering to it, you could write two extension methods (in a static class) like:
static IOrderedQueryable<T> ApplyAdditionalOrdering<T, TKey>(this IOrderedQueryable<T> source, Expression<Func<T, TKey>> orderBy)
{
    return source.ThenBy(orderBy);
}
And
static IOrderedQueryable<T> ApplyAdditionalOrdering<T, TKey>(this IQueryable<T> source, Expression<Func<T, TKey>> orderBy)
{
    return source.OrderBy(orderBy);
}
The compiler will figure out the correct one to call based on the compile-time type of your query object.

Returning multiple streams from LINQ query

I want to write a LINQ query which returns two streams of objects. In F# I would write a Seq expression which creates an IEnumerable of 2-tuples and then run Seq.unzip. What is the proper mechanism to do this in C# (on .NET 3.5)?
Cheers, Jurgen
Your best bet is probably to create a Pair<T1, T2> type and return a sequence of that. (Or use an anonymous type to do the same thing.)
You can then "unzip" it with:
var firstElements = pairs.Select(pair => pair.First);
var secondElements = pairs.Select(pair => pair.Second);
It's probably worth materializing pairs first though (e.g. call ToList() at the end of your first query) to avoid evaluating the query twice.
Basically this is exactly the same as your F# approach, but with no built-in support.
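A minimal sketch of that approach, using an anonymous type for the pairs and materializing them once before splitting. The input (a list of words paired with their lengths) and the Unzip helper are assumptions chosen purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnzipDemo
{
    public static void Unzip(
        IEnumerable<string> words,
        out List<string> firsts,
        out List<int> seconds)
    {
        // Build the pairs once; ToList() prevents the source query
        // from being evaluated a second time when splitting.
        var pairs = words
            .Select(w => new { First = w, Second = w.Length })
            .ToList();

        firsts = pairs.Select(p => p.First).ToList();
        seconds = pairs.Select(p => p.Second).ToList();
    }

    public static void Main()
    {
        List<string> firsts;
        List<int> seconds;
        Unzip(new[] { "one", "three" }, out firsts, out seconds);

        Console.WriteLine(string.Join(",", firsts.ToArray()));
        Console.WriteLine(string.Join(",", seconds.Select(n => n.ToString()).ToArray()));
    }
}
```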
Due to the lack of tuples in C#, you may create an anonymous type.
The syntax for this is:
someEnumerable.Select(inst => new { AnonTypeFirstStream = inst.FieldA, AnonTypeSecondStream = inst.FieldB });
This way you're not limited in the number of streams you return; you can just add a property to the anonymous type, much as you would add an element to a tuple.

C# 3.0 Func/OrderBy type inference

So odd situation that I ran into today with OrderBy:
Func<SomeClass, int> orderByNumber =
currentClass =>
currentClass.SomeNumber;
Then:
someCollection.OrderBy(orderByNumber);
This is fine, but I was going to create a method instead because it might be usable somewhere else other than an orderBy.
private int ReturnNumber(SomeClass currentClass)
{
    return currentClass.SomeNumber;
}
Now when I try to plug that into the OrderBy:
someCollection.OrderBy(ReturnNumber);
It can't infer the type like it can if I use a Func. It seems to me they should be the same, since the method itself is "strongly typed" like the Func.
Side Note: I realize I can do this:
Func<SomeClass, int> orderByNumber = ReturnNumber;
This could also be related to "return-type type inference" not working on Method Groups.
Essentially, in cases (like Where's predicate) where the generic parameters are only in input positions, method group conversion works fine. But in cases where the generic parameter is a return type (like Select or OrderBy projections), the compiler won't infer the appropriate delegate conversion.
ReturnNumber is not a method - instead, it represents a method group containing all methods with the name ReturnNumber but with potentially different arity-and-type signatures. There are some technical issues with figuring out which method in that method group you actually want in a very generic and works-every-time way. Obviously, the compiler could figure it out some, even most, of the time, but a decision was made that putting an algorithm into the compiler which would work only half the time was a bad idea.
The following works, however:
someCollection.OrderBy(new Func<SomeClass, int>(ReturnNumber))
