Can someone explain this code to me? In particular, I'm not sure how passing a generic function as a parameter works:
result.Notes= orderInfo.Notes.SafeConvert(x => (Communication.OrderNotes)x);
public static TOut[] SafeConvert<TIn, TOut>(
this TIn[] input,
Func<TIn, TOut> converter)
{
if (input == null)
{
return null;
}
return input
.Where(i => i != null)
.Select(converter)
.ToArray();
}
SafeConvert is a generic extension method. The first parameter (an array of the generic type TIn) is implicitly supplied when the method is invoked on an array of some type (in this case maybe a note?). The method also requires a function as a parameter. This function must take an instance of the type TIn and return a TOut instance. So, you'd invoke this method on an array of some type, supply a lambda expression or a delegate, and it will return an array of whatever type your supplied function returns. It does this by using LINQ to filter out nulls, run each remaining item in the array through the supplied function, then return the resulting sequence as an array.
In the code you've given, it takes the Notes of orderInfo and explicitly casts each one to Communication.OrderNotes.
Here's another way you could invoke the method.
var decimals = new [] {5, 3, 2, 1}.SafeConvert(someInt => (decimal) someInt);
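To make the behaviour concrete, here is a minimal, self-contained sketch. SafeConvert is copied from the question; the ArrayExtensions and Program class names and the sample data are my own assumptions for illustration.

```csharp
using System;
using System.Linq;

public static class ArrayExtensions
{
    // Same shape as the method in the question.
    public static TOut[] SafeConvert<TIn, TOut>(
        this TIn[] input,
        Func<TIn, TOut> converter)
    {
        if (input == null)
        {
            return null;
        }
        return input
            .Where(i => i != null)   // drop null elements
            .Select(converter)       // convert each remaining element
            .ToArray();
    }
}

public static class Program
{
    public static void Main()
    {
        // TIn is inferred as string, TOut as int.
        string[] words = { "a", null, "abc" };
        int[] lengths = words.SafeConvert(w => w.Length);
        Console.WriteLine(string.Join(", ", lengths)); // 1, 3

        // Extension methods can be called on a null reference;
        // here the null check inside SafeConvert returns null.
        string[] missing = null;
        Console.WriteLine(missing.SafeConvert(w => w.Length) == null); // True
    }
}
```

Note that the null element never reaches the converter, so w.Length cannot throw here.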
This is what's known as an extension method. It's a static function that allows you to "add" methods to types without modifying the original code. It's somewhat analogous to the Decorator Pattern but there's controversy about whether it's actually an implementation of that particular pattern.
"Under the hood," at least, extension methods are just "syntactic sugar" for calling a static method, but you can call them as if they were an instance method of the extended object (in this case, arrays).
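As a small sketch of that "syntactic sugar" point, the two calls below should be equivalent. It uses the built-in Enumerable.Count extension method; the Program class and the sample array are just for illustration.

```csharp
using System;
using System.Linq;

public static class Program
{
    public static void Main()
    {
        int[] numbers = { 5, 3, 2, 1 };

        // Extension-method syntax: looks like an instance method on the array...
        int viaExtensionSyntax = numbers.Count();

        // ...but it is an ordinary static method call on the Enumerable class.
        int viaStaticCall = Enumerable.Count(numbers);

        Console.WriteLine(viaExtensionSyntax == viaStaticCall); // True
    }
}
```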
The <TIn, TOut> part means that TIn and TOut are types you haven't specified yet (but that you intend to specify, or let the compiler infer, when you call the method). To understand the purpose of this, think of a Linked List: the implementation of a Linked List of integers isn't any different from the code for a Linked List of strings, so you'd like to be able to create a single class and specify later that you want a list of integers or a list of strings or whatever. You definitely would not want to have to create an implementation for every single possible type of object, as that would require a massive amount of redundant code.
Now, for the LINQ query:
return input
.Where(i => i != null)
.Select(converter)
.ToArray();
LINQ (Language Integrated Query) is a mechanism for querying different types of collections using a single syntax. You can use it to query .NET collections (like they're doing here), XML documents, or databases, for example.
Each LINQ operator typically takes a function (often an anonymous one) and applies it to the collection in a way specific to that operator (see below).
Going through this query:
.Where(i => i != null)
As the name suggests, Where applies a filter. When applied to a collection, it returns a second collection with all of the elements of the first collection that match the filter condition. The
i => i != null
bit is the anonymous function that acts as a filter. Basically, what this is saying is "give me a collection with all of the members of the array that aren't null."
The Select method applies a transform to every element of the collection and returns the result as a second collection. In this case, you apply whatever transformation you passed in as an argument to the method.
It might sound slightly odd to think of code as data, but this is actually very routine in some other languages like F#, Lisp, and Scala. ("Under the hood", C# is implementing this behavior in an object-oriented way, but the effect is the same).
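A rough sketch of "code as data": the functions below are stored in ordinary variables and handed to Select like any other value. The variable names and sample input are assumptions for illustration.

```csharp
using System;
using System.Linq;

public static class Program
{
    public static void Main()
    {
        // Two functions held in variables, like any other value.
        Func<int, decimal> toDecimal = someInt => (decimal)someInt;
        Func<int, int> doubleIt = someInt => someInt * 2;

        int[] input = { 5, 3, 2, 1 };

        // Select neither knows nor cares which transformation it was given.
        Console.WriteLine(string.Join(" ", input.Select(toDecimal))); // 5 3 2 1
        Console.WriteLine(string.Join(" ", input.Select(doubleIt)));  // 10 6 4 2
    }
}
```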
The basic idea of this function, then, is that it converts an array of one type to an array of a second type, filtering out all of the null references.
Related
I may be stupid, but what is the difference between Contains and Contains<> in Visual Studio's IntelliSense? Sometimes I get both, sometimes only the one with <>.
The thing is that I am trying to use Contains inside a Where, as in some solutions found here on SO, but it throws an error saying that the best overloaded method has some invalid arguments (the method is System.Linq.ParallelEnumerable.Contains<TSource>(...)).
My code is like this:
defaultDL = db.SomeEntity
.Where(dl => dl.Something == this.Something
&& (dl.AllLocation == true || this.SomeOtherEntity.Select(loc => loc.Location).Contains(dl.Location)))
.ToList();
If you navigate to the definition of the System.Linq.Enumerable.Contains method, you will see that it is declared as a generic extension method.
public static bool Contains<TSource>(this IEnumerable<TSource> source, TSource value);
The reason it is sometimes called with <type> arguments and sometimes without is that most of the time the compiler will analyze its arguments and determine the type automatically. So under the hood, it will be rewritten to an explicit generic call.
Like
someCollection.Contains(someValue);
actually is being compiled to
Enumerable.Contains<CollectionInnerType>(someCollection, someValue);
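As a sketch of the three spellings of the same call (inferred, explicit type argument, and plain static call), assuming a simple sequence of ints:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Program
{
    public static void Main()
    {
        IEnumerable<int> values = new[] { 1, 2, 3 };

        bool inferred = values.Contains(2);                     // TSource inferred as int
        bool explicitArgs = values.Contains<int>(2);            // TSource spelled out
        bool staticCall = Enumerable.Contains<int>(values, 2);  // no extension syntax at all

        Console.WriteLine(inferred && explicitArgs && staticCall); // True
    }
}
```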
LINQ has an extension method Contains<>. When you use it, you can either supply the type parameters or leave them out. If you leave them out, the C# compiler will try to infer them.
Some other enumerable classes (e.g. List<>) implement their own Contains method.
So, when IntelliSense suggests the Contains<> method, it is the implementation from LINQ; when it suggests Contains, it is the concrete class's own implementation.
As for the difference in implementation: the class's own implementation seems to be faster than the LINQ one, because the LINQ implementation is more abstracted from the concrete class.
There are many possibilities. But here are the most common.
I'm guessing SomeOtherEntity is a reference to an ICollection<T>. Contains is a standard method on ICollection<T> that scans the collection in memory for an equal element (how equality is determined depends on the implementation). You can read about that here.
There also is the Contains<T> which comes from LINQ. It is an extension method. It works on IEnumerable<T> which ICollection<T> is derived from. You can read about this one here.
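Here is a small sketch of how the two end up being picked on a List<string>. The sample names are made up; the two-argument overload taking an IEqualityComparer<T> exists only on the LINQ side, so that call can only bind to the extension method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Program
{
    public static void Main()
    {
        var names = new List<string> { "alice", "bob" };

        // Binds to the instance method List<string>.Contains.
        bool instance = names.Contains("bob");

        // An explicit type argument forces the LINQ extension method.
        bool linq = names.Contains<string>("bob");

        // Only LINQ offers an overload that takes an equality comparer.
        bool ignoreCase = names.Contains("ALICE", StringComparer.OrdinalIgnoreCase);

        Console.WriteLine(instance);   // True
        Console.WriteLine(linq);       // True
        Console.WriteLine(ignoreCase); // True
    }
}
```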
They have the following basic difference.
Contains<T> is an extension method while Contains is not.
Both return a bool that tells you whether the item is present or not; neither returns an IEnumerable<T>. The LINQ Contains<T> additionally has an overload that accepts an IEqualityComparer<T>. If you want to pass a delegate that filters on a condition and get an IEnumerable<T> back, that is what Where is for.
I want to write a generic method along the following lines:
public IEnumerable<T> S<T> (List<T> source)
{
//dosomething with source
if (someCondition)
yield return null;
else
yield return someNonNullItem;
}
T can be a value type (e.g. int), a nullable type (e.g. int?), or a ref type (e.g. string). In all the three cases, I want to have the ability to return a null value.
The //dosomething block is pretty generic, involves shifting things around, and can be used with all types with no modification. Similarly, the (someCondition) boolean check does not have any type dependency.
Some considerations:
I cannot use default(T) where T is a non-nullable value type (e.g. default(T) where T is int won't work). An explicit representation of null is required.
I don't want to convert from T to T? if I can avoid it since the source list can be quite long (millions of items).
At present, I'm stuck with having to write three functions, and one of them has to have a different name (because the type constraints are not deemed part of the method signature). The three functions have identical bodies (not shown for brevity).
public IEnumerable<T?> S<T>(List <T> source) where T:struct
{
}
public IEnumerable<T?> S<T>(List <T?> source) where T : struct
{
}
public IEnumerable<T> S4Ref<T>(List <T> source) where T : class
{
}
In the first two methods, I need the T:struct constraint to be able to return the Nullable. In the third method: (a) I need a new name, S4Ref, to avoid clashing with the first method, and, (b) I need the T:class constraint to be able to return a null.
In reality, there are numerous such S methods I have to write, and if I follow the above approach, I'll have to write three versions for each of them. I'll also turn them into extension methods for List<T>.
Questions:
Is there a way to have a single generic function that does this? Or at least reduce from 3 to 2 methods?
If not, what is the best way to eliminate duplication in the function bodies?
At present I'm veering towards using T4 templates to address this.
You've set yourself up with conflicting constraints. In the problem definition, you say you want a function that works on value types (e.g. int) but is able to return null without turning the result into a nullable type. As a design principle, I don't expect a collection of things to include something that doesn't exist. Yeah, I said that. To me, a null item means the item doesn't exist, and yet you are returning it as an element. If the purpose of function S is to filter items, it would be better to skip the item that doesn't match and return the next one.
For example, a typical use of IEnumerable is code that looks like this:
List<int> myList = .... // somehow fill the list
foreach(int element in S(myList))
{
// do something with the int element. What should I expect to do with a null
// even if you could return one? The only thing I could reasonably do here is
// skip it. I have no idea which element of myList the null corresponds to
}
In other words, even if the foreach looks like this:
foreach(int? element in S(myList))
I'm still stuck as to what to do with a null value for element. There is still no context to know which element of myList caused a null from function S.
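If skipping is acceptable, a single generic iterator covers value types, nullable types and reference types without any constraints. The following is only a sketch of that idea: the keep predicate is a hypothetical stand-in for the question's someCondition, and the real //dosomething logic is not shown.

```csharp
using System;
using System.Collections.Generic;

public static class Program
{
    // `keep` is a hypothetical predicate standing in for someCondition.
    public static IEnumerable<T> S<T>(List<T> source, Func<T, bool> keep)
    {
        foreach (T item in source)
        {
            if (keep(item))
            {
                // Non-matching items are skipped, never replaced by null.
                yield return item;
            }
        }
    }

    public static void Main()
    {
        var ints = new List<int> { 1, 2, 3, 4 };
        var names = new List<string> { "a", "bb", "ccc" };

        // The same method body serves a value type and a reference type.
        Console.WriteLine(string.Join(",", S(ints, n => n % 2 == 0)));    // 2,4
        Console.WriteLine(string.Join(",", S(names, s => s.Length > 1))); // bb,ccc
    }
}
```

This sidesteps the T versus T? problem, at the cost of losing the one-to-one correspondence between input and output positions.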
This must be a duplicate but I haven't found it. I've found this question which is related since it answers why it's recommended to use a method group instead of a lambda.
But how do I use an existing method group instead of a lambda if the method is not in the current class and the method is not static?
Say I have a list of ints which I want to convert to strings. I can use List.ConvertAll, but I need to pass a Converter<int, string> to it:
List<int> ints = new List<int> { 1 };
List<string> strings = ints.ConvertAll<string>(i => i.ToString());
This works, but it creates an unnecessary anonymous method with the lambda. If Int32.ToString were static and took an int, I could write:
List<string> strings = ints.ConvertAll<string>(Int32.ToString);
But that doesn't compile, of course. So how can I use a method group anyway?
If I created an instance method like this
string FooInt(int foo)
{
return foo.ToString();
}
I could use strings = ints.ConvertAll<string>(FooInt);, but that is not what I want. I don't want to create a new method just to be able to use an existing one.
There is a static method in the framework that can be used to convert any built-in data type into a string, namely Convert.ToString:
List<int> ints = new List<int> { 1 };
List<string> strings = ints.ConvertAll<string>(Convert.ToString);
Since the signature of Convert.ToString is also known, you can even eliminate the explicit target type parameter:
var strings = ints.ConvertAll(Convert.ToString);
This works. However, I'd also prefer the lambda-expression, even if ReSharper tells you something different. ReSharper sometimes optimizes too much imho. It prevents developers from thinking about their code, especially in the aspect of readability.
Update
Based on Tim's comment, I will try to explain the difference between lambda and static method group calls in this particular case. Therefore, I first took a look at the mscorlib disassembly to figure out exactly how int-to-string conversion works. The Int32.ToString method calls an external method within the Number class of the System namespace:
[__DynamicallyInvokable, TargetedPatchingOptOut("Performance critical to inline across NGen image boundaries"), SecuritySafeCritical]
public string ToString(IFormatProvider provider)
{
return Number.FormatInt32(this, null, NumberFormatInfo.GetInstance(provider));
}
The static Convert.ToString member does nothing more than call ToString on the parameter:
[__DynamicallyInvokable]
public static string ToString(int value)
{
return value.ToString(CultureInfo.CurrentCulture);
}
Technically there would be no difference if you wrote your own static member or extension, like you did in your question. So what's the difference between those two lines?
ints.ConvertAll<string>(i => i.ToString());
ints.ConvertAll(Convert.ToString);
Again, technically, there is no difference. The first example creates an anonymous method that accepts an integer and returns a string. Using the integer instance, it calls its ToString member. The second one does the same, except that the method is not anonymous but an existing member of the framework.
The only difference is that the second line is shorter and saves the compiler a few operations.
But why can't you call the non-static ToString directly?
Let's take a look into the ConvertAll-method of List:
public List<TOutput> ConvertAll<TOutput>(Converter<T, TOutput> converter)
{
if (converter == null)
{
ThrowHelper.ThrowArgumentNullException(ExceptionArgument.converter);
}
List<TOutput> list = new List<TOutput>(this._size);
for (int i = 0; i < this._size; i++)
{
list._items[i] = converter(this._items[i]);
}
list._size = this._size;
return list;
}
The list iterates over each item, calls the converter with the item as an argument and copies the result into a new list, which it returns in the end.
So the only relation here is your converter, which gets called explicitly. If you could pass Int32.ToString to the method, the compiler would have to decide to call this._items[i].ToString() within the loop. In this specific case it would work, but that's "too much intelligence" for the compiler. The type system does not support such code conversions. Instead the converter is an object describing a method that can be called from the scope of the callee. This is either an existing static method, like Convert.ToString, or an anonymous expression, like your lambda.
What causes the differences in your benchmark results?
That's hard to guess. I can imagine two factors:
Evaluating lambdas may result in runtime-overhead.
Framework calls may be optimized.
The last point especially means that the JIT compiler may be able to inline the call, which results in better performance. However, those are just assumptions of mine. If anyone could clarify this, I'd appreciate it! :)
You hit the nail on the head yourself:
This works, but it creates an unnecessary anonymous method with the
lambda.
You can't do what you're asking for because there is no appropriate method group that you can use so the anonymous method is necessary. It works in that other case because the implicit range variable is passed to the delegate created by the method group. In your case, you need the method to be called on the range variable. It's a completely different scenario.
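To illustrate the distinction, here is a sketch in which a non-static method from another class can be used as a method group, because the list element is passed as the argument rather than being the object the method is called on. PrefixFormatter is a hypothetical class invented for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, not part of the framework.
public sealed class PrefixFormatter
{
    private readonly string _prefix;

    public PrefixFormatter(string prefix)
    {
        _prefix = prefix;
    }

    // Instance method: the int is the *argument*, the formatter is the target.
    public string Format(int value)
    {
        return _prefix + value;
    }
}

public static class Program
{
    public static void Main()
    {
        var ints = new List<int> { 1, 2 };
        var formatter = new PrefixFormatter("#");

        // Works: the delegate is bound to `formatter`; each int is passed in.
        List<string> viaMethodGroup = ints.ConvertAll<string>(formatter.Format);

        // i.ToString() needs each int as the target, so a lambda is required.
        List<string> viaLambda = ints.ConvertAll(i => i.ToString());

        Console.WriteLine(string.Join(",", viaMethodGroup)); // #1,#2
        Console.WriteLine(string.Join(",", viaLambda));      // 1,2
    }
}
```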
I am trying to understand specifically why it is necessary to have this Where<TSource>.
What does the type straight after Where tell you?
I understand the 'this' concept, which means it's an extension method, but I cannot understand the type after Where.
public static IEnumerable<TSource> Where<TSource>(
this IEnumerable<TSource> source,
Func<TSource, bool> predicate
)
Func<TSource, bool> is a delegate type: it refers to a function which takes a TSource as its parameter and returns a bool. For example, if you had the following function:
public bool Foo(SomeType abc)
{
return abc.SomeProperty == "123";
}
you could pass it as argument to the Where method if you had a list of SomeType:
SomeType[] values = ...
var result = values.Where(Foo);
You could also use an anonymous function, which spares you the need to declare another function explicitly:
SomeType[] values = ...
var result = values.Where(x => x.SomeProperty == "123");
UPDATE:
I seem to have misunderstood the question. The type after the name of the function Where<TSource> indicates a generic function definition. It indicates that this function has a generic argument which can be of any type. So for example when you write:
SomeType[] values = ...
var result = values.Where(x => x.SomeProperty == "123");
TSource equals SomeType, and the compiler is capable of inferring it automatically from the type of values. You could specify it explicitly, but that is more typing than necessary:
SomeType[] values = ...
IEnumerable<SomeType> result = values.Where<SomeType>(x => x.SomeProperty == "123");
The type in <...> after Where is a declaration of generic type parameter. The Where method is generic which means that some of the types involved in its type declaration can be provided when the method is used. In C#, this is called generics.
The <...> is the place where you declare types that the caller needs to specify when using your method. It is, in some ways, similar to the declaration of parameters:
When using parameters, you say that the caller needs to give you some values and you give them names (e.g. source and predicate).
When writing generic method, you say that the caller needs to give you some type and you give the type a name (e.g. TSource) that you can use in the method declaration and body.
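A minimal sketch of that analogy, using a made-up generic method FirstOrFallback: T is named in the <...> list, used in the signature and the body, and supplied (or inferred) at each call site just as the ordinary arguments are.

```csharp
using System;
using System.Collections.Generic;

public static class Program
{
    // Hypothetical example method: T is a type "parameter",
    // source and fallback are ordinary value parameters.
    public static T FirstOrFallback<T>(IEnumerable<T> source, T fallback)
    {
        foreach (T item in source)
        {
            return item; // first element, if there is one
        }
        return fallback;
    }

    public static void Main()
    {
        // Type argument given explicitly...
        string s = FirstOrFallback<string>(new List<string>(), "none");

        // ...or inferred by the compiler from the arguments (T is int).
        int n = FirstOrFallback(new[] { 7, 8 }, 0);

        Console.WriteLine(s); // none
        Console.WriteLine(n); // 7
    }
}
```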
A simple answer to your question would be: for type safety and IntelliSense while programming (all the benefits that generics provide).
LINQ to Objects works through extension methods defined in a type named Enumerable. Mostly, they deal with the type IEnumerable<T>.
So, for example, when you want it to operate on a List<String>, Where<TSource> becomes Where<String>. C# 3.0 type inference then comes into the picture, so you don't have to specify the String part explicitly, since the compiler knows you are working with an IEnumerable<String>.
To summarize, when you use the Where method on a List<String>, the method expects an IEnumerable<String> as the input sequence and a predicate that takes a String to filter on, and it produces an IEnumerable<String> as output.
So, an odd situation that I ran into today with OrderBy:
Func<SomeClass, int> orderByNumber =
currentClass =>
currentClass.SomeNumber;
Then:
someCollection.OrderBy(orderByNumber);
This is fine, but I was going to create a method instead because it might be usable somewhere else other than an orderBy.
private int ReturnNumber(SomeClass currentClass)
{
return currentClass.SomeNumber;
}
Now when I try to plug that into the OrderBy:
someCollection.OrderBy(ReturnNumber);
It can't infer the type like it can if I use a Func. It seems to me they should be the same, since the method itself is "strongly typed" like the Func.
Side Note: I realize I can do this:
Func<SomeClass, int> orderByNumber = ReturnNumber;
This could also be related to "return-type type inference" not working on Method Groups.
Essentially, in cases (like Where's predicate) where the generic parameters are only in input positions, method group conversion works fine. But in cases where the generic parameter is a return type (like Select or OrderBy projections), the compiler won't infer the appropriate delegate conversion.
ReturnNumber is not a method - instead, it represents a method group containing all methods with the name ReturnNumber but with potentially different arity-and-type signatures. There are some technical issues with figuring out which method in that method group you actually want in a very generic and works-every-time way. Obviously, the compiler could figure it out some, even most, of the time, but a decision was made that putting an algorithm into the compiler which would work only half the time was a bad idea.
The following works, however:
someCollection.OrderBy(new Func<SomeClass, int>(ReturnNumber))
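Two other spellings that avoid relying on inference from the method group are sketched below: supplying the type arguments explicitly, or wrapping the call in a lambda. SomeClass and ReturnNumber follow the question; the sample data and the Program class are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class SomeClass
{
    public int SomeNumber { get; set; }
}

public static class Program
{
    public static int ReturnNumber(SomeClass currentClass)
    {
        return currentClass.SomeNumber;
    }

    public static void Main()
    {
        var someCollection = new List<SomeClass>
        {
            new SomeClass { SomeNumber = 3 },
            new SomeClass { SomeNumber = 1 },
        };

        // Explicit type arguments: nothing is left for the compiler to infer.
        var explicitArgs = someCollection.OrderBy<SomeClass, int>(ReturnNumber);

        // A lambda wrapper: TKey is inferred from the lambda's return type.
        var viaLambda = someCollection.OrderBy(c => ReturnNumber(c));

        Console.WriteLine(string.Join(",", explicitArgs.Select(c => c.SomeNumber))); // 1,3
        Console.WriteLine(string.Join(",", viaLambda.Select(c => c.SomeNumber)));    // 1,3
    }
}
```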