List<int> result1 =
(from number in list where number < 3 select number).ToList();
List<int> result2 = list.Where(n => n<3).ToList();
What's the difference between these two different statements?
The first notation is usually called "query syntax"; the second is "method syntax" (also called dot notation or lambda syntax). Both compile down to exactly the same code, but, as already mentioned, one of the two is usually more succinct. For most scenarios that is the dot notation, but query syntax really shines when joining or grouping over multiple enumerations.
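To make the equivalence concrete, here is a small sketch that writes the same filter and the same join in both notations. The class name, the method names, and the customer/order tuples are made up for illustration; only the LINQ calls themselves are real.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SyntaxComparison
{
    // Query syntax: the compiler rewrites this into a call to Where.
    public static List<int> FilterQuery(IEnumerable<int> list) =>
        (from number in list where number < 3 select number).ToList();

    // Method syntax: the same Where call, written out by hand.
    public static List<int> FilterMethod(IEnumerable<int> list) =>
        list.Where(number => number < 3).ToList();

    // A join in query syntax: both range variables (c and o) stay in scope.
    public static List<string> JoinQuery(
        IEnumerable<(int Id, string Name)> customers,
        IEnumerable<(int CustomerId, string Item)> orders) =>
        (from c in customers
         join o in orders on c.Id equals o.CustomerId
         select c.Name + ": " + o.Item).ToList();

    // The same join in method syntax: key selectors and result selector
    // are passed as separate lambdas.
    public static List<string> JoinMethod(
        IEnumerable<(int Id, string Name)> customers,
        IEnumerable<(int CustomerId, string Item)> orders) =>
        customers.Join(orders,
                       c => c.Id,
                       o => o.CustomerId,
                       (c, o) => c.Name + ": " + o.Item).ToList();
}
```

For the one-line filter the method form is shorter; for the join, the query form arguably reads closer to the intent. Which one is "more readable" remains a matter of taste.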
Also check out LINQ Query Syntax versus Method Syntax (C#):
Most queries in the introductory LINQ
documentation are written as query
expressions by using the declarative
query syntax introduced in C# 3.0.
However, the .NET common language
runtime (CLR) has no notion of query
syntax in itself. Therefore, at
compile time, query expressions are
translated to something that the CLR
does understand: method calls. These
methods are called the standard query
operators, and they have names such as
Where, Select, GroupBy, Join, Max,
Average, and so on. You can call them
directly by using method syntax
instead of query syntax.
In general, we recommend query syntax
because it is usually simpler and more
readable; however there is no semantic
difference between method syntax and
query syntax.
Nothing.
The first one uses LINQ query (comprehension) notation, while the second one uses extension method notation; they both do the same thing.
Use whatever looks more pleasing to you. :)
There is no difference. The first is just a language extension that looks similar to SQL; the compiler turns it into the same delegate-based method calls that the second form spells out.
As you have noticed, the first uses query notation and the second uses extension methods with a lambda. I'd use the second, since there is less code to maintain. If you are concerned about whether the compiled code or the performance differs, time both with a Stopwatch over, say, 100,000 runs and compare. If the compiled code is the same, the timings will be almost identical.
I am reading an advanced-level book about C#, and I am currently on this part:
Behind-the-scenes operation of the Linq query methods that implement delegate-based syntax.
So far, I have read about the Where, Select, Skip, SkipWhile, Take, and TakeWhile methods.
I also know about deferred and immediate execution, and about the iterators returned by some of these methods.
Deferred execution is a pattern of the execution model by which the
CLR ensures a value will be extracted only when it is required from
the IEnumerable-based information source. When any Linq operator
uses the deferred execution, the CLR encapsulates the related
information, such as the original sequence, predicate, or selector (if
any), into an iterator, which will be used when the information is
extracted from the original sequence using ToList method or
ForEach method or manually using the underlying GetEnumerator and
MoveNext methods in C#.
Now let's take these two examples:
IList<int> series = new List<int>() { 1, 2, 3, 4, 5, 6, 7 };
// First example
series.Where(x => x > 0).TakeWhile(x => x > 0).ToList();
// Second example
series.Where(x => x > 0).Take(4).ToList();
When I am putting breakpoints and debugging these two statements, I can see one difference.
The lambda passed to TakeWhile() executes each time an item satisfies the condition in the Where statement, but this is not the case with the Take method.
[Debugger screenshots for the first and second statements are not reproduced here.]
Could you explain why?
It's not entirely clear what you mean, but if you're asking why you hit a breakpoint in the lambda expression in TakeWhile, but you don't hit one within Take, it's just that Take doesn't accept a delegate at all - it just accepts a number. There's no user-defined code to execute while it's finding a value to return, so there's no breakpoint to hit.
In your example with TakeWhile, you've got two lambda expressions - one for Where and one for TakeWhile. So you can break into either of those lambda expressions.
It's important to understand that the Where and TakeWhile methods themselves are only called once - but the sequences they return evaluate the delegate passed to them for each value they encounter.
You might want to look at my Edulinq blog series for more details about the innards of LINQ.
Well, the condition in TakeWhile needs to be evaluated for each item, just like the one in Where, so both lambdas are called for each item.
Take(4) has no condition to evaluate per item; only Where does. So in the second example only the Where condition is evaluated, and only four times here, because the first four items all pass the filter and Take(4) then stops pulling.
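One way to check this without a debugger is to count how often each lambda runs. The sketch below reuses the series from the question; the class and method names are invented for the example, and the counters are plain captured locals.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PredicateCounting
{
    // Returns how often the Where lambda and the TakeWhile lambda were invoked.
    public static (int WhereCalls, int TakeWhileCalls) CountWithTakeWhile()
    {
        IList<int> series = new List<int> { 1, 2, 3, 4, 5, 6, 7 };
        int whereCalls = 0, takeWhileCalls = 0;

        // Building the query runs neither lambda (deferred execution).
        var query = series
            .Where(x => { whereCalls++; return x > 0; })
            .TakeWhile(x => { takeWhileCalls++; return x > 0; });

        query.ToList(); // the pull happens here
        return (whereCalls, takeWhileCalls);
    }

    // Returns how often the Where lambda was invoked when followed by Take(4).
    public static int CountWithTake()
    {
        IList<int> series = new List<int> { 1, 2, 3, 4, 5, 6, 7 };
        int whereCalls = 0;

        // Take(4) has no lambda of its own, so there is nothing else to count.
        series.Where(x => { whereCalls++; return x > 0; }).Take(4).ToList();
        return whereCalls;
    }
}
```

With this input, CountWithTakeWhile() should report seven calls for each lambda, while CountWithTake() should report four, since Take stops pulling from Where once it has its four items.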
Suppose I have the following code:
var X = XElement.Parse (@"
<ROOT>
<MUL v='2' />
<MUL v='3' />
</ROOT>
");
Enumerable.Range (1, 100)
.Select (s => X.Elements ()
.Select (t => Int32.Parse (t.Attribute ("v").Value))
.Aggregate (s, (t, u) => t * u)
)
.ToList ()
.ForEach (s => Console.WriteLine (s));
What is the .NET runtime actually doing here? Is it parsing and converting the attributes to integers each of the 100 times, or is it smart enough to figure out that it should cache the parsed values and not repeat the computation for each element in the range?
Moreover, how would I go about figuring out something like this myself?
Thanks in advance for your help.
LINQ and IEnumerable<T> are pull-based. This means that the predicates and actions that are part of the LINQ statement are, in general, not executed until values are pulled. Furthermore, the predicates and actions will execute each time values are pulled (i.e. there is no secret caching going on).
Pulling from an IEnumerable<T> is done by the foreach statement which really is syntactic sugar for getting an enumerator by calling IEnumerable<T>.GetEnumerator() and repeatedly calling IEnumerator<T>.MoveNext() to pull the values.
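As a rough illustration of that desugaring, the two methods below pull the same sequence once with foreach and once by driving the enumerator by hand. The class and method names are made up for this sketch; the compiler-generated code differs in details but follows the same shape.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ForeachDesugared
{
    // The usual way to pull every value.
    public static List<int> PullWithForeach(IEnumerable<int> source)
    {
        var result = new List<int>();
        foreach (int value in source)
        {
            result.Add(value);
        }
        return result;
    }

    // Roughly what foreach expands to: get an enumerator, call MoveNext()
    // until it returns false, read Current each time, and dispose at the end.
    public static List<int> PullManually(IEnumerable<int> source)
    {
        var result = new List<int>();
        using (IEnumerator<int> enumerator = source.GetEnumerator())
        {
            while (enumerator.MoveNext())
            {
                result.Add(enumerator.Current);
            }
        }
        return result;
    }
}
```

Passing a deferred query such as Enumerable.Range(1, 5).Where(n => n % 2 == 1) to either method gives the same list, because both trigger the same sequence of MoveNext() calls.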
LINQ operators like ToList(), ToArray(), ToDictionary() and ToLookup() wrap a foreach statement, so these methods will do a pull. The same can be said about operators like Aggregate(), Count() and First(). These methods have in common that they produce a single result that has to be created by executing a foreach statement.
Many LINQ operators produce a new IEnumerable<T> sequence. When an element is pulled from the resulting sequence, the operator pulls one or more elements from the source sequence. The Select() operator is the most obvious example, but other examples are SelectMany(), Where(), Concat(), Union(), Distinct(), Skip() and Take(). These operators don't do any caching. When the N'th element is pulled from a Select(), it pulls the N'th element from the source sequence, applies the projection function supplied and returns the result. Nothing secret going on here.
Other LINQ operators also produce new IEnumerable<T> sequences but they are implemented by actually pulling the entire source sequence, doing their job and then producing a new sequence. These methods include Reverse(), OrderBy() and GroupBy(). However, the pull done by the operator is only performed when the operator itself is pulled meaning that you still need a foreach loop "at the end" of the LINQ statement before anything is executed. You could argue that these operators use a cache because they immediately pull the entire source sequence. However, this cache is built each time the operator is iterated so it is really an implementation detail and not something that will magically detect that you are applying the same OrderBy() operation multiple times to the same sequence.
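The point that the buffer is rebuilt on every iteration can be checked by counting key selector calls. This is only a sketch with invented names; it assumes the key selector runs once per element per enumeration, which is how the standard LINQ to Objects OrderBy behaves as far as I know.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SortCounting
{
    // Reports the number of key selector calls before enumerating,
    // after enumerating once, and after enumerating a second time.
    public static (int BeforeEnumeration, int AfterFirst, int AfterSecond)
        KeySelectorCalls(int[] source)
    {
        int calls = 0;

        // Deferred: creating the ordered sequence does not touch the source.
        IEnumerable<int> ordered = source.OrderBy(x => { calls++; return -x; });
        int before = calls;

        ordered.ToList();      // first pull: source is read and sorted
        int afterFirst = calls;

        ordered.ToList();      // second pull: read and sorted again from scratch
        int afterSecond = calls;

        return (before, afterFirst, afterSecond);
    }
}
```

For a three-element array the expected result is (0, 3, 6): nothing happens until the first pull, and the second pull repeats all of the work.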
In your example the ToList() will do a pull. The action in the outer Select will execute 100 times. Each time this action is executed the Aggregate() will do another pull that will parse the XML attributes. In total your code will call Int32.Parse() 200 times.
You can improve this by pulling the attributes once instead of on each iteration:
var X = XElement.Parse (@"
<ROOT>
<MUL v='2' />
<MUL v='3' />
</ROOT>
")
.Elements ()
.Select (t => Int32.Parse (t.Attribute ("v").Value))
.ToList ();
Enumerable.Range (1, 100)
.Select (s => X.Aggregate (s, (t, u) => t * u))
.ToList ()
.ForEach (s => Console.WriteLine (s));
Now Int32.Parse() is only called 2 times. However, the cost is that a list of attribute values has to be allocated, stored and eventually garbage collected. (Not a big concern when the list contains two elements.)
Note that if you forget the first ToList() that pulls the attributes the code will still run but with the exact same performance characteristics as the original code. No space is used to store the attributes but they are parsed on each iteration.
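The 200 versus 2 figures can be verified by wrapping the parse in a counting lambda. The following sketch mirrors the two versions above; the class name, method names, and the counter are additions for the experiment, and the console output is dropped to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ParseCounting
{
    private const string Xml = "<ROOT><MUL v='2' /><MUL v='3' /></ROOT>";

    // Original shape: the inner Select is re-run for every outer element.
    public static int ParseCallsWithoutCaching()
    {
        int calls = 0;
        XElement root = XElement.Parse(Xml);

        Enumerable.Range(1, 100)
            .Select(s => root.Elements()
                .Select(t => { calls++; return int.Parse(t.Attribute("v").Value); })
                .Aggregate(s, (t, u) => t * u))
            .ToList();

        return calls;
    }

    // Improved shape: the attributes are parsed once and kept in a list.
    public static int ParseCallsWithCaching()
    {
        int calls = 0;
        List<int> factors = XElement.Parse(Xml)
            .Elements()
            .Select(t => { calls++; return int.Parse(t.Attribute("v").Value); })
            .ToList(); // materialize here; without this the parse repeats

        Enumerable.Range(1, 100)
            .Select(s => factors.Aggregate(s, (t, u) => t * u))
            .ToList();

        return calls;
    }
}
```

ParseCallsWithoutCaching() should return 200 and ParseCallsWithCaching() should return 2. Removing the first ToList() in the second method brings the count back to 200, which matches the note above.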
It has been a while since I dug through this code but, IIRC, the way Select works is to simply cache the Func you supply it and run it on the source collection one at a time. So, for each element in the outer range, it will run the inner Select/Aggregate sequence as if it were the first time. There isn't any built-in caching going on -- you would have to implement that yourself in the expressions.
If you wanted to figure this out yourself, you've got three basic options:
Compile the code and use ildasm to view the IL; it's the most accurate but, especially with lambdas and closures, what you get from IL may look nothing like what you put into the C# compiler.
Use something like dotPeek to decompile System.Linq.dll into C#; again, what you get out of these kinds of tools may only approximately resemble the original source code, but at least it will be C# (and dotPeek in particular does a pretty good job, and is free.)
My personal preference - download the .NET 4.0 Reference Source and look for yourself; this is what it's for :) You have to just trust MS that the reference source matches the actual source used to produce the binaries, but I don't see any good reason to doubt them.
As pointed out by @AllonGuralnek you can set breakpoints on specific lambda expressions within a single line; put your cursor somewhere inside the body of the lambda and press F9 and it will breakpoint just the lambda. (If you do it wrong, it will highlight the entire line in the breakpoint color; if you do it right, it will just highlight the lambda.)
I had a list of tuples where every tuple consists of two integers and I wanted to sort by the 2nd integer. After looking in the python help I got this:
sorted(myList, key=lambda x: x[1])
which is great. My question is, is there an equally succinct way of doing this in C# (the language I have to work in)? I know the obvious answer involving creating classes and specifying an anonymous delegate for the whole compare step, but perhaps there is a LINQ-oriented way as well. Thanks in advance for any suggestions.
Another way to do it in python is this
from operator import itemgetter
sorted(myList, key=itemgetter(1))
Assuming that the list of tuples has a type IEnumerable<Tuple<int, int>> (a sequence of tuples represented using Tuple<..> class from .NET 4.0), you can write the following using LINQ extension methods:
var result = myList.OrderBy(k => k.Item2);
In the code k.Item2 returns the second component of the tuple - in C#, this is a property (because accessing item by index wouldn't be type-safe in general). Otherwise, I think that the code is pretty succinct (also thanks to nice lambda function notation).
Using the LINQ query syntax, you could write it like this (although the first version is IMHO more readable and definitely more succinct):
var result = from k in myList orderby k.Item2 select k;
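For completeness, here is a small runnable sketch of the same idea, next to an in-place alternative using List<T>.Sort with a comparison lambda. The class and method names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TupleSorting
{
    // LINQ version: returns a new, sorted list and leaves the input untouched.
    // OrderBy is a stable sort, so ties keep their original relative order.
    public static List<Tuple<int, int>> SortedBySecond(IEnumerable<Tuple<int, int>> myList) =>
        myList.OrderBy(k => k.Item2).ToList();

    // In-place version: List<T>.Sort reorders the list itself.
    // Note that this sort is not guaranteed to be stable.
    public static void SortBySecondInPlace(List<Tuple<int, int>> myList) =>
        myList.Sort((x, y) => x.Item2.CompareTo(y.Item2));
}
```

Given the tuples (1, 9), (2, 3), (3, 5), both approaches produce the order (2, 3), (3, 5), (1, 9); the difference is only whether the original list is modified.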
I am looking for a parser that can operate on a query filter. However, I'm not quite sure of the terminology so it's proving hard work. I hope that someone can help me. I've read about 'Recursive descent parsers' but I wonder if these are for full-blown language parsers rather than the logical expression evaluation that I'm looking for.
Ideally, I am looking for .NET code (C#) but also a similar parser that works in T-SQL.
What I want is for something to parse e.g.:
((a=b)|(e=1))&(c<=d)
Ideally, the operators can be definable (e.g. '<' vs 'lt', '=' vs '==' vs 'eq', etc.) and we can specify function-type labels (e.g. (left(x,1)='e')). The parser loads this, obeys order of precedence (and ideally handles the lack of any brackets) and then calls back to my code with expressions to evaluate to a boolean result (e.g. 'a=b'?). I wouldn't expect the parser to understand the custom functions in the expression (though some basic ones would be useful, like string splitting). Splitting the expression (into left- and right-hand parts) would be nice.
It is preferable that the parser asks the minimum number of questions needed to work out the final result, e.g. if one side of an AND is false, there is no point evaluating the other side, and that it evaluates the easiest side first (i.e. in the above expression, 'c<=d' should be assumed to be quicker and thus evaluated first).
I can imagine that this is a lot of work to do, however, fairly common. Can anyone give me any pointers? If there aren't parsers that are as flexible as above, are there any basic parsers that I can use as a start?
Many Thanks
Lee
Take a look at this. ANTLR is a good parser generator and the linked-to article has working code which you may be able to adapt to your needs.
You could check out Irony. With it you define your grammar in C# code using a syntax which is not too far from BNF. They even have a simple example on their site (an expression evaluator) which seems to be quite close to what you want to achieve.
Edit: There's been a talk about Irony at this year's Lang.Net symposium.
Hope this helps!
Try Vici.Parser (a free download); it's the most flexible expression parser/evaluator I've found so far.
If it's possible for you, use .NET 3.5 expressions.
The compiler parses your expression for you and gives you an expression tree that you can analyze and use as you need. Not very simple, but doable (actually, all implementations of the IQueryable interface do exactly this).
You can use .NET expression trees for this. And the example is actually pretty simple.
Expression<Func<int, int, int, int, bool>> test = (int a, int b, int c, int d) => ((a == b) | (c == 1)) & (c <= d);
And then just look at "test" in the debugger. Everything is already parsed for you, you can just use it.
The only problem is that in .NET 3.5 you can have only up to 4 arguments in Func. So, I changed "e" to "c" in one place. In 4.0 this limit is changed to 16.
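Instead of looking at the tree in the debugger, you can also walk it in code. The sketch below prints the tree from the example in prefix form; the ExpressionInspector class and its Describe method are hypothetical helpers, and only a handful of node types are handled.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionInspector
{
    // The same lambda as above, captured as an expression tree by the compiler.
    public static readonly Expression<Func<int, int, int, int, bool>> Test =
        (a, b, c, d) => ((a == b) | (c == 1)) & (c <= d);

    // Renders a tree in prefix form, e.g. Equal(a, b).
    // Node types not listed here are reported by name only.
    public static string Describe(Expression e)
    {
        switch (e)
        {
            case LambdaExpression lambda:
                return Describe(lambda.Body);
            case BinaryExpression binary:
                return $"{binary.NodeType}({Describe(binary.Left)}, {Describe(binary.Right)})";
            case ParameterExpression parameter:
                return parameter.Name;
            case ConstantExpression constant:
                return constant.Value?.ToString() ?? "null";
            default:
                return e.NodeType.ToString();
        }
    }
}
```

ExpressionInspector.Describe(ExpressionInspector.Test) should yield And(Or(Equal(a, b), Equal(c, 1)), LessThanOrEqual(c, d)), and ExpressionInspector.Test.Compile() turns the tree back into a callable delegate. Note that the tree only exists because the compiler parsed C# source; this approach does not parse a filter string supplied at run time.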
In a project that I'm working on I have to work with a rather weird data source. I can give it a "query" and it will return me a DataTable. But the query is not a traditional string. It's more like... a set of method calls that define the criteria that I want. Something along these lines:
var tbl = MySource.GetObject("TheTable");
tbl.AddFilterRow(new FilterRow("Column1", 123, FilterRow.Expression.Equals));
tbl.AddFilterRow(new FilterRow("Column2", 456, FilterRow.Expression.LessThan));
var result = tbl.GetDataTable();
In essence, it supports all the standard stuff (boolean operators, parentheses, a few functions, etc.) but the syntax for writing it is quite verbose and uncomfortable for everyday use.
I wanted to make a little parser that would parse a given expression (like "Column1 = 123 AND Column2 < 456") and convert it to the above function calls. Also, it would be nice if I could add parameters there, so I would be protected against injection attacks. The last little piece of sugar on the top would be if it could cache the parse results and reuse them when the same query is to be re-executed on another object.
So I was wondering - are there any existing solutions that I could use for this, or will I have to roll out my own expression parser? It's not too complicated, but if I can save myself two or three days of coding and a heapload of bugs to fix, it would be worth it.
Try out Irony. Though the documentation is lacking, the samples will get you up and running very quickly. Irony is a project for parsing code and building abstract syntax trees, but you might have to write a little logic to create a form that suits your needs. The DLR may be the complement for this, since it can dynamically generate / execute code from abstract syntax trees (it's used for IronPython and IronRuby). The two should make a good pair.
Oh, and they're both first-class .NET solutions and open source.
Bison or JavaCC or the like will generate a parser from a grammar. You can then augment the nodes of the tree with your own code to transform the expression.
OP comments:
I really don't want to ship 3rd party executables with my soft. I want it to be compiled in my code.
Both tools generate source code, which you link with.
I wrote a parser for exactly this usage and complexity level by hand. It took about 2 days. I'm glad I did it, but I wouldn't do it again. I'd use ANTLR or F#'s Fslex.
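To give a feel for the hand-written route, here is a deliberately tiny sketch that handles only the shape "Column op integer AND Column op integer ...". It is not the parser described above: the FilterParser class is hypothetical, it has no OR, parentheses, functions or parameters, and it returns plain tuples that you would map onto the data source's own FilterRow calls yourself.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FilterParser
{
    // One token: a comparison operator or a run of word characters.
    // \G forces each match to start exactly where the previous one ended.
    private static readonly Regex Token = new Regex(@"\G\s*(<=|>=|<>|[<>=]|\w+)");
    private static readonly string[] Operators = { "=", "<", ">", "<=", ">=", "<>" };

    private static List<string> Tokenize(string text)
    {
        text = text.Trim();
        var tokens = new List<string>();
        int pos = 0;
        while (pos < text.Length)
        {
            Match m = Token.Match(text, pos);
            if (!m.Success) throw new FormatException($"Unexpected input at position {pos}");
            tokens.Add(m.Groups[1].Value);
            pos = m.Index + m.Length;
        }
        return tokens;
    }

    // Parses "Column1 = 123 AND Column2 < 456" into (column, operator, value) triples.
    public static List<(string Column, string Op, int Value)> Parse(string text)
    {
        List<string> tokens = Tokenize(text);
        var filters = new List<(string Column, string Op, int Value)>();
        int i = 0;
        while (true)
        {
            if (i + 2 >= tokens.Count) throw new FormatException("Expected: column operator value");
            string column = tokens[i], op = tokens[i + 1];
            if (Array.IndexOf(Operators, op) < 0) throw new FormatException($"Unknown operator '{op}'");
            if (!int.TryParse(tokens[i + 2], out int value))
                throw new FormatException($"Expected an integer but found '{tokens[i + 2]}'");
            filters.Add((column, op, value));
            i += 3;
            if (i == tokens.Count) return filters;
            if (!tokens[i].Equals("AND", StringComparison.OrdinalIgnoreCase))
                throw new FormatException($"Expected AND but found '{tokens[i]}'");
            i++;
        }
    }
}
```

Because values come out as typed integers rather than being spliced into a string, a result like ("Column1", "=", 123) can be handed to something like AddFilterRow directly. Anything beyond this flat shape (precedence, nesting, caching of parse results) is where a generator such as ANTLR or Irony starts to pay off.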