Is LINQ a new feature in .NET 4.0, unsupported in older versions like .NET 3.5? What is it useful for? It seems to be able to build Expression Trees. What is an Expression Tree, actually? Is LINQ able to extract info like class, method and field from a C# file?
Can someone provide me a working piece of code to demonstrate what LINQ can do?
Linq was added in .NET 3.5 (and to the C# 3.0 compiler, as well as in slightly limited form to the VB.NET compiler, in the same release).
It stands for Language Integrated Query, though achieving this required many complex additions to both the language and the runtime, which are useful in and of themselves.
The Expression functionality is, simply put, the ability for a program to inspect, at runtime, the abstract syntax of certain code constructs passed around. These constructs are called lambdas. They are, in essence, a way of writing anonymous functions more easily while making runtime introspection of their structure easier.
The 'SQL-like' functionality Linq is most closely associated with (though by no means the only one) is called Linq to Sql, whereby something like this:
from f in Foo where f.Blah == "wibble" select f.Wobble;
is compiled into a representation of this query, rather than simply code to execute the query. The part that makes it linq to sql is the 'backend' which converts it into sql. For this the expression is translated into sql server statements to execute the query against a linked database with mapping from rows to .net objects and conversion of the c# logic into equivalent where clauses. You could apply exactly the same code if Foo was a collection of plain .net objects (at which point it is "Linq to objects") the conversion of the expression would then be to straight .Net code.
The query above, written in the language integrated way, is actually the equivalent of:
Foo.Where(f => f.Blah == "wibble").Select(f => f.Wobble);
Where Foo is a typed collection. For databases classes are synthesized to represent the values in the database to allow this to both compile, and to allow round tripping values from the sql areas to the .net areas and vice versa.
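As a minimal sketch of the Linq to objects case, the two forms can be compared side by side on an in-memory list. The FooItem class and the QuerySyntaxDemo helper are hypothetical names made up for this illustration; only Where, Select and ToList come from System.Linq.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type standing in for the items of Foo.
public class FooItem
{
    public string Blah { get; set; }
    public int Wobble { get; set; }
}

public static class QuerySyntaxDemo
{
    // Language integrated (query expression) form.
    public static List<int> WithQuerySyntax(IEnumerable<FooItem> foo)
    {
        var query = from f in foo where f.Blah == "wibble" select f.Wobble;
        return query.ToList();
    }

    // Method-call form that the compiler translates the query into.
    public static List<int> WithMethodSyntax(IEnumerable<FooItem> foo)
    {
        return foo.Where(f => f.Blah == "wibble")
                  .Select(f => f.Wobble)
                  .ToList();
    }
}
```

Given a list holding ("wibble", 1), ("other", 2) and ("wibble", 3), both methods should return the values 1 and 3.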
The critical aspect of the Language Integrated part of Linq is that the resulting language constructs are first class parts of the resulting code. Rather than simply resulting in a function they provide the way the function was constructed (as an expression) so that other aspects of the program can manipulate it.
Consumers of this functionality may simply choose to run it (execute the function which the lambda is compiled to) or to ask for the expression which describes it and then do something different with it.
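A small sketch of that choice: the same lambda text can be assigned either to a delegate type (it becomes executable code) or to Expression<Func<...>> (it becomes a data structure that can be inspected and later compiled). LambdaFormsDemo and its members are illustrative names, not part of any library.

```csharp
using System;
using System.Linq.Expressions;

public static class LambdaFormsDemo
{
    // Compiled straight to executable code.
    public static readonly Func<int, bool> AsDelegate = x => x > 5;

    // Compiled to a tree of objects describing the code.
    public static readonly Expression<Func<int, bool>> AsTree = x => x > 5;

    // Inspect the tree: node kind, left operand, right operand.
    public static string Describe(Expression<Func<int, bool>> tree)
    {
        var body = (BinaryExpression)tree.Body;
        return body.NodeType + " " + body.Left + " " + body.Right;
    }

    // Or turn the tree back into a delegate and run it.
    public static bool CompileAndRun(Expression<Func<int, bool>> tree, int input)
    {
        Func<int, bool> compiled = tree.Compile();
        return compiled(input);
    }
}
```

Here Describe(AsTree) yields "GreaterThan x 5", while AsDelegate(7) and CompileAndRun(AsTree, 7) both yield true.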
Many aspects of what makes this possible are placed under the "Linq" banner despite not really being Linq themselves.
For example anonymous types are required for easy use of projection (choosing a subset of the possible properties) but anonymous types can be used outside of Linq as well.
Linq, especially via the lambdas (which make writing anonymous delegates very lightweight in terms of syntax), has led to an increase in the functional capabilities of C#. This is reinforced by the extension methods on IEnumerable<T> like Select(), corresponding to map in many functional languages, and Where(), corresponding to filter. Like the anonymous types, this is not in and of itself "Linq", though it is viewed by many as a strongly beneficial effect on C# development (this is not a universal view but is widely held).
For an introduction to Linq from microsoft read this article
For an introduction to how to use Linq-to-Sql in Visual Studio see this series from Scott Guthrie
For a guide to how you can use linq to make plain c# easier when using collections read this article
Expressions are a more advanced topic, and understanding of them is entirely unnecessary to use linq, though certain 'tricks' are possible using them.
In general you would care about Expressions only if you were attempting to write linq providers which is code to take an expression rather than just a function and use that to do something other than what the plain function would do, like talk to an external data source.
Here are some Linq Provider examples
A multi part guide to implementing your own provider
The MSDN documentation for the namespace
Other uses would be when you wish to get some metadata about what the internals of the function are doing, perhaps then compiling the expression (resulting in a delegate which lets you execute the expression as a function) and doing something with it, or just looking at the metadata of the objects to write reflective code which is compile-time verified, as this answer shows.
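One common form of that compile-time verified reflection is reading a member name out of a lambda instead of passing it as a string. The sketch below assumes a hypothetical Account class and a hypothetical MemberNames helper; it is not the code from the linked answer.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical class used only for the illustration.
public class Account
{
    public string Owner { get; set; }
    public decimal Balance { get; set; }
}

public static class MemberNames
{
    // Returns the name of the property or field accessed by the lambda.
    // Renaming the property breaks the build instead of silently leaving
    // a stale string behind.
    public static string NameOf<T, TValue>(Expression<Func<T, TValue>> accessor)
    {
        var member = accessor.Body as MemberExpression;
        if (member == null)
        {
            throw new ArgumentException(
                "Expected a simple member access such as a => a.Owner.");
        }
        return member.Member.Name;
    }
}
```

For example, MemberNames.NameOf((Account a) => a.Owner) returns "Owner". For this particular case, C# 6 and later also offer the built-in nameof operator.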
One area of this question that hasn't been covered yet is expression trees. There is a really good article on expression trees (and lambda expression) available here.
The other important thing to bring up about expression trees is that by building an expression tree to define what you are going to do, you don't have to actually do anything. I am referring to deferred execution.
//this code will only build the expression tree
var itemsInStock = from item in warehouse.Items
                   where item.Quantity > 0
                   select item;
// this code will cause the actual execution
Console.WriteLine("Items in stock: {0}", itemsInStock.Count());
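Deferred execution can be observed directly by counting how often the predicate runs. This is a sketch on an in-memory list (where the deferral comes from lazy iterators rather than from an expression tree, but the visible behaviour is the same); DeferredDemo and the sample quantities are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Returns "callsAfterBuild,count,callsAfterCount".
    public static string Run()
    {
        int predicateCalls = 0;
        var quantities = new List<int> { 0, 3, 5 };

        // Building the query does not run the predicate.
        IEnumerable<int> inStock = quantities.Where(q =>
        {
            predicateCalls++;
            return q > 0;
        });
        int callsAfterBuild = predicateCalls;

        // Count() enumerates the source, so the predicate runs once per item.
        int count = inStock.Count();
        int callsAfterCount = predicateCalls;

        return callsAfterBuild + "," + count + "," + callsAfterCount;
    }
}
```

Run() returns "0,2,3": no predicate calls after the query is built, two items in stock, and three predicate calls once Count() has enumerated the list.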
LINQ was introduced with .NET 3.5. This site has a lot of examples.
System.Linq.Expressions is for hand building (or machine generating) expression trees. I have a feeling that, given the complexity of building more complicated functionality, this namespace is underused. However, it is exceedingly powerful. For instance, one of my co-workers recently implemented an expression tree that can auto scale any LINQ to SQL object using a cumulative density function. Every column gets its own tree that gets compiled, so it's fast. I have been building a specialized compiler that uses them extensively to implement basic functionality as well as glue the rest of the generated code together.
Please see this blog post for more information and ideas.
LINQ is a .NET 3.5 feature with built-in language support from C# 3.0 and Visual Basic 2008. There are plenty of examples on MSDN.
There was a discussion on Kotlin Slack about a possibility of adding code trees to support things like C# LINQ.
In C#, LINQ has many applications, but I want to focus on only one (because the others are presumably already covered by Kotlin syntax): composing SQL queries to remote databases.
Prerequisites:
We have a data schema of an SQL database expressed somehow in the code so that static tools (or the type system) could check the correctness (at least the naming) of an SQL query
We have to generate the queries as strings
We want a syntax close to SQL or Java streams
Question: what do expression trees add to the syntax that is so crucial for the task at hand? How good an SQL builder can be without them?
How good a query dsl can be without expression trees?
As JINQ has shown, you can get very far by analyzing the bytecode to understand the intent of the developer and thus translate a predicate to SQL. So in principle expression trees are not essential to build a great looking query dsl:
val alices = database.customerStream().where { it.name == "Alice" }
Even without hackery such as bytecode analysis, it's possible to get a decent query dsl with code generation. Querydsl and JOOQ are great examples. With a little bit of Kotlin wrapping code you can then write
val alices = db.findAll(QCustomer.customer, { it.name.eq("Alice") })
How do expression trees help building query dsl
An expression tree is a structure representing some code that resolves to a value. By having such a structure generated by the compiler, one does not need bytecode analysis to understand what the code is supposed to do. Given the example
val alices = database.customerStream().where { it.name == "Alice" }
The argument to the where function would be an expression that we can inspect at runtime and translate to SQL or another query language. Because the expression trees represent code, you don't need to switch between the Kotlin and SQL paradigms to write queries. Query code expressed using linq/jinq looks pretty much the same regardless of whether it is executed in memory using POCO/POJO or in the database engine using its query language. The compiler can also do more type checking. Furthermore, it's very easy to replace the underlying database with an in-memory representation to make running tests much faster.
Further reading:
What are Expression Trees and how do you use them and why would you use them?
Practical use of expression trees
JOOQ and Querydsl:
The typical solution to ORM has been to employ either a DSL or an embedded DSL from the application logic. While great advances have been made with these schemes, culminating in JOOQ and Querydsl, there are still many caveats to such a system:
many of the paradigms the people writing these queries are used to (namely type safety) are either missing or different in key ways
the exact semantics are non-obvious: in the previous answer it is suggested that we use an extension method eq to perform a db-native equality filter. It is highly probable that a new developer will mistakenly use equals instead of eq.
this second point is compounded by the difficulty in testing: using a live connector with fake data is a notoriously difficult problem, so depending on the testing procedure, the incorrect code jooqDB.where { it.name.equals("alice") } may not be discovered until much further down the development pipeline.
Jinq
Jinq is not a first party data connector. While I think the importance of this is largely psychological, it is important nonetheless. Many projects are going to use the tools suggested by the vendors, and all of the major DB providers have Java connectors, so it is likely that most developers will simply use them.
While I have not used Jinq, it is my belief that another reason Jinq has not seen wide-spread adoption is largely because it's attempting to use a much tougher domain to solve the problem: building queries from AST's is much easier than building queries from byte code for the same reason that building the back-end of a compiler is easier than building a transcompiler. While I cannot help but tip my hat to the Jinq team for doing such an amazing job, I also cannot help but think they are hampered by their tools: building queries out of bytecode is hard. By definition, java bytecode is committed to running on the JVM, trying to retrofit that commitment for another interpreter is a very hard problem.
My current work does not permit me to use a traditional database, but if I was to switch projects, knowing that I would need a great deal of data exposure in my DAL, I would likely retreat from Kotlin and Java back to .net, largely because of Linq, rather than investigate Jinq. "Linq from Kotlin" might well change my mind.
Support from DB vendors:
The LINQ-to-SQL and LINQ-to-mongo database connectors have seen wide-spread adoption in the .net community. This is because they are first party, high quality, and act in a reasonably straightforward manner: compiling an AST to SQL (or the mongo-query-language) is at least conceptually straight forward. Many of the traditional caveats of ORM's apply, but the vendors (Microsoft and Mongo) continue to address these problems.
If Kotlin supported runtime code trees in a similar vein to Linq, and if Kotlin continues to gain traction at its current rate, then I believe the MongoDB and the Hibernate teams would be quick to start retrofitting their existing LINQ-to-X connectors to support Kotlin clients, and eventually even the bigger slower companies like Microsoft and IBM would begin to support the same flow.
Linq from Kotlin
What’s more, the exact roles the Kotlin-unique concepts of a "receiver type" and the aggressive implementation of inline might play in the Linq space is interesting. Linq-from-Kotlin might well be more effective than LINQ-from-C#.
Where C# has
someInterface
.Where(customer => customer.Name.StartsWith("A") && ComplexFunctionCallDoneByMongoDriver(customer))
.Select(customer => new ResultObject(customer.Name, customer.Age, OtherContext()))
Kotlin might be able to make advances:
someInterface
.filter { it.name.startsWith("A") && inlinedComplexFunctionCallDoneOnDB(it) }
//inlined methods would have their AST exposed -> can be run on the DB process instead of on the driver.
.map { ResultObject(name, age, otherContext()) }
//uses a receiver type, no need to specify "customer ->"
//otherContext() might also be inlined
This is off the top of my head, I suspect that smarter brains than mine can put these tools to better use.
Other uses:
It's worth mentioning that the assumption made about the applications of runtime code ASTs is false:
other [runtime AST problem domains] [are] already presumably covered by Kotlin syntax
The reason I brought this up in the first place was because I was annoyed with Kotlin's null-safety feature and its interaction with Mockito: Having spent some time researching the issue, there is no Mocking framework designed for Kotlin, only java frameworks that can be used from Kotlin, with some pain.
Some currently unsolved problems in both the java domain and the Kotlin domain:
Mocking frameworks, as above. With access to the AST all of the clever but bizarre tricks around argument-operation-order employed by Mockito become obsolete. Other more traditional mocking frameworks gain a much more intuitive and type-safe front-end.
binding expressions, for UI frameworks or otherwise, often devolve to strings. Consider a UI framework where developers could write notifyOfUIElementChange { this.model.displayName } instead of notifyOfUIElementChange("model.displayName"). The latter suffers from the crippling problem of being stale if somebody renames the property.
I'm very excited to see what the ControlsFX guys or Thomas Mikula might do with a feature such as this.
similar to Kotlin specific Linq: I suspect Kotlin’s applications here might provide for a number of tools I'm not aware of. But I'm very confident that they do exist.
I really like Linq, and I cannot help but think that with Kotlin's focus on industry problems, a Linq-from-Kotlin module would be a perfect fit and make a number of people's lives, including mine, a fair bit easier.
I'm developing some modelling software in C# which relies heavily on compiled mathematical statements. They are built at runtime using Linq Expressions giving me native performance. Performance is critical as formulas are run billions of times.
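For readers unfamiliar with the approach described above, here is a minimal sketch of building and compiling a formula with System.Linq.Expressions. The formula f(x) = x * x + 1 and the FormulaCompiler name are made up for the illustration and are not taken from the software in question.

```csharp
using System;
using System.Linq.Expressions;

public static class FormulaCompiler
{
    // Builds f(x) = x * x + 1 node by node, then compiles the tree into a
    // delegate that runs as ordinary JIT-compiled code.
    public static Func<double, double> BuildSquarePlusOne()
    {
        ParameterExpression x = Expression.Parameter(typeof(double), "x");

        Expression body = Expression.Add(
            Expression.Multiply(x, x),
            Expression.Constant(1.0));

        return Expression.Lambda<Func<double, double>>(body, x).Compile();
    }
}
```

Calling BuildSquarePlusOne()(3.0) returns 10. Compile() is comparatively expensive, so one would normally compile a formula once and keep the delegate for the repeated evaluations.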
Longer term I'm considering moving the project to Java. However Java doesn't seem to have an equivalent library.
What options can the Java platform provide for compiling mathematical statements at run-time & getting native performance?
(P.S. apparently Java 8 will support lambda expressions, but I doubt the framework will be as advanced as Linq.Expressions)
One way to do this is to compile a Java class for each different expression, with a single static method that implements the function. See this question.
So, for example, if all your functions just use one variable x, then you could generate a .java file looking like this...
class Expression {
static double apply(double x) {
return /* expression here */;
}
}
Then compile and load that class as in the cited question, get the apply() method with reflection, and call it whenever you want to compute that expression. (Alternately, have your compiled classes implement an interface, create instances of them, and avoid the reflection)
Admittedly, this is not particularly friendly to your memory usage if you end up with thousands of such classes...
Both LINQ to SQL and LINQ to Entities already have the ability to convert LINQ into a SQL text string. But I want my application to make the conversion without using the db context (which in turn means an active database connection) that both of those providers require.
I'd like to convert a LINQ expression into an equivalent SQL string(s) for WHERE and ORDER BY clauses, without a DB context dependency, to make the following repository interface work:
public interface IStore<T> where T : class
{
void Add(T item);
void Remove(T item);
void Update(T item);
T FindByID(Guid id);
//sure could use a LINQ to SQL converter!
IEnumerable<T> Find(Expression<Func<T, bool>> predicate);
IEnumerable<T> FindAll();
}
QUESTION
It is primarily the expression tree traversal and transform I am interested in. Does anyone know of an existing library (nuget?) that I can incorporate to be used in such a custom context?
As it is I've already built my own working "LINQ transformed to SQL text" tool, similar to this expression tree to SQL example which works in my above repository. It allows me to write code like this:
IStore<Person> repo = new PersonRepository();
var maxWeight = 170;
var results = repo.Find(x => (x.Age > 40 || x.Age < 20) && x.Weight < maxWeight);
But my code and that sample are primitive (and that sample itself relies on a LINQ to SQL db context). For example, neither handles generation of "LIKE" statements.
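To give an idea of the kind of traversal involved, here is a deliberately small sketch built on System.Linq.Expressions.ExpressionVisitor. It is not the tool mentioned above: WhereClauseBuilder, the Person class and the rule that column names equal property names are all assumptions made for the illustration. It handles only comparisons, && and ||, and it inlines values instead of emitting SQL parameters.

```csharp
using System;
using System.Globalization;
using System.Linq.Expressions;
using System.Text;

// Hypothetical entity; column names are assumed to equal property names.
public class Person
{
    public int Age { get; set; }
    public int Weight { get; set; }
}

// Sketch only: no LIKE, no IS NULL handling, no parameterization.
public class WhereClauseBuilder : ExpressionVisitor
{
    private readonly StringBuilder _sql = new StringBuilder();

    public static string Translate<T>(Expression<Func<T, bool>> predicate)
    {
        var builder = new WhereClauseBuilder();
        builder.Visit(predicate.Body);
        return builder._sql.ToString();
    }

    protected override Expression VisitBinary(BinaryExpression node)
    {
        _sql.Append("(");
        Visit(node.Left);
        _sql.Append(" " + ToSqlOperator(node.NodeType) + " ");
        Visit(node.Right);
        _sql.Append(")");
        return node;
    }

    protected override Expression VisitMember(MemberExpression node)
    {
        if (node.Expression is ParameterExpression)
        {
            _sql.Append(node.Member.Name); // x.Age becomes the column Age
            return node;
        }
        // Not a column (e.g. a captured local): evaluate it to a plain value.
        AppendValue(Expression.Lambda(node).Compile().DynamicInvoke());
        return node;
    }

    protected override Expression VisitConstant(ConstantExpression node)
    {
        AppendValue(node.Value);
        return node;
    }

    private void AppendValue(object value)
    {
        if (value == null) _sql.Append("NULL");
        else if (value is string) _sql.Append("'" + ((string)value).Replace("'", "''") + "'");
        else _sql.Append(Convert.ToString(value, CultureInfo.InvariantCulture));
    }

    private static string ToSqlOperator(ExpressionType type)
    {
        switch (type)
        {
            case ExpressionType.AndAlso: return "AND";
            case ExpressionType.OrElse: return "OR";
            case ExpressionType.Equal: return "=";
            case ExpressionType.NotEqual: return "<>";
            case ExpressionType.GreaterThan: return ">";
            case ExpressionType.GreaterThanOrEqual: return ">=";
            case ExpressionType.LessThan: return "<";
            case ExpressionType.LessThanOrEqual: return "<=";
            default: throw new NotSupportedException("Unsupported operator: " + type);
        }
    }
}
```

With maxWeight set to 170, WhereClauseBuilder.Translate<Person>(x => (x.Age > 40 || x.Age < 20) && x.Weight < maxWeight) produces (((Age > 40) OR (Age < 20)) AND (Weight < 170)). Anything beyond this subset (method calls such as StartsWith, null comparisons, ordering) needs further Visit overrides, which is where most of the real effort goes.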
I don't expect or need a generator-tool that handles every conceivable LINQ query. For example, I'm not worried about handling and generating joins or includes. In fact, with another ~20 hours my own custom code may cover all the cases that I care about (mostly "WHERE" and "ORDER BY" statements).
But at the same time I feel that I should not have to write my own custom code to do this. If I'm stuck writing my own, then I'd still be interested if someone could point me to specific classes I can reflect and imitate (NHibernate, EF, etc.). I'm asking about specific classes to peek at, if you know them, because I don't want to spend hours sifting through the code of a massive tool just to find the part I need.
Not that it matters, but if anyone wants to know why I'm not simply using LINQ to SQL or LINQ to Entities...for my specific application I simply prefer to use a tool such as Dapper.
USE CASES
Whether I finish building the tool myself, or find a 3rd party library, here are reasons why a "LINQ to SQL text string" would be useful:
The predicate I type into the IStore.Find method has intellisense and basic compile-time checking.
My proposed IStore interface can be implemented for DB access or web service access. To clarify, if I can convert the LINQ "WHERE/ORDER BY" predicate to a SQL "WHERE/ORDER BY" clause then...
The SQL string could be used by Dapper directly.
The SQL string, unlike a LINQ expression, can be sent to a WCF service to be used for direct DB access (which itself might not be using Dapper).
The SQL string could be deserialized, with custom code, back into a LINQ statement by the WCF service. Eric Lippert comments on this.
The UI can use IQueryable mechanics to dynamically generate a predicate to give to the repository
In short, such a tool helps fulfill the "specification" or "query object" notion of repositories according to DDD, and does so without taking a dependency on EF or LINQ to SQL.
Doing this properly is really extremely complicated, especially if right now, you don't seem to know much about expression trees (which is what IQueryable uses to represent queries).
But if you really want to get started (or just get an idea of how much work it would be), have a look at Matt Warren's 17-part series Building an IQueryable provider.
I can confirm that this is a fairly large amount of work, suited only to the most experienced .NET developers. Thorough knowledge of C# and experience with multiple languages, including T-SQL, is a must: one has to be very well versed in both C# (or VB.NET) and T-SQL in order to write a translator from the former into the latter. Additionally, this is in the realm of meta-programming, which is considered a fairly advanced branch of computer science. There is a lot of abstract thinking involved, with layers of abstract concepts stacked on each other.
If all of this isn’t a barrier, then this exercise can actually be quite enjoyable and rewarding, at least for the first month or so. One common problem I noticed in these providers is that inflexibility and questionable design choices at the start led to difficulties and hacky fixes later on. Planning as much as possible in advance, clearly understanding the whole process and its stages and components, and properly identifying layers and concerns would make this much easier to develop. The biggest mistake I saw in one provider was failing to break down the output query into its parts: select, from, where and order by. Each part should be represented by its own object throughout and then put together at the end. I explain this approach in my end-to-end tutorial on how to write a provider in the series linked below. There’s a sample project included, with a simplified tutorial variant and the full version made from scratch for a project. Finding the time to write about it was a challenge in itself.
How to write a LINQ to SQL provider in C#:
Introduction
Expression Visitor
Where Clause Visitor
Compiling Expression Trees
This is something I briefly looked into quite a while ago. You may want to have a look at http://iqtoolkit.codeplex.com/ and/or http://expressiontree.codeplex.com/ for ideas. As mentioned by others, Linq query provider building is far from trivial if you do not limit your scope to the minimum set of features you really need.
If your goals relate to "specification" or "query object" notion of repositories according to DDD, this may not be the best direction to take. Instead of CRUD like technology related abstractions, it may be more productive to focus on ways in which the behaviour of the domain can be expressed, with a minimum of direct dependencies on technology related abstractions. As Eric Evans recently discussed, he regrets the focus on the technical building blocks, such as repositories, in his initial descriptions of DDD.
I think of the C# language compiler as a self-contained black box capable of understanding text of a certain syntax and producing compiled code. On the other hand, the .NET framework is a massive library that contains functionality written partly in C# and partly in C++. So the .NET framework depends on the C# language, not the other way around.
But I cannot fit this into how LINQ works. LINQ queries are text of a particular syntax that the C# compiler can understand. But to build my own LINQ provider I need to work with interfaces like IQueryable and IQueryProvider, both of which are defined in the System.Linq namespace of the framework.
Does that mean a functionality C# language offers is dependent on a part of .NET framework? Does C# language know about .NET framework?
The .NET Framework consists of many pieces. One of the most important is the CLR, the Common Language Runtime. All .NET languages depend on it, C# included, because they produce IL code, which cannot be executed directly by the machine's processor. Instead, the CLR executes it.
And there is also the Base Class Library, BCL, which is available to every .NET language: C#, VB.NET, Managed C++, F#, IronRuby, you name it. Much of it is in fact written in C#, but it doesn't depend on any features specific to those languages, because classes and OOP are built into the CLR.
So, yes, C# language knows about .NET framework, it absolutely must know about it. Think about IEnumerable: to compile foreach into GetEnumerator(), and MoveNext() calls, C# compiler has to know that, well, IEnumerable exists. And is somewhat special.
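To make the foreach point concrete, here is a sketch of a loop next to roughly what the compiler generates for it. ForeachExpansion is an illustrative name, and the expanded form is an approximation of the generated code rather than an exact copy.

```csharp
using System;
using System.Collections.Generic;

public static class ForeachExpansion
{
    // What the programmer writes.
    public static int SumWithForeach(IEnumerable<int> source)
    {
        int sum = 0;
        foreach (int n in source)
        {
            sum += n;
        }
        return sum;
    }

    // Roughly what the compiler turns the loop above into.
    public static int SumExpanded(IEnumerable<int> source)
    {
        int sum = 0;
        using (IEnumerator<int> e = source.GetEnumerator())
        {
            while (e.MoveNext())
            {
                int n = e.Current;
                sum += n;
            }
        }
        return sum;
    }
}
```

Both methods return 6 for the sequence 1, 2, 3.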
Or think about attributes! The C# compiler has intrinsic knowledge of the Attribute class and how attribute types are declared and applied.
But CLR itself doesn't know anything about C#. At all.
LINQ queries are text of a particular syntax that C# compiler can understand.
Well, query expressions are - but the compiler doesn't really "understand" them. It just translates them in a pretty mechanical manner. For example, take this query:
var query = from foo in bar
where foo.X > 10
select foo.Y;
That is translated into:
var query = bar.Where(foo => foo.X > 10)
.Select(foo => foo.Y);
The compiler doesn't know anything about what Where and Select mean here. They don't even have to be methods - if you had appropriate fields or properties of delegate types, the compiler would be fine with it. Basically, if the second form will compile, so will the query expression.
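A short sketch of that point: a type that has nothing to do with IEnumerable or System.Linq can still be used in a query expression, as long as a suitable Select method is in scope. Box<T> and QueryPatternDemo are names invented for this illustration.

```csharp
using System;

// No IEnumerable, no "using System.Linq": just a method named Select.
public class Box<T>
{
    public T Value { get; private set; }

    public Box(T value)
    {
        Value = value;
    }

    public Box<TResult> Select<TResult>(Func<T, TResult> selector)
    {
        return new Box<TResult>(selector(Value));
    }
}

public static class QueryPatternDemo
{
    public static int Run()
    {
        var box = new Box<int>(20);

        // Translated mechanically into box.Select(x => x + 1).
        Box<int> result = from x in box select x + 1;

        return result.Value;
    }
}
```

Run() returns 21. Adding a where clause to the query would fail to compile until Box<T> also gained a Where method, which is the "if the second form will compile, so will the query expression" rule seen from the other side.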
Most LINQ providers use extension methods to provide these methods (Where, Select, SelectMany etc). Again, they're just part of the C# language - the compiler doesn't know or care what the extension methods do.
For more details about how query expressions are translated, see part 41 of my Edulinq blog series. You may find the rest of my Edulinq series informative, too - it's basically a series of blog posts in which I reimplement LINQ to Objects, one method at a time. Again, this demonstrates that the C# compiler doesn't rely on the LINQ implementation being in the System.Linq namespace, or anything like that.
I have used .Net 3.5 and VS 2008 for more than a month. Like most .Net developers, I have evolved from years experience in .Net 1.0 & 2.0 and VS 2005. Just recently, I discovered the simplicity and power of LINQ and Lambda Expressions, as in my recent questions such as Find an item in list by LINQ, Convert or map a class instance to a list of another one by using Lambda or LINQ, and Convert or map a list of class to another list of class by using Lambda or LINQ.
I admit that Lambda and LINQ are much simpler and easier to read, and they seem very powerful. Behind the scenes, the .Net compiler must generate lots of code to achieve those functions. Therefore I am a little bit hesitant to switch to the new syntax, since I already know the "old" way to achieve the same results.
My question is about the efficiency and performance of Lambda and LINQ. Maybe Lambda expressions are mostly in-line functions; in that case I guess Lambda should be okay. How about LINQ?
Let's limit the discussion to LINQ-to-Objects and LINQ-to-SQL. Any comments, comparisons and experiences?
There's no one single answer that will suffice here.
LINQ has many uses, and many implementations, and thus many implications to the efficiency of your code.
As with every piece of technology at our fingertips, LINQ can and will be abused and misused alike, and the ability to distinguish between that, and proper usage, is only dependent on one thing: knowledge.
So the best advice I can give you is to go and read up on how LINQ is really implemented.
Things you should check into are:
LINQ and how it uses the methods and extension methods on existing collection types
How LINQ Works
How LINQ works internally (Stack Overflow)
How does coding with LINQ work? What happens behind the scenes?
How LINQ-to-objects and LINQ-to-SQL differ
What is the difference between LINQ query expressions and extension methods (Stack Overflow)
Alternatives to the new LINQ syntax, for instance, the usage of the .Where(...) extension method for collections
And as always, when looking at efficiency questions, the only safe approach is just to measure. Create a piece of code using LINQ that does a single, known thing, create an alternative, then measure both, and try to improve. Guessing and assuming will only lead to bad results.
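As a rough sketch of such a measurement, the following compares a LINQ version and a plain loop for one small, known task (summing the even numbers of an array) using System.Diagnostics.Stopwatch. SumBenchmark and the chosen task are illustrative assumptions; a serious benchmark would also need warm-up runs, a release build and several repetitions.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class SumBenchmark
{
    // LINQ version of the task.
    public static long SumWithLinq(int[] data)
    {
        return data.Where(n => n % 2 == 0).Sum(n => (long)n);
    }

    // Hand-written alternative doing the same thing.
    public static long SumWithLoop(int[] data)
    {
        long total = 0;
        for (int i = 0; i < data.Length; i++)
        {
            if (data[i] % 2 == 0)
            {
                total += data[i];
            }
        }
        return total;
    }

    // Runs the action repeatedly and returns the total elapsed time.
    public static TimeSpan Time(Func<long> action, int iterations)
    {
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        watch.Stop();
        return watch.Elapsed;
    }
}
```

A typical use would be to build an array with Enumerable.Range(0, 1000000).ToArray(), check that both methods return the same sum, and then compare Time(() => SumWithLinq(data), 100) with Time(() => SumWithLoop(data), 100) on the machine and runtime you actually care about.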
Technically the fastest way is to control all the minutiae yourself. Here are some performance tests. Notice that, in those tests, the foreach keyword and the ForEach method are both far slower than just using for and writing procedural code.
However, the compiler can and will be improved and you can always profile your code and optimize any problematic areas. It is generally recommended to use the more expressive features that make code easier to read unless you really need the extra nanoseconds.
For LINQ queries, with the 'new syntax', the IL (code) generated, is fundamentally no different than calling the extension methods provided by Enumerable and Queryable directly.
Don't optimize prematurely. Use Linq and the new extension methods liberally if they improve readability, and profile your application afterwards.
Most times the difference between Linq and using plain for loops is not relevant at all. The improved maintainability of your code should be worth a few ms.
Linq can be slower because it works on enumerators which are implemented as state machines. So plain for(...) loops will be faster.
I would recommend following Lasse V. Karlsen's advice, and I would append http://www.davesquared.net/2009/07/enumerables-linq-and-speed.html to his link list.
There is no performance difference between LINQ queries and Lambda expressions.
You should completely understand how the LINQ feature (both Lambda expressions and LINQ queries) works in .Net before you look into performance issues.
Basically, you can work with either LINQ queries or Lambda expressions.
LINQ Queries
A LINQ query is a high-level, readable query.
The compiler converts it into equivalent Lambda expressions, which are added as nodes to an expression tree. The expression tree represents the structure of the Lambda expressions.
The query provider inspects the expressions (the nodes of the expression tree) and produces equivalent SQL query operators, so the equivalent SQL query is formed at runtime.
Return type: result set (IEnumerable).
Lambda Expressions
A Lambda expression is a set of expressions/statements that creates a delegate or an expression tree. It can be passed to a function as an argument.
It supports all the LINQ methods that LINQ queries do (Where, Select, Count, Sum, etc.).
The compiler forms an expression tree that represents the structure of the Lambda expressions.
The query provider inspects the expression tree and produces the equivalent SQL query at runtime.
Return type: delegate / expression tree.
Which is Best?
You can understand both forms of LINQ (queries and Lambda expressions) by looking at the points above.
Advantage of LINQ queries: they are readable.
Advantages of Lambda expressions:
A Lambda expression creates a delegate, and by using the delegate you can just pass different input parameters and get the result for each. You need not write different queries for different criteria.
You can create dynamic queries by using Lambda expressions and expression trees.
You can use Lambda expressions if you want to pass the result of statement(s) to a method as an argument.
Lambda expressions are shorter.
So Lambda expressions are better for development than LINQ queries.
In some cases LINQ is just as fast if not faster than other methods, but in other cases it can be slower. We work on a project that we converted to linq and the data lookup is faster but the merging of data between two tables is much slower. There is a little overhead, but in most cases I don't see the speed difference having much effect on your program.