Legacy collection querying before LINQ - c#

I'm looking to give a talk about LINQ, and wanted to mention how querying collections used to work. Back in .Net 1.1, I seem to remember there being a method (Find() maybe?) where you would pass the address of another method which would interrogate each item in the collection and determine whether it should be included in the filtered collection.
Am I completely misremembering this? It stuck with me, as the syntax was unusual for the time.
I thought it was something like:
public bool ContainsFoo(string term){
if(term.Contains("Foo")){
return true;
}
return false;
}
And you could call it like:
filteredCollection = collection.Find(ContainsFoo);
I seem to remember a lot of people commenting on how LINQ was so much faster to code because developers could now write functions in-line. How were we writing functions "out-line" previously?

Before LINQ you were limited to the built-in List<T> methods (which arrived with generics in .NET 2.0, not 1.1), and yes, Find is one of them (it still is). The difference is that it expects a Predicate<T> as opposed to a Func<T, Boolean>, and you can still write it inline, e.g.
var found = list.Find(delegate(Item item) { return item != null; });
Or as you demonstrated by using a named method.
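To make the "out-of-line" versus inline distinction concrete, here is a minimal sketch of the same filter written three ways: a named method, a C# 2.0 anonymous delegate, and a C# 3.0 lambda. The class name FindDemo and the sample data are made up for illustration. Note that Find returns only the first match, while FindAll returns every match as a new list.

```csharp
using System;
using System.Collections.Generic;

public static class FindDemo
{
    // Named ("out-of-line") predicate method, as in the question.
    public static bool ContainsFoo(string term)
    {
        return term.Contains("Foo");
    }

    // Hypothetical sample data, only for this illustration.
    public static List<string> Sample()
    {
        return new List<string> { "Bar", "FooBar", "Baz", "BigFoo" };
    }

    public static void Main()
    {
        List<string> list = Sample();

        // 1. Method group converted to Predicate<string>.
        string first = list.Find(ContainsFoo);

        // 2. Anonymous delegate (C# 2.0): inline, but verbose.
        List<string> viaDelegate = list.FindAll(delegate(string s) { return s.Contains("Foo"); });

        // 3. Lambda expression (C# 3.0): the same predicate in its shortest form.
        List<string> viaLambda = list.FindAll(s => s.Contains("Foo"));

        Console.WriteLine(first);
        Console.WriteLine(string.Join(", ", viaDelegate));
        Console.WriteLine(viaLambda.Count);
    }
}
```

All three forms produce the same Predicate<string>; only the amount of ceremony differs.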

Related

Return true called inside LINQ Select statement

I ran across some code in an older project I am working on that I've never seen before, and I am confused about its intent.
updatables.Select(r =>
{
// some operations are done here for each element in the list
return true;
}).ToArray();
It seems like a select statement is being used to iterate the updatables collection. Also seems the ToArray call isn't doing anything.
My question is, what does calling return true in the Select statement accomplish, if anything?
This looks very much like a hack to emulate ForEach:
ToArray() call is added to ensure that updatables will be iterated to completion,
return true is added to silence the compiler that does not allow Action<T>, but allows Func<T,bool> in LINQ's Select.
I would strongly recommend against writing code like this, because it is a lot less readable than an equivalent foreach loop.
Select takes a Func<T, TResult> - which means it won't accept an Action<T>. In other words, a lambda which does not return anything will result in a compilation error when passed to Select, so the author bypassed that "limitation" by having it return a dummy value.
The intent behind this code is likely to run a foreach loop on the collection using the LINQ syntax. However, the way it's done in this code is a bad practice, as LINQ methods are expected to be pure - that is, not modify any sort of state outside of the expression.
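The role of ToArray() in that hack can be shown with a small sketch. The class and method names here are invented for the illustration. Select is lazily evaluated, so without something that enumerates the sequence the lambda never runs, which is why the original author needed ToArray(). A plain foreach states the same intent directly.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SelectSideEffectDemo
{
    // Select alone is deferred: nothing enumerates the query, so the lambda never runs.
    public static int CountWithoutToArray()
    {
        int touched = 0;
        var items = new List<int> { 1, 2, 3 };
        var query = items.Select(i => { touched++; return true; });
        return touched;
    }

    // ToArray() forces enumeration, so the side effect runs once per element.
    public static int CountWithToArray()
    {
        int touched = 0;
        var items = new List<int> { 1, 2, 3 };
        items.Select(i => { touched++; return true; }).ToArray();
        return touched;
    }

    // The readable equivalent: an ordinary foreach loop.
    public static int CountWithForeach()
    {
        int touched = 0;
        var items = new List<int> { 1, 2, 3 };
        foreach (int i in items)
        {
            touched++;
        }
        return touched;
    }

    public static void Main()
    {
        Console.WriteLine(CountWithoutToArray());
        Console.WriteLine(CountWithToArray());
        Console.WriteLine(CountWithForeach());
    }
}
```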

Including "offline" code in compiled queries

What happens behind the curtains when I include a function into my compiled query, like I do with DataConvert.ToThema() here to convert a table object into my custom business object:
public static class Queries
{
public static Func<MyDataContext, string, Thema> GetThemaByTitle
{
get
{
var func = CompiledQuery.Compile(
(MyDataContext db, string title) =>
(from th in db.tbl_Thema
where th.Titel == title
select DataConvert.ToThema(th)).Single()
);
return func;
}
}
}
public static class DataConvert
{
public static Thema ToThema(tbl_Thema tblThema)
{
Thema thema = new Thema();
thema.ID = tblThema.ThemaID;
thema.Titel = tblThema.Titel;
// and some other stuff
return thema;
}
}
and call it like this
Thema th = Queries.GetThemaByTitle.Invoke(db, "someTitle");
Apparently the function is not translated into SQL (how could it be?), but execution also does not stop when I set a breakpoint there in VS2010.
It works without problems, but I don't understand how or why. What exactly happens there?
Your DataConvert.ToThema() static method is simply creating an instance of a type which has a default constructor, and setting various properties, is that correct? If so, it's not terribly different from:
(from th in db.tbl_Thema
where th.Titel == title
select new Thema { ID = th.ThemaID, Titel = th.Titel /* etc. */ }
).Single()
When you call Queries.GetThemaByTitle, a query is being compiled. (The way you are calling this, by the way, may or may not actually be giving you any benefits from pre-compiling). That 'Query' is actually a code expression tree, only part of which is intended to generate SQL code that is sent to the database.
Other parts of it will generate IL code which grabs what is returned from the database and puts it into some form for your consumption. LINQ (EF or L2S) is smart enough to be able to take your static method call and generate the IL from it to do what you want, and maybe it's doing so with an internal delegate or some such. But ultimately, it doesn't need to be (much) different from what would be generated from what I substituted above.
But note that this happens regardless what the type is that you get back; somewhere, IL code is being generated that puts DB values into a CLR object. That is the other part of those expression trees.
If you want a more detailed look at those expression trees and what they involve, I'd have to dig for you, but I'm not sure from your question if that's what you are looking for.
Let me start by pointing out that whether you compile your query or not does not matter. You would observe the very same results even if you did not pre-compile.
Technically, as Andrew has pointed out, making this work is not that complicated. When your LINQ expression is evaluated, an expression tree is constructed internally. Your function appears as a node in this expression tree. No magic here. You'll be able to write this expression both in L2S and L2E and it will compile fine. That is, until you try to actually execute the SQL query against the database. This is where the difference begins. L2S seems to happily execute this task, whereas L2E fails with a NotSupportedException, reporting that it does not know how to convert ToThema into a store query.
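The "function appears as a node" point can be checked without any database. The following is only a sketch: Describe is a made-up stand-in for DataConvert.ToThema, and nothing here involves L2S or L2E. It shows that the C# compiler records the call as a method-call node in the tree rather than executing it, and that the node still carries the MethodInfo a provider could later invoke.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionNodeDemo
{
    // Hypothetical stand-in for a conversion method such as DataConvert.ToThema.
    public static string Describe(int id)
    {
        return "Item " + id;
    }

    // Assigning a lambda to Expression<...> makes the compiler build a tree, not a delegate.
    public static Expression<Func<int, string>> BuildSelector()
    {
        return id => Describe(id);
    }

    public static void Main()
    {
        Expression<Func<int, string>> selector = BuildSelector();

        // The body of the lambda is a single method-call node.
        MethodCallExpression call = (MethodCallExpression)selector.Body;
        Console.WriteLine(selector.Body.NodeType);
        Console.WriteLine(call.Method.Name);

        // A provider that cannot translate the node can still run it in memory.
        Func<int, string> compiled = selector.Compile();
        Console.WriteLine(compiled(7));
    }
}
```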
So what's happening inside? In L2S, as Andrew has explained, the query compiler understands that your function can be run separately, after the store query has been executed. So it emits calls to your function into the object reading pipeline (where data read from SQL is transformed into the objects that are returned as the result of your call).
One thing Andrew did not get quite right is that it matters what's inside your static method. I don't think it does.
If you set a breakpoint in your function, you will see that it's called once per returned row. In the stack trace you will see "Lightweight Function", which, in reality, means that the method was emitted at run time. So this is how it works for LINQ to SQL.
The LINQ to Entities team seems to have gone a different route. I do not know the reasoning behind their decision to ban all InvocationExpressions from L2E queries. Perhaps it was for performance reasons, or maybe it is the fact that they need to support all kinds of providers, not only SQL Server, so that data readers might behave differently. Or they simply thought that most people wouldn't realize that some of those are executed per returned row and preferred to keep this option closed.
Just my thoughts. If anyone has any more insight, please chime in!

More fluent C# / .NET

A co-worker of mine came up with this and I wonder what others think? Personally, I find it interesting but wonder if it is too big a departure? Code examples below. Extension methods at the bottom.
General thoughts please. Other extension methods that could be added?
var ddl = Page.FindControl("LocationDropDownList") as DropDownList;
ddl.Visible = true;
ddl.SelectedValue = "123";
if(isAdmin)
ddl.SelectedValue = "111";
Becomes:
Page.FindControl("LocationDropDownList")
.CastAs<DropDownList>()
.With(d => d.Visible = true)
.With(d => d.SelectedValue = "123")
.WithIf(isAdmin, d => d.Items.Add(new ListItem("Admin", "1")));
Or:
Page.FindControl("LocationDropDownList")
.CastAs<DropDownList>()
.With(d =>
{
d.Visible = true;
d.SelectedValue = "123";
})
.WithIf(isAdmin, d => d.SelectedValue = "111");
Extension methods:
public static TResult CastAs<TResult>(this object obj) where TResult : class
{
return obj as TResult;
}
public static T With<T>(this T t, Action<T> action)
{
if (action == null)
throw new ArgumentNullException("action");
action(t);
return t;
}
public static T WithIf<T>(this T t, bool condition, Action<T> action)
{
if (action == null)
throw new ArgumentNullException("action");
if (condition)
action(t);
return t;
}
Amongst my rules of thumb for writing clear code is: put all side effects in statements; non-statement expressions should have no side effects.
Your first version of the program clearly follows this rule. The second version clearly violates it.
An additional thought: if I were to read code like the code you've displayed, I would naturally assume that the purpose of the code was to build up a lazily-evaluated structure which represented those operations -- this is exactly why query comprehensions in C# 3 are built in this way. The result of the query expression is an object representing the deferred application of the query.
If your intention is to capture the notion of "execute these side effects in a deferred manner at a later moment of my choosing", then this is a sensible approach. Essentially what you're building up is a side-effecting monad. If your intention is merely to provide a different syntax for the eagerly executed code, then this is just confusing, verbose and unnecessary.
I see no advantage to this besides being confusing to the reader. With respect to my fellow answerer, I would like to know on what planet this is more readable. As far as I can tell, the first version has more or less perfect readability, whereas this is fairly readable, but makes the reader wonder whether there's some strange magic happening within With and WithIf.
Compared to the first version, it's longer, harder to type, less obvious, and less performant.
I guess I fail to see what the new versions get you. The original is pretty clear and is less wordy. I would guess that it would be faster as well. I would avoid using (abusing?) language features like this unless there is a clear benefit.
One more vote for "not useful". The With extension method doesn't do anything except wrap up sequenced statements with a method. C# already has a built-in construct for sequencing statements; it's called ;.
Similarly, the WithIf wraps an if-statement without any modification to the control flow. From my point of view, you're only inviting yourself to methods like:
public static T For<T>(
this T t, int start, Func<int, bool> cond, Action<T, int> f)
{
for(int i = start; cond(i); i++)
{
f(t, i);
}
return t;
}
The original is more readable.
The simplest API change would be to make the object returned by FindControl() a Builder-esque thing (where all the set methods return 'this'):
Page.FindControl("LocationDropDownList")
.setVisible(true)
.setSelectedValue(isAdmin ? "111" : "123");
That is some extension method abuse if I ever saw it!
It's an interesting use of extensions, and I appreciate it on that merit alone. I'm not sure I'd use it, but if your team likes it, then by all means, use it.
They're just different coding styles, what do you mean by "too big a departure"? Departure from what? From what you're used to? Only you can decide that. I will say that VB's With block has done more harm than good to code readability, and I would not try to replicate the behavior in C#, but that's just my preference.
I pretty much always use this for FindControl (yeah, strongly typed to RepeaterItem, it doesn't have to be, but that's the only thing I ever use it for anyway):
public static T FindControl<T>(this RepeaterItem item, string id) where T : class
{
return item.FindControl(id) as T;
}
And invoke it like so:
Literal myLiteral = e.Item.FindControl<Literal>("myLiteral");
I am more comfortable with the first version. It takes less time to read and understand. I agree that the extension methods are also fine if you are familiar with it and also familiar with the With method, but what’s the benefit of it in this case?
Minor note. From personal experience, I'd change:
if(isAdmin)
ddl.SelectedValue = "111";
to
if(isAdmin) {
ddl.SelectedValue = "111";
}
or
if(isAdmin)
{
ddl.SelectedValue = "111";
}
This will save you time in debugging sooner or later.
If this was a language feature:
With(Page.FindControl("LocationDropDownList") as DropDownList)
{
Visible = true;
SelectedValue = "123";
if(isAdmin)
Add(new ListItem( "111"));
}
You would win something:
avoid redundancy of the mutated object
all language features available in the "With" block
Above tries to emulate the style without reaping the benefits. Cargo Cult.
(Note: I do understand the various arguments against it, but it'd still be nice)
Incidentally, some of my C++ Win32 UI helpers contain setters that use chaining similar to what you want to achieve:
LVItem(m_lc, idx).SetText(_T("Hello")).SetImg(12).SetLParam(id);
In that case, I at least win the "no redundancy", but that's because I don't have properties.
I predict the whole "fluent interface" fad will be the "hungarian notation" of the 2000's. I personally think it doesn't look very clean and it runs the risk of becoming very inconsistent if you have multiple developers each with their own preference.
Looks like your co worker is a Lambda Junkie.
I think the question of readability is subjective and I personally have no issue with what you've done. I would consider using it if your organization "approved" it.
I think the concept is sound and if you changed "With" to "Let" it would be more "functional" or "F#-ish". Personal opinion.
Page.FindControl("LocationDropDownList")
.CastAs<DropDownList>()
.Let(d => d.Visible = true)
.Let(d => d.SelectedValue = "123");
My 2 cents: It looks fine; my only comment is that "With" kind of implies something like "Where" or "Having" when you are actually setting a property. I'd suggest a method name of something like "Do", "Execute" or "Set", but maybe that's just my odd world view.
How about:
Page.WithControl<DropDownList>("LocationDropDownList")
.Do(d => d.Visible = true)
.Do(d => d.SelectedValue = "123")
.DoIf(isAdmin, d => d.Items.Add(new ListItem("Admin", "1")));
I'd say stick with the first version. What you've posted is too clever to be immediately useful to someone reading the code.
You could even go a step further and do away with that "var":
DropDownList ddl = (DropDownList) Page.FindControl("ddlName");
This is a perfect learning case on how to make something more complicated than it needs to be.
The first version is clear and requires no extra knowledge beyond normal language constructs.
I say stick with the first version without the extension methods or lambda expressions. These are relatively new concepts, so not many developers will have a handle on them yet outside their use in data retrieval/manipulation from a database. If you use them you may take a hit on maintenance cost. It is nice to say "read up if this is Greek to you", but in real life that may not be the best approach.
Regarding a "Fluent Interface": C# already has a great syntax for initializers which is (IMHO) better than trying to use the fluent style. Of course, in your example you are not initializing a new object, you are changing an existing one. My whole expertise with fluent interfaces comes from a 30 second scan of Wikipedia, but I think that JeeBee's answer is more in the spirit of fluent programming, though I might change things slightly:
Page.FindDropDownList("LocationDropDownList")
.setVisible(true)
.setAdminSelectedValue("111")
.setSelectedValue("123")
One could argue that this is more readable, especially for a language without Properties, but I still think it doesn't make sense in C#.
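For reference, the initializer syntax mentioned above looks roughly like this. This is a sketch only: DropDownSettings is an invented plain class standing in for a real control, since an object initializer only applies when a new object is being constructed, not when an existing control is fetched with FindControl.

```csharp
using System;

// Hypothetical plain class used only to illustrate object initializers.
public class DropDownSettings
{
    public bool Visible { get; set; }
    public string SelectedValue { get; set; }
}

public static class InitializerDemo
{
    public static DropDownSettings Create(bool isAdmin)
    {
        // Object initializer (C# 3.0): construct and set properties in one expression.
        return new DropDownSettings
        {
            Visible = true,
            SelectedValue = isAdmin ? "111" : "123"
        };
    }

    public static void Main()
    {
        DropDownSettings settings = Create(true);
        Console.WriteLine(settings.Visible + " " + settings.SelectedValue);
    }
}
```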
In certain circumstances thoughtfully constructed fluent interfaces can be very useful. First, because the developer is presented with a limited number of options they are (typically) easy to use correctly and difficult to use incorrectly. Second, because of the sentence like structure they can be a nice clean way to declare your intentions, especially when building complex objects.
I have found fluent interfaces to be very useful when developing test code in which it is often necessary to build lots of domain objects with slight variations. I have also used them successfully as a way to introduce the decorator pattern and to eliminate excessive method overloading.
If anyone is interested in learning more about fluent interfaces, I suggest checking out this work in progress by Martin Fowler.
Good rule of thumb:
If your first impression of your code is "This is clever" - it's probably not a good idea.
Good code should be simple, readable, and only "clever" if absolutely necessary.

Is there a way I can dynamically define a Predicate body from a string containing the code?

This is probably a stupid question, but here goes. I would like to be able to dynamically construct a Predicate<T> from a string parsed from a database VARCHAR column, or any string, for that matter. For example, say the column in the database contained the following string:
return e.SomeStringProperty.Contains("foo");
These code/string values would be stored in the database knowing what the possible properties of the generic "e" are, and knowing that they had to return a boolean. Then, in a magical, wonderful, fantasy world, the code could execute without knowing what the predicate was, like:
string predicateCode = GetCodeFromDatabase();
var allItems = new List<SomeObject>{....};
var filteredItems = allItems.FindAll(delegate(SomeObject e) { predicateCode });
or Lambda-ized:
var filteredItems = allItems.FindAll(e => [predicateCode]);
I know it can probably never be this simple, but is there a way, maybe using Reflection.Emit, to create the delegate code dynamically from text and give it to the FindAll<T> (or any other anonymous/extension) method?
The C# and VB compilers are available from within the .NET Framework:
C# CodeDom Provider
Be aware though, that this way you end up with a separate assembly (which can only be unloaded if it's in a separate AppDomain). This approach is only feasible if you can compile all the predicates you are going to need at once. Otherwise there is too much overhead involved.
System.Reflection.Emit is a great API for dynamically emitting code for the CLR. It is, however, a bit cumbersome to use and you must learn CIL.
LINQ expression trees are an easy to use back-end (compilation to CIL) but you would have to write your own parser.
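As a rough sketch of the expression-tree route, the following builds the equivalent of e => e.SomeStringProperty.Contains("foo") at run time. It deliberately sidesteps the parser: instead of storing C# source text, it assumes the database stores a property name and a search term. The names PredicateBuilderDemo and BuildContains are invented for this example, and the property is assumed to be a non-null string.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;

public class SomeObject
{
    public string SomeStringProperty { get; set; }
}

public static class PredicateBuilderDemo
{
    // Builds e => e.<propertyName>.Contains(term) and compiles it to a delegate.
    // Assumption: propertyName names a public string property on T.
    public static Predicate<T> BuildContains<T>(string propertyName, string term)
    {
        ParameterExpression e = Expression.Parameter(typeof(T), "e");
        MemberExpression property = Expression.Property(e, propertyName);
        MethodInfo contains = typeof(string).GetMethod("Contains", new[] { typeof(string) });
        MethodCallExpression body = Expression.Call(property, contains, Expression.Constant(term));
        return Expression.Lambda<Predicate<T>>(body, e).Compile();
    }

    public static void Main()
    {
        var allItems = new List<SomeObject>
        {
            new SomeObject { SomeStringProperty = "foobar" },
            new SomeObject { SomeStringProperty = "baz" }
        };

        // In the real scenario these two strings would come from the database.
        Predicate<SomeObject> predicate = BuildContains<SomeObject>("SomeStringProperty", "foo");
        List<SomeObject> filteredItems = allItems.FindAll(predicate);
        Console.WriteLine(filteredItems.Count);
    }
}
```

Supporting arbitrary stored code would still need a parser (or one of the other approaches listed here); this only covers the case where the shape of the predicate is known in advance.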
I suggest you have a look at one of the "dynamic languages" that run on the CLR (or DLR) such as IronPython. It's the most efficient way to implement this feature, if you ask me.
Check out the Dynamic LINQ project; it does all this and more!
http://weblogs.asp.net/scottgu/archive/2008/01/07/dynamic-linq-part-1-using-the-linq-dynamic-query-library.aspx
Great for simple stuff like user selected orderby's or where clauses
It is possible using emit, but you'd be building your own parser.
EDIT
I remember that in ScottGu's PDC keynote, he showed a feature using the CLI version of the .NET Framework that resembled Ruby's eval, but I can't find a URL that can corroborate this. I'm making this a community wiki so that anyone who has a good link can add it.
I moved away from Dynamic LINQ because it's limited in the ways I want to search a collection, unless you prove me wrong.
My filter needs to do this: given a list of orders, keep only the orders whose collection of items contains an item with the name "coca cola".
So that results in a call like: orders.FindAll(o => o.Items.Exists(i => i.Name == "coca cola"))
In Dynamic LINQ I didn't find any way to do that, so I started with CodeDomProvider.
I created a new Type with a method which contains my dynamically built FindAll Method:
// Order is assumed to be the element type of the list being filtered
public static IList Filter(List<Order> orders, string searchString)
{
// this will be dynamically built code
return orders.FindAll(o => o.Items.Exists(i => i.Name == "coca cola"));
}
when I try to build this assembly:
CompilerResults results = provider.CompileAssemblyFromSource(parameters, sb.ToString());
I'm getting the error:
Invalid expression term ">"
Why isn't the compiler able to compile the predicate?

Adding functionality to Linq-to-SQL objects to perform common selections

In a previous question I asked how to make "Computed properties" in a linq to sql object. The answer supplied there was sufficient for that specific case but now I've hit a similar snag in another case.
I have a database with Items that have to pass through a number of Steps. I want to have a function in my database that retrieves the Current step of the item that I can then build on. For example:
var x = db.Items.Where(item => item.Steps.CurrentStep().Completed == null);
The code to get the current step is:
Steps.OrderByDescending(step => step.Created).First();
So I tried to add an extension method to the EntitySet<Step> that returned a single Step like so:
public static OrderFlowItemStep CurrentStep(this EntitySet<OrderFlowItemStep> steps)
{
return steps.OrderByDescending(o => o.Created).First();
}
But when I try to execute the query at the top I get an error saying that the CurrentStep() function has no translation to SQL. Is there a way to add this functionality to Linq-to-SQL in any way or do I have to manually write the query every time? I tried to write the entire query out first but it's very long and if I ever change the way to get the active step of an item I have to go over all the code again.
I'm guessing that the CurrentStep() method has to return a Linq expression of some kind but I'm stuck as to how to implement it.
The problem is that CurrentStep is a normal method. Hence, the Expression contains a call to that method, and naturally SQL cannot execute arbitrary .NET methods.
You will need to represent the code as an Expression. I have one in depth example here: http://www.atrevido.net/blog/2007/09/06/Complicated+Functions+In+LINQ+To+SQL.aspx
Unfortunately, the C# 3.0 compiler has a huge omission and you cannot generate calls to Expressions. (i.e., you can't write "x => MyExpression(x)"). Working around it either requires you to write the Expression manually, or to use a delegate as a placeholder. Jomo Fisher has an interesting post about manipulating Expression trees in general.
Without actually having done it, the way I'd probably approach it is by making the CurrentStep function take the predicate you want to add ("Completed == null"). Then you can create a full Expression<Func<Item, bool>> predicate to hand off to Where. I'm lazy, so I'm going to do an example using String and Char (String contains Chars, just like Item contains Steps):
using System;
using System.Linq;
using System.Linq.Expressions;
class Program {
static void Main(string[] args) {
Console.WriteLine(StringPredicate(c => Char.IsDigit(c)));
var func = StringPredicate(c => Char.IsDigit(c)).Compile();
Console.WriteLine(func("h2ello"));
Console.WriteLine(func("2ello"));
}
public static Expression<Func<string,bool>> StringPredicate(Expression<Func<char,bool>> pred) {
Expression<Func<string, char>> get = s => s.First();
var p = Expression.Parameter(typeof(string), "s");
return Expression.Lambda<Func<string, bool>>(
Expression.Invoke(pred, Expression.Invoke(get, p)),
p);
}
}
So "func" is created by using StringPredicate to create an Expression. For the example, we compile it to execute it locally. In your case, you'd pass the whole predicate to "Where" so it gets translated to SQL.
The "get" expression is where you put your "extension" stuff (OrderByWhatever, First, etc.). This is then passed in to the predicate that's given to you.
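Mapped back onto the question's Item and Step shape, the same pattern might look like the sketch below. Everything here is an assumption for illustration: Item and Step are simplified in-memory classes, CurrentStepMatches is an invented name, and the query runs against LINQ to Objects via AsQueryable(), so this only demonstrates how the expression is composed, not that a given provider will translate it to SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Simplified stand-ins for the question's entities.
public class Step
{
    public DateTime Created { get; set; }
    public DateTime? Completed { get; set; }
}

public class Item
{
    public string Name { get; set; }
    public List<Step> Steps { get; set; }
}

public static class CurrentStepDemo
{
    // Wraps a predicate on Step into a predicate on Item, applied to the newest step.
    public static Expression<Func<Item, bool>> CurrentStepMatches(Expression<Func<Step, bool>> pred)
    {
        // The "get" expression: how to find the current step of an item.
        Expression<Func<Item, Step>> current =
            item => item.Steps.OrderByDescending(s => s.Created).First();

        ParameterExpression p = Expression.Parameter(typeof(Item), "item");
        return Expression.Lambda<Func<Item, bool>>(
            Expression.Invoke(pred, Expression.Invoke(current, p)),
            p);
    }

    public static List<Item> Sample()
    {
        DateTime day1 = new DateTime(2009, 1, 1);
        DateTime day2 = new DateTime(2009, 1, 2);
        return new List<Item>
        {
            // Newest step is still open.
            new Item { Name = "A", Steps = new List<Step> {
                new Step { Created = day1, Completed = day1 },
                new Step { Created = day2, Completed = null } } },
            // Newest step is completed.
            new Item { Name = "B", Steps = new List<Step> {
                new Step { Created = day1, Completed = null },
                new Step { Created = day2, Completed = day2 } } }
        };
    }

    public static List<string> OpenItemNames()
    {
        return Sample().AsQueryable()
            .Where(CurrentStepMatches(s => s.Completed == null))
            .Select(i => i.Name)
            .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", OpenItemNames()));
    }
}
```

The definition of "current step" now lives in one place, so changing it means editing only the "current" expression.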
Don't worry if it looks complicated; it sorta is at first. If you haven't done this kinda stuff before, it'll take a bit of time (the first time I did this kinda stuff, it took hours to get it right :|.. now it comes slightly easier). Also, as I mentioned, you can write a helper method to do this re-writing for you (so you don't directly need to use the Expression.Whatever methods), but I haven't seen any examples and haven't really needed it yet.
Check out my answer to "switch statement in linq" and see if that points you in the right direction...
The technique I demonstrate there is the one that got me past the scary "no translation to SQL" error.
