Pulling the WHERE clause out of LINQ to SQL

Pulling the WHERE clause out of LINQ to SQL - c#

I'm working with a client who wants to mix LINQ to SQL with their in-house DAL. Ultimately they want to be able to query their layer using typical LINQ syntax. The point where this gets tricky is that they build their queries dynamically. So ultimately what I want is to be able to take a LINQ query, pull it apart and be able to inspect the pieces to pull the correct objects out, but I don't really want to build a piece to translate the 'where' expression into SQL. Is this something I can just generate using Microsoft code? Or is there an easier way to do this?

(you mean just LINQ, not really LINQ-to-SQL)
Sure, you can do it - but it is massive amounts of work. Here's how; I recommend "don't". You could also look at the source code for DbLinq - see how they do it.
If you just want Where, it is a bit easier - but as soon as you start getting joins, groupings, etc - it will be very hard to do.
Here's just Where support on a custom LINQ implemention (not a fully queryable provider, but enough to get LINQ with Where working):
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;
namespace YourLibrary
{
public static class MyLinq
{
public static IEnumerable<T> Where<T>(
this IMyDal<T> dal,
Expression<Func<T, bool>> predicate)
{
BinaryExpression be = predicate.Body as BinaryExpression;
var me = be.Left as MemberExpression;
if(me == null) throw new InvalidOperationException("don't be silly");
if(me.Expression != predicate.Parameters[0]) throw new InvalidOperationException("direct properties only, please!");
string member = me.Member.Name;
object value;
switch (be.Right.NodeType)
{
case ExpressionType.Constant:
value = ((ConstantExpression)be.Right).Value;
break;
case ExpressionType.MemberAccess:
var constMemberAccess = ((MemberExpression)be.Right);
var capture = ((ConstantExpression)constMemberAccess.Expression).Value;
switch (constMemberAccess.Member.MemberType)
{
case MemberTypes.Field:
value = ((FieldInfo)constMemberAccess.Member).GetValue(capture);
break;
case MemberTypes.Property:
value = ((PropertyInfo)constMemberAccess.Member).GetValue(capture, null);
break;
default:
throw new InvalidOperationException("simple captures only, please");
}
break;
default:
throw new InvalidOperationException("more complexity");
}
return dal.Find(member, value);
}
}
public interface IMyDal<T>
{
IEnumerable<T> Find(string member, object value);
}
}
namespace MyCode
{
using YourLibrary;
static class Program
{
class Customer {
public string Name { get; set; }
public int Id { get; set; }
}
class CustomerDal : IMyDal<Customer>
{
public IEnumerable<Customer> Find(string member, object value)
{
Console.WriteLine("Your code here: " + member + " = " + value);
return new Customer[0];
}
}
static void Main()
{
var dal = new CustomerDal();
var qry = from cust in dal
where cust.Name == "abc"
select cust;
int id = int.Parse("123");
var qry2 = from cust in dal
where cust.Id == id // capture
select cust;
}
}
}

Technically if your DAL exposes IQueryable<T> instead of IEnumerable<T> you can also implement a IQueryProvider and do exactly what you describe. However, this is not for the faint of heart.
But if you expose the LINQ to SQL tables themselves in the DAL, they will do exactly this for you. There is a (big) risk though since you'll be handling the client code total control over how to express SQL queries, and the usual result is some complex query that joins everything and slaps pagination a top of it with less than spectacular run time performance.
I think you should consider carefully what is actually needed from the DAL and expose only that.

I just read an interesting article on Expression Trees, LINQ to SQL uses these to translate the query into SQL and send it over the wire.
Maybe that's something you could use?

Just some though. I know some language support building a string that can be execute in the code itself. I never tried it with .Net, but this is common in functional languages like LISP. Since .Net support lambdas, maybe this is possible.
Since F# is coming to .Net soon, maybe it will possible if it is not right now.
What I am trying to say is if you can do this then maybe you can build that string that will be use as the LINQ statement and then execute it. Since it is a string, it will be possible to analyse the string and get the information you want.

Try Dynamic Linq

To anyone else with the same question out there. Pulling out the where clause from LINQ-to-SQL isn’t quite as straightforward, as one would’ve hoped for. Additionally, doing that by itself is probably meaningless. There are a couple of options, depending on the requirements – either grab it from the generated string, but then it would contain parameter references and object property mappings that would also have to be resolved, so those would also have to be pulled out of the original provider somehow, otherwise this would be pointless. Another – would be to find a modular provider that can do that, as well as make member mappings easily accessible, but once again, without the rest of the query, I see little utility in doing that, because the where clause would reference table/column aliases from the select statement.
I had a similar task to write a full blown provider for a custom ORM/DAL a couple of years ago. While it qualifies as the most complex thing I’ve worked on, being an experience developer, I can say it’s not as bad, as some people claim once you wrap your head around the concepts that lie at the foundation of such a component. Some solutions that I’ve seen go the wrong way about it, add redundant functionality and have extra code addressing problems introduced by underlying logic. E.g. the “optimization” stage/module that attempts to re-factor bloated, nested SQL produced by the main parser. If the latter was designed in such a way that would output clean SQL from the start, then no clean-up phase would be needed. I’ve seen providers that create a new level of nesting for each where and join call. That’s a bad strategy. By breaking down a query into three/four main parts – select, from, where and orderby, which are built individually as the tree is being visited, this problem is avoided altogether. I’ve developed an object-to-data (aka LINQ-to-SQL) provided based on these principles for a custom ORM/DAL and it produces nice, clean SQL, with an excellent performance, as each statement is compiled to IL and cached.
For anyone that is looking to do something similar, please see my posts that include a sample project with a tutorial/barebones implementation that makes it easy to see how it works. Included is also the full solution:
How to write a LINQ to SQL provider in C# Part 1 - Introduction
How to write a LINQ to SQL provider in C# Part 2 - Expression Visitor
How to write a LINQ to SQL provider in C# Part 3 - Where Clause Visitor
How to write a LINQ to SQL provider in C# Part 4 - Compiling Expression Trees

Related

C# creating methods with some kind of submethods

I want to build my own mysql class. But I'm relative new to c#.
So in my thoughts I would like to build some like:
mySqlTool.select("a,b,c").from("foo").where("x=y");
I dont know how this is called and actually I dont really know if this is even possible. My google search ended with no real result.
So my questions are:
Is it possible to do some like my sample above?

Creating a fluent api isn't overly complicated. You simply return the current instance of the class from each method which allows them to be chained the way that you specify in your post:
public class MySqlTool
{
private string _fields, _table, _filters;
public MySqlTool Select(string fields)
{
_fields = fields;
return this;
}
public MySqlTool From(string table)
{
_table = table;
return this;
}
public MySqlTool Where(string filters)
{
_filters = filters;
return this;
}
public Results Execute()
{
// build your query from _fields, _table, _filters
return results;
}
}
Would allow for you to run a query like:
var tool = new MySqlTool();
var results = tool.Select("a, b, c").From("someTable").Where("a > 1").Execute();

It looks like you are trying to do something similar to Linq to SQL, but with MySQL instead. There is a project called LinqConnect that does this. I have no affiliation with them.
You can use it like this (from a LinqConnect tutorial):
CrmDemoDataContext context = new CrmDemoDataContext();
var query = from it in context.Companies
orderby it.CompanyID
select it;
foreach (Company comp in query)
Console.WriteLine("{0} | {1} | {2}", comp.CompanyID, comp.CompanyName, comp.Country);
Console.ReadLine();

I'm not a big fan of it personally, but it can be useful if you're just learning. Linq2SQL can give you that functionality somewhat out of the box.
If you're looking to accomplish this yourself. You'll want to turn to extension methods. A feature within c# that will allow you to extend your classes using static methods, operating on the base.
My recommendation if you're doing your own data-access layer, then you should use something like dapper (built/used by the crew at StackOverflow). Which offers you a simple wrapper around the base connection to the database. It offers simple ORM functionality and does parameterization for you.
I would also recommend strongly that you encapsulate your intensive queries within Stored Procedures.

What you're looking to do is design a fluent interface, and it's somewhat of an advanced concept for someone who is just learning C#. I would suggest you stick to the basics until you have a little more experience.
More importantly, there are already existing data adapters built into the .NET FCL and also third-party adapters like this one that are more suitable and already do what you're trying to do using "LINQ to (database)". I wouldn't reinvent the wheel if I were you.

Recursion in Fluent API

I am designing a fluent API for writing SQL. Keep in mind one of my goals is to have API not suggest functions that can't be called in that part of the chain. For instance if you just got done defining a field in the select clause you can't call Where until you called From first. A simple query looks like this:
string sql = SelectBuilder.Create()
.Select()
.Fld("field1")
.From("table1")
.Where()
.Whr("field1 > field2")
.Whr("CURRENT_TIMESTAMP > field3")
.Build()
.SQL;
My problem comes with recursion in SQL code. Say you wanted to have a field contain another SQL statement like below:
string sql = SelectBuilder.Create()
.Select()
.Fld("field1")
.SQLFld()
.Select
.Count("field6")
.From("other table")
.EndSQLFld()
.FLd("field2")
.From("table1")
.Where()
.Whr("field1 > field2")
.Whr("CURRENT_TIMESTAMP > field3")
.Build()
.SQL;
I am using method chaining to build my fluent API. It many ways it is a state machine strewn out across many classes which represent each state. To add this functionality I would need to copy essentially every state I already have and wrap them around the two SQLFld and EndSQLFld states. I would need yet another copy if you were one more level down and were embedding a SQL statement in to a field of the already embedded SQL statement. This goes on to infinity, so with an infinitely deep embedded SQL query I would need an infinite number of classes to represent the infinite states.
I thought about writing a SelectBuilder query that was taken to the point of the Build method and then embedding that SelectBuilder in to another SelectBuilder and that fixes my infinity problem, but it is not very elegant and that is the point of this API.
I could also throw out the idea that the API only offers functions when they are appropriate but I would really hate to do that. I feel like that helps you best discover how to use the API. In many fluent APIs it doesn't matter which order you call what, but I want the API to appear as close to the actual SQL statement as possible and enforce its syntax.
Anyone have any idea how to solve this issue?

Glad to see you are trying fluent interfaces, I think they are a very elegant and expressive.
The builder pattern is not the only implementation for fluent interfaces. Consider this design, and let us know what you think =)
This is an example and I leave to you the details of your final implementation.
Interface design example:
public class QueryDefinition
{
// The members doesn't need to be strings, can be whatever you use to handle the construction of the query.
private string select;
private string from;
private string where;
public QueryDefinition AddField(string select)
{
this.select = select;
return this;
}
public QueryDefinition From(string from)
{
this.from = from;
return this;
}
public QueryDefinition Where(string where)
{
this.where = where;
return this;
}
public QueryDefinition AddFieldWithSubQuery(Action<QueryDefinition> definitionAction)
{
var subQueryDefinition = new QueryDefinition();
definitionAction(subQueryDefinition);
// Add here any action needed to consider the sub query, which should be defined in the object subQueryDefinition.
return this;
}
Example usage:
static void Main(string[] args)
{
// 1 query deep
var def = new QueryDefinition();
def
.AddField("Field1")
.AddField("Filed2")
.AddFieldWithSubQuery(subquery =>
{
subquery
.AddField("InnerField1")
.AddField("InnerFiled2")
.From("InnerTable")
.Where("<InnerCondition>");
})
.From("Table")
.Where("<Condition>");
// 2 queries deep
var def2 = new QueryDefinition();
def2
.AddField("Field1")
.AddField("Filed2")
.AddFieldWithSubQuery(subquery =>
{
subquery
.AddField("InnerField1")
.AddField("InnerField2")
.AddFieldWithSubQuery(subsubquery =>
{
subsubquery
.AddField("InnerInnerField1")
.AddField("InnerInnerField2")
.From("InnerInnerTable")
.Where("<InnerInnerCondition>");
})
.From("InnerInnerTable")
.Where("<InnerCondition>");
})
.From("Table")
.Where("<Condition>");
}

You can't "have only applicable methods available" without either sub-APIs for the substructures or clear bracketing/ending of all inner structural levels (SELECT columns, expressions in WHERE clause, subqueries).
Even then, running it all through a single API will require it to be stateful & "modal" with "bracketing" methods, to track whereabouts in the decl you are. Error reporting & getting these right will be tedious.
Ending bracketing by "fluent" methods, to me, seems non-fluent & ugly. This would result in a ugly appearence of EndSelect, EndWhere, EndSubquery etc. I'd prefer to build substructures (eg SUBQUERY for select) into a local variable & add that.
I don't like the EndSQLFld() idiom, which terminates the Subquery implicitly by terminating the Field. I'd prefer & guess it would be better design to terminate the subquery itself which is the complex part of the nested structure -- not the field.
To be honest, trying to enforce ordering of a "declarative" API for a "declarative" language (SQL) seems to be a waste of time.
Probably what I'd consider closer to an ideal usage:
SelectBuilder select = SelectBuilder.Create("CUSTOMER")
.Column("ID")
.Column("NAME")
/*.From("CUSTOMER")*/ // look, I'm just going to promote this onto the constructor.
.Where("field1 > field2")
.Where("CURRENT_TIMESTAMP > field3");
SelectBuilder countSubquery = SelectBuilder.Create("ORDER")
.Formula("count(*)");
.Where("ORDER.FK_CUSTOMER = CUSTOMER.ID");
.Where("STATUS = 'A'");
select.Formula( countSubquery, "ORDER_COUNT");
string sql = SelectBuilder.SQL;
Apologies to the Hibernate Criteria API :)

Using Dynamic LINQ (or Generics) to query/filter Azure tables

So here's my dilemma. I'm trying to utilize Dynamic LINQ to parse a search filter for retrieving a set of records from an Azure table. Currently, I'm able to get all records by using a GenericEntity object defined as below:
public class GenericEntity
{
public string PartitionKey { get; set; }
public string RowKey { get; set; }
Dictionary<string, object> properties = new Dictionary<string, object>();
/* "Property" property and indexer property omitted here */
}
I'm able to get this completely populated by utilizing the ReadingEntity event of the TableServiceContext object (called OnReadingGenericEvent). The following code is what actually pulls all the records and hopefully filter (once I get it working).
public IEnumerable<T> GetTableRecords(string tableName, int numRecords, string filter)
{
ServiceContext.IgnoreMissingProperties = true;
ServiceContext.ReadingEntity -= LogType.GenericEntity.OnReadingGenericEntity;
ServiceContext.ReadingEntity += LogType.GenericEntity.OnReadingGenericEntity;
var result = ServiceContext.CreateQuery<GenericEntity>(tableName).Select(c => c);
if (!string.IsNullOrEmpty(filter))
{
result = result.Where(filter);
}
var query = result.Take(numRecords).AsTableServiceQuery<GenericEntity>();
IEnumerable<GenericEntity> res = query.Execute().ToList();
return res;
}
I have TableServiceEntity derived types for all the tables that I have defined, so I can get all properties/types using Reflection. The problem with using the GenericEntity class in the Dynamic LINQ Query for filtering is that the GenericEntity object does NOT have any of the properties that I'm trying to filter by, as they're really just dictionary entries (dynamic query errors out). I can parse out the filter for all the property names of that particular type and wrap
"Property[" + propName + "]"
around each property (found by using a type resolver function and reflection). However, that seems a little... overkill. I'm trying to find a more elegant solution, but since I actually have to provide a type in ServiceContext.CreateQuery<>, it makes it somewhat difficult.
So I guess my ultimate question is this: How can I use dynamic classes or generic types with this construct to be able to utilize dynamic queries for filtering? That way I can just take in the filter from a textbox (such as "item_ID > 1023000") and just have the TableServiceEntity types dynamically generated.
There ARE other ways around this that I can utilize, but I figured since I started using Dynamic LINQ, might as well try Dynamic Classes as well.
Edit: So I've got the dynamic class being generated by the initial select using some reflection, but I'm hitting a roadblock in mapping the types of GenericEntity.Properties into the various associated table record classes (TableServiceEntity derived classes) and their property types. The primary issue is still that I have to initially use a specific datatype to even create the query, so I'm using the GenericEntity type which only contains KV pairs. This is ultimately preventing me from filtering, as I'm not able to do comparison operators (>, <, =, etc.) with object types.
Here's the code I have now to do the mapping into the dynamic class:
var properties = newType./* omitted */.GetProperties(
System.Reflection.BindingFlags.Instance |
System.Reflection.BindingFlags.Public);
string newSelect = "new(" + properties.Aggregate("", (seed, reflected) => seed += string.Format(", Properties[\"{0}\"] as {0}", reflected.Name)).Substring(2) + ")";
var result = ServiceContext.CreateQuery<GenericEntity>(tableName).Select(newSelect);
Maybe I should just modify the properties.Aggregate method to prefix the "Properties[...]" section with the reflected.PropertyType? So the new select string will be made like:
string newSelect = "new(" + properties.Aggregate("", (seed, reflected) => seed += string.Format(", ({1})Properties[\"{0}\"] as {0}", reflected.Name, reflected.PropertyType)).Substring(2) + ")";
Edit 2: So now I've hit quite the roadblock. I can generate the anonymous types for all tables to pull all values I need, but LINQ craps out on my no matter what I do for the filter. I've stated the reason above (no comparison operators on objects), but the issue I've been battling with now is trying to specify a type parameter to the Dynamic LINQ extension method to accept the schema of the new object type. Not much luck there, either... I'll keep you all posted.

I've created a simple System.Refection.Emit based solution to create the class you need at runtime.
http://blog.kloud.com.au/2012/09/30/a-better-dynamic-tableserviceentity/

I have run into exactly the same problem (with almost the same code :-)). I have a suspicion that the ADO.NET classes underneath somehow do not cooperate with dynamic types but haven't found exactly where yet.

So I've found a way to do this, but it's not very pretty...
Since I can't really do what I want within the framework itself, I utilized a concept used within the AzureTableQuery project. I pretty much just have a large C# code string that gets compiled on the fly with the exact object I need. If you look at the code of the AzureTableQuery project, you'll see that a separate library is compiled on the fly for whatever table we have, that goes through and builds all the properties and stuff we need as we query the table. Not the most elegant or lightweight solution, but it works, nevertheless.
Seriously wish there was a better way to do this, but unfortunately it's not as easy as I had hoped. Hopefully someone will be able to learn from this experience and possibly find a better solution, but I have what I need already so I'm done working on it (for now).

Getting magic strings out of QueryOver (or Fluent NHibernate perhaps)?

One of the many reason to use FluentNHibernate, the new QueryOver API, and the new Linq provider are all because they eliminate "magic string," or strings representing properties or other things that could be represented at compile time.
Sadly, I am using the spatial extensions for NHibernate which haven't been upgraded to support QueryOver or LINQ yet. As a result, I'm forced to use a combination of QueryOver Lambda expressions and strings to represent properties, etc. that I want to query.
What I'd like to do is this -- I want a way to ask Fluent NHibernate (or perhaps the NHibernate QueryOver API) what the magic string "should be." Here's a pseudo-code example:
Currently, I'd write --
var x = session.QueryOver<Shuttle>().Add(SpatialRestrictions.Intersects("abc", other_object));
What I'd like to write is --
var x = session.QueryOver<Shuttle>().Add(SpatialRestriction.Intersects(session.GetMagicString<Shuttle>(x => x.Abc), other_object));
Is there anything like this available? Would it be difficult to write?
EDIT: I just wanted to note that this would apply for a lot more than spatial -- really anything that hasn't been converted to QueryOver or LINQ yet could be benefit.

update
The nameof operator in C# 6 provides compile time support for this.
There is a much simpler solution - Expressions.
Take the following example:
public static class ExpressionsExtractor
{
public static string GetMemberName<TObj, TProp>(Expression<Func<TObj, TProp>> expression)
{
var memberExpression = expression.Body as MemberExpression;
if (memberExpression == null)
return null;
return memberExpression.Member.Name;
}
}
And the usage:
var propName = ExpressionsExtractor.GetMemberName<Person, int>(p => p.Id);
The ExpressionsExtractor is just a suggestion, you can wrap this method in whatever class you want, maybe as an extension method or preferably a none-static class.
Your example may look a little like this:
var abcPropertyName = ExpressionsExtractor.GetMemberName<Shuttle, IGeometry>(x => x.Abc);
var x = session.QueryOver<Shuttle>().Add(SpatialRestriction.Intersects(abcPropertyName, other_object));

Assuming I'm understanding your question what you might want is a helper class for each entity you have with things like column names, property names and other useful things, especially if you want to use ICriteria searches. http://nhforge.org/wikis/general/open-source-project-ecosystem.aspx has plenty of projects that might help. NhGen (http://sourceforge.net/projects/nhgen/) creates very simple helper classes which might help point you down a design path for what you might want.
Clarification Edit: following an "I don't understand" comment
In short, I don't beleive there is a solution for you just yet. The QueryOver project hasn't made it as far as you want it to. So as a possible solution in the mean time, to remove magic strings build a helper class, so your query becomes
var x = session.QueryOver<Shuttle>().Add(SpatialRestrictions.Intersects(ShuttleHelper.Abc, other_object));
That way your magic string is behind some other property ( I just chose .Abc to demonstrate but I'm sure you'll have a better idea of what you want ) then if "abc" changes ( say to "xyz" ) you either change the property name from .Abc to .Xyz and then you will have build errors to show you where you need to update your code ( much like you would with lambda expressions ) or just change the value of the .Abc property to "xyz" - which would really only work if your property had some meaningfull name ( such as .OtherObjectIntersectingColumn etc ) not that property name itself. That does have the advantage of not having to update code to correct the build errors. At that point your query could be
var x = session.QueryOver<Shuttle>().Add(SpatialRestrictions.Intersects(ShuttleHelper.OtherObjectIntersectingColumn, other_object));
I mentioned the open source project ecosystem page as it can give you some pointers on what types of helper classes other people have made so your not re-inventing the wheel so to speak.

Linq based generic alternate to Predicate<T>?

I have an interface called ICatalog as shown below where each ICatalog has a name and a method that will return items based on a Predicate<Item> function.
public interface ICatalog
{
string Name { get; }
IEnumerable<Item> GetItems(Predicate<Item> predicate);
}
A specific implementation of a catalog may be linked to catalogs in various format such as XML, or a SQL database.
With an XML catalog I end up deserializing the entire XML file into memory, so testing each item with the predicate function does does not add a whole lot more overhead as it's already in memory.
Yet with the SQL implementation I'd rather not retrieve the entire contents of the database into memory, and then filter the items with the predicate function. Instead I'd want to find a way to somehow pass the predicate to the SQL server, or somehow convert it to a SQL query.
This seems like a problem that can be solved with Linq, but I'm pretty new to it. Should my interface return IQueryable instead? I'm not concerned right now with how to actually implement a SQL version of my ICatalog. I just want to make sure my interface will allow for it in the future.

Rob has indicated how you might do this (although a more classic LINQ approach might take Expression<Func<Item,bool>>, and possbily return IQueryable<IFamily>).
The good news is that if you want to use the predicate with LINQ-to-Objects (for your xml scenario) you can then just use:
Predicate<Item> func = predicate.Compile();
or (for the other signature):
Func<Item,bool> func = predicate.Compile();
and you have a delegate (func) to test your objects with.
The problem though, is that this is a nightmare to unit test - you can only really integration test it.
The problem is that you can't reliably mock (with LINQ-to-Objects) anything involving complex data-stores; for example, the following will work fine in your unit tests but won't work "for real" against a database:
var foo = GetItems(x => SomeMagicFunction(x.Name));
static bool SomeMagicFunction(string name) { return name.Length > 3; } // why not
The problem is that only some operations can be translated to TSQL. You get the same problem with IQueryable<T> - for example, EF and LINQ-to-SQL support different operations on a query; even just First() behaves differently (EF demands you explicitly order it first, LINQ-to-SQL doesn't).
So in summary:
it can work
but think carefully whether you want to do this; a more classic black box repository / service interface may be more testable

You don't need to go all the way and create an IQueryable implementation
If you declare your GetItems method as:
IEnumerable<IFamily> GetItems(Expression<Predicate<Item>> predicate);
Then your implementing class can inspect the Expression to determine what is being asked.
Have a read of the IQueryable article though, because it explains how to build a expression tree visitor, which you'll need to build a simple version of.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.