Using custom method inside Linq Select with Entity Framework - c#

I am trying to use a custom Function inside a Linq Select that is used with EF.
I want to project each item of tblMitarbeiter onto the one tblMitarbeiterPersonalkostenstelleHistories entry that is valid at the given date.
This should be done with an extension method so that I do not repeat myself ;)
I can only get it to work when used directly on the DbSet, but not inside a Select.
How can I teach EF to recognize my method (3.) as if I had written it out (1.)?
void Main()
{
var date = DateTime.Now;
// 1. works, returns IEnumerable<tblMitarbeiterPersonalkostenstelleHistories>
tblMitarbeiters
.Select(m => m.tblMitarbeiterPersonalkostenstelleHistories.Where(p => p.ZuordnungGültigAb <= date).OrderByDescending(p => p.ZuordnungGültigAb).FirstOrDefault())
.Dump();
// 2. works, returns one tblMitarbeiterPersonalkostenstelleHistories
tblMitarbeiterPersonalkostenstelleHistories
.GetValidItemForDate(p => p.ZuordnungGültigAb, date)
.Dump();
// 3. throws NotSupportedException
tblMitarbeiters
.Select(m => m.tblMitarbeiterPersonalkostenstelleHistories.GetValidItemForDate(p => p.ZuordnungGültigAb, date))
.Dump();
// 4. throws NotSupportedException
tblMitarbeiters
.Select(m => m.tblMitarbeiterPersonalkostenstelleHistories.AsQueryable().GetValidItemForDate(p => p.ZuordnungGültigAb, date))
.Dump();
}
public static class QueryableExtensions
{
public static T GetValidItemForDate<T>(this IQueryable<T> source, Expression<Func<T, DateTime>> selector, DateTime date)
{
var dateAccessor = Expression.Lambda<Func<T, DateTime>>(Expression.Constant(date), selector.Parameters);
var lessThanOrEqual = Expression.LessThanOrEqual(selector.Body, dateAccessor.Body);
var lambda = Expression.Lambda<Func<T, bool>>(lessThanOrEqual, selector.Parameters);
return source.Where(lambda).OrderByDescending(selector).FirstOrDefault();
}
public static T GetValidItemForDate<T>(this IEnumerable<T> source, Func<T, DateTime> selector, DateTime date) =>
source.Where(i => selector(i) <= date).OrderByDescending(selector).FirstOrDefault();
}

You can, to some extent, split up complex LINQ expressions using LINQKit. If you'll excuse me, I'll use an example model that's less germanic:
public class Employee
{
public long Id { get; set; }
public virtual ICollection<EmployeeHistoryRecord> HistoryRecords { get; set; }
}
public class EmployeeHistoryRecord
{
public long Id { get; set; }
public DateTime ValidFrom { get; set; }
public long EmployeeId { get; set; }
public Employee Employee { get; set; }
}
If I understood your question correctly, it should be identical to yours where it matters.
When using LINQKit, and LINQ in general, you must understand that the only tool you have at your disposal when reusing query code, without using stored procedures, is breaking apart and stitching together expressions.
Your utility method would translate to something like this:
private static Expression<Func<IEnumerable<TItem>, TItem>> GetValidItemForDate<TItem>(
Expression<Func<TItem, DateTime>> dateSelector,
DateTime date)
{
return Linq.Expr((IEnumerable<TItem> items) =>
items.Where(it => dateSelector.Invoke(it) <= date)
.OrderByDescending(it => dateSelector.Invoke(it))
.FirstOrDefault())
.Expand();
}
What this method does is dynamically create an expression whose input is an IEnumerable<TItem> and whose result is a TItem. You can see it's pretty similar to the code you're extracting. A few things to note:
The source collection is not a parameter of the utility method, but of the expression returned.
You have to call the Invoke() extension method from LinqKit on any expressions you're "plugging into" this one.
You should call Expand() on the result if you used any Invoke()s inside it. This will make LINQKit replace the calls to Invoke() in the expression tree with the expression being invoked. (This isn't 100% necessary, but it makes it easier to fix errors when expansion fails for some reason. If you don't Expand() in every helper method, any error that happens during expansion will manifest in the method that does the expansion, and not in the method that actually contains the offending code.)
You then use this similarly, again using Invoke():
var db = new EmployeeHistoryContext();
var getValidItemForDate = GetValidItemForDate((EmployeeHistoryRecord cab) => cab.ValidFrom, DateTime.Now);
var historyRecords = db.Employees.AsExpandable().Select(emp => getValidItemForDate.Invoke(emp.HistoryRecords));
(I've only tested this code against an empty database, and only insofar as it doesn't make Entity Framework throw a NotSupportedException.)
Here, you should note:
The subexpression you're plugging into the one you're passing into Select() needs to be saved in a local variable, because LINQKit doesn't support method calls during expansion.
You need to call AsExpandable() on the first IQueryable in the chain, so LINQKit gets to work its magic.
You're probably not going to be able to use extension method call syntax inside the expression like in your question.
All the subexpressions have to be determined before expansion occurs.
These limitations stem from the fact that what you're doing isn't really calling methods. You're building one ginormous expression from a bunch of smaller ones, but the resulting expression itself still has to be something that LINQ-to-Entities will understand. On the other hand, the input has to be something LINQKit will understand, and it only handles expressions of the form localVariable.Invoke(). Any dynamism has to be in the code outside this expression tree. Basically, it's doing the same as your solution 2, just using syntax more intuitive than building the expression tree programmatically.
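To make the "stitching" idea concrete, here is a minimal sketch that does the same kind of substitution by hand, using only System.Linq.Expressions and no LINQKit. ParameterReplacer and ExpressionStitching.OnOrBefore are hypothetical names invented for this illustration. It has only been reasoned through against an in-memory AsQueryable() source, not against Entity Framework.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical helper: swaps one parameter for another expression inside a tree.
public sealed class ParameterReplacer : ExpressionVisitor
{
    private readonly ParameterExpression _from;
    private readonly Expression _to;

    public ParameterReplacer(ParameterExpression from, Expression to)
    {
        _from = from;
        _to = to;
    }

    protected override Expression VisitParameter(ParameterExpression node) =>
        node == _from ? _to : base.VisitParameter(node);
}

public static class ExpressionStitching
{
    // Builds "item => dateSelector(item) <= date" by inlining the selector's body,
    // so the result contains no method call the query provider would have to understand.
    public static Expression<Func<T, bool>> OnOrBefore<T>(
        Expression<Func<T, DateTime>> dateSelector, DateTime date)
    {
        var item = Expression.Parameter(typeof(T), "item");
        var selectorBody = new ParameterReplacer(dateSelector.Parameters[0], item)
            .Visit(dateSelector.Body);
        var body = Expression.LessThanOrEqual(selectorBody, Expression.Constant(date));
        return Expression.Lambda<Func<T, bool>>(body, item);
    }
}
```

The returned predicate can be passed to Where() like any hand-written lambda, for example source.Where(ExpressionStitching.OnOrBefore<EmployeeHistoryRecord>(r => r.ValidFrom, date)). LINQKit's Invoke()/Expand() pair automates this kind of substitution.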
Last, but not least: when doing this, do not go overboard. Complex EF queries are already really hard to debug when anything goes wrong, because you're not told where in your code the problem is. If the query was assembled dynamically from bits and pieces all over your codebase, debugging some errors (like the delightful "Unable to cast the type X to type Y") will easily become a nightmare.
(For future questions: I think it's usually a good idea to make a code sample from scratch instead of using bits from your actual codebase. Those might be overly domain-specific, and understanding the names might require some context you take for granted. Identifiers should ideally be simple English names everyone can understand. I can maybe speak enough German to interview for a job in it, but "Mitarbeiterpersonalkostenstellehistorie" is just hard to keep in my head and reason about when I haven't actually worked on your project long enough to be familiar with what it's supposed to mean.)

Related

What are the benefits of lambda expressions and Linq [duplicate]

After reading this article, I can't figure out why lambda expressions are ever used. To be fair, I don't think I have a proper understanding of what delegates and expression tree types are, but I don't understand why anyone would use a lambda expression instead of a declared function. Can someone enlighten me?
First: brevity and locality:
Which would you rather write, read and maintain? This:
var addresses = customers.Select(customer=>customer.Address);
or:
static private Address GetAddress(Customer customer)
{
return customer.Address;
}
... a thousand lines later ...
var addresses = customers.Select(GetAddress);
What's the point of cluttering up your program with hundreds or thousands of four-line functions when you could just put the code you need where you need it as a short expression?
Second: lambdas close over local scopes
Which would you rather read, write and maintain, this:
var currentCity = GetCurrentCity();
var addresses = customers.Where(c=>c.City == currentCity).Select(c=>c.Address);
or:
static private Address GetAddress(Customer customer)
{
return customer.Address;
}
private class CityGetter
{
public string currentCity;
public bool DoesCityMatch(Customer customer)
{
return customer.City == this.currentCity;
}
}
....
var currentCityGetter = new CityGetter();
currentCityGetter.currentCity = GetCurrentCity();
var addresses = customers.Where(currentCityGetter.DoesCityMatch).Select(GetAddress);
All that vexing code is written for you when you use a lambda.
Third: Query comprehensions are rewritten to lambdas for you
When you write:
var addresses = from customer in customers
where customer.City == currentCity
select customer.Address;
it is transformed into the lambda syntax for you. Many people find this syntax pleasant to read, but we need the lambda syntax in order to actually make it work.
Fourth: lambdas are optionally type-inferred
Notice that we don't have to give the type of "customer" in the query comprehension above, or in the lambda versions, but we do have to give the type of the formal parameter when declaring it as a static method. The compiler is smart about inferring the type of a lambda parameter from context. This makes your code less redundant and more clear.
Fifth: Lambdas can become expression trees
Suppose you want to ask a web server "send me the addresses of the customers that live in the current city." Do you want to (1) pull down a million customers from the web site and do the filtering on your client machine, or (2) send the web site an object that tells it "the query contains a filter on the current city and then a selection of the address"? Let the server do the work and send you only the results that match.
Expression trees allow the compiler to turn the lambda into code that can be transformed into another query format at runtime and sent to a server for processing. Little helper methods that run on the client do not.
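As a small illustration of that last point, the sketch below builds the "lives in this city" filter as an expression tree, which can be inspected as data or compiled into a delegate. The Customer class and the CustomerFilters.LivesIn helper are hypothetical names made up for this example.

```csharp
using System;
using System.Linq.Expressions;

public class Customer
{
    public string City { get; set; }
    public string Address { get; set; }
}

public static class CustomerFilters
{
    // Because the return type is Expression<...>, the compiler emits a data
    // structure describing the lambda instead of executable code.
    public static Expression<Func<Customer, bool>> LivesIn(string city) =>
        c => c.City == city;

    // A query provider walks the tree like this to find out what is being compared.
    public static string ComparedMember(Expression<Func<Customer, bool>> filter)
    {
        var comparison = (BinaryExpression)filter.Body;
        return ((MemberExpression)comparison.Left).Member.Name;
    }
}
```

Calling Compile() on the tree turns it back into an ordinary Func<Customer, bool>, so the same lambda can still run locally when that is what you want.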
The primary reason you'd use a lambda over a declared function is when you need to use a piece of local information in the delegate expression. For example
void Method(IEnumerable<Student> students, int age) {
var filtered = students.Where(s => s.Age == age);
...
}
Lambdas allow for the easy capture of local state to be used within the delegate expression. To do this manually requires a lot of work because you need to declare both a function and a containing type to hold the state. For example here's the above without a lambda
void Method(IEnumerable<Student> students, int age) {
var c = new Closure() { Age = age };
var filtered = students.Where(c.WhereDelegate);
...
}
class Closure {
public int Age;
public bool WhereDelegate(Student s) {
return s.Age == Age;
}
}
Typing this out is tedious and error prone. Lambda expressions automate this process.
Let's leave expression trees out of the equation for the moment and pretend that lambdas are just a shorter way to write delegates.
This is still a big win in the realm of statically typed languages like C# because such languages require lots of code to be written in order to achieve relatively simple goals. Do you need to sort an array of strings by string length? You need to write a method for that. And you need to write a class to put the method into. And then good practice dictates that this class should be in its own source file. In any but the smallest project, all of this adds up. When we're talking about small stuff, most people want a less verbose path to the goal, and lambdas are about as terse as it can get.
Furthermore, lambdas can easily create closures (capture variables from the current scope and extend their lifetime). This isn't magic (the compiler does it by creating a hidden class and performing some other transformations that you can do yourself), but it's so much more convenient than the manual alternative.
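A minimal sketch of the lifetime extension mentioned above; Counters.MakeCounter is a hypothetical name used only for this illustration.

```csharp
using System;

public static class Counters
{
    // 'count' looks like a local variable, but the returned lambda captures it,
    // so it keeps living after MakeCounter has returned.
    public static Func<int> MakeCounter()
    {
        int count = 0;
        return () => ++count;
    }
}
```

Each call to MakeCounter() gets its own captured variable, so two counters advance independently of each other.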
And then there are expression trees: a way for you to write code and have the compiler transform this code into a data structure that can be parsed, modified and even compiled at runtime. This is an extremely powerful feature that opens the door to impressive functionality (which I definitely consider LINQ to be). And you get it "for free".
http://msdn.microsoft.com/en-us/magazine/cc163362.aspx
Great article on what lambdas are, and why you can/should use them.
Essentially, the lambda expression provides a shorthand for the compiler to emit methods and assign them to delegates; this is all done for you. The benefit you get with a lambda expression that you don't get from a delegate/function combination is that the compiler performs automatic type inference on the lambda arguments.
They are heavily used with LINQ; in fact, LINQ would be pretty bad without them. You can do stuff like:
Database.Table.Where(t => t.Field == "Hello");
They make it easy to pass a simple piece of functionality to another function. For example, I may want to perform an arbitrary, small function on every item in a list (perhaps I want to square it, or take the square root, or so on). Rather than writing a new loop and function for each of these situations, I can write it once, and apply my arbitrary functionality defined later to each item.
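A short sketch of that "write the loop once" idea; ListOps.Apply is a hypothetical helper, not a framework method (in practice Enumerable.Select already plays this role).

```csharp
using System;
using System.Collections.Generic;

public static class ListOps
{
    // The loop is written once; the per-item operation is supplied by the caller.
    public static List<double> Apply(IEnumerable<double> items, Func<double, double> operation)
    {
        var result = new List<double>();
        foreach (var item in items)
        {
            result.Add(operation(item));
        }
        return result;
    }
}
```

Squaring becomes ListOps.Apply(values, x => x * x), and taking square roots becomes ListOps.Apply(values, Math.Sqrt), with no new loop in either case.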
Lambda makes code short and sweet. Consider the following two examples:
using System;
using System.Collections.Generic;
public class Student
{
public int ID { get; set; }
public string Name { get; set; }
public float grade { get; set; }
// Prints a message for every student for whom the supplied test returns true.
public static void failed(List<Student> studentList, isFaild fail)
{
foreach (Student student in studentList)
{
if (fail(student))
{
Console.WriteLine("Sorry" + " " + student.Name + " " + "you failed this exam!");
}
}
}
}
public delegate bool isFaild(Student myStudent);
class Program
{
static void Main(string[] args)
{
List<Student> studentsList = new List<Student>();
studentsList.Add(new Student { ID = 101, Name = "Rita", grade = 99 });
studentsList.Add(new Student { ID = 102, Name = "Mark", grade = 48 });
Student.failed(studentsList, std => std.grade < 60); // with lambda
Student.failed(studentsList, isFaildMethod); // without lambda
}
private static bool isFaildMethod(Student myStudent) // without lambda: a separate named method
{
return myStudent.grade < 60;
}
}

Can I clone an IQueryable to run on a DbSet for another DbContext?

Suppose I have built up, through some conditional logic over many steps, an IQueryable<T> instance we'll call query.
I want to get a count of total records and a page of data, so I want to call query.CountAsync() and query.Skip(0).Take(10).ToListAsync(). I cannot call these in succession, because a race condition occurs where they both try to run a query on the same DbContext at the same time. This is not allowed:
"A second operation started on this context before a previous asynchronous operation completed. Use 'await' to ensure that any asynchronous operations have completed before calling another method on this context. Any instance members are not guaranteed to be thread safe."
I do not want to 'await' the first before even starting the second. I want to fire off both queries as soon as possible. The only way to do this is to run them from separate DbContexts. It seems ridiculous that I might have to build the entire query (or 2, or 3) side-by-side starting with different instances of DbSet. Is there any way to clone or alter an IQueryable<T> (not necessarily that interface, but its underlying implementation) such that I can have one copy that runs on DbContext "A", and another that will run on DbContext "B", so that both queries can be executing simultaneously? I'm just trying to avoid recomposing the query X times from scratch just to run it on X contexts.
There is no standard way of doing that. The problem is that EF6 query expression trees contain constant nodes holding ObjectQuery instances which are bound to the DbContext (actually the underlying ObjectContext) used when creating the query. Also there is a runtime check before executing the query if there are such expressions bound to a different context than the one executing the query.
The only idea that comes to mind is to process the query expression tree with an ExpressionVisitor and replace these ObjectQuery instances with new ones bound to the new context.
Here is a possible implementation of the aforementioned idea:
using System.Data.Entity.Core.Objects;
using System.Data.Entity.Infrastructure;
using System.Linq;
using System.Linq.Expressions;
namespace System.Data.Entity
{
public static class DbQueryExtensions
{
public static IQueryable<T> BindTo<T>(this IQueryable<T> source, DbContext target)
{
var binder = new DbContextBinder(target);
var expression = binder.Visit(source.Expression);
var provider = binder.TargetProvider;
return provider != null ? provider.CreateQuery<T>(expression) : source;
}
class DbContextBinder : ExpressionVisitor
{
ObjectContext targetObjectContext;
public IQueryProvider TargetProvider { get; private set; }
public DbContextBinder(DbContext target)
{
targetObjectContext = ((IObjectContextAdapter)target).ObjectContext;
}
protected override Expression VisitConstant(ConstantExpression node)
{
if (node.Value is ObjectQuery objectQuery && objectQuery.Context != targetObjectContext)
return Expression.Constant(CreateObjectQuery((dynamic)objectQuery));
return base.VisitConstant(node);
}
ObjectQuery<T> CreateObjectQuery<T>(ObjectQuery<T> source)
{
var parameters = source.Parameters
.Select(p => new ObjectParameter(p.Name, p.ParameterType) { Value = p.Value })
.ToArray();
var query = targetObjectContext.CreateQuery<T>(source.CommandText, parameters);
query.MergeOption = source.MergeOption;
query.Streaming = source.Streaming;
query.EnablePlanCaching = source.EnablePlanCaching;
if (TargetProvider == null)
TargetProvider = ((IQueryable)query).Provider;
return query;
}
}
}
}
One difference from standard EF6 LINQ queries is that this produces ObjectQuery<T> rather than DbQuery<T>. Apart from ToString() not returning the generated SQL, I haven't noticed any difference in further query building or execution. It seems to work, but use it with care and at your own risk.
You could write a function to build up your query, taking DbContext as a parameter.
public IQueryable<T> MyQuery(DbContext<T> db)
{
return db.Table
.Where(p => p.reallycomplex)
....
...
.OrderBy(p => p.manythings);
}
I've done this many times and it works well.
Now it's easy to make queries with two different contexts:
IQueryable<T> q1 = MyQuery(dbContext1);
IQueryable<T> q2 = MyQuery(dbContext2);
If your concern was the execution time taken to build the IQueryable objects, then my only suggestion is don't worry about it.
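Here is a self-contained sketch of the same idea that takes the root IQueryable as the parameter, which makes it easy to try out against in-memory data. Order and OrderQueries.LargeOrders are hypothetical names for this illustration.

```csharp
using System.Linq;

public class Order
{
    public int Id { get; set; }
    public decimal Total { get; set; }
}

public static class OrderQueries
{
    // The composition lives in one place; the caller decides which root set
    // (and therefore which context) it is applied to.
    public static IQueryable<Order> LargeOrders(IQueryable<Order> source, decimal minTotal) =>
        source
            .Where(o => o.Total >= minTotal)
            .OrderBy(o => o.Id);
}
```

With EF, the count would then presumably come from LargeOrders(contextA.Set<Order>(), 100m).CountAsync() and the page from LargeOrders(contextB.Set<Order>(), 100m).Skip(0).Take(10).ToListAsync(), where contextA and contextB stand in for your two context instances.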
So you have an IQueryable<T> that will be performed on DbContext A as soon as the query is executed and you want the same query to run on DbContext B when the query is executed.
For this you'll have to understand the difference between an IEnumerable<T> and an IQueryable<T>.
An IEnumerable<T> holds all code to enumerate over the elements that the enumerable represents. The enumeration starts when GetEnumerator and MoveNext are called. This can be done explicitly. However it is usually done implicitly by functions like foreach, ToList, FirstOrDefault, etc.
An IQueryable does not hold the code to enumerate, it holds an Expression and a Provider. The Provider knows who will execute the query, and it knows how to translate the Expression into the language that is understood by the query executioner.
Due to this separation, it is possible to let the same Expression be executed by different data sources. They don't even have to be of the same type: one data source can be a database management system that understands SQL, the other one could be a comma separated file.
As long as you concatenate Linq statements that return an IQueryable, the query is not executed, only the Expression is changed.
As soon as enumeration starts, either by calling GetEnumerator / MoveNext, or by using foreach or one of the LINQ functions that do not return an IQueryable, the Provider will translate the Expression into the language that the data source understands and communicate with the data source to execute the query. The result of the query is an IEnumerable, which can be enumerated as if all data was in local code.
Some Providers are smart and use some buffering, so that not all data is transferred to local memory, but only part of the data. New data is queried when needed. So if you do a foreach in a database with a zillion elements, only the first few (thousands) elements are queried. More data is queried if your foreach runs out of fetched data.
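The deferred execution described above can be observed with LINQ to Objects alone. In this minimal sketch (DeferredDemo is a hypothetical name), the query is built once and executed twice, and the second execution sees data that was added after the query was composed.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    public static (int Before, int After) Run()
    {
        var numbers = new List<int> { 1, 2, 3 };

        // Only the Expression is built here; nothing is enumerated yet.
        IQueryable<int> query = numbers.AsQueryable().Where(n => n > 1);

        int before = query.Count(); // executes the query now
        numbers.Add(4);
        int after = query.Count();  // executes it again, against the current data

        return (before, after);
    }
}
```

Run() returns (2, 3): the same query object gives a different answer once the underlying list has changed.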
So you already have one IQueryable<T>, and therefore an Expression, a Provider and an ElementType. You want the same Expression / ElementType to be executed by a different Provider. You may even want to change the Expression slightly before you execute it.
Therefore you need to be able to create an object that implements IQueryable<T> and lets you set the Expression, ElementType and Provider:
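class MyQueryable<T> : IQueryable<T>
{
public Type ElementType { get; set; }
public Expression Expression { get; set; }
public IQueryProvider Provider { get; set; }
// IQueryable<T> also requires enumeration; hand it over to the provider.
public IEnumerator<T> GetEnumerator() => Provider.Execute<IEnumerable<T>>(Expression).GetEnumerator();
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator() => GetEnumerator();
}
IQueryable<T> queryOnDbContextA = dbContextA ...
IQueryable<T> setInDbContextB = dbContextB.Set<T>();
IQueryable<T> queryOnDbContextB = new MyQueryable<T>()
{
ElementType = queryOnDbContextA.ElementType,
Expression = queryOnDbContextA.Expression, // reuse the expression built on context A
Provider = setInDbContextB.Provider,
};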
If desired you can adjust the query on the other context before executing it:
var getPageOnContextB = queryOnDbContextB
.Skip(...)
.Take(...);
Both queries are still not executed yet. Execute them:
var countA = await queryOnDbContextA.CountAsync();
var fetchedPageContextB = await getPageOnContextB.ToListAsync();

Using custom methods or delegates in linq-sql statement

I am searching the land for an elegant, reusable solution to a problem that has been bothering me for ages. Thus,
Say I have some business logic I use all over the site: (don't get held up as to how simple this is, it could be complex)
public DateTime ExpiryDate
{
get { return DateAdded.Date.AddMonths(ApplicationConfiguration.Rule3ExpiryLengthInMonths); }
}
And a Linq statement:
groupedByPatient.Count(x =>
x.Max(a => System.Data.Objects.EntityFunctions.AddMonths(a.DateAdded, ApplicationConfiguration.Rule3ExpiryLengthInMonths))
<= DateTime.Now);
This "expired" logic has got to be repeated as (understandably) Expired is not a column in the db. The net result is that we end up with scattered business logic across the code. Ideally we would have:
var count = groupedByPatient.Count(x =>
x.Max(a => a.ExpiryDate)
<= DateTime.Now);
Theoretically as long as you conform to Linq's "c#" rules you should be able to abstract this code out, say:
public DateTime ExpiryDate
{
get { return System.Data.Objects.EntityFunctions.AddMonths(
DateAdded, ApplicationConfiguration.Rule3ExpiryLengthInMonths).D }
}
Why don't you create an extension method on DateTime? That way, whenever you have a date you can just call that to get your expiry date:
static class DateTimeExtensions
{
public static DateTime ExpiryDate(this DateTime dte)
{
return dte.AddMonths(ApplicationConfiguration.Rule3ExpiryLengthInMonths);
}
}
If I understand your example correctly, DateAdded is a date column in your table, from which you wish to find the expiry date. Then, just do this:
var count = groupedByPatient.Count(x =>
x.Max(a => a.DateAdded.ExpiryDate()) <= DateTime.Now);
I'm not sure from the subject of this vs the code samples you've put in, but I'm pretty sure the 2nd here is what you're looking for.
If you just want the result for a materialised query (ie after you've got the data), then use extensions:
public static DateTime ToExpiryDate(this DateTime date)
{
return date.AddMonths(ApplicationConfiguration.Rule3ExpiryLengthInMonths);
}
If you want the result from within a IQueryable (which by your subject is what I think you are looking for), then you can use expressions:
public static Expression<Func<IEnumerable<YourEntity>, DateTime?>> MaxExpiryDate = y => y.Max(
e => System.Data.Objects.EntityFunctions.AddMonths(e.DateAdded, ApplicationConfiguration.Rule3ExpiryLengthInMonths)
);
Then your query would look like:
var count = groupedByPatient.Count(x => x.YourEntities(MaxExpiryDate) <= DateTime.Now);
NOTE: The Func<> MUST be wrapped in Expression<>. Both forms will appear to work, but without the Expression<> wrapper the query is forced to materialise before the function is run. Wrapping the function in Expression<> tells EF to evaluate it as part of the query.
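The reason both forms "appear to work" is overload resolution: a plain delegate binds to the Enumerable operators, while an expression binds to the Queryable ones. A minimal sketch with in-memory data (OverloadDemo is a hypothetical name):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class OverloadDemo
{
    private static readonly Func<int, bool> AsDelegate = n => n > 1;
    private static readonly Expression<Func<int, bool>> AsExpression = n => n > 1;

    // Binds to Enumerable.Where: from here on the work happens in memory.
    public static IEnumerable<int> FilterWithDelegate(IQueryable<int> source) =>
        source.Where(AsDelegate);

    // Binds to Queryable.Where: the filter stays part of the query expression.
    public static IQueryable<int> FilterWithExpression(IQueryable<int> source) =>
        source.Where(AsExpression);
}
```

Both methods return the same elements, but only the second result is still an IQueryable that a provider such as EF can translate as a whole.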

Apply Linq Func<T, TResult> key selector at single element level

Sorry if the title is misleading, wasn't sure how to describe this one.
My end goal is to have an extension method on IQueryable<T> and some form of expression (see below for an example) that will allow me to return an IQueryable<EntityIndex<T>> (or similar), which contains the original T in the Entity field and an array/IEnumerable containing the elements as described by that expression.
I know that doesn't really make sense, hopefully it will after an example...
This is what I have so far:
class EntityIndex<T, TKey>
{
T Entity { get; set; }
// Doesn't have to be IEnumerable, whatever is easier
IEnumerable<TKey> Index { get; set; }
}
static class Elsewhere
{
[Extension()]
public IQueryable<EntityIndex<T, TKey>> IndexBy<T, TKey>(this IQueryable<T> source, Expression<Func<T, TKey[]>> indexSelector)
{
return source.Select(n => new EntityIndex<T, TKey> {
Entity = n,
Index = new T[] { n }.Select(indexSelector)
});
}
}
Note: The above does not compile, it's simply there to try and show what I'm trying to achieve.
I've used the standard selector, but sub-optimally, had to arbitrarily create an array of T on the assignment to the 'Index' property to be able to apply the selector. I'm hoping a better choice of parameter may resolve this, but possibly not. The main issue is this doesn't compile so if there is a minor tweak that will allow it to work that's fine by me, if you can understand my gibberish and understand what I'm trying to do, and happen to know a better way to go about it I'd be greatly appreciative.
Ideally, I need the solution to be understood by the L2S engine, which I'm not convinced the above is thanks to the introduction of the EntityIndex class, but I'm holding out hope that it'll treat it as an anonymous class.
EDIT:
Good point Damien, the bigger picture is probably much easier to describe...
I want an extension method that accepts an expression, the expression should describe which fields on the entity to index, which will be used after this particular expression to allow a criterion (where clause) to be applied to the selected fields.
Long story short, in a number of places in code we have a wildcard string search. If I have an EntityA with Property1, Property2, Property3, etc, it is not uncommon to see code such as:
Handwritten, please excuse minor typos
public string[] WildcardSearch(string prefixText, int count)
{
string searchTerm = prefixText.Replace(wildcard, string.Empty);
if (prefixText.StartsWith(wildcard) && prefixText.EndsWith(wildcard)) {
return entitySet.Where(n => n.Property1.Contains(searchTerm) || n.Property2.Contains(searchTerm)).Select(n => n.Property3).ToArray();
} else if (prefixText.StartsWith(wildcard)) {
return entitySet.Where(n => n.Property1.EndsWith(searchTerm) || n.Property2.EndsWith(searchTerm)).Select(n => n.Property3).ToArray();
// you get the picture, same with EndsWith, no wildcards defaults to contains...
}
}
EDIT:
Further clarification - using the above WildcardSearch as an example, what I was hoping for was to be able to have a selector as follows or similar:
Func<EntityA, IEnumerable<string>> indexSelector = n => new string[] {
n.Property1,
n.Property2
};
// Alternatively, a ParamArray of keySelector might work?
Func<EntityA, string>[] keySelectors = new Func<EntityA, string>[] {
n => n.Property1,
n => n.Property2
};
Given an adequate expression describing which fields on the entity to search, returning the IQueryable<EntitySearch<T>> as shown above, I hoped to be able to apply a single criterion, similar to:
Func<EntitySearch<T>, bool> criterion = n => false;
if (wildcardIsContains) {
criterion = n => n.Values.Any(x => x.Contains(searchTerm));
} else if (wildCardIsStartsWith) {
criterion = n => n.Values.Any(x => x.StartsWith(searchTerm));
//etc
}
Given the extension at the very top that I can't get to work, and this criterion logic, I should be able to take an IQueryable<T>, select some fields, and apply an appropriate wildcard search on the fields, finally returning IQueryable<T> again having added the filtering.
Thanks!
Please comment if you need more information/clarification...
EDIT:
Fair one #subkamren and thanks for the interest. Some non-generic examples may be of use. I'll draft something up and add them shortly. For the time being, some clarification based on your comment...
Given an IQueryable<Animal> I want an extension allowing me to select fields on Animal which I intend to search/index by. For example, Animal.Description, Animal.Species.Name etc. This extension should return something like an IIndexedQueryable<Animal>. That is the issue I'm trying to deal with in the question above. The wider picture mentioned, which I'd be exceptionally pleased if you're willing to help with, is as follows:
In turn, I would like an extension on the IIndexedQueryable<T> interface that takes a string search term. The extension should resolve the wildcards within the search term, extend the original IQueryable with the necessary criterion to perform a search on the indexed fields, and return an IQueryable<T> again.
I appreciate this could be done in a single step, but I hoped to do it this way so that later on I can look into adding a third extension method applicable to IIndexedQueryable<T> allowing me to perform a freetext search with SQL Server... ^^ Make any sense?
That's the bigger picture at least, this question deals primarily with being able to specify the fields I aim to index in such a way I can use them thereafter as mentioned here.
So something like:
public static IEnumerable<EntityIndex<T, Y>> IndexBy<T, Y>(this IEnumerable<T> entities, Func<T, Y> indexSelector) {
return entities.Select(e => new EntityIndex<T, Y> { Entity = e, IndexValue = indexSelector(e) });
}
Noting that generically defining EntityIndex with the TIndexType (called Y here) is important because you don't know ahead of time what the index is. The use of a generic allows Y to be an enumeration, thus the following would work as an index selector:
// Assuming Animal has attributes "Kingdom", "Phylum", "Family", "Genus", "Species"
// this returns an enumeration of EntityIndex<Animal, String[]>
var animalsClassified = someAnimals.IndexBy(a => new String[] { a.Kingdom, a.Phylum, a.Family, a.Genus, a.Species });
EDIT (Adding further detail):
Using the above, you can group the results by unique index value:
var animalClassifications = animalsClassified
.SelectMany(ac => ac.IndexValue.Select(iv => new { IndexValue = iv, Entity = ac.Entity }))
.GroupBy(kvp => kvp.IndexValue);
What I've described here, by the way, is (a very simplified form of) the MapReduce algorithm as popularized by Google. A distributed form of the same is commonly used for keyword identification in text search, where you want to build an index of (search term)->(list of containing documents).
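As a self-contained sketch of that (search term) -> (list of containing documents) index, the helper below does the "map" step with SelectMany and the "reduce" step with GroupBy. InvertedIndex.Build is a hypothetical name, and this is an in-memory illustration only, not something LINQ to SQL would translate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InvertedIndex
{
    // Map: emit one (key, entity) pair per index value of each entity.
    // Reduce: group the pairs by key and collect the entities under each key.
    public static Dictionary<TKey, List<T>> Build<T, TKey>(
        IEnumerable<T> entities, Func<T, IEnumerable<TKey>> indexSelector)
    {
        return entities
            .SelectMany(e => indexSelector(e).Select(k => new { Key = k, Entity = e }))
            .GroupBy(pair => pair.Key, pair => pair.Entity)
            .ToDictionary(g => g.Key, g => g.ToList());
    }
}
```

For a text search, the entities would be documents and indexSelector would split each document into its words, for example doc => doc.Split(' ').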

What is the correct way to get a sortable string from a DateTime in a LINQ entity?

// EmployeeService:
[WebMethod]
public List<Employee> GetEmployees()
{
return
(
from p in db.Persons
where p.Type == 'E'
select new Employee
{
Name = p.FullName,
//HireDate = p.CreationDate.ToString(),
// works, but not in the format I need
//HireDate = p.CreationDate.ToString("s"),
// throws a NotSupportedException
//HireDate = Wrapper(p.CreationDate),
// works, but makes me worry about performance
// and feel dead inside
}
).ToList<Employee>();
}
private String Wrapper(DateTime date)
{
return date.ToString("s");
}
// Elsewhere:
public class Employee
{
public String Name;
public String HireDate;
}
I'm using a JavaScript framework that needs dates in ISO 8601 style format, which is exactly what calling .ToString("s") on a DateTime object will return.
Is there a cleaner/more efficient way to do this in LINQ to SQL?
I believe the Wrapper trick is as good as you can get in this situation. Sorry.
Update: Seems this has been asked again here: Linq to Sql - DateTime Format - YYYY-MMM (2009-Mar). The answer was pretty much "sorry" there too; considering who participated in that question, I'm now really sure that you can't do better.
The problem is that, when using IQueryable, the provider tries to translate all of the LINQ expressions into something it can send down to the database. It has no way of knowing what to do with ToString("s"), so a NotSupportedException is thrown.
If you were to add .AsEnumerable() before the Select call, then it should work. The difference is that the Person object will be brought into memory completely, and then the projection (the Select method) will be run as compiled .NET code, not as SQL. So essentially anything after AsEnumerable() is done in memory and not in the database, which is why it's generally not recommended until you've trimmed down the number of rows as much as possible (i.e. after all Where and OrderBy calls).
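A minimal sketch of that ordering, with the filter and a narrow projection placed before AsEnumerable() and the formatting after it. Person and HireDates.Format are hypothetical names modelled on the question, and the sketch has only been reasoned through against an in-memory AsQueryable() source, not a real LINQ to SQL context.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public string FullName { get; set; }
    public char Type { get; set; }
    public DateTime CreationDate { get; set; }
}

public static class HireDates
{
    public static List<string> Format(IQueryable<Person> persons) =>
        persons
            .Where(p => p.Type == 'E')                        // still handled by the query provider
            .Select(p => new { p.FullName, p.CreationDate })  // fetch only what is needed
            .AsEnumerable()                                   // everything below runs in memory
            .Select(p => p.CreationDate.ToString("s"))        // plain .NET formatting
            .ToList();
}
```

For a CreationDate of 15 March 2009, 13:45:30, the formatted value is "2009-03-15T13:45:30".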
