Retrieve user defined 'Combination of Columns' from database with Entity Framework - c#

I need to retrieve user-defined columns from a database with Entity Framework.
I need to build a column projection from a collection passed in at runtime: a List of strings, where each element is a column name or an expression over column names.
I have a list of strings like this:
List<string> columnsToSelect = new List<string>();
columnsToSelect.Add("col1 + col2");
columnsToSelect.Add("col2");
columnsToSelect.Add("col3 + col4");
I have a table called 'RawData' with 6 columns like this:
Id EmpId col1 col2 col3 col4
Now, if I run a simple query:
var rawDatacolums = Context.RawData.Where(a => a.EmpId == @EmpId)
this will generate a SQL statement like this:
Select Id,EmpId,col1,col2,col3,col4 from RawData
where EmpId = @EmpId
Here I want to pass columnsToSelect as an argument, and the result should be based on the column selectors in that list.
What I want to do is:
var rawDatacolums = Context.RawData.Select(columnsToSelect).Where(a =>
a.EmpId == @EmpId)
which should generate SQL like:
Select col1 + col2 as Col1Col2, col2 as Col2, col3 + col4 as Col3Col4
from RawData where EmpId = @EmpId
I have tried to use "SelectProperties" from this article here:
https://byalexblog.net/entity-framework-dynamic-columns
https://github.com/lAnubisl/entity-framework-dynamic-queries/blob/master/entity-framework-dynamic-queries/SelectiveQuery.cs
var rawDatacolums = Context.RawData.SelectProperties(columnsToSelect)
If I pass exact column names such as col1, col2 in the list, it works,
but it doesn't work for what I want, for example the sum of two columns.
My requirement is to project the addition of columns,
like 'col1 + col2' and 'col3 + col4'.
Updated Answer
Based on a couple of suggestions, I experimented more with Dynamic LINQ
and made it work: I was able to apply various math operations in my projection
and create a dynamic class out of it.
The original GitHub reference is:
https://github.com/kahanu/System.Linq.Dynamic
but I found the explanation here more useful:
http://ak-dynamic-linq.azurewebsites.net/GettingStarted#conversions
Some other material I referred to and had to use, which may be helpful to someone: http://www.albahari.com/nutshell/predicatebuilder.aspx
Sample working code looks like this:
var sampleQuery = "New(Col1+Col2 as Stage1Count)";
IEnumerable queryResult= Context.RawData.AsQueryable().Select(sampleQuery );
System.Diagnostics.Debug.WriteLine("Debug Sample Query: " + queryResult.ToString());
foreach (var cust in queryResult)
{
System.Diagnostics.Debug.WriteLine("Debug Sample StageCount : " + cust.ToString());
}
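To connect this back to the columnsToSelect list from the question, the selector string can be assembled from the list before it is handed to the Dynamic LINQ Select call. This is only a sketch: SelectorBuilder, MakeAlias and BuildSelector are hypothetical names, the alias rule (capitalise each identifier and concatenate) is my own assumption, and the strings end up in a parser, so they should come from a trusted or validated source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class SelectorBuilder
{
    // Hypothetical helper: derives an alias such as "Col1Col2" from "col1 + col2".
    public static string MakeAlias(string expression)
    {
        var names = Regex.Matches(expression, @"[A-Za-z_][A-Za-z0-9_]*")
                         .Cast<Match>()
                         .Select(m => char.ToUpperInvariant(m.Value[0]) + m.Value.Substring(1));
        return string.Concat(names);
    }

    // Produces a Dynamic LINQ selector such as "new(col1 + col2 as Col1Col2, col2 as Col2)".
    public static string BuildSelector(IEnumerable<string> columns) =>
        "new(" + string.Join(", ", columns.Select(c => $"{c} as {MakeAlias(c)}")) + ")";
}

public static class SelectorDemo
{
    public static void Main()
    {
        var columnsToSelect = new List<string> { "col1 + col2", "col2", "col3 + col4" };
        Console.WriteLine(SelectorBuilder.BuildSelector(columnsToSelect));
        // prints: new(col1 + col2 as Col1Col2, col2 as Col2, col3 + col4 as Col3Col4)
    }
}
```

The resulting string would take the place of sampleQuery in the snippet above.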
Thanks all for your comments and suggestions! Cheers!

It is obviously possible to create classes at runtime, or even new anonymous types, but they are extremely limited in how you can use them in your code.
If you prefer to work within the modern generic Queryable framework, and avoid creating classes at runtime that have limited compile time access, you can roll your own expression parser and build Expression trees. The trick is to use an Array type as the return from the Select to make the members accessible. This does mean all the expressions must return the same type, but this implementation will convert all the expressions to one type if necessary.
Here is a sample implementation:
public static class IQueryableExt {
public static Expression<Func<TRec, TVal?[]>> SelectExpr<TRec, TVal>(this IEnumerable<string> strExprs) where TVal : struct {
var p = Expression.Parameter(typeof(TRec), "p");
var exprs = strExprs.Select(se => {
var e = se.ParseExpression(p);
return e.Type.IsNullableType() && e.Type.GetGenericArguments()[0] == typeof(TVal) ? e : Expression.Convert(e, typeof(TVal?));
}).ToArray();
return Expression.Lambda<Func<TRec, TVal?[]>>(Expression.NewArrayInit(typeof(TVal?), exprs), p);
}
static char[] operators = { '+', '-', '*', '/' };
static Regex tokenRE = new Regex(@"(?=[-+*/()])|(?<=[-+*/()])", RegexOptions.Compiled);
static HashSet<char> hsOperators = operators.ToHashSet();
static Dictionary<char, ExpressionType> opType = new Dictionary<char, ExpressionType>() {
{ '*', ExpressionType.Multiply },
{ '/', ExpressionType.Divide },
{ '+', ExpressionType.Add },
{ '-', ExpressionType.Subtract }
};
static int opPriority(char op) => hsOperators.Contains(op) ? Array.IndexOf(operators, op) >> 1 : (op == ')' ? -1 : -2);
public static Expression ParseExpression(this string expr, ParameterExpression dbParam) {
var opStack = new Stack<char>();
opStack.Push('(');
var operandStack = new Stack<Expression>();
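// trim each token so that "col1 + col2" yields "col1", "+", "col2" rather than "col1 " and " col2"
foreach (var t in tokenRE.Split(expr).Select(s => s.Trim()).Where(s => s.Length > 0).Append(")")) {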
if (t.Length > 1) // process column name
operandStack.Push(Expression.PropertyOrField(dbParam, t));
else {
while (t[0] != '(' && opPriority(opStack.Peek()) >= opPriority(t[0])) {
var curOp = opStack.Pop();
var right = operandStack.Pop();
var left = operandStack.Pop();
if (right.Type != left.Type) {
if (right.Type.IsNullableType())
left = Expression.Convert(left, right.Type);
else if (left.Type.IsNullableType())
right = Expression.Convert(right, left.Type);
else
throw new Exception($"Incompatible types for operator{curOp}: {left.Type.Name}, {right.Type.Name}");
}
operandStack.Push(Expression.MakeBinary(opType[curOp], left, right));
}
if (t[0] != ')')
opStack.Push(t[0]);
else
opStack.Pop(); // pop (
}
}
return operandStack.Pop();
}
public static bool IsNullableType(this Type nullableType) =>
// instantiated generic type only
nullableType.IsGenericType &&
!nullableType.IsGenericTypeDefinition &&
Object.ReferenceEquals(nullableType.GetGenericTypeDefinition(), typeof(Nullable<>));
}
Unfortunately type inference can't easily get the answer type, so you have to manually pass in the record type and answer type. Note there's special code in the parser to handle conversion to (common in SQL) nullable types when mixing nullable and non-nullable.
Given the columnsToSelect you provided as an example:
List<string> columnsToSelect = new List<string>();
columnsToSelect.Add("col1 + col2");
columnsToSelect.Add("col2");
columnsToSelect.Add("col3 + col4");
You can query the database like so:
var queryResult= Context.RawData.Select(columnsToSelect.SelectExpr<TRawData, int>());
And queryResult would be of type IQueryable<int?[]>, because SelectExpr always projects to an array of TVal?.
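To make the array trick more concrete, here is roughly the tree that would be built for the two expressions "col1 + col2" and "col2", written out by hand. This is a sketch under assumptions: RawRow is a hypothetical stand-in for the entity, col1 is assumed non-nullable and col2 nullable, and the query runs over an in-memory array via AsQueryable rather than a real database.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical stand-in for the RawData entity.
public class RawRow
{
    public int col1 { get; set; }
    public int? col2 { get; set; }
}

public static class ArrayProjectionDemo
{
    // Builds p => new int?[] { (int?)p.col1 + p.col2, p.col2 } by hand.
    public static Expression<Func<RawRow, int?[]>> BuildProjection()
    {
        var p = Expression.Parameter(typeof(RawRow), "p");
        // Mixed nullable/non-nullable operands: convert the non-nullable side first.
        var col1 = Expression.Convert(Expression.Property(p, "col1"), typeof(int?));
        var col2 = Expression.Property(p, "col2");
        var sum = Expression.Add(col1, col2);
        // Every element has the same type, so the result can be a typed array.
        var body = Expression.NewArrayInit(typeof(int?), sum, col2);
        return Expression.Lambda<Func<RawRow, int?[]>>(body, p);
    }

    public static void Main()
    {
        var rows = new[]
        {
            new RawRow { col1 = 1, col2 = 2 },
            new RawRow { col1 = 5, col2 = null }
        };
        foreach (var values in rows.AsQueryable().Select(BuildProjection()))
            Console.WriteLine(string.Join(", ", values.Select(v => v?.ToString() ?? "null")));
        // prints "3, 2" and then "null, null"
    }
}
```

Because the result is an ordinary typed array, callers can index into it at compile time, which is the advantage over a runtime-generated class.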

Related

Linq query parameterised operator and column name [duplicate]

I am looking to implement a system whereby a user can 'build' conditions and then get the resulting data back from the database. At present, there is a stored procedure which generates SQL on the fly and executes it. This is a particular issue that I want to remove.
My problem is coming from the fact that I can have multiple fields within my criteria, and for each of these fields, there could be 1 or more values, with different potential operators.
For example,
from t in Contacts
where t.Email == "email@domain.com" || t.Email.Contains ("mydomain")
where t.Field1 == "valuewewant"
where t.Field2 != "valuewedontwant"
select t
The field, criteria and operator are stored in the database (and List<FieldCriteria>) and would be some thing like this (based on above);
Email, Equals, "email@domain.com"
Email, Contains, "mydomain"
Field1, Equals, "valuewewant"
Field2, DoesNotEqual, "valuewedontwant"
or
new FieldCriteria
{
FieldName = "Email",
Operator = 1,
Value = "email@mydomain.com"
}
So using the information that I have, I want to be able to build a query with any number of conditions. I have seen previous links to Dynamic Linq and PredicateBuilder, but am not able to visualise this as a solution to my own problem.
Any suggestions would be appreciated.
Update
Following on from the suggestion about Dynamic Linq, I came up with a very basic solution, using a Single Operator, with 2 Fields and multiple Criteria. A little crude at the moment as coded in LinqPad, but the results are exactly what I wanted;
enum Operator
{
Equals = 1,
}
class Condition
{
public string Field { get; set; }
public Operator Operator { get; set;}
public string Value { get; set;}
}
void Main()
{
var conditions = new List<Condition>();
conditions.Add(new Condition {
Field = "Email",
Operator = Operator.Equals,
Value = "email1@domain.com"
});
conditions.Add(new Condition {
Field = "Email",
Operator = Operator.Equals,
Value = "email2@domain.com"
});
conditions.Add(new Condition {
Field = "Field1",
Operator = Operator.Equals,
Value = "Chris"
});
var statusConditions = "Status = 1";
var emailConditions = from c in conditions where c.Field == "Email" select c;
var field1Conditions = from c in conditions where c.Field == "Field1" select c;
var emailConditionsFormatted = from c in emailConditions select string.Format("Email=\"{0}\"", c.Value);
var field1ConditionsFormatted = from c in field1Conditions select string.Format("Field1=\"{0}\"", c.Value);
string[] conditionsArray = emailConditionsFormatted.ToArray();
var emailConditionsJoined = string.Join("||", conditionsArray);
Console.WriteLine(String.Format("Formatted Condition For Email: {0}",emailConditionsJoined));
conditionsArray = field1ConditionsFormatted.ToArray();
var field1ConditionsJoined = string.Join("||", conditionsArray);
Console.WriteLine(String.Format("Formatted Condition For Field1: {0}",field1ConditionsJoined));
IQueryable results = ContactView.Where(statusConditions);
// a LINQ query is never null, so test for actual conditions instead
if (emailConditions.Any())
{
results = results.Where(emailConditionsJoined);
}
if (field1Conditions.Any())
{
results = results.Where(field1ConditionsJoined);
}
results = results.Select("id");
foreach (int id in results)
{
Console.WriteLine(id.ToString());
}
}
The generated SQL is:
-- Region Parameters
DECLARE @p0 VarChar(1000) = 'Chris'
DECLARE @p1 VarChar(1000) = 'email1@domain.com'
DECLARE @p2 VarChar(1000) = 'email2@domain.com'
DECLARE @p3 Int = 1
-- EndRegion
SELECT [t0].[id]
FROM [Contacts].[ContactView] AS [t0]
WHERE ([t0].[field1] = @p0) AND (([t0].[email] = @p1) OR ([t0].[email] = @p2)) AND ([t0].[status] = @p3)
And the console output:
Formatted Condition For Email: Email="email1@domain.com"||Email="email2@domain.com"
Formatted Condition For Field1: Field1="Chris"
I just need to clean this up and add the other operators, and it is looking good.
If anyone has any comments on this so far, any input would be appreciated.
The trick with LINQ would be to build an Expression from the data. As an example, to illustrate the example shown:
var param = Expression.Parameter(typeof(MyObject), "t");
var body = Expression.OrElse(
Expression.Equal(Expression.PropertyOrField(param, "Email"), Expression.Constant("email@domain.com")),
Expression.Call(Expression.PropertyOrField(param, "Email"), "Contains", null, Expression.Constant("mydomain"))
);
body = Expression.AndAlso(body, Expression.Equal(Expression.PropertyOrField(param, "Field1"), Expression.Constant("valuewewant")));
body = Expression.AndAlso(body, Expression.NotEqual(Expression.PropertyOrField(param, "Field2"), Expression.Constant("valuewedontwant")));
var lambda = Expression.Lambda<Func<MyObject, bool>>(body, param);
var data = source.Where(lambda);
In particular, note how AndAlso can be used to compose the various operations (the same as multiple Where, but simpler).
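The same idea can be driven from the stored criteria rather than hard-coded. The following is a minimal sketch, not a finished design: the Op enum, the Contact class and CriteriaPredicate.Build are hypothetical, and I am assuming that criteria on the same field are OR-ed together while different fields are AND-ed, as in the question's example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public enum Op { Equals, DoesNotEqual, Contains }   // hypothetical operator enum

public class FieldCriteria
{
    public string FieldName { get; set; }
    public Op Operator { get; set; }
    public string Value { get; set; }
}

public class Contact   // hypothetical entity
{
    public string Email { get; set; }
    public string Field1 { get; set; }
}

public static class CriteriaPredicate
{
    public static Expression<Func<T, bool>> Build<T>(IEnumerable<FieldCriteria> criteria)
    {
        var param = Expression.Parameter(typeof(T), "t");
        Expression body = null;
        foreach (var group in criteria.GroupBy(c => c.FieldName))
        {
            Expression any = null;   // OR of all tests on one field
            foreach (var c in group)
            {
                var test = ToTest(param, c);
                any = any == null ? test : Expression.OrElse(any, test);
            }
            body = body == null ? any : Expression.AndAlso(body, any);
        }
        // With no criteria at all, match everything.
        return Expression.Lambda<Func<T, bool>>(body ?? Expression.Constant(true), param);
    }

    static Expression ToTest(ParameterExpression param, FieldCriteria c)
    {
        var member = Expression.PropertyOrField(param, c.FieldName);
        var value = Expression.Constant(c.Value, typeof(string));
        switch (c.Operator)
        {
            case Op.Equals: return Expression.Equal(member, value);
            case Op.DoesNotEqual: return Expression.NotEqual(member, value);
            case Op.Contains:
                var contains = typeof(string).GetMethod("Contains", new[] { typeof(string) });
                return Expression.Call(member, contains, value);
            default: throw new NotSupportedException(c.Operator.ToString());
        }
    }

    public static void Main()
    {
        var criteria = new List<FieldCriteria>
        {
            new FieldCriteria { FieldName = "Email", Operator = Op.Equals, Value = "a@domain.com" },
            new FieldCriteria { FieldName = "Email", Operator = Op.Contains, Value = "mydomain" },
            new FieldCriteria { FieldName = "Field1", Operator = Op.Equals, Value = "valuewewant" }
        };
        var contacts = new[]
        {
            new Contact { Email = "a@domain.com", Field1 = "valuewewant" },
            new Contact { Email = "b@mydomain.org", Field1 = "valuewewant" },
            new Contact { Email = "c@other.com", Field1 = "valuewewant" },
            new Contact { Email = "a@domain.com", Field1 = "other" }
        }.AsQueryable();
        Console.WriteLine(contacts.Where(Build<Contact>(criteria)).Count());   // prints 2
    }
}
```

This handles string fields only; other property types would need the constant converted to the member's type before comparing.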
I think Dynamic LINQ is one option. Dynamic LINQ lets you specify part of the LINQ query as a string, which it then compiles into an expression tree that can be passed to the underlying LINQ provider. That matches your need, which is to create expression trees at runtime.
I would suggest making the Operator property in FieldCriteria an enum that represents all the required operations (equals, less than, etc.). Then you will need to write a function that takes a list of FieldCriteria and returns an expression string, which can then be fed into Dynamic LINQ to get the expression tree.
This sounds very similar to a problem I solved recently. In my case, I had to filter objects into different categories based on complex filters that were defined in SQL.
I have created a NuGet package, DynamicFilter.Sql, to dynamically generate the lambda expression from a SQL-based filter. The package is open source and available on GitHub.
You can use it like so:
var filter = FilterExpression.Compile<User>("(Email = 'email@domain.com' or Email like '%@%mydomain.com') and deleted <> true ");
bool match = filter(new User {Email="alice@mydomain.com", Deleted=false}); //Matches true
This can be done simply in LINQ by chaining additional Where calls onto the query object; successive Where calls are combined with AND. Here is an example.
query = db.Contacts.Where( ... );
query = query.Where( ... );
query = query.Where( ... );
This is a simpler and shorter solution.

An expression tree may not contain a call or invocation that uses optional arguments in C# Linq

I am trying to do a case statement for one of the properties when selecting an anonymous type in the first part and then convert it to a list of my return type (retList). In the retList part at the bottom when I set QuarterName = p.QuarterName I get the following error on the DatePart functions from the section above:
An expression tree may not contain a call or invocation that uses
optional arguments
public static IEnumerable<Product> GetProducts(int categoryId)
{
    using (var context = new DbContext())
    {
        var pList = (from p in context.Products
                     where (p.CategoryId == categoryId)
                     select new
                     {
                         Id = p.Id,
                         ProductName = p.ProductName,
                         QuarterName = p.Quarter != "ExtraQuarter" ? "Q" + DateAndTime.DatePart(DateInterval.Quarter, p.PurchaseDate) +
                         "-" + DateAndTime.DatePart(DateInterval.Year, p.PurchaseDate) :
                         "<b><i>" + p.Quarter + "</i></b>"
                     }).ToList();
        var retList = from p in pList
                      select new Product()
                      {
                          Id = p.Id,
                          ProductName = p.ProductName,
                          QuarterName = p.QuarterName
                      };
        return retList;
    }
}
The DatePart methods have additional, optional parameters. C# doesn't allow Expression Trees to leverage the optional parameters, so you'll need to provide the whole parameter list to each of these method calls.
According to the documentation, FirstDayOfWeek.Sunday and FirstWeekOfYear.Jan1 are the values that would be used if you didn't provide a value for the optional parameters.
QuarterName = p.Quarter != "ExtraQuarter"
? "Q" +
DateAndTime.DatePart(DateInterval.Quarter, p.PurchaseDate,
FirstDayOfWeek.Sunday, FirstWeekOfYear.Jan1) +
"-" + DateAndTime.DatePart(DateInterval.Year, p.PurchaseDate,
FirstDayOfWeek.Sunday, FirstWeekOfYear.Jan1)
: "<b><i>" + p.Quarter + "</i></b>"
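The restriction is easy to reproduce without Entity Framework. In this small sketch, QuarterOf is a hypothetical method with an optional parameter standing in for DatePart; the lambda only compiles as an expression tree when every argument is written out.

```csharp
using System;
using System.Linq.Expressions;

public static class OptionalArgsDemo
{
    // Hypothetical method with an optional parameter, standing in for DatePart.
    public static int QuarterOf(DateTime date, int firstMonthOfYear = 1)
    {
        int shifted = (date.Month - firstMonthOfYear + 12) % 12;
        return shifted / 3 + 1;
    }

    // Compiles: the optional argument is supplied explicitly.
    public static readonly Expression<Func<DateTime, int>> Quarter =
        d => QuarterOf(d, 1);

    // Does not compile (error CS0854) if uncommented, because the optional
    // argument is left for the compiler to fill in:
    // public static readonly Expression<Func<DateTime, int>> Broken = d => QuarterOf(d);

    public static void Main()
    {
        Func<DateTime, int> quarter = Quarter.Compile();
        Console.WriteLine(quarter(new DateTime(2020, 5, 17)));   // prints 2
    }
}
```

An ordinary delegate (Func<DateTime, int> f = d => QuarterOf(d);) is not affected; the limitation applies only when the lambda is converted to an expression tree.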

Querying a list of entities with composite keys in EF [duplicate]

Given a list of ids, I can query all relevant rows with:
context.Table.Where(q => listOfIds.Contains(q.Id));
But how do you achieve the same functionality when the Table has a composite key?
This is a nasty problem for which I don't know any elegant solution.
Suppose you have these key combinations, and you only want to select the marked ones (*).
Id1 Id2
--- ---
1 2 *
1 3
1 6
2 2 *
2 3 *
... (many more)
How to do this in a way that Entity Framework is happy with? Let's look at some possible solutions and see if they're any good.
Solution 1: Join (or Contains) with pairs
The best solution would be to create a list of the pairs you want, for instance Tuples, (List<Tuple<int,int>>) and join the database data with this list:
from entity in db.Table // db is a DbContext
join pair in Tuples on new { entity.Id1, entity.Id2 }
equals new { Id1 = pair.Item1, Id2 = pair.Item2 }
select entity
In LINQ to objects this would be perfect, but, too bad, EF will throw an exception like
Unable to create a constant value of type 'System.Tuple`2 (...) Only primitive types or enumeration types are supported in this context.
which is a rather clumsy way to tell you that it can't translate this statement into SQL, because Tuples is not a list of primitive values (like int or string). For the same reason a similar statement using Contains (or any other LINQ statement) would fail.
Solution 2: In-memory
Of course we could turn the problem into simple LINQ to objects like so:
from entity in db.Table.AsEnumerable() // fetch db.Table into memory first
join pair in Tuples on new { entity.Id1, entity.Id2 }
equals new { Id1 = pair.Item1, Id2 = pair.Item2 }
select entity
Needless to say that this is not a good solution. db.Table could contain millions of records.
Solution 3: Two Contains statements (incorrect)
So let's offer EF two lists of primitive values, [1,2] for Id1 and [2,3] for Id2. We don't want to use join, so let's use Contains:
from entity in db.Table
where ids1.Contains(entity.Id1) && ids2.Contains(entity.Id2)
select entity
But now the results also contain entity {1,3}! Well, of course, this entity perfectly matches the two predicates. But let's keep in mind that we're getting closer. Instead of pulling millions of entities into memory, we now only get four of them.
Solution 4: One Contains with computed values
Solution 3 failed because the two separate Contains statements don't only filter the combinations of their values. What if we create a list of combinations first and try to match these combinations? We know from solution 1 that this list should contain primitive values. For instance:
var computed = ids1.Zip(ids2, (i1,i2) => i1 * i2); // [2,6]
and the LINQ statement:
from entity in db.Table
where computed.Contains(entity.Id1 * entity.Id2)
select entity
There are some problems with this approach. First, you'll see that this also returns entity {1,6}. The combination function (a*b) does not produce values that uniquely identify a pair in the database. Now we could create a list of strings like ["Id1=1,Id2=2","Id1=2,Id2=3"] and do
from entity in db.Table
where computed.Contains("Id1=" + entity.Id1 + "," + "Id2=" + entity.Id2)
select entity
(This would work in EF6, not in earlier versions).
This is getting pretty messy. But a more important problem is that this solution is not sargable, which means: it bypasses any database indexes on Id1 and Id2 that could have been used otherwise. This will perform very very poorly.
Solution 5: Best of 2 and 3
So the most viable solution I can think of is a combination of Contains and a join in memory: First do the contains statement as in solution 3. Remember, it got us very close to what we wanted. Then refine the query result by joining the result as an in-memory list:
var rawSelection = from entity in db.Table
where ids1.Contains(entity.Id1) && ids2.Contains(entity.Id2)
select entity;
var refined = from entity in rawSelection.AsEnumerable()
join pair in Tuples on new { entity.Id1, entity.Id2 }
equals new { Id1 = pair.Item1, Id2 = pair.Item2 }
select entity;
It's not elegant, and maybe messy all the same, but so far it's the only scalable¹ solution to this problem that I have found, and I have applied it in my own code.
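The two steps of Solution 5 can be tried out on the sample key combinations from the top of this answer. This sketch uses an in-memory array behind AsQueryable in place of db.Table, so it only illustrates the logic, not the SQL translation; Row and TwoStepFilter are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity with a two-column composite key.
public class Row
{
    public int Id1 { get; set; }
    public int Id2 { get; set; }
}

public static class TwoStepFilter
{
    public static List<Row> Filter(IQueryable<Row> table, IReadOnlyCollection<Tuple<int, int>> pairs)
    {
        var ids1 = pairs.Select(p => p.Item1).Distinct().ToList();
        var ids2 = pairs.Select(p => p.Item2).Distinct().ToList();

        // Step 1: coarse filter on two lists of primitive values.
        var rawSelection = table.Where(e => ids1.Contains(e.Id1) && ids2.Contains(e.Id2));

        // Step 2: exact match on the pairs, performed in memory.
        return (from e in rawSelection.AsEnumerable()
                join p in pairs on new { e.Id1, e.Id2 } equals new { Id1 = p.Item1, Id2 = p.Item2 }
                select e).ToList();
    }

    public static void Main()
    {
        var table = new[]
        {
            new Row { Id1 = 1, Id2 = 2 }, new Row { Id1 = 1, Id2 = 3 }, new Row { Id1 = 1, Id2 = 6 },
            new Row { Id1 = 2, Id2 = 2 }, new Row { Id1 = 2, Id2 = 3 }
        }.AsQueryable();
        var wanted = new List<Tuple<int, int>>
        {
            Tuple.Create(1, 2), Tuple.Create(2, 2), Tuple.Create(2, 3)
        };

        // Step 1 alone would keep four rows (including 1,3); step 2 leaves the three wanted ones.
        foreach (var row in Filter(table, wanted))
            Console.WriteLine($"{row.Id1},{row.Id2}");
    }
}
```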
Solution 6: Build a query with OR clauses
Using a Predicate builder like Linqkit or alternatives, you can build a query that contains an OR clause for each element in the list of combinations. This could be a viable option for really short lists. With a couple of hundreds of elements, the query will start performing very poorly. So I don't consider this a good solution unless you can be 100% sure that there will always be a small number of elements. One elaboration of this option can be found here.
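For illustration, here is what such an OR-based predicate could look like when built with plain System.Linq.Expressions instead of LinqKit (a deliberate substitution, to keep the sketch free of external packages). KeyedRow and OrPredicate.MatchAny are hypothetical names, and the key values are embedded as constants, so a real provider would inline them rather than parameterise them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity with a two-column composite key.
public class KeyedRow
{
    public int Id1 { get; set; }
    public int Id2 { get; set; }
}

public static class OrPredicate
{
    // Builds e => (e.Id1 == a1 && e.Id2 == b1) || (e.Id1 == a2 && e.Id2 == b2) || ...
    public static Expression<Func<KeyedRow, bool>> MatchAny(IEnumerable<(int Id1, int Id2)> keys)
    {
        var e = Expression.Parameter(typeof(KeyedRow), "e");
        Expression body = null;
        foreach (var key in keys)
        {
            var match = Expression.AndAlso(
                Expression.Equal(Expression.Property(e, nameof(KeyedRow.Id1)), Expression.Constant(key.Id1)),
                Expression.Equal(Expression.Property(e, nameof(KeyedRow.Id2)), Expression.Constant(key.Id2)));
            body = body == null ? match : Expression.OrElse(body, match);
        }
        // An empty key list matches nothing.
        return Expression.Lambda<Func<KeyedRow, bool>>(body ?? Expression.Constant(false), e);
    }

    public static void Main()
    {
        var table = new[]
        {
            new KeyedRow { Id1 = 1, Id2 = 2 }, new KeyedRow { Id1 = 1, Id2 = 3 },
            new KeyedRow { Id1 = 2, Id2 = 2 }, new KeyedRow { Id1 = 2, Id2 = 3 }
        }.AsQueryable();
        var keys = new List<(int Id1, int Id2)> { (1, 2), (2, 3) };
        Console.WriteLine(table.Where(MatchAny(keys)).Count());   // prints 2
    }
}
```

The predicate grows by one OR branch per key, which is why this only suits short lists.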
Solution 7: Unions
There's also a solution using UNIONs that I posted later here.
¹ As far as the Contains statement is scalable: Scalable Contains method for LINQ against a SQL backend
Solution for Entity Framework Core with SQL Server
🎉 NEW! QueryableValues EF6 Edition has arrived!
The following solution makes use of QueryableValues. This is a library that I wrote to primarily solve the problem of query plan cache pollution in SQL Server caused by queries that compose local values using the Contains LINQ method. It also allows you to compose values of complex types in your queries in a performant way, which will achieve what's being asked in this question.
First you will need to install and set up the library, after doing that you can use any of the following patterns that will allow you to query your entities using a composite key:
// Required to make the AsQueryableValues method available on the DbContext.
using BlazarTech.QueryableValues;
// Local data that will be used to query by the composite key
// of the fictitious OrderProduct table.
var values = new[]
{
new { OrderId = 1, ProductId = 10 },
new { OrderId = 2, ProductId = 20 },
new { OrderId = 3, ProductId = 30 }
};
// Optional helper variable (needed by the second example due to CS0854)
var queryableValues = dbContext.AsQueryableValues(values);
// Example 1 - Using a Join (preferred).
var example1Results = dbContext
.OrderProduct
.Join(
queryableValues,
e => new { e.OrderId, e.ProductId },
v => new { v.OrderId, v.ProductId },
(e, v) => e
)
.ToList();
// Example 2 - Using Any (similar behavior as Contains).
var example2Results = dbContext
.OrderProduct
.Where(e => queryableValues
.Where(v =>
v.OrderId == e.OrderId &&
v.ProductId == e.ProductId
)
.Any()
)
.ToList();
Useful Links
Nuget Package
GitHub Repository
Benchmarks
QueryableValues is distributed under the MIT license.
You can use Union for each composite primary key:
var compositeKeys = new List<CK>
{
new CK { id1 = 1, id2 = 2 },
new CK { id1 = 1, id2 = 3 },
new CK { id1 = 2, id2 = 4 }
};
IQueryable<CK> query = null;
foreach(var ck in compositeKeys)
{
var temp = context.Table.Where(x => x.id1 == ck.id1 && x.id2 == ck.id2);
query = query == null ? temp : query.Union(temp);
}
var result = query.ToList();
You can create a collection of strings with both keys like this (I am assuming that your keys are int type):
var id1id2Strings = listOfIds.Select(p => p.Id1+ "-" + p.Id2);
Then you can just use "Contains" on your db:
using (dbEntities context = new dbEntities())
{
// an IQueryable cannot be awaited directly; materialize it asynchronously instead
var rec = await context.Table1.Where(entity => id1id2Strings.Contains(entity.Id1 + "-" + entity.Id2)).ToListAsync();
return rec;
}
You need a set of objects representing the keys you want to query.
class Key
{
public int Id1 { get; set; }
public int Id2 { get; set; }
}
If you have two lists and you simply check that each value appears in their respective list then you are getting the cartesian product of the lists - which is likely not what you want. Instead you need to query the specific combinations required
List<Key> keys = // get keys;
context.Table.Where(q => keys.Any(k => k.Id1 == q.Id1 && k.Id2 == q.Id2));
I'm not completely sure that this is valid use of Entity Framework; you may have issues with sending the Key type to the database. If that happens then you can be creative:
var composites = keys.Select(k => p1 * k.Id1 + p2 * k.Id2).ToList();
context.Table.Where(q => composites.Contains(p1 * q.Id1 + p2 * q.Id2));
You can create an injective (one-to-one) function, something like a hashcode, which you can use to compare the pair of values. Co-prime multiplicative factors alone are not enough for this: the result of p1*Id1 + p2*Id2 uniquely identifies the values of Id1 and Id2 only if the ids are also bounded, for example 0 <= Id1 < p2 and 0 <= Id2 < p1 with p1 and p2 co-prime.
But then you end up in a situation where you're implementing complex concepts and someone is going to have to support this. Probably better to write a stored procedure which takes the valid key objects.
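A quick arithmetic check shows why the p1*Id1 + p2*Id2 encoding needs care. In this sketch, Encode mirrors the formula above and Pack is a hypothetical alternative that places two 32-bit ids into separate halves of a 64-bit value; neither is taken from a library.

```csharp
using System;

public static class CompositeEncodingDemo
{
    // The linear combination from the answer above.
    public static long Encode(long id1, long id2, long p1, long p2) => p1 * id1 + p2 * id2;

    // Hypothetical alternative: id1 in the upper 32 bits, id2 in the lower 32 bits.
    public static long Pack(int id1, int id2) => ((long)id1 << 32) | (uint)id2;

    public static void Main()
    {
        // p1 = 2 and p2 = 3 are co-prime, yet two different pairs collide:
        Console.WriteLine(Encode(3, 0, 2, 3));   // prints 6
        Console.WriteLine(Encode(0, 2, 2, 3));   // prints 6

        // Fixed-width packing keeps the pairs apart:
        Console.WriteLine(Pack(1, 2) == Pack(2, 1));   // prints False
    }
}
```

Whichever encoding is used, comparing a computed value in the WHERE clause still prevents the database from using indexes on the individual key columns.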
Ran into this issue as well and needed a solution that both did not perform a table scan and also provided exact matches.
This can be achieved by combining Solution 3 and Solution 4 from Gert Arnold's Answer
var firstIds = results.Select(r => r.FirstId);
var secondIds = results.Select(r => r.SecondId);
var compositeIds = results.Select(r => $"{r.FirstId}:{r.SecondId}");
var query = from e in dbContext.Table
//first check the indexes to avoid a table scan
where firstIds.Contains(e.FirstId) && secondIds.Contains(e.SecondId)
//then compare the compositeId for an exact match
//ToString() must be called unless using EF Core 5+
where compositeIds.Contains(e.FirstId.ToString() + ":" + e.SecondId.ToString())
select e;
var entities = await query.ToListAsync();
For EF Core I use a slightly modified version of the bucketized IN method by EricEJ to map composite keys as tuples. It performs pretty well for small sets of data.
Sample usage
List<(int Id, int Id2)> listOfIds = ...
context.Table.In(listOfIds, q => q.Id, q => q.Id2);
Implementation
public static IQueryable<TQuery> In<TKey1, TKey2, TQuery>(
this IQueryable<TQuery> queryable,
IEnumerable<(TKey1, TKey2)> values,
Expression<Func<TQuery, TKey1>> key1Selector,
Expression<Func<TQuery, TKey2>> key2Selector)
{
if (values is null)
{
throw new ArgumentNullException(nameof(values));
}
if (key1Selector is null)
{
throw new ArgumentNullException(nameof(key1Selector));
}
if (key2Selector is null)
{
throw new ArgumentNullException(nameof(key2Selector));
}
if (!values.Any())
{
return queryable.Take(0);
}
var distinctValues = Bucketize(values);
if (distinctValues.Length > 1024)
{
throw new ArgumentException("Too many parameters for SQL Server, reduce the number of parameters", nameof(values));
}
var predicates = distinctValues
.Select(v =>
{
// Create an expression that captures the variable so EF can turn this into a parameterized SQL query
Expression<Func<TKey1>> value1AsExpression = () => v.Item1;
Expression<Func<TKey2>> value2AsExpression = () => v.Item2;
var firstEqual = Expression.Equal(key1Selector.Body, value1AsExpression.Body);
var visitor = new ReplaceParameterVisitor(key2Selector.Parameters[0], key1Selector.Parameters[0]);
var secondEqual = Expression.Equal(visitor.Visit(key2Selector.Body), value2AsExpression.Body);
return Expression.AndAlso(firstEqual, secondEqual);
})
.ToList();
while (predicates.Count > 1)
{
predicates = PairWise(predicates).Select(p => Expression.OrElse(p.Item1, p.Item2)).ToList();
}
var body = predicates.Single();
var clause = Expression.Lambda<Func<TQuery, bool>>(body, key1Selector.Parameters[0]);
return queryable.Where(clause);
}
class ReplaceParameterVisitor : ExpressionVisitor
{
private ParameterExpression _oldParameter;
private ParameterExpression _newParameter;
public ReplaceParameterVisitor(ParameterExpression oldParameter, ParameterExpression newParameter)
{
_oldParameter = oldParameter;
_newParameter = newParameter;
}
protected override Expression VisitParameter(ParameterExpression node)
{
if (ReferenceEquals(node, _oldParameter))
return _newParameter;
return base.VisitParameter(node);
}
}
/// <summary>
/// Break a list of items into tuples of pairs.
/// </summary>
private static IEnumerable<(T, T)> PairWise<T>(this IEnumerable<T> source)
{
var sourceEnumerator = source.GetEnumerator();
while (sourceEnumerator.MoveNext())
{
var a = sourceEnumerator.Current;
sourceEnumerator.MoveNext();
var b = sourceEnumerator.Current;
yield return (a, b);
}
}
private static TKey[] Bucketize<TKey>(IEnumerable<TKey> values)
{
var distinctValueList = values.Distinct().ToList();
// Calculate bucket size as 1,2,4,8,16,32,64,...
var bucket = 1;
while (distinctValueList.Count > bucket)
{
bucket *= 2;
}
// Fill all slots.
var lastValue = distinctValueList.Last();
for (var index = distinctValueList.Count; index < bucket; index++)
{
distinctValueList.Add(lastValue);
}
var distinctValues = distinctValueList.ToArray();
return distinctValues;
}
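The effect of the bucketizing step is easiest to see in isolation. As I read it, padding the value list to the next power of two means only a handful of distinct predicate sizes (1, 2, 4, 8, ...) ever reach the database, at the cost of some repeated values. The sketch below reimplements the padding on its own; PadToPowerOfTwo is a hypothetical name and, unlike the original, it returns an empty array for empty input.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BucketDemo
{
    public static T[] PadToPowerOfTwo<T>(IEnumerable<T> values)
    {
        var list = values.Distinct().ToList();
        if (list.Count == 0) return new T[0];

        // Smallest power of two that can hold all distinct values.
        var bucket = 1;
        while (bucket < list.Count) bucket *= 2;

        // Fill the remaining slots by repeating the last value; duplicates
        // do not change the result of an OR/IN style predicate.
        var last = list[list.Count - 1];
        while (list.Count < bucket) list.Add(last);
        return list.ToArray();
    }

    public static void Main()
    {
        foreach (var n in new[] { 1, 3, 5, 9 })
            Console.WriteLine($"{n} -> {PadToPowerOfTwo(Enumerable.Range(0, n)).Length}");
        // prints 1 -> 1, 3 -> 4, 5 -> 8, 9 -> 16 (one per line)
    }
}
```

This also explains why PairWise above can assume an even number of predicates: after padding, the count is 1 or a larger power of two.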
In the absence of a general solution, I think there are two things to consider:
Avoid multi-column primary keys (this will make unit testing easier too).
But if you have to use them, chances are that one of the key columns will reduce the query result size to O(n), where n is the size of the ideal query result. From there, it's Solution 5 from Gert Arnold above.
For example, the problem leading me to this question was querying order lines, where the key is order id + order line number + order type, and the source had the order type being implicit. That is, the order type was a constant, order ID would reduce the query set to order lines of relevant orders, and there would usually be 5 or fewer of these per order.
To rephrase: if you have a composite key, chances are that one of its columns has very few duplicates. Apply Solution 5 from above with that.
I tried this solution and it worked for me; the output query was perfect, without any parameters:
using LinqKit; // nuget
var customField_Ids = customFields?.Select(t => new CustomFieldKey { Id = t.Id, TicketId = t.TicketId }).ToList();
var uniqueIds1 = customField_Ids.Select(cf => cf.Id).Distinct().ToList();
var uniqueIds2 = customField_Ids.Select(cf => cf.TicketId).Distinct().ToList();
var predicate = PredicateBuilder.New<CustomFieldKey>(false); //LinqKit
var lambdas = new List<Expression<Func<CustomFieldKey, bool>>>();
foreach (var cfKey in customField_Ids)
{
var id = uniqueIds1.Where(uid => uid == cfKey.Id).Take(1).ToList();
var ticketId = uniqueIds2.Where(uid => uid == cfKey.TicketId).Take(1).ToList();
lambdas.Add(t => id.Contains(t.Id) && ticketId.Contains(t.TicketId));
}
predicate = AggregateExtensions.AggregateBalanced(lambdas.ToArray(), (expr1, expr2) =>
{
var invokedExpr = Expression.Invoke(expr2, expr1.Parameters.Cast<Expression>());
return Expression.Lambda<Func<CustomFieldKey, bool>>
(Expression.OrElse(expr1.Body, invokedExpr), expr1.Parameters);
});
var modifiedCustomField_Ids = repository.GetTable<CustomFieldLocal>()
.Select(cf => new CustomFieldKey() { Id = cf.Id, TicketId = cf.TicketId }).Where(predicate).ToArray();
I ended up writing a helper for this problem that relies on System.Linq.Dynamic.Core.
It's a lot of code and I don't have time to refactor it at the moment, but input and suggestions are appreciated.
public static IQueryable<TEntity> WhereIsOneOf<TEntity, TSource>(this IQueryable<TEntity> dbSet,
IEnumerable<TSource> source,
Expression<Func<TEntity, TSource,bool>> predicate) where TEntity : class
{
var (where, pDict) = GetEntityPredicate(predicate, source);
return dbSet.Where(where, pDict);
(string WhereStr, IDictionary<string, object> paramDict) GetEntityPredicate(Expression<Func<TEntity, TSource, bool>> func, IEnumerable<TSource> source)
{
var firstP = func.Parameters[0];
var binaryExpressions = RecurseBinaryExpressions((BinaryExpression)func.Body);
var i = 0;
var paramDict = new Dictionary<string, object>();
var res = new List<string>();
foreach (var sourceItem in source)
{
var innerRes = new List<string>();
foreach (var bExp in binaryExpressions)
{
var emp = ToEMemberPredicate(firstP, bExp);
var val = emp.GetKeyValue(sourceItem);
var pName = $"@{i++}";
paramDict.Add(pName, val);
var str = $"{emp.EntityMemberName} {emp.SQLOperator} {pName}";
innerRes.Add(str);
}
res.Add( "(" + string.Join(" and ", innerRes) + ")");
}
var sRes = string.Join(" || ", res);
return (sRes, paramDict);
}
EMemberPredicate ToEMemberPredicate(ParameterExpression firstP, BinaryExpression bExp)
{
var lMember = (MemberExpression)bExp.Left;
var rMember = (MemberExpression)bExp.Right;
var entityMember = lMember.Expression == firstP ? lMember : rMember;
var keyMember = entityMember == lMember ? rMember : lMember;
return new EMemberPredicate(entityMember, keyMember, bExp.NodeType);
}
List<BinaryExpression> RecurseBinaryExpressions(BinaryExpression e, List<BinaryExpression> runningList = null)
{
if (runningList == null) runningList = new List<BinaryExpression>();
if (e.Left is BinaryExpression lbe)
{
var additions = RecurseBinaryExpressions(lbe);
runningList.AddRange(additions);
}
if (e.Right is BinaryExpression rbe)
{
var additions = RecurseBinaryExpressions(rbe);
runningList.AddRange(additions);
}
if (e.Left is MemberExpression && e.Right is MemberExpression)
{
runningList.Add(e);
}
return runningList;
}
}
Helper class:
public class EMemberPredicate
{
public readonly MemberExpression EntityMember;
public readonly MemberExpression KeyMember;
public readonly PropertyInfo KeyMemberPropInfo;
public readonly string EntityMemberName;
public readonly string SQLOperator;
public EMemberPredicate(MemberExpression entityMember, MemberExpression keyMember, ExpressionType eType)
{
EntityMember = entityMember;
KeyMember = keyMember;
KeyMemberPropInfo = (PropertyInfo)keyMember.Member;
EntityMemberName = entityMember.Member.Name;
SQLOperator = BinaryExpressionToMSSQLOperator(eType);
}
public object GetKeyValue(object o)
{
return KeyMemberPropInfo.GetValue(o, null);
}
private string BinaryExpressionToMSSQLOperator(ExpressionType eType)
{
switch (eType)
{
case ExpressionType.Equal:
return "==";
case ExpressionType.GreaterThan:
return ">";
case ExpressionType.GreaterThanOrEqual:
return ">=";
case ExpressionType.LessThan:
return "<";
case ExpressionType.LessThanOrEqual:
return "<=";
case ExpressionType.NotEqual:
return "<>";
default:
throw new ArgumentException($"{eType} is not a handled Expression Type.");
}
}
}
Use Like so:
// This can be a Tuple or whatever.. If Tuple, then y below would be .Item1, etc.
// This data structure is up to you but is what I use.
[FromBody] List<CustomerAddressPk> cKeys
var res = await dbCtx.CustomerAddress
.WhereIsOneOf(cKeys, (x, y) => y.CustomerId == x.CustomerId
&& x.AddressId == y.AddressId)
.ToListAsync();
Hope this helps others.
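To make the flattening step above easier to follow, here is a minimal sketch that walks a composite-key lambda and collects its member-to-member comparisons. AddressRow, AddressKey and ComparisonLeaves are hypothetical names made up for this illustration; this is not the WhereIsOneOf implementation itself, only the tree walk it relies on.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical types, used only for this illustration.
public class AddressRow { public int CustomerId { get; set; } public int AddressId { get; set; } }
public class AddressKey { public int CustomerId { get; set; } public int AddressId { get; set; } }

public static class ComparisonLeaves
{
    // Walks nested binary nodes (&&, ||) and collects every comparison
    // whose two sides are both member accesses.
    public static List<BinaryExpression> Collect(Expression e, List<BinaryExpression> acc = null)
    {
        acc = acc ?? new List<BinaryExpression>();
        if (e is BinaryExpression b)
        {
            if (b.Left is MemberExpression && b.Right is MemberExpression)
            {
                acc.Add(b);
            }
            else
            {
                Collect(b.Left, acc);
                Collect(b.Right, acc);
            }
        }
        return acc;
    }

    // Renders each leaf as "LeftMember NodeType RightMember".
    public static List<string> Describe(Expression<Func<AddressRow, AddressKey, bool>> f)
    {
        var result = new List<string>();
        foreach (var leaf in Collect(f.Body))
        {
            var left = (MemberExpression)leaf.Left;
            var right = (MemberExpression)leaf.Right;
            result.Add($"{left.Member.Name} {leaf.NodeType} {right.Member.Name}");
        }
        return result;
    }
}
```

Each collected leaf is what the code above turns into one "member operator parameter" fragment per key item.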
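For a composite key, you can keep a second list of ids and add a condition for it in your code. Note that this matches any pairing of an Id from the first list with an Id2 from the second list, not only the original pairs: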
context.Table.Where(q => listOfIds.Contains(q.Id) && listOfIds2.Contains(q.Id2));
Or you can use another trick: build a list of combined keys by concatenating the key parts:
listofid.add(id+id1+......)
context.Table.Where(q => listOfIds.Contains(q.Id+q.id1+.......));
I tried this on EF Core 5.0.3 with the Postgres provider.
context.Table
.Select(entity => new
{
Entity = entity,
CompositeKey = entity.Id1 + entity.Id2,
})
.Where(x => compositeKeys.Contains(x.CompositeKey))
.Select(x => x.Entity);
This produced SQL like:
SELECT *
FROM table AS t
WHERE t.Id1 + t.Id2 IN (@__compositeKeys_0)
Caveats
this should only be used where the combination of Id1 and Id2 will always produce a unique result (e.g., they're both UUIDs)
this cannot use indexes, though you could save the composite key to the db with an index
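The uniqueness caveat can be seen with a small in-memory sketch. CompositeKeyDemo and its rows are hypothetical, and the "|" separator is an assumption (it must never occur in either id); whether a given provider translates the separated form to SQL is not checked here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CompositeKeyDemo
{
    // Hypothetical in-memory rows standing in for the table.
    public static readonly (string Id1, string Id2)[] Rows =
    {
        ("1", "23"),
        ("12", "3"),
        ("7", "8"),
    };

    // Plain concatenation: "1" + "23" and "12" + "3" both give "123",
    // so two different key pairs become indistinguishable.
    public static int CountPlain(IEnumerable<(string Id1, string Id2)> wanted)
    {
        var keys = wanted.Select(k => k.Id1 + k.Id2).ToList();
        return Rows.Count(r => keys.Contains(r.Id1 + r.Id2));
    }

    // Concatenation with a separator that is assumed never to occur in an id.
    public static int CountSeparated(IEnumerable<(string Id1, string Id2)> wanted)
    {
        var keys = wanted.Select(k => k.Id1 + "|" + k.Id2).ToList();
        return Rows.Count(r => keys.Contains(r.Id1 + "|" + r.Id2));
    }
}
```

Asking for the single pair ("1", "23") matches two rows with plain concatenation and one row with the separator, which is why fixed-width values such as UUIDs are the safe case for the plain form.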

Compose LINQ-to-SQL predicates into a single predicate

(An earlier question, Recursively (?) compose LINQ predicates into a single predicate, is similar to this but I actually asked the wrong question... the solution there satisfied the question as posed, but isn't actually what I need. They are different, though. Honest.)
Given the following search text:
"keyword1 keyword2 ... keywordN"
I want to end up with the following SQL:
SELECT [columns] FROM Customer
WHERE (
Customer.Forenames LIKE '%keyword1%'
OR
Customer.Forenames LIKE '%keyword2%'
OR
...
OR
Customer.Forenames LIKE '%keywordN%'
) AND (
Customer.Surname LIKE '%keyword1%'
OR
Customer.Surname LIKE '%keyword2%'
OR
....
OR
Customer.Surname LIKE '%keywordN%'
)
Effectively, we're splitting the search text on spaces, trimming each token, constructing a multi-part OR clause based on each token, and then AND'ing the clauses together.
I'm doing this in Linq-to-SQL, and I have no idea how to dynamically compose a predicate based on an arbitrarily-long list of subpredicates. For a known number of clauses, it's easy to compose the predicates manually:
dataContext.Customers.Where(c =>
    (
        c.Forenames.Contains("keyword1")
        ||
        c.Forenames.Contains("keyword2")
    ) && (
        c.Surname.Contains("keyword1")
        ||
        c.Surname.Contains("keyword2")
    )
);
In short, I need a technique that, given two predicates, will return a single predicate composing the two source predicates with a supplied operator, but restricted to the operators explicitly supported by Linq-to-SQL. Any ideas?
You can use the PredicateBuilder class
IQueryable<Customer> SearchCustomers (params string[] keywords)
{
var predicate = PredicateBuilder.False<Customer>();
foreach (string keyword in keywords)
{
// Note that you *must* declare a variable inside the loop
// otherwise all your lambdas end up referencing whatever
// the value of "keyword" is when they're finally executed.
string temp = keyword;
predicate = predicate.Or (p => p.Forenames.Contains (temp));
}
return dataContext.Customers.Where (predicate);
}
(that's actually the example from the PredicateBuilder page, I just adapted it to your case...)
EDIT:
Actually I misread your question, and my example above only covers a part of the solution... The following method should do what you want :
IQueryable<Customer> SearchCustomers (string[] forenameKeyWords, string[] surnameKeywords)
{
var predicate = PredicateBuilder.True<Customer>();
var forenamePredicate = PredicateBuilder.False<Customer>();
foreach (string keyword in forenameKeyWords)
{
string temp = keyword;
forenamePredicate = forenamePredicate.Or (p => p.Forenames.Contains (temp));
}
// And is an extension method on the running predicate, so call it on "predicate".
predicate = predicate.And(forenamePredicate);
var surnamePredicate = PredicateBuilder.False<Customer>();
foreach (string keyword in surnameKeywords)
{
string temp = keyword;
surnamePredicate = surnamePredicate.Or (p => p.Surname.Contains (temp));
}
predicate = predicate.And(surnamePredicate);
return dataContext.Customers.Where(predicate);
}
You can use it like that:
var query = SearchCustomers(
new[] { "keyword1", "keyword2" },
new[] { "keyword3", "keyword4" });
foreach (var Customer in query)
{
...
}
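For readers who do not have the PredicateBuilder source at hand, here is a minimal sketch of how two predicates can be combined into one expression tree. PredicateSketch, OrElse and AndAlso are hypothetical names for this illustration, and the parameter-rewriting approach shown is one common technique, not necessarily how PredicateBuilder does it internally. The result contains only standard expression nodes, but whether a particular provider translates it is an assumption not verified here.

```csharp
using System;
using System.Linq.Expressions;

public static class PredicateSketch
{
    // Rewrites one lambda parameter into another so that both
    // predicate bodies refer to the same parameter.
    private sealed class ParameterSwap : ExpressionVisitor
    {
        private readonly ParameterExpression _source;
        private readonly ParameterExpression _target;

        public ParameterSwap(ParameterExpression source, ParameterExpression target)
        {
            _source = source;
            _target = target;
        }

        protected override Expression VisitParameter(ParameterExpression node)
            => node == _source ? _target : base.VisitParameter(node);
    }

    // Returns a predicate equivalent to "a || b".
    public static Expression<Func<T, bool>> OrElse<T>(
        this Expression<Func<T, bool>> a, Expression<Func<T, bool>> b)
    {
        var body = new ParameterSwap(b.Parameters[0], a.Parameters[0]).Visit(b.Body);
        return Expression.Lambda<Func<T, bool>>(Expression.OrElse(a.Body, body), a.Parameters);
    }

    // Returns a predicate equivalent to "a && b".
    public static Expression<Func<T, bool>> AndAlso<T>(
        this Expression<Func<T, bool>> a, Expression<Func<T, bool>> b)
    {
        var body = new ParameterSwap(b.Parameters[0], a.Parameters[0]).Visit(b.Body);
        return Expression.Lambda<Func<T, bool>>(Expression.AndAlso(a.Body, body), a.Parameters);
    }
}
```

With this, an OR group per column can be folded in a loop and the groups joined with AndAlso, which mirrors the "(... OR ...) AND (... OR ...)" shape asked for in the question.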
Normally you would chain invocations of .Where(...). E.g.:
// Declare the variable as IQueryable<Customer> so each Where result can be assigned back.
IQueryable<Customer> a = dataContext.Customers;
if (kwd1 != null)
    a = a.Where(t => t.Forenames.Contains(kwd1));
if (kwd2 != null)
    a = a.Where(t => t.Forenames.Contains(kwd2));
// ...
return a;
LINQ-to-SQL would weld it all back together into a single WHERE clause.
This doesn't work with OR, however. You could use unions and intersections, but I'm not sure whether LINQ-to-SQL (or SQL Server) is clever enough to fold it back to a single WHERE clause. OTOH, it won't matter if performance doesn't suffer. Anyway, it would look something like this:
IQueryable<Customer> ff = null, ss = null;
foreach (var k in keywords) {
    if (k != null) {
        var kw = k; // copy so the lambdas do not share the loop variable
        var f = dataContext.Customers.Where(t => t.Forenames.Contains(kw));
        ff = ff == null ? f : ff.Union(f);
        var s = dataContext.Customers.Where(t => t.Surname.Contains(kw));
        ss = ss == null ? s : ss.Union(s);
    }
}
return ff.Intersect(ss);

c# finding matching words in table column using Linq2Sql

I am trying to use Linq2Sql to return all rows that contain values from a list of strings. The linq2sql class object has a string property that contains words separated by spaces.
public class MyObject
{
public string MyProperty { get; set; }
}
Example MyProperty values are:
MyObject1.MyProperty = "text1 text2 text3 text4"
MyObject2.MyProperty = "text2"
For example, using a string collection, I pass the below list
var list = new List<string>() { "text2", "text4" };
This would return both items in my example above as they both contain "text2" value.
I attempted this with the code below; however, because of my extension method, the query cannot be translated by Linq2Sql.
public static IQueryable<MyObject> WithProperty(this IQueryable<MyProperty> qry,
IList<string> p)
{
return from t in qry
where t.MyProperty.Contains(p, ' ')
select t;
}
I also wrote an extension method
public static bool Contains(this string str, IList<string> list, char seperator)
{
if (str == null) return false;
if (list == null) return true;
var splitStr = str.Split(new char[] { seperator },
StringSplitOptions.RemoveEmptyEntries);
bool retval = false;
int matches = 0;
foreach (string s in splitStr)
{
foreach (string l in list)
{
if (String.Compare(s, l, true) == 0)
{
retval = true;
matches++;
}
}
}
return retval && (splitStr.Length > 0) && (list.Count == matches);
}
Any help or ideas on how I could achieve this?
You're on the right track. The first parameter of your extension method WithProperty has to be of type IQueryable<MyObject>, not IQueryable<MyProperty>.
Anyway, you don't need an extension method on IQueryable. Just use your Contains method in a lambda for filtering. This should work:
List<string> searchStrs = new List<string>() { "text2", "text4" };
IEnumerable<MyObject> myFilteredObjects = dataContext.MyObjects
.Where(myObj => myObj.MyProperty.Contains(searchStrs, ' '));
Update:
The above code snippet does not work, because the Contains extension method cannot be converted into a SQL statement. I thought about the problem for a while and arrived at a solution by asking 'how would I do that in SQL?': you could query for each single keyword and union all the results together. Sadly, the deferred execution of Linq-to-SQL prevents doing all of that in one query, so I came up with this compromise of a compromise. It runs one query per keyword, matching a keyword that is one of the following:
equal to the string
in between two separators
at the start of the string and followed by a separator
or at the end of the string and preceded by a separator
This forms a valid expression tree that Linq-to-SQL can translate into SQL. After each query I do not defer execution; I immediately fetch the data and store it in a list. All lists are unioned afterwards.
public static IEnumerable<MyObject> ContainsOneOfTheseKeywords(
this IQueryable<MyObject> qry, List<string> keywords, char sep)
{
List<List<MyObject>> parts = new List<List<MyObject>>();
foreach (string keyw in keywords)
parts.Add((
from obj in qry
where obj.MyProperty == keyw ||
obj.MyProperty.IndexOf(sep + keyw + sep) != -1 ||
obj.MyProperty.IndexOf(keyw + sep) == 0 ||
obj.MyProperty.IndexOf(sep + keyw) ==
obj.MyProperty.Length - keyw.Length - 1
select obj).ToList());
IEnumerable<MyObject> union = null;
bool first = true;
foreach (List<MyObject> part in parts)
{
if (first)
{
union = part;
first = false;
}
else
union = union.Union(part);
}
return union.ToList();
}
And use it:
List<string> searchStrs = new List<string>() { "text2", "text4" };
IEnumerable<MyObject> myFilteredObjects = dataContext.MyObjects
.ContainsOneOfTheseKeywords(searchStrs, ' ');
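As a side note, the four cases listed above (equal, between separators, at the start, at the end) can be folded into a single check by padding both the text and the keyword with the separator. The sketch below shows this in memory only; WholeWordMatch is a hypothetical helper, this is a different technique from the per-keyword queries above, and whether Linq-to-SQL can translate the padded comparison is not verified here.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class WholeWordMatch
{
    // True when keyword occurs in text as a complete, separator-delimited word.
    // Padding both sides turns "start", "end", "middle" and "equal" into one case.
    public static bool HasWord(string text, string keyword, char sep)
    {
        if (text == null || keyword == null) return false;
        return (sep + text + sep).Contains(sep + keyword + sep);
    }

    // Keeps the values that contain at least one of the keywords as a whole word.
    public static List<string> Filter(IEnumerable<string> values, IList<string> keywords, char sep)
        => values.Where(v => keywords.Any(k => HasWord(v, k, sep))).ToList();
}
```

The comparison here is case-sensitive, unlike the String.Compare call with ignoreCase in the question's Contains method.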
That solution is anything but elegant. For 10 keywords, I have to query the db 10 times, each time fetching the data and storing it in memory. This wastes memory and performs badly. I just wanted to demonstrate that it is possible in Linq (maybe it can be optimized here or there, but I think it won't get perfect).
I would strongly recommend moving the logic of that function into a stored procedure on your database server: one single query, optimized by the database server, and no wasted memory.
Another alternative would be to rethink your database design. If you want to query the contents of one field (you are treating this field like an array of keywords, separated by spaces), you may simply have chosen an inappropriate database design. You would rather create a new table with a foreign key to your table, where each row holds exactly one keyword. The queries would be much simpler, faster and more understandable.
I haven't tried, but if I remember correctly, this should work:
from t in ctx.Table
where list.Any(x => t.MyProperty.Contains(x))
select t
You can replace Any() with All() if you want all strings in the list to match.
EDIT:
To clarify what I was trying to do with this, here is a similar query written without linq, to explain the use of All and Any
where list.Any(x => t.MyProperty.Contains(x))
Translates to:
where t.MyProperty.Contains(list[0]) || t.MyProperty.Contains(list[1]) ||
t.MyProperty.Contains(list[n])
And
where list.All(x => t.MyProperty.Contains(x))
Translates to:
where t.MyProperty.Contains(list[0]) && t.MyProperty.Contains(list[1]) &&
t.MyProperty.Contains(list[n])
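The difference between the two operators can be checked in memory with the sample values from the question. AnyAllDemo is a hypothetical name, and this runs as LINQ to Objects; whether a given provider can translate Any or All over a local list into SQL is a separate question that this sketch does not answer.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class AnyAllDemo
{
    // The two sample MyProperty values from the question.
    public static readonly string[] Values = { "text1 text2 text3 text4", "text2" };

    // Keeps values containing at least one of the search strings (OR semantics).
    public static List<string> MatchAny(IList<string> list)
        => Values.Where(v => list.Any(x => v.Contains(x))).ToList();

    // Keeps values containing every search string (AND semantics).
    public static List<string> MatchAll(IList<string> list)
        => Values.Where(v => list.All(x => v.Contains(x))).ToList();
}
```

With the list { "text2", "text4" }, MatchAny keeps both values, while MatchAll keeps only the first one.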
