Better performance on updating objects with linq

I have two lists of custom objects and want to update a field for all objects in one list if there is an object in the other list which matches on another pair of fields.
This code explains the problem better and produces the results I want. However, for larger lists (20k items, with a 20k list of matching objects) this takes considerable time (31 s). I can improve this by ~50% by using the generic list's Find(Predicate) method.
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Collections.Generic;
namespace ExperimentFW3
{
public class PropValue
{
public string Name;
public decimal Val;
public decimal Total;
}
public class Adjustment
{
public string PropName;
public decimal AdjVal;
}
class Program
{
static List<PropValue> propList;
static List<Adjustment> adjList;
public static void Main()
{
propList = new List<PropValue>{
new PropValue{Name = "Alfa", Val=2.1M},
new PropValue{Name = "Beta", Val=1.0M},
new PropValue{Name = "Gamma", Val=8.0M}
};
adjList = new List<Adjustment>{
new Adjustment{PropName = "Alfa", AdjVal=-0.1M},
new Adjustment{PropName = "Beta", AdjVal=3M}
};
foreach (var p in propList)
{
Adjustment a = adjList.SingleOrDefault(
av => av.PropName.Equals(p.Name)
);
if (a != null)
p.Total = p.Val + a.AdjVal;
else
p.Total = p.Val;
}
}
}
}
The desired result is: Alfa total=2,Beta total=4,Gamma total=8
But I wonder if this is possible to do even faster. Inner joining the two lists takes very little time, even when looping over 20k items in the resultset.
var joined = from p in propList
join a in adjList on p.Name equals a.PropName
select new { p.Name, p.Val, p.Total, a.AdjVal };
So my question is if it's possible to do something like I would do with T-SQL? An UPDATE from a left join using ISNULL(val,0) on the adjustment value.

That join should be fairly fast, as it will first loop through all of adjList to create a lookup, then for each element in propList it will just use the lookup. This is faster than your O(N * M) method in the larger code - although that could easily be fixed by calling ToLookup (or ToDictionary as you only need one value) on adjList before the loop.
EDIT: Here's the modified code using ToDictionary. Untested, mind you...
var adjDictionary = adjList.ToDictionary(av => av.PropName);
foreach (var p in propList)
{
Adjustment a;
if (adjDictionary.TryGetValue(p.Name, out a))
{
p.Total = p.Val + a.AdjVal;
}
else
{
p.Total = p.Val;
}
}

If adjList might have duplicate names, you should group the items before putting them into the dictionary.
Dictionary<string, decimal> adjDictionary = adjList
.GroupBy(a => a.PropName)
.ToDictionary(g => g.Key, g => g.Sum(a => a.AdjVal));
propList.ForEach(p =>
{
decimal a;
adjDictionary.TryGetValue(p.Name, out a);
p.Total = p.Val + a;
});

I know I am late posting this, but I thought someone would appreciate the clearer, shorter answer below that handles multiple records per key in adjList. Creating a Lookup allows fast lookups on multiple items and returns an empty sequence if there are no records for a key.
var adjLookUp = adjList.ToLookup(a => a.PropName);
foreach (var p in propList)
p.Total = p.Val + adjLookUp[p.Name].Sum(a => a.AdjVal);
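
The T-SQL style "UPDATE from a left join with ISNULL" that the question asks about can also be approximated with a group join. The following is a minimal sketch, not a benchmarked solution: PropValue and Adjustment mirror the question's classes, while the class name AdjustmentJoin and the method name ApplyAdjustments are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public class PropValue { public string Name; public decimal Val; public decimal Total; }
public class Adjustment { public string PropName; public decimal AdjVal; }

public static class AdjustmentJoin
{
    // Hypothetical helper: left-joins propList to adjList by name and writes
    // Val plus the summed adjustments (0 when nothing matches) into Total.
    public static void ApplyAdjustments(List<PropValue> propList, List<Adjustment> adjList)
    {
        var joined = propList.GroupJoin(
            adjList,
            p => p.Name,
            a => a.PropName,
            (p, matches) => new { Prop = p, Adj = matches.Sum(a => a.AdjVal) });

        // GroupJoin builds its lookup over adjList once, when enumeration starts,
        // so each property is matched by key instead of by a linear search.
        foreach (var row in joined)
        {
            row.Prop.Total = row.Prop.Val + row.Adj;
        }
    }
}
```

Summing an empty group yields 0, which plays the role of ISNULL(val, 0) for properties without an adjustment.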

Related

Compare two List elements and replace if id is equals

I have two lists of objects of this class:
public class Product
{
int id;
string url;
// etc.
}
I need to compare the old list (10k+ elements) with a new list (10 elements) by ID,
and if an ID is the same, replace the data in the old list with the data from the new list.
I think LINQ would be a good fit.
Can you help me with how to use LINQ here, or is there a better library?

Do you need to modify the collection in place or return a new collection?
If you are returning a new collection you could do this:
var query = from x in oldItems
join y in newItems on x.Id equals y.Id into g
from z in g.DefaultIfEmpty()
select z ?? x;
var newList = query.ToList();
This method will ignore entries in newItems that do not exist in oldItems.
If you are going to be modifying the collection in place you would be better off working with a dictionary and referencing that everywhere.
You can create a dictionary from the list by doing
var collection = items.ToDictionary(x => x.Id, x => x);
Note modifying the dictionary doesn't alter the source collection, the idea is to replace your collection with the dictionary object.
If you are using the dictionary you can then iterate over new collection and check the key.
foreach (var item in newItems.Where(x => collection.ContainsKey(x.Id))) {
collection[item.Id] = item;
}
Dictionaries are iterable so you can loop over the Values collection if you need to. Adds and removes are fast because you can reference by key. The only problem I can think of that you may run into is if you rely on the ordering of the collection.
If you are stuck needing to use the original collection type, then you could use the ToDictionary method on your newItems collection. This makes your update code look like this.
var converted = newItems.ToDictionary(x => x.Id, x => x);
for (var i = 0; i < oldItems.Count(); i++) {
if (converted.ContainsKey(oldItems[i].Id)) {
oldItems[i] = converted[oldItems[i].Id];
}
}
This has the advantage that you only need to loop over the newItems collection once; from then on it's key lookups, so it's less CPU intensive. The downside is that you've created a new collection of keys for newItems, so it consumes more memory.

Here is a sample function that joins the two lists by the id property and then updates the original Product.url with the newer one:
void ChangeItems(IList<Product> original, IList<Product> newer){
original.Join(newer, o => o.id, n => n.id, (o, n) => new { original = o, newer = n })
.ToList()
.ForEach(j => j.original.url = j.newer.url);
}

Solution: the LINQ solution you're looking for will be something like this:
oldList = oldList.Select(ele => { return (newList.Any(i => i.id == ele.id) ? newList.FirstOrDefault(newObj => newObj.id == ele.id) : ele); }).ToList();
Note: here we rebuild oldList from newList and oldList, i.e. we replace each oldList object with the matching newList object. If you only want some of the new list's properties, you can add a Copy method to your class.
For example, with a Copy method:
oldList = oldList.Select(ele => { return (newList.Any(i => i.id == ele.id) ? ele.Copy(newList.FirstOrDefault(newObj => newObj.id == ele.id)) : ele); }).ToList();
//Changes in your class
public Product Copy(Product prod)
{
// copy the required properties of prod onto this instance
this.id = prod.id;
return this; // returned so the call can be used inside the Select above
}
Note: it is not a good idea to iterate over 10k+ elements this way, even using LINQ, as it will still affect your CPU performance.

Given a class like this:
public class Product
{
public int id;
public string url;
public string otherData;
public Product(int id, string url, string otherData)
{
this.id = id;
this.url = url;
this.otherData = otherData;
}
public Product ChangeProp(Product newProd)
{
this.url = newProd.url;
this.otherData = newProd.otherData;
return this;
}
}
Note that we now have a ChangeProp method in the data class. It accepts the new object, overwrites the old object's properties with the new object's values, and returns the modified old object (since you want the old object's data replaced by the new object's data). This keeps the LINQ readable and clean.
You already have oldList with lots of entries and need to replace the data in oldList with the data from newList where the id is the same; you can do it as below.
Suppose the lists hold data like this:
List<Product> oldList = new List<Product>();
for (int i = 0; i < 10000; i++)
{
oldList.Add(new Product(i, "OldData" + i.ToString(), "OldData" + i.ToString() + "-other"));
}
List<Product> newList = new List<Product>();
for (int i = 0; i < 5; i++)
{
newList.Add(new Product(i, "NewData" + i.ToString(), "NewData" + i.ToString() + "-other"));
}
This LINQ will do the work:
oldList.Where(x => newList.Any(y => y.id == x.id))
.Select(z => oldList[oldList.IndexOf(z)].ChangeProp(newList.Where(a => a.id == z.id).FirstOrDefault())).ToList();

foreach(var product in newList)
{
int index = oldList.FindIndex(x => x.id == product.id);
if (index != -1)
{
oldList[index].url = product.url;
}
}
This will work, and I think it's a better solution too.
All of the solutions above create new objects in memory, and creating a new list with 10k+ records is definitely a bad idea.
Please make the fields in Product public, otherwise they won't be accessible.

How do I make this LINQ query faster?

modelData has 100,000 items in the list.
I am doing 2 "Selects" within 2 loops.
Could it be structured differently? It takes a long time (10 minutes).
public class ModelData
{
public string name;
public DateTime DT;
public int real;
public int trade;
public int position;
public int dayPnl;
}
List<ModelData> modelData;
var dates = modelData.Select(x => x.DT.Date).Distinct();
var names = modelData.Select(x => x.name).Distinct();
foreach (var aDate in dates)
{
var dateRealTrades = modelData.Select(x => x)
.Where(x => x.DT.Date.Equals(aDate) && x.real.Equals(1));
foreach (var aName in names)
{
var namesRealTrades = dateRealTrades.Select(x => x)
.Where(x => x.name.Equals(aName));
// DO MY PROCESSING
}
}

I believe what you want can be achieved with two queries using group by. One to create a lookup by the date and the other to give you the name-date grouped items.
var data = modelData.Where(x => x.real.Equals(1))
.GroupBy(x => new { x.DT.Date, x.name });
var byDate = modelData.Where(x => x.real.Equals(1))
.ToLookup(x => x.DT.Date);
foreach(var item in data)
{
var aDate = item.Key.Date;
var aName = item.Key.name;
var namesRealTrades = item.ToList();
var dateRealTrades = byDate[aDate].ToList();
// DO MY PROCESSING
}
The first query will give you items grouped by the name and date to iterate over and the second will give you a lookup to get all the items associated with a given date. The second uses a lookup so that the list is iterated once and gives you fast access to the resulting list of items.
This should greatly reduce the number of times you iterate over modelData from what you currently have.

You could rewrite your for loop like this:
foreach (var namesRealTrades in names.Select(aName => dateRealTrades.Where(x => x.name.Equals(aName))))
{
//DO STUFF
}
Depending on your data this could reduce the number of queries you have to make

Did you try to compile your query as suggested on MSDN WebSite?
When you have an application that executes structurally similar
queries many times, you can often increase performance by compiling
the query one time and executing it several times with different
parameters. For example, an application might have to retrieve all the
customers who are in a particular city, where the city is specified at
runtime by the user in a form. LINQ to SQL supports the use of
compiled queries for this purpose.
https://msdn.microsoft.com/en-us/library/bb399335(v=vs.110).aspx

A couple of things:
use .ToList() to calculate a sequence once, so you can keep it for later.
use .GroupBy() to avoid re-searching modelData for things you have already found.
// Collections of models having the same Date or Name.
var dates = modelData.GroupBy(x => x.DT.Date);
var names = modelData.GroupBy(x => x.name);
foreach (var modelsWithDate in dates)
{
var aDate = modelsWithDate.Key;
var dateRealTrades = modelsWithDate.Where(x => x.real == 1).ToList();
foreach (var modelsWithName in names)
{
var aName = modelsWithName.Key;
var namesRealTrades = modelsWithName.ToList();
// DO MY PROCESSING
}
}

There are two ways in which the code is inefficient.
names has deferred evaluation. Every time you iterate over it, it has to go through the whole data to find all the distinct names again. You should save the result.
You find the distinct values in the collection and then go through the collection again for every distinct value to look for its occurrences. You should use grouping.
The rewritten code can look like this:
var dates = modelData.GroupBy(x => x.DT.Date);
var names = modelData.Select(x => x.name).Distinct().ToArray();
foreach (var date in dates)
{
var dateRealTrades = date.Where(x => x.real.Equals(1)).ToArray();
var namesRealTradesLookup = dateRealTrades.ToLookup(x => x.name);
foreach (var aName in names)
{
var namesRealTrades = namesRealTradesLookup[aName];
// DO MY PROCESSING
// var aDate = date.Key;
}
}
In case you are not interested in date/name combinations with no real trades, it can be done in a much more straightforward way:
var realModelData = modelData.Where(x => x.real.Equals(1));
foreach (var dateRealTrades in realModelData.ToLookup(x => x.DT.Date))
{
foreach (var namesRealTrades in dateRealTrades.ToLookup(x => x.name))
{
// DO MY PROCESSING
//var aDate = dateRealTrades.Key;
//var aName = namesRealTrades.Key;
//foreach(var trade in namesRealTrades) { ...
//foreach(var trade in dateRealTrades) { ...
}
}
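
The deferred-evaluation point in this answer can be made visible by counting how often a selector runs. The sketch below is only an illustration; DeferredDemo and CountSelectorCalls are hypothetical names, and the string list stands in for modelData.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how often the selector runs when the distinct-names query is
    // enumerated `passes` times, with or without materializing it first.
    public static int CountSelectorCalls(IReadOnlyList<string> source, int passes, bool materialize)
    {
        int calls = 0;
        IEnumerable<string> names = source.Select(x => { calls++; return x; }).Distinct();
        if (materialize)
        {
            names = names.ToArray(); // runs the query once and stores the result
        }
        for (int i = 0; i < passes; i++)
        {
            foreach (var name in names)
            {
                // per-name processing would go here
            }
        }
        return calls;
    }
}
```

With a four-element list and three passes, the method returns 12 when the query stays deferred and 4 when it is materialized with ToArray first.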

Querying a list of entities with composite keys in EF [duplicate]

Given a list of ids, I can query all relevant rows by:
context.Table.Where(q => listOfIds.Contains(q.Id));
But how do you achieve the same functionality when the Table has a composite key?

This is a nasty problem for which I don't know any elegant solution.
Suppose you have these key combinations, and you only want to select the marked ones (*).
Id1 Id2
--- ---
1 2 *
1 3
1 6
2 2 *
2 3 *
... (many more)
How to do this in a way that Entity Framework is happy with? Let's look at some possible solutions and see if they're any good.
Solution 1: Join (or Contains) with pairs
The best solution would be to create a list of the pairs you want, for instance Tuples, (List<Tuple<int,int>>) and join the database data with this list:
from entity in db.Table // db is a DbContext
join pair in Tuples on new { entity.Id1, entity.Id2 }
equals new { Id1 = pair.Item1, Id2 = pair.Item2 }
select entity
In LINQ to objects this would be perfect, but, too bad, EF will throw an exception like
Unable to create a constant value of type 'System.Tuple`2 (...) Only primitive types or enumeration types are supported in this context.
which is a rather clumsy way to tell you that it can't translate this statement into SQL, because Tuples is not a list of primitive values (like int or string). For the same reason a similar statement using Contains (or any other LINQ statement) would fail.
Solution 2: In-memory
Of course we could turn the problem into simple LINQ to objects like so:
from entity in db.Table.AsEnumerable() // fetch db.Table into memory first
join pair in Tuples on new { entity.Id1, entity.Id2 }
equals new { Id1 = pair.Item1, Id2 = pair.Item2 }
select entity
Needless to say that this is not a good solution. db.Table could contain millions of records.
Solution 3: Two Contains statements (incorrect)
So let's offer EF two lists of primitive values, [1,2] for Id1 and [2,3] for Id2. We don't want to use join, so let's use Contains:
from entity in db.Table
where ids1.Contains(entity.Id1) && ids2.Contains(entity.Id2)
select entity
But now the results also contain entity {1,3}! Well, of course, this entity perfectly matches the two predicates. But let's keep in mind that we're getting closer. Instead of pulling millions of entities into memory, we now only get four of them.
Solution 4: One Contains with computed values
Solution 3 failed because the two separate Contains statements don't only filter the combinations of their values. What if we create a list of combinations first and try to match these combinations? We know from solution 1 that this list should contain primitive values. For instance:
var computed = ids1.Zip(ids2, (i1,i2) => i1 * i2); // [2,6]
and the LINQ statement:
from entity in db.Table
where computed.Contains(entity.Id1 * entity.Id2)
select entity
There are some problems with this approach. First, you'll see that this also returns entity {1,6}. The combination function (a*b) does not produce values that uniquely identify a pair in the database. Now we could create a list of strings like ["Id1=1,Id2=2","Id1=2,Id2=3"] and do
from entity in db.Table
where computed.Contains("Id1=" + entity.Id1 + "," + "Id2=" + entity.Id2)
select entity
(This would work in EF6, not in earlier versions).
This is getting pretty messy. But a more important problem is that this solution is not sargable, which means: it bypasses any database indexes on Id1 and Id2 that could have been used otherwise. This will perform very very poorly.
Solution 5: Best of 2 and 3
So the most viable solution I can think of is a combination of Contains and a join in memory: First do the contains statement as in solution 3. Remember, it got us very close to what we wanted. Then refine the query result by joining the result as an in-memory list:
var rawSelection = from entity in db.Table
where ids1.Contains(entity.Id1) && ids2.Contains(entity.Id2)
select entity;
var refined = from entity in rawSelection.AsEnumerable()
join pair in Tuples on new { entity.Id1, entity.Id2 }
equals new { Id1 = pair.Item1, Id2 = pair.Item2 }
select entity;
It's not elegant, and maybe just as messy, but so far it's the only scalable [1] solution to this problem that I have found, and I have applied it in my own code.
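
The two steps of Solution 5 can be tried out on in-memory data. This is a sketch under assumptions: the Row class, CompositeKeyFilter, RawSelection and Refine are hypothetical names, an in-memory IQueryable stands in for db.Table, and the refinement uses a HashSet of value tuples rather than the join shown above.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Row { public int Id1 { get; set; } public int Id2 { get; set; } }

public static class CompositeKeyFilter
{
    // Step 1 (would be translated to SQL): coarse filter with two Contains calls.
    public static List<Row> RawSelection(IQueryable<Row> table, List<(int, int)> pairs)
    {
        var ids1 = pairs.Select(p => p.Item1).Distinct().ToList();
        var ids2 = pairs.Select(p => p.Item2).Distinct().ToList();
        return table.Where(e => ids1.Contains(e.Id1) && ids2.Contains(e.Id2)).ToList();
    }

    // Step 2 (in memory): keep only rows whose exact pair was requested.
    public static List<Row> Refine(IEnumerable<Row> raw, List<(int, int)> pairs)
    {
        var wanted = new HashSet<(int, int)>(pairs);
        return raw.Where(e => wanted.Contains((e.Id1, e.Id2))).ToList();
    }
}
```

For the five example rows and the three marked pairs, RawSelection returns four rows (including the unwanted {1,3}) and Refine reduces them to the three requested ones.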
Solution 6: Build a query with OR clauses
Using a Predicate builder like Linqkit or alternatives, you can build a query that contains an OR clause for each element in the list of combinations. This could be a viable option for really short lists. With a couple of hundreds of elements, the query will start performing very poorly. So I don't consider this a good solution unless you can be 100% sure that there will always be a small number of elements. One elaboration of this option can be found here.
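
For illustration, an OR-per-pair predicate can also be built with plain expression trees instead of a predicate-builder library. The sketch below is an assumption-laden example, not LinqKit's API: Row, OrPredicate and Build are hypothetical names, and the key values are embedded as constants, which providers typically emit as literals rather than SQL parameters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Row { public int Id1 { get; set; } public int Id2 { get; set; } }

public static class OrPredicate
{
    // Builds: row => false || (row.Id1 == a1 && row.Id2 == b1) || (row.Id1 == a2 && ...) ...
    public static Expression<Func<Row, bool>> Build(IEnumerable<(int Id1, int Id2)> keys)
    {
        var row = Expression.Parameter(typeof(Row), "row");
        Expression body = Expression.Constant(false);
        foreach (var key in keys)
        {
            // One AND clause per requested pair.
            var match = Expression.AndAlso(
                Expression.Equal(Expression.Property(row, nameof(Row.Id1)), Expression.Constant(key.Id1)),
                Expression.Equal(Expression.Property(row, nameof(Row.Id2)), Expression.Constant(key.Id2)));
            body = Expression.OrElse(body, match);
        }
        return Expression.Lambda<Func<Row, bool>>(body, row);
    }
}
```

The result can be passed to Where on any IQueryable<Row>; as noted above, the generated query grows with every pair, so this only suits short lists.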
Solution 7: Unions
There's also a solution using UNIONs that I posted later here.
[1] As far as the Contains statement is scalable: Scalable Contains method for LINQ against a SQL backend

Solution for Entity Framework Core with SQL Server
🎉 NEW! QueryableValues EF6 Edition has arrived!
The following solution makes use of QueryableValues. This is a library that I wrote to primarily solve the problem of query plan cache pollution in SQL Server caused by queries that compose local values using the Contains LINQ method. It also allows you to compose values of complex types in your queries in a performant way, which will achieve what's being asked in this question.
First you will need to install and set up the library, after doing that you can use any of the following patterns that will allow you to query your entities using a composite key:
// Required to make the AsQueryableValues method available on the DbContext.
using BlazarTech.QueryableValues;
// Local data that will be used to query by the composite key
// of the fictitious OrderProduct table.
var values = new[]
{
new { OrderId = 1, ProductId = 10 },
new { OrderId = 2, ProductId = 20 },
new { OrderId = 3, ProductId = 30 }
};
// Optional helper variable (needed by the second example due to CS0854)
var queryableValues = dbContext.AsQueryableValues(values);
// Example 1 - Using a Join (preferred).
var example1Results = dbContext
.OrderProduct
.Join(
queryableValues,
e => new { e.OrderId, e.ProductId },
v => new { v.OrderId, v.ProductId },
(e, v) => e
)
.ToList();
// Example 2 - Using Any (similar behavior as Contains).
var example2Results = dbContext
.OrderProduct
.Where(e => queryableValues
.Where(v =>
v.OrderId == e.OrderId &&
v.ProductId == e.ProductId
)
.Any()
)
.ToList();
Useful Links
Nuget Package
GitHub Repository
Benchmarks
QueryableValues is distributed under the MIT license.

You can use Union for each composite primary key:
var compositeKeys = new List<CK>
{
new CK { id1 = 1, id2 = 2 },
new CK { id1 = 1, id2 = 3 },
new CK { id1 = 2, id2 = 4 }
};
IQueryable<CK> query = null;
foreach(var ck in compositeKeys)
{
var temp = context.Table.Where(x => x.id1 == ck.id1 && x.id2 == ck.id2);
query = query == null ? temp : query.Union(temp);
}
var result = query.ToList();

You can create a collection of strings with both keys like this (I am assuming that your keys are int type):
var id1id2Strings = listOfIds.Select(p => p.Id1+ "-" + p.Id2);
Then you can just use "Contains" on your db:
using (dbEntities context = new dbEntities())
{
var rec = await context.Table1.Where(entity => id1id2Strings.Contains(entity.Id1 + "-" + entity.Id2)).ToListAsync();
return rec;
}

You need a set of objects representing the keys you want to query.
class Key
{
public int Id1 {get;set;}
public int Id2 {get;set;}
}
If you have two lists and you simply check that each value appears in their respective list then you are getting the cartesian product of the lists - which is likely not what you want. Instead you need to query the specific combinations required
List<Key> keys = // get keys;
context.Table.Where(q => keys.Any(k => k.Id1 == q.Id1 && k.Id2 == q.Id2));
I'm not completely sure that this is valid use of Entity Framework; you may have issues with sending the Key type to the database. If that happens then you can be creative:
var composites = keys.Select(k => p1 * k.Id1 + p2 * k.Id2).ToList();
context.Table.Where(q => composites.Contains(p1 * q.Id1 + p2 * q.Id2));
You can create an injective (one-to-one) function, something like a hashcode, which you can use to compare the pair of values. Note that co-prime factors alone do not make p1*Id1 + p2*Id2 unique: the pairs (Id1, Id2) and (Id1 + p2, Id2 - p1) produce the same result. The result only identifies Id1 and Id2 uniquely when the ids are bounded and the factors are chosen accordingly, for example p1 = 1 and p2 larger than any Id1, with non-negative ids.
But then you end up in a situation where you're implementing complex concepts and someone is going to have to support this. Probably better to write a stored procedure which takes the valid key objects.
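
Whether a weighted sum identifies a pair uniquely depends on the id ranges, which a small sketch can show. KeyEncoding, WeightedSum and Pack are hypothetical names; Pack is an assumed alternative that is only unique when 0 <= id1 < width and id2 >= 0.

```csharp
public static class KeyEncoding
{
    // The weighted sum p1*Id1 + p2*Id2 from the answer above.
    public static long WeightedSum(long p1, long p2, long id1, long id2)
    {
        return p1 * id1 + p2 * id2;
    }

    // Assumed alternative: unique as long as 0 <= id1 < width and id2 >= 0,
    // because id1 can be recovered as value % width and id2 as value / width.
    public static long Pack(long width, long id1, long id2)
    {
        return id1 + width * id2;
    }
}
```

With the co-prime factors 2 and 3, the pairs (3, 0) and (0, 2) both give a weighted sum of 6, whereas Pack with a width of 1000 maps them to 3 and 2000. Like the string-based variants, a computed expression such as this generally cannot use the indexes on the individual columns.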

Ran into this issue as well and needed a solution that both did not perform a table scan and also provided exact matches.
This can be achieved by combining Solution 3 and Solution 4 from Gert Arnold's Answer
var firstIds = results.Select(r => r.FirstId);
var secondIds = results.Select(r => r.SecondId);
var compositeIds = results.Select(r => $"{r.FirstId}:{r.SecondId}");
var query = from e in dbContext.Table
//first check the indexes to avoid a table scan
where firstIds.Contains(e.FirstId) && secondIds.Contains(e.SecondId)
//then compare the compositeId for an exact match
//ToString() must be called unless using EF Core 5+
where compositeIds.Contains(e.FirstId.ToString() + ":" + e.SecondId.ToString())
select e;
var entities = await query.ToListAsync();

For EF Core I use a slightly modified version of the bucketized IN method by EricEJ to map composite keys as tuples. It performs pretty well for small sets of data.
Sample usage
List<(int Id, int Id2)> listOfIds = ...
context.Table.In(listOfIds, q => q.Id, q => q.Id2);
Implementation
public static IQueryable<TQuery> In<TKey1, TKey2, TQuery>(
this IQueryable<TQuery> queryable,
IEnumerable<(TKey1, TKey2)> values,
Expression<Func<TQuery, TKey1>> key1Selector,
Expression<Func<TQuery, TKey2>> key2Selector)
{
if (values is null)
{
throw new ArgumentNullException(nameof(values));
}
if (key1Selector is null)
{
throw new ArgumentNullException(nameof(key1Selector));
}
if (key2Selector is null)
{
throw new ArgumentNullException(nameof(key2Selector));
}
if (!values.Any())
{
return queryable.Take(0);
}
var distinctValues = Bucketize(values);
if (distinctValues.Length > 1024)
{
throw new ArgumentException("Too many parameters for SQL Server, reduce the number of parameters", nameof(values));
}
var predicates = distinctValues
.Select(v =>
{
// Create an expression that captures the variable so EF can turn this into a parameterized SQL query
Expression<Func<TKey1>> value1AsExpression = () => v.Item1;
Expression<Func<TKey2>> value2AsExpression = () => v.Item2;
var firstEqual = Expression.Equal(key1Selector.Body, value1AsExpression.Body);
var visitor = new ReplaceParameterVisitor(key2Selector.Parameters[0], key1Selector.Parameters[0]);
var secondEqual = Expression.Equal(visitor.Visit(key2Selector.Body), value2AsExpression.Body);
return Expression.AndAlso(firstEqual, secondEqual);
})
.ToList();
while (predicates.Count > 1)
{
predicates = PairWise(predicates).Select(p => Expression.OrElse(p.Item1, p.Item2)).ToList();
}
var body = predicates.Single();
var clause = Expression.Lambda<Func<TQuery, bool>>(body, key1Selector.Parameters[0]);
return queryable.Where(clause);
}
class ReplaceParameterVisitor : ExpressionVisitor
{
private ParameterExpression _oldParameter;
private ParameterExpression _newParameter;
public ReplaceParameterVisitor(ParameterExpression oldParameter, ParameterExpression newParameter)
{
_oldParameter = oldParameter;
_newParameter = newParameter;
}
protected override Expression VisitParameter(ParameterExpression node)
{
if (ReferenceEquals(node, _oldParameter))
return _newParameter;
return base.VisitParameter(node);
}
}
/// <summary>
/// Break a list of items into tuples of pairs.
/// </summary>
private static IEnumerable<(T, T)> PairWise<T>(this IEnumerable<T> source)
{
var sourceEnumerator = source.GetEnumerator();
while (sourceEnumerator.MoveNext())
{
var a = sourceEnumerator.Current;
sourceEnumerator.MoveNext();
var b = sourceEnumerator.Current;
yield return (a, b);
}
}
private static TKey[] Bucketize<TKey>(IEnumerable<TKey> values)
{
var distinctValueList = values.Distinct().ToList();
// Calculate bucket size as 1,2,4,8,16,32,64,...
var bucket = 1;
while (distinctValueList.Count > bucket)
{
bucket *= 2;
}
// Fill all slots.
var lastValue = distinctValueList.Last();
for (var index = distinctValueList.Count; index < bucket; index++)
{
distinctValueList.Add(lastValue);
}
var distinctValues = distinctValueList.ToArray();
return distinctValues;
}

In the absence of a general solution, I think there are two things to consider:
Avoid multi-column primary keys (this will make unit testing easier too).
But if you have to, chances are that one of the key columns will reduce the query result size to O(n), where n is the size of the ideal query result. From here, it's Solution 5 from Gert Arnold above.
For example, the problem leading me to this question was querying order lines, where the key is order id + order line number + order type, and the source had the order type being implicit. That is, the order type was a constant, order ID would reduce the query set to order lines of relevant orders, and there would usually be 5 or fewer of these per order.
To rephrase: if you have a composite key, chances are that one of its columns has very few duplicates. Apply Solution 5 from above with that.

I tried this solution and it worked for me; the generated query was perfect, without any parameters.
using LinqKit; // nuget
var customField_Ids = customFields?.Select(t => new CustomFieldKey { Id = t.Id, TicketId = t.TicketId }).ToList();
var uniqueIds1 = customField_Ids.Select(cf => cf.Id).Distinct().ToList();
var uniqueIds2 = customField_Ids.Select(cf => cf.TicketId).Distinct().ToList();
var predicate = PredicateBuilder.New<CustomFieldKey>(false); //LinqKit
var lambdas = new List<Expression<Func<CustomFieldKey, bool>>>();
foreach (var cfKey in customField_Ids)
{
var id = uniqueIds1.Where(uid => uid == cfKey.Id).Take(1).ToList();
var ticketId = uniqueIds2.Where(uid => uid == cfKey.TicketId).Take(1).ToList();
lambdas.Add(t => id.Contains(t.Id) && ticketId.Contains(t.TicketId));
}
predicate = AggregateExtensions.AggregateBalanced(lambdas.ToArray(), (expr1, expr2) =>
{
var invokedExpr = Expression.Invoke(expr2, expr1.Parameters.Cast<Expression>());
return Expression.Lambda<Func<CustomFieldKey, bool>>
(Expression.OrElse(expr1.Body, invokedExpr), expr1.Parameters);
});
var modifiedCustomField_Ids = repository.GetTable<CustomFieldLocal>()
.Select(cf => new CustomFieldKey() { Id = cf.Id, TicketId = cf.TicketId }).Where(predicate).ToArray();

I ended up writing a helper for this problem that relies on System.Linq.Dynamic.Core.
It's a lot of code and I don't have time to refactor it at the moment, but input and suggestions are appreciated.
public static IQueryable<TEntity> WhereIsOneOf<TEntity, TSource>(this IQueryable<TEntity> dbSet,
IEnumerable<TSource> source,
Expression<Func<TEntity, TSource,bool>> predicate) where TEntity : class
{
var (where, pDict) = GetEntityPredicate(predicate, source);
return dbSet.Where(where, pDict);
(string WhereStr, IDictionary<string, object> paramDict) GetEntityPredicate(Expression<Func<TEntity, TSource, bool>> func, IEnumerable<TSource> source)
{
var firstP = func.Parameters[0];
var binaryExpressions = RecurseBinaryExpressions((BinaryExpression)func.Body);
var i = 0;
var paramDict = new Dictionary<string, object>();
var res = new List<string>();
foreach (var sourceItem in source)
{
var innerRes = new List<string>();
foreach (var bExp in binaryExpressions)
{
var emp = ToEMemberPredicate(firstP, bExp);
var val = emp.GetKeyValue(sourceItem);
var pName = $"@{i++}";
paramDict.Add(pName, val);
var str = $"{emp.EntityMemberName} {emp.SQLOperator} {pName}";
innerRes.Add(str);
}
res.Add( "(" + string.Join(" and ", innerRes) + ")");
}
var sRes = string.Join(" || ", res);
return (sRes, paramDict);
}
EMemberPredicate ToEMemberPredicate(ParameterExpression firstP, BinaryExpression bExp)
{
var lMember = (MemberExpression)bExp.Left;
var rMember = (MemberExpression)bExp.Right;
var entityMember = lMember.Expression == firstP ? lMember : rMember;
var keyMember = entityMember == lMember ? rMember : lMember;
return new EMemberPredicate(entityMember, keyMember, bExp.NodeType);
}
List<BinaryExpression> RecurseBinaryExpressions(BinaryExpression e, List<BinaryExpression> runningList = null)
{
if (runningList == null) runningList = new List<BinaryExpression>();
if (e.Left is BinaryExpression lbe)
{
var additions = RecurseBinaryExpressions(lbe);
runningList.AddRange(additions);
}
if (e.Right is BinaryExpression rbe)
{
var additions = RecurseBinaryExpressions(rbe);
runningList.AddRange(additions);
}
if (e.Left is MemberExpression && e.Right is MemberExpression)
{
runningList.Add(e);
}
return runningList;
}
}
Helper class:
public class EMemberPredicate
{
public readonly MemberExpression EntityMember;
public readonly MemberExpression KeyMember;
public readonly PropertyInfo KeyMemberPropInfo;
public readonly string EntityMemberName;
public readonly string SQLOperator;
public EMemberPredicate(MemberExpression entityMember, MemberExpression keyMember, ExpressionType eType)
{
EntityMember = entityMember;
KeyMember = keyMember;
KeyMemberPropInfo = (PropertyInfo)keyMember.Member;
EntityMemberName = entityMember.Member.Name;
SQLOperator = BinaryExpressionToMSSQLOperator(eType);
}
public object GetKeyValue(object o)
{
return KeyMemberPropInfo.GetValue(o, null);
}
private string BinaryExpressionToMSSQLOperator(ExpressionType eType)
{
switch (eType)
{
case ExpressionType.Equal:
return "==";
case ExpressionType.GreaterThan:
return ">";
case ExpressionType.GreaterThanOrEqual:
return ">=";
case ExpressionType.LessThan:
return "<";
case ExpressionType.LessThanOrEqual:
return "<=";
case ExpressionType.NotEqual:
return "<>";
default:
throw new ArgumentException($"{eType} is not a handled Expression Type.");
}
}
}
Use Like so:
// This can be a Tuple or whatever.. If Tuple, then y below would be .Item1, etc.
// This data structure is up to you but is what I use.
[FromBody] List<CustomerAddressPk> cKeys
var res = await dbCtx.CustomerAddress
.WhereIsOneOf(cKeys, (x, y) => y.CustomerId == x.CustomerId
&& x.AddressId == y.AddressId)
.ToListAsync();
Hope this helps others.

In the case of a composite key you can use another id list and add a condition for it in your code:
context.Table.Where(q => listOfIds.Contains(q.Id) && listOfIds2.Contains(q.Id2));
Or you can use another trick: create a list of your keys by adding them together:
listofid.add(id+id1+......)
context.Table.Where(q => listOfIds.Contains(q.Id+q.id1+.......));

I tried this on EF Core 5.0.3 with the Postgres provider.
context.Table
.Select(entity => new
{
Entity = entity,
CompositeKey = entity.Id1 + entity.Id2,
})
.Where(x => compositeKeys.Contains(x.CompositeKey))
.Select(x => x.Entity);
This produced SQL like:
SELECT *
FROM table AS t
WHERE t.Id1 + t.Id2 IN (@__compositeKeys_0)
Caveats:
This should only be used where the combination of Id1 and Id2 will always produce a unique result (e.g., they're both UUIDs).
This cannot use the indexes on Id1 and Id2, though you could store the composite key in the database as an indexed column.
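To make the first caveat concrete: plain concatenation can map two different key pairs to the same string. The following is a minimal sketch, assuming string keys. `CompositeKeyDemo`, `Naive` and `WithSeparator` are hypothetical names introduced for illustration, and the `|` separator only helps if it can never occur inside a key. Whether a given EF Core provider translates the separated form to SQL was not checked here.

```csharp
using System;

public static class CompositeKeyDemo
{
    // Naive concatenation, as in the query above.
    public static string Naive(string id1, string id2) => id1 + id2;

    // Concatenation with a separator that is assumed never to occur in a key.
    public static string WithSeparator(string id1, string id2) => id1 + "|" + id2;

    public static void Run()
    {
        // ("ab", "c") and ("a", "bc") are different pairs with the same naive key.
        Console.WriteLine(Naive("ab", "c") == Naive("a", "bc"));                 // True
        Console.WriteLine(WithSeparator("ab", "c") == WithSeparator("a", "bc")); // False
    }
}
```

Fixed-length keys such as UUIDs avoid the collision without a separator, which is why the caveat singles them out.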

Query Nested Dictionary

I was curious if anyone had a good way of solving this problem efficiently. I currently have the following object.
Dictionary<int, Dictionary<double, CustomStruct>>
struct CustomStruct
{
double value1;
double value2;
...
}
Given that I know the 'int' I want to access, I need to know how to return the 'double key' for the dictionary that has the lowest sum of (value1 + value2). Any help would be greatly appreciated. I was trying to use Linq, but any method would be appreciated.

var result = dict[someInt].MinBy(kvp => kvp.Value.value1 + kvp.Value.value2).Key;
using the MinBy Extension Method from the awesome MoreLINQ project.

Using just plain LINQ:
Dictionary<int, Dictionary<double, CustomStruct>> dict = ...;
int id = ...;
var minimum =
(from kvp in dict[id]
// group the keys (double) by their sums
group kvp.Key by kvp.Value.value1 + kvp.Value.value2 into g
orderby g.Key // sort group keys (sums) in ascending order
select g.First()) // select the first key (double) in the group
.First(); // return first key in the sorted collection of keys
Whenever you want to get the minimum or maximum item using plain LINQ, you usually have to do it using a combination of GroupBy(), OrderBy() and First()/Last().

A Dictionary<TKey,TValue> is also a sequence of KeyValuePair<TKey,TValue>. You can select the KeyValuePair with the smallest sum of values and get its key.
Using pure LINQ to Objects:
dict[someInt].OrderBy(item => item.Value.value1 + item.Value.value2)
.Select(item => item.Key)
.FirstOrDefault();

Here is the non-LINQ way. It is not shorter than its LINQ counterparts, but it is much more efficient because it does no sorting, unlike most of the LINQ solutions; sorting can be expensive if the collection is large.
The MinBy solution from dtb is a good one, but it requires an external library. I like LINQ a lot, but sometimes you should remind yourself that a foreach loop with a few local variables is neither archaic nor an error.
CustomStruct Min(Dictionary<double, CustomStruct> input)
{
CustomStruct lret = default(CustomStruct);
double lastSum = double.MaxValue;
foreach (var kvp in input)
{
var other = kvp.Value;
var newSum = other.value1 + other.value2;
if (newSum < lastSum)
{
lastSum = newSum;
lret = other;
}
}
return lret;
}
If you want to use the LINQ method without an external library, you can create your own MinBy like this one:
public static class Extensions
{
public static T MinBy<T>(this IEnumerable<T> coll, Func<T,double> criteria)
{
T lret = default(T);
double last = double.MaxValue;
foreach (var v in coll)
{
var newLast = criteria(v);
if (newLast < last)
{
last = newLast;
lret = v;
}
}
return lret;
}
}
It is not as efficient as the first one, but it does the job and is more reusable and composable than the first one. Your solution with Aggregate is innovative, but it recalculates the sum of the current best match for every item it is compared to, because you do not carry enough state between the Aggregate calls.
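The point about carried state can be sketched like this: an Aggregate whose accumulator holds both the best key and its sum, so each sum is computed only once. This is an illustration only. `CustomStruct` is given public fields `value1` and `value2` as in the question, `MinKeyFinder` and `KeyOfMinSum` are hypothetical names, and returning `double.NaN` for an empty dictionary is an arbitrary choice made for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public struct CustomStruct
{
    public double value1;
    public double value2;
}

public static class MinKeyFinder
{
    // Returns the key whose value has the smallest value1 + value2.
    // Returns double.NaN when the dictionary is empty (arbitrary choice for this sketch).
    public static double KeyOfMinSum(Dictionary<double, CustomStruct> input)
    {
        return input.Aggregate(
            (Key: double.NaN, Sum: double.PositiveInfinity), // accumulator: best key and its sum so far
            (best, kvp) =>
            {
                double sum = kvp.Value.value1 + kvp.Value.value2; // computed once per item
                return sum < best.Sum ? (Key: kvp.Key, Sum: sum) : best;
            }).Key;
    }
}
```

It could be called as `MinKeyFinder.KeyOfMinSum(dict[someInt])`. It still visits every entry once, like the foreach version, but never re-adds the fields of the current best match.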

Thanks for all the help guys, found out this way too:
dict[someInt].Aggregate(
(seed, o) =>
{
var v = seed.Value.TotalCut + seed.Value.TotalFill;
var k = o.Value.TotalCut + o.Value.TotalFill;
return v < k ? seed : o;
}).Key;

LINQ Combine Queries

I have two collections of objects of different types. Let's call them type ALPHA and type BRAVO. Each of these types has a property that is the "ID" for the object. No ID is duplicated within the class, so for any given ID, there is at most one ALPHA and one BRAVO instance. What I need to do is divide them into 3 categories:
Instances of the ID in ALPHA which do not appear in the BRAVO collection;
Instances of the ID in BRAVO which do not appear in the ALPHA collection;
Instances of the ID which appear in both collections.
In all 3 cases, I need to have the actual objects from the collections at hand for subsequent manipulation.
I know for the #3 case, I can do something like:
var myCorrelatedItems = myAlphaItems.Join(myBravoItems, alpha => alpha.Id, beta => beta.Id, (inner, outer) => new
{
alpha = inner,
beta = outer
});
I can also write code for the #1 and #2 cases which look something like
var myUnmatchedAlphas = myAlphaItems.Where(alpha=>!myBravoItems.Any(bravo=>alpha.Id==bravo.Id));
And similarly for unMatchedBravos. Unfortunately, this would result in iterating the collection of alphas (which may be very large!) many times, and the collection of bravos (which may also be very large!) many times as well.
Is there any way to unify these query concepts so as to minimize iteration over the lists? These collections can have thousands of items.

If you are only interested in the IDs,
var alphaIds = myAlphaItems.Select(alpha => alpha.ID);
var bravoIds = myBravoItems.Select(bravo => bravo.ID);
var alphaIdsNotInBravo = alphaIds.Except(bravoIds);
var bravoIdsNotInAlpha = bravoIds.Except(alphaIds);
If you want the alphas and bravos themselves,
var alphaIdsSet = new HashSet<int>(alphaIds);
var bravoIdsSet = new HashSet<int>(bravoIds);
var alphasNotInBravo = myAlphaItems
.Where(alpha => !bravoIdsSet.Contains(alpha.ID));
var bravosNotInAlpha = myBravoItems
.Where(bravo => !alphaIdsSet.Contains(bravo.ID));
EDIT:
A few other options:
The ExceptBy method from MoreLinq.
The Enumerable.ToDictionary method.
If both types inherit from a common type (e.g. an IHasId interface), you could write your own IEqualityComparer<T> implementation; Enumerable.Except has an overload that accepts an equality-comparer as a parameter.
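The last option can be sketched as follows, assuming both classes implement a shared interface. `IHasId`, `IdComparer`, `Partitioner` and `AlphasNotInBravo` are hypothetical names, and the `Alpha` and `Bravo` classes here are stand-ins with only an `Id` property.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IHasId { int Id { get; } }
public class Alpha : IHasId { public int Id { get; set; } }
public class Bravo : IHasId { public int Id { get; set; } }

// Treats any two IHasId objects with the same Id as equal, regardless of concrete type.
public class IdComparer : IEqualityComparer<IHasId>
{
    public bool Equals(IHasId x, IHasId y)
    {
        if (x == null || y == null) return ReferenceEquals(x, y);
        return x.Id == y.Id;
    }

    public int GetHashCode(IHasId obj) => obj.Id.GetHashCode();
}

public static class Partitioner
{
    public static List<Alpha> AlphasNotInBravo(List<Alpha> alphas, List<Bravo> bravos)
    {
        // Except yields elements of the first sequence only, so casting back to Alpha is safe.
        return alphas.Except<IHasId>(bravos, new IdComparer()).Cast<Alpha>().ToList();
    }
}
```

Except builds a hash set from the second sequence internally, so this avoids the nested scan of the Where/Any version in the question.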

Sometimes LINQ is not the answer. This is the kind of problem where I would consider using a HashSet<T> with a custom comparer to reduce the work of performing set operations. HashSets are much more efficient at performing set operations than lists - and (depending on the data) can reduce the work considerably:
// create a wrapper class that can accommodate either an Alpha or a Bravo
class ABItem {
public object Instance { get; private set; }
public int Id { get; private set; }
public ABItem( Alpha a ) { Instance = a; Id = a.Id; }
public ABItem( Bravo b ) { Instance = b; Id = b.Id; }
}
// equality comparer that treats ABItems with the same Id as equal
// (HashSet<T> needs an IEqualityComparer<T>, not an IComparer)
class ABItemComparer : IEqualityComparer<ABItem> {
public bool Equals( ABItem a, ABItem b ) {
if( a == null || b == null ) return ReferenceEquals(a, b);
return a.Id == b.Id;
}
public int GetHashCode( ABItem x ) {
return x.Id.GetHashCode();
}
}
// create a comparer based on comparing the IDs of ABItems
var comparer = new ABItemComparer();
var hashAlphas =
new HashSet<ABItem>(myAlphaItems.Select(x => new ABItem(x)), comparer);
var hashBravos =
new HashSet<ABItem>(myBravoItems.Select(x => new ABItem(x)), comparer);
// items with common IDs in the Alpha and Bravo sets;
// IntersectWith modifies the set in place and returns void, so work on a copy
// (the copy holds the Alpha-side wrappers of the common IDs)
var hashCommon = new HashSet<ABItem>(hashAlphas, comparer);
hashCommon.IntersectWith( hashBravos );
hashAlphas.ExceptWith( hashCommon ); // items only in Alpha
hashBravos.ExceptWith( hashCommon ); // items only in Bravo

Dictionary<int, Alpha> alphaDictionary = myAlphaItems.ToDictionary(a => a.Id);
Dictionary<int, Bravo> bravoDictionary = myBravoItems.ToDictionary(b => b.Id);
ILookup<string, int> keyLookup = alphaDictionary.Keys
.Union(bravoDictionary.Keys)
.ToLookup(x => alphaDictionary.ContainsKey(x) ?
(bravoDictionary.ContainsKey(x) ? "both" : "alpha") :
"bravo");
List<Alpha> alphaBoth = keyLookup["both"].Select(x => alphaDictionary[x]).ToList();
List<Bravo> bravoBoth = keyLookup["both"].Select(x => bravoDictionary[x]).ToList();
List<Alpha> alphaOnly = keyLookup["alpha"].Select(x => alphaDictionary[x]).ToList();
List<Bravo> bravoOnly = keyLookup["bravo"].Select(x => bravoDictionary[x]).ToList();

Here is one possible LINQ solution that performs a full outer join on both sets and appends a property to them showing which group they belong to. This solution might lose its luster, however, when you try to separate the groups into different variables. It all really depends on what kind of actions you need to perform on these objects. At any rate this ran at (I thought) an acceptable speed (.5 seconds) for me on lists of 5000 items:
var q =
from g in
(from id in myAlphaItems.Select(a => a.ID).Union(myBravoItems.Select(b => b.ID))
join a in myAlphaItems on id equals a.ID into ja
from a in ja.DefaultIfEmpty()
join b in myBravoItems on id equals b.ID into jb
from b in jb.DefaultIfEmpty()
select (a == null ?
new { ID = b.ID, Group = "Bravo Only" } :
(b == null ?
new { ID = a.ID, Group = "Alpha Only" } :
new { ID = a.ID, Group = "Both" }
)
)
)
group g.ID by g.Group;
You can remove the 'group by' query or create a dictionary from this (q.ToDictionary(x => x.Key, x => x.Select(y => y))), or whatever! This is simply a way of categorizing your items. I'm sure there are better solutions out there, but this seemed like a truly interesting question so I thought I might as well give it a shot!

I think LINQ is not the best answer to this problem if you want to traverse and compare the minimum amount of times. I think the following iterative solution is more performant. And I believe that code readability doesn't suffer.
var dictUnmatchedAlphas = myAlphaItems.ToDictionary(a => a.Id);
var myCorrelatedItems = new List<AlphaAndBravo>();
var myUnmatchedBravos = new List<Bravo>();
foreach (Bravo b in myBravoItems)
{
var id = b.Id;
if (dictUnmatchedAlphas.ContainsKey(id))
{
var a = dictUnmatchedAlphas[id];
dictUnmatchedAlphas.Remove(id); //to get just the unmatched alphas
myCorrelatedItems.Add(new AlphaAndBravo { a = a, b = b});
}
else
{
myUnmatchedBravos.Add(b);
}
}
Definition of AlphaAndBravo:
public class AlphaAndBravo {
public Alpha a { get; set; }
public Bravo b { get; set; }
}
