Intersect operation with LINQ - C#

Sorry for the strange title; I don't know how to phrase it more concisely. If you know a better way to phrase it, please feel free to edit my question.
So, I have the following table:
I'm talking about the CustomerId and EventType fields; the rest is not important. The table is a log of customer events: whenever a customer triggers an event, a row is added to the table.
I need to select all events of customers who have both an event of type registration and an event of type deposit. In other words: has the customer registered? Has the same customer made a deposit? If both are true, I need to select all events of this customer.
How can I do that with LINQ?
So I can write SQL like
select *
From "CustomerEvents"
where "CustomerId" in (
select distinct "CustomerId"
from "CustomerEvents"
where "EventType" = 'deposit'
intersect
select distinct "CustomerId"
from "CustomerEvents"
where "EventType" = 'registration'
)
It works, but how do I write it in LINQ?
And a second question. The SQL above works, but it is not universal. What if tomorrow I need to show events of customers who have registration, deposit and a new event type, visit? I would have to write a new query, like:
select *
From "CustomerEvents"
where "CustomerId" in (
select "CustomerId"
from "CustomerEvents"
where "EventType" = 'deposit'
intersect
select distinct "CustomerId"
from "CustomerEvents"
where "EventType" = 'registration'
intersect
select distinct "CustomerId"
from "CustomerEvents"
where "EventType" = 'visit'
)
Inconvenient :(
As source data, I have a List of event types. Is there some way to build the query dynamically? I mean, when a new event type is added to the list, a new intersect is added to the query.
P.S. I use Postgres and .NET Core 3.1.
Update
I'm attaching a schema here.

I haven't tested to see if this will translate to SQL correctly, but if we assume ctx.CustomerEvents is DbSet<CustomerEvent> you could try this:
// Note: "event" is a reserved word in C#, so the lambda parameter is named "e"
var targetCustomerIds = ctx
    .CustomerEvents
    .GroupBy(e => e.CustomerId)
    .Where(grouped =>
        grouped.Any(e => e.EventType == "deposit")
        && grouped.Any(e => e.EventType == "registration"))
    .Select(x => x.Key)
    .ToList();
and then select all events for these customers:
var events = ctx.CustomerEvents.Where(e => targetCustomerIds.Contains(e.CustomerId));
To get targetCustomerIds dynamically with a variable number of event types, you could try this:
// for example
var requiredEventTypes = new[] { "deposit", "registration" };
// First group by customer ID
var groupedByCustomerId = ctx
    .CustomerEvents
    .GroupBy(e => e.CustomerId);
// Then filter out any grouping which doesn't satisfy your condition
var filtered = GetFilteredGroups(groupedByCustomerId, requiredEventTypes);
// Then select the target customer IDs
var targetCustomerIds = filtered.Select(x => x.Key).ToList();
// Finally, select your target events
var events = ctx.CustomerEvents.Where(e =>
    targetCustomerIds.Contains(e.CustomerId));
You can define the GetFilteredGroups method like this:
private static IQueryable<IGrouping<int, CustomerEvent>> GetFilteredGroups(
    IQueryable<IGrouping<int, CustomerEvent>> grouping,
    IEnumerable<string> requiredEventTypes)
{
    var result = grouping.Where(x => true);
    // Chain one Where per required event type; a group must pass all of them
    foreach (var eventType in requiredEventTypes)
    {
        result = result.Where(x => x.Any(e => e.EventType == eventType));
    }
    return result;
}
Alternatively, instead of selecting the target customer IDs, you can try to directly select your target events from the filtered groupings:
// ...
// Filter out any grouping which doesn't satisfy your condition
var filtered = GetFilteredGroups(groupedByCustomerId, requiredEventTypes);
// Select your events here
var results = filtered.SelectMany(x => x).Distinct().ToList();
Regarding the inability to translate the query to SQL
Depending on your database size and particularly on the size of CustomerEvents table, this solution may or may not be ideal, but what you could do is load the optimized collection to memory and perform the grouping there:
// for example
var requiredEventTypes = new[] { "deposit", "registration" };
// First load only the relevant columns of the relevant events into memory,
// then group by customer ID on the client (AsEnumerable comes before GroupBy)
var groupedByCustomerId = ctx
    .CustomerEvents
    .Where(e => requiredEventTypes.Contains(e.EventType))
    .Select(e => new CustomerEventViewModel
    {
        Id = e.Id,
        CustomerId = e.CustomerId,
        EventType = e.EventType
    })
    .AsEnumerable()
    .GroupBy(e => e.CustomerId);
// Then filter out any grouping which doesn't satisfy your condition
var filtered = GetFilteredGroups(groupedByCustomerId, requiredEventTypes);
// Then select the target customer IDs
var targetCustomerIds = filtered.Select(x => x.Key).ToList();
// Finally, select your target events
var events = ctx.CustomerEvents.Where(e =>
    targetCustomerIds.Contains(e.CustomerId));
You will need to create a type called CustomerEventViewModel like this (so you don't have to load the entire CustomerEvent entity instances to memory):
public class CustomerEventViewModel
{
public int Id { get; set; }
public int CustomerId { get; set; }
public string EventType { get; set; }
}
And change the GetFilteredGroups like this:
private static IEnumerable<IGrouping<int, CustomerEventViewModel>> GetFilteredGroups(
    IEnumerable<IGrouping<int, CustomerEventViewModel>> grouping,
    IEnumerable<string> requiredEventTypes)
{
    var result = grouping.Where(x => true);
    foreach (var eventType in requiredEventTypes)
    {
        result = result.Where(x => x.Any(e => e.EventType == eventType));
    }
    return result;
}
It should now work fine.
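As a runnable illustration, here is a minimal in-memory sketch of a slightly different formulation: keep only the required event types, group by customer, and keep the groups that contain every required type. It uses LINQ to Objects so it runs without a database. The class name RequiredEventsDemo, the tuple shape and the sample data are assumptions made for illustration, and whether an equivalent query translates to SQL depends on the EF Core version and provider.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RequiredEventsDemo
{
    // Returns the IDs of customers that have at least one event of every required type.
    public static List<int> CustomersWithAll(
        IEnumerable<(int CustomerId, string EventType)> events,
        IEnumerable<string> requiredEventTypes)
    {
        // A set removes duplicate type names, so the count comparison below is safe.
        var required = new HashSet<string>(requiredEventTypes);
        return events
            .Where(e => required.Contains(e.EventType))
            .GroupBy(e => e.CustomerId)
            .Where(g => g.Select(e => e.EventType).Distinct().Count() == required.Count)
            .Select(g => g.Key)
            .OrderBy(id => id)
            .ToList();
    }

    public static void Main()
    {
        // Hypothetical sample data standing in for the CustomerEvents table.
        var events = new List<(int CustomerId, string EventType)>
        {
            (1, "registration"), (1, "deposit"), (1, "visit"),
            (2, "registration"),
            (3, "deposit"), (3, "registration"),
        };
        var required = new[] { "deposit", "registration" };
        Console.WriteLine(string.Join(",", CustomersWithAll(events, required))); // prints 1,3
    }
}
```

Adding "visit" to the required list narrows the result to customer 1 without any change to the query itself.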

Thanks to @Dejan Janjušević; he is an experienced developer. But it seems EF can't translate his solution to SQL (or maybe I'm just doing something wrong). I'm posting my own solution for this situation here. It's simple, even crude. In the table I have EventType, which is a string. And I get the following filter request from the client:
List<string> eventType
Just list with event types. So, in the action I have the following code of the filter:
if (eventType.Any())
{
    // null means "no event type processed yet". Testing for an empty list instead
    // would restart the intersection as soon as it became empty.
    List<int> ids = null;
    foreach (var e in eventType)
    {
        var customerIdsList =
            _context.customerEvents.Where(x => x.EventType == e).Select(x => x.CustomerId.Value).Distinct().ToList();
        if (ids == null)
        {
            ids = customerIdsList;
        }
        else
        {
            ids = ids.Intersect(customerIdsList).ToList();
        }
    }
    customerEvents = customerEvents.Where(x => ids.Contains(x.CustomerId.Value));
}
Not very fast, but works.
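The loop can also be sketched in memory with a HashSet, which makes the intersection step explicit. This is only an illustration under assumptions: IntersectDemo, the tuple shape and the sample data are made up, and a null set marks "nothing processed yet" so that an intersection that has become empty stays empty.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IntersectDemo
{
    // Intersects the customer-ID sets of every requested event type.
    public static HashSet<int> IntersectByEventType(
        IReadOnlyCollection<(int CustomerId, string EventType)> events,
        IEnumerable<string> eventTypes)
    {
        HashSet<int> ids = null; // null means "no event type processed yet"
        foreach (var type in eventTypes)
        {
            var idsForType = events
                .Where(e => e.EventType == type)
                .Select(e => e.CustomerId);
            if (ids == null)
                ids = new HashSet<int>(idsForType);
            else
                ids.IntersectWith(idsForType);
        }
        return ids ?? new HashSet<int>();
    }

    public static void Main()
    {
        // Hypothetical sample data.
        var events = new List<(int CustomerId, string EventType)>
        {
            (1, "registration"), (1, "deposit"),
            (2, "registration"),
            (3, "visit"),
        };
        var both = IntersectByEventType(events, new[] { "registration", "deposit" });
        Console.WriteLine(string.Join(",", both.OrderBy(id => id))); // prints 1
        var all = IntersectByEventType(events, new[] { "registration", "deposit", "visit" });
        Console.WriteLine(all.Count); // prints 0
    }
}
```

In the second call the intersection is already {1} before "visit" is processed, and intersecting with {3} leaves it empty rather than replacing it with the "visit" customers.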

Related

Linq - EntityFramework NotSupportedException

I have a query that looks like this:
var caseList = (from x in context.Cases
where allowedCaseIds.Contains(x.CaseId)
select new Case {
CaseId = x.CaseId,
NotifierId = x.NotifierId,
Notifier = x.NotifierId.HasValue ? new Notifier { Name = x.Notifier.Name } : null // This line throws exception
}).ToList();
A Case class can have 0..1 Notifier
The query above will result in the following System.NotSupportedException:
Unable to create a null constant value of type 'Models.Notifier'. Only entity types, enumeration types or primitive types are supported
in this context.
At the moment the only workaround I found is to loop over the query result afterwards and manually populate Notifier like this:
foreach (var c in caseList.Where(x => x.NotifierId.HasValue))
{
c.Notifier = (from x in context.Notifiers
where x.CaseId == c.CaseId
select new Notifier {
Name = x.Name
}).FirstOrDefault();
}
But I really don't want to do this because in my actual scenario it would generate hundreds of additional queries.
Is there any possible solution for a situation like this?
I think you need to do that in two steps. First, fetch only the data you need with an anonymous type in a single query:
var caseList = (from x in context.Cases
where allowedCaseIds.Contains(x.CaseId)
select new {
CaseId = x.CaseId,
NotifierId = x.NotifierId,
NotifierName = x.Notifier.Name
}).ToList();
After that, you can work in memory:
List<Case> cases = new List<Case>();
foreach (var c in caseList)
{
// "case" is a reserved word in C#, so the variable is named "item"
var item = new Case();
item.CaseId = c.CaseId;
item.NotifierId = c.NotifierId;
item.Notifier = c.NotifierId.HasValue ? new Notifier { Name = c.NotifierName } : null;
cases.Add(item);
}
You could try writing your query as a chain of function calls rather than a query expression, then put an .AsEnumerable() in between:
var caseList = context.Cases
.Where(x => allowedCaseIds.Contains(x.CaseId))
.AsEnumerable() // Switch context
.Select(x => new Case() {
CaseId = x.CaseId,
NotifierId = x.NotifierId,
Notifier = x.NotifierId.HasValue
? new Notifier() { Name = x.Notifier.Name }
: null
})
.ToList();
This causes EF to generate an SQL query only up to the point where you put the .AsEnumerable(); everything after it is done by LINQ to Objects. The advantage is that you can use code that cannot be translated to SQL, and it should not require many changes to your existing code base (unless you're using a lot of let expressions...). Note that x.Notifier is then read in memory, so it has to be loaded, for example through lazy loading or an Include call.

The LINQ expression contains references to queries that are associated with different contexts

Here's my code:
var myStrings = (from x in db1.MyStrings.Where(x => homeStrings.Contains(x.Content))
join y in db2.MyStaticStringTranslations on x.Id equals y.id
select new MyStringModel()
{
Id = x.Id,
Original = x.Content,
Translation = y.translation
}).ToList();
And I get the error that the specified LINQ expression contains references to queries that are associated with different contexts. I know that the problem is that I try to access tables from both db1 and db2, but how do I fix this?
MyStrings is a small table
Load filtered MyStrings in memory, then join with MyStaticStringTranslations using LINQ:
// Read the small table into memory, and make a dictionary from it.
// The last step will use this dictionary for joining.
var byId = db1.MyStrings
.Where(x => homeStrings.Contains(x.Content))
.ToDictionary(s => s.Id);
// Extract the keys. We will need them to filter the big table
var ids = byId.Keys.ToList();
// Bring in only the relevant records
var myStrings = db2.MyStaticStringTranslations
.Where(y => ids.Contains(y.id))
.AsEnumerable() // Make sure the joining is done in memory
.Select(y => new {
Id = y.id
// Use y.id to look up the content from the dictionary
, Original = byId[y.id].Content
, Translation = y.translation
});
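The same dictionary-based join can be tried out in memory. In this sketch two plain lists stand in for the db1 and db2 tables; CrossSourceJoinDemo, the tuple shapes and the sample rows are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CrossSourceJoinDemo
{
    // Joins two independent sources in memory: filter the small one,
    // index it by ID, then look each translation up in that index.
    public static List<string> Translate(
        IEnumerable<(int Id, string Content)> myStrings,
        IEnumerable<(int Id, string Translation)> translations,
        ICollection<string> homeStrings)
    {
        var byId = myStrings
            .Where(s => homeStrings.Contains(s.Content))
            .ToDictionary(s => s.Id);
        return translations
            .Where(t => byId.ContainsKey(t.Id)) // keep only the relevant records
            .Select(t => $"{byId[t.Id].Content} -> {t.Translation}")
            .ToList();
    }

    public static void Main()
    {
        // Hypothetical rows standing in for db1.MyStrings and db2.MyStaticStringTranslations.
        var myStrings = new List<(int Id, string Content)> { (1, "Hello"), (2, "Bye"), (3, "Other") };
        var translations = new List<(int Id, string Translation)> { (1, "Hola"), (2, "Adios"), (3, "Otro"), (4, "Nada") };
        var home = new List<string> { "Hello", "Bye" };
        foreach (var line in Translate(myStrings, translations, home))
            Console.WriteLine(line); // prints "Hello -> Hola" then "Bye -> Adios"
    }
}
```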
You are right that db1 and db2 can't be used in the same Linq expression. x and y have to be joined in this process and not by a Linq provider. Try this:
var x = db1.MyStrings.Where(xx => homeStrings.Contains(xx.Content)).AsEnumerable();
var y = db2.MyStaticStringTranslations.AsEnumerable();
// Join on the range variables a and b, not on the sequences x and y
var myStrings = (from a in x
                 join b in y on a.Id equals b.id
                 select new MyStringModel()
                 {
                     Id = a.Id,
                     Original = a.Content,
                     Translation = b.translation
                 }).ToList();
Refer to this answer for more details: The specified LINQ expression contains references to queries that are associated with different contexts
dasblinkenlight's answer has a better overall approach than this. In this answer I'm trying to minimize the diff against your original code.
I also faced the same problem:
"The specified LINQ expression contains references to queries that are associated with different contexts."
This is because a single query cannot use two contexts at the same time, so I found the solution below.
In this example I want to list the lottery cards with the owner name, but the table containing the owner name is in another database. So I created two contexts, DB1Context and DB2Context, and wrote the code as follows:
var query = from lc in db1.LotteryCardMaster
from om in db2.OwnerMaster
where lc.IsActive == 1
select new
{
lc.CashCardID,
lc.CashCardNO,
om.PersonnelName,
lc.Status
};
AB.LottryList = new List<LotteryCardMaster>();
foreach (var result in query)
{
AB.LottryList.Add(new LotteryCardMaster()
{
CashCardID = result.CashCardID,
CashCardNO = result.CashCardNO,
PersonnelName =result.PersonnelName,
Status = result.Status
});
}
But this gives me the error above, so I found another way to join two tables from different databases, shown below.
var query = from lc in db1.LotteryCardMaster
where lc.IsActive == 1
select new
{
lc.CashCardID,
lc.CashCardNO,
lc.OwnerID,
lc.Status
};
AB.LottryList = new List<LotteryCardMaster>();
foreach (var result in query)
{
AB.LottryList.Add(new LotteryCardMaster()
{
CashCardID = result.CashCardID,
CashCardNO = result.CashCardNO,
PersonnelName =db2.OwnerMaster.FirstOrDefault(x=>x.OwnerID== result.OwnerID).OwnerName,
Status = result.Status
});
}

query and create objects with a one to many relationship using LINQ

In the DB, I have two tables with a one-to-many relationship:
orders        suborders
-----------   -----------
id            id
name          order_id
              name
I'd like to query these tables and end up with a list of order objects, each of which contains a list (or empty list) of suborder objects. I'd also like to do this in a single DB query so it performs well.
In traditional SQL query land, I'd do something like (forgive the pseudocode):
rs = "select o.id, o.name, so.id, so.name from orders o left join suborders so on o.id = so.order_id order by o.id"
orders = new List<Order>
order = null
foreach (row in rs) {
if (order == null || row.get(o.id) != order.id) {
order = new Order(row.get(o.id), row.get(o.name), new List<Suborders>)
orders.add(order)
}
if (row.get(so.id) != null) {
order.suborders.add(new Suborder(row.get(so.id), row.get(so.name)))
}
}
Is there a way to get this same resulting object structure using LINQ-to-Entities? Note that I want to get new objects out of the query, not the Entity Framework generated objects.
The following gets me close, but throws an exception: "LINQ to Entities does not recognize the method..."
var orders =
(from o in Context.orders
join so in Context.suborders on o.id equals so.order_id into gj
select new Order
{
id = o.id,
name = o.name,
suborders = (from so in gj select new Suborder
{
id = so.id,
name = so.name
}).ToList()
}).ToList();
The solution ends up being pretty simple. The key is to use a group join to get SQL to do the left join to suborders, and add a second ToList() call to force the query to be run so you're not trying to do object creation on the SQL server.
orders = Context.orders
.GroupJoin(
Context.suborders,
o => o.id,
so => so.order_id,
(o, so) => new { order = o, suborders = so })
.ToList()
.Select(r => new Order
{
id = r.order.id,
name = r.order.name,
suborders = r.suborders.Select(so => new Suborder
{
id = so.id,
name = so.name
}).ToList()
}).ToList();
This code only makes a single query to SQL for all objects and their child objects. It also lets you transform the EF objects into whatever you need.
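To see the shape that GroupJoin produces, here is a minimal in-memory sketch. GroupJoinDemo, the tuple shapes and the sample rows are assumptions for illustration; the point is that an order without suborders still appears, with an empty list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupJoinDemo
{
    // Pairs every order with the (possibly empty) list of its suborder names.
    public static List<(string Name, List<string> Suborders)> Build(
        IEnumerable<(int Id, string Name)> orders,
        IEnumerable<(int Id, int OrderId, string Name)> suborders)
    {
        return orders
            .GroupJoin(
                suborders,
                o => o.Id,
                so => so.OrderId,
                (o, sos) => (Name: o.Name, Suborders: sos.Select(so => so.Name).ToList()))
            .ToList();
    }

    public static void Main()
    {
        // Hypothetical rows standing in for the orders and suborders tables.
        var orders = new List<(int Id, string Name)> { (1, "A"), (2, "B") };
        var suborders = new List<(int Id, int OrderId, string Name)>
        {
            (10, 1, "Item001"), (11, 1, "Item002"),
        };
        foreach (var order in Build(orders, suborders))
            Console.WriteLine($"{order.Name}: {order.Suborders.Count}"); // prints "A: 2" then "B: 0"
    }
}
```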
I always create a lazily loaded property for relations, so just extend your Order class with such a property:
public class Order
{
    ...
    List<Suborder> _suborders;
    public List<Suborder> Suborders
    {
        get
        {
            // Query the suborders on first access and cache them in the backing field
            return _suborders ?? (_suborders = MyContext.Suborders.Where(x => x.order_id == this.id).ToList());
        }
    }
    ...
}
This way the data is fetched only when the getter is first called.
How about this code?
It loads the orders together with their suborders into a local cache.
List<Orders> orders = new List<Orders>();
private void UpdateCache(List<int> idList)
{
using (var db = new Test(Settings.Default.testConnectionString))
{
DataLoadOptions opt = new DataLoadOptions();
opt.LoadWith<Orders>(x => x.Suborders);
db.LoadOptions = opt;
orders = db.Orders.Where(x => idList.Contains(x.Id)).ToList();
}
}
private void DumpOrders()
{
foreach (var order in orders)
{
Console.WriteLine("*** order");
Console.WriteLine("id:{0},name:{1}", order.Id, order.Name);
if (order.Suborders.Any())
{
Console.WriteLine("****** sub order");
foreach (var suborder in order.Suborders)
{
Console.WriteLine("\torder id:{0},id{1},name:{2}", suborder.Order_id, suborder.Id, suborder.Name);
}
}
}
}
private void button1_Click(object sender, EventArgs e)
{
UpdateCache(new List<int> { 0, 1, 2 });
DumpOrders();
}
Output example below
*** order
id:0,name:A
****** sub order
order id:0,id0,name:Item001
order id:0,id1,name:Item002
order id:0,id2,name:Item003
*** order
id:1,name:B
****** sub order
order id:1,id0,name:Item003
*** order
id:2,name:C
****** sub order
order id:2,id0,name:Item004
order id:2,id1,name:Item005

Linq Select Clause w/ Unknown Number of Fields

I have a LINQ query in which I need to be able to select a variable number of fields from a DataTable. I know all of the fields that could be included, but only two are guaranteed to be in the DataTable. I will also know which fields are included in the DataTable (it will just be different depending on the user's selections). Right now I have set up something like this:
var query = from item in dt.AsEnumerable()
group item by item.Field<string>("ID") into g
select new
{
ID = g.Key, //required
Status = g.Min(i => dostuff(i,"Status")), //not required
Disc = g.Min(i => dostuff(i,"Disc")), //not required
Loc = String.Join<string>(",", from i in g select i.Field<string>("Loc")) //required
};
string dostuff(DataRow i, string field)
{
try
{
return i.Field<string>(field);
}
catch
{
return null;
}
}
So dostuff basically just checks whether or not that field exists in the dataset, and then I would just need to ignore the non-existent fields when working with the query results, which would not be too difficult. However, it seems like there is probably a better way to do this, but I've had a tough time finding anything via Google about using a dynamic select clause.
You could do it with a dynamic type (NB: I did not test this, so it might have typos):
var query = dt.AsEnumerable().GroupBy(item => item.Field<string>("ID"))
    .Select(g => {
        dynamic t = new System.Dynamic.ExpandoObject();
        // Check the table's columns, then aggregate over the rows of the group
        if (dt.Columns.Contains("Status"))
            t.Status = g.Min(i => i.Field<string>("Status"));
        if (dt.Columns.Contains("Disc"))
            t.Disc = g.Min(i => i.Field<string>("Disc"));
        t.ID = g.Key;
        t.Loc = String.Join(",", g.Select(i => i.Field<string>("Loc")));
        return t;
    });
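A self-contained sketch of the same idea follows. It treats the ExpandoObject as an IDictionary<string, object>, which also makes it easy to check later whether an optional field was set. DynamicSelectDemo, the optionalColumns parameter and the sample table are assumptions for illustration, and rows are read through the DataRow indexer rather than Field<T>.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Dynamic;
using System.Linq;

public static class DynamicSelectDemo
{
    // Groups rows by ID and adds an optional column to the result
    // only when the table actually contains that column.
    public static List<IDictionary<string, object>> Summarize(
        DataTable dt, params string[] optionalColumns)
    {
        var present = optionalColumns.Where(c => dt.Columns.Contains(c)).ToList();
        return dt.Rows.Cast<DataRow>()
            .GroupBy(r => (string)r["ID"])
            .Select(g =>
            {
                // ExpandoObject implements IDictionary<string, object>.
                IDictionary<string, object> row = new ExpandoObject();
                row["ID"] = g.Key;
                foreach (var col in present)
                    row[col] = g.Min(r => r[col] as string);
                row["Loc"] = string.Join(",", g.Select(r => (string)r["Loc"]));
                return row;
            })
            .ToList();
    }

    public static void Main()
    {
        // Hypothetical table: "Status" is present, "Disc" is not.
        var dt = new DataTable();
        dt.Columns.Add("ID", typeof(string));
        dt.Columns.Add("Loc", typeof(string));
        dt.Columns.Add("Status", typeof(string));
        dt.Rows.Add("a", "x", "open");
        dt.Rows.Add("a", "y", "closed");
        dt.Rows.Add("b", "z", "open");

        foreach (var row in Summarize(dt, "Status", "Disc"))
            Console.WriteLine($"{row["ID"]}: Loc={row["Loc"]}, has Disc={row.ContainsKey("Disc")}");
    }
}
```

Consumers can call ContainsKey on each result to skip the fields that were not in the table, instead of catching exceptions per row.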

Is there any way to reduce duplication in these two linq queries

Building a bunch of reports, have to do the same thing over and over with different fields
public List<ReportSummary> ListProducer()
{
return (from p in Context.stdReports
group p by new { p.txt_company, p.int_agencyId }
into g
select new ReportSummary
{
PKi = g.Key.int_agencyId,
Name = g.Key.txt_company,
Sum = g.Sum(foo => foo.lng_premium),
Count = g.Count()
}).OrderBy(q => q.Name).ToList();
}
public List<ReportSummary> ListCarrier()
{
return (from p in Context.stdReports
group p by new { p.txt_carrier, p.int_carrierId }
into g
select new ReportSummary
{
PKi = g.Key.int_carrierId,
Name = g.Key.txt_carrier,
Sum = g.Sum(foo => foo.lng_premium),
Count = g.Count()
}).OrderBy(q => q.Name).ToList();
}
My mind is drawing a blank on how I might be able to bring these two together.
It looks like the only thing that changes are the names of the grouping parameters. Could you write a wrapper function that accepts lambdas specifying the grouping parameters? Or even a wrapper function that accepts two strings and then builds raw T-SQL, instead of using LINQ?
Or, and I don't know if this would compile, can you alias the fields in the group statement so that the grouping construct can always be referenced the same way, such as g.Key.id1 and g.Key.id2? You could then pass the grouping construct into the ReportSummary constructor and do the left-hand/right-hand assignment in one place. (You'd need to pass it as dynamic though, since it's an anonymous object at the call site.)
You could do something like this:
public List<ReportSummary> GetList(Func<Record, Tuple<string, int>> fieldSelector)
{
return (from p in Context.stdReports
group p by fieldSelector(p)
into g
select new ReportSummary
{
PKi = g.Key.Item2,
Name = g.Key.Item1,
Sum = g.Sum(foo => foo.lng_premium),
Count = g.Count()
}).OrderBy(q => q.Name).ToList();
}
And then you could call it like this:
var summary = GetList(rec => Tuple.Create(rec.txt_company, rec.int_agencyId));
or:
var summary = GetList(rec => Tuple.Create(rec.txt_carrier, rec.int_carrierId));
Of course, you'll want to replace Record with whatever type Context.stdReports is actually returning.
I haven't checked to see if that will compile, but you get the idea.
Since all that changes between the two queries is the group key, parameterize it. Since it's a composite key (has more than one value within), you'll need to create a simple class which can hold those values (with generic names).
In this case, to parameterize it, make the key selector a parameter to your function. It would have to be an expression and the method syntax to get this to work. You could then generalize it into a function:
public class GroupKey
{
public int Id { get; set; }
public string Name { get; set; }
}
private IQueryable<ReportSummary> GetReport(
Expression<Func<stdReport, GroupKey>> groupKeySelector)
{
return Context.stdReports
.GroupBy(groupKeySelector)
.Select(g => new ReportSummary
{
PKi = g.Key.Id,
Name = g.Key.Name,
Sum = g.Sum(report => report.lng_premium),
Count = g.Count(),
})
.OrderBy(summary => summary.Name);
}
Then just make use of this function in your queries using the appropriate key selectors.
public List<ReportSummary> ListProducer()
{
return GetReport(r =>
new GroupKey
{
Id = r.int_agencyId,
Name = r.txt_company,
})
.ToList();
}
public List<ReportSummary> ListCarrier()
{
return GetReport(r =>
new GroupKey
{
Id = r.int_carrierId,
Name = r.txt_carrier,
})
.ToList();
}
I don't know what types you have mapped for your entities so I made some assumptions. Use whatever is appropriate in your case.
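For experimenting without a database, the parameterized grouping can be sketched over an in-memory list. One caveat applies there: LINQ to Objects compares a class key such as GroupKey by reference unless it overrides Equals and GetHashCode, so this sketch uses a value tuple as the key and a plain Func instead of an expression. The Report class, ReportDemo and the sample data are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the stdReports entity.
public class Report
{
    public int AgencyId;
    public string Company;
    public int CarrierId;
    public string Carrier;
    public long Premium;
}

public static class ReportDemo
{
    // One summary routine; the caller supplies only the grouping key.
    public static List<(int Id, string Name, long Sum, int Count)> Summarize(
        IEnumerable<Report> reports,
        Func<Report, (int Id, string Name)> keySelector)
    {
        return reports
            .GroupBy(keySelector) // value tuples compare by value, so rows group correctly
            .Select(g => (Id: g.Key.Id, Name: g.Key.Name,
                          Sum: g.Sum(r => r.Premium), Count: g.Count()))
            .OrderBy(s => s.Name)
            .ToList();
    }

    public static void Main()
    {
        var reports = new List<Report>
        {
            new Report { AgencyId = 1, Company = "Acme", CarrierId = 7, Carrier = "Zed", Premium = 100 },
            new Report { AgencyId = 1, Company = "Acme", CarrierId = 8, Carrier = "Yak", Premium = 50 },
            new Report { AgencyId = 2, Company = "Beta", CarrierId = 7, Carrier = "Zed", Premium = 25 },
        };
        foreach (var s in Summarize(reports, r => (r.AgencyId, r.Company)))
            Console.WriteLine($"{s.Name}: sum={s.Sum}, count={s.Count}"); // Acme: sum=150, count=2 then Beta: sum=25, count=1
        foreach (var s in Summarize(reports, r => (r.CarrierId, r.Carrier)))
            Console.WriteLine($"{s.Name}: sum={s.Sum}, count={s.Count}"); // Yak: sum=50, count=1 then Zed: sum=125, count=2
    }
}
```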
