Custom Union in Linq to Entities - c#

I need to union these rows on two ids without using an IEqualityComparer, as those are not supported in LINQ to Entities.
In the result I need every unique combination of BizId and BazId, with the value from foos if the id pair came from there; otherwise the value should be zero. This is a greatly simplified example; in reality these tables are very large and the operations cannot be done in memory. The query therefore has to work with LINQ to Entities so that it is translated to valid SQL and executed on the database. I suspect this can be done with some combination of where, join, and DefaultIfEmpty() instead of Union and Distinct(), but I am at a loss for now.
var foos = from f in Context.Foos where f.isActive select new { BizId = f.bizId, BazId = f.BazId, Value = f.Value };
var bars = from b in Context.Bars where b.isEnabled select new { BizId = b.bizId, BazId = b.BazId, Value = 0 };
var result = foos.Union(bars).Distinct(); //I need this to compare only BizId and BazId

You can group by the two fields and then get the first item of each group:
foos.Union(bars).GroupBy(x => new { x.BizId, x.BazId }) // property names match the anonymous type above
.Select(g => g.FirstOrDefault());
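Note that taking the first item of each group does not by itself guarantee that the row from foos wins over the zero-valued row from bars. The following is a minimal in-memory sketch of one way to make that preference explicit: each row is tagged with a priority before grouping. The Row record, the Priority field, and the UnionByKey class are hypothetical names introduced for illustration, and whether a given LINQ to Entities provider translates this shape to SQL is an assumption that has not been verified here.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class UnionByKey
{
    // Hypothetical stand-in for the projected rows in the question.
    public sealed record Row(int BizId, int BazId, int Value);

    // Keeps one row per (BizId, BazId); a row from foos wins over a row from bars.
    public static List<Row> Merge(IEnumerable<Row> foos, IEnumerable<Row> bars)
    {
        // Tag each row with its source: 0 = foos (preferred), 1 = bars (value forced to zero).
        var tagged = foos
            .Select(f => new { f.BizId, f.BazId, f.Value, Priority = 0 })
            .Concat(bars.Select(b => new { b.BizId, b.BazId, Value = 0, Priority = 1 }));

        return tagged
            .GroupBy(x => new { x.BizId, x.BazId })             // compare only the two ids
            .Select(g => g.OrderBy(x => x.Priority).First())    // prefer the foos row
            .Select(x => new Row(x.BizId, x.BazId, x.Value))
            .OrderBy(r => r.BizId).ThenBy(r => r.BazId)
            .ToList();
    }
}
```

Called with one foos row (1, 1, 5) and bars rows for (1, 1) and (2, 1), this keeps the value 5 for the shared pair and yields zero for the pair that exists only in bars.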

Related

Return is not reached in debug-mode after Join to entities in DbSet

I don't have strong knowledge of the TPL, so maybe I have misunderstood some points.
I have a LINQ to Entities query that uses a Join to connect different DbSets holding different entities. The result list contains the INNER JOIN selection. Here it is:
public async Task<List<PlantsViewModel>> GetPlants()
{
var plants = _context.Plants
.Join(await _context.Saptransfer.ToListAsync(), plant => plant.PlantNumber, sap => sap.Plant,
(plant, sap) => new
{
plant.PlantNumber,
plant.Plant,
plant.PlantManager,
plant.District,
plant.Area
}).GroupBy(x => new
{
x.PlantNumber,
x.Plant,
x.PlantManager,
x.District,
x.Area
}).ToList();
return plants.
Select(p => new PlantsViewModel
{
PlantID = p.Key.PlantNumber,
PlantName = p.Key.Plant,
PlantSupervisor = p.Key.PlantManager,
DistrictName = p.Key.District,
RegionName = p.Key.Area
}).OrderBy(p => p.PlantName).ToList();
}
First of all, it takes a long time to execute, since the non-clustered indexes are absent. But the main question is: why is the breakpoint near the return never hit at run time? Maybe someone can advise me on how to write this LINQ query the right way. Thank you.
You made a big mistake by putting ToList/ToListAsync everywhere. A LINQ query is efficient when you materialize the objects at the end.
I have no idea why you join without using the joined values and then use GroupBy to remove duplicates, but this query will be more efficient than the one in the original question:
public Task<List<PlantsViewModel>> GetPlants()
{
var plants =
from plant in _context.Plants
join sap in _context.Saptransfer on plant.PlantNumber equals sap.Plant // outer key must be on the left of equals
group plant by new {
plant.PlantNumber,
plant.Plant,
plant.PlantManager,
plant.District,
plant.Area
} into g
orderby g.Key.Plant
select new PlantsViewModel
{
PlantID = g.Key.PlantNumber,
PlantName = g.Key.Plant,
PlantSupervisor = g.Key.PlantManager,
DistrictName = g.Key.District,
RegionName = g.Key.Area
};
return plants.ToListAsync();
}
There are several issues with your query. Firstly, the grouping looks completely unnecessary; at worst, with the join, you might need a Distinct to avoid duplicates, and even that can possibly be avoided. GroupBy is normally used where you want to aggregate data, such as getting a count, a minimum, or a maximum, or simply to group related records by a detail, which your query is not doing. The next issue is that you have declared the method as async, yet the bulk of the query is synchronous (ToList, not ToListAsync). On top of that, you are fetching the entire grouped set of data before projecting just the values you care about for the view model (doing a ToList before a Select).
Assuming the Plant and SapTransfer tables don't share a FK relationship, joining this table is effectively only filtering Plants that have a matching record in the SapTransfer table. If that is the intended behavior then you can try the following:
public async Task<List<PlantsViewModel>> GetPlants()
{
var plants = _context.Plants
.Join(_context.Saptransfer, plant => plant.PlantNumber,
sap => sap.Plant,
(plant, sap) => new
{
plant.PlantNumber,
plant.Plant,
plant.PlantManager,
plant.District,
plant.Area
})
.Select(p => new PlantsViewModel
{
PlantID = p.PlantNumber,
PlantName = p.Plant,
PlantSupervisor = p.PlantManager,
DistrictName = p.District,
RegionName = p.Area
})
.Distinct()
.OrderBy(p => p.PlantName); // sort after Distinct; an ordering applied before Distinct is not guaranteed to survive
return await plants.ToListAsync();
}

Select distinct column names in Linq

My table contains several columns and I need to select distinct rows in two specific columns using Linq.
My SQL equivalent is:
Select distinct Level1Id, Level1Name
from levels
What I currently do is:
db.levels.GroupBy(c=> c.Level1Id).Select(s => s.First())
This will retrieve the whole row not only Level1Id and Level1Name. How can I specify the columns I want to retrieve in this linq query?
With Select, you can specify the columns in an anonymous object and then use Distinct on that:
db.levels.Select(l => new{ l.Level1Id, l.Level1Name }).Distinct();
Try:
db.levels.Select(c => new {c.Level1Id, c.Level1Name}).Distinct();
Specify the two columns in your LINQ query select, create an anonymous object with Level1Id and Level1Name properties:
var query = (from v in db.levels
select new { Level1Id = v.Level1Id, Level1Name = v.Level1Name }).Distinct();
and use each item like this:
foreach (var r in query){
var valId = r.Level1Id; // the anonymous type exposes Level1Id, not LevelId
var levelName = r.Level1Name;
//do something
}
You are so close; just one more step:
var result = db.levels.GroupBy(c=> new { c.Level1Id, c.Level1Name })
.Select(s => s.First());
The key point is that anonymous types use structural (value-based) comparison, which is why GroupBy on an anonymous key and the Distinct answers above all work.
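To see the structural comparison in isolation, here is a small LINQ to Objects sketch. The Level record and the DistinctColumns class are hypothetical stand-ins for the table in the question; the example only shows that two anonymous objects with equal property values compare as equal, which is what lets Distinct collapse them.

```csharp
using System.Linq;

public static class DistinctColumns
{
    // Hypothetical row type; Level2Id stands for "the other columns" in the table.
    public sealed record Level(int Level1Id, string Level1Name, int Level2Id);

    // Projects to the two columns of interest, then removes duplicate pairs.
    public static (int Id, string Name)[] DistinctPairs(Level[] levels) =>
        levels.Select(l => new { l.Level1Id, l.Level1Name })   // anonymous type: value equality
              .Distinct()
              .Select(x => (Id: x.Level1Id, Name: x.Level1Name))
              .ToArray();

    // Two separately created anonymous objects with the same values are equal.
    public static bool AnonymousEquality() =>
        new { Id = 1, Name = "A" }.Equals(new { Id = 1, Name = "A" });
}
```

Three input rows of which two share the same Level1Id and Level1Name produce two distinct pairs.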

Anonymous Type with Linq and Guid

I have a simple table:
ID | Value
When I do this:
var sequence = from c in valuesVault.GetTable()
select new {RandomIDX = Guid.NewGuid(), c.ID, c.Value};
every element in the projection ends up with the same GUID value. How do I write this so that each element in the projection gets a different random GUID?
Edit
To clarify on the issue. The GetTable() method simply calls this:
return this.context.GetTable<T>();
where this.context is the DataContext.
The iteration is done as it's always done, nothing fancy:
foreach (var c in sequence)
{
Trace.WriteLine(c.RandomIDX + " " + c.Value);
}
Output:
bf59c94e-119c-4eaf-a0d5-3bb91699b04d What is/was your mother's maiden name?
bf59c94e-119c-4eaf-a0d5-3bb91699b04d What was the last name of one of your high school English teachers?
bf59c94e-119c-4eaf-a0d5-3bb91699b04d In elementary school, what was your best friend's first and last name?
Edit 2
I am using the out-of-the-box LINQ to SQL provider. I have built some generic wrappers around it, but they do not alter the way IQueryable or IEnumerable function in the code.
What is underneath valuesVault.GetTable()?
You probably have a Linq provider such as Linq 2 SQL.
That means that valuesVault.GetTable() is of type IQueryable which in turn means that the entire query becomes an expression.
An expression is a query that is defined but not yet executed.
When sequence is iterated, the query is executed by the Linq provider, and one of the steps the provider has to perform is to evaluate the expression Guid.NewGuid(). Most Linq providers cannot pass that expression to the underlying source (SQL Server wouldn't know what to do with it), so it is evaluated once and that single result is returned with the rest of the result set.
What you could do is to force the valuesVault.GetTable() expression to become a collection by calling the .ToList() or .ToArray() methods. This executes the expression and returns an IEnumerable which represents an in-memory collection.
When performing queries against an IEnumerable, the execution is not passed to the Linq provider but executed by the .NET runtime.
In your case this means that the expression Guid.NewGuid() can be executed correctly.
Try this:
var sequence = from c in valuesVault.GetTable().ToArray()
select new {RandomIDX = Guid.NewGuid(), c.ID, c.Value};
Notice the .ToArray() there. That is what will make the statement go from IQueryable to IEnumerable and that will change its behaviour.
I think it's happening when it gets translated into SQL (ie: it's the database doing it). Since you have no WHERE clauses in your example, you could just do:
var sequence = from c in valuesVault.GetTable().ToList()
select new { RandomID = Guid.NewGuid(), c.ID, c.Value };
Which forces Guid.NewGuid() to be executed in the client. However, it's ugly if your table grows and you start adding filtering clauses. You could solve it by using a second LINQ query that projects a second result set with your new GUIDs:
var sequence = from c in valuesVault.GetTable()
where c.Value > 10
select new { c.ID, c.Value };
var finalSequence = from s in sequence.ToList()
select new { RandomID = Guid.NewGuid(), s.ID, s.Value };
Seems to work for me.
List<int> a = new List<int> {10, 11, 12, 13};
var v = a.Select(i => new {ID = Guid.NewGuid(), I = i});
foreach (var item in v)
{
Console.WriteLine(item);
}
output
{ ID = b760f0c8-8dcc-458e-a924-4401ce02e04c, I = 10 }
{ ID = 2d4a0b17-54d3-4d69-8a5c-d2387e50f054, I = 11 }
{ ID = 906e1dc7-6de4-4f8d-b1cd-c129142a277a, I = 12 }
{ ID = 6a67ef6b-a7fe-4650-a8d7-4d2d3b77e761, I = 13 }
I'm not able to reproduce this behavior with a simple LINQ query. Sample:
List<int> y = new List<int> { 0, 1, 2, 3, 4, 5 };
var result = y.Select(x => new { Guid = Guid.NewGuid(), Id = x }).ToList();
I'm imagining if you try to convert your Table value to a List in Linq, then perform your select, you'll get different Guids.

Linq Union: How to add a literal value to the query?

I need to add a literal value to a query. My attempt
var aa = new List<long>();
aa.Add(0);
var a = Products.Select(p => p.sku).Distinct().Union(aa);
a.ToList().Dump(); // LinqPad's way of showing the values
In the above example, I get an error:
"Local sequence cannot be used in LINQ to SQL implementation
of query operators except the Contains() operator."
If I am using Entity Framework 4 for example, what could I add to the Union statement to always include the "seed" ID?
I am trying to produce SQL code like the following:
select distinct ID
from product
union
select 0 as ID
So later I can join the list to itself so I can find all values where the next highest value is not present (finding the lowest available ID in the set).
Edit: Original Linq Query to find lowest available ID
var skuQuery = Context.Products
.Where(p => p.sku > skuSeedStart &&
p.sku < skuSeedEnd)
.Select(p => p.sku).Distinct();
var lowestSkuAvailableList =
(from p1 in skuQuery
from p2 in skuQuery.Where(a => a == p1 + 1).DefaultIfEmpty()
where p2 == 0 // zero is default for long where it would be null
select p1).ToList();
var Answer = (lowestSkuAvailableList.Count == 0
? skuSeedStart :
lowestSkuAvailableList.Min()) + 1;
This code creates two SKU sets offset by one, then selects the SKU where the next highest doesn't exist. Afterward, it selects the minimum of that (lowest SKU where next highest is available).
For this to work, the seed must be in the set joined together.
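The idea described above can be sketched in memory as follows. This is only an illustration of the logic, not the database query: the SkuGaps class and LowestAvailable method are hypothetical names, the upper bound skuSeedEnd from the question is left out for brevity, and a HashSet lookup replaces the self-join.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SkuGaps
{
    // Returns the lowest value greater than seedStart that is not present in 'used'.
    public static long LowestAvailable(IEnumerable<long> used, long seedStart)
    {
        // Adding the seed guarantees at least one candidate, even for an empty table.
        var set = new HashSet<long>(used) { seedStart };

        // A candidate is any value at or above the seed whose successor is missing.
        return set.Where(s => s >= seedStart && !set.Contains(s + 1)).Min() + 1;
    }
}
```

With used values 1, 2, 4 and a seed of 0, the candidates whose successor is missing are 2 and 4, so the result is 3. With no used values at all, the seed is the only candidate and the result is 1.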
Your problem is that your query is being turned entirely into a LINQ-to-SQL query, when what you need is a LINQ-to-SQL query with local manipulation on top of it.
The solution is to tell the compiler that you want to use LINQ-to-Objects after processing the query (in other words, change the extension method resolution to look at IEnumerable<T>, not IQueryable<T>). The easiest way to do this is to tack AsEnumerable() onto the end of your query, like so:
var aa = new List<long>();
aa.Add(0);
var a = Products.Select(p => p.sku).Distinct().AsEnumerable().Union(aa);
a.ToList().Dump(); // LinqPad's way of showing the values
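The switch from IQueryable to IEnumerable can be tried without a database. The sketch below uses an in-memory array wrapped with AsQueryable() as a stand-in for the Products table, so it only demonstrates which operators bind where; the SeedUnion class and its method name are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SeedUnion
{
    // Distinct runs on the queryable source; the seed is merged in on the client.
    public static List<long> DistinctSkusWithSeed(IQueryable<long> skus, long seed)
    {
        return skus.Distinct()             // still IQueryable: a real provider would translate this part
                   .AsEnumerable()         // from here on, operators bind to Enumerable (LINQ to Objects)
                   .Union(new[] { seed })  // local sequence is fine now
                   .OrderBy(s => s)
                   .ToList();
    }
}
```

For example, `SeedUnion.DistinctSkusWithSeed(new long[] { 3, 1, 3 }.AsQueryable(), 0)` yields 0, 1, 3.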
Up front: not answering exactly the question you asked, but solving your problem in a different way.
How about this:
var a = Products.Select(p => p.sku).Distinct().ToList();
a.Add(0);
a.Dump(); // LinqPad's way of showing the values
You could create a database table for storing constant values and pass a query over that table to the Union operator.
For example, let's imagine table "Defaults" with fields "Name" and "Value" with only one record ("SKU", 0).
Then you can rewrite your expression like this:
var zero = context.Defaults.Where(_=>_.Name == "SKU").Select(_=>_.Value);
var result = context.Products.Select(p => p.sku).Distinct().Union(zero).ToList();

LINQ to Entity: using Contains in the "select" portion throws unexpected error

I've got a LINQ query going against an Entity Framework object. Here's a summary of the query:
//a list of my allies
List<int> allianceMembers = new List<int>() { 1,5,10 };
//query for fleets in my area, including any allies (and mark them as such)
var fleets = from af in FleetSource
select new Fleet
{
fleetID = af.fleetID,
fleetName = af.fleetName,
isAllied = (allianceMembers.Contains(af.userID) ? true : false)
};
Basically, what I'm doing is getting a set of fleets. The allianceMembers list contains INTs of all users who are allied with me. I want to set isAllied = true if the fleet's owner is part of that list, and false otherwise.
When I do this, I am seeing an exception: "LINQ to Entities does not recognize the method 'Boolean Contains(Int32)' method"
I can understand getting this error if I had used the contains in the where portion of the query, but why would I get it in the select? By this point I would assume the query would have executed and returned the results. This little ditty of code does nothing to constrain my data at all.
Any tips on how else I can accomplish what I need to with setting the isAllied flag?
Thanks
This is poached from a previous answer...
Contains is not supported.
IN and JOIN are not the same operator (Filtering by IN never changes the cardinality of the query).
Instead of doing it that way use the join method. It's somewhat difficult to understand without using the query operators, but once you get it, you've got it.
var foo =
model.entitySet.Join( //Start the join
values, //Join to the list of strings
e => e.Name, // on entity.Name
value => value, //equal to the string
(ModelItem ent, String str) => ent);//select the entity
Here it is using the query operators
var foo = from e in model.entitySet
join val in values on
e.Name equals val
select e;
Basically the entity framework attempts to translate your LINQ query into a SQL statement but doesn't know how to handle the Contains.
What you can do instead is retrieve your fleets from the database and set the isAllied property later:
var fleets = (from af in FleetSource
select new Fleet
{
fleetID = af.fleetID,
fleetName = af.fleetName,
userID = af.userID
}).ToList(); // materialize once so the flags set in the loop below persist
foreach (var fleet in fleets)
{
fleet.isAllied = (allianceMembers.Contains(fleet.userID) ? true : false);
}
Everyone above me is wrong!!! (No offense ...) It doesn't work because you are using the IList overload of "Contains" and not the IEnumerable overload of "Contains". Simply change to:
allianceMembers.Contains<int>(af.userID)
By adding the <int>, you are telling the compiler to use the IEnumerable overload instead of the IList overload.
var fleets = from af in FleetSource select af;
var x = from u in fleets.ToList()
select new Fleet
{
fleetID = u.fleetID,
fleetName = u.fleetName,
isAllied = (allianceMembers.Contains(u.userID) ? true : false)
};
Calling ToList() on fleets executes the query; after that, Contains() runs in memory on the materialized results.
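The materialize-then-flag approach can be sketched in memory like this. The Fleet class below is a hypothetical stand-in that mirrors the property names in the question, and the HashSet is a small design choice to keep the lookup cheap when the alliance list is large; none of this is tied to a particular Entity Framework version.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class FleetFlags
{
    // Hypothetical stand-in for the Fleet type used in the question.
    public sealed class Fleet
    {
        public int fleetID { get; set; }
        public string fleetName { get; set; } = "";
        public int userID { get; set; }
        public bool isAllied { get; set; }
    }

    // Materializes the source once, then sets isAllied on the in-memory objects.
    public static List<Fleet> MarkAllies(IEnumerable<Fleet> source, IEnumerable<int> allianceMembers)
    {
        var allies = new HashSet<int>(allianceMembers);
        var fleets = source.ToList(); // the query would execute here against a real provider
        foreach (var fleet in fleets)
        {
            fleet.isAllied = allies.Contains(fleet.userID);
        }
        return fleets;
    }
}
```

Materializing with ToList() before the loop matters: flags set while enumerating a deferred query would be lost the next time the query is enumerated, because new objects are created on each pass.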
