'Invalid column name [ColumnName]' on a nested linq query - c#

Last update
After a lot of testing, I realised that running the same query over the same dataset (in this case a Northwind table) on SQL 2000 and on SQL 2005 gives two different results.
On SQL 2000, I get the error that's in the question.
On SQL 2005, it succeeds.
So I've concluded that the query generated by LINQPad doesn't work on SQL 2000. To reproduce this, run:
OrderDetails
.GroupBy(x=>x.ProductID)
.Select(x=>new {product_id = x.Key, max_quantity = x.OrderByDescending(y=>y.UnitPrice).FirstOrDefault().Quantity}).Dump();
on a Northwind DB in SQL 2000. The SQL translation is:
SELECT [t1].[ProductID] AS [product_id], (
SELECT [t3].[Quantity]
FROM (
SELECT TOP 1 [t2].[Quantity]
FROM [OrderDetails] AS [t2]
WHERE [t1].[ProductID] = [t2].[ProductID]
ORDER BY [t2].[UnitPrice] DESC
) AS [t3]
) AS [max_quantity]
FROM (
SELECT [t0].[ProductID]
FROM [OrderDetails] AS [t0]
GROUP BY [t0].[ProductID]
) AS [t1]
Original Question
I've got the following query:
ATable
.GroupBy(x=> new {FieldA = x.FieldAID, FieldB = x.FieldBID, FieldC = x.FieldCID})
.Select(x=>new {FieldA = x.Key.FieldA, ..., last_seen = x.OrderByDescending(y=>y.Timestamp).FirstOrDefault().Timestamp})
results in:
SqlException: Invalid column name 'FieldAID' x 5
SqlException: Invalid column name 'FieldBID' x 5
SqlException: Invalid column name 'FieldCID' x 1
I've worked out it has to do with the last query to Timestamp because this works:
ATable
.GroupBy(x=> new {FieldA = x.FieldAID, FieldB = x.FieldBID, FieldC = x.FieldCID})
.Select(x=>new {FieldA = x.Key.FieldA, ..., last_seen = x.OrderByDescending(y=>y.Timestamp).FirstOrDefault()})
The query has been simplified. The purpose is to group by a set of fields and then show the last time each grouping occurred in the DB.
I'm using LINQPad 4 to generate these results, so selecting Timestamp gives me a string, whereas FirstOrDefault alone gives me the whole object, which isn't ideal.
Update
On further testing I've noticed that the number and type of SqlExceptions is related to the class created in the GroupBy clause.
So,
ATable
.GroupBy(x=> new {FieldA = x.FieldAID})
.Select(x=>new {FieldA = x.Key.FieldA, last_seen = x.OrderByDescending(y=>y.Timestamp).FirstOrDefault()})
results in
SqlException: Invalid column name 'FieldAID' x 5

You should use the SQL profiler to check if the SQL generated against the 2 databases is different.
We have only had two problems where something ran on SQL Server 2005 but not on SQL Server 2000. In both cases it was due to the lack of support for Multiple Active Result Sets (MARS) in SQL Server 2000. In one case it led to locking in the database, in the other case it led to a reduction of performance.
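If, as in the original question, only the latest Timestamp per group is needed, one thing that may be worth trying is asking for the aggregate directly with Max instead of OrderByDescending(...).FirstOrDefault().Timestamp. The sketch below runs against in-memory data (LINQ to Objects) purely to show that the two expressions return the same value. The Row type and its sample values are made up for illustration, and whether the Max form avoids the error on SQL 2000 is an assumption that should be checked against the generated SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a row of ATable; not the asker's real schema.
public class Row
{
    public int FieldAID { get; set; }
    public DateTime Timestamp { get; set; }
}

public static class LastSeenDemo
{
    // Made-up sample data: two rows for group 1, one row for group 2.
    public static List<Row> SampleRows()
    {
        return new List<Row>
        {
            new Row { FieldAID = 1, Timestamp = new DateTime(2010, 1, 1) },
            new Row { FieldAID = 1, Timestamp = new DateTime(2010, 3, 1) },
            new Row { FieldAID = 2, Timestamp = new DateTime(2010, 2, 1) },
        };
    }

    // Form used in the question: sort each group, take the first row, read Timestamp.
    public static Dictionary<int, DateTime> ViaOrderBy(IEnumerable<Row> rows)
    {
        return rows.GroupBy(x => x.FieldAID)
                   .ToDictionary(g => g.Key,
                                 g => g.OrderByDescending(y => y.Timestamp).First().Timestamp);
    }

    // Alternative: ask for the maximum Timestamp of each group directly.
    public static Dictionary<int, DateTime> ViaMax(IEnumerable<Row> rows)
    {
        return rows.GroupBy(x => x.FieldAID)
                   .ToDictionary(g => g.Key, g => g.Max(y => y.Timestamp));
    }

    public static void Main()
    {
        foreach (var pair in ViaMax(SampleRows()))
            Console.WriteLine(pair.Key + ": " + pair.Value.ToString("yyyy-MM-dd"));
    }
}
```

Note that this only applies when the selected value is the same column the group is ordered by. The Northwind reproduction (Quantity of the row with the highest UnitPrice) cannot be rewritten this way.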

Related

Linq Sql Query result is different from the SQL query run in the database

When I run the query in C# code using LINQ, the result returned is different from that of the SQL query run in SQL Server.
SQL query
SELECT TOP (1000) [Teamid]
,[TeamName]
,[TemplateId]
,[TemplateName]
FROM [MPFT_SendIT].[dbo].[VMTemplate]
where
Teamid=1
Result
SQL Query of the VMTemplate View
SELECT dbo.Team.Id AS Teamid, dbo.Team.TeamName,
dbo.MessageTemplate.Id AS TemplateId,
dbo.MessageTemplate.TemplateName
FROM dbo.Team INNER JOIN
dbo.TemplateLookup ON dbo.Team.Id =
dbo.TemplateLookup.TeamId INNER JOIN
dbo.MessageTemplate ON dbo.TemplateLookup.TemplateId =
dbo.MessageTemplate.Id
where
TeamId= 1
Result
Linq SQL
var teamid = _db.TeamLookups.Where(i => i.UserId == 20).Select(x =>
x.TeamId).ToList(); // teamid return value is 1
ViewBag.messageTemplate = _db.VMTemplates.Where(i =>
teamid.Contains(i.Teamid));
The LINQ query returns only one record (line 1 of the SQL query result) instead of the 2 records expected. Any help on how to solve this issue?
Why Contains()? If teamid held a single value rather than a list, the equality operator would be enough:
_db.VMTemplates.Where(i => teamid == i.Teamid).ToList();
Per your comment, teamid is a list, so your LINQ expression should work just fine as written. Add a ToList() to it:
ViewBag.messageTemplate = _db.VMTemplates.Where(i =>
teamid.Contains(i.Teamid)).ToList();

Does Entity Framework query the database multiple times if I use different fields of the same Linq query at different times?

I searched the Internet and Stack Overflow but couldn't locate a helpful resource. Perhaps I am not using the correct wording to search. If I have missed any previous questions for this reason, please let me know and I will take this question down.
I am dealing with a busy database, so I am required to send fewer queries to the database.
If I access different columns of the same LINQ query from different levels of the code, is Entity Framework smart enough to foresee the required columns and bring them all, or does it call the DB twice?
eg.
var query = from t1 in table_1
join t2 in table_2 on t1.col1 equals t2.col1
where t1.EmployeeId == EmployeeId
group new { t1, t2 } by t1.col2 into grouped
orderby grouped.Count() descending
select new { Column1 = grouped.Key, Column2 = grouped.Sum(g=>g.t2.col4) };
var records = query.Take(10);
// point x
var x = records.Select(a => a.Column1).ToArray();
var y = records.Select(a => a.Column2).ToArray();
Does EF query the database twice to facilitate x and y (send a query first to get Column1, and then send another to get Column2), or is it smart enough to know it needs both columns to be materialised and bring them both at point x?
Added to clarify the intention of the question:
I understand I can simply add a greedy method to the end of query.Take(10) and get it done, but I am trying to understand whether the approach I tried (which is, in my opinion, more elegant) works or, if not, what makes EF issue two queries.
Yes, currently your code will generate 2 queries that will be executed against the database. The reason is that 2 different SQL statements are generated:
First is the top query, taking only 10 records and then only Column1
Second is the top query, taking only 10 records and then only Column2
The reason these are 2 queries is that you call ToArray over two different Select projections, which generates different SQL. Most LINQ queries use deferred execution: they run only when you call something like ToArray()/ToList()/FirstOrDefault(), i.e. the methods that actually give you concrete data. In your original query you call ToArray twice on data that has not yet been retrieved, which means 2 queries (one for the first field and one for the second).
The following code will result in a single query to the database
var records = (from t1 in table_1
join t2 in table_2 on t1.col1 equals t2.col1
where t1.EmployeeId == EmployeeId
group new { t1, t2 } by t1.col2 into grouped
orderby grouped.Count() descending
select new { Column1 = grouped.Key, Column2 = grouped.Sum(g=>g.t2.col4) })
.Take(10).ToList();
var x = records.Select(a => a.Column1).ToArray();
var y = records.Select(a => a.Column2).ToArray();
In my solution above I added a ToList() after narrowing the query down to only the data you need (Take(10)); that is the point at which it is executed against the database. After that, all the data is in memory and you can run any other LINQ operation over it without going to the database again.
Add ToString() calls to your code so you can check the generated SQL at different points. Then you will understand when and what is being executed:
var query = from t1 in table_1
join t2 in table_2 on t1.col1 equals t2.col1
where t1.EmployeeId == EmployeeId
group new { t1, t2 } by t1.col2 into grouped
orderby grouped.Count() descending
select new { Column1 = grouped.Key, Column2 = grouped.Sum(g=>g.t2.col4) };
var generatedSql = query.ToString(); // Here you will see a query that brings all records
var records = query.Take(10);
generatedSql = records.ToString(); // Here you will see it taking only 10 records
// point x
var xQuery = records.Select(a => a.Column1);
generatedSql = xQuery.ToString(); // Here you will see only 1 column in query
// Still nothing has been executed to DB at this point
var x = xQuery.ToArray(); // And that is what will be executed here
// Now you are before second execution
var yQuery = records.Select(a => a.Column2);
generatedSql = yQuery.ToString(); // Here you will see only the second column in query
// Finally, second execution, now with the other column
var y = yQuery.ToArray();
When you run a LINQ statement on an entity in EF, it only prepares the SELECT statement (that's why the type is IQueryable). The data is loaded lazily: the query is evaluated, using an enumerator, only when you try to use a value from it.
So when you explicitly turn it into a collection (.ToList() etc.), EF fetches the data to populate the list, and that is when the SQL command is fired.
It is designed this way to improve performance: if only a particular property of an entity is used, EF doesn't fetch the values of all the columns from that table.
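The effect described in the answers above can be reproduced without a database. The following is a minimal sketch using LINQ to Objects, where each enumeration of a made-up Source() sequence stands in for one round trip to the database; the class, the method names, and the counter are all hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Counts how often the source is enumerated; each enumeration stands in
    // for one query sent to the database.
    public static int Enumerations;

    // Iterator method: its body runs from the top on every enumeration.
    public static IEnumerable<KeyValuePair<string, int>> Source()
    {
        Enumerations++;
        yield return new KeyValuePair<string, int>("a", 1);
        yield return new KeyValuePair<string, int>("b", 2);
    }

    // Mirrors the question: two ToArray calls over a query that is still deferred.
    public static int RunDeferred()
    {
        Enumerations = 0;
        var records = Source().Take(10);                 // nothing enumerated yet
        var x = records.Select(r => r.Key).ToArray();    // first enumeration
        var y = records.Select(r => r.Value).ToArray();  // second enumeration
        return Enumerations;
    }

    // Mirrors the answer: materialise once with ToList, then project in memory.
    public static int RunMaterialised()
    {
        Enumerations = 0;
        var records = Source().Take(10).ToList();        // the only enumeration
        var x = records.Select(r => r.Key).ToArray();    // works on the in-memory list
        var y = records.Select(r => r.Value).ToArray();  // works on the in-memory list
        return Enumerations;
    }

    public static void Main()
    {
        Console.WriteLine(RunDeferred());      // prints 2
        Console.WriteLine(RunMaterialised());  // prints 1
    }
}
```

With a real IQueryable the two deferred calls additionally produce different SQL (one column each), which this in-memory stand-in does not model.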

Using sql with index in linq query

I'm using a dbcontext linq query:
var list = context.MyTable.Where(x => x.IsValid).ToList();
The SqlProfiler shows this Sql Query:
SELECT * FROM [MyTable] WHERE IsValid = 1
The problem is that this table has a lot of SQL indexes, and by default the query uses the wrong one and takes a very long time. I need to add a hint for the right index to the query.
In other words, how do I get this query from LINQ?
SELECT * FROM [MyTable] WITH(INDEX(PK_MyIndexName)) WHERE IsValid = 1

Performing Subquery using Query Builder in Dataset C#

I'm using this SQL statement to produce the desired result, and I would like to display the query result in my C# report using RDLC via a Dataset (.xsd).
SELECT pay.Cutoff,
emp.Id,
emp.LastName,
emp.FirstName,
emp.MiddleName,
emp.TinNumber,
job.Rate * 25 AS FixBIR,
(SELECT COUNT(*) AS MonthsWorked
FROM payroll AS pay3
WHERE YEAR(pay3.DateGenerated) = 2014
AND pay3.EmployeeId = 1
AND pay3.Cutoff = 1
ORDER BY MONTH(pay3.DateGenerated) ASC) * (job.Rate * 25)
AS MonthsWorked_FixBIR_TODATE,
pay.TaxWithheld,
(SELECT SUM(payroll2.TaxWithheld) AS TaxTotal
FROM employee AS employee2
INNER JOIN payroll AS payroll2
ON employee2.Id = payroll2.EmployeeId
WHERE (payroll2.Cutoff = 1)
AND (employee2.Id = emp.Id)
AND YEAR(payroll2.DateGenerated) = 2014)
AS Tax_TODATE,
YEAR(pay.DateGenerated) AS YEAR
FROM employee AS emp
INNER JOIN payroll AS pay
ON emp.Id = pay.EmployeeId
INNER JOIN job
ON emp.JobId = job.Id
WHERE pay.Cutoff = 1
AND pay.PayrollMonth = 'August'
AND Year(pay.DateGenerated) = 2014
This statement works fine (tested in Navicat).
However, when I transferred it to the Dataset using Query Builder, it didn't work. The error says:
The wizard detects the following problems when configuring the TableAdapter: "Fill" Details:
!Generated SELECT statement
Error in SELECT clause: expression near 'SELECT'
Error in SELECT clause: expression near 'FROM'
Missing FROM Clause
Error in SELECT clause: expression near ','.
Unable to parse query text
And if I try to use a simple subquery in the SELECT statement, an error says:
The query cannot be represented graphically in the Diagram and Criteria Pane.
What do I need to do to use this SQL statement in RDLC? Are there other ways besides Query Builder in the Dataset?

GroupBy query is slow what are the other options?

I need a query that selects the rows with the minimum InsertDateTime, but the GroupBy query is very slow.
What can I do instead of GroupBy?
How can I improve the performance of the query?
Is there any other way to write this query using LINQ?
My table has around 600,000 rows and is growing.
FullSimNumber isn't indexed, but IsDeleted and InsertDateTime are.
FullSimNumber is a column made up of three indexed columns: prec + cod + subscribec.
My problem is that the LINQ queries always end in a timeout exception.
I changed the grouping from FullSimNumber to prec, cod, subscribec, which are indexed, but I still get a timeout exception.
I'm using LINQ to EF (code-first style), and my query in SQL is:
SELECT *
FROM dbo.Sim AS t1
JOIN (SELECT FullSimNumber,MIN(InsertDateTime) AS insd
FROM dbo.Sim
GROUP BY FullSimNumber) AS t2
ON t1.FullSimNumber = t2.FullSimNumber AND t1.InsertDateTime = t2.insd
WHERE t1.IsDeleted = 0
and my query in linq
from s in ADbContext.Numbers
where !s.IsDeleted
group s by s.FullSimNumber
into g
let sd = g.OrderBy(x => x.InsertDateTime).FirstOrDefault()
select sd;
Without any data on indexes, row counts and so on, it is hard to know exactly what is needed to make your query faster. One thing that is worth trying is this:
SELECT *
FROM dbo.Sim AS t1
CROSS APPLY (
-- earliest InsertDateTime among the rows sharing t1's FullSimNumber
SELECT TOP 1 t2.FullSimNumber
, t2.InsertDateTime AS insd
FROM dbo.Sim AS t2
WHERE t2.FullSimNumber = t1.FullSimNumber
ORDER BY t2.InsertDateTime ASC
) AS m
WHERE t1.IsDeleted = 0
AND t1.InsertDateTime = m.insd
Again, this could very well be worse! Test it and compare the execution times and load.
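On the LINQ side, one shape that stays close to the SQL in the question is to compute the minimum InsertDateTime per FullSimNumber first and then join back to the rows. The sketch below shows that shape on in-memory data only. The Sim class and the sample rows are assumptions for illustration, and whether EF translates this form into faster SQL than the GroupBy/FirstOrDefault version is untested here and would need to be measured.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity carrying only the columns named in the question.
public class Sim
{
    public string FullSimNumber { get; set; }
    public DateTime InsertDateTime { get; set; }
    public bool IsDeleted { get; set; }
}

public static class EarliestRowDemo
{
    // Made-up sample data. Group "C" has a deleted earliest row, so, like the
    // SQL in the question, the join returns no row for it.
    public static List<Sim> SampleRows()
    {
        return new List<Sim>
        {
            new Sim { FullSimNumber = "A", InsertDateTime = new DateTime(2020, 1, 1) },
            new Sim { FullSimNumber = "A", InsertDateTime = new DateTime(2020, 2, 1) },
            new Sim { FullSimNumber = "B", InsertDateTime = new DateTime(2020, 3, 1) },
            new Sim { FullSimNumber = "C", InsertDateTime = new DateTime(2020, 1, 5), IsDeleted = true },
            new Sim { FullSimNumber = "C", InsertDateTime = new DateTime(2020, 6, 1) },
        };
    }

    public static List<Sim> EarliestPerNumber(IEnumerable<Sim> sims)
    {
        // Step 1: minimum InsertDateTime per FullSimNumber (the inner SELECT ... GROUP BY).
        var earliest = from s in sims
                       group s by s.FullSimNumber into g
                       select new { FullSimNumber = g.Key, Insd = g.Min(x => x.InsertDateTime) };

        // Step 2: join back on both columns and drop deleted rows (the outer JOIN ... WHERE).
        return (from s in sims
                join e in earliest
                    on new { s.FullSimNumber, Insd = s.InsertDateTime }
                    equals new { e.FullSimNumber, e.Insd }
                where !s.IsDeleted
                select s).ToList();
    }

    public static void Main()
    {
        foreach (var s in EarliestPerNumber(SampleRows()))
            Console.WriteLine(s.FullSimNumber + " " + s.InsertDateTime.ToString("yyyy-MM-dd"));
    }
}
```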
