Why is this query resolving to a DataQuery - c#

I have a LINQ to SQL query that gets all my logs for the current hour (stored as an IQueryable):
currentLogs = from dll in cDataContext.DownloadLogs
where dll.DTS.Hour == DateTime.Now.Hour
select dll
And then I have another query (also stored as an IQueryable) that gets the logs that are currently being processed and don't appear in the logs for that time slot.
notDownloadedIds = (from x in cDataContext.CategoryCountryCategoryTypeMappings
where !(
from dll in currentLogs
select dll.CategoryCountryCategoryTypeMappingID)
.Contains(x.CategoryCountryCategoryTypeMappingID)
select x);
When I debug and hover over currentLogs I see a SQL query; when I hover over notDownloadedIds I see a DataQuery. If I refactor notDownloadedIds to not use currentLogs, notDownloadedIds stays a SQL query instead of a DataQuery. Why doesn't notDownloadedIds stay a SQL query, and/or how can I get it to stay like that?
If I don't, I get problems down the line when I use it in a method.
EDIT: After using Sander's advice, I found out that the SQL statement generated is:
SELECT ccc.[CategoryCountryCategoryTypeMappingID], ccc.[CountryID], ccc.[CategoryID],
ccc.[CategoryTypeID], ccc.[URLSegment], ccc.[DTS], ccc.[DTSUTC]
FROM [Store].[CategoryCountryCategoryTypeMappings] AS ccc
WHERE EXISTS(
SELECT *
FROM [dbo].[DownloadLog] AS [t1]
WHERE ([t1].[CategoryCountryCategoryTypeMappingID]
<> ccc.[CategoryCountryCategoryTypeMappingID])
AND (DATEPART(Hour, [t1].[DTS])) = (DATEPART(Hour, GETDATE()))
)
I need to change WHERE EXISTS ... column <> column to WHERE NOT EXISTS ... column = column. Is it possible to do this without resolving it to a DataQuery?

There are subtle problems where you lose the direct-to-SQL mapping. Without diving into the details (I seem to recall .Contains() being problematic), I would recommend you try to refactor your query into a different form, for example:
notDownloadedIds = cDataContext.CategoryCountryCategoryTypeMappings.Where(mapping =>
!currentLogs.Select(dll => dll.CategoryCountryCategoryTypeMappingID)
.Any(id => id == mapping.CategoryCountryCategoryTypeMappingID))
If I read your code right, this should result in an equivalent query. Is this also transformed into a DataQuery?

My guess is that since you're incorporating external data (currentLogs) as part of your query, Linq-to-SQL is going to pull all data from CategoryCountryCategoryTypeMappings and then do the filtering in Linq-to-Objects.
But why does it matter? Certainly there could be a performance difference, but I wouldn't expect you to expose the queries as anything other than IEnumerable<T> or IQueryable<T>.

Depending on the number of objects in your currentLogs collection, you can do the following, which should result in a SQL query.
Your ids will be translated into query parameters, and SQL Server allows at most 2,100 parameters per query, so this only works for reasonably small lists.
var ids = currentLogs
.Select(x => x.CategoryCountryCategoryTypeMappingID)
.ToList();
notDownloadedIds =
from x in cDataContext.CategoryCountryCategoryTypeMappings
where !ids.Contains(x.CategoryCountryCategoryTypeMappingID)
select x;
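As a rough in-memory illustration of the "not in" filter discussed in these answers, the sketch below runs both forms (`!Contains` and `!Any`) against plain lists. The names `mappingIds` and `loggedIds` and their values are made up for the example, and it runs in LINQ to Objects, so it only shows that the two forms select the same rows, not what SQL a provider would generate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the two tables: all mapping IDs, and the
// mapping IDs that already have a download log for the current hour.
var mappingIds = new List<int> { 1, 2, 3, 4, 5 };
var loggedIds = new List<int> { 2, 4 };

// Form used in the question: NOT (logged IDs CONTAIN this ID).
var viaContains = mappingIds.Where(id => !loggedIds.Contains(id)).ToList();

// Form suggested in the first answer: NOT (ANY logged ID equals this ID).
var viaAny = mappingIds.Where(id => !loggedIds.Any(l => l == id)).ToList();

Console.WriteLine(string.Join(", ", viaContains)); // 1, 3, 5
Console.WriteLine(string.Join(", ", viaAny));      // 1, 3, 5
```

Both forms express the NOT EXISTS ... column = column shape the question asks for; whether a given provider keeps them as a single SQL statement still has to be checked against the generated SQL.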

How to get item value and item count in linq c#

I have a SQL database table named Hato.
I want to get each item's name and its count with a LINQ query.
This is my code:
var qLocation = (from L in db.Hato
where L.HatoRecDate >= startDate && L.HatoRecDate <= endDate
group L by L.HatoLocation into g
select new { HatoLocation = g.Key, count = g.Count() })
.OrderByDescending(o => o.count).ToList();
var l = qLocation[0].HatoLocation;
var c = qLocation[0].count;
It gives me the item name, but shows a count of 0 for every item.
Please tell me what is wrong with my code.
Update
After feedback I have captured the following output, what is interesting is that it is only ever the last record in the set that has a zero count:
Your code looks OK and I see no syntax issues with the query itself. What you need is a few tricks that will help you debug this.
When you run this with an In-Memory record set it behaves as expected, which means that the issue is in the generated SQL that your Linq query is translated into via the DbContext.
As proof for the In-Memory case, review this fiddle: https://dotnetfiddle.net/Widget/jxKNG5
Although it is not good practice for production code, one way to work around this, and to prove that it is a SQL issue, is to read the data into memory before executing the group by. The results of an IQueryable<T> expression can be loaded into memory using .ToList().
Rather than calling .ToList() on the entire table, if the filter conditions are not in question, call .ToList() after the filter criteria. If you accidentally leave this in your code after your debug session, it will have less impact than reading every record from the database.
#region A safer way to bring the recordset into memory for debugging
// Build the query in 2 steps, first create the filtered query
var filteredHatoQuery = from L in db.Hato
where L.HatoRecDate >= startDate && L.HatoRecDate <= endDate
select L;
// you could also consider only projecting the columns you need
// select new { L.HatoRecDate, L.HatoLocation };
// then operate on the data
var qLocation = (from L in filteredHatoQuery.ToList() // remove the .ToList() to query against the DB
group L by L.HatoLocation into g
select new { HatoLocation = g.Key, count = g.Count() })
.OrderByDescending(o => o.count).ToList();
#endregion A safer way to bring the recordset into memory for debugging
To be honest, I had a really hard time re-creating a query where you could possibly get a Count() of zero. Zero items means no records in the group, which would normally prevent the group header from returning at all, in fact I tried a lot of different angles to this, and really can't figure it out.
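To make the point about zero counts concrete, here is a minimal in-memory sketch of the same grouping. The sample rows are invented and it runs in LINQ to Objects rather than against a database, so it says nothing about the SQL that the OP's provider generates. It only shows that a group exists solely because at least one record falls into it.

```csharp
using System;
using System.Linq;

// Hypothetical sample rows standing in for the Hato table.
var rows = new[]
{
    (HatoLocation: "North", HatoRecDate: new DateTime(2021, 5, 21)),
    (HatoLocation: "North", HatoRecDate: new DateTime(2021, 5, 22)),
    (HatoLocation: "South", HatoRecDate: new DateTime(2021, 5, 23)),
    (HatoLocation: "East",  HatoRecDate: new DateTime(2021, 6, 1)), // outside the date range
};
var startDate = new DateTime(2021, 5, 21);
var endDate = new DateTime(2021, 5, 24);

// Same shape as the query in the question, written with method syntax.
var qLocation = rows
    .Where(r => r.HatoRecDate >= startDate && r.HatoRecDate <= endDate)
    .GroupBy(r => r.HatoLocation)
    .Select(g => new { HatoLocation = g.Key, count = g.Count() })
    .OrderByDescending(o => o.count)
    .ToList();

foreach (var item in qLocation)
    Console.WriteLine($"{item.HatoLocation}: {item.count}");
// North: 2
// South: 1
```

"East" never shows up with a count of 0; it is simply absent, which is why a zero count in the real output points at something outside the query text.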
There are two complicating factors for manually debugging a query like this:
1. Linq / C# group by is vastly different from SQL GROUP BY. In C#, grouping simply splits the results into sub-arrays and all the records are still in the output, but in SQL the GROUP BY doesn't return all the records; it only returns the aggregate group results. To do this properly, the grouping should be realised in SQL as a nested query, and it won't necessarily always involve a SQL GROUP BY.
Either way, the resulting SQL will NOT be as simple as this:
SELECT HatoLocation, COUNT(*)
FROM Hato
WHERE HatoRecDate >= '2021-05-21' AND HatoRecDate <= '2021-05-24'
GROUP BY HatoLocation
2. You are ordering by the results of an aggregate within a filter. This is not always a big deal, but it can often lead to complications in SQL if you are not also using a limiting factor like TOP. As a general proposition, if the sorting only affects the rendered output, and not the functional logic, then you should leave the sort process to the renderer. Or at the very least, sort In-Memory, not in the SQL.
The original query would evaluate into SQL similar to this:
(I have substituted the start and end parameters @p__linq__0 and @p__linq__1)
SELECT
[Project1].[C2] AS [C1],
[Project1].[HatoLocation] AS [HatoLocation],
[Project1].[C1] AS [C2]
FROM ( SELECT
[GroupBy1].[A1] AS [C1],
[GroupBy1].[K1] AS [HatoLocation],
1 AS [C2]
FROM ( SELECT
[Extent1].[HatoLocation] AS [K1],
COUNT(1) AS [A1]
FROM [dbo].[Hato] AS [Extent1]
WHERE ([Extent1].[HatoRecDate] >= '2021-05-21') AND ([Extent1].[HatoRecDate] <= '2021-05-24')
GROUP BY [Extent1].[HatoLocation]
) AS [GroupBy1]
) AS [Project1]
ORDER BY [Project1].[C1] DESC
But even that is not going to result in a count of zero. I can only assume that the OP's runtime environment or database introduces some other factor that has not been taken into account in this exploration.
In Linq to Entities you can get the resulting SQL for queries that have not been read into memory simply by calling .ToString() on the query, or by using the inspector tool during a debug session. There is a good discussion in this post: Get SQL query from LINQ to SQL?
For debugging purposes, it is a good idea to separate the Linq query from the resulting enumerated or In-Memory result set. In this example we have also isolated the sort so that it occurs after the .ToList(), and the SQL is written to the debug output.
var qLocationQuery = from L in db.Hato
where L.HatoRecDate >= startDate && L.HatoRecDate <= endDate
group L by L.HatoLocation into g
select new { HatoLocation = g.Key, count = g.Count() };
System.Diagnostics.Debug.WriteLine("Hato Query SQL:");
System.Diagnostics.Debug.WriteLine(qLocationQuery.ToString());
var qLocation = qLocationQuery.ToList();
// now perform the sort, this simulates leaving the sort to the rendering logic.
qLocation = qLocation.OrderByDescending(o => o.count).ToList();
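The split between defining a query and materializing it can be seen in a small in-memory sketch. The `evaluated` counter and the sample list are assumptions made for the example, and LINQ to Objects stands in for the database, but the deferred-execution behaviour is the same idea.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var evaluated = 0;
var source = new List<int> { 3, 1, 2 };

// Defining the query runs nothing yet; the predicate has not been called.
var query = source.Where(x => { evaluated++; return x > 1; });
var beforeEnumeration = evaluated;

// ToList() enumerates the query once and materializes the results.
var results = query.ToList();
var afterEnumeration = evaluated;

// Sorting the materialized list happens purely in memory.
var sorted = results.OrderByDescending(x => x).ToList();

Console.WriteLine($"{beforeEnumeration} {afterEnumeration} {string.Join(",", sorted)}"); // 0 3 3,2
```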
Please update your post with the resulting SQL so we can further explore this!
Update
I've updated the fiddle with an actual DbContext implementation, I still cannot produce a grouping with a count of zero.
https://dotnetfiddle.net/G4RvUV
This shows how to extract the SQL query, but it shows there is something else wrong with your code. We either need to see more of the data, more of the schema, or a copy of the data without the grouping (as shown in the fiddle) so we can provide more assistance.
Try this...
Do the .ToList() and after that do the group by.

SelectMany converts query to Enumerable List. How to avoid it?

I have a MVC controller action that does a database query this way:
var marcaciones = db.Marcacion
where db is the database context in Entity Framework, and Marcacion is a database table. After that instruction, marcaciones type becomes
System.Data.Entity.DbSet`1[CasinosCloud.Models.Marcacion]
That allows adding any filter before the framework actually executes the query in the database.
So far, so good.
However, depending on certain condition, marcaciones variable is assigned in a different way.
The database model is such that marcaciones entity in database is a child of another entity. To get that marcaciones list, I can do this:
var marcaciones = trabajador.ServicioSupervisado.SelectMany(s => s.Marcacion).AsQueryable();
As you can infer from the instruction above, trabajador is a parent database entity that has many ServicioSupervisado entities, which, in turn, can have many Marcacion entities.
Since the marcaciones variable is the same as the marcaciones variable I showed before, I have to convert it to a Queryable.
After executing the above instruction, the marcaciones type becomes:
{System.Linq.Enumerable+<SelectManyIterator>d__17`2
[CasinosCloud.Models.Servicio,CasinosCloud.Models.Marcacion]}
That means the query is actually converted to an Enumerable list.
All of that works when no other filter is applied. When I add a query filter, I get problems with the second form. First, the whole web page is slower because all filters are applied to an in-memory list, not in the database. Second, I have problems with string comparisons, especially when I search for a lowercase text that is stored in uppercase in the database. Of course, nothing is found in such a case.
I think the problem comes down to solving the type issue. Why, after calling SelectMany, is the query actually executed and converted to an Enumerable list? Is there a way to avoid this and have everything executed in the database? Maybe I should rewrite that instruction without SelectMany. I tried using db.Marcacion.Intersect() to do the intersection with this code, but the same problem occurs:
trabajador.ServicioSupervisado.SelectMany(s => s.Marcacion)
EDIT:
Query I want to execute in database takes the following form:
For the first way:
SELECT m.*
FROM Marcacion m
For the second way:
SELECT m.*
FROM Marcacion m
INNER JOIN Servicio s ON s.ServicioId = m.ServicioId
INNER JOIN Trabajador t ON t.TrabajadorId = s.TrabajadorId
WHERE t.TrabajadorId = 1069
EDIT 2:
For the second way, I tried with:
marcaciones = marcaciones.Where(m => trabajador.ServicioSupervisado.Any(s => s.ServicioId == m.ServicioId));
After that, when query is actually executed in database, this error happens:
System.NotSupportedException: 'Unable to create a constant value of type 'CasinosCloud.Models.Servicio'. Only primitive types or enumeration types are supported in this context.'
I solved it by writing the query this way:
var servicios = trabajador.ServicioSupervisado.Select(s => s.ServicioId);
marcaciones = marcaciones.Where(m => servicios.Any(s => s == m.ServicioId));
That way the query is executed when I call ToList(), and it works in both cases I described, being very fast in both since the query runs directly in the database with all filters applied.
According to the database log, the final query executed by Entity Framework was:
SELECT
[Extent1].[MarcacionId] AS [MarcacionId],
[Extent1].[MonitorId] AS [MonitorId],
[Extent1].[TrabajadorId] AS [TrabajadorId],
[Extent1].[EmpresaId] AS [EmpresaId],
[Extent1].[ServicioId] AS [ServicioId],
[Extent1].[MarcacionFechaHora] AS [MarcacionFechaHora],
[Extent1].[MarcacionEntradaSalida] AS [MarcacionEntradaSalida],
[Extent1].[MarcacionChecksum] AS [MarcacionChecksum],
[Extent1].[MarcacionEquipo] AS [MarcacionEquipo],
[Extent1].[MarcacionEsManual] AS [MarcacionEsManual],
[Extent1].[MarcacionCreadoEn] AS [MarcacionCreadoEn],
[Extent1].[MarcacionActualizadoEn] AS [MarcacionActualizadoEn],
[Extent1].[MarcacionIndice] AS [MarcacionIndice]
FROM [dbo].[Marcacion] AS [Extent1]
INNER JOIN [dbo].[Trabajador] AS [Extent2] ON [Extent1].[TrabajadorId] = [Extent2].[TrabajadorId]
WHERE ( EXISTS (SELECT
1 AS [C1]
FROM ( SELECT 1 AS X ) AS [SingleRowTable1]
WHERE 3 = [Extent1].[ServicioId]
)) AND ([Extent2].[TrabajadorNombres] + N' ' + [Extent2].[TrabajadorApellidos] LIKE @p__linq__0 ESCAPE N'~') AND ([Extent1].[MarcacionFechaHora] >= @p__linq__1) AND ([Extent1].[MarcacionFechaHora] <= @p__linq__2)
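The behaviour described in the question can be reproduced without a database. In the sketch below, a hypothetical in-memory list of tuples stands in for trabajador.ServicioSupervisado, and the strings are invented sample data. It shows that calling AsQueryable() on the result of SelectMany over an already-loaded collection only wraps it in the LINQ to Objects provider, so later filters use C# semantics, including case-sensitive string matching.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for trabajador.ServicioSupervisado:
// each service carries its own list of Marcacion values (plain strings here).
var servicios = new List<(int ServicioId, List<string> Marcacion)>
{
    (3, new List<string> { "ENTRADA", "SALIDA" }),
    (7, new List<string> { "ENTRADA" }),
};

// SelectMany on a loaded collection is Enumerable.SelectMany, so
// AsQueryable() can only wrap it in the LINQ to Objects provider.
IQueryable<string> marcaciones = servicios.SelectMany(s => s.Marcacion).AsQueryable();
var providerName = marcaciones.Provider.GetType().Name;
Console.WriteLine(providerName); // EnumerableQuery`1

// Filters now run in memory, where string.Contains is case-sensitive.
var lowerMatches = marcaciones.Count(m => m.Contains("entrada"));
var upperMatches = marcaciones.Count(m => m.Contains("ENTRADA"));
Console.WriteLine($"{lowerMatches} vs {upperMatches}"); // 0 vs 2
```

This is why starting from db.Marcacion and filtering by a list of primitive keys, as in the workaround above, keeps the whole query in the database.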

Simulate Entity Framework's .Last() when using SQL Server

SQL Server is able to translate EF's .First() using TOP(1). But when using Entity Framework's .Last() function, it throws an exception. SQL Server does not recognize such a function, for obvious reasons.
I used to work around it by sorting in descending order and taking the first matching row:
var v = db.Table.OrderByDescending(t => t.ID).FirstOrDefault(t => t.ClientNumber == ClientNumberDetected);
This does it with a single query, but it sorts the whole table (millions of rows) before querying...
Do I have good reason to think there will be speed issues if I overuse this technique?
I thought of something similar, but it requires two queries:
int maxID_of_Client = db.Table.Where(t => t.ClientNumber == ClientNumberDetected).Max(t => t.ID);
var v = db.Table.First(t => t.ID == maxID_of_Client);
It consists of retrieving the max ID of the client, then using this ID to retrieve the last row for the client.
It doesn't seem faster to query twice...
There must be a way to optimize this and use a single query without sorting millions of rows.
Unless there is something I don't understand, I'm probably not the first to think about this problem, and I want to solve it for good!
Thanks in advance.
The assumption driving this question is that result sets with no ordering clause come back from your DB in any predictable order at all.
In reality, result sets that come back from SQL have no implicit ordering and none should be assumed.
Therefore, the result of
db.Table.FirstOrDefault(t => t.ClientNumber == ClientNumberDetected)
is actually indeterminate.
Whether you're taking first or last, without ordering it's all meaningless anyway.
Now, what goes to SQL where you add an ordering clause to your LINQ? It will be something similar to...
SELECT TOP(1) something FROM somewhere WHERE foo=bar ORDER BY somevalue
or, in the descending/last case
SELECT TOP(1) something FROM somewhere WHERE foo=bar ORDER BY somevalue DESC
From SQL's POV, there's no significant difference here and your DB will be optimized for this sort of query. The index can be scanned in either direction, and the cost of each query above is the same.
TL;DR :
db.Table.OrderByDescending(t => t.ID)
.FirstOrDefault(t => t.ClientNumber == ClientNumberDetected)
is just fine.
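For a quick sanity check that the single ordered query and the two-step max-ID approach pick the same row, here is a small in-memory sketch. The rows are invented and LINQ to Objects stands in for the database, so it demonstrates the logic only, not the query plan or index usage.

```csharp
using System;
using System.Linq;

// Hypothetical rows (ID, ClientNumber), deliberately not stored in ID order.
var table = new[]
{
    (ID: 4, ClientNumber: 20),
    (ID: 1, ClientNumber: 10),
    (ID: 3, ClientNumber: 10),
    (ID: 2, ClientNumber: 20),
};
var clientNumberDetected = 10;

// Single query: order descending by ID, take the first match.
var viaOrdering = table
    .OrderByDescending(t => t.ID)
    .FirstOrDefault(t => t.ClientNumber == clientNumberDetected);

// Two-step version from the question: find the max ID, then fetch that row.
var maxId = table.Where(t => t.ClientNumber == clientNumberDetected).Max(t => t.ID);
var viaMax = table.First(t => t.ID == maxId);

Console.WriteLine(viaOrdering.ID); // 3
Console.WriteLine(viaMax.ID);      // 3
```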

Multiple Where vs Inner Join

I have a filter where depending on the user selection I conditionally add in more Where/Joins.
Which method is faster than the other and why?
Example with Where:
var queryable = db.Sometable.Where(x=> x.Id > 30);
queryable = queryable.Where(x=> x.Name.Contains("something"));
var final = queryable.ToList();
Example with Join:
var queryable1 = db.Sometable.Where(x=> x.Id > 30);
var queryable2 = db.Sometable.Where(x=> x.Name.Contains("something"));
var final = (from q1 in queryable1 join q2 in queryable2 on q1.Id equals q2.Id select q1).ToList();
NOTE: I would have preferred the multiple Where but it is causing error as described in a question. Hence had to shift to JOIN. Hope 'JOIN' code is not slower than multiple WHERE
I just tried running similar linq statements against an MSsql 2008 database table with 10million rows. I found that the query optimizer converted both statements into similar query plans and the performance difference was a wash.
I would say that as someone who is reading the code, the first example more clearly states your intentions, and therefore would be preferred. Many times performance is not the best metric to choose when evaluating code.
I would go for the Where clause, which avoids self-joining the same table and makes the code clearer.
You can add a log to your DbContext to see the generated SQL query:
db.Database.Log = sql => System.Diagnostics.Debug.WriteLine(sql);
Anyway, to improve the performance of the query I would:
Select ONLY the fields that you actually need (not *).
Check the indexes of the table.
Ask whether you really need the Contains statement. If the number of records grows a lot, you will have performance issues with SQL such as LIKE '%XXX%'.
I'm sure you already understand that LINQ converts your code into a SQL statement. Your first query would result in something like:
SELECT * FROM Sometable WHERE Id > 30 AND Name LIKE '%something%'
Your second query would result in something like
SELECT q1.*
FROM Sometable q1
JOIN Sometable q2 ON q1.Id = q2.Id
WHERE q1.Id > 30 AND q2.Name LIKE '%something%'
Nearly every time, a select from a single table will return results faster than a join between 2 tables.
If your LINQ statement is failing to add tables, be sure you are including them.
var queryable = db.Sometable.Include(i => i.ForeignTable).Where(x=> x.Id > 30);
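To see that the two approaches from the question select the same rows, the sketch below runs both against a hypothetical in-memory array standing in for db.Sometable. It uses LINQ to Objects, so it compares results only and says nothing about the SQL or the query plan a database would produce.

```csharp
using System;
using System.Linq;

// Hypothetical rows standing in for db.Sometable.
var sometable = new[]
{
    (Id: 10, Name: "something old"),
    (Id: 31, Name: "something new"),
    (Id: 42, Name: "nothing"),
    (Id: 50, Name: "say something"),
};

// Chained Where calls: each call narrows the same sequence further.
var viaWhere = sometable
    .Where(x => x.Id > 30)
    .Where(x => x.Name.Contains("something"))
    .Select(x => x.Id)
    .ToList();

// Self-join of two separately filtered sequences on Id.
var q1s = sometable.Where(x => x.Id > 30);
var q2s = sometable.Where(x => x.Name.Contains("something"));
var viaJoin = (from q1 in q1s
               join q2 in q2s on q1.Id equals q2.Id
               select q1.Id).ToList();

Console.WriteLine(string.Join(", ", viaWhere)); // 31, 50
Console.WriteLine(string.Join(", ", viaJoin));  // 31, 50
```

The chained form also composes naturally when filters are added conditionally, since each optional filter is just one more Where call on the same variable.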

Sort Linq list with one column

I guess it should be really simple, but I cannot find how to do it.
I have a LINQ query that selects one column, of type int, and I need it sorted.
var values = (from p in context.Products
where p.LockedSince == null
select Convert.ToInt32(p.SearchColumn3)).Distinct();
values = values.OrderBy(x => x);
SearchColumn3 is of type string, but it only contains integers. So I thought that converting to Int32 and ordering would definitely give me a nice 1, 2, 3 sorted list of values. But instead, the list stays ordered as if the values were strings:
199 20 201
Update:
I've done some tests with C# code and LinqPad.
LinqPad generates the following SQL:
SELECT [t2].[value]
FROM (
SELECT DISTINCT [t1].[value]
FROM (
SELECT CONVERT(Int,[t0].[SearchColumn3]) AS [value], [t0].[LockedSince], [t0].[SearchColumn3]
FROM [Product] AS [t0]
) AS [t1]
WHERE ([t1].[LockedSince] IS NULL)
) AS [t2]
ORDER BY [t2].[value]
And my SQL profiler says that my C# code generates this piece of SQL:
SELECT DISTINCT a.[SearchColumn3] AS COL1
FROM [Product] a
WHERE a.[LockedSince] IS NULL
ORDER BY a.[SearchColumn3]
So it looks like the C# Linq code just omits the Convert.ToInt32.
Can anyone say something useful about this?
[Disclaimer - I work at Telerik]
You can solve this problem with Telerik OpenAccess ORM too. Here is what i would suggest in this case.
var values = (from p in context.Products
where p.LockedSince == null
orderby "cast({0} as integer)".SQL<int>(p.SearchColumn3)
select "cast({0} as integer)".SQL<int>(p.SearchColumn3)).ToList().Distinct();
OpenAccess provides the SQL extension method, which gives you the ability to add some specific sql code to the generated sql statement.
We have started working on improving this behavior.
Thank you for pointing this out.
Regards
Ralph
Same answer as for one of my other questions: it turns out that the Linq provider I'm using, the one that comes with Telerik OpenAccess ORM, does things differently from the standard Linq to SQL provider! See the SQL I've posted in my opening post. I totally wasn't expecting something like this, but it seems that Telerik OpenAccess still needs a lot of improvement. So be careful before you start using it. It looks nice, but it has some serious shortcomings.
I can't replicate this problem. But just make sure you're enumerating the collection when you inspect it. How are you checking the result?
values = values.OrderBy(x => x);
foreach (var v in values)
{
Console.WriteLine(v.ToString());
}
Remember, this won't change the order of the records in the database or anywhere else - only the order that you can retrieve them from the values enumeration.
Your values variable is the result of a Linq expression, so it does not really have values until you call a method such as ToList, ToArray, etc.
Getting back to your example, the variable x in the OrderBy method will be treated as p.SearchColumn3, and therefore it's a string.
To avoid that, you need to make p.SearchColumn3 an integer before the OrderBy method.
You should add a let statement to your code as below:
var values = (from p in context.Products
where p.LockedSince == null
let val = Convert.ToInt32(p.SearchColumn3)
select val).Distinct();
values = values.OrderBy(x => x);
In addition, you can combine the order by statement with the first query; that will be fine.
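The difference between the two orderings in the question is easy to reproduce in memory. The values below are invented sample data for SearchColumn3, and the sketch uses LINQ to Objects with an explicit ordinal comparer, so it shows the ordering semantics only, not what any particular provider sends to the database.

```csharp
using System;
using System.Linq;

// Hypothetical SearchColumn3 values: strings that happen to hold integers.
var searchColumn3 = new[] { "201", "20", "199", "20" };

// Ordering the strings compares them character by character.
var asStrings = searchColumn3
    .Distinct()
    .OrderBy(s => s, StringComparer.Ordinal)
    .ToList();

// Converting first, then ordering, compares the numeric values.
var asInts = searchColumn3
    .Select(s => Convert.ToInt32(s))
    .Distinct()
    .OrderBy(x => x)
    .ToList();

Console.WriteLine(string.Join(" ", asStrings)); // 199 20 201
Console.WriteLine(string.Join(" ", asInts));    // 20 199 201
```

Seeing the first ordering in the real output is a strong hint that the conversion never reached the ORDER BY, which matches the profiler trace in the question.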