Multiple Where vs Inner Join - c#

I have a filter where depending on the user selection I conditionally add in more Where/Joins.
Which method is faster than the other and why?
Example with Where:
var queryable = db.Sometable.Where(x=> x.Id > 30);
queryable = queryable.Where(x=> x.Name.Contains("something"));
var final = queryable.ToList();
Example with Join:
var queryable1 = db.Sometable.Where(x=> x.Id > 30);
var queryable2 = db.Sometable.Where(x=> x.Name.Contains("something"));
var final = (from q1 in queryable1 join q2 in queryable2 on q1.Id equals q2.Id select q1).ToList();
NOTE: I would have preferred the multiple Where, but it is causing an error, as described in another question. Hence I had to shift to a JOIN. I hope the JOIN code is not slower than the multiple WHERE version.

I just tried running similar LINQ statements against a SQL Server 2008 database table with 10 million rows. I found that the query optimizer converted both statements into similar query plans, and the performance difference was a wash.
I would say that as someone who is reading the code, the first example more clearly states your intentions, and therefore would be preferred. Many times performance is not the best metric to choose when evaluating code.
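To illustrate why the chained form suits optional filters, here is a minimal sketch that runs against an in-memory array instead of a database. The `rows` data and the `minId` and `nameFilter` variables are hypothetical stand-ins for the table and the user's selections. Against a real DbContext, the same chain would normally be translated into a single SELECT with the predicates combined by AND.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-in for db.Sometable.
var rows = new[]
{
    new { Id = 10, Name = "something old" },
    new { Id = 31, Name = "something new" },
    new { Id = 42, Name = "unrelated" },
    new { Id = 55, Name = "has something in it" },
};

// Hypothetical user selections; null means "filter not chosen".
int? minId = 30;
string nameFilter = "something";

var query = rows.AsQueryable();

// Each Where narrows the previous query; nothing executes until ToList().
if (minId.HasValue)
    query = query.Where(x => x.Id > minId.Value);
if (!string.IsNullOrEmpty(nameFilter))
    query = query.Where(x => x.Name.Contains(nameFilter));

var final = query.ToList();
foreach (var row in final)
    Console.WriteLine($"{row.Id}: {row.Name}");
```

Because each filter is added only when it is selected, there is no need for a second query or a self-join to combine the conditions.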

I would go for the Where clauses: they avoid self-joining the same table and make the code clearer.
You can add a log to your DbContext to see the generated SQL query:
db.Database.Log = s => System.Diagnostics.Debug.WriteLine(s);
Anyway, to improve the performance of the query I would:
select ONLY the fields that you actually need (not *)
check the indexes of the table
ask whether you really need the Contains call: if the number of records grows a lot, the resulting SQL "LIKE '%XXX%'" will cause performance issues, because a leading wildcard prevents an index seek

I'm sure you already understand that LINQ converts your code into a SQL statement. Your first query would result in something like:
SELECT * FROM Sometable WHERE Id > 30 AND Name LIKE '%something%'
Your second query would result in something like
SELECT q1.*
FROM Sometable q1
JOIN Sometable q2 ON q1.Id = q2.Id
WHERE q1.Id > 30 AND q2.Name LIKE '%something%'
Nearly every time, a select from a single table will return results faster than a join between two tables.
If your LINQ statement is failing to add tables, be sure you are including them.
var queryable = db.Sometable.Include(i => i.ForeignTable).Where(x=> x.Id > 30);

Related

PostgreSQL select query is slow only some of the time?

This query sometimes takes about 1 minute to complete. When we remove the 'user_action_id' condition, it completes within milliseconds. Most of the values in 'user_action_id' are null.
select sum(Round(((cast((bd.posi_loose/u.uom_max_loose) as numeric))+bd.quantity)*bd.price,2))
into totalAmountInItemVoid
from t_bill_details bd
left join public.c_terminal trml on trml.id=bd.terminal_id
left join public.t_bill bil on bil.id=bd.bill_id and bil.terminal_id=bd.terminal_id
left join public.c_uom u on u.id=bd.uom_id
where bil.status!=7
and bil.status!=9
and bd.user_action_id=2
and bil.created_by=userid
and bil.eod_businessday_id is null;
If most values in user_action_id column are null, you can improve the look-up performance by creating a partial index like this:
CREATE INDEX yourIndex ON t_bill_details(user_action_id)
WHERE user_action_id IS NOT NULL;
That will ignore the rows with null values when executing the query, thus saving execution time.
You can also run EXPLAIN on your query to get more insight into why adding that condition to the WHERE clause causes such a performance degradation. With that information you will be able to make a more informed decision; my partial index suggestion is just a guess.
Start by simplifying the query. The WHERE conditions on bil reject the NULL-extended rows, so that LEFT JOIN behaves as an inner join anyway; express the joins accordingly:
select sum(Round(((cast((bd.posi_loose/u.uom_max_loose) as numeric))+bd.quantity)*bd.price,2))
into totalAmountInItemVoid
from t_bill_details bd join
public.c_terminal trml
on trml.id = bd.terminal_id join
public.t_bill bil
on bil.id = bd.bill_id and
bil.terminal_id = bd.terminal_id left join
public.c_uom u
on u.id = bd.uom_id
where bil.status not in (7, 9) and
bd.user_action_id = 2 and
bil.created_by = userid and
bil.eod_businessday_id is null;
This query should be able to take advantage of an index on t_bill_details(user_action_id). I suspect that the performance issue has to do with different execution plans with this condition. You would need to look at the execution plans to see what is happening.
I also wonder how a filter clause would work. Remove the bd.user_action_id = 2 condition from the where clause and instead try:
select sum(Round(((cast((bd.posi_loose/u.uom_max_loose) as numeric))+bd.quantity)*bd.price,2)) filter (where bd.user_action_id = 2)

How to get item value and item count in linq c#

I have a SQL database table named Hato.
I want to get each item's name and its count with a LINQ query.
This is my code:
var qLocation = (from L in db.Hato
where L.HatoRecDate >= startDate && L.HatoRecDate <= endDate
group L by L.HatoLocation into g
select new { HatoLocation = g.Key, count = g.Count() })
.OrderByDescending(o => o.count).ToList();
var l = qLocation[0].HatoLocation;
var c = qLocation[0].count;
It gives me the item name, but shows 0 as the count for every item.
Please tell me what is wrong with my code.
Update
After feedback I have captured the following output, what is interesting is that it is only ever the last record in the set that has a zero count:
Your code looks OK, I see no syntax issues with the query itself, what you need is a few tricks that will help you debug this.
When you run this with an In-Memory record set it behaves as expected, this means that the issue is in the generated SQL that your Linq query is translated into via the DbContext.
As proof of the in-memory behaviour, review this fiddle: https://dotnetfiddle.net/Widget/jxKNG5
Although it is not good practice for production code, one way to work around, and prove this issue is a SQL issue is by reading the data into memory before executing the group by. The results of an IQueryable<T> expression can be loaded into memory using .ToList().
Rather than calling .ToList() on the entire table, if the filter conditions are not in question, call .ToList() after the filter criteria. If you accidentally leave this in your code after your debug session it is going to have less impact than if you were reading every record from the database
#region A safer way to bring the recordset into memory for debugging
// Build the query in 2 steps, first create the filtered query
var filteredHatoQuery = from L in db.Hato
where L.HatoRecDate >= startDate && L.HatoRecDate <= endDate
select L;
// you could also consider only projecting the columns you need
// select new { L.HatoRecDate, L.HatoLocation };
// then operate on the data
var qLocation = (from L in filteredHatoQuery.ToList() // remove the .ToList() to query against the DB
group L by L.HatoLocation into g
select new { HatoLocation = g.Key, count = g.Count() })
.OrderByDescending(o => o.count).ToList();
#endregion A safer way to bring the recordset into memory for debugging
To be honest, I had a really hard time re-creating a query where you could possibly get a Count() of zero. Zero items means no records in the group, which would normally prevent the group header from returning at all, in fact I tried a lot of different angles to this, and really can't figure it out.
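As a small illustration of that point, the sketch below runs the same query shape over in-memory data. The sample rows are hypothetical and use LINQ to Objects, not a DbContext. A group only exists if at least one row falls into it, so every returned count is at least 1.

```csharp
using System;
using System.Linq;

// Hypothetical sample rows standing in for the Hato table.
var hato = new[]
{
    new { HatoLocation = "North", HatoRecDate = new DateTime(2021, 5, 21) },
    new { HatoLocation = "North", HatoRecDate = new DateTime(2021, 5, 22) },
    new { HatoLocation = "South", HatoRecDate = new DateTime(2021, 5, 23) },
    new { HatoLocation = "East",  HatoRecDate = new DateTime(2021, 6, 1) }, // outside the range
};

var startDate = new DateTime(2021, 5, 21);
var endDate = new DateTime(2021, 5, 24);

var qLocation = (from L in hato
                 where L.HatoRecDate >= startDate && L.HatoRecDate <= endDate
                 group L by L.HatoLocation into g
                 select new { HatoLocation = g.Key, count = g.Count() })
                .OrderByDescending(o => o.count)
                .ToList();

// "East" never appears: it has no rows in range, so no group is created for it.
foreach (var item in qLocation)
    Console.WriteLine($"{item.HatoLocation}: {item.count}");
```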
There are two complicating factors for manually debugging a query like this:
Linq / C# group by is vastly different to SQL GROUP BY. In C# grouping simply splits the results into sub-arrays, all the records are still in the output, but in SQL the GROUP BY doesn't return all the records, it only returns the aggregate group results. To do this properly, the grouping should be realised in SQL as a nested query, it won't necessarily always involve a SQL GROUP BY.
Either way, the resulting SQL will NOT be as simple as this:
SELECT HatoLocation, COUNT(*)
FROM Hato
WHERE HatoRecDate >= '2021-05-21' AND HatoRecDate <= '2021-05-24'
GROUP BY HatoLocation
You are ordering by the results of an aggregate within a filter. This is not always a big deal, but it can often lead to complications in SQL if you are not also using a limiting factor like TOP. As a general proposition, if the sorting only affects the rendered output, and not the functional logic, then you should leave the sort process to the renderer. Or at the very least, sort In-Memory, not in the SQL.
The original query would evaluate into SQL similar to this:
(I have substituted literal values for the start and end parameters @p__linq__0 and @p__linq__1)
SELECT
[Project1].[C2] AS [C1],
[Project1].[HatoLocation] AS [HatoLocation],
[Project1].[C1] AS [C2]
FROM ( SELECT
[GroupBy1].[A1] AS [C1],
[GroupBy1].[K1] AS [HatoLocation],
1 AS [C2]
FROM ( SELECT
[Extent1].[HatoLocation] AS [K1],
COUNT(1) AS [A1]
FROM [dbo].[Hato] AS [Extent1]
WHERE ([Extent1].[HatoRecDate] >= '2021-05-21') AND ([Extent1].[HatoRecDate] <= '2021-05-24')
GROUP BY [Extent1].[HatoLocation]
) AS [GroupBy1]
) AS [Project1]
ORDER BY [Project1].[C1] DESC
But even that is not going to result in a count of zero. I can only assume that OPs runtime environment or database introduces some other factor that has not been taken into account for this exploration.
In Linq to Entities you can get the resulting SQL for queries that have not been read into memory simply by calling .ToString() on the query, or by using the inspector tool during a debug session. There is a good discussion in this post Get SQL query from LINQ to SQL?
For debugging purposes, it is a good idea to separate the LINQ query from the enumerated, in-memory result set. In this example the sort has also been isolated so that it occurs after the .ToList(), and the SQL is written to the debug output.
var qLocationQuery = from L in db.Hato
where L.HatoRecDate >= startDate && L.HatoRecDate <= endDate
group L by L.HatoLocation into g
select new { HatoLocation = g.Key, count = g.Count() };
System.Diagnostics.Debug.WriteLine("Hato Query SQL:");
System.Diagnostics.Debug.WriteLine(qLocationQuery.ToString());
var qLocation = qLocationQuery.ToList();
// now perform the sort, this simulates leaving the sort to the rendering logic.
qLocation = qLocation.OrderByDescending(o => o.count).ToList();
Please update your post with the resulting SQL so we can further explore this!
Update
I've updated the fiddle with an actual DbContext implementation, I still cannot produce a grouping with a count of zero.
https://dotnetfiddle.net/G4RvUV
This shows how to extract the SQL query, but it shows there is something else wrong with your code. We either need to see more of the data, more of the schema, or a copy of the data without the grouping (as shown in the fiddle) so we can provide more assistance.
Try this...
Do the .ToList() and after that do the group by.

Making a UNION query more efficient in LINQ

I am currently working on a project leveraging EF and I am wondering if there is a more efficient or cleaner way to handle what I have below.
In SQL Server I could get the data I want by doing something like this:
SELECT tbl2.* FROM
dbo.Table1 tbl
INNER JOIN dbo.Table2 tbl2 ON tbl.Column = tbl2.Column
WHERE tbl.Column2 IS NULL
UNION
SELECT * FROM
dbo.Table2
WHERE Column2 = value
Very straight forward. However in LINQ I have something that looks like this:
var results1 = Repository.Select<Table>()
.Include(t => t.Table2)
.Where(t => t.Column == null);
var table2Results = results1.Select(t => t.Table2);
var results2 = Repository.Select<Table2>().Where(t => t.Column2 == "VALUE");
table2Results = table2Results.Concat(results2);
return table2Results.ToList();
First and foremost, the return type of the method that contains this code is IEnumerable<Table2>, so first I get back all of the Table2 associations where a column in Table1 is null. I then have to select out my Table2 records so that I have a variable of type IEnumerable<Table2>. The rest of the code is fairly straightforward in what it does.
This seems awfully chatty to me and, I think, there is a better way to do what I am trying to achieve. The produced SQL isn't terrible (I've omitted the column list for readability)
SELECT
[UnionAll1].*
FROM (SELECT
[Extent2].*
FROM [dbo].[Table1] AS [Extent1]
INNER JOIN [dbo].[Table2] AS [Extent2] ON [Extent1].[Column] = [Extent2].[Column]
WHERE [Extent1].[Column2] IS NULL
UNION ALL
SELECT
[Extent3].*
FROM [dbo].[Table2] AS [Extent3]
WHERE VALUE = [Extent3].[Column]) AS [UnionAll1]
So is there a cleaner / more efficient way to do what I have described? Thanks!
Well, one problem is that your results may not return the same data as your original SQL query. Union will select distinct values, Union All will select all values. First, I think your code could be made a lot clearer like so:
// Notice the lack of "Include". "Include" only states what should be returned
// *with* the original type, and is not necessary if you only need to select the
// individual property.
var firstResults = Repository.Select<Table>()
.Where(t => t.Column == null)
.Select(t => t.Table2);
var secondResults = Repository.Select<Table2>()
.Where(t => t.Column2 == "Value");
return firstResults.Union(secondResults);
If you know that it's impossible to have duplicates in this query, use Concat instead on the last line (which will produce the UNION ALL that you see in your current code) for reasons described in more detail here. If you want something similar to the original query, continue to use Union like in the example above.
It's important to remember that LINQ-to-Entities is not always going to be able to produce the SQL that you desire, since it has to handle so many cases in a generic fashion. The benefit of using EF is that it makes your code a lot more expressive, clearer, strongly typed, etc. so you should favor readability first. Then, if you actually see a performance problem when profiling, then you might want to consider alternate ways to query for the data. If you profile the two queries first, then you might not even care about the answer to this question.
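The difference between the two operators is easy to see in memory. The sketch below uses hypothetical integer keys in place of Table2 rows. Note that with entity objects in LINQ to Objects, Union compares by the type's equality (reference equality by default), whereas in a database query the comparison happens on the row values.

```csharp
using System;
using System.Linq;

// Hypothetical key values standing in for the Table2 rows returned by each branch.
var firstResults = new[] { 1, 2, 3 };
var secondResults = new[] { 3, 4 };

var unionResults = firstResults.Union(secondResults).ToList();   // duplicates removed, like UNION
var concatResults = firstResults.Concat(secondResults).ToList(); // duplicates kept, like UNION ALL

Console.WriteLine(string.Join(", ", unionResults));
Console.WriteLine(string.Join(", ", concatResults));
```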

Linq To SQL equivalent group by with multiple table columns in the output

I have just started on a project that uses Linq To SQL (there are various reasons why this is so, but for the moment, that is what is being used, not EF or ANOther ORM).
I have been tasked with migrating old (and I'm talking VB6 here) legacy code.
I come from a predominantly T-SQL background, so I knocked up a query that would do what I want, but I have to use LINQ to SQL (c# 3.5), which I don't have much experience with.
Note that the database will be SQL Server 2008 R2 and/or SQL Azure
Here is the T-SQL (simplified)
SELECT TBS.ServiceAvailID, sum(Places) as TakenPlaces,MAX(SA.TakenPlaces)
FROM TourBookService TBS
JOIN TourBooking TB
ON TB.TourBookID=TBS.TourBookID
JOIN ServiceAvail SA
ON TBS.ServiceAvailID = SA.ServiceAvailID
WHERE TB.Status = 10
AND ServiceSvrCode='test'
GROUP BY TBS.ServiceAvailID
HAVING sum(Places) <> MAX(SA.TakenPlaces)
So, there is a TourBooking table which has details of a customer's booking. This hangs off the TourBookService table which has details of the service they have booked. There is also a ServiceAvail table which links to the TourBookService table. Now, the sum of the Places should equal the Taken places amount in the ServiceAvail table, but sometimes this is not the case. This query gives back anything where this is not the case. I can create the Linq to just get the sum(places) details, but I am struggling to get the syntax to also get the TakenPlaces (note that this doesn't include the HAVING clause either)
var q = from tbs in TourBookServices
join tb in TourBookings on tbs.TourBookID equals tb.TourBookID
join sa in ServiceAvails on tbs.ServiceAvailID equals sa.ServiceAvailID
where (tb.Status == 10)
&& ( tbs.ServiceSvrCode =="test")
group tbs by tbs.ServiceAvailID
into g
select new {g.Key, TotalPlaces = g.Sum(p => p.Places)};
I need to somehow get the sa table into the group so that I can add g.Max(p => p.TakenPlaces) to the select.
Am I trying to force T-SQL thinking into LINQ ?
I could just have another query that gets all the appropriate details from the ServiceAvail table, then loop through both result sets and match on the key, which would be easy to do, but feels wrong (but that may just be me!)
Any comments would be appreciated.
UPDATE:
As per the accepted answer below, this is what Linqer gave me. I will have a play and see what SQL it actually creates.
from tbs in db.TourBookService
join sa in db.ServiceAvail on tbs.ServiceAvailID equals sa.ServiceAvailID
where
tbs.TourBooking.Status == 10 &&
tbs.ServiceSvrCode == "test"
group new {tbs, sa} by new {
tbs.ServiceAvailID
} into g
where g.Sum(p => p.tbs.Places) != g.Max(p => p.sa.TakenPlaces)
select new {
ServiceAvailID = (System.Int32?)g.Key.ServiceAvailID,
TakenPlaces = (System.Int32?)g.Sum(p => p.tbs.Places),
Column1 = (System.Int32?)g.Max(p => p.sa.TakenPlaces)
}
In your case I would try some kind of converter. In my personal experience, this program works very well at converting SQL to LINQ: http://sqltolinq.com/
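The key idea in the converted query, grouping a pair of range variables so that both tables stay reachable from the group, can be tried out in memory. The following is only a sketch with hypothetical sample data and explicit joins, run with LINQ to Objects; the SQL that LINQ to SQL generates for the real tables should still be checked separately.

```csharp
using System;
using System.Linq;

// Hypothetical sample data; only the columns used by the query are modelled.
var tourBookings = new[]
{
    new { TourBookID = 1, Status = 10 },
    new { TourBookID = 2, Status = 10 },
    new { TourBookID = 3, Status = 5 },
};
var tourBookServices = new[]
{
    new { TourBookID = 1, ServiceAvailID = 100, ServiceSvrCode = "test", Places = 2 },
    new { TourBookID = 2, ServiceAvailID = 100, ServiceSvrCode = "test", Places = 3 },
    new { TourBookID = 2, ServiceAvailID = 200, ServiceSvrCode = "test", Places = 4 },
    new { TourBookID = 3, ServiceAvailID = 200, ServiceSvrCode = "test", Places = 1 },
};
var serviceAvails = new[]
{
    new { ServiceAvailID = 100, TakenPlaces = 5 }, // equals 2 + 3, so it is not reported
    new { ServiceAvailID = 200, TakenPlaces = 9 }, // only 4 places booked with Status 10
};

var q = (from tbs in tourBookServices
         join tb in tourBookings on tbs.TourBookID equals tb.TourBookID
         join sa in serviceAvails on tbs.ServiceAvailID equals sa.ServiceAvailID
         where tb.Status == 10 && tbs.ServiceSvrCode == "test"
         // Group the pair so both tables are reachable from each group.
         group new { tbs, sa } by tbs.ServiceAvailID into g
         // A where placed after the grouping plays the role of HAVING.
         where g.Sum(p => p.tbs.Places) != g.Max(p => p.sa.TakenPlaces)
         select new
         {
             ServiceAvailID = g.Key,
             TotalPlaces = g.Sum(p => p.tbs.Places),
             TakenPlaces = g.Max(p => p.sa.TakenPlaces)
         }).ToList();

foreach (var row in q)
    Console.WriteLine($"{row.ServiceAvailID}: {row.TotalPlaces} vs {row.TakenPlaces}");
```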

Joins and subqueries in LINQ

I am trying to do a join with a subquery and can't seem to get it right. Here is what it looks like, working, in SQL. How do I get it to work in LINQ?
SELECT po.*, p.PermissionID
FROM PermissibleObjects po
INNER JOIN PermissibleObjects_Permissions po_p ON (po.PermissibleObjectID = po_p.PermissibleObjectID)
INNER JOIN Permissions p ON (po_p.PermissionID = p.PermissionID)
LEFT OUTER JOIN
(
SELECT u_po.PermissionID, u_po.PermissibleObjectID
FROM Users_PermissibleObjects u_po
WHERE u_po.UserID = '2F160457-7355-4B59-861F-9871A45FD166'
) used ON (p.PermissionID = used.PermissionID AND po.PermissibleObjectID = used.PermissibleObjectID)
WHERE used.PermissionID is null
Without seeing your database and data model, it's pretty impossible to offer any real help. But, probably the best way to go is:
download linqpad - http://www.linqpad.net/
create a connection to your database
start with the innermost piece - the subquery with the "where" clause
get each small query working, then join them up. Linqpad will show you the generated SQL, as well as the results, so build your small queries up until they are right
So, basically, split your problem up into smaller pieces. Linqpad is fantastic as it lets you test these things out, and check your results as you go
hope this helps, good luck
Toby
The LINQ translation for your query is surprisingly simple:
from pop in PermissibleObjectPermissions
where !pop.UserPermissibleObjects.Any (
upo => upo.UserID == new Guid ("2F160457-7355-4B59-861F-9871A45FD166"))
select new { pop.PermissibleObject, pop.PermissionID }
In words: "From all object permissions, retrieve those with no user-permission whose UserID is 2F160457-7355-4B59-861F-9871A45FD166".
You'll notice that this query uses association properties for navigating relationships. This avoids the need for "joining" and simplifies the query. As a result, the LINQ query is much closer to its description in English than the original SQL query.
The trick, when writing LINQ queries, is to get out of the habit of "transliterating" SQL into LINQ.
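To show that a negated Any expresses the same "LEFT JOIN ... WHERE ... IS NULL" anti-join, here is a minimal in-memory sketch. The two arrays are hypothetical stand-ins for the PermissibleObjects_Permissions and Users_PermissibleObjects tables. Since there are no association properties on plain in-memory objects, the matching keys are compared explicitly inside the lambda.

```csharp
using System;
using System.Linq;

var userId = new Guid("2F160457-7355-4B59-861F-9871A45FD166");
var otherUser = new Guid("00000000-0000-0000-0000-000000000001"); // hypothetical second user

// Hypothetical rows of PermissibleObjects_Permissions.
var objectPermissions = new[]
{
    new { PermissibleObjectID = 1, PermissionID = 10 },
    new { PermissibleObjectID = 1, PermissionID = 20 },
    new { PermissibleObjectID = 2, PermissionID = 10 },
};
// Hypothetical rows of Users_PermissibleObjects.
var userObjects = new[]
{
    new { UserID = userId,    PermissibleObjectID = 1, PermissionID = 10 },
    new { UserID = otherUser, PermissibleObjectID = 2, PermissionID = 10 },
};

// Keep only the object permissions that this user does NOT already hold.
var unused = (from pop in objectPermissions
              where !userObjects.Any(upo =>
                  upo.UserID == userId &&
                  upo.PermissibleObjectID == pop.PermissibleObjectID &&
                  upo.PermissionID == pop.PermissionID)
              select new { pop.PermissibleObjectID, pop.PermissionID }).ToList();

foreach (var row in unused)
    Console.WriteLine($"object {row.PermissibleObjectID}, permission {row.PermissionID}");
```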
