I have the following query in NHibernate. The idea is to get the first and last measurement times of a certain data group.
var measurements = _session.Query<Measurement>()
    .Where(x => categories.Contains(x.CategoryId));
first = measurements.Min(o => o.StartTime);
last = measurements.Max(o => o.StartTime);
SQL Server Profiler gives the following output:
exec sp_executesql N'select cast(min(measuremen0_.StartTime) as DATETIME) as col_0_0_ from Measurement measuremen0_ where measuremen0_.Category in (@p0 , @p1)',N'@p0 int,@p1 int',@p0=7654321,@p1=3324673
exec sp_executesql N'select cast(max(measuremen0_.StartTime) as DATETIME) as col_0_0_ from Measurement measuremen0_ where measuremen0_.Category in (@p0 , @p1)',N'@p0 int,@p1 int',@p0=7654321,@p1=3324673
Can I somehow optimize this without using HQL, so that it makes only one request to the database server?
Did you take a look at Future Queries? I think they work for LINQ queries as well.
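In code it would look roughly like this (a sketch only, assuming a reasonably recent NHibernate where NHibernate.Linq exposes ToFutureValue; the entity and variable names are taken from the question):

using NHibernate.Linq;

var measurements = _session.Query<Measurement>()
    .Where(x => categories.Contains(x.CategoryId));

// Neither line hits the database yet; both aggregates are only registered as futures.
var firstFuture = measurements.ToFutureValue(q => q.Min(o => o.StartTime));
var lastFuture = measurements.ToFutureValue(q => q.Max(o => o.StartTime));

// Accessing the first Value executes all pending futures as one batch.
first = firstFuture.Value;
last = lastFuture.Value;

If the ADO.NET driver supports query batching (the SQL Server drivers do), Profiler should then show a single sp_executesql call containing both SELECT statements.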
I created a query in LINQ which returns a table of the most active salesmen in my shop:
ProjectDB3Context db = new ProjectDB3Context();
db.Database.Log = message => Trace.WriteLine(message);

var result = db.tblUsers.Join(db.tblSales,
        u => u.ID,
        sl => sl.tblUserId,
        (u, sl) => new { u, sl })
    .Select(o => new
    {
        UserId = o.u.ID,
        Login = o.u.UserLogin,
        FullName = o.u.Name + " " + o.u.Surname,
        ItemsToSell = db.tblSales.Where(x => x.tblUserId == o.u.ID).Count()
    })
    .Distinct()
    .OrderByDescending(x => x.ItemsToSell)
    .ToList();
The generated SQL query looks like:
SELECT
    [Distinct1].[C1] AS [C1],
    [Distinct1].[ID] AS [ID],
    [Distinct1].[UserLogin] AS [UserLogin],
    [Distinct1].[C2] AS [C2],
    [Distinct1].[C3] AS [C3]
FROM ( SELECT DISTINCT
        [Project1].[ID] AS [ID],
        [Project1].[UserLogin] AS [UserLogin],
        1 AS [C1],
        [Project1].[Name] + N' ' + [Project1].[Surname] AS [C2],
        [Project1].[C1] AS [C3]
    FROM ( SELECT
            [Extent1].[ID] AS [ID],
            [Extent1].[UserLogin] AS [UserLogin],
            [Extent1].[Name] AS [Name],
            [Extent1].[Surname] AS [Surname],
            (SELECT
                COUNT(1) AS [A1]
                FROM [dbo].[tblSale] AS [Extent3]
                WHERE [Extent3].[tblUserId] = [Extent1].[ID]) AS [C1]
        FROM [dbo].[tblUser] AS [Extent1]
        INNER JOIN [dbo].[tblSale] AS [Extent2] ON [Extent1].[ID] = [Extent2].[tblUserId]
    ) AS [Project1]
) AS [Distinct1]
ORDER BY [Distinct1].[C3] DESC
Statistics:
SQL Server Execution Times:
CPU time = 359 ms, elapsed time = 529 ms.
Execution plan screenshot
I want to optimize the generated SQL query and insert the optimized query into a stored procedure. SQL Server Management Studio gives me a tip to create a nonclustered index (tblUserId) on tblSale (you can see this tip in the image I included).
When I create it using the command:
CREATE NONCLUSTERED INDEX IX_ProductVendor_tblUserId
ON tblSale (tblUserId);
and then run the SQL query in SQL Server Management Studio, I get:
SQL Server Execution Times:
CPU time = 328 ms, elapsed time = 631 ms.
So the elapsed time is actually longer after I added the index to optimize my SQL query.
Can anybody help me optimize this query in SQL Server using indexes?
Can anybody help me optimize this query in SQL Server using indexes?
First off, before trying to optimize the SQL query in the database, make sure your LINQ query is optimal, which is not the case here. There is an unnecessary join, which in turn requires the Distinct, and tblSales is accessed twice (see the generated SQL).
What you are trying to achieve is to get users with sales, ordered by sales count descending. The following simple query should produce the desired result:
var result = db.tblUsers
    .Select(u => new
    {
        UserId = u.ID,
        Login = u.UserLogin,
        FullName = u.Name + " " + u.Surname,
        // Correlated count; EF translates this into a single subquery per user.
        ItemsToSell = db.tblSales.Count(s => s.tblUserId == u.ID)
    })
    .Where(x => x.ItemsToSell > 0)
    .OrderByDescending(x => x.ItemsToSell)
    .ToList();
Try and see the new execution plan/time.
I want to optimize the generated SQL query and insert the optimized query into a stored procedure.
Bzzt. Wrong.
Your query is already "optimized" - in that there isn't anything you can do to the query itself to improve its runtime performance.
Stored procedures in SQL Server do not have any kind of magic optimization or other real advantages over immediately-executed queries. Stored procedures do benefit from cached execution plans, but so do immediate queries after their first execution, and execution-plan generation isn't that expensive an operation.
Anyway, using stored procedures for read-only SELECT operations is inadvisable; it is better to use a UDF (CREATE FUNCTION) so you can take advantage of function composition, which the optimizer handles far better at runtime than nested stored procedure calls.
If SQL Server's Show Execution Plan feature tells you to create an index, that is outside of EF's responsibility, and outside a stored procedure's responsibility as well. Just define the index in your database and include it in your setup script. Your EF-generated query will run much faster without needing to be a stored procedure.
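If your "setup script" happens to be EF Code First migrations, the index can be declared there instead of in a hand-written script. A minimal sketch (assuming EF6 migrations are enabled; the migration class name is made up, the table, column, and index name are taken from the question):

using System.Data.Entity.Migrations;

public partial class AddSaleUserIdIndex : DbMigration
{
    public override void Up()
    {
        // Non-clustered index on the foreign key used by the correlated COUNT.
        CreateIndex("dbo.tblSale", "tblUserId", name: "IX_ProductVendor_tblUserId");
    }

    public override void Down()
    {
        DropIndex("dbo.tblSale", "IX_ProductVendor_tblUserId");
    }
}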
I have tested this query in Management Studio and it executes very fast (less than a second):
declare @p_Message_0 varchar(3) = 'whh'
declare @p_CreatedTime_0 datetime = '2015-06-01'
SELECT count(1) FROM (SELECT * FROM [Logs](nolock) WHERE CONTAINS([Message], @p_Message_0) AND [CreatedTime]<@p_CreatedTime_0) t
SELECT t2.* FROM (SELECT t.*,ROW_NUMBER() OVER (ORDER BY Id DESC) as rownum FROM (SELECT * FROM [Logs](nolock) t WHERE CONTAINS([Message], @p_Message_0) AND [CreatedTime]<@p_CreatedTime_0) t) t2 WHERE rownum>0 AND rownum<=20
The execution plan looks like this:
Then I moved it into C# ADO.NET, where it runs as this:
exec sp_executesql N'SELECT count(1) FROM (SELECT * FROM [Logs](nolock) WHERE CONTAINS([Message], @p_Message_0) AND [CreatedTime]<@p_CreatedTime_0) t
SELECT t2.* FROM (SELECT t.*,ROW_NUMBER() OVER (ORDER BY Id desc) as rownum FROM (SELECT * FROM [Logs](nolock) t WHERE CONTAINS([Message], @p_Message_0) AND [CreatedTime]<@p_CreatedTime_0) t) t2 WHERE rownum>0 AND rownum<=20',N'@p_Message_0 varchar(3),@p_CreatedTime_0 datetime',@p_Message_0='whh',@p_CreatedTime_0='2015-06-01'
This one runs really slowly (about 30 s). The execution plan looks like:
I don't know what makes these two plans different. SQL Server is 2008 R2 with SP2, and I have tried a parameter hint and OPTION (RECOMPILE); neither works for me.
Try updating statistics. The first one uses a variable with today's date. Variables aren't sniffed, so you get a plan based on a guessed distribution. The second one uses a parameter, which can be sniffed.
If the stats haven't been updated today, SQL Server will think no rows exist for that date and will build a plan on that basis, such as a nested loops plan that is estimated to execute the TVF once but actually executes it many times.
This is also known as the ascending date problem.
I use SQL Profiler to see how Entity Framework converts a LINQ expression into the SQL used against the database. When a query is "heavy", I try to optimize it by examining the execution plan.
Profiler (I use Profiler Express) gives me the query in this format:
exec sp_executesql N'SELECT
[Project2].[Id] AS [Id], (... rest of query ... )
',@p__linq__0=N'test@mypage.com'
To see the execution plan I have to convert it (copy, paste the code, ugh) to this format:
DECLARE @p__linq__0 NVARCHAR(100) = N'test@mypage.com'
SELECT
[Project2].[Id] AS [Id], (... rest of query ... )
It is boring, irritating, etc. Does anyone know a page or tool which does it for me? Or can I just set it in options?
Enable "Show Actual Execution Plan" and run the query.
I'm trying to make my LINQ-to-SQL query more efficient by including child properties in one trip to the DB. I started by trying various LINQ queries to accomplish this. The queries were getting complex, so I tried the LoadWith() option.
The constructor of my DAL class sets the LoadWith() settings:
public TrackerJobData()
{
    dataLoadOptions = new DataLoadOptions();
    dataLoadOptions.LoadWith<TrackerJobRecord>(x => x.SpecBarcodeRecords);
    dataLoadOptions.LoadWith<TrackerJobRecord>(x => x.TrackerJobEquipmentTriggerRecords);
    dataLoadOptions.LoadWith<TrackerJobRecord>(x => x.EtaRecord);
    this.Database.LoadOptions = dataLoadOptions;
}
And here is the query I'm using:
public TrackerJob GetItem(int trackerJobId)
{
    TrackerJobRecord record =
        (from trackerJob in this.Database.TrackerJobRecords
         where trackerJob.TrackerJobId == trackerJobId
         select trackerJob).FirstOrDefault();

    return record.Map();
}
When I debug and press F10 on just the LINQ query (not the return), I get this output in SQL Profiler:
Pardon my ignorance of SQL Profiler, but do the three highlighted lines mean there were three round trips from the client (my code) to the server? If so, why? Will SQL Server ever execute multiple sp_executesql calls in one trip?
And since I thought LoadWith() would eliminate multiple calls, what am I doing incorrectly?
EDIT
Here are the three statements within SQL Profiler:
exec sp_executesql N'SELECT TOP (1) [t0].[TrackerJobId], [t0].[Name], [t0].[EtaId], [t0].[SamplingProcessorTypeId], [t0].[Description], [t0].[LastModifiedUser], [t0].[LastModifiedTime], [t0].[VersionNumber], [t0].[Active], [t0].[Archived], [t1].[EtaId] AS [EtaId2], [t1].[EtaNumber], [t1].[Title], [t1].[State], [t1].[DateInitialized], [t1].[EtaOriginatorId], [t1].[Quantity], [t1].[Ehs], [t1].[Ship], [t1].[InternalUse], [t1].[DateClosed], [t1].[ExperimentId], [t1].[Disposition], [t1].[TestType], [t1].[LastModifiedUser] AS [LastModifiedUser2], [t1].[LastModifiedTime] AS [LastModifiedTime2], [t1].[VersionNumber] AS [VersionNumber2]
FROM [AutoTracker].[TrackerJob] AS [t0]
INNER JOIN [Global].[Eta] AS [t1] ON [t1].[EtaId] = [t0].[EtaId]
WHERE [t0].[TrackerJobId] = @p0',N'@p0 int',@p0=17
exec sp_executesql N'SELECT [t0].[SpecBarcodeId], [t0].[TrackerJobId], [t0].[EquipmentId], [t0].[StartTime], [t0].[EndTime], [t0].[LastModifiedUser], [t0].[LastModifiedTime], [t0].[VersionNumber]
FROM [AutoTracker].[SpecBarcode] AS [t0]
WHERE [t0].[TrackerJobId] = @x1',N'@x1 int',@x1=17
exec sp_executesql N'SELECT [t0].[TrackerJobId], [t0].[EquipmentId], [t0].[LastModifiedUser], [t0].[LastModifiedTime], [t0].[VersionNumber]
FROM [AutoTracker].[TrackerJobEquipmentTrigger] AS [t0]
WHERE [t0].[TrackerJobId] = @x1',N'@x1 int',@x1=17
LINQ to SQL's LoadWith does not support multiple 1:N relationships.
http://weblogs.asp.net/zeeshanhirani/archive/2008/08/11/constraints-with-loadwith-when-loading-multiple-1-n-relationships.aspx
Linq2SQl eager load with multiple DataLoadOptions
Each of those SQL Profiler calls represents a single roundtrip from the client to the DB server instance. SQL Server does support returning multiple result sets in a single roundtrip, but I'm not sure how you would do that with LINQ to SQL.
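For reference, this is roughly what a single-roundtrip batch looks like in plain ADO.NET, which is what LINQ to SQL would have to emit to avoid the extra trips. It is only a sketch: connectionString is assumed, the column lists are abbreviated to *, and the parameter value 17 comes from the profiler output above.

using System.Data.SqlClient;

using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand(
    @"SELECT * FROM [AutoTracker].[TrackerJob] WHERE [TrackerJobId] = @id;
      SELECT * FROM [AutoTracker].[SpecBarcode] WHERE [TrackerJobId] = @id;
      SELECT * FROM [AutoTracker].[TrackerJobEquipmentTrigger] WHERE [TrackerJobId] = @id;",
    connection))
{
    command.Parameters.AddWithValue("@id", 17);
    connection.Open();

    using (var reader = command.ExecuteReader())
    {
        while (reader.Read()) { /* map TrackerJob columns */ }

        // NextResult moves to the SpecBarcode rows, then to the trigger rows,
        // all returned by the single round trip above.
        reader.NextResult();
        while (reader.Read()) { /* map SpecBarcode columns */ }

        reader.NextResult();
        while (reader.Read()) { /* map TrackerJobEquipmentTrigger columns */ }
    }
}

As far as I know, LINQ to SQL can only consume multiple result sets like this through stored procedures mapped with IMultipleResults, not through LoadWith, which is why you still see three separate sp_executesql calls.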
I have a LINQ query like this:
var query = from c in Context.Customers select c;
var result = query.ToList();
The LINQ query generates this T-SQL code:
exec sp_executesql N'SELECT
[Project1].[Id] AS [Id],
[Project1].[Name] AS [Name],
[Project1].[Email] AS [Email]
FROM ( SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Name] AS [Name],
[Extent1].[Email] AS [Email]
FROM [dbo].[Customers] AS [Extent1] ) AS [Project1]
Is there a way to not generate the subquery?
Do you have any evidence that this query is causing performance problems? I imagine the query optimizer easily recognizes and flattens that trivial wrapper.
If you're certain after profiling that the query is a performance problem (doubtful) - and only then - you could simply turn the query into a stored procedure, and call that instead.
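Calling such a procedure from EF is simple enough. A sketch, assuming EF5/EF6, a procedure named dbo.GetAllCustomers that you would have to create yourself, and Customer properties matching the returned column names:

var customers = Context.Database
    .SqlQuery<Customer>("EXEC dbo.GetAllCustomers")
    .ToList();

Note that Database.SqlQuery materializes the objects but does not attach them to the context for change tracking.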
You use a tool like LINQ because you don't want to write SQL; before abandoning that, you should at least compare the query plan of your proposed SQL against the one generated by the tool. I don't have access to SQL Studio at the moment, but I would be a bit surprised if the query plans aren't identical...
EDIT: having had a chance to check out the query plans, they are in fact identical.
Short answer: No, you cannot modify that query.
Long answer: If you want to reimplement the LINQ provider and query generator, then perhaps there is a way, but I doubt you want to do that. You can also implement a custom EF provider wrapper which takes the query passed from EF and reformats it, but that will be hard as well - and slow. Are you going to write a custom interpreter for SQL queries?
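For what it's worth, EF6 does expose a command interception hook that lets you inspect (and in principle rewrite) the generated SQL just before it runs. The sketch below only shows the skeleton, because reliably rewriting every query shape EF can emit is exactly the hard and fragile part mentioned above:

using System.Data.Common;
using System.Data.Entity.Infrastructure.Interception;

public class RewritingInterceptor : IDbCommandInterceptor
{
    public void ReaderExecuting(DbCommand command,
        DbCommandInterceptionContext<DbDataReader> interceptionContext)
    {
        // command.CommandText holds the generated SELECT; it could be inspected
        // or rewritten here before SQL Server sees it.
    }

    // The remaining members are required by the interface but left empty here.
    public void ReaderExecuted(DbCommand command, DbCommandInterceptionContext<DbDataReader> interceptionContext) { }
    public void NonQueryExecuting(DbCommand command, DbCommandInterceptionContext<int> interceptionContext) { }
    public void NonQueryExecuted(DbCommand command, DbCommandInterceptionContext<int> interceptionContext) { }
    public void ScalarExecuting(DbCommand command, DbCommandInterceptionContext<object> interceptionContext) { }
    public void ScalarExecuted(DbCommand command, DbCommandInterceptionContext<object> interceptionContext) { }
}

// Registered once at application startup:
// DbInterception.Add(new RewritingInterceptor());

That said, for a query this trivial the wrapper subquery is flattened by the optimizer anyway (the other answer confirmed the plans are identical), so there is nothing to gain.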