Is there any way of speeding up this query:
return _database.Countries
.Include("Accounts")
.Where(country => country.Accounts.Count > 0)
.ToList();
There are about 70 accounts and 70 countries; the query takes about 1.5 seconds to execute, which is quite long.
EDIT: _database is an Entity Framework model.
You could try changing the Where clause to:
Where(country => country.Accounts.Any())
... but really your first port of call should be a database profiler. Look at the generated query, and put it into your favourite profiler. Check indexes etc as you would any other SQL query.
Once you've worked out why the generated SQL is slow and what you'd like the SQL to look like, then you can start working out how to change your query to generate that SQL.
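For reference, a minimal sketch of the rewritten query, reusing the _database context and navigation property from the question (whether it actually helps depends on the SQL it produces, which is why profiling comes first):

// Sketch only: same query shape, but Any() usually translates to an EXISTS
// subquery instead of counting every related Account.
return _database.Countries
    .Include("Accounts")
    .Where(country => country.Accounts.Any())
    .ToList();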
You could use SQL Profiler to capture the query that is being executed against the database, and then use that to optimise the scenario with indexes on the SQL Server side.
Related
When looking into the best ways to implement paging in C# (using LINQ), most suggestions are something along these lines:
// Execute the query
var query = db.Entity.Where(e => e.Something == something);
// Get the total num records
var total = query.Count();
// Page the results
var paged = query.Skip((pageNum - 1) * pageSize).Take(pageSize);
This seems to be the commonly suggested strategy (simplified).
For me, my main purpose in paging is for efficiency. If my table contains 1.2 million records where Something == something, I don't want to retrieve all of them at the same time. Instead, I want to page the data, grabbing as few records as possible. But with this method, it seems that this is a moot point.
If I understand it correctly, the first statement is still retrieving the 1.2 million records, then it is being paged as necessary.
Does paging in this way actually improve performance? If the 1.2 million records are going to be retrieved every time, what's the point (besides the obvious UI benefits)?
Am I misunderstanding this? Any .NET gurus out there that can give me a lesson on LINQ, paging, and performance (when dealing with large data sets)?
The first statement does not execute the actual SQL query, it only builds part of the query you intend to run.
It is when you call query.Count() that the first query will be executed:
SELECT COUNT(*) FROM Table WHERE Something = something
Calling query.Skip().Take() won't execute the query either; it is only when you try to enumerate the results (doing a foreach over paged or calling .ToList() on it) that it will execute the appropriate SQL statement, retrieving only the rows for the page (using ROW_NUMBER).
If you watch this in SQL Profiler you will see that exactly two queries are executed, and at no point will it try to retrieve the full table.
Be careful when you are using the debugger: if you step past the first statement and try to look at the contents of query, that will execute the SQL query. Maybe that is the source of your misunderstanding.
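To make the deferred execution concrete, here is a minimal sketch using the names from the question (the exact SQL shape depends on the EF version and provider):

// Nothing is sent to the database here; this only builds the query.
var query = db.Entity.Where(e => e.Something == something);

// First round trip: translated to a SELECT COUNT(*) ... WHERE ... query.
var total = query.Count();

// Still no SQL: Skip/Take just add paging to the pending query.
var paged = query.Skip((pageNum - 1) * pageSize).Take(pageSize);

// Second round trip: enumerating (here via ToList) sends the paged SELECT,
// which retrieves only pageSize rows (typically via ROW_NUMBER or OFFSET/FETCH).
var results = paged.ToList();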
// Execute the query
var query = db.Entity.Where(e => e.Something == something);
For your information, no call is made to the database by this first statement; it only builds the query.
// Get the total num records
var total = query.Count();
This Count query will be translated to SQL, and it will make a call to the database.
This call will not get all records, because the generated SQL is something like this:
SELECT COUNT(*) FROM Entity WHERE Something = 'something'
The last query doesn't get all the records either. It will be translated into SQL, and the paging will run in the database.
Maybe you'll find this question useful: efficient way to implement paging
I believe Entity Framework might structure the SQL query with the appropriate paging conditions based on the LINQ statements (e.g. using ROW_NUMBER() OVER ...).
I could be wrong on that, however. I'd run SQL Profiler and see what the generated query looks like.
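If you can't run a profiler, a rough alternative is to ask EF for the store command from code. This is a sketch only: it assumes the EF 4.1+ DbContext API (where ToString() on a query returns the generated SQL) and an ordering column named Id, which is hypothetical; on the older ObjectContext API the equivalent is ObjectQuery.ToTraceString().

// Build the same paged query and print the SQL EF would send.
var query = db.Entity.Where(e => e.Something == something);
var paged = query.OrderBy(e => e.Id)                 // hypothetical column; paging needs a stable order
                 .Skip((pageNum - 1) * pageSize)
                 .Take(pageSize);
Console.WriteLine(paged.ToString());                 // prints the generated paged SELECT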
public List<Agents_main_view_distinct> getActiveAgents(DateTime start, DateTime end)
{
    myactiveagents = mydb.Agents_main_view_distincts
        .Where(u => u.Status.Equals("Existing") && u.DateJoined2 >= start && u.DateJoined2 <= end)
        .OrderByDescending(ac => ac.Recno)
        .ToList();
    return myactiveagents;
}
I have a simple LINQ query that queries a view. My worry is its performance. It works well with a few hundred records, but when there are over 2,000 records SQL Server times out.
Things I have done to improve the performance:
1. Wrote a query against the tables directly (no improvement).
2. Reduced unnecessary columns; previously it had 27 columns, now reduced to 20.
In a desperate attempt I increased the server timeout to 600, but it still timed out.
The view's SQL query:
SELECT dbo.Agents.Recno, dbo.Agents.Rec_date, dbo.Agents.AgentsId, dbo.Agents.AgentsName,
       dbo.Agents.Industry_status, dbo.Agents.DOB, dbo.Agents.Branch, dbo.Agents.MobileNumber,
       dbo.Agents.MaritalStatus, dbo.Agents.PIN, dbo.Agents.Gender, dbo.Agents.Email,
       dbo.Agents.ProvisionalLicense, dbo.Agents.IRALicenseNumber, dbo.Agents.PreviousCompany,
       dbo.Agents.YearsOfExperience, dbo.Agents.COPNumber, dbo.Agents.DateJoined AS DateJoined2,
       dbo.Agents.DateJoined, dbo.Agents.PreviousOccupation, dbo.Agents.ProffesionalQualification,
       dbo.Agents.EducationalQualification, dbo.Agents.Status, dbo.Agents.Termination_Date2 AS Termination_Date,
       dbo.Agents.Comments, dbo.Agents.Temination_code, dbo.Agents.Company_ID, dbo.Agents.Submit_By,
       dbo.Agents.PassportPhoto, dbo.Insurane_Companies.Company_name,
       DATEDIFF(year, dbo.Agents.DOB, GETDATE()) AS age,
       YEAR(dbo.Agents.DateJoined) AS YearJoined,
       YEAR(dbo.Agents.Termination_Date2) AS YearTermination,
       dbo.Agents.REGION, dbo.Agents.DOB AS DOB2, dbo.Insurane_Companies.Company_code
FROM dbo.Agents
INNER JOIN dbo.Insurane_Companies
    ON dbo.Agents.Company_ID = dbo.Insurane_Companies.Company_id
You could try moving the Where and OrderBy clauses into the view itself, passing in parameters by using a stored procedure/ user defined function if necessary.
You could also add Glimpse to your project. Amongst many other things, you can inspect SQL calls to see if you have any unnecessary or time consuming DB hits.
As James suggested, it would be best to create a stored procedure and call it via LINQ, passing any necessary parameters. This way the server handles the processing, which should be much faster since the query would not have to be translated to SQL at run time. Other than that, you can use SQL Server's Query Analyzer or Profiler to see where the bottleneck may be.
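A rough sketch of that approach, assuming an Entity Framework DbContext; the stored procedure name GetActiveAgents and its parameters are hypothetical and would need to encapsulate the same Status/date filtering and ordering on the server:

public List<Agents_main_view_distinct> getActiveAgents(DateTime start, DateTime end)
{
    // Sketch only: run a hypothetical stored procedure and materialise the results.
    return mydb.Database
        .SqlQuery<Agents_main_view_distinct>(
            "EXEC GetActiveAgents @Start, @End",
            new System.Data.SqlClient.SqlParameter("@Start", start),
            new System.Data.SqlClient.SqlParameter("@End", end))
        .ToList();
}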
I do something like this:
var src = dbContext.Set<Person>().Where(o => o.LastName.StartsWith(search));
var paged = src.OrderBy(u => u.Id).Skip((page - 1) * pageSize).Take(pageSize);
var count = src.Count();
Is EF pulling everything from the database and then running the query in memory, or not? How do I know this? What are the ways of finding this out?
(Using EF4 CTP5 Code First.)
Try downloading LINQPad; it will show you the SQL that gets executed so you can see exactly what's happening.
In LINQPad you can view a LINQ query together with its results, and you can also view the same query along with the SQL that gets executed.
It's a very good tool for writing and optimising LINQ to Entities and LINQ to SQL queries. It's also great for writing and testing .NET code snippets.
This tool has saved me so much time, simply because you don't need to fire up the debugger. It's the most useful .NET tool I've found in years.
None of the calls you are using is a To* operator (like ToList), so all the method calls will be translated into SQL and executed in the database. However, the Count operator is evaluated immediately, so you might want to delay that assignment to the point where you actually need the value. The other variables will be evaluated once they are iterated over.
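Concretely, with the snippet from the question (names reused; the behaviour assumed here is for an EF DbContext):

var src = dbContext.Set<Person>().Where(o => o.LastName.StartsWith(search)); // no SQL sent yet
var count = src.Count();                  // SQL runs here: SELECT COUNT(*) ... WHERE LastName LIKE ...
var paged = src.OrderBy(u => u.Id)
               .Skip((page - 1) * pageSize)
               .Take(pageSize);           // still no SQL; this only composes the query
var results = paged.ToList();             // SQL runs here: a single paged SELECT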
I've got this toy code, which works fine, using MySQL:
var r = new SimpleRepository("DB", SimpleRepositoryOptions.None);
var q = r.Find<User>(x => x.UserName == "testuser");
How do I view the SQL generated by that query?
For SQL Server, you can always run SQL Profiler to see the queries.
Unfortunately using SimpleRepository you can't do what you want without stepping into the SubSonic code. Because the Find method returns an IList it's executed before you get the chance to evaluate the SQL that's going to be executed. There are efforts underway to add this functionality in future versions of SubSonic but until then you're probably best looking at the MySQL Query Profiler.
How can I return first 100 records using Linq?
I have a table with 40 million records.
This code works, but it's slow, because it will return all values before filtering:
var values = (from e in dataContext.table_sample
              where e.x == 1
              select e)
             .Take(100);
Is there a way to return only the filtered records, like the T-SQL TOP clause?
No, that doesn't return all the values before filtering. The Take(100) will end up being part of the SQL sent up - quite possibly using TOP.
Of course, it makes more sense to do that when you've specified an orderby clause.
LINQ doesn't execute the query when it reaches the end of your query expression. It only sends up any SQL when either you call an aggregation operator (e.g. Count or Any) or you start iterating through the results. Even calling Take doesn't actually execute the query - you might want to put more filtering on it afterwards, for instance, which could end up being part of the query.
When you start iterating over the results (typically with foreach) - that's when the SQL will actually be sent to the database.
(I think your where clause is a bit broken, by the way. If you've got problems with your real code it would help to see code as close to reality as possible.)
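As a sketch of what this looks like in practice, reusing the names from the question (the OrderBy column Id is hypothetical, added only to give the top 100 a deterministic order):

var values = dataContext.table_sample
    .Where(e => e.x == 1)
    .OrderBy(e => e.Id)        // hypothetical column; makes the top 100 deterministic
    .Take(100);                // still no SQL sent

foreach (var value in values)  // the query executes here, roughly: SELECT TOP (100) ... WHERE x = 1 ORDER BY Id
{
    // process value
}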
I don't think you are right about it returning all records before taking the top 100. I think LINQ decides what the SQL string is going to be at the time the query is executed (deferred execution), and your database server will optimise it.
Have you compared a standard SQL query with your LINQ query? Which one is faster, and how significant is the difference?
I do agree with the comments above that your LINQ query is generally correct, but:
1. Your 'where' clause should probably be x == 1, not x = 1 (comparison instead of assignment).
2. 'select e' will return all columns, where you probably need only some of them - be more precise with the select clause (list only the required columns); 'select *' is a waste of resources.
3. Make sure your database is well indexed and try to make use of the indexed data.
Anyway, a table with 40 million records is quite huge - do you need all that data all the time? Maybe some kind of partitioning can reduce it to the most commonly used records.
I agree with Jon Skeet, but just wanted to add:
The generated SQL will use TOP to implement Take().
If you're able to run SQL Profiler and step through your code in debug mode, you will be able to see exactly what SQL is generated and when it gets executed. If you find the time to do this, you will learn a lot about what happens underneath.
There is also a DataContext.Log property that you can assign a TextWriter to in order to view the generated SQL, for example:
dbContext.Log = Console.Out;
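In context, a minimal sketch (assuming a LINQ to SQL DataContext, here called MyDataContext, wrapping the table from the question):

using (var dataContext = new MyDataContext())      // hypothetical DataContext type
{
    dataContext.Log = Console.Out;                 // generated SQL is written to the console

    var values = dataContext.table_sample
        .Where(e => e.x == 1)
        .Take(100)
        .ToList();                                 // the TOP (100) query is logged here
}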
Another option is to experiment with LINQPad. LINQPad allows you to connect to your data source and easily try different LINQ expressions. In the results panel, you can switch to see the SQL generated by the LINQ expression.
I'm going to go out on a limb and guess that you don't have an index on the column used in your where clause. If that's the case then it's undoubtedly doing a table scan when the query is materialized and that's why it's taking so long.