I have a simple query that deletes an entry from a table
CB_User_Schedule deleted = (from x in db.CB_User_Schedules
where x.ScheduleID == CurrentScheduleID
select x).Single();
db.CB_User_Schedules.DeleteOnSubmit(deleted);
db.SubmitChanges();
However, the first statement returns Sequence contains no elements. I can see that when it executes the value of CurrentScheduleID is in fact a number and when I directly execute select * from CB_UserSchedules where ScheduleID = 3 it does return a row. So why is my statement not finding that row in the database?
Check the emitted SQL with the built-in LINQ-to-SQL logging (hook up a TextWriter to the Log property on your DataContext) or SQL profiler. More than likely you've got a datatype mismatch or key issue in your DBML (ie, something doesn't match reality in the DB and the mapper is barfing on it). We run a very large and complex site almost exclusively on L2S, and these kinds of issues are almost always related to broken DBML.
Related
I am using ASP NET MVC 4.5 and EF6, code first migrations.
I have this code, which takes about 6 seconds.
var filtered = _repository.Requests.Where(r => some conditions); // this is fast, conditions match only 8 items
var list = filtered.ToList(); // this takes 6 seconds, has 8 items inside
I thought that this is because of relations, it must build them inside memory, but that is not the case, because even when I return 0 fields, it is still as slow.
var filtered = _repository.Requests.Where(r => some conditions).Select(e => new {}); // this is fast, conditions match only 8 items
var list = filtered.ToList(); // this takes still around 5-6 seconds, has 8 items inside
Now the Requests table is quite complex, lots of relations and has ~16k items. On the other hand, the filtered list should only contain proxies to 8 items.
Why is ToList() method so slow? I actually think the problem is not in ToList() method, but probably EF issue, or bad design problem.
Anyone has had experience with anything like this?
EDIT:
These are the conditions:
_repository.Requests.Where(r => ids.Any(a => a == r.Student.Id) && r.StartDate <= cycle.EndDate && r.EndDate >= cycle.StartDate)
So basically, I can checking if Student id is in my id list and checking if dates match.
Your filtered variable contains a query which is a question, and it doesn't contain the answer. If you request the answer by calling .ToList(), that is when the query is executed. And that is the reason why it is slow, because only when you call .ToList() is the query executed by your database.
It is called Deferred execution. A google might give you some more information about it.
If you show some of your conditions, we might be able to say why it is slow.
In addition to Maarten's answer I think the problem is about two different situation
some condition is complex and results in complex and heavy joins or query in your database
some condition is filtering on a column which does not have an index and this cause the full table scan and make your query slow.
I suggest start monitoring the query generated by Entity Framework, it's very simple, you just need to set Log function of your context and see the results,
using (var context = new MyContext())
{
context.Database.Log = Console.Write;
// Your code here...
}
if you see something strange in generated query try to make it better by breaking it in parts, some times Entity Framework generated queries are not so good.
if the query is okay then the problem lies in your database (assuming no network problem).
run your query with an SQL profiler and check what's wrong.
UPDATE
I suggest you to:
add index for StartDate and EndDate Column in your table (one for each, not one for both)
ToList executes the query against DB, while first line is not.
Can you show some conditions code here?
To increase the performance you need to optimize query/create indexes on the DB tables.
Your first line of code only returns an IQueryable. This is a representation of a query that you want to run not the result of the query. The query itself is only runs on the databse when you call .ToList() on your IQueryable, because its the first point that you have actually asked for data.
Your adjustment to add the .Select only adds to the existing IQueryable query definition. It doesnt change what conditions have to execute. You have essentially changed the following, where you get back 8 records:
select * from Requests where [some conditions];
to something like:
select '' from Requests where [some conditions];
You will still have to perform the full query with the conditions giving you 8 records, but for each one, you only asked for an empty string, so you get back 8 empty strings.
The long and the short of this is that any performance problem you are having is coming from your "some conditions". Without seeing them, its is difficult to know. But I have seen people in the past add .Where clauses inside a loop, before calling .ToList() and inadvertently creating a massively complicated query.
Jaanus. The most likely reason of this issue is complecity of generated SQL query by entity framework. I guess that your filter condition contains some check of other tables.
Try to check generated query by "SQL Server Profiler". And then copy this query to "Management Studio" and check "Estimated execution plan". As a rule "Management Studio" generatd index recomendation for your query try to follow these recomendations.
We are using C#. VS2012, EF 5.0, and Oracle 11g. Approach is code first. I have a table that is defined, and it is plainly visible in looking at the code that it is defined with all the correct columns (and none that are not there.)
Still, when I run certain LINQ queries (joins) and attempt to select the results into a new object, things break. Here is the LINQ:
IQueryable<CheckWage> query =
from clientWage in context.ClientWages
join paycheckWage in context.PaycheckWages
on
new {clientWage.PermanentClientId, clientWage.WageId} equals
new {paycheckWage.PermanentClientId, paycheckWage.WageId}
where
(paycheckWage.PermanentClientId == Session.PermanentClientId) &&
(clientWage.PermanentClientId == Session.PermanentClientId)
select new CheckWage
{
CWage = clientWage,
PWage = paycheckWage
};
Now, here is the SQL it emits (as captured by Devart's DbMonitor tool):
SELECT
"Extent1".ASSOCIATE_NO,
"Extent1".PCLIENT_ID,
"Extent1".CLIENT_NO,
"Extent1".CLIENT_NAME,
"Extent1".ADDRESS1,
"Extent1".ADDRESS2,
"Extent1".CITY,
"Extent1".STATE,
"Extent1".ZIP,
"Extent1".COUNTRY,
"Extent1".CLIENT_TYPE,
"Extent1".DOING_BUSINESS_AS,
"Extent1".CONTACT,
"Extent1".PHONE,
"Extent1".EXTENSION,
"Extent1".FAX,
"Extent1".FAX_EXTENSION,
"Extent1".EMAIL,
"Extent1".NEXTEMP,
"Extent1".PAY_FREQ,
"Extent1".EMPSORT,
"Extent1".DIVUSE,
"Extent1".CLIENT_ACCESS_TYPE,
"Extent1".AUTOPAY_WAGE_ID,
"Extent1".FEIN,
"Extent1".HR_MODULE,
"Extent1".BANK_CODE,
"Extent1".ACH_DAYS,
"Extent1".ACH_COLLECT,
"Extent1".UPDATED,
"Extent1".IAT_FLAG,
"Extent1".ORIG_EMAIL,
"Extent1"."R1",
"Extent1"."R2"
FROM INSTANTPAY.CLIENT "Extent1"
WHERE "Extent1".PCLIENT_ID = :EntityKeyValue1'
There are no such columns as "R1" and "R2." I am guessing is has something to do with the join into a new object type with two properties, but I am pulling my hair out trying to figure out what I've done or haven't done that is resulting in this errant SQL. Naturally, the error from the Oracle server is "ORA-00904: "Extent1"."R2": invalid identifier." Strange that is doesn't choke on R1, but perhaps it only lists the last error or something...
Thanks in advance,
Peter
5/23/2014: I left out an important detail. The SQL is emitted when I attempt to drill into one of the CheckWage objects (using Lazy loading), as both of the contained objects have a navigation property to the "Client" entity. I can access the client table just fine in other LINQ queries that do not use a join, it is only this one that creates the "R1" and "R2" in the SELECT statement.
Peter
I'm using C#, .NET (4.0) and Entity Framework to connect to SQL CE 4.0.
I query some objects with specific properties, but the query returns only objects that meet search criteria only if that data was already saved to database, which is not that problematic, bigger problem is that if data is changed, but not yet saved to database it will still meet search criteria.
Example:
var query = from location in mainDBContext.Locations
where location.InUse == true
select location;
This query returns also objects where location.InUse = false if InUse was true when loaded from DB and then changed later on in code.
This is screen capture from one of the query results objects.
I really don't understand why it does this. I would understand if this query would always query database and I would get the older version of this object (thus InUse would be true).
Thank you for your time and answers.
That is how EF works internally.
Every entity uniquely identified by its key can be tracked by the context only once - that is called identity map. So it doesn't matter how many times did you execute the query. If the query is returning tracked entities and if it is repeatedly executed on the same context instance it will always return the same instance.
If the instance was modified in the application but not saved to the database your query will be executed on the database where persisted state will be evaluated but materialization process will by default use the current data from the application instead of data retrieved from the database. You can force the query to return state from the database (by setting mainDBContext.Locations.MergeOption = MergeOption.OverwriteChagens) but because of identity map your current modifications will be lost.
I'm not really sure what exactly your problem is, but I think you have to know this:
That kind of query always return data that is submitted into DB. When you change some entities in your code but they are not submitted into database the LINQ query will query the data from database, without your in-code changes.
LINQ queries use Deferred Execution, so your 'query' variable is not a list of results, it's just a query definition that is evaluated each time results are needed. You should add .ToList() to evaluate that query and get a list of results in that certain line of code.
An example for .ToList():
var query = (from location in mainDBContext.Locations
where location.InUse == true
select location).ToList();
I just ran into the same thing myself. It's a bit messy, but another option is to examine the Local cache. You can do this, for example:
var query = from location in mainDBContext.Locations.Local
where location.InUse == true
select location;
This will only use the local cache not saved to the database. A combination of local and database queries should enable you to get what you want.
Using the Entity Framework, when one executes a query on lets say 2000 records requiring a groupby and some other calculations, does the query get executed on the server and only the results sent over to the client or is it all sent over to the client and then executed?
This using SQL Server.
I'm looking into this, as I'm going to be starting a project where there will be loads of queries required on a huge database and want to know if this will produce a significant load on the network, if using the Entity Framework.
I would think all database querying is done on the server side (where the database is!) and the results are passed over. However, in Linq you have what's known as Delayed Execution (lazily loaded) so your information isn't actually retrieved until you try to access it e.g. calling ToList() or accessing a property (related table).
You have the option to use the LoadWith to do eager loading if you require it.
So in terms of performance if you only really want to make 1 trip to the Database for your query (which has related tables) I would advise using the LoadWith options. However, it does really depend on the particular situation.
It's always executed on SQL Server. This also means sometimes you have to change this:
from q in ctx.Bar
where q.Id == new Guid(someString)
select q
to
Guid g = new Guid(someString);
from q in ctx.Bar
where q.Id == g
select q
This is because the constructor call cannot be translated to SQL.
Sql's groupby and linq's groupby return differently shaped results.
Sql's groupby returns keys and aggregates (no group members)
Linq's groupby returns keys and group members.
If you use those group members, they must be (re-)fetched by the grouping key. This can result in +1 database roundtrip per group.
well, i had the same question some time ago.
basically: your linq-statement is converted to a sql-statement. however: some groups will get translated, others not - depending on how you write your statement.
so yes - both is possible
example:
var a = (from entity in myTable where entity.Property == 1 select entity).ToList();
versus
var a = (from entity in myTable.ToList() where entity.Property == 1 select entity).ToList();
How can I return first 100 records using Linq?
I have a table with 40million records.
This code works, but it's slow, because will return all values before filter:
var values = (from e in dataContext.table_sample
where e.x == 1
select e)
.Take(100);
Is there a way to return filtered? Like T-SQL TOP clause?
No, that doesn't return all the values before filtering. The Take(100) will end up being part of the SQL sent up - quite possibly using TOP.
Of course, it makes more sense to do that when you've specified an orderby clause.
LINQ doesn't execute the query when it reaches the end of your query expression. It only sends up any SQL when either you call an aggregation operator (e.g. Count or Any) or you start iterating through the results. Even calling Take doesn't actually execute the query - you might want to put more filtering on it afterwards, for instance, which could end up being part of the query.
When you start iterating over the results (typically with foreach) - that's when the SQL will actually be sent to the database.
(I think your where clause is a bit broken, by the way. If you've got problems with your real code it would help to see code as close to reality as possible.)
I don't think you are right about it returning all records before taking the top 100. I think Linq decides what the SQL string is going to be at the time the query is executed (aka Lazy Loading), and your database server will optimize it out.
Have you compared standard SQL query with your linq query? Which one is faster and how significant is the difference?
I do agree with above comments that your linq query is generally correct, but...
in your 'where' clause should probably be x==1 not x=1 (comparison instead of assignment)
'select e' will return all columns where you probably need only some of them - be more precise with select clause (type only required columns); 'select *' is a vaste of resources
make sure your database is well indexed and try to make use of indexed data
Anyway, 40milions records database is quite huge - do you need all that data all the time? Maybe some kind of partitioning can reduce it to the most commonly used records.
I agree with Jon Skeet, but just wanted to add:
The generated SQL will use TOP to implement Take().
If you're able to run SQL-Profiler and step through your code in debug mode, you will be able to see exactly what SQL is generated and when it gets executed. If you find the time to do this, you will learn a lot about what happens underneath.
There is also a DataContext.Log property that you can assign a TextWriter to view the SQL generated, for example:
dbContext.Log = Console.Out;
Another option is to experiment with LINQPad. LINQPad allows you to connect to your datasource and easily try different LINQ expressions. In the results panel, you can switch to see the SQL generated the LINQ expression.
I'm going to go out on a limb and guess that you don't have an index on the column used in your where clause. If that's the case then it's undoubtedly doing a table scan when the query is materialized and that's why it's taking so long.