EF5 generating "imaginary" columns in select statements - c#

We are using C#, VS2012, EF 5.0, and Oracle 11g. The approach is code first. I have a table that is defined, and it is plainly visible from the code that it is defined with all the correct columns (and none that are not there).
Still, when I run certain LINQ queries (joins) and attempt to select the results into a new object, things break. Here is the LINQ:
IQueryable<CheckWage> query =
from clientWage in context.ClientWages
join paycheckWage in context.PaycheckWages
on
new {clientWage.PermanentClientId, clientWage.WageId} equals
new {paycheckWage.PermanentClientId, paycheckWage.WageId}
where
(paycheckWage.PermanentClientId == Session.PermanentClientId) &&
(clientWage.PermanentClientId == Session.PermanentClientId)
select new CheckWage
{
CWage = clientWage,
PWage = paycheckWage
};
Now, here is the SQL it emits (as captured by Devart's DbMonitor tool):
SELECT
"Extent1".ASSOCIATE_NO,
"Extent1".PCLIENT_ID,
"Extent1".CLIENT_NO,
"Extent1".CLIENT_NAME,
"Extent1".ADDRESS1,
"Extent1".ADDRESS2,
"Extent1".CITY,
"Extent1".STATE,
"Extent1".ZIP,
"Extent1".COUNTRY,
"Extent1".CLIENT_TYPE,
"Extent1".DOING_BUSINESS_AS,
"Extent1".CONTACT,
"Extent1".PHONE,
"Extent1".EXTENSION,
"Extent1".FAX,
"Extent1".FAX_EXTENSION,
"Extent1".EMAIL,
"Extent1".NEXTEMP,
"Extent1".PAY_FREQ,
"Extent1".EMPSORT,
"Extent1".DIVUSE,
"Extent1".CLIENT_ACCESS_TYPE,
"Extent1".AUTOPAY_WAGE_ID,
"Extent1".FEIN,
"Extent1".HR_MODULE,
"Extent1".BANK_CODE,
"Extent1".ACH_DAYS,
"Extent1".ACH_COLLECT,
"Extent1".UPDATED,
"Extent1".IAT_FLAG,
"Extent1".ORIG_EMAIL,
"Extent1"."R1",
"Extent1"."R2"
FROM INSTANTPAY.CLIENT "Extent1"
WHERE "Extent1".PCLIENT_ID = :EntityKeyValue1'
There are no such columns as "R1" and "R2". I am guessing it has something to do with the join into a new object type with two properties, but I am pulling my hair out trying to figure out what I've done or haven't done that is resulting in this errant SQL. Naturally, the error from the Oracle server is "ORA-00904: "Extent1"."R2": invalid identifier." Strange that it doesn't choke on R1, but perhaps it only lists the last error or something...
Thanks in advance,
Peter
5/23/2014: I left out an important detail. The SQL is emitted when I attempt to drill into one of the CheckWage objects (using Lazy loading), as both of the contained objects have a navigation property to the "Client" entity. I can access the client table just fine in other LINQ queries that do not use a join, it is only this one that creates the "R1" and "R2" in the SELECT statement.
Peter

Related

EF Core with Postgres - Poor performance compared to same query as raw SQL

I'm trying to diagnose the exact issue with a query that I wrote in C# against a Postgres DB that I've generated a context for in a .NET Core WebAPI project with Scaffold-DbContext.
I expected similar speeds for both queries, but the plain SQL version, run through the Postgres ODBC driver or PgAdmin and returning the same result set, is much faster. Why does the "plain SQL" version perform so much better?
My query in SQL:
SELECT oolk.salesrep, SUM(oool.openqty)
FROM public.oolookup oolk
INNER JOIN public.ooorderlines oool ON oool.orderlinekey = oolk.orderlinekey
WHERE oolk.salesteam = 'Team1' AND oolk.categorycode = 'Category 8'
GROUP BY oolk.salesrep
Running this query via ODBC and returning the result as JSON via WebAPI (localhost) takes: 2216ms
The "same" query in C#:
(from oolk in db.Oolookup
join oool in db.Ooorderlines on oolk.Orderlinekey equals oool.Orderlinekey
where oolk.Salesteam == "Team1"
where oolk.Categorycode == "Category 8"
group new { oolk, oool } by oolk.Salesrep into g
select new
{
SalesRep = g.Key,
OpenQty = g.Sum(gr => gr.oool.Openqty)
}).ToList()
Running this query and returning the result as JSON via WebAPI (localhost) takes: 8353ms
When I use DB logging in my C# code, this is the query that appears to get sent to my PG database by the query expression.
SELECT "oolk0"."pk_oolookup", "oolk0"."categorycode", "oolk0"."customercode", "oolk0"."customerdiv", "oolk0"."customergroup", "oolk0"."forecastgroup", "oolk0"."orderclass", "oolk0"."orderkey", "oolk0"."orderlinekey", "oolk0"."productcode", "oolk0"."saleslocation", "oolk0"."salesrep", "oolk0"."salesteam", "oolk0"."shippinglocation", "oool0"."orderlinekey", "oool0"."openamount", "oool0"."openappliedheadercharges", "oool0"."openitemamount", "oool0"."openlinechargeamount", "oool0"."openqty", "oool0"."openvolume", "oool0"."openweight", "oool0"."totalamount", "oool0"."totalheaderchargeamount", "oool0"."totalitemamount", "oool0"."totalqty", "oool0"."totalvolume", "oool0"."totalweight"
FROM "oolookup" AS "oolk0"
INNER JOIN "ooorderlines" AS "oool0" ON "oolk0"."orderlinekey" = "oool0"."orderlinekey"
WHERE ("oolk0"."salesteam" = 'Team1') AND ("oolk0"."categorycode" = 'Category 8')
ORDER BY "oolk0"."salesrep"
I find a few things about this strange. First of all, I'm never specifying that I want to select so many columns from the database, like I have to do with "Plain SQL". Yet, here they are. Secondly, this looks like only half of the "real query". I can't see any "SQL" that does a summation, so I assume this is happening internally in .NET objects and not on the database.
For indexes, I am building a data warehouse with a star schema. So my join field/primary key on the ooorderlines table is indexed (single column) and every column on my oolookup table has an index on it (single column). So I don't suspect an indexing issue at play. I am chalking this up to my inexperience with query expressions in .NET.
Where is the difference in six seconds coming from?

SQL generated from LINQ not consistent

I am using Telerik Open/Data Access ORM against an Oracle database.
Why do these two statements result in different SQL commands?
Statement #1
IQueryable<WITransmits> query = from wiTransmits in uow.DbContext.StatusMessages
select wiTransmits;
query = query.Where(e=>e.MessageID == id);
Results in the following SQL
SELECT
a."MESSAGE_ID" COL1,
-- additional fields
FROM "XFE_REP"."WI_TRANSMITS" a
WHERE
a."MESSAGE_ID" = :p0
Statement #2
IQueryable<WITransmits> query = from wiTransmits in uow.DbContext.StatusMessages
select new WITransmits
{
MessageID = wiTransmits.MessageID,
Name = wiTransmits.Name
};
query = query.Where(e=>e.MessageID == id);
Results in the following SQL
SELECT
a."MESSAGE_ID" COL1,
-- additional fields
FROM "XFE_REP"."WI_TRANSMITS" a
The query generated by statement #2 obviously returns EVERY record in the table when I only want the one. With millions of records, this is prohibitive.
Telerik Data Access will try to split each query into database-side and client-side (or in-memory LINQ if you prefer it).
A projection with select new is a sure trigger that sends everything after the projection in your LINQ expression tree to the client side.
This means that in your second case you have an inefficient LINQ query: the filtering is applied in memory, after a lot of unnecessary data has already been transported.
If you want to compose LINQ expressions the way it is done in case 2, you can append the Select clause last, or explicitly convert the result to IEnumerable<T> to make it obvious that any further processing will be done in memory.
The first query returns the full object defined, so any additional limitations (like Where) can be appended to it before it is actually being run. Therefore the query can be combined as you showed.
The second one returns a new object, which can be whatever type and contain whatever information. Therefore the query is sent to the database as "return everything" and after the objects have been created all but the ones that match the Where clause are discarded.
Even if the type is the same in both of them, think of this situation:
var query = from wiTransmits in uow.DbContext.StatusMessages
select new WITransmits
{
MessageID = wiTransmits.MessageID * 4 - 2,
Name = wiTransmits.Name
};
How would you combine the Where query now? Sure, you could go through the code inside the new object creation and try to move it outside, but since there can be anything it is not feasible. What if the checkup is some lookup function? What if it's not deterministic?
Therefore if you create new objects based on the database objects there will be a border where the objects will be retrieved and then further queries will be done in memory.
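To make the cost of that border concrete, here is a minimal in-memory sketch. It does not use Telerik Data Access at all: the table is a hypothetical 1000-row sequence and the counter only tracks how many projected objects are constructed. It is an analogy for "how many rows cross the border", assuming the provider behaves as the answers above describe.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ProjectionOrderSketch
{
    public class WITransmits
    {
        public int MessageID { get; set; }
        public string Name { get; set; }
    }

    // Returns how many projected objects are constructed for one lookup by id.
    // filterFirst = true  -> Where, then Select (the filter sees the source rows)
    // filterFirst = false -> Select, then Where (the filter sees the new objects)
    public static int CountProjections(bool filterFirst, int id)
    {
        int projected = 0;

        // Hypothetical in-memory stand-in for the WI_TRANSMITS table.
        IEnumerable<WITransmits> table = Enumerable.Range(1, 1000)
            .Select(i => new WITransmits { MessageID = i, Name = "msg" + i });

        Func<WITransmits, WITransmits> project = row =>
        {
            projected++;
            return new WITransmits { MessageID = row.MessageID, Name = row.Name };
        };

        IEnumerable<WITransmits> query = filterFirst
            ? table.Where(row => row.MessageID == id).Select(project)
            : table.Select(project).Where(row => row.MessageID == id);

        query.ToList(); // force execution
        return projected;
    }

    public static void Main()
    {
        Console.WriteLine(CountProjections(filterFirst: true, id: 42));  // 1
        Console.WriteLine(CountProjections(filterFirst: false, id: 42)); // 1000
    }
}
```

Filtering first builds one object; projecting first builds all 1000 and then discards 999, which mirrors the "return everything, then filter" behavior seen in statement #2.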

AsQueryable() does not return needed type in custom LINQ query using Lightswitch

I am using Lightswitch to build my application and I have the following problem.
In my database, I have three tables:
Article
Provider
ArticleProvider
Article and Provider have a many-to-many relation, therefore junction table ArticleProvider is needed.
Now, I want a screen in my application where the user can choose a provider and sees all articles which have a relation to this provider.
Using SQL, I would do it like this (123 is the Provider_Id I want to select).
SELECT *
FROM Article a
WHERE a.Id IN
(SELECT ap.Article_Id FROM ArticleProvider ap WHERE ap.Provider_Id=123)
In my Lightswitch application, I created a Query by clicking on the "Articles" Table in my Datasource and chose "Add Query". I added a parameter ProviderId and switched to the source code editor to create my custom query:
partial void ArticleByProvider_PreprocessQuery(int? ProviderId,
ref IQueryable<Article> query)
{
...
}
Next I started to create my Linq Query. I think I need an IQueryable<ArticleProvider> Query to filter by them, so I tried:
(from art in query select art.ProviderQuery).AsQueryable<ArticleProvider>()
But, when trying this, I get a compile time error saying that this type can not be converted. So I tried this and it compiles fine:
(from art in query select art.ProviderQuery)
.AsQueryable<IDataServiceQueryable<ArticleProvider>>()
However, when using the returned IQueryable apList in my next query:
from ap in apList where ap.Provider.Id == 123 select ap.Article.Id
It seems that the fields Provider and Article can not be found. Also Visual Studio's code completion does not suggest these fields, only lots of methods and fields which are not in my database.
How can I solve this problem?
I played around with casts and other method calls like ToList(), but I always get stuck at this point. I am new to Linq and C#. Thank you in advance for any help.
EDIT:
I checked the return type of the first query by using:
var temp = (from art in query select art.ProviderQuery).AsQueryable()
The returned type is System.Linq.IQueryable<Microsoft.LightSwitch.IDataServiceQueryable<LightSwitchApplication.ArticleProvider>>
Your problem is that you are in the PreprocessQuery method.
This is for filtering data further, not for adding extra data.
If you look around a little, this is mentioned a lot.
Give this query a try and see if this works:
partial void ArticleByProvider_PreprocessQuery(int? ProviderId,
ref IQueryable<Article> query)
{
query = query.Where(art => art.ArticleProviders
.Any(artProv => artProv.Provider.Id == ProviderId));
}
The idea is to get all Articles that have at least one ArticleProvider matching the Provider Id.
Note: Haven't tested this code myself. But the idea should be there.
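The same filter can be tried outside Lightswitch with plain LINQ. The sketch below uses hypothetical in-memory Article, ArticleProvider, and Provider classes (not the generated Lightswitch entities) and a method with the same parameter shape as the PreprocessQuery handler. Where returns a new query rather than modifying the existing one, so the sketch assigns the result back to the ref parameter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PreprocessQuerySketch
{
    // Hypothetical stand-ins for the generated entity classes.
    public class Provider { public int Id { get; set; } }
    public class ArticleProvider { public Provider Provider { get; set; } }
    public class Article
    {
        public int Id { get; set; }
        public List<ArticleProvider> ArticleProviders { get; } = new List<ArticleProvider>();
    }

    // Same parameter shape as the PreprocessQuery handler: narrow the incoming query.
    public static void ArticleByProvider(int? providerId, ref IQueryable<Article> query)
    {
        // Keep only articles with at least one junction row for the provider.
        query = query.Where(art => art.ArticleProviders
            .Any(artProv => artProv.Provider.Id == providerId));
    }

    static Article MakeArticle(int id, params int[] providerIds)
    {
        var article = new Article { Id = id };
        foreach (int pid in providerIds)
            article.ArticleProviders.Add(new ArticleProvider { Provider = new Provider { Id = pid } });
        return article;
    }

    // Runs the filter over three sample articles and returns the matching ids.
    public static List<int> ArticleIdsFor(int? providerId)
    {
        IQueryable<Article> query = new[]
        {
            MakeArticle(1, 123),
            MakeArticle(2, 456),
            MakeArticle(3, 123, 456),
        }.AsQueryable();

        ArticleByProvider(providerId, ref query);
        return query.Select(a => a.Id).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", ArticleIdsFor(123))); // 1, 3
    }
}
```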

Entity Framework SQL Query Execution

Using the Entity Framework, when one executes a query on, let's say, 2000 records requiring a groupby and some other calculations, does the query get executed on the server and only the results sent over to the client, or is it all sent over to the client and then executed?
This is using SQL Server.
I'm looking into this, as I'm going to be starting a project where there will be loads of queries required on a huge database and want to know if this will produce a significant load on the network, if using the Entity Framework.
I would think all database querying is done on the server side (where the database is!) and only the results are passed over. However, LINQ uses deferred execution, so your data isn't actually retrieved until you enumerate the query (e.g. by calling ToList()), and related tables are lazily loaded when you access a navigation property.
You have the option to use LoadWith (LINQ to SQL; the Entity Framework equivalent is Include) to do eager loading if you require it.
So in terms of performance, if you only really want to make 1 trip to the database for your query (which has related tables), I would advise using the eager loading options. However, it really does depend on the particular situation.
It's always executed on SQL Server. This also means sometimes you have to change this:
from q in ctx.Bar
where q.Id == new Guid(someString)
select q
to
Guid g = new Guid(someString);
from q in ctx.Bar
where q.Id == g
select q
This is because the constructor call cannot be translated to SQL.
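The difference between the two forms is visible in the expression tree that the query provider receives. The sketch below needs no database: it uses a hypothetical Bar class and a small ExpressionVisitor to count constructor-call nodes in each predicate. The inline form contains one such node, which the provider would have to translate; the hoisted form contains only a reference to a captured variable.

```csharp
using System;
using System.Linq.Expressions;

public static class HoistConstructorSketch
{
    // Hypothetical entity with a Guid key.
    public class Bar { public Guid Id { get; set; } }

    // Counts constructor calls (NewExpression nodes) inside an expression tree.
    private class NewCounter : ExpressionVisitor
    {
        public int Count;

        protected override Expression VisitNew(NewExpression node)
        {
            Count++;
            return base.VisitNew(node);
        }
    }

    public static int ConstructorCalls(Expression<Func<Bar, bool>> predicate)
    {
        var counter = new NewCounter();
        counter.Visit(predicate);
        return counter.Count;
    }

    public static void Main()
    {
        string someString = "6f9619ff-8b86-d011-b42d-00c04fc964ff";

        // Constructor call inside the lambda: it becomes part of the tree.
        Expression<Func<Bar, bool>> inline = q => q.Id == new Guid(someString);

        // Constructor call hoisted out: the tree only references the local g.
        Guid g = new Guid(someString);
        Expression<Func<Bar, bool>> hoisted = q => q.Id == g;

        Console.WriteLine(ConstructorCalls(inline));  // 1
        Console.WriteLine(ConstructorCalls(hoisted)); // 0
    }
}
```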
SQL's GROUP BY and LINQ's GroupBy return differently shaped results.
SQL's GROUP BY returns keys and aggregates (no group members).
LINQ's GroupBy returns keys and group members.
If you use those group members, they must be (re-)fetched by the grouping key. This can result in +1 database roundtrip per group.
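The two shapes can be seen side by side with LINQ to Objects. This is only an in-memory sketch over a hypothetical Order class; it shows the shapes, not how any particular provider fetches them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupShapeSketch
{
    // Hypothetical row type for illustration.
    public class Order
    {
        public string Customer { get; set; }
        public decimal Amount { get; set; }
    }

    static readonly Order[] Orders =
    {
        new Order { Customer = "A", Amount = 10m },
        new Order { Customer = "A", Amount = 5m },
        new Order { Customer = "B", Amount = 7m },
    };

    // SQL-shaped result: one entry per key, carrying only an aggregate.
    public static Dictionary<string, decimal> Totals()
    {
        return Orders.GroupBy(o => o.Customer)
                     .ToDictionary(g => g.Key, g => g.Sum(o => o.Amount));
    }

    // LINQ-shaped result: each key still carries all of its member rows.
    public static Dictionary<string, List<decimal>> Members()
    {
        return Orders.GroupBy(o => o.Customer)
                     .ToDictionary(g => g.Key, g => g.Select(o => o.Amount).ToList());
    }

    public static void Main()
    {
        foreach (var total in Totals())
            Console.WriteLine(total.Key + " total = " + total.Value);

        foreach (var group in Members())
            Console.WriteLine(group.Key + " has " + group.Value.Count + " member rows");
    }
}
```

Projecting each group down to its key and aggregates, as in Totals, keeps the result in the shape SQL can return directly.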
Well, I had the same question some time ago.
Basically, your LINQ statement is converted to a SQL statement. However, some groups will get translated and others will not, depending on how you write your statement.
So yes, both are possible.
example:
var a = (from entity in myTable where entity.Property == 1 select entity).ToList();
versus
var a = (from entity in myTable.ToList() where entity.Property == 1 select entity).ToList();
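What changes between the two statements is which Where the compiler binds to. A small sketch, assuming a hypothetical Entity class and an in-memory array wrapped with AsQueryable() in place of a real table: filtering the table directly yields an IQueryable that a provider can still translate, while filtering after ToList() yields a plain in-memory sequence.

```csharp
using System;
using System.Linq;

public static class QueryBindingSketch
{
    // Hypothetical entity for illustration.
    public class Entity { public int Property { get; set; } }

    // In-memory stand-in for a table exposed as IQueryable<Entity>.
    static readonly IQueryable<Entity> myTable = new[]
    {
        new Entity { Property = 1 },
        new Entity { Property = 2 },
    }.AsQueryable();

    static bool IsQueryable(object query)
    {
        return query is IQueryable;
    }

    public static bool FilterOnTableIsQueryable()
    {
        // Binds to Queryable.Where: the filter stays part of the query.
        var a = from entity in myTable where entity.Property == 1 select entity;
        return IsQueryable(a);
    }

    public static bool FilterAfterToListIsQueryable()
    {
        // ToList() runs the query first; Enumerable.Where then filters in memory.
        var b = from entity in myTable.ToList() where entity.Property == 1 select entity;
        return IsQueryable(b);
    }

    public static void Main()
    {
        Console.WriteLine(FilterOnTableIsQueryable());     // True
        Console.WriteLine(FilterAfterToListIsQueryable()); // False
    }
}
```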

Eager loading of Linq to SQL Entities in a self referencing table

I have 2 related Linq to SQL questions. Please see the image below to see what my Model looks like.
Question 1
I am trying to figure how to eager load the User.AddedByUser field on my User class/table. This field is generated from the relationship on the User.AddedByUserId field. The table is self-referencing, and I am trying to figure out how to get Linq to SQL to load up the User.AddedByUser property eagerly, i.e. whenever any User entity is loaded/fetched, it must also fetch the User.AddedByUser and User.ChangedByUser. However, I understand that this could become a recursive problem...
Update 1.1:
I've tried to use the DataLoadOptions as follows:
var options = new DataLoadOptions();
options.LoadWith<User>(u => u.ChangedByUser);
options.LoadWith<User>(u => u.AddedByUser);
db = new ModelDataContext(connectionString);
db.LoadOptions = options;
But this doesn't work, I get the following exception on Line 2:
System.InvalidOperationException occurred
Message="Cycles not allowed in LoadOptions LoadWith type graph."
Source="System.Data.Linq"
StackTrace:
at System.Data.Linq.DataLoadOptions.ValidateTypeGraphAcyclic()
at System.Data.Linq.DataLoadOptions.Preload(MemberInfo association)
at System.Data.Linq.DataLoadOptions.LoadWith[T](Expression`1 expression)
at i3t.KpCosting.Service.Library.Repositories.UserRepository..ctor(String connectionString) in C:\Development\KP Costing\Trunk\Code\i3t.KpCosting.Service.Library\Repositories\UserRepository.cs:line 15
InnerException:
The exception is quite self-explanatory - the object graph isn't allowed to be Cyclic.
Also, assuming Line 2 didn't throw an exception, I'm pretty sure Line 3 would, since they are duplicate keys.
Update 1.2:
The following doesn't work either (not used in conjunction with Update 1.1 above):
var query = from u in db.Users
select new User()
{
Id = u.Id,
// other fields removed for brevity
AddedByUser = u.AddedByUser,
ChangedByUser = u.ChangedByUser,
};
return query.ToList();
It throws the following, self-explanatory exception:
System.NotSupportedException occurred
Message="Explicit construction of entity type 'i3t.KpCosting.Shared.Model.User' in query is not allowed."
I am now REALLY at a loss on how to solve this. Please help!
Question 2
On every other table in my DB, and hence Linq to SQL model, I have two fields, Entity.ChangedByUser (linked to Entity.ChangedByUserId foreign key/relationship) and Entity.AddedByUser (linked to Entity.AddedByUserId foreign key/relationship)
How do I get Linq to SQL to eagerly load these fields for me? Do I need to do a simple join on my queries, or is there some other way?
Linq to SQL eager loading on self referencing table http://img245.imageshack.us/img245/5631/linqtosql.jpg
Cycles of any kind just aren't allowed. Since the LoadWith<T> or AssociateWith<T> are applied to every type on the context, there's no internal way to prevent an endless loop. More accurately, it's just confused on how to create the SQL since SQL Server doesn't have CONNECT BY and CTEs are really past what Linq can generate automatically with the provided framework.
The best option available to you is to manually do the 1 level join down to the user table for both of the children and an anonymous type to return them. Sorry it's not a clean/easy solution, but it's really all that's available thus far with Linq.
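As a rough illustration of that one-level join, here is a sketch that runs over an in-memory list rather than a DataContext. The User class is a hypothetical stand-in with only the relevant columns, and how Linq to SQL would translate the same query shape against db.Users has not been verified here. Each user is left-joined to the user table twice, once per foreign key, and the result is projected into an anonymous type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SelfJoinSketch
{
    // Hypothetical stand-in for the User table; only the relevant columns.
    public class User
    {
        public int Id { get; set; }
        public string FullName { get; set; }
        public int? AddedByUserId { get; set; }
        public int? ChangedByUserId { get; set; }
    }

    public static List<string> Describe(List<User> users)
    {
        var rows =
            from u in users
            // Left join to the same table for the user who added this row.
            join adder in users on u.AddedByUserId equals (int?)adder.Id into adders
            from addedBy in adders.DefaultIfEmpty()
            // Second left join for the user who last changed this row.
            join changer in users on u.ChangedByUserId equals (int?)changer.Id into changers
            from changedBy in changers.DefaultIfEmpty()
            orderby u.Id
            select new
            {
                u.FullName,
                AddedBy = addedBy == null ? "(none)" : addedBy.FullName,
                ChangedBy = changedBy == null ? "(none)" : changedBy.FullName
            };

        return rows
            .Select(r => r.FullName + ": added by " + r.AddedBy + ", changed by " + r.ChangedBy)
            .ToList();
    }

    public static void Main()
    {
        var users = new List<User>
        {
            new User { Id = 1, FullName = "Root" },
            new User { Id = 2, FullName = "Ann", AddedByUserId = 1, ChangedByUserId = 1 },
            new User { Id = 3, FullName = "Bob", AddedByUserId = 2, ChangedByUserId = 1 },
        };

        foreach (string line in Describe(users))
            Console.WriteLine(line);
    }
}
```

Because the join goes exactly one level deep, the AddedByUser of the adder is never loaded, which is what avoids the cycle.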
Maybe you could try taking a step back and seeing what you want to do with the relation? I'm assuming you want to display this information to the user in e.g. "modified by Iain Galloway 8 hours ago".
Could something like the following work? :-
var users = from u in db.Users
select new
{
/* other stuff... */
AddedTimestamp = u.AddedTimestamp,
AddedDescription = u.AddedByUser.FullName,
ChangedTimestamp = u.ChangedTimestamp,
ChangedDescription = u.ChangedByUser.FullName
};
I've used an anonymous type there for (imo) clarity. You could add those properties to your User type if you preferred.
As for your second question, your normal LoadWith(x => x.AddedByUser) etc. should work just fine - although I tend to prefer storing the description string directly in the database - you've got a trade-off between your description updating when ChangedByUser.FullName changes and having to do something complicated and possibly counterintuitive if the ChangedByUser gets deleted (e.g. ON DELETE CASCADE, or dealing with a null ChangedByUser in your code).
Not sure there is a solution to this problem with Linq to Sql. If you are using Sql Server 2005 you could define a (recursive like) Stored Procedure that uses common table expressions to get the result that you want and then execute that using DataContext.ExecuteQuery.
