Linq performance with subqueries and LET - c#

I have a query I'm executing in LinqPad (EF4 against a SQL2010 backend), which uses navigation properties to access related tables. I don't really have access to fiddle with indexes and so forth, and was wondering if there are any ways to avoid all the subqueries inherent in those 'firstordefault' items toward the bottom of the query. I've tried using a 'let' statement at the top as in "let accounts = i.ExpenseItemAccountings.FirstOrDefault()" and then use that reference to make it run only one subquery, but it still takes easily more than an hour to run this query for any meaningful number of records.
Is there any way I can make this more efficient?
var output = from i in ExpenseItems where
i.Er_Approved_Date >= fromDate &&
i.Er_Approved_Date <= toDate
select new {ER_Num = i.ErNum,
Line_Num = i.ItemNum,
Report_Title = i.Report_Title,
Requestor = i.Requester_Name,
Preparer = i.Preparer_Name,
ER_Total_Value = i.Er_TotalVal,
Partition = i.Org,
Transaction_Date = i.Item_Transaction_Date,
Approved_Date = i.Er_Approved_Date,
Item_Amount = i.Item_Amount,
Tips = i.Item_Tips,
GST = i.Item_Gst,
Have_Receipt = i.Item_Have_ReceiptTf,
Have_Invoice = i.Item_Have_InvoiceTf,
Vendor = i.Item_Vendor,
City = i.Item_City,
Item_Expense_Type = i.Item_Expense_Type,
Item_Description = i.Item_Expense_Description,
Misc_Item_Commodity = i.Item_Misc_Commodity_Name,
Misc_Item_SubCategory = i.Item_Misc_Specify,
Misc_Item_OtherMisc_Description = i.Item_Misc_Specify_Other_Desc,
Entity_Num = i.ExpenseItemAccountings.FirstOrDefault().Item_Entity_Num,
Entity_Name = i.ExpenseItemAccountings.FirstOrDefault().Item_Entity_Name,
Account_Num = i.ExpenseItemAccountings.FirstOrDefault().Item_Account_Num,
Account_Desc = i.ExpenseItemAccountings.FirstOrDefault().Item_Account_Name,
SubAccount_Num = i.ExpenseItemAccountings.FirstOrDefault().Item_SubAccount_Num,
SubAccount_Name = i.ExpenseItemAccountings.FirstOrDefault().Item_SubAccount_Name,
CostCentre_Num = i.ExpenseItemAccountings.FirstOrDefault().Item_CostCentre_Num,
CostCentre_Name = i.ExpenseItemAccountings.FirstOrDefault().Item_CostCentre_Name,
Project_Code = i.ExpenseItemAccountings.FirstOrDefault().Expense_Item_ProjectCode,
////Percent_Allocated = i.ExpenseItemAccountings.FirstOrDefault().Item_Percent,
ER_Comments = i.Er_Comments,
Item_First_Comment = i.ExpenseItemComments.FirstOrDefault().Comment_Content,
Violations = i.ExpenseItemViolations.Count()
};

In LINQPad you can view the SQL statement generated from the LINQ query.
In my opinion there is always a debate between using let and navigation properties in LINQ to Entities (or LINQ to SQL). Sometimes a simple JOIN works better. In other words, it all depends on how your LINQ provider (Entity Framework) translates your specific query into SQL.
I would suggest you test your query with each of let, join, and navigation properties and compare the generated SQL statements.
If you don't have SQL Profiler or IntelliTrace, you can cast the query to ObjectQuery (in System.Data.Objects) to view the generated SQL from your own code:
((ObjectQuery)anyLinqQuery).ToTraceString();
Also you can try using multiple from clause:
var output = from i in ExpenseItems
from exp in i.ExpenseItemAccountings
where ....
select new {...};
or
var output = from i in ExpenseItems
from exp in i.ExpenseItemAccountings.DefaultIfEmpty()
where ....
select new {...};
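The difference between the two multiple-from forms above can be sketched with LINQ to Objects. This is only an in-memory illustration: the ExpenseItem and Accounting classes, the sample data, and the MultiFromDemo helper are hypothetical stand-ins, and the SQL that Entity Framework generates for the real query still has to be checked separately.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entity classes in the question.
public class Accounting
{
    public string AccountNum;
}

public class ExpenseItem
{
    public int ErNum;
    public List<Accounting> Accountings = new List<Accounting>();
}

public static class MultiFromDemo
{
    // Inner-join shape: items without any accounting row are dropped.
    public static List<string> InnerShape(IEnumerable<ExpenseItem> items)
    {
        return (from i in items
                from acc in i.Accountings
                select i.ErNum + ":" + acc.AccountNum).ToList();
    }

    // Left-join shape: DefaultIfEmpty() keeps items that have no accounting row.
    public static List<string> LeftShape(IEnumerable<ExpenseItem> items)
    {
        return (from i in items
                from acc in i.Accountings.DefaultIfEmpty()
                select i.ErNum + ":" + (acc == null ? "(none)" : acc.AccountNum)).ToList();
    }

    // Item 1 has two accounting rows, item 2 has none.
    public static List<ExpenseItem> SampleItems()
    {
        var first = new ExpenseItem { ErNum = 1 };
        first.Accountings.Add(new Accounting { AccountNum = "100" });
        first.Accountings.Add(new Accounting { AccountNum = "200" });
        var second = new ExpenseItem { ErNum = 2 };
        return new List<ExpenseItem> { first, second };
    }

    public static void Main()
    {
        var items = SampleItems();
        Console.WriteLine(string.Join(", ", InnerShape(items))); // 1:100, 1:200
        Console.WriteLine(string.Join(", ", LeftShape(items)));  // 1:100, 1:200, 2:(none)
    }
}
```

Note that both forms return one row per accounting row rather than only the first one, so an item with several accounting rows appears several times. That differs from the FirstOrDefault() projection in the question.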

Related

convert a SQL update statement to a LINQ to Entities

Without writing an entire foreach loop is there a way to do a Update/Set in LINQ to Entities?
Using EF 6.x
Simple update query:
UPDATE stop_detail
SET cap_unique_id = b.Delivery_Location_Id
FROM order_detail b
WHERE Stop_Detail.CAP_Unique_Id IS NULL AND ((b.customer_id = 20 OR b.customer_id = 291) AND b.id = stop_detail.order_detail_id AND stop_type = 1)
All the context names are the same as the table names.
I normally end up writing about 30 lines of C# code to do this and I know there has to be a better way!
Whether you can and whether you should are two different things.
Here's how you can.
Example from EF6 Raw SQL Queries
using (var context = new BloggingContext())
{
context.Database.ExecuteSqlCommand(
"UPDATE dbo.Blogs SET Name = 'Another Name' WHERE BlogId = 1");
}
Hint: you probably shouldn't

Error from SQL query

Currently I'm working on cleaning up some code on the backend of an application I'm contracted for maintenance to. I ran across a method where a call is being made to the DB via Oracle Data Reader. After examining the SQL, I realized it was not necessary to make the call to open up Oracle Data Reader seeing how the object being loaded up was already within the Context of our Entity Framework. I changed the code to follow use of the Entity Model instead. Below are the changes I made.
Original code
var POCs = new List<TBLPOC>();
Context.Database.Connection.Open();
var cmd = (OracleCommand)Context.Database.Connection.CreateCommand();
OracleDataReader reader;
var SQL = string.Empty;
if (IsAssociate == 0)
SQL = @"SELECT tblPOC.cntPOC,INITCAP(strLastName),INITCAP(strFirstName)
FROM tblPOC,tblParcelToPOC
WHERE tblParcelToPOC.cntPOC = tblPOC.cntPOC AND
tblParcelToPOC.cntAsOf = 0 AND
tblParcelToPOC.cntParcel = " + cntParcel + " ORDER BY INITCAP(strLastName)";
else
SQL = @"SELECT cntPOC,INITCAP(strLastName),INITCAP(strFirstName)
FROM tblPOC
WHERE tblPOC.cntPOC NOT IN ( SELECT cntPOC
FROM tblParcelToPOC
WHERE cntParcel = " + cntParcel + @"
AND cntAsOf = 0 )
AND tblPOC.ysnActive = 1 ORDER BY INITCAP(strLastName)";
cmd.CommandText = SQL;
cmd.CommandType = CommandType.Text;
using (reader = cmd.ExecuteReader())
{
while (reader.Read())
{
POCs.Add(new TBLPOC { CNTPOC = (decimal)reader[0],
STRLASTNAME = reader[1].ToString(),
STRFIRSTNAME = reader[2].ToString() });
}
}
Context.Database.Connection.Close();
return POCs;
Replacement code
var sql = string.Empty;
if (IsAssociate == 0)
sql = string.Format(@"SELECT tblPOC.cntPOC,INITCAP(strLastName),INITCAP(strFirstName)
FROM tblPOC,tblParcelToPOC
WHERE tblParcelToPOC.cntPOC = tblPOC.cntPOC
AND tblParcelToPOC.cntAsOf = 0
AND tblParcelToPOC.cntParcel = {0}
ORDER BY INITCAP(strLastName)",
cntParcel);
else
sql = string.Format(@"SELECT cntPOC,INITCAP(strLastName), INITCAP(strFirstName)
FROM tblPOC
WHERE tblPOC.cntPOC NOT IN (SELECT cntPOC
FROM tblParcelToPOC
WHERE cntParcel = {0}
AND cntAsOf = 0)
AND tblPOC.ysnActive = 1
ORDER BY INITCAP(strLastName)",
cntParcel);
return Context.Database.SqlQuery<TBLPOC>(sql, "0").ToList<TBLPOC>();
The issue I'm having right now is when the replacement code is executed, I get the following error:
The data reader is incompatible with the specified 'TBLPOC'. A member of the type 'CNTPOCORGANIZATION', does not have a corresponding column in the data reader with the same name.
The field cntPOCOrganization does exist within tblPOC, as well as within the TBLPOC entity. cntPOCOrganization is a nullable decimal (don't ask why decimal; I don't understand why the previous contractors used decimals rather than ints for identifiers). However, neither the old code nor the new code needs to fill that field. I'm confused about why it errors out on that particular field.
If anyone has any insight, I would truly appreciate it. Thanks.
EDIT: So after thinking on it a bit more and doing some research, I think I know what the issue is. In the Entity Model for TBLPOC, the cntPOCOrganization field is null, however, there is an object tied to this Entity Model called TBLPOCORGANIZATION. I'm pondering if it's trying to fill it. It too has cntPOCOrganization within itself and I'm guessing that maybe it is trying to fill itself and is what is causing the issue.
That may be why the previous contractor wrote the Oracle command instead of running it through the Entity Framework. I'm going to revert for the time being (I'm on a deadline and really don't want to play with it for too long). Thanks!
This error is raised when the result set of your query does not match your EF entity model: each column has to come back under the name of the corresponding entity property. If you post the entity model you are trying to load the results into, the SQL can be fixed precisely. In general you need to alias the columns:
sql = string.Format(@"SELECT tblPOC.cntPOC AS <your_EF_model_property_name_here>,INITCAP(strLastName) AS <your_EF_model_property_name_here>,INITCAP(strFirstName) AS <your_EF_model_property_name_here>
FROM tblPOC,tblParcelToPOC
WHERE tblParcelToPOC.cntPOC = tblPOC.cntPOC
AND tblParcelToPOC.cntAsOf = 0
AND tblParcelToPOC.cntParcel = {0}
ORDER BY INITCAP(strLastName)",
cntParcel);

Optimize LINQ query that runs fast in Sql server?

I want to count the rows of a related table:
MainTable tbl = tblInfo(id);
var count = tbl.Related_Huge_Table_Data.Count();
The problem is: this takes too long (about 20 seconds) to execute, although when I run this query in Sql Server it executes below one second. How can I optimize this query in linq? I also tried to use stored procedure but no luck.
This is the tblInfo method:
public MainTable tblInfo(int id)
{
MyDataContext context = new MyDataContext();
MainTable mt = (from c in context.MainTables
where c.Id == id
select c).SingleOrDefault();
return mt;
}
I used LINQ to SQL, and the classes were generated by LINQ to SQL.
By running SingleOrDefault() you execute the query and have to deal with results in memory after that. You need to stay with IQueryable until your query is fully constructed.
The easiest way to answer "how many child records this parent record has" is to approach it from the child side:
using (var dx = new MyDataContext())
{
// If you have an association between the tables defined in the context
int count = dx.Related_Huge_Table_Datas.Where(t => t.MainTable.id == 42).Count();
// If you don't (alternative to the line above; use one or the other)
int countByKey = dx.Related_Huge_Table_Datas.Where(t => t.parent_id == 42).Count();
}
If you insist on the parent side approach, you can do that too:
using (var dx = new MyDataContext())
{
int count = dx.MainTables.Where(t => t.id == 42).SelectMany(t => t.Related_Huge_Table_Datas).Count();
}
If you want to keep a part of this query in a function like tblInfo, you can, but you can't instantiate MyDataContext from inside such function, otherwise you will get an exception when trying to use the query with another instance of MyDataContext. So either pass MyDataContext to tblInfo or make tblInfo a member of partial class MyDataContext:
public static IQueryable<MainTable> tblInfo(MyDataContext dx, int id)
{
return dx.MainTables.Where(t => t.id == id);
}
...
using (var dx = new MyDataContext())
{
int count = tblInfo(dx, 42).SelectMany(t => t.Related_Huge_Table_Datas).Count();
}
Try this:
MyDataContext context = new MyDataContext();
var count = context.Related_Huge_Table_Data.Where(o => o.Parentid == id).Count();
// or, as an alternative, with raw SQL
int rawCount = context.Database.SqlQuery<int>("select count(1) from Related_Huge_Table_Data where Parentid=" + id).FirstOrDefault();
If you wish to take full advantage of your SQL database's performance, it may make sense to query it directly rather than go through LINQ. It should be reasonably more performant :)
var Related_Huge_Table_Name = "TABLENAME"; // Input table name here
var Id = "ID"; // Input Id value here
var connectionString = "user id=USERNAME; password=PASSWORD; server=SERVERNAME; Trusted_Connection=YESORNO; database=DATABASE; connection timeout=30";
SqlCommand sCommand = new SqlCommand();
sCommand.Connection = new SqlConnection(connectionString);
sCommand.CommandType = CommandType.Text;
sCommand.CommandText = $"SELECT COUNT(*) FROM {Related_Huge_Table_Name} WHERE Id={Id}";
sCommand.Connection.Open();
SqlDataReader reader = sCommand.ExecuteReader();
var count = 0;
if (reader.HasRows)
{
reader.Read();
count = reader.GetInt32(0);
}
else
{
Debug.WriteLine("Related_Huge_Table_Data: No Rows returned in Query.");
}
sCommand.Connection.Close();
Try this:
MyDataContext context = new MyDataContext();
var count = context.MainTables.GroupBy(x => x.ID).Distinct().Count();
GSerg's answer is the correct one in many cases. But when your table becomes really huge, even a COUNT(1) run directly in SQL Server is slow.
The best way to get around this is to query the database statistics directly, which is impossible with LINQ (or at least I don't know of a way).
The best thing you can do is to create a static method (C#) on your table's definition which returns the result of the following query:
SELECT
SUM(st.row_count)
FROM
sys.dm_db_partition_stats st
WHERE
object_name(object_id) = '{TableName}'
AND (index_id < 2)
where {TableName} is the database name of your table.
Beware it's an answer only for the case of counting all records in a table!
Is your linq2sql returning the recordset and then doing the .Count() locally, or is it sending SQL to the server to do the count on the server? There will be a big difference in performance there.
Also, have you inspected the SQL that's being generated when you execute the query? From memory, Linq2Sql allows you to inspect SQL (maybe by setting up a logger on your class?). In Entity Framework, you can see it when debugging and inspecting the IQueryable<> object, not sure if there's an equivalent in Linq2Sql.
Way to view SQL executed by LINQ in Visual Studio?
Alternatively, use the SQL Server Profiler (if available), or somehow see what's being executed.
You may try the following:
var c = from rt in context.Related_Huge_Table_Data
join t in context.MainTables
on rt.MainTableId equals t.id
where t.id == id
select new { rt.id };
var count = c.Distinct().Count();

Comparisons of columns returned by linq to sql queries

I've done this before, and I'm drawing a blank on how I did it.
I'm using LINQ to SQL to compare lists from different databases on different servers. I can't combine the queries due to the complexity of the actual query and the huge amount of data it will process.
var query1 = (from u in database1server1
select new
{ PrimaryLine1 = u.PrimaryLine1,
PrimaryLine2 = u.PrimaryLine2,
PrimaryLine3 = u.PrimaryLine3,
}).ToList().ToString();
var query2 = (from m in database2server2
select new
{ PrimaryLine1 = m.PrimaryLine1,
PrimaryLine2 = m.PrimaryLine2,
PrimaryLine3 = m.PrimaryLine3,
}).ToList().ToString();
I need to account for the differences between these lists, column by column. So, I need a list of all the PrimaryLine1 values which are in database1server1 but not in database2server2, and so on for PrimaryLine2, PrimaryLine3, etc.
How can I build the collection dataMismatces so I can generate a neat list for the end user?
string failedTests = null; // in
failedTests = string.Join("\n", dataMismatces);
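One possible way to build that list is to take the set difference of each column with Enumerable.Except. The sketch below is a minimal in-memory illustration, assuming both queries are materialized with ToList() alone (calling ToString() on a list yields only its type name, not the data). The AddressRow class, the ColumnCompare helper, and the sample values are hypothetical; only the PrimaryLine1..3 names and the dataMismatces identifier come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for the anonymous projections.
public class AddressRow
{
    public string PrimaryLine1;
    public string PrimaryLine2;
    public string PrimaryLine3;
}

public static class ColumnCompare
{
    // Distinct values of one column that occur in 'left' but not in 'right'.
    public static List<string> MissingValues(
        IEnumerable<AddressRow> left,
        IEnumerable<AddressRow> right,
        Func<AddressRow, string> column)
    {
        return left.Select(column).Except(right.Select(column)).ToList();
    }

    // One message per value that is present in the first source only.
    public static List<string> BuildMismatches(List<AddressRow> left, List<AddressRow> right)
    {
        var columns = new (string Name, Func<AddressRow, string> Get)[]
        {
            ("PrimaryLine1", r => r.PrimaryLine1),
            ("PrimaryLine2", r => r.PrimaryLine2),
            ("PrimaryLine3", r => r.PrimaryLine3),
        };

        var dataMismatces = new List<string>();
        foreach (var col in columns)
        {
            foreach (var value in MissingValues(left, right, col.Get))
            {
                dataMismatces.Add(col.Name + ": '" + value + "' is in source 1 but not in source 2");
            }
        }
        return dataMismatces;
    }

    public static void Main()
    {
        var left = new List<AddressRow>
        {
            new AddressRow { PrimaryLine1 = "1 Main St", PrimaryLine2 = "Apt 1", PrimaryLine3 = "Springfield" },
            new AddressRow { PrimaryLine1 = "2 Oak Ave", PrimaryLine2 = "Unit B", PrimaryLine3 = "Shelbyville" },
        };
        var right = new List<AddressRow>
        {
            new AddressRow { PrimaryLine1 = "1 Main St", PrimaryLine2 = "Apt 1", PrimaryLine3 = "Springfield" },
            new AddressRow { PrimaryLine1 = "9 Elm Rd", PrimaryLine2 = "Unit B", PrimaryLine3 = "Ogdenville" },
        };

        string failedTests = string.Join("\n", BuildMismatches(left, right));
        Console.WriteLine(failedTests);
    }
}
```

Except compares values as a set, so duplicates within a column collapse to one entry. Swapping the two arguments gives the values that exist only in the second source.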

LINQ to DataSet Query Help

I'm really new to LINQ so I'm hoping someone can help me. I've got a database which I need to run a large query from but it's a really old ODBC driver and takes a long time to respond (30+min for even a simple query). It only takes about 2-3min to dump all the data into a dataset however so I figured this was best and then I could run a LINQ to Dataset query. I can't seem to get the query to work and I'm a little confused. I put all the data into an SQL Express database to test the LINQ to SQL query to make sure I was going down the right path. I don't have this option where the application is going to be run as the environment will always be different.
SQL:
SELECT Invoice_detail.Code, Invoice_detail.Description, Product_master.Comment AS Packing, Invoice_detail.QtyInv AS INV, Invoice_detail.QtyBackOrder AS BO, Alternate_product_codes.MasterBarCode AS BarCode, Invoice_detail.PriceAmt AS Price, Invoice_detail.DiscPerc AS Disc, ROUND(Invoice_detail.TaxableAmt/Invoice_detail.QtyInv,2) AS Nett FROM ((Invoice_detail INNER JOIN Product_master ON Invoice_detail.Code = Product_master.Code) INNER JOIN Invoice_header ON Invoice_detail.InternalDocNum = Invoice_header.InternalDocNum AND Invoice_detail.DocType = Invoice_header.DocType) LEFT JOIN Alternate_product_codes ON Invoice_detail.Code = Alternate_product_codes.Code WHERE Invoice_header.DocNum = '{0}' AND Invoice_header.DocType = 1 AND Invoice_detail.LineType = 1 AND Invoice_detail.QtyInv > 0
LINQ to SQL:
from detail in INVOICE_DETAILs
join prodmast in PRODUCT_MASTERs on detail.Code equals prodmast.Code
join header in INVOICE_HEADERs on new { detail.InternalDocNum, detail.DocType } equals new { header.InternalDocNum, header.DocType}
join prodcodes in ALTERNATE_PRODUCT_CODES on detail.Code equals prodcodes.Code into alt_invd
from prodcodes in alt_invd.DefaultIfEmpty()
where
header.DocType == 1 &&
detail.LineType == 1 &&
detail.QtyInv > 0 &&
header.Date > DateTime.Parse("17/07/2011").Date &&
header.DocNum.Trim() == "119674"
select new {
detail.Code,
detail.Description,
Packing = prodmast.Comment,
INV = detail.QtyInv,
BO = detail.QtyBackOrder,
Barcode = prodcodes.MasterBarCode,
Price = detail.PriceAmt,
Disc = detail.DiscPerc,
Nett = Math.Round(Convert.ToDecimal(detail.TaxableAmt/detail.QtyInv),2,MidpointRounding.AwayFromZero)
}
LINQ to Dataset:
var query = from detail in ds.Tables["Invoice_detail"].AsEnumerable()
join prodmast in ds.Tables["Product_master"].AsEnumerable() on detail["Code"] equals prodmast["Code"]
join header in ds.Tables["Invoice_header"].AsEnumerable() on new { docnum = detail["InternalDocNum"], doctype = detail["DocType"] } equals new { docnum = header["InternalDocNum"], doctype = header["DocType"] }
join prodcodes in ds.Tables["Alternate_product_codes"].AsEnumerable() on detail["Code"] equals prodcodes["Code"] into alt_invd
from prodcodes in alt_invd.DefaultIfEmpty()
where
(int)header["DocType"] == 1 &&
(int)detail["LineType"] == 1 &&
(int)detail["QtyInv"] > 0 &&
//header.Field<DateTime>("Date") > DateTime.Parse("17/07/2011").Date &&
header.Field<DateTime>("Date") > DateTime.Now.Date.AddDays(-7) &&
header.Field<string>("DocNum").Trim() == "119674"
select new
{
Code = detail["Code"],
Description = detail["Description"],
Packing = prodmast["Comment"],
INV = detail["QtyInv"],
BO = detail["QtyBackOrder"],
Barcode = prodcodes["MasterBarCode"],
Price = detail["PriceAmt"],
Disc = detail["DiscPerc"],
Nett = Math.Round(Convert.ToDecimal((double)detail["TaxableAmt"] / (int)detail["QtyInv"]), 2, MidpointRounding.AwayFromZero)
};
I need to run the LINQ to DataSet query and then put the results into a DataTable so that I can export them to CSV. The query will return many rows. I have seen the CopyToDataTable method, but it doesn't seem to work unless the dataset is typed. I'm using the ODBC data adapter's Fill method, so I'm not explicitly setting the data types on the DataTables I'm filling. The reason is that there are a lot of columns in those tables and setting them all up would be time consuming.
Is LINQ the best option? Am I close? Do I have to set the DataTables up for all the columns and data types? The only other way I can think of is to dump the data into an access database every time and query from there. I'm more curious to get LINQ to work though as I think it's going to be more beneficial for me going forward.
Any help or pointers is appreciated.
Thanks.
Pete.
Consider using POCO objects instead of a DataSet.
Blogs @ MSDN
If I understand you correctly, the Linq To Dataset query retrieves the correct information, but you are not able to export the information to csv.
If this is just one csv file you need creating using just the nine fields in your example, you may be able to use a csv library (for example FileHelpers) to export the information.
To give you an example of the extra work involved, you need to define a class eg
[DelimitedRecord(",")]
public class Info
{
[FieldQuoted()]
public string Code ;
[FieldQuoted()]
public string Description ;
[FieldQuoted()]
public string Packing ;
public decimal INV ;
public decimal BO ;
[FieldQuoted()]
public string Barcode ;
public decimal Price ;
public decimal Disc ;
public decimal Nett ;
}
(Note, I'm guessing some of the field types)
You then change your query to use Info , ie
select new Info {
Code = detail.Field<string>("Code"),
...
and finally
FileHelperEngine engine = new FileHelperEngine(typeof(Info));
engine.WriteFile(".\\outputfile.csv", query);
and you are done.
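If you would rather not add a library, a rough alternative is to copy the projected rows into a plain DataTable yourself and write the CSV by hand. The sketch below is not the FileHelpers approach above. The CsvSketch class, its method names, and the sample rows are hypothetical, and it assumes AsEnumerable() and Field<T>() are available (on .NET Framework that means a reference to System.Data.DataSetExtensions).

```csharp
using System;
using System.Data;
using System.Globalization;
using System.Linq;
using System.Text;

public static class CsvSketch
{
    // Quote a value when it contains a comma, a double quote, or a line break.
    public static string Escape(object value)
    {
        var s = Convert.ToString(value, CultureInfo.InvariantCulture) ?? "";
        if (s.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0)
        {
            return "\"" + s.Replace("\"", "\"\"") + "\"";
        }
        return s;
    }

    // Header line from the column names, then one line per row.
    public static string ToCsv(DataTable table)
    {
        var sb = new StringBuilder();
        sb.AppendLine(string.Join(",", table.Columns.Cast<DataColumn>().Select(c => Escape(c.ColumnName))));
        foreach (DataRow row in table.Rows)
        {
            sb.AppendLine(string.Join(",", row.ItemArray.Select(v => Escape(v))));
        }
        return sb.ToString();
    }

    // Runs a small LINQ to DataSet query and copies the projection into a new table.
    public static DataTable BuildSample()
    {
        // Stand-in for ds.Tables["Invoice_detail"] with made-up rows.
        var detail = new DataTable("Invoice_detail");
        detail.Columns.Add("Code", typeof(string));
        detail.Columns.Add("Description", typeof(string));
        detail.Columns.Add("QtyInv", typeof(int));
        detail.Rows.Add("A1", "Widget, large", 3);
        detail.Rows.Add("B2", "Gadget", 0);

        var query = from d in detail.AsEnumerable()
                    where d.Field<int>("QtyInv") > 0
                    select new
                    {
                        Code = d.Field<string>("Code"),
                        Description = d.Field<string>("Description"),
                        INV = d.Field<int>("QtyInv")
                    };

        // CopyToDataTable only accepts DataRow sequences, so the rows are added manually.
        var result = new DataTable("Result");
        result.Columns.Add("Code", typeof(string));
        result.Columns.Add("Description", typeof(string));
        result.Columns.Add("INV", typeof(int));
        foreach (var r in query)
        {
            result.Rows.Add(r.Code, r.Description, r.INV);
        }
        return result;
    }

    public static void Main()
    {
        Console.Write(ToCsv(BuildSample()));
    }
}
```

Only the columns of the result table need explicit types here; the source tables can stay exactly as the ODBC adapter filled them, as long as the Field<T>() type arguments match the types the adapter actually produced.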
