I'm really new to LINQ, so I'm hoping someone can help me. I have a database that I need to run a large query against, but it uses a really old ODBC driver and takes a long time to respond (30+ minutes for even a simple query). However, it only takes about 2-3 minutes to dump all the data into a DataSet, so I figured that was the better approach and I could then run a LINQ to DataSet query against it. I can't seem to get the query to work and I'm a little confused. I put all the data into a SQL Express database to test the LINQ to SQL query, to make sure I was going down the right path. I won't have that option where the application is going to be run, as the environment will always be different.
SQL:
SELECT Invoice_detail.Code,
       Invoice_detail.Description,
       Product_master.Comment AS Packing,
       Invoice_detail.QtyInv AS INV,
       Invoice_detail.QtyBackOrder AS BO,
       Alternate_product_codes.MasterBarCode AS BarCode,
       Invoice_detail.PriceAmt AS Price,
       Invoice_detail.DiscPerc AS Disc,
       ROUND(Invoice_detail.TaxableAmt / Invoice_detail.QtyInv, 2) AS Nett
FROM ((Invoice_detail
       INNER JOIN Product_master ON Invoice_detail.Code = Product_master.Code)
       INNER JOIN Invoice_header ON Invoice_detail.InternalDocNum = Invoice_header.InternalDocNum
                                AND Invoice_detail.DocType = Invoice_header.DocType)
       LEFT JOIN Alternate_product_codes ON Invoice_detail.Code = Alternate_product_codes.Code
WHERE Invoice_header.DocNum = '{0}'
  AND Invoice_header.DocType = 1
  AND Invoice_detail.LineType = 1
  AND Invoice_detail.QtyInv > 0
LINQ to SQL:
from detail in INVOICE_DETAILs
join prodmast in PRODUCT_MASTERs on detail.Code equals prodmast.Code
join header in INVOICE_HEADERs on new { detail.InternalDocNum, detail.DocType } equals new { header.InternalDocNum, header.DocType}
join prodcodes in ALTERNATE_PRODUCT_CODES on detail.Code equals prodcodes.Code into alt_invd
from prodcodes in alt_invd.DefaultIfEmpty()
where
header.DocType == 1 &&
detail.LineType == 1 &&
detail.QtyInv > 0 &&
header.Date > DateTime.Parse("17/07/2011").Date &&
header.DocNum.Trim() == "119674"
select new {
detail.Code,
detail.Description,
Packing = prodmast.Comment,
INV = detail.QtyInv,
BO = detail.QtyBackOrder,
Barcode = prodcodes.MasterBarCode,
Price = detail.PriceAmt,
Disc = detail.DiscPerc,
Nett = Math.Round(Convert.ToDecimal(detail.TaxableAmt/detail.QtyInv),2,MidpointRounding.AwayFromZero)
}
LINQ to Dataset:
var query = from detail in ds.Tables["Invoice_detail"].AsEnumerable()
join prodmast in ds.Tables["Product_master"].AsEnumerable() on detail["Code"] equals prodmast["Code"]
join header in ds.Tables["Invoice_header"].AsEnumerable() on new { docnum = detail["InternalDocNum"], doctype = detail["DocType"] } equals new { docnum = header["InternalDocNum"], doctype = header["DocType"] }
join prodcodes in ds.Tables["Alternate_product_codes"].AsEnumerable() on detail["Code"] equals prodcodes["Code"] into alt_invd
from prodcodes in alt_invd.DefaultIfEmpty()
where
(int)header["DocType"] == 1 &&
(int)detail["LineType"] == 1 &&
(int)detail["QtyInv"] > 0 &&
//header.Field<DateTime>("Date") > DateTime.Parse("17/07/2011").Date &&
header.Field<DateTime>("Date") > DateTime.Now.Date.AddDays(-7) &&
header.Field<string>("DocNum").Trim() == "119674"
select new
{
Code = detail["Code"],
Description = detail["Description"],
Packing = prodmast["Comment"],
INV = detail["QtyInv"],
BO = detail["QtyBackOrder"],
Barcode = prodcodes["MasterBarCode"],
Price = detail["PriceAmt"],
Disc = detail["DiscPerc"],
Nett = Math.Round(Convert.ToDecimal((double)detail["TaxableAmt"] / (int)detail["QtyInv"]), 2, MidpointRounding.AwayFromZero)
};
I need to run the LINQ to DataSet query and then put the results into a DataTable so that I can export it to CSV. The query will return many rows. I can see the CopyToDataTable method, however that only seems to work when the query returns DataRow objects (e.g. from a typed dataset). I'm using the ODBC data adapter's Fill method, so I'm not specifically setting the data types on the DataTables I'm filling; there are a lot of columns in those tables and setting them all up would be time consuming.
Is LINQ the best option? Am I close? Do I have to set the DataTables up with all the columns and data types? The only other way I can think of is to dump the data into an Access database every time and query from there. I'm more curious to get LINQ working though, as I think it's going to be more beneficial for me going forward.
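One idea I had was to build the output DataTable by hand from the query results instead of using CopyToDataTable - an untested sketch, with the column types guessed:
DataTable output = new DataTable();
output.Columns.Add("Code", typeof(string));
output.Columns.Add("Description", typeof(string));
output.Columns.Add("Packing", typeof(string));
output.Columns.Add("INV", typeof(decimal));
output.Columns.Add("BO", typeof(decimal));
output.Columns.Add("Barcode", typeof(string));
output.Columns.Add("Price", typeof(decimal));
output.Columns.Add("Disc", typeof(decimal));
output.Columns.Add("Nett", typeof(decimal));

foreach (var row in query)
{
    // Each anonymous-type result becomes one row; ADO.NET converts the boxed
    // values to the column types where it can.
    output.Rows.Add(row.Code, row.Description, row.Packing, row.INV, row.BO,
                    row.Barcode, row.Price, row.Disc, row.Nett);
}
I'm not sure whether that is the sensible route either.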
Any help or pointers is appreciated.
Thanks.
Pete.
Consider using POCO objects instead of a DataSet.
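For example, something along these lines (a rough sketch; the class name, property names and types are my guesses based on the query):
public class InvoiceLine
{
    public string Code { get; set; }
    public string Description { get; set; }
    public string Packing { get; set; }
    public decimal Inv { get; set; }
    public decimal BackOrder { get; set; }
    public string Barcode { get; set; }
    public decimal Price { get; set; }
    public decimal Disc { get; set; }
    public decimal Nett { get; set; }
}

// Project the LINQ to DataSet query into the POCO instead of an anonymous type:
List<InvoiceLine> lines =
    (from detail in ds.Tables["Invoice_detail"].AsEnumerable()
     // ... same joins and where clause as in the question ...
     select new InvoiceLine
     {
         Code = detail.Field<string>("Code"),
         Description = detail.Field<string>("Description"),
         // etc.
     }).ToList();
A list of POCOs is then straightforward to write out as CSV or load into a DataTable.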
If I understand you correctly, the Linq To Dataset query retrieves the correct information, but you are not able to export the information to csv.
If you just need to create one csv file using the nine fields in your example, you may be able to use a csv library (for example FileHelpers) to export the information.
To give you an example of the extra work involved, you need to define a class eg
[DelimitedRecord(",")]
public class Info
{
    [FieldQuoted()]
    public string Code;
    [FieldQuoted()]
    public string Description;
    [FieldQuoted()]
    public string Packing;
    public decimal INV;
    public decimal BO;
    [FieldQuoted()]
    public string Barcode;
    public decimal Price;
    public decimal Disc;
    public decimal Nett;
}
(Note, I'm guessing some of the field types)
You then change your query to use Info (note you'll need casts such as Field<string>, since the DataSet is untyped), i.e.
select new Info {
    Code = detail.Field<string>("Code"),
    ...
and finally
FileHelperEngine engine = new FileHelperEngine(typeof(Info));
engine.WriteFile(".\\outputfile.csv", query);
and you are done.
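One caveat, depending on the FileHelpers version (so treat this as a guess): WriteFile may want the records materialised rather than a deferred LINQ query, in which case something like this should be safe:
Info[] rows = query.ToArray();   // run the LINQ to DataSet query and buffer the results
FileHelperEngine engine = new FileHelperEngine(typeof(Info));
engine.WriteFile(".\\outputfile.csv", rows);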
I have a current working solution that is doing a new db call for each project Id in a list and I am trying to do a single call instead that returns data from multiple projects.
To do this I am trying to pass a list of project Ids into a Dapper query that hits a MySQL database. I either get an error of "Operand should contain 1 column(s)", or I get the first result back rather than one per project Id that is in the database.
The current C# code I am using is:
public List<ProjectPortalManager> GetPPTech(IEnumerable<int> projIds)
{
    string sql = @"SELECT tProject.ProjectID,
                tProject.ProjectName,
                tProject.PMUserID,
                if(cast(tproject.dateinit as char) = '0000-00-00 00:00:00',null,tproject.dateinit) as DateInit,
                tproject.comments,
                tproject.ProjectNumber,
                c.LName,
                c.FName,
                c.orgid,
                c.orgname as organization,
                c.Email,
                c.Phone
                From tProject left Join tContacts c on tProject.PMUserID = c.UserId
                Where tProject.ProjectID in (@ProjIds);";
    try
    {
        List<ProjectPortalManager> pms = Conn.Query<ProjectPortalManager>(sql, new { ProjIds = new[] { projIds } }).ToList();
        return pms;
    }
    catch (Exception ex)
    {
        ErrorReport.ReportError(ex);
    }
    return new List<ProjectPortalManager>();
}
This does not error out but returns 0 results. When running the query in MySQL Workbench I do get one result back. However I am expecting several results. The SQL I run in workbench is:
SET @projIds = ('28, 99, 9');
SELECT tProject.ProjectID,
tProject.ProjectName,
tProject.PMUserID,
if(cast(tproject.dateinit as char) = '0000-00-00 00:00:00',null,tproject.dateinit) as DateInit,
tproject.comments, tproject.ProjectNumber,
c.LName,
c.FName,
c.orgid,
c.orgname as organization,
c.Email,
c.Phone
From tProject left Join tContacts c on tProject.PMUserID = c.UserId
where tProject.ProjectID IN (@projIds);
I have verified that all the Id numbers used do exist in the database.
There seems to be conflicting information online about how to do this but I have not found a solution that seems to work.
Don't put parentheses around the IN parameter if you want Dapper to expand it into a list of parameters and populate them:
where tProject.ProjectID IN @PIDs
Suppose you'd passed an array of size 3 in new { PIDs = projIds.ToArray() } - Dapper would effectively transform your SQL to:
where tProject.ProjectID IN (@PIDs1, @PIDs2, @PIDs3)
then behave as if you'd passed new { PIDs1 = projIds[0], PIDs2 = projIds[1], PIDs3 = projIds[2] }
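A sketch of the corrected call, keeping the names from the question (column list abbreviated; the parameter is the IEnumerable<int> itself, not wrapped in another array):
string sql = @"SELECT tProject.ProjectID, tProject.ProjectName, tProject.PMUserID
               From tProject left Join tContacts c on tProject.PMUserID = c.UserId
               Where tProject.ProjectID IN @ProjIds";

// Dapper expands @ProjIds into (@ProjIds1, @ProjIds2, ...) because the value is a collection.
List<ProjectPortalManager> pms =
    Conn.Query<ProjectPortalManager>(sql, new { ProjIds = projIds }).ToList();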
I have Master and Detail classes:
class Master
{
    public int ID { get; set; }
    public string Name { get; set; }
    public List<Detail> Details { get; set; }
}

class Detail
{
    public string Description { get; set; }
    public double Amount { get; set; }
}
I am using the approach below and it is working fine now.
List<Master> result = new List<Master>();
// SQL Connection
string sqlCommand = "SELECT * FROM Master LEFT JOIN Detail on Master.ID = Detail.ID";
using (System.Data.SqlClient.SqlDataReader dr = db.DbDataReader as System.Data.SqlClient.SqlDataReader)
{
    if (dr.HasRows)
    {
        Master LastMaster = null;
        while (dr.Read())
        {
            if (LastMaster == null || Convert.ToInt32(dr["ID"]) != LastMaster.ID)
            {
                Master h = new Master();
                h.ID = Convert.ToInt32(dr["ID"]);
                h.Name = Convert.ToString(dr["Name"]);
                result.Add(h);
                LastMaster = h;
            }
            if (dr["Description"] == DBNull.Value)
                continue;
            if (LastMaster.Details == null)
                LastMaster.Details = new List<Detail>();
            Detail d = new Detail();
            d.Description = dr["Description"] as string;
            d.Amount = Convert.ToDouble(dr["Amount"]);
            LastMaster.Details.Add(d);
            ......
        }
    }
    .....
}
Is there any better approach to filling a list of objects that each contain a nested list in C#? I appreciate any suggestions. Thanks.
You can use Dapper (a micro ORM) for your scenario. Below is some sample code:
const string createSql = @"
create table #Users (Id int, Name varchar(20))
create table #Posts (Id int, OwnerId int, Content varchar(20))
insert #Users values(99, 'Sam')
insert #Users values(2, 'I am')
insert #Posts values(1, 99, 'Sams Post1')
insert #Posts values(2, 99, 'Sams Post2')
insert #Posts values(3, null, 'no ones post')";
using(var connection = new SqlConnection("database connection string"))
{
connection.Execute(createSql);
try
{
const string sql = @"select * from #Posts p
left join #Users u on u.Id = p.OwnerId
Order by p.Id";
var data = connection.Query<Post, User, Post>(sql, (post, user) => { post.Owner = user; return post; }).ToList();
}
catch(Exception ex){}
}
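(The sample assumes Post and User classes roughly like the following - they are not shown in the snippet, so this is my guess:)
public class User
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Post
{
    public int Id { get; set; }
    public int? OwnerId { get; set; }
    public string Content { get; set; }
    public User Owner { get; set; }
}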
Ibram commented about EF and Dapper, and Abu gave an example for Dapper (though I'm not sure it demonstrates generating a graph with a single master and multiple details per master, as you have - Dapper can do that if you want to explore it).
In EF we could do something like:
install EF Core Power Tools - as you already have a db, we will use it to generate classes from it. This can also be done from the command line alone, but EFCPT makes a lot of operations easier
right click your project, choose EF Core Power Tools .. Reverse Engineer
fill in the connection string details for a new connection
choose the database objects you wish to turn into classes
set other options as appropriate (you can find out more about them later; maybe only use the pluralize one for now, if your db tables are named like Orders, Customers, Companies and you want your classes called Order/Customer/Company - classes should not have plural names). Tick "put connection string in code" for now - you can move it to a config file later
finish. Eventually you'll get some classes, plus a context with a load of code in OnModelCreating that lays out a description of everything in the tables: the columns, keys and relationships
Now you can run some query like:
var c = new YourContext();
var ms = c.Masters.Include(m => m.Details).ToList();
That's basically the equivalent of what you posted
You can get a bit fancier by shaping a more involved linq query:
var q = c.Masters.Include(m => m.Details)
.Where(m => m.Name.StartsWith("Smith"))
.OrderBy(m => m.Name);
var ms = q.ToList();
It will be translated into something like
SELECT * FROM master join detail on ...
WHERE name LIKE 'Smith%'
ORDER BY m.name
You can see the generated query if you inspect the DebugView property of q
You could make changes:
ms[0].Details.Clear(); //causes delete of all details for this master
ms[1].Details.Add(new Detail { SomeProp = someValue }); //causes insert of a new detail for this master
ms[2].Name = "Hello"; //causes update of this master name
c.SaveChanges(); //carries out the above, in sql, to affect the db
When you manipulate the returned objects and save, EF will delete/insert/update as appropriate to sync the db to what happened to the objects. It is important that you understand that EF tracks what happens to all the objects it creates, so that it can do this
When would you use EF and when would you use Dapper? Well, it doesn't have to be mutually exclusive; you can use them in the same project. Generally I'd say use EF (or some other ORM like it - nHibernate is another popular one, which works on a similar concept of translating linq expressions to sql and tracking the data back into objects) for the stuff where the sql is so simple that it's a productivity boost to not have to write it, track it, and write the changes back. What it is not intended for is forming ad hoc queries that don't map well to client side objects. For those you can use Dapper, or you could define client side objects, add them to EF's model and then run raw sql that populates them.
Dapper is fast because it doesn't do any of that change tracking, mapping or wiring up of complex object graphs; you do all that manually. Dapper makes a convenient abstraction over raw sql and creates classes, but EF goes much further, and it comes at a cost - EF is highly convenient but much more heavyweight.
I want to calculate the rows of a related table:
MainTable tbl = tblInfo(id);
var count = tbl.Related_Huge_Table_Data.Count();
The problem is that this takes too long (about 20 seconds) to execute, although when I run the equivalent query in SQL Server it executes in under a second. How can I optimize this query in linq? I also tried using a stored procedure, but no luck.
This is the tblInfo method:
public MainTable tblInfo(int id)
{
MyDataContext context = new MyDataContext();
MainTable mt = (from c in context.MainTables
where c.Id == id
select c).SingleOrDefault();
return mt;
}
I used LinqToSql, and the classes were generated by LinqToSql.
By running SingleOrDefault() you execute the query and have to deal with results in memory after that. You need to stay with IQueryable until your query is fully constructed.
The easiest way to answer "how many child records this parent record has" is to approach it from the child side:
using (var dx = new MyDataContext())
{
// If you have an association between the tables defined in the context
int count = dx.Related_Huge_Table_Datas.Where(t => t.MainTable.id == 42).Count();
// If you don't
int count = dx.Related_Huge_Table_Datas.Where(t => t.parent_id == 42).Count();
}
If you insist on the parent side approach, you can do that too:
using (var dx = new MyDataContext())
{
int count = dx.MainTables.Where(t => t.id == 42).SelectMany(t => t.Related_Huge_Table_Datas).Count();
}
If you want to keep a part of this query in a function like tblInfo, you can, but you can't instantiate MyDataContext from inside such function, otherwise you will get an exception when trying to use the query with another instance of MyDataContext. So either pass MyDataContext to tblInfo or make tblInfo a member of partial class MyDataContext:
public static IQueryable<MainTable> tblInfo(MyDataContext dx, int id)
{
return dx.MainTables.Where(t => t.id == id);
}
...
using (var dx = new MyDataContext())
{
int count = tblInfo(dx, 42).SelectMany(t => t.Related_Huge_Table_Datas).Count();
}
Try this
MyDataContext context = new MyDataContext();
var count = context.Related_Huge_Table_Data.Where(o => o.Parentid == id).Count();
//or
int count = context.Database.SqlQuery<int>("select count(1) from Related_Huge_Table_Data where Parentid=" + id).FirstOrDefault();
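If the context is a plain LINQ to SQL DataContext (so there is no Database property), the raw SQL route would be ExecuteQuery instead; a parameterised sketch:
// {0} is sent as a parameter, so the id is not concatenated into the SQL text.
// Assumes ExecuteQuery<int> maps the single count column to an int.
int count = context
    .ExecuteQuery<int>("SELECT COUNT(1) FROM Related_Huge_Table_Data WHERE Parentid = {0}", id)
    .FirstOrDefault();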
If you wish to take full advantage of your SQL database's performance, it may make sense to query it directly rather than use LINQ. It should be reasonably more performant :)
var Related_Huge_Table_Data = "TABLENAME";//Input table name here
var Id = "ID"; //Input Id name here
var connectionString = "user id=USERNAME; password=PASSWORD; server=SERVERNAME; Trusted_Connection=YESORNO; database=DATABASE; connection timeout=30";
SqlCommand sCommand = new SqlCommand();
sCommand.Connection = new SqlConnection(connectionString);
sCommand.CommandType = CommandType.Text;
sCommand.CommandText = $"SELECT COUNT(*) FROM {Related_Huge_Table_Data} WHERE Id={Id}";
sCommand.Connection.Open();
SqlDataReader reader = sCommand.ExecuteReader();
var count = 0;
if (reader.HasRows)
{
    reader.Read();
    count = reader.GetInt32(0);
}
else
{
    Debug.WriteLine("Related_Huge_Table_Data: No Rows returned in Query.");
}
sCommand.Connection.Close();
Try this:
MyDataContext context = new MyDataContext();
var count = context.MainTables.GroupBy(x => x.ID).Distinct().Count();
The answer from GSerg is the correct one in many cases. But when your table starts to be really huge, even a Count(1) directly in SQL Server is slow.
The best way to get around this is to query the database statistics directly, which is not possible with Linq (or at least not that I know of).
The best thing you can do is to create a static method (C#) on your table definition which will return the result of the following query:
SELECT
SUM(st.row_count)
FROM
sys.dm_db_partition_stats st
WHERE
object_name(object_id) = '{TableName}'
AND (index_id < 2)
where {TableName} is the database name of your table.
Beware it's an answer only for the case of counting all records in a table!
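A minimal way to run that from C# (a sketch using plain ADO.NET, since the stats DMV can't be reached through Linq; the connection string is assumed):
using (var conn = new System.Data.SqlClient.SqlConnection(connectionString))
using (var cmd = conn.CreateCommand())
{
    // Sums the row counts recorded in the partition stats instead of scanning the table.
    cmd.CommandText = @"SELECT SUM(st.row_count)
                        FROM sys.dm_db_partition_stats st
                        WHERE object_name(st.object_id) = @TableName
                          AND st.index_id < 2";
    cmd.Parameters.AddWithValue("@TableName", "Related_Huge_Table_Data");
    conn.Open();
    long approximateCount = Convert.ToInt64(cmd.ExecuteScalar());
}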
Is your linq2sql returning the recordset and then doing the .Count() locally, or is it sending SQL to the server to do the count on the server? There will be a big difference in performance there.
Also, have you inspected the SQL that's being generated when you execute the query? From memory, Linq2Sql allows you to inspect SQL (maybe by setting up a logger on your class?). In Entity Framework, you can see it when debugging and inspecting the IQueryable<> object, not sure if there's an equivalent in Linq2Sql.
Way to view SQL executed by LINQ in Visual Studio?
Alternatively, use the SQL Server Profiler (if available), or somehow see what's being executed.
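For LINQ to SQL specifically, the DataContext.Log property will dump the generated SQL - a sketch, assuming the MyDataContext and id from the question:
using (var context = new MyDataContext())
{
    context.Log = Console.Out;   // or a StringWriter / file writer

    var mt = context.MainTables.SingleOrDefault(c => c.Id == id);
    var count = mt.Related_Huge_Table_Data.Count();
    // Each SQL statement LINQ to SQL sends is written to the log, so you can see
    // whether the Count() happens on the server or over rows pulled into memory.
}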
You may try the following:
var c = from rt in context.Related_Huge_Table_Data
        join t in context.MainTables
            on rt.MainTableId equals t.id
        where t.id == id
        select new { rt.id };
var count = c.Distinct().Count();
I've done this before, and I'm drawing a blank on how I did it.
I'm using linq to sql to compare lists from different databases on different servers. I can't combine the queries due to the complexity of the actual query and the huge amount of data this will process.
var query1 = (from u in database1server1
select new
{ PrimaryLine1 = u.PrimaryLine1,
PrimaryLine2 = u.PrimaryLine2,
PrimaryLine3 = u.PrimaryLine3,
}).ToList().ToString();
var query2 = (from m in database2server2
select new
{ PrimaryLine1 = m.PrimaryLine1,
PrimaryLine2 = m.PrimaryLine2,
PrimaryLine3 = m.PrimaryLine3,
}).ToList().ToString();
I need to account for differences between these lists, column by column. So, I need a list of all the PrimaryLine1 values which are in database1server1 but not database2server2... and so on for PrimaryLine2, PrimaryLine3, etc.
How can I build the string dataMismatces so I can generate a neat list for the end user?
string failedTests = null; // in
failedTests = string.Join("\n", dataMismatces);
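A sketch of one way to do the per-column comparison, assuming query1/query2 are kept as lists (i.e. without the .ToString() calls) - I'm not sure this is how you did it before:
// Values of PrimaryLine1 that exist on database1/server1 but not on database2/server2.
var dataMismatces = query1.Select(u => u.PrimaryLine1)
                          .Except(query2.Select(m => m.PrimaryLine1))
                          .Select(v => "PrimaryLine1 missing on server 2: " + v)
                          .ToList();
// Repeat for PrimaryLine2, PrimaryLine3, etc., then:
string failedTests = string.Join("\n", dataMismatces);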
I have a simple linq to sql query and for some reason .Take() doesn't work. I tried adding .Skip() as well, thinking that maybe it needed a starting point from which to take the records, but the results are still the same: rather than taking only 10 records, it takes all 240 records.
Would appreciate if somebody can tell me what's going on. Thanks in advance.
The code is:
var types = (from t in EventTypes.tl_event_types
select new
{
type_id = t.event_type_id,
type_name = t.type_name
}).Take(10);
I'm assuming by the naming conventions that EventTypes is your object. You need to select from your data context, so
var types = (from t in dataContext.EventTypes.tl_event_types
select new
{
type_id = t.event_type_id,
type_name = t.type_name
}).Take(10);
should work.