DataTable Join using LINQ in C# - c#

I am joining two data tables using LINQ this way:
DataTable targetTable = dataTable1.Clone();
var dt2Columns = dataTable2.Columns.OfType<DataColumn>().Select(dc =>
new DataColumn(dc.ColumnName, dc.DataType, dc.Expression, dc.ColumnMapping));
var dt2FinalColumns = from dc in dt2Columns.AsEnumerable()
where targetTable.Columns.Contains(dc.ColumnName) == false
select dc;
targetTable.Columns.AddRange(dt2FinalColumns.ToArray());
var rowData = from row1 in dataTable1.AsEnumerable()
join row2 in dataTable2.AsEnumerable()
on row1.Field<string>("keyCol") equals row2.Field<string>("keyCol")
select row1.ItemArray.Concat(row2.ItemArray.Where(r2 => row1.ItemArray.Contains(r2) == false)).ToArray();
foreach (object[] values in rowData)
targetTable.Rows.Add(values);
I am facing three issues here:
If the two tables do not have the same number of rows, I want to supply a default value (or an empty string) for values that have no match in the other table. How do I achieve this?
If I need to compare multiple columns combined with AND, how is that possible?
What if I have to join multiple tables at run time? Is there any way to generate the LINQ dynamically?

If both tables have the same primary key, DataTable.Merge will work:
dataTable1.Merge(dataTable2, false, MissingSchemaAction.Add);
This merges the schema (columns) of both tables, joins the rows that share the same primary key, and adds the remaining rows.
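When the tables have no shared primary key, the first two issues in the question (default values for unmatched rows, and several key columns combined with AND) can be handled with a hand-written left outer join. The following is only a minimal sketch under stated assumptions: the class name DataTableJoinSketch, the method LeftJoin, and its parameters are hypothetical, the key columns are assumed to exist in both tables, and key values are compared through their string form.

```csharp
using System;
using System.Data;
using System.Linq;

public static class DataTableJoinSketch
{
    // Left outer join of 'left' and 'right' on all 'keyColumns' (combined with AND).
    // Columns of 'right' whose names already exist in 'left' are skipped.
    // Rows of 'left' without a match get 'defaultValue' in the columns taken from 'right'.
    public static DataTable LeftJoin(DataTable left, DataTable right,
                                     string[] keyColumns, object defaultValue)
    {
        DataTable result = left.Clone();

        // Columns that exist only in the right table, in their original order.
        DataColumn[] extraColumns = right.Columns.Cast<DataColumn>()
            .Where(c => !result.Columns.Contains(c.ColumnName))
            .ToArray();
        foreach (DataColumn c in extraColumns)
            result.Columns.Add(c.ColumnName, c.DataType);

        // Composite key built from the string form of each key value.
        // Assumption: the separator character never occurs inside a key value.
        Func<DataRow, string> key = r =>
            string.Join("\u001F", keyColumns.Select(k => Convert.ToString(r[k])));

        ILookup<string, DataRow> rightByKey = right.AsEnumerable().ToLookup(key);

        foreach (DataRow l in left.AsEnumerable())
        {
            // DefaultIfEmpty yields a single null row when nothing matches.
            foreach (DataRow r in rightByKey[key(l)].DefaultIfEmpty())
            {
                object[] extras = extraColumns
                    .Select(c => r == null ? defaultValue : r[c.ColumnName])
                    .ToArray();
                result.Rows.Add(l.ItemArray.Concat(extras).ToArray());
            }
        }
        return result;
    }
}
```

An empty string as defaultValue only works for string columns; DBNull.Value is the safer choice when the right table has numeric or date columns. Because the key columns are passed as an array, the method can be called repeatedly to chain several tables at run time. Note that Clone() also copies a primary key from the left table, so a left row with several matches on the right would violate that key.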

Related

Left join two datatable on multiple columns LINQ

How do I left join two DataTables on a match across multiple columns?
The goal is to find the DataRows in the source table that have no match in the right (destination) table.
This is part of an incremental upload that needs to bring in only the new rows from the source DataTable.
I found a way to do the comparison of two DataTables in C# using a LINQ left join:
IEnumerable<DataRow> result = (from srcDt in dtSource.AsEnumerable()
join dstDt in dtDestination.AsEnumerable()
on new { EmployeeID = srcDt["EmployeeID"], Environment = srcDt["Environment"] } equals new { EmployeeID = dstDt["EmployeeID"], Environment = dstDt["Environment"] }
into g
from row in g.DefaultIfEmpty()
where row == null
select srcDt);
// verify if the result has any rows in the dataset
if (result.Any())
{
DataTable dtInserts = result.CopyToDataTable();
// other code which uses the new datarows to perform inserts
}
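The same "rows in the source with no match in the destination" check can also be written without a join, by collecting the destination keys in a set first. This is a hedged alternative rather than the original poster's method: the class NewRowFinder and the method RowsMissingFrom are hypothetical names, the column names EmployeeID and Environment are taken from the query above, and the key columns are assumed to have the same data types in both tables (a boxed int never equals a boxed long). Value tuples require C# 7 or later.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class NewRowFinder
{
    // Returns the rows of 'source' whose (EmployeeID, Environment) pair
    // does not occur in 'destination'.
    public static List<DataRow> RowsMissingFrom(DataTable source, DataTable destination)
    {
        // Value tuples compare element by element, so they work as composite keys.
        var existingKeys = new HashSet<(object, object)>(
            destination.AsEnumerable()
                       .Select(r => (r["EmployeeID"], r["Environment"])));

        return source.AsEnumerable()
            .Where(r => !existingKeys.Contains((r["EmployeeID"], r["Environment"])))
            .ToList();
    }
}
```

The returned list can be passed to CopyToDataTable() once it is known to be non-empty, exactly as in the Any() check above.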

Using LINQ, how can I get a joined table returned as IEnumerable<DataRow>

I use SQL a lot, and I'm trying to transfer the same joining logic to LINQ queries against a DataSet. The DataSet is a bunch of tables which are pulled from SQL queries further up the line.
I managed to get this join working -
IEnumerable<DataRow> query =
from a in ds.Tables["Names"].AsEnumerable()
join b in ds.Tables["NameHasAffiliate"].AsEnumerable()
on a.Field<int>("PK") equals b.Field<int>("fk_MainTable_PK")
select a;
This doesn't compile:
select a.Field<int>("PK"), a.Field<string>("Name")
and nor does
select new { PK = a.Field<int>("PK"), Name = a.Field<string>("Name") }
It's definitely querying correctly (I see the expected number of rows, with the table a data duplicated), but it returns only the columns of table a, obviously because of select a.
I've tried changing to select new { a, b } and also wrapping the query in () to add a .ToList() at the end, but neither compiles to give me the IEnumerable version of the query for a simple conversion to a table afterwards using
DataTable boundTable = query.CopyToDataTable<DataRow>();
How can I select ALL the columns like this?
Or rather, if I'm joining a lot of tables, how can I specify which columns from each table?
This is because returning anything other than a DataRow will not work in your example, since you are defining the query to be of type IEnumerable<DataRow>. So if you are returning a field or collection of fields, then you need to change the expected return type.
I tried the following and it works just fine.
var ds = new DataSet();
ds.Tables.Add("Names");
ds.Tables["Names"].Columns.Add("PK", typeof(Int32));
ds.Tables["Names"].Columns.Add("Name", typeof(String));
ds.Tables["Names"].Rows.Add("1", "NameValue1");
ds.Tables.Add("NameHasAffiliate");
ds.Tables["NameHasAffiliate"].Columns.Add("fk_MainTable_PK", typeof(Int32));
ds.Tables["NameHasAffiliate"].Columns.Add("AffiliateValue", typeof(String));
ds.Tables["NameHasAffiliate"].Rows.Add("1", "AffiliateValue1");
var query =
ds.Tables["Names"].AsEnumerable()
.Join(
ds.Tables["NameHasAffiliate"].AsEnumerable(),
n => n.Field<int>("PK"), a => a.Field<int>("fk_MainTable_PK"),
(n, a) => new { Key = n.Field<Int32>("PK"), Name = n.Field<String>("Name")})
.ToList();
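If the joined result is still needed as a DataTable, one option is to define the result table by hand and copy the chosen columns of each table into it. The sketch below assumes the sample schema built above (PK, Name, fk_MainTable_PK, AffiliateValue); the class JoinProjection, the method JoinToTable, and the table name "Joined" are hypothetical.

```csharp
using System.Data;
using System.Linq;

public static class JoinProjection
{
    // Joins Names to NameHasAffiliate and copies selected columns
    // from both tables into a new DataTable.
    public static DataTable JoinToTable(DataSet ds)
    {
        var result = new DataTable("Joined");
        result.Columns.Add("PK", typeof(int));
        result.Columns.Add("Name", typeof(string));
        result.Columns.Add("AffiliateValue", typeof(string));

        var rows =
            from a in ds.Tables["Names"].AsEnumerable()
            join b in ds.Tables["NameHasAffiliate"].AsEnumerable()
                on a.Field<int>("PK") equals b.Field<int>("fk_MainTable_PK")
            // One object[] per joined row, in the column order defined above.
            select new object[]
            {
                a.Field<int>("PK"),
                a.Field<string>("Name"),
                b.Field<string>("AffiliateValue")
            };

        foreach (object[] values in rows)
            result.Rows.Add(values);
        return result;
    }
}
```

This avoids CopyToDataTable altogether, so the query no longer has to produce DataRow objects.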

C# Linq Table query to count the non-matching entries

I am quite new to this. I am running two SQL queries and creating two separate data tables from them, DataTable1 and DataTable2.
I then apply some LINQ criteria to DataTable1 and create another data table from the result, DataTable3.
var Query3 = from table1 in DataTable1.AsEnumerable()
where table1.Field<DateTime>("DateTime") <= Yday
where table1.Field<string>("StockCode").Contains("-CA") && !(table1.Field<string>("StockCode").Contains("-CAB")) ||
table1.Field<string>("StockCode").Contains("-CM") ||
table1.Field<string>("StockCode").Contains("-LP")
select table1;
DataTable DataTable3 = Query3.CopyToDataTable();
Now I want to write another query that does the following.
Both data tables have a JobNumber column. I would like to match DataTable3 against DataTable2 and count the rows that have the same JobNumber. Below is what I am doing, but I am not getting the correct count.
int count = (from table3 in DataTable3.AsEnumerable()
join table2 in DataTable2.AsEnumerable() on table2.Field<string>("JobNumber") equals table3.Field<string>("JobNumber")
where table2.Field<string>("JobNumber") == table3.Field<string>("JobNumber")
select table2).Count();
You are joining the two tables and counting the joined rows, so any JobNumber that occurs more than once on either side is counted once per combination (for duplicated keys the join behaves like a Cartesian product). Was that what you intended? Also, the join condition and the where condition are identical, so the where is redundant. It is not clear what you really want to count. Perhaps you wanted to count the rows in DataTable2 whose JobNumber exists in DataTable3:
var jobNumbers = (from r in DataTable3.AsEnumerable()
select r.Field<string>("JobNumber")).ToList();
var count = (from r in DataTable2.AsEnumerable()
where jobNumbers.Contains( r.Field<string>("JobNumber") )
select r).Count();
As a side note, it would be much easier if you used Linq To SQL instead (rather than Linq To DataSet).
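For larger tables, the list lookup can be replaced by a HashSet, which makes each Contains call a constant-time operation on average. The following is a small sketch of that variation, not part of the original answer: JobNumberCounter, CountMatching, and CountNonMatching are hypothetical names, and JobNumber is assumed to be a string column in both tables.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class JobNumberCounter
{
    // Collects the JobNumber values of a table into a set (duplicates collapse).
    private static HashSet<string> JobNumbers(DataTable table) =>
        new HashSet<string>(table.AsEnumerable().Select(r => r.Field<string>("JobNumber")));

    // Number of rows in table2 whose JobNumber also occurs in table3.
    public static int CountMatching(DataTable table2, DataTable table3)
    {
        HashSet<string> known = JobNumbers(table3);
        return table2.AsEnumerable().Count(r => known.Contains(r.Field<string>("JobNumber")));
    }

    // Number of rows in table2 whose JobNumber does not occur in table3.
    public static int CountNonMatching(DataTable table2, DataTable table3)
    {
        HashSet<string> known = JobNumbers(table3);
        return table2.AsEnumerable().Count(r => !known.Contains(r.Field<string>("JobNumber")));
    }
}
```

Each row of table2 is counted at most once, no matter how often its JobNumber is repeated in table3.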

getting non matched values from two Datatable

Is there a direct method for getting the non-matching values from two data tables? I have one DataTable that contains all the groups from Active Directory, and another DataTable that contains all the group names from a SharePoint list. I need the values that do not match when these two DataTables are compared. Please help me, if this is possible.
Thanks in advance.
You could use DataRowComparer to compare the rows.
For instance, to compare the first rows of 2 data tables:
DataRow left = table1.Rows[0];
DataRow right = table2.Rows[0];
IEqualityComparer<DataRow> comparer = DataRowComparer.Default;
bool bEqual = comparer.Equals(left, right);
You can use .Except to do this. (Assuming an ID column)
IEnumerable<int> idsInDataTableA = dataTableA.AsEnumerable().Select(row => (int)row["ID"]);
IEnumerable<int> idsInDataTableB = dataTableB.AsEnumerable().Select(row => (int)row["ID"]);
IEnumerable<int> difference = idsInDataTableA.Except(idsInDataTableB );
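The two suggestions above can be combined: Except accepts an equality comparer, so it can compare group names case-insensitively, or whole rows through DataRowComparer.Default. This is a sketch under assumptions: GroupDiff and its methods are hypothetical names, the name column is assumed to be a string column, and the row-level variant assumes both tables have the same schema.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class GroupDiff
{
    // Values of 'column' that occur in 'a' but not in 'b', ignoring case.
    public static List<string> NamesOnlyInFirst(DataTable a, DataTable b, string column)
    {
        return a.AsEnumerable().Select(r => r.Field<string>(column))
            .Except(b.AsEnumerable().Select(r => r.Field<string>(column)),
                    StringComparer.OrdinalIgnoreCase)
            .ToList();
    }

    // Whole rows of 'a' that have no value-equal row in 'b'.
    public static List<DataRow> RowsOnlyInFirst(DataTable a, DataTable b)
    {
        return a.AsEnumerable()
            .Except(b.AsEnumerable(), DataRowComparer.Default)
            .ToList();
    }
}
```

Calling NamesOnlyInFirst twice with the arguments swapped gives the non-matching names in both directions.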
I want to find the rows of DataTable1 that do not exist in DataTable2
You can use LINQ. Two efficient approaches are Enumerable.Except and Enumerable.Join (used as a LEFT OUTER JOIN), both of which are set-based:
var keyColRows = dt1.AsEnumerable()
.Select(r => r.Field<int>("KeyColumn"))
.Except(dt2.AsEnumerable().Select(r2 => r2.Field<int>("KeyColumn")));
foreach (int inTable2Missing in keyColRows)
Console.WriteLine(inTable2Missing);
or the Join approach selecting the whole DataRow:
var rowsOnlyInDT1 = from r1 in dt1.AsEnumerable()
join r2 in dt2.AsEnumerable()
on r1.Field<int>("KeyColumn") equals r2.Field<int>("KeyColumn") into groupJoin
from subRow in groupJoin.DefaultIfEmpty()
where subRow == null
select r1;
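Here you can use rowsOnlyInDT1.CopyToDataTable() to create a new DataTable from the rows of dt1 that are unique/new, or use foreach to enumerate them. Note that CopyToDataTable throws an InvalidOperationException when the sequence contains no rows, so check with Any() first.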

finding distinct rows in dataset using linq

I am using the query below to find the distinct rows in a dataset, but it does not give me distinct results: the duplicates are not removed, so I cannot get the distinct count.
var distinctRows = (from DataRow dRow in _dsMechanic.Tables[0].Rows
select new { col1 = dRow["colName"] }).Distinct();
This should work:
var distinctRows = (
from DataRow dRow in _dsMechanic.Tables[0].Rows
select dRow["colName"]).
Distinct();
Doing the distinct on an anonymous type is just asking for trouble.
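A few other ways to get distinct data from a DataTable are sketched below, assuming a string column. DistinctSketch and its method names are hypothetical; Field, DataRowComparer.Default, and DataView.ToTable are the framework calls being illustrated.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class DistinctSketch
{
    // Distinct values of one string column, read with a typed accessor.
    public static List<string> DistinctValues(DataTable table, string column) =>
        table.AsEnumerable().Select(r => r.Field<string>(column)).Distinct().ToList();

    // Distinct whole rows, compared column by column.
    public static List<DataRow> DistinctRows(DataTable table) =>
        table.AsEnumerable().Distinct(DataRowComparer.Default).ToList();

    // Distinct values of one column returned as a new DataTable.
    public static DataTable DistinctTable(DataTable table, string column) =>
        table.DefaultView.ToTable(true, column);
}
```

The distinct count is then DistinctValues(table, "colName").Count. If duplicates still appear, the values may differ in case or trailing whitespace, which the default comparison treats as different.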
