I am trying to merge data from two separate queries using C#. The data is located on separate servers or I would just combine the queries. I want to update the data in one of the columns of the first data set with the data in one of the columns of the second data set, joining on a different column.
Here is what I have so far:
ds.Tables[3].Columns[2].ReadOnly = false;
List<object> table = new List<object>();
table = ds.Tables[3].AsEnumerable().Select(r => r[2] = reader.AsEnumerable().Where(s => r[3] == s[0])).ToList();
The ToList() is just for debugging. To summarize, column 2 of ds.Tables[3] (r[2]) is the column I want to update, and column 3 (r[3]) contains the key I want to join on.
In the reader, the first column contains the key that matches column 3 of ds.Tables[3], and the second column contains the data with which I want to update column 2.
The error I keep getting is
Unable to cast object of type 'WhereEnumerableIterator`1[System.Data.IDataRecord]' to type 'System.IConvertible'. Couldn't store <System.Linq.Enumerable+WhereEnumerableIterator`1[System.Data.IDataRecord]> in Quoting Dealers Column. Expected type is Int32.
Where am I going wrong with my LINQ?
EDIT:
I updated the line where the updating is happening
table = ds.Tables[3].AsEnumerable().Select(r => r[2] = reader.AsEnumerable().First(s => r[3] == s[0])[1]).ToList();
but now I keep getting
Sequence contains no matching element
For the record, the sequence does contain a matching element.
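A likely cause, though it is not spelled out above: r[3] and s[0] are both typed as object, so == compiles to a reference comparison, which never matches two separately boxed values such as a Guid. A minimal sketch of a value-based comparison, changing only the predicate:
table = ds.Tables[3].AsEnumerable()
    .Select(r => r[2] = reader.AsEnumerable().First(s => r[3].Equals(s[0]))[1])
    .ToList();
Note also that a forward-only data reader can only be enumerated once, which is one reason the foreach approach in the accepted answer below is more robust.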
You can use the following sample to achieve the join and update operation. Suppose there are two DataTables, tbl1 and tbl2.
Joining two tables and updating the value of column "name1" of tbl1 from column "name2" of tbl2.
public DataTable JoinAndUpdate(DataTable tbl1, DataTable tbl2)
{
    // For demo purposes I have created a clone of tbl1.
    // You can define a custom schema, if needed.
    DataTable dtResult = tbl1.Clone();

    var result = from dataRows1 in tbl1.AsEnumerable()
                 join dataRows2 in tbl2.AsEnumerable()
                 on dataRows1.Field<int>("ID") equals dataRows2.Field<int>("ID") into lj
                 from reader in lj
                 select new object[]
                 {
                     dataRows1.Field<int>("ID"),    // ID from table 1
                     reader.Field<string>("name2"), // Updated column value from table 2
                     dataRows1.Field<int>("age")
                     // .. here comes the rest of the fields from table 1.
                 };

    // Load the results into the table
    result.ToList().ForEach(row => dtResult.LoadDataRow(row, false));
    return dtResult;
}
The result contains the tbl1 rows that have a matching ID in tbl2, with the name1 value replaced by the corresponding name2 value.
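For reference, a minimal usage sketch; the column names (ID, name1, age, name2) come from the query above, while the sample values are assumptions, as the original example tables are not reproduced here:
DataTable tbl1 = new DataTable("tbl1");
tbl1.Columns.Add("ID", typeof(int));
tbl1.Columns.Add("name1", typeof(string));
tbl1.Columns.Add("age", typeof(int));
tbl1.Rows.Add(1, "old name", 30);

DataTable tbl2 = new DataTable("tbl2");
tbl2.Columns.Add("ID", typeof(int));
tbl2.Columns.Add("name2", typeof(string));
tbl2.Rows.Add(1, "new name");

DataTable updated = JoinAndUpdate(tbl1, tbl2);
// updated now contains one row: 1, "new name", 30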
After considering what @DStanley said about LINQ, I abandoned it and went with a foreach statement. See the code below:
ds.Tables[3].Columns[2].ReadOnly = false;
while (reader.Read())
{
    foreach (DataRow item in ds.Tables[3].Rows)
    {
        if ((Guid)item[3] == reader.GetGuid(0))
        {
            item[2] = reader.GetInt32(1);
        }
    }
}
I have a DataSet and read data from two sources into it: one table from an XML file and another table from a Firebird SQL database. What I am trying to get is a single table that has all the columns from the XML file AND a few fields from the SQL data. Both tables contain the same unique key field, so they should merge easily. I would like to bind all the fields of this combined table to fields on a form.
Is this possible as described, or am I missing a simpler solution to my problem?
Edit:
To show what I am trying to do, here is a bit of extra code.
DataSet dataSet = new DataSet();
DataTable table1 = new DataTable("test1", "test1");
table1.Columns.Add("id");
table1.Columns.Add("name");
table1.Columns[0].Unique = true;
table1.Rows.Add(new object[2] { 1, "name1" });
table1.Rows.Add(new object[2] { 2, "name2" });
DataTable table2 = new DataTable("test2", "test2");
table2.Columns.Add("id");
table2.Columns.Add("thing");
table2.Columns[0].Unique = true;
table2.Rows.Add(new object[2] { 1, "thing1" });
table2.Rows.Add(new object[2] { 2, "thing2" });
dataSet.Tables.Add(table1);
dataSet.Tables[0].Merge(table2, false);
When I run this code I get a ConstraintException. When I remove the unique constraint on the id fields, the result has all the needed columns, but one row holds the data from table1 and a separate row holds the data from table2. How can I merge them into one row per id?
Edit 2:
I tried to use the PrimaryKey solution as follows in live data.
xmlData.Tables[0].PrimaryKey = new[] { xmlData.Tables[0].Columns["usnr"] };
dbData.PrimaryKey = new[] { dbData.Columns["usid"] };
xmlData.Tables[0].Merge(dbData, true, MissingSchemaAction.Add);
xmlData is a DataSet that comes from an XML file. It has id, usnr and a few other fields in it. dbData is a DataTable with data from the database; it has id, usid, name and a few other fields. The id fields are not relevant to my data. Both usnr and usid are strings in their tables, as I verified with GetType().
When I now add xmlData.Tables[0].Merge(dbData, true, MissingSchemaAction.Add); it throws a DataException
<target>.ID and <source>.ID have conflicting properties: DataType property mismatch.
While writing this I realized that the id fields were of different types in the two tables, but I don't need them anyway, so I removed that column before setting the PrimaryKey entries and merging. Now I get a NullReferenceException with no further information in it. The tables are all fine and have data in them, so where could the exception come from now?
Instead of ...
table1.Columns[0].Unique = true;
table2.Columns[0].Unique = true;
... add these lines:
table1.PrimaryKey = new[] { table1.Columns[0] };
table2.PrimaryKey = new[] { table2.Columns[0] };
The Merge command needs to know the primary keys of the data tables; just indicating which columns are unique is not enough. With the primary keys set, the output will be:
id name thing
== ===== ======
1 name1 thing1
2 name2 thing2
Note that for this to work properly, the primary key fields must have matching data types and names. If the data types don't match, you get a decent error message. However, if the names don't match, a nondescript null reference exception is thrown. Microsoft could have done a better job there.
That means that in your case, I'd recommend renaming either usnr or usid before merging the data tables.
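A minimal sketch of that rename, assuming the column names from the question (usnr in the XML table, usid in the database table) and renaming the database column to match:
// Rename the key column so both tables use the same name before merging.
dbData.Columns["usid"].ColumnName = "usnr";

xmlData.Tables[0].PrimaryKey = new[] { xmlData.Tables[0].Columns["usnr"] };
dbData.PrimaryKey = new[] { dbData.Columns["usnr"] };

xmlData.Tables[0].Merge(dbData, true, MissingSchemaAction.Add);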
You can use Linq for this purpose and join your two DataTables like this:
.........
.........
dataSet.Tables.Add(table1);
//dataSet.Tables[0].Merge(table2, false);
var collection = from t1 in dataSet.Tables[0].AsEnumerable()
                 join t2 in table2.AsEnumerable()
                 on t1["id"] equals t2["id"]
                 select new
                 {
                     ID = t1["id"],
                     Name = t1["name"],
                     Thing = t2["thing"]
                 };

DataTable result = new DataTable("Result");
result.Columns.Add("ID", typeof(string));
result.Columns.Add("Name", typeof(string));
result.Columns.Add("Thing", typeof(string));

foreach (var item in collection)
{
    result.Rows.Add(item.ID, item.Name, item.Thing);
}
Bound to a DataGridView, the result will be what you want:
dataGridView1.DataSource = result;
You cannot simply merge those two data tables together here; you need to combine the data by iterating over each of them.
Create a new data table containing all the columns (Id, Name, Thing), then populate it by reading the other two.
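A rough sketch of what that describes, assuming the id/name/thing columns from the question and matching rows by id (requires System.Linq; this is my illustration, not code from the original answer):
DataTable combined = new DataTable("Combined");
combined.Columns.Add("id");
combined.Columns.Add("name");
combined.Columns.Add("thing");

foreach (DataRow r1 in table1.Rows)
{
    // Find the matching row in table2 by id and copy its "thing" value.
    DataRow match = table2.Rows.Cast<DataRow>()
        .FirstOrDefault(r2 => Equals(r2["id"], r1["id"]));
    combined.Rows.Add(r1["id"], r1["name"], match != null ? match["thing"] : null);
}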
I'm struggling with the following problem:
There are 2 DataTables (SSFE and FE in my case).
FE will contain items that match with SSFE, but it will also contain values not present in SSFE.
For Example
SSFE 1,2,3,4,5,6,9,10
FE 1,2,3,4,5,6,7,8,9,10,11
The output I need in this example is: 7, 8, 11.
I'm using the following code to find items that do match:
DataSet set = new DataSet();
//wrap the tables in a DataSet.
set.Tables.Add(SSFEData);
set.Tables.Add(FEData);

//Creates a ForeignKey-like join between the two tables.
//Table1 will be the parent. Table2 will be the child.
DataRelation relation = new DataRelation("IdJoin", SSFEData.Columns[0], FEData.Columns[0], false);

//Have the DataSet perform the join.
set.Relations.Add(relation);

//Loop through table1 without using LINQ.
for (int i = 0; i < SSFEData.Rows.Count; i++)
{
    //If any rows in Table2 have the same Id as the current row in Table1
    if (SSFEData.Rows[i].GetChildRows(relation).Length > 0)
    {
        SSFEData.Rows[i]["PackageError"] = SSFEData.Rows[i].GetChildRows(relation)[0][1];
        SSFEData.Rows[i]["SaleError"] = SSFEData.Rows[i].GetChildRows(relation)[0][2];
    }
}
There should be a trick to find the items that do not have a relation.
Any suggestion would be great!
Well, you could of course use a little bit of LINQ by turning the data tables into IEnumerables using the AsEnumerable() extension method (see the note at the end of this answer).
I am using a few assumptions to illustrate this:
"id" is the column with an integer value relating rows in FEData and SSFEData.
"id" is the primary key column on both FEData and SSFEData.
Then this will return a list of rows from FEData that are not present in SSFEData:
var notInSSFEData = FEData.AsEnumerable()
.Where(x => SSFEData.Rows.Find((object)x.Field<int>("id")) == null)
.ToList();
If assumption 2 above does not hold (i.e. the "id" field is not the primary key), a slightly more elaborate query is required.
var notInSSFEData = FEData.AsEnumerable()
.Where(x1 => !SSFEData.AsEnumerable().Any(x2 => x2.Field<int>("id") == x1.Field<int>("id")))
.ToList();
Note: this requires adding a reference to System.Data.DataSetExtensions (in System.Data.DataSetExtensions.dll).
I'm stumped on this one.
I'm trying to merge two DataTables into one. Preferably I would use LINQ to perform this task, but the problem is that I need to add the conditions for the join dynamically. The data for each table comes from two different stored procedure calls, and which calls are used can be switched. The results can therefore vary in the number of columns and in which primary keys are available.
The goal is to replace plain strings in the first result set with values from a second result set that can contain Unicode (but only if the second set contains a value for that specific combination of primary keys).
My linq query would look like this:
var joined = (from DataRow reg in dt1.Rows
              join DataRow uni in dt2.Rows
              on new { prim1 = reg.ItemArray[0], prim2 = reg.ItemArray[1] }
              equals new { prim1 = uni.ItemArray[0], prim2 = uni.ItemArray[1] }
              select new
              {
                  prim1 = reg.ItemArray[0],
                  prim2 = reg.ItemArray[1],
                  value1 = reg.ItemArray[4],
                  value2 = uni.ItemArray[3] ?? reg.ItemArray[3]
              });
This works perfectly for what I want, but as I said I need to be able to define which columns in each table are primary keys, so this:
join DataRow uni in dt2.Rows
on new { prim1 = reg.ItemArray[0], prim2 = reg.ItemArray[1] }
equals new { prim1 = uni.ItemArray[0], prim2 = uni.ItemArray[1] }
needs to be replaced by something like creating a DataRelation between the tables, or by adding the primary keys dynamically before running the LINQ query.
ALSO, I need to make the select behave something like SQL's *, instead of specifying each column, as I do not know the number of columns in the first result set.
I've also tried joining the tables by adding primary keys and doing a merge, but how do I then choose which column in dt2 to overwrite which one in dt1?
DataTable join = new DataTable("joined");
join = dt1.Copy();
join.Merge(dt2, false, MissingSchemaAction.Add);
join.AcceptChanges();
I'm using VS2012.
I ended up using a very simple approach, which doesn't involve creating primary key relations or joins at all. I'm sure there are more elegant or performance effective ways of solving the problem.
Basically I've adapted the solution from "Linq dynamically adding where conditions", where instead of joining I dynamically add .Where clauses.
That way I can loop through the rows and compare for each dynamically added primary key:
foreach (DataRow regRow in dt1.Rows)
{
    // Select all rows in the second result set
    var uniRows = (from DataRow uniRow in dt2.Rows select uniRow);

    // Add where clauses as needed
    if (firstCondition) { uniRows = uniRows.Where(x => x.Field<string>("SalesChannel") == "001"); }
    else if (secondCondition) { uniRows = uniRows.Where(x => x.Field<string>("Language") == "SV"); }
    else if (thirdCondition) { uniRows = uniRows.Where(x => x.Field<string>("ArticleNo") == "242356"); }
    // etc...
}
Each row then gets compared against a progressively narrowed list of rows from the second result set.
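The snippet above stops short of the actual overwrite. A sketch of how the matched row could be copied back, assuming a hypothetical column name "Description" for the value to replace (the real column names are not given in the answer):
// Inside the foreach, after the Where clauses have been applied:
DataRow match = uniRows.FirstOrDefault();
if (match != null && !match.IsNull("Description"))
{
    // Overwrite the plain string with the Unicode value, if one exists.
    regRow["Description"] = match["Description"];
}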
I have two tables:
tbl_ClassFac:
    ClassFacNo (Primary Key),
    FacultyID,
    ClassID
tbl_EmpClassFac:
    EmpID (Primary Key),
    DateImplement (Primary Key),
    ClassFacNo
I want to know all the employees who are on a specific ClassFacNo, i.e. all EmpID values with a specific ClassFacNo... What I do is first search tbl_EmpClassFac with the EmpID supplied by the user and store those data rows. Then I use the ClassFacNo from those rows to search through tbl_ClassFac.
The following is my code.
empRowsCF = ClassFacDS.Tables["EmpClassFac"].Select("EmpID='" + txt_SearchValueCF.Text + "'");
int maxempRowsCF = empRowsCF.Length;
if (maxempRowsCF > 0)
{
    foundempDT = ClassFacDS.Tables["ClassFac"].Clone();
    foreach (DataRow dRow in empRowsCF)
    {
        returnedRowsCF = ClassFacDS.Tables["ClassFac"].Select("ClassFacNo='" + dRow[2].ToString() + "'");
        foundempDT.ImportRow(returnedRowsCF[0]);
    }
}
dataGrid_CF.DataSource = null;
dataGrid_CF.DataSource = foundempDT.DefaultView;
returnedRowsCF = foundempDT.Rows; // <-- problem line: Rows is a DataRowCollection, not a DataRow[]; needed so NavigateRecordsCF can be used
NavigateRecordsCF("F"); // function to display data in textboxes (no importance here)
I know the code is not very good, but that is all I can think of. If anyone has any suggestions, please tell me. If not, tell me: how do I copy all the rows in a DataTable to a DataRow array?
"How to copy all the rows in a datatable to a datarow array?"
If that helps, use the overload of Select without a parameter
DataRow[] rows = table.Select();
DataTable.Select()
Gets an array of all DataRow objects.
As for the rest of your question: it's actually not clear what the question is.
But I assume you want to filter the first table by the value of a field in the second (related) table. You can use this concise LINQ-to-DataSet query:
var rows = from cfrow in tbl_ClassFac.AsEnumerable()
           join ecfRow in tbl_EmpClassFac.AsEnumerable()
           on cfrow.Field<int>("ClassFacNo") equals ecfRow.Field<int>("ClassFacNo")
           where ecfRow.Field<int>("EmpId") == EmpId
           select cfrow;

// if you want a new DataTable from the filtered tbl_ClassFac DataRows:
var tblResult = rows.CopyToDataTable();
Note that CopyToDataTable throws an exception if the sequence of DataRows is empty, i.e. if the filter didn't return any rows. You can avoid that in this way:
var tblResult = rows.Any() ? rows.CopyToDataTable() : tbl_ClassFac.Clone(); // empty table with same columns as source table
I have a DataTable with a single text column 'Title' which can have multiple values with duplicates. I can remove the duplicates using a DataView.
DataView v = new DataView(tempTable);
tempTable = v.ToTable(true, "Title");
But how can I get the number of duplicates for each distinct value without any looping?
If you don't want to loop or use LINQ, there is no direct way to do that, but you can use a computed column on the data table, with one more condition if it applies to you: the data should be in two related tables, like this.
DataRelation rel = new DataRelation("CustToOrders", data.Tables["Customers"].Columns["customerid"], data.Tables["Orders"].Columns["customerid"]);
data.Relations.Add(rel);
Given that the customerid field is a foreign key in the Orders table, it can have duplicates.
You can get the count of the duplicates this way:
data.Tables["Customers"].Columns.Add("Duplicates",
GetType(Decimal), "Count(child.customerid)");
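A short usage sketch under the same assumptions (Customers and Orders related on customerid, as set up above); the computed column exposes, for each customer row, how many child Orders rows reference it:
// Each Customers row now carries the count of its related Orders rows.
foreach (DataRow cust in data.Tables["Customers"].Rows)
{
    Console.WriteLine("{0}: {1} order(s)", cust["customerid"], cust["Duplicates"]);
}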
The way I would get the results that you want would look something like this:
var counts = tempTable.Rows.Cast<DataRow>()
    .Select(dr => Convert.ToString(dr[0]))
    .GroupBy(title => title)
    .Select(g => new { Title = g.Key, Count = g.Count() });
However, it's actually looping under the hood. In fact, I can't think of a way to do that kind of a grouping without inspecting each record.
The drawback is that the result of that expression is a sequence of anonymous type instances. If you still want the result to be a DataView, you could rewrite the last Select to create a new DataRow with two columns, and shove them into a new DataTable which you pass to the DataView.
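A rough sketch of that last step, assuming the grouped query above is stored in a variable named counts (my naming, not part of the original answer):
// Build a two-column DataTable from the grouped results and wrap it in a DataView.
DataTable countTable = new DataTable("TitleCounts");
countTable.Columns.Add("Title", typeof(string));
countTable.Columns.Add("Count", typeof(int));

foreach (var item in counts)
{
    countTable.Rows.Add(item.Title, item.Count);
}

DataView countView = new DataView(countTable);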