I'm struggling with the following problem:
There are 2 DataTables (SSFE and FE in my case).
FE will contain items that match with SSFE, but it will also contain values not present in SSFE.
For Example
SSFE 1,2,3,4,5,6,9,10
FE 1,2,3,4,5,6,7,8,9,10,11
The output I need in this example is: 7, 8, 11.
I'm using the following code to find items that do match:
DataSet set = new DataSet();
// Wrap the tables in a DataSet.
set.Tables.Add(SSFEData);
set.Tables.Add(FEData);
// Create a foreign-key-like join between the two tables.
// Table1 will be the parent. Table2 will be the child.
DataRelation relation = new DataRelation("IdJoin", SSFEData.Columns[0], FEData.Columns[0], false);
// Have the DataSet perform the join.
set.Relations.Add(relation);
// Loop through table1 without using LINQ.
for (int i = 0; i < SSFEData.Rows.Count; i++)
{
    // If any rows in Table2 have the same Id as the current row in Table1
    if (SSFEData.Rows[i].GetChildRows(relation).Length > 0)
    {
        SSFEData.Rows[i]["PackageError"] = SSFEData.Rows[i].GetChildRows(relation)[0][1];
        SSFEData.Rows[i]["SaleError"] = SSFEData.Rows[i].GetChildRows(relation)[0][2];
    }
}
There should be a trick to find the items that do not have a relation.
Any suggestion will be great!
Well, you could of course use a little bit of LINQ by turning the data tables into IEnumerables using the AsEnumerable() [1] extension method.
I am using a few assumptions to illustrate this:
1. "id" is the column with an integer value relating rows in FEData and SSFEData.
2. "id" is the primary key column on both FEData and SSFEData.
Then this will return a list of rows from FEData that are not present in SSFEData:
var notInSSFEData = FEData.AsEnumerable()
    .Where(x => SSFEData.Rows.Find((object)x.Field<int>("id")) == null)
    .ToList();
If assumption 2 above does not hold (i.e. the "id" field is not the primary key), a slightly more elaborate query is required.
var notInSSFEData = FEData.AsEnumerable()
    .Where(x1 => !SSFEData.AsEnumerable().Any(x2 => x2.Field<int>("id") == x1.Field<int>("id")))
    .ToList();
[1] This requires adding a reference to System.Data.DataSetExtensions (in System.Data.DataSetExtensions.dll).
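To try the idea on the sample values from the question, here is a minimal sketch. It assumes each table has a single int column named "id"; the class and helper names (AntiJoinSketch, BuildTable, IdsMissingFrom) are made up for illustration. Instead of scanning SSFEData once per FEData row, it builds a HashSet of the known ids first.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

static class AntiJoinSketch
{
    // Hypothetical helper: builds a one-column table holding the given ids.
    public static DataTable BuildTable(string name, params int[] ids)
    {
        var table = new DataTable(name);
        table.Columns.Add("id", typeof(int));
        foreach (int id in ids)
        {
            table.Rows.Add(id);
        }
        return table;
    }

    // Returns the ids found in fe that have no matching row in ssfe.
    public static List<int> IdsMissingFrom(DataTable ssfe, DataTable fe)
    {
        // Build the lookup once so each FE row is checked in constant time.
        var known = new HashSet<int>(ssfe.AsEnumerable().Select(r => r.Field<int>("id")));
        return fe.AsEnumerable()
                 .Select(r => r.Field<int>("id"))
                 .Where(id => !known.Contains(id))
                 .ToList();
    }
}
```

With SSFE holding 1,2,3,4,5,6,9,10 and FE holding 1 through 11, IdsMissingFrom(ssfe, fe) yields 7, 8, 11.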
I am trying to merge data from two separate queries using C#. The data is located on separate servers or I would just combine the queries. I want to update the data in one of the columns of the first data set with the data in one of the columns of the second data set, joining on a different column.
Here is what I have so far:
ds.Tables[3].Columns[2].ReadOnly = false;
List<object> table = new List<object>();
table = ds.Tables[3].AsEnumerable().Select(r => r[2] = reader.AsEnumerable().Where(s => r[3] == s[0])).ToList();
The ToList() is just for debugging. To summarize, ds.Tables[3].Columns[2] is the column I want to update. ds.Tables[3].Columns[3] contains the key I want to join to.
In the reader, the first column contains the key matching ds.Tables[3].Columns[3] and the second column contains the data with which I want to update ds.Tables[3].Columns[2].
The error I keep getting is
Unable to cast object of type 'WhereEnumerableIterator`1[System.Data.IDataRecord]' to type 'System.IConvertible'. Couldn't store <System.Linq.Enumerable+WhereEnumerableIterator`1[System.Data.IDataRecord]> in Quoting Dealers Column. Expected type is Int32.
Where am I going wrong with my LINQ?
EDIT:
I updated the line where the updating is happening
table = ds.Tables[3].AsEnumerable().Select(r => r[2] = reader.AsEnumerable().First(s => r[3] == s[0])[1]).ToList();
but now I keep getting
Sequence contains no matching element
For the record, the sequence does contain a matching element.
You can use the following sample to achieve the join and update operation. Let's suppose there are two DataTables, tbl1 and tbl2.
The sample joins the two tables on "ID" and takes the value for column "name1" of tbl1 from column "name2" of tbl2. It returns a new table with tbl1's schema rather than modifying tbl1 in place.
public DataTable JoinAndUpdate(DataTable tbl1, DataTable tbl2)
{
    // For demo purposes I have created a clone of tbl1.
    // You can define a custom schema, if needed.
    DataTable dtResult = tbl1.Clone();
    var result = from dataRows1 in tbl1.AsEnumerable()
                 join dataRows2 in tbl2.AsEnumerable()
                 on dataRows1.Field<int>("ID") equals dataRows2.Field<int>("ID") into lj
                 from reader in lj
                 select new object[]
                 {
                     dataRows1.Field<int>("ID"),    // ID from table 1
                     reader.Field<string>("name2"), // Updated column value from table 2
                     dataRows1.Field<int>("age")
                     // .. here comes the rest of the fields from table 1.
                 };
    // Load the results in the table
    result.ToList().ForEach(row => dtResult.LoadDataRow(row, false));
    return dtResult;
}
After considering what @DStanley said about LINQ, I abandoned it and went with a foreach statement. See the code below:
ds.Tables[3].Columns[2].ReadOnly = false;
while (reader.Read())
{
    foreach (DataRow item in ds.Tables[3].Rows)
    {
        if ((Guid)item[3] == reader.GetGuid(0))
        {
            item[2] = reader.GetInt32(1);
        }
    }
}
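The loop above scans every table row once per reader row. If that becomes slow, one possible variant is to read the reader into a dictionary first and then walk the table once. This is only a sketch: the class and method names are made up, and it assumes, as in the code above, that reader column 0 is a Guid key and column 1 an Int32 value.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

static class ReaderUpdateSketch
{
    // Copies values from the reader into target[valueColumn] wherever
    // target[keyColumn] equals the reader's first column.
    // Returns the number of rows that were updated.
    public static int UpdateFromReader(DataTable target, int keyColumn, int valueColumn, IDataReader reader)
    {
        // Read the reader once; it is forward-only and cannot be rewound.
        var lookup = new Dictionary<Guid, int>();
        while (reader.Read())
        {
            // If a key occurs more than once, the last value wins, as in the nested loop.
            lookup[reader.GetGuid(0)] = reader.GetInt32(1);
        }

        target.Columns[valueColumn].ReadOnly = false;
        int updated = 0;
        foreach (DataRow row in target.Rows)
        {
            int value;
            // Skip rows whose key is DBNull or not a Guid.
            if (row[keyColumn] is Guid && lookup.TryGetValue((Guid)row[keyColumn], out value))
            {
                row[valueColumn] = value;
                updated++;
            }
        }
        return updated;
    }
}
```

For the question's data the call would be ReaderUpdateSketch.UpdateFromReader(ds.Tables[3], 3, 2, reader).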
I have a CheckedListbox which contains values from some table called products.
The idea is to check the products that are associated with a customer. It does save correctly in a link table, yet when loading it again, the items that were checked do not get loaded correctly into the CheckedListbox.
So from that link table, I would like to get all values of just one column for the rows that belong to the customer. All tables are already loaded into the application, so I don't want to use SQL.
I've tried using LINQ with no success; Ids is just empty here.
int[] Ids = (from m in dataset.Tables["LinkTable"].AsEnumerable()
where m.Field<int>("customerId") == customerId
select m.Field<int>("productId")).ToArray();
Then, if I do succeed in getting those Ids, I would like to get the row indexes of those primary keys so I can set the correct products to checked.
I've tried doing it like this, but it causes errors in other parts of the program, because I am setting a primary key on a global DataTable. DataGridViews don't like that.
DataColumn[] keyColumns = new DataColumn[1];
keyColumns[0] = dataset.Tables["products"].Columns["Id"];
currentPatient.GetTheDataSet.Tables["products"].PrimaryKey = keyColumns;
foreach (int Id in Ids)
{
    DataRow row = dataset.Tables["Products"].Rows.Find(Id);
    int index = dataset.Tables["Products"].Rows.IndexOf(row);
    clbMedications.SetItemChecked(index, true);
}
I would like to do that last part without specifying a primary key, I couldn't find how to do that in linq.
I know it consists of 2 questions, but perhaps this can be done with just one linq statement so I better combine them.
[EDIT]
Finally, I think I've got what you need:
var qry = (from p in ds.Tables["products"].AsEnumerable()
           select new {
               Id = p.Field<int>("Id"),
               Index = ds.Tables["products"].Rows.IndexOf(p),
               Checked = ds.Tables["LinkTable"].AsEnumerable().Any(x => x.Field<int>("productId") == p.Field<int>("Id") && x.Field<int>("customerId") == customerId)
           }).ToList();
The query above returns a list which you can bind to the CheckedListbox.
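If only the indexes to check are needed, a variant along these lines might do, without setting a primary key on the shared table. It is a sketch under assumptions: the column names "Id", "customerId" and "productId" are taken from the question, the class and method names are made up, and the list box items are assumed to be in the same order as the rows of the products table.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

static class CheckedIndexSketch
{
    // Returns the row indexes in products whose Id is linked to customerId.
    public static List<int> IndexesToCheck(DataTable products, DataTable linkTable, int customerId)
    {
        // Collect the product ids linked to this customer.
        var linkedIds = new HashSet<int>(
            linkTable.AsEnumerable()
                     .Where(l => l.Field<int>("customerId") == customerId)
                     .Select(l => l.Field<int>("productId")));

        // Pair each product row with its position and keep the linked ones.
        return products.AsEnumerable()
                       .Select((p, index) => new { Id = p.Field<int>("Id"), Index = index })
                       .Where(x => linkedIds.Contains(x.Id))
                       .Select(x => x.Index)
                       .ToList();
    }
}
```

Each returned index can then be passed to clbMedications.SetItemChecked(index, true).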
I'm trying to delete from the DataTable AllItems every row that also appears in the DataTable Items. The purpose is to get the items from AllItems that are not in Items.
All these rows are filled from the same Excel file, which contains several columns, so matching rows are identical.
I have tried using a foreach loop:
foreach (DataRow dr in AllItems.Rows)
{
    if (Items.Contains(dr))
    {
        AllItems.Rows.Remove(dr);
    }
}
But I get the following error: Table doesn't have primary key.
Does anyone know how I can delete these rows?
You have a few choices here:
1. Add a Primary Key
You can add a primary key to your data table when creating it.
Assuming you have a column called "Id", you would do it this way:
AllItems.PrimaryKey = new DataColumn[] { AllItems.Columns["Id"] };
Or, for cases where your primary key is a composite key (multiple columns):
AllItems.PrimaryKey = new DataColumn[] {
    AllItems.Columns["Id"],
    AllItems.Columns["Name"] };
This then allows Contains to work, since it looks rows up by primary key value; set the key on whichever table you call Rows.Contains on.
2. Use a DataView
You can use a DataView to filter out the distinct rows;
DataView view = new DataView(AllItems);
DataTable distinctValues = view.ToTable(true, "Column1", "Column2" , ..., "ColumnN");
3. Find Matching Rows using Select
Or you can rely on the Select method to test whether a corresponding row exists in the Items DataTable, using a filter expression that works like a SQL WHERE clause:
List<DataRow> rowsToRemove = new List<DataRow>();
foreach (DataRow allItemRow in AllItems.Rows)
{
    // A non-empty result means this row also exists in Items, so it should be removed.
    if (Items.Select(String.Format("Id = {0}", allItemRow.Field<Int32>("Id"))).Length > 0)
    {
        rowsToRemove.Add(allItemRow);
    }
}
rowsToRemove.ForEach(x => x.Delete());
AllItems.AcceptChanges();
Note that it's important NOT to remove rows while you are iterating the collection of Rows in AllItems - instead, collect these rows, and remove them afterwards.
4. Filter on the way in
Also note, and I haven't tried it, but, depending on how you are selecting the rows out of Excel, you may be able to use the SQL DISTINCT clause; if you are using ODBC to load data from Excel then you could try filtering at source.
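To illustrate the first option together with the collect-then-remove advice, here is a minimal sketch. It assumes both tables have an "Id" column that is unique and non-null in Items (setting the primary key throws otherwise); the class and method names are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

static class RemoveMatchesSketch
{
    // Removes from allItems every row whose Id also exists in items.
    // Returns the number of rows removed.
    public static int RemoveRowsPresentIn(DataTable allItems, DataTable items)
    {
        // Rows.Contains looks rows up by primary key, so items needs one.
        items.PrimaryKey = new[] { items.Columns["Id"] };

        // Collect the matches first; removing while enumerating Rows would throw.
        List<DataRow> rowsToRemove = allItems.Rows.Cast<DataRow>()
            .Where(row => items.Rows.Contains(row["Id"]))
            .ToList();

        foreach (DataRow row in rowsToRemove)
        {
            allItems.Rows.Remove(row);
        }
        return rowsToRemove.Count;
    }
}
```

After the call, allItems holds only the rows that are not in items.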
You may try this:
var exceptItems = AllItems.Rows.Cast<DataRow>()
.Except(Items.Rows.Cast<DataRow>(), DataRowComparer.Default)
.ToList();
As an alternative, if you want to keep working with the AllItems data table after removing the Items rows from it, you may try this (assuming that you have the column Id in both data tables, which uniquely identifies a row per data table):
// Match on Id only; comparing the row index as well would miss rows
// that sit at different positions in the two tables.
var itemIds = new HashSet<object>(Items.Rows.Cast<DataRow>().Select(r => r["Id"]));
var matchingRows = AllItems.Rows.Cast<DataRow>()
    .Select((row, index) => new { id = row["Id"], index })
    .Where(x => itemIds.Contains(x.id))
    .ToList();
// Remove from the end so the remaining indexes stay valid.
for (int i = matchingRows.Count - 1; i >= 0; i--)
{
    AllItems.Rows.RemoveAt(matchingRows[i].index);
}
Here's a more compact arrangement of the last example above, again matching on Id only:
var itemIds = new HashSet<object>(Items.Rows.Cast<DataRow>().Select(r => r["Id"]));
AllItems.Rows.Cast<DataRow>()
    .Select((row, index) => new { id = row["Id"], index })
    .Where(x => itemIds.Contains(x.id))
    .OrderByDescending(x => x.index)
    .ToList()
    .ForEach(x => AllItems.Rows.RemoveAt(x.index));
I'm stumped on this one.
I'm trying to merge two DataTables into one. Preferably I would use LINQ to perform this task, but the problem is that I need to add the join conditions dynamically. The data for each table comes from two different calls to stored procedures, and which calls are used can be switched. The results can therefore vary in the number of columns and in which primary keys are available.
The goal is to replace regular strings in the first result set with a second database that can contain unicode (but only if it contains a value for that specific combination of primary keys).
My linq query would look like this:
var joined = (from DataRow reg in dt1.Rows
join DataRow uni in dt2.Rows
on new { prim1 = reg.ItemArray[0], prim2 = reg.ItemArray[1] }
equals new { prim1 = uni.ItemArray[0], prim2 = uni.ItemArray[1] }
select new
{
prim1 = reg.ItemArray[0],
prim2 = reg.ItemArray[1],
value1 = reg.ItemArray[4],
value2 = uni.ItemArray[3] ?? reg.ItemArray[3]
}
);
This works perfectly for what I want, but as I said I need to be able to define which columns in each table are primary keys, so this:
join DataRow uni in dt2.Rows
on new { prim1 = reg.ItemArray[0], prim2 = reg.ItemArray[1] }
equals new { prim1 = uni.ItemArray[0], prim2 = uni.ItemArray[1] }
needs to be replaced by something like creating a DataRelation between the tables or before performing the linq adding the primary keys dynamically.
Also, I need to make the select something like SQL's * instead of specifying each column, as I do not know the number of columns in the first result set.
I've also tried joining the tables by adding primary keys and doing a merge, but how do I then choose which column in dt2 to overwrite which one in dt1?
DataTable join = new DataTable("joined");
join = dt1.Copy();
join.Merge(dt2, false, MissingSchemaAction.Add);
join.AcceptChanges();
I'm using VS2012.
I ended up using a very simple approach, which doesn't involve creating primary key relations or joins at all. I'm sure there are more elegant or better-performing ways of solving the problem.
Basically I've adapted the solution from "Linq dynamically adding where conditions": instead of joining, I dynamically add .Where clauses.
That way I can loop through the rows and compare for each dynamically added primary key:
foreach (DataRow regRow in dt1.Rows)
{
    // Select all rows in the second result set.
    var uniRows = (from DataRow uniRow in dt2.Rows select uniRow);
    // Add where clauses as needed. object.Equals compares the values;
    // == on two objects would only compare references.
    if (firstCondition) { uniRows = uniRows.Where(x => object.Equals(x["SalesChannel"], "001")); }
    else if (secondCondition) { uniRows = uniRows.Where(x => object.Equals(x["Language"], "SV")); }
    else if (thirdCondition) { uniRows = uniRows.Where(x => object.Equals(x["ArticleNo"], "242356")); }
    // etc...
}
Each row gets compared to a diminishing list of rows in the second result set.
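One way this could be generalized is to pass the key column names in as a list and add one .Where clause per key. The following is a sketch, not the code actually used: the class and method names are made up, and it assumes the key columns and the value column have the same names and compatible types in both tables.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

static class DynamicKeySketch
{
    // For each row in dt1, finds the first row in dt2 that matches on all
    // keyColumns and copies valueColumn from it when it holds a value.
    public static void Overwrite(DataTable dt1, DataTable dt2, IList<string> keyColumns, string valueColumn)
    {
        foreach (DataRow regRow in dt1.Rows)
        {
            IEnumerable<DataRow> uniRows = dt2.Rows.Cast<DataRow>();
            foreach (string key in keyColumns)
            {
                string column = key; // local copy so each lambda captures its own column name
                uniRows = uniRows.Where(x => object.Equals(x[column], regRow[column]));
            }

            // The chained filters are evaluated here, inside the current iteration.
            DataRow match = uniRows.FirstOrDefault();
            if (match != null && !match.IsNull(valueColumn))
            {
                regRow[valueColumn] = match[valueColumn];
            }
        }
    }
}
```

A call could look like DynamicKeySketch.Overwrite(dt1, dt2, new[] { "SalesChannel", "Language" }, "Text"), where "Text" stands for whichever column should be overwritten.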
I've a datatable which has a single text column 'Title' which can have multiple values with duplicates. I can remove the duplicates using a dataview.
DataView v = new DataView(tempTable);
tempTable = v.ToTable(true, "Title");
But how can I get the number of duplicates for each distinct value without any looping?
If you don't want to loop or use LINQ, there is no direct way to do that. You can, however, use a computed column on the data table, provided one more condition holds: the data has to be in two related tables, like this.
DataRelation rel = new DataRelation("CustToOrders", data.Tables["Customers"].Columns["customerid"], data.Tables["Orders"].Columns["customerid"]);
data.Relations.Add(rel);
Because customerid is a foreign key in the Orders table, it can contain duplicates there.
You can get the count of the duplicates this way:
data.Tables["Customers"].Columns.Add("Duplicates",
    typeof(decimal), "Count(child.customerid)");
The way I would get the results that you want would look something like this:
tempTable.Rows.Cast<DataRow>()
.Select(dr => Convert.ToString(dr[0]))
.GroupBy(dr => dr)
.Select(g => new { Title = g.Key, Count = g.Count() });
However, it's actually looping under the hood. In fact, I can't think of a way to do that kind of a grouping without inspecting each record.
The drawback is that the result of that expression is a sequence of anonymous type instances. If you still want the result to be a DataView, you could rewrite the last Select to create a new DataRow with two columns, and shove them into a new DataTable which you pass to the DataView.
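The rewrite described in the last paragraph could look roughly like this. It is a sketch that assumes the column is named "Title" as in the question; the class, method, and result table names are made up. Note that Count here is the number of occurrences of each title, so the number of extra copies is Count minus one.

```csharp
using System;
using System.Data;
using System.Linq;

static class TitleCountSketch
{
    // Builds a two-column table (Title, Count) with one row per distinct title.
    public static DataTable CountTitles(DataTable tempTable)
    {
        var counts = new DataTable("TitleCounts");
        counts.Columns.Add("Title", typeof(string));
        counts.Columns.Add("Count", typeof(int));

        var groups = tempTable.Rows.Cast<DataRow>()
            .Select(dr => Convert.ToString(dr["Title"]))
            .GroupBy(title => title);

        // This still visits every row once; the grouping only hides the loop.
        foreach (var g in groups)
        {
            counts.Rows.Add(g.Key, g.Count());
        }
        return counts;
    }
}
```

The result can be wrapped in a view with new DataView(TitleCountSketch.CountTitles(tempTable)).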