Linq merging DataTable with dynamically added primary keys - c#

I'm stumped on this one.
I'm trying to merge two DataTables into one. Preferably I would use LINQ to perform this task, but the problem is that I need to add the join conditions dynamically. The data for each table comes from two different stored procedure calls, and which calls are used can be switched. The results can therefore vary in the number of columns and in which primary keys are available.
The goal is to replace regular strings in the first result set with strings from a second database that can contain Unicode (but only if it contains a value for that specific combination of primary keys).
My LINQ query would look like this:
var joined = (from DataRow reg in dt1.Rows
              join DataRow uni in dt2.Rows
              on new { prim1 = reg.ItemArray[0], prim2 = reg.ItemArray[1] }
              equals new { prim1 = uni.ItemArray[0], prim2 = uni.ItemArray[1] }
              select new
              {
                  prim1 = reg.ItemArray[0],
                  prim2 = reg.ItemArray[1],
                  value1 = reg.ItemArray[4],
                  value2 = uni.ItemArray[3] ?? reg.ItemArray[3]
              }
             );
This works perfectly for what I want, but as I said I need to be able to define which columns in each table are primary keys, so this:
join DataRow uni in dt2.Rows
on new { prim1 = reg.ItemArray[0], prim2 = reg.ItemArray[1] }
equals new { prim1 = uni.ItemArray[0], prim2 = uni.ItemArray[1] }
needs to be replaced by something like creating a DataRelation between the tables, or by adding the primary keys dynamically before running the LINQ query.
Also, I need the select to behave like SQL's * instead of specifying each column, as I do not know the number of columns in the first result set.
I've also tried joining the tables by adding primary keys and doing a merge, but how do I then choose which column in dt2 to overwrite which one in dt1?
DataTable join = new DataTable("joined");
join = dt1.Copy();
join.Merge(dt2, false, MissingSchemaAction.Add);
join.AcceptChanges();
I'm using VS2012.

I ended up using a very simple approach, which doesn't involve creating primary key relations or joins at all. I'm sure there are more elegant or better-performing ways of solving the problem.
Basically I've adapted the solution in "Linq dynamically adding where conditions": instead of joining, I dynamically add .Where clauses.
That way I can loop through the rows and compare on each dynamically added primary key:
foreach (DataRow regRow in dt1.Rows)
{
    // Select all rows in the second result set
    var uniRows = (from DataRow uniRow in dt2.Rows select uniRow);
    // Add where clauses as needed.
    // Equals compares values; == on the object returned by the indexer would compare references.
    if (firstCondition) { uniRows = uniRows.Where(x => "001".Equals(x["SalesChannel"])); }
    else if (secondCondition) { uniRows = uniRows.Where(x => "SV".Equals(x["Language"])); }
    else if (thirdCondition) { uniRows = uniRows.Where(x => "242356".Equals(x["ArticleNo"])); }
    // etc...
}
Each row gets compared to a diminishing list of rows in the second result set.
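For comparison, here is a minimal sketch of a dictionary-based alternative that takes the key columns as a runtime list and keeps every column of the first table (the "SELECT *" behaviour asked for). The names DynamicKeyMerge, Overlay and KeyOf are hypothetical, and the sketch assumes the key columns and the value column carry the same names in both tables.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class DynamicKeyMerge
{
    // Builds one composite key string from the chosen columns of a row.
    // Assumption: the key values never contain the unit separator character.
    private static string KeyOf(DataRow row, IEnumerable<string> keyColumns)
    {
        return string.Join("\u001F", keyColumns.Select(c => Convert.ToString(row[c])));
    }

    // Returns a copy of dt1 (all columns kept) in which valueColumn is
    // overwritten by the value from dt2 wherever a row with the same key exists.
    public static DataTable Overlay(DataTable dt1, DataTable dt2,
                                    IList<string> keyColumns, string valueColumn)
    {
        // Index the second result set once by its composite key.
        var lookup = new Dictionary<string, DataRow>();
        foreach (DataRow uni in dt2.Rows)
        {
            lookup[KeyOf(uni, keyColumns)] = uni;
        }

        DataTable result = dt1.Copy();
        foreach (DataRow row in result.Rows)
        {
            DataRow match;
            if (lookup.TryGetValue(KeyOf(row, keyColumns), out match)
                && !match.IsNull(valueColumn))
            {
                row[valueColumn] = match[valueColumn];
            }
        }
        return result;
    }
}
```

A call might look like DynamicKeyMerge.Overlay(dt1, dt2, new[] { "SalesChannel", "ArticleNo" }, "Name"), where the column names are only placeholders. Each table is walked once, instead of every row in dt1 being filtered against dt2.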

Related

How do I use LINQ to update a datatable with a SqlDataReader?

I am trying to merge data from two separate queries using C#. The data is located on separate servers or I would just combine the queries. I want to update the data in one of the columns of the first data set with the data in one of the columns of the second data set, joining on a different column.
Here is what I have so far:
ds.Tables[3].Columns[2].ReadOnly = false;
List<object> table = new List<object>();
table = ds.Tables[3].AsEnumerable().Select(r => r[2] = reader.AsEnumerable().Where(s => r[3] == s[0])).ToList();
The ToList() is just for debugging. To summarize, ds.Tables[3].Columns[2] is the column I want to update. ds.Tables[3].Columns[3] contains the key I want to join to.
In the reader, the first column contains the matching key to ds.Tables[3].Columns[3] and the second column contains the data with which I want to update ds.Tables[3].Columns[2].
The error I keep getting is
Unable to cast object of type 'WhereEnumerableIterator`1[System.Data.IDataRecord]' to type 'System.IConvertible'. Couldn't store <System.Linq.Enumerable+WhereEnumerableIterator`1[System.Data.IDataRecord]> in Quoting Dealers Column. Expected type is Int32.
Where am I going wrong with my LINQ?
EDIT:
I updated the line where the updating is happening
table = ds.Tables[3].AsEnumerable().Select(r => r[2] = reader.AsEnumerable().First(s => r[3] == s[0])[1]).ToList();
but now I keep getting
Sequence contains no matching element
For the record, the sequence does contain a matching element.
You can use the following sample to achieve the join and update operation. Let's suppose there are two DataTables, tbl1 and tbl2.
The sample joins the two tables and updates the value of column "name1" of tbl1 from column "name2" of tbl2.
public DataTable JoinAndUpdate(DataTable tbl1, DataTable tbl2)
{
    // for demo purpose I have created a clone of tbl1.
    // you can define a custom schema, if needed.
    DataTable dtResult = tbl1.Clone();
    var result = from dataRows1 in tbl1.AsEnumerable()
                 join dataRows2 in tbl2.AsEnumerable()
                 on dataRows1.Field<int>("ID") equals dataRows2.Field<int>("ID") into lj
                 from reader in lj
                 select new object[]
                 {
                     dataRows1.Field<int>("ID"), // ID from table 1
                     reader.Field<string>("name2"), // Updated column value from table 2
                     dataRows1.Field<int>("age")
                     // .. here comes the rest of the fields from table 1.
                 };
    // Load the results in the table
    result.ToList().ForEach(row => dtResult.LoadDataRow(row, false));
    return dtResult;
}
After considering what @DStanley said about LINQ, I abandoned it and went with a foreach statement. See the code below:
ds.Tables[3].Columns[2].ReadOnly = false;
while (reader.Read())
{
    foreach (DataRow item in ds.Tables[3].Rows)
    {
        if ((Guid)item[3] == reader.GetGuid(0))
        {
            item[2] = reader.GetInt32(1);
        }
    }
}
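Two things probably contributed to the earlier LINQ failures. First, r[3] == s[0] compares two boxed objects by reference, so it is false even when the values are equal. Second, a data reader is forward-only, so it cannot be re-enumerated for every row (this second point is an assumption, since the AsEnumerable extension used on the reader is not shown). The sketch below reads the reader once into a dictionary and then updates the table. ReaderUpdate and Apply are hypothetical names, and the Guid key and Int32 value types are taken from the foreach version above.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class ReaderUpdate
{
    // Reads (key, value) pairs from the first two reader columns, then writes
    // the value into valueColumn of every target row whose keyColumn matches.
    // Returns the number of rows that were updated.
    public static int Apply(DataTable target, IDataReader reader,
                            string keyColumn, string valueColumn)
    {
        // Consume the forward-only reader exactly once.
        var values = new Dictionary<Guid, int>();
        while (reader.Read())
        {
            values[reader.GetGuid(0)] = reader.GetInt32(1);
        }

        target.Columns[valueColumn].ReadOnly = false;
        int updated = 0;
        foreach (DataRow row in target.Rows)
        {
            int value;
            if (!row.IsNull(keyColumn)
                && values.TryGetValue((Guid)row[keyColumn], out value))
            {
                row[valueColumn] = value;
                updated++;
            }
        }
        return updated;
    }
}
```

Because the lookup is a dictionary, the table is scanned once rather than once per reader row.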

C# Find non matching values in DataTables

I'm struggling with the following problem:
There are 2 DataTables (SSFE and FE in my case).
FE will contain items that match with SSFE, but it will also contain values not present in SSFE.
For Example
SSFE 1,2,3,4,5,6,9,10
FE 1,2,3,4,5,6,7,8,9,10,11
The output I need in this example is: 7, 8, 11.
I'm using the following code to find items that do match:
DataSet set = new DataSet();
//wrap the tables in a DataSet.
set.Tables.Add(SSFEData);
set.Tables.Add(FEData);
//Creates a ForeignKey like Join between two tables.
//Table1 will be the parent. Table2 will be the child.
DataRelation relation = new DataRelation("IdJoin", SSFEData.Columns[0], FEData.Columns[0], false);
//Have the DataSet perform the join.
set.Relations.Add(relation);
//Loop through table1 without using LINQ.
for (int i = 0; i < SSFEData.Rows.Count; i++)
{
    //If any rows in Table2 have the same Id as the current row in Table1
    if (SSFEData.Rows[i].GetChildRows(relation).Length > 0)
    {
        SSFEData.Rows[i]["PackageError"] = SSFEData.Rows[i].GetChildRows(relation)[0][1];
        SSFEData.Rows[i]["SaleError"] = SSFEData.Rows[i].GetChildRows(relation)[0][2];
    }
}
There should be a trick to find the items that do not have a relation.
Any suggestion would be great!
Well, you could of course use a little bit of LINQ by turning the data tables into IEnumerables using the AsEnumerable() [1] extension method.
I am using a few assumptions to illustrate this:
1. "id" is the column with an integer value relating rows in FEData and SSFEData.
2. "id" is the primary key column on both FEData and SSFEData.
Then this will return a list of rows from FEData that are not present in SSFEData:
var notInSSFEData = FEData.AsEnumerable()
    .Where(x => SSFEData.Rows.Find((object)x.Field<int>("id")) == null)
    .ToList();
If assumption 2 above does not hold (i.e. the "id" field is not the primary key), a slightly more elaborate query is required.
var notInSSFEData = FEData.AsEnumerable()
    .Where(x1 => !SSFEData.AsEnumerable().Any(x2 => x2.Field<int>("id") == x1.Field<int>("id")))
    .ToList();
[1] This requires adding a reference to System.Data.DataSetExtensions (in System.Data.DataSetExtensions.dll).
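When "id" is not the primary key, the nested Any scan compares every FEData row with every SSFEData row. A minimal sketch of the same anti-join with a HashSet follows. AntiJoin and RowsMissingFrom are hypothetical names. It uses Rows.Cast<DataRow>() instead of AsEnumerable() so that no extra reference is needed, and it assumes the id columns in both tables have the same data type.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class AntiJoin
{
    // Returns the rows of source whose idColumn value does not occur in other.
    public static List<DataRow> RowsMissingFrom(DataTable source, DataTable other,
                                                string idColumn)
    {
        // Collect the ids of the other table once. Boxed values of the same
        // type compare by value inside a HashSet<object>.
        var known = new HashSet<object>(
            other.Rows.Cast<DataRow>().Select(r => r[idColumn]));

        return source.Rows.Cast<DataRow>()
                     .Where(r => !known.Contains(r[idColumn]))
                     .ToList();
    }
}
```

With the tables from the question, the call would be AntiJoin.RowsMissingFrom(FEData, SSFEData, "id").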

Get present primary keys from link table, set Checkstate checklistbox

I have a CheckedListbox which contains values from some table called products.
The idea is to check the products that are associated with a customer. It does save correctly to a link table, yet when loading it again, the items that were checked do not get loaded correctly into the CheckedListbox.
So from that link table I would like to get all rows, but from just one column. All tables are already loaded into the application, so I don't want to use SQL.
I've tried using LINQ, with no success; Ids is just empty here.
int[] Ids = (from m in dataset.Tables["LinkTable"].AsEnumerable()
             where m.Field<int>("customerId") == customerId
             select m.Field<int>("productId")).ToArray();
Then, if I do succeed in getting those Ids, I would like to get the indexes of those primary keys so I can set the correct products to checked.
I've tried doing it like this, but it gives me errors in other parts of the program, because I am setting a primary key on a global DataTable. DataGridViews don't like that.
DataColumn[] keyColumns = new DataColumn[1];
keyColumns[0] = dataset.Tables["products"].Columns["Id"];
currentPatient.GetTheDataSet.Tables["products"].PrimaryKey = keyColumns;
foreach (int Id in Ids)
{
    DataRow row = dataset.Tables["Products"].Rows.Find(Id);
    int index = dataset.Tables["Products"].Rows.IndexOf(row);
    clbMedications.SetItemChecked(index, true);
}
I would like to do that last part without specifying a primary key, but I couldn't find how to do that in LINQ.
I know this consists of two questions, but perhaps it can be done with just one LINQ statement, so I thought it better to combine them.
[EDIT]
Finally, I think I've got what you need:
var qry = (from p in ds.Tables["products"].AsEnumerable()
           select new {
               Id = p.Field<int>("Id"),
               Index = ds.Tables["products"].Rows.IndexOf(p),
               Checked = ds.Tables["LinkTable"].AsEnumerable().Any(x => x.Field<int>("productId") == p.Field<int>("Id") && x.Field<int>("customerId") == customerid)
           }).ToList();
The above query returns a list, which you can bind to the CheckedListbox.
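If only the indexes to check are needed, a sketch along the following lines avoids setting a primary key on the shared table. LinkLookup and IndexesToCheck are hypothetical names. It assumes the CheckedListbox items are in the same order as the rows of the products table, and that the Id, customerId and productId columns are non-null integers.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class LinkLookup
{
    // Returns the row indexes in products whose Id is linked to the customer.
    public static List<int> IndexesToCheck(DataTable products, DataTable link,
                                           int customerId)
    {
        // Product ids linked to this customer, collected in one pass.
        var linked = new HashSet<int>(
            link.Rows.Cast<DataRow>()
                .Where(r => (int)r["customerId"] == customerId)
                .Select(r => (int)r["productId"]));

        var indexes = new List<int>();
        for (int i = 0; i < products.Rows.Count; i++)
        {
            if (linked.Contains((int)products.Rows[i]["Id"]))
            {
                indexes.Add(i);
            }
        }
        return indexes;
    }
}
```

Each returned index can then be passed to clbMedications.SetItemChecked(index, true), as in the loop from the question.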

Sorting a data table based on sort order from another data table

I have two data tables: one (D1) holds my data, with a unique data Id, and the other (D2) holds the Ids of all the records of D1 in a particular order. How can I sort D1 based on the order of the Ids in D2? I am using C# with ASP.NET.
You could copy the rows in the ordering table into a Dictionary with an index. Assuming your key field is named Key, the code might look like this:
static void Main(string[] args)
{
    var dt = new DataTable("Data");
    var dtOrder = new DataTable("Order");
    // Insert some data here
    int i = 0;
    var orderDict = new Dictionary<object, int>();
    foreach (DataRow row in dtOrder.Rows)
    {
        orderDict.Add(row["Key"], ++i);
    }
    var ordered = dt.Rows.Cast<DataRow>().OrderBy(r => orderDict[r["Key"]]);
}
As I read Peaceman71's comment, I think it is worth mentioning that this is a disconnected approach. Any proper database software will do this for you as well.
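The dictionary idea above can be wrapped into a helper that returns a new, sorted DataTable. This is only a sketch: OrderByOtherTable and Sort are hypothetical names, and placing rows whose key is missing from the ordering table at the end is my own assumption, not something the question specifies.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class OrderByOtherTable
{
    // Returns a copy of data whose rows follow the key order found in order.
    public static DataTable Sort(DataTable data, DataTable order, string keyColumn)
    {
        // Map each key to its position in the ordering table.
        var position = new Dictionary<object, int>();
        int i = 0;
        foreach (DataRow row in order.Rows)
        {
            position[row[keyColumn]] = i++;
        }

        // Keys that are not in the ordering table sort last (assumption).
        var rows = data.Rows.Cast<DataRow>().OrderBy(r =>
        {
            int p;
            return position.TryGetValue(r[keyColumn], out p) ? p : int.MaxValue;
        });

        DataTable sorted = data.Clone(); // same schema, no rows
        foreach (DataRow row in rows)
        {
            sorted.ImportRow(row);
        }
        return sorted;
    }
}
```

OrderBy is a stable sort, so rows that share a position keep their original relative order.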
It depends on if you want to do this in the code or in the database.
In the database you would join the two tables, such as (MS-SQL/T-SQL):
SELECT D2.Sort, D1.* FROM D2 LEFT JOIN D1 ON D2.ID = D1.ID ORDER BY D2.Sort
In code it depends very much on where you keep the data. DataSet, DataTables etc.

How to save data retrieved from a query

I previously asked a question and got an answer in "Best approach to write query", but the problem is that if you have to save this result in a list, there is duplication of records. For example,
see the resultant table of the join given in EXAMPLE.
There are duplicate rows. How can you filter them out and yet keep the order number data?
Of course there may be several ways, but I am looking for a good one.
How can we store the data in a list without creating duplicate rows in the list?
My current code for my tables is
int lastUserId = 0;
sql_cmd = new SqlCommand();
sql_cmd.Connection = sql_con;
sql_cmd.CommandText = "SELECT * FROM AccountsUsers LEFT JOIN Accounts ON AccountsUsers.Id = Accounts.userId ORDER BY AccountsUsers.accFirstName";
SqlDataReader reader = sql_cmd.ExecuteReader();
if (reader.HasRows == true)
{
    Users userToAdd = new Users();
    while (reader.Read())
    {
        userToAdd = new Users();
        userToAdd.userId = int.Parse(reader["Id"].ToString());
        userToAdd.firstName = reader["accFirstName"].ToString();
        userToAdd.lastName = reader["accLastName"].ToString();
        lastUserId = userToAdd.userId;
        Websites domainData = new Websites();
        domainData.domainName = reader["accDomainName"].ToString();
        domainData.userName = reader["accUserName"].ToString();
        domainData.password = reader["accPass"].ToString();
        domainData.URL = reader["accDomain"].ToString();
        userToAdd.DomainData.Add(domainData);
        allUsers.Add(userToAdd);
    }
}
For second table I have custom list that will hold the entries of all the data in second table.
The table returned is table having joins and have multiple rows for same
Besides using the Dictionary idea as answered by Antonio Bakula...
If you persist the dictionary of users and call the code in your sample multiple times, you should consider that a user account is either new, modified, or deleted.
The algorithm to use when executing your SQL query is the following:
If a row in the query result is not in the dictionary, create a new user and add it to the dictionary.
If a row in the query result is in the dictionary, update the user information.
If a dictionary item is not in the query result, delete the user from the dictionary.
I'd also recommend not using SELECT *.
Select only the table columns your code needs. This improves the performance of your code and avoids a potential security breach from returning private user information.
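The add, update and delete steps above can be sketched as follows. UserInfo, UserCache and Sync are hypothetical names, and the query result is modelled as a plain sequence of already-parsed rows rather than a SqlDataReader, to keep the sketch self-contained.

```csharp
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the Users class in the question (assumption).
public sealed class UserInfo
{
    public int Id;
    public string FirstName;
}

public static class UserCache
{
    // Brings the persisted dictionary in line with the latest query result.
    public static void Sync(Dictionary<int, UserInfo> cache,
                            IEnumerable<UserInfo> queryResult)
    {
        var seen = new HashSet<int>();
        foreach (UserInfo row in queryResult)
        {
            seen.Add(row.Id);
            UserInfo existing;
            if (cache.TryGetValue(row.Id, out existing))
            {
                existing.FirstName = row.FirstName; // already known: update
            }
            else
            {
                cache.Add(row.Id, row);             // new: add
            }
        }

        // Anything not seen in this result has been deleted at the source.
        // ToList() snapshots the keys so the dictionary can be modified safely.
        foreach (int id in cache.Keys.Where(k => !seen.Contains(k)).ToList())
        {
            cache.Remove(id);
        }
    }
}
```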
I am not sure why you are not using a DISTINCT clause in your SQL to fetch unique results. That would also be faster. Did you look at using hashtables?
I would put the users into a Dictionary and check whether each one already exists, something like this:
Dictionary<int, Users> allUsers = new Dictionary<int, Users>();
and then in the reader while loop:
int userId = int.Parse(reader["Id"].ToString());
Users currUser;
// TryGetValue avoids the KeyNotFoundException that allUsers[userId] throws for a missing key
if (!allUsers.TryGetValue(userId, out currUser))
{
    currUser = new Users();
    currUser.userId = userId;
    currUser.firstName = reader["accFirstName"].ToString();
    currUser.lastName = reader["accLastName"].ToString();
    allUsers.Add(userId, currUser);
}
Websites domainData = new Websites();
domainData.domainName = reader["accDomainName"].ToString();
domainData.userName = reader["accUserName"].ToString();
domainData.password = reader["accPass"].ToString();
domainData.URL = reader["accDomain"].ToString();
currUser.DomainData.Add(domainData);
Seems like the root of your problem is in your database table.
When you said duplicate data rows, are you saying you get duplicate entries in the list or you have duplicate data in your table?
Give 2 rows that are duplicate.
Two options:
First, prevent pulling duplicate data from SQL by using a distinct clause like:
select distinct from where
The second option, as Antonio mentioned, is to check whether the list already has the item.
The first option is recommended unless there are other reasons.
