I have a DataTable that can contain semi-duplicate rows. In the picture example, the two highlighted rows have identical values in every column except the 'Amount' columns. I need to identify those rows and sum their amounts. The data comes from a text file, and there is no key that uniquely identifies a row.
I looked at some answers, such as "Best way to remove duplicate entries from a data table", but in my case I would need to match on 10 columns, not just one.
I also tried different LINQ queries but did not get far.
This is the SQL query which does the job:
SELECT [Date1],[Date2],[Date3]
,SUM([Amount1]) as Summary1
,SUM([Amount2]) as Summary2
,SUM([Amount3]) as Summary3
,[col1],[col2],[Rate1],[Rate2],[Rate3],[product],[comment]
FROM [Table]
group by [Date1],[Date2],[Date3],[col1],[col2],[Rate1],[Rate2],[Rate3],[product],[comment]
EDIT: Just to clarify, the SQL query is only an example of how I could get the desired result if I were querying a SQL table.
This is how you would do it with EF:
var result = db.Table
    .GroupBy(r => new {
            r.Date1,
            r.Date2,
            r.Date3,
            r.col1,
            r.col2,
            r.Rate1,
            r.Rate2,
            r.Rate3,
            r.product,
            r.comment },
        p => new {
            p.Amount1,
            p.Amount2,
            p.Amount3 },
        (key, vals) => new {
            key.Date1,
            key.Date2,
            key.Date3,
            Amount1 = vals.Sum(v => v.Amount1),
            Amount2 = vals.Sum(v => v.Amount2),
            Amount3 = vals.Sum(v => v.Amount3),
            key.col1,
            key.col2,
            key.Rate1,
            key.Rate2,
            key.Rate3,
            key.product,
            key.comment }
    );
A slight modification if you are actually doing it from a DataTable:
var result = dt.AsEnumerable()
    .GroupBy(d => new {
            Date1 = d.Field<DateTime>("Date1"),
            Date2 = d.Field<DateTime>("Date2"),
            Date3 = d.Field<DateTime>("Date3"),
            col1 = d.Field<string>("col1"),
            col2 = d.Field<string>("col2"),
            Rate1 = d.Field<decimal>("Rate1"),
            Rate2 = d.Field<decimal>("Rate2"),
            Rate3 = d.Field<decimal>("Rate3"),
            product = d.Field<string>("product"),
            comment = d.Field<string>("comment") },
        p => new {
            Amount1 = p.Field<decimal>("Amount1"),
            Amount2 = p.Field<decimal>("Amount2"),
            Amount3 = p.Field<decimal>("Amount3") },
        (key, vals) => new {
            key.Date1,
            key.Date2,
            key.Date3,
            Amount1 = vals.Sum(v => v.Amount1),
            Amount2 = vals.Sum(v => v.Amount2),
            Amount3 = vals.Sum(v => v.Amount3),
            key.col1,
            key.col2,
            key.Rate1,
            key.Rate2,
            key.Rate3,
            key.product,
            key.comment }
    );
If you need the result as a DataTable, you can use one of the many list-to-DataTable extension methods out there. Append ".ToList().AsDataTable()" to the query above to get the result back into a DataTable. (AsDataTable is not built in; it stands for whichever such extension method you pick.)
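If you would rather not depend on an extension method, the grouping can also write straight into a new DataTable. Below is a minimal sketch, assuming the amount columns hold non-null decimal values. The class DataTableGrouping, the method SumDuplicates and its parameters are made-up names for illustration, not part of any library:

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class DataTableGrouping
{
    // Compares two arrays of key values element by element.
    private sealed class KeyComparer : IEqualityComparer<object[]>
    {
        public bool Equals(object[] x, object[] y) => x.SequenceEqual(y);

        public int GetHashCode(object[] key)
        {
            int hash = 17;
            foreach (object value in key)
                hash = unchecked(hash * 31 + (value?.GetHashCode() ?? 0));
            return hash;
        }
    }

    // Hypothetical helper: merges rows that agree on every key column
    // and sums the listed amount columns (assumed to be non-null decimals).
    public static DataTable SumDuplicates(
        DataTable source, string[] keyColumns, string[] sumColumns)
    {
        DataTable result = source.Clone(); // same schema, no rows

        var groups = source.AsEnumerable()
            .GroupBy(row => keyColumns.Select(c => row[c]).ToArray(),
                     new KeyComparer());

        foreach (var group in groups)
        {
            // Start from the first row of the group, then overwrite the sums.
            DataRow merged = result.NewRow();
            merged.ItemArray = group.First().ItemArray;
            foreach (string column in sumColumns)
                merged[column] = group.Sum(r => r.Field<decimal>(column));
            result.Rows.Add(merged);
        }
        return result;
    }
}
```

For the table in the question you would pass the ten non-amount columns as keyColumns and "Amount1", "Amount2", "Amount3" as sumColumns. Passing the column names as arrays avoids repeating all ten of them in an anonymous type.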
Related
I have a DataTable with a Name field that can contain full names (first name and surname) separated by white space.
When I use the Select method to query the records, I get nothing back.
Example:
DataRow[] results = datatable.Select("Name = 'FITUR DISPELOR'");
Console.WriteLine(results.Count().ToString());
This gives me a count of 0.
But if I write the following, I get a matching record:
DataRow[] results = datatable.Select("Name = ''");
Console.WriteLine(results.Count().ToString());
Can someone help me work out how to query the full name in a DataTable? I am using C#.
How to Merge rows data From DataTable in C#?
I think this is a good use case for LINQ's GroupBy. You can start with something like this:
// DataRowCollection is not generic, so go through AsEnumerable() to use LINQ.
var rowGroups = dataTable.AsEnumerable().GroupBy(row =>
    new {RecptNo = row["recpt_no"], Test = row["Test"]});
foreach(var group in rowGroups)
{
    //Here "group" is a collection of rows with the same recpt_no and Test. Process as required.
    //You could also check group.Key.RecptNo and group.Key.Test here if necessary.
}
Here's a link that relates to your question. Hope it helps.
How to merge rows in a DataTable when data in multiple columns match?
I am working with C# Windows Forms. I have created a form that connects to a MySQL database and displays the list of databases, the tables in each database, and the contents of each table.
My questions:
After I select a cell in the table (DataGridView) and click the Delete button, I want the row containing that cell to be deleted from the database.
I also need the DataGridView to be refreshed so that the deleted row disappears. (I think I can do this part once I know how to do part 1.)
So I need help with question 1. What I can't figure out is how to write the SQL statement to pass to the data adapter, command builder, or whatever is appropriate. I know the SQL statement should look like:
Delete from (selected Table) Where (THIS IS THE PART WHERE I AM STUCK) => I don't know what to put in this condition or how to get it.
Any help and advice is really appreciated!
The delete statement should be built from all primary key columns of the selected table and the values of those columns in the selected DataGridView row.
How to get the primary key columns:
SELECT `COLUMN_NAME`
FROM `information_schema`.`COLUMNS`
WHERE (`TABLE_SCHEMA` = 'dbName')
AND (`TABLE_NAME` = 'tableName')
AND (`COLUMN_KEY` = 'PRI');
Source: A better way to get Primary Key columns
What your delete statement should look like:
DELETE FROM <TABLE>
WHERE <PRIMARY_KEY_COLUMN_1> = <ROW_VALUE_1>
AND <PRIMARY_KEY_COLUMN_2> = <ROW_VALUE_2>
As you can see, a table can have multiple columns that together uniquely identify a row. It is also possible that another table references that very row, which would prevent you from deleting it.
The code would look like this:
List<string> primaryKeyColumns = GetPrimaryKeyColumns(SelectedDB, SelectedTable);
DataGridViewRow row = datagridview.CurrentCell.OwningRow;
string deleteWhereClause = string.Empty;
foreach (string column in primaryKeyColumns)
{
    string value = row.Cells[column].Value.ToString();
    if (string.IsNullOrEmpty(deleteWhereClause))
    {
        deleteWhereClause = string.Concat(column, "=", value);
    }
    else
    {
        deleteWhereClause += string.Concat(" AND ", column, "=", value);
    }
}
string deleteStatement = string.Format("DELETE FROM {0} WHERE {1}", SelectedTable, deleteWhereClause);
The GetPrimaryKeyColumns method returns the names of all primary key columns of the selected table, using the SELECT statement I posted above.
As written, the code only works for numeric key values. You would also have to deal with other column types such as dates and strings, which must be quoted, but that's basically what you will have.
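One way to sidestep the quoting problem for dates and strings is to put placeholders in the WHERE clause and pass the values as command parameters. The sketch below only builds the SQL text and the parameter list. DeleteStatementBuilder, Build and the @p0, @p1 naming are invented for this example, and it assumes MySQL-style backtick quoting for identifiers:

```csharp
using System;
using System.Collections.Generic;

public static class DeleteStatementBuilder
{
    // Quotes a MySQL identifier, doubling any embedded backticks.
    private static string Quote(string identifier) =>
        "`" + identifier.Replace("`", "``") + "`";

    // Hypothetical helper: returns the DELETE text plus the values to bind.
    public static (string Sql, Dictionary<string, object> Parameters) Build(
        string table, IReadOnlyList<KeyValuePair<string, object>> primaryKeyValues)
    {
        // Refuse to build a DELETE without a WHERE clause.
        if (primaryKeyValues.Count == 0)
            throw new ArgumentException("At least one key column is required.");

        var parameters = new Dictionary<string, object>();
        var conditions = new List<string>();
        for (int i = 0; i < primaryKeyValues.Count; i++)
        {
            string name = "@p" + i;
            conditions.Add(Quote(primaryKeyValues[i].Key) + " = " + name);
            parameters[name] = primaryKeyValues[i].Value;
        }

        string sql = "DELETE FROM " + Quote(table)
                   + " WHERE " + string.Join(" AND ", conditions);
        return (sql, parameters);
    }
}
```

With a MySQL connector you would then add each entry of Parameters to the command (for example with Parameters.AddWithValue, if your connector provides it) before executing it, so the driver handles quoting and escaping of the values.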
I have a DataTable imported from an Excel file.
I only need the unique values from specific columns of the DataTable.
By unique I mean the same thing you get from DISTINCT in a SQL SELECT query.
I want to get the list of unique values from a DataTable column and put them into a List.
I think LINQ can be used for this, but I'm not very familiar with it.
I was thinking of code like this:
var data is from MyDataTable
where MyDataTable.ColumnName = "SpecificColumn"
select MyDataTable["SpecificColumn"]).UniqueData;
List<string> MyUniqueData = new List<string>();
foreach(object obj in data)
{
if(MyUniqueData.NotContain(obj))
MyUniqueData.add(obj);
}
I hope someone can drop off some knowledge to me.
var unique = data.Distinct().ToList();
What you're looking for is .Distinct(). See MSDN documentation here. You can specify your own comparer if you need something specific and it will return only unique records.
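Applied to a DataTable column, that could look roughly like the sketch below. It converts every cell to a string because an Excel import may produce mixed column types. DistinctColumn and DistinctValues are names made up for this example:

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class DistinctColumn
{
    // Hypothetical helper: returns the unique values of one column as strings.
    public static List<string> DistinctValues(DataTable table, string columnName)
    {
        return table.AsEnumerable()
            .Select(row => Convert.ToString(row[columnName])) // DBNull becomes ""
            .Distinct()
            .ToList();
    }
}
```

Usage would be something like List<string> myUniqueData = DistinctColumn.DistinctValues(myDataTable, "SpecificColumn");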
If you have a DataTable or a DataView, getting the unique records from a column is simple:
DataTable dtNew = dt.DefaultView.ToTable(true, "ColName"); // for Datatable
DataTable dtnew= dv.ToTable(true, "ColName"); // for DataView
I have a DataTable with a single text column, 'Title', which can contain duplicate values. I can remove the duplicates using a DataView:
DataView v = new DataView(tempTable);
tempTable = v.ToTable(true, "Title");
But how can I get the number of duplicates for each distinct value without any looping?
If you don't want to loop or use LINQ, there is no direct way to do that. You can, however, use a computed column on the DataTable, provided one extra condition holds: the data must be in two related tables, like this:
DataRelation rel = new DataRelation("CustToOrders", data.Tables["Customers"].Columns["customerid"], data.Tables["Orders"].Columns["customerid"]);
data.Relations.Add(rel);
Here customerid is a foreign key in the Orders table, so it contains duplicates.
You can get the number of duplicates this way:
data.Tables["Customers"].Columns.Add("Duplicates",
    typeof(decimal), "Count(Child.customerid)");
The way I would get the results that you want would look something like this:
tempTable.Rows.Cast<DataRow>()
.Select(dr => Convert.ToString(dr[0]))
.GroupBy(dr => dr)
.Select(g => new { Title = g.Key, Count = g.Count() });
However, it's actually looping under the hood. In fact, I can't think of a way to do that kind of a grouping without inspecting each record.
The drawback is that the result of that expression is a sequence of anonymous type instances. If you still want the result to be a DataView, you could rewrite the last Select to create a new DataRow with two columns, and shove them into a new DataTable which you pass to the DataView.
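That last step could be sketched as follows, assuming the source table has a column named "Title" as in the question. TitleCounts and CountByTitle are illustrative names only:

```csharp
using System;
using System.Data;
using System.Linq;

public static class TitleCounts
{
    // Hypothetical helper: one output row per distinct Title with its count.
    public static DataTable CountByTitle(DataTable source)
    {
        var result = new DataTable();
        result.Columns.Add("Title", typeof(string));
        result.Columns.Add("Count", typeof(int));

        var groups = source.Rows.Cast<DataRow>()
            .GroupBy(dr => Convert.ToString(dr["Title"]));

        foreach (var g in groups)
            result.Rows.Add(g.Key, g.Count());

        return result;
    }
}
```

The returned table can then be wrapped with new DataView(TitleCounts.CountByTitle(tempTable)) if a DataView is what you need.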