myDataTable has a column named ORDER_NO. I would like to select the rows from this table which appears once. If a value appears two times than it should not selected.
ORDER_NO contain Values
1000A
1001A
1001B
1002A
1002B
1002C
1000A
1001A
1001B
I want to select only form the values above are:
1002A
1002B
1002C
as they appears once in the column. Can anyone help?
So you want only unique rows according to the ORDER_NO column?
Presuming that it's a string column you could use LINQ's Enumerable.GroupBy:
var uniqueRows = table.AsEnumerable()
.GroupBy(row => row.Field<string>("ORDER_NO"))
.Where(group => group.Count() == 1)
.Select(group => group.First());
if you want a new DataTable from the unique rows you can use:
table = uniqueRows.CopyToDataTable();
If you instead only want this column's values:
IEnumerable<string> unqiueOrderNumbers = uniqueRows.Select(row => row.Field<string>("ORDER_NO"));
Apart from #Tim's answer you can also use DataView to simplify this thing
DataView view = new DataView(table);
DataTable distinctValues = view.ToTable(true, "ORDER_NO");
I think this is the simplest way to get distinct values from any table. You can even mention multiple columns in the ToTable method. Just pass column name as argument like ORDER_NO is send in above sample code.
Related
I gone through similar questions posted here but didn't find the solution of my problem.
I have a datatable in C# which contains duplicate rows like below:
Now, I have to apply a filter which finds all distinct rows based on Last 2 highlighted columns but in final result set I have to return all columns.
Also, I'll get an ADDRESS_ID whose corresponding row should be
returned and duplicates should be removed.
DataView view = new DataView(ds.Tables[0]);
DataTable distinctValues = view.ToTable(true, "ADDR_LINE_1", "ADDR_LINE_2", "ADDR_LINE_3", "CITY", "STATE", "ZIP", "BOX_NUMBER");
This code is returning 2 rows but not all columns.
Also used this code:
DataTable dtUniqRecords = new DataTable();
dtUniqRecords = ds.Tables[0].DefaultView.ToTable(true, "RELATE_CODE", "ADDRESS_TYPE", "ADDRESS_CODE", "ADDRESS_ID", "ADDR_LINE_1", "ADDR_LINE_2", "ADDR_LINE_3", "CITY", "STATE", "ZIP", "BOX_NUMBER");
But this is returning all rows with duplicates.
You can use GroupBy to filter out duplicate values:
var dt = ds.Tables[0];
var distinct = dt.AsEnumerable()
.GroupBy(g => new
{
Address1 = g.Field<string>("ADDR_LINE_1"),
Address2 = g.Field<string>("ADDR_LINE_2")
// any other fields you need to group by
})
.Select(g => g.First()) // select first group including all columns
.CopyToDataTable();
I have a DataTable where sometimes values in all columns in two or more rows repeat. I would like to get distinct DataTable. The solutions from here and here don't work for me because I have many columns and depending on some conditions, the number of columns changes.
I was thinking maybe something like this
System.Data.DataTable table = new System.Data.DataTable(); // already fulfilled table
DataView view = new DataView(table);
var tableDistinct = view.ToTable(true, table.Columns);
But I can't pass table.Columns as an argument.
I don't know what's going wrong because you haven't said what's not working. However, you could use LINQ(-TO-DataTable):
table = table.AsEnumerable()
.GroupBy(r => new{ Col1 = r["Col1"], Col2 = r["Col2"], Col3 = r["Col3"] })
.Select(g => g.First())
.CopyToDataTable();
Change the columns in the anonymous type according to your column-list.
The ToTable access a list of string params, the following should convert all your columns to array of string so you don't have to enter them manually
System.Data.DataTable table = new System.Data.DataTable(); // already fulfilled table
DataView view = new DataView(table);
var tableDistinct = view.ToTable(true, table.Columns.Cast<DataColumn>().Select(z=>z.ColumnName).ToArray());
I have a datatable returned as a result of fetching data from a spreadsheet. I need to display the resultset only with the distinct rows depends up on only a column.
For example I have a datatable with columns
id | name | age | email
Then if more than one record with the same id is listed it should omitted. I tried
dt = dt.DefaultView.ToTable(true)
but it returns the distinct records with respect to all columns. I need the distinct records only based on the id.
Can anyone help me on this?
You can use GroupBy :-
DataTable result = dt.AsEnumerable()
.GroupBy(x => x.Field<int>("Id"))
.Select(x => x.First()).CopyToDataTable();
Please note, in case of a matching Id, I am taking the first record and ignoring the rest.
You need to mention the column name on which the ToTable operation will execute to select distinct values.
Please find below the code section
DataView view = new DataView(table);
DataTable distinctValues = view.ToTable(true, "id");
I have a code reading data from excel spreadsheet and I have gone this far with some answers on SO
DataTable dt = ds.Tables[0];
dt = dt.AsEnumerable().Where((row, index) => index > 4).CopyToDataTable();
DataTable filteredRows = dt.Rows.Cast<DataRow>().Where(row => row.ItemArray.All(field => !(field is System.DBNull))).CopyToDataTable();
having this
dt.Rows.Cast<DataRow>().Where(row => row.ItemArray.All(field => (field is System.DBNull)))
returns all rows that are empty.
I have also tried Any, it didn't give the required output
The code above works for where all the fields are not NULL i.e. every columns has a field. This exempt all rows that have 1 column missing but that's not what I want.
I want to exempt all rows that have all columns empty.
Just move the NOT (!) out one level. You want the items where "all rows are null" is not true, rather than where "all of the rows are not null" is true.
DataTable filteredRows = dt.Rows.Cast<DataRow>()
.Where(row => !row.ItemArray.All(field => field is System.DBNull))
.CopyToDataTable();
Have you tried filtering to Any Field instead of All?
DataTable filteredRows = dt.Rows.Cast<DataRow>().Where(row => row.ItemArray.Any(field => !(field is System.DBNull))).CopyToDataTable();
I've a datatable which has a single text column 'Title' which can have multiple values with duplicates. I can remove the duplicates using a dataview.
DataView v = new DataView(tempTable);
tempTable = v.ToTable(true, "Title");
But how can i get the number of duplicates for each distinct value without any looping?
If you don't want to loop or use Linq, so there is no way to do that but you can use a computed column on the data table with one more condition if applicable with you. That is the data should be in two related tables like this.
DataRelation rel = new DataRelation("CustToOrders", data.Tables["Customers"].Columns["customerid"], data.Tables["Orders"].Columns["customerid"]);
data.Relations.Add(rel);
Given that customerid field as a Foreign key in the Orders table so it has duplicates.
You can get the count of the duplicates this way:
data.Tables["Customers"].Columns.Add("Duplicates",
GetType(Decimal), "Count(child.customerid)");
The way I would get the results that you want would look something like this:
tempTable.Rows.Cast<DataRow>()
.Select(dr => Convert.ToString(dr[0]))
.GroupBy(dr => dr)
.Select(g => new { Title = g.Key, Count = g.Count() });
However, it's actually looping under the hood. In fact, I can't think of a way to do that kind of a grouping without inspecting each record.
The drawback is that the result of that expression is a sequence of anonymous type instances. If you still want the result to be a DataView, you could rewrite the last Select to create a new DataRow with two columns, and shove them into a new DataTable which you pass to the DataView.