Get a DataTable with distinct rows in C#

I have a DataTable populated with data fetched from a spreadsheet. I need to display only the rows that are distinct with respect to a single column.
For example I have a datatable with columns
id | name | age | email
If more than one record is listed with the same id, the duplicates should be omitted. I tried
dt = dt.DefaultView.ToTable(true);
but it returns the records that are distinct with respect to all columns. I need the distinct records based on the id only.
Can anyone help me with this?

You can use GroupBy:
DataTable result = dt.AsEnumerable()
    .GroupBy(x => x.Field<int>("Id"))
    .Select(x => x.First())
    .CopyToDataTable();
Please note that when several rows share an Id, this keeps the first record and ignores the rest.

You need to pass the name of the column on which ToTable should select distinct values. Note that the resulting table contains only the columns you list.
The code is below:
DataView view = new DataView(table);
DataTable distinctValues = view.ToTable(true, "id");
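
To make the difference between the two answers concrete, here is a minimal sketch. The sample rows and the class name DistinctByIdDemo are made up for illustration, and the id column is assumed to be an int. It suggests that ToTable(true, "id") keeps only the id column, while GroupBy with First keeps one full row per id.

```csharp
using System;
using System.Data;
using System.Linq;

public static class DistinctByIdDemo
{
    // Builds a small sample table; the rows are invented for illustration.
    public static DataTable BuildSample()
    {
        var dt = new DataTable();
        dt.Columns.Add("id", typeof(int));
        dt.Columns.Add("name", typeof(string));
        dt.Columns.Add("age", typeof(int));
        dt.Columns.Add("email", typeof(string));
        dt.Rows.Add(1, "Ann", 30, "ann@example.com");
        dt.Rows.Add(1, "Ann B.", 31, "annb@example.com");
        dt.Rows.Add(2, "Bob", 25, "bob@example.com");
        return dt;
    }

    public static void Main()
    {
        DataTable dt = BuildSample();

        // Distinct ids, but only the "id" column is copied to the new table.
        DataTable idsOnly = new DataView(dt).ToTable(true, "id");
        Console.WriteLine($"{idsOnly.Rows.Count} rows, {idsOnly.Columns.Count} column(s)");
        // prints "2 rows, 1 column(s)"

        // One row per id (the first one encountered), with all columns kept.
        DataTable firstPerId = dt.AsEnumerable()
            .GroupBy(r => r.Field<int>("id"))
            .Select(g => g.First())
            .CopyToDataTable();
        Console.WriteLine($"{firstPerId.Rows.Count} rows, {firstPerId.Columns.Count} column(s)");
        // prints "2 rows, 4 column(s)"
    }
}
```

If the id column has a different type in your table, the type argument of Field<T> has to change accordingly.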

Related

Select all columns after filtering Distinct rows based on few columns from a Datatable in c#

I went through similar questions posted here but didn't find a solution to my problem.
I have a DataTable in C# which contains duplicate rows like below:
Now I have to apply a filter that finds all distinct rows based on the last 2 highlighted columns, but the final result set has to return all columns.
Also, I'll get an ADDRESS_ID whose corresponding row should be returned, and the duplicates should be removed.
DataView view = new DataView(ds.Tables[0]);
DataTable distinctValues = view.ToTable(true, "ADDR_LINE_1", "ADDR_LINE_2", "ADDR_LINE_3", "CITY", "STATE", "ZIP", "BOX_NUMBER");
This code returns 2 rows, but not all columns.
I also tried this code:
DataTable dtUniqRecords = ds.Tables[0].DefaultView.ToTable(true, "RELATE_CODE", "ADDRESS_TYPE", "ADDRESS_CODE", "ADDRESS_ID", "ADDR_LINE_1", "ADDR_LINE_2", "ADDR_LINE_3", "CITY", "STATE", "ZIP", "BOX_NUMBER");
But this returns all rows, including the duplicates.
You can use GroupBy to filter out duplicate values:
var dt = ds.Tables[0];
var distinct = dt.AsEnumerable()
.GroupBy(g => new
{
Address1 = g.Field<string>("ADDR_LINE_1"),
Address2 = g.Field<string>("ADDR_LINE_2")
// any other fields you need to group by
})
.Select(g => g.First()) // take the first row of each group, keeping all of its columns
.CopyToDataTable();

How To select Specific Column From DataTable in C#?

I have a DataTable with 4 columns.
I want to select one column without foreach or any other expensive loop, and the result must be a new DataTable with one column. How can I do this?
DataTable leaveTypesPerPersonnel = LeaveGroup.GetLeaveTypesPerPersonnels(dtPersonnel.row);
leaveTypesPerPersonnel has these columns:
[ID,Guid,LeaveTypeID,Code]
I want to filter leaveTypesPerPersonnel without foreach and get a new DataTable with just the column [ID].
NOTE: The output must be a DataTable with one column.
leaveTypesPerPersonnel.Columns.Remove("Guid");
leaveTypesPerPersonnel.Columns.Remove("LeaveTypeID");
leaveTypesPerPersonnel.Columns.Remove("Code");
or
DataTable dt= new DataView(leaveTypesPerPersonnel).ToTable(false,"ID");
You should be able to run a quick LINQ statement against the data table.
// assuming ID is an int column
var results = from row in leaveTypesPerPersonnel.AsEnumerable()
              select row.Field<int>("ID");
This gives you an IEnumerable<int>. It's not a DataTable, but it might provide a solution to your problem as well.
Here is an attempt at searching and converting the result to a DataTable (assuming ID is an int column; CopyToDataTable throws if no row matches):
var dataTable = leaveTypesPerPersonnel.AsEnumerable().Where(x => x.Field<int>("ID") == 21).CopyToDataTable().DefaultView.ToTable(false, "ID");
Or
var dataTable = leaveTypesPerPersonnel.Select("ID = 21").CopyToDataTable().DefaultView.ToTable(false, "ID");
Or
var dataTable = leaveTypesPerPersonnel.Rows.Cast<DataRow>().ToList().CopyToDataTable().DefaultView.ToTable(false, "ID");

How to select Values from Datatable which appears once

myDataTable has a column named ORDER_NO. I would like to select the rows whose ORDER_NO value appears only once in this table. If a value appears twice, it should not be selected.
ORDER_NO contains the values
1000A
1001A
1001B
1002A
1002B
1002C
1000A
1001A
1001B
From the values above, I want to select only:
1002A
1002B
1002C
as they appear only once in the column. Can anyone help?
So you want only unique rows according to the ORDER_NO column?
Presuming that it's a string column you could use LINQ's Enumerable.GroupBy:
var uniqueRows = table.AsEnumerable()
.GroupBy(row => row.Field<string>("ORDER_NO"))
.Where(group => group.Count() == 1)
.Select(group => group.First());
If you want a new DataTable from the unique rows you can use:
table = uniqueRows.CopyToDataTable();
If you instead only want this column's values:
IEnumerable<string> uniqueOrderNumbers = uniqueRows.Select(row => row.Field<string>("ORDER_NO"));
Apart from @Tim's answer, you can also use a DataView to simplify this:
DataView view = new DataView(table);
DataTable distinctValues = view.ToTable(true, "ORDER_NO");
I think this is the simplest way to get distinct values from any table. You can also pass multiple columns to the ToTable method; just pass each column name as an argument, as ORDER_NO is passed in the sample code above. Note that this returns every distinct value once (including values such as 1000A that occur several times), not only the values that occur exactly once.
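
The two answers above do different things, which a small sketch on the question's own ORDER_NO values may make clearer. The class and method names here are hypothetical; only the sample values come from the question.

```csharp
using System;
using System.Data;
using System.Linq;

public static class OrderNoDemo
{
    // Builds a one-column table holding the ORDER_NO values from the question.
    public static DataTable BuildSample()
    {
        var table = new DataTable();
        table.Columns.Add("ORDER_NO", typeof(string));
        var values = new[]
        {
            "1000A", "1001A", "1001B", "1002A", "1002B", "1002C",
            "1000A", "1001A", "1001B"
        };
        foreach (string value in values)
        {
            table.Rows.Add(value);
        }
        return table;
    }

    // Values that occur exactly once in the column (the GroupBy approach).
    public static string[] OccurringOnce(DataTable table)
    {
        return table.AsEnumerable()
            .GroupBy(r => r.Field<string>("ORDER_NO"))
            .Where(g => g.Count() == 1)
            .Select(g => g.Key)
            .ToArray();
    }

    // Every distinct value, whether or not it repeats (the DataView approach).
    public static int DistinctCount(DataTable table)
    {
        return new DataView(table).ToTable(true, "ORDER_NO").Rows.Count;
    }

    public static void Main()
    {
        DataTable table = BuildSample();
        Console.WriteLine(string.Join(", ", OccurringOnce(table))); // prints "1002A, 1002B, 1002C"
        Console.WriteLine(DistinctCount(table));                    // prints "6"
    }
}
```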

ToTable() doesn't return distinct records?

I am using a DataTable to get distinct records, but somehow it's not returning distinct records.
I tried this:
dt.DefaultView.ToTable(true, "Id");
foreach (DataRow row in dt.Rows)
{
System.Diagnostics.Debug.Write(row["Id"]);
}
It still returns all the records. What could be wrong here?
UPDATE
My sql is as below
select t.Update ,t.id as Id, t.name ,t.toDate,t.Age from tableA t Where t.Id = 55
union
select t.Update ,t.id as Id, t.name ,t.toDate,t.Age from tableB t Where t.Id = 55
order by Id
It's very hard to do a distinct in my query, as there are many more columns than mentioned here.
If you use a database, it would be better to use SQL to return only distinct records (e.g. by using DISTINCT, GROUP BY or a window function).
If you want to filter the table in memory you could also use LINQ to DataSet:
dt = dt.AsEnumerable()
.GroupBy(r=>r.Field<int>("Id")) // assuming that the type is `int`
.Select(g=>g.First()) // take the first row of each group arbitrarily
.CopyToDataTable();
Note that LINQ becomes more useful when you want to filter these rows, or when you don't want an arbitrary first row from each Id group but, for example, the latest row according to a DateTime field. You may also want to order the groups or return only the first ten. No problem, just use OrderBy and Take.
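
As a rough sketch of that idea: the helper below keeps the latest row per Id, orders the result by Id and caps the number of rows. The method name LatestPerId and the sample rows are hypothetical, and it assumes that Id is an int column and that toDate (taken from the SQL in the question) is a DateTime column.

```csharp
using System;
using System.Data;
using System.Linq;

public static class LatestPerIdDemo
{
    // For each Id keeps the row with the greatest toDate, then returns
    // at most maxRows of those rows ordered by Id.
    public static DataTable LatestPerId(DataTable dt, int maxRows)
    {
        return dt.AsEnumerable()
            .GroupBy(r => r.Field<int>("Id"))
            .Select(g => g.OrderByDescending(r => r.Field<DateTime>("toDate")).First())
            .OrderBy(r => r.Field<int>("Id"))
            .Take(maxRows)
            .CopyToDataTable(); // throws InvalidOperationException if no rows remain
    }

    public static void Main()
    {
        // Sample rows invented for illustration.
        var dt = new DataTable();
        dt.Columns.Add("Id", typeof(int));
        dt.Columns.Add("name", typeof(string));
        dt.Columns.Add("toDate", typeof(DateTime));
        dt.Rows.Add(55, "old", new DateTime(2020, 1, 1));
        dt.Rows.Add(55, "new", new DateTime(2021, 1, 1));
        dt.Rows.Add(7, "only", new DateTime(2019, 6, 1));

        DataTable result = LatestPerId(dt, 10);
        foreach (DataRow row in result.Rows)
        {
            Console.WriteLine($"{row["Id"]} {row["name"]}");
        }
        // prints "7 only" and then "55 new"
    }
}
```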
The issue is that you're not grabbing the new table:
var newDt = dt.DefaultView.ToTable(true, "Id");
foreach (DataRow dr in newDt.Rows) ...
The ToTable method doesn't modify the existing table; it creates a new one.
DataView view = new DataView(table);
DataTable distinctValues = view.ToTable(true, "Id");
You are not assigning the return value to any variable. Try this:
DataTable dtnew = dt.DefaultView.ToTable(true, "Id");
foreach (DataRow row in dtnew.Rows)
{
System.Diagnostics.Debug.Write(row["Id"]);
}

Getting duplicates count for each distinct value from a datatable

I have a DataTable with a single text column 'Title', which can contain duplicate values. I can remove the duplicates using a DataView.
DataView v = new DataView(tempTable);
tempTable = v.ToTable(true, "Title");
But how can I get the number of duplicates for each distinct value without any looping?
If you don't want to loop or use LINQ, there is no direct way to do that. You can, however, use a computed column on the DataTable, provided one more condition holds: the data must be in two related tables, like this.
DataRelation rel = new DataRelation("CustToOrders", data.Tables["Customers"].Columns["customerid"], data.Tables["Orders"].Columns["customerid"]);
data.Relations.Add(rel);
Here the customerid field is a foreign key in the Orders table, so it contains duplicates.
You can get the count of the duplicates this way:
data.Tables["Customers"].Columns.Add("Duplicates",
typeof(decimal), "Count(child.customerid)");
The way I would get the results that you want would look something like this:
tempTable.Rows.Cast<DataRow>()
.Select(dr => Convert.ToString(dr[0]))
.GroupBy(dr => dr)
.Select(g => new { Title = g.Key, Count = g.Count() });
However, it's actually looping under the hood. In fact, I can't think of a way to do that kind of a grouping without inspecting each record.
The drawback is that the result of that expression is a sequence of anonymous type instances. If you still want the result to be a DataView, you could rewrite the last Select to create a new DataRow with two columns, and shove them into a new DataTable which you pass to the DataView.
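
A minimal sketch of that last step, assuming the column is named Title as in the question. The method name CountTitles and the sample values are hypothetical, and the explicit foreach is only one way to fill the new table.

```csharp
using System;
using System.Data;
using System.Linq;

public static class TitleCountsDemo
{
    // Returns a new table with one row per distinct Title and its occurrence count.
    public static DataTable CountTitles(DataTable tempTable)
    {
        var counts = new DataTable();
        counts.Columns.Add("Title", typeof(string));
        counts.Columns.Add("Count", typeof(int));

        var groups = tempTable.Rows.Cast<DataRow>()
            .GroupBy(dr => Convert.ToString(dr["Title"]));

        foreach (var g in groups)
        {
            counts.Rows.Add(g.Key, g.Count());
        }
        return counts;
    }

    public static void Main()
    {
        // Sample titles invented for illustration.
        var tempTable = new DataTable();
        tempTable.Columns.Add("Title", typeof(string));
        tempTable.Rows.Add("A");
        tempTable.Rows.Add("B");
        tempTable.Rows.Add("A");

        // The counts table can be wrapped in a DataView if a view is required.
        var view = new DataView(CountTitles(tempTable));
        foreach (DataRowView row in view)
        {
            Console.WriteLine($"{row["Title"]}: {row["Count"]}");
        }
        // prints "A: 2" and then "B: 1"
    }
}
```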
