LINQ on Datatable Find where all rows are empty

I have code that reads data from an Excel spreadsheet, and with the help of some answers on SO I have gotten this far:
DataTable dt = ds.Tables[0];
dt = dt.AsEnumerable().Where((row, index) => index > 4).CopyToDataTable();
DataTable filteredRows = dt.Rows.Cast<DataRow>().Where(row => row.ItemArray.All(field => !(field is System.DBNull))).CopyToDataTable();
By contrast, this
dt.Rows.Cast<DataRow>().Where(row => row.ItemArray.All(field => (field is System.DBNull)))
returns all the rows that are empty.
I have also tried Any, but it didn't give the required output.
The filteredRows code above keeps only rows where every field is non-null, i.e. every column has a value. That excludes any row with even one missing column, which is not what I want.
I want to exclude only the rows in which all columns are empty.

Just move the NOT (!) out one level. You want the rows where "all fields are null" is not true, rather than the rows where "all fields are not null" is true.
DataTable filteredRows = dt.Rows.Cast<DataRow>()
.Where(row => !row.ItemArray.All(field => field is System.DBNull))
.CopyToDataTable();

Have you tried filtering with Any instead of All?
DataTable filteredRows = dt.Rows.Cast<DataRow>().Where(row => row.ItemArray.Any(field => !(field is System.DBNull))).CopyToDataTable();
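The two answers above are logically equivalent: "not all fields are DBNull" is the same as "at least one field is not DBNull". One thing neither mentions is that CopyToDataTable throws an InvalidOperationException when the filtered sequence is empty. The following is a minimal sketch of a guard for that case; the class and method names (EmptyRowFilter, RemoveEmptyRows) are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Data;
using System.Linq;

public static class EmptyRowFilter
{
    // Returns a new table holding only the rows with at least one non-DBNull field.
    public static DataTable RemoveEmptyRows(DataTable source)
    {
        var kept = source.AsEnumerable()
            .Where(row => row.ItemArray.Any(field => !(field is DBNull)))
            .ToList();

        // CopyToDataTable throws on an empty sequence, so fall back to an
        // empty clone that keeps the column layout.
        return kept.Count > 0 ? kept.CopyToDataTable() : source.Clone();
    }
}
```

With this helper, the call in the question would become something like DataTable filteredRows = EmptyRowFilter.RemoveEmptyRows(dt);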

Related

Select all columns after filtering Distinct rows based on few columns from a Datatable in c#

I have gone through similar questions posted here but didn't find a solution to my problem.
I have a DataTable in C# which contains duplicate rows like below:
Now, I have to apply a filter which finds all distinct rows based on the last 2 highlighted columns, but the final result set has to return all columns.
Also, I'll get an ADDRESS_ID whose corresponding row should be returned, and duplicates should be removed.
DataView view = new DataView(ds.Tables[0]);
DataTable distinctValues = view.ToTable(true, "ADDR_LINE_1", "ADDR_LINE_2", "ADDR_LINE_3", "CITY", "STATE", "ZIP", "BOX_NUMBER");
This code returns 2 rows, but not all columns.
I also tried this code:
DataTable dtUniqRecords = new DataTable();
dtUniqRecords = ds.Tables[0].DefaultView.ToTable(true, "RELATE_CODE", "ADDRESS_TYPE", "ADDRESS_CODE", "ADDRESS_ID", "ADDR_LINE_1", "ADDR_LINE_2", "ADDR_LINE_3", "CITY", "STATE", "ZIP", "BOX_NUMBER");
But this returns all rows, including the duplicates.

You can use GroupBy to filter out duplicate values:
var dt = ds.Tables[0];
var distinct = dt.AsEnumerable()
.GroupBy(g => new
{
Address1 = g.Field<string>("ADDR_LINE_1"),
Address2 = g.Field<string>("ADDR_LINE_2")
// any other fields you need to group by
})
.Select(g => g.First()) // take the first row of each group, with all its columns
.CopyToDataTable();

Select distinct DataTable rows

I have a DataTable where sometimes the values in all columns repeat across two or more rows. I would like to get a DataTable with distinct rows. The solutions from here and here don't work for me because I have many columns and, depending on some conditions, the number of columns changes.
I was thinking maybe something like this
System.Data.DataTable table = new System.Data.DataTable(); // already populated table
DataView view = new DataView(table);
var tableDistinct = view.ToTable(true, table.Columns);
But I can't pass table.Columns as an argument.

I don't know what's going wrong because you haven't said what's not working. However, you could use LINQ (LINQ to DataSet):
table = table.AsEnumerable()
.GroupBy(r => new{ Col1 = r["Col1"], Col2 = r["Col2"], Col3 = r["Col3"] })
.Select(g => g.First())
.CopyToDataTable();
Change the columns in the anonymous type according to your column-list.

ToTable accepts a params array of column-name strings. The following converts all your column names to a string array so you don't have to enter them manually:
System.Data.DataTable table = new System.Data.DataTable(); // already populated table
DataView view = new DataView(table);
var tableDistinct = view.ToTable(true, table.Columns.Cast<DataColumn>().Select(z=>z.ColumnName).ToArray());

How to select Values from Datatable which appears once

myDataTable has a column named ORDER_NO. I would like to select the rows from this table whose ORDER_NO appears only once. If a value appears twice or more, it should not be selected.
ORDER_NO contains these values:
1000A
1001A
1001B
1002A
1002B
1002C
1000A
1001A
1001B
From the values above, I want to select only:
1002A
1002B
1002C
as they appear only once in the column. Can anyone help?

So you want only unique rows according to the ORDER_NO column?
Presuming that it's a string column you could use LINQ's Enumerable.GroupBy:
var uniqueRows = table.AsEnumerable()
.GroupBy(row => row.Field<string>("ORDER_NO"))
.Where(group => group.Count() == 1)
.Select(group => group.First());
If you want a new DataTable from the unique rows you can use:
table = uniqueRows.CopyToDataTable();
If you instead only want this column's values:
IEnumerable<string> uniqueOrderNumbers = uniqueRows.Select(row => row.Field<string>("ORDER_NO"));

Apart from @Tim's answer, you can also use a DataView to simplify this:
DataView view = new DataView(table);
DataTable distinctValues = view.ToTable(true, "ORDER_NO");
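I think this is the simplest way to get distinct values from any table. You can even pass multiple columns to the ToTable method: just pass each column name as an argument, as ORDER_NO is passed in the sample code above. Note that this returns every distinct ORDER_NO once, including values that are repeated in the table; it does not drop the values that occur more than once.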
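To make the difference between the two answers concrete, here is a small sketch that runs both approaches on the ORDER_NO values from the question. The class and method names (OrderNoDemo, BuildSample, DistinctValues, ValuesAppearingOnce) are hypothetical helpers introduced only for this comparison.

```csharp
using System;
using System.Data;
using System.Linq;

public static class OrderNoDemo
{
    // Builds a one-column table with the sample values from the question.
    public static DataTable BuildSample()
    {
        var table = new DataTable();
        table.Columns.Add("ORDER_NO", typeof(string));
        var values = new[] { "1000A", "1001A", "1001B", "1002A", "1002B",
                             "1002C", "1000A", "1001A", "1001B" };
        foreach (var value in values)
            table.Rows.Add(value);
        return table;
    }

    // DataView approach: every ORDER_NO listed once, repeated or not.
    public static string[] DistinctValues(DataTable table) =>
        new DataView(table).ToTable(true, "ORDER_NO")
            .AsEnumerable()
            .Select(row => row.Field<string>("ORDER_NO"))
            .ToArray();

    // GroupBy approach: only the values whose group holds exactly one row.
    public static string[] ValuesAppearingOnce(DataTable table) =>
        table.AsEnumerable()
            .GroupBy(row => row.Field<string>("ORDER_NO"))
            .Where(group => group.Count() == 1)
            .Select(group => group.Key)
            .ToArray();
}
```

On the sample data, DistinctValues returns six values (1000A, 1001A and 1001B are included once each), while ValuesAppearingOnce returns only 1002A, 1002B and 1002C, which is what the question asks for.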

get the datatable with distinct rows in C#

I have a DataTable returned as the result of fetching data from a spreadsheet. I need to display the result set with only the distinct rows, based on a single column.
For example I have a datatable with columns
id | name | age | email
If more than one record with the same id is listed, the extra records should be omitted. I tried
dt = dt.DefaultView.ToTable(true)
but it returns the records that are distinct with respect to all columns. I need the distinct records based only on the id.
Can anyone help me on this?

You can use GroupBy:
DataTable result = dt.AsEnumerable()
.GroupBy(x => x.Field<int>("Id"))
.Select(x => x.First()).CopyToDataTable();
Please note, in case of a matching Id, I am taking the first record and ignoring the rest.

You need to pass ToTable the name of the column on which the distinct values should be selected.
The code is below:
DataView view = new DataView(table);
DataTable distinctValues = view.ToTable(true, "id");

Better way to filter DataTable

I currently have a DataTable with the following columns: Date, X1, Y1, Z1, X2, Y2, Z2... Xn, Yn, Zn.
When populated, Date ALWAYS has a value, and X/Y/Z1 to X/Y/Zn can be DBNull, a string, or an int. If the entire row, with the exception of Date, is DBNull, I would like to remove that particular row.
I am currently doing an exhaustive search: looping through each row with a for loop, then checking each cell with a nested for loop. If I do not find any data (i.e. only DBNulls), I call RemoveAt and reset the outer loop to start at zero again.
Is there a better/less hacky way of performing this operation? The initial building of the DataTable cannot be modified; this must be something that happens post building.

If I understand correctly, you want to remove a row if all of its columns (apart from Date) hold DBNull.Value.
Try the following.
DataTable table = new DataTable();
string[] columns = table.Columns.Cast<DataColumn>()
.Select(x => x.ColumnName)
.Skip(1)//skip to ignore first column
.ToArray();
Method 1: remove all invalid rows
// IsNull is used because Field<object> converts DBNull to null,
// so comparing its result with DBNull.Value would never match
var invalidRows = table.AsEnumerable()
.Where(x => columns.All(c => x.IsNull(c)))
.ToArray();
foreach (var row in invalidRows)
{
table.Rows.Remove(row);
}
Method 2: take only the valid rows and make a new DataTable, as suggested by @Tim in the comments, to improve performance when you have many invalid rows
var newTable = table.AsEnumerable()
.Where(x => columns.Any(c => !x.IsNull(c)))
.CopyToDataTable();

Note: these are my own examples, not written exactly for your table, so adapt them to your case.
The main help is here: Help
And then:
Option one:
dtData.Select("ID=1 AND ID2=3");
Option two:
GridFieldDAO dao = new GridFieldDAO();
//Load My DataTable
DataTable dt = dao.getDT();
//Get My rows based off selection criteria
DataRow[] drs = dt.Select("(detailID = 1) AND (detailTypeID = 2)");
//make a new "results" datatable via clone to keep structure
DataTable dt2 = dt.Clone();
//Import the Rows
foreach (DataRow d in drs)
{
dt2.ImportRow(d);
}
//Bind to my new DataTable and it will only show rows based off selection
//criteria
myGrid.DataSource = dt2;
myGrid.DataBind();
And the best way is:
DataTable tblFiltered = table.AsEnumerable()
.Where(row => row.Field<String>("Nachname") == username
&& row.Field<String>("Ort") == location)
.OrderByDescending(row => row.Field<String>("Nachname"))
.CopyToDataTable();

Maybe this will help you. Try this:
// DateTime? lets a DBNull cell come back as null instead of throwing
var ordered = yourdatatable.AsEnumerable().Where(x => x.Field<DateTime?>("ColumnName") != null);
if (ordered.Any())
{
yourdatatable = ordered.CopyToDataTable();
}
You can do the same for the other columns as well.
Or:
Why don't you check for the null values in your query? Use ISNULL(columnName, value) AS ColumnName. Check more details here

You can use this little Linq query:
var columnsWithoutDate = table.Columns.Cast<DataColumn>().Skip(1);
table = table.AsEnumerable()
.Where(row => columnsWithoutDate.Any(col => !row.IsNull(col)))
.CopyToDataTable();
Skip(1) returns all columns but the first, so your date column is excluded. The Where enumerates all DataRows in the table and takes all rows with at least one non-null field (see: DataRow.IsNull(column)). Finally, CopyToDataTable creates a new DataTable.
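One reason DataRow.IsNull is a safe choice here: the Field<T> extension converts DBNull to null for reference and nullable types, so a test such as row.Field<object>(c) == DBNull.Value never matches. The sketch below illustrates this on a one-cell table; DbNullCheckDemo and Inspect are hypothetical names used only for this check.

```csharp
using System;
using System.Data;

public static class DbNullCheckDemo
{
    // Reads a single DBNull cell in three different ways and reports the results.
    public static (bool FieldIsDbNull, bool FieldIsNull, bool RowIsNull) Inspect()
    {
        var table = new DataTable();
        table.Columns.Add("X1", typeof(string));
        DataRow row = table.Rows.Add(DBNull.Value);

        // Field<object> hands back null, not DBNull.Value, for an empty cell.
        object viaField = row.Field<object>("X1");

        return (ReferenceEquals(viaField, DBNull.Value),
                viaField == null,
                row.IsNull("X1"));
    }
}
```

Inspect returns (false, true, true): the comparison with DBNull.Value fails, while both the null check and IsNull detect the empty cell.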

I would go for something like this:
var test = from row in table.AsEnumerable()
where (!row.IsNull("col1") || !row.IsNull("col2"))
select row;
//option1
DataTable dt = test.CopyToDataTable<DataRow>();
//option2
DataTable dt2 = new DataTable();
dt2.Columns.Add("col1", typeof(String));
dt2.Columns.Add("col2", typeof(Int32));
foreach (var v in test)
{
DataRow dr = dt2.NewRow();
// copy the raw cell values so a DBNull in either column is preserved
dr["col1"] = v["col1"];
dr["col2"] = v["col2"];
dt2.Rows.Add(dr);
}

Did you try using RowFilter of DataTable?
DataTable dt = GetData();
//set the filter
dt.DefaultView.RowFilter = "----your filter----";
//then access the DataView
foreach (DataRowView drv in dt.DefaultView)
{
//you can also get a row from rowview
DataRow dr = drv.Row;
}
Check this documentation; it also explains how to handle null values in filters.
http://msdn.microsoft.com/en-us/library/system.data.dataview.rowfilter.aspx
You can also use the Select() method with the same filter. The answer below has a good comparison of the approaches:
DataView.RowFilter Vs DataTable.Select() vs DataTable.Rows.Find()
I would not suggest the AsEnumerable() approach: although the code looks simple, it is just like running a foreach loop over the rows with if conditions.
The DataTable filter approach should be faster than AsEnumerable() (I am not sure; I am assuming this because DataTable is .NET's built-in data structure for handling tabular data).
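As a rough sketch of what such a filter could look like for the question's table, the snippet below builds an "IS NOT NULL" expression over every column except the first and applies it through a DataView. RowFilterDemo and KeepRowsWithData are hypothetical names, and the sketch assumes that column names do not contain a closing bracket, which would need escaping inside the filter expression.

```csharp
using System;
using System.Data;
using System.Linq;

public static class RowFilterDemo
{
    // Keeps the rows where at least one column after the first is not null.
    public static DataTable KeepRowsWithData(DataTable table)
    {
        // Produces e.g. "[X1] IS NOT NULL OR [Y1] IS NOT NULL".
        // With no columns after the first, the filter is empty and all rows are kept.
        string filter = string.Join(" OR ",
            table.Columns.Cast<DataColumn>()
                .Skip(1)
                .Select(column => $"[{column.ColumnName}] IS NOT NULL"));

        var view = new DataView(table) { RowFilter = filter };
        return view.ToTable(); // copies the visible rows into a new table
    }
}
```

Whether this is actually faster than the AsEnumerable() version is something to measure on your own data rather than assume.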

Modified answer:
myDataTable.AsEnumerable().Where(a => a.ItemArray.Count(b => b != DBNull.Value) == 1).ToList().ForEach(row => myDataTable.Rows.Remove(row));
I checked, it works.
EDIT:
In response to @Tim Schmelter's comments:
1. "You need myDataTable.AsEnumerable() in C#."
If you have a strongly typed DataTable, you do not. I assumed that was the case, since the OP says: "The initial building of the datatable cannot be modified, this must be something that happens post building."
Maybe I didn't understand what he meant (my English sometimes fails me).
2. "Counting the non-null fields is incorrect, since a string can be null, which is not the same as DBNull.Value (also according to the OP's specifications)."
You are probably right. If the OP says he only wants DBNull, the second condition should be removed (it's a bad habit of mine to check for null just in case).
3. "ToList creates another List, which is redundant."
Yes. But without ToList(), ForEach() can't be used. An old-fashioned foreach can be used instead, or better, a for loop (since foreach doesn't like it when you modify the collection inside it). You still have to keep your result in some way.
4. "DataRow.Delete does not remove the row from the table, which is what is desired; it flags it as deleted for a DataAdapter (the OP has not mentioned using one, so that is also not desired)."
Thank you for pointing that out.
