Given a DataTable with columns A, B, C and D,
I'm looking for a clean way to restrict the columns to, say, A and C,
similar to how a DataView restricts the rows of a DataTable.
Typically a DataTable has a relatively small, fixed number of named columns, and a potentially large, variable number of unnamed rows. Hence filtering makes sense for rows, but not for columns.
Most applications would simply ignore the columns they are not interested in.
I don't think there's any way to do what you want, short of cloning the DataTable and deleting the columns you don't want.
Or perhaps pivoting the DataTable so that columns become rows.
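A minimal sketch of the clone-and-delete approach mentioned above. The class and method names (ColumnFilter, KeepColumns) are made up for illustration, and it assumes the columns being dropped are not part of a primary key or other constraint, since removing such a column would throw.

```csharp
using System.Data;
using System.Linq;

public static class ColumnFilter
{
    // Returns a copy of `source` that keeps only the named columns.
    // The original table is left untouched.
    public static DataTable KeepColumns(DataTable source, params string[] keep)
    {
        DataTable copy = source.Copy(); // copies both schema and rows

        // Collect the names first: removing columns while enumerating
        // the Columns collection would invalidate the enumerator.
        var toRemove = copy.Columns.Cast<DataColumn>()
            .Select(c => c.ColumnName)
            .Where(name => !keep.Contains(name)) // case-sensitive match (a choice of this sketch)
            .ToList();

        foreach (string name in toRemove)
            copy.Columns.Remove(name);

        return copy;
    }
}
```

Calling ColumnFilter.KeepColumns(table, "A", "C") would then give a new table with only columns A and C, at the cost of copying all the data first.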
Found the solution; posting it here for others who might land here.
The DataView.ToTable() method is described here: http://msdn.microsoft.com/en-us/library/wec2b2e6.aspx
Used as:
DataTable result = table.DefaultView.ToTable(false, "A", "C"); // first argument: true returns only distinct rows
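For reference, here is a small sketch of that call wrapped in a helper. It uses the DataView.ToTable(bool distinct, params string[] columnNames) overload; the wrapper names (ColumnProjection, Project) are only illustrative.

```csharp
using System.Data;

public static class ColumnProjection
{
    // Builds a new DataTable containing only the requested columns,
    // in the order the names are given.
    public static DataTable Project(DataTable source, params string[] columnNames)
    {
        // false keeps duplicate rows; true would return only distinct rows.
        return source.DefaultView.ToTable(false, columnNames);
    }
}
```

Because the call goes through DefaultView, any RowFilter or Sort set on that view also applies to the result.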
Related
This question is about finding a more efficient way to solve a simple problem. I have two DataTables with the same structure (i.e. the columns have the same names and the same ordinals). Let's call them DataTable A and DataTable B. Assume both have 100 rows. Now I want to copy all the rows of DataTable B to DataTable A without removing any rows from DataTable A, so that in the end DataTable A has 200 rows. I did it as shown below.
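// Append every row of B to A.
for (int i = 0; i < B.Rows.Count; i++) // "Count - 1" would skip the last row
{
    DataRow dr = B.Rows[i];
    A.ImportRow(dr); // A.Rows.Add(dr) throws because dr already belongs to B
}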
The issue is that I do not want to loop. Is there a direct way to copy all 100 rows at once, without looping? Is there a function that lets you specify the set of rows you want to copy?
As far as I know, there is no way of copying multiple rows from one DataTable to another other than iterating through the rows. In fact, there is an MSDN article that explains how to copy rows between DataTables, and it uses a loop.
https://support.microsoft.com/en-gb/kb/305346
There are some problems with your simple approach because it doesn't handle primary key violations. Try BeginLoadData, LoadDataRow and EndLoadData. This should be more efficient. Call BeginLoadData and EndLoadData only once each, around the whole loop.
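The BeginLoadData / LoadDataRow / EndLoadData suggestion could look roughly like this. It is a sketch, assuming both tables have the same column layout and that the source contains no deleted rows; the names RowCopier and AppendRows are invented for the example.

```csharp
using System.Data;

public static class RowCopier
{
    // Appends every row of `source` to `target`.
    // If `target` has a primary key, LoadDataRow updates the row with a
    // matching key instead of adding a duplicate.
    public static void AppendRows(DataTable source, DataTable target)
    {
        target.BeginLoadData(); // turn off notifications, index maintenance and constraints
        try
        {
            foreach (DataRow row in source.Rows)
            {
                // ItemArray hands over the row's values; passing true accepts
                // the changes, so the loaded row ends up Unchanged.
                target.LoadDataRow(row.ItemArray, true);
            }
        }
        finally
        {
            target.EndLoadData(); // turn everything back on, once, after the loop
        }
    }
}
```

This still loops over the rows, but the bookkeeping on the target table is suspended for the duration of the load.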
If you just need a new independent DataTable instance to work with and do not need to append rows to an existing DataTable, then the DataView.ToTable() method is very convenient.
https://msdn.microsoft.com/en-us/library/a8ycds2f(v=vs.110).aspx
It creates a separate copy with the same schema and content.
DataTable objTableB = objTableA.DefaultView.ToTable();
I have two DataTables; both have the same number of columns and the same column names. I need to compare them and find the rows that differ, which means that even if a single cell doesn't match, the row should be reported. I tried with
table1.Merge(table2);
DataTable modified = table2.GetChanges();
But this is returning null.
Whereas
IEnumerable<DataRow> added = table1.AsEnumerable().Except(table2.AsEnumerable());
This returns all the rows of table1, even though only some cells in table1 differ from table2.
Can anyone help me with this comparison? The various sites I referred to said to compare each column in a row, but since I have N columns I can't go with that. I need a smarter, more efficient way of comparing.
Thanks in advance
IEnumerable<DataRow> added = table1.AsEnumerable().Except(table2.AsEnumerable());
should be changed to
IEnumerable<DataRow> added = table1.AsEnumerable().Except(table2.AsEnumerable(),DataRowComparer.Default);
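This is because DataRows don't know how to compare themselves to each other by value on their own; without a comparer, Except falls back to reference equality. You can also provide your own IEqualityComparer<DataRow> implementation instead of DataRowComparer.Default if required.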
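As a small illustration of the corrected call, the sketch below wraps it in a helper (TableDiff and RowsOnlyIn are made-up names). DataRowComparer.Default compares rows by their column values, so a row counts as different as soon as one cell differs.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class TableDiff
{
    // Returns the rows of `first` that have no value-equal row in `second`.
    // Except is a set operation, so value-equal duplicates within `first`
    // are reported only once.
    public static List<DataRow> RowsOnlyIn(DataTable first, DataTable second)
    {
        return first.AsEnumerable()
            .Except(second.AsEnumerable(), DataRowComparer.Default)
            .ToList();
    }
}
```

Running it in both directions, RowsOnlyIn(table1, table2) and RowsOnlyIn(table2, table1), gives the rows that differ on either side.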
Have you tried using Merge? http://msdn.microsoft.com/en-us/library/fk68ew7b.aspx
dataTable1.Merge(dataTable2);
DataTable dataTable3 = dataTable2.GetChanges();
I have an object structure that mimics the properties of an Excel table. So I have a table object containing properties such as a title, a header row object and body row objects. Within the header row and each body row object, I have cell objects containing info on each cell in that row. I am looking for a more efficient way to store this table structure, since in one of my uses for this object I am printing its structure to the screen. Currently, I am doing an O(n^2) operation, printing each cell of each row:
foreach (var row in Table.Rows) {
    foreach (var cell in row.Cells) {
        Console.WriteLine(cell.ToString());
    }
}
Is there a more efficient way of storing this structure to avoid the n^2? I ask because this printing functionality sits inside another n^2 loop. Basically, I have a list of table titles and a list of tables. I need to find the tables whose titles are in the title list. Then, for each of those tables, I need to print their rows and the cells in each row. Can any part of this operation be optimized by using a different data structure for storage? I'm not sure exactly how they work, but I have heard of hashing and dictionaries.
Thanks
Since you are looking for tables with specific titles, you could use a dictionary to store the tables by title:
Dictionary<string,Table> tablesByTitle = new Dictionary<string,Table>();
tablesByTitle.Add(table.Title, table);
...
table = tablesByTitle["SomeTableTitle"];
This would make finding a table an O(1) operation. Finding n tables would be an O(n) operation.
Printing the tables then of course depends on the number of rows and columns. Nothing can change that.
UPDATE:
string tablesFromGuiElement = "Employees;Companies;Addresses";
string[] selectedTables = tablesFromGuiElement.Split(';');
foreach (string title in selectedTables) {
Table tbl = tablesByTitle[title];
PrintTable(tbl);
}
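One caveat with the indexer used above: tablesByTitle[title] throws a KeyNotFoundException when a title is missing. A hedged variant using TryGetValue is sketched below; the Table class here is only a hypothetical stand-in for the asker's own type, and TableLookup / FindByTitles are illustrative names.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the asker's table type.
public class Table
{
    public string Title { get; set; } = "";
}

public static class TableLookup
{
    // Returns the tables whose titles appear in `titles`,
    // silently skipping titles that are not in the dictionary.
    public static List<Table> FindByTitles(
        Dictionary<string, Table> tablesByTitle, IEnumerable<string> titles)
    {
        var found = new List<Table>();
        foreach (string title in titles)
        {
            // One hash lookup per title: O(1) on average.
            if (tablesByTitle.TryGetValue(title, out Table table))
                found.Add(table);
        }
        return found;
    }
}
```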
There isn't anything more efficient than an N^2 operation for outputting an NxN matrix of values. Worst-case, you will always be doing this.
Now, if instead of storing the values in a multidimensional collection that defines the graphical relationship of rows and columns, you put them in a one-dimensional collection and included the row-column information with each cell, then you would only need to iterate through the cells that had values. Worst-case is still N^2 for a table of N rows and N columns that is fully populated (the one-dimensional array, though linear to enumerate, will have N^2 items), but the best case would be that only one cell in that table is populated (or none are) which would be constant-time.
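The one-dimensional idea could be sketched as follows, here with a dictionary keyed by (row, column) so that only populated cells are stored. SparseTable and its members are invented for illustration, and cell values are assumed to be plain strings.

```csharp
using System.Collections.Generic;

public class SparseTable
{
    // Only cells that actually hold a value get an entry.
    private readonly Dictionary<(int Row, int Col), string> cells =
        new Dictionary<(int Row, int Col), string>();

    public int Count => cells.Count;

    public void Set(int row, int col, string value) => cells[(row, col)] = value;

    // Enumerates the populated cells only: the cost is proportional to
    // the number of stored cells, not to rows * columns.
    // Dictionary enumeration order is not guaranteed.
    public IEnumerable<string> Describe()
    {
        foreach (var entry in cells)
            yield return $"[{entry.Key.Row},{entry.Key.Col}] {entry.Value}";
    }
}
```

The trade-off is that printing the table as a grid now needs the cells sorted or looked up by position, so this only pays off when the table is mostly empty.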
This answer applies to the "printing the table" part; the question was extended afterwards.
For the "getting the table" part, see the other answer.
No, there is not.
Unless perhaps your values follow some predictable distribution, then you could use a function of x and y and store no data at all, or maybe a seed and a function.
You could cache the print output in a string or StringBuilder if you require it multiple times.
If there is enough data I guess you might apply some compression algorithm but I wouldn't say that was simpler or more efficient.
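The caching suggestion might look something like this sketch: the text is built once with a StringBuilder and reused afterwards. CachedTablePrinter is a hypothetical name, rows are assumed to be sequences of strings, and the cache is assumed never to need invalidation (the data does not change after construction).

```csharp
using System.Collections.Generic;
using System.Text;

public class CachedTablePrinter
{
    private readonly IEnumerable<IEnumerable<string>> rows;
    private string cached; // stays null until the first Render call

    public CachedTablePrinter(IEnumerable<IEnumerable<string>> rows)
    {
        this.rows = rows;
    }

    // Walks every cell only on the first call; later calls return
    // the stored string without touching the rows again.
    public string Render()
    {
        if (cached == null)
        {
            var sb = new StringBuilder();
            foreach (IEnumerable<string> row in rows)
                sb.AppendLine(string.Join("\t", row)); // one tab-separated line per row
            cached = sb.ToString();
        }
        return cached;
    }
}
```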
Morning,
Regarding the following quote, is this limit independent of how many columns there are? (I'm assuming not, but it's not specifically stated anywhere.) If it is linked to the number of columns, how do you calculate whether you're over this limit?
To add rows to a DataTable, you must first use the NewRow method to return a new DataRow object. The NewRow method returns a row with the schema of the DataTable, as it is defined by the table's DataColumnCollection. The maximum number of rows that a DataTable can store is 16,777,216. For more information, see Adding Data to a DataTable.
"Link to where quote was taken from."
Thanks for your help.
I would expect that limit (which is 2^24) to be independent of the number of columns. My guess is that a single 32-bit integer is used internally, with 24 bits representing the row count and 8 bits used for flags or something similar.
In practice, 16 million rows is going to take a long time to populate and a lot of memory... if you're in danger of hitting that limit, you should probably be rethinking how you're accessing data to start with.
It is not linked to the number of columns, except insofar as your memory has an upper limit: if you have so many columns that the rows no longer fit in memory, you could get an out-of-memory error.
You can use DataTable.Rows.Count to check the current row count before adding a new row.
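A tiny sketch of that check, using the documented limit of 16,777,216 (which equals 2^24). The RowLimit class and its members are illustrative names, not part of the DataTable API.

```csharp
using System.Data;

public static class RowLimit
{
    // Documented maximum number of rows in a DataTable: 2^24 = 16,777,216.
    public const int MaxRows = 1 << 24;

    // True while the table is still below the documented limit.
    public static bool CanAddRow(DataTable table) => table.Rows.Count < MaxRows;
}
```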
I'm creating a HashMap mapping the ID field of a row in a DataTable to the row itself, to improve lookup time for some frequently accessed tables. Now, from time to time, I'm getting the RowNotInTableException:
This row has been removed from a table and does not have any data. BeginEdit() will allow creation of new data in this row.
After looking around the net a bit, it seems that DataRows don't like not being attached to a DataTable. Even though the DataTable stays in memory (not sure if the DataRows keep a reference to it, but I'm definitely still caching it anyway), is it possible I'm breaking something by keeping those rows all isolated in a HashMap? What other reason can there be for this error? This post
RowNotInTableException when accessing second time
discusses a similar problem but there's no solution either.
UPDATE
I'm actually storing DataRowViews if that makes any difference.
A DataRow should always be attached to some DataTable. Even if it is removed from the DataTable, the row still holds a reference to the table.
The reason is that the table's schema lives in the DataTable, not in the DataRow (and so does the data itself).
If you want fast lookup without DataTables, use a structure of your own instead of DataRow.
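One way to read "a structure of your own" is to copy the values out of the rows into plain arrays or objects, so the cache no longer depends on the rows staying in the table. The sketch below assumes the ID column is of type int; RowCache and Snapshot are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Data;

public static class RowCache
{
    // Copies each row's values into a detached object[] keyed by its ID.
    // The copies stay readable even if rows are later removed from the table,
    // but they do not reflect later edits either.
    public static Dictionary<int, object[]> Snapshot(DataTable table, string idColumn)
    {
        var byId = new Dictionary<int, object[]>();
        foreach (DataRow row in table.Rows)
        {
            // Deleted rows have no current values to read.
            if (row.RowState == DataRowState.Deleted)
                continue;

            // The ItemArray getter returns a fresh array of the row's values.
            byId[(int)row[idColumn]] = row.ItemArray;
        }
        return byId;
    }
}
```

The cache would have to be rebuilt (or updated) whenever the underlying table changes.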