Copying ASP.Net TableRow - c#

I think this problem is related to Reference Types, and my lack of understanding of these ...
So I have dynamically created ASP.Net Tables (as in Web.UI.WebControls.Table, not the database variety)
These can have anything from one row with one cell with text, to a whole series of nested tables and controls, depending on the clients.
I need to loop through each TableRow; if a certain condition is met, I copy that row to a second Table object. Here's a simplified bit of the code.
Table xTblComplete = (passed in as parameter) // original & complete table
Table xTblTemp = new Table(); // gets built dynamically with specific rows
foreach (TableRow xThisRow in xTblComplete.Rows)
{
if (xThisRow.Cells.Count > 0)
{
if (certain condition met)
{
xTblTemp.Rows.Add(xThisRow);
}
}
}
Where I come unstuck is that the foreach (row in table.rows) throws an error when I try to add the TableRow to Table2. I get the error "Collection was modified; enumeration operation may not execute. "
This makes sense, in that I should be making a COPY of that Table Row to add.
Can anyone advise on how this is done? I've scanned MSDN and the forums for general copying of reference types, but they all seem to point to using ICloneable, which I believe I'm unable to do as this isn't my class.
Am hoping this is something small and fundamental I'm missing out on, thanks in advance.

You are iterating through the row collection with a foreach loop. You can't modify that collection while you are enumerating it, and adding the row to the second table removes it from the first; that's why you are getting the error message. That row is attached to that table. Period.
If you need a copy of that row, build a new row and copy the values you are looking for cell by cell. Here is an example:
TableRow tempRow = new TableRow();
Table xTblTemp = new Table();
// Copy the first row cell by cell, creating new TableCell objects so nothing is re-parented.
for (int i = 0; i < xTblComplete.Rows[0].Cells.Count; i++)
{
    TableCell cell = xTblComplete.Rows[0].Cells[i];
    TableCell newCell = new TableCell();
    newCell.Text = cell.Text;
    tempRow.Cells.Add(newCell);
}
xTblTemp.Rows.Add(tempRow);

Thanks Ulises, your answers were helpful; unfortunately the complexity of these tables precluded a simple loop-and-copy of the contents. By that I mean there were cells that might have had 5 levels of nested tables, with any number of web controls inside each. Yep, a CSS perfectionist would retch at the idea of so many nested tables, but it's what had to be done!!
In the end I used a while(bln) loop, examining xTblComplete.Rows[0] each time.
If it met the condition, I copied it to xTblTemp, which also removed it from xTblComplete.
If it failed the condition, I removed it myself (xTblComplete.Rows.Remove(xTblComplete.Rows[0])).
This way Rows[0] was always the next row to process.
After each loop I checked the bln for more rows to process; if there were none, the loop exited.
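For anyone who wants the shape of that loop, here is a minimal sketch of the approach described above (RowMeetsCondition is a placeholder for whatever the real check is; note that adding a row to the second table's Rows collection also removes it from the original, since a control can only have one parent):
Table xTblTemp = new Table();
bool blnMoreRows = xTblComplete.Rows.Count > 0;
while (blnMoreRows)
{
    TableRow xThisRow = xTblComplete.Rows[0];
    if (RowMeetsCondition(xThisRow))   // placeholder for the real condition
    {
        // Re-parents the row: it is added to xTblTemp and removed from xTblComplete.
        xTblTemp.Rows.Add(xThisRow);
    }
    else
    {
        xTblComplete.Rows.Remove(xThisRow);
    }
    blnMoreRows = xTblComplete.Rows.Count > 0;
}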

Related

What is the fastest way to populate a C# DataTable with data stored on columns?

I have a DataTable object that I need to fill based on data stored in a stream of columns - i.e. the stream initially contains the schema of the DataTable, and subsequently, values that should go into it organised by column.
At present, I'm taking the rather naive approach of
Create enough empty rows to hold all data values.
Fill those rows per cell.
The result is a per-cell iteration, which is not especially quick to say the least.
That is:
// Create rows first...
// Then populate...
foreach (var col in table.Columns.Cast<DataColumn>())
{
List<object> values = GetValuesfromStream(theStream);
// Actual method has some DBNull checking here, but should
// be immaterial to any solution.
for (var i=0; i<values.Count; i++)
table.Rows[i][col] = values[i];
}
My guess is that the backing DataStorage items for each column aren't expanded when the rows are added, but only as values are added to each column, though I'm far from certain. Any tips for loading this kind of data?
NB that loading all lists first and then reading in by row is probably not sensible - this approach is being taken in the first place to mitigate potential out of memory exceptions that tend to result when serializing huge DataTable objects, so grabbing a clone of the entire data grid and reading it in would probably just move the problem elsewhere. There's definitely enough memory for the original table and another column of values, but there probably isn't for two copies of the DataTable.
Whilst I haven't found a way to avoid iterating cells, as per the comments above, I've found that writing to DataRow items that have already been added to the table turns out to be a bad idea, and was responsible for the vast majority of the slowdown I observed.
The final approach I used ended up looking something like this:
List<DataRow> rows = null;
// Start population...
var cols = table.Columns.Cast<DataColumn>().Where(c => string.IsNullOrEmpty(c.Expression));
foreach (var col in cols)
{
List<object> values = GetValuesfromStream(theStream);
// Create rows first if required.
if (rows == null)
{
rows = new List<DataRow>();
for (var i=0; i<values.Count; i++)
rows.Add(table.NewRow());
}
// Actual method has some DBNull checking here, but should
// be immaterial to any solution.
for (var i=0; i<values.Count; i++)
rows[i][col] = values[i];
}
rows.ForEach(r => table.Rows.Add(r));
This approach addresses two problems:
If you try to add an empty DataRow to a table that has null-restrictions or similar, then you'll get an error. This approach ensures all the data is there before it's added, which should address most such issues (although I haven't had need to check how it works with auto-incrementing PK columns).
Where expressions are involved, these are evaluated when row state changes for a row that has been added to a table. Consequently, where before I had re-calculation of all expressions taking place every time a value was added to a cell (expensive and pointless), now all calculation takes place just once after all base data has been added.
There may of course be other complications with writing to a table that I've not yet encountered because the tables I am making use of don't use those features of the DataTable class/model. But for simple cases, this works well.

How to read all new rows from database?

I am trying to read all new rows that are added to the database on a timer.
First I read the entire database and save it to a local data table, but I want to read all new rows that are added to the database. Here is how I'm trying to read new rows:
string accessDB1 = string.Format("SELECT * FROM {0} ORDER BY ID DESC", tableName);
setupaccessDB(accessDB1);
int dTRows = localDataTable.Rows.Count + 1;
localDataTable.Rows.Add();
using (readNext = command.ExecuteReader())
{
while (readNext.Read())
{
for (int xyz = 0; xyz < localDataTable.Columns.Count; xyz++)
{
// Code
}
break;
}
}
If only 1 row is added within the timer then this works fine, but when multiple rows are added this only reads the latest row.
So is there any way I can read all of the added rows?
I am using an OleDbDataReader.
Thanks in advance
For most tables the primary key is based on an incremental value. This can be a very simple integer that is incremented by one, but it could also be a datetime-based guid.
Anyway, if you know the id of the last record, you can simply ask for all records that have a 'higher' id. In that way you do get the new records, but what about updated records? If you also want those, you might want to use a column that contains a datetime value.
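A minimal sketch of that idea against an OLE DB source (the connection, the ID column and the GetLastProcessedId helper are illustrative, not from the original code):
// Fetch only the rows added since the highest ID we have already processed.
long lastId = GetLastProcessedId();   // e.g. the max ID currently held in localDataTable
string sql = string.Format("SELECT * FROM {0} WHERE ID > ? ORDER BY ID", tableName);
using (OleDbCommand cmd = new OleDbCommand(sql, connection))
{
    cmd.Parameters.AddWithValue("?", lastId);   // OLE DB parameters are positional
    using (OleDbDataReader reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            // copy the reader's values into localDataTable here
        }
    }
}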
A little trickier are records that are deleted from the database. You can't retrieve those with a basic query. You could solve that by setting a TTL for each record you retrieve from the database, much like a cache. When the record has 'expired', you try to retrieve it again.
Some databases like Microsoft SQL Server also provide more advanced options in this regard. You can use query notifications via the broker services or enable change tracking on your database. The latter can even indicate what the last action per record was (insert, update or delete).
Your immediate problem lies here:
while (readNext.Read())
{
doSomething();
break;
}
This is what your loop basically boils down to. That break is going to exit the loop after processing the first item, regardless of how many items there are.
The first item, in this case, will probably be the last one added (as you state it is) since you're sorting by descending ID.
In terms of reading only newly added rows, there are a variety of ways to do it, some which will depend on the DBMS that you're using.
Perhaps the simplest and most portable would be to add an extra column processed which is set to false when a row is first added.
That way, you can simply have a query that looks for those records and, for each, process them and set the column to true.
In fact, you could use triggers to do this (force the flag to false on insertion) which opens up the possibility for doing it with updates as well.
Tracking deletions is a little more difficult but still achievable. You could have a trigger which actually writes the record to a separate table before deleting it so that your processing code has access to those details as well.
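A rough sketch of the processed-flag approach (the Processed column, the ProcessRow helper and the exact SQL syntax are assumptions; adapt them to your DBMS):
// Read only the rows that have not been handled yet, then mark them as handled.
string selectSql = string.Format("SELECT * FROM {0} WHERE Processed = 0", tableName);
using (OleDbCommand select = new OleDbCommand(selectSql, connection))
using (OleDbDataReader reader = select.ExecuteReader())
{
    while (reader.Read())
    {
        ProcessRow(reader);   // placeholder for whatever you do with each new row
    }
}

string updateSql = string.Format("UPDATE {0} SET Processed = 1 WHERE Processed = 0", tableName);
using (OleDbCommand update = new OleDbCommand(updateSql, connection))
{
    update.ExecuteNonQuery();
}
In a real implementation you would key the UPDATE on the IDs you actually read, so rows inserted between the SELECT and the UPDATE are not skipped.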
The following works
using (readNext = command.ExecuteReader())
{
while (readNext.Read())
{
int abc = readNext.FieldCount;
for (int s = 1; s < abc; s++)
{
var nextValue = readNext.GetValue(s);
}
}
}
The for loop reads the values of the current row, and the while loop then moves on to the next row.

identify which row caused exception in c#

I have a DataTable: we fetch values from one database, put them into the DataTable, and insert them into another database. I am using SQL's execute-query method and stored procedures to insert the data. If one row causes a "string or binary data would be truncated" error, can we identify it in C# and print that row to the console?
Basically, everything is fine in the DataTable, but when I insert it I get an exception. Can I get the details of the row which is causing the exception?
Can anyone guide me on how to proceed with this? I need to know the exact row which is causing the issue.
If you don't know which specific row is causing the error, you'll probably have to foreach loop through it.
foreach(DataRow row in yourDataTable.Rows) {
//Check for the issue.
}
This will loop through each DataRow in your table. You'll have to check each individual cell; you can't foreach over the cells directly, so you'll have to do it manually based on the row. For example, if you know each column name, you can do:
if(row["whatever"]...) // you want to check for the issue here.
You can also do:
int len = yourDataTable.Columns.Count;
foreach(DataRow row in yourDataTable.Rows) {
for(int i = 0; i < len; i++) {
// Check based on row[i] for your problem.
}
}
That will loop through each row, then each cell in each row based on index. You'll have to do your comparison based on the type received, though, which you'll have to determine based on the contents of your DataTable.
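For the "string or binary data would be truncated" case specifically, the check inside that inner loop can compare string lengths against the maximum length each destination column allows. The limits below are illustrative; in practice you would read them from the target table's schema:
// Maximum lengths of the destination columns (illustrative values).
var maxLengths = new Dictionary<string, int>
{
    { "Name", 50 },
    { "Description", 200 }
};

foreach (DataRow row in yourDataTable.Rows)
{
    foreach (DataColumn col in yourDataTable.Columns)
    {
        string text = row[col] as string;
        int limit;
        if (text != null && maxLengths.TryGetValue(col.ColumnName, out limit) && text.Length > limit)
        {
            Console.WriteLine("Row {0}, column '{1}' is too long ({2} > {3}): {4}",
                yourDataTable.Rows.IndexOf(row), col.ColumnName, text.Length, limit, text);
        }
    }
}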

Is there a more efficient way to create lists that are the same?

I have a series of lists and classes that implement a table of data. The basic classes are Columns, Rows, and Cells. Each Row contains some ID information and a list of Cells, which holds the row's value for each column. Currently I create each row with code like this:
void CreateRow()
{
Row newRow = new Row();
newRow.ID = idInfo;
foreach (var Column in Columns)
{
newRow.Cells.Add(new Cell(Column.ID));
}
Rows.Add(newRow);
}
This works fine, but in some cases I am calling CreateRow() 20,000 times and have 200+ columns. So I am wondering if there is a more efficient way to populate the cells, since the cells created for a given column are identical in every row.
Any ideas?
Thanks,
Jerry
Currently you create a unique Cell object for each position in your matrix - that's a lot of cells given your use case of 20,000+ rows.
One approach to be more efficient could be to not add the cells at all when you construct the matrix, but only when you try to get or set their values (i.e. using Lazy<T>).
Assuming you set the value of a cell before retrieving it, you could then have a factory method for creating a cell with a value: make the Cell object immutable, and when you are "creating" a Cell for which you already have another cell with an identical value, return that existing cell instead. This could reduce the total number of Cell objects significantly. Of course there's more overhead, since you need to check whether you already have a cell with the same value, and you need to call the factory method again whenever you update the value of a cell.
Then again all of this could not be worth it if you do not experience any memory/performance problems with your current approach - measuring performance is key here.
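A minimal sketch of that factory, assuming an immutable Cell keyed on (column id, value); only the caching pattern is the point here:
public sealed class Cell
{
    public int ColumnId { get; private set; }
    public object Value { get; private set; }
    public Cell(int columnId, object value) { ColumnId = columnId; Value = value; }
}

public static class CellFactory
{
    // Identical (columnId, value) pairs share a single Cell instance.
    private static readonly Dictionary<Tuple<int, object>, Cell> cache =
        new Dictionary<Tuple<int, object>, Cell>();

    public static Cell Get(int columnId, object value)
    {
        var key = Tuple.Create(columnId, value);
        Cell cell;
        if (!cache.TryGetValue(key, out cell))
        {
            cell = new Cell(columnId, value);
            cache[key] = cell;
        }
        return cell;
    }
}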
Isn't Columns a collection?
var Ids = Columns.Select(c => c.Id).ToArray();
var Names = Columns.Select(c => c.Name).ToArray();
etc. Except why do that if Columns is already a collection? You could just do Columns[index].Id.
Or if you must have the code you outlined:
Row newRow = new Row();
newRow.ID = idInfo;
// presuming Cells is a List<>
newRow.Cells.AddRange(Columns.Select(c => new Cell(c.Id)));
Rows.Add(newRow);
Some suggestions (depends on what you are looking for)
Consider using (strongly typed) DataSet/DataTable
If using List<T> and you know the size, set the capacity to avoid reallocation (new List<Cell>(2000)); see the sketch after this list
Use struct instead of class if it makes sense
Cache objects if it makes sense (instead of duplicating the same object over and over)
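A small sketch of the capacity suggestion applied to the question's CreateRow (this assumes Cells is a List<Cell> that can be assigned, which may not match the real classes):
void CreateRow()
{
    Row newRow = new Row();
    newRow.ID = idInfo;
    // Presize the list so adding 200+ cells never triggers an internal reallocation.
    newRow.Cells = new List<Cell>(Columns.Count);
    foreach (var column in Columns)
    {
        newRow.Cells.Add(new Cell(column.ID));
    }
    Rows.Add(newRow);
}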
You're creating the cells anyway. So I gather that the question really refers to when you fill the cells with their values, which are the same for a given column across all rows.
I actually think that from a correctness point of view, it makes sense to have the data duplicated, since they are in effect separate instances of the same data.
That said, if it is not really data, but you just want to show a view-column with the same value for each row, and you only want it as a data column to make it easier to show as a view-column, then in your property-get Row.Cells(Id) you can check the ID and, if it's one of those columns where the value is always the same, return that value directly, bypassing the lookup in your _Cells collection.
If the data is mostly the same and sometimes different, you may want to use 'default values' where if the Cell object does not exist, a default value for that column will be returned. This necessitates a GetValue() method on the row, though, if you want to avoid having the Cell object altogether for places where it is default.
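A rough sketch of that default-value lookup on the row (the dictionaries here are assumptions about how the row stores its cells and per-column defaults):
// Return the stored cell's value if one exists, otherwise the column's default.
public object GetValue(int columnId)
{
    Cell cell;
    if (cellsByColumnId.TryGetValue(columnId, out cell))    // assumed Dictionary<int, Cell> on the row
        return cell.Value;
    return defaultValuesByColumnId[columnId];               // assumed per-column default values
}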
If you don't care about #1, you can really make a single instance of whatever the value is, and reference it in your Cell's value. This is harder to do for a value type than for a reference type, but it can be done.
Lastly, is there any reason you're not using .NET's supplied DataTable and DataRow types? I'm sure the MS geeks programmed as much efficiency as they could into those.

The right data structure to use for an Excel clone

Let say I'm working on an Excel clone in C#.
My grid is represented as follows:
private struct CellValue
{
private int column;
private int row;
private string text;
}
private List<CellValue> cellValues = new List<CellValue>();
Each time the user adds text, I just package it as a CellValue and add it to cellValues. Given a CellValue, I can determine its row and column in O(1) time, which is great. However, given a column and a row, I need to loop through the entire cellValues list to find which text is in that column and row, which is terribly slow. Also, given a text, I again need to loop through the entire thing. Is there any data structure where I can achieve all 3 tasks in O(1) time?
Updated:
Looking through some of the answers, I don't think I have found one that I like. Can I:
Not keeping more than 2 copies of CellValue, in order to avoid sync-ing them. In C world I would have made nice use of pointers.
Rows and Columns can be dynamically added (Unlike Excel).
I would opt for a sparse array (a linked list of linked lists) to give maximum flexibility with minimum storage.
In this example, you have a linked list of rows with each element pointing to a linked list of cells in that row (you could reverse the cells and rows depending on your needs).
 |
 V
+-+    +---+          +---+
|1| -> |1.1| -------> |1.3| -:
+-+    +---+          +---+
 |
 V
+-+                   +---+
|7| ----------------> |7.2| -:
+-+                   +---+
 |
 =
Each row element has the row number in it and each cell element has a pointer to its row element, so that getting the row number from a cell is O(1).
Similarly, each cell element has its column number, making that O(1) as well.
There's no easy way to get O(1) for finding immediately the cell at a given row/column but a sparse array is as fast as it's going to get unless you pre-allocate information for every possible cell so that you can do index lookups on an array - this would be very wasteful in terms of storage.
One thing you could do is make one dimension non-sparse, such as making the columns the primary array (rather than linked list) and limiting them to 1,000 - this would make the column lookup indexed (fast), then a search on the sparse rows.
I don't think you can ever get O(1) for a text lookup simply because text can be duplicated in multiple cells (unlike row/column). I still believe the sparse array will be the fastest way to search for text, unless you maintain a sorted index of all text values in another array (again, that can make it faster but at the expense of copious amounts of memory).
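A bare-bones C# sketch of that sparse layout (the class and member names are illustrative; LinkedList<T> stands in for the hand-rolled linked lists):
public class SparseCell
{
    public int Column;
    public string Text;
    public SparseRow Row;    // back-pointer, so getting the row number from a cell is O(1)
}

public class SparseRow
{
    public int RowNumber;
    public LinkedList<SparseCell> Cells = new LinkedList<SparseCell>();
}

public class SparseGrid
{
    public LinkedList<SparseRow> Rows = new LinkedList<SparseRow>();

    // Walks the sparse lists; only non-empty rows and cells are ever stored or visited.
    public SparseCell Find(int row, int column)
    {
        foreach (SparseRow r in Rows)
        {
            if (r.RowNumber != row)
                continue;
            foreach (SparseCell c in r.Cells)
            {
                if (c.Column == column)
                    return c;
            }
            return null;    // row exists but has no cell in that column
        }
        return null;        // no such row
    }
}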
I think you should use one of the indexed collections to make this work reasonably fast; the perfect one is KeyedCollection.
You need to create your own collection by extending this class. This way your objects will still contain row and column (so you will not lose anything), but you will be able to search by them. You will probably have to create a class encapsulating (row, column) and make it the key (so make it immutable and override Equals and GetHashCode).
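A minimal sketch of such a collection, assuming CellValue exposes public Row, Column and Text properties; Tuple<int, int> is used as the key so equality and hashing come for free:
public class CellValueCollection : KeyedCollection<Tuple<int, int>, CellValue>
{
    // The key for each item is its (row, column) pair.
    protected override Tuple<int, int> GetKeyForItem(CellValue item)
    {
        return Tuple.Create(item.Row, item.Column);
    }
}

// Usage:
// var cells = new CellValueCollection();
// cells.Add(someCellValue);
// var key = Tuple.Create(3, 5);
// if (cells.Contains(key))
//     Console.WriteLine(cells[key].Text);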
I'd create
Collection<Collection<CellValue>> rowCellValues = new Collection<Collection<CellValue>>();
and
Collection<Collection<CellValue>> columnCellValues = new Collection<Collection<CellValue>>();
The outer collection has one entry for each row or column, indexed by the row or column number, the inner collection has all the cells in that row or column. These collections should be populated as part of the process that creates new CellValue objects.
rowCellValues[newCellValue.Row].Add(newCellValue);
columnCellValues[newCellValue.Column].Add(newCellValue);
This smells of premature optimization.
That said, there's a few features of excel that are important in choosing a good structure.
First is that Excel uses the cells in a moderately non-linear fashion. The process of resolving formulas involves traversing the spreadsheet in effectively random order. The structure will need a mechanism for cheaply looking up the values of random keys, and for marking them dirty, resolved, or unresolvable due to circular references. It will also need some way to know when there are no more unresolved cells left, so that it can stop working. Any solution that involves a linked list is probably sub-optimal for this, since it would require a linear scan to get to those cells.
Another issue is that Excel displays a range of cells at one time. This may seem trivial, and to a large extent it is, but it will certainly be ideal if the app can pull all of the data needed to draw a range of cells in one shot. Part of this may be keeping track of the display height and width of the rows and columns, so that the display system can iterate over the range until the desired width and height of cells has been collected. The need to iterate in this manner may preclude the use of a hashing strategy for sparse storage of cells.
On top of that, there are some weaknesses of the representational model of spreadsheets that could be addressed much more effectively by taking a slightly different approach.
For example, column aggregates are sort of clunky. A column total is easy enough to implement in excel, but it has a sort of magic behavior that works most of the time but not all of the time. For instance, if you add a row into the aggregated area, further calculations on that aggregate may continue to work, or not, depending on how you added it. If you copy and insert a row (and replace the values) everything works fine, but if you cut and paste the cells one row down, things don't work out so well.
Given that the data is 2-dimensional, I would have a 2D array to hold it in.
Well, you could store them in three Dictionaries: two Dictionary<int,CellValue> objects for rows and columns, and one Dictionary<string,CellValue> for text. You'd have to keep all three carefully in sync though.
I'm not sure that I wouldn't just go with a big two-dimensional array though...
If it's an exact clone, then an array-backed list of CellValue[256] arrays. Excel has 256 columns, but a growable number of rows.
If rows and columns can be added "dynamically", then you shouldn't store the row/column as a numeric attribute of the cell, but rather as a reference to a row or column object.
Example:
// A class rather than a struct, so each cell has reference identity and can be
// held by both its row list and its column list.
private class CellValue
{
private List<CellValue> _column;
private List<CellValue> _row;
private string text;
public List<CellValue> column {
get { return _column; }
set {
if(_column!=null) { _column.Remove(this); }
_column = value;
_column.Add(this);
}
}
public List<CellValue> row {
get { return _row; }
set {
if(_row!=null) { _row.Remove(this); }
_row = value;
_row.Add(this);
}
}
}
private List<List<CellValue>> MyRows = new List<List<CellValue>>();
private List<List<CellValue>> MyColumns = new List<List<CellValue>>();
Each Row and Column object is implemented as a List of the CellValue objects. These are unordered--the order of the cells in a particular Row does not correspond to the Column index, and vice-versa.
Each sheet has a List of Rows and a list of Columns, in order of the sheet (shown above as MyRows and MyColumns).
This will allow you to rearrange and insert new rows and columns without looping through and updating any cells.
Deleting a row should loop through the cells on the row and delete them from their respective columns before deleting the row itself. And vice-versa for columns.
To find a particular Row and Column, find the appropriate Row and Column objects, then find the CellValue that they contain in common.
Example:
public CellValue GetCell(int rowIndex, int colIndex) {
List<CellValue> row = MyRows[rowIndex];
List<CellValue> col = MyColumns[colIndex];
return row.Intersect(col).First();
}
(I'm a little fuzzy on these Extension methods in .NET 3.5, but this should be in the ballpark.)
If I recall correctly, there was an article about how Visicalc did it, maybe in Byte Magazine in the early 80s. I believe it was a sparse array of some sort. But I think there were links both up-and-down and left-and-right, so that any given cell had a pointer to the cell above it (however many cells away that may be), below it, to the left of it, and to the right of it.
