I hope you're all doing well.
I have a quick question regarding iteration. I've read several posts about the speed of iteration, but I couldn't figure out how to make mine faster. Currently I'm doing something like this:
void Iteration()
{
    // Creating and filling the DataTable
    DataTable dt = new DataTable();
    dt.Columns.Add("Datetime", typeof(DateTime));
    for (int i = 0; i < 150; i++)
    {
        DataRow row = dt.NewRow();
        row["Datetime"] = DateTime.Now.AddDays(i);
        dt.Rows.Add(row);
    }
    // Creating and filling the list
    List<DateTime> _listDates = new List<DateTime>();
    DateTime _startDate = DateTime.Now.AddMonths(-1);
    for (int i = 0; i < 250; i++)
        _listDates.Add(_startDate.AddDays(i));
    // Here's the actual iteration
    foreach (DateTime _date in _listDates)
    {
        foreach (DataRow row in dt.Rows)
        {
            if ((DateTime)row["Datetime"] == _date)
            {
                //Do something.........
            }
        }
    }
}
I fill a List<DateTime> and a DataTable with 250 and 150 entries respectively. I then want to compare the values against each other and do something when there's a match. However, with my method that means 250 * 150 = 37,500 passes. Now I could break out of the inner loop when there's a match, but the gain seems marginal to me, since the match can just as well be at the bottom of the list and the DataTable. And in my program the average lists and tables have 2,500 rows, so that's millions of passes every n minutes. Needless to say, this takes a while. I'm running this calculation on a separate thread so my program stays responsive.
Is there any way to make this smarter and/or faster? Am I on the right track?
Cheers,
What about this? This is more efficient because the DataTable and the date list are each scanned only once, and HashSet<T>.Contains has O(1) average time complexity.
void Iteration()
{
    // Creating and filling the DataTable
    DataTable dt = new DataTable();
    dt.Columns.Add("Datetime", typeof(DateTime));
    for (int i = 0; i < 150; i++)
    {
        DataRow row = dt.NewRow();
        row["Datetime"] = DateTime.Now.AddDays(i);
        dt.Rows.Add(row);
    }
    // Creating and filling the list
    List<DateTime> _listDates = new List<DateTime>();
    DateTime _startDate = DateTime.Now.AddMonths(-1);
    for (int i = 0; i < 250; i++)
        _listDates.Add(_startDate.AddDays(i));
    var dateSet = new HashSet<DateTime>(_listDates);
    foreach (DataRow row in dt.Rows)
    {
        if (dateSet.Contains((DateTime)row["Datetime"]))
        {
            //Do something.........
        }
    }
}
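If you also need the matching DataRows themselves (not just the fact that a match exists), a variation on the same idea is to build a lookup from the table once and probe it per date. This is just a sketch; it assumes the LINQ-to-DataSet extensions (System.Data.DataSetExtensions) are referenced:
var rowsByDate = dt.AsEnumerable()
                   .ToLookup(r => r.Field<DateTime>("Datetime")); // built once, O(n)
foreach (DateTime date in _listDates)
{
    foreach (DataRow row in rowsByDate[date]) // empty sequence when there's no match
    {
        //Do something.........
    }
}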
I have a DataTable loaded into my DataGridView.
One feature is sending a mail to all selected recipients.
Before doing so I have to get the selected data out of my raw DataTable first.
However, if I have a huge DataTable and want to look for the checked cells, the whole process takes too long (1-3 minutes).
private DataTable GetDataTable()
{
    DataTable sdt = new DataTable(); // "Selected Datatable"
    int i = 0;
    for (int z = 0; z < dataGridView1.Columns.Count; z++) // Add columns to DataTable sdt
        sdt.Columns.Add(dataGridView1.Columns[z].HeaderText);
    foreach (DataGridViewRow Row in dataGridView1.Rows)
    {
        if (Convert.ToBoolean(Row.Cells["CheckboxHeader"].Value)) // Go on if checkbox is checked
        {
            sdt.Rows.Add();
            for (int j = 1; j < dataGridView1.ColumnCount; ++j)
            {
                sdt.Rows[i][j] = Row.Cells[j].Value;
            }
            i++;
        }
    }
    return sdt;
}
How can I access all the checked Rows at once?
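Something has to scan the rows, so there is no literal "all at once", but the scan itself is cheap; the usual cost is per-cell access on the grid. Here is a sketch, assuming the grid is data-bound to your raw DataTable and the checkbox column is named "CheckboxHeader" as in your code (GetCheckedRows and rawTable are hypothetical names): filter once with LINQ, then read the underlying DataRow through DataBoundItem instead of copying cell by cell.
private DataTable GetCheckedRows()
{
    // Filter once; skip the grid's uncommitted "new row" placeholder
    var checkedRows = dataGridView1.Rows
        .Cast<DataGridViewRow>()
        .Where(r => !r.IsNewRow && Convert.ToBoolean(r.Cells["CheckboxHeader"].Value))
        .Select(r => ((DataRowView)r.DataBoundItem).Row)
        .ToList();

    // CopyToDataTable (System.Data.DataSetExtensions) throws on an empty
    // sequence, so fall back to an empty clone of the source table
    return checkedRows.Count > 0 ? checkedRows.CopyToDataTable() : rawTable.Clone();
}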
Console.WriteLine("Enter how many rows you want to delete:");
del = Convert.ToInt16(Console.ReadLine());
try
{
for (int i = 0; i <del; i++)
{
dt.Rows[i].Delete();
dt.AcceptChanges();
}
Console.WriteLine("\n************************************\n");
}
The problem you're having is that each time you delete a row, the remaining rows "move up": when you delete dt.Rows[0], what was dt.Rows[1] becomes dt.Rows[0]. Because you're accessing each row by its index rather than by a unique identifier, your loop ends up skipping rows. Below is a worked example that shows this if you run it (I've based it on your code so that you've got some context for it):
var dt = new DataTable();
dt.Columns.Add("IdColumn", typeof(int));
for (int i = 0; i < 10; i++)
{
    dt.Rows.Add(new object[] { i });
}
var del = 3;
for (int i = 0; i < del; i++)
{
    dt.Rows[i].Delete();
    dt.AcceptChanges();
    Console.WriteLine("Deleted row number {0}, rows remaining in table are {1}, IdColumn for Rows[i] is {2}", i, dt.Rows.Count, dt.Rows[i]["IdColumn"]);
}
Console.WriteLine("\n************************************\n");
One option available to you is to use LINQ and replace the above with something along the lines of:
var dt = new DataTable();
dt.Columns.Add("IdColumn", typeof(int));
for (int i = 0; i < 10; i++)
{
    dt.Rows.Add(new object[] { i });
}
var del = 3;
dt = dt.AsEnumerable().Skip(del).CopyToDataTable();
If you look at the contents of dt after running the second code snippet, you should see that the rows with a value of {0, 1, 2} for IdColumn have been removed, which looks like the outcome you're after.
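If you'd rather keep an explicit loop, another option (a sketch) is to remove whichever row is currently first on each pass; DataRowCollection.RemoveAt takes effect immediately, so the collection re-indexes every time:
for (int i = 0; i < del; i++)
{
    dt.Rows.RemoveAt(0); // the old Rows[1] becomes Rows[0] on the next pass
}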
I have a DataTable with 100,000 records, and I want to iterate through it saving the records in blocks of 10,000: save the first 10,000 rows, then on the next iteration the next 10,000, and so on until all 100,000 are saved.
DataTable dt = new DataTable();
dt = ds.Tables[0]; // here I am getting 100,000 records
for (int i = 0; i < dt.Rows.Count; i += 10000)
{
    savedatatable(dt.Rows[i]);
}
You should take the rows in blocks, for example with LINQ's Skip/Take (this needs a reference to System.Data.DataSetExtensions):
DataTable dt = new DataTable();
dt = ds.Tables[0]; // here I am getting 100,000 records
// Walk the table in blocks of 10,000 rows and save each block
for (int i = 0; i < dt.Rows.Count; i += 10000)
{
    DataTable block = dt.AsEnumerable()
                        .Skip(i)
                        .Take(10000)
                        .CopyToDataTable();
    savedatatable(block); // savedatatable is your own method; adjust its parameter type accordingly
}
Here's a similar question, though I'm not sure if this is what you wanted: Looping through a DataTable
Should be something like this:
for (int i = 0; i < dt.Rows.Count; i += 10000)
{
    DataRow dr = dt.Rows[i];
    // do something
}
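Note that this visits only one row per block of 10,000; if the goal is to save every row block by block, nest a second loop over the block (a sketch):
for (int i = 0; i < dt.Rows.Count; i += 10000)
{
    int end = Math.Min(i + 10000, dt.Rows.Count);
    for (int j = i; j < end; j++)
    {
        DataRow dr = dt.Rows[j];
        // save dr as part of the current block
    }
}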
I am trying to copy data from my DataGridView to a DataTable so I can export it to a CSV file.
Here is the code:
public DataTable createdatatablefromdgv()
{
    DataTable dsptable = new DataTable();
    for (int i = 0; i < dataGridView1.Columns.Count; i++)
    {
        DataColumn dspcolumn = new DataColumn(dataGridView1.Columns[i].HeaderText);
        dsptable.Columns.Add(dspcolumn);
    }
    int noOfColumns = dataGridView1.Columns.Count;
    foreach (DataGridViewRow dgvr in dataGridView1.Rows)
    {
        DataRow dataRow = dsptable.NewRow();
        for (int i = 0; i < noOfColumns; i++)
        {
            dataRow[i] = dgvr.Cells[i].Value.ToString();
        }
    }
    return dsptable;
}
It seems like it copies the data from the grid to the table, but when I return the DataTable, all it contains is the columns, no rows.
You are not adding dataRow to the DataTable after assigning values to its columns.
public DataTable createdatatablefromdgv()
{
    DataTable dsptable = new DataTable();
    for (int i = 0; i < dataGridView1.Columns.Count; i++)
    {
        DataColumn dspcolumn = new DataColumn(dataGridView1.Columns[i].HeaderText);
        dsptable.Columns.Add(dspcolumn);
    }
    int noOfColumns = dataGridView1.Columns.Count;
    foreach (DataGridViewRow dgvr in dataGridView1.Rows)
    {
        DataRow dataRow = dsptable.NewRow();
        for (int i = 0; i < noOfColumns; i++)
        {
            dataRow[i] = dgvr.Cells[i].Value.ToString();
        }
        dsptable.Rows.Add(dataRow); // Add this statement to add rows to the DataTable
    }
    return dsptable;
}
The above answer is correct, but here's a little explanation of why you encountered this problem. One might think that calling NewRow() on the DataTable adds a new row to the DataTable, which you could then access.
What NewRow actually does is give you a detached DataRow instance with the table's schema, which lets you address fields by column name as opposed to column index (an integer).
This allows you to do something like this:
DataRow dataRow = dsptable.NewRow();
foreach (DataColumn dc in dsptable.Columns)
{
    dataRow[dc.ColumnName] = dgvr.Cells[dc.ColumnName].Value.ToString();
}
Alternatively you could just call Rows.Add(), which takes either an object array or, as you were trying to do, a DataRow.
List<string> rowData = new List<string>();
for (int i = 0; i < noOfColumns; i++)
{
    rowData.Add(dgvr.Cells[i].Value.ToString());
}
dsptable.Rows.Add(rowData.ToArray());
This should explain adding rows to a DataTable more simply :)
For my own edification, I decided to test the comparative speeds of DataTable.ImportRow vs DataTable.Merge. I found that DataTable.ImportRow was generally slower than DataTable.Merge. On rare occasions the two had equal processing times, and on even rarer occasions ImportRow was faster than Merge.
Below are my testing results and code.
Why is ImportRow slower than Merge?
What makes Merge faster?
DataTable dt = new DataTable();
dt.Columns.Add("customerId", typeof(int));
dt.Columns.Add("username", typeof(string));
for (int i = 0; i <= 100000; i++)
{
    DataRow myNewRow;
    myNewRow = dt.NewRow();
    myNewRow["customerId"] = 1;
    myNewRow["username"] = "johndoe";
    dt.Rows.Add(myNewRow);
}

// First Duration
DateTime startTime1 = DateTime.Now;
DataTable dt2 = new DataTable();
dt2 = dt.Clone();
for (int i = 0; i < dt.Rows.Count; i++)
    dt2.ImportRow(dt.Rows[i]);
DateTime stopTime1 = DateTime.Now;
// End First Duration
TimeSpan duration1 = stopTime1 - startTime1;

// Second Duration
DateTime startTime2 = DateTime.Now;
DataTable dt3 = new DataTable();
dt3 = dt.Clone();
dt3.Merge(dt);
DateTime stopTime2 = DateTime.Now;
// End Second Duration
TimeSpan duration2 = stopTime2 - startTime2;
Edit: updated code as per the suggestions:
DataTable dt = new DataTable();
dt.Columns.Add("customerId", typeof(int));
dt.Columns.Add("username", typeof(string));
DataColumn[] key = new DataColumn[1];
key[0] = dt.Columns[0];
dt.PrimaryKey = key;
for (int i = 0; i <= 100000; i++)
{
    DataRow myNewRow;
    myNewRow = dt.NewRow();
    myNewRow["customerId"] = i;
    myNewRow["username"] = "johndoe";
    dt.Rows.Add(myNewRow);
}

// First Duration
//DateTime startTime1 = DateTime.Now;
Stopwatch sw1 = new Stopwatch();
sw1.Start();
DataTable dt2 = new DataTable();
dt2 = dt.Clone();
for (int i = 0; i < dt.Rows.Count; i++)
    dt2.ImportRow(dt.Rows[i]);
//DateTime stopTime1 = DateTime.Now;
sw1.Stop();
// End First Duration
TimeSpan duration1 = sw1.Elapsed;

// Second Duration
//DateTime startTime2 = DateTime.Now;
Stopwatch sw2 = new Stopwatch();
sw2.Start();
DataTable dt3 = new DataTable();
dt3 = dt.Clone();
dt3.Merge(dt);
sw2.Stop();
//DateTime stopTime2 = DateTime.Now;
// End Second Duration
TimeSpan duration2 = sw2.Elapsed;

// TotalMilliseconds, not Milliseconds: Milliseconds is only the 0-999 ms
// component of the TimeSpan and wraps around for longer runs
label3.Text = duration1.TotalMilliseconds.ToString();
label4.Text = duration2.TotalMilliseconds.ToString();
Your measured differences are quite small, especially since DateTime.Now only gives you a resolution of roughly 20 ms. Use a Stopwatch.
You are setting customerId = 1 on all records, so it looks like you don't have a proper primary key. That makes this very unrepresentative.
Merge should be faster, as it is the one that could be optimized for bulk actions. Given that, I find it notable that the results are as close as they are.
First of all, before you draw any conclusions here, I would use a Stopwatch to do the timings rather than DateTime.Now. Stopwatch is a much more precise measurement tool and will give more consistent results.
Otherwise, it makes sense logically that Merge could have optimizations for bulk addition, as it is designed to import many rows at once.
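If you want to test whether the gap comes from per-row overhead, one thing to try (a sketch) is wrapping the ImportRow loop in BeginLoadData/EndLoadData, which suspends notifications, index maintenance, and constraint checking during the load; Merge plausibly benefits from similar internal batching:
DataTable dt4 = dt.Clone();
dt4.BeginLoadData(); // suspend notifications, indexes, and constraints
for (int i = 0; i < dt.Rows.Count; i++)
    dt4.ImportRow(dt.Rows[i]);
dt4.EndLoadData();   // re-enable and validate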