ImportRow vs Merge Speed Question - c#

For my own edification, I decided to compare the speed of DataTable.ImportRow against DataTable.Merge. I found that DataTable.ImportRow was usually slower than DataTable.Merge. On rare occasions the two took the same time, and on even rarer occasions ImportRow was faster than Merge.
Below are my testing results and code.
Why is ImportRow slower than Merge?
What makes Merge faster?
DataTable dt = new DataTable();
dt.Columns.Add("customerId", typeof(int));
dt.Columns.Add("username", typeof(string));
for (int i = 0; i <= 100000; i++)
{
DataRow myNewRow;
myNewRow = dt.NewRow();
myNewRow["customerId"] = 1;
myNewRow["username"] = "johndoe";
dt.Rows.Add(myNewRow);
}
// First Duration
DateTime startTime1 = DateTime.Now;
DataTable dt2 = new DataTable();
dt2 = dt.Clone();
for (int i = 0; i < dt.Rows.Count; i++)
dt2.ImportRow(dt.Rows[i]);
DateTime stopTime1 = DateTime.Now;
// End First Duration
TimeSpan duration1 = stopTime1 - startTime1;
// Second Duration
DateTime startTime2 = DateTime.Now;
DataTable dt3 = new DataTable();
dt3 = dt.Clone();
dt3.Merge(dt);
DateTime stopTime2 = DateTime.Now;
// End Second Duration
TimeSpan duration2 = stopTime2 - startTime2;
Edit: Updated code as per suggestions -
DataTable dt = new DataTable();
dt.Columns.Add("customerId", typeof(int));
dt.Columns.Add("username", typeof(string));
DataColumn[] key = new DataColumn[1];
key[0] = dt.Columns[0];
dt.PrimaryKey = key;
for (int i = 0; i <= 100000; i++)
{
DataRow myNewRow;
myNewRow = dt.NewRow();
myNewRow["customerId"] = i;
myNewRow["username"] = "johndoe";
dt.Rows.Add(myNewRow);
}
// First Duration
//DateTime startTime1 = DateTime.Now;
Stopwatch sw1 = new Stopwatch();
sw1.Start();
DataTable dt2 = new DataTable();
dt2 = dt.Clone();
for (int i = 0; i < dt.Rows.Count; i++)
dt2.ImportRow(dt.Rows[i]);
//DateTime stopTime1 = DateTime.Now;
sw1.Stop();
// End First Duration
TimeSpan duration1 = sw1.Elapsed;
// Second Duration
//DateTime startTime2 = DateTime.Now;
Stopwatch sw2 = new Stopwatch();
sw2.Start();
DataTable dt3 = new DataTable();
dt3 = dt.Clone();
dt3.Merge(dt);
sw2.Stop();
//DateTime stopTime2 = DateTime.Now;
// End Second Duration
TimeSpan duration2 = sw2.Elapsed;
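// TotalMilliseconds reports the whole duration; Milliseconds is only the 0-999 component.
label3.Text = duration1.TotalMilliseconds.ToString();
label4.Text = duration2.TotalMilliseconds.ToString();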

Your measured differences are quite small, especially since DateTime.Now only gives you a resolution of about 20 ms. Use a Stopwatch.
You are setting customerId = 1 on all records, so it looks like you don't have a proper primary key. That makes the test very unrepresentative.
Merge should be faster, since it is the one that could be optimized for bulk operations. Given that, I'm surprised the results are as close as they are.

First of all, before drawing any conclusions from these results, I would use a Stopwatch for the timings rather than DateTime.Now. Stopwatch is a much more precise measurement tool and will give more consistent results.
Beyond that, it makes sense that Merge could have optimizations for adding rows, as it is designed to import many rows at once.
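To make the comparison suggested in these answers a little more robust, here is a minimal sketch of a timing harness. It assumes a unique integer key as in the updated code, does one untimed warm-up run, and keeps the best of several Stopwatch runs. CopyBenchmark, BuildSource, and Measure are illustrative names, not part of any library, and the numbers it prints will vary by machine and runtime.

```csharp
using System;
using System.Data;
using System.Diagnostics;

public static class CopyBenchmark
{
    // Builds a table with a unique integer key, mirroring the updated test above.
    public static DataTable BuildSource(int rowCount)
    {
        var dt = new DataTable();
        dt.Columns.Add("customerId", typeof(int));
        dt.Columns.Add("username", typeof(string));
        dt.PrimaryKey = new[] { dt.Columns["customerId"] };
        for (int i = 0; i < rowCount; i++)
            dt.Rows.Add(i, "johndoe");
        return dt;
    }

    // Copies row by row into a schema-only clone.
    public static DataTable CopyWithImportRow(DataTable source)
    {
        DataTable target = source.Clone();
        foreach (DataRow row in source.Rows)
            target.ImportRow(row);
        return target;
    }

    // Copies all rows in one call into a schema-only clone.
    public static DataTable CopyWithMerge(DataTable source)
    {
        DataTable target = source.Clone();
        target.Merge(source);
        return target;
    }

    // Runs the action once untimed (warm-up), then returns the best of several timed runs.
    public static TimeSpan Measure(Action action, int runs = 5)
    {
        action();
        TimeSpan best = TimeSpan.MaxValue;
        for (int i = 0; i < runs; i++)
        {
            Stopwatch sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            if (sw.Elapsed < best) best = sw.Elapsed;
        }
        return best;
    }

    public static void Main()
    {
        DataTable source = BuildSource(100000);
        TimeSpan importTime = Measure(() => CopyWithImportRow(source));
        TimeSpan mergeTime = Measure(() => CopyWithMerge(source));
        Console.WriteLine($"ImportRow: {importTime.TotalMilliseconds:F1} ms");
        Console.WriteLine($"Merge:     {mergeTime.TotalMilliseconds:F1} ms");
    }
}
```

Taking the best of several runs is only one way to reduce noise from JIT compilation and garbage collection; averaging or using a dedicated benchmarking tool would be equally reasonable.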

Related

Can I increase my DataTable and List<> iteration speed. C#

I hope you're all doing well.
I have a quick question regarding iteration. I've read several posts about the speed of iteration and I couldn't figure out how to make my iteration faster. Currently I'm doing something like this:
void Iteration()
{
//Creating and filling the datatable
DataTable dt = new DataTable();
dt.Columns.Add("Datetime", typeof(DateTime));
for (int i = 0; i < 150; i++)
{
DataRow row = dt.NewRow();
row["Datetime"] = DateTime.Now.AddDays(i);
dt.Rows.Add(row);
}
//Creating and filling the list
List<DateTime> _listDates = new List<DateTime>();
DateTime _startDate = DateTime.Now.AddMonths(-1);
for(int i = 0; i < 250; i++)
_listDates.Add(_startDate.AddDays(i));
//Here's the actual iteration
foreach (DateTime _date in _listDates)
{
foreach (DataRow row in dt.Rows)
{
if ((DateTime)row["Datetime"] == _date)
{
//Do something.........
}
}
}
}
I fill a List<DateTime> and a DataTable with 250 and 150 entries respectively. I then want to compare the values against each other and do something when there's a match. With my method, that means 250 * 150 = 37,500 passes. I could break out of the loop when there's a match, but that gains little, since the match can just as well be at the bottom of the list and the DataTable. In my program the average lists and tables have 2,500 rows, so that's millions of passes every n minutes. Needless to say, this takes a while. I'm running this calculation on a separate thread so my program stays responsive.
Is there any way to make this smarter and/or faster ? Am I on the right track ?
Cheers,
What about this? It is more efficient because the DataTable and the date list are each scanned only once, and HashSet.Contains has O(1) average time complexity.
void Iteration()
{
//Creating and filling the datatable
DataTable dt = new DataTable();
dt.Columns.Add("Datetime", typeof(DateTime));
for (int i = 0; i < 150; i++)
{
DataRow row = dt.NewRow();
row["Datetime"] = DateTime.Now.AddDays(i);
dt.Rows.Add(row);
}
//Creating and filling the list
List<DateTime> _listDates = new List<DateTime>();
DateTime _startDate = DateTime.Now.AddMonths(-1);
for (int i = 0; i < 250; i++)
_listDates.Add(_startDate.AddDays(i));
var dateSet = new HashSet<DateTime>(_listDates);
foreach (DataRow row in dt.Rows)
{
if (dateSet.Contains( (DateTime)row["Datetime"]))
{
//Do something.........
}
}
}
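One thing to watch for in both versions: DateTime equality compares the full value down to the tick, so values derived from separate DateTime.Now calls will usually not be equal. The sketch below assumes that a "match" means the same calendar day and therefore normalizes both sides with .Date before the lookup. DateMatching and RowsMatchingDays are illustrative names, not from the posts above.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class DateMatching
{
    // Returns the rows whose "Datetime" value falls on a calendar day present in dates.
    public static List<DataRow> RowsMatchingDays(DataTable table, IEnumerable<DateTime> dates)
    {
        // Build the lookup set once, keyed by day (time of day stripped).
        var days = new HashSet<DateTime>();
        foreach (DateTime d in dates)
            days.Add(d.Date);

        var matches = new List<DataRow>();
        foreach (DataRow row in table.Rows)
        {
            if (row.IsNull("Datetime")) continue; // skip empty cells
            if (days.Contains(((DateTime)row["Datetime"]).Date))
                matches.Add(row);
        }
        return matches;
    }
}
```

If an exact timestamp match really is intended, drop the two .Date calls and the method reduces to the HashSet approach shown above.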

how to iterate through datatable

I have a DataTable with 100,000 records. I want to iterate through it and save the records in batches of 10,000: the first 10,000 records, then the next 10,000, and so on until all 100,000 records are saved.
DataTable dt = new DataTable();
dt = ds.tables[0]; //here i am getting 100,000 records
for (int i = 0; i < dt.rows.count; i + 10000)
{
savedatatable(dt[i]);
}
You should use the following code:
DataTable dt = ds.Tables[0]; // here I am getting 100,000 records
// Loop through columns in rows
for (int i = 0; i < dt.Rows.Count && i < 100000; i += 10000)
{
    foreach (DataColumn col in dt.Columns)
        savedatatable(dt.Rows[i][col.ColumnName].ToString());
}
or
DataTable dt = ds.Tables[0]; // here I am getting 100,000 records
// Loop through rows in columns
foreach (DataColumn col in dt.Columns)
{
    for (int i = 0; i < dt.Rows.Count && i < 100000; i += 10000)
        savedatatable(dt.Rows[i][col.ColumnName].ToString());
}
Here's a similar question, though I'm not sure it is what you wanted: Looping through a DataTable
Should be something like this:
for (int i = 0; i < dt.Rows.Count; i += 10000)
{
    // dr is the first row of each block of 10,000 rows
    DataRow dr = dt.Rows[i];
    // do something
}
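If the goal is to save every row, 10,000 at a time, rather than to visit every 10,000th row, one possible approach is to copy each block of rows into its own table and save that. This is a minimal sketch under that assumption; DataTableBatching and InBatches are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class DataTableBatching
{
    // Yields consecutive chunks of at most batchSize rows,
    // each as a new table with the same schema as the source.
    public static IEnumerable<DataTable> InBatches(DataTable source, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        for (int start = 0; start < source.Rows.Count; start += batchSize)
        {
            DataTable batch = source.Clone(); // schema only, no rows
            int end = Math.Min(start + batchSize, source.Rows.Count);
            for (int i = start; i < end; i++)
                batch.ImportRow(source.Rows[i]);
            yield return batch;
        }
    }
}
```

With this in place, the loop from the question could be written as foreach (DataTable batch in DataTableBatching.InBatches(dt, 10000)) savedatatable(batch); assuming savedatatable, which comes from the question and is not shown there, accepts a DataTable.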

CSV to SQL - Add 1 day to a Date in a datacolumn

I am trying to add 1 day to all dates that are in a certain datacolumn ['RecordAddedDate']
csvData.Columns.AddRange(new DataColumn[3] {
new DataColumn("Manufacturer", typeof(string)),
new DataColumn("SupplierCode", typeof(string)),
new DataColumn("RecordAddedDate", typeof(DateTime))});
At the moment I have this working:
for (int rowIndex = 0; rowIndex < csvData.Rows.Count; rowIndex++)
{
DateTime dt2 = DateTime.Parse(fieldData[2]);
var newDate = dt2.AddDays(1);
csvData.Rows[rowIndex][2] = newDate;
}
But it only adds 1 day to the first row read from the csv and doesn't add for the rest.
Any Help?
Here is the while loop which reads the data from the csv and adds the data
while (!csvReader.EndOfData)
{
string[] fieldData = csvReader.ReadFields();
//Making empty value as null
for (int i = 0; i < fieldData.Length; i++)
{
Console.WriteLine(fieldData[i]);
if (fieldData[i] == "")
{
fieldData[i] = null;
}
for (int rowIndex = 0; rowIndex < csvData.Rows.Count; rowIndex++)
{
DateTime dt2 = csvData.Rows[rowIndex].Field<DateTime>(2);
DateTime newDate = dt2.AddDays(1);
csvData.Rows[rowIndex][2] = newDate;
}
}
csvData.Rows.Add(fieldData);
Console.WriteLine("Rows count:" + csvData.Rows.Count);
}
}
return csvData;
What is fieldData[2]? You are always using this in the loop, so no wonder that you always get the same DateTime. If the table is already filled and you want to update a value use csvData.Rows[rowIndex][2] = csvData.Rows[rowIndex].Field<DateTime>(2).AddDays(1);
for (int rowIndex = 0; rowIndex < csvData.Rows.Count; rowIndex++)
{
DateTime dt2 = csvData.Rows[rowIndex].Field<DateTime>(2);
DateTime newDate = dt2.AddDays(1);
csvData.Rows[rowIndex][2] = newDate;
}

Index outside of bounds of the array when using DataGrid View?

I am trying to directly bind an array to a grid view control, where I am trying to display the details on grid view control.
I have tried the below code, but it is throwing up some errors. Please help me to find proper solution. Thank you.
Code:
protected void ddlCircle_SelectedIndexChanged(object sender, EventArgs e)
{
ShadingAnalysisDataSetTableAdapters.tbl_CadEngineersTeamTableAdapter cd;
cd = new ShadingAnalysisDataSetTableAdapters.tbl_CadEngineersTeamTableAdapter();
DataTable dt = new DataTable();
dt = cd.GetAvailableData(ddlCircle.SelectedValue);
int x, y;
DataTable dt3 = new DataTable();
dt3 = cd.GetTeam();
y = dt3.Rows.Count;
x = dt.Rows.Count;
DataTable dt2 = new DataTable();
dt2 = cd.GetAssignTeam(x);
string[] strArr = new string[dt.Rows.Count];
int i = 0;
testc:
foreach (DataRow r in dt2.Rows)
{
strArr[i] = r["Team"].ToString();
i++;
if (i >= x - 1)
{
break;
}
if (i >= y)
{
goto testc;
}
}
GridView2.DataSource = strArr[i];
GridView2.DataBind();
}
The line GridView2.DataSource = strArr[i]; is likely what produces the error, right? By the time the loop exits, i has already been incremented past the last element you filled, so it can point outside the array.
Write this
if (i > 0) GridView2.DataSource = strArr[i - 1];
as the last line.

Can you declare a DataTable as an array?

DataTable[] dt = new DataTable[2];
for(i = 0; i <= 1; i++)
{
dt[i].Columns.Add("id");
dt[i].Columns.Add("name");
}
When I run this I get:
Object reference not set to an instance of an object.
Can DataTable arrays be declared and used like this?
Yes, you can do this. You get that error because dt[i] is not yet a DataTable instance; the array elements are still null.
You could do:
dt[i] = new DataTable();
Full code:
DataTable[] dt = new DataTable[2];
for(i = 0; i <= 1; i++)
{
dt[i] = new DataTable();
dt[i].Columns.Add("id");
dt[i].Columns.Add("name");
}
DataTable[] dt = new DataTable[2];
for(i = 0; i <= 1; i++)
{
dt[i].Columns.Add("id");
dt[i].Columns.Add("name");
}
I think in the code you are declaring an array with 2 positions (empty), but not actually filling them.
You need:
DataTable[] dt = new DataTable[2];
for(i = 0; i <= 1; i++)
{
dt[i] = new DataTable();
dt[i].Columns.Add("id");
dt[i].Columns.Add("name");
}
And to answer your question, yes you should be able to have an array of DataTable.
You have to initialize your DataTable[] array elements first:
dt[0] = new DataTable();
dt[1] = new DataTable();
or within the loop
for(i = 0; i <= 1; i++)
{
dt[i] = new DataTable();
dt[i].Columns.Add("id");
dt[i].Columns.Add("name");
}
