I'm importing an Excel file into a DataTable (dtImport) and rearranging that data into another DataTable (dtImportParsed).
Here's what that DataTable (dtImport) looks like when I first import it.
And this is how I'm trying to rearrange that data into dtImportParsed:
I'm currently accomplishing this by using some nested for loops, but this takes a very long time. For example, a sheet with 36 columns and 4,000 rows takes about 30-40 minutes to complete. Is there an alternative method of accomplishing this that would speed things up?
Here's my code:
for (int c = 2; c < dtImport.Columns.Count; c++) //for each date column
{
    for (int r = 1; r < dtImport.Rows.Count; r++)
    {
        if (dtImportParsed.Rows.Count == 0)
        {
            DataRow dataRowImport = dtImportParsed.NewRow();
            dataRowImport["Date"] = dtImport.Columns[c].ColumnName.ToString().Trim();
            dataRowImport["account_id"] = dtImport.Rows[r]["account_id"].ToString().Trim();
            dataRowImport[dtImport.Rows[r]["Event Name"].ToString().Trim()] = dtImport.Rows[r][c].ToString().Trim();
            dtImportParsed.Rows.Add(dataRowImport);
        }
        else
        {
            for (int i = 0; i < dtImportParsed.Rows.Count; i++)
            {
                if (dtImportParsed.Rows[i]["account_id"].ToString() == dtImport.Rows[r]["account_id"].ToString())
                {
                    if (dtImportParsed.Rows[i]["Date"].ToString() == dtImport.Columns[c].ColumnName.ToString())
                    {
                        dtImportParsed.Rows[i][dtImport.Rows[r]["Event Name"].ToString().Trim()] = dtImport.Rows[r][c].ToString().Trim();
                        break;
                    }
                }
                else if (i == dtImportParsed.Rows.Count - 1)
                {
                    DataRow dataRowImport = dtImportParsed.NewRow();
                    dataRowImport["Date"] = dtImport.Columns[c].ColumnName.ToString().Trim();
                    dataRowImport["account_id"] = dtImport.Rows[r]["account_id"].ToString().Trim();
                    dataRowImport[dtImport.Rows[r]["Event Name"].ToString().Trim()] = dtImport.Rows[r][c].ToString().Trim();
                    dtImportParsed.Rows.Add(dataRowImport);
                }
            }
        }
    }
}
The algorithm you use to generate your expected result is too expensive. It executes on the order of (c × r × i), where i > r because empty fields are injected into the final table; in effect it is an O(n³) algorithm. You also perform it on DataTables by iterating DataRows, which is probably not efficient for your requirement.
If your source data set is not large (as you mentioned) and you have no memory restriction, I propose arranging the expected data set in memory using index-based data structures. Something like this:
var arrangeDragon = new Dictionary<string, Dictionary<string, Dictionary<string, string>>>();
The dragon enters! And it eats the inner for loop.
for (int c = 2; c < dtImport.Columns.Count; c++) //for each date column
{
    for (int r = 1; r < dtImport.Rows.Count; r++)
    {
        // ...
        // instead of: for (int i = 0; i < dtImportParsed.Rows.Count; i++) ...
        string date = dtImport.Columns[c].ColumnName.ToString().Trim();
        string accountId = dtImport.Rows[r]["account_id"].ToString();
        string eventName = dtImport.Rows[r]["Event Name"].ToString().Trim();

        if (!arrangeDragon.ContainsKey(date))
            arrangeDragon.Add(date, new Dictionary<string, Dictionary<string, string>>());

        if (!arrangeDragon[date].ContainsKey(accountId))
            arrangeDragon[date][accountId] = new Dictionary<string, string>();

        if (!arrangeDragon[date][accountId].ContainsKey(eventName))
            arrangeDragon[date][accountId][eventName] = dtImport.Rows[r][c].ToString().Trim();
        // ...
    }
}
These checks execute in O(1) instead of O(i), so the total cost drops to O(n²), which is just the cost of iterating the table :)
Retrieval is also O(1):
string data_field = arrangeDragon["1/1/2022"]["account1"]["Event1"];
Assert.AreEqual(data_field, "42");
Now you can iterate the nested dictionaries once and build dtImportParsed, as in the sketch below.
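For illustration, here is a minimal sketch of that final pass, assuming dtImportParsed already contains the "Date", "account_id" and per-event-name columns:
// Sketch of the final pass: one row per (date, account) pair.
foreach (var datePair in arrangeDragon)                // date -> accounts
{
    foreach (var accountPair in datePair.Value)        // account_id -> events
    {
        DataRow row = dtImportParsed.NewRow();
        row["Date"] = datePair.Key;
        row["account_id"] = accountPair.Key;
        foreach (var eventPair in accountPair.Value)   // event name -> value
            row[eventPair.Key] = eventPair.Value;
        dtImportParsed.Rows.Add(row);
    }
}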
If your data set is large or the host is low on memory, you will need other solutions, but as you mentioned, that is not your problem ;)
Good luck
I have a DataTable with 3k+ columns, and as we all know, SQL tables do not support more than 1024 columns. I will have another table for the overflow.
I just need to split the DataTable when it has more than 1024 columns.
I need an efficient way.
Thanks in advance.
I have never seen a database table with such a large number of columns.
I do not know how many rows your DataTable typically has. You could transpose the DataTable, or below is a code snippet showing how to split it. I used @jdweng's idea of copying the table with all values and then deleting the unwanted columns backwards. I cannot assess whether this is the most efficient way; I did not test it on a DataTable with rows.
static List<DataTable> SplitTables(DataTable largeTable)
{
    var tables = new List<DataTable>();
    int chunkSize = 1024;
    // number of chunks, rounded up so a partial last chunk is included
    int cycles = (largeTable.Columns.Count + chunkSize - 1) / chunkSize;
    for (int i = 0; i < cycles; i++)
    {
        // copy the whole table (structure + data), then strip the columns
        // that do not belong to this chunk, iterating backwards
        var tempTable = largeTable.Copy();
        for (int j = largeTable.Columns.Count - 1; j >= 0; j--)
        {
            if (!(j >= i * chunkSize && j < (i + 1) * chunkSize))
            {
                tempTable.Columns.RemoveAt(j);
            }
        }
        tables.Add(tempTable);
    }
    return tables;
}
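A hypothetical call, where bigTable stands in for the 3k-column DataTable from the question:
// Hypothetical usage; bigTable is the wide DataTable to be split.
List<DataTable> parts = SplitTables(bigTable);
Console.WriteLine(parts.Count);                 // 3 chunks for ~3000 columns
foreach (DataTable part in parts)
    Console.WriteLine(part.Columns.Count);      // 1024, 1024, then the remainder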
Read about normalization; a properly normalized database typically does not require tables with 1024+ columns. If that does not help, your question can be solved by, e.g.:
1) count number of columns
SELECT COUNT(*)
FROM information_schema.columns c
JOIN information_schema.tables t ON c.table_name = t.table_name
    AND c.table_schema = t.table_schema
WHERE t.table_type = 'base table'
    AND c.table_name = 'table_name_here'
2) if more than expected - create a new table
CREATE TABLE new_table_name_here
(...)
I am trying to build a custom machine learning library in C#, and I have researched the topic fairly thoroughly. My first example (an XOR estimator) was a success: I was able to lower the average loss to almost zero. Then I tried to build a model to classify handwritten digits (using the MNIST text database). The problem is that no matter how I configure the model, I always get stuck at a certain average loss over the data set. A second problem: because the MNIST dataset is very large, the model takes a lot of time to compute, so I could also use some advice on how to speed up the slowest parts of the algorithm (I am using stochastic gradient descent). Below is the main method that performs most of the work.
I have tried the MSE and CrossEntropy loss functions, as well as the tanh, sigmoid, ReLU and softplus activation functions. The model I am trying to build has 4 layers: an input layer with 784 neurons; a second layer with 16 neurons (sigmoid); a third layer with 16 neurons (sigmoid); and an output layer with 10 neurons (one-hot encoded digits, sigmoid). I am aware that the code below may not be a minimal reproducible example, but it represents the algorithm I am trying to debug. I have also uploaded the solution to GitHub, in case somebody can give me a hand figuring out the problem. This is the link: https://github.com/juan-carvajal/MachineLearningFramework
Running the Main method of the app will first execute the XOR classifier, which runs fine, and then the MNIST classifier.
The model is best represented here:
DataSet dataSet = new DataSet("mnist2.txt", ' ', 10, false);
//This creates a model with batching=128, learningRate=0.5 and
//the CrossEntropy loss function
var p = new Perceptron(128, 0.5, ErrorFunction.CrossEntropy())
    .Layer(784, ActivationFunction.Sigmoid())
    .Layer(16, ActivationFunction.Sigmoid())
    .Layer(16, ActivationFunction.Sigmoid())
    .Layer(10, ActivationFunction.Sigmoid());
//1000 is the number of epochs
p.Train2(dataSet, 1000);
The actual algorithm (stochastic gradient descent):
Console.WriteLine("Initial Loss:"+ CalculateMeanErrorOverDataSet(dataSet));
for (int i = 0; i < epochs; i++)
{
//Shuffle the data in every step
dataSet.Shuffle();
List<DataRow> batch = dataSet.NextBatch(this.Batching);
//Gets random batch from the dataSet
int count = 0;
foreach (DataRow example in batch)
{
count++;
double[] result = this.FeedForward(example.GetFeatures());
double[] labels = example.GetLabels();
if (result.Length != labels.Length)
{
throw new Exception("Inconsistent array size, Incorrect implementation.");
}
else
{
//What follows is the calculation of the gradient for this example, every example affects the current gradient, then all those changes are averaged an every parameter is updated.
double error = CalculateExampleLost(example);
for (int l = this.Layers.Count - 1; l > 0; l--)
{
if (l == this.Layers.Count - 1)
{
for (int j = 0; j < this.Layers[l].CostDerivatives.Length; j++)
{
this.Layers[l].CostDerivatives[j] = ErrorFunction.GetDerivativeValue(labels[j], this.Layers[l].Activations[j]);
}
}
else
{
for (int j = 0; j < this.Layers[l].CostDerivatives.Length; j++)
{
double acum = 0;
for (int j2 = 0; j2 < Layers[l + 1].Size; j2++)
{
acum += Layers[l + 1].WeightMatrix[j2, j] * this.Layers[l+1].ActivationFunction.GetDerivativeValue(Layers[l + 1].WeightedSum[j2]) * Layers[l + 1].CostDerivatives[j2];
}
this.Layers[l].CostDerivatives[j] = acum;
}
}
for (int j = 0; j < this.Layers[l].Activations.Length; j++)
{
this.Layers[l].BiasVectorChangeRecord[j] += this.Layers[l].ActivationFunction.GetDerivativeValue(Layers[l].WeightedSum[j]) * Layers[l].CostDerivatives[j];
for (int k = 0; k < Layers[l].WeightMatrix.GetLength(1); k++)
{
this.Layers[l].WeightMatrixChangeRecord[j, k] += Layers[l - 1].Activations[k]
* this.Layers[l].ActivationFunction.GetDerivativeValue(Layers[l].WeightedSum[j])
* Layers[l].CostDerivatives[j];
}
}
}
}
}
TakeGradientDescentStep(batch.Count);
if ((i + 1) % (epochs / 10) == 0)
{
Console.WriteLine("Epoch " + (i + 1) + ", Avg.Loss:" + CalculateMeanErrorOverDataSet(dataSet));
}
}
This is an example of what the local minimum looks like in the current model.
In my research I found that similar models may achieve accuracy of up to 90%. My model barely got to 10%.
What is the best way to perform bulk inserts into an MS Access database from .NET? Using ADO.NET, it is taking way over an hour to write out a large dataset.
Note that my original post, before I "refactored" it, had both the question and the answer in the question part. I took Igor Turman's suggestion and re-wrote it in two parts: the question above, followed by my answer.
I found that using DAO in a specific manner is roughly 30 times faster than using ADO.NET. I am sharing the code and results in this answer. As background, the test below writes out 100,000 records to a table with 20 columns.
A summary of the techniques and times, from best to worst:
02.8 seconds: Use DAO, use DAO.Field's to refer to the table columns
02.8 seconds: Write out to a text file, use Automation to import the text into Access
11.0 seconds: Use DAO, use the column index to refer to the table columns
17.0 seconds: Use DAO, refer to the columns by name
79.0 seconds: Use ADO.NET, generate INSERT statements for each row
86.0 seconds: Use ADO.NET, use a DataTable with a DataAdapter for "batch" insert
As background, occasionally I need to perform analysis of reasonably large amounts of data, and I find that Access is the best platform. The analysis involves many queries, and often a lot of VBA code.
For various reasons, I wanted to use C# instead of VBA. The typical way is to use OleDB to connect to Access. I used an OleDbDataReader to grab millions of records, and it worked quite well. But when outputting results to a table, it took a long, long time. Over an hour.
First, let's discuss the two typical ways to write records to Access from C#. Both involve OleDB and ADO.NET. The first is to generate INSERT statements one at a time and execute them, which took 79 seconds for the 100,000 records. The code is:
public static double TestADONET_Insert_TransferToAccess()
{
    StringBuilder names = new StringBuilder();
    for (int k = 0; k < 20; k++)
    {
        string fieldName = "Field" + (k + 1).ToString();
        if (k > 0)
        {
            names.Append(",");
        }
        names.Append(fieldName);
    }
    DateTime start = DateTime.Now;
    using (OleDbConnection conn = new OleDbConnection(Properties.Settings.Default.AccessDB))
    {
        conn.Open();
        OleDbCommand cmd = new OleDbCommand();
        cmd.Connection = conn;
        cmd.CommandText = "DELETE FROM TEMP";
        int numRowsDeleted = cmd.ExecuteNonQuery();
        Console.WriteLine("Deleted {0} rows from TEMP", numRowsDeleted);
        for (int i = 0; i < 100000; i++)
        {
            StringBuilder insertSQL = new StringBuilder("INSERT INTO TEMP (")
                .Append(names)
                .Append(") VALUES (");
            for (int k = 0; k < 19; k++)
            {
                insertSQL.Append(i + k).Append(",");
            }
            insertSQL.Append(i + 19).Append(")");
            cmd.CommandText = insertSQL.ToString();
            cmd.ExecuteNonQuery();
        }
        cmd.Dispose();
    }
    double elapsedTimeInSeconds = DateTime.Now.Subtract(start).TotalSeconds;
    Console.WriteLine("Append took {0} seconds", elapsedTimeInSeconds);
    return elapsedTimeInSeconds;
}
Note that I found no method in Access that allows a bulk insert.
I had then thought that maybe using a data table with a data adapter would prove useful, especially since I thought that I could do batch inserts using the UpdateBatchSize property of a data adapter. However, apparently only SQL Server and Oracle support that, and Access does not. This took the longest time, 86 seconds. The code I used was:
public static double TestADONET_DataTable_TransferToAccess()
{
    StringBuilder names = new StringBuilder();
    StringBuilder values = new StringBuilder();
    DataTable dt = new DataTable("TEMP");
    for (int k = 0; k < 20; k++)
    {
        string fieldName = "Field" + (k + 1).ToString();
        dt.Columns.Add(fieldName, typeof(int));
        if (k > 0)
        {
            names.Append(",");
            values.Append(",");
        }
        names.Append(fieldName);
        values.Append("@" + fieldName);
    }
    DateTime start = DateTime.Now;
    OleDbConnection conn = new OleDbConnection(Properties.Settings.Default.AccessDB);
    conn.Open();
    OleDbCommand cmd = new OleDbCommand();
    cmd.Connection = conn;
    cmd.CommandText = "DELETE FROM TEMP";
    int numRowsDeleted = cmd.ExecuteNonQuery();
    Console.WriteLine("Deleted {0} rows from TEMP", numRowsDeleted);
    OleDbDataAdapter da = new OleDbDataAdapter("SELECT * FROM TEMP", conn);
    da.InsertCommand = new OleDbCommand("INSERT INTO TEMP (" + names.ToString() + ") VALUES (" + values.ToString() + ")");
    for (int k = 0; k < 20; k++)
    {
        string fieldName = "Field" + (k + 1).ToString();
        da.InsertCommand.Parameters.Add("@" + fieldName, OleDbType.Integer, 4, fieldName);
    }
    da.InsertCommand.UpdatedRowSource = UpdateRowSource.None;
    da.InsertCommand.Connection = conn;
    //da.UpdateBatchSize = 0;
    for (int i = 0; i < 100000; i++)
    {
        DataRow dr = dt.NewRow();
        for (int k = 0; k < 20; k++)
        {
            dr["Field" + (k + 1).ToString()] = i + k;
        }
        dt.Rows.Add(dr);
    }
    da.Update(dt);
    conn.Close();
    double elapsedTimeInSeconds = DateTime.Now.Subtract(start).TotalSeconds;
    Console.WriteLine("Append took {0} seconds", elapsedTimeInSeconds);
    return elapsedTimeInSeconds;
}
Then I tried non-standard ways. First, I wrote out to a text file and then used Automation to import it. This was fast - 2.8 seconds - and tied for first place. But I consider it fragile for a number of reasons: outputting date fields is tricky. I had to format them specially (someDate.ToString("yyyy-MM-dd HH:mm")) and then set up a special "import specification" that codes in this format. The import specification also had to have the "quote" delimiter set right. In the example below, with only integer fields, there was no need for an import specification.
Text files are also fragile for "internationalization", where commas are used as decimal separators, date formats differ, and Unicode may be in use.
Notice that the first record contains the field names so that the column order isn't dependent on the table, and that we used Automation to do the actual import of the text file.
public static double TestTextTransferToAccess()
{
    StringBuilder names = new StringBuilder();
    for (int k = 0; k < 20; k++)
    {
        string fieldName = "Field" + (k + 1).ToString();
        if (k > 0)
        {
            names.Append(",");
        }
        names.Append(fieldName);
    }
    DateTime start = DateTime.Now;
    StreamWriter sw = new StreamWriter(Properties.Settings.Default.TEMPPathLocation);
    sw.WriteLine(names);
    for (int i = 0; i < 100000; i++)
    {
        for (int k = 0; k < 19; k++)
        {
            sw.Write(i + k);
            sw.Write(",");
        }
        sw.WriteLine(i + 19);
    }
    sw.Close();
    ACCESS.Application accApplication = new ACCESS.Application();
    string databaseName = Properties.Settings.Default.AccessDB
        .Split(new char[] { ';' }).First(s => s.StartsWith("Data Source=")).Substring(12);
    accApplication.OpenCurrentDatabase(databaseName, false, "");
    accApplication.DoCmd.RunSQL("DELETE FROM TEMP");
    accApplication.DoCmd.TransferText(TransferType: ACCESS.AcTextTransferType.acImportDelim,
        TableName: "TEMP",
        FileName: Properties.Settings.Default.TEMPPathLocation,
        HasFieldNames: true);
    accApplication.CloseCurrentDatabase();
    accApplication.Quit();
    accApplication = null;
    double elapsedTimeInSeconds = DateTime.Now.Subtract(start).TotalSeconds;
    Console.WriteLine("Append took {0} seconds", elapsedTimeInSeconds);
    return elapsedTimeInSeconds;
}
Finally, I tried DAO. Lots of sites out there give huge warnings about using DAO. However, it turns out that it is simply the best way to interact between Access and .NET, especially when you need to write out a large number of records. Also, it gives access to all the properties of a table. I read somewhere that it's easiest to program transactions using DAO instead of ADO.NET.
Notice that there are several lines of code that are commented. They will be explained soon.
public static double TestDAOTransferToAccess()
{
    string databaseName = Properties.Settings.Default.AccessDB
        .Split(new char[] { ';' }).First(s => s.StartsWith("Data Source=")).Substring(12);
    DateTime start = DateTime.Now;
    DAO.DBEngine dbEngine = new DAO.DBEngine();
    DAO.Database db = dbEngine.OpenDatabase(databaseName);
    db.Execute("DELETE FROM TEMP");
    DAO.Recordset rs = db.OpenRecordset("TEMP");
    DAO.Field[] myFields = new DAO.Field[20];
    for (int k = 0; k < 20; k++) myFields[k] = rs.Fields["Field" + (k + 1).ToString()];
    //dbEngine.BeginTrans();
    for (int i = 0; i < 100000; i++)
    {
        rs.AddNew();
        for (int k = 0; k < 20; k++)
        {
            //rs.Fields[k].Value = i + k;
            myFields[k].Value = i + k;
            //rs.Fields["Field" + (k + 1).ToString()].Value = i + k;
        }
        rs.Update();
        //if (0 == i % 5000)
        //{
        //    dbEngine.CommitTrans();
        //    dbEngine.BeginTrans();
        //}
    }
    //dbEngine.CommitTrans();
    rs.Close();
    db.Close();
    double elapsedTimeInSeconds = DateTime.Now.Subtract(start).TotalSeconds;
    Console.WriteLine("Append took {0} seconds", elapsedTimeInSeconds);
    return elapsedTimeInSeconds;
}
In this code, we created DAO.Field variables for each column (myFields[k]) and then used them. That took 2.8 seconds. Alternatively, one could access those fields by name, as in the commented line rs.Fields["Field" + (k + 1).ToString()].Value = i + k;, which increased the time to 17 seconds. Wrapping the code in a transaction (see the commented lines) dropped that to 14 seconds. Using an integer index, rs.Fields[k].Value = i + k;, dropped that to 11 seconds. Using the DAO.Field variables (myFields[k]) together with a transaction actually took longer, increasing the time to 3.1 seconds.
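For reference, the transaction variant from the commented lines looks roughly like this (same TEMP recordset and myFields array as above; the 5000-row commit interval is the one from the comments):
// Sketch of the commented-out transaction variant: commit every 5000 rows.
dbEngine.BeginTrans();
for (int i = 0; i < 100000; i++)
{
    rs.AddNew();
    for (int k = 0; k < 20; k++)
    {
        myFields[k].Value = i + k;
    }
    rs.Update();
    if (0 == i % 5000)
    {
        dbEngine.CommitTrans();
        dbEngine.BeginTrans();
    }
}
dbEngine.CommitTrans();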
Lastly, for completeness, all of this code was in a simple static class, and the using statements are:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using ACCESS = Microsoft.Office.Interop.Access; // USED ONLY FOR THE TEXT FILE METHOD
using DAO = Microsoft.Office.Interop.Access.Dao; // USED ONLY FOR THE DAO METHOD
using System.Data; // USED ONLY FOR THE ADO.NET/DataTable METHOD
using System.Data.OleDb; // USED FOR BOTH ADO.NET METHODS
using System.IO; // USED ONLY FOR THE TEXT FILE METHOD
Thanks Marc, in order to upvote you I created an account on StackOverflow...
Below is a reusable method [tested in C# on 64-bit Win 7, Windows 2008 R2, Vista, and XP platforms].
Performance details:
Exports 120,000 rows in 4 seconds.
Copy the code below, pass the parameters... and see the performance.
Just pass your DataTable with the same schema as the target Access DB table.
DBPath = full path of the Access DB
TableNm = name of the target Access DB table.
The code:
public void BulkExportToAccess(DataTable dtOutData, String DBPath, String TableNm)
{
    DAO.DBEngine dbEngine = new DAO.DBEngine();
    Boolean CheckFl = false;
    try
    {
        DAO.Database db = dbEngine.OpenDatabase(DBPath);
        DAO.Recordset AccesssRecordset = db.OpenRecordset(TableNm);
        DAO.Field[] AccesssFields = new DAO.Field[dtOutData.Columns.Count];
        //Loop on each row of dtOutData
        for (Int32 rowCounter = 0; rowCounter < dtOutData.Rows.Count; rowCounter++)
        {
            AccesssRecordset.AddNew();
            //Loop on each column
            for (Int32 colCounter = 0; colCounter < dtOutData.Columns.Count; colCounter++)
            {
                // for the first time... set up the field references.
                if (!CheckFl)
                    AccesssFields[colCounter] = AccesssRecordset.Fields[dtOutData.Columns[colCounter].ColumnName];
                AccesssFields[colCounter].Value = dtOutData.Rows[rowCounter][colCounter];
            }
            AccesssRecordset.Update();
            CheckFl = true;
        }
        AccesssRecordset.Close();
        db.Close();
    }
    finally
    {
        System.Runtime.InteropServices.Marshal.ReleaseComObject(dbEngine);
        dbEngine = null;
    }
}
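A hypothetical call, assuming the method lives in an instance class and that dtOutData matches the schema of a table named "TEMP" in the target database (the class name, path and table name are placeholders):
// Hypothetical usage; the hosting class, path and table name are placeholders.
var exporter = new AccessExporter();   // whatever class hosts BulkExportToAccess
exporter.BulkExportToAccess(dtOutData, @"C:\data\target.accdb", "TEMP");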
You can use KORM, an object relational mapper that allows bulk operations over MS Access.
database
    .Query<Movie>()
    .AsDbSet()
    .BulkInsert(_data);
or, if you have a source reader, you can use the MsAccessBulkInsert class directly:
using (var bulkInsert = new MsAccessBulkInsert("connection string"))
{
    bulkInsert.Insert(sourceReader);
}
KORM is available from NuGet (Kros.KORM.MsAccess) and it's open source on GitHub.
Thanks Marc for the examples.
On my system the performance of DAO is not as good as suggested here:
TestADONET_Insert_TransferToAccess(): 68 seconds
TestDAOTransferToAccess(): 29 seconds
Since the use of Office interop libraries is not an option on my system, I tried a new method involving writing a CSV file and then importing it via ADO:
public static double TestADONET_Insert_FromCsv()
{
    StringBuilder names = new StringBuilder();
    for (int k = 0; k < 20; k++)
    {
        string fieldName = "Field" + (k + 1).ToString();
        if (k > 0)
        {
            names.Append(",");
        }
        names.Append(fieldName);
    }
    DateTime start = DateTime.Now;
    StreamWriter sw = new StreamWriter("tmpdata.csv");
    sw.WriteLine(names);
    for (int i = 0; i < 100000; i++)
    {
        for (int k = 0; k < 19; k++)
        {
            sw.Write(i + k);
            sw.Write(",");
        }
        sw.WriteLine(i + 19);
    }
    sw.Close();
    using (OleDbConnection conn = new OleDbConnection(Properties.Settings.Default.AccessDB))
    {
        conn.Open();
        OleDbCommand cmd = new OleDbCommand();
        cmd.Connection = conn;
        cmd.CommandText = "DELETE FROM TEMP";
        int numRowsDeleted = cmd.ExecuteNonQuery();
        Console.WriteLine("Deleted {0} rows from TEMP", numRowsDeleted);
        StringBuilder insertSQL = new StringBuilder("INSERT INTO TEMP (")
            .Append(names)
            .Append(") SELECT ")
            .Append(names)
            .Append(@" FROM [Text;Database=.;HDR=yes].[tmpdata.csv]");
        cmd.CommandText = insertSQL.ToString();
        cmd.ExecuteNonQuery();
        cmd.Dispose();
    }
    double elapsedTimeInSeconds = DateTime.Now.Subtract(start).TotalSeconds;
    Console.WriteLine("Append took {0} seconds", elapsedTimeInSeconds);
    return elapsedTimeInSeconds;
}
Performance analysis of TestADONET_Insert_FromCsv(): 1.9 seconds
Similar to Marc's example TestTextTransferToAccess(), this method is also fragile for a number of reasons regarding the use of CSV files.
Hope this helps.
Lorenzo
First make sure that the Access table's columns have the same names and compatible types. Then you can use this function, which I believe is very fast and elegant.
public void AccessBulkCopy(DataTable table)
{
    foreach (DataRow r in table.Rows)
        r.SetAdded();

    var myAdapter = new OleDbDataAdapter("SELECT * FROM " + table.TableName, _myAccessConn);

    var cbr = new OleDbCommandBuilder(myAdapter);
    cbr.QuotePrefix = "[";
    cbr.QuoteSuffix = "]";
    cbr.GetInsertCommand(true);
    myAdapter.Update(table);
}
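A hypothetical call, assuming _myAccessConn is an open OleDbConnection and the DataTable's TableName matches the Access table (the name is spliced into the SELECT above):
// Hypothetical usage; FillMyTable() is a placeholder for however you build the DataTable.
DataTable table = FillMyTable();
table.TableName = "TEMP";           // must match the Access table name
AccessBulkCopy(table);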
Another method to consider involves linking tables via DAO or ADOX and then executing statements like this:
SELECT * INTO Table1 FROM _LINKED_Table1
Please see my full answer here:
MS Access Batch Update via ADO.Net and COM Interoperability
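In case that link goes stale, here is a rough sketch of creating the link with DAO before running the SELECT INTO; the file paths and table names are placeholders:
// Rough sketch: link _LINKED_Table1 to a table in another Access file via DAO,
// then copy it with SELECT * INTO. Paths and names are placeholders.
DAO.DBEngine dbEngine = new DAO.DBEngine();
DAO.Database db = dbEngine.OpenDatabase(@"C:\data\target.accdb");
DAO.TableDef td = db.CreateTableDef("_LINKED_Table1");
td.Connect = @";DATABASE=C:\data\source.accdb";
td.SourceTableName = "Table1";
db.TableDefs.Append(td);
db.Execute("SELECT * INTO Table1 FROM _LINKED_Table1");
db.TableDefs.Delete("_LINKED_Table1");   // drop the link when done
db.Close();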
To add to Marc's answer:
Note that having the [STAThread] attribute above your Main method will make your program able to communicate with COM objects more easily, increasing the speed further. I know it's not for every application, but if you heavily depend on DAO, I would recommend it.
Furthermore, regarding the DAO insertion method: if you have a column that is not required and you want to insert null, don't even set its value. Setting the value costs time even when it is null.
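For example, a tiny sketch of that rule inside a DAO insert loop (GetCellValue is a hypothetical source of the cell value; myFields[k] is a cached DAO.Field as in the examples above):
// Skip the DAO.Field assignment entirely when there is no value to write.
object value = GetCellValue(k);    // hypothetical helper returning the cell value or null
if (value != null && value != DBNull.Value)
{
    myFields[k].Value = value;     // only touch the field when there is actual data
}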
Note the position of the DAO component here. This helps explain the efficiency improvements.
I have to upgrade the following code to use prepared statements:
OdbcCommand cmd = sql.CreateCommand();
cmd.CommandText = "SELECT [EMail] from myTable WHERE ";
for (int i = 0; i < 50; i++)
{
    if (i > 0)
    {
        cmd.CommandText += " OR ";
    }
    cmd.CommandText += "UNIQUE_ID = " + lUniqueIDS[i];
}
Forgive my stupid code above; it's just an example... I'm trying to fetch all the emails of users with IDs x, y, z, etc...
The question is - how can I rewrite it using prepared statements?
A blind naive guess would be
for (int i = 0; i < 50; i++)
{
    if (i > 0)
    {
        cmd.CommandText += " OR ";
    }
    cmd.CommandText += "UNIQUE_ID = ?";
    cmd.Parameters.Add("@UNIQUE_ID", OdbcType.BigInt).Value = lUniqueIDS[i];
}
Should it work? Can I append the same parameter (unique_id) more than once?
It looks like you're using positional parameters (i.e. ? in the query, rather than @UNIQUE_ID), which means the names of the parameters shouldn't matter as far as the SQL is concerned. However, I wouldn't be entirely surprised to see the provider complain... and it may make diagnostics harder too. I suggest you use the index as a suffix:
cmd.Parameters.Add("@UNIQUE_ID" + i, OdbcType.BigInt).Value = lUniqueIDS[i];