Copy large DataTable into MS Access table in C#

I wrote the following code to copy a DataTable's contents into an MS Access table.
The problem is that the data set is very large: the copy takes a long time (more than 10 minutes) and stops when the file reaches 2 GB. The entire data set occupies about 785 MB in RAM for roughly 820,000 rows.
public static bool InsertmyDataTableDAO(string filePathName, DataTable myDataTable)
{
    // The database is opened directly through DAO, so no OLE DB connection string is needed.
    DBEngine dbEngine = new DBEngine();
    Database db = dbEngine.OpenDatabase(filePathName);

    // Clear the target table, then open a recordset on it.
    db.Execute("DELETE FROM " + myDataTable.TableName);
    Recordset rs = db.OpenRecordset(myDataTable.TableName);

    // Cache the DAO field objects so they aren't looked up by name for every row.
    Field[] tableFields = new Field[myDataTable.Columns.Count];
    foreach (DataColumn column in myDataTable.Columns)
    {
        tableFields[column.Ordinal] = rs.Fields[column.ColumnName];
    }

    // Copy the rows one at a time.
    foreach (DataRow row in myDataTable.Rows)
    {
        rs.AddNew();
        foreach (DataColumn col in row.Table.Columns)
        {
            tableFields[col.Ordinal].Value = row[col.Ordinal];
        }
        rs.Update();
    }

    rs.Close();
    db.Close();
    return true;
}
Is there a faster way to copy the data from a DataTable to an MS Access database?

The maximum database size for Access is 2 GB; you can't bypass this limit:
https://support.office.com/en-us/article/access-specifications-0cf3c66f-9cf2-4e32-9568-98c1025bb47c?ui=en-US&rs=en-US&ad=US

I see you're using a DELETE statement to remove the rows beforehand. DELETE doesn't necessarily reclaim free space. Here's what I'd do:
Use your existing code to delete the data in the table.
Next, use Microsoft.Interop.Access to compact/repair the database (a sketch of this step follows below).
Finally, run your code above to insert the DataTable.
I'd also add that you could probably use Microsoft.Interop.Access to import the DataTable too. Perhaps save it to a CSV file first, then import it that way rather than using INSERT statements.
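As a rough sketch of the compact step, DAO's DBEngine.CompactDatabase (the same interop library the question's code already uses) writes a compacted copy to a new file, which you then swap in; the temp-file name here is a placeholder:
// Sketch only: compact into a temporary file, then swap it in.
public static void CompactAccessDb(string filePathName)
{
    string tempPath = filePathName + ".compact.tmp"; // placeholder name
    DBEngine dbEngine = new DBEngine();
    // CompactDatabase refuses to overwrite an existing destination file.
    if (System.IO.File.Exists(tempPath))
        System.IO.File.Delete(tempPath);
    dbEngine.CompactDatabase(filePathName, tempPath);
    // Replace the bloated database with the compacted copy.
    System.IO.File.Delete(filePathName);
    System.IO.File.Move(tempPath, filePathName);
}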

Related

Efficient way to add CSV file as records to SQL Server

I have an ASP.NET Core app which reads a CSV file and loops through each record to add it to the database.
using (var reader = new StringReader(publicScheduleData))
using (var csvReader = new CsvReader(reader))
{
    csvReader.Configuration.BadDataFound = null;
    csvReader.Configuration.MissingFieldFound = null;
    csvReader.Configuration.HeaderValidated = null;
    var records = csvReader.GetRecords<Qualys>();
    foreach (var item in records)
    {
        _context.Qualys.Add(item);
        await _context.SaveChangesAsync(); // one database round trip per record
    }
}
I know you can import directly in SSMS, but this has to be an end-user function to update the SQL data from a web form.
The above method seems very slow, writing only a few records per second. My CSV file has 90,000+ rows and 26 columns.
Is there a quicker method to add those records to the database?
Thanks, Mark
If you load your CSV into a DataTable and then insert it into the database in one operation, you should get much faster insert times, since you are not doing a single insert at a time. This article may get you on your way. If the approach in the article doesn't work for you, I would still suggest loading the CSV data into a DataTable, passing it as a user-defined table type parameter to a stored procedure, and then doing an INSERT with a SELECT from the user-defined table, as sketched below.
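As a minimal sketch of that last route, assuming a user-defined table type dbo.QualysType and a stored procedure dbo.InsertQualys that does an INSERT ... SELECT from it (both names are hypothetical), using System.Data.SqlClient:
// Sketch only: dbo.InsertQualys and dbo.QualysType are hypothetical; the
// procedure body would be INSERT INTO Qualys SELECT ... FROM @rows.
using (var conn = new SqlConnection(connString))
using (var cmd = new SqlCommand("dbo.InsertQualys", conn))
{
    cmd.CommandType = CommandType.StoredProcedure;
    var p = cmd.Parameters.AddWithValue("@rows", dataTable); // the DataTable built from the CSV
    p.SqlDbType = SqlDbType.Structured;
    p.TypeName = "dbo.QualysType"; // must match the table type's column layout
    conn.Open();
    cmd.ExecuteNonQuery();
}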
Newer versions of .NET Core support the SqlBulkCopy class, which lets you efficiently bulk load a SQL Server table with data from another source.
Here is an example to follow: https://www.c-sharpcorner.com/article/bulk-upload-in-net-core/
using (var sqlCopy = new SqlBulkCopy(connString))
{
    sqlCopy.DestinationTableName = "[Products]";
    sqlCopy.BatchSize = 500;
    using (var reader = ObjectReader.Create(prodlist, copyParameters))
    {
        sqlCopy.WriteToServer(reader);
    }
}
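Note that ObjectReader here comes from the FastMember NuGet package; prodlist and copyParameters are, respectively, the source object list and the array of member names to copy, as defined in the linked article.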

DataRow Column Has Value When Read One Way, But Not When Read a Different Way

I'm trying to read values from a DataSet returned from a stored procedure in C#/ASP.NET. Normally, the DefaultView from that DataSet is passed to a GridView on an ASP.NET page, and in that case the particular column I'm interested in has a value. However, if I read the same column from a DataRow, it comes through as empty.
For example, this will display a value:
DataSet ds = //////
DataView dv = ds.Tables[0].DefaultView;
grdQuotes.DataSource = dv;
grdQuotes.DataBind();
This, however, gives me no value:
DataSet ds = //////
foreach (DataRow row in ds.Tables[0].Rows)
{
    String value1 = (String)row["DOReviewDate"];
    String value2 = ((Object)row["DOReviewDate"]).ToString();
    String value3 = row.Field<String>("DOReviewDate");
}
All three variables end up empty.
I'm pretty lost on where to go with this, as it's apparent that there's a value being pulled from the SQL database, otherwise it wouldn't display in the GridView table on the page. Also, I can get the rest of the column values in the row without problem. Interestingly enough, there is one other column exhibiting the same behavior.
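As an aside, when a column reads as empty it can help to check whether it actually holds DBNull or an empty string; a minimal sketch against the same DataSet:
// Sketch only: distinguishes DBNull from an empty string in the suspect column.
foreach (DataRow row in ds.Tables[0].Rows)
{
    if (row.IsNull("DOReviewDate"))
        Console.WriteLine("DOReviewDate is DBNull");
    else if (((string)row["DOReviewDate"]).Length == 0)
        Console.WriteLine("DOReviewDate is an empty string");
    else
        Console.WriteLine("DOReviewDate = " + row["DOReviewDate"]);
}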
-- EDIT --
Attempt to iterate through rows and columns to get data:
StringBuilder sb = new StringBuilder();
foreach (DataRow row in ds.Tables[0].Rows)
{
    StringBuilder r = new StringBuilder();
    foreach (DataColumn c in ds.Tables[0].Columns)
    {
        r.Append(String.Format("{0} | ", row[c]));
    }
    r.Append("END");
    sb.AppendLine(r.ToString());
}
I was finally able to locate the stored procedure powering this functionality, and it's returning empty values for the columns I need: there is nothing but an empty string designating what to return. So while I can't find the functionality within the page construction, some join or alternate query must be pulling the data, and I believe I found the table it pulls from. I guess I'll take the initial data returned to me and use it to pull from the other table, or write my own SQL query with a join. Either way, the stored procedure itself does not contain a join to the table I need. This is what was in the stored procedure:
'' as DOReviewDate

Write data to columns in SQL Server table dynamically

I have a .csv file with around 200 columns, and the order of the columns changes all the time. I want to read each row from the file, identify the corresponding column names in the database, and write the data to the table accordingly.
For this I could use a simple switch-case on the column name, but since there are 200 columns, I'm wondering if there is any other way to do it.
Example:
// Pseudocode: the goal is to set the property whose name is passed in str.
public void ColName(string str, object a)
{
    SampleTableName obj = new SampleTableName();
    obj."str" = a; // not valid C#; this is the effect I'm after
    connection.AddSampleTableName(obj);
    connection.SaveChanges();
}
/* SampleTableName has columns: [Name, Age] */
ColName("Name","XYZ");
Output:
Name    Age
XYZ     NULL
Any ideas please? Thanks.
If the column names are the same, you can use SqlBulkCopy and add a list of column mappings. The order doesn't matter, as long as the destination table name is set.
DataTable table = CreateTable(rows);
using (var bulkCopy = new SqlBulkCopy(connectionString))
{
    // Map each source column to the destination column with the same name.
    foreach (var col in table.Columns.OfType<DataColumn>())
    {
        bulkCopy.ColumnMappings.Add(
            new SqlBulkCopyColumnMapping(col.ColumnName, col.ColumnName));
    }
    bulkCopy.BulkCopyTimeout = 600; // in seconds
    bulkCopy.DestinationTableName = "<tableName>";
    bulkCopy.WriteToServer(table);
}
If the column names are not the same, a dictionary to look up the differing names could be used, as sketched below.
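For instance, a minimal sketch of that lookup, with made-up CSV and database column names, reusing table and connectionString from the code above:
// Sketch only: the dictionary contents are hypothetical examples.
var csvToDbName = new Dictionary<string, string>
{
    { "Full Name", "Name" },
    { "Age (years)", "Age" },
};
using (var bulkCopy = new SqlBulkCopy(connectionString))
{
    foreach (var col in table.Columns.OfType<DataColumn>())
    {
        // Fall back to the source name when no mapping entry exists.
        string dbName = csvToDbName.TryGetValue(col.ColumnName, out var mapped)
            ? mapped : col.ColumnName;
        bulkCopy.ColumnMappings.Add(
            new SqlBulkCopyColumnMapping(col.ColumnName, dbName));
    }
    bulkCopy.DestinationTableName = "<tableName>";
    bulkCopy.WriteToServer(table);
}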
To keep it simple for maintenance purposes, I went with a switch-case (sigh). However, I wrote a small script to add all those field values to the table object.

What is the most efficient way to copy UniDataSet to SQL Server?

I have a U2/UniVerse database, and I need to copy the data in one of its tables into a SQL Server table. The table in question has approximately 600,000 rows and just under 200 columns. I didn't create the table, and I can't change it.
For other tables, I'm looping through a UniDataSet one record at a time, adding each record to a DataTable, and then using SqlBulkCopy to copy the records to SQL Server. That works fine, but with the large table I seem to run out of memory while building the DataTable.
DataTable dt = new DataTable("myTempTable");
dt.Columns.Add("FirstColumn", typeof(string));
dt.Columns.Add("SecondColumn", typeof(string));
... //adding a bunch more columns here
dt.Columns.Add("LastColumn", typeof(string));

U2Connection con = GetU2Con();
UniSession us1 = con.UniSession;
UniSelectList s1 = us1.CreateUniSelectList(0);
UniFile f1 = us1.CreateUniFile("MyU2TableName");
s1.Select(f1);
UniDataSet uSet = f1.ReadRecords(s1.ReadListAsStringArray());
foreach (UniRecord uItem in uSet)
{
    // Split the record on the UniVerse field mark (þ).
    List<String> record = new List<String>(uItem.Record.ToString().Split(new string[] { "þ" }, StringSplitOptions.None));
    DataRow row = dt.NewRow();
    row[0] = uItem.RecordID;
    row[1] = record[0];
    row[2] = record[1];
    ... //add the rest of the record
    row[50] = record[49];
    dt.Rows.Add(row);
}
con.Close();
con.Close();
So that copies the records from the UniDataSet into a DataTable. Then, I SqlBulkCopy the DataTable into a SQL table:
string SQLcon = GetSQLCon();
using (SqlBulkCopy sbc = new SqlBulkCopy(SQLcon))
{
    sbc.DestinationTableName = "dbo.MySQLTableName";
    sbc.BulkCopyTimeout = 0;
    sbc.BatchSize = 1000; //I've tried anywhere from 50 to 50000
    try
    {
        sbc.WriteToServer(dt);
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message);
    }
}
This works just fine for my U2 tables that have 50,000 or so rows, but it basically crashes the debugger (VS Express 2012) when the table has 500,000 rows. The PC I'm doing this on is Windows 7 x64 with 4 GB of RAM; the VS process appears to use up to 3.5 GB of RAM before it crashes.
I'm hoping there's a way to write the UniDataSet straight to SQL Server using SqlBulkCopy, but I'm not too familiar with the U2 .NET toolkit.
The problem I face is that the UniDataSet records are multivalued, and I need to pick them apart before I can write them to SQL.
Thanks!
The DataTable simply grows too large in memory before it is ever inserted into the database.
Why don't you split the bulk insert operation? For example, read the first 50,000 rows, insert them into the SQL Server database, clear the DataTable's rows to free the memory, and start again with the next 50,000 (a fuller sketch follows the snippet):
if (dt.Rows.Count >= 50000)
{
    sbc.WriteToServer(dt); // flush this batch with SqlBulkCopy
    dt.Rows.Clear();       // free the rows before reading more
}
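Folded into the question's read loop, the chunked approach might look like this sketch (reusing dt, uSet, and the SqlBulkCopy settings from the question):
// Sketch only: write every 50,000 rows and clear the DataTable between batches.
const int batchSize = 50000;
using (SqlBulkCopy sbc = new SqlBulkCopy(GetSQLCon()))
{
    sbc.DestinationTableName = "dbo.MySQLTableName";
    sbc.BulkCopyTimeout = 0;
    foreach (UniRecord uItem in uSet)
    {
        // ... populate a DataRow and add it to dt, as in the question ...
        if (dt.Rows.Count >= batchSize)
        {
            sbc.WriteToServer(dt); // write this batch
            dt.Rows.Clear();       // release the rows' memory
        }
    }
    if (dt.Rows.Count > 0)
        sbc.WriteToServer(dt); // write the final partial batch
}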
In U2 Toolkit for .NET v2.1.0, we implemented Native Access. You can now create a DataSet/DataTable from a UniData/UniVerse file directly, and you can specify WHERE and SORT clauses too. You will see a performance improvement because it avoids making a server trip per record ID: for example, with 1,000 record IDs the old approach makes 1,000 server trips, whereas Native Access makes one.
Please download U2 Toolkit for .NET v2.2.0 Hot Fix 1 and try the following code. For more information, please contact u2askus@rocketsoftware.com.
U2Connection con = GetU2Con();
U2Command cmd = con.CreateCommand();
cmd.CommandText = "Action=Select;File=MyU2TableName;Attributes=MyID,FirstColumn,SecondColumn,LastColumn;Where=MyID>0;Sort=MyID";
U2DataAdapter da = new U2DataAdapter(cmd);
DataSet ds = new DataSet();
da.Fill(ds);
DataTable dt = ds.Tables[0];

DbConnection.GetSchema("Tables") returns tables only for one database

I'm trying to get all table names for all databases, but GetSchema("Tables") returns names only for one database.
It's strange, because I use no restrictions and have db_owner with read/write permissions on many databases.
What do I need to do to get all tables' info?
It only returns the list of tables in the current connection's database.
To get a list of all the tables, you'll need to loop through each database.
I have used this in one of my open source projects: http://dbdoc.codeplex.com
You'll have to do something like this (where server is an SMO Server object):
foreach (Microsoft.SqlServer.Management.Smo.Database db in server.Databases)
{
    foreach (Microsoft.SqlServer.Management.Smo.Table tbl in db.Tables)
    {
        tables.Add(tbl.Name); // temp list collecting the table names
    }
}
I made it this way:
Cursor.Current = Cursors.WaitCursor;
sqlConnection.Open();
System.Data.DataTable dataTable = sqlConnection.GetSchema("Databases");
foreach (System.Data.DataRow row in dataTable.Rows)
{
    var dbName = row[0] as string;
    try
    {
        // Switch the existing connection to the next database.
        sqlConnection.ChangeDatabase(dbName);
    }
    catch (Exception)
    {
        continue; // skip databases we can't access
    }
    var tables = sqlConnection.GetSchema("Tables");
}
The real issue is that even if Initial Catalog is not specified explicitly, a default database is assigned, and it may be some other database, not necessarily 'master'. So GetSchema("Tables") should be called for every database on the server.
I had thought GetSchema("Tables") returned all tables for all databases.
