Does anyone know a faster way to fill a DataTable manually than the one I use?
Here is what I have: a List with about 1.7b entries.
I want to load these entries into a DataTable with one column as fast as possible.
An entry in my list looks like this: {"A2C","DDF","ER","SQ","8G"}
My code needs about 7-8 seconds:
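// First pass: create one empty row per list entry.
for (int i = 0; i < lists.Count; i++)
{
    table_list.Rows.Add();
}
// Second pass: fill the single column with the concatenated fields.
for (int a = 0; a < lists.Count; a++)
{
    table_list.Rows[a][0] = lists[a][0] + lists[a][1] +
        lists[a][2] + lists[a][3] + lists[a][4];
}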
As I didn't find any similar question on the board (just questions about how to fill datatable by sql and fill method), I decided to post my question.
Any input is highly appreciated!
I add this DataTable to a SQL Server database (I do this with SqlBulkCopy)
This is a mistake; the DataTable is pure overhead here. What you should expose is an IDataReader over that data. This API is a bit tricky, but FastMember makes it easier. For example, it sounds like you have 1 column; so consider:
class Foo {
public string ColumnName {get;set;}
}
Now write an iterator block method that converts this from the original list per item:
IEnumerable<Foo> Convert(List<TheOldType> list) {
foreach(var row in list) {
yield return new Foo { ColumnName = /* TODO */ };
}
}
and now create an IDataReader via FastMember on top of that lazy sequence:
// list is your existing List<TheOldType>
var data = Convert(list);
using(var bcp = new SqlBulkCopy(connection))
using(var reader = ObjectReader.Create(data, "ColumnName"))
{
bcp.DestinationTableName = "SomeTable";
bcp.WriteToServer(reader);
}
This works much better than populating a DataTable - in particular, it avoids populating a huge DataTable. Emphasis: the above is spooling - not buffered.
Why do you create an empty row first, then loop the table again to fill them?
I would use a simple foreach:
var table_list = new DataTable();
table_list.Columns.Add();
foreach(string[] fields in lists)
{
DataRow newRow = table_list.Rows.Add();
newRow.SetField(0, string.Join("", fields));
}
Why do you put all into one field?
Why not use the LoadDataRow method of the DataTable?
// turn off notifications
table_list.BeginLoadData();
// load each row into the table
foreach(string[] fields in lists)
table_list.LoadDataRow(new object[] { string.Join("", fields) }, false);
// turn notifications back on
table_list.EndLoadData();
Also see: DataTable.LoadDataRow Method http://msdn.microsoft.com/en-us/library/kcy03ww2(v=vs.110).aspx
So I'll explain my situation first.
I have a WPF View for my customer that is populated based on SQL strings that the customer defines. They can change these and add/remove these at any point and the structure of the result set is not in my control.
My expected output for this is:
1. Populating the DataGrid at runtime without prior knowledge of the structure, using AutoGenerateColumns and providing dataTable.DefaultView as the ItemsSource for the DataGrid. This is bound to my DataGrid:
   GetItemsSource = dataTable.DefaultView;
2. Exporting this DataGrid to a CSV for the customer to check whenever they want.
I already have a generic List function to save to CSV, but since the structure is not known I can't convert my DataTable to a list in order to use it.
My current solution is a Save To CSV function that takes a DataTable instead of a List.
Is there some other type of data structure I could use instead of DataTable that would make using my generic function possible, or do I just have to keep an extra Save To CSV function for this scenario?
UPDATE
My generic list function
public static void SaveToCsv<T>(List<T> data, string filePath) where T : class
{
CreateDirectoryIfNotExists(filePath);
List<string> lines = new();
StringBuilder line = new();
if (data == null || data.Count == 0)
{
throw new ArgumentNullException("data", "You must populate the data parameter with at least one value.");
}
var cols = data[0].GetType().GetProperties();
foreach (var col in cols)
{
line.Append(col.Name);
line.Append(",");
}
lines.Add(line.ToString().Substring(0, line.Length - 1));
foreach (var row in data)
{
line = new StringBuilder();
foreach (var col in cols)
{
line.Append(col.GetValue(row));
line.Append(",");
}
lines.Add(line.ToString().Substring(0, line.Length - 1));
}
System.IO.File.WriteAllLines(filePath, lines);
}
My current Data Table function
public static void SaveToCsv(DataTable data, string filePath)
{
    CreateDirectoryIfNotExists(filePath);
    List<string> lines = new();
    if (data == null)
    {
        throw new ArgumentNullException("data", "The DataTable has no values to Save to CSV.");
    }
    // Header line: the column names joined by commas.
    IEnumerable<string> columnNames = data.Columns.Cast<DataColumn>().Select(column => column.ColumnName);
    lines.Add(string.Join(",", columnNames));
    // One line per row. Joining directly avoids the Substring offset arithmetic,
    // which depended on the length of the platform's line terminator.
    foreach (DataRow row in data.Rows)
    {
        IEnumerable<string> fields = row.ItemArray.Select(field => field.ToString());
        lines.Add(string.Join(",", fields));
    }
    File.WriteAllLines(filePath, lines);
}
Is it possible to convert a DataTable to IEnumerable<T> where T cannot be defined at compile time and is not known beforehand?
You can create generic objects at runtime, but it is not simple, so I would avoid it if possible.
Is there some other type of data structure I could use instead of DataTable that would make using my generic function possible, or do I just have to keep an extra Save To CSV function for this scenario?
You could simply take the Rows property of your DataTable, convert it to a List<DataRow>, and pass that to your function. But it would probably not do what you want, because the function reflects over the properties of the DataRow class itself (RowState, Table, and so on) rather than over your column values.
What you need is some way to convert a DataRow into an object of a class with a property for each column. While it is possible to create classes from a database model, doing so at runtime is a lot of work, I would guess far more than your current solution.
To conclude, keep your current solution if it works. Messing around with reflection and runtime code generation will just make things more complicated.
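If the duplication between the two functions is the real concern, one middle ground is to keep both entry points but have each reduce its input to a header plus rows of values, and pass those to a single shared writer. The following is only a sketch under that assumption: the names CsvSketch, BuildLines, FromList, and FromTable are hypothetical, and the quoting rule is an addition that neither of the original functions has.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class CsvSketch
{
    // Shared core: turns a header and rows of values into CSV lines.
    public static List<string> BuildLines(
        IEnumerable<string> header,
        IEnumerable<IEnumerable<object>> rows)
    {
        var lines = new List<string> { string.Join(",", header.Select(Escape)) };
        foreach (var row in rows)
        {
            // Convert.ToString yields "" for null and for DBNull.Value.
            lines.Add(string.Join(",", row.Select(v => Escape(Convert.ToString(v)))));
        }
        return lines;
    }

    // Hypothetical quoting rule: wrap a field in quotes when it contains
    // a comma, a quote, or a line break, doubling any embedded quotes.
    private static string Escape(string field)
    {
        field ??= "";
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\n', '\r' }) >= 0;
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Adapter for typed lists: columns come from the public properties of T.
    public static List<string> FromList<T>(IEnumerable<T> data) where T : class
    {
        var props = typeof(T).GetProperties();
        return BuildLines(
            props.Select(p => p.Name),
            data.Select(item => props.Select(p => p.GetValue(item))));
    }

    // Adapter for DataTable: columns come from the table schema.
    public static List<string> FromTable(DataTable table)
    {
        return BuildLines(
            table.Columns.Cast<DataColumn>().Select(c => c.ColumnName),
            table.Rows.Cast<DataRow>().Select(r => (IEnumerable<object>)r.ItemArray));
    }
}
```

Each existing SaveToCsv overload could then call the matching adapter and write the result with File.WriteAllLines. Whether that is worth it over two small independent functions is a judgment call.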
I have searched high and low for a method to show the entire row of a C# datatable, both by referencing the row number and by simply writing the row contents to a string variable and showing the string in the console. I can specify the exact row and field value and display that value, but not the whole row. This is not a list in C#, this is a datatable.
For the simple code below, the output I get for the first WriteLine is "Horse", but the second two WriteLine commands, I get the console output of "System.Data.DataRow" instead of the whole row of data.
What am I doing wrong? Any help would be appreciated.
using System;
using System.Data;
using System.Threading;
namespace DataTablePractice
{
class Program
{
static void Main(string[] args)
{
// Create a DataTable.
using (DataTable table = new DataTable())
{
// Two columns.
table.TableName = "table";
table.Columns.Add("Number", typeof(string));
table.Columns.Add("Pet", typeof(string));
// ... Add two rows.
table.Rows.Add("4", "Horse");
table.Rows.Add("10", "Moose");
// ... Display the second field (Pet) of the first row in the console
Console.WriteLine(table.Rows[0].Field<string>(1));
//...Display the first row of the table in the console
Console.WriteLine(table.Rows[0]);
//...Create a new row variable to add a third pet
var newrow = table.Rows.Add("15", "Snake");
string NewRowString = newrow.ToString();
//...Display the new row of data in the console
Console.WriteLine(NewRowString);
//...Sleep for a few seconds to examine output
Thread.Sleep(4000);
}
}
}
}
When you run this:
Console.WriteLine(table.Rows[0]);
It's in effect calling this:
Console.WriteLine(table.Rows[0].ToString()); // prints object type, in this case a DataRow
If it were your own class, you could override ToString to return whatever you need, but you don't have that option with the DataRow class. And so it uses the default behavior as described here:
Default implementations of the Object.ToString method return the fully qualified name of the object's type.
You could iterate through the columns, like this for example:
var row = table.Rows[0];
for (var i = 0; i < row.ItemArray.Length; i++) // DataRow has no Count property
Console.Write(row[i] + " : ");
Or, a shorter way to print them all out:
Console.WriteLine(String.Join(" : ", table.Rows[0].ItemArray));
Given your data, maybe you just want to reference the two fields?
foreach (DataRow row in table.Rows)
Console.WriteLine($"You have {row[0]} {row[1]}(s).");
// You have 4 Horse(s).
// You have 10 Moose(s).
While the answer here is excellent, I highly recommend using Spectre.Console.
It is an open-source library that helps you generate highly formatted console output.
With this, the code to write the output simply becomes:
public static void Print(this DataTable dataTable)
{
var table = new Table();
table.AddColumn("#");
for (int i=0;i<dataTable.Columns.Count;i++)
{
table.AddColumn(dataTable.Columns[i].ColumnName);
}
for(int i=0;i<dataTable.Rows.Count;i++)
{
var values = new List<string>
{
i.ToString()
};
for (int j = 0; j < dataTable.Columns.Count;j++)
{
values.Add(dataTable.Rows[i][j]?.ToString()??"null");
}
table.AddRow(values.ToArray());
}
AnsiConsole.Write(table);
}
I have a really large table with about 1,000,000 rows of data in a C# DataTable, and I would like to upload it into a MySQL database table. What is the best and fastest way to do this?
Looping through the rows and uploading one row at a time performs really badly and also throws a timeout exception at times.
I know that one of the solutions is to write the data out to a file and load it from the file using MySqlBulkLoader. Is there any other way this could be done directly from the DataTable to the database?
A non-generic solution exists in the form of building a SQL query using StringBuilder. I have used a solution like this for MSSQL 2008, so it may prove useful for you.
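// Builds one multi-row INSERT statement for a batch of items.
// Values are concatenated into the SQL text, so only use this with trusted data.
string _insertQuery(IEnumerable<Item> datatable) {
    var sb = new StringBuilder();
    sb.Append("INSERT INTO table (coltext, colnum, coltextmore) VALUES ");
    foreach (var i in datatable) {
        sb.AppendFormat("('{0}', {1}, '{2}'),", i.ColText, i.ColNum, i.ColTextMore);
    }
    sb.Remove(sb.Length - 1, 1); // drop the trailing comma
    return sb.ToString();
}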
And you will (probably) need a way to page through the 1,000,000 rows:
var lst = new List<Item>();
// ...
for (int i = 0; i < lst.Count; i += 1000) {
    string sql = _insertQuery(lst.RangeOf(i, 1000));
    // execute sql against the database here
}
RangeOf() is an IList extension I wrote that pages through the list:
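public static IList<T> RangeOf<T>(this IList<T> src, int start, int length) {
    var result = new List<T>();
    // Clamp the end so the last, shorter page does not run past the list.
    int end = Math.Min(start + length, src.Count);
    for (int i = start; i < end; i++) {
        result.Add(src[i]);
    }
    return result;
}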
I am reading my DataTable as follows:
foreach ( DataRow o_DataRow in vco_DataTable.Rows )
{
//Insert More Here
}
It crashes because I insert more records inside the loop.
How can I read my DataTable without reading the new records? Can I read by RowState?
Thanks
Since I don't know what language you are using I can only give general advice.
In most (all?) languages it's not possible to do a foreach over a collection if you are modifying the collection. There are two common ways to deal with this.
Wild ass guessing pseudo code follows:
// first way uses array notation (if possible)
var no_of_rows = vco_DataTable.Rows.Count; // row count captured before inserting
for (var i = 0; i < no_of_rows; i++) {
    DataRow o_DataRow = vco_DataTable.Rows[i];
    //Insert More Here
}
// The second way copies the data
var my_copy = vco_DataTable.Copy();
foreach ( DataRow o_DataRow in my_copy.Rows )
{
    //Insert More into vco_DataTable Here
}
my_copy.Dispose(); // delete/destroy the copy
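Assuming the language is C# (the DataRow and DataTable names suggest it), a third variant is to copy only the row references into a list before looping, which is cheaper than copying the whole table. This is a minimal sketch, not the asker's code: the class and method names are hypothetical, and duplicating each row merely stands in for "insert more here".

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class SnapshotLoop
{
    // Visits only the rows that exist when the method is called and
    // appends one new row per visited row. Returns the number visited.
    public static int DuplicateExistingRows(DataTable table)
    {
        // ToList() copies the row references, so later Rows.Add calls
        // do not modify the collection being enumerated.
        List<DataRow> snapshot = table.Rows.Cast<DataRow>().ToList();
        foreach (DataRow row in snapshot)
        {
            // Stand-in for "insert more here": add a copy of the row's values.
            table.Rows.Add(row.ItemArray);
        }
        return snapshot.Count;
    }
}
```

On the RowState question: rows added with Rows.Add stay in RowState.Added until AcceptChanges is called, so filtering by RowState only separates old rows from new ones if the existing rows were already accepted (or loaded by a data adapter) before the loop starts.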
What is the best practice for converting DataColumn values to an array of strings?
[Edit]
I want all values of a certain DataColumn, across all rows of the DataTable, converted to an array of strings.
If I understood your goal you want to specify a particular column and return all its values as a string array.
Try these approaches out:
int columnIndex = 2; // desired column index
// for loop approach
string[] results = new string[dt.Rows.Count];
for (int index = 0; index < dt.Rows.Count; index++)
{
results[index] = dt.Rows[index][columnIndex].ToString();
}
// LINQ
var result = dt.Rows.Cast<DataRow>()
.Select(row => row[columnIndex].ToString())
.ToArray();
You could replace columnIndex with columnName instead, for example:
string columnName = "OrderId";
EDIT: you've asked for a string array specifically but in case you're flexible about the requirements I would prefer a List<string> to avoid the need to determine the array length prior to the for loop in the first example and simply add items to it. It's also a good opportunity to use a foreach loop instead.
I would then rewrite the code as follows:
List<string> list = new List<string>();
foreach (DataRow row in dt.Rows)
{
list.Add(row[columnIndex].ToString());
}
DataRow.ItemArray Property -
http://msdn.microsoft.com/en-us/library/system.data.datarow.itemarray.aspx
Also, which version are you using? You should check out the DataTableExtensions class -
http://msdn.microsoft.com/en-us/library/system.data.datatableextensions.aspx
And the DataRowExtensions class -
http://msdn.microsoft.com/en-us/library/system.data.datarowextensions.aspx
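As a rough illustration of how those two extension classes could apply to the question, the sketch below uses DataTableExtensions.AsEnumerable and DataRowExtensions.Field<T> to pull one column out as strings. The class and method names are hypothetical, and mapping DBNull to an empty string is an assumption about what the asker would want.

```csharp
using System;
using System.Data;
using System.Linq;

public static class ColumnValues
{
    // Returns every value of the named column as a string.
    public static string[] ToStringArray(DataTable dt, string columnName)
    {
        return dt.AsEnumerable()
                 // Field<object> returns null for DBNull, so ?. and ?? map it to "".
                 .Select(row => row.Field<object>(columnName)?.ToString() ?? "")
                 .ToArray();
    }
}
```

On .NET Framework this needs a reference to the System.Data.DataSetExtensions assembly; a call would look like ColumnValues.ToStringArray(dt, "OrderId").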
I know this question is old, but I found it in my Google search trying to do something similar. I wanted to create a list from all the values contained in a specific column of my datatable. In my code example below, I added a SQL datasource to my project in Visual Studio using the GUI wizards and I dropped the needed table adapter into the designer.
'Create a private DataTable
Private authTable As New qmgmtDataSet.AuthoritiesDataTable
'Fill the private table using the table adapter
Me.AuthoritiesTableAdapter1.Fill(Me.authTable)
'Make the list of values
Dim authNames As List(Of String) = New List(Of String)(From value As qmgmtDataSet.AuthoritiesRow In Me.authTable.Rows Select value.authName)