I would like to use a DataReader, since my file has millions of lines and loading a DataTable for SqlBulkCopy seems to be slow.
I implemented this reader, which reads my file like this:
//snippet from CSVReader implementation
Read();
_csvHeaderstring = _csvlinestring;
_header = ReadRow(_csvHeaderstring);
int i = 0;
_csvHeaderstring = "";
foreach (var item in _header)//read each column and create a dummy header.
{
headercollection.Add("COL_" + i.ToString(), null);
_csvHeaderstring = _csvHeaderstring + "COL_" + i.ToString() + _delimiter;
i++;
}
_csvHeaderstring = _csvHeaderstring.TrimEnd(_delimiter); //TrimEnd returns a new string, so the result must be assigned
_header = ReadRow(_csvHeaderstring);
Close(); //close and reopen to reset the read position to the beginning.
_file = File.OpenText(filePath);
public static void MyMethod()
{
textDataReader rdr = new textDataReader(file, '\t', false);
using (SqlBulkCopy bulkcopy = new SqlBulkCopy(connectionString))
{
bulkcopy.DestinationTableName = "[dbo].[MyTable]";
bulkcopy.ColumnMappings.Add("COL_0", "DestinationCol1");
bulkcopy.ColumnMappings.Add("COL_1", "DestinationCol2");
bulkcopy.ColumnMappings.Add("COL_3", "DestinationCol3");
bulkcopy.ColumnMappings.Add("COL_4", "DestinationCol4");
bulkcopy.ColumnMappings.Add("COL_5", "DestinationCol5");
bulkcopy.ColumnMappings.Add("COL_6", "DestinationCol6");
bulkcopy.WriteToServer(rdr);
bulkcopy.Close();
}
}
However, when I try to add a SqlBulkCopy column mapping that skips the unnecessary column, I get the error:
the given columnmapping does not match up with any column in the
source or destination.
I have also tried mapping this way, to no avail.
Will this implementation of IDataReader work with SqlBulkCopy? I tried implementing an 'ignore column' method, but that didn't work.
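For what it's worth, one way to sidestep name resolution on the source side is to map by source ordinal instead of by the dummy COL_n names; SqlBulkCopyColumnMappingCollection.Add has an (int, string) overload. Name-based source mappings only resolve if the custom IDataReader reports the COL_n names through GetName/GetOrdinal. The following is just a sketch against the snippet above (destination column names taken from it, and it assumes the reader exposes at least seven fields):
// Sketch: map source columns by ordinal so SqlBulkCopy never has to resolve
// the dummy COL_n names through the custom IDataReader.
using (var bulkcopy = new SqlBulkCopy(connectionString))
{
    bulkcopy.DestinationTableName = "[dbo].[MyTable]";
    bulkcopy.ColumnMappings.Add(0, "DestinationCol1"); // COL_0
    bulkcopy.ColumnMappings.Add(1, "DestinationCol2"); // COL_1
    bulkcopy.ColumnMappings.Add(3, "DestinationCol3"); // COL_3 (COL_2 skipped)
    bulkcopy.ColumnMappings.Add(4, "DestinationCol4");
    bulkcopy.ColumnMappings.Add(5, "DestinationCol5");
    bulkcopy.ColumnMappings.Add(6, "DestinationCol6");
    bulkcopy.WriteToServer(rdr);
}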
file path is #"E:\BCFNA-orig-1.xsl"
excel file consists of 9 columns and 500 rows i want to get data from each row into an array int[] NumberOfInputs = {7,4,4,4,2,4,5,5,0}; " the values inside array are supposed to get from excel file , use it in my program and than get data from next row.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;
using System.Data.OleDb;
using System.IO;
namespace ConsoleApplication3
{
class Program
{
static void Main()
{
}
public class SomethingSometingExcelClass
{
public void DoSomethingWithExcel(string filePath)
{
List<DataTable> worksheets = ImportExcel(filePath);
foreach(var item in worksheets){
foreach (DataRow row in item.Rows)
{
//add to array
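// (Sketch, not from the original post: one way to do the "add to array" step,
// assuming all 9 cells in the row hold numeric values.)
int[] NumberOfInputs = row.ItemArray
    .Select(cell => Convert.ToInt32(cell))
    .ToArray();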
}
}
}
/// <summary>
/// Imports Data from Microsoft Excel File.
/// </summary>
/// <param name="FileName">Filename from which data need to import data
/// <returns>List of DataTables, based on the number of sheets</returns>
private List<DataTable> ImportExcel(string FileName)
{
List<DataTable> _dataTables = new List<DataTable>();
string _ConnectionString = string.Empty;
string _Extension = Path.GetExtension(FileName);
//Checking the extension; if XLS, connect using the Jet OLE DB provider
_ConnectionString =
    "Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties=Excel 8.0";
DataTable dataTable = null;
using (OleDbConnection oleDbConnection =
new OleDbConnection(string.Format(_ConnectionString, FileName)))
{
oleDbConnection.Open();
//Getting the meta data information.
//This DataTable will return the details of the sheets in the Excel file.
DataTable dbSchema =
oleDbConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables_Info, null);
foreach (DataRow item in dbSchema.Rows)
{
//reading data from excel to Data Table
using (OleDbCommand oleDbCommand = new OleDbCommand())
{
oleDbCommand.Connection = oleDbConnection;
oleDbCommand.CommandText = string.Format("SELECT * FROM [B1415:J2113]", item["TABLE_NAME"].ToString());
using (OleDbDataAdapter oleDbDataAdapter = new OleDbDataAdapter())
{
oleDbDataAdapter.SelectCommand = oleDbCommand;
dataTable = new DataTable(item["TABLE_NAME"].ToString());
oleDbDataAdapter.Fill(dataTable);
_dataTables.Add(dataTable);
}
}
}
}
return _dataTables;
}
}
}
}
Above is the code I am using to get the data from Excel.
Below is the nested loop in which I want to use the data:
for (ChromosomeID = 0; ChromosomeID < PopulationSize; ChromosomeID++)
{
Fitness = 0;
Altemp = (int[])AlPopulation[ChromosomeID];
for (int z = 0; z < 500; z++)
{
int[] NumberOfInputs = new int[9];
//// this is the array into which the data needs to be added
InputBinary.AddRange(DecBin.Conversion2(NumberOfInputs));
for (i = 0; i < Altemp.Length; i++)
{
AlGenotype[i] = (int)Altemp[i];
}
Class1 ClsMn = new Class1();
AlActiveGenes = ClsMn.ListofActiveNodes(AlGenotype);
ClsNetworkProcess ClsNWProcess = new ClsNetworkProcess();
AlOutputs = ClsNWProcess.NetWorkProcess(InputBinary, AlGenotype, AlActiveGenes);
int value = 0;
for (i = 0; i < AlOutputs.Count; ++i)
{
value ^= (int)AlOutputs[i]; // XOR the outputs of the system
}
temp = Desired_Output[0];
if (value == temp) // compare system output with desired output bit by bit
Fitness++;
else
Fitness = Fitness;
}
AlFitness.Add(Fitness);
}
}
Zahra, no one here answering questions is paid to do so. We answer because others have helped us, so we want to give back. Your attitude of "want a complete code with all reference assemblies used" seems rather demanding.
Having said that: xls/xlsx is not a plain-text format, so you will need a library like ExcelLibrary to do this. Even though this answer is more about writing xlsx, it should still give you some more options: https://stackoverflow.com/a/2603625/550975
I would suggest using my tool Npoi.Mapper, which is based on the popular NPOI library. You can import and export POCO types directly, with convention-based mapping or explicit mapping.
Get objects from Excel (XLS or XLSX)
var mapper = new Mapper("Book1.xlsx");
var objs1 = mapper.Take<SampleClass>("sheet2");
// You can take objects from the same sheet with different type.
var objs2 = mapper.Take<AnotherClass>("sheet2");
Export objects
//var objects = ...
var mapper = new Mapper();
mapper.Save("test.xlsx", objects, "newSheet", overwrite: false);
I have 300 CSV files, and each file contains 18,000 rows and 27 columns.
Now I want to make a Windows Forms application that imports them, shows them in a DataGridView, and does some mathematical operations later.
But the performance is very poor...
After searching for this problem on Google, I found a solution, "A Fast CSV Reader"
(http://www.codeproject.com/Articles/9258/A-Fast-CSV-Reader).
I followed the code step by step, but my DataGridView is still empty.
I don't know how to solve this problem.
Could anyone tell me what to do, or give me a better way to read CSV files efficiently?
Here is my code...
using System.IO;
using LumenWorks.Framework.IO.Csv;
private void Form1_Load(object sender, EventArgs e)
{
ReadCsv();
}
void ReadCsv()
{
// open the file "data.csv" which is a CSV file with headers
using (CachedCsvReader csv = new
CachedCsvReader(new StreamReader("data.csv"), true))
{
// Field headers will automatically be used as column names
dataGridView1.DataSource = csv;
}
}
Here is my input data:
https://dl.dropboxusercontent.com/u/28540219/20130102.csv
Thanks...
The data you provide contains no headers (the first line is a data line), so I got an ArgumentException (an item with the same key has already been added) when I tried to set the CSV reader as the DataSource. Setting the hasHeaders parameter in the CachedCsvReader constructor to false did the trick, and it added the data to the DataGridView (very fast).
using (CachedCsvReader csv = new CachedCsvReader(new StreamReader("data.csv"), false))
{
dataGridView.DataSource = csv;
}
Hope this helps!
You can also do it like this:
private void ReadCsv()
{
string filePath = @"C:\..\20130102.csv";
FileStream fileStream = null;
try
{
fileStream = File.Open(filePath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
}
catch (Exception ex)
{
return;
}
DataTable table = new DataTable();
bool isColumnCreated = false;
using (StringReader reader = new StringReader(new StreamReader(fileStream, Encoding.Default).ReadToEnd()))
{
while (reader.Peek() != -1)
{
string line = reader.ReadLine();
if (line == null || line.Length == 0)
continue;
string[] values = line.Split(',');
if(!isColumnCreated)
{
for(int i=0; i < values.Count(); i++)
{
table.Columns.Add("Column" + i);
}
isColumnCreated = true;
}
DataRow row = table.NewRow();
for(int i=0; i < values.Count(); i++)
{
row[i] = values[i];
}
table.Rows.Add(row);
}
}
dataGridView1.DataSource = table;
}
Depending on your performance requirements, this code can be improved; it is just a working sample for your reference.
I hope this gives you some ideas.
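One possible further improvement, if populating the DataTable itself turns out to be the bottleneck (this is my addition, just a sketch): suspend change notifications and index maintenance while rows are added in bulk.
// Sketch only: the same while (reader.Peek() != -1) loop as above, bracketed by load calls.
table.BeginLoadData();   // suspend notifications, index maintenance and constraints
// ... read lines, split them and add rows exactly as in the sample above ...
table.EndLoadData();     // re-enable them once all rows are in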
Hi, I'm using CsvHelper to read in CSV files with a variable number of columns. The first row always contains a header row. The number of columns is unknown at first; sometimes there are three columns and sometimes there are 30+. The number of rows can be large.
I can read in the CSV file, but how do I address each column of data? I need to do some basic stats on the data (e.g. min, max, stddev), then write it out in a non-CSV format.
Here is my code so far...
try{
using (var fileReader = File.OpenText(inFile))
using (var csvResult = new CsvHelper.CsvReader(fileReader))
{
// read the header line
csvResult.Read();
// read the whole file
dynamic recs = csvResult.GetRecords<dynamic>().ToList();
/* now how do I get a whole column ???
* recs.getColumn ???
* recs.getColumn["headerName"] ???
*/
}
}
catch (Exception ex)
{
MessageBox.Show("Error: Could not read file from disk. Original error: " + ex.Message);
}
Thanks
I don't think the library is capable of doing this directly. You have to read your column from the individual fields and add them to a List, but the process is usually fast because the reader itself is fast. For example, if your desired column is of type string, the code would look like this:
List<string> myStringColumn= new List<string>();
using (var fileReader = File.OpenText(inFile))
using (var csvResult = new CsvHelper.CsvReader(fileReader))
{
while (csvResult.Read())
{
string stringField=csvResult.GetField<string>("Header Name");
myStringColumn.Add(stringField);
}
}
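Since the question also asks for basic stats (min, max, stddev), here is a minimal follow-up sketch, assuming the collected column contains numeric text (this part is my addition, not part of CsvHelper; it uses System.Linq):
// Sketch: basic statistics over the collected column values.
var numbers = myStringColumn.Select(double.Parse).ToList();
double min = numbers.Min();
double max = numbers.Max();
double mean = numbers.Average();
double stdDev = Math.Sqrt(numbers.Sum(v => (v - mean) * (v - mean)) / numbers.Count);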
using (System.IO.StreamReader file = new System.IO.StreamReader(Server.MapPath(filepath)))
{
//Csv reader reads the stream
CsvReader csvread = new CsvReader(file);
while (csvread.Read())
{
int count = csvread.FieldHeaders.Count();
if (count == 55)
{
DataRow dr = myExcelTable.NewRow();
if (csvread.GetField<string>("FirstName") != null)
{
dr["FirstName"] = csvread.GetField<string>("FirstName"); ;
}
else
{
dr["FirstName"] = "";
}
if (csvread.GetField<string>("LastName") != null)
{
dr["LastName"] = csvread.GetField<string>("LastName"); ;
}
else
{
dr["LastName"] = "";
}
}
else
{
lblMessage.Visible = true;
lblMessage.Text = "Columns are not in specified format.";
lblMessage.ForeColor = System.Drawing.Color.Red;
return;
}
}
}
I have a remote SQL connection in C# that needs to execute a query and save its results to the user's local hard disk. There is a fairly large amount of data this thing can return, so I need to think of an efficient way of storing it. I've read before that first putting the whole result into memory and then writing it is not a good idea, so if someone could help, that would be great!
I am currently storing the SQL result data in a DataTable, although I am thinking it could be better to do something in a while (myReader.Read()) { ... } loop.
Below is the code that gets the results:
DataTable t = new DataTable();
string myQuery = QueryLoader.ReadQueryFromFileWithBdateEdate(@"Resources\qrs\qryssysblo.q", newdate, newdate);
using (SqlDataAdapter a = new SqlDataAdapter(myQuery, sqlconn.myConnection))
{
a.Fill(t);
}
var result = string.Empty;
for(int i = 0; i < t.Rows.Count; i++)
{
for (int j = 0; j < t.Columns.Count; j++)
{
result += t.Rows[i][j] + ",";
}
result += "\r\n";
}
So now I have this huge result string, and I have the DataTable. There has to be a much better way of doing this?
Thanks.
You are on the right track yourself. Use a while (myReader.Read()) { ... } loop and write each record to the text file inside the loop. The .NET Framework and operating system will take care of flushing the buffers to disk in an efficient way.
using(SqlConnection conn = new SqlConnection(connectionString))
using(SqlCommand cmd = conn.CreateCommand())
{
conn.Open();
cmd.CommandText = QueryLoader.ReadQueryFromFileWithBdateEdate(
#"Resources\qrs\qryssysblo.q", newdate, newdate);
using(SqlDataReader reader = cmd.ExecuteReader())
using(StreamWriter writer = new StreamWriter(@"c:\temp\file.txt"))
{
while(reader.Read())
{
// Using Name and Phone as example columns.
writer.WriteLine("Name: {0}, Phone : {1}",
reader["Name"], reader["Phone"]);
}
}
}
I came up with this; it's a better CSV writer than the ones in the other answers:
public static class DataReaderExtension
{
public static void ToCsv(this IDataReader dataReader, string fileName, bool includeHeaderAsFirstRow)
{
const string Separator = ",";
StreamWriter streamWriter = new StreamWriter(fileName);
StringBuilder sb = null;
if (includeHeaderAsFirstRow)
{
sb = new StringBuilder();
for (int index = 0; index < dataReader.FieldCount; index++)
{
if (dataReader.GetName(index) != null)
sb.Append(dataReader.GetName(index));
if (index < dataReader.FieldCount - 1)
sb.Append(Separator);
}
streamWriter.WriteLine(sb.ToString());
}
while (dataReader.Read())
{
sb = new StringBuilder();
for (int index = 0; index < dataReader.FieldCount; index++)
{
if (!dataReader.IsDBNull(index))
{
string value = dataReader.GetValue(index).ToString();
if (dataReader.GetFieldType(index) == typeof(String))
{
if (value.IndexOf("\"") >= 0)
value = value.Replace("\"", "\"\"");
if (value.IndexOf(Separator) >= 0)
value = "\"" + value + "\"";
}
sb.Append(value);
}
if (index < dataReader.FieldCount - 1)
sb.Append(Separator);
}
streamWriter.WriteLine(sb.ToString());
}
dataReader.Close();
streamWriter.Close();
}
}
usage: mydataReader.ToCsv("myfile.csv", true)
Rob Sedgwick's answer is more like it, but it can be improved and simplified. This is how I did it:
string separator = ";";
string fieldDelimiter = "";
bool useHeaders = true;
bool first;
string line;
// 'response' below is the HTTP response object being written to.
string connectionString = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx";
using (SqlConnection conn = new SqlConnection(connectionString))
{
using (SqlCommand cmd = conn.CreateCommand())
{
conn.Open();
string query = #"SELECT whatever";
cmd.CommandText = query;
using (SqlDataReader reader = cmd.ExecuteReader())
{
if (!reader.Read())
{
return;
}
List<string> columnNames = GetColumnNames(reader);
// Write headers if required
if (useHeaders)
{
first = true;
foreach (string columnName in columnNames)
{
response.Write(first ? string.Empty : separator);
line = string.Format("{0}{1}{2}", fieldDelimiter, columnName, fieldDelimiter);
response.Write(line);
first = false;
}
response.Write("\n");
}
// Write all records
do
{
first = true;
foreach (string columnName in columnNames)
{
response.Write(first ? string.Empty : separator);
string value = reader[columnName] == null ? string.Empty : reader[columnName].ToString();
line = string.Format("{0}{1}{2}", fieldDelimiter, value, fieldDelimiter);
response.Write(line);
first = false;
}
response.Write("\n");
}
while (reader.Read());
}
}
}
And you need to have a function GetColumnNames:
List<string> GetColumnNames(IDataReader reader)
{
List<string> columnNames = new List<string>();
for (int i = 0; i < reader.FieldCount; i++)
{
columnNames.Add(reader.GetName(i));
}
return columnNames;
}
I agree that your best bet here would be to use a SqlDataReader. Something like this:
StreamWriter YourWriter = new StreamWriter(#"c:\testfile.txt");
SqlCommand YourCommand = new SqlCommand();
SqlConnection YourConnection = new SqlConnection(YourConnectionString);
YourCommand.Connection = YourConnection;
YourCommand.CommandText = myQuery;
YourConnection.Open();
using (YourConnection)
{
using (SqlDataReader sdr = YourCommand.ExecuteReader())
using (YourWriter)
{
while (sdr.Read())
YourWriter.WriteLine(sdr[0].ToString() + sdr[1].ToString() + ",");
}
}
Mind you, in the while loop, you can write that line to the text file in any format you see fit with the column data from the SqlDataReader.
Keeping your original approach, here is a quick win:
Instead of using a string as a temporary buffer, use a StringBuilder. That lets you call Append(string) for concatenation instead of using the += operator.
The += operator is especially inefficient: it allocates a new string on every call, so if you put it in a loop that is repeated (potentially) millions of times, performance suffers.
The Append(string) method appends to the existing buffer instead of creating a new string, so it's faster.
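Applied to the loop from the question, the suggestion looks roughly like this (a sketch reusing the DataTable t from the question):
// Sketch: same nested loop as in the question, accumulating into a StringBuilder.
StringBuilder sb = new StringBuilder();
for (int i = 0; i < t.Rows.Count; i++)
{
    for (int j = 0; j < t.Columns.Count; j++)
    {
        sb.Append(t.Rows[i][j]).Append(',');
    }
    sb.Append("\r\n");
}
string result = sb.ToString();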
Using the Response object without calling Response.Close() causes, at least in some instances, the HTML of the page writing out the data to be appended to the file. If you do use Response.Close(), the connection can be closed prematurely and cause an error while producing the file.
Using HttpApplication.CompleteRequest() is recommended instead, but this appears to always cause the HTML to be written to the end of the file.
I have tried a stream in conjunction with the Response object and have had success in the development environment; I have not tried it in production yet.
I used CSV to export data from the database via a DataReader. In my project I read the DataReader and create the CSV file manually: in a loop I read the DataReader, and for every row I append each cell value to a result string, using "," to separate columns and "\n" to separate rows. Finally I save the result string as result.csv.
I also suggest the high-performance extension above; I tested it and it quickly exported 600,000 rows as CSV.
I use:
private void SaveData(string path)
{
DataTable tblResult = new DataTable();
using(SqlCommand cm = new SqlCommand("select something", objConnect))
{
tblResult.Load(cm.ExecuteReader());
}
if (tblResult != null)
{
using(FileStream fs = new FileStream(path, FileMode.Create, FileAccess.Write))
{
BinaryFormatter bin = new BinaryFormatter();
bin.Serialize(fs, tblResult);
}
}
}
Easy to use, and easy to load with:
private DataTable LoadData(string path)
{
DataTable t = new DataTable();
using(FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
{
BinaryFormatter bin = new BinaryFormatter();
t = (DataTable)bin.Deserialize(fs);
}
return t;
}
You can also use this method to save a DataSet.
UPDATE
I figured it out. Check out my answer below.
I'm trying to create a JSON string representing a row from a database table to return in an HTTP response. It seems like Json.NET would be a good tool to utilize. However, I'm not sure how to build the JSON string while I'm reading from the database.
The problem is marked by the obnoxious comments /******** ********/
// connect to DB
theSqlConnection.Open(); // open the connection
SqlDataReader reader = sqlCommand.ExecuteReader();
if (reader.HasRows) {
while(reader.Read()) {
StringBuilder sb = new StringBuilder();
StringWriter sw = new StringWriter(sb);
using (JsonWriter jsonWriter = new JsonTextWriter(sw)) {
// read columns from the current row and build this JsonWriter
jsonWriter.WriteStartObject();
jsonWriter.WritePropertyName("FirstName");
// I need to read the value from the database
/******** I can't just say reader[i] to get the ith column. How would I loop here to get all columns? ********/
jsonWriter.WriteValue(... ? ...);
jsonWriter.WritePropertyName("LastName");
jsonWriter.WriteValue(... ? ...);
jsonWriter.WritePropertyName("Email");
jsonWriter.WriteValue(... ? ...);
// etc...
jsonWriter.WriteEndObject();
}
}
}
The problem is that I don't know how to read each column of the row from the SqlDataReader so that I can call WriteValue, give it the correct information, and attach it to the correct column name. So if a row looks like this...
| FirstName | LastName | Email |
... how would I create a JsonWriter for each such row that contains all of the row's column names and the corresponding values, and then use that JsonWriter to build a JSON string ready to return in an HTTP response?
Let me know if I need to clarify anything.
My version:
This doesn't use the schema table, and it wraps the results in an array instead of using a writer per row.
SqlDataReader rdr = cmd.ExecuteReader();
StringBuilder sb = new StringBuilder();
StringWriter sw = new StringWriter(sb);
using (JsonWriter jsonWriter = new JsonTextWriter(sw))
{
jsonWriter.WriteStartArray();
while (rdr.Read())
{
jsonWriter.WriteStartObject();
int fields = rdr.FieldCount;
for (int i = 0; i < fields; i++)
{
jsonWriter.WritePropertyName(rdr.GetName(i));
jsonWriter.WriteValue(rdr[i]);
}
jsonWriter.WriteEndObject();
}
jsonWriter.WriteEndArray();
}
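Once the writer has been disposed, the JSON text can be read back out of the StringBuilder that backs the StringWriter, e.g.:
string json = sb.ToString(); // the JSON array built above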
EDITED FOR SPECIFIC EXAMPLE:
theSqlConnection.Open();
SqlDataReader reader = sqlCommand.ExecuteReader();
DataTable schemaTable = reader.GetSchemaTable();
foreach (DataRow row in schemaTable.Rows)
{
StringBuilder sb = new StringBuilder();
StringWriter sw = new StringWriter(sb);
using (JsonWriter jsonWriter = new JsonTextWriter(sw))
{
jsonWriter.WriteStartObject();
foreach (DataColumn column in schemaTable.Columns)
{
jsonWriter.WritePropertyName(column.ColumnName);
jsonWriter.WriteValue(row[column]);
}
jsonWriter.WriteEndObject();
}
}
theSqlConnection.Close();
Got it! Here's the C#...
// ... SQL connection and command set up, only querying 1 row from the table
StringBuilder sb = new StringBuilder();
StringWriter sw = new StringWriter(sb);
JsonWriter jsonWriter = new JsonTextWriter(sw);
try {
theSqlConnection.Open(); // open the connection
// read the row from the table
SqlDataReader reader = sqlCommand.ExecuteReader();
reader.Read();
int fieldcount = reader.FieldCount; // count how many columns are in the row
object[] values = new object[fieldcount]; // storage for column values
reader.GetValues(values); // extract the values in each column
jsonWriter.WriteStartObject();
for (int index = 0; index < fieldcount; index++) { // iterate through all columns
jsonWriter.WritePropertyName(reader.GetName(index)); // column name
jsonWriter.WriteValue(values[index]); // value in column
}
jsonWriter.WriteEndObject();
reader.Close();
} catch (SqlException sqlException) { // exception
context.Response.ContentType = "text/plain";
context.Response.Write("Connection Exception: ");
context.Response.Write(sqlException.ToString() + "\n");
} finally {
theSqlConnection.Close(); // close the connection
}
// END of method
// the above method returns sb, and another method uses it to write the HTTP response...
StringBuilder theTicket = getInfo(context, ticketID);
context.Response.ContentType = "application/json";
context.Response.Write(theTicket);
... so the StringBuilder sb variable is the JSON object that represents the row I wanted to query. Here is the JavaScript...
$.ajax({
type: 'GET',
url: 'Preview.ashx',
data: 'ticketID=' + ticketID,
dataType: "json",
success: function (data) {
// data is the JSON object the server spits out
// do stuff with the data
}
});
Thanks to Scott for his answer, which inspired my solution.
Hristo
I made the following method, which converts any DataReader to JSON, but only for single-depth serialization.
You should pass the reader and the column names as a string array, for example:
string[] columns = { "CustomerID", "CustomerName", "CustomerDOB" };
Then call the method:
public static String json_encode(IDataReader reader, String[] columns)
{
    int length = columns.Length;
    // Build a JSON array of row objects. Note: values are not JSON-escaped,
    // so this is only suitable for simple data.
    String res = "[";
    bool firstRow = true;
    while (reader.Read())
    {
        if (!firstRow)
            res += ",";
        firstRow = false;
        res += "{";
        for (int i = 0; i < length; i++)
        {
            res += "\"" + columns[i] + "\":\"" + reader[columns[i]].ToString() + "\"";
            if (i < length - 1)
                res += ",";
        }
        res += "}";
    }
    res += "]";
    return res;
}
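A hypothetical call site (the reader and column names are illustrative, matching the example above):
// Hypothetical usage: 'reader' is an open SqlDataReader over a customer query.
string[] columns = { "CustomerID", "CustomerName", "CustomerDOB" };
string json = json_encode(reader, columns);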