SqlDataReader: change column names when exporting to CSV - C#

I have a query that gets report data via a SqlDataReader and passes the reader to a method that exports the content to a .CSV file; however, the column names show up in the .CSV file exactly as they appear in the database, which is not ideal.
I do not want to alter the query itself (changing the names to have spaces) because this query is called in another location where it maps to an object and spaces would not work. I would prefer not to create a duplicate query because maintenance could be problematic. I also do not want to modify the method that writes out the .CSV as this is a method that is globally used.
Can I modify the column names after I fill the data reader but before I send it to the .CSV method? If so, how?
If I can't do it this way, could I do it if it was a DataTable instead?
Here is the general flow:
public static SqlDataReader RunMasterCSV(Search search)
{
SqlDataReader reader = null;
using (Network network = new Network())
{
using (SqlCommand cmd = new SqlCommand("dbo.MasterReport"))
{
cmd.CommandType = CommandType.StoredProcedure;
//Parameters here...
network.FillSqlReader(cmd, ref reader);
// <-- Ideally would like to find a solution here -->
return reader;
}
}
}
public FileInfo CSVFileWriter(SqlDataReader reader)
{
DeleteOldFolders();
FileInfo file = null;
if (reader != null)
{
using (reader)
{
var WriteDirectory = GetExcelOutputDirectory();
double folderToSaveInto = Math.Ceiling((double)DateTime.Now.Hour / Folder_Age_Limit.TotalHours);
string uploadFolder = GetExcelOutputDirectory() + "\\" + DateTime.Now.ToString("ddMMyyyy") + "_" + folderToSaveInto.ToString();
//Add directory for today if one does not exist
if (!Directory.Exists(uploadFolder))
Directory.CreateDirectory(uploadFolder);
//Generate random GUID fileName
file = new FileInfo(uploadFolder + "\\" + Guid.NewGuid().ToString() + ".csv");
if (file.Exists)
file.Delete();
using (file.Create()) { /*kill the file stream immediately*/};
StringBuilder sb = new StringBuilder();
if (reader.Read())
{
//write the column names
for (int i = 0; i < reader.FieldCount; i++)
{
AppendValue(sb, reader.GetName(i), (i == reader.FieldCount - 1));
}
//write the first row of values (already read by reader.Read() above)
for (int i = 0; i < reader.FieldCount; i++)
{
AppendValue(sb, reader[i] == DBNull.Value ? "" : reader[i].ToString(), (i == reader.FieldCount - 1));
}
int rowcounter = 1;
while (reader.Read())
{
for (int i = 0; i < reader.FieldCount; i++)
{
AppendValue(sb, reader[i] == DBNull.Value ? "" : reader[i].ToString(), (i == reader.FieldCount - 1));
}
rowcounter++;
if (rowcounter == MaxRowChunk)
{
using (var sw = file.AppendText())
{
sw.Write(sb.ToString());
sw.Close();
sw.Dispose();
}
sb = new StringBuilder();
rowcounter = 0;
}
}
if (sb.Length > 0)
{
//write the last bit
using (var sw = file.AppendText())
{
sw.Write(sb.ToString());
sw.Close();
sw.Dispose();
sb = new StringBuilder();
}
}
}
}
}
return file;
}

I would try refactoring your CSVFileWriter.
First you should add a delegate declaration:
public delegate string onColumnRename(string columnName);
Then create an overload of your CSVFileWriter where you pass the delegate together with the reader
public FileInfo CSVFileWriter(SqlDataReader reader, onColumnRename renamer)
{
// Move here all the code of the old CSVFileWriter
.....
}
Move the code of the previous CSVFileWriter to the new method and, from the old one, call the new one:
public FileInfo CSVFileWriter(SqlDataReader reader)
{
// Pass null for the delegate to the new version of CSVFileWriter....
return this.CSVFileWriter(reader, null);
}
This will keep existing clients of the old method happy. For them nothing has changed.....
Inside the new version of CSVFileWriter you change the code that prepares the column names:
for (int i = 0; i < reader.FieldCount; i++)
{
string colName = (renamer != null ? renamer(reader.GetName(i))
: reader.GetName(i));
AppendValue(sb, colName, (i == reader.FieldCount - 1));
}
Now it is just a matter of creating the renamer function that translates your column names:
private string myColumnRenamer(string columnName)
{
if(columnName == "yourNameWithoutSpaces")
return "your Name with Spaces";
else
return columnName;
}
This could be optimized with a static dictionary to replace the chain of ifs.
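For instance, a minimal sketch of a dictionary-backed renamer (the mappings and the System.Collections.Generic import are placeholders, not taken from the original answer):
private static readonly Dictionary<string, string> ColumnDisplayNames = new Dictionary<string, string>
{
    { "CustomerName", "Customer Name" }, // placeholder mappings
    { "OrderDate", "Order Date" }
};
private string myColumnRenamer(string columnName)
{
    // Fall back to the original name when no friendly name is configured
    string friendlyName;
    return ColumnDisplayNames.TryGetValue(columnName, out friendlyName) ? friendlyName : columnName;
}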
At this point you could call the new CSVFileWriter, passing your function:
FileInfo fi = CSVFileWriter(reader, myColumnRenamer);
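As for the DataTable alternative raised in the question: if the results were loaded into a DataTable (for example via a SqlDataAdapter) instead of being kept in a SqlDataReader, the headers could be renamed in place before export. A minimal sketch, with placeholder column names:
DataTable table = new DataTable();
using (SqlDataAdapter adapter = new SqlDataAdapter(cmd))
{
    adapter.Fill(table);
}
// Rename headers in place; the data itself is untouched.
table.Columns["CustomerName"].ColumnName = "Customer Name"; // placeholder names
// ...then pass the DataTable to a CSV writer overload that accepts a DataTable.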

Related

System.IndexOutOfRangeException: 'Cannot find column 1'

I have a program that parses a CSV file from the local filesystem into a specified SQL Server table.
When I execute the program I get the error:
System.IndexOutOfRangeException: 'Cannot find column 1' on the line where the program attempts to populate the DataTable.
On closer inspection, the error shows that it is emanating from row number 3, as shown in this link:
CSV_ERROR
This is how I am reading and saving the CSV file:
static void Main(string[] args)
{
var absPath = @"C:\Users\user\Documents\Projects\MastercardSurveillance\fbc_mc_all_cards.csv";
ProcessFile();
void ProcessFile()
{
string realPath = @"C:\Users\user\Documents\CSV";
string appLog = "CSVERRORS";
var logPath = realPath + Convert.ToString(appLog) + DateTime.Today.ToString("dd -MM-yy") + ".txt";
if (!File.Exists(logPath))
{
File.Create(logPath).Dispose();
}
var dt = GetDATATable();
if (dt == null)
{
return;
}
if (dt.Rows.Count == 0)
{
using (StreamWriter sw = File.AppendText(logPath))
{
sw.WriteLine("No rows imported after reading file " + absPath);
sw.Flush();
sw.Close();
}
return;
}
ClearData();
InsertDATA();
}
DataTable GetDATATable()
{
var FilePath = absPath;
string TableName = "Cards";
string realPath = @"C:\Users\user\Documents\CSV";
string appLog = "CSVERRORS";
var logPath = realPath + Convert.ToString(appLog) + DateTime.Today.ToString("dd -MM-yy") + ".txt";
if (!File.Exists(logPath))
{
File.Create(logPath).Dispose();
}
var dt = new DataTable(TableName);
using (var csvReader = new TextFieldParser(FilePath))
{
csvReader.SetDelimiters(new string[] { "," });
csvReader.HasFieldsEnclosedInQuotes = true;
var readFields = csvReader.ReadFields();
if (readFields == null)
{
using (StreamWriter sw = File.AppendText(logPath))
{
sw.WriteLine("Could not read header fields for file " + FilePath);
sw.Flush();
sw.Close();
}
return null;
}
foreach (var dataColumn in readFields.Select(column => new DataColumn(column, typeof(string)) { AllowDBNull = true, DefaultValue = string.Empty }))
{
dt.Columns.Add(dataColumn);
}
while (!csvReader.EndOfData)
{
var data = csvReader.ReadFields();
if (data == null)
{
using (StreamWriter sw = File.AppendText(logPath))
{
sw.WriteLine(string.Format("Could not read fields on line {0} for file {1}", csvReader.LineNumber, FilePath));
sw.Flush();
sw.Close();
}
continue;
}
var dr = dt.NewRow();
for (var i = 0; i < data.Length; i++)
{
if (!string.IsNullOrEmpty(data[i]))
{
dr[i] = data[i];
}
}
dt.Rows.Add(dr);
}
}
return dt;
}
void ClearData()
{
string SqlSvrConn = @"Server=XXXXXX-5QFK4BL\MSDEVOPS;Database=McardSurveillance;Trusted_Connection=True;MultipleActiveResultSets=true;";
using (var sqlConnection = new SqlConnection(SqlSvrConn))
{
sqlConnection.Open();
// Truncate the live table
using (var sqlCommand = new SqlCommand(_truncateLiveTableCommandText, sqlConnection))
{
sqlCommand.ExecuteNonQuery();
}
}
}
void InsertDATA()
{
string SqlSvrConn = @"Server=XXXXXX-5QFK4BL\MSDEVOPS;Database=McardSurveillance;Trusted_Connection=True;MultipleActiveResultSets=true;";
DataTable table = GetDATATable();
using (var sqlBulkCopy = new SqlBulkCopy(SqlSvrConn))
{
sqlBulkCopy.DestinationTableName = "dbo.Cards";
for (var count = 0; count < table.Columns.Count; count++)
{
sqlBulkCopy.ColumnMappings.Add(count, count);
}
sqlBulkCopy.WriteToServer(table);
}
}
}
How can I identify and possibly exclude the extra data columns being returned from the CSV file?
It appears there is a mismatch between the number of columns in the DataTable and the number of columns being read from the CSV file.
I'm not sure, however, how I can account for this in my logic. For now I did not want to switch to a CSV parsing package; rather, I need insight into how I can remove the extra column, or ensure that the splitting accounts for all possible dubious characters.
For clarity, I have a copy of the CSV file here:
CSV_FILE
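One way to keep the row-filling loop from indexing past the defined columns (a minimal sketch, not from the original post) is to clamp the copy to the smaller of the two counts:
// Sketch only: copy at most dt.Columns.Count fields so extra CSV fields
// cannot trigger "Cannot find column N".
var dr = dt.NewRow();
int fieldsToCopy = Math.Min(data.Length, dt.Columns.Count);
for (var i = 0; i < fieldsToCopy; i++)
{
    if (!string.IsNullOrEmpty(data[i]))
    {
        dr[i] = data[i];
    }
}
dt.Rows.Add(dr);
Rows with unexpected extra fields could also be logged to the existing error log instead of being silently truncated.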

Upload CSV data into SQL Database using MVC and EF

I am new to the MVC framework and trying to figure out how to parse the CSV file in such a way that only data from certain columns is saved to the database.
I am able to select the CSV file, upload it via the View, and pass it to my controller using the following code, as mentioned here: Codelocker
public ActionResult UploadMultipleFiles(FileUploadViewModel fileModel)
{
//open file
if (Request.Files.Count == 1)
{
//get file
var postedFile = Request.Files[0];
if (postedFile.ContentLength > 0)
{
//read data from input stream
using (var csvReader = new System.IO.StreamReader(postedFile.InputStream))
{
string inputLine = "";
//read each line
while ((inputLine = csvReader.ReadLine()) != null)
{
//get lines values
string[] values = inputLine.Split(new char[] { ',' });
for (int x = 0; x < values.Length; x++)
{
//do something with each line and split value
}
}
csvReader.Close();
}
}
}
return View("Index");
}
However, I am not really sure how to select only the required columns from the CSV file and store them in the database.
Any suggestions?
Solved the problem by creating a DataTable method that creates the required columns, then using a StreamReader, looping through the lines, and selecting the required columns:
[HttpPost]
public ActionResult UploadMultipleFiles()
{
FileUploadService service = new FileUploadService();
var postedFile = Request.Files[0];
StreamReader sr = new StreamReader(postedFile.InputStream);
StringBuilder sb = new StringBuilder();
DataTable dt = CreateTable();
DataRow dr;
string s;
int j = 0;
while (!sr.EndOfStream)
{
while ((s = sr.ReadLine()) != null)
{
//Ignore first row as it consists of headers
if (j > 0)
{
string[] str = s.Split(',');
dr = dt.NewRow();
dr["Postcode"] = str[0].ToString();
dr["Latitude"] = str[2].ToString();
dr["Longitude"] = str[3].ToString();
dr["County"] = str[7].ToString();
dr["District"] = str[8].ToString();
dr["Ward"] = str[9].ToString();
dr["CountryRegion"] = str[12].ToString();
dt.Rows.Add(dr);
}
j++;
}
}
service.SaveFilesDetails(dt);
sr.Close();
return View("Index");
}
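The CreateTable() helper is not shown in the post; a plausible sketch, assuming plain string columns matching the names assigned in the loop above:
private DataTable CreateTable()
{
    // Column names match those assigned in the upload loop above.
    DataTable dt = new DataTable();
    dt.Columns.Add("Postcode", typeof(string));
    dt.Columns.Add("Latitude", typeof(string));
    dt.Columns.Add("Longitude", typeof(string));
    dt.Columns.Add("County", typeof(string));
    dt.Columns.Add("District", typeof(string));
    dt.Columns.Add("Ward", typeof(string));
    dt.Columns.Add("CountryRegion", typeof(string));
    return dt;
}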

How to create a generic text file parser for any kind of text file?

I want to create a generic text file parser in C# for any kind of text file. I have 4 applications, all 4 getting their input data in txt format, but the text files are not homogeneous in nature. I have tried fixed-width delimiting:
private static DataTable FixedWidthDiliminatedTxtRead()
{
string[] fields;
StringBuilder sb = new StringBuilder();
List<StringBuilder> lst = new List<StringBuilder>();
DataTable dtable = new DataTable();
ArrayList aList;
using (TextFieldParser tfp = new TextFieldParser(testOCC))
{
tfp.TextFieldType = FieldType.FixedWidth;
tfp.SetFieldWidths(new int[12] { 2,25,8,12,13,5,6,3,10,11,10,24 });
for (int col = 1; col < 13; ++col)
dtable.Columns.Add("COL" + col);
while (!tfp.EndOfData)
{
fields = tfp.ReadFields();
aList = new ArrayList();
for (int i = 0; i < fields.Length; ++i)
aList.Add(fields[i] as string);
if (dtable.Columns.Count == aList.Count)
dtable.Rows.Add(aList.ToArray());
}
}
return dtable;
}
but I feel it is a very rigid approach and it really varies from application to application; is there a better way of making it configurable?
tfp.SetFieldWidths(new int[12] { 2,25,8,12,13,5,6,3,10,11,10,24 });
File nature:
It's a report kind of file.
The positions of the columns are very similar.
The row data of the files is different.
I found this as a reference:
http://www.codeproject.com/Articles/11698/A-Portable-and-Efficient-Generic-Parser-for-Flat-F
Any other thoughts?
If the only thing different is the field widths, you could just try sending the field widths in as a parameter:
private static DataTable FixedWidthDiliminatedTxtRead(int[] fieldWidthArray)
{
string[] fields;
StringBuilder sb = new StringBuilder();
List<StringBuilder> lst = new List<StringBuilder>();
DataTable dtable = new DataTable();
ArrayList aList;
using (TextFieldParser tfp = new TextFieldParser(testOCC))
{
tfp.TextFieldType = FieldType.FixedWidth;
tfp.SetFieldWidths(fieldWidthArray);
for (int col = 1; col < 13; ++col)
dtable.Columns.Add("COL" + col);
while (!tfp.EndOfData)
{
fields = tfp.ReadFields();
aList = new ArrayList();
for (int i = 0; i < fields.Length; ++i)
aList.Add(fields[i] as string);
if (dtable.Columns.Count == aList.Count)
dtable.Rows.Add(aList.ToArray());
}
}
return dtable;
}
If you will have more logic to grab the data, you might want to consider defining an interface or abstract class for a generic text parser and creating concrete implementations for each type of file.
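A minimal sketch of that idea (the type names are illustrative, not from the post):
using System.Data;
using Microsoft.VisualBasic.FileIO;

public interface ITextFileParser
{
    DataTable Parse(string fileName);
}

// One implementation per layout; callers depend only on ITextFileParser.
public class FixedWidthParser : ITextFileParser
{
    private readonly int[] fieldWidths;

    public FixedWidthParser(int[] fieldWidths)
    {
        this.fieldWidths = fieldWidths;
    }

    public DataTable Parse(string fileName)
    {
        DataTable table = new DataTable();
        for (int col = 1; col <= fieldWidths.Length; col++)
            table.Columns.Add("COL" + col);
        using (TextFieldParser tfp = new TextFieldParser(fileName))
        {
            tfp.TextFieldType = FieldType.FixedWidth;
            tfp.SetFieldWidths(fieldWidths);
            while (!tfp.EndOfData)
                table.Rows.Add(tfp.ReadFields());
        }
        return table;
    }
}
A CSV or delimited layout would get its own implementation of the same interface, chosen by whatever factory or configuration the applications share.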
Hey, I made one of these last week.
I did not write it with the intention of other people using it, so I apologize in advance if it's not documented well, but I cleaned it up for you. Also, I grabbed several segments of code from Stack Overflow, so I am not the original author of several pieces of this.
The places you need to edit are the path and pathOut and the text separators:
char[] delimiters = new char[]
It searches for part of a word and then grabs the whole word. I used a C# console application for this.
Here you go:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.IO;
namespace UniqueListofStringFinder
{
class Program
{
static void Main(string[] args)
{
string path = @"c:\Your Path\in.txt";
string pathOut = @"c:\Your Path\out.txt";
string data = "!";
Console.WriteLine("Current Path In is set to: " + path);
Console.WriteLine("Current Path Out is set to: " + pathOut);
Console.WriteLine(Environment.NewLine + Environment.NewLine + "Input String to Search For:");
string input = Console.ReadLine();
// Create the file with some sample text if it does not exist.
if (!File.Exists(path))
{
// Create the file.
using (FileStream fs = File.Create(path))
{
Byte[] info =
new UTF8Encoding(true).GetBytes("This is some text in the file.");
// Add some information to the file.
fs.Write(info, 0, info.Length);
}
}
List<string> Spec = new List<string>();
using (StreamReader file = new StreamReader(path))
{
while (!file.EndOfStream)
{
string s = file.ReadLine();
if (s.Contains(input))
{
char[] delimiters = new char[] { '\r', '\n', '\t', ')', '(', ',', '=', '"', '\'', '<', '>', '$', ' ', '#', '[', ']' };
string[] parts = s.Split(delimiters,
StringSplitOptions.RemoveEmptyEntries);
foreach (string word in parts)
{
if (word.Contains(input))
{
if( word.IndexOf(input) == 0)
{
Spec.Add(word);
}
}
}
}
}
Spec.Sort();
// Open the stream and read it back.
//while ((s = sr.ReadLine()) != null)
//{
// Console.WriteLine(s);
//}
}
Console.WriteLine();
StringBuilder builder = new StringBuilder();
foreach (string s in Spec) // Loop through all strings
{
builder.Append(s).Append(Environment.NewLine); // Append string to StringBuilder
}
string result = builder.ToString(); // Get string from StringBuilder
Program a = new Program();
data = a.uniqueness(result);
int i = a.writeFile(data,pathOut);
}
public string uniqueness(string rawData )
{
if (rawData == "")
{
return "Empty Data Set";
}
List<string> dataVar = new List<string>();
List<string> holdData = new List<string>();
bool testBool = false;
using (StringReader reader = new StringReader(rawData))
{
string line;
while ((line = reader.ReadLine()) != null)
{
foreach (string s in holdData)
{
if (line == s)
{
testBool = true;
}
}
if (testBool == false)
{
holdData.Add(line);
}
testBool = false;
// Do something with the line
}
}
int i = 0;
string dataOut = "";
foreach (string s in holdData)
{
dataOut += s + "\r\n";
i++;
}
// Write the string to a file.
return dataOut;
}
public int writeFile(string dataOut, string pathOut)
{
try
{
System.IO.StreamWriter file = new System.IO.StreamWriter(pathOut);
file.WriteLine(dataOut);
file.Close();
}
catch (Exception ex)
{
dataOut += ex.ToString();
return 1;
}
return 0;
}
}
}
private static DataTable FixedWidthTxtRead(string filename, int[] fieldWidths)
{
string[] fields;
DataTable dtable = new DataTable();
ArrayList aList;
using (TextFieldParser tfp = new TextFieldParser(filename))
{
tfp.TextFieldType = FieldType.FixedWidth;
tfp.SetFieldWidths(fieldWidths);
for (int col = 1; col <= fieldWidths.Length; ++col)
dtable.Columns.Add("COL" + col);
while (!tfp.EndOfData)
{
fields = tfp.ReadFields();
aList = new ArrayList();
for (int i = 0; i < fields.Length; ++i)
aList.Add(fields[i] as string);
if (dtable.Columns.Count == aList.Count) dtable.Rows.Add(aList.ToArray());
}
}
return dtable;
}
Here's what I did:
I built a factory for the type of processor needed (based on file type/format), which abstracted the file reader.
I then built a collection object that contained a set of triggers for each field I was interested in (also contained the property name for which this field is destined). This settings collection is loaded in via an XML configuration file, so all I need to change are the settings, and the base parsing process can react to how the settings are configured. Finally I built a reflection wrapper wherein once a field is parsed, the corresponding property on the model object is set.
As the file flowed through, the triggers for each setting evaluated each line's value. When a trigger found what it was set to find (via pattern matching, or column length values) it fired an event that bubbled up and set a property on the model object. I can show some pseudo code if you're interested. It needs some work for efficiency's sake, but I like the concept.
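A rough sketch of that shape (illustrative only; these are not the answerer's actual types or names):
using System.Collections.Generic;
using System.IO;

public interface IFieldTrigger
{
    bool Matches(string line);         // pattern or column-position match
    string Extract(string line);       // pull the raw value out of the line
    string TargetPropertyName { get; } // model property to set via reflection
}

public class TriggeredFileParser
{
    private readonly IList<IFieldTrigger> triggers; // loaded from XML configuration

    public TriggeredFileParser(IList<IFieldTrigger> triggers)
    {
        this.triggers = triggers;
    }

    public T Parse<T>(TextReader reader) where T : new()
    {
        T model = new T();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            foreach (IFieldTrigger trigger in triggers)
            {
                if (!trigger.Matches(line)) continue;
                // Reflection wrapper: set the mapped (string) property on the model.
                typeof(T).GetProperty(trigger.TargetPropertyName)
                    .SetValue(model, trigger.Extract(line), null);
            }
        }
        return model;
    }
}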

Split text file, fastest method

Morning,
I'm trying to split a large text file (15,000,000 rows) using StreamReader/StreamWriter. Is there a quicker way?
I tested it with 130,000 rows and it took 2 min 40 sec, which implies 15,000,000 rows will take approximately 5 hours, which seems a bit excessive.
//Perform split.
public void SplitFiles(int[] newFiles, string filePath, int processorCount)
{
using (StreamReader Reader = new StreamReader(filePath))
{
for (int i = 0; i < newFiles.Length; i++)
{
string extension = System.IO.Path.GetExtension(filePath);
string temp = filePath.Substring(0, filePath.Length - extension.Length)
+ i.ToString();
string FilePath = temp + extension;
if (!File.Exists(FilePath))
{
for (int x = 0; x < newFiles[i]; x++)
{
DataWriter(Reader.ReadLine(), FilePath);
}
}
else
{
return;
}
}
}
}
public void DataWriter(string rowData, string filePath)
{
bool appendData = true;
using (StreamWriter sr = new StreamWriter(filePath, appendData))
{
{
sr.WriteLine(rowData);
}
}
}
Thanks for your help.
You haven't made it very clear, but I'm assuming that the value of each element of the newFiles array is the number of lines to copy from the original into that file. Note that currently you don't detect the situation where there's either extra data at the end of the input file, or it's shorter than expected. I suspect you want something like this:
public void SplitFiles(int[] newFiles, string inputFile)
{
string baseName = Path.GetFileNameWithoutExtension(inputFile);
string extension = Path.GetExtension(inputFile);
using (TextReader reader = File.OpenText(inputFile))
{
for (int i = 0; i < newFiles.Length; i++)
{
string outputFile = baseName + i + extension;
if (File.Exists(outputFile))
{
// Better than silently returning, I'd suggest...
throw new IOException("File already exists: " + outputFile);
}
int linesToCopy = newFiles[i];
using (TextWriter writer = File.CreateText(outputFile))
{
for (int j = 0; j < linesToCopy; j++)
{
string line = reader.ReadLine();
if (line == null)
{
return; // Premature end of input
}
writer.WriteLine(line);
}
}
}
}
}
Note that this still won't detect if there's any unconsumed input... it's not clear what you want to do in that situation.
One option for code clarity is to extract the middle of this into a separate method:
public void SplitFiles(int[] newFiles, string inputFile)
{
string baseName = Path.GetFileNameWithoutExtension(inputFile);
string extension = Path.GetExtension(inputFile);
using (TextReader reader = File.OpenText(inputFile))
{
for (int i = 0; i < newFiles.Length; i++)
{
string outputFile = baseName + i + extension;
// Could put this into the CopyLines method if you wanted
if (File.Exists(outputFile))
{
// Better than silently returning, I'd suggest...
throw new IOException("File already exists: " + outputFile);
}
CopyLines(reader, outputFile, newFiles[i]);
}
}
}
private static void CopyLines(TextReader reader, string outputFile, int count)
{
using (TextWriter writer = File.CreateText(outputFile))
{
for (int i = 0; i < count; i++)
{
string line = reader.ReadLine();
if (line == null)
{
return; // Premature end of input
}
writer.WriteLine(line);
}
}
}
There are utilities for splitting files that may outperform your solution - e.g. search for "split file by line".
If they don't suit, there are solutions for loading all the source file into memory and then writing out the files but that probably isn't appropriate given the size of the source file.
In terms of improving your code, a minor improvement would be the generation of the destination file path (and also clearing up the confusion between the source filePath you use and the destination files). You don't need to re-establish the source file extension each time in your loop.
The second improvement (and probably the more significant one, as highlighted by commenters) is how you write out the destination files - each destination file gets a differing number of lines from the source (the value in each newFiles entry). So I'd suggest that for each entry you read all the source lines relevant to the next destination file, then write out that destination, rather than repeatedly opening the destination file. You could "gather" the lines in a StringBuilder/List etc. - alternatively, just write them directly out to the destination file (but only open it once):
public void SplitFiles(int[] newFiles, string sourceFilePath, int processorCount)
{
string sourceDirectory = System.IO.Path.GetDirectoryName(sourceFilePath);
string sourceFileName = System.IO.Path.GetFileNameWithoutExtension(sourceFilePath);
string extension = System.IO.Path.GetExtension(sourceFilePath);
using (StreamReader Reader = new StreamReader(sourceFilePath))
{
for (int i = 0; i < newFiles.Length; i++)
{
string destinationFileNameWithExtension = string.Format("{0}{1}{2}", sourceFileName, i, extension);
string destinationFilePath = System.IO.Path.Combine(sourceDirectory, destinationFileNameWithExtension);
if (!File.Exists(destinationFilePath))
{
// Read all the lines relevant to this destination file
// and temporarily store them in memory
StringBuilder destinationText = new StringBuilder();
for (int x = 0; x < newFiles[i]; x++)
{
destinationText.AppendLine(Reader.ReadLine());
}
DataWriter(destinationFilePath, destinationText.ToString());
}
else
{
return;
}
}
}
}
private static void DataWriter(string destinationFilePath, string content)
{
using (StreamWriter sr = new StreamWriter(destinationFilePath))
{
{
sr.Write(content);
}
}
}
I've recently had to do this for several hundred files under 2 GB each (up to 1.92 GB), and the fastest method I found (if you have the memory available) is StringBuilder. All the other methods I tried were painfully slow.
Please note that this is memory dependent. Adjust "CurrentPosition = 130000" accordingly.
string CurrentLine = String.Empty;
int CurrentPosition = 0;
int CurrentSplit = 0;
foreach (string file in Directory.GetFiles(@"C:\FilesToSplit"))
{
StringBuilder sb = new StringBuilder();
using (StreamReader sr = new StreamReader(file))
{
while ((CurrentLine = sr.ReadLine()) != null)
{
if (CurrentPosition == 130000) // Or whatever you want to split by.
{
using (StreamWriter sw = new StreamWriter(@"C:\FilesToSplit\SplitFiles\" + Path.GetFileNameWithoutExtension(file) + "-" + CurrentSplit + Path.GetExtension(file)))
{
// Append this line too, so we don't lose it.
sb.AppendLine(CurrentLine);
// Write the StringBuilder contents
sw.Write(sb.ToString());
// Clear the StringBuilder buffer, so it doesn't get too big. You can adjust this based on your computer's available memory.
sb.Clear();
// Increment the CurrentSplit number.
CurrentSplit++;
// Reset the current line position. We've found 130,001 lines of text.
CurrentPosition = 0;
}
}
else
{
sb.AppendLine(CurrentLine);
CurrentPosition++;
}
}
}
// Reset the integers at the end of each file check, otherwise it can quickly go out of order.
CurrentPosition = 0;
CurrentSplit = 0;
}

How to efficiently write to file from SQL datareader in c#?

I have a remote SQL connection in C# that needs to execute a query and save its results to the user's local hard disk. There is a fairly large amount of data this thing can return, so I need to think of an efficient way of storing it. I've read before that first putting the whole result into memory and then writing it is not a good idea, so if someone could help, that would be great!
I am currently storing the SQL result data in a DataTable, although I am thinking it could be better to do something in while (myReader.Read()) {...}
Below is the code that gets the results:
DataTable t = new DataTable();
string myQuery = QueryLoader.ReadQueryFromFileWithBdateEdate(@"Resources\qrs\qryssysblo.q", newdate, newdate);
using (SqlDataAdapter a = new SqlDataAdapter(myQuery, sqlconn.myConnection))
{
a.Fill(t);
}
var result = string.Empty;
for(int i = 0; i < t.Rows.Count; i++)
{
for (int j = 0; j < t.Columns.Count; j++)
{
result += t.Rows[i][j] + ",";
}
result += "\r\n";
}
So now I have this huge result string. And I have the datatable. There has to be a much better way of doing it?
Thanks.
You are on the right track yourself. Use a loop with while (myReader.Read()) {...} and write each record to the text file inside the loop. The .NET Framework and operating system will take care of flushing the buffers to disk in an efficient way.
using(SqlConnection conn = new SqlConnection(connectionString))
using(SqlCommand cmd = conn.CreateCommand())
{
conn.Open();
cmd.CommandText = QueryLoader.ReadQueryFromFileWithBdateEdate(
#"Resources\qrs\qryssysblo.q", newdate, newdate);
using(SqlDataReader reader = cmd.ExecuteReader())
using(StreamWriter writer = new StreamWriter(@"c:\temp\file.txt"))
{
while(reader.Read())
{
// Using Name and Phone as example columns.
writer.WriteLine("Name: {0}, Phone : {1}",
reader["Name"], reader["Phone"]);
}
}
}
I came up with this, it's a better CSV writer than the other answers:
public static class DataReaderExtension
{
public static void ToCsv(this IDataReader dataReader, string fileName, bool includeHeaderAsFirstRow)
{
const string Separator = ",";
StreamWriter streamWriter = new StreamWriter(fileName);
StringBuilder sb = null;
if (includeHeaderAsFirstRow)
{
sb = new StringBuilder();
for (int index = 0; index < dataReader.FieldCount; index++)
{
if (dataReader.GetName(index) != null)
sb.Append(dataReader.GetName(index));
if (index < dataReader.FieldCount - 1)
sb.Append(Separator);
}
streamWriter.WriteLine(sb.ToString());
}
while (dataReader.Read())
{
sb = new StringBuilder();
for (int index = 0; index < dataReader.FieldCount; index++)
{
if (!dataReader.IsDBNull(index))
{
string value = dataReader.GetValue(index).ToString();
if (dataReader.GetFieldType(index) == typeof(String))
{
if (value.IndexOf("\"") >= 0)
value = value.Replace("\"", "\"\"");
if (value.IndexOf(Separator) >= 0)
value = "\"" + value + "\"";
}
sb.Append(value);
}
if (index < dataReader.FieldCount - 1)
sb.Append(Separator);
}
streamWriter.WriteLine(sb.ToString());
}
dataReader.Close();
streamWriter.Close();
}
}
usage: mydataReader.ToCsv("myfile.csv", true)
Rob Sedgwick's answer is more like it, but it can be improved and simplified. This is how I did it:
string separator = ";";
string fieldDelimiter = "";
bool useHeaders = true;
string connectionString = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx";
using (SqlConnection conn = new SqlConnection(connectionString))
{
using (SqlCommand cmd = conn.CreateCommand())
{
conn.Open();
string query = @"SELECT whatever";
cmd.CommandText = query;
using (SqlDataReader reader = cmd.ExecuteReader())
{
if (!reader.Read())
{
return;
}
List<string> columnNames = GetColumnNames(reader);
bool first;
string line;
// "response" is assumed to be the ASP.NET HttpResponse (or a similar text sink) available in this context
// Write headers if required
if (useHeaders)
{
first = true;
foreach (string columnName in columnNames)
{
response.Write(first ? string.Empty : separator);
line = string.Format("{0}{1}{2}", fieldDelimiter, columnName, fieldDelimiter);
response.Write(line);
first = false;
}
response.Write("\n");
}
// Write all records
do
{
first = true;
foreach (string columnName in columnNames)
{
response.Write(first ? string.Empty : separator);
string value = reader[columnName] == null ? string.Empty : reader[columnName].ToString();
line = string.Format("{0}{1}{2}", fieldDelimiter, value, fieldDelimiter);
response.Write(line);
first = false;
}
response.Write("\n");
}
while (reader.Read());
}
}
}
And you need to have a function GetColumnNames:
List<string> GetColumnNames(IDataReader reader)
{
List<string> columnNames = new List<string>();
for (int i = 0; i < reader.FieldCount; i++)
{
columnNames.Add(reader.GetName(i));
}
return columnNames;
}
I agree that your best bet here would be to use a SqlDataReader. Something like this:
StreamWriter YourWriter = new StreamWriter(@"c:\testfile.txt");
SqlCommand YourCommand = new SqlCommand();
SqlConnection YourConnection = new SqlConnection(YourConnectionString);
YourCommand.Connection = YourConnection;
YourCommand.CommandText = myQuery;
YourConnection.Open();
using (YourConnection)
{
using (SqlDataReader sdr = YourCommand.ExecuteReader())
using (YourWriter)
{
while (sdr.Read())
YourWriter.WriteLine(sdr[0].ToString() + sdr[1].ToString() + ",");
}
}
Mind you, in the while loop, you can write that line to the text file in any format you see fit with the column data from the SqlDataReader.
Keeping your original approach, here is a quick win:
Instead of using a string as a temporary buffer, use a StringBuilder. That lets you use the .Append(string) method for concatenations instead of the += operator.
The += operator is especially inefficient: because strings are immutable, each += allocates a brand new string, so repeating it (potentially) millions of times in a loop hurts performance badly.
The .Append(string) method keeps writing into the same buffer instead of creating a new string each time, so it's much faster.
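Applied to the question's loop, the quick win looks roughly like this (a sketch assuming the DataTable t from the question and a using System.Text directive):
// Same output as the original nested loop, but built with StringBuilder instead of +=.
StringBuilder result = new StringBuilder();
for (int i = 0; i < t.Rows.Count; i++)
{
    for (int j = 0; j < t.Columns.Count; j++)
    {
        result.Append(t.Rows[i][j]).Append(",");
    }
    result.Append("\r\n");
}
string csv = result.ToString();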
Using the Response object without Response.Close() causes, at least in some instances, the HTML of the page writing out the data to end up in the file. If you use Response.Close(), the connection can be closed prematurely and cause an error while producing the file.
It is recommended to use HttpApplication.CompleteRequest(); however, this appears to always cause the HTML to be written to the end of the file.
I have tried a stream in conjunction with the Response object and have had success in the development environment. I have not tried it in production yet.
I used a DataReader to export data from the database to .CSV. In my project I read the DataReader and create the .CSV file manually: in a loop I read the DataReader and, for every row, append each cell value to a result string, using "," to separate columns and "\n" to separate rows; finally I save the result string as result.csv.
I suggest this high-performance extension; I tested it and it quickly exported 600,000 rows to .CSV.
I use:
private void SaveData(string path)
{
DataTable tblResult = new DataTable();
using(SqlCommand cm = new SqlCommand("select something", objConnect))
{
tblResult.Load(cm.ExecuteReader());
}
if (tblResult != null)
{
using(FileStream fs = new FileStream(path, FileMode.Create, FileAccess.Write))
{
BinaryFormatter bin = new BinaryFormatter();
bin.Serialize(fs, tblResult);
}
}
}
Easy to use, and easy to load back, with:
private DataTable LoadData(string path)
{
DataTable t = new DataTable();
using(FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
{
BinaryFormatter bin = new BinaryFormatter();
t = (DataTable)bin.Deserialize(fs);
}
return t;
}
You can also use this method to save a DataSet.
