How to convert a CSV file to an XML file in C# by columns

I have a CSV file that contains the columns Worker, Account Id, Account Code, Hierarchy, and Date. How do I write C# code to convert this CSV file to an XML file?
select new XElement("Worker",
// XML element names cannot contain spaces, so "Account Id" is written as "Account_Id"
new XElement("Account_Id", columns[0]),
new XElement("Account_Code", columns[1]),
new XElement("Hierarchy", columns[2]),
new XElement("Date", columns[3]))
This is roughly the code I have so far. How can I improve it?

Perhaps you could ensure the column names are the same by doing something like:
new XElement(columns[0].Key, columns[0].value)
That way you wouldn't have to continually type in every single column name, and could just use a foreach(..) block to generate it instead.
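The loop-over-columns idea can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the asker's code: the names CsvToXml and Build and the sample data are made up for illustration, fields are split on a plain comma (quoted fields containing commas are not handled), and spaces in header names are replaced with underscores because XML element names cannot contain spaces.

```csharp
using System;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class CsvToXml
{
    // Builds <rootName><rowName>...</rowName></rootName> from CSV lines.
    // The first line is treated as the header row and supplies the element names.
    public static XElement Build(string[] lines, string rootName, string rowName)
    {
        // "Account Id" becomes "Account_Id"; EncodeLocalName escapes any
        // other character that is not allowed in an XML name.
        string[] names = lines[0].Split(',')
            .Select(h => XmlConvert.EncodeLocalName(h.Trim().Replace(' ', '_')))
            .ToArray();

        return new XElement(rootName,
            lines.Skip(1)
                 .Where(line => line.Trim().Length > 0)   // skip blank lines
                 .Select(line => line.Split(','))
                 .Select(columns => new XElement(rowName,
                     names.Select((name, i) =>
                         new XElement(name, i < columns.Length ? columns[i].Trim() : string.Empty)))));
    }

    public static void Main()
    {
        // Hypothetical sample data.
        string[] lines =
        {
            "Account Id,Account Code,Hierarchy,Date",
            "1001,AC-1,Finance,2017-01-31"
        };
        Console.WriteLine(Build(lines, "Workers", "Worker"));
    }
}
```

With a real file, the lines could come from File.ReadAllLines and the result could be written with XElement.Save.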

Try this
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;
using System.Data.OleDb;
using System.IO;
namespace ConsoleApplication55
{
class Program
{
const string csvFILENAME = @"c:\temp\test.csv";
const string xmlFILENAME = @"c:\temp\test.xml";
static void Main(string[] args)
{
CSVReader reader = new CSVReader();
DataSet ds = reader.ReadCSVFile(csvFILENAME, true);
ds.WriteXml(xmlFILENAME, XmlWriteMode.WriteSchema);
}
}
public class CSVReader
{
public DataSet ReadCSVFile(string fullPath, bool headerRow)
{
string path = fullPath.Substring(0, fullPath.LastIndexOf("\\") + 1);
string filename = fullPath.Substring(fullPath.LastIndexOf("\\") + 1);
DataSet ds = new DataSet();
try
{
if (File.Exists(fullPath))
{
string ConStr = string.Format("Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0}" + ";Extended Properties=\"Text;HDR={1};FMT=Delimited\\\"", path, headerRow ? "Yes" : "No");
string SQL = string.Format("SELECT * FROM {0}", filename);
OleDbDataAdapter adapter = new OleDbDataAdapter(SQL, ConStr);
adapter.Fill(ds, "TextFile");
ds.Tables[0].TableName = "Table1";
}
foreach (DataColumn col in ds.Tables["Table1"].Columns)
{
col.ColumnName = col.ColumnName.Replace(" ", "_");
}
}
catch (Exception ex)
{
Console.WriteLine(ex.Message);
}
return ds;
}
}
}

There is a class published on MSDN called XmlCsvReader. You only have to give it the path of the CSV document and the root element name (Worker here), and it takes care of the rest when the document is loaded. The only thing left to do is to write the result wherever you want it with the Save method.
XmlDocument doc = new XmlDocument();
XmlCsvReader reader = new XmlCsvReader(new Uri("//yourfilepath.input.csv"), doc.NameTable);
reader.FirstRowHasColumnNames = true;
reader.RootName = "Worker";
reader.RowName = "Worker";
doc.Load(reader);
doc.Save("output.xml");

Related

From EXCEL to XSD generated class in C#

I have been provided with an XSD schema consisting of four XSD files, which I was able to convert to a class using the XSD.exe tool and include in my project; for this example the class is named "Test_XSD". On the other side I have a populated Excel sheet with a table of 10 columns, which I need to map to certain elements in "Test_XSD". The "Test_XSD" schema is complex, but mapping the 10 columns to their relevant elements is sufficient, since many of the other elements are not mandatory. I have searched and searched but cannot find a simple example to start building on.
I am able to read the Excel file in Visual Studio and convert it to XML, but the result does not conform to the XSD-generated class. I know that I have to create an instance of "Test_XSD" and load it with the data from Excel, but I don't have a clue where to start. Can someone explain what needs to be done?
This is what I've done so far. It is not much, I admit, but this is totally new to me and, to be honest, I have not yet understood the way forward, although I've researched a lot.
static void Main(string[] args)
{
// Using an OleDbConnection to connect to excel
var cs = $@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source={@"C:\AAAA\Report.xlsx"};Extended Properties=""Excel 12.0 Xml; HDR = Yes; IMEX = 2"";Persist Security Info=False";
var con = new OleDbConnection(cs);
con.Open();
// Using OleDbCommand to read data of the sheet(sheetName)
var cmd = new OleDbCommand($"select * from [Sheet1$]", con);
var ds = new DataSet();
var da = new OleDbDataAdapter(cmd);
da.Fill(ds);
//// Convert DataSet to Xml
//using (var fs = new FileStream(@"C:\Users\MT2362\Downloads\CRS_XML.xml", FileMode.CreateNew))
//{
// using (var xw = new XmlTextWriter(fs, Encoding.UTF8))
// {
// ds.WriteXml(xw);
// }
//}
XSD xsd = new XSD();
xsd.version = "TEST VERSION";
Console.WriteLine(xsd.version);
Console.ReadKey();
}
I've noted that the class generated from the XSD ("Test_XSD") is composed of multiple partial classes, hence I think that an instance of each class must be created.
Thanks in advance, code snippets are highly appreciated.
The object of your XSD class has public properties. If you set the values of these properties (as you do with .version in your example), your object is fully populated.
Is this what you want?
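Setting the properties row by row can be sketched as follows. This is only an illustration under stated assumptions: SampleReport and SampleAccount are hypothetical stand-ins for whatever xsd.exe actually generated (the real Test_XSD members will differ), and the column names AccountNumber and AccountHolder are invented. The DataTable is built in memory here; in the question it would be ds.Tables[0] after da.Fill(ds).

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-ins for the xsd.exe-generated classes.
[XmlRoot("Report")]
public class SampleReport
{
    [XmlAttribute("version")]
    public string Version { get; set; }

    [XmlElement("Account")]
    public List<SampleAccount> Accounts { get; set; }
}

public class SampleAccount
{
    public string Number { get; set; }
    public string Holder { get; set; }
}

public static class ExcelMapping
{
    // Copies each DataRow into one SampleAccount (column names are assumptions).
    public static SampleReport FromTable(DataTable table)
    {
        var report = new SampleReport
        {
            Version = "TEST VERSION",
            Accounts = new List<SampleAccount>()
        };
        foreach (DataRow row in table.Rows)
        {
            report.Accounts.Add(new SampleAccount
            {
                Number = Convert.ToString(row["AccountNumber"]),
                Holder = Convert.ToString(row["AccountHolder"])
            });
        }
        return report;
    }

    // Serializes the populated object; the XML shape follows the class attributes.
    public static string ToXml(SampleReport report)
    {
        var serializer = new XmlSerializer(typeof(SampleReport));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, report);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        var table = new DataTable();
        table.Columns.Add("AccountNumber");
        table.Columns.Add("AccountHolder");
        table.Rows.Add("123", "Alice");
        Console.WriteLine(ToXml(FromTable(table)));
    }
}
```

Because the serializer works from the class definition, XML produced this way should match the schema the class was generated from, which is the part the plain DataSet.WriteXml output does not give you.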
After running the XSD.exe tool, the output is a set of C# classes that are available to you.
Since you were able to read the Excel file and create an XML file from the DataSet, perform the following.
Add a new class to your project:
public class ExcelNameSpaceXmlTextReader : XmlTextReader
{
public ExcelNameSpaceXmlTextReader(System.IO.TextReader reader)
: base(reader) { }
public override string NamespaceURI
{
get { return ""; }
}
}
Then, in a separate Utility class, add a deserializer function as follows:
public class Utility
{
// static, so that it can be called as Utility.FromXml<T>(...)
public static T FromXml<T>(String xml)
{
T returnedXmlClass = default(T);
using (TextReader reader = new StringReader(xml))
{
returnedXmlClass = (T)new XmlSerializer(typeof(T)).Deserialize(new ExcelNameSpaceXmlTextReader(reader));
}
return returnedXmlClass;
}
}
Now add code that reads the XML back in and deserializes it into the object you want, using the generic Utility function.
Your code would then look like this:
static void Main(string[] args)
{
// Using an OleDbConnection to connect to excel
var cs = $@"Provider=Microsoft.ACE.OLEDB.12.0;Data Source={@"C:\AAAA\Report.xlsx"};Extended Properties=""Excel 12.0 Xml; HDR = Yes; IMEX = 2"";Persist Security Info=False";
var con = new OleDbConnection(cs);
con.Open();
// Using OleDbCommand to read data of the sheet(sheetName)
var cmd = new OleDbCommand($"select * from [Sheet1$]", con);
var ds = new DataSet();
var da = new OleDbDataAdapter(cmd);
da.Fill(ds);
// Convert DataSet to Xml
using (var fs = new FileStream(@"C:\Users\MT2362\Downloads\CRS_XML.xml", FileMode.CreateNew))
{
using (var xw = new XmlTextWriter(fs, Encoding.UTF8))
{
ds.WriteXml(xw);
}
}
// Verbatim string (@) so that the backslashes are not treated as escape sequences
XDocument doc = XDocument.Load(@"C:\Users\MT2362\Downloads\CRS_XML.xml");
Test_XSD test_XSD = Utility.FromXml<Test_XSD>(doc.ToString());
XSD xsd = new XSD();
xsd.version = "TEST VERSION";
Console.WriteLine(xsd.version);
Console.ReadKey();
}

CsvHelper how to filter a column and return only distinct values

I am using CsvHelper in C#. I want to filter my ORIGTITLE column for distinct values and store those in a Dictionary, so that the Dictionary ends up with the values below. How can I accomplish this?
[Abbrechen,Ab] [Abgleich,translate1] [Abgrenzung,Something]
[Tree,Baum]
My .csv file looks like this:
ORIGTITLE;REPLACETITLE;ORIGTOOLTIP
Abbrechen;Ab;Abbrechen
Abbrechen;Abort;abgelaufen
Abgleich;translate1;
Abgrenzung;Something;Abgrenzung zum Konto
Tree;Baum;Baum
Tree;Leaf;Baum
Here is my C# code so far:
class DataRecord
{
public string ORIGTITLE { get; set; }
public string REPLACETITLE { get; set; }
public string ORIGTOOLTIP { get; set; }
}
public void CsvReader()
{
using (StreamReader streamReader = new StreamReader(@"C:\Users\Devid\Desktop\Newtest.txt"))
{
CsvReader reader = new CsvReader(streamReader);
reader.Configuration.Encoding = Encoding.UTF8;
reader.Configuration.Delimiter = ";";
List<DataRecord> records = reader.GetRecords<DataRecord>().ToList();
//records has to be a distinct list
Dictionary<string, string> dict = new Dictionary<string, string>();
foreach (DataRecord record in records)
{
dict.Add(record.ORIGTITLE, record.REPLACETITLE);
//I get an error here because the key is not distinct
}
}
}
You can use my CSV reader, which puts the results into a DataTable. You can then use LINQ to get distinct values.
You can get a dictionary with code like this:
CSVReader reader = new CSVReader();
DataTable dt = reader.ReadCSVFile("filename", true).Tables[0];
// Group by the first column (ORIGTITLE, a string column in this file)
Dictionary<string, List<DataRow>> dict = dt.AsEnumerable()
.GroupBy(x => x.Field<string>(0), y => y)
.ToDictionary(x => x.Key, y => y.ToList());
Here is the class for a form application:
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.IO;
using System.Data.OleDb;
using System.Xml;
using System.Xml.Xsl;
namespace CSVImporter
{
public partial class CSVImporter : Form
{
//const string xmlfilename = @"C:\Users\fenwky\XmlDoc.xml";
const string xmlfilename = @"C:\temp\test.xml";
DataSet ds = null;
public CSVImporter()
{
InitializeComponent();
// Create a Open File Dialog Object.
openFileDialog1.Filter = "csv files (*.csv)|*.csv|All files (*.*)|*.*";
openFileDialog1.ShowDialog();
string fileName = openFileDialog1.FileName;
//doc.InsertBefore(xDeclare, root);
// Create a CSV Reader object.
CSVReader reader = new CSVReader();
ds = reader.ReadCSVFile(fileName, true);
dataGridView1.DataSource = ds.Tables["Table1"];
}
private void WXML_Click(object sender, EventArgs e)
{
WriteXML();
}
public void WriteXML()
{
StringWriter stringWriter = new StringWriter();
ds.WriteXml(new XmlTextWriter(stringWriter), XmlWriteMode.WriteSchema);
string xmlStr = stringWriter.ToString();
XmlDocument doc = new XmlDocument();
doc.LoadXml(xmlStr);
XmlDeclaration xDeclare = doc.CreateXmlDeclaration("1.0", "UTF-8", null);
XmlNode docNode = doc.CreateXmlDeclaration("1.0", "UTF-8", null);
doc.InsertBefore(xDeclare, doc.FirstChild);
// Create a processing instruction.
//XmlProcessingInstruction newPI;
//String PItext = "<abc:stylesheet xmlns:abc=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">";
//String PItext = "type='text/xsl' href='book.xsl'";
string PItext = "html xsl:version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\"";
XmlText newPI = doc.CreateTextNode(PItext);
//newPI = docCreateProcessingInstruction("html", PItext);
//newPI = doc.CreateComment(CreateDocumentType("html", PItext, "", "");
doc.InsertAfter(newPI, doc.FirstChild);
doc.Save(xmlfilename);
XslCompiledTransform myXslTrans = new XslCompiledTransform();
myXslTrans.Load(xmlfilename);
// Path.Combine inserts the directory separator that plain concatenation would omit
string resultPath = Path.Combine(Path.GetDirectoryName(xmlfilename), "result.html");
myXslTrans.Transform(xmlfilename, resultPath);
webBrowser1.Navigate(resultPath);
}
}
public class CSVReader
{
public DataSet ReadCSVFile(string fullPath, bool headerRow)
{
string path = fullPath.Substring(0, fullPath.LastIndexOf("\\") + 1);
string filename = fullPath.Substring(fullPath.LastIndexOf("\\") + 1);
DataSet ds = new DataSet();
try
{
if (File.Exists(fullPath))
{
string ConStr = string.Format("Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0}" + ";Extended Properties=\"Text;HDR={1};FMT=Delimited\\\"", path, headerRow ? "Yes" : "No");
string SQL = string.Format("SELECT * FROM {0}", filename);
OleDbDataAdapter adapter = new OleDbDataAdapter(SQL, ConStr);
adapter.Fill(ds, "TextFile");
ds.Tables[0].TableName = "Table1";
}
foreach (DataColumn col in ds.Tables["Table1"].Columns)
{
col.ColumnName = col.ColumnName.Replace(" ", "_");
}
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
}
return ds;
}
}
}
Check if the key exists before you add it:
if (!dict.ContainsKey(record.ORIGTITLE))
{
dict.Add(record.ORIGTITLE, record.REPLACETITLE);
}
This will not perform very well for big datasets. Consider using LINQ to get distinct values instead.
You can use LINQ to group the records by the ORIGTITLE property and then project the groups to a dictionary, taking ORIGTITLE as the key and the first REPLACETITLE in each group as the value:
List<DataRecord> records = reader.GetRecords<DataRecord>().ToList();
var dict = records.GroupBy(r => r.ORIGTITLE)
.ToDictionary(k => k.Key, v => v.First().REPLACETITLE);
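To check the grouping against the sample file from the question without involving CsvHelper, here is a small self-contained sketch. The records are typed in by hand from the CSV shown above, and the names DistinctTitles and FirstReplacementPerTitle are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class DataRecord
{
    public string ORIGTITLE { get; set; }
    public string REPLACETITLE { get; set; }
    public string ORIGTOOLTIP { get; set; }
}

public static class DistinctTitles
{
    // Keeps the first REPLACETITLE seen for each distinct ORIGTITLE.
    public static Dictionary<string, string> FirstReplacementPerTitle(IEnumerable<DataRecord> records)
    {
        return records
            .GroupBy(r => r.ORIGTITLE)
            .ToDictionary(g => g.Key, g => g.First().REPLACETITLE);
    }

    public static void Main()
    {
        // The rows of the sample .csv file, entered by hand.
        var records = new List<DataRecord>
        {
            new DataRecord { ORIGTITLE = "Abbrechen", REPLACETITLE = "Ab", ORIGTOOLTIP = "Abbrechen" },
            new DataRecord { ORIGTITLE = "Abbrechen", REPLACETITLE = "Abort", ORIGTOOLTIP = "abgelaufen" },
            new DataRecord { ORIGTITLE = "Abgleich", REPLACETITLE = "translate1", ORIGTOOLTIP = "" },
            new DataRecord { ORIGTITLE = "Abgrenzung", REPLACETITLE = "Something", ORIGTOOLTIP = "Abgrenzung zum Konto" },
            new DataRecord { ORIGTITLE = "Tree", REPLACETITLE = "Baum", ORIGTOOLTIP = "Baum" },
            new DataRecord { ORIGTITLE = "Tree", REPLACETITLE = "Leaf", ORIGTOOLTIP = "Baum" }
        };

        Dictionary<string, string> dict = FirstReplacementPerTitle(records);
        Console.WriteLine(dict["Abbrechen"]); // Ab
        Console.WriteLine(dict["Tree"]);      // Baum
    }
}
```

GroupBy keeps the elements of each group in their original order, so First() picks the row that appears first in the file.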

How to get data from an Excel file (*.xlsx) into an array using C#

The file path is @"E:\BCFNA-orig-1.xsl".
The Excel file consists of 9 columns and 500 rows. I want to read each row into an array such as int[] NumberOfInputs = {7,4,4,4,2,4,5,5,0}; where the values come from the Excel file, use that array in my program, and then get the data from the next row.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;
using System.Data.OleDb;
using System.IO;
namespace ConsoleApplication3
{
class Program
{
static void Main()
{
}
public class SomethingSometingExcelClass
{
public void DoSomethingWithExcel(string filePath)
{
List<DataTable> worksheets = ImportExcel(filePath);
foreach(var item in worksheets){
foreach (DataRow row in item.Rows)
{
//add to array
}
}
}
/// <summary>
/// Imports Data from Microsoft Excel File.
/// </summary>
/// <param name="FileName">File from which the data is imported</param>
/// <returns>List of DataTables, based on the number of sheets</returns>
private List<DataTable> ImportExcel(string FileName)
{
List<DataTable> _dataTables = new List<DataTable>();
string _ConnectionString = string.Empty;
string _Extension = Path.GetExtension(FileName);
//Checking for the extensions, if XLS connect using Jet OleDB
//{0} is filled in with FileName by the string.Format call below
_ConnectionString =
"Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties=Excel 8.0";
DataTable dataTable = null;
using (OleDbConnection oleDbConnection =
new OleDbConnection(string.Format(_ConnectionString, FileName)))
{
oleDbConnection.Open();
//Getting the meta data information.
//This DataTable will return the details of Sheets in the Excel File.
DataTable dbSchema =
oleDbConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables_Info, null);
foreach (DataRow item in dbSchema.Rows)
{
//reading data from excel to Data Table
using (OleDbCommand oleDbCommand = new OleDbCommand())
{
oleDbCommand.Connection = oleDbConnection;
oleDbCommand.CommandText = string.Format("SELECT * FROM [B1415:J2113]", item["TABLE_NAME"].ToString());
using (OleDbDataAdapter oleDbDataAdapter = new OleDbDataAdapter())
{
oleDbDataAdapter.SelectCommand = oleDbCommand;
dataTable = new DataTable(item["TABLE_NAME"].ToString());
oleDbDataAdapter.Fill(dataTable);
_dataTables.Add(dataTable);
}
}
}
}
return _dataTables;
}
}
}
}
Above is the code I am using to get the data from Excel. Below is the nested loop in which I want to use the data:
for (ChromosomeID = 0; ChromosomeID < PopulationSize; ChromosomeID++)
{
Fitness = 0;
Altemp = (int[])AlPopulation[ChromosomeID];
for (int z = 0; z < 500; z++)
{
int[] NumberOfInputs = new int[9];
//// this is the array where in which data need to be added
InputBinary.AddRange(DecBin.Conversion2(NumberOfInputs));
for (i = 0; i < Altemp.Length; i++)
{
AlGenotype[i] = (int)Altemp[i];
}
Class1 ClsMn = new Class1();
AlActiveGenes = ClsMn.ListofActiveNodes(AlGenotype);
ClsNetworkProcess ClsNWProcess = new ClsNetworkProcess();
AlOutputs = ClsNWProcess.NetWorkProcess(InputBinary, AlGenotype, AlActiveGenes);
int value = 0;
for (i = 0; i < AlOutputs.Count; ++i)
{
value ^= (int)AlOutputs[i]; // xor the output of the system
}
temp = Desired_Output[0];
if (value == temp) // compare system Output with DesiredOutput bit by bit
Fitness++;
else
Fitness = Fitness;
}
AlFitness.Add(Fitness);
}
}
Zahra, no one on here that is answering questions is paid to answer them. We answer because others have helped us so we want to give back. Your attitude of "want a complete code with all reference assemblies used" seems rather demanding.
Having said that, xlsx is not a plain-text format: it is a zipped package of XML parts (Office Open XML), so you will need a library such as ExcelLibrary to read it. Even though this answer is more about writing xlsx files, it should still give you some more options: https://stackoverflow.com/a/2603625/550975
I would suggest using my tool Npoi.Mapper, which is based on the popular NPOI library. You can import and export POCO types directly, with either convention-based or explicit mapping.
Get objects from Excel (XLS or XLSX)
var mapper = new Mapper("Book1.xlsx");
var objs1 = mapper.Take<SampleClass>("sheet2");
// You can take objects from the same sheet with different type.
var objs2 = mapper.Take<AnotherClass>("sheet2");
Export objects
//var objects = ...
var mapper = new Mapper();
mapper.Save("test.xlsx", objects, "newSheet", overwrite: false);
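Whichever library loads the sheet, the step the question asks about (turning each row into an int[] such as NumberOfInputs) can be sketched as below. This is a minimal sketch under assumptions: it works on a DataTable like the ones ImportExcel returns, the names RowArrays and ToIntArrays are made up, empty cells are mapped to 0, and the in-memory table stands in for the real worksheet.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;
using System.Linq;

public static class RowArrays
{
    // Yields one int[] per row, so the caller can process rows one at a time.
    public static IEnumerable<int[]> ToIntArrays(DataTable table)
    {
        foreach (DataRow row in table.Rows)
        {
            yield return row.ItemArray
                // Numeric Excel cells usually arrive as double; empty cells as DBNull.
                .Select(cell => cell is DBNull ? 0 : Convert.ToInt32(cell, CultureInfo.InvariantCulture))
                .ToArray();
        }
    }

    public static void Main()
    {
        // Hypothetical stand-in for a worksheet with 9 numeric columns.
        var table = new DataTable();
        for (int c = 0; c < 9; c++)
        {
            table.Columns.Add("F" + (c + 1), typeof(double));
        }
        table.Rows.Add(7.0, 4.0, 4.0, 4.0, 2.0, 4.0, 5.0, 5.0, 0.0);

        foreach (int[] NumberOfInputs in ToIntArrays(table))
        {
            Console.WriteLine(string.Join(",", NumberOfInputs));
        }
    }
}
```

In the nested loop from the question, the inner `for (int z = 0; z < 500; z++)` could then iterate over these arrays instead of allocating an empty `new int[9]` each time.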

My .NET Windows service will not start

I wrote this .NET Windows Service application that is basically a file watcher. The service will monitor incoming .csv files, parse data from them, and add the data to spreadsheets. I installed the service on a server and tried to start it. I was given the warning, "The service started and then stopped. Some services stop automatically if they are not in use by other services or programs." In a debug attempt, I removed all the code and just had a bare service and it started/stopped fine. So I added the file watcher object and it popped the warning again. Next I changed the service to run with a local administrative account instead of the "LocalService" account then it worked. I added the rest of my code and it worked for awhile. I finished development and added the EventLog object and I was right back to the warning. I removed the EventLog object but still got the warning. I just do not know what is causing this to not start. Here is my service:
using System;
using System.IO;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Diagnostics;
using OfficeOpenXml;
using System.Linq;
using System.ServiceProcess;
using System.Text;
namespace COD_Automation
{
public partial class COD_AUTO : ServiceBase
{
FileSystemWatcher eWatcher;
String remoteSrc;
public static String filePath;
public static String fileName;
public static Boolean fileCheck;
public static String modifiedDT;
public static String remodifiedDT;
public static String SampNum;
public static String SampDate;
public static String AnalysisInitials;
public static int SampResult;
public static double Dilution;
public static FileInfo efile;
public int rowIndex = 8;
public int filterID = 1;
public COD_AUTO()
{
InitializeComponent();
if (!System.Diagnostics.EventLog.SourceExists("COD_Automation"))
{
System.Diagnostics.EventLog.CreateEventSource(
"COD_Automation", "COD Automation Log");
}
serviceLog.Source = "COD_Automation";
serviceLog.Log = "COD Automation Log";
}
protected override void OnStart(string[] args)
{
serviceLog.WriteEntry("COD Automation Service has started.");
//Define the remote folder location to watch
remoteSrc = "\\\\mkfiler01\\ops\\Envcom\\EXEC\\ENVCOM\\LAB\\COD\\Exports\\DataLog";
//Create a new FileSystemWatcher and set its properties
eWatcher = new FileSystemWatcher(remoteSrc, "*.csv");
//Add event handler
eWatcher.Created += new FileSystemEventHandler(eWatcher_Created);
//Begin watching
eWatcher.EnableRaisingEvents = true;
}
protected override void OnStop()
{
serviceLog.WriteEntry("COD Automation Service has stopped.");
eWatcher.EnableRaisingEvents = false;
}
private void eWatcher_Created(object source, FileSystemEventArgs e)
{
filePath = e.FullPath;
fileName = e.Name;
ParseData(filePath);
FileCheck(fileName);
CreateExcelFile(fileCheck);
AddSample(SampNum, SampDate, AnalysisInitials, SampResult, Dilution);
}
public void ParseData(String filePath)
{
//Create a dictionary collections with int keys (rowNums) and String values (each line of Strings)
Dictionary<int, String> eachCSVLine = new Dictionary<int, string>();
//String array that holds the contents of the specified row
String[] lineContent;
int rowNum = 1;
foreach (string line in File.ReadLines(filePath))
{
eachCSVLine.Add(rowNum, line);
rowNum++;
}
//Get the required line and split it by "," into an array
String reqLine = eachCSVLine[5];
lineContent = reqLine.Split(',');
//Get the required values(index 2 for parsed Operator ID, index 4 for parsed Sample Number, index 11 for Sample Result)
AnalysisInitials = lineContent.GetValue(2).ToString();
SampNum = lineContent.GetValue(3).ToString(); //sample number
String result = lineContent.GetValue(11).ToString();
String dilute = lineContent.GetValue(8).ToString();
Dilution = Double.Parse(dilute);
SampResult = Int32.Parse(result); //sample result
}
public void AddSample(String SampleNum, String SampleDate, String AnalysisInitials, int SampleResult, double Diluted)
{
try
{
using (ExcelPackage excelPackage = new ExcelPackage(efile))
{
ExcelWorksheet worksheet = excelPackage.Workbook.Worksheets[1];
var cell = worksheet.Cells;
//check to see if this is the first sample added --if true, add the first sample --if false, increment rowindex & filterID then add the sample to next available row
if (cell["A8"].Value == null)
{
cell["B5"].Value = SampleDate;
cell["B6"].Value = AnalysisInitials;
cell[rowIndex, 1].Value = filterID; //Filter ID
cell[rowIndex, 2].Value = SampleNum; //Sample Number
cell[rowIndex, 3].Value = Dilution; //Dilution
cell[rowIndex, 4].Value = SampleResult; //Meter Reading
}
else
{
while (!(cell["A8"].Value == null))
{
rowIndex++;
filterID++;
if (cell[rowIndex, 1].Value == null) //ensures that the new row is blank so the loop can break to continue adding the sample
{ break; }
}
//add the sample to the next empty row
cell[rowIndex, 1].Value = filterID; //Filter ID
cell[rowIndex, 2].Value = SampleNum; //Sample Number
cell[rowIndex, 3].Value = Dilution; //Dilution
cell[rowIndex, 4].Value = SampleResult; //Meter Reading
}
excelPackage.Save();
}
}
catch (Exception e)
{
serviceLog.WriteEntry("Sample could not be added to the spreadsheet because of the following error: " + e.Message + ".");
}
}
public Boolean FileCheck(String fileName)
{
//Get the date of the .csv file
String[] fNames = fileName.Split('_');
String fDate = fNames.ElementAt(3);
DateTime dt = Convert.ToDateTime(fDate);
//format the file date into the proper format and convert to a string
modifiedDT = dt.ToString("MMddyy");
//modify the "modifiedDT to get the sample date to insert into spreadsheet
String mdate = modifiedDT.Insert(2, "/");
remodifiedDT = mdate.Insert(5, "/");
SampDate = remodifiedDT; //sample date
//assign an excel filename
String exFile = @"\\mkfiler01\ops\Envcom\EXEC\ENVCOM\LAB\COD\Imports\" + modifiedDT + "COD-" + AnalysisInitials + ".xlsx";
//check for file existence
if (File.Exists(exFile))
{ fileCheck = true; }
else
{ fileCheck = false; }
return fileCheck;
}
public void CreateExcelFile(Boolean fileCheck)
{
if (fileCheck)
{
efile = new FileInfo(@"\\mkfiler01\ops\Envcom\EXEC\ENVCOM\LAB\COD\Imports\" + modifiedDT + "COD-" + AnalysisInitials + ".xlsx");
using (ExcelPackage excelPackage = new ExcelPackage(efile))
{
//Read the existing file to see if the Analysis Initials match the AnalysisInitial variable value
ExcelWorksheet worksheet = excelPackage.Workbook.Worksheets[1];
String initials = worksheet.Cells["B6"].Value.ToString();
//If initials = AnalysisIntials then assign the existing file the WB variable, else create a new file for the different AnalysisInitials
if (initials.Equals(AnalysisInitials))
{
excelPackage.Save();
}
else
{
try
{
//Excel COD Template to use to create new Excel spreadsheet
FileInfo template = new FileInfo(@"\\mkfiler01\ops\Envcom\EXEC\ENVCOM\LAB\COD\COD TEMPLATE.xlsx");
//The new Excel spreadsheet filename
FileInfo newFile = new FileInfo(@"\\mkfiler01\ops\Envcom\EXEC\ENVCOM\LAB\COD\Imports\" + modifiedDT + "COD-" + AnalysisInitials + ".xlsx");
// Using the template to create the newfile
using (ExcelPackage excelPackage1 = new ExcelPackage(newFile, template))
{
// save the new Excel spreadsheet
excelPackage1.Save();
}
}
catch (Exception ex)
{
serviceLog.WriteEntry("Excel file could not be created because " + ex.Message);
}
}
}
}
else
{
try
{
efile = new FileInfo(@"\\mkfiler01\ops\Envcom\EXEC\ENVCOM\LAB\COD\Imports\" + modifiedDT + "COD-" + AnalysisInitials + ".xlsx");
//Excel COD Template to use to create new Excel spreadsheet
FileInfo template = new FileInfo(@"\\mkfiler01\ops\Envcom\EXEC\ENVCOM\LAB\COD\COD TEMPLATE.xlsx");
//The new Excel spreadsheet filename
FileInfo newFile = new FileInfo(@"\\mkfiler01\ops\Envcom\EXEC\ENVCOM\LAB\COD\Imports\" + modifiedDT + "COD-" + AnalysisInitials + ".xlsx");
// Using the template to create the newfile
using (ExcelPackage excelPackage = new ExcelPackage(newFile, template))
{
// save the new Excel spreadsheet
excelPackage.Save();
}
}
catch (Exception ex)
{
serviceLog.WriteEntry("Excel file could not be created because " + ex.Message);
}
}
}
}
}
My EventViewer was showing details of an ArgumentException in regards to the Source and Log properties of the EventLog object. I changed the following code:
if (!System.Diagnostics.EventLog.SourceExists("COD_Automation"))
{
System.Diagnostics.EventLog.CreateEventSource(
"COD_Automation", "COD Automation Log");
}
serviceLog.Source = "COD_Automation";
serviceLog.Log = "COD Automation Log";
into the following code:
if (!EventLog.SourceExists("COD_Automation"))
{
EventLog.CreateEventSource("COD_Automation", "Application");
}
serviceLog.Source = "COD_Automation";
serviceLog.Log = "Application";
This fixed my problem. I was initially trying to register the COD_Automation source in the COD_Automation Log which doesn't exist. So I set the Log property to the correct log of "Application".
The LocalService account will not have access to a UNC share like \\mkfiler01. My guess is that you are getting an access denied error when trying to access that file share.
Set the service to run under a domain account that has access to that share.

How to export large SQL Server table into a CSV file using the FileHelpers library?

I'm looking to export a large SQL Server table into a CSV file using C# and the FileHelpers library.
I could consider C# and bcp as well, but I thought FileHelpers would be more flexible than bcp. Speed is not a special requirement.
OutOfMemoryException is thrown on the storage.ExtractRecords() when the below code is run (some less essential code has been omitted):
SqlServerStorage storage = new SqlServerStorage(typeof(Order));
storage.ServerName = "SqlServer";
storage.DatabaseName = "SqlDataBase";
storage.SelectSql = "select * from Orders";
storage.FillRecordCallback = new FillRecordHandler(FillRecordOrder);
Order[] output = null;
output = storage.ExtractRecords() as Order[];
When the below code is run, 'Timeout expired' is thrown on the link.ExtractToFile():
SqlServerStorage storage = new SqlServerStorage(typeof(Order));
string sqlConnectionString = "Server=SqlServer;Database=SqlDataBase;Trusted_Connection=True";
storage.ConnectionString = sqlConnectionString;
storage.SelectSql = "select * from Orders";
storage.FillRecordCallback = new FillRecordHandler(FillRecordOrder);
FileDataLink link = new FileDataLink(storage);
link.FileHelperEngine.HeaderText = headerLine;
link.ExtractToFile("file.csv");
The SQL query takes longer than the default 30 seconds, hence the timeout exception. Unfortunately, I can't find in the FileHelpers docs how to set the SQL command timeout to a higher value.
I could loop an SQL select over small chunks of data until the whole table is exported, but that procedure would be too complicated.
Is there a straightforward way to use FileHelpers to export large DB tables?
Rei Sivan's answer is on the right track, as it will scale well with large files, because it avoids reading the entire table into memory. However, the code can be cleaned up.
shamp00's solution requires external libraries.
Here is a simpler table-to-CSV-file exporter that will scale well to large files, and does not require any external libraries:
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.IO;
using System.Linq;
public class TableDumper
{
public void DumpTableToFile(SqlConnection connection, string tableName, string destinationFile)
{
using (var command = new SqlCommand("select * from " + tableName, connection))
using (var reader = command.ExecuteReader())
using (var outFile = File.CreateText(destinationFile))
{
string[] columnNames = GetColumnNames(reader).ToArray();
int numFields = columnNames.Length;
outFile.WriteLine(string.Join(",", columnNames));
if (reader.HasRows)
{
while (reader.Read())
{
string[] columnValues =
Enumerable.Range(0, numFields)
.Select(i => reader.GetValue(i).ToString())
.Select(field => string.Concat("\"", field.Replace("\"", "\"\""), "\""))
.ToArray();
outFile.WriteLine(string.Join(",", columnValues));
}
}
}
}
private IEnumerable<string> GetColumnNames(IDataReader reader)
{
foreach (DataRow row in reader.GetSchemaTable().Rows)
{
yield return (string)row["ColumnName"];
}
}
}
I wrote this code, and declare it CC0 (public domain).
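The quoting rule used in the exporter above (wrap every field in double quotes and double any embedded quote) can be tried without a database. The following is only a sketch: the names CsvExport, Quote, and Write are made up, and an in-memory DataTable reader stands in for the SqlDataReader. It streams row by row in the same way, so memory use does not grow with the table size. As a side note on the timeout from the question, SqlCommand has a CommandTimeout property (in seconds, 0 meaning no limit) that can be set before calling ExecuteReader.

```csharp
using System;
using System.Data;
using System.Globalization;
using System.IO;
using System.Linq;

public static class CsvExport
{
    // Wraps a field in quotes and doubles embedded quotes: a "b" -> "a ""b"""
    public static string Quote(string field)
    {
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Streams any IDataReader to a TextWriter, one row at a time.
    public static void Write(IDataReader reader, TextWriter output)
    {
        int numFields = reader.FieldCount;

        output.WriteLine(string.Join(",",
            Enumerable.Range(0, numFields).Select(i => Quote(reader.GetName(i)))));

        while (reader.Read())
        {
            output.WriteLine(string.Join(",",
                Enumerable.Range(0, numFields)
                    // DBNull converts to an empty string.
                    .Select(i => Quote(Convert.ToString(reader.GetValue(i), CultureInfo.InvariantCulture)))));
        }
    }

    public static void Main()
    {
        // Hypothetical sample data instead of a real Orders table.
        var table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Note", typeof(string));
        table.Rows.Add(1, "He said \"hi\"");
        table.Rows.Add(2, DBNull.Value);

        using (IDataReader reader = table.CreateDataReader())
        {
            Write(reader, Console.Out);
        }
    }
}
```

Taking an IDataReader and a TextWriter rather than a connection and a file name keeps the export logic testable; the caller decides where the rows come from and where the text goes.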
I combined the two code samples above, and this is the code I use (with VS 2010).
//these are all the namespaces that I used
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using UsbLibrary;
using System.Data;
using System.Data.SqlClient;
using System.Configuration;
using System.Globalization;
//code in a button click handler
SqlConnection _connection = new SqlConnection();
SqlDataAdapter _dataAdapter = new SqlDataAdapter();
SqlCommand _command = new SqlCommand();
DataTable _dataTable = new DataTable();
//dbk is my database name that you can change it to your database name
_connection.ConnectionString = "Data Source=.;Initial Catalog=dbk;Integrated Security=True";
_connection.Open();
SaveFileDialog saveFileDialogCSV = new SaveFileDialog();
saveFileDialogCSV.InitialDirectory = Application.ExecutablePath.ToString();
saveFileDialogCSV.Filter = "CSV files (*.csv)|*.csv|All files (*.*)|*.*";
saveFileDialogCSV.FilterIndex = 1;
saveFileDialogCSV.RestoreDirectory = true;
if (saveFileDialogCSV.ShowDialog() == DialogResult.OK)
{
// Runs the export only if the user actually picked a file name.
string path_csv = saveFileDialogCSV.FileName;
DumpTableToFile(_connection, "tbl_trmc", path_csv);
}
}
//end of code in button|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
public void DumpTableToFile(SqlConnection connection, string tableName, string destinationFile)
{
using (var command = new SqlCommand("select * from " + tableName, connection))
using (var reader = command.ExecuteReader())
using (var outFile = System.IO.File.CreateText(destinationFile))
{
string[] columnNames = GetColumnNames(reader).ToArray();
int numFields = columnNames.Length;
outFile.WriteLine(string.Join(",", columnNames));
if (reader.HasRows)
{
while (reader.Read())
{
string[] columnValues =
Enumerable.Range(0, numFields)
.Select(i => reader.GetValue(i).ToString())
.Select(field => string.Concat("\"", field.Replace("\"", "\"\""), "\""))
.ToArray();
outFile.WriteLine(string.Join(",", columnValues));
}
}
}
}
private IEnumerable<string> GetColumnNames(IDataReader reader)
{
foreach (DataRow row in reader.GetSchemaTable().Rows)
{
yield return (string)row["ColumnName"];
}
}
try this one:
private void exportToCSV()
{
//Asks for the file name with a SaveFileDialog control.
SaveFileDialog saveFileDialogCSV = new SaveFileDialog();
saveFileDialogCSV.InitialDirectory = Application.ExecutablePath.ToString();
saveFileDialogCSV.Filter = "CSV files (*.csv)|*.csv|All files (*.*)|*.*";
saveFileDialogCSV.FilterIndex = 1;
saveFileDialogCSV.RestoreDirectory = true;
if (saveFileDialogCSV.ShowDialog() == DialogResult.OK)
{
// Runs the export operation if the given file name is valid.
exportToCSVfile(saveFileDialogCSV.FileName.ToString());
}
}
/*
 * Exports data to the CSV file.
 */
private void exportToCSVfile(string fileOut)
{
// Connects to the database, and makes the select command.
string sqlQuery = "select * from dbo." + this.lbxTables.SelectedItem.ToString();
SqlCommand command = new SqlCommand(sqlQuery, objConnDB_Auto);
// Creates a SqlDataReader instance to read data from the table.
SqlDataReader dr = command.ExecuteReader();
// Retrieves the schema of the table.
DataTable dtSchema = dr.GetSchemaTable();
// Creates the CSV file as a stream, using the given encoding.
StreamWriter sw = new StreamWriter(fileOut, false, this.encodingCSV);
string strRow; // represents a full row
// Writes the column headers if the user previously asked that.
if (this.chkFirstRowColumnNames.Checked)
{
sw.WriteLine(columnNames(dtSchema, this.separator));
}
// Reads the rows one by one from the SqlDataReader
// transfers them to a string with the given separator character and
// writes it to the file.
while (dr.Read())
{
strRow = "";
for (int i = 0; i < dr.FieldCount; i++)
{
switch (Convert.ToString(dr.GetFieldType(i)))
{
case "System.Int16":
strRow += Convert.ToString(dr.GetInt16(i));
break;
case "System.Int32" :
strRow += Convert.ToString(dr.GetInt32(i));
break;
case "System.Int64":
strRow += Convert.ToString(dr.GetInt64(i));
break;
case "System.Decimal":
strRow += Convert.ToString(dr.GetDecimal(i));
break;
case "System.Double":
strRow += Convert.ToString(dr.GetDouble(i));
break;
case "System.Single": // GetFloat returns a float, whose .NET type name is System.Single (there is no System.Float)
strRow += Convert.ToString(dr.GetFloat(i));
break;
case "System.Guid":
strRow += Convert.ToString(dr.GetGuid(i));
break;
case "System.String":
strRow += dr.GetString(i);
break;
case "System.Boolean":
strRow += Convert.ToString(dr.GetBoolean(i));
break;
case "System.DateTime":
strRow += Convert.ToString(dr.GetDateTime(i));
break;
}
if (i < dr.FieldCount - 1)
{
strRow += this.separator;
}
}
sw.WriteLine(strRow);
}
// Closes the text stream and the data reader.
sw.Close();
dr.Close();
// Notifies the user.
MessageBox.Show("ready");
}
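One thing to watch in the type switch above: typed getters such as `GetInt32` throw when the column is NULL, and any type without a `case` is silently written as an empty field. A possible alternative, sketched here under the assumption that culture-invariant text is acceptable for every column, is to check `IsDBNull` and let `Convert.ToString` handle the rest. `RowFormatSketch` and `RowToString` are hypothetical names, and the `DataTable` only exists so the example runs without a database.

```csharp
using System;
using System.Data;
using System.Globalization;
using System.Text;

public static class RowFormatSketch
{
    // Joins the fields of the current record with the given separator.
    public static string RowToString(IDataRecord record, string separator)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < record.FieldCount; i++)
        {
            if (i > 0)
                sb.Append(separator);

            // NULL columns become empty fields instead of throwing.
            if (!record.IsDBNull(i))
                sb.Append(Convert.ToString(record.GetValue(i), CultureInfo.InvariantCulture));
        }
        return sb.ToString();
    }
}

public static class Program
{
    public static void Main()
    {
        // In-memory stand-in for a database table (assumption for the demo).
        var table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Ratio", typeof(float));
        table.Columns.Add("Note", typeof(string));
        table.Rows.Add(1, 0.5f, DBNull.Value);

        using (var reader = table.CreateDataReader())
        {
            while (reader.Read())
                Console.WriteLine(RowFormatSketch.RowToString(reader, ";")); // prints "1;0.5;"
        }
    }
}
```

In `exportToCSVfile`, the body of the `for` loop could then shrink to a single call such as `strRow = RowFormatSketch.RowToString(dr, this.separator);`, assuming `this.separator` is a string.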
Very appreciative of Jay Sullivan's answer; it was very helpful for me.
Building on that, I observed that in his solution the string formatting of varbinary and datetime data types was not good: varbinary fields would come out as the literal text "System.Byte[]", while datetime fields would be formatted as MM/dd/yyyy hh:mm:ss tt, which is not desirable for me.
Below is my hacked-together solution, which converts to string differently based on data type. It uses nested ternary operators, but it works!
Hope it is helpful for someone.
public static void DumpTableToFile(SqlConnection connection, Dictionary<string, string> cArgs)
{
string query = "SELECT ";
string z = "";
if (cArgs.TryGetValue("top_count", out z))
{
query += string.Format("TOP {0} ", z);
}
query += string.Format("* FROM {0} (NOLOCK) ", cArgs["table"]);
string lower_bound = "", upper_bound = "", column_name = "";
if (cArgs.TryGetValue("lower_bound", out lower_bound) && cArgs.TryGetValue("column_name", out column_name))
{
query += string.Format("WHERE {0} >= {1} ", column_name, lower_bound);
if (cArgs.TryGetValue("upper_bound", out upper_bound))
{
query += string.Format("AND {0} < {1} ", column_name, upper_bound);
}
}
Console.WriteLine(query);
Console.WriteLine("");
using (var command = new SqlCommand(query, connection))
using (var reader = command.ExecuteReader())
using (var outFile = File.CreateText(cArgs["out_file"]))
{
string[] columnNames = GetColumnNames(reader).ToArray();
int numFields = columnNames.Length;
Console.WriteLine(string.Join(",", columnNames));
Console.WriteLine("");
if (reader.HasRows)
{
Type datetime_type = Type.GetType("System.DateTime");
Type byte_arr_type = Type.GetType("System.Byte[]");
string format = "yyyy-MM-dd HH:mm:ss.fff";
int ii = 0;
while (reader.Read())
{
ii += 1;
string[] columnValues =
Enumerable.Range(0, numFields)
.Select(i => reader.GetValue(i).GetType()==datetime_type?((DateTime) reader.GetValue(i)).ToString(format):(reader.GetValue(i).GetType() == byte_arr_type? String.Concat(Array.ConvertAll((byte[]) reader.GetValue(i), x => x.ToString("X2"))) :reader.GetValue(i).ToString()))
// .Select(field => string.Concat("\"", field.Replace("\"", "\"\""), "\""))
.Select(field => field.Replace("\t", " "))
.ToArray();
outFile.WriteLine(string.Join("\t", columnValues));
if (ii % 100000 == 0)
{
Console.WriteLine("row {0}", ii);
}
}
}
}
}
public static IEnumerable<string> GetColumnNames(IDataReader reader)
{
foreach (DataRow row in reader.GetSchemaTable().Rows)
{
yield return (string)row["ColumnName"];
}
}
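The nested ternary in the answer above can be hard to follow. One way it might be unpacked, shown only as a sketch, is a small helper with one branch per type that needs special treatment. `ValueFormatSketch` and `FormatValue` are hypothetical names; the date format string is the one from the answer. Unlike the original, the fallback branch uses the invariant culture rather than the current one, which is an assumption about what is wanted.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class ValueFormatSketch
{
    private const string DateFormat = "yyyy-MM-dd HH:mm:ss.fff";

    // Converts one column value to text, with special cases for NULL,
    // DateTime (fixed format) and byte[] (uppercase hex).
    public static string FormatValue(object value)
    {
        switch (value)
        {
            case null:
            case DBNull _:
                return "";
            case DateTime dt:
                return dt.ToString(DateFormat, CultureInfo.InvariantCulture);
            case byte[] bytes:
                return string.Concat(bytes.Select(b => b.ToString("X2")));
            default:
                return Convert.ToString(value, CultureInfo.InvariantCulture);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(ValueFormatSketch.FormatValue(new DateTime(2020, 1, 2, 3, 4, 5, 6))); // 2020-01-02 03:04:05.006
        Console.WriteLine(ValueFormatSketch.FormatValue(new byte[] { 0xDE, 0xAD }));            // DEAD
        Console.WriteLine(ValueFormatSketch.FormatValue(42));                                   // 42
    }
}
```

Inside `DumpTableToFile`, the long lambda could then read `.Select(i => ValueFormatSketch.FormatValue(reader.GetValue(i)))`, followed by the existing tab-replacement step.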
FileHelpers has an async engine which is better suited for handling large files. Unfortunately, the FileDataLink class does not use it, so there's no easy way to use it with SqlServerStorage.
It's not very easy to modify the SQL timeout either. The easiest way would be to copy the code for SqlServerStorage to create your own alternative storage provider and provide replacements for ExecuteAndClose() and ExecuteAndLeaveOpen() which set the timeout on the IDbCommand. (SqlServerStorage is a sealed class, so you cannot just subclass it).
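Whichever replacement provider ends up issuing the command, the timeout itself is the standard ADO.NET `CommandTimeout` property on `IDbCommand`, measured in seconds. The sketch below shows only that part; `WithTimeout` is a hypothetical extension method and not part of FileHelpers, and the example assumes `System.Data.SqlClient` is referenced, as in the other answers here.

```csharp
using System;
using System.Data;
using System.Data.SqlClient; // assumption: available as in the other answers

public static class CommandTimeoutSketch
{
    // Hypothetical helper: sets the timeout (in seconds) on any ADO.NET command
    // and returns the same command so the call can be chained.
    public static T WithTimeout<T>(this T command, int seconds) where T : IDbCommand
    {
        command.CommandTimeout = seconds;
        return command;
    }
}

public static class Program
{
    public static void Main()
    {
        // No connection is opened; this only demonstrates the property.
        using (var command = new SqlCommand("select 1").WithTimeout(600))
        {
            Console.WriteLine(command.CommandTimeout); // prints 600
        }
    }
}
```

For `SqlCommand`, a value of 0 means no time limit, which is tempting for big exports but can leave a stuck query waiting indefinitely.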
You might want to check out ReactiveETL, which uses the FileHelpers async engine for handling files, along with a rewrite of Ayende's RhinoETL that uses Reactive Extensions to handle large datasets.
