When I use this code, it skips the first line of the CSV file, which contains the headers. What am I doing wrong?
string strFileName = path;
OleDbConnection conn = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0; Data Source = " + System.IO.Path.GetDirectoryName(strFileName) + "; Extended Properties = \"Text\"");
conn.Open();
OleDbDataAdapter adapter = new OleDbDataAdapter("SELECT * FROM " + System.IO.Path.GetFileName(strFileName), conn);
DataSet ds = new DataSet("Temp");
adapter.Fill(ds);
DataTable tb = ds.Tables[0];
string data = null;
for (int j = 0; j <= tb.Rows.Count - 1; j++)
{
for (int k = 0; k <= tb.Columns.Count - 1; k++)
{
data = tb.Rows[j].ItemArray[k].ToString();
SaturnAddIn.getInstance().Application.ActiveWorkbook.ActiveSheet.Cells[j + 1, k + 1] = data;
}
}
It will skip the first row of headers, unless you use:
Extended Properties=Text;HDR=No;
But in this case it will treat the first row as a data-row which will probably (at some stage) cause data-type errors.
Normally you would skip the first row, and create the headers in Excel manually.
This comment notes the same behavior when the FULL PATH is passed into the SELECT statement. Since the directory of the file is provided in the OleDbConnection it does not need to be provided a second time.
There are some similar notes at this answer (to a different question) that indicate that the path should be in the connection, as well.
It also recommends using a "real" CSV parser.
I also found that when HDR=Yes the header text is still available as the DataTable's column names: read table.Columns[i].ColumnName in a loop over the columns (table.Columns[0].ColumnName is the first header).
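As a rough sketch of that last point (not the original poster's code): once the adapter has filled a DataTable with HDR=Yes, the header text lives in the column names, so a header row can be emitted before the data rows. The names `CsvGrid` and `ToGrid` are hypothetical and used only for illustration.

```csharp
using System;
using System.Data;

public static class CsvGrid
{
    // Returns a grid whose row 0 holds the column names (the CSV headers)
    // and whose remaining rows hold the data, all converted to strings.
    public static string[,] ToGrid(DataTable table)
    {
        int rows = table.Rows.Count;
        int cols = table.Columns.Count;
        var grid = new string[rows + 1, cols];

        // Header row: taken from the column names, not from the data rows.
        for (int k = 0; k < cols; k++)
            grid[0, k] = table.Columns[k].ColumnName;

        // Data rows, shifted down by one to leave room for the headers.
        for (int j = 0; j < rows; j++)
            for (int k = 0; k < cols; k++)
                grid[j + 1, k] = Convert.ToString(table.Rows[j][k]);

        return grid;
    }
}
```

Each grid[r, c] could then be written to the worksheet cell at [r + 1, c + 1], since Excel cells are 1-based.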
I'm working on a project in C# that converts a database table to an XML-file with base64 encoded contents. Please bear with me, because C# is not my day-to-day programming language.
The code I've managed to come up with is this:
OdbcCommand DbCommand = DbConnection.CreateCommand();
DbCommand.CommandText = "SELECT * FROM " + dbTable;
OdbcDataReader DbReader = DbCommand.ExecuteReader();
int fCount = DbReader.FieldCount;
string[] colnames = new string[fCount];
output += "<" + dbTable + ">\n";
for (int i = 0; i < fCount; i++)
{
string fName = DbReader.GetName(i);
colnames[i] = fName.ToString();
}
while (DbReader.Read())
{
output += "\t<export_row>\n";
for (int i = 0; i < fCount; i++)
{
string col = "";
try
{
col = DbReader.GetString(i);
}
catch (Exception) { }
if (col.ToString().Length > 0 || i == 0)
{
output += "\t\t<" + colnames[i] + ">" + Base64Encode(col).ToString() + "</" + colnames[i] + ">\n";
}
}
output += "\t</export_row>\n";
}
output += "</" + dbTable + ">\n";
The problem is, that even with a relatively small table, this causes the application to choke up and run extremely slowly. The obvious clue is that there's an enormous amount of iterations involved for each row, so I have been looking for a solution to this problem. I have tried using a DataSet, which seemed to increase performance slightly, but not significantly enough.
connection.Open();
adapter.Fill(dataSet);
output += "<" + dbTable + ">\n";
foreach (DataTable table in dataSet.Tables)
{
foreach (DataRow row in table.Rows)
{
output += "\t<export_row>\n";
foreach (DataColumn column in table.Columns)
{
output += "\t\t<" + column.ToString() + ">" + Base64Encode(row[column].ToString()).ToString() + "</" + column.ToString() + ">\n";
}
output += "\t</export_row>\n";
}
}
output += "</" + dbTable + ">\n";
However, the problem remains that there is no other way than iterating through all the columns each and every time. Which begs the question: isn't there a more efficient way to do this? I'm not going to make a model for every table, because there are hundreds of tables in this database and the power would be the flexibility of transferring data in this way.
Can someone help me out, or point me in the right direction? For example, is there a way to extract both the column and the value at the same time? As in: foreach(row as key => value) or something. That would drastically reduce the amount of iterations required.
Thanks in advance for thinking along! There must be something (obvious) I missed.
The key is never to hand-write the formatting of text formats yourself, be it HTML, JSON, XML, YAML, or anything else. Doing so invites hard-to-find bugs and injection problems, since you do not control the data or the table names. For example, what happens if your data contains &, <, or >?
Both C# and SQL have numerous built-in XML tools that do the formatting for you. Which one to use depends on your other requirements or preferences.
SqlCommand.ExecuteXmlReader
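string cmd = "SELECT * FROM " + myTable + " FOR XML AUTO";
using (SqlCommand k = new SqlCommand(cmd, c))
{
    c.Open();
    using (XmlReader xml = k.ExecuteXmlReader())
    {
        // FOR XML AUTO returns a fragment (one element per row), so print it element by element.
        // Console.WriteLine(xml) would only print the reader's type name.
        xml.MoveToContent();
        while (!xml.EOF)
        {
            if (xml.NodeType == XmlNodeType.Element) Console.WriteLine(xml.ReadOuterXml());
            else xml.Read();
        }
    }
    c.Close();
}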
DataTable.WriteXml
string ConString = "your connection string";
string CmdString = "SELECT * FROM " + tableName;
SqlConnection con;
SqlCommand cmd;
SqlDataAdapter sda;
DataTable dt;
using (con = new SqlConnection(ConString))
{
cmd = new SqlCommand(CmdString, con);
con.Open();
dt = new DataTable(tableName);
sda = new SqlDataAdapter(cmd);
sda.Fill(dt);
dt.WriteXml(tableName + ".xml");
con.Close();
}
DataSet.GetXml()
// Create a DataSet with one table containing
// two columns and 10 rows.
DataSet dataSet = new DataSet("dataSet");
DataTable table = dataSet.Tables.Add("Items");
table.Columns.Add("id", typeof(int));
table.Columns.Add("Item", typeof(string));
// Add ten rows.
DataRow row;
for(int i = 0; i <10;i++)
{
row = table.NewRow();
row["id"]= i;
row["Item"]= "Item" + i;
table.Rows.Add(row);
}
// Display the DataSet contents as XML.
Console.WriteLine(dataSet.GetXml());
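The options above emit plain XML, whereas the question asked for base64-encoded contents. A minimal sketch of that variant using XmlWriter follows. It assumes the table has a non-empty TableName and that values are encoded as UTF-8 before base64 encoding. `Base64XmlExport` and `Export` are hypothetical names. Writing through a StringBuilder also avoids the repeated string concatenation in the question, which is a likely contributor to the slowness.

```csharp
using System;
using System.Data;
using System.Text;
using System.Xml;

public static class Base64XmlExport
{
    // Writes one <export_row> per DataRow; every value is base64-encoded UTF-8.
    // Assumption: table.TableName is not empty (it becomes the root element).
    public static string Export(DataTable table)
    {
        var sb = new StringBuilder();
        var settings = new XmlWriterSettings { Indent = true, OmitXmlDeclaration = true };

        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            // EncodeLocalName escapes characters that are not legal in element names.
            writer.WriteStartElement(XmlConvert.EncodeLocalName(table.TableName));
            foreach (DataRow row in table.Rows)
            {
                writer.WriteStartElement("export_row");
                foreach (DataColumn column in table.Columns)
                {
                    // DBNull converts to an empty string here.
                    byte[] bytes = Encoding.UTF8.GetBytes(Convert.ToString(row[column]) ?? "");
                    writer.WriteStartElement(XmlConvert.EncodeLocalName(column.ColumnName));
                    writer.WriteBase64(bytes, 0, bytes.Length);
                    writer.WriteEndElement();
                }
                writer.WriteEndElement();
            }
            writer.WriteEndElement();
        }
        return sb.ToString();
    }
}
```

Iterating over every column of every row is unavoidable and cheap. The work that grows out of proportion in the original code is rebuilding the output string on each `+=`.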
I have created a trigger on my database table that stores the data in a new physical table. I have also written a console application that exports the data from that physical table into an Excel sheet using Excel interop.
Each time I run the application, I want the new Excel file to contain only the rows that have not been exported yet, rather than everything. In other words, I want to compare against the previously generated Excel file and leave out of the new file any data that is already there.
For example:
Stock.xls data:
A
B
C
Database Table data :
A
B
C
When I run the application a second time, after adding a new row (D) to the physical table in the database, the new Excel sheet should leave out A, B, and C and show only D:
Stock.xls data:
D
Database Table data :
A
B
C
D
This is my code :
string connectionstring = System.Configuration.ConfigurationManager.ConnectionStrings["IntegrationConnection"].ConnectionString;
string sql2 = null;
string data2 = null;
int k = 0;
int l = 0;
string Filename2 = @"D:\Integration\Stock.xls";
if (!File.Exists(Filename2))
{
File.Create(Filename2).Dispose();
using (TextWriter tw = new StreamWriter(Filename2))
{
tw.WriteLine("Please run the program again");
tw.Close();
}
}
////*** Preparing excel Application
Excel.Application xlApp2;
Excel.Workbook xlWorkBook2;
Excel.Worksheet xlWorkSheet2;
object misValue2 = System.Reflection.Missing.Value;
///*** Opening Excel application
xlApp2 = new Microsoft.Office.Interop.Excel.Application();
xlWorkBook2 = xlApp2.Workbooks.Open(Filename2);
xlWorkSheet2 = (Excel.Worksheet)(xlWorkBook2.ActiveSheet as Excel.Worksheet);
xlApp2.DisplayAlerts = false;
SqlConnection conn2 = new SqlConnection(connectionstring);
conn2.Open();
sql2 = "SELECT * from tblMPartHistory";
///*** Preparing to retrieve value from the database
DataTable dtable2 = new DataTable();
SqlDataAdapter dscmd2 = new SqlDataAdapter(sql2, conn2);
DataSet ds2 = new DataSet();
dscmd2.Fill(dtable2);
////*** Generating the column Names here
string[] colNames2 = new string[dtable2.Columns.Count];
int col2 = 0;
foreach (DataColumn dc in dtable2.Columns)
colNames2[col2++] = dc.ColumnName;
char lastColumn2 = (char)('A' + dtable2.Columns.Count - 1); // 'A' is the first column; only valid for up to 26 columns
xlWorkSheet2.get_Range("A1", lastColumn2 + "1").Value2 = colNames2;
xlWorkSheet2.get_Range("A1", lastColumn2 + "1").Font.Bold = true;
xlWorkSheet2.get_Range("A1", lastColumn2 + "1").VerticalAlignment
= Excel.XlVAlign.xlVAlignCenter;
/////*** Inserting the Column and Values into Excel file
for (k = 0; k <= dtable2.Rows.Count - 1; k++)
{
for (l = 0; l <= dtable2.Columns.Count - 1; l++)
{
data2 = dtable2.Rows[k].ItemArray[l].ToString();
xlWorkSheet2.Cells[k + 2, l + 1] = data2;
xlWorkBook2.Save();
}
}
xlWorkBook2.Close(true, misValue2, misValue2);
xlApp2.Quit();
System.Runtime.InteropServices.Marshal.ReleaseComObject(xlWorkSheet2);
System.Runtime.InteropServices.Marshal.ReleaseComObject(xlWorkBook2);
System.Runtime.InteropServices.Marshal.ReleaseComObject(xlApp2);
So this is how I understand your question:
You have a database table and you want to generate a new Excel sheet every time new rows have been committed to your database, reflecting only the recently added data.
Of course you could go with your approach and compare the data against all of your previous Excel sheets to remove duplicates.
A better approach, though, would be to add either a timestamp or a session ID to each row.
You can then query your database on it and generate the new sheet from all the rows with the latest matching timestamp or the highest matching session ID.
This way you not only save the extra work of removing duplicates, you can also regenerate any of the sheets if they get lost.
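A minimal sketch of that idea, assuming the history table has a monotonically increasing, non-null numeric id column and that the last exported id is stored somewhere between runs. The id column and the names `IncrementalExport`, `RowsAfter`, and `NextWatermark` are hypothetical. In practice the filter would more likely live in the query itself (for example a parameterized `WHERE <idColumn> > @lastId`); the in-memory version below only illustrates the logic.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class IncrementalExport
{
    // Returns only the rows whose id is greater than the last exported id.
    public static List<DataRow> RowsAfter(DataTable table, string idColumn, long lastExportedId)
    {
        var fresh = new List<DataRow>();
        foreach (DataRow row in table.Rows)
        {
            if (Convert.ToInt64(row[idColumn]) > lastExportedId)
                fresh.Add(row);
        }
        return fresh;
    }

    // Computes the id to remember for the next run: the highest id exported so far.
    public static long NextWatermark(IEnumerable<DataRow> exported, string idColumn, long previous)
    {
        long max = previous;
        foreach (DataRow row in exported)
            max = Math.Max(max, Convert.ToInt64(row[idColumn]));
        return max;
    }
}
```

With rows A, B, C already exported (ids 1 to 3) and D added as id 4, `RowsAfter(table, "Id", 3)` yields only D, and the stored watermark becomes 4.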
I have a small button click code which outputs a SQL query to Excel.
This worked fine until I added a new field in the table it queries and now it is still outputting the same fields and not including the new one.
Code:
private void button4_Click(object sender, EventArgs e)
{
string SQLQuery = "SELECT * FROM Services";
SqlConnection conn = new SqlConnection("CONNECTION STRING");
DataSet ds = new DataSet();
SqlDataAdapter da = new SqlDataAdapter(SQLQuery, conn);
da.Fill(ds);
Microsoft.Office.Interop.Excel.Application ExcelApp = new Microsoft.Office.Interop.Excel.Application();
Workbook ExcelWorkBook = null;
Worksheet ExcelWorkSheet = null;
ExcelApp.Visible = true;
ExcelApp.WindowState = XlWindowState.xlMinimized;
ExcelApp.WindowState = XlWindowState.xlMaximized;
ExcelWorkBook = ExcelApp.Workbooks.Add(XlWBATemplate.xlWBATWorksheet);
List<string> SheetNames = new List<string>();
SheetNames.Add("Services Details");
ExcelApp.ActiveWindow.Activate();
try
{
for (int i = 1; i < ds.Tables.Count; i++)
ExcelWorkBook.Worksheets.Add();
for (int i = 0; i < ds.Tables.Count; i++)
{
int r = 1;
ExcelWorkSheet = ExcelWorkBook.Worksheets[i + 1];
ExcelWorkSheet.Name = "Services";
for (int col = 1; col < ds.Tables[i].Columns.Count; col++)
ExcelWorkSheet.Cells[r, col] = ds.Tables[i].Columns[col - 1].ColumnName;
r++;
for (int row = 0; row < ds.Tables[i].Rows.Count; row++)
{
for (int col = 1; col < ds.Tables[i].Columns.Count; col++)
ExcelWorkSheet.Cells[r, col] = ds.Tables[i].Rows[row][col - 1].ToString();
r++;
}
ExcelWorkSheet.Rows[1].EntireRow.Font.Bold = true;
ExcelApp.Columns.AutoFit();
}
ExcelApp.Quit();
Marshal.ReleaseComObject(ExcelWorkSheet);
Marshal.ReleaseComObject(ExcelWorkBook);
Marshal.ReleaseComObject(ExcelApp);
}
catch (Exception exHandle)
{
Console.WriteLine("Exception: " + exHandle.Message);
Console.ReadLine();
}
}
I have also tried explicitly querying the new field SELECT ABC FROM Services and nothing is output.
There are values in the new fields.
If I run the same query on Azure query editor preview I get the correct results.
EDIT
Ok so I changed the query to SELECT *,1 FROM Services and then I get all the fields (bar the new "1" field) how can I change the loop to get all fields?
EDIT 2 SOLUTION Using EPPlus
Just to update anyone looking at this in the future: I used the NuGet Package Manager (Project > Manage NuGet Packages) and installed EPPlus by Jan Källman.
I then added:
using OfficeOpenXml;
using OfficeOpenXml.Style;
using System.Data.SqlClient;
using System.IO;
And used the following code on the button:
private void button4_Click(object sender, EventArgs e)
{
string SQLQuery = "SELECT * FROM Services";
SqlConnection conn = new SqlConnection("CONNECTION STRING");
DataSet ds = new DataSet();
SqlDataAdapter da = new SqlDataAdapter(SQLQuery, conn);
DataTable dt = new DataTable();
da.Fill(dt);
using (var p = new ExcelPackage())
{
var ws = p.Workbook.Worksheets.Add("Service");
ws.Cells["A1"].LoadFromDataTable(dt, true);
int totalRows = ws.Dimension.End.Row;
int totalCols = ws.Dimension.End.Column;
var headerCells = ws.Cells[1, 1, 1, totalCols];
var headerFont = headerCells.Style.Font;
headerFont.Bold = true;
var allCells = ws.Cells[1, 1, totalRows, totalCols];
allCells.AutoFitColumns();
p.SaveAs(new FileInfo(@"d:\excel\Service" + DateTime.Now.ToFileTime() + ".xlsx"));
}
}
This writes the output straight to a file. Thanks to @Caius Jard.
First up, I second the comment about using EPPlus. I think it even has a method to turn a datatable into a sheet, so this click event handler could be boiled down to about 4 lines of code. Take a look over this SO question - Export DataTable to excel with EPPlus
Second, I think your actual problem is a simple off-by-one error
You said (my paraphrase)
if I use a query that returns one column, like select abc from services, nothing is output
Here's how you're doing your column output:
for(int col = 1; col < table.Columns.Count; col++)
Your table has one column. The comparison to run the loop is thus: is 1 less than 1?
False
Loop doesn't run
If your table had 10 columns, the loop would run 9 times, so the last column would never be output. Your query now has a new column at the end and you expect to see it, but it is missing because your C# never outputs it, not because there is anything unusual about the data in that column.
In terms of what to do about it, change the comparison you do in the "should loop run" part:
for(int col = 1; col <= table.Columns.Count; col++)
Or change the way you index (index by 0 instead of index by 1):
for(int col = 0; col < table.Columns.Count; col++)
excel[row, col+1] = table[row][col]; //excel is 1 based, c# is 0 based
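To make the off-by-one concrete, here is a small sketch that counts how many columns each loop condition visits. `OffByOneDemo` and `CountIterations` are hypothetical names, and no Excel or database access is involved.

```csharp
public static class OffByOneDemo
{
    // Counts the iterations of a 1-based column loop.
    // inclusive == false mirrors the original condition (col < columnCount);
    // inclusive == true mirrors the corrected one (col <= columnCount).
    public static int CountIterations(int columnCount, bool inclusive)
    {
        int visited = 0;
        for (int col = 1; inclusive ? col <= columnCount : col < columnCount; col++)
            visited++;
        return visited;
    }
}
```

With one column the original condition visits nothing, and with ten columns it visits nine. The inclusive condition visits all of them.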
Regarding those notions of "but the data hasn't changed, and the code hasn't changed, and something different is observed": it's almost never the case. It's far more likely that something is being misremembered by the human in the equation. Maybe the first person who wrote the code hit the same issue and just duplicated the last column in the SQL to get around it, then you or someone else saw it and thought "that looks wrong, I'll just take that out", and months later you're looking at it and going "this used to work, I'm sure it did..."
:)
But seriously; use EPPlus ;) it's delightful, easy, creates xlsx files directly (they're just xml files inside a zip, renamed to xlsx) and is super simple. Excel COM is a massive headache, and you'd be needlessly building a tech support nightmare for yourself. With EPPlus (use nuget package manager to add a reference to EPPlus) your code would look more like:
using(SqlDataAdapter da = new SqlDataAdapter("SELECT * FROM services", "PUT YOUR CONN STRING IN A CONFIG FILE")){
DataTable dt = new DataTable();
da.Fill(dt);
using (ExcelPackage ep = new ExcelPackage())
{
ExcelWorksheet ws = ep.Workbook.Worksheets.Add("Sheetname");
ws.Cells["A1"].LoadFromDataTable(dt, true);
ep.SaveAs(new FileInfo("YOUR SAVE PATH HERE"));
}
}
Is "Services" a table or a view? You might be seeing results from an outdated schema (cached before you made your change). This is a danger with "select *". Try an explicit field list instead of "*".
private void OnCreated(object sender, FileSystemEventArgs e)
{
excelDataSet.Clear();
string extension = Path.GetExtension(e.FullPath);
if (extension == ".xls" || extension == ".xlsx")
{
string ConnectionString = "";
if (extension == ".xls") { ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0; Data Source = '" + e.FullPath + "';Extended Properties=\"Excel 8.0;HDR=YES;\""; }
if (extension == ".xlsx") { ConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0; Data Source = '" + e.FullPath + "';Extended Properties=\"Excel 12.0;HDR=YES;\""; }
using (OleDbConnection conn = new OleDbConnection(ConnectionString))
{
conn.Open();
OleDbDataAdapter objDA = new OleDbDataAdapter("select * from [Sheet1$]", conn);
objDA.Fill(excelDataSet);
conn.Close();
conn.Dispose();
}
}
}
This is my code. It runs when my file watcher triggers. The problem is that the Excel file I read has 1 header row and 3 rows of values, yet when I run this code and check my DataSet's row count I get 9. I have no idea where the 9 comes from. Am I doing something wrong? I have been checking my code for the last 30-35 minutes and still can't find the mistake.
I get the columns right, but the row count is wrong. I don't need the header line, by the way.
Update: my example Excel file had 3 rows and I was getting 9 as the row count. I copied those rows so the file had 24 rows plus 1 header row, and Rows.Count returned 24. So it worked fine? Is that normal?
There is a NuGet package called Linq to Excel. I have used it in several projects to query the data inside .csv and .xlsx files without any difficulty, and it is easy to implement. Its performance may be poor, but it can solve your problem.
Here is the documentation of Linq to Excel
I would highly recommend you to take a look at EPPLUS library https://github.com/JanKallman/EPPlus/wiki
I had plenty of trouble with OLE DB until I found EPPlus. It's really easy to use for creating and updating Excel files. There are plenty of good examples out there, like the one below, which is from "How do I iterate through rows in an excel table using epplus?"
var package = new ExcelPackage(new FileInfo("sample.xlsx"));
ExcelWorksheet workSheet = package.Workbook.Worksheets[1];
var start = workSheet.Dimension.Start;
var end = workSheet.Dimension.End;
for (int row = start.Row; row <= end.Row; row++)
{ // Row by row...
for (int col = start.Column; col <= end.Column; col++)
{ // ... Cell by cell...
object cellValue = workSheet.Cells[row, col].Text; // This got me the actual value I needed.
}
}
(First, sorry for my English.)
I want to temporarily change an auto-number column to the Int64 data type so I can import records from another database. After importing the records, I want to change it back to auto-number.
My problem:
I tried to use the table.Columns[i].AutoIncrement property to check whether a column is auto-number and get its index so that I can change its data type, but this property did not work for me: it returned false for every column.
I am working with an Access 2010/2013 database.
So how can I get the index of the auto-number column?
You can use this approach
// Bogus query: we don't want any records, so add an always-false condition
OleDbCommand cmd = new OleDbCommand("SELECT * FROM aTable where 1=2", con);
OleDbDataAdapter da = new OleDbDataAdapter(cmd);
DataTable test = new DataTable();
da.FillSchema(test, SchemaType.Source);
for(int x = 0; x < test.Columns.Count; x++)
{
DataColumn dc = test.Columns[x];
Console.WriteLine("ColName = " + dc.ColumnName +
", at index " + x +
" IsAutoIncrement:" + dc.AutoIncrement);
}
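Building on the loop above, a small helper can return the index the question asks for. This is a sketch that assumes the DataTable's schema was loaded with FillSchema as shown, so that AutoIncrement is populated. `AutoNumberFinder` and `FindAutoIncrementIndex` are hypothetical names.

```csharp
using System.Data;

public static class AutoNumberFinder
{
    // Returns the index of the first auto-increment column, or -1 if there is none.
    public static int FindAutoIncrementIndex(DataTable schema)
    {
        for (int i = 0; i < schema.Columns.Count; i++)
        {
            if (schema.Columns[i].AutoIncrement)
                return i;
        }
        return -1;
    }
}
```

Calling `AutoNumberFinder.FindAutoIncrementIndex(test)` after the `FillSchema` call gives the position of the auto-number column, if the provider reports one.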