Reading Excel in c# where some columns are empty - c#

I have an Excel template like this
and I have some problems reading this (I can't use 3rd-party libraries). My solution:
public partial class CaricaDocumento : System.Web.UI.Page
{
static string HDR; // "Yes" indicates that the first row contains column names, not data
Regex regex = new Regex("([0-9]+)(.*)");
protected void Page_Load(object sender, EventArgs e)
{
string ind = "C:\\My_Template.xlsx";
string sheetName = "Page1";
DataTable dt = FromXLSX(ind, sheetName, true);
DataToView(dt);
}
// Bind data to the page
private void DataToView(DataTable dt)
{
LblFattura.Text = GetValue("AO10", dt);
LblDataFattura.Text = GetValue("AX10", dt);
LblCognomeOrRagioneSociale.Text = GetValue("B18", dt);
LblNome.Text = GetValue("AB18", dt);
}
// return the value from the cell, indicate a code like "A1", "B3", "AO10"...
public string GetValue(string codeCell, DataTable dt)
{
string[] substrings = regex.Split(codeCell);
string letterString = substrings[0]; // 'A' or 'B' ... 'AZ' ...
int letter = ColumnLetterToNumber(letterString); // transform the letter in a column index
int num = 1;
if (HDR == "Yes")
num = 2;
// if the first row is an header, do -2
// if the first row is a simple data row, do -1
int number = Int32.Parse(substrings[1]) - num; // the right row index
return dt.Rows[number][letter].ToString();
}
// transform the letter in a column index
public static int ColumnLetterToNumber(string columnName)
{
if (string.IsNullOrEmpty(columnName)) throw new ArgumentNullException("columnName");
columnName = columnName.ToUpperInvariant();
int sum = 0;
for (int i = 0; i < columnName.Length; i++)
{
sum *= 26;
char letter = columnName[i];
sum += (letter - ('A' - 1));
}
sum--;
return sum;
}
// return the DataTable
public static DataTable FromXLSX(string filePath, string sheet, bool hasHeaders)
{
try
{
// Create the new datatable.
DataTable dtexcel = new DataTable();
// Define the SQL for querying the Excel spreadsheet.
HDR = hasHeaders ? "Yes" : "No"; // "HDR=Yes;" indicates that the first row contains column names, not data
string IMEX = "1";
string strConn = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filePath
+ ";Extended Properties=\"Excel 12.0;HDR=" + HDR + ";IMEX=" + IMEX + ";\"";
// Create connection:
OleDbConnection conn = new OleDbConnection(strConn);
conn.Open();
if (!sheet.EndsWith("_"))
{
// Query data from the sheet
string query = "SELECT * FROM [" + sheet + "$]";
OleDbDataAdapter daexcel = new OleDbDataAdapter(query, conn);
dtexcel.Locale = CultureInfo.CurrentCulture;
// Fill the datatable:
daexcel.Fill(dtexcel);
}
// Close connection.
conn.Close();
// Set the datatable.
return dtexcel;
}
catch { throw; }
}
}
But I have noticed this issue: if the data doesn't start in column 'A', the DataTable reads data starting from the first column that contains data! This is a nightmare for the indexes. For example:
... in this case column 'A' is ignored (the DataTable takes data starting from 'B'), and this invalidates the use of cell codes (like "A1", "B5", "AO11"...) because the index returned by ColumnLetterToNumber(string columnName) is shifted.
Does anyone know how I can force the DataTable to take data starting from column 'A'? Or are there alternative ways to get data from Excel using cell codes?

You can use this query:
string query = "SELECT NULL AS EmptyColumn, * FROM [" + sheet + "$]";
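A minimal sketch of how the query text might be built. PaddedQuery reproduces the query from the answer above. RangeQuery is an alternative that assumes the provider accepts an explicit A1-style range after the sheet name and returns columns positioned from the first column of that range; this is worth verifying against your own file. The class name, the method names, and the AZ/200 bounds in the usage note are hypothetical.

```csharp
using System;

public static class ExcelQueryBuilder
{
    // Query from the answer above: prepends one NULL column so that
    // index 0 of the DataTable corresponds to the skipped column 'A'.
    public static string PaddedQuery(string sheet)
    {
        return "SELECT NULL AS EmptyColumn, * FROM [" + sheet + "$]";
    }

    // Alternative (assumption): name an explicit range that starts at A1,
    // so column positions no longer depend on where the data begins.
    public static string RangeQuery(string sheet, string lastColumn, int lastRow)
    {
        return "SELECT * FROM [" + sheet + "$A1:" + lastColumn + lastRow + "]";
    }
}
```

For the template in the question, the call could look like ExcelQueryBuilder.RangeQuery("Page1", "AZ", 200), with the bounds chosen to cover every cell code that GetValue is asked for.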

Related

How to add conditional data from a datatable into another datatable. [error: " no row at position 0] . C#

I am new to programming and was given the job of creating a tool to convert a .DBF table into a .csv file.
Here is the scenario:
The dbf table 'Poles' contains five fields: 'pole_id', 'guy_hoa_1', 'guy_hoa_2', 'guy_hoa_3' and 'guy_hoa_4'.
The final csv file should show the values in only two columns, 'PoleId' and 'HOA', where PoleId = pole_id and HOA = guy_hoa_1 + '|' + guy_hoa_2 + '|' + guy_hoa_3 + '|' + guy_hoa_4.
For example, the Poles table will have data like:
Sample data of Poles table
And the output csv file should show data as follows:
Sample Output CSV file
*pole_id is the main field, and the values of the other fields are selected based on it.
So far I have managed to write the following code:
string str = textBox1.Text;
string path = str.Substring(0, str.LastIndexOf("\\") + 1);
string conn = "Provider=Microsoft.Jet.OLEDB.4.0; Data Source = '" + path + "';Extended Properties=dBase IV;User ID=Admin;Password=;";
OleDbConnection connection = new OleDbConnection();
connection.ConnectionString = conn;
connection.Open();
CheckConnectionLabel.Text = "Connected Successfully";
OleDbDataAdapter adapter = new OleDbDataAdapter(@"SELECT pole_id, guy_hoa_1, guy_hoa_2,guy_hoa_3,guy_hoa_4 FROM poles" + ".dbf", connection);
DataSet ds = new DataSet();
DataTable dt = new DataTable();
adapter.Fill(dt);
DataTable dt1 = dt.AsEnumerable()
.Where(r=> r.Field<string>("pole_id")!= null)
.Where(r=> r.Field<string>("pole_id")!=" ")
.CopyToDataTable();
DataTable dtTemp = new DataTable();
dtTemp.Columns.Add("PoleId", typeof(String));
dtTemp.Columns.Add("HOA", typeof(string));
string x = string.Empty;
for (int i=0;i< dt1.Rows.Count;i++)
{
if(dt1.Rows[i]["pole_id"]!= null || dt1.Rows[i]["pole_id"].ToString()!= "")
{
if(dt1.Rows[i]["guy_hoa_1"]!=null && dt1.Rows[i]["guy_hoa_1"].ToString()!="")
{
x =dt1.Rows[i]["guy_hoa_1"].ToString();
}
if(dt1.Rows[i]["guy_hoa_2"]!= null && dt1.Rows[i]["guy_hoa_2"].ToString()!="")
{
x = x + "|" + dt1.Rows[i]["guy_hoa_2"].ToString();
}
if(dt1.Rows[i]["guy_hoa_3"]!=null && dt1.Rows[i]["guy_hoa_3"].ToString()!= "")
{
x = x + "|" + dt1.Rows[i]["guy_hoa_3"].ToString();
}
if(dt1.Rows[i]["guy_hoa_4"]!=null && dt1.Rows[i]["guy_hoa_4"].ToString()!= "")
{
x = x + "|" + dt1.Rows[i]["guy_hoa_4"].ToString();
}
dtTemp.Rows[i]["PoleId"] = dt1.Rows[i]["poles_id"].ToString();
dtTemp.Rows[i]["HOA"] = x ;
}
}
connection.Close();
dataGridView1.DataSource = dtTemp;
}
catch (Exception ex)
{
MessageBox.Show("Error " + ex.Message);
}
}
So, with the code above I connect to the dbf table and collect the required data in the 'dt' table. Then I filter the data by removing the rows where pole_id is blank or null and put the result in another table, 'dt1'. My goal is to check the conditions against dt1 and then fill rows in the dtTemp table, which is later displayed in the DataGridView.
The code computes the value of x correctly up to the last if statement; however, nothing gets filled into the dtTemp DataTable, and then this error is shown.
Please help me and let me know where I am going wrong. Many thanks in advance!
I found the solution: dtTemp has no rows yet, so indexing dtTemp.Rows[i] fails, and a row has to be added instead:
object y = dt1.Rows[i]["pole_id"].ToString();
dtTemp.NewRow();
dtTemp.Rows.Add(y ,x);
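The fix above can be sketched as a complete helper. This is only an illustration under stated assumptions: the class and method names are hypothetical, blank HOA values are assumed to be skipped rather than emitted as empty segments, and the accumulator is rebuilt for every row so that a value from one pole cannot leak into the next.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class PoleExport
{
    // Builds the two-column table (PoleId, HOA) from the rows read out of the dbf file.
    public static DataTable BuildHoaTable(DataTable source)
    {
        var result = new DataTable();
        result.Columns.Add("PoleId", typeof(string));
        result.Columns.Add("HOA", typeof(string));
        string[] hoaColumns = { "guy_hoa_1", "guy_hoa_2", "guy_hoa_3", "guy_hoa_4" };

        foreach (DataRow row in source.Rows)
        {
            // Convert.ToString turns DBNull into "", so one check covers null and blank ids
            string poleId = Convert.ToString(row["pole_id"]);
            if (string.IsNullOrWhiteSpace(poleId)) continue;

            // Fresh list per row; only non-blank values are joined (assumption)
            var parts = new List<string>();
            foreach (string column in hoaColumns)
            {
                string value = Convert.ToString(row[column]);
                if (!string.IsNullOrWhiteSpace(value)) parts.Add(value);
            }

            // Rows.Add creates the row and appends it in one call
            result.Rows.Add(poleId, string.Join("|", parts));
        }
        return result;
    }
}
```

The returned table can be assigned to dataGridView1.DataSource in place of dtTemp.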

How do you programmatically check if a spreadsheet has headers in C#

I am creating a WinForms application where, every day, a user selects an xlsx file with the day's shipping information to be merged with our invoicing data.
The challenge is that the user does not always download the xlsx file in the format the application requires. (I wish I could eliminate this step with an API connection, but sadly I cannot.)
My first step is checking whether the xlsx file has headers, so that I know my file path is valid.
Example
string connString = "provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + path + "';Extended Properties='Excel 12.0;HDR=YES;';";
where path is returned from an OpenFileDialog.
If the chosen file wasn't downloaded with headers, the statement above throws an exception.
If I change HDR=YES; to HDR=NO;, I have trouble identifying the columns I need and whether the user bothered to include the correct ones.
My code then tries to load the data into a DataTable
private void loadRows()
{
for (int i = 0; i < deliveryTable.Rows.Count; i++)
{
DataRow dr = deliveryTable.Rows[i];
int deliveryId = 0;
bool result = int.TryParse(dr[0].ToString(), out deliveryId);
if (deliveryId > 1 && !Deliveries.ContainsKey(deliveryId))
{
var delivery = new Delivery(deliveryId)
{
SalesOrg = Convert.ToInt32(dr[8]),
SoldTo = Convert.ToInt32(dr[9]),
SoldName = dr[10].ToString(),
ShipTo = Convert.ToInt32(dr[11]),
ShipName = dr[12].ToString(),
};
This all works only if the columns are in the right place.
If they are not in the right place, my thought is to display a message asking the user to get the right information.
Does anyone have any suggestions?
(Sorry, first time posting a question and still learning to think through it)
I guess you're loading the spreadsheet into a DataTable? Hard to tell with one line of code. I would use the Columns collection of the DataTable and check whether all the columns you want are there. Sample code to enumerate the columns is below.
private void PrintValues(DataTable table)
{
foreach(DataRow row in table.Rows)
{
foreach(DataColumn column in table.Columns)
{
Console.WriteLine(row[column]);
}
}
}
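The column check suggested above could look roughly like this. It is a sketch, not the answerer's code: the class and method names are hypothetical, and the header names passed in by the caller are placeholders for whatever the real export uses.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class HeaderCheck
{
    // Returns the required header names that are absent from the table,
    // so the caller can show them to the user in an error message.
    public static List<string> MissingColumns(DataTable table, IEnumerable<string> required)
    {
        return required.Where(name => !table.Columns.Contains(name)).ToList();
    }
}
```

If the returned list is not empty, the form can report the missing headers (for example with string.Join(", ", missing)) instead of reading cells by fixed position.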
private void GetExcelSheetForUpload(string PathName, string UploadExcelName)
{
string excelFile = "DateExcel/" + PathName;
OleDbConnection objConn = null;
System.Data.DataTable dt = null;
try
{
DataSet dss = new DataSet();
String connString = "Provider=Microsoft.ACE.OLEDB.12.0;Persist Security Info=True;Extended Properties=Excel 12.0 Xml;Data Source=" + PathName;
objConn = new OleDbConnection(connString);
objConn.Open();
dt = objConn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
if (dt == null)
{
return;
}
String[] excelSheets = new String[dt.Rows.Count];
int i = 0;
foreach (DataRow row in dt.Rows)
{
if (i == 0)
{
excelSheets[i] = row["TABLE_NAME"].ToString();
OleDbCommand cmd = new OleDbCommand("SELECT * FROM [" + excelSheets[i] + "]", objConn);
OleDbDataAdapter oleda = new OleDbDataAdapter();
oleda.SelectCommand = cmd;
oleda.Fill(dss, "TABLE");
}
i++;
}
grdExcel.DataSource = dss.Tables[0].DefaultView;
grdExcel.DataBind();
lblTotalRec.InnerText = Convert.ToString(grdExcel.Rows.Count);
}
catch (Exception ex)
{
ViewState["Fuletypeidlist"] = "0";
grdExcel.DataSource = null;
grdExcel.DataBind();
}
finally
{
if (objConn != null)
{
objConn.Close();
objConn.Dispose();
}
if (dt != null)
{
dt.Dispose();
}
}
}
if (grdExcel.HeaderRow.Cells[0].Text.ToString() == "CODE")
{
GetExcelSheetForEmpl(PathName);
}
else
{
divStatusMsg.Style.Add("display", "");
divStatusMsg.Attributes.Add("class", "alert alert-danger alert-dismissable");
divStatusMsg.InnerText = "ERROR !!... Upload Excel Sheet in header Defined Format ";
}

Can't sort excel worksheet using C#

I want to programmatically sort an excel worksheet using C# but the code I used doesn't work:
//the largest size of sheet in Excel 2010
int maxRowAmount = 1048576;
int maxColAmount = 16384;
//Sort by the values in column J
sourceWorkSheet.Sort.SortFields.Add(sourceWorkSheet.Range["J:J"], XlSortOn.xlSortOnValues, XlSortOrder.xlAscending, XlSortDataOption.xlSortNormal);
//Find out the last used row and column, then set the range to sort,
//the range is from cell[2,1](top left) to the bottom right corner
int lastUsedRow=sourceWorkSheet.Cells[maxRowAmount, 1].End[XlDirection.xlUp].Row;
int lastUsedColumn=sourceWorkSheet.Cells[2, maxColAmount].End[XlDirection.xlToLeft].Column;
Range r = sourceWorkSheet.Range[sourceWorkSheet.Cells[2, 1], sourceWorkSheet.Cells[lastUsedRow,lastUsedColumn ]];
sourceWorkSheet.Sort.SetRange(r);
//Sort!
sourceWorkSheet.Sort.Apply();
I debugged it by printing the values in column "J" in a message box, and the result is not sorted:
//print out the sorted result
Range firstColumn = sourceWorkSheet.UsedRange.Columns[10];
System.Array myvalues = (System.Array)firstColumn.Cells.Value;
string[] cmItem = myvalues.OfType<object>().Select(o => o.ToString()).ToArray();
String msg="";
for (int i = 0; i < 30; i++)
{
msg = msg + cmItem[i] + "\n";
}
MessageBox.Show(msg);
Why is it not working?
Thanks
The solution is to put a
sourceWorkSheet.Sort.SortFields.Clear();
before
sourceWorkSheet.Sort.SortFields.Add(sourceWorkSheet.Range["J:J"], XlSortOn.xlSortOnValues, XlSortOrder.xlAscending, XlSortDataOption.xlSortNormal);
Your code opens the Excel file and then reads from it, so the sheets are read in their original order (not sorted alphabetically).
You can use the following code to get the sheets sorted.
OleDbConnection connection = new OleDbConnection(string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0}; Extended Properties=\"Excel 8.0;HDR=No;\"", filePath));
OleDbCommand command = new OleDbCommand();
DataTable tableOfData = null;
command.Connection = connection;
try
{
connection.Open();
tableOfData = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
string tablename = tableOfData.Rows[0]["TABLE_NAME"].ToString();
tableOfData = new DataTable();
command.CommandText = "Select * FROM [" + tablename + "]";
tableOfData.Load(command.ExecuteReader());
}
catch (Exception ex)
{
}

In Excel how to search a value in a column and get all the values in that row using C#

I am searching for a value in column "C" and getting the name of the matching cell, for example C14. How can I then select the values in row 14?
I tried:
private static MyObject GetRowValue(int rowNumber)
{
string connString = "";
string path = "C:\\Code\\MyFile.xls";
connString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + path + ";Extended Properties=\"Excel 8.0;HDR=Yes;IMEX=2\"";
string query = "SELECT * FROM [Sheet1$A" + rowNumber + ":BD" + rowNumber + "]";
using (OleDbConnection connection = new OleDbConnection(connString))
{
var adapter = new OleDbDataAdapter(query, connection);
DataSet ds = new DataSet();
adapter.Fill(ds);
DataTable dt = ds.Tables[0];
}
}
If the row number is 10, I am trying to get all the values of the 10th row only, but it returns all the rows after the 10th row.
Just use this query:
string query = @"SELECT * FROM [Sheet1$"+ (rowNumber-1) + ":" + (rowNumber) + "]";
If rowNumber=10 then you get all the values from the 10th row (with HDR=Yes, row 9 is consumed as the header row).
Was this helpful?
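If it were me, I'd let Excel do the work for me. You'd need a reference to Microsoft.Office.Interop.Excel (aliased below as Excel).
// using Excel = Microsoft.Office.Interop.Excel;
private static void ReadRows(string workbookPath, string searchValue, int startRow)
{
    int r = startRow;
    Excel.Application xl = new Excel.Application();
    // workbookPath is a hypothetical parameter standing in for "your workbook"
    Excel.Workbook wb = xl.Workbooks.Open(workbookPath);
    Excel.Worksheet ws = (Excel.Worksheet)wb.Worksheets[1];
    do
    {
        if (CellText(ws, r, 3) == searchValue)
        {
            // read the entire row
            string colA = CellText(ws, r, 1);
            string colB = CellText(ws, r, 2);
            //...
            // or loop through all columns until the first empty cell
            int c = 1;
            do
            {
                // add cell value to some collection
                c++;
            } while (CellText(ws, r, c) != "");
        }
        r++;
    } while (CellText(ws, r, 3) != ""); // 3 because you want column C
    wb.Close(false);
    xl.Quit();
}
// Returns the cell's value as text, or "" when the cell is empty
private static string CellText(Excel.Worksheet ws, int row, int col)
{
    object value = ((Excel.Range)ws.Cells[row, col]).Value2;
    return value == null ? "" : value.ToString();
}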

Reading columns from Excel, reformat cells

I am currently trying to read cells from an Excel spreadsheet, and it seems to reformat cells when I don't want it to. I want everything to come through as plain text. I have read a couple of solutions to this problem and implemented them, but I still have the same issue.
The reader turns dates into numbers and numbers into dates.
Example:
Friday, January 29, 2016 comes out to be : 42398
and
40.00 comes out to be : 2/9/1900 12:00:00 AM
code:
string stringconn = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + files[0] + ";Extended Properties=\"Excel 12.0;IMEX=1;HDR=NO;TypeGuessRows=0;ImportMixedTypes=Text\"";
try {
OleDbConnection conn = new OleDbConnection(stringconn);
OleDbDataAdapter da = new OleDbDataAdapter("SELECT * FROM [CUAnswers$]", conn);
DataTable dt = new DataTable();
try {
printdt(dt);
I have tried
IMEX=0;
HDR=NO;
TypeGuessRows=1;
This is how I am printing out the sheet
public void printdt(DataTable dt) {
int counter1 = 0;
int counter2 = 0;
string temp = "";
foreach (DataRow dataRow in dt.Rows) {
foreach (var item in dataRow.ItemArray) {
temp += " ["+counter1+"]["+counter2+"]"+ item +", ";
counter2++;
}
counter1++;
logger.Debug(temp);
temp = "";
counter2 = 0;
}
}
I had a similar problem, except it was using Interop to read the Excel spreadsheet. This worked for me:
var value = (range.Cells[rowCnt, columnCnt] as Range).Value2;
string str = value as string;
DateTime dt;
if (DateTime.TryParse((value ?? "").ToString(), out dt))
{
// Use the cell value as a datetime
}
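One detail that may help with the snippet above: for a date cell, Value2 typically comes back as a double holding the serial date (such as the 42398 from the question), and DateTime.TryParse will not accept that number as text. A hedged sketch of a conversion helper follows; the class and method names are hypothetical, and it assumes the caller already knows the cell is supposed to hold a date, since a plain numeric cell also arrives as a double.

```csharp
using System;

public static class ExcelDates
{
    // Converts a cell value to a DateTime, or returns null when it cannot be read as a date.
    public static DateTime? ToDateTime(object value2)
    {
        // Serial date numbers: DateTime.FromOADate understands this representation
        if (value2 is double) return DateTime.FromOADate((double)value2);

        // Text cells: fall back to normal (culture-dependent) parsing
        DateTime parsed;
        if (value2 != null && DateTime.TryParse(value2.ToString(), out parsed)) return parsed;

        return null;
    }
}
```

With the value from the question, ExcelDates.ToDateTime(42398.0) gives January 29, 2016.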
Edited to add new ideas
I was going to suggest saving the spreadsheet as comma-separated values. Then Excel converts the cells to text. It is easy to parse a CSV in C#.
That led me to think of how to programmatically do the conversion, which is covered in Convert xls to csv programmatically. Maybe the code in the accepted answer is what you are looking for:
string ExcelFilename = "c:\\ExcelFile.xls";
DataTable worksheets;
string connectionString = @"Provider=Microsoft.Jet.OLEDB.4.0;" + @"Data Source=" + ExcelFilename + ";" + @"Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1""";
using (OleDbConnection connection = new OleDbConnection(connectionString))
{
connection.Open();
worksheets = connection.GetSchema("Tables");
foreach (DataRow row in worksheets.Rows)
{
// For Sheets: 0=Table_Catalog,1=Table_Schema,2=Table_Name,3=Table_Type
// For Columns: 0=Table_Name, 1=Column_Name, 2=Ordinal_Position
string SheetName = (string)row[2];
OleDbCommand command = new OleDbCommand(@"SELECT * FROM [" + SheetName + "]", connection);
OleDbDataAdapter oleAdapter = new OleDbDataAdapter();
oleAdapter.SelectCommand = command;
DataTable dt = new DataTable();
oleAdapter.FillSchema(dt, SchemaType.Source);
oleAdapter.Fill(dt);
for (int r = 0; r < dt.Rows.Count; r++)
{
DataRow dr = dt.Rows[r]; // current row
string type1 = dr[1].GetType().ToString();
string type2 = dr[2].GetType().ToString();
string type3 = dr[3].GetType().ToString();
string type4 = dr[4].GetType().ToString();
string type5 = dr[5].GetType().ToString();
string type6 = dr[6].GetType().ToString();
string type7 = dr[7].GetType().ToString();
}
}
}
