OpenXML Only reading 4 columns - c#

I have the following piece of code in a much larger OpenXML Excel reader. The reader grabs the information, assigns it to a dataset, and then displays it in a DataGridView:
public static DataTable ExtractExcelSheetValuesToDataTable(string xlsxFilePath, string sheetName, int startingRow) {
DataTable dt = new DataTable();
using (SpreadsheetDocument myWorkbook = SpreadsheetDocument.Open(xlsxFilePath, true)) {
//Access the main Workbook part, which contains data
WorkbookPart workbookPart = myWorkbook.WorkbookPart;
WorksheetPart worksheetPart = null;
if (!string.IsNullOrEmpty(sheetName)) {
Sheet ss = workbookPart.Workbook.Descendants<Sheet>().Where(s => s.Name == sheetName).SingleOrDefault<Sheet>();
worksheetPart = (WorksheetPart)workbookPart.GetPartById(ss.Id);
} else {
worksheetPart = workbookPart.WorksheetParts.FirstOrDefault();
}
SharedStringTablePart stringTablePart = workbookPart.SharedStringTablePart;
if (worksheetPart != null) {
Row lastRow = worksheetPart.Worksheet.Descendants<Row>().LastOrDefault();
#region ColumnCreation
//Returns the columns - come back to this later - may be able to modify this to have
//A checkbox "Column names in first row"
Row firstRow = worksheetPart.Worksheet.Descendants<Row>().FirstOrDefault();
int columnInt = 0;
//if (firstRow != null)
//{
foreach (Cell c in firstRow.ChildElements)
{
string value = GetValue(c, stringTablePart);
dt.Columns.Add(columnInt + ": " + value);
columnInt++;
}
//}
#endregion
#region Create Rows
//if (lastRow != null)
//{
//lastRow.RowIndex;
for (int i = startingRow; i <= 150000; i++)
{
DataRow dr = dt.NewRow();
bool empty = true;
Row row = worksheetPart.Worksheet.Descendants<Row>().Where(r => i == r.RowIndex).FirstOrDefault();
int j = 0;
if (row != null)
{
foreach (Cell c in row.ChildElements)
{
//Get cell value
string value = GetValue(c, stringTablePart);
if (!string.IsNullOrEmpty(value) && value != "")
empty = false;
dr[j] = value;
j++;
if (j == dt.Columns.Count)
break;
}
if (empty)
break;
dt.Rows.Add(dr);
}
}
}
#endregion
}
// }
return dt;
}
public static string GetValue(Cell cell, SharedStringTablePart stringTablePart) {
if (cell.ChildElements.Count == 0) return null;
//get cell value
string value = cell.ElementAt(0).InnerText;//CellValue.InnerText;
//Look up real value from shared string table
if ((cell.DataType != null) && (cell.DataType == CellValues.SharedString))
value = stringTablePart.SharedStringTable.ChildElements[Int32.Parse(value)].InnerText;
return value;
}
public void GetSheetInfo(string fileName)
{
Sheets theSheets = null;
// Open file as read-only.
using (SpreadsheetDocument mySpreadsheet = SpreadsheetDocument.Open(fileName, false))
{
Sheets sheets = mySpreadsheet.WorkbookPart.Workbook.Sheets;
WorkbookPart wbPart = mySpreadsheet.WorkbookPart;
theSheets = wbPart.Workbook.Sheets;
foreach (Sheet item in theSheets)
{
cmbSheetSelect.Items.Add(item.Name);
}
}
}
This has worked for basic spreadsheets, but as I try to read more advanced ones I run into a problem or two.
Firstly, I have a worksheet that has 5 columns (A to E).
However, when I run my program it only returns the first 4 columns and not column E or any of its data.
Secondly, is it possible, using that code (or a variation of it), to specify which row the program should use for the DataGridView column headings?

In case anyone needs this, I found that changing:
Row firstRow = worksheetPart.Worksheet.Descendants<Row>().FirstOrDefault();
to
Row firstRow = worksheetPart.Worksheet.Descendants<Row>().ElementAtOrDefault(columnIndex);
worked, with columnIndex being a variable I can change based on the sheet selected.
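One caveat with this fix, offered as a hedged aside: ElementAtOrDefault(n) selects by zero-based position among the row elements that are actually present in the sheet XML. That is not necessarily the same as the Excel row number held in RowIndex, because rows with no content may be left out of the file. The sketch below uses plain integers in place of Row objects (an assumption, so that it runs without the OpenXML package) to show the difference. RowLookupSketch and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RowLookupSketch
{
    // Stand-in for Descendants<Row>(): only the row numbers that exist in the XML.
    // Returns the row number found at a zero-based position, or 0 if out of range.
    public static int ByPosition(IEnumerable<int> presentRowNumbers, int position)
    {
        return presentRowNumbers.ElementAtOrDefault(position);
    }

    // Returns the requested row number if such a row exists, otherwise 0.
    public static int ByRowNumber(IEnumerable<int> presentRowNumbers, int rowNumber)
    {
        return presentRowNumbers.FirstOrDefault(r => r == rowNumber);
    }
}
```

If only rows 1, 2, 5 and 6 are present, position 2 yields row 5, while asking for row number 3 finds nothing. With real Row objects, the lookup by number would probably be Descendants<Row>().FirstOrDefault(r => r.RowIndex == headerRowNumber), where headerRowNumber is a hypothetical variable holding the Excel row number.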

Related

OpenXML document with repeating row won't open (in OpenOffice)

I'm creating a new OpenXML document. When I write unique data/rows to the document, I can open it in a variety of programs. When I write a non-unique row and attempt to open the document in Apache OpenOffice, I get the error "General Error. General input/output error." Obviously this isn't very descriptive, so I'm assuming I'm creating my document wrong, but I'm not sure what is missing or wrong.
Things I've tried:
1. The solution listed in the OpenOffice documentation.
2. Using the OpenXmlValidator. This doesn't return any errors.
3. Opening in different software: Microsoft Office Excel Viewer and LibreOffice Calc. The file opens in these, but the machines running this code don't have this software installed.
4. The weird fix:
   a. Rename a.xlsx to a.zip.
   b. Extract the contents from the zip file.
   c. Zip up all the contents (using WinRAR and the Windows compressed zipped folder) as b.zip.
   d. Rename b.zip to b.xlsx.
   The file now opens in OpenOffice without any error.
Doing a diff on the unzipped files shows no differences. Doing a diff on a.xlsx and b.xlsx does show differences, but nothing that makes sense to me.
The Code:
static void Main(string[] args) {
var thing = new MyClass();
thing.GenerateDoc();
Console.WriteLine("Press any key to exit");
Console.ReadKey();
}
public class MyClass {
public MyClass() { }
public void GenerateDoc() {
var xmlFileString = "Temp.xlsx";
var sheetName = "sheetName";
var OpenXMLAlwaysPrintHeader = true;
try {
bool fileExists = System.IO.File.Exists(xmlFileString);
if (!fileExists) {
// check for a blank file template and copy that if it exists
CreateSpreadsheetWorkbook(xmlFileString, sheetName);
}
fileExists = System.IO.File.Exists(xmlFileString);
if (fileExists) {
UInt32 RowIndex;
using (var doc = SpreadsheetDocument.Open(xmlFileString, true)) {
// Check to see if the sheet we are adding data to exists
var workbookPart = doc.WorkbookPart;
WorksheetPart worksheetPart = workbookPart.WorksheetParts.FirstOrDefault();
try {
worksheetPart = GetWorksheetPart(workbookPart, sheetName);
}
catch (Exception) { }
var sheet = worksheetPart.Worksheet ?? new Worksheet();
var sheetData = sheet.Elements<SheetData>().First();
var t = sheetData.Elements<Row>();
Row eHeader = null;
if (t.Count() > 0) {
eHeader = t?.First(); // This should be the first row ( the header or key of each item in the dict)
}
RowIndex = (uint)t.Count() + 1;
// Create the table of all strings if it doesnt exist
SharedStringTablePart shareStringPart;
if (doc.WorkbookPart.GetPartsOfType<SharedStringTablePart>().Count() > 0)
shareStringPart = doc.WorkbookPart.GetPartsOfType<SharedStringTablePart>().First();
else
shareStringPart = doc.WorkbookPart.AddNewPart<SharedStringTablePart>();
// Create a row for the header and the values (referring to the keys and the values in the dict)
Row header = new Row() { RowIndex = RowIndex++ };
Row data = new Row() { RowIndex = RowIndex };
// If we are not re-printing the header than the data row needs to shift up by 1
if (!OpenXMLAlwaysPrintHeader && t.Count() > 0)
data.RowIndex = --data.RowIndex;
var ColIndex = 1;
// Create the DefinedNames part which is a dictonary of "string" to a range of cells
// todo fix the next line
// This deletes all other pre-defined names since I can't figure out how to update a defined name yet
var dfns = new DefinedNames();
workbookPart.Workbook.DefinedNames = dfns;
/*
data.RowIndex = data.RowIndex - 1;
// Row 1
InsertObjectAt("Test1", data.RowIndex, ColIndex++, worksheetPart, shareStringPart);
InsertObjectAt("TestB", data.RowIndex++, ColIndex, worksheetPart, shareStringPart);
// Row 2
InsertObjectAt("Test2", data.RowIndex, --ColIndex, worksheetPart, shareStringPart);
InsertObjectAt("TestB", data.RowIndex++, ++ColIndex, worksheetPart, shareStringPart);
// Row 3
InsertObjectAt("Test1", data.RowIndex, --ColIndex, worksheetPart, shareStringPart);
InsertObjectAt("TestB1", data.RowIndex++, ++ColIndex, worksheetPart, shareStringPart);
// Row 4
InsertObjectAt("Test2", data.RowIndex, --ColIndex, worksheetPart, shareStringPart);
InsertObjectAt("TestB1", data.RowIndex++, ++ColIndex, worksheetPart, shareStringPart);
// */
sheet.SheetDimension = new SheetDimension() { Reference = "A1:B4" };
for (int i = 0; i < 2; i++) {
if (!OpenXMLAlwaysPrintHeader && t.Count() > 0) // Look up which column we want to insert our value into
{
var indexOfItem = InsertSharedStringItem("Key", shareStringPart);
var cells = eHeader.Elements<Cell>().Where(x => x.CellValue.InnerText == indexOfItem.ToString());
if (cells.Count() < 1) continue;
var cell = cells.First();
ColIndex = GetColumnIndex(cell?.CellReference).Value;
} // Otherwise we are always inserting a header so don't bother looking up where things should go
else {
//Insert for the header
InsertObjectAt(ColIndex%2 == 0 ? "TestB" : "Test1", header.RowIndex, ColIndex, worksheetPart, shareStringPart);
}
// Insert for the data
if (RowIndex == 2) {
InsertObjectAt((i % 2 == 0 ? "Test2" : "TestC"), data.RowIndex, ColIndex++, worksheetPart, shareStringPart);
}else {
InsertObjectAt((i % 2 == 0 ? "Test2" : "TestD"), data.RowIndex, ColIndex++, worksheetPart, shareStringPart);
}
/*
if (!OpenXMLAlwaysPrintHeader) // If we are not always printing a header we can create a named range for the column
CreateRange(workbookPart, "key", sheetName, data.RowIndex, ColIndex - 1);
// */
}
}
}
var validator = new OpenXmlValidator();
int count = 0;
var stringbuilder = new StringBuilder();
foreach (ValidationErrorInfo error in validator.Validate(SpreadsheetDocument.Open(xmlFileString, true))) {
stringbuilder.Append("\r\n");
count++;
stringbuilder.Append(("Error Count : " + count) + "\r\n");
stringbuilder.Append(("Description : " + error.Description) + "\r\n");
stringbuilder.Append(("Path: " + error.Path.XPath) + "\r\n");
stringbuilder.Append(("Part: " + error.Part.Uri) + "\r\n");
}
}
catch (Exception e) {
e = e;
}
}
private string GetExcelColumnName(int columnNumber) {
int dividend = columnNumber;
string columnName = String.Empty;
int modulo;
while (dividend > 0) {
modulo = (dividend - 1) % 26;
columnName = Convert.ToChar(65 + modulo).ToString() + columnName;
dividend = (int)((dividend - modulo) / 26);
}
return columnName;
}
public void CreateRange(WorkbookPart wbPart, string Name, string SheetName, uint RowIndex, int ColIndex) {
var definedNames = wbPart.Workbook.DefinedNames;
var myLocation = GetExcelColumnName(ColIndex) + RowIndex.ToString();
var Col = GetExcelColumnName(ColIndex);
var Text = string.Format("{0}!${1}${2}:${3}${4}", SheetName, Col, 2, Col, RowIndex);
var colRange = new DefinedName { Name = Name, Text = Text };
wbPart.Workbook.DefinedNames?.Append(colRange);
}
private static int? GetColumnIndex(string cellReference) {
if (string.IsNullOrEmpty(cellReference)) {
return null;
}
//remove digits
string columnReference = Regex.Replace(cellReference.ToUpper(), @"[\d]", string.Empty);
int columnNumber = -1;
int mulitplier = 1;
//working from the end of the letters take the ASCII code less 64 (so A = 1, B =2...etc)
//then multiply that number by our multiplier (which starts at 1)
//multiply our multiplier by 26 as there are 26 letters
foreach (char c in columnReference.ToCharArray().Reverse()) {
columnNumber += mulitplier * ((int)c - 64);
mulitplier = mulitplier * 26;
}
//the result is zero based so return columnnumber + 1 for a 1 based answer
//this will match Excel's COLUMN function
return columnNumber + 1;
}
private void InsertObjectAt(object item, uint RowIndex, int ColIndex, WorksheetPart worksheetPart, SharedStringTablePart sharedStringTablePart) {
if (item == null) return;
if (item is ICollection)
item = ICollectionToString(item as ICollection);
// Create the header cell
int index = InsertSharedStringItem(item.ToString(), sharedStringTablePart);
Cell c = InsertCellInWorksheet(GetExcelColumnName(ColIndex), RowIndex, worksheetPart);
c.CellValue = new CellValue(index.ToString());
c.DataType = new EnumValue<CellValues>(CellValues.SharedString);
}
private static int InsertSharedStringItem(string text, SharedStringTablePart shareStringPart) {
// If the part does not contain a SharedStringTable, create one.
if (shareStringPart.SharedStringTable == null) {
shareStringPart.SharedStringTable = new SharedStringTable();
}
int i = 0;
// Iterate through all the items in the SharedStringTable. If the text already exists, return its index.
foreach (SharedStringItem item in shareStringPart.SharedStringTable.Elements<SharedStringItem>()) {
if (item.InnerText == text) {
return i;
}
i++;
}
// The text does not exist in the part. Create the SharedStringItem and return its index.
shareStringPart.SharedStringTable.AppendChild(new SharedStringItem(new DocumentFormat.OpenXml.Spreadsheet.Text(text)));
shareStringPart.SharedStringTable.Save();
return i;
}
public static WorksheetPart GetWorksheetPart(WorkbookPart workbookPart, string sheetName) {
Sheet sheet = workbookPart.Workbook.Descendants<Sheet>().FirstOrDefault(s => s.Name == sheetName);
if (sheet == default(Sheet)) {
CreateSheet(workbookPart, sheetName);
}
return workbookPart.GetPartById(sheet.Id) as WorksheetPart;
}
public static void CreateSheet(WorkbookPart workbookPart, string sheetName) {
var sheets = workbookPart.Workbook.Descendants<Sheets>().FirstOrDefault();
if (sheets == default(Sheets))
sheets = workbookPart.Workbook.AppendChild(new Sheets());
var worksheetPart = workbookPart.AddNewPart<WorksheetPart>();
var sheetdata = new SheetData();
var worksheet = new Worksheet(sheetdata);
worksheetPart.Worksheet = worksheet;
var id = (UInt32)workbookPart.Workbook.Descendants<Sheet>().Count() + 1;
var sheet = new Sheet() { Id = workbookPart.GetIdOfPart(worksheetPart), SheetId = id, Name = sheetName };
sheets.AppendChild(sheet);
workbookPart.Workbook.Save();
}
public static void CreateSpreadsheetWorkbook(string filepath, string sheetName) {
// Create a spreadsheet document by supplying the filepath.
// By default, AutoSave = true, Editable = true, and Type = xlsx.
SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.
Create(filepath, SpreadsheetDocumentType.Workbook);
// Add a WorkbookPart to the document.
WorkbookPart workbookpart = spreadsheetDocument.AddWorkbookPart();
workbookpart.Workbook = new Workbook();
// Add a WorksheetPart to the WorkbookPart.
WorksheetPart worksheetPart = workbookpart.AddNewPart<WorksheetPart>();
worksheetPart.Worksheet = new Worksheet(new SheetData());
// Add Sheets to the Workbook.
Sheets sheets = spreadsheetDocument.WorkbookPart.Workbook.
AppendChild<Sheets>(new Sheets());
// Append a new worksheet and associate it with the workbook.
var id = (UInt32)workbookpart.Workbook.Descendants<Sheet>().Count() + 1;
Sheet sheet = new Sheet() {
Id = spreadsheetDocument.WorkbookPart.GetIdOfPart(worksheetPart),
SheetId = id,
Name = sheetName
};
sheets.Append(sheet);
workbookpart.Workbook.Save();
// Close the document.
spreadsheetDocument.Close();
}
private string ICollectionToString(ICollection item) {
try {
var result = string.Empty;
if (item is IDictionary) {
foreach (DictionaryEntry kvp in item as IDictionary) {
if (kvp.Value is ICollection)
result += kvp.Key + " { " + ICollectionToString(kvp.Value as ICollection) + " } ";
else
result += kvp.Key + " => " + kvp.Value + " |";
}
}
else if (item is IList) {
var serializer = new JavaScriptSerializer();
string thing = serializer.Serialize(item);
result += thing;
}
else {
// todo
}
return result;
}
catch (Exception e) {
}
return string.Empty;
}
}
// Given a column name, a row index, and a WorksheetPart, inserts a cell into the worksheet.
// If the cell already exists, returns it.
private static Cell InsertCellInWorksheet(string columnName, uint rowIndex, WorksheetPart worksheetPart) {
Worksheet worksheet = worksheetPart.Worksheet;
SheetData sheetData = worksheet.GetFirstChild<SheetData>();
string cellReference = columnName + rowIndex;
// If the worksheet does not contain a row with the specified row index, insert one.
Row row;
if (sheetData.Elements<Row>().Where(r => r.RowIndex == rowIndex).Count() != 0) {
row = sheetData.Elements<Row>().Where(r => r.RowIndex == rowIndex).First();
}
else {
row = new Row() { RowIndex = rowIndex, Spans = new ListValue<StringValue>() { InnerText = "1:2" } };
sheetData.Append(row);
}
// If there is not a cell with the specified column name, insert one.
if (row.Elements<Cell>().Where(c => c.CellReference.Value == columnName + rowIndex).Count() > 0) {
return row.Elements<Cell>().Where(c => c.CellReference.Value == cellReference).First();
}
else {
// Cells must be in sequential order according to CellReference. Determine where to insert the new cell.
Cell refCell = null;
foreach (Cell cell in row.Elements<Cell>()) {
if (cell.CellReference.Value.Length == cellReference.Length) {
if (string.Compare(cell.CellReference.Value, cellReference, true) > 0) {
refCell = cell;
break;
}
}
}
Cell newCell = new Cell() { CellReference = cellReference };
row.InsertBefore(newCell, refCell);
worksheet.Save();
return newCell;
}
}
}
Running the program once creates what I believe is a valid OpenXML document, which opens in Apache OpenOffice. Running the program twice adds two lines, one of which is not unique in the document. This causes the error to show up in OpenOffice, but not in the other programs (Excel Viewer and LibreOffice Calc).
Unfortunately, I need to use OpenOffice because it's what's installed on the computers, but I'm not sure what I am doing wrong when creating the document. Do I need to add something to the rows to indicate that they are duplicates?
Edit: To run the code you need the DocumentFormat.OpenXML NuGet package.
Edit1: This only occurs when running the program twice. If I just append 4 rows, two of which are identical, and then open the file, I have no issue. Note that the InsertObjectAt method also opens the document every time (once for each cell, so 4 rows by 2 columns = 8 times).

Converting Excel to a DataTable using OpenXML replaces the first empty column with the second column's data in C#

My Excel (XLSM) file starts with an empty first column, followed by a second column with values, and so on. When the file is converted, the empty column is replaced by the next column that contains data.
XLSM file: before uploading
XLSM file: after uploading, the data shifts into the empty column
How can I find the range or the total number of columns without the data shifting?
For example, when I count the columns it should display 3 (A2, B2, C2),
but the column total I get when converting does not match.
Below is the code:
private void Get_XLSM_Data(ref DataTable dt)
{
string strPath = Path.GetExtension(this.FilePath);
if (strPath != null && strPath.ToUpper() == ".XLSM")
{
using (SpreadsheetDocument spreadSheetDocument =
SpreadsheetDocument.Open(this.FilePath, true))
{
IEnumerable<Sheet> sheets = spreadSheetDocument.WorkbookPart.Workbook
.GetFirstChild<Sheets>().Elements<Sheet>();
string relationshipId = sheets.First().Id.Value;
WorksheetPart worksheetPart = (WorksheetPart)spreadSheetDocument
.WorkbookPart.GetPartById(relationshipId);
Worksheet workSheet = worksheetPart.Worksheet;
var dimensionReference = workSheet.SheetDimension.Reference;
var cellTablePart = workSheet.WorksheetPart.SingleCellTablePart;
SheetData sheetData = workSheet.GetFirstChild<SheetData>();
IEnumerable<Column> columnsDescendants = sheetData.Descendants<Column>();
IEnumerable<Row> rows = sheetData.Descendants<Row>();
var sheetIdValue = sheets.First().SheetId.Value;
// ReSharper disable once PossibleNullReferenceException
var column = workSheet.GetFirstChild<SheetData>().ChildElements.FirstOrDefault().ChildElements.Count();
if (dt.TableName == "specific table ")
{
dt.Columns.Clear();
for (int col = 1; col <= column; col++)
{
string colName = "Column" + (col);
dt.Columns.Add(colName);
}
//// START: To add Headers (First row) in data table
string[] rowData = new string[dt.Columns.Count];
int colIndex = 0;
foreach (Cell cell in rows.ElementAt(0))
{
rowData[colIndex] = GetCellValue(spreadSheetDocument, cell); colIndex++;
}
dt.Rows.Add(rowData);
//// END: To add Headers (First row) in data table
}
try
{
for (int i = 1; i < rows.Count(); i++)
{
string[] rowData = new string[dt.Columns.Count];
int col = 0;
foreach (Cell cell in rows.ElementAt(i))
{
rowData[col] = GetCellValue(spreadSheetDocument, cell); col++;
}
dt.Rows.Add(rowData);
}
}
}
}
}
public static string GetCellValue(SpreadsheetDocument document, Cell cell)
{
SharedStringTablePart stringTablePart = document.WorkbookPart.SharedStringTablePart;
if (cell != null)
{
string cellValue = cell.CellValue != null ? cell.CellValue.InnerXml : String.Empty;
if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
{
cellValue = stringTablePart.SharedStringTable.ChildElements[Int32.Parse(cellValue)].InnerText;
}
else
{
if(!string.IsNullOrEmpty(cellValue))
{
//return Convert.ToString(cellValue, CultureInfo.InvariantCulture);
return double.Parse(cellValue, CultureInfo.InvariantCulture).ToString();
}
return cellValue;
}
return cellValue;
}
return String.Empty;
}
Row row = worksheetPart.Worksheet.GetFirstChild<SheetData>().Elements<Row>().FirstOrDefault();
var totalnumberOfColumns = 0;
if (row != null)
{
var spans = row.Spans != null ? row.Spans.InnerText : "";
if (spans != String.Empty)
{
//spans.Split(':')[1];
string[] columns = spans.Split(':');
startcolumnInuse = int.Parse(columns[0]);
endColumnInUse = int.Parse(columns[1]);
totalnumberOfColumns = int.Parse(columns[1]);
}
}
The screenshot below shows how to find the maximum column present through the span, using the code I shared above.
Here I used a different Excel file (XLSM).
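As a rough sketch of the same idea applied to the sheet's dimension reference (the value the question reads via SheetDimension.Reference), the first and last used columns can be taken from a string such as "B2:D10". Both the dimension element and the row spans attribute are optional in a file, so this is an illustration under that assumption, not a guaranteed source. ColumnRangeSketch and its members are hypothetical names, and the code only handles plain references without "$" signs.

```csharp
using System;
using System.Linq;

public static class ColumnRangeSketch
{
    // Converts column letters to a 1-based number: "A" -> 1, "Z" -> 26, "AA" -> 27.
    public static int ColumnNumber(string letters)
    {
        int number = 0;
        foreach (char c in letters.ToUpperInvariant())
            number = number * 26 + (c - 'A' + 1);
        return number;
    }

    // Extracts the leading letters of a cell reference such as "B2".
    public static string ColumnLetters(string cellReference)
    {
        return new string(cellReference.TakeWhile(c => char.IsLetter(c)).ToArray());
    }

    // Parses a dimension reference such as "B2:D10" (or a single cell such as "A1")
    // and returns the first and last column numbers it covers.
    public static (int First, int Last) ColumnsFromDimension(string dimensionReference)
    {
        string[] corners = dimensionReference.Split(':');
        int first = ColumnNumber(ColumnLetters(corners[0]));
        int last = ColumnNumber(ColumnLetters(corners[corners.Length - 1]));
        return (first, last);
    }
}
```

For a sheet whose column A is empty, the first column would come back as 2, so the DataTable could be created with columns 1 through Last and the leading column left blank instead of letting the data shift left.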

How to add new row with cell data and styles in Excel using OpenXML and C#

This was originally a question of mine, but not anymore. Below are my findings and the code that resolved my problem.
public override void AddExcelRows(string[] bufData, int cReport, int cSection, int nrow, bool insertRow)
{
int rowIndex;
int colIndex;
rowIndex = //some number
colIndex = //some number
Sheet sheet = wbPart.Workbook.Descendants<Sheet>().Where((s) => s.Name == currentSheetName).FirstOrDefault();
WorksheetPart worksheetPart = wbPart.GetPartById(sheet.Id) as WorksheetPart;
SharedStringTablePart shareStringPart = wbPart.GetPartsOfType<SharedStringTablePart>().FirstOrDefault();
SheetData sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();
for (int colOffset = 0; colOffset <= some number; colOffset++)
{
if (bufData[colOffset] != null)
{
int index = InsertSharedStringItem(bufData[colOffset], shareStringPart);
var columnName = GetExcelColumnName(colIndex + colOffset);
Cell cell = InsertCellInWorksheet(columnName, rowIndex, worksheetPart);
// If the same value is already present in the current cell, skip writing it again. It was causing an issue writing [kW] for the project Technical report.
if (cell.CellValue != null && cell.CellValue.InnerText == bufData[colOffset])
{
continue;
}
cell.CellValue = new CellValue(index.ToString());
cell.DataType = new EnumValue<CellValues>(CellValues.SharedString);
}
}
if (insertRow)
{
uint nextRowIndex = (uint)rowIndex + 1; //Add min 3 rows in excel with styles (border line)
Row oldRow = sheetData.Elements<Row>().Where(r => r.RowIndex == nextRowIndex).First();
var newRow = oldRow.CopyToLine((uint)nextRowIndex, sheetData);
}
wbPart.Workbook.Save();
}
Helper methods:
private string GetExcelColumnName(int columnNumber)
{
int dividend = columnNumber;
string columnName = String.Empty;
int modulo;
while (dividend > 0)
{
modulo = (dividend - 1) % 26;
columnName = Convert.ToChar(65 + modulo).ToString() + columnName;
dividend = (int)((dividend - modulo) / 26);
}
return columnName;
}
The two methods below were reused from: https://msdn.microsoft.com/en-us/library/office/cc861607.aspx
private static int InsertSharedStringItem(string text, SharedStringTablePart shareStringPart)
{
// If the part does not contain a SharedStringTable, create one.
if (shareStringPart.SharedStringTable == null)
{
shareStringPart.SharedStringTable = new SharedStringTable();
}
int i = 0;
// Iterate through all the items in the SharedStringTable. If the text already exists, return its index.
foreach (SharedStringItem item in shareStringPart.SharedStringTable.Elements<SharedStringItem>())
{
if (item.InnerText == text)
{
return i;
}
i++;
}
// The text does not exist in the part. Create the SharedStringItem and return its index.
shareStringPart.SharedStringTable.AppendChild(new SharedStringItem(new DocumentFormat.OpenXml.Spreadsheet.Text(text)));
shareStringPart.SharedStringTable.Save();
return i;
}
// Given a column name, a row index, and a WorksheetPart, inserts a cell into the worksheet.
// If the cell already exists, returns it.
private static Cell InsertCellInWorksheet(string columnName, int rowIndex, WorksheetPart worksheetPart)
{
Worksheet worksheet = worksheetPart.Worksheet;
SheetData sheetData = worksheet.GetFirstChild<SheetData>();
string cellReference = columnName + rowIndex;
// If the worksheet does not contain a row with the specified row index, insert one.
Row row = null;
if (sheetData.Elements<Row>().Where(r => r.RowIndex == rowIndex).Count() != 0)
{
row = sheetData.Elements<Row>().Where(r => r.RowIndex == rowIndex).First();
}
else
{
row = new Row() { RowIndex = (uint)rowIndex };
// Attach the newly created row to the sheet so that cells added to it are saved.
sheetData.Append(row);
}
// If there is not a cell with the specified column name, insert one.
if (row.Elements<Cell>().Where(c => c.CellReference.Value == columnName + rowIndex).Count() > 0)
{
return row.Elements<Cell>().Where(c => c.CellReference.Value == cellReference).First();
}
else
{
// Cells must be in sequential order according to CellReference. Determine where to insert the new cell.
Cell refCell = null;
foreach (Cell cell in row.Elements<Cell>())
{
if (string.Compare(cell.CellReference.Value, cellReference, true) > 0)
{
refCell = cell;
break;
}
}
Cell newCell = new Cell() { CellReference = cellReference };
row.InsertBefore(newCell, refCell);
worksheetPart.Worksheet.Save();
return newCell;
}
}
Then you need to add an extension method for adding a new row with styles:
public static class ExtensionClass
{
//Copies an existing row and inserts the copy.
//We don't need to copy the styles of refRow ourselves because CloneNode() does it for us.
public static Row CopyToLine(this Row refRow, uint rowIndex, SheetData sheetData)
{
uint newRowIndex;
var newRow = (Row)refRow.CloneNode(true);
// Loop through all the rows in the worksheet with higher row
// index values than the one you just added. For each one,
// increment the existing row index.
IEnumerable<Row> rows = sheetData.Descendants<Row>().Where(r => r.RowIndex.Value >= rowIndex);
foreach (Row row in rows)
{
newRowIndex = System.Convert.ToUInt32(row.RowIndex.Value + 1);
foreach (Cell cell in row.Elements<Cell>())
{
// Update the references for reserved cells.
string cellReference = cell.CellReference.Value;
cell.CellReference = new StringValue(cellReference.Replace(row.RowIndex.Value.ToString(), newRowIndex.ToString()));
cell.DataType = new EnumValue<CellValues>(CellValues.SharedString);
}
// Update the row index.
row.RowIndex = new UInt32Value(newRowIndex);
}
sheetData.InsertBefore(newRow, refRow);
return newRow;
}
}
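CopyToLine updates each cell reference with a string Replace on the row number. As a hedged alternative, the reference can be split into its column letters and row digits so that only the row part is touched. This is a minimal sketch assuming plain references such as "B7" (no "$" signs). CellReferenceSketch and ShiftRow are hypothetical names and not part of the OpenXML SDK.

```csharp
using System;
using System.Globalization;

public static class CellReferenceSketch
{
    // Moves a reference such as "B7" down by `offset` rows,
    // leaving the column letters untouched.
    public static string ShiftRow(string cellReference, int offset)
    {
        // Find where the column letters end and the row digits begin.
        int split = 0;
        while (split < cellReference.Length && char.IsLetter(cellReference[split]))
            split++;

        string column = cellReference.Substring(0, split);
        long row = long.Parse(cellReference.Substring(split), CultureInfo.InvariantCulture);
        return column + (row + offset).ToString(CultureInfo.InvariantCulture);
    }
}
```

Inside the loop, the assignment could then read cell.CellReference = new StringValue(CellReferenceSketch.ShiftRow(cellReference, 1)), if this helper were adopted.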

Reading the excel cell validation dropdown value in C# using openXML

I am writing a C# console application that will read values from an excel spreadsheet using OpenXML and create a DataTable. The app is able to read all values except those cells which contain a dropdown list. Is there a way for OpenXML to read these cells and determine which value is selected? Any suggestions are greatly appreciated. Thanks in advance.
Current Code:
public static string GetValue(Cell cell, SharedStringTablePart stringTablePart)
{
if (cell.ChildElements.Count == 0) return null;
//get cell value
string value = cell.ElementAt(0).InnerText;//CellValue.InnerText;
//Look up real value from shared string table
if ((cell.DataType != null) && (cell.DataType == CellValues.SharedString))
value = stringTablePart.SharedStringTable.ChildElements[Int32.Parse(value)].InnerText;
return value;
}
public static void ReadData(string xlsxFilePath, string sheetName)
{
DataTable dt = new DataTable();
using (SpreadsheetDocument myWorkbook = SpreadsheetDocument.Open(xlsxFilePath, true))
{
//Access the main Workbook part, which contains data
WorkbookPart workbookPart = myWorkbook.WorkbookPart;
WorksheetPart worksheetPart = null;
if (!string.IsNullOrEmpty(sheetName))
{
Sheet ss = workbookPart.Workbook.Descendants<Sheet>().Where(s => s.Name == sheetName).SingleOrDefault<Sheet>();
worksheetPart = (WorksheetPart)workbookPart.GetPartById(ss.Id);
}
else
{
worksheetPart = workbookPart.WorksheetParts.FirstOrDefault();
}
SharedStringTablePart stringTablePart = workbookPart.SharedStringTablePart;
if (worksheetPart != null)
{
Row lastRow = worksheetPart.Worksheet.Descendants<Row>().LastOrDefault();
IEnumerable<Row> firstRows = worksheetPart.Worksheet.Descendants<Row>().Skip(10);
Row firstRow = firstRows.FirstOrDefault();
int numColumns = 0;
//Row firstRow = worksheetPart.Worksheet.Descendants<Row>().FirstOrDefault();
if (firstRow != null)
{
foreach (Cell c in firstRow.ChildElements)
{
string value = GetValue(c, stringTablePart);
dt.Columns.Add(value);
numColumns++;
}
}
if (lastRow != null)
{
for (int i = 11; i <= lastRow.RowIndex; i++)
{
DataRow dr = dt.NewRow();
bool empty = true;
Row row = worksheetPart.Worksheet.Descendants<Row>() .Where(r => i == r.RowIndex).FirstOrDefault();
int j = 0;
if (row != null)
{
foreach (Cell c in row.ChildElements)
{
//Get cell value
string value = GetValue(c, stringTablePart);
if (string.IsNullOrEmpty(value) || value == " ")
dr[j] = "";
//if (!string.IsNullOrEmpty(value) && value != " ")
// empty = false;
else
dr[j] = value;
Console.Write(dr[j] + "\t");
j++;
if (j == numColumns-1)
{
Console.Write("\n");
break;
}
}
//if (empty)
// break;
dt.Rows.Add(dr);
}
}
}
}
}
}

Problem with skipping empty cells while importing data from .xlsx file in asp.net c# application

I have a problem with reading .xlsx files in an ASP.NET MVC 2.0 application, using C#. The problem occurs when reading an empty cell from the .xlsx file. My code simply skips this cell and reads the next one.
For example, if the contents of the .xlsx file are:
FirstName   LastName   Age
John                   36
They will be read as:
FirstName   LastName   Age
John        36
Here's the code that does the reading.
private string GetValue(Cell cell, SharedStringTablePart stringTablePart)
{
if (cell.ChildElements.Count == 0)
return string.Empty;
//get cell value
string value = cell.ElementAt(0).InnerText;//CellValue.InnerText;
//Look up real value from shared string table
if ((cell.DataType != null) && (cell.DataType == CellValues.SharedString))
value = stringTablePart.SharedStringTable.ChildElements[Int32.Parse(value)].InnerText;
return value;
}
private DataTable ExtractExcelSheetValuesToDataTable(string xlsxFilePath, string sheetName)
{
DataTable dt = new DataTable();
using (SpreadsheetDocument myWorkbook = SpreadsheetDocument.Open(xlsxFilePath, true))
{
//Access the main Workbook part, which contains data
WorkbookPart workbookPart = myWorkbook.WorkbookPart;
WorksheetPart worksheetPart = null;
if (!string.IsNullOrEmpty(sheetName))
{
Sheet ss = workbookPart.Workbook.Descendants<Sheet>().Where(s => s.Name == sheetName).SingleOrDefault<Sheet>();
worksheetPart = (WorksheetPart)workbookPart.GetPartById(ss.Id);
}
else
{
worksheetPart = workbookPart.WorksheetParts.FirstOrDefault();
}
SharedStringTablePart stringTablePart = workbookPart.SharedStringTablePart;
if (worksheetPart != null)
{
Row lastRow = worksheetPart.Worksheet.Descendants<Row>().LastOrDefault();
Row firstRow = worksheetPart.Worksheet.Descendants<Row>().FirstOrDefault();
if (firstRow != null)
{
foreach (Cell c in firstRow.ChildElements)
{
string value = GetValue(c, stringTablePart);
dt.Columns.Add(value);
}
}
if (lastRow != null)
{
for (int i = 2; i <= lastRow.RowIndex; i++)
{
DataRow dr = dt.NewRow();
bool empty = true;
Row row = worksheetPart.Worksheet.Descendants<Row>().Where(r => i == r.RowIndex).FirstOrDefault();
int j = 0;
if (row != null)
{
foreach (Cell c in row.ChildElements)
{
//Get cell value
string value = GetValue(c, stringTablePart);
if (!string.IsNullOrEmpty(value) && value != "")
empty = false;
dr[j] = value;
j++;
if (j == dt.Columns.Count)
break;
}
if (empty)
break;
dt.Rows.Add(dr);
}
}
}
}
}
return dt;
}
I had the same problem.
This is my workaround:
int offset = GetColDiff(lastCol, cell.CellReference);
//filling empty columns
while (offset-- > 1)
dt.Rows[rowCounter][cnt++] = DBNull.Value;
//filling regular column
dt.Rows[rowCounter][cnt++] = value;
lastCol = cell.CellReference;
******************
//calculating column distance
int GetColDiff(string prev, string curr)
{
int i = 0;
int index1 = 0;
int index2 = 0;
//prev == "0" is the starting condition (no previous cell in the row), so index1 stays 0
while (prev != "0" && prev.Length > i && Char.IsLetter(prev[i]))
{
//base 26 with A = 1, so "A" -> 1, "Z" -> 26, "AA" -> 27
index1 = ('Z' - 'A' + 1) * index1 + (prev[i] - 'A' + 1);
i++;
}
i = 0;
while (curr.Length > i && Char.IsLetter(curr[i]))
{
index2 = ('Z' - 'A' + 1) * index2 + (curr[i] - 'A' + 1);
i++;
}
return index2 - index1;
}
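The same gap-filling idea can be sketched without tracking the previous cell: compute each cell's column number from its reference and write the value straight into that slot. This is a minimal sketch, not the answerer's code. It assumes the cells of one row are available as (reference, value) pairs, which stand in for Cell.CellReference and the already resolved cell text. GapFillSketch and its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class GapFillSketch
{
    // 1-based column number taken from the letters of a reference: "A2" -> 1, "AA2" -> 27.
    public static int ColumnNumber(string cellReference)
    {
        int number = 0;
        foreach (char c in cellReference.ToUpperInvariant())
        {
            if (!char.IsLetter(c)) break;   // stop at the row digits
            number = number * 26 + (c - 'A' + 1);
        }
        return number;
    }

    // Places each value in the slot of its own column; missing cells stay empty.
    public static string[] AlignToColumns(
        IEnumerable<KeyValuePair<string, string>> cells, int columnCount)
    {
        var values = new string[columnCount];
        for (int i = 0; i < columnCount; i++)
            values[i] = string.Empty;

        foreach (var cell in cells)
        {
            int index = ColumnNumber(cell.Key) - 1;
            if (index >= 0 && index < columnCount)
                values[index] = cell.Value;
        }
        return values;
    }
}
```

For the example row in the question, the pairs ("A2", "John") and ("C2", "36") with three columns leave the LastName slot empty and keep 36 under Age.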
My solution to this problem isn't quite as elegant as some might use.
First, I map the columns to a char (A, B, C, D, etc), so I can know that FirstName = A, LastName = B, and Age = C.
Next, I look through the dataCells to see if there is a cell that has the Age reference. If an Age cell is referenced, I check the cell's DataType.
ex: dataCells.Where(x => x.CellReference.Value.Contains(cellIndex)).First().DataType == CellValues.SharedString
In this case, cellIndex would be 'C'.
If the previous LINQ query is true, then you go to the sharedString table and find the value for the age by CellReference.
var age = sharedStrings.ChildElements[int.Parse(dataCells.Where(x => x.CellReference.Value.Contains(cellIndex)).FirstOrDefault().InnerText)].InnerText;
The problem of accidentally setting the LastName (column B) to whatever the Age (column C) is can be avoided if you work off the cell reference for each DataRow.
Side note: One thing I just ran into is that blank cells in Excel are stored in two different ways. Sometimes there's a reference to a SharedStringTable index (cell.DataType = "s" and cell.InnerText = "37"), and sometimes the cell is just empty (cell.DataType = null and cell.InnerText = "").
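One thing to watch with the Contains check above: on sheets wider than 26 columns, a test for "C" would also match references such as "AC2" or "CA2". A hedged sketch of an exact column match follows. ColumnMatchSketch and its methods are hypothetical helpers, and plain references without "$" signs are assumed.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ColumnMatchSketch
{
    // Returns only the leading letters of a reference: "AC12" -> "AC".
    public static string ColumnLetters(string cellReference)
    {
        return Regex.Match(cellReference, "^[A-Za-z]+").Value.ToUpperInvariant();
    }

    // True only when the reference is in exactly the given column.
    public static bool IsInColumn(string cellReference, string column)
    {
        return string.Equals(ColumnLetters(cellReference), column,
                             StringComparison.OrdinalIgnoreCase);
    }
}
```

The lookup could then filter with dataCells.Where(x => ColumnMatchSketch.IsInColumn(x.CellReference.Value, "C")) instead of Contains.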
