Delete Table in Excel sheet with OpenXml SDK - c#

Could you please provide the C# code to delete an Excel Table from a worksheet?
Thank you!

Here is the code to delete all Tables in all sheets:
using (SpreadsheetDocument xl = SpreadsheetDocument.Open(targetFile, true))
{
WorkbookPart workbookPart = xl.WorkbookPart;
foreach (WorksheetPart sheet in workbookPart.WorksheetParts)
{
List<TableDefinitionPart> TableDefinitionPartToDelete = new List<TableDefinitionPart>();
var TableParts = sheet.Worksheet.Descendants<TablePart>();
List<TablePart> TablePartToDelete = new List<TablePart>();
foreach (var Item in TableParts)
{
TablePartToDelete.Add(Item);
}
foreach (var tp in TablePartToDelete)
{
tp.Remove();
}
foreach (TableDefinitionPart Item in sheet.TableDefinitionParts)
{
TableDefinitionPartToDelete.Add(Item);
}
foreach (TableDefinitionPart Item in TableDefinitionPartToDelete)
{
sheet.DeletePart(Item);
}
}
// No explicit Close() is needed: leaving the using block disposes the document, which saves and closes it.
}

Related

Read excel by sheet name with OpenXML

I am new to OpenXML in C# and I want to read rows from an Excel file, but I need to select the sheet by name. This is my sample code, which reads the first sheet:
using (var spreadSheet = SpreadsheetDocument.Open(path, true))
{
WorkbookPart workbookPart = spreadSheet.WorkbookPart;
WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
SheetData sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();
foreach (Row r in sheetData.Elements<Row>())
{
foreach (Cell c in r.Elements<Cell>())
{
if (c.DataType != null && c.DataType == CellValues.SharedString)
{
// reading cells
}
}
}
}
But how can I find a sheet by name and read its cells?
I've done it as shown in the code snippet below. It's basically Workbook->Sheets->Sheet, then reading the name attribute of each sheet.
The basic underlying XML looks like this:
<x:workbook>
<x:sheets>
<x:sheet name="Sheet1" sheetId="1" r:id="rId1" />
<x:sheet name="TEST sheet Name" sheetId="2" r:id="rId2" />
</x:sheets>
</x:workbook>
The id value is what the Open XML package uses internally to identify each sheet and link it with the other XML parts. That's why the line of code that follows identifying the name uses GetPartById to pick up the WorksheetPart.
using (SpreadsheetDocument doc = SpreadsheetDocument.Open(path, false))
{
WorkbookPart bkPart = doc.WorkbookPart;
DocumentFormat.OpenXml.Spreadsheet.Workbook workbook = bkPart.Workbook;
DocumentFormat.OpenXml.Spreadsheet.Sheet s = workbook.Descendants<DocumentFormat.OpenXml.Spreadsheet.Sheet>().Where(sht => sht.Name == "Sheet1").FirstOrDefault();
WorksheetPart wsPart = (WorksheetPart)bkPart.GetPartById(s.Id);
DocumentFormat.OpenXml.Spreadsheet.SheetData sheetdata = wsPart.Worksheet.Elements<DocumentFormat.OpenXml.Spreadsheet.SheetData>().FirstOrDefault();
foreach (DocumentFormat.OpenXml.Spreadsheet.Row r in sheetdata.Elements<DocumentFormat.OpenXml.Spreadsheet.Row>())
{
DocumentFormat.OpenXml.Spreadsheet.Cell c = r.Elements<DocumentFormat.OpenXml.Spreadsheet.Cell>().First();
txt += c.CellValue.Text + Environment.NewLine;
}
this.txtMessages.Text += txt;
}
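To make the name-to-part lookup concrete, here is a minimal sketch of the same two-step resolution done on the raw package XML with System.Xml.Linq instead of the SDK. The class and method names (SheetLookup, FindSheetTarget) are hypothetical, and the sketch assumes you have already loaded xl/workbook.xml and xl/_rels/workbook.xml.rels into XDocument instances. It only illustrates what the r:id indirection means; it is not how the SDK implements GetPartById.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class SheetLookup
{
    // Namespaces used by workbook.xml and the relationships part.
    static readonly XNamespace Main = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";
    static readonly XNamespace Rel = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";
    static readonly XNamespace Pkg = "http://schemas.openxmlformats.org/package/2006/relationships";

    // Returns the relationship target (for example "worksheets/sheet2.xml")
    // of the sheet with the given name, or null when no such sheet exists.
    public static string FindSheetTarget(XDocument workbook, XDocument rels, string sheetName)
    {
        // Step 1: find the <sheet> element by name and read its r:id attribute.
        string relId = workbook.Descendants(Main + "sheet")
            .Where(s => (string)s.Attribute("name") == sheetName)
            .Select(s => (string)s.Attribute(Rel + "id"))
            .FirstOrDefault();
        if (relId == null)
            return null;

        // Step 2: look that id up in the relationships part to get the part path.
        return rels.Descendants(Pkg + "Relationship")
            .Where(r => (string)r.Attribute("Id") == relId)
            .Select(r => (string)r.Attribute("Target"))
            .FirstOrDefault();
    }
}
```

Returning null for an unknown sheet name mirrors the FirstOrDefault call in the SDK snippet, where s would be null and should be checked before s.Id is used.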

Swap worksheets within the excel workbook c#

For example, see the image
I want to swap the positions of the worksheets "Sheet1" and "Sheet3".
My Code using EPPlus:
ExcelPackage masterPackage = new ExcelPackage();
foreach (var file in files)
{
ExcelPackage pckg = new ExcelPackage(new FileInfo(file));
foreach (var sheet in pckg.Workbook.Worksheets)
{
//check name of worksheet, in case that worksheet with same name already exist exception will be thrown by EPPlus
string workSheetName = sheet.Name;
foreach (var masterSheet in masterPackage.Workbook.Worksheets)
{
if (sheet.Name == masterSheet.Name)
{
workSheetName = string.Format("{0}_{1}", workSheetName, DateTime.Now.ToString("yyyyMMddHHmmssfff"));
}
}
//add new sheet
if (sheet.Name.Contains("MB_STORE_POTENTIALvsWALLET"))
{
masterPackage.Workbook.Worksheets.Add(workSheetName, sheet);
}
else
{
masterPackage.Workbook.Worksheets.Add(workSheetName, sheet);
masterPackage.Workbook.Worksheets.MoveToStart(1);
}
}
}
masterPackage.SaveAs(new FileInfo(resultFile));
How can I do this? Any suggestions, please.
If you only need to swap the sheets (that is, the content does not need to be processed), then renaming the sheets is the simplest approach:
Rename "Sheet1" to "adsf"
Rename "Sheet3" to "Sheet1"
Rename "adsf" to "Sheet3"
In VBA:
Sheets("Sheet1").Name = "adsf"
Sheets("Sheet3").Name = "Sheet1"
Sheets("adsf").Name = "Sheet3"
This is working fine:
ExcelPackage masterPackage = new ExcelPackage();
foreach (var file in files)
{
ExcelPackage pckg = new ExcelPackage(new FileInfo(file));
foreach (var sheet in pckg.Workbook.Worksheets)
{
//check name of worksheet, in case that worksheet with same name already exist exception will be thrown by EPPlus
string workSheetName = sheet.Name;
foreach (var masterSheet in masterPackage.Workbook.Worksheets)
{
if (sheet.Name == masterSheet.Name)
{
workSheetName = string.Format("{0}_{1}", workSheetName, DateTime.Now.ToString("yyyyMMddHHmmssfff"));
}
}
//add new sheet
if (sheet.Name.Contains("MB_STORE_POTENTIALvsWALLET"))
{
masterPackage.Workbook.Worksheets.Add(workSheetName, sheet);
}
else
{
masterPackage.Workbook.Worksheets.Add(workSheetName, sheet);
masterPackage.Workbook.Worksheets.MoveBefore(2, 1);
}
}
}
masterPackage.SaveAs(new FileInfo(resultFile));

DocumentFormat.openxml Excel File Reading Issue

I have used the DocumentFormat.OpenXml DLL in one of my projects for reading and writing Excel files.
Suppose one column, say Column1, contains the cell values "TRUE" and "FALSE". I read the Excel file using the following code:
private SharedStringTable sharedStringTable;
using (SpreadsheetDocument doc = SpreadsheetDocument.Open(fileName, false))
{
WorkbookPart workbookPart = doc.WorkbookPart;
WorksheetPart worksheetPart = workbookPart.WorksheetParts.FirstOrDefault();
SharedStringTablePart sharedStringTablePart = workbookPart.SharedStringTablePart;
if (sharedStringTablePart != null)
{
sharedStringTable = sharedStringTablePart.SharedStringTable;
}
var sheets = workbookPart.Workbook.Sheets;
foreach (Sheet sheet in sheets)
{
Worksheet requiredItem = (doc.WorkbookPart.GetPartById(sheet.Id.Value) as WorksheetPart).Worksheet;
var sheetData = requiredItem.Elements<SheetData>().First();
foreach (var rowItem in sheetData.Elements<Row>())
{
foreach (var item in rowItem.Elements<Cell>())
{
string requiredText = string.Empty;
if (item.CellValue != null)
{
requiredText = item.CellValue.InnerText;
}
}
}
}
}
With this code, the cell values "TRUE" and "FALSE" come back as 1 and 0 respectively.
Is there a way to get the values "TRUE" and "FALSE" instead of 1 and 0?
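For context: SpreadsheetML stores a boolean cell with the type attribute t="b" and a value of 1 or 0, and the SDK exposes that type as CellValues.Boolean. So one possible approach is to check the cell's data type and translate the raw value yourself. The helper below is a minimal sketch of that translation working on plain strings; the names CellText and ToDisplayText are hypothetical, and in SDK code you would branch on item.DataType (after a null check) rather than on the raw attribute text.

```csharp
using System;

public static class CellText
{
    // dataType: raw value of the cell's "t" attribute (null when the attribute is absent).
    // rawValue: text of the cell's <v> element (null when the cell has no value).
    public static string ToDisplayText(string dataType, string rawValue)
    {
        if (rawValue == null)
            return string.Empty;

        // Boolean cells store 1 for TRUE and 0 for FALSE.
        if (dataType == "b")
            return rawValue == "1" ? "TRUE" : "FALSE";

        // Every other type is passed through unchanged in this sketch.
        return rawValue;
    }
}
```

A numeric cell that happens to contain 1 carries no t="b" marker, so it is returned as "1" and is not mistaken for TRUE.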

How to count rows per worksheet in OpenXML

I switched from the Interop library to OpenXML because I need to read large Excel files. Before that I could use:
worksheet.UsedRange.Rows.Count
to get the number of rows with data on the worksheet. I used this information to drive a progress bar. In OpenXML I do not know how to get the same information about the worksheet. What I have now is this code:
using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(path, false))
{
WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
SheetData sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();
int row_count = 0, col_count;
// here I would like to get the info about the number of rows
foreach (Row r in sheetData.Elements<Row>())
{
col_count = 0;
if (row_count > 10)
{
foreach (Cell c in r.Elements<Cell>())
{
// do some stuff
// update progressbar
}
}
row_count++;
}
}
It's not that hard when you use LINQ:
using (SpreadsheetDocument myDoc = SpreadsheetDocument.Open("PATH", false))
{
//Get workbookpart
WorkbookPart workbookPart = myDoc.WorkbookPart;
//then access to the worksheet part
IEnumerable<WorksheetPart> worksheetPart = workbookPart.WorksheetParts;
foreach (WorksheetPart WSP in worksheetPart)
{
//find sheet data
IEnumerable<SheetData> sheetData = WSP.Worksheet.Elements<SheetData>();
// Iterate over every SheetData element inside the worksheet
foreach (SheetData SD in sheetData)
{
IEnumerable<Row> row = SD.Elements<Row>(); // Get the rows as an IEnumerable<Row>
Console.WriteLine(row.Count()); // Will give you the count of rows
}
}
}
Edited: with LINQ it is now straightforward.
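Since the question is about large files, it may be worth counting rows without building the whole DOM. The sketch below streams the worksheet XML with System.Xml.XmlReader and counts row elements. RowCounter and CountRows are hypothetical names, and the sketch assumes you pass it the worksheet part's stream (with the SDK, worksheetPart.GetStream() should provide one). Like the LINQ version, it counts the row elements present in the file, which is not necessarily the same number Interop reports for UsedRange.

```csharp
using System;
using System.IO;
using System.Xml;

public static class RowCounter
{
    const string MainNs = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // Counts <row> elements in a worksheet XML stream without loading it into memory.
    public static int CountRows(Stream sheetXml)
    {
        int count = 0;
        using (XmlReader reader = XmlReader.Create(sheetXml))
        {
            while (reader.Read())
            {
                // Only start tags named "row" in the SpreadsheetML namespace are counted.
                if (reader.NodeType == XmlNodeType.Element &&
                    reader.LocalName == "row" &&
                    reader.NamespaceURI == MainNs)
                {
                    count++;
                }
            }
        }
        return count;
    }
}
```

A first pass like this gives the total for the progress bar before the second pass that actually processes the cells.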

open xml reading from excel file

I want to use the Open XML SDK 2.5 in my project. I followed everything in this link:
using System;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
using System.IO.Packaging;
static void Main(string[] args)
{
String fileName = @"C:\OPENXML\BigData.xlsx";
// Comment one of the following lines to test the method separately.
ReadExcelFileDOM(fileName); // DOM
//ReadExcelFileSAX(fileName); // SAX
}
// The DOM approach.
// Note that the code below works only for cells that contain numeric values.
//
static void ReadExcelFileDOM(string fileName)
{
using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(fileName, false))
{
WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
SheetData sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();
string text;
int rowCount= sheetData.Elements<Row>().Count();
foreach (Row r in sheetData.Elements<Row>())
{
foreach (Cell c in r.Elements<Cell>())
{
text = c.CellValue.Text;
Console.Write(text + " ");
}
}
Console.WriteLine();
Console.ReadKey();
}
}
But I am not getting any rows; execution never enters the loop. Note: I have also installed the Open XML SDK 2.5 on my computer.
I also found the code below, which works for numeric values. For string values it writes 0 1 2 ...
private static void Main(string[] args)
{
var filePath = @"C:/OPENXML/BigData.xlsx";
using (var document = SpreadsheetDocument.Open(filePath, false))
{
var workbookPart = document.WorkbookPart;
var workbook = workbookPart.Workbook;
var sheets = workbook.Descendants<Sheet>();
foreach (var sheet in sheets)
{
var worksheetPart = (WorksheetPart)workbookPart.GetPartById(sheet.Id);
var sharedStringPart = workbookPart.SharedStringTablePart;
//var values = sharedStringPart.SharedStringTable.Elements<SharedStringItem>().ToArray();
string text;
var rows = worksheetPart.Worksheet.Descendants<Row>();
foreach (var row in rows)
{
Console.WriteLine();
int count = row.Elements<Cell>().Count();
foreach (Cell c in row.Elements<Cell>())
{
text = c.CellValue.InnerText;
Console.Write(text + " ");
}
}
}
}
Console.ReadLine();
}
Your approach seemed to work ok for me - in that it did "enter the loop".
Nevertheless you could also try something like the following:
void Main()
{
string fileName = @"c:\path\to\my\file.xlsx";
using (FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
using (SpreadsheetDocument doc = SpreadsheetDocument.Open(fs, false))
{
WorkbookPart workbookPart = doc.WorkbookPart;
SharedStringTablePart sstpart = workbookPart.GetPartsOfType<SharedStringTablePart>().First();
SharedStringTable sst = sstpart.SharedStringTable;
WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
Worksheet sheet = worksheetPart.Worksheet;
var cells = sheet.Descendants<Cell>();
var rows = sheet.Descendants<Row>();
Console.WriteLine("Row count = {0}", rows.LongCount());
Console.WriteLine("Cell count = {0}", cells.LongCount());
// One way: go through each cell in the sheet
foreach (Cell cell in cells)
{
if ((cell.DataType != null) && (cell.DataType == CellValues.SharedString))
{
int ssid = int.Parse(cell.CellValue.Text);
string str = sst.ChildElements[ssid].InnerText;
Console.WriteLine("Shared string {0}: {1}", ssid, str);
}
else if (cell.CellValue != null)
{
Console.WriteLine("Cell contents: {0}", cell.CellValue.Text);
}
}
// Or... via each row
foreach (Row row in rows)
{
foreach (Cell c in row.Elements<Cell>())
{
if ((c.DataType != null) && (c.DataType == CellValues.SharedString))
{
int ssid = int.Parse(c.CellValue.Text);
string str = sst.ChildElements[ssid].InnerText;
Console.WriteLine("Shared string {0}: {1}", ssid, str);
}
else if (c.CellValue != null)
{
Console.WriteLine("Cell contents: {0}", c.CellValue.Text);
}
}
}
}
}
}
I used the FileStream approach to open the workbook because this allows you to open it with shared access, so that you can have the workbook open in Excel at the same time. The SpreadsheetDocument.Open(path, ...) overload won't work if the workbook is open elsewhere.
Perhaps that is why your code didn't work.
Note, also, the use of the SharedStringTable to get the cell text where appropriate.
EDIT 2018-07-11:
Since this post is still getting votes I should also point out that in many cases it may be a lot easier to use ClosedXML to manipulate/read/edit your workbooks. The documentation examples are pretty user friendly and the coding is, in my limited experience, much more straight forward. Just be aware that it does not (yet) implement all the Excel functions (for example INDEX and MATCH) which may or may not be an issue. [Not that I would want to be trying to deal with INDEX and MATCH in OpenXML anyway.]
I had the same issue as the OP, and the answer above did not work for me.
I think this is the issue: when you create a document in Excel (not programmatically), you have 3 sheets by default and the WorksheetParts that has the row data for Sheet1 is the last WorksheetParts element, not the first.
I figured this out by putting a watch for document.WorkbookPart.WorksheetParts in Visual Studio, expanding Results, then looking at all of the sub elements until I found a SheetData object where HasChildren = true.
Try this:
// open the document read-only
SpreadsheetDocument document = SpreadsheetDocument.Open(filePath, false);
SharedStringTable sharedStringTable = document.WorkbookPart.SharedStringTablePart.SharedStringTable;
string cellValue = null;
foreach (WorksheetPart worksheetPart in document.WorkbookPart.WorksheetParts)
{
foreach (SheetData sheetData in worksheetPart.Worksheet.Elements<SheetData>())
{
if (sheetData.HasChildren)
{
foreach (Row row in sheetData.Elements<Row>())
{
foreach (Cell cell in row.Elements<Cell>())
{
cellValue = cell.InnerText;
if (cell.DataType != null && cell.DataType == CellValues.SharedString)
{
Console.WriteLine("cell val: " + sharedStringTable.ElementAt(Int32.Parse(cellValue)).InnerText);
}
else
{
Console.WriteLine("cell val: " + cellValue);
}
}
}
}
}
}
document.Close();
Reading large Excel files:
Open XML offers two approaches for reading an Excel file: DOM and SAX. The DOM approach consumes more RAM because it loads the whole XML content (the Excel file) into memory, but it is strongly typed.
SAX, on the other hand, is event-based parsing.
So if you are dealing with a large Excel file, it is better to use SAX.
The code sample below uses the SAX approach and also handles two important scenarios in Excel file reading.
First, Open XML skips empty cells, so values in your dataset end up shifted into the wrong columns.
Second, you also need to skip empty rows.
The following function returns the actual zero-based column index of a cell, which handles the first scenario.
private static int CellReferenceToIndex(Cell cell)
{
// Column letters form a base-26 number with digits A=1..Z=26 (A -> 0, Z -> 25, AA -> 26).
int index = 0;
string reference = cell.CellReference.ToString().ToUpper();
foreach (char ch in reference)
{
if (!Char.IsLetter(ch))
break; // the row digits start here
index = (index * 26) + (ch - 'A' + 1);
}
// convert the 1-based column number to a 0-based index
return index - 1;
}
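As a standalone check on the column arithmetic, here is a minimal sketch that works on the reference string directly, with no SDK types. Column letters behave like a base-26 number whose digits run from A=1 to Z=26, so "AA" is 1*26 + 1 = 27 as a 1-based column number, or 26 as a zero-based index. The names ColumnIndex and FromCellReference are hypothetical.

```csharp
using System;

public static class ColumnIndex
{
    // Converts a cell reference such as "AB7" to a zero-based column index (A -> 0).
    public static int FromCellReference(string cellReference)
    {
        int index = 0;
        foreach (char ch in cellReference.ToUpperInvariant())
        {
            // Stop at the first non-letter: the row number starts there.
            if (ch < 'A' || ch > 'Z')
                break;
            index = index * 26 + (ch - 'A' + 1);
        }
        // The loop produces a 1-based column number; shift it to 0-based.
        return index - 1;
    }
}
```

Two-letter columns beginning with A (AA through AZ) are the cases worth testing, because a formula that treats a running index of 0 as "no letters seen yet" cannot tell a leading A apart from the start of the string.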
Code to read an Excel file with the SAX approach:
// import the Excel sheet into a DataTable (dt is assumed to be declared elsewhere as a DataTable)
dt = new DataTable();
using (SpreadsheetDocument document = SpreadsheetDocument.Open(path, false))
{
WorkbookPart workbookPart = document.WorkbookPart;
WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
OpenXmlReader reader = OpenXmlReader.Create(worksheetPart);
//row counter
int rcnt = 0;
while (reader.Read())
{
//find xml row element type
//to understand the element type you can change your excel file eg : test.xlsx to test.zip
//and inside that you may observe the elements in xl/worksheets/sheet.xml
//that helps to understand openxml better
if (reader.ElementType == typeof(Row))
{
//create data table row type to be populated by cells of this row
DataRow tempRow = dt.NewRow();
//***** HANDLE THE SECOND SCENARIO *****
//if the row has attributes, it is not an empty row
if (reader.HasAttributes)
{
//read the child of row element which is cells
//here first element
reader.ReadFirstChild();
do
{
//find xml cell element type
if (reader.ElementType == typeof(Cell))
{
Cell c = (Cell)reader.LoadCurrentElement();
string cellValue;
int actualCellIndex = CellReferenceToIndex(c);
if (c.DataType != null && c.DataType == CellValues.SharedString)
{
SharedStringItem ssi = workbookPart.SharedStringTablePart.SharedStringTable.Elements<SharedStringItem>().ElementAt(int.Parse(c.CellValue.InnerText));
cellValue = ssi.Text.Text;
}
else
{
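// a cell can exist without a value (for example, formatting only), so guard against null
cellValue = c.CellValue != null ? c.CellValue.InnerText : string.Empty;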
}
//if row index is 0 its header so columns headers are added & also can do some headers check incase
if (rcnt == 0)
{
dt.Columns.Add(cellValue);
}
else
{
// instead of tempRow[c.CellReference] = cellValue;
tempRow[actualCellIndex] = cellValue;
}
}
}
while (reader.ReadNextSibling());
//if its not the header row so append rowdata to the datatable
if (rcnt != 0)
{
dt.Rows.Add(tempRow);
}
rcnt++;
}
}
}
}
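One detail of the loop above that may matter for large files: calling ElementAt on the shared string table for every cell walks the table from the start each time. A possible alternative is to load the shared strings into a list once and index into it. The sketch below does that on the raw xl/sharedStrings.xml content using System.Xml.Linq; SharedStrings and Load are hypothetical names, and it assumes the part has already been parsed into an XDocument. Rich-text items are flattened by concatenating their text runs, and phonetic runs are ignored.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SharedStrings
{
    static readonly XNamespace Main = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    // Returns the shared strings in file order, so that list[i] is the text
    // for a cell whose value is the shared string index i.
    public static List<string> Load(XDocument sharedStringsXml)
    {
        return sharedStringsXml.Root
            .Elements(Main + "si")
            .Select(si => string.Concat(
                // Plain items hold a single <t>; rich-text items hold <r><t>...</t></r> runs.
                si.Elements(Main + "t")
                  .Concat(si.Elements(Main + "r").Elements(Main + "t"))
                  .Select(t => t.Value)))
            .ToList();
    }
}
```

With the list built once before the read loop, each shared-string cell becomes a constant-time lookup such as strings[int.Parse(c.CellValue.InnerText)].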
Everything is explained in the accepted answer.
Here is just an extension method to solve the problem
public static string GetCellText(this Cell cell, in SharedStringTable sst)
{
if (cell.CellValue is null)
return string.Empty;
if ((cell.DataType is not null) &&
(cell.DataType == CellValues.SharedString))
{
int ssid = int.Parse(cell.CellValue.Text);
return sst.ChildElements[ssid].InnerText;
}
return cell.CellValue.Text;
}
