A sheet removal causes an unreadable file error in OpenXML - c#

I am trying to remove a sheet from an Excel file. I have tried a lot of sources from the Internet, but I always get the same result: unreadable content.
The link of the last one: https://blogs.msdn.microsoft.com/vsod/2010/02/05/how-to-delete-a-worksheet-from-excel-using-open-xml-sdk-2-0/
I also tried this:
Sheet sheet = workbook.WorkbookPart.Workbook.Descendants<Sheet>().First(s => s.Name.Equals(sheetName));
sheet.Remove();
workbook.WorkbookPart.Workbook.Save();
Plus this:
sheet.RemoveAllChildren()
But the file is always corrupt.
Please!
UPDATE
using (MemoryStream xlsxStream = new MemoryStream())
{
using (var fileStream = File.OpenRead(templatePath))
fileStream.CopyTo(xlsxStream);
...
using (var workbook = SpreadsheetDocument.Open(xlsxStream, true, new OpenSettings { AutoSave = true }))
{
Sheet sheet = workbook.WorkbookPart.Workbook.Descendants<Sheet>().First(s => s.Name.Equals(sheetName));
sheet.Remove();
workbook.WorkbookPart.Workbook.Save();
...

A potential cause of your problem may be an empty (and unneeded) CalculationChainPart left behind after the sheet is removed. For more details see this answer
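The suggestion above can be sketched roughly as follows. This is a minimal sketch rather than a verified fix for the template in the question: the RemoveSheet helper and its parameters are hypothetical names, and it assumes the workbook is opened from a file path. It deletes the worksheet part together with the Sheet element and drops the calculation chain so that Excel can rebuild it the next time the file is opened. Defined names or formulas that still point at the removed sheet are not handled here.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

static class SheetRemover
{
    // Hypothetical helper: removes the sheet called sheetName from the workbook at path.
    public static void RemoveSheet(string path, string sheetName)
    {
        using (var document = SpreadsheetDocument.Open(path, true))
        {
            WorkbookPart workbookPart = document.WorkbookPart;

            Sheet sheet = workbookPart.Workbook.Descendants<Sheet>()
                .FirstOrDefault(s => s.Name == sheetName);
            if (sheet == null)
                return; // nothing to remove

            // Resolve the worksheet part before the Sheet element is detached.
            var worksheetPart = (WorksheetPart)workbookPart.GetPartById(sheet.Id);

            // Remove both the <sheet> entry and the part it points to.
            sheet.Remove();
            workbookPart.DeletePart(worksheetPart);

            // The calculation chain may still list cells of the removed sheet.
            // Deleting it is assumed to be safe because Excel regenerates it.
            CalculationChainPart calcChain = workbookPart.CalculationChainPart;
            if (calcChain != null)
                workbookPart.DeletePart(calcChain);

            workbookPart.Workbook.Save();
        }
    }
}
```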

Related

ExcelDataReader reads all invisible sheets

In Excel there is a feature to hide some worksheets. I am reading a document which contains this kind of sheet and I want to ignore them.
This is the place which I can hide, or unhide worksheets:
On the Home tab, in the Cells group, click Format.
Under Visibility, click Hide & Unhide, and then click Unhide Sheet.
How to get list of ONLY Excel VISIBLE worksheet names in Excel using ExcelDataReader?
If using the reader interface, the IExcelDataReader.VisibleState property returns the visibility state of the currently read sheet.
If using .AsDataSet(), the same value can be retrieved from DataTable.ExtendedProperties["visiblestate"]
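The ExtendedProperties lookup can be sketched as below. The VisibleSheets class and GetVisibleTableNames method are hypothetical names introduced for illustration; the sketch assumes the DataSet came from .AsDataSet() and relies only on the "visiblestate" key and the "visible" value mentioned in this thread.

```csharp
using System.Collections.Generic;
using System.Data;

static class VisibleSheets
{
    // Returns the names of all tables whose "visiblestate" extended property is "visible".
    public static List<string> GetVisibleTableNames(DataSet dataSet)
    {
        var names = new List<string>();
        foreach (DataTable table in dataSet.Tables)
        {
            // The indexer returns null when the key is missing, so "as string" is safe.
            var state = table.ExtendedProperties["visiblestate"] as string;
            if (state == "visible")
                names.Add(table.TableName);
        }
        return names;
    }
}
```

Tables without the property are treated as not visible, which seems the safer default when the goal is to ignore hidden sheets.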
How to get list of visible worksheet names in Excel using ExcelDataReader?
// Prepare your reader by
var stream = File.Open(yourExcelFilename, FileMode.Open, FileAccess.Read);
var excelDataReader = ExcelDataReader.ExcelReaderFactory.CreateOpenXmlReader(stream);
// This variable will store visible worksheet names
List<string> visibleWorksheetNames;
// Use a loop to read workbook
visibleWorksheetNames = new List<string>();
for (var i = 0; i < excelDataReader.ResultsCount; i++)
{
// checking visible state
if (excelDataReader.VisibleState == "visible")
{
visibleWorksheetNames.Add(excelDataReader.Name);
}
excelDataReader.NextResult();
}
Read only visible sheets to DataSet:
using (var stream = File.Open("test.xlsx", FileMode.Open, FileAccess.Read))
{
using (var reader = ExcelReaderFactory.CreateReader(stream))
{
var ds = reader.AsDataSet(new ExcelDataSetConfiguration()
{
FilterSheet = (tableReader, sheetIndex) => tableReader.VisibleState == "visible",
});
}
}
To detect hidden rows, use reader.RowHeight: a row whose RowHeight is 0 is a hidden row.

Asp.net Core Import Excel File FileNotFoundException

I've had this working but it has stopped and I can't figure out why. I'm importing a simple excel file, using EPPlus.Core v1.3. Here's the code I'm using:
public async Task<IActionResult> Import(IFormFile file)
{
//Get file
var newfile = new FileInfo(file.FileName);
var fileExtension = newfile.Extension;
//Check if file is an Excel File
if (fileExtension.Contains(".xls"))
{
//Create an excel package
using (var package = new ExcelPackage(newfile))
{
//Get the first worksheet in the file
var worksheet = package.Workbook.Worksheets[1];
...
The var worksheet line throws an error
"IndexOutOfRangeException: Worksheet position out of range."
However, when I look at the package variable I see this error
"Length = 'package.File.Length' threw an exception of type
'System.IO.FileNotFoundException'"
What am I missing here? Like I said, this used to work and I can't think of anything I changed related to this code to cause this issue.
Is it the same file also? It looks like the file might not have two worksheets and since the index is 1, it is looking for the second worksheet.
You need to load the file into a MemoryStream and create an ExcelPackage from there.
using (MemoryStream ms = new MemoryStream(FileUpload1.FileBytes))
using (ExcelPackage excelPackage = new ExcelPackage(ms))
{
//work with the excel document
}
FileUpload1.FileBytes is for ASP.NET Web Forms, but I'm guessing Core has something similar.
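For ASP.NET Core, a rough equivalent might look like the sketch below. It assumes the upload arrives as an IFormFile, as in the question; the ImportController class name and the response messages are made up for illustration. The first worksheet is taken with FirstOrDefault() so the sketch does not depend on whether the worksheet index is 0-based or 1-based in the installed EPPlus version.

```csharp
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using OfficeOpenXml;

public class ImportController : Controller
{
    [HttpPost]
    public async Task<IActionResult> Import(IFormFile file)
    {
        if (file == null || file.Length == 0)
            return BadRequest("No file was uploaded.");

        // file.FileName is only the name sent by the client. No file with that
        // name exists on the server, so the content is read from the upload itself.
        using (var stream = new MemoryStream())
        {
            await file.CopyToAsync(stream);
            stream.Position = 0;

            using (var package = new ExcelPackage(stream))
            {
                var worksheet = package.Workbook.Worksheets.FirstOrDefault();
                if (worksheet == null)
                    return BadRequest("The workbook contains no worksheets.");

                // ... work with the worksheet here ...
                return Ok(worksheet.Name);
            }
        }
    }
}
```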

How to set active sheet with Open XML SDK 2.5

Using the example here, How to Copy a Worksheet within a Workbook,
I have successfully been able to clone/copy sheets in my Excel file. However, when I open the file in Excel, the 2nd sheet is the active (visible) sheet. I haven't been able to locate a property that could do this. Is there any way to specify which sheet is active?
I've tried to force it by opening and editing the first sheet in the file, thinking that the last edited sheet would be the active one, but that didn't work either.
Any help would be great. TIA
Update: looking at the workbook.xml created when renaming the .xlsx to .zip, I came across the 'activeTab' property. I made a quick change to my code and it seems to work just fine:
public void SetFirstSheetInFocus(String xlsxFile)
{
using (SpreadsheetDocument spreadSheet = SpreadsheetDocument.Open(xlsxFile, true))
{
//Get a reference to access the main Workbook part, which contains all references
WorkbookPart _workbookPart = spreadSheet.WorkbookPart;
if (_workbookPart != null)
{
WorkbookView _workbookView = spreadSheet.WorkbookPart.Workbook.BookViews.ChildElements.First<WorkbookView>();
if (_workbookView != null)
{
_workbookView.ActiveTab = 0; // 0 for first or whatever tab you want to use
}
// Save the workbook.
_workbookPart.Workbook.Save();
}
}
}
If the name of your sheet is in the variable sheetName, you can set the sheet with that name active like this:
using (var spreadsheetDoc = SpreadsheetDocument.Open(emptyHIPTemplatePath, true /* isEditable */, new OpenSettings { AutoSave = false }))
{
var workbookPart = spreadsheetDoc.WorkbookPart;
var workBook = spreadsheetDoc.WorkbookPart.Workbook;
var sheet = workBook.Descendants<Sheet>().FirstOrDefault(s => s.Name == sheetName);
var sheetIndex = workBook.Descendants<Sheet>().ToList().IndexOf(sheet);
var workBookView = workBook.Descendants<WorkbookView>().First();
workBookView.ActiveTab = Convert.ToUInt32(sheetIndex);
...
workBook.Save();
}
From Vincent Tan's book:
The SheetId property doesn't determine the order. The order of
appending the Sheet classes to the Sheets class, does.
When you add a sheet, it gets the next index, but a single sheet does not have an index. OpenXML gives it an index when you are done adding sheets. Again, from Vincent Tan's book:
Let's say you have 3 worksheets named Sheet1, Sheet2 and Sheet3.
However, when you appended the corresponding Sheet classes, you did it
as Sheet2, Sheet3 and Sheet1, in that order.

Find and replace value in Excel using C#

How can I find some value from cell and replace by new value in Excel?
I tried this but it doesn't work:
Microsoft.Office.Interop.Excel.Application xlapp = new Microsoft.Office.Interop.Excel.Application();
Microsoft.Office.Interop.Excel.Workbook wb =default(Microsoft.Office.Interop.Excel.Workbook);
wb = xlapp.Workbooks.Open(FileName.ToString());
wb.Worksheets[0].Cells.Replace("find","replace");
I would recommend you use NPOI which can be accessed either via codeplex or directly through Nuget in Visual Studio. It gives you the ability to easily upload, edit and create spreadsheets in .NET
Example of uploading a spreadsheet:
HSSFWorkbook hssfworkbook;
void InitializeWorkbook(string path)
{
//read the template via FileStream, it is suggested to use FileAccess.Read to prevent file lock.
//book1.xls is an Excel-2007-generated file, so some new unknown BIFF records are added.
using (FileStream file = new FileStream(path, FileMode.Open, FileAccess.Read))
{
hssfworkbook = new HSSFWorkbook(file);
}
}
You can then use the IRow and ICell collections of the spreadsheet to locate and edit the data you need before doing an export.
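A find-and-replace pass over the rows and cells could be sketched like this. It is an illustration under assumptions, not code from the NPOI samples: the NpoiReplace class, the ReplaceText method and its parameters are hypothetical names, only the first sheet is processed, and HSSFWorkbook handles the .xls format only.

```csharp
using System.IO;
using NPOI.HSSF.UserModel;
using NPOI.SS.UserModel;

static class NpoiReplace
{
    // Hypothetical helper: replaces "find" with "replace" in every text cell of the first sheet.
    public static void ReplaceText(string inputPath, string outputPath, string find, string replace)
    {
        HSSFWorkbook workbook;
        using (var input = new FileStream(inputPath, FileMode.Open, FileAccess.Read))
        {
            workbook = new HSSFWorkbook(input);
        }

        ISheet sheet = workbook.GetSheetAt(0);
        for (int r = sheet.FirstRowNum; r <= sheet.LastRowNum; r++)
        {
            IRow row = sheet.GetRow(r);
            if (row == null)
                continue; // rows that were never written are returned as null

            foreach (ICell cell in row.Cells)
            {
                // Only plain text cells are touched; numbers and formulas are left alone.
                if (cell.CellType == CellType.String && cell.StringCellValue.Contains(find))
                    cell.SetCellValue(cell.StringCellValue.Replace(find, replace));
            }
        }

        using (var output = new FileStream(outputPath, FileMode.Create, FileAccess.Write))
        {
            workbook.Write(output);
        }
    }
}
```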
More examples can be found here
If interested, you can use GemBox.Spreadsheet for this, like so:
SpreadsheetInfo.SetLicense("FREE-LIMITED-KEY");
// Load your XLS, XLSX, ODS or CSV file.
ExcelFile wb = ExcelFile.Load(FileName.ToString());
ExcelWorksheet ws = wb.Worksheets[0];
// Replace all "find" occurrences with "replace" text.
int row, column;
while(ws.Cells.FindText("find", out row, out column))
ws.Cells[row, column].ReplaceText("find", "replace");
// Save your XLS, XLSX, ODS or CSV file.
wb.Save(FileName.ToString());
You can also find another example of searching in Excel here.
Excel Interop collections are 1-based, so Worksheets[0] is out of range. All you have to do is replace
wb.Worksheets[0].Cells.Replace("find","replace");
with
wb.Worksheets[1].Cells.Replace("find","replace");
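Put together with saving and cleanup, the Interop version might look like the following sketch. The InteropReplace class and ReplaceInFirstSheet method are hypothetical names; the sketch assumes Excel is installed on the machine and that the file can be saved in place.

```csharp
using Excel = Microsoft.Office.Interop.Excel;

static class InteropReplace
{
    // Hypothetical helper: replaces text in the first worksheet and saves the workbook.
    public static void ReplaceInFirstSheet(string fileName, string find, string replace)
    {
        var app = new Excel.Application();
        try
        {
            Excel.Workbook wb = app.Workbooks.Open(fileName);

            // Worksheets is 1-based, so index 1 is the first sheet.
            var ws = (Excel.Worksheet)wb.Worksheets[1];
            ws.Cells.Replace(find, replace);

            wb.Save();
            wb.Close();
        }
        finally
        {
            // Quit even on failure so no hidden Excel process is left running.
            app.Quit();
        }
    }
}
```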

Convert xls or xlsx file with multiple sheets into one csv file using interop

I am trying to convert a xls or xlsx file with multiple sheets into one CSV file using c# and the interop library. I am only getting the one sheet in the CSV file. I know I can specify the sheet to save as or change the active sheet to save that one but I am looking for a solution to append all the sheets to the same CSV file that will work with both xls and xlsx files. I am automating this and don't care what is in the excel document just want to pull the string values out and append it to the csv file. Here is the code I am using:
Microsoft.Office.Interop.Excel.Application app = new Microsoft.Office.Interop.Excel.Application();
app.Visible = false;
app.DisplayAlerts = false;
Workbook wkb = app.Workbooks.Open(fullFilePath);
wkb.SaveAs(newFileName, XlFileFormat.xlCSVWindows);
Is this even possible?
I'm just getting started tackling a similar situation, but I believe this may address your needs:
http://www.codeproject.com/Articles/246772/Convert-xlsx-xls-to-csv
This uses the ExcelDataReader api that you can get from NuGet
http://exceldatareader.codeplex.com/
Like Tim was saying, you're going to have to make sure and possibly validate that the columns and structure are the same between sheets. You may also have to eat the header rows on all the sheets after the first one. I'll post an update and some code samples once I've finished.
Update [7/15/2013]. Here's my finished code. Not very fancy, but it gets the job done. All of the sheets are tables in the DataSet, so you just loop through the tables adding onto your destination. I'm outputting to a MongoDB, but I'm guessing you can swap that out for a StreamWriter for your CSV file rather easily.
private static void ImportValueSetAttributeFile(string filePath)
{
FileStream stream = File.Open(filePath, FileMode.Open, FileAccess.Read);
// Reading from a OpenXml Excel file (2007 format; *.xlsx)
IExcelDataReader excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);
// DataSet - The result of each spreadsheet will be created in the result.Tables
DataSet result = excelReader.AsDataSet();
// Free resources (IExcelDataReader is IDisposable)
excelReader.Close();
var connectionString = ConfigurationManager.ConnectionStrings[0].ConnectionString;
var database = ConfigurationManager.AppSettings["database"];
var mongoAccess = new MongoDataAccess(connectionString, database);
var cdm = new BaseDataManager();
int ind = 0;
for (int i = 0; i < result.Tables.Count; i++)
{
int row_no = 1;
// ind is the index of the table (sheet) which you want to convert to csv
while (row_no < result.Tables[ind].Rows.Count)
{
var currRow = result.Tables[ind].Rows[row_no];
var valueSetAttribute = new ValueSetAttribute()
{
CmsId = currRow[0].ToString(),
NqfNumber = currRow[1].ToString(),
ValueSetName = currRow[2].ToString(),
ValueSetOid = currRow[3].ToString(),
Definition = currRow[4].ToString(),
QdmCategory = currRow[5].ToString(),
Expansion = currRow[6].ToString(),
Code = currRow[7].ToString(),
Description = currRow[8].ToString(),
CodeSystem = currRow[9].ToString(),
CodeSystemOid = currRow[10].ToString(),
CodeSystemVersion = currRow[11].ToString()
};
cdm.AddRecords<ValueSetAttribute>(valueSetAttribute, "ValueSetAttributes");
row_no++;
}
ind++;
}
}
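Swapping the MongoDB output for a CSV writer could be sketched as follows. This is an assumption-laden illustration, not the answerer's code: the DataSetCsv class, its Write method and the skipRepeatedHeaders flag are hypothetical names. It appends the rows of every table to one writer, optionally dropping the first row of each table after the first one, and quotes fields that contain commas, quotes or line breaks.

```csharp
using System;
using System.Data;
using System.IO;
using System.Linq;

static class DataSetCsv
{
    // Quotes a field when it contains a comma, a double quote or a line break.
    static string Escape(string value)
    {
        if (value.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0)
            return value;
        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }

    // Appends every row of every table to the writer as one CSV line.
    // When skipRepeatedHeaders is true, the first row of each table after
    // the first is assumed to be a repeated header and is skipped.
    public static void Write(DataSet dataSet, TextWriter writer, bool skipRepeatedHeaders)
    {
        bool firstTable = true;
        foreach (DataTable table in dataSet.Tables)
        {
            int start = (!firstTable && skipRepeatedHeaders) ? 1 : 0;
            for (int i = start; i < table.Rows.Count; i++)
            {
                var fields = table.Rows[i].ItemArray
                    .Select(v => Escape(Convert.ToString(v) ?? string.Empty));
                writer.WriteLine(string.Join(",", fields));
            }
            firstTable = false;
        }
    }
}
```

With the DataSet named result from the code above, a call such as DataSetCsv.Write(result, writer, true) inside a using block around a StreamWriter would produce a single CSV file for all sheets.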
