Unmerge and clear cells in epplus 4.1 - c#

I had no luck deleting rows in Excel, so now I am trying to clear their content instead, but I still get this error:
"Can't delete/overwrite merged cells. A range is partly merged with the another merged range. A57788:J57788".
Columns 1-10 really are merged, but how do I unmerge them?
Here's my code:
cntr = 0;
while (ws.Cells[RowNum + cntr, 1].Value == null || !ws.Cells[RowNum + cntr, 1].Value.ToString().StartsWith("Report generation date"))
{
ws.Cells[RowNum + cntr, 1, RowNum + cntr, 18].Value = "";
ws.Cells[RowNum + cntr, 1, RowNum + cntr, 10].Merge = false;
for (int i = 1; i < 17; i++)
{
ws.Cells[RowNum + cntr, i].Style.Border.BorderAround(OfficeOpenXml.Style.ExcelBorderStyle.None);
ws.Cells[RowNum + cntr, i].Clear();
}
cntr++;
}
//ws.DeleteRow(RowNum, cntr);

The thing is, you cannot unmerge a single cell in a merged range; you have to unmerge the whole range.
To do that, first find the merged range that a cell belongs to, for example with a helper like this:
public string GetMergedRange(ExcelWorksheet worksheet, string cellAddress)
{
    // Parse the target address once instead of on every iteration
    ExcelCellAddress cell = new ExcelCellAddress(cellAddress);
    foreach (var merged in worksheet.MergedCells)
    {
        // Assumption: the collection can contain empty entries for ranges that were already unmerged
        if (string.IsNullOrEmpty(merged)) continue;
        ExcelRange range = worksheet.Cells[merged];
        // The cell belongs to the range if it lies between the range's start and end corners
        if (range.Start.Row <= cell.Row && range.Start.Column <= cell.Column
            && range.End.Row >= cell.Row && range.End.Column >= cell.Column)
        {
            return merged;
        }
    }
    return "";
}
The second step is unmerging the whole range using:
public void DeleteCell(ExcelWorksheet worksheet, string cellAddress)
{
    if (worksheet.Cells[cellAddress].Merge == true)
    {
        string range = GetMergedRange(worksheet, cellAddress); //get range of merged cells
        worksheet.Cells[range].Merge = false; //unmerge range
        worksheet.Cells[cellAddress].Clear(); //clear value
    }
}
The downside is that the other cells in the range lose their merge and their value. To work around this, save the value before unmerging and clearing, then re-merge the cells you want and write the value back, something like:
public void DeleteCell(ExcelWorksheet worksheet, string cellAddress)
{
    if (worksheet.Cells[cellAddress].Merge == true)
    {
        var value = worksheet.Cells[cellAddress].Value;
        string range = GetMergedRange(worksheet, cellAddress); //get range of merged cells
        worksheet.Cells[range].Merge = false; //unmerge range
        worksheet.Cells[cellAddress].Clear(); //clear value
        //merge the cells you want again.
        //fill the value in cells again
    }
}

Related

Exception handling in reading excel files, C#

I am fairly new to C#, and I recently read that it is better to use try/catch than if/else for file I/O. I also read in other threads that try/catch blocks should be avoided in loops because they significantly lower performance.
In my case, I am reading multiple (often over 1,000) Excel files in a for loop. Currently I use if/else if to handle really basic conditions (i.e. whether the file exists or not), but I was thinking of implementing try/catch for the reason explained above.
Here's some of my code
public void ReadFile(string path)
{
Excel.Application xlApp;
Excel.Workbooks xlBooks;
Excel.Workbook xlBook;
Excel.Worksheet xlSheet;
xlApp = new Excel.Application();
xlBooks = xlApp.Workbooks;
xlBook = xlBooks.Open(path);
xlSheet = xlBook.Sheets[1];
int rowCount = FindLastRow(xlSheet);
int colCount = FindLastColumn(xlSheet);
string header = "";
string glNum = Path.GetFileNameWithoutExtension(path).Trim();
for (int row = 1; row <= rowCount; row++)
{
bool processDone = false;
string curationInfo = "";
List<string> data = new List<string>();
// If empty line, skip
if (IsEmpty(xlSheet, row, colCount)) continue;
// If header line, capture
if (IsHeader(xlSheet, row, colCount) != "")
{
header = IsHeader(xlSheet, row, colCount);
continue;
}
// If coloured line, capture
if (IsColoured(xlSheet, row, colCount))
{
curationInfo = ConstructCurationInfo(xlSheet, row, colCount);
data.Add(curationInfo);
// If last row, end
if (row == rowCount) processDone = true;
while (!processDone)
{
// If last row
if (row == rowCount)
{
// If last row is empty, one section is done
if (IsEmpty(xlSheet, row, colCount)) processDone = true;
// If last row is header, one section is done
else if (IsHeader(xlSheet, row, colCount) != "") processDone = true;
// Every other case
else
{
// if coloured, capture
if (IsColoured(xlSheet, row, colCount))
{
string newCurationInfo = ConstructCurationInfo(xlSheet, row, colCount);
data.Add(newCurationInfo);
}
processDone = true;
}
}
// If not last row
else
{
int nextRow = row + 1;
// If next row is last row, end
if (nextRow == rowCount) processDone = true;
// If next row is empty, one section is done
if (IsEmpty(xlSheet, nextRow, colCount)) processDone = true;
// If next row is header, one section is done
else if (IsHeader(xlSheet, nextRow, colCount) != "") processDone = true;
// Every other case
else
{
// if coloured, capture
if (IsColoured(xlSheet, nextRow, colCount))
{
string newCurationInfo = ConstructCurationInfo(xlSheet, nextRow, colCount);
data.Add(newCurationInfo);
}
row++;
}
}
}
}
if (processDone && data.Count != 0)
{
Curation cur = new Curation(header, data, glNum);
curationList.Add(cur);
}
}
// Terminate background Excel Workers
xlBook.Close(false, Missing.Value, Missing.Value);
xlBooks.Close();
xlApp.Quit();
xlApp.DisplayAlerts = false;
Marshal.ReleaseComObject(xlSheet);
Marshal.ReleaseComObject(xlBook);
Marshal.ReleaseComObject(xlBooks);
Marshal.ReleaseComObject(xlApp);
GC.WaitForPendingFinalizers();
GC.Collect();
}
And the Form method that uses ReadFile() method
// Background workers
private void mergeNew_bgw_DoWork(object sender, DoWorkEventArgs e)
{
string input_path = db_input_tb.Text;
string output_dir_path = output_tb.Text;
// Status message to be reported to the UI
string status = "";
status = "Collecting files to be read.....";
mergeNew_bgw.ReportProgress(0, status);
List<string> excelPaths = GetPathToExcel(input_path);
if (IsEmpty(excelPaths))
{
status = "No File to be Processed!";
mergeNew_bgw.ReportProgress(0, status);
fileToProcess = false;
}
// If not empty list, process
else
{
ExcelInfo info = new ExcelInfo();
bool doesExist = false;
string path = "";
int row = 1;
for (int i = 0; i < excelPaths.Count; i++)
{
status = "Processing....." + (i + 1).ToString() + "/" + excelPaths.Count.ToString();
mergeNew_bgw.ReportProgress(0, status);
info.ReadFile(excelPaths[i]);
// If last file, write excel file
if (i + 1 == excelPaths.Count)
{
status = "Writing Complete Merged File";
mergeNew_bgw.ReportProgress(0, status);
info.WriteNew(ref doesExist, ref row, ref path, output_dir_path);
info.Clear();
}
else if ((i + 1) % cutoff == 0)
{
status = "Writing Partial Merged File";
mergeNew_bgw.ReportProgress(0, status);
info.WriteNew(ref doesExist, ref row, ref path, output_dir_path);
info.Clear();
}
}
}
}
Should I implement try catch block inside the for loop in mergeNew_bgw_DoWork method like this?
for (int i = 0; i < excelPaths.Count; i++)
{
status = "Processing....." + (i + 1).ToString() + "/" + excelPaths.Count.ToString();
mergeNew_bgw.ReportProgress(0, status);
try
{
info.ReadFile(excelPaths[i]);
}
catch(Exception e)
{
throw new Exception(e.ToString());
}
finally
{
// If last file, write excel file
if (i + 1 == excelPaths.Count)
{
status = "Writing Complete Merged File";
mergeNew_bgw.ReportProgress(0, status);
info.WriteNew(ref doesExist, ref row, ref path, output_dir_path);
info.Clear();
}
else if ((i + 1) % cutoff == 0)
{
status = "Writing Partial Merged File";
mergeNew_bgw.ReportProgress(0, status);
info.WriteNew(ref doesExist, ref row, ref path, output_dir_path);
info.Clear();
}
}
}
Thank you for your help!
EDIT
Apparently, performance will not change that much, as pointed out in one of the comments. However, where should I put the try/catch block in order to get meaningful error messages? The ReadFile() method is big, so I think wrapping the entire method in a try block may not give users meaningful error messages. Would it be better to insert try/catch somewhere inside ReadFile()?
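One possible arrangement, sketched under assumptions rather than taken from the original code: catch per file inside the loop, so that one unreadable workbook does not stop the batch, and keep the file path next to the exception message so the user can see which file failed. `BatchReader`, `ProcessAll`, and the `readFile` delegate are hypothetical names; `readFile` stands in for a call such as `info.ReadFile`.

```csharp
using System;
using System.Collections.Generic;

public static class BatchReader
{
    // Runs readFile on every path. A failure on one file is recorded and the loop continues.
    // Returns (path, message) pairs for the files that could not be read.
    public static List<KeyValuePair<string, string>> ProcessAll(
        IEnumerable<string> paths, Action<string> readFile)
    {
        var failures = new List<KeyValuePair<string, string>>();
        foreach (string path in paths)
        {
            try
            {
                readFile(path);
            }
            catch (Exception ex)
            {
                // Keep the path with the message so the report says which workbook failed.
                failures.Add(new KeyValuePair<string, string>(path, ex.Message));
            }
        }
        return failures;
    }
}
```

With this shape, the caller can report all failed files once the loop finishes. Inside ReadFile() itself, a try/finally around the COM cleanup would probably be a natural companion, so that Excel is released even when a file fails part-way through.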

Why are numeric values read as strings in Excel?

I am trying to read an Excel file using the NPOI library.
Here is the code:
public void ReadDataFromXL()
{
try
{
for (int i = 1; i <= sheet.LastRowNum; i++)
{
IRow row = sheet.GetRow(i);
for (int j = 0; j < row.Cells.Count(); j++)
{
var columnIndex = row.GetCell(j).ColumnIndex;
var cell = row.GetCell(j);
if (cell != null)
{
switch (cell.CellType)
{
case CellType.Numeric:
var val = cell.NumericCellValue; ;
break;
case CellType.String:
var str = cell.StringCellValue;
break;
}
}
}
}
}
catch (Exception)
{
throw;
}
}
Here is the content of the .xlsx file that I am trying to read:
As you can see, columns X and Y are numeric.
But when I read these columns using the code above, some of the numeric values in columns X and Y are recognized by the code as string values.
For example, cell B4 in the picture above is numeric, but cell.CellType reports String and the string value is 31.724732480727\n, with '\n' appended.
Any idea why some numeric values appear as strings, and why '\n' is appended to the value?
It looks like the cell's data type is String, so if you want to check whether it holds a double (assuming it is always in the num + '\n' format), you could try the following snippet:
String number = "1512421.512\n";
double res;
if (double.TryParse(number.Substring(0, number.Length - 1), out res))
{
Console.WriteLine("It's a number! " + res);
}
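A slightly more defensive variant, as a sketch only: trim any surrounding whitespace instead of assuming exactly one trailing character, and parse with the invariant culture so that `.` is treated as the decimal separator regardless of the machine's locale. `CellText.TryParseNumber` is a hypothetical helper name, not part of NPOI.

```csharp
using System.Globalization;

public static class CellText
{
    // Tries to read a cell's text as a double, ignoring leading/trailing whitespace such as "\n".
    public static bool TryParseNumber(string text, out double value)
    {
        value = 0;
        if (text == null) return false;
        return double.TryParse(text.Trim(), NumberStyles.Float,
                               CultureInfo.InvariantCulture, out value);
    }
}
```

It could be called with `cell.StringCellValue` in the `CellType.String` branch, falling back to the raw string when it returns false.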

How to replace list value with another value c#

I write data from several lists into a CSV file. Some entries in one of the lists are empty strings, while others have a value.
For the empty entries, the value should be found by comparing against another list that is written to the same CSV file.
This is my CSV file opened in Excel:
The third column holds an ID for the cell in the second column, so for later rows I want to find the name for an ID based on earlier rows.
For example, row 3 has ID 19 and name I/O, so in row 7, where the ID is also 19, I want to fill in the second cell.
Note: every ID that appears in a later row has already appeared in an earlier row.
The lists are filled by the following code:
bool isInList = ms.IndexOf(ShapeMaster) != -1;
if (isInList)
{
savinglabelnamefortextbox = t.InnerText;
string replacement =
Regex.Replace(savinglabelnamefortextbox, @"\t|\n|,|\r", "");
xl.Add("");
dl.Add(replacement);
ms.Add(ShapeMaster);
}
and I use the following code to write to the csv file
using (StreamWriter sw = File.CreateText(csvfilename))
{
for (int i = 0; i < dl.Count; i++)
{
var line = String.Format("{0},{1},{2}", dl[i], xl[i],ms[i]);
sw.WriteLine(line);
}
}
Try this
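for (int x = 0; x < ms.Count; x++)
{
    // Only entries with an empty name need to be filled
    if (xl[x] != "") continue;
    for (int y = 0; y < xl.Count; y++)
    {
        // Copy the name from a row with the same ID that actually has a name
        if (ms[y] == ms[x] && xl[y] != "")
        {
            xl[x] = xl[y];
            break;
        }
    }
}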
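The nested loop rescans the list for every empty entry. An alternative sketch, assuming (as the question states) that each ID has already appeared with its name in an earlier row, is to remember names in a dictionary during a single pass. `NameFiller.FillMissing` is a hypothetical helper; `ids` and `names` play the roles of `ms` and `xl` above.

```csharp
using System.Collections.Generic;

public static class NameFiller
{
    // Fills empty entries of names with the first name seen earlier for the same id.
    // ids and names are parallel lists of the same length; ids are assumed non-null.
    public static void FillMissing(IList<string> ids, IList<string> names)
    {
        var nameById = new Dictionary<string, string>();
        for (int i = 0; i < ids.Count; i++)
        {
            if (!string.IsNullOrEmpty(names[i]))
            {
                // Remember the first non-empty name for each id.
                if (!nameById.ContainsKey(ids[i])) nameById[ids[i]] = names[i];
            }
            else
            {
                // Fill the gap only if this id was seen with a name before.
                string known;
                if (nameById.TryGetValue(ids[i], out known)) names[i] = known;
            }
        }
    }
}
```

Calling `NameFiller.FillMissing(ms, xl)` before writing the CSV would fill the gaps in place; entries whose ID was never seen with a name stay empty.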

Try...catch returning nothing but code is still breaking

UPDATE: This code collects the result of a SQL query into a DataSet prior to this method. The data set is then dropped into Excel on the corresponding tab at a specific cell address (which is loaded from the form); the code below is the export-to-Excel method. I am getting the following error:
An unhandled exception of type 'System.AccessViolationException' occurred in SQUiRE (Sql QUery REtriever) v1.exe
Additional information: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
I have been tracking this for a while and thought I had fixed it, but my solution was a false positive. I am using a try...catch block, but the code still breaks and the catch returns nothing. Let me know if you see anything I am missing. It usually breaks on this line (templateSheet = templateBook.Sheets[tabName];) and on the same tabName. The tab is not locked or restricted, so it can be written to, and it works more than half of the time.
public void ExportToExcel(DataSet dataSet, Excel.Workbook templateBook, int i, int h, Excel.Application excelApp) //string filePath,
{
try
{
lock (this.GetType())
{
Excel.Worksheet templateSheet;
//check to see if the template is already open, if its not then open it,
//if it is then bind it to work with it
//if (!fileOpenTest)
//{ templateBook = excelApp.Workbooks.Open(filePath); }
//else
//{ templateBook = (Excel.Workbook)System.Runtime.InteropServices.Marshal.BindToMoniker(filePath); }
//Grabs the name of the tab to dump the data into from the "Query Dumps" Tab
string tabName = lstQueryDumpSheet.Items[i].ToString();
templateSheet = templateBook.Sheets[tabName];
// Copy DataTable
foreach (System.Data.DataTable dt in dataSet.Tables)
{
// Copy the DataTable to an object array
object[,] rawData = new object[dt.Rows.Count + 1, dt.Columns.Count];
// Copy the values to the object array
for (int col = 0; col < dt.Columns.Count; col++)
{
for (int row = 0; row < dt.Rows.Count; row++)
{ rawData[row, col] = dt.Rows[row].ItemArray[col]; }
}
// Calculate the final column letter
string finalColLetter = string.Empty;
string colCharset = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
int colCharsetLen = 26;
if (dt.Columns.Count > colCharsetLen)
{ finalColLetter = colCharset.Substring((dt.Columns.Count - 1) / colCharsetLen - 1, 1); }
finalColLetter += colCharset.Substring((dt.Columns.Count - 1) % colCharsetLen, 1);
/*Grabs the full cell address from the "Query Dump" sheet, splits on the '=' and
*pulls out only the cell address (i.e., "address=a3" becomes "a3")*/
string dumpCellString = lstQueryDumpText.Items[i].ToString();
string dumpCell = dumpCellString.Split('=').Last();
/*Refers to the range in which we are dumping the DataSet. The upper right hand cell is
*defined by 'dumpCell'and the bottom right cell is defined by the final column letter
*and the count of rows.*/
string firstRef = "";
string baseRow = "";
//Determines if the column is one letter or two and handles them accordingly
if (char.IsLetter(dumpCell, 1))
{
char[] createCellRef = dumpCell.ToCharArray();
firstRef = createCellRef[0].ToString() + createCellRef[1].ToString();
for (int z = 2; z < createCellRef.Count(); z++)
{ baseRow = baseRow + createCellRef[z].ToString(); }
}
else
{
char[] createCellRef = dumpCell.ToCharArray();
firstRef = createCellRef[0].ToString();
for (int z = 1; z < createCellRef.Count(); z++)
{ baseRow = baseRow + createCellRef[z].ToString(); }
}
int baseRowInt = Convert.ToInt32(baseRow);
int startingCol = ColumnLetterToColumnIndex(firstRef);
int endingCol = ColumnLetterToColumnIndex(finalColLetter);
int finalCol = startingCol + endingCol;
string endCol = ColumnIndexToColumnLetter(finalCol - 1);
int endRow = (baseRowInt + (dt.Rows.Count - 1));
string cellCheck = endCol + endRow;
string excelRange;
if (dumpCell.ToUpper() == cellCheck.ToUpper())
{ excelRange = string.Format(dumpCell + ":" + dumpCell); }
else
{ excelRange = string.Format(dumpCell + ":{0}{1}", endCol, endRow); }
//Dumps the cells into the range on Excel as defined above
templateSheet.get_Range(excelRange, Type.Missing).Value2 = rawData;
/*Check to see if all the SQL queries have been run from
if (i == lstSqlAddress.Items.Count - 1)
{
//Turn Auto Calc back on
excelApp.Calculation = Excel.XlCalculation.xlCalculationAutomatic;
/*Run through the value save sheet array then grab the address from the corresponding list
*place in the address array. If the address reads "whole sheet" then save the whole page,
*else set the addresses range and value save that.
for (int y = 0; y < lstSaveSheet.Items.Count; y++)
{
MessageBox.Show("Save Sheet: " + lstSaveSheet.Items[y] + "\n" + "Save Address: " + lstSaveRange.Items[y]);
}*/
//run the macro to hide the unused columns
excelApp.Run("ReportMakerExecute");
//save excel file as hospital name and move onto the next
SaveTemplateAs(templateBook, h);
}
}
}
}
catch (Exception e)
{
MessageBox.Show(e.ToString());
}
}

NPOI - Get excel row count to check if it is empty

I'm reading an xlsx file using NPOI lib, with C#. I need to extract some of the excel columns and save the extracted values into some kind of data structure.
I can successfully read the file and get all the values from the 2nd (the first one contains only headers) to the last row with the following code:
...
workbook = new XSSFWorkbook(fs);
sheet = (XSSFSheet)workbook.GetSheetAt(0);
....
int rowIndex = 1; //--- SKIP FIRST ROW (index == 0) AS IT CONTAINS TEXT HEADERS
while (sheet.GetRow(rowIndex) != null) {
for (int i = 0; i < this.columns.Count; i++){
int colIndex = this.columns[i].colIndex;
ICell cell = sheet.GetRow(rowIndex).GetCell(colIndex);
cell.SetCellType(CellType.String);
String cellValue = cell.StringCellValue;
this.columns[i].values.Add(cellValue); //--- Here I'm adding the value to a custom data structure
}
rowIndex++;
}
What I'd like to do now is check whether the Excel file is empty or has only 1 row, in order to handle the case properly and display a message.
If I run my code against an excel file with only 1 row (headers), it breaks on
cell.SetCellType(CellType.String); //--- here cell is null
with the following error:
Object reference not set to an instance of an object.
I also tried to get the row count with
sheet.LastRowNum
but it does not return the right number of rows. For example, I created an Excel file with 5 rows (1 header + 4 data), and the code reads its values successfully. I then removed the 4 data rows from the same file and ran the code again. sheet.LastRowNum keeps returning 4 instead of 1... I think this is related to some property still bound to the manually cleared cells.
Do you have any hint to solve this issue?
I think it would be wise to use sheet.LastRowNum, which returns the zero-based index of the last row on the current sheet (so the row count is LastRowNum + 1).
Am I oversimplifying?
bool hasContent = false;
while (sheet.GetRow(rowIndex) != null)
{
    IRow row = sheet.GetRow(rowIndex);
    rowIndex++; // advance before any 'continue' so the loop terminates
    //all cells are empty, so it is a 'blank row'
    if (row.Cells.All(d => d.CellType == CellType.Blank)) continue;
    hasContent = true;
    break;
}
You can retrieve the number of rows using this code:
public int GetTotalRowCount(bool warrant = false)
{
IRow headerRow = activeSheet.GetRow(0);
if (headerRow != null)
{
int rowCount = activeSheet.LastRowNum + 1;
return rowCount;
}
return 0;
}
Here is a way to get both the actual last row index and the number of physically existing rows:
public static int LastRowIndex(this ISheet aExcelSheet)
{
IEnumerator rowIter = aExcelSheet.GetRowEnumerator();
return rowIter.MoveNext()
? aExcelSheet.LastRowNum
: -1;
}
public static int RowsSpanCount(this ISheet aExcelSheet)
{
return aExcelSheet.LastRowIndex() + 1;
}
public static int PhysicalRowsCount(this ISheet aExcelSheet )
{
if (aExcelSheet == null)
{
return 0;
}
int rowsCount = 0;
IEnumerator rowEnumerator = aExcelSheet.GetRowEnumerator();
while (rowEnumerator.MoveNext())
{
++rowsCount;
}
return rowsCount;
}
