How do I clear rows in Excel using the EPPlus library? - c#

Right now I am providing a hard-coded value because I know how many rows in the Excel sheet contain data. I would like my program to clear the sheet and import new data into it every time it runs. How do I achieve this using EPPlus v4.1.1?
My code:
using (ExcelPackage xlPackage = new ExcelPackage(new System.IO.FileInfo(sourcefile + fileName)))
{
    ExcelWorksheet ws = xlPackage.Workbook.Worksheets[myWS];
    int hardCodedRowNumber = 300;
    ws.DeleteRow(2, hardCodedRowNumber);
    // Other code to import data from the db after clearing the sheet
}

I guess the easiest and most reliable way is to delete the entire worksheet.
var oldSheet = xlPackage.Workbook.Worksheets[myWS];
xlPackage.Workbook.Worksheets.Delete(oldSheet);
var newSheet = xlPackage.Workbook.Worksheets.Add("MyNewSheet");
and from there set up row 1 from scratch.
If you want to keep row 1 as is, you can create the new worksheet first and copy that row from one sheet to the other before deleting the old worksheet:
oldSheet.Cells[1, 1, 1, oldSheet.Dimension.End.Column].Copy(newSheet.Cells[1, 1]);
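Alternatively, keeping your own approach, you can replace your "hardCodedRowNumber" with
ws.Dimension.End.Row
Note that ws.Dimension is null when the worksheet contains no cells at all, so check for that first.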
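As a rough sketch of that last option, the helper below deletes everything from a given row down to the last used row. The class name SheetCleaner and the firstDataRow parameter are made up for illustration; only Dimension and DeleteRow come from EPPlus.

```csharp
using OfficeOpenXml;

static class SheetCleaner
{
    // Deletes every row from firstDataRow down to the last used row.
    // Returns the number of rows that were removed.
    public static int ClearDataRows(ExcelWorksheet ws, int firstDataRow = 2)
    {
        // Dimension is null when the worksheet contains no cells at all.
        if (ws.Dimension == null) return 0;

        int lastRow = ws.Dimension.End.Row;
        int rowCount = lastRow - firstDataRow + 1;
        if (rowCount <= 0) return 0; // only the header (or less) is present

        // DeleteRow takes the first row to delete and the number of rows.
        ws.DeleteRow(firstDataRow, rowCount);
        return rowCount;
    }
}
```

Inside the using block from the question this would replace the hard-coded call, for example SheetCleaner.ClearDataRows(ws); followed by the import code.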

Related

How do I get closedxml.excel to recognize merged cells?

I'm trying to make a template excel file and I need to put data at various parts of the file. I have 2 fields where the data I'm importing is from a list so in the cell I do something like this:
{Item.Name}
and I of course name the range of cells that will be populated by this list. I have run into an issue where only the first record in my list will be of the correct format/ cell merge. Every record after the first completely breaks down all of my merged cells so my formatting is not good. Any ideas of how to get closedxml.excel to recognize there are merged cells?
I don't know if there is a way to get only the merged cells, but you can check whether a cell is merged:
using (var excelFileStream = new FileStream("excelfile.xlsx", FileMode.Open, FileAccess.Read))
{
    using IXLWorkbook workbook = new XLWorkbook(excelFileStream);
    IXLWorksheet worksheet = workbook.Worksheets.Worksheet(1);
    IXLCell cell = worksheet.Cell(row: 1, column: 1);
    // MergedRange() returns null for a cell that is not part of a merged range,
    // so test IsMerged() before dereferencing it
    if (cell.IsMerged())
    {
        IXLRangeAddress range = cell.MergedRange().RangeAddress;
        //merged cell
    }
    else
    {
        //non-merged cell
    }
}
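To list every merged range on a sheet rather than testing one cell at a time, something like the following may work. It is only a sketch and assumes the MergedRanges property that recent ClosedXML releases expose on IXLWorksheet; MergeInspector is a made-up helper name.

```csharp
using System.Collections.Generic;
using ClosedXML.Excel;

static class MergeInspector
{
    // Collects the address of every merged range on the worksheet.
    public static List<string> MergedAddresses(IXLWorksheet worksheet)
    {
        var addresses = new List<string>();
        // Assumption: MergedRanges enumerates one IXLRange per merged block.
        foreach (IXLRange range in worksheet.MergedRanges)
        {
            addresses.Add(range.RangeAddress.ToString());
        }
        return addresses;
    }
}
```

Knowing the merged blocks of the template row up front would let you re-apply the same merges to each row the list expands into.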

How do you append data to an existing Excel file?

How do I append data to an already existing Excel file?
Let's say there can be a variable number of rows already written to a file and I need to get the next row to write on.
I was thinking of checking for two blank rows and then writing on the second one, or something like that.
How would I do this? Is there a way in EPPlus to open an Excel file and find the last line or something?
The Worksheet.Dimension should get you what you need. So if you have a sheet like this:
You can do this:
using (var package = new ExcelPackage(excelFile))
{
    var ws = package.Workbook.Worksheets.First();
    var lastRow = ws.Dimension.End.Row;
    var lastColumn = ws.Dimension.End.Column;
    Console.WriteLine($"Last Row: {lastRow}");
    Console.WriteLine($"Last Column: {lastColumn}");
}
Which prints to the console:
Last Row: 9
Last Column: 6
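Building on that, appending is just a matter of writing to the row after Dimension.End.Row. Here is a minimal sketch; SheetAppender and AppendRow are hypothetical names, and it assumes the data starts in column 1.

```csharp
using OfficeOpenXml;

static class SheetAppender
{
    // Writes the values into the first row below the used range,
    // starting at column 1, and returns the row number that was written.
    public static int AppendRow(ExcelWorksheet ws, params object[] values)
    {
        // Dimension is null for a sheet with no cells, so start at row 1 then.
        int nextRow = ws.Dimension == null ? 1 : ws.Dimension.End.Row + 1;

        for (int i = 0; i < values.Length; i++)
        {
            ws.Cells[nextRow, i + 1].Value = values[i];
        }
        return nextRow;
    }
}
```

You would call it inside the using block, e.g. SheetAppender.AppendRow(ws, "Widget", 42); and then package.Save(); to write the change back to the file.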

Aspose workbook copy/rename sheet

I want to copy a sheet in an Excel workbook and give the copy a particular name.
Aspose.Cells.Workbook workbook = new Aspose.Cells.Workbook(excelFilePath);
//Create a Worksheets object with reference to the sheets of the Workbook.
WorksheetCollection sheets = workbook.Worksheets;
sheets.AddCopy("Cash Bonuses");
Now the problem is that it copies the data of the sheet "Cash Bonuses" but names the new sheet "Sheet111". I want to give this sheet a specified name such as "Cash". How do I do that? Once the data is copied to the new tab, I want to delete the old tab "Cash Bonuses" and rename the new tab from "Cash" to "Cash bonuses".
Please note, in order to copy the contents of a worksheet to another worksheet, you need to add a blank worksheet to the collection and then call its Copy method while passing the object of existing worksheet (one that needs to be copied) otherwise you will lose data on the destination worksheet.
Please try the following piece of code as it tries to accomplish all your requirements. Hopefully, the comments will help you understand what the statements mean.
var workbook = new Aspose.Cells.Workbook(excelFilePath);
var sheets = workbook.Worksheets;
//Access 1st worksheet from the collection
//You may also pass the worksheet name to access a particular worksheet
var sheet0 = sheets[0];
//Add a new worksheet to the collection and name it as desired
var sheet1 = sheets[sheets.Add()];
sheet1.Name = "Cash";
//Copy the contents of 1st worksheet onto the new worksheet
sheet1.Copy(sheet0);
//Delete 1st worksheet
sheets.RemoveAt(sheet0.Index);
//Rename newly added worksheet to 'Cash bonuses'
sheet1.Name = "Cash bonuses";
//Save result
workbook.Save(dir + "output.xlsx");
Note: I work with Aspose as Developer Evangelist.

How can I get the actual used range for modified Excel files using EPPlus?

I am reading data from Excel into a DataTable using EPPlus.
After reading an Excel sheet with 10 rows of records, I modified the sheet by removing the existing data and keeping data in only one row.
But when I read the modified file, it still reads 10 rows (1 with values and the rest as null fields) into the data table.
How can I limit this?
I am using the following code to read the Excel file.
using (var pck = new OfficeOpenXml.ExcelPackage())
{
    using (var stream = File.OpenRead(FilePath))
    {
        pck.Load(stream);
    }
    var ws = pck.Workbook.Worksheets.First();
    bool hasHeader = true; // adjust it accordingly (this is a simple approach)
    foreach (var firstRowCell in ws.Cells[1, 1, 1, ws.Dimension.End.Column])
    {
        DSClientTransmittal.Tables[0].Columns.Add(hasHeader ? firstRowCell.Text : string.Format("Column {0}", firstRowCell.Start.Column));
    }
    var startRow = hasHeader ? 2 : 1;
    for (var rowNum = startRow; rowNum <= ws.Dimension.End.Row; rowNum++)
    {
        var wsRow = ws.Cells[rowNum, 1, rowNum, DSClientTransmittal.Tables[0].Columns.Count];
        var row = DSClientTransmittal.Tables[0].NewRow();
        foreach (var cell in wsRow)
        {
            // Empty cells have a null Value; skip them instead of
            // swallowing the resulting NullReferenceException
            object cellValue = cell.Value;
            if (cellValue != null)
            {
                row[cell.Start.Column - 1] = cellValue.ToString().Trim();
            }
        }
        DSClientTransmittal.Tables[0].Rows.Add(row);
    }
    // pck is disposed by the enclosing using block
}
When I was using Excel Interop to read the file, the same issue was overcome with the ClearFormats() method, like this:
ws.Columns.ClearFormats();
xlColCount = ws.UsedRange.Columns.Count;
Is there any equivalent for this in EPPlus / Open XML?
How can I get the actual used range for modified Excel files?
There is no built-in way of indicating that a row shouldn't be accounted for when only deleting data in some cells.
Dimension is as close as you can get, but rows are included in the Dimension if any column contains data or if any row above or below contains data.
You could however try to find out if you should skip a row in the for loop.
For example if you always delete data in the first 4 columns only, then you could try:
if (!ws.Cells[rowNum, 1, rowNum, 4].All(c => c.Value == null))
{
    //Continue adding the row to the table
}
The question doesn't state the criteria for skipping a row, but you get the idea.
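If the rule is simply "ignore trailing rows that hold no values", the same check can be run from the bottom of Dimension upwards to find the last row that really has data. This is only a sketch under that assumption; UsedRangeHelper and LastRowWithData are made-up names, and cells holding only whitespace are treated as empty.

```csharp
using System.Linq;
using OfficeOpenXml;

static class UsedRangeHelper
{
    // Returns the number of the last row that contains at least one
    // non-blank value, or 0 if the sheet has no such row.
    public static int LastRowWithData(ExcelWorksheet ws)
    {
        // Dimension is null when the worksheet contains no cells at all.
        if (ws.Dimension == null) return 0;

        int lastCol = ws.Dimension.End.Column;
        for (int row = ws.Dimension.End.Row; row >= 1; row--)
        {
            bool hasValue = ws.Cells[row, 1, row, lastCol]
                .Any(c => c.Value != null && c.Value.ToString().Trim().Length > 0);
            if (hasValue) return row;
        }
        return 0;
    }
}
```

The for loop in the question could then run up to UsedRangeHelper.LastRowWithData(ws) instead of ws.Dimension.End.Row.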
To start with, I am not a C# programmer, but I think I have a solution that works using an Excel VBA script. You may be able to run this Excel VBA code from C#, or get insight into how to accomplish the same thing in C#.
The problem you are having is related to the way Excel handles the working size of a worksheet. If you enter data in the 1 millionth row and then delete that cell, Excel still shows the worksheet as having 1 million rows.
I tested out this Excel VBA code and it successfully deleted all rows that were completely empty, and then reset the worksheet size.
Sub DelEmptyRowsResizeWorksheet()
    Dim i As Long, iLimit As Long
    iLimit = ActiveSheet.UsedRange.Rows.Count
    For i = iLimit To 1 Step -1
        If Application.CountA(Cells(i, 1).EntireRow) = 0 Then
            Cells(i, 1).EntireRow.Delete
        End If
    Next i
    iLimit = ActiveSheet.UsedRange.Rows.Count ' resize the worksheet based on the last row with data
End Sub
To do this manually without a script, first delete all empty rows at the bottom (or columns on the right side) of a worksheet, save it, then close and reopen the workbook. I found that this also resets the Excel workbook size.

How do you get the name of the first page of an excel workbook?

Suppose you don't know the name of the first worksheet in an excel workbook. And you want to find a way to read from the first page. This snippet sometimes works, but not always. Is it just me? Or is there a no brainer way to do this?
MyConnection = new System.Data.OleDb.OleDbConnection("provider=Microsoft.Jet.OLEDB.4.0;Data Source='" + inputFile + "';Extended Properties=Excel 8.0;");
String[] excelSheets = new String[tbl.Rows.Count];
int i = 0;
foreach (DataRow row in tbl.Rows)
{
excelSheets[i] = row["TABLE_NAME"].ToString();
i++;
}
string pageName = excelSheets[0];
OleDbDataAdapter myAdapter = new System.Data.OleDb.OleDbDataAdapter("SELECT * FROM [" + pageName + "]", MyConnection);
Note: I am looking for the name of the first worksheet.
If you have Office installed on the machine, why not just use Visual Studio Tools for Office (VSTO)? Here is essentially the code to get the worksheet:
Microsoft.Office.Interop.Excel.Application app = new Microsoft.Office.Interop.Excel.Application();
Microsoft.Office.Interop.Excel.Workbook workbook = app.Workbooks.Open(fileName,otherarguments);
Microsoft.Office.Interop.Excel.Worksheet worksheet = workbook.Worksheets[1] as Microsoft.Office.Interop.Excel.Worksheet;
Your code seems to be missing the definition of tbl. I assume it is something like
DataTable tbl = MyConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
If so, you will probably get the sheet names but in the wrong order.
I could not find a proper solution for this issue, so I approached it from another angle. I decided to look for sheets that actually had information on them. You can probably do this by looking at the rows, but the method I used was to look at the columns from the schema information. (This will obviously fail if your used sheet has only one column, as unused sheets also have one column.) It worked in my case, and I also used it to check that I had the expected number of columns (in my case nine).
This uses the GetOleDbSchemaTable(OleDbSchemaGuid.Columns, null) method to return the column information.
The code is probably irrelevant/trivial, but I happened to be learning LINQ when I came across this issue, so I wrote it in LINQ style.
It does require a small class called LinqList which you can get here
DataTable columnDetails = objConn.GetOleDbSchemaTable(
    System.Data.OleDb.OleDbSchemaGuid.Columns, null);
LinqList<DataRow> rows = new LinqList<DataRow>(columnDetails.Rows);
var query = (from r in rows
             group r by r["Table_Name"] into results
             select new { results.Key, count = results.Count() });
var activeSheets = (from sheet in query
                    where sheet.count == 9
                    select sheet.Key).ToList();
if (activeSheets.Count != 1)
    ... display error
This is the same as this other question First sheet Excel
I think that the order of the returned table gets messed up. We would need to find a way to get the order of the tabs. For now, if you check your code, sometimes the first sheet is at index 0, but the sheets can be returned in any order. I have tried deleting the other sheets, and with only one sheet you get the right name, but that wouldn't be practical.
Edit: after some research, it seems the tabs may be returned in order of their names; see "Using Excel OleDb to get sheet names IN SHEET ORDER".
SpreadsheetGear for .NET will let you load a workbook and get the names of sheets (with IWorkbook.Worksheets[sheetIndex].Name) and get the raw data or formatted text of each cell (it does more but that's probably what you are looking for if you are currently using OleDB).
You can download a free trial here.
Disclaimer: I own SpreadsheetGear LLC
