I am trying to modify an existing Excel worksheet. Specifically, I'd like to add a few rows to a table that exists in the worksheet (created using Format as Table). I tried
var table = sheet.Tables["PositionsTable"];
but the 'table' object returned is only metadata about the actual table, and I cannot add rows to it. If I try
sheet.Cells[table.Address.Address.ToString()].LoadFromCollection(positions);
then I lose the table's formatting.
Does anyone know how to add rows to the table? Thanks.
We use something like the following at our company. The idea is that for each record you want to output, you create an object array of the data to load into Excel and then call LoadFromArrays.
using (var excelPkg = new ExcelPackage())
{
var name = "Sheet1";
//You will probably pass the columns to output into this function
var headerArray = new string[] { "Column1", "Column2" };
var data = positions
.Select(i => headerArray.Select(h => GetValue(i, h)).ToArray());
var ws = excelPkg.Workbook.Worksheets.Add(name);
ws.Cells["A1"].LoadFromArrays(
((object[])headerArray).ToSingleItemEnumerable().Union(data));
ws.Row(1).Style.Font.Bold = true; //set header to bold
excelPkg.SaveAs(stream, "password");
}
private static object GetValue(Position item, string field)
{
//Your logic goes here
return null;
}
public static IEnumerable<T> ToSingleItemEnumerable<T>(this T o)
{
yield return o;
}
I am using the LinqToExcel library to read Excel files in my MVC 4 project. My problem is reading the headers when they are on row 4. How can I do this?
The library has a function that returns all the column names, but I suppose it needs the header to be in the first row.
// Summary:
// Returns a list of columns names that a worksheet contains
//
// Parameters:
// worksheetName:
// Worksheet name to get the list of column names from
public IEnumerable<string> GetColumnNames(string worksheetName);
Thanks.
Unfortunately the GetColumnNames() method only works when the header row is on row 1.
However, it should be possible to get the column names by using the WorksheetRangeNoHeader() method.
It would look something like this
var excel = new ExcelQueryFactory("excelFileName");
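// Only select the header row; the query yields rows, so take the single row in the range
var headerRow = (from c in excel.WorksheetRangeNoHeader("A4", "Z4")
select c).ToList().First();
var columnNames = new List<string>();
// Each item in the row is one header cell
foreach (var headerCell in headerRow)
columnNames.Add(headerCell.ToString());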
An FYI for future googlers:
It appears that GetColumnNames() has changed since the above answer was accepted.
There is now an overload in which you can define the range of the header row as a string:
// This will return a List<string>
var colNames = ExcelFile
.GetColumnNames(SheetName, "A9:AF9")
.ToList();
All,
I've been trying to figure this out for a day now, with a lot of googling.
I have an Excel file with 5 columns; the first column holds product numbers. I want to return the DISTINCT product numbers from the file. I am using EPPlus to read it. Here is my code:
string fileName = file.FileName;
string fileContentType = file.ContentType;
byte[] fileBytes = new byte[file.ContentLength];
var data = file.InputStream.Read(fileBytes, 0, Convert.ToInt32(file.ContentLength));
if (!file.FileName.EndsWith(".xlsx", StringComparison.OrdinalIgnoreCase))
{
throw new Exception("Please ensure that the file has been converted to latest excel version. The file type must be .xlsx.");
}
using (var package = new ExcelPackage(file.InputStream))
{
var currentSheet = package.Workbook.Worksheets;
var workSheet = currentSheet.FirstOrDefault();
var noOfCol = workSheet.Dimension.End.Column;
var noOfRow = workSheet.Dimension.End.Row;
//lets remove all records
//get a list of distinct item numbers and remove all records in preparation for upload
//I need help with this statement!
var result = workSheet.Cells.Select(grp => grp.First()).Distinct().ToList();
I was able to figure it out by debugging. This doesn't seem to be the most efficient answer, but here it goes:
var result = workSheet.Cells.Where(s => s.Address.Contains("A")).Where(v => v.Value != null).Where(vb => vb.Value.ToString() != "").GroupBy(g => g.Value.ToString()).Distinct().ToList();
Basically, this returns only column A (the first column, since the address holds this information), eliminates nulls and blanks, groups by the value, and finally returns the distinct groups as a list.
Regarding your answer (sorry not enough rep to comment):
workSheet.Cells.Where(s => s.Address.Contains("A")).....
That could also match ZA, AA, etc. If you just want column A, you could do
workSheet.Cells[1,1,workSheet.Dimension.End.Row, 1].....
This starts at A1 and looks down column A to the last used row. You might still need to filter out nulls, blanks, etc. If you need to start at row 5 instead, this was all I needed. Example:
workSheet.Cells[5,1,workSheet.Dimension.End.Row, 1].GroupBy(g => g.Value.ToString()).Distinct().ToList();
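To illustrate why the substring test on the address is too loose, here is a minimal sketch that needs no Excel library. The sample addresses and the `ColumnOf` helper are hypothetical, invented for illustration; they only stand in for the address strings a worksheet would produce.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical sample addresses, standing in for cell address strings.
string[] addresses = { "A1", "A2", "AA1", "ZA7", "B3" };

// Hypothetical helper: extracts the leading column letters from an A1-style address.
string ColumnOf(string address) => Regex.Match(address, "^[A-Z]+").Value;

// Substring test: also picks up columns AA and ZA.
var loose = addresses.Where(a => a.Contains("A")).ToArray();

// Exact column test: keeps only column A.
var exact = addresses.Where(a => ColumnOf(a) == "A").ToArray();

Console.WriteLine(string.Join(", ", loose));
Console.WriteLine(string.Join(", ", exact));
```

Selecting the range by row and column numbers, as suggested above, avoids the string matching altogether.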
I have an Excel file with several thousand rows and columns up to "BP".
I need to filter all of these rows by specific values in columns C and BP.
I tested the filter functionality in ClosedXML as per the code below.
When I apply a filter to one column all works well and the data is saved in the new file.
When I try to apply two filters, the last one executed is the one that is applied.
I have tried to use the worksheet as a Range/Table, same filtering problem.
I eventually created the "rows" expression, that works, but the 1st row (header) is filtered out.
public static void Filter(string source, string newFile)
{
using (var workbook = new XLWorkbook(source))
{
IXLWorksheet worksheet = workbook.Worksheet(1);
int salesFoundCell = worksheet.FirstRow().Cells().First(c => c.Value.ToString() == "Sales Order Description").Address.ColumnNumber;
int revenueFoundCell = worksheet.FirstRow().Cells().First(c => c.Value.ToString() == "Revenue recognition date").Address.ColumnNumber;
//worksheet.RangeUsed().SetAutoFilter().Column(salesFoundCell).EqualTo("Equipment Sale");
//worksheet.RangeUsed().SetAutoFilter().Column(revenueFoundCell).EqualTo("00.00.0000");
var rows = worksheet.RowsUsed().Where(r => r.CellsUsed().Any(c => c.GetString().Contains("Equipment Sale")) &&
r.CellsUsed().Any(c => c.GetString().Contains("00.00.0000")));
Console.WriteLine(rows.Count());
//workbook.SaveAs(newFile);
}
}
I also tried the method posted on the ClosedXML wiki, where you save the worksheet as a MemoryStream, reapply the filter and then save it to a new file.
This is the short version:
public void Create(string filePath)
{
var wb = new XLWorkbook();
IXLWorksheet ws;
#region Multi Column
String multiColumn = "Multi Column";
ws = wb.Worksheets.Add(multiColumn);
// Add filters
ws.RangeUsed().SetAutoFilter().Column(2).EqualTo(3).Or.GreaterThan(4);
ws.RangeUsed().SetAutoFilter().Column(3).Between("B", "D");
// Sort the filtered list
ws.AutoFilter.Sort(3);
#endregion
using (var ms = new MemoryStream())
{
wb.SaveAs(ms);
var workbook = new XLWorkbook(ms);
#region Multi Column
workbook.Worksheet(multiColumn).AutoFilter.Column(3).EqualTo("E");
workbook.Worksheet(multiColumn).AutoFilter.Sort(3, XLSortOrder.Descending);
#endregion
workbook.SaveAs(filePath);
ms.Close();
}
}
I went through several iterations of the below two expressions:
worksheet.RangeUsed().SetAutoFilter().Column(salesFoundCell).EqualTo("Equipment Sale");
worksheet.RangeUsed().SetAutoFilter().Column(revenueFoundCell).EqualTo("00.00.0000");
I tried filtering directly on the columns, as a range, as a table, trying to hide the rows that did not have the required values.
All of it either filters based on one column or not at all.
The "expression.AddFilter(some value).AddFilter(some other value);" does not help as I am not trying to add multiple filters on the same column
The "And/Or" functionality does the same, multiple filters on the same column.
Has anyone managed to filter based on values in multiple columns?
Any advice is much appreciated.
Try the sorting method below:
myRange.SortColumns.Add(firstColumnNumber, XLSortOrder.Ascending);
myRange.SortColumns.Add(secondColumnNumber, XLSortOrder.Ascending);
myRange.Sort();
Here's my answer. I struggled with the same problem for a while.
The key is the sort, which has to be done after you define the filters.
var excelTable = TableRange.CreateTable();
excelTable.AutoFilter.Column(26).AddFilter("Filter 1");
excelTable.AutoFilter.Column(26).AddFilter("Filter 2");
excelTable.AutoFilter.Sort(1, XLSortOrder.Ascending);
I have a formatted template stored in the database.
After building and opening the Excel file, the cells carry the format, but the values are not displayed as they should be.
Example: in the template the field looks like 1234.56$, but now it looks like 1234.56, so the $ is missing.
Second example: it should look like 12%, but now it looks like 11.9999999997%.
The values I put in are the exact values, like 1234.56 and 11.9999999997%. If I enter them manually in the generated Excel file, the formatting works, but not when they are written during creation.
Does anyone have any ideas?
My insert statement:
public static void InsertRows(List<ExcelRow> rowDefinitions, Stream template, string sheetName)
{
using (SpreadsheetDocument doc = SpreadsheetDocument.Open(template, true))
{
// tell Excel to recalculate formulas next time it opens the doc
doc.WorkbookPart.Workbook.CalculationProperties.ForceFullCalculation = true;
doc.WorkbookPart.Workbook.CalculationProperties.FullCalculationOnLoad = true;
foreach (var rd in rowDefinitions)
{
// first get the context (WS + SheetData)
var ws = GetWorksheetPart(doc.WorkbookPart, sheetName);
var sheetData = ws.Worksheet.Descendants<SheetData>().First();
var nr = CreateRow((uint)rd.RowIndex, sheetData);
foreach (var cd in rd.Cells)
{
var c = EnsureCell(nr, cd.ColumnName);
SetCellValue(cd.CellText, c, doc.WorkbookPart.SharedStringTablePart);
}
}
doc.WorkbookPart.Workbook.Save();
}
}
For example, I have a sheet called EmployeeSheet, which is just a single column of every employee's name first and last in a company. And let's assume this list is perfectly formatted and has no duplicates so every cell is unique in this sheet.
Now I have a sheet for each department in the company, such as FinanceSheet, ITSheet, and SalesSheet. Each sheet has in it somewhere (as in each sheet doesn't have the same layout) a list of employees in each department. However any 1 employee name should only appear once between all of the department sheets (this excludes the EmployeeSheet).
The solution I can think of, but can't figure out how to implement, is a multidimensional array (I learned a little about them in school but only vaguely remember how to use them).
Pseudocode something like:
arrEmployees = {"Tom Hanks", "Burt Reynolds", "Your Mom"}
arrFinance = {"Tom Hanks"}
arrIT = {"Burt Reynolds"}
arrSales = {"Your Mom"}
arrSheets = {arrEmployees, arrFinance, arrIT, arrSales}
I've been able to get single cell values and ranges as strings by using
Excel.Sheets sheets = app.Worksheets;
Excel.Worksheet worksheet = (Excel.Worksheet)sheets.get_Item("EmployeeSheet");
Excel.Range empRange = (Excel.Range)worksheet.get_Range("B2");
string empVal = empRange.Value2.ToString();
But with that process of getting a single cell value as a string, I don't know how I would put it into an element of my array, let alone a range of values.
I'm sure my method is not the most efficient, and it might not even be possible, but that's why I'm here for help, so any tips are appreciated.
EDIT: This is the solution that ended up working for me, thanks to Ian Edwards' solution.
Dictionary<string, List<Point>> fields = new Dictionary<string, List<Point>>();
fields["Finance"] = new List<Point>() { new Point(2,20)};
fields["Sales"] = new List<Point>();
for (int row = 5; row <= 185; row += 20) {fields["Sales"].Add(new Point(2,row));}
List<string> names = new List<string>();
List<string> duplicates = new List<string>();
foreach (KeyValuePair<string, List<Point>> kp in fields)
{
Excel.Worksheet xlSheet = (Excel.Worksheet)workbook.Worksheets[kp.Key];
foreach (Point p in kp.Value)
{
if ((xlSheet.Cells[p.Y, p.X] as Excel.Range).Value != null)
{
string cellVal = ((xlSheet.Cells[p.Y, p.X] as Excel.Range).Value).ToString();
if (!names.Contains(cellVal))
{ names.Add(cellVal); }
else { duplicates.Add(cellVal); }
}
}
}
Here's a little example I knocked together - the comments should explain what's going on line by line.
You can declare the name of the worksheets you want to check for names, as well as where to start looking for names in the 'worksheets' dictionary.
I assume you don't know how many names are in each list - it will keep going down each list until it encounters a blank cell.
// Load the Excel app
Microsoft.Office.Interop.Excel.Application xlApp = new Microsoft.Office.Interop.Excel.Application();
// Open the workbook
var xlWorkbook = xlApp.Workbooks.Open("XLTEST.xlsx");
// Declare the sheets and locations to look for names
Dictionary<string, Tuple<int, int>> worksheets = new Dictionary<string, Tuple<int, int>>()
{
// Declare the name of each sheet to look in and the 1-based row, column index of where to start looking for names on that sheet (i.e. 1, 1 = A1)
{ "Sheet1", new Tuple<int, int>(1, 1) },
{ "Sheet2", new Tuple<int, int>(2, 3) },
{ "Sheet3", new Tuple<int, int>(4, 5) },
{ "Sheet4", new Tuple<int, int>(2, 3) },
};
// List to keep track of all names in all sheets
List<string> names = new List<string>();
// Iterate over every sheet we need to look at
foreach(var worksheet in worksheets)
{
string workSheetName = worksheet.Key;
// Get this excel worksheet object
var xlWorksheet = (Microsoft.Office.Interop.Excel.Worksheet)xlWorkbook.Worksheets[workSheetName];
// Get the 1-based row, column cell index
int row = worksheet.Value.Item1;
int column = worksheet.Value.Item2;
// Get the string contained in this cell
string name = (string)(xlWorksheet.Cells[row, column] as Microsoft.Office.Interop.Excel.Range).Value;
// name is null when the cell is empty - stop looking in this sheet and move on to the next one
while(name != null)
{
// Add the current name to the list
names.Add(name);
// Get the next name in the cell below this one
name = (string)(xlWorksheet.Cells[++row, column] as Microsoft.Office.Interop.Excel.Range).Value;
}
}
// Compare the number of names to the number of unique names
if (names.Count() != names.Distinct().Count())
{
// You have duplicate names!
}
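The count comparison only tells you that duplicates exist. If you also want to know which names are duplicated, grouping the list is one option. A minimal sketch with hypothetical sample names; trimming and ignoring case when comparing is my assumption about what should count as "the same name", not something the question specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical names, standing in for the list collected from the sheets.
var names = new List<string> { "Tom Hanks", "Burt Reynolds", "Tom Hanks", "tom hanks " };

// Group on a trimmed, case-insensitive key (an assumption) and keep
// only the keys that occur more than once.
var duplicates = names
    .GroupBy(n => n.Trim(), StringComparer.OrdinalIgnoreCase)
    .Where(g => g.Count() > 1)
    .Select(g => g.Key)
    .ToList();

foreach (var name in duplicates)
    Console.WriteLine("Duplicate: " + name);
```

Drop the `Trim()` and the comparer if names must match exactly.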
You can use .Range to define multiple cells (e.g. .Range["A1", "F500"])
https://msdn.microsoft.com/en-us/library/microsoft.office.tools.excel.worksheet.range.aspx
You can then use .get_Value to get the contents/values of all cells in that Range. According to dotnetperls.com, get_Value() is much faster than get_Range() (see the 'Performance' section). Reading a few multi-cell ranges with get_Value should perform better than making lots of single-cell calls with get_Range.
https://msdn.microsoft.com/en-us/library/microsoft.office.tools.excel.namedrange.get_value(v=vs.120).aspx
I store them in an object array:
(object[,])yourexcelRange.get_Value(Excel.XlRangeValueDataType.xlRangeValueDefault);
From there you can write your own comparison method to compare multiple arrays. One quirk is that doing this returns a 1-indexed array, instead of a standard 0-based index.
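To show what the 1-based indexing means in practice, here is a sketch that runs without Excel. It builds an array with the same shape by hand (an assumption standing in for the real get_Value result) and loops over it using the array's own bounds rather than hard-coded 0 or 1.

```csharp
using System;
using System.Collections.Generic;

// Simulate the shape described above: a 2-D object array whose indices
// start at 1 (3 rows, 1 column). This stands in for the get_Value result.
var values = (object[,])Array.CreateInstance(typeof(object), new[] { 3, 1 }, new[] { 1, 1 });
values[1, 1] = "Tom Hanks";
values[2, 1] = "Burt Reynolds";
// values[3, 1] is left null to mimic an empty cell.

var names = new List<string>();

// Ask the array for its bounds so the loop works for any base index.
for (int row = values.GetLowerBound(0); row <= values.GetUpperBound(0); row++)
{
    for (int col = values.GetLowerBound(1); col <= values.GetUpperBound(1); col++)
    {
        // Skip empty cells and keep only non-empty strings.
        if (values[row, col] is string s && s.Length > 0)
            names.Add(s);
    }
}

Console.WriteLine(string.Join(", ", names));
```

The resulting list can then be fed into whatever comparison you write for the department sheets.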