I am using the LinqToExcel library to read Excel files in my MVC 4 project. My problem is that the headers are on row 4. How can I read them?
The library has a function that returns all the column names, but I suppose it expects the headers to be on the first row.
// Summary:
// Returns a list of columns names that a worksheet contains
//
// Parameters:
// worksheetName:
// Worksheet name to get the list of column names from
public IEnumerable<string> GetColumnNames(string worksheetName);
Thanks.
Unfortunately the GetColumnNames() method only works when the header row is on row 1.
However, it should be possible to get the column names by using the WorksheetRangeNoHeader() method.
It would look something like this
var excel = new ExcelQueryFactory("excelFileName");
// Only select the header row
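var headerRow = (from c in excel.WorksheetRangeNoHeader("A4", "Z4")
select c).ToList().First();
var columnNames = new List<string>();
// The range query returns rows; the first (and only) row holds the header cells
foreach (var headerCell in headerRow)
columnNames.Add(headerCell.ToString());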
An FYI for future googlers:
It appears that GetColumnNames() has changed since the above answer was accepted.
There is now an overload in which you can define the range of the header row as a string:
// This will return a List<string>
var colNames = ExcelFile
.GetColumnNames(SheetName, "A9:AF9")
.ToList();
I want to set the cell formula for columns K-P by using a loop, and in C# I can only loop with integers. How do you actually get the column (e.g. K, L, M, N, O, P) from the index? Looping through rows is easy because they are just numbers, but for columns Excel uses letters.
I can't think of anything other than defining my own List of the letters K-P in C#.
You can use CellReference, if you already have the cell object.
var temp = new CellReference(cell);
var reference = temp.FormatAsString();
Hope this works for you!
You can also get it directly from the ICell object:
var address = cell.Address.FormatAsString();
Or, if you need only the column name, add an extension method like this:
public static string GetColumnName(this ICell cell)
{
return Regex.Match(cell.Address.FormatAsString(), @"[A-Z]+").Value;
}
And then just call it like:
var colName = cell.GetColumnName();
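If there is no cell object yet and only an integer loop counter, the letters can also be computed directly. The following is a minimal sketch, not part of any library mentioned above: the class name ExcelColumns and the method ToLetters are made up for illustration, and the column number is assumed to be 1-based (1 = A, 11 = K).

```csharp
using System;

// Hypothetical helper (not from NPOI or any other library).
public static class ExcelColumns
{
    // Converts a 1-based column number to Excel letters: 1 -> A, 26 -> Z, 27 -> AA.
    public static string ToLetters(int columnNumber)
    {
        if (columnNumber < 1)
            throw new ArgumentOutOfRangeException(nameof(columnNumber));

        var letters = string.Empty;
        while (columnNumber > 0)
        {
            // Shift to 0-based so that 26 maps to 'Z' instead of rolling over.
            int remainder = (columnNumber - 1) % 26;
            letters = (char)('A' + remainder) + letters;
            columnNumber = (columnNumber - 1) / 26;
        }
        return letters;
    }
}
```

Looping from 11 to 16 and calling ExcelColumns.ToLetters(col) then yields K through P, which can be combined with a row number to build a cell reference for the formula.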
I have a Excel file with several thousand rows and columns up to "BP".
I need to filter all of these rows by specific values in columns C and BP.
I tested the filter functionality in ClosedXML as per the code below.
When I apply a filter to one column all works well and the data is saved in the new file.
When I try to apply two filters, the last one executed is the one that is applied.
I have tried to use the worksheet as a Range/Table, same filtering problem.
I eventually created the "rows" expression, that works, but the 1st row (header) is filtered out.
public static void Filter(string source, string newFile)
{
using (var workbook = new XLWorkbook(source))
{
IXLWorksheet worksheet = workbook.Worksheet(1);
int salesFoundCell = worksheet.FirstRow().Cells().First(c => c.Value.ToString() == "Sales Order Description").Address.ColumnNumber;
int revenueFoundCell = worksheet.FirstRow().Cells().First(c => c.Value.ToString() == "Revenue recognition date").Address.ColumnNumber;
//worksheet.RangeUsed().SetAutoFilter().Column(salesFoundCell).EqualTo("Equipment Sale");
//worksheet.RangeUsed().SetAutoFilter().Column(revenueFoundCell).EqualTo("00.00.0000");
// Works for both columns, but the header row is filtered out as well
var rows = worksheet.RowsUsed().Where(r => r.CellsUsed().Any(c => c.GetString().Contains("Equipment Sale")) &&
r.CellsUsed().Any(c => c.GetString().Contains("00.00.0000")));
Console.WriteLine(rows.Count());
//workbook.SaveAs(newFile);
}
}
I also tried the method posted on the ClosedXML wiki, where you save the worksheet as a MemoryStream, reapply the filter and then save it to a new file.
This is the short version:
public void Create(string filePath)
{
var wb = new XLWorkbook();
IXLWorksheet ws;
#region Multi Column
String multiColumn = "Multi Column";
ws = wb.Worksheets.Add(multiColumn);
// Add filters
ws.RangeUsed().SetAutoFilter().Column(2).EqualTo(3).Or.GreaterThan(4);
ws.RangeUsed().SetAutoFilter().Column(3).Between("B", "D");
// Sort the filtered list
ws.AutoFilter.Sort(3);
#endregion
using (var ms = new MemoryStream())
{
wb.SaveAs(ms);
var workbook = new XLWorkbook(ms);
#region Multi Column
workbook.Worksheet(multiColumn).AutoFilter.Column(3).EqualTo("E");
workbook.Worksheet(multiColumn).AutoFilter.Sort(3, XLSortOrder.Descending);
#endregion
workbook.SaveAs(filePath);
ms.Close();
}
}
I went through several iterations of the below two expressions:
worksheet.RangeUsed().SetAutoFilter().Column(salesFoundCell).EqualTo("Equipment Sale");
worksheet.RangeUsed().SetAutoFilter().Column(revenueFoundCell).EqualTo("00.00.0000");
I tried filtering directly on the columns, as a range, as a table, trying to hide the rows that did not have the required values.
All of it either filters based on one column or not at all.
The "expression.AddFilter(some value).AddFilter(some other value);" does not help as I am not trying to add multiple filters on the same column
The "And/Or" functionality does the same, multiple filters on the same column.
Has anyone managed to filter based on values in multiple columns?
Any advice is much appreciated.
Try the sorting method below:
myRange.SortColumns.Add(firstColumnNumber, XLSortOrder.Ascending);
myRange.SortColumns.Add(secondColumnNumber, XLSortOrder.Ascending);
myRange.Sort();
Here's my answer. I struggled with the same problem for a while.
The key is the sort, which has to be done after you define the filters.
var excelTable = TableRange.CreateTable();
excelTable.AutoFilter.Column(26).AddFilter("Filter 1");
excelTable.AutoFilter.Column(26).AddFilter("Filter 2");
excelTable.AutoFilter.Sort(1, XLSortOrder.Ascending);
If I have a loaded SpreadsheetDocument instance:
SpreadsheetDocument spreadsheetDocument
and iterate over the WorksheetParts:
foreach (var wp in spreadsheetDocument.WorkbookPart.WorksheetParts)
for every part that is a "Table" I can get to the table definition with:
wp.TableDefinitionParts
and grab the first entry. At this point I can grab the table name:
var tableName = tableDefinitionPart.Table.Name;
But how do I determine which sheet this table is located in?
Given a WorksheetPart (as assigned to wp in your code), the first entry in its Parts list will be a Packaging.IdPartPair object:
var parts = wp.Parts.ToList();
var idPartPair = parts[0];
If you take a look at the value of
idPartPair.OpenXmlPart.Uri.OriginalString
it will be a string that looks like this:
/xl/tables/table2.xml
The only thing you care about is the number 2 in that string. Believe it or not, that's actually saying that the table is in the third sheet of the workbook (zero-based)
At this point, write your favorite code to extract the 2 out of the above code. My version is this, but I'm sure someone else can make this shorter:
var sheetNo = int.Parse(string.Concat(Path.GetFileNameWithoutExtension(idPartPair.OpenXmlPart.Uri.OriginalString).Skip(5)));
Next, get the list of sheets:
var sheets = spreadsheetDocument.WorkbookPart.Workbook.Sheets.ToList();
Then use sheetNo to index into it:
var sheet = (Sheet)sheets[sheetNo];
Then you can easily get the sheet name:
var sheetName = sheet.Name;
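Note that the number in the table file name is not guaranteed to line up with the sheet order (a single sheet can hold several tables, for example). As a hedged alternative, the sheet can be looked up through the relationship id that links the workbook to the WorksheetPart. This is a sketch that assumes the Open XML SDK; the class name TableLocator and the method GetSheetName are made up for illustration.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

// Hypothetical helper: resolves the sheet name that belongs to a WorksheetPart.
public static class TableLocator
{
    public static string GetSheetName(SpreadsheetDocument document, WorksheetPart worksheetPart)
    {
        WorkbookPart workbookPart = document.WorkbookPart;

        // The workbook refers to each worksheet part by a relationship id (e.g. "rId3").
        string relationshipId = workbookPart.GetIdOfPart(worksheetPart);

        // Each <sheet> element in the workbook stores that same id in its Id attribute.
        Sheet sheet = workbookPart.Workbook
            .Descendants<Sheet>()
            .FirstOrDefault(s => s.Id != null && s.Id.Value == relationshipId);

        // Returns null when no matching sheet entry exists.
        return sheet?.Name?.Value;
    }
}
```

Inside the foreach loop over WorksheetParts, calling TableLocator.GetSheetName(spreadsheetDocument, wp) gives the sheet name for every table found in wp.TableDefinitionParts.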
Some cells on a worksheet have been given names. How can I get the names of all the named cells in the worksheet?
This is what I tried:
foreach(Excel.Worksheet wSheet in excelPattern.Worksheets)
{
treeView1.Nodes.Add(wSheet.Name,wSheet.Name);
foreach(Excel.Name n in wSheet.Names){
treeView1.Nodes[wSheet.Name].Nodes.Add( n.Name);
}
}
but this does not return what I need.
I had misunderstood the problem.
I needed workbook.Names, but I was looking at worksheet.Names.
Cell names are not attached to the sheet; they are global (workbook-level) names.
You can get them like this:
// var workbook = ...;
foreach (Excel.Name n in workbook.Names) {
string name = n.Name; // Name of cell
string refersTo = (string)n.RefersTo; // Refers To cell (Sheet1!$E$29); "ref" is a C# keyword and cannot be a variable name
// ...
}
I am trying to retrieve data from an Excel spreadsheet using C#. The data in the spreadsheet has the following characteristics:
no column names are assigned
the rows can have varying column lengths
some rows are metadata, and these rows label the content of the columns in the next row
Therefore, the objects I need to construct will always have their name in the very first column, and its parameters are contained in the next columns. It is important that the parameter names are retrieved from the row above. An example:
row1|---------|FirstName|Surname|
row2|---Person|Bob------|Bloggs-|
row3|---------|---------|-------|
row4|---------|Make-----|Model--|
row5|------Car|Toyota---|Prius--|
So unfortunately the data is heterogeneous, and the only way to determine what rows "belong together" is to check whether the first column in the row is empty. If it is, then read all data in the row, and check which parameter names apply by checking the row above.
At first I thought the straightforward approach would be to simply loop through
1) the dataset containing all sheets, then
2) the datatables (i.e. sheets) and
3) the row.
However, I found that trying to extract this data with nested loops and if statements results in horrible, unreadable and inflexible code.
Is there a way to do this in LINQ? I had a look at this article to start by filtering the empty rows between data but didn't really get anywhere. Could someone point me in the right direction with a few code snippets, please?
Thanks in advance!
hiro
I see that you've already accepted an answer, but I think a more generic solution is possible using reflection.
Let's say you have your data as a List<string[]>, where each element in the list is an array of strings holding all the cells from the corresponding row.
List<string[]> data;
data = LoadData();
var results = new List<object>();
string[] headerRow = new string[0]; // must be initialized, otherwise the compiler rejects its use in the else branch
var en = data.GetEnumerator();
while(en.MoveNext())
{
var row = en.Current;
if(string.IsNullOrEmpty(row[0]))
{
headerRow = row.Skip(1).ToArray();
}
else
{
Type objType = Type.GetType(row[0]);
object newItem = Activator.CreateInstance(objType);
for(int i = 0; i < headerRow.Length; i++)
{
objType.GetProperty(headerRow[i]).SetValue(newItem, row[i+1]);
}
results.Add(newItem);
}
}
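If matching classes such as Person or Car do not exist, the same header-tracking idea also works without reflection. The sketch below is an assumption-laden variant rather than the approach above: ParsedRecord and HeterogeneousRows are made-up names, and each object is stored as a dictionary of parameter name to cell value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical container: the object's name plus its parameters by header name.
public sealed class ParsedRecord
{
    public string TypeName { get; set; }
    public Dictionary<string, string> Values { get; } = new Dictionary<string, string>();
}

public static class HeterogeneousRows
{
    public static List<ParsedRecord> Parse(IEnumerable<string[]> rows)
    {
        var records = new List<ParsedRecord>();
        string[] header = new string[0];

        foreach (var row in rows)
        {
            // Skip rows that are completely empty (separators between blocks).
            if (row.Length == 0 || row.All(string.IsNullOrWhiteSpace))
                continue;

            // An empty first column marks a metadata row labelling the next row.
            if (string.IsNullOrWhiteSpace(row[0]))
            {
                header = row.Skip(1).ToArray();
                continue;
            }

            var record = new ParsedRecord { TypeName = row[0].Trim() };
            int count = Math.Min(header.Length, row.Length - 1);
            for (int i = 0; i < count; i++)
            {
                if (!string.IsNullOrWhiteSpace(header[i]))
                    record.Values[header[i].Trim()] = row[i + 1];
            }
            records.Add(record);
        }
        return records;
    }
}
```

Because rows of different lengths are allowed, the loop only pairs as many cells as both the header row and the data row provide.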