How to determine header row while using ClosedXML - c#

I have a small WinForms application I'm working on that uses ClosedXML to handle our Excel files. I'm trying to build the read logic so that no matter which row the headers are on, I can find that row and work with the data below it. Our reports come from our enterprise reporting system, and the files do not always start the data in the same place, because the export appends the report filters and selections to the top x rows and only starts the data dump below that. Right now the only way I can get it to work is to manually remove all those rows at the top and make the header row the first row.
I'm looking for some assistance with finding the "header" row based on column names or any other method. I have already looked through their wiki https://github.com/ClosedXML/ClosedXML/wiki but it only mentions working with printed headers and footers.
Here is where I believe I need to focus my work, but I'm unclear where to start:
// Look for the first row used
var firstRowUsed = ws.FirstRowUsed(); //{'Precision Calculator D'!A1:XFD1}
//var firstRowUsed = "'Precision Calculator D'!A9:XFD9";
// Narrow down the row so that it only includes the used part
var udasRow = firstRowUsed.RowUsed(); //{'Precision Calculator D'!A10:A10}
//var udasRow = "'Precision Calculator D'!A10:A10}";
// Move to the next row (it now has the titles)
udasRow = udasRow.RowBelow();
Some reports I've tried have the header starting on row 5, others on row 7, and so on, so there is no fixed row the headers will always be on, and I need a way to determine it automatically. Is there any way to determine the row that the column names are in? The columns will always be in the same order, so those I have already determined.
I ran across this in a mention of ClosedXML and it may well get me where I need to be, but I'm unclear how to implement it:
var foundMonth = ws.Search("Month", System.Globalization.CompareOptions.OrdinalIgnoreCase);
Since it returns an IEnumerable, there may be more than one cell with the value "Month". In the file I'm testing with, two rows contain the word, and I'm not sure how to specify that I want the last cell found when there are multiple.
I addressed the concern about multiple cells being returned, and can now determine which row the headers are on with the following:
var foundMonth = ws.Search("Month", System.Globalization.CompareOptions.OrdinalIgnoreCase);
var monthRow = foundMonth.Last().Address.ToString();
I'm still unclear how to work this into the original code posted above so that firstRowUsed is reflected correctly, which in this case would be A11:XFD11.

After an exhaustive search of the ClosedXML material and reading through a number of other questions, I was able to find a solution. Below is the code that sets the used range based on the current data structure within my file.
var foundMonth = ws.Search("Month", System.Globalization.CompareOptions.OrdinalIgnoreCase);
var monthRow = foundMonth.Last().Address; // A11
var lastcell = ws.LastCellUsed().Address; // BC3950
var rangeUsed = ws.Range(monthRow, lastcell);
Since I have no idea where the header row will be from file to file, I search for my column header name in column A. Because the usable data is mostly numbers, I can safely assume that the last instance of the word "Month" found in column A is my header row.
With that and the last cell used, I am able to determine my data range as seen above. Although I still need to figure out how to replace my firstRowUsed logic to work the same way, this is a step closer to a final solution. I'll post back my findings on that before I mark this question answered.

var firstRowUsed = ws.Range(monthRow, lastcell).FirstRowUsed();
This line gives you the equivalent of the line below, but within the range that starts at the header row:
var firstRowUsed = ws.FirstRowUsed();
I tried this logic with three different files, each with a different amount of data and with the header on a different row, and it works like a charm.
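The question notes that the column order is always known, so the header row could also be located by comparing each used row against the expected column names instead of relying on the last "Month" hit. The following is only a minimal sketch of that idea, not the author's code: the class name `HeaderRowFinder`, its two methods, and the header names passed to them are hypothetical, and it assumes the ClosedXML NuGet package is referenced.

```csharp
using System;
using ClosedXML.Excel;

public static class HeaderRowFinder
{
    // Hypothetical helper: returns the worksheet row number of the first used row
    // whose leading cells match expectedHeaders (case-insensitive), or -1 if none does.
    public static int FindHeaderRow(IXLWorksheet ws, params string[] expectedHeaders)
    {
        foreach (var row in ws.RowsUsed())
        {
            bool matches = true;
            for (int i = 0; i < expectedHeaders.Length; i++)
            {
                // Columns are 1-based in ClosedXML.
                string text = row.Cell(i + 1).GetString().Trim();
                if (!string.Equals(text, expectedHeaders[i], StringComparison.OrdinalIgnoreCase))
                {
                    matches = false;
                    break;
                }
            }
            if (matches)
            {
                return row.RowNumber();
            }
        }
        return -1;
    }

    // Hypothetical helper: builds the range from the header row down to the last used cell.
    public static IXLRange GetDataRange(IXLWorksheet ws, params string[] expectedHeaders)
    {
        int headerRow = FindHeaderRow(ws, expectedHeaders);
        if (headerRow < 0)
        {
            throw new InvalidOperationException("Header row not found.");
        }
        var last = ws.LastCellUsed();
        return ws.Range(headerRow, 1, last.Address.RowNumber, last.Address.ColumnNumber);
    }
}
```

Matching on several adjacent header names makes a false hit in the filter rows at the top of the export less likely than matching on a single word, at the cost of having to keep the expected names in sync with the report.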

Related

Can I get the actual Excel row number of Excel spreadsheet row (not a sequential calculated number)

I am processing a multi-tab spreadsheet and saving the rows in a SQL database. I would like to store the actual Excel row number (the one shown to the left of column A). I have tried a number of ways to accomplish this but cannot find a method that works. My current sample code is shown below. I would like to avoid counting rows as they are read from the file and would rather get the actual row number from Excel. I hope I am explaining this adequately. I may only process 1 of every 5 or 10 rows, and I would have thought there would be a way to retrieve the row number that Excel displays to the left of column A. Is that possible, or am I out of luck? The code below seems to produce only a sequential number, for example 1 through 5 if I only process 5 records. I am not restricting rows anywhere else in the code. I realize the following code is simplistic, but it accurately reflects what I am trying to do.
foreach (IXLWorksheet works in workBook.Worksheets)
{
    // get the name of the worksheet (the tab name)
    string worksName = works.Name;
    foreach (IXLRow row in works.Rows())
    {
        bProcessThisRow = false;
        if (rowcontents == userrequest)
        {
            bProcessThisRow = true;
        }
        // more determination of rowcontents to user spec's
        if (bProcessThisRow)
        {
            // get the Excel number of this row
            int iRowNum = row.RowNumber();
            // save row contents in database record
        }
    }
}
Well, this is embarrassing but I have to tell the truth because anything other than that would be "read" by everyone...yesterday was a really hectic day (I know, we all have them) and I just plain fouled up on the test data file. It was no one's fault but my own. And Scott Hannen was correct, RowNumber() does return the correct response. This was my first experience using the RowNumber() feature and a lack of faith in myself/experience with the RowNumber() probably contributed.
Sorry for wasting your time.
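For reference, here is a minimal sketch of the behaviour confirmed above: RowNumber() reports the row's position in the worksheet, not its position in the loop, even when only some rows are processed. The class and method names and the first-column filter are hypothetical stand-ins for the real selection logic, and the ClosedXML NuGet package is assumed.

```csharp
using System.Collections.Generic;
using ClosedXML.Excel;

public static class RowNumberDemo
{
    // Hypothetical helper: collects the worksheet row numbers of the used rows
    // whose first cell equals 'wanted'. Skipped rows do not shift the numbers.
    public static List<int> MatchingRowNumbers(IXLWorksheet ws, string wanted)
    {
        var result = new List<int>();
        foreach (IXLRow row in ws.RowsUsed())
        {
            if (row.Cell(1).GetString() == wanted)
            {
                result.Add(row.RowNumber());
            }
        }
        return result;
    }
}
```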

Insert Excel Rows/Columns with ExcelDNA or NetOffice

I am using ExcelDNA to get and set cell values:
var cellRef = new ExcelReference(2, 2);
var val = cellRef.GetValue();
cellRef.SetValue(42);
Is there a way to insert an entire row or column, moving the existing entries to the right or down? I want the same behavior as when the user right-clicks a column and inserts one, so that all the entries are shifted to the right. The solution can use NetOffice if necessary.
I'd recommend using the COM object model for this, and the code will be similar to VBA for the same task.
You get hold of the root Application object with a call to ExcelDnaUtil.Application. The returned object is the Excel COM Application object (it can be cast to Microsoft.Office.Interop.Excel.Application) and can be used to select the row or column, then call app.Selection.Insert() to insert the new row or column.
It should also be possible using the C API, but that is unlikely to be easier or faster.
I would like to add that NetOffice does not support the EntireRow and EntireColumn methods of the Range object, which would be useful for inserting or deleting full rows. As a workaround, full rows can be addressed with Range(rowNoStart + ":" + rowNoEnd).
For columns, one can write Range(GetExcelColumnName(colStart) + ":" + GetExcelColumnName(colEnd)), where GetExcelColumnName is a function from this former SO post.
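The linked post is not reproduced here, so the following is only a sketch of what a GetExcelColumnName function typically looks like, not necessarily the code the answer refers to. It converts a 1-based column number to Excel letters; the class name `ExcelColumns` and the `ColumnRangeAddress` helper are hypothetical additions for illustration.

```csharp
using System;

public static class ExcelColumns
{
    // Converts a 1-based column number to its letter name: 1 -> "A", 27 -> "AA".
    public static string GetExcelColumnName(int columnNumber)
    {
        if (columnNumber < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(columnNumber));
        }

        string name = string.Empty;
        while (columnNumber > 0)
        {
            // Column letters behave like base 26 without a zero digit,
            // hence the "- 1" before taking the remainder and dividing.
            int remainder = (columnNumber - 1) % 26;
            name = (char)('A' + remainder) + name;
            columnNumber = (columnNumber - 1) / 26;
        }
        return name;
    }

    // Hypothetical helper: builds a full-column address such as "B:D".
    public static string ColumnRangeAddress(int colStart, int colEnd)
    {
        return GetExcelColumnName(colStart) + ":" + GetExcelColumnName(colEnd);
    }
}
```

For example, ExcelColumns.ColumnRangeAddress(2, 4) yields "B:D", which can then be passed to Range(...) as described above.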

Pull entire excel row using LinqToExcel

I am trying to pull an entire row of values from an Excel file using LinqToExcel. I have all of the column names (there are 104 different ones) and now I just need to get the one row of values associated with each header. What I would like to do is pull the entire second row of values, but I haven't been able to figure out a workaround for that.
Does anyone know of a way to pull just one row? Or do I need to approach this differently and pull each individual value by its header name?
Thank you.
Use the LinqToExcel.Row class (Documentation)
var excel = new ExcelQueryFactory("excelFileName");
var firstRow = excel.Worksheet().First();
var companyName = firstRow["CompanyName"];

C# removing and re-adding rows to excel sheet

Simple question.
I have an Excel sheet that I want to use as a database. I use linq-to-excel and it works wonderfully, except that it only works if the header row is the first row in the sheet, and the spreadsheets I need to run it on have other data (important to the owners) in the first 7 rows, with the header row appearing in the 8th row.
What's the best way I can cut out these first rows through C# temporarily, so I can run my program and then re-insert them back in place after I've changed whatever records/columns/etc I needed to?
You can use LinqToExcel's WorksheetRange() method to select the specific range of cells you want. This also allows you to use the first row of the range as a header row.
Here's a code example:
var excel = new ExcelQueryFactory("excelFileName");
var indianaCompanies = from c in excel.WorksheetRange<Company>("B3", "G10")
                       where c.State == "IN"
                       select c;
And here's the documentation

Add DataValidation to an unknown number of Rows

Hi, I want to add validation to every row in a column of an Excel file. I'm using EPPlus to create and read Excel files. Here is the code I have tried:
var codeValidation = codeListSheet.DataValidations.AddTextLengthValidation("A2:AN");
But this isn't working; it says that it overlaps with the range of my next column:
var paretnCodeValidation = codeListSheet.DataValidations.AddTextLengthValidation("B2:BN");
I know there should be an easy way of doing this but I can't find the answer. Hopefully there is someone who has come across this before.
OK, my bad, the answer was in the EPPlus FAQ.
To select an entire column you can use A:A or B:B, etc.
