Using C# .Net Google Sheets API.
I am new to the API, so I may have missed it in the docs - but how do you find out the maximum row and column that contain a value without reading all the data in the sheet?
For example, if a sheet contains multiple values and the "last" cell in the sheet with a value is at C139 (no cells in the rows following have a value and no cells in any column after C have a value), then the maximum row would be 139 and the maximum column would be 2 (zero based) or 3 (one based).
I tried sheet.Properties.GridProperties.RowCount -- but that gives the TOTAL number of rows in the sheet (whether the cells have values or not).
Same goes for sheet.Properties.GridProperties.ColumnCount -- gives the TOTAL number of columns in the sheet (whether the cells have values or not).
Any links or ideas are welcome.
I understand that you want to know the last row of data in your sheet. In that case, you can use a simple GET with an open-ended range. For example, let's assume that your sheet only has two columns; you can then set the range to A1:B. That range covers both columns in full, but the response only goes as far as the data does. At that point you already have an array filled with your data range, so you only have to look at the index of its last element to know the last row. If you don't know how many columns your sheet has, widen the range in the same way (e.g. A1:Z). Feel free to ask if anything about this approach is unclear.
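The counting step described above could be sketched roughly as follows. This is only an illustration: it assumes the values arrive as the IList<IList<object>> that the .NET client exposes on ValueRange.Values, and that the requested range starts at A1 so list positions map directly to sheet positions. SheetExtent and Find are hypothetical names, not part of the API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the Sheets API).
public static class SheetExtent
{
    // Returns the one-based row and column of the furthest cells that hold a value,
    // or (0, 0) when nothing holds a value. Assumes the data starts at A1.
    public static (int LastRow, int LastColumn) Find(IList<IList<object>> values)
    {
        int lastRow = 0, lastColumn = 0;
        if (values == null) return (0, 0);

        for (int r = 0; r < values.Count; r++)
        {
            IList<object> row = values[r];
            if (row == null) continue;

            for (int c = 0; c < row.Count; c++)
            {
                // Treat null and empty strings as "no value".
                if (string.IsNullOrEmpty(row[c]?.ToString())) continue;
                lastRow = Math.Max(lastRow, r + 1);
                lastColumn = Math.Max(lastColumn, c + 1);
            }
        }
        return (lastRow, lastColumn);
    }
}
```

For the C139 example in the question, such a helper would be expected to report row 139 and column 3 (one-based), provided the request covered at least A1:C.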
I'm reading an .xlsx spreadsheet into a C# console app with a view to outputting the content as a formatted xml file (to be picked up by another part of the system further down the line).
The problem with the .xlsx file is that it's a pro-forma input document based on, and replacing, an old paper-based order form we used to provide to customers, and the input fields aren't organised as a series of similar rows (except in the lower part of the document, which consists of up to 99 rows of order detail lines). Some of the rows in the header part of the form/sheet are a mixture of label text AND data; the same goes for the columns.
Effectively, what I need to do is to be able to cherry pick data from the initial dozen or so rows in order to poke data into the xml structure; the latter part of the document I can process by iterating over the rows for the order detail lines.
I can't use Interop as this will end up as an Azure function - so I've used ExcelDataReader to convert the spreadsheet to a dataset, then convert that dataset to a new dataset entirely composed of string values. But I haven't been able to successfully point to individual cells as I had expected to be using syntax something like
var cellValue = MyDataSet.Cell[10, 2];
I'd be grateful for any advice as to how I might get the result I need.
A DataSet has Tables, and those have Rows, which hold the column values.
A worksheet becomes a DataTable (with Columns), and its cells become Rows and column values.
To find the cell value at [10,2] on the first Worksheet do:
var cellValue = MyDataSet.Tables[0].Rows[10][2];
Remember that cellValue will be of type object. Cast accordingly. Both indexes are zero-based, so Rows[10][2] is the eleventh row and the third column.
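A small wrapper around that lookup might look like the sketch below. It is only an illustration: DataSetCells and GetCell are hypothetical names, and returning null for out-of-range indexes or DBNull cells is an assumption about what is convenient when cherry-picking header fields.

```csharp
using System;
using System.Data;

// Hypothetical helper for reading single cells out of a DataSet.
public static class DataSetCells
{
    // Returns the cell at zero-based (row, column) of the given table as a string,
    // or null when an index is out of range or the cell is DBNull.
    public static string GetCell(DataSet dataSet, int tableIndex, int row, int column)
    {
        if (tableIndex < 0 || tableIndex >= dataSet.Tables.Count) return null;

        DataTable table = dataSet.Tables[tableIndex];
        if (row < 0 || row >= table.Rows.Count) return null;
        if (column < 0 || column >= table.Columns.Count) return null;

        object value = table.Rows[row][column];
        return value == DBNull.Value ? null : Convert.ToString(value);
    }
}
```

With this in place, the cell from the question would be read as DataSetCells.GetCell(MyDataSet, 0, 10, 2).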
When I send data to Excel it ignores the merged "property" of some cells and just writes to the first cell it finds. So assuming I have column A and column B merged and I am sending data to column A and C, it actually splits the merged column so I am left with an empty column B.
Here is some code for context (some variables have been kept generic):
Range cells = this.Worksheet.Cells;
Range cell = (Range)cells[rowIndex, columnIndex];
Boolean merged = (Boolean)cell.MergeCells; // Here I am trying to determine if the cell is merged.
My problem is that .MergeCells always returns false. What am I doing wrong here? I know that in the Excel worksheet the cells are merged.
The problem is you are casting to a boolean, and MergeCells is not always guaranteed to give you back a boolean, as outlined in this more recent question: how to detect merged cells in c# using MS interop excel. You need to also check for the value of null - see the linked question for how to do that.
Hypothesis
So what's probably happening to your code is the null value casts back to false, even though what the null value actually indicates is that there are merged cells in the range.
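A null-tolerant check along the lines this answer suggests might look like the sketch below. It assumes that the value read from MergeCells can be a bool, null, or DBNull, and it treats the last two as "no definite answer" rather than casting them. MergeState and Interpret are hypothetical names.

```csharp
using System;

// Hypothetical helper for interpreting the raw value read from Range.MergeCells.
public static class MergeState
{
    // Returns true or false when the value is a definite answer, and null when
    // the value is null or DBNull (assumed here to mean a partly merged range).
    public static bool? Interpret(object mergeCells)
    {
        if (mergeCells == null || mergeCells is DBNull) return null;
        if (mergeCells is bool flag) return flag;

        // Fall back to a conversion for other primitive representations.
        return Convert.ToBoolean(mergeCells);
    }
}
```

It could then be called as bool? merged = MergeState.Interpret((object)cell.MergeCells); where the cast to object keeps the call statically bound when the property is exposed as dynamic.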
The answer is: Your code is correct.
Boolean merged = (Boolean)cell.MergeCells; //Cast from dynamic{bool} to bool
This works for me (Excel 2013 on Windows 7).
I have noticed both true and false values in my own tests.
So maybe your worksheet's cells just DO NOT CONTAIN a merged cell!?
I am using Microsoft Interop to convert excel files into csv files. I use sheet.SaveAs function.
My initial excel sheet has data from A1 to AZ columns for 100 rows.
I need in the CSV just the data from A1 to AP and only for 50 rows.
Using the Range function, I delete rows 51-100 and clear the contents of the same rows, but when I save as CSV I still find rows 51-100 as below (just commas). I do not want to see these commas in the CSV.
,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,
The same goes for columns AQ-AZ. I do not want this data in the CSV. I delete and clear the contents using the Range function, yet columns AQ-AZ still appear in the CSV file as “,,,,,,,,,,,,,,,,,,,,,”.
Is there a way to save an XLS as CSV with only the range that I want to see in the CSV file? Is there a way to control the range that goes into the CSV file?
In short, I want the CSV file to contain just the data for columns A to AP for 50 rows, with no empty trailing “,”s. Is there a way?
The issue you are describing seems like a "Last Cell" issue. The last cell is the original end of your data, even after you delete rows/columns.
Here is what Microsoft has to say about it: How to reset the last cell in Excel
I seem to remember a programmatic way of doing this, but for the life of me, I cannot recall how.
Having looked at that info, maybe you could rethink how you can do this.
Perhaps you could just read the data you need and write it out yourself.
i.e. for each row in the range, get the row's value (which will be an array of object), convert it to an array of string, string.Join it with a comma as the delimiter, and append the result to a .csv file.
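The "write it out yourself" idea could be sketched as below. This is an illustration under assumptions, not a definitive implementation: it expects the cell values as a two-dimensional object array (which is what reading a multi-cell range's values through Interop is assumed to give you), and RangeCsv and ToCsvLines are hypothetical names. The quoting of fields that contain commas or quotes is an addition the answer does not mention.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper that turns a block of cell values into CSV lines.
public static class RangeCsv
{
    // Converts at most maxRows x maxColumns cells of a 2-D array to CSV lines.
    // GetLowerBound is used so that 1-based arrays (as COM interop produces) also work.
    public static List<string> ToCsvLines(object[,] cells, int maxRows, int maxColumns)
    {
        int rowStart = cells.GetLowerBound(0), colStart = cells.GetLowerBound(1);
        int rows = Math.Min(maxRows, cells.GetLength(0));
        int columns = Math.Min(maxColumns, cells.GetLength(1));
        var lines = new List<string>(rows);

        for (int r = 0; r < rows; r++)
        {
            var fields = new string[columns];
            for (int c = 0; c < columns; c++)
                fields[c] = Escape(cells[rowStart + r, colStart + c]);
            lines.Add(string.Join(",", fields));
        }
        return lines;
    }

    // Quotes a field when it contains a comma, a double quote, or a line break.
    private static string Escape(object value)
    {
        string text = Convert.ToString(value, CultureInfo.InvariantCulture) ?? string.Empty;
        if (text.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0) return text;
        return "\"" + text.Replace("\"", "\"\"") + "\"";
    }
}
```

For the case in the question, the lines for columns A to AP and 50 rows would come from ToCsvLines(values, 50, 42) and could be written with System.IO.File.WriteAllLines. Because only the requested cells are emitted, no trailing empty columns or rows reach the file.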
Clearing the contents as suggested in another answer did not work for me; what did work was copying the populated columns into a new worksheet and overwriting the old CSV.
Simply select the trailing empty columns in Excel, right click and select: clear contents. Then save.
How does one delete a column (or multiple columns) in Excel?
eg. How to delete column C and shift the rest left?
Here is the solution to make it clearer (thanks to Leniel for the link)
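// Requires: using System.Reflection; using Excel = Microsoft.Office.Interop.Excel;
Excel.Range range = (Excel.Range)sheet.get_Range("C1", Missing.Value);
// Delete the whole of column C; the remaining columns shift left.
range.EntireColumn.Delete(Missing.Value);
// Release the COM reference to the range once it is no longer needed.
System.Runtime.InteropServices.Marshal.ReleaseComObject(range);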
This was the first result I hit and deleting a column in Excel doesn't need as much code as the current answers suggest. In fact (assuming you have a Worksheet object already, listed below as mySheet) all that is needed for the original question is:
mySheet.Columns["C"].Delete();
If you want to delete multiple columns then:
mySheet.Columns["C:D"].Delete();
You can pass a shift direction to the Delete method (see https://learn.microsoft.com/en-us/dotnet/api/microsoft.office.interop.excel.xldeleteshiftdirection?view=excel-pia), i.e. mySheet.Columns["C"].Delete(XlDeleteShiftDirection.xlShiftToLeft), but there's no need, as the Delete method is smart enough to realise that the Range you are selecting is a single column, so it will do this automatically.
You can also use a numeric value to designate the column, e.g. mySheet.Columns[2].Delete(). The index is one-based, so this deletes column B; use 3 for column C.
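If you prefer the numeric form but think in column letters, the mapping between the two can be computed. The sketch below is a hypothetical helper (ColumnNames and ToIndex are illustrative names, not part of Interop) that converts a column name to the one-based index Excel uses.

```csharp
using System;

// Hypothetical helper: converts column letters to Excel's one-based column index.
public static class ColumnNames
{
    // "A" -> 1, "C" -> 3, "Z" -> 26, "AA" -> 27.
    public static int ToIndex(string letters)
    {
        if (string.IsNullOrEmpty(letters))
            throw new ArgumentException("Column name must not be empty.", nameof(letters));

        int index = 0;
        foreach (char ch in letters.ToUpperInvariant())
        {
            if (ch < 'A' || ch > 'Z')
                throw new ArgumentException("Column name must contain only letters.", nameof(letters));

            // Column names behave like a base-26 number without a zero digit.
            index = index * 26 + (ch - 'A' + 1);
        }
        return index;
    }
}
```

With it, mySheet.Columns[ColumnNames.ToIndex("C")].Delete() would target the same column as mySheet.Columns["C"].Delete().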
Here is how to do it:
http://bytes.com/topic/c-sharp/answers/258110-how-do-you-delete-excel-column
http://quicktestprofessional.wordpress.com/2008/02/14/delete-columns-from-xl-sheet/