Reading Excel SheetData with OpenXml returns null - c#

I'm trying to read a fairly simple Excel spreadsheet (saved as an xlsx) using the OpenXML NuGet package.
I'm able to locate the specific Sheet I'm interested in, but calling sheet.GetFirstChild<SheetData>() on it, in order to access the cell values, always returns null. I've looked at several examples online, and they all seem to agree that this is the correct way to access the data.
The xlsx file I'm trying to read can be downloaded here.
Here is the code I'm using to read the file. The line var data = sheet.GetFirstChild<SheetData>(); is where the null is returned.
void Main()
{
using (var s = System.IO.File.OpenRead("c:\\ImportTemplate.xlsx"))
using (var document = SpreadsheetDocument.Open(s, false))
{
foreach (var worksheetPart in document.WorkbookPart.WorksheetParts)
{
Sheet sheet = GetSheetFromWorkSheet(document.WorkbookPart, worksheetPart);
var name = sheet.Name.Value;
if (name == "ImportData")
{
var data = sheet.GetFirstChild<SheetData>(); // <-- data is set to null
foreach (var row in data.Descendants<Row>()){
// [...]
}
}
}
}
}
public static Sheet GetSheetFromWorkSheet(WorkbookPart workbookPart, WorksheetPart worksheetPart)
{
string relationshipId = workbookPart.GetIdOfPart(worksheetPart);
IEnumerable<Sheet> sheets = workbookPart.Workbook.Sheets.Elements<Sheet>();
return sheets.FirstOrDefault(s => s.Id.HasValue && s.Id.Value == relationshipId);
}
The above code as an easily runnable LINQPad file can be downloaded from here.
What am I doing wrong?
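A likely cause, offered as a sketch rather than a confirmed diagnosis of the linked file: the Sheet element lives in workbook.xml and only carries the sheet's name and relationship id, so it has no SheetData child and GetFirstChild<SheetData>() returns null. The SheetData element is a child of the Worksheet root of the matching WorksheetPart. In the loop above that would mean calling worksheetPart.Worksheet.GetFirstChild<SheetData>() instead. The class and method names below (SheetDataExample, ReadRows) are hypothetical and only illustrate the lookup.

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public static class SheetDataExample
{
    // Hypothetical helper: prints the number of cells in every row of the named sheet.
    public static void ReadRows(string path, string sheetName)
    {
        using (var document = SpreadsheetDocument.Open(path, false))
        {
            WorkbookPart workbookPart = document.WorkbookPart;

            // The Sheet element (from workbook.xml) only holds the name and relationship id.
            Sheet sheet = workbookPart.Workbook.Descendants<Sheet>()
                .FirstOrDefault(s => s.Name != null && s.Name.Value == sheetName);
            if (sheet == null) return;

            // The cell data lives in the WorksheetPart that the relationship id points to.
            var worksheetPart = (WorksheetPart)workbookPart.GetPartById(sheet.Id.Value);
            SheetData data = worksheetPart.Worksheet.GetFirstChild<SheetData>();
            if (data == null) return;

            foreach (Row row in data.Elements<Row>())
                Console.WriteLine("Row {0}: {1} cells", row.RowIndex, row.Elements<Cell>().Count());
        }
    }
}
```

Called as SheetDataExample.ReadRows("c:\\ImportTemplate.xlsx", "ImportData"), this should walk the rows of the sheet the question is after, assuming the file contains a sheet with that name.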

Related

OpenXML - embedding objects in Excel C#

I am trying to embed an object into an .xlsx document and copy sheets with embedded objects.
1. Copying sheets
This looks like a straightforward issue. I have created a method to copy the sheets:
static void CopySheetInsideWorkbook(string filename, string sheetName, string clonedSheetName)
{
using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(filename, true))
{
WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
WorksheetPart sourceSheetPart = GetWorksheetPartByName(spreadsheetDocument, sheetName);
SpreadsheetDocument tempSheet =
SpreadsheetDocument.Create(new MemoryStream(), spreadsheetDocument.DocumentType);
WorkbookPart tempWorkbookPart = tempSheet.AddWorkbookPart();
WorksheetPart tempWorksheetPart = tempWorkbookPart.AddPart<WorksheetPart>(sourceSheetPart);
WorksheetPart clonedSheet = workbookPart.AddPart<WorksheetPart>(tempWorksheetPart);
Sheets sheets = workbookPart.Workbook.GetFirstChild<Sheets>();
Sheet copiedSheet = new Sheet
{
Name = clonedSheetName,
Id = workbookPart.GetIdOfPart(clonedSheet),
SheetId = (uint) sheets.ChildElements.Count + 1
};
sheets.Append(copiedSheet);
workbookPart.Workbook.Save();
}
}
The output is as expected, but the embedded files are copied as "Picture" rather than "Object". I unzipped the .xlsx file and everything looks legitimate, i.e. similar to the sheet I copied. Yet the embedded file still cannot be opened on the copied sheet. All images and strings are displayed correctly.
2. Embedding the object
What I understand I need to do is:
Convert object into oleObject - this will be separate fun.
Add DrawingsPart - It looks like it's read-only and I can only add ImagePart.
Embed Object
Connect both the drawing and the embedded object part together and assign them to some range in the spreadsheet.
static void EmbedFileXlsx(string path, string embeddedFilePath, string placeholderImagePath)
{
using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(path, true))
{
WorksheetPart sourceSheetPart = GetWorksheetPartByName(spreadsheetDocument, "Test");
var imagePart = sourceSheetPart.AddImagePart(ImagePartType.Emf, "rId1");
imagePart.FeedData(File.Open(placeholderImagePath, FileMode.Open));
var embeddedObject =
sourceSheetPart.AddEmbeddedObjectPart(@"application/vnd.openxmlformats-officedocument.oleObject");
embeddedObject.FeedData(File.Open(embeddedFilePath, FileMode.Open));
spreadsheetDocument.Save();
}
}
This code just adds the embedded objects to the file but does not create any relationship between them. This means that the file is not visible on the spreadsheet.
I tried copying sheets using ClosedXML as well, but unfortunately it supports neither copying sheets with embedded objects nor embedding.
I also worked out how to copy a sheet into a new document with all embedded objects by manipulating the .xml files inside the spreadsheet, but I do not think that would be very productive, and I would like to achieve this using the methods inside OpenXML. It looks like everything is there, but something is amiss.
I am no expert in this, but could this help you on your way?
spreadsheetDocument.CreateRelationshipToPart(SOME ID);

OpenXML writer usage makes graphics unreadable

As I'm writing quite large xlsx files, I'm using OpenXmlReader and OpenXmlWriter as recommended on this page:
https://blogs.msdn.microsoft.com/brian_jones/2010/06/22/writing-large-excel-files-with-the-open-xml-sdk/
All I do is change formulas inside existing cells and make sure that their values are discarded so that they are recalculated when Excel opens the file.
Here is the function that I'm using:
public void Save(Stream Input, Stream Output)
{
Input.Position = 0;
if (Input != Output)
Input.CopyTo(Output);
using (SpreadsheetDocument document = SpreadsheetDocument.Open(Output, true))
{
WorkbookPart wbPart = document.WorkbookPart;
// force recalculation as we change formulas
wbPart.Workbook.CalculationProperties.ForceFullCalculation = true;
wbPart.Workbook.CalculationProperties.FullCalculationOnLoad = true;
// store the worksheet parts in a separate list because the loop below
// adds and removes elements inside wbPart.WorksheetParts
List<WorksheetPart> originalWsParts = new List<WorksheetPart>();
foreach (WorksheetPart inputWsPart in wbPart.WorksheetParts)
originalWsParts.Add(inputWsPart);
// process all worksheets in the workbook
foreach (WorksheetPart inputWsPart in originalWsParts)
{
string origninalSheetId = wbPart.GetIdOfPart(inputWsPart);
WorksheetPart replacementWsPart = wbPart.AddNewPart<WorksheetPart>();
string replacementWsPartId = wbPart.GetIdOfPart(replacementWsPart);
OpenXmlReader reader = OpenXmlReader.Create(inputWsPart);
OpenXmlWriter writer = OpenXmlWriter.Create(replacementWsPart);
while (reader.Read())
{
logger.Debug(reader.ElementType.Name);
if (reader.ElementType == typeof(Cell) && reader.IsStartElement)
{
writer.WriteStartElement(reader);
// write the cell content, changing the formula and skipping the value
while (reader.Read() && !(reader.ElementType == typeof(Cell) && reader.IsEndElement))
{
if (reader.IsStartElement)
{
if (reader.ElementType == typeof(CellFormula))
{
CellFormula element = reader.LoadCurrentElement() as CellFormula;
element.Text = "SUM(1,2)";
element.CalculateCell = true;
writer.WriteElement(element);
}
else if (reader.ElementType != typeof(CellValue))
{
writer.WriteStartElement(reader);
string elementText = reader.GetText();
if (!String.IsNullOrEmpty(elementText))
writer.WriteString(elementText);
}
}
else if (reader.IsEndElement)
{
if (reader.ElementType != typeof(CellValue))
writer.WriteEndElement();
}
}
writer.WriteEndElement();
}
else
{
if (reader.IsStartElement)
{
writer.WriteStartElement(reader);
string elementText = reader.GetText();
if (!String.IsNullOrEmpty(elementText))
writer.WriteString(elementText);
}
else if (reader.IsEndElement)
{
writer.WriteEndElement();
}
}
}
reader.Close();
writer.Close();
Sheet sheet = wbPart.Workbook.Descendants<Sheet>()
.Where(s => s.Id.Value.Equals(origninalSheetId)).First();
sheet.Id.Value = replacementWsPartId;
wbPart.DeletePart(inputWsPart);
}
}
}
It works quite well on the simplest workbooks, but it creates unreadable files when there are drawings inside sheets in the file.
For instance, if I have a drawing on Sheet1, when Excel opens the saved file, it complains that the file has unreadable parts and shows me the drawing in the list of things it has deleted.
I unzipped the xlsx file and compared the sheetX.xml files, and apart from the added x: prefix, they are the same.
Obviously, I'm missing something but reading the various docs I could find, nothing came to me. I believe there is a reference to the original worksheet part that has not been updated but I don't see any drawings descendant in the workbook.
Any help is most welcome.
Update
I looked more closely at the files' content and there are two folders missing inside the xl folder: charts and drawings.
So clearly, I'm missing code that would add those to the final archive, but I can't (yet) figure out what code that is.
The above code creates a new WorksheetPart inside the loop, when the intention was always to clone the input part.
I thus went looking for a way to clone a worksheet, but there is no ClonePart() method on WorkbookPart.
Fortunately for me, someone else already had this issue and Brian Jones wrote a post on his blog about it:
https://blogs.msdn.microsoft.com/brian_jones/2009/02/19/how-to-copy-a-worksheet-within-a-workbook/
So, I changed the start of the using statement to this:
using (SpreadsheetDocument document = SpreadsheetDocument.Open(Output, true), tmpDocument = SpreadsheetDocument.Create(new MemoryStream(), document.DocumentType))
{
WorkbookPart wbPart = document.WorkbookPart;
WorkbookPart tmpWbPart = tmpDocument.AddWorkbookPart();
Then instead of calling AddNewPart<WorksheetPart>, I now have the following code:
// can't directly clone, so add to the temporary workbook part and then back
// into the working workbook part
WorksheetPart tmpWsPart = tmpWbPart.AddPart(inputWsPart);
WorksheetPart replacementWsPart = wbPart.AddPart(tmpWsPart);
tmpWbPart.DeletePart(tmpWsPart);
tmpWsPart = null;
With those changes, I now have my Drawing part in the worksheet and the associated folders in the xlsx file.
As a result, Excel no longer complains when opening the file and the graphs are all there and updated.

How to set active sheet with Open XML SDK 2.5

Using the example here: How to Copy a Worksheet within a Workbook
I have successfully been able to clone/copy sheets in my Excel file. However, when I open the file in Excel, the 2nd sheet is the active (visible) sheet. I haven't been able to locate a property that controls this. Is there any way to specify which sheet is active?
I've tried to force it by opening and editing the first sheet in the file, thinking that the last edited sheet would be the active one, but that didn't work either.
Any help would be great. TIA
Update: looking at the workbook.xml produced after renaming the .xlsx to .zip, I came across the 'activeTab' property. I made a quick change to my code and it seems to work just fine:
public void SetFirstSheetInFocus(String xlsxFile)
{
using (SpreadsheetDocument spreadSheet = SpreadsheetDocument.Open(xlsxFile, true))
{
//Get a reference to access the main Workbook part, which contains all references
WorkbookPart _workbookPart = spreadSheet.WorkbookPart;
if (_workbookPart != null)
{
WorkbookView _workbookView = spreadSheet.WorkbookPart.Workbook.BookViews.ChildElements.First<WorkbookView>();
if (_workbookView != null)
{
_workbookView.ActiveTab = 0; // 0 for first or whatever tab you want to use
}
// Save the workbook.
_workbookPart.Workbook.Save();
}
}
}
If the name of your sheet is in the variable sheetName, you can set the sheet with that name active like this:
using (var spreadsheetDoc = SpreadsheetDocument.Open(emptyHIPTemplatePath, true /* isEditable */, new OpenSettings { AutoSave = false }))
{
var workbookPart = spreadsheetDoc.WorkbookPart;
var workBook = spreadsheetDoc.WorkbookPart.Workbook;
var sheet = workBook.Descendants<Sheet>().FirstOrDefault(s => s.Name == sheetName);
var sheetIndex = workBook.Descendants<Sheet>().ToList().IndexOf(sheet);
var workBookView = workBook.Descendants<WorkbookView>().First();
workBookView.ActiveTab = Convert.ToUInt32(sheetIndex);
...
workBook.Save();
}
From Vincent Tan's book:
The SheetId property doesn't determine the order. The order of
appending the Sheet classes to the Sheets class, does.
When you add a sheet, it gets the next index, but a single sheet does not have an index. OpenXML gives it an index when you are done adding sheets. Again, from Vincent Tan's book:
Let's say you have 3 worksheets named Sheet1, Sheet2 and Sheet3.
However, when you appended the corresponding Sheet classes, you did it
as Sheet2, Sheet3 and Sheet1, in that order.

Read/import existing Excel file programmatically (cell-by-cell) in Windows Phone 8

I am working on a Windows Phone 8 app to read and write Excel files. I asked a question here about this, and the comment provided and many other links led me to OpenXml.
All of this helped me understand how to create an Excel file and how to launch it. But now I am stuck on the most basic part: how to read an existing Excel file (probably created outside the app using MS Excel) cell by cell, i.e. I want to access each cell and its value through my code. With OpenXML I did this:
Stream localFile = App.GetResourceStream(new Uri("/ReadExcel;component/jai.xlsx"
,UriKind.Relative)).Stream;
MemoryStream ms = new MemoryStream();
localFile.CopyTo(ms);
DocumentFormat.OpenXml.Packaging.SpreadsheetDocument spreadsheetDoc =
DocumentFormat.OpenXml.Packaging.SpreadsheetDocument.Open(localFile, true);
{
var a = spreadsheetDoc.Package;
// Do work here
}
But it gives me error:
The type 'System.IO.Packaging.Package' is defined in an assembly that is not
referenced. You must add a reference to assembly 'WindowsBase, Version=4.0.0.0
So basically I am stuck at this WindowsBase.dll. I tried all various ways to import an assembly i.e. unblock and all, but nothing works.
So all I want to do is to programmatically access the content of an existing Excel file in my code cell-by-cell.
Please help or suggest whether it is even possible as of now in WP8.
I used the following method to read cells from an xlsx Excel file on Windows Phone 8:
Add the Microsoft Compression library to your project using NuGet
Adapt the code sample from the developer network to your needs - it shows how to read cells from an Excel file (and it needs the Compression lib)
Since I already extended the code a bit to handle empty columns and empty files properly you can also use my code:
public class ExcelReader
{
List<string> _sharedStrings;
List<Dictionary<string, string>> _derivedData;
public List<Dictionary<string, string>> DerivedData
{
get
{
return _derivedData;
}
}
List<string> _header;
public List<string> Headers { get { return _header; } }
// e.g. cellID = H2 - only works for the first 26 columns (single-letter references A to Z)
private int GetColumnIndex(string cellID)
{
return cellID[0] - 'A';
}
public void StartReadFile(Stream input)
{
ZipArchive z = new ZipArchive(input, ZipArchiveMode.Read);
var worksheet = z.GetEntry("xl/worksheets/sheet1.xml");
var sharedString = z.GetEntry("xl/sharedStrings.xml");
// get shared string
_sharedStrings = new List<string>();
// if there is no content the sharedStrings will be null
if (sharedString != null)
{
using (var sr = sharedString.Open())
{
XDocument xdoc = XDocument.Load(sr);
_sharedStrings =
(
from e in xdoc.Root.Elements()
select e.Elements().First().Value
).ToList();
}
}
// get header
using (var sr = worksheet.Open())
{
XDocument xdoc = XDocument.Load(sr);
// get element to first sheet data
XNamespace xmlns = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";
XElement sheetData = xdoc.Root.Element(xmlns + "sheetData");
_header = new List<string>();
_derivedData = new List<Dictionary<string, string>>();
// worksheet empty?
if (!sheetData.Elements().Any())
return;
// build header first
var firstRow = sheetData.Elements().First();
// full of c
foreach (var c in firstRow.Elements())
{
// the c element, if have attribute t, will need to consult sharedStrings
string val = c.Elements().First().Value;
if (c.Attribute("t") != null)
{
_header.Add(_sharedStrings[Convert.ToInt32(val)]);
} else
{
_header.Add(val);
}
}
// build content now
foreach (var row in sheetData.Elements())
{
// skip row 1
if (row.Attribute("r").Value == "1")
continue;
Dictionary<string, string> rowData = new Dictionary<string, string>();
// the "c" elements each represent a column
foreach (var c in row.Elements())
{
var cellID = c.Attribute("r").Value; // e.g. H2
// each "c" element has a "v" element representing the value
string val = c.Elements().First().Value;
// a string? look up in shared string file
if (c.Attribute("t") != null)
{
rowData.Add(_header[GetColumnIndex(cellID)], _sharedStrings[Convert.ToInt32(val)]);
} else
{
// number
rowData.Add(_header[GetColumnIndex(cellID)], val);
}
}
_derivedData.Add(rowData);
}
}
}
}
This works for simple Excel files having one work sheet and some text and number cells. It assumes there is a header row.
Usage is as follows:
var excelReader = new ExcelReader();
excelReader.StartReadFile(excelStream);
After reading excelReader.Headers contains the header names, excelReader.DerivedData contains the rows. Each row is a Dictionary having the header as key and the data as value. Empty cells won't be in there.
Hope this gets you started.
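The GetColumnIndex method above reads only the first character of the cell reference, so it breaks from column AA onward. A minimal sketch of a version that handles multi-letter columns, assuming references of the usual form (letters followed by digits, such as H2 or AB12); the CellReference class name is hypothetical:

```csharp
using System;

public static class CellReference
{
    // Converts the letter prefix of a cell reference ("A1", "AB12")
    // to a zero-based column index: A -> 0, Z -> 25, AA -> 26.
    public static int GetColumnIndex(string cellId)
    {
        int index = 0;
        foreach (char ch in cellId)
        {
            char upper = char.ToUpperInvariant(ch);
            if (upper < 'A' || upper > 'Z')
                break; // reached the row number

            // Column letters form a base-26 number with digits 1..26.
            index = index * 26 + (upper - 'A' + 1);
        }
        return index - 1; // -1 if the reference has no letter prefix
    }
}
```

It could replace the private method in ExcelReader directly, since it takes the same argument and returns the same kind of zero-based index.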
Unfortunately, it is not possible to use the official OpenXML SDK by Microsoft. The reason is exactly the exception you already ran into. WP8 does not have the System.IO.Packaging namespace available which is required to extract/compress the zip-based xlsx file format. Adding WindowsBase.dll won't work either because it is not compiled for WP8.
After googling this for quite some time over the last two years, the only 3 solutions that I know of are (apart from developing Excel support from scratch on your own :) ):
Use the Ag.OpenXML open source project which you can find on http://agopenxml.codeplex.com/ . The source repository contains an implementation to write an Excel file (the downloadable package only contains Word export). I have used this in my WP8 app for quite some time and it works well despite lacking a lot of features. Unfortunately, this package has not been maintained since 2011. However, it might be a good start for you.
Use the commercial libraries of ComponentOne https://www.componentone.com/SuperProducts/StudioWindowsPhone/
Use the commercial libraries of Syncfusion http://www.syncfusion.com/products/windows-phone

Document not saving when Created Using the Open XML Format SDK 2.0 CTP

I want to create an Excel document based on a template using Open XML Format SDK 2.0.
I have followed this tutorial: Creating Documents by Using the Open XML Format SDK 2.0 CTP.
My problem is that the rows and cells I put into the document don't get saved. When I open the document it looks just like the template.
There are no exceptions thrown when I run my code. I figure I have to force the changes to be saved in the document, but I can't figure out how.
Here's some of my code:
public static void GenerateExcelReportToDisk()
{
var factory = new Factory();
var generated = "result.xlsx";
var newFile = Util.GetReportTargetPath() + generated;
var templateFile = Util.GetReportTemplatePath() + @"template.xlsx";
File.Copy(templateFile, newFile, true);
using (var myWorkbook = SpreadsheetDocument.Open(newFile, true))
{
var workbookPart = myWorkbook.WorkbookPart;
var worksheetPart = workbookPart.WorksheetParts.First();
var sheetData = worksheetPart.Worksheet.GetFirstChild<SheetData>();
//Get data
var data = factory.GetAllFixtures().Take(20);
int rowIndex = 3;
foreach (var fixture in data)
{
var pcRate = fixture.PCRate;
var account = fixture.Charter != null ? fixture.Charter.Shortname : null;
var region = fixture.Region != null ? fixture.Region.GroupName : null;
//CreateContentRow is exactly like the tutorial linked above.
var row = CreateContetRow(rowIndex, region, pcRate, account);
rowIndex++;
sheetData.AppendChild(row);
}
//Tried to add myWorkbook.WorkbookPart.Workbook.Save(); here, but it doesn't do anything
myWorkbook.Close();
}
}
Well, I managed to figure this out by myself after a short while.
Posting the answer here in case it will help someone (including myself):
In the line above myWorkbook.Close(); add worksheetPart.Worksheet.Save();
As simple as that...
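A minimal sketch of where that call goes, assuming the first worksheet of the file already contains a SheetData element. The SaveExample class and AppendRowAndSave method are hypothetical names, and the empty Row stands in for the output of CreateContetRow:

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public static class SaveExample
{
    // Hypothetical helper: appends one empty row to the first worksheet and persists it.
    public static void AppendRowAndSave(string path, uint rowIndex)
    {
        using (var document = SpreadsheetDocument.Open(path, true))
        {
            WorksheetPart worksheetPart = document.WorkbookPart.WorksheetParts.First();
            SheetData sheetData = worksheetPart.Worksheet.GetFirstChild<SheetData>();

            // Changes are made to the in-memory DOM of the worksheet part.
            sheetData.AppendChild(new Row { RowIndex = rowIndex });

            // Write the modified DOM back into the worksheet part.
            // Saving the Workbook alone does not cover this part.
            worksheetPart.Worksheet.Save();
        }
    }
}
```

The row index passed in should be higher than that of any row already in the sheet, since Excel expects rows in ascending order.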
