OpenXML Copy Spreadsheet to new Workbook - c#

What I am trying to do: .Net Core Controller -> Read file using OpenXML -> Create new spreadsheet document using OpenXML with selected sheets from the previously read file -> return the newly created file.
Some caveats: These sheets that need to be copied to the new workbook will have formulas, references, dataValidations, named ranges and other sorts of links to other sheets in the original workbook that shouldn't be copied. We only want to copy the values and styles.
What I have tried so far:
Approach 1:
using var stream = new MemoryStream();
SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Create(stream, SpreadsheetDocumentType.Workbook);
WorkbookPart workbookpart = spreadsheetDocument.AddWorkbookPart();
workbookpart.Workbook = new Workbook();
WorksheetPart worksheetPart = workbookpart.AddNewPart<WorksheetPart>();
worksheetPart.Worksheet = new Worksheet(new SheetData());
Sheets sheets = spreadsheetDocument.WorkbookPart.Workbook.AppendChild<Sheets>(new Sheets());
workbookpart.Workbook.Save();
using var stream2 = new MemoryStream();
stream2.Write(workbook, 0, workbook.Length);
stream2.Position = 0;
var document = SpreadsheetDocument.Open(stream2, true);
Sheet sheetToCopy = document.WorkbookPart.Workbook.Sheets.Descendants<Sheet>().Where(x => x.Name == "Report Sheet").First();
sheets.Append(sheetToCopy.CloneNode(false));
workbookpart.Workbook.Save();
spreadsheetDocument.Close();
return stream.ToArray();
Approach 2:
I haven't coded this yet, but the idea is to copy each cell's value and style from the original sheet to the new sheet. I am concerned about how long this would take, because the sheets can get very large.
Approach 3:
I kind of have this working. The process here is to modify the originally downloaded workbook itself by removing the unnecessary worksheets and clearing the named ranges and data validations. But no matter how many of these I delete from the original workbook, there is always another validation or conditional formatting rule still lingering around.
I am really open to any ideas and would greatly appreciate some help from the community.
Thank you for taking a look.
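For Approach 3, the cleanup can be sketched roughly as below. This is a minimal sketch, not a complete solution: the class and method names (SheetCleaner, StripLinks, RemoveWorkbookLinks) are hypothetical, and it assumes the cached cell values stored next to each formula are the values you want to keep. It may not catch everything: newer Excel versions can also store validations and conditional formatting inside the worksheet's extension list (extLst), which could be what keeps lingering.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

static class SheetCleaner
{
    // Hypothetical helper: strips link-carrying markup from one worksheet in place.
    public static void StripLinks(WorksheetPart worksheetPart)
    {
        Worksheet worksheet = worksheetPart.Worksheet;

        // Removing the formula element leaves the cached value the cell last displayed.
        foreach (CellFormula formula in worksheet.Descendants<CellFormula>().ToList())
            formula.Remove();

        // Top-level validation and conditional formatting blocks.
        foreach (DataValidations validations in worksheet.Elements<DataValidations>().ToList())
            validations.Remove();
        foreach (ConditionalFormatting formatting in worksheet.Elements<ConditionalFormatting>().ToList())
            formatting.Remove();

        worksheet.Save();
    }

    // Hypothetical helper: removes workbook-level items that point at other sheets.
    public static void RemoveWorkbookLinks(WorkbookPart workbookPart)
    {
        // Named ranges live on the workbook, not on the individual sheets.
        workbookPart.Workbook.DefinedNames?.Remove();

        // The calculation chain lists formula cells; it is stale once formulas are gone.
        if (workbookPart.CalculationChainPart != null)
            workbookPart.DeletePart(workbookPart.CalculationChainPart);

        workbookPart.Workbook.Save();
    }
}
```

Calling StripLinks on each worksheet you keep, then RemoveWorkbookLinks once, leaves values and styles untouched because neither the CellValue elements nor the style indexes are modified.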

Related

OpenXML - embedding objects in Excel C#

I am trying to embed object into .xlsx document and copy sheets with embedded objects.
1. Copying sheets
This looks like a straightforward issue. I have created a method to copy the sheets:
static void CopySheetInsideWorkbook(string filename, string sheetName, string clonedSheetName)
{
    using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(filename, true))
    {
        WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
        WorksheetPart sourceSheetPart = GetWorksheetPartByName(spreadsheetDocument, sheetName);
        SpreadsheetDocument tempSheet =
            SpreadsheetDocument.Create(new MemoryStream(), spreadsheetDocument.DocumentType);
        WorkbookPart tempWorkbookPart = tempSheet.AddWorkbookPart();
        WorksheetPart tempWorksheetPart = tempWorkbookPart.AddPart<WorksheetPart>(sourceSheetPart);
        WorksheetPart clonedSheet = workbookPart.AddPart<WorksheetPart>(tempWorksheetPart);
        Sheets sheets = workbookPart.Workbook.GetFirstChild<Sheets>();
        Sheet copiedSheet = new Sheet
        {
            Name = clonedSheetName,
            Id = workbookPart.GetIdOfPart(clonedSheet),
            SheetId = (uint) sheets.ChildElements.Count + 1
        };
        sheets.Append(copiedSheet);
        workbookPart.Workbook.Save();
    }
}
The output is as expected, but the embedded files are copied as "Picture" rather than "Object". I unzipped the .xlsx file and everything looks legitimate, i.e. similar to the sheet I copied. Yet the embedded file still cannot be opened on the copied sheet. All images and strings are displayed correctly.
2. Embedding the object
What I understand I need to do is:
Convert object into oleObject - this will be separate fun.
Add DrawingsPart - It looks like it's read-only and I can only add ImagePart.
Embed Object
Connect both the drawing and the embedded object part together and allocate them to some range in the spreadsheet.
static void EmbedFileXlsx(string path, string embeddedFilePath, string placeholderImagePath)
{
    using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(path, true))
    {
        WorksheetPart sourceSheetPart = GetWorksheetPartByName(spreadsheetDocument, "Test");
        var imagePart = sourceSheetPart.AddImagePart(ImagePartType.Emf, "rId1");
        // Dispose the file streams once their data has been copied into the parts.
        using (var imageStream = File.Open(placeholderImagePath, FileMode.Open))
            imagePart.FeedData(imageStream);
        var embeddedObject =
            sourceSheetPart.AddEmbeddedObjectPart(@"application/vnd.openxmlformats-officedocument.oleObject");
        using (var objectStream = File.Open(embeddedFilePath, FileMode.Open))
            embeddedObject.FeedData(objectStream);
        spreadsheetDocument.Save();
    }
}
This code just adds embedded objects into the file but does not create any type of relationship between them. This means that file is not visible on the spreadsheet.
I tried copying sheets using ClosedXML as well, but unfortunately neither this nor embedding is supported.
I also worked out how to copy a sheet into a new document with all embedded objects by editing the .xml files inside the spreadsheet directly, but I do not think that would be very productive, and I would like to achieve this using the methods inside OpenXML. It looks like everything is there, but something is amiss.
I am no expert in this, but could this help you on your way?
spreadsheetDocument.CreateRelationshipToPart(SOME ID);

OpenXML: How to copy a worksheet to another workbook?

I need to merge the worksheets of some workbooks into one new workbook. What I tried is this, but I am getting "Cannot insert the OpenXmlElement "newChild" because it is part of a tree.".
using (var document = SpreadsheetDocument.Create(destinationFileName, SpreadsheetDocumentType.Workbook))
{
    // Add a WorkbookPart to the document
    var workbookPart = document.AddWorkbookPart();
    workbookPart.Workbook = new Workbook();
    // Add a WorksheetPart to the WorkbookPart
    var worksheetPart = workbookPart.AddNewPart<WorksheetPart>();
    worksheetPart.Worksheet = new Worksheet(new SheetData());
    foreach (var sourceFileName in sourceFileNames)
    {
        using (var sourceDocument = SpreadsheetDocument.Open(sourceFileName, false))
        {
            var sheet = (Sheet)sourceDocument.WorkbookPart.Workbook.Sheets.FirstChild;
            workbookPart.Workbook.AppendChild(new Worksheet(sheet));
        }
    }
}
The error message means that you are trying to add an OpenXmlElement (a Sheet in this case) that already has a parent OpenXmlCompositeElement (a Sheets object in this case) as a new child to another OpenXmlCompositeElement (a Worksheet in this case).
In the following line item, you are grabbing a reference to a Sheet object that already has a parent Sheets object.
var sheet = (Sheet)sourceDocument.WorkbookPart.Workbook.Sheets.FirstChild;
With new Worksheet(sheet), you are trying to add that same sheet to another parent, i.e., the Worksheet instance that you are creating. That does not work.
Assuming for a moment that it makes sense to add a Sheet to a Worksheet, your second line would have to be rewritten as follows:
workbookPart.Workbook.AppendChild(new Worksheet(sheet.CloneNode(true)));
In the above line of code, sheet is replaced with sheet.CloneNode(true), which makes a deep copy of your original Sheet object that does not have a parent. Thus, the clone can be added as a new child to another parent.
While the above solution is an answer to your immediate question (because it avoids the error message), your code does not make sense, because it does not create valid Open XML markup. Worksheet instances should not be added to Workbook instances, and Sheet instances should not be added to Worksheet instances. This is not how you would copy multiple worksheets, which is a very complicated task that requires you to create multiple worksheet parts, link those worksheet parts to your workbook part, and consider shared strings, styles, and other things.
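A part-level copy, as opposed to an element-level one, might look roughly like the sketch below. It is only an outline under stated assumptions: SheetCopier and CopyFirstSheet are hypothetical names, the destination workbook part already has a Workbook root, and AddPart is relied on to import a copy of the worksheet part (and the parts it references) into the destination package. Shared strings and styles belong to the workbook part, not the worksheet part, so they are not carried over here; cells that reference them by index would still need to be remapped.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

static class SheetCopier
{
    // Hypothetical helper: copies the first worksheet of `source` into `destination`.
    public static void CopyFirstSheet(
        SpreadsheetDocument source, SpreadsheetDocument destination, string newName)
    {
        WorkbookPart sourceWorkbookPart = source.WorkbookPart;
        WorkbookPart destWorkbookPart = destination.WorkbookPart;

        // A Sheet element is only a pointer; the data lives in the related WorksheetPart.
        Sheet sourceSheet = sourceWorkbookPart.Workbook.Sheets.Elements<Sheet>().First();
        var sourcePart = (WorksheetPart)sourceWorkbookPart.GetPartById(sourceSheet.Id);

        // Import a copy of the worksheet part into the destination package.
        WorksheetPart copiedPart = destWorkbookPart.AddPart(sourcePart);

        // Create the Sheets collection on first use.
        Sheets sheets = destWorkbookPart.Workbook.Sheets
            ?? destWorkbookPart.Workbook.AppendChild(new Sheets());

        // SheetId must be unique within the workbook; take max + 1.
        uint nextSheetId = sheets.Elements<Sheet>()
            .Select(s => s.SheetId.Value)
            .DefaultIfEmpty(0u)
            .Max() + 1;

        // Link the new part to the workbook with a fresh Sheet element.
        sheets.Append(new Sheet
        {
            Name = newName,
            Id = destWorkbookPart.GetIdOfPart(copiedPart),
            SheetId = nextSheetId
        });
        destWorkbookPart.Workbook.Save();
    }
}
```

Note that no Sheet element is cloned or moved between trees: a new Sheet is created for the destination, which sidesteps the "part of a tree" error entirely.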

OpenXML Adding AutoFilter to document using SAX

For some reason my code will not add the auto filter to the spreadsheet. The file generates fine; however, when it is opened, the autofilter is not present. Below is the relevant snippet from my method. I attempt to append the autofilter to the worksheet and then use the OpenXmlWriter to write to the document.
//create worksheet part, and add it to the sheets collection in workbook
WorksheetPart wsp = wbp.AddNewPart<WorksheetPart>();
OpenXmlWriter writer = OpenXmlWriter.Create(wsp);
var worksheet = new Worksheet();
worksheet.AppendChild<AutoFilter>(new AutoFilter() { Reference = "A:BA" });
writer.WriteStartElement(worksheet);
writer.WriteStartElement(new SheetData());
Found the answer myself. The autofilter must be written after the end of the SheetData element, not appended to the Worksheet object before writing.
writer.WriteEndElement(); //end of SheetData
writer.WriteElement(new AutoFilter() { Reference = "A:BA" });
writer.WriteEndElement(); //end of worksheet
writer.Close();
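Put together, the element order can be sketched as a small self-contained example. This is an illustration under assumptions, not the asker's full method: the class name SaxAutoFilter, the sheet name "Data", the single header cell, and the "A1:A1" filter range are all made up for the example.

```csharp
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

static class SaxAutoFilter
{
    // Hypothetical helper: builds a one-cell workbook in memory with an autofilter.
    public static byte[] Build()
    {
        using var stream = new MemoryStream();
        using (var doc = SpreadsheetDocument.Create(stream, SpreadsheetDocumentType.Workbook))
        {
            WorkbookPart wbp = doc.AddWorkbookPart();
            WorksheetPart wsp = wbp.AddNewPart<WorksheetPart>();

            using (OpenXmlWriter writer = OpenXmlWriter.Create(wsp))
            {
                writer.WriteStartElement(new Worksheet());
                writer.WriteStartElement(new SheetData());
                writer.WriteStartElement(new Row { RowIndex = 1 });
                writer.WriteElement(new Cell
                {
                    CellReference = "A1",
                    DataType = CellValues.InlineString,
                    InlineString = new InlineString(new Text("Name"))
                });
                writer.WriteEndElement(); // end of Row
                writer.WriteEndElement(); // end of SheetData

                // The autofilter is a sibling of SheetData and must come after it.
                writer.WriteElement(new AutoFilter { Reference = "A1:A1" });
                writer.WriteEndElement(); // end of Worksheet
            }

            // Link the worksheet part to the workbook.
            wbp.Workbook = new Workbook(new Sheets(
                new Sheet { Name = "Data", SheetId = 1, Id = wbp.GetIdOfPart(wsp) }));
            wbp.Workbook.Save();
        }
        return stream.ToArray();
    }
}
```

With a streaming writer nothing is reordered for you, so elements have to be emitted in the order the worksheet schema expects.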

How to set active sheet with Open XML SDK 2.5

Using the example in "How to Copy a Worksheet within a Workbook", I have successfully been able to clone/copy sheets in my Excel file. However, when I open the file in Excel, the 2nd sheet is the active (visible) sheet. I haven't been able to locate a property that controls this. Is there any way to specify which sheet is active?
I've tried to force it by opening and editing the first sheet in the file, thinking that the last edited sheet would be the active one, but that didn't work either.
Any help would be great. TIA
Update: looking at the workbook.xml inside the package (after renaming the .xlsx to .zip), I came across the 'activeTab' property. I made a quick change to my code and it seems to work just fine:
public void SetFirstSheetInFocus(String xlsxFile)
{
    using (SpreadsheetDocument spreadSheet = SpreadsheetDocument.Open(xlsxFile, true))
    {
        //Get a reference to access the main Workbook part, which contains all references
        WorkbookPart _workbookPart = spreadSheet.WorkbookPart;
        if (_workbookPart != null)
        {
            WorkbookView _workbookView = spreadSheet.WorkbookPart.Workbook.BookViews.ChildElements.First<WorkbookView>();
            if (_workbookView != null)
            {
                _workbookView.ActiveTab = 0; // 0 for first or whatever tab you want to use
            }
            // Save the workbook.
            _workbookPart.Workbook.Save();
        }
    }
}
If the name of your sheet is in the variable sheetName, you can set the sheet with that name active like this:
using (var spreadsheetDoc = SpreadsheetDocument.Open(emptyHIPTemplatePath, true /* isEditable */, new OpenSettings { AutoSave = false }))
{
    var workbookPart = spreadsheetDoc.WorkbookPart;
    var workBook = spreadsheetDoc.WorkbookPart.Workbook;
    var sheet = workBook.Descendants<Sheet>().FirstOrDefault(s => s.Name == sheetName);
    var sheetIndex = workBook.Descendants<Sheet>().ToList().IndexOf(sheet);
    var workBookView = workBook.Descendants<WorkbookView>().First();
    workBookView.ActiveTab = Convert.ToUInt32(sheetIndex);
    ...
    workBook.Save();
}
From Vincent Tan's book:
The SheetId property doesn't determine the order. The order of
appending the Sheet classes to the Sheets class, does.
When you add a sheet, it gets the next index, but a single sheet does not have an index. OpenXML gives it an index when you are done adding sheets. Again, from Vincent Tan's book:
Let's say you have 3 worksheets named Sheet1, Sheet2 and Sheet3.
However, when you appended the corresponding Sheet classes, you did it
as Sheet2, Sheet3 and Sheet1, in that order.

Find and replace value in Excel using C#

How can I find some value from cell and replace by new value in Excel?
I tried this, but it doesn't work:
Microsoft.Office.Interop.Excel.Application xlapp = new Microsoft.Office.Interop.Excel.Application();
Microsoft.Office.Interop.Excel.Workbook wb =default(Microsoft.Office.Interop.Excel.Workbook);
wb = xlapp.Workbooks.Open(FileName.ToString());
wb.Worksheets[0].Cells.Replace("find","replace");
I would recommend you use NPOI which can be accessed either via codeplex or directly through Nuget in Visual Studio. It gives you the ability to easily upload, edit and create spreadsheets in .NET
Example of uploading a spreadsheet:
HSSFWorkbook hssfworkbook;
void InitializeWorkbook(string path)
{
    //read the template via FileStream, it is suggested to use FileAccess.Read to prevent file lock.
    //book1.xls is an Excel-2007-generated file, so some new unknown BIFF records are added.
    using (FileStream file = new FileStream(path, FileMode.Open, FileAccess.Read))
    {
        hssfworkbook = new HSSFWorkbook(file);
    }
}
You can then use the IRow and ICell collections of the spreadsheet to locate and edit the data you need before doing an export.
More examples can be found here
If interested, you can use GemBox.Spreadsheet for this, like so:
SpreadsheetInfo.SetLicense("FREE-LIMITED-KEY");
// Load your XLS, XLSX, ODS or CSV file.
ExcelFile wb = ExcelFile.Load(FileName.ToString());
ExcelWorksheet ws = wb.Worksheets[0];
// Replace all "find" occurrences with "replace" text.
int row, column;
while (ws.Cells.FindText("find", out row, out column))
    ws.Cells[row, column].ReplaceText("find", "replace");
// Save your XLS, XLSX, ODS or CSV file.
wb.Save(FileName.ToString());
Also you can find another searching in Excel example here.
Excel Interop collections are 1-based, so Worksheets[0] is out of range. All you have to do is replace
wb.Worksheets[0].Cells.Replace("find","replace");
with
wb.Worksheets[1].Cells.Replace("find","replace");
