I need to embed an Excel document in a Word document. I used the answer on this SO question -> How can I embed any file type into Microsoft Word using OpenXml 2.0;
Everything works fine except that:
DrawAspect = OVML.OleDrawAspectValues.Icon lets you edit the Excel object by opening a new Excel instance. However, when I edit the data, it is not updated in the Word document.
DrawAspect = OVML.OleDrawAspectValues.Content lets you edit the Excel object directly in the Word document.
My question is, what do I have to change in the code so can I edit the Excel object in the new instance and have it properly reflected in the Word document? I tried everything to no avail.
Something tells me that DrawAspect = OVML.OleDrawAspectValues.Icon suggests that the object acts as an Icon, and changes cannot be properly reflected in this icon.
You could try a way I prorosed here:
How to insert an Image in WORD after a bookmark using OpenXML.
In short, use Open XML SDK 2.0 Productivity Tool (which is a part of Open XML SDK 2.0). Do whatever you need to do with document manually in MS Word. Then open this file in Open XML SDK 2.0 Productivity Tool. Then find the edits you are interested in to see how it is represented in OpenXML format , as well as how to do that programmaticaly.
Hope that helps!
UPDATED:
Okay - I have got better now what's the problem is... So in addition to my advice above I would recommend you to look at this threads on MSDN Forum:
How to modify the imbedded Excel in a Word document supplying chart
data
Embedded excel sheet in word
I let myself to repost the code sample (posted on MSDN Forum by Ji Zhou) just to avoid the deletion of original thread there.
Hope it is helpful enough to retrieve the Excel object from Word, change some cells and embed it back into Word.
public static void Main()
{
using (WordprocessingDocument wDoc = WordprocessingDocument.Open(#"D:\test.docx", true))
{
Stream stream = wDoc.MainDocumentPart.ChartParts.First().EmbeddedPackagePart.GetStream();
using (SpreadsheetDocument ssDoc = SpreadsheetDocument.Open(stream, true))
{
WorkbookPart wbPart = ssDoc.WorkbookPart;
Sheet theSheet = wbPart.Workbook.Descendants<Sheet>().
Where(s => s.Name == "Sheet1").FirstOrDefault();
if (theSheet != null)
{
Worksheet ws = ((WorksheetPart)(wbPart.GetPartById(theSheet.Id))).Worksheet;
Cell theCell = InsertCellInWorksheet("C", 2, ws);
theCell.CellValue = new CellValue("5");
theCell.DataType = new EnumValue<CellValues>(CellValues.Number);
ws.Save();
}
}
}
}
private static Cell InsertCellInWorksheet(string columnName, uint rowIndex, Worksheet worksheet)
{
SheetData sheetData = worksheet.GetFirstChild<SheetData>();
string cellReference = columnName + rowIndex;
Row row;
if (sheetData.Elements<Row>().Where(r => r.RowIndex == rowIndex).Count() != 0)
{
row = sheetData.Elements<Row>().Where(r => r.RowIndex == rowIndex).First();
}
else
{
row = new Row() { RowIndex = rowIndex };
sheetData.Append(row);
}
if (row.Elements<Cell>().Where(c => c.CellReference.Value == columnName + rowIndex).Count() > 0)
{
return row.Elements<Cell>().Where(c => c.CellReference.Value == cellReference).First();
}
else
{
Cell refCell = null;
foreach (Cell cell in row.Elements<Cell>())
{
if (string.Compare(cell.CellReference.Value, cellReference, true) > 0)
{
refCell = cell;
break;
}
}
Cell newCell = new Cell() { CellReference = cellReference };
row.InsertBefore(newCell, refCell);
worksheet.Save();
return newCell;
}
}
Related
I have a spreadsheet template which has a table (Table inserted into sheet from Insert tab) and when the user clicks on the export button these header values of the table should be updated by new values dynamically. For this I'm using c# and DocumentFormat.OpenXml library.
After exporting header values are getting updated in the spreadsheet table but when I open the excel document it says "We found problem with some content in '<>.xlsx' . Do you want us to recover as much as we can ? ".
Also, when I tried updating table rows other than header row, I wasn't getting above popup.
I have tried below approaches to update table header values.
Approach 1
using (SpreadsheetDocument document = SpreadsheetDocument.Open("test.xlsx", true))
{
WorkbookPart wbPart = document.WorkbookPart;
Sheet sheet = wbPart.Workbook.Descendants<Sheet>().Where(s => s.name =="Sheet1").FirstOrDefault();
WorksheetPart wsPart = (WorksheetPart) (wbPart.GetPartById(sheet.Id));
WorkSheet wSheet = wsPart.WorkSheet;
Row row = wSheet.Elements<Row>().Where(r => r.RowIndex ==1 ).FirstOrDefault();
Cell cell = row.Elements<Cell>().Where(c => string.Compare
(c.CellReference.Value, "A1" , true) == 0).First();
cell.CellValue = new CellValue("new Header");
cell.DataType = CellValues.String;
wbPart.WorkBook.save();
}
Approach 2
foreach(TableDefinitionPart tdp in wsPart.TableDefinitionParts )
{
QueryTablePart qtp= tdp .QueryTableParts.FirstOrDefault();
Table excelTable = tdp .Table;
int i = 0;
foreach(TableColumn col in excelTable.TableColumns)
{
col.name.Value = "Header"+i;
i++;
}
}
I tried to use Microsoft.Office.Interop.Excel but it's too slow when it comes to reading large excel documents (it was taking over 5 minutes for me). I read that DocumentFormat.OpenXml is faster when it comes to reading large excel documents but in the documentation it doesn't appear that I can't store the columns and row indexes.
For now, I am also only interested in the first row to get the column headers and I will be reading the rest of the document after some logic. I haven't been able to find a way to read only a portion of the excel document. I want to do something similar to this:
int r = 1; //row index
int c = 1; //column index
while (xlRange.Cells[r,c] != null && xlRange.Cells[r, c].Value2 != null)
{
TagListData.Add(new TagClass { IsTagSelected = false, TagName = xlRange[r, c].Value2.toString(), rIndex = r, cIndex = c });
c += 3;
}
Users will be picking excel documents through openFileDialog so there's no fixed number of rows of columns I can use. Is there a way I could make this work?
Thank you
In OpenXML if a cell has no text it might or might not appear in the list of cells (depends on whether it ever had text or not). Therefore the while (...Value2 != null) type of approach isn't really a safe way to do things in OpenXML.
Here is a very simple approach to reading the first row (written using LINQPad hence the Main and the Dump). Note the (simplified) use of the SharedStringTable to get the real text of the cell:
void Main()
{
var fileName = #"c:\temp\openxml-read-row.xlsx";
using (FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
using (SpreadsheetDocument doc = SpreadsheetDocument.Open(fs, false))
{
// Get the necessary bits of the doc
WorkbookPart workbookPart = doc.WorkbookPart;
SharedStringTablePart sstpart = workbookPart.GetPartsOfType<SharedStringTablePart>().First();
SharedStringTable sst = sstpart.SharedStringTable;
WorkbookStylesPart ssp = workbookPart.GetPartsOfType<WorkbookStylesPart>().First();
Stylesheet ss = ssp.Stylesheet;
// Get the first worksheet
WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
Worksheet sheet = worksheetPart.Worksheet;
var rows = sheet.Descendants<Row>();
var row = rows.First();
foreach (var cell in row.Descendants<Cell>())
{
var txt = GetCellText(cell, sst);
// LINQPad specific method .Dump()
$"{cell.CellReference} = {txt}".Dump();
}
}
}
}
// Very basic way to get the text of a cell
private string GetCellText(Cell cell, SharedStringTable sst)
{
if (cell == null)
return "";
if ((cell.DataType != null) && (cell.DataType == CellValues.SharedString))
{
int ssid = int.Parse(cell.CellValue.Text);
string str = sst.ChildElements[ssid].InnerText;
return str;
}
else if (cell.CellValue != null)
{
return cell.CellValue.Text;
}
return "";
}
However... there's potentially a lot of work involved with OpenXML and you'd be well advised to try and use something like ClosedXML or EPPlus instead.
eg using ClosedXML
using (var workbook = new XLWorkbook(fileName))
{
var worksheet = workbook.Worksheets.First();
var row = worksheet.Row(1);
foreach (var cell in row.CellsUsed())
{
var txt = cell.Value.ToString();
// LINQPad specific method .Dump()
$"{cell.Address.ToString()} = {txt}".Dump();
}
}
I have a few different dictionaries with different categories of information and I need to output them all into an xls or csv file with multiple spreadsheets. Currently, I have to download each excel file for a specific date range individually and then copy and paste them together so they're on different sheets of the same file. Is there any way to download all of them together in one document? Currently, I use the following code to output their files:
writeCsvToStream(
organize.ToDictionary(k => k.Key, v => v.Value as IacTransmittal), writer
);
ms.Seek(0, SeekOrigin.Begin);
Response.Clear();
Response.AddHeader("Content-Disposition", "attachment; filename=" + fileName);
Response.AddHeader("Content-Length", ms.Length.ToString());
Response.ContentType = "application/octet-stream";
ms.CopyTo(Response.OutputStream);
Response.End();
where writeCsvToStream just creates the text for the individual file.
There are some different options you could use.
ADO.NET Excel driver - with this API you can populate data into Excel documents using SQL style syntax. Each worksheet in the workbook is a table, each column header in a worksheet is a column in that table etc.
Here is a code project article on the exporting to Excel using ADO.NET:
http://www.codeproject.com/Articles/567155/Work-with-MS-Excel-and-ADO-NET
The ADO.NET approach is safe to use in a multi-user, web app environment.
Use OpenXML to export the data
OpenXML is a schema definition for different types of documents and the later versions of Excel (the ones that use .xlsx, .xlsm etc. instead of just .xls) use this format for the documents. The OpenXML schema is huge and somewhat cumbersome, however you can do pretty much anything with it.
Here is a code project article on exporting data to Excel using OpenXML:
http://www.codeproject.com/Articles/692121/Csharp-Export-data-to-Excel-using-OpenXML-librarie
The OpenXML approach is safe to use in a multi-user, web app environment.
A third approach is to use COM automation which is the same as programmatically running an instance of the Excel desktop application and using COM to control the actions of that instance.
Here is an article on that topic:
http://support.microsoft.com/kb/302084
Note that this third approach (office automation) is not safe in a multi-user, web app environment. I.e. it should not be used on a server, only from standalone desktop applications.
If you're open to learning a new library, I highly recommend EPPlus.
I'm making a few assumptions here since you didn't post much code to translate, but an example of usage may look like this:
using OfficeOpenXml;
using OfficeOpenXml.Style;
public static void WriteXlsOutput(Dictionary<string, IacTransmittal> collection) //accepting one dictionary as a parameter
{
using (FileStream outFile = new FileStream("Example.xlsx", FileMode.Create))
{
using (ExcelPackage ePackage = new ExcelPackage(outFile))
{
//group the collection by date property on your class
foreach (IGrouping<DateTime, IacTransmittal> collectionByDate in collection
.OrderBy(i => i.Value.Date.Date)
.GroupBy(i => i.Value.Date.Date)) //assuming the property is named Date, using Date property of DateTIme so we only create new worksheets for individual days
{
ExcelWorksheet eWorksheet = ePackage.Workbook.Worksheets.Add(collectionByDate.Key.Date.ToString("yyyyMMdd")); //add a new worksheet for each unique day
Type iacType = typeof(IacTransmittal);
PropertyInfo[] iacProperties = iacType.GetProperties();
int colCount = iacProperties.Count(); //number of properties determines how many columns we need
//set column headers based on properties on your class
for (int col = 1; col <= colCount; col++)
{
eWorksheet.Cells[1, col].Value = iacProperties[col - 1].Name ; //assign the value of the cell to the name of the property
}
int rowCounter = 2;
foreach (IacTransmittal iacInfo in collectionByDate) //iterate over each instance of this class in this igrouping
{
int interiorColCount = 1;
foreach (PropertyInfo iacProp in iacProperties) //iterate over properties on the class
{
eWorksheet.Cells[rowCounter, interiorColCount].Value = iacProp.GetValue(iacInfo, null); //assign cell values by getting the value of each property in the class
interiorColCount++;
}
rowCounter++;
}
}
ePackage.Save();
}
}
}
Thanks for the ideas! I was eventually able to figure out the following
using Excel = Microsoft.Office.Interop.Excel;
Excel.Application ExcelApp = new Excel.Application();
Excel.Workbook ExcelWorkBook = null;
Excel.Worksheet ExcelWorkSheet = null;
ExcelApp.Visible = true;
ExcelWorkBook = ExcelApp.Workbooks.Add(Excel.XlWBATemplate.xlWBATWorksheet);
List<string> SheetNames = new List<string>()
{ "Sheet1", "Sheet2", "Sheet3", "Sheet4", "Sheet5", "Sheet6", "Sheet7"};
string [] headers = new string []
{ "Field 1", "Field 2", "Field 3", "Field 4", "Field 5" };
for (int i = 0; i < SheetNames.Count; i++)
ExcelWorkBook.Worksheets.Add(); //Adding New sheet in Excel Workbook
for (int k = 0; k < SheetNames.Count; k++ )
{
int r = 1; // Initialize Excel Row Start Position = 1
ExcelWorkSheet = ExcelWorkBook.Worksheets[k + 1];
//Writing Columns Name in Excel Sheet
for (int col = 1; col < headers.Length + 1; col++)
ExcelWorkSheet.Cells[r, col] = headers[col - 1];
r++;
switch (k)
{
case 0:
foreach (var kvp in Sheet1)
{
ExcelWorkSheet.Cells[r, 1] = kvp.Value.Field1;
ExcelWorkSheet.Cells[r, 2] = kvp.Value.Field2;
ExcelWorkSheet.Cells[r, 3] = kvp.Value.Field3;
ExcelWorkSheet.Cells[r, 4] = kvp.Value.Field4;
ExcelWorkSheet.Cells[r, 5] = kvp.Value.Field5;
r++;
}
break;
}
ExcelWorkSheet.Name = SheetNames[k];//Renaming the ExcelSheets
}
//Activate the first worksheet by default.
((Excel.Worksheet)ExcelApp.ActiveWorkbook.Sheets[1]).Activate();
//Save As the excel file.
ExcelApp.ActiveWorkbook.SaveCopyAs(#"out_My_Book1.xls");
I have been trying for a couple of days now to insert a string into an openxml spreadsheet. Everything else (so far) works, everything but that.
This is the code i'm currently running (note, this is purely for testing purposes and is pretty basic):
using (SpreadsheetDocument spreadSheet = SpreadsheetDocument.Create(file + "test.zip", SpreadsheetDocumentType.Workbook))
{
spreadSheet.AddWorkbookPart();
spreadSheet.WorkbookPart.Workbook = new Workbook();
spreadSheet.WorkbookPart.AddNewPart<SharedStringTablePart>();
spreadSheet.WorkbookPart.SharedStringTablePart.SharedStringTable = new SharedStringTable() {Count=1, UniqueCount=1};
spreadSheet.WorkbookPart.SharedStringTablePart.SharedStringTable.AppendChild(new SharedStringItem(new Text("test")));
spreadSheet.WorkbookPart.SharedStringTablePart.SharedStringTable.Save();
preadSheet.WorkbookPart.AddNewPart<WorksheetPart>();
spreadSheet.WorkbookPart.WorksheetParts.First().Worksheet = new Worksheet();
spreadSheet.WorkbookPart.WorksheetParts.First().Worksheet.AppendChild(new SheetData());
spreadSheet.WorkbookPart.WorksheetParts.First().Worksheet.First().AppendChild(new Row());
Row r2 = new Row() { RowIndex = 5 };
spreadSheet.WorkbookPart.WorksheetParts.First().Worksheet.First().AppendChild(r2);
r2.AppendChild(new Cell() { CellReference = "A5", CellValue = new CellValue("0"), DataType = new EnumValue<CellValues>(CellValues.SharedString) });
spreadSheet.WorkbookPart.WorksheetParts.First().Worksheet.Save();
spreadSheet.WorkbookPart.Workbook.GetFirstChild<Sheets>().AppendChild(new Sheet()
{
Id = spreadSheet.WorkbookPart.GetIdOfPart(spreadSheet.WorkbookPart.WorksheetParts.First()),
SheetId = 1,
Name = "test"
});
spreadSheet.WorkbookPart.Workbook.Save();
}
Everything seems to work, the file saves where i want it to and, generally, looks the way i expect it to. The "only" issue is that, when i add the string to the cell, excel will give me an error saying that the file is corrupt and continues to delete said cell.
Am i doing something wrong?
Try this, in your method..
if (spreadSheet.WorkbookPart.GetPartsOfType<SharedStringTablePart>().Count() > 0)
{
shareStringPart = spreadSheet.WorkbookPart.GetPartsOfType<SharedStringTablePart>().First();
}
else
{
shareStringPart = spreadSheet.WorkbookPart.AddNewPart<SharedStringTablePart>();
}
index = InsertSharedStringItem(cell_value, shareStringPart);
cell.CellValue = new CellValue(index.ToString());
cell.DataType = new EnumValue<CellValues>(CellValues.SharedString);
InsertSharedString Method:
private static int InsertSharedStringItem(string text, SharedStringTablePart shareStringPart)
{
// If the part does not contain a SharedStringTable, create one.
if (shareStringPart.SharedStringTable == null)
{
shareStringPart.SharedStringTable = new SharedStringTable();
}
int i = 0;
// Iterate through all the items in the SharedStringTable. If the text already exists, return its index.
foreach (SharedStringItem item in shareStringPart.SharedStringTable.Elements<SharedStringItem>())
{
if (item.InnerText == text)
{
return i;
}
i++;
}
// The text does not exist in the part. Create the SharedStringItem and return its index.
shareStringPart.SharedStringTable.AppendChild(new SharedStringItem(new DocumentFormat.OpenXml.Spreadsheet.Text(text)));
shareStringPart.SharedStringTable.Save();
return i;
}
I think you should create a spreadsheet (using Excel), add the text into the cell and then open this spreadsheet in "OpenXML 2.5 Productivity Tool". There is a "Reflect code" button in the productivity tool that would help you replicate in code what needs to be done. That's the easiest way, I've found to solve such bugs.
I have been struggling to find a solution on how to read a large xlsx file with OpenXml. I have tried the microsoft samples without luck. I simply need to read an excel file into a DataTable in c#. I am not concerned with value types in the datatable, everything can be stored as a string values.
The samples I have found so far don't retain the structure of the spreadsheet and only return the values of the cells.
Any ideas?
The open xml SDK can be a little hard to understand. However, I have found it useful to use http://simpleooxml.codeplex.com/ this code plex project. It adds a thin layer over the sdk to more easily parse through excel files and work with styles.
Then you can use something like the following with their worksheet reader to recurse through and grab the values you want
System.IO.MemoryStream ms = Utility.StreamToMemory(xslxTemplate);
using (SpreadsheetDocument document = SpreadsheetDocument.Open(ms, true))
{
IEnumerable<Sheet> sheets = document.WorkbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
if (sheets.Count() == 0)
{
// The specified worksheet does not exist.
return null;
}
string relationshipId = sheets.First().Id.Value;
WorksheetPart worksheetPart = (WorksheetPart)document.WorkbookPart.GetPartById(relationshipId);
string myval =WorksheetReader.GetCell("A", 0, worksheetPart).CellValue.InnerText;
// Put in a loop to go through contents of document
}
You can get DataTable this way:
using (SpreadsheetDocument spreadsheet = SpreadsheetDocument.Open(fileName, false))
{
DataTable data = ToDataTable(spreadsheet, "Employees");
}
This method will read Excel sheet data as DataTable
public DataTable ToDataTable(SpreadsheetDocument spreadsheet, string worksheetName)
{
var workbookPart = spreadsheet.WorkbookPart;
var sheet = workbookPart
.Workbook
.Descendants<Sheet>()
.FirstOrDefault(s => s.Name == worksheetName);
var worksheetPart = sheet == null
? null
: workbookPart.GetPartById(sheet.Id) as WorksheetPart;
var dataTable = new DataTable();
if (worksheetPart != null)
{
var sheetData = worksheetPart.Worksheet.GetFirstChild<SheetData>();
foreach (Row row in sheetData.Descendants<Row>())
{
var values = row
.Descendants<Cell>()
.Select(cell =>
{
var value = cell.CellValue.InnerXml;
if (cell.DataType != null && cell.DataType.Value == CellValues.SharedString)
{
value = workbookPart
.SharedStringTablePart
.SharedStringTable
.ChildElements[int.Parse(value)]
.InnerText;
}
return (object)value;
})
.ToArray();
dataTable.Rows.Add(values);
}
}
return dataTable;
}