Most effective way to duplicate an Excel worksheet? - c#

I am working on a project to upgrade a WPF program that works with Excel sheets a lot. It uses ClosedXML and Excel Interop to manipulate Excel files and add data.
After some tracing I found a function that is painfully slow. It uses the same approach as other similar functions, which all work fine. The problem is that ClosedXML's CopyTo() accounts for 85% of the CPU time.
Its purpose is simply to use one Excel sheet as a template: whenever there is a new record that needs to be printed, it copies the first sheet to a new sheet and then writes data into it.
If you have any idea how to speed up this kind of process, please let me know!
foreach (object[] row in rows)
{
    // Skip records outside the requested number range.
    if (Common.integer(row[0]) < from_no || Common.integer(row[0]) > to_no)
    {
        continue;
    }
    sheetNum++;
    if (sheetNum != 1)
    {
        // This part uses 85% of the CPU time.
        sheet_edit.CopyTo(sheetNum.ToString());
        sheet_edit = book.Worksheet(sheetNum);
    }
    sheet_edit.Name = row[0].ToString();
    ct.ThrowIfCancellationRequested();
    w.ReportProgress(progCnt * 100 / maxCnt);
    progCnt++;
}
Thank you very much!
PS: Sorry for my bad English!
PS: To anyone who downvoted my question, please tell me the reason. Is it not helpful, or is there some other reason?
PS: I searched all day but can't find an answer for this. There are several methods, but none of them fits my needs:
Using Interop: not noticeably faster.
Using Open XML: it means I have to write more code, and it is not easy to integrate into this program.
Using ClosedXML.copyRange: certainly faster, but it doesn't copy column widths, row heights, and so on. That means more code and more processing, so in the end it is not much faster.
I decided to use Diagnostics.Process (print) in the loop, so that the first sheet is reused in every iteration. It is somewhat faster, but we can't choose the printer or the printer settings; the default printer and settings are used automatically.
I can explain this to my customer, and I think it is quite acceptable.
But I am still waiting for an answer. If you happen to know how to speed up this kind of process, please let me know!

ClosedXML has to copy each object (cell, style, picture, etc) from the source to the destination. If you have many thousands of cells, then this will consume your CPU cycles.
You should ensure that your source worksheet contains only the cells and styles that you really need. In my experience, I have seen many Excel templates that contain many unused styles and empty cells at bizarre worksheet addresses.
If I were you, I would recreate the template as far as possible in ClosedXML itself (even if just a once-off process). This will ensure that your template is as minimal as possible. ClosedXML doesn't support all features yet, so after you create the template, you may want to add elements (e.g. charts). Then use that saved template in your further processing. It should be much smaller and faster than the one you're using now (my guess).
Other options you could try: An .xlsx file is just a .zip package. You can look at the underlying XML inside the file and determine how many cells or styles there are to be copied.
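To make that check concrete, here is a minimal sketch that opens the package with the standard System.IO.Compression and System.Xml.Linq classes and counts cells, rows, and cell-format records. The class name XlsxInspector and the file name in the usage note are made up for illustration, and the "xl/worksheets/" and "xl/styles.xml" part names are the usual layout rather than something every file is guaranteed to follow.

```csharp
using System;
using System.IO;
using System.IO.Compression; // on .NET Framework, reference System.IO.Compression and System.IO.Compression.FileSystem
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of ClosedXML.
public static class XlsxInspector
{
    // Counts elements with the given local name in one XML part.
    // Matching on the local name avoids hard-coding the SpreadsheetML namespace.
    public static int CountElements(Stream xml, string localName)
    {
        return XDocument.Load(xml)
                        .Descendants()
                        .Count(e => e.Name.LocalName == localName);
    }

    // Prints per-worksheet cell/row counts and the number of <xf> records in the styles part.
    public static void Report(string path)
    {
        using (ZipArchive zip = ZipFile.OpenRead(path))
        {
            foreach (ZipArchiveEntry entry in zip.Entries)
            {
                bool isSheet = entry.FullName.StartsWith("xl/worksheets/", StringComparison.OrdinalIgnoreCase)
                               && entry.FullName.EndsWith(".xml", StringComparison.OrdinalIgnoreCase);
                if (isSheet)
                {
                    int cells, rows;
                    using (Stream s = entry.Open()) { cells = CountElements(s, "c"); }
                    using (Stream s = entry.Open()) { rows = CountElements(s, "row"); }
                    Console.WriteLine(entry.FullName + ": " + cells + " cells in " + rows + " rows");
                }
                else if (entry.FullName.Equals("xl/styles.xml", StringComparison.OrdinalIgnoreCase))
                {
                    using (Stream s = entry.Open())
                    {
                        Console.WriteLine(entry.FullName + ": " + CountElements(s, "xf") + " xf (cell format) records");
                    }
                }
            }
        }
    }
}
```

Calling XlsxInspector.Report("template.xlsx") on the template (the file name is a placeholder) should show quickly whether a sheet carries far more cells or formats than the visible layout suggests. The sketch loads each part fully into memory, which is fine for a one-off diagnostic but not for very large sheets.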
You can also download the ClosedXML source and narrow down exactly which kind of element is taking up the resources.
Disclaimer: I'm a ClosedXML project maintainer.

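Instead of using ClosedXML to copy the sheet, you can use Excel Interop to do the same. Below is sample code for copying the worksheet:
using System;
using System.Runtime.InteropServices;
using Excel = Microsoft.Office.Interop.Excel;
// Attach to a running Excel instance (Marshal.GetActiveObject exists only on .NET Framework).
Excel.Application xlApp = Marshal.GetActiveObject("Excel.Application") as Excel.Application;
Excel.Workbook xlWb = xlApp.ActiveWorkbook as Excel.Workbook;
Excel.Worksheet xlSht = xlWb.Sheets[1];
// Copy the first sheet after the last sheet, then rename the copy.
xlSht.Copy(Type.Missing, xlWb.Sheets[xlWb.Sheets.Count]);
xlWb.Sheets[xlWb.Sheets.Count].Name = "NEW SHEET";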

Related

Setting style in cells with conditional formatting in ClosedXML C#

I needed to apply conditional formatting with a data bar (histogram) to a cell. I used ClosedXML, but it didn't give the desired result.
I need to solve the problem with both the gradient and negative numbers. Has anyone encountered something similar? I am attaching the code.
form_sheet.Cell("D37")
.AddConditionalFormat()
.DataBar(XLColor.FromArgb(68, 114, 196), false)
.Minimum(XLCFContentType.Number, -3)
.Maximum(XLCFContentType.Number, 3);
I am ready to consider alternative solutions that don't use ClosedXML. The program will generate several dozen reports. All histograms will be in the same cells, so I also considered VBS, but I don't have enough experience to write a script that would change the styles for a batch of documents at once.
A bit late to answer, but this might be helpful for others.
I went through the same gradient issue. Currently it is not possible to generate a conditional DataBar with a solid color using ClosedXML.
To work around it, I generate the Excel file with ClosedXML as before, then reopen it with Interop and add the DataBar to the respective cells using Interop.Excel.
I haven't fully rewritten the code to use Interop because, performance-wise, Interop can't compete with ClosedXML, at least for me.
Sample code for adding a DataBar using Interop:
var excel = new Microsoft.Office.Interop.Excel.Application();
var workBooks = excel.Workbooks;
var workBook = workBooks.Add();
var workSheet = (Microsoft.Office.Interop.Excel.Worksheet)excel.ActiveSheet;
workSheet.Cells[1, "A"] = 10;
Microsoft.Office.Interop.Excel.Range range1 = workSheet.Cells[1, 1];
Microsoft.Office.Interop.Excel.Databar bar = (Microsoft.Office.Interop.Excel.Databar)range1.FormatConditions.AddDatabar();
bar.BarFillType = Microsoft.Office.Interop.Excel.XlDataBarFillType.xlDataBarFillSolid;
Thanks.

Issue in using extremely huge Excel File with Formula in C#

In my current project, Aspose is used to work with an Excel file (XLSX). The file has 4 worksheets. The first two sheets are empty except for the first row, which contains the column names. These tabs get their data through code, and the other two contain tons of complex formulas based on those inputs. Think of the first two tabs as inputs, the third tab as a complex calculation, and the last tab as output. The file size typically ranges from 26MB to 48MB. The piece of code below does most of the work. After this method runs, the file is also saved to a physical location, and the output data is saved in the DB. The process has worked fine so far within that range, but once the size exceeded 100MB it started throwing an Out of Memory exception. Only once or twice was it able to complete the process, in around 80 - 100 mins.
public void CaclulateM(DataSet dataModel)
{
    var workbook = this.ExcelModel.Workbook;
    var ranges = ExcelModel.GetExcelModelRanges;
    base.ImportInputsTo(workbook, ranges, dataModel);
    workbook.CalculateFormula(false);
    base.ExportOutputsTo(workbook, ranges, dataModel);
}
I tried some of the solutions provided by Aspose, but they failed. I tried other DLLs too, including Interop, ExcelLibrary, and NPOI, with the same result.
https://forum.aspose.com/t/aspose-cell-dll-issue-for-xslb-file/164440
Please help or let me know if you need any other input to suggest anything. I cannot provide you the excel file due to confidentiality.

Reading Rich Text from Excel Range (cells) with Office Interop

(This question was formerly titled "C# / WPF : Going from Excel Interop "Range" to WPF "FlowDocument"" however I've made progress on that front that allows me to restrict my question. I'm leaving the original question below so existing answers will still make sense.)
I'm using Office Interop to read the contents of cells in an Excel worksheet. Some of those cells contain Rich Text (for example some words are italicized but not the whole cell) and I would like to capture them as RTF so I can then display them into WPF controls.
I have been able to obtain the RTF contents of cells using the clipboard API, where I use Excel Interop to copy a Range of one cell to the clipboard, and then read the clipboard, like so:
// Step 1 : retrieve the RTF from the clipboard as a string
string txt = Clipboard.GetText(TextDataFormat.Rtf);
// Step 2 : create a FlowDocument object and a TextRange object:
FlowDocument doc = new FlowDocument();
TextRange tr = new TextRange(doc.ContentStart, doc.ContentEnd);
// Step 3 : convert the clipboard string to a stream
byte[] byteArray = Encoding.ASCII.GetBytes(txt);
MemoryStream stream = new MemoryStream(byteArray);
// Step 4 : load that stream into TextRange
tr.Load(stream, DataFormats.Rtf);
If I then assign "doc" to the Document property of, say, a RichTextBox control, it'll display the content of the Excel cell with the exact same formatting as Excel does, down to colored words and font sizes.
However, this is extremely slow. It may take minutes to load a thousand cells that way, even if most are empty.
So here's my updated question: clearly Excel has a mechanism for returning the RTF content of an Excel cell, otherwise my Clipboard code couldn't work. But is there a more efficient way than the Clipboard to exploit that mechanism? Ideally through Interop?
Original question :
This may be an unusual question but as I'm quite new to C#, WPF and Interop, I might be going about things the wrong way so don't hesitate to offer a better approach. Here's what I'm trying to do :
I'm coding a WPF application that uses Office Interop to grab the contents of cells from an Excel worksheet. That content is text which may contain some formatting (for example some words are in bold, others are in italics). The application then displays that content in a "FlowDocumentScrollViewer" control on its GUI.
I want this "FlowDocumentScrollViewer" control to render the content from the Excel cell exactly as it appears in Excel, with formatting and everything.
The best I've managed so far is to display the cell's content without any formatting. Here's how this works : I use Office Interop to read a Range of cells from the worksheet and take their Value2 property. Value2 is of type "object". Then I create a FlowDocument object out of it, like so:
FlowDocument doc = new FlowDocument();
Paragraph p = new Paragraph(new Run(Variable_containing_a_Value2.ToString()));
doc.Blocks.Add(p);
And then I store this FlowDocument into the "FlowDocumentScrollViewer" Document property.
Now since I'm using "ToString()" on the Value2 I'm not surprised that any formatting information this object might contain disappears past this point.
My problem is, I haven't been able to find a way to create that FlowDocument, from that Value2 object, that preserves formatting.
Now, I know there has to be a way to get that information through, because when I copy my Excel cell and paste it in Word, for example, then the formatting is carried through. I just don't know how.
Help me Obiwans, you're my only hope, as even Google has failed me.
It seems to me that you have at least a couple of options that will work better than just copying the cell contents as text. The Range object has Copy() and CopyPicture() methods, which you can use to have Excel copy the contents of the range to the clipboard.
The basic Copy() method should (I haven't tested it) put the contents of the cell into the clipboard in a variety of formats, including RTF. And you should be able to get the RTF and put that into the FlowDocument element.
Using RTF, you may still not get exactly the representation as seen in Excel. The only way to do that is to have Excel do the rendering. In that case, you'll want the CopyPicture() method, which will put a picture of the range on the clipboard. This will be either a bitmap or a metafile, depending on the options you use for the method call. You can then retrieve these from the clipboard and put them into your FlowDocument.
Depending on what applications you're looking at, e.g. Word, there's yet another, more complicated approach, one that I doubt would work with FlowDocument, but which they are using. That is, they are presenting the Excel range as an OLE object. This is harder to implement, but has the advantage that it's a live representation of the original Excel document, and the user can edit the range in-place in the host application.
The above should be enough to get you pointed in the right direction, so at least you know what you're looking for when you do your web searches. As stated, your question is very broad, and so the above is necessarily vague as well. Once you've decided on a particular method, have done some research and made an attempt into implementing that method, if you still have problems you can post a new question, with a good Minimal, Complete, and Verifiable code example that shows clearly what you've tried, with a detailed explanation of what specifically you're still having trouble with.

MS Word C# AddIn - how to edit xml of an open word document

Thanks for coming by :)
I need to modify the XML of an MS Word Document directly, because the Word Interop's capabilities are insufficient for what I need to do.
The trick is that I have to do it from a Word Add-In and apply it to the currently open document, so I can't open/save packages (right?). In short, several dozen articles like the one below are not applicable here:
https://msdn.microsoft.com/en-us/library/aa982683%28v=office.12%29.aspx
Any help would be appreciated :)
Example problem -- Remove custom cell margins from a really, really big table in word (think 200x10) and check "Same as whole table" for each.
A lead on a solution (currenttable is the currently selected word table):
using System.Xml.Linq; // plus all the standard Word Add-In references
...
XDocument currentablexdocument = XDocument.Parse(currenttable.Range.WordOpenXML);
currentablexdocument.Descendants().Where(e =>e.Name.LocalName.Equals("tcMar")).Remove();
currenttable.Range.Delete();
currentselection.InsertXML(currentablexdocument.ToString());
Explanation:
currenttable.Range.WordOpenXML provides me with well-formed XML representation of the table, which I then interpret as an XDocument
tcMar = table cell margins. These XML elements exist only if a cell has custom margins. Deleting all such elements does exactly what I need.
currenttable.Range.Delete() deletes the old table
currentselection.InsertXML(...) inserts the modified table XML into the document with margins fixed. Pretty much instantaneous. Yay!
Problem:
Deleting and inserting the table is flaky and yields undesired results. It would be much better if I could MODIFY the xml directly. Is it possible?
Disclaimer:
Any other ideas of fixing this particular issue are welcome, but I have tried a myriad of possible solutions:
applying a table style: rejected by the client,
looping "SendKeys" commands to automate use of the Word interface: too unreliable,
changing Table.XXXPadding, Row.XXXPadding, Column.XXXPadding: doesn't affect custom Cell margins (among other issues),
looping through cells to change their Cell.XXXPadding: too slow (freezes Word for several minutes on a 200x10 table). Note that it's accessing the padding that's slow; the loop itself takes 3 seconds to traverse the whole table when implemented correctly.
Of course, I tried it all with ScreenRefreshing = false and AllowAutoFit = false.
Somebody please help :)
Cheers!

Can I set auto-width on an Open XML SDK-generated spreadsheet without calculating the individual widths?

I'm working on creating an Excel file from a large set of data by using the Open XML SDK. I've finally managed to get a functional Columns node, which specifies all of the columns which will actually be used in the file. There is a "BestFit" property that can be set to true, but this apparently does not do anything. Is there a way to automatically set these columns to "best fit", so that when someone opens this file, they're already sized to the correct amount? Or am I forced to calculate how wide each column should be in advance, and set this in the code?
The way I understand the spec and this MSDN discussion, BestFit tells you the width was auto-calculated in Excel, but it does not tell Excel that it should calculate it again next time it is opened.
As "goodol" indicates in that discussion, I think the width can only be calculated when you display the column, since it depends on the contents, the font used, other style parameters... So even if you want to pre-calculate the width yourself, be aware that this is only an estimation, and it can be wrong if the contents contain lots of "wide" characters. Or does the Open XML SDK do this for you?
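If you do end up pre-calculating, one rough sketch is to apply the width formula that the SpreadsheetML spec gives for the width attribute of the col element to the longest text in each column. The class name ColumnWidthEstimator is made up for illustration, and the default of 7 pixels for the maximum digit width is the commonly quoted value for Calibri 11 at 96 DPI, so treat it as an assumption and adjust it for your workbook's default font.

```csharp
using System;

// Hypothetical helper, not part of the Open XML SDK.
public static class ColumnWidthEstimator
{
    // charCount: number of characters in the longest cell text of the column.
    // maxDigitWidthPx: pixel width of the widest digit (0-9) in the default font (assumed 7 here).
    // Returns a value in the units the <col width="..."> attribute expects.
    public static double Estimate(int charCount, double maxDigitWidthPx = 7.0)
    {
        const double paddingPx = 5.0; // fixed padding used by the spec formula
        double raw = (charCount * maxDigitWidthPx + paddingPx) / maxDigitWidthPx * 256.0;
        return Math.Truncate(raw) / 256.0;
    }
}
```

For example, ColumnWidthEstimator.Estimate(10) returns 10.7109375 with the default digit width. This treats every character as being as wide as a digit, so it stays an estimate: bold text, other fonts, or many wide characters such as "W" will still overflow.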
I'm using EPPlus which I highly recommend. Took me a while to figure out how to do it using that, here's what I came up with:
// Get your worksheet in "sheet" variable
// Set columns to auto-fit
for (int i = 1; i <= sheet.Dimension.Columns; i++)
{
    sheet.Column(i).AutoFit();
}
