I use the GemBox.Spreadsheet DLL to convert an Excel document to PDF with:
ExcelFile.Load(formExcelPath).Save(formPdfPath);
This works as expected except for one thing: values that are calculated from formulas show up in the PDF as if they had never been calculated.
GemBox's site says "Formulas can't be exported to CSV, HTML, PDF or XPS file formats."
However, I do not want to export the formulas; I just want the values present in the cells. Is there a workaround for this? Is there some way of forcing the formulas to be calculated before the conversion to PDF?
EDIT 28-09-2016:
We have released a new version of GemBox.Spreadsheet (version 4.1) in which we implemented support for cell formula calculation; see the version history page.
You can also find a calculation example on our site.
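As a minimal sketch of what the edit above describes, the conversion from the question can be extended with a calculation step before saving. This assumes GemBox.Spreadsheet 4.1 or later and that the workbook-level Calculate() method is the appropriate entry point; the two paths are the question's own variables given placeholder values here, and the license key shown is the free-mode key, so check the official calculation example before relying on it.

```csharp
using GemBox.Spreadsheet;

class Program
{
    static void Main()
    {
        // Free-mode key; replace with your own license key if you have one.
        SpreadsheetInfo.SetLicense("FREE-LIMITED-KEY");

        // Placeholder paths (assumptions), standing in for the question's variables.
        string formExcelPath = "form.xlsx";
        string formPdfPath = "form.pdf";

        ExcelFile workbook = ExcelFile.Load(formExcelPath);

        // Recalculate all formulas so the cells hold current values
        // before the workbook is rendered to PDF.
        workbook.Calculate();

        workbook.Save(formPdfPath);
    }
}
```

Formulas that use functions the calculation engine does not support may still be left with their last stored values.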
ORIGINAL ANSWER
Unfortunately the problem is that currently GemBox.Spreadsheet (version 3.9) does not have a calculation engine.
In other words, it can read the last calculated values from the input file, but it cannot recalculate formula results on its own.
Note that we do have this feature request in our collection, so please feel free to vote for it in order to boost its priority.
At this moment I cannot tell you exactly when it will be implemented, because this feature is not in our current roadmap.
Related
I have an Excel sheet with a complex formula in it, and I want to use this formula from my C# code for some calculations. My scenario is this: a user fills in a web form and submits it, then our program takes the submitted values, applies them to the hosted Excel sheet, and reads back the result.
Is this possible?
Any help is highly appreciated.
Thanks,
This really should go in the comments section, as I have a few questions to clarify your request. However, as I do not yet have the reputation, my questions must go here.
Do you want to use a formula that resides in your existing Excel document, without rewriting it in code?
Or are you asking whether it is possible to write the same formula that works in the Excel sheet in C#?
If the former, it is conceivable to pass data into the Excel sheet from code. After all, we can use code to append to other document types, so why not an Excel document? And if we can fetch specific lines from other document types, why not from an Excel document (especially since the grid/cell layout provides easily addressable points)? However, conceivable does not mean easy. It's an interesting proposition, and I'd like to do some experiments on it.
If it is the latter, then depending on the formula's requirements, you may just need to use the appropriate math libraries and write out the formula directly in C#. You will not see how everything ties together as nicely as in the Excel document (with the referenced cells highlighted), but you can break the formula into sub-functions more easily and reuse intermediate values through variables.
The Excel file has very long formulas. Rather than convert them to C#, I would prefer to write the input values into the calculation cells from my C# app and then simply read the result cell.
To partially answer your question: if you're not familiar with using the Interop library within C#, I suggest you check out this link for a basic implementation.
http://csharp.net-informations.com/excel/csharp-excel-tutorial.htm
Then, once you've got your app set up to manipulate Excel files, check out this one for how to read a value from a cell:
How to read single Excel cell value
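The Interop approach from those links can be sketched roughly as follows. This is only an illustration: it assumes Excel is installed on the machine running the code and that the project references Microsoft.Office.Interop.Excel. The workbook path and the cell addresses (B1, B2, B10) are hypothetical placeholders for your own input and result cells.

```csharp
using System;
using System.Runtime.InteropServices;
using Excel = Microsoft.Office.Interop.Excel;

class FormulaRunner
{
    static void Main()
    {
        // Start Excel hidden so no window appears.
        var app = new Excel.Application { Visible = false, DisplayAlerts = false };
        Excel.Workbook workbook = null;
        try
        {
            // Hypothetical path to the workbook that holds the formulas.
            workbook = app.Workbooks.Open(@"C:\data\model.xlsx");
            var sheet = (Excel.Worksheet)workbook.Worksheets[1];

            // Write the input values into the (assumed) input cells.
            sheet.Range["B1"].Value2 = 42.0;
            sheet.Range["B2"].Value2 = 3.5;

            // Force a recalculation in case the workbook uses manual calculation.
            app.Calculate();

            // Read the computed value from the (assumed) result cell.
            object result = sheet.Range["B10"].Value2;
            Console.WriteLine(result);
        }
        finally
        {
            // Close without saving and shut Excel down so no process lingers.
            if (workbook != null) workbook.Close(false);
            app.Quit();
            Marshal.ReleaseComObject(app);
        }
    }
}
```

Keeping the cleanup in a finally block matters here, because an exception would otherwise leave a hidden Excel process running.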
HOWEVER, depending on how large your Excel file ends up being and how often you need the result, it may be a good idea to go ahead and convert the formulae into C#. Taking a little extra time on that can save you a lot of waiting later, since every Interop call has to start or talk to a separate Excel process over COM, which is far slower than running the same arithmetic directly in compiled C# code.
I am generating a Word document using the OpenXml SDK 2.0 and everything in that respect is fine. The document has a lot of tables with multi-row table headers and everything looks exactly the way it should.
I am passing this document through Word Automation Services in SharePoint 2010 Enterprise, and the service returns a converted file. Sometimes the output format is the same as the input format (Docx->Docx), as I use the service to refresh the table of contents, but most conversions are to PDF.
My problem is that the document returned does not contain the same headers as the source document. If I look at the OpenXml of the returned document, the rows do not have the TableHeader property, but they do in the source.
Has anyone experienced this before? What can I do to fix this? I can find very little about WAS and how it works. We have invested a fair bit of time in developing this and do not want to have to resort to a third-party component.
This is actually a defect in Word Automation Services and how it handles multi-row headers. I got around the issue by explicitly setting the colours in the header rows so that they look like headers, but are not.
This means that if the table spans across multiple pages then the header won't be repeated, but it's the lesser of two evils.
I need to print a few thousand stickers with a few text fields (name, position, etc) as well as a barcode image.
Each staff member gets two unique stickers, and the sticker paper has 4 per sheet so that's 2 staff per sheet.
I already have all the code to generate the barcode as an Image, and the staff details are stored in a List of objects.
If possible, I'd like to avoid using MSWord directly since my development environment is quite different from the target environment and I've had issues in the past from the disparity. (Win7-64, MSOffice2010 vs. WinXP-32, MSOffice2003).
What's the best way to accomplish this?
If I save the document in an XML format and replace the mail merge fields with unique tokens, I can then substitute my actual values for those tokens (and even replace the binary image data with base-64 encoded image bytes). That works, but it's clunky. For starters, I'd have to save the XML file and then somehow print it transparently to the user (I don't want Word showing up). Also, the XML template is 1 page, but I might have several dozen pages to print. I can send each page to the printer individually, but that's not exactly ideal.
Any other suggestions?
I would use DevExpress XtraReports, as I have used it in the past in similar scenarios with great results. If you prefer another engine such as Crystal or Telerik, the approach is the same: it is as easy as dragging some fields into the page details section and assigning your object list as the data source. DevExpress also has a RichTextBox with a built-in mail merge feature. Finally, if you decide on Word, do not forget that you can automate it while keeping it invisible, so users won't see it.
I have a few PDF files that were created from Word or Excel files.
I need to get the information that's in the tables.
The text in the documents is not an image, so I'm able to extract it using tools such as PDFBox.
Once I have the text, I have no way of knowing which table cell it belongs to, because I don't know where the table borders are.
I've tried a few desktop tools such as ABBYY or Solid PDF Converter, and they are able to convert the files into nice Word documents, but this doesn't suit my needs as I want to be able to do this programmatically in C#.
Some of the tables have nested tables, which I think makes this a little more difficult.
I appreciate your help.
The difficulty here is caused by the fact that the text in the PDF is not contained within any table. It might look like it is, but underneath the surface, it is not.
There are a couple of options that I can think of, but neither is going to be quite as satisfying as you'd probably like.
There are some companies that offer SDKs for PDF to Excel/Word conversion. Investintech and Iceni are a couple of examples. But these solutions are not free.
If you know the exact layout of the PDF files that you need to extract the table data from, then you can use any SDK that lets you extract text from a PDF and also tells you the exact co-ordinates of the extracted text. Using this method you need to know in advance where the text is going to be, so that you can extract text from a specific area on the page. It obviously won't work if you need to process any random document.
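The fixed-layout option can be sketched like this. It is only an illustration of the idea, not any particular SDK's API: TextFragment and CellRegion are hypothetical types standing in for whatever positioned-text output your PDF library provides, the coordinates are invented, and the sketch assumes a top-left origin with Y increasing downward (many PDF libraries use a bottom-left origin, so you may need to flip Y).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical: one piece of text plus the position reported by the PDF SDK.
class TextFragment
{
    public double X, Y;
    public string Text;
}

// Hypothetical: a known rectangle on the page that corresponds to one table cell.
class CellRegion
{
    public string Name;
    public double Left, Top, Right, Bottom;

    public bool Contains(TextFragment f)
    {
        return f.X >= Left && f.X < Right && f.Y >= Top && f.Y < Bottom;
    }
}

static class TableReader
{
    // Assigns each fragment to the cell whose rectangle contains it and
    // joins the fragments of a cell in reading order (top to bottom, left to right).
    public static Dictionary<string, string> ReadCells(
        IEnumerable<TextFragment> fragments, IEnumerable<CellRegion> regions)
    {
        var all = fragments.ToList();
        return regions.ToDictionary(
            r => r.Name,
            r => string.Join(" ", all
                .Where(f => r.Contains(f))
                .OrderBy(f => f.Y)
                .ThenBy(f => f.X)
                .Select(f => f.Text)));
    }
}

class Program
{
    static void Main()
    {
        // Invented sample data in place of real extraction output.
        var fragments = new List<TextFragment>
        {
            new TextFragment { X = 60, Y = 105, Text = "Widget" },
            new TextFragment { X = 260, Y = 105, Text = "12.50" },
        };
        var regions = new List<CellRegion>
        {
            new CellRegion { Name = "Item", Left = 50, Top = 100, Right = 250, Bottom = 120 },
            new CellRegion { Name = "Price", Left = 250, Top = 100, Right = 350, Bottom = 120 },
        };

        foreach (var cell in TableReader.ReadCells(fragments, regions))
            Console.WriteLine(cell.Key + ": " + cell.Value);
    }
}
```

For nested tables, the same approach applies as long as the inner cells are given their own regions and the outer region is not also used for extraction, otherwise text will be counted twice.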
It's a difficult task, but hopefully this will give you a starting point.