I am using VSTO to fill data into a table in a Microsoft Word 2007 template. The amount of data varies, and filling many pages (50+) takes a lot of time.
The code I use to create a table:
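// Missing.Value is a read-only field, so it must be copied to a local before it can be passed by ref
object missing = System.Reflection.Missing.Value;
Word.Table table = doc.Tables.Add(tablePosition,
numberOfRows,
8,
ref missing,
ref missing);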
I suspect that the time consumption is due to the communication between Visual Studio (C#) and Word each time I insert data into a cell. If this is the case, it might be faster to create the table in C# and afterwards insert it into Word.
Microsoft.Office.Interop.Word.Table is an interface rather than a concrete class, so I cannot do this
Word.Table table = new Word.Table();
which would have been handy.
Are there other possibilities when just using VSTO?
Try creating the table in HTML Clipboard format, add to clipboard, then paste.
Try creating the table in HTML and inserting it.
Try creating a tab-delimited string with a newline character after each record. Insert the string at the selection, then convert the selection to a table using tabs as the delimiter.
Create the template as XML, transforming the data with XSLT into a Word XML document.
Create the template as a "Directory Mail Merge", then perform the mail merge with the data.
Depending on your requirements, I recommend using the mail merge technique because the user can edit the template and mail merges are fast, especially if you have 50+ pages.
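The tab-delimited suggestion above can be sketched as follows. This is only a minimal illustration of the string-building half: the class and method names are hypothetical, and the choice of "\r" as the row separator is an assumption based on Word using that character as its paragraph mark.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class TabDelimitedTable
{
    // Replaces characters that Word would read as a cell or row break.
    private static string Clean(string cell)
    {
        return (cell ?? string.Empty)
            .Replace('\t', ' ')
            .Replace('\r', ' ')
            .Replace('\n', ' ');
    }

    // Joins cells with tabs and rows with the given separator.
    // "\r" is an assumption: Word uses it as its paragraph mark.
    public static string Build(IEnumerable<IEnumerable<string>> rows,
                               string rowSeparator = "\r")
    {
        return string.Join(rowSeparator,
            rows.Select(row => string.Join("\t", row.Select(Clean))));
    }
}
```

The resulting string could then be assigned to a range's Text property and converted with Range.ConvertToTable, passing WdTableFieldSeparator.wdSeparateByTabs as the separator. That would mean two calls into Word for the whole table instead of one call per cell, which is where the time saving would be expected to come from.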
I do this kind of thing with LabVIEW 7.1 and Word 2000 rather than VSTO, but the problem is similar. I have not found a way to insert a block of data (a table) with one command. There is even a problem when inserting single elements too fast for Word: it occasionally hangs and must be killed to recover. Unfortunately there is no event or property that signals Word's readiness to accept the next command and data set; at least I could not find one.
As this is in a test sequencer, I have the time to feed the test results into Word with delays long enough to assume Word is ready again when the next portion of data is sent...
I need to create a new Word 2016 file using VS2017, insert content (that's the easy part), and also control its formatting, for example:
Merge certain cells in the same row or the same column
Set the text direction to right-to-left (RTL) or left-to-right (LTR)
Color the text or the background
and similar tasks.
I can open a document using
using Microsoft.Office;
using Word = Microsoft.Office.Interop.Word;
I can add text and save the document, yet I still don't see a way to finely control the color, direction, and other parameters. After reading the documentation, it seems that this is probably not supported, unless I missed it.
I would appreciate it if anyone could point me to detailed documentation on how to edit a Word file from a C# program.
Anyway, I can work around it by creating an Excel file, which is simple using Interop, and then inserting it.
Here is a working solution for merging cells in a table using the DocX library (VS2017, C#):
var doc = DocX.Create(word_fname);
Table table = doc.AddTable(tableSize, 3);
table.Rows[row_cnt].MergeCells(1, 2); // to merge the 2nd & 3rd cells in the specific row
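For comparison, the formatting tasks from the question (merging, text direction, colors) also appear to be reachable through Interop itself. The following is a rough sketch, not a tested solution: it assumes Word is installed, a reference to Microsoft.Office.Interop.Word, and C# 4 or later so that optional COM arguments can be omitted. The class name, method name, table size, and cell choices are all hypothetical.

```csharp
using Word = Microsoft.Office.Interop.Word;

public static class WordFormattingSketch
{
    public static void CreateFormattedTable(string outputPath)
    {
        Word.Application app = new Word.Application();
        try
        {
            Word.Document doc = app.Documents.Add();
            Word.Table table = doc.Tables.Add(doc.Content, 3, 3);

            // Merge the first two cells of row 1 (same row).
            table.Cell(1, 1).Merge(table.Cell(1, 2));
            // Merge rows 2 and 3 of column 3 (same column).
            table.Cell(2, 3).Merge(table.Cell(3, 3));

            Word.Cell cell = table.Cell(2, 1);
            cell.Range.Text = "Sample";
            // Text color and cell background color.
            cell.Range.Font.Color = Word.WdColor.wdColorRed;
            cell.Shading.BackgroundPatternColor = Word.WdColor.wdColorYellow;
            // Right-to-left paragraph direction for this cell.
            cell.Range.ParagraphFormat.ReadingOrder =
                Word.WdReadingOrder.wdReadingOrderRtl;

            doc.SaveAs2(outputPath);
            doc.Close();
        }
        finally
        {
            // Quit Word so no hidden process is left running.
            app.Quit();
        }
    }
}
```

Whether the right-to-left setting has a visible effect may depend on the language support enabled in the Word installation.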
I have a C# application that exports data from SQL Server to CSV. During one such export, the data from the table is split into multiple columns due to the presence of certain special characters.
How can I load the data without losing the special characters and without splitting the column data into several columns?
Below is the code sample that is not working:
sw.Write( string.Format("\""+column.ToString()+"\""));
where column value is:
Need to add "ABCD, LMSW # 123-456-789" and J Yu, PhD # 123-456-789" to OFFICE INFORMATION box on the right side of the web page: https://xyz.abc.yz.edu/
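One common convention for CSV is to wrap each field in double quotes and to double any quote character that appears inside the field, so that embedded commas and quotes stay within one column. A minimal sketch under that assumption follows; the class and method names are hypothetical, and whether the consuming program follows the same convention would need to be checked.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CsvFormat
{
    // Wraps a value in double quotes and doubles any embedded quotes,
    // so commas, quotes, and line breaks stay inside one field.
    public static string EscapeField(object value)
    {
        string text = value == null ? string.Empty : value.ToString();
        return "\"" + text.Replace("\"", "\"\"") + "\"";
    }

    // Escapes every value and joins the fields with commas.
    public static string FormatRow(IEnumerable<object> values)
    {
        return string.Join(",", values.Select(EscapeField));
    }
}
```

With this helper, the write call would become something like sw.Write(CsvFormat.EscapeField(column)). Dropping string.Format also avoids a FormatException when the data happens to contain curly braces.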
How can I discriminate the elements inside a table and those outside? And additionally how can I verify tables without a content control name?
I suggest you use LINQ to XML. On MSDN there is an example console application that displays all paragraph text of a Word document.
Near the bottom is a comment, "Find all paragraphs in the document"; this is the LINQ to XML piece that pulls the paragraphs out of the body of the Word document.
// Find all paragraphs in the document.
var paragraphs =
from para in xDoc
.Root
.Element(w + "body")
.Descendants(w + "p") ...
Instead of "p", you will need to use "tbl". This is how to collect all of the tables from a document in order to verify their contents. Inspecting each row and column will involve more code to loop through each table's data, but this should get you started.
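As a rough sketch of that idea, the queries below collect the tables and separate paragraphs inside tables from those outside by checking for a "tbl" ancestor. The class and method names are hypothetical, and the code assumes the main document part has already been loaded into an XDocument, as in the MSDN sample.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class WordTableQuery
{
    // WordprocessingML main namespace (the "w" prefix in document.xml).
    private static readonly XNamespace w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    public static IEnumerable<XElement> Tables(XDocument xDoc)
    {
        return xDoc.Root.Element(w + "body").Descendants(w + "tbl");
    }

    // Paragraphs that have a table somewhere above them.
    public static IEnumerable<XElement> ParagraphsInTables(XDocument xDoc)
    {
        return xDoc.Root.Element(w + "body").Descendants(w + "p")
            .Where(p => p.Ancestors(w + "tbl").Any());
    }

    // Paragraphs with no table ancestor.
    public static IEnumerable<XElement> ParagraphsOutsideTables(XDocument xDoc)
    {
        return xDoc.Root.Element(w + "body").Descendants(w + "p")
            .Where(p => !p.Ancestors(w + "tbl").Any());
    }

    // Text of each cell, row by row, for one table.
    public static List<List<string>> CellTexts(XElement table)
    {
        return table.Elements(w + "tr")
            .Select(tr => tr.Elements(w + "tc")
                .Select(tc => string.Concat(
                    tc.Descendants(w + "t").Select(t => t.Value)))
                .ToList())
            .ToList();
    }
}
```

CellTexts only looks at rows and cells that are direct children. If rows or cells are wrapped in content controls, Descendants would probably be needed in place of Elements.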
If you install the Open XML Productivity Tool, you can view all of the XML of any Open XML document, for example a Word doc containing a table.
The left pane shows the structure of a typical table in a Word doc. The right pane shows the Open XML Table spec. The tool helps you know what to read and what to ignore when you are writing your LINQ to XML code to read and verify the data in your tables.
If you have a specific table format you need to read for your project and you are stuck, post the table and the code you tried in another question. Otherwise based on your original question, this answer should be enough to help you get started towards your solution.
I need to make an application where users upload a document template (Word, etc.) in which they have placed controls (labels, text boxes) with certain ids. Based on the ids of the controls, I have to fill the template with data from SQL Server and then make the filled-in Word document available for download.
Is there any way I can do this using C# and ASP.NET, JavaScript, jQuery, etc.?
I don't really know where to start.
Thanks in advance.
You can create your document from a template like this:
private void CreateWordDocument(string InputFileNamePath, string OutputFileNamePath)
{
// Requires a reference to Microsoft.Office.Interop.Word; optional COM arguments are omitted (C# 4 or later)
Microsoft.Office.Interop.Word.Application app = new Microsoft.Office.Interop.Word.Application();
Microsoft.Office.Interop.Word.Document doc = app.Documents.Open(InputFileNamePath);
// Activate document
doc.Activate();
//Find place holders in input template and replace them with database values
//FindAndReplace is a helper method that is not shown here
this.FindAndReplace(app,"<Name>","John"); //take all values from database
this.FindAndReplace(app,"<Address>","Test address");
this.FindAndReplace(app,"<City>","Test City");
//Save file
doc.SaveAs(OutputFileNamePath);
doc.Close();
//Quit Word so that no hidden process is left running
app.Quit();
}
Visit this link for more help: http://www.techrepublic.com/blog/howdoi/how-do-i-modify-word-documents-using-c/190
I searched a little and found that the solution to my problem is the Open XML SDK 2.0 for Microsoft Office.
I have never used it, and I guess I will have to research a lot before I can use it in my application. Hopefully it will work fine.
When you code it yourself in C#, please also take care of the following aspects:
Headers and footers deserve separate handling when working on the document directly. When a repeating element yields several values on the same page and the header or footer references it, I have found it easiest to always put the first defined value in the header and the last defined value in the footer.
Try not to use the XML representation; headers and footers are a lot harder to process in the XML document when replacing the original XML.
Tables can cause problems; when you use the text representation of the document to parse the content (and which is required when you want to insert repeating rows) tables introduce an offset between the text presentation and the cursor position in the document. You will need to add extra anchors inside the table manually or use the XML representation.
Or use tabs instead of tables in your document when you need a table-like structure.
When you need to repeat pictures or frames or other more advanced features, they are easier than tables. If you have only a limited list of pictures to be used such as a signature of the CxO-s, put them all in the Word document and remove the ones not needed at all or not needed to be repeated.
For uploading, I recommend remembering to set metadata properties (which makes life easier for document management systems) and using a webservice or alike to get it processed on the backend.
There are about 2000 lines of this, so doing it manually would probably take more work than figuring out a way to do this programmatically. It only needs to work once, so I'm not concerned with performance or anything.
<tr><td>Canada (CA)</td><td>Alberta (AB)</td></tr>
<tr><td>Canada (CA)</td><td>British Columbia (BC)</td></tr>
<tr><td>Canada (CA)</td><td>Manitoba (MB)</td></tr>
Basically it's formatted like this, and I need to divide it into 4 parts: Country Name, Country Abbreviation, Division Name and Division Abbreviation.
In keeping with my complete lack of efficiency, I was planning just to do a string.Replace on the HTML tags after I broke them up, and then find the index of the opening brackets and grab the space-delimited strings that remain. Then I realized I have no way of keeping track of which is the country and which is the division, or of figuring out how to group them by country.
So is there a better way to do this? Or better yet, an easier way to populate a database with countries and provinces/states? I looked around SO, and the only readily available databases I can find either don't provide the full names of the countries or the provinces/states, or use IPs instead of geographic names.
Paste it into a spreadsheet. Some spreadsheets will parse the HTML table for you.
Save it as a .CSV file and process it that way. Or add a column to the spreadsheet that says something like the following:
="INSERT INTO COUNTRY(CODE,NAME) VALUES ('" & A1 & "','" & B1 & "');"
Then you have a column of INSERT statements that you can cut, paste and execute.
Edit
Be sure to include the <table> tag when pasting into a spreadsheet.
<table><tr><th>country</th><th>name</th></tr>
<tr><td>Canada (CA)</td><td>Alberta (AB)</td></tr>
<tr><td>Canada (CA)</td><td>British Columbia (BC)</td></tr>
<tr><td>Canada (CA)</td><td>Manitoba (MB)</td></tr>
</table>
Processing a CSV file requires almost no parsing. It's got quotes and commas. Much easier to live with than XML/HTML.
/<tr><td>([^(]+)\s\(([^)]+)\)<\/td><td>([^(]+)\s\(([^)]+)\)<\/td><\/tr>/
Then you should have 4 captures with the 4 pieces of data from any PCRE engine :)
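In C#, the four-capture idea might look like the sketch below. The pattern is an adaptation that allows multi-word names such as "British Columbia" and multi-letter abbreviations; the class names are hypothetical, and it assumes every row follows the exact "Name (XX)" layout shown in the question.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public sealed class DivisionRow
{
    public string CountryName { get; set; }
    public string CountryCode { get; set; }
    public string DivisionName { get; set; }
    public string DivisionCode { get; set; }
}

public static class DivisionParser
{
    // Four captures: country name, country code, division name, division code.
    private static readonly Regex RowPattern = new Regex(
        @"<tr><td>([^(]+)\s\(([^)]+)\)</td><td>([^(]+)\s\(([^)]+)\)</td></tr>");

    public static List<DivisionRow> Parse(string html)
    {
        return RowPattern.Matches(html).Cast<Match>()
            .Select(m => new DivisionRow
            {
                CountryName = m.Groups[1].Value,
                CountryCode = m.Groups[2].Value,
                DivisionName = m.Groups[3].Value,
                DivisionCode = m.Groups[4].Value
            })
            .ToList();
    }

    // Groups the parsed rows by country abbreviation.
    public static Dictionary<string, List<DivisionRow>> GroupByCountry(
        IEnumerable<DivisionRow> rows)
    {
        return rows.GroupBy(r => r.CountryCode)
            .ToDictionary(g => g.Key, g => g.ToList());
    }
}
```

Grouping by the country abbreviation addresses the part of the question about keeping each division attached to its country.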
Alternatively, something like http://jacksleight.com/assets/blog/really-shiny/scripts/table-extractor.txt provides more completeness.
Sounds like a problem easily solved by a Regex.
I recently learned that if you open a url from Excel it will try and parse out the table data.
If you are able to see this table in the browser (Internet explorer), you can select the entire table, right click & "Export to Microsoft Excel"
That should help you get data into separate columns, I guess.
Do you have to do this programmatically? If not, may I suggest just copying and pasting the table (from the browser) into MS Excel and then clearing all formats? This way you get a nice table that can then be imported into your database without problems.
just a suggestion... hth
An assembly exists for .NET called System.Xml; you can reference the assembly and load your HTML document into a System.Xml.XmlDocument (provided the HTML is well-formed XML). You can then easily pinpoint the HTML node that contains your required data and use its child nodes to add to your data. This requires little string parsing on your part.
Load the HTML data as XElements, use LINQ to grab the values you need, and then INSERT.
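A minimal sketch of the XElement approach follows. It assumes the rows are well-formed XML, as the sample rows in the question are, and wraps them in a single root element so they can be parsed. The class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class TableRowReader
{
    // Returns the text of each <td>, one array per <tr>.
    public static List<string[]> ReadCells(string rowsHtml)
    {
        // Wrap the fragment in one root so it parses as XML.
        XElement table = XElement.Parse("<table>" + rowsHtml + "</table>");
        return table.Elements("tr")
            .Select(tr => tr.Elements("td")
                .Select(td => td.Value.Trim())
                .ToArray())
            .ToList();
    }

    // Splits "Alberta (AB)" into the name ("Alberta") and the code ("AB").
    public static KeyValuePair<string, string> SplitNameAndCode(string cell)
    {
        int open = cell.LastIndexOf('(');
        int close = cell.LastIndexOf(')');
        if (open < 0 || close < open)
        {
            return new KeyValuePair<string, string>(cell.Trim(), string.Empty);
        }
        string name = cell.Substring(0, open).Trim();
        string code = cell.Substring(open + 1, close - open - 1).Trim();
        return new KeyValuePair<string, string>(name, code);
    }
}
```

The values returned here could then be passed as parameters to an INSERT command rather than concatenated into the SQL text.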
Blowing my own trumpet here but my FOSS tool CSVfix will do it with a combination of the read_xml and sql_insert commands.