I am using the Microsoft.Office.Interop.Word namespace in a console application to get form data from an MS Word document. The document contains fields, each of which has been assigned a bookmark that I am using as an id.
I would like to retrieve the value of a field together with its bookmark and store both in a dictionary. So far I am only able to get the value of each field, not the bookmark AND the field.
Is there a way to do something like wdField.Result.Bookmark to get a field's bookmark? I looked at the MSDN documentation but am having a difficult time getting this right. Here is the foreach loop that I am enumerating with:
foreach (Field wdField in oWordDoc.Fields)
{
    wdField.Select();
    string fieldText = wdField.Result.Text;
    Console.WriteLine(fieldText);
    //string fieldBookMark = wdField.Result.BookMark
}
KazJaw is right: if all the target text is bookmarked, you can rely on the Bookmarks collection alone. Sample code:
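// Maps each bookmark name (the id) to the text it covers.
var values = new Dictionary<string, string>();
foreach (Bookmark bookMark in oWordDoc.Bookmarks)
{
    string bmName = bookMark.Name;
    Range bmRange = bookMark.Range;
    string bmText = bmRange.Text;
    values[bmName] = bmText;
}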
Or:
Range bmRange = oWordDoc.Bookmarks["bookmark name"].Range;
From a docx file, I would like to extract only the tables and their related heading. In other words, I am interested in the tables and the heading each table belongs to ("lies under").
I am using DocumentFormat.OpenXml library.
Here is my draft:
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
...
using (var doc = WordprocessingDocument.Open(filewithpath, false))
{
    Body body = doc.MainDocumentPart.Document.Body;
    List<Table> tables = GetTables(body);
    List<string> Paragraphs = new List<string>();
    foreach (Table table in tables)
    {
        Paragraphs.Add(table.???); //I have no idea what to write here
    }
}
Thanks in advance!
You can loop through the siblings that precede the table and look for a "heading" style in each paragraph's style info.
You can check this answer to get "heading" style.
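A sketch in C# would look like this (it needs using System.Linq; the "Heading" style-id prefix is an assumption that holds for Word's default English styles):
string title = null;
// ElementsBefore() returns the preceding siblings in document order, so reverse it to start from the closest one.
foreach (var sibling in table.ElementsBefore().Reverse())
{
    string styleId = (sibling as Paragraph)?.ParagraphProperties?.ParagraphStyleId?.Val?.Value;
    if (styleId != null && styleId.StartsWith("Heading"))
    {
        title = sibling.InnerText; // text of the nearest heading above the table
        break;
    }
}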
I am using HTMLAgilityPack to read and load an XML file. After the file is loaded, I want to insert the values from it into a database.
XML looks like this:
<meeting>
<jobname></jobname>
<jobexperience></jobexperience>
</meeting>
I'm trying to accomplish this using XPath statements within a foreach loop as seen here:
DataTable dt = new DataTable();
//Add Data Columns here
dt.Columns.Add("JobName");
dt.Columns.Add("JobExperience");
// Create a string to read the XML tag "job"
string xPath_job = "//job";
string xPath_job_experience = "//jobexperience";
/* Use a ForEach loop to go through all 'meeting' tags and get the values
from the 'JobName' and 'JobExperience' tags */
foreach (HtmlNode planned_meeting in doc.DocumentNode.SelectNodes("//meeting"))
{
    DataRow dr = dt.NewRow();
    dr["JobName"] = planned_meeting.SelectSingleNode(xPath_job).InnerText;
    dr["JobExperience"] = planned_meeting.SelectSingleNode(xPath_job_experience).InnerText;
    dt.Rows.Add(dr);
}
So the problem is that even though the foreach loop is going through every 'meeting' tag, it's getting the values from only the first 'meeting' tag.
Any help would be greatly appreciated!
"So the problem is that even though the foreach loop is going through every 'meeting' tag, it's getting the values from only the first 'meeting' tag."
Yes, that's what the code does. An XPath expression that starts with // selects matching elements anywhere in the document, no matter which node it is evaluated from; for example, //job selects all job elements in the whole document.
So in your foreach loop you select all meeting elements in the whole document with
doc.DocumentNode.SelectNodes("//meeting")
and then - in the loop - you select all //job and all //jobexperience elements in the whole document with
string xPath_job = "//job";
string xPath_job_experience = "//jobexperience";
Because SelectSingleNode returns only the first match, you get the first job and the first jobexperience element of the whole document on every iteration. Hence the impression that you only ever read the first meeting.
So change the code in a way that the children of the current meeting element get selected (by removing the // operator):
string xPath_job = "job";
string xPath_job_experience = "jobexperience";
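The difference between the two forms can be seen side by side in a small sketch. It deliberately uses System.Xml's XmlDocument instead of HtmlAgilityPack so that it runs without extra packages; a leading // behaves the same way in both. The class name MeetingXPathDemo, the method name and the meetings root element are made up for illustration, and the child element is called jobname as in the sample XML.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class MeetingXPathDemo
{
    // Returns, for every <meeting>, the job name found with an absolute
    // XPath ("//jobname") and with a relative one ("jobname").
    public static List<(string Absolute, string Relative)> ReadJobNames(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var results = new List<(string Absolute, string Relative)>();
        foreach (XmlNode meeting in doc.SelectNodes("//meeting"))
        {
            // "//jobname" restarts at the document root, so this is always
            // the first <jobname> in the whole file.
            string absolute = meeting.SelectSingleNode("//jobname").InnerText;

            // "jobname" is evaluated relative to the current <meeting>,
            // so this is the child of the meeting being visited.
            string relative = meeting.SelectSingleNode("jobname").InnerText;

            results.Add((absolute, relative));
        }
        return results;
    }
}
```

If you need to match at any depth below the current meeting rather than only direct children, ".//jobname" (with a leading dot) keeps the search inside the current node.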
I am trying to populate a Word template in C#. The template contains a table with several cells, and I need to be able to identify each cell by a unique id. I cannot find a way to store and read a unique id for each cell/text in Word, so my approach is to put the unique id into each cell as hidden text and then format the cell (for example, change its background color) based on that id.
My problem is reading this hidden text in each cell from C#.
Any suggestions would be of great help please!
Thanks!
Here it comes! You can iterate over a document and find a hidden text:
foreach (Microsoft.Office.Interop.Word.Range p in objDoc.Range().Words)
{
    if (p.Font.Hidden != 0) //Hidden text found
    {
        // Do something
    }
}
The values returned by p.Font.Hidden are:
0: Text visible
-1: Text Hidden
That's what I did for a whole Word document; if you can iterate over the content of your cells, the same check should work there.
To read hidden text in your code you just need to set
rangeObject.TextRetrievalMode.IncludeHiddenText = true;
If you want to make hidden text visible, for example, you can iterate through all words, check the Font.Hidden property, and reset it:
Word.Document document = ThisAddIn.Instance.Application.ActiveDocument;
var rangeAll = document.Range();
rangeAll.TextRetrievalMode.IncludeHiddenText = true;
string texts = "";  // all text, including the formerly hidden parts
int count = 0;      // number of hidden words that were made visible
foreach (Microsoft.Office.Interop.Word.Range p in rangeAll.Words)
{
    texts += p.Text;
    if (p.Font.Hidden != 0) //Hidden text found
    {
        p.Font.Hidden = 0;
        count++;
    }
}
I am using OpenXML Spreadsheet to generate an .xlsx file from a given template and a dictionary of variables. My problem is updating values in the SharedStringTable, since the InnerText of a SharedStringItem is read-only.
My template is an .xlsx file. The variables I need to replace have the prefix "$$$". For example, for "$$$abc" the dictionary may contain the pair <"abc", "Payson"> (if the dictionary does not contain the key "abc", "$$$abc" should simply be left in place).
What I have done is something like this
private void UpdateValue(WorkbookPart wbPart, Dictionary<string, string> dictionary)
{
    var stringTablePart = wbPart.GetPartsOfType<SharedStringTablePart>().FirstOrDefault();
    if (stringTablePart == null)
    {
        stringTablePart = wbPart.AddNewPart<SharedStringTablePart>();
    }
    var stringTable = stringTablePart.SharedStringTable;
    if (stringTable == null)
    {
        stringTable = new SharedStringTable();
    }
    //iterate through all the items in the SharedStringTable
    // if there is any text starts with $$$, find out the name of the string
    // look for the string in the dictionary
    // replace it if found, or keep it if not.
    foreach (SharedStringItem item in stringTable.Elements<SharedStringItem>())
    {
        if (item.InnerText.StartsWith("$$$"))
        {
            string variableName = item.InnerText.Substring(3);
            if (!string.IsNullOrEmpty(variableName) && dictionary.ContainsKey(variableName))
            {
                // The problem is here since InnerText is read only.
                item.InnerText = dictionary[variableName];
            }
        }
    }
}
The documentation is here: http://msdn.microsoft.com/en-us/library/documentformat.openxml.openxmlcompositeelement.innertext(v=office.14).aspx
The documentation says that InnerText can be set, but the property has no set accessor.
Does anyone know how to set the InnerText? I may have many cells with the same variable name "$$$abc", and I would like to replace all of them with "Payson".
It appears that you need to replace the InnerXml, then save the SharedStringTable. See Set text value using SharedStringTable with xml sdk in .net.
I've tried to edit the Text and it works.
Text newText = new Text();
newText.Text = dictionary[variableName];
item.Text = newText;
stringTable.Save();
I want to find and change property values of report item elements in an .rdlc file. I generated classes from ReportDefinition.xsd with the xsd.exe tool and deserialized the report like this:
using (TextReader textReader = new StreamReader(RdlcPath, Encoding.UTF8))
{
    var serializer = new XmlSerializer(typeof(SampleRDLSchema.Report));
    Report instance = (SampleRDLSchema.Report)serializer.Deserialize(textReader);
    textReader.Close();
}
But how can I now change report item element values (for example, the Tablix width or the textbox content)?
Like Jamie F stated, it would be easier to use formulas and expressions for the properties in the report.
However, if you insist on doing it through XML manipulation, consider changing the XML itself instead of a deserialized object.
The reason I say this is because it's cleaner.
With the deserialized object you would have to do a loop, check if each object is the node you want, then continue this process until you have found the node you desire.
If the object is serialized and is in XML format as, say a string, you can simply use XElement to quickly grab things you want.
Example:
I use this to grab the width of the report from its report definition (the file's XML as a string).
public String GetWidth()
{
    XElement Report = XElement.Parse(this._ReportDefinition);
    return Report.Element("{http://schemas.microsoft.com/sqlserver/reporting/2008/01/reportdefinition}Width").Value;
}
Or Another Example:
// This grabs the name of the tablix report item that is bound to the given data set.
public String GetReportItemName(String DataSetName)
{
    XElement Report = XElement.Parse(this._ReportDefinition);
    String ReportItemName = String.Empty;
    XElement Body = Report.Element("{http://schemas.microsoft.com/sqlserver/reporting/2008/01/reportdefinition}Body");
    XElement ReportItems = Body.Element("{http://schemas.microsoft.com/sqlserver/reporting/2008/01/reportdefinition}ReportItems");
    foreach (XElement ReportItem in ReportItems.Elements())
    {
        if (ReportItem.Name == "{http://schemas.microsoft.com/sqlserver/reporting/2008/01/reportdefinition}Tablix")
        {
            String Name = ReportItem.Element("{http://schemas.microsoft.com/sqlserver/reporting/2008/01/reportdefinition}DataSetName").Value;
            if (Name == DataSetName)
            {
                ReportItemName = ReportItem.Attribute("Name").Value;
            }
        }
    }
    return ReportItemName;
}
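The same XElement approach also works for writing, which is what the question asks for. Below is a minimal sketch, assuming the 2008 RDL namespace used above and assuming that the Tablix carries a Width child element; the class name ReportDefinitionEditor and the method SetTablixWidth are made up for illustration. An XNamespace keeps the long URI in one place.

```csharp
using System;
using System.Xml.Linq;

public static class ReportDefinitionEditor
{
    // RDL 2008 namespace, as in the examples above. Other RDL versions
    // use a different URI, so adjust this to match your .rdlc file.
    private static readonly XNamespace Rdl =
        "http://schemas.microsoft.com/sqlserver/reporting/2008/01/reportdefinition";

    // Returns the report definition XML with the <Width> of the named
    // Tablix set to newWidth (for example "5in").
    public static string SetTablixWidth(string reportDefinition, string tablixName, string newWidth)
    {
        XElement report = XElement.Parse(reportDefinition);

        foreach (XElement tablix in report.Descendants(Rdl + "Tablix"))
        {
            if ((string)tablix.Attribute("Name") == tablixName)
            {
                // SetElementValue overwrites the child element if it exists
                // and creates it if it does not.
                tablix.SetElementValue(Rdl + "Width", newWidth);
            }
        }

        return report.ToString();
    }
}
```

The returned string can then be written back to the .rdlc file or handed to the report viewer in place of the original definition.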
Hope this is of some help to you.