I'm trying to add a hyperlink to some XML being inserted in a field of a Word document (using Microsoft.Office.Interop.Word).
The XML being inserted contains multiple paragraphs, each containing some text that should be converted to a hyperlink. The link text is extracted from the end of the paragraph, after the "Available at " substring.
The following code does create a hyperlink, but the first hyperlink is always applied to all paragraphs. I was expecting the code to create a separate hyperlink for each paragraph being iterated.
My guess is that the paragraph.Range object is pointing to text that is in fact the whole XML inserted as opposed to the text contained within the paragraph. I've also confirmed that the paragraph.Range.Text property returns the correct text for each paragraph so I am completely confused as to what should be expected for the Range property.
Any ideas? Thanks in advance.
if (!string.IsNullOrWhiteSpace(bibliography))
{
const string linkToken = "Available at ";
field.Result.InsertXML(bibliography);
foreach (Paragraph paragraph in field.Result.Paragraphs)
{
var paragraphText = paragraph.Range.Text;
var indexOfLink = paragraphText.IndexOf(linkToken, StringComparison.OrdinalIgnoreCase);
if (indexOfLink >= 0)
{
var linkStart = indexOfLink + linkToken.Length;
var linkPart = paragraphText.Substring(linkStart);
Uri uriFound;
if (Uri.TryCreate(linkPart, UriKind.Absolute, out uriFound))
{
object linkAddress = uriFound.ToString();
paragraph.Range.Hyperlinks.Add(paragraph.Range, ref linkAddress);
}
}
}
}
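One thing worth checking, as a sketch rather than a confirmed fix: paragraph.Range.Text ends with the paragraph mark ("\r"), and the code above anchors the hyperlink on the whole paragraph range instead of just the link text. The helper below is hypothetical (LinkLocator and TryLocateLink are not part of the Word object model) and only does the string work: it finds the link after the token, trims the trailing paragraph mark, and reports the character offsets. Assuming the offsets in Range.Text map one-to-one onto document positions (which may not hold if the paragraph contains field codes or other hidden content), they could be added to paragraph.Range.Start to build a narrower range for Hyperlinks.Add.

```csharp
using System;

// Hypothetical helper: pure string logic, no Word interop involved.
public static class LinkLocator
{
    // Finds the URL that follows linkToken in paragraphText.
    // start/length are character offsets into paragraphText.
    public static bool TryLocateLink(string paragraphText, string linkToken,
        out int start, out int length, out Uri uri)
    {
        start = 0;
        length = 0;
        uri = null;

        if (string.IsNullOrEmpty(paragraphText))
            return false;

        int tokenIndex = paragraphText.IndexOf(linkToken, StringComparison.OrdinalIgnoreCase);
        if (tokenIndex < 0)
            return false;

        start = tokenIndex + linkToken.Length;

        // Drop the trailing paragraph mark (and similar control characters)
        // so it does not end up in the hyperlink address or anchor.
        string candidate = paragraphText.Substring(start)
            .TrimEnd('\r', '\n', '\a', '\t', ' ');
        length = candidate.Length;

        return length > 0 && Uri.TryCreate(candidate, UriKind.Absolute, out uri);
    }
}
```

With start and length in hand, the anchor passed to Hyperlinks.Add can be limited to the link text of the current paragraph rather than the whole paragraph range.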
Related
Hi all. I have code that uses the Open XML SDK to find MERGEFIELDs in Microsoft Word documents and replace them with the provided data. This works well, but now I want to replace a field with an image instead of a string.
Code from the link:
https://www.codeproject.com/Articles/38575/Fill-Mergefields-in-docx-Documents-without-Microso
using (WordprocessingDocument docx = WordprocessingDocument.Open(stream, true))
{
// 2010/08/01: addition
ConvertFieldCodes(docx.MainDocumentPart.Document);
// first: process all tables
foreach (var field in docx.MainDocumentPart.Document.Descendants<SimpleField>())
{
string fieldname = GetFieldName(field, out switches);
// I will get fieldname "ImgLogo" and then I want to add an Image at this position.
}
}
I get the field name "ImgLogo" as shown above, and I want to add an image at that position. The full code is at the link above.
Any help is appreciated. Thanks in advance.
You can use a MERGEFIELD IF statement. Write the statement out first and, once it is finished, paste the placeholder image into the "true" segment of the IF statement, leaving the MERGEFIELD itself in the "false" (else) segment.
https://wordmvp.com/FAQs/MailMerge/MMergeIfFields.htm
Press Alt + F9 to show the field codes in Word. You should edit the template document and not the merged output, because you can't see the fields after they have been merged.
When you merge in your picture, it overwrites both { MERGEFIELD MergeImage } fields. Once they are overwritten, { MERGEFIELD MergeImage } = "" is no longer true, so your image is shown instead of the placeholder. As stated, the placeholder image should be pasted in place of "Paste Placeholder here".
This works because, before you merge your image, { MERGEFIELD MergeImage } is equal to an empty string.
E.g.
{ IF { MERGEFIELD MergeImage } = "" "Paste Placeholder here" "{ MERGEFIELD MergeImage }" }
Breakdown of the above:
{ IF CONDITION "TRUE" "FALSE" }
If nothing has been merged in, you get the placeholder; otherwise you get the merged image.
I've got a line here :
Paragraph par = row.Cells[0].AddParagraph("Value");
Is there a way to get the text value from par? I have tried par.GetValue() but that didn't work
Paragraphs can contain a mix of text with different sizes, fonts, attributes along with images and other things.
Here's a code snippet that gets the first text element:
if (para.Elements.Count > 0)
{
Text t = para.Elements[0] as Text;
if (t != null)
{
string s = t.Content;
...
}
}
You know what your code adds to the paragraph, so you should know what you have to extract.
I do not know what you are trying to do. Every MigraDoc document object has a Tag member for custom use. You can assign any object (including string) to this Tag.
I need to search a document for strings enclosed in <>. So if the application finds the variable within the document, it replaces that variable with DateTime.Today.ToShortDateString(). For instance:
string filename = "C:\\Temp\\" + appNum + "_ReceiptOfApplicationLtr.docx";
if (File.Exists((string)filename))
{
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(filename, true))
{
var body = wordDoc.MainDocumentPart.Document.Body;
foreach (var text in body.Descendants<Text>())
{
if (text.Text == "<TodaysDate>")
{
text.Text = text.Text.Replace("<TodaysDate>", DateTime.Today.ToShortDateString());
}
}
// Save the modified document part. Writing the file name into the
// main part's stream would overwrite the document XML with that string.
wordDoc.MainDocumentPart.Document.Save();
}
}
When it searches the Text descendants, it finds "<" first, then "TodaysDate", and finally ">" as three separate elements, so it never finds the whole string <TodaysDate>. Can anyone help me out?
Open XML can store text in several separate text elements inside the same run. What I would do is find the Run where your string is stored and use its InnerText property, which returns all the text inside that run.
For example:
Run runToFind = body.Descendants<Run>()
.FirstOrDefault(r => r.InnerText.Contains("<TodaysDate>"));
Then you can replace the Run with another one:
runToFind.Parent.ReplaceChild(new Run(new Text(DateTime.Now.ToShortDateString())), runToFind);
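If Word has split the placeholder across several runs (which the question's "<", "TodaysDate", ">" output suggests can happen), matching on a single run will miss it. The following is a minimal sketch of the same idea applied at paragraph level. It deliberately uses System.Xml.Linq on the raw WordprocessingML instead of the Open XML SDK so that it runs without extra packages. PlaceholderReplacer is a hypothetical name, and the sketch makes a simplifying assumption: the replaced text is written into the first w:t of the paragraph and the remaining w:t elements are emptied, so formatting that differed between those runs is lost.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper working on raw WordprocessingML (the w:body element).
public static class PlaceholderReplacer
{
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Replaces the placeholder in every paragraph whose concatenated text
    // contains it. Returns the number of paragraphs that were changed.
    public static int Replace(XElement body, string placeholder, string replacement)
    {
        int changed = 0;
        foreach (XElement paragraph in body.Descendants(W + "p"))
        {
            var texts = paragraph.Descendants(W + "t").ToList();
            if (texts.Count == 0)
                continue;

            // Join the text of all w:t elements so a placeholder that was
            // split across runs becomes visible as one string.
            string joined = string.Concat(texts.Select(t => t.Value));
            if (!joined.Contains(placeholder))
                continue;

            // Simplification: keep everything in the first w:t, empty the rest.
            texts[0].Value = joined.Replace(placeholder, replacement);
            foreach (XElement t in texts.Skip(1))
                t.Value = string.Empty;

            changed++;
        }
        return changed;
    }
}
```

With the Open XML SDK the equivalent check would be made against the paragraph's InnerText instead of a single run's InnerText.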
For anyone still struggling with this - you can check out this library
https://github.com/antonmihaylov/OpenXmlTemplates
With it instead of searching for special tags in the text (because of the problems specified in the comment of Thomas Barnekow), you add a Content control in the document and in the tag name of the content control you specify the name of the variable you want to replace.
You can then feed JSON data or a regular C# dictionary object and the text will get replaced.
Note: I am the maker of that library, but I have no financial gain from it. It is open source and under active development (and always looking for contributors!).
I need to retain paragraph breaks in a .docx file, but get rid of linebreaks which are often in the wrong place when copying from one file to another (due to different page sizes, and when the font is changed).
Using the DocX Library, I'm trying this:
private void ReplaceLineBreaksWithBoo(string filename)
{
List<string> lineBreaks;
using (DocX document = DocX.Load(filename))
{
lineBreaks = document.FindUniqueByPattern("\n", System.Text.RegularExpressions.RegexOptions.None);
if (lineBreaks.Count > 0)
{
foreach (string s in lineBreaks)
{
document.ReplaceText(s, string.Empty); // <-- or a space?
}
}
document.Save();
}
}
...but it doesn't work. "\n" is not the right thing to pass, I reckon, and I don't know what the first argument to FindUniqueByPattern() should be. Documentation is nil and the discussion forum there resembles Bodie, California.
I guess you can't do it using FindUniqueByPattern or FindAll. An empty line is not represented by a character; it is stored as a paragraph with empty text. You can inspect the document's XML representation through the document.Xml property, where you'll see each empty line stored as a single <w:p> element.
Therefore you can search for Paragraphs with empty text instead of searching for newline character :
using (DocX document = DocX.Load(filename))
{
var emptyLines = document.Paragraphs.Where(o => string.IsNullOrEmpty(o.Text));
foreach (var paragraph in emptyLines)
{
paragraph.Remove(false);
}
document.Save();
}
I am appending some text containing '\r\n' into a word document at run-time.
But when I see the word document, they are replaced with small square boxes :-(
I tried replacing them with System.Environment.NewLine but still I see these small boxes.
Any idea?
One answer is to use \v (vertical tab, character 11). Word treats it as a manual line break (the same as Shift+Enter) rather than a new paragraph; \r on its own starts a new paragraph.
Have you tried \r or \n in isolation? Word interprets a carriage return and a line feed on their own, but handles the combined pair differently. Environment.NewLine is really meant for plain text files. \r is also the same character as Ctrl+M. Try that and, if it does not work, please post the code.
Word uses the <w:br/> XML element for line breaks.
After much trial and error, here is a function that sets the text for a Word XML node, and takes care of multiple lines:
//Sets the text for a Word XML <w:t> node
//If the text is multi-line, it replaces the single <w:t> node for multiple nodes
//Resulting in multiple Word XML lines
private static void SetWordXmlNodeText(XmlDocument xmlDocument, XmlNode node, string newText)
{
//Is the text a single line or multiple lines?
if (newText.Contains(System.Environment.NewLine))
{
//The new text is a multi-line string, split it to individual lines
var lines = newText.Split("\n\r".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
//And add XML nodes for each line so that Word XML will accept the new lines
var xmlBuilder = new StringBuilder();
for (int count = 0; count < lines.Length; count++)
{
//Ensure the "w" prefix is set correctly, otherwise docFrag.InnerXml will fail with exception
xmlBuilder.Append("<w:t xmlns:w=\"http://schemas.microsoft.com/office/word/2003/wordml\">");
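//Escape &, < and > so the line cannot break the XML fragment
xmlBuilder.Append(System.Security.SecurityElement.Escape(lines[count]));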
xmlBuilder.Append("</w:t>");
//Not the last line? add line break
if (count != lines.Length - 1)
{
xmlBuilder.Append("<w:br xmlns:w=\"http://schemas.microsoft.com/office/word/2003/wordml\" />");
}
}
//Create the XML fragment with the new multiline structure
var docFrag = xmlDocument.CreateDocumentFragment();
docFrag.InnerXml = xmlBuilder.ToString();
node.ParentNode.AppendChild(docFrag);
//Remove the single line child node that was originally holding the single line text, only required if there was a node there to start with
node.ParentNode.RemoveChild(node);
}
else
{
//Text is not multi-line, let the existing node have the text
node.InnerText = newText;
}
}
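The same idea can be sketched with System.Xml.Linq, which escapes the text automatically. This is an illustration under assumptions, not a drop-in replacement for the function above: LineBreakBuilder and BuildRunContent are hypothetical names, the namespace is passed in by the caller (the 2003 WordML namespace used above, or the docx one), and unlike the original it keeps empty lines instead of removing them.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical helper: turns multi-line text into w:t / w:br elements.
public static class LineBreakBuilder
{
    // w is the WordprocessingML namespace of the target document.
    public static List<XElement> BuildRunContent(XNamespace w, string text)
    {
        // Accept Windows, Unix and old Mac line endings; "\r\n" is listed
        // first so it is matched as one separator, not two.
        string[] lines = text.Split(new[] { "\r\n", "\n", "\r" }, StringSplitOptions.None);

        var content = new List<XElement>();
        for (int i = 0; i < lines.Length; i++)
        {
            // A w:br goes between lines, not after the last one.
            if (i > 0)
                content.Add(new XElement(w + "br"));

            // xml:space="preserve" keeps leading and trailing spaces.
            content.Add(new XElement(w + "t",
                new XAttribute(XNamespace.Xml + "space", "preserve"),
                lines[i]));
        }
        return content;
    }
}
```

The returned elements can be added to a run element, for example new XElement(w + "r", LineBreakBuilder.BuildRunContent(w, text)).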