Replace text in Word with text from C# form - c#

I'm trying to make an application in C#. When a radio button is selected, I'd like to open a Microsoft Word document (an invoice) and replace some text with text from my form. The Word document also contains some text boxes with text.
I've tried to implement the code from this link, "Word Automation Find and Replace not including Text Boxes", but when I press the radio button, a window appears asking for "the encoding that makes the document readable". Then the Word document opens, and it's full of black triangles and other symbols instead of my original invoice template.
How my invoice looks after:
Here is what I've tried:
string documentLocation = @"C:\Documents\Visual Studio 2015\Project\Invoice.doc";
private void yes_radioBtn_CheckedChanged(object sender, EventArgs e)
{
    FindReplace(documentLocation, "HotelName", "MyHotelName");
    Process process = new Process();
    process.StartInfo.FileName = documentLocation;
    process.Start();
}
private void FindReplace(string documentLocation, string findText, string replaceText)
{
    var app = new Microsoft.Office.Interop.Word.Application();
    var doc = app.Documents.Open(documentLocation);
    var range = doc.Range();
    range.Find.Execute(FindText: findText, Replace: WdReplace.wdReplaceAll, ReplaceWith: replaceText);
    var shapes = doc.Shapes;
    foreach (Shape shape in shapes)
    {
        var initialText = shape.TextFrame.TextRange.Text;
        var resultingText = initialText.Replace(findText, replaceText);
        shape.TextFrame.TextRange.Text = resultingText;
    }
    doc.Save();
    doc.Close();
    Marshal.ReleaseComObject(app);
}

If your Word template is the same each time, the workflow is essentially:
1. Copy the template.
2. Work on the copy.
3. Save it in the desired format.
4. Delete the template copy.
For each piece of text you want to replace in the Word document, insert a bookmark at that location (this is the easiest way to put text in a specific place).
I always create a function to do this, passing in the path as well as all of the text that replaces my in-document bookmarks. The function call can get long sometimes, but it works for me.
using Microsoft.Office.Interop.Word;
Application app = new Application();
Document doc = app.Documents.Open("sDocumentCopyPath.docx");
// Setting Range.Text replaces the bookmarked range with the new text
if (doc.Bookmarks.Exists("bookmark_1"))
{
    object oBookMark = "bookmark_1";
    doc.Bookmarks.get_Item(ref oBookMark).Range.Text = "My text to replace bookmark_1";
}
if (doc.Bookmarks.Exists("bookmark_2"))
{
    object oBookMark = "bookmark_2";
    doc.Bookmarks.get_Item(ref oBookMark).Range.Text = "My text to replace bookmark_2";
}
// Export the filled-in copy as a PDF, then close the document and Word
doc.ExportAsFixedFormat("myNewPdf.pdf", WdExportFormat.wdExportFormatPDF);
((_Document)doc).Close();
((_Application)app).Quit();
This code should get you up and running, unless you would rather pass all of the values into a function.
EDIT: If you need more examples I'm working on a blog post as well, so I have a lot more detail if this wasn't clear enough for your use case.
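The four steps above (copy, work on the copy, save, delete the copy) could be wrapped in a single function along these lines. This is only a sketch under assumptions: the class name InvoiceTemplate, the method FillTemplate and its parameters are hypothetical and not part of the answer, the bookmarks are assumed to already exist in the template, and pdfPath is assumed to be a full path.

```csharp
using System.Collections.Generic;
using System.IO;
using Word = Microsoft.Office.Interop.Word;

static class InvoiceTemplate
{
    // Hypothetical helper: the name and parameters are illustrative only.
    public static void FillTemplate(string templatePath, string pdfPath,
                                    IDictionary<string, string> bookmarkValues)
    {
        // 1. Copy the template so the original is never modified.
        string workingCopy = Path.Combine(Path.GetTempPath(),
            Path.GetRandomFileName() + Path.GetExtension(templatePath));
        File.Copy(templatePath, workingCopy);

        Word.Application app = new Word.Application();
        Word.Document doc = null;
        try
        {
            // 2. Work on the copy: replace each bookmark that exists.
            doc = app.Documents.Open(workingCopy);
            foreach (KeyValuePair<string, string> entry in bookmarkValues)
            {
                if (doc.Bookmarks.Exists(entry.Key))
                {
                    object name = entry.Key;
                    doc.Bookmarks.get_Item(ref name).Range.Text = entry.Value;
                }
            }

            // 3. Save in the desired format (PDF here).
            doc.ExportAsFixedFormat(pdfPath, Word.WdExportFormat.wdExportFormatPDF);
        }
        finally
        {
            // Close without saving the working copy, then shut Word down.
            if (doc != null)
            {
                ((Word._Document)doc).Close(Word.WdSaveOptions.wdDoNotSaveChanges);
            }
            ((Word._Application)app).Quit();

            // 4. Delete the template copy.
            File.Delete(workingCopy);
        }
    }
}
```

Passing the replacements as a dictionary keeps the call short even when the template has many bookmarks, for example FillTemplate(templatePath, pdfPath, new Dictionary<string, string> { { "bookmark_1", hotelName } }), where hotelName stands for whatever value comes from the form.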

Related

Search Word Document and hide range where Style.localName = "MyStyle"

I have a lot of Word documents in which I need to hide text that is formatted with a custom Word.Style.
I'm using C# in a Windows Forms application. I open the Word documents one by one and go through all paragraphs. I cast each paragraph's style to a Style object, and from that I get the paragraph's local style name (NameLocal).
If the paragraph's local style name matches one that I need to hide, I take the paragraph's Range, select it, and set range.Font.Hidden = 1. I then save the document, but the document doesn't change: text formatted with my custom style is not hidden.
I've tried this so far:
using Word = Microsoft.Office.Interop.Word;
Word.Application wordApp = new Word.Application();
private void buttonOpenMany_Click(object sender, EventArgs e)
{
    List<string> Styles = new List<string>();
    Styles.Add("MyStyle1");
    Styles.Add("MyStyle2");
    Styles.Add("MyStyle3");
    foreach (var item in Directory.GetFiles(textBoxWordFolder.Text,"*.docx"))
    {
        HideByStyleName(item, Styles);
    }
    MessageBox.Show("Done");
}
void HideByStyleName(string WordFile, List<string> StylesToHide)
{
    wordApp.Documents.Open(WordFile);
    Word.Document document = wordApp.ActiveDocument;
    Word.Range rng = document.Content;
    foreach (Word.Paragraph paragraph in document.Paragraphs)
    {
        Object styleobject = paragraph.get_Style();
        string stylename = ((Word.Style)styleobject).NameLocal;
        foreach (var style in StylesToHide)
        {
            if (stylename == style)
            {
                Word.Range range = paragraph.Range;
                range.Select();
                range.Font.Hidden = 1;
            }
        }
    }
    document.SaveAs2(WordFile);
    document.Close();
}
I've also tried to use Range.Find.Execute, but I'm having a hard time figuring out how to use it to find text where Style.NameLocal = "MyCustomStyle1", and also how to replace the found range with hidden text. I hope someone has some input or guidance.
Thank you and best regards
Rasmus
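For the Range.Find idea mentioned in the question, one possible shape is a formatting-only find and replace. This is an untested sketch, not a confirmed fix for the problem described: the class StyleHider and the method HideTextWithStyle are hypothetical names, and it assumes the style exists in the document (Word raises a COMException when asked to search for an unknown style).

```csharp
using Word = Microsoft.Office.Interop.Word;

static class StyleHider
{
    // Hypothetical helper: marks every run of text formatted with styleName as hidden.
    public static void HideTextWithStyle(Word.Document document, string styleName)
    {
        Word.Range range = document.Content;
        Word.Find find = range.Find;

        // Search by formatting only: no search text, just the style.
        find.ClearFormatting();
        object style = styleName;
        find.set_Style(ref style);

        // The replacement keeps the matched text ("^&") and adds the hidden attribute.
        find.Replacement.ClearFormatting();
        find.Replacement.Font.Hidden = 1;

        find.Execute(FindText: "", Format: true, ReplaceWith: "^&",
                     Replace: Word.WdReplace.wdReplaceAll);
    }
}
```

It would be called once per style name inside HideByStyleName, before the document is saved. Note that hidden text can still appear on screen if Word is set to show hidden text or formatting marks, so the result is easiest to check with those options turned off.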

RichTextBox links appearing on same position

I have been able to get links to appear in my RichTextBox. The first entry is correct, but when I append a new line that also contains a link, the first entry ends up in the same position as the new link. When clicked, the link still has the first entry's hyperlink.
I want each line to have its own hyperlink (where it's underlined).
Code used to append a link:
public void AppendLink(string text, string linkText)
{
    LinkLabel link = new LinkLabel();
    link.Text = text;
    link.LinkClicked += new LinkLabelLinkClickedEventHandler(this.link_LinkClicked);
    LinkLabel.Link data = new LinkLabel.Link();
    data.LinkData = linkText;
    link.Links.Add(data);
    link.Location = this.logTextBox.GetPositionFromCharIndex(this.logTextBox.TextLength);
    this.logTextBox.Controls.Add(link);
    logTextBox.SelectionFont = UNDERLINE_FONT;
    this.logTextBox.AppendText(s);
}
Called using this
AppendLogLine("Sealed ");
AppendLink(itemName, GetItemLink(itemName));
AppendLog(" is an unknown item. Keeping.");
AppendLog and AppendLogLine do the same as AppendLink, except that they don't create a link and they use a different font.

Custom Document property not getting saved in Word Document

I have created an add-in for Word. I am trying to update the value of a custom property in a Word document on a button click, but it's not getting saved.
The code I wrote is:
private void button_Click(object sender, IRibbonControl control, bool pressed)
{
    Word.Document document = WordApp.ActiveDocument;
    Microsoft.Office.Core.DocumentProperties properties;
    properties = (Microsoft.Office.Core.DocumentProperties)document.CustomDocumentProperties;
    properties["abc"].Value = "newValue";
    document.Save();
}
If I close the document and open it again, I get the old value, not the new one.
But if I add a space to my document and then save it, the value of the custom property is saved.
The code is:
private void button_Click(object sender, IRibbonControl control, bool pressed)
{
    Word.Document document = WordApp.ActiveDocument;
    Microsoft.Office.Core.DocumentProperties properties;
    properties = (Microsoft.Office.Core.DocumentProperties)document.CustomDocumentProperties;
    properties["abc"].Value = "newValue";
    document.Range(document.Content.End - 1, document.Content.End - 1).Select();
    WordApp.Selection.Range.Text = " ";
    document.Save();
}
Why does it behave like this? I do not want to add any extra blank space to my document. Please help me with this. Thanks in advance.
This is a known "idiosyncrasy" of a number of Office applications, not just Word. Changing the value of a document property, but nothing else, doesn't get "noticed", so the document isn't saved. There's quite a bit of detail in this discussion on MSDN.
Either the code needs to add something to the document "body" (which can then be deleted but not undone) or it can explicitly set the "dirty" flag on the document so that Word realizes it does need to save:
document.Saved = false;
document.Save();

How to use the Word RepeatingSection ContentControl in a protected word document

I am implementing a Word template for a form-filling application using VSTO and C# in Visual Studio 2017, and I wish to take advantage of the Word repeating section content control. However, I am prevented from programmatically applying this type of control after I have protected the document for form filling. It appears that unprotecting the document does not return it to the same unprotected state it was in before it was protected. Here is a stripped-down demonstration program to highlight the problem:
In Visual Studio, create a new Word 2013 and 2016 VSTO Template project, leave the project using the unchanged default blank document template, and add the following code to the ThisDocument partial class:
private void ThisDocument_Startup(object sender, System.EventArgs e)
{
    //Demonstrates an unexpected impact of protecting then subsequently unprotecting a document
    AddTableDirect();
    DocProtect();
    DocUnprotect();
    AddTableRepeatingSection();
}
private void ThisDocument_Shutdown(object sender, System.EventArgs e)
{
}
private void DocProtect()
{
    //Protects the active document restricting the operator to form filling
    object noReset = true;
    object password = System.String.Empty;
    object useIRM = false;
    object enforceStyleLock = false;
    this.Protect(Word.WdProtectionType.wdAllowOnlyFormFields,
        ref noReset, ref password, ref useIRM, ref enforceStyleLock);
}
private void DocUnprotect()
{
    // Unprotects the active document allowing programmatic manipulation
    object password = System.String.Empty;
    this.Unprotect(ref password);
}
private void AddTableDirect()
{
    //Creates a one row table directly adding a single plain text content control
    Word.Range range = this.Sections[1].Range.Paragraphs[1].Range;
    Word.Table table = this.Tables.Add
        (range, 1, 1, Word.WdDefaultTableBehavior.wdWord9TableBehavior, Word.WdAutoFitBehavior.wdAutoFitWindow);
    Word.ContentControl cc = this.ContentControls.Add
        (Word.WdContentControlType.wdContentControlText, table.Cell(1, 1).Range);
}
private void AddTableRepeatingSection()
{
    //Programmatically duplicates the table as a repeating section
    Word.Table table = this.Sections[1].Range.Tables[1];
    Word.Range rSRange = table.Range;
    Word.ContentControl rSCC = this.ContentControls.Add
        (Word.WdContentControlType.wdContentControlRepeatingSection, rSRange);
    rSCC.RepeatingSectionItems[1].InsertItemAfter();
}
If you build and run this code as is then a System.Runtime.InteropServices.COMException is generated with text: "This method or property is not available because the current selection partially covers a plain text content control" on the statement that adds the Repeating Section control in the AddTableRepeatingSection() method (the line before InsertItemAfter).
However, if you comment out the DocProtect() and DocUnprotect() statements in ThisDocument_Startup, this code runs successfully.
What do I need to change to enable me to protect and unprotect the document without generating this exception when programmatically applying the repeating section content control?
I can duplicate what you're seeing - I don't know why it's doing this, seems to be almost some kind of race condition because after the document is opened (click "Continue") it works manually...
I found a workaround. It appears that selecting the table puts whatever is causing Word to pick up the content control in the first cell back where it belongs:
private void AddTableRepeatingSection()
{
    //Programmatically duplicates the table as a repeating section
    Word.Table table = this.Sections[1].Range.Tables[1];
    Word.Range rSRange = table.Range;
    rSRange.Select();
    Word.Range r = this.Application.Selection.Range;
    Word.ContentControl rSCC = this.ContentControls.Add
        (Word.WdContentControlType.wdContentControlRepeatingSection, r);
    rSCC.RepeatingSectionItems[1].InsertItemAfter();
}

How to read hyperlinks with anchor text from a PDF file in C#

I have extracted the link values from a PDF file, for example http://google.com, but I need the anchor text, for example "click here".
How can I get the anchor text of a link?
I got the URL values from the PDF file by following this question: Reading hyperlinks from pdf file
For example:
Anchor a = new Anchor("Test Anchor");
a.Reference = "http://www.google.com";
myParagraph.Add(a);
Here I get http://www.google.com, but I need the anchor text, i.e. "Test Anchor".
I would appreciate your suggestions.
From the PDF file you need to identify the region where the link is placed and then read the text below the link using iTextSharp.
This way you can extract the text underneath the link. The limitation of this approach is that if the link region is wider than the text, the extraction will read the full text under that region.
private void GetAllHyperlinksFromPDFDocument(string pdfFilePath)
{
    string linkTextBuilder = "";
    string linkReferenceBuilder = "";
    PdfDictionary PageDictionary = default(PdfDictionary);
    PdfArray Annots = default(PdfArray);
    PdfReader R = new PdfReader(pdfFilePath);
    List<BinaryHyperlink> ret = new List<BinaryHyperlink>();
    //Loop through each page
    for (int i = 1; i <= R.NumberOfPages; i++)
    {
        //Get the current page
        PageDictionary = R.GetPageN(i);
        //Get all of the annotations for the current page
        Annots = PageDictionary.GetAsArray(PdfName.ANNOTS);
        //Make sure we have something
        if ((Annots == null) || (Annots.Length == 0))
            continue;
        //Loop through each annotation
        foreach (PdfObject A in Annots.ArrayList)
        {
            //Convert the itext-specific object as a generic PDF object
            PdfDictionary AnnotationDictionary = (PdfDictionary)PdfReader.GetPdfObject(A);
            //Make sure this annotation has a link
            if (!AnnotationDictionary.Get(PdfName.SUBTYPE).Equals(PdfName.LINK))
                continue;
            //Make sure this annotation has an ACTION
            if (AnnotationDictionary.Get(PdfName.A) == null)
                continue;
            //Get the ACTION for the current annotation
            PdfDictionary AnnotationAction = (PdfDictionary)AnnotationDictionary.GetAsDict(PdfName.A);
            if (AnnotationAction.Get(PdfName.S).Equals(PdfName.URI))
            {
                //Get action link URL : linkReferenceBuilder
                PdfString Link = AnnotationAction.GetAsString(PdfName.URI);
                if (Link != null)
                    linkReferenceBuilder = Link.ToString();
                //Get action link text : linkTextBuilder
                var LinkLocation = AnnotationDictionary.GetAsArray(PdfName.RECT);
                List<string> linestringlist = new List<string>();
                iTextSharp.text.Rectangle rect = new iTextSharp.text.Rectangle(((PdfNumber)LinkLocation[0]).FloatValue, ((PdfNumber)LinkLocation[1]).FloatValue, ((PdfNumber)LinkLocation[2]).FloatValue, ((PdfNumber)LinkLocation[3]).FloatValue);
                RenderFilter[] renderFilter = new RenderFilter[1];
                renderFilter[0] = new RegionTextRenderFilter(rect);
                ITextExtractionStrategy textExtractionStrategy = new FilteredTextRenderListener(new LocationTextExtractionStrategy(), renderFilter);
                linkTextBuilder = PdfTextExtractor.GetTextFromPage(R, i, textExtractionStrategy).Trim();
            }
        }
    }
}
Unfortunately I don't think you're going to be able to do this, at least not without a lot of guesswork. In HTML this would be easy because a hyperlink and its text are stored together as:
<a href="...">Click here</a>
However, in a PDF these two entities are not stored with any form of relationship. What we think of as a "hyperlink" within a PDF is technically a PDF Annotation that just happens to be sitting on top of text. You can see this by opening a PDF in an editing program such as Adobe Acrobat Pro. You can change the text but the "clickable" area doesn't change. You can also move and resize the "clickable" area and put it anywhere in the document.
When creating PDFs, iText/iTextSharp abstract this away so you don't have to think about this. You can create a "hyperlink" with clickable text but when it generates a PDF it ultimately will create the text as normal text, calculate the rectangle coordinates and then put an annotation at that rectangle.
I did say that you could try to guess at this, and it might or might not work for you. To do this you'd need to get the rectangle for annotation and then find the text that's also at those coordinates. It won't be an exact match, however, because of padding issues. If you absolutely have to get the text under a hyperlink then this is the only way that I know of for doing this. Good luck!
