Search Word Document and hide range where Style.localName = "MyStyle" - c#

I have a lot of Word documents in which I need to hide text that is formatted with a custom Word.Style.
I'm using C# in a Windows Forms application. I open the Word documents one by one and go through all the paragraphs. I cast each paragraph's style to a Style object and read its NameLocal from that.
If the paragraph's local style name matches one that I need to hide, I take the paragraph's Range, select it, and set range.Font.Hidden = 1. I then try to save the document, but the document doesn't change, so the text formatted with my custom style is not hidden.
I've tried this so far:
using Word = Microsoft.Office.Interop.Word;
Word.Application wordApp = new Word.Application();
private void buttonOpenMany_Click(object sender, EventArgs e)
{
List<string> Styles = new List<string>();
Styles.Add("MyStyle1");
Styles.Add("MyStyle2");
Styles.Add("MyStyle3");
foreach (var item in Directory.GetFiles(textBoxWordFolder.Text,"*.docx"))
{
HideByStyleName(item, Styles);
}
MessageBox.Show("Done");
}
void HideByStyleName(string WordFile, List<string> StylesToHide)
{
wordApp.Documents.Open(WordFile);
Word.Document document = wordApp.ActiveDocument;
Word.Range rng = document.Content;
foreach (Word.Paragraph paragraph in document.Paragraphs)
{
Object styleobject = paragraph.get_Style();
string stylename = ((Word.Style)styleobject).NameLocal;
foreach (var style in StylesToHide)
{
if (stylename == style)
{
Word.Range range = paragraph.Range;
range.Select();
range.Font.Hidden = 1;
}
}
}
document.SaveAs2(WordFile);
document.Close();
}
I've also tried Range.Find.Execute, but I'm having a hard time figuring out how to use it to find text where Style.NameLocal = "MyCustomStyle1", and also how to replace the found range with hidden text. I hope someone has some input or guidance.
Thank you and best regards
Rasmus

Related

How to set ContentControl.Range to the current ContentControl I am working from?

I am not finding a way to set the ContentControl.Range.Text from where the C# is executing from (inside the content control). Perhaps I should be looking at it from a completely different perspective.
Currently I have a content control that produces a set of text with some text between [] square brackets and I want to select text and format the colour by setting the start and end of the range of characters between the []. I am stuck on trying to set the initial range to the contentcontrol I am currently using.
Most of what I have managed/found/patched together below.
object word;
Microsoft.Office.Interop.Word.Document _PWdDoc;
try
{
word = System.Runtime.InteropServices.Marshal.GetActiveObject("Word.Application");
//If there is a running Word instance, it gets saved into the word variable
}
catch (Exception ex)
{
//If there is no running instance, it creates a new one
Type type = Type.GetTypeFromProgID("Word.Application");
word = System.Activator.CreateInstance(type);
}
Microsoft.Office.Interop.Word.Application oWord = (Microsoft.Office.Interop.Word.Application) word;
_PWdDoc = oWord.ActiveDocument;
System.Collections.IEnumerator ContentX = _PWdDoc.ContentControls.GetEnumerator();
//Microsoft.Office.Interop.Word.ContentControl ContentX = Microsoft.Office.Interop.Word.ContentControls.Item[];
//Microsoft.Office.Interop.Word.Range rng = Microsoft.Office.Interop.Word.ContentControl.Range.Duplicate(ref ContentX);
//var rngX = Microsoft.Office.Interop.Word.ContentControl.Range(ContentX);
//Microsoft.Office.Interop.Word.ContentControl cc1 = ContentX;
Excuse the coding mess, but it's all I can come up with given the minimal experience I have with this.
Now that I have the IEnumerator of the content controls (I think), I have no idea how to use it. From what I have read, the advice is to iterate through the IEnumerable and access each item. That's not what I want to do. I want one content control: the current one that I am working in. I want to find its range and assign it to a variable. Then in that range's text I want to do some [fancy] highlighting.
Determining whether the current selection or a specific Range is in a content control and doing something with that content control is not a trivial matter. Most other Word objects will return something that they're "in"; content controls do not.
So the approach I use is to
create a Range that reaches from the current selection (or a specific Range) back to the beginning of the document
count the number of content controls in that range
then check whether the current selection is in the same range as the last content control of the extended range.
if it is, then I know the selection is within a content control and I can access the content control.
Here's some sample code. The snippet that calls the function I use to return the information:
Word.Range rng = null;
//Substitute a specific Range object if working with a Range, rather than a Selection
Word.ContentControl cc = IsSelectionInCC(wdApp.Selection.Range);
if ( cc != null)
{
rng = cc.Range;
rng.HighlightColorIndex = Word.WdColorIndex.wdYellow;
}
The function:
private Word.ContentControl IsSelectionInCC(Word.Range sel)
{
Word.Range rng = sel.Duplicate; //sel is already a Range, so work on a copy of it
Word.Document doc = (Word.Document) rng.Parent;
rng.Start = doc.Content.Start;
int nrCC = rng.ContentControls.Count;
Word.ContentControl cc = null;
bool InCC = false;
if (nrCC > 0)
{
if (sel.InRange(doc.ContentControls[nrCC].Range))
{
InCC = true; //Debug.Print ("Sel in cc")
cc = doc.ContentControls[nrCC];
}
else
{
sel.MoveEnd(Word.WdUnits.wdCharacter, 1);
if (sel.Text == null)
{
//Debug.Print ("Sel at end of cc")
InCC = true;
cc = doc.ContentControls[nrCC];
}
}
}
return cc;
}
Assuming you mean that the insertion point is inside a Content Control, and your Word Application object is called oWord, then you can get the range of that content control using e.g.
Microsoft.Office.Interop.Word.Range r = oWord.Selection.Range.ParentContentControl.Range;
If you have nested controls You can verify that the insertion point is in a Content Control (Word 2013 and later, I think) by checking the value of inCC as follows:
Boolean inCC = (Boolean)oWord.Selection.Information[Microsoft.Office.Interop.Word.WdInformation.wdInContentControl];
However, when dealing with content controls, be aware that selecting a content control in the UI is different from selecting the "range of the content control". Programmatically, it's obvious how to select the Range - not so obvious how to select the control. If you select the Range, the ParentContentControl should be the control whose range you've selected. If you (or the user) selected the control, OTTOMH I am not so sure.
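Once the control's range is in hand, the bracketed parts mentioned in the question still have to be located in its text. A minimal sketch of that step in plain C#, with no Word dependency; BracketSpans and Find are hypothetical names used only for illustration, and nested brackets are not handled:

```csharp
using System;
using System.Collections.Generic;

public static class BracketSpans
{
    // Returns (start, length) pairs for the text strictly inside each [...] pair.
    // Hypothetical helper, not part of the Word object model.
    public static List<Tuple<int, int>> Find(string text)
    {
        var spans = new List<Tuple<int, int>>();
        int open = -1;
        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] == '[')
            {
                open = i; // remember the most recent opening bracket
            }
            else if (text[i] == ']' && open >= 0)
            {
                spans.Add(Tuple.Create(open + 1, i - open - 1));
                open = -1; // wait for the next opening bracket
            }
        }
        return spans;
    }
}
```

Each offset could then be added to the Start of the content control's Range to build a sub-range for colouring. That mapping is an assumption: it should hold when the control contains only plain text, but fields or other special content inside the control may shift the positions.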

Replace text in Word with text from C# form

I'm trying to make an application in C#. When pressing a radio button, I'd like to open a Microsoft Word document (an invoice) and replace some text with text from my Form. The Word documents also contains some textboxes with text.
I've tried to implement the code written in this link Word Automation Find and Replace not including Text Boxes but when I press the radio button, a window appears asking for "the encoding that makes the document readable" and then the Word document opens and it's full of black triangles and other things instead of my initial template for the invoice.
How my invoice looks after:
Here is what I've tried:
string documentLocation = @"C:\Documents\Visual Studio 2015\Project\Invoice.doc";
private void yes_radioBtn_CheckedChanged(object sender, EventArgs e)
{
FindReplace(documentLocation, "HotelName", "MyHotelName");
Process process = new Process();
process.StartInfo.FileName = documentLocation;
process.Start();
}
private void FindReplace(string documentLocation, string findText, string replaceText)
{
var app = new Microsoft.Office.Interop.Word.Application();
var doc = app.Documents.Open(documentLocation);
var range = doc.Range();
range.Find.Execute(FindText: findText, Replace: WdReplace.wdReplaceAll, ReplaceWith: replaceText);
var shapes = doc.Shapes;
foreach (Shape shape in shapes)
{
var initialText = shape.TextFrame.TextRange.Text;
var resultingText = initialText.Replace(findText, replaceText);
shape.TextFrame.TextRange.Text = resultingText;
}
doc.Save();
doc.Close();
Marshal.ReleaseComObject(app);
}
So if your Word template is the same each time, you essentially:
Copy The Template
Work On The Template
Save In Desired Format
Delete Template Copy
For each section of the Word document that you are replacing, you have to insert a bookmark at that location (this is the easiest way to put text in a specific area).
I always create a function to accomplish this and pass in the path as well as all of the text that replaces my in-document bookmarks. The function call can get long sometimes, but it works for me.
Application app = new Application();
Document doc = app.Documents.Open("sDocumentCopyPath.docx");
if (doc.Bookmarks.Exists("bookmark_1"))
{
object oBookMark = "bookmark_1";
doc.Bookmarks.get_Item(ref oBookMark).Range.Text = "My text to replace bookmark_1";
}
if (doc.Bookmarks.Exists("bookmark_2"))
{
object oBookMark = "bookmark_2";
doc.Bookmarks.get_Item(ref oBookMark).Range.Text = "My text to replace bookmark_2";
}
doc.ExportAsFixedFormat("myNewPdf.pdf", WdExportFormat.wdExportFormatPDF);
((_Document)doc).Close();
((_Application)app).Quit();
This code should get you up and running unless you want to pass in all the values into a function.
EDIT: If you need more examples I'm working on a blog post as well, so I have a lot more detail if this wasn't clear enough for your use case.
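The copy, fill, export, and delete steps described above could be wrapped in one function along these lines. This is only a sketch under assumptions: InvoiceFiller, FillTemplate, and the dictionary of bookmark names to values are hypothetical, the working copy is placed in the temp folder, and it needs Word installed plus a reference to Microsoft.Office.Interop.Word.

```csharp
using System.Collections.Generic;
using System.IO;
using Microsoft.Office.Interop.Word;

static class InvoiceFiller
{
    // Hypothetical helper: fills bookmarks in a copy of the template and exports a PDF.
    public static void FillTemplate(string templatePath, string pdfPath,
                                    IDictionary<string, string> bookmarkValues)
    {
        // 1. Copy the template so the original is never modified.
        string copyPath = Path.Combine(Path.GetTempPath(),
            Path.GetRandomFileName() + Path.GetExtension(templatePath));
        File.Copy(templatePath, copyPath);

        Application app = new Application();
        try
        {
            // 2. Work on the copy: replace each bookmark that exists.
            Document doc = app.Documents.Open(copyPath);
            foreach (KeyValuePair<string, string> entry in bookmarkValues)
            {
                if (doc.Bookmarks.Exists(entry.Key))
                {
                    object name = entry.Key;
                    doc.Bookmarks.get_Item(ref name).Range.Text = entry.Value;
                }
            }

            // 3. Save in the desired format (PDF here, as in the snippet above).
            doc.ExportAsFixedFormat(pdfPath, WdExportFormat.wdExportFormatPDF);
            ((_Document)doc).Close(SaveChanges: false);
        }
        finally
        {
            ((_Application)app).Quit();
            // 4. Delete the working copy.
            File.Delete(copyPath);
        }
    }
}
```

Passing the values as a dictionary keeps the call short even when the template has many bookmarks; a bookmark name that is missing from the document is simply skipped.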

c# wpf export RichTextBox formatting to xml document

In WPF I've created a RichTextBox and implemented the functionality to format selected text (bold, underlined, font, etc.). Now I would like to export all of the formatting to an XML file, so that loading the file gives me the same text with the same formatting.
I think the best way to do this would be to find each place in the RTB where there is formatting and save it as a text range, but I don't know whether the RTB has a method for finding out if a part of the text is formatted.
Here is what i've got:
xaml:
<Button Name = "export" Click = "export_Click"/>
<RichTextBox x:Name="RTB"/>
and the c#:
private void export_Click(object sender, RoutedEventArgs e){
TextRange range = new TextRange();
//here is where i want to access the formatted areas
//something like: range = RTB.document.getBoldArea();
//and then i could export what i got in the text range to a xml file
}
thanks in advance to anyone willing to help!
You can actually access XAML content directly, which is itself obviously XML. You could either save this directly or manipulate/translate it into your own schema.
To get the XAML for a RichTextBox :
static string GetXaml(RichTextBox rt)
{
TextRange range = new TextRange(rt.Document.ContentStart, rt.Document.ContentEnd);
MemoryStream stream = new MemoryStream();
range.Save(stream, DataFormats.Xaml);
string xamlText = Encoding.UTF8.GetString(stream.ToArray());
return xamlText;
}
To set the XAML content for a RichTextBox :
static void SetXaml(RichTextBox rt, string xamlString)
{
StringReader stringReader = new StringReader(xamlString);
XmlReader xmlReader = XmlReader.Create(stringReader);
Section sec = XamlReader.Load(xmlReader) as Section;
FlowDocument doc = new FlowDocument();
while (sec.Blocks.Count > 0)
doc.Blocks.Add(sec.Blocks.FirstBlock);
rt.Document = doc;
}

How to keep style on open xml documents

I am using open XML(Microsoft Word - .docx) as a file template to automatically generate other documents. In the template document I have defined content controls, and I have written code to replace content in these content controls.
The content is replaced and the documents are generated, but I am struggling to keep the style. In Word, when inspecting the properties of the content control, I have checked the checkbox for "Use a style to format text typed into the empty control: style", and also the one for "Remove content control when contents are edited". This doesn't seem to have any impact when the documents are generated by code.
This is my code (which a community member here was kind enough to help with) for replacing the data in the content controls. Any ideas what I should do in order to keep the formatting? The formatting is simple text formatting, like size and font. Please advise:
private static void ReplaceTags(MainDocumentPart mainPart, string tagName, string tagValue)
{
//grab all the tag fields
var tagFields = mainPart.Document.Body.Descendants<SdtBlock>().Where
(r => r.SdtProperties.GetFirstChild<Tag>().Val == tagName);
foreach (var field in tagFields)
{
//remove all paragraphs from the content block
field.SdtContentBlock.RemoveAllChildren<DocumentFormat.OpenXml.Wordprocessing.Paragraph>();
//create a new paragraph containing a run and a text element
var newParagraph = new DocumentFormat.OpenXml.Wordprocessing.Paragraph();
var newRun = new DocumentFormat.OpenXml.Wordprocessing.Run();
var newText = new DocumentFormat.OpenXml.Wordprocessing.Text(tagValue);
newRun.Append(newText);
newParagraph.Append(newRun);
//add the new paragraph to the content block
field.SdtContentBlock.Append(newParagraph);
}
}
When you assign a style to the content control a new RunProperties element is added under the SdtProperties. For example, if I assign a new style called Style1 I can see the following XML is generated:
<w:sdt>
<w:sdtPr>
<w:rPr>
<w:rStyle w:val="Style1" />
</w:rPr>
<w:alias w:val="LastName" />
<w:tag w:val="LastName" />
....
You need to grab this value and assign it to the new Paragraph you are creating, add the Paragraph at the same level as the SdtBlock and then remove the SdtBlock which is what Word does when you select the "Remove content control when contents are edited" option. The RunProperties are the <w:rPr> element. The following should do what you're after.
private static void ReplaceTags(MainDocumentPart mainPart, string tagName, string tagValue)
{
//grab all the tag fields
IEnumerable<SdtBlock> tagFields = mainPart.Document.Body.Descendants<SdtBlock>().Where
(r => r.SdtProperties.GetFirstChild<Tag>().Val == tagName);
foreach (var field in tagFields)
{
//grab the RunProperties from the SdtBlock
RunProperties runProp = field.SdtProperties.GetFirstChild<RunProperties>();
//create a new paragraph containing a run and a text element
Paragraph newParagraph = new Paragraph();
Run newRun = new Run();
if (runProp != null)
{
//assign the RunProperties to our new run
newRun.Append(runProp.CloneNode(true));
}
Text newText = new Text(tagValue);
newRun.Append(newText);
newParagraph.Append(newRun);
//insert the new paragraph before the field we're going to remove
field.Parent.InsertBefore(newParagraph, field);
//remove the SdtBlock to mimic the Remove content control when contents are edited option
field.Remove();
}
}

underline portion of text using iTextSharp

I have an application that uses itextsharp to fill PDF form fields.
One of these fields has some text with tags. For example:
<U>This text should be underlined</U>.
I'd like the text enclosed in <U>..</U> to be underlined.
How could I do that?
How could I approach it with HTMLWorker, for example?
Here's the portion of code where I write my description:
for (int i = 0; i < linesDescription.Count; i++)
{
int count = linesDescription[i].Count();
int countTrim = linesDescription[i].Trim().Count();
Chunk cnk = new Chunk(linesDescription[i] + GeneralPurpose.ReturnChar, TextStyle);
if (firstOpe && i > MaxLinePerPage - 1)
LongDescWrapped_dt_extra.Add(cnk);
else
LongDescWrapped_dt.Add(cnk);
}
Ordinary text fields do not support rich text. If you want the fields to remain interactive, you will need RichText fields. These are fields that are flagged in a way that they accept an RV value. This is explained here: Set different parts of a form field to have different fonts using iTextSharp (Note that I didn't succeed in getting this to work, but you may have better luck.)
If it is OK for you to flatten the form (i.e. remove all interactivity), please take a look at the FillWithUnderline example:
public void manipulatePdf(String src, String dest) throws DocumentException, IOException {
PdfReader reader = new PdfReader(src);
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
stamper.setFormFlattening(true);
AcroFields form = stamper.getAcroFields();
FieldPosition pos = form.getFieldPositions("Name").get(0);
ColumnText ct = new ColumnText(stamper.getOverContent(pos.page));
ct.setSimpleColumn(pos.position);
ElementList elements = XMLWorkerHelper.parseToElementList("<div>Bruno <u>Lowagie</u></div>", null);
for (Element element : elements) {
ct.addElement(element);
}
ct.go();
stamper.close();
}
In this example, we don't fill out the field, but we get the fields position (a page number and a rectangle). We then use ColumnText to add content at this position. As we are inputting HTML, we use XML Worker to parse the HTML into iText objects that we can add to the ColumnText object.
This is a Java example, but it should be easy to port this to C# if you know how to code in C# (which I don't).
You can try this:
Chunk chunk = new Chunk("Underlined TExt", FontFactory.GetFont(FontFactory.TIMES_ROMAN, 12.0f, iTextSharp.text.Font.BOLD | iTextSharp.text.Font.UNDERLINE));
Paragraph reportHeadline = new Paragraph(chunk);
reportHeadline.SpacingBefore = 12.0f;
pdfDoc.Add(reportHeadline);
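To apply an underlined Chunk like this only to the tagged part of the description, the string first has to be split at the tags. A minimal sketch of that split in plain C#, assuming the markers are exactly <U> and </U> and are not nested; UnderlineTags and Split are hypothetical names, not part of iTextSharp:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UnderlineTags
{
    // Splits text into (content, isUnderlined) segments,
    // treating <U>...</U> as underline markers (case-insensitive).
    public static List<Tuple<string, bool>> Split(string text)
    {
        var segments = new List<Tuple<string, bool>>();
        int pos = 0;
        foreach (Match m in Regex.Matches(text, "<U>(.*?)</U>",
                 RegexOptions.IgnoreCase | RegexOptions.Singleline))
        {
            // Plain text before the tag, if any.
            if (m.Index > pos)
                segments.Add(Tuple.Create(text.Substring(pos, m.Index - pos), false));
            // The tagged text itself, without the markers.
            segments.Add(Tuple.Create(m.Groups[1].Value, true));
            pos = m.Index + m.Length;
        }
        // Plain text after the last tag, if any.
        if (pos < text.Length)
            segments.Add(Tuple.Create(text.Substring(pos), false));
        return segments;
    }
}
```

Each segment could then become its own Chunk, with iTextSharp.text.Font.UNDERLINE set on the font only when the flag is true. As noted in the first answer, this helps only when the content is drawn onto the page; an ordinary text field will not keep the formatting.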
