Getting a paragraph value in MigraDoc - c#

I have this line:
Paragraph par = row.Cells[0].AddParagraph("Value");
Is there a way to get the text value from par? I tried par.GetValue(), but that didn't work.

Paragraphs can contain a mix of text with different sizes, fonts, attributes along with images and other things.
Here's a code snippet that gets the first text element:
if (par.Elements.Count > 0)
{
    // The first element is a Text only if the paragraph starts with plain text.
    Text t = par.Elements[0] as Text;
    if (t != null)
    {
        string s = t.Content;
        ...
    }
}
You know what your code adds to the paragraph, so you should know what you have to extract.
I don't know what you are trying to achieve, but every MigraDoc document object has a Tag member for custom use. You can assign any object (including a string) to this Tag.

Related

Finding list of objects that contain full or just part of searched string

I have a list of paragraphs, each of which can contain text. I'm trying to search for a string that may sit entirely within a single paragraph or be spread across several paragraphs; in the worst case, each letter is in a different paragraph.
public List<WordParagraph> FindText(string text) {
    List<WordParagraph> list = new List<WordParagraph>();
    var found = false;
    Paragraph currentParagraph = null;
    foreach (var paragraph in this.Paragraphs) {
        //if (currentParagraph == null) {
        //    currentParagraph = paragraph._paragraph;
        //} else {
        //    if (currentParagraph != paragraph._paragraph) {
        //        found = false;
        //    }
        //}
        // paragraph.Text
        // logic missing to find text that can start within some paragraph.Text, but
        // can span across multiple paragraphs
        // for example searching for text "This Is MyTest" within 4 paragraphs that
        // may be written like
        // paragraph.Text = "Thi"
        // paragraph.Text = "s Is"
        // paragraph.Text = " MyTes"
        // paragraph.Text = "t"
    }
    return list;
}
I've tried some logic around a foreach over each char in text, with a nested loop over the characters of paragraph.Text, but I couldn't get it to work.
To give you a bit of background: consider a Word document that has a single sentence, one long sentence in which each word, or even each letter, is formatted differently (different font size, bold, underline, or whatever). It looks like this:
What Word actually saves in the file is a single paragraph made up of multiple "runs". Each run contains a Text element. Together the Text elements hold the text that you see in Word, but because the formatting can change with every word, the text can be split into many small Text elements.
In my example I've simplified the logic so that each "run" is a paragraph with a text. So the list of WordParagraphs is the list of runs within the screenshot.
Now I need to find the string "I have that" in the whole sentence you see in Word. That means I need to go through all the paragraphs, find the first letter that matches, and then check whether the next letter matches as well; if it doesn't, I need to start again.
I'm having a hard time expressing this logic in code.
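One way to approach the search described above, offered as a minimal sketch rather than a definitive implementation: join the text of all runs into one string, remember the offset at which each run starts, search the joined string once, and then report every run whose character range overlaps the match. The class name RunSearch and the method FindRuns are hypothetical, and the runs are modeled as plain strings (standing in for paragraph.Text) instead of the WordParagraph type from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class RunSearch
{
    // Hypothetical helper: returns the indexes of the runs that contain any
    // part of the first occurrence of `text`, or an empty list if not found.
    // Assumes no entry of runTexts is null.
    public static List<int> FindRuns(IReadOnlyList<string> runTexts, string text)
    {
        var hits = new List<int>();
        if (string.IsNullOrEmpty(text)) return hits;

        // Join all run texts and remember where each run starts.
        var joined = new StringBuilder();
        var starts = new int[runTexts.Count];
        for (int i = 0; i < runTexts.Count; i++)
        {
            starts[i] = joined.Length;
            joined.Append(runTexts[i]);
        }

        int matchStart = joined.ToString().IndexOf(text, StringComparison.Ordinal);
        if (matchStart < 0) return hits;
        int matchEnd = matchStart + text.Length; // exclusive

        // A run belongs to the match when its character range overlaps it.
        for (int i = 0; i < runTexts.Count; i++)
        {
            int runEnd = starts[i] + runTexts[i].Length; // exclusive
            if (starts[i] < matchEnd && runEnd > matchStart) hits.Add(i);
        }
        return hits;
    }
}
```

With the runs "Thi", "s Is", " MyTes", "t" and the search text "This Is MyTest", FindRuns returns the indexes 0, 1, 2, 3. Searching the joined string avoids the restart-on-mismatch bookkeeping of a letter-by-letter loop; the indexes can then be mapped back to the WordParagraph objects.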

Apply bold formatting to specific text on Powerpoint textbox programmatically

I have a code that iterates through all the shapes in a Powerpoint presentation (single slide), finds the one that is a textbox and checks whether it is the one I want to replace the text with (and does so if it is, obviously).
All that is working fine, but I want to set the text bold in 2 parts of the text: the name of the person and the name of the course (it's a diploma). I have tried adjusting the ideas/code from this answer, but to no success.
Could anybody help me?
Below is the code I have:
Presentation certificadoCOM = powerpointApp.Presentations.Open(@"C:\Users\oru1ca\Desktop\certCOM.pptx");
// iterates through all shapes
foreach (Shape shape in certificadoCOM.Application.ActivePresentation.Slides.Range().Shapes)
{
    // gets the name of the shape and checks whether it is a textbox
    string shapeName = shape.Name;
    if (shapeName.StartsWith("Text Box"))
    {
        // gets the text from the shape, and if it's the one to change, replaces the text
        string shapeText = shape.TextFrame.TextRange.Text;
        if (shapeText.StartsWith("Concedemos"))
        {
            shape.TextFrame.TextRange.Text = "Concedemos à Sra. " + nomeP[i] + ",\n representando [...]";
        }
    }
}
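TextRange has methods that return a sub-range of the text within the TextFrame.
For example, .Words(Start, Length) returns a range of words (runs of characters separated by spaces), which you can then apply styles to (in this case, .Font.Bold). Note that .Words(3) on its own returns only the third word, and that in the interop assemblies Font.Bold is an MsoTriState rather than a bool.
Code example:
// Set the first 3 words as bold.
shape.TextFrame.TextRange.Words(1, 3).Font.Bold = MsoTriState.msoTrue;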

Add hyperlink to paragraph using word Interop

I'm trying to add a hyperlink to some XML being inserted in a field of a Word document (using Microsoft.Office.Interop.Word).
The XML being inserted contains multiple paragraphs, each containing some text that should be converted to a hyperlink. The hyperlink text is extracted from the end of the paragraph, after the "Available at " substring.
The following code creates the hyperlink, but the first hyperlink is always applied to all paragraphs. I was expecting the code to create a separate hyperlink for each paragraph being iterated.
My guess is that the paragraph.Range object is pointing to text that is in fact the whole XML inserted as opposed to the text contained within the paragraph. I've also confirmed that the paragraph.Range.Text property returns the correct text for each paragraph so I am completely confused as to what should be expected for the Range property.
Any ideas? Thanks in advance.
if (!string.IsNullOrWhiteSpace(bibliography))
{
    const string linkToken = "Available at ";
    field.Result.InsertXML(bibliography);
    foreach (Paragraph paragraph in field.Result.Paragraphs)
    {
        var paragraphText = paragraph.Range.Text;
        var indexOfLink = paragraphText.IndexOf(linkToken, StringComparison.OrdinalIgnoreCase);
        if (indexOfLink >= 0)
        {
            var linkStart = indexOfLink + linkToken.Length;
            var linkPart = paragraphText.Substring(linkStart);
            Uri uriFound;
            if (Uri.TryCreate(linkPart, UriKind.Absolute, out uriFound))
            {
                object linkAddress = uriFound.ToString();
                paragraph.Range.Hyperlinks.Add(paragraph.Range, ref linkAddress);
            }
        }
    }
}

How can I replace line breaks with nothing/an empty string using the DocX Library?

I need to retain paragraph breaks in a .docx file, but get rid of linebreaks which are often in the wrong place when copying from one file to another (due to different page sizes, and when the font is changed).
Using the DocX Library, I'm trying this:
private void ReplaceLineBreaksWithBoo(string filename)
{
    List<string> lineBreaks;
    using (DocX document = DocX.Load(filename))
    {
        lineBreaks = document.FindUniqueByPattern("\n", System.Text.RegularExpressions.RegexOptions.None);
        if (lineBreaks.Count > 0)
        {
            foreach (string s in lineBreaks)
            {
                document.ReplaceText(s, string.Empty); // <-- or a space?
            }
        }
        document.Save();
    }
}
...but it doesn't work. "\n" is not the right thing to pass, I reckon; I don't know what the first argument to the FindUniqueByPattern() method needs to be. Documentation is nil, and the discussion forum there resembles Bodie, California.
I guess you can't do it using FindUniqueByPattern or FindAll. A newline is not represented by any symbol but is stored as a paragraph with empty text. You can inspect the document's XML representation through the document.Xml property; there you'll see an empty line stored as a single <w:p> element.
Therefore you can search for paragraphs with empty text instead of searching for a newline character:
using (DocX document = DocX.Load(filename))
{
    var emptyLines = document.Paragraphs.Where(o => string.IsNullOrEmpty(o.Text));
    foreach (var paragraph in emptyLines)
    {
        paragraph.Remove(false);
    }
    document.Save();
}

Difference between InnerHTML and InnerText property of ASP.Net controls?

When using ASP.NET controls, for example
<h1 id="header" runat="server">text</h1>
we can change the text of the header through either of two properties, InnerHtml and InnerText. What is the basic difference between the two?
InnerHtml lets you enter HTML code directly; InnerText encodes everything you put in there so that it is taken as plain text.
For example, if you were to enter this in both properties: Hello <b>world</b>
This is what you would get with InnerHtml:
Hello world (with "world" rendered in bold)
That is, exactly the same HTML you entered.
If you use InnerText instead, the browser displays this:
Hello <b>world</b>
And the resulting HTML would be Hello &lt;b&gt;world&lt;/b&gt;
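The encoding step behind InnerText can be tried outside ASP.NET. The following is a small sketch, assuming System.Net.WebUtility as a stand-in for the HttpUtility class used by the Web Forms controls; the class and method names (InnerTextDemo, EncodeForInnerText, DecodeFromInnerHtml) are made up for illustration and are not part of any framework.

```csharp
using System;
using System.Net;

public static class InnerTextDemo
{
    // Mirrors what setting InnerText does: encode markup so it shows literally.
    public static string EncodeForInnerText(string value)
    {
        return WebUtility.HtmlEncode(value);
    }

    // Mirrors what reading InnerText does: decode the stored HTML again.
    public static string DecodeFromInnerHtml(string innerHtml)
    {
        return WebUtility.HtmlDecode(innerHtml);
    }
}
```

For the input Hello <b>world</b>, EncodeForInnerText returns Hello &lt;b&gt;world&lt;/b&gt;, and DecodeFromInnerHtml turns that back into the original string.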
When in doubt, go to the source (or decompile):
In HtmlContainerControl:
public virtual string InnerText
{
    get
    {
        return HttpUtility.HtmlDecode(this.InnerHtml);
    }
    set
    {
        this.InnerHtml = HttpUtility.HtmlEncode(value);
    }
}
public virtual string InnerHtml
{
    get
    {
        if (base.IsLiteralContent())
        {
            return ((LiteralControl)this.Controls[0]).Text;
        }
        if (this.HasControls() && this.Controls.Count == 1 && this.Controls[0] is DataBoundLiteralControl)
        {
            return ((DataBoundLiteralControl)this.Controls[0]).Text;
        }
        if (this.Controls.Count == 0)
        {
            return string.Empty;
        }
        throw new HttpException(SR.GetString("Inner_Content_not_literal", new object[]
        {
            this.ID
        }));
    }
    set
    {
        this.Controls.Clear();
        this.Controls.Add(new LiteralControl(value));
        this.ViewState["innerhtml"] = value;
    }
}
Both properties ultimately use InnerHtml, but setting InnerText HTML encodes the value so that it will be displayed literally in the browser versus interpreted as markup.
Remember that assigning to InnerHtml will not encode the value, and thus any user-driven content should be sanitized prior to assignment.
This also emphasizes how important it is to be mindful of view state (note the last line of InnerHtml's setter; everything ends up in view state whether or not you need it).
InnerHtml lets you insert HTML-formatted text within an HTML container, while InnerText only allows plain text (it HTML-encodes any markup you put in it rather than rendering it).
InnerHtml. http://msdn.microsoft.com/en-us/library/system.web.ui.htmlcontrols.htmlcontainercontrol.innerhtml.aspx
InnerText. http://msdn.microsoft.com/en-us/library/system.web.ui.htmlcontrols.htmlcontainercontrol.innertext.aspx
