How to determine if Aspose.Words bookmark contains nested bookmarks - c#

I'm using the following code to iterate through all of the bookmarks in a Microsoft Word document:
foreach (var bookmark in _document.Range.Bookmarks.Cast<Bookmark>())
{
//code
}
How can I determine if bookmark contains nested bookmarks? I need to execute a separate set of logic based on whether or not a bookmark has other bookmarks within it.

The foreach loop above returns every Bookmark in the document, nested or not. Finding the nested ones is a bit tricky, because Aspose.Words has no direct API for it.
I have written a program to do this. The source code is not tiny, so I am sharing the Visual Studio project on Google Drive here.
Below is a summary:
foreach (var bookmark in wordDoc.Range.Bookmarks.Cast<Aspose.Words.Bookmark>())
{
    Console.WriteLine(bookmark.Name);
    // Get all the nodes between bookmark start and end
    ArrayList extractedNodes = ExtractContent(bookmark.BookmarkStart, bookmark.BookmarkEnd, true);
    for (int i = 0; i < extractedNodes.Count; i++)
    {
        // Skip first and last nodes
        if (i == 0 || i == extractedNodes.Count - 1)
            continue;
        // See if there are any bookmarks in this node
        Node node = (Node)extractedNodes[i];
        if (node.Range.Bookmarks.Count > 0)
            Console.WriteLine("Nested bookmark found");
    }
}
For details of the ExtractContent() method, see http://www.aspose.com/docs/display/wordsnet/Extract+Content+Overview+and+Code
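The containment test itself can be reasoned about without Aspose.Words. The following is a library-free sketch, not Aspose API: it assumes each bookmark has already been reduced to a hypothetical BookmarkSpan (a name plus the positions of its start and end markers in document order), and it reports which spans fully contain another span. How those positions are obtained from a real document is left open here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model (not an Aspose.Words type): a bookmark reduced to its
// name and the positions of its start and end markers in document order.
public sealed class BookmarkSpan
{
    public string Name { get; }
    public int Start { get; }
    public int End { get; }

    public BookmarkSpan(string name, int start, int end)
    {
        Name = name;
        Start = start;
        End = end;
    }
}

public static class BookmarkNesting
{
    // True when 'inner' lies entirely inside 'outer' and is a different object.
    public static bool Contains(BookmarkSpan outer, BookmarkSpan inner) =>
        !ReferenceEquals(outer, inner) &&
        outer.Start <= inner.Start && inner.End <= outer.End;

    // Names of all bookmarks that contain at least one other bookmark.
    public static List<string> FindParents(IReadOnlyList<BookmarkSpan> spans) =>
        spans.Where(outer => spans.Any(inner => Contains(outer, inner)))
             .Select(outer => outer.Name)
             .ToList();
}

public static class Program
{
    public static void Main()
    {
        // Sample positions are made up for illustration.
        var spans = new List<BookmarkSpan>
        {
            new BookmarkSpan("Outer", 0, 100),
            new BookmarkSpan("Inner", 10, 20),
            new BookmarkSpan("Standalone", 150, 160),
        };

        foreach (string name in BookmarkNesting.FindParents(spans))
            Console.WriteLine(name + " contains nested bookmarks");
    }
}
```

Two bookmarks that cover exactly the same range would each be reported as containing the other, so that case may need its own rule.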


How to get list of fonts used within one PDF file and copy them to another?

I'm trying to convert a PDF 1.7 document to a PDF/A-3B one, and currently I need to get all fonts in the source document and copy them into the target (if this is actually the way to do it). So far I have the following:
for (int i = 1; i < source.GetNumberOfPdfObjects(); i++)
{
    var obj = source.GetPdfObject(i);
    if (!obj?.IsDictionary() ?? true)
        continue;
    var dict = obj as PdfDictionary;
    if (dict == null)
        continue;
    if (PdfName.Font.Equals(dict.GetAsName(PdfName.Type)))
    {
        var fontDescriptor = dict.GetAsDictionary(PdfName.FontDescriptor);
        if (fontDescriptor == null)
            continue;
        //What else?
    }
}
But I got stuck trying to get the font.
Is this the way to get the fonts from one doc or is there an easier way? And how does one copy them into the new doc?
To get all the fonts in the document and copy them into the target document, you need the following code:
for (int i = 1; i <= pdfDocument.getNumberOfPdfObjects(); i++) {
    PdfObject object = pdfDocument.getPdfObject(i);
    if (object.isDictionary() && PdfName.Font.equals(((PdfDictionary)object).getAsName(PdfName.Type))) {
        object.copyTo(targetDocument);
    }
}
However, please don't expect the page content to be preserved. This code does only what you asked for: it copies the fonts to the new document. Preserving content and its references to fonts is much more complicated than just copying the fonts.
Also, copying objects from an arbitrary PDF document into a document that you claim is PDF/A-3B-compliant does not make that document compliant. The PDF/A standard imposes many requirements, among them requirements on fonts that your original document does not necessarily fulfil.

Duplication with OpenXML (word document) and ID issues

Is it possible to duplicate a Word document element with OpenXML without running into "duplicate id" issues?
To duplicate, I currently clone the elements inside the body and append the clones to the body. But if any of the elements has an ID, I get errors when I open the document in Word.
Here is an example error from the OpenXML validator:
[60] Description="Attribute 'id' should have unique value. Its current value 'Rectangle 11' duplicates with others."
And here is my code:
Document document = wordDocument.MainDocumentPart.Document;
Body body = document.Body;
IEnumerable<OpenXmlElement> elements = ((Body)body.CloneNode(true)).Elements();
foreach (var element in elements)
{
    OpenXmlElement e = (OpenXmlElement)element.CloneNode(true);
    body.AppendChild(e);
}
You can't just copy elements that have an id; you have to duplicate the parts too (search for OpenXmlPart for more information).
You can do this by combining the AddPart() and GetIdOfPart() functions (accessible from MainDocumentPart).
First approach:
When you have an element with an id, use AddPart(OpenXmlPart part) to add the element's part, and retrieve the newly generated id of the part with GetIdOfPart(OpenXmlPart part).
After that, replace the id in your cloned OpenXmlElement with the new one.
Second approach:
Alternatively, you could do it like this:
1) Check the highest id of the existing parts (and save it).
2) Clone all parts from the start and choose the id yourself (by adding the saved highest id).
3) When you copy each element and find an id, add the saved highest id so that it matches the new part.
I hope one of these approaches helps, but in any case you will need to clone the parts.
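The second idea above (offsetting copied ids by the highest existing id) can be illustrated on plain XML. This is a minimal sketch using System.Xml.Linq rather than the Open XML SDK, and it does not clone any parts. It assumes every "id" attribute holds a positive integer; string ids such as the 'Rectangle 11' from the validator message would need a different renaming scheme. IdRemapper and the sample markup are hypothetical.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class IdRemapper
{
    // Deep-copies every child of 'body' and appends the copies, shifting each
    // "id" attribute in the copies by the highest id already in use.
    // Assumption: all id values are positive integers.
    public static void DuplicateChildren(XElement body)
    {
        // Highest numeric id currently used anywhere under 'body' (0 if none).
        int maxId = body.Descendants()
                        .Attributes("id")
                        .Select(a => int.Parse(a.Value))
                        .DefaultIfEmpty(0)
                        .Max();

        // Snapshot the children first so that appending does not disturb the loop.
        foreach (XElement original in body.Elements().ToList())
        {
            // The XElement copy constructor produces a deep clone.
            var copy = new XElement(original);
            foreach (XAttribute id in copy.DescendantsAndSelf().Attributes("id"))
                id.Value = (int.Parse(id.Value) + maxId).ToString();
            body.Add(copy);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up markup for illustration, not real WordprocessingML.
        XElement body = XElement.Parse(
            "<body><p id=\"1\"/><p><shape id=\"2\"/></p></body>");
        IdRemapper.DuplicateChildren(body);
        Console.WriteLine(body);
    }
}
```

Because the originals use ids 1 to maxId and the copies use maxId + 1 to 2 * maxId, the two ranges cannot collide under the stated assumption.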
DocIO is a .NET class library that can read, write and render Microsoft Word documents. Using DocIO, you can clone the elements such as paragraph, table, text run or the entire document and append it where you need.
The whole suite of controls is available for free (commercial applications also) through the community license program if you qualify. The community license is the full product with no limitations or watermarks.
Here is a simple example that clones all the paragraphs and tables in the document body and appends them to the end of the same document.
using Syncfusion.DocIO.DLS;
namespace DocIO_Clone
{
    class Program
    {
        static void Main(string[] args)
        {
            using (WordDocument document = new WordDocument(@"InputWordFile.docx"))
            {
                // Cache the counts up front, because the loop appends to the last section.
                int sectionCount = document.Sections.Count;
                for (int i = 0; i < sectionCount; i++)
                {
                    IWSection section = document.Sections[i];
                    int entityCount = section.Body.ChildEntities.Count;
                    for (int j = 0; j < entityCount; j++)
                    {
                        IEntity entity = section.Body.ChildEntities[j];
                        switch (entity.EntityType)
                        {
                            case EntityType.Paragraph:
                                IWParagraph paragraph = entity.Clone() as IWParagraph;
                                document.LastSection.Body.ChildEntities.Add(paragraph);
                                break;
                            case EntityType.Table:
                                IWTable table = entity.Clone() as IWTable;
                                document.LastSection.Body.ChildEntities.Add(table);
                                break;
                        }
                    }
                }
                document.Save("ResultDocument.docx");
            }
        }
    }
}
For further information, please refer to our help documentation.
Note: I work for Syncfusion

How to write a Numbered List of Text in a specific location?

I need to write an array of strings as a numbered list, but in a specific location of a document.
For example, the array is:
sentence[0] : Jonathan Spielberg
sentence[1] : Stephanie Black
sentence[2] : Marcus Smith
sentence[3] : Kylie Ashton
...
Then it should be written in a specific location, let's say under the section heading "A. Candidate's Name"
A. Candidate's Name
1. Jonathan Spielberg
2. Stephanie Black
3. Marcus Smith
4. Kylie Ashton
My approach so far is to use a unique tag, which is then replaced by the array items written out as a numbered list. Let's say the unique tag is ######CANDIDATESNAME#####. I've tried it that way, but it doesn't work.
How should I code this?
P.S.: I have a template document (.doc/.docx) that contains only the section headings, so I just need to fill it with the numbered list.
I would suggest the following solution.
1) Implement the IReplacingCallback interface.
2) Use the Range.Replace method to find the unique tag.
3) Move the cursor to the text (unique tag) and insert the numbered list.
Please read the "Find and Replace" documentation and use the following code to insert a numbered list at the position of the unique tag.
string[] list = new string[] { "Jonathan Spielberg", "Stephanie Black", "Marcus Smith", "Kylie Ashton" };
Document mainDoc = new Document(MyDir + "in.docx");
mainDoc.Range.Replace(new Regex("######CANDIDATESNAME#####"), new FindandInsertList(list), false);
mainDoc.Save(MyDir + " Out.docx");
//--------------------------------------
public class FindandInsertList : IReplacingCallback
{
    private string[] listitems;

    public FindandInsertList(string[] list)
    {
        listitems = list;
    }

    ReplaceAction IReplacingCallback.Replacing(ReplacingArgs e)
    {
        // This is a Run node that contains either the beginning or the complete match.
        Node currentNode = e.MatchNode;
        // The first (and maybe the only) run can contain text before the match;
        // in this case it is necessary to split the run.
        if (e.MatchOffset > 0)
            currentNode = SplitRun((Run)currentNode, e.MatchOffset);
        // This list stores all nodes of the match so they can be removed later.
        ArrayList runs = new ArrayList();
        // Find all runs that contain parts of the match string.
        int remainingLength = e.Match.Value.Length;
        while (
            (remainingLength > 0) &&
            (currentNode != null) &&
            (currentNode.GetText().Length <= remainingLength))
        {
            runs.Add(currentNode);
            remainingLength = remainingLength - currentNode.GetText().Length;
            // Select the next Run node.
            // Have to loop because there could be other nodes such as BookmarkStart etc.
            do
            {
                currentNode = currentNode.NextSibling;
            }
            while ((currentNode != null) && (currentNode.NodeType != NodeType.Run));
        }
        // Split the last run that contains the match if there is any text left.
        if ((currentNode != null) && (remainingLength > 0))
        {
            SplitRun((Run)currentNode, remainingLength);
            runs.Add(currentNode);
        }
        // Create a DocumentBuilder and move it to the last run of the match.
        DocumentBuilder builder = new DocumentBuilder(e.MatchNode.Document as Document);
        builder.MoveTo((Run)runs[runs.Count - 1]);
        builder.ListFormat.List = e.MatchNode.Document.Lists.Add(ListTemplate.NumberDefault);
        foreach (string item in listitems)
        {
            builder.Writeln(item);
        }
        // End the numbered list.
        builder.ListFormat.RemoveNumbers();
        // Now remove all runs in the sequence.
        foreach (Run run in runs)
            run.Remove();
        // Tell the replace engine to do nothing, because the work is already done.
        return ReplaceAction.Skip;
    }

    // Splits 'run' at 'position' and returns the new run holding the text after the split.
    private static Run SplitRun(Run run, int position)
    {
        Run afterRun = (Run)run.Clone(true);
        afterRun.Text = run.Text.Substring(position);
        run.Text = run.Text.Substring(0, position);
        run.ParentNode.InsertAfter(afterRun, run);
        return afterRun;
    }
}
I work with Aspose as a Developer Evangelist.

Migradoc & nested paragraphs

I am using Ben Foster's Migradoc extensions to format a PDF document using Markdown syntax.
I am running into an issue when using headers or sub-lists (<hx> or <li> elements) within a list (a null reference exception is thrown). The issue is detailed here.
The root cause of the problem is that Migradoc does not support nested paragraphs.
Are there any possible workarounds to this issue?
You ask "Are there any possible workarounds to this issue?"
MigraDoc is able to create PDF and RTF. Does RTF (Word) support nested paragraphs?
Probably not. I think this is not a MigraDoc issue.
Nested lists are possible in MigraDoc, but may require changes in the extensions. IIRC there are limitations with respect to nesting when numbered lists are involved.
IMHO nested paragraphs do not make sense. MigraDoc supports AddFormattedText, which lets you use different formats within a single paragraph. This may require changes to the extensions and/or the input given to the extensions.
Hey I've been using Ben Foster's Migradoc extensions as well and had this same problem. This may not be perfect, but it worked well enough for me... Modify your HtmlConverter.cs and do the following:
First, add a global variable:
private int _nestedListLevel;
Next, add 2 new node handlers to the AddDefaultNodeHandlers() method:
nodeHandlers.Add("ul", (node, parent) =>
{
    if (parent is Paragraph)
    {
        _nestedListLevel++;
        return parent.Section;
    }
    _nestedListLevel = 0;
    return parent;
});
nodeHandlers.Add("ol", (node, parent) =>
{
    if (parent is Paragraph)
    {
        _nestedListLevel++;
        return parent.Section;
    }
    _nestedListLevel = 0;
    return parent;
});
Finally, change the "li" node handler to the following... NOTE, this removes some of the styling work that he did, but it made things less complicated for me and works just fine.. you can re-add that stuff if you want.
nodeHandlers.Add("li", (node, parent) =>
{
    var listStyle = node.ParentNode.Name == "ul"
        ? "UnorderedList"
        : "OrderedList";
    var section = (Section)parent;
    var isFirst = node.ParentNode.Elements("li").First() == node;
    var isLast = node.ParentNode.Elements("li").Last() == node;
    var listItem = section.AddParagraph().SetStyle(listStyle);
    if (listStyle == "UnorderedList")
    {
        listItem.Format.ListInfo.ListType = _nestedListLevel % 2 == 1 ? ListType.BulletList2 : ListType.BulletList1;
    }
    else
    {
        listItem.Format.ListInfo.ListType = _nestedListLevel % 2 == 1 ? ListType.NumberList2 : ListType.NumberList1;
    }
    if (_nestedListLevel > 0)
    {
        listItem.Format.LeftIndent = String.Format(CultureInfo.InvariantCulture, "{0}in", _nestedListLevel * .75);
    }
    // disable continuation if this is the first list item
    listItem.Format.ListInfo.ContinuePreviousList = !isFirst;
    if (isLast)
        _nestedListLevel--;
    return listItem;
});

Can't resolve infinite recursion

I've recently started working on populating a TreeView control from a list, and this is where I am starting:
http://msmvps.com/blogs/deborahk/archive/2009/11/09/populating-a-treeview-control-from-a-list.aspx
I am currently retrieving registry paths and storing them in a list. From that list, I add them as parent nodes to the TreeView control. Then I traverse every level of the registry and add each level as nodes, eventually creating a tree-like replica of it. I retrieve the registry information through a separate process, and I shy away from using the Registry API within C#. I incorporated all of the code from the link above into my current code, and now I get an infinite recursion error every time I run it.
This is the current code I am working on.
private void registry()
{
    string hivelist_output = process.StandardOutput.ReadToEnd();
    string[] hivelist_lines = Regex.Split(hivelist_output, "\r\n");
    string[] registry_paths = new string[hivelist_lines.Length];
    List<string> registry_path_list = new List<string>();
    for (i = 0; i < hivelist_lines.Length; i++)
    {
        registry_paths[i] = hivelist_lines[i].Substring(22);
        registry_path_list.Add(registry_paths[i].Trim());
    }
    for (i = 0; i < registry_path_list.Count; i++)
    {
        treeViewList.Add(new TreeViewItem()
        {
            ParentID = 0,
            ID = i,
            Text = registry_path_list[i]
        });
    }
    PopulateTreeView(0, null);
    treeView1.ExpandAll();
}
private void PopulateTreeView(int parentId, TreeNode parentNode)
{
    var filteredItems = treeViewList.Where(item => item.ParentID == parentId);
    TreeNode childNode = new TreeNode();
    foreach (var i in filteredItems.ToList())
    {
        if (parentNode == null)
            childNode = treeView1.Nodes.Add(i.Text);
        else
            childNode = parentNode.Nodes.Add(i.Text);
        PopulateTreeView(i.ID, childNode);
    }
}
The infinite recursion error points to this line of code, and I don't understand why.
childNode = parentNode.Nodes.Add(i.Text);
I appreciate your great advice on this matter. Thanks in advance!
-------------------------------------SOLVED-------------------------------------------
Moving forward, my next problem is to add the subkeys below each parent node (registry path), then the subkeys below each of those subkeys, along with the key values and data types. Since the registry is quite deep, how can I plot all possible values without using too many nested loops? Thanks!
*Structure of what I would like to achieve:
registry path
- subkey 1
  - subkeys of subkey 1
    - subkeys of subkey 1.1
      - data types & values
- subkey 2
  .
  .
  .
- subkey 3
  .
  .
  .
- subkey n
  - data types & values
and so on
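You are always adding TreeViewItems with ID = 0, which is the same value as ParentID. PopulateTreeView(0, null) therefore finds items whose own ID is 0 and calls itself with parent ID 0 again, without end.
This is the part of your setup method that you need to fix:
for (i = 0; i < registry_path_list.Count; i++)
{
    treeViewList.Add(new TreeViewItem()
    {
        ParentID = 0,
        ID = 0,
        Text = registry_path_list[i]
    });
}
Give each item its own unique ID there, one that differs from its ParentID, and use the corresponding IDs when adding children. Since 0 already serves as the parent ID of the top-level items, start the IDs at 1.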
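To make the termination condition concrete, here is a minimal console sketch of the same recursion with IDs that start at 1. SimpleNode is a hypothetical stand-in for the WinForms TreeNode so that the example runs without a UI, and the sample paths are made up. Because children are found through ParentID, one recursive method handles subkeys at any depth without nested loops.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Flat item as in the question; ParentID 0 marks a top-level entry.
public class TreeViewItem
{
    public int ParentID { get; set; }
    public int ID { get; set; }
    public string Text { get; set; } = "";
}

// Hypothetical stand-in for System.Windows.Forms.TreeNode.
public class SimpleNode
{
    public string Text { get; }
    public List<SimpleNode> Nodes { get; } = new List<SimpleNode>();
    public SimpleNode(string text) { Text = text; }
}

public static class TreeBuilder
{
    // Attaches every item whose ParentID equals parentId under 'parent',
    // then recurses into that item. This terminates as long as no item is
    // its own ancestor, which in particular requires ID != ParentID.
    public static void Populate(List<TreeViewItem> items, int parentId, SimpleNode parent)
    {
        foreach (TreeViewItem item in items.Where(x => x.ParentID == parentId).ToList())
        {
            var child = new SimpleNode(item.Text);
            parent.Nodes.Add(child);
            Populate(items, item.ID, child);
        }
    }

    // Writes the tree with two spaces of indentation per level.
    public static void Print(SimpleNode node, int depth)
    {
        Console.WriteLine(new string(' ', depth * 2) + node.Text);
        foreach (SimpleNode child in node.Nodes)
            Print(child, depth + 1);
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up sample data; IDs start at 1 because 0 is the root parent ID.
        var items = new List<TreeViewItem>
        {
            new TreeViewItem { ParentID = 0, ID = 1, Text = "path A" },
            new TreeViewItem { ParentID = 0, ID = 2, Text = "path B" },
            new TreeViewItem { ParentID = 1, ID = 3, Text = "subkey of path A" },
            new TreeViewItem { ParentID = 3, ID = 4, Text = "subkey of subkey" },
        };

        var root = new SimpleNode("root");
        TreeBuilder.Populate(items, 0, root);
        TreeBuilder.Print(root, 0);
    }
}
```

With ID = 0 on any item whose ParentID is also 0, the call Populate(items, 0, ...) would select that item and recurse with 0 again, which is the infinite recursion described above.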
