Reading specific text from XML files

I have created a small XML tool that counts the occurrences of a specific XML tag across multiple XML files.
The code for this is as follows:
public void SearchMultipleTags()
{
if (txtSearchTag.Text != "")
{
try
{
//string str = null;
//XmlNodeList nodelist;
string folderPath = textBox2.Text;
DirectoryInfo di = new DirectoryInfo(folderPath);
FileInfo[] rgFiles = di.GetFiles("*.xml");
foreach (FileInfo fi in rgFiles)
{
int i = 0;
XmlDocument xmldoc = new XmlDocument();
xmldoc.Load(fi.FullName);
//rtbox2.Text = fi.FullName.ToString();
foreach (XmlNode node in xmldoc.GetElementsByTagName(txtSearchTag.Text))
{
i = i + 1;
//
}
if (i > 0)
{
rtbox2.Text += DateTime.Now + "\n" + fi.FullName + " \nInstance: " + i.ToString() + "\n\n";
}
else
{
//MessageBox.Show("No Markup Found.");
}
//rtbox2.Text += fi.FullName + "instances: " + str.ToString();
}
}
catch (Exception)
{
MessageBox.Show("Invalid Path or Empty File name field.");
}
}
else
{
MessageBox.Show("Dont leave field blanks.");
}
}
This code returns the count of the tag the user asks for in each of the XML files.
Now I want to search in the same way for a particular text and count how often it occurs in the XML files.
Can you suggest code for this using the XML classes?
Thanks and Regards,
Mayur Alaspure

Use LINQ to XML instead. It is simple and a complete replacement for the other XML APIs.
XElement doc = XElement.Load(fi.FullName);
// count of specific XML tags
int xmlTagCount = doc.Descendants(txtSearchTag.Text).Count();
// count of elements whose value equals the search text
int particularTextCount = doc.Descendants().Count(x => x.Value == "text2search");

Use System.Xml.XPath.
XPath supports counting: count(//nodeName)
If you want to count nodes with specific text, try count(//*[text()='Hello'])
See How to get count number of SelectedNode with XPath in C#?
By the way, your function should probably look something more like this:
private int SearchMultipleTags(string searchTerm, string folderPath) { ...
//...
return i;
}
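
As a minimal sketch of the count() suggestion above, the class below evaluates the XPath expression through an XPathNavigator and sums the result over a folder. The class name XPathCounter and the helper CountTags are made up for illustration, and the tag name is assumed to be a plain element name without a namespace prefix.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;

public static class XPathCounter
{
    // Counts elements named tagName anywhere in one XML document.
    public static int CountTags(XmlDocument doc, string tagName)
    {
        XPathNavigator navigator = doc.CreateNavigator();
        // Evaluate returns a boxed double for numeric XPath results such as count().
        object result = navigator.Evaluate("count(//" + tagName + ")");
        return Convert.ToInt32(result);
    }

    // Sums the count over every *.xml file in folderPath and returns it,
    // instead of writing to UI controls inside the search method.
    public static int SearchMultipleTags(string searchTerm, string folderPath)
    {
        int total = 0;
        foreach (string file in Directory.GetFiles(folderPath, "*.xml"))
        {
            var doc = new XmlDocument();
            doc.Load(file);
            total += CountTags(doc, searchTerm);
        }
        return total;
    }
}
```

Returning the number keeps the search logic separate from the form, so the caller can decide how to display it.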

Try using XPath:
//var document = new XmlDocument();
int count = 0;
var nodes = document.SelectNodes(String.Format(@"//*[text()='{0}']", searchTxt));
if (nodes != null)
count = nodes.Count;
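
One caveat with building the XPath through String.Format: if the search text contains an apostrophe, the resulting expression is no longer valid. A possible way around this, sketched here with LINQ to XML, is to compare the text nodes in code. The name TextCounter is made up for illustration, and the comparison is assumed to be exact and case-sensitive, like text()='...' in XPath.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class TextCounter
{
    // Counts elements that have a direct text node equal to searchText.
    // This mirrors //*[text()='...'] without embedding the text in an XPath string.
    public static int CountElementsWithText(XDocument doc, string searchText)
    {
        return doc.Descendants()
                  .Count(e => e.Nodes().OfType<XText>().Any(t => t.Value == searchText));
    }
}
```

Inside the file loop from the question this could be called as TextCounter.CountElementsWithText(XDocument.Load(fi.FullName), searchTxt).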

Related

Get only child nodes of a parent node

I am working with the HTML Agility Pack. The basics work fine, but when I try to get the child nodes of one part of the document, I get every node with the class 'dealer-offer', regardless of which parent node it is under.
Here is the code that I use for it:
private void getListOfDiv(string html, string classname)
{
if (html != null)
{
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);
var divProduktkategorie = doc.DocumentNode.SelectSingleNode("//div[@class='" + classname + "']");
//this.txtHtmlCode.Text = divProduktkategorie.InnerHtml;
//return;
int i = 1;
foreach( var divAngebote in divProduktkategorie.SelectNodes("//div[@class='dealer-offer']"))
{
this.listBox1.Items.Add(i + ": " + classname);
this.txtHtmlCode.AppendText(divAngebote.OuterHtml);
i++;
}
}
}
When I write divProduktkategorie to the output field, I get only the 3 items that sit under this single node, but when I run the loop, I get every node with the class 'dealer-offer' and not only those 3 items.
Where is my mistake? I could not find it myself.
Thanks for helping.

Select the 3 nodes directly with the correct path and then loop over them with foreach. Don't search for them again starting from the divProduktkategorie reference.
private void getListOfDiv(string html, string classname)
{
if (html != null)
{
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);
var divProduktkategorie = doc.DocumentNode.SelectNodes("//div[@class='" + classname + "']//div[@class='dealer-offer']");
//this.txtHtmlCode.Text = divProduktkategorie.InnerHtml;
//return;
int i = 1;
foreach( var divAngebote in divProduktkategorie)
{
this.listBox1.Items.Add(i + ": " + classname);
this.txtHtmlCode.AppendText(divAngebote.OuterHtml);
i++;
}
}
}
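
The reason the original loop returned every 'dealer-offer' node is that an XPath starting with // is evaluated from the document root, even when it is called on a child node. A path starting with .// stays below the context node. The sketch below shows the difference with System.Xml's XmlDocument rather than the HTML Agility Pack, so that it runs without extra packages. The sample markup and the name RelativeXPathDemo are made up for illustration.

```csharp
using System.Xml;

public static class RelativeXPathDemo
{
    // Two parent divs; only the first contains two offers.
    public const string Sample =
        "<root>" +
        "<div class='a'><div class='dealer-offer'>1</div><div class='dealer-offer'>2</div></div>" +
        "<div class='b'><div class='dealer-offer'>3</div></div>" +
        "</root>";

    // Selects the parent div by class, then counts the nodes that offerXPath
    // returns when evaluated with that parent as the context node.
    public static int CountOffers(string xml, string parentClass, string offerXPath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlNode parent = doc.SelectSingleNode("//div[@class='" + parentClass + "']");
        return parent.SelectNodes(offerXPath).Count;
    }
}
```

With this sample, "//div[@class='dealer-offer']" finds all three offers, while ".//div[@class='dealer-offer']" finds only the two below the selected parent.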

How to access and replace text in certain paragraphs using OPENXML powertools case by case

I am trying to redact some Word files using C# and Open XML. I need to do a controlled replacement of numbers with a certain phrase. Each Word file contains a different amount of information. I want to use the Open XML PowerTools for this purpose.
I used the normal Open XML method to replace, but it is very unreliable and produces random errors such as a zero-length error. I used a regex replace and that seems to work, but it replaces throughout the whole document, which is highly undesirable.
Here is a snippet of the code:
private void redact_Replaceall(string wfile)
{
try
{
using (WordprocessingDocument doc = WordprocessingDocument.Open(wfile, true))
{
var ydoc = doc.MainDocumentPart.GetXDocument();
IEnumerable<XElement> content = ydoc.Descendants(W.body);
Regex regex = new Regex(@"\d+\.\d{2,3}");
int count1 = OpenXmlPowerTools.OpenXmlRegex.Match(content, regex);
int count2 = OpenXmlPowerTools.OpenXmlRegex.Replace(content, regex, replace_text, null);
statusBar1.Text = "Try 1: Found: " + count1 + ", Replaced: " + count2;
doc.MainDocumentPart.PutXDocument();
}
}
catch(Exception e)
{
MessageBox.Show("Replace all experienced error: " + e.Message);
}
}
Basically, I want to do this redaction based on the content of each paragraph. I am able to get the paragraphs with the following line, but not their ids:
IEnumerable<XElement> content = ydoc.Descendants(W.p);
Here is my approach using the normal Open XML method, but I get a lot of errors depending on the file.
foreach (DocumentFormat.OpenXml.Wordprocessing.Paragraph para in bod.Descendants<DocumentFormat.OpenXml.Wordprocessing.Paragraph>())
{
foreach (var run in para.Elements<Run>())
{
foreach (var text in run.Elements<Text>())
{
string temp = text.Text;
int firstlength = first.Length + 1;
int secondlength = second.Length + 1;
if (text.Text.Contains(first) && !(temp.Length > firstlength))
{
text.Text = text.Text.Replace(first, "DELETED");
}
if (text.Text.Contains(second) && !(temp.Length > secondlength))
{
text.Text = text.Text.Replace(second, "DELETED");
}
}
}
}
Here is my latest approach, but I am stuck on it:
private void redact_Replacebadones(string wfile)
{
try
{
using (WordprocessingDocument doc = WordprocessingDocument.Open(wfile, true))
{
var ydoc = doc.MainDocumentPart.GetXDocument();
/* from XElement xele in ydoc.Root.Elements();
List<string> lhsElements = xele.Elements("lhs")
.Select(el => el.Attribute("id").Value)
.ToList();
*/
/// XElement
IEnumerable<XElement> content = ydoc.Descendants(W.p);
foreach (var p in content )
{
if (p.Value.Contains("each") && !p.Value.Contains("DELETED"))
{
string to_overwrite = p.Value;
Regex regexop = new Regex(@"\d+\.\d{2,3}");
regexop.Replace(to_overwrite, "Deleted");
p.SetValue(to_overwrite);
MessageBox.Show("NAME :" + p.GetParagraphInfo() +" VValue:"+to_overwrite);
}
}
doc.MainDocumentPart.PutXDocument();
}
}
catch (Exception e)
{
MessageBox.Show("Replace each experienced error: " + e.Message);
}
}

This may be a bit late. The Open XML PowerTools by Eric White have a function SearchAndReplace that replaces text content, so you don't have to handle it with a regex.
This function also handles text that is split across runs. (When you edit a document in Word, a single word can be split into several runs, so you don't find the search phrase directly.)
Maybe this helps somebody.
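
Regarding the last approach in the question: .NET strings are immutable, so Regex.Replace returns a new string and leaves its input untouched. In that loop the return value is discarded, so to_overwrite is written back unchanged. The sketch below shows the per-paragraph decision on plain strings only. The name RedactDemo and the sample condition are assumptions for illustration. Note also that XElement.SetValue on a paragraph element replaces all of its child runs with plain text, which is probably not what a Word document needs.

```csharp
using System.Text.RegularExpressions;

public static class RedactDemo
{
    // Same pattern as in the question: digits, a dot, then two or three digits.
    private static readonly Regex Price = new Regex(@"\d+\.\d{2,3}");

    // Redacts numbers only in paragraphs that mention "each"
    // and have not been redacted already.
    public static string Redact(string paragraphText)
    {
        if (paragraphText.Contains("each") && !paragraphText.Contains("DELETED"))
        {
            // Replace returns the modified copy; the result has to be used.
            return Price.Replace(paragraphText, "DELETED");
        }
        return paragraphText;
    }
}
```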

Reading the string from a txt file and converting it to elements in xml using c#

I have a string like this { {Name Mike} {age 19} {gender male}} in a txt file.
I would like to convert it to XML with the output below. As I am new to this, I am not sure how to do it.
<name>Mike</name>
<age>19</age>
<gender>male</gender>
Any help would be appreciated.

Here is my solution. First you have to create an XML file; in my case I created x.xml in my bin folder. The file must contain a root element, as in the sample XML below. The root element name can be anything; I have simply used root.
<root>
</root>
Then the code for writing your string is as below:
string s = "{{Name Mike} {age 19} {gender male}}";
string[] s2 = s.Replace("{", "").Replace("}", "").Split(' ');
for (int i = 0; i < s2.Length; i++)
{
XDocument doc = XDocument.Load("x.xml");
XElement rt = doc.Element("root");
XElement elm = rt.Element(s2[i]);
if (elm != null)
{
elm.SetValue(s2[i + 1]);
}
else
{
XElement x = new XElement(s2[i], s2[i + 1]);
rt.Add(x);
}
doc.Save("x.xml");
i++;
}
I hope this solves your problem.
Update
If you want to automate the file creation instead of creating the XML file by hand, you can do it this way:
string s = "{{Name Mike} {age 19} {gender male}}";
string[] s2 = s.Replace("{", "").Replace("}", "").Split(' ');
if (!File.Exists("x.xml"))
{
TextWriter tw = new StreamWriter("x.xml", true);
tw.WriteLine("<root>\n</root>");
tw.Close();
}
for (int i = 0; i < s2.Length; i++)
{
XDocument doc = XDocument.Load("x.xml");
XElement rt = doc.Element("root");
XElement elm = rt.Element(s2[i]);
if (elm != null)
{
elm.SetValue(s2[i + 1]);
}
else
{
XElement x = new XElement(s2[i], s2[i + 1]);
rt.Add(x);
}
doc.Save("x.xml");
i++;
}
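
As an alternative sketch, the pairs can be pulled out with a regular expression and the XML built in memory, so the file is not reloaded and saved on every iteration. The name BraceParser is made up for illustration. The sketch assumes each inner group has the form {name value}, that names are valid XML element names, and that lower-case element names are wanted, as in the desired output.

```csharp
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class BraceParser
{
    // Turns "{ {Name Mike} {age 19} }" into <root><name>Mike</name><age>19</age></root>.
    public static XElement ToXml(string input)
    {
        var root = new XElement("root");
        // Group 1 is the first word inside the braces, group 2 is the rest of the group.
        foreach (Match m in Regex.Matches(input, @"\{\s*(\w+)\s+([^{}]*?)\s*\}"))
        {
            string name = m.Groups[1].Value.ToLowerInvariant();
            root.Add(new XElement(name, m.Groups[2].Value));
        }
        return root;
    }
}
```

The result can then be written once, for example with BraceParser.ToXml(text).Save("x.xml"). Because the value is everything after the first word, values that contain spaces are kept whole.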

How To Remove Last Node In XML? C#

I am trying to remove the last node from an XML file, but cannot find any good answers for doing this. Here is my code:
XmlReader x = XmlReader.Create(this.PathToSpecialFolder + @"\" + Application.CompanyName + @"\" + Application.ProductName + @"\Recent.xml");
int c = 0;
while (x.Read())
{
if (x.NodeType == XmlNodeType.Element && x.Name == "Path")
{
c++;
if (c <= 10)
{
MenuItem m = new MenuItem() { Header = x.ReadInnerXml() };
m.Click += delegate
{
};
openRecentMenuItem.Items.Add(m);
}
}
}
x.Close();
My XML node structure is as follows...
<RecentFiles>
<File>
<Path>Text Path</Path>
</File>
</RecentFiles>
In my situation, there will be ten nodes maximum, and each time a new one is added, the last must be removed.

You can try this
XmlDocument doc = new XmlDocument();
doc.Load(fileName);
XmlNodeList nodes = doc.SelectNodes("/RecentFiles/File");
nodes[nodes.Count - 1].ParentNode.RemoveChild(nodes[nodes.Count - 1]);
doc.Save(fileName);

It sounds like you want something like:
var doc = XDocument.Load(path);
var lastFile = doc.Descendants("File").LastOrDefault();
if (lastFile != null)
{
lastFile.Remove();
}
// Now save doc or whatever you want to do with it...
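
Building on that, a minimal sketch of the "ten entries maximum" rule from the question could add the new entry and trim the list in one step. The name RecentFiles and the max parameter are made up for illustration, and the sketch assumes the newest entry goes first so that the oldest entries are at the end.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class RecentFiles
{
    // Inserts a new <File><Path>...</Path></File> entry at the top
    // and removes every entry beyond the first max entries.
    public static void AddRecent(XDocument doc, string path, int max = 10)
    {
        doc.Root.AddFirst(new XElement("File", new XElement("Path", path)));
        // ToList takes a snapshot so elements can be removed while looping.
        foreach (XElement extra in doc.Root.Elements("File").Skip(max).ToList())
        {
            extra.Remove();
        }
    }
}
```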

Build XML file from XPathExpressions

I have a bunch of XPathExpressions that I used to read an XML file. I now need to go the other way. (Generate an XML file based on the values I have.)
Here is an example to illustrate. Say I have a bunch of code like this:
XPathExpression hl7Expr1 = navigator.Compile("/ORM_O01/MSH/MSH.6/HD.1");
var hl7Expr2 = navigator.Compile("/ORM_O01/ORM_O01.PATIENT/PID/PID.18/CX.1");
var hl7Expr3 = navigator.Compile("/ORM_O01/ORM_O01.PATIENT/ORM_O01.PATIENT_VISIT/PV1/PV1.19/CX.1");
var hl7Expr4 = navigator.Compile("/ORM_O01/ORM_O01.PATIENT/PID/PID.3[1]/CX.1");
var hl7Expr5 = navigator.Compile("/ORM_O01/ORM_O01.PATIENT/PID/PID.5[1]/XPN.1/FN.1");
var hl7Expr6 = navigator.Compile("/ORM_O01/ORM_O01.PATIENT/PID/PID.5[1]/XPN.2");
string hl7Value1 = "SomeValue1";
string hl7Value2 = "SomeValue2";
string hl7Value3 = "SomeValue3";
string hl7Value4 = "SomeValue4";
string hl7Value5 = "SomeValue5";
string hl7Value6 = "SomeValue6";
Is there a way to take the hl7Expr XPathExpressions and generate an XML file with the corresponding hl7Value string in it?
Or maybe just use the actual path string to do the generation (instead of using the XPathExpression object)?
Note: I saw this question: Create XML Nodes based on XPath? but the answer does not allow for [1] references like I have on hl7Expr4.

I found this answer: https://stackoverflow.com/a/3465832/16241
And I was able to modify the main method to convert the [1] to attributes (like this):
public static XmlNode CreateXPath(XmlDocument doc, string xpath)
{
XmlNode node = doc;
foreach (string part in xpath.Substring(1).Split('/'))
{
XmlNodeList nodes = node.SelectNodes(part);
if (nodes.Count > 1) throw new ApplicationException("Xpath '" + xpath + "' matched more than one node!");
else if (nodes.Count == 1) { node = nodes[0]; continue; }
if (part.StartsWith("@"))
{
var anode = doc.CreateAttribute(part.Substring(1));
node.Attributes.Append(anode);
node = anode;
}
else
{
string elName, attrib = null;
if (part.Contains("["))
{
part.SplitOnce("[", out elName, out attrib);
if (!attrib.EndsWith("]")) throw new ApplicationException("Unsupported XPath (missing ]): " + part);
attrib = attrib.Substring(0, attrib.Length - 1);
}
else elName = part;
XmlNode next = doc.CreateElement(elName);
node.AppendChild(next);
node = next;
if (attrib != null)
{
if (!attrib.StartsWith("@"))
{
attrib = " Id='" + attrib + "'";
}
string name, value;
attrib.Substring(1).SplitOnce("='", out name, out value);
if (string.IsNullOrEmpty(value) || !value.EndsWith("'")) throw new ApplicationException("Unsupported XPath attrib: " + part);
value = value.Substring(0, value.Length - 1);
var anode = doc.CreateAttribute(name);
anode.Value = value;
node.Attributes.Append(anode);
}
}
}
return node;
}
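
SplitOnce is not part of the .NET base class library; the method above relies on a helper that comes with the linked answer. In case it is missing, a minimal sketch of such an extension method is shown here. The exact signature and the class name StringExtensions are assumptions, chosen so that the calls above compile: it splits at the first occurrence of the separator and returns both parts through out parameters.

```csharp
using System;

public static class StringExtensions
{
    // Splits value at the first occurrence of separator.
    // Returns false and leaves 'after' null when the separator is not found.
    public static bool SplitOnce(this string value, string separator, out string before, out string after)
    {
        int index = value.IndexOf(separator, StringComparison.Ordinal);
        if (index < 0)
        {
            before = value;
            after = null;
            return false;
        }
        before = value.Substring(0, index);
        after = value.Substring(index + separator.Length);
        return true;
    }
}
```

With this helper, a path part such as PID.3[1] is split into the element name PID.3 and the remainder 1], which the method above then turns into an Id attribute.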
