<?xml version="1.0" encoding="UTF-8"?>
<Envelope xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Message>
<MessageID>1</MessageID>
<Product>
<SKU>33333-01</SKU>
</Product>
</Message>
</Envelope>
I've tried googling but whether I'm just not providing the correct search criteria I don't know.
I want to be able to search the XML file based on the MessageID and then grab the SKU.
I then want to search another XML file based on the SKU and remove that message completely.
<?xml version="1.0" encoding="UTF-8"?>
<Envelope xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Message>
<MessageID>1</MessageID>
<Inventory>
<SKU>33333-01</SKU>
<Quantity>1</Quantity>
</Inventory>
</Message>
<Message>
<MessageID>2</MessageID>
<Inventory>
<SKU>22222-01</SKU>
<Quantity>1</Quantity>
</Inventory>
</Message>
</Envelope>
Meaning the XML above becomes:
<?xml version="1.0" encoding="UTF-8"?>
<Envelope xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Message>
<MessageID>2</MessageID>
<Inventory>
<SKU>22222-01</SKU>
<Quantity>1</Quantity>
</Inventory>
</Message>
</Envelope>
To confirm I cannot confirm that the MessageID will be the same over different XML files.
Thanks in advance for any help.
My questions:
How do I search through XML files?
How do I then grab another Nodes details
Can I remove a complete from an XML file based on a search?
You can use XmlDocument to load your XML document. Then, you can use XPath for searching any nodes.
XmlDocument document = new XmlDocument();
document.Load("C:\fileOnTheDisk.xml");
// or
document.LoadXml("<a>someXmlString</a>");
// Returns single element or null if not found
var singleNode = document.SelectSingleNode("Envelope/Message[MessageID = '1']");
// Returns a NodeList
var nodesList = document.SelectNodes("Envelope/Message[MessageID = '1']");
Read more about XPath at w3schools.com.
Here is a good XPath Tester.
For example, you can use the following XPath to find nodes in your document by ID:
XmlDocument document = new XmlDocument();
document.Load("C:\doc.xml");
var node = document.SelectSingleNode("Envelope/Message[MessageID = '1']");
var sku = node.SelectSingleNode("Inventory/SKU").InnerText;
Console.WriteLine("{0} node has SKU = {1}", 1, sku);
Or you can output all SKUs:
foreach (XmlNode node in document.SelectNodes("Envelope/Message"))
{
Console.WriteLine("{0} node has SKU = {1}",
node.SelectSingleNode("MessageID").InnerText,
node.SelectSingleNode("Inventory/SKU").InnerText);
}
It will produce:
1 node has SKU = 33333-01
2 node has SKU = 22222-01
Note that there are possible NullReferenceExceptions if nodes are not present.
You can simply remove it using RemoveChild() method of its parent.
XmlDocument document = new XmlDocument();
document.Load("C:\doc.xml");
var node = document.SelectSingleNode("Envelope/Message[MessageID = '1']");
node.ParentNode.RemoveChild(node);
document.Save("C:\docNew.xml"); // will be without Message 1
You can use Linq to XML to do this:
var doc= XDocument.Load("input.xml");//path of your xml file in which you want to search based on message id.
var searchNode= doc.Descendants("MessageID").FirstOrDefault(d => d.Value == "1");// It will search message node where its value is 1 and get first of it
if(searchNode!=null)
{
var SKU=searchNode.Parent.Descendants("SKU").FirstOrDefault();
if(SKU!=null)
{
var searchDoc=XDocument.Load("search.xml");//path of xml file where you want to search based on SKU value.
var nodes =searchDoc.Descendants("SKU").Where(d=>d.Value==SKU.Value).Select(d=>d.Parent.Parent).ToList();
nodes.ForEach(node=>node.Remove());
searchDoc.Save("output.xml");//path of output file
}
}
I'd recommend you did this using LINQ to XML - it's much nicer to work with than the old XmlDocument API.
For all the examples, you can parse your XML string xml to an XDocument like so:
var doc = XDocument.Parse(xml);
1. How do I search through XML files?
You can get the SKU for a specific message ID by querying your document:
var sku = (string)doc.Descendants("Message")
.Where(e => (int)e.Element("MessageID") == 1)
.SelectMany(e => e.Descendants("SKU"))
.Single();
2. How do I then grab another Nodes details?
You can get the Message element with a specified SKU using a another query:
var message = doc.Descendants("SKU")
.Where(sku => (string)sku == "33333-01")
.SelectMany(e => e.Ancestors("Message"))
.Single();
3. Can I remove a complete element from an XML file based on a search?
Using your result from step 2, you can simple call Remove:
message.Remove();
Alternatively, you can combine the query from step 2 and simply execute a command to remove any messages that have a specific SKU:
doc.Descendants("SKU")
.Where(sku => (string)sku == "33333-01")
.SelectMany(e => e.Ancestors("Message"))
.Remove();
I tried to answer all your questions:
using System.Xml.XPath;
using System.Xml.Linq;
XDocument xdoc1 = XDocument.Load("xml1.xml");
XDocument xdoc2 = XDocument.Load("xml2.xml");
string sku = String.Empty;
string searchedID = "2";
//1.searching through an xml file based on path
foreach (XElement message in xdoc1.XPathSelectElements("Envelope/Message"))
{
if (message.Element("MessageID").Value.Equals(searchedID))
{
//2.grabbing another node's details
sku = message.XPathSelectElement("Inventory/SKU").Value;
}
}
foreach (XElement message in xdoc2.XPathSelectElements("Envelope/Message"))
{
if (message.XPathSelectElement("Inventory/SKU") != null && message.XPathSelectElement("Inventory/SKU").Value.Equals(sku))
{
//removing a node
message.Remove();
}
}
xdoc2.Save("xml2_del.xml");
}
Related
<?xml version="1.0"?>
<TextType IsKey="false" Name="XMLReport"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Providers
xmlns="Reporting"/>
<Sales
xmlns="Reporting"/>
<Value
xmlns="Reporting">
<?xml version="1.0" encoding="utf-8"?>
<TestReport>
<StudyUid>
<![CDATA[123]]>
</StudyUid>
<Modality>
<![CDATA[XYZ]]>
</Modality>
<StudyDate format="DICOM">123456</StudyDate>
<StudyTime format="DICOM">6789</StudyTime>
<AccessionNumber>
<![CDATA[123]]>
</AccessionNumber>
<StudyDescription>
<![CDATA[abc def]]>
</StudyDescription>
<OperatorName format="xyz">
<![CDATA[abc]]>
</OperatorName>
<PhysicianReadingStudy format="xyz">
<![CDATA[^^^^]]>
</PhysicianReadingStudy>
<InstitutionName>
<![CDATA[xyz]]>
</InstitutionName>
<HospitalName>
<![CDATA[Hospital Name]]>
</HospitalName>
<ReportSet>
<MyReport ID="1">
<ReportStatus>
<![CDATA[Done]]>
</ReportStatus>
</MyReport>
<MyReport ID="2">
<ReportStatus>
<![CDATA[Done]]>
</ReportStatus>
</MyReport>
<MyReport ID="3">
<ReportStatus>
<![CDATA[Initial]]>
</ReportStatus>
</MyReport>
</ReportSet>
<ReportImageSet />
<FetusSet />
</TestReport>
</Value>
<WhoSetMe xmlns="Reporting">NotSpecified
</WhoSetMe>
</TextType>
I want to parse the xml above in C# and check whether "ReportStatus" is "Done" for all the ReportStatus under MyReport/ReportSet. One more twist here is the xml contains one more xml starts at "Value" tag as in above example.It may contatin many ReportStatus tag under ReportSet tag. Can someone please help me?
// Can you try this? I tried to do it with LINQ to XML.
// I assume you have multiple <TestReport /> elements in <Value /> tag
// and var xelement is your xml variable
// First we get all TestReport elemnts
IEnumerable<XElement> allReports =
from el in xelement.Elements("TextType/Value/TestReport")
select el;
// From allReports we get all MyReport elemnts
IEnumerable<XElement> allMyReports =
from el in allReports.Elements("ReportSet/MyReport")
select el;
// From allReports we also get all MyReport elemnts with element ReportStatus value equals "Done"
IEnumerable<XElement> allDoneMyReports =
from el in allMyReports
where (string)el.Element("ReportStatus") == "Done"
select el;
// Now we compare allMyReport with allDoneMyReports
if (allMyReports.Count() == allDoneMyReports.Count())
{
//DO Somehing
}
Your XML document is invalid. You need to fix it before trying to parse it. The issue is that a document can only have one top-level element; you have 2 <TextType> and <Providers>.
Most of your elements are the namespace Reporting. You need to use it when referencing the element.
XNamespace ns = "Reporting";
var value = doc.Element("Value" + ns);
Update
Just use the namespace for each element
XNamespace ns = "Reporting";
var value = xelement.Elements("Value" + ns);
Another Update
The XML document is considered invalid because it has multiple XML declarations; there is no way to disable this. I suggest you pre-process the document to remove the extra declarations. Here's an example (https://dotnetfiddle.net/UnuAF6)
var xml = "<?xml version='1.0'?><a> <?xml version='1.0'?><b id='b' /></a>";
var doc = XDocument.Parse(xml.Replace(" <?xml version='1.0'?", " "));
var bs = doc.Descendants("b");
Console.WriteLine("{0} 'b' elements", bs.Count());
I have an xml file like:
<?xml version="1.0" encoding="utf-8"?>
<Root>
<Session TimeStamp="2016-12-21T17:01:01.8642453+02:00">
<Message>
<Content>test1</Content>
<ID>1</ID>
<Timestamp>12/21/2016 17:01:01</Timestamp>
<EventType>Debug</EventType>
<Priority>High</Priority>
</Message>
<Message>
<Content>test2</Content>
<ID>2</ID>
<Timestamp>12/21/2016 17:01:01</Timestamp>
<EventType>Exception</EventType>
<Priority>Low</Priority>
</Message>
<Message>
<Content>test3</Content>
<ID>3</ID>
<Timestamp>12/21/2016 17:01:01</Timestamp>
<EventType>Info</EventType>
<Priority>Medium</Priority>
</Message>
<Message>
<Content>test4</Content>
<ID>4</ID>
<Timestamp>12/21/2016 17:01:01</Timestamp>
<EventType>Warn</EventType>
<Priority>None</Priority>
</Message>
</Session>
</Root>
I want to check the value of element Content in every message i have try with this method:
Assert.IsTrue(xDocument.Root.Elements("Session").Last().Elements("Message").First().Element("Content").Value.Contains("test1"));
exception: System.InvalidOperationException: Sequence contains no elements
The method fail, cant find the element value, how can i do it using xdocument?
Are you looking for this since you say
I want to check the value of element Content in every message
xDocument.Root.Elements("Session")
.Elements("Message")
.Elements("Content")
.Select(x => x.Value.Contains("test1"));
It would return which node contains test1 so the result would be true,false,false,false
Edit
as per your comment "i want only to verify if message 1 content contains string "test1" "
xDocument.Root.Elements("Session")
.Elements("Message")
.Elements("Content")
.FirstOrDefault().Value.Contains("test1");
XmlDocument advDoc=new XmlDocument();
advDoc.Load("test.xml");
XmlNodeList _ngroups = advDoc.GetElementsByTagName("Content");
foreach(XmlNode nd in _ngroups)
{
if(nd.InnerText.ToString()=="test1")
Console.WriteLine("true");
}
i want only to verify if message 1 content contains string "test1"
string pathToXmlFile = ""; // point to your xml file ...
using (StreamReader reader = File.OpenText(pathToXmlFile))
{
XDocument doc = XDocument.Load(reader); // load into XDocument
XElement idElement = doc.Root.Element("Session").Elements("Message").Elements("ID").First( item => item.Value == "1"); // since you need the message id = 1
string content = idElement.Parent.Element("Content").Value; // get the parent of this message id which is message element then navigate to its content element.
}
Hope this helps..
I have an string parameter with xml content in it. Basically the string have an XML inside.
string S = funcThatReturnsXML (parameters);
S have the next text:
<?xml version="1.0" encoding="utf-8" ?>
<tagA>
<tagB>
<tagBB>
..
.
.
</tagBB>
.
.
</tagB>
<tagC>
..
..
.
</tagC>
</tagA>
The funcThatReturnsXML (parameters) creates an XmlDocument object but the return it as a string, I cant change this function, to much stuff works with it.
Tried to create XmlDocument objetc but the SelectSingleNode return null.
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(S);
XmlNode root = xmlDoc.SelectSingleNode("tagB");
How can I delete from string S (not XML Object) specific node, for example <tagB>
EDIT: this is the XML I tested with:
<?xml version="1.0" ?>
- <Request xmlns:xsi="http://www.mysite.com" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
- <info xmlns="http://www.mysite.com">
<RequestTR>54</RequestTR>
<time>2013-12-22</time>
</info>
- <Parameters xmlns="http://www.mysite.com">
<id>3</id>
<name>2</name>
</Parameters>
<title>Request</title>
</Request>
Try this:
string S = funcThatReturnsXML(parameters);
var doc = XDocument.Parse(S);
var nodeToRemove = doc.Descendants("tagB");
nodeToRemove.Remove();
That will remove all nodes named "tagB" from string S which contains xml.
UPDATE 1:
Sorry, i missed to include one more line:
S = doc.ToString();
My first code above removed "tagB" from doc but didnt save it back to S variable.
UPDATE 2:
I tested with following xml which contain attribute:
<tagA attribute="value">
<tagB>
<tagBB>
</tagBB>
</tagB>
<tagC></tagC>
</tagA>
and the output of Console.WriteLine(S):
<tagA attribute="value">
<tagC></tagC>
</tagA>
UPDATE 3:
Given your updated xml format, I know why my previous code didn't work for you. That was because your xml have namespace (xmlns) declared. The solution is to use LocalName when searching for the node to be removed, that will search for node name while ignoring its namespace. The follwoing example shows how to remove all "info" node:
var doc = XDocument.Parse(S);
var nodeToRemove = doc.Descendants().Where(o => o.Name.LocalName == "info");
nodeToRemove.Remove();
S = doc.ToString();
If you can determine the particular outer element to remove from the returned XML, you could use LINQ to XML:
var returnedXml = funcThatReturnsXML(parameters);
var xmlElementToRemove = funcThatReturnsOuterElement(returnedXml);
var xelement = XElement.Load("XmlDoc.txt");
xelement.Elements().Where(e => e.Name == xmlElementToRemove).Remove();
For example:
using System.Linq;
using System.Xml.Linq;
class Program
{
static void Main(string[] args)
{
// pretend this is the funThatReturnsXML return value
var returnedXml = "<tagB><tagBB></tagBB></tagB>";
// get the outer XML element name
var xmlElementToRemove = GetOuterXmlElement(returnedXml);
// load XML from where ever
var xelement = XElement.Load("XmlDoc.txt");
// remove the outer element and all subsequent elements
xelement.Elements().Where(e => e.Name == xmlElementToRemove).Remove();
}
static string GetOuterXmlElement(string xml)
{
var index = xml.IndexOf('>');
return xml.Substring(1, index - 1);
}
}
Note that the above is a "greedy" removal method, if there is more than once element with the name returned via the GetOuterXmlElemet method they will all be removed. If you want a specific instance to be removed then you will require something more sophisticated.
Building on your edit:
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(S);
var nodeA = xmlDoc.SelectSingleNode("/tagA");
var nodeB = nodeA.SelectSingleNode("tagB");
nodeA.RemoveChild(nodeB);
To remove (possibly) multiple tagB nodes in unknown positions, you may try:
var bees = xmlDoc.SelectNodes("//tagB");
foreach (XmlNode bee in bees) {
var parent = bee.ParentNode;
parent.RemoveChild(bee);
}
I am trying to remove all the nodes from the XML file. But it's removing the root node open tag also .Using C# anf Linq
Input:
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!--Log the error count and error message-->
<root>
<ErrData>
<Count>1</Count>
<Timestamp>2011-11-21T11:57:12.3539044-05:00</Timestamp>
</ErrData>
<ErrData>max of 20 ErrData elements</ErrData>
</root>
Expected OP:
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!--Log the error count and error message-->
<root>
</root>
Actual OP:EDITED
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!--Log the error count and error message-->
<root />
Code :
XDocument docs = XDocument.Load(path);
try
{
docs.Descendants("ErrData").Remove();
}
CODE:
Below is the code i am using ,the concept is the error count and timestamp are logged into XML file.Once its reached the threshold value,email will be send by function and remove all the nodes from the xml. Then when the next error comes it will start entering in to the xml file as below,
XDocument doc = null;
XElement el;
if (!System.IO.File.Exists(path))
{
doc = new XDocument(new XDeclaration("1.0", "utf-8", "no"));
el = new XElement("root");
//el = new XElement("root");
XComment comment = new XComment("Log the error count and error message");
doc.Add(comment);
}
else
{
doc = XDocument.Load(path);
}
XElement p1 = new XElement("ErrData");
XElement p1Count = new XElement("Count", eventCount);
XElement p1Windowsatrt = new XElement("Timestamp", windowStart);
p1.Add(p1Count );
p1.Add(p1Windowsatrt );
if (doc.Root != null)
{
el = doc.Root;
el.Add(p1);
}
else
{
el = new XElement("root");
el.Add(p1);
}
try
{
doc.Add(el);//Line throwing the exeception
}
catch (Exception e)
{
}
finally
{
doc.Save(path);
}
Use docs.Root.Nodes().Remove().
<root /> is valid XML for a tag with no content (self-closing tags). If you absolutely need an opening and closing tag you need to put some content in the root node like a comment or text.
The confusion is in your very first sentence: "I am trying to remove all the nodes/elements from the XML file." Which one is it? Do you want to remove all nodes, or all elements?
There are five types of nodes in XML: elements, text, comments, processing instructions, and attributes. If you use "node" and "element" interchangeably, as you are here, you're going to have no end of trouble working with XML.
What you got, <root/>, is the correct output for code that removes all descendant nodes: it's a single element named root with no content.
What you expect,
<root>
</root>
is a single element named root that contains a child text node containing whitespace, probably a newline. The code you wrote removes all descendant nodes, not just descendant element nodes, and so it removed this text node as well.
I would like to filter with high performance XML elements from an XML document.
Take for instance this XML file with contacts:
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="asistentes.xslt"?>
<contactlist evento="Cena Navidad 2010" empresa="company">
<contact type="1" id="1">
<name>Name1</name>
<email>xxxx#zzzz.es</email>
<confirmado>SI</confirmado>
</contact>
<contact type="1" id="2">
<name>Name2</name>
<email>xxxxxxxxx#zzzze.es</email>
<confirmado>Sin confirmar</confirmado>
</contact>
</contaclist>
My current code to filter from this XML document:
using System;
using System.Xml.Linq;
class Test
{
static void Main()
{
string xml = #" the xml above";
XDocument doc = XDocument.Parse(xml);
foreach (XElement element in doc.Descendants("contact")) {
Console.WriteLine(element);
var id = element.Attribute("id").Value;
var valor = element.Descendants("confirmado").ToList()[0].Value;
var email = element.Descendants("email").ToList()[0].Value;
var name = element.Descendants("name").ToList()[0].Value;
if (valor.ToString() == "SI") { }
}
}
}
What would be the best way to optimize this code to filter on <confirmado> element content?
var doc = XDocument.Parse(xml);
var query = from contact in doc.Root.Elements("contact")
let confirmado = (string)contact.Element("confirmado")
where confirmado == "SI"
select new
{
Id = (int)contact.Attribute("id"),
Name = (string)contact.Element("name"),
Email = (string)contact.Element("email"),
Valor = confirmado
};
foreach (var contact in query)
{
...
}
Points of interest:
doc.Root.Elements("contact") selects only the <contact> elements in the document root, instead of searching the whole document for <contact> elements.
The XElement.Element method returns the first child element with the given name. No need to convert the child elements to a list and take the first element.
The XElement and XAttribute classes provide a wide selection of convenient conversion operators.
You could use LINQ:
foreach (XElement element in doc.Descendants("contact").Where(c => c.Element("confirmado").Value == "SI"))