I have an XML as below
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope
xmlns="http://com/uhg/uht/uhtSoapMsg_V1"
xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
<env:Header>
<uhtHeader
xmlns="http://com/uhg/uht/uhtHeader_V1">
<consumer>COMET</consumer>
<auditId></auditId>
<sendTimestamp>2020-09-03T18:15:40.942-05:00</sendTimestamp>
<environment>P</environment>
<businessService version="24">getClaimHistory</businessService>
<status>success</status>
</uhtHeader>
</env:Header>
<env:Body>
<srvcRspn
xmlns="http://com/uhg/uht/getClaimHistory_V24">
<srvcErrList arrayType="srvcErrOccur[1]" type="Array">
<srvcErrOccur>
<orig>Foundation</orig>
<rtnCd>00</rtnCd>
<explCd>000</explCd>
<desc></desc>
</srvcErrOccur>
</SrvcErrList>
</srvcRspn>
</env:Body>
</env:Envelope>
I want to remove all the attribute values with "http" like below:
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope
xmlns=""
xmlns:env="">
<env:Header>
<uhtHeader
xmlns="">
<consumer>COMET</consumer>
<auditId></auditId>
<sendTimestamp>2020-09-03T18:15:40.942-05:00</sendTimestamp>
<environment>P</environment>
<businessService version="24">getClaimHistory</businessService>
<status>success</status>
</uhtHeader>
</env:Header>
<env:Body>
<srvcRspn
xmlns="">
<srvcErrList arrayType="srvcErrOccur[1]" type="Array">
<srvcErrOccur>
<orig>Foundation</orig>
<rtnCd>00</rtnCd>
<explCd>000</explCd>
<desc></desc>
</srvcErrOccur>
</SrvcErrList>
</srvcRspn>
</env:Body>
</env:Envelope>
I have tried several ways but none of them has worked for me. Can anyone suggest what is fastest way to do it in VB.NET/C#.
The actual response is very large (approx 100000 lines of XML minimum) and using for each will consume a good amount of time. Is there any parsing method or LINQ query method which can do it faster.
I got the way to do it using Regex as below:
Return Regex.Replace(xmlDoc, "((?<=<|<\/)|(?<= ))[A-Za-z0-9]+:| xmlns(:[A-Za-z0-9]+)?="".*?""", "")
It serves my purpose completely. Thanks Cleptus for your quick reference.
Related
I have a SOAP message like following:
<?xml version='1.0' encoding='UTF-8'?>
<SOAP-ENV:Envelope
xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/1999/XMLSchema">
<SOAP-ENV:Body>
<ns1:SaveMethod xmlns:ns1="urn:MyService" SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
<data xsi:type="xsd:string">
<![CDATA[
<?xml version='1.0' encoding='iso-8859-1'?>
<test>123</test>
]]>
</data>
</ns1:SaveMethod>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
My WCF is working fine when the SaveMethod tag is without prefix e.g.
<SaveMethod>
<data xsi:type="xsd:string">
etc.
But since i am not able to edit the SOAP message i need to find out how to make the SaveMethod accept the tag with prefix ns1.
Btw my method is as follows:
public void SaveMethod(String data){
// do something
}
Maybe i can do something on the client-side (HttpWebRequest used btw) ??
Thank you very much
NEWBIE QUESTION.
I haven't worked that much with xml, nothing like this anyway. I have some XML as shown below that I receive which has several namespaces.
I need to read some values, then update others before returning the revised XML with namespaces intact - don't want them removed.
I am given the path to some of the elements like this cred/sub/aa or trip/items/item[0]/customerInfo/custName.
But it seems that namespaces make it difficult to get to those elements so simply.
Does anybody know how I can read some of the values like NON-SMOKING from custPref or get the value CABBAGE from bossman/zz.
Also, I want to be able to then set a value such as custName to say Mr. X.
Any ideas?
Thanks.
<?xml version="1.0" encoding="utf-16" ?>
<A1 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<cred xmlns="https://blah-blah.com/?foobar">
<sub>
<aa>Zippo</aa>
<bb>lighter</bb>
</sub>
<reqId>
<cc></cc>
<dateOfBirth></dateOfBirth>
</reqId>
</cred>
<reqName xmlns="http://blah-blah/vader/base">qwerty</reqName>
<reqId xmlns="http://blah-blah/vader/base">12345</reqId>
<machine xmlns="http://blah-blah/vader/base">
<qqq>hello</qqq>
<www>goodbye</www>
<eee>99999</eee>
<rrr>88888</rrr>
</machine>
<monkey xmlns="http://blah-blah/vader/base">alskdjfhg</monkey>
<math xmlns="http://blah-blah/vader/base">
<language></language>
</math>
<trip xmlns="http://blah-blah/simple">
<tripOverview xmlns="http://blah-blah/vader/base">
<description></description>
<cost></cost>
</tripOverview>
<bossman xmlns="http://blah-blah/vader/base">
<zz>CABBAGE</zz>
<yy>BANANA</yy>
<xx>MELON</xx>
<ww>SYRUP</ww>
</bossman>
<items>
<item>
<itemSummary xmlns="http://blah-blah/vader/base">
<description></description>
<cost></cost>
<reference></reference>
</itemSummary>
<customerInfo xmlns="http://blah-blah/vader/base">
<custName></custName>
<custPref>NON-SMOKING</custPref>
</customerInfo>
<seatId xmlns="http://blah-blah/vader/base">1</seatId>
</item>
</items>
</trip>
</A1>
string xml = "<Root><Options></Options></Root>";
var xdocs = XDocument.Parse(xml);
xdocs.Descendants().Where(q => q.Name == "Options").FirstOrDefault().Value = "FoundIt";
I wrote several lines of code but still can't get over this:
I need to load many xml docs from web library. I don't know how many documents there are so I wonder which loop should I use while loading:
XDocument doc = XDocument.Load("http://" + i);
where -i is identifiers number.
I tried loading until i get document without meaningful content (thought it is the end, the rest are empty), but problem is that there is several Xdocs that are empty in the middle of library.
XML with content looks like
<?xml version="1.0" encoding="utf-8"?>
<OP xmlns="" xmlns:xsi="" xsi:schemaLocation="">
<request verb="GR" identifier="53" metadataPrefix="p"></request>
<GR>
<header>
<identifier>53,number of doc...used for counting</identifier>
</header>
<metadata>
<P xmlns="" xsi:schemaLocation="">
<TITLE>title</TITLE>
<CERTIFICATE NAME="different names">
</CERTIFICATE>
<YEAR>
<DATE>2012-10-18T00:00:00Z</DATE>
</YEAR>
<MINIATURE>
<COPY>
<CNAME>Copy name<CNAME>
<FORMAT>obj/max/dxf/3ds/...</FORMAT>
</COPY>
</MINIATURE>
</metadata>
</GR>
</OP>
XML without content
<?xml version="1.0" encoding="utf-8"?>
<OP xmlns="" xmlns:xsi="" xsi:schemaLocation="">
<request verb="GR" identifier="53" metadataPrefix="p"></request>
Furthermore, I need to do some counting like:
Tot.no. of doc,
No. of docs per certificate <CERTIFICATE>
No. of docs for each year <YEAR><DATE>
No of docs for each format <MINIATURE><COPY><FORMAT>
and my output should look like:
<?xml version="1.0" encoding="UTF-8" ?>
<Statistic>
<DocSum>21220</DocSum>
<Certificates>
<Certificate id=”certificateName”>17098</Certificate>
…
<Certificates>
<Years>
<Year year=”2014”>23</Year>
…
</Years>
<Miniature>
<Format post=”obj”>11723</Format>
…
</Miniature>
</Statistic>
If you could give me some help, hints or tips how to deal with it.
The posted answer by smink to the following thread should get you on the right path.
C# HttpWebRequest command to get directory listing
One of the easiest ways to get a list of the files of a web directory without knowing exactly how many there are or their filenames is by parsing the html of the directory and pulling out the tags.
You can then iterate through these tags and filter them out for the files by extensions that you need. I can provide a more in-depth example if necessary.
I am having problem to merge 2 or more xml files into 1 using c#.
I am doing it with DataSets:
//ds1,ds2,ds3 are DataSets
private void MyMethod()
{
ds1.ReadXml(tmpStream);
ds2.ReadXml(tmpStream);
ds1.Merge(ds2);
}
but i dont want to use DataSet. i am searching for another way.
first XML is
<?xml version="1.0" encoding="utf-8"?>
<catalog>
<item>
<path>'filePath'</path>
<deleted>0</deleted>
<date>9/23/2010 11:30:03 AM</date>
</item>
</catalog>
the second is
<?xml version="1.0" encoding="utf-8"?>
<catalog>
<item>
<path>'filePath'</path>
<deleted>0</deleted>
<date>9/23/2010 11:30:03 AM</date>
</item>
</catalog>
result must be
<?xml version="1.0" encoding="utf-8"?>
<catalog>
<item>
<path>'filePath'</path>
<deleted>0</deleted>
<date>9/23/2010 11:30:03 AM</date>
</item>
<item>
<path>'filePath'</path>
<deleted>0</deleted>
<date>9/23/2010 11:30:03 AM</date>
</item>
</catalog>
Though this isn't really clear of what sort of merge you want, this article Merging XML Files, Schema Validation, and More might help you get the idea.
Easiest could be, if you dont want any checks to be performed(duplicates, zombies, etc)
var ResultXml = XDocument.Load("file1.xml");
ResultXml.Root.Add(XDocument.Load("file2.xml").Root.Elements());
To merge XML files into resulting one, you could use Microsoft's XML Diff and Patch C# API. You could read more about it in Eric White's blog post: "OpenXmlDiff.Exe: A Utility to Find the Differences Between Two Open XML Documents"
Hi currently I have a nested XMl , having the following Structure :
<?xml version="1.0" encoding="utf-8" ?>
<Response>
<Result>
<item id="something" />
<price na="something" />
<?xml version="1.0" encoding="UTF-8" ?>
<DIDL-Lite xmlns="urn:schemas-upnp-org:metadata-1-0/DIDL-Lite/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:upnp="urn:schemas-upnp-org:metadata-1-0/upnp/" xmlns:dlna="urn:schemas-dlna-org:metadata-1-0/">
</Result>
<NumberReturned>10</NumberReturned>
<TotalMatches>10</TotalMatches>
</Response>
Any help on how to read this using Xdocument or XMLReader will be really helpfull.
Thanks,
Subhendu
XDocument and XmlReader are both XML parsers that expect a properly formed XML as input. What you have shown is not a XML file. So the first task would be to extract the nested XML and as this is not valid XML you cannot rely on any parser to do this job. You'll need to resort to string manipulation and or regular expressions.
My suggestion would be to fix the procedure generating this invalid XML in the first place. Another suggestion is to never generate a XML file manually but use an appropriate tool for this (XmlWriter, XDocument, ...)