What is the best method to manipulate xml files? - c#

I want to manipulate XML files.
...
<Document Id="1091">
<Indexes>
<Index Name="MODD" Value="aaa" />
<Index Name="DDAT" Value="bbb" />
<Index Name="CDAT" Value="ccc" />
<Index Name="MDAT" Value="ddd" />
<Index Name="DOCN" Value="eee" />
<Index Name="STAT" Value="fff" />
...
</Indexes>
</Document>
<Document Id="2088">
...
I have to retrieve the values of some indexes in no particular order, and I would like to avoid looping over all the indexes. Which tool would you advise me to use, and why?
load the file as a text file and use RegEx
load the xml file and use XPath
load the xml file and use Linq to Xml
generate the classes with xsd.exe or xsd2code
another approach

I'd go with LINQ to XML. The syntax is clean and it is easy to use!
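As a minimal sketch of the LINQ to XML approach, the lookup below reads one index value by document Id and index Name. The wrapper element `<Documents>` used in the usage note and the names `IndexLookup` and `GetIndexValue` are assumptions for illustration; the question elides the real root element.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class IndexLookup
{
    // Returns the Value attribute of the named Index inside the Document
    // with the given Id, or null when no such Document/Index exists.
    public static string GetIndexValue(XElement root, string documentId, string indexName)
    {
        return root.Elements("Document")
            .Where(d => (string)d.Attribute("Id") == documentId)
            .Elements("Indexes")
            .Elements("Index")
            .Where(i => (string)i.Attribute("Name") == indexName)
            .Select(i => (string)i.Attribute("Value"))
            .FirstOrDefault();
    }
}
```

Called as `IndexLookup.GetIndexValue(XElement.Load(path), "1091", "DDAT")`, where `path` is a placeholder for your file. The query still walks the elements internally, so if many values are needed from the same document it may be worth building a dictionary keyed by Name once and reading from that.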

Related

Passing XML multi level data as parameter and using in stored procedure

I have data in XML format that I will pass from a .NET application.
In the SQL Server stored procedure, this data arrives as an XML parameter. I want to read it and save it in the required tables, say TblOrder and TblItem.
The XML contains multiple orders, and each order contains one or more items.
The structure to be processed:
<?xml version="1.0" encoding="UTF-8"?>
<Orders>
<Order>
<B2>B2**ABIJ**0000884443**PP</B2>
<CreateBy null="true" />
<CreateDate>/Date(1485150414358)/</CreateDate>
<CurrencyId>1</CurrencyId>
<CustomerId>13</CustomerId>
<DeliveryAddress>LIBERTY PRESS LLC</DeliveryAddress>
<DeliveryCity>SPRINGVILLE UT 84663</DeliveryCity>
<DeliveryCityId>0</DeliveryCityId>
<DeliveryDate>/Date(1478750400000)/</DeliveryDate>
<DeliveryId>14</DeliveryId>
<DeliveryState>UT</DeliveryState>
<DeliveryStateId>16</DeliveryStateId>
<DeliveryType>Delivery</DeliveryType>
<EquipmentId>4</EquipmentId>
<Items>
<Item>
<CSA>false</CSA>
<CTPAT>false</CTPAT>
<CommodityItem>General Freight</CommodityItem>
<CommodityItemId>0</CommodityItemId>
<CustCommodityItem null="true" />
<FAST>false</FAST>
<Hazmat>false</Hazmat>
<Height null="true" />
<IsActive>false</IsActive>
<ItemId>0</ItemId>
<ItemName>Item A</ItemName>
<Length null="true" />
<Make null="true" />
<Mass null="true" />
<MassUnit null="true" />
<Model null="true" />
<OrderId>0</OrderId>
<PIP>false</PIP>
<PilotCar>false</PilotCar>
<ReeferTemp null="true" />
<Tarp>false</Tarp>
<TrailerType null="true" />
<TruckType null="true" />
<VIN null="true" />
<Width null="true" />
<Year null="true" />
</Item>
</Items>
<L11>L11*SYL884443*BM</L11>
<LastUpdate>/Date(1485150414358)/</LastUpdate>
</Order>
<Order>
...
<Items>
<Item>
...
</Item>
</Items>
</Order>
</Orders>
The steps I want to achieve are:
Read the XML parameter in the stored procedure, as passed from the .NET application.
Loop through the XML and save the data in the TblOrder and TblItem tables.
I went through the following articles:
Pass-XML-parameter-to-Stored-Procedure
How to loop and parse xml parameter in sql server stored procedure
From these I understood how to access the first level (in my case, each Order in Orders).
I am having trouble accessing the second level, which is again a collection (in my case, each Item in the Items of an Order).
Thanks in advance for your support.
You have two approaches:
You can pass the XML as-is into a stored procedure and do all the hard work in T-SQL.
You can shred the XML in C#, fill appropriate data objects and use classical data storage.
From your question I take it that you would prefer to pass this into a stored procedure as an XML parameter. There are some things to know:
C# uses 16-bit Unicode (UTF-16) strings internally, and so does SQL Server's XML type. But you will not be able to cast this Unicode string to XML as long as encoding="UTF-8" is included. You might pass it as VARCHAR(MAX) (not NVARCHAR(MAX)!), but this could get you into trouble if special characters are involved. It is best to cut away the first line (the <?xml ...?> declaration) completely.
Your XML is not created correctly. Is this under your control? If you include null="true" (normally there is no need to), you should do this with the xsi namespace (xsi:nil="true"). And date/time values within XML should be ISO 8601. Your values (like /Date(1485150414358)/) are not in a format SQL Server can cast directly.
Nevertheless, I see multiple <Order> elements and multiple <Item> elements. You could read them as follows:
DECLARE @xml XML=
N'<Orders>
<Order>
<B2>B2**ABIJ**0000884443**PP</B2>
<CreateBy null="true" />
<CreateDate>/Date(1485150414358)/</CreateDate>
<CurrencyId>1</CurrencyId>
<CustomerId>13</CustomerId>
<DeliveryAddress>LIBERTY PRESS LLC</DeliveryAddress>
<DeliveryCity>SPRINGVILLE UT 84663</DeliveryCity>
<DeliveryCityId>0</DeliveryCityId>
<DeliveryDate>/Date(1478750400000)/</DeliveryDate>
<DeliveryId>14</DeliveryId>
<DeliveryState>UT</DeliveryState>
<DeliveryStateId>16</DeliveryStateId>
<DeliveryType>Delivery</DeliveryType>
<EquipmentId>4</EquipmentId>
<Items>
<Item>
<CSA>false</CSA>
<CTPAT>false</CTPAT>
<CommodityItem>General Freight</CommodityItem>
<CommodityItemId>0</CommodityItemId>
<CustCommodityItem null="true" />
<FAST>false</FAST>
<Hazmat>false</Hazmat>
<Height null="true" />
<IsActive>false</IsActive>
<ItemId>0</ItemId>
<ItemName>Item A</ItemName>
<Length null="true" />
<Make null="true" />
<Mass null="true" />
<MassUnit null="true" />
<Model null="true" />
<OrderId>0</OrderId>
<PIP>false</PIP>
<PilotCar>false</PilotCar>
<ReeferTemp null="true" />
<Tarp>false</Tarp>
<TrailerType null="true" />
<TruckType null="true" />
<VIN null="true" />
<Width null="true" />
<Year null="true" />
</Item>
</Items>
<L11>L11*SYL884443*BM</L11>
<LastUpdate>/Date(1485150414358)/</LastUpdate>
</Order>
</Orders>';
--the query
SELECT --elements of Order
o.value(N'(B2)[1]',N'nvarchar(max)') AS B2
--very strange date format...
,o.value(N'(CreateDate)[1]',N'nvarchar(max)') AS CreateDate
--typed INT
,o.value(N'(CurrencyId)[1]',N'int') AS CurrencyId
--more like this
--elements of Item
,i.value(N'(CSA)[1]',N'nvarchar(max)') AS CSA
--There's no need for *null="true"*
--Query the "/text()" node and an empty element will come back as NULL
,CASE WHEN i.value(N'(CustCommodityItem/@null)[1]',N'nvarchar(max)')=N'true' THEN NULL ELSE i.value(N'(CustCommodityItem)[1]',N'nvarchar(max)') END AS CustCommodityItem_complicated
,i.value(N'(CustCommodityItem)[1]',N'nvarchar(max)') AS CustCommodityItem_empty
,i.value(N'(CustCommodityItem/text())[1]',N'nvarchar(max)') AS CustCommodityItem_null
--one row per Order/Item pair; OUTER APPLY keeps orders without items
FROM @xml.nodes(N'/Orders/Order') AS A(o)
OUTER APPLY o.nodes(N'Items/Item') AS B(i)
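The two preparation steps recommended in this answer (dropping the <?xml ...?> declaration and using ISO 8601 dates) could be done on the C# side before the string is handed to the stored procedure. The following is only a sketch under assumptions: the class and method names are hypothetical, and it assumes the /Date(n)/ values are milliseconds since the Unix epoch in UTC, which is the usual meaning of that format but is not confirmed by the question.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class OrderXmlPreparer
{
    // Matches element content such as /Date(1485150414358)/
    private static readonly Regex JsonDate = new Regex(@"^/Date\((-?\d+)\)/$");

    public static string Prepare(string rawXml)
    {
        XDocument doc = XDocument.Parse(rawXml);

        // Snapshot the leaf elements first so the tree is not modified while enumerating it.
        foreach (XElement leaf in doc.Descendants().Where(e => !e.HasElements).ToList())
        {
            Match m = JsonDate.Match(leaf.Value);
            if (!m.Success) continue;

            // Assumption: the number is milliseconds since 1970-01-01 UTC.
            long ms = long.Parse(m.Groups[1].Value, CultureInfo.InvariantCulture);
            leaf.Value = DateTimeOffset.FromUnixTimeMilliseconds(ms).UtcDateTime
                .ToString("yyyy-MM-dd'T'HH:mm:ss.fff'Z'", CultureInfo.InvariantCulture);
        }

        // XDocument.ToString() never writes the XML declaration,
        // so the encoding="UTF-8" line is dropped here.
        return doc.ToString(SaveOptions.DisableFormatting);
    }
}
```

The returned string can then be passed as the XML parameter, and the date columns can be read in T-SQL with a typed .value() call instead of nvarchar(max).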

Read an XML file, change it and save it as new formatted XML

I have an XML file which has been exported from Orchard CMS. What I now need to do is convert the nodes within the XML file to a new structure so that I can import the file into Umbraco.
How do I go about doing this? I'm thinking I could write some C# to read the XML file, make the changes I need and then save it as a new file.
An example of what I am trying to do:
Exported file:
<BlogPost Id="/alias=The Blog\/2012\/09\/10\/on-starters-orders" Status="Published">
<TextField.Excerpt />
<TaxonomyField.Categories Terms="" />
<TaxonomyField.Tags Terms="" />
<BodyPart Text="MAIN CONTENT OF THE BLOG POST"
/>
<CommonPart Owner="/User.UserName=Owain" Container="/alias=blog" CreatedUtc="2012-09-10T13:27:00Z" PublishedUtc="2012-09-25T08:57:25Z" ModifiedUtc="2012-09-25T08:56:15Z" />
<AutoroutePart Alias="The Blog/2012/09/10/on-starters-orders" UseCustomPattern="false" />
<TitlePart Title="On starters orders....." />
<CommentsPart CommentsShown="true" CommentsActive="true" ThreadedComments="false" />
<TagsPart Tags="" />
</BlogPost>
What I need to convert it to is:
<posts>
<post id="1" date-created="2012-09-25T08:57:25Z" date-modified="2012-09-25T08:56:15Z" approved="true" post-url="on-starters-orders" type="normal" hasexcerpt="false" views="0" is-published="True">
<title type="text"><![CDATA[On starters orders.....]]></title>
<content type="text"><![CDATA[MAIN CONTENT OF THE BLOG POST]]>
</content>
<post-name type="text"><![CDATA[On starters orders.....]]></post-name>
<categories>
<category ref="1018" />
</categories>
<tags>
<tag ref="training" />
</tags>
<comments>
<comment id="35" date-created="2006-09-05T11:36:50" date-modified="2006-09-05T11:36:50" approved="false" user-name="Phil Haack" user-url="http://haacked.com">
<title type="text"><![CDATA[re: CS Dev Guide: Send Emails]]></title>
<content type="text"><![CDATA[Another test comment.]]></content>
</comment>
</comments>
<authors>
<author ref="Owain" />
</authors>
</post>
</posts>
I'm looking for suggestions on the best way to do this, as I have 150+ posts to convert and don't fancy doing it manually.
You can use an XSLT transformation, which offers many options for what you are looking for. XSLT can output XML.
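A rough sketch of how the XSLT suggestion might look from C#, using XslCompiledTransform. The stylesheet is a hypothetical, partial mapping written for illustration: it covers only the title, body and the two dates, numbers posts by position, and assumes the BlogPost elements sit somewhere under a single root element. Categories, tags, comments and authors would need their own templates.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class OrchardToUmbraco
{
    // Hypothetical stylesheet: maps a few Orchard attributes to the target layout.
    private const string Stylesheet = @"<xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:output method=""xml"" indent=""yes"" cdata-section-elements=""title content""/>
  <xsl:template match=""/"">
    <posts>
      <xsl:apply-templates select=""//BlogPost""/>
    </posts>
  </xsl:template>
  <xsl:template match=""BlogPost"">
    <post id=""{position()}"" date-created=""{CommonPart/@PublishedUtc}"" date-modified=""{CommonPart/@ModifiedUtc}"">
      <title type=""text""><xsl:value-of select=""TitlePart/@Title""/></title>
      <content type=""text""><xsl:value-of select=""BodyPart/@Text""/></content>
    </post>
  </xsl:template>
</xsl:stylesheet>";

    public static string Transform(string orchardXml)
    {
        var xslt = new XslCompiledTransform();
        using (XmlReader xslReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            xslt.Load(xslReader);
        }

        var output = new StringWriter();
        using (XmlReader input = XmlReader.Create(new StringReader(orchardXml)))
        // Reuse the stylesheet's own output settings (indentation, CDATA elements).
        using (XmlWriter writer = XmlWriter.Create(output, xslt.OutputSettings))
        {
            xslt.Transform(input, writer);
        }
        return output.ToString();
    }
}
```

For 150+ posts the same compiled transform can be applied to the whole export in one pass, since the template matches every BlogPost in the input.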

Want to find a specific node in xml

Currently, I have the following XML:
<Node_Parent>
<Column name="ColA" value="A" />
<Column name="ColB" value="B" />
<Column name="ColC" value="C" />
</Node_Parent>
How do I get the value B for ColB? I tried XmlDocument.SelectSingleNode("Node_Parent"), but from there I cannot get to ColB.
If I change it to <ColB value="B" />, I can use XmlDocument.SelectSingleNode("Node_Parent/ColB").Attributes["value"].Value, but that XML format doesn't look good.
Thanks.
You need to write an XPath query with an attribute predicate in the SelectSingleNode call:
var value = doc.SelectSingleNode(
"Node_Parent/Column[@name = 'ColB']"
).Attributes["value"].Value;
For more info on the XPath query language, see http://www.w3schools.com/xpath.
Good luck!
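For completeness, here is a self-contained sketch of the lookup described above. The helper name `GetColumnValue` is made up for illustration, and the code assumes the column name contains no single quote, since it is pasted into the XPath string.

```csharp
using System;
using System.Xml;

public static class ColumnLookup
{
    // Returns the value attribute of the Column whose name attribute
    // equals columnName, or null when there is no such Column.
    public static string GetColumnValue(string xml, string columnName)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // Assumption: columnName contains no single quote.
        XmlNode node = doc.SelectSingleNode(
            "Node_Parent/Column[@name = '" + columnName + "']");

        return node?.Attributes["value"]?.Value;
    }
}
```

The null-conditional operators avoid a NullReferenceException when the column does not exist, which the one-line version would throw.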

//@attrib vs //name/@attrib in C#

On the XML below, I'm using the SelectSingleNode method of XmlDocument to pull out the Result value.
evtASxml.SelectSingleNode(@"//@value").Value
returns the value of the first "value" attribute.
evtASxml.SelectSingleNode(@"//Result/@value").Value
throws a null reference exception.
Could someone explain what's going on?
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Microsoft-Windows-CAPI2" Guid="{f00f00-f00-f00f00-f00-f00f00f00}" />
<EventID>30</EventID>
<Version>0</Version>
<Level>2</Level>
<Task>30</Task>
<Opcode>0</Opcode>
<Keywords>0x4000000000000001</Keywords>
<TimeCreated SystemTime="2012-04-08T23:43:37.573242200Z" />
<EventRecordID>4828</EventRecordID>
<Correlation ActivityID="{f00f00-f00-f00-f00-f00f00f00f00}" />
<Execution ProcessID="7512" ThreadID="3220" />
<Channel>Microsoft-Windows-CAPI2/Operational</Channel>
<Computer>Matt-Seven</Computer>
<Security UserID="S-f00-f00-f00-f00f00f00-f00f00f00-f00f00f00-f00f00" />
</System>
<UserData>
<CertVerifyCertificateChainPolicy>
<Policy type="CERT_CHAIN_POLICY_SSL" constant="4" />
<Certificate fileRef="f00f00f00f00f00f00f00f00f00f00f00.cer" subjectName="www.example.com" />
<CertificateChain chainRef="{f00f00-f00-f00-f00-f00f00f00f00}" />
<Flags value="0" />
<SSLAdditionalPolicyInfo authType="server" serverName="example.com">
<IgnoreFlags value="0" />
</SSLAdditionalPolicyInfo>
<Status chainIndex="0" elementIndex="0" />
<EventAuxInfo ProcessName="iexplore.exe" />
<CorrelationAuxInfo TaskId="{f00f00-f00-f00-f00-f00f00f00f00}" SeqNumber="4" />
<Result value="800B010F">The certificate's CN name does not match the passed value.</Result>
</CertVerifyCertificateChainPolicy>
</UserData>
</Event>
Numeric values from my event log replaced with f00.
Just guessing, but I think you want //*[@value], and not //@value
The reason for this problem is that the XML document is in a default namespace.
Selecting elements by name when they are in a default namespace is the most frequently asked question about XPath.
XPath treats any unprefixed element name as belonging to "no namespace". In your case no Result element exists that is in "no namespace" (all elements are in the "http://schemas.microsoft.com/win/2004/08/events/event" namespace), so no node is selected, SelectSingleNode() returns null, and reading .Value throws.
In C# it is recommended that you provide an XmlNamespaceManager as the second argument of SelectSingleNode() -- just use the appropriate overload.
Use:
evtASxml.SelectSingleNode(@"//x:Result/@value", yourXmlNamespaceManager).Value
where the association of "x" to the "http://schemas.microsoft.com/win/2004/08/events/event" namespace has been added to yourXmlNamespaceManager using the AddNamespace() method.
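A minimal sketch of that setup, assuming the event XML is already loaded into an XmlDocument. The class and method names are hypothetical; the prefix "x" is arbitrary and only has to match between AddNamespace() and the XPath expression.

```csharp
using System;
using System.Xml;

public static class EventResult
{
    private const string EventNs =
        "http://schemas.microsoft.com/win/2004/08/events/event";

    // Returns the value attribute of the first Result element, or null if none exists.
    public static string GetResultValue(XmlDocument evtASxml)
    {
        // Bind the prefix "x" to the document's default namespace.
        var ns = new XmlNamespaceManager(evtASxml.NameTable);
        ns.AddNamespace("x", EventNs);

        // Attributes without a prefix are in no namespace, so @value needs no prefix.
        XmlNode attr = evtASxml.SelectSingleNode("//x:Result/@value", ns);
        return attr?.Value;
    }
}
```

This also shows why //@value worked without any namespace handling: the default namespace applies to elements only, never to unprefixed attributes.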

RavenDB: help to architecture my project

This is a personal project to get started using RavenDB.
I have been using a mixing program for years whose track data is stored in an XML file. The structure is as follows:
<Song attribute="" attribute="">
<node1 attribute="" />
<node2 attribute="" attribute="" />
<node3 attribute="" attribute="" attribute="" />
<node4 attribute="" attribute="" />
<node5 attribute="" />
</Song>
<Song attribute="" attribute="">
<node1 attribute="" />
<node2 attribute="" />
<node3 attribute="" />
<node4 attribute="" attribute="" />
</Song>
I'd like to manipulate the data (CRUD and other niceties). When I'm done, I'd like to save everything into RavenDB and then into a new XML file. As the data consists of XML nodes, I think it is best to import all the nodes and their content into RavenDB.
To decouple the design from the schema, I plan to create at least two POCOs:
DAL : SongRecord POCO whose properties are those from a typical node
BL : Song POCO more business oriented
What should I do?
Json.NET can convert XML to JSON and back. Once the XML is converted to JSON, I can store it in RavenDB.
My BL communicates with the DAL, which queries RavenDB.
After a while I want to persist everything into the database and then export everything into a new XML file whose schema is the one I mentioned above.
What do you think about it? Is there something wrong? What's best instead? Remember it is a pet project to learn RavenDB.
Doesn't RavenDB handle the serialization back and forth to JSON? All you should have to handle is getting your POCOs back and forth to the XML format for the mixer program. Everywhere else, leave them as POCOs.
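To illustrate the point about keeping POCOs everywhere and converting only at the XML boundary, here is a sketch using XmlSerializer. The Song class and its Title and Artist attributes are invented for illustration, since the question only shows placeholder attribute names.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical POCO; the real attribute names come from the mixer's XML file.
public class Song
{
    [XmlAttribute] public string Title { get; set; }
    [XmlAttribute] public string Artist { get; set; }
}

public static class SongXml
{
    private static readonly XmlSerializer Serializer = new XmlSerializer(typeof(Song));

    // Reads one <Song .../> element into a POCO.
    public static Song FromXml(string xml)
    {
        using (var reader = new StringReader(xml))
        {
            return (Song)Serializer.Deserialize(reader);
        }
    }

    // Writes a POCO back out as a <Song .../> element.
    public static string ToXml(Song song)
    {
        using (var writer = new StringWriter())
        {
            Serializer.Serialize(writer, song);
            return writer.ToString();
        }
    }
}
```

With this in place, a Song instance would be handed to a RavenDB session as a plain object (Store followed by SaveChanges), and RavenDB would take care of the JSON representation, so no manual XML-to-JSON conversion step is needed.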
