I am transferring large chunks of data over WCF, and I am trying to reduce the size of what is sent. I have switched from NetDataContractSerializer to DataContractSerializer, but since the data is no longer being serialized as XML text, I wonder exactly what happens.
As an example, imagine I serialize a collection (100,000 records) of the following to XML:
public class SomeDataObject
{
public string AnExcessivelyLongPropertyNameJustToIllustrateMyPoint { get; set; }
}
It would look something like this:
<a:SomeDataObject>
<b:AnExcessivelyLongPropertyNameJustToIllustrateMyPoint>
ABC
</b:AnExcessivelyLongPropertyNameJustToIllustrateMyPoint>
</a:SomeDataObject>
Now, from the above, it is clear that for hundreds of thousands of records, there is a significant performance gain in naming the property something else, like this:
<a:SomeDataObject>
<b:NormalName>ABC</b:NormalName>
</a:SomeDataObject>
My question is: when using a netTcp binding and the default DataContractSerializer, is it intelligent enough to not actually repeat the names of the properties being serialized?
Or if you don't know the answer to this, is there an easy way to measure this?
For WCF serialization, to get smaller XML, set a short name in the Name property of the attributes, like:
[DataContract(Name="sdo")]
public class SomeDataObject
{
[DataMember(Name = "axlpnjtimp")]
public string AnExcessivelyLongPropertyNameJustToIllustrateMyPoint { get; set; }
}
For manual XML serialization using XmlSerializer, add the [XmlType(TypeName = "x")] attribute to the class and [XmlElement("y")] to the properties.
In your case I would do something like:
[XmlType(TypeName = "sdo")]
public class SomeDataObject
{
[XmlElement("axlpnjtimp")]
public string AnExcessivelyLongPropertyNameJustToIllustrateMyPoint { get; set; }
}
And serialized xml:
<?xml version="1.0" encoding="utf-16"?>
<sdo>
<axlpnjtimp>Property Name</axlpnjtimp>
</sdo>
With text XML, this reduces the size of the data sent considerably, because every element name is written out twice per value.
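On the question of how to measure this: one rough way is to serialize the same collection with DataContractSerializer through a text writer and through a binary writer, and compare the byte counts. The sketch below assumes that approach; LongNamed, ShortNamed and PayloadSize are hypothetical names used only for illustration. The standalone binary writer is only an approximation of what a netTcp channel sends, since the channel's binary encoder may additionally use a string dictionary for repeated names.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Xml;

// Hypothetical contract that keeps the long member name on the wire.
[DataContract]
public class LongNamed
{
    [DataMember]
    public string AnExcessivelyLongPropertyNameJustToIllustrateMyPoint { get; set; }
}

// Hypothetical contract that maps the same member to a short wire name.
[DataContract(Name = "sdo")]
public class ShortNamed
{
    [DataMember(Name = "p")]
    public string AnExcessivelyLongPropertyNameJustToIllustrateMyPoint { get; set; }
}

public static class PayloadSize
{
    // Serializes the object graph and returns the number of bytes produced.
    public static long Measure<T>(T graph, bool binary)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (var stream = new MemoryStream())
        using (var writer = binary
            ? XmlDictionaryWriter.CreateBinaryWriter(stream)
            : XmlDictionaryWriter.CreateTextWriter(stream))
        {
            serializer.WriteObject(writer, graph);
            writer.Flush();        // push buffered output into the stream
            return stream.Length;  // read the length before the writer closes the stream
        }
    }
}
```

Calling PayloadSize.Measure on a List<LongNamed> and a List<ShortNamed> of the same length, with binary set to true and to false, gives four numbers to compare. For the real on-the-wire size, a network capture or WCF message logging would be more accurate.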
Related
I'm troubleshooting some old existing .NET 4.6.1 code that XML-serializes this class:
public class Orders
{
private int _pagenumber = 0;
[XmlAttribute]
public int pages
{
get { return _pagenumber; }
set { _pagenumber = value; }
}
[XmlText]
public string OrdersXml { get; set; }
}
The OrdersXml string contains a block of already-XML-serialized Order objects (i.e. XML text like: "<Order><OrderId>1</OrderId>...</Order><Order>...</Order>..."). (They are being XML serialized elsewhere for a variety of reasons and this is not subject to redesign.)
The intent is to include that block of XML verbatim in the serialization of this Orders object - in other words, as if string OrdersXml was instead an Order[] OrdersXml being serialized as part of the Orders object, ending up like: <Orders pages="6"><Order><OrderID>123456</OrderID>...</Order>...</Orders>
But that's not happening. The XML in the OrdersXml property is being serialized as XML-escaped plain text, and it's coming out as "<Orders pages="6">&lt;Order&gt;&lt;OrderID&gt;2&lt;/OrderID&gt;..." - the code is doing post-serialization cleanup to reverse that, and the result is usably correct in most cases. I'd rather it serialize correctly in the first place...
I've tried using [XmlText(typeof(string))] instead but that didn't help.
Is the XmlSerializer ignoring the [XmlText] attribute on OrdersXml, or is that not what [XmlText] is intended to do?
What is the "correct" best-practice way to composite XML like this?
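One possible approach, sketched here rather than offered as the definitive answer: let the class implement IXmlSerializable and emit the pre-serialized block with XmlWriter.WriteRaw, which writes the string without escaping it. The OrdersXmlDemo helper is a hypothetical name for illustration. Note that WriteRaw does not check that the string is well-formed XML, so this assumes OrdersXml can be trusted.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

public class Orders : IXmlSerializable
{
    public int pages { get; set; }

    // Already-serialized <Order>...</Order> fragments.
    public string OrdersXml { get; set; }

    public XmlSchema GetSchema()
    {
        return null; // no schema is provided for this type
    }

    public void WriteXml(XmlWriter writer)
    {
        // The serializer has already opened <Orders>; add the attribute, then the raw block.
        writer.WriteAttributeString("pages", XmlConvert.ToString(pages));
        if (!string.IsNullOrEmpty(OrdersXml))
        {
            writer.WriteRaw(OrdersXml); // written verbatim, no escaping and no validation
        }
    }

    public void ReadXml(XmlReader reader)
    {
        // The reader is positioned on the <Orders> start tag.
        string pagesText = reader.GetAttribute("pages");
        pages = pagesText == null ? 0 : XmlConvert.ToInt32(pagesText);
        OrdersXml = reader.ReadInnerXml(); // consumes the content and the end tag
    }
}

// Hypothetical helper, for illustration only.
public static class OrdersXmlDemo
{
    public static string Serialize(Orders orders)
    {
        var serializer = new XmlSerializer(typeof(Orders));
        using (var text = new StringWriter())
        {
            serializer.Serialize(text, orders);
            return text.ToString();
        }
    }
}
```

With this in place the post-serialization cleanup step should no longer be needed, at the cost of maintaining the read and write code by hand.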
I'm writing a C# library that, as one of its functions, needs to be able to accept XML of the following forms from a web service and deserialize them.
Form 1:
<results>
<sample>
<status>status message</status>
<name>sample name</name>
<igsn>unique identifier for this sample</igsn>
</sample>
</results>
Form 2:
<results>
<sample name="sample name">
<valid code="InvalidSample">no</valid>
<status>Not Saved</status>
<error>error message</error>
</sample>
</results>
Here's my class that I'm deserializing to:
namespace MyNamespace
{
[XmlRoot(ElementName = "results")]
public class SampleSubmissionResponse
{
[XmlElement("sample")]
public List<SampleSubmissionSampleResultRecord> SampleList { get; set; }
...
}
public class SampleSubmissionSampleResultRecord
{
...
/* RELEVANT PROPERTY RIGHT HERE */
[XmlAttribute(AttributeName = "name")]
[XmlElement(ElementName = "name")]
public string Name { get; set; }
...
}
public class SampleSubmissionValidRecord
{
...
}
}
The problem is that in one XML form the name of the sample is a child element of the sample element, and in the other it's an attribute. If I decorate the property of my class with both XmlAttribute and XmlElement, an exception is thrown when creating the XmlSerializer instance.
I've been googling for a good while now, and I can't find any docs that deal with this situation. I assume, but don't know for sure, that this is because when creating an XML schema, you're not supposed to use the same name for an attribute and a child element of the same element.
So, what do I do here?
One solution might be to have two totally separate models for the different types. That would probably work, but doesn't seem very elegant.
Another option might be to implement IXmlSerializable and write some elaborate code to handle this in the deserialize method. That would be an awfully verbose solution to a simple problem.
Third option I'm hoping for: some way of applying both XmlAttribute and XmlElement to the same property, or an equivalent "either-or" attribute.
Fourth option: Change the web service the XML comes from to use one form consistently. Unfortunately, the folks who own it may not be willing to do this.
Apply only one attribute to the Name property. This will correctly parse the first XML form.
public class SampleSubmissionSampleResultRecord
{
[XmlElement(ElementName = "name")]
public string Name { get; set; }
}
To parse the second XML form, subscribe to the XmlSerializer's UnknownAttribute event.
var xs = new XmlSerializer(typeof(SampleSubmissionResponse));
xs.UnknownAttribute += Xs_UnknownAttribute;
In the event handler, we get the desired value.
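private void Xs_UnknownAttribute(object sender, XmlAttributeEventArgs e)
{
// Only handle the "name" attribute on a sample record; other unknown
// attributes (for example on a different element) are left alone.
if (e.Attr.Name == "name" && e.ObjectBeingDeserialized is SampleSubmissionSampleResultRecord record)
{
record.Name = e.Attr.Value;
}
}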
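A different technique from the event handler above, sketched under the assumption that it is acceptable to expose two extra members: map the attribute and the element to separate properties (XmlSerializer tracks attribute names and element names separately, so both can be called "name"), and expose a read-only Name that picks whichever one was present. The classes below are trimmed stand-ins for the ones in the question, and NameAttribute, NameElement and SampleParser are hypothetical names.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("results")]
public class SampleSubmissionResponse
{
    [XmlElement("sample")]
    public List<SampleSubmissionSampleResultRecord> SampleList { get; set; }
        = new List<SampleSubmissionSampleResultRecord>();
}

public class SampleSubmissionSampleResultRecord
{
    // Filled when the service sends <sample name="...">.
    [XmlAttribute("name")]
    public string NameAttribute { get; set; }

    // Filled when the service sends <name>...</name>.
    [XmlElement("name")]
    public string NameElement { get; set; }

    // Whichever form was present; never serialized itself.
    [XmlIgnore]
    public string Name
    {
        get { return NameElement ?? NameAttribute; }
    }

    [XmlElement("status")]
    public string Status { get; set; }
}

// Hypothetical helper, for illustration only.
public static class SampleParser
{
    public static SampleSubmissionResponse Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(SampleSubmissionResponse));
        using (var reader = new StringReader(xml))
        {
            return (SampleSubmissionResponse)serializer.Deserialize(reader);
        }
    }
}
```

The trade-off is two extra public members on the model in exchange for not needing an event subscription on every serializer instance.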
I'm using the built-in XML deserialization (not by choice, it's legacy code) to deserialize XML into a strongly typed object.
NOTE: I have no control over the XML; it comes from an external API.
The problem is that an XML node has been extended to include a child node of the same name, and it's breaking the deserialization.
For example, the xml as follows:
<people>
<person>
<id>1234</id>
<person>
<name>This is my name</name>
</person>
</person>
</people>
With the following objects
[XmlType("person")]
public class Person {
[XmlElement("id")]
public int Id { get; set; }
[XmlElement("person")]
public PersonTitle Title{ get; set; }
}
[XmlType("person")]
public class PersonTitle
{
[XmlElement("name")]
public string Name { get; set; }
}
This is throwing an error when calling (T)xmlserializer.Deserialize(stream) due to the duplicate names even though the xml is valid. Personally I would not have gone to the trouble to replicate the xml layout in objects just to automatically deserialize it when manually deserializing is easier to maintain (especially when it's never serialized by .net in the first place).
However, I'd like to know if there's a way I can get around this, even if it means flattening the child object out.
I know this doesn't work, but as example:
[XmlType("person")]
public class Person {
[XmlElement("id")]
public int Id { get; set; }
[XmlElement("person/name")]
public string Title{ get; set; }
}
Any help is appreciated.
The easiest method might be to run it through an XSLT transform before deserializing it: match the person/person/name elements and output just the person/name part. Then deserialize the result.
Here's a SO post on applying XSLT within C#: How to apply an XSLT Stylesheet in C#
And here is one on using XSLT to replace elements: http://cvalcarcel.wordpress.com/2008/09/06/replacing-arbitrary-xml-located-within-an-xml-document-using-xslt/
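As a minimal sketch of that idea, assuming an identity transform plus one template that unwraps any person element nested directly inside another person: the stylesheet and the PersonFlattener class are illustrative, not taken from either linked post.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical helper, for illustration only.
public static class PersonFlattener
{
    // Identity transform, plus a rule that replaces a nested <person>
    // with its children so <name> ends up directly under the outer <person>.
    private const string Stylesheet = @"
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:template match='@*|node()'>
    <xsl:copy>
      <xsl:apply-templates select='@*|node()'/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match='person/person'>
    <xsl:apply-templates select='node()'/>
  </xsl:template>
</xsl:stylesheet>";

    public static string Flatten(string xml)
    {
        var transform = new XslCompiledTransform();
        using (var stylesheetReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            transform.Load(stylesheetReader);
        }

        var result = new StringWriter();
        using (var input = XmlReader.Create(new StringReader(xml)))
        using (var output = XmlWriter.Create(result, transform.OutputSettings))
        {
            transform.Transform(input, output);
        }
        return result.ToString();
    }
}
```

The flattened string can then be deserialized into a Person class that maps name directly, for example with [XmlElement("name")] on the Title property.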
In a worst case scenario you could write the class however you like (don't compromise due to serialization) and then implement IXmlSerializable. Implement ReadXml, throw NotImplementedException for WriteXml if you like.
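It may also be worth checking whether the nesting itself is really the problem. A likely cause of the exception is that both classes declare the same XML type name through [XmlType("person")], while the element name of the child is already set by [XmlElement("person")] on the property. The sketch below assumes that diagnosis and gives PersonTitle a distinct type name; the People root class, the "personTitle" type name and the PeopleParser helper are hypothetical, since the question does not show them.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical root type; the question does not show how <people> is mapped.
[XmlRoot("people")]
public class People
{
    [XmlElement("person")]
    public List<Person> Items { get; set; } = new List<Person>();
}

[XmlType("person")]
public class Person
{
    [XmlElement("id")]
    public int Id { get; set; }

    // The element name comes from here, not from PersonTitle's XmlType.
    [XmlElement("person")]
    public PersonTitle Title { get; set; }
}

// A distinct XML type name avoids two types both claiming "person".
[XmlType("personTitle")]
public class PersonTitle
{
    [XmlElement("name")]
    public string Name { get; set; }
}

// Hypothetical helper, for illustration only.
public static class PeopleParser
{
    public static People Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(People));
        using (var reader = new StringReader(xml))
        {
            return (People)serializer.Deserialize(reader);
        }
    }
}
```

If this holds for the real classes, no pre-processing of the incoming XML is needed.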
I have a fairly simple DAL assembly that consists of a SalesEnquiry class, which contains a List<T> of another class, Vehicle.
We'll be receiving XML files by email that I want to use to populate instances of my SalesEnquiry class, so I'm trying to use de-serialization.
I've added XmlRoot/XmlElement/XmlIgnore attributes to both classes as I think is appropriate. However, when I try de-serializing, the parent SalesEnquiry object is populated but no child Vehicle objects.
I understand that de-serializing List<T> can be tricky, but I'm not sure why, how to avoid problems, or even if this is why I'm struggling.
While debugging, I've successfully serialized a Vehicle object on its own, so I'm assuming that I'm heading in the right direction, but when I de-serialize the SalesEnquiry XML (which contains one or more child Vehicles), the List<Vehicle> isn't populated.
Where am I going wrong?
Update:
In a test project, I serialized a SalesEnquiry containing two vehicles and saved it to a file. I then loaded the file and de-serialized it back into a different SalesEnquiry object. It worked!
So what was the difference? The vehicles were recorded as follows:
<?xml version="1.0" encoding="utf-8"?>
<enquiry xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<enquiry_no>100001</enquiry_no>
<vehicles>
<Vehicle>
<vehicle_type>Car</vehicle_type>
<vehicle_make>Ford</vehicle_make>
<vehicle_model>C-Max</vehicle_model>
...
The thing to note is that Vehicle has an initial capital, whereas my incoming XML doesn't. In my Vehicle class, I gave the class an [XmlRoot("vehicle")] attribute, which I thought would make the link, but clearly it doesn't. This makes sense, I guess, because although Vehicle is a class in its own right, it's merely an array item within a List inside my SalesEnquiry.
In which case, the question is - How do I annotate the Vehicle class such that I can map the incoming XML elements (<vehicle>) to my list items (Vehicle)? [XmlArrayItem] (or [XmlElement] for that matter) are 'not valid on this declaration type'.
In this example, I can request that the people who generate the XML use <Vehicle> rather than <vehicle>, but there may be situations where I don't have this freedom, so I'd rather learn a solution than apply a workaround.
Conclusion:
By adding [XmlArrayItem("vehicle", typeof(Vehicle))] to the existing decoration for my List, the XML is now able to de-serialize fully. Phew!
Here's a working pair of classes with appropriate decorations:
(Note: The XmlAnyElement and XmlAnyAttribute are optional. It's a habit I'm in to promote the flexibility of the entity.)
[XmlType("enquiry")]
[XmlRoot("enquiry")]
public class Enquiry
{
private List<Vehicle> vehicles = new List<Vehicle>();
[XmlElement("enquiry_no")]
public int EnquiryNumber { get; set; }
[XmlArray("vehicles")]
[XmlArrayItem("vehicle", typeof(Vehicle))] // item name must match the incoming XML, which uses lowercase <vehicle>
public List<Vehicle> Vehicles
{
get { return this.vehicles; }
set { this.vehicles = value ?? new List<Vehicle>(); }
}
[XmlAnyElement]
public XmlElement[] AnyElements;
[XmlAnyAttribute]
public XmlAttribute[] AnyAttributes;
}
public class Vehicle
{
[XmlElement("vehicle_type")]
public string VehicleType { get; set; }
[XmlElement("vehicle_make")]
public string VehicleMake { get; set; }
[XmlElement("vehicle_model")]
public string VehicleModel { get; set; }
}
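A short self-contained check of the array-item mapping, as a sketch: the classes are trimmed versions of the ones above, the item name is lowercase to match the incoming XML described in the question, and EnquiryParser is a hypothetical helper.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("enquiry")]
public class Enquiry
{
    [XmlElement("enquiry_no")]
    public int EnquiryNumber { get; set; }

    // <vehicles> wraps the list; each item is read from a <vehicle> element.
    [XmlArray("vehicles")]
    [XmlArrayItem("vehicle", typeof(Vehicle))]
    public List<Vehicle> Vehicles { get; set; } = new List<Vehicle>();
}

public class Vehicle
{
    [XmlElement("vehicle_make")]
    public string VehicleMake { get; set; }

    [XmlElement("vehicle_model")]
    public string VehicleModel { get; set; }
}

// Hypothetical helper, for illustration only.
public static class EnquiryParser
{
    public static Enquiry Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(Enquiry));
        using (var reader = new StringReader(xml))
        {
            return (Enquiry)serializer.Deserialize(reader);
        }
    }
}
```

Element names are matched case-sensitively, which is why the name given to XmlArrayItem has to agree exactly with the incoming document.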
The following DataContract:
[DataContract(Namespace = "http://namespace", Name = "Blarg")]
public class Blarg
{
[XmlAttribute("Attribute")]
public string Attribute{ get; set; }
[DataMember(Name = "Record", IsRequired = false, Order = 4)]
public List<Record> Record{ get; set; }
}
Serializes into this:
<Blarg Attribute="blah">
<Record>
<Record/>
<Record/>
<Record/>
</Record>
</Blarg>
But I want this:
<Blarg>
<Record/>
<Record/>
<Record/>
</Blarg>
The DataContractSerializer seems to be inserting the header parent automagically and I don't want it.
How do I go about removing the wrapping <Record>?
I don't think you can do that.
The DataContractSerializer is optimized for speed, and in the process it sacrifices some flexibility and some features (like XML attributes). I don't think you have much chance to influence the DCS - it does its job as it sees fit, and as quickly as possible. You get to define quite neatly what to serialize (with the [DataMember] attribute), but you don't really have a say in how to serialize.
If you need more control, you could pick the XmlSerializer instead - in that case, you have 10-15% slower serialization, but you can control things like the shape of the data etc. But even in this case - I am not aware of any way you can tell the XML serializer to serialize a collection into a series of XML tags without an enclosing tag for the collection.
I found the answer here.
See the short story below (for the long one, check out the URL):
[XmlElement ("Parameter")]
public List<Parameter> Parameters;
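Applied to the Blarg example from the question, a minimal sketch could look like the following. It assumes switching from DataContractSerializer to XmlSerializer is acceptable; the empty Record class and the BlargDemo helper are placeholders for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Placeholder item type; the question does not show Record's members.
public class Record
{
}

[XmlRoot("Blarg")]
public class Blarg
{
    [XmlAttribute("Attribute")]
    public string Attribute { get; set; }

    // XmlElement on a collection writes each item as its own <Record>
    // element directly under <Blarg>, with no wrapping element.
    [XmlElement("Record")]
    public List<Record> Records { get; set; } = new List<Record>();
}

// Hypothetical helper, for illustration only.
public static class BlargDemo
{
    public static string Serialize(Blarg blarg)
    {
        var serializer = new XmlSerializer(typeof(Blarg));
        using (var text = new StringWriter())
        {
            serializer.Serialize(text, blarg);
            return text.ToString();
        }
    }
}
```

Using [XmlArray] and [XmlArrayItem] instead would bring the wrapping element back, so the choice between the two attributes is what controls the shape.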