XML serialization and LINQ - c#

I've currently got an XML element in my database that maps to an object (Long story short the XML is complicated and dynamic enough to defy a conventional relational data structure without massive performance hits).
Around it I've wrapped a series of C# objects that encapsulate the structure of the XML. They work off a base class, and there are multiple possible classes it can deserialize to, with different data structures and different implemented methods. I'm currently wrapping the functionality to serialize/deserialize these into a partial class of the LINQ-generated database objects.
My current approach to this is:
public Options GetOptions()
{
if (XmlOptions == null) return null;
XmlSerializer xs = new XmlSerializer(typeof(Options));
return (Options)xs.Deserialize(XmlOptions.CreateReader());
}
public void SetOptions(Options options)
{
if (options == null) XmlOptions = null;
else
{
XmlSerializer xs = new XmlSerializer(typeof(Options));
using (MemoryStream ms = new MemoryStream())
{
xs.Serialize(ms, options);
XmlOptions = System.Xml.Linq.XElement.Parse(System.Text.UTF8Encoding.UTF8.GetString(ms.ToArray()));
}
}
}
(To help with reading given the changed names aren't too clear, XmlOptions is the XElement element from LINQ and Options is my class it deserializes into)
Now, it works. But that's not really enough to call it "finished" to me :P It just seems incredibly inefficient to serialize XML to a memory stream, convert it to a string, then re-parse it as XML. My question is - Is this the cleanest way to do this? Is there a more efficient/tidy mechanism for doing the serialization? Is there a better approach that'll give me the same benefits (I've only got test data in the system so far, so I can change the XML structure if required)?
PS: I've renamed the fields to be more generic - so don't really need comments about naming conventions ;)

XmlSerializer.Serialize has an overload that takes an XmlWriter.
You can create an XmlWriter that writes to an existing XElement by calling the CreateWriter method.
You can therefore write the following:
static readonly XmlSerializer xs = new XmlSerializer(typeof(Options));
...
var temp = new XElement("Parent");
using (var writer = temp.CreateWriter())
xs.Serialize(writer, options);
XmlOptions = (XElement)temp.FirstNode;
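For reference, here is a minimal round-trip sketch of the same idea. It is a variation rather than the answer's exact code: it writes into a fresh XDocument and takes its Root instead of using a temporary parent element. The Options class, its Name and Level properties, and the OptionsXml helper are hypothetical stand-ins for the question's renamed types.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.Serialization;

// Hypothetical stand-in for the question's Options class.
public class Options
{
    public string Name { get; set; }
    public int Level { get; set; }
}

// Hypothetical helper; in the question this logic would live in the partial class.
public static class OptionsXml
{
    // Building an XmlSerializer is comparatively expensive, so one instance is reused.
    private static readonly XmlSerializer Serializer = new XmlSerializer(typeof(Options));

    public static XElement ToXElement(Options options)
    {
        if (options == null) return null;

        // Serialize straight into the LINQ to XML tree: no MemoryStream,
        // no intermediate string, no re-parsing.
        var doc = new XDocument();
        using (var writer = doc.CreateWriter())
        {
            Serializer.Serialize(writer, options);
        }
        return doc.Root;
    }

    public static Options FromXElement(XElement element)
    {
        if (element == null) return null;
        using (var reader = element.CreateReader())
        {
            return (Options)Serializer.Deserialize(reader);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        XElement xml = OptionsXml.ToXElement(new Options { Name = "demo", Level = 3 });
        Options copy = OptionsXml.FromXElement(xml);
        Console.WriteLine(xml);
        Console.WriteLine(copy.Name + " " + copy.Level);
    }
}
```

In the question's partial class, GetOptions and SetOptions could then simply delegate to helpers like these.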


Casting a List<MyClass> back from object returns a System.InvalidCastException C# [duplicate]

This question already has answers here:
Deep cloning objects
(58 answers)
Closed 1 year ago.
I have a Dictionary<string,object> where I save objects to pass to my plugin system. Some of these objects need to be copies rather than the originals, so I use this DeepClone method:
public static T DeepCloneJSON<T>(this T Obj)
{
var text = JsonSerializer.Serialize(Obj);
return JsonSerializer.Deserialize<T>(text);
}
For example, if I do:
var dict = new Dictionary<string,object>(){
{"someKey",new List<MyClass>(){new MyClass()}}
};
var copy = dict["someKey"].DeepCloneJSON();
var cast = (List<MyClass>)copy;
I get a System.InvalidCastException. If I use a MemoryStream-based DeepCopy method like this one, I don't get the exception:
public static T DeepCloneMemoryStream<T>(this T obj)
{
using (var ms = new MemoryStream())
{
var formatter = new BinaryFormatter();
formatter.Serialize(ms, obj);
ms.Position = 0;
return (T)formatter.Deserialize(ms);
}
}
So I would like to know why I get this exception with a System.Text.Json-based DeepClone method, and whether it is possible to do this with System.Text.Json, because in my tests it proved to be faster and to use less memory than the MemoryStream-based one.
Your DeepCloneMemoryStream recreates the exact type via Reflection, even if T is object.
Your DeepCloneJSON roundtrip via JSON string loses type information, and so deserialization requires the exact type to be specified. In your case, you are only passing object as T and therefore you get back a JsonElement instead of a List<MyClass>.
The following will work if you change how you make your call:
var copy = DeepCloneJSON((List<MyClass>)dict["someKey"]);
Alternatively, change your implementation such that deserialization happens according to that actual type of Obj instead of the specified type T - this would have the analogous effect to your other implementation, where T is only used for casting.
public static T DeepCloneJSON<T>(T Obj)
{
var text = JsonSerializer.Serialize(Obj);
return (T)JsonSerializer.Deserialize(text, Obj.GetType());
}
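To make the difference concrete, here is a small self-contained sketch of the runtime-type variant, written as an extension method so the original call site keeps working. The Id property on MyClass, the CloneExtensions class name and the null handling are assumptions added for illustration; it also passes the runtime type to Serialize explicitly, which is a slightly more defensive choice than the version above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical payload type; Id is an assumption for the demo.
public class MyClass
{
    public int Id { get; set; }
}

public static class CloneExtensions
{
    public static T DeepCloneJson<T>(this T obj)
    {
        if (obj == null) return default(T);

        // Use the runtime type for both directions, so a value that is
        // statically typed as object still round-trips to its real type
        // instead of coming back as a JsonElement.
        Type runtimeType = obj.GetType();
        string json = JsonSerializer.Serialize(obj, runtimeType);
        return (T)JsonSerializer.Deserialize(json, runtimeType);
    }
}

public static class Program
{
    public static void Main()
    {
        var dict = new Dictionary<string, object>
        {
            { "someKey", new List<MyClass> { new MyClass { Id = 7 } } }
        };

        object copy = dict["someKey"].DeepCloneJson(); // T is object here
        var cast = (List<MyClass>)copy;                // no InvalidCastException

        Console.WriteLine(cast[0].Id);
        Console.WriteLine(ReferenceEquals(cast, dict["someKey"]));
    }
}
```

Note that this only restores the type of the root object. Members declared as object or as a base type inside the graph can still lose their concrete type on the way back.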
A completely different approach
You can avoid the cast exception and the overhead of serializing and deserializing, and the possible issues with that (e.g. private members, non-serializable instances and such) by choosing among various open source NuGet packages that use techniques such as reflection emit or expression trees to avoid the serialization.
One example is FastDeepCloner as suggested in the comments. There are several others that provide similar features.

How to read value of an INode?

I know of two ways, but neither works the way I want:
1. [INode].ToString();
This returns the value in my node plus a "^^[predicate uri]" suffix, like this:
random node value.^^http://www.w3.org/2001/XMLSchema#string
2. [INode].ReadXml(XmlReader reader); I don't know how to use this one, because I can't find any examples.
Is there a way of retrieving only the value of the node?
Or is the ReadXml() method what I need? How do I use it?
Based on the NodeType you can cast to the appropriate interface and then access the value e.g.
switch (node.NodeType)
{
case NodeType.Literal:
return ((ILiteralNode)node).Value;
case NodeType.Uri:
return ((IUriNode)node).Uri.ToString();
// etc.
}
Or, if you are sure that your node is a literal, you might want to use node.AsValuedNode().AsString().
Note that the ReadXml()/WriteXml() methods are for .Net XML serialisation and are not intended for general use.
To get content you should use WriteXml instead of ReadXml function
var sb = new StringBuilder();
var xmlWriterSettings = new XmlWriterSettings
{ // Required in my case, but maybe not in yours; try different settings
ConformanceLevel = ConformanceLevel.Auto
};
using (var writer = XmlWriter.Create(sb, xmlWriterSettings))
rdfType.WriteXml(writer);
var result = sb.ToString();
I seem to have misunderstood XmlReader and XmlWriter. I now understand how they are meant to be used, but I can't get it working.
This is the message I get:
InvalidOperationException: This XmlWriter does not accept Attribute at this state Content.
I suppose I need to tweak the XmlWriterSettings to make it work.
I don't see any documentation on the XmlWriterSettings required for reading dotNetRDF INodes, so I shall use ToString() for now.
this is the part of the RDF/XML that the node holds:
<property:Paragraph rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Geef houvast.</property:Paragraph>
Isn't there some other way to extract the "Geef houvast." between the element?
Thanks for helping me understand the XmlReader/XmlWriter now!

C# - XML deserialization - ignore elements with attribute

I need to deserialize some xml to c# objects. This is my class:
[XmlRoot("root")]
[Serializable]
public class MyRoot
{
[XmlElement("category")]
public List<Category> Categories { get; set; }
}
I'm deserializing like this:
root = (MyRoot)new XmlSerializer(typeof(MyRoot)).Deserialize(new StringReader(client.DownloadString(XmlUrl)));
But I want to ignore some Category elements with specified "id" attribute values. Is there some way I can do this?
Implementing IXmlSerializable is one way to go, but perhaps an easier path would be simply modifying the XML (using LINQ or XSLT?) ahead of time:
HashSet<string> badIds = new HashSet<string>();
badIds.Add("1");
badIds.Add("excludeme");
XDocument xd = XDocument.Load(new StringReader(client.DownloadString(XmlUrl)));
var badCategories = xd.Root.Descendants("category").Where(x => badIds.Contains((string)x.Attribute("id")));
if (badCategories != null && badCategories.Any())
badCategories.Remove();
MyRoot root = (MyRoot)new XmlSerializer(typeof(MyRoot)).Deserialize(xd.Root.CreateReader());
You could do something similar on your resulting collection, but it's entirely possible you don't serialize the id, and may not want to/need to otherwise.
Another approach is to have a property named something like ImportCategories with the [XmlElement("category")] attribute and then have Categories as a property that returns a filtered list from ImportCategories using LINQ.
Then your code would do the deserialization and then use root.Categories.
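A minimal sketch of that second approach follows. The Category shape (an id attribute plus text content), the ExcludedIds set and the sample XML are assumptions for illustration; the question does not show the real Category class.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Serialization;

// Hypothetical Category shape: <category id="...">name</category>
public class Category
{
    [XmlAttribute("id")]
    public string Id { get; set; }

    [XmlText]
    public string Name { get; set; }
}

[XmlRoot("root")]
public class MyRoot
{
    // Assumed list of ids to skip.
    private static readonly HashSet<string> ExcludedIds =
        new HashSet<string> { "1", "excludeme" };

    // The serializer fills this list with every <category> element.
    [XmlElement("category")]
    public List<Category> ImportCategories { get; set; }

    // Callers use this filtered view; XmlIgnore keeps the serializer away from it.
    [XmlIgnore]
    public List<Category> Categories
    {
        get
        {
            if (ImportCategories == null) return new List<Category>();
            return ImportCategories.Where(c => !ExcludedIds.Contains(c.Id)).ToList();
        }
    }
}

public static class Program
{
    public static MyRoot Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(MyRoot));
        using (var reader = new StringReader(xml))
        {
            return (MyRoot)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        var root = Load(
            "<root><category id=\"1\">A</category><category id=\"2\">B</category></root>");
        foreach (var c in root.Categories)
        {
            Console.WriteLine(c.Id + ": " + c.Name);
        }
    }
}
```

The trade-off is that the unwanted elements are still deserialized and kept in memory; they are only hidden from callers.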
To do this the Microsoft way, you would need to implement an IXmlSerializable interface for the class that you want to serialize:
https://msdn.microsoft.com/en-us/library/system.xml.serialization.ixmlserializable(v=vs.110).aspx
It's going to require some hand-coding on your part: you basically have to implement the WriteXml and ReadXml methods, which receive an XmlWriter and an XmlReader respectively, to do what you need to do.
Just remember to keep your classes pretty atomic, so that you don't end up custom-serializing for the entire object graph (ugh).

serialization of IList<object[]> XmlSerializer with Generic Lists

I am getting a value of type IList<object[]>. What is the best way to serialize it to XML
and then read it back into an IList<object[]>?
I just don't see any easy way to do so.
Thanks for the help.
The XmlSerializer chokes on interfaces. So you could convert it to an array or a concrete List<T> before serializing. Also you should absolutely specify known types because this object[] will simply not work. The serializer must know in advance all types that you will be dealing with. This way it will emit type information into the resulting XML:
var data = list.ToArray();
var knownTypes = new[] { typeof(Foo), typeof(Bar) };
var serializer = new XmlSerializer(data.GetType(), knownTypes);
serializer.Serialize(someStream, data);
Or, if you don't want to bother with all this and simply want some human-readable persistence for your objects, you could use JSON:
var serializer = new JavaScriptSerializer();
string json = serializer.Serialize(list);
And if you don't care about human readability, a binary serializer should do just fine.
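Since the question also asks about reading the data back, here is a rough round-trip sketch of the XmlSerializer approach. Foo and the ListXml helper are hypothetical names; the idea is simply to serialize the list as an object[][] and rebuild a List<object[]> afterwards.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Serialization;

// Hypothetical custom type that may appear inside the object[] rows.
public class Foo
{
    public string Name { get; set; }
}

public static class ListXml
{
    // Serializers built with this (Type, Type[]) constructor are not cached
    // by the framework, so the instance is created once and reused.
    private static readonly XmlSerializer Serializer =
        new XmlSerializer(typeof(object[][]), new[] { typeof(Foo) });

    public static string Serialize(IList<object[]> rows)
    {
        using (var writer = new StringWriter())
        {
            // The serializer needs a concrete type, hence ToArray().
            Serializer.Serialize(writer, rows.ToArray());
            return writer.ToString();
        }
    }

    public static IList<object[]> Deserialize(string xml)
    {
        using (var reader = new StringReader(xml))
        {
            var rows = (object[][])Serializer.Deserialize(reader);
            return new List<object[]>(rows);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        IList<object[]> list = new List<object[]>
        {
            new object[] { 1, "two" },
            new object[] { new Foo { Name = "three" } }
        };

        string xml = ListXml.Serialize(list);
        IList<object[]> copy = ListXml.Deserialize(xml);

        Console.WriteLine(xml);
        Console.WriteLine(copy.Count);
    }
}
```

Every type that can occur inside the rows, other than the built-in primitives and strings, has to be listed in the known types, otherwise serialization fails at runtime.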

.NET XML serialization

I would like to serialize and deserialize mixed data into XML. After some searches I found out that there were two ways of doing it: System.Runtime.Serialization.Formatters.Soap.SoapFormatter and System.Xml.Serialization.XmlSerializer. However, neither turned out to match my requirements, since:
SoapFormatter doesn't support serialization of generic types
XmlSerializer refuses to serialize types that implement IDictionary, not to mention that it's far less easy-to-use than "normal" serialization (e.g. see this SO question)
I'm wondering if there exists an implementation that doesn't have these limitations? I have found attempts (for example CustomXmlSerializer and YAXLib as suggested in a related SO question), but they don't seem to work either.
I was thinking about writing such a serializer myself (though it certainly doesn't seem to be a very easy task), but then I find myself limited by the CLR, as I cannot create instances of types that don't have a parameterless constructor, even if I'm using reflection. I recall having read somewhere that implementations in System.Runtime.Serialization somehow bypass the normal object creation mechanism when deserializing objects, though I'm not sure. Any hints on how this could be done?
(See edit #3)
Could somebody please point me to the right direction with this?
Edit: I'm using .NET 3.5 SP1.
Edit #2: Just to be clear, I'd like a solution that is as much like using BinaryFormatter as possible, meaning that it should require the least possible extra code and annotations.
Edit #3: With some extra Googling, I found a .NET method called System.Runtime.Serialization.FormatterServices.GetUninitializedObject that can actually return "zeroed" objects of a specified type, which is a great help in deserialization (if I get to implement it myself). I'd still like to find an existing solution though.
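As an illustration of what Edit #3 describes, here is a tiny sketch. NoDefaultCtor is a hypothetical type; the point is that no constructor and no field initializer runs, so every field starts at its default value. On recent .NET versions FormatterServices is marked obsolete and the compiler will warn about it.

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical type with no parameterless constructor.
public class NoDefaultCtor
{
    public int Value = 42;                 // field initializer, normally run by the constructor
    public string Name { get; private set; }

    public NoDefaultCtor(string name)
    {
        Name = name;
    }
}

public static class Program
{
    public static NoDefaultCtor CreateBlank()
    {
        // Allocates the object without invoking any constructor.
        return (NoDefaultCtor)FormatterServices.GetUninitializedObject(typeof(NoDefaultCtor));
    }

    public static void Main()
    {
        NoDefaultCtor blank = CreateBlank();
        Console.WriteLine(blank.Value);        // 0, because the initializer never ran
        Console.WriteLine(blank.Name == null); // True
    }
}
```

A hand-written deserializer would then populate the fields of such an instance through reflection.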
I've had great success using the DataContractSerializer class.
Here's a pretty good article comparing serializers link
Depending on your .NET version and the complexity of your data, you may have luck using LINQ to XML to serialize:
internal class Inner
{
public int Number { get; set; }
public string NotNumber { get; set; }
}
internal class Outer
{
public int ID { get; set; }
public Dictionary<string, Inner> Dict { get; set; }
}
internal class Program
{
private static void Main()
{
var data = new Outer
{
ID = 1,
Dict =
new Dictionary<string, Inner>
{
{
"ABC",
new Inner
{
Number = 1,
NotNumber = "ABC1"
}
},
{
"DEF",
new Inner
{
Number = 2,
NotNumber = "DEF2"
}
}
}
};
var serialized =
    new XDocument(
        new XElement("Outer",
            new XAttribute("id", data.ID),
            new XElement("Dict",
                from i in data.Dict
                select new XElement("Entry",
                    new XAttribute("key", i.Key),
                    new XAttribute("number", i.Value.Number),
                    new XAttribute("notNumber", i.Value.NotNumber)))));
Console.WriteLine(serialized);
Console.Write("ENTER to finish: ");
Console.ReadLine();
}
}
Result:
<Outer id="1">
<Dict>
<Entry key="ABC" number="1" notNumber="ABC1" />
<Entry key="DEF" number="2" notNumber="DEF2" />
</Dict>
</Outer>
Deserializing:
private static Outer Deserialize(XDocument serialized)
{
    // The document's root element is <Outer> itself, so use Root directly.
    // Looking for an "Outer" child of the root would always return null.
    var outerElement = serialized.Root;
    if (outerElement == null || outerElement.Name != "Outer")
    {
        return null;
    }
    return new Outer
    {
        ID = int.Parse(outerElement.Attribute("id").Value),
        Dict = outerElement.Element("Dict")
            .Elements("Entry")
            .ToDictionary(
                k => k.Attribute("key").Value,
                v => new Inner
                {
                    Number = Convert.ToInt32(v.Attribute("number").Value),
                    NotNumber = v.Attribute("notNumber").Value
                })
    };
}
Which version of the .NET Framework do you use? If you are on .NET 3.0 or higher, you may have luck with the DataContractSerializer or the NetDataContractSerializer. Both of those serialize to XML but work quite differently from the XmlSerializer.
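As a rough illustration, the sketch below round-trips a type containing a Dictionary with DataContractSerializer. It reuses the Outer/Inner shape from the LINQ to XML answer, made public and annotated with [DataContract]/[DataMember]; the OuterXml helper is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Text;

[DataContract]
public class Inner
{
    [DataMember] public int Number { get; set; }
    [DataMember] public string NotNumber { get; set; }
}

[DataContract]
public class Outer
{
    [DataMember] public int ID { get; set; }

    // Dictionaries are supported here, unlike with XmlSerializer.
    [DataMember] public Dictionary<string, Inner> Dict { get; set; }
}

public static class OuterXml
{
    private static readonly DataContractSerializer Serializer =
        new DataContractSerializer(typeof(Outer));

    public static byte[] Serialize(Outer value)
    {
        using (var stream = new MemoryStream())
        {
            Serializer.WriteObject(stream, value);
            return stream.ToArray();
        }
    }

    public static Outer Deserialize(byte[] xml)
    {
        using (var stream = new MemoryStream(xml))
        {
            return (Outer)Serializer.ReadObject(stream);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var data = new Outer
        {
            ID = 1,
            Dict = new Dictionary<string, Inner>
            {
                { "ABC", new Inner { Number = 1, NotNumber = "ABC1" } }
            }
        };

        byte[] xml = OuterXml.Serialize(data);
        Outer copy = OuterXml.Deserialize(xml);

        Console.WriteLine(Encoding.UTF8.GetString(xml));
        Console.WriteLine(copy.Dict["ABC"].NotNumber);
    }
}
```

The attributes are the main extra cost compared with BinaryFormatter, which the question wanted to stay close to.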
Implementing custom XML serialization is not too bad. You could implement IXmlSerializable in the class that the normal XmlSerializer cannot support by default.
This CodeProject article has a good explanation of IXmlSerializable, and this blog post provides another look at more or less the same thing.
I suggest DataContractJsonSerializer, which gives shorter output and handles dictionaries better.
