How to use XmlSerializer to deserialize into an existing instance?

Is it somehow possible to use the XmlSerializer to deserialize its data into an existing instance of a class rather than into a new one?
This would be helpful in two cases:
1) Easily merging two XML files into one object instance.
2) Letting the object's constructor itself load its data from the XML file.
If this is not possible by default, it should work by using reflection (copying each property after the deserialization), but that would be an ugly solution.

Basically, you can't. XmlSerializer is strictly constructive. The only interesting thing you can do to customize XmlSerializer is to implement IXmlSerializable and do everything yourself - not an attractive option (and it will still create new instances with the default constructor, etc).
Is XML a strict requirement? If you can use a different format, protobuf-net supports merging fragments into existing instances, as simply as:
Serializer.Merge(source, obj);

I think you're on the right track with the Reflection idea.
Since you probably have a wrapper around the XML operations anyway, you could take in the destination object, do the deserialization normally into a new object, then do something similar to cloning by copying over one by one only the properties holding non-default values.
It shouldn't be that complex to implement this, and it would look to consumers from the rest of your application just like in-place deserialization.
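
A minimal sketch of that idea, assuming the target type has a public parameterless constructor and exposes its data as public read/write properties. The names XmlMerge, DeserializeInto and Settings are made up for illustration and are not part of any library:

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Xml.Serialization;

public static class XmlMerge
{
    // Deserializes the XML into a temporary T, then copies onto the target
    // every public read/write property whose value differs from the value
    // a freshly constructed T has.
    public static void DeserializeInto<T>(string xml, T target) where T : class, new()
    {
        var serializer = new XmlSerializer(typeof(T));
        T loaded;
        using (var reader = new StringReader(xml))
        {
            loaded = (T)serializer.Deserialize(reader);
        }

        var defaults = new T();
        foreach (PropertyInfo p in typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            // Skip read-only properties and indexers.
            if (!p.CanRead || !p.CanWrite || p.GetIndexParameters().Length > 0) continue;

            object value = p.GetValue(loaded, null);
            if (!object.Equals(value, p.GetValue(defaults, null)))
            {
                p.SetValue(target, value, null);
            }
        }
    }
}

// Hypothetical settings class used only to illustrate the call.
public class Settings
{
    public string Name { get; set; }
    public int Port { get; set; }
}
```

Calling XmlMerge.DeserializeInto("<Settings><Port>8080</Port></Settings>", existing) would leave existing.Name alone and overwrite only Port. One limitation of this approach: a value that is present in the XML but happens to equal the default cannot be told apart from a missing one, so it is not copied.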

I hit the same problem a few weeks ago.
I put a method Deserialize(string serializedForm) in the ISelfSerializable interface that an entity class of mine implemented. I also made sure the interface forced the class to have a default constructor.
In my factory I created an object of that type and then deserialized the string into it.

This is not a thread-safe thing to do, but you can do:
using System;
using System.Reflection;
using System.Runtime.Serialization;
[Serializable]
public class c_Settings
{
    // Instance whose values every newly constructed object starts from.
    static c_Settings Default;
    public static void SetExistingObject(c_Settings def)
    {
        Default = def;
    }
    public string Prop1;
    public bool Prop2;
    public c_Settings()
    {
        if (Default == null)
            return;
        // Copy all serializable members of Default into this new instance.
        MemberInfo[] members = FormatterServices.GetSerializableMembers(typeof(c_Settings));
        FormatterServices.PopulateObjectMembers(this, members, FormatterServices.GetObjectData(Default, members));
    }
}
This way you feed your existing object to the deserialiser, and the deserialiser only overwrites whatever is written in the .xml file.

Related

Factory pattern with objects that have many optional properties

I'm refactoring a class that represents the data in some XML. Currently, the class loads the XML itself and property implementations parse the XML every time. I'd like to factor out the XML logic and use a factory to create these objects. But there are several 'optional' properties and I'm struggling to find an elegant way to handle this.
Let's say the XML looks like this:
<data>
<foo>a</foo>
<bar>b</bar>
</data>
Assume both foo and bar are optional. The class implementation looks something like this:
interface IOptionalFoo
{
public bool HasFoo();
public string Foo { get; }
}
// Assume IOptionalBar is similar
public class Data : IOptionalFoo, IOptionalBar
{
// ...
}
(Don't ask me why there's a mix of methods and properties for it. I didn't design that interface and it's not changing.)
So I've got a factory and it looks something like this:
class DataFactory
{
public static Data Create(string xml)
{
var dataXml = new DataXml(xml);
if (dataXml.HasFoo())
{
// ???
}
// Create and return the object based on the data that was gathered
}
}
This is where I can't seem to settle on an elegant solution. I've done some searching and found some solutions I don't like. Suppose I leave out all of the optional properties from the constructor:
I can implement Foo and Bar as read/write on Data. This satisfies the interface but I don't like it from a design standpoint. The properties are meant to be immutable and this fudges that.
I could provide SetFoo() and SetBar() methods in Data. This is just putting lipstick on the last method.
I could use the internal access specifier; for the most part I don't believe this class is being used outside of its assembly so again it's just a different way to do the first technique.
The only other solution I can think of involves adding some methods to the data class:
class Data : IOptionalFoo, IOptionalBar
{
public static Data WithFoo(Data input, string foo)
{
input.Foo = foo;
return input;
}
}
If I do that, the setter on Foo can be private and that makes me happier. But I don't really like littering the data object with a lot of creation methods, either. There's a LOT of optional properties. I've thought about making some kind of DataInitialization object with a get/set API of nullable versions for each property, but so many of the properties are optional it'd end up more like the object I am refactoring becomes a facade over a read/write version. Maybe that's the best solution: an internal read/write version of the class.
Have I enumerated the options? Do I need to quit being so picky and settle on one of the techniques above? Or is there some other solution I haven't thought of?

Keywords worth looking into are virtual members, Castle DynamicProxy, reflection, and T4 scripts; each one approaches the problem from a slightly different angle.
On another note, this seems perfectly reasonable, unless I misunderstood you:
private void CopyFrom(DataXml dataXml) // in Data class
{
if (dataXml.HasFoo()) Foo = dataXml.Foo;
//etc
}

What I did:
I created a new class that exposes a read/write interface for all of the properties. The constructor of the Data class now takes an instance of that type and wraps the read/write properties with read-only versions. It was a little tedious, but it wasn't as bad as I thought.
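
Roughly, that arrangement could look like the sketch below. DataValues is a hypothetical name for the read/write class, only Foo is shown (the other optional properties would follow the same pattern), and a null value is assumed to mean "not present in the XML":

```csharp
// Hypothetical mutable holder that the factory fills in; null means "absent".
public class DataValues
{
    public string Foo { get; set; }
}

public interface IOptionalFoo
{
    bool HasFoo();
    string Foo { get; }
}

public class Data : IOptionalFoo
{
    private readonly DataValues values;

    public Data(DataValues values)
    {
        // Copy the values so later changes to the caller's instance
        // cannot leak into this read-only view.
        this.values = new DataValues { Foo = values.Foo };
    }

    public bool HasFoo()
    {
        return values.Foo != null;
    }

    public string Foo
    {
        get { return values.Foo; }
    }
}
```

The factory then only has to populate a DataValues from the XML and pass it to new Data(...); Data itself needs no setters and no WithFoo-style creation methods.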

static in interfaces

I want to write my own serialisation (XML and binary do not fit for me;
I want "a more ADO" way),
so I defined an interface:
interface ISerializeData
{
DataTable GetDataSchema();
DataTable SerializeData();
object DeserializeData(DataTable data);
}
Now I do not want to have to create an instance of an object just to
get the schema for that object.
Also, DeserializeData should return an instance, not use an instance.
Therefore I think it should be static as well. (Okay, it could initialize
an instance from a DataTable...)
Any ideas? How can I model that? static is not allowed in
interfaces, and my classes already inherit from another abstract
base class.
Any ideas appreciated!

That issue is why the other serializers use attributes: they let you provide metadata about how the class is to be stored without forcing you to deal with the implementation of the class itself.

Maybe I'm wrong, but this is really more a task for a utility class. Take DeserializeData, for instance. Somewhere in your code you decide which type you're going to construct. In your proposed code you would choose the type and call its static method. Now what? Would each type have its own code to do the serialization? You'd probably end up creating some class doing all the work, to stay DRY. So you might as well have one DeserializeData method in a utility class, like:
public static T DeserializeData<T>(DataTable data)
    where T : new()
{
    var result = new T();
    // ... set properties on result from data
    return result;
}
In this method you'd probably get the data schema.
Maybe SerializeData() could be an instance method, but that too would delegate its work to some utility class.
Please let me know if I completely misunderstood your question.
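
One way such a utility class might be fleshed out, as a sketch only: it assumes column names match public property names, that the property types are simple ones a DataTable column can hold, and that a single row represents one object. TableSerializer and Person are hypothetical names:

```csharp
using System;
using System.Data;
using System.Reflection;

public static class TableSerializer
{
    // Builds the schema from the public properties of T; no instance needed.
    public static DataTable GetDataSchema<T>()
    {
        var table = new DataTable(typeof(T).Name);
        foreach (PropertyInfo p in typeof(T).GetProperties())
        {
            table.Columns.Add(p.Name, p.PropertyType);
        }
        return table;
    }

    // Creates a T from the first row, matching columns to writable properties.
    public static T DeserializeData<T>(DataTable data) where T : class, new()
    {
        var result = new T();
        if (data.Rows.Count == 0) return result;

        DataRow row = data.Rows[0];
        foreach (DataColumn column in data.Columns)
        {
            PropertyInfo p = typeof(T).GetProperty(column.ColumnName);
            if (p == null || !p.CanWrite || row.IsNull(column)) continue;
            p.SetValue(result, Convert.ChangeType(row[column], p.PropertyType), null);
        }
        return result;
    }
}

// Hypothetical entity used only to illustrate the calls.
public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}
```

Because both methods are static and generic, neither requires an existing instance, and the entity classes stay free to inherit from whatever base class they already have.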

Deserialization of changed class

I am working on a program where I save its project files by serializing the Project class.
Because I am still working on it, some classes that are part of the Project class change from time to time (e.g. a class gets a new property). That makes "simple" deserialization impossible.
Is there any way to solve this without writing a custom serializer (which is probably well above my level for now)?
Just in case it matters, I am using BinaryFormatter.

I hope I understood your problem correctly. You have a class serialized to a file, and you have since changed that class in the program (e.g. you have added another property). Now you want to deserialize this class from the file. This is not a problem as long as you have only added new properties. They will be ignored by the deserializer. It creates a new instance of your class (that is the reason why serializable classes have to have a default constructor) and tries to fill in the properties it finds in the stream being deserialized. If you change a property's type or remove a property, you won't be able to deserialize the original file.
One workaround for removing properties is to keep them in the class, but just stop using them in the rest of the program. A workaround for properties that have been changed to a different type could look something like this:
using System;
using System.Xml.Serialization;
[Serializable]
public class MyClass
{
    int? newProperty;
    // Reads the old string-typed element and converts it to the new type.
    [XmlElement("Property")]
    public string OldProperty
    {
        get { return string.Empty; }
        set
        {
            if (!newProperty.HasValue)
            {
                int temp;
                if (int.TryParse(value, out temp))
                {
                    // Nullable<T>.Value is read-only, so assign the field itself.
                    newProperty = temp;
                }
            }
        }
    }
    public int NewProperty
    {
        get { return newProperty.HasValue ? newProperty.Value : 0; }
        set { newProperty = value; }
    }
}

From my experience, I've found using BinaryFormatter for serialization/de-serialization of data types that are going to change a really bad idea. If something changes in your data type, from what I know the BinaryFormatter will fail in the process.
To overcome this issue in the data types I was using, I had to write my own serializer, which wasn't actually that much of a major task. You can use the BinaryReader and BinaryWriter classes to read and write the data in and out of your type. That way you can control the data you are expecting and handle any missing data either by adding default values, skipping the property altogether, or throwing some form of Exception to signify corrupt data. Refer to the MSDN article links above for more information.
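
As a rough illustration of that approach, the sketch below writes a format version first and lets the reader fall back to defaults for members that older files do not contain. The Project members and the version numbers are invented for the example:

```csharp
using System;
using System.IO;
using System.Text;

public class Project
{
    private const int CurrentVersion = 2;

    public string Name = "";
    public int Zoom = 100; // hypothetical member added in format version 2

    public void Save(Stream stream)
    {
        // leaveOpen: true so the caller keeps ownership of the stream.
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, true))
        {
            writer.Write(CurrentVersion);
            writer.Write(Name);
            writer.Write(Zoom);
        }
    }

    public static Project Load(Stream stream)
    {
        using (var reader = new BinaryReader(stream, Encoding.UTF8, true))
        {
            int version = reader.ReadInt32();
            if (version < 1 || version > CurrentVersion)
            {
                throw new InvalidDataException("Unsupported project version " + version);
            }

            var project = new Project();
            project.Name = reader.ReadString();
            // Version 1 files have no Zoom; the field keeps its default.
            if (version >= 2)
            {
                project.Zoom = reader.ReadInt32();
            }
            return project;
        }
    }
}
```

Each time a member is added, CurrentVersion goes up by one and Load gains another version check, so older files keep loading without any formatter involved.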

With help from Merlyn Morgan-Graham's comments I've found a solution that will work for me.
The versioning described in Version Tolerant Serialization is a really good idea, but it applies only when I use just the [Serializable] attribute.
I forgot to mention (my mistake) that I am using the ISerializable interface.
I've found that, in the deserialization constructor, the SerializationInfo object has a MemberCount property, which solves my problem as long as I only add new properties/members from time to time. With this information, new members/properties that can't be deserialized from an older file can be set to a default value, or maybe I can prompt the user for them.
Another way would be to store something like the assembly version as the first serialized member. This can solve deserialization problems with more complex class changes.
Either way, I agree with Merlyn - "if you can't script something, you shouldn't be building it". ;)

C# How do you solve a circular object reference

I've run into what I believe could be a major issue for my code design, and I was hoping someone here could explain how I would work around it.
I have 2 classes which each have a property of the other class, creating a circular reference. I plan on serializing these classes and using XSLT to format the output, but I'm assuming this will fail due to the circular reference.
Example
public class Book
{
    public BookShop TheShop = new BookShop();
}
public class BookShop
{
    List<Book> Books = new List<Book>();
}
So in this example each book will be in a bookshop and each bookshop will have many books. If I serialize the bookshop, it will then serialize each book, which then serializes a bookshop, and so on, round and round. How should I handle this?

Tag TheShop with an attribute to prevent its serialization.
[XmlIgnore] with the default serializer.
http://www.codeproject.com/KB/XML/GameCatalog.aspx
Probably just a problem with your example, not your real code: don't use public fields but properties. (XmlSerializer does serialize public fields, but properties are the more conventional choice.)

Add [XmlIgnore] to the TheShop property to prevent it from being serialized.
You can then set it manually when deserializing.
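
A small sketch of what that could look like, assuming BookShop is the root object being serialized. The Title property and the ToXml/FromXml helpers are invented for the example:

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public class Book
{
    public string Title { get; set; }

    // Skipped by XmlSerializer, which breaks the cycle.
    [XmlIgnore]
    public BookShop TheShop { get; set; }
}

public class BookShop
{
    public List<Book> Books { get; set; } = new List<Book>();

    public string ToXml()
    {
        var serializer = new XmlSerializer(typeof(BookShop));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, this);
            return writer.ToString();
        }
    }

    public static BookShop FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(BookShop));
        using (var reader = new StringReader(xml))
        {
            var shop = (BookShop)serializer.Deserialize(reader);
            // Restore the back-references that were left out of the XML.
            foreach (Book book in shop.Books)
            {
                book.TheShop = shop;
            }
            return shop;
        }
    }
}
```

After FromXml returns, every book points back at the shop it was loaded from, so the rest of the code sees the same object graph as before serialization.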

Best practice would be to have the BookShop class implement an interface (IBookShop) and then have the Book class store the interface not the concrete class. You should also make BookShop into a property in the Book class:
public class Book
{
    // XmlSerializer needs a public parameterless constructor.
    public Book()
    {
    }
    public Book(IBookShop bookShop)
    {
        TheShop = bookShop;
    }
    [XmlIgnore]
    public IBookShop TheShop { get; set; }
}
public interface IBookShop
{
    void SomeMethod();
}
public class BookShop : IBookShop
{
    List<Book> Books = new List<Book>();
    public void SomeMethod()
    {
    }
}

If you're going to use System.Xml.Serialization.XmlSerializer, you should decorate TheShop with System.Xml.Serialization.XmlIgnoreAttribute:
public class Book
{
[System.Xml.Serialization.XmlIgnore]
public BookShop TheShop;
}
That is, assuming the BookShop is the root object you wish to serialize. MSDN

First you need to check whether this is really a problem. If you always care about a bookshop when you have a book, and you always care about all the books a bookshop has, then it's perfectly sensible to have the whole graph serialised. This need not result in an infinite loop, because a serialiser can use an identifier to indicate a reference to an object already serialised. (There is a bug if you do an XML serialisation of a graph with a circular reference in its types, but that is a bug rather than something inherent to serialising to XML, as the fact that it can be resolved proves; see Why do I get a "System.StackOverflowException was unhandled " exception when serializing? on that.)
So, maybe you don't want to do anything here at all, and you're fine as you are.
Otherwise, the question is - just what do you want to serialise? Most suggestions so far have been to not serialise the TheShop property. This could be fine, or it may be useless if you will need to later access that shop.
If you have some sort of identifier (id number, uri) for each shop, then you could perhaps memoise - access to TheShop looks first at whether a private _theShop is null, and if it is, loads the relevant object into _theShop based on that identifier. Then you just need to serialise the identifier, not the full object.
Finally, if you are using XSLT to format the output to some other specification (whether XHTML for display, or something else) you may find it simpler just to roll your own XML serialisation. While this is a more complicated task in many ways, the fact that the XML produced by serialisation isn't particularly convenient for reformatting for display may mean that overall it's simpler this way. Indeed, if this is your only reason for serialising (you will never deserialise from the XML produced) then it may be much easier, as you need only consider what the XML for display needs, and not worry about anything else. Hence serialising may not be the best approach at all, but simply a ToXml() method, or a WriteBookToXml() method in another class.

How can I add a type constraint to include anything serializable in a generic method?

My generic method needs to serialize the object passed to it, however just insisting that it implements ISerializable doesn't seem to work. For example, I have a struct returned from a web service (marked with SerializableAttribute) that serializes to xml just fine, but, as expected, the C# compiler complains.
Is there a way I can check the object is serializable before attempting to serialize it, or, better still, a way of using the where keyword to check the object is suitable?
Here's my full method:
public static void Push<T>(string url, T message)
where T : ISerializable
{
string xml = SerializeMessage(message);
// Send the message to Amazon SQS
SendMessageRequest sendReq = new SendMessageRequest { QueueUrl = url, MessageBody = xml };
AmazonSQSClient client = new AmazonSQSClient(S3User, S3Pass);
client.SendMessage(sendReq);
}
And SerializeMessage:
private static string SerializeMessage<T>(T message)
{
XmlSerializer xmlSerializer = new XmlSerializer(typeof(T));
using (StringWriter stringWriter = new StringWriter())
{
xmlSerializer.Serialize(stringWriter, message);
return stringWriter.ToString();
}
}
If this isn't possible, what's the best way to perform a check that an object is serializable at runtime?

You can't do this totally via generic constraints, but you can do a couple things to help:
1) Put the new() constraint on the generic type (to enable the ability to deserialize and to ensure the XmlSerializer doesn't complain about a lack of default ctor):
where T : new()
2) On the first line of your method handling the serialization (or constructor or anywhere else you don't have to repeat it over and over), you can perform this check:
if( !typeof(T).IsSerializable && !(typeof(ISerializable).IsAssignableFrom(typeof(T)) ) )
throw new InvalidOperationException("A serializable Type is required");
Of course, there's still the possibility of runtime exceptions when trying to serialize a type, but this will cover the most obvious issues.

I wrote a lengthy blog article on this subject that you may find helpful. It mainly goes into binary serialization, but the concepts are applicable to most any serialization format.
http://blogs.msdn.com/jaredpar/archive/2009/03/31/is-it-serializable.aspx
The long and short of it is:
1) There is no way to add a reliable generic constraint.
2) The only way to check whether an object is serializable is to serialize it and see if the operation succeeds.

The only way to know if an object is serializable is to try to serialize it.
In fact, you were asking how to tell if a type "is serializable", but the actual question will be with respect to objects. Some instances of a type may not be serializable even if the type is marked [Serializable]. For instance, what if the instance contains circular references?
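
In code, "try it and see" could be wrapped up along these lines. This is only a sketch for XmlSerializer: SerializationProbe and the two sample classes are hypothetical, and it relies on XmlSerializer reporting unsupported types and failed serializations as InvalidOperationException:

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public static class SerializationProbe
{
    // Attempts the serialization and reports success instead of throwing.
    public static bool TrySerialize(object message, out string xml)
    {
        xml = null;
        if (message == null) return false;

        try
        {
            // Use the runtime type of the object, not a static type.
            var serializer = new XmlSerializer(message.GetType());
            using (var writer = new StringWriter())
            {
                serializer.Serialize(writer, message);
                xml = writer.ToString();
                return true;
            }
        }
        catch (InvalidOperationException)
        {
            return false;
        }
    }
}

// Sample types: the first serializes, the second lacks a parameterless constructor.
public class Point
{
    public int X { get; set; }
}

public class NoDefaultCtor
{
    public NoDefaultCtor(int x) { }
}
```

The probe pays the full cost of serializing, so in a method like Push it makes more sense to use its output directly than to call it as a separate pre-check.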

Instead of
XmlSerializer xmlSerializer = new XmlSerializer(typeof(T));
try
XmlSerializer xmlSerializer = new XmlSerializer(message.GetType());

C# 7.3 and up allows the unmanaged constraint to limit types to value types that contain nothing but other unmanaged value types (at any nesting level). What we really want is:
public class MyClass<T> where T : ISerializable or unmanaged
But unfortunately, at the time of writing C# does not support this syntax (constraints are always AND, separated by commas).
A workaround could be a ValueWrapper class:
public class ValueWrapper<U> : ISerializable where U : unmanaged
This takes a U for a constructor argument. It has one property U Value. Now you can treat value types as ISerializable simply by wrapping them in a ValueWrapper.
