Version tolerant XML Serialization

By Mirek on (tags: Version Tolerant Serialization, XML, categories: architecture, code)

One of the beauties of XML is that it can be extended without breaking applications. You can add an element to the XML document tree, and an application consuming that XML should not crash or fail. This is what the w3schools.com XML tutorial tells us. But how does this hold up when the XML is serialized and deserialized?

Recently I found a guide on MSDN called Version Tolerant Serialization. It states that since .NET Framework 2.0 the serialization (or rather deserialization) mechanisms implemented in the framework tolerate changes introduced in the serialized content or in the target class. I say deserialization because differences in the XML structure matter only when reading an XML document and creating objects out of it. Serialization, that is writing an object tree to XML, is always safe, since the XML structure is built strictly from the object's fields.

According to the MSDN guide, to benefit from version tolerant serialization we need to stick to a few points, which it describes as "best practices":

  • Never remove a serialized field.
  • Never apply the NonSerializedAttribute attribute to a field if the attribute was not applied to the field in the previous version.
  • Never change the name or the type of a serialized field.
  • When adding a new serialized field, apply the OptionalFieldAttribute attribute.
  • When removing a NonSerializedAttribute attribute from a field (that was not serializable in a previous version), apply the OptionalFieldAttribute attribute.
  • For all optional fields, set meaningful defaults using the serialization callbacks unless 0 or null as defaults are acceptable.
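
The last two points can be sketched as follows. This is only an illustration: the Contact class, its Country field and the "unknown" default are hypothetical names invented for the example, and the callback is invoked only by serializers that honor serialization callbacks (such as BinaryFormatter on .NET Framework), not by XmlSerializer.

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical class used only to illustrate the guidelines above.
[Serializable]
public class Contact
{
    public string FullName;
    public int Age;

    // Field added in an assumed "version 2" of the class; older streams
    // that lack it can still be deserialized.
    [OptionalField(VersionAdded = 2)]
    public string Country;

    // Called before the fields are populated, so this default remains
    // only when the stream carries no value for Country.
    [OnDeserializing]
    private void SetDefaults(StreamingContext context)
    {
        Country = "unknown";
    }
}
```

Setting the default in an OnDeserializing callback rather than in a field initializer is deliberate: the runtime formatters do not run constructors or initializers when they materialize an object.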

Without further ado, let's quickly test these rules on .NET Framework 4.5. Let's say we have a Person class like this:

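using System;
using System.Runtime.Serialization;
using System.Xml.Serialization;

[Serializable]
public class Person
{
    // Written to and read from the UID attribute of the <Person> element.
    [XmlAttribute(AttributeName = "UID")]
    public int Id { get; set; }

    public string FullName;

    public int Age;

    // The three fields below have no counterpart in the old XML document.
    [OptionalField]
    public string LastName;

    [OptionalField]
    public Address Address;   // Address is another class, not shown here

    [OptionalField]
    public DateTime CreatedOn;
}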

And here is the XML document which we are going to materialize into a Person instance. Assume the XML was serialized from some very early version of the Person class and so is a bit "out of date".

<?xml version="1.0" encoding="utf-8"?>
<Person UID="7" Opacity="big">
  <Nick>&quot;Rocky&quot; </Nick>
  <FullName>Sylvester Stallone</FullName>
  <Age>68</Age>
  <ModifiedOn>2014-10-10</ModifiedOn>
</Person>

As you can see, the XML document has an extra attribute, Opacity, and two extra elements, Nick and ModifiedOn, none of which have corresponding members in the class. There are also three fields in Person which are not reflected in the XML. Nevertheless, the XML document is deserialized without any exception; the Address, LastName and CreatedOn fields are simply left at their defaults. Moreover, the result is the same when we remove the OptionalField attributes from the fields. That attribute is honored by the runtime formatters which the MSDN guide deals with (BinaryFormatter and SoapFormatter) and is important there, whereas for this test I used the XmlSerializer class, which ignores it.
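
A minimal sketch of such a test is shown below. The PersonReader helper and the City member of Address are assumptions made up for the example (the original Address class is not shown), and the attributes that XmlSerializer ignores are left out. The UnknownAttribute and UnknownElement events are used only to record which nodes were skipped; deserialization succeeds without them too.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public class Address
{
    public string City;   // hypothetical member, the real Address is not shown
}

public class Person
{
    [XmlAttribute(AttributeName = "UID")]
    public int Id { get; set; }
    public string FullName;
    public int Age;
    public string LastName;
    public Address Address;
    public DateTime CreatedOn;
}

public static class PersonReader
{
    // Deserializes the XML into a Person and collects the names of
    // attributes ("@name") and elements that have no matching member.
    public static Person Read(string xml, List<string> skipped)
    {
        var serializer = new XmlSerializer(typeof(Person));
        serializer.UnknownAttribute += (s, e) => skipped.Add("@" + e.Attr.Name);
        serializer.UnknownElement += (s, e) => skipped.Add(e.Element.Name);

        using (var reader = new StringReader(xml))
        {
            return (Person)serializer.Deserialize(reader);
        }
    }
}
```

Calling PersonReader.Read with the document above and an empty list should return a Person with Id, FullName and Age filled in, LastName and Address set to null, CreatedOn equal to DateTime.MinValue, and the list holding the names of the three nodes that were skipped.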

So we can get away with even weaker restrictions and remove a field from the serialized class. During deserialization the element found in the XML document is simply ignored, since it has no corresponding field on the target class. The same applies the other way round: if an XML element is missing, the corresponding field on the class is left at its default (null, 0, DateTime.MinValue). The only change we need to watch out for is a change of a field's type, which may result in an exception while parsing the XML values, for instance when we change a string field to an integer and then try to deserialize old, non-numeric string values. This is, however, a fairly obvious problem which we can easily avoid.
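
The type change problem can be reproduced with a sketch like the one below. PersonWithIntName and TypeChangeDemo are hypothetical names introduced for the illustration; the class stands for a later version of Person in which FullName was changed from string to int.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical later version of Person where FullName became an int.
[XmlRoot("Person")]
public class PersonWithIntName
{
    public int FullName;
    public int Age;
}

public static class TypeChangeDemo
{
    // Returns null when the XML deserializes cleanly, otherwise the
    // type name of the underlying error.
    public static string TryRead(string xml)
    {
        var serializer = new XmlSerializer(typeof(PersonWithIntName));
        try
        {
            using (var reader = new StringReader(xml))
            {
                serializer.Deserialize(reader);
            }
            return null;
        }
        catch (InvalidOperationException ex)
        {
            // XmlSerializer wraps the parse failure in an
            // InvalidOperationException; the cause is the inner exception.
            return ex.InnerException != null
                ? ex.InnerException.GetType().Name
                : ex.GetType().Name;
        }
    }
}
```

Feeding it an old document whose FullName holds "Sylvester Stallone" should report a parse error (typically a FormatException), while a document with a numeric FullName is read without complaint.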