Towards better serialization

Wed Jun 12 15:11:16 UTC 2019

> What about inheritance?

Good question!  

First, note that this is two questions: one is “what _can_ a serialization framework do here”, and the other is “what should Java serialization do here.”  Remember, the high-order bit here is to banish the magic; one thing this enables is that writing a serialization framework reduces to (a) mapping instances to serializers (which could be informed by @Serializer, or not) and (b) encoding, all without need of privilege, and of course the reverse for deserialization.

This means that one of the parameters of a serialization framework is deserialization fidelity.  For example, when confronted with a List, and after inspecting the instance class for a suitable serializer and not finding one, what could it do?  It clearly could fail, but it could also encode the List using a generic List wrapper, which might deserialize to ArrayList or List.of(…).  

Clearly, if the implementation class has a suitable serializer, we should probably use that.  But if it doesn’t, what if List has a serializer?  A serialization framework could use that as a fallback.  
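To make the fallback idea concrete, here is a minimal sketch in today's Java of how a framework might resolve a serializer.  Everything in it (SerializerLookup, register, resolve, and modelling the serial form as Object[]) is a hypothetical illustration, not a proposed API; a real framework would define its own search order.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical framework-side lookup; names are illustrative only.
final class SerializerLookup {
    // Maps a class or interface to a function that extracts its serial form.
    private final Map<Class<?>, Function<Object, Object[]>> serializers = new LinkedHashMap<>();

    <T> void register(Class<T> type, Function<? super T, Object[]> serializer) {
        serializers.put(type, o -> serializer.apply(type.cast(o)));
    }

    // Prefer a serializer for the exact implementation class; otherwise fall
    // back to the first registered supertype the instance is assignable to.
    Optional<Class<?>> resolve(Object instance) {
        Class<?> impl = instance.getClass();
        if (serializers.containsKey(impl)) {
            return Optional.of(impl);
        }
        for (Class<?> candidate : serializers.keySet()) {
            if (candidate.isInstance(instance)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    // Failing is also a legitimate policy when nothing suitable is found.
    Object[] serialize(Object instance) {
        Class<?> key = resolve(instance).orElseThrow(
                () -> new IllegalArgumentException("no serializer for " + instance.getClass()));
        return serializers.get(key).apply(instance);
    }
}
```

With only a List serializer registered, an ArrayList instance resolves to List; once an ArrayList serializer is registered too, the exact class wins.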

Let’s take a simple inheritance example, without serializers first.

    class A {
        int a;

        public A(this int a) { } // this-bound parameter
    }

    class B extends A {
        int b;

        public B(int a, this int b) { super(a); }
    }

Note that as part of A’s design, it exposes an accessible-to-subclasses constructor, which subclass constructors will delegate to.  Let’s add in the patterns (no syntax please!):

    class A {
        int a;

        public A(this int a) { } 
        public pattern A(this int a) { } 
    }

    class B extends A {
        int b;

        public B(int a, this int b) { super(a); }
        public pattern B(int a, this int b) {
            // deliberately stupid syntax
            match this to A(binding a);  // binds a
            // b implicitly bound, because of this-declaration
        }
    }

The point here is, just as A must provide some construction service to its subclasses, it must also provide some deconstruction services (patterns, accessible fields, accessible accessors, whatever) to its subclasses too.  Now, B is fully ready to be serializable, as it has both a constructor and deconstructor which are suitable for serialization. 
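For readers who want something that compiles today, the same shape can be approximated without patterns.  This is only a sketch under assumptions: the accessors a() and b(), and the deconstruct()/reconstruct() pair, are hypothetical stand-ins for the pattern declarations above, and the serial form is modelled as a plain int[].

```java
class A {
    private final int a;

    public A(int a) { this.a = a; }

    // Deconstruction service offered to subclasses (stand-in for `pattern A`).
    public int a() { return a; }
}

class B extends A {
    private final int b;

    public B(int a, int b) {
        super(a);          // construction delegates to A
        this.b = b;
    }

    public int b() { return b; }

    // Stand-in for `pattern B`: obtains `a` through A's deconstruction service.
    public int[] deconstruct() {
        return new int[] { a(), b };
    }

    // The matching constructor call, taking components in the same order.
    public static B reconstruct(int[] components) {
        return new B(components[0], components[1]);
    }
}
```

The point carried over from the pattern version is the symmetry: B rebuilds itself through A's constructor and takes itself apart through A's accessor, never touching A's private state.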

> Could factory method deserializer declared in the class X produce an object of type Y which is a subclass of X?

Yes.  An example of this would be if we wanted to put a deserializer in List, where it returns some default implementation (ArrayList, List.of(), whatever).
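As a rough sketch of such a deserializer in current Java: ListDeserializer and fromElements are hypothetical names, and since we cannot add members to List here, a holder class stands in for "declared in List".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical holder for a deserializer that conceptually belongs to List.
final class ListDeserializer {
    private ListDeserializer() { }

    // Declared against the interface type X (List), but free to return any
    // subtype Y. Here: an unmodifiable copy backed by an ArrayList.
    static List<Object> fromElements(Object[] elements) {
        return Collections.unmodifiableList(new ArrayList<>(Arrays.asList(elements)));
    }
}
```

The caller only learns that it gets a List back; which implementation it is remains a private choice of the deserializer.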

Remember, serialization frameworks get to decide how they are going to map instances to serializers/deserializers; the above is not a statement that Java serialization _will_ support finding deserializers in supertypes, but that a serialization framework _could_ do so, and it is on them to define the search process.  (A serialization framework could also, for example, provide a registry where you could register serializers/deserializers explicitly: “if you find a FooList, serialize it as a BarList”.  Java serialization probably will not, but others could.)

> In this case where serializer pattern should be declared? In Y or in X? Assuming that serialized stream contains the class name Y, then probably both serializer and deserializer should be searched by serialization framework in the Y.

Again, how a serialization framework searches for a serializer/deserializer is part of its differentiation from other frameworks.  We provide authors with the ability to easily and defensively expose API points for use by serialization; the framework wires up its choices of which API points are called in response to what.  

> More concrete example: immutable lists created via List.of(...). There are at least two implementations inside and, I think, it's desired not to expose their count and structure. E.g. future Java version might have more or less implementations. How the serializer and deserializer would look like for these objects?

Probably so.  In this case, I would think one of the annotation parameters on @Serializer() would map to “look in this other class for deserializers”, so that InternalListImpl42 could serialize to something that says “deserialize me with the deserializer for PublicListWrapper”, such as: 

    private class InternalListImpl42 {
        @Serializer(version = PublicListWrapper.VERSION, 
                           deserializationClass = PublicListWrapper.class)
        open pattern InternalListImpl42(Object[] elements) { … }
    }

in which case the serialization version corresponds to that of PLW, not ILI42.  
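A framework could read such a redirect reflectively.  The sketch below emulates it in current Java with an ordinary runtime annotation on an accessor; the annotation shape, StreamHeader, and the "name/vN" header string are all assumptions made for illustration, not the eventual design.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical stand-in for the placeholder @Serializer annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Serializer {
    int version();
    // void.class means "deserialize with the declaring class itself".
    Class<?> deserializationClass() default void.class;
}

final class PublicListWrapper {
    static final int VERSION = 1;
    private PublicListWrapper() { }
}

final class InternalListImpl42 {
    private final Object[] elements;

    InternalListImpl42(Object... elements) { this.elements = elements.clone(); }

    @Serializer(version = PublicListWrapper.VERSION,
                deserializationClass = PublicListWrapper.class)
    Object[] elements() { return elements.clone(); }
}

final class StreamHeader {
    private StreamHeader() { }

    // What a framework might write to the stream: the class whose deserializer
    // should be used, plus the version of the serial form.
    static String describe(Object instance) {
        for (Method m : instance.getClass().getDeclaredMethods()) {
            Serializer s = m.getAnnotation(Serializer.class);
            if (s != null) {
                Class<?> target = s.deserializationClass() == void.class
                        ? instance.getClass()
                        : s.deserializationClass();
                return target.getSimpleName() + "/v" + s.version();
            }
        }
        throw new IllegalArgumentException("no @Serializer on " + instance.getClass());
    }
}
```

Here the header for an InternalListImpl42 names PublicListWrapper and its version, so the private implementation class never appears in the stream.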

> Another question: is it expected that static checks will be applied for annotated methods/patterns? I expect the following:
> - Deserialization annotation is applied only to constructor or to static factory method which return type is the same as the containing class. If the class is parameterized like Map<K, V>, then static factory method should be parameterized in the same way like static <KK, VV> Map<KK, VV> createMap(...) (or such restriction is redundant?). Parameterized constructor is not allowed (or it is?)
> - Serialization annotation is applied only to patterns. Probably could be applied to getter-like no-arg method if object is serialized to single simpler value like File::toString could be used to serialize File.
> - No two members of the same class could have the same annotation with the same version number
> - If class contains both serializer and deserializer with the same version number, their parameter count and types should match.

These are the sort of checks I would expect a compiler to want to do.  There is room for adjustment and interpretation (the parameter types need not be equal, just assignment-compatible) but you’ve got the right spirit.  

For super-simple classes, we could consider allowing an accessor to act as a serializer, if the serial form only has one component.  If it has multiple components, now we’re in the business of either capturing the order somewhere, or writing names to the stream so they can be matched to constructor parameter names (yuck), which starts to feel like the wrong end of the convenience-complexity lever, but again, serialization frameworks can do what they want.
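The single-component case is easy to picture with File, as suggested in the question.  In this sketch (SingleComponentForm and its field names are made up for illustration), the accessor File::toString plays the serializer and the File(String) constructor plays the deserializer.

```java
import java.io.File;
import java.util.function.Function;

// Hypothetical pairing of an accessor with a one-argument constructor.
final class SingleComponentForm {
    private SingleComponentForm() { }

    // Serializer: a no-arg accessor yielding the sole component.
    static final Function<File, String> TO_SERIAL = File::toString;

    // Deserializer: the constructor accepting that component.
    static final Function<String, File> FROM_SERIAL = File::new;

    static File roundTrip(File file) {
        return FROM_SERIAL.apply(TO_SERIAL.apply(file));
    }
}
```

With one component there is no ordering or naming to record, which is why this case stays cheap.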

> What about singleton objects or objects without state in general? Seems no problem with deserialization (e.g. class Singleton { @Deserializer public static Singleton getInstance() {...} }). But how to declare the serializer? Is it expected to have patterns which deconstruct the object to zero components?

A pattern with zero state components is perfectly allowable.  If we think this is a common case, then we could have a @Serializer.Singleton annotation to optimize its expression.  There’s a significant API design problem ahead of us in picking the right annotations, defining the consistency-checking rules, etc.; the examples in the document should be considered placeholders.  

> Will java.lang.String have a serializer (toCharArray?) and a deserializer (String(char[])) or it's considered to be basic enough and all serialization frameworks should handle it as a primitive?

I would think “primitive”, but if you wanted to make an argument for the other way, I’d listen!