JEP 187: Serialization 2.0 & Serialization-aware constructors
David M. Lloyd
david.lloyd at redhat.com
Wed Jan 22 16:33:24 UTC 2014
On 01/13/2014 06:26 PM, mark.reinhold at oracle.com wrote:
> Posted: http://openjdk.java.net/jeps/187
The concept of explicit deserialization constructors is interesting and
is something I've explored a little bit in the context of JBoss Marshalling.
The way construction works today (simple version!), the framework will
magic up a new Constructor instance which can construct a
partially initialized object. By "partially initialized" I mean that only
the classes in the non-serializable "top half" of the class hierarchy
are initialized. Constructors are entered subclass-first, like always,
but the language requires that each one initialize its superclass as the
first step (more or less) of construction, which effectively reverses
the initialization order to be superclass-first.
Now at this point there is an object that was (more or less) initialized
from the top (Object) down to the last non-serializable class in the
hierarchy (which is often also Object, as it happens). From here, the
deserialization mechanism takes over, using stream information to
acquire values and "stuff" them into fields (even final fields) using
reflection, in superclass-first order. Some reflection magic makes sure
that final field publication works more or less as expected; some other
magic ensures that sensible action is taken for certain types of
differences in the sending and receiving class structures. No
initializers are ever invoked for these classes, though you can define a
private readObject() method which is a close approximation (as long as
you don't have final fields, else you're stuck using reflection too).
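To make the current behavior concrete, here is a minimal sketch using only the standard java.io API. The classes Base, Point and RoundTrip are hypothetical names made up for illustration; the point is just to record which code runs when an object is read back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical non-serializable "top half": its no-arg constructor runs on deserialization.
class Base {
    static final List<String> LOG = new ArrayList<>();

    Base() {
        LOG.add("Base()");
    }
}

// Hypothetical serializable class: its constructor is skipped on deserialization.
class Point extends Base implements Serializable {
    private static final long serialVersionUID = 1L;

    private final int x; // final, yet filled in reflectively by the framework
    private int y;

    Point(int x, int y) {
        LOG.add("Point(int,int)");
        this.x = x;
        this.y = y;
    }

    // The closest thing to an initializer that the current mechanism offers.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject(); // populates x and y from the stream
        LOG.add("Point.readObject");
        if (y < 0) {
            throw new InvalidObjectException("y must be non-negative");
        }
    }

    int x() { return x; }
    int y() { return y; }
}

class RoundTrip {
    // Serializes obj to a byte array and reads it back.
    static Object copy(Serializable obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }
}
```

If Base.LOG is cleared just before calling RoundTrip.copy(new Point(3, 4)), it ends up holding "Base()" followed by "Point.readObject"; the Point(int,int) constructor is never invoked for the copy, even though its final field x comes back populated.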
The idea with a serialization-aware constructor is that each
serializable class constructor can read the stream information itself
and initialize fields the "normal" way, with "normal" validation.
The simplest, most naive implementation of this is to pass an
ObjectInputStream in to these constructors. This approach actually seems
to work fairly well from the user's perspective: each constructor
calls to the superclass first, then it acquires (for example) a GetField
object for itself and then pulls field data out of it and populates its
real fields, much like a readObject() method might do.
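Roughly, the naive approach could look like the sketch below. Note that FieldInput is a hypothetical stand-in I am inventing here, not a java.io class: a real ObjectInputStream only allows readFields() from within readObject(), so it cannot be used from a constructor today. Shape and Circle are likewise made-up example classes.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

// Hypothetical stand-in for the stream handed to constructors (NOT a java.io API).
// It holds one field map per serializable class, queued superclass-first.
final class FieldInput {
    private final Deque<Map<String, Object>> slices = new ArrayDeque<>();

    FieldInput add(Map<String, Object> fieldsOfOneClass) {
        slices.addLast(fieldsOfOneClass);
        return this;
    }

    // Loosely analogous to ObjectInputStream.readFields(): hands out the next class's fields.
    Map<String, Object> readFields() {
        return slices.removeFirst();
    }
}

class Shape {
    private final String name;

    // Serialization-aware constructor: reads its own fields and validates them normally.
    Shape(FieldInput in) {
        Map<String, Object> fields = in.readFields();
        String n = (String) fields.get("name");
        if (n == null || n.isEmpty()) {
            throw new IllegalArgumentException("name is required");
        }
        this.name = n; // a plain final-field assignment, no reflection needed
    }

    String name() { return name; }
}

class Circle extends Shape {
    private final double radius;

    Circle(FieldInput in) {
        super(in); // the superclass consumes its part of the stream first
        Map<String, Object> fields = in.readFields();
        double r = (Double) fields.get("radius");
        if (r <= 0) {
            throw new IllegalArgumentException("radius must be positive");
        }
        this.radius = r;
    }

    double radius() { return radius; }
}
```

Nothing in this sketch stops a constructor from calling readFields() twice and swallowing the data meant for its subclass, and the stand-in has no way of knowing which class is actually asking.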
The problem here is that the actual serialization implementation
normally gets to hook in between calls to readObject(); it cannot do
this for constructors, because each constructor calls its superclass's
constructor immediately, in a chain. The framework would have to examine the call
stack to know who the actual caller is, and there is also the
possibility that the constructors would abuse this contract in various
ways, taking advantage of the framework's lack of control.
In an ideal world (for serialization implementations anyway),
constructors would be wholly isolated, which would allow the framework
to call each one in sequence with only its safely isolated bit of the
stream. But in the real world, this isn't really possible within the
framework of the existing language.
One concept that might be interesting would be to introduce such
isolated instance initializers which do not call up to the superclass
but which otherwise follow the general constructor contract. This would
present a very simple solution from the perspective of serialization,
though the complexity of such a solution is potentially great.
Another option is to establish a tighter API which constructors can
consume. The constructor would be able to read field information out of
the API but only for its own class, possibly even enforced by call stack
inspection. The constructor would be contractually obligated to
propagate the API object to the superclass; the framework would have to
enforce that the propagation happened correctly for the class hierarchy
(which it would have knowledge of), i.e. ensure the object didn't
"cheat" by calling a non-serialization constructor for a serializable
superclass.
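As a rough sketch of what such a tighter API might look like: ReadContext, fieldsFor() and verifyPropagation() below are all hypothetical names of my own, and the caller check is reduced to a trusted class token rather than real call stack inspection. Account, Savings and Cheater are made-up example classes.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical API object that constructors consume and pass up the chain.
final class ReadContext {
    private final Map<Class<?>, Map<String, Object>> fieldsByClass = new HashMap<>();
    private final Set<Class<?>> claimed = new HashSet<>();

    // Framework side: register the stream data that belongs to one class.
    ReadContext put(Class<?> owner, Map<String, Object> fields) {
        fieldsByClass.put(owner, fields);
        return this;
    }

    // Constructor side: a class may read only its own fields, and only once.
    // A real framework might verify the caller by stack inspection instead of
    // trusting the class token as this sketch does.
    Map<String, Object> fieldsFor(Class<?> owner) {
        Map<String, Object> fields = fieldsByClass.get(owner);
        if (fields == null) {
            throw new IllegalStateException("no stream data for " + owner.getName());
        }
        if (!claimed.add(owner)) {
            throw new IllegalStateException(owner.getName() + " already read its fields");
        }
        return fields;
    }

    // Framework side, after construction: every class with stream data must have read it.
    void verifyPropagation() {
        Set<Class<?>> missing = new HashSet<>(fieldsByClass.keySet());
        missing.removeAll(claimed);
        if (!missing.isEmpty()) {
            throw new IllegalStateException("serialization constructor skipped for " + missing);
        }
    }
}

class Account {
    private final String owner;

    Account(String owner) { this.owner = owner; }

    Account(ReadContext ctx) {
        this((String) ctx.fieldsFor(Account.class).get("owner"));
    }

    String owner() { return owner; }
}

class Savings extends Account {
    private final int rate;

    Savings(ReadContext ctx) {
        super(ctx); // honors the contract: the context is propagated upward
        this.rate = (Integer) ctx.fieldsFor(Savings.class).get("rate");
    }

    int rate() { return rate; }
}

class Cheater extends Account {
    Cheater(ReadContext ctx) {
        super("forged"); // "cheats" by calling the non-serialization constructor
    }
}
```

Here the framework would call verifyPropagation() once the outermost constructor returns. For Savings it passes; for Cheater it throws, because the data registered for Account was never read.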
Other ideas may be possible as well. I found this to be an interesting
problem when I was exploring it myself, and I still find it pretty
interesting.
--
- DML