The future of Serialization
Peter Firmstone
peter.firmstone at zeus.net.au
Mon Aug 11 10:18:28 UTC 2014
On 11/08/2014 8:12 PM, Peter Firmstone wrote:
> Brian,
>
> Thanks for picking up on my frustration ;)
>
> I have something in mind for Serializable2 to address cyclic data
> structures and the possibility of independent evolution of super and
> child classes, while retaining a relatively clean public api, with one
> optional private method. The methods and interfaces proposed are
> suitable for any alternative ObjectInput and ObjectOutput implementation.
>
> Apache River has an interface called Startable, which has one
> method:
>
> public void start() throws Exception;
>
> It's called by a framework to allow an Object to start threads,
> publish "this" or throw an exception after construction. The intent
> is to allow an object to be immutable with final fields and be
> provided with a thread of execution after construction and before
> publication.
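For readers unfamiliar with the pattern, here is a minimal sketch of how a framework might drive such an interface. The Startable declared below is a local stand-in mirroring the quoted method, and Heartbeat and StartableDemo are hypothetical names, not Apache River classes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Local stand-in mirroring the single method quoted above (not the Apache River class).
interface Startable {
    void start() throws Exception;
}

// Hypothetical service: every field is final and the constructor starts nothing.
final class Heartbeat implements Startable {
    private final String name;
    private final CountDownLatch firstBeat = new CountDownLatch(1);

    Heartbeat(String name) {
        this.name = name; // "this" does not escape during construction
    }

    @Override
    public void start() {
        // Construction has completed, so starting a thread here cannot
        // expose a half-built object.
        Thread worker = new Thread(firstBeat::countDown, name);
        worker.setDaemon(true);
        worker.start();
    }

    boolean awaitFirstBeat(long millis) throws InterruptedException {
        return firstBeat.await(millis, TimeUnit.MILLISECONDS);
    }
}

// Plays the role of the framework: construct, then start, then publish.
final class StartableDemo {
    static boolean startAndAwait(String name) {
        try {
            Heartbeat service = new Heartbeat(name);
            service.start();
            return service.awaitFirstBeat(2000);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The point of the split is that the constructor only assigns final fields, while anything that involves another thread waits for start().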
>
> Something similar can be used to wire up circular relations, let me
> explain:
>
> Every class that implements Serializable has one thing in common, the
> Serialization protocol and every Object instance of a Serializable
> class has an arbitrary serial form.
>
> I propose a final class representing SerialForm for an object, that
> cannot be extended, requires privilege to instantiate and also
> performs method guard security checks, for all callers with the
> exception of a calling class reading or writing its own serial form.
> SerialForm needs a parameter field key identity represented by the
> calling Class,
Sorry, that should read "field name", not "method name".
> the method name and the field's Class type, this key would be used for
> both writing and retrieving a field entry in SerialForm. SerialForm
> will also provide a method to advise if a field key contains a
> circular relation, any field entry in SerialForm that would contain a
> circular relation is not populated until after construction of the
> current object is complete.
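To make the key idea concrete, a rough sketch of what such a class could look like follows. Everything here is an assumption for illustration: the FieldKey helper, the method names put, get, markCircular and isCircular, and passing the owning class explicitly rather than detecting the caller. The privilege and guard checks are left out entirely.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical key: declaring class, field name, and the field's declared type.
final class FieldKey {
    private final Class<?> owner;
    private final String name;
    private final Class<?> type;

    FieldKey(Class<?> owner, String name, Class<?> type) {
        this.owner = Objects.requireNonNull(owner);
        this.name = Objects.requireNonNull(name);
        this.type = Objects.requireNonNull(type);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof FieldKey)) return false;
        FieldKey k = (FieldKey) o;
        return owner == k.owner && name.equals(k.name) && type == k.type;
    }

    @Override
    public int hashCode() {
        return Objects.hash(owner, name, type);
    }
}

// Sketch only: the privilege and guard checks described above are omitted.
final class SerialForm {
    private final Map<FieldKey, Object> entries = new HashMap<>();
    private final Set<FieldKey> circular = new HashSet<>();

    // The same key is used for writing ...
    <T> void put(Class<?> owner, String name, Class<T> type, T value) {
        entries.put(new FieldKey(owner, name, type), value);
    }

    // ... and for retrieving. Returns null when the entry is absent
    // or has not been populated yet. Reference types only.
    <T> T get(Class<?> owner, String name, Class<T> type) {
        return type.cast(entries.get(new FieldKey(owner, name, type)));
    }

    // Would be called by the framework when it detects a back-reference.
    void markCircular(Class<?> owner, String name, Class<?> type) {
        circular.add(new FieldKey(owner, name, type));
    }

    boolean isCircular(Class<?> owner, String name, Class<?> type) {
        return circular.contains(new FieldKey(owner, name, type));
    }
}
```

Because the owning class is part of the key, a superclass and a child class can each store a field with the same name without colliding.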
>
> An arbitrary Serializable2 Object instance may be composed of a
> hierarchy of classes, each belonging to a separate ProtectionDomain.
>
> For the following interface:
>
> public interface Serializable2 {
>
> void writeObject(SerialForm serial) throws IOException;
>
> }
>
> Implementers of Serializable2 must:
>
> 1. Implement writeObject
> 2. Implement a constructor with the signature: (SerialForm serial).
>
> Implementers that need to check invariants, delay throwing an
> Exception, publish "this" or set a circular reference after
> construction should:
>
> 4. Implement: private void readObjectNoData() throws
> InvalidObjectException;
>
> Child class implementations should:
>
> 5. Call their super class writeObject method and superclass
> constructor, but may call any super class constructor or methods.
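A minimal sketch of a parent and child class following these rules might look like the code below. The SerialForm here is a deliberately tiny stand-in keyed by class name plus field name, and Account, SavingsAccount and RoundTripDemo are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Tiny stand-in for the proposed SerialForm (no security checks).
final class SerialForm {
    private final Map<String, Object> entries = new HashMap<>();

    void put(Class<?> owner, String field, Object value) {
        entries.put(owner.getName() + "." + field, value);
    }

    <T> T get(Class<?> owner, String field, Class<T> type) {
        return type.cast(entries.get(owner.getName() + "." + field));
    }
}

interface Serializable2 {
    void writeObject(SerialForm serial) throws IOException;
}

class Account implements Serializable2 {
    private final String owner;

    Account(String owner) {
        this.owner = owner;
    }

    // Deserialization constructor with the required signature.
    Account(SerialForm serial) {
        this.owner = serial.get(Account.class, "owner", String.class);
    }

    @Override
    public void writeObject(SerialForm serial) throws IOException {
        serial.put(Account.class, "owner", owner);
    }

    String owner() { return owner; }
}

final class SavingsAccount extends Account {
    private final Integer ratePercent;

    SavingsAccount(String owner, int ratePercent) {
        super(owner);
        this.ratePercent = ratePercent;
    }

    SavingsAccount(SerialForm serial) {
        super(serial); // the superclass reads its own fields first
        this.ratePercent = serial.get(SavingsAccount.class, "ratePercent", Integer.class);
    }

    @Override
    public void writeObject(SerialForm serial) throws IOException {
        super.writeObject(serial); // the superclass writes its own fields
        serial.put(SavingsAccount.class, "ratePercent", ratePercent);
    }

    Integer ratePercent() { return ratePercent; }
}

final class RoundTripDemo {
    // Writes an object into a SerialForm and rebuilds it via the constructor.
    static SavingsAccount copy(SavingsAccount in) {
        try {
            SerialForm form = new SerialForm();
            in.writeObject(form);
            return new SavingsAccount(form);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Each class touches only the entries keyed by its own Class, so all fields can stay final and neither class needs to know the other's serial form.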
>
> Compatibility and Evolution:
>
> 1. Fields can be included or omitted from SerialForm, by an
> implementation, without breaking compatibility, provided a null
> reference is accepted during deserialization.
> 2. Within a class hierarchy, all Serializable2 superclass
> constructors have the same signature, so a superclass
> implementation can be substituted without breaking child class
> deserialization (provided this is the constructor used by the
> child class).
> 3. There is no serialVersionUID.
> 4. Child class Serializable2 implementations can extend a superclass
> without a zero arg constructor that doesn't itself implement
> Serializable2.
> 5. Child classes that do not override writeObject will not be
> serialized, so can effectively opt out.
> 6. Because implementations are required to implement public methods,
> there is no "Magic".
> 7. Serializable2 shouldn't extend Serializable, allowing classes to
> implement both interfaces for a period of time (for that reason
> the signature for readObjectNoData may need to be changed for
> Serializable2).
> 8. ObjectInputStream and ObjectOutputStream can be extended to
> support both implementations for compatibility, however
> alternative stream implementations would be preferable for
> Serializable2 to avoid Serializable security issues. The new
> implementations should be possible to substitute because both
> types would use the same Stream Protocol, provided the classes
> being deserialized implement Serializable2.
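As an illustration of point 1 in the list above, here is a sketch of a class that gained a field in a later version and still reads a serial form written without it. The SerialForm is again a tiny stand-in, and Person, its email field and EvolutionDemo are hypothetical.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Tiny stand-in for the proposed SerialForm (no security checks).
final class SerialForm {
    private final Map<String, Object> entries = new HashMap<>();

    void put(Class<?> owner, String field, Object value) {
        entries.put(owner.getName() + "." + field, value);
    }

    <T> T get(Class<?> owner, String field, Class<T> type) {
        return type.cast(entries.get(owner.getName() + "." + field));
    }
}

interface Serializable2 {
    void writeObject(SerialForm serial) throws IOException;
}

final class Person implements Serializable2 {
    private final String name;
    private final String email; // added in a later version of the class

    Person(SerialForm serial) {
        this.name = serial.get(Person.class, "name", String.class);
        String e = serial.get(Person.class, "email", String.class);
        // Older serial forms have no "email" entry; accept null and default it.
        this.email = (e != null) ? e : "";
    }

    @Override
    public void writeObject(SerialForm serial) throws IOException {
        serial.put(Person.class, "name", name);
        serial.put(Person.class, "email", email);
    }

    String name() { return name; }
    String email() { return email; }
}

final class EvolutionDemo {
    // Simulates a form produced by an older writer that knew only "name".
    static Person readOldForm(String name) {
        SerialForm old = new SerialForm();
        old.put(Person.class, "name", name);
        return new Person(old);
    }
}
```

No version identifier is consulted anywhere; compatibility rests only on the constructor tolerating the missing entry.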
>
>
> My reasoning for retaining readObjectNoData() and for updating field
> entries in SerialForm that contain circular relations after
> construction, is:
>
> 1. An object reference for the object currently being deserialized
> can be passed to another object's constructor (via a SerialForm
> instance) after the current Object's constructor completes,
> allowing safe publication of final field freezes that occur at the
> end of construction.
> 2. When the Serialization2 Framework becomes aware of an object that
> contains a circular relationship while that object is in the
> process of being deserialized, the second object will not be
> instantiated until after the constructor of the first object in
> the relationship completes. Data read in from the stream can be
> stored in a SerialForm without requiring object instantiation.
> 3. After construction completes, the object that has just been
> deserialized can retain a copy of its SerialForm and look up the
> field containing a circular relationship, the Serialization
> framework will update its SerialForm with the new object that
> holds a circular relationship, prior to calling readObjectNoData()
> on the first object.
> 4. If the developer of the implementing class is not aware of the
> possibility of a circular relationship, then the worst consequence
> is a field will be set to null during construction, "this" will
> not escape.
> 5. The second Object holding a link to an object that appears earlier
> in the stream, may not be aware that the object it holds a
> reference to also needs a reference to it. The first object will
> not obtain a reference to the second until both Object
> constructors have completed. The second object may not need to
> implement readObjectNoData().
> 6. readObjectNoData() needs to be called on every class belonging to
> a single Object's inheritance hierarchy, when defined, after all
> constructors have completed, it should be called in the order of
> superclass to child class.
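The sequence in the list above can be sketched roughly as follows, for a two-object cycle. This is only one reading of the proposal: the SerialForm is a tiny stand-in keyed by field name, Parent, Child and CycleDemo are hypothetical, and the reflective lookup of the private hook is an assumption about how a framework might call it.

```java
import java.io.InvalidObjectException;
import java.lang.reflect.Method;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Tiny stand-in for the proposed SerialForm, keyed by field name only.
final class SerialForm {
    private final Map<String, Object> entries = new HashMap<>();
    void put(String field, Object value) { entries.put(field, value); }
    <T> T get(String field, Class<T> type) { return type.cast(entries.get(field)); }
}

final class Parent {
    private final String name;
    private final SerialForm form; // retained so the late entry can be looked up
    private volatile Child child;  // cannot be final: set after construction

    Parent(SerialForm serial) {
        this.name = serial.get("name", String.class);
        this.form = serial; // the "child" entry is still unpopulated here
    }

    // Called by the framework once the circular entry has been filled in.
    private void readObjectNoData() throws InvalidObjectException {
        Child c = form.get("child", Child.class);
        if (c == null) throw new InvalidObjectException("child missing");
        this.child = c;
    }

    String name() { return name; }
    Child child() { return child; }
}

// The second object needs no hook: its back-reference is ready at construction.
final class Child {
    private final Parent parent;
    Child(SerialForm serial) { this.parent = serial.get("parent", Parent.class); }
    Parent parent() { return parent; }
}

final class CycleDemo {
    // Invokes each class's optional private hook, superclass first.
    static void callHooks(Object o) throws ReflectiveOperationException {
        Deque<Class<?>> chain = new ArrayDeque<>();
        for (Class<?> c = o.getClass(); c != Object.class; c = c.getSuperclass()) {
            chain.push(c); // the topmost superclass ends up at the head
        }
        for (Class<?> c : chain) {
            try {
                Method hook = c.getDeclaredMethod("readObjectNoData");
                hook.setAccessible(true);
                hook.invoke(o);
            } catch (NoSuchMethodException absent) {
                // the hook is optional
            }
        }
    }

    static Parent readCycle() {
        SerialForm parentForm = new SerialForm();
        parentForm.put("name", "p"); // plain stream data, no objects yet
        SerialForm childForm = new SerialForm();
        try {
            Parent parent = new Parent(parentForm); // first constructor completes
            childForm.put("parent", parent);        // back-reference passed via SerialForm
            Child child = new Child(childForm);     // second object instantiated
            parentForm.put("child", child);         // circular entry populated
            callHooks(parent);                      // hooks run after all constructors
            callHooks(child);
            return parent;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Note that Child keeps its reference in a final field, while Parent has to use a non-final one because its reference only arrives after construction.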
>
> Thoughts?
>
> Regards,
>
> Peter.
>
> On 10/08/2014 3:20 AM, Brian Goetz wrote:
>>> I've noticed there's not much interest in improving Serialization on
>>> these lists. This makes me wonder if Java Serialization has lost
>>> relevance in recent years with the rise of Protocol Buffers, Apache
>>> Thrift and other means of data transfer over byte streams.
>>
>> I sense your frustration, but I think you may be reaching the wrong
>> conclusion. The lack of response is probably not evidence that
>> there's no interest in fixing serialization; it's that fixing
>> serialization, with all the constraints that "fix" entails, is just
>> really really hard, and it's much easier to complain about it (and
>> even say "let's just get rid of it") than to fix it.
>>
>>> Should Serializable eventually be deprecated? Should Serialization be
>>> disabled by default? Should a new mechanism be developed? If a new
>>> mechanism is developed, what about circular object relationships?
>>
>> As I delved into my own explorations of serialization, I started to
>> realize why such a horrible approach was the one that was ultimately
>> chosen; while serialization is horrible and awful and leaky and
>> insecure and complex and brittle, it does address problems like
>> cyclic data structures and independent evolution of subclass and
>> superclass better than the "clean" models.
>>
>> My conclusion is, at best, a new mechanism would have to live
>> side-by-side with the old one, since it could only handle 95% of the
>> cases. It might handle those 95% much better -- more cleanly,
>> securely, and allowing easier schema evolution -- but the hard cases
>> are still there. Still, reducing the use of the horrible old
>> mechanism may still be a worthy goal, even if it can't be killed
>> outright.
>>
>