A hole in the serialization spec

Chris Hegarty chris.hegarty at oracle.com
Thu Feb 13 16:29:32 UTC 2014


On 12 Feb 2014, at 15:24, David M. Lloyd <david.lloyd at redhat.com> wrote:

> That's a quote from the serialization spec.  I take it to mean, "Don't write fields and everything might go to hell".  In practice, if the reading side doesn't read fields, things end up more or less OK, as evidenced by various classes in the wild.  But it's not hard to imagine a scenario in which a class change could cause protocol corruption.
> 
> I think the specifics of the quote relate to this kind of class change; in particular, if a class is deleted from the hierarchy on the read side, and that class corresponds to the class that had the misbehaving writeObject, I suspect that things will break at that point as the read side will probably try to consume and discard the field information for that class, which will be missing (it will start reading the next class' fields instead I think).

Yes, possibly. And who knows what fields/values may be read and mistaken for the wrong object in the hierarchy. So 'undefined' behaviour seems right to me.

A simple example throws StreamCorruptedException with Oracle's JDK once the marked defaultWriteObject() call is commented out:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class NoFields {

    public static void main(String[] args) throws Exception  {
        // Serialize a B (which extends A) into a byte array.
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(new B(5, 10));
        }

        // Read it back and print both fields.
        ByteArrayInputStream bais = new ByteArrayInputStream(baos.toByteArray());
        ObjectInputStream ois = new ObjectInputStream(bais);
        B b = (B)ois.readObject();
        System.out.println("aValue = " + b.aValue);
        System.out.println("bValue = " + b.bValue);
    }

    static class A implements Serializable {
        final int aValue;
        A(int value) { this.aValue = value; }

        private void writeObject(ObjectOutputStream oos) throws IOException {
            oos.defaultWriteObject();  // <<< comment out this call to see the exception
        }
    }

    static class B extends A implements Serializable {
        final int bValue;
        B(int aValue, int bValue) { super(aValue); this.bValue = bValue; }
    }
}
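
For contrast, here is a minimal sketch of the pattern that sections 2.3 and 3.4 of the spec ask for: the fields are written once, before any optional data, and the read side mirrors that order. It is not part of the example above. The class name WithFields, the transient note field and the roundTrip helper are made up purely for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class WithFields {

    // Hypothetical class: writes its fields first, then optional data.
    static class A implements Serializable {
        private static final long serialVersionUID = 1L;
        final int aValue;
        transient String note;  // not a serializable field; sent as optional data

        A(int aValue, String note) { this.aValue = aValue; this.note = note; }

        private void writeObject(ObjectOutputStream oos) throws IOException {
            oos.defaultWriteObject();  // once, before any optional data
            oos.writeUTF(note);        // optional data (note must be non-null here)
        }

        private void readObject(ObjectInputStream ois)
                throws IOException, ClassNotFoundException {
            ois.defaultReadObject();   // mirrors defaultWriteObject
            note = ois.readUTF();      // mirrors writeUTF
        }
    }

    static class B extends A {
        private static final long serialVersionUID = 1L;
        final int bValue;
        B(int aValue, String note, int bValue) { super(aValue, note); this.bValue = bValue; }
    }

    // Illustrative helper: serialize to a byte array and read it back.
    static B roundTrip(B in) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(in);
        }
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(baos.toByteArray()))) {
            return (B) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        B b = roundTrip(new B(5, "optional", 10));
        System.out.println("aValue = " + b.aValue);
        System.out.println("bValue = " + b.bValue);
        System.out.println("note   = " + b.note);
    }
}
```

Because A's field data is always present in the stream, the reader knows where A's data ends and B's begins.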

-Chris.

> 
> On 02/12/2014 09:08 AM, Chris Hegarty wrote:
>> David, others?
>> 
>> I'm afraid I'm still not clear what is meant by:
>>   " ... undefined in cases where the ObjectInputStream cannot resolve
>>     the class which defined the writeObject method in question."
>> 
>> This does not seem directly related to the issue we are discussing (
>> whether to invoke default read/write object ).
>> 
>> -Chris.
>> 
>> On 10/02/14 15:37, David M. Lloyd wrote:
>>> I agree that it's a problem; however it's also clear that there are many
>>> classes in the wild which have this problem.  It would be nice if the
>>> behavior could _become_ defined *somehow* though.  I can see at least
>>> four options:
>>> 
>>> 1) do nothing :(
>>> 2) start throwing (or writing) an exception in write/readObject when
>>> stream ops are performed without reading fields (maybe can be disabled
>>> with a sys prop or something)
>>> 3) leave fields cleared and risk protocol issues
>>> 4) silently start reading/writing empty field information (risks
>>> protocol issues)
>>> 
>>> Maybe there are better options I'm not thinking of.
>>> 
>>> On 02/10/2014 08:53 AM, Chris Hegarty wrote:
>>>> David,
>>>> 
>>>> " ... undefined in cases where the ObjectInputStream cannot resolve the
>>>> class which defined the writeObject method in question."
>>>> 
>>>> I'm not clear as to what this statement is about?
>>>> 
>>>> I'm sure you already know this, and maybe in your environment do not
>>>> care much about it, but having a read/writeObject not invoke the
>>>> appropriate default read/write Object/Fields method is a serious
>>>> impediment to evolving the serial form ( in a compatible way ). For
>>>> example, if your class has no serializable fields in one revision, but
>>>> adds a serializable field(s) in a subsequent revision. This could lead
>>>> to a StreamCorruptedException, or some other undefined behavior.
>>>> 
>>>> The OpenJDK sources do seem to be quite tolerant of this situation. I'm
>>>> not entirely sure if this is a good or a bad thing. That said, I don't
>>>> think we want to encourage this kind of behavior.
>>>> 
>>>> -Chris.
>>>> 
>>>> On 07/02/14 15:07, David M. Lloyd wrote:
>>>>> Since the topic of serialization has come up recently, I'll take it as
>>>>> an excuse to bring up a problem that I've run into a couple of times
>>>>> with the serialization specification, which has resulted in user
>>>>> problems.
>>>>> 
>>>>> If you read section 2.3 [1] of the specification, it says:
>>>>> 
>>>>> "The class's writeObject method, if implemented, is responsible for
>>>>> saving the state of the class. Either ObjectOutputStream's
>>>>> defaultWriteObject or writeFields method must be called once (and only
>>>>> once) before writing any optional data that will be needed by the
>>>>> corresponding readObject method to restore the state of the object;
>>>>> even if no optional data is written, defaultWriteObject or writeFields
>>>>> must still be invoked once. If defaultWriteObject or writeFields is not
>>>>> invoked once prior to the writing of optional data (if any), then the
>>>>> behavior of instance deserialization is undefined in cases where the
>>>>> ObjectInputStream cannot resolve the class which defined the
>>>>> writeObject method in question."
>>>>> 
>>>>> If you go to section 3.4 [2] of the specification, it reads:
>>>>> 
>>>>> "The readObject method of the class, if implemented, is responsible for
>>>>> restoring the state of the class. The values of every field of the
>>>>> object whether transient or not, static or not are set to the default
>>>>> value for the fields type. Either ObjectInputStream's defaultReadObject
>>>>> or readFields method must be called once (and only once) before reading
>>>>> any optional data written by the corresponding writeObject method; even
>>>>> if no optional data is read, defaultReadObject or readFields must still
>>>>> be invoked once."
>>>>> 
>>>>> Now the problem: there are many classes in the wild which nevertheless
>>>>> do not write/read fields.  We cause an exception in such cases rather
>>>>> than make up some undefined behavior.  What I'm wondering is, is there
>>>>> some sensible behavior that could be specified for this case?  The
>>>>> Oracle JDK seems to simply leave fields uninitialized in this case,
>>>>> maybe that can be a specified behavior?
>>>>> 
>>>>> [1] http://docs.oracle.com/javase/7/docs/platform/serialization/spec/output.html#861
>>>>> [2] http://docs.oracle.com/javase/7/docs/platform/serialization/spec/input.html#2971
>>>>> 
>>> 
>>> 
> 
> 
> -- 
> - DML




More information about the core-libs-dev mailing list