Serialization PREVIOUSLY: RFR: 8229773: Resolve permissions for code source URLs lazily

Tue Aug 20 04:51:23 UTC 2019

Thanks Sean,

No, I hadn't seen it. I've just read it, and will probably need to read
it again to appreciate it fully...

It certainly identifies all the issues I'm aware of, while being
respectful of the original implementors (many of whom participated in
Apache River when Jini was donated to Apache). I came to the same
conclusion on circular object graphs: the benefits don't outweigh the
cost.

We also use annotations instead of interfaces to mark the class and
constructor, so that subclasses don't automagically inherit the
functionality.
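
A minimal sketch of why an annotation behaves differently from an
interface here. The annotation name ConstructorDeserializable and the
demo classes are hypothetical, not the Apache River API, and only the
class-level marker is shown. The point is that an annotation type
without @Inherited is not visible on subclasses, whereas implementing
Serializable is inherited by every subclass.

```java
import java.io.Serializable;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical marker annotation. It is deliberately NOT meta-annotated
// with @Inherited, so subclasses of an annotated class do not pick it up.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface ConstructorDeserializable {}

// Interface-based opt-in: every subclass is Serializable as well.
class LegacyParent implements Serializable {
    private static final long serialVersionUID = 1L;
}

class LegacyChild extends LegacyParent {
    private static final long serialVersionUID = 1L;
}

// Annotation-based opt-in: the subclass must opt in explicitly.
@ConstructorDeserializable
class AnnotatedParent {}

class AnnotatedChild extends AnnotatedParent {}

final class OptInDemo {
    // True only if the class itself carries the marker annotation.
    static boolean optedIn(Class<?> c) {
        return c.isAnnotationPresent(ConstructorDeserializable.class);
    }

    public static void main(String[] argv) {
        System.out.println(Serializable.class.isAssignableFrom(LegacyChild.class)); // true
        System.out.println(optedIn(AnnotatedParent.class));                         // true
        System.out.println(optedIn(AnnotatedChild.class));                          // false
    }
}
```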

At this time we haven't reimplemented deconstruction; we are using
ObjectOutputStream with serializers, which are basically serialization
proxies for existing classes. We have fully reimplemented
deserialization using constructors.

I agree with serial form being independent of the wire protocol, so
that any serialization scheme can be used; this is an excellent idea.

The constructors / deconstructors have identified that serial form is
really just a parameter list.   Developers will want to make defensive
copies of mutable state, just as they do in public API methods.
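
As a rough illustration of treating deserialized state like untrusted
arguments to a public method, the sketch below copies a mutable Date
and a List in the constructor and checks each element's type. The
Meeting class and its fields are made up for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Date;
import java.util.List;

// Hypothetical class; the constructor treats its arguments as untrusted,
// exactly as it would if they had just been read from a stream.
final class Meeting {
    private final Date start;
    private final List<String> attendees;

    Meeting(Date start, List<?> attendees) {
        // Copy mutable state so the caller's reference cannot alter ours.
        this.start = new Date(start.getTime());

        // Copy into a collection type we choose, checking every element.
        List<String> copy = new ArrayList<>(attendees.size());
        for (Object o : attendees) {
            if (!(o instanceof String)) {
                throw new IllegalArgumentException("attendee is not a String: " + o);
            }
            copy.add((String) o);
        }
        this.attendees = Collections.unmodifiableList(copy);
    }

    // Return a copy so internal state never escapes.
    Date start() {
        return new Date(start.getTime());
    }

    List<String> attendees() {
        return attendees;
    }
}
```

Mutating the original Date or List after construction leaves the
Meeting instance unchanged, and a list containing a non-String element
is rejected before any object is created.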

We did consider constructors with multiple parameters, but decided 
against it for the following reasons:

   1. We didn't care about parameter order (tuples), or the order in
      which they were serialized / deserialized, we only cared about
      parameter names and types.
   2. For encapsulation we didn't want subclasses having to manage the
      serial form of superclasses; we wanted them to remain as
      independent as possible, so they don't inadvertently break.
          * For example, a library superclass adds a serial form
            parameter, or changes a type, in its serial form.   The
            child class would have to be aware of the changes in order
            to pass the correct parameters to the correct superclass
            constructor.
          * Different serial version constructors would result in the
            loss of later version superclass state when child classes
            call an earlier version.
   3. We settled on a caller sensitive parameter that is passed to the
      deserialization constructor.
          * Encapsulation: Each class in an inheritance hierarchy only
            has access to its own serial form.
          * The serial form of each class is independent and may evolve
            independently.
          * Each class in the inheritance hierarchy is responsible for
            checking its own invariants, including the ability to
            create superclass instances, even if a superclass is
            abstract, for checking inter-class invariants.
   4. It was less work for the framework to populate a standard
      parameter object with the serial form; the framework didn't need to
      worry about inspecting the constructor signature and determining
      the parameter order.
   5. One constructor could be used for different versions.
   6. We currently use serialPersistentFields to declare serial form,
      but there is probably a better way of doing this, perhaps a way
      that also documents different serial form versions.
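
The points above can be sketched roughly as follows. This is an
illustration only, not the Apache River implementation: SerialArgs,
Account and SavingsAccount are hypothetical names, and caller
sensitivity is approximated with java.lang.StackWalker, which assumes
Java 9 or later. Each constructor sees only the serial form of the
class that declares it, fields are looked up by name and type rather
than by position, and a default value lets one constructor accept an
older serial form that lacks a field.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical parameter object populated by the framework from the stream.
final class SerialArgs {
    private static final StackWalker WALKER =
            StackWalker.getInstance(StackWalker.Option.RETAIN_CLASS_REFERENCE);

    // Serial form of each class in the hierarchy: class -> (field name -> value).
    private final Map<Class<?>, Map<String, Object>> forms = new HashMap<>();

    // Used by the (hypothetical) framework while reading the stream.
    SerialArgs put(Class<?> owner, String name, Object value) {
        forms.computeIfAbsent(owner, k -> new HashMap<>()).put(name, value);
        return this;
    }

    // Caller sensitive: looks up the field in the calling class's serial form only.
    <T> T get(String name, Class<T> type, T defaultValue) {
        Class<?> caller = WALKER.getCallerClass();
        Object value = forms.getOrDefault(caller, Map.of()).get(name);
        // cast() throws ClassCastException if the stream supplied the wrong type.
        return value == null ? defaultValue : type.cast(value);
    }
}

class Account {
    private final String owner;

    Account(SerialArgs args) {
        String o = args.get("owner", String.class, null);
        if (o == null || o.isEmpty()) {
            throw new IllegalArgumentException("owner is required");
        }
        this.owner = o;
    }

    String owner() { return owner; }
}

class SavingsAccount extends Account {
    private final int ratePercent;

    SavingsAccount(SerialArgs args) {
        super(args); // the superclass reads and checks its own serial form
        // The default covers an older serial form that had no ratePercent field.
        int r = args.get("ratePercent", Integer.class, 0);
        if (r < 0) {
            throw new IllegalArgumentException("negative rate");
        }
        this.ratePercent = r;
    }

    int ratePercent() { return ratePercent; }
}

final class SerialArgsDemo {
    public static void main(String[] argv) {
        SerialArgs form = new SerialArgs()
                .put(Account.class, "owner", "alice")
                .put(SavingsAccount.class, "ratePercent", 3);
        SavingsAccount acct = new SavingsAccount(form);
        System.out.println(acct.owner() + " " + acct.ratePercent());
    }
}
```

If the superclass later adds or retypes a field, only the Account
constructor changes; SavingsAccount still just passes the same
parameter object up.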

Regards,

Peter.

On 20/08/2019 7:55 AM, Sean Mullan wrote:
> Brian Goetz (copied) has done a lot of thinking in the serialization 
> area, so I have copied him. Not sure if you have seen it but he 
> recently posted a document about some of his ideas and possible future 
> directions for serialization: 
> http://cr.openjdk.java.net/~briangoetz/amber/serialization.html
>
> --Sean
>
> On 8/17/19 10:22 PM, Peter Firmstone wrote:
>> Thanks Sean,
>>
>> You've gone to some trouble to answer my question, which demonstrates 
>> you have considered it.
>>
>> I donate some time to help maintain Apache River, derived from Sun's 
>> Jini.  Jini once depended on RMI; today, not so much. It still has 
>> some dependencies on some RMI interfaces, but doesn't utilise JRMP, 
>> although it provides some backward compatibility to enable it.
>>
>> But my point is, we heavily utilise Java Serialization, and have an 
>> independent implementation of a subset of Java Serialization 
>> (originating from Apache Harmony).  We do this for security as we use 
>> an annotated serialization constructor.   Serial form is unchanged, 
>> we have Serializers for commonly used java library objects, for 
>> example, we have a "PermissionSerializer", but we don't have a 
>> "PermissionCollectionSerializer" or "PermissionsSerializer" (for 
>> java.security.Permissions).   Incidentally, we have found we do not 
>> need the ability to serialize circular object graphs.   Throwable is 
>> an object that has a circular object graph, but that circular object 
>> graph can be linked up after deserialization.
>>
>> Permission implementing Serializable is probably not too much of a 
>> threat, as these objects are effectively immutable after lazy 
>> initialization.
>>
>> ProtectionDomain calls java.security.Permissions::setReadOnly during 
>> its construction.
>>
>> ProtectionDomain::getPermissions returns internal 
>> java.security.Permissions.   If this is serialized, then the readOnly 
>> internal state can be written to as the internal object references 
>> are accessible from within the stream.
>>
>> Admittedly, the attacker would already need to have some privilege to 
>> have access to a ProtectionDomain, so it's a path of privilege 
>> escalation.  I'm not talking about gadget attacks and 
>> deserialization of untrusted data, I'm talking about breaking 
>> encapsulation.
>>
>> Even though we are heavily dependent on Java Serialization, we are 
>> very careful when we implement it, and avoid implementing it when 
>> possible. Hindsight is 20:20, but given we are now seeing some Java 
>> SE backward compatibility breakages, perhaps it might be worth 
>> considering breaking serialization.  I don't mean we need to 
>> necessarily break object serial form, but rather making the Java 
>> serialization API explicit, with a subset of existing API features 
>> that makes long term maintenance and security less of a burden, and 
>> removing support for serialization of some objects where it is seldom 
>> used, perhaps using a JEP that requests developers to consider which 
>> library objects actually need to be serializable.
>>
>> Something we do in our Java Serialization API is require that mutable 
>> deserialized objects are defensively copied during object 
>> construction (serial fields are deserialized before an object is 
>> constructed, and the deserialized fields are accessible via a parameter 
>> passed in during construction).   We have tools that assist developers 
>> to check deserialized Java Collections contain the expected object 
>> types for example, so during object construction the developer has to 
>> replace the Collection with a new instance and copy the contents to 
>> the new Collection after checking the type of each object contained 
>> therein. Also we don't actually serialize Java Collections, we have 
>> standard serial forms for List, Set and Map, so these serial forms 
>> are equal, similar to the List, Set and Map contracts.  By doing 
>> this, Collections don't actually need to implement Serializable at 
>> all, as a Serializer becomes responsible for their serialization.   
>> This also means that all Collections must be accessed by interfaces, 
>> rather than implementation classes, so the deserialization 
>> constructor must defensively copy them into its preferred 
>> Collection instance.   It's a bit like dependency injection.
>>
>> I know it would take time, and there would be some pain, but long 
>> term it would save a lot of maintenance developer time.
>>
>> Regards,
>>
>> Peter.
>>
>> On 17/08/2019 12:50 AM, Sean Mullan wrote:
>>> On 8/15/19 8:18 PM, Peter Firmstone wrote:
>>>> Hi Roger,
>>>>
>>>> +1 for writeReplace
>>>>
>>>> Personally I'd like to see some security classes break backward 
>>>> compatibility and remove support for serialization as it allows 
>>>> someone to get references to internal objects, especially since 
>>>> these classes are cached by the JVM.  Which makes 
>>>> PermissionCollection.setReadOnly() very easy to bypass, by adding 
>>>> permissions to internal collections once you have a reference to them.
>>>>
>>>> Does anyone have any use cases for serializing these objects?
>>>>
>>>> These objects are easy to re-create by sending or receiving and 
>>>> parsing strings, because they are built from text based policy 
>>>> files, and when you do that, you are validating input, so I never 
>>>> did fully understand why they were made serializable.
>>>
>>> This is briefly explained on page 61 in the "Inside Java 2 Platform 
>>> Security" book [1]:
>>>
>>> "The Permission class implements two interfaces: java.security.Guard 
>>> and java.io.Serializable. For the latter, the intention is that 
>>> Permission objects may be transported to remote machines, such as 
>>> via Remote Method Invocation (RMI), and thus a Serializable 
>>> representation is useful."
>>>
>>> The Permission class was introduced in Java SE 1.2 so there were 
>>> different motivations back then :)
>>>
>>> --Sean
>>>
>>> [1] https://www.oracle.com/technetwork/java/javaee/index-141918.html
>>