Serialization PREVIOUSLY: RFR: 8229773: Resolve permissions for code source URLs lazily
Peter Firmstone
peter.firmstone at zeus.net.au
Fri Aug 23 05:21:23 UTC 2019
I probably should have vetted this before hitting send... let me know if
you need any clarifications.
Cheers,
Peter.
On 23/08/2019 12:59 PM, Peter Firmstone wrote:
> "...since at the time the industry believed that distributed objects
> were going to save us from complexity.) Many of the sins of
> serialization were committed in the desire to get that last .1%, but
> the cost and benefit of that last .1% are woefully out of balance."
>
> The following are probably non-goals, but something to consider or
> keep in mind, relating to distributed objects:
>
> There are four types of distributed objects:
>
> 1. Immutable value / data Object types.
> 2. Shared Mutable Objects.
> 3. Unshared Mutable Objects.
> 4. Remote Objects / Services (best for managing shared mutable state).
>
> The second type of distributed object causes much pain and should be
> discouraged. The first three types of distributed objects can have
> class resolution issues, but these are solvable.
>
> A lot of folks also have problems deserializing Objects when
> class visibility is different at both ends; I'm guessing this would be
> the same for value types.
>
> For example OSGi folk recommend using primitive parameter types for
> remote OSGi services.
>
> RMI annotates streams with codebase annotations. Jini Extensible
> Remote Invocation used to do that too.
>
> The problem with RMI and Jini codebase annotations is that if you
> resolve your classes locally, you lose the codebase annotations when
> re-serializing data, and because class visibility can be different at
> different endpoints, you end up with all sorts of class resolution
> issues. See "Class Loading Issues in Java™ RMI and Jini™ Network
> Technology" by Michael Warres:
> https://pdfs.semanticscholar.org/143f/468fcbdafd20f2b8c27fe5e0a869913b641a.pdf
>
>
> The solution of course is simple: ensure that you deserialize into the
> same module that you serialized from, especially when deserializing in
> another JVM, so class resolution is identical.
>
> We serialize a lot of complex object graphs, none are circular. The
> module used for serialization should have visibility of the entire
> graph of object classes.
>
> So if we're using OSGi modules, and provide a network / remote service
> (not to be confused with an OSGi remote service) we ensure the proxies
> for these services have the same module installed at the client and
> server endpoints. The service is represented by a Java interface and
> the client makes calls on the interface's methods. This interface may
> be implemented by what is called a smart proxy, which is encapsulated
> by a module which is dynamically downloaded at runtime, or a
> reflection Proxy using an InvocationHandler that is generated
> dynamically.
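
As a rough illustration of the reflection-Proxy variant mentioned above, here is a minimal sketch using only java.lang.reflect.Proxy. The Greeter interface and GreeterProxies class are hypothetical names invented for this example, and the handler answers locally; in a real remote invocation layer the handler would marshal the call to the server endpoint instead.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Hypothetical service interface; in practice it would live in the module
// shared by the client and server endpoints.
interface Greeter {
    String greet(String name);
}

final class GreeterProxies {
    private GreeterProxies() {}

    // Builds a dynamically generated proxy that implements Greeter.
    // Assumption: the handler replies locally instead of going over the wire.
    static Greeter create() {
        InvocationHandler handler = (proxy, method, args) -> {
            // toString, hashCode and equals are also routed to the handler.
            if (method.getDeclaringClass() == Object.class) {
                switch (method.getName()) {
                    case "toString": return "GreeterProxy";
                    case "hashCode": return System.identityHashCode(proxy);
                    default:         return proxy == args[0]; // equals
                }
            }
            // Greeter declares a single method, greet(String).
            return "Hello, " + args[0];
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
    }
}
```

The client only ever sees the Greeter interface, so the same calling code works whether the implementation is a reflection proxy like this one or a downloaded smart proxy.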
>
> We still provide an option for codebase annotations for client
> parameter objects, where a client subclasses parameter types and
> passes them to the service, but this is discouraged; it is provided
> for backward compatibility only. Where the parameters are also
> interfaces, the client can implement a remote object and pass it as a
> parameter instead. In our system, this will cause a module to be
> loaded in the server identical to that at the client to resolve the
> remote object classes, without using stream codebase annotations.
>
> Incidentally, if you're curious how this happens, a proxy is sent {I
> guess you can call it a serialization proxy :) } and authenticated by
> the remote end, and security constraints are applied. The remote end
> then asks the proxy for a codebase URL, which is loaded into a
> ClassLoader with controlled visibility (this is extensible using a
> ServiceProvider or OSGi service). The proxy is then deserialized into
> this ClassLoader by calling a method on the serialization proxy.
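
The "ClassLoader with controlled visibility" step could look something like the sketch below. This is not the JGDMS implementation; CodebaseLoaders and its methods are hypothetical, and the only technique shown is choosing the parent loader so that application classes are hidden from the downloaded codebase. It assumes Java 9 or later for ClassLoader.getPlatformClassLoader().

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;

final class CodebaseLoaders {
    private CodebaseLoaders() {}

    // Creates a loader for the codebase whose parent is the platform class
    // loader: JDK classes and the codebase's own classes resolve, but classes
    // on the application class path do not.
    static URLClassLoader isolated(String codebaseUrl) {
        try {
            URL url = URI.create(codebaseUrl).toURL();
            return new URLClassLoader(new URL[] { url },
                    ClassLoader.getPlatformClassLoader());
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("bad codebase URL: " + codebaseUrl, e);
        }
    }

    // Returns true if the named class resolves through the given loader.
    static boolean canSee(ClassLoader loader, String className) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}
```

With a loader built this way, canSee(loader, "java.lang.String") holds, while an application class that is absent from the codebase is not visible. A production version would also authenticate the codebase before loading it and close the loader when the proxy is discarded.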
>
> By limiting scope, we can still have 99% of the benefits of
> distributed objects, without the pain.
>
> Incidentally apart from the complexity of class resolution, what
> really limited distributed computing was IPv4. IPv6 removes the
> network addressing limitations placed on distributed computing.
>
> So I'd make the following qualifications:
>
> 1. Use only primitive types when serializing between different
> languages.
> 2. Serialize only Java language Object types and primitives between
> JVMs when class visibility is uncontrolled.
> 3. When serializing other object types, ensure they are immutable if
> shared and that class visibility is identical and managed at both
> endpoints.
> 4. Do not serialize objects whose classes may not be resolvable
> (when you need to depend on annotated streams and uncontrolled
> class resolution for example), find another way to solve the
> problem.
>
> We've had 20 years to iron out the wrinkles. :)
>
> Regards,
>
> Peter.
>
> On 23/08/2019 7:36 AM, Peter Firmstone wrote:
>> Hi Sean,
>>
>> Regarding the section entitled "Why not write a new serialization
>> library?", unlike the serialization libraries listed, our purpose was
>> to be able to securely deserialize untrusted data, while maintaining
>> backward serial form compatibility with Java Serialization, provided
>> it didn't compromise security.
>>
>> We don't use blacklists or whitelists; we use permissions, granting
>> DeserializationPermission. This doesn't have the granularity of
>> whitelists, but then, classes that implement @AtomicSerial are
>> supposed to be hardened implementations in any case.
>>
>> If it can be of use, feel free to experiment with it, hopefully it
>> might help with some of your design decisions:
>>
>> https://github.com/pfirmstone/JGDMS/tree/trunk/JGDMS/jgdms-platform/src/main/java/org/apache/river/api/io
>>
>>
>> Much of the code on this site provides implementation examples as well.
>>
>> Regards,
>>
>> Peter.
>>
>> On 20/08/2019 7:55 AM, Sean Mullan wrote:
>>> Brian Goetz (copied) has done a lot of thinking in the serialization
>>> area, so I have copied him. Not sure if you have seen it but he
>>> recently posted a document about some of his ideas and possible
>>> future directions for serialization:
>>> http://cr.openjdk.java.net/~briangoetz/amber/serialization.html
>>>
>>> --Sean
>>>
>>> On 8/17/19 10:22 PM, Peter Firmstone wrote:
>>>> Thanks Sean,
>>>>
>>>> You've gone to some trouble to answer my question, which
>>>> demonstrates you have considered it.
>>>>
>>>> I donate some time to help maintain Apache River, derived from
>>>> Sun's Jini. Jini once depended on RMI; today, not so much. It
>>>> still has some dependencies on some RMI interfaces, but doesn't
>>>> utilise JRMP, although it provides some backward compatibility to
>>>> enable it.
>>>>
>>>> But my point is, we heavily utilise Java Serialization, and have an
>>>> independent implementation of a subset of Java Serialization
>>>> (originating from Apache Harmony). We do this for security as we
>>>> use an annotated serialization constructor. Serial form is
>>>> unchanged, we have Serializers for commonly used java library
>>>> objects, for example, we have a "PermissionSerializer", but we
>>>> don't have a "PermissionCollectionSerializer" or
>>>> "PermissionsSerializer" (for java.security.Permissions).
>>>> Incidentally, we have found we do not need the ability to serialize
>>>> circular object graphs. Throwable is an object that has a
>>>> circular object graph, but that circular object graph can be linked
>>>> up after deserialization.
>>>>
>>>> Permission implementing Serializable is probably not too much of a
>>>> threat, as these objects are effectively immutable after lazy
>>>> initialization.
>>>>
>>>> ProtectionDomain calls java.security.Permissions::setReadOnly
>>>> during its construction.
>>>>
>>>> ProtectionDomain::getPermissions returns internal
>>>> java.security.Permissions. If this is serialized, then the
>>>> readOnly internal state can be written to as the internal object
>>>> references are accessible from within the stream.
>>>>
>>>> Admittedly, the attacker would already need to have some privilege
>>>> to have access to a ProtectionDomain, so it's a path of privilege
>>>> escalation. I'm not talking about gadget attacks and
>>>> deserialization of untrusted data, I'm talking about breaking
>>>> encapsulation.
>>>>
>>>> Even though we are heavily dependent on Java Serialization, we are
>>>> very careful when we implement it, and avoid implementing it when
>>>> possible. Hindsight is 20:20, but given we are now seeing some Java
>>>> SE backward compatibility breakages, perhaps it might be worth
>>>> considering breaking serialization. I don't mean we necessarily
>>>> need to break object serial form, but rather make the Java
>>>> serialization API explicit, with a subset of existing API features
>>>> that makes long term maintenance and security less of a burden, and
>>>> remove support for Serialization of some objects where it is
>>>> seldom used, perhaps using a JEP that requests developers to
>>>> consider which library objects actually need to be serializable.
>>>>
>>>> Something we do in our Java Serialization API is require that
>>>> mutable deserialized objects are defensively copied during object
>>>> construction (serial fields are deserialized before an object is
>>>> constructed; the deserialized fields are accessible via a parameter
>>>> passed in during construction). We have tools that assist
>>>> developers to check that deserialized Java Collections contain the
>>>> expected object types, for example, so during object construction
>>>> the developer has to replace the Collection with a new instance and
>>>> copy the contents to the new Collection after checking the type of
>>>> each object contained therein. Also we don't actually serialize
>>>> Java Collections, we have standard serial forms for List, Set and
>>>> Map, so these serial forms are equal, similar to the List, Set and
>>>> Map contracts. By doing this, Collections don't actually need to
>>>> implement Serializable at all, as a Serializer becomes responsible
>>>> for their serialization. This also means that all Collections
>>>> must be accessed by interfaces, rather than implementation classes,
>>>> so the deserialization constructor must defensively copy them into
>>>> their preferred Collection instance. It's a bit like dependency
>>>> injection.
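
The defensive-copy rule described above can be sketched in plain Java as follows. This does not use the actual JGDMS API; Roster and checkedCopy are hypothetical names, and the constructor argument merely stands in for a collection that has already been read from the stream before the object exists.

```java
import java.io.InvalidObjectException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

final class Roster {
    private final List<String> names;

    // Stand-in for a deserialization constructor: 'untrusted' plays the role
    // of a deserialized field handed in during construction.
    Roster(Collection<?> untrusted) throws InvalidObjectException {
        this.names = Collections.unmodifiableList(
                checkedCopy(untrusted, String.class));
    }

    List<String> names() {
        return names;
    }

    // Copies the elements into a fresh ArrayList, rejecting a null collection
    // and any element (including null) that is not an instance of 'type'.
    // The caller never keeps a reference to the stream-supplied collection.
    static <T> List<T> checkedCopy(Collection<?> source, Class<T> type)
            throws InvalidObjectException {
        if (source == null) {
            throw new InvalidObjectException("collection is null");
        }
        List<T> copy = new ArrayList<>();
        for (Object element : source) {
            if (!type.isInstance(element)) {
                throw new InvalidObjectException(
                        "unexpected element type, wanted " + type.getName());
            }
            copy.add(type.cast(element));
        }
        return copy;
    }
}
```

Because the constructor picks its own List implementation and checks every element, it does not matter which Collection class the sender used, which is the dependency-injection flavour mentioned above.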
>>>>
>>>> I know it would take time, and there would be some pain, but long
>>>> term it would save a lot of maintenance developer time.
>>>>
>>>> Regards,
>>>>
>>>> Peter.
>>>>
>>>> On 17/08/2019 12:50 AM, Sean Mullan wrote:
>>>>> On 8/15/19 8:18 PM, Peter Firmstone wrote:
>>>>>> Hi Roger,
>>>>>>
>>>>>> +1 for writeReplace
>>>>>>
>>>>>> Personally I'd like to see some security classes break backward
>>>>>> compatibility and remove support for serialization as it allows
>>>>>> someone to get references to internal objects, especially since
>>>>>> these classes are cached by the JVM. This makes
>>>>>> PermissionCollection.setReadOnly() very easy to bypass, by adding
>>>>>> permissions to internal collections once you have a reference to
>>>>>> them.
>>>>>>
>>>>>> Does anyone have any use cases for serializing these objects?
>>>>>>
>>>>>> These objects are easy to re-create by sending or receiving and
>>>>>> parsing strings, because they are built from text based policy
>>>>>> files, and when you do that, you are validating input, so I never
>>>>>> did fully understand why they were made serializable.
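
To make the "send strings and parse them" alternative concrete, here is a minimal sketch. The "type|name|actions" wire format and the PermissionParser class are assumptions made up for this example (the real policy file syntax is different); the point is only that the receiver constructs each Permission itself through its public constructor, so the input is validated and no internal object references cross the wire.

```java
import java.io.FilePermission;
import java.security.Permission;
import java.util.PropertyPermission;

final class PermissionParser {
    private PermissionParser() {}

    // Parses one line of the hypothetical form "type|name|actions",
    // for example "file|/tmp/-|read". Only a fixed set of permission types is
    // accepted, and the permission constructors validate the actions string.
    static Permission parse(String line) {
        String[] parts = line.split("\\|", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                    "expected type|name|actions: " + line);
        }
        String type = parts[0];
        String name = parts[1];
        String actions = parts[2];
        switch (type) {
            case "file":
                return new FilePermission(name, actions);
            case "property":
                return new PropertyPermission(name, actions);
            default:
                throw new IllegalArgumentException(
                        "unsupported permission type: " + type);
        }
    }
}
```

An unknown type or an invalid actions string such as "bogus" is rejected with an IllegalArgumentException rather than producing a partially initialised object.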
>>>>>
>>>>> This is briefly explained on page 61 in the "Inside Java 2
>>>>> Platform Security" book [1]:
>>>>>
>>>>> "The Permission class implements two interfaces:
>>>>> java.security.Guard and java.io.Serializable. For the latter, the
>>>>> intention is that Permission objects may be transported to remote
>>>>> machines, such as via Remote Method Invocation (RMI), and thus a
>>>>> Serializable representation is useful."
>>>>>
>>>>> The Permission class was introduced in Java SE 1.2 so there were
>>>>> different motivations back then :)
>>>>>
>>>>> --Sean
>>>>>
>>>>> [1] https://www.oracle.com/technetwork/java/javaee/index-141918.html
>>>>
>>
>