Explicit Serialization API and Security

Sat Jan 10 07:00:22 UTC 2015

Again, thank you all for engaging in discussion of this very difficult 
topic.

While we can't presently check intra object dependencies during 
deserialization with readObject(), the examples I provide can do this.

ObjectInputValidation is sufficient for enforcing business rules, 
however it can't stop an attacker.  By the time the validator runs, the 
attackers object of choice has been instantiated and it's game over.  
That's right, the attacker may choose any Serializable class from the 
classpath, and may even extend non serializable classes, with a zero arg 
constructor, such as ClassLoader.

To trust deserialization of an Object, as opposed to a String or 
primitive, the ObjectInputStream (or overriding subclass) must limit the 
classes allowed to be deserialized.

For a class to be trusted, it must be trusted to check its parameters 
and enforce its invariants during object deserialization and to not 
deserialize with priviliged context.

Validating an entire object graph's invariants, requires each class in 
the graph to take responsibility for validating and enforcing its own 
invariants.

As shown in my earlier example, intra class invariants can be enforced 
using Serialization constructors, from which static methods are called 
to check invariants prior to super class constructors being called.

Presently, I override ObjectInputStream and use a Permission called 
DeserializationPermission to limit classes that can be deserialized. 
Untrusted connections are run from unprivileged context to limit classes 
that can be instantiated, while those with trusted connections are run 
with a Subject that allows any class to be deserialized.

ReadSerial, could do this:
     Class c = rs.getType("foo");

And

     Foo f = rs.getObject("foo", Foo.class);

Which performs instanceof type check, (prior to Object deserialization, 
if first time deserialized, otherwise after) and generic complile time 
type checked method return.   Thus we must restrict the classes that can 
be deserialized with ObjectInputStream.

If Foo is mutable and we want a private copy, the caller needs to copy 
or clone it before checking invariants, as it would any mutable 
constructor parameter.

Each level of validation is the responsibility of the component with the 
most knowledge.

   1. Trusted class lists - administration, it might change.
   2. Object invariants and intra object invariants - Classes, not objects.
   3. Business rules, but not security - ObjectInputValidation.

So summing up, in order to secure deserialization we must validate all 
data input, preferably before allowing object instantiation.

Once an object has been constructed an attacker can gain a reference, 
whether by a finalizer attack or some other cleverly crafted stream, the 
attacker cannot obtain a reference to an object that doesn't exist.

Circular links can allow an attacker to obtain an object reference 
,prior to its invariants being checked, when we rely on readObject() to 
throw an exception.  Delaying finalizer registration won't save you.

Classes with circular links should be final and use a transient boolean 
"initialized" field, that every method checks to prevent an attacker 
doing anything useful, should they gain a reference to an incorrectly 
constructed object.

The real question is, do we continue to plaster over the issues with the 
Serialization framework's api, and continue to create special cases such 
as a second final field freeze after a constructor completes?    To be 
honest, I don't like this second final field freeze, because invariants 
haven't been checked.

Don't get me wrong, Serialization is quite clever, but implementing 
Serializable properly can consume a lot of time due to api complexity 
and is presently a security problem.

In my examples all constructors share the same invariant checks and 
because it's public API, the invariants can be tested easily with unit 
tests.

It seems like we're trying to give constructor properties to a private 
instance method at the expense of increasing complexity, wouldn't it be 
simpler in the long term to provide a Serialization constructor?

P.S. David, I like your suggestion of using a protected method, for 
limiting array size.

Thank you,

Peter.
> On 01/08/2015 02:10 PM, Brian Goetz wrote:
>>> >>  1) Validate invariants
>>> >>
>>> >>        A clear and easy to understand mechanism that can validate the
>>> >>  deserialized
>>> >>        fields. Does not prevent the use of final fields, as the
>>> >>  serialization framework
>>> >>        will be responsible for setting them. Something along the lines
>>> >>  of what David
>>> >>        suggested:
>>> >>
>>> >>          private static void validate(GetField fields) {
>>> >>              if (fields.getInt("lo")>  fields.getInt("hi")) { ... }
>>> >>         }
>>> >>
>>> >>        This could be a ?special? method, or annotation driven. TBD.
>>> >>
>>> >>        Note: the validate method is static, so the object instance is
>>> >>  not required to
>>> >>        be created before running the validation.
>> >
>> >  Sort of...
>> >
>> >  This is true if the fields participating in the invariant are
>> >  primitives.  But if they're refs, what do you do?  What if you want to
>> >  validate something like
>> >
>> >      count == list.size()   // fields are int count, List list
>> >
>> >  ?  Then wouldn't GetField.getObject have to deserialize the object
>> >  referred to by that field?
> In fact you cannot validate invariants across multiple objects at all
> using this method*or*  readObject() (existing or enhanced) unless the
> object in question is an enum, Class, or String (and a few other special
> cases) because you can't rely on initialization order during
> deserialization.  That's the very reason why OIS#registerValidation()
> even exists - inter-object validation is not possible until after the
> root-most readObject has completed, which is the time when validations
> are executed.  Robust validation of a given object class may require two
> stages - "near" validation and "spanning" validation - to fully
> validate.  However the readObject() approach and its proposed variations
> (including the static validate() version) can still be useful for
> fail-fast and non-complex validations; you just have to understand that
> (just as today) any Object you examine might not be fully initialized yet.
>
> -- - DML