Explicit Serialization API and Security

Mon Jan 12 15:56:53 UTC 2015

On 10/01/15 07:00, Peter Firmstone wrote:
> Again, thank you all for engaging in discussion of this very difficult
> topic.
>
> While we can't presently check intra object dependencies during
> deserialization with readObject(), the examples I provide can do this.

I have replied to Davids mail with a small change to GetField ( added 
superTypeFields() ) to return the deserialized supertypes fields. This 
gives subtypes the ability to check values of the supertypes persistent 
state.

As with your original proposal, it is limited to the persistent state as 
read off the stream, and not the transient state, but I think it gets us 
most of the way there.

In your original proposal, it looks quite cumbersome to hook up the 
static validator method in the constructor hierarchy ( as it is to do 
this with standard constructors too ). It also relies on the fact that 
the subtype has to create an instance of the supertype. I just wonder if 
we can push on the alternative static validator proposal to come up with 
something a but more attractive ( maybe not! ).

> ObjectInputValidation is sufficient for enforcing business rules,
> however it can't stop an attacker.  By the time the validator runs, the
> attackers object of choice has been instantiated and it's game over.
> That's right, the attacker may choose any Serializable class from the
> classpath, and may even extend non serializable classes, with a zero arg
> constructor, such as ClassLoader.
>
> To trust deserialization of an Object, as opposed to a String or
> primitive, the ObjectInputStream (or overriding subclass) must limit the
> classes allowed to be deserialized.

Agreed. I am working on fleshing out a more concrete proposal on 
confinement, that I mentioned in previous mails. Hopefully, I will have 
something in the next few days.

> For a class to be trusted, it must be trusted to check its parameters
> and enforce its invariants during object deserialization and to not
> deserialize with priviliged context.
>
> Validating an entire object graph's invariants, requires each class in
> the graph to take responsibility for validating and enforcing its own
> invariants.

Agreed, and I think that the revised static validator method gives us this.

> As shown in my earlier example, intra class invariants can be enforced
> using Serialization constructors, from which static methods are called
> to check invariants prior to super class constructors being called.

Yes, but it is cumbersome and easy to make mistakes.

> Presently, I override ObjectInputStream and use a Permission called
> DeserializationPermission to limit classes that can be deserialized.
> Untrusted connections are run from unprivileged context to limit classes
> that can be instantiated, while those with trusted connections are run
> with a Subject that allows any class to be deserialized.

Interesting. I think there is overlap here with confinement.

> ReadSerial, could do this:
>      Class c = rs.getType("foo");
>
> And
>
>      Foo f = rs.getObject("foo", Foo.class);
>
> Which performs instanceof type check, (prior to Object deserialization,
> if first time deserialized, otherwise after) and generic complile time
> type checked method return.   Thus we must restrict the classes that can
> be deserialized with ObjectInputStream.

Have you seen the changes I proposing for failure atomicity, preliminary 
webrev
    http://cr.openjdk.java.net/~chegar/failureAtomicity/

    .. and I think we can go further than this, creating the containing 
object lazily, if there are no readObjectXXX methods in the hierarchy.

> If Foo is mutable and we want a private copy, the caller needs to copy
> or clone it before checking invariants, as it would any mutable
> constructor parameter.
>
> Each level of validation is the responsibility of the component with the
> most knowledge.
>
>    1. Trusted class lists - administration, it might change.
>    2. Object invariants and intra object invariants - Classes, not objects.
>    3. Business rules, but not security - ObjectInputValidation.
>
> So summing up, in order to secure deserialization we must validate all
> data input, preferably before allowing object instantiation.

Yes, where possible. If we can phase out readObjectXXX and replace it 
with a static validator, then I think this is possible.

> Once an object has been constructed an attacker can gain a reference,
> whether by a finalizer attack or some other cleverly crafted stream, the
> attacker cannot obtain a reference to an object that doesn't exist.
>
> Circular links can allow an attacker to obtain an object reference
> ,prior to its invariants being checked, when we rely on readObject() to
> throw an exception.  Delaying finalizer registration won't save you.

Yes, circular references are nasty.

> Classes with circular links should be final and use a transient boolean
> "initialized" field, that every method checks to prevent an attacker
> doing anything useful, should they gain a reference to an incorrectly
> constructed object.
>
> The real question is, do we continue to plaster over the issues with the
> Serialization framework's api, and continue to create special cases such
> as a second final field freeze after a constructor completes?    To be
> honest, I don't like this second final field freeze, because invariants
> haven't been checked.

I think some of the discussion on this mailing list can provide APIs 
that avoid this, in the common case.

> Don't get me wrong, Serialization is quite clever, but implementing
> Serializable properly can consume a lot of time due to api complexity
> and is presently a security problem.

Yes, we too have spent too much time on this already. :-(  I hope this 
discussion will result in concrete changes to help make Serialization 
more secure by default, and provide additional tools, confinement, etc, 
when this is not possible.

> In my examples all constructors share the same invariant checks and
> because it's public API, the invariants can be tested easily with unit
> tests.
>
> It seems like we're trying to give constructor properties to a private
> instance method at the expense of increasing complexity, wouldn't it be
> simpler in the long term to provide a Serialization constructor?

I think long term we will need a replacement for 
Unsafe.allocateInstance. There are a number of issues in JIRA about 
possibly adding support, with suitable permissions, to reflection, to 
create an instance without running its constructor. I need to track down 
the bugs and details, not sure if this is even a realistic possibility.

> P.S. David, I like your suggestion of using a protected method, for
> limiting array size.

Agreed. This looks like it may have potential.  I thought some work was 
done in this area in the recent past, restricting the size of the graph 
or objects. Let me see if I can find out more.

-Chris.

> Thank you,
>
> Peter.
>> On 01/08/2015 02:10 PM, Brian Goetz wrote:
>>>> >>  1) Validate invariants
>>>> >>
>>>> >>        A clear and easy to understand mechanism that can validate
>>>> the
>>>> >>  deserialized
>>>> >>        fields. Does not prevent the use of final fields, as the
>>>> >>  serialization framework
>>>> >>        will be responsible for setting them. Something along the
>>>> lines
>>>> >>  of what David
>>>> >>        suggested:
>>>> >>
>>>> >>          private static void validate(GetField fields) {
>>>> >>              if (fields.getInt("lo")>  fields.getInt("hi")) { ... }
>>>> >>         }
>>>> >>
>>>> >>        This could be a ?special? method, or annotation driven. TBD.
>>>> >>
>>>> >>        Note: the validate method is static, so the object
>>>> instance is
>>>> >>  not required to
>>>> >>        be created before running the validation.
>>> >
>>> >  Sort of...
>>> >
>>> >  This is true if the fields participating in the invariant are
>>> >  primitives.  But if they're refs, what do you do?  What if you
>>> want to
>>> >  validate something like
>>> >
>>> >      count == list.size()   // fields are int count, List list
>>> >
>>> >  ?  Then wouldn't GetField.getObject have to deserialize the object
>>> >  referred to by that field?
>> In fact you cannot validate invariants across multiple objects at all
>> using this method*or*  readObject() (existing or enhanced) unless the
>> object in question is an enum, Class, or String (and a few other special
>> cases) because you can't rely on initialization order during
>> deserialization.  That's the very reason why OIS#registerValidation()
>> even exists - inter-object validation is not possible until after the
>> root-most readObject has completed, which is the time when validations
>> are executed.  Robust validation of a given object class may require two
>> stages - "near" validation and "spanning" validation - to fully
>> validate.  However the readObject() approach and its proposed variations
>> (including the static validate() version) can still be useful for
>> fail-fast and non-complex validations; you just have to understand that
>> (just as today) any Object you examine might not be fully initialized
>> yet.
>>
>> -- - DML