Explicit Serialization API and Security

Wed Jan 14 11:37:29 UTC 2015

On 13/01/15 10:00, Peter Levart wrote:
> Hi Chris,
>
> I stepped back and asked myself what problem(s) we are searching
> solution(s) for and tried to see how existing API solves or doesn't
> solve them. I think that existing readObject() + GetFields API already
> solves most of the problems discussed so far as it:
> - provides encapsulation (using GetFields API we only have access to
> deserialized values of the class where readObject() method being invoked
> is declared)
> - provides a means to check intra-object invariants in an "encapsulated"
> way - the superclass state is already initialized/deserialized when
> readObject() is called, so we can check GetFields provided values
> against superclass state in a way that preserves encapsulation.
>
> What readObject()/GetFields combo doesn't provide is a relatively nice
> solution for final fields (we have to resort to clumsy reflection). So
> why not adding just that to the API? For example, extend
> ObjectInputStream with the following method:
>
>      ObjectInputStream {
>          ...
>          FieldAccess fields();
>         ...
>      }
>
>
> where FieldAccess is something like:
>
>      public interface FieldAccess {
>          FieldAccess set(String fieldName, boolean value);
>          FieldAccess set(String fieldName, byte value);
>          FieldAccess set(String fieldName, char value);
>          FieldAccess set(String fieldName, short value);
>          FieldAccess set(String fieldName, int value);
>          FieldAccess set(String fieldName, long value);
>          FieldAccess set(String fieldName, float value);
>          FieldAccess set(String fieldName, double value);
>          <T> FieldAccess set(String fieldName, T value);
>      }
>
> ...which would confine access to just those fields declared in the class
> where readObject() being invoked is declared and just for the instance
> and time of readObject() call-back.
>
>
> Simple, flexible and stright-forward. It solves the problem with final
> fields. It doesn't provide any auto-magic, which, although it might seem
> pretty at first, can sometimes just be in the way because of
> inflexibility. Sugar can be added though (in the form of
> GetFields.defaultReadFields() or even special check() methods), but
> flexibility is a necessary ingredient, I think.
>
> What do you think?

I agree completely.

An API at the level that you are proposing will provide the necessary 
functionality and flexibility that is required to do validation in 
readObject. As you clearly stated, and is already the case, validation 
in this way can depend on supertype state.

Failure automicity of the whole type hierarchy is the only thing that is 
missing, but I think that could possibly be built on top, or solved in a 
different way. I don't see that as being a blocker for moving this 
forward. This proposal stands on its own merits.

Peter,
   Is this something that you want to actively flesh out? If not, I can 
try to help move this forward.

-Chris.

> Regards, Peter
>
> On 01/12/2015 04:56 PM, Chris Hegarty wrote:
>> On 10/01/15 07:00, Peter Firmstone wrote:
>>> Again, thank you all for engaging in discussion of this very difficult
>>> topic.
>>>
>>> While we can't presently check intra object dependencies during
>>> deserialization with readObject(), the examples I provide can do this.
>>
>> I have replied to Davids mail with a small change to GetField ( added
>> superTypeFields() ) to return the deserialized supertypes fields. This
>> gives subtypes the ability to check values of the supertypes
>> persistent state.
>>
>> As with your original proposal, it is limited to the persistent state
>> as read off the stream, and not the transient state, but I think it
>> gets us most of the way there.
>>
>> In your original proposal, it looks quite cumbersome to hook up the
>> static validator method in the constructor hierarchy ( as it is to do
>> this with standard constructors too ). It also relies on the fact that
>> the subtype has to create an instance of the supertype. I just wonder
>> if we can push on the alternative static validator proposal to come up
>> with something a but more attractive ( maybe not! ).
>>
>>> ObjectInputValidation is sufficient for enforcing business rules,
>>> however it can't stop an attacker.  By the time the validator runs, the
>>> attackers object of choice has been instantiated and it's game over.
>>> That's right, the attacker may choose any Serializable class from the
>>> classpath, and may even extend non serializable classes, with a zero arg
>>> constructor, such as ClassLoader.
>>>
>>> To trust deserialization of an Object, as opposed to a String or
>>> primitive, the ObjectInputStream (or overriding subclass) must limit the
>>> classes allowed to be deserialized.
>>
>> Agreed. I am working on fleshing out a more concrete proposal on
>> confinement, that I mentioned in previous mails. Hopefully, I will
>> have something in the next few days.
>>
>>> For a class to be trusted, it must be trusted to check its parameters
>>> and enforce its invariants during object deserialization and to not
>>> deserialize with priviliged context.
>>>
>>> Validating an entire object graph's invariants, requires each class in
>>> the graph to take responsibility for validating and enforcing its own
>>> invariants.
>>
>> Agreed, and I think that the revised static validator method gives us
>> this.
>>
>>> As shown in my earlier example, intra class invariants can be enforced
>>> using Serialization constructors, from which static methods are called
>>> to check invariants prior to super class constructors being called.
>>
>> Yes, but it is cumbersome and easy to make mistakes.
>>
>>> Presently, I override ObjectInputStream and use a Permission called
>>> DeserializationPermission to limit classes that can be deserialized.
>>> Untrusted connections are run from unprivileged context to limit classes
>>> that can be instantiated, while those with trusted connections are run
>>> with a Subject that allows any class to be deserialized.
>>
>> Interesting. I think there is overlap here with confinement.
>>
>>> ReadSerial, could do this:
>>>      Class c = rs.getType("foo");
>>>
>>> And
>>>
>>>      Foo f = rs.getObject("foo", Foo.class);
>>>
>>> Which performs instanceof type check, (prior to Object deserialization,
>>> if first time deserialized, otherwise after) and generic complile time
>>> type checked method return.   Thus we must restrict the classes that can
>>> be deserialized with ObjectInputStream.
>>
>> Have you seen the changes I proposing for failure atomicity,
>> preliminary webrev
>>    http://cr.openjdk.java.net/~chegar/failureAtomicity/
>>
>>    .. and I think we can go further than this, creating the containing
>> object lazily, if there are no readObjectXXX methods in the hierarchy.
>>
>>> If Foo is mutable and we want a private copy, the caller needs to copy
>>> or clone it before checking invariants, as it would any mutable
>>> constructor parameter.
>>>
>>> Each level of validation is the responsibility of the component with the
>>> most knowledge.
>>>
>>>    1. Trusted class lists - administration, it might change.
>>>    2. Object invariants and intra object invariants - Classes, not
>>> objects.
>>>    3. Business rules, but not security - ObjectInputValidation.
>>>
>>> So summing up, in order to secure deserialization we must validate all
>>> data input, preferably before allowing object instantiation.
>>
>> Yes, where possible. If we can phase out readObjectXXX and replace it
>> with a static validator, then I think this is possible.
>>
>>> Once an object has been constructed an attacker can gain a reference,
>>> whether by a finalizer attack or some other cleverly crafted stream, the
>>> attacker cannot obtain a reference to an object that doesn't exist.
>>>
>>> Circular links can allow an attacker to obtain an object reference
>>> ,prior to its invariants being checked, when we rely on readObject() to
>>> throw an exception.  Delaying finalizer registration won't save you.
>>
>> Yes, circular references are nasty.
>>
>>> Classes with circular links should be final and use a transient boolean
>>> "initialized" field, that every method checks to prevent an attacker
>>> doing anything useful, should they gain a reference to an incorrectly
>>> constructed object.
>>>
>>> The real question is, do we continue to plaster over the issues with the
>>> Serialization framework's api, and continue to create special cases such
>>> as a second final field freeze after a constructor completes? To be
>>> honest, I don't like this second final field freeze, because invariants
>>> haven't been checked.
>>
>> I think some of the discussion on this mailing list can provide APIs
>> that avoid this, in the common case.
>>
>>> Don't get me wrong, Serialization is quite clever, but implementing
>>> Serializable properly can consume a lot of time due to api complexity
>>> and is presently a security problem.
>>
>> Yes, we too have spent too much time on this already. :-(  I hope this
>> discussion will result in concrete changes to help make Serialization
>> more secure by default, and provide additional tools, confinement,
>> etc, when this is not possible.
>>
>>> In my examples all constructors share the same invariant checks and
>>> because it's public API, the invariants can be tested easily with unit
>>> tests.
>>>
>>> It seems like we're trying to give constructor properties to a private
>>> instance method at the expense of increasing complexity, wouldn't it be
>>> simpler in the long term to provide a Serialization constructor?
>>
>> I think long term we will need a replacement for
>> Unsafe.allocateInstance. There are a number of issues in JIRA about
>> possibly adding support, with suitable permissions, to reflection, to
>> create an instance without running its constructor. I need to track
>> down the bugs and details, not sure if this is even a realistic
>> possibility.
>>
>>> P.S. David, I like your suggestion of using a protected method, for
>>> limiting array size.
>>
>> Agreed. This looks like it may have potential.  I thought some work
>> was done in this area in the recent past, restricting the size of the
>> graph or objects. Let me see if I can find out more.
>>
>> -Chris.
>>
>>> Thank you,
>>>
>>> Peter.
>>>> On 01/08/2015 02:10 PM, Brian Goetz wrote:
>>>>>> >>  1) Validate invariants
>>>>>> >>
>>>>>> >>        A clear and easy to understand mechanism that can validate
>>>>>> the
>>>>>> >>  deserialized
>>>>>> >>        fields. Does not prevent the use of final fields, as the
>>>>>> >>  serialization framework
>>>>>> >>        will be responsible for setting them. Something along the
>>>>>> lines
>>>>>> >>  of what David
>>>>>> >>        suggested:
>>>>>> >>
>>>>>> >>          private static void validate(GetField fields) {
>>>>>> >>              if (fields.getInt("lo")> fields.getInt("hi")) { ... }
>>>>>> >>         }
>>>>>> >>
>>>>>> >>        This could be a ?special? method, or annotation driven.
>>>>>> TBD.
>>>>>> >>
>>>>>> >>        Note: the validate method is static, so the object
>>>>>> instance is
>>>>>> >>  not required to
>>>>>> >>        be created before running the validation.
>>>>> >
>>>>> >  Sort of...
>>>>> >
>>>>> >  This is true if the fields participating in the invariant are
>>>>> >  primitives.  But if they're refs, what do you do? What if you
>>>>> want to
>>>>> >  validate something like
>>>>> >
>>>>> >      count == list.size()   // fields are int count, List list
>>>>> >
>>>>> >  ?  Then wouldn't GetField.getObject have to deserialize the object
>>>>> >  referred to by that field?
>>>> In fact you cannot validate invariants across multiple objects at all
>>>> using this method*or*  readObject() (existing or enhanced) unless the
>>>> object in question is an enum, Class, or String (and a few other
>>>> special
>>>> cases) because you can't rely on initialization order during
>>>> deserialization.  That's the very reason why OIS#registerValidation()
>>>> even exists - inter-object validation is not possible until after the
>>>> root-most readObject has completed, which is the time when validations
>>>> are executed.  Robust validation of a given object class may require
>>>> two
>>>> stages - "near" validation and "spanning" validation - to fully
>>>> validate.  However the readObject() approach and its proposed
>>>> variations
>>>> (including the static validate() version) can still be useful for
>>>> fail-fast and non-complex validations; you just have to understand that
>>>> (just as today) any Object you examine might not be fully initialized
>>>> yet.
>>>>
>>>> -- - DML
>