RFR 9: JEP 290: Filter Incoming Serialization Data

Roger Riggs Roger.Riggs at Oracle.com
Thu Jul 21 18:19:33 UTC 2016


Hi Peter,

On 7/21/2016 3:42 AM, Peter Levart wrote:
> Hi Roger,
>
>
> On 07/20/2016 04:44 PM, Roger Riggs wrote:
>>>
>>> - What is the purpose of the UNDECIDED return? I suspect it is meant 
>>> to be used in some filter implementation that delegates the 
>>> validation to some "parent" filter and respects its decision unless 
>>> it is UNDECIDED in which case it decides (or does not) on its own. 
>>> Should such strategy be mentioned in the docs to encourage 
>>> inter-operable filter implementations?
>> Yes, some simple filters might be for the purposes of black-listing or 
>> white-listing.
>> The pattern-based filters, as produced by 
>> ObjectInputFilter.createFilter(patterns), can simply represent
>> white-listing or black-listing, but if none of the patterns match, they 
>> can only report UNDECIDED.
>>
>> A custom filter should check whether a process-wide filter is 
>> configured and invoke it first, returning its status unless it is 
>> UNDECIDED, in which case the custom filter uses its own logic to 
>> determine the status.
>>
>> Definitely worthy of an @apiNote in ObjectInputFilter.
>
> Should Config.createFilter(pattern) then have an overload that allows 
> specifying a "parent" filter in addition to the "pattern"?
>
Supplying a filter as a method gives more flexibility in what it can check 
and in how filters are combined.  With lambdas it is very easy to write a 
filter that handles the more complex cases.
The pattern-based filters are intended to cover only the simple 
white-list/black-list cases.
So for now, simple is good enough to start.
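
For example, a minimal sketch of such a delegating filter written as a
lambda (assuming the checkInput(FilterInfo)/Status/Config.getSerialFilter()
shape of the proposed API; the com.example package white-list is purely
hypothetical):

    import java.io.ObjectInputFilter;
    import java.io.ObjectInputFilter.Config;
    import java.io.ObjectInputFilter.Status;

    public class DelegatingFilter {
        // Consults the process-wide filter first and applies its own
        // white-list only when that filter is absent or UNDECIDED.
        static ObjectInputFilter customFilter() {
            ObjectInputFilter processWide = Config.getSerialFilter(); // may be null
            return info -> {
                if (processWide != null) {
                    Status status = processWide.checkInput(info);
                    if (status != Status.UNDECIDED) {
                        return status;          // respect the parent's decision
                    }
                }
                Class<?> clazz = info.serialClass();
                if (clazz == null) {
                    return Status.UNDECIDED;    // no class to check (e.g. a back-reference)
                }
                // Hypothetical application package, for illustration only
                return clazz.getName().startsWith("com.example.")
                        ? Status.ALLOWED : Status.REJECTED;
            };
        }
    }

A pattern filter from Config.createFilter(patterns) can slot into the
"own logic" step the same way when a pattern is all that is needed.
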
>>
>>>
>>> - The call-back is invoked after the type of the object and possible 
>>> array length is read from stream but before the object's state is 
>>> read. Suppose that the object that is about to be read is either 
>>> Externalizable object or an object with a readObject() method(s) 
>>> that consume block data from the stream. This block data can be 
>>> large. Should there be a call-back to "announce" the block data too? 
>>> (for example, when the 'clazz' is null and the 'size' is 0, the 
>>> call-back reports a back-reference to a previously read object, but 
>>> when the 'clazz' is null and the 'size' > 0, it announces the 'size' 
>>> bytes of block data. Does this make sense?)
>> Interesting case, I'll take another look at that.  Since block data 
>> records are <= 1024 bytes, a filter might not have enough information 
>> to make an informed decision.  Those bytes would show up in the stream 
>> byte count, but not until the next object is read.
>
> ...which could be too late. If the filter is also to be used as a 
> defense against forged streams that try to provoke DoS by triggering 
> frequent GCs and OutOfMemoryError(s), then such a call-back that 
> announces each block data record could help achieve that.
Individual block data lengths are not very informative, since block data 
can be segmented, but a cumulative (whole-stream) block data size, 
supplied to a callback at the start of each block data segment, might be 
useful.
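
In the meantime, the cumulative stream byte count that FilterInfo already
exposes can serve as a coarse cap, checked at each filter callback rather
than at block data boundaries.  A rough sketch (assuming the streamBytes()
accessor as proposed; the limit value is illustrative only):

    import java.io.ObjectInputFilter;
    import java.io.ObjectInputFilter.Status;

    public class StreamSizeCap {
        // Rejects once the total number of stream bytes read so far
        // exceeds a fixed cap; otherwise leaves the decision to others.
        static ObjectInputFilter maxStreamBytes(long maxBytes) {
            return info -> info.streamBytes() > maxBytes
                    ? Status.REJECTED : Status.UNDECIDED;
        }
    }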

Thanks, Roger

>
> Regards, Peter
>


