Stability of lambda serialization
Brian Goetz
brian.goetz at oracle.com
Tue Aug 6 13:23:37 PDT 2013
The hash can't just be of the lambda's own representation; it needs to fold in the enclosing context as well. Otherwise, changing:
    foo() {
        String x = ...;
        bar(() -> x.length());
    }

to

    foo() {
        File x = ...;
        bar(() -> x.length());
    }

will fool the hash.
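
To make the failure mode concrete, here is a minimal sketch, not a proposed specification. It assumes a hypothetical 64-bit hash taken from the first eight bytes of SHA-256 over a list of strings; the class name, the hash() helper, and the way the context is encoded ("foo()", "x:java.lang.String") are all made up for illustration. Hashing only the lambda text cannot see the change of x from String to File, while folding in the captured variable's type does.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class LambdaHashSketch {
    // Hypothetical helper: 64-bit hash over the given parts, in order.
    static long hash(String... parts) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (String part : parts) {
                byte[] bytes = part.getBytes(StandardCharsets.UTF_8);
                // Length prefix so that ("ab", "c") and ("a", "bc") hash differently.
                md.update(ByteBuffer.allocate(4).putInt(bytes.length).array());
                md.update(bytes);
            }
            // Keep only the first 8 bytes of the digest as the 64-bit value.
            return ByteBuffer.wrap(md.digest()).getLong();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be present on every Java platform.
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) {
        String lambda = "() -> x.length()";

        // Lambda text only: identical before and after the edit, so the hash is fooled.
        long textOnlyBefore = hash(lambda);
        long textOnlyAfter = hash(lambda);

        // Lambda text plus (assumed) context: enclosing method and captured variable type.
        long withContextBefore = hash(lambda, "foo()", "x:java.lang.String");
        long withContextAfter = hash(lambda, "foo()", "x:java.io.File");

        System.out.println("text only, same hash:    " + (textOnlyBefore == textOnlyAfter));
        System.out.println("with context, same hash: " + (withContextBefore == withContextAfter));
    }
}
```

How much context is enough (captured types, enclosing method signature, declaration order) is exactly the open question in the thread; the sketch only shows that the lambda text alone is not sufficient.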
On Aug 6, 2013, at 12:35 PM, Doug Lea wrote:
> On 08/06/13 13:04, David M. Lloyd wrote:
>> On 08/06/2013 11:43 AM, David M. Lloyd wrote:
>>> On 08/06/2013 11:14 AM, David M. Lloyd wrote:
>>>> On 08/06/2013 09:36 AM, Doug Lea wrote:
>>>>> On 08/06/13 09:17, David M. Lloyd wrote:
>>>>>
>>>>>> Me: Reordering captured variables, reordering lambda incidence. The
>>>>>> EG's stance
>>>>>> is just a generalization. It's not a stance in any case: things which
>>>>>> destabilize lambdas in terms of serializability are not a question of
>>>>>> opinion,
>>>>>> and it's bizarre to frame it that way.
>>>>>>
>>>>>>> What to do? EG: make a best effort, with documented caveats; you:
>>>>>>> conservatively prohibit serialization of capturing lambdas; third
>>>>>>> alternative: conservatively detect problems and break at
>>>>>>> deserialization
>>>>>>
>>>>>> I'm OK with either "you" or "third alternative".
>>>>>
>>>>> I'm OK with 3rd alternative if some reasonably efficient
>>>>> checksum/serial id ensuring breakage could be devised.
>>>>> David, any ideas?
>>>>
>>>> It's a good idea. I can think of a few requirements offhand:
>>>>
>>>> * Generation of the hash would necessarily occur at compile time
>>>> * The hash would have to be unique for each lambda within a class and/or
>>>> compilation unit
>>>> * The hash would have to be sensitive to any changes which would cause
>>>> any indeterminism in how the lambda is resolved - this may extend to
>>>> hashing even the bytecode of methods which include the lambda. This is
>>>> the key/most complex concept to tackle to make this solution work.
>>>> * The last step is to tag the UID value on to the serialized lambda
>>>> representation.
>>>>
>>>> I don't think there is much more to it than this; the hardest part is
>>>> determining what/how to hash. If it happens at compile time then
>>>> resolution at run time (i.e. the more performance-sensitive context)
>>>> should be the same kind of numerical comparison which is already done
>>>> for serialVersionUID.
>>>
>>> Brian pointed out a couple things:
>>>
>>> * Such a scheme would have to be very strongly and clearly specified
>>> * The scheme cannot depend on any particular non-spec compiler behavior
>>> (i.e. the same source file should create the same hashes regardless of
>>> compiler version or vendor)
>>>
>>> I suggested as a possible starting point a scheme which could create a
>>> 64-bit hash based on a combination of:
>
> How about just a hash of its actual string representation,
> plus its context (enclosing method etc). A little crazy but among the
> few simple and feasible ones I can think of. It means you blow up
> if you add a space. Fine: If you are going to draw the line somewhere,
> it might as well be here.
>
> Although at this point you wonder, why bother serializing.
> Just pass the string and invoke a compiler to parse... Probably
> not a lot slower.
>
> -Doug
>
>>>
>>> * Any captured variables' name and declaration order
>>> * The declaration order of the lambda
>>> * Information about the enclosing method: name and signature, maybe decl
>>> order? (though it should be redundant wrt. the lambda decl order)
>>> * The usual serialVersionUID calculation
>>>
>>> I would really appreciate anyone's thoughts as to the efficacy of this
>>> approach and any potential weaknesses; in particular I'd like to hear if
>>> anyone thinks this is a non-trivial change in terms of compilation and
>>> runtime.
>>>
>>> In particular, it is not 100% clear how the calculation would work with
>>> nested lambdas or lambdas nested in inner classes for example.
>>
>> For runtime it seems to me that this would largely consist of bundling the hash
>> with the method handle information which can be passed to its serialized
>> representation. The deserialization of the lambda could then hopefully just
>> verify the hash against the local method handle and throw an exception if it has
>> changed.
>
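
The runtime side described in the quoted text above (carry the compile-time hash in the serialized form, compare it against the local value on deserialization, and fail if it differs) could look roughly like the sketch below. Everything here is an assumption for illustration: SerializedForm, its fields, the localHashes map, and the key "lambda$foo$0" are hypothetical and do not describe the actual serialized lambda representation.

```java
import java.io.InvalidObjectException;
import java.util.HashMap;
import java.util.Map;

final class LambdaStabilityCheck {
    // Hypothetical serialized form: names the lambda and carries the compile-time hash.
    static final class SerializedForm {
        final String implMethodName;
        final long stabilityHash;

        SerializedForm(String implMethodName, long stabilityHash) {
            this.implMethodName = implMethodName;
            this.stabilityHash = stabilityHash;
        }
    }

    // Compare the hash from the stream with the locally compiled one; fail on any mismatch.
    static void verify(SerializedForm form, Map<String, Long> localHashes)
            throws InvalidObjectException {
        Long local = localHashes.get(form.implMethodName);
        if (local == null || local.longValue() != form.stabilityHash) {
            throw new InvalidObjectException(
                "lambda " + form.implMethodName + " changed since it was serialized");
        }
    }

    public static void main(String[] args) throws InvalidObjectException {
        // Assumed: hashes the compiler would have recorded for the local class.
        Map<String, Long> localHashes = new HashMap<>();
        localHashes.put("lambda$foo$0", 42L);

        verify(new SerializedForm("lambda$foo$0", 42L), localHashes); // passes silently
        try {
            verify(new SerializedForm("lambda$foo$0", 43L), localHashes);
        } catch (InvalidObjectException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The check itself is a single numerical comparison, in the same spirit as the serialVersionUID check; the cost is all in deciding, at compile time, what goes into the hash.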
More information about the lambda-spec-observers
mailing list