Serializable lambdas -- where we are, how we got here
Remi Forax
forax at univ-mlv.fr
Fri Aug 16 13:56:16 PDT 2013
On 08/16/2013 07:47 PM, Brian Goetz wrote:
> Several concerns have been recently (re)raised again about the
> stability of serializable lambdas. This attempts to provide an
> inventory of where we are and how we got here.
>
> There were some who initially (wishfully) suggested that it would be
> best to declare serialization a mistake and not make lambdas
> serializable at all. While this was a very tempting target,
> ultimately this conflicted with another decision we made: that of
> using nominal function types (functional interfaces) to type lambdas.
>
> For example, imagine:
>
> interface SerializablePredicate<T>
> extends Predicate<T>, Serializable { }
>
> If the user does:
>
> SerializablePredicate<String> p = s -> false;
> or
> SerializablePredicate<String> p = String::isEmpty;
>
> It would violate the principle of least surprise for the resulting
> objects (whose lambda-heritage should be invisible to anyone who later
> touches them) to not be serializable. Hence began our slide down the
> slippery slope.
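
[Illustration, not part of the original message.] A minimal sketch of the expectation described above: an object typed as SerializablePredicate should survive a serialization round trip regardless of whether it came from a lambda or a method reference. The names LambdaRoundTrip, roundTrip and emptyCheck are hypothetical, chosen only for this example; SerializablePredicate is the interface from the message.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.function.Predicate;

// The interface from the message above.
interface SerializablePredicate<T> extends Predicate<T>, Serializable { }

class LambdaRoundTrip {
    // Hypothetical helper: serialize to a byte array, then read the object back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical factory: the target type makes the method reference serializable.
    static SerializablePredicate<String> emptyCheck() {
        return String::isEmpty;
    }

    public static void main(String[] args) {
        SerializablePredicate<String> copy = roundTrip(emptyCheck());
        System.out.println(copy.test(""));   // true
        System.out.println(copy.test("x"));  // false
    }
}
```

Here both sides of the "wire" see the same class files, which is the easy case; the stability questions discussed below arise only when they differ.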
>
> An intrinsic challenge of serialization is, when confronted with
> different class files at deserialization time than were present at
> serialization time, to make a good-faith effort to figure out what to
> do. For classes, the default behavior (in the absence of an explicit
> serial version UID) is to consider any change to the class signatures
> to invalidate existing serialized forms, but in the presence of a
> serial version UID, to attempt to deal gracefully with added or
> removed fields. Inherent in this is the assumption that if the *name*
> and *signature* of something haven't changed, its semantics haven't,
> either. If you change the meaning of a field or a method, but not its
> name, you're out of luck.
>
> Anonymous classes are less forgiving than nominal classes, because (a)
> their names are generated at compile time and may change if the source
> changes "too much", and (b) their field names / constructor signature
> may change based on changes in method bodies even if the class and
> method signatures don't change. This problem has been with us since
> 1997. There are two possible failure modes that come out of this:
> Type 1) An instance may fail to deserialize, due to changes that have
> nothing to do with the object being serialized;
> Type 2) An instance may deserialize successfully, but may be bound to
> the *wrong* implementation due to bad luck.
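
[Illustration, not part of the original message.] A small sketch of points (a) and (b) above, using reflection to show the compiler-generated name of an anonymous class and the synthetic fields that hold its captured variables. AnonymousShape, Greeter, make and fieldNames are hypothetical names. The exact generated names are a compiler implementation detail; with javac one would typically see something like AnonymousShape$1 and a field named val$prefix, but that is an assumption about the compiler, not a guarantee.

```java
import java.io.Serializable;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

class AnonymousShape {
    interface Greeter extends Serializable {
        String greet(String name);
    }

    // The anonymous class captures 'prefix'; the compiler stores it in a
    // synthetic field whose name and order depend on the method body.
    static Greeter make(String prefix) {
        return new Greeter() {
            @Override
            public String greet(String name) {
                return prefix + name;
            }
        };
    }

    // Lists the fields declared by the object's runtime class.
    static List<String> fieldNames(Object o) {
        List<String> names = new ArrayList<>();
        for (Field f : o.getClass().getDeclaredFields()) {
            names.add(f.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        Greeter g = make("Hello, ");
        // Generated class name: shifts if anonymous classes are added or reordered.
        System.out.println(g.getClass().getName());
        // Generated field names: shift if the body captures different variables.
        System.out.println(fieldNames(g));
    }
}
```

Adding another anonymous class earlier in the same outer class, or capturing a second local variable in greet, changes what this prints, which is how edits unrelated to the serialized object can lead to the Type 1 and Type 2 failures listed above.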
>
> Still, many users successfully deal with serialization and anonymous
> classes by following a simple rule: have the same bits on both sides
> of the wire. In reality, the situation is more forgiving than that:
> if you recompile the same source with the same compiler, things still
> work -- and users fundamentally expect this to be the case. And the
> same is true for "lightly modified" versions of the same sources
> (adding comments, adding debugging statements, etc.)
>
> Lambdas are similar to anonymous classes in some ways, and we were
> aware of these failure modes at the time we first discussed
> serialization of lambdas. Obviously we would have preferred to
> prevent these failures if possible, but all the approaches explored
> were either too restrictive or incomplete. Restrictions that were
> explored and rejected include:
> - No serializable lambdas at all
> - Only serialize static or unbound method refs
> - Only serialize named, non-capturing lambdas
>
> The various hash-the-world options that have been suggested (hash the
> source or bytecode) are too weird, too brittle, too hard to specify,
> and will result in users being confounded by, say, recompiling what
> they perceive as identical sources with an identical compiler and
> still getting runtime failures, violating (reasonable) user
> expectations. (It would be almost better to generate a *random* name
> on every compilation, but we're not going to do that.)
>
> In the absence of being able to make it perfect, having exactly the
> same drawbacks of an existing mechanism, which users are familiar with
> and have learned to work around, was deemed better than making it
> imperfect in yet a new way.
>
> That said, if there's a possibility to reduce type-2 failures without
> undermining the usability of serialization or the simplicity of the
> user model, we're willing to continue to explore these (despite the
> extreme lateness of the hour).
>
> At the recent EG meeting, we specifically discussed whether it would
> be worthwhile to try and address recovering from capture-order issues.
> This *is* tractable (subject to the same caveats with nominal classes
> -- that same-name means same-meaning). But, the sense of the room
> then was that this doesn't help enough, because there is still the
> name-induced stability issue, and that fixing one without the other
> just encourages users to think that they can make arbitrary code
> changes and expect serialization stability, and makes it even more
> surprising when we get a failure due to, say, adding a new lambda to a
> method. However, if we felt we were likely to do named lambdas later,
> then this approach could close half the problem now and we could close
> the other half of the problem later.
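
[Illustration, not part of the original message.] A sketch of the two halves of the problem as they can be observed in the serialized form that later shipped in Java 8, java.lang.invoke.SerializedLambda: a generated implementation method name (the name-induced issue) and a positional list of captured arguments (the capture-order issue). LambdaSerializedForm, SerializableSupplier, describe and capture are hypothetical names. Calling the generated writeReplace method reflectively relies on how current JDKs implement serializable lambdas, so treat it as an assumption rather than a supported API.

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;
import java.util.function.Supplier;

class LambdaSerializedForm {
    interface SerializableSupplier<T> extends Supplier<T>, Serializable { }

    // Assumption: the lambda's runtime class has a private writeReplace()
    // method that returns a SerializedLambda.
    static SerializedLambda describe(Serializable lambda) {
        try {
            Method m = lambda.getClass().getDeclaredMethod("writeReplace");
            m.setAccessible(true);
            return (SerializedLambda) m.invoke(lambda);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // A lambda that captures two local variables.
    static SerializableSupplier<String> capture(String a, int b) {
        return () -> a + b;
    }

    public static void main(String[] args) {
        SerializedLambda sl = describe(capture("x", 1));
        // Compiler-generated name of the method holding the lambda body.
        System.out.println(sl.getImplMethodName());
        // Captured values are stored by position, not by variable name.
        for (int i = 0; i < sl.getCapturedArgCount(); i++) {
            System.out.println(i + ": " + sl.getCapturedArg(i));
        }
    }
}
```

Because the captured values are identified only by position, a recompile that changes the capture order could bind old serialized data to the wrong variables if the types happen to line up.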
>
> One possibility that has not yet been discussed is to issue a lint
> warning for serializable lambdas/method refs that are subject to
> stability issues.
>
> Here's where we are:
> - We're not revisiting the decisions about what lambdas and method
> references should be serializable. This has been reopened several
> times with no change in consensus, and no new information has come to
> light that would change the decision.
> - "Just like inner classes" is a local maximum. Better to not ask the
> user to create a new mental model than to require a new one that is
> just as flawed but in different ways. However, we already make some
> departures from inner class treatment, so this is a more "spirit of
> the rule" thing than a "letter of the rule." If we can do *much*
> better, great, but "slightly better but different" is worse.
> - We might be able to revisit some translation decisions if they
> result in significant improvements to stability without cost to
> usability, but we are almost, if not completely, out of time.
> - We're open to adding more lint warnings at compile time.
>
>
> Stay tuned for a specific proposal.
So you want a lint warning saying serialization sucks :)
You want a warning when a lambda/method ref captures local variables;
it's logical to have the same warning for inner classes too.
But in that case you will raise warnings in already written and valid code.
Not a good idea, IMO.
Rémi