Serialization of object identity

Fri Jun 14 14:50:20 UTC 2019

Thanks for the reply.

> Actually, they're two entirely different problems.

I don't think they are entirely different since identity info is a
requirement for cyclic graphs, but I don't intend to discuss semantics.

> The challenge with cyclic graphs is not that we have to respect identity
> -- that can be done (it is up to a particular serialization to decide if
> it is going to do so.)

I'm happy to see that respecting identity can be done. If you could add
this to the document in the next update, it would be beneficial, even if
it's just "can be done", because it's not at all clear (to me) that this is
the case. All of the constructor and deconstructor patterns presented are
about the data stored in fields, so the treatment of identity info would
have to come from somewhere else.

> The challenge with cyclic graphs is that
> logically cyclic graphs cannot, in general, be reproduced through a
> series of constructor calls -- some mutation is required as well.  Which
> conflicts with our main security goal, that deserialization proceed
> through constructors.

I understand, but maybe giving up from the start (from the public's POV) is
premature. Leaking "this", for example as Rémi suggested, is not
necessarily a flaw if done cautiously.
Also, mutation is not necessarily a security issue, though it is another
place where one could arise. Many mutators do the same checks on the input
as the constructor does. If my class has fields for weight and height, it's
very reasonable that both the constructor and the setters will do a check
for >0. In this case, I can deserialize using the empty constructor and
then mutate, and what do I lose? I believe that this is what Jackson does
in its simplest case.
I'm the author of my classes and I'm responsible for checking my input in
mutators just as much as in constructors, and whatever deserialization
requires me to do for the constructor I can do for the mutator. Any kind of
delayed assignment ("construct and then") could be a major part of a
solution. Rémi's trick with Function is on the same page.
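
To make the weight/height case concrete, here is a minimal sketch. The
names Person and PersonReader and the Map-based input are made up for
illustration; they stand in for whatever a real framework would provide
and are not taken from Jackson or from the proposal.

```java
import java.util.Map;

// Hypothetical class: the constructor and the setters share one check.
final class Person {
    private double weight;
    private double height;

    Person() { }  // "empty" constructor, used by the reader below

    Person(double weight, double height) {
        setWeight(weight);
        setHeight(height);
    }

    void setWeight(double weight) { this.weight = requirePositive(weight, "weight"); }
    void setHeight(double height) { this.height = requirePositive(height, "height"); }

    double getWeight() { return weight; }
    double getHeight() { return height; }

    private static double requirePositive(double value, String name) {
        if (!(value > 0)) {  // also rejects NaN
            throw new IllegalArgumentException(name + " must be > 0, got " + value);
        }
        return value;
    }
}

// Hypothetical stand-in for a deserializer: construct first, then mutate.
final class PersonReader {
    static Person read(Map<String, Double> fields) {
        Person p = new Person();
        p.setWeight(fields.get("weight"));  // same validation as the constructor
        p.setHeight(fields.get("height"));
        return p;
    }
}
```

With this shape, PersonReader.read(Map.of("weight", -1.0, "height", 1.8))
is rejected by the setter with the same IllegalArgumentException the
two-argument constructor would throw.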

Sure, it's easy for me to pop in and say "put annotations on setters and
invoke them after construction" as if it were a simple solution to this
problem, and it's really not. What it is, though, is a reason to reconsider
throwing cyclic graphs out, especially when JSON can support them in
practice.
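
For illustration only, this is roughly the "construct and then" shape I
have in mind for a cycle. Node, GraphReader and the {id, name, nextId}
record layout are assumptions of this sketch, not an existing API or
wire format.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical node type: the link is assigned after construction.
final class Node {
    private final String name;
    private Node next;

    Node(String name) { this.name = Objects.requireNonNull(name, "name"); }

    // Validating mutator, used only to close links once all nodes exist.
    void setNext(Node next) { this.next = Objects.requireNonNull(next, "next"); }

    String name() { return name; }
    Node next() { return next; }
}

final class GraphReader {
    // Each record is {id, name, nextId}; ids carry the identity information.
    static Map<String, Node> read(String[][] records) {
        Map<String, Node> byId = new HashMap<>();
        // Phase 1: every node goes through its constructor.
        for (String[] r : records) {
            byId.put(r[0], new Node(r[1]));
        }
        // Phase 2: links are resolved by id through the mutator.
        for (String[] r : records) {
            Node target = byId.get(r[2]);
            if (target == null) {
                throw new IllegalArgumentException("unknown node id: " + r[2]);
            }
            byId.get(r[0]).setNext(target);
        }
        return byId;
    }
}
```

Reading {{"1", "a", "2"}, {"2", "b", "1"}} gives two nodes that point at
each other, and each reference to id "1" resolves to the same Node
instance, so identity is preserved along with the cycle.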

On Wed, Jun 12, 2019 at 10:55 PM Remi Forax <forax at univ-mlv.fr> wrote:

>
>
> ----- Original Message -----
> > From: "Brian Goetz" <brian.goetz at oracle.com>
> > To: "Nir Lisker" <nlisker at gmail.com>, "amber-dev" <
> amber-dev at openjdk.java.net>
> > Sent: Wednesday, June 12, 2019 21:26:34
> > Subject: Re: Serialization of object identity
>
> >> In fact, the cyclic graph issue is a result of an inability to represent
> >> object identity in serialization, which is a much larger problem.
> >
> > Actually, they're two entirely different problems.
> >
> > The challenge with cyclic graphs is not that we have to respect identity
> > -- that can be done (it is up to a particular serialization to decide if
> > it is going to do so.)  The challenge with cyclic graphs is that
> > logically cyclic graphs cannot, in general, be reproduced through a
> > series of constructor calls -- some mutation is required as well.  Which
> > conflicts with our main security goal, that deserialization proceed
> > through constructors.
> >
> > (It is possible, at the cost of significant complexity for both the
> > framework and class authors, to have a more complex model that can
> > reflect post-construction mutation -- but the incremental complexity and
> > risk is significant.)
>
> it's not fully true because you can leak "this" and then mutate a field
> inside the constructor.
>
> class A {
>   final B b;
>   A(Function<A,B> fun) {
>     b = fun.apply(this);
>   }
> }
> class B {
>   final A a;
>   B(A a) {
>     this.a = a;
>   }
> }
>
> new A(B::new);
>
> Rémi
>