(15) RFR: JDK-8247444: Trust final fields in records

Sun Jun 21 22:32:15 UTC 2020

Hi Aleksey,

On 6/18/20 7:10 PM, Aleksey Shipilev wrote:
> On 6/18/20 3:09 PM, Chris Hegarty wrote:
>>> On 18 Jun 2020, at 13:43, Aleksey Shipilev <shade at redhat.com> wrote:
>>> Again, JDK-8247532 is the writing on the wall: we don't need 3rd party developers to tell if Record
>>> serialization works fast in 15 -- we already know it does not.
>> I disagree. JDK-8247532 is under review and well on its way towards JDK 15 (yes, during RDP 1).
>> My reading of Peter’s benchmark result is that Record deserialization *does work fast*.   What am
>> I missing.
> JDK-8247532 is the evidence that Records serialization performance story is not clear.
>
> Even if we disregard that after JDK-8247532 [1] Records are still 8% slower, the _existence_ of
> JDK-8247532 indicates there are performance problems in the area. That evidence now needs to be
> compensated by much more evidence to the contrary. (Yes, I contracted a lot of Bayesian thinking
> from my statistician wife...)
>
> (Here were several paragraphs of further thoughts, but I realized it basically repeats what I said
> before.)

Let me just express my thoughts on preventing writes to final fields in 
records in connection with deserialization. In short, I don't think any 
serialization framework needs direct write access to record fields in 
order to be performant. Serialization frameworks need write access to 
final fields in classical classes mainly because fields are generally 
encapsulated. If a class doesn't implement a special API (constructors, 
methods, annotations ...) to interface with the serialization framework 
(for example, Jackson annotations), then the framework can only 
interface with such a class at the level of fields.

Now records are very constrained classes. They have an always-present 
API that can be used to interface with serialization frameworks. For 
each field, the canonical constructor has a parameter with the same 
name and type, and the fields are set from those constructor 
parameters. So instead of allocating a "zero" instance and setting 
fields, one can just invoke the canonical constructor, which does the 
same. Such an invocation is as performant as allocating an instance and 
setting fields from outside the constructor.
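
To make that concrete, here is a minimal sketch of how a framework could 
locate and invoke the canonical constructor through core reflection. It 
is only an illustration, not what any particular framework does: the 
Point record and the helper names are made up, and it assumes a JDK 
where records are a final feature (16 or later).

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.RecordComponent;
import java.util.Arrays;

class CanonicalCtorSketch {
    // Hypothetical record, used only for illustration.
    record Point(int x, int y) {}

    // The canonical constructor is the one whose parameter types match
    // the record components, in declaration order.
    static <T extends Record> Constructor<T> canonicalConstructor(Class<T> type) {
        Class<?>[] paramTypes = Arrays.stream(type.getRecordComponents())
                .map(RecordComponent::getType)
                .toArray(Class<?>[]::new);
        try {
            Constructor<T> ctor = type.getDeclaredConstructor(paramTypes);
            ctor.setAccessible(true);
            return ctor;
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds an instance from already decoded component values,
    // with no write access to the final fields.
    static <T extends Record> T instantiate(Class<T> type, Object... values) {
        try {
            return canonicalConstructor(type).newInstance(values);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(instantiate(Point.class, 3, 4));
    }
}
```

A real framework would look the constructor up once per record class 
and cache it, rather than on every instantiation as this sketch does.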

What I did in JDK-8247532 was just to optimize the code that transforms 
the bytes read from the stream into argument values for the canonical 
constructor, using method handle transformations. That part was slow, 
not the fact that records are deserialized using their canonical 
constructor. Every serialization framework that works at the level of 
fields has to transform stream-encoded data into field values. How it 
does that is what distinguishes it from the rest; the final mechanism by 
which those values are stored into object fields is not what makes it 
unique or performant.
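
As a rough illustration of the idea (not the actual code in 
JDK-8247532), the sketch below adapts a canonical constructor handle so 
that it takes a single ByteBuffer and pulls its own arguments out of 
it. The Point record, the readInt helper and the fixed byte layout (two 
big-endian ints at offsets 0 and 4) are assumptions made for the 
example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.nio.ByteBuffer;

class RecordDecoderSketch {
    // Hypothetical record, used only for illustration.
    record Point(int x, int y) {}

    // Hypothetical reader: decodes one int at a fixed offset.
    static int readInt(ByteBuffer buf, int offset) {
        return buf.getInt(offset);
    }

    // Built once; each decode is then a single method handle invocation.
    static final MethodHandle DECODER = pointDecoder();

    static MethodHandle pointDecoder() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            // (int,int)Point : the canonical constructor
            MethodHandle ctor = lookup.findConstructor(Point.class,
                    MethodType.methodType(void.class, int.class, int.class));
            // (ByteBuffer,int)int
            MethodHandle readInt = lookup.findStatic(RecordDecoderSketch.class, "readInt",
                    MethodType.methodType(int.class, ByteBuffer.class, int.class));
            // Bind the offsets: (ByteBuffer)int for each component
            MethodHandle readX = MethodHandles.insertArguments(readInt, 1, 0);
            MethodHandle readY = MethodHandles.insertArguments(readInt, 1, Integer.BYTES);
            // (ByteBuffer,ByteBuffer)Point : each argument is decoded by its reader
            MethodHandle fromTwoBuffers = MethodHandles.filterArguments(ctor, 0, readX, readY);
            // (ByteBuffer)Point : pass the same buffer to both readers
            return MethodHandles.permuteArguments(fromTwoBuffers,
                    MethodType.methodType(Point.class, ByteBuffer.class), 0, 0);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static Point decode(ByteBuffer buf) {
        try {
            return (Point) DECODER.invokeExact(buf);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(2 * Integer.BYTES).putInt(0, 3).putInt(4, 4);
        System.out.println(decode(buf));
    }
}
```

The point of composing the handle up front is that the per-object work 
is then only decoding plus one constructor call.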

There is also a desire that record constructor(s) be the only way to 
create instances of records, so that any validation logic in the 
constructor(s) can't be bypassed, not even via deserialization of 
"forged" streams. That way we can have records that are safer without 
sacrificing performance.
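
A small sketch of what that means in practice, assuming a JDK where 
records are a final feature (16 or later). The PositiveInt record is 
made up for the example, and the forging step relies on my assumption 
that, for a record with a single int component, the component value is 
the last four bytes of the serialized stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.ByteBuffer;

class ForgedStreamSketch {
    // Hypothetical record whose compact constructor validates its argument.
    record PositiveInt(int value) implements Serializable {
        PositiveInt {
            if (value <= 0) {
                throw new IllegalArgumentException("value must be positive: " + value);
            }
        }
    }

    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // Returns true if deserialization rejects a stream carrying an invalid value.
    static boolean forgedStreamRejected() {
        try {
            byte[] data = serialize(new PositiveInt(42));
            // Forge the stream: overwrite the int value with -1.
            // Assumption: the value occupies the last four bytes.
            ByteBuffer.wrap(data).putInt(data.length - Integer.BYTES, -1);
            deserialize(data);
            return false; // the invalid instance was created
        } catch (InvalidObjectException e) {
            // The constructor's exception is reported as the cause.
            return e.getCause() instanceof IllegalArgumentException;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("forged stream rejected: " + forgedStreamRejected());
    }
}
```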

And BTW, according to the latest JDK-8247532 webrev.06, Java 
deserialization of records is not slower, and is sometimes even faster, 
than deserialization of equivalent classical classes.

Regards, Peter