Ensuring atomicity of long values
forax at univ-mlv.fr
Sun Oct 22 00:00:13 UTC 2023
----- Original Message -----
> From: "John Rose" <john.r.rose at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "-" <liangchenblue at gmail.com>, "valhalla-dev" <valhalla-dev at openjdk.org>
> Sent: Saturday, October 21, 2023 8:49:06 PM
> Subject: Re: Ensuring atomicity of long values
> On 21 Oct 2023, at 2:29, forax at univ-mlv.fr wrote:
>
>> Long will never tear, but Long! (note, there is a bang here) will tear. Long!
>> and long have exactly the same semantics with respect to tearing.
>> From the spec POV, if you want flattening on the heap you need both a value class
>> with an implicit constructor AND using ! at the use site (where the field is
>> declared).
>
> Hmm, Long! also will never tear, until/unless the JDK author of Long
> marks the class (somehow) as permitting non-atomic updates. By default,
> updates of both V and V! are free from races. Only after opt-in
> does V! (never V) begin to allow races. I can’t say with certainty
> what the JDK policy will be for making Long tearable, in the end.
>
> It seems safer NOT to declare Long! tearable. So Long! would be
> a little safer than long on platforms which tear long. But it
> is also unclear whether having Long! differ from long this way
> is a good thing or not. So… I can’t say. Maybe Brian or Dan
> has worked out the whole equation for this particular question.
That's a good question.
In practice, on 64-bit platforms there is no difference. I recall that the cost of forcing atomic 64-bit writes was prohibitive on the Lego Mindstorms NXT (a 32-bit ARM), but that was 15 years ago.
I suppose we need more recent data :)
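
For reference, the classic word-tearing race from JLS 17.7 is easy to sketch. The class and constants below are mine, purely for illustration; on 64-bit HotSpot it should report zero torn reads, while a VM that splits 64-bit accesses may not (and a clever JIT could hoist the racy read, so treat it as a sketch, not a benchmark):

public class LongTearingDemo {
    static long shared;  // deliberately neither volatile nor final, so JLS 17.7 allows tearing

    public static void main(String[] args) {
        Thread writer = new Thread(() -> {
            for (long i = 0; i < 500_000_000L; i++) {
                shared = ((i & 1) == 0) ? 0L : -1L;   // alternate all-zero / all-one bit patterns
            }
        });
        writer.setDaemon(true);
        writer.start();

        long torn = 0;
        for (long i = 0; i < 500_000_000L; i++) {
            long v = shared;                          // racy 64-bit read
            if (v != 0L && v != -1L) {                // half of one write, half of the other
                torn++;
            }
        }
        System.out.println("torn reads observed: " + torn);
    }
}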
>
> There is a spectrum or journey of opting in: first value class
> (discarding identity), then allow zeroes by implicit constructor
> (discarding 100% explicit construction), then non-atomicity
> (discarding protection against construction by races). Each
> step is a tradeoff: Give something up, get something better.
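
For readers following along, here is a sketch of that journey in code form. The value and implicit keywords and the ! use-site marker follow the draft syntax discussed in this thread; none of it is final and it does not compile on today's javac, so read it as pseudo-Java:

// Pseudo-Java, using the provisional syntax from this thread.
value class Span {                    // step 1: give up identity
    long lo;
    long hi;
    implicit Span();                  // step 2: accept the all-zero default instance
    Span(long lo, long hi) { this.lo = lo; this.hi = hi; }
}
// step 3 (no settled spelling yet): some further marking on Span would opt in
// to non-atomic, i.e. tearable, updates of flattened Span! fields.

class Holder {
    Span! flat;    // null-restricted use site: a candidate for heap flattening
    Span  boxed;   // nullable use site: reference semantics, never tears
}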
>
>> Afterwards, a specific VM can do heroics and discover that the actual CPU supports
>> atomic 128-bit reads/writes using vector instructions, and decide to store a Long
>> (no bang) as a 128-bit value: 64 bits for the value and 1 bit for null or not. But
>> this is a tradeoff; using more memory may backfire.
>
> Ah yes, the 128-bit heroics. Those will be best when the object payload
> uses most or all of those 128 bits. There are also possible 64.1-bit
> heroics which steal the single bit from some condition on the object.
> Those are in the realm of STM, and work best if the extra bit only
> rarely needs setting. That line of thought arises from the design
> of lazy fields. Then there are also possible 63-bit heroics involving
> unions between immediate and pointer, as we had in the old days of Lisp.
>
> There are old prototypes of 64-N-bit thingies on HotSpot (sort of
> like NaN-tagged doubles, but which steal encodings from 64-bit long).
> Like the NaN tags, they seem to promise interesting 64-N-bit heroics
> for value types, and with a configurable N.
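
For readers unfamiliar with NaN-tagging, the trick is that an IEEE 754 double has far more NaN bit patterns than anyone needs, so a payload (a small int, a pointer) can be smuggled into the unused ones. A rough sketch of the bit manipulation in plain Java (the class, the tag constant and the method names are mine; real VMs do this below the level where Java's own NaN handling could interfere):

public final class NanTag {
    // A quiet NaN with a private tag in the upper payload bits; the low 32 bits carry the int.
    private static final long TAG_INT = 0x7FF8_0001_0000_0000L;

    static double encodeInt(int i) {
        // Note: longBitsToDouble is not guaranteed to preserve every NaN payload
        // on every platform, which is one reason this stays a conceptual sketch.
        return Double.longBitsToDouble(TAG_INT | (i & 0xFFFF_FFFFL));
    }

    static boolean isTaggedInt(double d) {
        return (Double.doubleToRawLongBits(d) & 0xFFFF_FFFF_0000_0000L) == TAG_INT;
    }

    static int decodeInt(double d) {
        return (int) Double.doubleToRawLongBits(d);   // keep the low 32 bits
    }

    public static void main(String[] args) {
        double boxed = encodeInt(-42);
        System.out.println(isTaggedInt(boxed) + " " + decodeInt(boxed));  // true -42
        System.out.println(isTaggedInt(3.14));                            // false
    }
}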
>
> It will be fun to try out all these heroics, someday, on 65-bit types,
> and see if they help any use-cases.
>
> A “heroic” which gets by in less than 64 bits seems paradoxical,
> but sometimes is applicable if you really don’t expect the values
> to inhabit the entire 2^64-value range, and/or are willing to take
> a performance hit on a few values, perhaps because they are rare.
> That’s the essential tradeoff of Lisp and Smalltalk small ints.
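
And the small-int trick itself, for concreteness: steal the low bit, 0 marks a 63-bit immediate integer stored in the upper bits, 1 would mark a pointer to a boxed fallback for the rare values that do not fit. A rough sketch of just the encoding arithmetic (the class name and the exact bit assignment are mine):

public final class Fixnum {
    static final long MAX = (1L << 62) - 1;   // largest value representable in 63 signed bits
    static final long MIN = -(1L << 62);

    static boolean fits(long v) {
        return v >= MIN && v <= MAX;          // everything else would need the boxed fallback
    }

    static long encode(long v) {              // caller must check fits(v) first
        return v << 1;                        // low bit 0 = immediate
    }

    static boolean isImmediate(long word) {
        return (word & 1L) == 0;              // low bit 1 would be the pointer case
    }

    static long decode(long word) {
        return word >> 1;                     // arithmetic shift restores the sign
    }

    public static void main(String[] args) {
        long w = encode(-123_456_789L);
        System.out.println(isImmediate(w) + " " + decode(w));   // true -123456789
        System.out.println(fits(Long.MAX_VALUE));               // false
    }
}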
>
> If you think that is far fetched, ask yourself when there ever
> was a Java application which created the full range of possible
> java.lang.Long values. None ever did, because it would require
> a peta-scale heap, which Java can’t run on and may never.
>
> To be clear, these are all just the possibilities; nobody is
> working on them yet. We’ve got plenty of more basic heroics
> to carry out now, in the Valhalla project. But the above
> considerations suggest to me that our current set of heroics
> are just the beginning. Like it was in the beginning of Java,
> when “it is an interpretive language” and “its GC is slower
> than hand-managed malloc” were sometimes considered wise
> reasons to stay with C. That cost model was exploded by
> HotSpot and other second-generation JVMs.
>
> Maybe there will be a wise principle at first that “Valhalla
> values take up more heap space than primitives”. That shouldn’t
> be true if you declare them properly, but there might be cases
> where you want to preserve safety or abstraction beyond what
> primitives do, and pay for it (at first) with a little heap.
> But given a good semantic foundation, such cost models can
> go out of date, as the platform adds better and better
> optimizations under the covers. That’s what I hope for Valhalla.
Dear Santa, I want deoptimization of the layout instead of heroics: a way to mark that all instances of a class need to be evacuated and enlarged because a final nullable value-type field actually needs to store null (or a special long value, as you said). The marking + evacuation has to be done concurrently with the rest of the application, the way modern concurrent GCs do.
The same mechanism can be reused to speculatively avoid storing the identity hashCode of an instance of an identity class until the identity hashCode is actually requested.
The same goes for avoiding storing a reference to a lock (written in Java) in every object until synchronized is called on an instance of that class.
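
To make the speculation concrete with a user-level analogy (not the VM mechanism; the class and names below are mine): the identity hash would live in a side table until first requested, so objects that never need it pay nothing. The VM version would instead, at that point, mark the class and concurrently evacuate its instances to an enlarged layout.

import java.util.IdentityHashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Side-table analogy: nothing is stored for objects whose identity hash is never
// requested. A production version would also need weak keys to avoid pinning objects.
final class LazyIdentityHashes {
    private final Map<Object, Integer> table = new IdentityHashMap<>();

    synchronized int identityHash(Object o) {
        return table.computeIfAbsent(o,
                k -> ThreadLocalRandom.current().nextInt());  // assigned on first request only
    }

    public static void main(String[] args) {
        LazyIdentityHashes hashes = new LazyIdentityHashes();
        Object o = new Object();
        int h = hashes.identityHash(o);
        System.out.println(h == hashes.identityHash(o));      // true: stable once assigned
    }
}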
regards,
Rémi