RFR: JDK-8205549 JDK-8205698 Support of flattened values in Unsafe

Fri Jun 29 16:36:09 UTC 2018

----- Mail original -----
> De: "Frederic Parain" <frederic.parain at oracle.com>
> À: "John Rose" <john.r.rose at oracle.com>
> Cc: "valhalla-dev" <valhalla-dev at openjdk.java.net>
> Envoyé: Vendredi 29 Juin 2018 16:14:26
> Objet: Re: RFR: JDK-8205549 JDK-8205698 Support of flattened values in Unsafe

>> On Jun 28, 2018, at 21:12, John Rose <john.r.rose at oracle.com> wrote:
>> 
>> On Jun 28, 2018, at 1:24 PM, Frederic Parain <frederic.parain at oracle.com> wrote:
>>> 
>>> John,
>>> 
>>> This is an emergency fix (Unsafe was not part of the plan for LW1, until
>>> we discovered VarHandles needed it), it focuses on robustness, not on
>>> performance.
>> 
>> Yes, I guessed that!
>> 
>>> Looking  longer term:
>>> 
>>> 1 - retrieving all information for flattenable/flattened fields or value
>>>      containers is expensive. I think there’s a clear consensus that
>>>      new methods are required to efficiently support access to
>>>      flattened/flattenable fields.  But the LW1 date is fast approaching,
>>>      so I’m not convinced that rushing these new APIs for LW1 is a good
>>>      choice.
>> 
>> No, probably not, unless your technique causes a serious problem with
>> performance.  But I think the JIT folks are avoiding the unsafe.cpp code
>> successfully, so we're good.
>> 
>>> 
>>> 2 - deprecating U.getObject() and U.putObject(), for more precise
>>>      methods makes sense, however deprecating is often a slow process
>>>      in the Java world. An alternate solution would be to keep getObject()
>>>      and putObject() while proposing the new alternative. It seems possible
>>>      to restore almost all performance of getObject()/putObject() for the
>>>      non value types cases (not part of the current patch, I wanted to
>>>      push the fix ASAP to fix the current regressions). The trick is that
>>>      objectFieldOffset() is not guaranteed to return a true offset, it could
>>>      be some kind of a handle. It is possible to use three bits out of the
>>>      64bits of the offset to encode the information the JVM needs
>>>     (flattened field/flattenable field/value container). Internally, Hotspot
>>>     is already using 3 bits of the field offset metadata to encode some
>>>     information. With a simple mask and test against zero, it would be
>>>     very simple to detect that no value type is involved and the code
>>>     could directly perform the legacy behavior.
>> 
>> Yes, the oFO could be given tag bits.  Some clients of Unsafe may do
>> arithmetic on these offsets, so the tagged offsets might have to be
>> made resistant to arithmetic mistakes.  Probably this means the
>> tags would want to be placed high in the oFO word.  Of course clients
>> who perform arithmetic on those are taking risks, and should check
>> for tags anyway.
>> 
>> As I said to Mandy, I think any such test should be reified in Java
>> code, rather than burned into unsafe.cpp and the JIT.  If it is burned
>> in, it becomes invisible and harder (for both clients and JVM) to optimize.
>> 
>> If we decide to go the route you are saying, the legacy getObject should
>> become a Java-coded method which clearly decomposes into the steps
>> you are describing.  Then it will be a simpler matter for clients and for
>> the JIT to refactor their usages to avoid the branches.  Today's getObject
>> calls usually compile down to single memory references; adding even
>> a simple tag check and branch to the base semantics risks adding
>> a proportionately large overhead.  Reifying the extra check at the
>> Java level gives control back to the programmer.
> 
> It seems my plan was not explicit enough.
> I was suggesting considering getObject()/putObject() as doomed,
> because they are not aware that flattened fields and values exist,
> so the JVM has to do all the hard work under the hood.
> 
> Mandy is making good progress on getValue()/putValue() to provide
> efficient access to values.
> 
> We should also have methods getRefObject()/putRefObject(),
> working as getObject()/putObject() work today, assuming that the
> caller has checked that no values were involved. These methods
> are the ones that will optimized to a single memory reference by the JIT.
> 
> The legacy getObject()/putObject() would be maintained with value
> types support just the time to allow a smooth transition.
> 
> Fred

Hi Fred,
given that there are now two Unsafes, we can offer a getObject (resp putObject) that delegates to getRefObject or getValue only in sun.misc.Unsafe to maintain backward compatibility while removing getObject/putObject from jdk.internal.misc.Unsafe.

Rémi

> 
> 
>> 
>>>     This would preserve the performance of GetObject()/putObject()
>>>     for legacy code, the time for the code base to migrate to the
>>>     new APIs, and make the support of value almost transparent
>>>     for this code (except for performance).
>>>     Once all codes have been migrated, getObject()/putObject() would
>>>     be removed.
>> 
>> Or it could be left as-is.  Given a Java-coded body, it would be clear
>> to users what are the pros and cons of using it or refactoring to use
>> its parts.
>> 
>> I guess I buy this trajectory.  The stripped-down version of getObject
>> (the thing that getObject calls after making a safety check) should be
>> called getReference (or getOop or getPointer, but reference is the
>> proper JVM-level term).  The semantics of getReference would be
>> more "hard line unsafe-y" in that if you accidentally passed it the
>> offset of a flattened field, it would just crash rather than detect the
>> error and try to make it nicer for you.
>> 
>> I think we could do all of this migration without tag bits.  Instead
>> of tag bits, the transitional getObject would do the expensive metadata
>> lookup you are adding into unsafe.cpp (using a Java API, not hardwired
>> C++ code).  It would be slow, but everybody would know what to do
>> about it.  Adding the tag bits would make that slow path a little faster,
>> but at the cost of making offsets (oFO results) less scrutable.
>> 
>>>     This trick could also be applied to arrayBaseOffset() for the support
>>>     of arrays.
>> 
>> Yep.  There's even a documented sentinel value (-1) which is a sort
>> of prototype for an offset-that-is-not-a-real-offset.  But we've never
>> been able (so far) to make really good use of that option.
>> 
>> Thanks!
>> 
> > — John