RFR: JDK-8205549 JDK-8205698 Support of flattened values in Unsafe

Fri Jun 29 01:12:45 UTC 2018

On Jun 28, 2018, at 1:24 PM, Frederic Parain <frederic.parain at oracle.com> wrote:
> 
> John,
> 
> This is an emergency fix (Unsafe was not part of the plan for LW1, until
> we discovered VarHandles needed it), it focuses on robustness, not on
> performance.

Yes, I guessed that!

> Looking longer term:
> 
>  1 - retrieving all information for flattenable/flattened fields or value
>       containers is expensive. I think there’s a clear consensus that
>       new methods are required to efficiently support access to
>       flattened/flattenable fields.  But the LW1 date is fast approaching,
>       so I’m not convinced that rushing these new APIs for LW1 is a good
>       choice.

No, probably not, unless your technique causes a serious problem with
performance.  But I think the JIT folks are avoiding the unsafe.cpp code
successfully, so we're good.

> 
>  2 - deprecating U.getObject() and U.putObject() in favor of more precise
>       methods makes sense; however, deprecation is often a slow process
>       in the Java world. An alternate solution would be to keep getObject()
>       and putObject() while proposing the new alternative. It seems possible
>       to restore almost all performance of getObject()/putObject() for the
>       non-value-type cases (not part of the current patch; I wanted to
>       push the fix ASAP to fix the current regressions). The trick is that
>       objectFieldOffset() is not guaranteed to return a true offset; it could
>       be some kind of handle. It is possible to use three bits out of the
>       64 bits of the offset to encode the information the JVM needs
>       (flattened field/flattenable field/value container). Internally, HotSpot
>       is already using 3 bits of the field offset metadata to encode some
>       information. With a simple mask and test against zero, it would be
>       very simple to detect that no value type is involved, and the code
>       could directly perform the legacy behavior.

Yes, the oFO (objectFieldOffset result) could be given tag bits.  Some clients of Unsafe may do
arithmetic on these offsets, so the tagged offsets might have to be
made resistant to arithmetic mistakes.  Probably this means the
tags would want to be placed high in the oFO word.  Of course clients
who perform arithmetic on those are taking risks, and should check
for tags anyway.
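
To make the tag-bit idea concrete, here is a minimal sketch of offsets
carrying three tag bits high in the word.  The class name, method names,
and bit positions (60..62) are all assumptions made for illustration;
nothing like this exists in Unsafe or HotSpot today.

```java
// Hypothetical sketch: these names and bit positions are assumptions,
// not part of Unsafe or HotSpot.
final class TaggedOffset {
    // Tags sit in bits 60..62: far above any realistic field offset,
    // and below the sign bit so tagged offsets stay positive.
    static final long FLATTENED_FIELD   = 1L << 60;
    static final long FLATTENABLE_FIELD = 1L << 61;
    static final long VALUE_CONTAINER   = 1L << 62;
    static final long TAG_MASK =
        FLATTENED_FIELD | FLATTENABLE_FIELD | VALUE_CONTAINER;

    // Combine a raw offset with one or more tags.
    static long tag(long rawOffset, long tags) {
        if (rawOffset < 0 || (rawOffset & TAG_MASK) != 0) {
            throw new IllegalArgumentException("offset overlaps tag bits");
        }
        if ((tags & ~TAG_MASK) != 0) {
            throw new IllegalArgumentException("not a tag value");
        }
        return rawOffset | tags;
    }

    // The "simple mask and test against zero": true when no value type
    // is involved and the legacy access path can be used directly.
    static boolean isLegacy(long offset) {
        return (offset & TAG_MASK) == 0;
    }

    // Strip the tags to recover the underlying offset.
    static long rawOffset(long offset) {
        return offset & ~TAG_MASK;
    }
}
```

With the tags placed high, a client that computes something like
`tagged + 8` still ends up with the tag bits set, so a later isLegacy
check continues to report that a value type is involved.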

As I said to Mandy, I think any such test should be reified in Java
code, rather than burned into unsafe.cpp and the JIT.  If it is burned
in, it becomes invisible and harder (for both clients and JVM) to optimize.

If we decide to go the route you are saying, the legacy getObject should
become a Java-coded method which clearly decomposes into the steps
you are describing.  Then it will be a simpler matter for clients and for
the JIT to refactor their usages to avoid the branches.  Today's getObject
calls usually compile down to single memory references; adding even
a simple tag check and branch to the base semantics risks adding
a proportionately large overhead.  Reifying the extra check at the
Java level gives control back to the programmer.

>      This would preserve the performance of getObject()/putObject()
>      for legacy code, give the code base time to migrate to the
>      new APIs, and make the support of value types almost transparent
>      for this code (except for performance).
>      Once all code has been migrated, getObject()/putObject() would
>      be removed.

Or it could be left as-is.  Given a Java-coded body, it would be clear
to users what the pros and cons are of using it versus refactoring to
use its parts.

I guess I buy this trajectory.  The stripped-down version of getObject
(the thing that getObject calls after making a safety check) should be
called getReference (or getOop or getPointer, but reference is the
proper JVM-level term).  The semantics of getReference would be
more "hard line unsafe-y" in that if you accidentally passed it the
offset of a flattened field, it would just crash rather than detect the
error and try to make it nicer for you.

I think we could do all of this migration without tag bits.  Instead
of tag bits, the transitional getObject would do the expensive metadata
lookup you are adding into unsafe.cpp (using a Java API, not hardwired
C++ code).  It would be slow, but everybody would know what to do
about it.  Adding the tag bits would make that slow path a little faster,
but at the cost of making offsets (oFO results) less scrutable.
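
As a rough illustration of that decomposition, the toy model below shows
a Java-coded getObject that performs a metadata lookup and then
dispatches to a stripped-down getReference.  It is only a sketch: a Map
stands in for object memory, a Set stands in for the JVM's field
metadata, and the methods take only an offset instead of Unsafe's
(Object, long) pair.  The names isFlattened and getFlattenedValue are
hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these methods are real Unsafe APIs.
final class ToyUnsafe {
    private final Map<Long, Object> memory = new HashMap<>();
    private final Set<Long> flattenedOffsets = new HashSet<>();

    // Setup helpers for the model (hypothetical).
    void defineReferenceField(long offset, Object value) {
        memory.put(offset, value);
    }

    void defineFlattenedField(long offset, Object payload) {
        memory.put(offset, payload);
        flattenedOffsets.add(offset);
    }

    // Step 1: the expensive metadata lookup, exposed as a Java-level query
    // so that clients and the JIT can see it and hoist or remove it.
    boolean isFlattened(long offset) {
        return flattenedOffsets.contains(offset);
    }

    // Step 2a: the stripped-down accessor with no safety check.  In a real
    // JVM, passing it the offset of a flattened field would just crash.
    Object getReference(long offset) {
        return memory.get(offset);
    }

    // Step 2b: hypothetical slow path that copies a flattened value out.
    Object getFlattenedValue(long offset) {
        return "buffered:" + memory.get(offset);
    }

    // Transitional getObject: check first, then dispatch.
    Object getObject(long offset) {
        return isFlattened(offset)
            ? getFlattenedValue(offset)
            : getReference(offset);
    }
}
```

A client that already knows its field is an ordinary reference can call
getReference directly and skip the check; everyone else keeps calling
getObject and pays for the lookup.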

>      This trick could also be applied to arrayBaseOffset() for the support
>      of arrays.

Yep.  There's even a documented sentinel value (-1) which is a sort
of prototype for an offset-that-is-not-a-real-offset.  But we've never
been able (so far) to make really good use of that option.

Thanks!

— John