RFR: JDK-8205549 JDK-8205698 Support of flattened values in Unsafe

Fri Jun 29 14:14:26 UTC 2018

> On Jun 28, 2018, at 21:12, John Rose <john.r.rose at oracle.com> wrote:
> 
> On Jun 28, 2018, at 1:24 PM, Frederic Parain <frederic.parain at oracle.com> wrote:
>> 
>> John,
>> 
>> This is an emergency fix (Unsafe was not part of the plan for LW1, until
>> we discovered VarHandles needed it), it focuses on robustness, not on
>> performance.
> 
> Yes, I guessed that!
> 
>> Looking longer term:
>> 
>> 1 - retrieving all information for flattenable/flattened fields or value
>>      containers is expensive. I think there’s a clear consensus that
>>      new methods are required to efficiently support access to
>>      flattened/flattenable fields.  But the LW1 date is fast approaching,
>>      so I’m not convinced that rushing these new APIs for LW1 is a good
>>      choice.
> 
> No, probably not, unless your technique causes a serious problem with
> performance.  But I think the JIT folks are avoiding the unsafe.cpp code
> successfully, so we're good.
> 
>> 
>> 2 - deprecating U.getObject() and U.putObject() in favor of more precise
>>      methods makes sense, but deprecation is often a slow process
>>      in the Java world. An alternative would be to keep getObject()
>>      and putObject() while introducing the new methods. It seems possible
>>      to restore almost all of the performance of getObject()/putObject()
>>      for the non-value-type cases (not part of the current patch; I wanted
>>      to push the fix ASAP to fix the current regressions). The trick is
>>      that objectFieldOffset() is not guaranteed to return a true offset;
>>      it could be some kind of handle. It is possible to use three bits out
>>      of the 64 bits of the offset to encode the information the JVM needs
>>      (flattened field/flattenable field/value container). Internally,
>>      HotSpot already uses 3 bits of the field offset metadata to encode
>>      some information. With a simple mask and test against zero, it would
>>      be very simple to detect that no value type is involved, and the code
>>      could directly perform the legacy behavior.
> 
> Yes, the oFO could be given tag bits.  Some clients of Unsafe may do
> arithmetic on these offsets, so the tagged offsets might have to be
> made resistant to arithmetic mistakes.  Probably this means the
> tags would want to be placed high in the oFO word.  Of course clients
> who perform arithmetic on those are taking risks, and should check
> for tags anyway.
> 
> As I said to Mandy, I think any such test should be reified in Java
> code, rather than burned into unsafe.cpp and the JIT.  If it is burned
> in, it becomes invisible and harder (for both clients and JVM) to optimize.
> 
> If we decide to go the route you are saying, the legacy getObject should
> become a Java-coded method which clearly decomposes into the steps
> you are describing.  Then it will be a simpler matter for clients and for
> the JIT to refactor their usages to avoid the branches.  Today's getObject
> calls usually compile down to single memory references; adding even
> a simple tag check and branch to the base semantics risks adding
> a proportionately large overhead.  Reifying the extra check at the
> Java level gives control back to the programmer.
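
As a rough illustration of the tagged-offset idea discussed above, here is a
minimal Java sketch. Everything in it is an assumption made for the example:
the class name TaggedOffset, the method names, and the choice of bits 60..62
are hypothetical and do not describe HotSpot or the actual Unsafe API. It only
shows the "mask and test against zero" check, with the tags placed high in the
word so that small arithmetic on the offset does not disturb them.

```java
// Toy model only: names and bit positions are hypothetical, not HotSpot's.
final class TaggedOffset {
    // Three tag bits placed high in the 64-bit word, below the sign bit.
    static final int TAG_SHIFT = 60;
    static final long FLATTENABLE = 1L << TAG_SHIFT;
    static final long FLATTENED = 1L << (TAG_SHIFT + 1);
    static final long VALUE_CONTAINER = 1L << (TAG_SHIFT + 2);
    static final long TAG_MASK = FLATTENABLE | FLATTENED | VALUE_CONTAINER;

    private TaggedOffset() {}

    // Combine a raw byte offset with zero or more tag bits.
    static long encode(long rawOffset, long tags) {
        if (rawOffset < 0 || (rawOffset & TAG_MASK) != 0) {
            throw new IllegalArgumentException("offset overlaps tag bits");
        }
        if ((tags & ~TAG_MASK) != 0) {
            throw new IllegalArgumentException("not a valid tag set");
        }
        return rawOffset | tags;
    }

    // Strip the tags to recover the byte offset.
    static long rawOffset(long tagged) {
        return tagged & ~TAG_MASK;
    }

    // The cheap test: mask, then compare against zero.
    static boolean involvesValueType(long tagged) {
        return (tagged & TAG_MASK) != 0;
    }
}
```

With this layout, adding a small delta such as 8 to a tagged offset changes
only the low bits, so involvesValueType() still gives the same answer
afterwards. A client that does arithmetic still has to strip the tags before
using the result as a real offset.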

It seems my plan was not explicit enough.
I was suggesting that we treat getObject()/putObject() as doomed,
because they are not aware that flattened fields and values exist,
so the JVM has to do all the hard work under the hood.

Mandy is making good progress on getValue()/putValue() to provide
efficient access to values.

We should also have methods getRefObject()/putRefObject(),
working as getObject()/putObject() work today, assuming that the
caller has checked that no values are involved. These methods
are the ones that will be optimized to a single memory reference by the JIT.

The legacy getObject()/putObject() would be maintained, with value
type support, just long enough to allow a smooth transition.
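
To make the split concrete, here is a toy Java model of the three entry
points. It is only a sketch under stated assumptions: ToyHeap, its maps, and
the int[] payload are hypothetical stand-ins for field metadata and memory,
and the method signatures are simplified (the real methods also take a base
object). It shows the legacy getObject() doing the metadata check and then
delegating to either the value path or the unchecked reference path.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: ToyHeap is a hypothetical stand-in, not the Unsafe API.
final class ToyHeap {
    enum Kind { REFERENCE, FLATTENED }

    // Stand-in for the field metadata the JVM would have to look up.
    private final Map<Long, Kind> kinds = new HashMap<>();
    // Stand-in for memory, keyed by offset.
    private final Map<Long, Object> slots = new HashMap<>();

    void defineReferenceField(long offset, Object ref) {
        kinds.put(offset, Kind.REFERENCE);
        slots.put(offset, ref);
    }

    void defineFlattenedField(long offset, int[] payload) {
        kinds.put(offset, Kind.FLATTENED);
        slots.put(offset, payload.clone());
    }

    // Fast path: no checks. The caller must know no value is involved.
    Object getRefObject(long offset) {
        return slots.get(offset);
    }

    // Value path: copies the flattened payload into a fresh object.
    int[] getValue(long offset) {
        return ((int[]) slots.get(offset)).clone();
    }

    // Transitional legacy entry point: metadata check, then dispatch.
    Object getObject(long offset) {
        if (kinds.get(offset) == Kind.FLATTENED) {
            return getValue(offset);
        }
        return getRefObject(offset);
    }
}
```

In this model a caller that already knows the field kind can call
getRefObject() or getValue() directly and skip the lookup, which is the
migration path described above.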

Fred

> 
>>     This would preserve the performance of getObject()/putObject()
>>     for legacy code while the code base migrates to the
>>     new APIs, and make the support of values almost transparent
>>     for this code (except for performance).
>>     Once all code has been migrated, getObject()/putObject() would
>>     be removed.
> 
> Or it could be left as-is.  Given a Java-coded body, it would be clear
> to users what are the pros and cons of using it or refactoring to use
> its parts.
> 
> I guess I buy this trajectory.  The stripped-down version of getObject
> (the thing that getObject calls after making a safety check) should be
> called getReference (or getOop or getPointer, but reference is the
> proper JVM-level term).  The semantics of getReference would be
> more "hard line unsafe-y" in that if you accidentally passed it the
> offset of a flattened field, it would just crash rather than detect the
> error and try to make it nicer for you.
> 
> I think we could do all of this migration without tag bits.  Instead
> of tag bits, the transitional getObject would do the expensive metadata
> lookup you are adding into unsafe.cpp (using a Java API, not hardwired
> C++ code).  It would be slow, but everybody would know what to do
> about it.  Adding the tag bits would make that slow path a little faster,
> but at the cost of making offsets (oFO results) less scrutable.
> 
>>     This trick could also be applied to arrayBaseOffset() for the support
>>     of arrays.
> 
> Yep.  There's even a documented sentinel value (-1) which is a sort
> of prototype for an offset-that-is-not-a-real-offset.  But we've never
> been able (so far) to make really good use of that option.
> 
> Thanks!
> 
> — John