RFR: JDK-8205549 JDK-8205698 Support of flattened values in Unsafe
Remi Forax
forax at univ-mlv.fr
Fri Jun 29 16:36:09 UTC 2018
----- Mail original -----
> De: "Frederic Parain" <frederic.parain at oracle.com>
> À: "John Rose" <john.r.rose at oracle.com>
> Cc: "valhalla-dev" <valhalla-dev at openjdk.java.net>
> Envoyé: Vendredi 29 Juin 2018 16:14:26
> Objet: Re: RFR: JDK-8205549 JDK-8205698 Support of flattened values in Unsafe
>> On Jun 28, 2018, at 21:12, John Rose <john.r.rose at oracle.com> wrote:
>>
>> On Jun 28, 2018, at 1:24 PM, Frederic Parain <frederic.parain at oracle.com> wrote:
>>>
>>> John,
>>>
>>> This is an emergency fix (Unsafe was not part of the plan for LW1, until
>>> we discovered VarHandles needed it), it focuses on robustness, not on
>>> performance.
>>
>> Yes, I guessed that!
>>
>>> Looking longer term:
>>>
>>> 1 - retrieving all information for flattenable/flattened fields or value
>>> containers is expensive. I think there’s a clear consensus that
>>> new methods are required to efficiently support access to
>>> flattened/flattenable fields. But the LW1 date is fast approaching,
>>> so I’m not convinced that rushing these new APIs for LW1 is a good
>>> choice.
>>
>> No, probably not, unless your technique causes a serious problem with
>> performance. But I think the JIT folks are avoiding the unsafe.cpp code
>> successfully, so we're good.
>>
>>>
>>> 2 - deprecating U.getObject() and U.putObject(), for more precise
>>> methods makes sense, however deprecating is often a slow process
>>> in the Java world. An alternate solution would be to keep getObject()
>>> and putObject() while proposing the new alternative. It seems possible
>>> to restore almost all performance of getObject()/putObject() for the
>>> non value types cases (not part of the current patch, I wanted to
>>> push the fix ASAP to fix the current regressions). The trick is that
>>> objectFieldOffset() is not guaranteed to return a true offset, it could
>>> be some kind of a handle. It is possible to use three bits out of the
>>> 64bits of the offset to encode the information the JVM needs
>>> (flattened field/flattenable field/value container). Internally, Hotspot
>>> is already using 3 bits of the field offset metadata to encode some
>>> information. With a simple mask and test against zero, it would be
>>> very simple to detect that no value type is involved and the code
>>> could directly perform the legacy behavior.
>>
>> Yes, the oFO could be given tag bits. Some clients of Unsafe may do
>> arithmetic on these offsets, so the tagged offsets might have to be
>> made resistant to arithmetic mistakes. Probably this means the
>> tags would want to be placed high in the oFO word. Of course clients
>> who perform arithmetic on those are taking risks, and should check
>> for tags anyway.
>>
>> As I said to Mandy, I think any such test should be reified in Java
>> code, rather than burned into unsafe.cpp and the JIT. If it is burned
>> in, it becomes invisible and harder (for both clients and JVM) to optimize.
>>
>> If we decide to go the route you are saying, the legacy getObject should
>> become a Java-coded method which clearly decomposes into the steps
>> you are describing. Then it will be a simpler matter for clients and for
>> the JIT to refactor their usages to avoid the branches. Today's getObject
>> calls usually compile down to single memory references; adding even
>> a simple tag check and branch to the base semantics risks adding
>> a proportionately large overhead. Reifying the extra check at the
>> Java level gives control back to the programmer.
>
> It seems my plan was not explicit enough.
> I was suggesting considering getObject()/putObject() as doomed,
> because they are not aware that flattened fields and values exist,
> so the JVM has to do all the hard work under the hood.
>
> Mandy is making good progress on getValue()/putValue() to provide
> efficient access to values.
>
> We should also have methods getRefObject()/putRefObject(),
> working as getObject()/putObject() work today, assuming that the
> caller has checked that no values were involved. These methods
> are the ones that will optimized to a single memory reference by the JIT.
>
> The legacy getObject()/putObject() would be maintained with value
> types support just the time to allow a smooth transition.
>
> Fred
Hi Fred,
given that there are now two Unsafes, we can offer a getObject (resp putObject) that delegates to getRefObject or getValue only in sun.misc.Unsafe to maintain backward compatibility while removing getObject/putObject from jdk.internal.misc.Unsafe.
Rémi
>
>
>>
>>> This would preserve the performance of GetObject()/putObject()
>>> for legacy code, the time for the code base to migrate to the
>>> new APIs, and make the support of value almost transparent
>>> for this code (except for performance).
>>> Once all codes have been migrated, getObject()/putObject() would
>>> be removed.
>>
>> Or it could be left as-is. Given a Java-coded body, it would be clear
>> to users what are the pros and cons of using it or refactoring to use
>> its parts.
>>
>> I guess I buy this trajectory. The stripped-down version of getObject
>> (the thing that getObject calls after making a safety check) should be
>> called getReference (or getOop or getPointer, but reference is the
>> proper JVM-level term). The semantics of getReference would be
>> more "hard line unsafe-y" in that if you accidentally passed it the
>> offset of a flattened field, it would just crash rather than detect the
>> error and try to make it nicer for you.
>>
>> I think we could do all of this migration without tag bits. Instead
>> of tag bits, the transitional getObject would do the expensive metadata
>> lookup you are adding into unsafe.cpp (using a Java API, not hardwired
>> C++ code). It would be slow, but everybody would know what to do
>> about it. Adding the tag bits would make that slow path a little faster,
>> but at the cost of making offsets (oFO results) less scrutable.
>>
>>> This trick could also be applied to arrayBaseOffset() for the support
>>> of arrays.
>>
>> Yep. There's even a documented sentinel value (-1) which is a sort
>> of prototype for an offset-that-is-not-a-real-offset. But we've never
>> been able (so far) to make really good use of that option.
>>
>> Thanks!
>>
> > — John
More information about the valhalla-dev
mailing list