RFR: JDK-8199781: Don't use naked == for comparing oops

Wed Mar 28 20:41:26 UTC 2018

I agree, this looks like template voodoo.   It seems like you might be 
able to use more X macros to reduce some boilerplate code.  But I'm not 
requesting you to try this for this change.

http://cr.openjdk.java.net/~eosterlund/8199781/webrev.02/src/hotspot/share/oops/accessBackend.new.hpp.html

Can you put a } // namespace AccessInternal

at these places?   Long namespace declarations are not as bad as #ifdef 
blocks but it's nice to see what the stray } goes to.

Here:

  175 }

1372 }

I don't need to see a webrev for this minor change.

http://cr.openjdk.java.net/~eosterlund/8199781/webrev.02/src/hotspot/share/prims/jvm.cpp.udiff.html

+ if ((!oopDesc::equals(previous_protection_domain, protection_domain)) 
&& (protection_domain != NULL)) {

So we don't need oopDesc::equals for NULL?

http://cr.openjdk.java.net/~eosterlund/8199781/webrev.02/src/hotspot/share/oops/accessBackend.new.hpp.html

So against my better judgement, I was trying to follow the bouncing 
template instantiation ball.  And I guess we end up calling this:

RawAccessBarrier<>

  405   static bool equals(oop o1, oop o2) { return o1 == o2; }

Can you show how this decomposes from:

+ inline static bool equals(oop o1, oop o2) { return 
Access<>::equals(o1, o2); }

Thank you for getting this to inline and supporting Roman's patch.

Coleen

On 3/28/18 4:11 PM, Erik Osterlund wrote:
> Hi Roman,
>
> Thanks for the review.
>
> /Erik
>
>> On 28 Mar 2018, at 21:56, Roman Kennke <rkennke at redhat.com> wrote:
>>
>>
>>> Incremental webrev:
>>> http://cr.openjdk.java.net/~eosterlund/8199781/webrev.01_02/
>>>
>>> Full webrev:
>>> http://cr.openjdk.java.net/~eosterlund/8199781/webrev.02/
>> Looks good to me. (Well. As good as all this template-voodoo can look
>> ;-) Better than before though).
>>
>> Cheers,
>> Roman
>>
>>
>>>> Thank you for solving the inline problem with oopDesc::equals with
>>>> this.   I haven't fully reviewed it but don't wait for me.
>>> Thank you for having a look at this.
>>>
>>> /Erik
>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>>> On 3/28/18 9:00 AM, Erik Österlund wrote:
>>>>> Hi,
>>>>>
>>>>> The access.hpp support has reincarnated.
>>>>>
>>>>> Here is an incremental patch on Roman's work (rebased on latest
>>>>> jdk-hs) that makes it possible to include access.hpp to call
>>>>> Access<>::equals specifically. The infrastructure allows picking more
>>>>> accesses that can be used from the hpp file, but I have kept it down
>>>>> to the minimum requirements now, so for now it only works for equals.
>>>>> Perhaps in the future, we might want to move more operations to be
>>>>> reachable from hpp, but not today.
>>>>>
>>>>> I changed the involved header files to include access.hpp, and made
>>>>> it possible to inline this again for non-Shenandoah builds, by
>>>>> checking for the INTERNAL_BT_TO_SPACE_INVARIANT build-time decorator.
>>>>>
>>>>> Builds that only have to-space invariant GCs inline equals to a
>>>>> normal native comparison of the pointers. Builds that have at least
>>>>> one to-space invariant GC instead generate a single function pointer
>>>>> call. This function pointer may be resolved in the access.cpp file,
>>>>> instead of the access.inline.hpp file.
>>>>>
>>>>> Here is an explanation how this works:
>>>>>
>>>>> The template pipeline for each operation consists of 5 steps, and
>>>>> they were previously all performed in the access.inline.hpp file. The
>>>>> steps are:
>>>>>
>>>>> 1) Set default decorators and decay types
>>>>> 2) Reduce types
>>>>> 3) Pre-runtime dispatch
>>>>> 4) Runtime-dispatch
>>>>> 5.a) Barrier resolution (check which GC backend accessor to be used
>>>>> upon the first invocation, then patch function pointer to 5.b)
>>>>> 5.b) Post-runtime dispatch
>>>>>
>>>>> I have moved steps 1-4 to the accessBackend.hpp file, which is
>>>>> included by access.hpp. This allows performing all steps until the
>>>>> backend barrier resolution in only hpp files. This makes it possible
>>>>> to use, e.g. raw accesses (although without compressed oops, atomic
>>>>> or order access) from only the access.hpp file. The resolution of
>>>>> barriers in the GC backends (comprising steps 4-5) remains in the
>>>>> access.inline.hpp file. But they can be instantiated from the
>>>>> access.cpp file for well chosen accesses and decorator combinations
>>>>> that can be pre-generated in the libjvm.so file. The function
>>>>> pointers used in builds that have not to-space invariant GCs will be
>>>>> resolved to point right into these functions in the access.cpp file
>>>>> at runtime.
>>>>>
>>>>> This change required moving the decorators to a new file called
>>>>> accessDecorators.hpp. I kind of like having them there in a separate
>>>>> file.
>>>>>
>>>>> Incremental webrev to Roman's patch:
>>>>> http://cr.openjdk.java.net/~eosterlund/8199781/webrev.00_01/
>>>>>
>>>>> Full webrev:
>>>>> http://cr.openjdk.java.net/~eosterlund/8199781/webrev.01/
>>>>>
>>>>> Thanks,
>>>>> /Erik
>>>>>
>>>>>> On 2018-03-27 09:14, Roman Kennke wrote:
>>>>>>> Am 27.03.2018 um 00:46 schrieb John Rose:
>>>>>>>> On Mar 26, 2018, at 2:21 PM, Roman Kennke <rkennke at redhat.com> wrote:
>>>>>>>>> Am 26.03.2018 um 21:11 schrieb John Rose:
>>>>>>>>> ...
>>>>>>>>> This is clearly one of those cases.  Let's code against surprises
>>>>>>>>> (both present and future) by abstracting simpler operations with
>>>>>>>>> inlines.
>>>>>>>> Thanks John for clarifying. I generally do agree with you that it
>>>>>>>> makes
>>>>>>>> sense to inline something like an operator that will compile to a
>>>>>>>> single
>>>>>>>> instruction.
>>>>>>>>
>>>>>>>> Unfortunately, this new abstraction will generate a non-inlineable
>>>>>>>> call.
>>>>>>>> Thanks Erik's work with the Access API, it should be a straight
>>>>>>>> call to
>>>>>>>> the right implementation, not even a virtual call, but it's a call
>>>>>>>> nonetheless.
>>>>>>> What would it take to inline the call, so there's no new performance
>>>>>>> hazard introduced?
>>>>>>>
>>>>>>> This might be a place to add #ifdefs, if the templates don't let us
>>>>>>> control the decision statically.
>>>>>>>
>>>>>>> (I confess I find the templates complex and hard to reason about.
>>>>>>> But that's a potential maintenance issue with the access API, not
>>>>>>> your work.)
>>>>>> The Access API supports static inclusion/exclusion of stuff, and we've
>>>>>> used it before to ensable barriers on primitive field accesses for
>>>>>> Shenandoah-enabled builds. It works by passing
>>>>>> -DSUPPORT_BARRIER_ON_PRIMITIVES to gcc. This is much better than
>>>>>> #ifdefs
>>>>>> and does the same thing. I can add something similar for object
>>>>>> equality.
>>>>>>
>>>>>>>> The reason why I'm proposing it is that the GC might want to have
>>>>>>>> a say
>>>>>>>> about object equality. In the case of Shenandoah, there may be two
>>>>>>>> copies of any one object floating around, and comparing the two
>>>>>>>> copies
>>>>>>>> might otherwise result in false negatives.
>>>>>>> 1. Of course we need a way for some GC algorithms to control
>>>>>>> object identity.
>>>>>>>
>>>>>>> 2. It's good coding discipline to find all the places where native
>>>>>>> op== is used and upgrade them to the needed control.
>>>>>> I hope I did that with the proposed patch (it's based on years of
>>>>>> finding all those places in Shenandoah land ;-) ).
>>>>>>
>>>>>>> (2a. We also need a way to avoid having op== creep back in
>>>>>>> in an uncontrolled way.)
>>>>>> We have something for this in Shenandoah too. It works by overriding ==
>>>>>> in oopsHierarch.hpp and running into ShouldNotReachHere() if both
>>>>>> operands are != NULL. This catches all uses of naked == at runtime,
>>>>>> when
>>>>>> running with +CheckUnhandledOops. It's not perfect, because it doesn't
>>>>>> catch uses in places that are not executed in tests, but it's still
>>>>>> incredibly useful. The caveat here is that we also need another
>>>>>> abstraction for code paths that actually want naked ==, e.g. in the GC.
>>>>>> In Shenandoah we have an oopDesc::unsafe_equals() for that. The other
>>>>>> alternative would be to drop operators from oop to oopDesc* and compare
>>>>>> that (which is exactly what oopDesc::unsafe_equals() does).
>>>>>>
>>>>>>> 3. We still need an unsurprising performance model for GC's
>>>>>>> which don't require the extra identity hook.
>>>>>> Yeah. Unfortunately this is C++ and not Java, and the decision which GC
>>>>>> to use is made at runtime. The 'best' that I can see is to rip out the
>>>>>> indirection in Shenandoah-disabled builds, as proposed above. I don't
>>>>>> like it much, but if this is the only way we can get Shenandoah in,
>>>>>> then
>>>>>> let's go for it?
>>>>>>
>>>>>>> So we're not there yet, I think.  If there's no easy way to adjust
>>>>>>> the access API to get the last inlining, I suggest adding an #ifdef
>>>>>>> into oopDesc::equals.
>>>>>>>
>>>>>>> I suspect the happy final solution will be to templatize hot loops
>>>>>>> using making the inlinable adjusted op== be part of the "decorations"
>>>>>>> of those loops.  Then we will have two copies of hot loops in the
>>>>>>> JVM, one with the out-of-line hook and one without.
>>>>>>>
>>>>>>> For now, I think it's enough to have separate builds of the JVM,
>>>>>>> one with the #ifdef set "the old way" and one that allows more
>>>>>>> flexibility.
>>>>>>>
>>>>>>> (Is this the first time we've run into an occasion to make a separate
>>>>>>> Shenandaoah build?  Surely not.)
>>>>>> Primitive field accesses required similar jumping through hoops ;-)
>>>>>>
>>>>>>>> So... with inlining oopDesc::equals() we will get one straight call
>>>>>>>> (from the caller of oopDesc::equals() to the impl), instead of 2
>>>>>>>> (caller
>>>>>>>> -> oopDesc::equals() -> impl).
>>>>>>>>
>>>>>>>> It's the best I could come up with that does what (Shenandoah) GC
>>>>>>>> needs,
>>>>>>>> and it hasn't shown up in any performance testing that we did, yet.
>>>>>>>>
>>>>>>>> Still good?
>>>>>>> I think it's too risky to out-of-line the "cmpq" instruction. Let's
>>>>>>> find an agreeable way to handle that via a trade-off that keeps
>>>>>>> the old code shape in VM builds which don't need the new code
>>>>>>> shape.  I'm speaking up here because I don't believe this is the
>>>>>>> last time we'll have to consider adding something gross like an
>>>>>>> #ifdef to support new features, and I want to aim at trade-offs
>>>>>>> that keep the existing hot paths efficient, while still allowing
>>>>>>> new code shapes to become part of the mix.
>>>>>>>
>>>>>>> (I'd be happy to see something cleaner than an #ifdef, of
>>>>>>> course.  Seems to me that a constant non-product flag
>>>>>>> -XX:+ShortCircuitOopEquals would work too, with a
>>>>>>> binding of falseInShenandoah.)
>>>>>> That would require coding 2 versions of hot loops that involve oop==oop
>>>>>> and switch to one or the other based on that flag. I have checked all
>>>>>> the uses of oopDesc::equals() in my patch, and none seemed
>>>>>> particularily
>>>>>> critical. I think this would be overkill.
>>>>>>
>>>>>> And notice also that none interpreter/c1/c2 compiled code will suffer
>>>>>> this problem because we'll just generate the right code at runtime.
>>>>>>
>>>>>> Roman
>>>>>>
>>