RFR: JDK-8207169: X86: Modularize cmpxchg-oop assembler for C1 and C2
Erik Österlund
erik.osterlund at oracle.com
Wed Aug 29 15:28:54 UTC 2018
Hi Roman,
On 2018-08-29 17:14, Roman Kennke wrote:
> Hmm right, generating specific e.g. ShenandoahCompareAndSwapPNode and
> putting an expansion for that in the .ad file seems like a good idea.
I meant generating CAS and GC barrier nodes for the intrinsic, macro
expanding the barrier nodes during the macro expansion phase into an
inflated node representation describing their barrier logic, and leaving
the AD file alone.
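
To make the shape of that concrete, here is a minimal toy sketch of the
idea. It is not HotSpot code: Node, Kind, ToyBarrierSetC2,
ToyGCBarrierSetC2, expand_barrier() and macro_expand() are all
hypothetical names invented for illustration, and the "graph" is just a
flat list. The only point it shows is that an abstract barrier node can
be inflated by a GC-owned hook during a separate expansion pass, so the
instruction-selection rules never see anything GC-specific.

```cpp
#include <string>
#include <vector>

// Hypothetical toy IR; none of these types are HotSpot classes.
enum class Kind { CompareAndSwapP, GCBarrier, Load, Branch, Call };

struct Node {
  Kind kind;
  std::string note;
};

// Stand-in for a GC-owned hook (assumption: one such object per GC).
struct ToyBarrierSetC2 {
  virtual ~ToyBarrierSetC2() = default;
  // Replace one abstract barrier node with the primitive nodes
  // that describe its barrier logic.
  virtual std::vector<Node> expand_barrier(const Node& barrier) const = 0;
};

// Made-up barrier logic for an imaginary GC.
struct ToyGCBarrierSetC2 : ToyBarrierSetC2 {
  std::vector<Node> expand_barrier(const Node& barrier) const override {
    return {
      {Kind::Load,   "load GC state for " + barrier.note},
      {Kind::Branch, "skip slow path if no barrier work is needed"},
      {Kind::Call,   "slow-path barrier stub"},
    };
  }
};

// Expansion pass: barrier nodes are inflated in place, every other
// node is copied through unchanged.
std::vector<Node> macro_expand(const std::vector<Node>& graph,
                               const ToyBarrierSetC2& bs) {
  std::vector<Node> out;
  for (const Node& n : graph) {
    if (n.kind == Kind::GCBarrier) {
      std::vector<Node> expanded = bs.expand_barrier(n);
      out.insert(out.end(), expanded.begin(), expanded.end());
    } else {
      out.push_back(n);
    }
  }
  return out;
}
```

Calling macro_expand() on a list holding one GCBarrier node followed by
one CompareAndSwapP node yields the three primitive nodes followed by
the untouched CAS node; a different GC would only supply a different
expand_barrier().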
> We tried generating ideal graph for what we need for cmpxchg-oop and it
> was complicated and error-prone and not really worth it. We may try again.
It is fiddly indeed. But the resulting interface is much cleaner I
think. I don't think putting things in the AD file for convenience is
necessarily the right thing to do. I do see the temptation of doing so
though, as it is quite a shortcut. But I think the AD files were meant to
describe how nodes map to different architectures, not to be a place where
you put things for convenience because the right node graph was tricky to
stitch together. At least that is how I view it.
Thanks,
/Erik
> Roman
>
>> I agree with you that it would be nice to have a good interface for
>> this, and that putting your GC barriers straight into the AD file makes
>> it very awkward to have a clean interface. It will be a special
>> interface for shenandoah only, that only shenandoah uses. All other GCs
>> generate nodes for their barriers. It seems like a bit of a shortcut for
>> not having to generate the proper nodes for your barriers, which I think
>> is the right level to do this, and is where all other GCs do this. The
>> rules matched in the AD file kind of presume you know what instructions
>> will be generated. This is required for example for PrintOptoAssembly to
>> generate anything useful. It might not be a dealbreaker I suppose if
>> PrintOptoAssembly prints nonsense. But I can't help but feel like the
>> abstraction level is not right here, and that placing GC barriers
>> straight into the AD file for convenience is a bit of a hack.
>>
>> An alternative that I think should be considered is to do what ZGC
>> does: generate nodes in ZBarrierSetC2, and macro expand barrier nodes in
>> ZBarrierSetC2. That way, your changes can remain in your
>> ShenandoahBarrierSetC2 class, and not leak out into the AD files.
>>
>> Thanks,
>> /Erik
>>
>> On 2018-08-29 16:35, Roman Kennke wrote:
>>> Hi Roland,
>>>
>>>>> It was suggested to me (off-list) to simply put this stuff under if
>>>>> (UseShenandoahGC). I don't have a very strong preference about this. I'm
>>>>> leaning towards having a (somewhat) proper GC interface for it, if only
>>>>> because of the symmetry with the runtime GC interface that also has a
>>>>> cmpxchg hook, and because cmpxchg(-oop) is, in fact, a heap access and
>>>>> should thus go through the GC interface.
>>>>>
>>>>> So, how do others feel about this? Better to put this under
>>>>> Shenandoah-specific paths? Make it a proper GC/BarrierSet-interface? I
>>>>> also very much welcome suggestions for improving the suggested
>>>>> interface.
>>>> FWIW, I made that off-list comment. My objection is that:
>>>>
>>>>   if (BarrierSet::barrier_set()->barrier_set_assembler()->handle_cmpxchg_oop()) {
>>>>     tmp1 = new_register(T_OBJECT);
>>>>     tmp2 = new_register(T_OBJECT);
>>>>   }
>>>>
>>>> is shenandoah specific. I suppose you could have an API entry point that
>>>> would return some opaque object. That object for shenandoah would record
>>>> 2 extra registers. Then in the backend, you would pass that opaque
>>>> object to BarrierSetAssembler::cmpxchg_oop_c1() and the shenandoah
>>>> implementation would get its 2 extra registers.
>>>>
>>>> The problem is that with c2:
>>>>
>>>>   instruct compareAndSwapP_BS(rRegI res,
>>>>                               memory mem_ptr,
>>>>                               rRegP tmp1, rRegP tmp2,
>>>>                               rax_RegP oldval, rRegP newval,
>>>>                               rFlagsReg cr)
>>>>   %{
>>>>
>>>> is also shenandoah specific for the same reason (it needs 2 extra
>>>> registers). And I don't see a way to properly abstract that.
>>>>
>>>> So it seems to me, that code is not properly abstracted, there's no
>>>> reasonable way to properly abstract it, and it's actually shenandoah
>>>> specific code in disguise so it's better to have it be explicitly
>>>> shenandoah specific.
>>>>
>>>> Roland.
>>>>
>>> Thanks, Roland!
>>>
>>> So let me put my argument here too. The GC/BarrierSet-interface already
>>> has a couple of interfaces that are Shenandoah-specific. Basically all
>>> of what I added for supporting primitive access, object equality and
>>> allocations is Shenandoah-specific. There is a cmpxchg-oop interface in
>>> the runtime too. But then again, it is totally conceivable that a future
>>> GC might need those abstractions. In fact, as far as I know, ZGC suffers
>>> from a similar problem with cmpxchg-oop as Shenandoah does, but there
>>> it's solved differently.
>>>
>>> Looking from a very-far-away perspective, one of the design ideas behind
>>> the current GC/BarrierSet-interface was that the GC should own all heap
>>> access. And cmpxchg-oop is quite definitely a heap-access. It is
>>> unfortunate that we can't seem to make it really clean with the way that
>>> the C2 backend is currently working. What I suggested seemed the closest
>>> we can get while remaining practical.
>>>
>>> Roman
>>>
>