RFR: JDK-8207169: X86: Modularize cmpxchg-oop assembler for C1 and C2

Erik Österlund erik.osterlund at oracle.com
Wed Aug 29 15:28:54 UTC 2018


Hi Roman,

On 2018-08-29 17:14, Roman Kennke wrote:
> Hmm right, generating specific e.g. ShenandoahCompareAndSwapPNode and
> putting an expansion for that in the .ad file seems like a good idea.

I meant generating CAS and GC barrier nodes for the intrinsic, macro
expanding the barrier nodes during the macro expansion phase into an
inflated node representation describing their barrier logic, and leaving
the AD file alone.
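
As a rough illustration of that flow, here is a minimal toy model in plain
C++. It is not HotSpot code: Node, Kind, intrinsic_graph and macro_expand
are hypothetical names, the "graph" is just a flat list, and the expanded
node sequence is only illustrative, not the actual Shenandoah barrier.

```cpp
#include <string>
#include <vector>

// Toy stand-ins for ideal-graph nodes; none of these are real C2 classes.
enum class Kind { CompareAndSwapP, GCBarrier, Load, Test, Branch, SlowPathCall };

struct Node {
  Kind kind;
  std::string note;
};

// Hypothetical macro expansion: replace each abstract GCBarrier node with
// an inflated sequence of ordinary nodes. All other nodes pass through.
std::vector<Node> macro_expand(const std::vector<Node>& graph) {
  std::vector<Node> out;
  for (const Node& n : graph) {
    if (n.kind != Kind::GCBarrier) {
      out.push_back(n);
      continue;
    }
    out.push_back({Kind::Load, "load GC state"});
    out.push_back({Kind::Test, "barrier needed?"});
    out.push_back({Kind::Branch, "skip slow path if not"});
    out.push_back({Kind::SlowPathCall, "resolve oop"});
  }
  return out;
}

// What the intrinsic would emit before expansion: a barrier node followed
// by a plain CAS node that the existing AD rules already know how to match.
std::vector<Node> intrinsic_graph() {
  return {{Kind::GCBarrier, "barrier on expected value"},
          {Kind::CompareAndSwapP, "plain CAS"}};
}
```

The point of the sketch is only that after expansion no GC-specific node
kind is left, so the matcher sees nothing but ordinary nodes.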

> We tried generating ideal graph for what we need for cmpxchg-oop and it
> was complicated and error-prone and not really worth it. We may try again.

It is fiddly indeed. But the resulting interface is much cleaner I 
think. I don't think putting things in the AD file for convenience is 
necessarily the right thing to do. I do see the temptation of doing so 
though, as it is quite a shortcut. But I think the AD files were meant to
describe how nodes map to different architectures, not to be a place
where you put logic because stitching together the right node graph was
tricky. At least that is how I view it.

Thanks,
/Erik

> Roman
>
>> I agree with you that it would be nice to have a good interface for
>> this, and that putting your GC barriers straight into the AD file makes
>> it very awkward to have a clean interface. It will be a special
>> interface for shenandoah only, that only shenandoah uses. All other GCs
>> generate nodes for their barriers. It seems like a bit of a shortcut for
>> not having to generate the proper nodes for your barriers, which I think
>> is the right level to do this, and is where all other GCs do this. The
>> rules matched in the AD file kind of presume you know what instructions
>> will be generated. This is required for example for PrintOptoAssembly to
>> generate anything useful. It might not be a dealbreaker I suppose if
>> PrintOptoAssembly prints nonsense. But I can't help but feel like the
>> abstraction level is not right here, and that placing GC barriers
>> straight into the AD file for convenience is a bit of a hack.
>>
>> An alternative that I think should be considered is to do what ZGC
>> does: generate nodes in ZBarrierSetC2, and macro expand barrier nodes in
>> ZBarrierSetC2. That way, your changes can remain in your
>> ShenandoahBarrierSetC2 class, and not leak out into the AD files.
>>
>> Thanks,
>> /Erik
>>
>> On 2018-08-29 16:35, Roman Kennke wrote:
>>> Hi Roland,
>>>
>>>>> It was suggested to me (off-list) to simply put this stuff under if
>>>>> (UseShenandoahGC). I don't have a very strong preference about this.
>>>>> I'm leaning towards having a (somewhat) proper GC interface for it,
>>>>> if only because of the symmetry with the runtime GC interface that
>>>>> also has a cmpxchg hook, and because cmpxchg(-oop) is, in fact, a
>>>>> heap access and should thus go through the GC interface.
>>>>>
>>>>> So, how do others feel about this? Better to put this under
>>>>> Shenandoah-specific paths? Make it a proper GC/BarrierSet-interface? I
>>>>> also very much welcome suggestions for improving the suggested
>>>>> interface.
>>>> FWIW, I made that off-list comment. My objection is that:
>>>>
>>>>     if (BarrierSet::barrier_set()->barrier_set_assembler()->handle_cmpxchg_oop()) {
>>>>       tmp1 = new_register(T_OBJECT);
>>>>       tmp2 = new_register(T_OBJECT);
>>>>     }
>>>>
>>>> is shenandoah specific. I suppose you could have an API entry point that
>>>> would return some opaque object. That object for shenandoah would record
>>>> 2 extra registers. Then in the backend, you would pass that opaque
>>>> object to BarrierSetAssembler::cmpxchg_oop_c1() and the shenandoah
>>>> implementation would get its 2 extra registers.
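
A minimal sketch of that opaque-object idea, as a standalone toy in plain
C++. Only the name cmpxchg_oop_c1 comes from the text above; its signature
here, prepare_cmpxchg_oop, CmpxchgScratch, ShenandoahScratch and the
integer Register type are all hypothetical stand-ins, not HotSpot APIs.

```cpp
#include <memory>
#include <vector>

using Register = int;  // toy register id; stands in for a C1 operand

// Opaque to the shared C1 code: it only carries the object around.
struct CmpxchgScratch {
  virtual ~CmpxchgScratch() = default;
};

struct BarrierSetAssembler {
  virtual ~BarrierSetAssembler() = default;
  // Hypothetical entry point. Default: no GC-specific scratch state,
  // and no registers are consumed.
  virtual std::unique_ptr<CmpxchgScratch> prepare_cmpxchg_oop(Register& next_free) {
    return nullptr;
  }
  // Returns how many temporaries this GC's backend code received.
  virtual int cmpxchg_oop_c1(const CmpxchgScratch* scratch) { return 0; }
};

// Shenandoah's flavour of the opaque object records two extra registers.
struct ShenandoahScratch : CmpxchgScratch {
  Register tmp1, tmp2;
  ShenandoahScratch(Register a, Register b) : tmp1(a), tmp2(b) {}
};

struct ShenandoahBarrierSetAssembler : BarrierSetAssembler {
  std::unique_ptr<CmpxchgScratch> prepare_cmpxchg_oop(Register& next_free) override {
    Register a = next_free++;  // plays the role of new_register(T_OBJECT)
    Register b = next_free++;
    return std::make_unique<ShenandoahScratch>(a, b);
  }
  int cmpxchg_oop_c1(const CmpxchgScratch* scratch) override {
    // Only the Shenandoah implementation knows the concrete type.
    auto* s = static_cast<const ShenandoahScratch*>(scratch);
    return (s->tmp1 != s->tmp2) ? 2 : 0;
  }
};
```

In this shape the shared code never mentions tmp1 or tmp2; it just passes
the object from the prepare step to the backend call.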
>>>>
>>>> The problem is that with c2:
>>>>
>>>>    instruct compareAndSwapP_BS(rRegI res,
>>>>                                memory mem_ptr,
>>>>                                rRegP tmp1, rRegP tmp2,
>>>>                                rax_RegP oldval, rRegP newval,
>>>>                                rFlagsReg cr)
>>>>    %{
>>>>
>>>> is also shenandoah specific for the same reason (it needs 2 extra
>>>> registers). And I don't see a way to properly abstract that.
>>>>
>>>> So it seems to me that the code is not properly abstracted, there's no
>>>> reasonable way to properly abstract it, and it's actually shenandoah
>>>> specific code in disguise, so it's better to have it be explicitly
>>>> shenandoah specific.
>>>>
>>>> Roland.
>>>>
>>> Thanks, Roland!
>>>
>>> So let me put my argument here too. The GC/BarrierSet-interface already
>>> has a couple of interfaces that are Shenandoah-specific. Basically all
>>> of what I added for supporting primitive access, object equality and
>>> allocations is Shenandoah-specific. There is a cmpxchg-oop interface in
>>> the runtime too. But then again, it is totally conceivable that a future
>>> GC might need those abstractions. In fact, as far as I know, ZGC suffers
>>> from a similar problem with cmpxchg-oop as Shenandoah does, but there
>>> it's solved differently.
>>>
>>> Looking from a very-far-away perspective, one of the design ideas behind
>>> the current GC/BarrierSet-interface was that the GC should own all heap
>>> access. And cmpxchg-oop is quite definitely a heap-access. It is
>>> unfortunate that we can't seem to make it really clean with the way that
>>> the C2 backend is currently working. What I suggested seemed the closest
>>> we can get while remaining practical.
>>>
>>> Roman
>>>
>


