RFR: JDK-8207169: X86: Modularize cmpxchg-oop assembler for C1 and C2

Roman Kennke rkennke at redhat.com
Wed Aug 29 15:41:34 UTC 2018


Assuming that we manage to expand our logic into C2 ideal graph, we still need a way to deal with C1. I have a patch that can alloc the tmp regs only for GC that needs it, and otherwise uses the patch as I proposed (I.e. a BSA interface for C1 cmpxchg_pop), minus the C2 stuff and minus the handle_cmpxchg_pop() stuff. Would you be ok with such a solution? Or would you rather have us generate the C1 LIR (which would be *really* nasty...)

Roman


Am 29. August 2018 17:28:54 MESZ schrieb "Erik Österlund" <erik.osterlund at oracle.com>:
>Hi Roman,
>
>On 2018-08-29 17:14, Roman Kennke wrote:
>> Hmm right, generating specific e.g. ShenandoahCompareAndSwapPNode and
>> putting an expansion for that in the .ad file seems like a good idea.
>
>I meant generating CAS and GC barrier nodes for the intrinsic, and
>macro 
>expanding the barrier nodes during the macro expansion phase into an 
>inflated node representation describing their barrier logic, and leave 
>the AD file alone.
>
>> We tried generating ideal graph for what we need for cmpxchg-oop and
>it
>> was complicated and error-prone and not really worth it. We may try
>again.
>
>It is fiddly indeed. But the resulting interface is much cleaner I 
>think. I don't think putting things in the AD file for convenience is 
>necessarily the right thing to do. I do see the temptation of doing so 
>though, as it is quite a shortcut. But I think the AD files was meant
>to 
>describe how nodes map to different architectures, not be a place where
>
>you put things for convenience that were tricky to stitch together the 
>right node graph for. At least that is how I view it.
>
>Thanks,
>/Erik
>
>> Roman
>>
>>> I agree with you that it would be nice to have a good interface for
>>> this, and that putting your GC barriers straight into the AD file
>makes
>>> it very awkward to have a clean interface. It will be a special
>>> interface for shenandoah only, that only shenandoah uses. All other
>GCs
>>> generate nodes for their barriers. It seems like a bit of a shortcut
>for
>>> not having to generate the proper nodes for your barriers, which I
>think
>>> is the right level to do this, and is where all other GCs do this.
>The
>>> rules matched in the AD file kind of presume you know what
>instructions
>>> will be generated. This is required for example for
>PrintOptoAssembly to
>>> generate anything useful. It might not be a dealbreaker I suppose if
>>> PrintOptoAssembly prints nonsense. But I can't help but feel like
>the
>>> abstraction level is not right here, and that placing GC barriers
>>> straight into the AD file for convenience, is a bit of a hack.
>>>
>>> An alternative that I think should be considered, is to do what ZGC
>>> does: generate nodes in ZBarrierSetC2, and macro expand barrier
>nodes in
>>> ZBarrierSetC2. That way, your changes can remain in your
>>> ShenandoahBarrierSetC2 class, and not leak out into the AD files.
>>>
>>> Thanks,
>>> /Erik
>>>
>>> On 2018-08-29 16:35, Roman Kennke wrote:
>>>> Hi Roland,
>>>>
>>>>>> It was suggested to me (off-list) to simply put this stuff under
>if
>>>>>> (UseShenandoahGC). I don't have a very strong preference about
>this.
>>>>>> I'm
>>>>>> leaning towards having a (somewhat) proper GC interface for it,
>if only
>>>>>> because of the symmetry with the runtime GC interface that also
>has a
>>>>>> cmpxchg hook, and because, cmpxchg(-oop) is, in-fact, a
>heap-access and
>>>>>> should thus go through the GC interface.
>>>>>>
>>>>>> So, how do others feel about this? Better to put this under
>>>>>> Shenandoah-specific paths? Make it a proper
>GC/BarrierSet-interface? I
>>>>>> also very much welcome suggestions for improving the suggested
>>>>>> interface.
>>>>> FWIW, I made that off-list comment. My objection is that:
>>>>>
>>>>>     666     if
>>>>>
>(BarrierSet::barrier_set()->barrier_set_assembler()->handle_cmpxchg_oop())
>>>>> {
>>>>>     667       tmp1 = new_register(T_OBJECT);
>>>>>     668       tmp2 = new_register(T_OBJECT);
>>>>>     669     }
>>>>>
>>>>> is shenandoah specific. I suppose you could have an API entry
>point that
>>>>> would return some opaque object. That object for shenandoah would
>record
>>>>> 2 extra registers. Then in the backend, you would pass that opaque
>>>>> object to BarrierSetAssembler::cmpxchg_oop_c1() and the shenandoah
>>>>> implementation would get its 2 extra registers.
>>>>>
>>>>> The problem is that with c2:
>>>>>
>>>>>    7471 instruct compareAndSwapP_BS(rRegI res,
>>>>>    7472                             memory mem_ptr,
>>>>>    7473                             rRegP tmp1, rRegP tmp2,
>>>>>    7474                             rax_RegP oldval, rRegP newval,
>>>>>    7475                             rFlagsReg cr)
>>>>>    7476 %{
>>>>>
>>>>> is also shenandoah specific for the same reason (it needs 2 extra
>>>>> registers). And I don't see a way to properly abstract that.
>>>>>
>>>>> So it seems to me, that code is not properly abstracted, there's
>no
>>>>> reasonable way to properly abstract it, and it's actually
>shenandoah
>>>>> specific code in disguise so it's better to have it be explicitly
>>>>> shenandoah specific.
>>>>>
>>>>> Roland.
>>>>>
>>>> Thanks, Roland!
>>>>
>>>> So let me put my argument here too. The GC/BarrierSet-interface
>already
>>>> has a couple of interfaces that are Shenandoah-specific. Basically
>all
>>>> of what I added for supporting primitive access, object equality
>and
>>>> allocations is Shenandoah-specific. There is a cmpxchg-oop
>interface in
>>>> the runtime too. But then again, it is totally conceivable that a
>future
>>>> GC might need those abstractions. In-fact, as far as I know, ZGC
>suffers
>>>> from a similar problem with cmpxchg-oop as Shenandoah does, but
>there
>>>> it's solved differently.
>>>>
>>>> Looking from a very-far-away perspective, one of the design ideas
>behind
>>>> the current GC/BarrierSet-interface was that the GC should own all
>heap
>>>> access. And cmpxchg-oop is quite definitely a heap-access. It is
>>>> unfortunate that we can't seem to make it really clean with the way
>that
>>>> the C2 backend is currently working. What I suggested seemed the
>closest
>>>> we can get while remaining practical.
>>>>
>>>> Roman
>>>>
>>



More information about the hotspot-gc-dev mailing list