RFR: JDK-8207169: X86: Modularize cmpxchg-oop assembler for C1 and C2

Mon Sep 3 08:59:13 UTC 2018

I wasn't sure whether the BarrierSetC1 interface allows defining custom ops. This sounds like a good, natural solution. Ditto for C2. Let's see if we can make that work.

Roman

On 3 September 2018 10:37:04 CEST, "Erik Österlund" <erik.osterlund at oracle.com> wrote:
>Hi Roland,
>
>First of all, I apologize for getting your name wrong in the last
>email.
>
>On 2018-08-31 16:46, Roland Westrelin wrote:
>>> Well... C1 uses CAS in the heap only for the Unsafe CAS intrinsic,
>>> which is indeed inserted at parse time. And all other GCs alter the
>>> CFG for the GC barriers in their CAS barriers, using LIR. Except
>>> Epsilon I suppose.
>> Are you talking about for instance G1BarrierSetC1::pre_barrier()? That
>> method adds control flow within a basic block. It doesn't hack the CFG
>> (it doesn't add new basic blocks). How can the register allocator
>> compute liveness without a correct CFG? Either
>> G1BarrierSetC1::pre_barrier() is a simple enough case that register
>> allocation is correct or there are some nasty bugs in there. In any
>> case, building control flow within a block like
>> G1BarrierSetC1::pre_barrier() does is an ugly hack. Doing anything more
>> complicated that way is asking for trouble.
>
>The C1 basic blocks are built and optimized as part of the HIR and are
>not to be changed after that. Once the HIR is generated, the LIR inserts
>operations required for lowering this optimized HIR to machine code.
>After IR::compute_code() of the HIR, those basic blocks are set in
>stone. That means that any control flow alterations needed by the
>LIRGenerator, which comes into play after that, are going to use branches
>within the HIR basic block instead (as we promised not to change the HIR
>basic blocks after the HIR is built and optimized). I can see how that
>might feel like a hack, but that is kind of the way that things are
>currently done in C1. It is used this way for all barrier sets today
>(UseCondCardMark for card marking GCs, for G1, ZGC), and it's also used
>by T_BOOLEAN normalization, switch statements, checking for referents in
>unsafe intrinsics etc. I suppose the stubs inserted at the LIR level
>also similarly break the basic block abstraction of the HIR level. These
>are things that can of course be changed into a more strict basic block
>model even at the LIR level. But I don't know how much that would help
>given that this is just the pass before lowering to machine code. But
>that is a whole different discussion.
>
>I do not propose to move the GC barriers into the HIR - it is too early.
>I propose to insert them at the LIR level like all the other GCs, in a
>similar way to all the other GCs, using the same mechanisms used by all
>the other GCs.
>
>@Roman: If you feel more comfortable using your own LIR_Op with your own
>lowering or stubs instead because you want this written in assembly for
>whatever reason, then I am fine with that too as long as it is contained
>in the shenandoah folders. What I do have reservations against is
>changing the API that everybody else uses so that the LIRGenerator raw CAS
>gets lowered into a non-raw Access call to the macro assembler, passing
>temporary registers used by Shenandoah from above into the raw CAS
>used by the non-raw macro assembler access CAS.
>
>For example, in ZGC we have a LIR_Op class, LIR_OpZLoadBarrierTest,
>defined in zBarrierSetC1.cpp, which allows us to do custom machine
>dependent lowering of the test itself, and which can be inserted into the
>LIR list.
>
>I hope we are on the same page here!
>
>Thanks,
>/Erik
>
>> Roland.
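
To make the "own LIR_Op with your own lowering" idea from the quoted mail concrete, here is a toy C++ model of it. This is only a sketch and not HotSpot code: ToyLirOp, ToyRawCas, ToyGcCas, ToyLirList and the emitted instruction strings are all hypothetical names invented for illustration. It assumes nothing more than "each op in the list lowers itself", so that a GC-specific op can carry its own temporaries without the shared raw CAS op having to know about them.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a LIR operation: every op lowers itself.
struct ToyLirOp {
    virtual ~ToyLirOp() = default;
    virtual void emit(std::vector<std::string>& out) const = 0;
};

// Shared, GC-agnostic raw compare-and-swap. It takes no GC temporaries.
struct ToyRawCas : ToyLirOp {
    void emit(std::vector<std::string>& out) const override {
        out.push_back("lock cmpxchg");
    }
};

// Hypothetical GC-specific op. It owns its temporaries and its lowering,
// so the shared ToyRawCas interface above stays unchanged.
struct ToyGcCas : ToyLirOp {
    std::string tmp1;
    std::string tmp2;
    ToyGcCas(std::string t1, std::string t2)
        : tmp1(std::move(t1)), tmp2(std::move(t2)) {}
    void emit(std::vector<std::string>& out) const override {
        out.push_back("gc_pre_cas " + tmp1 + ", " + tmp2);  // made-up pseudo-instruction
        out.push_back("lock cmpxchg");
        out.push_back("gc_post_cas " + tmp1);               // made-up pseudo-instruction
    }
};

// Toy LIR list: ops are appended in order and lowered in order.
struct ToyLirList {
    std::vector<std::unique_ptr<ToyLirOp>> ops;
    void append(std::unique_ptr<ToyLirOp> op) { ops.push_back(std::move(op)); }
    std::vector<std::string> lower() const {
        std::vector<std::string> out;
        for (const auto& op : ops) op->emit(out);
        return out;
    }
};

// Builds a list holding one shared op and one GC-specific op, then lowers it.
std::vector<std::string> lower_example() {
    ToyLirList list;
    list.append(std::make_unique<ToyRawCas>());
    list.append(std::make_unique<ToyGcCas>("tmp1", "tmp2"));
    return list.lower();
}
```

Calling lower_example() yields the raw CAS line followed by the three lines of the GC-specific op. In this model the extra temporaries live entirely in the GC's own op class, which mirrors the containment argument made in the quoted mail.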