RFR: JDK-8207169: X86: Modularize cmpxchg-oop assembler for C1 and C2

Mon Sep 3 08:37:04 UTC 2018

Hi Roland,

First of all, I apologize for getting your name wrong in the last email.

On 2018-08-31 16:46, Roland Westrelin wrote:
>> Well... C1 uses CAS in the heap only for the Unsafe CAS intrinsic,
>> which is indeed inserted at parse time. And all other GCs alter the
>> CFG for the GC barriers in their CAS barriers, using LIR. Except
>> Epsilon I suppose.
> Are you talking about for instance G1BarrierSetC1::pre_barrier()? That
> method adds control flow within a basic block. It doesn't hack the CFG
> (it doesn't add new basic blocks). How can the register allocator
> compute liveness without a correct CFG? Either
> G1BarrierSetC1::pre_barrier() is a simple enough case that register
> allocation is correct or there are some nasty bugs in there. In any
> case, building control flow within a block like
> G1BarrierSetC1::pre_barrier() does is an ugly hack. Doing anything more
> complicated that way is asking for trouble.

The C1 basic blocks are built and optimized as part of the HIR and are
not to be changed after that. Once the HIR is generated, LIR generation
inserts the operations required for lowering this optimized HIR to
machine code. After IR::compute_code() of the HIR, those basic blocks
are set in stone. That means that any control flow alterations needed by
the LIRGenerator, which comes into play after that, have to use branches
within the HIR basic block instead (as we promised not to change the HIR
basic blocks after the HIR is built and optimized). I can see how that
might feel like a hack, but that is kind of the way that things are
currently done in C1. It is used this way for all barrier sets today
(UseCondCardMark for card marking GCs, G1, ZGC), and it's also used by
T_BOOLEAN normalization, switch statements, checking for referents in
unsafe intrinsics etc. I suppose the stubs inserted at the LIR level
similarly break the basic block abstraction of the HIR level. These
things could of course be changed to follow a stricter basic block model
even at the LIR level, but I don't know how much that would help given
that this is just the pass before lowering to machine code. That is a
whole different discussion, though.
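
To illustrate what "branches within the HIR basic block" means, here is
a minimal toy sketch. All types and names in it (Kind, LirOp, BasicBlock,
append_conditional_barrier, the pseudo-instruction strings) are
hypothetical and are not HotSpot's actual classes. It only shows that a
barrier can add a local branch and label to a block's LIR list while the
block's successor edges stay exactly as the HIR left them.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: hypothetical types, not HotSpot's LIR classes.
enum class Kind { Load, Branch, Label, Call };

struct LirOp {
  Kind kind;
  std::string text;  // pseudo-instruction, for illustration only
};

struct BasicBlock {
  int id;
  std::vector<LirOp> ops;       // LIR appended while lowering the HIR
  std::vector<int> successors;  // CFG edges, fixed once the HIR is built
};

// Appends a conditional slow-path call to the block. The branch targets
// a label inside the same block, so no new block or CFG edge is created.
void append_conditional_barrier(BasicBlock& bb) {
  bb.ops.push_back({Kind::Load,   "load marking_active -> tmp"});
  bb.ops.push_back({Kind::Branch, "if tmp == 0 goto L_done"});
  bb.ops.push_back({Kind::Call,   "call pre_barrier_slow_path"});
  bb.ops.push_back({Kind::Label,  "L_done:"});
}

// Counts the ops of a given kind in a block.
int count_kind(const BasicBlock& bb, Kind k) {
  int n = 0;
  for (const LirOp& op : bb.ops) {
    if (op.kind == k) ++n;
  }
  return n;
}
```

In this toy model the block ends up with four extra LIR operations,
including one branch and one label, but its successor list is untouched,
which is the property discussed above.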

I do not propose to move the GC barriers into the HIR - it is too early.
I propose to insert them at the LIR level like all the other GCs, in a
similar way to all the other GCs, using the same mechanisms used by all
the other GCs.

@Roman: If you would feel more comfortable using your own LIR_Op with
your own lowering or stubs instead, because you want this written in
assembly for whatever reason, then I am fine with that too, as long as
it is contained in the shenandoah folders. What I do have reservations
about is changing the API that everybody else uses so that the
LIRGenerator's raw CAS gets lowered into a non-raw Access call to the
macro assembler, with temporary registers used by Shenandoah passed in
from above into the raw CAS used by the non-raw macro assembler Access
CAS.

For example, in ZGC we have an LIR_Op class, LIR_OpZLoadBarrierTest,
defined in zBarrierSetC1.cpp, which allows us to do custom machine
dependent lowering of the test itself, and which can be inserted into
the LIR list.
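
As a rough sketch of that pattern (a GC-private op with its own
lowering, living next to a shared raw CAS op that stays GC-agnostic),
consider the toy model below. Every name in it (ToyAssembler, ToyLirOp,
ToyRawCas, ToyGcBarrierTest, lower, and the pseudo-assembly strings) is
made up for illustration and does not reflect the real HotSpot or ZGC
signatures.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a macro assembler: it just records instruction text.
struct ToyAssembler {
  std::vector<std::string> code;
  void emit(const std::string& insn) { code.push_back(insn); }
};

// Toy stand-in for a LIR operation: each op knows how to lower itself.
struct ToyLirOp {
  virtual ~ToyLirOp() = default;
  virtual void emit_code(ToyAssembler& masm) const = 0;
};

// Shared op: a raw CAS that takes no GC-specific temporaries.
struct ToyRawCas : ToyLirOp {
  void emit_code(ToyAssembler& masm) const override {
    masm.emit("lock cmpxchg [addr], new_val");
  }
};

// GC-private op: it would live in the GC's own files and carries the
// GC's own temporary register and lowering.
struct ToyGcBarrierTest : ToyLirOp {
  std::string tmp;
  explicit ToyGcBarrierTest(std::string t) : tmp(std::move(t)) {}
  void emit_code(ToyAssembler& masm) const override {
    masm.emit("test " + tmp + ", [bad_mask]");
  }
};

using ToyLirList = std::vector<std::unique_ptr<ToyLirOp>>;

// Lowers every op in list order through its own emit_code().
std::vector<std::string> lower(const ToyLirList& list) {
  ToyAssembler masm;
  for (const auto& op : list) {
    op->emit_code(masm);
  }
  return masm.code;
}
```

The point of the sketch is only the layering: the GC-specific
temporaries stay inside the GC-private op, and the shared raw CAS op
does not need to change its interface.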

I hope we are on the same page here!

Thanks,
/Erik

> Roland.