RFR: JDK-8207169: X86: Modularize cmpxchg-oop assembler for C1 and C2

Mon Sep 3 09:57:51 UTC 2018

Hi Erik,

> It did not use to be possible as it needed its own enum switches all
> over the place. But as part of my C1 barrier set interface work, I
> wanted to make it possible to make your own LIR_Ops in the barrier set
> as well without cluttering the switches and inserted appropriate virtual
> calls to the LIR_Ops allowing you to do that. Now, basically, if your
> LIR_Op id is lir_none (which the default constructor sets it to), then
> it will use virtual calls into your LIR_Op in the switch statements.
> 
> I see how inserting LIR loops in the HIR basic block in the general case
> can go horribly wrong as Roland showed in his example. So if you feel
> like defining your own LIR_Op and lowering it in your barrier set is the
> more natural solution for Shenandoah, you can use that mechanism of course.
> 
> It sounds like we have reached an agreement?

I think so, at least for now. We'll try to turn our cmpxchg-oop problem
into LIR_Op and C2 node and see how that goes. I withdraw this RFR.
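
For reference, the dispatch mechanism Erik describes above can be pictured
with a small standalone model. This is only a sketch under assumptions: the
names OpCode, Op, CustomCasOp and emit_op are hypothetical and are not the
HotSpot classes. It only shows the idea that an op whose id is "none" falls
through the shared switch into a virtual call on the op itself.

```cpp
#include <cassert>
#include <string>

// Toy model only: these names are hypothetical, not HotSpot's LIR classes.
enum OpCode { op_none, op_move, op_add };

struct Op {
  OpCode code;
  // The default constructor leaves the id at "none".
  explicit Op(OpCode c = op_none) : code(c) {}
  virtual ~Op() = default;
  // Hook that shared code calls only for ops without an id in the enum.
  virtual std::string emit() const { return "<unhandled>"; }
};

// An op a barrier set could define privately, without a new enum value.
struct CustomCasOp : Op {
  std::string emit() const override { return "custom-cas"; }
};

// Shared code: known ids are handled by the switch, anything with the
// "none" id is delegated to the op through a virtual call.
std::string emit_op(const Op& op) {
  switch (op.code) {
    case op_move: return "move";
    case op_add:  return "add";
    case op_none: return op.emit();
  }
  return "<invalid>";
}
```

The point of the design is that the shared switch never needs to learn about
CustomCasOp; only the code that defines the op knows how to lower it.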

Thanks a lot,
Roman

> 
> Thanks,
> /Erik
> 
> On 2018-09-03 10:59, Roman Kennke wrote:
>> I wasn't sure that the BarrierSetC1 interface allows defining custom
>> ops. This sounds like a good natural solution. Ditto for C2. Let's see
>> if we can make that work.
>>
>> Roman
>>
>> On 3 September 2018 10:37:04 CEST, "Erik Österlund"
>> <erik.osterlund at oracle.com> wrote:
>>> Hi Roland,
>>>
>>> First of all, I apologize for getting your name wrong in the last
>>> email.
>>>
>>> On 2018-08-31 16:46, Roland Westrelin wrote:
>>>>> Well... C1 uses CAS in the heap only for the Unsafe CAS intrinsic,
>>>>> which is indeed inserted at parse time. And all other GCs alter the
>>>>> CFG for the GC barriers in their CAS barriers, using LIR. Except
>>>>> Epsilon I suppose.
>>>> Are you talking about for instance G1BarrierSetC1::pre_barrier()? That
>>>> method adds control flow within a basic block. It doesn't hack the CFG
>>>> (it doesn't add new basic blocks). How can the register allocator
>>>> compute liveness without a correct CFG? Either
>>>> G1BarrierSetC1::pre_barrier() is a simple enough case that register
>>>> allocation is correct or there are some nasty bugs in there. In any
>>>> case, building control flow within a block like
>>>> G1BarrierSetC1::pre_barrier() does is an ugly hack. Doing anything more
>>>> complicated that way is asking for trouble.
>>> The C1 basic blocks are built and optimized as part of the HIR and are
>>> not to be changed after that. Once the HIR is generated, the LIR inserts
>>> operations required for lowering this optimized HIR to machine code.
>>> After IR::compute_code() of the HIR, those basic blocks are set in
>>> stone. That means that any control flow alterations needed by the
>>> LIRGenerator, which comes into play after that, are going to use
>>> branches within the HIR basic block instead (as we promised not to
>>> change the HIR basic blocks after the HIR is built and optimized). I can
>>> see how that might feel like a hack, but that is kind of the way that
>>> things are currently done in C1. It is used this way for all barrier
>>> sets today (UseCondCardMark for card marking GCs, for G1, ZGC), and it's
>>> also used by T_BOOLEAN normalization, switch statements, checking for
>>> referents in unsafe intrinsics etc. I suppose the stubs inserted at the
>>> LIR level also similarly break the basic block abstraction of the HIR
>>> level. These are things that can of course be changed into a more strict
>>> basic block model even at the LIR level. But I don't know how much that
>>> would help given that this is just the pass before lowering to machine
>>> code. But that is a whole different discussion.
>>>
>>> I do not propose to move the GC barriers into the HIR - it is too
>>> early. I propose to insert it at the LIR level like all the other GCs,
>>> in a similar way to all the other GCs, using the same mechanisms used by
>>> all the other GCs.
>>>
>>> @Roman: If you feel more comfortable using your own LIR_Op with your
>>> own lowering or stubs instead because you want this written in assembly
>>> for whatever reason, then I am fine with that too as long as it is
>>> contained in the shenandoah folders. What I do have reservations about
>>> is changing the API that everybody else uses, so that the LIRGenerator
>>> raw CAS gets lowered into a not raw Access call to the macro assembler,
>>> passing in temporary registers used by Shenandoah from above into the
>>> raw CAS used by the not raw macro assembler access CAS.
>>>
>>> For example, in ZGC we have a LIR_Op class, LIR_OpZLoadBarrierTest,
>>> defined in zBarrierSetC1.cpp, which allows us to do custom machine
>>> dependent lowering of the test itself, which can be inserted into the
>>> LIR list.
>>>
>>> I hope we are on the same page here!
>>>
>>> Thanks,
>>> /Erik
>>>
>>>> Roland.
> 
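
The in-block control flow discussed in the quoted thread can be pictured
with another toy model. Again this is a sketch under assumptions, not C1
code: Block, append_conditional_barrier and lower are hypothetical names,
and the ops are plain strings. It only illustrates that the lowering step
appends a compare, a branch and a label to a block's op list while the set
of basic blocks (the CFG) stays exactly as it was built.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: a "basic block" is an id plus a linear list of ops.
struct Block {
  int id;
  std::vector<std::string> ops;
};

// Appends a conditional slow-path call inside the given block.
// No new Block is created; the branch and its label live in the same list.
void append_conditional_barrier(Block& b) {
  b.ops.push_back("cmp    marking_active, 0");
  b.ops.push_back("br_eq  L_done");
  b.ops.push_back("call   slow_path_stub");
  b.ops.push_back("L_done:");
}

// Lowers each block: barrier first, then the store it guards.
// The number and identity of the blocks is left untouched.
std::vector<Block> lower(std::vector<Block> blocks) {
  for (Block& b : blocks) {
    append_conditional_barrier(b);
    b.ops.push_back("store  field");
  }
  return blocks;
}
```

This is the shape Roland's concern is about: an analysis that only looks at
the block graph sees a single straight-line block, although the op list
inside it contains a branch.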
