RFR: RFR: Eliminate write-barrier assembly stub (part 2)

Sat Mar 10 11:15:28 UTC 2018

The TypeTuple only comes out of calls. Since we should already have
optimized chained WBs and similar, there will not be any use attempting
to optimize this sort of stuff after expansion. The following change
makes the problem disappear:

Differential:
http://cr.openjdk.java.net/~rkennke/eliminate_wb_stub-c2/webrev.02.diff/
Full patch:
http://cr.openjdk.java.net/~rkennke/eliminate_wb_stub-c2/webrev.02/

Ok now?

Roman

> The problem seems to be that we get a Type that is not possible to make
> a ptr of here:
> 
>    if (type->make_ptr()->higher_equal(TypePtr::NULL_PTR)) {
> 
> make_ptr() would return NULL, and blow up in higher_equal()
> 
> Interestingly, guarding with is_ptr() and returning true, or checking
> with make_ptr() == NULL doesn't help.
> 
> I think what we get there is a TypeTuple, and we should probably not get
> it in the first place. How this happened, I don't know :-)
> 
> Can you help?
> 
> Roman
> 
>> Sometimes I seem to get a C2 compilation error. I could only reproduce
>> it with release build so far. It happens when running TestGCBasher.
>> Roland, could you check this? Find attached the hs_err and replay files.
>>
>> Thanks, Roman
>>
>>
>>> After I eliminated the assembly stub, it doesn't do anything anymore for
>>> C2, except for some shuffling of registers and doing 2 calls instead of
>>> one (compiled -> stub -> runtime). I strongly suspect that the register
>>> allocator can do the shuffling better, and avoiding the extra jump will
>>> also be a net win. At the very least, letting C2 call directly to the
>>> entry in ShenandoaBarrierSet removes a bunch of unnecessary code diffs
>>> against upstream:
>>>
>>> http://cr.openjdk.java.net/~rkennke/eliminate_wb_stub-c2/webrev.01/
>>>
>>> One thing I am not sure is this little piece in
>>> ShenandoahSupport::needs_barrier_impl() :
>>>
>>>   if (n->is_CallJava() || n->Opcode() == Op_CallLeafNoFP) {
>>>     return true;
>>>   }
>>>
>>>
>>> Was this supposed to match the wb-stub-call (CallLeafNoFP) and if so,
>>> should it now match the runtime call (CallLeaf plus entry address)? But
>>> I don't really see the point: if it's a CallJava, it doesn't make a
>>> difference if it's also a CallLeafNoFP. Maybe the intention was to *not*
>>> match WB stub calls and it should have been:
>>>
>>>
>>>   if (n->is_CallJava() && n->Opcode() != Op_CallLeafNoFP) {
>>>     return true;
>>>   }
>>>
>>> ?
>>> Because that would kinda make sense: we want barriers for return values
>>> from java calls, but not really from wb-runtime-calls. Hmmmm. This is
>>> why I'd propose to change it to:
>>>
>>>   if (n->is_CallJava() && !(n->Opcode() == Op_CallLeaf &&
>>> n->as_Call()->entry_point() == CAST_FROM_FN_PTR(address,
>>> ShenandoahBarrierSet::write_barrier_JRT))) {
>>>     return true;
>>>   }
>>>
>>> I don't think this would make much difference though: all this stuff can
>>> only match after WB expansion, at which point such optimizations (WB
>>> after WB) should already have happened.
>>>
>>> Passes hotspot_gc_shenandoah (fastdebug/release) and specjvm
>>> (fastdebug/release)
>>>
>>> Ok to go?
>>>
>>
> 
>