RFR (M): 8000805: JMM issue: short loads are non-atomic
Vladimir Kozlov
vladimir.kozlov at oracle.com
Fri Oct 19 18:35:21 PDT 2012
Vladimir Ivanov wrote:
> They correspond, but indirectly.
> There are no "(AndI (LoadUS))" rules, but "(ConvI2L (AndI (LoadUS))" are
> present. So, if LoadS -> LoadUS transformation doesn't occur, ConvI2L
> rules can't be applied. I addresses this by adding missing cases.
I see. But in this case you can simple add second (or more) match rule:
// Load Unsigned Byte (8 bit UNsigned) with 8-bit mask into Long Register
instruct loadUB2L_immI8(iRegL dst, memory mem, immI8 mask) %{
match(Set dst (ConvI2L (AndI (LoadUB mem) mask)));
+ match(Set dst (ConvI2L (AndI (LoadB mem) mask)));
>
> Oh, I see what you are suggesting now. Yes, it definitely looks better
> than what I have now.
>
> What I was concerned about is slightly different. And I haven't
> addressed it with the the current fix.
>
> Current solution doesn't cover the following cases:
> long f(short[] sa) {
> short s = sa[0];
> return (s & 0xF0) & (s & 0x0F);
> }
> Before the fix, it was transformed into 2 AndI nodes which share LoadUS.
> Now it stays LoadS till the end.
I don't understand what the problem you have here. The expression result is 0 in
this case.
>
> And when I talked about filtering, I was thinking about the following case:
> public void observe(Specimen s, byte[] result) {
> short t = s.x;
> result[0] = (byte) (t & 0xFF);
> result[1] = (byte) (t & 0xF0);
> }
>
> After parsing of this method, LoadS has 3 users: LShiftI, AndI, StoreB,
> CallStaticJava. To convert it into LoadUS I need to look at LShiftI &
> AndI, but skip StoreB & CallStaticJava.
Where LShiftI comes from?
Vladimir
>
> But do these cases matter?
>
> Best regards,
> Vladimir Ivanov
>
> On 10/20/12 3:18 AM, Vladimir Kozlov wrote:
>> Vladimir I.,
>>
>> Mach nodes you added do not correspond to removed ideal code - ideal
>> code did not have ConvI2L transformations.
>>
>> What John suggested is to add simple check:
>>
>> if ((load->outcnt() == 1) && (load->unique_out() == this))
>>
>> to allow these transformations. I agree that such transformations could
>> be done only after parser when the graph is complete.
>>
>> Why do you think you need to filter out nodes?
>>
>> Thanks,
>> Vladimir K.
>>
>> Vladimir Ivanov wrote:
>>> John,
>>>
>>> Thanks for review.
>>>
>>> I started with the approach you suggests, but decided that moving
>>> these checks into AD files is a cleaner solution. The code looked more
>>> cumbersome because of the following:
>>> - can't perform such transformations during parsing
>>> - need to filter out nodes, which aren't affected by switch
>>> between signed/unsigned versions (like stores, calls, etc).
>>>
>>> If you are fine with these complications, I'll reimplement the fix w/o
>>> touching AD files.
>>>
>>> I'll double-check loadS2L_immI13 rule. Just checked and it turns out
>>> my test doesn't cover it. Thanks for catching this.
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>> On 10/20/12 1:32 AM, John Rose wrote:
>>>> I have a couple of reservations about this change. I would hope that
>>>> the AD file logic could be more factored. The simm13 operand looks
>>>> wrong: can't it be negative? In any case, the opt should apply up to
>>>> 0xFFFF.
>>>>
>>>> I think you can (& should) keep the memnode logic if you can prove
>>>> that all occurrences of the signed ld can be subsumed by the unsigned
>>>> ld. This can be detected by using out edges.
>>>>
>>>> -- John (on my iPhone)
>>>>
>>>> On Oct 19, 2012, at 11:39 AM, Vladimir Ivanov
>>>> <vladimir.x.ivanov at oracle.com> wrote:
>>>>
>>>>> http://cr.openjdk.java.net/~vlivanov/8000805/webrev.00/
>>>>> 282 lines changed: 255 ins; 27 del; 0 mod
>>>>>
>>>>> Ideal transformations during parsing & IGVN may rematerialize loads
>>>>> in order to reify useful information (like signed/unsigned load).
>>>>> Such behavior breaks JMM - instead of a single atomic load, multiple
>>>>> loads are performed violating consistency of loaded data.
>>>>>
>>>>> The fix is to disable such transformations at all, but perform
>>>>> relevant optimizations during matching.
>>>>>
>>>>> It fixes only C2 part of problem. С1 has similar deficiency, but
>>>>> it'll be addressed separately.
>>>>>
>>>>> Testing: failing test, test for new matching rules, JPRT, CTW (x86,
>>>>> x64 & sparc).
>>>>>
>>>>> Best regards,
>>>>> Vladimir Ivanov
More information about the hotspot-compiler-dev
mailing list