[9] RFR (XS): 8153340: Incorrect lower bound for AllocatePrefetchDistance with AllocatePrefetchStyle=3

Vladimir Kozlov vladimir.kozlov at oracle.com
Wed Apr 27 21:55:16 UTC 2016


Hi Zoltan

On 4/27/16 7:20 AM, Zoltán Majó wrote:
> Hi Vladimir,
>
>
> thank you for the feedback!
>
> On 04/27/2016 12:39 AM, Vladimir Kozlov wrote:
>> On 4/26/16 4:56 AM, Zoltán Majó wrote:
>>> Hi Vladimir,
>>>
>>>
>>> thank you for the feedback! Please see comments below.
>>>
>>> On 04/22/2016 12:37 AM, Vladimir Kozlov wrote:
>>>> Hi, Zoltan
>>>>
>>>> I think we should change code in prefetch_allocation() instead:
>>>>
>>>>       Node *cache_adr = new AddPNode(old_eden_top, old_eden_top,
>>>> _igvn.MakeConX(step_size + distance));
>>>
>>> The problem is that AllocatePrefetchStyle must align the first prefetched address to 8 bytes, otherwise the emitted stxa
>>> instruction could cause a SIGBUS. But the alignment does not have to be at step_size boundary, 8-byte alignment is
>>> sufficient.
>>
>> Actually it has to be step_size (cache line size) aligned - BIS instruction is triggered when the address of stxa is
>> the beginning of cache line for AllocatePrefetchStyle == 3. If it is just 8 bytes aligned it will be simple store.
>
> Thank you for clarifying.
>
>> We should require AllocatePrefetchStepSize to be 8 bytes aligned in vm_version_sparc.cpp instead of:
>> +       // BIS instructions require 8-byte aligned addresses
>> +       Node* mask = _igvn.MakeConX(~(intptr_t)(wordSize - 1));
>
> I agree. But I'd prefer we do that in the AllocatePrefetchStepSizeConstraintFunc() constraint function in
> commandLineFlagConstraintsCompiler.cpp). The reason is that since JEP-245 the preferred place to validate of
> command-line arguments is in constraint functions. Most compiler-related checks were moved there with JDK-8078554 and
> JDK-8146478.

Okay. Usually we do platform specific flag's setting in vm_version_<arch>.cpp files but it looks like it start changing.
I am not supporter of having multiply #ifdef in shared code (in commandLineFlagConstraintsCompiler).

>
> I'd also like to set the minimum value for AllocatePrefetchDistance to AllocatePrefetchStepSize, otherwise we can access
> the heap before newly allocated objects/arrays.

Only for style 3. And I said it may be better to change code.

>>>
>>>>
>>>> These way we allow AllocatePrefetchDistance == 0 in all AllocatePrefetchStyle cases - it is consistent.
>>>
>>> Unfortunately, it is not easy to have the same limit for AllocatePrefetchDistance on all platforms. Due to the 8-byte
>>> alignment performed by compiled code, the lower limit of 8 for AllocatePrefetchDistance is needed on SPARC; the lower
>>> limit of 0 works fine on all other platforms.
>>>
>>> I've started looking at the consistency of flags controlling allocation prefetch in general, as we have other issues
>>> open related to them (e.g., JDK-8151622). We're touching related code now, so I thought, maybe it makes sense to fix all
>>> remaining issues at once.
>>
>> Agree. I think AllocatePrefetchStyle=2 is broken on all platforms - it should be used only if UseTLAB is true. Please,
>> look.
>
> It seems AllocatePrefetchStyle = 2 can be used only if UseTLAB is true.
> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/6a17c49de974/src/share/vm/opto/macro.cpp#l1844

I got report from it crashing on T7 (attached). May be it is because by default it used BIS with it.

Thanks,
Vladimir

>
> Also, AllocatePrefetchStyle = 2 seems to work fine. But to be sure, I've started an RBT run with all hotspot on all
> platforms using AllocatePrefetchStyle=2. So far no problems have shown up.
>
>> And I think Abstract_VM_Version::_reserve_for_allocation_prefetch should be set for all styles on all platforms to
>> avoid accessing beyond heap. Prefetch instructions doc say that they does not trap but we should be careful.
>
> I agree.
>
> That means we initialize _reserve_for_allocation_prefetch in a platform-independent way. So I think it would make sense
> to move that field to ThreadLocalAllocBuffer, as TLAB is the only user of that field and we don't support
> platform-independent initialization of Abstract_VM_Version. I did that in the updated webrev.
>
>>>
>>> The updated webrev does the following (in addition to fixing the original problem with AllocatePrefetchDistance):
>>>
>>> 1. Enforce AllocatePrefetchStyle = 3 if AllocatePrefetchInstr = 1 (i.e., BIS instructions are used for prefetching). As
>>> far as I understand, AllocatePrefetchStyle = 3 was added to support prefetching with BIS, so if BIS is enabled, we
>>> should use AllocatePrefetchStyle = 3.
>>
>> Correct - if BIS (AllocatePrefetchInstr = 1) is used we should select AllocatePrefetchStyle = 3.
>
> OK.
>
>> But we should allow AllocatePrefetchStyle = 3  if normal prefetch instructions (or other platforms) are used.
>
> OK, I've removed that restriction.
>
>> I think we should update comment in globals.hpp to say "generate one prefetch per cache line" without saying BIS.
>
> OK.
>
>>
>> But I agree if BIS is not available we should not use BIS AllocatePrefetchInstr = 1.
>
> OK.
>
>>
>>>
>>> 2. AllocatePrefetchStyle = 3 is SPARC-specific, so disallow it on non-SPARC platforms.
>>
>> It could be useful on other platforms since it does one access per cache line.
>
> OK, let's keep it available then.
>
>>
>>
>>>
>>> 3. Enforce that AllocatePrefetchStepSize is multiple of 8 if AllocatePrefetchStyle is 3 (due to alignment requirements).
>>
>> That is correct since stxa requires at least 8 bytes alignment (as stx).
>
> OK.
>
>>
>>>
>>> 3. Determine the number of lines to prefetch in the same way for all prefetch styles:
>>> lines = (prefecth instance allocation) ? AllocateInstancePrefetchLines : AllocatePrefetchLines
>>
>> Agree.
>
> OK.
>
>>
>>>
>>> Here is the updated webrev:
>>> http://cr.openjdk.java.net/~zmajo/8153340/webrev.01/
>>
>> vm_version_sparc.cpp
>> AllocatePrefetchInstr = 0 should be set for all styles (not only 1) when BIS is not available.
>
> OK. I think it is sufficient to do
>
> 52 if (!has_blk_init()) {
> 53 if (AllocatePrefetchInstr == 1) {
> 54 warning("BIS instructions required for AllocatePrefetchInstr 1 unavailable");
> 55 FLAG_SET_DEFAULT(AllocatePrefetchInstr, 0);
> 56 } I hope I'm not missing anything here. Here is the updated webrev:
> http://cr.openjdk.java.net/~zmajo/8153340/webrev.02/ Testing: - JPRT (incl. TestOptionsWithRanges); - local testing on
> SPARC; - all hotspot tests with AllocaPrefetchStyle=2 on all platforms. Thank you! Best regards, Zoltan
>
>>
>> Thanks,
>> Vladimir
>>
>>>
>>> Testing:
>>> - JPRT (incl. TestOptionsWithRanges)
>>> - local testing on a SPARC machine.
>>>
>>> Thank you!
>>>
>>> Best regards,
>>>
>>>
>>> Zoltan
>>>
>>>
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 4/21/16 4:30 AM, Zoltán Majó wrote:
>>>>> Hi,
>>>>>
>>>>>
>>>>> please review the patch for 8153340.
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8153340
>>>>>
>>>>>
>>>>> Problem: The VM crashes if AllocatePrefetchStyle==3 and AllocatePrefetchDistance==0. The crash happens due to the way
>>>>> the address for the first prefetch instruction is calculated [1]:
>>>>>
>>>>> If distance==0, cache_addr == old_eden_top. Then, cache_adr &= ~(AllocatePrefetchStepSize - 1) which can zero some of
>>>>> the bits of cache_adr. That result in accesses *before* the newly allocated object.
>>>>>
>>>>>
>>>>> Solution: Set lower limit of AllocatePrefetchDistance to AllocatePrefetchStepSize (for AllocatePrefetchStyle == 3).
>>>>> Unquarantine test.
>>>>>
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~zmajo/8153340/webrev.00/
>>>>>
>>>>> Testing:
>>>>> - JPRT (incl. TestOptionsWithRanges.java)
>>>>> - local testing on a SPARC machine.
>>>>>
>>>>> Thank you!
>>>>>
>>>>> Best regards,
>>>>>
>>>>>
>>>>> Zoltan
>>>>>
>>>>> [1] http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/f27c00e6f6bf/src/share/vm/opto/macro.cpp#l1941
>>>
>
-------------- next part --------------
An embedded message was scrubbed...
From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
Subject: Re: Java's use of BIS and RAW penalty
Date: Tue, 15 Mar 2016 11:51:01 -0700
Size: 7622
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20160427/ad0d0186/AttachedMessage-0001.mht>


More information about the hotspot-compiler-dev mailing list