[9] RFR (XS): 8153340: Incorrect lower bound for AllocatePrefetchDistance with AllocatePrefetchStyle=3

Zoltán Majó zoltan.majo at oracle.com
Wed Apr 27 14:20:08 UTC 2016


Hi Vladimir,


thank you for the feedback!

On 04/27/2016 12:39 AM, Vladimir Kozlov wrote:
> On 4/26/16 4:56 AM, Zoltán Majó wrote:
>> Hi Vladimir,
>>
>>
>> thank you for the feedback! Please see comments below.
>>
>> On 04/22/2016 12:37 AM, Vladimir Kozlov wrote:
>>> Hi, Zoltan
>>>
>>> I think we should change code in prefetch_allocation() instead:
>>>
>>>       Node *cache_adr = new AddPNode(old_eden_top, old_eden_top,
>>> _igvn.MakeConX(step_size + distance));
>>
>> The problem is that AllocatePrefetchStyle must align the first 
>> prefetched address to 8 bytes, otherwise the emitted stxa
>> instruction could cause a SIGBUS. But the alignment does not have to 
>> be at step_size boundary, 8-byte alignment is
>> sufficient.
>
> Actually it has to be step_size (cache line size) aligned - BIS 
> instruction is triggered when the address of stxa is the beginning of 
> cache line for AllocatePrefetchStyle == 3. If it is just 8 bytes 
> aligned it will be simple store.

Thank you for clarifying.

> We should require AllocatePrefetchStepSize to be 8 bytes aligned in 
> vm_version_sparc.cpp instead of:
> +       // BIS instructions require 8-byte aligned addresses
> +       Node* mask = _igvn.MakeConX(~(intptr_t)(wordSize - 1));

I agree. But I'd prefer we do that in the 
AllocatePrefetchStepSizeConstraintFunc() constraint function in 
commandLineFlagConstraintsCompiler.cpp). The reason is that since 
JEP-245 the preferred place to validate of command-line arguments is in 
constraint functions. Most compiler-related checks were moved there with 
JDK-8078554 and JDK-8146478.

I'd also like to set the minimum value for AllocatePrefetchDistance to 
AllocatePrefetchStepSize, otherwise we can access the heap before newly 
allocated objects/arrays.

>
>
>>
>>>
>>> These way we allow AllocatePrefetchDistance == 0 in all 
>>> AllocatePrefetchStyle cases - it is consistent.
>>
>> Unfortunately, it is not easy to have the same limit for 
>> AllocatePrefetchDistance on all platforms. Due to the 8-byte
>> alignment performed by compiled code, the lower limit of 8 for 
>> AllocatePrefetchDistance is needed on SPARC; the lower
>> limit of 0 works fine on all other platforms.
>>
>> I've started looking at the consistency of flags controlling 
>> allocation prefetch in general, as we have other issues
>> open related to them (e.g., JDK-8151622). We're touching related code 
>> now, so I thought, maybe it makes sense to fix all
>> remaining issues at once.
>
> Agree. I think AllocatePrefetchStyle=2 is broken on all platforms - it 
> should be used only if UseTLAB is true. Please, look.

It seems AllocatePrefetchStyle = 2 can be used only if UseTLAB is true.
http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/6a17c49de974/src/share/vm/opto/macro.cpp#l1844

Also, AllocatePrefetchStyle = 2 seems to work fine. But to be sure, I've 
started an RBT run with all hotspot on all platforms using 
AllocatePrefetchStyle=2. So far no problems have shown up.

> And I think Abstract_VM_Version::_reserve_for_allocation_prefetch 
> should be set for all styles on all platforms to avoid accessing 
> beyond heap. Prefetch instructions doc say that they does not trap but 
> we should be careful.

I agree.

That means we initialize _reserve_for_allocation_prefetch in a 
platform-independent way. So I think it would make sense to move that 
field to ThreadLocalAllocBuffer, as TLAB is the only user of that field 
and we don't support platform-independent initialization of 
Abstract_VM_Version. I did that in the updated webrev.

>>
>> The updated webrev does the following (in addition to fixing the 
>> original problem with AllocatePrefetchDistance):
>>
>> 1. Enforce AllocatePrefetchStyle = 3 if AllocatePrefetchInstr = 1 
>> (i.e., BIS instructions are used for prefetching). As
>> far as I understand, AllocatePrefetchStyle = 3 was added to support 
>> prefetching with BIS, so if BIS is enabled, we
>> should use AllocatePrefetchStyle = 3.
>
> Correct - if BIS (AllocatePrefetchInstr = 1) is used we should select 
> AllocatePrefetchStyle = 3.

OK.

> But we should allow AllocatePrefetchStyle = 3  if normal prefetch 
> instructions (or other platforms) are used.

OK, I've removed that restriction.

> I think we should update comment in globals.hpp to say "generate one 
> prefetch per cache line" without saying BIS.

OK.

>
> But I agree if BIS is not available we should not use BIS 
> AllocatePrefetchInstr = 1.

OK.

>
>>
>> 2. AllocatePrefetchStyle = 3 is SPARC-specific, so disallow it on 
>> non-SPARC platforms.
>
> It could be useful on other platforms since it does one access per 
> cache line.

OK, let's keep it available then.

>
>
>>
>> 3. Enforce that AllocatePrefetchStepSize is multiple of 8 if 
>> AllocatePrefetchStyle is 3 (due to alignment requirements).
>
> That is correct since stxa requires at least 8 bytes alignment (as stx).

OK.

>
>>
>> 3. Determine the number of lines to prefetch in the same way for all 
>> prefetch styles:
>> lines = (prefecth instance allocation) ? 
>> AllocateInstancePrefetchLines : AllocatePrefetchLines
>
> Agree.

OK.

>
>>
>> Here is the updated webrev:
>> http://cr.openjdk.java.net/~zmajo/8153340/webrev.01/
>
> vm_version_sparc.cpp
> AllocatePrefetchInstr = 0 should be set for all styles (not only 1) 
> when BIS is not available.

OK. I think it is sufficient to do

52 if (!has_blk_init()) {
53 if (AllocatePrefetchInstr == 1) {
54 warning("BIS instructions required for AllocatePrefetchInstr 1 
unavailable");
55 FLAG_SET_DEFAULT(AllocatePrefetchInstr, 0);
56 } I hope I'm not missing anything here. Here is the updated webrev: 
http://cr.openjdk.java.net/~zmajo/8153340/webrev.02/ Testing: - JPRT 
(incl. TestOptionsWithRanges); - local testing on SPARC; - all hotspot 
tests with AllocaPrefetchStyle=2 on all platforms. Thank you! Best 
regards, Zoltan

>
> Thanks,
> Vladimir
>
>>
>> Testing:
>> - JPRT (incl. TestOptionsWithRanges)
>> - local testing on a SPARC machine.
>>
>> Thank you!
>>
>> Best regards,
>>
>>
>> Zoltan
>>
>>
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 4/21/16 4:30 AM, Zoltán Majó wrote:
>>>> Hi,
>>>>
>>>>
>>>> please review the patch for 8153340.
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8153340
>>>>
>>>>
>>>> Problem: The VM crashes if AllocatePrefetchStyle==3 and 
>>>> AllocatePrefetchDistance==0. The crash happens due to the way
>>>> the address for the first prefetch instruction is calculated [1]:
>>>>
>>>> If distance==0, cache_addr == old_eden_top. Then, cache_adr &= 
>>>> ~(AllocatePrefetchStepSize - 1) which can zero some of
>>>> the bits of cache_adr. That result in accesses *before* the newly 
>>>> allocated object.
>>>>
>>>>
>>>> Solution: Set lower limit of AllocatePrefetchDistance to 
>>>> AllocatePrefetchStepSize (for AllocatePrefetchStyle == 3).
>>>> Unquarantine test.
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~zmajo/8153340/webrev.00/
>>>>
>>>> Testing:
>>>> - JPRT (incl. TestOptionsWithRanges.java)
>>>> - local testing on a SPARC machine.
>>>>
>>>> Thank you!
>>>>
>>>> Best regards,
>>>>
>>>>
>>>> Zoltan
>>>>
>>>> [1] 
>>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/f27c00e6f6bf/src/share/vm/opto/macro.cpp#l1941
>>



More information about the hotspot-compiler-dev mailing list