[9] RFR (XS): 8153340: Incorrect lower bound for AllocatePrefetchDistance with AllocatePrefetchStyle=3
Zoltán Majó
zoltan.majo at oracle.com
Wed Apr 27 14:20:08 UTC 2016
Hi Vladimir,
thank you for the feedback!
On 04/27/2016 12:39 AM, Vladimir Kozlov wrote:
> On 4/26/16 4:56 AM, Zoltán Majó wrote:
>> Hi Vladimir,
>>
>>
>> thank you for the feedback! Please see comments below.
>>
>> On 04/22/2016 12:37 AM, Vladimir Kozlov wrote:
>>> Hi, Zoltan
>>>
>>> I think we should change code in prefetch_allocation() instead:
>>>
>>> Node *cache_adr = new AddPNode(old_eden_top, old_eden_top,
>>> _igvn.MakeConX(step_size + distance));
>>
>> The problem is that AllocatePrefetchStyle must align the first
>> prefetched address to 8 bytes, otherwise the emitted stxa
>> instruction could cause a SIGBUS. But the alignment does not have to
>> be at step_size boundary, 8-byte alignment is
>> sufficient.
>
> Actually it has to be step_size (cache line size) aligned - BIS
> instruction is triggered when the address of stxa is the beginning of
> cache line for AllocatePrefetchStyle == 3. If it is just 8 bytes
> aligned it will be simple store.
Thank you for clarifying.
> We should require AllocatePrefetchStepSize to be 8 bytes aligned in
> vm_version_sparc.cpp instead of:
> + // BIS instructions require 8-byte aligned addresses
> + Node* mask = _igvn.MakeConX(~(intptr_t)(wordSize - 1));
I agree. But I'd prefer we do that in the
AllocatePrefetchStepSizeConstraintFunc() constraint function in
commandLineFlagConstraintsCompiler.cpp). The reason is that since
JEP-245 the preferred place to validate of command-line arguments is in
constraint functions. Most compiler-related checks were moved there with
JDK-8078554 and JDK-8146478.
I'd also like to set the minimum value for AllocatePrefetchDistance to
AllocatePrefetchStepSize, otherwise we can access the heap before newly
allocated objects/arrays.
>
>
>>
>>>
>>> These way we allow AllocatePrefetchDistance == 0 in all
>>> AllocatePrefetchStyle cases - it is consistent.
>>
>> Unfortunately, it is not easy to have the same limit for
>> AllocatePrefetchDistance on all platforms. Due to the 8-byte
>> alignment performed by compiled code, the lower limit of 8 for
>> AllocatePrefetchDistance is needed on SPARC; the lower
>> limit of 0 works fine on all other platforms.
>>
>> I've started looking at the consistency of flags controlling
>> allocation prefetch in general, as we have other issues
>> open related to them (e.g., JDK-8151622). We're touching related code
>> now, so I thought, maybe it makes sense to fix all
>> remaining issues at once.
>
> Agree. I think AllocatePrefetchStyle=2 is broken on all platforms - it
> should be used only if UseTLAB is true. Please, look.
It seems AllocatePrefetchStyle = 2 can be used only if UseTLAB is true.
http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/6a17c49de974/src/share/vm/opto/macro.cpp#l1844
Also, AllocatePrefetchStyle = 2 seems to work fine. But to be sure, I've
started an RBT run with all hotspot on all platforms using
AllocatePrefetchStyle=2. So far no problems have shown up.
> And I think Abstract_VM_Version::_reserve_for_allocation_prefetch
> should be set for all styles on all platforms to avoid accessing
> beyond heap. Prefetch instructions doc say that they does not trap but
> we should be careful.
I agree.
That means we initialize _reserve_for_allocation_prefetch in a
platform-independent way. So I think it would make sense to move that
field to ThreadLocalAllocBuffer, as TLAB is the only user of that field
and we don't support platform-independent initialization of
Abstract_VM_Version. I did that in the updated webrev.
>>
>> The updated webrev does the following (in addition to fixing the
>> original problem with AllocatePrefetchDistance):
>>
>> 1. Enforce AllocatePrefetchStyle = 3 if AllocatePrefetchInstr = 1
>> (i.e., BIS instructions are used for prefetching). As
>> far as I understand, AllocatePrefetchStyle = 3 was added to support
>> prefetching with BIS, so if BIS is enabled, we
>> should use AllocatePrefetchStyle = 3.
>
> Correct - if BIS (AllocatePrefetchInstr = 1) is used we should select
> AllocatePrefetchStyle = 3.
OK.
> But we should allow AllocatePrefetchStyle = 3 if normal prefetch
> instructions (or other platforms) are used.
OK, I've removed that restriction.
> I think we should update comment in globals.hpp to say "generate one
> prefetch per cache line" without saying BIS.
OK.
>
> But I agree if BIS is not available we should not use BIS
> AllocatePrefetchInstr = 1.
OK.
>
>>
>> 2. AllocatePrefetchStyle = 3 is SPARC-specific, so disallow it on
>> non-SPARC platforms.
>
> It could be useful on other platforms since it does one access per
> cache line.
OK, let's keep it available then.
>
>
>>
>> 3. Enforce that AllocatePrefetchStepSize is multiple of 8 if
>> AllocatePrefetchStyle is 3 (due to alignment requirements).
>
> That is correct since stxa requires at least 8 bytes alignment (as stx).
OK.
>
>>
>> 3. Determine the number of lines to prefetch in the same way for all
>> prefetch styles:
>> lines = (prefecth instance allocation) ?
>> AllocateInstancePrefetchLines : AllocatePrefetchLines
>
> Agree.
OK.
>
>>
>> Here is the updated webrev:
>> http://cr.openjdk.java.net/~zmajo/8153340/webrev.01/
>
> vm_version_sparc.cpp
> AllocatePrefetchInstr = 0 should be set for all styles (not only 1)
> when BIS is not available.
OK. I think it is sufficient to do
52 if (!has_blk_init()) {
53 if (AllocatePrefetchInstr == 1) {
54 warning("BIS instructions required for AllocatePrefetchInstr 1
unavailable");
55 FLAG_SET_DEFAULT(AllocatePrefetchInstr, 0);
56 } I hope I'm not missing anything here. Here is the updated webrev:
http://cr.openjdk.java.net/~zmajo/8153340/webrev.02/ Testing: - JPRT
(incl. TestOptionsWithRanges); - local testing on SPARC; - all hotspot
tests with AllocaPrefetchStyle=2 on all platforms. Thank you! Best
regards, Zoltan
>
> Thanks,
> Vladimir
>
>>
>> Testing:
>> - JPRT (incl. TestOptionsWithRanges)
>> - local testing on a SPARC machine.
>>
>> Thank you!
>>
>> Best regards,
>>
>>
>> Zoltan
>>
>>
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 4/21/16 4:30 AM, Zoltán Majó wrote:
>>>> Hi,
>>>>
>>>>
>>>> please review the patch for 8153340.
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8153340
>>>>
>>>>
>>>> Problem: The VM crashes if AllocatePrefetchStyle==3 and
>>>> AllocatePrefetchDistance==0. The crash happens due to the way
>>>> the address for the first prefetch instruction is calculated [1]:
>>>>
>>>> If distance==0, cache_addr == old_eden_top. Then, cache_adr &=
>>>> ~(AllocatePrefetchStepSize - 1) which can zero some of
>>>> the bits of cache_adr. That result in accesses *before* the newly
>>>> allocated object.
>>>>
>>>>
>>>> Solution: Set lower limit of AllocatePrefetchDistance to
>>>> AllocatePrefetchStepSize (for AllocatePrefetchStyle == 3).
>>>> Unquarantine test.
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~zmajo/8153340/webrev.00/
>>>>
>>>> Testing:
>>>> - JPRT (incl. TestOptionsWithRanges.java)
>>>> - local testing on a SPARC machine.
>>>>
>>>> Thank you!
>>>>
>>>> Best regards,
>>>>
>>>>
>>>> Zoltan
>>>>
>>>> [1]
>>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/f27c00e6f6bf/src/share/vm/opto/macro.cpp#l1941
>>
More information about the hotspot-compiler-dev
mailing list