[9] RFR (XS): 8153340: Incorrect lower bound for AllocatePrefetchDistance with AllocatePrefetchStyle=3

Vladimir Kozlov vladimir.kozlov at oracle.com
Thu Apr 28 23:29:26 UTC 2016


On 4/28/16 2:50 AM, Zoltán Majó wrote:
> Hi Vladimir,
>
>
> On 04/27/2016 11:55 PM, Vladimir Kozlov wrote:
>> [...]
>>> I agree. But I'd prefer we do that in the AllocatePrefetchStepSizeConstraintFunc() constraint function in
>>> commandLineFlagConstraintsCompiler.cpp). The reason is that since JEP-245 the preferred place to validate of
>>> command-line arguments is in constraint functions. Most compiler-related checks were moved there with JDK-8078554 and
>>> JDK-8146478.
>>
>> Okay. Usually we do platform specific flag's setting in vm_version_<arch>.cpp files but it looks like it start changing.
>> I am not supporter of having multiply #ifdef in shared code (in commandLineFlagConstraintsCompiler).
>
> JEP-245 proposed to have ranges/constraints defined for all flags. Some of these ranges/constraints are unavoidably architecture-specific.
>
> I agree with you that having lots of #ifdefs in shared code does not improve code readability. But I'm wondering what would be a good way to achieve the goals of JEP-245 without using too many
> #ifdefs. Maybe architecture-specific constraint files (e.g., commandLineFlagConstraintsCompiler_solaris.cpp, commandLineFlagConstraintsCompiler_x86.cpp, etc.)?

Yes, that will work.

>
> At least with the current change we won't add more #ifdefs...
>
>>
>>>
>>> I'd also like to set the minimum value for AllocatePrefetchDistance to AllocatePrefetchStepSize, otherwise we can access
>>> the heap before newly allocated objects/arrays.
>>
>> Only for style 3. And I said it may be better to change code.
>
> Yes, you've mentioned that in the first round of reviews and I missed your point in the subsequent rounds. Sorry for that.
>
> I've updated the code in macro.cpp and the constraint AllocatePrefetchDistanceConstraintFunc() as you've suggested.
>
>> [...]
>>>>
>>>> Agree. I think AllocatePrefetchStyle=2 is broken on all platforms - it should be used only if UseTLAB is true. Please,
>>>> look.
>>>
>>> It seems AllocatePrefetchStyle = 2 can be used only if UseTLAB is true.
>>> http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/6a17c49de974/src/share/vm/opto/macro.cpp#l1844
>>
>> I got report from it crashing on T7 (attached). May be it is because by default it used BIS with it.
>
> I think using BIS with AllocatePrefetchStyle=2 is indeed the cause.
>
> I've executed the following on our T7 machine with b115:
> * java -XX:AllocatePrefetchStyle=2 (uses AllocatePrefetchInstr=1 by default) -> crashes
> * java -XX:AllocatePrefetchStyle=2 -XX:AllocatePrefetchInstr=0 -> works
>
> With the newest webrev, both commands pass.
>
> Here is the newest webrev:
> http://cr.openjdk.java.net/~zmajo/8153340/webrev.03/

This looks good.

Thanks,
Vladimir

>
> I re-did testing with JPRT, the results look OK.
>
> Thank you!
>
> Best regards,
>
>
> Zoltan
>
>>
>> Thanks,
>> Vladimir
>>
>>>
>>> Also, AllocatePrefetchStyle = 2 seems to work fine. But to be sure, I've started an RBT run with all hotspot on all
>>> platforms using AllocatePrefetchStyle=2. So far no problems have shown up.
>>>
>>>> And I think Abstract_VM_Version::_reserve_for_allocation_prefetch should be set for all styles on all platforms to
>>>> avoid accessing beyond heap. Prefetch instructions doc say that they does not trap but we should be careful.
>>>
>>> I agree.
>>>
>>> That means we initialize _reserve_for_allocation_prefetch in a platform-independent way. So I think it would make sense
>>> to move that field to ThreadLocalAllocBuffer, as TLAB is the only user of that field and we don't support
>>> platform-independent initialization of Abstract_VM_Version. I did that in the updated webrev.
>>>
>>>>>
>>>>> The updated webrev does the following (in addition to fixing the original problem with AllocatePrefetchDistance):
>>>>>
>>>>> 1. Enforce AllocatePrefetchStyle = 3 if AllocatePrefetchInstr = 1 (i.e., BIS instructions are used for prefetching). As
>>>>> far as I understand, AllocatePrefetchStyle = 3 was added to support prefetching with BIS, so if BIS is enabled, we
>>>>> should use AllocatePrefetchStyle = 3.
>>>>
>>>> Correct - if BIS (AllocatePrefetchInstr = 1) is used we should select AllocatePrefetchStyle = 3.
>>>
>>> OK.
>>>
>>>> But we should allow AllocatePrefetchStyle = 3  if normal prefetch instructions (or other platforms) are used.
>>>
>>> OK, I've removed that restriction.
>>>
>>>> I think we should update comment in globals.hpp to say "generate one prefetch per cache line" without saying BIS.
>>>
>>> OK.
>>>
>>>>
>>>> But I agree if BIS is not available we should not use BIS AllocatePrefetchInstr = 1.
>>>
>>> OK.
>>>
>>>>
>>>>>
>>>>> 2. AllocatePrefetchStyle = 3 is SPARC-specific, so disallow it on non-SPARC platforms.
>>>>
>>>> It could be useful on other platforms since it does one access per cache line.
>>>
>>> OK, let's keep it available then.
>>>
>>>>
>>>>
>>>>>
>>>>> 3. Enforce that AllocatePrefetchStepSize is multiple of 8 if AllocatePrefetchStyle is 3 (due to alignment requirements).
>>>>
>>>> That is correct since stxa requires at least 8 bytes alignment (as stx).
>>>
>>> OK.
>>>
>>>>
>>>>>
>>>>> 3. Determine the number of lines to prefetch in the same way for all prefetch styles:
>>>>> lines = (prefecth instance allocation) ? AllocateInstancePrefetchLines : AllocatePrefetchLines
>>>>
>>>> Agree.
>>>
>>> OK.
>>>
>>>>
>>>>>
>>>>> Here is the updated webrev:
>>>>> http://cr.openjdk.java.net/~zmajo/8153340/webrev.01/
>>>>
>>>> vm_version_sparc.cpp
>>>> AllocatePrefetchInstr = 0 should be set for all styles (not only 1) when BIS is not available.
>>>
>>> OK. I think it is sufficient to do
>>>
>>> 52 if (!has_blk_init()) {
>>> 53 if (AllocatePrefetchInstr == 1) {
>>> 54 warning("BIS instructions required for AllocatePrefetchInstr 1 unavailable");
>>> 55 FLAG_SET_DEFAULT(AllocatePrefetchInstr, 0);
>>> 56 } I hope I'm not missing anything here. Here is the updated webrev:
>>> http://cr.openjdk.java.net/~zmajo/8153340/webrev.02/ Testing: - JPRT (incl. TestOptionsWithRanges); - local testing on
>>> SPARC; - all hotspot tests with AllocaPrefetchStyle=2 on all platforms. Thank you! Best regards, Zoltan
>>>
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>>>
>>>>> Testing:
>>>>> - JPRT (incl. TestOptionsWithRanges)
>>>>> - local testing on a SPARC machine.
>>>>>
>>>>> Thank you!
>>>>>
>>>>> Best regards,
>>>>>
>>>>>
>>>>> Zoltan
>>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>> On 4/21/16 4:30 AM, Zoltán Majó wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>>
>>>>>>> please review the patch for 8153340.
>>>>>>>
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8153340
>>>>>>>
>>>>>>>
>>>>>>> Problem: The VM crashes if AllocatePrefetchStyle==3 and AllocatePrefetchDistance==0. The crash happens due to the way
>>>>>>> the address for the first prefetch instruction is calculated [1]:
>>>>>>>
>>>>>>> If distance==0, cache_addr == old_eden_top. Then, cache_adr &= ~(AllocatePrefetchStepSize - 1) which can zero some of
>>>>>>> the bits of cache_adr. That result in accesses *before* the newly allocated object.
>>>>>>>
>>>>>>>
>>>>>>> Solution: Set lower limit of AllocatePrefetchDistance to AllocatePrefetchStepSize (for AllocatePrefetchStyle == 3).
>>>>>>> Unquarantine test.
>>>>>>>
>>>>>>> Webrev:
>>>>>>> http://cr.openjdk.java.net/~zmajo/8153340/webrev.00/
>>>>>>>
>>>>>>> Testing:
>>>>>>> - JPRT (incl. TestOptionsWithRanges.java)
>>>>>>> - local testing on a SPARC machine.
>>>>>>>
>>>>>>> Thank you!
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>>
>>>>>>> Zoltan
>>>>>>>
>>>>>>> [1] http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/f27c00e6f6bf/src/share/vm/opto/macro.cpp#l1941
>>>>>
>>>
>


More information about the hotspot-compiler-dev mailing list