(S) 8176518: [9] C2: Invalid ImplicitNullChecks with non-protected heap base

Doerr, Martin martin.doerr at sap.com
Thu Mar 16 15:31:14 UTC 2017


Hi,

thanks, Zoltan, for all your support.

I have spent some more time for experiments on Windows. UseLargePages only works for users with special privilege "SeLockMemoryPrivilege" which is kind of tricky to get.
The good news is that this bug is not problematic on Windows.
The function gen_narrow_oop_implicit_null_checks returns false if narrow_oop_use_implicit_null_checks is false (except for AIX), so the null check gets performed after the decode. C2 uses regular decode (0 gets decoded to 0, unlike decode_not_null) in this case, so the implicit null check hits the 0 page which is not accessible and the exception gets thrown.
I have tried changing gen_narrow_oop_implicit_null_checks and that allows reproducing the bug on Windows.
With the current implementation of gen_narrow_oop_implicit_null_checks, only AIX is exposed to the bug.

Best regards,
Martin


-----Original Message-----
From: Zoltán Majó [mailto:zoltan.majo at oracle.com] 
Sent: Mittwoch, 15. März 2017 15:40
To: Doerr, Martin <martin.doerr at sap.com>; 'hotspot-compiler-dev at openjdk.java.net' <hotspot-compiler-dev at openjdk.java.net>
Subject: Re: (S) 8176518: [9] C2: Invalid ImplicitNullChecks with non-protected heap base

Hi Martin,


On 03/15/2017 03:27 PM, Doerr, Martin wrote:
> Hi Zoltan,
>
> thank you very much for reviewing and for sponsoring.

thank you for spending time on this issue.

>
> I have updated the comments and added the test.
> We think it should be fine to run the test on all 64 bit platforms as it is fast and a little more testing in this area is not bad.
> If you don't agree, I can change it to run only on fewer platforms.

OK, let's run it on all platforms then.

> Please take a look.
>
> New webrev is here:
> http://cr.openjdk.java.net/~mdoerr/8176518_C2_NullCheck/webrev.01/

The updated webrev looks good to me. I'll start pre-integration testing 
now and (if nothing unexpected happens) push the change afterwards. I'll 
keep you updated.

Thank you!

Best regards,


Zoltan

>
> Thanks and best regards,
> Martin
>
>
> -----Original Message-----
> From: Zoltán Majó [mailto:zoltan.majo at oracle.com]
> Sent: Mittwoch, 15. März 2017 11:07
> To: Doerr, Martin <martin.doerr at sap.com>; 'hotspot-compiler-dev at openjdk.java.net' <hotspot-compiler-dev at openjdk.java.net>
> Subject: Re: (S) 8176518: [9] C2: Invalid ImplicitNullChecks with non-protected heap base
>
> Hi Martin,
>
>
> thank you for looking into this issue! Your fix looks good to me in
> general. Here are a few minor comments that you could maybe consider.
>
> - Could you please update the comment on line 261 to describe that we
> skip generating the implicit null check if the heap base is not protected?
>
>    260 !Universe::narrow_oop_use_implicit_null_checks()))
>    261           continue;             // Give up if offset is beyond
> page size
>
> - Could you please update the comment on line 279 to be more consistent
> with other comments in the same code region. Something along the lines
> "Give up if operation is a DecodeN and the heap base is not protected"
> could do. I don't see anything wrong with the check of the graph's shape.
>
> - Could you please change comment
>
>    278           continue;             // Give up i*s* reference is
> beyond 4K page size
>
> to
>
>    278           continue;             // Give up i*f* reference is
> beyond 4K page size
>
> - Could you please add the test case you have to the fix you're
> proposing? I'm not sure if you need to record the duration of the test's
> execution. Maybe you can reduce the number of iterations -- 20K could be
> sufficient. Could you please make the test throw an exception if the
> NullPointerException is not thrown? And finally, can you force the test
> to be executed only on AIX/Windows and with the parameters you mentioned
> in the RFR (i.e., -XX:HeapBaseMinAddress=32g on AIX and
> -XX:ObjectAlignmentInBytes=16 -XX:+UseLargePages on Windows)
>
> I can do the pre-integration testing for the change and then sponsor it.
>
> Thank you!
>
> Best regards,
>
>
> Zoltan
>
> On 03/14/2017 02:35 PM, Doerr, Martin wrote:
>> Hi,
>>
>> I have experimented on x86 and I get a very similar graph when using
>> -XX:ObjectAlignmentInBytes=16 (prevents oop decoding matched into
>> operands) in addition to the flags below under [1].
>>
>> If -XX:+UseLargePages (which is used to skip the heap base protection)
>> is supported on Windows, the problem exists there as well.
>>
>> As the graph pattern on PPC64 is the same as on x86, I believe that
>> there's nothing wrong except the missing check for
>> narrow_oop_use_implicit_null_checks(), which is added by the webrev below.
>>
>> The remaining question is if there's a better test than
>> "base->as_Mach()->ideal_Opcode() == Op_DecodeN".
>>
>> I will also need a sponsor for this change, please.
>>
>> Thanks and best regards,
>>
>> Martin
>>
>> *From:*hotspot-compiler-dev
>> [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] *On Behalf Of
>> *Doerr, Martin
>> *Sent:* Freitag, 10. März 2017 18:49
>> *To:* 'hotspot-compiler-dev at openjdk.java.net'
>> <hotspot-compiler-dev at openjdk.java.net>
>> *Subject:* RFR: (S) 8176518: [9] C2: Invalid ImplicitNullChecks with
>> non-protected heap base
>>
>> Hi,
>>
>> we have observed that C2 generates implicit null checks for write
>> accesses to the heap base even though it was not protected
>> (narrow_oop_use_implicit_null_checks = false).
>>
>> The problem showed up on AIX with -XX:HeapBaseMinAddress=32g.
>> ReservedHeapSpace::establish_noaccess_prefix protects the base area on
>> almost all platforms, but not on AIX (in some case).
>>
>> PhaseCFG::implicit_null_check needs to skip the transformation in this
>> case.
>>
>> The problem can be reproduced by the simple test program below under
>> [1]. The null pointer exception is just missing when running with the
>> given parameters.
>>
>> The problem can be prevented by applying this patch:
>>
>> http://cr.openjdk.java.net/~mdoerr/8176518_C2_NullCheck/webrev.00/
>> <http://cr.openjdk.java.net/%7Emdoerr/8176518_C2_NullCheck/webrev.00/>
>>
>> Especially, the case "base->is_Mach() &&
>> base->as_Mach()->ideal_Opcode() == Op_DecodeN" gets hit. I'm not sure
>> if this is allowed at that place. Maybe something went wrong before.
>>
>> The node "base" is a storeI and "addr"="base" is a decodeN node (the
>> "index" retrieved by mach->memory_inputs is NULL).
>>
>> The patch helps, but is there a better way to fix the problem?
>>
>> Thanks and best regards,
>>
>> Martin
>>
>> [1] Reproduction case:
>>
>> /jdk/bin/java -XX:HeapBaseMinAddress=32g -XX:-TieredCompilation
>> -Xbatch TestWriteNPE
>>
>> *public**class*TestWriteNPE{
>>
>> TestWriteNPE instance = *null*;
>>
>> *int*i = 0;
>>
>> *public**void*bogus_test(*int*value) {
>>
>> instance.i = value;
>>
>>      }
>>
>> *static**final**int*loop_cnt=100000;
>>
>> *public**static**void*main(String args[]){
>>
>> TestWriteNPE xyz=*new*TestWriteNPE();
>>
>> xyz.instance = xyz;
>>
>> *long*duration = System.nanoTime();
>>
>> *for*(*int*x=0; x<loop_cnt; x++) xyz.bogus_test(x);
>>
>> duration = System.nanoTime() - duration;
>>
>> System.out.println("value: "+ xyz.i + " (duration: "+
>> duration/loop_cnt + ")");
>>
>> xyz.instance = *null*;
>>
>> *try*{
>>
>> xyz.bogus_test(0);
>>
>> } *catch*(NullPointerException np) {
>>
>> System.out.println("Got NPE:");
>>
>> np.printStackTrace();
>>
>>          }
>>
>>      }
>>
>> }
>>



More information about the hotspot-compiler-dev mailing list