From igor.veresov at oracle.com  Wed Jul  1 02:06:41 2020
From: igor.veresov at oracle.com (Igor Veresov)
Date: Tue, 30 Jun 2020 19:06:41 -0700
Subject: RFR 8248043: Need to eliminate excessive i2l conversions
In-Reply-To: <0be466e7-057c-b029-3461-de21d9cd3910@bell-sw.com>
References: <096e0df7-8208-2a07-975f-e2de8bc27e3a@bell-sw.com>
 <75920e44-518e-10e0-53b3-c2a6f85fd841@oracle.com>
 <0be466e7-057c-b029-3461-de21d9cd3910@bell-sw.com>
Message-ID: <AA66D907-6ADF-44DB-8ADE-2D16D3CF6AC5@oracle.com>

I think you forgot to include changes to BoolNode in the webrev.

igor


> On Jun 30, 2020, at 11:04 AM, Boris Ulasevich <boris.ulasevich at bell-sw.com> wrote:
> 
> Hi Claes,
> 
> > Seems like the optimization is mostly effective, but not getting all the way.
> 
> Good point about LHS, thanks! It turned out that CmpL is not canonicalized at the moment.
> I moved the optimization to CmpLNode::Ideal and the transformations now work as follows:
> 1. CmpINode::Ideal: CmpI(CmpL3)->CmpL
> 2. BoolNode::Ideal: Bool(CmpL(const,val),test)->Bool(CmpL(val,const),test_invert)
> 3. CmpLNode::Ideal: CmpL(ConvI2L(val),ConL)->CmpI(val,ConI)
> 
> I applied your test to the benchmark. The result is:
> Benchmark                            Mode  Cnt   Score   Error Units
> SkipIntToLongCast.skipCastTestLeft   avgt    5  14.288 ± 0.052 ns/op
> SkipIntToLongCast.skipCastTestRight  avgt    5  14.338 ± 0.088 ns/op
> 
> Updated webrev:
> http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02b
> 
> thanks,
> Boris
> 
> On 26.06.2020 21:31, Claes Redestad wrote:
>> Hi Boris,
>> 
>> this looks like a nice improvement! I just have some comments about the
>> micro.
>> 
>> I was curious whether the optimization works when the constant is on
>> the LHS and added a variant of the micro to try that[1]. Results are
>> interesting (Intel Xeon):
>> 
>> Benchmark                            Mode  Cnt   Score   Error Units
>> SkipIntToLongCast.skipCastTest       avgt    5  30.937 ± 0.056 ns/op
>> SkipIntToLongCast.skipCastTestLeft   avgt    5  30.937 ± 0.140 ns/op
>> 
>> With your patch:
>> Benchmark                            Mode  Cnt   Score   Error Units
>> SkipIntToLongCast.skipCastTest       avgt    5  14.123 ± 0.035 ns/op
>> SkipIntToLongCast.skipCastTestLeft   avgt    5  17.420 ± 0.044 ns/op
>> 
>> Seems like the optimization is mostly effective, but not getting all
>> the way. I wouldn't worry about it for this RFE, but perhaps something
>> to investigate in a follow-up. Feel free to include such a variant in
>> your patch though (no attribution necessary).
>> 
>> The micro also stabilizes very quickly, so you might want to provide
>> some default tuning to keep runtime in check, e.g., something like:
>> 
>> @Warmup(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
>> @Measurement(iterations = 5, time = 1000, timeUnit = TimeUnit.MILLISECONDS)
>> @Fork(3)
>> 
>> Thanks!
>> 
>> /Claes
>> 
>> [1]
>>     @Benchmark
>>     public int skipCastTestLeft() {
>>         for (int i = 0; i < ARRAYSIZE_L; i++) {
>>             if (ARRAYSIZE_L == intValues[i]) {
>>                 return i;
>>             }
>>         }
>>         return 0;
>>     }
>> 
>> On 2020-06-26 17:05, Boris Ulasevich wrote:
>>> Hi all,
>>> 
>>> Please review the change to eliminate the unnecessary i2l conversion
>>> for expressions like this: "if (intValue == 1L)".
>>> 
>>> http://bugs.openjdk.java.net/browse/JDK-8248043
>>> http://cr.openjdk.java.net/~bulasevich/8248043/webrev.01
>>> 
>>> The provided benchmark shows performance boost on all platforms:
>>> - Intel Xeon: 32.705 --> 14.234 ns/op
>>> - arm64: 42.060 --> 25.456 ns/op
>>> - arm32: 618.763 --> 314.040 ns/op
>>> - ppc8:  81.218 --> 63.026 ns/op
>>> 
>>> Testing done: jtreg, jck.
>>> 
>>> thanks,
>>> Boris
> 


From vladimir.kozlov at oracle.com  Wed Jul  1 03:12:00 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 30 Jun 2020 20:12:00 -0700
Subject: [16] RFR(T) 8005088: remove unused NativeInstruction::test methods
Message-ID: <42cfbb51-4fe4-382f-6e8d-f740890df2db@oracle.com>

https://cr.openjdk.java.net/~kvn/8005088/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8005088

Only SPARC implemented NativeInstruction::test() methods [1], and they were removed in JDK 15 along with the SPARC port.

I think we can remove them on other platforms where they are not implemented.
If someone wants to recreate the tests, they should do so as gtest tests.

Thanks,
Vladimir

[1] https://hg.openjdk.java.net/jdk/jdk14/file/6c954123ee8d/src/hotspot/cpu/sparc/nativeInst_sparc.cpp#l356

From igor.ignatyev at oracle.com  Wed Jul  1 03:15:11 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 30 Jun 2020 20:15:11 -0700
Subject: [16] RFR(T) 8005088: remove unused NativeInstruction::test methods
In-Reply-To: <42cfbb51-4fe4-382f-6e8d-f740890df2db@oracle.com>
References: <42cfbb51-4fe4-382f-6e8d-f740890df2db@oracle.com>
Message-ID: <6CBA6B2A-51A4-445E-A949-60132363E079@oracle.com>

Hi Vladimir,

the years in copyright notices should be updated, otherwise looks good to me, thanks for taking care of it.

-- Igor



From vladimir.kozlov at oracle.com  Wed Jul  1 03:23:24 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 30 Jun 2020 20:23:24 -0700
Subject: [16] RFR(T) 8005088: remove unused NativeInstruction::test methods
In-Reply-To: <6CBA6B2A-51A4-445E-A949-60132363E079@oracle.com>
References: <42cfbb51-4fe4-382f-6e8d-f740890df2db@oracle.com>
 <6CBA6B2A-51A4-445E-A949-60132363E079@oracle.com>
Message-ID: <c9a6b384-448b-f8f1-3af8-29d6740dd067@oracle.com>

Thank you, Igor

On 6/30/20 8:15 PM, Igor Ignatyev wrote:
> Hi Vladimir,
> 
> the years in copyright notices should be updated, otherwise looks good to me, thanks for taking care of it.

Updated webrev.00 with correct years.

Vladimir


From boris.ulasevich at bell-sw.com  Wed Jul  1 04:33:04 2020
From: boris.ulasevich at bell-sw.com (Boris Ulasevich)
Date: Wed, 1 Jul 2020 07:33:04 +0300
Subject: RFR 8248043: Need to eliminate excessive i2l conversions
In-Reply-To: <AA66D907-6ADF-44DB-8ADE-2D16D3CF6AC5@oracle.com>
References: <096e0df7-8208-2a07-975f-e2de8bc27e3a@bell-sw.com>
 <75920e44-518e-10e0-53b3-c2a6f85fd841@oracle.com>
 <0be466e7-057c-b029-3461-de21d9cd3910@bell-sw.com>
 <AA66D907-6ADF-44DB-8ADE-2D16D3CF6AC5@oracle.com>
Message-ID: <CAOhyNwDQGnDk5VYR=xLEW5KGJLtJiWTZLO9xnVhhKUXJR9UOog@mail.gmail.com>

Hi Igor,

By BoolNode I mean the canonicalization that is already in place:
https://hg.openjdk.java.net/jdk/jdk/file/de6ad5f86276/src/hotspot/share/opto/subnode.cpp#l1391

thanks,
Boris


From boris.ulasevich at bell-sw.com  Wed Jul  1 04:38:36 2020
From: boris.ulasevich at bell-sw.com (Boris Ulasevich)
Date: Wed, 1 Jul 2020 07:38:36 +0300
Subject: RFR 8248043: Need to eliminate excessive i2l conversions
In-Reply-To: <15778367-9a55-8fd2-353b-21927650125d@oracle.com>
References: <096e0df7-8208-2a07-975f-e2de8bc27e3a@bell-sw.com>
 <75920e44-518e-10e0-53b3-c2a6f85fd841@oracle.com>
 <0be466e7-057c-b029-3461-de21d9cd3910@bell-sw.com>
 <13148df7-502b-bd1e-5aa0-fb7a9244cddc@oracle.com>
 <15778367-9a55-8fd2-353b-21927650125d@oracle.com>
Message-ID: <CAOhyNwDRdrz0_4KHJg7Cgxs0cZk4Nqns8pyiedHbwrLV9++BnA@mail.gmail.com>

Hi Claes,

Ok. Thank you for your review!

Best regards,
Boris

On Wed, Jul 1, 2020 at 12:30 AM Claes Redestad <claes.redestad at oracle.com>
wrote:

> +1
>
> Maybe add tests for reversed variants to TestSkipLongToIntCast too? No
> need for a new webrev if you do.
>
> /Claes
>
> On 2020-06-30 23:13, Vladimir Kozlov wrote:
> > Good optimization. Reviewed.
> >
> > Thanks,
> > Vladimir

From vladimir.kozlov at oracle.com  Wed Jul  1 05:15:02 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 30 Jun 2020 22:15:02 -0700
Subject: RFR 8248043: Need to eliminate excessive i2l conversions
In-Reply-To: <CAOhyNwDQGnDk5VYR=xLEW5KGJLtJiWTZLO9xnVhhKUXJR9UOog@mail.gmail.com>
References: <096e0df7-8208-2a07-975f-e2de8bc27e3a@bell-sw.com>
 <75920e44-518e-10e0-53b3-c2a6f85fd841@oracle.com>
 <0be466e7-057c-b029-3461-de21d9cd3910@bell-sw.com>
 <AA66D907-6ADF-44DB-8ADE-2D16D3CF6AC5@oracle.com>
 <CAOhyNwDQGnDk5VYR=xLEW5KGJLtJiWTZLO9xnVhhKUXJR9UOog@mail.gmail.com>
Message-ID: <d4b7547a-02d5-fcad-5898-ce7f89383fb0@oracle.com>

I think Igor said that you can't swap arguments of compare without changing condition test. For example, if it was CC_LT 
it should be CC_GT after swap.
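
To illustrate the point about swapping compare operands, here is a minimal plain-Java sketch. It is not HotSpot code: the class name, the `Test` enum and its `commute()` helper are hypothetical stand-ins for the condition codes mentioned above. It checks, over a few sample values, that swapping the operands while keeping the test is only safe for the symmetric tests (equal, not equal), while swapping together with the mirrored test is safe for all of them.

```java
// Illustrative model only; names are hypothetical and do not mirror HotSpot's BoolTest.
final class CompareSwapDemo {
    enum Test {
        EQ, NE, LT, LE, GT, GE;

        // Evaluate "a <test> b" on long operands.
        boolean eval(long a, long b) {
            switch (this) {
                case EQ: return a == b;
                case NE: return a != b;
                case LT: return a < b;
                case LE: return a <= b;
                case GT: return a > b;
                default: return a >= b; // GE
            }
        }

        // The test that gives the same answer once the operands are swapped.
        Test commute() {
            switch (this) {
                case LT: return GT;
                case LE: return GE;
                case GT: return LT;
                case GE: return LE;
                default: return this; // EQ and NE are symmetric
            }
        }
    }

    // Swap operands but keep the test unchanged: is the result preserved on all samples?
    static boolean swapWithoutCommuteIsSafe(Test t, long[] samples) {
        for (long a : samples) {
            for (long b : samples) {
                if (t.eval(a, b) != t.eval(b, a)) {
                    return false;
                }
            }
        }
        return true;
    }

    // Swap operands and mirror the test: is the result preserved on all samples?
    static boolean swapWithCommuteIsSafe(Test t, long[] samples) {
        for (long a : samples) {
            for (long b : samples) {
                if (t.eval(a, b) != t.commute().eval(b, a)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long[] samples = {-1L, 0L, 1L, 42L};
        for (Test t : Test.values()) {
            System.out.println(t
                + ": plain swap safe=" + swapWithoutCommuteIsSafe(t, samples)
                + ", swap with mirrored test safe=" + swapWithCommuteIsSafe(t, samples));
        }
    }
}
```

Note that the mirrored test (LT becomes GT) is not the logical negation (LT becomes GE); only the former preserves the result after a swap.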

It is not clear why you need swapping in CmpLNode::Ideal() if BoolNode::Ideal() should do it already. If it does not you 
need to investigate why.

Also your list of steps 1.-3. does not reflect changes in webrev.02b:
http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02b/src/hotspot/share/opto/subnode.cpp.udiff.html

Regards,
Vladimir


From igor.veresov at oracle.com  Wed Jul  1 05:28:48 2020
From: igor.veresov at oracle.com (Igor Veresov)
Date: Tue, 30 Jun 2020 22:28:48 -0700
Subject: RFR 8248043: Need to eliminate excessive i2l conversions
In-Reply-To: <d4b7547a-02d5-fcad-5898-ce7f89383fb0@oracle.com>
References: <096e0df7-8208-2a07-975f-e2de8bc27e3a@bell-sw.com>
 <75920e44-518e-10e0-53b3-c2a6f85fd841@oracle.com>
 <0be466e7-057c-b029-3461-de21d9cd3910@bell-sw.com>
 <AA66D907-6ADF-44DB-8ADE-2D16D3CF6AC5@oracle.com>
 <CAOhyNwDQGnDk5VYR=xLEW5KGJLtJiWTZLO9xnVhhKUXJR9UOog@mail.gmail.com>
 <d4b7547a-02d5-fcad-5898-ce7f89383fb0@oracle.com>
Message-ID: <E8DCC635-1EEB-4A5D-92D7-7E7D52654BAF@oracle.com>

> On Jun 30, 2020, at 10:15 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> I think Igor said that you can't swap arguments of compare without changing condition test. For example, if it was CC_LT it should be CC_GT after swap.

Yes, that's exactly what I had in mind. The condition must be inverted. Otherwise your transformation [3] is not valid for anything but equality, so that's not going to work. If [3] didn't work, perhaps there is another user of the CmpLNode in addition to BoolNode?
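
For reference, the arithmetic behind transformation [3] from the list quoted in this thread, CmpL(ConvI2L(val),ConL)->CmpI(val,ConI), can be sketched in plain Java. This is only an illustration, not the HotSpot change: the class and method names are made up, and the handling of constants outside the int range is an assumption about what such a narrowing needs, which the thread does not spell out.

```java
// Illustrative sketch only; not taken from the webrev. All names are hypothetical.
final class NarrowCompareDemo {
    // Comparison as written in source: the int operand is widened (i2l) first.
    static boolean wideEquals(int val, long con) {
        return (long) val == con;
    }

    static boolean wideLess(int val, long con) {
        return (long) val < con;
    }

    // Narrowed equality: compare as ints when the constant fits in int range.
    static boolean narrowEquals(int val, long con) {
        if (con < Integer.MIN_VALUE || con > Integer.MAX_VALUE) {
            return false; // no int value can equal a constant outside int range
        }
        return val == (int) con;
    }

    // Narrowed less-than: out-of-range constants fold to a fixed answer.
    static boolean narrowLess(int val, long con) {
        if (con > Integer.MAX_VALUE) {
            return true;  // every int is below such a constant
        }
        if (con < Integer.MIN_VALUE) {
            return false; // no int is below such a constant
        }
        return val < (int) con;
    }

    public static void main(String[] args) {
        int[] vals = {Integer.MIN_VALUE, -1, 0, 1, Integer.MAX_VALUE};
        long[] cons = {Long.MIN_VALUE, -1L, 0L, 1L, 1L << 40, Long.MAX_VALUE};
        boolean allMatch = true;
        for (int v : vals) {
            for (long c : cons) {
                allMatch &= wideEquals(v, c) == narrowEquals(v, c);
                allMatch &= wideLess(v, c) == narrowLess(v, c);
            }
        }
        System.out.println("narrowed results match widened results: " + allMatch);
    }
}
```

In this sketch the constant stays on the right-hand side throughout, so no operand swap (and therefore no change of test) is involved.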

igor

> 
> It is not clear why you need swapping in CmpLNode::Ideal() if BoolNode::Ideal() should do it already. If it does not you need to investigate why.
> 
> Also your list of steps 1.-3. does not reflect changes in webrev.02b:
> http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02b/src/hotspot/share/opto/subnode.cpp.udiff.html
> 
> Regards,
> Vladimir


From boris.ulasevich at bell-sw.com  Wed Jul  1 08:51:34 2020
From: boris.ulasevich at bell-sw.com (Boris Ulasevich)
Date: Wed, 1 Jul 2020 11:51:34 +0300
Subject: RFR 8248043: Need to eliminate excessive i2l conversions
In-Reply-To: <E8DCC635-1EEB-4A5D-92D7-7E7D52654BAF@oracle.com>
References: <096e0df7-8208-2a07-975f-e2de8bc27e3a@bell-sw.com>
 <75920e44-518e-10e0-53b3-c2a6f85fd841@oracle.com>
 <0be466e7-057c-b029-3461-de21d9cd3910@bell-sw.com>
 <AA66D907-6ADF-44DB-8ADE-2D16D3CF6AC5@oracle.com>
 <CAOhyNwDQGnDk5VYR=xLEW5KGJLtJiWTZLO9xnVhhKUXJR9UOog@mail.gmail.com>
 <d4b7547a-02d5-fcad-5898-ce7f89383fb0@oracle.com>
 <E8DCC635-1EEB-4A5D-92D7-7E7D52654BAF@oracle.com>
Message-ID: <CAOhyNwAK-F3b2zgLsWKDQiKNyu2F1TavLcC_zfgtBn2GmSm=xQ@mail.gmail.com>

Hi,

I'm deeply sorry. Yes, webrev.02b is certainly wrong!
Correct link is webrev.02c:
http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02c
- this is the change I described in my mail and wanted reviewed.

my apologies,
Boris


From boris.ulasevich at bell-sw.com  Wed Jul  1 09:16:31 2020
From: boris.ulasevich at bell-sw.com (Boris Ulasevich)
Date: Wed, 1 Jul 2020 12:16:31 +0300
Subject: RFR 8248043: Need to eliminate excessive i2l conversions
In-Reply-To: <CAOhyNwAK-F3b2zgLsWKDQiKNyu2F1TavLcC_zfgtBn2GmSm=xQ@mail.gmail.com>
References: <096e0df7-8208-2a07-975f-e2de8bc27e3a@bell-sw.com>
 <75920e44-518e-10e0-53b3-c2a6f85fd841@oracle.com>
 <0be466e7-057c-b029-3461-de21d9cd3910@bell-sw.com>
 <AA66D907-6ADF-44DB-8ADE-2D16D3CF6AC5@oracle.com>
 <CAOhyNwDQGnDk5VYR=xLEW5KGJLtJiWTZLO9xnVhhKUXJR9UOog@mail.gmail.com>
 <d4b7547a-02d5-fcad-5898-ce7f89383fb0@oracle.com>
 <E8DCC635-1EEB-4A5D-92D7-7E7D52654BAF@oracle.com>
 <CAOhyNwAK-F3b2zgLsWKDQiKNyu2F1TavLcC_zfgtBn2GmSm=xQ@mail.gmail.com>
Message-ID: <CAOhyNwBbsRY_BFxiz+uNz09F_5FFzTU-p4bKK83rrT_icC6PhQ@mail.gmail.com>

Hi,

This is my third attempt to send the correct link. Sorry about that ;)
http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02c

Thanks,
Boris

On Wednesday, July 1, 2020, Boris Ulasevich <boris.ulasevich at bell-sw.com>
wrote:

> Hi,
>
> I'm deeply sorry. Yes, webrev.02b is certainly wrong!
> The correct link is webrev.02c:
> http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02c
> - this is the change I described in my mail and wanted reviewed.
>
> my apologies,
> Boris
>
> On Wednesday, July 1, 2020, Igor Veresov <igor.veresov at oracle.com> wrote:
>
>> > On Jun 30, 2020, at 10:15 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>> >
>> > I think Igor said that you can't swap the arguments of a compare without
>> > changing the condition test. For example, if it was CC_LT it should be CC_GT
>> > after the swap.
>>
>> Yes, that's exactly what I had in mind. The condition must be inverted.
>> Otherwise your transformation [3] is not valid for anything but
>> equality, so that's not going to work. If [3] didn't work, perhaps
>> there is another user of the CmpLNode in addition to BoolNode?
>>
>> igor
>>
>> >
>> > It is not clear why you need swapping in CmpLNode::Ideal() if
>> > BoolNode::Ideal() should do it already. If it does not, you need to
>> > investigate why.
>> >
>> > Also your list of steps 1.-3. does not reflect the changes in webrev.02b:
>> > http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02b/src/hotspot/share/opto/subnode.cpp.udiff.html
>> >
>> > Regards,
>> > Vladimir
>> >
>> > On 6/30/20 9:33 PM, Boris Ulasevich wrote:
>> >> Hi Igor,
>> >> By BoolNode I mean the canonicalization that is already in place:
>> >> https://hg.openjdk.java.net/jdk/jdk/file/de6ad5f86276/src/hotspot/share/opto/subnode.cpp#l1391
>> >> thanks,
>> >> Boris
>> >> On Wed, Jul 1, 2020 at 5:07 AM Igor Veresov <igor.veresov at oracle.com> wrote:
>> >>> I think you forgot to include changes to BoolNode in the webrev.
>> >>>
>> >>> igor
>> >>>
>> >>>
>> >>>
>>
>>
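
The point made in the quoted exchange, that swapping the operands of a compare is only valid if the condition test changes with it, can be sketched in plain Java. This is an illustration under assumed names (the Test enum and afterSwap() are hypothetical), not HotSpot's BoolNode code:

```java
public final class SwapOperandsSketch {
    enum Test {
        EQ, NE, LT, LE, GT, GE;

        // Evaluate "a <test> b".
        boolean apply(long a, long b) {
            switch (this) {
                case EQ: return a == b;
                case NE: return a != b;
                case LT: return a < b;
                case LE: return a <= b;
                case GT: return a > b;
                default: return a >= b; // GE
            }
        }

        // The test that must be used once the operands are swapped:
        // "a < b" holds exactly when "b > a" holds.
        Test afterSwap() {
            switch (this) {
                case LT: return GT;
                case GT: return LT;
                case LE: return GE;
                case GE: return LE;
                default: return this; // EQ and NE are symmetric
            }
        }
    }

    public static void main(String[] args) {
        long[] samples = {-5L, 0L, 1L, 7L};
        for (Test t : Test.values()) {
            for (long a : samples) {
                for (long b : samples) {
                    if (t.apply(a, b) != t.afterSwap().apply(b, a)) {
                        throw new AssertionError(t + " failed for " + a + ", " + b);
                    }
                }
            }
        }
        // Swapping without adjusting the test breaks ordered comparisons.
        System.out.println(Test.LT.apply(1, 2) + " vs " + Test.LT.apply(2, 1)); // prints "true vs false"
    }
}
```

Only EQ and NE survive a swap unchanged, which matches the remark that an unadjusted swap is valid for nothing but equality.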

From Charlie.Gracie at microsoft.com  Wed Jul  1 16:15:50 2020
From: Charlie.Gracie at microsoft.com (Charlie Gracie)
Date: Wed, 1 Jul 2020 16:15:50 +0000
Subject: Stack allocation prototype for C2
Message-ID: <2D33A1A7-A3EF-4103-BD0D-D466C0E7AA31@microsoft.com>

Hi Sergey,

We have some old data but we will gather new data for the benchmarks showing wins on overall allocation reduction. I attempted to gather this data yesterday but I ran into an issue with JFR. I have a work-around so we should be able to get the data in the next couple of days. With the holidays in Canada and the US this week it might take until Monday to get the data together. 

One of the common places we see wins is with Scala iterators, in particular, when iterating over primitive arrays. Regularly the array elements get boxed to perform an operation. I believe in the Scala TMT benchmark the win is removing an allocation of boxed Double objects when iterating over an array.

Thanks for the question and we will get back to you with the data soon.
Charlie
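
The boxing pattern described above can be sketched in Java as follows. This is an assumed, simplified stand-in for what a generic iterator over a primitive array does; it is not taken from the Scala benchmarks or from the prototype, and all names are hypothetical.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

public final class BoxedIterationSketch {
    // A generic iterator over a primitive array: every next() call has to
    // hand back a Double object, so each element is boxed on the way out.
    static Iterator<Double> boxedIterator(final double[] values) {
        return new Iterator<Double>() {
            private int index = 0;

            @Override
            public boolean hasNext() {
                return index < values.length;
            }

            @Override
            public Double next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return values[index++]; // autoboxing creates a Double here
            }
        };
    }

    // Sums through the boxed iterator; each Double is unboxed immediately
    // and never escapes the loop, which makes it a candidate for removal.
    static double sumBoxed(double[] values) {
        double sum = 0.0;
        Iterator<Double> it = boxedIterator(values);
        while (it.hasNext()) {
            sum += it.next();
        }
        return sum;
    }

    // Same computation without any boxing, for comparison.
    static double sumPrimitive(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] data = {1.5, 2.5, 3.0};
        System.out.println(sumBoxed(data) == sumPrimitive(data));
    }
}
```

Whether a JIT compiler actually removes the short-lived Double objects in sumBoxed depends on inlining and escape analysis, and that is the kind of case the allocation data is meant to quantify.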

On 2020-06-29, 11:34 PM, "hotspot-compiler-dev on behalf of Sergey Kuksenko" <hotspot-compiler-dev-retn at openjdk.java.net on behalf of sergey.kuksenko at oracle.com> wrote:

    I am just curious.
    
    For each benchmark you show the overall reduction in allocation size. Do
    you have statistics on which stack-allocated objects have the major
    impact? And which code patterns fail scalar replacement, apart from the
    well-known Integer cache flow merge?
    
    On 6/29/20 2:05 PM, Charlie Gracie wrote:
    > Hi hotspot-compiler-dev community,
    >
    > Here is the prototype code for our work on adding stack allocation to the HotSpot C2 compiler. We are looking for any and all feedback
    > as we hope to move from a prototype to something that could be contributed. A change of this size is difficult to review so we
    > understand the process will be thorough and will take time to complete. Any suggestions on how to allow for collaboration with others,
    > if they wanted to, would also be appreciated (i.e., a repo somewhere).
    >
    > For a quick refresher here is a link to Nikola's talk at FOSDEM:
    > https://fosdem.org/2020/schedule/event/reducing_gc_times/
    >
    > Here is a link to our initial webrev:
    > http://cr.openjdk.java.net/~adityam/charlie/stack_alloc/
    >
    > Expecting that a change like this will require a JEP, we have prepared a document describing our work based off of the JEP submission
    > form. Our document has a few extra sections at the end discussing areas that we are looking for guidance on and some initial
    > performance results. This document can be found here:
    > https://github.com/microsoft/openjdk-proposals/blob/master/stack_allocation/Stack_Allocation_JEP.md
    >
    > Thanks in advance for reviews, suggestions, concerns, comments and issues.
    > Charlie and Nikola
    >
    

From igor.veresov at oracle.com  Wed Jul  1 19:29:42 2020
From: igor.veresov at oracle.com (Igor Veresov)
Date: Wed, 1 Jul 2020 12:29:42 -0700
Subject: RFR 8248043: Need to eliminate excessive i2l conversions
In-Reply-To: <CAOhyNwBbsRY_BFxiz+uNz09F_5FFzTU-p4bKK83rrT_icC6PhQ@mail.gmail.com>
References: <096e0df7-8208-2a07-975f-e2de8bc27e3a@bell-sw.com>
 <75920e44-518e-10e0-53b3-c2a6f85fd841@oracle.com>
 <0be466e7-057c-b029-3461-de21d9cd3910@bell-sw.com>
 <AA66D907-6ADF-44DB-8ADE-2D16D3CF6AC5@oracle.com>
 <CAOhyNwDQGnDk5VYR=xLEW5KGJLtJiWTZLO9xnVhhKUXJR9UOog@mail.gmail.com>
 <d4b7547a-02d5-fcad-5898-ce7f89383fb0@oracle.com>
 <E8DCC635-1EEB-4A5D-92D7-7E7D52654BAF@oracle.com>
 <CAOhyNwAK-F3b2zgLsWKDQiKNyu2F1TavLcC_zfgtBn2GmSm=xQ@mail.gmail.com>
 <CAOhyNwBbsRY_BFxiz+uNz09F_5FFzTU-p4bKK83rrT_icC6PhQ@mail.gmail.com>
Message-ID: <424D5809-A580-43BD-A00D-B49C470AF280@oracle.com>

That looks good.

igor


> On Jul 1, 2020, at 2:16 AM, Boris Ulasevich <boris.ulasevich at bell-sw.com> wrote:
> 
> Hi,
> 
> It is the third attempt to send a correct link. Sorry for that ;)
> http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02c
> 
> Thanks,
> Boris
> 
> 


From vladimir.kozlov at oracle.com  Wed Jul  1 19:44:48 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 1 Jul 2020 12:44:48 -0700
Subject: RFR 8248043: Need to eliminate excessive i2l conversions
In-Reply-To: <424D5809-A580-43BD-A00D-B49C470AF280@oracle.com>
References: <096e0df7-8208-2a07-975f-e2de8bc27e3a@bell-sw.com>
 <75920e44-518e-10e0-53b3-c2a6f85fd841@oracle.com>
 <0be466e7-057c-b029-3461-de21d9cd3910@bell-sw.com>
 <AA66D907-6ADF-44DB-8ADE-2D16D3CF6AC5@oracle.com>
 <CAOhyNwDQGnDk5VYR=xLEW5KGJLtJiWTZLO9xnVhhKUXJR9UOog@mail.gmail.com>
 <d4b7547a-02d5-fcad-5898-ce7f89383fb0@oracle.com>
 <E8DCC635-1EEB-4A5D-92D7-7E7D52654BAF@oracle.com>
 <CAOhyNwAK-F3b2zgLsWKDQiKNyu2F1TavLcC_zfgtBn2GmSm=xQ@mail.gmail.com>
 <CAOhyNwBbsRY_BFxiz+uNz09F_5FFzTU-p4bKK83rrT_icC6PhQ@mail.gmail.com>
 <424D5809-A580-43BD-A00D-B49C470AF280@oracle.com>
Message-ID: <7616c777-cfa0-37c5-3f6b-0e03d471fe84@oracle.com>

+1

Thanks,
Vladimir

On 7/1/20 12:29 PM, Igor Veresov wrote:
> That looks good.
> 
> igor
> 
> 
> 
>> On Jul 1, 2020, at 2:16 AM, Boris Ulasevich <boris.ulasevich at bell-sw.com> wrote:
>>
>> Hi,
>>
>> It is the third attempt to send a correct link. Sorry for that ;)
>> http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02c
>>
>> Thanks,
>> Boris
>>
>>
> 

From joserz at linux.ibm.com  Wed Jul  1 19:49:10 2020
From: joserz at linux.ibm.com (joserz at linux.ibm.com)
Date: Wed, 1 Jul 2020 16:49:10 -0300
Subject: RFR(M): 8248191: PPC: Implement Load/Store Vector with lxvl/stxvl in
 Power10
Message-ID: <20200701194910.GA141565@pacoca>

This patch introduces two instructions, lxvl/stxvl, and uses them in place of the current lxvd2x/stxvd2x to load and store vectors. Like lxvd2x/stxvd2x, lxvl/stxvl can access unaligned effective addresses, with the advantage of *not* requiring an xxswapd after lxvd2x (or before stxvd2x) to correct the lanes in little-endian mode.

Webrev: https://cr.openjdk.java.net/~mhorie/8248191/webrev.00/
Bug: https://bugs.openjdk.java.net/browse/JDK-8248191

Thanks for your review!

Jose R. Ziviani
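
The lane correction mentioned above can be pictured with a small Java model. It is a simplification under stated assumptions: a vector register is treated as two 64-bit doublewords, lxvd2x is assumed to place the doubleword at the lower address in the most significant half, and little-endian element numbering is assumed to start at the least significant half. The class and method names are hypothetical, and lxvl/stxvl themselves are not modeled.

```java
public final class LaneSwapSketch {
    // Register model: reg[0] is the most significant doubleword,
    // reg[1] the least significant one.

    // Assumed lxvd2x behavior: the doubleword at the lower address goes to
    // the most significant half, the next one to the least significant half.
    static long[] lxvd2x(long[] memory, int index) {
        return new long[] {memory[index], memory[index + 1]};
    }

    // xxswapd exchanges the two doublewords of a register.
    static long[] xxswapd(long[] reg) {
        return new long[] {reg[1], reg[0]};
    }

    // Little-endian element numbering: element 0 is the least significant
    // doubleword, element 1 the most significant one.
    static long elementLE(long[] reg, int element) {
        return reg[1 - element];
    }

    public static void main(String[] args) {
        long[] memory = {0x1111L, 0x2222L};

        long[] loaded = lxvd2x(memory, 0);
        // Element 0 does not match memory[0]: the lanes are swapped.
        System.out.println(elementLE(loaded, 0) == memory[0]);

        long[] corrected = xxswapd(loaded);
        // After the swap, element i matches memory[i].
        System.out.println(elementLE(corrected, 0) == memory[0]
                && elementLE(corrected, 1) == memory[1]);
    }
}
```

A load that already delivers the elements in little-endian order makes the extra xxswapd step unnecessary.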

From nils.eliasson at oracle.com  Wed Jul  1 21:00:36 2020
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Wed, 1 Jul 2020 23:00:36 +0200
Subject: [15] RFR(S): 8248388: ZGC: Load barrier incorrectly elided in
 jdk/java/text/Format/DateFormat/SDFTCKZoneNamesTest.java
Message-ID: <31bc9579-55d2-8cdb-6ad1-58fb43f30c91@oracle.com>

Hi,

This issue was found on aarch64 but applies to all platforms. Stefan 
Karlsson tracked down the source of the issue and created a reproducer.

The bug is that the access API was not used in two places in macro.cpp
where scalar replacement generates a load from the source of an
arraycopy. This causes the creation of a LoadP without a barrier.

I fix this by reusing arraycopynode::load to create the loads. The
abstraction is a bit off, but I don't want to make a larger change this
late in 15.

https://bugs.openjdk.java.net/browse/JDK-8248388
http://cr.openjdk.java.net/~neliasso/8248388/webrev.01/

Please review,
Nils Eliasson

From doug.simon at oracle.com  Wed Jul  1 21:55:48 2020
From: doug.simon at oracle.com (Doug Simon)
Date: Wed, 1 Jul 2020 23:55:48 +0200
Subject: RFR: 8248321: [JVMCI] improve libgraal logging and fatal error
 handling
Message-ID: <E2A41147-E504-4CA0-B027-44B31B31ED35@oracle.com>

Please review this change that:

1. Sends log output from libgraal for options such as -Dlibgraal.PrintGC=true to HotSpot's tty stream.
2. Forwards a fatal error in libgraal to HotSpot's report_fatal function so that a proper hs_err_pid crash log is produced.
3. Adds coarse-grained JVMCI events to the hs_err_pid crash log that can help diagnose libgraal crashes.

https://bugs.openjdk.java.net/browse/JDK-8248321
https://cr.openjdk.java.net/~dnsimon/8248321/webrev.00/

Testing: hs-tier1,hs-tier2,hs-tier3-graal,hs-tier4-graal

I've also tested this on a JDK 16 libgraal build (thanks to Bob's recent fixes <https://github.com/oracle/graal/pull/2635>) using the -Dlibgraal.CrashAtIsFatal=true option introduced for testing purposes. Here are extracts from the resulting hs_err_pid log:

Stack: [0x000070000404e000,0x000070000424e000],  sp=0x000070000424cfa0,  free space=2043k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.dylib+0xa74706]  _ZN7VMError14report_and_dieEiPKcS1_P13__va_list_tagP6ThreadPhPvS7_S1_im+0x696
V  [libjvm.dylib+0xa74dcb]  _ZN7VMError14report_and_dieEP6ThreadPvPKciS4_S4_P13__va_list_tag+0x3b
V  [libjvm.dylib+0x2ffeb6]  _Z12report_fatalPKciS0_z+0xb6
V  [libjvm.dylib+0x623b2e]  _ZL6_fatalv+0x1e
C  [libjvmcicompiler.dylib+0x50c2e]  FunctionPointerLogHandler_fatalError_45f632dec0d6a0795524f3a791e61bc3381552ca+0x5e
C  [libjvmcicompiler.dylib+0x6251d9]  GraalCompiler_notifyCrash_6e5abb0717b70e82f6be0f6751e33644079f0e7c+0x199
C  [libjvmcicompiler.dylib+0x622f36]  GraalCompiler_checkForRequestedCrash_a1f0e6b1c079f96a46be20bd2ccc87fb7db83871+0x256
C  [libjvmcicompiler.dylib+0x623929]  GraalCompiler_compile_5fc27c66103532b8aadfba9a53a0cfc56727e415+0x209
C  [libjvmcicompiler.dylib+0x623e80]  GraalCompiler_compileGraph_7c727cf4f7ff3555660a81773d74fd53c28861a9+0x1e0
C  [libjvmcicompiler.dylib+0x742259]  HotSpotGraalCompiler_compileHelper_d3a966217707633929a5b5a4a7670fbd583caf11+0x419
C  [libjvmcicompiler.dylib+0x741d95]  HotSpotGraalCompiler_compile_80896636e2e15249ae0fc7c3c7f4cb060aca0523+0x165


JVMCI Events (8 events):
Event: 0.015 Thread 0x00007fa00b011600 created new JVMCI runtime 0 (0x00007fa01af24040)
Event: 0.015 Thread 0x00007fa00b011600 created new JVMCI runtime -1 (0x00007fa01af240a0)
Event: 0.072 Thread 0x00007fa01b02bc00 loaded JVMCI shared library from /Users/dnsimon/hs/graal/sdk/mxbuild/darwin-amd64/GRAALVM_LIBGRAAL_JAVA16/graalvm-libgraal-java16-20.2.0-dev/lib/libjvmcicompiler.dylib
Event: 0.073 Thread 0x00007fa01b02bc00 created JavaVM[1]@0x00000001409a3cb0 for JVMCI runtime 0
Event: 0.073 Thread 0x00007fa01b02bc00 initializing JVMCI runtime 0
Event: 0.074 Thread 0x00007fa01b02bc00 initialized JVMCI runtime 0
Event: 0.082 Thread 0x00007fa01b02bc00 initializing JVMCI runtime -1
Event: 0.088 Thread 0x00007fa01b02bc00 initialized JVMCI runtime -1

-Doug


From vladimir.kozlov at oracle.com  Wed Jul  1 22:16:25 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 1 Jul 2020 15:16:25 -0700
Subject: [15] RFR(S): 8248388: ZGC: Load barrier incorrectly elided in
 jdk/java/text/Format/DateFormat/SDFTCKZoneNamesTest.java
In-Reply-To: <31bc9579-55d2-8cdb-6ad1-58fb43f30c91@oracle.com>
References: <31bc9579-55d2-8cdb-6ad1-58fb43f30c91@oracle.com>
Message-ID: <2f5d5fce-eb28-bd97-9914-b2f081df0c3e@oracle.com>

Looks good.

Thanks,
Vladimir K

On 7/1/20 2:00 PM, Nils Eliasson wrote:
> Hi,
> 
> This issue was found on aarch64 but applies to all platforms. Stefan Karlsson tracked down the source of the issue and 
> created a reproducer.
> 
> The bug is that the access API was not used in two places in macro.cpp where scalar replacement generates a load from the
> source of an arraycopy. This causes the creation of a LoadP without a barrier.
>
> I fix this by reusing arraycopynode::load to create the loads. The abstraction is a bit off, but I don't want to make a
> larger change this late in 15.
> 
> https://bugs.openjdk.java.net/browse/JDK-8248388
> http://cr.openjdk.java.net/~neliasso/8248388/webrev.01/
> 
> Please review,
> Nils Eliasson

From vladimir.kozlov at oracle.com  Wed Jul  1 22:26:17 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 1 Jul 2020 15:26:17 -0700
Subject: RFR: 8248321: [JVMCI] improve libgraal logging and fatal error
 handling
In-Reply-To: <E2A41147-E504-4CA0-B027-44B31B31ED35@oracle.com>
References: <E2A41147-E504-4CA0-B027-44B31B31ED35@oracle.com>
Message-ID: <40d76f1c-d3cb-466f-ab14-b9cf9fdbd097@oracle.com>

Looks good.

Please, run usual testing before push.

Thanks,
Vladimir

On 7/1/20 2:55 PM, Doug Simon wrote:
> Please review this change that:
> 
> 1. Sends log output from libgraal for options such as -Dlibgraal.PrintGC=true to HotSpot's tty stream.
> 2. Forwards a fatal error in libgraal to HotSpot's report_fatal function so that a proper hs_err_pid crash log is produced.
> 3. Adds coarse-grained JVMCI events to the hs_err_pid crash log that can help diagnose libgraal crashes.
> 
> https://bugs.openjdk.java.net/browse/JDK-8248321
> https://cr.openjdk.java.net/~dnsimon/8248321/webrev.00/
> 
> Testing: hs-tier1,hs-tier2,hs-tier3-graal,hs-tier4-graal
> 
> I've also tested this on a JDK 16 libgraal build (thanks to Bob's recent fixes <https://github.com/oracle/graal/pull/2635>) using the -Dlibgraal.CrashAtIsFatal=true option introduced for testing purposes. Here are extracts from the resulting hs_err_pid log:
> 
> Stack: [0x000070000404e000,0x000070000424e000],  sp=0x000070000424cfa0,  free space=2043k
> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.dylib+0xa74706]  _ZN7VMError14report_and_dieEiPKcS1_P13__va_list_tagP6ThreadPhPvS7_S1_im+0x696
> V  [libjvm.dylib+0xa74dcb]  _ZN7VMError14report_and_dieEP6ThreadPvPKciS4_S4_P13__va_list_tag+0x3b
> V  [libjvm.dylib+0x2ffeb6]  _Z12report_fatalPKciS0_z+0xb6
> V  [libjvm.dylib+0x623b2e]  _ZL6_fatalv+0x1e
> C  [libjvmcicompiler.dylib+0x50c2e]  FunctionPointerLogHandler_fatalError_45f632dec0d6a0795524f3a791e61bc3381552ca+0x5e
> C  [libjvmcicompiler.dylib+0x6251d9]  GraalCompiler_notifyCrash_6e5abb0717b70e82f6be0f6751e33644079f0e7c+0x199
> C  [libjvmcicompiler.dylib+0x622f36]  GraalCompiler_checkForRequestedCrash_a1f0e6b1c079f96a46be20bd2ccc87fb7db83871+0x256
> C  [libjvmcicompiler.dylib+0x623929]  GraalCompiler_compile_5fc27c66103532b8aadfba9a53a0cfc56727e415+0x209
> C  [libjvmcicompiler.dylib+0x623e80]  GraalCompiler_compileGraph_7c727cf4f7ff3555660a81773d74fd53c28861a9+0x1e0
> C  [libjvmcicompiler.dylib+0x742259]  HotSpotGraalCompiler_compileHelper_d3a966217707633929a5b5a4a7670fbd583caf11+0x419
> C  [libjvmcicompiler.dylib+0x741d95]  HotSpotGraalCompiler_compile_80896636e2e15249ae0fc7c3c7f4cb060aca0523+0x165
> 
> 
> JVMCI Events (8 events):
> Event: 0.015 Thread 0x00007fa00b011600 created new JVMCI runtime 0 (0x00007fa01af24040)
> Event: 0.015 Thread 0x00007fa00b011600 created new JVMCI runtime -1 (0x00007fa01af240a0)
> Event: 0.072 Thread 0x00007fa01b02bc00 loaded JVMCI shared library from /Users/dnsimon/hs/graal/sdk/mxbuild/darwin-amd64/GRAALVM_LIBGRAAL_JAVA16/graalvm-libgraal-java16-20.2.0-dev/lib/libjvmcicompiler.dylib
> Event: 0.073 Thread 0x00007fa01b02bc00 created JavaVM[1]@0x00000001409a3cb0 for JVMCI runtime 0
> Event: 0.073 Thread 0x00007fa01b02bc00 initializing JVMCI runtime 0
> Event: 0.074 Thread 0x00007fa01b02bc00 initialized JVMCI runtime 0
> Event: 0.082 Thread 0x00007fa01b02bc00 initializing JVMCI runtime -1
> Event: 0.088 Thread 0x00007fa01b02bc00 initialized JVMCI runtime -1
> 
> -Doug
> 
> 

From doug.simon at oracle.com  Wed Jul  1 22:41:14 2020
From: doug.simon at oracle.com (Doug Simon)
Date: Thu, 2 Jul 2020 00:41:14 +0200
Subject: RFR: 8248321: [JVMCI] improve libgraal logging and fatal error
 handling
In-Reply-To: <40d76f1c-d3cb-466f-ab14-b9cf9fdbd097@oracle.com>
References: <E2A41147-E504-4CA0-B027-44B31B31ED35@oracle.com>
 <40d76f1c-d3cb-466f-ab14-b9cf9fdbd097@oracle.com>
Message-ID: <0EC13073-ACCD-4E13-8E05-F25DF5162F24@oracle.com>

Thanks.

I will be sure to run testing.

-Doug

> On 2 Jul 2020, at 00:26, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> Looks good.
> 
> Please run the usual testing before pushing.
> 
> Thanks,
> Vladimir
> 


From dean.long at oracle.com  Thu Jul  2 00:08:35 2020
From: dean.long at oracle.com (Dean Long)
Date: Wed, 1 Jul 2020 17:08:35 -0700
Subject: RFR(XL) 8247922: Update Graal
Message-ID: <d15ce4fe-af4b-5c6f-95e2-025c4956d55a@oracle.com>

https://bugs.openjdk.java.net/browse/JDK-8247922
http://cr.openjdk.java.net/~dlong/8247922/webrev/

This is a Graal update. Changes since the last update (JDK-8243380) are
listed in the bug description.

dl


From vladimir.kozlov at oracle.com  Thu Jul  2 00:23:56 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 1 Jul 2020 17:23:56 -0700
Subject: [16] RFR(XS) 8076985: Allocation path: biased locking + compressed
 oops code quality
Message-ID: <580c31f3-b86e-a6c3-ca61-2d6104a846f8@oracle.com>

https://bugs.openjdk.java.net/browse/JDK-8076985
https://cr.openjdk.java.net/~kvn/8076985/webrev.00/

First, this is about how C2 generates code for *constant* class pointers.

A little history here. When we implemented compressed oops and class pointers, we had PermGen and classes were Java
objects. We used the same decoding/encoding code for oops and classes, with the same register holding the heap base
address. It was profitable to decode a constant class pointer once and reuse it [1]. We also benefited greatly on SPARC,
since decoding a 32-bit constant required 4 instructions instead of up to 7 instructions to load a 64-bit constant.

Now compressed class pointer decoding is different and always takes 2 instructions on x86 [2] if either the base or the shift is nonzero.

As a result, we generated 3 instructions to get the full class pointer from a compressed 32-bit constant (example for
base = 0, shift = 3):

movl $0x200001d5,%r11d
movabs $0x0,%r10
lea (%r10,%r11,8),%r10

Also, when we store a compressed class pointer into a new object's header, we no longer use a register on x86, so
keeping it in a register does not help now:

movl $0x200001d5,0x8(%rax)

Aleksey suggested using a single instruction to load the full 64-bit class pointer:

movq $0x100000EA8,%r10

It frees one register and uses 10 bytes of code instead of up to 20 bytes on x86.
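
As a quick check of the arithmetic behind the sequences above, here is a minimal Java sketch. It is illustrative only and not HotSpot code; the class and method names are made up for this example. It assumes the usual decoding rule, full pointer = base + (narrow << shift), and plugs in the values from the example (base = 0, shift = 3):

```java
// Illustrative sketch only; KlassDecode and decode() are hypothetical names, not HotSpot APIs.
final class KlassDecode {
    // Decode a 32-bit compressed class pointer: base + (narrow << shift).
    // The narrow value is treated as unsigned, matching the zero-extension done by movl on x86_64.
    static long decode(int narrow, long base, int shift) {
        return base + (Integer.toUnsignedLong(narrow) << shift);
    }

    public static void main(String[] args) {
        // Values taken from the example: movl $0x200001d5, base = 0, shift = 3.
        long full = decode(0x200001d5, 0L, 3);
        System.out.println("0x" + Long.toHexString(full).toUpperCase()); // prints 0x100000EA8
    }
}
```

The result matches the 64-bit constant used in the single movq instruction, so the two forms materialize the same pointer.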

In JDK 9, SAP contributed a nice change [3] that allows choosing between 'compressed class pointer + decoding' and a
full '64-bit constant class pointer'. It significantly simplified the changes for this RFE.


I ran performance testing but did not see a difference: we don't use biased locking now, and as a result we don't need
to load the prototype header from the class. But there are other places where we need to load from the class.

Thanks,
Vladimir K

[1] https://bugs.openjdk.java.net/browse/JDK-6709093
http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/rev/44abbb0d4c18
Instead of generating this:
      movl    R11, narrowoop: precise klass jnt/scimark2/Random: 0x000000000083b418:Constant:exact * # compressed ptr
      movq    R10, precise klass jnt/scimark2/Random: 0x000000000083b418:Constant:exact * # ptr
      movq    R10, [R10 + #176 (32-bit)]      # ptr
      movq    [RAX], R10      # ptr
      movl    [RAX + #8 (8-bit)], R11 # compressed ptr

it generates this:
      movl    R11, narrowoop: precise klass Point: 0x00000000007ad518:Constant:exact * # compressed ptr
      movq    R10, [R12 + R11 << 3 + #176] (compressed oop addressing) # ptr
      movq    [R8], R10       # ptr
      movl    [R8 + #8 (8-bit)], R11  # compressed ptr

[2] http://hg.openjdk.java.net/jdk/jdk/file/c5ed42533134/src/hotspot/cpu/x86/macroAssembler_x86.cpp#l4609

[3] https://bugs.openjdk.java.net/browse/JDK-8155729

From tobias.hartmann at oracle.com  Thu Jul  2 06:43:23 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Thu, 2 Jul 2020 08:43:23 +0200
Subject: RFR(S): 8243670: Unexpected test result caused by C2
 MergeMemNode::Ideal
In-Reply-To: <134e1fc1-8e5c-a1f2-d0ed-50784b807578@oracle.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7DE0314@dggeml507-mbs.china.huawei.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E47B68@dggeml527-mbx.china.huawei.com>
 <fa89a8ca-34dd-abcf-60fa-9d18db754c14@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E48697@dggeml527-mbx.china.huawei.com>
 <87k103w2o7.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4ADEA@dggeml527-mbx.china.huawei.com>
 <87eeq7wmd2.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4B0CD@dggeml527-mbx.china.huawei.com>
 <878sgfwbyc.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4B7E4@dggeml527-mbx.china.huawei.com>
 <87wo3yupks.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4C62C@dggeml527-mbx.china.huawei.com>
 <a87b52f2-31e5-a526-4478-049ab6bba636@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E57047@dggeml507-mbx.china.huawei.com>
 <87o8p1ours.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E57CA5@dggeml507-mbx.china.huawei.com>
 <134e1fc1-8e5c-a1f2-d0ed-50784b807578@oracle.com>
Message-ID: <4d71bb09-2569-4d01-16cc-707ce61d23de@oracle.com>

Hi Felix,

On 30.06.20 19:06, Tobias Hartmann wrote:
> I'll run some perf and correctness testing and report back once it finished.

All passed.

Best regards,
Tobias

From rwestrel at redhat.com  Thu Jul  2 07:03:37 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Thu, 02 Jul 2020 09:03:37 +0200
Subject: [15] RFR(S): 8248388: ZGC: Load barrier incorrectly elided in
 jdk/java/text/Format/DateFormat/SDFTCKZoneNamesTest.java
In-Reply-To: <31bc9579-55d2-8cdb-6ad1-58fb43f30c91@oracle.com>
References: <31bc9579-55d2-8cdb-6ad1-58fb43f30c91@oracle.com>
Message-ID: <87v9j69z5i.fsf@redhat.com>


> http://cr.openjdk.java.net/~neliasso/8248388/webrev.01/

Looks good.

Roland.


From rwestrel at redhat.com  Thu Jul  2 07:08:10 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Thu, 02 Jul 2020 09:08:10 +0200
Subject: [16] RFR(XS) 8076985: Allocation path: biased locking +
 compressed oops code quality
In-Reply-To: <580c31f3-b86e-a6c3-ca61-2d6104a846f8@oracle.com>
References: <580c31f3-b86e-a6c3-ca61-2d6104a846f8@oracle.com>
Message-ID: <87sgea9yxx.fsf@redhat.com>


> https://cr.openjdk.java.net/~kvn/8076985/webrev.00/

Looks good to me.

Roland.


From rwestrel at redhat.com  Thu Jul  2 07:10:01 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Thu, 02 Jul 2020 09:10:01 +0200
Subject: RFR(S): 8243670: Unexpected test result caused by C2
 MergeMemNode::Ideal
In-Reply-To: <DA41BE1DDCA941489001C7FBD7A8820EE7E57CA5@dggeml507-mbx.china.huawei.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7DE0314@dggeml507-mbs.china.huawei.com>
 <4d051aec-56ef-b35e-f082-2f6305ec1694@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7DEEDED@dggeml527-mbx.china.huawei.com>
 <c5fb2da0-927b-ee0c-a2cf-34c889937f68@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4368B@dggeml527-mbx.china.huawei.com>
 <9146b58c-9353-dcb8-827e-7f92a85cecd2@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E47B68@dggeml527-mbx.china.huawei.com>
 <fa89a8ca-34dd-abcf-60fa-9d18db754c14@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E48697@dggeml527-mbx.china.huawei.com>
 <87k103w2o7.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4ADEA@dggeml527-mbx.china.huawei.com>
 <87eeq7wmd2.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4B0CD@dggeml527-mbx.china.huawei.com>
 <878sgfwbyc.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4B7E4@dggeml527-mbx.china.huawei.com>
 <87wo3yupks.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4C62C@dggeml527-mbx.china.huawei.com>
 <a87b52f2-31e5-a526-4478-049ab6bba636@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E57047@dggeml507-mbx.china.huawei.com>
 <87o8p1ours.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E57CA5@dggeml507-mbx.china.huawei.com>
Message-ID: <87pn9e9yuu.fsf@redhat.com>


>     Updated webrev: http://cr.openjdk.java.net/~fyang/8243670/webrev.03/ 

Looks good to me.

Roland.


From christian.hagedorn at oracle.com  Thu Jul  2 07:33:24 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Thu, 2 Jul 2020 09:33:24 +0200
Subject: [16] RFR(S): 8247743: Segmentation fault in debug builds due to stack
 overflow in find_recur with deep graphs
Message-ID: <9af7a44c-4267-4900-812c-12aa0c37713a@oracle.com>

Hi

Please review the following patch:
https://bugs.openjdk.java.net/browse/JDK-8247743
http://cr.openjdk.java.net/~chagedorn/8247743/webrev.00/

The testcase creates a deep graph with a lot of nodes on a chain. When 
running with the specified test flags, it recursively calls 
Node::find_recur() for each node discovered which eventually results in 
a segmentation fault due to a stack overflow (around 10000 calls due to 
such a long chain of nodes). The fix just converts the recursive 
algorithm into an iterative one to avoid a segmentation fault. This is 
similar to JDK-8246203 [1].

I additionally removed Node::find_ctrl() and its special handling in the 
algorithm since it is not used.

There is actually another problem with the recursive version. When 
running the testcase without -XX:CompileOnly=compiler/c2/TestFindNode, 
it will spin forever inside [2] because there is a debug_orig node cycle 
and the loop does not break based on the debug_orig nodes being visited. 
This is also fixed in the patch.
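
For readers unfamiliar with the pattern, here is a minimal Java sketch of the recursive-to-iterative conversion. The actual fix is C++ code in node.cpp; the Node class and findByIndex method below are hypothetical and only illustrate the idea. An explicit worklist replaces the native call stack, and a visited set guarantees termination when the graph contains a cycle:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a C2 node; not the real HotSpot data structure.
final class Node {
    final int idx;
    final List<Node> inputs = new ArrayList<>();
    Node(int idx) { this.idx = idx; }
}

final class IterativeFind {
    // Depth-first search using an explicit stack instead of recursion,
    // so the search depth is bounded by heap space rather than thread stack size.
    static Node findByIndex(Node start, int idx) {
        Deque<Node> worklist = new ArrayDeque<>();
        Set<Node> visited = new HashSet<>();
        worklist.push(start);
        while (!worklist.isEmpty()) {
            Node n = worklist.pop();
            if (!visited.add(n)) {
                continue; // already processed; this is what breaks cycles
            }
            if (n.idx == idx) {
                return n;
            }
            for (Node in : n.inputs) {
                if (in != null) {
                    worklist.push(in);
                }
            }
        }
        return null; // not reachable from start
    }
}
```

With this shape, a chain of 10000 or more nodes is just 10000 loop iterations, and a cycle only causes one extra visited-set lookup.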

Thank you!

Best regards,
Christian


[1] https://bugs.openjdk.java.net/browse/JDK-8246203
[2] 
http://hg.openjdk.java.net/jdk/jdk/file/e2622818f0bd/src/hotspot/share/opto/node.cpp#l1589

From adinn at redhat.com  Thu Jul  2 08:04:12 2020
From: adinn at redhat.com (Andrew Dinn)
Date: Thu, 2 Jul 2020 09:04:12 +0100
Subject: 8248336: AArch64: C2: offset overflow in BoxLockNode::emit
In-Reply-To: <2db7b669-63b6-1dbd-6d7a-7bac55144167@redhat.com>
References: <3fa560fa-c1fd-0131-10d2-040bac25b7f7@redhat.com>
 <2db7b669-63b6-1dbd-6d7a-7bac55144167@redhat.com>
Message-ID: <3b578127-9a25-3bc9-c9e5-12bbc3f366ce@redhat.com>

On 25/06/2020 17:48, Andrew Haley wrote:
> On 25/06/2020 17:31, Andrew Haley wrote:
>> BoxLockNode::emit only allows a 12-bit offset from register SP to the
>> stack slot that contains the inflated lock. Rather amazingly we've
>> never seen this fail in production, but in theory a BoxLockNode can be
>> anywhere in the stack frame.
>>
>> I have once seen this fail in test code, but it is very hard to
>> reproduce.
> 
> http://cr.openjdk.java.net/~aph/8248336/
Sorry, I checked the patch when you posted the webrev and I thought I
had posted an ack but clearly did not.

This is fine.

regards,


Andrew Dinn
-----------
Red Hat Distinguished Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill


From aph at redhat.com  Thu Jul  2 08:15:42 2020
From: aph at redhat.com (Andrew Haley)
Date: Thu, 2 Jul 2020 09:15:42 +0100
Subject: Stack allocation prototype for C2
In-Reply-To: <C297D000-859F-4B13-8F77-06CF553A9A85@microsoft.com>
References: <C297D000-859F-4B13-8F77-06CF553A9A85@microsoft.com>
Message-ID: <0f98b198-0769-08fc-f1ff-553eadcede22@redhat.com>

On 29/06/2020 22:05, Charlie Gracie wrote:

> Here is the prototype code for our work on adding stack allocation
> to the HotSpot C2 compiler. We are looking for any and all feedback
> as we hope to move from a prototype to something that could be
> contributed.

We certainly need a repo where it can go. It could either be adopted
by an existing project or it could have a project of its own. The
latter is perhaps a bad idea because it would be too isolated.

> A change of this size is difficult to review so we understand the
> process will be thorough and will take time to complete. Any
> suggestions on how to allow for collaboration with others, if they
> wanted to, would also be appreciated (i.e., a repo somewhere).

Here's my concern.

Java stacks are, in general, pretty small. This is good, and makes for
economical memory usage. This is particularly useful for Project Loom,
where there can be enormous numbers of "virtual" threads. These
threads, while they are not active, are stored in the heap.

As you might imagine, the idea of embedded objects (which, of course,
cannot be collected) in these virtual threads does not delight me at
all. Is this likely to be a real problem, do you think, or are all of
the stack-allocated objects so small that I shouldn't be concerned?

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From dalibor.topic at oracle.com  Thu Jul  2 08:59:40 2020
From: dalibor.topic at oracle.com (Dalibor Topic)
Date: Thu, 2 Jul 2020 10:59:40 +0200
Subject: Stack allocation prototype for C2
In-Reply-To: <C297D000-859F-4B13-8F77-06CF553A9A85@microsoft.com>
References: <C297D000-859F-4B13-8F77-06CF553A9A85@microsoft.com>
Message-ID: <d57b5803-d66a-ffaa-079f-247a34b51e9a@oracle.com>


On 29.06.2020 23:05, Charlie Gracie wrote:
> Hi hotspot-compiler-dev community,
> 
> Here is the prototype code for our work on adding stack allocation to the HotSpot C2 compiler. We are looking for any and all feedback
> as we hope to move from a prototype to something that could be contributed. A change of this size is difficult to review so we
> understand the process will be thorough and will take time to complete. Any suggestions on how to allow for collaboration with others,
> if they wanted to, would also be appreciated (i.e., a repo somewhere).

Hi Charlie,

You may want to take a look at

https://cr.openjdk.java.net/~chegar/docs/sandbox.html

"The primary purpose of the JDK Sandbox Development Repository is to 
facilitate OpenJDK developers that are working on non-trivial changes, 
possibly JEP-scale effort, whose scope and duration make it necessary to 
collaborate with others in an open shared version control system, rather 
than just using privately shared patches. "

cheers,
dalibor topic

-- 
<http://www.oracle.com> Dalibor Topic
Consulting Product Manager
Phone: +494089091214 <tel:+494089091214>, Mobile: +491737185961
<tel:+491737185961>, Video: dalibor.topic at oracle.com
<sip:dalibor.topic at oracle.com>

Oracle Global Services Germany GmbH
Hauptverwaltung: Riesstr. 25, D-80992 München
Registergericht: Amtsgericht München, HRB 246209
Geschäftsführer: Ralf Herrmann


From zhuoren.wz at alibaba-inc.com  Thu Jul  2 07:36:24 2020
From: zhuoren.wz at alibaba-inc.com (Wang Zhuo(Zhuoren))
Date: Thu, 02 Jul 2020 15:36:24 +0800
Subject: [aarch64-port-dev ] RFR(XXS):8248570 Incorrect copyright header in
 TestUnsafeUnalignedSwap.java
Message-ID: <587101a8-7cb0-453b-aed5-4edca2cdda2d.zhuoren.wz@alibaba-inc.com>

Hi,
There's something wrong in the legal notice of the TestUnsafeUnalignedSwap.java file. It should be GPLv2 as in `make/templates/gpl-header`. This patch (from Vladimir Kozlov) fixes it.
Bug link: https://bugs.openjdk.java.net/browse/JDK-8248570
CR: http://cr.openjdk.java.net/~wzhuo/8248570/webrev.00/


Regards,
Zhuoren


From vladimir.x.ivanov at oracle.com  Thu Jul  2 12:01:16 2020
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Thu, 2 Jul 2020 15:01:16 +0300
Subject: Stack allocation prototype for C2
In-Reply-To: <C297D000-859F-4B13-8F77-06CF553A9A85@microsoft.com>
References: <C297D000-859F-4B13-8F77-06CF553A9A85@microsoft.com>
Message-ID: <6a3c74b0-daf8-04c2-76b0-dc2ce3714314@oracle.com>

Hi Charlie and Nikola,

> Here is the prototype code for our work on adding stack allocation to the HotSpot C2 compiler. We are looking for any and all feedback
> as we hope to move from a prototype to something that could be contributed. A change of this size is difficult to review so we
> understand the process will be thorough and will take time to complete. Any suggestions on how to allow for collaboration with others,
> if they wanted to, would also be appreciated (i.e., a repo somewhere).
> 
> For a quick refresher here is a link to Nikola's talk at FOSDEM:
> https://fosdem.org/2020/schedule/event/reducing_gc_times/
> 
> Here is a link to our initial webrev:
> http://cr.openjdk.java.net/~adityam/charlie/stack_alloc/
> 
> Expecting that a change like this will require a JEP, we have prepared a document describing our work based off of the JEP submission
> form. Our document has a few extra sections at the end discussing areas that we are looking for guidance on and some initial
> performance results. This document can be found here:
> https://github.com/microsoft/openjdk-proposals/blob/master/stack_allocation/Stack_Allocation_JEP.md

Very nice write-up and design overview!

   "To implement stack allocation, we need to modify the C2 compiler, 
the GCs and some of the VM runtime interfaces"

 From the design overview and the implementation, I'm concerned about 
the far-reaching consequences of the chosen approach. It's not limited just 
to the existing set of JVM features but, as Andrew noted, will affect the 
design of forthcoming functionality as well.

I think it's worth starting a broad discussion (HotSpot-wide) to decide 
how much of the JVM design complexity budget is worth spending on such an 
optimization.

Personally, I'm not convinced that supporting stack allocated objects in 
the JVM is justified.


As we discussed off-line (right after FOSDEM), I do see the benefits of 
an in-memory representation for non-escaping objects: memory aliasing 
(either indeterminate base or indexed access) imposes inherent 
constraints on the escape analysis (both partial and conservative 
approaches suffer from it). Nevertheless, some of the problematic cases 
can be addressed by improving the existing approach or introducing a more 
powerful analysis: covering more cases and making the analysis 
control-sensitive should improve the situation.

Also, the alternative approach (called zone-based heap allocation) looks 
very attractive to me. I haven't thought it through, but it looks like 
keeping the objects on the Java heap can save us a lot of complexity on 
the implementation side (more memory available for allocation and not 
necessarily a fixed amount, no need to migrate objects from stack to heap, 
GC barriers are unaffected, etc.). For example, reserving a dedicated 
TLAB (or a stack of TLABs?) and doing nmethod-scoped allocations from C2 
code looks attractive. It can simplify many aspects of the 
implementation: much more space available, free migration of 
non-escaping objects to heap on deoptimization.
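
To make the idea a bit more concrete, here is a toy Java model of one possible reading of scoped allocation from a dedicated region. It is only a sketch under my own assumptions: the Zone class, its methods, and the use of array offsets in place of addresses are all hypothetical, and nothing here corresponds to existing HotSpot code. Allocation bumps a pointer, and a scope frees everything it allocated by resetting to a saved mark:

```java
// Toy model only: offsets into a byte[] stand in for addresses in a dedicated TLAB.
// Zone, allocate(), mark() and release() are hypothetical names.
final class Zone {
    private final byte[] memory;
    private int top; // next free offset (the bump pointer)

    Zone(int capacity) {
        memory = new byte[capacity];
    }

    // Returns the offset of the new block, or -1 if the zone is full.
    // A real implementation would presumably fall back to ordinary heap allocation.
    int allocate(int size) {
        if (size < 0 || size > memory.length - top) {
            return -1;
        }
        int result = top;
        top += size;
        return result;
    }

    // Remember the current top when a scope (e.g. a compiled method) is entered.
    int mark() {
        return top;
    }

    // Reset to a saved mark on scope exit, freeing everything allocated since.
    void release(int mark) {
        top = mark;
    }
}
```

In this model nothing has to be copied when a scope ends, since the objects never left the heap-backed region in the first place.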


Another idea:

"When dealing with stack allocated objects in loops we need a lifetime 
overlap check."

It doesn't look specific to stack-allocated objects. Non-overlapping 
live ranges can be coalesced the same way for on-heap freshly allocated 
objects. It should get comparable reduction in allocation pressure 
(single allocation per loop vs allocation per iteration) and doesn't 
require stack allocation support at all (as an example [1]).

If such improvements are enabled for non-escaping on-heap objects, how 
much benefit will stack allocation bring on top of that? IMO the 
performance gap should become much narrower.

Best regards,
Vladimir Ivanov

[1]

   // no aliasing: always accessed through "a"
   MyObject a = new MyObject(x1);
   for (...) {
     ...
     a = new MyObject(x2);
     ...
   }
   return a.x;

can be turned into:

   MyObject a = new MyObject(x1);
   for (...) {
     ...
     a.x = x2; // plus, re-initialize other instance fields
     ...
   }
   return a.x;

It can even be extended for escaping objects in some cases (while the 
object is provably not escaped).

From richard.reingruber at sap.com  Thu Jul  2 14:04:57 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Thu, 2 Jul 2020 14:04:57 +0000
Subject: RFR(XS) 8247695: [PPC,S390]:
 compiler/intrinsics/math/TestFpMinMaxIntrinsics.java fails
Message-ID: <AM0PR0202MB333141A97C084B832FD6EEE69B6D0@AM0PR0202MB3331.eurprd02.prod.outlook.com>

Hi,

could I please get reviews for this small bugfix which adds support for AbsL nodes to the C2
backends on PPC and S390?

Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8247695/webrev.0/
Bug:    https://bugs.openjdk.java.net/browse/JDK-8247695

The patch successfully passes regression testing @SAP which includes JCK and JTREG tests, also in
Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance Suite, SAP specific tests with fastdebug and
release builds.

Thanks, Richard.

From christian.hagedorn at oracle.com  Thu Jul  2 14:31:51 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Thu, 2 Jul 2020 16:31:51 +0200
Subject: [16] RFR(T): 8248596: [TESTBUG]
 compiler/loopopts/PartialPeelingUnswitch.java times out with Graal enabled
Message-ID: <19cf54c7-776d-63e4-6d40-bd84733a2f17@oracle.com>

Hi

Please review the following patch:
https://bugs.openjdk.java.net/browse/JDK-8248596
http://cr.openjdk.java.net/~chagedorn/8248596/webrev.00/

It excludes the execution of this C2 specific test with Graal since it 
has many methods and runs with CompileOnly, possibly letting it time out 
with Graal.
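
For reference, this kind of exclusion is expressed as a jtreg @requires tag in the test header. A minimal sketch follows, using a made-up test class: ExampleC2OnlyTest and its body are hypothetical and are not the actual test, and only the "@requires !vm.graal.enabled" line is meant to mirror the change.

```java
/*
 * @test
 * @summary Hypothetical C2-specific test that jtreg skips when Graal is the JIT compiler.
 * @requires !vm.graal.enabled
 * @run main ExampleC2OnlyTest
 */
class ExampleC2OnlyTest {
    // Placeholder workload; a real test would exercise the C2 optimization under test.
    static int sum(int n) {
        int s = 0;
        for (int i = 0; i < n; i++) {
            s += i;
        }
        return s;
    }

    public static void main(String[] args) {
        if (sum(10) != 45) {
            throw new RuntimeException("unexpected result");
        }
    }
}
```

When the @requires expression evaluates to false, jtreg does not run the test at all, so it can no longer time out in that configuration.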

Best regards,
Christian

From rwestrel at redhat.com  Thu Jul  2 14:58:53 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Thu, 02 Jul 2020 16:58:53 +0200
Subject: [16] RFR(T): 8248596: [TESTBUG]
 compiler/loopopts/PartialPeelingUnswitch.java times out with Graal enabled
In-Reply-To: <19cf54c7-776d-63e4-6d40-bd84733a2f17@oracle.com>
References: <19cf54c7-776d-63e4-6d40-bd84733a2f17@oracle.com>
Message-ID: <87mu4i9d5e.fsf@redhat.com>


> http://cr.openjdk.java.net/~chagedorn/8248596/webrev.00/

That looks good to me but I've been wondering what the interaction of
this and similar issues with libgraal is. Presumably, running this test
with libgraal wouldn't time out. Does "@requires !vm.graal.enabled"
cover both graal and libgraal? If so, is the plan to reevaluate all those
additions once libgraal becomes the standard way of running graal? Or
is there another way to override the requirement so that if you run
with libgraal, all tests are run?

Roland.


From tom.rodriguez at oracle.com  Thu Jul  2 15:50:43 2020
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Thu, 2 Jul 2020 08:50:43 -0700
Subject: RFR: 8248321: [JVMCI] improve libgraal logging and fatal error
 handling
In-Reply-To: <E2A41147-E504-4CA0-B027-44B31B31ED35@oracle.com>
References: <E2A41147-E504-4CA0-B027-44B31B31ED35@oracle.com>
Message-ID: <4d9b0f80-35be-6ccc-6bf0-ea2b80b485a1@oracle.com>

Looks good.

tom


From martin.doerr at sap.com  Thu Jul  2 15:57:22 2020
From: martin.doerr at sap.com (Doerr, Martin)
Date: Thu, 2 Jul 2020 15:57:22 +0000
Subject: RFR(M): 8248191: PPC: Implement Load/Store Vector with lxvl/stxvl
 in Power10
In-Reply-To: <20200701194910.GA141565@pacoca>
References: <20200701194910.GA141565@pacoca>
Message-ID: <HE1PR02MB3066D76F3AA81AB764DEE3BF9A6D0@HE1PR02MB3066.eurprd02.prod.outlook.com>

Where do we save xxswapd instructions?
I can't see it in the webrev.

Best regards,
Martin

> -----Original Message-----
> From: hotspot-compiler-dev <hotspot-compiler-dev-
> retn at openjdk.java.net> On Behalf Of joserz at linux.ibm.com
> Sent: Mittwoch, 1. Juli 2020 21:49
> To: hotspot-compiler-dev at openjdk.java.net
> Cc: Michihiro Horie <HORIE at jp.ibm.com>
> Subject: RFR(M): 8248191: PPC: Implement Load/Store Vector with lxvl/stxvl
> in Power10
> 
> This patch introduces two instructions lxvl/stxvl and replaces the current
> lxvd2x/stxvd2x to load and store vectors. Like lxvd2x/stxvd2x, lxvl/stxvl can
> access unaligned effective addresses with the advantage of *not* requiring
> xxswapd after lxvd2x (or before stxvd2x) to correct the lanes in little-endian
> mode.
> 
> Webrev: https://cr.openjdk.java.net/~mhorie/8248191/webrev.00/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8248191
> 
> Thanks for your review!
> 
> Jose R. Ziviani
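
To make the lane-order point in the quoted description concrete, here is a toy Java model. It is not PPC code and is not taken from the webrev. It assumes a 128-bit register can be pictured as two 64-bit doublewords, and that in little-endian mode an lxvd2x-style load leaves them in the opposite order from what little-endian lane numbering expects, which is why xxswapd is needed afterwards. All class and method names are hypothetical.

```java
import java.util.Arrays;

// Toy model only: a 128-bit vector register as {doubleword 0, doubleword 1},
// where doubleword 0 is the most significant half. All names are hypothetical.
final class LaneSwapSketch {

    // Models an lxvd2x-style load: the doubleword at the lower address
    // lands in doubleword 0, the next one in doubleword 1.
    static long[] loadLikeLxvd2x(long[] mem) {
        return new long[] {mem[0], mem[1]};
    }

    // Models xxswapd: exchange the two doublewords of the register.
    static long[] xxswapd(long[] reg) {
        return new long[] {reg[1], reg[0]};
    }

    // Models the order little-endian code expects: the lower address
    // ends up in the least significant half (doubleword 1).
    static long[] loadInLittleEndianOrder(long[] mem) {
        return new long[] {mem[1], mem[0]};
    }

    public static void main(String[] args) {
        long[] mem = {0x1111L, 0x2222L};
        long[] viaSwap = xxswapd(loadLikeLxvd2x(mem));
        long[] direct = loadInLittleEndianOrder(mem);
        System.out.println(Arrays.equals(viaSwap, direct)); // prints "true"
    }
}
```

In this model, a load that already delivers the little-endian order (what the description says lxvl provides) makes the separate swap step unnecessary.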

From doug.simon at oracle.com  Thu Jul  2 16:00:23 2020
From: doug.simon at oracle.com (Doug Simon)
Date: Thu, 2 Jul 2020 18:00:23 +0200
Subject: RFR: 8248321: [JVMCI] improve libgraal logging and fatal error
 handling
In-Reply-To: <4d9b0f80-35be-6ccc-6bf0-ea2b80b485a1@oracle.com>
References: <E2A41147-E504-4CA0-B027-44B31B31ED35@oracle.com>
 <4d9b0f80-35be-6ccc-6bf0-ea2b80b485a1@oracle.com>
Message-ID: <B883FEE1-FC37-4786-9F9E-B0AA60F30389@oracle.com>

Thanks Tom.

-Doug

> On 2 Jul 2020, at 17:50, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
> 
> Looks good.
> 
> tom
> 
> Doug Simon wrote on 7/1/20 2:55 PM:
>> Please review this change that:
>> 1. Sends log output from libgraal for options such as -Dlibgraal.PrintGC=true to HotSpot's tty stream.
>> 2. Forwards a fatal error in libgraal to HotSpot's report_fatal function so that a proper hs_err_pid crash log is produced.
>> 3. Adds coarse grained JVMCI events to the hs_err_pid crash log that can help diagnose libgraal crashes.
>> https://bugs.openjdk.java.net/browse/JDK-8248321
>> https://cr.openjdk.java.net/~dnsimon/8248321/webrev.00/
>> Testing: hs-tier1,hs-tier2,hs-tier3-graal,hs-tier4-graal
>> I've also tested this on a JDK 16 libgraal build (thanks to Bob's recent fixes <https://github.com/oracle/graal/pull/2635>) using the -Dlibgraal.CrashAtIsFatal=true option introduced for testing purposes. Here are extracts from the resulting hs_err_pid log:
>> Stack: [0x000070000404e000,0x000070000424e000],  sp=0x000070000424cfa0,  free space=2043k
>> Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
>> V  [libjvm.dylib+0xa74706]  _ZN7VMError14report_and_dieEiPKcS1_P13__va_list_tagP6ThreadPhPvS7_S1_im+0x696
>> V  [libjvm.dylib+0xa74dcb]  _ZN7VMError14report_and_dieEP6ThreadPvPKciS4_S4_P13__va_list_tag+0x3b
>> V  [libjvm.dylib+0x2ffeb6]  _Z12report_fatalPKciS0_z+0xb6
>> V  [libjvm.dylib+0x623b2e]  _ZL6_fatalv+0x1e
>> C  [libjvmcicompiler.dylib+0x50c2e]  FunctionPointerLogHandler_fatalError_45f632dec0d6a0795524f3a791e61bc3381552ca+0x5e
>> C  [libjvmcicompiler.dylib+0x6251d9]  GraalCompiler_notifyCrash_6e5abb0717b70e82f6be0f6751e33644079f0e7c+0x199
>> C  [libjvmcicompiler.dylib+0x622f36]  GraalCompiler_checkForRequestedCrash_a1f0e6b1c079f96a46be20bd2ccc87fb7db83871+0x256
>> C  [libjvmcicompiler.dylib+0x623929]  GraalCompiler_compile_5fc27c66103532b8aadfba9a53a0cfc56727e415+0x209
>> C  [libjvmcicompiler.dylib+0x623e80]  GraalCompiler_compileGraph_7c727cf4f7ff3555660a81773d74fd53c28861a9+0x1e0
>> C  [libjvmcicompiler.dylib+0x742259]  HotSpotGraalCompiler_compileHelper_d3a966217707633929a5b5a4a7670fbd583caf11+0x419
>> C  [libjvmcicompiler.dylib+0x741d95]  HotSpotGraalCompiler_compile_80896636e2e15249ae0fc7c3c7f4cb060aca0523+0x165
>> JVMCI Events (8 events):
>> Event: 0.015 Thread 0x00007fa00b011600 created new JVMCI runtime 0 (0x00007fa01af24040)
>> Event: 0.015 Thread 0x00007fa00b011600 created new JVMCI runtime -1 (0x00007fa01af240a0)
>> Event: 0.072 Thread 0x00007fa01b02bc00 loaded JVMCI shared library from /Users/dnsimon/hs/graal/sdk/mxbuild/darwin-amd64/GRAALVM_LIBGRAAL_JAVA16/graalvm-libgraal-java16-20.2.0-dev/lib/libjvmcicompiler.dylib
>> Event: 0.073 Thread 0x00007fa01b02bc00 created JavaVM[1]@0x00000001409a3cb0 for JVMCI runtime 0
>> Event: 0.073 Thread 0x00007fa01b02bc00 initializing JVMCI runtime 0
>> Event: 0.074 Thread 0x00007fa01b02bc00 initialized JVMCI runtime 0
>> Event: 0.082 Thread 0x00007fa01b02bc00 initializing JVMCI runtime -1
>> Event: 0.088 Thread 0x00007fa01b02bc00 initialized JVMCI runtime -1
>> -Doug


From goetz.lindenmaier at sap.com  Thu Jul  2 16:45:07 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 2 Jul 2020 16:45:07 +0000
Subject: [CAUTION] RFR(XS) 8247695: [PPC, S390]:
 compiler/intrinsics/math/TestFpMinMaxIntrinsics.java fails
In-Reply-To: <AM0PR0202MB333141A97C084B832FD6EEE69B6D0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
References: <AM0PR0202MB333141A97C084B832FD6EEE69B6D0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
Message-ID: <AM4PR0202MB2964E741EE5E5292027FD611EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>

Hi Richard, 

I had a look at your change, looks good.  Reviewed.
Thanks for fixing this.

Best regards,
  Goetz.


> -----Original Message-----
> From: hotspot-compiler-dev <hotspot-compiler-dev-retn at openjdk.java.net>
> On Behalf Of Reingruber, Richard
> Sent: Thursday, July 2, 2020 4:05 PM
> To: hotspot-compiler-dev at openjdk.java.net
> Subject: [CAUTION] RFR(XS) 8247695: [PPC, S390]:
> compiler/intrinsics/math/TestFpMinMaxIntrinsics.java fails
> 
> Hi,
> 
> could I please get reviews for this small bugfix which adds support for AbsL
> nodes to the C2
> backends on PPC and S390?
> 
> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8247695/webrev.0/
> Bug:    https://bugs.openjdk.java.net/browse/JDK-8247695
> 
> The patch successfully passes regression testing @SAP which includes JCK
> and JTREG tests, also in
> Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance Suite, SAP specific
> tests with fastdebug and
> release builds.
> 
> Thanks, Richard.

From vladimir.kozlov at oracle.com  Thu Jul  2 17:05:14 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 2 Jul 2020 10:05:14 -0700
Subject: [16] RFR(XS) 8076985: Allocation path: biased locking +
 compressed oops code quality
In-Reply-To: <87sgea9yxx.fsf@redhat.com>
References: <580c31f3-b86e-a6c3-ca61-2d6104a846f8@oracle.com>
 <87sgea9yxx.fsf@redhat.com>
Message-ID: <a646a032-6c56-6dbd-0799-4031eee5c5c6@oracle.com>

Thank you, Roland

Vladimir K

On 7/2/20 12:08 AM, Roland Westrelin wrote:
> 
>> https://cr.openjdk.java.net/~kvn/8076985/webrev.00/
> 
> Looks good to me.
> 
> Roland.
> 

From vladimir.kozlov at oracle.com  Thu Jul  2 17:12:00 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 2 Jul 2020 10:12:00 -0700
Subject: RFR(XL) 8247922: Update Graal
In-Reply-To: <d15ce4fe-af4b-5c6f-95e2-025c4956d55a@oracle.com>
References: <d15ce4fe-af4b-5c6f-95e2-025c4956d55a@oracle.com>
Message-ID: <c862de34-c1b9-ea2a-e9ee-4a113ae07f25@oracle.com>

Looks good. Thank you for linking failures to bugs.

Vladimir K.

On 7/1/20 5:08 PM, Dean Long wrote:
> https://bugs.openjdk.java.net/browse/JDK-8247922
> http://cr.openjdk.java.net/~dlong/8247922/webrev/
> 
> This is a Graal update. Changes since the last update (JDK-8243380) are listed in the bug description.
> 
> dl
> 
> 
> 

From vladimir.kozlov at oracle.com  Thu Jul  2 17:30:01 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 2 Jul 2020 10:30:01 -0700
Subject: [16] RFR(T): 8248596: [TESTBUG]
 compiler/loopopts/PartialPeelingUnswitch.java times out with Graal enabled
In-Reply-To: <19cf54c7-776d-63e4-6d40-bd84733a2f17@oracle.com>
References: <19cf54c7-776d-63e4-6d40-bd84733a2f17@oracle.com>
Message-ID: <660c7d05-520a-ae36-c608-36dfba5eebf2@oracle.com>

I think it should require vm.compiler2.enabled because this test is very C2 specific.
Note, Graal and C2 are mutually exclusive.

Thanks,
Vladimir K

On 7/2/20 7:31 AM, Christian Hagedorn wrote:
> Hi
> 
> Please review the following patch:
> https://bugs.openjdk.java.net/browse/JDK-8248596
> http://cr.openjdk.java.net/~chagedorn/8248596/webrev.00/
> 
> It excludes the execution of this C2 specific test with Graal since it has many methods and runs with CompileOnly, 
> possibly letting it time out with Graal.
> 
> Best regards,
> Christian

From vladimir.kozlov at oracle.com  Thu Jul  2 17:42:12 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 2 Jul 2020 10:42:12 -0700
Subject: [16] RFR(T): 8248596: [TESTBUG]
 compiler/loopopts/PartialPeelingUnswitch.java times out with Graal enabled
In-Reply-To: <87mu4i9d5e.fsf@redhat.com>
References: <19cf54c7-776d-63e4-6d40-bd84733a2f17@oracle.com>
 <87mu4i9d5e.fsf@redhat.com>
Message-ID: <905225a7-8e07-ba19-f9b4-d5fad89e68ce@oracle.com>

On 7/2/20 7:58 AM, Roland Westrelin wrote:
> 
>> http://cr.openjdk.java.net/~chagedorn/8248596/webrev.00/
> 
> That looks good to me but I've been wondering what the interaction of
> this and similar issues with libgraal is. Presumably, running this test
> with libgraal wouldn't time out. Does:
> @requires !vm.graal.enabled cover
> both graal and libgraal? If so is the plan to reevaluate all those
> additions once libgraal becomes the standard way of running graal? Or
> is there another way to override the requirement so that if you run
> with libgraal, all tests are run?

Yes, we have such a way: ProblemList-graal.txt. We list a test there and link it to the following bug:

https://bugs.openjdk.java.net/browse/JDK-8207267

But sometimes we have a test which only checks very C2 specific functionality. I think it is okay to run it only with C2.

Vladimir K

> 
> Roland.
> 

From vladimir.kozlov at oracle.com  Thu Jul  2 17:49:55 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 2 Jul 2020 10:49:55 -0700
Subject: [aarch64-port-dev ] RFR(XXS):8248570 Incorrect copyright header
 in TestUnsafeUnalignedSwap.java
In-Reply-To: <587101a8-7cb0-453b-aed5-4edca2cdda2d.zhuoren.wz@alibaba-inc.com>
References: <587101a8-7cb0-453b-aed5-4edca2cdda2d.zhuoren.wz@alibaba-inc.com>
Message-ID: <e59081eb-8513-2738-dfa3-c75f427ac43e@oracle.com>

Thank you, Zhuoren

Checks passed now.

Vladimir K

On 7/2/20 12:36 AM, Wang Zhuo(Zhuoren) wrote:
> Hi,
> There's something wrong in the legal notice of the TestUnsafeUnalignedSwap.java file. It should be GPLv2 as in `make/templates/gpl-header`. This patch (from Vladimir Kozlov) fixes it.
> BUG Link:https://bugs.openjdk.java.net/browse/JDK-8248570
> CR: http://cr.openjdk.java.net/~wzhuo/8248570/webrev.00/
> 
> 
> Regards,
> Zhuoren
> 

From boris.ulasevich at bell-sw.com  Thu Jul  2 18:02:49 2020
From: boris.ulasevich at bell-sw.com (Boris Ulasevich)
Date: Thu, 2 Jul 2020 21:02:49 +0300
Subject: RFR 8248043: Need to eliminate excessive i2l conversions
In-Reply-To: <424D5809-A580-43BD-A00D-B49C470AF280@oracle.com>
References: <096e0df7-8208-2a07-975f-e2de8bc27e3a@bell-sw.com>
 <75920e44-518e-10e0-53b3-c2a6f85fd841@oracle.com>
 <0be466e7-057c-b029-3461-de21d9cd3910@bell-sw.com>
 <AA66D907-6ADF-44DB-8ADE-2D16D3CF6AC5@oracle.com>
 <CAOhyNwDQGnDk5VYR=xLEW5KGJLtJiWTZLO9xnVhhKUXJR9UOog@mail.gmail.com>
 <d4b7547a-02d5-fcad-5898-ce7f89383fb0@oracle.com>
 <E8DCC635-1EEB-4A5D-92D7-7E7D52654BAF@oracle.com>
 <CAOhyNwAK-F3b2zgLsWKDQiKNyu2F1TavLcC_zfgtBn2GmSm=xQ@mail.gmail.com>
 <CAOhyNwBbsRY_BFxiz+uNz09F_5FFzTU-p4bKK83rrT_icC6PhQ@mail.gmail.com>
 <424D5809-A580-43BD-A00D-B49C470AF280@oracle.com>
Message-ID: <044d72f2-8895-3070-21fe-937af7fd2bc3@bell-sw.com>

Thank you, Igor and Vladimir.

Boris

On 01.07.2020 22:29, Igor Veresov wrote:
> That looks good.
>
> igor
>
>
>
>> On Jul 1, 2020, at 2:16 AM, Boris Ulasevich 
>> <boris.ulasevich at bell-sw.com <mailto:boris.ulasevich at bell-sw.com>> wrote:
>>
>> Hi,
>>
>> It is the third attempt to send a correct link. Sorry for that ;)
>> http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02c
>>
>> Thanks,
>> Boris
>>
>> On Wednesday, July 1, 2020, Boris Ulasevich 
>> <boris.ulasevich at bell-sw.com <mailto:boris.ulasevich at bell-sw.com>> wrote:
>>
>>     Hi,
>>
>>     I'm deeply sorry. Yes, webrev.02b is certainly wrong!
>>     Correct link is webrev.02c:
>>     http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02c
>>     - this is the change I described in my mail and wanted to review.
>>
>>     my apologies,
>>     Boris
>>
>>     On Wednesday, July 1, 2020, Igor Veresov <igor.veresov at oracle.com
>>     <mailto:igor.veresov at oracle.com>> wrote:
>>
>>         > On Jun 30, 2020, at 10:15 PM, Vladimir Kozlov
>>         <vladimir.kozlov at oracle.com
>>         <mailto:vladimir.kozlov at oracle.com>> wrote:
>>         >
>>         > I think Igor said that you can't swap arguments of compare
>>         without changing condition test. For example, if it was CC_LT
>>         it should be CC_GT after swap.
>>
>>         Yes, that's exactly what I had in mind. The condition must be
>>         inverted. Otherwise your transformation [3] is not valid for
>>         anything but equality, so that's not going to work. If [3]
>>         didn't work, perhaps there is another user of the CmpLNode
>>         in addition to the BoolNode?
>>
>>         igor
>>
>>         >
>>         > It is not clear why you need swapping in CmpLNode::Ideal()
>>         if BoolNode::Ideal() should do it already. If it does not you
>>         need to investigate why.
>>         >
>>         > Also your list of steps 1.-3. does not reflect changes in
>>         webrev.02b:
>>         >
>>         http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02b/src/hotspot/share/opto/subnode.cpp.udiff.html
>>         <http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02b/src/hotspot/share/opto/subnode.cpp.udiff.html>
>>         >
>>         > Regards,
>>         > Vladimir
>>         >
>>         > On 6/30/20 9:33 PM, Boris Ulasevich wrote:
>>         >> Hi Igor,
>>         >> By BoolNode I mean the canonicalization that is already in
>>         place:
>>         >>
>>         https://hg.openjdk.java.net/jdk/jdk/file/de6ad5f86276/src/hotspot/share/opto/subnode.cpp#l1391
>>         <https://hg.openjdk.java.net/jdk/jdk/file/de6ad5f86276/src/hotspot/share/opto/subnode.cpp#l1391>
>>         >> thanks,
>>         >> Boris
>>         >> On Wed, Jul 1, 2020 at 5:07 AM Igor Veresov
>>         <igor.veresov at oracle.com <mailto:igor.veresov at oracle.com>> wrote:
>>         >>> I think you forgot to include changes to BoolNode in the
>>         webrev.
>>         >>>
>>         >>> igor
>>         >>>
>>         >>>
>>         >>>
>>         >>> On Jun 30, 2020, at 11:04 AM, Boris Ulasevich
>>         <boris.ulasevich at bell-sw.com
>>         <mailto:boris.ulasevich at bell-sw.com>>
>>         >>> wrote:
>>         >>>
>>         >>> Hi Claes,
>>         >>>
>>         >>>> Seems like the optimization is mostly effective, but not
>>         getting all the
>>         >>> way.
>>         >>>
>>         >>> Good point about LHS, thanks! CmpL turned out not to be
>>         >>> canonicalized at that point.
>>         >>> I moved the optimization to CmpLNode::Ideal and the
>>         >>> transformations now work as follows:
>>         >>> 1. CmpINode::Ideal: CmpI(CmpL3)->CmpL
>>         >>> 2. BoolNode::Ideal:
>>         >>> Bool(CmpL(const,val),test)->Bool(CmpL(val,const),test_invert)
>>         >>> 3. CmpLNode::Ideal: CmpL(ConvI2L(val),ConL)->CmpI(val,ConI)
>>         >>>
>>         >>> I applied your test to the benchmark. The result is:
>>         >>> Benchmark                            Mode  Cnt   Score   Error  Units
>>         >>> SkipIntToLongCast.skipCastTestLeft   avgt    5  14.288 ± 0.052  ns/op
>>         >>> SkipIntToLongCast.skipCastTestRight  avgt    5  14.338 ± 0.088  ns/op
>>         >>>
>>         >>> Updated webrev:
>>         >>> http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02b
>>         <http://cr.openjdk.java.net/~bulasevich/8248043/webrev.02b>
>>         >>>
>>         >>> thanks,
>>         >>> Boris
>>         >>>
>>         >>> On 26.06.2020 21:31, Claes Redestad wrote:
>>         >>>
>>         >>> Hi Boris,
>>         >>>
>>         >>> this looks like a nice improvement! I just have some
>>         comments about the
>>         >>> micro.
>>         >>>
>>         >>> I was curious whether the optimization works when the
>>         constant is on
>>         >>> the LHS and added a variant of the micro to try that[1].
>>         Results are
>>         >>> interesting (Intel Xeon):
>>         >>>
>>         >>> Benchmark                           Mode  Cnt   Score   Error  Units
>>         >>> SkipIntToLongCast.skipCastTest      avgt    5  30.937 ± 0.056  ns/op
>>         >>> SkipIntToLongCast.skipCastTestLeft  avgt    5  30.937 ± 0.140  ns/op
>>         >>>
>>         >>> With your patch:
>>         >>> Benchmark                           Mode  Cnt   Score   Error  Units
>>         >>> SkipIntToLongCast.skipCastTest      avgt    5  14.123 ± 0.035  ns/op
>>         >>> SkipIntToLongCast.skipCastTestLeft  avgt    5  17.420 ± 0.044  ns/op
>>         >>>
>>         >>> Seems like the optimization is mostly effective, but not
>>         getting all
>>         >>> the way. I wouldn't worry about it for this RFE, but
>>         perhaps something
>>         >>> to investigate in a follow-up. Feel free to include such
>>         a variant in
>>         >>> your patch though (no attribution necessary).
>>         >>>
>>         >>> The micro also stabilizes very quickly, so you might want
>>         to provide
>>         >>> some default tuning to keep runtime in check, e.g.,
>>         something like:
>>         >>>
>>         >>> @Warmup(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
>>         >>> @Measurement(iterations = 5, time = 1000, timeUnit = TimeUnit.MILLISECONDS)
>>         >>> @Fork(3)
>>         >>>
>>         >>> Thanks!
>>         >>>
>>         >>> /Claes
>>         >>>
>>         >>> [1]
>>         >>>     @Benchmark
>>         >>>     public int skipCastTestLeft() {
>>         >>>         for (int i = 0; i < ARRAYSIZE_L; i++) {
>>         >>>             if (ARRAYSIZE_L == intValues[i]) {
>>         >>>                 return i;
>>         >>>             }
>>         >>>         }
>>         >>>         return 0;
>>         >>>     }
>>         >>>
>>         >>> On 2020-06-26 17:05, Boris Ulasevich wrote:
>>         >>>
>>         >>> Hi all,
>>         >>>
>>         >>> Please review the change to eliminate the unnecessary i2l
>>         conversion
>>         >>> for expressions like this: "if (intValue == 1L)".
>>         >>>
>>         >>> http://bugs.openjdk.java.net/browse/JDK-8248043
>>         <http://bugs.openjdk.java.net/browse/JDK-8248043>
>>         >>> http://cr.openjdk.java.net/~bulasevich/8248043/webrev.01
>>         <http://cr.openjdk.java.net/~bulasevich/8248043/webrev.01>
>>         >>>
>>         >>> The provided benchmark shows performance boost on all
>>         platforms:
>>         >>> - Intel Xeon: 32.705 --> 14.234 ns/op
>>         >>> - arm64: 42.060 --> 25.456 ns/op
>>         >>> - arm32: 618.763 --> 314.040 ns/op
>>         >>> - ppc8: 81.218 --> 63.026 ns/op
>>         >>>
>>         >>> Testing done: jtreg, jck.
>>         >>>
>>         >>> thanks,
>>         >>> Boris
>>         >>>
>>         >>>
>>         >>>
>>         >>>
>>
>
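
For readers following the thread, the two points discussed above (narrowing the compare, and adjusting the test when operands are swapped) can be sketched in plain Java. This is only an illustration of the arithmetic, not C2 code and not the webrev. It assumes the narrowing is applied only when the long constant fits in the int range, and all names are hypothetical. Note that the adjustment needed after a swap is the mirrored test (lt becomes gt), which is not the same as logical negation.

```java
// Illustrative model only; not C2 code. All names are hypothetical.
final class I2LCompareSketch {

    // Original shape: the int is widened (i2l) and compared as a long.
    static boolean lessWide(int val, long con) {
        return (long) val < con;
    }

    // Narrowed shape: compare as ints. Assumed valid only when the
    // constant fits in the int range, so that is checked explicitly here.
    static boolean lessNarrow(int val, long con) {
        if (con < Integer.MIN_VALUE || con > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("constant does not fit in int");
        }
        return val < (int) con;
    }

    // Constant on the left-hand side: con < val.
    static boolean constLeftLess(long con, int val) {
        return con < (long) val;
    }

    // Operands swapped and the test mirrored (lt -> gt): equivalent.
    static boolean swappedMirrored(int val, long con) {
        return (long) val > con;
    }

    // Operands swapped but the test left unchanged: not equivalent.
    static boolean swappedUnchanged(int val, long con) {
        return (long) val < con;
    }

    public static void main(String[] args) {
        int[] samples = {Integer.MIN_VALUE, -1, 0, 41, 42, 43, Integer.MAX_VALUE};
        long con = 42L;
        for (int v : samples) {
            if (lessWide(v, con) != lessNarrow(v, con)) {
                throw new AssertionError("narrowing changed the result for " + v);
            }
            if (constLeftLess(con, v) != swappedMirrored(v, con)) {
                throw new AssertionError("mirrored swap changed the result for " + v);
            }
        }
        // Counterexample for swapping without adjusting the test.
        System.out.println(constLeftLess(42L, 43) + " " + swappedUnchanged(43, 42L));
        // prints "true false"
    }
}
```

For equality tests the swap is harmless either way, which matches the remark that an unadjusted swap is only valid for equality.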


From boris.ulasevich at bell-sw.com  Thu Jul  2 18:13:44 2020
From: boris.ulasevich at bell-sw.com (Boris Ulasevich)
Date: Thu, 2 Jul 2020 21:13:44 +0300
Subject: RFR(XS) 8248568: compiler/c2/TestBit.java failed: 'test' missing from
 stdout/stderr
Message-ID: <ae99e130-ccc5-30c8-aeeb-1b7443475513@bell-sw.com>

Hi,

Please review a one-line change: adding -Xbatch option to recently
introduced test to get a more predictable PrintOptoAssembly output.

http://cr.openjdk.java.net/~bulasevich/8248568/webrev.00
http://bugs.openjdk.java.net/browse/JDK-8248568

thanks,
Boris

From vladimir.kozlov at oracle.com  Thu Jul  2 18:54:27 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 2 Jul 2020 11:54:27 -0700
Subject: RFR(XS) 8248568: compiler/c2/TestBit.java failed: 'test' missing
 from stdout/stderr
In-Reply-To: <ae99e130-ccc5-30c8-aeeb-1b7443475513@bell-sw.com>
References: <ae99e130-ccc5-30c8-aeeb-1b7443475513@bell-sw.com>
Message-ID: <a7bafafd-07e0-91ba-8c5b-2d7880e09c7e@oracle.com>

Good.

You may also replace next requirements:

vm.flavor == "server" & !vm.graal.enabled

with one:

vm.compiler2.enabled

Graal and C2 are mutually exclusive.

Maybe also run the processes without C1 by switching off Tiered Compilation.

And instead of:
@run main/othervm compiler.c2.TestBit

use:
@run driver compiler.c2.TestBit

Because you are launching separate processes.

Please, test changes with jtreg testing.

Thanks,
Vladimir K

On 7/2/20 11:13 AM, Boris Ulasevich wrote:
> Hi,
> 
> Please review a one-line change: adding -Xbatch option to recently
> introduced test to get a more predictable PrintOptoAssembly output.
> 
> http://cr.openjdk.java.net/~bulasevich/8248568/webrev.00
> http://bugs.openjdk.java.net/browse/JDK-8248568
> 
> thanks,
> Boris
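
A rough sketch of how the suggestions in this reply fit together, for readers who do not have the webrev at hand. The header comment shows the shape of the jtreg tags being proposed, and the helper only assembles a child-JVM command line with the flags mentioned in this thread. The class and method names are hypothetical, the actual test is not reproduced here, and depending on the build -XX:+PrintOptoAssembly may need a debug VM or additional unlock options.

```java
import java.util.ArrayList;
import java.util.List;

// Shape of the jtreg header being suggested (sketch only):
//
//   @requires vm.compiler2.enabled
//   @run driver compiler.c2.TestBit
//
// With "driver", the test class itself only launches child JVMs, so the
// compiler flags go on the child command line rather than on the test VM.
final class ChildJvmFlagsSketch {

    // Builds the argument list for a child JVM. Hypothetical helper; a real
    // jtreg test would normally go through the test library instead.
    static List<String> childCommand(String javaBinary, String mainClass,
                                     String... extraArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBinary);
        cmd.add("-Xbatch");                  // compile in the foreground
        cmd.add("-XX:-TieredCompilation");   // no C1, only the top-tier compiler
        cmd.add("-XX:+PrintOptoAssembly");   // output the test inspects
        cmd.add(mainClass);
        for (String a : extraArgs) {
            cmd.add(a);
        }
        return cmd;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", childCommand("java", "SomeMain")));
    }
}
```

Keeping the flags in one place like this makes it easy to see that every child process gets the same, more deterministic compilation setup.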

From richard.reingruber at sap.com  Thu Jul  2 19:22:19 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Thu, 2 Jul 2020 19:22:19 +0000
Subject: [CAUTION] RFR(XS) 8247695: [PPC, S390]:
 compiler/intrinsics/math/TestFpMinMaxIntrinsics.java fails
In-Reply-To: <AM4PR0202MB2964E741EE5E5292027FD611EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <AM0PR0202MB333141A97C084B832FD6EEE69B6D0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM4PR0202MB2964E741EE5E5292027FD611EC6D0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <AM0PR0202MB333191AD2F85C9527AF9778A9B6D0@AM0PR0202MB3331.eurprd02.prod.outlook.com>

Thank you Goetz!

Cheers, Richard.

-----Original Message-----
From: Lindenmaier, Goetz <goetz.lindenmaier at sap.com> 
Sent: Donnerstag, 2. Juli 2020 18:45
To: Reingruber, Richard <richard.reingruber at sap.com>; hotspot-compiler-dev at openjdk.java.net
Subject: RE: [CAUTION] RFR(XS) 8247695: [PPC, S390]: compiler/intrinsics/math/TestFpMinMaxIntrinsics.java fails

Hi Richard, 

I had a look at your change, looks good.  Reviewed.
Thanks for fixing this.

Best regards,
  Goetz.


> -----Original Message-----
> From: hotspot-compiler-dev <hotspot-compiler-dev-retn at openjdk.java.net>
> On Behalf Of Reingruber, Richard
> Sent: Thursday, July 2, 2020 4:05 PM
> To: hotspot-compiler-dev at openjdk.java.net
> Subject: [CAUTION] RFR(XS) 8247695: [PPC, S390]:
> compiler/intrinsics/math/TestFpMinMaxIntrinsics.java fails
> 
> Hi,
> 
> could I please get reviews for this small bugfix which adds support for AbsL
> nodes to the C2
> backends on PPC and S390?
> 
> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8247695/webrev.0/
> Bug:    https://bugs.openjdk.java.net/browse/JDK-8247695
> 
> The patch successfully passes regression testing @SAP which includes JCK
> and JTREG tests, also in
> Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance Suite, SAP specific
> tests with fastdebug and
> release builds.
> 
> Thanks, Richard.

From dean.long at oracle.com  Thu Jul  2 20:01:30 2020
From: dean.long at oracle.com (Dean Long)
Date: Thu, 2 Jul 2020 13:01:30 -0700
Subject: RFR(XL) 8247922: Update Graal
In-Reply-To: <c862de34-c1b9-ea2a-e9ee-4a113ae07f25@oracle.com>
References: <d15ce4fe-af4b-5c6f-95e2-025c4956d55a@oracle.com>
 <c862de34-c1b9-ea2a-e9ee-4a113ae07f25@oracle.com>
Message-ID: <8afc61d8-a542-0952-7854-13bfca718d2f@oracle.com>

Thanks Vladimir.

dl

On 7/2/20 10:12 AM, Vladimir Kozlov wrote:
> Looks good. Thank you for linking failures to bugs.
>
> Vladimir K.
>
> On 7/1/20 5:08 PM, Dean Long wrote:
>> https://bugs.openjdk.java.net/browse/JDK-8247922
>> http://cr.openjdk.java.net/~dlong/8247922/webrev/
>>
>> This is a Graal update. Changes since the last update (JDK-8243380)
>> are listed in the bug description.
>>
>> dl
>>
>>
>>


From boris.ulasevich at bell-sw.com  Thu Jul  2 21:29:35 2020
From: boris.ulasevich at bell-sw.com (Boris Ulasevich)
Date: Fri, 3 Jul 2020 00:29:35 +0300
Subject: RFR(XS) 8248568: compiler/c2/TestBit.java failed: 'test' missing
 from stdout/stderr
In-Reply-To: <a7bafafd-07e0-91ba-8c5b-2d7880e09c7e@oracle.com>
References: <ae99e130-ccc5-30c8-aeeb-1b7443475513@bell-sw.com>
 <a7bafafd-07e0-91ba-8c5b-2d7880e09c7e@oracle.com>
Message-ID: <CAOhyNwB8L9dabfCiuK=9qK6WMkdR7o2MdOxCNoBv0UiBRTmpEA@mail.gmail.com>

Hi Vladimir,

Thank you. I applied your suggestions. On our machines jtreg runs well.
Update: http://cr.openjdk.java.net/~bulasevich/8248568/webrev.01

regards,
Boris

On Thu, Jul 2, 2020 at 9:54 PM Vladimir Kozlov <vladimir.kozlov at oracle.com>
wrote:

> Good.
>
> You may also replace next requirements:
>
> vm.flavor == "server" & !vm.graal.enabled
>
> with one:
>
> vm.compiler2.enabled
>
> Graal and C2 are mutually exclusive.
>
> Maybe also run the processes without C1 by switching off Tiered Compilation.
>
> And instead of:
> @run main/othervm compiler.c2.TestBit
>
> use:
> @run driver compiler.c2.TestBit
>
> Because you are launching separate processes.
>
> Please, test changes with jtreg testing.
>
> Thanks,
> Vladimir K
>
> On 7/2/20 11:13 AM, Boris Ulasevich wrote:
> > Hi,
> >
> > Please review a one-line change: adding -Xbatch option to recently
> > introduced test to get a more predictable PrintOptoAssembly output.
> >
> > http://cr.openjdk.java.net/~bulasevich/8248568/webrev.00
> > http://bugs.openjdk.java.net/browse/JDK-8248568
> >
> > thanks,
> > Boris
>

From vladimir.kozlov at oracle.com  Thu Jul  2 21:45:49 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 2 Jul 2020 14:45:49 -0700
Subject: RFR(XS) 8248568: compiler/c2/TestBit.java failed: 'test' missing
 from stdout/stderr
In-Reply-To: <CAOhyNwB8L9dabfCiuK=9qK6WMkdR7o2MdOxCNoBv0UiBRTmpEA@mail.gmail.com>
References: <ae99e130-ccc5-30c8-aeeb-1b7443475513@bell-sw.com>
 <a7bafafd-07e0-91ba-8c5b-2d7880e09c7e@oracle.com>
 <CAOhyNwB8L9dabfCiuK=9qK6WMkdR7o2MdOxCNoBv0UiBRTmpEA@mail.gmail.com>
Message-ID: <d2b21bb9-91d7-1e2d-46e2-9d31a1ee41e9@oracle.com>

Looks good.

Thanks,
Vladimir

On 7/2/20 2:29 PM, Boris Ulasevich wrote:
> Hi Vladimir,
> 
> Thank you. I applied your suggestions. On our machines jtreg runs well.
> Update: http://cr.openjdk.java.net/~bulasevich/8248568/webrev.01
> 
> regards,
> Boris
> 
> On Thu, Jul 2, 2020 at 9:54 PM Vladimir Kozlov <vladimir.kozlov at oracle.com>
> wrote:
> 
>> Good.
>>
>> You may also replace next requirements:
>>
>> vm.flavor == "server" & !vm.graal.enabled
>>
>> with one:
>>
>> vm.compiler2.enabled
>>
>> Graal and C2 are mutually exclusive.
>>
>> Maybe also run the processes without C1 by switching off Tiered Compilation.
>>
>> And instead of:
>> @run main/othervm compiler.c2.TestBit
>>
>> use:
>> @run driver compiler.c2.TestBit
>>
>> Because you are launching separate processes.
>>
>> Please, test changes with jtreg testing.
>>
>> Thanks,
>> Vladimir K
>>
>> On 7/2/20 11:13 AM, Boris Ulasevich wrote:
>>> Hi,
>>>
>>> Please review a one-line change: adding -Xbatch option to recently
>>> introduced test to get a more predictable PrintOptoAssembly output.
>>>
>>> http://cr.openjdk.java.net/~bulasevich/8248568/webrev.00
>>> http://bugs.openjdk.java.net/browse/JDK-8248568
>>>
>>> thanks,
>>> Boris
>>

From vladimir.kozlov at oracle.com  Fri Jul  3 02:02:06 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 2 Jul 2020 19:02:06 -0700
Subject: [15] RFR(T) 8247527:
 serviceability/dcmd/gc/HeapDumpCompressedTest.java fails with Graal + ZGC
Message-ID: <ec23502a-60fc-58c0-faef-5d7365ed08db@oracle.com>

https://cr.openjdk.java.net/~kvn/8247527/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8247527

Test should have @requires which excludes running Graal with GC which it does not support.

Testing: hs-tier1,hs-tier4-graal

Thanks,
Vladimir

From Yang.Zhang at arm.com  Fri Jul  3 02:15:27 2020
From: Yang.Zhang at arm.com (Yang Zhang)
Date: Fri, 3 Jul 2020 02:15:27 +0000
Subject: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API
 (Incubator): AArch64 backend changes
In-Reply-To: <BYAPR11MB354348F74149B4E17BFB24E6EF6F0@BYAPR11MB3543.namprd11.prod.outlook.com>
References: <VI1PR0802MB2558C90ACF1189F17692FF278EB90@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <275eb57c-51c0-675e-c32a-91b198023559@redhat.com>
 <719F9169-ABC4-408E-B732-F1BD9A84337F@oracle.com>
 <9a13f5df-d946-579d-4282-917dc7338dc8@redhat.com>
 <09BC0693-80E0-4F87-855E-0B38A6F5EFA2@oracle.com>
 <668e500e-f621-5a2c-a41e-f73536880f73@redhat.com>
 <VI1PR0802MB255845256424F9EB4D5D81368EB30@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <1909fa9d-98bb-c2fb-45d8-540247d1ca8b@redhat.com>
 <VI1PR0802MB255895ADE9BC8D027EBB81048EB10@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <BYAPR11MB35438110FA50851942CE17A1EF950@BYAPR11MB3543.namprd11.prod.outlook.com>
 <HE1PR0802MB25542AE212E12FD3EF6D1BE48E6E0@HE1PR0802MB2554.eurprd08.prod.outlook.com>
 <BYAPR11MB354348F74149B4E17BFB24E6EF6F0@BYAPR11MB3543.namprd11.prod.outlook.com>
Message-ID: <VI1PR0802MB255858E1565FB8263C42D0CF8E6A0@VI1PR0802MB2558.eurprd08.prod.outlook.com>

Hi Sandhya

Thanks very much for your help.

Regards,
Yang

-----Original Message-----
From: Viswanathan, Sandhya <sandhya.viswanathan at intel.com> 
Sent: Wednesday, July 1, 2020 2:57 AM
To: Yang Zhang <Yang.Zhang at arm.com>; Andrew Haley <aph at redhat.com>; Paul Sandoz <paul.sandoz at oracle.com>
Cc: nd <nd at arm.com>; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; core-libs-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
Subject: RE: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API (Incubator): AArch64 backend changes

Hi Yang,

I have merged vectorIntrinsics with changes from panama/default.

Hope this helps.
Best Regards,
Sandhya

-----Original Message-----
From: Yang Zhang <Yang.Zhang at arm.com>
Sent: Monday, June 29, 2020 12:49 AM
To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; Andrew Haley <aph at redhat.com>; Paul Sandoz <paul.sandoz at oracle.com>
Cc: nd <nd at arm.com>; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; core-libs-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
Subject: RE: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API (Incubator): AArch64 backend changes

Hi Andrew,

1. Instructions that can be matched with NEON instructions directly.
MulVB, SqrtVF and AbsV have been merged into jdk master already.

2. Instructions that jdk master has middle end support for, but they cannot be matched with NEON instructions directly.
Such as AddReductionVL, MulReductionVL, And/Or/XorReductionV. These new instructions can be moved into jdk master first, but for auto-vectorization, the performance might not improve.

3. Panama/Vector API specific  instructions such as Load/StoreVector ( 16 bits), VectorReinterpret, VectorMaskCmp, MaxV/MinV, VectorBlend etc. 
These instructions cannot be moved into jdk master first because there isn't middle-end support.

I will put 2 and 3 in a new ad file aarch64_neon.ad. I will also update aarch64_asmtest.py and macroassemler.cpp. When the patch is ready, I will send it again.
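
As a reading aid for category 2: the reduction node names refer to loops that fold a whole array into one value. A minimal sketch of such source code, assuming these are the loop shapes C2's auto-vectorizer may map to nodes like AddReductionVL and MulReductionVL. Whether a given loop is actually vectorized depends on platform and flags, and the class and method names are hypothetical.

```java
// Plain scalar reduction loops over long arrays. Nothing here forces
// vectorization; it only shows the source shape under discussion.
final class ReductionLoopSketch {

    // Sum reduction over longs.
    static long sum(long[] a) {
        long s = 0;
        for (int i = 0; i < a.length; i++) {
            s += a[i];
        }
        return s;
    }

    // Product reduction over longs.
    static long product(long[] a) {
        long p = 1;
        for (int i = 0; i < a.length; i++) {
            p *= a[i];
        }
        return p;
    }

    public static void main(String[] args) {
        long[] data = {1, 2, 3, 4};
        System.out.println(sum(data) + " " + product(data)); // prints "10 24"
    }
}
```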


Hi Sandhya,

Could you please help to manually merge panama vectorIntrinsics/vector-unstable to jdk master, so that I can update this patch based on the latest jdk master?

Regards
Yang


-----Original Message-----
From: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>
Sent: Thursday, June 25, 2020 3:04 AM
To: Yang Zhang <Yang.Zhang at arm.com>; Andrew Haley <aph at redhat.com>; Paul Sandoz <paul.sandoz at oracle.com>
Cc: nd <nd at arm.com>; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; core-libs-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
Subject: RE: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API (Incubator): AArch64 backend changes

Hi Andrew/Yang,
 
We couldn't propose the Vector API to target in time for JDK 15 and are hoping to do so early in the JDK 16 timeframe.
The implementation reviews on other components have made good progress. 
We have so far ok to PPT from (runtime, shared compiler changes, x86 backend).
Java API implementation review is in progress.
I wanted to check with you both whether we have a go-ahead from the aarch64 backend point of view.

Best Regards,
Sandhya

-----Original Message-----
From: hotspot-compiler-dev <hotspot-compiler-dev-bounces at openjdk.java.net> On Behalf Of Yang Zhang
Sent: Tuesday, May 26, 2020 7:59 PM
To: Andrew Haley <aph at redhat.com>; Paul Sandoz <paul.sandoz at oracle.com>
Cc: nd <nd at arm.com>; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; core-libs-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
Subject: RE: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API (Incubator): AArch64 backend changes

> But to my earlier question. please: can the new instructions be moved into jdk head first, and then merged into the Panama branch, or not?

The new instructions can be classified as:
1. Instructions that can be matched with NEON instructions directly.
MulVB and SqrtVF have been merged into jdk master already. The patch of AbsV is in review [1].

2. Instructions that jdk master has middle-end support for, but that cannot be matched with NEON instructions directly,
such as AddReductionVL, MulReductionVL, And/Or/XorReductionV. These new instructions can be moved into jdk master first, but for auto-vectorization, the performance might not improve.
May I have a new patch for these? 

3. Panama/Vector API specific instructions such as Load/StoreVector (16 bits), VectorReinterpret, VectorMaskCmp, MaxV/MinV, VectorBlend etc.
These instructions cannot be moved into jdk master first because there isn't middle-end support.

Regards
Yang

[1] https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-May/008861.html

-----Original Message-----
From: Andrew Haley <aph at redhat.com>
Sent: Tuesday, May 26, 2020 4:25 PM
To: Yang Zhang <Yang.Zhang at arm.com>; Paul Sandoz <paul.sandoz at oracle.com>
Cc: hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; core-libs-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net; nd <nd at arm.com>
Subject: Re: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API (Incubator): AArch64 backend changes

On 25/05/2020 09:26, Yang Zhang wrote:
> In jdk master, what we need to do is write an m4 file for the existing
> vector instructions and place them in a new file aarch64_neon.ad.
> If there are no questions, I will do it right away.

I'm not entirely sure that such a change is necessary now. In particular, reorganizing the existing vector instructions is IMO excessive, but I admit that it might be an improvement.

But to my earlier question. please: can the new instructions be moved into jdk head first, and then merged into the Panama branch, or not?
It'd help if this was possible.

--
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com> https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From igor.ignatyev at oracle.com  Fri Jul  3 02:24:15 2020
From: igor.ignatyev at oracle.com (igor.ignatyev at oracle.com)
Date: Thu, 2 Jul 2020 19:24:15 -0700
Subject: [15] RFR(T) 8247527:
 serviceability/dcmd/gc/HeapDumpCompressedTest.java fails with Graal + ZGC
In-Reply-To: <ec23502a-60fc-58c0-faef-5d7365ed08db@oracle.com>
References: <ec23502a-60fc-58c0-faef-5d7365ed08db@oracle.com>
Message-ID: <5E33E613-882E-400A-886A-EA4FAD85F2EA@oracle.com>

LGTM

-- Igor

> On Jul 2, 2020, at 7:03 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> https://cr.openjdk.java.net/~kvn/8247527/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-8247527
> 
> Test should have @requires which excludes running Graal with GC which it does not support.
> 
> Testing: hs-tier1,hs-tier4-graal
> 
> Thanks,
> Vladimir


From david.holmes at oracle.com  Fri Jul  3 02:25:45 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 3 Jul 2020 12:25:45 +1000
Subject: [15] RFR(T) 8247527:
 serviceability/dcmd/gc/HeapDumpCompressedTest.java fails with Graal + ZGC
In-Reply-To: <ec23502a-60fc-58c0-faef-5d7365ed08db@oracle.com>
References: <ec23502a-60fc-58c0-faef-5d7365ed08db@oracle.com>
Message-ID: <e3f21f4d-f0ca-8814-00c4-5515698d4432@oracle.com>

Hi Vladimir,

On 3/07/2020 12:02 pm, Vladimir Kozlov wrote:
> https://cr.openjdk.java.net/~kvn/8247527/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-8247527
> 
> Test should have @requires which excludes running Graal with GC which it 
> does not support.

I find it somewhat disturbing that a generic test has to know about the 
limitations between GCs and Graal!

I would have been more inclined to just exclude this test when running 
with Graal, even if that theoretically reduced the test coverage in a 
tiny way.

If/When Graal supports these other GCs who will remember to re-enable 
these test cases?

Thanks,
David

> Testing: hs-tier1,hs-tier4-graal
> 
> Thanks,
> Vladimir

From igor.ignatyev at oracle.com  Fri Jul  3 02:59:35 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Thu, 2 Jul 2020 19:59:35 -0700
Subject: [15] RFR(T) 8247527:
 serviceability/dcmd/gc/HeapDumpCompressedTest.java fails with Graal + ZGC
In-Reply-To: <e3f21f4d-f0ca-8814-00c4-5515698d4432@oracle.com>
References: <ec23502a-60fc-58c0-faef-5d7365ed08db@oracle.com>
 <e3f21f4d-f0ca-8814-00c4-5515698d4432@oracle.com>
Message-ID: <EBBF9FC9-F5FC-4219-B3B1-E0A5E2FC6B44@oracle.com>

Hi David,

It's on my todo list to improve this situation: make vm.gc.X take the selected JIT into account, and update the existing (>200) occurrences of 'vm.gc.X & !vm.graal.enabled'.
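
To make the pattern concrete, a jtreg header using it might look like the sketch below. The test name, summary and body are hypothetical; vm.gc.Z and vm.graal.enabled are the properties discussed in this thread, and the -XX:+UseZGC run line is only an assumption about how such a test would select its GC.

```java
/*
 * @test
 * @summary Hypothetical test; only the @requires line illustrates the pattern
 * @requires vm.gc.Z & !vm.graal.enabled
 * @run main/othervm -XX:+UseZGC RequiresExample
 */

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class RequiresExample {
    // Names of the collectors the running VM reports.
    static List<String> gcNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Collectors in use: " + gcNames());
    }
}
```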

-- Igor

> On Jul 2, 2020, at 7:25 PM, David Holmes <david.holmes at oracle.com> wrote:
> 
> Hi Vladimir,
> 
> On 3/07/2020 12:02 pm, Vladimir Kozlov wrote:
>> https://cr.openjdk.java.net/~kvn/8247527/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8247527
>> Test should have @requires which excludes running Graal with GC which it does not support.
> 
> I find it somewhat disturbing that a generic test has to know about the limitations between GCs and Graal!
> 
> I would have been more inclined to just exclude this test when running with Graal, even if that theoretically reduced the test coverage in a tiny way.
> 
> If/When Graal supports these other GCs who will remember to re-enable these test cases?
> 
> Thanks,
> David
> 
>> Testing: hs-tier1,hs-tier4-graal
>> Thanks,
>> Vladimir


From david.holmes at oracle.com  Fri Jul  3 05:09:28 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 3 Jul 2020 15:09:28 +1000
Subject: [15] RFR(T) 8247527:
 serviceability/dcmd/gc/HeapDumpCompressedTest.java fails with Graal + ZGC
In-Reply-To: <EBBF9FC9-F5FC-4219-B3B1-E0A5E2FC6B44@oracle.com>
References: <ec23502a-60fc-58c0-faef-5d7365ed08db@oracle.com>
 <e3f21f4d-f0ca-8814-00c4-5515698d4432@oracle.com>
 <EBBF9FC9-F5FC-4219-B3B1-E0A5E2FC6B44@oracle.com>
Message-ID: <df0d7cf1-6342-c2b9-1688-3f112a28e0a2@oracle.com>

Hi Igor,

On 3/07/2020 12:59 pm, Igor Ignatyev wrote:
> Hi David,
> 
> It's on my todo list to improve this situation: make vm.gc.X take the selected JIT into account, and update the existing (>200) occurrences of 'vm.gc.X & !vm.graal.enabled'.

200+ ouch! :(

I guess this fix doesn't make the situation any worse in a practical sense.

Thanks,
David
-----

> -- Igor
> 
>> On Jul 2, 2020, at 7:25 PM, David Holmes <david.holmes at oracle.com> wrote:
>>
>> Hi Vladimir,
>>
>> On 3/07/2020 12:02 pm, Vladimir Kozlov wrote:
>>> https://cr.openjdk.java.net/~kvn/8247527/webrev.00/
>>> https://bugs.openjdk.java.net/browse/JDK-8247527
>>> Test should have @requires which excludes running Graal with GC which it does not support.
>>
>> I find it somewhat disturbing that a generic test has to know about the limitations between GCs and Graal!
>>
>> I would have been more inclined to just exclude this test when running with Graal, even if that theoretically reduced the test coverage in a tiny way.
>>
>> If/When Graal supports these other GCs who will remember to re-enable these test cases?
>>
>> Thanks,
>> David
>>
>>> Testing: hs-tier1,hs-tier4-graal
>>> Thanks,
>>> Vladimir
> 

From felix.yang at huawei.com  Fri Jul  3 06:30:05 2020
From: felix.yang at huawei.com (Yangfei (Felix))
Date: Fri, 3 Jul 2020 06:30:05 +0000
Subject: RFR(S): 8243670: Unexpected test result caused by C2
 MergeMemNode::Ideal
In-Reply-To: <4d71bb09-2569-4d01-16cc-707ce61d23de@oracle.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7DE0314@dggeml507-mbs.china.huawei.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E47B68@dggeml527-mbx.china.huawei.com>
 <fa89a8ca-34dd-abcf-60fa-9d18db754c14@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E48697@dggeml527-mbx.china.huawei.com>
 <87k103w2o7.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4ADEA@dggeml527-mbx.china.huawei.com>
 <87eeq7wmd2.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4B0CD@dggeml527-mbx.china.huawei.com>
 <878sgfwbyc.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4B7E4@dggeml527-mbx.china.huawei.com>
 <87wo3yupks.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4C62C@dggeml527-mbx.china.huawei.com>
 <a87b52f2-31e5-a526-4478-049ab6bba636@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E57047@dggeml507-mbx.china.huawei.com>
 <87o8p1ours.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E57CA5@dggeml507-mbx.china.huawei.com>
 <134e1fc1-8e5c-a1f2-d0ed-50784b807578@oracle.com>
 <4d71bb09-2569-4d01-16cc-707ce61d23de@oracle.com>
Message-ID: <DA41BE1DDCA941489001C7FBD7A8820EE7E5A3C4@dggeml507-mbx.china.huawei.com>

Hi Tobias,

> -----Original Message-----
> From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com]
> Sent: Thursday, July 2, 2020 2:43 PM
> To: Yangfei (Felix) <felix.yang at huawei.com>; Roland Westrelin
> <rwestrel at redhat.com>; hotspot-compiler-dev at openjdk.java.net
> Cc: guoge (A) <guoge1 at huawei.com>; zhouyong (V)
> <zhouyong44 at huawei.com>
> Subject: Re: RFR(S): 8243670: Unexpected test result caused by C2
> MergeMemNode::Ideal
> 
> Hi Felix,
> 
> On 30.06.20 19:06, Tobias Hartmann wrote:
> > I'll run some perf and correctness testing and report back once it finished.
> 
> All passed.

Thanks for the effort.  :-)
I also submitted the latest patch to jdk/submit repo for testing. 
I first submitted it in branch JDK-8243670-3, but I haven't got any test result after about 24 hours.
Then I closed this branch and resubmitted in a new branch, JDK-8243670-4, about 8 hours ago.
I guess something may be wrong with the submit repo? I am still waiting for the test result.

http://hg.openjdk.java.net/jdk/submit/rev/798000e6da7f 

Thanks,
Felix

From christian.hagedorn at oracle.com  Fri Jul  3 07:19:14 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Fri, 3 Jul 2020 09:19:14 +0200
Subject: [16] RFR(T): 8248596: [TESTBUG]
 compiler/loopopts/PartialPeelingUnswitch.java times out with Graal enabled
In-Reply-To: <905225a7-8e07-ba19-f9b4-d5fad89e68ce@oracle.com>
References: <19cf54c7-776d-63e4-6d40-bd84733a2f17@oracle.com>
 <87mu4i9d5e.fsf@redhat.com> <905225a7-8e07-ba19-f9b4-d5fad89e68ce@oracle.com>
Message-ID: <db447504-ff86-9dcc-f510-f498b9d8fa9b@oracle.com>

Hi Vladimir, hi Roland

Thank you for your reviews!

> I think it should require vm.compiler2.enabled because this test is very C2 specific.
> Note, Graal and C2 are mutually exclusive.

Sounds reasonable. I changed that in a new webrev:
http://cr.openjdk.java.net/~chagedorn/8248596/webrev.01/

Best regards,
Christian

On 02.07.20 19:42, Vladimir Kozlov wrote:
> On 7/2/20 7:58 AM, Roland Westrelin wrote:
>>
>>> http://cr.openjdk.java.net/~chagedorn/8248596/webrev.00/
>>
>> That looks good to me but I've been wondering what the interaction of
>> this and similar issues with libgraal is. Presumably, running this test
>> with libgraal wouldn't time out. Does:
>> @requires !vm.graal.enabled cover
>> both graal and libgraal? If so is the plan to reevaluate all those
>> additions once libgraal becomes the standard way of running graal? Or
>> is there another way to override the requirement so that if you run
>> with libgraal, all tests are run?
> 
> Yes, we have such a way: ProblemList-graal.txt. We list a test and link it 
> to the following bug:
> 
> https://bugs.openjdk.java.net/browse/JDK-8207267
> 
> But sometimes we have a test which only checks very C2 specific 
> functionality. I think it is okay to run it only with C2.
> 
> Vladimir K
> 
>>
>> Roland.
>>

From tobias.hartmann at oracle.com  Fri Jul  3 07:31:36 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 3 Jul 2020 09:31:36 +0200
Subject: RFR(S): 8243670: Unexpected test result caused by C2
 MergeMemNode::Ideal
In-Reply-To: <DA41BE1DDCA941489001C7FBD7A8820EE7E5A3C4@dggeml507-mbx.china.huawei.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7DE0314@dggeml507-mbs.china.huawei.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E48697@dggeml527-mbx.china.huawei.com>
 <87k103w2o7.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4ADEA@dggeml527-mbx.china.huawei.com>
 <87eeq7wmd2.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4B0CD@dggeml527-mbx.china.huawei.com>
 <878sgfwbyc.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4B7E4@dggeml527-mbx.china.huawei.com>
 <87wo3yupks.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4C62C@dggeml527-mbx.china.huawei.com>
 <a87b52f2-31e5-a526-4478-049ab6bba636@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E57047@dggeml507-mbx.china.huawei.com>
 <87o8p1ours.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E57CA5@dggeml507-mbx.china.huawei.com>
 <134e1fc1-8e5c-a1f2-d0ed-50784b807578@oracle.com>
 <4d71bb09-2569-4d01-16cc-707ce61d23de@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E5A3C4@dggeml507-mbx.china.huawei.com>
Message-ID: <06cc9b64-0b56-1ed1-ad3f-5d646e46c98a@oracle.com>

Hi Felix,

On 03.07.20 08:30, Yangfei (Felix) wrote:
> Thanks for the effort.  :-)
> I also submitted the latest patch to jdk/submit repo for testing. 

The testing I did includes the jobs executed by the submit repo, so no need to submit again.

You can push your patch to:
http://hg.openjdk.java.net/jdk/jdk15

> First time submitted in branch JDK-8243670-3, but I haven't got any test result after about 24 hours. 
> Then I closed this branch and resubmitted in a new branch  JDK-8243670-4 about 8 hours ago. 
> I guess maybe something is wrong with the submit repo? I am still waiting for the test result. 
> 
> http://hg.openjdk.java.net/jdk/submit/rev/798000e6da7f 

Okay, there seems to be an issue with the submit repo. I'll report it.

Best regards,
Tobias

From tobias.hartmann at oracle.com  Fri Jul  3 07:37:03 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 3 Jul 2020 09:37:03 +0200
Subject: [16] RFR(T): 8248596: [TESTBUG]
 compiler/loopopts/PartialPeelingUnswitch.java times out with Graal enabled
In-Reply-To: <db447504-ff86-9dcc-f510-f498b9d8fa9b@oracle.com>
References: <19cf54c7-776d-63e4-6d40-bd84733a2f17@oracle.com>
 <87mu4i9d5e.fsf@redhat.com> <905225a7-8e07-ba19-f9b4-d5fad89e68ce@oracle.com>
 <db447504-ff86-9dcc-f510-f498b9d8fa9b@oracle.com>
Message-ID: <8a88f3f6-fd14-7e93-2013-a0f37e6b7094@oracle.com>

Hi Christian,

On 03.07.20 09:19, Christian Hagedorn wrote:
> Sounds reasonable. I changed that in a new webrev:
> http://cr.openjdk.java.net/~chagedorn/8248596/webrev.01/

Looks good to me.

Best regards,
Tobias

From nils.eliasson at oracle.com  Fri Jul  3 07:39:02 2020
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Fri, 3 Jul 2020 09:39:02 +0200
Subject: [16] RFR(S): 8248398: Add diagnostic RepeatCompilation utility
In-Reply-To: <3b5be72a-8c6e-8d93-f48b-d37e6e7ef049@oracle.com>
References: <3708c0f5-1c43-5adf-c817-2fb1ff6518c8@oracle.com>
 <3b5be72a-8c6e-8d93-f48b-d37e6e7ef049@oracle.com>
Message-ID: <3ff8176b-9122-9fff-f628-5e001b21a1dd@oracle.com>

Thank you Tobias, Claes and Vladimir!

I got the suggestion from Patric to change to use unified logging 
instead of PrintCompilation. Are you ok with that?

Best regards,
Nils

On 2020-06-29 08:33, Tobias Hartmann wrote:
> Hi Nils,
>
> Looks good to me!
>
> In globals.hpp:543 there is an excess whitespace before "\".
>
> Best regards,
> Tobias
>
> On 26.06.20 16:48, Nils Eliasson wrote:
>> Hi,
>>
>> This is a diagnostic utility that was requested by Claes to enable better profiling of the compilers.
>>
>> This patch introduces the diagnostic flag RepeatCompilation.
>>
>> RepeatCompilation holds the number of times the compilation gets repeated without having the code
>> installed. RepeatCompilation = 0 is the default and means that only the regular compilation is done.
>> RepeatCompilation = 100 means that an extra 100 compilations are done but without installing the code.
>>
>> I have tried keeping the change small and non-intrusive, contained to the CompileBroker (except the
>> boolean for disabling code install that is passed to the compilers).
>>
>> RepeatCompilation works as a flag: "-XX:RepeatCompilation=100", a compile command:
>> "-XX:CompileCommand=option,*::toString,intx,RepeatCompilation,100"
>> and a compiler directive: "RepeatCompilation : 100".
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8248398
>> Webrev: http://cr.openjdk.java.net/~neliasso/8248398/webrev.04/
>>
>> Please review!
>>
>> Best regards,
>> Nils Eliasson
>>


From tobias.hartmann at oracle.com  Fri Jul  3 07:51:59 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 3 Jul 2020 09:51:59 +0200
Subject: [16] RFR(S): 8248398: Add diagnostic RepeatCompilation utility
In-Reply-To: <3ff8176b-9122-9fff-f628-5e001b21a1dd@oracle.com>
References: <3708c0f5-1c43-5adf-c817-2fb1ff6518c8@oracle.com>
 <3b5be72a-8c6e-8d93-f48b-d37e6e7ef049@oracle.com>
 <3ff8176b-9122-9fff-f628-5e001b21a1dd@oracle.com>
Message-ID: <3df4b83a-de08-3a1f-4daf-4ed48515217c@oracle.com>

Hi Nils,

On 03.07.20 09:39, Nils Eliasson wrote:
> I got the suggestion from Patric to change to use unified logging instead of PrintCompilation. Are
> you ok with that?

For the task printing with "NO CODE INSTALLED" message? I think that should go with PrintCompilation
to be consistent. Otherwise it would be weird that PrintCompilation would not print any compilations
if RepeatCompilation is enabled.

Best regards,
Tobias

From christian.hagedorn at oracle.com  Fri Jul  3 07:52:27 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Fri, 3 Jul 2020 09:52:27 +0200
Subject: [16] RFR(T): 8248596: [TESTBUG]
 compiler/loopopts/PartialPeelingUnswitch.java times out with Graal enabled
In-Reply-To: <8a88f3f6-fd14-7e93-2013-a0f37e6b7094@oracle.com>
References: <19cf54c7-776d-63e4-6d40-bd84733a2f17@oracle.com>
 <87mu4i9d5e.fsf@redhat.com> <905225a7-8e07-ba19-f9b4-d5fad89e68ce@oracle.com>
 <db447504-ff86-9dcc-f510-f498b9d8fa9b@oracle.com>
 <8a88f3f6-fd14-7e93-2013-a0f37e6b7094@oracle.com>
Message-ID: <cb8f43e5-65f6-7f28-01fd-b40e73ea9994@oracle.com>

Thank you Tobias for your review!

Best regards,
Christian

On 03.07.20 09:37, Tobias Hartmann wrote:
> Hi Christian,
> 
> On 03.07.20 09:19, Christian Hagedorn wrote:
>> Sounds reasonable. I changed that in a new webrev:
>> http://cr.openjdk.java.net/~chagedorn/8248596/webrev.01/
> 
> Looks good to me.
> 
> Best regards,
> Tobias
> 

From felix.yang at huawei.com  Fri Jul  3 08:03:26 2020
From: felix.yang at huawei.com (Yangfei (Felix))
Date: Fri, 3 Jul 2020 08:03:26 +0000
Subject: RFR(S): 8243670: Unexpected test result caused by C2
 MergeMemNode::Ideal
In-Reply-To: <06cc9b64-0b56-1ed1-ad3f-5d646e46c98a@oracle.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7DE0314@dggeml507-mbs.china.huawei.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E48697@dggeml527-mbx.china.huawei.com>
 <87k103w2o7.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4ADEA@dggeml527-mbx.china.huawei.com>
 <87eeq7wmd2.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4B0CD@dggeml527-mbx.china.huawei.com>
 <878sgfwbyc.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4B7E4@dggeml527-mbx.china.huawei.com>
 <87wo3yupks.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4C62C@dggeml527-mbx.china.huawei.com>
 <a87b52f2-31e5-a526-4478-049ab6bba636@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E57047@dggeml507-mbx.china.huawei.com>
 <87o8p1ours.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E57CA5@dggeml507-mbx.china.huawei.com>
 <134e1fc1-8e5c-a1f2-d0ed-50784b807578@oracle.com>
 <4d71bb09-2569-4d01-16cc-707ce61d23de@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E5A3C4@dggeml507-mbx.china.huawei.com>
 <06cc9b64-0b56-1ed1-ad3f-5d646e46c98a@oracle.com>
Message-ID: <DA41BE1DDCA941489001C7FBD7A8820EE7E5A50E@dggeml507-mbx.china.huawei.com>

Hi Tobias,

> -----Original Message-----
> From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com]
> Sent: Friday, July 3, 2020 3:32 PM
> To: Yangfei (Felix) <felix.yang at huawei.com>; Roland Westrelin
> <rwestrel at redhat.com>; hotspot-compiler-dev at openjdk.java.net
> Cc: guoge (A) <guoge1 at huawei.com>; zhouyong (V)
> <zhouyong44 at huawei.com>
> Subject: Re: RFR(S): 8243670: Unexpected test result caused by C2
> MergeMemNode::Ideal
> 
> Hi Felix,
> 
> On 03.07.20 08:30, Yangfei (Felix) wrote:
> > Thanks for the effort.  :-)
> > I also submitted the latest patch to jdk/submit repo for testing.
> 
> The testing I did includes the jobs executed by the submit repo, so no need
> to submit again.

Great to know that :-)

> You can push your patch to:
> http://hg.openjdk.java.net/jdk/jdk15

Yes.  Will push to jdk/jdk15 and to jdk/jdk after that.

> > First time submitted in branch JDK-8243670-3, but I haven't got any test
> result after about 24 hours.
> > Then I closed this branch and resubmitted in a new branch  JDK-8243670-4
> about 8 hours ago.
> > I guess maybe something is wrong with the submit repo? I am still waiting
> for the test result.
> >
> > http://hg.openjdk.java.net/jdk/submit/rev/798000e6da7f
> 
> Okay, there seems to be an issue with the submit repo. I'll report it.

Thanks for reporting that.

Felix

From tobias.hartmann at oracle.com  Fri Jul  3 08:56:04 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 3 Jul 2020 10:56:04 +0200
Subject: RFR(S): 8243670: Unexpected test result caused by C2
 MergeMemNode::Ideal
In-Reply-To: <DA41BE1DDCA941489001C7FBD7A8820EE7E5A50E@dggeml507-mbx.china.huawei.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7DE0314@dggeml507-mbs.china.huawei.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4ADEA@dggeml527-mbx.china.huawei.com>
 <87eeq7wmd2.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4B0CD@dggeml527-mbx.china.huawei.com>
 <878sgfwbyc.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4B7E4@dggeml527-mbx.china.huawei.com>
 <87wo3yupks.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4C62C@dggeml527-mbx.china.huawei.com>
 <a87b52f2-31e5-a526-4478-049ab6bba636@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E57047@dggeml507-mbx.china.huawei.com>
 <87o8p1ours.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E57CA5@dggeml507-mbx.china.huawei.com>
 <134e1fc1-8e5c-a1f2-d0ed-50784b807578@oracle.com>
 <4d71bb09-2569-4d01-16cc-707ce61d23de@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E5A3C4@dggeml507-mbx.china.huawei.com>
 <06cc9b64-0b56-1ed1-ad3f-5d646e46c98a@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E5A50E@dggeml507-mbx.china.huawei.com>
Message-ID: <556b5a00-0b89-b2cd-e243-e1f4a15201a4@oracle.com>

Hi Felix,

On 03.07.20 10:03, Yangfei (Felix) wrote:
> Yes.  Will push to jdk/jdk15 and to jdk/jdk after that.

Pushing to jdk/jdk (JDK 16) is not required. It will be merged automatically:
https://mail.openjdk.java.net/pipermail/jdk-dev/2020-June/004372.html

Best regards,
Tobias

From felix.yang at huawei.com  Fri Jul  3 09:02:40 2020
From: felix.yang at huawei.com (Yangfei (Felix))
Date: Fri, 3 Jul 2020 09:02:40 +0000
Subject: RFR(S): 8243670: Unexpected test result caused by C2
 MergeMemNode::Ideal
In-Reply-To: <556b5a00-0b89-b2cd-e243-e1f4a15201a4@oracle.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7DE0314@dggeml507-mbs.china.huawei.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4ADEA@dggeml527-mbx.china.huawei.com>
 <87eeq7wmd2.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4B0CD@dggeml527-mbx.china.huawei.com>
 <878sgfwbyc.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4B7E4@dggeml527-mbx.china.huawei.com>
 <87wo3yupks.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E4C62C@dggeml527-mbx.china.huawei.com>
 <a87b52f2-31e5-a526-4478-049ab6bba636@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E57047@dggeml507-mbx.china.huawei.com>
 <87o8p1ours.fsf@redhat.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E57CA5@dggeml507-mbx.china.huawei.com>
 <134e1fc1-8e5c-a1f2-d0ed-50784b807578@oracle.com>
 <4d71bb09-2569-4d01-16cc-707ce61d23de@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E5A3C4@dggeml507-mbx.china.huawei.com>
 <06cc9b64-0b56-1ed1-ad3f-5d646e46c98a@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E5A50E@dggeml507-mbx.china.huawei.com>
 <556b5a00-0b89-b2cd-e243-e1f4a15201a4@oracle.com>
Message-ID: <DA41BE1DDCA941489001C7FBD7A8820EE7E5A568@dggeml507-mbx.china.huawei.com>

Hi Tobias,

> -----Original Message-----
> From: Tobias Hartmann [mailto:tobias.hartmann at oracle.com]
> Sent: Friday, July 3, 2020 4:56 PM
> To: Yangfei (Felix) <felix.yang at huawei.com>; Roland Westrelin
> <rwestrel at redhat.com>; hotspot-compiler-dev at openjdk.java.net
> Cc: guoge (A) <guoge1 at huawei.com>; zhouyong (V)
> <zhouyong44 at huawei.com>
> Subject: Re: RFR(S): 8243670: Unexpected test result caused by C2
> MergeMemNode::Ideal
> 
> Hi Felix,
> 
> On 03.07.20 10:03, Yangfei (Felix) wrote:
> > Yes.  Will push to jdk/jdk15 and to jdk/jdk after that.
> 
> Pushing to jdk/jdk (JDK 16) is not required. It will be merged automatically:
> https://mail.openjdk.java.net/pipermail/jdk-dev/2020-June/004372.html

Thanks for the reminder.  Will do.

Felix

From nils.eliasson at oracle.com  Fri Jul  3 09:21:58 2020
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Fri, 3 Jul 2020 11:21:58 +0200
Subject: [16] RFR(S): 8248398: Add diagnostic RepeatCompilation utility
In-Reply-To: <3df4b83a-de08-3a1f-4daf-4ed48515217c@oracle.com>
References: <3708c0f5-1c43-5adf-c817-2fb1ff6518c8@oracle.com>
 <3b5be72a-8c6e-8d93-f48b-d37e6e7ef049@oracle.com>
 <3ff8176b-9122-9fff-f628-5e001b21a1dd@oracle.com>
 <3df4b83a-de08-3a1f-4daf-4ed48515217c@oracle.com>
Message-ID: <4860fa4b-f42f-61cd-56c7-b5682fae31a4@oracle.com>


On 2020-07-03 09:51, Tobias Hartmann wrote:
> Hi Nils,
>
> On 03.07.20 09:39, Nils Eliasson wrote:
>> I got the suggestion from Patric to change to use unified logging instead of PrintCompilation. Are
>> you ok with that?
> For the task printing with "NO CODE INSTALLED" message? I think that should go with PrintCompilation
> to be consistent. Otherwise it would be weird that PrintCompilation would not print any compilations
> if RepeatCompilation is enabled.
I would like to encourage moving to Xlog/UL and away from PrintCompilation. 
The few people that will use RepeatCompilation will know how to use -Xlog.

If you insist I can add both.

// Nils


>
> Best regards,
> Tobias


From tobias.hartmann at oracle.com  Fri Jul  3 09:28:38 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 3 Jul 2020 11:28:38 +0200
Subject: [16] RFR(S): 8248398: Add diagnostic RepeatCompilation utility
In-Reply-To: <4860fa4b-f42f-61cd-56c7-b5682fae31a4@oracle.com>
References: <3708c0f5-1c43-5adf-c817-2fb1ff6518c8@oracle.com>
 <3b5be72a-8c6e-8d93-f48b-d37e6e7ef049@oracle.com>
 <3ff8176b-9122-9fff-f628-5e001b21a1dd@oracle.com>
 <3df4b83a-de08-3a1f-4daf-4ed48515217c@oracle.com>
 <4860fa4b-f42f-61cd-56c7-b5682fae31a4@oracle.com>
Message-ID: <7275e88d-6efc-45c5-6dc5-9b177b0a9fe4@oracle.com>


On 03.07.20 11:21, Nils Eliasson wrote:
> I would like to encourage moving to Xlog/UL and away from PrintCompilation. The few people that will use
> RepeatCompilation will know how to use -Xlog.

Okay, fair enough.

> If you insist I can add both.

No, using only UL is fine with me.

Best regards,
Tobias

From christian.hagedorn at oracle.com  Fri Jul  3 11:42:04 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Fri, 3 Jul 2020 13:42:04 +0200
Subject: [16] RFR(S): 8248226: TestCloneAccessStressGCM fails with
 -XX:-ReduceBulkZeroing
Message-ID: <88cd871e-b05a-5803-cc11-f082fc18f80b@oracle.com>

Hi

Please review the following patch:
https://bugs.openjdk.java.net/browse/JDK-8248226
http://cr.openjdk.java.net/~chagedorn/8248226/webrev.00/

C2 erroneously folds the addition in the return statement in the method 
TestCloneAccessStressGCM::test() to 0 when ReduceInitialCardMarks and 
ReduceBulkZeroing are disabled.

The problem in the testcase can be traced back to 
LoadNode::find_previous_arraycopy() called from LoadNode::Ideal() for 
the loads dest.i1, dest.i2 etc. where we do not take GC barriers into 
account (disabled ReduceInitialCardMarks) when trying to find an 
ArrayCopyNode which belongs to a clone.

As a result, we conclude that there is no ArrayCopyNode and bail out of 
the ideal transformation. Afterwards, we call LoadNode::Value() and look 
for a stored value for the allocation belonging to the clone() call. 
Since we cannot find one (because the ArrayCopyNode is initializing the 
allocation) we conclude that the field is 0 and replace the LoadNode by 
a constant 0. This happens for all the LoadNodes in the addition in the 
return statement which is then folded to 0 and returned.

This would not happen if ReduceBulkZeroing were enabled: in that case, 
the InitializeNode would have been marked as completed at [1], 
InitializeNode::find_captured_store() would have returned NULL at [2], 
and eventually LoadNode::Value() would have returned _type (int) instead 
of the constant 0 because of the bailout at [3] for completed 
InitializeNodes.
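
As a rough illustration of the failing pattern, the test has approximately the shape sketched below. This is a hypothetical sketch, not the actual TestCloneAccessStressGCM source: the class name, the field count and the warm-up loop count are made up. With a correct compiler the sum of the cloned fields equals the sum of the source fields; with the bug, and with ReduceInitialCardMarks and ReduceBulkZeroing disabled, the C2-compiled test() could return 0 instead.

```java
// Hypothetical reproducer shape; names and sizes are assumptions.
class CloneFieldSum implements Cloneable {
    int i1 = 1, i2 = 2, i3 = 3, i4 = 4, i5 = 5, i6 = 6, i7 = 7, i8 = 8, i9 = 9;

    // Clone the source object and add up the fields of the copy.
    // The loads of dest.i1 ... dest.i9 are the LoadNodes discussed above.
    static int test(CloneFieldSum src) {
        CloneFieldSum dest;
        try {
            dest = (CloneFieldSum) src.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: class is Cloneable
        }
        return dest.i1 + dest.i2 + dest.i3 + dest.i4 + dest.i5
             + dest.i6 + dest.i7 + dest.i8 + dest.i9;
    }

    public static void main(String[] args) {
        CloneFieldSum src = new CloneFieldSum();
        int expected = 45; // 1 + 2 + ... + 9
        // Call test() often enough that the JIT is likely to compile it.
        for (int i = 0; i < 20_000; i++) {
            int result = test(src);
            if (result != expected) {
                throw new RuntimeException("Expected " + expected + " but got " + result);
            }
        }
        System.out.println("OK");
    }
}
```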

Thank you!

Best regards,
Christian


[1] 
http://hg.openjdk.java.net/jdk/jdk/file/a7c030723240/src/hotspot/share/opto/library_call.cpp#l4234
[2] 
http://hg.openjdk.java.net/jdk/jdk/file/a7c030723240/src/hotspot/share/opto/memnode.cpp#l3775
[3] 
http://hg.openjdk.java.net/jdk/jdk/file/a7c030723240/src/hotspot/share/opto/memnode.cpp#l3722

From aph at redhat.com  Fri Jul  3 13:40:45 2020
From: aph at redhat.com (Andrew Haley)
Date: Fri, 3 Jul 2020 14:40:45 +0100
Subject: Running IGV
In-Reply-To: <f2d4d59f-eb69-53ad-47ab-0946bdd734ee@oracle.com>
References: <09f19846-cd66-85ed-c491-c5348d8fe532@redhat.com>
 <f2d4d59f-eb69-53ad-47ab-0946bdd734ee@oracle.com>
Message-ID: <ed109cf0-985c-4902-8339-29ae2a6b910e@redhat.com>

Hi,

On 30/06/2020 08:11, Tobias Hartmann wrote:
> 
> igv.sh writes into a log file (.igv.log). The problem might be that you need to run with JDK 8.

Thanks. It's better with JDK 8, but although it does load saved XML
Ideal Graphs, all it's possible to see is a tree with the names of
the compilation passes. No graphs are displayed.

I'm guessing IGV must have rotted, and there's no version that works
with current HotSpot available.

Thanks anyway,

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From tobias.hartmann at oracle.com  Fri Jul  3 13:48:44 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 3 Jul 2020 15:48:44 +0200
Subject: Running IGV
In-Reply-To: <ed109cf0-985c-4902-8339-29ae2a6b910e@redhat.com>
References: <09f19846-cd66-85ed-c491-c5348d8fe532@redhat.com>
 <f2d4d59f-eb69-53ad-47ab-0946bdd734ee@oracle.com>
 <ed109cf0-985c-4902-8339-29ae2a6b910e@redhat.com>
Message-ID: <ff8e07f3-2416-6e89-6697-002542664539@oracle.com>

Hi Andrew,

On 03.07.20 15:40, Andrew Haley wrote:
> Thanks. It's better with JDK 8, but although it does load saved XML
> Ideal Graphs, all it's possible to see is a tree with the names of
> the compilation passes. No graphs are displayed.

After double-clicking on the phase, it sometimes takes a while to load if the graph is huge. If
there's an issue, you should at least get an error message (did you check the console?).

> I'm guessing IGV must have rotted, and there's no version that works
> with current HotSpot available.

Well it does work fine for me and I'm using it on a regular basis.

Best regards,
Tobias

From patric.hedlin at oracle.com  Fri Jul  3 15:09:17 2020
From: patric.hedlin at oracle.com (Patric Hedlin)
Date: Fri, 3 Jul 2020 17:09:17 +0200
Subject: RFR(S): 8245021: Add method 'remove_if_existing' to growableArray.
In-Reply-To: <243790ff-6640-8f48-b345-b195efc46ede@oracle.com>
References: <054bdcb1-9543-eefc-b814-60ad5ab641d3@oracle.com>
 <243790ff-6640-8f48-b345-b195efc46ede@oracle.com>
Message-ID: <e9a88025-603d-ed9c-c80d-7fbe569b63fe@oracle.com>

Hi Tobias,

On 2020-05-19 11:33, Tobias Hartmann wrote:
> Hi Patric,
>
> Looks good to me but please add brackets around the for loop.
>
> Also, there are some more cases of this code pattern. For example,
> JvmtiPendingMonitors::destroy/exit and
> ShenandoahBarrierSetC2State::remove_enqueue_barrier/remove_load_reference_barrier.
Fixed.

I moved this to 16 (after JDK-8247755). Added some refactoring to new 
webrev (refreshed).

/Patric
> Best regards,
> Tobias
>
> On 18.05.20 22:37, Patric Hedlin wrote:
>> Dear all,
>>
>> I would like to ask for help to review the following change/update:
>>
>> Issue:  https://bugs.openjdk.java.net/browse/JDK-8245021
>> Webrev: http://cr.openjdk.java.net/~phedlin/tr8245021/
>>
>>
>> 8245021: Add method 'remove_if_existing' to growableArray.
>>
>> Minor improvement to simplify the code pattern "if contains then remove" found in a few places (in
>> "compile.hpp").
>>
>>
>> Testing: hs-tier1-3
>>
>>
>> Best regards,
>> Patric

From patric.hedlin at oracle.com  Fri Jul  3 15:09:23 2020
From: patric.hedlin at oracle.com (Patric Hedlin)
Date: Fri, 3 Jul 2020 17:09:23 +0200
Subject: RFR(S): 8245021: Add method 'remove_if_existing' to growableArray.
In-Reply-To: <9c722439-2b3f-a94f-baa6-2ac9aef825c4@oracle.com>
References: <054bdcb1-9543-eefc-b814-60ad5ab641d3@oracle.com>
 <243790ff-6640-8f48-b345-b195efc46ede@oracle.com>
 <9c722439-2b3f-a94f-baa6-2ac9aef825c4@oracle.com>
Message-ID: <244903a6-c870-dc55-41ba-460679b7a779@oracle.com>

Thanks for reviewing Nils.

Care to take another look?

I moved this to 16 (after JDK-8247755). Added some refactoring to new 
webrev (refreshed).

/Patric

On 2020-06-02 09:51, Nils Eliasson wrote:
> +1
>
> Best regards,
> Nils Eliasson
>
> On 2020-05-19 11:33, Tobias Hartmann wrote:
>> Hi Patric,
>>
>> Looks good to me but please add brackets around the for loop.
>>
>> Also, there are some more cases of this code pattern. For example,
>> JvmtiPendingMonitors::destroy/exit and
>> ShenandoahBarrierSetC2State::remove_enqueue_barrier/remove_load_reference_barrier. 
>>
>>
>> Best regards,
>> Tobias
>>
>> On 18.05.20 22:37, Patric Hedlin wrote:
>>> Dear all,
>>>
>>> I would like to ask for help to review the following change/update:
>>>
>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8245021
>>> Webrev: http://cr.openjdk.java.net/~phedlin/tr8245021/
>>>
>>>
>>> 8245021: Add method 'remove_if_existing' to growableArray.
>>>
>>> Minor improvement to simplify the code pattern "if contains then 
>>> remove" found in a few places (in
>>> "compile.hpp").
>>>
>>>
>>> Testing: hs-tier1-3
>>>
>>>
>>> Best regards,
>>> Patric

From patric.hedlin at oracle.com  Fri Jul  3 15:10:30 2020
From: patric.hedlin at oracle.com (Patric Hedlin)
Date: Fri, 3 Jul 2020 17:10:30 +0200
Subject: RFR(S): 8245021: Add method 'remove_if_existing' to growableArray.
In-Reply-To: <1EEC80B7-9603-4B8C-A0D4-97D3DE51EBDB@amazon.com>
References: <054bdcb1-9543-eefc-b814-60ad5ab641d3@oracle.com>
 <1EEC80B7-9603-4B8C-A0D4-97D3DE51EBDB@amazon.com>
Message-ID: <843b3998-8246-d571-ee8f-9ac795306b8d@oracle.com>

Hi,

On 2020-05-19 21:03, Liu, Xin wrote:
> Hi, Patric,
>
> I don't object to your change.  I feel that the API 'remove' of GrowableArray was not good.  Even though its complexity is still linear, you scan all elements and write some of them.
The interface (remove) is what it is I guess. There was no intention to 
change current behaviour.

> The problem is it has to retain order. Actually, I didn't run into any problem when I replaced the removed element with the last one.
> It suggests that probably nobody in hotspot makes use of the sorted GrowableArray.
>
> I found another interesting point.  There's an API delete_at which ignores order, so I tried replacing your remove_if_existing with it.
>    bool delete_if_existing(const E& elem) {
>      int index = find(elem);
>
>      if (index != -1) {
>        _data[index] = _data[--_len];
>        return true;
>      }
>
>      return false;
>    }
> I didn't have any regression in jtreg:hotspot:tier1.  Actually, CodeCache::unregister_old_nmethod uses the same trick.
Indeed, in analogy with *remove*, you might argue that both "a delete" 
and delete_if_existing are missing in the interface (in cases when order 
is not required). Perhaps also delete_all/remove_all for multi-entry 
usage. However, adding one or the other is perhaps another RFE (and 
might require more than a test-run to replace current uses of 
*remove*). At the same time, I have to assume that they have not been 
added for a reason.
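
To make the trade-off above concrete, here is a minimal sketch of the two removal styles. It is only an illustration: SmallArray is a hypothetical stand-in built on std::vector, not the HotSpot GrowableArray, and only the method names remove_if_existing and delete_if_existing are taken from this thread.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for GrowableArray<E>; NOT the HotSpot class.
template <typename E>
class SmallArray {
  std::vector<E> _data;

 public:
  void append(const E& elem) { _data.push_back(elem); }
  int length() const { return static_cast<int>(_data.size()); }
  const E& at(int index) const {
    assert(0 <= index && index < length());
    return _data[index];
  }

  // Linear search; returns -1 when the element is absent.
  int find(const E& elem) const {
    for (int i = 0; i < length(); i++) {
      if (_data[i] == elem) return i;
    }
    return -1;
  }

  // Order-preserving: the elements after the match are shifted down by one.
  bool remove_if_existing(const E& elem) {
    int index = find(elem);
    if (index == -1) return false;
    _data.erase(_data.begin() + index);
    return true;
  }

  // Order-changing: the last element is moved into the freed slot.
  bool delete_if_existing(const E& elem) {
    int index = find(elem);
    if (index == -1) return false;
    if (index != length() - 1) {
      _data[index] = _data.back();
    }
    _data.pop_back();
    return true;
  }
};
```

Both variants still pay for the linear find; the difference is only whether the tail is shifted (order kept) or a single element is moved (order lost), which is why swapping one for the other is safe only where no caller relies on element order.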

I moved this to 16 (after JDK-8247755). Added some refactoring to new 
webrev (refreshed).

Best regards,
Patric

> Here is the current implementation of delete_at().  It checks if the index is the last element, and skips copying if so.  I am not sure if an extra comparison is worthwhile here.
> Users should use pop() instead in that scenario.
>    
>    // The order is changed.
>    void delete_at(int index) {
>      assert(0 <= index && index < _len, "illegal index");
>      if (index < --_len) {
>        // Replace removed element with last one.
>        _data[index] = _data[_len];
>      }
>    }
>
> Thanks,
> --lx
>
>
>      Dear all,
>
>      I would like to ask for help to review the following change/update:
>
>      Issue:  https://bugs.openjdk.java.net/browse/JDK-8245021
>      Webrev: http://cr.openjdk.java.net/~phedlin/tr8245021/
>
>
>      8245021: Add method 'remove_if_existing' to growableArray.
>
>      Minor improvement to simplify the code pattern "if contains then remove"
>      found in a few places (in "compile.hpp").
>
>
>      Testing: hs-tier1-3
>
>
>      Best regards,
>      Patric

From nils.eliasson at oracle.com  Fri Jul  3 16:18:26 2020
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Fri, 3 Jul 2020 18:18:26 +0200
Subject: RFR(S): 8245021: Add method 'remove_if_existing' to growableArray.
In-Reply-To: <244903a6-c870-dc55-41ba-460679b7a779@oracle.com>
References: <054bdcb1-9543-eefc-b814-60ad5ab641d3@oracle.com>
 <243790ff-6640-8f48-b345-b195efc46ede@oracle.com>
 <9c722439-2b3f-a94f-baa6-2ac9aef825c4@oracle.com>
 <244903a6-c870-dc55-41ba-460679b7a779@oracle.com>
Message-ID: <e9f8ef62-2c36-bc9a-fd64-700f0c44e22c@oracle.com>

Still looking good!

Best regards,
Nils


On 2020-07-03 17:09, Patric Hedlin wrote:
> Thanks for reviewing Nils.
>
> Care to take another look?
>
> I moved this to 16 (after JDK-8247755). Added some refactoring to new 
> webrev (refreshed).
>
> /Patric
>
> On 2020-06-02 09:51, Nils Eliasson wrote:
>> +1
>>
>> Best regards,
>> Nils Eliasson
>>
>> On 2020-05-19 11:33, Tobias Hartmann wrote:
>>> Hi Patric,
>>>
>>> Looks good to me but please add brackets around the for loop.
>>>
>>> Also, there are some more cases of this code pattern. For example,
>>> JvmtiPendingMonitors::destroy/exit and
>>> ShenandoahBarrierSetC2State::remove_enqueue_barrier/remove_load_reference_barrier. 
>>>
>>>
>>> Best regards,
>>> Tobias
>>>
>>> On 18.05.20 22:37, Patric Hedlin wrote:
>>>> Dear all,
>>>>
>>>> I would like to ask for help to review the following change/update:
>>>>
>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8245021
>>>> Webrev: http://cr.openjdk.java.net/~phedlin/tr8245021/
>>>>
>>>>
>>>> 8245021: Add method 'remove_if_existing' to growableArray.
>>>>
>>>> Minor improvement to simplify the code pattern "if contains then 
>>>> remove" found in a few places (in
>>>> "compile.hpp").
>>>>
>>>>
>>>> Testing: hs-tier1-3
>>>>
>>>>
>>>> Best regards,
>>>> Patric


From joserz at linux.ibm.com  Fri Jul  3 18:09:34 2020
From: joserz at linux.ibm.com (joserz at linux.ibm.com)
Date: Fri, 3 Jul 2020 15:09:34 -0300
Subject: RFR(M): 8248191: PPC: Implement Load/Store Vector with
 lxvl/stxvl in Power10
In-Reply-To: <HE1PR02MB3066D76F3AA81AB764DEE3BF9A6D0@HE1PR02MB3066.eurprd02.prod.outlook.com>
References: <20200701194910.GA141565@pacoca>
 <HE1PR02MB3066D76F3AA81AB764DEE3BF9A6D0@HE1PR02MB3066.eurprd02.prod.outlook.com>
Message-ID: <20200703180934.GA14622@pacoca>

Hello Martin,

Actually, there isn't xxswapd, my bad. In fact we usually need xxswapd to fix the vector lanes after lxvd2x but, if I understand it correctly, that order makes no difference in Hotspot.
Side note: GCC does a similar job when generating code at -O1 or higher and they're also avoiding lxvd2x on Power10.

Do you want me to resend the e-mail without mentioning xxswapd?

Thank you!!

Jose

On Thu, Jul 02, 2020 at 03:57:22PM +0000, Doerr, Martin wrote:
> Where do we save xxswapd instructions?
> I can't see it in the webrev.
> 
> Best regards,
> Martin
> 
> > -----Original Message-----
> > From: hotspot-compiler-dev <hotspot-compiler-dev-
> > retn at openjdk.java.net> On Behalf Of joserz at linux.ibm.com
> > Sent: Mittwoch, 1. Juli 2020 21:49
> > To: hotspot-compiler-dev at openjdk.java.net
> > Cc: Michihiro Horie <HORIE at jp.ibm.com>
> > Subject: RFR(M): 8248191: PPC: Implement Load/Store Vector with lxvl/stxvl
> > in Power10
> > 
> > This patch introduces two instructions lxvl/stxvl and replaces the current
> > lxvd2x/stxvd2x to load and store vectors. Like lxvd2x/stxvd2x, lxvl/stxvl can
> > access unaligned effective addresses with the advantage of *not* requiring
> > xxswapd after lxvd2x (or before stxvd2x) to correct the lanes in little-endian
> > mode.
> > 
> > Webrev: https://cr.openjdk.java.net/~mhorie/8248191/webrev.00/
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8248191
> > 
> > Thanks for your review!
> > 
> > Jose R. Ziviani

From vladimir.kozlov at oracle.com  Fri Jul  3 18:09:26 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 3 Jul 2020 11:09:26 -0700
Subject: [15] RFR(T) 8247527:
 serviceability/dcmd/gc/HeapDumpCompressedTest.java fails with Graal + ZGC
In-Reply-To: <5E33E613-882E-400A-886A-EA4FAD85F2EA@oracle.com>
References: <ec23502a-60fc-58c0-faef-5d7365ed08db@oracle.com>
 <5E33E613-882E-400A-886A-EA4FAD85F2EA@oracle.com>
Message-ID: <2bde8004-4ed8-8ca3-b387-05240f423e3f@oracle.com>

Thank you, Igor

Vladimir K

On 7/2/20 7:24 PM, igor.ignatyev at oracle.com wrote:
> LGTM
> 
> -- Igor
> 
>> On Jul 2, 2020, at 7:03 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>>
>> https://cr.openjdk.java.net/~kvn/8247527/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8247527
>>
>> Test should have @requires which excludes running Graal with GC which it does not support.
>>
>> Testing: hs-tier1,hs-tier4-graal
>>
>> Thanks,
>> Vladimir
> 

From vladimir.kozlov at oracle.com  Fri Jul  3 18:30:31 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 3 Jul 2020 11:30:31 -0700
Subject: [15] RFR(T) 8247527:
 serviceability/dcmd/gc/HeapDumpCompressedTest.java fails with Graal + ZGC
In-Reply-To: <df0d7cf1-6342-c2b9-1688-3f112a28e0a2@oracle.com>
References: <ec23502a-60fc-58c0-faef-5d7365ed08db@oracle.com>
 <e3f21f4d-f0ca-8814-00c4-5515698d4432@oracle.com>
 <EBBF9FC9-F5FC-4219-B3B1-E0A5E2FC6B44@oracle.com>
 <df0d7cf1-6342-c2b9-1688-3f112a28e0a2@oracle.com>
Message-ID: <bf9f0728-9076-79c3-7d88-9dd9a6db4f63@oracle.com>

Thank you, David, for looking at the changes.

I will remember to update tests. I filed RFE 8248815 [1] for tracking.

Can you approve this fix now?

Thanks,
Vladimir K

[1] https://bugs.openjdk.java.net/browse/JDK-8248815

On 7/2/20 10:09 PM, David Holmes wrote:
> Hi Igor,
> 
> On 3/07/2020 12:59 pm, Igor Ignatyev wrote:
>> Hi David,
>>
>> it's in my todo list to improve this situation and have vm.gc.X to take selected JIT into account; and update existing 
>> (>200) occurrences of 'vm.gc.X & !vm.graal.enabled'
> 
> 200+ ouch! :(
> 
> I guess this fix doesn't make the situation any worse in a practical sense.
> 
> Thanks,
> David
> -----
> 
>> -- Igor
>>
>>> On Jul 2, 2020, at 7:25 PM, David Holmes <david.holmes at oracle.com> wrote:
>>>
>>> Hi Vladimir,
>>>
>>> On 3/07/2020 12:02 pm, Vladimir Kozlov wrote:
>>>> https://cr.openjdk.java.net/~kvn/8247527/webrev.00/
>>>> https://bugs.openjdk.java.net/browse/JDK-8247527
>>>> Test should have @requires which excludes running Graal with GC which it does not support.
>>>
>>> I find it somewhat disturbing that a generic test has to know about the limitations between GCs and Graal!
>>>
>>> I would have been more inclined to just exclude this test when running with Graal, even if that theoretically reduced 
>>> the test coverage in a tiny way.
>>>
>>> If/When Graal supports these other GCs who will remember to re-enable these test cases?
>>>
>>> Thanks,
>>> David
>>>
>>>> Testing: hs-tier1,hs-tier4-graal
>>>> Thanks,
>>>> Vladimir
>>

From vladimir.kozlov at oracle.com  Fri Jul  3 18:36:31 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 3 Jul 2020 11:36:31 -0700
Subject: [aarch64-port-dev ] RFR(XXS):8248570 Incorrect copyright header
 in TestUnsafeUnalignedSwap.java
In-Reply-To: <e59081eb-8513-2738-dfa3-c75f427ac43e@oracle.com>
References: <587101a8-7cb0-453b-aed5-4edca2cdda2d.zhuoren.wz@alibaba-inc.com>
 <e59081eb-8513-2738-dfa3-c75f427ac43e@oracle.com>
Message-ID: <a574f2e1-7c4a-2543-0293-4786c1107f46@oracle.com>

I forgot to ask to push the fix into jdk/jdk15 repository to fix it in JDK 15. It will be automatically forward ported 
into JDK 16 later.

Thanks,
Vladimir K

On 7/2/20 10:49 AM, Vladimir Kozlov wrote:
> Thank you, Zhuoren
> 
> Checks passed now.
> 
> Vladimir K
> 
> On 7/2/20 12:36 AM, Wang Zhuo(Zhuoren) wrote:
>> Hi,
>> There's something wrong in the legal notice of TestUnsafeUnalignedSwap.java file. It should be GPLv2 as in 
>> `make/templates/gpl-header`. This patch(from Vladimir Kozlov) fixes it.
>> BUG Link:https://bugs.openjdk.java.net/browse/JDK-8248570
>> CR: http://cr.openjdk.java.net/~wzhuo/8248570/webrev.00/
>>
>>
>> Regards,
>> Zhuoren
>>

From vladimir.kozlov at oracle.com  Fri Jul  3 18:37:45 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 3 Jul 2020 11:37:45 -0700
Subject: [16] RFR(T): 8248596: [TESTBUG]
 compiler/loopopts/PartialPeelingUnswitch.java times out with Graal enabled
In-Reply-To: <8a88f3f6-fd14-7e93-2013-a0f37e6b7094@oracle.com>
References: <19cf54c7-776d-63e4-6d40-bd84733a2f17@oracle.com>
 <87mu4i9d5e.fsf@redhat.com> <905225a7-8e07-ba19-f9b4-d5fad89e68ce@oracle.com>
 <db447504-ff86-9dcc-f510-f498b9d8fa9b@oracle.com>
 <8a88f3f6-fd14-7e93-2013-a0f37e6b7094@oracle.com>
Message-ID: <67de0043-b2b4-17c7-a6e2-df44954305e7@oracle.com>

+1

Thanks,
Vladimir K

On 7/3/20 12:37 AM, Tobias Hartmann wrote:
> Hi Christian,
> 
> On 03.07.20 09:19, Christian Hagedorn wrote:
>> Sounds reasonable. I changed that in a new webrev:
>> http://cr.openjdk.java.net/~chagedorn/8248596/webrev.01/
> 
> Looks good to me.
> 
> Best regards,
> Tobias
> 

From david.holmes at oracle.com  Fri Jul  3 22:18:53 2020
From: david.holmes at oracle.com (David Holmes)
Date: Sat, 4 Jul 2020 08:18:53 +1000
Subject: [15] RFR(T) 8247527:
 serviceability/dcmd/gc/HeapDumpCompressedTest.java fails with Graal + ZGC
In-Reply-To: <bf9f0728-9076-79c3-7d88-9dd9a6db4f63@oracle.com>
References: <ec23502a-60fc-58c0-faef-5d7365ed08db@oracle.com>
 <e3f21f4d-f0ca-8814-00c4-5515698d4432@oracle.com>
 <EBBF9FC9-F5FC-4219-B3B1-E0A5E2FC6B44@oracle.com>
 <df0d7cf1-6342-c2b9-1688-3f112a28e0a2@oracle.com>
 <bf9f0728-9076-79c3-7d88-9dd9a6db4f63@oracle.com>
Message-ID: <030c19da-616e-3b05-da33-5add5e6da747@oracle.com>

On 4/07/2020 4:30 am, Vladimir Kozlov wrote:
> Thank you, David, for looking at the changes.
> 
> I will remember to update tests. I filed RFE 8248815 [1] for tracking.
> 
> Can you approve this fix now?

Yes - thanks.

David

> Thanks,
> Vladimir K
> 
> [1] https://bugs.openjdk.java.net/browse/JDK-8248815
> 
> On 7/2/20 10:09 PM, David Holmes wrote:
>> Hi Igor,
>>
>> On 3/07/2020 12:59 pm, Igor Ignatyev wrote:
>>> Hi David,
>>>
>>> it's in my todo list to improve this situation and have vm.gc.X to 
>>> take selected JIT into account; and update existing (>200) 
>>> occurrences of 'vm.gc.X & !vm.graal.enabled'
>>
>> 200+ ouch! :(
>>
>> I guess this fix doesn't make the situation any worse in a practical 
>> sense.
>>
>> Thanks,
>> David
>> -----
>>
>>> -- Igor
>>>
>>>> On Jul 2, 2020, at 7:25 PM, David Holmes <david.holmes at oracle.com> 
>>>> wrote:
>>>>
>>>> Hi Vladimir,
>>>>
>>>> On 3/07/2020 12:02 pm, Vladimir Kozlov wrote:
>>>>> https://cr.openjdk.java.net/~kvn/8247527/webrev.00/
>>>>> https://bugs.openjdk.java.net/browse/JDK-8247527
>>>>> Test should have @requires which excludes running Graal with GC 
>>>>> which it does not support.
>>>>
>>>> I find it somewhat disturbing that a generic test has to know about 
>>>> the limitations between GCs and Graal!
>>>>
>>>> I would have been more inclined to just exclude this test when 
>>>> running with Graal, even if that theoretically reduced the test 
>>>> coverage in a tiny way.
>>>>
>>>> If/When Graal supports these other GCs who will remember to 
>>>> re-enable these test cases?
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> Testing: hs-tier1,hs-tier4-graal
>>>>> Thanks,
>>>>> Vladimir
>>>

From vladimir.kozlov at oracle.com  Fri Jul  3 22:47:24 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 3 Jul 2020 15:47:24 -0700
Subject: [15] RFR(T) 8247527:
 serviceability/dcmd/gc/HeapDumpCompressedTest.java fails with Graal + ZGC
In-Reply-To: <030c19da-616e-3b05-da33-5add5e6da747@oracle.com>
References: <ec23502a-60fc-58c0-faef-5d7365ed08db@oracle.com>
 <e3f21f4d-f0ca-8814-00c4-5515698d4432@oracle.com>
 <EBBF9FC9-F5FC-4219-B3B1-E0A5E2FC6B44@oracle.com>
 <df0d7cf1-6342-c2b9-1688-3f112a28e0a2@oracle.com>
 <bf9f0728-9076-79c3-7d88-9dd9a6db4f63@oracle.com>
 <030c19da-616e-3b05-da33-5add5e6da747@oracle.com>
Message-ID: <e4801443-1442-8aca-4c0d-b21e51306e18@oracle.com>

Thank you, David

Vladimir K

On 7/3/20 3:18 PM, David Holmes wrote:
> On 4/07/2020 4:30 am, Vladimir Kozlov wrote:
>> Thank you, David, for looking at the changes.
>>
>> I will remember to update tests. I filed RFE 8248815 [1] for tracking.
>>
>> Can you approve this fix now?
> 
> Yes - thanks.
> 
> David
> 
>> Thanks,
>> Vladimir K
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8248815
>>
>> On 7/2/20 10:09 PM, David Holmes wrote:
>>> Hi Igor,
>>>
>>> On 3/07/2020 12:59 pm, Igor Ignatyev wrote:
>>>> Hi David,
>>>>
>>>> it's in my todo list to improve this situation and have vm.gc.X to take selected JIT into account; and update 
>>>> existing (>200) occurrences of 'vm.gc.X & !vm.graal.enabled'
>>>
>>> 200+ ouch! :(
>>>
>>> I guess this fix doesn't make the situation any worse in a practical sense.
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>> -- Igor
>>>>
>>>>> On Jul 2, 2020, at 7:25 PM, David Holmes <david.holmes at oracle.com> wrote:
>>>>>
>>>>> Hi Vladimir,
>>>>>
>>>>> On 3/07/2020 12:02 pm, Vladimir Kozlov wrote:
>>>>>> https://cr.openjdk.java.net/~kvn/8247527/webrev.00/
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8247527
>>>>>> Test should have @requires which excludes running Graal with GC which it does not support.
>>>>>
>>>>> I find it somewhat disturbing that a generic test has to know about the limitations between GCs and Graal!
>>>>>
>>>>> I would have been more inclined to just exclude this test when running with Graal, even if that theoretically 
>>>>> reduced the test coverage in a tiny way.
>>>>>
>>>>> If/When Graal supports these other GCs who will remember to re-enable these test cases?
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>> Testing: hs-tier1,hs-tier4-graal
>>>>>> Thanks,
>>>>>> Vladimir
>>>>

From christian.hagedorn at oracle.com  Mon Jul  6 06:49:49 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Mon, 6 Jul 2020 06:49:49 +0000 (UTC)
Subject: [16] RFR(T): 8248596: [TESTBUG]
 compiler/loopopts/PartialPeelingUnswitch.java times out with Graal enabled
In-Reply-To: <67de0043-b2b4-17c7-a6e2-df44954305e7@oracle.com>
References: <19cf54c7-776d-63e4-6d40-bd84733a2f17@oracle.com>
 <87mu4i9d5e.fsf@redhat.com> <905225a7-8e07-ba19-f9b4-d5fad89e68ce@oracle.com>
 <db447504-ff86-9dcc-f510-f498b9d8fa9b@oracle.com>
 <8a88f3f6-fd14-7e93-2013-a0f37e6b7094@oracle.com>
 <67de0043-b2b4-17c7-a6e2-df44954305e7@oracle.com>
Message-ID: <870d5242-aa28-ecf2-f78c-9b9dc07d7c54@oracle.com>

Thank you Vladimir for your review again!

Best regards,
Christian

On 03.07.20 20:37, Vladimir Kozlov wrote:
> +1
> 
> Thanks,
> Vladimir K
> 
> On 7/3/20 12:37 AM, Tobias Hartmann wrote:
>> Hi Christian,
>>
>> On 03.07.20 09:19, Christian Hagedorn wrote:
>>> Sounds reasonable. I changed that in a new webrev:
>>> http://cr.openjdk.java.net/~chagedorn/8248596/webrev.01/
>>
>> Looks good to me.
>>
>> Best regards,
>> Tobias
>>

From boris.ulasevich at bell-sw.com  Mon Jul  6 09:17:54 2020
From: boris.ulasevich at bell-sw.com (Boris Ulasevich)
Date: Mon, 6 Jul 2020 12:17:54 +0300
Subject: RFR(XS) 8248568: compiler/c2/TestBit.java failed: 'test' missing
 from stdout/stderr
In-Reply-To: <d2b21bb9-91d7-1e2d-46e2-9d31a1ee41e9@oracle.com>
References: <ae99e130-ccc5-30c8-aeeb-1b7443475513@bell-sw.com>
 <a7bafafd-07e0-91ba-8c5b-2d7880e09c7e@oracle.com>
 <CAOhyNwB8L9dabfCiuK=9qK6WMkdR7o2MdOxCNoBv0UiBRTmpEA@mail.gmail.com>
 <d2b21bb9-91d7-1e2d-46e2-9d31a1ee41e9@oracle.com>
Message-ID: <10386b46-ada9-2ee9-2a53-9397faf23f87@bell-sw.com>

Thank you Vladimir.

May I consider the change trivial or should I ask for more reviews?

regards,
Boris

On 03.07.2020 00:45, Vladimir Kozlov wrote:
> Looks good.
>
> Thanks,
> Vladimir
>
> On 7/2/20 2:29 PM, Boris Ulasevich wrote:
>> Hi Vladimir,
>>
>> Thank you. I applied your suggestions. On our machines jtreg runs well.
>> Update: http://cr.openjdk.java.net/~bulasevich/8248568/webrev.01
>>
>> regards,
>> Boris
>>
>> On Thu, Jul 2, 2020 at 9:54 PM Vladimir Kozlov 
>> <vladimir.kozlov at oracle.com>
>> wrote:
>>
>>> Good.
>>>
>>> You may also replace next requirements:
>>>
>>> vm.flavor == "server" & !vm.graal.enabled
>>>
>>> with one:
>>>
>>> vm.compiler2.enabled
>>>
>>> Graal and C2 are mutually exclusive.
>>>
>>> Maybe also run processes without C1 by switching off Tiered 
>>> Compilation.
>>>
>>> And instead of:
>>> @run main/othervm compiler.c2.TestBit
>>>
>>> use:
>>> @run driver compiler.c2.TestBit
>>>
>>> Because you are launching separate processes.
>>>
>>> Please, test changes with jtreg testing.
>>>
>>> Thanks,
>>> Vladimir K
>>>
>>> On 7/2/20 11:13 AM, Boris Ulasevich wrote:
>>>> Hi,
>>>>
>>>> Please review a one-line change: adding -Xbatch option to recently
>>>> introduced test to get a more predictable PrintOptoAssembly output.
>>>>
>>>> http://cr.openjdk.java.net/~bulasevich/8248568/webrev.00
>>>> http://bugs.openjdk.java.net/browse/JDK-8248568
>>>>
>>>> thanks,
>>>> Boris
>>>


From rwestrel at redhat.com  Mon Jul  6 15:55:19 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Mon, 06 Jul 2020 17:55:19 +0200
Subject: RFR(M): 8229495: SIGILL in C2 generated OSR compilation
In-Reply-To: <3b720427-d718-5d1c-dbe9-6149a21883af@oracle.com>
References: <3b720427-d718-5d1c-dbe9-6149a21883af@oracle.com>
Message-ID: <87r1topriw.fsf@redhat.com>


I took that bug over.
Thanks to Patric for helping me understand the root cause of the bug.

http://cr.openjdk.java.net/~roland/8229495/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8229495

> The approach to insert range-check guards (see, JDK-8193130, 
> JDK-8216135, JDK-8240335) between the pre- and the main-loop is somewhat 
> problematic. The immediate problem here is due to an inherent dependency 
> between the additional (template) range-check guards introduced (during 
> RCE) and the state of the loop, such as the level of loop-unrolling. To 
> keep the range-check guards valid through the compilation, these 
> are re-generated when/if the main-loop is unrolled further. Here, the 
> error is introduced when a guard is generated with an illegal offset, 
> that will erroneously cut the path to the main-loop (resulting in a 
> 'Halt'). The reason for range-checks to be present in the main-loop to 
> begin with is due to a failing dominator search (this was also corrected 
> in JDK-8231412, for JDK14).

For a range check:

scale * i + offset <u length

the current code creates a "skeleton" predicate which is essentially an
assert:

assert scale * opaque(init) + offset <u length

As unrolling proceeds, the assert is duplicated and updated:

assert scale * opaque(init) + offset <u length
assert scale * init + offset <u length
assert scale * (init + stride - init_stride) + offset <u length

init_stride is the stride before any unrolling.

The goal is to cover the entire first iteration of the loop where
constant folding can occur and cause CastIIs to go dead in loop bodies
that are unreachable in practice but not optimized out.

This is broken if some rounds of unrolling happen before range check
elimination. If a loop with a range check:

scale * i + offset <u length

is unrolled once then we have 2 range checks:

scale * i + offset <u length
scale * (i+1) + offset <u length

Now range check elimination causes 2 predicates to be added:

assert scale * opaque(init) + offset <u length
assert scale * opaque(init) + (scale + offset) <u length

If the loop is unrolled one more time this is turned into:

assert scale * opaque(init) + offset <u length
assert scale * init + offset <u length
assert scale * (init + stride - init_stride) + offset <u length
assert scale * opaque(init) + (scale + offset) <u length
assert scale * init + (scale + offset) <u length
assert scale * (init + stride - init_stride) + (scale + offset) <u length

stride = 4, init_stride = 1 so the last assert is:

assert scale * (init + 4) + offset <u length

which for a loop unrolled 4 times is outside the first iteration.
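
As a worked check of the arithmetic above, the following sketch evaluates the updated predicate for concrete numbers. It is only an illustration under stated assumptions: the helper names are hypothetical and are not HotSpot code, scale is assumed positive, and init, scale and offset are picked arbitrarily (init = 0, scale = 1, offset = 0).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the value compared against length by the updated
// predicate  scale * (init + stride - init_stride) + offset.
int64_t predicate_value(int64_t scale, int64_t init, int64_t stride,
                        int64_t init_stride, int64_t offset) {
  assert(stride >= init_stride);
  return scale * (init + stride - init_stride) + offset;
}

// Hypothetical helper: the largest value scale * i + offset takes in the
// first main-loop iteration once the original stride-1 body has been
// unrolled `stride` times (i runs over init .. init + stride - 1).
int64_t last_value_in_first_iteration(int64_t scale, int64_t init,
                                      int64_t stride, int64_t offset) {
  assert(scale > 0 && stride >= 1);
  return scale * (init + stride - 1) + offset;
}
```

With init = 0, scale = 1 and original offset 0, the second range check has offset scale + offset = 1. Using init_stride = 1 and stride = 4, predicate_value(1, 0, 4, 1, 1) is 4, while the first iteration only touches values up to last_value_in_first_iteration(1, 0, 4, 0), which is 3. Using init_stride = 2 instead gives 3, which stays inside the first iteration.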

The problem is that init_stride should be 2 here, the stride at which
range check elimination triggered. Given it can trigger at different
strides for different range checks, init_stride needs to be captured in
the skeleton. So I propose changing this scheme to:

For a range check:

scale * i + offset <u length

create 2 "skeleton" predicates:

assert scale * opaque(init) + offset <u length
assert scale * (opaque(init) + opaque(init_stride) - init_stride) + offset <u length

As unrolling proceeds, the asserts are duplicated and updated:

assert scale * opaque(init) + offset <u length
assert scale * init + offset <u length
assert scale * (opaque(init) + opaque(init_stride) - init_stride) + offset <u length
assert scale * (init + stride - init_stride) + offset <u length
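
The idea of capturing init_stride per predicate can be modelled as below. This is a rough sketch, not the patch itself: SkeletonPredicate and make_skeleton are hypothetical names, and the real change records the values through Opaque nodes in the graph rather than in a struct.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a skeleton predicate for  scale * i + offset <u length.
struct SkeletonPredicate {
  int64_t scale;
  int64_t offset;
  int64_t init_stride;  // stride in effect when the predicate was created

  // Value compared against length once the loop runs with `stride`.
  int64_t checked_value(int64_t init, int64_t stride) const {
    assert(stride >= init_stride);
    return scale * (init + stride - init_stride) + offset;
  }
};

// Records the stride current at range check elimination time, instead of
// assuming the stride the loop had before any unrolling.
SkeletonPredicate make_skeleton(int64_t scale, int64_t offset,
                                int64_t current_stride) {
  return SkeletonPredicate{scale, offset, current_stride};
}
```

For example, with scale = 1 and init = 0, a predicate created at stride 1 with offset 0 and another created at stride 2 with offset 1 both evaluate to 3 after the loop reaches stride 4, so each stays within the first iteration even though they were created at different strides.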

I implemented this by using 2 subclasses of Opaque1 to denote init and
stride and facilitate pattern matching. I had to extend _class_id to
juint to make Opaque1 a new class. I also had to tweak the budget
estimate code. Finally, the "asserts" above used to be removed at the
end of compilation. I now leave them in debug builds so we can catch
similar bugs earlier.

The bug doesn't reproduce with the included test case anymore but after
I backed out a couple unrelated changes I could use the test case to
verify the bug is indeed fixed.

And FWIW, I agree this has become way too complicated but I don't see a
simpler solution to the family of bugs this fixes.

Roland.


From vladimir.kozlov at oracle.com  Tue Jul  7 00:23:34 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 6 Jul 2020 17:23:34 -0700
Subject: [16] RFR(S): 8248226: TestCloneAccessStressGCM fails with
 -XX:-ReduceBulkZeroing
In-Reply-To: <88cd871e-b05a-5803-cc11-f082fc18f80b@oracle.com>
References: <88cd871e-b05a-5803-cc11-f082fc18f80b@oracle.com>
Message-ID: <515c1402-4c7a-9d3b-e4fa-ac2a6d43da4c@oracle.com>

Looks good.

Thanks,
Vladimir

On 7/3/20 4:42 AM, Christian Hagedorn wrote:
> Hi
> 
> Please review the following patch:
> https://bugs.openjdk.java.net/browse/JDK-8248226
> http://cr.openjdk.java.net/~chagedorn/8248226/webrev.00/
> 
> C2 erroneously folds the addition in the return statement in the method TestCloneAccessStressGCM::test() to 0 when 
> ReduceInitialCardMarks and ReduceBulkZeroing are disabled.
> 
> The problem in the testcase can be traced back to LoadNode::find_previous_arraycopy() called from LoadNode::Ideal() for 
> the loads dest.i1, dest.i2 etc. where we do not take GC barriers into account (disabled ReduceInitialCardMarks) when 
> trying to find an ArrayCopyNode which belongs to a clone.
> 
> As a result, we conclude that there is no ArrayCopyNode and bail out of the ideal transformation. Afterwards, we call 
> LoadNode::Value() and look for a stored value for the allocation belonging to the clone() call. Since we cannot find one 
> (because the ArrayCopyNode is initializing the allocation) we conclude that the field is 0 and replace the LoadNode by a 
> constant 0. This happens for all the LoadNodes in the addition in the return statement which is then folded to 0 and 
> returned.
> 
> This could have been prevented if ReduceBulkZeroing was enabled. Because in that case, the InitializationNode would have 
> been marked as completed at [1] and the InitializationNode::find_captured_store() method returned NULL at [2] and 
> eventually the entire LoadNode::Value() method returned _type (int) instead of the constant 0 because of the bailout at 
> [3] for completed InitializationNodes.
> 
> Thank you!
> 
> Best regards,
> Christian
> 
> 
> [1] http://hg.openjdk.java.net/jdk/jdk/file/a7c030723240/src/hotspot/share/opto/library_call.cpp#l4234
> [2] http://hg.openjdk.java.net/jdk/jdk/file/a7c030723240/src/hotspot/share/opto/memnode.cpp#l3775
> [3] http://hg.openjdk.java.net/jdk/jdk/file/a7c030723240/src/hotspot/share/opto/memnode.cpp#l3722

From nick.gasson at arm.com  Tue Jul  7 06:58:06 2020
From: nick.gasson at arm.com (Nick Gasson)
Date: Tue, 07 Jul 2020 14:58:06 +0800
Subject: [15] RFR(S): 8248845: AArch64: stack corruption after spilling
 vector register
Message-ID: <857dvfrev5.fsf@arm.com>

Hi,

Bug: https://bugs.openjdk.java.net/browse/JDK-8248845
Webrev: http://cr.openjdk.java.net/~ngasson/8248845/webrev.0/

This crash was seen on the Panama vectorIntrinsics branch and a minimal
test case is attached to the JBS entry, but it should also affect
vanilla jdk/jdk although I haven't found a reliable way to reproduce it.

  0x0000ffffa173fd40:   ldr	x11, [sp, #120]
  0x0000ffffa173fd44:   ldr	x10, [sp, #48]
  0x0000ffffa173fd48:   add	x10, x10, x11
  0x0000ffffa173fd4c:   str	q16, [x10, #16]    ; <==== CRASH HERE

X10 loaded from sp+48 contains a valid pointer but X11 from sp+120
contains a garbage value. Here's the relevant opto assembly that spills
to sp+120:

  6b4 +   spill [sp, #80] -> [sp, #16]	# vector spill size = 128
  6bc +   spill [sp, #24] -> [sp, #120]	# spill size = 64

These two instructions have been scheduled in the wrong order: 6b4
writes 16 bytes at sp+16 which overwrites another live value at sp+24.
Instruction 6bc spills the clobbered value at sp+24 to sp+120. If I dump
out the instructions before scheduling or pass -XX:-OptoScheduling the
order is correct.
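
The clobbering can be checked with simple interval arithmetic. The helper below is a hypothetical sketch, not scheduler code; it assumes that "spill size = 64" in the opto assembly means 64 bits (8 bytes) and "vector spill size = 128" means 128 bits (16 bytes), which matches the 16-byte write described above.

```cpp
#include <cassert>

// Hypothetical helper: do the byte ranges [a_offset, a_offset + a_size) and
// [b_offset, b_offset + b_size), both relative to sp, share any byte?
bool stack_ranges_overlap(int a_offset, int a_size, int b_offset, int b_size) {
  assert(a_size > 0 && b_size > 0);
  return a_offset < b_offset + b_size && b_offset < a_offset + a_size;
}
```

Here stack_ranges_overlap(16, 16, 24, 8) is true: the vector spill to sp+16 covers bytes 16..31 and therefore the live 8-byte value at sp+24..31, so 6bc has to read sp+24 before 6b4 writes. An 8-byte store at sp+16 would not overlap it.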

It seems to be a known limitation that the scheduler doesn't correctly
compute anti-dependencies when a vector occupies more than two slots,
because PhaseOutput::ScheduleAndBundle() already has a check to skip
scheduling if a too-wide vector was generated. Unfortunately the check
is wrong as a pair of slots is 8 bytes not 16:

  // Scheduling code works only with pairs (16 bytes) maximum.
  if (C->max_vector_size() > 16)

Actually the test here used to be > 8, but was changed as part of
JDK-8076276 which added AVX512 support [1]. I couldn't see any
explanation of that change in the bug or mailing list thread, but it
seems wrong and reverting it fixes this crash.

This affects AArch64 because OptoScheduling is enabled by default and
NEON vectors are 16 bytes wide.

Tested hotspot_all_no_apps, jdk_core on AArch64 and x86_64.

[1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-April/017579.html

--
Thanks,
Nick

From christian.hagedorn at oracle.com  Tue Jul  7 07:19:25 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Tue, 7 Jul 2020 09:19:25 +0200
Subject: [16] RFR(S): 8248226: TestCloneAccessStressGCM fails with
 -XX:-ReduceBulkZeroing
In-Reply-To: <515c1402-4c7a-9d3b-e4fa-ac2a6d43da4c@oracle.com>
References: <88cd871e-b05a-5803-cc11-f082fc18f80b@oracle.com>
 <515c1402-4c7a-9d3b-e4fa-ac2a6d43da4c@oracle.com>
Message-ID: <23ee0766-6316-e7bd-d3bd-e9345b082f91@oracle.com>

Thank you Vladimir for your review!

Best regards,
Christian

On 07.07.20 02:23, Vladimir Kozlov wrote:
> Looks good.
> 
> Thanks,
> Vladimir
> 
> On 7/3/20 4:42 AM, Christian Hagedorn wrote:
>> Hi
>>
>> Please review the following patch:
>> https://bugs.openjdk.java.net/browse/JDK-8248226
>> http://cr.openjdk.java.net/~chagedorn/8248226/webrev.00/
>>
>> C2 erroneously folds the addition in the return statement in the 
>> method TestCloneAccessStressGCM::test() to 0 when 
>> ReduceInitialCardMarks and ReduceBulkZeroing are disabled.
>>
>> The problem in the testcase can be traced back to 
>> LoadNode::find_previous_arraycopy() called from LoadNode::Ideal() for 
>> the loads dest.i1, dest.i2 etc. where we do not take GC barriers into 
>> account (disabled ReduceInitialCardMarks) when trying to find an 
>> ArrayCopyNode which belongs to a clone.
>>
>> As a result, we conclude that there is no ArrayCopyNode and bailout of 
>> the ideal transformation. Afterwards, we call LoadNode::Value() and 
>> look for a stored value for the allocation belonging to the clone() 
>> call. Since we cannot find one (because the ArrayCopyNode is 
>> initializing the allocation) we conclude that the field is 0 and 
>> replace the LoadNode by a constant 0. This happens for all the 
>> LoadNodes in the addition in the return statement which is then folded 
>> to 0 and returned.
>>
>> This could have been prevented if ReduceBulkZeroing was enabled. 
>> Because in that case, the InitializationNode would have been marked as 
>> completed at [1] and the InitializationNode::find_captured_store() 
>> method returned NULL at [2] and eventually the entire 
>> LoadNode::Value() method returned _type (int) instead of the constant 
>> 0 because of the bailout at [3] for completed InitializationNodes.
>>
>> Thank you!
>>
>> Best regards,
>> Christian
>>
>>
>> [1] 
>> http://hg.openjdk.java.net/jdk/jdk/file/a7c030723240/src/hotspot/share/opto/library_call.cpp#l4234 
>>
>> [2] 
>> http://hg.openjdk.java.net/jdk/jdk/file/a7c030723240/src/hotspot/share/opto/memnode.cpp#l3775 
>>
>> [3] 
>> http://hg.openjdk.java.net/jdk/jdk/file/a7c030723240/src/hotspot/share/opto/memnode.cpp#l3722 
>>

From rwestrel at redhat.com  Tue Jul  7 07:27:16 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Tue, 07 Jul 2020 09:27:16 +0200
Subject: [16] RFR(S): 8248226: TestCloneAccessStressGCM fails with
 -XX:-ReduceBulkZeroing
In-Reply-To: <88cd871e-b05a-5803-cc11-f082fc18f80b@oracle.com>
References: <88cd871e-b05a-5803-cc11-f082fc18f80b@oracle.com>
Message-ID: <87o8orpyy3.fsf@redhat.com>


> http://cr.openjdk.java.net/~chagedorn/8248226/webrev.00/

Looks good.

Roland.


From christian.hagedorn at oracle.com  Tue Jul  7 07:31:57 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Tue, 7 Jul 2020 09:31:57 +0200
Subject: [16] RFR(S): 8248226: TestCloneAccessStressGCM fails with
 -XX:-ReduceBulkZeroing
In-Reply-To: <87o8orpyy3.fsf@redhat.com>
References: <88cd871e-b05a-5803-cc11-f082fc18f80b@oracle.com>
 <87o8orpyy3.fsf@redhat.com>
Message-ID: <2a59757e-3e51-309b-8ef3-73e8a0d357d0@oracle.com>

Thank you Roland for your review!

Best regards,
Christian

On 07.07.20 09:27, Roland Westrelin wrote:
> 
>> http://cr.openjdk.java.net/~chagedorn/8248226/webrev.00/
> 
> Looks good.
> 
> Roland.
> 

From patric.hedlin at oracle.com  Tue Jul  7 11:00:12 2020
From: patric.hedlin at oracle.com (Patric Hedlin)
Date: Tue, 7 Jul 2020 13:00:12 +0200
Subject: RFR(S): 8248901: Signed immediate support in .../share/assembler.hpp
 is broken.
Message-ID: <3df3dab6-aa2f-bbbc-d231-6cda8f2a0ff7@oracle.com>

Dear all,

I would like to ask for help to review the following change/update:

Issue:  https://bugs.openjdk.java.net/browse/JDK-8248901
Webrev: http://cr.openjdk.java.net/~phedlin/tr8248901/


Current definition(s) of is_simm() and friends are not robust over 
inputs. Both min and max values are undefined for width > 32 (and width 
< 0).
No is_uimm() is currently provided (added). Several definitions are not 
used (cleanup).

NOTE: Adding currently unused is_simm9() and is_uimm12(), required by 
JDK-8247766.
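
For illustration, a range check that stays defined for every width might look like the sketch below. It is not the code in the webrev: the signatures and the decision to reject widths outside 1..64 are assumptions made for this example.

```cpp
#include <cstdint>

// Hypothetical: does x fit in a signed immediate of nbits bits?
// Widths outside 1..64 are rejected instead of shifting by an invalid amount.
constexpr bool is_simm(int64_t x, unsigned nbits) {
  return nbits == 64 ||
         (nbits >= 1 && nbits < 64 &&
          x >= -(int64_t(1) << (nbits - 1)) &&
          x <   (int64_t(1) << (nbits - 1)));
}

// Hypothetical: does x fit in an unsigned immediate of nbits bits?
constexpr bool is_uimm(uint64_t x, unsigned nbits) {
  return nbits == 64 ||
         (nbits >= 1 && nbits < 64 && x < (uint64_t(1) << nbits));
}

// 9-bit signed range is -256..255, 12-bit unsigned range is 0..4095.
static_assert(is_simm(255, 9) && is_simm(-256, 9), "simm9 bounds");
static_assert(!is_simm(256, 9) && !is_simm(-257, 9), "outside simm9");
static_assert(is_uimm(4095, 12) && !is_uimm(4096, 12), "uimm12 bounds");
// Wide and degenerate widths remain well defined.
static_assert(is_simm(INT64_MIN, 64), "full width accepts everything");
static_assert(!is_simm(0, 0) && !is_simm(0, 65), "invalid widths rejected");
```

The short-circuit on the width keeps every shift amount below 64, which is the property the current definitions lack for wide inputs.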


Testing: hs-tier1-3


Best regards,
Patric

From patric.hedlin at oracle.com  Tue Jul  7 11:17:43 2020
From: patric.hedlin at oracle.com (Patric Hedlin)
Date: Tue, 7 Jul 2020 13:17:43 +0200
Subject: RFR(S/M): 8247766: [aarch64] guarantee(val < (1U << nbits)) failed:
 Field too big for insn
Message-ID: <0cdbdf26-ad4d-056b-a801-cc31b2cc4ab3@oracle.com>

Dear all,

I would like to ask for help to review the following change/update:

Issue:  https://bugs.openjdk.java.net/browse/JDK-8247766
Webrev: http://cr.openjdk.java.net/~phedlin/tr8247766/


C1 code generation for reading and writing stack-slots does not handle 
large immediate offsets on aarch64. This patch will ensure that 
immediate offsets are admissible for base+(immediate)offset encoding or, 
if this is not the case, will enforce an explicit address calculation to 
a scratch register. (Also correcting a small glitch in 9-bit signed 
immediate encoding check.)

NOTE: Current patch includes (local) definitions of is_simm/9 and 
is_uimm/12, for review purpose only. With JDK-8248901 these will move to 
Assembler, and will not be included in the change-set.
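
As a rough sketch of the admissibility test described above: an AArch64 load/store can take either an unscaled signed 9-bit offset or an unsigned 12-bit offset scaled by the access size. The helper name stack_offset_encodable and its parameters are hypothetical and do not come from the patch.

```cpp
#include <cstdint>

// Signed 9-bit immediate (unscaled offset form): -256..255.
constexpr bool fits_simm9(int64_t v) { return v >= -256 && v <= 255; }

// Unsigned 12-bit immediate (scaled offset form): 0..4095.
constexpr bool fits_uimm12(int64_t v) { return v >= 0 && v <= 4095; }

// Hypothetical helper: can `offset` be used directly as base+immediate for an
// access of (1 << size_log2) bytes? If not, the address would have to be
// computed into a scratch register first.
constexpr bool stack_offset_encodable(int64_t offset, unsigned size_log2) {
  return fits_simm9(offset) ||
         (offset >= 0 &&
          (offset & ((int64_t(1) << size_log2) - 1)) == 0 &&  // aligned
          fits_uimm12(offset >> size_log2));                  // scaled value
}

// 8-byte accesses (size_log2 == 3):
static_assert(stack_offset_encodable(32760, 3), "4095 * 8 is the largest scaled offset");
static_assert(!stack_offset_encodable(32768, 3), "4096 * 8 needs a scratch register");
static_assert(stack_offset_encodable(-8, 3), "small negative offsets use simm9");
static_assert(!stack_offset_encodable(260, 3), "unaligned and outside simm9");
```

A caller would fall back to an explicit address calculation whenever the predicate returns false.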


Testing: tier1-3,6


Best regards,
Patric


From nils.eliasson at oracle.com  Tue Jul  7 13:55:56 2020
From: nils.eliasson at oracle.com (Nils Eliasson)
Date: Tue, 7 Jul 2020 15:55:56 +0200
Subject: RFR(S/M): 8247766: [aarch64] guarantee(val < (1U << nbits))
 failed: Field too big for insn
In-Reply-To: <0cdbdf26-ad4d-056b-a801-cc31b2cc4ab3@oracle.com>
References: <0cdbdf26-ad4d-056b-a801-cc31b2cc4ab3@oracle.com>
Message-ID: <1cedcefe-547c-80ad-854f-0a38e7a07639@oracle.com>

Hi Patric,

There are some minor typos in the comments. Otherwise it looks good.

No re-review needed.

Best regards,
Nils

src/hotspot/cpu/aarch64/assembler_aarch64.hpp:

"// Scaled unsigned offset, ecoded in an unsigned imm12:_ field."

ecoded -> encoded
imm12:_ field. -> imm12 field

"// Unscaled signed offset, ecoded in a signed imm9 field."

ecoded -> encoded

"// Scaled unsigned offset, ecoded in an unsigned imm12:_ field."

ecoded -> encoded
imm12:_ field. -> imm12 field


On 2020-07-07 13:17, Patric Hedlin wrote:
> Dear all,
>
> I would like to ask for help to review the following change/update:
>
> Issue:  https://bugs.openjdk.java.net/browse/JDK-8247766
> Webrev: http://cr.openjdk.java.net/~phedlin/tr8247766/
>
>
> C1 code generation for reading and writing stack-slots does not handle 
> large immediate offsets on aarch64. This patch will ensure that 
> immediate offsets are admissible for base+(immediate)offset encoding 
> or, if this is not the case, will enforce an explicit address 
> calculation to a scratch register. (Also correcting a small glitch in 
> 9-bit signed immediate encoding check.)
>
> NOTE: Current patch includes (local) definitions of is_simm/9 and 
> is_uimm/12, for review purpose only. With JDK-8248901 these will move 
> to Assembler, and will not be included in the change-set.
>
>
> Testing: tier1-3,6
>
>
> Best regards,
> Patric
>
>


From patric.hedlin at oracle.com  Tue Jul  7 14:20:47 2020
From: patric.hedlin at oracle.com (Patric Hedlin)
Date: Tue, 7 Jul 2020 16:20:47 +0200
Subject: RFR(S/M): 8247766: [aarch64] guarantee(val < (1U << nbits))
 failed: Field too big for insn
In-Reply-To: <1cedcefe-547c-80ad-854f-0a38e7a07639@oracle.com>
References: <0cdbdf26-ad4d-056b-a801-cc31b2cc4ab3@oracle.com>
 <1cedcefe-547c-80ad-854f-0a38e7a07639@oracle.com>
Message-ID: <6217195b-fe7b-c28a-2ae8-292a2e319e95@oracle.com>

Thanks for reviewing Nils.

On 2020-07-07 15:55, Nils Eliasson wrote:
> Hi Patric,
>
> There are some minor typos in the comments. Otherwise it looks good.
>
> No re-review needed.
>
> Best regards,
> Nils
>
> src/hotspot/cpu/aarch64/assembler_aarch64.hpp:
>
> "// Scaled unsigned offset, ecoded in an unsigned imm12:_ field."
>
> ecoded -> encoded
> imm12:_ field. -> imm12 field
>
> "// Unscaled signed offset, ecoded in a signed imm9 field."
>
> ecoded -> encoded
>
> "// Scaled unsigned offset, ecoded in an unsigned imm12:_ field."
>
> ecoded -> encoded
> imm12:_ field. -> imm12 field
>
Ok, that was my obviously failed attempt to illustrate that there is 
scaling...


/Patric

>
>
>
> On 2020-07-07 13:17, Patric Hedlin wrote:
>> Dear all,
>>
>> I would like to ask for help to review the following change/update:
>>
>> Issue:  https://bugs.openjdk.java.net/browse/JDK-8247766
>> Webrev: http://cr.openjdk.java.net/~phedlin/tr8247766/
>>
>>
>> C1 code generation for reading and writing stack-slots does not 
>> handle large immediate offsets on aarch64. This patch will ensure 
>> that immediate offsets are admissible for base+(immediate)offset 
>> encoding or, if this is not the case, will enforce an explicit 
>> address calculation to a scratch register. (Also correcting a small 
>> glitch in 9-bit signed immediate encoding check.)
>>
>> NOTE: Current patch includes (local) definitions of is_simm/9 and 
>> is_uimm/12, for review purpose only. With JDK-8248901 these will move 
>> to Assembler, and will not be included in the change-set.
>>
>>
>> Testing: tier1-3,6
>>
>>
>> Best regards,
>> Patric 

From fairoz.matte at oracle.com  Tue Jul  7 14:49:11 2020
From: fairoz.matte at oracle.com (Fairoz Matte)
Date: Tue, 7 Jul 2020 07:49:11 -0700 (PDT)
Subject: RFR(s): 8236042: [TESTBUG] serviceability/sa/ClhsdbCDSCore.java fails
 with -Xcomp -XX:TieredStopAtLevel=1
Message-ID: <2abe9fba-e958-4b34-9f92-6bb8d8478f4e@default>

Hi,

Please review this small test change to consider the scenario when there is no "printmdo" output

JBS - https://bugs.openjdk.java.net/browse/JDK-8236042 
Webrev - http://cr.openjdk.java.net/~fmatte/8236042/webrev.00/ 

Thanks,
Fairoz

From boris.ulasevich at bell-sw.com  Tue Jul  7 15:47:35 2020
From: boris.ulasevich at bell-sw.com (Boris Ulasevich)
Date: Tue, 7 Jul 2020 18:47:35 +0300
Subject: RFR 8248870: AARCH64: I2L conversions can be skipped for small
 positive masked values
Message-ID: <d83ab998-ad1e-4889-b637-dc7f0369c50e@bell-sw.com>

Hi,

Please review the change to skip i2l conversion after the mask:

http://cr.openjdk.java.net/~bulasevich/8248870/webrev.00
http://bugs.openjdk.java.net/browse/JDK-8248870

With the change the micro-benchmark gets 11.082->7.520 ns/op
performance improvement. Tested with jtreg.

thanks,
Boris

From hohensee at amazon.com  Tue Jul  7 16:20:46 2020
From: hohensee at amazon.com (Hohensee, Paul)
Date: Tue, 7 Jul 2020 16:20:46 +0000
Subject: RFR(XS) 8248568: compiler/c2/TestBit.java failed: 'test' missing
 from stdout/stderr
Message-ID: <C976CF78-3A2E-4DA3-B0F6-B4FDDAAC44E9@amazon.com>

In case it's not judged to be trivial, lgtm.

Thanks,
Paul

On 7/6/20, 2:38 AM, "hotspot-compiler-dev on behalf of Boris Ulasevich" <hotspot-compiler-dev-retn at openjdk.java.net on behalf of boris.ulasevich at bell-sw.com> wrote:

    Thank you Vladimir.

    May I consider the change trivial or should I ask for more reviews?

    regards,
    Boris

    On 03.07.2020 00:45, Vladimir Kozlov wrote:
    > Looks good.
    >
    > Thanks,
    > Vladimir
    >
    > On 7/2/20 2:29 PM, Boris Ulasevich wrote:
    >> Hi Vladimir,
    >>
    >> Thank you. I applied your suggestions. On our machines jtreg runs well.
    >> Update: http://cr.openjdk.java.net/~bulasevich/8248568/webrev.01
    >>
    >> regards,
    >> Boris
    >>
    >> On Thu, Jul 2, 2020 at 9:54 PM Vladimir Kozlov
    >> <vladimir.kozlov at oracle.com>
    >> wrote:
    >>
    >>> Good.
    >>>
    >>> You may also replace next requirements:
    >>>
    >>> vm.flavor == "server" & !vm.graal.enabled
    >>>
    >>> with one:
    >>>
    >>> vm.compiler2.enabled
    >>>
    >>> Graal and C2 are mutually exclusive.
    >>>
    >>> May be also run processes without C1 by switching off Tiered
    >>> Compilation.
    >>>
    >>> And instead of:
    >>> @run main/othervm compiler.c2.TestBit
    >>>
    >>> use:
    >>> @run driver compiler.c2.TestBit
    >>>
    >>> Because you are launching separate processes.
    >>>
    >>> Please, test changes with jtreg testing.
    >>>
    >>> Thanks,
    >>> Vladimir K
    >>>
    >>> On 7/2/20 11:13 AM, Boris Ulasevich wrote:
    >>>> Hi,
    >>>>
    >>>> Please review a one-line change: adding -Xbatch option to recently
    >>>> introduced test to get a more predictable PrintOptoAssembly output.
    >>>>
    >>>> http://cr.openjdk.java.net/~bulasevich/8248568/webrev.00
    >>>> http://bugs.openjdk.java.net/browse/JDK-8248568
    >>>>
    >>>> thanks,
    >>>> Boris
    >>>


From vladimir.kozlov at oracle.com  Tue Jul  7 17:07:04 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 7 Jul 2020 10:07:04 -0700
Subject: [15] RFR(S): 8248845: AArch64: stack corruption after spilling
 vector register
In-Reply-To: <857dvfrev5.fsf@arm.com>
References: <857dvfrev5.fsf@arm.com>
Message-ID: <0eeec297-f2e1-e326-5d3a-eb4a11e47934@oracle.com>

Thank you, Nick

You are absolutely right that the change in 8076276 was a mistake. And we don't do scheduling for x86.
Do you need sponsorship for push? The fix is trivial and should be pushed into jdk/jdk15.

Thanks,
Vladimir

On 7/6/20 11:58 PM, Nick Gasson wrote:
> Hi,
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8248845
> Webrev: http://cr.openjdk.java.net/~ngasson/8248845/webrev.0/
> 
> This crash was seen on the Panama vectorIntrinsics branch and a minimal
> test case is attached to the JBS entry, but it should also affect
> vanilla jdk/jdk although I haven't found a reliable way to reproduce it.
> 
>    0x0000ffffa173fd40:   ldr	x11, [sp, #120]
>    0x0000ffffa173fd44:   ldr	x10, [sp, #48]
>    0x0000ffffa173fd48:   add	x10, x10, x11
>    0x0000ffffa173fd4c:   str	q16, [x10, #16]    ; <==== CRASH HERE
> 
> X10 loaded from sp+48 contains a valid pointer but X11 from sp+120
> contains a garbage value. Here's the relevant opto assembly that spills
> to sp+120:
> 
>    6b4 +   spill [sp, #80] -> [sp, #16]	# vector spill size = 128
>    6bc +   spill [sp, #24] -> [sp, #120]	# spill size = 64
> 
> These two instructions have been scheduled in the wrong order: 6b4
> writes 16 bytes at sp+16 which overwrites another live value at sp+24.
> Instruction 6bc spills the clobbered value at sp+24 to sp+120. If I dump
> out the instructions before scheduling or pass -XX:-OptoScheduling the
> order is correct.
> 
> It seems to be a known limitation that the scheduler doesn't correctly
> compute anti-dependencies when a vector occupies more than two slots,
> because PhaseOutput::ScheduleAndBundle() already has a check to skip
> scheduling if a too-wide vector was generated. Unfortunately the check
> is wrong as a pair of slots is 8 bytes not 16:
> 
>    // Scheduling code works only with pairs (16 bytes) maximum.
>    if (C->max_vector_size() > 16)
> 
> Actually the test here used to be > 8, but was changed as part of
> JDK-8076276 which added AVX512 support [1]. I couldn't see any
> explanation of that change in the bug or mailing list thread, but it
> seems wrong and reverting it fixes this crash.
> 
> This affects AArch64 because OptoScheduling is enabled by default and
> NEON vectors are 16 bytes wide.
> 
> Tested hotspot_all_no_apps, jdk_core on AArch64 and x86_64.
> 
> [1] http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-April/017579.html
> 
> --
> Thanks,
> Nick
> 

From chris.plummer at oracle.com  Tue Jul  7 22:07:49 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 7 Jul 2020 15:07:49 -0700
Subject: RFR(s): 8236042: [TESTBUG] serviceability/sa/ClhsdbCDSCore.java
 fails with -Xcomp -XX:TieredStopAtLevel=1
In-Reply-To: <2abe9fba-e958-4b34-9f92-6bb8d8478f4e@default>
References: <2abe9fba-e958-4b34-9f92-6bb8d8478f4e@default>
Message-ID: <70057c31-e535-f03a-391d-d181b2ec150b@oracle.com>

Hi Fairoz,

Looks good, except for the missing space in "if(testJavaOpts...".

thanks,

Chris

On 7/7/20 7:49 AM, Fairoz Matte wrote:
> Hi,
>
> Please review this small test change to consider the scenario when there is no "printmdo" output
>
> JBS - https://bugs.openjdk.java.net/browse/JDK-8236042
> Webrev - http://cr.openjdk.java.net/~fmatte/8236042/webrev.00/
>
> Thanks,
> Fairoz


From igor.ignatyev at oracle.com  Wed Jul  8 00:38:18 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 7 Jul 2020 17:38:18 -0700
Subject: RFR(S) [15] : 8249000 : vm.gc.X should take selected JIT into account
Message-ID: <FFC49DC0-AA15-45B5-B285-E043D1AEAF97@oracle.com>

http://cr.openjdk.java.net/~iignatyev/8249000/webrev.00/
> 241 lines changed: 34 ins; 5 del; 202 mod;


Hi all,

could you please review the patch which modifies requires/VMProps to set vm.gc.X to false if Graal is selected and X GC isn't supported by Graal?

the patch also replaces @requires similar to `vm.gc.X & !vm.graal.enabled` w/ `vm.gc.X` where it's applicable.

from JBS:
> not all GCs are supported by Graal JIT, which leads to failures like JDK-8247527 and boilerplate fixes like replacing all `@requires vm.gc.Z` w/ `@requires vm.gc.Z & !vm.graal.enabled`. 
> 
> as vm.gc.X means that X GC can be selected, it would be more natural, less surprising, and much more clear to have it true if the selected JIT supports the said X GC.

webrev: http://cr.openjdk.java.net/~iignatyev/8249000/webrev.00/
JBS: https://bugs.openjdk.java.net/browse/JDK-8249000
testing: test/hotspot/jtreg/{gc,compiler,runtime,serviceability} on {linux,windows,macos}-x64 w/ and w/o Graal as JIT

Thanks,
-- Igor

From vladimir.kozlov at oracle.com  Wed Jul  8 03:00:12 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 7 Jul 2020 20:00:12 -0700
Subject: RFR(S) [15] : 8249000 : vm.gc.X should take selected JIT into
 account
In-Reply-To: <FFC49DC0-AA15-45B5-B285-E043D1AEAF97@oracle.com>
References: <FFC49DC0-AA15-45B5-B285-E043D1AEAF97@oracle.com>
Message-ID: <6964ac32-e9ec-d700-0bdb-ea51f4610afe@oracle.com>

Nice clean up, Igor

test/hotspot/jtreg/gc/stress/TestReclaimStringsLeaksMemory.java

Can we even have vm.gc=="null" based on the code in VMProps.java? At least some GC should be selected ergonomically even 
if none is specified on the command line.

- * @requires vm.gc=="null" & !vm.graal.enabled & !vm.debug
+ * @requires vm.gc == "null"
+ * @requires !vm.debug


test/hotspot/jtreg/runtime/cds/appcds/TestZGCWithCDS.java

Does the next combination of @requires ever work? I thought such a sequence means an 'AND' operation on all such conditions.

   * @requires vm.gc.Z
   * @requires vm.gc.Serial
   * @requires vm.gc == null


Thanks,
Vladimir

On 7/7/20 5:38 PM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev/8249000/webrev.00/
>> 241 lines changed: 34 ins; 5 del; 202 mod;
> 
> 
> Hi all,
> 
> could you please review the patch which modifies requires/VMProps to set vm.gc.X to false if Graal is selected and X GC isn't supported by Graal?
> 
> the patch also replaces @requires similar to `vm.gc.X & !vm.graal.enabled` w/ `vm.gc.X` where it's applicable.
> 
> from JBS:
>> not all GCs are supported by Graal JIT, which leads to failures like JDK-8247527 and boilerplate fixes like replacing all `@requires vm.gc.Z` w/ `@requires vm.gc.Z & !vm.graal.enabled`.
>>
>> as vm.gc.X means that X GC can be selected, it would be more natural, less surprising, and much more clear to have it true if the selected JIT supports the said X GC.
> 
> webrev: http://cr.openjdk.java.net/~iignatyev/8249000/webrev.00/
> JBS: https://bugs.openjdk.java.net/browse/JDK-8249000
> testing: test/hotspot/jtreg/{gc,compiler,runtime,serviceability} on {linux,windows,macos}-x64 w/ and w/o Graal as JIT
> 
> Thanks,
> -- Igor
> 

From igor.ignatyev at oracle.com  Wed Jul  8 03:30:38 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 7 Jul 2020 20:30:38 -0700
Subject: RFR(S) [15] : 8249000 : vm.gc.X should take selected JIT into
 account
In-Reply-To: <6964ac32-e9ec-d700-0bdb-ea51f4610afe@oracle.com>
References: <FFC49DC0-AA15-45B5-B285-E043D1AEAF97@oracle.com>
 <6964ac32-e9ec-d700-0bdb-ea51f4610afe@oracle.com>
Message-ID: <7A1992A7-1493-4DF0-B621-195CE986D34F@oracle.com>

Hi Vladimir,

thanks for your review!

`vm.gc` and `vm.gc.X`-s are different beasts (and admittedly, they confuse people a lot), `vm.gc` is set to "X", by jtreg itself, only if there is UseXGC in vm flags, otherwise it's "null". `vm.gc.X` are set by VMProps class, and you can have more than one vm.gc.X == true, as vm.gc.X means that X gc is supported by JVM and it can be selected; so if there are no Use.*GC in vm flags, vm.gc.X will yield true for all GCs which JVM was built with; if one of UseXGC is provided, only corresponding vm.gc.X is true, and all others are false. so to answer your questions, yes `vm.gc` can be "null" (if there are no Use.*GC) , and yes `vm.gc.Z & vm.gc.Serial & vm.gc == null` can be true (if there are no Use.*GC and JVM supports both Z and Serial GCs).
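
A minimal model of these rules, as I understand them from the description above. The struct and function names are invented for this sketch, and the Graal clause reflects what the patch proposes rather than any existing VMProps code.

```cpp
// Hypothetical inputs for evaluating vm.gc.X for one collector X.
struct GcQuery {
  bool built_in;        // the JVM was built with collector X
  bool any_gc_flag;     // some Use*GC flag is present in the vm flags
  bool this_gc_flag;    // UseXGC itself is present
  bool graal_selected;  // Graal is the selected JIT
  bool graal_supports;  // Graal supports collector X
};

// vm.gc.X: X can be selected for this run.
constexpr bool vm_gc_x(GcQuery q) {
  return q.built_in &&
         (!q.any_gc_flag || q.this_gc_flag) &&
         (!q.graal_selected || q.graal_supports);
}

// vm.gc is "null" exactly when no Use*GC flag was given.
constexpr bool vm_gc_is_null(bool any_gc_flag) { return !any_gc_flag; }

// No GC flag: every built-in collector reports true, and vm.gc is "null".
static_assert(vm_gc_x(GcQuery{true, false, false, false, false}) &&
              vm_gc_is_null(false), "vm.gc.X & vm.gc == null can hold");
// Another collector was chosen explicitly: vm.gc.X is false.
static_assert(!vm_gc_x(GcQuery{true, true, false, false, false}),
              "a different Use*GC flag excludes X");
// Graal is selected but does not support X: vm.gc.X is false.
static_assert(!vm_gc_x(GcQuery{true, false, false, true, false}),
              "unsupported by the selected JIT");
```

With this reading, `vm.gc.Z & vm.gc.Serial & vm.gc == null` is satisfiable whenever no Use*GC flag is passed and both collectors are built in.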

Thanks,
-- Igor 


> On Jul 7, 2020, at 8:00 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> Nice clean up, Igor
> 
> test/hotspot/jtreg/gc/stress/TestReclaimStringsLeaksMemory.java
> 
> Can we even have vm.gc=="null" based on the code in VMProps.java? At least some GC should be selected ergonomically even if none is specified on the command line.
> 
> - * @requires vm.gc=="null" & !vm.graal.enabled & !vm.debug
> + * @requires vm.gc == "null"
> + * @requires !vm.debug

> 
> 
> test/hotspot/jtreg/runtime/cds/appcds/TestZGCWithCDS.java
> 
> Does the next combination of @requires ever work? I thought such a sequence means an 'AND' operation on all such conditions.
> 
>  * @requires vm.gc.Z
>  * @requires vm.gc.Serial
>  * @requires vm.gc == null
> 
> 
> Thanks,
> Vladimir
> 
> On 7/7/20 5:38 PM, Igor Ignatyev wrote:
>> http://cr.openjdk.java.net/~iignatyev/8249000/webrev.00/
>>> 241 lines changed: 34 ins; 5 del; 202 mod;
>> Hi all,
>> could you please review the patch which modifies requires/VMProps to set vm.gc.X to false if Graal is selected and X GC isn't supported by Graal?
>> the patch also replaces @requires similar to `vm.gc.X & !vm.graal.enabled` w/ `vm.gc.X` where it's applicable.
>> from JBS:
>>> not all GCs are supported by Graal JIT, which leads to failures like JDK-8247527 and boilerplate fixes like replacing all `@requires vm.gc.Z` w/ `@requires vm.gc.Z & !vm.graal.enabled`.
>>> 
>>> as vm.gc.X means that X GC can be selected, it would be more natural, less surprising, and much more clear to have it true if the selected JIT supports the said X GC.
>> webrev: http://cr.openjdk.java.net/~iignatyev/8249000/webrev.00/
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249000
>> testing: test/hotspot/jtreg/{gc,compiler,runtime,serviceability} on {linux,windows,macos}-x64 w/ and w/o Graal as JIT
>> Thanks,
>> -- Igor


From fairoz.matte at oracle.com  Wed Jul  8 03:47:48 2020
From: fairoz.matte at oracle.com (Fairoz Matte)
Date: Tue, 7 Jul 2020 20:47:48 -0700 (PDT)
Subject: RFR(s): 8236042: [TESTBUG] serviceability/sa/ClhsdbCDSCore.java
 fails with -Xcomp -XX:TieredStopAtLevel=1
In-Reply-To: <70057c31-e535-f03a-391d-d181b2ec150b@oracle.com>
References: <2abe9fba-e958-4b34-9f92-6bb8d8478f4e@default>
 <70057c31-e535-f03a-391d-d181b2ec150b@oracle.com>
Message-ID: <958fecdf-d7a1-4b22-835e-a75fadda0a84@default>

Thanks Chris, for the review comments.

I have updated the suggested change.

Thanks,
Fairoz

> -----Original Message-----
> From: Chris Plummer
> Sent: Wednesday, July 8, 2020 3:38 AM
> To: Fairoz Matte <fairoz.matte at oracle.com>; hotspot-compiler-
> dev at openjdk.java.net; serviceability-dev at openjdk.java.net
> Subject: Re: RFR(s): 8236042: [TESTBUG] serviceability/sa/ClhsdbCDSCore.java
> fails with -Xcomp -XX:TieredStopAtLevel=1
> 
> Hi Fairoz,
> 
> Looks good, except for the missing space in "if(testJavaOpts...".
> 
> thanks,
> 
> Chris
> 
> On 7/7/20 7:49 AM, Fairoz Matte wrote:
> > Hi,
> >
> > Please review this small test change to consider the scenario when there is no
> "printmdo" output
> >
> > JBS - https://bugs.openjdk.java.net/browse/JDK-8236042
> > Webrev - http://cr.openjdk.java.net/~fmatte/8236042/webrev.00/
> >
> > Thanks,
> > Fairoz
> 

From igor.ignatyev at oracle.com  Wed Jul  8 06:56:27 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 7 Jul 2020 23:56:27 -0700
Subject: RFR [15] : 8249018 : clean up FileInstaller $test.src $cwd in
 vmTestbase_vm_mlvm tests
Message-ID: <A6F60E58-5F83-4A6F-B0BA-57278278C481@oracle.com>

http://cr.openjdk.java.net/~iignatyev//8249018/webrev.00
> 116 lines changed: 0 ins; 64 del; 52 mod;

Hi all,

could you please review the patch which removes `FileInstaller . .` jtreg action from :vmTestbase_vm_mlvm tests?
from the main issue(8204985):
> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. some tests depend on this step, so we need 1st identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests.


testing: :vmTestbase_vm_mlvm on linux-x64
webrev: http://cr.openjdk.java.net/~iignatyev//8249018/webrev.00
JBS: https://bugs.openjdk.java.net/browse/JDK-8249018

Thanks,
-- Igor

From Yang.Zhang at arm.com  Wed Jul  8 07:05:09 2020
From: Yang.Zhang at arm.com (Yang Zhang)
Date: Wed, 8 Jul 2020 07:05:09 +0000
Subject: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API
 (Incubator): AArch64 backend changes
In-Reply-To: <2acbcc99-8dd4-b8f1-5982-1d439953c416@redhat.com>
References: <VI1PR0802MB2558C90ACF1189F17692FF278EB90@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <275eb57c-51c0-675e-c32a-91b198023559@redhat.com>
 <719F9169-ABC4-408E-B732-F1BD9A84337F@oracle.com>
 <9a13f5df-d946-579d-4282-917dc7338dc8@redhat.com>
 <09BC0693-80E0-4F87-855E-0B38A6F5EFA2@oracle.com>
 <668e500e-f621-5a2c-a41e-f73536880f73@redhat.com>
 <VI1PR0802MB255845256424F9EB4D5D81368EB30@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <1909fa9d-98bb-c2fb-45d8-540247d1ca8b@redhat.com>
 <VI1PR0802MB255895ADE9BC8D027EBB81048EB10@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <BYAPR11MB35438110FA50851942CE17A1EF950@BYAPR11MB3543.namprd11.prod.outlook.com>
 <HE1PR0802MB25542AE212E12FD3EF6D1BE48E6E0@HE1PR0802MB2554.eurprd08.prod.outlook.com>
 <2acbcc99-8dd4-b8f1-5982-1d439953c416@redhat.com>
Message-ID: <VI1PR0802MB2558ECC61528694DBDFFF48F8E670@VI1PR0802MB2558.eurprd08.prod.outlook.com>

Hi Andrew

I have updated this patch. Could you please help to review it again?
In this patch, the following changes are made:
1. Separate newly added NEON instructions to a new ad file
   aarch64_neon.ad
2. Add assembler tests for NEON instructions. Trailing spaces
   in the python script are also removed.

http://cr.openjdk.java.net/~yzhang/vectorapi/vectorapi.rfr/aarch64_webrev/webrev.02/

Thanks,
Yang


-----Original Message-----
From: Andrew Haley <aph at redhat.com> 
Sent: Tuesday, June 30, 2020 12:10 AM
To: Yang Zhang <Yang.Zhang at arm.com>; Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; Paul Sandoz <paul.sandoz at oracle.com>
Cc: nd <nd at arm.com>; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; core-libs-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
Subject: Re: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API (Incubator): AArch64 backend changes

On 29/06/2020 08:48, Yang Zhang wrote:
> 1. Instructions that can be matched with NEON instructions directly.
> MulVB, SqrtVF and AbsV have been merged into jdk master already.
> 
> 2. Instructions that jdk master has middle end support for, but they cannot be matched with NEON instructions directly.
> Such as AddReductionVL, MulReductionVL, And/Or/XorReductionV These new instructions can be moved into jdk master first, but for auto-vectorization, the performance might not get improved. 
> 
> 3. Panama/Vector API specific  instructions such as Load/StoreVector ( 16 bits), VectorReinterpret, VectorMaskCmp, MaxV/MinV, VectorBlend etc. 
> These instructions cannot be moved into jdk master first because there isn't middle-end support.
> 
> I will put 2 and 3 in a new ad file aarch64_neon.ad. I will also update aarch64_asmtest.py and macroassemler.cpp. When the patch is ready, I will send it again.

Thank you *very* much for your hard work. Appreciated!

--
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com> https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From xxinliu at amazon.com  Wed Jul  8 08:26:01 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Wed, 8 Jul 2020 08:26:01 +0000
Subject: RFR(S): 8247732: validate user-input intrinsic_ids in
 ControlIntrinsic
In-Reply-To: <821e3d29-c95b-aafc-8ee5-6e49a1bdde82@amazon.com>
References: <821e3d29-c95b-aafc-8ee5-6e49a1bdde82@amazon.com>
Message-ID: <9b324805-eb86-27e1-5dcb-96a823f8495b@amazon.com>

hi, Reviewers,

Please allow me to ping this CR.
It's the last left-over task for -XX:ControlIntrinsic=. It adds a sanity check for user input.

Thanks,
--lx

On 6/25/20 6:59 PM, Liu, Xin wrote:

hi, Reviewers,

Could you review this patch?

bug: https://bugs.openjdk.java.net/browse/JDK-8247732

webrev: http://cr.openjdk.java.net/~xliu/8247732/00/webrev/


The core logic is class ControlIntrinsicValidator in compilerDirectives.hpp

It iterates the ccstrlist option and makes sure user-input intrinsic ids are all valid.  It stops and takes a record when it meets the first unrecognized intrinsic.

I used constraints to validate the global options ControlIntrinsic and DisableIntrinsic.

ControlIntrinsic/DisableIntrinsic in compiler directives are more complex. The matched directive is only parsed when hotspot attempts to compile the corresponding method.

I validate at that time, and the JVM will crash if the input does not satisfy the guarantee() statement.

I added Method::external_name_short() which only returns the shorter method name in the form of  "classname::method".

HotSpot probably already has similar code, but I failed to find it. Please let me know and I will remove it.


Test:

passed hotspot:tier1 and gtest:all

manually tested with wrong inputs.

https://bugs.openjdk.java.net/browse/JDK-8247732?focusedCommentId=14349960&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14349960


From nick.gasson at arm.com  Wed Jul  8 09:28:27 2020
From: nick.gasson at arm.com (Nick Gasson)
Date: Wed, 08 Jul 2020 17:28:27 +0800
Subject: [15] RFR(S): 8248845: AArch64: stack corruption after spilling
 vector register
In-Reply-To: <0eeec297-f2e1-e326-5d3a-eb4a11e47934@oracle.com>
References: <857dvfrev5.fsf@arm.com>
 <0eeec297-f2e1-e326-5d3a-eb4a11e47934@oracle.com>
Message-ID: <854kqiqrt0.fsf@arm.com>

On 07/08/20 01:07 am, Vladimir Kozlov wrote:
>
> You are absolutely right that the change in 8076276 was a mistake. And we don't do scheduling for x86.

I wonder whether we should only do scheduling on AArch64 for in-order
CPUs? I tried SPECjvm with/without OptoScheduling on a few different
AArch64 systems but couldn't get conclusive results either way.

> Do you need sponsorship for push? The fix is trivial and should be pushed into jdk/jdk15.
>

I pushed it to jdk15, thanks.

--
Nick

From aph at redhat.com  Wed Jul  8 09:46:53 2020
From: aph at redhat.com (Andrew Haley)
Date: Wed, 8 Jul 2020 10:46:53 +0100
Subject: [aarch64-port-dev ] RFR 8248870: AARCH64: I2L conversions can be
 skipped for small positive masked values
In-Reply-To: <d83ab998-ad1e-4889-b637-dc7f0369c50e@bell-sw.com>
References: <d83ab998-ad1e-4889-b637-dc7f0369c50e@bell-sw.com>
Message-ID: <9ccf64f1-7a88-0f67-8b50-4dea09af9c8b@redhat.com>

On 07/07/2020 16:47, Boris Ulasevich wrote:
> Please review the change to skip i2l conversion after the mask:
> 
> http://cr.openjdk.java.net/~bulasevich/8248870/webrev.00
> http://bugs.openjdk.java.net/browse/JDK-8248870

You seem to have inserted this between the DO NOT EDIT THIS SECTION
markers.

Please hold off this change until I've committed the patch for
8248414.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Wed Jul  8 11:28:23 2020
From: aph at redhat.com (Andrew Haley)
Date: Wed, 8 Jul 2020 12:28:23 +0100
Subject: Running IGV
In-Reply-To: <ff8e07f3-2416-6e89-6697-002542664539@oracle.com>
References: <09f19846-cd66-85ed-c491-c5348d8fe532@redhat.com>
 <f2d4d59f-eb69-53ad-47ab-0946bdd734ee@oracle.com>
 <ed109cf0-985c-4902-8339-29ae2a6b910e@redhat.com>
 <ff8e07f3-2416-6e89-6697-002542664539@oracle.com>
Message-ID: <4ce51de9-0cf3-8548-ca2b-bced67e3a561@redhat.com>

Hi,

On 03/07/2020 14:48, Tobias Hartmann wrote:
> 
> On 03.07.20 15:40, Andrew Haley wrote:
>> Thanks. It's better with JDK 8, but although it does load saved XML
>> Ideal Graphs, all it's possible to see is a tree with the names of
>> the compilation passes. No graphs are displayed.
> 
> After double-clicking on the phase, it sometimes takes a while to load if the graph is huge. If
> there's an issue, you should at least get an error message (did you check the console?).
> 
>> I'm guessing IGV must have rotted, and there's no version that works
>> with current HotSpot available.
> 
> Well it does work fine for me and I'm using it on a regular basis.
Thank you very much for your help. I persisted with my build and I've
now got something that works.

I am seeing these, but it doesn't seem to stop IGV from working.

java.lang.AssertionError
	at org.netbeans.api.visual.graph.GraphScene.addEdge(GraphScene.java:154)
	at com.sun.hotspot.igv.controlflow.ControlFlowScene.setGraph(ControlFlowScene.java:113)
	at com.sun.hotspot.igv.controlflow.ControlFlowTopComponent$1.run(ControlFlowTopComponent.java:145)
	at java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:311)
	at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:758)
	at java.awt.EventQueue.access$500(EventQueue.java:97)
	at java.awt.EventQueue$3.run(EventQueue.java:709)
	at java.awt.EventQueue$3.run(EventQueue.java:703)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:74)
	at java.awt.EventQueue.dispatchEvent(EventQueue.java:728)
	at org.netbeans.core.TimableEventQueue.dispatchEvent(TimableEventQueue.java:159)
[catch] at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:205)
	at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:116)
	at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:105)
	at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:101)
	at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:93)
	at java.awt.EventDispatchThread.run(EventDispatchThread.java:82)


-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From christian.hagedorn at oracle.com  Wed Jul  8 13:41:30 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Wed, 8 Jul 2020 15:41:30 +0200
Subject: RFR(M): 8229495: SIGILL in C2 generated OSR compilation
In-Reply-To: <87r1topriw.fsf@redhat.com>
References: <3b720427-d718-5d1c-dbe9-6149a21883af@oracle.com>
 <87r1topriw.fsf@redhat.com>
Message-ID: <84b2c86d-c7e6-7945-dae5-db1d8efe6f25@oracle.com>

Hi Roland

That's a nice solution and looks reasonable to me. Thanks for the 
detailed explanation!

I submitted some testing.

Some minor general comments:

1824         // Add back the predicate for the value at the beginning of the first entry
1825         prev_proj = clone_skeleton_predicate(iff, init, max_value, entry, proj, ctrl, outer_loop, prev_proj);

This comment seems to be outdated as you now clone both skeleton 
predicates with the same function call in different loop iterations.

- In loopopts.cpp: While fixing the spacing you could also add curly 
braces to the one-liner if statements like

955   if (n_op == Op_MergeMem) return n;

> I implemented this by using 2 subclasses to Opaque1 to denotate init and
> stride and facilitate pattern matching. I had to extend _class_id to
> juint to make Opaque1 a new class. 

While at it, you might want to consider replacing other uses of the 
pattern Opcode() == Op_Opaque1 with is_Opaque1() as well, as in 
loopTransform.cpp:

1158     assert(iff->in(1)->in(1)->Opcode() == Op_Opaque1, "unexpected predicate shape");

> Finally, the "asserts" above used to be removed at the
> end of compilation. I now leave them in debug builds so we can catch
> similar bugs earlier.

That's helpful!

> The bug doesn't reproduce with the included test case anymore but after
> I backed out a couple unrelated changes I could use the test case to
> verify the bug is indeed fixed.

I observed a Java Fuzzer crash ("fatal error: DEBUG MESSAGE: duplicated 
predicate failed which is impossible") this weekend which looked very 
similar to this bug and indeed it could be fixed with your patch. You 
could add it as additional testcase. Here is the simplified code and the 
command line I used to reproduce it.

$ java -Xcomp -XX:-TieredCompilation -XX:CompileOnly=Test::test Test.java

public class Test {

     public static int iFld = 0;
     public static short sFld = 1;

     public static void main(String[] strArr) {
         test();
     }

     public static int test() {
         int x = 11;
         int y = 0;
         int j = 0;
         int iArr[] = new int[400];

         init(iArr);

         for (int i = 0; i < 2; i++) {
             doNothing();
             for (j = 10; j > 1; j -= 2) {
                 sFld += (short)j;
                 iArr = iArr;
                 y += (j * 3);
                 x = (iArr[j - 1]/ x);
                 x = sFld;
             }
             int k = 1;
             while (++k < 8) {
                 iFld += x;
             }
         }
         return Float.floatToIntBits(654) + x + j + y;
     }

     // Inlined
     public static void doNothing() {
     }

     // Inlined
     public static void init(int[] a) {
         for (int j = 0; j < a.length; j++) {
             a[j] = 0;
         }
     }
}


Best regards,
Christian

From vladimir.kozlov at oracle.com  Wed Jul  8 17:58:59 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 8 Jul 2020 10:58:59 -0700
Subject: RFR [15] : 8249018 : clean up FileInstaller $test.src $cwd in
 vmTestbase_vm_mlvm tests
In-Reply-To: <A6F60E58-5F83-4A6F-B0BA-57278278C481@oracle.com>
References: <A6F60E58-5F83-4A6F-B0BA-57278278C481@oracle.com>
Message-ID: <c672514f-d503-256c-8689-81dc121cc45b@oracle.com>

Good.

Thanks,
Vladimir

On 7/7/20 11:56 PM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev//8249018/webrev.00
>> 116 lines changed: 0 ins; 64 del; 52 mod;
> 
> Hi all,
> 
> could you please review the patch which removes `FileInstaller . .` jtreg action from :vmTestbase_vm_mlvm tests?
> from the main issue(8204985):
>> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. some tests depend on this step, so we need 1st identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests.
> 
> 
> testing: :vmTestbase_vm_mlvm on linux-x64
> webrev: http://cr.openjdk.java.net/~iignatyev//8249018/webrev.00
> JBS: https://bugs.openjdk.java.net/browse/JDK-8249018
> 
> Thanks,
> -- Igor
> 

From vladimir.kozlov at oracle.com  Wed Jul  8 18:34:08 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 8 Jul 2020 11:34:08 -0700
Subject: RFR(S) [15] : 8249000 : vm.gc.X should take selected JIT into
 account
In-Reply-To: <7A1992A7-1493-4DF0-B621-195CE986D34F@oracle.com>
References: <FFC49DC0-AA15-45B5-B285-E043D1AEAF97@oracle.com>
 <6964ac32-e9ec-d700-0bdb-ea51f4610afe@oracle.com>
 <7A1992A7-1493-4DF0-B621-195CE986D34F@oracle.com>
Message-ID: <2c92a9a5-77af-c100-fa9b-f765e9d23dce@oracle.com>

Thank you, Igor

I got the difference between `vm.gc` and `vm.gc.X`.

In this case TestReclaimStringsLeaksMemory.java should be put into ProblemList-graal.txt with 8207267 to enable it with 
libgraal.  The current use of !vm.graal.enabled in the test is to skip it with Java Graal because of its effect on the Java heap.

On 7/7/20 8:30 PM, Igor Ignatyev wrote:
> Hi Vladimir,
> 
> thanks for your review!
> 
> `vm.gc` and `vm.gc.X`-s are different beasts (and admittedly, they confuse people a lot), `vm.gc` is set to "X", by jtreg itself, only if there is UseXGC in vm flags, otherwise it's "null". `vm.gc.X` are set by VMProps class, and you can have more than one vm.gc.X == true, as vm.gc.X means that X gc is supported by JVM and it can be selected; so if there are no Use.*GC in vm flags, vm.gc.X will yield true for all GCs which JVM was built with; if one of UseXGC is provided, only corresponding vm.gc.X is true, and all others are false. so to answer your questions, yes `vm.gc` can be "null" (if there are no Use.*GC) , and yes `vm.gc.Z & vm.gc.Serial & vm.gc == null` can be true (if there are no Use.*GC and JVM supports both Z and Serial GCs).

Interesting. I thought vmGC will list only one selected GC. That explains requires in TestZGCWithCDS.java.
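
To make the distinction concrete, here is a small Java model of the rules Igor describes. It is only a sketch: the class and method names are made up, it is not the jtreg or VMProps code, and it ignores the Graal-specific adjustment that this patch adds.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical model of vm.gc and vm.gc.X; not the real jtreg/VMProps logic. */
final class GcProps {
    private static final Pattern USE_GC = Pattern.compile("-XX:\\+Use(\\w+)GC");

    /** vm.gc: the X of an explicit -XX:+UseXGC flag, or the string "null" if there is none. */
    static String vmGc(List<String> vmFlags) {
        for (String flag : vmFlags) {
            Matcher m = USE_GC.matcher(flag);
            if (m.matches()) {
                return m.group(1);
            }
        }
        return "null";
    }

    /** vm.gc.X for every GC the JVM was built with: "X can be selected". */
    static Map<String, Boolean> vmGcX(List<String> vmFlags, List<String> builtGcs) {
        String selected = vmGc(vmFlags);
        Map<String, Boolean> props = new LinkedHashMap<>();
        for (String gc : builtGcs) {
            // No Use*GC flag: every built-in GC is selectable.
            // Explicit flag: only the matching GC stays true.
            props.put("vm.gc." + gc, selected.equals("null") || selected.equals(gc));
        }
        return props;
    }
}
```

With no Use*GC flag, vmGc() yields "null" while vm.gc.Z and vm.gc.Serial are both true, which is why the three @requires lines in TestZGCWithCDS.java can hold at the same time.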

You only need to add TestReclaimStringsLeaksMemory.java into ProblemList-graal.txt.

Thanks,
Vladimir

> 
> Thanks,
> -- Igor
> 
> 
>> On Jul 7, 2020, at 8:00 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>>
>> Nice clean up, Igor
>>
>> test/hotspot/jtreg/gc/stress/TestReclaimStringsLeaksMemory.java
>>
>> Do we even can have vm.gc=="null" based on code in VMProps.java? At least some GC should be selected ergonomically even if non is specified on command line.
>>
>> - * @requires vm.gc=="null" & !vm.graal.enabled & !vm.debug
>> + * @requires vm.gc == "null"
>> + * @requires !vm.debug
> 
>>
>>
>> test/hotspot/jtreg/runtime/cds/appcds/TestZGCWithCDS.java
>>
>> Does next combination of @requires ever work? I thought such sequence means 'AND' operation on all such conditions.
>>
>>   * @requires vm.gc.Z
>>   * @requires vm.gc.Serial
>>   * @requires vm.gc == null
>>
>>
>> Thanks,
>> Vladimir
>>
>> On 7/7/20 5:38 PM, Igor Ignatyev wrote:
>>> http://cr.openjdk.java.net/~iignatyev/8249000/webrev.00/
>>>> 241 lines changed: 34 ins; 5 del; 202 mod;
>>> Hi all,
>>> could you please review the patch which modifies requires/VMProps to set vm.gc.X to false if Graal is selected and X GC isn't supported by Graal?
>>> the patch also replaces @requires similar to `vm.gc.X & !vm.graal.enabled` w/ `vm.gc.X` where it's applicable.
>>> from JBS:
>>>> not all GCs are supported by Graal JIT, which leads to failures like JDK-8247527 and boilerplate fixes like replacing all `@requires vm.gc.Z` w/ `@requires vm.gc.Z & !vm.graal.enabled`.
>>>>
>>>> as vm.gc.X means that X GC can be selected, it would be more natural, less surprising, and much more clear to have it true if the selected JIT supports the said X GC.
>>> webrev: http://cr.openjdk.java.net/~iignatyev/8249000/webrev.00/
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249000
>>> testing: test/hotspot/jtreg/{gc,compiler,runtime,serviceability} on {linux,windows,macos}-x64 w/ and w/o Graal as JIT
>>> Thanks,
>>> -- Igor
> 

From igor.ignatyev at oracle.com  Wed Jul  8 18:40:14 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 8 Jul 2020 11:40:14 -0700
Subject: RFR(S) [15] : 8249000 : vm.gc.X should take selected JIT into
 account
In-Reply-To: <2c92a9a5-77af-c100-fa9b-f765e9d23dce@oracle.com>
References: <FFC49DC0-AA15-45B5-B285-E043D1AEAF97@oracle.com>
 <6964ac32-e9ec-d700-0bdb-ea51f4610afe@oracle.com>
 <7A1992A7-1493-4DF0-B621-195CE986D34F@oracle.com>
 <2c92a9a5-77af-c100-fa9b-f765e9d23dce@oracle.com>
Message-ID: <BA1B1E12-19E3-45E3-8FDD-0307DDADDC3E@oracle.com>

Thanks Vladimir. 

for the record, I've updated ProblemList-graal.txt w/ the following:

> diff -r 14ffd658a23a test/hotspot/jtreg/ProblemList-graal.txt
> --- a/test/hotspot/jtreg/ProblemList-graal.txt  Wed Jul 08 11:35:30 2020 -0700
> +++ b/test/hotspot/jtreg/ProblemList-graal.txt  Wed Jul 08 11:37:44 2020 -0700
> @@ -229,6 +229,7 @@
>  compiler/loopopts/TestOverunrolling.java                8207267   generic-all
>  compiler/jsr292/NonInlinedCall/InvokeTest.java          8207267   generic-all
>  compiler/codegen/TestTrichotomyExpressions.java         8207267   generic-all
> +gc/stress/TestReclaimStringsLeaksMemory.java            8207267   generic-all
>  
>  runtime/exceptionMsgs/AbstractMethodError/AbstractMethodErrorTest.java        8222582 generic-all

-- Igor


> On Jul 8, 2020, at 11:34 AM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> Thank you, Igor
> 
> I got the difference between `vm.gc` and `vm.gc.X`.
> 
> In this case TestReclaimStringsLeaksMemory.java should be put into ProblemList-graal.txt with 8207267 to enable it with libgraal.  Current usage of !vm.graal.enabled in test is to skip this test with Java Graal because its effect on Java heap.
> 
> On 7/7/20 8:30 PM, Igor Ignatyev wrote:
>> Hi Vladimir,
>> thanks for your review!
>> `vm.gc` and `vm.gc.X`-s are different beasts (and admittedly, they confuse people a lot), `vm.gc` is set to "X", by jtreg itself, only if there is UseXGC in vm flags, otherwise it's "null". `vm.gc.X` are set by VMProps class, and you can have more than one vm.gc.X == true, as vm.gc.X means that X gc is supported by JVM and it can be selected; so if there are no Use.*GC in vm flags, vm.gc.X will yield true for all GCs which JVM was built with; if one of UseXGC is provided, only corresponding vm.gc.X is true, and all others are false. so to answer your questions, yes `vm.gc` can be "null" (if there are no Use.*GC) , and yes `vm.gc.Z & vm.gc.Serial & vm.gc == null` can be true (if there are no Use.*GC and JVM supports both Z and Serial GCs).
> 
> Interesting. I thought vmGC will list only one selected GC. That explains requires in TestZGCWithCDS.java.
> 
> You only need to add TestReclaimStringsLeaksMemory.java into ProblemList-graal.txt.
> 
> Thanks,
> Vladimir
> 
>> Thanks,
>> -- Igor
>>> On Jul 7, 2020, at 8:00 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>>> 
>>> Nice clean up, Igor
>>> 
>>> test/hotspot/jtreg/gc/stress/TestReclaimStringsLeaksMemory.java
>>> 
>>> Do we even can have vm.gc=="null" based on code in VMProps.java? At least some GC should be selected ergonomically even if non is specified on command line.
>>> 
>>> - * @requires vm.gc=="null" & !vm.graal.enabled & !vm.debug
>>> + * @requires vm.gc == "null"
>>> + * @requires !vm.debug
>>> 
>>> 
>>> test/hotspot/jtreg/runtime/cds/appcds/TestZGCWithCDS.java
>>> 
>>> Does next combination of @requires ever work? I thought such sequence means 'AND' operation on all such conditions.
>>> 
>>>  * @requires vm.gc.Z
>>>  * @requires vm.gc.Serial
>>>  * @requires vm.gc == null
>>> 
>>> 
>>> Thanks,
>>> Vladimir
>>> 
>>> On 7/7/20 5:38 PM, Igor Ignatyev wrote:
>>>> http://cr.openjdk.java.net/~iignatyev/8249000/webrev.00/
>>>>> 241 lines changed: 34 ins; 5 del; 202 mod;
>>>> Hi all,
>>>> could you please review the patch which modifies requires/VMProps to set vm.gc.X to false if Graal is selected and X GC isn't supported by Graal?
>>>> the patch also replaces @requires similar to `vm.gc.X & !vm.graal.enabled` w/ `vm.gc.X` where it's applicable.
>>>> from JBS:
>>>>> not all GCs are supported by Graal JIT, which leads to failures like JDK-8247527 and boilerplate fixes like replacing all `@requires vm.gc.Z` w/ `@requires vm.gc.Z & !vm.graal.enabled`.
>>>>> 
>>>>> as vm.gc.X means that X GC can be selected, it would be more natural, less surprising, and much more clear to have it true if the selected JIT supports the said X GC.
>>>> webrev: http://cr.openjdk.java.net/~iignatyev/8249000/webrev.00/
>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249000
>>>> testing: test/hotspot/jtreg/{gc,compiler,runtime,serviceability} on {linux,windows,macos}-x64 w/ and w/o Graal as JIT
>>>> Thanks,
>>>> -- Igor


From igor.ignatyev at oracle.com  Wed Jul  8 18:41:35 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 8 Jul 2020 11:41:35 -0700
Subject: RFR [15] : 8249018 : clean up FileInstaller $test.src $cwd in
 vmTestbase_vm_mlvm tests
In-Reply-To: <c672514f-d503-256c-8689-81dc121cc45b@oracle.com>
References: <A6F60E58-5F83-4A6F-B0BA-57278278C481@oracle.com>
 <c672514f-d503-256c-8689-81dc121cc45b@oracle.com>
Message-ID: <BF1683B6-B568-47B4-844E-BEE55E8CC310@oracle.com>

thanks Vladimir, pushed.

-- Igor

> On Jul 8, 2020, at 10:58 AM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> Good.
> 
> Thanks,
> Vladimir
> 
> On 7/7/20 11:56 PM, Igor Ignatyev wrote:
>> http://cr.openjdk.java.net/~iignatyev//8249018/webrev.00
>>> 116 lines changed: 0 ins; 64 del; 52 mod;
>> Hi all,
>> could you please review the patch which removes `FileInstaller . .` jtreg action from :vmTestbase_vm_mlvm tests?
>> from the main issue(8204985):
>>> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. some tests depend on this step, so we need 1st identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests.
>> testing: :vmTestbase_vm_mlvm on linux-x64
>> webrev: http://cr.openjdk.java.net/~iignatyev//8249018/webrev.00
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249018
>> Thanks,
>> -- Igor


From vladimir.kozlov at oracle.com  Wed Jul  8 20:14:01 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 8 Jul 2020 13:14:01 -0700
Subject: [16] RFR(S) 8248987: AOT's Linker.java seems to eagerly fail-fast on
 Windows
Message-ID: <acbe7044-8237-0ec2-fa7b-b43657f62594@oracle.com>

https://cr.openjdk.java.net/~kvn/8248987/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8248987

Treat all problems in getVC141AndNewerLinker() as non-fatal so that we try to find an older linker version if a newer one was not found. 
Print the exception message from getVC141AndNewerLinker() with the --verbose flag.
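
The intended control flow is roughly the following. This is a standalone sketch and not the code in Linker.java: the class, the method, and the two lookup callbacks are hypothetical placeholders for the newer-linker and older-linker probes.

```java
import java.util.concurrent.Callable;
import java.util.function.Supplier;

/** Illustrative fallback pattern; names are placeholders, not the AOT Linker API. */
final class LinkerLookup {
    /**
     * Tries the newer-linker probe first. Any exception from it is treated as
     * non-fatal: it is reported only in verbose mode and the older probe runs instead.
     */
    static String findLinker(Callable<String> newerProbe, Supplier<String> olderProbe, boolean verbose) {
        try {
            String path = newerProbe.call();
            if (path != null) {
                return path;
            }
        } catch (Exception e) {
            if (verbose) {
                System.err.println("Newer linker lookup failed: " + e.getMessage());
            }
        }
        return olderProbe.get();
    }
}
```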

Thanks,
Vladimir

From igor.veresov at oracle.com  Wed Jul  8 20:17:10 2020
From: igor.veresov at oracle.com (Igor Veresov)
Date: Wed, 8 Jul 2020 13:17:10 -0700
Subject: [15] RFR(S) 8248822: 8
 vm/classfmt/atr_ann/atr_rtm_annot007/atr_rtm_annot00709 tests fail w/ AOT
Message-ID: <21B89D82-3C1A-4E20-A405-9962F721F8D5@oracle.com>

The root cause of this is that Graal has intrinsics for jdk.internal.reflect.ConstantPool.{getIntAt0, getLongAt0, getFloatAt0, getDoubleAt0}() that don't check the range of the cp index or tag validity, whereas the original native implementations do. Since the utility of these intrinsics is of dubious value I'd like to remove them. The same change is going upstream as well.

Webrev: http://cr.openjdk.java.net/~iveresov/8248822/webrev.00/
JBS: https://bugs.openjdk.java.net/browse/JDK-8248822

Thanks,
igor


From igor.veresov at oracle.com  Wed Jul  8 20:35:37 2020
From: igor.veresov at oracle.com (Igor Veresov)
Date: Wed, 8 Jul 2020 13:35:37 -0700
Subject: [16] RFR(S) 8248987: AOT's Linker.java seems to eagerly fail-fast
 on Windows
In-Reply-To: <acbe7044-8237-0ec2-fa7b-b43657f62594@oracle.com>
References: <acbe7044-8237-0ec2-fa7b-b43657f62594@oracle.com>
Message-ID: <85A8134F-B6EF-436A-BC03-2F8CE1737460@oracle.com>

Looks good to me.

igor


> On Jul 8, 2020, at 1:14 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> https://cr.openjdk.java.net/~kvn/8248987/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-8248987
> 
> Treat all problems in getVC141AndNewerLinker() as non-fatal to try find old version linker if newer one was not found. Print exception message from getVC141AndNewerLinker() with --verbose flag.
> 
> Thanks,
> Vladimir


From Charlie.Gracie at microsoft.com  Wed Jul  8 20:41:31 2020
From: Charlie.Gracie at microsoft.com (Charlie Gracie)
Date: Wed, 8 Jul 2020 20:41:31 +0000
Subject: Stack allocation prototype for C2
Message-ID: <97F7697A-7A47-456D-832C-5BC8746880E0@microsoft.com>

Hi Sergey,

To get an idea of the objects which are being stack allocated you can use a fastdebug build and gather the output 
from -XX:+PrintStackAllocation. This static view can be combined with inspecting the source code to find patterns
where allocations can be stack allocated but fail to be scalar replaced. This information is not great to understand
which allocation sites are important since it just describes where heap allocations were replaced with stack allocations
but not the frequency that they are used at runtime.

The common patterns we have recognized are:
1.	Boxing objects, with caches, make up a significant portion of the wins we measured.
2.	Iterators and transient data created during collection iteration.
3.	Object chains of non-escaping objects. In these scenarios a lot of the time the root object gets scalar replaced (SCR)
	but the children objects do not. I think SCR might be able to be improved for some of these cases but I need to get
	more data to understand why it is failing.
4.	Backing arrays for data structures. A lot of data structures have a default initial array length. Since the array may
	grow it is not eligible for SCR but it may be eligible for stack allocation. This is a common subcase of #3 but I separated
	it out since the reason why SCR fails is due to merge points.
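
As a rough illustration of the code shapes behind patterns 1, 2 and 4, consider the Java sketch below. The methods are invented for this message and are not taken from the benchmarks; whether C2 scalar replaces or stack allocates any of these objects depends on inlining and on the prototype's heuristics, so treat the comments as the intent of each shape and not as a guarantee.

```java
import java.util.ArrayList;
import java.util.List;

/** Made-up examples of allocation shapes; not code from the measured benchmarks. */
final class AllocationPatterns {
    // Pattern 1: boxing through a cache. Integer.valueOf returns either a cached
    // instance or a fresh object, so the value reaching the use is a merge of two sources.
    static int sumBoxed(int[] values) {
        int sum = 0;
        for (int v : values) {
            Integer boxed = Integer.valueOf(v); // does not escape this method
            sum += boxed;
        }
        return sum;
    }

    // Pattern 2: the for-each loop creates a transient Iterator that is only
    // used for the duration of the loop.
    static int sumList(List<Integer> list) {
        int sum = 0;
        for (int v : list) {
            sum += v;
        }
        return sum;
    }

    // Pattern 4: a local collection whose backing array starts at a default
    // capacity and may grow, even though the list itself never escapes.
    static int countEven(int[] values) {
        List<Integer> evens = new ArrayList<>();
        for (int v : values) {
            if (v % 2 == 0) {
                evens.add(v);
            }
        }
        return evens.size();
    }
}
```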


To get a better understanding of the runtime wins we gathered JFR data with and without stack allocation enabled for some of the 
benchmarks showing large reductions in heap allocation. These workloads were all Scala based.

1.	In TMT, almost 100% of the reduction in heap allocations is due to stack allocation of java.lang.Double objects created
	via scala.runtime.BoxesRunTime.boxToDouble(double). The reduction is due to 2 different call stacks where this method
	was inlined. Here are the 2 callers that generate the allocations which get stack allocated.
		a. scala.runtime.ScalaRunTime$.array_apply(Object, int)
		b. edu.stanford.nlp.tmt.model.SoftAssignmentModel$$anonfun$summary$1$$anonfun$apply$5.apply(Object). 

2.	In ALS, almost 100% of the reduction in heap allocations is due to stack allocation of java.lang.Integer objects created via
	scala.runtime.BoxesRunTime.boxToInteger(int). The reduction is due to 1 call stack containing the following caller.
		a. scala.runtime.ScalaRunTime$.array_apply(Object, int). When this function is used for primitive arrays it looks
		like stack allocation can regularly see big wins with the right amount of inlining.

3.	In factorie, there are 5 object types that benefit from stack allocation to reduce overall heap allocations. Digging
	further into the call stacks for the 5 allocation sites, it appears that they are all related to iterating over data structures.
	Most of the objects are transient objects used for a single iteration and are not Boxed primitives. The object types are:
		a. scala.Some which is allocated as the result of scala.collection.mutable.HashMap.get(Object)
		b. scala.collection.immutable.ListBuffer which is allocated by scala.collection.immutable.List$.newBuilder()
		c. cc.factorie.generative.Proportions[] which is allocated by
		cc.factorie.generative.DiscreteMixtureVar$class.chosenParents(DiscreteMixtureVar)
		d. cc.factorie.package$$anon$1 which is allocated by cc.factorie.package$.singleFactorIterable(Factor)
		e. cc.factorie.Domain$$anonfun$get$1 which is allocated by cc.factorie.Domain$.get(Class)

I hope this is the type of information you were looking for. If you have any other questions or would like to see more/different data please let us know. I can always make log files available via our GitHub project or similar if that helps.

Charlie

On 2020-06-29, 11:34 PM, "hotspot-compiler-dev on behalf of Sergey Kuksenko" <hotspot-compiler-dev-retn at openjdk.java.net on behalf of sergey.kuksenko at oracle.com> wrote:

    I am just curious.
    
    For each benchmark you show allocation reduce size in general. Do you 
    have statistics which stack allocated objects gives major impact? And 
    which code patterns fail scalar replacement except well know Integer 
    cache flow merge?
    
    On 6/29/20 2:05 PM, Charlie Gracie wrote:
    > Hi hotspot-compiler-dev community,
    >
    > Here is the prototype code for our work on adding stack allocation to the HotSpot C2 compiler. We are looking for any and all feedback
    > as we hope to move from a prototype to something that could be contributed. A change of this size is difficult to review so we
    > understand the process will be thorough and will take time to complete. Any suggestions on how to allow for collaboration with others,
    > if they wanted to, would also be appreciated (i.e., a repo somewhere).
    >
    > For a quick refresher here is a link to Nikola's talk at FOSDEM:
    > https://fosdem.org/2020/schedule/event/reducing_gc_times/
    >
    > Here is a link to our initial webrev:
    > http://cr.openjdk.java.net/~adityam/charlie/stack_alloc/
    >
    > Expecting that a change like this will require a JEP, we have prepared a document describing our work based off of the JEP submission
    > form. Our document has a few extra sections at the end discussing areas that we are looking for guidance on and some initial
    > performance results. This document can be found here:
    > https://github.com/microsoft/openjdk-proposals/blob/master/stack_allocation/Stack_Allocation_JEP.md
    >
    > Thanks in advance for reviews, suggestions, concerns, comments and issues.
    > Charlie and Nikola
    >
    

From Nikola.Grcevski at microsoft.com  Wed Jul  8 21:18:06 2020
From: Nikola.Grcevski at microsoft.com (Nikola Grcevski)
Date: Wed, 8 Jul 2020 21:18:06 +0000
Subject: Stack allocation prototype for C2
In-Reply-To: <0f98b198-0769-08fc-f1ff-553eadcede22@redhat.com>
References: <C297D000-859F-4B13-8F77-06CF553A9A85@microsoft.com>
 <0f98b198-0769-08fc-f1ff-553eadcede22@redhat.com>
Message-ID: <DM6PR21MB1289905A09142F29773D1250F5670@DM6PR21MB1289.namprd21.prod.outlook.com>

Hi Andrew,

>Here's my concern.
>
>Java stacks are, in general, pretty small. This is good, and makes for
>economical memory usage. This is particularly useful for Project Loom,
>where there can be enormous numbers of "virtual" threads. These threads,
>while they are not active, are stored in the heap.
>
>As you might imagine, the idea of embedded objects (which, of course,
>cannot be collected) in these virtual threads does not delight me at all.
>Is this likely to be a real problem, do you think, or are all of the
>stack-allocated objects so small that I shouldn't be concerned?

Your concern about memory consumption increase is very valid, especially in the context of project Loom. 

We only stack allocate Java objects of size 256B or less and arrays of length less than 64 elements. There's also a C2 per-method limit on how many stack slots can be allocated. After the stack slot limit is reached, we stop stack allocating more objects. These checks limit the overall amount of stack space that will be consumed. We see stack allocation as an addition to scalar replacement. Currently, scalar replacement will increase the stack size, and we expect stack allocation to grow the stack by a similar but larger amount per object. Scalar replacement does not preserve the header words or unused fields, whereas stack-allocated objects do.
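
In pseudo-Java, the checks above amount to something like the sketch below. It is only a model for discussion: the real logic lives in C2 (C++), the class and method names are invented, and the per-method slot limit is left as a constructor parameter because no concrete value is implied here.

```java
/** Simplified model of the eligibility checks; not the prototype's actual code. */
final class StackAllocHeuristic {
    static final int MAX_OBJECT_BYTES = 256; // objects of 256 B or less
    static final int MAX_ARRAY_LENGTH = 64;  // arrays of fewer than 64 elements

    private final int slotLimit; // per-method budget; the value is an assumption of the caller
    private int slotsUsed;

    StackAllocHeuristic(int slotLimit) {
        this.slotLimit = slotLimit;
    }

    /** Returns true if an object of this size may be stack allocated in the current method. */
    boolean tryObject(int sizeBytes, int slots) {
        return sizeBytes <= MAX_OBJECT_BYTES && reserve(slots);
    }

    /** Returns true if an array of this length may be stack allocated in the current method. */
    boolean tryArray(int length, int slots) {
        return length < MAX_ARRAY_LENGTH && reserve(slots);
    }

    // Once the budget is exhausted, every further candidate stays on the heap.
    private boolean reserve(int slots) {
        if (slotsUsed + slots > slotLimit) {
            return false;
        }
        slotsUsed += slots;
        return true;
    }
}
```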

We have collected some static data to understand the amount of increase of the stack size, but perhaps we need to extend the measurement in scenarios that are closer to typical project Loom use cases. 

Out of all programs in the Renaissance benchmark suite, ALS is where we stack allocate the most. There are about 2,500 methods compiled with C2 in ALS and the average method stack size can be found in the table below:

No stack allocation, average per method stack size: 69.9 B
With stack allocation, average per method stack size:  72.2 B
Average stack allocated object size: 25.7 B
MAX stack allocated object size: 96 B

It comes to an increase of less than 2.5 bytes on average (about 3%) in the program where we've seen the most opportunities so far.

We observe similar numbers in the DaCapoScala benchmark suite in the benchmarks where we stack allocate a lot: TMT and FACTORIE.

FACTORIE (around 650 compiled methods)

No stack allocation, average per method stack size: 63.9 B
With stack allocation, average per method stack size: 66.6 B
Average stack allocated object size: 25.5 B
MAX stack allocated object size: 48 B

TMT (around 900 compiled methods)

No stack allocation, average per method stack size: 67.4 B
With stack allocation, average per method stack size: 70.1 B
Average stack allocated object size: 23.5 B
MAX stack allocated object size: 40 B
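
For reference, the per-method increase can be recomputed from the averages listed above with a few lines of Java. The class is purely a calculator written for this message; only the six input numbers come from our measurements.

```java
/** Recomputes the average stack-size increase from the figures quoted above. */
final class StackGrowth {
    /** Absolute increase in bytes. */
    static double deltaBytes(double without, double with) {
        return with - without;
    }

    /** Relative increase in percent of the size without stack allocation. */
    static double deltaPercent(double without, double with) {
        return 100.0 * (with - without) / without;
    }

    public static void main(String[] args) {
        String[] names = {"ALS", "FACTORIE", "TMT"};
        // {average without stack allocation, average with stack allocation}, in bytes
        double[][] sizes = {{69.9, 72.2}, {63.9, 66.6}, {67.4, 70.1}};
        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-8s +%.1f B (%.1f%%)%n", names[i],
                    deltaBytes(sizes[i][0], sizes[i][1]),
                    deltaPercent(sizes[i][0], sizes[i][1]));
        }
    }
}
```

This works out to roughly 2.3 B (3.3%) for ALS and 2.7 B (about 4%) for FACTORIE and TMT.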

If there is data from other workloads you would like to see, in particular when using Loom, please let us know. Also, if there are any other metrics you would like to see we can add those to our must gather list going forward.

If it turns out that the stack size increase is unacceptable, we can add further heuristics to do cost benefit analysis while deciding whether to stack allocate a given allocation candidate. For example, we might stack allocate only smaller objects, objects used in loops or only those in code with high frequency. 

Thanks for reviewing.
Nikola

-----Original Message-----
From: hotspot-compiler-dev <hotspot-compiler-dev-retn at openjdk.java.net> On Behalf Of Andrew Haley
Sent: July 2, 2020 4:16 AM
To: Charlie Gracie <Charlie.Gracie at microsoft.com>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: Stack allocation prototype for C2

On 29/06/2020 22:05, Charlie Gracie wrote:

> Here is the prototype code for our work on adding stack allocation to 
> the HotSpot C2 compiler. We are looking for any and all feedback as we 
> hope to move from a prototype to something that could be contributed.

We certainly need a repo where it can go. It could either be adopted by an existing project or it could have a project of its own. The latter is perhaps a bad idea because it would be too isolated.

> A change of this size is difficult to review so we understand the 
> process will be thorough and will take time to complete. Any 
> suggestions on how to allow for collaboration with others, if they 
> wanted to, would also be appreciated (i.e., a repo somewhere).

Here's my concern.

Java stacks are, in general, pretty small. This is good, and makes for economical memory usage. This is particularly useful for Project Loom, where there can be enormous numbers of "virtual" threads. These threads, while they are not active, are stored in the heap.

As you might imagine, the idea of embedded objects (which, of course, cannot be collected) in these virtual threads does not delight me at all. Is this likely to be a real problem, do you think, or are all of the stack-allocated objects so small that I shouldn't be concerned?

--
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From vladimir.kozlov at oracle.com  Wed Jul  8 22:16:08 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 8 Jul 2020 15:16:08 -0700
Subject: [15] RFR(S) 8248822: 8
 vm/classfmt/atr_ann/atr_rtm_annot007/atr_rtm_annot00709 tests fail w/ AOT
In-Reply-To: <21B89D82-3C1A-4E20-A405-9962F721F8D5@oracle.com>
References: <21B89D82-3C1A-4E20-A405-9962F721F8D5@oracle.com>
Message-ID: <f7d1195e-c639-a160-88b7-1808899092cc@oracle.com>

I see that Doug and Tom approved these changes. I am fine with the fix too.

Thanks,
Vladimir

On 7/8/20 1:17 PM, Igor Veresov wrote:
> The root cause of this is that Graal has intrinsics for jdk.internal.reflect.ConstantPool.{getIntAt0, getLongAt0, getFloatAt0, getDoubleAt0}() that don't check the range of the cp index or tag validity, whereas the original native implementations do. Since these intrinsics are of dubious value, I'd like to remove them. The same change is going upstream as well.
> 
> Webrev: http://cr.openjdk.java.net/~iveresov/8248822/webrev.00/
> JBS: https://bugs.openjdk.java.net/browse/JDK-8248822
> 
> Thanks,
> igor
> 
> 
> 

From vladimir.kozlov at oracle.com  Wed Jul  8 22:16:38 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 8 Jul 2020 15:16:38 -0700
Subject: [16] RFR(S) 8248987: AOT's Linker.java seems to eagerly fail-fast
 on Windows
In-Reply-To: <85A8134F-B6EF-436A-BC03-2F8CE1737460@oracle.com>
References: <acbe7044-8237-0ec2-fa7b-b43657f62594@oracle.com>
 <85A8134F-B6EF-436A-BC03-2F8CE1737460@oracle.com>
Message-ID: <25c25b01-6914-6854-d7c7-42683e1d5e92@oracle.com>

Thank you, Igor

Vladimir K

On 7/8/20 1:35 PM, Igor Veresov wrote:
> Looks good to me.
> 
> igor
> 
> 
> 
>> On Jul 8, 2020, at 1:14 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>>
>> https://cr.openjdk.java.net/~kvn/8248987/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8248987
>>
>> Treat all problems in getVC141AndNewerLinker() as non-fatal, so that we try to find an older linker version if a newer one was not found. Print the exception message from getVC141AndNewerLinker() when the --verbose flag is given.
>>
>> Thanks,
>> Vladimir
> 

From vladimir.kozlov at oracle.com  Wed Jul  8 22:36:20 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 8 Jul 2020 15:36:20 -0700
Subject: RFR(S) [15] : 8249000 : vm.gc.X should take selected JIT into
 account
In-Reply-To: <BA1B1E12-19E3-45E3-8FDD-0307DDADDC3E@oracle.com>
References: <FFC49DC0-AA15-45B5-B285-E043D1AEAF97@oracle.com>
 <6964ac32-e9ec-d700-0bdb-ea51f4610afe@oracle.com>
 <7A1992A7-1493-4DF0-B621-195CE986D34F@oracle.com>
 <2c92a9a5-77af-c100-fa9b-f765e9d23dce@oracle.com>
 <BA1B1E12-19E3-45E3-8FDD-0307DDADDC3E@oracle.com>
Message-ID: <dbf5a546-51e0-400f-94f7-5b528972f8fe@oracle.com>

Good.

Thanks,
Vladimir

On 7/8/20 11:40 AM, Igor Ignatyev wrote:
> Thanks Vladimir.
> 
> for the record, I've updated ProblemList-graal.txt w/ the following:
> 
>> diff -r 14ffd658a23a test/hotspot/jtreg/ProblemList-graal.txt
>> --- a/test/hotspot/jtreg/ProblemList-graal.txt  Wed Jul 08 11:35:30 2020 -0700
>> +++ b/test/hotspot/jtreg/ProblemList-graal.txt  Wed Jul 08 11:37:44 2020 -0700
>> @@ -229,6 +229,7 @@
>>   compiler/loopopts/TestOverunrolling.java                8207267   generic-all
>>   compiler/jsr292/NonInlinedCall/InvokeTest.java          8207267   generic-all
>>   compiler/codegen/TestTrichotomyExpressions.java         8207267   generic-all
>> +gc/stress/TestReclaimStringsLeaksMemory.java            8207267   generic-all
>>   
>>   runtime/exceptionMsgs/AbstractMethodError/AbstractMethodErrorTest.java        8222582 generic-all
> 
> -- Igor
> 
> 
>> On Jul 8, 2020, at 11:34 AM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>>
>> Thank you, Igor
>>
>> I got the difference between `vm.gc` and `vm.gc.X`.
>>
>> In this case TestReclaimStringsLeaksMemory.java should be put into ProblemList-graal.txt with 8207267 to enable it with libgraal. The current use of !vm.graal.enabled in the test is to skip it with Java Graal because of its effect on the Java heap.
>>
>> On 7/7/20 8:30 PM, Igor Ignatyev wrote:
>>> Hi Vladimir,
>>> thanks for your review!
>>> `vm.gc` and `vm.gc.X`-s are different beasts (and admittedly, they confuse people a lot). `vm.gc` is set to "X", by jtreg itself, only if there is UseXGC in the vm flags; otherwise it's "null". The `vm.gc.X` properties are set by the VMProps class, and you can have more than one vm.gc.X == true, as vm.gc.X means that the X GC is supported by the JVM and can be selected. So if there are no Use.*GC in the vm flags, vm.gc.X will yield true for all GCs which the JVM was built with; if one of UseXGC is provided, only the corresponding vm.gc.X is true, and all others are false. So, to answer your questions: yes, `vm.gc` can be "null" (if there are no Use.*GC), and yes, `vm.gc.Z & vm.gc.Serial & vm.gc == null` can be true (if there are no Use.*GC and the JVM supports both Z and Serial GCs).
>>
>> Interesting. I thought vmGC will list only one selected GC. That explains requires in TestZGCWithCDS.java.
>>
>> You only need to add TestReclaimStringsLeaksMemory.java into ProblemList-graal.txt.
>>
>> Thanks,
>> Vladimir
>>
>>> Thanks,
>>> -- Igor
>>>> On Jul 7, 2020, at 8:00 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>>>>
>>>> Nice clean up, Igor
>>>>
>>>> test/hotspot/jtreg/gc/stress/TestReclaimStringsLeaksMemory.java
>>>>
>>>> Can we even have vm.gc=="null" based on the code in VMProps.java? At least some GC should be selected ergonomically even if none is specified on the command line.
>>>>
>>>> - * @requires vm.gc=="null" & !vm.graal.enabled & !vm.debug
>>>> + * @requires vm.gc == "null"
>>>> + * @requires !vm.debug
>>>>
>>>>
>>>> test/hotspot/jtreg/runtime/cds/appcds/TestZGCWithCDS.java
>>>>
>>>> Does the following combination of @requires ever work? I thought such a sequence means an 'AND' operation on all the conditions.
>>>>
>>>>   * @requires vm.gc.Z
>>>>   * @requires vm.gc.Serial
>>>>   * @requires vm.gc == null
>>>>
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 7/7/20 5:38 PM, Igor Ignatyev wrote:
>>>>> http://cr.openjdk.java.net/~iignatyev/8249000/webrev.00/
>>>>>> 241 lines changed: 34 ins; 5 del; 202 mod;
>>>>> Hi all,
>>>>> could you please review the patch which modifies requires/VMProps to set vm.gc.X to false if Graal is selected and X GC isn't supported by Graal?
>>>>> the patch also replaces @requires similar to `vm.gc.X & !vm.graal.enabled` w/ `vm.gc.X` where it's applicable.
>>>>> from JBS:
>>>>>> not all GCs are supported by Graal JIT, which leads to failures like JDK-8247527 and boilerplate fixes like replacing all `@requires vm.gc.Z` w/ `@requires vm.gc.Z & !vm.graal.enabled`.
>>>>>>
>>>>>> as vm.gc.X means that X GC can be selected, it would be more natural, less surprising, and much more clear to have it true if the selected JIT supports the said X GC.
>>>>> webrev: http://cr.openjdk.java.net/~iignatyev/8249000/webrev.00/
>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249000
>>>>> testing: test/hotspot/jtreg/{gc,compiler,runtime,serviceability} on {linux,windows,macos}-x64 w/ and w/o Graal as JIT
>>>>> Thanks,
>>>>> -- Igor
> 
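
The difference between `vm.gc` and `vm.gc.X` explained in the quoted thread above can be modeled with a small Java sketch. This is a simplified model of the description only, not the jtreg or VMProps implementation: the class, its method names, and the flag-parsing regex are assumptions made for illustration, and the Graal-specific adjustment proposed in the patch is deliberately left out.

```java
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical model of the two properties; not the jtreg/VMProps code.
final class GcProps {
    // Assumed flag shape: -XX:+Use<X>GC, e.g. -XX:+UseZGC or -XX:+UseSerialGC.
    private static final Pattern USE_GC = Pattern.compile("-XX:\\+Use(\\w+)GC");

    /** Returns the GC named by a -XX:+Use<X>GC flag, or null if none is given. */
    static String selectedGc(List<String> vmFlags) {
        for (String flag : vmFlags) {
            Matcher m = USE_GC.matcher(flag);
            if (m.matches()) {
                return m.group(1);
            }
        }
        return null;
    }

    /** Models vm.gc: the explicitly selected GC, or the string "null". */
    static String vmGc(List<String> vmFlags) {
        String gc = selectedGc(vmFlags);
        return gc == null ? "null" : gc;
    }

    /** Models vm.gc.X: X is built in, and no other GC was explicitly selected. */
    static boolean vmGcX(String x, Set<String> builtIn, List<String> vmFlags) {
        String gc = selectedGc(vmFlags);
        return builtIn.contains(x) && (gc == null || gc.equals(x));
    }
}
```

With no Use.*GC flag and a JVM built with both Z and Serial, `vmGc` yields "null" while `vmGcX` is true for both collectors, which is why the three `@requires` lines in TestZGCWithCDS.java can hold together.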

From igor.ignatyev at oracle.com  Wed Jul  8 22:36:53 2020
From: igor.ignatyev at oracle.com (igor.ignatyev at oracle.com)
Date: Wed, 8 Jul 2020 15:36:53 -0700
Subject: [16] RFR(S) 8248987: AOT's Linker.java seems to eagerly fail-fast
 on Windows
In-Reply-To: <acbe7044-8237-0ec2-fa7b-b43657f62594@oracle.com>
References: <acbe7044-8237-0ec2-fa7b-b43657f62594@oracle.com>
Message-ID: <55500A48-5982-4AAD-90F6-FA941967B439@oracle.com>

LGTM

-- Igor

> On Jul 8, 2020, at 1:14 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> https://cr.openjdk.java.net/~kvn/8248987/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-8248987
> 
> Treat all problems in getVC141AndNewerLinker() as non-fatal, so that we try to find an older linker version if a newer one was not found. Print the exception message from getVC141AndNewerLinker() when the --verbose flag is given.
> 
> Thanks,
> Vladimir


From vladimir.kozlov at oracle.com  Wed Jul  8 22:39:42 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 8 Jul 2020 15:39:42 -0700
Subject: [16] RFR(S) 8248987: AOT's Linker.java seems to eagerly fail-fast
 on Windows
In-Reply-To: <55500A48-5982-4AAD-90F6-FA941967B439@oracle.com>
References: <acbe7044-8237-0ec2-fa7b-b43657f62594@oracle.com>
 <55500A48-5982-4AAD-90F6-FA941967B439@oracle.com>
Message-ID: <7c8110d2-87a3-cab8-2946-b83f86e83fe2@oracle.com>

Thank you, Igor

Vladimir K

On 7/8/20 3:36 PM, igor.ignatyev at oracle.com wrote:
> LGTM
> 
> -- Igor
> 
>> On Jul 8, 2020, at 1:14 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>>
>> https://cr.openjdk.java.net/~kvn/8248987/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8248987
>>
>> Treat all problems in getVC141AndNewerLinker() as non-fatal, so that we try to find an older linker version if a newer one was not found. Print the exception message from getVC141AndNewerLinker() when the --verbose flag is given.
>>
>> Thanks,
>> Vladimir
> 

From igor.veresov at oracle.com  Wed Jul  8 23:48:40 2020
From: igor.veresov at oracle.com (Igor Veresov)
Date: Wed, 8 Jul 2020 16:48:40 -0700
Subject: [15] RFR(S) 8248822: 8
 vm/classfmt/atr_ann/atr_rtm_annot007/atr_rtm_annot00709 tests fail w/ AOT
In-Reply-To: <f7d1195e-c639-a160-88b7-1808899092cc@oracle.com>
References: <21B89D82-3C1A-4E20-A405-9962F721F8D5@oracle.com>
 <f7d1195e-c639-a160-88b7-1808899092cc@oracle.com>
Message-ID: <2A8297CE-C29E-48F4-B5ED-D09C365F6EDD@oracle.com>

Thanks, Vladimir!

igor


> On Jul 8, 2020, at 3:16 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> I see that Doug and Tom approved these changes. I am fine with the fix too.
> 
> Thanks,
> Vladimir
> 
> On 7/8/20 1:17 PM, Igor Veresov wrote:
>> The root cause of this is that Graal has intrinsics for jdk.internal.reflect.ConstantPool.{getIntAt0, getLongAt0, getFloatAt0, getDoubleAt0}() that don't check the range of the cp index or tag validity, whereas the original native implementations do. Since these intrinsics are of dubious value, I'd like to remove them. The same change is going upstream as well.
>> Webrev: http://cr.openjdk.java.net/~iveresov/8248822/webrev.00/
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8248822
>> Thanks,
>> igor


From jamsheed.c.m at oracle.com  Thu Jul  9 07:31:11 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Thu, 9 Jul 2020 13:01:11 +0530
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
Message-ID: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>

Hi all,

JBS:https://bugs.openjdk.java.net/browse/JDK-8242895

Requesting review of changes made to the offset computation and field write 
detection for init-captured stores, needed because phis are added between alloc 
and init. This happens if the init node is in a different outer loop from the 
alloc node and there is a loop opt. This became necessary as a result of 
enhancement [1].

Normally an init is not associated with multiple alloc nodes during the EA 
phase, but the changes done for [1] cause code shapes of the form [2] 
to generate inits associated with multiple alloc nodes.

This has implications for the offset computation and field write detection 
related to initializing stores.

Attempt to fix in EA:

     webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA/

Alternate fix:

     Minimize the scenario in compiler-generated code by throwing only 
j.l.Error from the slow path (all exceptions, async/sync, are handled in the 
runtime exit).

     The stub epilog doesn't poll or throw any exceptions. Disable full 
loop opts before EA for detectable patterns and bail out of EA for 
late-detected patterns.

     webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_deopt/

Please advise.

Testing : mach tier1-5 (logs in jbs)

Best regards,

Jamsheed


[1] JDK-8231291 <https://bugs.openjdk.java.net/browse/JDK-8231291> C2: 
loop opts before EA should maximally unroll loops

[2] Code shapes that have the init node in a different outer loop from the alloc node:


loop begin
    try {
        return new obj() / throw new obj() / uncommon trap after allocation, in a loop
    } catch (ex) {
    }
loop end

    public static IntA test(int n) {
        for (int i = 0; i < 2; i++) {
            try {
                return new IntA(n + i);
            } catch (Exception e) {
            }
        }


From rwestrel at redhat.com  Thu Jul  9 08:32:00 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Thu, 09 Jul 2020 10:32:00 +0200
Subject: RFR(S): 8248598: [Graal] Several testcases from
 applications/jcstress/acqrel.java fails with forbidden state
Message-ID: <87v9ixnl6n.fsf@redhat.com>


http://cr.openjdk.java.net/~roland/8248598/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8248598

It's the upstream graal fix from
https://github.com/oracle/graal/pull/2651, unmodified.

I wanted to verify that the test case, once part of the jdk source tree,
does fail without the fix and runs fine with it, but couldn't figure out
how. What are the steps for that?

Roland.


From aph at redhat.com  Thu Jul  9 09:56:47 2020
From: aph at redhat.com (Andrew Haley)
Date: Thu, 9 Jul 2020 10:56:47 +0100
Subject: [aarch64-port-dev ] RFR(S/M): 8247766: [aarch64] guarantee(val <
 (1U << nbits)) failed: Field too big for insn
In-Reply-To: <0cdbdf26-ad4d-056b-a801-cc31b2cc4ab3@oracle.com>
References: <0cdbdf26-ad4d-056b-a801-cc31b2cc4ab3@oracle.com>
Message-ID: <e8b410c8-9d9d-ca6b-4cc8-d1c5a6cad0b0@redhat.com>

On 07/07/2020 12:17, Patric Hedlin wrote:
> I would like to ask for help to review the following change/update:
> 
> Issue:  https://bugs.openjdk.java.net/browse/JDK-8247766
> Webrev: http://cr.openjdk.java.net/~phedlin/tr8247766/

Can we have a reproducer for this please? The test is in the open/ directory
but I can't find it.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Thu Jul  9 10:05:03 2020
From: aph at redhat.com (Andrew Haley)
Date: Thu, 9 Jul 2020 11:05:03 +0100
Subject: [aarch64-port-dev ] RFR(S/M): 8247766: [aarch64] guarantee(val <
 (1U << nbits)) failed: Field too big for insn
In-Reply-To: <e8b410c8-9d9d-ca6b-4cc8-d1c5a6cad0b0@redhat.com>
References: <0cdbdf26-ad4d-056b-a801-cc31b2cc4ab3@oracle.com>
 <e8b410c8-9d9d-ca6b-4cc8-d1c5a6cad0b0@redhat.com>
Message-ID: <51ef5108-69dc-573e-ea6f-ddc05e00ab04@redhat.com>

On 09/07/2020 10:56, Andrew Haley wrote:
> On 07/07/2020 12:17, Patric Hedlin wrote:
>> I would like to ask for help to review the following change/update:
>>
>> Issue:  https://bugs.openjdk.java.net/browse/JDK-8247766
>> Webrev: http://cr.openjdk.java.net/~phedlin/tr8247766/
> 
> Can we have a reproducer for this please? The test is in the open/ directory
> but I can't find it.

And jtreg_test_jdk_java_lang_invoke_BigArityTest_java passes for me. If you
are running with some "interesting" settings, please tell me.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From rwestrel at redhat.com  Thu Jul  9 11:43:55 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Thu, 09 Jul 2020 13:43:55 +0200
Subject: RFR(M): 8229495: SIGILL in C2 generated OSR compilation
In-Reply-To: <84b2c86d-c7e6-7945-dae5-db1d8efe6f25@oracle.com>
References: <3b720427-d718-5d1c-dbe9-6149a21883af@oracle.com>
 <87r1topriw.fsf@redhat.com> <84b2c86d-c7e6-7945-dae5-db1d8efe6f25@oracle.com>
Message-ID: <87sge0oqv8.fsf@redhat.com>


Hi Christian,

new webrev:
http://cr.openjdk.java.net/~roland/8229495/webrev.01/

> I submitted some testing.

Thanks.

> 1824         // Add back the predicate for the value at the beginning of 
> the first entry
> 1825         prev_proj = clone_skeleton_predicate(iff, init, max_value, 
> entry, proj, ctrl, outer_loop, prev_proj);
>
> This comment seems to be outdated as you now clone both skeleton 
> predicates with the same function call in different loop iterations.

I tweaked the comment.

>
> - In loopopts.cpp: While fixing the spacing you could also add curly 
> braces to the one-liner if statements like
>
> 955   if (n_op == Op_MergeMem) return n;

Ok.

> While at it, you might want to consider to update other uses of the 
> pattern Opcode() == Op_Opaque1 by is_Opaque1() as well like in 
> loopTransform.cpp:
>
> 1158     assert(iff->in(1)->in(1)->Opcode() == Op_Opaque1, "unexpected 
> predicate shape");

Except in this case it really is an Opaque1 instead of a subclass so
using is_Opaque1() would weaken the assert.
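
The distinction is the usual one between an exact-type test and an is-a test. A minimal Java analogue, with hypothetical stand-in classes rather than the real C2 node hierarchy, might look like this:

```java
// Hypothetical stand-ins; these are not the C2 node classes.
final class OpaqueCheckDemo {
    static class Opaque1 {}
    static class Opaque1Subclass extends Opaque1 {}

    /** Analogue of `Opcode() == Op_Opaque1`: true only for the exact class. */
    static boolean isExactlyOpaque1(Object n) {
        return n != null && n.getClass() == Opaque1.class;
    }

    /** Analogue of `is_Opaque1()`: also true for any subclass. */
    static boolean isOpaque1(Object n) {
        return n instanceof Opaque1;
    }
}
```

An assert written with the second form would also accept a subclass instance, so it would no longer catch an unexpected predicate shape built from one.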

> I observed a Java Fuzzer crash ("fatal error: DEBUG MESSAGE: duplicated 
> predicate failed which is impossible") this weekend which looked very 
> similar to this bug and indeed it could be fixed with your patch. You 
> could add it as additional testcase. Here is the simplified code and the 
> command line I used to reproduce it.

Thanks for the test case. I included it in the new webrev.

Roland.


From christian.hagedorn at oracle.com  Thu Jul  9 12:16:12 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Thu, 9 Jul 2020 14:16:12 +0200
Subject: RFR(M): 8229495: SIGILL in C2 generated OSR compilation
In-Reply-To: <87sge0oqv8.fsf@redhat.com>
References: <3b720427-d718-5d1c-dbe9-6149a21883af@oracle.com>
 <87r1topriw.fsf@redhat.com> <84b2c86d-c7e6-7945-dae5-db1d8efe6f25@oracle.com>
 <87sge0oqv8.fsf@redhat.com>
Message-ID: <c4a2ce8a-2084-e4c1-d1a6-ab057eabe448@oracle.com>

Hi Roland

On 09.07.20 13:43, Roland Westrelin wrote:
> new webrev:
> http://cr.openjdk.java.net/~roland/8229495/webrev.01/

That looks good to me!

>> I submitted some testing.
> 
> Thanks.

Extended testing (up to tier7) completed successfully.

>> While at it, you might want to consider to update other uses of the
>> pattern Opcode() == Op_Opaque1 by is_Opaque1() as well like in
>> loopTransform.cpp:
>>
>> 1158     assert(iff->in(1)->in(1)->Opcode() == Op_Opaque1, "unexpected
>> predicate shape");
> 
> Except in this case it really is an Opaque1 instead of a subclass so
> using is_Opaque1() would weaken the assert.

You're right, I have not thought about that - then better leave it as it is.

>> I observed a Java Fuzzer crash ("fatal error: DEBUG MESSAGE: duplicated
>> predicate failed which is impossible") this weekend which looked very
>> similar to this bug and indeed it could be fixed with your patch. You
>> could add it as additional testcase. Here is the simplified code and the
>> command line I used to reproduce it.
> 
> Thanks for the test case. I included it in the new webrev.

Great, thanks for adding it.

Best regards,
Christian

From patric.hedlin at oracle.com  Thu Jul  9 12:44:41 2020
From: patric.hedlin at oracle.com (Patric Hedlin)
Date: Thu, 9 Jul 2020 14:44:41 +0200
Subject: [aarch64-port-dev ] RFR(S/M): 8247766: [aarch64] guarantee(val <
 (1U << nbits)) failed: Field too big for insn
In-Reply-To: <51ef5108-69dc-573e-ea6f-ddc05e00ab04@redhat.com>
References: <0cdbdf26-ad4d-056b-a801-cc31b2cc4ab3@oracle.com>
 <e8b410c8-9d9d-ca6b-4cc8-d1c5a6cad0b0@redhat.com>
 <51ef5108-69dc-573e-ea6f-ddc05e00ab04@redhat.com>
Message-ID: <92889b14-2e5f-d0de-c6d2-016468619368@oracle.com>

I have updated the comment (in the report) on BigArityTest with the 
command to reproduce the failure.

/Patric

On 2020-07-09 12:05, Andrew Haley wrote:
> On 09/07/2020 10:56, Andrew Haley wrote:
>> On 07/07/2020 12:17, Patric Hedlin wrote:
>>> I would like to ask for help to review the following change/update:
>>>
>>> Issue:  https://bugs.openjdk.java.net/browse/JDK-8247766
>>> Webrev: http://cr.openjdk.java.net/~phedlin/tr8247766/
>> Can we have a reproducer for this please? The test is in the open/ directory
>> but I can't find it.
> And jtreg_test_jdk_java_lang_invoke_BigArityTest_java passes for me. If you
> are running with some "interesting" settings, please tell me.
>


From jamsheed.c.m at oracle.com  Thu Jul  9 14:06:38 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Thu, 9 Jul 2020 19:36:38 +0530
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
In-Reply-To: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
References: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
Message-ID: <8325fac5-6258-9b88-6507-5dcb0597cc17@oracle.com>

Hi,

Please hold the review. I need to change the code that deals with 
unsafe access, as the current capture code spends more execution time 
analyzing things.

Best regards,

Jamsheed

On 09/07/2020 13:01, Jamsheed C M wrote:
>
> Hi all,
>
> JBS:https://bugs.openjdk.java.net/browse/JDK-8242895
>
> Requesting review of changes made to the offset computation and field write 
> detection for init-captured stores, needed because phis are added between alloc 
> and init. This happens if the init node is in a different outer loop from the 
> alloc node and there is a loop opt. This became necessary as a result of 
> enhancement [1].
>
> Normally an init is not associated with multiple alloc nodes during the EA 
> phase, but the changes done for [1] cause code shapes of the form [2] 
> to generate inits associated with multiple alloc nodes.
>
> This has implications for the offset computation and field write detection 
> related to initializing stores.
>
> Attempt to fix in EA:
>
>      webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA/
>
> Alternate fix:
>
>      Minimize the scenario in compiler-generated code by throwing only 
> j.l.Error from the slow path (all exceptions, async/sync, are handled in the 
> runtime exit).
>
>      The stub epilog doesn't poll or throw any exceptions. Disable full 
> loop opts before EA for detectable patterns and bail out of EA for 
> late-detected patterns.
>
>      webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_deopt/
>
> Please advise.
>
> Testing : mach tier1-5 (logs in jbs)
>
> Best regards,
>
> Jamsheed
>
>
> [1] JDK-8231291 <https://bugs.openjdk.java.net/browse/JDK-8231291> C2: 
> loop opts before EA should maximally unroll loops
>
> [2] Code shapes that have the init node in a different outer loop from the alloc node:
>
>
> loop begin
>     try {
>         return new obj() / throw new obj() / uncommon trap after allocation, in a loop
>     } catch (ex) {
>     }
> loop end
>
>     public static IntA test(int n) {
>         for (int i = 0; i < 2; i++) {
>             try {
>                 return new IntA(n + i);
>             } catch (Exception e) {
>             }
>         }
>

From aph at redhat.com  Thu Jul  9 14:26:36 2020
From: aph at redhat.com (Andrew Haley)
Date: Thu, 9 Jul 2020 15:26:36 +0100
Subject: [aarch64-port-dev ] RFR(S/M): 8247766: [aarch64] guarantee(val <
 (1U << nbits)) failed: Field too big for insn
In-Reply-To: <0cdbdf26-ad4d-056b-a801-cc31b2cc4ab3@oracle.com>
References: <0cdbdf26-ad4d-056b-a801-cc31b2cc4ab3@oracle.com>
Message-ID: <a5005a89-43b0-f248-6edd-6f242c037ff7@redhat.com>

On 07/07/2020 12:17, Patric Hedlin wrote:
> C1 code generation for reading and writing stack-slots does not handle 
> large immediate offsets on aarch64. This patch will ensure that 
> immediate offsets are admissible for base+(immediate)offset encoding or, 
> if this is not the case, will enforce an explicit address calculation to 
> a scratch register. (Also correcting a small glitch in 9-bit signed 
> immediate encoding check.)
> 
> NOTE: Current patch includes (local) definitions of is_simm/9 and 
> is_uimm/12, for review purpose only. With JDK-8248901 these will move to 
> Assembler, and will not be included in the change-set.

Umm, OK. These functions seem too complicated: all you have to do is

    int64_t chk = val >> (nbits - 1);
    guarantee (chk == -1 || chk == 0, "Field too big for insn");

but the AArch64 part of it looks fine.
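
To see why the two-line check above is sufficient, here is the same test transcribed into Java as a small sketch (the class and method names are made up for illustration). An arithmetic right shift by nbits - 1 discards the low bits of the field and leaves only the sign bit and everything above it, so the result is 0 for non-negative values that fit and -1 for negative values that fit.

```java
// Illustrative transcription of the check; names are hypothetical.
final class SignedFieldCheck {
    /** Returns true if val fits in an nbits-wide two's-complement field (1 <= nbits <= 64). */
    static boolean fitsSigned(long val, int nbits) {
        // Arithmetic shift: keeps the sign, drops the nbits - 1 low bits.
        long chk = val >> (nbits - 1);
        return chk == -1 || chk == 0;
    }
}
```

For a 9-bit field, 255 and -256 pass (the shift gives 0 and -1), while 256 and -257 fail (the shift gives 1 and -2).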

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From Charlie.Gracie at microsoft.com  Thu Jul  9 15:15:15 2020
From: Charlie.Gracie at microsoft.com (Charlie Gracie)
Date: Thu, 9 Jul 2020 15:15:15 +0000
Subject: Stack allocation prototype for C2
Message-ID: <CA0499A7-3EBB-41D9-904E-3322A209D485@microsoft.com>

Hi Dalibor,

Thanks for pointing us at the Sandbox Repo! It looks like a great place to host and collaborate on large changes.
If the community decides we should move forward with this investigation, I believe the Sandbox repo would be
a good fit.

Thanks,
Charlie Gracie

On 2020-07-02, 5:02 AM, "hotspot-compiler-dev on behalf of Dalibor Topic" <hotspot-compiler-dev-retn at openjdk.java.net on behalf of dalibor.topic at oracle.com> wrote:

    
    On 29.06.2020 23:05, Charlie Gracie wrote:
    > Hi hotspot-compiler-dev community,
    > 
    > Here is the prototype code for our work on adding stack allocation to the HotSpot C2 compiler. We are looking for any and all feedback
    > as we hope to move from a prototype to something that could be contributed. A change of this size is difficult to review so we
    > understand the process will be thorough and will take time to complete. Any suggestions on how to allow for collaboration with others,
    > if they wanted to, would also be appreciated (i.e., a repo somewhere).
    
    Hi Charlie,
    
    You may want to take a look at
    
    https://cr.openjdk.java.net/~chegar/docs/sandbox.html
    
    "The primary purpose of the JDK Sandbox Development Repository is to 
    facilitate OpenJDK developers that are working on non-trivial changes, 
    possibly JEP-scale effort, whose scope and duration make it necessary to 
    collaborate with others in an open shared version control system, rather 
    than just using privately shared patches. "
    
    cheers,
    dalibor topic
    
    -- 
    <http://www.oracle.com/> Dalibor Topic
    Consulting Product Manager
    Phone: +494089091214 <tel:+494089091214>, Mobile: +491737185961
    <tel:+491737185961>, Video: dalibor.topic at oracle.com
    <sip:dalibor.topic at oracle.com>
    
    Oracle Global Services Germany GmbH
    Hauptverwaltung: Riesstr. 25, D-80992 München
    Registergericht: Amtsgericht München, HRB 246209
    Geschäftsführer: Ralf Herrmann
    
    
From aph at redhat.com  Thu Jul  9 15:48:59 2020
From: aph at redhat.com (Andrew Haley)
Date: Thu, 9 Jul 2020 16:48:59 +0100
Subject: [aarch64-port-dev ] RFR(S/M): 8247766: [aarch64] guarantee(val <
 (1U << nbits)) failed: Field too big for insn
In-Reply-To: <0cdbdf26-ad4d-056b-a801-cc31b2cc4ab3@oracle.com>
References: <0cdbdf26-ad4d-056b-a801-cc31b2cc4ab3@oracle.com>
Message-ID: <2809ab8c-4a2e-c0c3-9b93-a0f5df41b992@redhat.com>

On 07/07/2020 12:17, Patric Hedlin wrote:
> Dear all,
> 
> I would like to ask for help to review the following change/update:
> 
> Issue:  https://bugs.openjdk.java.net/browse/JDK-8247766
> Webrev: http://cr.openjdk.java.net/~phedlin/tr8247766/
> 
> 
> C1 code generation for reading and writing stack-slots does not handle 
> large immediate offsets on aarch64. This patch will ensure that 
> immediate offsets are admissible for base+(immediate)offset encoding or, 
> if this is not the case, will enforce an explicit address calculation to 
> a scratch register. (Also correcting a small glitch in 9-bit signed 
> immediate encoding check.)

This is all very complicated.

So it seems to me that there is a better way to do this. We already have
MacroAssembler::legitimize_address(), and you should use that.

Like so:

diff -r 7c59af4db158 src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.cpp
--- a/src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.cpp       Thu Jul 09 11:01:29 2020 -0400
+++ b/src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.cpp       Thu Jul 09 11:36:02 2020 -0400
@@ -736,25 +736,32 @@

 void LIR_Assembler::reg2stack(LIR_Opr src, LIR_Opr dest, BasicType type, bool pop_fpu_stack) {
   if (src->is_single_cpu()) {
+    int index = dest->single_stack_ix();
     if (is_reference_type(type)) {
-      __ str(src->as_register(), frame_map()->address_for_slot(dest->single_stack_ix()));
+      __ str(src->as_register(),
+             __ legitimize_address(frame_map()->address_for_slot(index), BytesPerWord, rscratch1));
       __ verify_oop(src->as_register());
     } else if (type == T_METADATA || type == T_DOUBLE || type == T_ADDRESS) {
-      __ str(src->as_register(), frame_map()->address_for_slot(dest->single_stack_ix()));
+      __ str(src->as_register(),
+             __ legitimize_address(frame_map()->address_for_slot(index), BytesPerWord, rscratch1));
     } else {
-      __ strw(src->as_register(), frame_map()->address_for_slot(dest->single_stack_ix()));
+      __ strw(src->as_register(),
+              __ legitimize_address(frame_map()->address_for_slot(index), BytesPerInt, rscratch1));
     }

   } else if (src->is_double_cpu()) {
     Address dest_addr_LO = frame_map()->address_for_slot(dest->double_stack_ix(), lo_word_offset_in_bytes);
+    dest_addr_LO = __ legitimize_address(dest_addr_LO, BytesPerLong, rscratch1);
     __ str(src->as_register_lo(), dest_addr_LO);

   } else if (src->is_single_fpu()) {
     Address dest_addr = frame_map()->address_for_slot(dest->single_stack_ix());
+    dest_addr = __ legitimize_address(dest_addr, BytesPerInt, rscratch1);
     __ strs(src->as_float_reg(), dest_addr);

   } else if (src->is_double_fpu()) {
     Address dest_addr = frame_map()->address_for_slot(dest->double_stack_ix());
+    dest_addr = __ legitimize_address(dest_addr, BytesPerLong, rscratch1);
     __ strd(src->as_double_reg(), dest_addr);

   } else {

stack_offset_in_reach() seems to duplicate the functionality of offset_ok_for_immed(),
and it's only used in this one place. By all means please use the new is_uimm() and
is_simm() in offset_ok_for_immed().
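
For reference, a minimal sketch of what such a check might look like when
written in terms of is_simm() and is_uimm(). This is not the HotSpot source:
the helper signatures are assumptions, and the field widths (9-bit signed
unscaled, 12-bit unsigned scaled by the access size) are taken from the
AArch64 load/store immediate encodings. The real offset_ok_for_immed() may
differ in detail.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: does x fit in a signed immediate of `bits` bits?
inline bool is_simm(int64_t x, unsigned bits) {
  assert(bits >= 1 && bits < 64);
  const int64_t limit = int64_t(1) << (bits - 1);
  return x >= -limit && x < limit;
}

// Hypothetical helper: does x fit in an unsigned immediate of `bits` bits?
inline bool is_uimm(int64_t x, unsigned bits) {
  assert(bits < 63);
  return x >= 0 && x < (int64_t(1) << bits);
}

// shift is log2 of the access size in bytes (3 for a 64-bit str/ldr).
inline bool offset_ok_for_immed(int64_t offset, unsigned shift) {
  const int64_t mask = (int64_t(1) << shift) - 1;
  if (offset < 0 || (offset & mask) != 0) {
    // Negative or unaligned: only the unscaled form with a signed 9-bit field.
    return is_simm(offset, 9);
  }
  // Aligned and non-negative: scaled form with an unsigned 12-bit field.
  return is_uimm(offset >> shift, 12);
}
```

With a check of this shape, a stack slot at offset 32760 is reachable by a
64-bit str directly, while 32768 is not and would need the address to be
legitimized through a scratch register first.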

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From boris.ulasevich at bell-sw.com  Thu Jul  9 16:20:20 2020
From: boris.ulasevich at bell-sw.com (Boris Ulasevich)
Date: Thu, 9 Jul 2020 19:20:20 +0300
Subject: [aarch64-port-dev ] RFR 8248870: AARCH64: I2L conversions can be
 skipped for small positive masked values
In-Reply-To: <9ccf64f1-7a88-0f67-8b50-4dea09af9c8b@redhat.com>
References: <d83ab998-ad1e-4889-b637-dc7f0369c50e@bell-sw.com>
 <9ccf64f1-7a88-0f67-8b50-4dea09af9c8b@redhat.com>
Message-ID: <e54d3d17-c865-a422-e9eb-05b3936e81da@bell-sw.com>

Hi Andrew,

Ok, let us proceed after 8248414.

Meanwhile, I moved the change out of the DO NOT EDIT section, thanks:
http://cr.openjdk.java.net/~bulasevich/8248870/webrev.01

regards,
Boris

On 08.07.2020 12:46, Andrew Haley wrote:
> On 07/07/2020 16:47, Boris Ulasevich wrote:
>> Please review the change to skip i2l conversion after the mask:
>>
>> http://cr.openjdk.java.net/~bulasevich/8248870/webrev.00
>> http://bugs.openjdk.java.net/browse/JDK-8248870
> You seem to have inserted this between the DO NOT EDIT THIS SECTION
> markers.
>
> Please hold off this change until I've committed the patch for
> 8248414.
>


From igor.ignatyev at oracle.com  Thu Jul  9 16:25:35 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Thu, 9 Jul 2020 09:25:35 -0700
Subject: RFR(S): 8248598: [Graal] Several testcases from
 applications/jcstress/acqrel.java fails with forbidden state
In-Reply-To: <87v9ixnl6n.fsf@redhat.com>
References: <87v9ixnl6n.fsf@redhat.com>
Message-ID: <87BD32EE-9FAE-4AA7-9861-583B499E39BF@oracle.com>

Hi Roland,

applications/jcstress tests are just jtreg wrappers around jcstress tests[1], so you can just run them as you would normally run a jcstress test:
$ java -jar jcstress.jar --jvmArgs -XX:+UnlockExperimentalVMOptions --jvmArgs -XX:+EnableJVMCI <rest of the flags prefixed w/ --jvmArgs> -v -t org.openjdk.jcstress.tests.acqrel

or, if you prefer to run the tests through the wrapper, you can pass the path to jcstress.jar via -Djdk.test.lib.artifacts.jcstress-tests-all:
$ make test TEST=test/hotspot/jtreg/applications/jcstress/acqrel.java JTREG=JAVA_OPTIONS="-Djdk.test.lib.artifacts.jcstress-tests-all=$path_to_jcstress.jar -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI <rest of the flags>"

if you need information on how to get jcstress.jar, please refer to the jcstress wiki[1] (or ping Aleksey, he might have a place where he publishes jcstress-tests-all)

[1] https://wiki.openjdk.java.net/display/CodeTools/jcstress

-- Igor

> On Jul 9, 2020, at 1:32 AM, Roland Westrelin <rwestrel at redhat.com> wrote:
> 
> 
> http://cr.openjdk.java.net/~roland/8248598/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-8248598
> 
> It's the upstream graal fix from
> https://github.com/oracle/graal/pull/2651 unmodified.
> 
> I wanted to verify that the test case once part of the jdk source tree
> does fail without the fix and runs fine with it but couldn't figure out
> how. What are the steps for that?
> 
> Roland.
> 


From rwestrel at redhat.com  Thu Jul  9 18:03:05 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Thu, 09 Jul 2020 20:03:05 +0200
Subject: RFR(S): 8248598: [Graal] Several testcases from
 applications/jcstress/acqrel.java fails with forbidden state
In-Reply-To: <87BD32EE-9FAE-4AA7-9861-583B499E39BF@oracle.com>
References: <87v9ixnl6n.fsf@redhat.com>
 <87BD32EE-9FAE-4AA7-9861-583B499E39BF@oracle.com>
Message-ID: <87pn94o9ba.fsf@redhat.com>


Hi Igor,

Thanks for helping but my question was not about jcstress but about the
graal regression test:

http://cr.openjdk.java.net/~roland/8248598/webrev.00/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.core.test/src/org/graalvm/compiler/core/test/VolatileAccessReadEliminationTest.java.html

I can run it fine in the graal repo with mx but I have no idea how to
run it once it's pulled into the jdk repo.

Roland.


From igor.ignatyev at oracle.com  Thu Jul  9 18:07:40 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Thu, 9 Jul 2020 11:07:40 -0700
Subject: RFR(S): 8248598: [Graal] Several testcases from
 applications/jcstress/acqrel.java fails with forbidden state
In-Reply-To: <87pn94o9ba.fsf@redhat.com>
References: <87v9ixnl6n.fsf@redhat.com>
 <87BD32EE-9FAE-4AA7-9861-583B499E39BF@oracle.com> <87pn94o9ba.fsf@redhat.com>
Message-ID: <ECD61EA3-E44F-4F0F-9A91-8B1999714106@oracle.com>

Oh, I see. I guess the easiest way would be to use the jtreg wrappers (test/hotspot/jtreg/compiler/graalunit). There is a README.md which explains where you can get the dependencies and where you need to put them to make it work. After you finish that, you can run the test via the run-test framework as `make test TEST=test/hotspot/jtreg/compiler/graalunit/CoreTest.java`.

HTH
-- Igor

> On Jul 9, 2020, at 11:03 AM, Roland Westrelin <rwestrel at redhat.com> wrote:
> 
> 
> Hi Igor,
> 
> Thanks for helping but my question was not about jcstress but about the
> graal regression test:
> 
> http://cr.openjdk.java.net/~roland/8248598/webrev.00/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.core.test/src/org/graalvm/compiler/core/test/VolatileAccessReadEliminationTest.java.html
> 
> I can run it fine in the graal repo with mx but I have no idea how to
> run it once it's pulled into the jdk repo.
> 
> Roland.
> 


From Charlie.Gracie at microsoft.com  Thu Jul  9 19:28:01 2020
From: Charlie.Gracie at microsoft.com (Charlie Gracie)
Date: Thu, 9 Jul 2020 19:28:01 +0000
Subject: Stack allocation prototype for C2
Message-ID: <4C6D4959-00E1-4300-BE30-BB6FC60A491F@microsoft.com>

Hi Vladimir,

Thanks for reviewing the document and providing your feedback.

> From the design overview and the implementation, I'm concerned about 
> far-reaching consequences of the chosen approach. It's not limited just 
> to existing set of JVM features, but as Andrew noted will affect the 
> design of forthcoming functionality as well.
>
> I think it's worth starting a broad discussion (HotSpot-wide) and deciding 
> how much JVM design complexity budget it is worth spending on such an 
> optimization.

This is a great suggestion. Where and how should we start this discussion
to get feedback from the broader community?

> As we discussed off-line (right after FOSDEM), I do see the benefits of 
> in-memory representation for non-escaping objects: memory aliasing 
> (either indeterminate base or indexed access) imposes inherent 
> constraints on the escape analysis (both partial and conservative 
> approaches suffer from it). Nevertheless, some of the problematic cases 
> can be addressed by improving existing approach or introducing a more 
> powerful analysis: covering more cases and making the analysis 
> control-sensitive should improve the situation.

We would like to work to improve escape analysis as per your suggestions above.
If we can achieve the same allocation reductions with this solution, it would be a
better long-term solution. We would like to continue reviewing stack allocation
and start a sandbox project as Dalibor suggested, but work on improving escape
analysis and measure against the sandbox for a baseline.   

> Also, the alternative approach (called zone-based heap allocation) looks 
> very attractive to me. I haven't thought it through, but it looks like 
> keeping the objects on the Java heap can save us a lot of complexity on 
> the implementation side (more memory available for allocation - not 
> necessarily fixed amount, no need to migrate objects from stack to heap, 
> GC barriers are unaffected, etc.). For example, reserving a dedicated 
> TLAB (or a stack of TLABs?) and do nmethod-scoped allocations from C2 
> code looks attractive. It can simplify many aspects of the 
> implementation: much more space available, free migration of 
> non-escaping objects to heap on deoptimization.

We have been thinking about this idea since FOSDEM and we completely agree
with the pros of zone-based allocation. The biggest benefits are the removal of
the restrictions in compressed oops mode and that barriers would not have to be
modified. 

For this approach were you envisioning that objects allocated in a stack zone are
pinned until the method returns? Also, while that zone memory is pinned the GC
would not reclaim memory in that zone? That is what we were thinking, but we
are worried about the complexity of the changes and restrictions it might add to
the GC implementations. 

Another thought is about the added cost to method enter / exit. With the current
on stack approach there are no added instructions for entering / exiting a method
since the stack size is just larger. For the zone-based approach we would need to
have a few more instructions on enter and exit to get the space from the zone TLAB
and to return it. If the current zone TLAB is full we would need to do more work to
get another one. Hopefully the common case of satisfying the space requirements
from the current zone TLAB would on average be the same or less than the current
TLAB checks for fast path allocations.
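
To make the enter / exit cost concrete, here is a rough sketch of what such
a fast path might look like. The Zone struct and the zone_enter / zone_exit
names are made up for illustration and are not part of the prototype. A real
implementation would also need the slow path that fetches another zone TLAB.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical per-thread zone: a bump-pointer region, much like a TLAB.
struct Zone {
  char* top;  // next free byte
  char* end;  // one past the last usable byte
};

// Method entry: reserve the space this method needs from the zone.
// Returns nullptr when the zone is full and a slow path would have to run.
inline char* zone_enter(Zone& z, std::size_t bytes) {
  if (static_cast<std::size_t>(z.end - z.top) < bytes) {
    return nullptr;
  }
  char* base = z.top;
  z.top += bytes;
  return base;
}

// Method exit: release everything reserved since the matching zone_enter.
inline void zone_exit(Zone& z, char* base) {
  assert(base <= z.top);
  z.top = base;
}
```

In this shape the common case is one compare and one add on entry and one
store on exit, which is roughly the work of a TLAB fast-path allocation.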

A final consideration is the footprint cost for Project Loom. In the zone-based approach,
would each virtual thread (fibre) have its own zone TLAB (or stack of TLABs)? If each
virtual thread had a zone TLAB it may lead to more frequent GCs because a significant
portion of the heap is reserved for zone-based allocations.

We do not see any of these as showstoppers, but we want to be sure we have the full picture.

> Another idea:
> 
> "When dealing with stack allocated objects in loops we need a lifetime 
> overlap check."
>
> It doesn't look specific to stack-allocated objects. Non-overlapping 
> live ranges can be coalesced the same way for on-heap freshly allocated 
> objects. It should get comparable reduction in allocation pressure 
> (single allocation per loop vs allocation per iteration) and doesn't 
> require stack allocation support at all (as an example [1]).
>
> If such improvements are enabled for non-escaping on-heap objects, how 
> much benefit will stack allocation bring on top of that? IMO the 
> performance gap should become much narrower.

We agree, it's one of the first things we wanted to try after we submitted the initial stack
allocation code for review. Again, our approach would be to have the current stack allocation
prototype as a baseline and work to see if we can shrink the gap with other approaches.
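
As an illustration of the coalescing idea only, here is a C++ stand-in with
hypothetical names. C2 would perform this on its IR for freshly allocated,
non-escaping objects rather than in source code, so the sketch shows the
effect (one allocation per loop instead of one per iteration), not the
mechanism.

```cpp
#include <cassert>

struct Point {
  int x;
  int y;
};

static int allocations = 0;  // counts heap allocations for the comparison

static Point* new_point(int x, int y) {
  ++allocations;
  return new Point{x, y};
}

// One allocation per iteration: each Point is dead before the next is created.
int sum_alloc_per_iteration(int n) {
  assert(n >= 0);
  int sum = 0;
  for (int i = 0; i < n; ++i) {
    Point* p = new_point(i, i + 1);
    sum += p->x + p->y;
    delete p;
  }
  return sum;
}

// The live ranges never overlap, so a single object can be reused.
int sum_coalesced(int n) {
  assert(n >= 0);
  if (n == 0) return 0;
  Point* p = new_point(0, 0);
  int sum = 0;
  for (int i = 0; i < n; ++i) {
    p->x = i;
    p->y = i + 1;
    sum += p->x + p->y;
  }
  delete p;
  return sum;
}
```

Both functions return the same sum; only the value of the allocations
counter differs (n versus 1).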

Thanks again for providing valuable feedback and insight
Charlie and Nikola


From luhenry at microsoft.com  Thu Jul  9 20:31:11 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Thu, 9 Jul 2020 20:31:11 +0000
Subject: RFR(S): 8248676: AArch64: Add workaround for LITable constructor
Message-ID: <MWHPR21MB05119533601E0AF1DE3CB0E9B0640@MWHPR21MB0511.namprd21.prod.outlook.com>

Hello,

JBS: https://bugs.openjdk.java.net/browse/JDK-8248676
Webrev: http://cr.openjdk.java.net/~burban/luhenry/8248676/webrev.00/
Testing: jtreg:test/hotspot/jtreg:tier1, jtreg:test/jdk:tier1, jtreg:test/jdk:tier2, jtreg:test/langtools on Linux-AArch64, no regressions.

This small fix is in the context of the larger support for Windows-AArch64. The attribute `__attribute__ ((constructor))` is not supported by MSVC, and the documented workaround is to allocate an empty static struct with a constructor. This patch only applies this workaround when compiling on Windows, and leaves other platforms unchanged.
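
A minimal sketch of the pattern, with hypothetical names (init_table,
TableInitializer, run_init_table) that are not taken from the webrev:

```cpp
#include <cassert>

static int init_calls = 0;

// Hypothetical stand-in for the table initialization done at startup.
static void init_table() {
  assert(init_calls == 0);  // expected to run exactly once
  ++init_calls;
}

#if defined(_MSC_VER)
// MSVC: no __attribute__((constructor)), so use a static object whose
// constructor runs during dynamic initialization, before main().
namespace {
struct TableInitializer {
  TableInitializer() { init_table(); }
};
TableInitializer table_initializer;
}  // namespace
#else
// GCC / Clang: the attribute makes the function run before main().
static void __attribute__((constructor)) run_init_table() { init_table(); }
#endif
```

Either way init_table() has already run once by the time main() starts.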

I am using Bernhard Urban's CR as I am currently not an author.

Thank you,

--
Ludovic

From igor.ignatyev at oracle.com  Thu Jul  9 20:34:00 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Thu, 9 Jul 2020 13:34:00 -0700
Subject: RFR [15] : 8249019 : clean up FileInstaller $test.src $cwd in
 vmTestbase_vm_compiler tests
Message-ID: <50F2024A-BF63-4298-AB44-137179383723@oracle.com>

http://cr.openjdk.java.net/~iignatyev//8249019/webrev.00
> 269 lines changed: 0 ins; 163 del; 106 mod

Hi all,

could you please review the patch which removes `FileInstaller . .` jtreg action from vmTestbase_vm_compiler tests?
from the main issue(8204985):
> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. some tests depend on this step, so we need 1st identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests.

some of vmTestbase_vm_compiler tests depend on FileInstaller, so they are left intact and will be updated separately. 

testing: :vmTestbase_vm_compiler on linux-x64
JBS: https://bugs.openjdk.java.net/browse/JDK-8249019
webrev: http://cr.openjdk.java.net/~iignatyev//8249019/webrev.00

Thanks,
-- Igor

From ekaterina.pavlova at oracle.com  Thu Jul  9 20:44:39 2020
From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova)
Date: Thu, 9 Jul 2020 13:44:39 -0700
Subject: RFR [15] : 8249019 : clean up FileInstaller $test.src $cwd in
 vmTestbase_vm_compiler tests
In-Reply-To: <50F2024A-BF63-4298-AB44-137179383723@oracle.com>
References: <50F2024A-BF63-4298-AB44-137179383723@oracle.com>
Message-ID: <a6084c8e-b5f7-6301-1d6d-1fedd4f11f33@oracle.com>

Looks good,

-katya


On 7/9/20 1:34 PM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev//8249019/webrev.00
>> 269 lines changed: 0 ins; 163 del; 106 mod
> 
> Hi all,
> 
> could you please review the patch which removes `FileInstaller . .` jtreg action from vmTestbase_vm_compiler tests?
> from the main issue(8204985):
>> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. some tests depend on this step, so we need 1st identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests.
> 
> some of vmTestbase_vm_compiler tests depend on FileInstaller, so they are left intact and will be updated separately.
> 
> testing: :vmTestbase_vm_compiler on linux-x64
> JBS: https://bugs.openjdk.java.net/browse/JDK-8249019
> webrev: http://cr.openjdk.java.net/~iignatyev//8249019/webrev.00
> 
> Thanks,
> -- Igor
> 


From beurba at microsoft.com  Thu Jul  9 21:08:48 2020
From: beurba at microsoft.com (Bernhard Urban-Forster)
Date: Thu, 9 Jul 2020 21:08:48 +0000
Subject: RFR(XS) 8248671: AArch64: Remove unused variables
Message-ID: <DM6PR21MB14529CCD3BDC0E3AA1AA38A1C2640@DM6PR21MB1452.namprd21.prod.outlook.com>

Hello everyone,


please review this change:

JBS: https://bugs.openjdk.java.net/browse/JDK-8248671
Webrev: http://cr.openjdk.java.net/~burban/8248671_unused-vars/

We found this issue while bringing up Windows+AArch64 support for HotSpot. The Microsoft toolchain (MSVC) seems to be slightly more pedantic than GCC.


Thanks,
-Bernhard

From dean.long at oracle.com  Fri Jul 10 01:48:46 2020
From: dean.long at oracle.com (Dean Long)
Date: Thu, 9 Jul 2020 18:48:46 -0700
Subject: RFR(S): 8248598: [Graal] Several testcases from
 applications/jcstress/acqrel.java fails with forbidden state
In-Reply-To: <87pn94o9ba.fsf@redhat.com>
References: <87v9ixnl6n.fsf@redhat.com>
 <87BD32EE-9FAE-4AA7-9861-583B499E39BF@oracle.com> <87pn94o9ba.fsf@redhat.com>
Message-ID: <3db3371d-ed71-2bad-6c67-9fb6906d719f@oracle.com>

I confirmed that VolatileAccessReadEliminationTest fails without the 
patch and passes with it.

dl

On 7/9/20 11:03 AM, Roland Westrelin wrote:
> Hi Igor,
>
> Thanks for helping but my question was not about jcstress but about the
> graal regression test:
>
> http://cr.openjdk.java.net/~roland/8248598/webrev.00/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.core.test/src/org/graalvm/compiler/core/test/VolatileAccessReadEliminationTest.java.html
>
> I can run it fine in the graal repo with mx but I have no idea how to
> run it once it's pulled into the jdk repo.
>
> Roland.
>


From goetz.lindenmaier at sap.com  Fri Jul 10 06:41:19 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Fri, 10 Jul 2020 06:41:19 +0000
Subject: Question regarding 8248521: TestVerifyIterativeGVN.java is failing
 with timeout
Message-ID: <AM4PR0202MB29646E30B5F48F0E73272F99EC650@AM4PR0202MB2964.eurprd02.prod.outlook.com>

Hi Fairoz,

we also see this test timing out on mac, but only with jdk11u.
Do you mind sharing how you fixed this? Did you just increase
the timeout, or did you figure out why this fails on mac in 11u?

Thanks,
  Goetz

From christian.hagedorn at oracle.com  Fri Jul 10 07:37:42 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Fri, 10 Jul 2020 09:37:42 +0200
Subject: [16] RFR(S): 8248552: C2 crashes with SIGFPE due to division by zero
Message-ID: <70e8e42b-5cb3-9c1e-419e-2f771f042368@oracle.com>

Hi

Please review the following patch:
https://bugs.openjdk.java.net/browse/JDK-8248552
http://cr.openjdk.java.net/~chagedorn/8248552/webrev.00/

In the failing testcase, C2 removes a zero check for a division/modulo 
node n based on the type information of the loop induction variable phi 
p (always between 1 and 50 and never 0). However, n is later split 
through p and ends up after the AddNode which updates the induction 
variable p. In the last iteration j equals 2 and is then updated to 0. 
The division/modulo node n is now executed before the loop limit check 
which results in a SIGFPE.
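
The hazard can be modelled outside the compiler. The sketch below is a C++
stand-in, not the failing Java test: the function names are hypothetical and
the loop bounds and step are assumptions chosen to match the description
above (j starts at 50, the last iteration sees 2 and updates it to 0).

```cpp
#include <cassert>

// Source order: the modulo runs at the top of the body, where j is in
// [2, 50], so dropping the zero check is safe.
int modulo_before_update(int x) {
  int sum = 0;
  for (int j = 50; j > 0; j -= 2) {
    assert(j != 0);  // what the type of the phi guarantees at this point
    sum += x % j;
  }
  return sum;
}

// Order after the split: the modulo is evaluated on the updated j, before
// the loop exit test. Rather than trapping, this model records whether a
// zero divisor is reached.
bool modulo_after_update_sees_zero() {
  bool zero_divisor = false;
  int j = 50;
  do {
    j -= 2;                 // induction variable update
    if (j == 0) {
      zero_divisor = true;  // x % j would raise SIGFPE here
    }
  } while (j > 0);          // the exit test comes too late
  return zero_divisor;
}
```

The first function never sees a zero divisor, while the second order does on
the final trip, which is why the zero check cannot stay removed once the node
has moved past the update.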

The fix bails out of PhaseIdealLoop::split_thru_phi if a division or 
modulo node has its zero check removed (i.e. control is NULL) and is 
split through a phi which has an input that could be zero. This should 
only happen for an induction variable phi of a trip-counted (integer) loop.

Best regards,
Christian

From rwestrel at redhat.com  Fri Jul 10 08:01:08 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Fri, 10 Jul 2020 10:01:08 +0200
Subject: RFR(S): 8248598: [Graal] Several testcases from
 applications/jcstress/acqrel.java fails with forbidden state
In-Reply-To: <ECD61EA3-E44F-4F0F-9A91-8B1999714106@oracle.com>
References: <87v9ixnl6n.fsf@redhat.com>
 <87BD32EE-9FAE-4AA7-9861-583B499E39BF@oracle.com> <87pn94o9ba.fsf@redhat.com>
 <ECD61EA3-E44F-4F0F-9A91-8B1999714106@oracle.com>
Message-ID: <87mu47ol2z.fsf@redhat.com>


> oh, I see. I guess the easiest way would be to use jtreg wrappers
> (test/hotspot/jtreg/compiler/graalunit), there is README.md which
> explains where you can get dependencies and where you need to put them
> to make it work, after you finish that, you can run the test by
> run-test framework as `make test
> TEST=test/hotspot/jtreg/compiler/graalunit/CoreTest.java`.

I gave it a try. I downloaded the dependencies with downloadLibs.sh, but
then running the test fails. See the output below.

Would that command line run all the core tests? Is there a way to run
only one?

Roland.

[roland at ws jdk-jdk]$ make CONF=linux-x86_64-server-release run-test TEST="compiler/graalunit/CoreTest.java" TEST_VM_OPTS="-server -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI"
Building target 'run-test' in configuration 'linux-x86_64-server-release'
*** failed to import extension defpath from ~/code-tools/defpath/defpath.py: [Errno 2] No such file or directory: '/home/roland/code-tools/defpath/defpath.py'
*** failed to import extension jcheck from ~/code-tools/jcheck/jcheck.py: [Errno 2] No such file or directory: '/home/roland/code-tools/jcheck/jcheck.py'
Running tests using TEST_OPTS control variable 'VM_OPTIONS=-server -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI'
Test selection 'compiler/graalunit/CoreTest.java', will run:
* jtreg:test/hotspot/jtreg/compiler/graalunit/CoreTest.java

Running test 'jtreg:test/hotspot/jtreg/compiler/graalunit/CoreTest.java'
--------------------------------------------------
TEST: compiler/graalunit/CoreTest.java
TEST JDK: /home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk

ACTION: build -- Passed. Build successful
REASON: User specified action: run build compiler.graalunit.common.GraalUnitTestLauncher 
TIME:   1.427 seconds
messages:
command: build compiler.graalunit.common.GraalUnitTestLauncher
reason: User specified action: run build compiler.graalunit.common.GraalUnitTestLauncher 
Library /:
  compile: compiler.graalunit.common.GraalUnitTestLauncher
elapsed time (seconds): 1.427

ACTION: compile -- Passed. Compilation successful
REASON: .class file out of date or does not exist
TIME:   1.423 seconds
messages:
command: compile /home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit/common/GraalUnitTestLauncher.java
reason: .class file out of date or does not exist
Additional options from @modules: --add-modules jdk.internal.vm.compiler
Mode: agentvm
Agent id: 1
elapsed time (seconds): 1.423
configuration:
Boot Layer (javac runtime environment)
  class path: /home/roland/tools/jtreg/build/images/jtreg/lib/javatest.jar 
              /home/roland/tools/jtreg/build/images/jtreg/lib/jtreg.jar 
  patch:      java.base /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/patches/java.base

javac compilation environment
  add modules: jdk.internal.vm.compiler
  source path: /home/roland/jdk-jdk/test/lib
               /home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit
               /home/roland/jdk-jdk/test/hotspot/jtreg
  class path:  /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib
               /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit
               /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0

rerun:
cd /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/scratch/0 && \
HOME=/home/roland \
JDK8_HOME=/home/roland/jdk-14.0.1 \
LANG=en_US.UTF-8 \
LC_ALL=C \
PATH=/bin:/usr/bin:/usr/sbin \
TEST_IMAGE_DIR=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test \
TEST_IMAGE_GRAAL_DIR=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/graal \
XMODIFIERS=@im=ibus \
    /home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk/bin/javac \
        -J-XX:MaxRAMPercentage=3 \
        -J-Djava.io.tmpdir=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/tmp \
        -J-server \
        -J-XX:+UnlockExperimentalVMOptions \
        -J-XX:+EnableJVMCI \
        -J-Djava.library.path=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/native \
        -J-Dtest.vm.opts='-XX:MaxRAMPercentage=3 -Djava.io.tmpdir=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/tmp -server -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI' \
        -J-Dtest.tool.vm.opts='-J-XX:MaxRAMPercentage=3 -J-Djava.io.tmpdir=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/tmp -J-server -J-XX:+UnlockExperimentalVMOptions -J-XX:+EnableJVMCI' \
        -J-Dtest.compiler.opts= \
        -J-Dtest.java.opts= \
        -J-Dtest.jdk=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk \
        -J-Dcompile.jdk=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk \
        -J-Dtest.timeout.factor=4.0 \
        -J-Dtest.nativepath=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/native \
        -J-Dtest.root=/home/roland/jdk-jdk/test/hotspot/jtreg \
        -J-Dtest.name=compiler/graalunit/CoreTest.java \
        -J-Dtest.file=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit/CoreTest.java \
        -J-Dtest.src=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit \
        -J-Dtest.src.path=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/test/hotspot/jtreg \
        -J-Dtest.classes=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit/CoreTest.d \
        -J-Dtest.class.path=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit/CoreTest.d:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0 \
        -J-Dtest.class.path.prefix=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0 \
        -J-Dtest.modules=jdk.internal.vm.compiler \
        --add-modules jdk.internal.vm.compiler \
        -d /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0 \
        -sourcepath /home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/test/hotspot/jtreg \
        -classpath /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0 /home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit/common/GraalUnitTestLauncher.java
direct:
Note: /home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit/common/GraalUnitTestLauncher.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

ACTION: build -- Passed. Build successful
REASON: Named class compiled on demand
TIME:   0.061 seconds
messages:
command: build jdk.test.lib.FileInstaller
reason: Named class compiled on demand
Library /test/lib:
  compile: jdk.test.lib.FileInstaller
elapsed time (seconds): 0.061

ACTION: compile -- Passed. Compilation successful
REASON: .class file out of date or does not exist
TIME:   0.061 seconds
messages:
command: compile /home/roland/jdk-jdk/test/lib/jdk/test/lib/FileInstaller.java
reason: .class file out of date or does not exist
Additional options from @modules: --add-modules jdk.internal.vm.compiler
Mode: agentvm
Agent id: 1
elapsed time (seconds): 0.061
configuration:
Boot Layer (javac runtime environment)
  class path: /home/roland/tools/jtreg/build/images/jtreg/lib/javatest.jar 
              /home/roland/tools/jtreg/build/images/jtreg/lib/jtreg.jar 
  patch:      java.base /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/patches/java.base

javac compilation environment
  add modules: jdk.internal.vm.compiler
  source path: /home/roland/jdk-jdk/test/lib
               /home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit
               /home/roland/jdk-jdk/test/hotspot/jtreg
  class path:  /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib
               /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit
               /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0

rerun:
cd /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/scratch/0 && \
HOME=/home/roland \
JDK8_HOME=/home/roland/jdk-14.0.1 \
LANG=en_US.UTF-8 \
LC_ALL=C \
PATH=/bin:/usr/bin:/usr/sbin \
TEST_IMAGE_DIR=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test \
TEST_IMAGE_GRAAL_DIR=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/graal \
XMODIFIERS=@im=ibus \
    /home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk/bin/javac \
        -J-XX:MaxRAMPercentage=3 \
        -J-Djava.io.tmpdir=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/tmp \
        -J-server \
        -J-XX:+UnlockExperimentalVMOptions \
        -J-XX:+EnableJVMCI \
        -J-Djava.library.path=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/native \
        -J-Dtest.vm.opts='-XX:MaxRAMPercentage=3 -Djava.io.tmpdir=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/tmp -server -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI' \
        -J-Dtest.tool.vm.opts='-J-XX:MaxRAMPercentage=3 -J-Djava.io.tmpdir=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/tmp -J-server -J-XX:+UnlockExperimentalVMOptions -J-XX:+EnableJVMCI' \
        -J-Dtest.compiler.opts= \
        -J-Dtest.java.opts= \
        -J-Dtest.jdk=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk \
        -J-Dcompile.jdk=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk \
        -J-Dtest.timeout.factor=4.0 \
        -J-Dtest.nativepath=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/native \
        -J-Dtest.root=/home/roland/jdk-jdk/test/hotspot/jtreg \
        -J-Dtest.name=compiler/graalunit/CoreTest.java \
        -J-Dtest.file=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit/CoreTest.java \
        -J-Dtest.src=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit \
        -J-Dtest.src.path=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/test/hotspot/jtreg \
        -J-Dtest.classes=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit/CoreTest.d \
        -J-Dtest.class.path=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit/CoreTest.d:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0 \
        -J-Dtest.class.path.prefix=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0 \
        -J-Dtest.modules=jdk.internal.vm.compiler \
        --add-modules jdk.internal.vm.compiler \
        -d /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib \
        -sourcepath /home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/test/hotspot/jtreg \
        -classpath /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0 /home/roland/jdk-jdk/test/lib/jdk/test/lib/FileInstaller.java

ACTION: driver -- Passed. Execution successful
REASON: User specified action: run driver jdk.test.lib.FileInstaller ../../ProblemList-graal.txt ExcludeList.txt 
TIME:   0.258 seconds
messages:
command: driver jdk.test.lib.FileInstaller ../../ProblemList-graal.txt ExcludeList.txt
reason: User specified action: run driver jdk.test.lib.FileInstaller ../../ProblemList-graal.txt ExcludeList.txt 
Mode: agentvm
Agent id: 2
elapsed time (seconds): 0.258
configuration:
Boot Layer
  class path: /home/roland/tools/jtreg/build/images/jtreg/lib/javatest.jar 
              /home/roland/tools/jtreg/build/images/jtreg/lib/jtreg.jar 
              /home/roland/tools/jtreg/build/images/jtreg/lib/junit.jar 
              /home/roland/tools/jtreg/build/images/jtreg/lib/testng.jar 
              /home/roland/tools/jtreg/build/images/jtreg/lib/jcommander.jar 
  patch:      java.base /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/patches/java.base

Test Layer
  class path: /home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit
              /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib
              /home/roland/jdk-jdk/test/lib
              /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit
              /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0
              /home/roland/jdk-jdk/test/hotspot/jtreg

rerun:
cd /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/scratch/0 && \
HOME=/home/roland \
JDK8_HOME=/home/roland/jdk-14.0.1 \
LANG=en_US.UTF-8 \
LC_ALL=C \
PATH=/bin:/usr/bin:/usr/sbin \
TEST_IMAGE_DIR=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test \
TEST_IMAGE_GRAAL_DIR=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/graal \
XMODIFIERS=@im=ibus \
    /home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk/bin/java \
        -Dtest.vm.opts='-XX:MaxRAMPercentage=3 -Djava.io.tmpdir=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/tmp -server -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI' \
        -Dtest.tool.vm.opts='-J-XX:MaxRAMPercentage=3 -J-Djava.io.tmpdir=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/tmp -J-server -J-XX:+UnlockExperimentalVMOptions -J-XX:+EnableJVMCI' \
        -Dtest.compiler.opts= \
        -Dtest.java.opts= \
        -Dtest.jdk=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk \
        -Dcompile.jdk=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk \
        -Dtest.timeout.factor=4.0 \
        -Dtest.nativepath=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/native \
        -Dtest.root=/home/roland/jdk-jdk/test/hotspot/jtreg \
        -Dtest.name=compiler/graalunit/CoreTest.java \
        -Dtest.file=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit/CoreTest.java \
        -Dtest.src=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit \
        -Dtest.src.path=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/test/hotspot/jtreg \
        -Dtest.classes=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit/CoreTest.d \
        -Dtest.class.path=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit/CoreTest.d:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0 \
        -Dtest.class.path.prefix=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0 \
        -Dtest.modules=jdk.internal.vm.compiler \
        -classpath /home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0:/home/roland/jdk-jdk/test/hotspot/jtreg:/home/roland/tools/jtreg/build/images/jtreg/lib/javatest.jar:/home/roland/tools/jtreg/build/images/jtreg/lib/jtreg.jar \
        jdk.test.lib.FileInstaller ../../ProblemList-graal.txt ExcludeList.txt
STDOUT:
copying /home/roland/jdk-jdk/test/hotspot/jtreg/ProblemList-graal.txt to /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/scratch/0/ExcludeList.txt
STDERR:

JavaTest Message: Test complete.


ACTION: build -- Passed. All files up to date
REASON: Named class compiled on demand
TIME:   0.0 seconds
messages:
command: build compiler.graalunit.common.GraalUnitTestLauncher
reason: Named class compiled on demand
elapsed time (seconds): 0.0

ACTION: main -- Failed. Execution failed: `main' threw exception: java.lang.Exception: Failed to find tests, VM crashed with exit code 1
REASON: User specified action: run main/othervm -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI compiler.graalunit.common.GraalUnitTestLauncher -prefix org.graalvm.compiler.core.test -exclude ExcludeList.txt -vmargs --add-opens=java.base/java.lang=ALL-UNNAMED 
TIME:   0.166 seconds
messages:
command: main -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI compiler.graalunit.common.GraalUnitTestLauncher -prefix org.graalvm.compiler.core.test -exclude ExcludeList.txt -vmargs --add-opens=java.base/java.lang=ALL-UNNAMED
reason: User specified action: run main/othervm -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI compiler.graalunit.common.GraalUnitTestLauncher -prefix org.graalvm.compiler.core.test -exclude ExcludeList.txt -vmargs --add-opens=java.base/java.lang=ALL-UNNAMED 
Mode: othervm [/othervm specified]
Additional options from @modules: --add-modules jdk.internal.vm.compiler
elapsed time (seconds): 0.166
configuration:
Boot Layer
  add modules: jdk.internal.vm.compiler

STDOUT:
INFO: graal libs dir is '/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/graal'
INFO: use following pattern to find tests: org\.graalvm\.compiler\.core\.test.*
Command line: [/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk/bin/java -cp /home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0:/home/roland/jdk-jdk/test/hotspot/jtreg:/home/roland/tools/jtreg/build/images/jtreg/lib/javatest.jar:/home/roland/tools/jtreg/build/images/jtreg/lib/jtreg.jar -cp /home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0:/home/roland/jdk-jdk/test/hotspot/jtreg:/home/roland/tools/jtreg/build/images/jtreg/lib/javatest.jar:/home/roland/tools/jtreg/build/images/jtreg/lib/jtreg.jar:/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/graal/com.oracle.mxtool.junit.jar com.oracle.mxtool.junit.FindClassesByAnnotatedMethods /home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/graal/jdk.vm.compiler.tests.jar @Test ]
INFO: run command /home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk/bin/java -cp /home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0:/home/roland/jdk-jdk/test/hotspot/jtreg:/home/roland/tools/jtreg/build/images/jtreg/lib/javatest.jar:/home/roland/tools/jtreg/build/images/jtreg/lib/jtreg.jar -cp /home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0:/home/roland/jdk-jdk/test/hotspot/jtreg:/home/roland/tools/jtreg/build/images/jtreg/lib/javatest.jar:/home/roland/tools/jtreg/build/images/jtreg/lib/jtreg.jar:/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/graal/com.oracle.mxtool.junit.jar com.oracle.mxtool.junit.FindClassesByAnnotatedMethods /home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/graal/jdk.vm.compiler.tests.jar @Test
[2020-07-10T07:58:37.884779794Z] Gathering output for process 2096875
[2020-07-10T07:58:37.901243107Z] Waiting for completion for process 2096875
[2020-07-10T07:58:37.931042481Z] Waiting for completion finished for process 2096875
STDERR:
java.lang.Exception: Failed to find tests, VM crashed with exit code 1
	at compiler.graalunit.common.GraalUnitTestLauncher.getListOfTestsByPrefix(GraalUnitTestLauncher.java:125)
	at compiler.graalunit.common.GraalUnitTestLauncher.main(GraalUnitTestLauncher.java:223)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
	at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
	at java.base/java.lang.Thread.run(Thread.java:832)

JavaTest Message: Test threw exception: java.lang.Exception: Failed to find tests, VM crashed with exit code 1
JavaTest Message: shutting down test

STATUS:Failed.`main' threw exception: java.lang.Exception: Failed to find tests, VM crashed with exit code 1
rerun:
cd /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/scratch/0 && \
HOME=/home/roland \
JDK8_HOME=/home/roland/jdk-14.0.1 \
LANG=en_US.UTF-8 \
LC_ALL=C \
PATH=/bin:/usr/bin:/usr/sbin \
TEST_IMAGE_DIR=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test \
TEST_IMAGE_GRAAL_DIR=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/graal \
XMODIFIERS=@im=ibus \
CLASSPATH=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0:/home/roland/jdk-jdk/test/hotspot/jtreg:/home/roland/tools/jtreg/build/images/jtreg/lib/javatest.jar:/home/roland/tools/jtreg/build/images/jtreg/lib/jtreg.jar \
    /home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk/bin/java \
        -Dtest.vm.opts='-XX:MaxRAMPercentage=3 -Djava.io.tmpdir=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/tmp -server -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI' \
        -Dtest.tool.vm.opts='-J-XX:MaxRAMPercentage=3 -J-Djava.io.tmpdir=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/tmp -J-server -J-XX:+UnlockExperimentalVMOptions -J-XX:+EnableJVMCI' \
        -Dtest.compiler.opts= \
        -Dtest.java.opts= \
        -Dtest.jdk=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk \
        -Dcompile.jdk=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk \
        -Dtest.timeout.factor=4.0 \
        -Dtest.nativepath=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/native \
        -Dtest.root=/home/roland/jdk-jdk/test/hotspot/jtreg \
        -Dtest.name=compiler/graalunit/CoreTest.java \
        -Dtest.file=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit/CoreTest.java \
        -Dtest.src=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit \
        -Dtest.src.path=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/test/hotspot/jtreg \
        -Dtest.classes=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit/CoreTest.d \
        -Dtest.class.path=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit/CoreTest.d:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0 \
        -Dtest.class.path.prefix=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0 \
        -Dtest.modules=jdk.internal.vm.compiler \
        --add-modules jdk.internal.vm.compiler \
        -XX:MaxRAMPercentage=3 \
        -Djava.io.tmpdir=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/tmp \
        -server \
        -XX:+UnlockExperimentalVMOptions \
        -XX:+EnableJVMCI \
        -Djava.library.path=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/native \
        -XX:+UnlockExperimentalVMOptions \
        -XX:+EnableJVMCI \
        com.sun.javatest.regtest.agent.MainWrapper /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/compiler/graalunit/CoreTest.d/main.0.jta -prefix org.graalvm.compiler.core.test -exclude ExcludeList.txt -vmargs --add-opens=java.base/java.lang=ALL-UNNAMED

TEST RESULT: Failed. Execution failed: `main' threw exception: java.lang.Exception: Failed to find tests, VM crashed with exit code 1
--------------------------------------------------
Test results: failed: 1
Report written to /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-results/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/html/report.html
Results written to /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java
Error: Some tests failed or other problems occurred.
Finished running test 'jtreg:test/hotspot/jtreg/compiler/graalunit/CoreTest.java'
Test report is stored in build/linux-x86_64-server-release/test-results/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java

==============================
Test summary
==============================
   TEST                                              TOTAL  PASS  FAIL ERROR   
   jtreg:test/hotspot/jtreg/compiler/graalunit/CoreTest.java
>>                                                       1     0     1     0 <<
==============================
TEST FAILURE

make[1]: *** [/home/roland/jdk-jdk/make/Init.gmk:319: main] Error 1
make: *** [/home/roland/jdk-jdk/make/Init.gmk:186: run-test] Error 2


From fairoz.matte at oracle.com  Fri Jul 10 08:01:25 2020
From: fairoz.matte at oracle.com (Fairoz Matte)
Date: Fri, 10 Jul 2020 01:01:25 -0700 (PDT)
Subject: Question regarding 8248521: TestVerifyIterativeGVN.java is
 failing with timeout
In-Reply-To: <AM4PR0202MB29646E30B5F48F0E73272F99EC650@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <AM4PR0202MB29646E30B5F48F0E73272F99EC650@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <a087c3c6-4d58-49a9-8dbe-2b8f498dde5f@default>

Hi Goetz,

This issue is only applicable to 11u.

The fix for JDK-8246203 changed the verification algorithm used with VerifyIterativeGVN, and the new algorithm takes more time.
We have therefore increased the test timeout from 600 to 1200.

Thanks,
Fairoz

 
From: Lindenmaier, Goetz <goetz.lindenmaier at sap.com> 
Sent: Friday, July 10, 2020 12:11 PM
To: Fairoz Matte <fairoz.matte at oracle.com>
Cc: hotspot-compiler-dev at openjdk.java.net
Subject: Question regarding 8248521: TestVerifyIterativeGVN.java is failing with timeout

 
Hi Fairoz,

we also see this test timing out on mac. But only so with
jdk11u.
Do you mind sharing how you fixed this? Did you just increase
the timeout, or did you figure out why this fails on mac in 11u?

Thanks,
  Goetz

From goetz.lindenmaier at sap.com  Fri Jul 10 08:06:04 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Fri, 10 Jul 2020 08:06:04 +0000
Subject: Question regarding 8248521: TestVerifyIterativeGVN.java is
 failing with timeout
In-Reply-To: <a087c3c6-4d58-49a9-8dbe-2b8f498dde5f@default>
References: <AM4PR0202MB29646E30B5F48F0E73272F99EC650@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <a087c3c6-4d58-49a9-8dbe-2b8f498dde5f@default>
Message-ID: <AM4PR0202MB296492C25C2E6913335560DBEC650@AM4PR0202MB2964.eurprd02.prod.outlook.com>

Hi Fairoz,

Thanks for the info.

It's still unclear to me why the algorithm takes longer in 11
than in 15 ... but no matter.

Best regards,
  Goetz.

From: Fairoz Matte <fairoz.matte at oracle.com>
Sent: Friday, July 10, 2020 10:01 AM
To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>
Cc: hotspot-compiler-dev at openjdk.java.net
Subject: RE: Question regarding 8248521: TestVerifyIterativeGVN.java is failing with timeout

Hi Goetz,

This issue is only applicable to 11u.
After the fix of JDK-8246203,  which changed the algorithm for the verification used with VerifyIterativeGVN (takes more time)
We have adjusted timeout to 1200 from 600.

Thanks,
Fairoz

From: Lindenmaier, Goetz <goetz.lindenmaier at sap.com<mailto:goetz.lindenmaier at sap.com>>
Sent: Friday, July 10, 2020 12:11 PM
To: Fairoz Matte <fairoz.matte at oracle.com<mailto:fairoz.matte at oracle.com>>
Cc: hotspot-compiler-dev at openjdk.java.net<mailto:hotspot-compiler-dev at openjdk.java.net>
Subject: Question regarding 8248521: TestVerifyIterativeGVN.java is failing with timeout

Hi Fairoz,

we also see this test timing out on mac. But only so with
jdk11u.
Do you mind sharing how you fixed this? Did you just increase
the timeout, or did you figure out why this fails on mac in 11u?

Thanks,
  Goetz

From aph at redhat.com  Fri Jul 10 08:10:26 2020
From: aph at redhat.com (Andrew Haley)
Date: Fri, 10 Jul 2020 09:10:26 +0100
Subject: [aarch64-port-dev ] RFR(S): 8248676: AArch64: Add workaround for
 LITable constructor
In-Reply-To: <MWHPR21MB05119533601E0AF1DE3CB0E9B0640@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB05119533601E0AF1DE3CB0E9B0640@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <af179579-b8cd-db69-fede-802614f23092@redhat.com>

On 09/07/2020 21:31, Ludovic Henry wrote:
> JBS: https://bugs.openjdk.java.net/browse/JDK-8248676
> Webrev: http://cr.openjdk.java.net/~burban/luhenry/8248676/webrev.00/
> Testing: jtreg:test/hotspot/jtreg:tier1, jtreg:test/jdk:tier1, jtreg:test/jdk:tier2, jtreg:test/langtools on Linux-AArch64, no regressions.
> 
> This small fix is in the context of the larger support for Windows-AArch64. The attribute `__attribute__ ((constructor))` is not supported by MSVC, and the documented workaround is to allocate an empty static struct with a constructor. This patch only applies this workaround when compiling on Windows, and leaves other platforms unchanged.

Please take out the #ifdef WINDOWS: we can use portable C++ here on all
platforms. Thanks,
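
For readers unfamiliar with the workaround being discussed, here is a minimal
sketch of the portable form: a static object whose constructor does the
one-time setup, used unconditionally instead of `__attribute__ ((constructor))`.
All names below (table, fill_table, TableInitializer, lookup) are hypothetical
and are not taken from the webrev; the actual LITable code is not shown in
this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a table that must be filled before first use.
static std::uint32_t table[16];
static bool table_ready = false;

// One-time setup that would otherwise be tagged __attribute__((constructor)).
static void fill_table() {
  for (std::uint32_t i = 0; i < 16; i++) {
    table[i] = i * i;
  }
  table_ready = true;
}

namespace {
// Portable replacement: an empty struct whose constructor runs the setup.
// The single static instance is constructed during static initialization,
// so no compiler-specific attribute and no #ifdef is needed.
struct TableInitializer {
  TableInitializer() { fill_table(); }
};
TableInitializer table_initializer;
}  // namespace

// Example consumer of the table (assumption: callers run after static init).
std::uint32_t lookup(std::uint32_t i) {
  assert(table_ready);
  assert(i < 16);
  return table[i];
}
```

With this form, a call such as lookup(3) made from main() sees the filled
table on any conforming toolchain, which is why a platform guard around the
workaround should not be necessary.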

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Fri Jul 10 08:11:13 2020
From: aph at redhat.com (Andrew Haley)
Date: Fri, 10 Jul 2020 09:11:13 +0100
Subject: [aarch64-port-dev ] RFR(XS) 8248671: AArch64: Remove unused
 variables
In-Reply-To: <DM6PR21MB14529CCD3BDC0E3AA1AA38A1C2640@DM6PR21MB1452.namprd21.prod.outlook.com>
References: <DM6PR21MB14529CCD3BDC0E3AA1AA38A1C2640@DM6PR21MB1452.namprd21.prod.outlook.com>
Message-ID: <108fd979-c60c-11d1-f125-e8e67160d099@redhat.com>

On 09/07/2020 22:08, Bernhard Urban-Forster wrote:
> JBS: https://bugs.openjdk.java.net/browse/JDK-8248671
> Webrev: http://cr.openjdk.java.net/~burban/8248671_unused-vars/
> 
> We found this issue while bringing up Windows+AArch64 support for HotSpot. The Microsoft toolchain (MSVC) seems to be slightly more pedantic than GCC.

OK, thanks.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From rwestrel at redhat.com  Fri Jul 10 08:27:34 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Fri, 10 Jul 2020 10:27:34 +0200
Subject: RFR(S): 8248598: [Graal] Several testcases from
 applications/jcstress/acqrel.java fails with forbidden state
In-Reply-To: <3db3371d-ed71-2bad-6c67-9fb6906d719f@oracle.com>
References: <87v9ixnl6n.fsf@redhat.com>
 <87BD32EE-9FAE-4AA7-9861-583B499E39BF@oracle.com> <87pn94o9ba.fsf@redhat.com>
 <3db3371d-ed71-2bad-6c67-9fb6906d719f@oracle.com>
Message-ID: <87k0zbojux.fsf@redhat.com>


> I confirmed that VolatileAccessReadEliminationTest fails without the 
> patch and passed with it.

Thanks for checking. Can I push the change? Do I need to have it go
through the submit repo?

Roland.


From christian.hagedorn at oracle.com  Fri Jul 10 08:28:04 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Fri, 10 Jul 2020 10:28:04 +0200
Subject: Question regarding 8248521: TestVerifyIterativeGVN.java is
 failing with timeout
In-Reply-To: <AM4PR0202MB296492C25C2E6913335560DBEC650@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <AM4PR0202MB29646E30B5F48F0E73272F99EC650@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <a087c3c6-4d58-49a9-8dbe-2b8f498dde5f@default>
 <AM4PR0202MB296492C25C2E6913335560DBEC650@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <5eb6bc2c-7690-9d69-a82d-4ceac3399b3f@oracle.com>

Hi Goetz

As Fairoz has mentioned, JDK-8246203 changed the algorithm slightly such 
that it now needs more time for the verification (we check more nodes 
than before).

JDK-8246203 originally only happened in JDK-11 with 
-XX:+VerifyIterativeGVN where a stack overflow crash occurred with a 
more or less HelloWorld test (the old algorithm was a recursive one). It 
turned out that JDK-11 compiled a specific big method which 
generated quite a lot of nodes in a chain, which made it crash. However, 
with JDK-15 (and 16), this method was no longer compiled as part of a 
HelloWorld test. It probably got changed since JDK-11 or is not called 
anymore when starting up. Therefore, we concluded that it must be an 
11-only issue and just increased the timeout for the test, as we have not 
seen it time out in JDK-15 or 16.
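
To illustrate why a recursive verification walk can overflow the stack on a
long chain of nodes while a worklist-based walk does not, here is a small
self-contained sketch. It is not the HotSpot code: Node, count_recursive,
count_iterative and make_chain are hypothetical names, and the assumption
that the fix replaced recursion with an explicit worklist is only inferred
from the description above.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified graph node: only the input edges matter here.
struct Node {
  std::vector<Node*> inputs;
};

// Recursive walk: uses one native stack frame per node along a path, so a
// chain of N nodes needs stack depth N and can overflow for large N.
std::size_t count_recursive(const Node* n,
                            std::unordered_set<const Node*>& seen) {
  if (n == nullptr || !seen.insert(n).second) {
    return 0;  // null input or already visited
  }
  std::size_t count = 1;
  for (const Node* in : n->inputs) {
    count += count_recursive(in, seen);
  }
  return count;
}

// Worklist walk: the pending nodes live in a heap-allocated vector, so the
// native stack depth stays constant regardless of the chain length.
std::size_t count_iterative(const Node* root) {
  std::unordered_set<const Node*> seen;
  std::vector<const Node*> worklist;
  if (root != nullptr) {
    worklist.push_back(root);
  }
  std::size_t count = 0;
  while (!worklist.empty()) {
    const Node* n = worklist.back();
    worklist.pop_back();
    if (!seen.insert(n).second) {
      continue;  // already visited
    }
    count++;
    for (const Node* in : n->inputs) {
      if (in != nullptr) {
        worklist.push_back(in);
      }
    }
  }
  return count;
}

// Builds a chain node[0] -> node[1] -> ... -> node[length - 1].
std::vector<Node> make_chain(std::size_t length) {
  std::vector<Node> nodes(length);
  for (std::size_t i = 0; i + 1 < length; i++) {
    nodes[i].inputs.push_back(&nodes[i + 1]);
  }
  return nodes;
}
```

Calling count_iterative on the head of a chain with a few hundred thousand
nodes completes normally, whereas count_recursive on the same chain may
exhaust the thread stack, depending on the stack size. Both visit every
reachable node once, so the sketch only shows the stack-depth difference and
says nothing about why the new verification checks more nodes.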

Best regards,
Christian

On 10.07.20 10:06, Lindenmaier, Goetz wrote:
> Hi Fairoz,
> 
> Thanks for the info.
> 
> It's still unclear to me why the algorithm takes longer in 11
> than in 15 ... but no matter.
> 
> Best regards,
>    Goetz.
> 
> From: Fairoz Matte <fairoz.matte at oracle.com>
> Sent: Friday, July 10, 2020 10:01 AM
> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>
> Cc: hotspot-compiler-dev at openjdk.java.net
> Subject: RE: Question regarding 8248521: TestVerifyIterativeGVN.java is failing with timeout
> 
> Hi Goetz,
> 
> This issue is only applicable to 11u.
> After the fix of JDK-8246203,  which changed the algorithm for the verification used with VerifyIterativeGVN (takes more time)
> We have adjusted timeout to 1200 from 600.
> 
> Thanks,
> Fairoz
> 
> From: Lindenmaier, Goetz <goetz.lindenmaier at sap.com<mailto:goetz.lindenmaier at sap.com>>
> Sent: Friday, July 10, 2020 12:11 PM
> To: Fairoz Matte <fairoz.matte at oracle.com<mailto:fairoz.matte at oracle.com>>
> Cc: hotspot-compiler-dev at openjdk.java.net<mailto:hotspot-compiler-dev at openjdk.java.net>
> Subject: Question regarding 8248521: TestVerifyIterativeGVN.java is failing with timeout
> 
> Hi Fairoz,
> 
> we also see this test timing out on mac. But only so with
> jdk11u.
> Do you mind sharing how you fixed this? Did you just increase
> the timeout, or did you figure out why this fails on mac in 11u?
> 
> Thanks,
>    Goetz
> 

From goetz.lindenmaier at sap.com  Fri Jul 10 08:46:36 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Fri, 10 Jul 2020 08:46:36 +0000
Subject: Question regarding 8248521: TestVerifyIterativeGVN.java is
 failing with timeout
In-Reply-To: <5eb6bc2c-7690-9d69-a82d-4ceac3399b3f@oracle.com>
References: <AM4PR0202MB29646E30B5F48F0E73272F99EC650@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <a087c3c6-4d58-49a9-8dbe-2b8f498dde5f@default>
 <AM4PR0202MB296492C25C2E6913335560DBEC650@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <5eb6bc2c-7690-9d69-a82d-4ceac3399b3f@oracle.com>
Message-ID: <AM4PR0202MB29640F6609B6436F8DEBF01FEC650@AM4PR0202MB2964.eurprd02.prod.outlook.com>

Hi Christian, 

Thanks for your explanation. 
I can confirm that it never timed out in 15 in our test infra.
It might start again in case code changes again, but 
that is the risk with any test.
(We never saw the stack overflow, though.)

Best regards,
  Goetz.


> -----Original Message-----
> From: Christian Hagedorn <christian.hagedorn at oracle.com>
> Sent: Friday, July 10, 2020 10:28 AM
> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'Fairoz Matte'
> <fairoz.matte at oracle.com>
> Cc: hotspot-compiler-dev at openjdk.java.net
> Subject: Re: Question regarding 8248521: TestVerifyIterativeGVN.java is
> failing with timeout
> 
> Hi Goetz
> 
> As Fairoz has mentioned, JDK-8246203 changed the algorithm slightly such
> that it needs now more time for the verification (we check more nodes
> than before).
> 
> JDK-8246203 originally only happened in JDK-11 with
> -XX:+VerifyIterativeGVN where a stack overflow crash occurred with a
> more or less HelloWorld test (the old algorithm as a recursive one). It
> turned out that with JDK-11 it compiled a specific big method which
> generated quite a lot of nodes in a chain which let it crash. However,
> with JDK-15 (and 16), this method was not compiled anymore as part of a
> HelloWorld test. It probably got changed since JDK-11 or is not called
> anymore when starting up. Therefore, we concluded that it must be an 11
> only issue and just increased the timeout for the test as we have not
> seen timing it out in JDK-15 or 16.
> 
> Best regards,
> Christian
> 
> On 10.07.20 10:06, Lindenmaier, Goetz wrote:
> > Hi Fairoz,
> >
> > Thanks for the info.
> >
> > It's still unclear to me why the algorithm takes longer in 11
> > than in 15 ... but no matter.
> >
> > Best regards,
> >    Goetz.
> >
> > From: Fairoz Matte <fairoz.matte at oracle.com>
> > Sent: Friday, July 10, 2020 10:01 AM
> > To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>
> > Cc: hotspot-compiler-dev at openjdk.java.net
> > Subject: RE: Question regarding 8248521: TestVerifyIterativeGVN.java is
> failing with timeout
> >
> > Hi Goetz,
> >
> > This issue is only applicable to 11u.
> > After the fix of JDK-8246203,  which changed the algorithm for the
> verification used with VerifyIterativeGVN (takes more time)
> > We have adjusted timeout to 1200 from 600.
> >
> > Thanks,
> > Fairoz
> >
> > From: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>
> > Sent: Friday, July 10, 2020 12:11 PM
> > To: Fairoz Matte <fairoz.matte at oracle.com>
> > Cc: hotspot-compiler-dev at openjdk.java.net
> > Subject: Question regarding 8248521: TestVerifyIterativeGVN.java is failing with timeout
> >
> > Hi Fairoz,
> >
> > we also see this test timing out on mac. But only so with
> > jdk11u.
> > Do you mind sharing how you fixed this? Did you just increase
> > the timeout, or did you figure out why this fails on mac in 11u?
> >
> > Thanks,
> >    Goetz
> >

From christian.hagedorn at oracle.com  Fri Jul 10 09:00:46 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Fri, 10 Jul 2020 11:00:46 +0200
Subject: Question regarding 8248521: TestVerifyIterativeGVN.java is
 failing with timeout
In-Reply-To: <AM4PR0202MB29640F6609B6436F8DEBF01FEC650@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <AM4PR0202MB29646E30B5F48F0E73272F99EC650@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <a087c3c6-4d58-49a9-8dbe-2b8f498dde5f@default>
 <AM4PR0202MB296492C25C2E6913335560DBEC650@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <5eb6bc2c-7690-9d69-a82d-4ceac3399b3f@oracle.com>
 <AM4PR0202MB29640F6609B6436F8DEBF01FEC650@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <09cd9ab9-3dd4-2613-d7b1-86d4474081b0@oracle.com>

Hi Goetz

You're welcome and thanks for confirming that you have not seen a 
timeout in 15 either in your testing. We only saw the stack overflow a 
few times on SPARC in 11.

Best regards,
Christian
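
P.S. for readers of the archive: the difference between a recursive and an iterative walk over a long chain of nodes, as discussed in this thread, can be sketched as follows. This is only an illustration in Java, not the HotSpot code (which is C++). The `Node` class and the methods `countRecursive`, `countIterative` and `chain` are hypothetical names, and the sketch assumes nothing beyond what the thread states: the old verification was recursive, and a long chain of nodes made it overflow the stack.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ChainWalk {
    /** Hypothetical stand-in for a compiler IR node; not the C2 Node class. */
    public static final class Node {
        final List<Node> inputs = new ArrayList<>(1);
    }

    /** Recursive walk: the call stack grows with the length of the chain. */
    public static int countRecursive(Node n, Set<Node> visited) {
        if (n == null || !visited.add(n)) {
            return 0;
        }
        int count = 1;
        for (Node in : n.inputs) {
            count += countRecursive(in, visited);
        }
        return count;
    }

    /** Iterative walk: pending nodes are kept in a heap-allocated worklist. */
    public static int countIterative(Node root) {
        Set<Node> visited = new HashSet<>();
        Deque<Node> worklist = new ArrayDeque<>();
        worklist.push(root);
        int count = 0;
        while (!worklist.isEmpty()) {
            Node n = worklist.pop();
            if (!visited.add(n)) {
                continue; // already counted
            }
            count++;
            for (Node in : n.inputs) {
                worklist.push(in);
            }
        }
        return count;
    }

    /** Builds a chain of the given length (at least 1) and returns its head. */
    public static Node chain(int length) {
        Node head = new Node();
        Node cur = head;
        for (int i = 1; i < length; i++) {
            Node next = new Node();
            cur.inputs.add(next);
            cur = next;
        }
        return head;
    }

    public static void main(String[] args) {
        Node head = chain(200_000);
        System.out.println("iterative: " + countIterative(head));
        try {
            System.out.println("recursive: " + countRecursive(head, new HashSet<>()));
        } catch (StackOverflowError e) {
            // Whether this triggers depends on the thread stack size (-Xss).
            System.out.println("recursive: stack overflow");
        }
    }
}
```

The iterative variant trades call-stack depth for heap memory, so its limit is the heap size rather than the thread stack size. Whether the recursive variant actually overflows for a given chain length depends on the stack size and on how the method is compiled.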

On 10.07.20 10:46, Lindenmaier, Goetz wrote:
> Hi Christian,
> 
> Thanks for your explanation.
> I can confirm that it never timed out in 15 in our test infra.
> It might start again in case code changes again, but
> that is the risk with any test.
> (We never saw the stack overflow, though.)
> 
> Best regards,
>    Goetz.
> 
> 
>> -----Original Message-----
>> From: Christian Hagedorn <christian.hagedorn at oracle.com>
>> Sent: Friday, July 10, 2020 10:28 AM
>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; 'Fairoz Matte'
>> <fairoz.matte at oracle.com>
>> Cc: hotspot-compiler-dev at openjdk.java.net
>> Subject: Re: Question regarding 8248521: TestVerifyIterativeGVN.java is
>> failing with timeout
>>
>> Hi Goetz
>>
>> As Fairoz has mentioned, JDK-8246203 changed the algorithm slightly such
>> that it now needs more time for the verification (we check more nodes
>> than before).
>>
>> JDK-8246203 originally only happened in JDK-11 with
>> -XX:+VerifyIterativeGVN where a stack overflow crash occurred with a
>> more or less HelloWorld test (the old algorithm was a recursive one). It
>> turned out that with JDK-11 it compiled a specific big method which
>> generated quite a lot of nodes in a chain, which made it crash. However,
>> with JDK-15 (and 16), this method was not compiled anymore as part of a
>> HelloWorld test. It probably got changed since JDK-11 or is not called
>> anymore when starting up. Therefore, we concluded that it must be an
>> 11-only issue and just increased the timeout for the test, as we have
>> not seen it timing out in JDK-15 or 16.
>>
>> Best regards,
>> Christian
>>
>> On 10.07.20 10:06, Lindenmaier, Goetz wrote:
>>> Hi Fairoz,
>>>
>>> Thanks for the info.
>>>
>>> It's still unclear to me why the algorithm takes longer in 11
>>> than in 15 ... but no matter.
>>>
>>> Best regards,
>>>     Goetz.
>>>
>>> From: Fairoz Matte <fairoz.matte at oracle.com>
>>> Sent: Friday, July 10, 2020 10:01 AM
>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>
>>> Cc: hotspot-compiler-dev at openjdk.java.net
>>> Subject: RE: Question regarding 8248521: TestVerifyIterativeGVN.java is failing with timeout
>>>
>>> Hi Goetz,
>>>
>>> This issue is only applicable to 11u.
>>> After the fix of JDK-8246203, which changed the algorithm for the
>>> verification used with VerifyIterativeGVN (it takes more time),
>>> we have adjusted the timeout to 1200 from 600.
>>>
>>> Thanks,
>>> Fairoz
>>>
>>> From: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>
>>> Sent: Friday, July 10, 2020 12:11 PM
>>> To: Fairoz Matte <fairoz.matte at oracle.com>
>>> Cc: hotspot-compiler-dev at openjdk.java.net
>>> Subject: Question regarding 8248521: TestVerifyIterativeGVN.java is failing with timeout
>>>
>>> Hi Fairoz,
>>>
>>> we also see this test timing out on mac. But only so with
>>> jdk11u.
>>> Do you mind sharing how you fixed this? Did you just increase
>>> the timeout, or did you figure out why this fails on mac in 11u?
>>>
>>> Thanks,
>>>     Goetz
>>>

From dean.long at oracle.com  Fri Jul 10 09:49:37 2020
From: dean.long at oracle.com (Dean Long)
Date: Fri, 10 Jul 2020 02:49:37 -0700
Subject: RFR(S): 8248598: [Graal] Several testcases from
 applications/jcstress/acqrel.java fails with forbidden state
In-Reply-To: <87k0zbojux.fsf@redhat.com>
References: <87v9ixnl6n.fsf@redhat.com>
 <87BD32EE-9FAE-4AA7-9861-583B499E39BF@oracle.com> <87pn94o9ba.fsf@redhat.com>
 <3db3371d-ed71-2bad-6c67-9fb6906d719f@oracle.com> <87k0zbojux.fsf@redhat.com>
Message-ID: <1e4e7462-5b59-b627-972b-03262c32bc64@oracle.com>

On 7/10/20 1:27 AM, Roland Westrelin wrote:
>> I confirmed that VolatileAccessReadEliminationTest fails without the
>> patch and passed with it.
> Thanks for checking. Can I push the change?

Yes. You can list me as a reviewer.

> Do I need to have it go
> through the submit repo?

Yes, I believe so.

dl

> Roland.
>


From lutz.schmidt at sap.com  Fri Jul 10 10:13:28 2020
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Fri, 10 Jul 2020 10:13:28 +0000
Subject: [CAUTION] RE: [CAUTION] RFR(XS) 8247695: [PPC, S390]:
 compiler/intrinsics/math/TestFpMinMaxIntrinsics.java fails
Message-ID: <5FAB4241-ADCE-4E5B-80E0-04893D8AC2C5@sap.com>

Hi Richard, 

your change looks good to me. Reviewed.

We have had the change active in our test landscape for quite a few days now. It solves the issue and shows no negative side effects.

Thanks for fixing.
Lutz

On 02.07.20, 18:45, "hotspot-compiler-dev on behalf of Lindenmaier, Goetz" <hotspot-compiler-dev-retn at openjdk.java.net on behalf of goetz.lindenmaier at sap.com> wrote:

    Hi Richard, 

    I had a look at your change, looks good.  Reviewed.
    Thanks for fixing this.

    Best regards,
      Goetz.


    > -----Original Message-----
    > From: hotspot-compiler-dev <hotspot-compiler-dev-retn at openjdk.java.net>
    > On Behalf Of Reingruber, Richard
    > Sent: Thursday, July 2, 2020 4:05 PM
    > To: hotspot-compiler-dev at openjdk.java.net
    > Subject: [CAUTION] RFR(XS) 8247695: [PPC, S390]:
    > compiler/intrinsics/math/TestFpMinMaxIntrinsics.java fails
    > 
    > Hi,
    > 
    > could I please get reviews for this small bugfix which adds support for AbsL
    > nodes to the C2 backends on PPC and S390?
    > 
    > Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8247695/webrev.0/
    > Bug:    https://bugs.openjdk.java.net/browse/JDK-8247695
    > 
    > The patch successfully passes regression testing @SAP which includes JCK
    > and JTREG tests, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance
    > Suite, SAP specific tests with fastdebug and release builds.
    > 
    > Thanks, Richard.


From aph at redhat.com  Fri Jul 10 11:21:21 2020
From: aph at redhat.com (Andrew Haley)
Date: Fri, 10 Jul 2020 12:21:21 +0100
Subject: Stack allocation prototype for C2
In-Reply-To: <4C6D4959-00E1-4300-BE30-BB6FC60A491F@microsoft.com>
References: <4C6D4959-00E1-4300-BE30-BB6FC60A491F@microsoft.com>
Message-ID: <85cde128-9b75-c20b-6d17-3724c744392b@redhat.com>

On 09/07/2020 20:28, Charlie Gracie wrote:
> A final consideration is the footprint cost for project Loom. In the zone-based approach
> would each virtual thread (fibre) have its own zone TLAB (or stack of TLABs)?

I wouldn't have thought so. From the VM's point of view, it makes more
sense for the zone TLAB to be owned by the carrier thread. There is the
problem of what happens when we unmount a virtual thread, but that's
quite solvable, I would have thought.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From richard.reingruber at sap.com  Fri Jul 10 12:16:45 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Fri, 10 Jul 2020 12:16:45 +0000
Subject: RFR(XS) 8247695: [PPC, S390]:
 compiler/intrinsics/math/TestFpMinMaxIntrinsics.java fails
Message-ID: <AM0PR0202MB33317896DAB6B0E9D7F3CC599B650@AM0PR0202MB3331.eurprd02.prod.outlook.com>

Hi Lutz,

thanks for your review. I'll push after the weekend.

Cheers, Richard.

-----Original Message-----
From: Schmidt, Lutz <lutz.schmidt at sap.com> 
Sent: Freitag, 10. Juli 2020 12:13
To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Reingruber, Richard <richard.reingruber at sap.com>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: [CAUTION] RE: [CAUTION] RFR(XS) 8247695: [PPC, S390]: compiler/intrinsics/math/TestFpMinMaxIntrinsics.java fails

Hi Richard, 

your change looks good to me. Reviewed.

We have had the change active in our test landscape for quite a few days now. It solves the issue and shows no negative side effects.

Thanks for fixing.
Lutz

On 02.07.20, 18:45, "hotspot-compiler-dev on behalf of Lindenmaier, Goetz" <hotspot-compiler-dev-retn at openjdk.java.net on behalf of goetz.lindenmaier at sap.com> wrote:

    Hi Richard, 

    I had a look at your change, looks good.  Reviewed.
    Thanks for fixing this.

    Best regards,
      Goetz.


    > -----Original Message-----
    > From: hotspot-compiler-dev <hotspot-compiler-dev-retn at openjdk.java.net>
    > On Behalf Of Reingruber, Richard
    > Sent: Thursday, July 2, 2020 4:05 PM
    > To: hotspot-compiler-dev at openjdk.java.net
    > Subject: [CAUTION] RFR(XS) 8247695: [PPC, S390]:
    > compiler/intrinsics/math/TestFpMinMaxIntrinsics.java fails
    > 
    > Hi,
    > 
    > could I please get reviews for this small bugfix which adds support for AbsL
    > nodes to the C2 backends on PPC and S390?
    > 
    > Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8247695/webrev.0/
    > Bug:    https://bugs.openjdk.java.net/browse/JDK-8247695
    > 
    > The patch successfully passes regression testing @SAP which includes JCK
    > and JTREG tests, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance
    > Suite, SAP specific tests with fastdebug and release builds.
    > 
    > Thanks, Richard.


From rwestrel at redhat.com  Fri Jul 10 13:16:56 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Fri, 10 Jul 2020 15:16:56 +0200
Subject: RFR(S): 8248598: [Graal] Several testcases from
 applications/jcstress/acqrel.java fails with forbidden state
In-Reply-To: <1e4e7462-5b59-b627-972b-03262c32bc64@oracle.com>
References: <87v9ixnl6n.fsf@redhat.com>
 <87BD32EE-9FAE-4AA7-9861-583B499E39BF@oracle.com> <87pn94o9ba.fsf@redhat.com>
 <3db3371d-ed71-2bad-6c67-9fb6906d719f@oracle.com> <87k0zbojux.fsf@redhat.com>
 <1e4e7462-5b59-b627-972b-03262c32bc64@oracle.com>
Message-ID: <87h7ufo6gn.fsf@redhat.com>


> Yes. You can list me as a reviewer.

Ok. Thanks. I had it go through the submit repo and pushed it.

Roland.


From igor.ignatyev at oracle.com  Fri Jul 10 14:03:29 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Fri, 10 Jul 2020 07:03:29 -0700
Subject: RFR(S): 8248598: [Graal] Several testcases from
 applications/jcstress/acqrel.java fails with forbidden state
In-Reply-To: <87mu47ol2z.fsf@redhat.com>
References: <87v9ixnl6n.fsf@redhat.com>
 <87BD32EE-9FAE-4AA7-9861-583B499E39BF@oracle.com> <87pn94o9ba.fsf@redhat.com>
 <ECD61EA3-E44F-4F0F-9A91-8B1999714106@oracle.com> <87mu47ol2z.fsf@redhat.com>
Message-ID: <10867EC3-A199-490D-A70D-43FC95CA69DA@oracle.com>


> On Jul 10, 2020, at 1:01 AM, Roland Westrelin <rwestrel at redhat.com> wrote:
> 
> 
>> oh, I see. I guess the easiest way would be to use jtreg wrappers
>> (test/hotspot/jtreg/compiler/graalunit), there is README.md which
>> explains where you can get dependencies and where you need to put them
>> to make it work, after you finish that, you can run the test by
>> run-test framework as `make test
>> TEST=test/hotspot/jtreg/compiler/graalunit/CoreTest.java`.
> 
> I gave it a try. I downloaded the dependencies with downloadLibs.sh. But
> then running the test fails. See output below.
that's weird... Katya, could you please take a look?

> 
> The command line would run all the core tests? Is there a way to run
> only one?
AFAIK, the only way to do that is to temporarily modify an existing jtreg test (or create a new one) so that the concrete test name is passed as the `-prefix` option.

-- Igor
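
For the archive, a rough sketch of what such a temporary wrapper might look like. The `@run` actions and the `@modules` value are copied from the jtreg output quoted below in this message; the `@library` line is inferred from the "Library" entries in that output. The file name `SingleCoreTest.java` and the class name `org.graalvm.compiler.core.test.SomeTest` are hypothetical placeholders, and any further tags in the real `CoreTest.java`, which is not shown in this thread, would need to be copied over as well.

```java
// Hypothetical file: test/hotspot/jtreg/compiler/graalunit/SingleCoreTest.java
// Only the -prefix argument differs from the CoreTest.java actions in the log;
// "SomeTest" is a placeholder for the concrete test class to run.

/*
 * @test
 * @modules jdk.internal.vm.compiler
 * @library /test/lib /
 * @run build compiler.graalunit.common.GraalUnitTestLauncher
 * @run driver jdk.test.lib.FileInstaller ../../ProblemList-graal.txt ExcludeList.txt
 * @run main/othervm -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI
 *      compiler.graalunit.common.GraalUnitTestLauncher
 *      -prefix org.graalvm.compiler.core.test.SomeTest
 *      -exclude ExcludeList.txt
 *      -vmargs --add-opens=java.base/java.lang=ALL-UNNAMED
 */
```

Assuming the file sits next to `CoreTest.java`, it could then be selected with `TEST="compiler/graalunit/SingleCoreTest.java"` in the same `make ... run-test` invocation as in the log below.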
> 
> Roland.
> 
> [roland at ws jdk-jdk]$ make CONF=linux-x86_64-server-release run-test TEST="compiler/graalunit/CoreTest.java" TEST_VM_OPTS="-server -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI"
> Building target 'run-test' in configuration 'linux-x86_64-server-release'
> *** failed to import extension defpath from ~/code-tools/defpath/defpath.py: [Errno 2] No such file or directory: '/home/roland/code-tools/defpath/defpath.py'
> *** failed to import extension jcheck from ~/code-tools/jcheck/jcheck.py: [Errno 2] No such file or directory: '/home/roland/code-tools/jcheck/jcheck.py'
> Running tests using TEST_OPTS control variable 'VM_OPTIONS=-server -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI'
> Test selection 'compiler/graalunit/CoreTest.java', will run:
> * jtreg:test/hotspot/jtreg/compiler/graalunit/CoreTest.java
> 
> Running test 'jtreg:test/hotspot/jtreg/compiler/graalunit/CoreTest.java'
> --------------------------------------------------
> TEST: compiler/graalunit/CoreTest.java
> TEST JDK: /home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk
> 
> ACTION: build -- Passed. Build successful
> REASON: User specified action: run build compiler.graalunit.common.GraalUnitTestLauncher 
> TIME:   1.427 seconds
> messages:
> command: build compiler.graalunit.common.GraalUnitTestLauncher
> reason: User specified action: run build compiler.graalunit.common.GraalUnitTestLauncher 
> Library /:
>  compile: compiler.graalunit.common.GraalUnitTestLauncher
> elapsed time (seconds): 1.427
> 
> ACTION: compile -- Passed. Compilation successful
> REASON: .class file out of date or does not exist
> TIME:   1.423 seconds
> messages:
> command: compile /home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit/common/GraalUnitTestLauncher.java
> reason: .class file out of date or does not exist
> Additional options from @modules: --add-modules jdk.internal.vm.compiler
> Mode: agentvm
> Agent id: 1
> elapsed time (seconds): 1.423
> configuration:
> Boot Layer (javac runtime environment)
>  class path: /home/roland/tools/jtreg/build/images/jtreg/lib/javatest.jar 
>              /home/roland/tools/jtreg/build/images/jtreg/lib/jtreg.jar 
>  patch:      java.base /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/patches/java.base
> 
> javac compilation environment
>  add modules: jdk.internal.vm.compiler
>  source path: /home/roland/jdk-jdk/test/lib
>               /home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit
>               /home/roland/jdk-jdk/test/hotspot/jtreg
>  class path:  /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib
>               /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit
>               /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0
> 
> rerun:
> cd /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/scratch/0 && \
> HOME=/home/roland \
> JDK8_HOME=/home/roland/jdk-14.0.1 \
> LANG=en_US.UTF-8 \
> LC_ALL=C \
> PATH=/bin:/usr/bin:/usr/sbin \
> TEST_IMAGE_DIR=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test \
> TEST_IMAGE_GRAAL_DIR=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/graal \
> XMODIFIERS=@im=ibus \
>    /home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk/bin/javac \
>        -J-XX:MaxRAMPercentage=3 \
>        -J-Djava.io.tmpdir=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/tmp \
>        -J-server \
>        -J-XX:+UnlockExperimentalVMOptions \
>        -J-XX:+EnableJVMCI \
>        -J-Djava.library.path=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/native \
>        -J-Dtest.vm.opts='-XX:MaxRAMPercentage=3 -Djava.io.tmpdir=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/tmp -server -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI' \
>        -J-Dtest.tool.vm.opts='-J-XX:MaxRAMPercentage=3 -J-Djava.io.tmpdir=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/tmp -J-server -J-XX:+UnlockExperimentalVMOptions -J-XX:+EnableJVMCI' \
>        -J-Dtest.compiler.opts= \
>        -J-Dtest.java.opts= \
>        -J-Dtest.jdk=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk \
>        -J-Dcompile.jdk=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk \
>        -J-Dtest.timeout.factor=4.0 \
>        -J-Dtest.nativepath=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/native \
>        -J-Dtest.root=/home/roland/jdk-jdk/test/hotspot/jtreg \
>        -J-Dtest.name=compiler/graalunit/CoreTest.java \
>        -J-Dtest.file=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit/CoreTest.java \
>        -J-Dtest.src=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit \
>        -J-Dtest.src.path=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/test/hotspot/jtreg \
>        -J-Dtest.classes=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit/CoreTest.d \
>        -J-Dtest.class.path=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit/CoreTest.d:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0 \
>        -J-Dtest.class.path.prefix=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0 \
>        -J-Dtest.modules=jdk.internal.vm.compiler \
>        --add-modules jdk.internal.vm.compiler \
>        -d /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0 \
>        -sourcepath /home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/test/hotspot/jtreg \
>        -classpath /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0 /home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit/common/GraalUnitTestLauncher.java
> direct:
> Note: /home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit/common/GraalUnitTestLauncher.java uses unchecked or unsafe operations.
> Note: Recompile with -Xlint:unchecked for details.
> 
> ACTION: build -- Passed. Build successful
> REASON: Named class compiled on demand
> TIME:   0.061 seconds
> messages:
> command: build jdk.test.lib.FileInstaller
> reason: Named class compiled on demand
> Library /test/lib:
>  compile: jdk.test.lib.FileInstaller
> elapsed time (seconds): 0.061
> 
> ACTION: compile -- Passed. Compilation successful
> REASON: .class file out of date or does not exist
> TIME:   0.061 seconds
> messages:
> command: compile /home/roland/jdk-jdk/test/lib/jdk/test/lib/FileInstaller.java
> reason: .class file out of date or does not exist
> Additional options from @modules: --add-modules jdk.internal.vm.compiler
> Mode: agentvm
> Agent id: 1
> elapsed time (seconds): 0.061
> configuration:
> Boot Layer (javac runtime environment)
>  class path: /home/roland/tools/jtreg/build/images/jtreg/lib/javatest.jar 
>              /home/roland/tools/jtreg/build/images/jtreg/lib/jtreg.jar 
>  patch:      java.base /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/patches/java.base
> 
> javac compilation environment
>  add modules: jdk.internal.vm.compiler
>  source path: /home/roland/jdk-jdk/test/lib
>               /home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit
>               /home/roland/jdk-jdk/test/hotspot/jtreg
>  class path:  /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib
>               /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit
>               /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0
> 
> rerun:
> cd /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/scratch/0 && \
> HOME=/home/roland \
> JDK8_HOME=/home/roland/jdk-14.0.1 \
> LANG=en_US.UTF-8 \
> LC_ALL=C \
> PATH=/bin:/usr/bin:/usr/sbin \
> TEST_IMAGE_DIR=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test \
> TEST_IMAGE_GRAAL_DIR=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/graal \
> XMODIFIERS=@im=ibus \
>    /home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk/bin/javac \
>        -J-XX:MaxRAMPercentage=3 \
>        -J-Djava.io.tmpdir=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/tmp \
>        -J-server \
>        -J-XX:+UnlockExperimentalVMOptions \
>        -J-XX:+EnableJVMCI \
>        -J-Djava.library.path=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/native \
>        -J-Dtest.vm.opts='-XX:MaxRAMPercentage=3 -Djava.io.tmpdir=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/tmp -server -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI' \
>        -J-Dtest.tool.vm.opts='-J-XX:MaxRAMPercentage=3 -J-Djava.io.tmpdir=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/tmp -J-server -J-XX:+UnlockExperimentalVMOptions -J-XX:+EnableJVMCI' \
>        -J-Dtest.compiler.opts= \
>        -J-Dtest.java.opts= \
>        -J-Dtest.jdk=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk \
>        -J-Dcompile.jdk=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk \
>        -J-Dtest.timeout.factor=4.0 \
>        -J-Dtest.nativepath=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/native \
>        -J-Dtest.root=/home/roland/jdk-jdk/test/hotspot/jtreg \
>        -J-Dtest.name=compiler/graalunit/CoreTest.java \
>        -J-Dtest.file=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit/CoreTest.java \
>        -J-Dtest.src=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit \
>        -J-Dtest.src.path=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/test/hotspot/jtreg \
>        -J-Dtest.classes=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit/CoreTest.d \
>        -J-Dtest.class.path=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit/CoreTest.d:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0 \
>        -J-Dtest.class.path.prefix=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0 \
>        -J-Dtest.modules=jdk.internal.vm.compiler \
>        --add-modules jdk.internal.vm.compiler \
>        -d /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib \
>        -sourcepath /home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/test/hotspot/jtreg \
>        -classpath /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0 /home/roland/jdk-jdk/test/lib/jdk/test/lib/FileInstaller.java
> 
> ACTION: driver -- Passed. Execution successful
> REASON: User specified action: run driver jdk.test.lib.FileInstaller ../../ProblemList-graal.txt ExcludeList.txt 
> TIME:   0.258 seconds
> messages:
> command: driver jdk.test.lib.FileInstaller ../../ProblemList-graal.txt ExcludeList.txt
> reason: User specified action: run driver jdk.test.lib.FileInstaller ../../ProblemList-graal.txt ExcludeList.txt 
> Mode: agentvm
> Agent id: 2
> elapsed time (seconds): 0.258
> configuration:
> Boot Layer
>  class path: /home/roland/tools/jtreg/build/images/jtreg/lib/javatest.jar 
>              /home/roland/tools/jtreg/build/images/jtreg/lib/jtreg.jar 
>              /home/roland/tools/jtreg/build/images/jtreg/lib/junit.jar 
>              /home/roland/tools/jtreg/build/images/jtreg/lib/testng.jar 
>              /home/roland/tools/jtreg/build/images/jtreg/lib/jcommander.jar 
>  patch:      java.base /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/patches/java.base
> 
> Test Layer
>  class path: /home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit
>              /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib
>              /home/roland/jdk-jdk/test/lib
>              /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit
>              /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0
>              /home/roland/jdk-jdk/test/hotspot/jtreg
> 
> rerun:
> cd /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/scratch/0 && \
> HOME=/home/roland \
> JDK8_HOME=/home/roland/jdk-14.0.1 \
> LANG=en_US.UTF-8 \
> LC_ALL=C \
> PATH=/bin:/usr/bin:/usr/sbin \
> TEST_IMAGE_DIR=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test \
> TEST_IMAGE_GRAAL_DIR=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/graal \
> XMODIFIERS=@im=ibus \
>    /home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk/bin/java \
>        -Dtest.vm.opts='-XX:MaxRAMPercentage=3 -Djava.io.tmpdir=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/tmp -server -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI' \
>        -Dtest.tool.vm.opts='-J-XX:MaxRAMPercentage=3 -J-Djava.io.tmpdir=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/tmp -J-server -J-XX:+UnlockExperimentalVMOptions -J-XX:+EnableJVMCI' \
>        -Dtest.compiler.opts= \
>        -Dtest.java.opts= \
>        -Dtest.jdk=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk \
>        -Dcompile.jdk=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk \
>        -Dtest.timeout.factor=4.0 \
>        -Dtest.nativepath=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/native \
>        -Dtest.root=/home/roland/jdk-jdk/test/hotspot/jtreg \
>        -Dtest.name=compiler/graalunit/CoreTest.java \
>        -Dtest.file=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit/CoreTest.java \
>        -Dtest.src=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit \
>        -Dtest.src.path=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/test/hotspot/jtreg \
>        -Dtest.classes=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit/CoreTest.d \
>        -Dtest.class.path=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit/CoreTest.d:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0 \
>        -Dtest.class.path.prefix=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0 \
>        -Dtest.modules=jdk.internal.vm.compiler \
>        -classpath /home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0:/home/roland/jdk-jdk/test/hotspot/jtreg:/home/roland/tools/jtreg/build/images/jtreg/lib/javatest.jar:/home/roland/tools/jtreg/build/images/jtreg/lib/jtreg.jar \
>        jdk.test.lib.FileInstaller ../../ProblemList-graal.txt ExcludeList.txt
> STDOUT:
> copying /home/roland/jdk-jdk/test/hotspot/jtreg/ProblemList-graal.txt to /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/scratch/0/ExcludeList.txt
> STDERR:
> 
> JavaTest Message: Test complete.
> 
> 
> ACTION: build -- Passed. All files up to date
> REASON: Named class compiled on demand
> TIME:   0.0 seconds
> messages:
> command: build compiler.graalunit.common.GraalUnitTestLauncher
> reason: Named class compiled on demand
> elapsed time (seconds): 0.0
> 
> ACTION: main -- Failed. Execution failed: `main' threw exception: java.lang.Exception: Failed to find tests, VM crashed with exit code 1
> REASON: User specified action: run main/othervm -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI compiler.graalunit.common.GraalUnitTestLauncher -prefix org.graalvm.compiler.core.test -exclude ExcludeList.txt -vmargs --add-opens=java.base/java.lang=ALL-UNNAMED 
> TIME:   0.166 seconds
> messages:
> command: main -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI compiler.graalunit.common.GraalUnitTestLauncher -prefix org.graalvm.compiler.core.test -exclude ExcludeList.txt -vmargs --add-opens=java.base/java.lang=ALL-UNNAMED
> reason: User specified action: run main/othervm -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI compiler.graalunit.common.GraalUnitTestLauncher -prefix org.graalvm.compiler.core.test -exclude ExcludeList.txt -vmargs --add-opens=java.base/java.lang=ALL-UNNAMED 
> Mode: othervm [/othervm specified]
> Additional options from @modules: --add-modules jdk.internal.vm.compiler
> elapsed time (seconds): 0.166
> configuration:
> Boot Layer
>  add modules: jdk.internal.vm.compiler
> 
> STDOUT:
> INFO: graal libs dir is '/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/graal'
> INFO: use following pattern to find tests: org\.graalvm\.compiler\.core\.test.*
> Command line: [/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk/bin/java -cp /home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0:/home/roland/jdk-jdk/test/hotspot/jtreg:/home/roland/tools/jtreg/build/images/jtreg/lib/javatest.jar:/home/roland/tools/jtreg/build/images/jtreg/lib/jtreg.jar -cp /home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0:/home/roland/jdk-jdk/test/hotspot/jtreg:/home/roland/tools/jtreg/build/images/jtreg/lib/javatest.jar:/home/roland/tools/jtreg/build/images/jtreg/lib/jtreg.jar:/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/graal/com.oracle.mxtool.junit.jar com.oracle.mxtool.junit.FindClassesByAnnotatedMethods /home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/graal/jdk.vm.compiler.tests.jar @Test ]
> INFO: run command /home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk/bin/java -cp /home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0:/home/roland/jdk-jdk/test/hotspot/jtreg:/home/roland/tools/jtreg/build/images/jtreg/lib/javatest.jar:/home/roland/tools/jtreg/build/images/jtreg/lib/jtreg.jar -cp /home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0:/home/roland/jdk-jdk/test/hotspot/jtreg:/home/roland/tools/jtreg/build/images/jtreg/lib/javatest.jar:/home/roland/tools/jtreg/build/images/jtreg/lib/jtreg.jar:/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/graal/com.oracle.mxtool.junit.jar com.oracle.mxtool.junit.FindClassesByAnnotatedMethods /home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/graal/jdk.vm.compiler.tests.jar @Test
> [2020-07-10T07:58:37.884779794Z] Gathering output for process 2096875
> [2020-07-10T07:58:37.901243107Z] Waiting for completion for process 2096875
> [2020-07-10T07:58:37.931042481Z] Waiting for completion finished for process 2096875
> STDERR:
> java.lang.Exception: Failed to find tests, VM crashed with exit code 1
> 	at compiler.graalunit.common.GraalUnitTestLauncher.getListOfTestsByPrefix(GraalUnitTestLauncher.java:125)
> 	at compiler.graalunit.common.GraalUnitTestLauncher.main(GraalUnitTestLauncher.java:223)
> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
> 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
> 	at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
> 	at java.base/java.lang.Thread.run(Thread.java:832)
> 
> JavaTest Message: Test threw exception: java.lang.Exception: Failed to find tests, VM crashed with exit code 1
> JavaTest Message: shutting down test
> 
> STATUS:Failed.`main' threw exception: java.lang.Exception: Failed to find tests, VM crashed with exit code 1
> rerun:
> cd /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/scratch/0 && \
> HOME=/home/roland \
> JDK8_HOME=/home/roland/jdk-14.0.1 \
> LANG=en_US.UTF-8 \
> LC_ALL=C \
> PATH=/bin:/usr/bin:/usr/sbin \
> TEST_IMAGE_DIR=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test \
> TEST_IMAGE_GRAAL_DIR=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/graal \
> XMODIFIERS=@im=ibus \
> CLASSPATH=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0:/home/roland/jdk-jdk/test/hotspot/jtreg:/home/roland/tools/jtreg/build/images/jtreg/lib/javatest.jar:/home/roland/tools/jtreg/build/images/jtreg/lib/jtreg.jar \
>    /home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk/bin/java \
>        -Dtest.vm.opts='-XX:MaxRAMPercentage=3 -Djava.io.tmpdir=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/tmp -server -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI' \
>        -Dtest.tool.vm.opts='-J-XX:MaxRAMPercentage=3 -J-Djava.io.tmpdir=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/tmp -J-server -J-XX:+UnlockExperimentalVMOptions -J-XX:+EnableJVMCI' \
>        -Dtest.compiler.opts= \
>        -Dtest.java.opts= \
>        -Dtest.jdk=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk \
>        -Dcompile.jdk=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/jdk \
>        -Dtest.timeout.factor=4.0 \
>        -Dtest.nativepath=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/native \
>        -Dtest.root=/home/roland/jdk-jdk/test/hotspot/jtreg \
>        -Dtest.name=compiler/graalunit/CoreTest.java \
>        -Dtest.file=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit/CoreTest.java \
>        -Dtest.src=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit \
>        -Dtest.src.path=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/test/lib:/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/test/hotspot/jtreg \
>        -Dtest.classes=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit/CoreTest.d \
>        -Dtest.class.path=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit/CoreTest.d:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0 \
>        -Dtest.class.path.prefix=/home/roland/jdk-jdk/test/hotspot/jtreg/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/test/lib:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0/compiler/graalunit:/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/classes/0 \
>        -Dtest.modules=jdk.internal.vm.compiler \
>        --add-modules jdk.internal.vm.compiler \
>        -XX:MaxRAMPercentage=3 \
>        -Djava.io.tmpdir=/home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/tmp \
>        -server \
>        -XX:+UnlockExperimentalVMOptions \
>        -XX:+EnableJVMCI \
>        -Djava.library.path=/home/roland/jdk-jdk/build/linux-x86_64-server-release/images/test/hotspot/jtreg/native \
>        -XX:+UnlockExperimentalVMOptions \
>        -XX:+EnableJVMCI \
>        com.sun.javatest.regtest.agent.MainWrapper /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/compiler/graalunit/CoreTest.d/main.0.jta -prefix org.graalvm.compiler.core.test -exclude ExcludeList.txt -vmargs --add-opens=java.base/java.lang=ALL-UNNAMED
> 
> TEST RESULT: Failed. Execution failed: `main' threw exception: java.lang.Exception: Failed to find tests, VM crashed with exit code 1
> --------------------------------------------------
> Test results: failed: 1
> Report written to /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-results/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java/html/report.html
> Results written to /home/roland/jdk-jdk/build/linux-x86_64-server-release/test-support/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java
> Error: Some tests failed or other problems occurred.
> Finished running test 'jtreg:test/hotspot/jtreg/compiler/graalunit/CoreTest.java'
> Test report is stored in build/linux-x86_64-server-release/test-results/jtreg_test_hotspot_jtreg_compiler_graalunit_CoreTest_java
> 
> ==============================
> Test summary
> ==============================
>   TEST                                              TOTAL  PASS  FAIL ERROR   
>   jtreg:test/hotspot/jtreg/compiler/graalunit/CoreTest.java
>>>                                                      1     0     1     0 <<
> ==============================
> TEST FAILURE
> 
> make[1]: *** [/home/roland/jdk-jdk/make/Init.gmk:319: main] Error 1
> make: *** [/home/roland/jdk-jdk/make/Init.gmk:186: run-test] Error 2
> 
> 


From jatin.bhateja at intel.com  Fri Jul 10 14:04:30 2020
From: jatin.bhateja at intel.com (Bhateja, Jatin)
Date: Fri, 10 Jul 2020 14:04:30 +0000
Subject: 8248830 : RFR[S] : C2 : Rotate API intrinsification for X86
Message-ID: <MWHPR11MB1614502C0DEC2B63D905D683E8650@MWHPR11MB1614.namprd11.prod.outlook.com>

Hi All,

The following patch adds intrinsification and vectorization support for 4 Java APIs:

  *   Integer.rotateLeft
  *   Integer.rotateRight
  *   Long.rotateLeft
  *   Long.rotateRight

JBS   : https://bugs.openjdk.java.net/browse/JDK-8248830
WebRev: http://cr.openjdk.java.net/~jbhateja/8248830/webrev.01/

AVX512 offers 8 new vector rotate instructions [1]; these can accept both immediate and variable rotate count
arguments. The patch exploits both flavors of these instructions.

The benchmark results follow.

Before:
UseAVX=3
Benchmark                         (SHIFT)  (TESTSIZE)   Mode  Cnt      Score   Error   Units
RotateBenchmark.testRotateLeftI        20        1024  thrpt    2  13336.170          ops/ms
RotateBenchmark.testRotateLeftL        20        1024  thrpt    2   8897.930          ops/ms
RotateBenchmark.testRotateRightI       20        1024  thrpt    2  13447.273          ops/ms
RotateBenchmark.testRotateRightL       20        1024  thrpt    2   8783.535          ops/ms

After:
UseAVX=3
Benchmark                         (SHIFT)  (TESTSIZE)   Mode  Cnt      Score   Error   Units
RotateBenchmark.testRotateLeftI        20        1024  thrpt    2  20438.609          ops/ms
RotateBenchmark.testRotateLeftL        20        1024  thrpt    2  11238.110          ops/ms
RotateBenchmark.testRotateRightI       20        1024  thrpt    2  20306.805          ops/ms
RotateBenchmark.testRotateRightL       20        1024  thrpt    2  11190.639          ops/ms
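
For reference, the relative change implied by the two tables can be worked out as below. This is only a sketch: the class and method names are made up for illustration, the scores are copied from the tables above (throughput mode, so higher is better), and with Cnt = 2 and no error column the ratios should be read as rough indications only.

```java
class RotateSpeedup {
    // Ratio of "after" throughput to "before" throughput (ops/ms).
    static double speedup(double before, double after) {
        return after / before;
    }

    public static void main(String[] args) {
        String[] names = {
            "testRotateLeftI", "testRotateLeftL",
            "testRotateRightI", "testRotateRightL"
        };
        // Scores copied from the Before/After tables (UseAVX=3).
        double[] before = {13336.170, 8897.930, 13447.273, 8783.535};
        double[] after  = {20438.609, 11238.110, 20306.805, 11190.639};

        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-17s %.2fx%n", names[i], speedup(before[i], after[i]));
        }
    }
}
```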

Kindly review the patch.

Best Regards,
Jatin

[1] : https://www.felixcloutier.com/x86/vprold:vprolvd:vprolq:vprolvq
             https://www.felixcloutier.com/x86/vprord:vprorvd:vprorq:vprorvq


From igor.ignatyev at oracle.com  Fri Jul 10 16:24:38 2020
From: igor.ignatyev at oracle.com (igor.ignatyev at oracle.com)
Date: Fri, 10 Jul 2020 09:24:38 -0700
Subject: RFR [15] : 8249019 : clean up FileInstaller $test.src $cwd in
 vmTestbase_vm_compiler tests
In-Reply-To: <a6084c8e-b5f7-6301-1d6d-1fedd4f11f33@oracle.com>
References: <a6084c8e-b5f7-6301-1d6d-1fedd4f11f33@oracle.com>
Message-ID: <4DE9F13E-2B55-42DA-ABCB-4CF5F6EE422A@oracle.com>

Thanks Katya.

Can I get a (R)eview?

-- Igor

> On Jul 9, 2020, at 1:44 PM, Ekaterina Pavlova <ekaterina.pavlova at oracle.com> wrote:
> 
> Looks good,
> 
> -katya
> 
> 
>> On 7/9/20 1:34 PM, Igor Ignatyev wrote:
>> http://cr.openjdk.java.net/~iignatyev//8249019/webrev.00
>>> 269 lines changed: 0 ins; 163 del; 106 mod
>> Hi all,
>> could you please review the patch which removes `FileInstaller . .` jtreg action from vmTestbase_vm_compiler tests?
>> from the main issue(8204985):
>>> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. some tests depend on this step, so we need 1st identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests.
>> some of vmTestbase_vm_compiler tests depend on FileInstaller, so they are left intact and will be updated separately.
>> testing: :vmTestbase_vm_compiler on linux-x64
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249019
>> webrev: http://cr.openjdk.java.net/~iignatyev//8249019/webrev.00
>> Thanks,
>> -- Igor
> 


From vladimir.x.ivanov at oracle.com  Fri Jul 10 16:26:33 2020
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Fri, 10 Jul 2020 19:26:33 +0300
Subject: [15] RFR (S): 8247502: PhaseStringOpts crashes while optimising
 effectively dead code
Message-ID: <9ee563ef-501b-bdaa-4e87-8e9e8aaf2dd7@oracle.com>

https://bugs.openjdk.java.net/browse/JDK-8247502
http://cr.openjdk.java.net/~vlivanov/8247502/webrev.00/

As Tobias discovered, PhaseStringOpts crashes when it encounters 
String::append() argument being TOP: TOP is a constant, but the code 
expects to see a String constant instead.

It happens while processing a call in an unreachable infinite loop. The 
code is effectively dead, but IGVN and PhaseRemoveUseless don't see 
that. It is discovered later, when loop opts kick in and clean it up.

The proposed fix tries to make the code more robust and just bails out of 
the optimization when TOP is encountered.

An alternative way to fix the problem would be to clean up the graph before 
PhaseStringOpts (e.g., by running PhaseIdealLoop(LoopOptsNone) since 
PhaseRemoveUseless is not enough), but a PhaseIdealLoop pass can be 
expensive. So, I'm in favor of the local fix in PhaseStringOpts.

Testing: crash reproducer, hs-precheckin-comp, hs-tier1, hs-tier2, tier1

Thanks!

PS: no regression test since I wasn't able to extract a simple 
reproducer from the crash log.

Best regards,
Vladimir Ivanov

From vladimir.x.ivanov at oracle.com  Fri Jul 10 17:32:00 2020
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Fri, 10 Jul 2020 20:32:00 +0300
Subject: 8248830 : RFR[S] : C2 : Rotate API intrinsification for X86
In-Reply-To: <MWHPR11MB1614502C0DEC2B63D905D683E8650@MWHPR11MB1614.namprd11.prod.outlook.com>
References: <MWHPR11MB1614502C0DEC2B63D905D683E8650@MWHPR11MB1614.namprd11.prod.outlook.com>
Message-ID: <9ad508ae-bf73-d4d7-ff2b-d1f4280adeff@oracle.com>


> WebRev: http://cr.openjdk.java.net/~jbhateja/8248830/webrev.01/

Nice work, Jatin!

High-level comment: so far, there was no pressing need to explicitly 
mark the methods as intrinsics. ROR/ROL instructions were selected 
during matching [1]. Now the patch introduces dedicated nodes 
(RotateLeft/RotateRight) specifically for intrinsics, which partly 
duplicates existing logic.

As a consequence, while ROL/ROR instructions can be utilized without 
using the dedicated API methods, auto-vectorization won't handle 
rotations unless the intrinsics are used.

It would be nice to unify the approaches and get rid of the duplication. 
(Either by folding scalar operations into Rotate nodes or by extending 
auto-vectorizer to detect vector rotates in a similar way scalar rotates 
are handled.)
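
To illustrate the two source-level shapes being discussed, here is a minimal sketch. The class and helper names are hypothetical; only Integer.rotateLeft and Long.rotateLeft are real APIs. The hand-written shift/or form is the kind of pattern a matcher can turn into a scalar rotate, while the loop calls the API method directly. Whether a given JIT vectorizes either loop depends on the compiler version and flags and is not asserted here.

```java
class RotateIdiom {
    // Hand-written rotate-left. Java masks int shift counts to 5 bits, so
    // (x >>> -n) behaves like (x >>> (32 - n)) and is also correct for n == 0.
    static int rotlIdiom(int x, int n) {
        return (x << n) | (x >>> -n);
    }

    // Same idiom for long; shift counts are masked to 6 bits.
    static long rotlIdiom(long x, int n) {
        return (x << n) | (x >>> -n);
    }

    // Loop over an array using the API method.
    static void rotateAllApi(int[] src, int[] dst, int shift) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = Integer.rotateLeft(src[i], shift);
        }
    }

    // Same loop written with the shift/or idiom instead of the API.
    static void rotateAllIdiom(int[] src, int[] dst, int shift) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = rotlIdiom(src[i], shift);
        }
    }

    public static void main(String[] args) {
        int[] src = {1, -1, 0x80000001, 12345};
        int[] a = new int[src.length];
        int[] b = new int[src.length];
        rotateAllApi(src, a, 20);
        rotateAllIdiom(src, b, 20);
        System.out.println(java.util.Arrays.equals(a, b));
    }
}
```

Both forms compute the same values; the difference raised in this review is only in how the compiler recognizes them.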

Otherwise, looks good. I'll submit it for testing.

Minor comments:

src/hotspot/share/opto/countbitsnode.hpp

Though the nodes look like they are in the right company, formally speaking, 
RotateLeft/RotateRight aren't subtypes of CountBitsNode. Maybe rename 
countbitsnode.hpp or move RotateLeft/RotateRight declarations to 
src/hotspot/share/opto/intrinsicnode.hpp?

Best regards,
Vladimir Ivanov

[1] 
http://hg.openjdk.java.net/jdk/jdk/file/tip/src/hotspot/cpu/x86/x86_64.ad#l8970
> 
> AVX512 offers 8 new vector rotate instructions [1], these can accept both immediate and variable rotate count
> arguments. Patch exploits both these flavors of instructions.
> 
> Following are the benchmarks results
> 
> Before:
> UseAVX=3
> Benchmark                         (SHIFT)  (TESTSIZE)   Mode  Cnt      Score   Error   Units
> RotateBenchmark.testRotateLeftI        20        1024  thrpt    2  13336.170          ops/ms
> RotateBenchmark.testRotateLeftL        20        1024  thrpt    2   8897.930          ops/ms
> RotateBenchmark.testRotateRightI       20        1024  thrpt    2  13447.273          ops/ms
> RotateBenchmark.testRotateRightL       20        1024  thrpt    2   8783.535          ops/ms
> 
> After:
> UseAVX=3
> Benchmark                         (SHIFT)  (TESTSIZE)   Mode  Cnt      Score   Error   Units
> RotateBenchmark.testRotateLeftI        20        1024  thrpt    2  20438.609          ops/ms
> RotateBenchmark.testRotateLeftL        20        1024  thrpt    2  11238.110          ops/ms
> RotateBenchmark.testRotateRightI       20        1024  thrpt    2  20306.805          ops/ms
> RotateBenchmark.testRotateRightL       20        1024  thrpt    2  11190.639          ops/ms
> 
> Kindly review the patch.
> 
> Best Regards,
> Jatin
> 
> [1] : https://www.felixcloutier.com/x86/vprold:vprolvd:vprolq:vprolvq
>               https://www.felixcloutier.com/x86/vprord:vprorvd:vprorq:vprorvq
> 
> 

From luhenry at microsoft.com  Fri Jul 10 17:58:42 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Fri, 10 Jul 2020 17:58:42 +0000
Subject: [aarch64-port-dev ] RFR(S): 8248676: AArch64: Add workaround for
 LITable constructor
In-Reply-To: <af179579-b8cd-db69-fede-802614f23092@redhat.com>
References: <MWHPR21MB05119533601E0AF1DE3CB0E9B0640@MWHPR21MB0511.namprd21.prod.outlook.com>,
 <af179579-b8cd-db69-fede-802614f23092@redhat.com>
Message-ID: <MWHPR21MB0511A82D28A3FE2A09E3E8ADB0650@MWHPR21MB0511.namprd21.prod.outlook.com>

Hi Andrew,

I uploaded a new webrev following your review.

Webrev: http://cr.openjdk.java.net/~burban/luhenry/8248676/webrev.01/
Testing: jtreg:test/hotspot/jtreg:tier1, jtreg:test/jdk:tier1, jtreg:test/jdk:tier2, jtreg:test/langtools on Linux-AArch64, no regressions

Thank you,

________________________________________
From: Andrew Haley <aph at redhat.com>
Sent: Friday, July 10, 2020 01:10
To: Ludovic Henry; hotspot-compiler-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
Cc: openjdk-aarch64
Subject: Re: [aarch64-port-dev ] RFR(S): 8248676: AArch64: Add workaround for LITable constructor

On 09/07/2020 21:31, Ludovic Henry wrote:
> JBS: https://bugs.openjdk.java.net/browse/JDK-8248676
> Webrev: http://cr.openjdk.java.net/~burban/luhenry/8248676/webrev.00/
> Testing: jtreg:test/hotspot/jtreg:tier1, jtreg:test/jdk:tier1, jtreg:test/jdk:tier2, jtreg:test/langtools on Linux-AArch64, no regressions.
>
> This small fix is in the context of the larger support for Windows-AArch64. The attribute `__attribute__ ((constructor))` is not supported by MSVC, and the documented workaround is to allocate an empty static struct with a constructor. This patch only applies this workaround when compiling on Windows, and leaves other platforms unchanged.

Please take out the #ifdef WINDOWS: we can use portable C++ here on all
platforms. Thanks,

--
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com/>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From vladimir.kozlov at oracle.com  Fri Jul 10 18:09:33 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 10 Jul 2020 11:09:33 -0700
Subject: RFR [15] : 8249019 : clean up FileInstaller $test.src $cwd in
 vmTestbase_vm_compiler tests
In-Reply-To: <4DE9F13E-2B55-42DA-ABCB-4CF5F6EE422A@oracle.com>
References: <a6084c8e-b5f7-6301-1d6d-1fedd4f11f33@oracle.com>
 <4DE9F13E-2B55-42DA-ABCB-4CF5F6EE422A@oracle.com>
Message-ID: <b0d25c76-be85-ae93-3552-06b903beb9e8@oracle.com>

Reviewed.

Vladimir K

On 7/10/20 9:24 AM, igor.ignatyev at oracle.com wrote:
> Thanks Katya.
> 
> Can I get a (R)eview?
> 
> -- Igor
> 
>> On Jul 9, 2020, at 1:44 PM, Ekaterina Pavlova <ekaterina.pavlova at oracle.com> wrote:
>>
>> Looks good,
>>
>> -katya
>>
>>
>>> On 7/9/20 1:34 PM, Igor Ignatyev wrote:
>>> http://cr.openjdk.java.net/~iignatyev//8249019/webrev.00
>>>> 269 lines changed: 0 ins; 163 del; 106 mod
>>> Hi all,
>>> could you please review the patch which removes `FileInstaller . .` jtreg action from vmTestbase_vm_compiler tests?
>>> from the main issue(8204985):
>>>> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. some tests depend on this step, so we need 1st identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests.
>>> some of vmTestbase_vm_compiler tests depend on FileInstaller, so they are left intact and will be updated separately.
>>> testing: :vmTestbase_vm_compiler on linux-x64
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249019
>>> webrev: http://cr.openjdk.java.net/~iignatyev//8249019/webrev.00
>>> Thanks,
>>> -- Igor
>>
> 

From vladimir.x.ivanov at oracle.com  Fri Jul 10 18:42:25 2020
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Fri, 10 Jul 2020 21:42:25 +0300
Subject: Stack allocation prototype for C2
In-Reply-To: <4C6D4959-00E1-4300-BE30-BB6FC60A491F@microsoft.com>
References: <4C6D4959-00E1-4300-BE30-BB6FC60A491F@microsoft.com>
Message-ID: <a301d141-2d56-588d-665c-56b8c9b67de7@oracle.com>

Hi Charlie,

> Thanks for reviewing the document and providing your feedback.

One request about improving the document: please, elaborate more on 
interactions with EA implementation in C2.

For example, stack allocation can be used for both non-scalarizable 
NoEscape and ArgEscape, but the latter requires GC barriers everywhere 
to check for stack allocated objects while in the former case it can be 
limited only to the current nmethod.

>>  From the design overview and the implementation, I'm concerned about
>> far-reaching consequences of the chosen approach. It's not limited just
>> to existing set of JVM features, but as Andrew noted will affect the
>> design of forthcoming functionality as well.
>>
>> I think it's worth starting a broad discussion (HotSpot-wide) and deciding
>> how much JVM design complexity budget it is worth spending on such an
>> optimization.
> 
> This is a great suggestion, where and how should we start this discussion
> to get feedback from the broader community?

I suggest to initiate a new discussion on hotspot-dev at ojn and stress 
that it's not just about optimizations in JIT-compilers, but a proposal 
to enable object allocations on thread stack and discuss the effects on 
other JVM subsystems and features.

>> As we discussed off-line (right after FOSDEM), I do see the benefits of
>> in-memory representation for non-escaping objects: memory aliasing
>> (either indeterminate base or indexed access) imposes inherent
>> constraints on the escape analysis (both partial and conservative
>> approaches suffer from it). Nevertheless, some of the problematic cases
>> can be addressed by improving existing approach or introducing a more
>> powerful analysis: covering more cases and making the analysis
>> control-sensitive should improve the situation.
> 
> We would like to work to improve escape analysis as per your suggestions above.
> If we can achieve the same allocation reductions with this solution, it would be a
> better long-term solution. We would like to continue reviewing stack allocation
> and start a sandbox project as Dalibor suggested, but work on improving escape
> analysis and measure against the sandbox for a baseline.

Good idea! Keeping the up-to-date patches in a sandbox repository would 
be very convenient.

>> Also, the alternative approach (called zone-based heap allocation) looks
>> very attractive to me. I haven't thought it through, but it looks like
>> keeping the objects on the Java heap can save us a lot of complexity on
>> the implementation side (more memory available for allocation - not
>> necessarily fixed amount, no need to migrate objects from stack to heap,
>> GC barriers are unaffected, etc.). For example, reserving a dedicated
>> TLAB (or a stack of TLABs?) and do nmethod-scoped allocations from C2
>> code looks attractive. It can simplify many aspects of the
>> implementation: much more space available, free migration of
>> non-escaping objects to heap on deoptimization.
> 
> We have been thinking about this idea since FOSDEM and we completely agree
> with the pros of zone-based allocation. The biggest benefits are the removal of
> the restrictions in compressed oops mode and that barriers would not have to be
> modified.
> 
> For this approach were you envisioning that objects allocated in a stack zone are
> pinned until the method returns? Also, while that zone memory is pinned the GC
> would not reclaim memory in that zone? That is what we were thinking, but we
> are worried about the complexity of the changes and restrictions it might add to
> the GC implementations.

Just want to reiterate that I haven't thought the idea through, but my 
educated guess is there should be a way to implement it in an optimistic 
way and mostly transparent to runtime and GCs.

Just a sketch of the idea:
   (1) JIT can optimistically use a dedicated TLAB in some scope (e.g., 
nmethod-based: record a watermark at nmethod entry for future use);

   (2) when leaving the scope (e.g., on nmethod exit), JIT can try to 
free allocated space (up to some watermark), but has to verify that some 
per-thread invariant still holds;

   (3) runtime can break the invariant at any time, but has to ensure 
that all allocated objects end up in Java heap.

For example (assuming all TLABs are allocated on-heap): using "the same 
zone TLAB is registered with the thread" as the invariant and 
de-registering zone TLAB with the thread (allocating new TLAB / 
resetting it to NULL) should do the job. Plus, there's an option to 
promote the zone TLAB to an ordinary TLAB, which may reduce heap waste.

So far, I don't see any major problems, but this is pending validation 
with an experiment to understand how efficient the proposed scheme is 
in reducing the allocation rate.
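
Steps (1)-(3) above can be pictured with a small Java simulation. This is 
only a sketch under stated assumptions, not HotSpot code: the class 
ZoneTlabSketch and the methods enterScope, exitScope and retireZone are 
hypothetical names, and a "TLAB" is modelled as a plain bump pointer.

```java
// Hypothetical simulation of the optimistic zone-TLAB idea; none of these
// names exist in HotSpot.
final class ZoneTlabSketch {
    /** Bump-pointer buffer standing in for a TLAB. */
    static final class Tlab {
        final int capacity;
        int top; // next free offset
        Tlab(int capacity) { this.capacity = capacity; }
    }

    /** What the JIT would record at scope (e.g. nmethod) entry. */
    static final class Scope {
        final Tlab tlab;     // zone TLAB registered at entry
        final int watermark; // allocation top at entry
        Scope(Tlab tlab, int watermark) { this.tlab = tlab; this.watermark = watermark; }
    }

    private Tlab zone; // zone TLAB currently registered with the "thread"

    ZoneTlabSketch(int capacity) { zone = new Tlab(capacity); }

    /** Step (1): record the registered zone TLAB and its watermark. */
    Scope enterScope() { return new Scope(zone, zone.top); }

    /** Bump allocation; returns the offset, or -1 if the zone is full. */
    int allocate(int size) {
        if (zone.top + size > zone.capacity) return -1;
        int offset = zone.top;
        zone.top += size;
        return offset;
    }

    /** Step (2): free back to the watermark only if the invariant still holds. */
    boolean exitScope(Scope s) {
        if (zone != s.tlab) return false; // invariant broken, nothing is freed
        zone.top = s.watermark;
        return true;
    }

    /** Step (3): the runtime breaks the invariant by registering a new zone TLAB. */
    void retireZone(int newCapacity) { zone = new Tlab(newCapacity); }

    int used() { return zone.top; }
}
```

The invariant check in exitScope is a single reference comparison, which is 
the "same zone TLAB is registered with the thread" test from the example 
above; objects in a retired zone are simply left where they are.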

> Another thought is about the added cost to method enter / exit. With the current
> on stack approach there is no added instructions for entering / exiting a method
> since the stack size is just larger. For the zone-based approach we would need to
> have a few more instructions on enter and exit to get the space from the zone TLAB
> and to return it. If the current zone TLAB is full we would need to do more work to
> get another one. Hopefully the common case of satisfying the space requirements
> from the current zone TLAB would on average be the same or less than the current
> TLAB checks for fast path allocations.

Allocating a TLAB per method looks wasteful: TLABs are normally quite 
large (hence more heap waste for deep thread stacks and large number of 
threads) and their allocation is expensive (requires a CAS).

> A final consideration is the footprint cost for project Loom. In the zone-based approach
> would each virtual thread (fibre) have its own zone TLAB (or stack of TLABs)? If each
> virtual thread had a zone TLAB it may lead to more frequent GCs because a significant
> portion of the heap is reserved for zone-based allocations.

IMO having a TLAB per virtual thread may cause too much waste: TLAB size 
can easily outweigh the footprint of the virtual thread itself.

Sharing a TLAB from a carrier thread may help, but it can't be used 
across possible freeze points.

So, I don't have a clear picture of what the best option will be there.

> We do not see any of these as showstoppers, but just be sure we have the full picture.

>> Another idea:
>>
>> "When dealing with stack allocated objects in loops we need a lifetime
>> overlap check."
>>
>> It doesn't look specific to stack-allocated objects. Non-overlapping
>> live ranges can be coalesced the same way for on-heap freshly allocated
>> objects. It should get comparable reduction in allocation pressure
>> (single allocation per loop vs allocation per iteration) and doesn't
>> require stack allocation support at all (as an example [1]).
>>
>> If such improvements are enabled for non-escaping on-heap objects, how
>> much benefit will stack allocation bring on top of that? IMO the
>> performance gap should become much narrower.
> 
> We agree, it's one of the first things we wanted to try after we submitted the initial stack
> allocation code for review. Again, our approach would be to have the current stack allocation
> prototype as a baseline and work to see if we can shrink the gap with other approaches.

Sounds good!

Best regards,
Vladimir Ivanov

From igor.ignatyev at oracle.com  Fri Jul 10 18:51:11 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Fri, 10 Jul 2020 11:51:11 -0700
Subject: RFR [15] : 8249019 : clean up FileInstaller $test.src $cwd in
 vmTestbase_vm_compiler tests
In-Reply-To: <b0d25c76-be85-ae93-3552-06b903beb9e8@oracle.com>
References: <a6084c8e-b5f7-6301-1d6d-1fedd4f11f33@oracle.com>
 <4DE9F13E-2B55-42DA-ABCB-4CF5F6EE422A@oracle.com>
 <b0d25c76-be85-ae93-3552-06b903beb9e8@oracle.com>
Message-ID: <E05C19AD-B017-4380-B3DD-E5A14B286919@oracle.com>

thanks Vladimir, pushed to jdk/jdk15.

-- Igor

> On Jul 10, 2020, at 11:09 AM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> Reviewed.
> 
> Vladimir K
> 
> On 7/10/20 9:24 AM, igor.ignatyev at oracle.com wrote:
>> Thanks Katya.
>> Can I get a (R)eview?
>> -- Igor
>>> On Jul 9, 2020, at 1:44 PM, Ekaterina Pavlova <ekaterina.pavlova at oracle.com> wrote:
>>> 
>>> Looks good,
>>> 
>>> -katya
>>> 
>>> 
>>>> On 7/9/20 1:34 PM, Igor Ignatyev wrote:
>>>> http://cr.openjdk.java.net/~iignatyev//8249019/webrev.00
>>>>> 269 lines changed: 0 ins; 163 del; 106 mod
>>>> Hi all,
>>>> could you please review the patch which removes `FileInstaller . .` jtreg action from vmTestbase_vm_compiler tests?
>>>> from the main issue(8204985):
>>>>> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. some tests depend on this step, so we need 1st identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests.
>>>> some of vmTestbase_vm_compiler tests depend on FileInstaller, so they are left intact and will be updated separately.
>>>> testing: :vmTestbase_vm_compiler on linux-x64
>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249019
>>>> webrev: http://cr.openjdk.java.net/~iignatyev//8249019/webrev.00
>>>> Thanks,
>>>> -- Igor
>>> 


From igor.ignatyev at oracle.com  Fri Jul 10 19:07:12 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Fri, 10 Jul 2020 12:07:12 -0700
Subject: RFR(S) [15] : 8249000 : vm.gc.X should take selected JIT into
 account
In-Reply-To: <dbf5a546-51e0-400f-94f7-5b528972f8fe@oracle.com>
References: <FFC49DC0-AA15-45B5-B285-E043D1AEAF97@oracle.com>
 <6964ac32-e9ec-d700-0bdb-ea51f4610afe@oracle.com>
 <7A1992A7-1493-4DF0-B621-195CE986D34F@oracle.com>
 <2c92a9a5-77af-c100-fa9b-f765e9d23dce@oracle.com>
 <BA1B1E12-19E3-45E3-8FDD-0307DDADDC3E@oracle.com>
 <dbf5a546-51e0-400f-94f7-5b528972f8fe@oracle.com>
Message-ID: <6409D2AD-173C-451A-814E-32C88860A5C5@oracle.com>

thanks Vladimir, pushed.
-- Igor

> On Jul 8, 2020, at 3:36 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> Good.
> 
> Thanks,
> Vladimir
> 
> On 7/8/20 11:40 AM, Igor Ignatyev wrote:
>> Thanks Vladimir.
>> for the record, I've updated ProblemList-graal.txt w/ the following:
>>> diff -r 14ffd658a23a test/hotspot/jtreg/ProblemList-graal.txt
>>> --- a/test/hotspot/jtreg/ProblemList-graal.txt  Wed Jul 08 11:35:30 2020 -0700
>>> +++ b/test/hotspot/jtreg/ProblemList-graal.txt  Wed Jul 08 11:37:44 2020 -0700
>>> @@ -229,6 +229,7 @@
>>>  compiler/loopopts/TestOverunrolling.java                8207267   generic-all
>>>  compiler/jsr292/NonInlinedCall/InvokeTest.java          8207267   generic-all
>>>  compiler/codegen/TestTrichotomyExpressions.java         8207267   generic-all
>>> +gc/stress/TestReclaimStringsLeaksMemory.java            8207267   generic-all
>>>    runtime/exceptionMsgs/AbstractMethodError/AbstractMethodErrorTest.java        8222582 generic-all
>> -- Igor
>>> On Jul 8, 2020, at 11:34 AM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>>> 
>>> Thank you, Igor
>>> 
>>> I got the difference between `vm.gc` and `vm.gc.X`.
>>> 
>>> In this case TestReclaimStringsLeaksMemory.java should be put into ProblemList-graal.txt with 8207267 to enable it with libgraal.  Current usage of !vm.graal.enabled in test is to skip this test with Java Graal because of its effect on the Java heap.
>>> 
>>> On 7/7/20 8:30 PM, Igor Ignatyev wrote:
>>>> Hi Vladimir,
>>>> thanks for your review!
>>>> `vm.gc` and `vm.gc.X`-s are different beasts (and admittedly, they confuse people a lot), `vm.gc` is set to "X", by jtreg itself, only if there is UseXGC in vm flags, otherwise it's "null". `vm.gc.X` are set by VMProps class, and you can have more than one vm.gc.X == true, as vm.gc.X means that X gc is supported by JVM and it can be selected; so if there are no Use.*GC in vm flags, vm.gc.X will yield true for all GCs which JVM was built with; if one of UseXGC is provided, only corresponding vm.gc.X is true, and all others are false. so to answer your questions, yes `vm.gc` can be "null" (if there are no Use.*GC) , and yes `vm.gc.Z & vm.gc.Serial & vm.gc == null` can be true (if there are no Use.*GC and JVM supports both Z and Serial GCs).
>>> 
>>> Interesting. I thought vmGC will list only one selected GC. That explains requires in TestZGCWithCDS.java.
>>> 
>>> You only need to add TestReclaimStringsLeaksMemory.java into ProblemList-graal.txt.
>>> 
>>> Thanks,
>>> Vladimir
>>> 
>>>> Thanks,
>>>> -- Igor
>>>>> On Jul 7, 2020, at 8:00 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>>>>> 
>>>>> Nice clean up, Igor
>>>>> 
>>>>> test/hotspot/jtreg/gc/stress/TestReclaimStringsLeaksMemory.java
>>>>> 
>>>>> Can we even have vm.gc=="null" based on the code in VMProps.java? At least some GC should be selected ergonomically even if none is specified on the command line.
>>>>> 
>>>>> - * @requires vm.gc=="null" & !vm.graal.enabled & !vm.debug
>>>>> + * @requires vm.gc == "null"
>>>>> + * @requires !vm.debug
>>>>> 
>>>>> 
>>>>> test/hotspot/jtreg/runtime/cds/appcds/TestZGCWithCDS.java
>>>>> 
>>>>> Does next combination of @requires ever work? I thought such sequence means 'AND' operation on all such conditions.
>>>>> 
>>>>>  * @requires vm.gc.Z
>>>>>  * @requires vm.gc.Serial
>>>>>  * @requires vm.gc == null
>>>>> 
>>>>> 
>>>>> Thanks,
>>>>> Vladimir
>>>>> 
>>>>> On 7/7/20 5:38 PM, Igor Ignatyev wrote:
>>>>>> http://cr.openjdk.java.net/~iignatyev/8249000/webrev.00/
>>>>>>> 241 lines changed: 34 ins; 5 del; 202 mod;
>>>>>> Hi all,
>>>>>> could you please review the patch which modifies requires/VMProps to set vm.gc.X to false if Graal is selected and X GC isn't supported by Graal?
>>>>>> the patch also replaces @requires similar to `vm.gc.X & !vm.graal.enabled` w/ `vm.gc.X` where it's applicable.
>>>>>> from JBS:
>>>>>>> not all GCs are supported by Graal JIT, which leads to failures like JDK-8247527 and boilerplate fixes like replacing all `@requires vm.gc.Z` w/ `@requires vm.gc.Z & !vm.graal.enabled`.
>>>>>>> 
>>>>>>> as vm.gc.X means that X GC can be selected, it would be more natural, less surprising, and much more clear to have it true if the selected JIT supports the said X GC.
>>>>>> webrev: http://cr.openjdk.java.net/~iignatyev/8249000/webrev.00/
>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249000
>>>>>> testing: test/hotspot/jtreg/{gc,compiler,runtime,serviceability} on {linux,windows,macos}-x64 w/ and w/o Graal as JIT
>>>>>> Thanks,
>>>>>> -- Igor
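
The difference between `vm.gc` and `vm.gc.X` described in this thread can be 
written down as a small Java sketch. It is an illustration only and is not 
the VMProps code: GcPropsSketch, vmGc and vmGcX are hypothetical names, and 
the set of GCs supported by Graal is passed in as a parameter because the 
thread does not list it.

```java
import java.util.Set;

// Hypothetical model of the two properties; not the jtreg or VMProps implementation.
final class GcPropsSketch {
    /** vm.gc: the GC named by a Use<X>GC flag, or the string "null" if no such flag is given. */
    static String vmGc(String selectedGc) {
        return selectedGc == null ? "null" : selectedGc;
    }

    /**
     * vm.gc.X: true if GC x is built into the JVM and can be selected.
     * With the proposed change it is also false when Graal is the JIT
     * and Graal does not support x.
     */
    static boolean vmGcX(String x, String selectedGc, Set<String> builtIn,
                         boolean graalEnabled, Set<String> graalSupported) {
        if (!builtIn.contains(x)) return false;                        // not built in
        if (selectedGc != null && !selectedGc.equals(x)) return false; // another GC was chosen
        if (graalEnabled && !graalSupported.contains(x)) return false; // JIT cannot work with x
        return true;
    }
}
```

With no Use.*GC flag (selectedGc == null) several vm.gc.X values can be true 
at once while vm.gc is "null", which is why the three @requires lines in 
TestZGCWithCDS.java can all hold together.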


From vladimir.kozlov at oracle.com  Fri Jul 10 22:56:54 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 10 Jul 2020 15:56:54 -0700
Subject: [16] RFFR(S) 8249165: Remove unneeded nops introduced by 8234160
 changes
Message-ID: <3baf8ea8-0ae3-1ce5-4d7a-0f524c53bb30@oracle.com>

https://cr.openjdk.java.net/~kvn/8249165/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8249165

Check for a branch instruction at the end of the code block when generating NOPs to align it.
I did not see significant difference in performance in our regular benchmarks (jvm2008, JBB) but I think it is still 
good to do.

Thanks,
Vladimir

From vladimir.kozlov at oracle.com  Fri Jul 10 23:19:32 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 10 Jul 2020 16:19:32 -0700
Subject: [15] RFR (S): 8247502: PhaseStringOpts crashes while optimising
 effectively dead code
In-Reply-To: <9ee563ef-501b-bdaa-4e87-8e9e8aaf2dd7@oracle.com>
References: <9ee563ef-501b-bdaa-4e87-8e9e8aaf2dd7@oracle.com>
Message-ID: <a1d56db5-cae7-f402-fed5-8cd13ee44450@oracle.com>

I agree with this small fix.

Thanks,
Vladimir

On 7/10/20 9:26 AM, Vladimir Ivanov wrote:
> https://bugs.openjdk.java.net/browse/JDK-8247502
> http://cr.openjdk.java.net/~vlivanov/8247502/webrev.00/
> 
> As Tobias discovered, PhaseStringOpts crashes when it encounters String::append() argument being TOP: TOP is a constant, 
> but the code expects to see a String constant instead.
> 
> It happens while processing a call in unreachable infinite loop. The code is effectively dead, but IGVN and 
> PhaseRemoveUseless don't see that. It is discovered later when loop opts kick in which clean it up.
> 
> Proposed fix tries to make the code more robust and just bails out the optimization when TOP is encountered.
> 
> Alternative way to fix the problem would be to clean up the graph before PhaseStringOpts (e.g., by running 
> PhaseIdealLoop(LoopOptsNone) since PhaseRemoveUseless is not enough), but PhaseIdealLoop pass can be expensive. So, I'm 
> in favor of the local fix in PhaseStringOpts.
> 
> Testing: crash reproducer, hs-precheckin-comp, hs-tier1, hs-tier2, tier1
> 
> Thanks!
> 
> PS: no regression test since I wasn't able to extract a simple reproducer from the crash log.
> 
> Best regards,
> Vladimir Ivanov

From vladimir.x.ivanov at oracle.com  Fri Jul 10 23:23:33 2020
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Sat, 11 Jul 2020 02:23:33 +0300
Subject: [16] RFFR(S) 8249165: Remove unneeded nops introduced by 8234160
 changes
In-Reply-To: <3baf8ea8-0ae3-1ce5-4d7a-0f524c53bb30@oracle.com>
References: <3baf8ea8-0ae3-1ce5-4d7a-0f524c53bb30@oracle.com>
Message-ID: <bfb8ad60-2bf4-1dfb-aee6-a6593a0d4b2a@oracle.com>

Looks good.

Best regards,
Vladimir Ivanov

On 11.07.2020 01:56, Vladimir Kozlov wrote:
> https://cr.openjdk.java.net/~kvn/8249165/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-8249165
> 
> Check for a branch instruction at the end of the code block when generating 
> NOPs to align it.
> I did not see significant difference in performance in our regular 
> benchmarks (jvm2008, JBB) but I think it is still good to do.
> 
> Thanks,
> Vladimir

From vladimir.kozlov at oracle.com  Fri Jul 10 23:25:03 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 10 Jul 2020 16:25:03 -0700
Subject: [16] RFR(S): 8248552: C2 crashes with SIGFPE due to division by
 zero
In-Reply-To: <70e8e42b-5cb3-9c1e-419e-2f771f042368@oracle.com>
References: <70e8e42b-5cb3-9c1e-419e-2f771f042368@oracle.com>
Message-ID: <3ba2ef6a-8ade-7ede-5252-21051c34b472@oracle.com>

Looks good.

Thanks,
Vladimir

On 7/10/20 12:37 AM, Christian Hagedorn wrote:
> Hi
> 
> Please review the following patch:
> https://bugs.openjdk.java.net/browse/JDK-8248552
> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.00/
> 
> In the failing testcase, C2 removes a zero check for a division/modulo node n based on the type information of the loop 
> induction variable phi p (always between 1 and 50 and never 0). However, n is later split through p and ends up after 
> the AddNode which updates the induction variable p. In the last iteration j equals 2 and is then updated to 0. The 
> division/modulo node n is now executed before the loop limit check which results in a SIGFPE.
> 
> The fix bails out of PhaseIdealLoop::split_thru_phi if a division or modulo node has its zero check removed (i.e. 
> control is NULL) and is split through a phi which has an input that could be zero. This should only happen for an 
> induction variable phi of a trip-counted (integer) loop.
> 
> Best regards,
> Christian
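
The ordering problem described above can be pictured in plain Java. This is 
a hand-written illustration, not the failing test case and not what C2 
emits: SplitThruPhiSketch and both method names are hypothetical, and in 
Java source the bad ordering surfaces as an ArithmeticException, whereas 
compiled code with the zero check removed hits SIGFPE.

```java
// Hypothetical picture of a division ending up after the induction variable update.
final class SplitThruPhiSketch {
    /** Safe order: the divisor j is used while it is still in [2, 50]. */
    static int divideBeforeUpdate(int x) {
        int sum = 0;
        for (int j = 50; j > 0; j -= 2) {
            sum += x / j; // j is never 0 here, so a zero check is redundant
        }
        return sum;
    }

    /** Bad order: the division sits after the update and before the exit check. */
    static int divideAfterUpdate(int x) {
        int sum = 0;
        int j = 50;
        while (true) {
            j -= 2;            // update of the induction variable (the AddNode)
            sum += x / j;      // on the last trip j was 2 and is now 0
            if (j <= 0) break; // the loop limit check comes too late
        }
        return sum;
    }
}
```

The type of the induction variable phi excludes 0, but the value after the 
update does not, which is why the fix refuses to split such a node through 
the phi.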

From vladimir.kozlov at oracle.com  Fri Jul 10 23:26:09 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 10 Jul 2020 16:26:09 -0700
Subject: [16] RFFR(S) 8249165: Remove unneeded nops introduced by 8234160
 changes
In-Reply-To: <bfb8ad60-2bf4-1dfb-aee6-a6593a0d4b2a@oracle.com>
References: <3baf8ea8-0ae3-1ce5-4d7a-0f524c53bb30@oracle.com>
 <bfb8ad60-2bf4-1dfb-aee6-a6593a0d4b2a@oracle.com>
Message-ID: <a9e73e85-6cd8-3f65-c864-ab3550a17097@oracle.com>

Thank you, Vladimir

On 7/10/20 4:23 PM, Vladimir Ivanov wrote:
> Looks good.
> 
> Best regards,
> Vladimir Ivanov
> 
> On 11.07.2020 01:56, Vladimir Kozlov wrote:
>> https://cr.openjdk.java.net/~kvn/8249165/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8249165
>>
>> Check for a branch instruction at the end of the code block when generating NOPs to align it.
>> I did not see significant difference in performance in our regular benchmarks (jvm2008, JBB) but I think it is still 
>> good to do.
>>
>> Thanks,
>> Vladimir

From vladimir.kozlov at oracle.com  Fri Jul 10 23:32:36 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 10 Jul 2020 16:32:36 -0700
Subject: RFR(M): 8229495: SIGILL in C2 generated OSR compilation
In-Reply-To: <c4a2ce8a-2084-e4c1-d1a6-ab057eabe448@oracle.com>
References: <3b720427-d718-5d1c-dbe9-6149a21883af@oracle.com>
 <87r1topriw.fsf@redhat.com> <84b2c86d-c7e6-7945-dae5-db1d8efe6f25@oracle.com>
 <87sge0oqv8.fsf@redhat.com> <c4a2ce8a-2084-e4c1-d1a6-ab057eabe448@oracle.com>
Message-ID: <da629280-b591-1095-48d4-ad155d2d1992@oracle.com>

On 7/9/20 5:16 AM, Christian Hagedorn wrote:
> Hi Roland
> 
> On 09.07.20 13:43, Roland Westrelin wrote:
>> new webrev:
>> http://cr.openjdk.java.net/~roland/8229495/webrev.01/
> 
> That looks good to me!

+1

Thanks,
Vladimir K

> 
>>> I submitted some testing.
>>
>> Thanks.
> 
> An extended testing was completed successfully (up to tier7).
> 
>>> While at it, you might want to consider to update other uses of the
>>> pattern Opcode() == Op_Opaque1 by is_Opaque1() as well like in
>>> loopTransform.cpp:
>>>
>>> 1158     assert(iff->in(1)->in(1)->Opcode() == Op_Opaque1, "unexpected
>>> predicate shape");
>>
>> Except in this case it really is an Opaque1 instead of a subclass so
>> using is_Opaque1() would weaken the assert.
> 
> You're right, I have not thought about that - then better leave it as it is.
> 
>>> I observed a Java Fuzzer crash ("fatal error: DEBUG MESSAGE: duplicated
>>> predicate failed which is impossible") this weekend which looked very
>>> similar to this bug and indeed it could be fixed with your patch. You
>>> could add it as additional testcase. Here is the simplified code and the
>>> command line I used to reproduce it.
>>
>> Thanks for test case. I included it in the new webrev.
> 
> Great, thanks for adding it.
> 
> Best regards,
> Christian

From vladimir.kozlov at oracle.com  Sat Jul 11 00:06:15 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 10 Jul 2020 17:06:15 -0700
Subject: RFR(s): 8236042: [TESTBUG] serviceability/sa/ClhsdbCDSCore.java
 fails with -Xcomp -XX:TieredStopAtLevel=1
In-Reply-To: <958fecdf-d7a1-4b22-835e-a75fadda0a84@default>
References: <2abe9fba-e958-4b34-9f92-6bb8d8478f4e@default>
 <70057c31-e535-f03a-391d-d181b2ec150b@oracle.com>
 <958fecdf-d7a1-4b22-835e-a75fadda0a84@default>
Message-ID: <37bb4585-21b5-a6e7-9ee1-88ccf9be0914@oracle.com>

Fix is good.

I think these are the reasons you don't get an MDO in this scenario.

Tier1 (C1 compilation) does not generate profiling code and does not create an MDO. C1 requests an MDO only with tiers 2 
and 3 [1][2].

With the -Xcomp flag a Java method is not executed in the Interpreter; instead it requests its compilation and waits 
until it is finished. As a result, the MDO is not created in the Interpreter either. Maybe later, if a method is 
deoptimized, it will be executed in the Interpreter and an MDO will be created.

Thanks,
Vladimir

[1] http://hg.openjdk.java.net/jdk/jdk/file/796c9fa50850/src/hotspot/share/c1/c1_Compilation.hpp#l226
[2] http://hg.openjdk.java.net/jdk/jdk/file/796c9fa50850/src/hotspot/share/c1/c1_Compilation.cpp#l381

On 7/7/20 8:47 PM, Fairoz Matte wrote:
> Thanks Chris, for the review comments.
> 
> I have updated the suggested change.
> 
> Thanks,
> Fairoz
> 
>> -----Original Message-----
>> From: Chris Plummer
>> Sent: Wednesday, July 8, 2020 3:38 AM
>> To: Fairoz Matte <fairoz.matte at oracle.com>; hotspot-compiler-
>> dev at openjdk.java.net; serviceability-dev at openjdk.java.net
>> Subject: Re: RFR(s): 8236042: [TESTBUG] serviceability/sa/ClhsdbCDSCore.java
>> fails with -Xcomp -XX:TieredStopAtLevel=1
>>
>> Hi Fairoz,
>>
>> Looks good, except for the missing space in "if(testJavaOpts...".
>>
>> thanks,
>>
>> Chris
>>
>> On 7/7/20 7:49 AM, Fairoz Matte wrote:
>>> Hi,
>>>
>>> Please review this small test change to consider the scenario when there is no
>> "printmdo" output
>>>
>>> JBS - https://bugs.openjdk.java.net/browse/JDK-8236042
>>> Webrev - http://cr.openjdk.java.net/~fmatte/8236042/webrev.00/
>>>
>>> Thanks,
>>> Fairoz
>>

From fairoz.matte at oracle.com  Sat Jul 11 03:10:54 2020
From: fairoz.matte at oracle.com (Fairoz Matte)
Date: Fri, 10 Jul 2020 20:10:54 -0700 (PDT)
Subject: RFR(s): 8236042: [TESTBUG] serviceability/sa/ClhsdbCDSCore.java
 fails with -Xcomp -XX:TieredStopAtLevel=1
In-Reply-To: <37bb4585-21b5-a6e7-9ee1-88ccf9be0914@oracle.com>
References: <2abe9fba-e958-4b34-9f92-6bb8d8478f4e@default>
 <70057c31-e535-f03a-391d-d181b2ec150b@oracle.com>
 <958fecdf-d7a1-4b22-835e-a75fadda0a84@default>
 <37bb4585-21b5-a6e7-9ee1-88ccf9be0914@oracle.com>
Message-ID: <afed8263-1079-4ff7-be84-46be8a1aa2f8@default>

Thanks Vladimir for the review.

Thanks for mentioning the reasons for MDOs not being generated; I have added them as a comment in the bug for future reference.

Thanks,
Fairoz

> -----Original Message-----
> From: Vladimir Kozlov
> Sent: Saturday, July 11, 2020 5:36 AM
> To: Fairoz Matte <fairoz.matte at oracle.com>; Chris Plummer
> <chris.plummer at oracle.com>; hotspot-compiler-dev at openjdk.java.net;
> serviceability-dev at openjdk.java.net
> Subject: Re: RFR(s): 8236042: [TESTBUG] serviceability/sa/ClhsdbCDSCore.java
> fails with -Xcomp -XX:TieredStopAtLevel=1
> 
> Fix is good.
> 
> I think these are the reasons you don't get an MDO in this scenario.
> 
> Tier1 (C1 compilation) does not generate profiling code and does not create
> an MDO. C1 requests an MDO only with tiers 2 and 3 [1][2].
> 
> With the -Xcomp flag a Java method is not executed in the Interpreter; instead it
> requests its compilation and waits until it is finished. As a result, the MDO is not
> created in the Interpreter either. Maybe later, if a method is deoptimized, it will be
> executed in the Interpreter and an MDO will be created.
> 
> Thanks,
> Vladimir
> 
> [1]
> http://hg.openjdk.java.net/jdk/jdk/file/796c9fa50850/src/hotspot/share/c1/c1_
> Compilation.hpp#l226
> [2]
> http://hg.openjdk.java.net/jdk/jdk/file/796c9fa50850/src/hotspot/share/c1/c1_
> Compilation.cpp#l381
> 
> On 7/7/20 8:47 PM, Fairoz Matte wrote:
> > Thanks Chris, for the review comments.
> >
> > I have updated the suggested change.
> >
> > Thanks,
> > Fairoz
> >
> >> -----Original Message-----
> >> From: Chris Plummer
> >> Sent: Wednesday, July 8, 2020 3:38 AM
> >> To: Fairoz Matte <fairoz.matte at oracle.com>; hotspot-compiler-
> >> dev at openjdk.java.net; serviceability-dev at openjdk.java.net
> >> Subject: Re: RFR(s): 8236042: [TESTBUG]
> >> serviceability/sa/ClhsdbCDSCore.java
> >> fails with -Xcomp -XX:TieredStopAtLevel=1
> >>
> >> Hi Fairoz,
> >>
> >> Looks good, except for the missing space in "if(testJavaOpts...".
> >>
> >> thanks,
> >>
> >> Chris
> >>
> >> On 7/7/20 7:49 AM, Fairoz Matte wrote:
> >>> Hi,
> >>>
> >>> Please review this small test change to consider the scenario when
> >>> there is no
> >> "printmdo" output
> >>>
> >>> JBS - https://bugs.openjdk.java.net/browse/JDK-8236042
> >>> Webrev - http://cr.openjdk.java.net/~fmatte/8236042/webrev.00/
> >>>
> >>> Thanks,
> >>> Fairoz
> >>

From aph at redhat.com  Sat Jul 11 08:54:07 2020
From: aph at redhat.com (Andrew Haley)
Date: Sat, 11 Jul 2020 09:54:07 +0100
Subject: 8248830 : RFR[S] : C2 : Rotate API intrinsification for X86
In-Reply-To: <9ad508ae-bf73-d4d7-ff2b-d1f4280adeff@oracle.com>
References: <MWHPR11MB1614502C0DEC2B63D905D683E8650@MWHPR11MB1614.namprd11.prod.outlook.com>
 <9ad508ae-bf73-d4d7-ff2b-d1f4280adeff@oracle.com>
Message-ID: <d45977a6-f1d6-d204-4554-d1da94122066@redhat.com>

On 10/07/2020 18:32, Vladimir Ivanov wrote:

 > High-level comment: so far, there were no pressing need in
 > explicitly marking the methods as intrinsics. ROR/ROL instructions
 > were selected during matching [1]. Now the patch introduces
 > dedicated nodes (RotateLeft/RotateRight) specifically for intrinsics
 > which partly duplicates existing logic.

The lack of rotate nodes in the IR has always meant that AArch64 doesn't
generate optimal code for e.g.

   (Set dst (XorL reg1 (RotateLeftL reg2 imm)))

because, with the RotateLeft expanded to its full combination of ORs
and shifts, it's too complicated to match. At the time I put this to
one side because it wasn't urgent. This is a shame because although
such combinations are unusual they are used in some crypto operations.

If we can generate immediate-form rotate nodes early by pattern
matching during parsing (rather than depending on intrinsics) we'll
get more value than by depending on programmers calling intrinsics.
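
For reference, this is the kind of Java source that is at stake. It is a 
minimal sketch with hypothetical names (RotateSketch, rotlByShifts, 
xorRotl); it only shows that the shift-and-OR form and the library call 
compute the same value and makes no claim about what any JIT emits for them.

```java
// Hypothetical helper class illustrating two spellings of a 64-bit rotate.
final class RotateSketch {
    /** Rotate left written as shifts and an OR, the shape a matcher has to recognize. */
    static long rotlByShifts(long x, int n) {
        // Java masks long shift distances to 6 bits, so -n behaves like 64 - n.
        return (x << n) | (x >>> -n);
    }

    /** Source-level counterpart of (XorL reg1 (RotateLeftL reg2 imm)). */
    static long xorRotl(long a, long b, int n) {
        return a ^ Long.rotateLeft(b, n);
    }
}
```

A crypto-style inner loop typically contains expressions of the xorRotl 
shape with a constant distance, written either way.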

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From david.holmes at oracle.com  Mon Jul 13 04:07:35 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 13 Jul 2020 14:07:35 +1000
Subject: RFR(XS) 8248671: AArch64: Remove unused variables
In-Reply-To: <DM6PR21MB14529CCD3BDC0E3AA1AA38A1C2640@DM6PR21MB1452.namprd21.prod.outlook.com>
References: <DM6PR21MB14529CCD3BDC0E3AA1AA38A1C2640@DM6PR21MB1452.namprd21.prod.outlook.com>
Message-ID: <1c652b56-2476-ede0-47f8-13c4e99639d0@oracle.com>

Hi Bernhard,

On 10/07/2020 7:08 am, Bernhard Urban-Forster wrote:
> Hello everyone,
> 
> 
> please review this change:
> 
> JBS: https://bugs.openjdk.java.net/browse/JDK-8248671
> Webrev: http://cr.openjdk.java.net/~burban/8248671_unused-vars/
> 
> We found this issue while bringing up Windows+AArch64 support for HotSpot. The Microsoft toolchain (MSVC) seems to be slightly more pedantic than GCC.

Looks good and trivial.

But could I request that webrevs/patches for mainline be generated 
against the mainline hg repository rather than the git mirror?

Thanks,
David

> 
> Thanks,
> -Bernhard
> 

From jamsheed.c.m at oracle.com  Mon Jul 13 05:44:02 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Mon, 13 Jul 2020 11:14:02 +0530
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
In-Reply-To: <8325fac5-6258-9b88-6507-5dcb0597cc17@oracle.com>
References: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
 <8325fac5-6258-9b88-6507-5dcb0597cc17@oracle.com>
Message-ID: <8522a69e-e538-2cc9-5364-887e450fc653@oracle.com>

Hi,

I reworked the fix. I compute the offset for all init captured stores, but 
treat these special init captured stores like unsafe accesses (as these 
objects are usually GlobalEscape, this doesn't have any perf implications).

revised webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.01/

testing: mach1-5( logs in jbs)

Best regards,

Jamsheed

On 09/07/2020 19:36, Jamsheed C M wrote:
>
> Hi,
>
> Request to hold the review. I need to change the code for dealing with 
> unsafe access, as the current capture code spends more execution time 
> analyzing things.
>
> Best regards,
>
> Jamsheed
>
> On 09/07/2020 13:01, Jamsheed C M wrote:
>>
>> Hi all,
>>
>> JBS:https://bugs.openjdk.java.net/browse/JDK-8242895
>>
>> Request for review of changes made to offset computation and field write 
>> detection for init captured stores, needed due to phis added between alloc 
>> and init. This happens if the init node is in a different outer loop wrt the 
>> alloc node and there is a loop opt. This was required as a result of 
>> enhancement [1].
>>
>> Normally inits are not associated with multiple alloc nodes during the EA 
>> phase, but changes done for [1] caused code shapes of the form 
>> [2] to generate inits associated with multiple alloc nodes.
>>
>> This had implications for offset computation and field write detection 
>> related to initializing stores.
>>
>> Attempt to fix in EA:
>>
>>      webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA/
>>
>> Alternate fix:
>>
>>      Minimize the scenario in compiler generated code by throwing 
>> only j.l.Error from slowpath (all exceptions, async/sync, are handled in 
>> runtime exit).
>>
>>      Stub epilog doesn't poll or throw any exceptions. Disable full 
>> loop opt before EA for detectable patterns and bail out of EA for late 
>> detected patterns.
>>
>>      webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_deopt/
>>
>> Please advise.
>>
>> Testing : mach tier1-5 (logs in jbs)
>>
>> Best regards,
>>
>> Jamsheed
>>
>>
>> [1] JDK-8231291 <https://bugs.openjdk.java.net/browse/JDK-8231291>C2: 
>> loop opts before EA should maximally unroll loops
>>
>> [2] that have its init node in different outer loop wrt to alloc node.
>>
>>
>> loop begin
>>
>>    try {
>>
>>    return new obj() / throw new obj() / uncommon trap after 
>> allocation, in a loop
>>
>>    } catch(ex) {
>>
>>    }
>>
>> loop end
>>
>>     public static IntA test(int n) {
>>         for (int i=0; i<2; i++) {
>>             try {
>>                   return new IntA(n + i);
>>             } catch (Exception e) {
>>             }
>>         }
>>
>>

From richard.reingruber at sap.com  Mon Jul 13 06:42:13 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Mon, 13 Jul 2020 06:42:13 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <DB6PR0201MB2152AF18921A375D26A76D89ECA40@DB6PR0201MB2152.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
 <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
 <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331EEF36942FCEBA7E131389BCB0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB329746F57D1C78F14000CB799AC80@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331D64C693490FD0746D1989BC90@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <DB6PR0201MB2152AF18921A375D26A76D89ECA40@DB6PR0201MB2152.eurprd02.prod.outlook.com>
Message-ID: <AM0PR0202MB3331FF18BED42A71796488E59B600@AM0PR0202MB3331.eurprd02.prod.outlook.com>

Hi Goetz,

thanks for looking at this!

And my apologies for taking that long...

So here is the new webrev.6

Webrev.6: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6/
Delta:    http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.inc/

I spent most of the time running a microbenchmark [1] I wrote to answer questions from your
review. At first I had trouble with variance in the results until I found out it was due to the NUMA
architecture of the server I used. After that I noticed that there was a performance regression of
about 5% even at low agent activity. I finally found out that it was due to the implementation of
JavaThread::wait_for_object_deoptimization() which is called by the target of the JVMTI operation to
self suspend for object deoptimization. I fixed this by adding limited spinning before calling
wait() on the monitor.

The delta includes many changes in comments, renaming of names, etc. So I'd like to summarize
functional changes:

* Collected all the code for the testing feature DeoptimizeObjectsALot in compileBroker.cpp and
  reworked it.

  With DeoptimizeObjectsALot enabled, internal threads are started that deoptimize frames and
  objects. The number of threads started is given by DeoptimizeObjectsALotThreadCountAll and
  DeoptimizeObjectsALotThreadCountSingle. The former targets all existing threads whereas the
  latter operates on a single thread selected round robin.

  I removed the mode where deoptimizations were performed at every nth exit from the runtime. I
  never used it.

* EscapeBarrier::sync_and_suspend_one(): use a direct handshake and execute it always independently
  of is_thread_fully_suspended().

* Bugfix in EscapeBarrier::thread_added(): must not clear deopt flag. Found this testing with
  DeoptimizeObjectsALot.

* Added EscapeBarrier::thread_removed().

* EscapeBarrier constructors: barriers can now be entirely disabled by disabling DoEscapeAnalysis.
  This effectively disables the enhancement.

* JavaThread::wait_for_object_deoptimization():

  - Bugfix: the last check of is_obj_deopt_suspend() must be /after/ the safepoint check! This
    caused issues with non-walkable stacks with DeoptimizeObjectsALot.

  - Added limited spinning inspired by HandshakeSpinYield to fix regression in microbenchmark [1]

I refer to some more changes answering your questions and comments inline below.

Thanks,
Richard.

[1] Microbenchmark: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/

> Hi Richard,
> 
> I had a look at your change.  It's complex, but not that big.
> A lot of code is just passing info through layers of abstraction.

Also it leverages preexisting functionality like materialization of virtual objects in non-top
frames (see materializeVirtualObjects).

> Also, one can tell this went through some iterations by now, 
> I think it's very well engineered.
> I had a look at webrev.05
> 
> Unfortunately
> "8242425: JVMTI monitor operations should use Thread-Local Handshakes" 
> breaks webrev.05.
> I updated to before that change and took that as base of my review.
> 
> I see four parts of the change that can be looked at
> rather individually.
> 
>  * Refactoring the scopeDesc constructors. Trivial.
>  * Persisting information about the optimizations done by the compilers.
>    Large and mostly trivial.
>  * Deoptimizing. The most complicated part. Really well abstracted, though.
>  * DeoptimizeObjectsALot for testing and the tests.
> 
> Review of compiler changes:
> 
> I understand you annotate at safepoints where the escape analysis
> finds out that an object is "better" than global escape. 
> These are the cases where the analysis identifies optimization 
> opportunities. These annotations are then used to deoptimize
> frames and the objects referenced by them.
> Doesn't this overestimate the optimized 
> objects?  E.g., eliminate_alloc_node has many cases where it bails
> out.

Yes, the implementation is conservative, but it is comparatively simple and the additional debug
info is just 2 flags per safepoint. On the other hand, those JVMTI operations that really trigger
deoptimizations are expected to be comparatively infrequent such that switching to the interpreter
for a few microseconds will hardly have an effect.
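
As an illustration of how small "2 flags per safepoint" is, here is a sketch of two bit flags packed
into one byte. The accessor names follow the ones discussed in this thread; the class name
SafepointEscapeFlags, the bit positions and the storage type are assumptions made for the example and
are not taken from the webrev.

```cpp
#include <cstdint>

// Hypothetical model of the two per-safepoint flags (layout is illustrative only).
class SafepointEscapeFlags {
 public:
  void set_not_global_escape_in_scope(bool b) { set(kNotGlobalEscapeInScope, b); }
  void set_arg_escape(bool b)                 { set(kArgEscape, b); }

  // True if some object in scope does not escape globally.
  bool not_global_escape_in_scope() const { return (_flags & kNotGlobalEscapeInScope) != 0; }
  // True if a call at this safepoint passes such an object to its callee.
  bool arg_escape() const                 { return (_flags & kArgEscape) != 0; }

 private:
  static constexpr std::uint8_t kNotGlobalEscapeInScope = 1u << 0;
  static constexpr std::uint8_t kArgEscape              = 1u << 1;

  // Sets or clears a single bit without touching the other one.
  void set(std::uint8_t bit, bool b) {
    _flags = static_cast<std::uint8_t>(b ? (_flags | bit) : (_flags & ~bit));
  }

  std::uint8_t _flags = 0;
};
```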

I've done microbenchmarking to check this.

http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/

I found that in the worst case performance can be impacted by 10%. If the agent is extremely active
and does relevant JVMTI calls like GetOwnedMonitorStackDepthInfo() every millisecond or more often,
then the performance impact can be 30%. But I would think that this is not realistic. These calls
are issued in interactive sessions to analyze deadlocks.

We could get more precise deoptimizations by adding a third flag per safepoint for ea-local objects
among the owned monitors. This would help improve the worst case in the benchmark. But I'm not
convinced that it is worth it.

Refer to the README.txt of the microbenchmark for a more detailed discussion.

> c1_IR.hpp   
> 
> OK, nothing to do for C1, just adapt to extended method signature.
> 
> Break line once more so that it matches above line length.

Done.

> ciEnv.h|cpp
> 
> Pass through another jvmti capability.  Trivial & good.
> 
> 
> debugInfoRec.hpp
> 
> Pass through escape info that must be recorded. OK.
> 
> pcDesc.hpp
> 
> I would like to see some documentation of the methods.
>
> Maybe:
>   // There is an object in the scope that does not escape globally.
>   // It either does not escape at all or it escapes as argument.
> and
>   // One of the arguments is an object that is not globally visible
>   // but escapes to the callee.

Done. I didn't take your text, though, because I only noticed it after writing my own. Let me know
if you are not ok with it.

> scopeDesc.cpp
> 
>   Besides refactoring copy escape info from pcDesc to scopeDesc
>   and add accessors. Trivial.
> 
>   In scopeDesc.hpp you talk about NoEscape and ArgEscape. 
>   These are opto terms, but scopeDesc is a shared data structure
>   that does not depend on a specific compiler. 
>   Please explain what is going on without using these terms.

Actually these terms are not too opto-specific. They are used in the paper referenced in
escape.hpp. Also you can easily google them. I'd rather keep the comments as they are.

> jvmciCodeInstaller.cpp
> 
>   OK, nothing for JVMCI. Here support for Object Optimizations 
>   for JVMCI compilers could be added. Leave this to graal people.
> 
> callnode.hpp
> 
> You add functionality to annotate callnodes with escape information 
> This is carried through code generation to final output where it is
> added to the compiled methods meta information.
> 
> At Safepoints in general jvmti can access
>   - Objects that were scalar replaced. They must be reallocated.
>     (Flag EliminateAllocations)
>   - Objects that should be locked but are not because they never 
>     escape the thread. They need to be relocked.
> 
> At calls, Objects where locks have been removed escape to callees.
> We must persist this information so that if jvmti accesses the 
> object in a callee, we can determine by looking at the caller that
> it needs to be relocked.

Note that the ea-optimization need not be at the current location; it can also follow when control
returns to the caller. Lock elimination isn't the only relevant optimization. Accesses to instance
members or array elements can be optimized as well.

> A side comment: 
> I think the flag handling in Opto is not very intuitive.
> DoEscapeAnalysis depends on the jvmti capabilities.
> This makes no sense. It is only an analysis. The optimizations
> should depend on the jvmti capabilities.
> The correct setup would be to handle this in 
> CompilerConfig::ergo_initialize():
> If the jvmti capabilities allow, enable the optimizations 
> EliminateAllocations or  EliminateLocks/EliminateNestedLocks.
> If one of these optimizations is on, enable EscapeAnalysis.
>  -- end side comment.
>
> So I would propose the following comments:
> 
>   // In the scope of this safepoint there are objects
>   // that do not globally escape. They are either NoEscape or
>   // ArgEscape. As such, they might be subject to optimizations.
>   // Persist this information here so that the frame and the
>   // Objects in scope can 
>   // be deoptimized if jvmti accesses an object at this safepoint.
>   void set_not_global_escape_in_scope(bool b) {
> 
>   // This call passes objects that do not globally escape 
>   // to its callee. The object might be subject to optimization, 
>   // e.g. a lock might be omitted. Persist this information here 
>   // so that on a jvmti access to the callee frame we can deoptimize
>   // the object and this frame.
>   void  set_arg_escape(bool f)             { _arg_escape = f; }

I do not really like these comments. They are too verbose and do not match the comment style of the
surrounding code. The names are descriptive enough IMO. Also the measures taken depending on the
flags should be commented at the locations, where the flags are read.

> Actually I am not sure whether the name of these fields (and all 
> the others in the course of this change) should refer to 
> escape analysis.  I think the term "Object deoptimization" 
> you also use is much better. You could call these properties 
> (throughout the whole change) 
>   set_optimized_objects_in_scope()
> and
>   set_passes_optimized_objects().
> 
> I think this would make the whole matter much easier
> to understand. 

I'd prefer the current names. They are closer to established terminology. And it is actually
unknown whether optimizations based on their escape state exist.

> Anyways, locks can already be removed without running
> escape analysis at all. C2 recognizes some local patterns
> that allow this.
> 
> escape.h|cpp
> 
> The code looks good. 
> 
> Line 325: The comment could be a bit more elaborate:
>   // Annotate at safepoints if they have <= ArgEscape objects in their
>   // scope. Additionally, if the safepoint is a java call, annotate
>   // whether it passes ArgEscape objects as parameters.
> 
> And maybe add these comments?:
> 
> // Returns true if an oop in the scope of sfn does not escape
> // globally.
> bool ConnectionGraph::has_not_global_escape_in_scope(SafePointNode* sfn) {
> 
> // Returns true if at least one of the arguments to the call is an oop
> // that does not escape globally.
> bool ConnectionGraph::has_arg_escape(CallJavaNode* call) {

IMHO the method names are descriptive and don't need the comments. But I give in :) (only replaced
"oop" with "object")

> General question:
> You collect the information you want to annotate to the 
> method during escape analysis.
> Don't you overestimate the optimized objects by this?
> E.g. elimination of allocations does bail out for 
> various reasons. At the end, no optimization might 
> have happened, but then during runtime the frame is 
> deoptimized nevertheless.

Please see statements and worst case microbenchmark above.

> machnode.hpp:
> 
> Extends MachSafePointNode similar to the ideal version.  Good.
> 
> matcher.cpp
>   
> Copy info from ideal to mach node. good.
> 
> output.cpp
> 
> Now finally the information is written to the 
> debug info.  Good.
> 
> ---------------------------------------------------------
> 
> So now let's have a look at the runtime part (including
> relaxing constraints to escape analysis):
> 
> rootResolver.cpp
> 
> Adapt to changed interface. good.
> 
> c2compiler.cpp / macro.cpp
> 
> Make EscapeAnalysis independent of jvmti capabilities. Good.
> 
> jvmtiEnv.cpp/jvmtiEnvBase.cpp
> 
> You add deoptimization of objects where they are 
> accessed. good.
> 
> jvmtiImpl.cpp
> 
> In deoptimize_objects, you check for DoEscapeAnalysis.
> This is correct given the current design of the flag
> handling in the compiler.
> It's not really nice to have a dependency to C2 here, 
> though. I understand it's an optimization, the code 
> could be run anyways, it would check but not find
> anything. But actually I would expect dependencies
> on EliminateLocks and EliminateAllocations (if they
> were set according to jvmti capabilities as I elaborated
> above.)  
> Would it make sense to protect the ArgEscape
> loop by if (EliminateLocks)?

You are right, it is not correct how flags are checked. Especially if only running with the JVMCI
compiler.

I changed Deoptimization::deoptimize_objects_internal() to make reallocation and relocking dependent
on similar checks as in Deoptimization::fetch_unroll_info_helper(). Furthermore EscapeBarriers are
conditionally activated depending on the following (see EscapeBarrier ctors):

JVMCI_ONLY(UseJVMCICompiler) NOT_JVMCI(false) COMPILER2_PRESENT(|| DoEscapeAnalysis)

So the enhancement can be practically completely disabled by disabling DoEscapeAnalysis, which is
what C2 currently does if JVMTI capabilities that allow access to local references are taken.
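
For readers not familiar with these helper macros, the following self-contained sketch shows how such
an expression collapses at preprocessing time. The macro definitions are toy re-creations written for
this example, the build configuration (JVMCI and C2 both present) is an assumption, and
barrier_active() is a hypothetical wrapper that takes the flags as parameters.

```cpp
// Assumed build configuration for this sketch: JVMCI and C2 are both compiled in.
#define INCLUDE_JVMCI 1
#define COMPILER2 1

// Toy versions of the conditional-compilation helpers.
#if INCLUDE_JVMCI
#define JVMCI_ONLY(code) code
#define NOT_JVMCI(code)
#else
#define JVMCI_ONLY(code)
#define NOT_JVMCI(code) code
#endif

#ifdef COMPILER2
#define COMPILER2_PRESENT(code) code
#else
#define COMPILER2_PRESENT(code)
#endif

// Hypothetical wrapper around the activation condition.
static bool barrier_active(bool UseJVMCICompiler, bool DoEscapeAnalysis) {
  // Avoid unused-parameter warnings in configurations that drop a term.
  (void)UseJVMCICompiler;
  (void)DoEscapeAnalysis;
  // With the configuration above this expands to:
  //   UseJVMCICompiler || DoEscapeAnalysis
  // Without JVMCI it would be:  false || DoEscapeAnalysis
  // Without C2 it would be:     UseJVMCICompiler
  return JVMCI_ONLY(UseJVMCICompiler) NOT_JVMCI(false) COMPILER2_PRESENT(|| DoEscapeAnalysis);
}
```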

> jvmtiTagMap.cpp
> 
> Deoptimize for jvmti operations.  Good.
> 
> deoptimization.cpp
> 
> I guess this is the core of your work.
> 
> 
> You add a new mode that just deoptimizes objects but not frames. 
> Good idea. You have to use reallocated objects in upper frames, 
> or by jvmti accesses to inner frames, which can not easily be
> replaced by interpreter frames.
> This way you can wait with replacing the frame until just before
> execution returns.
> 
> eliminate_allocations():
> (Strange method name, should at least be in past tense, even
> better reallocate_eliminated_allocations() or 
> allocate_scalarized_objects(). Confused me until
> I grokked the code. Legacy though, not your business.)

I still don't grok the name... ;) but it's preexisting as you noted

> It's not that nice to return whether you only deoptimized
> objects by the boolean reference argument. After all, 
> it again depends on the mode you pass in.
> A different design would be to clone the method and 
> have an eliminate_allocations_no_unpack() variant, but that would
> not be better as some code would be duplicated.
> Maybe a comment for argument eliminate_allocations:
> // deoptimized_objects is set to true if objects were deoptimized
> // but not the frame. It is unchanged if there are no objects to 
> // be deoptimized, or if the frame was deoptim

I agree that duplicating the code would be really bad, but I don't agree that reference
parameters are not nice. I think it is a common pattern when you return an error code and additional
result data. The variable is a minor detail. With its meaningful name it is not necessary to
document it.

In my eyes it should be set independently of the exec_mode. I didn't do it to make the change smaller.
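
A minimal sketch of the pattern meant here: a status as return value plus additional result data
through a reference parameter. VirtualObject, realloc_virtual_objects() and the simulate_failure
switch are hypothetical and only loosely modelled on the discussion; the reference parameter
deoptimized_objects is set to true if something was done and left unchanged otherwise.

```cpp
#include <vector>

// Hypothetical stand-in for a scalar-replaced object.
struct VirtualObject {
  bool materialized = false;
};

// Returns false on (simulated) allocation failure, true otherwise.
// deoptimized_objects is set to true if at least one object was materialized
// by this call; otherwise it is left unchanged.
bool realloc_virtual_objects(std::vector<VirtualObject>& objs,
                             bool simulate_failure,
                             bool& deoptimized_objects) {
  if (simulate_failure) {
    return false;
  }
  for (VirtualObject& o : objs) {
    if (!o.materialized) {
      o.materialized = true;
      deoptimized_objects = true;
    }
  }
  return true;
}
```

The caller initializes the flag once and can pass it through several such calls, which is what makes
"left unchanged otherwise" useful.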

> Similar for eliminate_locks():
> // deoptimized_objects is set to true if objects were relocked,
> // else it is left unchanged.
> 
> You reuse and extend the existing realloc/relock_objects, but extended it.
> 
> deoptimize_objects_internal()
> 
> Simple version of fetch_unroll_info_helper for EscapeBarrier.
> Good.
> I attributed the comment "Then relock objects if synchronization on them was eliminated."
> to the if() just below. Add an empty line to make clear the comment
> refers to the next 10 lines.
> Alternatively, replace the whole comment by 
> // At first, reallocate the non-escaping objects and restore their fields
> // so they are available for relocking.
> And add 
> // Now relock objects with eliminated locks.
> before the if ((DoEscape... below.

I went for the latter.

> In fetch_unroll_info_helper, I don't understand why you need 
>  && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
> for eliminated locks, but not for scalar replaced objects?

In short: reallocation is idempotent, relocking is not.

Without the enhancement Deoptimization::realloc_objects() can already be called more than once for a frame:

First call in materializeVirtualObjects() (also iterateFrames()).

Second (indirect) call in fetch_unroll_info_helper().

The objects from the first call are saved as jvmti deferred updates when realloc_objects()
returns. Note that there is no relationship to jvmti. The thing in common is that updates cannot be
directly installed into a compiled frame; it is necessary to deoptimize the frame and defer the
updates until the compiled frame gets replaced. Every time the vframes corresponding to the owner
frame are iterated, they get the deferred updates. So in fetch_unroll_info_helper() the
GrowableArray<compiledVFrame*>* chunk references them too. All references to the objects created by
the second (indirect) call to realloc_objects() are never used, because compiledVFrame accessors to
locals, expressions, and monitors override them with the deferred updates. The objects become
unreachable and get gc'ed.

materializeVirtualObjects() does not bother with relocking. deoptimize_objects_internal(), which is
introduced by the enhancement, does relock objects; after all, the lock elimination becomes illegal
with the change in escape state. Relocking twice does not work, so the enhancement avoids it by
checking EscapeBarrier::objs_are_deoptimized(thread, deoptee.id()).

Note that materializeVirtualObjects() can be called more than once and will always return the very
same objects, even though it calls realloc_objects() again.
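
The difference can be shown with a small model. This is a sketch under simplifying assumptions, not
the HotSpot implementation: FrameInfo and ThreadModel are hypothetical, objects are represented by
integer handles, and the deferred updates are reduced to a map from frame id to handle. The
observable behaviour is what matters: a repeated reallocation yields the object saved by the first
call, while a repeated relock would increase the lock count again and therefore has to be guarded.

```cpp
#include <map>
#include <set>

// Hypothetical description of a compiled frame.
struct FrameInfo {
  int id;
  int eliminated_locks;  // lock acquisitions the compiler removed in this frame
};

// Hypothetical per-thread state.
struct ThreadModel {
  std::map<int, int> deferred_objects;  // frame id -> handle saved by the first reallocation
  std::set<int> deoptimized_frames;     // frames whose objects were already deoptimized
  int next_handle = 1;
  int lock_count = 0;                   // recursion count of the modelled monitor

  // Idempotent from the caller's view: later calls see the saved object.
  int realloc_objects(const FrameInfo& f) {
    auto it = deferred_objects.find(f.id);
    if (it != deferred_objects.end()) {
      return it->second;
    }
    int handle = next_handle++;
    deferred_objects[f.id] = handle;
    return handle;
  }

  // Not idempotent: every call adds to the recursion count.
  void relock_objects(const FrameInfo& f) { lock_count += f.eliminated_locks; }

  bool objs_are_deoptimized(const FrameInfo& f) const {
    return deoptimized_frames.count(f.id) != 0;
  }

  // Reallocate unconditionally, relock at most once per frame.
  void deoptimize_objects(const FrameInfo& f) {
    realloc_objects(f);
    if (!objs_are_deoptimized(f)) {
      relock_objects(f);
      deoptimized_frames.insert(f.id);
    }
  }
};
```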

> I would guess it is because the eliminated locks can be applied to
> argEscape, but scalar replacement only to noescape objects?
> I.e. it might have been done before?
> 
> But why isn't this the case for eliminate_allocations?
> deoptimize_objects_internal does both unconditionally,
> so both can happen to inner frames, right?

Sorry, I don't quite understand. Hope the explanation above helps.

> relock_objects()
> 
> Ok, you need to undo biased locking. Also, you remember the 
> lock nesting for later relocking if waiting for lock.
> 
> revoke_for_object_deoptimization()
>   I like if boolean operators are at the beginning of broken lines, 
>   but I think hotspot convention is to have them at the end.

Ok, fixed.

> Code will get much more simple if BiasedLocking is removed.
> 
> EscapeBarrier:: ...
> 
> (This class maybe would qualify for a file of its own.)
> 
> deoptimize_objects()
> I would mention escape analysis only as side remark.  Also, as I understand, 
> there is only one frame at given depth?
> // Deoptimize frames with optimized objects. This can be omitted locks and 
> // objects not allocated but replaced by scalars. In C2, these optimizations
> // are based on escape analysis.
> // Up to depth, deoptimize frames with any optimized objects.
> // From depth to entry_frame, deoptimize only frames that
> // pass optimized objects to their callees.
> (First part similar for the comment above EscapeBarrier::deoptimize_objects_internal().)

I've reworked the comment. Let me know if you still think it needs to be improved.

> 
> What is the check (cur_depth <= depth) good for? Can you 
> ever walk past entry_frame?  

Yes (assuming you mean the outer while-statement), there are java frames beyond the entry frame if a
native method calls java methods again. So we visit all frames up to the given depth and from there
we continue to the entry frame. It is not necessary to continue beyond that entry frame, because
escape analysis assumes that arguments to native functions escape globally.

Example: Let the java stack look like this:

+---------+
| Frame A |
+---------+
| Frame N |
+---------+
| Frame B |
+---------+ <- top of stack

Where java method A calls native method N and N calls java method B.

Very simplified, the native stack will look like this:

+-------------------------+
| Frame of JIT Compiled A |
+-------------------------+
| Frame N                 |
+-------------------------+
| Entry Frame             |
+-------------------------+
| Frame B                 |
+-------------------------+ <- top of stack

The entry frame is an activation of the call stub, which is a small assembler routine that
translates from the native calling convention to the java calling convention.

There cannot be any ArgEscape that is passed to B (see above), therefore we can stop the stackwalk
at the entry frame if depth is 1. If depth is 3 we have to continue to Frame A, as it is directly
accessed.
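
The walk described above can be modelled as follows. This is a simplified sketch, not the code from
the webrev: StackFrame, FrameKind and frames_to_deoptimize() are hypothetical, the stack is a plain
vector ordered from the top of the stack downwards, and depth is assumed to count java-level frames
starting at 1 for the top frame (the entry frame is not counted).

```cpp
#include <string>
#include <vector>

enum class FrameKind { Compiled, Native, Entry };

// Hypothetical frame description.
struct StackFrame {
  std::string name;
  FrameKind kind;
  bool not_global_escape_in_scope;  // frame has objects that do not escape globally
  bool arg_escape;                  // frame passes such objects to its callee
};

// Returns the names of the frames whose objects would be deoptimized.
std::vector<std::string> frames_to_deoptimize(const std::vector<StackFrame>& stack, int depth) {
  std::vector<std::string> result;
  int cur_depth = 1;
  for (const StackFrame& f : stack) {
    if (f.kind == FrameKind::Entry) {
      if (cur_depth > depth) {
        break;   // beyond the requested depth: stop at the entry frame
      }
      continue;  // still within depth: keep walking, entry frame is not counted
    }
    if (f.kind == FrameKind::Compiled) {
      bool within_depth = cur_depth <= depth;
      // Within depth any optimized object matters; beyond it only ArgEscape objects.
      if ((within_depth && f.not_global_escape_in_scope) || f.arg_escape) {
        result.push_back(f.name);
      }
    }
    ++cur_depth;
  }
  return result;
}
```

With the stack from the example (B, entry frame, N, A from top to bottom), depth 1 ends the walk at
the entry frame, whereas depth 3 continues to A.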


> Isn't vf->is_compiled_frame() prerequisite that "Move to next physical frame" 
> is needed? You could move it into the other check.
> If so, similar for deoptimize_objects_all_threads().

Only compiledVFrames require moving to the /top/ frame. Fixed.

> Synchronization: looks good. I think others had a look at this before.
> 
> EscapeBarrier::deoptimize_objects_internal()
>   The method name is misleading, it is not used by 
>   deoptimize_objects().
>   Also, a method with the same name is in Deoptimization.
>   Proposal: deoptimize_objects_thread() ?

Sorry, but I don't see why it would be misleading.
What would be the meaning of 'deoptimize_objects_thread'? I don't understand that name.

> C1 stubs: this really shows you tested all configurations, great!
> 
> 
> mutexLocker: ok.
> objectMonitor.cpp: ok
> stackValue.hpp   Is this missing clearing a bug?

In short: that change is not needed anymore. I'll remove it again.

Details: it is not a real bug, but the assertion in vframeArrayElement::fill_in() was triggered:

assert(!value->obj_is_scalar_replaced() || realloc_failures) failed: object should be reallocated already.

But only with the first version of the enhancement (webrev.0), where objects were only reallocated
when replacing a compiled frame with equivalent interpreter frames iff virtual objects were not
reallocated before.

I changed this after preexisting code was refactored (JDK-8226705), because already reallocated
objects practically never exist, and if there should be any, it does no harm to reallocate again: the
unnecessarily allocated objects immediately become garbage and, last but not least, no tricky
synchronization is required.

Also that's what happens with the preexisting code if virtual objects are materialized with
materializeVirtualObjects().

> 
> thread.hpp
> 
> I would remove "_ea" from the flag and method names.

Done.

> 
> Renaming deferred_locals to deferred_updates is good, as well as 
> adding a datastructure for it. 
> (Adding this data structure might be a breakout, too.)
> 
> good.
> 
> thread.cpp
> 
> good.
> 
> vframe.cpp
> 
> Is this a bug in existing code?
> Makes sense. 

Depends on your definition of bug. There are no references to vframe::is_entry_frame() in the
existing code. I would think it is a bug.

> 
> vframe_hp.hpp 
> (What stands _hp for? helper? The file should be named compiledVFrame ...)
> 
> not_global_escape_in_scope() ...
> Again, you mention escape analysis here. Comments above hold, too.

I think it is the right name, because it is meaningful and simple.

> You introduce JvmtiDeferredUpdates. Good.
> 
> vframe_hp.cpp
> 
> Changes for JvmtiDeferredUpdates, escape state accessors,
> 
> line 422:
> Would an assertion assert(!info->owner_is_scalar_replaced(), ...) hold here?
> 
> 
> macros.hpp
>   Good.  
> 
> 
> Test coding
> ============
> 
> compileBroker.h|cpp
> 
> You introduce a third class of threads handled here and 
> add a new flag to distinguish it. Before, the two kinds
> of threads were distinguished implicitly by passing in 
> a compiler for compiler threads.
> The new thread kind is only used for testing in debug.
> 
> make_thread:
> You could assert (comp != NULL...) to assure previous
> conditions.

I replaced the if-statements with a switch-statement, made sure all enum-elements are covered, and
added the assertion you suggested.
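
A sketch of that shape, for illustration only: the enumerators sweeper_t and deoptimizer_t appear in
the code quoted below, compiler_t is assumed as the third one, and describe_new_thread() together
with the placeholder Compiler type is hypothetical.

```cpp
#include <cassert>
#include <string>

// Thread kinds handled by the (modelled) thread factory.
enum ThreadType { compiler_t, sweeper_t, deoptimizer_t };

struct Compiler {};  // placeholder type for this sketch

// Hypothetical helper: one case per enumerator and no default label, so a
// compiler that warns about unhandled enumerators flags any addition.
std::string describe_new_thread(ThreadType type, const Compiler* comp) {
  switch (type) {
    case compiler_t:
      // Only compiler threads require a compiler instance.
      assert(comp != nullptr && "compiler threads need a compiler");
      return "compiler thread";
    case sweeper_t:
      return "sweeper thread";
    case deoptimizer_t:
      return "object deoptimizer thread";
  }
  assert(false && "unhandled thread type");
  return "";
}
```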

> line 989 indentation broken

You are referring to this block I assume:
(from http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/src/hotspot/share/compiler/compileBroker.cpp.frames.html)

 976   if (MethodFlushing) {
 977     // Initialize the sweeper thread
 978     Handle thread_oop = create_thread_oop("Sweeper thread", CHECK);
 979     jobject thread_handle = JNIHandles::make_local(THREAD, thread_oop());
 980     make_thread(sweeper_t, thread_handle, NULL, NULL, THREAD);
 981   }
 982 
 983 #if defined(ASSERT) && COMPILER2_OR_JVMCI
 984   if (DeoptimizeObjectsALot == 2) {
 985     // Initialize and start the object deoptimizer threads
 986     for (int thread_count = 0; thread_count < DeoptimizeObjectsALotThreadCount; thread_count++) {
 987       Handle thread_oop = create_thread_oop("Deoptimize objects a lot thread", CHECK);
 988       jobject thread_handle = JNIHandles::make_local(THREAD, thread_oop());
 989       make_thread(deoptimizer_t, thread_handle, NULL, NULL, THREAD);
 990     }
 991   }
 992 #endif // defined(ASSERT) && COMPILER2_OR_JVMCI

I cannot really see broken indentation here. Am I looking at the wrong location?

> escape.cpp
> 
> You enable the optimization in case of testruns. good.
> 
> whitebox.cpp  ok.
> 
> deoptimization.cpp
> 
> deoptimize_objects_alot_loop()  Good.
> 
> globals.hpp
> 
> Nice docu of flags, but please mention "for testing purposes"
> or the like in DeoptimizeObjectsALot.
> I would place the flags next to each other. 
> 
> interfaceSupport.cpp: good.

Thanks! :)

-----Original Message-----
From: Lindenmaier, Goetz <goetz.lindenmaier at sap.com> 
Sent: Wednesday, May 6, 2020 12:28
To: Reingruber, Richard <richard.reingruber at sap.com>; Doerr, Martin <martin.doerr at sap.com>; 'Robbin Ehn' <robbin.ehn at oracle.com>; David Holmes <david.holmes at oracle.com>; Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi Richard,

I had a look at your change.  It's complex, but not that big.
A lot of code is just passing info through layers of abstraction.
Also, one can tell this went through some iterations by now, 
I think it's very well engineered.
I had a look at webrev.05

Unfortunately
"8242425: JVMTI monitor operations should use Thread-Local Handshakes" 
breaks webrev.05.
I updated to before that change and took that as base of my review.

I see four parts of the change that can be looked at
rather individually.

 * Refactoring the scopeDesc constructors. Trivial.
 * Persisting information about the optimizations done by the compilers.
   Large and mostly trivial.
 * Deoptimizing. The most complicated part. Really well abstracted, though.
 * DeoptimizeObjectsALot for testing and the tests.

Review of compiler changes:

I understand you annotate at safepoints where the escape analysis
finds out that an object is "better" than global escape. 
These are the cases where the analysis identifies optimization 
opportunities. These annotations are then used to deoptimize
frames and the objects referenced by them.
Doesn't this overestimate the optimized 
objects?  E.g., eliminate_alloc_node has many cases where it bails
out.

c1_IR.hpp   

OK, nothing to do for C1, just adapt to extended method signature.

Break line once more so that it matches above line length.


ciEnv.h|cpp

Pass through another jvmti capability.  Trivial & good.


debugInfoRec.hpp

Pass through escape info that must be recorded. OK.

pcDesc.hpp

I would like to see some documentation of the methods.

Maybe:
  // There is an object in the scope that does not escape globally.
  // It either does not escape at all or it escapes as argument.
and
  // One of the arguments is an object that is not globally visible
  // but escapes to the callee.

scopeDesc.cpp

  Besides refactoring copy escape info from pcDesc to scopeDesc
  and add accessors. Trivial.

  In scopeDesc.hpp you talk about NoEscape and ArgEscape. 
  These are opto terms, but scopeDesc is a shared data structure
  that does not depend on a specific compiler. 
  Please explain what is going on without using these terms.

jvmciCodeInstaller.cpp

  OK, nothing for JVMCI. Here support for Object Optimizations 
  for JVMCI compilers could be added. Leave this to graal people.

callnode.hpp

You add functionality to annotate callnodes with escape information 
This is carried through code generation to final output where it is
added to the compiled methods meta information.

At Safepoints in general jvmti can access
  - Objects that were scalar replaced. They must be reallocated.
    (Flag EliminateAllocations)
  - Objects that should be locked but are not because they never 
    escape the thread. They need to be relocked.

At calls, Objects where locks have been removed escape to callees.
We must persist this information so that if jvmti accesses the 
object in a callee, we can determine by looking at the caller that
it needs to be relocked.

A side comment: 
I think the flag handling in Opto is not very intuitive.
DoEscapeAnalysis depends on the jvmti capabilities.
This makes no sense. It is only an analysis. The optimizations
should depend on the jvmti capabilities.
The correct setup would be to handle this in 
CompilerConfig::ergo_initialize():
If the jvmti capabilities allow, enable the optimizations 
EliminateAllocations or  EliminateLocks/EliminateNestedLocks.
If one of these optimizations is on, enable EscapeAnalysis.
 -- end side comment.

So I would propose the following comments:

  // In the scope of this safepoint there are objects
  // that do not globally escape. They are either NoEscape or
  // ArgEscape. As such, they might be subject to optimizations.
  // Persist this information here so that the frame and the
  // Objects in scope can 
  // be deoptimized if jvmti accesses an object at this safepoint.
  void set_not_global_escape_in_scope(bool b) {

  // This call passes objects that do not globally escape 
  // to its callee. The object might be subject to optimization, 
  // e.g. a lock might be omitted. Persist this information here 
  // so that on a jvmti access to the callee frame we can deoptimize
  // the object and this frame.
  void  set_arg_escape(bool f)             { _arg_escape = f; }

Actually I am not sure whether the name of these fields (and all 
the others in the course of this change) should refer to 
escape analysis.  I think the term "Object deoptimization" 
you also use is much better. You could call these properties 
(throughout the whole change) 
  set_optimized_objects_in_scope()
and
  set_passes_optimized_objects().

I think this would make the whole matter much easier
to understand. 

Anyways, locks can already be removed without running
escape analysis at all. C2 recognizes some local patterns
that allow this.

escape.h|cpp

The code looks good. 

Line 325: The comment could be a bit more elaborate:
  // Annotate at safepoints if they have <= ArgEscape objects in their
  // scope. Additionally, if the safepoint is a java call, annotate
  // whether it passes ArgEscape objects as parameters.

And maybe add these comments?:

// Returns true if an oop in the scope of sfn does not escape
// globally.
bool ConnectionGraph::has_not_global_escape_in_scope(SafePointNode* sfn) {

// Returns true if at least one of the arguments to the call is an oop
// that does not escape globally.
bool ConnectionGraph::has_arg_escape(CallJavaNode* call) {

General question:
You collect the information you want to annotate to the 
method during escape analysis.
Don't you overestimate the optimized objects by this?
E.g. elimination of allocations does bail out for 
various reasons. At the end, no optimization might 
have happened, but then during runtime the frame is 
deoptimized nevertheless.

machnode.hpp:

Extends MachSafePointNode similar to the ideal version.  Good.

matcher.cpp
  
Copy info from ideal to mach node. good.

output.cpp

Now finally the information is written to the 
debug info.  Good.

---------------------------------------------------------

So now let's have a look at the runtime part (including
relaxing constraints to escape analysis):

rootResolver.cpp

Adapt to changed interface. good.

c2compiler.cpp / macro.cpp

Make EscapeAnalysis independent of jvmti capabilities. Good.

jvmtiEnv.cpp/jvmtiEnvBase.cpp

You add deoptimization of objects where they are 
accessed. good.

jvmtiImpl.cpp

In deoptimize_objects, you check for DoEscapeAnalysis.
This is correct given the current design of the flag
handling in the compiler.
It's not really nice to have a dependency to C2 here, 
though. I understand it's an optimization, the code 
could be run anyways, it would check but not find
anything. But actually I would expect dependencies
on EliminateLocks and EliminateAllocations (if they
were set according to jvmti capabilities as I elaborated
above.)  
Would it make sense to protect the ArgEscape
loop by if (EliminateLocks)?

jvmtiTagMap.cpp

Deoptimize for jvmti operations.  Good.

deoptimization.cpp

I guess this is the core of your work.


You add a new mode that just deoptimizes objects but not frames. 
Good idea. You have to use reallocated objects in upper frames, 
or by jvmti accesses to inner frames, which can not easily be
replaced by interpreter frames.
This way you can wait with replacing the frame until just before
execution returns.

eliminate_allocations():
(Strange method name, should at least be in past tense, even
better reallocate_eliminated_allocations() or 
allocate_scalarized_objects(). Confused me until
I grokked the code. Legacy though, not your business.)

It's not that nice to return whether you only deoptimized
objects by the boolean reference argument. After all, 
it again depends on the mode you pass in.
A different design would be to clone the method and 
have an eliminate_allocations_no_unpack() variant, but that would
not be better as some code would be duplicated.
Maybe a comment for argument eliminate_allocations:
// deoptimized_objects is set to true if objects were deoptimized
// but not the frame. It is unchanged if there are no objects to 
// be deoptimized, or if the frame was deoptim

Similar for eliminate_locks():
// deoptimized_objects is set to true if objects were relocked,
// else it is left unchanged.
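
To illustrate the contract these comments describe, here is a small sketch of a boolean reference argument that is only ever set to true and never reset. DeoptMode, reallocate_sketch and relock_sketch are hypothetical names, and the "frame is deoptimized as well" case is my reading of the intended behavior, not the code in the webrev:

```cpp
#include <cassert>

// Hypothetical mode switch: either the whole frame is deoptimized, or only
// its objects are (the new mode discussed above).
enum class DeoptMode { FrameAndObjects, ObjectsOnly };

// Sketch of the reallocation step. deoptimized_objects is set to true if
// objects were reallocated but the frame was kept. It is left unchanged if
// there was nothing to reallocate or if the frame is deoptimized as well.
inline void reallocate_sketch(int scalar_replaced_count, DeoptMode mode,
                              bool& deoptimized_objects) {
  if (scalar_replaced_count == 0) {
    return;
  }
  // ... reallocation of the objects would happen here ...
  if (mode == DeoptMode::ObjectsOnly) {
    deoptimized_objects = true;
  }
}

// Sketch of the relocking step. deoptimized_objects is set to true if
// objects were relocked, else it is left unchanged.
inline void relock_sketch(int eliminated_lock_count, bool& deoptimized_objects) {
  if (eliminated_lock_count > 0) {
    // ... relocking would happen here ...
    deoptimized_objects = true;
  }
}
```

Because neither function ever writes false, a caller can pass the same variable through both steps and read the accumulated result afterwards.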

You reuse and extend the existing realloc/relock_objects.

deoptimize_objects_internal()

Simple version of fetch_unroll_info_helper for EscapeBarrier.
Good.
I attributed the comment "Then relock objects if synchronization on them was eliminated."
to the if() just below. Add an empty line to make clear the comment
refers to the next 10 lines.
Alternatively, replace the whole comment by 
// At first, reallocate the non-escaping objects and restore their fields
// so they are available for relocking.
And add 
// Now relock objects with eliminated locks.
before the if ((DoEscape... below.

In fetch_unroll_info_helper, I don't understand why you need 
 && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
for eliminated locks, but not for scalar replaced objects?
I would guess it is because the eliminated locks can be applied to
argEscape, but scalar replacement only to noescape objects?
I.e. it might have been done before?

But why isn't this the case for eliminate_allocations?
deoptimize_objects_internal does both unconditionally,
so both can happen to inner frames, right?

relock_objects()

Ok, you need to undo biased locking. Also, you remember the 
lock nesting for later relocking if waiting for lock.

revoke_for_object_deoptimization()
  I like it if boolean operators are at the beginning of broken lines, 
  but I think hotspot convention is to have them at the end.

The code will get much simpler once BiasedLocking is removed.

EscapeBarrier:: ...

(This class maybe would qualify for a file of its own.)

deoptimize_objects()
I would mention escape analysis only as a side remark.  Also, as I understand, 
there is only one frame at a given depth?
// Deoptimize frames with optimized objects. Optimized objects are objects
// whose locks were omitted and objects that were not allocated but replaced
// by scalars. In C2, these optimizations are based on escape analysis.
// Up to depth, deoptimize frames with any optimized objects.
// From depth to entry_frame, deoptimize only frames that
// pass optimized objects to their callees.
(First part similar for the comment above EscapeBarrier::deoptimize_objects_internal().)
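
The selection rule in the proposed comment can be sketched as follows. FrameInfo and frames_to_deoptimize are hypothetical, the stack is modeled as a vector with index 0 as the youngest frame and the last element as the entry frame, and "up to depth" is taken as inclusive to match the (cur_depth <= depth) check:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-frame summary. A frame that passes optimized objects to
// its callee is assumed to also have has_optimized_objects set.
struct FrameInfo {
  bool has_optimized_objects;        // eliminated locks or scalar-replaced objects
  bool passes_optimized_to_callee;   // such objects are passed as arguments
};

// Returns the indices of the frames that would be deoptimized.
inline std::vector<std::size_t> frames_to_deoptimize(
    const std::vector<FrameInfo>& stack, std::size_t depth) {
  std::vector<std::size_t> result;
  for (std::size_t i = 0; i < stack.size(); ++i) {
    // Up to depth: any optimized object counts.
    // Beyond depth: only objects handed down to callees count.
    const bool deopt = (i <= depth) ? stack[i].has_optimized_objects
                                    : stack[i].passes_optimized_to_callee;
    if (deopt) {
      result.push_back(i);
    }
  }
  return result;
}
```

In this model the loop ends at the entry frame simply because the vector ends there.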

What is the check (cur_depth <= depth) good for? Can you 
ever walk past entry_frame?  

Isn't vf->is_compiled_frame() a prerequisite for "Move to next physical frame" 
being needed? You could move it into the other check.
If so, similar for deoptimize_objects_all_threads().

Synchronization: looks good. I think others had a look at this before.

EscapeBarrier::deoptimize_objects_internal()
  The method name is misleading, it is not used by 
  deoptimize_objects().
  Also, a method with the same name exists in Deoptimization.
  Proposal: deoptimize_objects_thread() ?

C1 stubs: this really shows you tested all configurations, great!


mutexLocker: ok.
objectMonitor.cpp: ok
stackValue.hpp   Is the missing clearing a bug?

thread.hpp

I would remove "_ea" from the flag and method names.

Renaming deferred_locals to deferred_updates is good, as well as 
adding a data structure for it. 
(Adding this data structure might be a breakout, too.)

good.

thread.cpp

good.

vframe.cpp

Is this a bug in existing code?
Makes sense. 

vframe_hp.hpp 
(What does _hp stand for? helper? The file should be named compiledVFrame ...)

not_global_escape_in_scope() ...
Again, you mention escape analysis here. Comments above hold, too.

You introduce JvmtiDeferredUpdates. Good.

vframe_hp.cpp

Changes for JvmtiDeferredUpdates, escape state accessors,

line 422:
Would an assertion assert(!info->owner_is_scalar_replaced(), ...) hold here?


macros.hpp
  Good.  


Test coding
============

compileBroker.h|cpp

You introduce a third class of threads handled here and 
add a new flag to distinguish it. Before, the two kinds
of threads were distinguished implicitly by passing in 
a compiler for compiler threads.
The new thread kind is only used for testing in debug.

make_thread:
You could assert (comp != NULL...) to assure previous
conditions.
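
Something along these lines is what I mean. This is a sketch only: ThreadKind, AbstractCompilerStub, ThreadRequest and make_thread_sketch are hypothetical names, not the signature in compileBroker.cpp:

```cpp
#include <cassert>

// Hypothetical explicit thread kinds replacing the implicit distinction.
enum class ThreadKind { Compiler, Other, DeoptimizerTest };

// Hypothetical stand-in for the compiler object passed to make_thread.
struct AbstractCompilerStub {};

struct ThreadRequest {
  ThreadKind kind;
  const AbstractCompilerStub* comp;
};

inline ThreadRequest make_thread_sketch(ThreadKind kind,
                                        const AbstractCompilerStub* comp) {
  // Before the explicit kind existed, a non-null compiler was what marked a
  // compiler thread. Keep that old condition checked in debug builds.
  if (kind == ThreadKind::Compiler) {
    assert(comp != nullptr && "compiler threads need a compiler");
  }
  return ThreadRequest{kind, comp};
}
```

The assertion costs nothing in product builds and documents the condition that used to be implicit.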

line 989 indentation broken

escape.cpp

You enable the optimization in case of test runs. good.

whitebox.cpp  ok.

deoptimization.cpp

deoptimize_objects_alot_loop()  Good.

globals.hpp

Nice documentation of flags, but please mention "for testing purposes"
or the like in DeoptimizeObjectsALot.
I would place the flags next to each other. 

interfaceSupport.cpp: good.

I'll look at the tests themselves in an extra mail (learning from 
Martin ??)

Best regards,
  Goetz.


> -----Original Message-----
> From: Reingruber, Richard <richard.reingruber at sap.com>
> Sent: Wednesday, April 1, 2020 8:15 AM
> To: Doerr, Martin <martin.doerr at sap.com>; 'Robbin Ehn'
> <robbin.ehn at oracle.com>; Lindenmaier, Goetz
> <goetz.lindenmaier at sap.com>; David Holmes <david.holmes at oracle.com>;
> Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>;
> serviceability-dev at openjdk.java.net; hotspot-compiler-
> dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> in the Presence of JVMTI Agents
> 
> Hi Martin,
> 
> > thanks for addressing all my points. I've looked over webrev.5 and I'm
> satisfied with your changes.
> 
> Thanks!
> 
> > I had also promised to review the tests.
> 
> Thanks++
> I appreciate it very much, the tests are many lines of code.
> 
> > test/jdk/com/sun/jdi/EATests.java
> > This is a substantial amount of tests which is appropriate for such a large
> > change. Skipping some subtests with UseJVMCICompiler makes sense
> > because it doesn't provide the necessary JVMTI functionality, yet.
> > Nice work!
> > I also like that you test with and without BiasedLocking. Your tests will still
> be fine after BiasedLocking deprecation.
> 
> Hope so :)
> 
> > Very minor nits:
> > - 2 typos in comment above EARelockingNestedInflatedTarget: "lockes are
> ommitted" (sounds funny)
> > - You sometimes write "graal" and sometimes "Graal". I guess the capital G
> is better. (Also in EATestsJVMCI.java.)
> 
> > test/jdk/com/sun/jdi/EATestsJVMCI.java
> > EATests with Graal enabled. Nice that you support Graal to some extent.
> Maybe Graal folks want to enhance them in the future. I think this is a good
> starting point.
> 
> Will change this in the next webrev.
> 
> > Conclusion: Looks good and not trivial :-)
> > Now, you have one full review. I'd be ok with covering 2nd review by partial
> reviews.
> > Compiler and JVMTI parts are not too complicated IMHO.
> > Runtime part should get at least one additional careful review.
> 
> Thanks a lot,
> Richard.
> 
> -----Original Message-----
> From: Doerr, Martin <martin.doerr at sap.com>
> Sent: Dienstag, 31. März 2020 16:01
> To: Reingruber, Richard <richard.reingruber at sap.com>; 'Robbin Ehn'
> <robbin.ehn at oracle.com>; Lindenmaier, Goetz
> <goetz.lindenmaier at sap.com>; David Holmes <david.holmes at oracle.com>;
> Vladimir Kozlov (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>;
> serviceability-dev at openjdk.java.net; hotspot-compiler-
> dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
> Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> in the Presence of JVMTI Agents
> 
> Hi Richard,
> 
> thanks for addressing all my points. I've looked over webrev.5 and I'm
> satisfied with your changes.
> 
> 
> I had also promised to review the tests.
> 
> test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysis
> Enabled.java
> Thanks for updating the @summary comment. Looks good in webrev.5.
> 
> test/hotspot/jtreg/serviceability/jvmti/Heap/libIterateHeapWithEscapeAnaly
> sisEnabled.c
> JVMTI agent for object tagging and heap iteration. Good.
> 
> test/jdk/com/sun/jdi/EATests.java
> This is a substantial amount of tests which is appropriate for such a large
> change. Skipping some subtests with UseJVMCICompiler makes sense
> because it doesn't provide the necessary JVMTI functionality, yet.
> Nice work!
> I also like that you test with and without BiasedLocking. Your tests will still be
> fine after BiasedLocking deprecation.
> 
> Very minor nits:
> - 2 typos in comment above EARelockingNestedInflatedTarget: "lockes are
> ommitted" (sounds funny)
> - You sometimes write "graal" and sometimes "Graal". I guess the capital G is
> better. (Also in EATestsJVMCI.java.)
> 
> test/jdk/com/sun/jdi/EATestsJVMCI.java
> EATests with Graal enabled. Nice that you support Graal to some extent.
> Maybe Graal folks want to enhance them in the future. I think this is a good
> starting point.
> 
> 
> Conclusion: Looks good and not trivial :-)
> Now, you have one full review. I'd be ok with covering 2nd review by partial
> reviews.
> Compiler and JVMTI parts are not too complicated IMHO.
> Runtime part should get at least one additional careful review.
> 
> Best regards,
> Martin
> 
> 
> > -----Original Message-----
> > From: Reingruber, Richard <richard.reingruber at sap.com>
> > Sent: Montag, 30. März 2020 10:32
> > To: Doerr, Martin <martin.doerr at sap.com>; 'Robbin Ehn'
> > <robbin.ehn at oracle.com>; Lindenmaier, Goetz
> > <goetz.lindenmaier at sap.com>; David Holmes
> <david.holmes at oracle.com>;
> > Vladimir Kozlov (vladimir.kozlov at oracle.com)
> > <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net;
> > hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-
> > dev at openjdk.java.net
> > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> > in the Presence of JVMTI Agents
> >
> > Hi,
> >
> > this is webrev.5 based on Robbin's feedback and Martin's review - thanks! :)
> >
> > The change affects jvmti, hotspot and c2. Partial reviews are very welcome
> > too.
> >
> > Full:  http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/
> > Delta:
> > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5.inc/
> >
> > Robbin, Martin, please let me know, if anything shouldn't be quite as you
> > wanted it. Also find my
> > comments on your feedback below.
> >
> > Robbin, can I count you as Reviewer for the runtime part?
> >
> > Thanks, Richard.
> >
> > --
> >
> > > DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> > > You can move both declaration and definition to that file, no need to
> > clobber
> > > thread.[c|h]pp. (and the static function deopt_objs_alot_thread_entry)
> >
> > Done.
> >
> > > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't it be in its own
> > > hpp file? It doesn't seem right to add JVM TI classes into thread.hpp.
> >
> > I moved JvmtiDeferredUpdates to vframe_hp.hpp where preexisting
> > jvmtiDeferredLocalVariableSet is
> > declared.
> >
> > > src/hotspot/share/code/compiledMethod.cpp
> > > Nice cleanup!
> >
> > Thanks :)
> >
> > > src/hotspot/share/code/debugInfoRec.cpp
> > > src/hotspot/share/code/debugInfoRec.hpp
> > > Additional parameters. (Remark: I think "non_global_escape_in_scope"
> > would read better than "not_global_escape_in_scope", but your version is
> > consistent with existing code, so no change request from my side.) Ok.
> >
> > I've been thinking about this too and finally stayed with
> > not_global_escape_in_scope. It's supposed
> > to mean an object whose escape state is not GlobalEscape is in scope.
> >
> > > src/hotspot/share/compiler/compileBroker.cpp
> > > src/hotspot/share/compiler/compileBroker.hpp
> > > Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into
> > a follow up change together with the test in order to make this webrev
> > smaller, but since it is included, I'm reviewing everything at once. Not a big
> > deal.) Ok.
> >
> > Yes the change would be a little smaller. And if it helps I'll split it off. In
> > general I prefer
> > patches that bring along a suitable amount of tests.
> >
> > > src/hotspot/share/opto/c2compiler.cpp
> > > Make do_escape_analysis independent of JVMTI capabilities. Nice!
> >
> > It is the main goal of the enhancement. It is done for C2, but could be done
> > for JVMCI compilers
> > with just a small effort as well.
> >
> > > src/hotspot/share/opto/escape.cpp
> > > Annotation for MachSafePointNodes. Your added functionality looks
> > correct.
> > > But I'd prefer to move the bulky code out of the large function.
> > > I suggest to factor out something like has_not_global_escape and
> > has_arg_escape. So the code could look like this:
> > >       SafePointNode* sfn = sfn_worklist.at(next);
> > >       sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
> > >       if (sfn->is_CallJava()) {
> > >         CallJavaNode* call = sfn->as_CallJava();
> > >         call->set_arg_escape(has_arg_escape(call));
> > >       }
> > > This would also allow us to get rid of the found_..._escape_in_args
> > variables making the loops better readable.
> >
> > Done.
> >
> > > It's kind of ugly to use strcmp to recognize uncommon trap, but that
> seems
> > to be the way to do it (there are more such places). So it's ok.
> >
> > Yeah. I copied the snippet.
> >
> > > src/hotspot/share/prims/jvmtiImpl.cpp
> > > src/hotspot/share/prims/jvmtiImpl.hpp
> > > The sequence is pretty complex:
> > > VM_GetOrSetLocal element initialization executes EscapeBarrier code
> > which suspends the target thread (extra VM Operation).
> >
> > Note that the target threads have to be suspended already for
> > VM_GetOrSetLocal*. So it's mainly the
> > synchronization effect of EscapeBarrier::sync_and_suspend_one() that is
> > required here. Also no extra
> > _handshake_ is executed, since sync_and_suspend_one() will find the
> > target threads already
> > suspended.
> >
> > > VM_GetOrSetLocal::doit_prologue performs object deoptimization (by
> VM
> > Thread to prepare VM Operation with frame deoptimization).
> > > VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor
> > which resumes the target thread.
> > > But I don't have any improvement proposal. Performance is probably not
> a
> > concern, here. So it's ok.
> >
> > > VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it
> > has non-globally escaping objects and other frames if they have arg
> escaping
> > ones. Good.
> >
> > It's not specifically the top frame, but the frame that is accessed.
> >
> > > src/hotspot/share/runtime/deoptimization.cpp
> > > Object deoptimization. I have more comments and proposals, here.
> > > First of all, handling recursive and waiting locks in relock_objects is tricky,
> > but looks correct.
> > > Comments are sufficient to understand why things are done as they are
> > implemented.
> >
> > > BiasedLocking related parts are complex, but we may get rid of them in
> the
> > future (with BiasedLocking removal).
> > > Anyway, looks correct, too.
> >
> > > Typo in comment: "regularily" => "regularly"
> >
> > > Deoptimization::fetch_unroll_info_helper is the only place where
> > _jvmti_deferred_updates get deallocated (except JavaThread destructor).
> > But I think we always go through it, so I can't see a memory leak or such
> kind
> > of issues.
> >
> > That's correct. The compiled frame for which deferred updates are
> allocated
> > is always deoptimized
> > before (see EscapeBarrier::deoptimize_objects()). This is also asserted in
> > compiledVFrame::update_deferred_value(). I've added the same assertion
> > to
> > Deoptimization::relock_objects(). So we can be sure that
> > _jvmti_deferred_updates are deallocated
> > again in fetch_unroll_info_helper().
> >
> > > EscapeBarrier::deoptimize_objects: ResourceMark should use
> > calling_thread().
> >
> > Sure, well spotted!
> >
> > > You can use MutexLocker and MonitorLocker with Thread* to save the
> > Thread::current() call.
> >
> > Right, good hint. This was recently introduced with 8235678. I even had to
> > resolve conflicts. Should
> > have done this then.
> >
> > > I'd make set_objs_are_deoptimized static and remove it from the
> > EscapeBarrier interface because I think it shouldn't be used outside of
> > EscapeBarrier::deoptimize_objects.
> >
> > Done.
> >
> > > Typo in comment: "we must only deoptimize" => "we only have to
> > deoptimize"
> >
> > Replaced with "[...] we deoptimize iff local objects are passed as args"
> >
> > > "bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and
> > barrier_active() is redundant. Implementation can get moved to hpp file.
> >
> > Ok. Done.
> >
> > > I'll get back to suspend flags, later.
> >
> > > There are weird cases regarding _self_deoptimization_in_progress.
> > > Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C.
> > C can set _self_deoptimization_in_progress while A performs the
> handshake
> > for suspending C. I think this doesn't lead to errors, but it's probably not
> > desired.
> > > I think it would be better to use only one "wait" call in
> > sync_and_suspend_one and sync_and_suspend_all.
> >
> > You're right. We've discussed that face-to-face, but couldn't find a real issue.
> > But now, thinking again, I reckon I found one:
> >
> > 2808   // Sync with other threads that might be doing deoptimizations
> > 2809   {
> > 2810     // Need to switch to _thread_blocked for the wait() call
> > 2811     ThreadBlockInVM tbivm(_calling_thread);
> > 2812     MonitorLocker ml(EscapeBarrier_lock,
> > Mutex::_no_safepoint_check_flag);
> > 2813     while (_self_deoptimization_in_progress) {
> > 2814       ml.wait();
> > 2815     }
> > 2816
> > 2817     if (self_deopt()) {
> > 2818       _self_deoptimization_in_progress = true;
> > 2819     }
> > 2820
> > 2821     while (_deoptee_thread->is_ea_obj_deopt_suspend()) {
> > 2822       ml.wait();
> > 2823     }
> > 2824
> > 2825     if (self_deopt()) {
> > 2826       return;
> > 2827     }
> > 2828
> > 2829     // set suspend flag for target thread
> > 2830     _deoptee_thread->set_ea_obj_deopt_flag();
> > 2831   }
> >
> > - A waits in 2822
> > - C is suspended
> > - B notifies all in resume_one()
> > - A and C wake up
> > - C wins over A and sets _self_deoptimization_in_progress = true in 2818
> > - C does the self deoptimization
> > - A executes 2830 _deoptee_thread->set_ea_obj_deopt_flag()
> >
> > C will self suspend at some undefined point. The resulting state is illegal.
> >
> > > I first thought it'd be better to move ThreadBlockInVM before wait() to
> > reduce thread state transitions, but that seems to be problematic because
> > ThreadBlockInVM destructor contains a safepoint check which we
> shouldn't
> > do while holding EscapeBarrier_lock. So no change request.
> >
> > Yes, would be nice to have the state change only if needed, but for the
> > reason you mentioned it is
> > not quite as easy as it seems to be. I experimented as well with a second
> > lock, but did not succeed.
> >
> > > Change in thread_added:
> > > I think the sequence would be more comprehensive if we waited for
> > deopt_all_threads in Thread::start and all other places where a new thread
> > can run into Java code (e.g. JVMTI attach).
> > > Your version makes new threads come up with suspend flag set. That
> looks
> > correct, too. Advantage is that you only have to change one place
> > (thread_added). It'll be interesting to see how it will look like when we use
> > async handshakes instead of suspend flags.
> > > For now, I'm ok with your version.
> >
> > I had a version that did what you are suggesting. The current version also has
> > the advantage that there are fewer places where a thread has to wait for ongoing
> > object deoptimization. This means fewer places where you have to worry about
> > correct thread state transitions, possible deadlocks, and whether all oops are
> > properly Handle'ed.
> >
> > > I'd only move MutexLocker ml(EscapeBarrier_lock...) after if (!jt-
> > >is_hidden_from_external_view()).
> >
> > Done.
> >
> > > Having 4 different deoptimize_objects functions makes it a little hard to
> > keep an overview of which one is used for what.
> > > Maybe adding suffixes would help a little bit, but I can also live with what
> > you have.
> > > Implementation looks correct to me.
> >
> > 2 are internal. I added the suffix _internal to them. This leaves 2 to choose
> > from.
> >
> > > src/hotspot/share/runtime/deoptimization.hpp
> > > Escape barriers and object deoptimization functions.
> > > Typo in comment: "helt" => "held"
> >
> > Done in place already.
> >
> > > src/hotspot/share/runtime/interfaceSupport.cpp
> > > InterfaceSupport::deoptimizeAllObjects() is only used for
> > DeoptimizeObjectsALot = 1.
> > > I think DeoptimizeObjectsALot = 2 is more important, but I think it's not
> bad
> > to have DeoptimizeObjectsALot = 1 in addition. Ok.
> >
> > I never used DeoptimizeObjectsALot = 1 that much. It could be more
> > deterministic in single threaded
> > scenarios. I wouldn't object to get rid of it though.
> >
> > > src/hotspot/share/runtime/stackValue.hpp
> > > Better reinitialization in StackValue. Good.
> >
> > StackValue::obj_is_scalar_replaced() should not return true after calling
> > set_obj().
> >
> > > src/hotspot/share/runtime/thread.cpp
> > > src/hotspot/share/runtime/thread.hpp
> > > src/hotspot/share/runtime/thread.inline.hpp
> > > wait_for_object_deoptimization, suspend flag, deferred updates and test
> > feature to deoptimize objects.
> >
> > > In the long term, we want to get rid of suspend flags, so it's not so nice to
> > introduce a new one. But I agree with Götz that it should be acceptable as
> > temporary solution until async handshakes are available (which takes more
> > time). So I'm ok with your change.
> >
> > I'm keen to build the feature on async handshakes when they arrive.
> >
> > > You can use MutexLocker with Thread*.
> >
> > Done.
> >
> > > JVMTIDeferredUpdates: I agree with Robbin. It'd be nice to move the class
> > out of thread.hpp.
> >
> > Done.
> >
> > > src/hotspot/share/runtime/vframe.cpp
> > > Added support for entry frame to new_vframe. Ok.
> >
> >
> > > src/hotspot/share/runtime/vframe_hp.cpp
> > > src/hotspot/share/runtime/vframe_hp.hpp
> >
> > > I think code()->as_nmethod() in not_global_escape_in_scope() and
> > arg_escape() should better be under #ifdef ASSERT or inside the assert
> > statement (no need for code cache walking in product build).
> >
> > Done.
> >
> > > jvmtiDeferredLocalVariableSet::update_monitors:
> > > Please add a comment explaining that owner referenced by original info
> > may be scalar replaced, but it is deoptimized in the vframe.
> >
> > Done.
> >
> > -----Original Message-----
> > From: Doerr, Martin <martin.doerr at sap.com>
> > Sent: Donnerstag, 12. März 2020 17:28
> > To: Reingruber, Richard <richard.reingruber at sap.com>; 'Robbin Ehn'
> > <robbin.ehn at oracle.com>; Lindenmaier, Goetz
> > <goetz.lindenmaier at sap.com>; David Holmes
> <david.holmes at oracle.com>;
> > Vladimir Kozlov (vladimir.kozlov at oracle.com)
> > <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net;
> > hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-
> > dev at openjdk.java.net
> > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> > in the Presence of JVMTI Agents
> >
> > Hi Richard,
> >
> >
> > I managed to find time for an (almost) complete review of webrev.4. (I'll
> > review the tests separately.)
> >
> > First of all, the change seems to be in pretty good quality for its significant
> > complexity. I couldn't find any real bugs. But I'd like to propose minor
> > improvements.
> > I'm convinced that it's mature because we did substantial testing.
> >
> > I like the new functionality for object deoptimization. It can possibly be
> > reused for future escape analysis based optimizations. So I appreciate
> having
> > it available in the code base.
> > In addition to that, your change makes the JVMTI implementation better
> > integrated into the VM.
> >
> >
> > Now to the details:
> >
> >
> > src/hotspot/share/c1/c1_IR.hpp
> > describe_scope parameters. Ok.
> >
> >
> > src/hotspot/share/ci/ciEnv.cpp
> > src/hotspot/share/ci/ciEnv.hpp
> > Fix for JvmtiExport::can_walk_any_space() capability. Ok.
> >
> >
> > src/hotspot/share/code/compiledMethod.cpp
> > Nice cleanup!
> >
> >
> > src/hotspot/share/code/debugInfoRec.cpp
> > src/hotspot/share/code/debugInfoRec.hpp
> > Additional parameters. (Remark: I think "non_global_escape_in_scope"
> > would read better than "not_global_escape_in_scope", but your version is
> > consistent with existing code, so no change request from my side.) Ok.
> >
> >
> > src/hotspot/share/code/nmethod.cpp
> > Nice cleanup!
> >
> >
> > src/hotspot/share/code/pcDesc.hpp
> > Additional parameters. Ok.
> >
> >
> > src/hotspot/share/code/scopeDesc.cpp
> > src/hotspot/share/code/scopeDesc.hpp
> > Improved implementation + additional parameters. Ok.
> >
> >
> > src/hotspot/share/compiler/compileBroker.cpp
> > src/hotspot/share/compiler/compileBroker.hpp
> > Extra thread for DeoptimizeObjectsALot. (Remark: I would have put it into a
> > follow up change together with the test in order to make this webrev
> > smaller, but since it is included, I'm reviewing everything at once. Not a big
> > deal.) Ok.
> >
> >
> > src/hotspot/share/jvmci/jvmciCodeInstaller.cpp
> > Additional parameters. Ok.
> >
> >
> > src/hotspot/share/opto/c2compiler.cpp
> > Make do_escape_analysis independent of JVMTI capabilities. Nice!
> >
> >
> > src/hotspot/share/opto/callnode.hpp
> > Additional fields for MachSafePointNodes. Ok.
> >
> >
> > src/hotspot/share/opto/escape.cpp
> > Annotation for MachSafePointNodes. Your added functionality looks
> correct.
> > But I'd prefer to move the bulky code out of the large function.
> > I suggest to factor out something like has_not_global_escape and
> > has_arg_escape. So the code could look like this:
> >       SafePointNode* sfn = sfn_worklist.at(next);
> >       sfn->set_not_global_escape_in_scope(has_not_global_escape(sfn));
> >       if (sfn->is_CallJava()) {
> >         CallJavaNode* call = sfn->as_CallJava();
> >         call->set_arg_escape(has_arg_escape(call));
> >       }
> > This would also allow us to get rid of the found_..._escape_in_args
> variables
> > making the loops better readable.
> >
> > It's kind of ugly to use strcmp to recognize uncommon trap, but that seems
> > to be the way to do it (there are more such places). So it's ok.
> >
> >
> > src/hotspot/share/opto/machnode.hpp
> > Additional fields for MachSafePointNodes. Ok.
> >
> >
> > src/hotspot/share/opto/macro.cpp
> > Allow elimination of non-escaping allocations. Ok.
> >
> >
> > src/hotspot/share/opto/matcher.cpp
> > src/hotspot/share/opto/output.cpp
> > Copy attribute / pass parameters. Ok.
> >
> >
> > src/hotspot/share/prims/jvmtiCodeBlobEvents.cpp
> > Nice cleanup!
> >
> >
> > src/hotspot/share/prims/jvmtiEnv.cpp
> > src/hotspot/share/prims/jvmtiEnvBase.cpp
> > Escape barriers + deoptimize objects for target thread. Good.
> >
> >
> > src/hotspot/share/prims/jvmtiImpl.cpp
> > src/hotspot/share/prims/jvmtiImpl.hpp
> > The sequence is pretty complex:
> > VM_GetOrSetLocal element initialization executes EscapeBarrier code
> which
> > suspends the target thread (extra VM Operation).
> > VM_GetOrSetLocal::doit_prologue performs object deoptimization (by VM
> > Thread to prepare VM Operation with frame deoptimization).
> > VM_GetOrSetLocal destructor implicitly calls EscapeBarrier destructor
> which
> > resumes the target thread.
> > But I don't have any improvement proposal. Performance is probably not a
> > concern, here. So it's ok.
> >
> > VM_GetOrSetLocal::deoptimize_objects deoptimizes the top frame if it has
> > non-globally escaping objects and other frames if they have arg escaping
> > ones. Good.
> >
> >
> > src/hotspot/share/prims/jvmtiTagMap.cpp
> > Escape barriers + deoptimize objects for all threads. Ok.
> >
> >
> > src/hotspot/share/prims/whitebox.cpp
> > Added WB_IsFrameDeoptimized to API. Ok.
> >
> >
> > src/hotspot/share/runtime/deoptimization.cpp
> > Object deoptimization. I have more comments and proposals, here.
> > First of all, handling recursive and waiting locks in relock_objects is tricky,
> but
> > looks correct.
> > Comments are sufficient to understand why things are done as they are
> > implemented.
> >
> > BiasedLocking related parts are complex, but we may get rid of them in the
> > future (with BiasedLocking removal).
> > Anyway, looks correct, too.
> >
> > Typo in comment: "regularily" => "regularly"
> >
> > Deoptimization::fetch_unroll_info_helper is the only place where
> > _jvmti_deferred_updates get deallocated (except JavaThread destructor).
> > But I think we always go through it, so I can't see a memory leak or such
> kind
> > of issues.
> >
> > EscapeBarrier::deoptimize_objects: ResourceMark should use
> > calling_thread().
> >
> > You can use MutexLocker and MonitorLocker with Thread* to save the
> > Thread::current() call.
> >
> > I'd make set_objs_are_deoptimized static and remove it from the
> > EscapeBarrier interface because I think it shouldn't be used outside of
> > EscapeBarrier::deoptimize_objects.
> >
> > Typo in comment: "we must only deoptimize" => "we only have to
> > deoptimize"
> >
> > "bool EscapeBarrier::deoptimize_objects(intptr_t* fr_id)" is trivial and
> > barrier_active() is redundant. Implementation can get moved to hpp file.
> >
> > I'll get back to suspend flags, later.
> >
> > There are weird cases regarding _self_deoptimization_in_progress.
> > Assume we have 3 threads A, B and C. A deopts C, B deopts C, C deopts C.
> C
> > can set _self_deoptimization_in_progress while A performs the handshake
> > for suspending C. I think this doesn't lead to errors, but it's probably not
> > desired.
> > I think it would be better to use only one "wait" call in
> > sync_and_suspend_one and sync_and_suspend_all.
> >
> > I first thought it'd be better to move ThreadBlockInVM before wait() to
> > reduce thread state transitions, but that seems to be problematic because
> > ThreadBlockInVM destructor contains a safepoint check which we
> shouldn't
> > do while holding EscapeBarrier_lock. So no change request.
> >
> > Change in thread_added:
> > I think the sequence would be more comprehensive if we waited for
> > deopt_all_threads in Thread::start and all other places where a new thread
> > can run into Java code (e.g. JVMTI attach).
> > Your version makes new threads come up with suspend flag set. That looks
> > correct, too. Advantage is that you only have to change one place
> > (thread_added). It'll be interesting to see how it will look like when we use
> > async handshakes instead of suspend flags.
> > For now, I'm ok with your version.
> >
> > I'd only move MutexLocker ml(EscapeBarrier_lock...) after if (!jt-
> > >is_hidden_from_external_view()).
> >
> > Having 4 different deoptimize_objects functions makes it a little hard to
> keep
> > an overview of which one is used for what.
> > Maybe adding suffixes would help a little bit, but I can also live with what
> you
> > have.
> > Implementation looks correct to me.
> >
> >
> > src/hotspot/share/runtime/deoptimization.hpp
> > Escape barriers and object deoptimization functions.
> > Typo in comment: "helt" => "held"
> >
> >
> > src/hotspot/share/runtime/globals.hpp
> > Addition of develop flag DeoptimizeObjectsALotInterval. Ok.
> >
> >
> > src/hotspot/share/runtime/interfaceSupport.cpp
> > InterfaceSupport::deoptimizeAllObjects() is only used for
> > DeoptimizeObjectsALot = 1.
> > I think DeoptimizeObjectsALot = 2 is more important, but I think it's not bad
> > to have DeoptimizeObjectsALot = 1 in addition. Ok.
> >
> >
> > src/hotspot/share/runtime/interfaceSupport.inline.hpp
> > Addition of deoptimizeAllObjects. Ok.
> >
> >
> > src/hotspot/share/runtime/mutexLocker.cpp
> > src/hotspot/share/runtime/mutexLocker.hpp
> > Addition of EscapeBarrier_lock. Ok.
> >
> >
> > src/hotspot/share/runtime/objectMonitor.cpp
> > Make recursion count relock aware. Ok.
> >
> >
> > src/hotspot/share/runtime/stackValue.hpp
> > Better reinitialization in StackValue. Good.
> >
> >
> > src/hotspot/share/runtime/thread.cpp
> > src/hotspot/share/runtime/thread.hpp
> > src/hotspot/share/runtime/thread.inline.hpp
> > wait_for_object_deoptimization, suspend flag, deferred updates and test
> > feature to deoptimize objects.
> >
> > In the long term, we want to get rid of suspend flags, so it's not so nice to
> > introduce a new one. But I agree with Götz that it should be acceptable as a
> > temporary solution until async handshakes are available (which takes more
> > time). So I'm ok with your change.
> >
> > You can use MutexLocker with Thread*.
> >
> > JVMTIDeferredUpdates: I agree with Robbin. It'd be nice to move the class
> > out of thread.hpp.
> >
> >
> > src/hotspot/share/runtime/vframe.cpp
> > Added support for entry frame to new_vframe. Ok.
> >
> >
> > src/hotspot/share/runtime/vframe_hp.cpp
> > src/hotspot/share/runtime/vframe_hp.hpp
> >
> > I think code()->as_nmethod() in not_global_escape_in_scope() and
> > arg_escape() should better be under #ifdef ASSERT or inside the assert
> > statement (no need for code cache walking in product build).
> >
> > jvmtiDeferredLocalVariableSet::update_monitors:
> > Please add a comment explaining that owner referenced by original info
> > may be scalar replaced, but it is deoptimized in the vframe.
> >
> >
> > src/hotspot/share/utilities/macros.hpp
> > Addition of NOT_COMPILER2_OR_JVMCI_RETURN macros. Ok.
> >
> >
> >
> > test/hotspot/jtreg/serviceability/jvmti/Heap/IterateHeapWithEscapeAnalysisEnabled.java
> > test/hotspot/jtreg/serviceability/jvmti/Heap/libIterateHeapWithEscapeAnalysisEnabled.c
> > New test. Will review separately.
> >
> >
> > test/jdk/TEST.ROOT
> > Addition of vm.jvmci as required property. Ok.
> >
> >
> > test/jdk/com/sun/jdi/EATests.java
> > test/jdk/com/sun/jdi/EATestsJVMCI.java
> > New test. Will review separately.
> >
> >
> > test/lib/sun/hotspot/WhiteBox.java
> > Added isFrameDeoptimized to API. Ok.
> >
> >
> > That was it. Best regards,
> > Martin
> >
> >
> > > -----Original Message-----
> > > From: hotspot-compiler-dev
> > > <hotspot-compiler-dev-bounces at openjdk.java.net> On Behalf Of
> > > Reingruber, Richard
> > > Sent: Tuesday, 3 March 2020 21:23
> > > To: 'Robbin Ehn' <robbin.ehn at oracle.com>; Lindenmaier, Goetz
> > > <goetz.lindenmaier at sap.com>; David Holmes <david.holmes at oracle.com>;
> > > Vladimir Kozlov (vladimir.kozlov at oracle.com)
> > > <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net;
> > > hotspot-compiler-dev at openjdk.java.net;
> > > hotspot-runtime-dev at openjdk.java.net
> > > Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better
> > > Performance in the Presence of JVMTI Agents
> > >
> > > Hi Robbin,
> > >
> > > > > I understand that Robbin proposed to replace the usage of
> > > > > _suspend_flag with handshakes. Apparently, async handshakes
> > > > > are needed to do so. We have been waiting a while for removal
> > > > > of the _suspend_flag / introduction of async handshakes [2].
> > > > > What is the status here?
> > >
> > > > I have an old prototype which I would like to continue to work on.
> > > > So do not assume asynch handshakes will make 15.
> > > > Even if it would, I think there is a lot more investigation work to remove
> > > > _suspend_flag.
> > >
> > > Let us know if we can be of any help to you, even if it is only testing.
> > >
> > > > >> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/
> > >
> > > > DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> > > > You can move both declaration and definition to that file, no need to
> > > > clobber thread.[c|h]pp. (and the static function
> > > > deopt_objs_alot_thread_entry)
> > >
> > > Will do.
> > >
> > > > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't it be
> > > > in its own hpp file? It doesn't seem right to add JVM TI classes into
> > > > thread.hpp.
> > >
> > > You are right. It shouldn't be declared in thread.hpp. I will look into that.
> > >
> > > > Note that we also think we may have a bug in deopt:
> > > > https://bugs.openjdk.java.net/browse/JDK-8238237
> > >
> > > > I think it would be best, if possible, to push after that is resolved.
> > >
> > > Sure.
> > >
> > > > Not even nearly a full review :)
> > >
> > > I know :)
> > >
> > > Anyways, thanks a lot,
> > > Richard.
> > >
> > >
> > > -----Original Message-----
> > > From: Robbin Ehn <robbin.ehn at oracle.com>
> > > Sent: Monday, March 2, 2020 11:17 AM
> > > To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Reingruber, Richard
> > > <richard.reingruber at sap.com>; David Holmes <david.holmes at oracle.com>;
> > > Vladimir Kozlov (vladimir.kozlov at oracle.com)
> > > <vladimir.kozlov at oracle.com>; serviceability-dev at openjdk.java.net;
> > > hotspot-compiler-dev at openjdk.java.net;
> > > hotspot-runtime-dev at openjdk.java.net
> > > Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance
> > > in the Presence of JVMTI Agents
> > >
> > > Hi,
> > >
> > > On 2/24/20 5:39 PM, Lindenmaier, Goetz wrote:
> > > > Hi,
> > > >
> > > > I had a look at the progress of this change. Nothing
> > > > happened since Richard posted his update using more
> > > > handshakes [1].
> > > > But we (SAP) would appreciate a lot if this change could
> > > > be successfully reviewed and pushed.
> > > >
> > > > I think there is basic understanding that this
> > > > change is helpful. It fixes a number of issues with JVMTI,
> > > > and will deliver the same performance benefits as EA
> > > > does in current production mode for debugging scenarios.
> > > >
> > > > This is important for us as we run our VMs prepared
> > > > for debugging in production mode.
> > > >
> > > > I understand that Robbin proposed to replace the usage of
> > > > _suspend_flag with handshakes. Apparently, async handshakes
> > > > are needed to do so. We have been waiting a while for removal
> > > > of the _suspend_flag / introduction of async handshakes [2].
> > > > What is the status here?
> > >
> > > I have an old prototype which I would like to continue to work on.
> > > So do not assume asynch handshakes will make 15.
> > > Even if it would, I think there is a lot more investigation work to remove
> > > _suspend_flag.
> > >
> > > >
> > > > I think we should no longer wait, but proceed with
> > > > this change. We will look into removing the usage of
> > > > suspend_flag introduced here once it is possible to implement
> > > > it with handshakes.
> > >
> > > Yes, sure.
> > >
> > > >> Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/
> > >
> > > DeoptimizeObjectsALotThread is only used in compileBroker.cpp.
> > > You can move both declaration and definition to that file, no need to
> > > clobber thread.[c|h]pp. (and the static function
> > > deopt_objs_alot_thread_entry)
> > >
> > > Does JvmtiDeferredUpdates really need to be in thread.hpp, can't it be in
> > > its own hpp file? It doesn't seem right to add JVM TI classes into
> > > thread.hpp.
> > >
> > > Note that we also think we may have a bug in deopt:
> > > https://bugs.openjdk.java.net/browse/JDK-8238237
> > >
> > > I think it would be best, if possible, to push after that is resolved.
> > >
> > > Not even nearly a full review :)
> > >
> > > Thanks, Robbin
> > >
> > >
> > > >> Incremental:
> > > >> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/
> > > >>
> > > >> I was not able to eliminate the additional suspend flag now. I'll take
> > > >> care of this as soon as the existing suspend-resume-mechanism is
> > > >> reworked.
> > > >>
> > > >> Testing:
> > > >>
> > > >> Nightly tests @SAP:
> > > >>
> > > >>    JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015,
> > > >>    Renaissance Suite, SAP specific tests
> > > >>    with fastdebug and release builds on all platforms
> > > >>
> > > >>    Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x
> > > >>    parallel for 24h
> > > >>
> > > >> Thanks, Richard.
> > > >>
> > > >>
> > > >> More details on the changes:
> > > >>
> > > >> * Hide DeoptimizeObjectsALotThread from external view.
> > > >>
> > > >> * Changed EscapeBarrier_lock to be a _safepoint_check_never lock.
> > > >>    It used to be _safepoint_check_sometimes, which will be eliminated
> > > >>    sooner or later.
> > > >>    I added explicit thread state changes with ThreadBlockInVM to code
> > > >>    paths where we can wait() on EscapeBarrier_lock to become safepoint
> > > >>    safe.
> > > >>
> > > >> * Use handshake EscapeBarrierSuspendHandshake to suspend target
> > > >>    threads instead of vm operation VM_ThreadSuspendAllForObjDeopt.
> > > >>
> > > >> * Removed uses of Threads_lock. When adding a new thread we suspend
> > > >>    it iff EA optimizations are being reverted. In the previous version
> > > >>    we were waiting on Threads_lock while EA optimizations were
> > > >>    reverted. See EscapeBarrier::thread_added().
> > > >>
> > > >> * Made tests require Xmixed compilation mode.
> > > >>
> > > >> * Made tests agnostic regarding tiered compilation.
> > > >>    I.e. tc isn't disabled anymore, and the tests can be run with tc
> > > >>    enabled or disabled.
> > > >>
> > > >> * Exercising EATests.java as well with stress test options
> > > >>    DeoptimizeObjectsALot*
> > > >>    Due to the non-deterministic deoptimizations some tests need to be
> > > >>    skipped.
> > > >>    We do this to prevent bit-rot of the stress test code.
> > > >>
> > > >> * Executing EATests.java as well with graal if available. Driver for
> > > >>    this is EATestsJVMCI.java. Graal cannot pass all tests, because it
> > > >>    does not provide all the new debug info (namely
> > > >>    not_global_escape_in_scope and arg_escape in scopeDesc.hpp).
> > > >>    And graal does not yet support the JVMTI operations force early
> > > >>    return and pop frame.
> > > >>
> > > >> * Removed tracing from new jdi tests in EATests.java. Too much trace
> > > >>    output before the debugging connection is established can cause
> > > >>    deadlock because output buffers fill up.
> > > >>    (See https://bugs.openjdk.java.net/browse/JDK-8173304)
> > > >>
> > > >> * Many copyright year changes and smaller clean-up changes of testing
> > > >>    code (trailing white-space and the like).
> > > >>
> > > >>
> > > >> -----Original Message-----
> > > >> From: David Holmes <david.holmes at oracle.com>
> > > >> Sent: Thursday, 19 December 2019 03:12
> > > >> To: Reingruber, Richard <richard.reingruber at sap.com>;
> > > >> serviceability-dev at openjdk.java.net;
> > > >> hotspot-compiler-dev at openjdk.java.net;
> > > >> hotspot-runtime-dev at openjdk.java.net; Vladimir Kozlov
> > > >> (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>
> > > >> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> > > >> Performance in the Presence of JVMTI Agents
> > > >>
> > > >> Hi Richard,
> > > >>
> > > >> I think my issue is with the way EliminateNestedLocks works so I'm
> > > >> going to look into that more deeply.
> > > >>
> > > >> Thanks for the explanations.
> > > >>
> > > >> David
> > > >>
> > > >> On 18/12/2019 12:47 am, Reingruber, Richard wrote:
> > > >>> Hi David,
> > > >>>
> > > >>>     > >    > Some further queries/concerns:
> > > >>>     > >    >
> > > >>>     > >    > src/hotspot/share/runtime/objectMonitor.cpp
> > > >>>     > >    >
> > > >>>     > >    > Can you please explain the changes to ObjectMonitor::wait:
> > > >>>     > >    >
> > > >>>     > >    > !   _recursions = save      // restore the old recursion count
> > > >>>     > >    > !                 + jt->get_and_reset_relock_count_after_wait(); //
> > > >>>     > >    > increased by the deferred relock count
> > > >>>     > >    >
> > > >>>     > >    > what is the "deferred relock count"? I gather it relates to
> > > >>>     > >    >
> > > >>>     > >    > "The code was extended to be able to deoptimize objects
> > > >>>     > >    > of a frame that is not the top frame and to let another
> > > >>>     > >    > thread than the owning thread do it."
> > > >>>     > >
> > > >>>     > > Yes, these relate. Currently EA based optimizations are
> > > >>>     > > reverted, when a compiled frame is replaced with corresponding
> > > >>>     > > interpreter frames. Part of this is relocking objects with
> > > >>>     > > eliminated locking. New with the enhancement is that we do this
> > > >>>     > > also just before object references are acquired through JVMTI.
> > > >>>     > > In this case we deoptimize also the owning compiled frame C and
> > > >>>     > > we register deoptimized objects as deferred updates. When
> > > >>>     > > control returns to C it gets deoptimized, we notice that objects
> > > >>>     > > are already deoptimized (reallocated and relocked), so we don't
> > > >>>     > > do it again (relocking twice would be incorrect of course).
> > > >>>     > > Deferred updates are copied into the new interpreter frames.
> > > >>>     > >
> > > >>>     > > Problem: relocking is not possible if the target thread T is
> > > >>>     > > waiting on the monitor that needs to be relocked. This happens
> > > >>>     > > only with non-local objects with EliminateNestedLocks. Instead
> > > >>>     > > relocking is deferred until T owns the monitor again. This is
> > > >>>     > > what the piece of code above does.
> > > >>>     >
> > > >>>     >  Sorry I need some more detail here. How can you wait() on an
> > > >>>     >  object monitor if the object allocation and/or locking was
> > > >>>     >  optimised away? And what is a "non-local object" in this
> > > >>>     >  context? Isn't EA restricted to thread-confined objects?
> > > >>>
> > > >>> "Non-local object" is an object that escapes its thread. The issue I'm
> > > >>> addressing with the changes in ObjectMonitor::wait is almost unrelated
> > > >>> to EA. It is caused by EliminateNestedLocks, where C2 eliminates
> > > >>> recursive locking of an already owned lock. The lock owning object
> > > >>> exists on the heap, it is locked and you can call wait() on it.
> > > >>>
> > > >>> EliminateLocks is the C2 option that controls lock elimination based
> > > >>> on EA. Both optimizations have in common that objects with eliminated
> > > >>> locking need to be relocked when deoptimizing a frame, i.e. when
> > > >>> replacing a compiled frame with equivalent interpreter frames.
> > > >>> Deoptimization::relock_objects does that job for /all/ eliminated
> > > >>> locks in scope. /All/ can be a mix of eliminated nested locks and
> > > >>> locks of not-escaping objects.
> > > >>>
> > > >>> New with the enhancement: I call relock_objects earlier, just before
> > > >>> objects potentially escape. But then later when the owning compiled
> > > >>> frame gets deoptimized, I must not do it again:
> > > >>>
> > > >>> See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp:
> > > >>>
> > > >>>    if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) && EliminateLocks))
> > > >>>        && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
> > > >>>      bool unused;
> > > >>>      eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode, unused);
> > > >>>    }
> > > >>>
> > > >>> Now when calling relock_objects early it is quite possible that I have
> > > >>> to relock an object the target thread currently waits for. Obviously I
> > > >>> cannot relock in this case, instead I chose to introduce
> > > >>> relock_count_after_wait to JavaThread.
> > > >>>
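A toy model of the deferred relock bookkeeping may help when following the discussion. It is only a sketch under assumptions, not HotSpot code: ToyThread, ToyMonitor and all of their methods are hypothetical names. The one piece taken from the patch excerpt quoted in this thread is the restore step after wait (old recursion count plus deferred relock count). The model assumes that the recursion count holds the re-entries on top of the first acquisition.

```java
final class DeferredRelockModel {

    /** Hypothetical stand-in for the waiting thread; holds the deferred relock count. */
    static final class ToyThread {
        private int relockCountAfterWait;

        /** Called by the deoptimizing thread when a relock cannot be done yet. */
        void incRelockCountAfterWait() { relockCountAfterWait++; }

        int getAndResetRelockCountAfterWait() {
            int count = relockCountAfterWait;
            relockCountAfterWait = 0;
            return count;
        }
    }

    /** Hypothetical stand-in for a monitor with an owner flag and a recursion count. */
    static final class ToyMonitor {
        private boolean owned;
        private int recursions; // re-entries on top of the first acquisition

        void enter() {
            if (owned) { recursions++; } else { owned = true; }
        }

        void exit() {
            if (!owned) throw new IllegalMonitorStateException();
            if (recursions > 0) { recursions--; } else { owned = false; }
        }

        /** Start of wait(): remember the recursion count and release the monitor. */
        int releaseForWait() {
            if (!owned) throw new IllegalMonitorStateException();
            int save = recursions;
            recursions = 0;
            owned = false;
            return save;
        }

        /** End of wait(): reacquire, then restore the old count plus deferred relocks. */
        void reacquireAfterWait(int save, ToyThread waiter) {
            owned = true;
            recursions = save + waiter.getAndResetRelockCountAfterWait();
        }

        boolean isOwned() { return owned; }
    }

    /** Runs the scenario; returns how many exits were needed to release the monitor. */
    static int simulate() {
        ToyThread t = new ToyThread();
        ToyMonitor m = new ToyMonitor();

        m.enter();                      // compiled code: only the outer lock is real
        int save = m.releaseForWait();  // T waits; two nested locks were eliminated

        t.incRelockCountAfterWait();    // another thread reverts the optimization while
        t.incRelockCountAfterWait();    // T waits and defers the two relocks

        m.reacquireAfterWait(save, t);  // T wakes up and owns the monitor again

        int exits = 0;
        while (m.isOwned()) {           // the deoptimized frames now unlock three times
            m.exit();
            exits++;
        }
        return exits;
    }

    public static void main(String[] args) {
        System.out.println("exits needed: " + simulate());
    }
}
```

With one real lock and two eliminated nested locks, simulate() returns 3, which matches the three unlocks the interpreter frames perform after deoptimization.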
> > > >>>     >  Is it just that some of the locking gets optimized away e.g.
> > > >>>     >
> > > >>>     >  synchronized(obj) {
> > > >>>     >     synchronized(obj) {
> > > >>>     >       synchronized(obj) {
> > > >>>     >         obj.wait();
> > > >>>     >       }
> > > >>>     >     }
> > > >>>     >  }
> > > >>>     >
> > > >>>     >  If this is reduced to a form as-if it were a single lock of the
> > > >>>     >  monitor (due to EA) and the wait() triggers a JVM TI event which
> > > >>>     >  leads to the escape of "obj" then we need to reconstruct the true
> > > >>>     >  lock state, and so when the wait() internally unblocks and
> > > >>>     >  reacquires the monitor it has to set the true recursion count to
> > > >>>     >  3, not the 1 that it appeared to be when wait() was initially
> > > >>>     >  called. Is that the scenario?
> > > >>>
> > > >>> Kind of... except that the locking is not eliminated due to EA and
> > > >>> there is no JVM TI event triggered by wait.
> > > >>>
> > > >>> Add
> > > >>>
> > > >>> LocalObject l1 = new LocalObject();
> > > >>>
> > > >>> in front of the synchronized blocks and assume a JVM TI agent acquires
> > > >>> l1. This triggers the code in question.
> > > >>>
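The nesting scenario can be tried in plain Java. The sketch below involves neither JVM TI nor deoptimization, and a single run will most likely stay in the interpreter, so it does not exercise EliminateNestedLocks. It only shows the invariant the VM has to preserve: after wait() returns inside three nested synchronized blocks, the monitor is held at the full depth again. NestedWaitDemo and its method name are made up for illustration.

```java
final class NestedWaitDemo {

    /**
     * Waits inside three nested synchronized blocks and reports whether the
     * monitor is still held after leaving the two inner blocks, and released
     * after leaving the outer one.
     */
    static boolean heldAfterInnerExits() {
        final Object obj = new Object();
        final boolean[] notified = {false};

        Thread notifier = new Thread(() -> {
            synchronized (obj) {        // can only be entered while the waiter is in wait()
                notified[0] = true;
                obj.notifyAll();
            }
        });

        boolean held;
        try {
            synchronized (obj) {
                synchronized (obj) {
                    synchronized (obj) {
                        notifier.start();
                        while (!notified[0]) {
                            obj.wait(); // releases the monitor completely, whatever the depth
                        }
                    }
                }
                // Two inner blocks were left after wait(); the outer lock must still be held.
                held = Thread.holdsLock(obj);
            }
            notifier.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return held && !Thread.holdsLock(obj);
    }

    public static void main(String[] args) {
        System.out.println("lock depth restored after wait: " + heldAfterInnerExits());
    }
}
```

If the recursion count were restored to a smaller value after wait(), the monitor would already be free at the point where held is computed.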
> > > >>> See that relocking/reallocating is transactional. If it is done then
> > > >>> for /all/ objects in scope and it is done at most once. It wouldn't be
> > > >>> quite so easy to split this in relocking of nested/EA-based eliminated
> > > >>> locks.
> > > >>>
> > > >>>     >  If so I find this truly awful. Anyone using wait() in a realistic
> > > >>>     >  form requires a notification and so the object cannot be thread
> > > >>>     >  confined. In
> > > >>>
> > > >>> It is not thread confined.
> > > >>>
> > > >>>     >  which case I would strongly argue that upon hitting the wait() the
> > > >>>     >  deopt should occur unconditionally and so the lock state is correct
> > > >>>     >  before we wait and so we don't need to mess with the recursion
> > > >>>     >  count internally when we reacquire the monitor.
> > > >>>     >
> > > >>>     > >
> > > >>>     > >    > which I don't like the sound of at all when it comes to
> > > >>>     > >    > ObjectMonitor state. So I'd like to understand in detail
> > > >>>     > >    > exactly what is going on here and why.  This is a very
> > > >>>     > >    > intrusive change that seems to badly break encapsulation and
> > > >>>     > >    > impacts future changes to ObjectMonitor that are under
> > > >>>     > >    > investigation.
> > > >>>     > >
> > > >>>     > > I would not regard this as breaking encapsulation. Certainly not
> > > >>>     > > badly.
> > > >>>     > >
> > > >>>     > > I've added a property relock_count_after_wait to JavaThread. The
> > > >>>     > > property is well encapsulated. Future ObjectMonitor
> > > >>>     > > implementations have to deal with recursion too. They are free in
> > > >>>     > > choosing a way to do that as long as that property is taken into
> > > >>>     > > account. This is hardly a limitation.
> > > >>>     >
> > > >>>     >  I do think this badly breaks encapsulation as you have to add a
> > > >>>     >  callout from the guts of the ObjectMonitor code to reach into the
> > > >>>     >  thread to get this lock count adjustment. I understand why you
> > > >>>     >  have had to do this but I would much rather see a change to the EA
> > > >>>     >  optimisation strategy so that this is not needed.
> > > >>>     >
> > > >>>     > > Note also that the property is a straightforward extension of the
> > > >>>     > > existing concept of deferred local updates. It is embedded into
> > > >>>     > > the structure holding them. So not even the footprint of a
> > > >>>     > > JavaThread is enlarged if no deferred updates are generated.
> > > >>>     >
> > > >>>     > [...]
> > > >>>     >
> > > >>>     > >
> > > >>>     > > I'm actually duplicating the existing external suspend mechanism,
> > > >>>     > > because a thread can be suspended at most once. And hey, I don't
> > > >>>     > > like that either! But it seems not unlikely that the duplicate can
> > > >>>     > > be removed together with the original and the new type of
> > > >>>     > > handshakes that will be used for thread suspend can be used for
> > > >>>     > > object deoptimization too. See today's discussion in
> > > >>>     > > JDK-8227745 [2].
> > > >>>     >
> > > >>>     >  I hope that discussion bears some fruit, at the moment it seems
> > > >>>     >  not to be possible to use handshakes here. :(
> > > >>>     >
> > > >>>     >  The external suspend mechanism is a royal pain in the proverbial
> > > >>>     >  that we have to carefully live with. The idea that we're
> > > >>>     >  duplicating that for use in another fringe area of functionality
> > > >>>     >  does not thrill me at all.
> > > >>>     >
> > > >>>     >  To be clear, I understand the problem that exists and that you
> > > >>>     >  wish to solve, but for the runtime parts I balk at the complexity
> > > >>>     >  cost of solving it.
> > > >>>
> > > >>> I know it's complex, but by far no rocket science.
> > > >>>
> > > >>> Also I find it hard to imagine another fix for JDK-8233915 besides
> > > >>> changing the JVM TI specification.
> > > >>>
> > > >>> Thanks, Richard.
> > > >>>
> > > >>> -----Original Message-----
> > > >>> From: David Holmes <david.holmes at oracle.com>
> > > >>> Sent: Tuesday, 17 December 2019 08:03
> > > >>> To: Reingruber, Richard <richard.reingruber at sap.com>;
> > > >>> serviceability-dev at openjdk.java.net;
> > > >>> hotspot-compiler-dev at openjdk.java.net;
> > > >>> hotspot-runtime-dev at openjdk.java.net; Vladimir Kozlov
> > > >>> (vladimir.kozlov at oracle.com) <vladimir.kozlov at oracle.com>
> > > >>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> > > >>> Performance in the Presence of JVMTI Agents
> > > >>>
> > > >>> <resend as my mailer crashed during last send>
> > > >>>
> > > >>> David
> > > >>>
> > > >>> On 17/12/2019 4:57 pm, David Holmes wrote:
> > > >>>> Hi Richard,
> > > >>>>
> > > >>>> On 14/12/2019 5:01 am, Reingruber, Richard wrote:
> > > >>>>> Hi David,
> > > >>>>>
> > > >>>>>      > Some further queries/concerns:
> > > >>>>>      >
> > > >>>>>      > src/hotspot/share/runtime/objectMonitor.cpp
> > > >>>>>      >
> > > >>>>>      > Can you please explain the changes to ObjectMonitor::wait:
> > > >>>>>      >
> > > >>>>>      > !   _recursions = save      // restore the old recursion count
> > > >>>>>      > !                 + jt->get_and_reset_relock_count_after_wait(); //
> > > >>>>>      > increased by the deferred relock count
> > > >>>>>      >
> > > >>>>>      > what is the "deferred relock count"? I gather it relates to
> > > >>>>>      >
> > > >>>>>      > "The code was extended to be able to deoptimize objects of a
> > > >>>>>      > frame that is not the top frame and to let another thread than
> > > >>>>>      > the owning thread do it."
> > > >>>>>
> > > >>>>> Yes, these relate. Currently EA based optimizations are reverted,
> > > >>>>> when a compiled frame is replaced with corresponding interpreter
> > > >>>>> frames. Part of this is relocking objects with eliminated
> > > >>>>> locking. New with the enhancement is that we do this also just
> > > >>>>> before object references are acquired through JVMTI. In this case
> > > >>>>> we deoptimize also the owning compiled frame C and we register
> > > >>>>> deoptimized objects as deferred updates. When control returns to C
> > > >>>>> it gets deoptimized, we notice that objects are already deoptimized
> > > >>>>> (reallocated and relocked), so we don't do it again (relocking
> > > >>>>> twice would be incorrect of course). Deferred updates are copied
> > > >>>>> into the new interpreter frames.
> > > >>>>>
> > > >>>>> Problem: relocking is not possible if the target thread T is waiting
> > > >>>>> on the monitor that needs to be relocked. This happens only with
> > > >>>>> non-local objects with EliminateNestedLocks. Instead relocking is
> > > >>>>> deferred until T owns the monitor again. This is what the piece of
> > > >>>>> code above does.
> > > >>>>
> > > >>>> Sorry I need some more detail here. How can you wait() on an object
> > > >>>> monitor if the object allocation and/or locking was optimised away?
> > > >>>> And what is a "non-local object" in this context? Isn't EA restricted
> > > >>>> to thread-confined objects?
> > > >>>>
> > > >>>> Is it just that some of the locking gets optimized away e.g.
> > > >>>>
> > > >>>> synchronized(obj) {
> > > >>>>   synchronized(obj) {
> > > >>>>     synchronized(obj) {
> > > >>>>       obj.wait();
> > > >>>>     }
> > > >>>>   }
> > > >>>> }
> > > >>>>
> > > >>>> If this is reduced to a form as-if it were a single lock of the monitor
> > > >>>> (due to EA) and the wait() triggers a JVM TI event which leads to the
> > > >>>> escape of "obj" then we need to reconstruct the true lock state, and
> > > >>>> so when the wait() internally unblocks and reacquires the monitor it
> > > >>>> has to set the true recursion count to 3, not the 1 that it appeared
> > > >>>> to be when wait() was initially called. Is that the scenario?
> > > >>>>
> > > >>>> If so I find this truly awful. Anyone using wait() in a realistic form
> > > >>>> requires a notification and so the object cannot be thread confined.
> > > >>>> In which case I would strongly argue that upon hitting the wait() the
> > > >>>> deopt should occur unconditionally and so the lock state is correct
> > > >>>> before we wait and so we don't need to mess with the recursion count
> > > >>>> internally when we reacquire the monitor.
> > > >>>>
> > > >>>>>
> > > >>>>>      > which I don't like the sound of at all when it comes to
> > > >>>>>      > ObjectMonitor state. So I'd like to understand in detail
> > > >>>>>      > exactly what is going on here and why.  This is a very
> > > >>>>>      > intrusive change that seems to badly break encapsulation and
> > > >>>>>      > impacts future changes to ObjectMonitor that are under
> > > >>>>>      > investigation.
> > > >>>>>
> > > >>>>> I would not regard this as breaking encapsulation. Certainly not
> > > >>>>> badly.
> > > >>>>>
> > > >>>>> I've added a property relock_count_after_wait to JavaThread. The
> > > >>>>> property is well encapsulated. Future ObjectMonitor implementations
> > > >>>>> have to deal with recursion too. They are free in choosing a way to
> > > >>>>> do that as long as that property is taken into account. This is
> > > >>>>> hardly a limitation.
> > > >>>>
> > > >>>> I do think this badly breaks encapsulation as you have to add a
> > > >>>> callout from the guts of the ObjectMonitor code to reach into the
> > > >>>> thread to get this lock count adjustment. I understand why you have
> > > >>>> had to do this but I would much rather see a change to the EA
> > > >>>> optimisation strategy so that this is not needed.
> > > >>>>
> > > >>>>> Note also that the property is a straightforward extension of the
> > > >>>>> existing concept of deferred local updates. It is embedded into the
> > > >>>>> structure holding them. So not even the footprint of a JavaThread is
> > > >>>>> enlarged if no deferred updates are generated.
> > > >>>>>
> > > >>>>>      > ---
> > > >>>>>      >
> > > >>>>>      > src/hotspot/share/runtime/thread.cpp
> > > >>>>>      >
> > > >>>>>      > Can you please explain why
> > > >>>>>      > JavaThread::wait_for_object_deoptimization has to be
> > > >>>>>      > handcrafted in this way rather than using proper transitions.
> > > >>>>>      >
> > > >>>>>
> > > >>>>> I wrote wait_for_object_deoptimization taking
> > > >>>>> JavaThread::java_suspend_self_with_safepoint_check
> > > >>>>> as template. So in short: for the same reasons :)
> > > >>>>>
> > > >>>>> Threads reach both methods as part of thread state transitions,
> > > >>>>> therefore special handling is
> > > >>>>> required to change thread state on top of ongoing transitions.
> > > >>>>>
> > > >>>>>      > We got rid of "deopt suspend" some time ago and it is
> > > >>>>>      > disturbing to see it being added back (effectively). This seems
> > > >>>>>      > like it may be something that handshakes could be used for.
> > > >>>>>
> > > >>>>> Deopt suspend used to be something rather different with a similar
> > > >>>>> name[1]. It is not being added back.
> > > >>>>
> > > >>>> I stand corrected. Despite comments in the code to the contrary
> > > >>>> deopt_suspend didn't actually cause a self-suspend. I was doing a lot
> > > >>>> of cleanup in this area 13 years ago :)
> > > >>>>
> > > >>>>>
> > > >>>>> I'm actually duplicating the existing external suspend mechanism,
> > > >>>>> because a thread can be suspended at most once. And hey, I don't like
> > > >>>>> that either! But it seems not unlikely that the duplicate can be
> > > >>>>> removed together with the original and the new type of handshakes
> > > >>>>> that will be used for thread suspend can be used for object
> > > >>>>> deoptimization too. See today's discussion in JDK-8227745 [2].
> > > >>>>
> > > >>>> I hope that discussion bears some fruit, at the moment it seems not
> > > >>>> to be possible to use handshakes here. :(
> > > >>>>
> > > >>>> The external suspend mechanism is a royal pain in the proverbial that
> > > >>>> we have to carefully live with. The idea that we're duplicating that
> > > >>>> for use in another fringe area of functionality does not thrill me at
> > > >>>> all.
> > > >>>>
> > > >>>> To be clear, I understand the problem that exists and that you wish
> > > >>>> to solve, but for the runtime parts I balk at the complexity cost of
> > > >>>> solving it.
> > > >>>>
> > > >>>> Thanks,
> > > >>>> David
> > > >>>> -----
> > > >>>>
> > > >>>>> Thanks, Richard.
> > > >>>>>
> > > >>>>> [1] Deopt suspend was something like an async. handshake for
> > > >>>>>     architectures with register windows, where patching the return pc
> > > >>>>>     for deoptimization of a compiled frame was racy if the owner
> > > >>>>>     thread was in native code. Instead a "deopt" suspend flag was set
> > > >>>>>     on which the thread patched its own frame upon return from
> > > >>>>>     native. So no thread was suspended. It got its name only from the
> > > >>>>>     name of the flags.
> > > >>>>>
> > > >>>>> [2] Discussion about using handshakes to sync. with the target thread:
> > > >>>>>
> > > >>>>> https://bugs.openjdk.java.net/browse/JDK-8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14306727
> > > >>>>>
> > > >>>>>
> > > >>>>> -----Original Message-----
> > > >>>>> From: David Holmes <david.holmes at oracle.com>
> > > >>>>> Sent: Friday, 13 December 2019 00:56
> > > >>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
> > > >>>>> serviceability-dev at openjdk.java.net;
> > > >>>>> hotspot-compiler-dev at openjdk.java.net;
> > > >>>>> hotspot-runtime-dev at openjdk.java.net
> > > >>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> > > >>>>> Performance in the Presence of JVMTI Agents
> > > >>>>>
> > > >>>>> Hi Richard,
> > > >>>>>
> > > >>>>> Some further queries/concerns:
> > > >>>>>
> > > >>>>> src/hotspot/share/runtime/objectMonitor.cpp
> > > >>>>>
> > > >>>>> Can you please explain the changes to ObjectMonitor::wait:
> > > >>>>>
> > > >>>>> !   _recursions = save      // restore the old recursion count
> > > >>>>> !                 + jt->get_and_reset_relock_count_after_wait(); // increased by the deferred relock count
> > > >>>>>
> > > >>>>> what is the "deferred relock count"? I gather it relates to
> > > >>>>>
> > > >>>>> "The code was extended to be able to deoptimize objects of a frame
> > > >>>>> that is not the top frame and to let another thread than the owning
> > > >>>>> thread do it."
> > > >>>>>
> > > >>>>> which I don't like the sound of at all when it comes to ObjectMonitor
> > > >>>>> state. So I'd like to understand in detail exactly what is going on
> > > >>>>> here and why. This is a very intrusive change that seems to badly
> > > >>>>> break encapsulation and impacts future changes to ObjectMonitor that
> > > >>>>> are under investigation.
> > > >>>>>
> > > >>>>> ---
> > > >>>>>
> > > >>>>> src/hotspot/share/runtime/thread.cpp
> > > >>>>>
> > > >>>>> Can you please explain why JavaThread::wait_for_object_deoptimization
> > > >>>>> has to be handcrafted in this way rather than using proper transitions.
> > > >>>>>
> > > >>>>> We got rid of "deopt suspend" some time ago and it is disturbing to
> > > >>>>> see it being added back (effectively). This seems like it may be
> > > >>>>> something that handshakes could be used for.
> > > >>>>>
> > > >>>>> Thanks,
> > > >>>>> David
> > > >>>>> -----
> > > >>>>>
> > > >>>>> On 12/12/2019 7:02 am, David Holmes wrote:
> > > >>>>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote:
> > > >>>>>>> Hi David,
> > > >>>>>>>
> > > >>>>>>>     > Most of the details here are in areas I can comment on in detail,
> > > >>>>>>>     > but I did take an initial general look at things.
> > > >>>>>>>
> > > >>>>>>> Thanks for taking the time!
> > > >>>>>>
> > > >>>>>> Apologies the above should read:
> > > >>>>>>
> > > >>>>>> "Most of the details here are in areas I *can't* comment on in
> > > >>>>>> detail ..."
> > > >>>>>>
> > > >>>>>> David
> > > >>>>>>
> > > >>>>>>>     > The only thing that jumped out at me is that I think the
> > > >>>>>>>     > DeoptimizeObjectsALotThread should be a hidden thread.
> > > >>>>>>>     >
> > > >>>>>>>     > +  bool is_hidden_from_external_view() const { return true; }
> > > >>>>>>>
> > > >>>>>>> Yes, it should. Will add the method like above.
> > > >>>>>>>
> > > >>>>>>>     > Also I don't see any testing of the DeoptimizeObjectsALotThread.
> > > >>>>>>>     > Without active testing this will just bit-rot.
> > > >>>>>>>
> > > >>>>>>> DeoptimizeObjectsALot is meant for stress testing with a larger
> > > >>>>>>> workload. I will add a minimal test
> > > >>>>>>> to keep it fresh.
> > > >>>>>>>
> > > >>>>>>>     > Also on the tests I don't understand your @requires clause:
> > > >>>>>>>     >
> > > >>>>>>>     >   @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
> > > >>>>>>>     >   (vm.opt.TieredCompilation != true))
> > > >>>>>>>     >
> > > >>>>>>>     > This seems to require that TieredCompilation is disabled, but
> > > >>>>>>>     > tiered is our normal mode of operation. ??
> > > >>>>>>>     >
> > > >>>>>>>
> > > >>>>>>> I removed the clause. I guess I wanted to target the tests towards
> > > >>>>>>> the code they are supposed to test, and it's easier to analyze
> > > >>>>>>> failures w/o tiered compilation and with just one compiler thread.
> > > >>>>>>>
> > > >>>>>>> Additionally I will make use of
> > > >>>>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests.
> > > >>>>>>>
> > > >>>>>>> Thanks,
> > > >>>>>>> Richard.
> > > >>>>>>>
> > > >>>>>>> -----Original Message-----
> > > >>>>>>> From: David Holmes <david.holmes at oracle.com>
> > > >>>>>>> Sent: Mittwoch, 11. Dezember 2019 08:03
> > > >>>>>>> To: Reingruber, Richard <richard.reingruber at sap.com>;
> > > >>>>>>> serviceability-dev at openjdk.java.net;
> > > >>>>>>> hotspot-compiler-dev at openjdk.java.net;
> > > >>>>>>> hotspot-runtime-dev at openjdk.java.net
> > > >>>>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better
> > > >>>>>>> Performance in the Presence of JVMTI Agents
> > > >>>>>>>
> > > >>>>>>> Hi Richard,
> > > >>>>>>>
> > > >>>>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote:
> > > >>>>>>>> Hi,
> > > >>>>>>>>
> > > >>>>>>>> I would like to get reviews please for
> > > >>>>>>>>
> > > >>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/
> > > >>>>>>>>
> > > >>>>>>>> Corresponding RFE:
> > > >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745
> > > >>>>>>>>
> > > >>>>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915
> > > >>>>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1]
> > > >>>>>>>>
> > > >>>>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without
> > > >>>>>>>> issues (thanks!). In addition the change is being tested at SAP
> > > >>>>>>>> since I posted the first RFR some months ago.
> > > >>>>>>>>
> > > >>>>>>>> The intention of this enhancement is to benefit performance wise
> > > >>>>>>>> from escape analysis even if JVMTI agents request capabilities that
> > > >>>>>>>> allow them to access local variable values. E.g. if you start-up
> > > >>>>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then
> > > >>>>>>>> escape analysis is disabled right from the beginning, well before a
> > > >>>>>>>> debugger attaches -- if ever one should do so. With the enhancement,
> > > >>>>>>>> escape analysis will remain enabled until and after a debugger
> > > >>>>>>>> attaches. EA based optimizations are reverted just before an agent
> > > >>>>>>>> acquires the reference to an object. In the JBS item you'll find
> > > >>>>>>>> more details.
> > > >>>>>>>
> > > >>>>>>> Most of the details here are in areas I can comment on in detail,
> > > >>>>>>> but I did take an initial general look at things.
> > > >>>>>>>
> > > >>>>>>> The only thing that jumped out at me is that I think the
> > > >>>>>>> DeoptimizeObjectsALotThread should be a hidden thread.
> > > >>>>>>>
> > > >>>>>>> +  bool is_hidden_from_external_view() const { return true; }
> > > >>>>>>>
> > > >>>>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread.
> > > >>>>>>> Without
> > > >>>>>>> active testing this will just bit-rot.
> > > >>>>>>>
> > > >>>>>>> Also on the tests I don't understand your @requires clause:
> > > >>>>>>>
> > > >>>>>>>     @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled &
> > > >>>>>>>     (vm.opt.TieredCompilation != true))
> > > >>>>>>>
> > > >>>>>>> This seems to require that TieredCompilation is disabled, but tiered
> > > >>>>>>> is our normal mode of operation. ??
> > > >>>>>>>
> > > >>>>>>> Thanks,
> > > >>>>>>> David
> > > >>>>>>>
> > > >>>>>>>> Thanks,
> > > >>>>>>>> Richard.
> > > >>>>>>>>
> > > >>>>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745
> > > >>>>>>>>
> > > >>>>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>

From christian.hagedorn at oracle.com  Mon Jul 13 07:19:37 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Mon, 13 Jul 2020 09:19:37 +0200
Subject: [16] RFR(S): 8248552: C2 crashes with SIGFPE due to division by
 zero
In-Reply-To: <3ba2ef6a-8ade-7ede-5252-21051c34b472@oracle.com>
References: <70e8e42b-5cb3-9c1e-419e-2f771f042368@oracle.com>
 <3ba2ef6a-8ade-7ede-5252-21051c34b472@oracle.com>
Message-ID: <9e2f26bd-daa4-9540-8401-9850e0beea94@oracle.com>

Thank you Vladimir for your review!

Best regards,
Christian

On 11.07.20 01:25, Vladimir Kozlov wrote:
> Looks good.
> 
> Thanks,
> Vladimir
> 
> On 7/10/20 12:37 AM, Christian Hagedorn wrote:
>> Hi
>>
>> Please review the following patch:
>> https://bugs.openjdk.java.net/browse/JDK-8248552
>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.00/
>>
>> In the failing testcase, C2 removes a zero check for a division/modulo 
>> node n based on the type information of the loop induction variable 
>> phi p (always between 1 and 50 and never 0). However, n is later split 
>> through p and ends up after the AddNode which updates the induction 
>> variable p. In the last iteration j equals 2 and is then updated to 0. 
>> The division/modulo node n is now executed before the loop limit check 
>> which results in a SIGFPE.
>>
>> The fix bails out of PhaseIdealLoop::split_thru_phi if a division or 
>> modulo node has its zero check removed (i.e. control is NULL) and is
>> split through a phi which has an input that could be zero. This should 
>> only happen for an induction variable phi of a trip-counted (integer) 
>> loop.
>>
>> Best regards,
>> Christian

From christian.hagedorn at oracle.com  Mon Jul 13 09:06:35 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Mon, 13 Jul 2020 11:06:35 +0200
Subject: [16] RFR(S): 8248552: C2 crashes with SIGFPE due to division by
 zero
In-Reply-To: <9e2f26bd-daa4-9540-8401-9850e0beea94@oracle.com>
References: <70e8e42b-5cb3-9c1e-419e-2f771f042368@oracle.com>
 <3ba2ef6a-8ade-7ede-5252-21051c34b472@oracle.com>
 <9e2f26bd-daa4-9540-8401-9850e0beea94@oracle.com>
Message-ID: <c569ebf8-58ed-c6df-2e0e-38c692f926c6@oracle.com>

A test in some later tier testing revealed that the assertion code is 
actually too strong. There can be a Div/Mod node whose zero check was 
removed but that is then split through a non-induction-variable phi
whose inputs have zero in their type range (which is fine, this happens 
in some loop opts after partial peeling was applied earlier). This 
happened, for example, for a phi which merged two nodes from the 
original and a cloned loop. I think we just need to remove the 
additional assertion code.
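
For readers following along, the bail-out condition from the original RFR
can be modelled roughly as below. This is only a toy sketch, not the code
in the webrev: IntRange, may_split_div_thru_phi and the example ranges are
made-up names and values, and the real check works on C2 nodes and types.

```cpp
#include <vector>

// Toy integer type range [lo, hi]; a hypothetical stand-in for a C2 int type.
struct IntRange {
  long lo;
  long hi;
  bool contains_zero() const { return lo <= 0 && 0 <= hi; }
};

// Returns false when splitting should bail out: the Div/Mod has lost its
// zero check and at least one phi input (the future divisor of a cloned
// Div/Mod) may be zero.
bool may_split_div_thru_phi(bool zero_check_removed,
                            const std::vector<IntRange>& phi_inputs) {
  if (!zero_check_removed) {
    return true;  // the bail-out only concerns nodes without a zero check
  }
  for (const IntRange& in : phi_inputs) {
    if (in.contains_zero()) {
      return false;  // a clone of the division could see a zero divisor
    }
  }
  return true;
}
```

With made-up ranges, may_split_div_thru_phi(true, {{50, 50}, {0, 48}})
returns false because the second input may be zero. Note that the check
looks only at the phi inputs and not at what kind of phi it is, which is
why the extra induction-variable assertion is not needed for it.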

New webrev:
http://cr.openjdk.java.net/~chagedorn/8248552/webrev.01/

Best regards,
Christian

On 13.07.20 09:19, Christian Hagedorn wrote:
> Thank you Vladimir for your review!
> 
> Best regards,
> Christian
> 
> On 11.07.20 01:25, Vladimir Kozlov wrote:
>> Looks good.
>>
>> Thanks,
>> Vladimir
>>
>> On 7/10/20 12:37 AM, Christian Hagedorn wrote:
>>> Hi
>>>
>>> Please review the following patch:
>>> https://bugs.openjdk.java.net/browse/JDK-8248552
>>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.00/
>>>
>>> In the failing testcase, C2 removes a zero check for a 
>>> division/modulo node n based on the type information of the loop 
>>> induction variable phi p (always between 1 and 50 and never 0). 
>>> However, n is later split through p and ends up after the AddNode 
>>> which updates the induction variable p. In the last iteration j 
>>> equals 2 and is then updated to 0. The division/modulo node n is now 
>>> executed before the loop limit check which results in a SIGFPE.
>>>
>>> The fix bails out of PhaseIdealLoop::split_thru_phi if a division or 
>>> modulo node has its zero check removed (i.e. control is NULL) and is
>>> split through a phi which has an input that could be zero. This 
>>> should only happen for an induction variable phi of a trip-counted 
>>> (integer) loop.
>>>
>>> Best regards,
>>> Christian

From aph at redhat.com  Mon Jul 13 09:16:11 2020
From: aph at redhat.com (Andrew Haley)
Date: Mon, 13 Jul 2020 10:16:11 +0100
Subject: [15] RFR(S): 8248845: AArch64: stack corruption after spilling
 vector register
In-Reply-To: <854kqiqrt0.fsf@arm.com>
References: <857dvfrev5.fsf@arm.com>
 <0eeec297-f2e1-e326-5d3a-eb4a11e47934@oracle.com> <854kqiqrt0.fsf@arm.com>
Message-ID: <7acb43fb-abd2-d895-8f1b-2ab4aff140a2@redhat.com>

On 08/07/2020 10:28, Nick Gasson wrote:
> I wonder whether we should only do scheduling on AArch64 for in-order
> CPUs? I tried SPECjvm with/without OptoScheduling on a few different
> AArch64 systems but couldn't get conclusive results either way.

Arm has always been difficult to performance tune because it's an
architecture, not a processor. I didn't test on Arm's own designs at
all for the first few years of AArch64.

We schedule based on a conversation I had with Arm architects, which
basically amounted to "Schedule for in-order cores and the out-of-order
cores will look after themselves." I'd prefer not to disable scheduling
simply because it's buggy; that feels wrong to me. But if it's making
things worse, then we can do so.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From tobias.hartmann at oracle.com  Mon Jul 13 10:07:10 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 13 Jul 2020 12:07:10 +0200
Subject: [15] RFR (S): 8247502: PhaseStringOpts crashes while optimising
 effectively dead code
In-Reply-To: <a1d56db5-cae7-f402-fed5-8cd13ee44450@oracle.com>
References: <9ee563ef-501b-bdaa-4e87-8e9e8aaf2dd7@oracle.com>
 <a1d56db5-cae7-f402-fed5-8cd13ee44450@oracle.com>
Message-ID: <3beee001-0e4f-c92f-3746-74c8ed6bb043@oracle.com>

+1

Thanks for taking care of this while I'm on vacation!

Best regards,
Tobias


On 11.07.20 01:19, Vladimir Kozlov wrote:
> I agree with this small fix.
> 
> Thanks,
> Vladimir
> 
> On 7/10/20 9:26 AM, Vladimir Ivanov wrote:
>> https://bugs.openjdk.java.net/browse/JDK-8247502
>> http://cr.openjdk.java.net/~vlivanov/8247502/webrev.00/
>>
>> As Tobias discovered, PhaseStringOpts crashes when it encounters String::append() argument being
>> TOP: TOP is a constant, but the code expects to see a String constant instead.
>>
>> It happens while processing a call in unreachable infinite loop. The code is effectively dead, but
>> IGVN and PhaseRemoveUseless don't see that. It is discovered later when loop opts kick in which
>> clean it up.
>>
>> Proposed fix tries to make the code more robust and just bails out the optimization when TOP is
>> encountered.
>>
>> Alternative way to fix the problem would be to clean up the graph before PhaseStringOpts (e.g., by
>> running PhaseIdealLoop(LoopOptsNone) since PhaseRemoveUseless is not enough), but PhaseIdealLoop
>> pass can be expensive. So, I'm in favor of the local fix in PhaseStringOpts.
>>
>> Testing: crash reproducer, hs-precheckin-comp, hs-tier1, hs-tier2, tier1
>>
>> Thanks!
>>
>> PS: no regression test since I wasn't able to extract a simple reproducer from the crash log.
>>
>> Best regards,
>> Vladimir Ivanov

From christian.hagedorn at oracle.com  Mon Jul 13 10:09:51 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Mon, 13 Jul 2020 12:09:51 +0200
Subject: [16] RFR(S): 8247743: Segmentation fault in debug builds due to
 stack overflow in find_recur with deep graphs
In-Reply-To: <9af7a44c-4267-4900-812c-12aa0c37713a@oracle.com>
References: <9af7a44c-4267-4900-812c-12aa0c37713a@oracle.com>
Message-ID: <518ffdf1-143a-06f3-9aa4-96871d72d024@oracle.com>

Ping - could anyone review it, please? Thanks!

Best regards,
Christian

On 02.07.20 09:33, Christian Hagedorn wrote:
> Hi
> 
> Please review the following patch:
> https://bugs.openjdk.java.net/browse/JDK-8247743
> http://cr.openjdk.java.net/~chagedorn/8247743/webrev.00/
> 
> The testcase creates a deep graph with a lot of nodes on a chain. When 
> running with the specified test flags, it recursively calls 
> Node::find_recur() for each node discovered which eventually results in 
> a segmentation fault due to a stack overflow (around 10000 calls due to 
> such a long chain of nodes). The fix just converts the recursive 
> algorithm into an iterative one to avoid a segmentation fault. This is 
> similar to JDK-8246203 [1].
> 
> I additionally removed Node::find_ctrl() and its special handling in the 
> algorithm since it is not used.
> 
> There is actually another problem with the recursive version. When 
> running the testcase without -XX:CompileOnly=compiler/c2/TestFindNode, 
> it will spin forever inside [2] because there is a debug_orig node cycle 
> and the loop does not break based on the debug_orig nodes being visited. 
> This is also fixed in the patch.
> 
> Thank you!
> 
> Best regards,
> Christian
> 
> 
> [1] https://bugs.openjdk.java.net/browse/JDK-8246203
> [2] 
> http://hg.openjdk.java.net/jdk/jdk/file/e2622818f0bd/src/hotspot/share/opto/node.cpp#l1589 
> 

From vladimir.x.ivanov at oracle.com  Mon Jul 13 13:40:36 2020
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Mon, 13 Jul 2020 16:40:36 +0300
Subject: [15] RFR (S): 8247502: PhaseStringOpts crashes while optimising
 effectively dead code
In-Reply-To: <3beee001-0e4f-c92f-3746-74c8ed6bb043@oracle.com>
References: <9ee563ef-501b-bdaa-4e87-8e9e8aaf2dd7@oracle.com>
 <a1d56db5-cae7-f402-fed5-8cd13ee44450@oracle.com>
 <3beee001-0e4f-c92f-3746-74c8ed6bb043@oracle.com>
Message-ID: <949b0726-d36a-c103-3632-2f31390cefdb@oracle.com>

Thanks for the reviews, Vladimir & Tobias.

Best regards,
Vladimir Ivanov

On 13.07.2020 13:07, Tobias Hartmann wrote:
> +1
> 
> Thanks for taking care of this while I'm on vacation!
> 
> Best regards,
> Tobias
> 
> 
> On 11.07.20 01:19, Vladimir Kozlov wrote:
>> I agree with this small fix.
>>
>> Thanks,
>> Vladimir
>>
>> On 7/10/20 9:26 AM, Vladimir Ivanov wrote:
>>> https://bugs.openjdk.java.net/browse/JDK-8247502
>>> http://cr.openjdk.java.net/~vlivanov/8247502/webrev.00/
>>>
>>> As Tobias discovered, PhaseStringOpts crashes when it encounters String::append() argument being
>>> TOP: TOP is a constant, but the code expects to see a String constant instead.
>>>
>>> It happens while processing a call in unreachable infinite loop. The code is effectively dead, but
>>> IGVN and PhaseRemoveUseless don't see that. It is discovered later when loop opts kick in which
>>> clean it up.
>>>
>>> Proposed fix tries to make the code more robust and just bails out the optimization when TOP is
>>> encountered.
>>>
>>> Alternative way to fix the problem would be to clean up the graph before PhaseStringOpts (e.g., by
>>> running PhaseIdealLoop(LoopOptsNone) since PhaseRemoveUseless is not enough), but PhaseIdealLoop
>>> pass can be expensive. So, I'm in favor of the local fix in PhaseStringOpts.
>>>
>>> Testing: crash reproducer, hs-precheckin-comp, hs-tier1, hs-tier2, tier1
>>>
>>> Thanks!
>>>
>>> PS: no regression test since I wasn't able to extract a simple reproducer from the crash log.
>>>
>>> Best regards,
>>> Vladimir Ivanov

From beurba at microsoft.com  Mon Jul 13 14:03:57 2020
From: beurba at microsoft.com (Bernhard Urban-Forster)
Date: Mon, 13 Jul 2020 14:03:57 +0000
Subject: RFR(XS) 8248671: AArch64: Remove unused variables
In-Reply-To: <1c652b56-2476-ede0-47f8-13c4e99639d0@oracle.com>
References: <DM6PR21MB14529CCD3BDC0E3AA1AA38A1C2640@DM6PR21MB1452.namprd21.prod.outlook.com>
 <1c652b56-2476-ede0-47f8-13c4e99639d0@oracle.com>
Message-ID: <DM6PR21MB1452947EABFC3F39470CB44BC2600@DM6PR21MB1452.namprd21.prod.outlook.com>

Thank you for your review Andrew and David.

Here is the webrev based on https://hg.openjdk.java.net/jdk/jdk/:

http://cr.openjdk.java.net/~burban/8248671_hg/


Thanks,
-Bernhard

> -----Original Message-----
> From: David Holmes <david.holmes at oracle.com>
> Sent: Monday, July 13, 2020 6:08 AM
> To: Bernhard Urban-Forster <beurba at microsoft.com>;
> aarch64-port-dev at openjdk.java.net; hotspot-dev at openjdk.java.net;
> hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR(XS) 8248671: AArch64: Remove unused variables
> 
> Hi Bernhard,
> 
> On 10/07/2020 7:08 am, Bernhard Urban-Forster wrote:
> > Hello everyone,
> >
> >
> > please review this change:
> >
> > JBS: https://bugs.openjdk.java.net/browse/JDK-8248671
> > Webrev: http://cr.openjdk.java.net/~burban/8248671_unused-vars/
> >
> > We found this issue while bringing up Windows+AArch64 support for
> > HotSpot. The Microsoft toolchain (MSVC) seems to be slightly more pedantic
> > than GCC.
> 
> Looks good and trivial.
> 
> But could I request that webrevs/patches for mainline be generated against the
> mainline hg repository rather than the git mirror.
> 
> Thanks,
> David
> 
> >
> > Thanks,
> > -Bernhard
> >

From patric.hedlin at oracle.com  Mon Jul 13 15:33:50 2020
From: patric.hedlin at oracle.com (Patric Hedlin)
Date: Mon, 13 Jul 2020 17:33:50 +0200
Subject: [aarch64-port-dev ] RFR(S/M): 8247766: [aarch64] guarantee(val <
 (1U << nbits)) failed: Field too big for insn
In-Reply-To: <a5005a89-43b0-f248-6edd-6f242c037ff7@redhat.com>
References: <0cdbdf26-ad4d-056b-a801-cc31b2cc4ab3@oracle.com>
 <a5005a89-43b0-f248-6edd-6f242c037ff7@redhat.com>
Message-ID: <a951dfb0-0ae6-5953-78f4-936d3d5e1831@oracle.com>

Hi Andrew,

On 2020-07-09 16:26, Andrew Haley wrote:
> On 07/07/2020 12:17, Patric Hedlin wrote:
>> C1 code generation for reading and writing stack-slots does not handle
>> large immediate offsets on aarch64. This patch will ensure that
>> immediate offsets are admissible for base+(immediate)offset encoding or,
>> if this is not the case, will enforce an explicit address calculation to
>> a scratch register. (Also correcting a small glitch in 9-bit signed
>> immediate encoding check.)
>>
>> NOTE: Current patch includes (local) definitions of is_simm/9 and
>> is_uimm/12, for review purpose only. With JDK-8248901 these will move to
>> Assembler, and will not be included in the change-set.
> Umm, OK. These functions seem too complicated: all you have to do is
>
>      int64_t chk = val >> (nbits - 1);
>      guarantee (chk == -1 || chk == 0, "Field too big for insn");
The 'guarantee' of course works poorly as a predicate and the 'chk' 
calculation is based on implementation-dependent behaviour.

> but the AArch64 part of it looks fine.
Having seen your second answer, I guess you had a change of heart. What a
pity.

/Patric

From aph at redhat.com  Mon Jul 13 17:06:34 2020
From: aph at redhat.com (Andrew Haley)
Date: Mon, 13 Jul 2020 18:06:34 +0100
Subject: [aarch64-port-dev ] RFR(S/M): 8247766: [aarch64] guarantee(val <
 (1U << nbits)) failed: Field too big for insn
In-Reply-To: <a951dfb0-0ae6-5953-78f4-936d3d5e1831@oracle.com>
References: <0cdbdf26-ad4d-056b-a801-cc31b2cc4ab3@oracle.com>
 <a5005a89-43b0-f248-6edd-6f242c037ff7@redhat.com>
 <a951dfb0-0ae6-5953-78f4-936d3d5e1831@oracle.com>
Message-ID: <6656d2d2-053b-63c1-e3b8-878600313bc3@redhat.com>

Hi,

On 13/07/2020 16:33, Patric Hedlin wrote:
>
> On 2020-07-09 16:26, Andrew Haley wrote:
>> On 07/07/2020 12:17, Patric Hedlin wrote:
>>> C1 code generation for reading and writing stack-slots does not handle
>>> large immediate offsets on aarch64. This patch will ensure that
>>> immediate offsets are admissible for base+(immediate)offset encoding or,
>>> if this is not the case, will enforce an explicit address calculation to
>>> a scratch register. (Also correcting a small glitch in 9-bit signed
>>> immediate encoding check.)
>>>
>>> NOTE: Current patch includes (local) definitions of is_simm/9 and
>>> is_uimm/12, for review purpose only. With JDK-8248901 these will move to
>>> Assembler, and will not be included in the change-set.
>> Umm, OK. These functions seem too complicated: all you have to do is
>>
>>      int64_t chk = val >> (nbits - 1);
>>      guarantee (chk == -1 || chk == 0, "Field too big for insn");
> The 'guarantee' of course works poorly as a predicate and the 'chk'
> calculation is based on implementation-dependent behaviour.

Hmm. Signed >> does require a 2's complement C implementation, but we
assume many implementation-defined things in HotSpot. I know, if we
can remove such things perhaps we should, all other things being
equal.
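
To make the two options concrete, here is a small standalone sketch (not
code from the patch or from HotSpot; fits_simm_shift and fits_simm_range
are made-up names). The first function is the shift-based test quoted
above turned into a predicate; the second gets the same answer from a
plain range comparison and never shifts a negative value. Before C++20
the result of >> on a negative signed value is implementation-defined.

```cpp
#include <cassert>
#include <cstdint>

// Shift-based check: true if val fits in an nbits-wide signed field.
// Relies on >> of a negative int64_t being an arithmetic shift.
bool fits_simm_shift(int64_t val, unsigned nbits) {
  assert(nbits >= 1 && nbits <= 63);
  int64_t chk = val >> (nbits - 1);
  return chk == -1 || chk == 0;
}

// Range-based check: same predicate without shifting a negative value.
bool fits_simm_range(int64_t val, unsigned nbits) {
  assert(nbits >= 1 && nbits <= 63);
  const int64_t limit = int64_t(1) << (nbits - 1);  // 256 for nbits == 9
  return -limit <= val && val < limit;
}
```

For a 9-bit signed immediate both accept -256..255 and reject 256 and
-257, assuming the usual arithmetic right shift.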

>> but the AArch64 part of it looks fine.
> Having seen your second answer, I guess you had a change of
> heart. What a pity.

Of course I want this bug fixed, and I'm grateful for this patch.

However, we already have an equivalent overflow test in a couple of (a
few, probably) different places; and your patch adds another one. We
shouldn't be doing that.

I'm quite open to doing it in another way, thereby replacing the
existing logic, but not to duplicating code.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From vladimir.kozlov at oracle.com  Mon Jul 13 17:16:31 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 13 Jul 2020 10:16:31 -0700
Subject: [16] RFR(S): 8248552: C2 crashes with SIGFPE due to division by
 zero
In-Reply-To: <c569ebf8-58ed-c6df-2e0e-38c692f926c6@oracle.com>
References: <70e8e42b-5cb3-9c1e-419e-2f771f042368@oracle.com>
 <3ba2ef6a-8ade-7ede-5252-21051c34b472@oracle.com>
 <9e2f26bd-daa4-9540-8401-9850e0beea94@oracle.com>
 <c569ebf8-58ed-c6df-2e0e-38c692f926c6@oracle.com>
Message-ID: <a2066deb-ef3b-7071-33ff-03c829069cff@oracle.com>

This raises the question: why was the zero check removed if one of the merged types includes 0?
Should we be more careful when we remove a zero check?

Thanks,
Vladimir

On 7/13/20 2:06 AM, Christian Hagedorn wrote:
> A test in some later tier testing revealed that the assertion code is actually too strong. There can be a Div/Mod node 
> whose zero check was removed but that is then split through a non-induction-variable phi whose inputs have zero in their
> type range (which is fine, this happens in some loop opts after partial peeling was applied earlier). This happened, for 
> example, for a phi which merged two nodes from the original and a cloned loop. I think we just need to remove the 
> additional assertion code.
> 
> New webrev:
> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.01/
> 
> Best regards,
> Christian
> 
> On 13.07.20 09:19, Christian Hagedorn wrote:
>> Thank you Vladimir for your review!
>>
>> Best regards,
>> Christian
>>
>> On 11.07.20 01:25, Vladimir Kozlov wrote:
>>> Looks good.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 7/10/20 12:37 AM, Christian Hagedorn wrote:
>>>> Hi
>>>>
>>>> Please review the following patch:
>>>> https://bugs.openjdk.java.net/browse/JDK-8248552
>>>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.00/
>>>>
>>>> In the failing testcase, C2 removes a zero check for a division/modulo node n based on the type information of the 
>>>> loop induction variable phi p (always between 1 and 50 and never 0). However, n is later split through p and ends up 
>>>> after the AddNode which updates the induction variable p. In the last iteration j equals 2 and is then updated to 0. 
>>>> The division/modulo node n is now executed before the loop limit check which results in a SIGFPE.
>>>>
>>>> The fix bails out of PhaseIdealLoop::split_thru_phi if a division or modulo node has its zero check removed (i.e. 
>>>> control is NULL) and is split through a phi which has an input that could be zero. This should only happen for an
>>>> induction variable phi of a trip-counted (integer) loop.
>>>>
>>>> Best regards,
>>>> Christian

From vladimir.kozlov at oracle.com  Mon Jul 13 17:43:36 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 13 Jul 2020 10:43:36 -0700
Subject: [16] RFR(S): 8247743: Segmentation fault in debug builds due to
 stack overflow in find_recur with deep graphs
In-Reply-To: <518ffdf1-143a-06f3-9aa4-96871d72d024@oracle.com>
References: <9af7a44c-4267-4900-812c-12aa0c37713a@oracle.com>
 <518ffdf1-143a-06f3-9aa4-96871d72d024@oracle.com>
Message-ID: <9b3a9632-c7bb-2f51-c295-72935add2670@oracle.com>

Node::find_ctrl() is used during debugging when you want to print and look at only control nodes.
We have several such methods which are only used in the debugger.

I suggest storing old_arena() in a local variable and passing it into add_to_worklist().

You can make add_to_worklist() static since you pass node as argument.
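
To illustrate the shape I have in mind, here is a rough standalone sketch
of an iterative search with a static helper. It is not the webrev code:
ToyNode and find_node are made-up names, the real code walks Node inputs
and outputs and uses HotSpot's own containers, and the visited set below
only stands in for whatever bookkeeping the patch uses.

```cpp
#include <unordered_set>
#include <vector>

// Toy stand-in for a C2 node (hypothetical, not HotSpot's Node class).
struct ToyNode {
  int idx;
  std::vector<ToyNode*> inputs;
};

// Static helper: it only needs its arguments, no 'this'.
// Pushes a node onto the worklist the first time it is seen.
static void add_to_worklist(ToyNode* n, std::vector<ToyNode*>& worklist,
                            std::unordered_set<const ToyNode*>& visited) {
  if (n != nullptr && visited.insert(n).second) {
    worklist.push_back(n);
  }
}

// Iterative search for the node with the given idx reachable from start.
// An explicit worklist replaces recursion, so long chains cannot overflow
// the native stack, and the visited set makes cycles terminate.
ToyNode* find_node(ToyNode* start, int idx) {
  std::vector<ToyNode*> worklist;
  std::unordered_set<const ToyNode*> visited;
  add_to_worklist(start, worklist, visited);
  while (!worklist.empty()) {
    ToyNode* n = worklist.back();
    worklist.pop_back();
    if (n->idx == idx) {
      return n;
    }
    for (ToyNode* in : n->inputs) {
      add_to_worklist(in, worklist, visited);
    }
  }
  return nullptr;  // not reachable from start
}
```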

Thanks,
Vladimir

On 7/13/20 3:09 AM, Christian Hagedorn wrote:
> Ping - could anyone review it, please? Thanks!
> 
> Best regards,
> Christian
> 
> On 02.07.20 09:33, Christian Hagedorn wrote:
>> Hi
>>
>> Please review the following patch:
>> https://bugs.openjdk.java.net/browse/JDK-8247743
>> http://cr.openjdk.java.net/~chagedorn/8247743/webrev.00/
>>
>> The testcase creates a deep graph with a lot of nodes on a chain. When running with the specified test flags, it 
>> recursively calls Node::find_recur() for each node discovered which eventually results in a segmentation fault due to 
>> a stack overflow (around 10000 calls due to such a long chain of nodes). The fix just converts the recursive algorithm 
>> into an iterative one to avoid a segmentation fault. This is similar to JDK-8246203 [1].
>>
>> I additionally removed Node::find_ctrl() and its special handling in the algorithm since it is not used.
>>
>> There is actually another problem with the recursive version. When running the testcase without 
>> -XX:CompileOnly=compiler/c2/TestFindNode, it will spin forever inside [2] because there is a debug_orig node cycle and 
>> the loop does not break based on the debug_orig nodes being visited. This is also fixed in the patch.
>>
>> Thank you!
>>
>> Best regards,
>> Christian
>>
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8246203
>> [2] http://hg.openjdk.java.net/jdk/jdk/file/e2622818f0bd/src/hotspot/share/opto/node.cpp#l1589

From ekaterina.pavlova at oracle.com  Mon Jul 13 19:38:10 2020
From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova)
Date: Mon, 13 Jul 2020 12:38:10 -0700
Subject: RFR[15] (T/XS): 8236809 [Graal]
 java/lang/Class/getDeclaredField/FieldSetAccessibleTest.java timeouts
Message-ID: <d1821057-e181-d1b1-698d-3a2a21d2ee20@oracle.com>

Hi all,

please review this small change which adds the test into ProblemList-graal.txt
till we have libgraal.


     JBS: https://bugs.openjdk.java.net/browse/JDK-8236809
  webrev: http://cr.openjdk.java.net/~epavlova//8236809/webrev.00/index.html


regards,
-katya

From vladimir.kozlov at oracle.com  Mon Jul 13 21:25:40 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 13 Jul 2020 14:25:40 -0700
Subject: RFR[15] (T/XS): 8236809 [Graal]
 java/lang/Class/getDeclaredField/FieldSetAccessibleTest.java timeouts
In-Reply-To: <d1821057-e181-d1b1-698d-3a2a21d2ee20@oracle.com>
References: <d1821057-e181-d1b1-698d-3a2a21d2ee20@oracle.com>
Message-ID: <0816f8ed-4122-39ea-3806-0fba1a51d34a@oracle.com>

Good.

Thanks,
Vladimir K

On 7/13/20 12:38 PM, Ekaterina Pavlova wrote:
> Hi all,
> 
> please review this small change which adds the test into ProblemList-graal.txt
> till we have libgraal.
> 
> 
>      JBS: https://bugs.openjdk.java.net/browse/JDK-8236809
>   webrev: http://cr.openjdk.java.net/~epavlova//8236809/webrev.00/index.html
> 
> 
> regards,
> -katya

From igor.ignatyev at oracle.com  Mon Jul 13 21:29:18 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Mon, 13 Jul 2020 14:29:18 -0700
Subject: RFR [15] : 8249036 : clean up FileInstaller $test.src $cwd in
 vmTestbase_nsk_stress tests
Message-ID: <5E2ED18E-9CD6-44D6-95D0-E13D1AFC1BC3@oracle.com>

http://cr.openjdk.java.net/~iignatyev//8249036/webrev.00/
> 44 lines changed: 0 ins; 23 del; 21 mod; 

Hi all,

could you please review this clean-up which removes unnecessary `FileInstaller` actions for :vmTestbase_nsk_stress tests?
from the main issue(8204985):
> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. some tests depend on this step, so we need 1st identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests.

none of vmTestbase_nsk_stress tests need FileInstaller, hence the patch is just `ag -l  '@run driver jdk.test.lib.FileInstaller . .' vmTestbase/nsk/stress | xargs -I{} gsed -i '/@run driver jdk.test.lib.FileInstaller \. \./d' {}`.

JBS: https://bugs.openjdk.java.net/browse/JDK-8249036
webrev: http://cr.openjdk.java.net/~iignatyev//8249036/webrev.00/
testing: :vmTestbase_nsk_stress on linux-x64

Thanks,
-- Igor
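
For readers without ag/gsed at hand, the line deletion described above can be sketched in Java. This is only an illustration of what the sed expression does to one file's text; the class and method names (FileInstallerCleanup, strip) are made up for this sketch and are not part of the patch or of jtreg.

```java
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Hypothetical helper, not part of the webrev: drops the FileInstaller action lines. */
final class FileInstallerCleanup {
    // Same expression as the sed address in the mail; the two trailing dots are literal.
    private static final Pattern FILE_INSTALLER =
            Pattern.compile("@run driver jdk.test.lib.FileInstaller \\. \\.");

    /** Returns the source text without any line that contains the FileInstaller action. */
    static String strip(String source) {
        return source.lines()                                           // Java 11+
                     .filter(line -> !FILE_INSTALLER.matcher(line).find())
                     .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        String header = String.join("\n",
                " * @library /vmTestbase",
                " * @run driver jdk.test.lib.FileInstaller . .",
                " * @run main/othervm Foo");
        System.out.println(strip(header));
    }
}
```

Only whole lines are removed, which matches the `d` command of sed; any other line of a test header is left untouched.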

From vladimir.kozlov at oracle.com  Mon Jul 13 21:34:06 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 13 Jul 2020 14:34:06 -0700
Subject: RFR [15] : 8249036 : clean up FileInstaller $test.src $cwd in
 vmTestbase_nsk_stress tests
In-Reply-To: <5E2ED18E-9CD6-44D6-95D0-E13D1AFC1BC3@oracle.com>
References: <5E2ED18E-9CD6-44D6-95D0-E13D1AFC1BC3@oracle.com>
Message-ID: <5720357e-c3c6-6f7f-7993-535561fd84e2@oracle.com>

Good.

Thanks,
Vladimir K

On 7/13/20 2:29 PM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev//8249036/webrev.00/
>> 44 lines changed: 0 ins; 23 del; 21 mod;
> 
> Hi all,
> 
> could you please review this clean-up which removes unnecessary `FileInstaller` actions for :vmTestbase_nsk_stress tests?
> from the main issue(8204985):
>> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. some tests depend on this step, so we need 1st identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests.
> 
> none of vmTestbase_nsk_stress tests need FileInstaller, hence the patch is just `ag -l  '@run driver jdk.test.lib.FileInstaller . .' vmTestbase/nsk/stress | xargs -I{} gsed -i '/@run driver jdk.test.lib.FileInstaller \. \./d' {}`.
> 
> JBS: https://bugs.openjdk.java.net/browse/JDK-8249036
> webrev: http://cr.openjdk.java.net/~iignatyev//8249036/webrev.00/
> testing: :vmTestbase_nsk_stress on linux-x64
> 
> Thanks,
> -- Igor
> 

From jamsheed.c.m at oracle.com  Tue Jul 14 08:28:33 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Tue, 14 Jul 2020 13:58:33 +0530
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
In-Reply-To: <8522a69e-e538-2cc9-5364-887e450fc653@oracle.com>
References: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
 <8325fac5-6258-9b88-6507-5dcb0597cc17@oracle.com>
 <8522a69e-e538-2cc9-5364-887e450fc653@oracle.com>
Message-ID: <46144d6d-5714-05ad-a263-01507db937cc@oracle.com>

Hi all,

I had incorrectly added an extra check to the assert after the offset 
computation in address_offset, for AddPs with non-constant offsets (like [1]).

I am not changing the old assert, even though I do not expect the first 
AddP/second AddP (for array addressing) case for an init captured store.

http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA_asserts_corrected/

Best regards,

Jamsheed

[1]

assert(offs != Type::OffsetBot ||
- adr->in(AddPNode::Address)->in(0)->is_AllocateArray(),
+ adr->in(AddPNode::Address)->in(0)->is_AllocateArray() || is_captured_store(adr),
             "offset must be a constant or it is initialization of array");

On 13/07/2020 11:14, Jamsheed C M wrote:
>
> Hi,
>
> I reworked the fix. I compute the offset for all init captured stores, but 
> treat these special init captured stores similarly to unsafe (as these 
> objects are usually GlobalEscape, so there are no perf implications).
>
> revised webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.01/
>
> testing: mach1-5( logs in jbs)
>
> Best regards,
>
> Jamsheed
>
> On 09/07/2020 19:36, Jamsheed C M wrote:
>>
>> Hi,
>>
>> Request to hold the review. I need to change the code for dealing with 
>> unsafe access, as the current capture code spends more execution time 
>> analyzing things.
>>
>> Best regards,
>>
>> Jamsheed
>>
>> On 09/07/2020 13:01, Jamsheed C M wrote:
>>>
>>> Hi all,
>>>
>>> JBS:https://bugs.openjdk.java.net/browse/JDK-8242895
>>>
>>> Request for review of changes made to offset computation and field 
>>> write detection for init captured stores, needed because phis are added 
>>> between alloc and init. This happens if the init node is in a different 
>>> outer loop wrt the alloc node and there is a loop opt. This was required 
>>> as a result of enhancement [1].
>>>
>>> Normally inits are not associated with multiple alloc nodes during the EA 
>>> phase, but the changes done for [1] caused code shapes of the form 
>>> [2] to generate inits associated with multiple alloc nodes.
>>>
>>> This had implications for offset computation and field write detection 
>>> related to initializing stores.
>>>
>>> Attempt to fix in EA:
>>>
>>>      webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA/
>>>
>>> Alternate fix:
>>>
>>>      Minimize the scenario in compiler generated code by throwing 
>>> only j.l.Error from slowpath(all exception async/sync are handled in 
>>> runtime exit).
>>>
>>>      Stub epilog doesn't poll or throw any exceptions. Disable full 
>>> loop opt before EA for detectable patterns and bailout EA for late 
>>> detected patterns.
>>>
>>>      webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_deopt/
>>>
>>> Please advise.
>>>
>>> Testing : mach tier1-5 (logs in jbs)
>>>
>>> Best regards,
>>>
>>> Jamsheed
>>>
>>>
>>> [1] JDK-8231291 
>>> <https://bugs.openjdk.java.net/browse/JDK-8231291>C2: loop opts 
>>> before EA should maximally unroll loops
>>>
>>> [2] that have its init node in different outer loop wrt to alloc node.
>>>
>>>
>>> loop begin
>>>
>>>    try{
>>>
>>>    return new obj()/ throw new obj()/ uncommon trap after 
>>> allocation, in a loop
>>>
>>>    } catch(ex) {
>>>
>>>    }
>>>
>>> loop end
>>>
>>>     public static IntA test(int n) {
>>>         for (int i=0; i<2; i++) {
>>>             try {
>>>                 return new IntA(n + i);
>>>             } catch (Exception e) {
>>>             }
>>>         }
>>>
>>>

From christian.hagedorn at oracle.com  Tue Jul 14 09:54:01 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Tue, 14 Jul 2020 11:54:01 +0200
Subject: [16] RFR(S): 8247743: Segmentation fault in debug builds due to
 stack overflow in find_recur with deep graphs
In-Reply-To: <9b3a9632-c7bb-2f51-c295-72935add2670@oracle.com>
References: <9af7a44c-4267-4900-812c-12aa0c37713a@oracle.com>
 <518ffdf1-143a-06f3-9aa4-96871d72d024@oracle.com>
 <9b3a9632-c7bb-2f51-c295-72935add2670@oracle.com>
Message-ID: <2f317601-4845-541d-e2ef-ad7735386f1c@oracle.com>

Hi Vladimir

On 13.07.20 19:43, Vladimir Kozlov wrote:
> Node::find_ctrl() is used during debugging when you want to print and 
> look at only control nodes.
> We have several such methods which are only used in the debugger.

I see, I restored this method and changed Node::find() accordingly. I 
additionally added two find_ctrl() methods to make it easier to call it 
from a debugger (as already present for find_node()).

> I suggest to store old_arena() in local var and pass into 
> add_to_worklist().
> 
> You can make add_to_worklist() static since you pass node as argument.

Okay. I updated this and the change above in a new webrev:
http://cr.openjdk.java.net/~chagedorn/8247743/webrev.01/

Best regards,
Christian

> Thanks,
> Vladimir
> 
> On 7/13/20 3:09 AM, Christian Hagedorn wrote:
>> Ping - could anyone review it, please? Thanks!
>>
>> Best regards,
>> Christian
>>
>> On 02.07.20 09:33, Christian Hagedorn wrote:
>>> Hi
>>>
>>> Please review the following patch:
>>> https://bugs.openjdk.java.net/browse/JDK-8247743
>>> http://cr.openjdk.java.net/~chagedorn/8247743/webrev.00/
>>>
>>> The testcase creates a deep graph with a lot of nodes on a chain. 
>>> When running with the specified test flags, it recursively calls 
>>> Node::find_recur() for each node discovered which eventually results 
>>> in a segmentation fault due to a stack overflow (around 10000 calls 
>>> due to such a long chain of nodes). The fix just converts the 
>>> recursive algorithm into an iterative one to avoid a segmentation 
>>> fault. This is similar to JDK-8246203 [1].
>>>
>>> I additionally removed Node::find_ctrl() and its special handling in 
>>> the algorithm since it is not used.
>>>
>>> There is actually another problem with the recursive version. When 
>>> running the testcase without 
>>> -XX:CompileOnly=compiler/c2/TestFindNode, it will spin forever inside 
>>> [2] because there is a debug_orig node cycle and the loop does not 
>>> break based on the debug_orig nodes being visited. This is also fixed 
>>> in the patch.
>>>
>>> Thank you!
>>>
>>> Best regards,
>>> Christian
>>>
>>>
>>> [1] https://bugs.openjdk.java.net/browse/JDK-8246203
>>> [2] 
>>> http://hg.openjdk.java.net/jdk/jdk/file/e2622818f0bd/src/hotspot/share/opto/node.cpp#l1589 
>>>
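
The recursive-to-iterative conversion discussed in this thread can be illustrated with a small sketch. It is written in Java rather than HotSpot's C++, and GraphNode and IterativeFind are made-up names, not HotSpot's Node API. It only shows the two ideas from the mail: an explicit worklist replaces the call stack, and a visited set makes the search terminate on cycles.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a graph node; not HotSpot's Node class. */
final class GraphNode {
    final int idx;
    final List<GraphNode> inputs = new ArrayList<>();

    GraphNode(int idx) { this.idx = idx; }
}

final class IterativeFind {
    /** Searches the graph reachable from start for a node with the given idx. */
    static GraphNode find(GraphNode start, int idx) {
        Deque<GraphNode> worklist = new ArrayDeque<>(); // replaces the recursion stack
        Set<GraphNode> visited = new HashSet<>();       // guards against cycles
        worklist.push(start);
        visited.add(start);
        while (!worklist.isEmpty()) {
            GraphNode n = worklist.pop();
            if (n.idx == idx) {
                return n;
            }
            for (GraphNode in : n.inputs) {
                if (visited.add(in)) {                  // add() is false if already seen
                    worklist.push(in);
                }
            }
        }
        return null;                                    // no node with that idx
    }

    public static void main(String[] args) {
        // A chain far deeper than a typical thread stack allows for recursion,
        // closed into a cycle at the end.
        GraphNode head = new GraphNode(0);
        GraphNode cur = head;
        for (int i = 1; i < 1_000_000; i++) {
            GraphNode next = new GraphNode(i);
            cur.inputs.add(next);
            cur = next;
        }
        cur.inputs.add(head);
        System.out.println(find(head, 999_999) == cur); // found without deep recursion
        System.out.println(find(head, -1));             // terminates despite the cycle
    }
}
```

The worklist lives on the heap, so the depth of the graph is limited by memory rather than by the thread stack size.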

From christian.hagedorn at oracle.com  Tue Jul 14 12:32:19 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Tue, 14 Jul 2020 14:32:19 +0200
Subject: [16] RFR(S): 8248552: C2 crashes with SIGFPE due to division by
 zero
In-Reply-To: <a2066deb-ef3b-7071-33ff-03c829069cff@oracle.com>
References: <70e8e42b-5cb3-9c1e-419e-2f771f042368@oracle.com>
 <3ba2ef6a-8ade-7ede-5252-21051c34b472@oracle.com>
 <9e2f26bd-daa4-9540-8401-9850e0beea94@oracle.com>
 <c569ebf8-58ed-c6df-2e0e-38c692f926c6@oracle.com>
 <a2066deb-ef3b-7071-33ff-03c829069cff@oracle.com>
Message-ID: <f6a62d10-9cff-3fc2-0947-c49e775a23c9@oracle.com>

Hi Vladimir

I had a closer look at the failing testcase with webrev.00. The original 
DivNode has its zero check removed based on correct type information. 
Afterwards it is split through an induction variable phi for which both 
inputs have non-zero types. So, the DivNode ends up after an AddINode 
(which adds a positive constant) which has a non-zero type. All good so far.

Now we add pre/main/post loops and the induction variable phi for the 
pre-loop gets type int>=1 since the limit for the pre-loop is hidden 
behind an Opaque1 node which just returns int as type. The AddINode 
belonging to the loop induction variable phi in the pre-loop is 
therefore updated to have the type int as well (int>=1 + positive_int 
could overflow). This type information propagates to the main-loop and 
its AddINode belonging to the loop induction variable phi (which is an 
input to the DivNode) also gets its type set to int.
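
To make the "could overflow" step concrete, here is a simplified model of range addition, as a sketch only. IntRange is a made-up class and not C2's TypeInt; the assumption is merely that a sum whose bounds may leave the int range is widened to the full int range, which then includes zero.

```java
/** Hypothetical, simplified integer range; not HotSpot's TypeInt. */
final class IntRange {
    static final IntRange FULL = new IntRange(Integer.MIN_VALUE, Integer.MAX_VALUE);

    final int lo;
    final int hi;

    IntRange(int lo, int hi) {
        this.lo = lo;
        this.hi = hi;
    }

    /** Range of a + b; falls back to the full int range if a bound may overflow. */
    IntRange add(IntRange other) {
        long newLo = (long) lo + other.lo;
        long newHi = (long) hi + other.hi;
        if (newLo < Integer.MIN_VALUE || newHi > Integer.MAX_VALUE) {
            return FULL;
        }
        return new IntRange((int) newLo, (int) newHi);
    }

    boolean mayBeZero() {
        return lo <= 0 && 0 <= hi;
    }

    @Override
    public String toString() {
        return "[" + lo + ", " + hi + "]";
    }

    public static void main(String[] args) {
        IntRange stride = new IntRange(1, 1);
        // Precise limit known: the sum stays strictly positive.
        System.out.println(new IntRange(1, 50).add(stride).mayBeZero());
        // Limit hidden (int>=1): the upper bound may overflow, so zero becomes possible.
        System.out.println(new IntRange(1, Integer.MAX_VALUE).add(stride).mayBeZero());
    }
}
```

With the precise bound the result excludes zero; with only int>=1 available, the model can no longer rule zero out, which corresponds to the loss of precision described above.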

Later, we add a vector post loop where we clone the main loop and add a 
phi p for the AddINode and its new clone. Since the DivINode 
has a control outside of the main loop, it is not cloned and gets the 
phi p as an input. At a later point in time, we want to split through p. 
But then we detect zero as possible value due to the type range of both 
AddINodes being int.

Even though the type information is not accurate enough, the DivINode is 
never zero and we could safely apply the split through the phi. We could 
think about doing a bail out for all kinds of phis but I think it should 
only be an actual problem for loop induction variable phis.

Thinking about this type propagation problem, couldn't we somehow set 
the type of the Opaque1 node hiding the pre-loop limit to the same type 
as the pre-loop limit to allow this information to flow to the pre and 
main loop? Or would that cause other problems? I guess there probably 
must be a reason why we don't do it like that.

Best regards,
Christian

On 13.07.20 19:16, Vladimir Kozlov wrote:
> This raises a question: why was the zero check removed if one of the 
> merged types has 0?
> Should we be more careful when we remove the zero check?
> 
> Thanks,
> Vladimir
> 
> On 7/13/20 2:06 AM, Christian Hagedorn wrote:
>> A test in some later tier testing revealed that the assertion code is 
>> actually too strong. There can be a Div/Mod node whose zero check was 
>> removed but that is then split through a non-induction-variable phi 
>> whose inputs have zero in their type range (which is fine, this 
>> happens in some loop opts after partial peeling was applied earlier). 
>> This happened, for example, for a phi which merged two nodes from the 
>> original and a cloned loop. I think we just need to remove the 
>> additional assertion code.
>>
>> New webrev:
>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.01/
>>
>> Best regards,
>> Christian
>>
>> On 13.07.20 09:19, Christian Hagedorn wrote:
>>> Thank you Vladimir for your review!
>>>
>>> Best regards,
>>> Christian
>>>
>>> On 11.07.20 01:25, Vladimir Kozlov wrote:
>>>> Looks good.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 7/10/20 12:37 AM, Christian Hagedorn wrote:
>>>>> Hi
>>>>>
>>>>> Please review the following patch:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8248552
>>>>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.00/
>>>>>
>>>>> In the failing testcase, C2 removes a zero check for a 
>>>>> division/modulo node n based on the type information of the loop 
>>>>> induction variable phi p (always between 1 and 50 and never 0). 
>>>>> However, n is later split through p and ends up after the AddNode 
>>>>> which updates the induction variable p. In the last iteration j 
>>>>> equals 2 and is then updated to 0. The division/modulo node n is 
>>>>> now executed before the loop limit check which results in a SIGFPE.
>>>>>
>>>>> The fix bails out of PhaseIdealLoop::split_thru_phi if a division 
>>>>> or modulo node has its zero check removed (i.e. control is NULL) 
>>>>> and is split through a phi which has an input that could be zero. 
>>>>> This should only happen for an induction variable phi of a 
>>>>> trip-counted (integer) loop.
>>>>>
>>>>> Best regards,
>>>>> Christian
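
The failure mode from the original report can be mimicked at source level. This is a hand-written sketch, not the failing testcase: the loop bounds (j from 50 down to 2 in steps of 2) and all names are assumptions chosen to match the description, and Java throws ArithmeticException where the compiled code would hit SIGFPE.

```java
final class SplitThroughPhiDemo {
    /** Original shape: the division only ever sees j in 2..50. */
    static int original(int x) {
        int sum = 0;
        for (int j = 50; j > 0; j -= 2) {
            sum += x / j;
        }
        return sum;
    }

    /** Division moved behind the induction variable update, before the exit check. */
    static int splitThroughPhi(int x) {
        int sum = 0;
        int j = 50;
        int q = x / j;          // division on the loop entry value
        while (true) {
            sum += q;
            j -= 2;
            q = x / j;          // last iteration: j was 2, is now 0, division fails
            if (j <= 0) {
                break;
            }
        }
        return sum;
    }

    /** Same split, but the exit check runs before the division. */
    static int guarded(int x) {
        int sum = 0;
        int j = 50;
        int q = x / j;
        while (true) {
            sum += q;
            j -= 2;
            if (j <= 0) {
                break;
            }
            q = x / j;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(original(1000) == guarded(1000));
        try {
            splitThroughPhi(1000);
        } catch (ArithmeticException e) {
            System.out.println("division by zero after the induction variable update");
        }
    }
}
```

The unguarded variant is only safe if the updated value of j can never be zero, which is what the bailout in the fix checks via the type of the phi input.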

From christian.hagedorn at oracle.com  Tue Jul 14 12:39:32 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Tue, 14 Jul 2020 14:39:32 +0200
Subject: [16] RFR(S): 8248552: C2 crashes with SIGFPE due to division by
 zero
In-Reply-To: <f6a62d10-9cff-3fc2-0947-c49e775a23c9@oracle.com>
References: <70e8e42b-5cb3-9c1e-419e-2f771f042368@oracle.com>
 <3ba2ef6a-8ade-7ede-5252-21051c34b472@oracle.com>
 <9e2f26bd-daa4-9540-8401-9850e0beea94@oracle.com>
 <c569ebf8-58ed-c6df-2e0e-38c692f926c6@oracle.com>
 <a2066deb-ef3b-7071-33ff-03c829069cff@oracle.com>
 <f6a62d10-9cff-3fc2-0947-c49e775a23c9@oracle.com>
Message-ID: <e8d6aa47-b584-4fdc-f506-1b1c52aa28f5@oracle.com>


> [..] Since the DivINode has a control outside of the main loop [..]

Edit: I actually meant that get_ctrl() returns a node outside of the 
main-loop (i.e. the DivINode is not part of the main-loop body). The 
DivINode still has NULL as control input.

Best regards,
Christian

On 14.07.20 14:32, Christian Hagedorn wrote:
> Hi Vladimir
> 
> I had a closer look at the failing testcase with webrev.00. The original 
> DivNode has its zero check removed based on correct type information. 
> Afterwards it is split through an induction variable phi for which both 
> inputs have non-zero types. So, the DivNode ends up after an AddINode 
> (which adds a positive constant) which has a non-zero type. All good so 
> far.
> 
> Now we add pre/main/post loops and the induction variable phi for the 
> pre-loop gets type int>=1 since the limit for the pre-loop is hidden 
> behind an Opaque1 node which just returns int as type. The AddINode 
> belonging to the loop induction variable phi in the pre-loop is 
> therefore updated to have the type int as well (int>=1 + positive_int 
> could overflow). This type information propagates to the main-loop and 
> its AddINode belonging to the loop induction variable phi (which is an 
> input to the DivNode) also gets its type set to int.
> 
> Later, we add a vector post loop where we clone the main loop and add a 
> phi p for the AddINode and its new clone. Since the DivINode 
> has a control outside of the main loop, it is not cloned and gets the 
> phi p as an input. At a later point in time, we want to split through p. 
> But then we detect zero as possible value due to the type range of both 
> AddINodes being int.
> 
> Even though the type information is not accurate enough, the DivINode is 
> never zero and we could safely apply the split through the phi. We could 
> think about doing a bail out for all kinds of phis but I think it should 
> only be an actual problem for loop induction variable phis.
> 
> Thinking about this type propagation problem, couldn't we somehow set 
> the type of the Opaque1 node hiding the pre-loop limit to the same type 
> as the pre-loop limit to allow this information to flow to the pre and 
> main loop? Or would that cause other problems? I guess there probably 
> must be a reason why we don't do it like that.
> 
> Best regards,
> Christian
> 
> On 13.07.20 19:16, Vladimir Kozlov wrote:
>> This raises a question: why was the zero check removed if one of the 
>> merged types has 0?
>> Should we be more careful when we remove the zero check?
>>
>> Thanks,
>> Vladimir
>>
>> On 7/13/20 2:06 AM, Christian Hagedorn wrote:
>>> A test in some later tier testing revealed that the assertion code is 
>>> actually too strong. There can be a Div/Mod node whose zero check was 
>>> removed but that is then split through a non-induction-variable phi 
>>> whose inputs have zero in their type range (which is fine, this 
>>> happens in some loop opts after partial peeling was applied earlier). 
>>> This happened, for example, for a phi which merged two nodes from the 
>>> original and a cloned loop. I think we just need to remove the 
>>> additional assertion code.
>>>
>>> New webrev:
>>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.01/
>>>
>>> Best regards,
>>> Christian
>>>
>>> On 13.07.20 09:19, Christian Hagedorn wrote:
>>>> Thank you Vladimir for your review!
>>>>
>>>> Best regards,
>>>> Christian
>>>>
>>>> On 11.07.20 01:25, Vladimir Kozlov wrote:
>>>>> Looks good.
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 7/10/20 12:37 AM, Christian Hagedorn wrote:
>>>>>> Hi
>>>>>>
>>>>>> Please review the following patch:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8248552
>>>>>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.00/
>>>>>>
>>>>>> In the failing testcase, C2 removes a zero check for a 
>>>>>> division/modulo node n based on the type information of the loop 
>>>>>> induction variable phi p (always between 1 and 50 and never 0). 
>>>>>> However, n is later split through p and ends up after the AddNode 
>>>>>> which updates the induction variable p. In the last iteration j 
>>>>>> equals 2 and is then updated to 0. The division/modulo node n is 
>>>>>> now executed before the loop limit check which results in a SIGFPE.
>>>>>>
>>>>>> The fix bails out of PhaseIdealLoop::split_thru_phi if a division 
>>>>>> or modulo node has its zero check removed (i.e. control is NULL) 
>>>>>> and is split through a phi which has an input that could be zero. 
>>>>>> This should only happen for an induction variable phi of a 
>>>>>> trip-counted (integer) loop.
>>>>>>
>>>>>> Best regards,
>>>>>> Christian

From igor.ignatyev at oracle.com  Tue Jul 14 18:25:44 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 14 Jul 2020 11:25:44 -0700
Subject: RFR [15] : 8249036 : clean up FileInstaller $test.src $cwd in
 vmTestbase_nsk_stress tests
In-Reply-To: <5720357e-c3c6-6f7f-7993-535561fd84e2@oracle.com>
References: <5E2ED18E-9CD6-44D6-95D0-E13D1AFC1BC3@oracle.com>
 <5720357e-c3c6-6f7f-7993-535561fd84e2@oracle.com>
Message-ID: <EBFEB1E9-B6D3-4A89-8CF7-8D1D5B930A6A@oracle.com>

Thanks Vladimir,

pushed to jdk15.

-- Igor

> On Jul 13, 2020, at 2:34 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> Good.
> 
> Thanks,
> Vladimir K
> 
> On 7/13/20 2:29 PM, Igor Ignatyev wrote:
>> http://cr.openjdk.java.net/~iignatyev//8249036/webrev.00/
>>> 44 lines changed: 0 ins; 23 del; 21 mod;
>> Hi all,
>> could you please review this clean-up which removes unnecessary `FileInstaller` actions for :vmTestbase_nsk_stress tests?
>> from the main issue(8204985):
>>> all vmTestbase tests have '@run driver jdk.test.lib.FileInstaller . .' to mimic old test harness behavior and copy all files from a test source directory to a current work directory. some tests depend on this step, so we need 1st identify such tests and then either rewrite them not to have this dependency or leave FileInstaller only in these tests.
>> none of vmTestbase_nsk_stress tests need FileInstaller, hence the patch is just `ag -l  '@run driver jdk.test.lib.FileInstaller . .' vmTestbase/nsk/stress | xargs -I{} gsed -i '/@run driver jdk.test.lib.FileInstaller \. \./d' {}`.
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249036
>> webrev: http://cr.openjdk.java.net/~iignatyev//8249036/webrev.00/
>> testing: :vmTestbase_nsk_stress on linux-x64
>> Thanks,
>> -- Igor


From vladimir.kozlov at oracle.com  Tue Jul 14 18:46:32 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 14 Jul 2020 11:46:32 -0700
Subject: [16] RFR(S): 8247743: Segmentation fault in debug builds due to
 stack overflow in find_recur with deep graphs
In-Reply-To: <2f317601-4845-541d-e2ef-ad7735386f1c@oracle.com>
References: <9af7a44c-4267-4900-812c-12aa0c37713a@oracle.com>
 <518ffdf1-143a-06f3-9aa4-96871d72d024@oracle.com>
 <9b3a9632-c7bb-2f51-c295-72935add2670@oracle.com>
 <2f317601-4845-541d-e2ef-ad7735386f1c@oracle.com>
Message-ID: <7cfafcb9-6232-5738-6cad-508127fd31e8@oracle.com>

Can you move the following up to where the other small find*() methods are defined?

+Node* Node::find_ctrl(int idx) {
+  return find(idx, true);
  }

Also add '// not PRODUCT' comment to #endif for #ifndef PRODUCT. It is hard to find where this not product code ends.

Looks good otherwise.

Thanks,
Vladimir

On 7/14/20 2:54 AM, Christian Hagedorn wrote:
> Hi Vladimir
> 
> On 13.07.20 19:43, Vladimir Kozlov wrote:
>> Node::find_ctrl() is used during debugging when you want to print and look at only control nodes.
>> We have several such methods which are only used in the debugger.
> 
> I see, I restored this method and changed Node::find() accordingly. I additionally added two find_ctrl() methods to make 
> it easier to call it from a debugger (as already present for find_node()).
> 
>> I suggest to store old_arena() in local var and pass into add_to_worklist().
>>
>> You can make add_to_worklist() static since you pass node as argument.
> 
> Okay. I updated this and the change above in a new webrev:
> http://cr.openjdk.java.net/~chagedorn/8247743/webrev.01/
> 
> Best regards,
> Christian
> 
>> Thanks,
>> Vladimir
>>
>> On 7/13/20 3:09 AM, Christian Hagedorn wrote:
>>> Ping - could anyone review it, please? Thanks!
>>>
>>> Best regards,
>>> Christian
>>>
>>> On 02.07.20 09:33, Christian Hagedorn wrote:
>>>> Hi
>>>>
>>>> Please review the following patch:
>>>> https://bugs.openjdk.java.net/browse/JDK-8247743
>>>> http://cr.openjdk.java.net/~chagedorn/8247743/webrev.00/
>>>>
>>>> The testcase creates a deep graph with a lot of nodes on a chain. When running with the specified test flags, it 
>>>> recursively calls Node::find_recur() for each node discovered which eventually results in a segmentation fault due 
>>>> to a stack overflow (around 10000 calls due to such a long chain of nodes). The fix just converts the recursive 
>>>> algorithm into an iterative one to avoid a segmentation fault. This is similar to JDK-8246203 [1].
>>>>
>>>> I additionally removed Node::find_ctrl() and its special handling in the algorithm since it is not used.
>>>>
>>>> There is actually another problem with the recursive version. When running the testcase without 
>>>> -XX:CompileOnly=compiler/c2/TestFindNode, it will spin forever inside [2] because there is a debug_orig node cycle 
>>>> and the loop does not break based on the debug_orig nodes being visited. This is also fixed in the patch.
>>>>
>>>> Thank you!
>>>>
>>>> Best regards,
>>>> Christian
>>>>
>>>>
>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8246203
>>>> [2] http://hg.openjdk.java.net/jdk/jdk/file/e2622818f0bd/src/hotspot/share/opto/node.cpp#l1589

From vladimir.kozlov at oracle.com  Tue Jul 14 19:07:31 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 14 Jul 2020 12:07:31 -0700
Subject: [16] RFR(S): 8248552: C2 crashes with SIGFPE due to division by
 zero
In-Reply-To: <e8d6aa47-b584-4fdc-f506-1b1c52aa28f5@oracle.com>
References: <70e8e42b-5cb3-9c1e-419e-2f771f042368@oracle.com>
 <3ba2ef6a-8ade-7ede-5252-21051c34b472@oracle.com>
 <9e2f26bd-daa4-9540-8401-9850e0beea94@oracle.com>
 <c569ebf8-58ed-c6df-2e0e-38c692f926c6@oracle.com>
 <a2066deb-ef3b-7071-33ff-03c829069cff@oracle.com>
 <f6a62d10-9cff-3fc2-0947-c49e775a23c9@oracle.com>
 <e8d6aa47-b584-4fdc-f506-1b1c52aa28f5@oracle.com>
Message-ID: <ca043859-bb7b-f64a-a37d-7b4e47c745d6@oracle.com>

 > Thinking about this type propagation problem, couldn't we somehow set the type of the Opaque1 node hiding the pre-loop
 > limit to the same type as the pre-loop limit to allow this information to flow to the pre and main loop? Or would that
 > cause other problems? I guess there probably must be a reason why we don't do it like that.

It has a wide type to prevent premature optimizations before the loop is fully transformed. That is the reason we add it 
in the first place.

But it would be interesting to see if we can use a narrower type: TypeInt::POS1, for example, for positive limits (>0) 
(and the opposite for negative limits < 0). I may be missing some nuances and it may not work but we should try.

Regards,
Vladimir

On 7/14/20 5:39 AM, Christian Hagedorn wrote:
> 
>> [..] Since the DivINode has a control outside of the main loop [..]
> 
> Edit: I actually meant that get_ctrl() returns a node outside of the main-loop (i.e. the DivINode is not part of the 
> main-loop body). The DivINode still has NULL as control input.
> 
> Best regards,
> Christian
> 
> On 14.07.20 14:32, Christian Hagedorn wrote:
>> Hi Vladimir
>>
>> I had a closer look at the failing testcase with webrev.00. The original DivNode has its zero check removed based on 
>> correct type information. Afterwards it is split through an induction variable phi for which both inputs have non-zero 
>> types. So, the DivNode ends up after an AddINode (which adds a positive constant) which has a non-zero type. All good 
>> so far.
>>
>> Now we add pre/main/post loops and the induction variable phi for the pre-loop gets type int>=1 since the limit for 
>> the pre-loop is hidden behind an Opaque1 node which just returns int as type. The AddINode belonging to the loop 
>> induction variable phi in the pre-loop is therefore updated to have the type int as well (int>=1 + positive_int could 
>> overflow). This type information propagates to the main-loop and its AddINode belonging to the loop induction variable 
>> phi (which is an input to the DivNode) also gets its type set to int.
>>
>> Later, we add a vector post loop where we clone the main loop and add a phi p for the AddINode and its new 
>> clone. Since the DivINode has a control outside of the main loop, it is not cloned and gets the phi p as an input. At 
>> a later point in time, we want to split through p. But then we detect zero as possible value due to the type range of 
>> both AddINodes being int.
>>
>> Even though the type information is not accurate enough, the DivINode is never zero and we could safely apply the 
>> split through the phi. We could think about doing a bail out for all kinds of phis but I think it should only be an 
>> actual problem for loop induction variable phis.
>>
>> Thinking about this type propagation problem, couldn't we somehow set the type of the Opaque1 node hiding the pre-loop 
>> limit to the same type as the pre-loop limit to allow this information to flow to the pre and main loop? Or would that 
>> cause other problems? I guess there probably must be a reason why we don't do it like that.
>>
>> Best regards,
>> Christian
>>
>> On 13.07.20 19:16, Vladimir Kozlov wrote:
>>> This raises a question: why was the zero check removed if one of the merged types has 0?
>>> Should we be more careful when we remove the zero check?
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 7/13/20 2:06 AM, Christian Hagedorn wrote:
>>>> A test in some later tier testing revealed that the assertion code is actually too strong. There can be a Div/Mod 
>>>> node whose zero check was removed but that is then split through a non-induction-variable phi whose inputs have zero 
>>>> in their type range (which is fine, this happens in some loop opts after partial peeling was applied earlier). This 
>>>> happened, for example, for a phi which merged two nodes from the original and a cloned loop. I think we just need to 
>>>> remove the additional assertion code.
>>>>
>>>> New webrev:
>>>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.01/
>>>>
>>>> Best regards,
>>>> Christian
>>>>
>>>> On 13.07.20 09:19, Christian Hagedorn wrote:
>>>>> Thank you Vladimir for your review!
>>>>>
>>>>> Best regards,
>>>>> Christian
>>>>>
>>>>> On 11.07.20 01:25, Vladimir Kozlov wrote:
>>>>>> Looks good.
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>> On 7/10/20 12:37 AM, Christian Hagedorn wrote:
>>>>>>> Hi
>>>>>>>
>>>>>>> Please review the following patch:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8248552
>>>>>>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.00/
>>>>>>>
>>>>>>> In the failing testcase, C2 removes a zero check for a division/modulo node n based on the type information of 
>>>>>>> the loop induction variable phi p (always between 1 and 50 and never 0). However, n is later split through p and 
>>>>>>> ends up after the AddNode which updates the induction variable p. In the last iteration j equals 2 and is then 
>>>>>>> updated to 0. The division/modulo node n is now executed before the loop limit check which results in a SIGFPE.
>>>>>>>
>>>>>>> The fix bails out of PhaseIdealLoop::split_thru_phi if a division or modulo node has its zero check removed (i.e. 
>>>>>>> control is NULL) and is split through a phi which has an input that could be zero. This should only happen for an 
>>>>>>> induction variable phi of a trip-counted (integer) loop.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Christian

From ekaterina.pavlova at oracle.com  Wed Jul 15 00:25:30 2020
From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova)
Date: Tue, 14 Jul 2020 17:25:30 -0700
Subject: RFR [15] (T/XS): 8242388 compiler/graalunit/CoreTest.java timed out
Message-ID: <39811448-6cf5-c329-de66-27233854cb62@oracle.com>

Hi all,

compiler/graalunit/CoreTest.java fails by timeout from time to time.
The most time expensive subtest is org.graalvm.compiler.core.test.CountedLoopTest.
The fix splits the test into two tests to reduce total execution time.
Please review.

     JBS: https://bugs.openjdk.java.net/browse/JDK-8242388
  webrev: http://cr.openjdk.java.net/~epavlova//8242388/webrev.00/index.html
testing: graalunit tests as part of tier3


Thanks,
-katya


From vladimir.kozlov at oracle.com  Wed Jul 15 01:20:18 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 14 Jul 2020 18:20:18 -0700
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
In-Reply-To: <46144d6d-5714-05ad-a263-01507db937cc@oracle.com>
References: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
 <8325fac5-6258-9b88-6507-5dcb0597cc17@oracle.com>
 <8522a69e-e538-2cc9-5364-887e450fc653@oracle.com>
 <46144d6d-5714-05ad-a263-01507db937cc@oracle.com>
Message-ID: <7a361c29-4771-9ed4-1542-b3f68a5726f3@oracle.com>

I looked into this some more. EA already does not scalarize allocations when Phi nodes merge them - it should handle this 
case. I did a small experiment and relaxed the assert for this new (10. needs comment update) case for AddP's base and the test 
passed:

src/hotspot/share/opto/escape.cpp Tue Jul 14 18:11:27 2020 -0700
@@ -2357,6 +2357,7 @@
        int opcode = uncast_base->Opcode();
        assert(opcode == Op_ConP || opcode == Op_ThreadLocal ||
               opcode == Op_CastX2P || uncast_base->is_DecodeNarrowPtr() ||
+             (uncast_base->is_Phi() && (uncast_base->bottom_type()->isa_rawptr() != NULL)) ||
               (uncast_base->is_Mem() && (uncast_base->bottom_type()->isa_rawptr() != NULL)) ||
               (uncast_base->is_Proj() && uncast_base->in(0)->is_Allocate()), "sanity");
      }

Did you hit a case when this may not work?

And with LoopOpts off -XX:LoopUnrollLimit=0 it removed allocation (-XX:+PrintEscapeAnalysis -XX:+PrintEliminateAllocations):

======== Connection graph for  Test::test
JavaObject NoEscape(NoEscape) [ 158F [ 107 ]]   95	Allocate	===  242  76  230  8  1 ( 93  92  21  1  78  1  78 ) [[ 96 
97  98  105  106  107 ]]  rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, top ) Test::test1 @ bci:0 
Test::test @ bci:8 !jvms: Test::test1 @ bci:0 Test::test @ bci:8
LocalVar [ 95P [ 158b ]]   107	Proj	===  95  [[ 108  158 ]] #5 !jvms: Test::test1 @ bci:0 Test::test @ bci:8

Scalar  95	Allocate	===  242  76  230  8  1 ( 93  92  21  1  78  1  78 ) [[ 96  97  98  105  106  107 ]]  rawptr:NotNull 
( int:>=0, java/lang/Object:NotNull *, bool, top ) Test::test1 @ bci:0 Test::test @ bci:8 !jvms: Test::test1 @ bci:0 
Test::test @ bci:8
++++ Eliminated: 95 Allocate


Thanks,
Vladimir K

On 7/14/20 1:28 AM, Jamsheed C M wrote:
> Hi all,
> 
> I had incorrectly added extra check in assert after offset computation in address_offset . For addps with non constant 
> offsets (like [1])
> 
> Not changing the old assert even though I am not expecting first addp/second addp(for array addressing) case for init 
> captured store.
> 
> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA_asserts_corrected/
> 
> Best regards,
> 
> Jamsheed
> 
> [1]
> 
> assert(offs != Type::OffsetBot ||
> - adr->in(AddPNode::Address)->in(0)->is_AllocateArray(),
> + adr->in(AddPNode::Address)->in(0)->is_AllocateArray() || is_captured_store(adr),
>             "offset must be a constant or it is initialization of array");
> 
> On 13/07/2020 11:14, Jamsheed C M wrote:
>>
>> Hi,
>>
>> I reworked the fix. I compute offset for all init captures stores, but treats this special init captured stores 
>> similar to unsafe(as these objects are usually GlobalEscape and doesn't have any perf implications).
>>
>> revised webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.01/
>>
>> testing: mach1-5( logs in jbs)
>>
>> Best regards,
>>
>> Jamsheed
>>
>> On 09/07/2020 19:36, Jamsheed C M wrote:
>>>
>>> Hi,
>>>
>>> request to hold the review. need to change the code for dealing with unsafe access. as current capture code go for 
>>> more execution time analyzing things.
>>>
>>> Best regards,
>>>
>>> Jamsheed
>>>
>>> On 09/07/2020 13:01, Jamsheed C M wrote:
>>>>
>>>> Hi all,
>>>>
>>>> JBS:https://bugs.openjdk.java.net/browse/JDK-8242895
>>>>
>>>> Request for review changes made to offset computation and field write detection for init captured stores due to phis 
>>>> addition between alloc and init. This happen if init node in different outer loop wrt to alloc node and there is a 
>>>> loop opt. This was required as a result of enhancement [1].
>>>>
>>>> Normally init are not associated with multiple alloc node during EA phase, but changes done for [1] caused the code 
>>>> shapes of the form [2] to generate inits associated with multiple alloc node.
>>>>
>>>> This had implication in offset computation and field write detection related to initializing stores.
>>>>
>>>> Attempt to fix in EA:
>>>>
>>>>      webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA/
>>>>
>>>> Alternate fix:
>>>>
>>>>      Minimize the scenario in compiler generated code by throwing only j.l.Error from slowpath(all exception 
>>>> async/sync are handled in runtime exit).
>>>>
>>>>      Stub epilog doesn't poll or throw any exceptions. Disable full loop opt before EA for detectable patterns and 
>>>> bailout EA for late detected patterns.
>>>>
>>>>      webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_deopt/
>>>>
>>>> Please advise.
>>>>
>>>> Testing : mach tier1-5 (logs in jbs)
>>>>
>>>> Best regards,
>>>>
>>>> Jamsheed
>>>>
>>>>
>>>> [1] JDK-8231291 <https://bugs.openjdk.java.net/browse/JDK-8231291>C2: loop opts before EA should maximally unroll loops
>>>>
>>>> [2] that have its init node in different outer loop wrt to alloc node.
>>>>
>>>>
>>>> loop begin
>>>>
>>>>    try{
>>>>
>>>>    return new obj()/ throw new obj()/ uncommon trap after allocation, in a loop
>>>>
>>>>    } catch(ex) {
>>>>
>>>>    }
>>>>
>>>> loop end
>>>>
>>>>     public static IntA test(int n) {
>>>>         for (int i=0; i<2; i++) {
>>>>             try {
>>>>                 return new IntA(n + i);
>>>>             } catch (Exception e) {
>>>>             }
>>>>         }
>>>>
>>>>
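
For readers who want to try the code shape [2] quoted above, here is a minimal standalone sketch. It is only an illustration: the IntA class body, its field name, the trailing "return null" and the warm-up loop are assumptions, since the quoted snippet is cut off and the real test lives in the webrev.

```java
// Hypothetical stand-in for the IntA class used by the quoted test.
final class IntA {
    final int value; // assumed field; the original class is not shown in the thread

    IntA(int value) {
        this.value = value;
    }
}

final class AllocInLoopSketch {
    // Same shape as the quoted test(int n): an allocation that is returned
    // from inside a try/catch which itself sits inside a small counted loop.
    static IntA test(int n) {
        for (int i = 0; i < 2; i++) {
            try {
                return new IntA(n + i);
            } catch (Exception e) {
                // intentionally empty, as in the quoted snippet
            }
        }
        return null; // assumption: the quoted snippet ends before this point
    }

    public static void main(String[] args) {
        long sum = 0;
        // Call the method often enough that a JIT compiler may decide to compile it.
        for (int i = 0; i < 200_000; i++) {
            sum += test(i).value;
        }
        System.out.println(sum);
    }
}
```

Running it with the flags mentioned above (-XX:+PrintEscapeAnalysis -XX:+PrintEliminateAllocations) needs a debug build, and whether this exact program reproduces the assert is not something this sketch claims.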

From vladimir.kozlov at oracle.com  Wed Jul 15 01:24:11 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 14 Jul 2020 18:24:11 -0700
Subject: RFR [15] (T/XS): 8242388 compiler/graalunit/CoreTest.java timed
 out
In-Reply-To: <39811448-6cf5-c329-de66-27233854cb62@oracle.com>
References: <39811448-6cf5-c329-de66-27233854cb62@oracle.com>
Message-ID: <64ccd3d7-dc82-e243-a63e-db49d61503ef@oracle.com>

Good.

Thanks,
Vladimir K

On 7/14/20 5:25 PM, Ekaterina Pavlova wrote:
> Hi all,
> 
> compiler/graalunit/CoreTest.java fails by timeout from time to time.
> The most time expensive subtest is org.graalvm.compiler.core.test.CountedLoopTest.
> The fix splits the test into two tests to reduce total execution time.
> Please review.
> 
>      JBS: https://bugs.openjdk.java.net/browse/JDK-8242388
>   webrev: http://cr.openjdk.java.net/~epavlova//8242388/webrev.00/index.html
> testing: graalunit tests as part of tier3
> 
> 
> Thanks,
> -katya
> 
> 
> 

From jamsheed.c.m at oracle.com  Wed Jul 15 02:51:28 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Wed, 15 Jul 2020 08:21:28 +0530
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
In-Reply-To: <7a361c29-4771-9ed4-1542-b3f68a5726f3@oracle.com>
References: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
 <8325fac5-6258-9b88-6507-5dcb0597cc17@oracle.com>
 <8522a69e-e538-2cc9-5364-887e450fc653@oracle.com>
 <46144d6d-5714-05ad-a263-01507db937cc@oracle.com>
 <7a361c29-4771-9ed4-1542-b3f68a5726f3@oracle.com>
Message-ID: <6dc4c99b-1d90-09f1-60d1-fb2caf981266@oracle.com>

Hi Vladimir,

On 15/07/2020 06:50, Vladimir Kozlov wrote:
> I looked more on this. EA already does not scalarize allocations when 
> Phi nodes merged them - it should handle this case. I did small 
> experiment and relaxed assert for this new (10. needs comment update) 
> case for AddP's base and test passed:
>
> src/hotspot/share/opto/escape.cpp Tue Jul 14 18:11:27 2020 -0700
> @@ -2357,6 +2357,7 @@
>        int opcode = uncast_base->Opcode();
>        assert(opcode == Op_ConP || opcode == Op_ThreadLocal ||
>               opcode == Op_CastX2P || uncast_base->is_DecodeNarrowPtr() ||
> +             (uncast_base->is_Phi() && (uncast_base->bottom_type()->isa_rawptr() != NULL)) ||
>               (uncast_base->is_Mem() && (uncast_base->bottom_type()->isa_rawptr() != NULL)) ||
>               (uncast_base->is_Proj() && uncast_base->in(0)->is_Allocate()), "sanity");
>      }
>
> Did you hit a case when this may not work?

Yes, right, it already doesn't mark it as scalarizable if the base count is 
more than one (I think it missed an is_oop check there).

EA CG adds edges only for oop fields, making stores to them undetected. 
This makes these stored objects NoEscape, and if the compiled method 
continues execution with such a NoEscape object it can have undesired 
results (e.g. synchronization removed).

A probable case would be (didn't verify):

try {

LOOP BEGIN

   try {throw new Obj()} catch {}

LOOP END

} catch (Obj e) {

}

[1] 
https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/escape.cpp#L2256 


From jamsheed.c.m at oracle.com  Wed Jul 15 03:08:10 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Wed, 15 Jul 2020 08:38:10 +0530
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
In-Reply-To: <6dc4c99b-1d90-09f1-60d1-fb2caf981266@oracle.com>
References: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
 <8325fac5-6258-9b88-6507-5dcb0597cc17@oracle.com>
 <8522a69e-e538-2cc9-5364-887e450fc653@oracle.com>
 <46144d6d-5714-05ad-a263-01507db937cc@oracle.com>
 <7a361c29-4771-9ed4-1542-b3f68a5726f3@oracle.com>
 <6dc4c99b-1d90-09f1-60d1-fb2caf981266@oracle.com>
Message-ID: <6b4e4dda-01d4-37d0-5403-a4f5481e5bf0@oracle.com>

(unfinished mail got sent, so completing it)
On 15/07/2020 08:21, Jamsheed C M wrote:
> Hi Vladimir,
>
> On 15/07/2020 06:50, Vladimir Kozlov wrote:
>> I looked more on this. EA already does not scalarize allocations 
>> when Phi nodes merged them - it should handle this case. I did small 
>> experiment and relaxed assert for this new (10. needs comment update) 
>> case for AddP's base and test passed:
>>
>> src/hotspot/share/opto/escape.cpp Tue Jul 14 18:11:27 2020 -0700
>> @@ -2357,6 +2357,7 @@
>>        int opcode = uncast_base->Opcode();
>>        assert(opcode == Op_ConP || opcode == Op_ThreadLocal ||
>>               opcode == Op_CastX2P || uncast_base->is_DecodeNarrowPtr() ||
>> +             (uncast_base->is_Phi() && (uncast_base->bottom_type()->isa_rawptr() != NULL)) ||
>>               (uncast_base->is_Mem() && (uncast_base->bottom_type()->isa_rawptr() != NULL)) ||
>>               (uncast_base->is_Proj() && uncast_base->in(0)->is_Allocate()), "sanity");
>>      }
>>
>> Did you hit a case when this may not work?
>
> Yes, right it already doesn't mark it as scalarizable if base count is 
> more than one(I think it missed a is_oop check there)[1].
>
> EA CG adds edges only for oop field making stores to them undetected. 
> This makes these stored objects to NoEscape and if compiled method 
> continues execution with this NoEscape object can have undesired 
> results(i.e synchronization removed).
>
> Probable case would be(didn't verify)
>
> try {
>
> LOOP BEGIN
>
>   try {throw new Obj()} catch {}
>
> LOOP END
>
> } catch (Obj e) {
>
> }

Best Regards,

Jamsheed

[1]https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/escape.cpp#L1770



From christian.hagedorn at oracle.com  Wed Jul 15 07:58:17 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Wed, 15 Jul 2020 09:58:17 +0200
Subject: [16] RFR(S): 8247743: Segmentation fault in debug builds due to
 stack overflow in find_recur with deep graphs
In-Reply-To: <7cfafcb9-6232-5738-6cad-508127fd31e8@oracle.com>
References: <9af7a44c-4267-4900-812c-12aa0c37713a@oracle.com>
 <518ffdf1-143a-06f3-9aa4-96871d72d024@oracle.com>
 <9b3a9632-c7bb-2f51-c295-72935add2670@oracle.com>
 <2f317601-4845-541d-e2ef-ad7735386f1c@oracle.com>
 <7cfafcb9-6232-5738-6cad-508127fd31e8@oracle.com>
Message-ID: <53d1eebe-e85f-58cb-7fba-0baf2ecf8701@oracle.com>

Hi Vladimir

On 14.07.20 20:46, Vladimir Kozlov wrote:
> Can you move next up to where other small find*() methods are defined?:
> 
> +Node* Node::find_ctrl(int idx) {
> +  return find(idx, true);
>  }
> 
> Also add '// not PRODUCT' comment to #endif for #ifndef PRODUCT. It is 
> hard to find where this not product code ends.
> 
> Looks good otherwise.

Thanks, I added these changes in a new webrev:
http://cr.openjdk.java.net/~chagedorn/8247743/webrev.02/

Best regards,
Christian


> Thanks,
> Vladimir
> 
> On 7/14/20 2:54 AM, Christian Hagedorn wrote:
>> Hi Vladimir
>>
>> On 13.07.20 19:43, Vladimir Kozlov wrote:
>>> Node::find_ctrl() is used during debugging when you want to print and 
>>> look on only control nodes.
>>> We have several such methods which are only used in debugger.
>>
>> I see, I restored this method and changed Node::find() accordingly. I 
>> additionally added two find_ctrl() methods to make it easier to call 
>> it from a debugger (as already present for find_node()).
>>
>>> I suggest to store old_arena() in local var and pass into 
>>> add_to_worklist().
>>>
>>> You can make add_to_worklist() static since you pass node as argument.
>>
>> Okay. I updated this and the change above in a new webrev:
>> http://cr.openjdk.java.net/~chagedorn/8247743/webrev.01/
>>
>> Best regards,
>> Christian
>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 7/13/20 3:09 AM, Christian Hagedorn wrote:
>>>> Ping - could anyone review it, please? Thanks!
>>>>
>>>> Best regards,
>>>> Christian
>>>>
>>>> On 02.07.20 09:33, Christian Hagedorn wrote:
>>>>> Hi
>>>>>
>>>>> Please review the following patch:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8247743
>>>>> http://cr.openjdk.java.net/~chagedorn/8247743/webrev.00/
>>>>>
>>>>> The testcase creates a deep graph with a lot of nodes on a chain. 
>>>>> When running with the specified test flags, it recursively calls 
>>>>> Node::find_recur() for each node discovered which eventually 
>>>>> results in a segmentation fault due to a stack overflow (around 
>>>>> 10000 calls due to such a long chain of nodes). The fix just 
>>>>> converts the recursive algorithm into an iterative one to avoid a 
>>>>> segmentation fault. This is similar to JDK-8246203 [1].
>>>>>
>>>>> I additionally removed Node::find_ctrl() and its special handling 
>>>>> in the algorithm since it is not used.
>>>>>
>>>>> There is actually another problem with the recursive version. When 
>>>>> running the testcase without 
>>>>> -XX:CompileOnly=compiler/c2/TestFindNode, it will spin forever 
>>>>> inside [2] because there is a debug_orig node cycle and the loop 
>>>>> does not break based on the debug_orig nodes being visited. This is 
>>>>> also fixed in the patch.
>>>>>
>>>>> Thank you!
>>>>>
>>>>> Best regards,
>>>>> Christian
>>>>>
>>>>>
>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8246203
>>>>> [2] 
>>>>> http://hg.openjdk.java.net/jdk/jdk/file/e2622818f0bd/src/hotspot/share/opto/node.cpp#l1589 
>>>>>
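
The recursive-to-iterative conversion described in the original review request can be illustrated with a small worklist search. This is only a sketch of the general technique, not the HotSpot code: GraphNode, its inputs list and the chain() helper are hypothetical, and the real Node::find() also walks outputs and debug_orig nodes. The visited set is what keeps a cycle from being walked forever.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical graph node: an index plus a list of input nodes.
final class GraphNode {
    final int idx;
    final List<GraphNode> inputs = new ArrayList<>();

    GraphNode(int idx) {
        this.idx = idx;
    }
}

final class IterativeFindSketch {
    // Searches the graph reachable from 'start' for a node with the given idx.
    // An explicit worklist replaces the call stack, so the depth of the graph
    // no longer limits the search; the visited set stops cycles from looping.
    static GraphNode find(GraphNode start, int idx) {
        Set<GraphNode> visited =
            Collections.newSetFromMap(new IdentityHashMap<GraphNode, Boolean>());
        Deque<GraphNode> worklist = new ArrayDeque<>();
        visited.add(start);
        worklist.push(start);
        while (!worklist.isEmpty()) {
            GraphNode n = worklist.pop();
            if (n.idx == idx) {
                return n;
            }
            for (GraphNode in : n.inputs) {
                if (in != null && visited.add(in)) {
                    worklist.push(in);
                }
            }
        }
        return null; // not found
    }

    // Builds a chain 0 -> 1 -> ... -> length-1 and closes it into a cycle.
    static GraphNode chain(int length) {
        GraphNode head = new GraphNode(0);
        GraphNode cur = head;
        for (int i = 1; i < length; i++) {
            GraphNode next = new GraphNode(i);
            cur.inputs.add(next);
            cur = next;
        }
        cur.inputs.add(head);
        return head;
    }

    public static void main(String[] args) {
        GraphNode head = chain(200_000);
        System.out.println(find(head, 199_999) != null); // node at the end of the chain
        System.out.println(find(head, -1) == null);      // absent idx, terminates despite the cycle
    }
}
```

A chain of this length would likely overflow the stack with a naive recursive walk under default stack sizes, which is the failure mode the patch addresses.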

From rwestrel at redhat.com  Wed Jul 15 09:59:53 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Wed, 15 Jul 2020 11:59:53 +0200
Subject: RFR(M): 8229495: SIGILL in C2 generated OSR compilation
In-Reply-To: <da629280-b591-1095-48d4-ad155d2d1992@oracle.com>
References: <3b720427-d718-5d1c-dbe9-6149a21883af@oracle.com>
 <87r1topriw.fsf@redhat.com> <84b2c86d-c7e6-7945-dae5-db1d8efe6f25@oracle.com>
 <87sge0oqv8.fsf@redhat.com> <c4a2ce8a-2084-e4c1-d1a6-ab057eabe448@oracle.com>
 <da629280-b591-1095-48d4-ad155d2d1992@oracle.com>
Message-ID: <878sflnlnq.fsf@redhat.com>


Thanks for the reviews Christian & Vladimir.

Roland.


From christian.hagedorn at oracle.com  Wed Jul 15 13:08:33 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Wed, 15 Jul 2020 15:08:33 +0200
Subject: [16] RFR(S): 8248552: C2 crashes with SIGFPE due to division by
 zero
In-Reply-To: <ca043859-bb7b-f64a-a37d-7b4e47c745d6@oracle.com>
References: <70e8e42b-5cb3-9c1e-419e-2f771f042368@oracle.com>
 <3ba2ef6a-8ade-7ede-5252-21051c34b472@oracle.com>
 <9e2f26bd-daa4-9540-8401-9850e0beea94@oracle.com>
 <c569ebf8-58ed-c6df-2e0e-38c692f926c6@oracle.com>
 <a2066deb-ef3b-7071-33ff-03c829069cff@oracle.com>
 <f6a62d10-9cff-3fc2-0947-c49e775a23c9@oracle.com>
 <e8d6aa47-b584-4fdc-f506-1b1c52aa28f5@oracle.com>
 <ca043859-bb7b-f64a-a37d-7b4e47c745d6@oracle.com>
Message-ID: <5b2e7b1b-24f7-d575-58a3-376ec9ab7944@oracle.com>

Hi Vladimir

On 14.07.20 21:07, Vladimir Kozlov wrote:
>  > Thinking about this type propagation problem, couldn't we somehow set 
> the type of the Opaque1 node hiding the pre-loop
>  > limit to the same type as the pre-loop limit to allow this 
> information to flow to the pre and main loop? Or would that
>  > cause other problems? I guess there probably must be a reason why we 
> don't do it like that.
> 
> It has wide type to prevent premature optimizations before loop is fully 
> transformed. That is the reason we add it in first place.
> 
> But it would be interesting to see if we can use more narrow type: 
> TypeInt::POS1 for example for positive limits (>0) (and opposite for 
> negative limits < 0). I may be missing some nuances and it may not work 
> but we should try.

I had an additional discussion about this with Roland. He made a good 
point that the problem is not the Opaque1 nodes themselves but rather 
the type of the iv phi, or more specifically the PhiNode::Value() function.

Before creating pre/main/post loops, the iv phi already has a narrow 
type 1..300 set by PhiNode::Value(). However, when creating the pre (and 
post) loop, we actually widen the type of the iv phi of the pre-loop to 
int>=1 (based on the pre-loop limit which is an Opaque1 node with type 
int). Roland suggested that we should not do that but instead filter the 
returned type with the already existing type to not widen it. I think 
that makes sense. We are already doing that for the other cases in 
PhiNode::Value() [1][2]. It looks like we just miss it for the special 
handling of iv phis of trip-counted loops. This also fixes the assertion 
failure that occurred before with webrev.00.
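
To make the widening versus filtering argument concrete, here is a minimal sketch with plain integer intervals. IntRange is a hypothetical stand-in for C2's integer types, and the stride of 1 is an assumption for illustration; the 1..300 and int>=1 ranges are the ones mentioned above. It is not the code from the webrev.

```java
// Hypothetical stand-in for an integer type: a closed interval [lo, hi].
final class IntRange {
    static final IntRange INT = new IntRange(Integer.MIN_VALUE, Integer.MAX_VALUE);

    final int lo;
    final int hi;

    IntRange(int lo, int hi) {
        if (lo > hi) {
            throw new IllegalArgumentException("empty range");
        }
        this.lo = lo;
        this.hi = hi;
    }

    // Intersection: the result is never wider than either input.
    // Throws if the two ranges do not overlap.
    IntRange filter(IntRange other) {
        return new IntRange(Math.max(lo, other.lo), Math.min(hi, other.hi));
    }

    // Range of a + b; gives up and returns the full int range when a bound
    // could overflow, because the sum may then wrap around.
    IntRange add(IntRange other) {
        long newLo = (long) lo + other.lo;
        long newHi = (long) hi + other.hi;
        if (newLo < Integer.MIN_VALUE || newHi > Integer.MAX_VALUE) {
            return INT;
        }
        return new IntRange((int) newLo, (int) newHi);
    }

    boolean contains(int v) {
        return lo <= v && v <= hi;
    }
}

final class PhiFilterSketch {
    public static void main(String[] args) {
        IntRange known = new IntRange(1, 300);                     // type before pre/main/post
        IntRange recomputed = new IntRange(1, Integer.MAX_VALUE);  // int>=1 from the opaque limit
        IntRange stride = new IntRange(1, 1);                      // assumed stride

        // Widened phi type: iv + stride may overflow, so zero looks possible.
        System.out.println(recomputed.add(stride).contains(0));

        // Filtered phi type: stays within 1..300, so iv + stride excludes zero.
        System.out.println(recomputed.filter(known).add(stride).contains(0));
    }
}
```

The first line printed is true and the second is false, which mirrors why the AddINode type degrades to int in the widened case and why a later zero check on it cannot be ruled out.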

I created a new webrev based on webrev.00 with this change in 
PhiNode::Value():
http://cr.openjdk.java.net/~chagedorn/8248552/webrev.02/

I'm currently running some testing with it again.

Best regards,
Christian


[1] 
http://hg.openjdk.java.net/jdk/jdk/file/9ea3344c6445/src/hotspot/share/opto/cfgnode.cpp#l1097
[2] 
http://hg.openjdk.java.net/jdk/jdk/file/9ea3344c6445/src/hotspot/share/opto/cfgnode.cpp#l1157

> Regards,
> Vladimir
> 
> On 7/14/20 5:39 AM, Christian Hagedorn wrote:
>>
>>> [..] Since the DivINode has a control outside of the main loop [..]
>>
>> Edit: I actually meant that get_ctrl() returns a node outside of the 
>> main-loop (i.e. the DivINode is not part of the main-loop body). The 
>> DivINode still has NULL as control input.
>>
>> Best regards,
>> Christian
>>
>> On 14.07.20 14:32, Christian Hagedorn wrote:
>>> Hi Vladimir
>>>
>>> I had a closer look at the failing testcase with webrev.00. The 
>>> original DivNode has its zero check removed based on correct type 
>>> information. Afterwards it is split through an induction variable phi 
>>> for which both inputs have non-zero types. So, the DivNode ends up 
>>> after an AddINode (which adds a positive constant) which has a 
>>> non-zero type. All good so far.
>>>
>>> Now we add pre/main/post loops and the induction variable phi for the 
>>> pre-loop gets type int>=1 since the limit for the pre-loop is hidden 
>>> behind an Opaque1 node which just returns int as type. The AddINode 
>>> belonging to the loop induction variable phi in the pre-loop is 
>>> therefore updated to have the type int as well (int>=1 + positive_int 
>>> could overflow). This type information propagates to the main-loop 
>>> and its AddINode belonging to the loop induction variable phi (which 
>>> is an input to the DivNode) also gets its type set to int.
>>>
>>> Later, we add a vector post loop where we clone the main loop and add 
>>> a phi p for the AddINode node and its new clone. Since the 
>>> DivINode has a control outside of the main loop, it is not cloned and 
>>> gets the phi p as an input. At a later point in time, we want to 
>>> split through p. But then we detect zero as possible value due to the 
>>> type range of both AddINodes being int.
>>>
>>> Even though the type information is not accurate enough, the DivINode 
>>> is never zero and we could safely apply the split through the phi. We 
>>> could think about doing a bail out for all kinds of phis but I think 
>>> it should only be an actual problem for loop induction variable phis.
>>>
>>> Thinking about this type propagation problem, couldn't we somehow set 
>>> the type of the Opaque1 node hiding the pre-loop limit to the same 
>>> type as the pre-loop limit to allow this information to flow to the 
>>> pre and main loop? Or would that cause other problems? I guess there 
>>> probably must be a reason why we don't do it like that.
>>>
>>> Best regards,
>>> Christian
>>>
>>> On 13.07.20 19:16, Vladimir Kozlov wrote:
>>>> This raises the question: why was the zero check removed if one of the 
>>>> merged types has 0?
>>>> Should we be more careful when we remove the zero check?
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 7/13/20 2:06 AM, Christian Hagedorn wrote:
>>>>> A test in some later tier testing revealed that the assertion code 
>>>>> is actually too strong. There can be a Div/Mod node whose zero 
>>>>> check was removed but that is then split through a 
>>>>> non-induction-variable phi whose inputs have zero in their type 
>>>>> range (which is fine, this happens in some loop opts after partial 
>>>>> peeling was applied earlier). This happened, for example, for a phi 
>>>>> which merged two nodes from the original and a cloned loop. I think 
>>>>> we just need to remove the additional assertion code.
>>>>>
>>>>> New webrev:
>>>>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.01/
>>>>>
>>>>> Best regards,
>>>>> Christian
>>>>>
>>>>> On 13.07.20 09:19, Christian Hagedorn wrote:
>>>>>> Thank you Vladimir for your review!
>>>>>>
>>>>>> Best regards,
>>>>>> Christian
>>>>>>
>>>>>> On 11.07.20 01:25, Vladimir Kozlov wrote:
>>>>>>> Looks good.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vladimir
>>>>>>>
>>>>>>> On 7/10/20 12:37 AM, Christian Hagedorn wrote:
>>>>>>>> Hi
>>>>>>>>
>>>>>>>> Please review the following patch:
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8248552
>>>>>>>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.00/
>>>>>>>>
>>>>>>>> In the failing testcase, C2 removes a zero check for a 
>>>>>>>> division/modulo node n based on the type information of the loop 
>>>>>>>> induction variable phi p (always between 1 and 50 and never 0). 
>>>>>>>> However, n is later split through p and ends up after the 
>>>>>>>> AddNode which updates the induction variable p. In the last 
>>>>>>>> iteration j equals 2 and is then updated to 0. The 
>>>>>>>> division/modulo node n is now executed before the loop limit 
>>>>>>>> check which results in a SIGFPE.
>>>>>>>>
>>>>>>>> The fix bails out of PhaseIdealLoop::split_thru_phi if a 
>>>>>>>> division or modulo node has its zero check removed (i.e. control 
>>>>>>>> is NULL) and is split through a phi which has an input that 
>>>>>>>> could be zero. This should only happen for an induction variable 
>>>>>>>> phi of a trip-counted (integer) loop.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Christian
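
The bail-out condition described in the patch mail above can be pictured with a small interval model. This is only a sketch under assumptions: `IntRange`, `may_be_zero` and `can_split_div_thru_phi` are hypothetical names chosen for illustration and do not reflect the actual code in PhaseIdealLoop::split_thru_phi.

```cpp
#include <vector>

// Hypothetical stand-in for an integer type range [lo, hi].
struct IntRange {
  int lo;
  int hi;
};

// True if zero lies inside the range.
inline bool may_be_zero(const IntRange& r) {
  return r.lo <= 0 && 0 <= r.hi;
}

// Assumed model of the bail-out: a Div/Mod whose zero check was removed
// (no control input) is not split through a phi when any phi input could
// be zero.
inline bool can_split_div_thru_phi(bool div_has_control,
                                   const std::vector<IntRange>& phi_inputs) {
  if (div_has_control) {
    return true;  // in this model the zero check still guards the division
  }
  for (const IntRange& in : phi_inputs) {
    if (may_be_zero(in)) {
      return false;  // bail out: the divisor could be 0 after the split
    }
  }
  return true;
}
```

With an entry value in 1..50 and a backedge value whose range includes 0, the model refuses the split; with two strictly positive input ranges it allows it.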

From luhenry at microsoft.com  Wed Jul 15 13:27:15 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Wed, 15 Jul 2020 13:27:15 +0000
Subject: [aarch64-port-dev ] RFR(S): 8248676: AArch64: Add workaround for
 LITable constructor
In-Reply-To: <MWHPR21MB0511A82D28A3FE2A09E3E8ADB0650@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB05119533601E0AF1DE3CB0E9B0640@MWHPR21MB0511.namprd21.prod.outlook.com>,
 <af179579-b8cd-db69-fede-802614f23092@redhat.com>,
 <MWHPR21MB0511A82D28A3FE2A09E3E8ADB0650@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <MWHPR21MB051121A1E582BCC22FA96F47B07E0@MWHPR21MB0511.namprd21.prod.outlook.com>

Hi Andrew,

A quick follow-up on that patch. Is there anything you would like to see done differently?

Thank you,

--
Ludovic

________________________________________
From: Ludovic Henry <luhenry at microsoft.com>
Sent: Friday, July 10, 2020 10:58
To: Andrew Haley; hotspot-compiler-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
Cc: openjdk-aarch64
Subject: Re: [aarch64-port-dev ] RFR(S): 8248676: AArch64: Add workaround for LITable constructor

Hi Andrew,

I uploaded a new webrev following your review.

Webrev: http://cr.openjdk.java.net/~burban/luhenry/8248676/webrev.01/
Testing: jtreg:test/hotspot/jtreg:tier1, jtreg:test/jdk:tier1, jtreg:test/jdk:tier2, jtreg:test/langtools on Linux-AArch64, no regressions

Thank you,

________________________________________
From: Andrew Haley <aph at redhat.com>
Sent: Friday, July 10, 2020 01:10
To: Ludovic Henry; hotspot-compiler-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
Cc: openjdk-aarch64
Subject: Re: [aarch64-port-dev ] RFR(S): 8248676: AArch64: Add workaround for LITable constructor

On 09/07/2020 21:31, Ludovic Henry wrote:
> JBS: https://bugs.openjdk.java.net/browse/JDK-8248676
> Webrev: http://cr.openjdk.java.net/~burban/luhenry/8248676/webrev.00/
> Testing: jtreg:test/hotspot/jtreg:tier1, jtreg:test/jdk:tier1, jtreg:test/jdk:tier2, jtreg:test/langtools on Linux-AArch64, no regressions.
>
> This small fix is in the context of the larger support for Windows-AArch64. The attribute `__attribute__ ((constructor))` is not supported by MSVC, and the documented workaround is to allocate an empty static struct with a constructor. This patch only applies this workaround when compiling on Windows, and leaves other platforms unchanged.

Please take out the #ifdef WINDOWS: we can use portable C++ here on all
platforms. Thanks,

--
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com/>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
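
For readers unfamiliar with the workaround under discussion, the portable pattern looks roughly like the following. It is a sketch only: `table_ready`, `init_table`, `TableInitializer` and `is_table_ready` are made-up names, not the identifiers used in the webrev.

```cpp
// Assumed example state that has to be set up once at load time.
static bool table_ready = false;

static void init_table() {
  table_ready = true;
}

// GCC/Clang-only form, shown for comparison (not supported by MSVC):
//   static void __attribute__((constructor)) init_table_ctor() { init_table(); }

// Portable form: an empty struct whose constructor does the work, plus one
// static instance of it. The instance is dynamically initialized no later
// than the first use of anything defined in this translation unit, and in
// practice before main() is entered.
namespace {
struct TableInitializer {
  TableInitializer() { init_table(); }
};
TableInitializer table_initializer;
}  // namespace

// Defined in the same translation unit, so calling it observes the
// initialized state.
bool is_table_ready() {
  return table_ready;
}
```

Because the struct-based form is plain C++, it needs no platform #ifdef, which matches the request to use it on all platforms.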


From christian.hagedorn at oracle.com  Wed Jul 15 15:04:58 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Wed, 15 Jul 2020 17:04:58 +0200
Subject: [16] RFR(XS): 8248467: C2: compiler/intrinsics/object/TestClone fails
 with -XX:+VerifyGraphEdges
Message-ID: <60c17f38-6cb2-d380-252f-15f8d5151b29@oracle.com>

Hi

Please review the following patch:
https://bugs.openjdk.java.net/browse/JDK-8248467
http://cr.openjdk.java.net/~chagedorn/8248467/webrev.00/

The assertion is hit due to a MemBarNode whose precedence edge was set 
to NULL at [1] (result_phi_rawoop is NULL and _resproj is the precedence 
edge to a MemBarStoreStore). This is possible since JDK-8237581 [2] 
which can remove some allocations. The fix just adds this additional 
case in the assert.

Best regards,
Christian


[1] 
http://hg.openjdk.java.net/jdk/jdk/file/4a8fd81d64ba/src/hotspot/share/opto/macro.cpp#l1566
[2] https://bugs.openjdk.java.net/browse/JDK-8237581

From xxinliu at amazon.com  Wed Jul 15 15:31:56 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Wed, 15 Jul 2020 15:31:56 +0000
Subject: question about PrintOptoStatistics atomicity
Message-ID: <1594827116846.89704@amazon.com>

Hi,


I have a question about -XX:+PrintOptoStatistics in c2_globals.hpp.

It dumps many internal counters in different C2 phases.  I found those counters are all static fields.

eg.

http://hg.openjdk.java.net/jdk/jdk/file/4b9ced2b948c/src/hotspot/share/opto/chaitin.cpp#l2297

http://hg.openjdk.java.net/jdk/jdk/file/4b9ced2b948c/src/hotspot/share/opto/phaseX.hpp#l599


I notice that none of the setters of those fields are atomic. IMHO, hotspot may have more than one C2 compiler thread running at the same time.

How does hotspot guarantee that those fields are thread-safe? Or is the flag intended to collect statistics in single-threaded mode by design?


If those counters are not atomic, shall we connect this flag to CICompilerCount?

I think we could constrain the number of C2 compiler threads to 1 if the user sets PrintOptoStatistics. Does that make sense?


thanks,

--lx
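
To make the concern concrete, here is a minimal standalone sketch of a statistics counter that stays exact when several threads update it. It uses `std::atomic` only to keep the example self-contained (HotSpot has its own atomic utilities), and `stat_counter`, `count_event` and `run_workers` are hypothetical names, not HotSpot fields.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical statistics counter; not a real HotSpot field.
static std::atomic<unsigned long> stat_counter{0};

// Relaxed ordering is sufficient for a pure event count: no other data is
// published through the counter, only the final total matters.
inline void count_event() {
  stat_counter.fetch_add(1, std::memory_order_relaxed);
}

// Starts `threads` workers that each record `per_thread` events and
// returns the total observed after all workers have been joined.
unsigned long run_workers(int threads, int per_thread) {
  stat_counter.store(0, std::memory_order_relaxed);
  std::vector<std::thread> pool;
  for (int t = 0; t < threads; ++t) {
    pool.emplace_back([per_thread] {
      for (int i = 0; i < per_thread; ++i) {
        count_event();
      }
    });
  }
  for (std::thread& th : pool) {
    th.join();
  }
  return stat_counter.load(std::memory_order_relaxed);
}
```

With a plain `unsigned long` and `++` instead of `fetch_add`, concurrent increments would be a data race and the total could come out lower than expected, which is the situation the question describes.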


From jamsheed.c.m at oracle.com  Wed Jul 15 15:55:44 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Wed, 15 Jul 2020 21:25:44 +0530
Subject: [15] RFR: 8246381: VM crashes with "Current BasicObjectLock* below
 than low_mark"
Message-ID: <7a802330-e836-1ff3-af0a-ede587e049ff@oracle.com>

Hi,

Async handling at method entry needs to be aware of synchronization 
(i.e. whether the async handling happens before or after the lock is 
acquired).

This is required because the exception handler relies on this info for 
unlocking. The async handling code never handled this special condition; 
it worked most of the time because we were using biased locking, which 
got disabled by [1].

There was one other issue reported around the same time [2]. This issue 
got triggered in the test case by [3]: a back-to-back extra safepoint 
after suspend and TLH for ThreadDeath. In this setup, both the PopFrame 
request and the Thread.Stop request happened together for the test 
scenario, and it reached Java method entry with pending_exception set.

I have done a partial fix for the issue, mainly to handle production 
mode crash failures (the "do not unlock" flag related ones).

Fix detail:

1) I save and restore the "do not unlock" flag in async handling.

2) Return on a floating pending exception for some cases (PopFrame, early 
return related). This is a debug (JVMTI) feature, and a floating exception 
can get cleared just like that in the present compiler request and deopt code.

webrev: http://cr.openjdk.java.net/~jcm/8246381/webrev.02/

There are more problems in these code areas; for example, we clear all 
exceptions in the compilation request path (interpreter, C1), as well as 
in the deoptimization path.

All these un-handled cases will be separately handled by 
https://bugs.openjdk.java.net/browse/JDK-8249451

Request for review.

Best regards,

Jamsheed

[1] https://bugs.openjdk.java.net/browse/JDK-8231264

[2] https://bugs.openjdk.java.net/browse/JDK-8246727

[3] https://bugs.openjdk.java.net/browse/JDK-8221207


From vladimir.kozlov at oracle.com  Wed Jul 15 17:26:17 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 15 Jul 2020 10:26:17 -0700
Subject: [16] RFR(XS): 8248467: C2: compiler/intrinsics/object/TestClone
 fails with -XX:+VerifyGraphEdges
In-Reply-To: <60c17f38-6cb2-d380-252f-15f8d5151b29@oracle.com>
References: <60c17f38-6cb2-d380-252f-15f8d5151b29@oracle.com>
Message-ID: <6a458143-aeee-486b-2bc5-a210779c26dc@oracle.com>

Good.

Thanks,
Vladimir

On 7/15/20 8:04 AM, Christian Hagedorn wrote:
> Hi
> 
> Please review the following patch:
> https://bugs.openjdk.java.net/browse/JDK-8248467
> http://cr.openjdk.java.net/~chagedorn/8248467/webrev.00/
> 
> The assertion is hit due to a MemBarNode whose precedence edge was set to NULL at [1] (result_phi_rawoop is NULL and 
> _resproj is the precedence edge to a MemBarStoreStore). This is possible since JDK-8237581 [2] which can remove some 
> allocations. The fix just adds this additional case in the assert.
> 
> Best regards,
> Christian
> 
> 
> [1] http://hg.openjdk.java.net/jdk/jdk/file/4a8fd81d64ba/src/hotspot/share/opto/macro.cpp#l1566
> [2] https://bugs.openjdk.java.net/browse/JDK-8237581

From vladimir.kozlov at oracle.com  Wed Jul 15 17:37:50 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 15 Jul 2020 10:37:50 -0700
Subject: [16] RFR(S): 8247743: Segmentation fault in debug builds due to
 stack overflow in find_recur with deep graphs
In-Reply-To: <53d1eebe-e85f-58cb-7fba-0baf2ecf8701@oracle.com>
References: <9af7a44c-4267-4900-812c-12aa0c37713a@oracle.com>
 <518ffdf1-143a-06f3-9aa4-96871d72d024@oracle.com>
 <9b3a9632-c7bb-2f51-c295-72935add2670@oracle.com>
 <2f317601-4845-541d-e2ef-ad7735386f1c@oracle.com>
 <7cfafcb9-6232-5738-6cad-508127fd31e8@oracle.com>
 <53d1eebe-e85f-58cb-7fba-0baf2ecf8701@oracle.com>
Message-ID: <d793324b-88ed-d86b-0655-76443460d219@oracle.com>

Looks good.

Thanks,
Vladimir K

On 7/15/20 12:58 AM, Christian Hagedorn wrote:
> Hi Vladimir
> 
> On 14.07.20 20:46, Vladimir Kozlov wrote:
>> Can you move next up to where other small find*() methods are defined?:
>>
>> +Node* Node::find_ctrl(int idx) {
>> +  return find(idx, true);
>>   }
>>
>> Also add '// not PRODUCT' comment to #endif for #ifndef PRODUCT. It is hard to find where this not product code ends.
>>
>> Looks good otherwise.
> 
> Thanks, I added these changes in a new webrev:
> http://cr.openjdk.java.net/~chagedorn/8247743/webrev.02/
> 
> Best regards,
> Christian
> 
> 
>> Thanks,
>> Vladimir
>>
>> On 7/14/20 2:54 AM, Christian Hagedorn wrote:
>>> Hi Vladimir
>>>
>>> On 13.07.20 19:43, Vladimir Kozlov wrote:
>>>> Node::find_ctrl() is used during debugging when you want to print and look on only control nodes.
>>>> We have several such methods which are only used in debugger.
>>>
>>> I see, I restored this method and changed Node::find() accordingly. I additionally added two find_ctrl() methods to 
>>> make it easier to call it from a debugger (as already present for find_node()).
>>>
>>>> I suggest to store old_arena() in local var and pass into add_to_worklist().
>>>>
>>>> You can make add_to_worklist() static since you pass node as argument.
>>>
>>> Okay. I updated this and the change above in a new webrev:
>>> http://cr.openjdk.java.net/~chagedorn/8247743/webrev.01/
>>>
>>> Best regards,
>>> Christian
>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 7/13/20 3:09 AM, Christian Hagedorn wrote:
>>>>> Ping - could anyone review it, please? Thanks!
>>>>>
>>>>> Best regards,
>>>>> Christian
>>>>>
>>>>> On 02.07.20 09:33, Christian Hagedorn wrote:
>>>>>> Hi
>>>>>>
>>>>>> Please review the following patch:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8247743
>>>>>> http://cr.openjdk.java.net/~chagedorn/8247743/webrev.00/
>>>>>>
>>>>>> The testcase creates a deep graph with a lot of nodes on a chain. When running with the specified test flags, it 
>>>>>> recursively calls Node::find_recur() for each node discovered which eventually results in a segmentation fault due 
>>>>>> to a stack overflow (around 10000 calls due to such a long chain of nodes). The fix just converts the recursive 
>>>>>> algorithm into an iterative one to avoid a segmentation fault. This is similar to JDK-8246203 [1].
>>>>>>
>>>>>> I additionally removed Node::find_ctrl() and its special handling in the algorithm since it is not used.
>>>>>>
>>>>>> There is actually another problem with the recursive version. When running the testcase without 
>>>>>> -XX:CompileOnly=compiler/c2/TestFindNode, it will spin forever inside [2] because there is a debug_orig node cycle 
>>>>>> and the loop does not break based on the debug_orig nodes being visited. This is also fixed in the patch.
>>>>>>
>>>>>> Thank you!
>>>>>>
>>>>>> Best regards,
>>>>>> Christian
>>>>>>
>>>>>>
>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8246203
>>>>>> [2] http://hg.openjdk.java.net/jdk/jdk/file/e2622818f0bd/src/hotspot/share/opto/node.cpp#l1589
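
The recursive-to-iterative conversion described in the original mail of this thread can be sketched as follows. This is an illustration under assumptions, not the code from the webrev: `GNode`, `find_iterative` and `make_cyclic_chain` are hypothetical, and the real Node::find() walks more edge kinds than the single input list modelled here.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical minimal graph node; not HotSpot's Node class.
struct GNode {
  int idx;
  std::vector<GNode*> inputs;
};

// Iterative depth-first search for the node with index `idx`.
// The explicit worklist replaces the call stack, so the search depth is
// bounded by heap memory rather than by the thread stack. The visited set
// guarantees termination when the graph contains a cycle.
inline GNode* find_iterative(GNode* start, int idx) {
  if (start == nullptr) {
    return nullptr;
  }
  std::vector<GNode*> worklist{start};
  std::unordered_set<GNode*> visited{start};
  while (!worklist.empty()) {
    GNode* n = worklist.back();
    worklist.pop_back();
    if (n->idx == idx) {
      return n;
    }
    for (GNode* in : n->inputs) {
      if (in != nullptr && visited.insert(in).second) {
        worklist.push_back(in);  // first time this node is seen
      }
    }
  }
  return nullptr;
}

// Builds a chain n[0] -> n[1] -> ... -> n[len-1] -> n[0]; the last edge
// closes a cycle, similar in spirit to the debug_orig cycle in the mail.
inline std::vector<GNode> make_cyclic_chain(int len) {
  std::vector<GNode> nodes(len);
  for (int i = 0; i < len; ++i) {
    nodes[i].idx = i;
    nodes[i].inputs.push_back(&nodes[(i + 1) % len]);
  }
  return nodes;
}
```

A chain far longer than the roughly 10000 nodes mentioned in the mail is searched without deep recursion, and a lookup of a missing index returns instead of spinning on the cycle.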

From christian.hagedorn at oracle.com  Wed Jul 15 17:42:36 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Wed, 15 Jul 2020 19:42:36 +0200
Subject: [16] RFR(XS): 8248467: C2: compiler/intrinsics/object/TestClone
 fails with -XX:+VerifyGraphEdges
In-Reply-To: <6a458143-aeee-486b-2bc5-a210779c26dc@oracle.com>
References: <60c17f38-6cb2-d380-252f-15f8d5151b29@oracle.com>
 <6a458143-aeee-486b-2bc5-a210779c26dc@oracle.com>
Message-ID: <f368e967-4037-244d-0eb5-81dfbf29c039@oracle.com>

Thank you Vladimir for your review!

Best regards,
Christian

On 15.07.20 19:26, Vladimir Kozlov wrote:
> Good.
> 
> Thanks,
> Vladimir
> 
> On 7/15/20 8:04 AM, Christian Hagedorn wrote:
>> Hi
>>
>> Please review the following patch:
>> https://bugs.openjdk.java.net/browse/JDK-8248467
>> http://cr.openjdk.java.net/~chagedorn/8248467/webrev.00/
>>
>> The assertion is hit due to a MemBarNode whose precedence edge was set 
>> to NULL at [1] (result_phi_rawoop is NULL and _resproj is the 
>> precedence edge to a MemBarStoreStore). This is possible since 
>> JDK-8237581 [2] which can remove some allocations. The fix just adds 
>> this additional case in the assert.
>>
>> Best regards,
>> Christian
>>
>>
>> [1] 
>> http://hg.openjdk.java.net/jdk/jdk/file/4a8fd81d64ba/src/hotspot/share/opto/macro.cpp#l1566 
>>
>> [2] https://bugs.openjdk.java.net/browse/JDK-8237581

From vladimir.kozlov at oracle.com  Wed Jul 15 17:43:03 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 15 Jul 2020 10:43:03 -0700
Subject: [16] RFR(S): 8248552: C2 crashes with SIGFPE due to division by
 zero
In-Reply-To: <5b2e7b1b-24f7-d575-58a3-376ec9ab7944@oracle.com>
References: <70e8e42b-5cb3-9c1e-419e-2f771f042368@oracle.com>
 <3ba2ef6a-8ade-7ede-5252-21051c34b472@oracle.com>
 <9e2f26bd-daa4-9540-8401-9850e0beea94@oracle.com>
 <c569ebf8-58ed-c6df-2e0e-38c692f926c6@oracle.com>
 <a2066deb-ef3b-7071-33ff-03c829069cff@oracle.com>
 <f6a62d10-9cff-3fc2-0947-c49e775a23c9@oracle.com>
 <e8d6aa47-b584-4fdc-f506-1b1c52aa28f5@oracle.com>
 <ca043859-bb7b-f64a-a37d-7b4e47c745d6@oracle.com>
 <5b2e7b1b-24f7-d575-58a3-376ec9ab7944@oracle.com>
Message-ID: <150a1de1-86bb-22a4-6b9c-b868cb686cea@oracle.com>

Looks good.

Thanks,
Vladimir

On 7/15/20 6:08 AM, Christian Hagedorn wrote:
> Hi Vladimir
> 
> On 14.07.20 21:07, Vladimir Kozlov wrote:
>>  > Thinking about this type propagation problem, couldn't we somehow set the type of the Opaque1 node hiding the pre-loop
>>  > limit to the same type as the pre-loop limit to allow this information to flow to the pre and main loop? Or would that
>>  > cause other problems? I guess there probably must be a reason why we don't do it like that.
>>
>> It has wide type to prevent premature optimizations before loop is fully transformed. That is the reason we add it in 
>> first place.
>>
>> But it would be interesting to see if we can use more narrow type: TypeInt::POS1 for example for positive limits (>0) 
>> (and opposite for negative limits < 0). I may be missing some nuances and it may not work but we should try.
> 
> I had an additional discussion about this with Roland. He made a good point that not the Opaque1 nodes themselves are 
> the problem but rather the type of the iv phi, or more specifically the PhiNode::Value() function.
> 
> Before creating pre/main/post loops, the iv phi has already a narrow type 1..300 set by PhiNode::Value(). However, when 
> creating the pre (and post loop), we actually widen the type of the iv phi of the pre-loop to int>=1 (based on the 
> pre-loop limit which is an Opaque1 node with type int). Roland suggested that we should not do that but instead filter 
> the returned type with the already existing type to not widen it. I think that makes sense. We are already doing that 
> for the other cases in PhiNode::Value() [1][2]. It looks like we just miss it for the special handling of iv phis of 
> trip-counted loops. This also fixes the assertion failure that occurred before with webrev.00.
> 
> I created a new webrev based on webrev.00 with this change in PhiNode::Value():
> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.02/
> 
> I'm currently running some testing with it again.
> 
> Best regards,
> Christian
> 
> 
> [1] http://hg.openjdk.java.net/jdk/jdk/file/9ea3344c6445/src/hotspot/share/opto/cfgnode.cpp#l1097
> [2] http://hg.openjdk.java.net/jdk/jdk/file/9ea3344c6445/src/hotspot/share/opto/cfgnode.cpp#l1157
> 
>> Regards,
>> Vladimir
>>
>> On 7/14/20 5:39 AM, Christian Hagedorn wrote:
>>>
>>>> [..] Since the DivINode has a control outside of the main loop [..]
>>>
>>> Edit: I actually meant that get_ctrl() returns a node outside of the main-loop (i.e. the DivINode is not part of the 
>>> main-loop body). The DivINode still has NULL as control input.
>>>
>>> Best regards,
>>> Christian
>>>
>>> On 14.07.20 14:32, Christian Hagedorn wrote:
>>>> Hi Vladimir
>>>>
>>>> I had a closer look at the failing testcase with webrev.00. The original DivNode has its zero check removed based on 
>>>> correct type information. Afterwards it is split through an induction variable phi for which both inputs have non-zero 
>>>> types. So, the DivNode ends up after an AddINode (which adds a positive constant) which has a non-zero type. All good 
>>>> so far.
>>>>
>>>> Now we add pre/main/post loops and the induction variable phi for the pre-loop gets type int>=1 since the limit for 
>>>> the pre-loop is hidden behind an Opaque1 node which just returns int as type. The AddINode belonging to the loop 
>>>> induction variable phi in the pre-loop is therefore updated to have the type int as well (int>=1 + positive_int 
>>>> could overflow). This type information propagates to the main-loop and its AddINode belonging to the loop induction 
>>>> variable phi (which is an input to the DivNode) also gets its type set to int.
>>>>
>>>> Later, we add a vector post loop where we clone the main loop and add a phi p for the AddINode and its new 
>>>> clone. Since the DivINode has a control outside of the main loop, it is not cloned and gets the phi p as an input. 
>>>> At a later point in time, we want to split through p. But then we detect zero as possible value due to the type 
>>>> range of both AddINodes being int.
>>>>
>>>> Even though the type information is not accurate enough, the DivINode is never zero and we could safely apply the 
>>>> split through the phi. We could think about doing a bail out for all kinds of phis but I think it should only be an 
>>>> actual problem for loop induction variable phis.
>>>>
>>>> Thinking about this type propagation problem, couldn't we somehow set the type of the Opaque1 node hiding the 
>>>> pre-loop limit to the same type as the pre-loop limit to allow this information to flow to the pre and main loop? Or 
>>>> would that cause other problems? I guess there probably must be a reason why we don't do it like that.
>>>>
>>>> Best regards,
>>>> Christian
>>>>
>>>> On 13.07.20 19:16, Vladimir Kozlov wrote:
>>>>> This raises the question: why was the zero check removed if one of the merged types has 0?
>>>>> Should we be more careful when we remove the zero check?
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 7/13/20 2:06 AM, Christian Hagedorn wrote:
>>>>>> A test in some later tier testing revealed that the assertion code is actually too strong. There can be a Div/Mod 
>>>>>> node whose zero check was removed but that is then split through a non-induction-variable phi whose inputs have 
>>>>>> zero in their type range (which is fine, this happens in some loop opts after partial peeling was applied 
>>>>>> earlier). This happened, for example, for a phi which merged two nodes from the original and a cloned loop. I 
>>>>>> think we just need to remove the additional assertion code.
>>>>>>
>>>>>> New webrev:
>>>>>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.01/
>>>>>>
>>>>>> Best regards,
>>>>>> Christian
>>>>>>
>>>>>> On 13.07.20 09:19, Christian Hagedorn wrote:
>>>>>>> Thank you Vladimir for your review!
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Christian
>>>>>>>
>>>>>>> On 11.07.20 01:25, Vladimir Kozlov wrote:
>>>>>>>> Looks good.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Vladimir
>>>>>>>>
>>>>>>>> On 7/10/20 12:37 AM, Christian Hagedorn wrote:
>>>>>>>>> Hi
>>>>>>>>>
>>>>>>>>> Please review the following patch:
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8248552
>>>>>>>>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.00/
>>>>>>>>>
>>>>>>>>> In the failing testcase, C2 removes a zero check for a division/modulo node n based on the type information of 
>>>>>>>>> the loop induction variable phi p (always between 1 and 50 and never 0). However, n is later split through p 
>>>>>>>>> and ends up after the AddNode which updates the induction variable p. In the last iteration j equals 2 and is 
>>>>>>>>> then updated to 0. The division/modulo node n is now executed before the loop limit check which results in a 
>>>>>>>>> SIGFPE.
>>>>>>>>>
>>>>>>>>> The fix bails out of PhaseIdealLoop::split_thru_phi if a division or modulo node has its zero check removed 
>>>>>>>>> (i.e. control is NULL) and is split through a phi which has an input that could be zero. This should only 
>>>>>>>>> happen for an induction variable phi of a trip-counted (integer) loop.
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Christian
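
The type-widening effect discussed in this thread, and the filtering that Roland suggested, can be modelled with plain integer intervals. The sketch below is an assumption-laden illustration: `Range`, `filter` and `add_positive` are hypothetical helpers and are not the C2 type system or the actual change to PhiNode::Value().

```cpp
#include <algorithm>
#include <climits>

// Hypothetical integer range [lo, hi]; a stand-in for a C2 integer type.
struct Range {
  int lo;
  int hi;
};

// Intersection of a newly computed range with the already known one, so
// that the result is never wider than what was known before. Assumes the
// two ranges overlap, which holds when both describe the same value.
inline Range filter(const Range& computed, const Range& existing) {
  return Range{std::max(computed.lo, existing.lo),
               std::min(computed.hi, existing.hi)};
}

// Range of (r + stride) for stride > 0. If the addition could overflow,
// the only safe answer is the full int range, which then contains zero.
inline Range add_positive(const Range& r, int stride) {
  if (r.hi > INT_MAX - stride) {
    return Range{INT_MIN, INT_MAX};
  }
  return Range{r.lo + stride, r.hi + stride};
}
```

In this model an iv phi widened to int>=1 makes the following add span the whole int range, while filtering it against the previously known 1..300 keeps the add at 2..301, which excludes zero.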

From christian.hagedorn at oracle.com  Wed Jul 15 17:43:48 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Wed, 15 Jul 2020 19:43:48 +0200
Subject: [16] RFR(S): 8247743: Segmentation fault in debug builds due to
 stack overflow in find_recur with deep graphs
In-Reply-To: <d793324b-88ed-d86b-0655-76443460d219@oracle.com>
References: <9af7a44c-4267-4900-812c-12aa0c37713a@oracle.com>
 <518ffdf1-143a-06f3-9aa4-96871d72d024@oracle.com>
 <9b3a9632-c7bb-2f51-c295-72935add2670@oracle.com>
 <2f317601-4845-541d-e2ef-ad7735386f1c@oracle.com>
 <7cfafcb9-6232-5738-6cad-508127fd31e8@oracle.com>
 <53d1eebe-e85f-58cb-7fba-0baf2ecf8701@oracle.com>
 <d793324b-88ed-d86b-0655-76443460d219@oracle.com>
Message-ID: <b87d1d92-3e2d-b177-18ea-4042c6d038e1@oracle.com>

Thank you Vladimir for your review!

Best regards,
Christian

On 15.07.20 19:37, Vladimir Kozlov wrote:
> Looks good.
> 
> Thanks,
> Vladimir K
> 
> On 7/15/20 12:58 AM, Christian Hagedorn wrote:
>> Hi Vladimir
>>
>> On 14.07.20 20:46, Vladimir Kozlov wrote:
>>> Can you move next up to where other small find*() methods are defined?:
>>>
>>> +Node* Node::find_ctrl(int idx) {
>>> +  return find(idx, true);
>>>   }
>>>
>>> Also add '// not PRODUCT' comment to #endif for #ifndef PRODUCT. It 
>>> is hard to find where this not product code ends.
>>>
>>> Looks good otherwise.
>>
>> Thanks, I added these changes in a new webrev:
>> http://cr.openjdk.java.net/~chagedorn/8247743/webrev.02/
>>
>> Best regards,
>> Christian
>>
>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 7/14/20 2:54 AM, Christian Hagedorn wrote:
>>>> Hi Vladimir
>>>>
>>>> On 13.07.20 19:43, Vladimir Kozlov wrote:
>>>>> Node::find_ctrl() is used during debugging when you want to print 
>>>>> and look on only control nodes.
>>>>> We have several such methods which are only used in debugger.
>>>>
>>>> I see, I restored this method and changed Node::find() accordingly. 
>>>> I additionally added two find_ctrl() methods to make it easier to 
>>>> call it from a debugger (as already present for find_node()).
>>>>
>>>>> I suggest to store old_arena() in local var and pass into 
>>>>> add_to_worklist().
>>>>>
>>>>> You can make add_to_worklist() static since you pass node as argument.
>>>>
>>>> Okay. I updated this and the change above in a new webrev:
>>>> http://cr.openjdk.java.net/~chagedorn/8247743/webrev.01/
>>>>
>>>> Best regards,
>>>> Christian
>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 7/13/20 3:09 AM, Christian Hagedorn wrote:
>>>>>> Ping - could anyone review it, please? Thanks!
>>>>>>
>>>>>> Best regards,
>>>>>> Christian
>>>>>>
>>>>>> On 02.07.20 09:33, Christian Hagedorn wrote:
>>>>>>> Hi
>>>>>>>
>>>>>>> Please review the following patch:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8247743
>>>>>>> http://cr.openjdk.java.net/~chagedorn/8247743/webrev.00/
>>>>>>>
>>>>>>> The testcase creates a deep graph with a lot of nodes on a chain. 
>>>>>>> When running with the specified test flags, it recursively calls 
>>>>>>> Node::find_recur() for each node discovered which eventually 
>>>>>>> results in a segmentation fault due to a stack overflow (around 
>>>>>>> 10000 calls due to such a long chain of nodes). The fix just 
>>>>>>> converts the recursive algorithm into an iterative one to avoid a 
>>>>>>> segmentation fault. This is similar to JDK-8246203 [1].
>>>>>>>
>>>>>>> I additionally removed Node::find_ctrl() and its special handling 
>>>>>>> in the algorithm since it is not used.
>>>>>>>
>>>>>>> There is actually another problem with the recursive version. 
>>>>>>> When running the testcase without 
>>>>>>> -XX:CompileOnly=compiler/c2/TestFindNode, it will spin forever 
>>>>>>> inside [2] because there is a debug_orig node cycle and the loop 
>>>>>>> does not break based on the debug_orig nodes being visited. This 
>>>>>>> is also fixed in the patch.
>>>>>>>
>>>>>>> Thank you!
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Christian
>>>>>>>
>>>>>>>
>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8246203
>>>>>>> [2] 
>>>>>>> http://hg.openjdk.java.net/jdk/jdk/file/e2622818f0bd/src/hotspot/share/opto/node.cpp#l1589 
>>>>>>>

From christian.hagedorn at oracle.com  Wed Jul 15 17:44:27 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Wed, 15 Jul 2020 19:44:27 +0200
Subject: [16] RFR(S): 8248552: C2 crashes with SIGFPE due to division by
 zero
In-Reply-To: <150a1de1-86bb-22a4-6b9c-b868cb686cea@oracle.com>
References: <70e8e42b-5cb3-9c1e-419e-2f771f042368@oracle.com>
 <3ba2ef6a-8ade-7ede-5252-21051c34b472@oracle.com>
 <9e2f26bd-daa4-9540-8401-9850e0beea94@oracle.com>
 <c569ebf8-58ed-c6df-2e0e-38c692f926c6@oracle.com>
 <a2066deb-ef3b-7071-33ff-03c829069cff@oracle.com>
 <f6a62d10-9cff-3fc2-0947-c49e775a23c9@oracle.com>
 <e8d6aa47-b584-4fdc-f506-1b1c52aa28f5@oracle.com>
 <ca043859-bb7b-f64a-a37d-7b4e47c745d6@oracle.com>
 <5b2e7b1b-24f7-d575-58a3-376ec9ab7944@oracle.com>
 <150a1de1-86bb-22a4-6b9c-b868cb686cea@oracle.com>
Message-ID: <fd69fe58-a170-e1ac-3547-0af79824433e@oracle.com>

Thank you Vladimir for your review!

Best regards,
Christian

On 15.07.20 19:43, Vladimir Kozlov wrote:
> Looks good.
> 
> Thanks,
> Vladimir
> 
> On 7/15/20 6:08 AM, Christian Hagedorn wrote:
>> Hi Vladimir
>>
>> On 14.07.20 21:07, Vladimir Kozlov wrote:
>>>  > Thinking about this type propagation problem, couldn't we somehow 
>>>  > set the type of the Opaque1 node hiding the pre-loop limit to the 
>>>  > same type as the pre-loop limit to allow this information to flow 
>>>  > to the pre and main loop? Or would that cause other problems? I 
>>>  > guess there probably must be a reason why we don't do it like that.
>>>
>>> It has wide type to prevent premature optimizations before loop is 
>>> fully transformed. That is the reason we add it in first place.
>>>
>>> But it would be interesting to see if we can use more narrow type: 
>>> TypeInt::POS1 for example for positive limits (>0) (and opposite for 
>>> negative limits < 0). I may be missing some nuances and it may not 
>>> work but we should try.
>>
>> I had an additional discussion about this with Roland. He made a good 
>> point that not the Opaque1 nodes themselves are the problem but rather 
>> the type of the iv phi, or more specifically the PhiNode::Value() 
>> function.
>>
>> Before creating pre/main/post loops, the iv phi has already a narrow 
>> type 1..300 set by PhiNode::Value(). However, when creating the pre 
>> (and post loop), we actually widen the type of the iv phi of the 
>> pre-loop to int>=1 (based on the pre-loop limit which is an Opaque1 
>> node with type int). Roland suggested that we should not do that but 
>> instead filter the returned type with the already existing type to not 
>> widen it. I think that makes sense. We are already doing that for the 
>> other cases in PhiNode::Value() [1][2]. It looks like we just miss it 
>> for the special handling of iv phis of trip-counted loops. This also 
>> fixes the assertion failure that occurred before with webrev.00.
>>
>> I created a new webrev based on webrev.00 with this change in 
>> PhiNode::Value():
>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.02/
>>
>> I'm currently running some testing with it again.
>>
>> Best regards,
>> Christian
>>
>>
>> [1] 
>> http://hg.openjdk.java.net/jdk/jdk/file/9ea3344c6445/src/hotspot/share/opto/cfgnode.cpp#l1097 
>>
>> [2] 
>> http://hg.openjdk.java.net/jdk/jdk/file/9ea3344c6445/src/hotspot/share/opto/cfgnode.cpp#l1157 
>>
>>
>>> Regards,
>>> Vladimir
>>>
>>> On 7/14/20 5:39 AM, Christian Hagedorn wrote:
>>>>
>>>>> [..] Since the DivINode has a control outside of the main loop [..]
>>>>
>>>> Edit: I actually meant that get_ctrl() returns a node outside of the 
>>>> main-loop (i.e. the DivINode is not part of the main-loop body). The 
>>>> DivINode still has NULL as control input.
>>>>
>>>> Best regards,
>>>> Christian
>>>>
>>>> On 14.07.20 14:32, Christian Hagedorn wrote:
>>>>> Hi Vladimir
>>>>>
>>>>> I had a closer look at the failing testcase with webrev.00. The 
>>>>> original DivNode has its zero check removed based on correct type 
>>>>> information. Afterwards it is split through an induction variable phi 
>>>>> for which both inputs have non-zero types. So, the DivNode ends up 
>>>>> after an AddINode (which adds a positive constant) which has a 
>>>>> non-zero type. All good so far.
>>>>>
>>>>> Now we add pre/main/post loops and the induction variable phi for 
>>>>> the pre-loop gets type int>=1 since the limit for the pre-loop is 
>>>>> hidden behind an Opaque1 node which just returns int as type. The 
>>>>> AddINode belonging to the loop induction variable phi in the 
>>>>> pre-loop is therefore updated to have the type int as well (int>=1 
>>>>> + positive_int could overflow). This type information propagates to 
>>>>> the main-loop and its AddINode belonging to the loop induction 
>>>>> variable phi (which is an input to the DivNode) also gets its type 
>>>>> set to int.
>>>>>
>>>>> Later, we add a vector post loop where we clone the main loop and 
>>>>> add a phi p for the AddINode node and its new clone. Since the 
>>>>> DivINode has a control outside of the main loop, it is not cloned 
>>>>> and gets the phi p as an input. At a later point in time, we want 
>>>>> to split through p. But then we detect zero as possible value due 
>>>>> to the type range of both AddINodes being int.
>>>>>
>>>>> Even though the type information is not accurate enough, the 
>>>>> DivINode is never zero and we could safely apply the split through 
>>>>> the phi. We could think about doing a bail out for all kinds of 
>>>>> phis but I think it should only be an actual problem for loop 
>>>>> induction variable phis.
>>>>>
>>>>> Thinking about this type propagation problem, couldn't we somehow 
>>>>> set the type of the Opaque1 node hiding the pre-loop limit to the 
>>>>> same type as the pre-loop limit to allow this information to flow 
>>>>> to the pre and main loop? Or would that cause other problems? I 
>>>>> guess there probably must be a reason why we don't do it like that.
>>>>>
>>>>> Best regards,
>>>>> Christian
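
The type-widening step described in the message above (int>=1 plus a positive constant falling back to plain int because the sum could overflow) can be sketched with simple interval arithmetic. This is only an illustration under stated assumptions: IntRange and its methods are hypothetical names, not HotSpot code (C2 models this with TypeInt in C++), and the rule is simplified.

```java
// Hypothetical sketch of conservative range addition; not the C2 implementation.
final class IntRange {
    final int lo;
    final int hi;

    IntRange(int lo, int hi) {
        this.lo = lo;
        this.hi = hi;
    }

    // If either bound of the sum leaves the int range, give up and return the full int range.
    static IntRange add(IntRange a, IntRange b) {
        long lo = (long) a.lo + b.lo;
        long hi = (long) a.hi + b.hi;
        if (lo < Integer.MIN_VALUE || hi > Integer.MAX_VALUE) {
            return new IntRange(Integer.MIN_VALUE, Integer.MAX_VALUE);
        }
        return new IntRange((int) lo, (int) hi);
    }

    // True if zero lies inside the range, i.e. a divisor of this type might be zero.
    boolean mayBeZero() {
        return lo <= 0 && hi >= 0;
    }

    public static void main(String[] args) {
        IntRange stride = new IntRange(2, 2);                     // assumed positive constant
        IntRange bounded = add(new IntRange(1, 50), stride);      // stays non-zero
        IntRange widened = add(new IntRange(1, Integer.MAX_VALUE), stride); // overflows, becomes int
        System.out.println("bounded may be zero: " + bounded.mayBeZero());
        System.out.println("widened may be zero: " + widened.mayBeZero());
    }
}
```

With a bounded input the sum keeps a non-zero range; once the input is only known to be int>=1, the sum degrades to the full int range and zero can no longer be excluded.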
>>>>>
>>>>> On 13.07.20 19:16, Vladimir Kozlov wrote:
>>>>>> This raises a question: why was the zero check removed if one of the merged 
>>>>>> types has 0?
>>>>>> Should we be more careful when we remove the zero check?
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>> On 7/13/20 2:06 AM, Christian Hagedorn wrote:
>>>>>>> A test in some later tier testing revealed that the assertion 
>>>>>>> code is actually too strong. There can be a Div/Mod node whose 
>>>>>>> zero check was removed but that is then split through a 
>>>>>>> non-induction-variable phi whose inputs have zero in their type 
>>>>>>> range (which is fine, this happens in some loop opts after 
>>>>>>> partial peeling was applied earlier). This happened, for example, 
>>>>>>> for a phi which merged two nodes from the original and a cloned 
>>>>>>> loop. I think we just need to remove the additional assertion code.
>>>>>>>
>>>>>>> New webrev:
>>>>>>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.01/
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Christian
>>>>>>>
>>>>>>> On 13.07.20 09:19, Christian Hagedorn wrote:
>>>>>>>> Thank you Vladimir for your review!
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Christian
>>>>>>>>
>>>>>>>> On 11.07.20 01:25, Vladimir Kozlov wrote:
>>>>>>>>> Looks good.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Vladimir
>>>>>>>>>
>>>>>>>>> On 7/10/20 12:37 AM, Christian Hagedorn wrote:
>>>>>>>>>> Hi
>>>>>>>>>>
>>>>>>>>>> Please review the following patch:
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8248552
>>>>>>>>>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.00/
>>>>>>>>>>
>>>>>>>>>> In the failing testcase, C2 removes a zero check for a 
>>>>>>>>>> division/modulo node n based on the type information of the 
>>>>>>>>>> loop induction variable phi p (always between 1 and 50 and 
>>>>>>>>>> never 0). However, n is later split through p and ends up 
>>>>>>>>>> after the AddNode which updates the induction variable p. In 
>>>>>>>>>> the last iteration j equals 2 and is then updated to 0. The 
>>>>>>>>>> division/modulo node n is now executed before the loop limit 
>>>>>>>>>> check which results in a SIGFPE.
>>>>>>>>>>
>>>>>>>>>> The fix bails out of PhaseIdealLoop::split_thru_phi if a 
>>>>>>>>>> division or modulo node has its zero check removed (i.e. 
>>>>>>>>>> control is NULL) and is split through a phi which has an input 
>>>>>>>>>> that could be zero. This should only happen for an induction 
>>>>>>>>>> variable phi of a trip-counted (integer) loop.
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>> Christian
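
The failure mode described in the original review request can be imitated at the Java source level. The sketch below is hypothetical: it is not the failing testcase from JDK-8248552 (which is not shown in this thread), and the loop bounds and stride are assumptions chosen so that j is 2 in the last iteration and is then updated to 0. The second method hand-simulates the division being moved after the induction variable update and before the limit check.

```java
// Hypothetical illustration only; not the failing testcase and not C2 output.
class SplitThruPhiSketch {
    // Original shape: inside the body j is always in [2, 50], so dividing by j is safe
    // and a zero check on j would be redundant.
    static int original(int x) {
        int sum = 0;
        for (int j = 50; j > 0; j -= 2) {
            sum += x / j;
        }
        return sum;
    }

    // Hand-written analogue of the unsafe split: the division is applied to the updated
    // value of j before the loop-limit check is evaluated.
    static int unsafeSplit(int x) {
        int sum = 0;
        int j = 50;
        int q = x / j;              // division on the loop-entry input of the phi
        while (true) {
            sum += q;
            j -= 2;                 // update of the induction variable
            q = x / j;              // division on the updated value: j is 0 on the last trip
            if (j <= 0) {           // limit check runs too late
                break;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(original(100));
        try {
            unsafeSplit(100);
        } catch (ArithmeticException e) {
            System.out.println("division by zero reached before the limit check");
        }
    }
}
```

In Java source the second variant throws ArithmeticException; in compiled code without a zero check the same ordering corresponds to the SIGFPE mentioned above.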

From vladimir.kozlov at oracle.com  Wed Jul 15 17:50:56 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 15 Jul 2020 10:50:56 -0700
Subject: question about PrintOptoStatistics atomicity
In-Reply-To: <1594827116846.89704@amazon.com>
References: <1594827116846.89704@amazon.com>
Message-ID: <6b3d2637-e01c-8ab9-e32c-2404c7b2a40a@oracle.com>

It was done intentionally: when that code was implemented, atomic operations were expensive.
We never intended these counters to be precise - they were used mostly for debugging purposes.
It is up to the user how to use them - for example, by running with only one C2 thread.
When you collect data for a general application you want to run it with the same parameters as in production. I don't 
think we should enforce any restrictions in the VM when PrintOptoStatistics is used.
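
As a rough illustration of the trade-off discussed here, the following sketch compares a plain static counter with an atomic one under several threads. It is written in Java rather than HotSpot C++, and CounterSketch, plainCounter and atomicCounter are hypothetical names, not the counters in chaitin.cpp or phaseX.hpp. The plain counter may lose updates when threads race; the atomic one does not.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; it only mimics the idea of unsynchronized statistics counters.
class CounterSketch {
    static long plainCounter;                                   // plain static field, no synchronization
    static final AtomicLong atomicCounter = new AtomicLong();   // precise, at the cost of atomic updates

    static void run(int threads, int incrementsPerThread) {
        plainCounter = 0;
        atomicCounter.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    plainCounter++;                  // racy read-modify-write, updates may be lost
                    atomicCounter.incrementAndGet(); // never loses an update
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        run(4, 100_000);
        System.out.println("plain:  " + plainCounter);
        System.out.println("atomic: " + atomicCounter.get());
    }
}
```

With a single worker thread both counters agree, which matches the suggestion of using only one C2 thread when exact numbers matter.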

Regards,
Vladimir K

On 7/15/20 8:31 AM, Liu, Xin wrote:
> Hi,
> 
> 
> I have a question about -XX:+PrintOptoStatistics in c2_globals.hpp.
> 
> It dumps many internal counters in different C2 phases.  I found those counters are all static fields.
> 
> eg.
> 
> http://hg.openjdk.java.net/jdk/jdk/file/4b9ced2b948c/src/hotspot/share/opto/chaitin.cpp#l2297
> 
> http://hg.openjdk.java.net/jdk/jdk/file/4b9ced2b948c/src/hotspot/share/opto/phaseX.hpp#l599
> 
> 
> I notice that none of the setters of those fields are atomic. IMHO, hotspot may have more than one c2-compiler-thread running at the same time.
> 
> How does hotspot guarantee those fields are thread-safe? Or is the flag intended to collect statistics in single-thread mode by design?
> 
> 
> If those counters are not atomic, shall we connect this flag to CICompilerCount?
> 
> I think we can constrain the number of c2-compiler-threads to 1 if the user sets PrintOptoStatistics. Does it make sense?
> 
> 
> thanks,
> 
> --lx
> 

From ekaterina.pavlova at oracle.com  Wed Jul 15 17:54:32 2020
From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova)
Date: Wed, 15 Jul 2020 10:54:32 -0700
Subject: RFR [15] (T/XS): 8242388 compiler/graalunit/CoreTest.java timed
 out
In-Reply-To: <64ccd3d7-dc82-e243-a63e-db49d61503ef@oracle.com>
References: <39811448-6cf5-c329-de66-27233854cb62@oracle.com>
 <64ccd3d7-dc82-e243-a63e-db49d61503ef@oracle.com>
Message-ID: <a2bee94b-8e8b-a84b-d68b-c745efcd566f@oracle.com>

Thanks Vladimir, pushed.

-katya

On 7/14/20 6:24 PM, Vladimir Kozlov wrote:
> Good.
> 
> Thanks,
> Vladimir K
> 
> On 7/14/20 5:25 PM, Ekaterina Pavlova wrote:
>> Hi all,
>>
>> compiler/graalunit/CoreTest.java times out from time to time.
>> The most time-consuming subtest is org.graalvm.compiler.core.test.CountedLoopTest.
>> The fix splits the test into two tests to reduce total execution time.
>> Please review.
>>
>>     JBS: https://bugs.openjdk.java.net/browse/JDK-8242388
>>  webrev: http://cr.openjdk.java.net/~epavlova//8242388/webrev.00/index.html
>> testing: graalunit tests as part of tier3
>>
>>
>> Thanks,
>> -katya
>>
>>
>>


From jamsheed.c.m at oracle.com  Wed Jul 15 17:54:56 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Wed, 15 Jul 2020 23:24:56 +0530
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
In-Reply-To: <6b4e4dda-01d4-37d0-5403-a4f5481e5bf0@oracle.com>
References: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
 <8325fac5-6258-9b88-6507-5dcb0597cc17@oracle.com>
 <8522a69e-e538-2cc9-5364-887e450fc653@oracle.com>
 <46144d6d-5714-05ad-a263-01507db937cc@oracle.com>
 <7a361c29-4771-9ed4-1542-b3f68a5726f3@oracle.com>
 <6dc4c99b-1d90-09f1-60d1-fb2caf981266@oracle.com>
 <6b4e4dda-01d4-37d0-5403-a4f5481e5bf0@oracle.com>
Message-ID: <f8e18f7b-bd13-5032-fb17-04fb037c88b4@oracle.com>

Hi Vladimir,

With unrolling, I understand that many cases will just have phis 
everywhere to the outside of the loop, as the uses are outside the loop.

And this is not restricted to escaping objects alone, as I depicted; it 
can involve escaping as well as non-escaping objects.

So marking stores to them as global escape doesn't seem to be a nice idea. 
I will rework this fix and get back again.

Thank you

Best regards

Jamsheed

On 15/07/2020 08:38, Jamsheed C M wrote:
> (unfinished mail got sent, so completing it)
> On 15/07/2020 08:21, Jamsheed C M wrote:
>> Hi Vladimir,
>>
>> On 15/07/2020 06:50, Vladimir Kozlov wrote:
>>> I looked more into this. EA already does not scalarize allocations 
>>> when Phi nodes merge them - it should handle this case. I did a small 
>>> experiment and relaxed the assert for this new (10. needs comment 
>>> update) case for AddP's base and the test passed:
>>>
>>> src/hotspot/share/opto/escape.cpp Tue Jul 14 18:11:27 2020 -0700
>>> @@ -2357,6 +2357,7 @@
>>>        int opcode = uncast_base->Opcode();
>>>        assert(opcode == Op_ConP || opcode == Op_ThreadLocal ||
>>>               opcode == Op_CastX2P || uncast_base->is_DecodeNarrowPtr() ||
>>> +             (uncast_base->is_Phi() && (uncast_base->bottom_type()->isa_rawptr() != NULL)) ||
>>>               (uncast_base->is_Mem() && (uncast_base->bottom_type()->isa_rawptr() != NULL)) ||
>>>               (uncast_base->is_Proj() && uncast_base->in(0)->is_Allocate()), "sanity");
>>>      }
>>>
>>> Did you hit a case when this may not work?
>>
>> Yes, right, it already doesn't mark it as scalarizable if the base count 
>> is more than one (I think it missed an is_oop check there)[1].
>>
>> EA CG adds edges only for oop fields, making stores to them undetected. 
>> This marks these stored objects as NoEscape, and if the compiled method 
>> continues execution with such a NoEscape object it can have undesired 
>> results (e.g. synchronization removed).
>>
>> Probable case would be(didn't verify)
>>
>> try {
>>
>> LOOP BEGIN
>>
>>   try {throw new Obj()} catch {}
>>
>> LOOP END
>>
>> } catch (Obj e) {
>>
>> }
>
> Best Regards,
>
> Jamsheed
>
> [1]https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/escape.cpp#L1770 
>
>
>
>>>
>>>
>>> And with LoopOpts off -XX:LoopUnrollLimit=0 it removed allocation 
>>> (-XX:+PrintEscapeAnalysis -XX:+PrintEliminateAllocations):
>>>
>>> ======== Connection graph for  Test::test
>>> JavaObject NoEscape(NoEscape) [ 158F [ 107 ]]   95    Allocate ===  242  76  230  8  1 ( 93  92  21  1  78  1  78 ) [[ 96 97 98 105  106  107 ]]  rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, top ) Test::test1 @ bci:0 Test::test @ bci:8 !jvms: Test::test1 @ bci:0 Test::test @ bci:8
>>> LocalVar [ 95P [ 158b ]]   107    Proj    ===  95  [[ 108  158 ]] #5 !jvms: Test::test1 @ bci:0 Test::test @ bci:8
>>>
>>> Scalar  95    Allocate    ===  242  76  230  8  1 ( 93  92  21 1 78  1  78 ) [[ 96  97  98  105  106  107 ]]  rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, top ) Test::test1 @ bci:0 Test::test @ bci:8 !jvms: Test::test1 @ bci:0 Test::test @ bci:8
>>> ++++ Eliminated: 95 Allocate
>>>
>>>
>>> Thanks,
>>> Vladimir K
>>>
>>> On 7/14/20 1:28 AM, Jamsheed C M wrote:
>>>> Hi all,
>>>>
>>>> I had incorrectly added extra check in assert after offset 
>>>> computation in address_offset . For addps with non constant offsets 
>>>> (like [1])
>>>>
>>>> Not changing the old assert even though I am not expecting first 
>>>> addp/second addp(for array addressing) case for init captured store.
>>>>
>>>> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA_asserts_corrected/ 
>>>>
>>>>
>>>> Best regards,
>>>>
>>>> Jamsheed
>>>>
>>>> [1]
>>>>
>>>> assert(offs != Type::OffsetBot ||
>>>> - adr->in(AddPNode::Address)->in(0)->is_AllocateArray(),
>>>> + adr->in(AddPNode::Address)->in(0)->is_AllocateArray() || is_captured_store(adr),
>>>>              "offset must be a constant or it is initialization of array");
>>>>
>>>> On 13/07/2020 11:14, Jamsheed C M wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I reworked the fix. I compute the offset for all init captured stores, 
>>>>> but treat these special init captured stores similarly to unsafe (as 
>>>>> these objects are usually GlobalEscape and this doesn't have any perf 
>>>>> implications).
>>>>>
>>>>> revised webrev: 
>>>>> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.01/
>>>>>
>>>>> testing: mach1-5( logs in jbs)
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Jamsheed
>>>>>
>>>>> On 09/07/2020 19:36, Jamsheed C M wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Request to hold the review. I need to change the code for dealing 
>>>>>> with unsafe access, as the current capture code spends more execution 
>>>>>> time analyzing things.
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Jamsheed
>>>>>>
>>>>>> On 09/07/2020 13:01, Jamsheed C M wrote:
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> JBS:https://bugs.openjdk.java.net/browse/JDK-8242895
>>>>>>>
>>>>>>> Request for review of changes made to offset computation and field 
>>>>>>> write detection for init captured stores, needed due to phis added 
>>>>>>> between alloc and init. This happens if the init node is in a different 
>>>>>>> outer loop with respect to the alloc node and there is a loop opt.  This 
>>>>>>> was required as a result of enhancement [1].
>>>>>>>
>>>>>>> Normally inits are not associated with multiple alloc nodes during the 
>>>>>>> EA phase, but changes done for [1] caused code shapes of the 
>>>>>>> form [2] to generate inits associated with multiple alloc nodes.
>>>>>>>
>>>>>>> This had implications for offset computation and field write 
>>>>>>> detection related to initializing stores.
>>>>>>>
>>>>>>> Attempt to fix in EA:
>>>>>>>
>>>>>>>      webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA/
>>>>>>>
>>>>>>> Alternate fix:
>>>>>>>
>>>>>>>      Minimize the scenario in compiler generated code by 
>>>>>>> throwing only j.l.Error from slowpath(all exception async/sync 
>>>>>>> are handled in runtime exit).
>>>>>>>
>>>>>>>      Stub epilog doesn't poll or throw any exceptions. Disable 
>>>>>>> full loop opt before EA for detectable patterns and bailout EA 
>>>>>>> for late detected patterns.
>>>>>>>
>>>>>>>      webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_deopt/
>>>>>>>
>>>>>>> Please advise.
>>>>>>>
>>>>>>> Testing : mach tier1-5 (logs in jbs)
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>> Jamsheed
>>>>>>>
>>>>>>>
>>>>>>> [1] JDK-8231291 
>>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8231291>C2: loop opts 
>>>>>>> before EA should maximally unroll loops
>>>>>>>
>>>>>>> [2] that have its init node in different outer loop wrt to alloc 
>>>>>>> node.
>>>>>>>
>>>>>>>
>>>>>>> loop begin
>>>>>>>
>>>>>>>    try{
>>>>>>>
>>>>>>>    return new obj()/  throw new obj()/ uncommon trap after 
>>>>>>> allocation, in a loop
>>>>>>>
>>>>>>>    } catch(ex) {
>>>>>>>
>>>>>>>    }
>>>>>>>
>>>>>>> loop end
>>>>>>>
>>>>>>>     public static IntA test(int n) {
>>>>>>>         for (int i=0; i<2; i++) {
>>>>>>>             try {
>>>>>>>                   return new IntA(n + i);
>>>>>>>             } catch (Exception e) {
>>>>>>>             }
>>>>>>>         }
>>>>>>>
>>>>>>>
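
The test method quoted at the end of the message above is cut off. A self-contained sketch of the same code shape (an allocation returned from inside a try block within a loop) might look as follows. Everything beyond the quoted lines is an assumption: the body of IntA, the class name InitInLoopSketch and the fallback return after the loop are hypothetical and are not taken from the actual regression test.

```java
// Hypothetical completion of the truncated snippet; IntA's fields and the final return are assumed.
class IntA {
    final int value;

    IntA(int value) {
        this.value = value;
    }
}

class InitInLoopSketch {
    public static IntA test(int n) {
        for (int i = 0; i < 2; i++) {
            try {
                return new IntA(n + i);   // allocation plus initialization inside the loop
            } catch (Exception e) {
                // swallowed, so the loop can continue with the next iteration
            }
        }
        return null;                      // assumed: reached only if both attempts throw
    }

    public static void main(String[] args) {
        System.out.println(test(5).value);
    }
}
```

Whether this exact shape reproduces the assert depends on the JIT flags and build in use, so it should be read as an illustration of the pattern rather than a reproducer.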

From vladimir.kozlov at oracle.com  Wed Jul 15 18:59:40 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 15 Jul 2020 11:59:40 -0700
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
In-Reply-To: <f8e18f7b-bd13-5032-fb17-04fb037c88b4@oracle.com>
References: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
 <8325fac5-6258-9b88-6507-5dcb0597cc17@oracle.com>
 <8522a69e-e538-2cc9-5364-887e450fc653@oracle.com>
 <46144d6d-5714-05ad-a263-01507db937cc@oracle.com>
 <7a361c29-4771-9ed4-1542-b3f68a5726f3@oracle.com>
 <6dc4c99b-1d90-09f1-60d1-fb2caf981266@oracle.com>
 <6b4e4dda-01d4-37d0-5403-a4f5481e5bf0@oracle.com>
 <f8e18f7b-bd13-5032-fb17-04fb037c88b4@oracle.com>
Message-ID: <32d7fb64-75a5-7add-d496-df33cfaefabf@oracle.com>

As I said before, I agree with your additional checks for StoreN and StoreNKlass.

But I have concerns about the new is_init_captured_store code. EA mostly looks only at inputs to find the Allocation, and in 
several places it expects to see only an Allocation because other cases should have been filtered out before.

Thanks,
Vladimir

On 7/15/20 10:54 AM, Jamsheed C M wrote:
> Hi Vladimir,
> 
> with unrolling i understand that many cases will just have phis everywhere to outside the loop as the uses are outside 
> the loop.
> 
> and this is not restricted to escaping objects alone as i depicted. it can be escaping as well as non-escaping.
> 
> so marking store to them as global escape doesn't seems to be nice idea. i will rework on this fix and get back again.
> 
> Thank you
> 
> Best regards
> 
> Jamsheed
> 
> On 15/07/2020 08:38, Jamsheed C M wrote:
>> (unfinished mail got sent, so completing it)
>> On 15/07/2020 08:21, Jamsheed C M wrote:
>>> Hi Vladimir,
>>>
>>> On 15/07/2020 06:50, Vladimir Kozlov wrote:
>>>> I looked more into this. EA already does not scalarize allocations when Phi nodes merge them - it should handle this 
>>>> case. I did a small experiment and relaxed the assert for this new (10. needs comment update) case for AddP's base and 
>>>> the test passed:
>>>>
>>>> src/hotspot/share/opto/escape.cpp Tue Jul 14 18:11:27 2020 -0700
>>>> @@ -2357,6 +2357,7 @@
>>>>        int opcode = uncast_base->Opcode();
>>>>        assert(opcode == Op_ConP || opcode == Op_ThreadLocal ||
>>>>               opcode == Op_CastX2P || uncast_base->is_DecodeNarrowPtr() ||
>>>> +             (uncast_base->is_Phi() && (uncast_base->bottom_type()->isa_rawptr() != NULL)) ||
>>>>               (uncast_base->is_Mem() && (uncast_base->bottom_type()->isa_rawptr() != NULL)) ||
>>>>               (uncast_base->is_Proj() && uncast_base->in(0)->is_Allocate()), "sanity");
>>>>      }
>>>>
>>>> Did you hit a case when this may not work?
>>>
>>> Yes, right it already doesn't mark it as scalarizable if base count is more than one(I think it missed a is_oop check 
>>> there)[1].
>>>
>>> EA CG adds edges only for oop field making stores to them undetected. This makes these stored objects to NoEscape and 
>>> if compiled method continues execution with this NoEscape object can have undesired results(i.e synchronization 
>>> removed).
>>>
>>> Probable case would be(didn't verify)
>>>
>>> try {
>>>
>>> LOOP BEGIN
>>>
>>>   try {throw new Obj()} catch {}
>>>
>>> LOOP END
>>>
>>> } catch (Obj e) {
>>>
>>> }
>>
>> Best Regards,
>>
>> Jamsheed
>>
>> [1]https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/escape.cpp#L1770
>>
>>
>>>>
>>>>
>>>> And with LoopOpts off -XX:LoopUnrollLimit=0 it removed allocation (-XX:+PrintEscapeAnalysis 
>>>> -XX:+PrintEliminateAllocations):
>>>>
>>>> ======== Connection graph for  Test::test
>>>> JavaObject NoEscape(NoEscape) [ 158F [ 107 ]]   95    Allocate === 242  76  230  8  1 ( 93  92  21  1  78  1  78 ) [[ 96 97 98 105 106  107 ]]  rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, top ) Test::test1 @ bci:0 Test::test @ bci:8 !jvms: Test::test1 @ bci:0 Test::test @ bci:8
>>>> LocalVar [ 95P [ 158b ]]   107    Proj    ===  95  [[ 108  158 ]] #5 !jvms: Test::test1 @ bci:0 Test::test @ bci:8
>>>>
>>>> Scalar  95    Allocate    ===  242  76  230  8  1 ( 93  92  21 1 78 1  78 ) [[ 96  97  98  105  106  107 ]]  rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, top ) Test::test1 @ bci:0 Test::test @ bci:8 !jvms: Test::test1 @ bci:0 Test::test @ bci:8
>>>> ++++ Eliminated: 95 Allocate
>>>>
>>>>
>>>> Thanks,
>>>> Vladimir K
>>>>
>>>> On 7/14/20 1:28 AM, Jamsheed C M wrote:
>>>>> Hi all,
>>>>>
>>>>> I had incorrectly added extra check in assert after offset computation in address_offset . For addps with non 
>>>>> constant offsets (like [1])
>>>>>
>>>>> Not changing the old assert even though I am not expecting first addp/second addp(for array addressing) case for 
>>>>> init captured store.
>>>>>
>>>>> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA_asserts_corrected/
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Jamsheed
>>>>>
>>>>> [1]
>>>>>
>>>>> assert(offs != Type::OffsetBot ||
>>>>> - adr->in(AddPNode::Address)->in(0)->is_AllocateArray(),
>>>>> + adr->in(AddPNode::Address)->in(0)->is_AllocateArray() || is_captured_store(adr),
>>>>>              "offset must be a constant or it is initialization of array");
>>>>>
>>>>> On 13/07/2020 11:14, Jamsheed C M wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I reworked the fix. I compute offset for all init captures stores, but treats this special init captured stores 
>>>>>> similar to unsafe(as these objects are usually GlobalEscape and doesn't have any perf implications).
>>>>>>
>>>>>> revised webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.01/
>>>>>>
>>>>>> testing: mach1-5( logs in jbs)
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Jamsheed
>>>>>>
>>>>>> On 09/07/2020 19:36, Jamsheed C M wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> request to hold the review. need to change the code for dealing with unsafe access. as current capture code go 
>>>>>>> for more execution time analyzing things.
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>> Jamsheed
>>>>>>>
>>>>>>> On 09/07/2020 13:01, Jamsheed C M wrote:
>>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> JBS:https://bugs.openjdk.java.net/browse/JDK-8242895
>>>>>>>>
>>>>>>>> Request for review changes made to offset computation and field write detection for init captured stores due to 
>>>>>>>> phis addition between alloc and init. This happen if init node in different outer loop wrt to alloc node and 
>>>>>>>> there is a loop opt.? This was required as a result of enhancement [1].
>>>>>>>>
>>>>>>>> Normally init are not associated with multiple alloc node during EA phase, but changes done for [1] caused the 
>>>>>>>> code shapes of the form [2]? to generate inits associated with multiple alloc node.
>>>>>>>>
>>>>>>>> This had implication in offset computation and field write detection related to initializing stores.
>>>>>>>>
>>>>>>>> Attempt to fix in EA:
>>>>>>>>
>>>>>>>>      webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA/
>>>>>>>>
>>>>>>>> Alternate fix:
>>>>>>>>
>>>>>>>>      Minimize the scenario in compiler generated code by throwing only j.l.Error from slowpath(all exception 
>>>>>>>> async/sync are handled in runtime exit).
>>>>>>>>
>>>>>>>>      Stub epilog doesn't poll or throw any exceptions. Disable full loop opt before EA for detectable patterns 
>>>>>>>> and bailout EA for late detected patterns.
>>>>>>>>
>>>>>>>>      webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_deopt/
>>>>>>>>
>>>>>>>> Please advise.
>>>>>>>>
>>>>>>>> Testing : mach tier1-5 (logs in jbs)
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>> Jamsheed
>>>>>>>>
>>>>>>>>
>>>>>>>> [1] JDK-8231291 <https://bugs.openjdk.java.net/browse/JDK-8231291>C2: loop opts before EA should maximally 
>>>>>>>> unroll loops
>>>>>>>>
>>>>>>>> [2] that have its init node in different outer loop wrt to alloc node.
>>>>>>>>
>>>>>>>>
>>>>>>>> loop begin
>>>>>>>>
>>>>>>>>    try{
>>>>>>>>
>>>>>>>>    return new obj()/  throw new obj()/ uncommon trap after allocation, in a loop
>>>>>>>>
>>>>>>>>    } catch(ex) {
>>>>>>>>
>>>>>>>>    }
>>>>>>>>
>>>>>>>> loop end
>>>>>>>>
>>>>>>>>     public static IntA test(int n) {
>>>>>>>>         for (int i=0; i<2; i++) {
>>>>>>>>             try {
>>>>>>>>                   return new IntA(n + i);
>>>>>>>>             } catch (Exception e) {
>>>>>>>>             }
>>>>>>>>         }
>>>>>>>>
>>>>>>>>

From jamsheed.c.m at oracle.com  Wed Jul 15 22:16:11 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Thu, 16 Jul 2020 03:46:11 +0530
Subject: [15] RFR: 8246381: VM crashes with "Current BasicObjectLock*
 below than low_mark"
In-Reply-To: <7a802330-e836-1ff3-af0a-ede587e049ff@oracle.com>
References: <7a802330-e836-1ff3-af0a-ede587e049ff@oracle.com>
Message-ID: <30bd811e-c890-5bb1-8c78-4cf944fd5a42@oracle.com>

(Thank you Dean; adding the serviceability team as this issue involves the 
JVMTI PopFrame and EarlyReturn features.)

JBS entry: https://bugs.openjdk.java.net/browse/JDK-8246381

(testing: mach5, tier1-5 links in JBS)

Best regards,

Jamsheed

On 15/07/2020 21:25, Jamsheed C M wrote:
>
> Hi,
>
> Async handling at method entry requires it to be aware of 
> synchronization(like whether it is doing async handling before lock 
> acquire or after)
>
> This is required as the exception handler relies on this info for 
> unlocking.  Async handling code never had this special condition 
> handled, and it worked most of the time as we were using biased locking, 
> which got disabled by [1]
>
> There was one other issue reported in similar time[2]. This issue got 
> triggered in test case by [3], back to back extra safepoint after 
> suspend and TLH for ThreadDeath. So in this setup both PopFrame 
> request and Thread.Stop request happened together for the test 
> scenario and it reached java method entry with pending_exception set.
>
> I have done a partial fix for the issue, mainly to handle production 
> mode crash failures(do not unlock flag related ones)
>
> Fix detail:
>
> 1) I save and restore the "do not unlock" flag in async handling.
>
> 2) Return for floating pending exception for some cases(PopFrame, 
> Early return related). This is debug(JVMTI) feature and floating 
> exception can get cleaned just like that in present compiler request 
> and deopt code.
>
> webrev :http://cr.openjdk.java.net/~jcm/8246381/webrev.02/
>
> There are more problems in these code areas, like we clear all 
> exceptions in compilation request path(interpreter,c1), as well as 
> deoptimization path.
>
> All these un-handled cases will be separately handled by 
> https://bugs.openjdk.java.net/browse/JDK-8249451
>
> Request for review.
>
> Best regards,
>
> Jamsheed
>
> [1]https://bugs.openjdk.java.net/browse/JDK-8231264 
> <https://bugs.openjdk.java.net/browse/JDK-8231264>
>
> [2] https://bugs.openjdk.java.net/browse/JDK-8246727
>
> [3] https://bugs.openjdk.java.net/browse/JDK-8221207
>

From david.holmes at oracle.com  Wed Jul 15 23:50:35 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 16 Jul 2020 09:50:35 +1000
Subject: [15] RFR: 8246381: VM crashes with "Current BasicObjectLock*
 below than low_mark"
In-Reply-To: <30bd811e-c890-5bb1-8c78-4cf944fd5a42@oracle.com>
References: <7a802330-e836-1ff3-af0a-ede587e049ff@oracle.com>
 <30bd811e-c890-5bb1-8c78-4cf944fd5a42@oracle.com>
Message-ID: <5d43f963-b931-3b69-4b5c-188c45b57de8@oracle.com>

Hi Jamsheed,

On 16/07/2020 8:16 am, Jamsheed C M wrote:
> (Thank you Dean, adding serviceability team as this issue involves JVMTI 
> features PopFrame, EarlyReturn features)

It is not at all obvious how your proposed fix impacts the JVM TI features.

> JBS entry: https://bugs.openjdk.java.net/browse/JDK-8246381
> 
> (testing: mach5, tier1-5 links in JBS)
> 
> Best regards,
> 
> Jamsheed
> 
> On 15/07/2020 21:25, Jamsheed C M wrote:
>>
>> Hi,
>>
>> Async handling at method entry requires it to be aware of 
>> synchronization(like whether it is doing async handling before lock 
>> acquire or after)
>>
>> This is required as exception handler rely on this info for 
>> unlocking.  Async handling code never had this special condition 
>> handled and it worked most of the time as we were using biased locking 
>> which got disabled by [1]
>>
>> There was one other issue reported in similar time[2]. This issue got 
>> triggered in test case by [3], back to back extra safepoint after 
>> suspend and TLH for ThreadDeath. So in this setup both PopFrame 
>> request and Thread.Stop request happened together for the test 
>> scenario and it reached java method entry with pending_exception set.
>>
>> I have done a partial fix for the issue, mainly to handle production 
>> mode crash failures(do not unlock flag related ones)
>>
>> Fix detail:
>>
>> 1) I save restore the "do not unlock" flag in async handling.

Sorry but you completely changed the fix compared to what we discussed 
and what I pre-reviewed! What happened to changing from JRT_ENTRY to 
JRT_ENTRY_NOASYNC? It is going to take me a lot of time and effort to 
determine that this save/restore of the "do not unlock flag" is actually 
correct and valid!

>>
>> 2) Return for floating pending exception for some cases(PopFrame, 
>> Early return related). This is debug(JVMTI) feature and floating 
>> exception can get cleaned just like that in present compiler request 
>> and deopt code.

What part of the change addresses this?

Thanks,
David
-----

>>
>> webrev :http://cr.openjdk.java.net/~jcm/8246381/webrev.02/
>>
>> There are more problems in these code areas, like we clear all 
>> exceptions in compilation request path(interpreter,c1), as well as 
>> deoptimization path.
>>
>> All these un-handled cases will be separately handled by 
>> https://bugs.openjdk.java.net/browse/JDK-8249451
>>
>> Request for review.
>>
>> Best regards,
>>
>> Jamsheed
>>
>> [1]https://bugs.openjdk.java.net/browse/JDK-8231264 
>> <https://bugs.openjdk.java.net/browse/JDK-8231264>
>>
>> [2] https://bugs.openjdk.java.net/browse/JDK-8246727
>>
>> [3] https://bugs.openjdk.java.net/browse/JDK-8221207
>>

From jamsheed.c.m at oracle.com  Thu Jul 16 00:01:21 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Thu, 16 Jul 2020 05:31:21 +0530
Subject: [15] RFR: 8246381: VM crashes with "Current BasicObjectLock*
 below than low_mark"
In-Reply-To: <5d43f963-b931-3b69-4b5c-188c45b57de8@oracle.com>
References: <7a802330-e836-1ff3-af0a-ede587e049ff@oracle.com>
 <30bd811e-c890-5bb1-8c78-4cf944fd5a42@oracle.com>
 <5d43f963-b931-3b69-4b5c-188c45b57de8@oracle.com>
Message-ID: <973c4e4c-ed0e-7152-8387-28243a3ac275@oracle.com>

Hi David,

On 16/07/2020 05:20, David Holmes wrote:
> Hi Jamsheed,
>
> On 16/07/2020 8:16 am, Jamsheed C M wrote:
>> (Thank you Dean, adding serviceability team as this issue involves 
>> JVMTI features PopFrame, EarlyReturn features)
>
> It is not at all obvious how your proposed fix impacts the JVM TI 
> features.

Yes, the proposed fix doesn't. The fix doesn't plan to address JVMTI-feature-related 
issues.

I added the serviceability list just to keep everyone in the loop.

Best regards,

Jamsheed
>
>> JBS entry: https://bugs.openjdk.java.net/browse/JDK-8246381
>>
>> (testing: mach5, tier1-5 links in JBS)
>>
>> Best regards,
>>
>> Jamsheed
>>
>> On 15/07/2020 21:25, Jamsheed C M wrote:
>>>
>>> Hi,
>>>
>>> Async handling at method entry requires it to be aware of 
>>> synchronization(like whether it is doing async handling before lock 
>>> acquire or after)
>>>
>>> This is required as exception handler rely on this info for 
>>> unlocking.  Async handling code never had this special condition 
>>> handled and it worked most of the time as we were using biased 
>>> locking which got disabled by [1]
>>>
>>> There was one other issue reported in similar time[2]. This issue 
>>> got triggered in test case by [3], back to back extra safepoint 
>>> after suspend and TLH for ThreadDeath. So in this setup both 
>>> PopFrame request and Thread.Stop request happened together for the 
>>> test scenario and it reached java method entry with 
>>> pending_exception set.
>>>
>>> I have done a partial fix for the issue, mainly to handle production 
>>> mode crash failures(do not unlock flag related ones)
>>>
>>> Fix detail:
>>>
>>> 1) I save restore the "do not unlock" flag in async handling.
>
> Sorry but you completely changed the fix compared to what we discussed 
> and what I pre-reviewed! What happened to changing from JRT_ENTRY to 
> JRT_ENTRY_NOASYNC? It is going to take me a lot of time and effort to 
> determine that this save/restore of the "do not unlock flag" is 
> actually correct and valid!
>
>>>
>>> 2) Return for floating pending exception for some cases(PopFrame, 
>>> Early return related). This is debug(JVMTI) feature and floating 
>>> exception can get cleaned just like that in present compiler request 
>>> and deopt code.
>
> What part of the change addresses this?
>
> Thanks,
> David
> -----
>
>>>
>>> webrev :http://cr.openjdk.java.net/~jcm/8246381/webrev.02/
>>>
>>> There are more problems in these code areas, like we clear all 
>>> exceptions in compilation request path(interpreter,c1), as well as 
>>> deoptimization path.
>>>
>>> All these un-handled cases will be separately handled by 
>>> https://bugs.openjdk.java.net/browse/JDK-8249451
>>>
>>> Request for review.
>>>
>>> Best regards,
>>>
>>> Jamsheed
>>>
>>> [1]https://bugs.openjdk.java.net/browse/JDK-8231264 
>>> <https://bugs.openjdk.java.net/browse/JDK-8231264>
>>>
>>> [2] https://bugs.openjdk.java.net/browse/JDK-8246727
>>>
>>> [3] https://bugs.openjdk.java.net/browse/JDK-8221207
>>>

From jamsheed.c.m at oracle.com  Thu Jul 16 00:37:25 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Thu, 16 Jul 2020 06:07:25 +0530
Subject: [15] RFR: 8246381: VM crashes with "Current BasicObjectLock*
 below than low_mark"
In-Reply-To: <5d43f963-b931-3b69-4b5c-188c45b57de8@oracle.com>
References: <7a802330-e836-1ff3-af0a-ede587e049ff@oracle.com>
 <30bd811e-c890-5bb1-8c78-4cf944fd5a42@oracle.com>
 <5d43f963-b931-3b69-4b5c-188c45b57de8@oracle.com>
Message-ID: <122f8079-958c-acdf-bb60-3934729a313a@oracle.com>

Hi David,

On 16/07/2020 05:20, David Holmes wrote:
>>>
>>> Hi,
>>>
>>> Async handling at method entry requires it to be aware of 
>>> synchronization(like whether it is doing async handling before lock 
>>> acquire or after)
>>>
>>> This is required as exception handler rely on this info for 
>>> unlocking. Async handling code never had this special condition
>>> handled and it worked most of the time as we were using biased 
>>> locking which got disabled by [1]
>>>
>>> There was one other issue reported in similar time[2]. This issue 
>>> got triggered in test case by [3], back to back extra safepoint 
>>> after suspend and TLH for ThreadDeath. So in this setup both 
>>> PopFrame request and Thread.Stop request happened together for the 
>>> test scenario and it reached java method entry with 
>>> pending_exception set.
>>>
>>> I have done a partial fix for the issue, mainly to handle production 
>>> mode crash failures(do not unlock flag related ones)
>>>
>>> Fix detail:
>>>
>>> 1) I save restore the "do not unlock" flag in async handling.
>
> Sorry but you completely changed the fix compared to what we discussed 
> and what I pre-reviewed! What happened to changing from JRT_ENTRY to 
> JRT_ENTRY_NOASYNC? It is going to take me a lot of time and effort to 
> determine that this save/restore of the "do not unlock flag" is 
> actually correct and valid!

I tried changing JRT_ENTRY to JRT_ENTRY_NOASYNC, but unfortunately that
made some tests fail (logs in JBS). I didn't investigate it in detail,
but what I presume is that pending_async_exception is set for those
failing scenarios, and as we have disabled async handling in some
prominent code paths, the exception is never delivered.

>>>
>>> 2) Return for floating pending exception for some cases(PopFrame, 
>>> Early return related). This is debug(JVMTI) feature and floating 
>>> exception can get cleaned just like that in present compiler request 
>>> and deopt code.
>
> What part of the change addresses this?

It doesn't address this issue completely, as that requires other changes
in the compilation request path (c1, interpreter) and deopt.

I just made changes to the interpreter part (the compilation request
part), which fixes the interpreter part partially.

  JRT_ENTRY(nmethod*,
            InterpreterRuntime::frequency_counter_overflow_inner(JavaThread* thread, address branch_bcp))
+   if (HAS_PENDING_EXCEPTION) {
+     return NULL;
+   }

  JRT_ENTRY(void, InterpreterRuntime::profile_method(JavaThread* thread))
+   if (HAS_PENDING_EXCEPTION) {
+     return;
+   }

Best regards

Jamsheed

>
> Thanks,
> David
> ----- 

From david.holmes at oracle.com  Thu Jul 16 01:07:33 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 16 Jul 2020 11:07:33 +1000
Subject: [15] RFR: 8246381: VM crashes with "Current BasicObjectLock*
 below than low_mark"
In-Reply-To: <5d43f963-b931-3b69-4b5c-188c45b57de8@oracle.com>
References: <7a802330-e836-1ff3-af0a-ede587e049ff@oracle.com>
 <30bd811e-c890-5bb1-8c78-4cf944fd5a42@oracle.com>
 <5d43f963-b931-3b69-4b5c-188c45b57de8@oracle.com>
Message-ID: <1af60254-a239-c21f-68df-be9b65534e7f@oracle.com>

Hi Jamsheed,

tl;dr version: fix looks good. Thanks for working through things with me 
on this one.

Long version ... for the sake of other reviewers (and myself) I'm going 
to walk through the problem scenario and how the fix addresses it, 
because the bug report is long and confusing and touches on a number of 
different issues with async exception handling.

We are dealing with the code generated for Java method entry, and in 
particular for a synchronized Java method. We do a lot of things in the 
entry code before we actually lock the monitor and jump to the Java 
method. Some of those things include method profiling and the counter 
overflow check for the JIT. If an exception is thrown at this point, the 
logic to remove the activation would unlock the monitor - which we 
haven't actually locked yet! So we have the 
do_not_unlock_if_synchronized flag which is stored in the current 
JavaThread. We set that flag true so that if any exceptions result in 
activation removal, the removal logic won't try to unlock the monitor. 
Once we're ready to lock the monitor we set the flag back to false (note 
there is an implicit assumption here that monitor locking can never 
raise an exception).

The problem arises with async exceptions, or more specifically the async 
exception that is raised due to an "unsafe access error". This is where 
a memory-mapped ByteBuffer causes an access violation (SEGV) due to a 
bad pointer. The signal handler simply sets a flag to indicate we 
encountered an "unsafe access error", adjusts the BCI to the next 
instruction and allows execution to proceed at the next instruction. It 
is then expected that the runtime will "soon" notice this pending unsafe 
access error and create and throw the InternalError instance that 
indicates the ByteBuffer operation failed. This requires executing Java 
code.

One of the places that checks for that pending unsafe access error is in 
the destructor of the JRT_ENTRY wrapper that is used for the method 
profiling and counter overflow checking. This occurs whilst the 
do_not_unlock_if_synchronized flag is true, so the resulting 
InternalError won't result in an attempt to unlock the not-locked monitor.

The problem is that creating the InternalError executes Java code - it 
calls constructors, which call methods etc. And some of those methods 
are synchronized. So the method entry logic for such a call will set 
do_not_unlock_if_synchronized to true, perform all the preamble related 
to the call, then set do_not_unlock_if_synchronized to false, lock the 
monitor and make the call. When construction completes the InternalError 
is thrown and we remove the activation for the method we had originally 
started to call. But now the do_not_unlock_if_synchronized flag has been 
reset to false by the nested Java method call, so we do in fact try to 
unlock a monitor that was never locked, and things break.
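
To make the clobbering concrete, here is a minimal, self-contained C++
sketch of the sequence described above. It is a toy model under stated
assumptions, not HotSpot code: ToyThread, call_synchronized_method,
remove_activation and entry_with_async_exception are hypothetical names
invented for illustration, and the "monitor" is just a counter.

```cpp
#include <cassert>

// Toy stand-in for the per-thread state discussed above (hypothetical).
struct ToyThread {
    bool do_not_unlock_if_synchronized = false;
    int  monitors_held = 0;   // monitors actually locked right now
    int  bogus_unlocks = 0;   // unlock attempts on a monitor never locked
};

// Entry sequence of a synchronized method that completes normally.
void call_synchronized_method(ToyThread& t) {
    t.do_not_unlock_if_synchronized = true;   // preamble: not locked yet
    // ... profiling, counter overflow check ...
    t.do_not_unlock_if_synchronized = false;  // about to lock
    t.monitors_held++;                        // lock the monitor
    // ... method body ...
    t.monitors_held--;                        // normal unlock on return
}

// Activation removal: unlocks unless the flag says the monitor
// was never locked.
void remove_activation(ToyThread& t) {
    if (!t.do_not_unlock_if_synchronized) {
        if (t.monitors_held == 0) {
            t.bogus_unlocks++;                // the failure mode
        } else {
            t.monitors_held--;
        }
    }
    t.do_not_unlock_if_synchronized = false;
}

// Outer synchronized method entry during which an async exception is
// materialized by running (synchronized) Java code.
int entry_with_async_exception(ToyThread& t) {
    t.do_not_unlock_if_synchronized = true;   // outer preamble
    call_synchronized_method(t);              // nested call resets the flag
    remove_activation(t);                     // exception unwinds outer frame
    return t.bogus_unlocks;
}
```

In this model the nested call leaves the flag false, so the unwinding
of the outer frame attempts an unlock even though monitors_held is zero.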

This nesting problem is well known and we have a mechanism for dealing
with it - the UnlockFlagSaver. The actual logic executed for profiling
methods and doing the counter overflow check contains the requisite 
UnlockFlagSaver to avoid the problem just outlined. Unfortunately the 
async exception is processed in the JRT_ENTRY wrapper, which is outside 
the scope of those UnlockFlagSaver helpers and so they don't help in 
this case.

So the fix is to "simply" move the UnlockFlagSaver deeper into the call 
stack to the code that actually does the async exception processing:

  void JavaThread::check_and_handle_async_exceptions(bool check_unsafe_error) {
+   // May be we are at method entry and requires to save do not unlock flag.
+   UnlockFlagSaver fs(this);

so now after the InternalError has been created and thrown we will
restore the original value of the do_not_unlock_if_synchronized flag
(true) and so the InternalError will not cause activation removal to
attempt to unlock the not-locked monitor.

The scope of the UnlockFlagSaver could be narrowed to the actual logic 
for processing the unsafe access error, but it seems fine at method scope.
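
As a rough illustration of the save/restore idea, here is a
self-contained C++ sketch of a scoped flag saver. It is an approximation
for discussion only: ToyJavaThread, ToyUnlockFlagSaver,
nested_synchronized_call and flag_after_async_handling are hypothetical
names, and the real UnlockFlagSaver may differ in detail.

```cpp
#include <cassert>

// Toy stand-in for the thread that owns the flag (hypothetical).
struct ToyJavaThread {
    bool do_not_unlock_if_synchronized = false;
};

// Scoped helper: remembers the flag on construction, clears it for the
// nested work, and puts the remembered value back on destruction.
class ToyUnlockFlagSaver {
    ToyJavaThread* _thread;
    bool           _saved;
public:
    explicit ToyUnlockFlagSaver(ToyJavaThread* t)
        : _thread(t), _saved(t->do_not_unlock_if_synchronized) {
        _thread->do_not_unlock_if_synchronized = false;
    }
    ~ToyUnlockFlagSaver() {
        _thread->do_not_unlock_if_synchronized = _saved;
    }
    ToyUnlockFlagSaver(const ToyUnlockFlagSaver&) = delete;
    ToyUnlockFlagSaver& operator=(const ToyUnlockFlagSaver&) = delete;
};

// A nested synchronized call sets the flag and then resets it to false.
void nested_synchronized_call(ToyJavaThread* t) {
    t->do_not_unlock_if_synchronized = true;
    t->do_not_unlock_if_synchronized = false;
}

// Returns the flag value seen by the outer frame after async handling.
bool flag_after_async_handling(bool use_saver) {
    ToyJavaThread t;
    t.do_not_unlock_if_synchronized = true;   // outer method entry preamble
    if (use_saver) {
        ToyUnlockFlagSaver fs(&t);            // scope covers the nested call
        nested_synchronized_call(&t);
    } else {
        nested_synchronized_call(&t);
    }
    return t.do_not_unlock_if_synchronized;
}
```

With the saver in scope the outer frame still sees the value it set
before the nested call; without it, the nested call's reset leaks out.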

A second fix is that the overflow counter check had an assertion that it 
was not executed with any pending exceptions. But that turned out to be 
false for reasons I can't fully explain, but it again appears to relate 
to a pending async exception being installed prior to the method call - 
and seems related to the two referenced JVM TI functions. The simple 
solution here is to delete the assertion and to check for pending 
exceptions on entry to the code and just return immediately. The 
JRT_ENTRY destructor will see the pending exception and propagate it.
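
The shape of that second change can be sketched in a few lines of
self-contained C++. This is a toy model, assuming a simple boolean for
the pending exception: ToyThread and toy_counter_overflow are
hypothetical names, not the InterpreterRuntime code itself.

```cpp
#include <cassert>

// Toy thread state (hypothetical): a pending-exception marker and a
// count of compilation requests made by the overflow handler.
struct ToyThread {
    bool has_pending_exception = false;
    int  compilations_requested = 0;
};

// Instead of asserting that no exception is pending, return early and
// leave the exception in place for the caller's wrapper to propagate.
bool toy_counter_overflow(ToyThread& t) {
    if (t.has_pending_exception) {
        return false;                 // do no work, keep the exception
    }
    t.compilations_requested++;       // normal path: request compilation
    return true;
}
```

The early return neither clears nor replaces the pending exception, so
whoever checks for it next still sees it.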

Cheers,
David

On 16/07/2020 9:50 am, David Holmes wrote:
> Hi Jamsheed,
> 
> On 16/07/2020 8:16 am, Jamsheed C M wrote:
>> (Thank you Dean, adding serviceability team as this issue involves 
>> JVMTI features PopFrame, EarlyReturn features)
> 
> It is not at all obvious how your proposed fix impacts the JVM TI features.
> 
>> JBS entry: https://bugs.openjdk.java.net/browse/JDK-8246381
>>
>> (testing: mach5, tier1-5 links in JBS)
>>
>> Best regards,
>>
>> Jamsheed
>>
>> On 15/07/2020 21:25, Jamsheed C M wrote:
>>>
>>> Hi,
>>>
>>> Async handling at method entry requires it to be aware of 
>>> synchronization(like whether it is doing async handling before lock 
>>> acquire or after)
>>>
>>> This is required as exception handler rely on this info for 
>>> unlocking. Async handling code never had this special condition
>>> handled and it worked most of the time as we were using biased 
>>> locking which got disabled by [1]
>>>
>>> There was one other issue reported in similar time[2]. This issue got 
>>> triggered in test case by [3], back to back extra safepoint after 
>>> suspend and TLH for ThreadDeath. So in this setup both PopFrame 
>>> request and Thread.Stop request happened together for the test 
>>> scenario and it reached java method entry with pending_exception set.
>>>
>>> I have done a partial fix for the issue, mainly to handle production 
>>> mode crash failures(do not unlock flag related ones)
>>>
>>> Fix detail:
>>>
>>> 1) I save restore the "do not unlock" flag in async handling.
> 
> Sorry but you completely changed the fix compared to what we discussed 
> and what I pre-reviewed! What happened to changing from JRT_ENTRY to 
> JRT_ENTRY_NOASYNC? It is going to take me a lot of time and effort to 
> determine that this save/restore of the "do not unlock flag" is actually 
> correct and valid!
> 
>>>
>>> 2) Return for floating pending exception for some cases(PopFrame, 
>>> Early return related). This is debug(JVMTI) feature and floating 
>>> exception can get cleaned just like that in present compiler request 
>>> and deopt code.
> 
> What part of the change addresses this?
> 
> Thanks,
> David
> -----
> 
>>>
>>> webrev :http://cr.openjdk.java.net/~jcm/8246381/webrev.02/
>>>
>>> There are more problems in these code areas, like we clear all 
>>> exceptions in compilation request path(interpreter,c1), as well as 
>>> deoptimization path.
>>>
>>> All these un-handled cases will be separately handled by 
>>> https://bugs.openjdk.java.net/browse/JDK-8249451
>>>
>>> Request for review.
>>>
>>> Best regards,
>>>
>>> Jamsheed
>>>
>>> [1]https://bugs.openjdk.java.net/browse/JDK-8231264 
>>> <https://bugs.openjdk.java.net/browse/JDK-8231264>
>>>
>>> [2] https://bugs.openjdk.java.net/browse/JDK-8246727
>>>
>>> [3] https://bugs.openjdk.java.net/browse/JDK-8221207
>>>

From jamsheed.c.m at oracle.com  Thu Jul 16 01:55:30 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Thu, 16 Jul 2020 07:25:30 +0530
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
In-Reply-To: <32d7fb64-75a5-7add-d496-df33cfaefabf@oracle.com>
References: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
 <8325fac5-6258-9b88-6507-5dcb0597cc17@oracle.com>
 <8522a69e-e538-2cc9-5364-887e450fc653@oracle.com>
 <46144d6d-5714-05ad-a263-01507db937cc@oracle.com>
 <7a361c29-4771-9ed4-1542-b3f68a5726f3@oracle.com>
 <6dc4c99b-1d90-09f1-60d1-fb2caf981266@oracle.com>
 <6b4e4dda-01d4-37d0-5403-a4f5481e5bf0@oracle.com>
 <f8e18f7b-bd13-5032-fb17-04fb037c88b4@oracle.com>
 <32d7fb64-75a5-7add-d496-df33cfaefabf@oracle.com>
Message-ID: <f9f80bf5-a565-3820-efa1-2b949b92d764@oracle.com>

Hi Vladimir,

On 16/07/2020 00:29, Vladimir Kozlov wrote:
> As I said before I agree with your additional checks for StoreN and 
> StoreNKlass.
>
> But I have concerns about the new is_init_captured_store code. EA is
> mostly looking only at inputs to see Allocation. And in several places
> it is expecting only to see Allocation because other cases should be
> filtered out before.
If that is the case, I would like to go with my first webrev for this
fix, as it nicely propagates the escape state and there is no
unnecessary promotion to global escape state.

http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA/

Best regards,

Jamsheed

>
> Thanks,
> Vladimir
>
> On 7/15/20 10:54 AM, Jamsheed C M wrote:
>> Hi Vladimir,
>>
>> With unrolling I understand that many cases will just have phis
>> everywhere to outside the loop, as the uses are outside the loop.
>>
>> And this is not restricted to escaping objects alone as I depicted.
>> It can be escaping as well as non-escaping.
>>
>> So marking stores to them as global escape doesn't seem to be a nice
>> idea. I will rework this fix and get back again.
>>
>> Thank you
>>
>> Best regards
>>
>> Jamsheed
>>
>> On 15/07/2020 08:38, Jamsheed C M wrote:
>>> (unfinished mail got sent, so completing it)
>>> On 15/07/2020 08:21, Jamsheed C M wrote:
>>>> Hi Vladimir,
>>>>
>>>> On 15/07/2020 06:50, Vladimir Kozlov wrote:
>>>>> I looked more into this. EA already does not scalarize allocations
>>>>> when Phi nodes merged them - it should handle this case. I did a
>>>>> small experiment and relaxed the assert for this new (10. needs
>>>>> comment update) case for AddP's base and the test passed:
>>>>>
>>>>> src/hotspot/share/opto/escape.cpp Tue Jul 14 18:11:27 2020 -0700
>>>>> @@ -2357,6 +2357,7 @@
>>>>>        int opcode = uncast_base->Opcode();
>>>>>        assert(opcode == Op_ConP || opcode == Op_ThreadLocal ||
>>>>>               opcode == Op_CastX2P || uncast_base->is_DecodeNarrowPtr() ||
>>>>> +             (uncast_base->is_Phi() && (uncast_base->bottom_type()->isa_rawptr() != NULL)) ||
>>>>>               (uncast_base->is_Mem() && (uncast_base->bottom_type()->isa_rawptr() != NULL)) ||
>>>>>               (uncast_base->is_Proj() && uncast_base->in(0)->is_Allocate()), "sanity");
>>>>>      }
>>>>>
>>>>> Did you hit a case when this may not work?
>>>>
>>>> Yes, right, it already doesn't mark it as scalarizable if the base
>>>> count is more than one (I think it missed an is_oop check there)[1].
>>>>
>>>> EA CG adds edges only for oop fields, leaving stores to them
>>>> undetected. This makes these stored objects NoEscape, and if the
>>>> compiled method continues execution with this NoEscape object it
>>>> can have undesired results (i.e. synchronization removed).
>>>>
>>>> Probable case would be(didn't verify)
>>>>
>>>> try {
>>>>
>>>> LOOP BEGIN
>>>>
>>>>   try {throw new Obj()} catch {}
>>>>
>>>> LOOP END
>>>>
>>>> } catch (Obj e) {
>>>>
>>>> }
>>>
>>> Best Regards,
>>>
>>> Jamsheed
>>>
>>> [1]https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/escape.cpp#L1770 
>>>
>>>
>>>
>>>>>
>>>>>
>>>>> And with LoopOpts off -XX:LoopUnrollLimit=0 it removed allocation 
>>>>> (-XX:+PrintEscapeAnalysis -XX:+PrintEliminateAllocations):
>>>>>
>>>>> ======== Connection graph for  Test::test
>>>>> JavaObject NoEscape(NoEscape) [ 158F [ 107 ]]   95 Allocate ===
>>>>> 242  76  230  8  1 ( 93  92  21  1  78  1  78 ) [[ 96 97 98 105
>>>>> 106  107 ]]  rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *,
>>>>> bool, top ) Test::test1 @ bci:0 Test::test @ bci:8 !jvms:
>>>>> Test::test1 @ bci:0 Test::test @ bci:8
>>>>> LocalVar [ 95P [ 158b ]]   107    Proj    ===  95  [[ 108 158 ]]
>>>>> #5 !jvms: Test::test1 @ bci:0 Test::test @ bci:8
>>>>>
>>>>> Scalar  95    Allocate    ===  242  76  230  8  1 ( 93 92  21 1 78
>>>>> 1  78 ) [[ 96  97  98  105  106  107 ]] rawptr:NotNull ( int:>=0,
>>>>> java/lang/Object:NotNull *, bool, top ) Test::test1 @ bci:0
>>>>> Test::test @ bci:8 !jvms: Test::test1 @ bci:0 Test::test @ bci:8
>>>>> ++++ Eliminated: 95 Allocate
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Vladimir K
>>>>>
>>>>> On 7/14/20 1:28 AM, Jamsheed C M wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> I had incorrectly added an extra check in the assert after offset
>>>>>> computation in address_offset, for AddPs with non-constant
>>>>>> offsets (like [1]).
>>>>>>
>>>>>> I am not changing the old assert even though I am not expecting the
>>>>>> first addp/second addp (for array addressing) case for init captured stores.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA_asserts_corrected/ 
>>>>>>
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Jamsheed
>>>>>>
>>>>>> [1]
>>>>>>
>>>>>> assert(offs != Type::OffsetBot ||
>>>>>> - adr->in(AddPNode::Address)->in(0)->is_AllocateArray(),
>>>>>> + adr->in(AddPNode::Address)->in(0)->is_AllocateArray() || is_captured_store(adr),
>>>>>>              "offset must be a constant or it is initialization of array");
>>>>>>
>>>>>> On 13/07/2020 11:14, Jamsheed C M wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I reworked the fix. I compute the offset for all init captured
>>>>>>> stores, but treat these special init captured stores similarly to
>>>>>>> unsafe (as these objects are usually GlobalEscape and this doesn't
>>>>>>> have any perf implications).
>>>>>>>
>>>>>>> revised webrev: 
>>>>>>> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.01/
>>>>>>>
>>>>>>> testing: mach1-5( logs in jbs)
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>> Jamsheed
>>>>>>>
>>>>>>> On 09/07/2020 19:36, Jamsheed C M wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Request to hold the review. I need to change the code for dealing
>>>>>>>> with unsafe access, as the current capture code spends more
>>>>>>>> execution time analyzing things.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>> Jamsheed
>>>>>>>>
>>>>>>>> On 09/07/2020 13:01, Jamsheed C M wrote:
>>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> JBS:https://bugs.openjdk.java.net/browse/JDK-8242895
>>>>>>>>>
>>>>>>>>> Request for review of changes made to offset computation and
>>>>>>>>> field write detection for init captured stores, due to phis
>>>>>>>>> added between alloc and init. This happens if the init node is
>>>>>>>>> in a different outer loop wrt the alloc node and there is a
>>>>>>>>> loop opt. This was required as a result of enhancement [1].
>>>>>>>>>
>>>>>>>>> Normally inits are not associated with multiple alloc nodes
>>>>>>>>> during the EA phase, but changes done for [1] caused code
>>>>>>>>> shapes of the form [2] to generate inits associated with
>>>>>>>>> multiple alloc nodes.
>>>>>>>>>
>>>>>>>>> This had implications for offset computation and field write
>>>>>>>>> detection related to initializing stores.
>>>>>>>>>
>>>>>>>>> Attempt to fix in EA:
>>>>>>>>>
>>>>>>>>>      webrev:
>>>>>>>>> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA/
>>>>>>>>>
>>>>>>>>> Alternate fix:
>>>>>>>>>
>>>>>>>>>      Minimize the scenario in compiler generated code by
>>>>>>>>> throwing only j.l.Error from slowpath (all exceptions, async/sync,
>>>>>>>>> are handled in runtime exit).
>>>>>>>>>
>>>>>>>>>      Stub epilog doesn't poll or throw any exceptions. Disable
>>>>>>>>> full loop opt before EA for detectable patterns and bail out of
>>>>>>>>> EA for late detected patterns.
>>>>>>>>>
>>>>>>>>>      webrev:
>>>>>>>>> http://cr.openjdk.java.net/~jcm/8242895/webrev_deopt/
>>>>>>>>>
>>>>>>>>> Please advise.
>>>>>>>>>
>>>>>>>>> Testing : mach tier1-5 (logs in jbs)
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>>
>>>>>>>>> Jamsheed
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [1] JDK-8231291
>>>>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8231291> C2: loop
>>>>>>>>> opts before EA should maximally unroll loops
>>>>>>>>>
>>>>>>>>> [2] code shapes that have their init node in a different outer
>>>>>>>>> loop wrt the alloc node.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> loop begin
>>>>>>>>>
>>>>>>>>>    try{
>>>>>>>>>
>>>>>>>>>    return new obj()/  throw new obj()/ uncommon trap after
>>>>>>>>> allocation, in a loop
>>>>>>>>>
>>>>>>>>>    } catch(ex) {
>>>>>>>>>
>>>>>>>>>    }
>>>>>>>>>
>>>>>>>>> loop end
>>>>>>>>>
>>>>>>>>>     public static IntA test(int n) {
>>>>>>>>>         for (int i=0; i<2; i++) {
>>>>>>>>>             try {
>>>>>>>>>                   return new IntA(n + i);
>>>>>>>>>             } catch (Exception e) {
>>>>>>>>>             }
>>>>>>>>>         }
>>>>>>>>>
>>>>>>>>>

From jiefu at tencent.com  Thu Jul 16 01:59:32 2020
From: jiefu at tencent.com (=?utf-8?B?amllZnUo5YKF5p2wKQ==?=)
Date: Thu, 16 Jul 2020 01:59:32 +0000
Subject: RFR: 8246805: Incorrect copyright header in
 TestInvalidTieredStopAtLevel.java
Message-ID: <FC3DFC2D-5CDA-4839-AFAC-552844D6C5AF@tencent.com>

Hi all,

May I get reviews for this tiny fix, which just updates the license to be GPLv2 only (not GPLv2+CPE)?

JBS:    https://bugs.openjdk.java.net/browse/JDK-8246805
Webrev: http://cr.openjdk.java.net/~jiefu/8246805/webrev.00/

Thanks a lot.
Best regards,
Jie

From jamsheed.c.m at oracle.com  Thu Jul 16 02:03:31 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Thu, 16 Jul 2020 07:33:31 +0530
Subject: [15] RFR: 8246381: VM crashes with "Current BasicObjectLock*
 below than low_mark"
In-Reply-To: <1af60254-a239-c21f-68df-be9b65534e7f@oracle.com>
References: <7a802330-e836-1ff3-af0a-ede587e049ff@oracle.com>
 <30bd811e-c890-5bb1-8c78-4cf944fd5a42@oracle.com>
 <5d43f963-b931-3b69-4b5c-188c45b57de8@oracle.com>
 <1af60254-a239-c21f-68df-be9b65534e7f@oracle.com>
Message-ID: <e547e782-dfba-7984-75ee-1df9e2e80fd9@oracle.com>

Hi David,

On 16/07/2020 06:37, David Holmes wrote:
> Hi Jamsheed,
>
> tl;dr version: fix looks good. Thanks for working through things with 
> me on this one.
>
> Long version ... for the sake of other reviewers (and myself) I'm 
> going to walk through the problem scenario and how the fix addresses 
> it, because the bug report is long and confusing and touches on a 
> number of different issues with async exception handling.
>
> We are dealing with the code generated for Java method entry, and in 
> particular for a synchronized Java method. We do a lot of things in 
> the entry code before we actually lock the monitor and jump to the 
> Java method. Some of those things include method profiling and the 
> counter overflow check for the JIT. If an exception is thrown at this 
> point, the logic to remove the activation would unlock the monitor - 
> which we haven't actually locked yet! So we have the 
> do_not_unlock_if_synchronized flag which is stored in the current 
> JavaThread. We set that flag true so that if any exceptions result in 
> activation removal, the removal logic won't try to unlock the monitor. 
> Once we're ready to lock the monitor we set the flag back to false 
> (note there is an implicit assumption here that monitor locking can 
> never raise an exception).
>
> The problem arises with async exceptions, or more specifically the 
> async exception that is raised due to an "unsafe access error". This 
> is where a memory-mapped ByteBuffer causes an access violation (SEGV) 
> due to a bad pointer. The signal handler simply sets a flag to 
> indicate we encountered an "unsafe access error", adjusts the BCI to 
> the next instruction and allows execution to proceed at the next 
> instruction. It is then expected that the runtime will "soon" notice 
> this pending unsafe access error and create and throw the 
> InternalError instance that indicates the ByteBuffer operation failed. 
> This requires executing Java code.
>
> One of the places that checks for that pending unsafe access error is 
> in the destructor of the JRT_ENTRY wrapper that is used for the method 
> profiling and counter overflow checking. This occurs whilst the 
> do_not_unlock_if_synchronized flag is true, so the resulting 
> InternalError won't result in an attempt to unlock the not-locked 
> monitor.
>
> The problem is that creating the InternalError executes Java code - it 
> calls constructors, which call methods etc. And some of those methods 
> are synchronized. So the method entry logic for such a call will set 
> do_not_unlock_if_synchronized to true, perform all the preamble 
> related to the call, then set do_not_unlock_if_synchronized to false, 
> lock the monitor and make the call. When construction completes the 
> InternalError is thrown and we remove the activation for the method we 
> had originally started to call. But now the 
> do_not_unlock_if_synchronized flag has been reset to false by the 
> nested Java method call, so we do in fact try to unlock a monitor that 
> was never locked, and things break.
>
> This nesting problem is well known and we have a mechanism for dealing
> with it - the UnlockFlagSaver. The actual logic executed for profiling
> methods and doing the counter overflow check contains the requisite 
> UnlockFlagSaver to avoid the problem just outlined. Unfortunately the 
> async exception is processed in the JRT_ENTRY wrapper, which is 
> outside the scope of those UnlockFlagSaver helpers and so they don't 
> help in this case.
>
> So the fix is to "simply" move the UnlockFlagSaver deeper into the 
> call stack to the code that actually does the async exception processing:
>
>   void JavaThread::check_and_handle_async_exceptions(bool check_unsafe_error) {
> +   // May be we are at method entry and requires to save do not unlock flag.
> +   UnlockFlagSaver fs(this);
>
> so now after the InternalError has been created and thrown we will
> restore the original value of the do_not_unlock_if_synchronized flag
> (true) and so the InternalError will not cause activation removal to
> attempt to unlock the not-locked monitor.
>
> The scope of the UnlockFlagSaver could be narrowed to the actual logic 
> for processing the unsafe access error, but it seems fine at method 
> scope.
>
> A second fix is that the overflow counter check had an assertion that 
> it was not executed with any pending exceptions. But that turned out 
> to be false for reasons I can't fully explain, but it again appears to 
> relate to a pending async exception being installed prior to the 
> method call - and seems related to the two referenced JVM TI 
> functions. The simple solution here is to delete the assertion and to 
> check for pending exceptions on entry to the code and just return 
> immediately. The JRT_ENTRY destructor will see the pending exception 
> and propagate it.

Thanks a lot for the opportunity, for all the help, and for putting a
detailed description of the problem here.

Best regards,

Jamsheed

>
> Cheers,
> David
>
> On 16/07/2020 9:50 am, David Holmes wrote:
>> Hi Jamsheed,
>>
>> On 16/07/2020 8:16 am, Jamsheed C M wrote:
>>> (Thank you Dean, adding serviceability team as this issue involves 
>>> JVMTI features PopFrame, EarlyReturn features)
>>
>> It is not at all obvious how your proposed fix impacts the JVM TI 
>> features.
>>
>>> JBS entry: https://bugs.openjdk.java.net/browse/JDK-8246381
>>>
>>> (testing: mach5, tier1-5 links in JBS)
>>>
>>> Best regards,
>>>
>>> Jamsheed
>>>
>>> On 15/07/2020 21:25, Jamsheed C M wrote:
>>>>
>>>> Hi,
>>>>
>>>> Async handling at method entry requires it to be aware of 
>>>> synchronization(like whether it is doing async handling before lock 
>>>> acquire or after)
>>>>
>>>> This is required as exception handler rely on this info for 
>>>> unlocking. Async handling code never had this special condition
>>>> handled and it worked most of the time as we were using biased 
>>>> locking which got disabled by [1]
>>>>
>>>> There was one other issue reported in similar time[2]. This issue 
>>>> got triggered in test case by [3], back to back extra safepoint 
>>>> after suspend and TLH for ThreadDeath. So in this setup both 
>>>> PopFrame request and Thread.Stop request happened together for the 
>>>> test scenario and it reached java method entry with 
>>>> pending_exception set.
>>>>
>>>> I have done a partial fix for the issue, mainly to handle 
>>>> production mode crash failures(do not unlock flag related ones)
>>>>
>>>> Fix detail:
>>>>
>>>> 1) I save restore the "do not unlock" flag in async handling.
>>
>> Sorry but you completely changed the fix compared to what we 
>> discussed and what I pre-reviewed! What happened to changing from 
>> JRT_ENTRY to JRT_ENTRY_NOASYNC? It is going to take me a lot of time 
>> and effort to determine that this save/restore of the "do not unlock 
>> flag" is actually correct and valid!
>>
>>>>
>>>> 2) Return for floating pending exception for some cases(PopFrame, 
>>>> Early return related). This is debug(JVMTI) feature and floating 
>>>> exception can get cleaned just like that in present compiler 
>>>> request and deopt code.
>>
>> What part of the change addresses this?
>>
>> Thanks,
>> David
>> -----
>>
>>>>
>>>> webrev :http://cr.openjdk.java.net/~jcm/8246381/webrev.02/
>>>>
>>>> There are more problems in these code areas, like we clear all 
>>>> exceptions in compilation request path(interpreter,c1), as well as 
>>>> deoptimization path.
>>>>
>>>> All these un-handled cases will be separately handled by 
>>>> https://bugs.openjdk.java.net/browse/JDK-8249451
>>>>
>>>> Request for review.
>>>>
>>>> Best regards,
>>>>
>>>> Jamsheed
>>>>
>>>> [1]https://bugs.openjdk.java.net/browse/JDK-8231264 
>>>> <https://bugs.openjdk.java.net/browse/JDK-8231264>
>>>>
>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8246727
>>>>
>>>> [3] https://bugs.openjdk.java.net/browse/JDK-8221207
>>>>

From mikael.vidstedt at oracle.com  Thu Jul 16 02:10:20 2020
From: mikael.vidstedt at oracle.com (Mikael Vidstedt)
Date: Wed, 15 Jul 2020 19:10:20 -0700
Subject: RFR: 8246805: Incorrect copyright header in
 TestInvalidTieredStopAtLevel.java
In-Reply-To: <FC3DFC2D-5CDA-4839-AFAC-552844D6C5AF@tencent.com>
References: <FC3DFC2D-5CDA-4839-AFAC-552844D6C5AF@tencent.com>
Message-ID: <95C84F92-6E0E-4995-AA01-CBD4BB81CC8B@oracle.com>


Thanks for doing this. Can you please use the same exact license header found in make/templates/gpl-header and/or the surrounding files in that test directory?

Cheers,
Mikael

> On Jul 15, 2020, at 6:59 PM, jiefu(傅杰) <jiefu at tencent.com> wrote:
> 
> Hi all,
> 
> May I get reviews for this tiny fix, which just updates the license to be GPLv2 only (not GPLv2+CPE)?
> 
> JBS:    https://bugs.openjdk.java.net/browse/JDK-8246805
> Webrev: http://cr.openjdk.java.net/~jiefu/8246805/webrev.00/
> 
> Thanks a lot.
> Best regards,
> Jie


From jiefu at tencent.com  Thu Jul 16 02:55:04 2020
From: jiefu at tencent.com (=?utf-8?B?amllZnUo5YKF5p2wKQ==?=)
Date: Thu, 16 Jul 2020 02:55:04 +0000
Subject: RFR: 8246805: Incorrect copyright header in
 TestInvalidTieredStopAtLevel.java
Message-ID: <48C11F71-A4AD-44EB-A3EF-C3F920176605@tencent.com>

Hi Mikael,

Thanks for your review.
Updated: http://cr.openjdk.java.net/~jiefu/8246805/webrev.01/

Thanks.
Best regards,
Jie

On 2020/7/16, 10:11 AM, "Mikael Vidstedt" <mikael.vidstedt at oracle.com> wrote:

    
    Thanks for doing this. Can you please use the same exact license header found in make/templates/gpl-header and/or the surrounding files in that test directory?
    
    Cheers,
    Mikael
    
    > On Jul 15, 2020, at 6:59 PM, jiefu(傅杰) <jiefu at tencent.com> wrote:
    > 
    > Hi all,
    > 
    > May I get reviews for this tiny fix, which just updates the license to be GPLv2 only (not GPLv2+CPE)?
    > 
    > JBS:    https://bugs.openjdk.java.net/browse/JDK-8246805
    > Webrev: http://cr.openjdk.java.net/~jiefu/8246805/webrev.00/
    > 
    > Thanks a lot.
    > Best regards,
    > Jie
    
    
From igor.ignatyev at oracle.com  Thu Jul 16 03:31:01 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 15 Jul 2020 20:31:01 -0700
Subject: RFR: 8246805: Incorrect copyright header in
 TestInvalidTieredStopAtLevel.java
In-Reply-To: <48C11F71-A4AD-44EB-A3EF-C3F920176605@tencent.com>
References: <48C11F71-A4AD-44EB-A3EF-C3F920176605@tencent.com>
Message-ID: <5E44E4D8-1478-4604-A472-8065E09276E1@oracle.com>

Hi Jie,

LGTM

-- Igor

> On Jul 15, 2020, at 7:55 PM, jiefu(傅杰) <jiefu at tencent.com> wrote:
> 
> Hi Mikael,
> 
> Thanks for your review.
> Updated: http://cr.openjdk.java.net/~jiefu/8246805/webrev.01/
> 
> Thanks.
> Best regards,
> Jie
> 
> On 2020/7/16, 10:11 AM, "Mikael Vidstedt" <mikael.vidstedt at oracle.com> wrote:
> 
> 
>    Thanks for doing this. Can you please use the same exact license header found in make/templates/gpl-header and/or the surrounding files in that test directory?
> 
>    Cheers,
>    Mikael
> 
>> On Jul 15, 2020, at 6:59 PM, jiefu(傅杰) <jiefu at tencent.com> wrote:
>> 
>> Hi all,
>> 
>> May I get reviews for this tiny fix, which just updates the license to be GPLv2 only (not GPLv2+CPE)?
>> 
>> JBS:    https://bugs.openjdk.java.net/browse/JDK-8246805
>> Webrev: http://cr.openjdk.java.net/~jiefu/8246805/webrev.00/
>> 
>> Thanks a lot.
>> Best regards,
>> Jie
> 
> 
> 
> 


From mikael.vidstedt at oracle.com  Thu Jul 16 03:33:55 2020
From: mikael.vidstedt at oracle.com (Mikael Vidstedt)
Date: Wed, 15 Jul 2020 20:33:55 -0700
Subject: RFR: 8246805: Incorrect copyright header in
 TestInvalidTieredStopAtLevel.java
In-Reply-To: <48C11F71-A4AD-44EB-A3EF-C3F920176605@tencent.com>
References: <48C11F71-A4AD-44EB-A3EF-C3F920176605@tencent.com>
Message-ID: <37DFCE79-0F1F-45A0-A0CC-13A915D34936@oracle.com>


Looks good, thank you!

Cheers,
Mikael

> On Jul 15, 2020, at 7:55 PM, jiefu(傅杰) <jiefu at tencent.com> wrote:
> 
> Hi Mikael,
> 
> Thanks for your review.
> Updated: http://cr.openjdk.java.net/~jiefu/8246805/webrev.01/
> 
> Thanks.
> Best regards,
> Jie
> 
> On 2020/7/16, 10:11 AM, "Mikael Vidstedt" <mikael.vidstedt at oracle.com> wrote:
> 
> 
>    Thanks for doing this. Can you please use the same exact license header found in make/templates/gpl-header and/or the surrounding files in that test directory?
> 
>    Cheers,
>    Mikael
> 
>> On Jul 15, 2020, at 6:59 PM, jiefu(傅杰) <jiefu at tencent.com> wrote:
>> 
>> Hi all,
>> 
>> May I get reviews for this tiny fix, which just updates the license to be GPLv2 only (not GPLv2+CPE)?
>> 
>> JBS:    https://bugs.openjdk.java.net/browse/JDK-8246805
>> Webrev: http://cr.openjdk.java.net/~jiefu/8246805/webrev.00/
>> 
>> Thanks a lot.
>> Best regards,
>> Jie
> 
> 
> 
> 


From jiefu at tencent.com  Thu Jul 16 03:58:06 2020
From: jiefu at tencent.com (=?utf-8?B?amllZnUo5YKF5p2wKQ==?=)
Date: Thu, 16 Jul 2020 03:58:06 +0000
Subject: RFR: 8246805: Incorrect copyright header in
 TestInvalidTieredStopAtLevel.java
Message-ID: <AEC0016D-8827-4EB9-997D-FF8AF0D234F0@tencent.com>

Thanks Igor and Mikael for your review.
Pushed.

Best regards,
Jie

On 2020/7/16, 11:34 AM, "Mikael Vidstedt" <mikael.vidstedt at oracle.com> wrote:

    
    Looks good, thank you!
    
    Cheers,
    Mikael
    
    > On Jul 15, 2020, at 7:55 PM, jiefu(傅杰) <jiefu at tencent.com> wrote:
    > 
    > Hi Mikael,
    > 
    > Thanks for your review.
    > Updated: http://cr.openjdk.java.net/~jiefu/8246805/webrev.01/
    > 
    > Thanks.
    > Best regards,
    > Jie
    > 
    > On 2020/7/16, 10:11 AM, "Mikael Vidstedt" <mikael.vidstedt at oracle.com> wrote:
    > 
    > 
    >    Thanks for doing this. Can you please use the same exact license header found in make/templates/gpl-header and/or the surrounding files in that test directory?
    > 
    >    Cheers,
    >    Mikael
    > 
    >> On Jul 15, 2020, at 6:59 PM, jiefu(傅杰) <jiefu at tencent.com> wrote:
    >> 
    >> Hi all,
    >> 
    >> May I get reviews for this tiny fix, which just updates the license to be GPLv2 only (not GPLv2+CPE)?
    >> 
    >> JBS:    https://bugs.openjdk.java.net/browse/JDK-8246805
    >> Webrev: http://cr.openjdk.java.net/~jiefu/8246805/webrev.00/
    >> 
    >> Thanks a lot.
    >> Best regards,
    >> Jie
    > 
    > 
    > 
    > 
    
    
From xxinliu at amazon.com  Thu Jul 16 04:01:49 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Thu, 16 Jul 2020 04:01:49 +0000
Subject: question about PrintOptoStatistics atomicity
In-Reply-To: <6b3d2637-e01c-8ab9-e32c-2404c7b2a40a@oracle.com>
References: <1594827116846.89704@amazon.com>,
 <6b3d2637-e01c-8ab9-e32c-2404c7b2a40a@oracle.com>
Message-ID: <1594872109123.1250@amazon.com>

Hi, Vladimir, 

Thank you for your information.
I understand. I can use -XX:CICompilerCount=2 if I need to have precise counters. 

thanks,
--lx


________________________________________
From: hotspot-compiler-dev <hotspot-compiler-dev-retn at openjdk.java.net> on behalf of Vladimir Kozlov <vladimir.kozlov at oracle.com>
Sent: Wednesday, July 15, 2020 10:50 AM
To: hotspot-compiler-dev at openjdk.java.net
Subject: RE: [EXTERNAL] question about PrintOptoStatistics atomicity


It was done intentionally because, when that code was implemented, atomic operations were expensive.
We never intended these counters to be precise - they were used mostly for debugging purposes.
It is up to the user how to use them - for example, by running only one C2 thread.
When you collect data for a general application, you want to execute it with the same parameters as in production. I don't
think we should enforce any restrictions in the VM when PrintOptoStatistics is used.

Regards,
Vladimir K

On 7/15/20 8:31 AM, Liu, Xin wrote:
> Hi,
>
>
> I have a question about -XX:+PrintOptoStatistics in c2_globals.hpp.
>
> It dumps many internal counters in different C2 phases.  I found those counters are all static fields.
>
> eg.
>
> http://hg.openjdk.java.net/jdk/jdk/file/4b9ced2b948c/src/hotspot/share/opto/chaitin.cpp#l2297
>
> http://hg.openjdk.java.net/jdk/jdk/file/4b9ced2b948c/src/hotspot/share/opto/phaseX.hpp#l599
>
>
> I notice that all setters of those fields are not atomic. IMHO, hotspot may have more than one c2-compiler-thread running at the same time.
>
> How does hotspot guarantee those fields are thread-safe? or the flag intends to do statistics in single-thread mode by design?
>
>
> If those counters are not atomic, shall we connect this flag to CICompilerCount?
>
> I think we can constrain the number of c2-compiler-thread to 1 if user set PrintOptoStatistics. Does it make sense?
>
>
> thanks,
>
> --lx
>
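
A minimal sketch of the trade-off discussed in this thread, assuming a toy counter rather than the real C2 statistics fields. The names toy_nodes_created, toy_compiler_thread and run_toy_compilers are hypothetical and do not exist in HotSpot. With a plain static int, concurrent increments from several compiler threads may lose updates; a relaxed atomic increment keeps the total exact, at the cost of an atomic read-modify-write on every bump.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical statistics counter, not a real HotSpot field. Declaring it as
// a plain `static int` would make concurrent increments a data race.
static std::atomic<unsigned long> toy_nodes_created{0};

// Stand-in for one compiler thread recording `events` counter bumps.
void toy_compiler_thread(unsigned long events) {
  for (unsigned long i = 0; i < events; ++i) {
    // Relaxed ordering is enough: only the final total matters.
    toy_nodes_created.fetch_add(1, std::memory_order_relaxed);
  }
}

// Runs `threads` toy compiler threads and returns the final counter value.
unsigned long run_toy_compilers(unsigned threads, unsigned long events) {
  toy_nodes_created.store(0, std::memory_order_relaxed);
  std::vector<std::thread> pool;
  for (unsigned t = 0; t < threads; ++t) {
    pool.emplace_back(toy_compiler_thread, events);
  }
  for (std::thread& th : pool) {
    th.join();
  }
  return toy_nodes_created.load(std::memory_order_relaxed);
}
```

Running a single compiler thread, as suggested above, gives the same exactness without any atomic operations, which is why the counters can stay non-atomic when they are used only for debugging.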

From jamsheed.c.m at oracle.com  Thu Jul 16 07:00:18 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Thu, 16 Jul 2020 12:30:18 +0530
Subject: [15] RFR: 8246381: VM crashes with "Current BasicObjectLock*
 below than low_mark"
In-Reply-To: <1af60254-a239-c21f-68df-be9b65534e7f@oracle.com>
References: <7a802330-e836-1ff3-af0a-ede587e049ff@oracle.com>
 <30bd811e-c890-5bb1-8c78-4cf944fd5a42@oracle.com>
 <5d43f963-b931-3b69-4b5c-188c45b57de8@oracle.com>
 <1af60254-a239-c21f-68df-be9b65534e7f@oracle.com>
Message-ID: <55b4473d-8aa4-77e0-1145-2a94a0a5f62e@oracle.com>

Hi all,

Could I get another review?

Best regards,

Jamsheed

On 16/07/2020 06:37, David Holmes wrote:
> Hi Jamsheed,
>
> tl;dr version: fix looks good. Thanks for working through things with 
> me on this one.
>
> Long version ... for the sake of other reviewers (and myself) I'm 
> going to walk through the problem scenario and how the fix addresses 
> it, because the bug report is long and confusing and touches on a 
> number of different issues with async exception handling.
>
> We are dealing with the code generated for Java method entry, and in 
> particular for a synchronized Java method. We do a lot of things in 
> the entry code before we actually lock the monitor and jump to the 
> Java method. Some of those things include method profiling and the 
> counter overflow check for the JIT. If an exception is thrown at this 
> point, the logic to remove the activation would unlock the monitor - 
> which we haven't actually locked yet! So we have the 
> do_not_unlock_if_synchronized flag which is stored in the current 
> JavaThread. We set that flag true so that if any exceptions result in 
> activation removal, the removal logic won't try to unlock the monitor. 
> Once we're ready to lock the monitor we set the flag back to false 
> (note there is an implicit assumption here that monitor locking can 
> never raise an exception).
>
> The problem arises with async exceptions, or more specifically the 
> async exception that is raised due to an "unsafe access error". This 
> is where a memory-mapped ByteBuffer causes an access violation (SEGV) 
> due to a bad pointer. The signal handler simply sets a flag to 
> indicate we encountered an "unsafe access error", adjusts the BCI to 
> the next instruction and allows execution to proceed at the next 
> instruction. It is then expected that the runtime will "soon" notice 
> this pending unsafe access error and create and throw the 
> InternalError instance that indicates the ByteBuffer operation failed. 
> This requires executing Java code.
>
> One of the places that checks for that pending unsafe access error is 
> in the destructor of the JRT_ENTRY wrapper that is used for the method 
> profiling and counter overflow checking. This occurs whilst the 
> do_not_unlock_if_synchronized flag is true, so the resulting 
> InternalError won't result in an attempt to unlock the not-locked 
> monitor.
>
> The problem is that creating the InternalError executes Java code - it 
> calls constructors, which call methods etc. And some of those methods 
> are synchronized. So the method entry logic for such a call will set 
> do_not_unlock_if_synchronized to true, perform all the preamble 
> related to the call, then set do_not_unlock_if_synchronized to false, 
> lock the monitor and make the call. When construction completes the 
> InternalError is thrown and we remove the activation for the method we 
> had originally started to call. But now the 
> do_not_unlock_if_synchronized flag has been reset to false by the 
> nested Java method call, so we do in fact try to unlock a monitor that 
> was never locked, and things break.
>
> This nesting problem is well known and we have a mechanism for dealing 
> with it - the UnlockFlagSaver. The actual logic executed for profiling 
> methods and doing the counter overflow check contains the requisite 
> UnlockFlagSaver to avoid the problem just outlined. Unfortunately the 
> async exception is processed in the JRT_ENTRY wrapper, which is 
> outside the scope of those UnlockFlagSaver helpers and so they don't 
> help in this case.
>
> So the fix is to "simply" move the UnlockFlagSaver deeper into the 
> call stack to the code that actually does the async exception processing:
>
>  void JavaThread::check_and_handle_async_exceptions(bool check_unsafe_error) {
> +   // May be we are at method entry and requires to save do not unlock flag.
> +   UnlockFlagSaver fs(this);
>
> so now after the InternalError has been created and thrown we will 
> restore the original value of the do_not_unlock_if_synchronized flag 
> (true) and so the InternalError will not cause activation removal to 
> attempt to unlock the not-locked monitor.
>
> The scope of the UnlockFlagSaver could be narrowed to the actual logic 
> for processing the unsafe access error, but it seems fine at method 
> scope.
>
> A second fix is that the overflow counter check had an assertion that 
> it was not executed with any pending exceptions. But that turned out 
> to be false for reasons I can't fully explain, but it again appears to 
> relate to a pending async exception being installed prior to the 
> method call - and seems related to the two referenced JVM TI 
> functions. The simple solution here is to delete the assertion and to 
> check for pending exceptions on entry to the code and just return 
> immediately. The JRT_ENTRY destructor will see the pending exception 
> and propagate it.
>
> Cheers,
> David
>
> On 16/07/2020 9:50 am, David Holmes wrote:
>> Hi Jamsheed,
>>
>> On 16/07/2020 8:16 am, Jamsheed C M wrote:
>>> (Thank you Dean, adding serviceability team as this issue involves 
>>> JVMTI features PopFrame, EarlyReturn features)
>>
>> It is not at all obvious how your proposed fix impacts the JVM TI 
>> features.
>>
>>> JBS entry: https://bugs.openjdk.java.net/browse/JDK-8246381
>>>
>>> (testing: mach5, tier1-5 links in JBS)
>>>
>>> Best regards,
>>>
>>> Jamsheed
>>>
>>> On 15/07/2020 21:25, Jamsheed C M wrote:
>>>>
>>>> Hi,
>>>>
>>>> Async handling at method entry requires it to be aware of 
>>>> synchronization(like whether it is doing async handling before lock 
>>>> acquire or after)
>>>>
>>>> This is required as the exception handler relies on this info for 
>>>> unlocking. Async handling code never had this special condition 
>>>> handled and it worked most of the time as we were using biased 
>>>> locking which got disabled by [1]
>>>>
>>>> There was one other issue reported in similar time[2]. This issue 
>>>> got triggered in test case by [3], back to back extra safepoint 
>>>> after suspend and TLH for ThreadDeath. So in this setup both 
>>>> PopFrame request and Thread.Stop request happened together for the 
>>>> test scenario and it reached java method entry with 
>>>> pending_exception set.
>>>>
>>>> I have done a partial fix for the issue, mainly to handle 
>>>> production mode crash failures(do not unlock flag related ones)
>>>>
>>>> Fix detail:
>>>>
>>>> 1) I save restore the "do not unlock" flag in async handling.
>>
>> Sorry but you completely changed the fix compared to what we 
>> discussed and what I pre-reviewed! What happened to changing from 
>> JRT_ENTRY to JRT_ENTRY_NOASYNC? It is going to take me a lot of time 
>> and effort to determine that this save/restore of the "do not unlock 
>> flag" is actually correct and valid!
>>
>>>>
>>>> 2) Return for floating pending exception for some cases(PopFrame, 
>>>> Early return related). This is debug(JVMTI) feature and floating 
>>>> exception can get cleaned just like that in present compiler 
>>>> request and deopt code.
>>
>> What part of the change addresses this?
>>
>> Thanks,
>> David
>> -----
>>
>>>>
>>>> webrev :http://cr.openjdk.java.net/~jcm/8246381/webrev.02/
>>>>
>>>> There are more problems in these code areas, like we clear all 
>>>> exceptions in compilation request path(interpreter,c1), as well as 
>>>> deoptimization path.
>>>>
>>>> All these un-handled cases will be separately handled by 
>>>> https://bugs.openjdk.java.net/browse/JDK-8249451
>>>>
>>>> Request for review.
>>>>
>>>> Best regards,
>>>>
>>>> Jamsheed
>>>>
>>>> [1]https://bugs.openjdk.java.net/browse/JDK-8231264 
>>>> <https://bugs.openjdk.java.net/browse/JDK-8231264>
>>>>
>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8246727
>>>>
>>>> [3] https://bugs.openjdk.java.net/browse/JDK-8221207
>>>>
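
The nesting problem and the save/restore fix described in the walkthrough quoted above can be sketched as a toy model. Everything here is an assumption made for illustration: ToyThread, ToyUnlockFlagSaver and the helper functions are hypothetical stand-ins, not the real JavaThread or UnlockFlagSaver code. The saver remembers the flag when it is constructed and puts it back when it goes out of scope, so a nested synchronized call made while handling the async exception cannot leave the flag cleared.

```cpp
#include <cassert>

// Toy stand-in for the per-thread state; not the real HotSpot JavaThread.
struct ToyThread {
  bool do_not_unlock_if_synchronized = false;
};

// RAII helper modelled on the UnlockFlagSaver idea: save the flag on
// construction, restore it on destruction.
class ToyUnlockFlagSaver {
  ToyThread* _thread;
  bool _saved;
 public:
  explicit ToyUnlockFlagSaver(ToyThread* t)
      : _thread(t), _saved(t->do_not_unlock_if_synchronized) {
    _thread->do_not_unlock_if_synchronized = false;
  }
  ~ToyUnlockFlagSaver() { _thread->do_not_unlock_if_synchronized = _saved; }
};

// Models entry to a nested synchronized Java method: set the flag, run the
// preamble, then clear the flag just before locking the monitor.
void nested_synchronized_call(ToyThread* t) {
  t->do_not_unlock_if_synchronized = true;
  // ... profiling, counter overflow check ...
  t->do_not_unlock_if_synchronized = false;
  // ... lock monitor, run method body, unlock monitor ...
}

// Models async exception processing, which runs Java code (for example the
// InternalError constructor) and may therefore make nested synchronized calls.
void handle_async_exception(ToyThread* t, bool use_saver) {
  if (use_saver) {
    ToyUnlockFlagSaver fs(t);  // flag is restored when fs goes out of scope
    nested_synchronized_call(t);
  } else {
    nested_synchronized_call(t);
  }
}

// Returns the flag value that activation removal would observe afterwards.
bool flag_seen_by_activation_removal(bool use_saver) {
  ToyThread t;
  t.do_not_unlock_if_synchronized = true;  // outer entry: monitor not locked yet
  handle_async_exception(&t, use_saver);
  return t.do_not_unlock_if_synchronized;
}
```

Without the saver the model returns false, which corresponds to activation removal trying to unlock a monitor that was never locked. With the saver it returns true, so the unlock is skipped.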

From aph at redhat.com  Thu Jul 16 08:44:16 2020
From: aph at redhat.com (Andrew Haley)
Date: Thu, 16 Jul 2020 09:44:16 +0100
Subject: [aarch64-port-dev ] RFR(S): 8248676: AArch64: Add workaround for
 LITable constructor
In-Reply-To: <MWHPR21MB051121A1E582BCC22FA96F47B07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB05119533601E0AF1DE3CB0E9B0640@MWHPR21MB0511.namprd21.prod.outlook.com>
 <af179579-b8cd-db69-fede-802614f23092@redhat.com>
 <MWHPR21MB0511A82D28A3FE2A09E3E8ADB0650@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB051121A1E582BCC22FA96F47B07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <0aed0646-c770-03e6-4e0b-5108919b7203@redhat.com>

On 15/07/2020 14:27, Ludovic Henry wrote:
> A quick follow-up on that patch. Is there anything you would like to see done differently?

It's fine, but (as discussed) it should go into
http://hg.openjdk.java.net/aarch64-port/jdk-windows/

We'll need to do a regular pull from jdk/jdk into that tree.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From jamsheed.c.m at oracle.com  Thu Jul 16 11:40:46 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Thu, 16 Jul 2020 17:10:46 +0530
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
In-Reply-To: <f9f80bf5-a565-3820-efa1-2b949b92d764@oracle.com>
References: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
 <8325fac5-6258-9b88-6507-5dcb0597cc17@oracle.com>
 <8522a69e-e538-2cc9-5364-887e450fc653@oracle.com>
 <46144d6d-5714-05ad-a263-01507db937cc@oracle.com>
 <7a361c29-4771-9ed4-1542-b3f68a5726f3@oracle.com>
 <6dc4c99b-1d90-09f1-60d1-fb2caf981266@oracle.com>
 <6b4e4dda-01d4-37d0-5403-a4f5481e5bf0@oracle.com>
 <f8e18f7b-bd13-5032-fb17-04fb037c88b4@oracle.com>
 <32d7fb64-75a5-7add-d496-df33cfaefabf@oracle.com>
 <f9f80bf5-a565-3820-efa1-2b949b92d764@oracle.com>
Message-ID: <03e49aa9-167f-8b8f-a744-408febff5bf6@oracle.com>

Hi Vladimir,

On 16/07/2020 07:25, Jamsheed C M wrote:
> But I have concerns about new is_init_captured_store code. EA is 
> mostly looking only on inputs to see Allocation. And in several places 
> it expecting only to see Allocation because other cases should be 
> filtered out before. 

I understand the concern here. If I use the newer webrevs, I will 
ensure I don't filter out the inputs (basically, check the uncasted input).

For the stores, I don't want to re-compute whether a store is an 
initializing store; this info is already available in Field, or can be 
made available there.

I don't want EA to take more time analyzing things because of my change 
and, in turn, have a perf impact.

Best regards,

Jamsheed


From jamsheed.c.m at oracle.com  Thu Jul 16 14:36:16 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Thu, 16 Jul 2020 20:06:16 +0530
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
In-Reply-To: <32d7fb64-75a5-7add-d496-df33cfaefabf@oracle.com>
References: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
 <8325fac5-6258-9b88-6507-5dcb0597cc17@oracle.com>
 <8522a69e-e538-2cc9-5364-887e450fc653@oracle.com>
 <46144d6d-5714-05ad-a263-01507db937cc@oracle.com>
 <7a361c29-4771-9ed4-1542-b3f68a5726f3@oracle.com>
 <6dc4c99b-1d90-09f1-60d1-fb2caf981266@oracle.com>
 <6b4e4dda-01d4-37d0-5403-a4f5481e5bf0@oracle.com>
 <f8e18f7b-bd13-5032-fb17-04fb037c88b4@oracle.com>
 <32d7fb64-75a5-7add-d496-df33cfaefabf@oracle.com>
Message-ID: <83c48e9f-b247-bfd2-18b5-eea9ea5ae23a@oracle.com>

Hi Vladimir,
On 16/07/2020 00:29, Vladimir Kozlov wrote:
> But I have concerns about new is_init_captured_store code. EA is 
> mostly looking only on inputs to see Allocation. And in several places 
> it expecting only to see Allocation because other cases should be 
> filtered out before.

In all cases we analyze the inputs of the AddP (field). If the address 
is raw (uncasted) and its input points to an allocation projection, we 
can be sure it is an init-captured store or an intrinsic initialization. 
When I searched the present intrinsic code, I saw that all its uses are 
casted addresses (before macro expansion), so the only remaining case 
left out was init-captured stores (case #3).

So I used is_captured_store to find all the raw stores that need to 
be analyzed (i.e. oop field stores).

All init-captured stores have top as their base, and the get_addp_base 
code I added will always detect them and direct address_offset to 
compute the offset.

Best regards,

Jamsheed


From coleen.phillimore at oracle.com  Thu Jul 16 14:43:17 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Thu, 16 Jul 2020 10:43:17 -0400
Subject: [15] RFR: 8246381: VM crashes with "Current BasicObjectLock*
 below than low_mark"
In-Reply-To: <55b4473d-8aa4-77e0-1145-2a94a0a5f62e@oracle.com>
References: <7a802330-e836-1ff3-af0a-ede587e049ff@oracle.com>
 <30bd811e-c890-5bb1-8c78-4cf944fd5a42@oracle.com>
 <5d43f963-b931-3b69-4b5c-188c45b57de8@oracle.com>
 <1af60254-a239-c21f-68df-be9b65534e7f@oracle.com>
 <55b4473d-8aa4-77e0-1145-2a94a0a5f62e@oracle.com>
Message-ID: <38336861-a8eb-fdb0-7860-9cbc8eb820b6@oracle.com>


Thanks to David's description of the problem and the fix, this makes 
sense to me now.
I don't like it and we should revisit async exceptions for all the other 
problems it causes, but this change looks safe and good.

thanks,
Coleen

On 7/16/20 3:00 AM, Jamsheed C M wrote:
> Hi all,
>
> could i get another review?
>
> Best regards,
>
> Jamsheed
>
> On 16/07/2020 06:37, David Holmes wrote:
>> Hi Jamsheed,
>>
>> tl;dr version: fix looks good. Thanks for working through things with 
>> me on this one.
>>
>> Long version ... for the sake of other reviewers (and myself) I'm 
>> going to walk through the problem scenario and how the fix addresses 
>> it, because the bug report is long and confusing and touches on a 
>> number of different issues with async exception handling.
>>
>> We are dealing with the code generated for Java method entry, and in 
>> particular for a synchronized Java method. We do a lot of things in 
>> the entry code before we actually lock the monitor and jump to the 
>> Java method. Some of those things include method profiling and the 
>> counter overflow check for the JIT. If an exception is thrown at this 
>> point, the logic to remove the activation would unlock the monitor - 
>> which we haven't actually locked yet! So we have the 
>> do_not_unlock_if_synchronized flag which is stored in the current 
>> JavaThread. We set that flag true so that if any exceptions result in 
>> activation removal, the removal logic won't try to unlock the 
>> monitor. Once we're ready to lock the monitor we set the flag back to 
>> false (note there is an implicit assumption here that monitor locking 
>> can never raise an exception).
>>
>> The problem arises with async exceptions, or more specifically the 
>> async exception that is raised due to an "unsafe access error". This 
>> is where a memory-mapped ByteBuffer causes an access violation (SEGV) 
>> due to a bad pointer. The signal handler simply sets a flag to 
>> indicate we encountered an "unsafe access error", adjusts the BCI to 
>> the next instruction and allows execution to proceed at the next 
>> instruction. It is then expected that the runtime will "soon" notice 
>> this pending unsafe access error and create and throw the 
>> InternalError instance that indicates the ByteBuffer operation 
>> failed. This requires executing Java code.
>>
>> One of the places that checks for that pending unsafe access error is 
>> in the destructor of the JRT_ENTRY wrapper that is used for the 
>> method profiling and counter overflow checking. This occurs whilst 
>> the do_not_unlock_if_synchronized flag is true, so the resulting 
>> InternalError won't result in an attempt to unlock the not-locked 
>> monitor.
>>
>> The problem is that creating the InternalError executes Java code - 
>> it calls constructors, which call methods etc. And some of those 
>> methods are synchronized. So the method entry logic for such a call 
>> will set do_not_unlock_if_synchronized to true, perform all the 
>> preamble related to the call, then set do_not_unlock_if_synchronized 
>> to false, lock the monitor and make the call. When construction 
>> completes the InternalError is thrown and we remove the activation 
>> for the method we had originally started to call. But now the 
>> do_not_unlock_if_synchronized flag has been reset to false by the 
>> nested Java method call, so we do in fact try to unlock a monitor 
>> that was never locked, and things break.
>>
>> This nesting problem is well known and we have a mechanism for 
>> dealing with it - the UnlockFlagSaver. The actual logic executed for 
>> profiling methods and doing the counter overflow check contains the 
>> requisite UnlockFlagSaver to avoid the problem just outlined. 
>> Unfortunately the async exception is processed in the JRT_ENTRY 
>> wrapper, which is outside the scope of those UnlockFlagSaver helpers 
>> and so they don't help in this case.
>>
>> So the fix is to "simply" move the UnlockFlagSaver deeper into the 
>> call stack to the code that actually does the async exception 
>> processing:
>>
>>  void JavaThread::check_and_handle_async_exceptions(bool check_unsafe_error) {
>> +   // May be we are at method entry and requires to save do not unlock flag.
>> +   UnlockFlagSaver fs(this);
>>
>> so now after the InternalError has been created and thrown we will 
>> restore the original value of the do_not_unlock_if_synchronized flag 
>> (true) and so the InternalError will not cause activation removal to 
>> attempt to unlock the not-locked monitor.
>>
>> The scope of the UnlockFlagSaver could be narrowed to the actual 
>> logic for processing the unsafe access error, but it seems fine at 
>> method scope.
>>
>> A second fix is that the overflow counter check had an assertion that 
>> it was not executed with any pending exceptions. But that turned out 
>> to be false for reasons I can't fully explain, but it again appears 
>> to relate to a pending async exception being installed prior to the 
>> method call - and seems related to the two referenced JVM TI 
>> functions. The simple solution here is to delete the assertion and to 
>> check for pending exceptions on entry to the code and just return 
>> immediately. The JRT_ENTRY destructor will see the pending exception 
>> and propagate it.
>>
>> Cheers,
>> David
>>
>> On 16/07/2020 9:50 am, David Holmes wrote:
>>> Hi Jamsheed,
>>>
>>> On 16/07/2020 8:16 am, Jamsheed C M wrote:
>>>> (Thank you Dean, adding serviceability team as this issue involves 
>>>> JVMTI features PopFrame, EarlyReturn features)
>>>
>>> It is not at all obvious how your proposed fix impacts the JVM TI 
>>> features.
>>>
>>>> JBS entry: https://bugs.openjdk.java.net/browse/JDK-8246381
>>>>
>>>> (testing: mach5, tier1-5 links in JBS)
>>>>
>>>> Best regards,
>>>>
>>>> Jamsheed
>>>>
>>>> On 15/07/2020 21:25, Jamsheed C M wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Async handling at method entry requires it to be aware of 
>>>>> synchronization(like whether it is doing async handling before 
>>>>> lock acquire or after)
>>>>>
>>>>> This is required as the exception handler relies on this info for 
>>>>> unlocking. Async handling code never had this special condition 
>>>>> handled and it worked most of the time as we were using biased 
>>>>> locking which got disabled by [1]
>>>>>
>>>>> There was one other issue reported in similar time[2]. This issue 
>>>>> got triggered in test case by [3], back to back extra safepoint 
>>>>> after suspend and TLH for ThreadDeath. So in this setup both 
>>>>> PopFrame request and Thread.Stop request happened together for the 
>>>>> test scenario and it reached java method entry with 
>>>>> pending_exception set.
>>>>>
>>>>> I have done a partial fix for the issue, mainly to handle 
>>>>> production mode crash failures(do not unlock flag related ones)
>>>>>
>>>>> Fix detail:
>>>>>
>>>>> 1) I save restore the "do not unlock" flag in async handling.
>>>
>>> Sorry but you completely changed the fix compared to what we 
>>> discussed and what I pre-reviewed! What happened to changing from 
>>> JRT_ENTRY to JRT_ENTRY_NOASYNC? It is going to take me a lot of time 
>>> and effort to determine that this save/restore of the "do not unlock 
>>> flag" is actually correct and valid!
>>>
>>>>>
>>>>> 2) Return for floating pending exception for some cases(PopFrame, 
>>>>> Early return related). This is debug(JVMTI) feature and floating 
>>>>> exception can get cleaned just like that in present compiler 
>>>>> request and deopt code.
>>>
>>> What part of the change addresses this?
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>>>
>>>>> webrev :http://cr.openjdk.java.net/~jcm/8246381/webrev.02/
>>>>>
>>>>> There are more problems in these code areas, like we clear all 
>>>>> exceptions in compilation request path(interpreter,c1), as well as 
>>>>> deoptimization path.
>>>>>
>>>>> All these un-handled cases will be separately handled by 
>>>>> https://bugs.openjdk.java.net/browse/JDK-8249451
>>>>>
>>>>> Request for review.
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Jamsheed
>>>>>
>>>>> [1]https://bugs.openjdk.java.net/browse/JDK-8231264 
>>>>> <https://bugs.openjdk.java.net/browse/JDK-8231264>
>>>>>
>>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8246727
>>>>>
>>>>> [3] https://bugs.openjdk.java.net/browse/JDK-8221207
>>>>>


From jamsheed.c.m at oracle.com  Thu Jul 16 14:49:48 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Thu, 16 Jul 2020 20:19:48 +0530
Subject: [15] RFR: 8246381: VM crashes with "Current BasicObjectLock*
 below than low_mark"
In-Reply-To: <38336861-a8eb-fdb0-7860-9cbc8eb820b6@oracle.com>
References: <7a802330-e836-1ff3-af0a-ede587e049ff@oracle.com>
 <30bd811e-c890-5bb1-8c78-4cf944fd5a42@oracle.com>
 <5d43f963-b931-3b69-4b5c-188c45b57de8@oracle.com>
 <1af60254-a239-c21f-68df-be9b65534e7f@oracle.com>
 <55b4473d-8aa4-77e0-1145-2a94a0a5f62e@oracle.com>
 <38336861-a8eb-fdb0-7860-9cbc8eb820b6@oracle.com>
Message-ID: <24043cec-b3f2-bfa0-fd66-f2fcedc4be27@oracle.com>

Hi Coleen,

Thank you for the review.

Best regards,

Jamsheed

On 16/07/2020 20:13, coleen.phillimore at oracle.com wrote:
>
> Thanks to David's description of the problem and the fix, this makes 
> sense to me now.
> I don't like it and we should revisit async exceptions for all the 
> other problems it causes, but this change looks safe and good.
>
> thanks,
> Coleen
>
> On 7/16/20 3:00 AM, Jamsheed C M wrote:
>> Hi all,
>>
>> could i get another review?
>>
>> Best regards,
>>
>> Jamsheed
>>
>> On 16/07/2020 06:37, David Holmes wrote:
>>> Hi Jamsheed,
>>>
>>> tl;dr version: fix looks good. Thanks for working through things 
>>> with me on this one.
>>>
>>> Long version ... for the sake of other reviewers (and myself) I'm 
>>> going to walk through the problem scenario and how the fix addresses 
>>> it, because the bug report is long and confusing and touches on a 
>>> number of different issues with async exception handling.
>>>
>>> We are dealing with the code generated for Java method entry, and in 
>>> particular for a synchronized Java method. We do a lot of things in 
>>> the entry code before we actually lock the monitor and jump to the 
>>> Java method. Some of those things include method profiling and the 
>>> counter overflow check for the JIT. If an exception is thrown at 
>>> this point, the logic to remove the activation would unlock the 
>>> monitor - which we haven't actually locked yet! So we have the 
>>> do_not_unlock_if_synchronized flag which is stored in the current 
>>> JavaThread. We set that flag true so that if any exceptions result 
>>> in activation removal, the removal logic won't try to unlock the 
>>> monitor. Once we're ready to lock the monitor we set the flag back 
>>> to false (note there is an implicit assumption here that monitor 
>>> locking can never raise an exception).
>>>
>>> The problem arises with async exceptions, or more specifically the 
>>> async exception that is raised due to an "unsafe access error". This 
>>> is where a memory-mapped ByteBuffer causes an access violation 
>>> (SEGV) due to a bad pointer. The signal handler simply sets a flag 
>>> to indicate we encountered an "unsafe access error", adjusts the BCI 
>>> to the next instruction and allows execution to proceed at the next 
>>> instruction. It is then expected that the runtime will "soon" notice 
>>> this pending unsafe access error and create and throw the 
>>> InternalError instance that indicates the ByteBuffer operation 
>>> failed. This requires executing Java code.
>>>
>>> One of the places that checks for that pending unsafe access error 
>>> is in the destructor of the JRT_ENTRY wrapper that is used for the 
>>> method profiling and counter overflow checking. This occurs whilst 
>>> the do_not_unlock_if_synchronized flag is true, so the resulting 
>>> InternalError won't result in an attempt to unlock the not-locked 
>>> monitor.
>>>
>>> The problem is that creating the InternalError executes Java code - 
>>> it calls constructors, which call methods etc. And some of those 
>>> methods are synchronized. So the method entry logic for such a call 
>>> will set do_not_unlock_if_synchronized to true, perform all the 
>>> preamble related to the call, then set do_not_unlock_if_synchronized 
>>> to false, lock the monitor and make the call. When construction 
>>> completes the InternalError is thrown and we remove the activation 
>>> for the method we had originally started to call. But now the 
>>> do_not_unlock_if_synchronized flag has been reset to false by the 
>>> nested Java method call, so we do in fact try to unlock a monitor 
>>> that was never locked, and things break.
>>>
>>> This nesting problem is well known and we have a mechanism for 
>>> dealing with it - the UnlockFlagSaver. The actual logic executed for 
>>> profiling methods and doing the counter overflow check contains the 
>>> requisite UnlockFlagSaver to avoid the problem just outlined. 
>>> Unfortunately the async exception is processed in the JRT_ENTRY 
>>> wrapper, which is outside the scope of those UnlockFlagSaver helpers 
>>> and so they don't help in this case.
>>>
>>> So the fix is to "simply" move the UnlockFlagSaver deeper into the 
>>> call stack to the code that actually does the async exception 
>>> processing:
>>>
>>>  void JavaThread::check_and_handle_async_exceptions(bool check_unsafe_error) {
>>> +   // May be we are at method entry and requires to save do not unlock flag.
>>> +   UnlockFlagSaver fs(this);
>>>
>>> so now after the InternalError has been created and thrown we will 
>>> restore the original value of the do_not_unlock_if_synchronized flag 
>>> (true) and so the InternalError will not cause activation removal 
>>> to attempt to unlock the not-locked monitor.
>>>
>>> The scope of the UnlockFlagSaver could be narrowed to the actual 
>>> logic for processing the unsafe access error, but it seems fine at 
>>> method scope.
>>>
>>> A second fix is that the overflow counter check had an assertion 
>>> that it was not executed with any pending exceptions. But that 
>>> turned out to be false for reasons I can't fully explain, but it 
>>> again appears to relate to a pending async exception being installed 
>>> prior to the method call - and seems related to the two referenced 
>>> JVM TI functions. The simple solution here is to delete the 
>>> assertion and to check for pending exceptions on entry to the code 
>>> and just return immediately. The JRT_ENTRY destructor will see the 
>>> pending exception and propagate it.
>>>
>>> Cheers,
>>> David
>>>
>>> On 16/07/2020 9:50 am, David Holmes wrote:
>>>> Hi Jamsheed,
>>>>
>>>> On 16/07/2020 8:16 am, Jamsheed C M wrote:
>>>>> (Thank you Dean, adding serviceability team as this issue involves 
>>>>> JVMTI features PopFrame, EarlyReturn features)
>>>>
>>>> It is not at all obvious how your proposed fix impacts the JVM TI 
>>>> features.
>>>>
>>>>> JBS entry: https://bugs.openjdk.java.net/browse/JDK-8246381
>>>>>
>>>>> (testing: mach5, tier1-5 links in JBS)
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Jamsheed
>>>>>
>>>>> On 15/07/2020 21:25, Jamsheed C M wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Async exception handling at method entry needs to be aware of 
>>>>>> synchronization (i.e. whether it runs before or after the lock 
>>>>>> is acquired).
>>>>>>
>>>>>> This is required because the exception handler relies on this 
>>>>>> info for unlocking. The async handling code never handled this 
>>>>>> special condition; it worked most of the time because we were 
>>>>>> using biased locking, which got disabled by [1].
>>>>>>
>>>>>> There was one other issue reported in similar time[2]. This issue 
>>>>>> got triggered in test case by [3], back to back extra safepoint 
>>>>>> after suspend and TLH for ThreadDeath. So in this setup both 
>>>>>> PopFrame request and Thread.Stop request happened together for 
>>>>>> the test scenario and it reached java method entry with 
>>>>>> pending_exception set.
>>>>>>
>>>>>> I have done a partial fix for the issue, mainly to handle 
>>>>>> production mode crash failures(do not unlock flag related ones)
>>>>>>
>>>>>> Fix detail:
>>>>>>
>>>>>> 1) I save and restore the "do not unlock" flag in async handling.
>>>>
>>>> Sorry but you completely changed the fix compared to what we 
>>>> discussed and what I pre-reviewed! What happened to changing from 
>>>> JRT_ENTRY to JRT_ENTRY_NOASYNC? It is going to take me a lot of 
>>>> time and effort to determine that this save/restore of the "do not 
>>>> unlock flag" is actually correct and valid!
>>>>
>>>>>>
>>>>>> 2) Return for floating pending exception for some cases(PopFrame, 
>>>>>> Early return related). This is debug(JVMTI) feature and floating 
>>>>>> exception can get cleaned just like that in present compiler 
>>>>>> request and deopt code.
>>>>
>>>> What part of the change addresses this?
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>>>
>>>>>> webrev :http://cr.openjdk.java.net/~jcm/8246381/webrev.02/
>>>>>>
>>>>>> There are more problems in these code areas, like we clear all 
>>>>>> exceptions in compilation request path(interpreter,c1), as well 
>>>>>> as deoptimization path.
>>>>>>
>>>>>> All these un-handled cases will be separately handled by 
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8249451
>>>>>>
>>>>>> Request for review.
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Jamsheed
>>>>>>
>>>>>> [1]https://bugs.openjdk.java.net/browse/JDK-8231264 
>>>>>>
>>>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8246727
>>>>>>
>>>>>> [3] https://bugs.openjdk.java.net/browse/JDK-8221207
>>>>>>
>
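
Editorial sketch of the flag nesting problem discussed in this thread.
This is a toy Java model, not HotSpot code: FlagSaverSketch, ThreadModel
and the method names are hypothetical, and try-with-resources stands in
for the C++ scoped helper. It only illustrates why saving the flag around
the async exception processing keeps a nested synchronized call from
clobbering the outer value.

```java
// Toy model only; all names here are hypothetical, not HotSpot APIs.
final class FlagSaverSketch {
    // Stand-in for the per-thread do_not_unlock_if_synchronized flag.
    static final class ThreadModel {
        boolean doNotUnlockIfSynchronized;
    }

    // Scoped saver: remembers the flag on entry, restores it on close().
    static final class UnlockFlagSaver implements AutoCloseable {
        private final ThreadModel thread;
        private final boolean saved;

        UnlockFlagSaver(ThreadModel thread) {
            this.thread = thread;
            this.saved = thread.doNotUnlockIfSynchronized;
        }

        @Override
        public void close() {
            thread.doNotUnlockIfSynchronized = saved;
        }
    }

    // Models the entry of a nested synchronized method: the flag is set
    // for the preamble and cleared again just before locking.
    static void nestedSynchronizedCall(ThreadModel t) {
        t.doNotUnlockIfSynchronized = true;
        t.doNotUnlockIfSynchronized = false;
    }

    // Models async exception processing that runs Java code (for example
    // constructing the exception). Returns the flag value seen afterwards.
    static boolean handleAsyncException(ThreadModel t, boolean useSaver) {
        if (useSaver) {
            try (UnlockFlagSaver fs = new UnlockFlagSaver(t)) {
                nestedSynchronizedCall(t);
            }
        } else {
            nestedSynchronizedCall(t);
        }
        return t.doNotUnlockIfSynchronized;
    }
}
```

With the outer flag set to true, handleAsyncException(t, false) leaves it
false (the failure mode described above), while handleAsyncException(t,
true) leaves it true.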

From jatin.bhateja at intel.com  Thu Jul 16 14:52:13 2020
From: jatin.bhateja at intel.com (Bhateja, Jatin)
Date: Thu, 16 Jul 2020 14:52:13 +0000
Subject: RFR[S] : 8248830 : C2 : Rotate API intrinsification for X86
Message-ID: <MWHPR11MB1614EAFF216144FE6EAE68F9E87F0@MWHPR11MB1614.namprd11.prod.outlook.com>

Hi Vladimir, Andrew,

Thanks for your comments.

I have placed updated patch at following location.

http://cr.openjdk.java.net/~jbhateja/8248830/webrev_02/


Summary of changes:
1) The optimization specifically targets the vector rotate instructions added with X86 AVX512. A single rotate instruction encapsulates the entire vector OR/SHIFT pattern and thus offers better latency at a reduced instruction count.

2) There were two approaches to implement this:
    a)  Let everything remain the same and add new wide complex instruction patterns in the matcher for e.g.
         set Dst ( OrV (Binary (LShiftVI dst (Binary ReplicateI shift)) (URShiftVI dst (Binary (SubI (Binary ReplicateI 32) ( Replicate shift))
    It would have been overoptimistic to expect that the graph shape would be preserved until the matcher for correct inference.
    In addition, we would have required multiple such bulky patterns.
    b) Create new RotateLeft/RotateRight scalar nodes. These get generated during intrinsification as well as during additional pattern
    matching during node idealization; later on these nodes are consumed by SLP for valid vectorization scenarios to emit their vector
    counterparts, which eventually emit vector rotates.

3) I chose approach 2b) since it's cleaner. The only problem was that in non-EVEX mode (UseAVX < 3) the new scalar Rotate nodes would either
have to be dismantled back to the OR/SHIFT pattern, or we would penalize vectorization, which would be very costly. Another option would have
been to add an additional vector rotate pattern for UseAVX=3 in the matcher which emits vector OR/SHIFT instructions, but then we would lose
the efficient instruction sequence that node sharing (OrV/LShiftV/URShift) offers in the current implementation. Thus it would not be
beneficial for non-AVX512 targets: the only saving would be the cleanup of a few existing scalar rotate matcher patterns, and old targets
do not offer this powerful rotate instruction anyway. Therefore new scalar nodes are created only for AVX512 targets.

As per suggestions constant folding scenarios have been covered during Idealizations of newly added scalar nodes. 
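
To make the OR/SHIFT pattern from the summary concrete, here is a minimal
Java sketch. It is not part of the patch; RotateSketch and its method names
are made up for illustration. It shows the scalar idiom next to the
Integer.rotateLeft API call, plus an array loop of the general shape that
SLP can consider for vectorization. Whether a given JVM actually emits a
rotate instruction for any of these depends on the target and flags.

```java
// Illustrative sketch only: RotateSketch and its methods are hypothetical names.
final class RotateSketch {
    // Rotate-left spelled out as the OR/SHIFT idiom. Java masks int shift
    // counts to the low 5 bits, so (32 - s) behaves like (-s) here.
    static int rotlViaShifts(int x, int s) {
        return (x << s) | (x >>> (32 - s));
    }

    // The same operation through the library API.
    static int rotlViaApi(int x, int s) {
        return Integer.rotateLeft(x, s);
    }

    // A simple counted loop over arrays; dst must be at least as long as src.
    static void rotlArray(int[] dst, int[] src, int s) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = Integer.rotateLeft(src[i], s);
        }
    }
}
```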

Please review the latest version and share your feedback and test results.

Best Regards,
Jatin


> -----Original Message-----
> From: Andrew Haley <aph at redhat.com>
> Sent: Saturday, July 11, 2020 2:24 PM
> To: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>; Bhateja, Jatin
> <jatin.bhateja at intel.com>; hotspot-compiler-dev at openjdk.java.net
> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>
> Subject: Re: 8248830 : RFR[S] : C2 : Rotate API intrinsification for X86
> 
> On 10/07/2020 18:32, Vladimir Ivanov wrote:
> 
>  > High-level comment: so far, there was no pressing need in explicitly
>  > marking the methods as intrinsics. ROR/ROL instructions were selected
>  > during matching [1]. Now the patch introduces dedicated nodes
>  > (RotateLeft/RotateRight) specifically for intrinsics which partly
>  > duplicates existing logic.
> 
> The lack of rotate nodes in the IR has always meant that AArch64 doesn't
> generate optimal code for e.g.
> 
>    (Set dst (XorL reg1 (RotateLeftL reg2 imm)))
> 
> because, with the RotateLeft expanded to its full combination of ORs and
> shifts, it's too complicated to match. At the time I put this to one side
> because it wasn't urgent. This is a shame because although such
> combinations are unusual they are used in some crypto operations.
> 
> If we can generate immediate-form rotate nodes early by pattern matching
> during parsing (rather than depending on intrinsics) we'll get more value
> than by depending on programmers calling intrinsics.
> 
> --
> Andrew Haley  (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From jamsheed.c.m at oracle.com  Thu Jul 16 16:19:55 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Thu, 16 Jul 2020 21:49:55 +0530
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
In-Reply-To: <f9f80bf5-a565-3820-efa1-2b949b92d764@oracle.com>
References: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
 <8325fac5-6258-9b88-6507-5dcb0597cc17@oracle.com>
 <8522a69e-e538-2cc9-5364-887e450fc653@oracle.com>
 <46144d6d-5714-05ad-a263-01507db937cc@oracle.com>
 <7a361c29-4771-9ed4-1542-b3f68a5726f3@oracle.com>
 <6dc4c99b-1d90-09f1-60d1-fb2caf981266@oracle.com>
 <6b4e4dda-01d4-37d0-5403-a4f5481e5bf0@oracle.com>
 <f8e18f7b-bd13-5032-fb17-04fb037c88b4@oracle.com>
 <32d7fb64-75a5-7add-d496-df33cfaefabf@oracle.com>
 <f9f80bf5-a565-3820-efa1-2b949b92d764@oracle.com>
Message-ID: <fa60eea1-3d7b-af3a-678d-354263e3eccc@oracle.com>

Hi Vladimir,
I did a performance run for 
http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA/ (links in JBS).
I don't see any issues, so I would like to go with webrev_fix_EA if it 
fixes all the reported issues.
Best regards,
Jamsheed

On 16/07/2020 07:25, Jamsheed C M wrote:
> Hi Vladimir,
>
> On 16/07/2020 00:29, Vladimir Kozlov wrote:
>> As I said before I agree with your additional checks for StoreN and 
>> StoreNKlass.
>>
>> But I have concerns about new is_init_captured_store code. EA is 
>> mostly looking only at inputs to see Allocation. And in several 
>> places it is expecting to see only Allocation because other cases 
>> should be filtered out before.
> If that is the case, I would like to go with my first webrev for this 
> fix as it nicely propagates the escape state and there is no 
> unnecessary promotion to global escape state.
>
> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA/
>
> Best regards,
>
> Jamsheed
>
>>
>> Thanks,
>> Vladimir
>>
>> On 7/15/20 10:54 AM, Jamsheed C M wrote:
>>> Hi Vladimir,
>>>
>>> with unrolling I understand that many cases will just have phis 
>>> everywhere to outside the loop, as the uses are outside the loop.
>>>
>>> And this is not restricted to escaping objects alone as I depicted. 
>>> It can be escaping as well as non-escaping.
>>>
>>> So marking stores to them as global escape doesn't seem to be a nice 
>>> idea. I will rework this fix and get back again.
>>>
>>> Thank you
>>>
>>> Best regards
>>>
>>> Jamsheed
>>>
>>> On 15/07/2020 08:38, Jamsheed C M wrote:
>>>> (unfinished mail got sent, so completing it)
>>>> On 15/07/2020 08:21, Jamsheed C M wrote:
>>>>> Hi Vladimir,
>>>>>
>>>>> On 15/07/2020 06:50, Vladimir Kozlov wrote:
>>>>>> I looked more on this. EA already does not scalarize allocations 
>>>>>> when Phi nodes merge them - it should handle this case. I did a 
>>>>>> small experiment and relaxed the assert for this new (10. needs 
>>>>>> comment update) case for AddP's base and the test passed:
>>>>>>
>>>>>> src/hotspot/share/opto/escape.cpp Tue Jul 14 18:11:27 2020 -0700
>>>>>> @@ -2357,6 +2357,7 @@
>>>>>>        int opcode = uncast_base->Opcode();
>>>>>>        assert(opcode == Op_ConP || opcode == Op_ThreadLocal ||
>>>>>>               opcode == Op_CastX2P || uncast_base->is_DecodeNarrowPtr() ||
>>>>>> +             (uncast_base->is_Phi() && (uncast_base->bottom_type()->isa_rawptr() != NULL)) ||
>>>>>>               (uncast_base->is_Mem() && (uncast_base->bottom_type()->isa_rawptr() != NULL)) ||
>>>>>>               (uncast_base->is_Proj() && uncast_base->in(0)->is_Allocate()), "sanity");
>>>>>>      }
>>>>>>
>>>>>> Did you hit a case when this may not work?
>>>>>
>>>>> Yes, right, it already doesn't mark it as scalarizable if the base 
>>>>> count is more than one (I think it missed an is_oop check there) [1].
>>>>>
>>>>> EA CG adds edges only for oop fields, which makes stores to them 
>>>>> go undetected. This makes these stored objects NoEscape, and if 
>>>>> the compiled method continues execution with such a NoEscape object 
>>>>> it can have undesired results (e.g. synchronization removed).
>>>>>
>>>>> Probable case would be(didn't verify)
>>>>>
>>>>> try {
>>>>>
>>>>> LOOP BEGIN
>>>>>
>>>>>   try {throw new Obj()} catch {}
>>>>>
>>>>> LOOP END
>>>>>
>>>>> } catch (Obj e) {
>>>>>
>>>>> }
>>>>
>>>> Best Regards,
>>>>
>>>> Jamsheed
>>>>
>>>> [1]https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/escape.cpp#L1770 
>>>>
>>>>
>>>>
>>>>>>
>>>>>>
>>>>>> And with LoopOpts off -XX:LoopUnrollLimit=0 it removed allocation 
>>>>>> (-XX:+PrintEscapeAnalysis -XX:+PrintEliminateAllocations):
>>>>>>
>>>>>> ======== Connection graph for  Test::test
>>>>>> JavaObject NoEscape(NoEscape) [ 158F [ 107 ]]   95 Allocate === 
>>>>>> 242  76  230  8  1 ( 93  92  21  1  78  1 78 ) [[ 96 97 98 105 
>>>>>> 106  107 ]]  rawptr:NotNull ( int:>=0, java/lang/Object:NotNull 
>>>>>> *, bool, top ) Test::test1 @ bci:0 Test::test @ bci:8 !jvms: 
>>>>>> Test::test1 @ bci:0 Test::test @ bci:8
>>>>>> LocalVar [ 95P [ 158b ]]   107    Proj    ===  95  [[ 108 158 ]] 
>>>>>> #5 !jvms: Test::test1 @ bci:0 Test::test @ bci:8
>>>>>>
>>>>>> Scalar  95    Allocate    ===  242  76  230  8  1 ( 93 92  21 1 
>>>>>> 78 1  78 ) [[ 96  97  98  105  106  107 ]] rawptr:NotNull ( 
>>>>>> int:>=0, java/lang/Object:NotNull *, bool, top ) Test::test1 @ 
>>>>>> bci:0 Test::test @ bci:8 !jvms: Test::test1 @ bci:0 Test::test @ 
>>>>>> bci:8
>>>>>> ++++ Eliminated: 95 Allocate
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir K
>>>>>>
>>>>>> On 7/14/20 1:28 AM, Jamsheed C M wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I had incorrectly added extra check in assert after offset 
>>>>>>> computation in address_offset . For addps with non constant 
>>>>>>> offsets (like [1])
>>>>>>>
>>>>>>> Not changing the old assert even though I am not expecting first 
>>>>>>> addp/second addp(for array addressing) case for init captured 
>>>>>>> store.
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA_asserts_corrected/ 
>>>>>>>
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>> Jamsheed
>>>>>>>
>>>>>>> [1]
>>>>>>>
>>>>>>> assert(offs != Type::OffsetBot ||
>>>>>>> - adr->in(AddPNode::Address)->in(0)->is_AllocateArray(),
>>>>>>> + adr->in(AddPNode::Address)->in(0)->is_AllocateArray() || is_captured_store(adr),
>>>>>>>              "offset must be a constant or it is initialization of array");
>>>>>>>
>>>>>>> On 13/07/2020 11:14, Jamsheed C M wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I reworked the fix. I compute the offset for all init captured 
>>>>>>>> stores, but treat these special init captured stores similar to 
>>>>>>>> unsafe accesses (as these objects are usually GlobalEscape and 
>>>>>>>> this doesn't have any perf implications).
>>>>>>>>
>>>>>>>> revised webrev: 
>>>>>>>> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.01/
>>>>>>>>
>>>>>>>> testing: mach1-5( logs in jbs)
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>> Jamsheed
>>>>>>>>
>>>>>>>> On 09/07/2020 19:36, Jamsheed C M wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> request to hold the review. need to change the code for 
>>>>>>>>> dealing with unsafe access. as current capture code go for 
>>>>>>>>> more execution time analyzing things.
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>>
>>>>>>>>> Jamsheed
>>>>>>>>>
>>>>>>>>> On 09/07/2020 13:01, Jamsheed C M wrote:
>>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> JBS:https://bugs.openjdk.java.net/browse/JDK-8242895
>>>>>>>>>>
>>>>>>>>>> Request for review of changes made to offset computation and 
>>>>>>>>>> field write detection for init captured stores due to phis 
>>>>>>>>>> added between alloc and init. This happens if the init node is 
>>>>>>>>>> in a different outer loop with respect to the alloc node and 
>>>>>>>>>> there is a loop opt. This was required as a result of 
>>>>>>>>>> enhancement [1].
>>>>>>>>>>
>>>>>>>>>> Normally inits are not associated with multiple alloc nodes 
>>>>>>>>>> during the EA phase, but changes done for [1] caused code 
>>>>>>>>>> shapes of the form [2] to generate inits associated with 
>>>>>>>>>> multiple alloc nodes.
>>>>>>>>>>
>>>>>>>>>> This had implication in offset computation and field write 
>>>>>>>>>> detection related to initializing stores.
>>>>>>>>>>
>>>>>>>>>> Attempt to fix in EA:
>>>>>>>>>>
>>>>>>>>>>      webrev: 
>>>>>>>>>> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA/
>>>>>>>>>>
>>>>>>>>>> Alternate fix:
>>>>>>>>>>
>>>>>>>>>>      Minimize the scenario in compiler generated code by 
>>>>>>>>>> throwing only j.l.Error from slowpath (all exceptions, 
>>>>>>>>>> async/sync, are handled in runtime exit).
>>>>>>>>>>
>>>>>>>>>>      Stub epilog doesn't poll or throw any exceptions. 
>>>>>>>>>> Disable full loop opt before EA for detectable patterns and 
>>>>>>>>>> bail out of EA for late detected patterns.
>>>>>>>>>>
>>>>>>>>>>      webrev: 
>>>>>>>>>> http://cr.openjdk.java.net/~jcm/8242895/webrev_deopt/
>>>>>>>>>>
>>>>>>>>>> Please advise.
>>>>>>>>>>
>>>>>>>>>> Testing : mach tier1-5 (logs in jbs)
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>>
>>>>>>>>>> Jamsheed
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> [1] JDK-8231291 
>>>>>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8231291> C2: loop 
>>>>>>>>>> opts before EA should maximally unroll loops
>>>>>>>>>>
>>>>>>>>>> [2] that have its init node in different outer loop wrt to 
>>>>>>>>>> alloc node.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> loop begin
>>>>>>>>>>
>>>>>>>>>>    try{
>>>>>>>>>>
>>>>>>>>>>    return new obj()/  throw new obj()/ uncommon trap after 
>>>>>>>>>> allocation, in a loop
>>>>>>>>>>
>>>>>>>>>>    } catch(ex) {
>>>>>>>>>>
>>>>>>>>>>    }
>>>>>>>>>>
>>>>>>>>>> loop end
>>>>>>>>>>
>>>>>>>>>>     public static IntA test(int n) {
>>>>>>>>>>         for (int i=0; i<2; i++) {
>>>>>>>>>>             try {
>>>>>>>>>>                   return new IntA(n + i);
>>>>>>>>>>             } catch (Exception e) {
>>>>>>>>>>             }
>>>>>>>>>>         }
>>>>>>>>>>
>>>>>>>>>>

From goetz.lindenmaier at sap.com  Thu Jul 16 16:30:23 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 16 Jul 2020 16:30:23 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <AM0PR0202MB3331FF18BED42A71796488E59B600@AM0PR0202MB3331.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
 <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
 <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331EEF36942FCEBA7E131389BCB0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB329746F57D1C78F14000CB799AC80@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331D64C693490FD0746D1989BC90@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <DB6PR0201MB2152AF18921A375D26A76D89ECA40@DB6PR0201MB2152.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331FF18BED42A71796488E59B600@AM0PR0202MB3331.eurprd02.prod.outlook.com>
Message-ID: <AM4PR0202MB29641555B86889D51E08441BEC7F0@AM4PR0202MB2964.eurprd02.prod.outlook.com>

Hi Richard, 

I'll answer to the obvious things in this mail now.
I'll go through the code thoroughly again and write 
a review of my findings thereafter.

> So here is the new webrev.6
> 
> Webrev.6:
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6/
> Delta:
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.inc/
Thanks for the incremental webrev, it's helpful!
 
> I spent most of the time running a microbenchmark [1] I wrote to answer
> questions from your
> review. At first I had trouble with variance in the results until I found out it
> was due to the NUMA
> architecture of the server I used. After that I noticed that there was a
> performance regression of
> about 5% even at low agent activity. I finally found out that it was due to the
> implementation of
> JavaThread::wait_for_object_deoptimization() which is called by the target
> of the JVMTI operation to
> self suspend for object deoptimization. I fixed this by adding limited spinning
> before calling
> wait() on the monitor.
> 
> The delta includes many changes in comments, renaming of names, etc. So
> I'd like to summarize
> functional changes:
> 
> * Collected all the code for the testing feature DeoptimizeObjectsALot in
> compileBroker.cpp and reworked it.
Thanks, this makes it much more compact.

>   With DeoptimizeObjectsALot enabled internal threads are started that
> deoptimize frames and
>   objects. The number of threads started are given with
> DeoptimizeObjectsALotThreadCountAll and
>   DeoptimizeObjectsALotThreadCountSingle. The former targets all existing
> threads whereas the
>   latter operates on a single thread selected round robin.
> 
>   I removed the mode where deoptimizations were performed at every nth
> exit from the runtime. I never used it.

Do I get it right? You have a n:1 and a n:all test scenario.
 n:1:   n threads deoptimize 1 Java thread     where n = DOALThreadCountSingle
 n:all: n threads deoptimize all Java threads where n = DOALThreadCountAll?

> * EscapeBarrier::sync_and_suspend_one(): use a direct handshake and
> execute it always independently
>   of is_thread_fully_suspended().
Is this also a performance optimization?

> * Bugfix in EscapeBarrier::thread_added(): must not clear deopt flag. Found
> this testing with DeoptimizeObjectsALot.
Ok.

> * Added EscapeBarrier::thread_removed().
Ok.

> * EscapeBarrier constructors: barriers can now be entirely disabled by
> disabling DoEscapeAnalysis.
>   This effectively disables the enhancement.
Good!

> * JavaThread::wait_for_object_deoptimization():
>   - Bugfix: the last check of is_obj_deopt_suspend() must be /after/ the
> safepoint check! This
>     caused issues with not walkable stacks with DeoptimizeObjectsALot.
OK. As I understand, there was one safepoint check in the old version, 
now there is one in each iteration.  I assume this is intended, right?

>   - Added limited spinning inspired by HandshakeSpinYield to fix regression in
> microbenchmark [1]
Ok.  Nice improvement, nice catch!
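
For readers of the archive, the "limited spinning before wait()" idea can
be sketched in plain Java as below. This is only a model of the technique,
not the code in the webrev: SpinThenWait, SPIN_LIMIT and the field names
are hypothetical, and the spin bound is an arbitrary assumption.

```java
// Model of spin-then-block; all names and the spin limit are hypothetical.
final class SpinThenWait {
    static final int SPIN_LIMIT = 1_000;        // assumed bound, not tuned

    private final Object monitor = new Object();
    private volatile boolean suspended = true;  // stand-in for the suspend condition

    boolean isSuspended() {
        return suspended;
    }

    // Returns true if the condition cleared while spinning,
    // false if the caller had to block on the monitor.
    boolean awaitResume() {
        for (int i = 0; i < SPIN_LIMIT; i++) {
            if (!suspended) {
                return true;
            }
            Thread.onSpinWait();                // spin hint, available since Java 9
        }
        synchronized (monitor) {
            while (suspended) {                 // re-check guards against spurious wakeups
                try {
                    monitor.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
        }
        return false;
    }

    // Clears the condition and wakes any blocked waiter.
    void resume() {
        synchronized (monitor) {
            suspended = false;
            monitor.notifyAll();
        }
    }
}
```

The point of the bounded spin is that a waiter whose condition clears
quickly never pays for blocking and being woken up again, while a long
suspension still ends up in wait() instead of burning CPU.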

> 
> I refer to some more changes answering your questions and comments inline
> below.
> 
> Thanks,
> Richard.
> 
> [1] Microbenchmark:
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/
> 


> > I understand you annotate at safepoints where the escape analysis
> > finds out that an object is "better" than global escape.
> > These are the cases where the analysis identifies optimization
> > opportunities. These annotations are then used to deoptimize
> > frames and the objects referenced by them.
> > Doesn't this overestimate the optimized
> > objects?  E.g., eliminate_alloc_node has many cases where it bails
> > out.
> 
> Yes, the implementation is conservative, but it is comparatively simple and
> the additional debug
> info is just 2 flags per safepoint. 
Thanks. It also helped that you explained to me offline that 
there are more optimizations than only lock elimination and scalar
replacement done based on the ea information.
The ea refines the IR graph, which allows follow-up optimizations 
that cannot easily be tracked back to the escaping objects or 
the call sites where they do not escape. 
Thus, if there are non-global escaping objects, you have to 
deoptimize the frame.
Did I repeat that correctly?
With this understanding, a number of my proposed renamings/comments
are obsolete.


> On the other hand, those JVMTI operations
> that really trigger
> deoptimizations are expected to be comparatively infrequent such that
> switching to the interpreter
> for a few microseconds will hardly have an effect.
That sounds reasonable.

> I've done microbenchmarking to check this.
> 
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/
> 
> I found that in the worst case performance can be impacted by 10%. If the
> agent is extremely active
> and does relevant JVMTI calls like GetOwnedMonitorStackDepthInfo() every
> millisecond or more often,
> then the performance impact can be 30%. But I would think that this is not
> realistic. These calls
> are issued in interactive sessions to analyze deadlocks.
Ok. 
 
> We could get more precise deoptimizations by adding a third flag per
> safepoint for ea-local objects
> among the owned monitors. This would help improve the worst case in the
> benchmark. But I'm not
> convinced, if it is worth it.
> 
> Refer to the README.txt of the microbenchmark for a more detailed
> discussion.
 
> > pcDesc.hpp
> >
> > I would like to see some documentation of the methods. 
> Done. I didn't take your text, though, because I only noticed it after writing
> my own. Let me know if you are not ok with it.
That's fine. My texts were only proposals, you as author know better
what goes on anyways.

> > scopeDesc.cpp
> >
> >   Besides refactoring copy escape info from pcDesc to scopeDesc
> >   and add accessors. Trivial.
> >
> >   In scopeDesc.hpp you talk about NoEscape and ArgEscape.
> >   These are opto terms, but scopeDesc is a shared datastructure
> >   that does not depend on a specific compiler.
> >   Please explain what is going on without using these terms.
> 
> Actually these are not too opto specific terms. They are used in the paper
> referenced in
> escape.hpp. Also you can easily google them. I'd rather keep the comments
> as they are.
Hmm, I'm not really happy with this, as also the papers
are for the compiler community, and probably not familiar to 
others that work with HotSpot.
But stay with your terms if you think it makes it clearer.
Anyways, now that I understand why you use conservative
information (see above), the descriptions I had in mind are not precise.

> > callnode.hpp
> >
> > You add functionality to annotate callnodes with escape information
> > This is carried through code generation to final output where it is
> > added to the compiled methods meta information.
> >
> > At Safepoints in general jvmti can access
> >   - Objects that were scalar replaced. They must be reallocated.
> >     (Flag EliminateAllocations)
> >   - Objects that should be locked but are not because they never
> >     escape the thread. They need to be relocked.
> >
> > At calls, Objects where locks have been removed escape to callees.
> > We must persist this information so that if jvmti accesses the
> > object in a callee, we can determine by looking at the caller that
> > it needs to be relocked.
> 
> Note that the ea-optimization need not be at the current location, it can also
> follow when control
> returns to the caller. Lock elimination isn't the only relevant optimization.
Yes, I understood now, see above. Thanks for explaining.
> Accesses to instance
> members or array elements can be optimized as well.
You mean the compiler can/will ignore volatile or memory ordering
requirements for non-escaping objects? Sounds reasonable to do.

> > // Returns true if at least one of the arguments to the call is an oop
> > // that does not escape globally.
> > bool ConnectionGraph::has_arg_escape(CallJavaNode* call) {
> 
> IMHO the method names are descriptive and don't need the comments. But I
> give in :) (only replaced
> "oop" with "object")
Thanks. Yes, object is better than oop.

> You are right, it is not correct how flags are checked. Especially if only
> running with the JVMCI compiler.
>
> I changed Deoptimization::deoptimize_objects_internal() to make
> reallocation and relocking dependent
> on similar checks as in Deoptimization::fetch_unroll_info_helper().
> Furthermore EscapeBarriers are
> conditionally activated depending on the following (see EscapeBarrier ctors):
> 
> JVMCI_ONLY(UseJVMCICompiler) NOT_JVMCI(false)
> COMPILER2_PRESENT(|| DoEscapeAnalysis)
> 
> So the enhancement can be practically completely disabled by disabling
> DoEscapeAnalysis, which is
> what C2 currently does if JVMTI capabilities that allow access to local
> references are taken.
Thanks for fixing. 

> I went for the latter.
> 
> > In fetch_unroll_info_helper, I don't understand why you need
> >  && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
> > for eliminated locks, but not for scalar replaced objects?
> 
> In short reallocation is idempotent, relocking is not.
> 
> Without the enhancement Deoptimization::realloc_objects() can already be
> called more than once for a frame:
> 
> First call in materializeVirtualObjects() (also iterateFrames()).
> 
> Second (indirect) call in fetch_unroll_info_helper().
> 
> The objects from the first call are saved as jvmti deferred updates when
> realloc_objects()
> returns. Note that there is no relationship to jvmti. The thing in common is
> that updates cannot be
> directly installed into a compiled frame, it is necessary to deoptimize the
> frame and defer the
> updates until the compiled frame gets replaced. Every time the vframes
> corresponding to the owner
> frame are iterated, they get the deferred updates. So in
> fetch_unroll_info_helper() the
> GrowableArray<compiledVFrame*>* chunk reference them too. All
> references to the objects created by
> the second (indirect) call to realloc_objects() are never used, because
> compiledVFrame accessors to
> locals, expressions, and monitors override them with the deferred updates.
> The objects become
> unreachable and get gc'ed.
OK, so repeatedly computed vFrames always have the first version of 
the reallocated objects by construction, so this need not be handled here.
But also by construction, objects might be allocated just to be
discarded.
 
> materializeVirtualObjects() does not bother with relocking.
> deoptimize_objects_internal(), which is
> introduced by the enhancement, does relock objects, after all the lock
> elimination becomes illegal 
> with the change in escape state. Relocking twice does not work, so the
> enhancement avoids it by
> checking EscapeBarrier::objs_are_deoptimized(thread, deoptee.id()).
> 
> Note that materializeVirtualObjects() can be called more than once and will
> always return the very
> same objects, even though it calls realloc_objects() again.
Ok.
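
The difference between the two operations can be illustrated with a small
model. This is only a sketch in Java under stated assumptions, not HotSpot
code: FrameModel, reallocObjects() and relockObjects() are hypothetical names,
and the boolean flag merely stands in for the
EscapeBarrier::objs_are_deoptimized(thread, deoptee.id()) check.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of one compiled frame; not the HotSpot data structures.
class FrameModel {
    private List<Object> deferredObjects;   // objects kept from the first reallocation
    private boolean objsAreDeoptimized;     // stands in for objs_are_deoptimized(...)
    private int lockCount;                  // recursion count of the modelled monitor

    // Reallocation may run repeatedly: objects created by later calls are
    // dropped, because the deferred objects of the first call take precedence.
    List<Object> reallocObjects() {
        List<Object> fresh = new ArrayList<>();
        fresh.add(new Object());
        if (deferredObjects == null) {
            deferredObjects = fresh;        // the first call wins
        }
        return deferredObjects;             // otherwise 'fresh' becomes garbage
    }

    // Relocking changes the lock state on every call, so it is guarded.
    void relockObjects() {
        if (objsAreDeoptimized) {
            return;                         // already relocked, do nothing
        }
        lockCount++;
        objsAreDeoptimized = true;
    }

    int lockCount() {
        return lockCount;
    }
}
```

Calling reallocObjects() twice hands out the same objects, while
relockObjects() only takes effect on the first call because of the guard.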


> > I would guess it is because the eliminated locks can be applied to
> > argEscape, but scalar replacement only to noescape objects?
> > I.e. it might have been done before?
> >
> > But why isn't this the case for eliminate_allocations?
> > deoptimize_objects_internal does both unconditionally,
> > so both can happen to inner frames, right?
> 
> Sorry, I don't quite understand. Hope the explanation above helps.
Yes.  I was guessing wrong :)

> >   I like if boolean operators are at the beginning of broken lines,
> >   but I think hotspot convention is to have them at the end.
> Ok, fixed.
Thanks.

> 
> > Code will get much more simple if BiasedLocking is removed.
> >
> > EscapeBarrier:: ...
> >
> > (This class maybe would qualify for a file of its own.)
> >
> > deoptimize_objects()
> > I would mention escape analysis only as side remark.  Also, as I understand,
> > there is only one frame at given depth?
> > // Deoptimize frames with optimized objects. This can be omitted locks and
> > // objects not allocated but replaced by scalars. In C2, these optimizations
> > // are based on escape analysis.
> > // Up to depth, deoptimize frames with any optimized objects.
> > // From depth to entry_frame, deoptimize only frames that
> > // pass optimized objects to their callees.
> > (First part similar for the comment above
> > EscapeBarrier::deoptimize_objects_internal().)
> 
> I've reworked the comment. Let me know if you still think it needs to be
> improved.
Good now, thanks (maybe break the long line ...)


> > What is the check (cur_depth <= depth) good for? Can you
> > ever walk past entry_frame?
> 
> Yes (assuming you mean the outer while-statement), there are java frames
> beyond the entry frame if a
> native method calls java methods again. So we visit all frames up to the given
> depth and from there
> we continue to the entry frame. It is not necessary to continue beyond that
> entry frame, because
> escape analysis assumes that arguments to native functions escape globally.
> 
> Example: Let the java stack look like this:
> 
> +---------+
> | Frame A |
> +---------+
> | Frame N |
> +---------+
> | Frame B |
> +---------+ <- top of stack
> 
> Where java method A calls native method N and N calls java method B.
> 
> Very simplified the native stack will look like this
> 
> +-------------------------+
> | Frame of JIT Compiled A |
> +-------------------------+
> | Frame N                 |
> +-------------------------+
> | Entry Frame             |
> +-------------------------+
> | Frame B                 |
> +-------------------------+ <- top of stack
> 
> The entry frame is an activation of the call stub, which is a small assembler
> routine that
> translates from the native calling convention to the java calling convention.
> 
> There cannot be any ArgEscape that is passed to B (see above), therefore we
> can stop the stackwalk
> at the entry frame if depth is 1. If depth is 3 we have to continue to Frame A,
> as it is directly
> accessed. 
Ok, thanks, nice explanation!!
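
The walk described above can be sketched as a small model. This is a
simplification in Java, not the HotSpot code: StackWalkModel, Frame and
framesToVisit() are hypothetical names, depth is assumed to count frames of
the java stack (B = 1, N = 2, A = 3), and the model only shows which frames
the walk looks at, not which of them actually get deoptimized.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of the stack walk; all names are hypothetical.
class StackWalkModel {
    enum Kind { JAVA, NATIVE, ENTRY }

    static final class Frame {
        final String name;
        final Kind kind;

        Frame(String name, Kind kind) {
            this.name = name;
            this.kind = kind;
        }
    }

    // The example stack from the mail, top of stack first: B, entry frame, N, A.
    static List<Frame> exampleStack() {
        return Arrays.asList(
            new Frame("B", Kind.JAVA),
            new Frame("Entry", Kind.ENTRY),
            new Frame("N", Kind.NATIVE),
            new Frame("A", Kind.JAVA));
    }

    // Returns the names of the java stack frames the walk visits.
    static List<String> framesToVisit(List<Frame> stackTopFirst, int depth) {
        List<String> visited = new ArrayList<>();
        int seen = 0;
        for (Frame f : stackTopFirst) {
            if (f.kind == Kind.ENTRY) {
                if (seen >= depth) {
                    break;      // requested depth is covered: stop at the entry frame
                }
                continue;       // depth not reached yet: walk past the entry frame
            }
            seen++;
            visited.add(f.name);
        }
        return visited;
    }
}
```

With depth 1 the model visits only B and stops at the entry frame; with
depth 3 it walks past the entry frame and visits B, N and A.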

> > Isn't vf->is_compiled_frame() prerequisite that "Move to next physical frame"
> > is needed? You could move it into the other check.
> > If so, similar for deoptimize_objects_all_threads().
> 
> Only compiledVFrame require moving to the /top/ frame. Fixed.
Thanks, this looks better.

> > Synchronization: looks good. I think others had a look at this before.
> >
> > EscapeBarrier::deoptimize_objects_internal()
> >   The method name is misleading, it is not used by
> >   deoptimize_objects().
> >   Also, a method with the same name is in Deoptimization.
> >   Proposal: deoptimize_objects_thread() ?
> 
> Sorry, but I don't see, why it would be misleading.
> What would be the meaning of 'deoptimize_objects_thread'? I don't
> understand that name.
1. I have no idea why it's called "_internal". Because it is private?
   By the name, I would expect that EscapeBarrier::deoptimize_objects()
   calls it for some internal tasks. But it does not.
2. My proposal: deoptimize_objects_all_threads() iterates all threads 
and calls deoptimize_objects(_one)_thread(thread) for each of these.
That's how I would have named it. 
But no bike shedding, if you don't see what I mean it's not obvious.


> > C1 stubs: this really shows you tested all configurations, great!
> >
> >
> > mutexLocker: ok.
> > objectMonitor.cpp: ok
> > stackValue.hpp   Is this missing clearing a bug?
> 
> In short: that change is not needed anymore. I'll remove it again.
Good. Thanks for the details.

> > Renaming deferred_locals to deferred_updates is good, as well as
> > adding a datastructure for it.
> > (Adding this data structure might be a breakout, too.)
> >
> > good.
> >
> > thread.cpp
> >
> > good.
> >
> > vframe.cpp
> >
> > Is this a bug in existing code?
> > Makes sense.
> 
> Depends on your definition of bug. There are no references to
> vframe::is_entry_frame() in the
> existing code. I would think it is a bug.
So it is :)

> 
> >
> > vframe_hp.hpp
> > (What stands _hp for? helper? The file should be named
> > compiledVFrame ...)
> >
> > not_global_escape_in_scope() ...
> > Again, you mention escape analysis here. Comments above hold, too.
> 
> I think it is the right name, because it is meaningful and simple.
Ok, accepted ... given my understandings from above.

> 
> > You introduce JvmtiDeferredUpdates. Good.
> >
> > vframe_hp.cpp
> >
> > Changes for JvmtiDeferredUpdates, escape state accessors,
> >
> > line 422:
> > Would an assertion assert(!info->owner_is_scalar_replaced(), ...) hold here?
> >
> >
> > macros.hpp
> >   Good.
> >
> >
> > Test coding
> > ============
> >
> > compileBroker.h|cpp
> >
> > You introduce a third class of threads handled here and
> > add a new flag to distinguish it. Before, the two kinds
> > of threads were distinguished implicitly by passing in
> > a compiler for compiler threads.
> > The new thread kind is only used for testing in debug.
> >
> > make_thread:
> > You could assert (comp != NULL...) to assure previous
> > conditions.
> 
> I replaced the if-statements with a switch-statement, made sure all
> enum-elements are covered, and added the assertion you suggested.
> 
> > line 989 indentation broken
> 
> You are referring to this block I assume:
> (from
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/src/hotspot/share/compiler/compileBroker.cpp.frames.html)
> 
>  976   if (MethodFlushing) {
>  977     // Initialize the sweeper thread
>  978     Handle thread_oop = create_thread_oop("Sweeper thread", CHECK);
>  979     jobject thread_handle = JNIHandles::make_local(THREAD, thread_oop());
>  980     make_thread(sweeper_t, thread_handle, NULL, NULL, THREAD);
>  981   }
>  982
>  983 #if defined(ASSERT) && COMPILER2_OR_JVMCI
>  984   if (DeoptimizeObjectsALot == 2) {
>  985     // Initialize and start the object deoptimizer threads
>  986     for (int thread_count = 0; thread_count < DeoptimizeObjectsALotThreadCount; thread_count++) {
>  987       Handle thread_oop = create_thread_oop("Deoptimize objects a lot thread", CHECK);
>  988       jobject thread_handle = JNIHandles::make_local(THREAD, thread_oop());
>  989       make_thread(deoptimizer_t, thread_handle, NULL, NULL, THREAD);
>  990     }
>  991   }
>  992 #endif // defined(ASSERT) && COMPILER2_OR_JVMCI
> 
> I cannot really see broken indentation here. Am I looking at the wrong
> location?
I don't have the source version I reviewed last time any more, so 
I can't check. But maybe an artefact from patching ... if there were
tabs jcheck would have told you, so that's not it. No problem.

Best regards,
  Goetz.

From igor.ignatyev at oracle.com  Thu Jul 16 17:05:17 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Thu, 16 Jul 2020 10:05:17 -0700
Subject: [15] RFR(T) : 8249622 : use 8249621 to ignore 8 jvmci tests
Message-ID: <FEC8BE1D-EF0F-4B81-8831-527C60A1F4FF@oracle.com>

http://cr.openjdk.java.net/~iignatyev//8249622/webrev.00/
> 2 lines changed: 0 ins; 0 del; 12 mod;


Hi all,

could you please review this trivial patch which updates the @ignore tag in 8 jvmci tests to follow common practice and have a bug id?

from JBS:
> JDK-8220623 added @ignore to 8 jvmci tests but didn't provide any bug id, JDK-8249621 has been created to address the problem w/ the tests, this issue is to change @ignore to be followed by 8249621.

JBS: https://bugs.openjdk.java.net/browse/JDK-8249622
webrev: http://cr.openjdk.java.net/~iignatyev//8249622/webrev.00/

Thanks,
-- Igor


8220623: https://bugs.openjdk.java.net/browse/JDK-8220623
8249621: https://bugs.openjdk.java.net/browse/JDK-8249621

From ekaterina.pavlova at oracle.com  Thu Jul 16 17:28:05 2020
From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova)
Date: Thu, 16 Jul 2020 10:28:05 -0700
Subject: [15] RFR(T) : 8249622 : use 8249621 to ignore 8 jvmci tests
In-Reply-To: <FEC8BE1D-EF0F-4B81-8831-527C60A1F4FF@oracle.com>
References: <FEC8BE1D-EF0F-4B81-8831-527C60A1F4FF@oracle.com>
Message-ID: <5a2c1162-ed38-bbe6-4192-36539243800b@oracle.com>

Looks good,

-katya

On 7/16/20 10:05 AM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev//8249622/webrev.00/
>> 2 lines changed: 0 ins; 0 del; 12 mod;
> 
> 
> Hi all,
> 
> could you please review this trivial patch which updates @ignore tag in 8 jvmci to follow common practice and have a bug id?
> 
> from JBS:
>> JDK-8220623 added @ignore to 8 jvmci tests but didn't provide any bug id, JDK-8249621 has been created to address the problem w/ the tests, this issue is to change @ignore to be followed by 8249621.
> 
> JBS: https://bugs.openjdk.java.net/browse/JDK-8249622
> webrev: http://cr.openjdk.java.net/~iignatyev//8249622/webrev.00/
> 
> Thanks,
> -- Igor
> 
> 
> 8220623: https://bugs.openjdk.java.net/browse/JDK-8220623
> 8249621: https://bugs.openjdk.java.net/browse/JDK-8249621
> 


From vladimir.kozlov at oracle.com  Thu Jul 16 19:56:19 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 16 Jul 2020 12:56:19 -0700
Subject: [15] RFR(T) : 8249622 : use 8249621 to ignore 8 jvmci tests
In-Reply-To: <FEC8BE1D-EF0F-4B81-8831-527C60A1F4FF@oracle.com>
References: <FEC8BE1D-EF0F-4B81-8831-527C60A1F4FF@oracle.com>
Message-ID: <8f0c6d12-7783-5b4e-f79b-554d31337258@oracle.com>

Good.

Thanks,
Vladimir K

On 7/16/20 10:05 AM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev//8249622/webrev.00/
>> 2 lines changed: 0 ins; 0 del; 12 mod;
> 
> 
> Hi all,
> 
> could you please review this trivial patch which updates @ignore tag in 8 jvmci to follow common practice and have a bug id?
> 
> from JBS:
>> JDK-8220623 added @ignore to 8 jvmci tests but didn't provide any bug id, JDK-8249621 has been created to address the problem w/ the tests, this issue is to change @ignore to be followed by 8249621.
> 
> JBS: https://bugs.openjdk.java.net/browse/JDK-8249622
> webrev: http://cr.openjdk.java.net/~iignatyev//8249622/webrev.00/
> 
> Thanks,
> -- Igor
> 
> 
> 8220623: https://bugs.openjdk.java.net/browse/JDK-8220623
> 8249621: https://bugs.openjdk.java.net/browse/JDK-8249621
> 

From igor.ignatyev at oracle.com  Fri Jul 17 03:04:13 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Thu, 16 Jul 2020 20:04:13 -0700
Subject: [15] RFR(T) : 8249622 : use 8249621 to ignore 8 jvmci tests
In-Reply-To: <8f0c6d12-7783-5b4e-f79b-554d31337258@oracle.com>
References: <FEC8BE1D-EF0F-4B81-8831-527C60A1F4FF@oracle.com>
 <8f0c6d12-7783-5b4e-f79b-554d31337258@oracle.com>
Message-ID: <1C0D0178-80EA-402A-B114-80230B0BB663@oracle.com>

Katya, Vladimir,

thank you for your review, pushed to jdk15.

-- Igor

> On Jul 16, 2020, at 12:56 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> Good.
> 
> Thanks,
> Vladimir K
> 

> On Jul 16, 2020, at 10:28 AM, Ekaterina Pavlova <ekaterina.pavlova at oracle.com> wrote:
> 
> Looks good,
> 
> -katya

> On 7/16/20 10:05 AM, Igor Ignatyev wrote:
>> http://cr.openjdk.java.net/~iignatyev//8249622/webrev.00/
>>> 2 lines changed: 0 ins; 0 del; 12 mod;
>> Hi all,
>> could you please review this trivial patch which updates @ignore tag in 8 jvmci to follow common practice and have a bug id?
>> from JBS:
>>> JDK-8220623 added @ignore to 8 jvmci tests but didn't provide any bug id, JDK-8249621 has been created to address the problem w/ the tests, this issue is to change @ignore to be followed by 8249621.
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249622
>> webrev: http://cr.openjdk.java.net/~iignatyev//8249622/webrev.00/
>> Thanks,
>> -- Igor
>> 8220623: https://bugs.openjdk.java.net/browse/JDK-8220623
>> 8249621: https://bugs.openjdk.java.net/browse/JDK-8249621


From goetz.lindenmaier at sap.com  Fri Jul 17 12:30:40 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Fri, 17 Jul 2020 12:30:40 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <AM4PR0202MB29641555B86889D51E08441BEC7F0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
 <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
 <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331EEF36942FCEBA7E131389BCB0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB329746F57D1C78F14000CB799AC80@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331D64C693490FD0746D1989BC90@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <DB6PR0201MB2152AF18921A375D26A76D89ECA40@DB6PR0201MB2152.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331FF18BED42A71796488E59B600@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM4PR0202MB29641555B86889D51E08441BEC7F0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <AM4PR0202MB2964FAF58FBD21D6705A4418EC7C0@AM4PR0202MB2964.eurprd02.prod.outlook.com>

Hi Richard,

> I'll answer to the obvious things in this mail now.
> I'll go through the code thoroughly again and write
> a review of my findings thereafter.
As promised, a detailed walk-through, but without any major findings:

c1_IR.hpp: ok
ci_Env.h|cpp: ok
compiledMethod.cpp, nmethod.cpp: ok
debugInfoRec.h|cpp: ok
scopeDesc.h|cpp ok

compileBroker.h|cpp: 
Maybe a bit of documentation how and why you start 
the threads? I had expected there are two test
scenarios run after each other, but now I understand 'Single'
and 'All' run simultaneously.  Well, this really is a stress test!
Also good that the two variants of deoptimization are
stressed against each other.
Besides that, it is really nice that it's all in one place.

rootResolver.cpp: ok
jvmciCodeInstaller.cpp: ok

c2compiler.cpp: The essence of this change! Just one line :)
Great!

callnode.hpp ok
escape.h|cpp ok
macro.cpp 
I was not that happy with the names saying not_global_escape
and similar. I now agree you have to use the terms of the escape
analysis (NoEscape, ArgEscape) throughout the runtime code. I'm still not happy with
the 'not' in the term, I always try to expand the name to some
sentence with a negated verb, but it makes no sense.
For example, "has_not_global_escape_in_scope" expands to 
"Hasn't a global escape in its scope." in my thinking, which makes 
no sense. You probably mean
"Has not-global escape in its scope." or "Has {ArgEscape|NoEscape} 
in its scope."

C2 is using the word "non" in this context, e.g., here 
alloc->is_non_escaping.

'non' obviously negates the adjective 'global';
non-global or nonglobal is even an English term I find on the
net.
So what about "has_non_global_escape_in_scope"?

matcher.cpp ok

output.cpp:1071
Please break the long line.

jvmtiCodeBlobEvents.cpp ok

jvmtiEnv.cpp
MaxJavaStackTraceDepth is only documented to affect
the exceptions stack trace depth, not to limit jvmti 
operations. Therefore I wondered why it is used here. 
None of your business, but the flag should
document this in globals.hpp, too.  
Does jvmti specify that the same limits are used ...?
ok on your side.

jvmtiEnvBase.cpp  ok
jvmtiImpl.h|cpp  ok
jvmtiTagMap.cpp ok
whitebox.cpp ok

deoptimization.cpp

line 177: Please break line
line 246, 281: Please break line
1578, 1583, 1589, 1632, 1649, 1651 Break line

1651: You use 'non'-terms, too: non-escaping :)

2805, 2929, 2946ff, break lines

deoptimization.hpp

158, 174, 176 ... I would break lines too, but here you are in
good company :)

globals.hpp ok
mutexLocker.h|cpp ok
objectMonitor.cpp ok

thread.cpp 

2631 typo: sapfepont --> safepoint

thread.hpp ok
thread.inline.hpp ok
vframe.cpp ok
vframe_hp.cpp   458ff break lines
vframe_hp.hpp ok
macros.hpp ok
TEST.ROOT ok
WhiteBox.java ok

IterateHeapWithEscapeAnalysisEnabled.java

line 415:
msg("wait until target thread has set testMethod_result");
while (testMethod_result == 0) {
    Thread.sleep(50);
}
Might the test run into timeouts at this place?
The field is volatile, i.e. it will be reloaded
in each iteration. But will dontinline_testMethod
write it back to main memory in time?
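
On the visibility part of the question: under the Java memory model a write
to a volatile field happens-before every subsequent read of that field, so a
polling loop like the one above observes the store once it has been executed.
Whether the real test can time out therefore depends on when the target
thread reaches the store, which the following sketch does not model. All
names (VolatilePollSketch, result, waitForResult) are hypothetical, and the
deadline is an addition that is not part of the test.

```java
// Minimal sketch of polling a volatile field with an upper time bound.
class VolatilePollSketch {
    // Stands in for testMethod_result; volatile as in the test.
    static volatile int result = 0;

    // Polls like the loop in the test, but gives up after timeoutMillis.
    static boolean waitForResult(long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (result == 0) {
            if (System.nanoTime() - deadline > 0) {
                return false;               // deadline passed, store not seen
            }
            Thread.sleep(50);
        }
        return true;                        // the volatile store became visible
    }

    // Starts a thread that performs the store and waits for it.
    static boolean demo() {
        result = 0;
        Thread target = new Thread(() -> result = 42);
        target.start();
        try {
            boolean seen = waitForResult(5_000);
            target.join();
            return seen;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A bounded wait of this kind turns a potential hang into an explicit failure,
at the price of having to choose a timeout value.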

libIterateHeapWithEscapeAnalysisEnabled.c ok

EATests.java

This is a very elaborate test.
I found a series of test cases illustrating issues
we talked about before. Really helpful!

1311: Typo: materialize -> materialized

1640: setting local variable i triggers always deoptimization
  --> setting local variable i always triggers deoptimization

2176: dontinline_calee --> dontinline_callee
2510: poping --> popping  ... but I'm not sure here.

https://www.urbandictionary.com/define.php?term=poping
poping
Drinking large amounts of Dextromethorphan Hydrobromide (DXM)based cough syrup, and then embarking on an adventure while wandering around neighborhoods or parks all night. This is usually done while listening to Punk rock music from a portable jambox. 
;)
Don't do it!

EATestsJVMTI.java

I think you can just copy this test description into the other
test. You can have two @test comments, they will be treated
as separate tests.  The @requires will be evaluated accordingly.
For an example see 
test/hotspot/jtreg/runtime/exceptionMsgs/NullPointerException/NullPointerExceptionTest.java
which has two different compile setups for the test class (-g).
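
A minimal sketch of such a file, assuming a hypothetical test class named
TwoSetupsSketch (the class name, the summaries and the hasDebugInfo argument
are made up for illustration; only the layout with two @test comments follows
the example mentioned above). Each comment block can carry its own tags, for
instance its own @requires line.

```java
/*
 * @test
 * @summary First setup: test class compiled with debug information (-g).
 * @compile -g TwoSetupsSketch.java
 * @run main/othervm TwoSetupsSketch hasDebugInfo
 */

/*
 * @test
 * @summary Second setup: same test class compiled without -g.
 * @compile TwoSetupsSketch.java
 * @run main/othervm TwoSetupsSketch
 */

public class TwoSetupsSketch {
    public static void main(String[] args) {
        // The argument passed via @run tells the test which setup is active.
        boolean hasDebugInfo = args.length > 0 && args[0].equals("hasDebugInfo");
        System.out.println("running with debug info: " + hasDebugInfo);
    }
}
```

jtreg treats the two comment blocks as two separate tests that share the
same source file.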

so, that's it for reading code ...


Some general remarks, maybe a bit picky ...:
I think you could use fewer commas ',' in comments.
As I understand, you need a comma if the relative
sentence is at the beginning, but not if it is at 
the end:
  If Corona is over, I go to the office.
but
  I go to the office if Corona is over.
I think the same holds for 'because', 'while' etc.
E.g., jvmtiEnvBase.cpp:1313, jvmtiImpl.cpp:646ff, 
vframe_hp.hpp 104ff

Also, I like full sentences in comments.  
Especially for me as a foreign speaker, this makes
things much more clear. I.e., I try to make it
a real sentence with articles, capitalized and a
dot at the end if there is a subject and a verb
in first place.
E.g., jvmtiEnvBase.cpp:1327
In many places, your comments read really 
well but some are quite abbreviated I think.

E.g. thread.cpp:2601 is an example where a simple
'a' helps a lot.
"Single deoptimization is typically very short."
I would add 'A': "A single deoptimization is typically very short (fast?)."
Another meaning of the comment I first considered is this:
"Single deoptimization is typically very short, all_threads deoptimization takes longer"
having in mind the functions
EscapeBarrier::deoptimize_objects_all_threads()
and
EscapeBarrier::deoptimize_objects() doing a single thread.
German with its compound nouns is helpful here :)

Einzeldeoptimierung <--> eine einzelne Deoptimierung

Best regards,
  Goetz.


From igor.ignatyev at oracle.com  Fri Jul 17 17:22:04 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Fri, 17 Jul 2020 10:22:04 -0700
Subject: [15] RFR(T) : 8249673 : cleanup graal problem lists
Message-ID: <2564EBA5-2F22-4105-B5AE-984018F7D8C2@oracle.com>

http://cr.openjdk.java.net/~iignatyev//8249673/webrev.00
> 21 lines changed: 0 ins; 5 del; 16 mod; 


Hi all,

could you please review this clean up of ProblemList-graal.txt in hotspot and jdk test suites?

from JBS:
> graal problem-lists list several already closed bugs: 
>  - JDK-8193210 fixed in jdk15-b17 
>  - JDK-8244656, JDK-8204347, JDK-8230419, JDK-8181833 closed as dup of JDK-8207267


JBS: https://bugs.openjdk.java.net/browse/JDK-8249673
webrev: http://cr.openjdk.java.net/~iignatyev//8249673/webrev.00
testing: 
 - jdk/jfr/event/compiler/ tests w/ Graal as JIT
 - grep-ed for bug ids

Thanks,
-- Igor

JDK-8193210 : https://bugs.openjdk.java.net/browse/JDK-8193210
JDK-8244656 : https://bugs.openjdk.java.net/browse/JDK-8244656
JDK-8204347 : https://bugs.openjdk.java.net/browse/JDK-8204347
JDK-8230419 : https://bugs.openjdk.java.net/browse/JDK-8230419

JDK-8207267 : https://bugs.openjdk.java.net/browse/JDK-8207267


From vladimir.kozlov at oracle.com  Fri Jul 17 17:29:42 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 17 Jul 2020 10:29:42 -0700
Subject: [15] RFR(T) : 8249673 : cleanup graal problem lists
In-Reply-To: <2564EBA5-2F22-4105-B5AE-984018F7D8C2@oracle.com>
References: <2564EBA5-2F22-4105-B5AE-984018F7D8C2@oracle.com>
Message-ID: <4d6fe5f2-b947-50cf-0f51-6f8f218e1fad@oracle.com>

LGTM

Thanks,
Vladimir K

On 7/17/20 10:22 AM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev//8249673/webrev.00
>> 21 lines changed: 0 ins; 5 del; 16 mod;
> 
> 
> Hi all,
> 
> could you please review this clean up of ProblemList-graal.txt in hotspot and jdk test suites?
> 
> from JBS:
>> graal problem-lists list several already closed bugs:
>>   - JDK-8193210 fixed in jdk15-b17
>>   - JDK-8244656, JDK-8204347, JDK-8230419, JDK-8181833 closed as dup of JDK-8207267
> 
> 
> JBS: https://bugs.openjdk.java.net/browse/JDK-8249673
> webrev: http://cr.openjdk.java.net/~iignatyev//8249673/webrev.00
> testing:
>   - jdk/jfr/event/compiler/ tests w/ Graal as JIT
>   - grep-ed for bug ids
> 
> Thanks,
> -- Igor
> 
> JDK-8193210 : https://bugs.openjdk.java.net/browse/JDK-8193210
> JDK-8244656 : https://bugs.openjdk.java.net/browse/JDK-8244656
> JDK-8204347 : https://bugs.openjdk.java.net/browse/JDK-8204347
> JDK-8230419 : https://bugs.openjdk.java.net/browse/JDK-8230419
> 
> JDK-8207267 : https://bugs.openjdk.java.net/browse/JDK-8207267
> 

From igor.ignatyev at oracle.com  Fri Jul 17 17:51:26 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Fri, 17 Jul 2020 10:51:26 -0700
Subject: [15] RFR(T) : 8249678 : @ignore should be used instead of ProblemList
 for 8158860, 8163894, 8193479, 8194310
Message-ID: <1BB411DA-3695-4CA4-B77D-9B834D03BEF4@oracle.com>

http://cr.openjdk.java.net/~iignatyev//8249678/webrev.00
> 10 lines changed: 4 ins; 5 del; 1 mod;

Hi all,

could you please review this trivial clean up which replaces ProblemList entries w/ @ignore tag in tests which aren't runnable?

- compiler/jvmci/compilerToVM/GetResolvedJavaTypeTest.java isn't runnable due to 8158860
- compiler/jvmci/compilerToVM/InvalidateInstalledCodeTest.java isn't runnable due to 8163894 
- compiler/codegen/Test6896617.java isn't runnable due to 8193479 
- compiler/c2/Test6852078.java isn't runnable due to 8194310 

from main bug(8249618):
> although ProblemList and @ignore achieve the same end result (test exclusion), they serve different goals and have slightly different meanings; simplified, @ignore should be used to exclude useless or harmful tests, and ProblemList in all other cases (see the not-yet-integrated `ProblemListing or @ignore-ing a Test` section of the dev guide, PR -- https://github.com/openjdk/guide/pull/21 for more details).
> 
> due to different reasons, this hasn't been always followed and some currently @ignore-d tests should rather be ProblemList-ed, and some of ProblemList-ed should be @ignore-d, this issue is to clean up the current state in a hope that this will reduce further confusion. 


JBS: https://bugs.openjdk.java.net/browse/JDK-8249678
webrev: http://cr.openjdk.java.net/~iignatyev//8249678/webrev.00

Thanks,
-- Igor

8249618 : https://bugs.openjdk.java.net/browse/JDK-8249618

8158860 : https://bugs.openjdk.java.net/browse/JDK-8158860
8163894 : https://bugs.openjdk.java.net/browse/JDK-8163894
8193479 : https://bugs.openjdk.java.net/browse/JDK-8193479
8194310 : https://bugs.openjdk.java.net/browse/JDK-8194310


From sandhya.viswanathan at intel.com  Fri Jul 17 18:32:04 2020
From: sandhya.viswanathan at intel.com (Viswanathan, Sandhya)
Date: Fri, 17 Jul 2020 18:32:04 +0000
Subject: RFR (XXL): 8223347: Integration of Vector API (Incubator):
 Hotspot and x86 backend changes
Message-ID: <BYAPR11MB3543026ABF92A463562E5DE5EF7C0@BYAPR11MB3543.namprd11.prod.outlook.com>

Hi Vladimir and Coleen,

We are getting ready to propose to target Vector API to JDK 16. 
Please find below the updated hotspot and x86 backend changes:

Shared Hotspot:
   Full: http://cr.openjdk.java.net/~sviswanathan/VAPI_RFR/hs_webrev/webrev.01/
   Incremental: http://cr.openjdk.java.net/~sviswanathan/VAPI_RFR/hs_webrev/webrev.00-webrev.01/

X86:
   Full: http://cr.openjdk.java.net/~sviswanathan/VAPI_RFR/x86_webrev/webrev.01/
   Incremental: http://cr.openjdk.java.net/~sviswanathan/VAPI_RFR/x86_webrev/webrev.00-webrev.01/

Older webrev links for your reference:
Shared Hotspot: http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/
X86 backend: http://cr.openjdk.java.net/~sviswanathan/VAPI_RFR/x86_webrev/webrev.00/
To get incremental webrev, I had to do some adjustments to these to be able to apply it to the jdk tip.

Please let us know your feedback and if we have ok from you to propose to target to JDK 16. 

Best Regards,
Sandhya

-----Original Message-----
From: Vladimir Kozlov <vladimir.kozlov at oracle.com> 
Sent: Friday, May 01, 2020 6:05 PM
To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; hotspot-compiler-dev at openjdk.java.net; core-libs-dev at openjdk.java.net; hotspot-dev <hotspot-dev at openjdk.java.net>
Subject: Re: RFR (XXL): 8223347: Integration of Vector API (Incubator): x86 backend changes

On 5/1/20 5:55 PM, Viswanathan, Sandhya wrote:
> Hi Vladimir,
> 
> Thanks a lot for the feedback.
> 
> We used an old existing separate branch to share the code for review and to track changes.
> We didn't know how to change the name of the branch from vector-unstable to vector-stable.

Good to know that it does not mean that code is "unstable" ;)

Katya filed today new bug [1]. Please look.

Regards,
Vladimir

[1] https://bugs.openjdk.java.net/browse/JDK-8244269

> 
> Best Regards,
> Sandhya
> 
> -----Original Message-----
> From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
> Sent: Friday, May 01, 2020 5:32 PM
> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; hotspot-compiler-dev at openjdk.java.net; core-libs-dev at openjdk.java.net; hotspot-dev <hotspot-dev at openjdk.java.net>
> Subject: Re: RFR (XXL): 8223347: Integration of Vector API (Incubator): x86 backend changes
> 
> Changes seems fine. Nice work.
> 
> Why it is called "vector-unstable branch"?
> 
> Thanks,
> Vladimir K
> 
> On 4/3/20 5:16 PM, Viswanathan, Sandhya wrote:
>> Hi,
>>
>>
>> Following up on review requests of API [0], Java implementation [1] and
>>
>> General Hotspot changes [2] for Vector API, here's a request for review
>>
>> of x86 backend changes required for supporting the API:
>>
>>
>>
>> JEP: https://openjdk.java.net/jeps/338
>>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8223347
>>
>> Webrev:http://cr.openjdk.java.net/~sviswanathan/VAPI_RFR/x86_webrev/webrev.00/
>>
>>
>>
>> Complete implementation resides in vector-unstable branch of
>>
>> panama/dev repository [3].
>>
>> Looking forward to your feedback.
>>
>> Best Regards,
>> Sandhya
>>
>>
>> [0]  https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-March/065345.html
>>
>>
>>
>> [1]  https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-April/065587.html
>>
>>
>>
>> [2]  https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-April/037798.html
>>
>>
>>
>> [3]  https://openjdk.java.net/projects/panama/
>>
>>          $ hg clone http://hg.openjdk.java.net/panama/dev/ -b vector-unstable
>>
>>
>>
>>
>>

From vladimir.kozlov at oracle.com  Fri Jul 17 18:39:57 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 17 Jul 2020 11:39:57 -0700
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
In-Reply-To: <fa60eea1-3d7b-af3a-678d-354263e3eccc@oracle.com>
References: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
 <8325fac5-6258-9b88-6507-5dcb0597cc17@oracle.com>
 <8522a69e-e538-2cc9-5364-887e450fc653@oracle.com>
 <46144d6d-5714-05ad-a263-01507db937cc@oracle.com>
 <7a361c29-4771-9ed4-1542-b3f68a5726f3@oracle.com>
 <6dc4c99b-1d90-09f1-60d1-fb2caf981266@oracle.com>
 <6b4e4dda-01d4-37d0-5403-a4f5481e5bf0@oracle.com>
 <f8e18f7b-bd13-5032-fb17-04fb037c88b4@oracle.com>
 <32d7fb64-75a5-7add-d496-df33cfaefabf@oracle.com>
 <f9f80bf5-a565-3820-efa1-2b949b92d764@oracle.com>
 <fa60eea1-3d7b-af3a-678d-354263e3eccc@oracle.com>
Message-ID: <cd1bfaeb-96a5-73e0-c244-e58b9484ec17@oracle.com>

Yes, I agree with webrev_fix_EA version.

I would suggest to modify TestIdealAllocShape.java test to add new method with synchronization from your example in JBS 
comment. Or add it as separate test.

Thanks,
Vladimir

On 7/16/20 9:19 AM, Jamsheed C M wrote:
> Hi Vladimir,
> I ran performance run for http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA/ (links in JBS)
> I don't see any issues, so i would like to go with webrev_fix_EA if it fixes all the reported issues.
> Best regards,
> Jamsheed
> 
> On 16/07/2020 07:25, Jamsheed C M wrote:
>> Hi Vladimir,
>>
>> On 16/07/2020 00:29, Vladimir Kozlov wrote:
>>> As I said before I agree with your additional checks for StoreN and StoreNKlass.
>>>
>>> But I have concerns about new is_init_captured_store code. EA is mostly looking only on inputs to see Allocation. And 
>>> in several places it is expecting only to see Allocation because other cases should be filtered out before.
>> If that is the case, I would like to go with my first webrev for this fix as it nicely propagates es and there is no 
>> unnecessary promotion to global escape state.
>>
>> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA/
>>
>> Best regards,
>>
>> Jamsheed
>>
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 7/15/20 10:54 AM, Jamsheed C M wrote:
>>>> Hi Vladimir,
>>>>
>>>> with unrolling i understand that many cases will just have phis everywhere to outside the loop as the uses are 
>>>> outside the loop.
>>>>
>>>> and this is not restricted to escaping objects alone as i depicted. it can be escaping as well as non-escaping.
>>>>
>>>> so marking store to them as global escape doesn't seem to be a nice idea. I will rework this fix and get back again.
>>>>
>>>> Thank you
>>>>
>>>> Best regards
>>>>
>>>> Jamsheed
>>>>
>>>> On 15/07/2020 08:38, Jamsheed C M wrote:
>>>>> (unfinished mail got sent, so completing it)
>>>>> On 15/07/2020 08:21, Jamsheed C M wrote:
>>>>>> Hi Vladimir,
>>>>>>
>>>>>> On 15/07/2020 06:50, Vladimir Kozlov wrote:
>>>>>>> I looked more into this. EA already does not scalarize allocations when Phi nodes merge them - it should handle 
>>>>>>> this case. I did a small experiment and relaxed the assert for this new (10. needs comment update) case for AddP's base, 
>>>>>>> and the test passed:
>>>>>>>
>>>>>>> src/hotspot/share/opto/escape.cpp Tue Jul 14 18:11:27 2020 -0700
>>>>>>> @@ -2357,6 +2357,7 @@
>>>>>>>        int opcode = uncast_base->Opcode();
>>>>>>>        assert(opcode == Op_ConP || opcode == Op_ThreadLocal ||
>>>>>>>               opcode == Op_CastX2P || uncast_base->is_DecodeNarrowPtr() ||
>>>>>>> +             (uncast_base->is_Phi() && (uncast_base->bottom_type()->isa_rawptr() != NULL)) ||
>>>>>>>               (uncast_base->is_Mem() && (uncast_base->bottom_type()->isa_rawptr() != NULL)) ||
>>>>>>>               (uncast_base->is_Proj() && uncast_base->in(0)->is_Allocate()), "sanity");
>>>>>>>      }
>>>>>>>
>>>>>>> Did you hit a case when this may not work?
>>>>>>
>>>>>> Yes, right it already doesn't mark it as scalarizable if base count is more than one(I think it missed a is_oop 
>>>>>> check there)[1].
>>>>>>
>>>>>> EA CG adds edges only for oop field making stores to them undetected. This makes these stored objects to NoEscape 
>>>>>> and if compiled method continues execution with this NoEscape object can have undesired results(i.e 
>>>>>> synchronization removed).
>>>>>>
>>>>>> Probable case would be(didn't verify)
>>>>>>
>>>>>> try {
>>>>>>
>>>>>> LOOP BEGIN
>>>>>>
>>>>>>   try {throw new Obj()} catch {}
>>>>>>
>>>>>> LOOP END
>>>>>>
>>>>>> } catch (Obj e) {
>>>>>>
>>>>>> }
>>>>>
>>>>> Best Regards,
>>>>>
>>>>> Jamsheed
>>>>>
>>>>> [1]https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/escape.cpp#L1770
>>>>>
>>>>>
>>>>>>>
>>>>>>>
>>>>>>> And with LoopOpts off -XX:LoopUnrollLimit=0 it removed allocation (-XX:+PrintEscapeAnalysis 
>>>>>>> -XX:+PrintEliminateAllocations):
>>>>>>>
>>>>>>> ======== Connection graph for  Test::test
>>>>>>> JavaObject NoEscape(NoEscape) [ 158F [ 107 ]]   95 Allocate === 242  76  230  8  1 ( 93  92  21  1  78  1 78 ) [[ 
>>>>>>> 96 97 98 105 106  107 ]]  rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, top ) Test::test1 @ bci:0 
>>>>>>> Test::test @ bci:8 !jvms: Test::test1 @ bci:0 Test::test @ bci:8
>>>>>>> LocalVar [ 95P [ 158b ]]   107    Proj    ===  95  [[ 108 158 ]] #5 !jvms: Test::test1 @ bci:0 Test::test @ bci:8
>>>>>>>
>>>>>>> Scalar  95    Allocate    ===  242  76  230  8  1 ( 93 92  21 1 78 1  78 ) [[ 96  97  98  105  106  107 ]] 
>>>>>>> rawptr:NotNull ( int:>=0, java/lang/Object:NotNull *, bool, top ) Test::test1 @ bci:0 Test::test @ bci:8 !jvms: 
>>>>>>> Test::test1 @ bci:0 Test::test @ bci:8
>>>>>>> ++++ Eliminated: 95 Allocate
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vladimir K
>>>>>>>
>>>>>>> On 7/14/20 1:28 AM, Jamsheed C M wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I had incorrectly added extra check in assert after offset computation in address_offset . For addps with non 
>>>>>>>> constant offsets (like [1])
>>>>>>>>
>>>>>>>> Not changing the old assert even though I am not expecting first addp/second addp(for array addressing) case for 
>>>>>>>> init captured store.
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA_asserts_corrected/
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>> Jamsheed
>>>>>>>>
>>>>>>>> [1]
>>>>>>>>
>>>>>>>> assert(offs != Type::OffsetBot ||
>>>>>>>> - adr->in(AddPNode::Address)->in(0)->is_AllocateArray(),
>>>>>>>> + adr->in(AddPNode::Address)->in(0)->is_AllocateArray() || is_captured_store(adr),
>>>>>>>>              "offset must be a constant or it is initialization of array");
>>>>>>>>
>>>>>>>> On 13/07/2020 11:14, Jamsheed C M wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I reworked the fix. I compute the offset for all init captured stores, but treat these special init captured stores 
>>>>>>>>> similarly to unsafe (as these objects are usually GlobalEscape, this doesn't have any perf implications).
>>>>>>>>>
>>>>>>>>> revised webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.01/
>>>>>>>>>
>>>>>>>>> testing: mach1-5( logs in jbs)
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>>
>>>>>>>>> Jamsheed
>>>>>>>>>
>>>>>>>>> On 09/07/2020 19:36, Jamsheed C M wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> request to hold the review. need to change the code for dealing with unsafe access. as current capture code go 
>>>>>>>>>> for more execution time analyzing things.
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>>
>>>>>>>>>> Jamsheed
>>>>>>>>>>
>>>>>>>>>> On 09/07/2020 13:01, Jamsheed C M wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> JBS:https://bugs.openjdk.java.net/browse/JDK-8242895
>>>>>>>>>>>
>>>>>>>>>>> Request for review of changes made to offset computation and field write detection for init captured stores due 
>>>>>>>>>>> to the addition of phis between alloc and init. This happens if the init node is in a different outer loop wrt the alloc node 
>>>>>>>>>>> and there is a loop opt. This was required as a result of enhancement [1].
>>>>>>>>>>>
>>>>>>>>>>> Normally inits are not associated with multiple alloc nodes during the EA phase, but the changes done for [1] caused 
>>>>>>>>>>> code shapes of the form [2] to generate inits associated with multiple alloc nodes.
>>>>>>>>>>>
>>>>>>>>>>> This had implication in offset computation and field write detection related to initializing stores.
>>>>>>>>>>>
>>>>>>>>>>> Attempt to fix in EA:
>>>>>>>>>>>
>>>>>>>>>>>      webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA/
>>>>>>>>>>>
>>>>>>>>>>> Alternate fix:
>>>>>>>>>>>
>>>>>>>>>>>      Minimize the scenario in compiler generated code by throwing only j.l.Error from slowpath(all exception 
>>>>>>>>>>> async/sync are handled in runtime exit).
>>>>>>>>>>>
>>>>>>>>>>>      Stub epilog doesn't poll or throw any exceptions. Disable full loop opt before EA for detectable 
>>>>>>>>>>> patterns and bailout EA for late detected patterns.
>>>>>>>>>>>
>>>>>>>>>>>      webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_deopt/
>>>>>>>>>>>
>>>>>>>>>>> Please advise.
>>>>>>>>>>>
>>>>>>>>>>> Testing : mach tier1-5 (logs in jbs)
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>>
>>>>>>>>>>> Jamsheed
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [1] JDK-8231291 <https://bugs.openjdk.java.net/browse/JDK-8231291>C2: loop opts before EA should maximally 
>>>>>>>>>>> unroll loops
>>>>>>>>>>>
>>>>>>>>>>> [2] that have its init node in different outer loop wrt to alloc node.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> loop begin
>>>>>>>>>>>
>>>>>>>>>>>    try{
>>>>>>>>>>>
>>>>>>>>>>>    return new obj()/ throw new obj()/ uncommon trap after allocation, in a loop
>>>>>>>>>>>
>>>>>>>>>>>    } catch(ex) {
>>>>>>>>>>>
>>>>>>>>>>>    }
>>>>>>>>>>>
>>>>>>>>>>> loop end
>>>>>>>>>>>
>>>>>>>>>>>     public static IntA test(int n) {
>>>>>>>>>>>         for (int i=0; i<2; i++) {
>>>>>>>>>>>             try {
>>>>>>>>>>>                 return new IntA(n + i);
>>>>>>>>>>>             } catch (Exception e) {
>>>>>>>>>>>             }
>>>>>>>>>>>         }
>>>>>>>>>>>
>>>>>>>>>>>

From vladimir.kozlov at oracle.com  Fri Jul 17 18:40:49 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 17 Jul 2020 11:40:49 -0700
Subject: [15] RFR(T) : 8249678 : @ignore should be used instead of
 ProblemList for 8158860, 8163894, 8193479, 8194310
In-Reply-To: <1BB411DA-3695-4CA4-B77D-9B834D03BEF4@oracle.com>
References: <1BB411DA-3695-4CA4-B77D-9B834D03BEF4@oracle.com>
Message-ID: <b3f3b981-91a3-8ed6-63d7-c61baf4b7b78@oracle.com>

Good.

Thanks,
Vladimir

On 7/17/20 10:51 AM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev//8249678/webrev.00
>> 10 lines changed: 4 ins; 5 del; 1 mod;
> 
> Hi all,
> 
> could you please review this trivial clean up which replaces ProblemList entries w/ @ignore tag in tests which aren't runnable?
> 
> - compiler/jvmci/compilerToVM/GetResolvedJavaTypeTest.java isn't runnable due to 8158860
> - compiler/jvmci/compilerToVM/InvalidateInstalledCodeTest.java isn't runnable due to 8163894
> - compiler/codegen/Test6896617.java isn't runnable due to 8193479
> - compiler/c2/Test6852078.java isn't runnable due to 8194310
> 
> from main bug(8249618):
>> although ProblemList and @ignore achieve the same end result (test exclusion), they serve different goals and have slightly different meanings. Simplified: @ignore should be used to exclude useless or harmful tests, and ProblemList in all other cases (see yet-not-integrated `ProblemListing or `@ignore`-ing a Test` section of dev guide, PR -- https://github.com/openjdk/guide/pull/21 for more details).
>>
>> due to different reasons, this hasn't been always followed and some currently @ignore-d tests should rather be ProblemList-ed, and some of ProblemList-ed should be @ignore-d, this issue is to clean up the current state in a hope that this will reduce further confusion.
> 
> 
> JBS: https://bugs.openjdk.java.net/browse/JDK-8249678
> webrev: http://cr.openjdk.java.net/~iignatyev//8249678/webrev.00
> 
> Thanks,
> -- Igor
> 
> 8249618 : https://bugs.openjdk.java.net/browse/JDK-8249618
> 
> 8158860 : https://bugs.openjdk.java.net/browse/JDK-8158860
> 8163894 : https://bugs.openjdk.java.net/browse/JDK-8163894
> 8193479 : https://bugs.openjdk.java.net/browse/JDK-8193479
> 8194310 : https://bugs.openjdk.java.net/browse/JDK-8194310
> 
> 
> 

From vladimir.x.ivanov at oracle.com  Fri Jul 17 18:54:33 2020
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Fri, 17 Jul 2020 21:54:33 +0300
Subject: RFR[S] : 8248830 : C2 : Rotate API intrinsification for X86
In-Reply-To: <MWHPR11MB1614EAFF216144FE6EAE68F9E87F0@MWHPR11MB1614.namprd11.prod.outlook.com>
References: <MWHPR11MB1614EAFF216144FE6EAE68F9E87F0@MWHPR11MB1614.namprd11.prod.outlook.com>
Message-ID: <92d97d1b-fc53-e368-b249-1cab7db33964@oracle.com>

Hi Jatin,

> http://cr.openjdk.java.net/~jbhateja/8248830/webrev_02/

It definitely looks better, but IMO it hasn't reached the sweet spot 
yet. It feels like the focus is on auto-vectorizer while the burden is 
put on scalar cases.

First of all, considering GVN folds relevant operation patterns into a 
single Rotate node now, what's the motivation to introduce intrinsics?

Another point is there's still significant duplication for scalar cases.

I'd prefer to see the legacy cases which rely on pattern matching to go 
away and be substituted with instructions which match Rotate 
instructions (migrating ).

I understand that it will penalize the vectorization implementation, but 
IMO reducing overall complexity is worth it. On auto-vectorizer side, I 
see 2 ways to fix it:

   (1) introduce additional AD instructions for RotateLeftV/RotateRightV 
specifically for pre-AVX512 hardware;

   (2) in SuperWord::output(), when the matcher doesn't support 
RotateLeftV/RotateRightV nodes (Matcher::match_rule_supported()), 
generate a vectorized version of the original pattern.

Overall, it looks like more and more focus is made on scalar part. 
Considering the main goal of the patch is to enable vectorization, I'm 
fine with separating cleanup of scalar part. As an interim solution, it 
seems that leaving the scalar part as it is now and matching scalar bit 
rotate pattern in VectorNode::is_rotate() should be enough to keep the 
vectorization part functioning. Then scalar Rotate nodes and relevant 
cleanups can be integrated later. (Or vice versa: clean up scalar part 
first and then follow up with vectorization.)
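
For readers following along, the "scalar bit rotate pattern" mentioned above is the usual OR-of-shifts idiom. A minimal stand-alone C++ sketch of it follows. This is an illustration only, not HotSpot code; the function names are made up for this example, and the shift count is masked here purely to keep the C++ well-defined for a count of 0.

```cpp
#include <cstdint>

// Illustration only: rotates written as the OR-of-shifts idiom.
// The shift count is masked so that a count of 0 (or a multiple of the
// width) never produces an out-of-range shift.
constexpr uint32_t rotate_left_32(uint32_t x, unsigned s) {
  return (x << (s & 31)) | (x >> ((32 - (s & 31)) & 31));
}

constexpr uint32_t rotate_right_32(uint32_t x, unsigned s) {
  return (x >> (s & 31)) | (x << ((32 - (s & 31)) & 31));
}

constexpr uint64_t rotate_left_64(uint64_t x, unsigned s) {
  return (x << (s & 63)) | (x >> ((64 - (s & 63)) & 63));
}

// Compile-time sanity checks of the idiom.
static_assert(rotate_left_32(0x80000001u, 1) == 0x00000003u, "top bit wraps around");
static_assert(rotate_right_32(0x00000001u, 1) == 0x80000000u, "low bit wraps around");
static_assert(rotate_left_32(0x12345678u, 0) == 0x12345678u, "zero shift is identity");
```

A compiler that recognizes this shape (two opposite shifts of the same value whose counts add up to the bit width, combined with OR) can replace it with a single rotate, which is what the pattern matching discussed in this thread is about.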

Some other comments:

* There's a lot of duplication between OrINode::Ideal and 
OrLNode::Ideal. What do you think about introducing a super type 
(OrNode) and put a unified version (OrNode::Ideal) there?


* src/hotspot/cpu/x86/x86.ad

+instruct vprotate_immI8(vec dst, vec src, immI8 shift) %{
+  predicate(n->bottom_type()->is_vect()->element_basic_type() == T_INT ||
+            n->bottom_type()->is_vect()->element_basic_type() == T_LONG);

+instruct vprorate(vec dst, vec src, vec shift) %{
+  predicate(n->bottom_type()->is_vect()->element_basic_type() == T_INT ||
+            n->bottom_type()->is_vect()->element_basic_type() == T_LONG);

The predicates are redundant here.


* src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp

+void C2_MacroAssembler::vprotate_imm(int opcode, BasicType etype, 
XMMRegister dst, XMMRegister src,
+                                     int shift, int vector_len) {
+  if (opcode == Op_RotateLeftV) {
+    if (etype == T_INT) {
+      evprold(dst, src, shift, vector_len);
+    } else {
+      evprolq(dst, src, shift, vector_len);
+    }

Please, put an assert for the false case (assert(etype == T_LONG, "...")).
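
To illustrate the point, here is a small stand-alone C++ sketch of the same two-way branch with the assert in place. It is not the code from the webrev: the enum and the function are hypothetical stand-ins (the real code uses HotSpot's BasicType and emits instructions rather than returning names).

```cpp
#include <cassert>

// Stand-ins for the element types involved; not HotSpot's BasicType.
enum ElemType { ELEM_INT, ELEM_LONG, ELEM_OTHER };

// Returns the name of the left-rotate instruction the branch would pick.
static const char* select_rotate_left(ElemType etype) {
  if (etype == ELEM_INT) {
    return "evprold";
  } else {
    // Without this assert, any non-int type would silently take the
    // long path; with it, a debug build stops right here.
    assert(etype == ELEM_LONG && "unexpected element type");
    return "evprolq";
  }
}
```

Calling select_rotate_left(ELEM_OTHER) in a debug build trips the assert instead of quietly selecting the 64-bit form.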


* On testing (with previous version of the patch): -XX:UseAVX is 
x86-specific flag, so new/adjusted tests now fail on non-x86 platforms. 
Either omitting the flag or adding -XX:+IgnoreUnrecognizedVMOptions will 
solve the issue.

Best regards,
Vladimir Ivanov

> 
> 
> Summary of changes:
> 1) Optimization is specifically targeted to exploit vector rotation instruction added for X86 AVX512. A single rotate instruction  encapsulates entire vector OR/SHIFTs pattern thus offers better latency at reduced instruction count.
> 
> 2) There were two approaches to implement this:
>      a)  Let everything remain the same and add new wide complex instruction patterns in the matcher for e.g.
>           set Dst ( OrV (Binary (LShiftVI dst (Binary ReplicateI shift)) (URShiftVI dst (Binary (SubI (Binary ReplicateI 32) ( Replicate shift))
>      It would have been an overoptimistic assumption to expect that graph shape would be preserved till the matcher for correct inferencing.
>      In addition we would have required multiple such bulky patterns.
>      b) Create new RotateLeft/RotateRight scalar nodes, these gets generated during intrinsification as well as during additional pattern
>      matching during node Idealization, later on these nodes are consumed by SLP for valid vectorization scenarios to emit their vector
>      counterparts which eventually emits vector rotates.
> 
> 3) I chose approach 2b) since it's cleaner. The only problem here was that in non-evex mode (UseAVX < 3) the new scalar Rotate nodes should either
> be dismantled back to the OR/SHIFT pattern or we penalize the vectorization, which would be very costly. The other option would have been to add an additional vector rotate pattern for UseAVX=3 in the matcher which emits vector OR-SHIFT instructions, but then it would lose the efficient instruction sequence which node sharing (OrV/LShiftV/URShift) offers in the current implementation; thus it would not be beneficial for non-AVX512 targets. The only saving would be the cleanup of a few existing scalar rotate matcher patterns, and old targets do not offer this powerful rotate instruction. Therefore the new scalar nodes are created only for AVX512 targets.
> 
> As per suggestions constant folding scenarios have been covered during Idealizations of newly added scalar nodes.
> 
> Please review the latest version and share your feedback and test results.
> 
> Best Regards,
> Jatin
> 
> 
>> -----Original Message-----
>> From: Andrew Haley <aph at redhat.com>
>> Sent: Saturday, July 11, 2020 2:24 PM
>> To: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>; Bhateja, Jatin
>> <jatin.bhateja at intel.com>; hotspot-compiler-dev at openjdk.java.net
>> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>
>> Subject: Re: 8248830 : RFR[S] : C2 : Rotate API intrinsification for X86
>>
>> On 10/07/2020 18:32, Vladimir Ivanov wrote:
>>
>>   > High-level comment: so far, there was no pressing need to explicitly
>>   > mark the methods as intrinsics. ROR/ROL instructions were selected
>>   > during matching [1]. Now the patch introduces dedicated nodes
>>   > (RotateLeft/RotateRight) specifically for intrinsics, which partly
>>   > duplicates existing logic.
>>
>> The lack of rotate nodes in the IR has always meant that AArch64 doesn't
>> generate optimal code for e.g.
>>
>>     (Set dst (XorL reg1 (RotateLeftL reg2 imm)))
>>
>> because, with the RotateLeft expanded to its full combination of ORs and
>> shifts, it's too complicated to match. At the time I put this to one side
>> because it wasn't urgent. This is a shame because although such
>> combinations are unusual they are used in some crypto operations.
>>
>> If we can generate immediate-form rotate nodes early by pattern matching
>> during parsing (rather than depending on intrinsics) we'll get more value
>> than by depending on programmers calling intrinsics.
>>
>> --
>> Andrew Haley  (he/him)
>> Java Platform Lead Engineer
>> Red Hat UK Ltd. <https://www.redhat.com> https://keybase.io/andrewhaley
>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
> 

From igor.ignatyev at oracle.com  Fri Jul 17 18:57:25 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Fri, 17 Jul 2020 11:57:25 -0700
Subject: [15] RFR(T) : 8249678 : @ignore should be used instead of
 ProblemList for 8158860, 8163894, 8193479, 8194310
In-Reply-To: <b3f3b981-91a3-8ed6-63d7-c61baf4b7b78@oracle.com>
References: <1BB411DA-3695-4CA4-B77D-9B834D03BEF4@oracle.com>
 <b3f3b981-91a3-8ed6-63d7-c61baf4b7b78@oracle.com>
Message-ID: <B4C40AB5-7707-40D0-BECC-512AF0ECEBA6@oracle.com>

thanks Vladimir, pushed.

-- Igor

> On Jul 17, 2020, at 11:40 AM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> Good.
> 
> Thanks,
> Vladimir
> 
> On 7/17/20 10:51 AM, Igor Ignatyev wrote:
>> http://cr.openjdk.java.net/~iignatyev//8249678/webrev.00
>>> 10 lines changed: 4 ins; 5 del; 1 mod;
>> Hi all,
>> could you please review this trivial clean up which replaces ProblemList entries w/ @ignore tag in tests which aren't runnable?
>> - compiler/jvmci/compilerToVM/GetResolvedJavaTypeTest.java isn't runnable due to 8158860
>> - compiler/jvmci/compilerToVM/InvalidateInstalledCodeTest.java isn't runnable due to 8163894
>> - compiler/codegen/Test6896617.java isn't runnable due to 8193479
>> - compiler/c2/Test6852078.java isn't runnable due to 8194310
>> from main bug(8249618):
>>> although ProblemList and @ignore achieve the same end result (test exclusion), they serve different goals and have slightly different meanings. Simplified: @ignore should be used to exclude useless or harmful tests, and ProblemList in all other cases (see yet-not-integrated `ProblemListing or `@ignore`-ing a Test` section of dev guide, PR -- https://github.com/openjdk/guide/pull/21 for more details).
>>> 
>>> due to different reasons, this hasn't been always followed and some currently @ignore-d tests should rather be ProblemList-ed, and some of ProblemList-ed should be @ignore-d, this issue is to clean up the current state in a hope that this will reduce further confusion.
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249678
>> webrev: http://cr.openjdk.java.net/~iignatyev//8249678/webrev.00
>> Thanks,
>> -- Igor
>> 8249618 : https://bugs.openjdk.java.net/browse/JDK-8249618
>> 8158860 : https://bugs.openjdk.java.net/browse/JDK-8158860
>> 8163894 : https://bugs.openjdk.java.net/browse/JDK-8163894
>> 8193479 : https://bugs.openjdk.java.net/browse/JDK-8193479
>> 8194310 : https://bugs.openjdk.java.net/browse/JDK-8194310


From igor.ignatyev at oracle.com  Fri Jul 17 18:57:36 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Fri, 17 Jul 2020 11:57:36 -0700
Subject: [15] RFR(T) : 8249673 : cleanup graal problem lists
In-Reply-To: <4d6fe5f2-b947-50cf-0f51-6f8f218e1fad@oracle.com>
References: <2564EBA5-2F22-4105-B5AE-984018F7D8C2@oracle.com>
 <4d6fe5f2-b947-50cf-0f51-6f8f218e1fad@oracle.com>
Message-ID: <0650C344-3216-4C1B-A2D7-5404671998A7@oracle.com>

thanks Vladimir, pushed.

-- Igor

> On Jul 17, 2020, at 10:29 AM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> LGTM
> 
> Thanks,
> Vladimir K
> 
> On 7/17/20 10:22 AM, Igor Ignatyev wrote:
>> http://cr.openjdk.java.net/~iignatyev//8249673/webrev.00
>>> 21 lines changed: 0 ins; 5 del; 16 mod;
>> Hi all,
>> could you please review this clean up of ProblemList-graal.txt in hotspot and jdk test suites?
>> from JBS:
>>> graal problem-lists list several already closed bugs:
>>>  - JDK-8193210 fixed in jdk15-b17
>>>  - JDK-8244656, JDK-8204347, JDK-8230419, JDK-8181833 closed as dup of JDK-8207267
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249673
>> webrev: http://cr.openjdk.java.net/~iignatyev//8249673/webrev.00
>> testing:
>>  - jdk/jfr/event/compiler/ tests w/ Graal as JIT
>>  - grep-ed for bug ids
>> Thanks,
>> -- Igor
>> JDK-8193210 : https://bugs.openjdk.java.net/browse/JDK-8193210
>> JDK-8244656 : https://bugs.openjdk.java.net/browse/JDK-8244656
>> JDK-8204347 : https://bugs.openjdk.java.net/browse/JDK-8204347
>> JDK-8230419 : https://bugs.openjdk.java.net/browse/JDK-8230419
>> JDK-8207267 : https://bugs.openjdk.java.net/browse/JDK-8207267


From vladimir.a.ivanov at intel.com  Fri Jul 17 19:57:42 2020
From: vladimir.a.ivanov at intel.com (Ivanov, Vladimir A)
Date: Fri, 17 Jul 2020 19:57:42 +0000
Subject: add microcode version to the hs_err files
Message-ID: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>

Hello,

could you please review the patch  http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/

This patch adds the microcode version to the hs_err files on different OSes, which may be useful in the issue resolution process.


The reported microcode version on different OSes looks as follows:


Linux (RHEL7.7):

# cat hs_err_pid251046.log |grep microc

CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per core) family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt, clwb


Windows (Win10, v1809):

CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per core) family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt


MacOS (Darwin):

$ cat hs_err_pid95187.log |grep microc

CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per core) family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes, clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, sha, fma, clflush, clflushopt


Thanks, Vladimir


From thomas.stuefe at gmail.com  Fri Jul 17 21:19:43 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Fri, 17 Jul 2020 23:19:43 +0200
Subject: add microcode version to the hs_err files
In-Reply-To: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
Message-ID: <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>

Hi Vladimir,

I think this would be more suited to hotspot-runtime.

http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html

+#if defined(IA32) || defined(AMD64)

Is that not synonymous with x86?

+    while ((read = getline(&line, &len, fp)) != -1) {
+      if (len > 10 && strstr(line, "microcode") != NULL) {
+        char* rev = strchr(line, ':');
+        if (rev != NULL) sscanf(rev + 1, "%x", &result);
+        break;
+      }
+    }
+    free(line);

Not sure this works as intended. At the first call to getline() it will
allocate a line buffer for you and return it. That buffer will be as large
as the first line you happen to read. You then pass that same buffer into
getline to fetch the next lines, but what if those are longer than the
first?

But anyway, it would be better to pass in a simple caller-provided,
stack-allocated buffer, since this function is called at crash time and the
C heap could be corrupted.
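
To make the suggestion concrete, here is a rough sketch of the stack-buffer variant, assuming the same "microcode : 0x..." line format in /proc/cpuinfo. The function names are hypothetical and not taken from the webrev. It only removes the explicit getline()/free() pair; stdio may still allocate its own stream buffer internally.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical helper: parses one line of the form "microcode\t: 0x200005e".
// Returns true and stores the revision on success.
static bool parse_microcode_line(const char* line, unsigned int* result) {
  assert(line != nullptr && result != nullptr);
  if (strncmp(line, "microcode", 9) != 0) return false;
  const char* rev = strchr(line, ':');
  // "%x" accepts an optional 0x prefix and skips leading whitespace.
  return rev != nullptr && sscanf(rev + 1, "%x", result) == 1;
}

// Hypothetical reader: scans an already opened stream using a fixed-size
// buffer on the stack instead of a getline()-managed heap buffer.
static unsigned int read_microcode_revision(FILE* fp) {
  char buf[256];              // long lines (e.g. "flags") arrive in chunks
  bool at_line_start = true;
  unsigned int result = 0;    // 0 means "not found"
  while (fgets(buf, sizeof(buf), fp) != nullptr) {
    if (at_line_start && parse_microcode_line(buf, &result)) {
      break;
    }
    // The next chunk starts a new line only if this one ended in '\n'.
    at_line_start = (strchr(buf, '\n') != nullptr);
  }
  return result;
}
```

A caller would open /proc/cpuinfo, pass the FILE* to read_microcode_revision() and close it afterwards. Matching only at the start of a line also avoids a false hit in the middle of a long line that was split across several fgets() calls.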

Cheers, Thomas


On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A <
vladimir.a.ivanov at intel.com> wrote:

> Hello,
>
> could you please review the patch
> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>
> This patch adds the microcode version to the hs_err files on different OSes,
> which may be useful in the issue resolution process.
>
>
>
> The reported microcode version on different OSes looks as follows:
>
>
>
> Linux (RHEL7.7):
>
> # cat hs_err_pid251046.log |grep microc
>
> CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per core)
> family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8, fxsr, mmx,
> sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes,
> clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx,
> fma, clflush, clflushopt, clwb
>
>
>
> Windows (Win10, v1809):
>
> CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per core)
> family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr, mmx, sse,
> sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes,
> clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx,
> fma, clflush, clflushopt
>
>
>
> MacOS (Darwin):
>
> $ cat hs_err_pid95187.log |grep microc
>
> CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per core)
> family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr, mmx, sse,
> sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes,
> clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, sha,
> fma, clflush, clflushopt
>
>
>
> Thanks, Vladimir
>
>
>   Thanks, Vladimir
>
>

From thomas.stuefe at gmail.com  Fri Jul 17 21:26:16 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Fri, 17 Jul 2020 23:26:16 +0200
Subject: add microcode version to the hs_err files
In-Reply-To: <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
Message-ID: <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>

On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe <thomas.stuefe at gmail.com>
wrote:

> Hi Vladimir,
>
> I think this would be more suited to hotspot-runtime.
>
>
> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html
>
> +#if defined(IA32) || defined(AMD64)
>
> Is that not synonymous with x86?
>
> +    while ((read = getline(&line, &len, fp)) != -1) {
> +      if (len > 10 && strstr(line, "microcode") != NULL) {
> +        char* rev = strchr(line, ':');
> +        if (rev != NULL) sscanf(rev + 1, "%x", &result);
> +        break;
> +      }
> +    }
> +    free(line);
>
> Not sure this works as intended. At the first call to getline() it will
> allocate a line buffer for you and return it. That buffer will be as large
> as the first line you happen to read. You then pass that same buffer into
> getline to fetch the next lines, but what if those are longer than the
> first?
>
>
Forget that point, getline calls realloc() on the line buffer to resize it,
so this should be okay.

Thanks, Thomas


> But anyway it would be better to pass a simple caller provided buffer in -
> stack allocated. Since this function is called at crash time and the C heap
> could be corrupted.
>
> Cheers, Thomas
>
>
> On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A <
> vladimir.a.ivanov at intel.com> wrote:
>
>> Hello,
>>
>> could you please review the patch
>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>>
>> This patch adds the microcode version to the hs_err files on different OSes,
>> which may be useful in the issue resolution process.
>>
>>
>>
>> The reported microcode version on different OSes looks as follows:
>>
>>
>>
>> Linux (RHEL7.7):
>>
>> # cat hs_err_pid251046.log |grep microc
>>
>> CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per
>> core) family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8, fxsr,
>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2,
>> aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2,
>> adx, fma, clflush, clflushopt, clwb
>>
>>
>>
>> Windows (Win10, v1809):
>>
>> CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per core)
>> family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr, mmx, sse,
>> sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes,
>> clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx,
>> fma, clflush, clflushopt
>>
>>
>>
>> MacOS (Darwin):
>>
>> $ cat hs_err_pid95187.log |grep microc
>>
>> CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per core)
>> family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr, mmx, sse,
>> sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes,
>> clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, sha,
>> fma, clflush, clflushopt
>>
>>
>>
>> Thanks, Vladimir
>>
>>
>>   Thanks, Vladimir
>>
>>

From vladimir.a.ivanov at intel.com  Fri Jul 17 21:57:37 2020
From: vladimir.a.ivanov at intel.com (Ivanov, Vladimir A)
Date: Fri, 17 Jul 2020 21:57:37 +0000
Subject: add microcode version to the hs_err files
In-Reply-To: <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
Message-ID: <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>

>  +#if defined(IA32) || defined(AMD64)
>
> Is that not synonymous with x86?
This pattern was copied from the method "print_model_name_and_flags" (file os/linux/os_linux.cpp).
This method also reads the "/proc/cpuinfo" file, and I reused it as a "template" for the new method.
It is better to use one pattern when working with exactly the same file, but in general you are right.
The X86 is defined in the file ./share/utilities/macros.hpp as:
#if defined(IA32) || defined(AMD64)
#define X86
#define X86_ONLY(code) code
#define NOT_X86(code)

The question here: can I delete these "ifdefs", given that this method should work on x86 only?

Thanks, Vladimir

From: Thomas Stüfe <thomas.stuefe at gmail.com>
Sent: Friday, July 17, 2020 2:26 PM
To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>; Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>
Cc: hotspot-compiler-dev at openjdk.java.net
Subject: Re: add microcode version to the hs_err files


On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>> wrote:
Hi Vladimir,

I think this would be more suited to hotspot-runtime.

http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html

+#if defined(IA32) || defined(AMD64)

Is that not synonymous with x86?

+    while ((read = getline(&line, &len, fp)) != -1) {
+      if (len > 10 && strstr(line, "microcode") != NULL) {
+        char* rev = strchr(line, ':');
+        if (rev != NULL) sscanf(rev + 1, "%x", &result);
+        break;
+      }
+    }
+    free(line);

Not sure this works as intended. At the first call to getline() it will allocate a line buffer for you and return it. That buffer will be as large as the first line you happen to read. You then pass that same buffer into getline to fetch the next lines, but what if those are longer than the first?


Forget that point, getline calls realloc() on the line buffer to resize it, so this should be okay.

Thanks, Thomas

But anyway it would be better to pass a simple caller provided buffer in - stack allocated. Since this function is called at crash time and the C heap could be corrupted.

Cheers, Thomas


On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
Hello,

could you please review the patch  http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/

This patch adds the microcode version for different OSes, which may be useful in the issue resolution process.


The reported microcode version for different OSes looks as follows:


Linux (RHEL7.7):

# cat hs_err_pid251046.log |grep microc

CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per core) family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt, clwb


Windows (Win10, v1809):

CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per core) family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt


MacOS (Darwin):

$ cat hs_err_pid95187.log |grep microc

CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per core) family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes, clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, sha, fma, clflush, clflushopt


Thanks, Vladimir


  Thanks, Vladimir

From thomas.stuefe at gmail.com  Fri Jul 17 22:02:29 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Sat, 18 Jul 2020 00:02:29 +0200
Subject: add microcode version to the hs_err files
In-Reply-To: <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
Message-ID: <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>

Hi Vladimir,

On Fri, Jul 17, 2020 at 11:57 PM Ivanov, Vladimir A <
vladimir.a.ivanov at intel.com> wrote:

> >  +#if defined(IA32) || defined(AMD64)
> >
> > Is that not synonymous with x86?
>
> This pattern was copied from the method "print_model_name_and_flags" (file
> os/linux/os_linux.cpp).
>
> This method also reads the "/proc/cpuinfo" file, and I reused it as a
> "template" for the new method.
>
> It is better to use one pattern when working with exactly the same file, but in
> general you are right.
>
> The X86 is defined in the file ./share/utilities/macros.hpp as:
>
> #if defined(IA32) || defined(AMD64)
>
> #define X86
>
> #define X86_ONLY(code) code
>
> #define NOT_X86(code)
>
>
>
> The question here: could I delete these "ifdefs", since this method should
> work on x86 only?
>
>
>

os_linux_x86.cpp is compiled for x86 platforms only, whereas os_linux.cpp
is shared among all architectures.

So, in the former you do not need to exclude non-x86 architectures.
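As a rough illustration (a self-contained sketch, not HotSpot code): the macros quoted above exist so that shared files can keep or drop code per architecture, which is why a file that is only ever built for x86 does not need the guard. The #else branch and the function name built_for_x86 are assumptions made for this example, and the sketch defines AMD64 itself to stand in for what the build system would pass.

```cpp
// Stand-in for the build system defining the target (assumption for this demo).
#ifndef AMD64
#define AMD64
#endif

// Mirrors the macros.hpp excerpt quoted above; the #else branch is an
// assumption added so the sketch compiles for either case.
#if defined(IA32) || defined(AMD64)
#define X86
#define X86_ONLY(code) code
#define NOT_X86(code)
#else
#define X86_ONLY(code)
#define NOT_X86(code) code
#endif

// Hypothetical helper: in a file shared among architectures, exactly one of
// the two statements survives preprocessing.
bool built_for_x86() {
  X86_ONLY(return true;)
  NOT_X86(return false;)
}
```

In a file like os_linux_x86.cpp the X86_ONLY wrapper would always expand to its argument, so the plain statement is equivalent.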

Cheers, Thomas


> Thanks, Vladimir
>
>
>
> *From:* Thomas Stüfe <thomas.stuefe at gmail.com>
> *Sent:* Friday, July 17, 2020 2:26 PM
> *To:* Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>; Hotspot dev
> runtime <hotspot-runtime-dev at openjdk.java.net>
> *Cc:* hotspot-compiler-dev at openjdk.java.net
> *Subject:* Re: add microcode version to the hs_err files
>
>
>
>
>
>
>
> On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe <thomas.stuefe at gmail.com>
> wrote:
>
> Hi Vladimir,
>
>
>
> I think this would be more suited to hotspot-runtime.
>
>
>
>
> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html
>
>
> +#if defined(IA32) || defined(AMD64)
>
> Is that not synonymous with x86?
>
>
>
> +    while ((read = getline(&line, &len, fp)) != -1) {
> +      if (len > 10 && strstr(line, "microcode") != NULL) {
> +        char* rev = strchr(line, ':');
> +        if (rev != NULL) sscanf(rev + 1, "%x", &result);
> +        break;
> +      }
> +    }
> +    free(line);
>
>
>
> Not sure this works as intended. At the first call to getline() it will
> allocate a line buffer for you and return it. That buffer will be as large
> as the first line you happen to read. You then pass that same buffer into
> getline to fetch the next lines, but what if those are longer than the
> first?
>
>
>
>
>
> Forget that point, getline calls realloc() on the line buffer to resize
> it, so this should be okay.
>
>
>
> Thanks, Thomas
>
>
>
> But anyway it would be better to pass a simple caller provided buffer in -
> stack allocated. Since this function is called at crash time and the C heap
> could be corrupted.
>
>
>
> Cheers, Thomas
>
>
>
>
>
> On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A <
> vladimir.a.ivanov at intel.com> wrote:
>
> Hello,
>
> could you please review the patch
> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>
> This patch adds the microcode version for different OSes, which may be useful
> in the issue resolution process.
>
>
>
> The reported microcode version for different OSes looks as follows:
>
>
>
> Linux (RHEL7.7):
>
> # cat hs_err_pid251046.log |grep microc
>
> CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per core)
> family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8, fxsr, mmx,
> sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes,
> clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx,
> fma, clflush, clflushopt, clwb
>
>
>
> Windows (Win10, v1809):
>
> CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per core)
> family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr, mmx, sse,
> sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes,
> clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx,
> fma, clflush, clflushopt
>
>
>
> MacOS (Darwin):
>
> $ cat hs_err_pid95187.log |grep microc
>
> CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per core)
> family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr, mmx, sse,
> sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes,
> clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, sha,
> fma, clflush, clflushopt
>
>
>
> Thanks, Vladimir
>
>
>   Thanks, Vladimir
>
>

From vladimir.a.ivanov at intel.com  Fri Jul 17 22:52:42 2020
From: vladimir.a.ivanov at intel.com (Ivanov, Vladimir A)
Date: Fri, 17 Jul 2020 22:52:42 +0000
Subject: add microcode version to the hs_err files
In-Reply-To: <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
Message-ID: <BYAPR11MB378241E44D75A7AAC274DECDA77C0@BYAPR11MB3782.namprd11.prod.outlook.com>

Thanks for your comment.
The updated patch is available at http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.01/

Thanks, Vladimir

From: Thomas Stüfe <thomas.stuefe at gmail.com>
Sent: Friday, July 17, 2020 3:02 PM
To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>
Cc: Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: add microcode version to the hs_err files

Hi Vladimir,

On Fri, Jul 17, 2020 at 11:57 PM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
>  +#if defined(IA32) || defined(AMD64)
>
> Is that not synonymous with x86?
This pattern was copied from the method "print_model_name_and_flags" (file os/linux/os_linux.cpp).
This method also reads the "/proc/cpuinfo" file, and I reused it as a "template" for the new method.
It is better to use one pattern when working with exactly the same file, but in general you are right.
The X86 is defined in the file ./share/utilities/macros.hpp as:
#if defined(IA32) || defined(AMD64)
#define X86
#define X86_ONLY(code) code
#define NOT_X86(code)

The question here: could I delete these "ifdefs", since this method should work on x86 only?


os_linux_x86.cpp is compiled for x86 platforms only, whereas os_linux.cpp is shared among all architectures.

So, in the former you do not need to exclude non-x86 architectures.

Cheers, Thomas

Thanks, Vladimir

From: Thomas Stüfe <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>>
Sent: Friday, July 17, 2020 2:26 PM
To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>>; Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net<mailto:hotspot-runtime-dev at openjdk.java.net>>
Cc: hotspot-compiler-dev at openjdk.java.net<mailto:hotspot-compiler-dev at openjdk.java.net>
Subject: Re: add microcode version to the hs_err files


On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>> wrote:
Hi Vladimir,

I think this would be more suited to hotspot-runtime.

http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html

+#if defined(IA32) || defined(AMD64)

Is that not synonymous with x86?

+    while ((read = getline(&line, &len, fp)) != -1) {
+      if (len > 10 && strstr(line, "microcode") != NULL) {
+        char* rev = strchr(line, ':');
+        if (rev != NULL) sscanf(rev + 1, "%x", &result);
+        break;
+      }
+    }
+    free(line);

Not sure this works as intended. At the first call to getline() it will allocate a line buffer for you and return it. That buffer will be as large as the first line you happen to read. You then pass that same buffer into getline to fetch the next lines, but what if those are longer than the first?


Forget that point, getline calls realloc() on the line buffer to resize it, so this should be okay.

Thanks, Thomas

But anyway it would be better to pass a simple caller provided buffer in - stack allocated. Since this function is called at crash time and the C heap could be corrupted.

Cheers, Thomas


On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
Hello,

could you please review the patch  http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/

This patch adds the microcode version for different OSes, which may be useful in the issue resolution process.


The reported microcode version for different OSes looks as follows:


Linux (RHEL7.7):

# cat hs_err_pid251046.log |grep microc

CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per core) family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt, clwb


Windows (Win10, v1809):

CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per core) family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt


MacOS (Darwin):

$ cat hs_err_pid95187.log |grep microc

CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per core) family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes, clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, sha, fma, clflush, clflushopt


Thanks, Vladimir


  Thanks, Vladimir

From vladimir.kozlov at oracle.com  Fri Jul 17 23:03:20 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 17 Jul 2020 16:03:20 -0700
Subject: [16] RFR(S) 8249672: Include microcode revision in
 features_string on x86
In-Reply-To: <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
Message-ID: <29dd9cde-48c8-915f-fa28-26312c7af17a@oracle.com>

I updated the subject to our formal review request format (JDK version, RFE's id and subject).

I moved the RFE to the runtime group, as Thomas suggested:

https://bugs.openjdk.java.net/browse/JDK-8249672

I submitted tier1 testing to build on all our supported platforms, and the debug builds on Linux failed:

#  SIGSEGV (0xb) at pc=0x0000146fc6af4b0b, pid=9715, tid=9718
# V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb

V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
V  [libjvm.so+0x13c898a]  verify_memory(void*)+0x26a
V  [libjvm.so+0x13cd30b]  os::free(void*)+0x5b
V  [libjvm.so+0x13e5598]  os::cpu_microcode_revision()+0xc8
V  [libjvm.so+0x17d314c]  VM_Version::get_processor_features()+0x76c
V  [libjvm.so+0x17d6ead]  VM_Version::initialize()+0x10d
V  [libjvm.so+0x17ce6c6]  VM_Version_init()+0x26
V  [libjvm.so+0xcb2895]   init_globals()+0x55
V  [libjvm.so+0x16dde63]  Threads::create_vm(JavaVMInitArgs*, bool*)+0x2d3


Regards,
Vladimir K

On 7/17/20 3:02 PM, Thomas Stüfe wrote:
> Hi Vladimir,
> 
> On Fri, Jul 17, 2020 at 11:57 PM Ivanov, Vladimir A <
> vladimir.a.ivanov at intel.com> wrote:
> 
>>>   +#if defined(IA32) || defined(AMD64)
>>>
>>> Is that not synonymous with x86?
>>
>> This pattern was copied from the method "print_model_name_and_flags" (file
>> os/linux/os_linux.cpp).
>>
>> This method also reads the "/proc/cpuinfo" file, and I reused it as a
>> "template" for the new method.
>>
>> It is better to use one pattern when working with exactly the same file, but in
>> general you are right.
>>
>> The X86 is defined in the file ./share/utilities/macros.hpp as:
>>
>> #if defined(IA32) || defined(AMD64)
>>
>> #define X86
>>
>> #define X86_ONLY(code) code
>>
>> #define NOT_X86(code)
>>
>>
>>
>> The question here: could I delete these "ifdefs", since this method should
>> work on x86 only?
>>
>>
>>
> 
> os_linux_x86.cpp is compiled for x86 platforms only, whereas os_linux.cpp
> is shared among all architectures.
> 
> So, in the former you do not need to exclude non-x86 architectures.
> 
> Cheers, Thomas
> 
> 
>> Thanks, Vladimir
>>
>>
>>
>> *From:* Thomas Stüfe <thomas.stuefe at gmail.com>
>> *Sent:* Friday, July 17, 2020 2:26 PM
>> *To:* Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>; Hotspot dev
>> runtime <hotspot-runtime-dev at openjdk.java.net>
>> *Cc:* hotspot-compiler-dev at openjdk.java.net
>> *Subject:* Re: add microcode version to the hs_err files
>>
>>
>>
>>
>>
>>
>>
>> On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe <thomas.stuefe at gmail.com>
>> wrote:
>>
>> Hi Vladimir,
>>
>>
>>
>> I think this would be more suited to hotspot-runtime.
>>
>>
>>
>>
>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html
>>
>>
>> +#if defined(IA32) || defined(AMD64)
>>
>> Is that not synonymous with x86?
>>
>>
>>
>> +    while ((read = getline(&line, &len, fp)) != -1) {
>> +      if (len > 10 && strstr(line, "microcode") != NULL) {
>> +        char* rev = strchr(line, ':');
>> +        if (rev != NULL) sscanf(rev + 1, "%x", &result);
>> +        break;
>> +      }
>> +    }
>> +    free(line);
>>
>>
>>
>> Not sure this works as intended. At the first call to getline() it will
>> allocate a line buffer for you and return it. That buffer will be as large
>> as the first line you happen to read. You then pass that same buffer into
>> getline to fetch the next lines, but what if those are longer than the
>> first?
>>
>>
>>
>>
>>
>> Forget that point, getline calls realloc() on the line buffer to resize
>> it, so this should be okay.
>>
>>
>>
>> Thanks, Thomas
>>
>>
>>
>> But anyway it would be better to pass a simple caller provided buffer in -
>> stack allocated. Since this function is called at crash time and the C heap
>> could be corrupted.
>>
>>
>>
>> Cheers, Thomas
>>
>>
>>
>>
>>
>> On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A <
>> vladimir.a.ivanov at intel.com> wrote:
>>
>> Hello,
>>
>> could you please review the patch
>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>>
>> This patch adds the microcode version for different OSes, which may be useful
>> in the issue resolution process.
>>
>>
>>
>> The reported microcode version for different OSes looks as follows:
>>
>>
>>
>> Linux (RHEL7.7):
>>
>> # cat hs_err_pid251046.log |grep microc
>>
>> CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per core)
>> family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8, fxsr, mmx,
>> sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes,
>> clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx,
>> fma, clflush, clflushopt, clwb
>>
>>
>>
>> Windows (Win10, v1809):
>>
>> CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per core)
>> family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr, mmx, sse,
>> sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes,
>> clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx,
>> fma, clflush, clflushopt
>>
>>
>>
>> MacOS (Darwin):
>>
>> $ cat hs_err_pid95187.log |grep microc
>>
>> CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per core)
>> family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr, mmx, sse,
>> sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes,
>> clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, sha,
>> fma, clflush, clflushopt
>>
>>
>>
>> Thanks, Vladimir
>>
>>
>>    Thanks, Vladimir
>>
>>

From vladimir.kozlov at oracle.com  Fri Jul 17 23:17:00 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 17 Jul 2020 16:17:00 -0700
Subject: [16] RFR(S) 8249672: Include microcode revision in
 features_string on x86
In-Reply-To: <29dd9cde-48c8-915f-fa28-26312c7af17a@oracle.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
 <29dd9cde-48c8-915f-fa28-26312c7af17a@oracle.com>
Message-ID: <d6d5d0a8-c990-c74f-ab8a-ef0a8e9a17d0@oracle.com>

I think the issue is that the 'line' buffer is allocated by libc's getline(), and os::free(), which is a HotSpot
function [1], does not know about it. You need C's ::free(), or use HotSpot's os::malloc() to allocate the 'line' buffer.

Someone from Runtime may suggest what is best for this case.
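
For reference, here is a minimal sketch (not the code from the webrev) of a variant that avoids the heap altogether, along the lines of the caller-provided stack buffer suggested earlier in the thread: it reads the file with fgets() into a fixed-size buffer instead of getline(), so there is nothing to free with either ::free() or os::free(). The helper names parse_microcode_line and read_microcode_revision and the 256-byte buffer size are assumptions made for illustration.

```cpp
#include <stdio.h>
#include <string.h>

// Hypothetical helper: extract the revision from one /proc/cpuinfo line such
// as "microcode\t: 0x200005e". Returns true and stores the value on success.
bool parse_microcode_line(const char* line, unsigned int* result) {
  if (strncmp(line, "microcode", 9) != 0) return false;
  const char* rev = strchr(line, ':');
  // "%x" skips leading whitespace and accepts an optional "0x" prefix.
  return rev != NULL && sscanf(rev + 1, "%x", result) == 1;
}

// Sketch of a reader that uses a stack buffer and fgets(), so no heap
// allocation or free() is involved. Returns 0 if no revision was found.
unsigned int read_microcode_revision(FILE* fp) {
  unsigned int result = 0;
  char line[256];  // assumption: large enough for the short "microcode" line
  // Longer lines (for example "flags") are simply read in several pieces;
  // only a piece that starts with "microcode" is parsed.
  while (fgets(line, sizeof(line), fp) != NULL) {
    if (parse_microcode_line(line, &result)) break;
  }
  return result;
}
```

On Linux this would be called with a FILE* obtained from fopen("/proc/cpuinfo", "r"), which the caller closes afterwards.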

Thanks,
Vladimir K

[1] http://hg.openjdk.java.net/jdk/jdk/file/14f465f62984/src/hotspot/share/runtime/os.cpp#l792

On 7/17/20 4:03 PM, Vladimir Kozlov wrote:
> I updated subject to our formal review request format (JDK version, RFE's id and subject).
> 
> I moved RFE to runtime group as Thomas said:
> 
> https://bugs.openjdk.java.net/browse/JDK-8249672
> 
> Submitted tier1 testing to build on all our supported platforms. And debug builds on linux failed:
> 
> #  SIGSEGV (0xb) at pc=0x0000146fc6af4b0b, pid=9715, tid=9718
> # V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
> 
> V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
> V  [libjvm.so+0x13c898a]  verify_memory(void*)+0x26a
> V  [libjvm.so+0x13cd30b]  os::free(void*)+0x5b
> V  [libjvm.so+0x13e5598]  os::cpu_microcode_revision()+0xc8
> V  [libjvm.so+0x17d314c]  VM_Version::get_processor_features()+0x76c
> V  [libjvm.so+0x17d6ead]  VM_Version::initialize()+0x10d
> V  [libjvm.so+0x17ce6c6]  VM_Version_init()+0x26
> V  [libjvm.so+0xcb2895]   init_globals()+0x55
> V  [libjvm.so+0x16dde63]  Threads::create_vm(JavaVMInitArgs*, bool*)+0x2d3
> 
> 
> Regards,
> Vladimir K
> 
> On 7/17/20 3:02 PM, Thomas Stüfe wrote:
>> Hi Vladimir,
>>
>> On Fri, Jul 17, 2020 at 11:57 PM Ivanov, Vladimir A <
>> vladimir.a.ivanov at intel.com> wrote:
>>
>>>>   +#if defined(IA32) || defined(AMD64)
>>>>
>>>> Is that not synonymous with x86?
>>>
>>> This pattern was copied from the method "print_model_name_and_flags" (file
>>> os/linux/os_linux.cpp).
>>>
>>> This method also reads the "/proc/cpuinfo" file, and I reused it as a
>>> "template" for the new method.
>>>
>>> It is better to use one pattern when working with exactly the same file, but in
>>> general you are right.
>>>
>>> The X86 is defined in the file ./share/utilities/macros.hpp as:
>>>
>>> #if defined(IA32) || defined(AMD64)
>>>
>>> #define X86
>>>
>>> #define X86_ONLY(code) code
>>>
>>> #define NOT_X86(code)
>>>
>>>
>>>
>>> The question here: could I delete these "ifdefs", since this method should
>>> work on x86 only?
>>>
>>>
>>>
>>
>> os_linux_x86.cpp is compiled for x86 platforms only, whereas os_linux.cpp
>> is shared among all architectures.
>>
>> So, in the former you do not need to exclude non-x86 architectures.
>>
>> Cheers, Thomas
>>
>>
>>> Thanks, Vladimir
>>>
>>>
>>>
>>> *From:* Thomas Stüfe <thomas.stuefe at gmail.com>
>>> *Sent:* Friday, July 17, 2020 2:26 PM
>>> *To:* Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>; Hotspot dev
>>> runtime <hotspot-runtime-dev at openjdk.java.net>
>>> *Cc:* hotspot-compiler-dev at openjdk.java.net
>>> *Subject:* Re: add microcode version to the hs_err files
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe <thomas.stuefe at gmail.com>
>>> wrote:
>>>
>>> Hi Vladimir,
>>>
>>>
>>>
>>> I think this would be more suited to hotspot-runtime.
>>>
>>>
>>>
>>>
>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html 
>>>
>>>
>>>
>>> +#if defined(IA32) || defined(AMD64)
>>>
>>> Is that not synonymous with x86?
>>>
>>>
>>>
>>> +    while ((read = getline(&line, &len, fp)) != -1) {
>>> +      if (len > 10 && strstr(line, "microcode") != NULL) {
>>> +        char* rev = strchr(line, ':');
>>> +        if (rev != NULL) sscanf(rev + 1, "%x", &result);
>>> +        break;
>>> +      }
>>> +    }
>>> +    free(line);
>>>
>>>
>>>
>>> Not sure this works as intended. At the first call to getline() it will
>>> allocate a line buffer for you and return it. That buffer will be as large
>>> as the first line you happen to read. You then pass that same buffer into
>>> getline to fetch the next lines, but what if those are longer than the
>>> first?
>>>
>>>
>>>
>>>
>>>
>>> Forget that point, getline calls realloc() on the line buffer to resize
>>> it, so this should be okay.
>>>
>>>
>>>
>>> Thanks, Thomas
>>>
>>>
>>>
>>> But anyway it would be better to pass a simple caller provided buffer in -
>>> stack allocated. Since this function is called at crash time and the C heap
>>> could be corrupted.
>>>
>>>
>>>
>>> Cheers, Thomas
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A <
>>> vladimir.a.ivanov at intel.com> wrote:
>>>
>>> Hello,
>>>
>>> could you please review the patch
>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>>>
>>> This patch adds the microcode version for different OSes, which may be useful
>>> in the issue resolution process.
>>>
>>>
>>>
>>> The reported microcode version for different OSes looks as follows:
>>>
>>>
>>>
>>> Linux (RHEL7.7):
>>>
>>> # cat hs_err_pid251046.log |grep microc
>>>
>>> CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per core)
>>> family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8, fxsr, mmx,
>>> sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes,
>>> clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx,
>>> fma, clflush, clflushopt, clwb
>>>
>>>
>>>
>>> Windows (Win10, v1809):
>>>
>>> CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per core)
>>> family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr, mmx, sse,
>>> sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes,
>>> clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx,
>>> fma, clflush, clflushopt
>>>
>>>
>>>
>>> MacOS (Darwin):
>>>
>>> $ cat hs_err_pid95187.log |grep microc
>>>
>>> CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per core)
>>> family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr, mmx, sse,
>>> sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes,
>>> clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, sha,
>>> fma, clflush, clflushopt
>>>
>>>
>>>
>>> Thanks, Vladimir
>>>
>>>
>>>    Thanks, Vladimir
>>>
>>>

From vladimir.kozlov at oracle.com  Fri Jul 17 23:24:07 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 17 Jul 2020 16:24:07 -0700
Subject: add microcode version to the hs_err files
In-Reply-To: <BYAPR11MB378241E44D75A7AAC274DECDA77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
 <BYAPR11MB378241E44D75A7AAC274DECDA77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
Message-ID: <ce56b6b9-2498-4050-eeef-6ee7facc6be0@oracle.com>

I forked a new e-mail thread with the correct subject line:

[16] RFR(S) 8249672: Include microcode revision in features_string on x86

Let's continue the discussion there. There is an issue with the changes in os_linux_x86.cpp.

Regards,
Vladimir K

On 7/17/20 3:52 PM, Ivanov, Vladimir A wrote:
> Thanks for your comment.
> The updated patch available as http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.01/
> 
> Thanks, Vladimir
> 
> From: Thomas Stüfe <thomas.stuefe at gmail.com>
> Sent: Friday, July 17, 2020 3:02 PM
> To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>
> Cc: Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: add microcode version to the hs_err files
> 
> Hi Vladimir,
> 
> On Fri, Jul 17, 2020 at 11:57 PM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
>>   +#if defined(IA32) || defined(AMD64)
>>
>> Is that not synonymous with x86?
> This pattern was copied from the method "print_model_name_and_flags" (file os/linux/os_linux.cpp).
> This method also reads the "/proc/cpuinfo" file, and I reused it as a "template" for the new method.
> It is better to use one pattern when working with exactly the same file, but in general you are right.
> The X86 is defined in the file ./share/utilities/macros.hpp as:
> #if defined(IA32) || defined(AMD64)
> #define X86
> #define X86_ONLY(code) code
> #define NOT_X86(code)
> 
> The question here: could I delete these "ifdefs", since this method should work on x86 only?
> 
> 
> os_linux_x86.cpp is compiled for x86 platforms only, whereas os_linux.cpp is shared among all architectures.
> 
> So, in the former you do not need to exclude non-x86 architectures.
> 
> Cheers, Thomas
> 
> Thanks, Vladimir
> 
> From: Thomas Stüfe <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>>
> Sent: Friday, July 17, 2020 2:26 PM
> To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>>; Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net<mailto:hotspot-runtime-dev at openjdk.java.net>>
> Cc: hotspot-compiler-dev at openjdk.java.net<mailto:hotspot-compiler-dev at openjdk.java.net>
> Subject: Re: add microcode version to the hs_err files
> 
> 
> 
> On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>> wrote:
> Hi Vladimir,
> 
> I think this would be more suited to hotspot-runtime.
> 
> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html
> 
> +#if defined(IA32) || defined(AMD64)
> 
> Is that not synonymous with x86?
> 
> +    while ((read = getline(&line, &len, fp)) != -1) {
> +      if (len > 10 && strstr(line, "microcode") != NULL) {
> +        char* rev = strchr(line, ':');
> +        if (rev != NULL) sscanf(rev + 1, "%x", &result);
> +        break;
> +      }
> +    }
> +    free(line);
> 
> Not sure this works as intended. At the first call to getline() it will allocate a line buffer for you and return it. That buffer will be as large as the first line you happen to read. You then pass that same buffer into getline to fetch the next lines, but what if those are longer than the first?
> 
> 
> Forget that point, getline calls realloc() on the line buffer to resize it, so this should be okay.
> 
> Thanks, Thomas
> 
> But anyway it would be better to pass a simple caller provided buffer in - stack allocated. Since this function is called at crash time and the C heap could be corrupted.
> 
> Cheers, Thomas
> 
> 
> On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
> Hello,
> 
> could you please review the patch  http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
> 
> This patch adds the microcode version for different OSes, which may be useful in the issue resolution process.
> 
> 
> 
> The reported microcode version for different OSes looks as follows:
> 
> 
> 
> Linux (RHEL7.7):
> 
> # cat hs_err_pid251046.log |grep microc
> 
> CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per core) family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt, clwb
> 
> 
> 
> Windows (Win10, v1809):
> 
> CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per core) family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt
> 
> 
> 
> MacOS (Darwin):
> 
> $ cat hs_err_pid95187.log |grep microc
> 
> CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per core) family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, avx, avx2, aes, clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, sha, fma, clflush, clflushopt
> 
> 
> 
> Thanks, Vladimir
> 
> 
>    Thanks, Vladimir
> 

From vladimir.a.ivanov at intel.com  Fri Jul 17 23:24:32 2020
From: vladimir.a.ivanov at intel.com (Ivanov, Vladimir A)
Date: Fri, 17 Jul 2020 23:24:32 +0000
Subject: [16] RFR(S) 8249672: Include microcode revision in
 features_string on x86
In-Reply-To: <d6d5d0a8-c990-c74f-ab8a-ef0a8e9a17d0@oracle.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
 <29dd9cde-48c8-915f-fa28-26312c7af17a@oracle.com>
 <d6d5d0a8-c990-c74f-ab8a-ef0a8e9a17d0@oracle.com>
Message-ID: <BYAPR11MB378279AB52DD8560F661DA03A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>

Thanks, I expected that the C functions would be needed here. Let's wait a little for the Runtime team and then update the buffer handling.

 Thanks, Vladimir

-----Original Message-----
From: Vladimir Kozlov <vladimir.kozlov at oracle.com> 
Sent: Friday, July 17, 2020 4:17 PM
To: Thomas Stüfe <thomas.stuefe at gmail.com>; Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>
Cc: Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: [16] RFR(S) 8249672: Include microcode revision in features_string on x86

I think the issue is that the 'line' buffer is allocated by libc's getline(), and os::free(), which is a HotSpot function [1], does not know about it. You need C's ::free(), or use HotSpot's os::malloc() to allocate the 'line' buffer.

Someone from Runtime may suggest what is the best for this case.

Thanks,
Vladimir K

[1] http://hg.openjdk.java.net/jdk/jdk/file/14f465f62984/src/hotspot/share/runtime/os.cpp#l792

On 7/17/20 4:03 PM, Vladimir Kozlov wrote:
> I updated subject to our formal review request format (JDK version, RFE's id and subject).
> 
> I moved RFE to runtime group as Thomas said:
> 
> https://bugs.openjdk.java.net/browse/JDK-8249672
> 
> Submitted tier1 testing to build on all our supported platforms. And debug builds on linux failed:
> 
> #  SIGSEGV (0xb) at pc=0x0000146fc6af4b0b, pid=9715, tid=9718
> # V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
> 
> V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
> V  [libjvm.so+0x13c898a]  verify_memory(void*)+0x26a
> V  [libjvm.so+0x13cd30b]  os::free(void*)+0x5b
> V  [libjvm.so+0x13e5598]  os::cpu_microcode_revision()+0xc8
> V  [libjvm.so+0x17d314c]  VM_Version::get_processor_features()+0x76c
> V  [libjvm.so+0x17d6ead]  VM_Version::initialize()+0x10d
> V  [libjvm.so+0x17ce6c6]  VM_Version_init()+0x26
> V  [libjvm.so+0xcb2895]  init_globals()+0x55
> V  [libjvm.so+0x16dde63]  Threads::create_vm(JavaVMInitArgs*, bool*)+0x2d3
> 
> 
> Regards,
> Vladimir K
> 
> On 7/17/20 3:02 PM, Thomas Stüfe wrote:
>> Hi Vladimir,
>>
>> On Fri, Jul 17, 2020 at 11:57 PM Ivanov, Vladimir A < 
>> vladimir.a.ivanov at intel.com> wrote:
>>
>>>>   +#if defined(IA32) || defined(AMD64)
>>>>
>>>> Is that not synonymous with x86?
>>>
>>> This pattern was copied from the method 'print_model_name_and_flags'
>>> (file os/linux/os_linux.cpp).
>>>
>>> This method also reads the '/proc/cpuinfo' file, and I reused it as a
>>> 'template' for the new method.
>>>
>>> It is better to use one pattern to work with exactly same file but 
>>> in general you are right.
>>>
>>> The X86 is defined in the file ./share/utilities/macros.hpp as:
>>>
>>> #if defined(IA32) || defined(AMD64)
>>>
>>> #define X86
>>>
>>> #define X86_ONLY(code) code
>>>
>>> #define NOT_X86(code)
>>>
>>>
>>>
>>> The question here: could I delete these 'ifdefs', since this method
>>> should work on x86 only?
>>>
>>>
>>>
>>
>> os_linux_x86.cpp is compiled for x86 platforms only, whereas 
>> os_linux.cpp is shared among all architectures.
>>
>> So, in the former you do not need to exclude non-x86 architectures.
>>
>> Cheers, Thomas
>>
>>
>>> Thanks, Vladimir
>>>
>>>
>>>
>>> *From:* Thomas Stüfe <thomas.stuefe at gmail.com>
>>> *Sent:* Friday, July 17, 2020 2:26 PM
>>> *To:* Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>; Hotspot dev 
>>> runtime <hotspot-runtime-dev at openjdk.java.net>
>>> *Cc:* hotspot-compiler-dev at openjdk.java.net
>>> *Subject:* Re: add microcode version to the hs_err files
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe
>>> <thomas.stuefe at gmail.com>
>>> wrote:
>>>
>>> Hi Vladimir,
>>>
>>>
>>>
>>> I think this would be more suited to hotspot-runtime.
>>>
>>>
>>>
>>>
>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>>> src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html
>>>
>>>
>>>
>>> +#if defined(IA32) || defined(AMD64)
>>>
>>> Is that not synonymous with x86?
>>>
>>>
>>>
>>> +    while ((read = getline(&line, &len, fp)) != -1) {
>>> +      if (len > 10 && strstr(line, "microcode") != NULL) {
>>> +        char* rev = strchr(line, ':');
>>> +        if (rev != NULL) sscanf(rev + 1, "%x", &result);
>>> +        break;
>>> +      }
>>> +    }
>>> +    free(line);
>>>
>>>
>>>
>>> Not sure this works as intended. At the first call to getline() it 
>>> will allocate a line buffer for you and return it. That buffer will 
>>> be as large as the first line you happen to read. You then pass that 
>>> same buffer into getline to fetch the next lines, but what if those 
>>> are longer than the first?
>>>
>>>
>>>
>>>
>>>
>>> Forget that point, getline calls realloc() on the line buffer to 
>>> resize it, so this should be okay.
>>>
>>>
>>>
>>> Thanks, Thomas
>>>
>>>
>>>
>>> But anyway it would be better to pass a simple caller provided 
>>> buffer in - stack allocated. Since this function is called at crash 
>>> time and the C heap could be corrupted.
>>>
>>>
>>>
>>> Cheers, Thomas
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A < 
>>> vladimir.a.ivanov at intel.com> wrote:
>>>
>>> Hello,
>>>
>>> could you please review the patch
>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>>>
>>> This patch adds the microcode version for different OSes, which may be
>>> useful in the issue resolution process.
>>>
>>>
>>>
>>> The reported microcode version for different OSes looks as follows:
>>>
>>>
>>>
>>> Linux (RHEL7.7):
>>>
>>> # cat hs_err_pid251046.log |grep microc
>>>
>>> CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per 
>>> core) family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8, 
>>> fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, 
>>> vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, 
>>> tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt, clwb
>>>
>>>
>>>
>>> Windows (Win10, v1809):
>>>
>>> CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per 
>>> core) family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr, 
>>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, 
>>> avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, 
>>> tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt
>>>
>>>
>>>
>>> MacOS (Darwin):
>>>
>>> $ cat hs_err_pid95187.log |grep microc
>>>
>>> CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per 
>>> core) family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr, 
>>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper, 
>>> avx, avx2, aes, clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit, 
>>> bmi1, bmi2, adx, sha, fma, clflush, clflushopt
>>>
>>>
>>>
>>> Thanks, Vladimir
>>>
>>>
>>>    Thanks, Vladimir
>>>
>>>

From igor.ignatyev at oracle.com  Sat Jul 18 03:54:12 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Fri, 17 Jul 2020 20:54:12 -0700
Subject: [15] RFR(T) :  8249697 : java/lang/invoke/RicochetTest.java should
 use @requires instead of @ignore
Message-ID: <054E0326-B61C-40FA-A8E3-89C433A49EE3@oracle.com>

http://cr.openjdk.java.net/~iignatyev/8249697/webrev.00/
> 7 lines changed: 4 ins; 0 del; 3 mod;


Hi all,

could you please review this small and trivial patch for java/lang/invoke/RicochetTest.java test?
from JBS:
> a run of java/lang/invoke/RicochetTest.java w/ MAX_ARITY=255 was removed from all configurations by JDK-7049122, yet the problem manifests itself only w/ Xcomp. as now we have @requires to filter out tests from certain configurations, the test can be updated to run MAX_ARITY=255 in all configs but Xcomp.

the patch splits the test into two subtests, each one w/ one @run, and uses @requires to exclude the one w/ MAX_ARITY=255 from execution if the Xcomp flag is used.

JBS: https://bugs.openjdk.java.net/browse/JDK-8249697
webrev: http://cr.openjdk.java.net/~iignatyev/8249697/webrev.00/
testing: java/lang/invoke/RicochetTest.java on {linux,windows,macos}-x64 w/ and w/o -Xcomp; Xcomp runs, as expected, had only 1 test run

Thanks,
-- Igor

JDK-7049122 : https://bugs.openjdk.java.net/browse/JDK-7049122

From igor.ignatyev at oracle.com  Sat Jul 18 03:57:43 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Fri, 17 Jul 2020 20:57:43 -0700
Subject: [15] RFR(T) : 8249698 :
 java/lang/invoke/LFCaching/LFGarbageCollectedTest.java should be
 ProblemList-ed and not @ignored
Message-ID: <61EBB792-FAF4-4DFD-A674-4BE7153F20AA@oracle.com>

http://cr.openjdk.java.net/~iignatyev//8249698/webrev.00
> 3 lines changed: 1 ins; 1 del; 1 mod;


Hi all,

could you please review this trivial patch which removes @ignore from LFGarbageCollectedTest and adds it into problem-list instead?

from  8249698:
> java/lang/invoke/LFCaching/LFGarbageCollectedTest.java is excluded from execution due to JDK-8078602. although the test might indeed fail due to JDK-8078602, it still can be useful and isn't harmful to run, therefore this test should be put in ProblemList.txt and @ignore is to be removed.
from main issue(8249618):
> although ProblemList and @ignore achieve the same end result (test exclusion), they serve different goals and have slightly different meanings. simplified: @ignore should be used to exclude useless or harmful tests, and ProblemList in all other cases (see the not-yet-integrated "ProblemListing or @ignore-ing a Test" section of the dev guide, PR -- https://github.com/openjdk/guide/pull/21 for more details).
> 
> for various reasons, this hasn't always been followed: some currently @ignore-d tests should rather be ProblemList-ed, and some ProblemList-ed tests should be @ignore-d. this issue is to clean up the current state in the hope that this will reduce further confusion.


webrev: http://cr.openjdk.java.net/~iignatyev//8249698/webrev.00
JBS: https://bugs.openjdk.java.net/browse/JDK-8249698

Thanks,
-- Igor

8078602: https://bugs.openjdk.java.net/browse/JDK-8078602
8249618: https://bugs.openjdk.java.net/browse/JDK-8249618

From thomas.stuefe at gmail.com  Sat Jul 18 04:41:33 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Sat, 18 Jul 2020 06:41:33 +0200
Subject: [16] RFR(S) 8249672: Include microcode revision in
 features_string on x86
In-Reply-To: <BYAPR11MB378279AB52DD8560F661DA03A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
 <29dd9cde-48c8-915f-fa28-26312c7af17a@oracle.com>
 <d6d5d0a8-c990-c74f-ab8a-ef0a8e9a17d0@oracle.com>
 <BYAPR11MB378279AB52DD8560F661DA03A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
Message-ID: <CAA-vtUxq5BinzYfOF6bmDO1OxxLexPnaoYJfPVeC4f1j05AEig@mail.gmail.com>

Hi,

yes, you must use the raw free here (for the same reason we cannot pass in
an os::malloc() allocated buffer to getline, since if it were to resize it
would use raw ::realloc() internally and crash the same way).

But as I wrote in my first mail to the original thread, I would not use
c-heap memory at all, since this function is used during crash reporting in
the signal handler and the c-heap may be corrupted.

If the max line length of /proc/cpuinfo can be reliably predicted (so that
getline won't realloc()) I would pass a stack allocated buffer into getline.
If not, I would not use getline() at all but rewrite this, probably using
fgets().
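
A minimal sketch of what the fgets() variant could look like, assuming a 2048-byte stack buffer and a hypothetical helper name (read_microcode_revision); this is not the code from the webrev. The second function is only a convenience for trying the parser on sample text. Note that stdio may still allocate its own internal stream buffer, so this only removes the explicit line-buffer allocation.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Hypothetical helper (name and signature are assumptions): scan an already
// opened cpuinfo-style stream for the first "microcode" line and return its
// hexadecimal value, or 0 if no such line is found.
static uint32_t read_microcode_revision(FILE* fp) {
  char line[2048];            // caller-owned stack buffer, never resized
  unsigned int result = 0;
  while (fgets(line, sizeof(line), fp) != nullptr) {
    if (strstr(line, "microcode") != nullptr) {
      const char* rev = strchr(line, ':');
      if (rev != nullptr) {
        sscanf(rev + 1, "%x", &result);   // %x accepts an optional 0x prefix
      }
      break;
    }
  }
  return result;
}

// Convenience for experiments: run the parser over an in-memory sample by
// writing it to a temporary file first.
static uint32_t microcode_from_text(const char* text) {
  FILE* fp = tmpfile();
  assert(fp != nullptr);
  fputs(text, fp);
  rewind(fp);
  uint32_t rev = read_microcode_revision(fp);
  fclose(fp);
  return rev;
}
```

With fgets() an over-long line is simply split across several reads instead of triggering a realloc(), which is the property asked for above.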

Cheers, Thomas


On Sat, Jul 18, 2020 at 1:24 AM Ivanov, Vladimir A <
vladimir.a.ivanov at intel.com> wrote:

> Thanks, I expected that the C functions would be needed here. Let's wait a
> little for the Runtime team and then update the buffer handling.
>
>  Thanks, Vladimir
>
> -----Original Message-----
> From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
> Sent: Friday, July 17, 2020 4:17 PM
> To: Thomas Stüfe <thomas.stuefe at gmail.com>; Ivanov, Vladimir A <
> vladimir.a.ivanov at intel.com>
> Cc: Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>;
> hotspot-compiler-dev at openjdk.java.net
> Subject: Re: [16] RFR(S) 8249672: Include microcode revision in
> features_string on x86
>
> I think the issue is that the 'line' buffer is allocated by libc's getline(), and
> os::free(), which is a HotSpot function [1], does not know about it. You need
> C's ::free(), or use HotSpot's os::malloc() to allocate the 'line' buffer.
>
> Someone from Runtime may suggest what is the best for this case.
>
> Thanks,
> Vladimir K
>
> [1]
> http://hg.openjdk.java.net/jdk/jdk/file/14f465f62984/src/hotspot/share/runtime/os.cpp#l792
>
> On 7/17/20 4:03 PM, Vladimir Kozlov wrote:
> > I updated subject to our formal review request format (JDK version,
> RFE's id and subject).
> >
> > I moved RFE to runtime group as Thomas said:
> >
> > https://bugs.openjdk.java.net/browse/JDK-8249672
> >
> > Submitted tier1 testing to build on all our supported platforms. And
> debug builds on linux failed:
> >
> > #  SIGSEGV (0xb) at pc=0x0000146fc6af4b0b, pid=9715, tid=9718
> > # V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
> >
> > V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
> > V  [libjvm.so+0x13c898a]  verify_memory(void*)+0x26a
> > V  [libjvm.so+0x13cd30b]  os::free(void*)+0x5b
> > V  [libjvm.so+0x13e5598]  os::cpu_microcode_revision()+0xc8
> > V  [libjvm.so+0x17d314c]  VM_Version::get_processor_features()+0x76c
> > V  [libjvm.so+0x17d6ead]  VM_Version::initialize()+0x10d
> > V  [libjvm.so+0x17ce6c6]  VM_Version_init()+0x26
> > V  [libjvm.so+0xcb2895]  init_globals()+0x55
> > V  [libjvm.so+0x16dde63]  Threads::create_vm(JavaVMInitArgs*, bool*)+0x2d3
> >
> >
> > Regards,
> > Vladimir K
> >
> > On 7/17/20 3:02 PM, Thomas Stüfe wrote:
> >> Hi Vladimir,
> >>
> >> On Fri, Jul 17, 2020 at 11:57 PM Ivanov, Vladimir A <
> >> vladimir.a.ivanov at intel.com> wrote:
> >>
> >>>>   +#if defined(IA32) || defined(AMD64)
> >>>>
> >>>> Is that not synonymous with x86?
> >>>
> >>> This pattern was copied from the method 'print_model_name_and_flags'
> >>> (file os/linux/os_linux.cpp).
> >>>
> >>> This method also reads the '/proc/cpuinfo' file, and I reused it as a
> >>> 'template' for the new method.
> >>>
> >>> It is better to use one pattern to work with exactly same file but
> >>> in general you are right.
> >>>
> >>> The X86 is defined in the file ./share/utilities/macros.hpp as:
> >>>
> >>> #if defined(IA32) || defined(AMD64)
> >>>
> >>> #define X86
> >>>
> >>> #define X86_ONLY(code) code
> >>>
> >>> #define NOT_X86(code)
> >>>
> >>>
> >>>
> >>> The question here: could I delete these 'ifdefs', since this method
> >>> should work on x86 only?
> >>>
> >>>
> >>>
> >>
> >> os_linux_x86.cpp is compiled for x86 platforms only, whereas
> >> os_linux.cpp is shared among all architectures.
> >>
> >> So, in the former you do not need to exclude non-x86 architectures.
> >>
> >> Cheers, Thomas
> >>
> >>
> >>> Thanks, Vladimir
> >>>
> >>>
> >>>
> >>> *From:* Thomas Stüfe <thomas.stuefe at gmail.com>
> >>> *Sent:* Friday, July 17, 2020 2:26 PM
> >>> *To:* Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>; Hotspot dev
> >>> runtime <hotspot-runtime-dev at openjdk.java.net>
> >>> *Cc:* hotspot-compiler-dev at openjdk.java.net
> >>> *Subject:* Re: add microcode version to the hs_err files
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe
> >>> <thomas.stuefe at gmail.com>
> >>> wrote:
> >>>
> >>> Hi Vladimir,
> >>>
> >>>
> >>>
> >>> I think this would be more suited to hotspot-runtime.
> >>>
> >>>
> >>>
> >>>
> >>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
> >>> src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html
> >>>
> >>>
> >>>
> >>> +#if defined(IA32) || defined(AMD64)
> >>>
> >>> Is that not synonymous with x86?
> >>>
> >>>
> >>>
> >>> +    while ((read = getline(&line, &len, fp)) != -1) {
> >>> +      if (len > 10 && strstr(line, "microcode") != NULL) {
> >>> +        char* rev = strchr(line, ':');
> >>> +        if (rev != NULL) sscanf(rev + 1, "%x", &result);
> >>> +        break;
> >>> +      }
> >>> +    }
> >>> +    free(line);
> >>>
> >>>
> >>>
> >>> Not sure this works as intended. At the first call to getline() it
> >>> will allocate a line buffer for you and return it. That buffer will
> >>> be as large as the first line you happen to read. You then pass that
> >>> same buffer into getline to fetch the next lines, but what if those
> >>> are longer than the first?
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> Forget that point, getline calls realloc() on the line buffer to
> >>> resize it, so this should be okay.
> >>>
> >>>
> >>>
> >>> Thanks, Thomas
> >>>
> >>>
> >>>
> >>> But anyway it would be better to pass a simple caller provided
> >>> buffer in - stack allocated. Since this function is called at crash
> >>> time and the C heap could be corrupted.
> >>>
> >>>
> >>>
> >>> Cheers, Thomas
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A <
> >>> vladimir.a.ivanov at intel.com> wrote:
> >>>
> >>> Hello,
> >>>
> >>> could you please review the patch
> >>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
> >>>
> >>> This patch adds the microcode version for different OSes, which may be
> >>> useful in the issue resolution process.
> >>>
> >>>
> >>>
> >>> The reported microcode version for different OSes looks as follows:
> >>>
> >>>
> >>>
> >>> Linux (RHEL7.7):
> >>>
> >>> # cat hs_err_pid251046.log |grep microc
> >>>
> >>> CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per
> >>> core) family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8,
> >>> fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt,
> >>> vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht,
> >>> tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt, clwb
> >>>
> >>>
> >>>
> >>> Windows (Win10, v1809):
> >>>
> >>> CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per
> >>> core) family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr,
> >>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper,
> >>> avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc,
> >>> tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt
> >>>
> >>>
> >>>
> >>> MacOS (Darwin):
> >>>
> >>> $ cat hs_err_pid95187.log |grep microc
> >>>
> >>> CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per
> >>> core) family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr,
> >>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper,
> >>> avx, avx2, aes, clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit,
> >>> bmi1, bmi2, adx, sha, fma, clflush, clflushopt
> >>>
> >>>
> >>>
> >>> Thanks, Vladimir
> >>>
> >>>
> >>>    Thanks, Vladimir
> >>>
> >>>
>

From vladimir.a.ivanov at intel.com  Sat Jul 18 05:07:59 2020
From: vladimir.a.ivanov at intel.com (Ivanov, Vladimir A)
Date: Sat, 18 Jul 2020 05:07:59 +0000
Subject: [16] RFR(S) 8249672: Include microcode revision in
 features_string on x86
In-Reply-To: <CAA-vtUxq5BinzYfOF6bmDO1OxxLexPnaoYJfPVeC4f1j05AEig@mail.gmail.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
 <29dd9cde-48c8-915f-fa28-26312c7af17a@oracle.com>
 <d6d5d0a8-c990-c74f-ab8a-ef0a8e9a17d0@oracle.com>
 <BYAPR11MB378279AB52DD8560F661DA03A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUxq5BinzYfOF6bmDO1OxxLexPnaoYJfPVeC4f1j05AEig@mail.gmail.com>
Message-ID: <BYAPR11MB37828BD1DA9857660415F50EA77D0@BYAPR11MB3782.namprd11.prod.outlook.com>

Hi,
it seems this info is created during the initialization phase. Is that correct? Collecting or parsing common info at the crash point is usually not a good idea. During initialization, using the c-heap is not a problem.
'::free' works OK here. At least the tier1 tests produce the same results for patched and non-patched builds, but these tests do not exercise a real hs_err case.
It looks like a 2k byte array is enough for one CPU record from the cpuinfo file. I will update the code to use a local buffer.

Thanks, Vladimir

From: Thomas Stüfe <thomas.stuefe at gmail.com>
Sent: Friday, July 17, 2020 9:42 PM
To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>
Cc: Vladimir Kozlov <vladimir.kozlov at oracle.com>; Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: [16] RFR(S) 8249672: Include microcode revision in features_string on x86

Hi,

yes, you must use the raw free here (for the same reason we cannot pass in an os::malloc() allocated buffer to getline, since if it were to resize it would use raw ::realloc() internally and crash the same way).

But as I wrote in my first mail to the original thread, I would not use c-heap memory at all, since this function is used during crash reporting in the signal handler and the c-heap may be corrupted.

If the max line length of /proc/cpuinfo can be reliably predicted (so that getline won't realloc()) I would pass a stack allocated buffer into getline. If not, I would not use getline() at all but rewrite this, probably using fgets().

Cheers, Thomas


On Sat, Jul 18, 2020 at 1:24 AM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com> wrote:
Thanks, I expected that the C functions would be needed here. Let's wait a little for the Runtime team and then update the buffer handling.

 Thanks, Vladimir

-----Original Message-----
From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
Sent: Friday, July 17, 2020 4:17 PM
To: Thomas Stüfe <thomas.stuefe at gmail.com>; Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>
Cc: Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: [16] RFR(S) 8249672: Include microcode revision in features_string on x86

I think the issue is that the 'line' buffer is allocated by libc's getline(), and os::free(), which is a HotSpot function [1], does not know about it. You need C's ::free(), or use HotSpot's os::malloc() to allocate the 'line' buffer.

Someone from Runtime may suggest what is the best for this case.

Thanks,
Vladimir K

[1] http://hg.openjdk.java.net/jdk/jdk/file/14f465f62984/src/hotspot/share/runtime/os.cpp#l792

On 7/17/20 4:03 PM, Vladimir Kozlov wrote:
> I updated subject to our formal review request format (JDK version, RFE's id and subject).
>
> I moved RFE to runtime group as Thomas said:
>
> https://bugs.openjdk.java.net/browse/JDK-8249672
>
> Submitted tier1 testing to build on all our supported platforms. And debug builds on linux failed:
>
> #  SIGSEGV (0xb) at pc=0x0000146fc6af4b0b, pid=9715, tid=9718
> # V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
>
> V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
> V  [libjvm.so+0x13c898a]  verify_memory(void*)+0x26a
> V  [libjvm.so+0x13cd30b]  os::free(void*)+0x5b
> V  [libjvm.so+0x13e5598]  os::cpu_microcode_revision()+0xc8
> V  [libjvm.so+0x17d314c]  VM_Version::get_processor_features()+0x76c
> V  [libjvm.so+0x17d6ead]  VM_Version::initialize()+0x10d
> V  [libjvm.so+0x17ce6c6]  VM_Version_init()+0x26
> V  [libjvm.so+0xcb2895]  init_globals()+0x55
> V  [libjvm.so+0x16dde63]  Threads::create_vm(JavaVMInitArgs*, bool*)+0x2d3
>
>
> Regards,
> Vladimir K
>
> On 7/17/20 3:02 PM, Thomas Stüfe wrote:
>> Hi Vladimir,
>>
>> On Fri, Jul 17, 2020 at 11:57 PM Ivanov, Vladimir A <
>> vladimir.a.ivanov at intel.com> wrote:
>>
>>>>   +#if defined(IA32) || defined(AMD64)
>>>>
>>>> Is that not synonymous with x86?
>>>
>>> This pattern was copied from the method 'print_model_name_and_flags'
>>> (file os/linux/os_linux.cpp).
>>>
>>> This method also reads the '/proc/cpuinfo' file, and I reused it as a
>>> 'template' for the new method.
>>>
>>> It is better to use one pattern to work with exactly same file but
>>> in general you are right.
>>>
>>> The X86 is defined in the file ./share/utilities/macros.hpp as:
>>>
>>> #if defined(IA32) || defined(AMD64)
>>>
>>> #define X86
>>>
>>> #define X86_ONLY(code) code
>>>
>>> #define NOT_X86(code)
>>>
>>>
>>>
>>> The question here: could I delete these 'ifdefs', since this method
>>> should work on x86 only?
>>>
>>>
>>>
>>
>> os_linux_x86.cpp is compiled for x86 platforms only, whereas
>> os_linux.cpp is shared among all architectures.
>>
>> So, in the former you do not need to exclude non-x86 architectures.
>>
>> Cheers, Thomas
>>
>>
>>> Thanks, Vladimir
>>>
>>>
>>>
>>> *From:* Thomas Stüfe <thomas.stuefe at gmail.com>
>>> *Sent:* Friday, July 17, 2020 2:26 PM
>>> *To:* Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>; Hotspot dev
>>> runtime <hotspot-runtime-dev at openjdk.java.net>
>>> *Cc:* hotspot-compiler-dev at openjdk.java.net
>>> *Subject:* Re: add microcode version to the hs_err files
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe
>>> <thomas.stuefe at gmail.com>
>>> wrote:
>>>
>>> Hi Vladimir,
>>>
>>>
>>>
>>> I think this would be more suited to hotspot-runtime.
>>>
>>>
>>>
>>>
>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>>> src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html
>>>
>>>
>>>
>>> +#if defined(IA32) || defined(AMD64)
>>>
>>> Is that not synonymous with x86?
>>>
>>>
>>>
>>> +    while ((read = getline(&line, &len, fp)) != -1) {
>>> +      if (len > 10 && strstr(line, "microcode") != NULL) {
>>> +        char* rev = strchr(line, ':');
>>> +        if (rev != NULL) sscanf(rev + 1, "%x", &result);
>>> +        break;
>>> +      }
>>> +    }
>>> +    free(line);
>>>
>>>
>>>
>>> Not sure this works as intended. At the first call to getline() it
>>> will allocate a line buffer for you and return it. That buffer will
>>> be as large as the first line you happen to read. You then pass that
>>> same buffer into getline to fetch the next lines, but what if those
>>> are longer than the first?
>>>
>>>
>>>
>>>
>>>
>>> Forget that point, getline calls realloc() on the line buffer to
>>> resize it, so this should be okay.
>>>
>>>
>>>
>>> Thanks, Thomas
>>>
>>>
>>>
>>> But anyway it would be better to pass a simple caller provided
>>> buffer in - stack allocated. Since this function is called at crash
>>> time and the C heap could be corrupted.
>>>
>>>
>>>
>>> Cheers, Thomas
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A <
>>> vladimir.a.ivanov at intel.com> wrote:
>>>
>>> Hello,
>>>
>>> could you please review the patch
>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>>>
>>> This patch adds the microcode version for different OSes, which may be
>>> useful in the issue resolution process.
>>>
>>>
>>>
>>> The reported microcode version for different OSes looks as follows:
>>>
>>>
>>>
>>> Linux (RHEL7.7):
>>>
>>> # cat hs_err_pid251046.log |grep microc
>>>
>>> CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per
>>> core) family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8,
>>> fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt,
>>> vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht,
>>> tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt, clwb
>>>
>>>
>>>
>>> Windows (Win10, v1809):
>>>
>>> CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per
>>> core) family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr,
>>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper,
>>> avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc,
>>> tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt
>>>
>>>
>>>
>>> MacOS (Darwin):
>>>
>>> $ cat hs_err_pid95187.log |grep microc
>>>
>>> CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per
>>> core) family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr,
>>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper,
>>> avx, avx2, aes, clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit,
>>> bmi1, bmi2, adx, sha, fma, clflush, clflushopt
>>>
>>>
>>>
>>> Thanks, Vladimir
>>>
>>>
>>>    Thanks, Vladimir
>>>
>>>

From thomas.stuefe at gmail.com  Sat Jul 18 05:24:45 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Sat, 18 Jul 2020 07:24:45 +0200
Subject: [16] RFR(S) 8249672: Include microcode revision in
 features_string on x86
In-Reply-To: <BYAPR11MB37828BD1DA9857660415F50EA77D0@BYAPR11MB3782.namprd11.prod.outlook.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
 <29dd9cde-48c8-915f-fa28-26312c7af17a@oracle.com>
 <d6d5d0a8-c990-c74f-ab8a-ef0a8e9a17d0@oracle.com>
 <BYAPR11MB378279AB52DD8560F661DA03A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUxq5BinzYfOF6bmDO1OxxLexPnaoYJfPVeC4f1j05AEig@mail.gmail.com>
 <BYAPR11MB37828BD1DA9857660415F50EA77D0@BYAPR11MB3782.namprd11.prod.outlook.com>
Message-ID: <CAA-vtUxWzQ6bsxv08CGTfEN_qpj5cXz00eVcJeb1fiqOGe0UoA@mail.gmail.com>

Oh, sorry, you are right :(

I was under the assumption you wanted to call os::cpu_microcode_revision()
directly from within VMError::report(). During initialization using c-heap
like this should not be a problem and you can forget about 9/10ths of what
I wrote, sorry.

In that case your original variant is fine, my only suggestion would be to
clearly mark the free as ::free() with a comment to prevent someone from
correcting it to os::free.
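
As a sketch of that suggestion (the function names are hypothetical and this is not the webrev code), the getline() loop with an explicitly commented ::free() might look like the following. It assumes a POSIX libc that provides getline(). One deliberate difference from the quoted patch: it tests the returned line length (nread) instead of the buffer capacity (len).

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <sys/types.h>   // ssize_t

// Hypothetical helper: read the first "microcode" value from a
// cpuinfo-style stream using POSIX getline(); returns 0 if none is found.
static uint32_t read_microcode_revision_getline(FILE* fp) {
  char* line = nullptr;   // getline() allocates and resizes this buffer
  size_t len = 0;         // capacity of 'line', maintained by getline()
  ssize_t nread;
  unsigned int result = 0;
  while ((nread = getline(&line, &len, fp)) != -1) {
    if (nread > 10 && strstr(line, "microcode") != nullptr) {
      const char* rev = strchr(line, ':');
      if (rev != nullptr) {
        sscanf(rev + 1, "%x", &result);
      }
      break;
    }
  }
  // NOTE: 'line' was allocated by libc inside getline(), so it must be
  // released with the raw C free(). Do not "correct" this to os::free().
  ::free(line);
  return result;
}

// Convenience for experiments: parse an in-memory sample via a temp file.
static uint32_t revision_from_text_getline(const char* text) {
  FILE* fp = tmpfile();
  assert(fp != nullptr);
  fputs(text, fp);
  rewind(fp);
  uint32_t rev = read_microcode_revision_getline(fp);
  fclose(fp);
  return rev;
}
```

Calling ::free() on a null pointer is a no-op, so the cleanup is safe even when the stream is empty and getline() never allocated anything.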

Thank you,

Thomas


On Sat, Jul 18, 2020 at 7:08 AM Ivanov, Vladimir A <
vladimir.a.ivanov at intel.com> wrote:

> Hi,
>
> it seems this info is created during the initialization phase. Is that
> correct? Collecting or parsing common info at the crash point is usually
> not a good idea. During initialization, using the c-heap is not a problem.
>
> '::free' works OK here. At least the tier1 tests produce the same results
> for patched and non-patched builds, but these tests do not exercise a real
> hs_err case.
>
> It looks like a 2k byte array is enough for one CPU record from the cpuinfo
> file. I will update the code to use a local buffer.
>
>
>
> Thanks, Vladimir
>
>
>
> *From:* Thomas Stüfe <thomas.stuefe at gmail.com>
> *Sent:* Friday, July 17, 2020 9:42 PM
> *To:* Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>
> *Cc:* Vladimir Kozlov <vladimir.kozlov at oracle.com>; Hotspot dev runtime <
> hotspot-runtime-dev at openjdk.java.net>;
> hotspot-compiler-dev at openjdk.java.net
> *Subject:* Re: [16] RFR(S) 8249672: Include microcode revision in
> features_string on x86
>
>
>
> Hi,
>
>
>
> yes, you must use the raw free here (for the same reason we cannot pass in
> an os::malloc() allocated buffer to getline, since if it were to resize it
> would use raw ::realloc() internally and crash the same way).
>
>
>
> But as I wrote in my first mail to the original thread, I would not use
> c-heap memory at all, since this function is used during crash reporting in
> the signal handler and the c-heap may be corrupted.
>
>
>
> If the max line length of /proc/cpuinfo can be reliably predicted (so that
> getline won't realloc()) I would pass a stack allocated buffer into getline.
> If not, I would not use getline() at all but rewrite this, probably using
> fgets().
>
>
>
> Cheers, Thomas
>
>
>
>
>
>
>
>
>
> On Sat, Jul 18, 2020 at 1:24 AM Ivanov, Vladimir A <
> vladimir.a.ivanov at intel.com> wrote:
>
> Thanks, I expected the C's functions here. Let's wait a little bit for
> Runtime team and update work with buffer.
>
>  Thanks, Vladimir
>
> -----Original Message-----
> From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
> Sent: Friday, July 17, 2020 4:17 PM
> To: Thomas Stüfe <thomas.stuefe at gmail.com>; Ivanov, Vladimir A <
> vladimir.a.ivanov at intel.com>
> Cc: Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>;
> hotspot-compiler-dev at openjdk.java.net
> Subject: Re: [16] RFR(S) 8249672: Include microcode revision in
> features_string on x86
>
> I think the issue is that the 'line' buffer is allocated by libc getline() and
> os::free(), which is a HotSpot function [1], does not know about it. You need
> C's ::free() or use HS's os::malloc() to allocate the 'line' buffer.
>
> Someone from Runtime may suggest what is the best for this case.
>
> Thanks,
> Vladimir K
>
> [1]
> http://hg.openjdk.java.net/jdk/jdk/file/14f465f62984/src/hotspot/share/runtime/os.cpp#l792
>
> On 7/17/20 4:03 PM, Vladimir Kozlov wrote:
> > I updated subject to our formal review request format (JDK version,
> RFE's id and subject).
> >
> > I moved RFE to runtime group as Thomas said:
> >
> > https://bugs.openjdk.java.net/browse/JDK-8249672
> >
> > Submitted tier1 testing to build on all our supported platforms. And
> debug builds on linux failed:
> >
> > #  SIGSEGV (0xb) at pc=0x0000146fc6af4b0b, pid=9715, tid=9718
> > # V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
> >
> > V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
> > V  [libjvm.so+0x13c898a]  verify_memory(void*)+0x26a
> > V  [libjvm.so+0x13cd30b]  os::free(void*)+0x5b
> > V  [libjvm.so+0x13e5598]  os::cpu_microcode_revision()+0xc8
> > V  [libjvm.so+0x17d314c]  VM_Version::get_processor_features()+0x76c
> > V  [libjvm.so+0x17d6ead]  VM_Version::initialize()+0x10d
> > V  [libjvm.so+0x17ce6c6]  VM_Version_init()+0x26
> > V  [libjvm.so+0xcb2895]  init_globals()+0x55
> > V  [libjvm.so+0x16dde63]  Threads::create_vm(JavaVMInitArgs*, bool*)+0x2d3
> >
> >
> > Regards,
> > Vladimir K
> >
> > On 7/17/20 3:02 PM, Thomas Stüfe wrote:
> >> Hi Vladimir,
> >>
> >> On Fri, Jul 17, 2020 at 11:57 PM Ivanov, Vladimir A <
> >> vladimir.a.ivanov at intel.com> wrote:
> >>
> >>>>   +#if defined(IA32) || defined(AMD64)
> >>>>
> >>>> Is that not synonymous with x86?
> >>>
> >>> This pattern was copied from the method 'print_model_name_and_flags'
> >>> (file os/linux/os_linux.cpp).
> >>>
> >>> This method also reads the '/proc/cpuinfo' file and I reused it as a
> >>> 'template' for the new method.
> >>>
> >>> It is better to use one pattern to work with exactly the same file, but
> >>> in general you are right.
> >>>
> >>> The X86 is defined in the file ./share/utilities/macros.hpp as:
> >>>
> >>> #if defined(IA32) || defined(AMD64)
> >>>
> >>> #define X86
> >>>
> >>> #define X86_ONLY(code) code
> >>>
> >>> #define NOT_X86(code)
> >>>
> >>>
> >>>
> >>> The question here: could I delete these 'ifdefs', since this method
> >>> should work on x86 only?
> >>>
> >>>
> >>>
> >>
> >> os_linux_x86.cpp is compiled for x86 platforms only, whereas
> >> os_linux.cpp is shared among all architectures.
> >>
> >> So, in the former you do not need to exclude non-x86 architectures.
> >>
> >> Cheers, Thomas
> >>
> >>
> >>> Thanks, Vladimir
> >>>
> >>>
> >>>
> >>> *From:* Thomas Stüfe <thomas.stuefe at gmail.com>
> >>> *Sent:* Friday, July 17, 2020 2:26 PM
> >>> *To:* Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>; Hotspot dev
> >>> runtime <hotspot-runtime-dev at openjdk.java.net>
> >>> *Cc:* hotspot-compiler-dev at openjdk.java.net
> >>> *Subject:* Re: add microcode version to the hs_err files
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe
> >>> <thomas.stuefe at gmail.com>
> >>> wrote:
> >>>
> >>> Hi Vladimir,
> >>>
> >>>
> >>>
> >>> I think this would be more suited to hotspot-runtime.
> >>>
> >>>
> >>>
> >>>
> >>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
> >>> src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html
> >>>
> >>>
> >>>
> >>> +#if defined(IA32) || defined(AMD64)
> >>>
> >>> Is that not synonymous with x86?
> >>>
> >>>
> >>>
> >>> +    while ((read = getline(&line, &len, fp)) != -1) {
> >>> +      if (len > 10 && strstr(line, "microcode") != NULL) {
> >>> +        char* rev = strchr(line, ':');
> >>> +        if (rev != NULL) sscanf(rev + 1, "%x", &result);
> >>> +        break;
> >>> +      }
> >>> +    }
> >>> +    free(line);
> >>>
> >>>
> >>>
> >>> Not sure this works as intended. At the first call to getline() it
> >>> will allocate a line buffer for you and return it. That buffer will
> >>> be as large as the first line you happen to read. You then pass that
> >>> same buffer into getline to fetch the next lines, but what if those
> >>> are longer than the first?
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> Forget that point, getline calls realloc() on the line buffer to
> >>> resize it, so this should be okay.
> >>>
> >>>
> >>>
> >>> Thanks, Thomas
> >>>
> >>>
> >>>
> >>> But anyway it would be better to pass a simple caller provided
> >>> buffer in - stack allocated. Since this function is called at crash
> >>> time and the C heap could be corrupted.
> >>>
> >>>
> >>>
> >>> Cheers, Thomas
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A <
> >>> vladimir.a.ivanov at intel.com> wrote:
> >>>
> >>> Hello,
> >>>
> >>> could you please review the patch
> >>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
> >>>
> >>> This patch adds the microcode version for different OSes, which may be
> >>> useful in the issue resolution process.
> >>>
> >>>
> >>>
> >>> The reported microcode version for different OSes looks as follows:
> >>>
> >>>
> >>>
> >>> Linux (RHEL7.7):
> >>>
> >>> # cat hs_err_pid251046.log |grep microc
> >>>
> >>> CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per
> >>> core) family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8,
> >>> fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt,
> >>> vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht,
> >>> tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt, clwb
> >>>
> >>>
> >>>
> >>> Windows (Win10, v1809):
> >>>
> >>> CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per
> >>> core) family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr,
> >>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper,
> >>> avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc,
> >>> tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt
> >>>
> >>>
> >>>
> >>> MacOS (Darwin):
> >>>
> >>> $ cat hs_err_pid95187.log |grep microc
> >>>
> >>> CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per
> >>> core) family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr,
> >>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper,
> >>> avx, avx2, aes, clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit,
> >>> bmi1, bmi2, adx, sha, fma, clflush, clflushopt
> >>>
> >>>
> >>>
> >>> Thanks, Vladimir
> >>>
> >>>
> >>>    Thanks, Vladimir
> >>>
> >>>
>
>

From vladimir.kozlov at oracle.com  Sat Jul 18 17:09:40 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Sat, 18 Jul 2020 10:09:40 -0700
Subject: [15] RFR(T) : 8249697 : java/lang/invoke/RicochetTest.java should
 use @requires instead of @ignore
In-Reply-To: <054E0326-B61C-40FA-A8E3-89C433A49EE3@oracle.com>
References: <054E0326-B61C-40FA-A8E3-89C433A49EE3@oracle.com>
Message-ID: <1539aec8-c8ad-0acb-b7a3-20d4e839a3cd@oracle.com>

Good.

Thanks,
Vladimir

On 7/17/20 8:54 PM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev/8249697/webrev.00/
>> 7 lines changed: 4 ins; 0 del; 3 mod;
> 
> 
> Hi all,
> 
> could you please review this small and trivial patch for java/lang/invoke/RicochetTest.java test?
> from JBS:
>> a run of java/lang/invoke/RicochetTest.java w/ MAX_ARITY=255 was removed from all configurations by JDK-7049122, yet the problem manifests itself only w/ Xcomp. as now we have @requires to filter out tests from certain configurations, the test can be updated to run MAX_ARITY=255 in all configs but Xcomp.
> 
> the patch splits the test into two subtests, each one w/ one @run, and use @requires to exclude one w/ MAX_ARITY=255 from execution if Xcomp flag is used.
> 
> JBS: https://bugs.openjdk.java.net/browse/JDK-8249697
> webrev: http://cr.openjdk.java.net/~iignatyev/8249697/webrev.00/
> testing: java/lang/invoke/RicochetTest.java on {linux,windows,macos}-x64 w/ and w/o -Xcomp; Xcomp runs, as expected, had only 1 test run
> 
> Thanks,
> -- Igor
> 
> JDK-7049122 : https://bugs.openjdk.java.net/browse/JDK-7049122
> 

From vladimir.kozlov at oracle.com  Sat Jul 18 17:10:26 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Sat, 18 Jul 2020 10:10:26 -0700
Subject: [15] RFR(T) : 8249698 :
 java/lang/invoke/LFCaching/LFGarbageCollectedTest.java should be
 ProblemList-ed and not @ignored
In-Reply-To: <61EBB792-FAF4-4DFD-A674-4BE7153F20AA@oracle.com>
References: <61EBB792-FAF4-4DFD-A674-4BE7153F20AA@oracle.com>
Message-ID: <42b76420-205b-e72a-558d-8659242e1c06@oracle.com>

Good.

Thanks,
Vladimir

On 7/17/20 8:57 PM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev//8249698/webrev.00
>> 3 lines changed: 1 ins; 1 del; 1 mod;
> 
> 
> Hi all,
> 
> could you please review this trivial patch which removes @ignore from LFGarbageCollectedTest and adds it into problem-list instead?
> 
> from  8249698:
>> java/lang/invoke/LFCaching/LFGarbageCollectedTest.java is excluded from execution due to JDK-8078602. although the test might indeed fail due to JDK-8078602, it still can be useful and isn't harmful to run, therefore this test should be put in ProblemList.txt and @ignore is to be removed.
> from main issue(8249618):
>> although ProblemList and @ignore achieve the same end result (test exclusion), they serve different goals and have slightly different meanings. Simplified: @ignore should be used to exclude useless or harmful tests, and ProblemList in all other cases (see the not-yet-integrated "ProblemListing or `@ignore`-ing a Test" section of the dev guide, PR -- https://github.com/openjdk/guide/pull/21 for more details).
>>
>> due to different reasons, this hasn't been always followed and some currently @ignore-d tests should rather be ProblemList-ed, and some of ProblemList-ed should be @ignore-d, this issue is to clean up the current state in a hope that this will reduce further confusion.
> 
> 
> webrev: http://cr.openjdk.java.net/~iignatyev//8249698/webrev.00
> JBS: https://bugs.openjdk.java.net/browse/JDK-8249698
> 
> Thanks,
> -- Igor
> 
> 8078602: https://bugs.openjdk.java.net/browse/JDK-8078602
> 8249618: https://bugs.openjdk.java.net/browse/JDK-8249618
> 

From mandy.chung at oracle.com  Sun Jul 19 04:32:32 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Sat, 18 Jul 2020 21:32:32 -0700
Subject: [15] RFR(T) : 8249697 : java/lang/invoke/RicochetTest.java should
 use @requires instead of @ignore
In-Reply-To: <054E0326-B61C-40FA-A8E3-89C433A49EE3@oracle.com>
References: <054E0326-B61C-40FA-A8E3-89C433A49EE3@oracle.com>
Message-ID: <60806519-8e57-d126-8a2e-800053b4ee9a@oracle.com>


On 7/17/20 8:54 PM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev/8249697/webrev.00/
>

I suggest changing this:
  32  * @comment The following test creates an unreasonable number of adapters in -Xcomp mode (7049122)

To:

     @bug 8249697
     @summary verify very high number of adapters in -Xcomp mode

Otherwise, looks fine.

Mandy
> Hi all,
>
> could you please review this small and trivial patch for java/lang/invoke/RicochetTest.java test?
> from JBS:
>> a run of java/lang/invoke/RicochetTest.java w/ MAX_ARITY=255 was removed from all configurations by JDK-7049122, yet the problem manifests itself only w/ Xcomp. as now we have @requires to filter out tests from certain configurations, the test can be updated to run MAX_ARITY=255 in all configs but Xcomp.
> the patch splits the test into two subtests, each one w/ one @run, and use @requires to exclude one w/ MAX_ARITY=255 from execution if Xcomp flag is used.
>
> JBS: https://bugs.openjdk.java.net/browse/JDK-8249697
> webrev: http://cr.openjdk.java.net/~iignatyev/8249697/webrev.00/
> testing: java/lang/invoke/RicochetTest.java on {linux,windows,macos}-x64 w/ and w/o -Xcomp; Xcomp runs, as expected, had only 1 test run
>
> Thanks,
> -- Igor
>
> JDK-7049122 : https://bugs.openjdk.java.net/browse/JDK-7049122


From mandy.chung at oracle.com  Sun Jul 19 04:33:21 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Sat, 18 Jul 2020 21:33:21 -0700
Subject: [15] RFR(T) : 8249698 :
 java/lang/invoke/LFCaching/LFGarbageCollectedTest.java should be
 ProblemList-ed and not @ignored
In-Reply-To: <61EBB792-FAF4-4DFD-A674-4BE7153F20AA@oracle.com>
References: <61EBB792-FAF4-4DFD-A674-4BE7153F20AA@oracle.com>
Message-ID: <ee1085d5-af99-09f9-02ca-0fa7a79c3f33@oracle.com>

+1

Mandy

On 7/17/20 8:57 PM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev//8249698/webrev.00
>> 3 lines changed: 1 ins; 1 del; 1 mod;
>
> Hi all,
>
> could you please review this trivial patch which removes @ignore from LFGarbageCollectedTest and adds it into problem-list instead?
>
> from  8249698:
>> java/lang/invoke/LFCaching/LFGarbageCollectedTest.java is excluded from execution due to JDK-8078602. although the test might indeed fail due to JDK-8078602, it still can be useful and isn't harmful to run, therefore this test should be put in ProblemList.txt and @ignore is to be removed.
> from main issue(8249618):
>> although ProblemList and @ignore achieve the same end result (test exclusion), they serve different goals and have slightly different meanings. Simplified: @ignore should be used to exclude useless or harmful tests, and ProblemList in all other cases (see the not-yet-integrated "ProblemListing or `@ignore`-ing a Test" section of the dev guide, PR -- https://github.com/openjdk/guide/pull/21 for more details).
>>
>> due to different reasons, this hasn't been always followed and some currently @ignore-d tests should rather be ProblemList-ed, and some of ProblemList-ed should be @ignore-d, this issue is to clean up the current state in a hope that this will reduce further confusion.
>
> webrev: http://cr.openjdk.java.net/~iignatyev//8249698/webrev.00
> JBS: https://bugs.openjdk.java.net/browse/JDK-8249698
>
> Thanks,
> -- Igor
>
> 8078602: https://bugs.openjdk.java.net/browse/JDK-8078602
> 8249618: https://bugs.openjdk.java.net/browse/JDK-8249618


From david.holmes at oracle.com  Mon Jul 20 01:06:44 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 20 Jul 2020 11:06:44 +1000
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
Message-ID: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>

Bug: https://bugs.openjdk.java.net/browse/JDK-8249650
webrev: http://cr.openjdk.java.net/~dholmes/8249650/webrev/

This is a simple cleanup that touches files across a number of VM areas 
- hence the cross-post.

Whilst working on a different JNI fix I noticed that in most cases in 
jni.cpp we were using the following form of make_local:

JNIHandles::make_local(env, obj);

and what that form does is first extract the thread from the JNIEnv:

JavaThread* thread = JavaThread::thread_from_jni_environment(env);
return thread->active_handles()->allocate_handle(obj);

but there is also another, faster, variant for when you already have the 
"thread":

jobject JNIHandles::make_local(Thread* thread, oop obj) {
   return thread->active_handles()->allocate_handle(obj);
}

When you look at the JNI_ENTRY wrapper (and related JVM_ENTRY, WB_ENTRY, 
UNSAFE_ENTRY etc) it has already extracted the thread from the JNIEnv:

     JavaThread* thread=JavaThread::thread_from_jni_environment(env);

and further defined:

     Thread* THREAD = thread;

so we always have direct access to the "thread" (or indirect access via 
TRAPS), and in fact we can end up removing the 
make_local(JNIEnv* env, oop obj) variant altogether.
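
To make the difference between the two forms concrete, here is a small
self-contained sketch. All types are hypothetical mock-ups (MockThread,
MockEnv, MockHandleBlock), not the HotSpot classes, and the back-pointer in
MockEnv is only an assumption standing in for the real env-to-thread
conversion. A counter shows that the thread-based form skips that conversion.

```cpp
#include <cassert>
#include <deque>

// Hypothetical stand-ins for HotSpot types; none of this is the real VM code.
typedef const void* oop;      // an object reference
typedef const oop*  jobject;  // a handle: pointer to a slot holding an oop

struct MockThread;

struct MockEnv {
  MockThread* owner;  // assumption: a plain back-pointer to the owning thread
};

struct MockHandleBlock {
  std::deque<oop> slots;  // deque keeps element addresses stable on push_back
  jobject allocate_handle(oop obj) {
    slots.push_back(obj);
    return &slots.back();
  }
};

struct MockThread {
  MockEnv env;
  MockHandleBlock handles;
  MockThread() { env.owner = this; }
  MockHandleBlock* active_handles() { return &handles; }
};

static int env_lookups = 0;  // counts env -> thread conversions

static MockThread* thread_from_jni_environment(MockEnv* env) {
  ++env_lookups;
  return env->owner;
}

// Env-based form: has to recover the thread first.
static jobject make_local(MockEnv* env, oop obj) {
  MockThread* thread = thread_from_jni_environment(env);
  return thread->active_handles()->allocate_handle(obj);
}

// Thread-based form: no lookup when the caller already has the thread.
static jobject make_local(MockThread* thread, oop obj) {
  return thread->active_handles()->allocate_handle(obj);
}
```

Inside an entry wrapper that has already done the conversion once, calling the
env-based form repeats it for every handle, which is the redundancy this
cleanup removes.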

Along the way I spotted some related issues with unnecessary use of 
Thread::current() when it is already available from TRAPS, and some 
other cases where we extracted the JNIEnv from a thread only to later 
extract the thread from the JNIEnv.

Testing: tiers 1 - 3

Thanks,
David
-----

From ningsheng.jian at arm.com  Mon Jul 20 03:51:25 2020
From: ningsheng.jian at arm.com (Ningsheng Jian)
Date: Mon, 20 Jul 2020 11:51:25 +0800
Subject: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API
 (Incubator): AArch64 backend changes
In-Reply-To: <VI1PR0802MB2558ECC61528694DBDFFF48F8E670@VI1PR0802MB2558.eurprd08.prod.outlook.com>
References: <VI1PR0802MB2558C90ACF1189F17692FF278EB90@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <275eb57c-51c0-675e-c32a-91b198023559@redhat.com>
 <719F9169-ABC4-408E-B732-F1BD9A84337F@oracle.com>
 <9a13f5df-d946-579d-4282-917dc7338dc8@redhat.com>
 <09BC0693-80E0-4F87-855E-0B38A6F5EFA2@oracle.com>
 <668e500e-f621-5a2c-a41e-f73536880f73@redhat.com>
 <VI1PR0802MB255845256424F9EB4D5D81368EB30@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <1909fa9d-98bb-c2fb-45d8-540247d1ca8b@redhat.com>
 <VI1PR0802MB255895ADE9BC8D027EBB81048EB10@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <BYAPR11MB35438110FA50851942CE17A1EF950@BYAPR11MB3543.namprd11.prod.outlook.com>
 <HE1PR0802MB25542AE212E12FD3EF6D1BE48E6E0@HE1PR0802MB2554.eurprd08.prod.outlook.com>
 <2acbcc99-8dd4-b8f1-5982-1d439953c416@redhat.com>
 <VI1PR0802MB2558ECC61528694DBDFFF48F8E670@VI1PR0802MB2558.eurprd08.prod.outlook.com>
Message-ID: <54d6b2b6-b79a-4700-981c-6ab33aca82f2@arm.com>

Hi Andrew and all,

Since we are getting ready to propose targeting the Vector API to JDK 16 [1], I 
have regenerated the webrev of the aarch64 backend parts from the panama repo, 
which has been rebased onto jdk/jdk very recently, by:

$ hg update vector-unstable && hg diff -r default > all.patch
$ grep "diff -r" all.patch | grep -e "src/hotspot/cpu/aarch64" | awk '{print $4}' > aarch64_list
$ ksh ./webrev.ksh -r default -o aarch64_webrev aarch64_list

The new webrev:
http://cr.openjdk.java.net/~njian/vectorapi/8223347-integration/aarch64-webrev.01/

Could you please help to take a look?

Yang's previous webrevs can still be found at [2], with review comments 
addressed in the latest webrev above.

[1] http://mail.openjdk.java.net/pipermail/hotspot-dev/2020-July/042427.html
[2] http://cr.openjdk.java.net/~yzhang/vectorapi/vectorapi.rfr/aarch64_webrev/


Thanks,
Ningsheng

On 7/8/20 3:05 PM, Yang Zhang wrote:
> Hi Andrew
> 
> I have updated this patch. Could you please help to review it again?
> In this patch, the following changes are made:
> 1. Separate newly added NEON instructions to a new ad file
>     aarch64_neon.ad
> 2. Add assembler tests for NEON instructions. Trailing spaces
>     in the python script are also removed.
> 
> http://cr.openjdk.java.net/~yzhang/vectorapi/vectorapi.rfr/aarch64_webrev/webrev.02/
> 
> Thanks,
> Yang
> 
> 
> -----Original Message-----
> From: Andrew Haley <aph at redhat.com>
> Sent: Tuesday, June 30, 2020 12:10 AM
> To: Yang Zhang <Yang.Zhang at arm.com>; Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; Paul Sandoz <paul.sandoz at oracle.com>
> Cc: nd <nd at arm.com>; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; core-libs-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
> Subject: Re: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API (Incubator): AArch64 backend changes
> 
> On 29/06/2020 08:48, Yang Zhang wrote:
>> 1. Instructions that can be matched with NEON instructions directly.
>> MulVB, SqrtVF and AbsV have been merged into jdk master already.
>>
>> 2. Instructions that jdk master has middle end support for, but they cannot be matched with NEON instructions directly.
>> Such as AddReductionVL, MulReductionVL, And/Or/XorReductionV These new instructions can be moved into jdk master first, but for auto-vectorization, the performance might not get improved.
>>
>> 3. Panama/Vector API specific  instructions such as Load/StoreVector ( 16 bits), VectorReinterpret, VectorMaskCmp, MaxV/MinV, VectorBlend etc.
>> These instructions cannot be moved into jdk master first because there isn't middle-end support.
>>
>> I will put 2 and 3 in a new ad file aarch64_neon.ad. I will also update aarch64_asmtest.py and macroassemler.cpp. When the patch is ready, I will send it again.
> 
> Thank you *very* much for your hard work. Appreciated!
> 
> --
> Andrew Haley  (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
> 


From david.holmes at oracle.com  Mon Jul 20 04:16:49 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 20 Jul 2020 14:16:49 +1000
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
Message-ID: <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>

Subject line got truncated by accident ...

On 20/07/2020 11:06 am, David Holmes wrote:
> Bug: https://bugs.openjdk.java.net/browse/JDK-8249650
> webrev: http://cr.openjdk.java.net/~dholmes/8249650/webrev/
> 
> This is a simple cleanup that touches files across a number of VM areas 
> - hence the cross-post.
> 
> Whilst working on a different JNI fix I noticed that in most cases in 
> jni.cpp we were using the following form of make_local:
> 
> JNIHandles::make_local(env, obj);
> 
> and what that form does is first extract the thread from the JNIEnv:
> 
> JavaThread* thread = JavaThread::thread_from_jni_environment(env);
> return thread->active_handles()->allocate_handle(obj);
> 
> but there is also another, faster, variant for when you already have the 
> "thread":
> 
> jobject JNIHandles::make_local(Thread* thread, oop obj) {
>    return thread->active_handles()->allocate_handle(obj);
> }
> 
> When you look at the JNI_ENTRY wrapper (and related JVM_ENTRY, WB_ENTRY, 
> UNSAFE_ENTRY etc) it has already extracted the thread from the JNIEnv:
> 
>      JavaThread* thread=JavaThread::thread_from_jni_environment(env);
> 
> and further defined:
> 
>      Thread* THREAD = thread;
> 
> so we always already have direct access to the "thread" available (or 
> indirect via TRAPS), and in fact we can end up removing the 
> make_local(JNIEnv* env, oop obj) variant altogether.
> 
> Along the way I spotted some related issues with unnecessary use of 
> Thread::current() when it is already available from TRAPS, and some 
> other cases where we extracted the JNIEnv from a thread only to later 
> extract the thread from the JNIEnv.
> 
> Testing: tiers 1 - 3
> 
> Thanks,
> David
> -----

From kim.barrett at oracle.com  Mon Jul 20 05:22:49 2020
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 20 Jul 2020 01:22:49 -0400
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
Message-ID: <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>

> On Jul 20, 2020, at 12:16 AM, David Holmes <david.holmes at oracle.com> wrote:
> 
> Subject line got truncated by accident ...
> 
> On 20/07/2020 11:06 am, David Holmes wrote:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8249650
>> webrev: http://cr.openjdk.java.net/~dholmes/8249650/webrev/
>> This is a simple cleanup that touches files across a number of VM areas - hence the cross-post.
>> Whilst working on a different JNI fix I noticed that in most cases in jni.cpp we were using the following form of make_local:
>> JNIHandles::make_local(env, obj);
>> and what that form does is first extract the thread from the JNIEnv:
>> JavaThread* thread = JavaThread::thread_from_jni_environment(env);
>> return thread->active_handles()->allocate_handle(obj);
>> but there is also another, faster, variant for when you already have the "thread":
>> jobject JNIHandles::make_local(Thread* thread, oop obj) {
>>   return thread->active_handles()->allocate_handle(obj);
>> }
>> When you look at the JNI_ENTRY wrapper (and related JVM_ENTRY, WB_ENTRY, UNSAFE_ENTRY etc) it has already extracted the thread from the JNIEnv:
>>     JavaThread* thread=JavaThread::thread_from_jni_environment(env);
>> and further defined:
>>     Thread* THREAD = thread;
>> so we always already have direct access to the "thread" available (or indirect via TRAPS), and in fact we can end up removing the make_local(JNIEnv* env, oop obj) variant altogether.
>> Along the way I spotted some related issues with unnecessary use of Thread::current() when it is already available from TRAPS, and some other cases where we extracted the JNIEnv from a thread only to later extract the thread from the JNIEnv.
>> Testing: tiers 1 - 3
>> Thanks,
>> David
>> -----

------------------------------------------------------------------------------
src/hotspot/share/classfile/javaClasses.cpp
 439     JNIEnv *env = thread->jni_environment();

Since env is no longer used on the next line, move this down to where
it is used, at line 444.

------------------------------------------------------------------------------
src/hotspot/share/classfile/verifier.cpp
 299   JNIEnv *env = thread->jni_environment();

env now seems to only be used at line 320.  Move this closer.

------------------------------------------------------------------------------
src/hotspot/share/prims/jni.cpp
 743     result = JNIHandles::make_local(THREAD, result_handle());

jni_PopLocalFrame is now using a mix of "thread" and "THREAD", where
previously it just used "thread". Maybe this change shouldn't be made?
Or can the other uses be changed to THREAD for consistency?

------------------------------------------------------------------------------
src/hotspot/share/prims/jvm.cpp

The calls to JvmtiExport::post_vm_object_alloc have to use "thread"
instead of "THREAD", even though other places nearby are using
"THREAD".  That inconsistency is kind of unfortunate, but doesn't seem
easily avoidable.

------------------------------------------------------------------------------


From david.holmes at oracle.com  Mon Jul 20 05:53:37 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 20 Jul 2020 15:53:37 +1000
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
Message-ID: <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>

Hi Kim,

Thanks for looking at this.

Updated webrev at:

http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/

On 20/07/2020 3:22 pm, Kim Barrett wrote:
>> On Jul 20, 2020, at 12:16 AM, David Holmes <david.holmes at oracle.com> wrote:
>>
>> Subject line got truncated by accident ...
>>
>> On 20/07/2020 11:06 am, David Holmes wrote:
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8249650
>>> webrev: http://cr.openjdk.java.net/~dholmes/8249650/webrev/
>>> This is a simple cleanup that touches files across a number of VM areas - hence the cross-post.
>>> Whilst working on a different JNI fix I noticed that in most cases in jni.cpp we were using the following form of make_local:
>>> JNIHandles::make_local(env, obj);
>>> and what that form does is first extract the thread from the JNIEnv:
>>> JavaThread* thread = JavaThread::thread_from_jni_environment(env);
>>> return thread->active_handles()->allocate_handle(obj);
>>> but there is also another, faster, variant for when you already have the "thread":
>>> jobject JNIHandles::make_local(Thread* thread, oop obj) {
>>>    return thread->active_handles()->allocate_handle(obj);
>>> }
>>> When you look at the JNI_ENTRY wrapper (and related JVM_ENTRY, WB_ENTRY, UNSAFE_ENTRY etc) it has already extracted the thread from the JNIEnv:
>>>      JavaThread* thread=JavaThread::thread_from_jni_environment(env);
>>> and further defined:
>>>      Thread* THREAD = thread;
>>> so we always already have direct access to the "thread" available (or indirect via TRAPS), and in fact we can end up removing the make_local(JNIEnv* env, oop obj) variant altogether.
>>> Along the way I spotted some related issues with unnecessary use of Thread::current() when it is already available from TRAPS, and some other cases where we extracted the JNIEnv from a thread only to later extract the thread from the JNIEnv.
>>> Testing: tiers 1 - 3
>>> Thanks,
>>> David
>>> -----
> 
> ------------------------------------------------------------------------------
> src/hotspot/share/classfile/javaClasses.cpp
>   439     JNIEnv *env = thread->jni_environment();
> 
> Since env is no longer used on the next line, move this down to where
> it is used, at line 444.

Fixed.

> ------------------------------------------------------------------------------
> src/hotspot/share/classfile/verifier.cpp
>   299   JNIEnv *env = thread->jni_environment();
> 
> env now seems to only be used at line 320.  Move this closer.

Fixed.

> ------------------------------------------------------------------------------
> src/hotspot/share/prims/jni.cpp
>   743     result = JNIHandles::make_local(THREAD, result_handle());
> 
> jni_PopLocalFrame is now using a mix of "thread" and "THREAD", where
> previously it just used "thread". Maybe this change shouldn't be made?
> Or can the other uses be changed to THREAD for consistency?

"thread" and "THREAD" are interchangeable for anything expecting a 
"Thread*" (and somewhat surprisingly a number of API's that only work 
for JavaThreads actually take a Thread*. :( ). I had a choice between 
trying to be file-wide consistent with the make_local calls, versus 
local-code consistent, and used THREAD as it is available in both 
JNI_ENTRY and via TRAPS. But I can certainly make a local change to 
"thread" for local consistency.

> ------------------------------------------------------------------------------
> src/hotspot/share/prims/jvm.cpp
> 
> The calls to JvmtiExport::post_vm_object_alloc have to use "thread"
> instead of "THREAD", even though other places nearby are using
> "THREAD".  That inconsistency is kind of unfortunate, but doesn't seem
> easily avoidable.

Everything that uses THREAD in a JVM_ENTRY method can be changed to use 
"thread" instead. But I'm not sure it's a consistency worth pursuing at 
least as part of these changes (there are likely similar issues with 
most of the touched files).
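
As a rough illustration of why both names are usable inside an entry method,
here is a simplified sketch. MOCK_ENTRY, MockEnv and the helper functions are
hypothetical and much smaller than the real JNI_ENTRY/JVM_ENTRY machinery; the
only point is that "thread" is a JavaThread* and "THREAD" is a Thread* to the
same object.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-ins for the HotSpot types.
struct Thread { };
struct JavaThread : Thread { int posted = 0; };
struct MockEnv { JavaThread* owner; };

static JavaThread* thread_from_env(MockEnv* env) { return env->owner; }

// Mock entry wrapper: defines both names for the body of the function.
#define MOCK_ENTRY(result_type, header)          \
  result_type header {                           \
    JavaThread* thread = thread_from_env(env);   \
    Thread* THREAD = thread;
#define MOCK_END }

// Accepts either name, because JavaThread* converts to Thread*.
static bool takes_thread(Thread* t) { return t != nullptr; }

// Accepts only "thread"; passing THREAD here would not compile.
static void post_alloc(JavaThread* jt) { jt->posted++; }

MOCK_ENTRY(bool, demo_entry(MockEnv* env))
  bool ok = takes_thread(THREAD) && takes_thread(thread);
  post_alloc(thread);
  return ok && (THREAD == thread);
MOCK_END
```

This matches the situation described for jvm.cpp: anything taking a Thread*
can use either name, while a function declared with JavaThread* forces the use
of "thread".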

Thanks,
David

> ------------------------------------------------------------------------------
> 

From kim.barrett at oracle.com  Mon Jul 20 06:15:13 2020
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 20 Jul 2020 02:15:13 -0400
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
 <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
Message-ID: <A4977786-905C-46C2-AE30-807D9A932080@oracle.com>

> On Jul 20, 2020, at 1:53 AM, David Holmes <david.holmes at oracle.com> wrote:
> 
> Hi Kim,
> 
> Thanks for looking at this.
> 
> Updated webrev at:
> 
> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/

Looks good.

> 
> On 20/07/2020 3:22 pm, Kim Barrett wrote:
>>> On Jul 20, 2020, at 12:16 AM, David Holmes <david.holmes at oracle.com> wrote:
>> src/hotspot/share/prims/jni.cpp
>>  743     result = JNIHandles::make_local(THREAD, result_handle());
>> jni_PopLocalFrame is now using a mix of "thread" and "THREAD", where
>> previously it just used "thread". Maybe this change shouldn't be made?
>> Or can the other uses be changed to THREAD for consistency?
> 
> "thread" and "THREAD" are interchangeable for anything expecting a "Thread*" (and somewhat surprisingly a number of APIs that only work for JavaThreads actually take a Thread*. :( ). I had a choice between trying to be file-wide consistent with the make_local calls, versus local-code consistent, and used THREAD as it is available in both JNI_ENTRY and via TRAPS. But I can certainly make a local change to "thread" for local consistency.

I don't feel strongly either way.  It just struck me as a little odd to have the mix in close proximity,
especially since I think consistently using either one might work in this function.  But being consistent
about make_local usage has something to be said for it too.

>> src/hotspot/share/prims/jvm.cpp
>> The calls to JvmtiExport::post_vm_object_alloc have to use "thread"
>> instead of "THREAD", even though other places nearby are using
>> "THREAD".  That inconsistency is kind of unfortunate, but doesn't seem
>> easily avoidable.
> 
> Everything that uses THREAD in a JVM_ENTRY method can be changed to use "thread" instead. But I'm not sure it's a consistency worth pursuing at least as part of these changes (there are likely similar issues with most of the touched files).

Yeah, it's not really obvious whether to use THREAD or thread in some cases.
But I agree that addressing any inconsistencies there is mostly out of scope for
this change.
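
For readers following the thread/THREAD point, here is a minimal toy
sketch in C++ (not HotSpot code). It assumes, as the discussion above
suggests, that the entry macros provide a JavaThread* named "thread" and
a Thread* alias named "THREAD" for the same thread. ToyThread,
ToyJavaThread, and both callee functions are hypothetical names made up
for illustration.

```cpp
#include <cassert>

// Toy stand-ins for the base and derived thread types (hypothetical).
struct ToyThread {
  virtual ~ToyThread() = default;
  virtual bool is_java_thread() const { return false; }
};

struct ToyJavaThread : ToyThread {
  bool is_java_thread() const override { return true; }
};

// A callee that accepts the base type: either variable can be passed.
inline bool takes_thread(ToyThread* t) { return t->is_java_thread(); }

// A callee that insists on the derived type: only "thread" can be passed.
inline bool takes_java_thread(ToyJavaThread* jt) { return jt->is_java_thread(); }

// Mimics an entry body that sees both names for the same thread.
inline bool entry_body(ToyJavaThread* thread) {
  assert(thread != nullptr);
  ToyThread* THREAD = thread;          // base-typed alias of the same object
  bool a = takes_thread(thread);       // implicit derived-to-base conversion
  bool b = takes_thread(THREAD);       // already the base type
  bool c = takes_java_thread(thread);  // needs the derived type
  // takes_java_thread(THREAD);        // would not compile without a cast
  return a && b && c;
}
```

Under these assumptions the two names are interchangeable only for
callees that take the base type, which matches the mixed usage noted in
the review.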


From tobias.hartmann at oracle.com  Mon Jul 20 07:46:34 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 20 Jul 2020 09:46:34 +0200
Subject: RFR(S): 8248901: Signed immediate support in
 .../share/assembler.hpp is broken.
In-Reply-To: <3df3dab6-aa2f-bbbc-d231-6cda8f2a0ff7@oracle.com>
References: <3df3dab6-aa2f-bbbc-d231-6cda8f2a0ff7@oracle.com>
Message-ID: <91ddfdac-2ce4-637c-b68c-7e042d67483f@oracle.com>

Hi Patric,

looks good to me.

Best regards,
Tobias

On 07.07.20 13:00, Patric Hedlin wrote:
> Dear all,
> 
> I would like to ask for help to review the following change/update:
> 
> Issue:  https://bugs.openjdk.java.net/browse/JDK-8248901
> Webrev: http://cr.openjdk.java.net/~phedlin/tr8248901/
> 
> 
> Current definition(s) of is_simm() and friends are not robust over inputs. Both min and max values
> are undefined for width > 32 (and width < 0).
> No is_uimm() is currently provided (added). Several definitions are not used (cleanup).
> 
> NOTE: Adding currently unused is_simm9() and is_uimm12(), required by JDK-8247766.
> 
> 
> Testing: hs-tier1-3
> 
> 
> Best regards,
> Patric
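
As an illustration of what "robust over inputs" could mean for such
range checks, here is a minimal C++ sketch. It is not the code from the
webrev: the function names, the signatures, and the choice to accept
widths from 1 to 64 are assumptions made for this example. The idea is
to do the shift on a 64-bit operand and to handle the full-width case
separately, so that no shift count ever reaches the operand width.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: does x fit in a signed immediate of 'width' bits?
inline bool is_simm_sketch(int64_t x, unsigned width) {
  assert(width >= 1 && width <= 64);
  if (width == 64) return true;                      // every int64_t fits
  const int64_t limit = int64_t(1) << (width - 1);   // shift count is 0..62
  return -limit <= x && x < limit;
}

// Hypothetical helper: does x fit in an unsigned immediate of 'width' bits?
inline bool is_uimm_sketch(uint64_t x, unsigned width) {
  assert(width >= 1 && width <= 64);
  if (width == 64) return true;                      // every uint64_t fits
  return x < (uint64_t(1) << width);                 // shift count is 1..63
}
```

With these definitions a 9-bit signed immediate covers -256..255 and a
12-bit unsigned immediate covers 0..4095.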

From jamsheed.c.m at oracle.com  Mon Jul 20 07:52:21 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Mon, 20 Jul 2020 13:22:21 +0530
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
In-Reply-To: <cd1bfaeb-96a5-73e0-c244-e58b9484ec17@oracle.com>
References: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
 <8325fac5-6258-9b88-6507-5dcb0597cc17@oracle.com>
 <8522a69e-e538-2cc9-5364-887e450fc653@oracle.com>
 <46144d6d-5714-05ad-a263-01507db937cc@oracle.com>
 <7a361c29-4771-9ed4-1542-b3f68a5726f3@oracle.com>
 <6dc4c99b-1d90-09f1-60d1-fb2caf981266@oracle.com>
 <6b4e4dda-01d4-37d0-5403-a4f5481e5bf0@oracle.com>
 <f8e18f7b-bd13-5032-fb17-04fb037c88b4@oracle.com>
 <32d7fb64-75a5-7add-d496-df33cfaefabf@oracle.com>
 <f9f80bf5-a565-3820-efa1-2b949b92d764@oracle.com>
 <fa60eea1-3d7b-af3a-678d-354263e3eccc@oracle.com>
 <cd1bfaeb-96a5-73e0-c244-e58b9484ec17@oracle.com>
Message-ID: <4ffa8190-d57e-a9a2-e508-0d98035a34c6@oracle.com>

Hi Vladimir,

Thank you for the review, I have updated the test 
http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.02/

Hi all,

Could I get another review?

Best regards,

Jamsheed

On 18/07/2020 00:09, Vladimir Kozlov wrote:
> Yes, I agree with webrev_fix_EA version.
>
> I would suggest to modify TestIdealAllocShape.java test to add new 
> method with synchronization from your example in JBS comment. Or add 
> it as separate test.
>
> Thanks,
> Vladimir
>
> On 7/16/20 9:19 AM, Jamsheed C M wrote:
>> Hi Vladimir,
>> I ran a performance run for 
>> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA/ (links in JBS)
>> I don't see any issues, so I would like to go with webrev_fix_EA if 
>> it fixes all the reported issues.
>> Best regards,
>> Jamsheed
>>
>> On 16/07/2020 07:25, Jamsheed C M wrote:
>>> Hi Vladimir,
>>>
>>> On 16/07/2020 00:29, Vladimir Kozlov wrote:
>>>> As I said before I agree with your additional checks for StoreN and 
>>>> StoreNKlass.
>>>>
>>>> But I have concerns about new is_init_captured_store code. EA is 
>>>> mostly looking only on inputs to see Allocation. And in several 
>>>> places it expecting only to see Allocation because other cases 
>>>> should be filtered out before.
>>> If that is the case, I would like to go with my first webrev for 
>>> this fix, as it nicely propagates the escape state and there is no 
>>> unnecessary promotion to global escape state.
>>>
>>> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA/
>>>
>>> Best regards,
>>>
>>> Jamsheed
>>>
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 7/15/20 10:54 AM, Jamsheed C M wrote:
>>>>> Hi Vladimir,
>>>>>
>>>>> With unrolling, I understand that many cases will just have phis 
>>>>> everywhere to outside the loop, as the uses are outside the loop.
>>>>>
>>>>> And this is not restricted to escaping objects alone, as I 
>>>>> depicted. It can be escaping as well as non-escaping.
>>>>>
>>>>> So marking stores to them as global escape doesn't seem to be a 
>>>>> nice idea. I will rework this fix and get back again.
>>>>>
>>>>> Thank you
>>>>>
>>>>> Best regards
>>>>>
>>>>> Jamsheed
>>>>>
>>>>> On 15/07/2020 08:38, Jamsheed C M wrote:
>>>>>> (unfinished mail got sent, so completing it)
>>>>>> On 15/07/2020 08:21, Jamsheed C M wrote:
>>>>>>> Hi Vladimir,
>>>>>>>
>>>>>>> On 15/07/2020 06:50, Vladimir Kozlov wrote:
>>>>>>>> I looked more into this. EA already does not scalarize 
>>>>>>>> allocations when Phi nodes merge them - it should handle this 
>>>>>>>> case. I did a small experiment and relaxed the assert for this 
>>>>>>>> new (10. needs comment update) case for AddP's base and the 
>>>>>>>> test passed:
>>>>>>>>
>>>>>>>> src/hotspot/share/opto/escape.cpp Tue Jul 14 18:11:27 2020 -0700
>>>>>>>> @@ -2357,6 +2357,7 @@
>>>>>>>>        int opcode = uncast_base->Opcode();
>>>>>>>>        assert(opcode == Op_ConP || opcode == Op_ThreadLocal ||
>>>>>>>>               opcode == Op_CastX2P || uncast_base->is_DecodeNarrowPtr() ||
>>>>>>>> +             (uncast_base->is_Phi() && (uncast_base->bottom_type()->isa_rawptr() != NULL)) ||
>>>>>>>>               (uncast_base->is_Mem() && (uncast_base->bottom_type()->isa_rawptr() != NULL)) ||
>>>>>>>>               (uncast_base->is_Proj() && uncast_base->in(0)->is_Allocate()), "sanity");
>>>>>>>>      }
>>>>>>>>
>>>>>>>> Did you hit a case when this may not work?
>>>>>>>
>>>>>>> Yes, right, it already doesn't mark it as scalarizable if the base 
>>>>>>> count is more than one (I think it missed an is_oop check there) [1].
>>>>>>>
>>>>>>> EA CG adds edges only for oop fields, making stores to them 
>>>>>>> undetected. This makes these stored objects NoEscape, and if the 
>>>>>>> compiled method continues execution with such a NoEscape object 
>>>>>>> it can have undesired results (i.e. synchronization removed).
>>>>>>>
>>>>>>> A probable case would be (didn't verify):
>>>>>>>
>>>>>>> try {
>>>>>>>
>>>>>>> LOOP BEGIN
>>>>>>>
>>>>>>>   try {throw new Obj()} catch {}
>>>>>>>
>>>>>>> LOOP END
>>>>>>>
>>>>>>> } catch (Obj e) {
>>>>>>>
>>>>>>> }
>>>>>>
>>>>>> Best Regards,
>>>>>>
>>>>>> Jamsheed
>>>>>>
>>>>>> [1]https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/escape.cpp#L1770 
>>>>>>
>>>>>>
>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> And with LoopOpts off -XX:LoopUnrollLimit=0 it removed 
>>>>>>>> allocation (-XX:+PrintEscapeAnalysis 
>>>>>>>> -XX:+PrintEliminateAllocations):
>>>>>>>>
>>>>>>>> ======== Connection graph for  Test::test
>>>>>>>> JavaObject NoEscape(NoEscape) [ 158F [ 107 ]]   95 Allocate === 
>>>>>>>> 242  76  230  8  1 ( 93  92  21  1  78 1 78 ) [[ 96 97 98 105 
>>>>>>>> 106  107 ]]  rawptr:NotNull ( int:>=0, java/lang/Object:NotNull 
>>>>>>>> *, bool, top ) Test::test1 @ bci:0 Test::test @ bci:8 !jvms: 
>>>>>>>> Test::test1 @ bci:0 Test::test @ bci:8
>>>>>>>> LocalVar [ 95P [ 158b ]]   107    Proj    ===  95 [[ 108 158 ]] 
>>>>>>>> #5 !jvms: Test::test1 @ bci:0 Test::test @ bci:8
>>>>>>>>
>>>>>>>> Scalar  95    Allocate    ===  242  76  230  8  1 ( 93 92  21 1 
>>>>>>>> 78 1  78 ) [[ 96  97  98  105  106  107 ]] rawptr:NotNull ( 
>>>>>>>> int:>=0, java/lang/Object:NotNull *, bool, top ) Test::test1 @ 
>>>>>>>> bci:0 Test::test @ bci:8 !jvms: Test::test1 @ bci:0 Test::test 
>>>>>>>> @ bci:8
>>>>>>>> ++++ Eliminated: 95 Allocate
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Vladimir K
>>>>>>>>
>>>>>>>> On 7/14/20 1:28 AM, Jamsheed C M wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> I had incorrectly added an extra check in the assert after the 
>>>>>>>>> offset computation in address_offset, for AddPs with 
>>>>>>>>> non-constant offsets (like [1]).
>>>>>>>>>
>>>>>>>>> I am not changing the old assert, even though I am not expecting 
>>>>>>>>> the first AddP/second AddP (for array addressing) case for an 
>>>>>>>>> init captured store.
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA_asserts_corrected/ 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>>
>>>>>>>>> Jamsheed
>>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>>
>>>>>>>>> assert(offs != Type::OffsetBot ||
>>>>>>>>> - adr->in(AddPNode::Address)->in(0)->is_AllocateArray(),
>>>>>>>>> + adr->in(AddPNode::Address)->in(0)->is_AllocateArray() || is_captured_store(adr),
>>>>>>>>>              "offset must be a constant or it is initialization of array");
>>>>>>>>>
>>>>>>>>> On 13/07/2020 11:14, Jamsheed C M wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I reworked the fix. I compute the offset for all init captured 
>>>>>>>>>> stores, but treat these special init captured stores similarly 
>>>>>>>>>> to unsafe accesses (as these objects are usually GlobalEscape, 
>>>>>>>>>> this doesn't have any perf implications).
>>>>>>>>>>
>>>>>>>>>> revised webrev: 
>>>>>>>>>> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.01/
>>>>>>>>>>
>>>>>>>>>> testing: mach1-5( logs in jbs)
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>>
>>>>>>>>>> Jamsheed
>>>>>>>>>>
>>>>>>>>>> On 09/07/2020 19:36, Jamsheed C M wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> Request to hold the review. I need to change the code for 
>>>>>>>>>>> dealing with unsafe access, as the current capture code 
>>>>>>>>>>> spends more execution time analyzing things.
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>>
>>>>>>>>>>> Jamsheed
>>>>>>>>>>>
>>>>>>>>>>> On 09/07/2020 13:01, Jamsheed C M wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> JBS:https://bugs.openjdk.java.net/browse/JDK-8242895
>>>>>>>>>>>>
>>>>>>>>>>>> Request for review of changes made to offset computation and 
>>>>>>>>>>>> field write detection for init captured stores, needed because 
>>>>>>>>>>>> phis can be added between alloc and init. This happens if the 
>>>>>>>>>>>> init node is in a different outer loop with respect to the 
>>>>>>>>>>>> alloc node and there is a loop opt. This was required as a 
>>>>>>>>>>>> result of enhancement [1].
>>>>>>>>>>>>
>>>>>>>>>>>> Normally inits are not associated with multiple alloc nodes 
>>>>>>>>>>>> during the EA phase, but changes done for [1] caused code 
>>>>>>>>>>>> shapes of the form [2] to generate inits associated with 
>>>>>>>>>>>> multiple alloc nodes.
>>>>>>>>>>>>
>>>>>>>>>>>> This had implications for offset computation and field write 
>>>>>>>>>>>> detection related to initializing stores.
>>>>>>>>>>>>
>>>>>>>>>>>> Attempt to fix in EA:
>>>>>>>>>>>>
>>>>>>>>>>>>      webrev: 
>>>>>>>>>>>> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA/
>>>>>>>>>>>>
>>>>>>>>>>>> Alternate fix:
>>>>>>>>>>>>
>>>>>>>>>>>>      Minimize the scenario in compiler generated code by 
>>>>>>>>>>>> throwing only j.l.Error from slowpath (all exceptions, 
>>>>>>>>>>>> async/sync, are handled in runtime exit).
>>>>>>>>>>>>
>>>>>>>>>>>>      Stub epilog doesn't poll or throw any exceptions. 
>>>>>>>>>>>> Disable full loop opt before EA for detectable patterns and 
>>>>>>>>>>>> bail out of EA for late detected patterns.
>>>>>>>>>>>>
>>>>>>>>>>>>      webrev: 
>>>>>>>>>>>> http://cr.openjdk.java.net/~jcm/8242895/webrev_deopt/
>>>>>>>>>>>>
>>>>>>>>>>>> Please advise.
>>>>>>>>>>>>
>>>>>>>>>>>> Testing : mach tier1-5 (logs in jbs)
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>
>>>>>>>>>>>> Jamsheed
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> [1] JDK-8231291 
>>>>>>>>>>>> <https://bugs.openjdk.java.net/browse/JDK-8231291> C2: loop 
>>>>>>>>>>>> opts before EA should maximally unroll loops
>>>>>>>>>>>>
>>>>>>>>>>>> [2] Code shapes that have their init node in a different outer 
>>>>>>>>>>>> loop with respect to the alloc node.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> loop begin
>>>>>>>>>>>>
>>>>>>>>>>>>   try {
>>>>>>>>>>>>
>>>>>>>>>>>>   return new obj() / throw new obj() / uncommon trap after 
>>>>>>>>>>>> allocation, in a loop
>>>>>>>>>>>>
>>>>>>>>>>>>   } catch(ex) {
>>>>>>>>>>>>
>>>>>>>>>>>>   }
>>>>>>>>>>>>
>>>>>>>>>>>> loop end
>>>>>>>>>>>>
>>>>>>>>>>>>     public static IntA test(int n) {
>>>>>>>>>>>>         for (int i=0; i<2; i++) {
>>>>>>>>>>>>             try {
>>>>>>>>>>>>                 return new IntA(n + i);
>>>>>>>>>>>>             } catch (Exception e) {
>>>>>>>>>>>>             }
>>>>>>>>>>>>         }
>>>>>>>>>>>>

From david.holmes at oracle.com  Mon Jul 20 07:53:48 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 20 Jul 2020 17:53:48 +1000
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <A4977786-905C-46C2-AE30-807D9A932080@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
 <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
 <A4977786-905C-46C2-AE30-807D9A932080@oracle.com>
Message-ID: <6e0d9af0-92f0-1eba-fc0a-22eebf008fe0@oracle.com>

Thanks Kim!

David

On 20/07/2020 4:15 pm, Kim Barrett wrote:
>> On Jul 20, 2020, at 1:53 AM, David Holmes <david.holmes at oracle.com> wrote:
>>
>> Hi Kim,
>>
>> Thanks for looking at this.
>>
>> Updated webrev at:
>>
>> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/
> 
> Looks good.
> 
>>
>> On 20/07/2020 3:22 pm, Kim Barrett wrote:
>>>> On Jul 20, 2020, at 12:16 AM, David Holmes <david.holmes at oracle.com> wrote:
>>> src/hotspot/share/prims/jni.cpp
>>>   743     result = JNIHandles::make_local(THREAD, result_handle());
>>> jni_PopLocalFrame is now using a mix of "thread" and "THREAD", where
>>> previously it just used "thread". Maybe this change shouldn't be made?
>>> Or can the other uses be changed to THREAD for consistency?
>>
>> "thread" and "THREAD" are interchangeable for anything expecting a "Thread*" (and somewhat surprisingly a number of APIs that only work for JavaThreads actually take a Thread*. :( ). I had a choice between trying to be file-wide consistent with the make_local calls, versus local-code consistent, and used THREAD as it is available in both JNI_ENTRY and via TRAPS. But I can certainly make a local change to "thread" for local consistency.
> 
> I don't feel strongly either way.  It just struck me as a little odd to have the mix in close proximity,
> especially since I think consistently using either one might work in this function.  But being consistent
> about make_local usage has something to be said for it too.
> 
>>> src/hotspot/share/prims/jvm.cpp
>>> The calls to JvmtiExport::post_vm_object_alloc have to use "thread"
>>> instead of "THREAD", even though other places nearby are using
>>> "THREAD".  That inconsistency is kind of unfortunate, but doesn't seem
>>> easily avoidable.
>>
>> Everything that uses THREAD in a JVM_ENTRY method can be changed to use "thread" instead. But I'm not sure it's a consistency worth pursuing at least as part of these changes (there are likely similar issues with most of the touched files).
> 
> Yeah, it's not really obvious whether to use THREAD or thread in some cases.
> But I agree that addressing any inconsistencies there is mostly out of scope for
> this change.
> 

From tobias.hartmann at oracle.com  Mon Jul 20 08:23:58 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 20 Jul 2020 10:23:58 +0200
Subject: RFR(S): 8247732: validate user-input intrinsic_ids in
 ControlIntrinsic
In-Reply-To: <9b324805-eb86-27e1-5dcb-96a823f8495b@amazon.com>
References: <821e3d29-c95b-aafc-8ee5-6e49a1bdde82@amazon.com>
 <9b324805-eb86-27e1-5dcb-96a823f8495b@amazon.com>
Message-ID: <82cba5e4-2020-ce0a-4576-e8e0cc2e5ae5@oracle.com>

Hi,

On 08.07.20 10:26, Liu, Xin wrote:
> ControlIntrinsic/DisableIntrinsic in compiler directives are more complex. The matched directive is only parsed when hotspot attempts to compile the corresponding method.
> 
> I validate at that time and the JVM will crash if it does not meet the guarantee() statement.

I don't think a guarantee should be used here, i.e. the VM shouldn't crash but we should exit
gracefully with an error message. Isn't it possible to piggy-back on the error mechanism in
DirectivesParser?

> I added Method::external_name_short() which only returns the shorter method name in the form of  "classname::method".
> 
> HotSpot probably already has similar code, but I failed to find it. Please let me know and I will remove it.

I would just use name_and_sig_as_C_string().

jvmFlagConstraintList.cpp:180/181
- Wrong indentation

jvmFlagConstraintsCompiler.cpp:388/400
- Maybe change the error message to "Unrecognized intrinsic detected in DisableIntrinsic [...]"

Best regards,
Tobias
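
To illustrate the "report an error instead of crashing" idea, here is a
minimal standalone C++ sketch. It is not HotSpot code and does not use
the DirectivesParser API: validate_intrinsic_list, the set of known
names, and the comma-separated syntax with an optional leading '+' or
'-' are all assumptions made for this example. The point is only that
validation returns a result the caller can turn into an error message,
rather than aborting.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Hypothetical validator: returns true if every name in the
// comma-separated list is known. On failure it stores the first
// unrecognized name in *bad_name so the caller can print a message.
inline bool validate_intrinsic_list(const std::string& list,
                                    const std::set<std::string>& known,
                                    std::string* bad_name) {
  assert(bad_name != nullptr);
  std::istringstream in(list);
  std::string token;
  while (std::getline(in, token, ',')) {
    // Strip an optional enable/disable prefix (assumed syntax).
    if (!token.empty() && (token[0] == '+' || token[0] == '-')) {
      token.erase(0, 1);
    }
    if (known.count(token) == 0) {
      *bad_name = token;
      return false;   // caller reports the error and exits gracefully
    }
  }
  return true;
}
```

A caller could then print something like "Unrecognized intrinsic
detected in DisableIntrinsic: <name>" and stop, which is the behaviour
suggested above.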

From tobias.hartmann at oracle.com  Mon Jul 20 08:29:10 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 20 Jul 2020 10:29:10 +0200
Subject: [16] RFR(XS): 8248467: C2: compiler/intrinsics/object/TestClone
 fails with -XX:+VerifyGraphEdges
In-Reply-To: <6a458143-aeee-486b-2bc5-a210779c26dc@oracle.com>
References: <60c17f38-6cb2-d380-252f-15f8d5151b29@oracle.com>
 <6a458143-aeee-486b-2bc5-a210779c26dc@oracle.com>
Message-ID: <2ac39054-e9bf-d7a8-2dcc-a954d1a94abf@oracle.com>

+1

Best regards,
Tobias

On 15.07.20 19:26, Vladimir Kozlov wrote:
> Good.
> 
> Thanks,
> Vladimir
> 
> On 7/15/20 8:04 AM, Christian Hagedorn wrote:
>> Hi
>>
>> Please review the following patch:
>> https://bugs.openjdk.java.net/browse/JDK-8248467
>> http://cr.openjdk.java.net/~chagedorn/8248467/webrev.00/
>>
>> The assertion is hit due to a MemBarNode whose precedence edge was set to NULL at [1]
>> (result_phi_rawoop is NULL and _resproj is the precedence edge to a MemBarStoreStore). This is
>> possible since JDK-8237581 [2] which can remove some allocations. The fix just adds this
>> additional case in the assert.
>>
>> Best regards,
>> Christian
>>
>>
>> [1] http://hg.openjdk.java.net/jdk/jdk/file/4a8fd81d64ba/src/hotspot/share/opto/macro.cpp#l1566
>> [2] https://bugs.openjdk.java.net/browse/JDK-8237581

From tobias.hartmann at oracle.com  Mon Jul 20 08:32:47 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 20 Jul 2020 10:32:47 +0200
Subject: [16] RFR(S): 8247743: Segmentation fault in debug builds due to
 stack overflow in find_recur with deep graphs
In-Reply-To: <d793324b-88ed-d86b-0655-76443460d219@oracle.com>
References: <9af7a44c-4267-4900-812c-12aa0c37713a@oracle.com>
 <518ffdf1-143a-06f3-9aa4-96871d72d024@oracle.com>
 <9b3a9632-c7bb-2f51-c295-72935add2670@oracle.com>
 <2f317601-4845-541d-e2ef-ad7735386f1c@oracle.com>
 <7cfafcb9-6232-5738-6cad-508127fd31e8@oracle.com>
 <53d1eebe-e85f-58cb-7fba-0baf2ecf8701@oracle.com>
 <d793324b-88ed-d86b-0655-76443460d219@oracle.com>
Message-ID: <f5267bd5-5387-13f2-eaf5-8db8efd52de9@oracle.com>

+1

Best regards,
Tobias

On 15.07.20 19:37, Vladimir Kozlov wrote:
> Looks good.
> 
> Thanks,
> Vladimir K
> 
> On 7/15/20 12:58 AM, Christian Hagedorn wrote:
>> Hi Vladimir
>>
>> On 14.07.20 20:46, Vladimir Kozlov wrote:
>>> Can you move next up to where other small find*() methods are defined?:
>>>
>>> +Node* Node::find_ctrl(int idx) {
>>> +  return find(idx, true);
>>>  }
>>>
>>> Also add '// not PRODUCT' comment to #endif for #ifndef PRODUCT. It is hard to find where this
>>> not product code ends.
>>>
>>> Looks good otherwise.
>>
>> Thanks, I added these changes in a new webrev:
>> http://cr.openjdk.java.net/~chagedorn/8247743/webrev.02/
>>
>> Best regards,
>> Christian
>>
>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 7/14/20 2:54 AM, Christian Hagedorn wrote:
>>>> Hi Vladimir
>>>>
>>>> On 13.07.20 19:43, Vladimir Kozlov wrote:
>>>>> Node::find_ctrl() is used during debugging when you want to print and look on only control nodes.
>>>>> We have several such methods which are only used in debugger.
>>>>
>>>> I see, I restored this method and changed Node::find() accordingly. I additionally added two
>>>> find_ctrl() methods to make it easier to call it from a debugger (as already present for
>>>> find_node()).
>>>>
>>>>> I suggest to store old_arena() in local var and pass into add_to_worklist().
>>>>>
>>>>> You can make add_to_worklist() static since you pass node as argument.
>>>>
>>>> Okay. I updated this and the change above in a new webrev:
>>>> http://cr.openjdk.java.net/~chagedorn/8247743/webrev.01/
>>>>
>>>> Best regards,
>>>> Christian
>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 7/13/20 3:09 AM, Christian Hagedorn wrote:
>>>>>> Ping - could anyone review it, please? Thanks!
>>>>>>
>>>>>> Best regards,
>>>>>> Christian
>>>>>>
>>>>>> On 02.07.20 09:33, Christian Hagedorn wrote:
>>>>>>> Hi
>>>>>>>
>>>>>>> Please review the following patch:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8247743
>>>>>>> http://cr.openjdk.java.net/~chagedorn/8247743/webrev.00/
>>>>>>>
>>>>>>> The testcase creates a deep graph with a lot of nodes on a chain. When running with the
>>>>>>> specified test flags, it recursively calls Node::find_recur() for each node discovered which
>>>>>>> eventually results in a segmentation fault due to a stack overflow (around 10000 calls due to
>>>>>>> such a long chain of nodes). The fix just converts the recursive algorithm into an iterative
>>>>>>> one to avoid a segmentation fault. This is similar to JDK-8246203 [1].
>>>>>>>
>>>>>>> I additionally removed Node::find_ctrl() and its special handling in the algorithm since it
>>>>>>> is not used.
>>>>>>>
>>>>>>> There is actually another problem with the recursive version. When running the testcase
>>>>>>> without -XX:CompileOnly=compiler/c2/TestFindNode, it will spin forever inside [2] because
>>>>>>> there is a debug_orig node cycle and the loop does not break based on the debug_orig nodes
>>>>>>> being visited. This is also fixed in the patch.
>>>>>>>
>>>>>>> Thank you!
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Christian
>>>>>>>
>>>>>>>
>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8246203
>>>>>>> [2] http://hg.openjdk.java.net/jdk/jdk/file/e2622818f0bd/src/hotspot/share/opto/node.cpp#l1589
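
The recursive-to-iterative conversion described in the quoted review
request can be sketched in standalone C++ as follows. This is a toy
model, not the patch: ToyNode and find_iterative are hypothetical names,
and the real Node::find() walks more kinds of edges than the single
input list assumed here. An explicit worklist replaces the call stack,
and a visited set makes the search terminate on cycles.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Toy graph node (hypothetical): an index plus a list of input nodes.
struct ToyNode {
  int idx;
  std::vector<ToyNode*> inputs;
};

// Iterative search for the node with the given index, reachable from start.
// Returns nullptr if no such node is reachable.
inline const ToyNode* find_iterative(const ToyNode* start, int idx) {
  std::vector<const ToyNode*> worklist{start};
  std::unordered_set<const ToyNode*> visited;
  while (!worklist.empty()) {
    const ToyNode* n = worklist.back();
    worklist.pop_back();
    // Skip null inputs and nodes already seen (this also breaks cycles).
    if (n == nullptr || !visited.insert(n).second) continue;
    if (n->idx == idx) return n;
    for (const ToyNode* in : n->inputs) worklist.push_back(in);
  }
  return nullptr;
}
```

Because the pending nodes live on the heap rather than on the call
stack, a chain of many thousands of nodes no longer risks a stack
overflow.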

From tobias.hartmann at oracle.com  Mon Jul 20 09:14:01 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 20 Jul 2020 11:14:01 +0200
Subject: [16] RFR(S): 8248552: C2 crashes with SIGFPE due to division by
 zero
In-Reply-To: <5b2e7b1b-24f7-d575-58a3-376ec9ab7944@oracle.com>
References: <70e8e42b-5cb3-9c1e-419e-2f771f042368@oracle.com>
 <3ba2ef6a-8ade-7ede-5252-21051c34b472@oracle.com>
 <9e2f26bd-daa4-9540-8401-9850e0beea94@oracle.com>
 <c569ebf8-58ed-c6df-2e0e-38c692f926c6@oracle.com>
 <a2066deb-ef3b-7071-33ff-03c829069cff@oracle.com>
 <f6a62d10-9cff-3fc2-0947-c49e775a23c9@oracle.com>
 <e8d6aa47-b584-4fdc-f506-1b1c52aa28f5@oracle.com>
 <ca043859-bb7b-f64a-a37d-7b4e47c745d6@oracle.com>
 <5b2e7b1b-24f7-d575-58a3-376ec9ab7944@oracle.com>
Message-ID: <518cd022-73e1-cb5c-499d-86853ae679c3@oracle.com>

Hi Christian,

On 15.07.20 15:08, Christian Hagedorn wrote:
> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.02/

Looks good to me.

Some code style comments:

cfgnode.cpp:1083
- There's an extra whitespace before ","

loopopts.cpp:84/86
- No need for extra brackets

Please make sure to run performance testing.

Best regards,
Tobias

From tobias.hartmann at oracle.com  Mon Jul 20 09:50:45 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 20 Jul 2020 11:50:45 +0200
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
In-Reply-To: <4ffa8190-d57e-a9a2-e508-0d98035a34c6@oracle.com>
References: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
 <8325fac5-6258-9b88-6507-5dcb0597cc17@oracle.com>
 <8522a69e-e538-2cc9-5364-887e450fc653@oracle.com>
 <46144d6d-5714-05ad-a263-01507db937cc@oracle.com>
 <7a361c29-4771-9ed4-1542-b3f68a5726f3@oracle.com>
 <6dc4c99b-1d90-09f1-60d1-fb2caf981266@oracle.com>
 <6b4e4dda-01d4-37d0-5403-a4f5481e5bf0@oracle.com>
 <f8e18f7b-bd13-5032-fb17-04fb037c88b4@oracle.com>
 <32d7fb64-75a5-7add-d496-df33cfaefabf@oracle.com>
 <f9f80bf5-a565-3820-efa1-2b949b92d764@oracle.com>
 <fa60eea1-3d7b-af3a-678d-354263e3eccc@oracle.com>
 <cd1bfaeb-96a5-73e0-c244-e58b9484ec17@oracle.com>
 <4ffa8190-d57e-a9a2-e508-0d98035a34c6@oracle.com>
Message-ID: <0fa9d47a-e568-bf22-4c49-74c926ae9f14@oracle.com>

Hi Jamsheed,

On 20.07.20 09:52, Jamsheed C M wrote:
> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.02/

Looks good to me too. Some style comments:

escape.cpp:
- line 2250: Maybe rename to "is_captured_store_address" or something similar
- line 2254: just move _igvn->type into the assert
- line 2257: wrong indentation
- line 2996: "assocaited" -> "associated"

Best regards,
Tobias

From jamsheed.c.m at oracle.com  Mon Jul 20 13:30:32 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Mon, 20 Jul 2020 19:00:32 +0530
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
In-Reply-To: <0fa9d47a-e568-bf22-4c49-74c926ae9f14@oracle.com>
References: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
 <8325fac5-6258-9b88-6507-5dcb0597cc17@oracle.com>
 <8522a69e-e538-2cc9-5364-887e450fc653@oracle.com>
 <46144d6d-5714-05ad-a263-01507db937cc@oracle.com>
 <7a361c29-4771-9ed4-1542-b3f68a5726f3@oracle.com>
 <6dc4c99b-1d90-09f1-60d1-fb2caf981266@oracle.com>
 <6b4e4dda-01d4-37d0-5403-a4f5481e5bf0@oracle.com>
 <f8e18f7b-bd13-5032-fb17-04fb037c88b4@oracle.com>
 <32d7fb64-75a5-7add-d496-df33cfaefabf@oracle.com>
 <f9f80bf5-a565-3820-efa1-2b949b92d764@oracle.com>
 <fa60eea1-3d7b-af3a-678d-354263e3eccc@oracle.com>
 <cd1bfaeb-96a5-73e0-c244-e58b9484ec17@oracle.com>
 <4ffa8190-d57e-a9a2-e508-0d98035a34c6@oracle.com>
 <0fa9d47a-e568-bf22-4c49-74c926ae9f14@oracle.com>
Message-ID: <16aead29-6788-a7e8-bf6e-ae2b56fdb9dc@oracle.com>

Hi Tobias,

Thank you for the review and the feedback.

Revised webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.03/

Best regards,

Jamsheed

On 20/07/2020 15:20, Tobias Hartmann wrote:
> Hi Jamsheed,
>
> On 20.07.20 09:52, Jamsheed C M wrote:
>> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.02/
> Looks good to me too. Some style comments:
>
> escape.cpp:
> - line 2250: Maybe rename to "is_captured_store_address" or something similar
> - line 2254: just move _igvn->type into the assert
> - line 2257: wrong indentation
> - line 2996: "assocaited" -> "associated"
>
> Best regards,
> Tobias

From tobias.hartmann at oracle.com  Mon Jul 20 13:35:01 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 20 Jul 2020 15:35:01 +0200
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
In-Reply-To: <16aead29-6788-a7e8-bf6e-ae2b56fdb9dc@oracle.com>
References: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
 <8325fac5-6258-9b88-6507-5dcb0597cc17@oracle.com>
 <8522a69e-e538-2cc9-5364-887e450fc653@oracle.com>
 <46144d6d-5714-05ad-a263-01507db937cc@oracle.com>
 <7a361c29-4771-9ed4-1542-b3f68a5726f3@oracle.com>
 <6dc4c99b-1d90-09f1-60d1-fb2caf981266@oracle.com>
 <6b4e4dda-01d4-37d0-5403-a4f5481e5bf0@oracle.com>
 <f8e18f7b-bd13-5032-fb17-04fb037c88b4@oracle.com>
 <32d7fb64-75a5-7add-d496-df33cfaefabf@oracle.com>
 <f9f80bf5-a565-3820-efa1-2b949b92d764@oracle.com>
 <fa60eea1-3d7b-af3a-678d-354263e3eccc@oracle.com>
 <cd1bfaeb-96a5-73e0-c244-e58b9484ec17@oracle.com>
 <4ffa8190-d57e-a9a2-e508-0d98035a34c6@oracle.com>
 <0fa9d47a-e568-bf22-4c49-74c926ae9f14@oracle.com>
 <16aead29-6788-a7e8-bf6e-ae2b56fdb9dc@oracle.com>
Message-ID: <2805861f-4760-c768-9b1e-55cd6af1cde1@oracle.com>

Hi Jamsheed,

On 20.07.20 15:30, Jamsheed C M wrote:
> Revised webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.03/

You don't need #ifdef ASSERT in escape.cpp:2252. Otherwise looks good to me! No new webrev required.

Best regards,
Tobias

From jamsheed.c.m at oracle.com  Mon Jul 20 13:48:04 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Mon, 20 Jul 2020 19:18:04 +0530
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
In-Reply-To: <2805861f-4760-c768-9b1e-55cd6af1cde1@oracle.com>
References: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
 <8325fac5-6258-9b88-6507-5dcb0597cc17@oracle.com>
 <8522a69e-e538-2cc9-5364-887e450fc653@oracle.com>
 <46144d6d-5714-05ad-a263-01507db937cc@oracle.com>
 <7a361c29-4771-9ed4-1542-b3f68a5726f3@oracle.com>
 <6dc4c99b-1d90-09f1-60d1-fb2caf981266@oracle.com>
 <6b4e4dda-01d4-37d0-5403-a4f5481e5bf0@oracle.com>
 <f8e18f7b-bd13-5032-fb17-04fb037c88b4@oracle.com>
 <32d7fb64-75a5-7add-d496-df33cfaefabf@oracle.com>
 <f9f80bf5-a565-3820-efa1-2b949b92d764@oracle.com>
 <fa60eea1-3d7b-af3a-678d-354263e3eccc@oracle.com>
 <cd1bfaeb-96a5-73e0-c244-e58b9484ec17@oracle.com>
 <4ffa8190-d57e-a9a2-e508-0d98035a34c6@oracle.com>
 <0fa9d47a-e568-bf22-4c49-74c926ae9f14@oracle.com>
 <16aead29-6788-a7e8-bf6e-ae2b56fdb9dc@oracle.com>
 <2805861f-4760-c768-9b1e-55cd6af1cde1@oracle.com>
Message-ID: <3d5fc552-d3e2-494b-e921-c65967af8207@oracle.com>

Hi Tobias,
On 20/07/2020 19:05, Tobias Hartmann wrote:
> Hi Jamsheed,
>
> On 20.07.20 15:30, Jamsheed C M wrote:
>> Revised webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.03/
> You don't need #ifdef ASSERT in escape.cpp:2252. Otherwise looks good to me! No new webrev required.

Missed removing it. Thank you for the review.

Best regards,

Jamsheed

>
> Best regards,
> Tobias

From patric.hedlin at oracle.com  Mon Jul 20 14:03:24 2020
From: patric.hedlin at oracle.com (Patric Hedlin)
Date: Mon, 20 Jul 2020 16:03:24 +0200
Subject: RFR(S): 8248901: Signed immediate support in
 .../share/assembler.hpp is broken.
In-Reply-To: <91ddfdac-2ce4-637c-b68c-7e042d67483f@oracle.com>
References: <3df3dab6-aa2f-bbbc-d231-6cda8f2a0ff7@oracle.com>
 <91ddfdac-2ce4-637c-b68c-7e042d67483f@oracle.com>
Message-ID: <378b61b1-06a6-82ca-9c5f-eb76e024292f@oracle.com>

Thanks for reviewing Tobias.

/Patric

On 2020-07-20 09:46, Tobias Hartmann wrote:
> Hi Patric,
>
> looks good to me.
>
> Best regards,
> Tobias
>
> On 07.07.20 13:00, Patric Hedlin wrote:
>> Dear all,
>>
>> I would like to ask for help to review the following change/update:
>>
>> Issue:  https://bugs.openjdk.java.net/browse/JDK-8248901
>> Webrev: http://cr.openjdk.java.net/~phedlin/tr8248901/
>>
>>
>> Current definition(s) of is_simm() and friends are not robust over inputs. Both min and max values
>> are undefined for width > 32 (and width < 0).
>> No is_uimm() is currently provided (added). Several definitions are not used (cleanup).
>>
>> NOTE: Adding currently unused is_simm9() and is_uimm12(), required by JDK-8247766.
>>
>>
>> Testing: hs-tier1-3
>>
>>
>> Best regards,
>> Patric
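
The robustness problem described above (range checks that are undefined once the field width exceeds 32) can be sketched as follows. This is only an illustration under stated assumptions, not the code from the webrev: the signatures, the use of an unsigned width, and the decision to treat a zero-width field as "nothing fits" are all choices made for this example. Only the names is_simm, is_uimm, is_simm9 and is_uimm12 come from the message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers (assumed signatures, not the ones in assembler.hpp).

// True if x fits in a signed two's-complement field of `width` bits.
// An unsigned width rules out negative widths by construction.
inline bool is_simm(int64_t x, unsigned width) {
  if (width == 0)  return false;  // assumption: nothing fits in a zero-width field
  if (width >= 64) return true;   // every int64_t fits; also avoids an undefined shift
  const int64_t limit = int64_t(1) << (width - 1);  // width - 1 is at most 62 here
  return -limit <= x && x < limit;
}

// True if x fits in an unsigned field of `width` bits.
inline bool is_uimm(uint64_t x, unsigned width) {
  if (width == 0)  return false;  // same assumption as above
  if (width >= 64) return true;   // every uint64_t fits
  return x < (uint64_t(1) << width);
}

// Fixed-width conveniences named after the ones mentioned in the message.
inline bool is_simm9(int64_t x)   { return is_simm(x, 9); }
inline bool is_uimm12(uint64_t x) { return is_uimm(x, 12); }
```

Handling the width >= 64 case before shifting is what keeps the bounds well defined for every width; a caller would use it as, for example, is_simm9(offset) before encoding a 9-bit displacement.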


From igor.ignatyev at oracle.com  Mon Jul 20 16:13:35 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Mon, 20 Jul 2020 09:13:35 -0700
Subject: [15] RFR(T) : 8249698 :
 java/lang/invoke/LFCaching/LFGarbageCollectedTest.java should be
 ProblemList-ed and not @ignored
In-Reply-To: <ee1085d5-af99-09f9-02ca-0fa7a79c3f33@oracle.com>
References: <61EBB792-FAF4-4DFD-A674-4BE7153F20AA@oracle.com>
 <ee1085d5-af99-09f9-02ca-0fa7a79c3f33@oracle.com>
Message-ID: <74D1782A-4AA1-44CB-98DC-BD038B263F3A@oracle.com>

Mandy, Vladimir,

thanks for your reviews, pushed to jdk15.

-- Igor

> On Jul 18, 2020, at 9:33 PM, Mandy Chung <mandy.chung at oracle.com> wrote:
> 
> +1
> 
> Mandy
> 
> On 7/17/20 8:57 PM, Igor Ignatyev wrote:
>> http://cr.openjdk.java.net/~iignatyev//8249698/webrev.00
>>> 3 lines changed: 1 ins; 1 del; 1 mod;
>> 
>> Hi all,
>> 
>> could you please review this trivial patch which removes @ignore from LFGarbageCollectedTest and adds it into problem-list instead?
>> 
>> from  8249698:
>>> java/lang/invoke/LFCaching/LFGarbageCollectedTest.java is excluded from execution due to JDK-8078602. although the test might indeed fail due to JDK-8078602, it still can be useful and isn't harmful to run, therefore this test should be put in ProblemList.txt and @ignore is to be removed.
>> from main issue(8249618):
>>> although ProblemList and @ignore achieve the same end result (test exclusion), they serve different goals and have slightly different meanings. simplified: @ignore should be used to exclude useless or harmful tests, and ProblemList in all other cases (see the not-yet-integrated "ProblemListing or @ignore-ing a Test" section of the dev guide, PR -- https://github.com/openjdk/guide/pull/21 for more details).
>>> 
>>> due to different reasons, this hasn't been always followed and some currently @ignore-d tests should rather be ProblemList-ed, and some of ProblemList-ed should be @ignore-d, this issue is to clean up the current state in a hope that this will reduce further confusion. 
>> 
>> webrev: http://cr.openjdk.java.net/~iignatyev//8249698/webrev.00
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249698
>> 
>> Thanks,
>> -- Igor
>> 
>> 8078602: https://bugs.openjdk.java.net/browse/JDK-8078602
>> 8249618: https://bugs.openjdk.java.net/browse/JDK-8249618


From daniel.daugherty at oracle.com  Mon Jul 20 17:07:10 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 20 Jul 2020 13:07:10 -0400
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
 <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
Message-ID: <328fb322-5b14-968b-7b13-4b449a8d98fd@oracle.com>

On 7/20/20 1:53 AM, David Holmes wrote:
> Hi Kim,
>
> Thanks for looking at this.
>
> Updated webrev at:
>
> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/

I like this cleanup very much!


src/hotspot/share/classfile/javaClasses.cpp
    No comments.

src/hotspot/share/classfile/verifier.cpp
    L298:   JavaThread* thread = (JavaThread*)THREAD;
    L307:   ResourceMark rm(THREAD);
        Since we've gone to the trouble of creating the 'thread' variable,
        I would prefer it to be used instead of THREAD where possible.

src/hotspot/share/jvmci/jvmciCompilerToVM.cpp
    L1021:   HandleMark hm;
        Can this be 'hm(THREAD)'? (Not your problem, but while you're
        in that file?)

src/hotspot/share/prims/jni.cpp
    No comments.

src/hotspot/share/prims/jvm.cpp
    L140:   ResourceMark rm;
        Can this be 'rm(THREAD)'? (Not your problem, but while you're
        in that file?)

    L611:   Handle stackStream_h(THREAD, JNIHandles::resolve_non_null(stackStream));
    L617:   objArrayHandle frames_array_h(THREAD, fa);
    L626:   return JNIHandles::make_local(THREAD, result);
        Since we've gone to the trouble of creating the 'jt' variable,
        I would prefer it to be used instead of THREAD where possible.

    L767:   vframeStream vfst(thread);
    L788         return (jclass) JNIHandles::make_local(THREAD, m->method_holder()->java_mirror());
        Can we use 'thread' on L788? (preferred)
        Can we use 'THREAD' on L767? (less preferred)

    L949:   ResourceMark rm(THREAD);
    L951:   Handle class_loader (THREAD, JNIHandles::resolve(loader));
    L955:                            THREAD);
    L957:   Handle protection_domain (THREAD, JNIHandles::resolve(pd));
    L968:   return (jclass) JNIHandles::make_local(THREAD, k->java_mirror());
        Since we've gone to the trouble of creating the 'jt' variable,
        I would prefer it to be used instead of THREAD where possible.

    L986:   JavaThread* jt = (JavaThread*) THREAD;
        This 'jt' is unused and can be deleted (Not your problem, but
        while you're in that file?)

    L1154:   while (*p != '\0') {
    L1155:       if (*p == '.') {
    L1156:           *p = '/';
    L1157:       }
    L1158:       p++;
        Nit - the indents are wrong on L1155-58. (Not your problem, but
        while you're in that file?)

    L1389:   ResourceMark rm(THREAD);
    L1446:     return JNIHandles::make_local(THREAD, result);
    L1460:   return JNIHandles::make_local(THREAD, result);
        Can we use 'thread' on L1389? (preferred) And then the line you
        touched could also be 'thread' and we'll be consistent in this
        function...

    L3287:   oop jthread = thread->threadObj();
    L3288:   assert (thread != NULL, "no current thread!");
        I think the assert is wrong. It should be:

            assert(jthread != NULL, "no current thread!");

        If 'thread == NULL', then we would have crashed at L3287.
        Also notice that I deleted the extra ' ' before '('. (Not
        your problem, but while you're in that file?)

    L3289:   return JNIHandles::make_local(THREAD, jthread);
        Can you use 'thread' instead of 'THREAD' here for consistency?

    L3681:     method_handle = Handle(THREAD, JNIHandles::resolve(method));
    L3682:     Handle receiver(THREAD, JNIHandles::resolve(obj));
    L3683:     objArrayHandle args(THREAD, objArrayOop(JNIHandles::resolve(args0)));
    L3685:     jobject res = JNIHandles::make_local(THREAD, result);
        Can you use 'thread' instead of 'THREAD' here for consistency?

    L3705:   objArrayHandle args(THREAD, objArrayOop(JNIHandles::resolve(args0)));
    L3707   jobject res = JNIHandles::make_local(THREAD, result);
        Can you use 'thread' instead of 'THREAD' here for consistency?

src/hotspot/share/prims/methodHandles.cpp
    No comments.

src/hotspot/share/prims/methodHandles.hpp
    No comments.

src/hotspot/share/prims/unsafe.cpp
    No comments.

src/hotspot/share/prims/whitebox.cpp
    No comments.

src/hotspot/share/runtime/jniHandles.cpp
    No comments.

src/hotspot/share/runtime/jniHandles.hpp
    No comments.

src/hotspot/share/services/management.cpp
    No comments.


None of my comments above are "must do". If you choose to make the
changes, a new webrev isn't required, but would be useful for a
sanity check.

Thumbs up.

Dan


>
> On 20/07/2020 3:22 pm, Kim Barrett wrote:
>>> On Jul 20, 2020, at 12:16 AM, David Holmes <david.holmes at oracle.com> 
>>> wrote:
>>>
>>> Subject line got truncated by accident ...
>>>
>>> On 20/07/2020 11:06 am, David Holmes wrote:
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8249650
>>>> webrev: http://cr.openjdk.java.net/~dholmes/8249650/webrev/
>>>> This is a simple cleanup that touches files across a number of VM 
>>>> areas - hence the cross-post.
>>>> Whilst working on a different JNI fix I noticed that in most cases 
>>>> in jni.cpp we were using the following form of make_local:
>>>> JNIHandles::make_local(env, obj);
>>>> and what that form does is first extract the thread from the JNIEnv:
>>>> JavaThread* thread = JavaThread::thread_from_jni_environment(env);
>>>> return thread->active_handles()->allocate_handle(obj);
>>>> but there is also another, faster, variant for when you already 
>>>> have the "thread":
>>>> jobject JNIHandles::make_local(Thread* thread, oop obj) {
>>>>    return thread->active_handles()->allocate_handle(obj);
>>>> }
>>>> When you look at the JNI_ENTRY wrapper (and related JVM_ENTRY, 
>>>> WB_ENTRY, UNSAFE_ENTRY etc) it has already extracted the thread 
>>>> from the JNIEnv:
>>>>      JavaThread* thread=JavaThread::thread_from_jni_environment(env);
>>>> and further defined:
>>>>      Thread* THREAD = thread;
>>>> so we always already have direct access to the "thread" available 
>>>> (or indirect via TRAPS), and in fact we can end up removing the 
>>>> make_local(JNIEnv* env, oop obj) variant altogether.
>>>> Along the way I spotted some related issues with unnecessary use of 
>>>> Thread::current() when it is already available from TRAPS, and some 
>>>> other cases where we extracted the JNIEnv from a thread only to 
>>>> later extract the thread from the JNIEnv.
>>>> Testing: tiers 1 - 3
>>>> Thanks,
>>>> David
>>>> -----
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/classfile/javaClasses.cpp
>>   439     JNIEnv *env = thread->jni_environment();
>>
>> Since env is no longer used on the next line, move this down to where
>> it is used, at line 444.
>
> Fixed.
>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/classfile/verifier.cpp
>>   299   JNIEnv *env = thread->jni_environment();
>>
>> env now seems to only be used at line 320.  Move this closer.
>
> Fixed.
>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/prims/jni.cpp
>>   743     result = JNIHandles::make_local(THREAD, result_handle());
>>
>> jni_PopLocalFrame is now using a mix of "thread" and "THREAD", where
>> previously it just used "thread". Maybe this change shouldn't be made?
>> Or can the other uses be changed to THREAD for consistency?
>
> "thread" and "THREAD" are interchangeable for anything expecting a 
> "Thread*" (and somewhat surprisingly a number of API's that only work 
> for JavaThreads actually take a Thread*. :( ). I had a choice between 
> trying to be file-wide consistent with the make_local calls, versus 
> local-code consistent, and used THREAD as it is available in both 
> JNI_ENTRY and via TRAPS. But I can certainly make a local change to 
> "thread" for local consistency.
>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/prims/jvm.cpp
>>
>> The calls to JvmtiExport::post_vm_object_alloc have to use "thread"
>> instead of "THREAD", even though other places nearby are using
>> "THREAD".  That inconsistency is kind of unfortunate, but doesn't seem
>> easily avoidable.
>
> Everything that uses THREAD in a JVM_ENTRY method can be changed to 
> use "thread" instead. But I'm not sure it's a consistency worth 
> pursuing at least as part of these changes (there are likely similar 
> issues with most of the touched files).
>
> Thanks,
> David
>
>> ------------------------------------------------------------------------------ 
>>
>>
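
The relationship between the two make_local forms discussed in this thread can be sketched with a small stand-alone model. Everything below is a mock written for illustration: MockOop, MockHandle, MockJNIEnv, MockJavaThread and LocalHandleBlock are hypothetical stand-ins, not HotSpot types, and only the shape of the two overloads follows the description quoted above.

```cpp
#include <cassert>
#include <deque>

// Hypothetical stand-ins for oop and jobject (not the real HotSpot types).
typedef int MockOop;
typedef const MockOop* MockHandle;

struct MockJavaThread;

// Mock of a JNI environment that knows which thread owns it.
struct MockJNIEnv {
  MockJavaThread* owner;
};

// Mock of the per-thread block of local handles.
struct LocalHandleBlock {
  std::deque<MockOop> slots;  // a deque keeps element addresses stable on push_back
  MockHandle allocate_handle(MockOop obj) {
    slots.push_back(obj);
    return &slots.back();
  }
};

struct MockJavaThread {
  MockJNIEnv env;
  LocalHandleBlock handles;
  MockJavaThread() { env.owner = this; }
  MockJavaThread(const MockJavaThread&) = delete;  // env.owner must keep pointing here
  MockJNIEnv* jni_environment() { return &env; }
  LocalHandleBlock* active_handles() { return &handles; }
  static MockJavaThread* thread_from_jni_environment(MockJNIEnv* e) { return e->owner; }
};

// Direct form: the caller already has the thread.
inline MockHandle make_local(MockJavaThread* thread, MockOop obj) {
  return thread->active_handles()->allocate_handle(obj);
}

// Indirect form: recover the thread from the environment first, then delegate.
inline MockHandle make_local(MockJNIEnv* env, MockOop obj) {
  return make_local(MockJavaThread::thread_from_jni_environment(env), obj);
}
```

In this model the env-based overload does strictly more work than the thread-based one, which mirrors the argument in the thread: when an entry wrapper has already produced the thread, passing it directly skips the extra lookup.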


From vladimir.a.ivanov at intel.com  Mon Jul 20 17:12:25 2020
From: vladimir.a.ivanov at intel.com (Ivanov, Vladimir A)
Date: Mon, 20 Jul 2020 17:12:25 +0000
Subject: [16] RFR(S) 8249672: Include microcode revision in
 features_string on x86
In-Reply-To: <CAA-vtUxWzQ6bsxv08CGTfEN_qpj5cXz00eVcJeb1fiqOGe0UoA@mail.gmail.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
 <29dd9cde-48c8-915f-fa28-26312c7af17a@oracle.com>
 <d6d5d0a8-c990-c74f-ab8a-ef0a8e9a17d0@oracle.com>
 <BYAPR11MB378279AB52DD8560F661DA03A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUxq5BinzYfOF6bmDO1OxxLexPnaoYJfPVeC4f1j05AEig@mail.gmail.com>
 <BYAPR11MB37828BD1DA9857660415F50EA77D0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUxWzQ6bsxv08CGTfEN_qpj5cXz00eVcJeb1fiqOGe0UoA@mail.gmail.com>
Message-ID: <BYAPR11MB37826BC619E8ECC8BF62C711A77B0@BYAPR11MB3782.namprd11.prod.outlook.com>

Hi,
The updated patch is available at http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.03/
It uses 'fgets' instead of 'getline' so that the line buffer is local memory.
The tier1 tests passed on the release and fastdebug builds on Linux and on the fastdebug builds on MacOS systems.
The test results are the same for patched and non-patched builds.

Thanks, Vladimir
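
A minimal sketch of the fgets-based approach mentioned above, for readers following the thread. It is not the code from webrev.03: the helper name parse_microcode_revision, the split into two functions, and the 2048-byte buffer (based on the "2k byte array" remark later in this thread) are assumptions made for illustration. Only os::cpu_microcode_revision() is named in the thread; the free function below merely borrows that name.

```cpp
#include <cassert>
#include <stdio.h>
#include <string.h>

// Hypothetical helper: scans cpuinfo-style text for the first line that
// mentions "microcode" and parses the hex value after the ':'.
// Returns 0 if no such line is found or the value cannot be parsed.
inline unsigned int parse_microcode_revision(FILE* fp) {
  char line[2048];          // caller-owned stack buffer; fgets never reallocates it
  unsigned int result = 0;
  while (fgets(line, sizeof(line), fp) != NULL) {
    if (strstr(line, "microcode") != NULL) {
      const char* rev = strchr(line, ':');
      if (rev != NULL) {
        sscanf(rev + 1, "%x", &result);  // %x accepts an optional 0x prefix
      }
      break;
    }
  }
  return result;
}

// Assumed wrapper for Linux; the path is the one discussed in the thread.
inline unsigned int cpu_microcode_revision() {
  FILE* fp = fopen("/proc/cpuinfo", "r");
  if (fp == NULL) return 0;
  unsigned int result = parse_microcode_revision(fp);
  fclose(fp);
  return result;
}
```

Because the line buffer is a plain array, there is nothing to release afterwards, so the question of ::free() versus os::free() raised earlier in the thread does not come up for it. Lines longer than the buffer are simply read in pieces, which is acceptable here because only the short "microcode" line matters.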

From: Thomas Stüfe <thomas.stuefe at gmail.com>
Sent: Friday, July 17, 2020 10:25 PM
To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>
Cc: Vladimir Kozlov <vladimir.kozlov at oracle.com>; Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: [16] RFR(S) 8249672: Include microcode revision in features_string on x86

Oh, sorry, you are right :(

I was under the assumption you wanted to call os::cpu_microcode_revision() directly from within VMError::report(). During initialization using c-heap like this should not be a problem and you can forget about 9/10ths of what I wrote, sorry.

In that case your original variant is fine, my only suggestion would be to clearly mark the free as ::free() with a comment to prevent someone from correcting it to os::free.

Thank you,

Thomas


On Sat, Jul 18, 2020 at 7:08 AM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com> wrote:
Hi,
It seems this info is created during the initialization phase. Is that correct? Collecting or parsing common info at the crash point is usually not a good idea. During initialization, use of the C heap is not a problem.
'::free' works OK here. At least the tier1 tests produce the same results for patched and non-patched builds, but these tests do not generate a real case for hs_err files.
It looks like a 2k byte array is enough for one CPU record from the cpuinfo file. I will update the code to use a local buffer.

Thanks, Vladimir

From: Thomas Stüfe <thomas.stuefe at gmail.com>
Sent: Friday, July 17, 2020 9:42 PM
To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>
Cc: Vladimir Kozlov <vladimir.kozlov at oracle.com>; Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: [16] RFR(S) 8249672: Include microcode revision in features_string on x86

Hi,

yes, you must use the raw free here (for the same reason we cannot pass in an os::malloc() allocated buffer to getline, since if it were to resize it would use raw ::realloc() internally and crash the same way).

But as I wrote in my first mail to the original thread, I would not use c-heap memory at all, since this function is used during crash reporting in the signal handler and the c-heap may be corrupted.

If the max line length of /proc/cpuinfo can be reliably predicted (so that getline won't realloc()) I would pass a stack allocated buffer into getline. If not, I would not use getline() at all but rewrite this, probably using fgets().

Cheers, Thomas


On Sat, Jul 18, 2020 at 1:24 AM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com> wrote:
Thanks, I expected the C functions here. Let's wait a little for the Runtime team and then update how the buffer is handled.

 Thanks, Vladimir

-----Original Message-----
From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
Sent: Friday, July 17, 2020 4:17 PM
To: Thomas Stüfe <thomas.stuefe at gmail.com>; Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>
Cc: Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: [16] RFR(S) 8249672: Include microcode revision in features_string on x86

I think the issue is that the 'line' buffer is allocated by libc getline(), and os::free(), which is a HotSpot function [1], does not know about it. You need C's ::free(), or use HS's os::malloc() to allocate the 'line' buffer.

Someone from Runtime may suggest what is the best for this case.

Thanks,
Vladimir K

[1] http://hg.openjdk.java.net/jdk/jdk/file/14f465f62984/src/hotspot/share/runtime/os.cpp#l792

On 7/17/20 4:03 PM, Vladimir Kozlov wrote:
> I updated subject to our formal review request format (JDK version, RFE's id and subject).
>
> I moved RFE to runtime group as Thomas said:
>
> https://bugs.openjdk.java.net/browse/JDK-8249672
>
> Submitted tier1 testing to build on all our supported platforms. And debug builds on linux failed:
>
> #  SIGSEGV (0xb) at pc=0x0000146fc6af4b0b, pid=9715, tid=9718
> # V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
>
> V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
> V  [libjvm.so+0x13c898a]  verify_memory(void*)+0x26a
> V  [libjvm.so+0x13cd30b]  os::free(void*)+0x5b
> V  [libjvm.so+0x13e5598]  os::cpu_microcode_revision()+0xc8
> V  [libjvm.so+0x17d314c]  VM_Version::get_processor_features()+0x76c
> V  [libjvm.so+0x17d6ead]  VM_Version::initialize()+0x10d
> V  [libjvm.so+0x17ce6c6]  VM_Version_init()+0x26
> V  [libjvm.so+0xcb2895]  init_globals()+0x55
> V  [libjvm.so+0x16dde63]  Threads::create_vm(JavaVMInitArgs*, bool*)+0x2d3
>
>
> Regards,
> Vladimir K
>
> On 7/17/20 3:02 PM, Thomas Stüfe wrote:
>> Hi Vladimir,
>>
>> On Fri, Jul 17, 2020 at 11:57 PM Ivanov, Vladimir A <
>> vladimir.a.ivanov at intel.com> wrote:
>>
>>>>   +#if defined(IA32) || defined(AMD64)
>>>>
>>>> Is that not synonymous with x86?
>>>
>>> This pattern was copied from the method 'print_model_name_and_flags'
>>> (file os/linux/os_linux.cpp).
>>>
>>> This method also reads the '/proc/cpuinfo' file and I reused it as
>>> a 'template' for the new method.
>>>
>>> It is better to use one pattern to work with exactly same file but
>>> in general you are right.
>>>
>>> The X86 is defined in the file ./share/utilities/macros.hpp as:
>>>
>>> #if defined(IA32) || defined(AMD64)
>>>
>>> #define X86
>>>
>>> #define X86_ONLY(code) code
>>>
>>> #define NOT_X86(code)
>>>
>>>
>>>
>>> The question here: could I delete these 'ifdefs', given that this method
>>> should work on x86 only?
>>>
>>>
>>>
>>
>> os_linux_x86.cpp is compiled for x86 platforms only, whereas
>> os_linux.cpp is shared among all architectures.
>>
>> So, in the former you do not need to exclude non-x86 architectures.
>>
>> Cheers, Thomas
>>
>>
>>> Thanks, Vladimir
>>>
>>>
>>>
>>> *From:* Thomas Stüfe <thomas.stuefe at gmail.com>
>>> *Sent:* Friday, July 17, 2020 2:26 PM
>>> *To:* Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>; Hotspot dev
>>> runtime <hotspot-runtime-dev at openjdk.java.net>
>>> *Cc:* hotspot-compiler-dev at openjdk.java.net
>>> *Subject:* Re: add microcode version to the hs_err files
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe
>>> <thomas.stuefe at gmail.com>
>>> wrote:
>>>
>>> Hi Vladimir,
>>>
>>>
>>>
>>> I think this would be more suited to hotspot-runtime.
>>>
>>>
>>>
>>>
>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>>> src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html
>>>
>>>
>>>
>>> +#if defined(IA32) || defined(AMD64)
>>>
>>> Is that not synonymous with x86?
>>>
>>>
>>>
>>> +    while ((read = getline(&line, &len, fp)) != -1) {
>>> +      if (len > 10 && strstr(line, "microcode") != NULL) {
>>> +        char* rev = strchr(line, ':');
>>> +        if (rev != NULL) sscanf(rev + 1, "%x", &result);
>>> +        break;
>>> +      }
>>> +    }
>>> +    free(line);
>>>
>>>
>>>
>>> Not sure this works as intended. At the first call to getline() it
>>> will allocate a line buffer for you and return it. That buffer will
>>> be as large as the first line you happen to read. You then pass that
>>> same buffer into getline to fetch the next lines, but what if those
>>> are longer than the first?
>>>
>>>
>>>
>>>
>>>
>>> Forget that point, getline calls realloc() on the line buffer to
>>> resize it, so this should be okay.
>>>
>>>
>>>
>>> Thanks, Thomas
>>>
>>>
>>>
>>> But anyway it would be better to pass a simple caller provided
>>> buffer in - stack allocated. Since this function is called at crash
>>> time and the C heap could be corrupted.
>>>
>>>
>>>
>>> Cheers, Thomas
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A <
>>> vladimir.a.ivanov at intel.com> wrote:
>>>
>>> Hello,
>>>
>>> could you please review the patch
>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>>>
>>> This patch adds the microcode version for different OSes, which may be
>>> useful in the issue resolution process.
>>>
>>>
>>>
>>> The reported microcode version for different OSes looks as follows:
>>>
>>>
>>>
>>> Linux (RHEL7.7):
>>>
>>> # cat hs_err_pid251046.log |grep microc
>>>
>>> CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per
>>> core) family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8,
>>> fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt,
>>> vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht,
>>> tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt, clwb
>>>
>>>
>>>
>>> Windows (Win10, v1809):
>>>
>>> CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per
>>> core) family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr,
>>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper,
>>> avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc,
>>> tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt
>>>
>>>
>>>
>>> MacOS (Darwin):
>>>
>>> $ cat hs_err_pid95187.log |grep microc
>>>
>>> CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per
>>> core) family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr,
>>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper,
>>> avx, avx2, aes, clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit,
>>> bmi1, bmi2, adx, sha, fma, clflush, clflushopt
>>>
>>>
>>>
>>> Thanks, Vladimir
>>>
>>>
>>>    Thanks, Vladimir
>>>
>>>

From igor.ignatyev at oracle.com  Mon Jul 20 17:13:34 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Mon, 20 Jul 2020 10:13:34 -0700
Subject: [15] RFR(T) : 8249697 : java/lang/invoke/RicochetTest.java should
 use @requires instead of @ignore
In-Reply-To: <60806519-8e57-d126-8a2e-800053b4ee9a@oracle.com>
References: <054E0326-B61C-40FA-A8E3-89C433A49EE3@oracle.com>
 <60806519-8e57-d126-8a2e-800053b4ee9a@oracle.com>
Message-ID: <956AE2D8-1D95-4357-9DBD-9A9D5ABF9CD1@oracle.com>

Hi Mandy,

that's actually the opposite, the 2nd subtest is run only in modes other than Xcomp, as w/ Xcomp the test creates lots of adapters and used to lead to JVM failure as described in 7049122. I tried to reproduce this failure, but in vain. after a bit more historical digging, I realized that the underlying problem was 7009641, which has been fixed in hs25/jdk8. so I've changed the fix for 8249697 to simply restore the run w/ '-DRicochetTest.MAX_ARITY=255': http://cr.openjdk.java.net/~iignatyev//8249697/webrev.02

I've verified that the test passes w/ Xcomp and 
 - -XX:+TieredCompilation (c1 + c2);
 - -XX:-TieredCompilation (c2-only);
 - -XX:+NeverActAsServerClassMachine (emulated-client, c1-only)

the test was run 100 times on {linux,windows,macos}-x64 w/ 0 failures.
 
Thanks,
-- Igor

> On Jul 18, 2020, at 9:32 PM, Mandy Chung <mandy.chung at oracle.com> wrote:
> 
> 
> 
> On 7/17/20 8:54 PM, Igor Ignatyev wrote:
>> http://cr.openjdk.java.net/~iignatyev/8249697/webrev.00/
>> 
> 
> I suggest to change this:
>   32  * @comment The following test creates an unreasonable number of adapters in -Xcomp mode (7049122)
> 
> To:
> 
>    @bug 8249697
>    @summary verify very high number of adapters in -Xcomp mode
> 
> Otherwise, looks fine.
> 
> Mandy
>> Hi all,
>> 
>> could you please review this small and trivial patch for java/lang/invoke/RicochetTest.java test?
>> from JBS:
>>> a run of java/lang/invoke/RicochetTest.java w/ MAX_ARITY=255 was removed from all configurations by JDK-7049122, yet the problem manifests itself only w/ Xcomp. as now we have @requires to filter out tests from certain configurations, the test can be updated to run MAX_ARITY=255 in all configs but Xcomp.
>> the patch splits the test into two subtests, each one w/ one @run, and use @requires to exclude one w/ MAX_ARITY=255 from execution if Xcomp flag is used.
>> 
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249697
>> webrev: http://cr.openjdk.java.net/~iignatyev/8249697/webrev.00/
>> testing: java/lang/invoke/RicochetTest.java on {linux,windows,macos}-x64 w/ and w/o -Xcomp; Xcomp runs, as expected, had only 1 test run
>> 
>> Thanks,
>> -- Igor
>> 
>> JDK-7049122 : https://bugs.openjdk.java.net/browse/JDK-7049122


From vladimir.kozlov at oracle.com  Mon Jul 20 18:01:31 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 20 Jul 2020 11:01:31 -0700
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
In-Reply-To: <3d5fc552-d3e2-494b-e921-c65967af8207@oracle.com>
References: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
 <8325fac5-6258-9b88-6507-5dcb0597cc17@oracle.com>
 <8522a69e-e538-2cc9-5364-887e450fc653@oracle.com>
 <46144d6d-5714-05ad-a263-01507db937cc@oracle.com>
 <7a361c29-4771-9ed4-1542-b3f68a5726f3@oracle.com>
 <6dc4c99b-1d90-09f1-60d1-fb2caf981266@oracle.com>
 <6b4e4dda-01d4-37d0-5403-a4f5481e5bf0@oracle.com>
 <f8e18f7b-bd13-5032-fb17-04fb037c88b4@oracle.com>
 <32d7fb64-75a5-7add-d496-df33cfaefabf@oracle.com>
 <f9f80bf5-a565-3820-efa1-2b949b92d764@oracle.com>
 <fa60eea1-3d7b-af3a-678d-354263e3eccc@oracle.com>
 <cd1bfaeb-96a5-73e0-c244-e58b9484ec17@oracle.com>
 <4ffa8190-d57e-a9a2-e508-0d98035a34c6@oracle.com>
 <0fa9d47a-e568-bf22-4c49-74c926ae9f14@oracle.com>
 <16aead29-6788-a7e8-bf6e-ae2b56fdb9dc@oracle.com>
 <2805861f-4760-c768-9b1e-55cd6af1cde1@oracle.com>
 <3d5fc552-d3e2-494b-e921-c65967af8207@oracle.com>
Message-ID: <f1d233cc-8243-37e8-1592-90e55574953b@oracle.com>

I asked to have 2 different test methods to reproduce 2 cases separately.
You can't mix them.

Regards,
Vladimir


On 7/20/20 6:48 AM, Jamsheed C M wrote:
> Hi Tobias,
> On 20/07/2020 19:05, Tobias Hartmann wrote:
>> Hi Jamsheed,
>>
>> On 20.07.20 15:30, Jamsheed C M wrote:
>>> Revised webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.03/
>> You don't need #ifdef ASSERT in escape.cpp:2252. Otherwise looks good to me! No new webrev required.
> 
> Missed removing it. Thank you for the review.
> 
> Best regards,
> 
> Jamsheed
> 
>>
>> Best regards,
>> Tobias

From mandy.chung at oracle.com  Mon Jul 20 18:57:09 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Mon, 20 Jul 2020 11:57:09 -0700
Subject: [15] RFR(T) : 8249697 : java/lang/invoke/RicochetTest.java should
 use @requires instead of @ignore
In-Reply-To: <956AE2D8-1D95-4357-9DBD-9A9D5ABF9CD1@oracle.com>
References: <054E0326-B61C-40FA-A8E3-89C433A49EE3@oracle.com>
 <60806519-8e57-d126-8a2e-800053b4ee9a@oracle.com>
 <956AE2D8-1D95-4357-9DBD-9A9D5ABF9CD1@oracle.com>
Message-ID: <9f5959b8-f9ff-9962-77f7-7807b247ae90@oracle.com>

Hi Igor,

OK.  Should this revert the change by 7049122 then? i.e. simply change 
-DRicochetTest.MAX_ARITY=10 to 255

Your proposed patch adds a new @run instead of modifying the existing 
@run command:

   * @run junit/othervm/timeout=3600 -XX:+IgnoreUnrecognizedVMOptions -XX:-VerifyDependencies -DRicochetTest.MAX_ARITY=10 test.java.lang.invoke.RicochetTest

I looked at the history and this @run was modified by JDK-7197210, which 
added the -XX:+IgnoreUnrecognizedVMOptions -XX:-VerifyDependencies options 
and reduced MAX_ARITY from 50 to 10.

This issue is not critical to target for 15.  It may be worth considering 
targeting this test fix for 16.  Just a suggestion.

Mandy

On 7/20/20 10:13 AM, Igor Ignatyev wrote:
> Hi Mandy,
>
> that's actually the opposite, the 2nd subtest is run only in modes 
> other than Xcomp, as w/ Xcomp the test creates lots of adapters and 
> used to lead to JVM failure as described in 7049122. I tried to 
> reproduce this failure, but in vain. after a bit more historical 
> digging, I realized that the underlying problem was 7009641, which has 
> been fixed in hs25/jdk8. so I've changed the fix for 8249697 to simply 
> restore the run w/ '-DRicochetTest.MAX_ARITY=255': 
> http://cr.openjdk.java.net/~iignatyev//8249697/webrev.02
>
> I've verified that the test passes w/ Xcomp and
>  - -XX:+TieredCompilation (c1 + c2);
>  - -XX:-TieredCompilation (c2-only);
>  - -XX:+NeverActAsServerClassMachine (emulated-client, c1-only)
>
> the test was run 100 times on {linux,windows,macos}-x64 w/ 0 failures.
> Thanks,
> -- Igor
>
>> On Jul 18, 2020, at 9:32 PM, Mandy Chung <mandy.chung at oracle.com> wrote:
>>
>>
>>
>> On 7/17/20 8:54 PM, Igor Ignatyev wrote:
>>> http://cr.openjdk.java.net/~iignatyev/8249697/webrev.00/
>>>
>>
>> I suggest to change this:
>>   32  * @comment The following test creates an unreasonable number of 
>> adapters in -Xcomp mode (7049122)
>>
>> To:
>>
>>    @bug 8249697
>>    @summary verify very high number of adapters in -Xcomp mode
>>
>> Otherwise, looks fine.
>>
>> Mandy
>>> Hi all,
>>>
>>> could you please review this small and trivial patch for java/lang/invoke/RicochetTest.java test?
>>> from JBS:
>>>> a run of java/lang/invoke/RicochetTest.java w/ MAX_ARITY=255 was removed from all configurations by JDK-7049122, yet the problem manifests itself only w/ Xcomp. as now we have @requires to filter out tests from certain configurations, the test can be updated to run MAX_ARITY=255 in all configs but Xcomp.
>>> the patch splits the test into two subtests, each one w/ one @run, and use @requires to exclude one w/ MAX_ARITY=255 from execution if Xcomp flag is used.
>>>
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249697
>>> webrev: http://cr.openjdk.java.net/~iignatyev/8249697/webrev.00/
>>> testing: java/lang/invoke/RicochetTest.java on {linux,windows,macos}-x64 w/ and w/o -Xcomp; Xcomp runs, as expected, had only 1 test run
>>>
>>> Thanks,
>>> -- Igor
>>>
>>> JDK-7049122 : https://bugs.openjdk.java.net/browse/JDK-7049122
>>
>


From jamsheed.c.m at oracle.com  Mon Jul 20 19:22:11 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Tue, 21 Jul 2020 00:52:11 +0530
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
In-Reply-To: <f1d233cc-8243-37e8-1592-90e55574953b@oracle.com>
References: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
 <8325fac5-6258-9b88-6507-5dcb0597cc17@oracle.com>
 <8522a69e-e538-2cc9-5364-887e450fc653@oracle.com>
 <46144d6d-5714-05ad-a263-01507db937cc@oracle.com>
 <7a361c29-4771-9ed4-1542-b3f68a5726f3@oracle.com>
 <6dc4c99b-1d90-09f1-60d1-fb2caf981266@oracle.com>
 <6b4e4dda-01d4-37d0-5403-a4f5481e5bf0@oracle.com>
 <f8e18f7b-bd13-5032-fb17-04fb037c88b4@oracle.com>
 <32d7fb64-75a5-7add-d496-df33cfaefabf@oracle.com>
 <f9f80bf5-a565-3820-efa1-2b949b92d764@oracle.com>
 <fa60eea1-3d7b-af3a-678d-354263e3eccc@oracle.com>
 <cd1bfaeb-96a5-73e0-c244-e58b9484ec17@oracle.com>
 <4ffa8190-d57e-a9a2-e508-0d98035a34c6@oracle.com>
 <0fa9d47a-e568-bf22-4c49-74c926ae9f14@oracle.com>
 <16aead29-6788-a7e8-bf6e-ae2b56fdb9dc@oracle.com>
 <2805861f-4760-c768-9b1e-55cd6af1cde1@oracle.com>
 <3d5fc552-d3e2-494b-e921-c65967af8207@oracle.com>
 <f1d233cc-8243-37e8-1592-90e55574953b@oracle.com>
Message-ID: <c36328fe-3a91-e27b-40a8-123c273a7b19@oracle.com>

Hi Vladimir,

Added both the tests

http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.04/

Best Regards,

Jamsheed

On 20/07/2020 23:31, Vladimir Kozlov wrote:
> I asked to have 2 different test methods to reproduce 2 cases separately.
> You can't mix them.
>
> Regards,
> Vladimir
>
>
> On 7/20/20 6:48 AM, Jamsheed C M wrote:
>> Hi Tobias,
>> On 20/07/2020 19:05, Tobias Hartmann wrote:
>>> Hi Jamsheed,
>>>
>>> On 20.07.20 15:30, Jamsheed C M wrote:
>>>> Revised webrev: 
>>>> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.03/
>>> You don't need #ifdef ASSERT in escape.cpp:2252. Otherwise looks 
>>> good to me! No new webrev required.
>>
>> Missed removing it. Thank you for the review.
>>
>> Best regards,
>>
>> Jamsheed
>>
>>>
>>> Best regards,
>>> Tobias

From igor.ignatyev at oracle.com  Mon Jul 20 19:22:05 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Mon, 20 Jul 2020 12:22:05 -0700
Subject: [15] RFR(T) : 8249697 : java/lang/invoke/RicochetTest.java should
 use @requires instead of @ignore
In-Reply-To: <9f5959b8-f9ff-9962-77f7-7807b247ae90@oracle.com>
References: <054E0326-B61C-40FA-A8E3-89C433A49EE3@oracle.com>
 <60806519-8e57-d126-8a2e-800053b4ee9a@oracle.com>
 <956AE2D8-1D95-4357-9DBD-9A9D5ABF9CD1@oracle.com>
 <9f5959b8-f9ff-9962-77f7-7807b247ae90@oracle.com>
Message-ID: <3BD624C3-D3B7-4306-959A-1062CA34DF64@oracle.com>

Hi Mandy,

you are right, it's better to have just one @run, and as I don't think that 7197210 changes '-XX:-VerifyDependencies' nor '/timeout=3600' are needed anymore, I suggest to restore the test to its original version w/  `@run junit/othervm -DRicochetTest.MAX_ARITY=255 test.java.lang.invoke.RicochetTest`, so the patch (http://cr.openjdk.java.net/~iignatyev//8249697/webrev.03) would be just:

> -/* @test
> +/*
> + * @test
>   * @summary unit tests for recursive method handles
> - * @run junit/othervm/timeout=3600 -XX:+IgnoreUnrecognizedVMOptions -XX:-VerifyDependencies -DRicochetTest.MAX_ARITY=10 test.java.lang.invoke.RicochetTest
> - */
> -/*
> - * @ignore The following test creates an unreasonable number of adapters in -Xcomp mode (7049122)
>   * @run junit/othervm -DRicochetTest.MAX_ARITY=255 test.java.lang.invoke.RicochetTest
>   */


and then the bug's summary would be something like 'remove temporary fixes from java/lang/invoke/RicochetTest.java'.

Sure, there is no reason for it to be pushed into 15; I've retargeted it to 16.

-- Igor

> On Jul 20, 2020, at 11:57 AM, Mandy Chung <mandy.chung at oracle.com> wrote:
> 
> Hi Igor,
> 
> OK.  Should this revert the change by 7049122 then? i.e. simply change -DRicochetTest.MAX_ARITY=10 to 255
> 
> Your proposed patch adds a new @run instead of modifying the existing @run command:
> 
>   * @run junit/othervm/timeout=3600 -XX:+IgnoreUnrecognizedVMOptions -XX:-VerifyDependencies -DRicochetTest.MAX_ARITY=10 test.java.lang.invoke.RicochetTest
> 
> I looked at the history and this @run was modified by JDK-7197210 that adds -XX:+IgnoreUnrecognizedVMOptions -XX:-VerifyDependencies options and reduce MAX_ARITY from 50 to 10.
> 
> This issue is not critical to target for 15.  It may worth considering target this test fix for 16.  Just a suggestion.
> 
> Mandy
> 
> On 7/20/20 10:13 AM, Igor Ignatyev wrote:
>> Hi Mandy,
>> 
>> that's actually the opposite, the 2nd subtest is run only in modes other than Xcomp, as w/ Xcomp the test creates lots of adapters and used to lead to JVM failure as described in 7049122. I tried to reproduce this failure, but in vain,..  after a bit more historical digging, I realized that the underlying problem was 7009641, which has been fixed in hs25/jdk8. so I've changed the fix for 8249697 to simply return run w/ '-DRicochetTest.MAX_ARITY=255': http://cr.openjdk.java.net/~iignatyev//8249697/webrev.02 <http://cr.openjdk.java.net/~iignatyev//8249697/webrev.02>
>> 
>> I've verified that the test passes w/ Xcomp and 
>>  - -XX:+TieredCompilation (c1 + c2);
>>  - -XX:-TieredCompilation (c2-only);
>>  - -XX:+NeverActAsServerClassMachine (emulated-client, c1-only)
>> 
>> the test was run 100 times on {linux,windows,macos}-x64 w/ 0 failures.
>>  
>> Thanks,
>> -- Igor
>> 
>>> On Jul 18, 2020, at 9:32 PM, Mandy Chung <mandy.chung at oracle.com <mailto:mandy.chung at oracle.com>> wrote:
>>> 
>>> 
>>> 
>>> On 7/17/20 8:54 PM, Igor Ignatyev wrote:
>>>> http://cr.openjdk.java.net/~iignatyev/8249697/webrev.00/ <http://cr.openjdk.java.net/~iignatyev/8249697/webrev.00/>
>>>> 
>>> 
>>> I suggest to change this:
>>>   32  * @comment The following test creates an unreasonable number of adapters in -Xcomp mode (7049122)
>>> 
>>> To:
>>> 
>>>    @bug 8249697
>>>    @summary verify very high number of adapters in -Xcomp mode
>>> 
>>> Otherwise, looks fine.
>>> 
>>> Mandy
>>>> Hi all,
>>>> 
>>>> could you please review this small and trivial patch for java/lang/invoke/RicochetTest.java test?
>>>> from JBS:
>>>>> a run of java/lang/invoke/RicochetTest.java w/ MAX_ARITY=255 was removed from all configurations by JDK-7049122, yet the problem manifests itself only w/ Xcomp. as now we have @requires to filter out tests from certain configurations, the test can be updated to run MAX_ARITY=255 in all configs but Xcomp.
>>>> the patch splits the test into two subtests, each one w/ one @run, and use @requires to exclude one w/ MAX_ARITY=255 from execution if Xcomp flag is used.
>>>> 
>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8249697 <https://bugs.openjdk.java.net/browse/JDK-8249697>
>>>> webrev: http://cr.openjdk.java.net/~iignatyev/8249697/webrev.00/ <http://cr.openjdk.java.net/~iignatyev/8249697/webrev.00/>
>>>> testing: java/lang/invoke/RicochetTest.java on {linux,windows,macos}-x64 w/ and w/o -Xcomp; Xcomp runs, as expected, had only 1 test run
>>>> 
>>>> Thanks,
>>>> -- Igor
>>>> 
>>>> JDK-7049122 : https://bugs.openjdk.java.net/browse/JDK-7049122 <https://bugs.openjdk.java.net/browse/JDK-7049122>
>> 
> 


From mandy.chung at oracle.com  Mon Jul 20 19:44:02 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Mon, 20 Jul 2020 12:44:02 -0700
Subject: [15] RFR(T) : 8249697 : java/lang/invoke/RicochetTest.java should
 use @requires instead of @ignore
In-Reply-To: <3BD624C3-D3B7-4306-959A-1062CA34DF64@oracle.com>
References: <054E0326-B61C-40FA-A8E3-89C433A49EE3@oracle.com>
 <60806519-8e57-d126-8a2e-800053b4ee9a@oracle.com>
 <956AE2D8-1D95-4357-9DBD-9A9D5ABF9CD1@oracle.com>
 <9f5959b8-f9ff-9962-77f7-7807b247ae90@oracle.com>
 <3BD624C3-D3B7-4306-959A-1062CA34DF64@oracle.com>
Message-ID: <1ceb03d9-5f15-22ed-286e-881dfa751c06@oracle.com>

webrev.03 looks good.

Mandy

On 7/20/20 12:22 PM, Igor Ignatyev wrote:
> Hi Mandy,
>
> you are right, it's better to have just one @run, and as I don't think 
> that 7197210 changes '-XX:-VerifyDependencies' nor '/timeout=3600' are 
> needed anymore, I suggest to restore the test to its original version 
> w/ `@run junit/othervm -DRicochetTest.MAX_ARITY=255 
> test.java.lang.invoke.RicochetTest`, so the patch 
> (http://cr.openjdk.java.net/~iignatyev//8249697/webrev.03) would be just:
>
>> -/* @test
>> +/*
>> + * @test
>>   * @summary unit tests for recursive method handles
>> - * @run junit/othervm/timeout=3600 -XX:+IgnoreUnrecognizedVMOptions 
>> -XX:-VerifyDependencies -DRicochetTest.MAX_ARITY=10 
>> test.java.lang.invoke.RicochetTest
>> - */
>> -/*
>> - * @ignore The following test creates an unreasonable number of 
>> adapters in -Xcomp mode (7049122)
>>   * @run junit/othervm -DRicochetTest.MAX_ARITY=255 
>> test.java.lang.invoke.RicochetTest
>>   */
>
> and then the bug's summary would be smth like 'remove temporary fixes 
> from java/lang/invoke/RicochetTest.java'.
>
> sure there is no reason for it to be pushed into 15, I've retargeted 
> to 16.
>
> -- Igor
>
>> On Jul 20, 2020, at 11:57 AM, Mandy Chung <mandy.chung at oracle.com 
>> <mailto:mandy.chung at oracle.com>> wrote:
>>
>> Hi Igor,
>>
>> OK.  Should this revert the change by 7049122 then? i.e. simply 
>> change -DRicochetTest.MAX_ARITY=10 to 255
>>
>> Your proposed patch adds a new @run instead of modifying the existing 
>> @run command:
>>
>>   * @run junit/othervm/timeout=3600 -XX:+IgnoreUnrecognizedVMOptions 
>> -XX:-VerifyDependencies -DRicochetTest.MAX_ARITY=10 
>> test.java.lang.invoke.RicochetTest
>>
>> I looked at the history and this @run was modified by JDK-7197210 
>> that adds -XX:+IgnoreUnrecognizedVMOptions -XX:-VerifyDependencies 
>> options and reduce MAX_ARITY from 50 to 10.
>>
>> This issue is not critical to target for 15.  It may worth 
>> considering target this test fix for 16.  Just a suggestion.
>>
>> Mandy
>>
>> On 7/20/20 10:13 AM, Igor Ignatyev wrote:
>>> Hi Mandy,
>>>
>>> that's actually the opposite, the 2nd subtest is run only in modes 
>>> other than Xcomp, as w/ Xcomp the test creates lots of adapters and 
>>> used to lead to JVM failure as described in 7049122. I tried to 
>>> reproduce this failure, but in vain,..  after a bit more historical 
>>> digging, I realized that the underlying problem was 7009641, which 
>>> has been fixed in hs25/jdk8. so I've changed the fix for 8249697 to 
>>> simply return run w/ '-DRicochetTest.MAX_ARITY=255': 
>>> http://cr.openjdk.java.net/~iignatyev//8249697/webrev.02
>>>
>>> I've verified that the test passes w/ Xcomp and
>>>  - -XX:+TieredCompilation (c1 + c2);
>>>  - -XX:-TieredCompilation (c2-only);
>>>  - -XX:+NeverActAsServerClassMachine (emulated-client, c1-only)
>>>
>>> the test was run 100 times on {linux,windows,macos}-x64 w/ 0 failures.
>>> Thanks,
>>> -- Igor
>>>
>>>> On Jul 18, 2020, at 9:32 PM, Mandy Chung <mandy.chung at oracle.com 
>>>> <mailto:mandy.chung at oracle.com>> wrote:
>>>>
>>>>
>>>>
>>>> On 7/17/20 8:54 PM, Igor Ignatyev wrote:
>>>>> http://cr.openjdk.java.net/~iignatyev/8249697/webrev.00/
>>>>>
>>>>
>>>> I suggest to change this:
>>>>   32  * @comment The following test creates an unreasonable number 
>>>> of adapters in -Xcomp mode (7049122)
>>>>
>>>> To:
>>>>
>>>>    @bug 8249697
>>>>    @summary verify very high number of adapters in -Xcomp mode
>>>>
>>>> Otherwise, looks fine.
>>>>
>>>> Mandy
>>>>> Hi all,
>>>>>
>>>>> could you please review this small and trivial patch for java/lang/invoke/RicochetTest.java test?
>>>>> from JBS:
>>>>>> a run of java/lang/invoke/RicochetTest.java w/ MAX_ARITY=255 was removed from all configurations by JDK-7049122, yet the problem manifests itself only w/ Xcomp. as now we have @requires to filter out tests from certain configurations, the test can be updated to run MAX_ARITY=255 in all configs but Xcomp.
>>>>> the patch splits the test into two subtests, each one w/ one @run, and use @requires to exclude one w/ MAX_ARITY=255 from execution if Xcomp flag is used.
>>>>>
>>>>> JBS:https://bugs.openjdk.java.net/browse/JDK-8249697
>>>>> webrev:http://cr.openjdk.java.net/~iignatyev/8249697/webrev.00/
>>>>> testing: java/lang/invoke/RicochetTest.java on {linux,windows,macos}-x64 w/ and w/o -Xcomp; Xcomp runs, as expected, had only 1 test run
>>>>>
>>>>> Thanks,
>>>>> -- Igor
>>>>>
>>>>> JDK-7049122 :https://bugs.openjdk.java.net/browse/JDK-7049122
>>>>
>>>
>>
>


From vladimir.kozlov at oracle.com  Mon Jul 20 20:05:22 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 20 Jul 2020 13:05:22 -0700
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
In-Reply-To: <c36328fe-3a91-e27b-40a8-123c273a7b19@oracle.com>
References: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
 <8325fac5-6258-9b88-6507-5dcb0597cc17@oracle.com>
 <8522a69e-e538-2cc9-5364-887e450fc653@oracle.com>
 <46144d6d-5714-05ad-a263-01507db937cc@oracle.com>
 <7a361c29-4771-9ed4-1542-b3f68a5726f3@oracle.com>
 <6dc4c99b-1d90-09f1-60d1-fb2caf981266@oracle.com>
 <6b4e4dda-01d4-37d0-5403-a4f5481e5bf0@oracle.com>
 <f8e18f7b-bd13-5032-fb17-04fb037c88b4@oracle.com>
 <32d7fb64-75a5-7add-d496-df33cfaefabf@oracle.com>
 <f9f80bf5-a565-3820-efa1-2b949b92d764@oracle.com>
 <fa60eea1-3d7b-af3a-678d-354263e3eccc@oracle.com>
 <cd1bfaeb-96a5-73e0-c244-e58b9484ec17@oracle.com>
 <4ffa8190-d57e-a9a2-e508-0d98035a34c6@oracle.com>
 <0fa9d47a-e568-bf22-4c49-74c926ae9f14@oracle.com>
 <16aead29-6788-a7e8-bf6e-ae2b56fdb9dc@oracle.com>
 <2805861f-4760-c768-9b1e-55cd6af1cde1@oracle.com>
 <3d5fc552-d3e2-494b-e921-c65967af8207@oracle.com>
 <f1d233cc-8243-37e8-1592-90e55574953b@oracle.com>
 <c36328fe-3a91-e27b-40a8-123c273a7b19@oracle.com>
Message-ID: <104285e4-811a-5314-54de-d6461320a76c@oracle.com>

Good.

Thanks,
Vladimir

On 7/20/20 12:22 PM, Jamsheed C M wrote:
> Hi Vladimir,
> 
> Added both the tests
> 
> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.04/
> 
> Best Regards,
> 
> Jamsheed
> 
> On 20/07/2020 23:31, Vladimir Kozlov wrote:
>> I asked to have 2 different test methods to reproduce 2 cases separately.
>> You can't mix them.
>>
>> Regards,
>> Vladimir
>>
>>
>> On 7/20/20 6:48 AM, Jamsheed C M wrote:
>>> Hi Tobias,
>>> On 20/07/2020 19:05, Tobias Hartmann wrote:
>>>> Hi Jamsheed,
>>>>
>>>> On 20.07.20 15:30, Jamsheed C M wrote:
>>>>> Revised webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.03/
>>>> You don't need #ifdef ASSERT in escape.cpp:2252. Otherwise looks good to me! No new webrev required.
>>>
>>> Missed removing it. Thank you for the review.
>>>
>>> Best regards,
>>>
>>> Jamsheed
>>>
>>>>
>>>> Best regards,
>>>> Tobias

From vladimir.kozlov at oracle.com  Mon Jul 20 20:20:26 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 20 Jul 2020 13:20:26 -0700
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
 <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
Message-ID: <5e298ff3-6dc1-c4fa-4545-1fc26d7379b5@oracle.com>

Hi David,

Changes look good.

On 7/20/20 10:07 AM, Daniel D. Daugherty wrote:
 > src/hotspot/share/jvmci/jvmciCompilerToVM.cpp
 >      L1021:   HandleMark hm;
 > Can this be 'hm(THREAD)'? (Not your problem, but while you're in that file?)

There are several cases like this in jvmciCompilerToVM.cpp and may be in other places.
I think it should be done as separate clean up.

Thanks,
Vladimir

On 7/19/20 10:53 PM, David Holmes wrote:
> Hi Kim,
> 
> Thanks for looking at this.
> 
> Updated webrev at:
> 
> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/
> 
> On 20/07/2020 3:22 pm, Kim Barrett wrote:
>>> On Jul 20, 2020, at 12:16 AM, David Holmes <david.holmes at oracle.com> wrote:
>>>
>>> Subject line got truncated by accident ...
>>>
>>> On 20/07/2020 11:06 am, David Holmes wrote:
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8249650
>>>> webrev: http://cr.openjdk.java.net/~dholmes/8249650/webrev/
>>>> This is a simple cleanup that touches files across a number of VM areas - hence the cross-post.
>>>> Whilst working on a different JNI fix I noticed that in most cases in jni.cpp we were using the following form of 
>>>> make_local:
>>>> JNIHandles::make_local(env, obj);
>>>> and what that form does is first extract the thread from the JNIEnv:
>>>> JavaThread* thread = JavaThread::thread_from_jni_environment(env);
>>>> return thread->active_handles()->allocate_handle(obj);
>>>> but there is also another, faster, variant for when you already have the "thread":
>>>> jobject JNIHandles::make_local(Thread* thread, oop obj) {
>>>>    return thread->active_handles()->allocate_handle(obj);
>>>> }
>>>> When you look at the JNI_ENTRY wrapper (and related JVM_ENTRY, WB_ENTRY, UNSAFE_ENTRY etc) it has already extracted 
>>>> the thread from the JNIEnv:
>>>>      JavaThread* thread=JavaThread::thread_from_jni_environment(env);
>>>> and further defined:
>>>>      Thread* THREAD = thread;
>>>> so we always already have direct access to the "thread" available (or indirect via TRAPS), and in fact we can end up 
>>>> removing the make_local(JNIEnv* env, oop obj) variant altogether.
>>>> Along the way I spotted some related issues with unnecessary use of Thread::current() when it is already available 
>>>> from TRAPS, and some other cases where we extracted the JNIEnv from a thread only to later extract the thread from 
>>>> the JNIEnv.
>>>> Testing: tiers 1 - 3
>>>> Thanks,
>>>> David
>>>> -----
>>
>> ------------------------------------------------------------------------------
>> src/hotspot/share/classfile/javaClasses.cpp
>>   439     JNIEnv *env = thread->jni_environment();
>>
>> Since env is no longer used on the next line, move this down to where
>> it is used, at line 444.
> 
> Fixed.
> 
>> ------------------------------------------------------------------------------
>> src/hotspot/share/classfile/verifier.cpp
>>   299   JNIEnv *env = thread->jni_environment();
>>
>> env now seems to only be used at line 320.  Move this closer.
> 
> Fixed.
> 
>> ------------------------------------------------------------------------------
>> src/hotspot/share/prims/jni.cpp
>>   743     result = JNIHandles::make_local(THREAD, result_handle());
>>
>> jni_PopLocalFrame is now using a mix of "thread" and "THREAD", where
>> previously it just used "thread". Maybe this change shouldn't be made?
>> Or can the other uses be changed to THREAD for consistency?
> 
> "thread" and "THREAD" are interchangeable for anything expecting a "Thread*" (and somewhat surprisingly a number of 
> API's that only work for JavaThreads actually take a Thread*. :( ). I had choice between trying to be file-wide 
> consistent with the make_local calls, versus local-code consistent, and used THREAD as it is available in both JNI_ENTRY 
> and via TRAPS. But I can certainly make a local change to "thread" for local consistency.
> 
>> ------------------------------------------------------------------------------
>> src/hotspot/share/prims/jvm.cpp
>>
>> The calls to JvmtiExport::post_vm_object_alloc have to use "thread"
>> instead of "THREAD", even though other places nearby are using
>> "THREAD".  That inconsistency is kind of unfortunate, but doesn't seem
>> easily avoidable.
> 
> Everything that uses THREAD in a JVM_ENTRY method can be changed to use "thread" instead. But I'm not sure it's a 
> consistency worth pursuing at least as part of these changes (there are likely similar issues with most of the touched 
> files).
> 
> Thanks,
> David
> 
>> ------------------------------------------------------------------------------
>>
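
The make_local cleanup described in the quoted RFR can be illustrated with a small standalone model. This is only a sketch under stated assumptions: ToyThread, ToyEnv, ToyHandleBlock and toy_entry are hypothetical stand-ins invented for the illustration, handles are plain indices, and the env-to-thread lookup uses a back-pointer, which is not how HotSpot does it. Only the call shapes (make_local, active_handles, allocate_handle, thread_from_jni_environment) follow the names used in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins, NOT the real HotSpot classes.
struct ToyThread;

struct ToyEnv {
  ToyThread* owner;  // assumption: simple back-pointer to the owning thread
};

struct ToyHandleBlock {
  std::vector<int> slots;
  // Stores obj and returns a 1-based handle (0 is reserved for "no handle").
  std::size_t allocate_handle(int obj) {
    slots.push_back(obj);
    return slots.size();
  }
};

struct ToyThread {
  ToyEnv env;
  ToyHandleBlock handles;
  int env_lookups;  // how often the thread was recovered from an env

  ToyThread() : env{this}, handles(), env_lookups(0) {}
  ToyThread(const ToyThread&) = delete;
  ToyThread& operator=(const ToyThread&) = delete;

  ToyEnv* jni_environment() { return &env; }
  ToyHandleBlock* active_handles() { return &handles; }

  static ToyThread* thread_from_jni_environment(ToyEnv* e) {
    assert(e != nullptr && e->owner != nullptr);
    e->owner->env_lookups++;
    return e->owner;
  }
};

// Variant 1: takes the env and has to find the thread first.
std::size_t make_local(ToyEnv* env, int obj) {
  ToyThread* thread = ToyThread::thread_from_jni_environment(env);
  return thread->active_handles()->allocate_handle(obj);
}

// Variant 2: the caller already has the thread, so no lookup is needed.
std::size_t make_local(ToyThread* thread, int obj) {
  return thread->active_handles()->allocate_handle(obj);
}

// Mimics an entry wrapper that has already extracted the thread once
// and then passes it on instead of the env.
std::size_t toy_entry(ToyEnv* env, int obj) {
  ToyThread* thread = ToyThread::thread_from_jni_environment(env);
  ToyThread* THREAD = thread;
  return make_local(THREAD, obj);
}
```

In this model, toy_entry performs a single env-to-thread lookup, while calling the env variant of make_local from inside such an entry would repeat a lookup the wrapper has already done. That redundancy is what the RFR removes.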

From vladimir.kozlov at oracle.com  Mon Jul 20 22:37:11 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 20 Jul 2020 15:37:11 -0700
Subject: [16] RFR(S) 8249672: Include microcode revision in
 features_string on x86
In-Reply-To: <BYAPR11MB37826BC619E8ECC8BF62C711A77B0@BYAPR11MB3782.namprd11.prod.outlook.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
 <29dd9cde-48c8-915f-fa28-26312c7af17a@oracle.com>
 <d6d5d0a8-c990-c74f-ab8a-ef0a8e9a17d0@oracle.com>
 <BYAPR11MB378279AB52DD8560F661DA03A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUxq5BinzYfOF6bmDO1OxxLexPnaoYJfPVeC4f1j05AEig@mail.gmail.com>
 <BYAPR11MB37828BD1DA9857660415F50EA77D0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUxWzQ6bsxv08CGTfEN_qpj5cXz00eVcJeb1fiqOGe0UoA@mail.gmail.com>
 <BYAPR11MB37826BC619E8ECC8BF62C711A77B0@BYAPR11MB3782.namprd11.prod.outlook.com>
Message-ID: <d1d2cc32-6e80-e76e-0431-9d87c665c6c4@oracle.com>

Looks good.

Passed my tier1 testing.

Thanks,
Vladimir

On 7/20/20 10:12 AM, Ivanov, Vladimir A wrote:
> Hi,
> The updated patch is available at http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.03/
> It uses "fgets" instead of "getline" so that a local buffer is used.
> The tier1 tests passed on the release and fastdebug builds on Linux and fastdebug builds on MacOS systems.
> Testing results are the same for patched and non-patched builds.
> 
> Thanks, Vladimir
> 
> From: Thomas Stüfe <thomas.stuefe at gmail.com>
> Sent: Friday, July 17, 2020 10:25 PM
> To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>
> Cc: Vladimir Kozlov <vladimir.kozlov at oracle.com>; Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: [16] RFR(S) 8249672: Include microcode revision in features_string on x86
> 
> Oh, sorry, you are right :(
> 
> I was under the assumption you wanted to call os::cpu_microcode_revision() directly from within VMError::report(). During initialization using c-heap like this should not be a problem and you can forget about 9/10ths of what I wrote, sorry.
> 
> In that case your original variant is fine, my only suggestion would be to clearly mark the free as ::free() with a comment to prevent someone from correcting it to os::free.
> 
> Thank you,
> 
> Thomas
> 
> 
> 
> On Sat, Jul 18, 2020 at 7:08 AM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
> Hi,
> It seems this info is created during the initialization phase. Is that correct? Collecting or parsing common info at the crash point is usually not a good idea. During initialization, use of the c-heap is not a problem.
> "::free" works OK here. At least the tier1 tests produce the same results for patched and non-patched builds. But these tests do not generate a real case for hs_err files.
> It looks like a 2k byte array is enough for one CPU record from the cpuinfo file. I will update the code to use a local buffer.
> 
> Thanks, Vladimir
> 
> From: Thomas Stüfe <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>>
> Sent: Friday, July 17, 2020 9:42 PM
> To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>>
> Cc: Vladimir Kozlov <vladimir.kozlov at oracle.com<mailto:vladimir.kozlov at oracle.com>>; Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net<mailto:hotspot-runtime-dev at openjdk.java.net>>; hotspot-compiler-dev at openjdk.java.net<mailto:hotspot-compiler-dev at openjdk.java.net>
> Subject: Re: [16] RFR(S) 8249672: Include microcode revision in features_string on x86
> 
> Hi,
> 
> yes, you must use the raw free here (for the same reason we cannot pass in an os::malloc() allocated buffer to getline, since if it were to resize it would use raw ::realloc() internally and crash the same way).
> 
> But as I wrote in my first mail to the original thread, I would not use c-heap memory at all, since this function is used during crash reporting in the signal handler and the c-heap may be corrupted.
> 
> If the max line length of /proc/cpuinfo can be reliably predicted (so that getline won't realloc()) I would pass a stack allocated buffer into getline. If not, I would not use getline() at all but rewrite this, probably using fgets().
> 
> Cheers, Thomas
> 
> 
> 
> 
> On Sat, Jul 18, 2020 at 1:24 AM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
> Thanks, I expected the C's functions here. Let's wait a little bit for Runtime team and update work with buffer.
> 
>   Thanks, Vladimir
> 
> -----Original Message-----
> From: Vladimir Kozlov <vladimir.kozlov at oracle.com<mailto:vladimir.kozlov at oracle.com>>
> Sent: Friday, July 17, 2020 4:17 PM
> To: Thomas Stüfe <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>>; Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>>
> Cc: Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net<mailto:hotspot-runtime-dev at openjdk.java.net>>; hotspot-compiler-dev at openjdk.java.net<mailto:hotspot-compiler-dev at openjdk.java.net>
> Subject: Re: [16] RFR(S) 8249672: Include microcode revision in features_string on x86
> 
> I think the issue is that the 'line' buffer is allocated by libc getline(), and os::free(), which is a HotSpot function [1], does not know about it. You need C's ::free() or use HS's os::malloc() to allocate the 'line' buffer.
> 
> Someone from Runtime may suggest what is the best for this case.
> 
> Thanks,
> Vladimir K
> 
> [1] http://hg.openjdk.java.net/jdk/jdk/file/14f465f62984/src/hotspot/share/runtime/os.cpp#l792
> 
> On 7/17/20 4:03 PM, Vladimir Kozlov wrote:
>> I updated subject to our formal review request format (JDK version, RFE's id and subject).
>>
>> I moved RFE to runtime group as Thomas said:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8249672
>>
>> Submitted tier1 testing to build on all our supported platforms. And debug builds on linux failed:
>>
>> #  SIGSEGV (0xb) at pc=0x0000146fc6af4b0b, pid=9715, tid=9718
>> # V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
>>
>> V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
>> V  [libjvm.so+0x13c898a]  verify_memory(void*)+0x26a
>> V  [libjvm.so+0x13cd30b]  os::free(void*)+0x5b
>> V  [libjvm.so+0x13e5598]  os::cpu_microcode_revision()+0xc8
>> V  [libjvm.so+0x17d314c]  VM_Version::get_processor_features()+0x76c
>> V  [libjvm.so+0x17d6ead]  VM_Version::initialize()+0x10d
>> V  [libjvm.so+0x17ce6c6]  VM_Version_init()+0x26
>> V  [libjvm.so+0xcb2895]  init_globals()+0x55
>> V  [libjvm.so+0x16dde63]  Threads::create_vm(JavaVMInitArgs*, bool*)+0x2d3
>>
>>
>> Regards,
>> Vladimir K
>>
>> On 7/17/20 3:02 PM, Thomas Stüfe wrote:
>>> Hi Vladimir,
>>>
>>> On Fri, Jul 17, 2020 at 11:57 PM Ivanov, Vladimir A <
>>> vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
>>>
>>>>>    +#if defined(IA32) || defined(AMD64)
>>>>>
>>>>> Is that not synonymous with x86?
>>>>
>>>> This pattern was copied from the method "print_model_name_and_flags"
>>>> (file os/linux/os_linux.cpp).
>>>>
>>>> This method also reads the "/proc/cpuinfo" file and I reused it as a
>>>> "template" for the new method.
>>>>
>>>> It is better to use one pattern to work with exactly same file but
>>>> in general you are right.
>>>>
>>>> The X86 is defined in the file ./share/utilities/macros.hpp as:
>>>>
>>>> #if defined(IA32) || defined(AMD64)
>>>>
>>>> #define X86
>>>>
>>>> #define X86_ONLY(code) code
>>>>
>>>> #define NOT_X86(code)
>>>>
>>>>
>>>>
>>>> The question here: could I delete these "ifdefs", given that this method
>>>> should work on x86 only?
>>>>
>>>>
>>>>
>>>
>>> os_linux_x86.cpp is compiled for x86 platforms only, whereas
>>> os_linux.cpp is shared among all architectures.
>>>
>>> So, in the former you do not need to exclude non-x86 architectures.
>>>
>>> Cheers, Thomas
>>>
>>>
>>>> Thanks, Vladimir
>>>>
>>>>
>>>>
>>>> *From:* Thomas Stüfe <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>>
>>>> *Sent:* Friday, July 17, 2020 2:26 PM
>>>> *To:* Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>>; Hotspot dev
>>>> runtime <hotspot-runtime-dev at openjdk.java.net<mailto:hotspot-runtime-dev at openjdk.java.net>>
>>>> *Cc:* hotspot-compiler-dev at openjdk.java.net<mailto:hotspot-compiler-dev at openjdk.java.net>
>>>> *Subject:* Re: add microcode version to the hs_err files
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe
>>>> <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>>
>>>> wrote:
>>>>
>>>> Hi Vladimir,
>>>>
>>>>
>>>>
>>>> I think this would be more suited to hotspot-runtime.
>>>>
>>>>
>>>>
>>>>
>>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>>>> src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html
>>>>
>>>>
>>>>
>>>> +#if defined(IA32) || defined(AMD64)
>>>>
>>>> Is that not synonymous with x86?
>>>>
>>>>
>>>>
>>>> +    while ((read = getline(&line, &len, fp)) != -1) {
>>>> +      if (len > 10 && strstr(line, "microcode") != NULL) {
>>>> +        char* rev = strchr(line, ':');
>>>> +        if (rev != NULL) sscanf(rev + 1, "%x", &result);
>>>> +        break;
>>>> +      }
>>>> +    }
>>>> +    free(line);
>>>>
>>>>
>>>>
>>>> Not sure this works as intended. At the first call to getline() it
>>>> will allocate a line buffer for you and return it. That buffer will
>>>> be as large as the first line you happen to read. You then pass that
>>>> same buffer into getline to fetch the next lines, but what if those
>>>> are longer than the first?
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Forget that point, getline calls realloc() on the line buffer to
>>>> resize it, so this should be okay.
>>>>
>>>>
>>>>
>>>> Thanks, Thomas
>>>>
>>>>
>>>>
>>>> But anyway it would be better to pass a simple caller provided
>>>> buffer in - stack allocated. Since this function is called at crash
>>>> time and the C heap could be corrupted.
>>>>
>>>>
>>>>
>>>> Cheers, Thomas
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A <
>>>> vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
>>>>
>>>> Hello,
>>>>
>>>> could you please review the patch
>>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>>>>
>>>> This patch adds the microcode version for different OSes, which may be
>>>> useful in the issue resolution process.
>>>>
>>>>
>>>>
>>>> The reported microcode version for different OSes looks like this:
>>>>
>>>>
>>>>
>>>> Linux (RHEL7.7):
>>>>
>>>> # cat hs_err_pid251046.log |grep microc
>>>>
>>>> CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads per
>>>> core) family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8,
>>>> fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt,
>>>> vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht,
>>>> tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt, clwb
>>>>
>>>>
>>>>
>>>> Windows (Win10, v1809):
>>>>
>>>> CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per
>>>> core) family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, fxsr,
>>>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper,
>>>> avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc,
>>>> tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt
>>>>
>>>>
>>>>
>>>> MacOS (Darwin):
>>>>
>>>> $ cat hs_err_pid95187.log |grep microc
>>>>
>>>> CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per
>>>> core) family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, fxsr,
>>>> mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, vzeroupper,
>>>> avx, avx2, aes, clmul, erms, 3dnowpref, lzcnt, ht, tsc, tscinvbit,
>>>> bmi1, bmi2, adx, sha, fma, clflush, clflushopt
>>>>
>>>>
>>>>
>>>> Thanks, Vladimir
>>>>
>>>>
>>>>     Thanks, Vladimir
>>>>
>>>>
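
The fgets() approach suggested in the thread (read /proc/cpuinfo line by line into a caller-provided stack buffer, so that nothing is allocated on the C heap) could look roughly like the sketch below. It is not the code from webrev.03: the function name parse_microcode_revision and the choice to pass an already opened stream are assumptions made for the illustration. The 2048-byte buffer follows the "2k byte array" mentioned in the discussion.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical helper (not the actual HotSpot function): scans a
// cpuinfo-style stream for the first line containing "microcode" and
// parses the hexadecimal value after the ':'. Returns 0 if no such
// line is found or the value cannot be parsed.
unsigned int parse_microcode_revision(std::FILE* fp) {
  assert(fp != NULL);
  char line[2048];          // stack buffer, so no malloc()/free() is involved
  unsigned int result = 0;
  // fgets() never writes past the buffer; an overlong line is simply
  // delivered in several pieces, unlike getline(), which would realloc().
  while (std::fgets(line, sizeof(line), fp) != NULL) {
    if (std::strstr(line, "microcode") != NULL) {
      const char* rev = std::strchr(line, ':');
      if (rev != NULL) {
        std::sscanf(rev + 1, "%x", &result);  // %x accepts an optional 0x prefix
      }
      break;                // the first matching line is enough
    }
  }
  return result;
}
```

A caller would open the file with fopen("/proc/cpuinfo", "r"), pass the stream in, and fclose() it afterwards. Because the buffer lives on the stack, the question of pairing ::free() with os::free() does not arise in this variant.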

From ningsheng.jian at arm.com  Tue Jul 21 06:05:48 2020
From: ningsheng.jian at arm.com (Ningsheng Jian)
Date: Tue, 21 Jul 2020 14:05:48 +0800
Subject: [aarch64-port-dev ] RFR(L): 8231441: AArch64: Initial SVE backend
 support
In-Reply-To: <42fca25d-7172-b4f3-335b-92e2b05e8195@arm.com>
References: <bce2556a-b709-d6d1-ffa2-655faaf56787@arm.com>
 <42fca25d-7172-b4f3-335b-92e2b05e8195@arm.com>
Message-ID: <707df21c-849d-ac9d-0ab2-61a30d1354f9@arm.com>

[Ping]

Could anyone please help to review this patch, especially for the c2 
register allocation part?

JBS: https://bugs.openjdk.java.net/browse/JDK-8231441

The latest webrev:
http://cr.openjdk.java.net/~njian/8231441/webrev.02

In the latest webrev, we block one predicate register (p7) with all 
elements preset to TRUE, so that c2 compiled code can use it freely to 
generate instructions for unpredicated operations.

And the split parts:

1) SVE feature detection:
http://cr.openjdk.java.net/~njian/8231441/webrev.02-feature

2) c2 register allocation:
http://cr.openjdk.java.net/~njian/8231441/webrev.02-ra

3) SVE c2 backend:
http://cr.openjdk.java.net/~njian/8231441/webrev.02-c2

The initial RFR which has some descriptions of the patch:
http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-March/037628.html

The description can also be found at:
http://cr.openjdk.java.net/~njian/8231441/README-RFR.txt

Notes to verify the patch on QEMU user emulation, with an example of 
compiled code:
http://cr.openjdk.java.net/~njian/8231441/running-sve-in-qemu-user.txt

Thanks,
Ningsheng


On 5/27/20 3:23 PM, Ningsheng Jian wrote:
> Hi,
> 
> I have rebased this patch with some more comments added. And also 
> relaxed the instruction matching conditions for 128-bit vector.
> 
> I would appreciate if someone could help to review this.
> 
> Whole patch:
> http://cr.openjdk.java.net/~njian/8231441/webrev.01
> 
> Different parts of changes:
> 
> 1) SVE feature detection
> http://cr.openjdk.java.net/~njian/8231441/webrev.01-feature
> 
> 2) c2 register allocation
> http://cr.openjdk.java.net/~njian/8231441/webrev.01-ra
> 
> 3) SVE c2 backend
> http://cr.openjdk.java.net/~njian/8231441/webrev.01-c2
> 
> (Or should I split this into different JBS?)
> 
> Thanks,
> Ningsheng
> 
> On 3/25/20 2:37 PM, Ningsheng Jian wrote:
>> Hi,
>>
>> Could you please help to review this patch adding AArch64 SVE support?
>> It also touches c2 compiler shared code.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231441
>> Webrev: http://cr.openjdk.java.net/~njian/8231441/webrev.00
>>
>> Arm has released a new vector ISA extension for AArch64, SVE [1] and
>> SVE2 [2]. This patch adds the initial SVE support in OpenJDK. In this
>> patch we have:
>>
>> 1) SVE feature enablement and detection
>> 2) SVE vector register allocation support with initial predicate
>> register definition
>> 3) SVE c2 backend for current SLP based vectorizer. (We also have a POC
>> patch of a new vectorizer using SVE predicate-driven loop control, but
>> that's still under development.)
>>
>> SVE register definition
>> =======================
>> Unlike other SIMD architectures, SVE allows hardware implementations to
>> choose a vector register length between 128 and 2048 bits, in multiples
>> of 128 bits. So we introduce a new vector type VectorA, i.e. a length
>> agnostic (scalable) vector type, and Op_VecA for the machine vectora
>> register. In the meantime, to minimize register allocation code changes,
>> we also take advantage of one aspect of JIT compilation: at compile time
>> we actually know the real hardware SVE vector register size of the
>> machine we are running on. So the register allocator knows how many
>> register slots an Op_VecA ideal reg requires, and can work without much
>> modification.
>>
>> Since the bottom 128 bits are shared with the NEON, we extend current
>> register mask definition of V0-V31 registers. Currently, c2 uses one bit
>> mask for a 32-bit register slot, so to define at most 2048 bits we will
>> need to add 64 slots in AD file. That's a really large number, and will
>> also break current regmask assumption. Considering the SVE vector
>> register is architecturally scalable for different sizes, we just define
>> double of original NEON vector register slots, i.e. 8 slots: Vx, Vx_H,
>> Vx_J ... Vx_O. After adlc, the generated register masks now look like:
>>
>> const RegMask _VECTORA_REG_mask( 0x0, 0x0, 0xffffffff, 0xffffffff,
>> 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, ...
>>
>> const RegMask _VECTORD_REG_mask( 0x0, 0x0, 0x3030303, 0x3030303,
>> 0x3030303, 0x3030303, 0x3030303, 0x3030303, ...
>>
>> const RegMask _VECTORX_REG_mask( 0x0, 0x0, 0xf0f0f0f, 0xf0f0f0f,
>> 0xf0f0f0f, 0xf0f0f0f, 0xf0f0f0f, 0xf0f0f0f, ...
>>
>> And we use SlotsPerVecA to indicate regmask bit size for a VecA register.
>>
>> For physical register allocation, the register allocator does not need
>> to know the real VecA register size. For spill/unspill, however, it
>> needs to know the actual stack slot size to store/load VecA registers.
>> SVE is able to do vector size agnostic spilling, but to minimize the
>> code changes, as mentioned before, we just let RA know the actual
>> vector register size of the current running machine, by calling
>> scalable_vector_reg_size().
>>
>> In the meantime, some vector operations, e.g. vector multiply and
>> vector load/store, have no unpredicated SVE1 instructions, only
>> predicated versions. So we have also defined predicate registers in this
>> patch, and the c2 register allocator will allocate a temp predicate
>> register to fulfill the expected unpredicated operations. This can also
>> be used for a future predicate-driven vectorizer. It is not efficient
>> for now, as we can see many ptrue instructions in the generated code. One
>> possible solution I can see is to block one predicate register and
>> preset it to all true. But preserving/reinitializing a caller-save
>> register value across calls seems too risky for this patch, so I decided
>> to defer it to further optimization work. If anyone has any suggestions
>> on this, I would appreciate them.
>>
>> SVE feature detection
>> =====================
>> Since we may have some compiled code based on the initial detected SVE
>> vector register length and the compiled code is compiled only for that
>> vector register length, we assume that the SVE vector register length
>> will not be changed during the JVM lifetime. However, SVE vector length
>> is per-thread and can be changed by system call [3], so we need to make
>> sure that each jni call will not change the sve vector length.
>>
>> Currently, we verify the SVE vector register length on each JNI return,
>> and if an SVE vector length change is detected, the JVM simply reports an
>> error and stops running. The vector length the VM runs with can also be
>> set by the existing VM option MaxVectorSize with c2 enabled. If the
>> specified MaxVectorSize differs from the system default SVE vector length
>> (in /proc/sys/abi/sve_default_vector_length), the JVM will set the current
>> process SVE vector length to the specified vector length.
>>
>> Compiled code
>> =============
>> We have added all current c2 backend codegen on par with NEON, but only
>> for vector length larger than 128-bit.
>>
>> On a 1024 bit SVE environment, for the following simple loop with int
>> array element type:
>>
>>     for (int i = 0; i < LENGTH; i++) {
>>       c[i] = a[i] + b[i];
>>     }
>>
>> c2 generated loop:
>>
>>     0x0000ffff811c0820:   sbfiz   x11, x10, #2, #32
>>     0x0000ffff811c0824:   add     x13, x18, x11
>>     0x0000ffff811c0828:   add     x14, x1, x11
>>     0x0000ffff811c082c:   add     x13, x13, #0x10
>>     0x0000ffff811c0830:   add     x14, x14, #0x10
>>     0x0000ffff811c0834:   add     x11, x0, x11
>>     0x0000ffff811c0838:   add     x11, x11, #0x10
>>     0x0000ffff811c083c:   ptrue   p1.s    // To be optimized
>>     0x0000ffff811c0840:   ld1w    {z16.s}, p1/z, [x14]
>>     0x0000ffff811c0844:   ptrue   p0.s
>>     0x0000ffff811c0848:   ld1w    {z17.s}, p0/z, [x13]
>>     0x0000ffff811c084c:   add     z16.s, z17.s, z16.s
>>     0x0000ffff811c0850:   ptrue   p1.s
>>     0x0000ffff811c0854:   st1w    {z16.s}, p1, [x11]
>>     0x0000ffff811c0858:   add     w10, w10, #0x20
>>     0x0000ffff811c085c:   cmp     w10, w12
>>     0x0000ffff811c0860:   b.lt    0x0000ffff811c0820
>>
>> Test
>> ====
>> Currently, we don't have real hardware to verify SVE features (and
>> performance). But we have run jtreg tests with SVE in some emulators. On
>> QEMU system emulator, which has SVE emulation support, jtreg tier1-3
>> passed with different vector sizes. We've also verified it with full
>> jtreg tests without SVE on both x86 and AArch64, to make sure that
>> there's no regression.
>>
>> The patch has also been applied to the Vector API code base, and verified
>> on an emulator. The Vector API has more vector related tests and is more
>> likely to generate vector instructions through intrinsification.
>>
>> A simple test can also run in QEMU user emulation, e.g.
>>
>> $ qemu-aarch64 -cpu max,sve-max-vq=2 java -XX:UseSVE=1 SIMD
>>
>> (
>> To run it in user emulation mode, we will need to bypass SVE feature
>> detection code in this patch. E.g. apply:
>> http://cr.openjdk.java.net/~njian/8231441/user-emulation.patch
>> )
>>
>> Others
>> ======
>> Since this patch is a bit large, I've also split it into 3 parts, for
>> easy review:
>>
>> 1) SVE feature detection
>> http://cr.openjdk.java.net/~njian/8231441/webrev.00-feature
>>
>> 2) c2 register allocation
>> http://cr.openjdk.java.net/~njian/8231441/webrev.00-ra
>>
>> 3) SVE c2 backend
>> http://cr.openjdk.java.net/~njian/8231441/webrev.00-c2
>>
>> Part of this patch has been contributed by Joshua Zhu and Yang Zhang.
>>
>> Refs
>> ====
>> [1] https://developer.arm.com/docs/ddi0584/latest
>> [2] https://developer.arm.com/docs/ddi0602/latest
>> [3] https://www.kernel.org/doc/Documentation/arm64/sve.txt
>>
>> Thanks,
>> Ningsheng
>>
> 
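
The regmask words quoted in the description above (0x3030303 for VecD,
0xf0f0f0f for VecX, 0xffffffff for VecA) can be reproduced with a small
sketch. It assumes 8 regmask bits (32-bit slots) per vector register, with
VecD using the low 2 slots, VecX the low 4 and VecA all 8. The names
regmask_word and slots_for_vector_bits are hypothetical and are not adlc or
HotSpot functions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not HotSpot code: number of 32-bit regmask slots
// needed to describe a vector register of the given width in bits.
inline unsigned slots_for_vector_bits(unsigned bits) {
  return bits / 32;
}

// Hypothetical helper, not HotSpot code: build one 32-bit regmask word in
// which every register owns `slots_per_reg` consecutive bits and a vector
// class may use only the lowest `slots_used` bits of each register.
inline uint32_t regmask_word(unsigned slots_used, unsigned slots_per_reg = 8) {
  assert(slots_per_reg > 0 && slots_per_reg <= 32);
  assert(slots_used <= slots_per_reg);
  uint32_t word = 0;
  for (unsigned base = 0; base + slots_per_reg <= 32; base += slots_per_reg) {
    for (unsigned s = 0; s < slots_used; ++s) {
      word |= uint32_t{1} << (base + s);
    }
  }
  return word;
}
```

With these assumptions a full 2048-bit register would need 64 slots, which
is the number the description wants to avoid, while the 8-slot layout packs
four registers into each 32-bit mask word.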


From tobias.hartmann at oracle.com  Tue Jul 21 06:17:31 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 21 Jul 2020 08:17:31 +0200
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
In-Reply-To: <104285e4-811a-5314-54de-d6461320a76c@oracle.com>
References: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
 <8522a69e-e538-2cc9-5364-887e450fc653@oracle.com>
 <46144d6d-5714-05ad-a263-01507db937cc@oracle.com>
 <7a361c29-4771-9ed4-1542-b3f68a5726f3@oracle.com>
 <6dc4c99b-1d90-09f1-60d1-fb2caf981266@oracle.com>
 <6b4e4dda-01d4-37d0-5403-a4f5481e5bf0@oracle.com>
 <f8e18f7b-bd13-5032-fb17-04fb037c88b4@oracle.com>
 <32d7fb64-75a5-7add-d496-df33cfaefabf@oracle.com>
 <f9f80bf5-a565-3820-efa1-2b949b92d764@oracle.com>
 <fa60eea1-3d7b-af3a-678d-354263e3eccc@oracle.com>
 <cd1bfaeb-96a5-73e0-c244-e58b9484ec17@oracle.com>
 <4ffa8190-d57e-a9a2-e508-0d98035a34c6@oracle.com>
 <0fa9d47a-e568-bf22-4c49-74c926ae9f14@oracle.com>
 <16aead29-6788-a7e8-bf6e-ae2b56fdb9dc@oracle.com>
 <2805861f-4760-c768-9b1e-55cd6af1cde1@oracle.com>
 <3d5fc552-d3e2-494b-e921-c65967af8207@oracle.com>
 <f1d233cc-8243-37e8-1592-90e55574953b@oracle.com>
 <c36328fe-3a91-e27b-40a8-123c273a7b19@oracle.com>
 <104285e4-811a-5314-54de-d6461320a76c@oracle.com>
Message-ID: <36de481a-d4b1-d9cf-3632-db1f82c5baba@oracle.com>

+1

Best regards,
Tobias

On 20.07.20 22:05, Vladimir Kozlov wrote:
> Good.
> 
> Thanks,
> Vladimir
> 
> On 7/20/20 12:22 PM, Jamsheed C M wrote:
>> Hi Vladimir,
>>
>> Added both the tests
>>
>> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.04/
>>
>> Best Regards,
>>
>> Jamsheed
>>
>> On 20/07/2020 23:31, Vladimir Kozlov wrote:
>>> I asked to have 2 different test methods to reproduce 2 cases separately.
>>> You can't mix them.
>>>
>>> Regards,
>>> Vladimir
>>>
>>>
>>> On 7/20/20 6:48 AM, Jamsheed C M wrote:
>>>> Hi Tobias,
>>>> On 20/07/2020 19:05, Tobias Hartmann wrote:
>>>>> Hi Jamsheed,
>>>>>
>>>>> On 20.07.20 15:30, Jamsheed C M wrote:
>>>>>> Revised webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.03/
>>>>> You don't need #ifdef ASSERT in escape.cpp:2252. Otherwise looks good to me! No new webrev
>>>>> required.
>>>>
>>>> Missed removing it. Thank you for the review.
>>>>
>>>> Best regards,
>>>>
>>>> Jamsheed
>>>>
>>>>>
>>>>> Best regards,
>>>>> Tobias

From jamsheed.c.m at oracle.com  Tue Jul 21 06:40:32 2020
From: jamsheed.c.m at oracle.com (Jamsheed C M)
Date: Tue, 21 Jul 2020 12:10:32 +0530
Subject: [15] RFR: 8242895: failed: sanity at
 src/hotspot/share/opto/escape.cpp:2361
In-Reply-To: <36de481a-d4b1-d9cf-3632-db1f82c5baba@oracle.com>
References: <4a389db7-ebce-e2b8-4691-2ce6625e2709@oracle.com>
 <46144d6d-5714-05ad-a263-01507db937cc@oracle.com>
 <7a361c29-4771-9ed4-1542-b3f68a5726f3@oracle.com>
 <6dc4c99b-1d90-09f1-60d1-fb2caf981266@oracle.com>
 <6b4e4dda-01d4-37d0-5403-a4f5481e5bf0@oracle.com>
 <f8e18f7b-bd13-5032-fb17-04fb037c88b4@oracle.com>
 <32d7fb64-75a5-7add-d496-df33cfaefabf@oracle.com>
 <f9f80bf5-a565-3820-efa1-2b949b92d764@oracle.com>
 <fa60eea1-3d7b-af3a-678d-354263e3eccc@oracle.com>
 <cd1bfaeb-96a5-73e0-c244-e58b9484ec17@oracle.com>
 <4ffa8190-d57e-a9a2-e508-0d98035a34c6@oracle.com>
 <0fa9d47a-e568-bf22-4c49-74c926ae9f14@oracle.com>
 <16aead29-6788-a7e8-bf6e-ae2b56fdb9dc@oracle.com>
 <2805861f-4760-c768-9b1e-55cd6af1cde1@oracle.com>
 <3d5fc552-d3e2-494b-e921-c65967af8207@oracle.com>
 <f1d233cc-8243-37e8-1592-90e55574953b@oracle.com>
 <c36328fe-3a91-e27b-40a8-123c273a7b19@oracle.com>
 <104285e4-811a-5314-54de-d6461320a76c@oracle.com>
 <36de481a-d4b1-d9cf-3632-db1f82c5baba@oracle.com>
Message-ID: <6baf448a-f44c-cd78-d7d0-121589a3c9cf@oracle.com>

Thank you for the reviews.

Initiated the Fix Request for JDK15, testing links are added in JBS.

Best regards,

Jamsheed

On 21/07/2020 11:47, Tobias Hartmann wrote:
> +1
>
> Best regards,
> Tobias
>
> On 20.07.20 22:05, Vladimir Kozlov wrote:
>> Good.
>>
>> Thanks,
>> Vladimir
>>
>> On 7/20/20 12:22 PM, Jamsheed C M wrote:
>>> Hi Vladimir,
>>>
>>> Added both the tests
>>>
>>> http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.04/
>>>
>>> Best Regards,
>>>
>>> Jamsheed
>>>
>>> On 20/07/2020 23:31, Vladimir Kozlov wrote:
>>>> I asked to have 2 different test methods to reproduce 2 cases separately.
>>>> You can't mix them.
>>>>
>>>> Regards,
>>>> Vladimir
>>>>
>>>>
>>>> On 7/20/20 6:48 AM, Jamsheed C M wrote:
>>>>> Hi Tobias,
>>>>> On 20/07/2020 19:05, Tobias Hartmann wrote:
>>>>>> Hi Jamsheed,
>>>>>>
>>>>>> On 20.07.20 15:30, Jamsheed C M wrote:
>>>>>>> Revised webrev: http://cr.openjdk.java.net/~jcm/8242895/webrev_fix_EA.03/
>>>>>> You don't need #ifdef ASSERT in escape.cpp:2252. Otherwise looks good to me! No new webrev
>>>>>> required.
>>>>> Missed removing it. Thank you for the review.
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Jamsheed
>>>>>
>>>>>> Best regards,
>>>>>> Tobias

From sandhya.viswanathan at intel.com  Tue Jul 21 16:28:54 2020
From: sandhya.viswanathan at intel.com (Viswanathan, Sandhya)
Date: Tue, 21 Jul 2020 16:28:54 +0000
Subject: [16] RFR(S) 8249672: Include microcode revision in
 features_string on x86
In-Reply-To: <d1d2cc32-6e80-e76e-0431-9d87c665c6c4@oracle.com>
References: <BYAPR11MB3782B346ECA7097DC8B09E63A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUx_vkBfhapOJp9w5si3bJboKe8Q1=Msji4TUQua=VO5oA@mail.gmail.com>
 <CAA-vtUxUBx4EEC98TWF=bSq9c9=SFMOO9Sq3dZ0qD+YdzQPmrA@mail.gmail.com>
 <BYAPR11MB378254CCE31566E91CBBFE09A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUyQxJ5+B-AQat9W=G2v6omuNfrXE4gfh8SMW=ntQ=e8sg@mail.gmail.com>
 <29dd9cde-48c8-915f-fa28-26312c7af17a@oracle.com>
 <d6d5d0a8-c990-c74f-ab8a-ef0a8e9a17d0@oracle.com>
 <BYAPR11MB378279AB52DD8560F661DA03A77C0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUxq5BinzYfOF6bmDO1OxxLexPnaoYJfPVeC4f1j05AEig@mail.gmail.com>
 <BYAPR11MB37828BD1DA9857660415F50EA77D0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <CAA-vtUxWzQ6bsxv08CGTfEN_qpj5cXz00eVcJeb1fiqOGe0UoA@mail.gmail.com>
 <BYAPR11MB37826BC619E8ECC8BF62C711A77B0@BYAPR11MB3782.namprd11.prod.outlook.com>
 <d1d2cc32-6e80-e76e-0431-9d87c665c6c4@oracle.com>
Message-ID: <BYAPR11MB35432A7DA631DDFD29E58B5EEF780@BYAPR11MB3543.namprd11.prod.outlook.com>

Hi VladimirK,

Please let me know if I can push this onto jdk/jdk.

Best Regards,
Sandhya


-----Original Message-----
From: hotspot-compiler-dev <hotspot-compiler-dev-retn at openjdk.java.net> On Behalf Of Vladimir Kozlov
Sent: Monday, July 20, 2020 3:37 PM
To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>
Cc: Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: [16] RFR(S) 8249672: Include microcode revision in features_string on x86

Looks good.

Passed my tier1 testing.

Thanks,
Vladimir

On 7/20/20 10:12 AM, Ivanov, Vladimir A wrote:
> Hi,
> The updated patch is available at
> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.03/
> It uses "fgets" instead of "getline" so that a local buffer is used.
> The tier1 tests passed on the release and fastdebug builds on Linux and fastdebug builds on MacOS systems.
> Testing results are the same for patched and non-patched builds.
> 
> Thanks, Vladimir
> 
> From: Thomas Stüfe <thomas.stuefe at gmail.com>
> Sent: Friday, July 17, 2020 10:25 PM
> To: Ivanov, Vladimir A <vladimir.a.ivanov at intel.com>
> Cc: Vladimir Kozlov <vladimir.kozlov at oracle.com>; Hotspot dev runtime 
> <hotspot-runtime-dev at openjdk.java.net>; 
> hotspot-compiler-dev at openjdk.java.net
> Subject: Re: [16] RFR(S) 8249672: Include microcode revision in 
> features_string on x86
> 
> Oh, sorry, you are right :(
> 
> I was under the assumption you wanted to call os::cpu_microcode_revision() directly from within VMError::report(). During initialization using c-heap like this should not be a problem and you can forget about 9/10ths of what I wrote, sorry.
> 
> In that case your original variant is fine, my only suggestion would be to clearly mark the free as ::free() with a comment to prevent someone from correcting it to os::free.
> 
> Thank you,
> 
> Thomas
> 
> 
> 
> On Sat, Jul 18, 2020 at 7:08 AM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
> Hi,
> it seems this info is created during the initialization phase. Is that correct? Collecting or parsing common info at the crash point is usually not a good idea. During initialization, use of the c-heap is not a problem.
> "::free" works OK here. At least the tier1 tests produce the same results for patched and non-patched builds. But these tests do not generate a real case for hs_err files.
> It looks like a 2k byte array is enough for one CPU record from the cpuinfo file. Will update the code to use a local buffer.
> 
> Thanks, Vladimir
> 
> From: Thomas Stüfe 
> <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>>
> Sent: Friday, July 17, 2020 9:42 PM
> To: Ivanov, Vladimir A 
> <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>>
> Cc: Vladimir Kozlov 
> <vladimir.kozlov at oracle.com<mailto:vladimir.kozlov at oracle.com>>; 
> Hotspot dev runtime 
> <hotspot-runtime-dev at openjdk.java.net<mailto:hotspot-runtime-dev at openj
> dk.java.net>>; 
> hotspot-compiler-dev at openjdk.java.net<mailto:hotspot-compiler-dev at open
> jdk.java.net>
> Subject: Re: [16] RFR(S) 8249672: Include microcode revision in 
> features_string on x86
> 
> Hi,
> 
> yes, you must use the raw free here (for the same reason we cannot pass in an os::malloc() allocated buffer to getline, since if it were to resize it would use raw ::realloc() internally and crash the same way).
> 
> But as I wrote in my first mail to the original thread, I would not use c-heap memory at all, since this function is used during crash reporting in the signal handler and the c-heap may be corrupted.
> 
> If the max line length of /proc/cpuinfo can be reliably predicted (so that getline won't realloc()) I would pass a stack allocated buffer into getline. If not, I would not use getline() at all but rewrite this, probably using fgets().
> 
> Cheers, Thomas
> 
> 
> 
> 
> On Sat, Jul 18, 2020 at 1:24 AM Ivanov, Vladimir A <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
> Thanks, I expected the C's functions here. Let's wait a little bit for Runtime team and update work with buffer.
> 
>   Thanks, Vladimir
> 
> -----Original Message-----
> From: Vladimir Kozlov 
> <vladimir.kozlov at oracle.com<mailto:vladimir.kozlov at oracle.com>>
> Sent: Friday, July 17, 2020 4:17 PM
> To: Thomas Stüfe 
> <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>>; Ivanov, 
> Vladimir A 
> <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>>
> Cc: Hotspot dev runtime 
> <hotspot-runtime-dev at openjdk.java.net<mailto:hotspot-runtime-dev at openj
> dk.java.net>>; 
> hotspot-compiler-dev at openjdk.java.net<mailto:hotspot-compiler-dev at open
> jdk.java.net>
> Subject: Re: [16] RFR(S) 8249672: Include microcode revision in 
> features_string on x86
> 
> I think the issue is that the 'line' buffer is allocated by libc getline(), and os::free(), which is a HotSpot function [1], does not know about it. You need C's ::free(), or use HS's os::malloc() to allocate the 'line' buffer.
> 
> Someone from Runtime may suggest what is the best for this case.
> 
> Thanks,
> Vladimir K
> 
> [1] 
> http://hg.openjdk.java.net/jdk/jdk/file/14f465f62984/src/hotspot/share
> /runtime/os.cpp#l792
> 
> On 7/17/20 4:03 PM, Vladimir Kozlov wrote:
>> I updated subject to our formal review request format (JDK version, RFE's id and subject).
>>
>> I moved RFE to runtime group as Thomas said:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8249672
>>
>> Submitted tier1 testing to build on all our supported platforms. And debug builds on linux failed:
>>
>> #  SIGSEGV (0xb) at pc=0x0000146fc6af4b0b, pid=9715, tid=9718
>> # V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
>>
>> V  [libjvm.so+0xc12b0b]  GuardedMemory::print_on(outputStream*) const+0xeb
>> V  [libjvm.so+0x13c898a]  verify_memory(void*)+0x26a
>> V  [libjvm.so+0x13cd30b]  os::free(void*)+0x5b
>> V  [libjvm.so+0x13e5598]  os::cpu_microcode_revision()+0xc8
>> V  [libjvm.so+0x17d314c]  VM_Version::get_processor_features()+0x76c
>> V  [libjvm.so+0x17d6ead]  VM_Version::initialize()+0x10d
>> V  [libjvm.so+0x17ce6c6]  VM_Version_init()+0x26
>> V  [libjvm.so+0xcb2895]  init_globals()+0x55
>> V  [libjvm.so+0x16dde63]  Threads::create_vm(JavaVMInitArgs*, bool*)+0x2d3
>>
>>
>> Regards,
>> Vladimir K
>>
>> On 7/17/20 3:02 PM, Thomas Stüfe wrote:
>>> Hi Vladimir,
>>>
>>> On Fri, Jul 17, 2020 at 11:57 PM Ivanov, Vladimir A < 
>>> vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
>>>
>>>>>    +#if defined(IA32) || defined(AMD64)
>>>>>
>>>>> Is that not synonymous with x86?
>>>>
>>>> This pattern was copied from the method "print_model_name_and_flags"
>>>> (file os/linux/os_linux.cpp).
>>>>
>>>> This method also reads the "/proc/cpuinfo" file and I reused it as a
>>>> "template" for the new method.
>>>>
>>>> It is better to use one pattern to work with exactly the same file, but
>>>> in general you are right.
>>>>
>>>> The X86 is defined in the file ./share/utilities/macros.hpp as:
>>>>
>>>> #if defined(IA32) || defined(AMD64)
>>>>
>>>> #define X86
>>>>
>>>> #define X86_ONLY(code) code
>>>>
>>>> #define NOT_X86(code)
>>>>
>>>>
>>>>
>>>> The question here: could I delete these "ifdefs", since this method
>>>> should work on x86 only?
>>>>
>>>>
>>>>
>>>
>>> os_linux_x86.cpp is compiled for x86 platforms only, whereas 
>>> os_linux.cpp is shared among all architectures.
>>>
>>> So, in the former you do not need to exclude non-x86 architectures.
>>>
>>> Cheers, Thomas
>>>
>>>
>>>> Thanks, Vladimir
>>>>
>>>>
>>>>
>>>> *From:* Thomas Stüfe 
>>>> <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>>
>>>> *Sent:* Friday, July 17, 2020 2:26 PM
>>>> *To:* Ivanov, Vladimir A 
>>>> <vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>>; 
>>>> Hotspot dev runtime 
>>>> <hotspot-runtime-dev at openjdk.java.net<mailto:hotspot-runtime-dev at op
>>>> enjdk.java.net>>
>>>> *Cc:* 
>>>> hotspot-compiler-dev at openjdk.java.net<mailto:hotspot-compiler-dev at o
>>>> penjdk.java.net>
>>>> *Subject:* Re: add microcode version to the hs_err files
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Jul 17, 2020 at 11:19 PM Thomas Stüfe 
>>>> <thomas.stuefe at gmail.com<mailto:thomas.stuefe at gmail.com>>
>>>> wrote:
>>>>
>>>> Hi Vladimir,
>>>>
>>>>
>>>>
>>>> I think this would be more suited to hotspot-runtime.
>>>>
>>>>
>>>>
>>>>
>>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp.udiff.html
>>>>
>>>>
>>>>
>>>> +#if defined(IA32) || defined(AMD64)
>>>>
>>>> Is that not synonymous with x86?
>>>>
>>>>
>>>>
>>>> +    while ((read = getline(&line, &len, fp)) != -1) {
>>>> +      if (len > 10 && strstr(line, "microcode") != NULL) {
>>>> +        char* rev = strchr(line, ':');
>>>> +        if (rev != NULL) sscanf(rev + 1, "%x", &result);
>>>> +        break;
>>>> +      }
>>>> +    }
>>>> +    free(line);
>>>>
>>>>
>>>>
>>>> Not sure this works as intended. At the first call to getline() it 
>>>> will allocate a line buffer for you and return it. That buffer will 
>>>> be as large as the first line you happen to read. You then pass 
>>>> that same buffer into getline to fetch the next lines, but what if 
>>>> those are longer than the first?
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Forget that point, getline calls realloc() on the line buffer to 
>>>> resize it, so this should be okay.
>>>>
>>>>
>>>>
>>>> Thanks, Thomas
>>>>
>>>>
>>>>
>>>> But anyway it would be better to pass a simple caller provided 
>>>> buffer in - stack allocated. Since this function is called at crash 
>>>> time and the C heap could be corrupted.
>>>>
>>>>
>>>>
>>>> Cheers, Thomas
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Jul 17, 2020 at 10:22 PM Ivanov, Vladimir A < 
>>>> vladimir.a.ivanov at intel.com<mailto:vladimir.a.ivanov at intel.com>> wrote:
>>>>
>>>> Hello,
>>>>
>>>> could you please review the patch
>>>> http://cr.openjdk.java.net/~sviswanathan/Vladimir/8249672/webrev.00/
>>>>
>>>> This patch adds the microcode version on different OSes, which may be
>>>> useful in the issue resolution process.
>>>>
>>>>
>>>>
>>>> The reported microcode version on different OSes looks like this:
>>>>
>>>>
>>>>
>>>> Linux (RHEL7.7):
>>>>
>>>> # cat hs_err_pid251046.log |grep microc
>>>>
>>>> CPU: total 112 (initial active 112) (28 cores per cpu, 2 threads 
>>>> per
>>>> core) family 6 model 85 stepping 4 microcode 0x200005e, cmov, cx8, 
>>>> fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, 
>>>> vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, 
>>>> tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt, clwb
>>>>
>>>>
>>>>
>>>> Windows (Win10, v1809):
>>>>
>>>> CPU: total 4 (initial active 4) (2 cores per cpu, 2 threads per
>>>> core) family 6 model 142 stepping 9 microcode 0xb4, cmov, cx8, 
>>>> fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, 
>>>> vzeroupper, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, 
>>>> tsc, tscinvbit, bmi1, bmi2, adx, fma, clflush, clflushopt
>>>>
>>>>
>>>>
>>>> MacOS (Darwin):
>>>>
>>>> $ cat hs_err_pid95187.log |grep microc
>>>>
>>>> CPU: total 8 (initial active 8) (4 cores per cpu, 2 threads per
>>>> core) family 6 model 126 stepping 5 microcode 0x78, cmov, cx8, 
>>>> fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, 
>>>> vzeroupper, avx, avx2, aes, clmul, erms, 3dnowpref, lzcnt, ht, tsc, 
>>>> tscinvbit, bmi1, bmi2, adx, sha, fma, clflush, clflushopt
>>>>
>>>>
>>>>
>>>> Thanks, Vladimir
>>>>
>>>>
>>>>
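
A minimal sketch of the fgets()-based variant discussed in this thread,
assuming a cpuinfo-style text stream where the revision appears on a line
such as "microcode : 0x200005e". The function name read_microcode_revision
is hypothetical and this is not the code from the webrev. The 2048-byte
buffer follows the "2k byte array" estimate given above. The point of the
sketch is only that the line buffer lives on the stack, so there is nothing
for getline() to realloc() and nothing to release with ::free().

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical helper, not the webrev code: scan a cpuinfo-style stream
// for the first line containing "microcode" and parse the hex value that
// follows the ':' separator. Returns 0 if no such line is found.
inline unsigned int read_microcode_revision(FILE* fp) {
  assert(fp != nullptr);
  char line[2048];  // stack buffer; size taken from the estimate in the thread
  unsigned int result = 0;
  while (fgets(line, static_cast<int>(sizeof(line)), fp) != nullptr) {
    if (strstr(line, "microcode") != nullptr) {
      const char* colon = strchr(line, ':');
      // "%x" skips leading whitespace and accepts an optional 0x prefix.
      if (colon != nullptr) sscanf(colon + 1, "%x", &result);
      break;
    }
  }
  return result;
}
```

A caller would open /proc/cpuinfo with fopen(), pass the stream in, and
fclose() it afterwards. Lines longer than the buffer are simply read in
pieces by fgets(), which is acceptable here as long as the "microcode"
line itself fits.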

From coleen.phillimore at oracle.com  Tue Jul 21 17:57:36 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 21 Jul 2020 13:57:36 -0400
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
 <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
Message-ID: <2b52127c-8637-ed24-2a63-0b1372d4bff0@oracle.com>


One note below:

On 7/20/20 1:53 AM, David Holmes wrote:
> Hi Kim,
>
> Thanks for looking at this.
>
> Updated webrev at:
>
> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/
>
> On 20/07/2020 3:22 pm, Kim Barrett wrote:
>>> On Jul 20, 2020, at 12:16 AM, David Holmes <david.holmes at oracle.com> 
>>> wrote:
>>>
>>> Subject line got truncated by accident ...
>>>
>>> On 20/07/2020 11:06 am, David Holmes wrote:
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8249650
>>>> webrev: http://cr.openjdk.java.net/~dholmes/8249650/webrev/
>>>> This is a simple cleanup that touches files across a number of VM 
>>>> areas - hence the cross-post.
>>>> Whilst working on a different JNI fix I noticed that in most cases 
>>>> in jni.cpp we were using the following form of make_local:
>>>> JNIHandles::make_local(env, obj);
>>>> and what that form does is first extract the thread from the JNIEnv:
>>>> JavaThread* thread = JavaThread::thread_from_jni_environment(env);
>>>> return thread->active_handles()->allocate_handle(obj);
>>>> but there is also another, faster, variant for when you already 
>>>> have the "thread":
>>>> jobject JNIHandles::make_local(Thread* thread, oop obj) {
>>>>    return thread->active_handles()->allocate_handle(obj);
>>>> }
>>>> When you look at the JNI_ENTRY wrapper (and related JVM_ENTRY, 
>>>> WB_ENTRY, UNSAFE_ENTRY etc) it has already extracted the thread 
>>>> from the JNIEnv:
>>>>      JavaThread* thread=JavaThread::thread_from_jni_environment(env);
>>>> and further defined:
>>>>      Thread* THREAD = thread;
>>>> so we always already have direct access to the "thread" available 
>>>> (or indirect via TRAPS), and in fact we can end up removing the 
>>>> make_local(JNIEnv* env, oop obj) variant altogether.
>>>> Along the way I spotted some related issues with unnecessary use of 
>>>> Thread::current() when it is already available from TRAPS, and some 
>>>> other cases where we extracted the JNIEnv from a thread only to 
>>>> later extract the thread from the JNIEnv.
>>>> Testing: tiers 1 - 3
>>>> Thanks,
>>>> David
>>>> -----
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/classfile/javaClasses.cpp
>>   439     JNIEnv *env = thread->jni_environment();
>>
>> Since env is no longer used on the next line, move this down to where
>> it is used, at line 444.
>
> Fixed.
>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/classfile/verifier.cpp
>>   299   JNIEnv *env = thread->jni_environment();
>>
>> env now seems to only be used at line 320.  Move this closer.
>
> Fixed.
>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/prims/jni.cpp
>>   743     result = JNIHandles::make_local(THREAD, result_handle());
>>
>> jni_PopLocalFrame is now using a mix of "thread" and "THREAD", where
>> previously it just used "thread". Maybe this change shouldn't be made?
>> Or can the other uses be changed to THREAD for consistency?
>
> "thread" and "THREAD" are interchangeable for anything expecting a 
> "Thread*" (and somewhat surprisingly a number of API's that only work 
> for JavaThreads actually take a Thread*. :( ). I had a choice between 
> trying to be file-wide consistent with the make_local calls, versus 
> local-code consistent, and used THREAD as it is available in both 
> JNI_ENTRY and via TRAPS. But I can certainly make a local change to 
> "thread" for local consistency.
>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/prims/jvm.cpp
>>
>> The calls to JvmtiExport::post_vm_object_alloc have to use "thread"
>> instead of "THREAD", even though other places nearby are using
>> "THREAD".  That inconsistency is kind of unfortunate, but doesn't seem
>> easily avoidable.
>
> Everything that uses THREAD in a JVM_ENTRY method can be changed to 
> use "thread" instead. But I'm not sure it's a consistency worth 
> pursuing at least as part of these changes (there are likely similar 
> issues with most of the touched files).

The thing I like about THREAD if it's available is that it's assumed to 
be *always* the current thread, so I have to wonder no further. Also, 
"thread" is generally the current thread too, but if you have a choice, 
my preference would be to use THREAD.

I wouldn't want to see this changed.

Thanks,
Coleen
>
> Thanks,
> David
>
>> ------------------------------------------------------------------------------ 
>>
>>


From coleen.phillimore at oracle.com  Tue Jul 21 18:01:36 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 21 Jul 2020 14:01:36 -0400
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
 <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
Message-ID: <82ac807a-1492-9ac0-570a-d08b1dc93e09@oracle.com>


This looks like a nice cleanup.

http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/src/hotspot/share/runtime/jniHandles.cpp.udiff.html

I'm wondering why you took out the NULL return for make_local() without 
a thread argument?  Here you may call Thread::current() unnecessarily.

  jobject JNIHandles::make_local(oop obj) {
-   if (obj == NULL) {
-     return NULL; // ignore null handles
-   } else {
-     Thread* thread = Thread::current();
-     assert(oopDesc::is_oop(obj), "not an oop");
-     assert(!current_thread_in_native(), "must not be in native");
-     return thread->active_handles()->allocate_handle(obj);
-   }
+   return make_local(Thread::current(), obj);
  }
  

Beyond the scope of this fix, but it'd be cool to not have a version 
that doesn't take thread, since there may be many more callers that 
already have Thread::current().

Coleen


On 7/20/20 1:53 AM, David Holmes wrote:
> Hi Kim,
>
> Thanks for looking at this.
>
> Updated webrev at:
>
> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/
>
> On 20/07/2020 3:22 pm, Kim Barrett wrote:
>>> On Jul 20, 2020, at 12:16 AM, David Holmes <david.holmes at oracle.com> 
>>> wrote:
>>>
>>> Subject line got truncated by accident ...
>>>
>>> On 20/07/2020 11:06 am, David Holmes wrote:
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8249650
>>>> webrev: http://cr.openjdk.java.net/~dholmes/8249650/webrev/
>>>> This is a simple cleanup that touches files across a number of VM 
>>>> areas - hence the cross-post.
>>>> Whilst working on a different JNI fix I noticed that in most cases 
>>>> in jni.cpp we were using the following form of make_local:
>>>> JNIHandles::make_local(env, obj);
>>>> and what that form does is first extract the thread from the JNIEnv:
>>>> JavaThread* thread = JavaThread::thread_from_jni_environment(env);
>>>> return thread->active_handles()->allocate_handle(obj);
>>>> but there is also another, faster, variant for when you already 
>>>> have the "thread":
>>>> jobject JNIHandles::make_local(Thread* thread, oop obj) {
>>>>    return thread->active_handles()->allocate_handle(obj);
>>>> }
>>>> When you look at the JNI_ENTRY wrapper (and related JVM_ENTRY, 
>>>> WB_ENTRY, UNSAFE_ENTRY etc) it has already extracted the thread 
>>>> from the JNIEnv:
>>>>      JavaThread* thread=JavaThread::thread_from_jni_environment(env);
>>>> and further defined:
>>>>      Thread* THREAD = thread;
>>>> so we always already have direct access to the "thread" available 
>>>> (or indirect via TRAPS), and in fact we can end up removing the 
>>>> make_local(JNIEnv* env, oop obj) variant altogether.
>>>> Along the way I spotted some related issues with unnecessary use of 
>>>> Thread::current() when it is already available from TRAPS, and some 
>>>> other cases where we extracted the JNIEnv from a thread only to 
>>>> later extract the thread from the JNIEnv.
>>>> Testing: tiers 1 - 3
>>>> Thanks,
>>>> David
>>>> -----
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/classfile/javaClasses.cpp
>>   439     JNIEnv *env = thread->jni_environment();
>>
>> Since env is no longer used on the next line, move this down to where
>> it is used, at line 444.
>
> Fixed.
>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/classfile/verifier.cpp
>>   299   JNIEnv *env = thread->jni_environment();
>>
>> env now seems to only be used at line 320.  Move this closer.
>
> Fixed.
>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/prims/jni.cpp
>>   743     result = JNIHandles::make_local(THREAD, result_handle());
>>
>> jni_PopLocalFrame is now using a mix of "thread" and "THREAD", where
>> previously it just used "thread". Maybe this change shouldn't be made?
>> Or can the other uses be changed to THREAD for consistency?
>
> "thread" and "THREAD" are interchangeable for anything expecting a 
> "Thread*" (and somewhat surprisingly a number of API's that only work 
> for JavaThreads actually take a Thread*. :( ). I had choice between 
> trying to be file-wide consistent with the make_local calls, versus 
> local-code consistent, and used THREAD as it is available in both 
> JNI_ENTRY and via TRAPS. But I can certainly make a local change to 
> "thread" for local consistency.
>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/prims/jvm.cpp
>>
>> The calls to JvmtiExport::post_vm_object_alloc have to use "thread"
>> instead of "THREAD", even though other places nearby are using
>> "THREAD".  That inconsistency is kind of unfortunate, but doesn't seem
>> easily avoidable.
>
> Everything that uses THREAD in a JVM_ENTRY method can be changed to 
> use "thread" instead. But I'm not sure it's a consistency worth 
> pursuing at least as part of these changes (there are likely similar 
> issues with most of the touched files).
>
> Thanks,
> David
>
>> ------------------------------------------------------------------------------ 
>>
>>


From serguei.spitsyn at oracle.com  Tue Jul 21 19:25:31 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 21 Jul 2020 12:25:31 -0700
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
 <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
Message-ID: <1256c311-76cf-2d59-2e12-c79516728d34@oracle.com>

Hi David,

The fix looks good to me.

Thanks,
Serguei




From david.holmes at oracle.com  Wed Jul 22 02:34:02 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 22 Jul 2020 12:34:02 +1000
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <1256c311-76cf-2d59-2e12-c79516728d34@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
 <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
 <1256c311-76cf-2d59-2e12-c79516728d34@oracle.com>
Message-ID: <63ff96e0-bcba-5041-0844-fb55b4fbfc1f@oracle.com>

Thanks Serguei!

David

On 22/07/2020 5:25 am, serguei.spitsyn at oracle.com wrote:
> Hi David,
> 
> The fix looks good to me.
> 
> Thanks,
> Serguei
> 
> 
> 
> 

From david.holmes at oracle.com  Wed Jul 22 02:46:26 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 22 Jul 2020 12:46:26 +1000
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <328fb322-5b14-968b-7b13-4b449a8d98fd@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
 <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
 <328fb322-5b14-968b-7b13-4b449a8d98fd@oracle.com>
Message-ID: <4d763c6f-96e1-5c9b-8739-a441ee3b4b31@oracle.com>

Hi Dan,

On 21/07/2020 3:07 am, Daniel D. Daugherty wrote:
> On 7/20/20 1:53 AM, David Holmes wrote:
>> Hi Kim,
>>
>> Thanks for looking at this.
>>
>> Updated webrev at:
>>
>> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/
> 
> I like this cleanup very much!

Thanks for looking at it.

> 
> src/hotspot/share/classfile/javaClasses.cpp
>      No comments.
> 
> src/hotspot/share/classfile/verifier.cpp
>      L298:   JavaThread* thread = (JavaThread*)THREAD;
>      L307:   ResourceMark rm(THREAD);
>          Since we've gone to the trouble of creating the 'thread' variable,
>          I would prefer it to be used instead of THREAD where possible.

Okay I made this change as we already use "thread" throughout that method.

> src/hotspot/share/jvmci/jvmciCompilerToVM.cpp
>      L1021:   HandleMark hm;
>          Can this be 'hm(THREAD)'? (Not your problem, but while you're
>          in that file?)

It probably could but there are around 8 such uses and I don't want to 
expand this change any further than necessary for the current issue. I 
filed a general RFE for things that should take advantage of having a 
current thread reference already (that will encompass Coleen's 
make_local(obj) change as well).

https://bugs.openjdk.java.net/browse/JDK-8249837

> src/hotspot/share/prims/jni.cpp
>      No comments.
> 
> src/hotspot/share/prims/jvm.cpp
>      L140:   ResourceMark rm;
>          Can this be 'rm(THREAD)'? (Not your problem, but while you're
>          in that file?)
> 
>      L611:   Handle stackStream_h(THREAD, JNIHandles::resolve_non_null(stackStream));
>      L617:   objArrayHandle frames_array_h(THREAD, fa);
>      L626:   return JNIHandles::make_local(THREAD, result);
>          Since we've gone to the trouble of creating the 'jt' variable,
>          I would prefer it to be used instead of THREAD where possible.
> 
>      L767:   vframeStream vfst(thread);
>      L788         return (jclass) JNIHandles::make_local(THREAD, m->method_holder()->java_mirror());
>          Can we use 'thread' on L788? (preferred)
>          Can we use 'THREAD' on L767? (less preferred)
> 
>      L949:   ResourceMark rm(THREAD);
>      L951:   Handle class_loader (THREAD, JNIHandles::resolve(loader));
>      L955:                            THREAD);
>      L957:   Handle protection_domain (THREAD, JNIHandles::resolve(pd));
>      L968:   return (jclass) JNIHandles::make_local(THREAD, k->java_mirror());
>          Since we've gone to the trouble of creating the 'jt' variable,
>          I would prefer it to be used instead of THREAD where possible.

As per our slack chat, and the fact you are okay with things as-is, I 
will forego a more general "consistency" pass as it is unclear what is 
best here. As Coleen notes THREAD is generally understood to always be 
the current thread, while thread/jthread/jt could be any old thread in 
general. Also THREAD usage can highlight a Thread* API, while "thread" 
has to be used for JavaThread* API - but obviously that needs to be 
carefully and consistently applied to be useful. :)
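
To make the THREAD versus "thread" distinction concrete, here is a minimal sketch of what an entry wrapper provides. It is not the real JNI_ENTRY/JVM_ENTRY macro: ENTRY_PROLOGUE, Env, thread_from_env, takes_thread, takes_java_thread and sample_entry are hypothetical stand-ins, and the two structs only mimic the Thread/JavaThread relationship.

```cpp
#include <cassert>

// Hypothetical stand-ins for the HotSpot thread hierarchy.
struct Thread { virtual ~Thread() {} };
struct JavaThread : Thread { int java_only = 42; };

// Hypothetical stand-in for a JNIEnv that knows its owning thread.
struct Env { JavaThread* owner; };

JavaThread* thread_from_env(Env* env) { return env->owner; }

// Sketch of an entry wrapper: it defines both names for the same object,
// "thread" typed as JavaThread* and THREAD typed as Thread*.
#define ENTRY_PROLOGUE(env)                   \
  JavaThread* thread = thread_from_env(env);  \
  Thread* THREAD = thread;

// An API declared on the base type accepts either name.
bool takes_thread(Thread* t) { return t != nullptr; }

// An API declared on the derived type accepts only "thread".
int takes_java_thread(JavaThread* jt) { return jt->java_only; }

int sample_entry(Env* env) {
  ENTRY_PROLOGUE(env)
  bool ok = takes_thread(THREAD) && takes_thread(thread);
  // takes_java_thread(THREAD) would not compile: Thread* does not
  // implicitly convert to JavaThread*.
  return ok ? takes_java_thread(thread) : -1;
}
```

In this sketch the choice of name at a call site also tells the reader which static type the callee expects, which is the only sense in which THREAD "highlights" a Thread* API.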

>      L986:   JavaThread* jt = (JavaThread*) THREAD;
>          This 'jt' is unused and can be deleted (Not your problem, but while you're
>          in that file?)

Fixed (and another case elsewhere).

>      L1154:   while (*p != '\0') {
>      L1155:       if (*p == '.') {
>      L1156:           *p = '/';
>      L1157:       }
>      L1158:       p++;
>          Nit - the indents are wrong on L1155-58. (Not your problem, but while you're
>          in that file?)

Fixed

>      L1389:   ResourceMark rm(THREAD);
>      L1446:     return JNIHandles::make_local(THREAD, result);
>      L1460:   return JNIHandles::make_local(THREAD, result);
>          Can we use 'thread' on L1389? (preferred) And then the line you
>          touched could also be 'thread' and we'll be consistent in this
>          function...

Left as-is.

>      L3287:   oop jthread = thread->threadObj();
>      L3288:   assert (thread != NULL, "no current thread!");
>          I think the assert is wrong. It should be:
> 
>              assert(jthread != NULL, "no current thread!");
> 
>          If 'thread == NULL', then we would have crashed at L3287.
>          Also notice that I deleted the extra ' ' before '('. (Not
>          your problem, but while you're in that file?)

Fixed. I was initially concerned about bootstrapping but it is fine - we 
ensure we set threadObj() before executing any Java code.
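
A small sketch of why the assert belongs on the result rather than on the pointer that was just dereferenced. MockThread, ThreadObj and current_thread_object are hypothetical names for illustration, not the jvm.cpp code.

```cpp
#include <cassert>

// Hypothetical stand-ins: a thread that may or may not have its
// Java-level thread object set yet.
struct ThreadObj {};
struct MockThread {
  ThreadObj* obj;
  ThreadObj* threadObj() { return obj; }
};

ThreadObj* current_thread_object(MockThread* thread) {
  // The dereference happens here, so a null 'thread' would already
  // have failed before any later check on 'thread' could run.
  ThreadObj* jthread = thread->threadObj();
  // The value that can still be unexpectedly null is the result.
  assert(jthread != nullptr && "no current thread!");
  return jthread;
}
```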

>      L3289:   return JNIHandles::make_local(THREAD, jthread);
>          Can you use 'thread' instead of 'THREAD' here for consistency?
> 
>      L3681:     method_handle = Handle(THREAD, JNIHandles::resolve(method));
>      L3682:     Handle receiver(THREAD, JNIHandles::resolve(obj));
>      L3683:     objArrayHandle args(THREAD, objArrayOop(JNIHandles::resolve(args0)));
>      L3685:     jobject res = JNIHandles::make_local(THREAD, result);
>          Can you use 'thread' instead of 'THREAD' here for consistency?
> 
>      L3705:   objArrayHandle args(THREAD, objArrayOop(JNIHandles::resolve(args0)));
>      L3707   jobject res = JNIHandles::make_local(THREAD, result);
>          Can you use 'thread' instead of 'THREAD' here for consistency?

Left as-is.

> src/hotspot/share/prims/methodHandles.cpp
>      No comments.
> 
> src/hotspot/share/prims/methodHandles.hpp
>      No comments.
> 
> src/hotspot/share/prims/unsafe.cpp
>      No comments.
> 
> src/hotspot/share/prims/whitebox.cpp
>      No comments.
> 
> src/hotspot/share/runtime/jniHandles.cpp
>      No comments.
> 
> src/hotspot/share/runtime/jniHandles.hpp
>      No comments.
> 
> src/hotspot/share/services/management.cpp
>      No comments.
> 
> 
> None of my comments above are "must do". If you choose to make the
> changes, a new webrev isn't required, but would be useful for a
> sanity check.

In addition to the tweak above I found a bunch of make_local(obj) 
usages in jvm.cpp and jni.cpp thanks to Coleen, which I have also fixed. 
Updated webrev:

http://cr.openjdk.java.net/~dholmes/8249650/webrev.v3/

If this passes tier 1-3 re-testing then I plan to push.

Thanks,
David
-----

> Thumbs up.
> 
> Dan
> 
> 
>>
> 

From david.holmes at oracle.com  Wed Jul 22 02:46:56 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 22 Jul 2020 12:46:56 +1000
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <82ac807a-1492-9ac0-570a-d08b1dc93e09@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
 <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
 <82ac807a-1492-9ac0-570a-d08b1dc93e09@oracle.com>
Message-ID: <4ca86ddb-8a73-783c-0b3f-e8003f7160a3@oracle.com>

Hi Coleen,

On 22/07/2020 4:01 am, coleen.phillimore at oracle.com wrote:
> 
> This looks like a nice cleanup.

Thanks for looking at this.

> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/src/hotspot/share/runtime/jniHandles.cpp.udiff.html
> 
> I'm wondering why you took out the NULL return for make_local() without 
> a thread argument?  Here you may call Thread::current() unnecessarily.
> 
>   jobject JNIHandles::make_local(oop obj) {
> -   if (obj == NULL) {
> -     return NULL; // ignore null handles
> -   } else {
> -     Thread* thread = Thread::current();
> -     assert(oopDesc::is_oop(obj), "not an oop");
> -     assert(!current_thread_in_native(), "must not be in native");
> -     return thread->active_handles()->allocate_handle(obj);
> -   }
> +   return make_local(Thread::current(), obj);
>   }

I was simply using a standard call forwarding pattern to avoid code 
duplication. I suspect passing NULL is very rare so the unnecessary 
Thread::current() call is not an issue. Otherwise, if not NULL, the NULL 
check would happen twice (unless I keep the duplicated implementations).
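
The trade-off can be sketched roughly as follows. This is a mock, not the jniHandles.cpp code: Obj, MockThread, current_thread and the counters are hypothetical, and it assumes the thread-taking variant is the one that owns the null check.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for an oop and a local handle.
struct Obj { int id; };
typedef Obj* oop_t;
typedef const Obj* handle_t;

// Hypothetical thread with a trivial handle allocator.
struct MockThread {
  int handles_allocated = 0;
  handle_t allocate_handle(oop_t obj) { ++handles_allocated; return obj; }
};

static int current_calls = 0;  // counts "current thread" lookups
static MockThread the_thread;

MockThread* current_thread() { ++current_calls; return &the_thread; }

// Thread-taking variant: the single place where null is filtered out.
handle_t make_local(MockThread* thread, oop_t obj) {
  if (obj == nullptr) return nullptr;  // ignore null handles
  return thread->allocate_handle(obj);
}

// Forwarding variant: no duplicated body and no second null check, at
// the cost of one thread lookup even when obj turns out to be null.
handle_t make_local(oop_t obj) {
  return make_local(current_thread(), obj);
}
```

With a null argument the forwarding variant still performs the lookup and then returns null without allocating; with a non-null argument the null check runs exactly once.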

> Beyond the scope of this fix, but it'd be cool to not have a version 
> that doesn't take thread, since there may be many more callers that 
> already have Thread::current().

Indeed! And in fact I had missed a number of these in jvm.cpp and 
jni.cpp so I have fixed those. I've filed a RFE for other cases:

https://bugs.openjdk.java.net/browse/JDK-8249837

Updated webrev:

http://cr.openjdk.java.net/~dholmes/8249650/webrev.v3/

If this passes tier 1-3 re-testing then I plan to push.

Thanks,
David
-----

> Coleen
> 
> 
> On 7/20/20 1:53 AM, David Holmes wrote:
>> Hi Kim,
>>
>> Thanks for looking at this.
>>
>> Updated webrev at:
>>
>> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/
>>
>> On 20/07/2020 3:22 pm, Kim Barrett wrote:
>>>> On Jul 20, 2020, at 12:16 AM, David Holmes <david.holmes at oracle.com> 
>>>> wrote:
>>>>
>>>> Subject line got truncated by accident ...
>>>>
>>>> On 20/07/2020 11:06 am, David Holmes wrote:
>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8249650
>>>>> webrev: http://cr.openjdk.java.net/~dholmes/8249650/webrev/
>>>>> This is a simple cleanup that touches files across a number of VM 
>>>>> areas - hence the cross-post.
>>>>> Whilst working on a different JNI fix I noticed that in most cases 
>>>>> in jni.cpp we were using the following form of make_local:
>>>>> JNIHandles::make_local(env, obj);
>>>>> and what that form does is first extract the thread from the JNIEnv:
>>>>> JavaThread* thread = JavaThread::thread_from_jni_environment(env);
>>>>> return thread->active_handles()->allocate_handle(obj);
>>>>> but there is also another, faster, variant for when you already 
>>>>> have the "thread":
>>>>> jobject JNIHandles::make_local(Thread* thread, oop obj) {
>>>>> ?? return thread->active_handles()->allocate_handle(obj);
>>>>> }
>>>>> When you look at the JNI_ENTRY wrapper (and related JVM_ENTRY, 
>>>>> WB_ENTRY, UNSAFE_ENTRY etc) it has already extracted the thread 
>>>>> from the JNIEnv:
>>>>> ???? JavaThread* thread=JavaThread::thread_from_jni_environment(env);
>>>>> and further defined:
>>>>> ???? Thread* THREAD = thread;
>>>>> so we always already have direct access to the "thread" available 
>>>>> (or indirect via TRAPS), and in fact we can end up removing the 
>>>>> make_local(JNIEnv* env, oop obj) variant altogether.
>>>>> Along the way I spotted some related issues with unnecessary use of 
>>>>> Thread::current() when it is already available from TRAPS, and some 
>>>>> other cases where we extracted the JNIEnv from a thread only to 
>>>>> later extract the thread from the JNIEnv.
>>>>> Testing: tiers 1 - 3
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>
>>> ------------------------------------------------------------------------------ 
>>>
>>> src/hotspot/share/classfile/javaClasses.cpp
>>> ? 439???? JNIEnv *env = thread->jni_environment();
>>>
>>> Since env is no longer used on the next line, move this down to where
>>> it is used, at line 444.
>>
>> Fixed.
>>
>>> ------------------------------------------------------------------------------ 
>>>
>>> src/hotspot/share/classfile/verifier.cpp
>>>   299   JNIEnv *env = thread->jni_environment();
>>>
>>> env now seems to only be used at line 320.  Move this closer.
>>
>> Fixed.
>>
>>> ------------------------------------------------------------------------------ 
>>>
>>> src/hotspot/share/prims/jni.cpp
>>>   743     result = JNIHandles::make_local(THREAD, result_handle());
>>>
>>> jni_PopLocalFrame is now using a mix of "thread" and "THREAD", where
>>> previously it just used "thread". Maybe this change shouldn't be made?
>>> Or can the other uses be changed to THREAD for consistency?
>>
>> "thread" and "THREAD" are interchangeable for anything expecting a 
>> "Thread*" (and somewhat surprisingly a number of APIs that only work 
>> for JavaThreads actually take a Thread*. :( ). I had a choice between 
>> trying to be file-wide consistent with the make_local calls, versus 
>> local-code consistent, and used THREAD as it is available in both 
>> JNI_ENTRY and via TRAPS. But I can certainly make a local change to 
>> "thread" for local consistency.
>>
>>> ------------------------------------------------------------------------------ 
>>>
>>> src/hotspot/share/prims/jvm.cpp
>>>
>>> The calls to JvmtiExport::post_vm_object_alloc have to use "thread"
>>> instead of "THREAD", even though other places nearby are using
>>> "THREAD".  That inconsistency is kind of unfortunate, but doesn't seem
>>> easily avoidable.
>>
>> Everything that uses THREAD in a JVM_ENTRY method can be changed to 
>> use "thread" instead. But I'm not sure it's a consistency worth 
>> pursuing at least as part of these changes (there are likely similar 
>> issues with most of the touched files).
>>
>> Thanks,
>> David
>>
>>> ------------------------------------------------------------------------------ 
>>>
>>>
> 
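
The redundancy David describes in the original RFR above can be illustrated with a small standalone sketch. It is not HotSpot code: JavaThread, JNIEnv, HandleBlock, oop and jobject here are heavily simplified, hypothetical stand-ins, and the back pointer inside JNIEnv is an assumption made only to keep the example short. It shows that the JNIEnv-based make_local merely recovers the thread and then does the same work as the thread-based variant.

```cpp
#include <cassert>
#include <deque>

// Hypothetical, simplified stand-ins for the HotSpot types named in the mail.
typedef const void* oop;   // an "object" is just an opaque pointer here
typedef oop* jobject;      // a local handle is a pointer to a handle slot

struct JavaThread;

struct JNIEnv {
  JavaThread* owner;       // assumption: back pointer instead of offset arithmetic
};

struct HandleBlock {
  std::deque<oop> slots;   // deque keeps earlier slot addresses stable on growth
  jobject allocate_handle(oop obj) {
    slots.push_back(obj);
    return &slots.back();
  }
};

struct JavaThread {
  JNIEnv env;
  HandleBlock handles;
  JavaThread() { env.owner = this; }
  JNIEnv* jni_environment() { return &env; }
  HandleBlock* active_handles() { return &handles; }
  static JavaThread* thread_from_jni_environment(JNIEnv* e) { return e->owner; }
};

// Variant taking a JNIEnv: first has to recover the thread from the env.
jobject make_local(JNIEnv* env, oop obj) {
  JavaThread* thread = JavaThread::thread_from_jni_environment(env);
  return thread->active_handles()->allocate_handle(obj);
}

// Variant taking the thread: no lookup needed.
jobject make_local(JavaThread* thread, oop obj) {
  return thread->active_handles()->allocate_handle(obj);
}
```

Inside an entry wrapper that has already extracted the thread, calling the second overload skips the lookup, which is the motivation given for removing the JNIEnv variant.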

From xxinliu at amazon.com  Wed Jul 22 07:12:40 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Wed, 22 Jul 2020 07:12:40 +0000
Subject: RFR(S): 8247732: validate user-input intrinsic_ids in
 ControlIntrinsic
In-Reply-To: <82cba5e4-2020-ce0a-4576-e8e0cc2e5ae5@oracle.com>
References: <821e3d29-c95b-aafc-8ee5-6e49a1bdde82@amazon.com>
 <9b324805-eb86-27e1-5dcb-96a823f8495b@amazon.com>,
 <82cba5e4-2020-ce0a-4576-e8e0cc2e5ae5@oracle.com>
Message-ID: <1595401959932.33284@amazon.com>

Hi Tobias,

Thank you for reviewing my patch.
I made changes according to your feedback. Here is the updated revision:
https://cr.openjdk.java.net/~xliu/8247732/01/webrev/

1. I moved the validation logic for compiler directives to compilerOracle::scan_flag_and_value.
If something goes wrong in the parser, the patch will "gracefully" exit the JVM using jvm_exit(1). Is that okay?
Here is an example:

$./build/linux-x86_64-server-release/jdk/bin/java -XX:CompileCommand=option,java.util.HashMap::putVal,ccstrlist,DisableIntrinsic,_hello -version
CompileCommand: An error occurred during parsing
Line: option,java/util/HashMap  putVal ccstrlist DisableIntrinsic _hello
Error: Unrecognized intrinsic detected in DisableIntrinsic: _hello

Usage: '-XX:CompileCommand=command,"package/Class.method()"'
Use:   '-XX:CompileCommand=help' for more information.

2. I removed Method::external_name_short().

3. Fixed the indentation issue.

Test: 
hotspot:tier1 and gtest:all

thanks,
--lx
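
The kind of check described in item 1 can be sketched as follows. This is a standalone illustration, not the code in the webrev: the function name validate_intrinsic_list, the comma-separated list format, and the three-entry set of known names are assumptions made for the example. Only the wording of the error message is taken from the output shown above.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Hypothetical subset of intrinsic names; the VM has its own, much longer list.
static const std::set<std::string>& known_intrinsics() {
  static const std::set<std::string> names = {"_hashCode", "_getClass", "_clone"};
  return names;
}

// Returns an empty string if every id in the comma-separated list is known,
// otherwise an error message naming the first unknown id.
std::string validate_intrinsic_list(const std::string& option,
                                    const std::string& list) {
  std::istringstream in(list);
  std::string id;
  while (std::getline(in, id, ',')) {
    if (id.empty()) {
      continue;  // tolerate stray separators such as "a,,b"
    }
    if (known_intrinsics().count(id) == 0) {
      return "Unrecognized intrinsic detected in " + option + ": " + id;
    }
  }
  return std::string();
}
```

Returning a message instead of asserting lets the caller decide how to report the problem and exit, which matches the "exit gracefully" request in the review.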


________________________________________
From: Tobias Hartmann <tobias.hartmann at oracle.com>
Sent: Monday, July 20, 2020 1:23 AM
To: Liu, Xin; Nils Eliasson; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev
Subject: RE: [EXTERNAL] RFR(S): 8247732: validate user-input intrinsic_ids in ControlIntrinsic


Hi,

On 08.07.20 10:26, Liu, Xin wrote:
> ControlIntrinsic/DisableIntrinsic in compiler directives are more complex. The matched directive is only parsed when hotspot attempts to compile the corresponding method.
>
> I validate at that time, and the JVM will crash if it does not meet the guarantee() statement.

I don't think a guarantee should be used here, i.e. the VM shouldn't crash but we should exit
gracefully with an error message. Isn't it possible to piggy-back on the error mechanism in
DirectivesParser?

> I added Method::external_name_short() which only returns the shorter method name in the form of  "classname::method".
>
> HotSpot probably already has similar code, but I failed to find it. Please let me know and I will remove it.

I would just use name_and_sig_as_C_string().

jvmFlagConstraintList.cpp:180/181
- Wrong indentation

jvmFlagConstraintsCompiler.cpp:388/400
- Maybe change the error message to "Unrecognized intrinsic detected in DisableIntrinsic [...]"

Best regards,
Tobias

From christian.hagedorn at oracle.com  Wed Jul 22 08:23:15 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Wed, 22 Jul 2020 10:23:15 +0200
Subject: [16] RFR(XS): 8248467: C2: compiler/intrinsics/object/TestClone
 fails with -XX:+VerifyGraphEdges
In-Reply-To: <2ac39054-e9bf-d7a8-2dcc-a954d1a94abf@oracle.com>
References: <60c17f38-6cb2-d380-252f-15f8d5151b29@oracle.com>
 <6a458143-aeee-486b-2bc5-a210779c26dc@oracle.com>
 <2ac39054-e9bf-d7a8-2dcc-a954d1a94abf@oracle.com>
Message-ID: <96a83931-cc69-42bd-43b8-71b688403920@oracle.com>

Thank you Tobias for your review!

Best regards,
Christian

On 20.07.20 10:29, Tobias Hartmann wrote:
> +1
> 
> Best regards,
> Tobias
> 
> On 15.07.20 19:26, Vladimir Kozlov wrote:
>> Good.
>>
>> Thanks,
>> Vladimir
>>
>> On 7/15/20 8:04 AM, Christian Hagedorn wrote:
>>> Hi
>>>
>>> Please review the following patch:
>>> https://bugs.openjdk.java.net/browse/JDK-8248467
>>> http://cr.openjdk.java.net/~chagedorn/8248467/webrev.00/
>>>
>>> The assertion is hit due to a MemBarNode whose precedence edge was set to NULL at [1]
>>> (result_phi_rawoop is NULL and _resproj is the precedence edge to a MemBarStoreStore). This is
>>> possible since JDK-8237581 [2] which can remove some allocations. The fix just adds this
>>> additional case in the assert.
>>>
>>> Best regards,
>>> Christian
>>>
>>>
>>> [1] http://hg.openjdk.java.net/jdk/jdk/file/4a8fd81d64ba/src/hotspot/share/opto/macro.cpp#l1566
>>> [2] https://bugs.openjdk.java.net/browse/JDK-8237581

From christian.hagedorn at oracle.com  Wed Jul 22 08:23:44 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Wed, 22 Jul 2020 10:23:44 +0200
Subject: [16] RFR(S): 8247743: Segmentation fault in debug builds due to
 stack overflow in find_recur with deep graphs
In-Reply-To: <f5267bd5-5387-13f2-eaf5-8db8efd52de9@oracle.com>
References: <9af7a44c-4267-4900-812c-12aa0c37713a@oracle.com>
 <518ffdf1-143a-06f3-9aa4-96871d72d024@oracle.com>
 <9b3a9632-c7bb-2f51-c295-72935add2670@oracle.com>
 <2f317601-4845-541d-e2ef-ad7735386f1c@oracle.com>
 <7cfafcb9-6232-5738-6cad-508127fd31e8@oracle.com>
 <53d1eebe-e85f-58cb-7fba-0baf2ecf8701@oracle.com>
 <d793324b-88ed-d86b-0655-76443460d219@oracle.com>
 <f5267bd5-5387-13f2-eaf5-8db8efd52de9@oracle.com>
Message-ID: <be186a9b-0724-5883-29c5-de022786f4d2@oracle.com>

Thank you Tobias for your review!

Best regards,
Christian

On 20.07.20 10:32, Tobias Hartmann wrote:
> +1
> 
> Best regards,
> Tobias
> 
> On 15.07.20 19:37, Vladimir Kozlov wrote:
>> Looks good.
>>
>> Thanks,
>> Vladimir K
>>
>> On 7/15/20 12:58 AM, Christian Hagedorn wrote:
>>> Hi Vladimir
>>>
>>> On 14.07.20 20:46, Vladimir Kozlov wrote:
>>>> Can you move next up to where other small find*() methods are defined?:
>>>>
>>>> +Node* Node::find_ctrl(int idx) {
>>>> +  return find(idx, true);
>>>>  }
>>>>
>>>> Also add '// not PRODUCT' comment to #endif for #ifndef PRODUCT. It is hard to find where this
>>>> not product code ends.
>>>>
>>>> Looks good otherwise.
>>>
>>> Thanks, I added these changes in a new webrev:
>>> http://cr.openjdk.java.net/~chagedorn/8247743/webrev.02/
>>>
>>> Best regards,
>>> Christian
>>>
>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 7/14/20 2:54 AM, Christian Hagedorn wrote:
>>>>> Hi Vladimir
>>>>>
>>>>> On 13.07.20 19:43, Vladimir Kozlov wrote:
>>>>>> Node::find_ctrl() is used during debugging when you want to print and look on only control nodes.
>>>>>> We have several such methods which are only used in debugger.
>>>>>
>>>>> I see, I restored this method and changed Node::find() accordingly. I additionally added two
>>>>> find_ctrl() methods to make it easier to call it from a debugger (as already present for
>>>>> find_node()).
>>>>>
>>>>>> I suggest to store old_arena() in local var and pass into add_to_worklist().
>>>>>>
>>>>>> You can make add_to_worklist() static since you pass node as argument.
>>>>>
>>>>> Okay. I updated this and the change above in a new webrev:
>>>>> http://cr.openjdk.java.net/~chagedorn/8247743/webrev.01/
>>>>>
>>>>> Best regards,
>>>>> Christian
>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>> On 7/13/20 3:09 AM, Christian Hagedorn wrote:
>>>>>>> Ping - could anyone review it, please? Thanks!
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Christian
>>>>>>>
>>>>>>> On 02.07.20 09:33, Christian Hagedorn wrote:
>>>>>>>> Hi
>>>>>>>>
>>>>>>>> Please review the following patch:
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8247743
>>>>>>>> http://cr.openjdk.java.net/~chagedorn/8247743/webrev.00/
>>>>>>>>
>>>>>>>> The testcase creates a deep graph with a lot of nodes on a chain. When running with the
>>>>>>>> specified test flags, it recursively calls Node::find_recur() for each node discovered which
>>>>>>>> eventually results in a segmentation fault due to a stack overflow (around 10000 calls due to
>>>>>>>> such a long chain of nodes). The fix just converts the recursive algorithm into an iterative
>>>>>>>> one to avoid a segmentation fault. This is similar to JDK-8246203 [1].
>>>>>>>>
>>>>>>>> I additionally removed Node::find_ctrl() and its special handling in the algorithm since it
>>>>>>>> is not used.
>>>>>>>>
>>>>>>>> There is actually another problem with the recursive version. When running the testcase
>>>>>>>> without -XX:CompileOnly=compiler/c2/TestFindNode, it will spin forever inside [2] because
>>>>>>>> there is a debug_orig node cycle and the loop does not break based on the debug_orig nodes
>>>>>>>> being visited. This is also fixed in the patch.
>>>>>>>>
>>>>>>>> Thank you!
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Christian
>>>>>>>>
>>>>>>>>
>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8246203
>>>>>>>> [2] http://hg.openjdk.java.net/jdk/jdk/file/e2622818f0bd/src/hotspot/share/opto/node.cpp#l1589
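
The recursive-to-iterative conversion described in the original RFR above can be sketched in isolation. The Node type and find_node function below are hypothetical simplifications, not the code from node.cpp: the sketch only follows input and output edges, uses an explicit worklist instead of recursion, and records visited nodes so that cycles terminate.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical, minimal graph node; the real C2 Node carries much more state.
struct Node {
  int idx;
  std::vector<Node*> inputs;
  std::vector<Node*> outputs;
  explicit Node(int i) : idx(i) {}
};

// Iterative search for the node with the given index. The explicit worklist
// keeps stack usage constant regardless of how deep the graph is, and the
// visited set guarantees termination when the graph contains cycles.
Node* find_node(Node* start, int idx) {
  if (start == nullptr) {
    return nullptr;
  }
  std::vector<Node*> worklist;
  std::unordered_set<const Node*> visited;
  worklist.push_back(start);
  visited.insert(start);
  while (!worklist.empty()) {
    Node* n = worklist.back();
    worklist.pop_back();
    if (n->idx == idx) {
      return n;
    }
    for (const std::vector<Node*>* edges : {&n->inputs, &n->outputs}) {
      for (Node* m : *edges) {
        // insert() reports whether m is new; only new nodes are queued.
        if (m != nullptr && visited.insert(m).second) {
          worklist.push_back(m);
        }
      }
    }
  }
  return nullptr;
}
```

A chain of 100000 nodes with a back edge from the last node to the first is handled without deep recursion, and a search for an index that does not exist returns nullptr instead of spinning.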

From richard.reingruber at sap.com  Wed Jul 22 08:20:15 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Wed, 22 Jul 2020 08:20:15 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <AM4PR0202MB29641555B86889D51E08441BEC7F0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
 <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
 <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331EEF36942FCEBA7E131389BCB0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB329746F57D1C78F14000CB799AC80@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331D64C693490FD0746D1989BC90@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <DB6PR0201MB2152AF18921A375D26A76D89ECA40@DB6PR0201MB2152.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331FF18BED42A71796488E59B600@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM4PR0202MB29641555B86889D51E08441BEC7F0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <AM0PR0202MB3331445A57DBEC5F24C155649B790@AM0PR0202MB3331.eurprd02.prod.outlook.com>

Hi Goetz,

> I'll answer to the obvious things in this mail now.
> I'll go through the code thoroughly again and write 
> a review of my findings thereafter.

Sure. I've trimmed my citations to the relevant parts.

> > The delta includes many changes in comments, renaming of names, etc. So
> > I'd like to summarize
> > functional changes:
> > 
> > * Collected all the code for the testing feature DeoptimizeObjectsALot in
> > compileBroker.cpp and reworked it.
> Thanks, this makes it much more compact.

> >   With DeoptimizeObjectsALot enabled internal threads are started that
> > deoptimize frames and
> >   objects. The number of threads started are given with
> > DeoptimizeObjectsALotThreadCountAll and
> >   DeoptimizeObjectsALotThreadCountSingle. The former targets all existing
> > threads whereas the
> >   latter operates on a single thread selected round robin.
> > 
> >   I removed the mode where deoptimizations were performed at every nth
> > exit from the runtime. I never used it.

> Do I get it right? You have a n:1 and a n:all test scenario.
>  n:1: n threads deoptimize 1 Java thread    where n = DOALThreadCountSingle
>  n:m: n threads deoptimize all Java threads where n = DOALThreadCountAll?

Not quite.

-XX:+DeoptimizeObjectsALot // required
-XX:DeoptimizeObjectsALotThreadCountAll=m
-XX:DeoptimizeObjectsALotThreadCountSingle=n

This will start m+n threads, each operating on all existing JavaThreads using EscapeBarriers. The
difference between the 2 thread types is that one distinct EscapeBarrier targets either just a
single thread or all existing threads at once. If just one single thread is targeted per
EscapeBarrier, then it is not always the same thread, but threads are selected round robin. So there
will be n threads selecting independently single threads round robin per EscapeBarrier and m threads
that target all threads in every EscapeBarrier.
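
As a rough illustration of the two deoptimizer thread types, here is a minimal sketch of the target selection per barrier. BarrierKind and next_targets are hypothetical names introduced for this example, and Java threads are represented by plain indices; the actual selection logic in the webrev may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names: which set of threads a single barrier targets.
enum class BarrierKind { Single, All };

// Returns the indices of the threads targeted by the next barrier.
// For Single, the cursor advances so that successive barriers pick
// successive threads (round robin); for All, the cursor is not touched.
std::vector<std::size_t> next_targets(BarrierKind kind,
                                      std::size_t thread_count,
                                      std::size_t& cursor) {
  std::vector<std::size_t> targets;
  if (thread_count == 0) {
    return targets;
  }
  if (kind == BarrierKind::All) {
    for (std::size_t i = 0; i < thread_count; i++) {
      targets.push_back(i);
    }
  } else {
    targets.push_back(cursor % thread_count);
    cursor = (cursor + 1) % thread_count;
  }
  return targets;
}
```

Each of the n "single" deoptimizer threads would keep its own cursor, which is what makes them select targets independently.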


> > * EscapeBarrier::sync_and_suspend_one(): use a direct handshake and
> > execute it always independently
> >   of is_thread_fully_suspended().
> Is this also a performance optimization?

Maybe a minor one.

> > * JavaThread::wait_for_object_deoptimization():
> >   - Bugfix: the last check of is_obj_deopt_suspend() must be /after/ the
> > safepoint check! This
> >     caused issues with not walkable stacks with DeoptimizeObjectsALot.
> OK. As I understand, there was one safepoint check in the old version, 
> now there is one in each iteration.  I assume this is intended, right?

Yes it is. The important thing here is (A) a safepoint check is needed /after/ leaving a safe state
(_thread_in_native, _thread_blocked). (B) Shared variables that are modified at safepoints or with
handshakes need to be reread /after/ the safepoint check.

BTW: I only noticed now that since JDK-8240918 JavaThreads themselves must disarm their polling
page. Originally (before handshakes) this was done by the VM thread. With handshakes it was done by
the thread executing the handshake op. This was changed for OrderAccess::cross_modify_fence() where
the poll is left armed if the thread is in native and since JDK-8240918 it is always left armed. So
when a thread leaves a safe state (native, blocked) and there was a handshake/vm op, it will always
call SafepointMechanism::block_if_requested_slow(), even if the handshake/vm operation has been
processed already and everybody else is happily executing bytecodes :)

Still (A) and (B) hold.

> >   - Added limited spinning inspired by HandshakeSpinYield to fix regression in
> > microbenchmark [1]
> Ok.  Nice improvement, nice catch!

Yes. It certainly took some time to find out.
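
Points (A) and (B) together with the limited spinning can be sketched as a loop. This is only a model under stated assumptions: SuspendState is a hypothetical stand-in for the per-thread flag, the safepoint check is passed in as a callback, and a short sleep stands in for blocking on the monitor. It is not the implementation of JavaThread::wait_for_object_deoptimization().

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical stand-in for the per-thread suspend flag.
struct SuspendState {
  std::atomic<bool> obj_deopt_suspend{false};
};

// Loops until the suspend flag is clear and returns how many times the flag
// was observed set. safepoint_check models the check that is required after
// leaving a safe state.
int wait_for_object_deoptimization(SuspendState& state,
                                   const std::function<void()>& safepoint_check,
                                   int spin_limit) {
  assert(spin_limit >= 0);
  int seen_set = 0;
  for (;;) {
    // (A) Perform the safepoint check first ...
    safepoint_check();
    // (B) ... and only afterwards re-read the shared flag, because the
    // check itself may be the point at which the flag gets changed.
    if (!state.obj_deopt_suspend.load(std::memory_order_acquire)) {
      return seen_set;
    }
    ++seen_set;
    if (seen_set > spin_limit) {
      // Spinning budget used up: back off. The sleep is a placeholder for
      // waiting on the monitor.
      std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
  }
}
```

The first spin_limit iterations retry immediately, which avoids the cost of blocking when the suspension is short; that is the effect the limited spinning is meant to achieve.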

> > 
> > I refer to some more changes answering your questions and comments inline
> > below.
> > 
> > Thanks,
> > Richard.
> > 
> > [1] Microbenchmark:
> > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/
> > 


> > > I understand you annotate at safepoints where the escape analysis
> > > finds out that an object is "better" than global escape.
> > > These are the cases where the analysis identifies optimization
> > > opportunities. These annotations are then used to deoptimize
> > > frames and the objects referenced by them.
> > > Doesn't this overestimate the optimized
> > > objects?  E.g., eliminate_alloc_node has many cases where it bails
> > > out.
> > 
> > Yes, the implementation is conservative, but it is comparatively simple and
> > the additional debug
> > info is just 2 flags per safepoint. 
> Thanks. It also helped that you explained to me offline that 
> there are more optimizations than only lock elimination and scalar
> replacement done based on the ea information.
> The ea refines the IR graph which allows follow up optimizations 
> which can not easily be tracked back to the escaping objects or 
> the call sites where they do not escape. 
> Thus, if there are non-global escaping objects, you have to 
> deoptimize the frame.
> Did I repeat that correctly?

Mostly, but there are also cases where deoptimization is required if and only if ea-local objects
are passed as arguments. This is the case when values are not read directly from a frame, but from
a callee frame.

> With this understanding, a row of my proposed renamings/comments
> are obsolete.

Ok.


> > On the other hand, those JVMTI operations
> > that really trigger
> > deoptimizations are expected to be comparatively infrequent such that
> > switching to the interpreter
> > for a few microseconds will hardly have an effect.
> That sounds reasonable.

> > I've done microbenchmarking to check this.
> > 
> > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/
> > 
> > I found that in the worst case performance can be impacted by 10%. If the
> > agent is extremely active
> > and does relevant JVMTI calls like GetOwnedMonitorStackDepthInfo() every
> > millisecond or more often,
> > then the performance impact can be 30%. But I would think that this is not
> > realistic. These calls
> > are issued in interactive sessions to analyze deadlocks.
> Ok. 
 
> > We could get more precise deoptimizations by adding a third flag per
> > safepoint for ea-local objects
> > among the owned monitors. This would help improve the worst case in the
> > benchmark. But I'm not
> > convinced, if it is worth it.
> > 
> > Refer to the README.txt of the microbenchmark for a more detailed
> > discussion.
 
> > > pcDesc.hpp
> > >
> > > I would like to see some documentation of the methods. 
> > Done. I didn't take your text, though, because I only noticed it after writing
> > my own. Let me know if you are not ok with it.
> That's fine. My texts were only proposals, you as author know better
> what goes on anyways.

Ok.

> > > scopeDesc.cpp
> > >
> > >   Besides refactoring copy escape info from pcDesc to scopeDesc
> > >   and add accessors. Trivial.
> > >
> > >   In scopeDesc.hpp you talk about NoEscape and ArgEscape.
> > >   These are opto terms, but scopeDesc is a shared datastructure
> > >   that does not depend on a specific compiler.
> > >   Please explain what is going on without using these terms.
> > 
> > Actually these are not too opto specific terms. They are used in the paper
> > referenced in
> > escape.hpp. Also you can easily google them. I'd rather keep the comments
> > as they are.
> Hmm, I'm not really happy with this, as also the papers
> are for the compiler community, and probably not familiar to 
> others that work with HotSpot.
> But stay with your terms if you think it makes it clearer.
> Anyways, with now understanding why you use conservative
> Information (see above), the descriptions I had in mind are not precise.

Ok.

> > > callnode.hpp
> > >
> > > You add functionality to annotate callnodes with escape information
> > > This is carried through code generation to final output where it is
> > > added to the compiled methods meta information.
> > >
> > > At Safepoints in general jvmti can access
> > >   - Objects that were scalar replaced. They must be reallocated.
> > >     (Flag EliminateAllocations)
> > >   - Objects that should be locked but are not because they never
> > >     escape the thread. They need to be relocked.
> > >
> > > At calls, Objects where locks have been removed escape to callees.
> > > We must persist this information so that if jvmti accesses the
> > > object in a callee, we can determine by looking at the caller that
> > > it needs to be relocked.
> > 
> > Note that the ea-optimization need not be at the current location; it can also
> > follow when control
> > returns to the caller. Lock elimination isn't the only relevant optimization.
> Yes, I understood now, see above. Thanks for explaining.

Ok.

> > Accesses to instance
> > members or array elements can be optimized as well.
> You mean the compiler can/will ignore volatile or memory ordering
> requirements for non-escaping objects? Sounds reasonable to do.

Yes, for instance. Also without volatile modifiers it will eliminate accesses. Here is an example:
Method A has a NoEscape allocation O that is not scalar replaced. A calls Method B, which is not
inlined. When you use your debugger to break in B, then modify a field of O, then this modification
would have no effect without deoptimization, because the jit assumes that B cannot modify O without
a reference to it.

> > You are right, it is not correct how flags are checked. Especially if only
> > running with the JVMCI compiler.
> >
> > I changed Deoptimization::deoptimize_objects_internal() to make
> > reallocation and relocking dependent
> > on similar checks as in Deoptimization::fetch_unroll_info_helper().
> > Furthermore EscapeBarriers are
> > conditionally activated depending on the following (see EscapeBarrier ctors):
> > 
> > JVMCI_ONLY(UseJVMCICompiler) NOT_JVMCI(false)
> > COMPILER2_PRESENT(|| DoEscapeAnalysis)
> > 
> > So the enhancement can be practically completely disabled by disabling
> > DoEscapeAnalysis, which is
> > what C2 currently does if JVMTI capabilities that allow access to local
> > references are taken.
> Thanks for fixing. 

Thanks for finding :)

> > I went for the latter.
> > 
> > > In fetch_unroll_info_helper, I don't understand why you need
> > >  && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
> > > for eliminated locks, but not for scalar replaced objects?
> > 
> > In short reallocation is idempotent, relocking is not.
> > 
> > Without the enhancement Deoptimization::realloc_objects() can already be
> > called more than once for a frame:
> > 
> > First call in materializeVirtualObjects() (also iterateFrames()).
> > 
> > Second (indirect) call in fetch_unroll_info_helper().
> > 
> > The objects from the first call are saved as jvmti deferred updates when
> > realloc_objects()
> > returns. Note that there is no relationship to jvmti. The thing in common is
> > that updates cannot be
> > directly installed into a compiled frame, it is necessary to deoptimize the
> > frame and defer the
> > updates until the compiled frame gets replaced. Every time the vframes
> > corresponding to the owner
> > frame are iterated, they get the deferred updates. So in
> > fetch_unroll_info_helper() the
> > GrowableArray<compiledVFrame*>* chunk reference them too. All
> > references to the objects created by
> > the second (indirect) call to realloc_objects() are never used, because
> > compiledVFrame accessors to
> > locals, expressions, and monitors override them with the deferred updates.
> > The objects become
> > unreachable and get gc'ed.
> OK, so repeatedly computed vFrames always have the first version of 
> reallocated objects by construction, so it needs not be handled here.
> But also due to construction, objects might be allocated just to be
> discarded.

Yes.
 
> > materializeVirtualObjects() does not bother with relocking.
> > deoptimize_objects_internal(), which is
> > introduced by the enhancement, does relock objects, after all the lock
> > elimination becomes illegal 
> > with the change in escape state. Relocking twice does not work, so the
> > enhancement avoids it by
> > checking EscapeBarrier::objs_are_deoptimized(thread, deoptee.id()).
> > 
> > Note that materializeVirtualObjects() can be called more than once and will
> > always return the very
> > same objects, even though it calls realloc_objects() again.
> Ok.
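
The distinction "reallocation is idempotent, relocking is not" can be sketched with a toy model. FrameState, realloc_object and deoptimize_objects below are hypothetical names for this illustration: the map plays the role of the deferred updates that always win over later allocations, and the boolean plays the role of the check that EscapeBarrier::objs_are_deoptimized() provides.

```cpp
#include <cassert>
#include <map>
#include <memory>

// Hypothetical object model: only the lock count matters here.
struct Obj {
  int lock_count = 0;
};

// Hypothetical per-frame state.
struct FrameState {
  std::map<int, std::shared_ptr<Obj>> deferred_updates;  // keyed by slot
  bool objs_are_deoptimized = false;
};

// Idempotent: every call allocates, but only the first allocation for a slot
// is kept. Later allocations are dropped immediately (they become garbage).
std::shared_ptr<Obj> realloc_object(FrameState& frame, int slot) {
  std::shared_ptr<Obj> fresh = std::make_shared<Obj>();
  // emplace() leaves an existing entry untouched.
  return frame.deferred_updates.emplace(slot, fresh).first->second;
}

// Not idempotent by itself: relocking twice would corrupt the lock count,
// so the flag guards against a second run. Returns true if work was done.
bool deoptimize_objects(FrameState& frame, int slot) {
  if (frame.objs_are_deoptimized) {
    return false;
  }
  std::shared_ptr<Obj> obj = realloc_object(frame, slot);
  ++obj->lock_count;  // relock the object whose lock had been eliminated
  frame.objs_are_deoptimized = true;
  return true;
}
```

Calling realloc_object() repeatedly is harmless because callers always observe the first object, whereas the relocking step needs the explicit guard.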


> > > I would guess it is because the eliminated locks can be applied to
> > > argEscape, but scalar replacement only to noescape objects?
> > > I.e. it might have been done before?
> > >
> > > But why isn't this the case for eliminate_allocations?
> > > deoptimize_objects_internal does both unconditionally,
> > > so both can happen to inner frames, right?
> > 
> > Sorry, I don't quite understand. Hope the explanation above helps.
> Yes.  I was guessing wrong :)

Ok, good :)

> > 
> > > Code will get much more simple if BiasedLocking is removed.
> > >
> > > EscapeBarrier:: ...
> > >
> > > (This class maybe would qualify for a file of its own.)
> > >
> > > deoptimize_objects()
> > > I would mention escape analysis only as side remark.  Also, as I understand,
> > > there is only one frame at given depth?
> > > // Deoptimize frames with optimized objects. This can be omitted locks and
> > > // objects not allocated but replaced by scalars. In C2, these optimizations
> > > // are based on escape analysis.
> > > // Up to depth, deoptimize frames with any optimized objects.
> > > // From depth to entry_frame, deoptimize only frames that
> > > // pass optimized objects to their callees.
> > > (First part similar for the comment above
> > EscapeBarrier::deoptimize_objects_internal().)
> > 
> > I've reworked the comment. Let me know if you still think it needs to be
> > improved.
> Good now, thanks (maybe break the long line ...)

Ok. Will do in next webrev.7

> > > Synchronization: looks good. I think others had a look at this before.
> > >
> > > EscapeBarrier::deoptimize_objects_internal()
> > >   The method name is misleading, it is not used by
> > >   deoptimize_objects().
> > >   Also, a method with the same name is in Deoptimization.
> > >   Proposal: deoptimize_objects_thread() ?
> > 
> > Sorry, but I don't see, why it would be misleading.
> > What would be the meaning of 'deoptimize_objects_thread'? I don't
> > understand that name.
> 1. I have no idea why it's called "_internal". Because it is private?
>    By the name, I would expect that EscapeBarrier::deoptimize_objects()
>    calls it for some internal tasks. But it does not.

Well, I'd say it is pretty internal, what's happening in that method. So IMHO the suffix _internal
is a match.

> 2. My proposal: deoptimize_objects_all_threads() iterates all threads
> and calls deoptimize_objects(_one)_thread(thread) for each of these.
> That's how I would have named it. 
> But no bike shedding, if you don't see what I mean it's not obvious.

Ok. We could have a quick call, too, if you like.

> > > Renaming deferred_locals to deferred_updates is good, as well as
> > > adding a datastructure for it.
> > > (Adding this data structure might be a breakout, too.)
> > >
> > > good.
> > >
> > > thread.cpp
> > >
> > > good.
> > >
> > > vframe.cpp
> > >
> > > Is this a bug in existing code?
> > > Makes sense.
> > 
> > Depends on your definition of bug. There are no references to
> > vframe::is_entry_frame() in the
> > existing code. I would think it is a bug.
> So it is :)

I'm just afraid it could get fixed by removing the class entryVFrame.

> > 
> > >
> > > vframe_hp.hpp
> > > (What stands _hp for? helper? The file should be named
> > compiledVFrame ...)
> > >
> > > not_global_escape_in_scope() ...
> > > Again, you mention escape analysis here. Comments above hold, too.
> > 
> > I think it is the right name, because it is meaningful and simple.
> Ok, accepted ... given my understandings from above.

Ok.

> > 
> > > You introduce JvmtiDeferredUpdates. Good.
> > >
> > > vframe_hp.cpp
> > >
> > > Changes for JvmtiDeferredUpdates, escape state accessors,
> > >
> > > line 422:
> > > Would an assertion assert(!info->owner_is_scalar_replaced(), ...) hold here?
> > >
> > >
> > > macros.hpp
> > >   Good.
> > >
> > >
> > > Test coding
> > > ============
> > >
> > > compileBroker.h|cpp
> > >
> > > You introduce a third class of threads handled here and
> > > add a new flag to distinguish it. Before, the two kinds
> > > of threads were distinguished implicitly by passing in
> > > a compiler for compiler threads.
> > > The new thread kind is only used for testing in debug.
> > >
> > > make_thread:
> > > You could assert (comp != NULL...) to assure previous
> > > conditions.
> > 
> > I replaced the if-statements with a switch-statement, made sure all enum-
> > elements are covered, and
> > added the assertion you suggested.
> > 
> > > line 989 indentation broken
> > 
> > You are referring to this block I assume:
> > (from
> > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/src/hotspot/share/compiler/compileBroker.cpp.frames.html)
> > 
> >  976   if (MethodFlushing) {
> >  977     // Initialize the sweeper thread
> >  978     Handle thread_oop = create_thread_oop("Sweeper thread", CHECK);
> >  979     jobject thread_handle = JNIHandles::make_local(THREAD, thread_oop());
> >  980     make_thread(sweeper_t, thread_handle, NULL, NULL, THREAD);
> >  981   }
> >  982
> >  983 #if defined(ASSERT) && COMPILER2_OR_JVMCI
> >  984   if (DeoptimizeObjectsALot == 2) {
> >  985     // Initialize and start the object deoptimizer threads
> >  986     for (int thread_count = 0; thread_count < DeoptimizeObjectsALotThreadCount; thread_count++) {
> >  987       Handle thread_oop = create_thread_oop("Deoptimize objects a lot thread", CHECK);
> >  988       jobject thread_handle = JNIHandles::make_local(THREAD, thread_oop());
> >  989       make_thread(deoptimizer_t, thread_handle, NULL, NULL, THREAD);
> >  990     }
> >  991   }
> >  992 #endif // defined(ASSERT) && COMPILER2_OR_JVMCI
> > 
> > I cannot really see broken indentation here. Am I looking at the wrong
> > location?
> I don't have the source version I reviewed last time any more, so 
> I can't check. But maybe an artefact from patching ... if there were
> tabs jcheck would have told you, so that's not it. No problem.

Ok.

Thanks again!

Cheers, Richard.

-----Original Message-----
From: Lindenmaier, Goetz <goetz.lindenmaier at sap.com> 
Sent: Donnerstag, 16. Juli 2020 18:30
To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi Richard, 

I'll answer to the obvious things in this mail now.
I'll go through the code thoroughly again and write 
a review of my findings thereafter.

> So here is the new webrev.6
> 
> Webrev.6:
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6/
> Delta:
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.inc/
Thanks for the incremental webrev, it's helpful!
 
> I spent most of the time running a microbenchmark [1] I wrote to answer
> questions from your
> review. At first I had trouble with variance in the results until I found out it
> was due to the NUMA
> architecture of the server I used. After that I noticed that there was a
> performance regression of
> about 5% even at low agent activity. I finally found out that it was due to the
> implementation of
> JavaThread::wait_for_object_deoptimization() which is called by the target
> of the JVMTI operation to
> self suspend for object deoptimization. I fixed this by adding limited spinning
> before calling
> wait() on the monitor.
> 
> The delta includes many changes in comments, renaming of names, etc. So
> I'd like to summarize
> functional changes:
> 
> * Collected all the code for the testing feature DeoptimizeObjectsALot in
> compileBroker.cpp and reworked it.
Thanks, this makes it much more compact.

>   With DeoptimizeObjectsALot enabled internal threads are started that
> deoptimize frames and
>   objects. The number of threads started are given with
> DeoptimizeObjectsALotThreadCountAll and
>   DeoptimizeObjectsALotThreadCountSingle. The former targets all existing
> threads whereas the
>   latter operates on a single thread selected round robin.
> 
>   I removed the mode where deoptimizations were performed at every nth
> exit from the runtime. I never used it.

Do I get it right? You have an n:1 and an n:all test scenario.
 n:1:   n threads deoptimize 1 Java thread    where n = DOALThreadCountSingle
 n:all: n threads deoptimize all Java threads where n = DOALThreadCountAll?

> * EscapeBarrier::sync_and_suspend_one(): use a direct handshake and
> execute it always independently
>   of is_thread_fully_suspended().
Is this also a performance optimization?

> * Bugfix in EscapeBarrier::thread_added(): must not clear deopt flag. Found
> this testing with DeoptimizeObjectsALot.
Ok.

> * Added EscapeBarrier::thread_removed().
Ok.

> * EscapeBarrier constructors: barriers can now be entirely disabled by
> disabling DoEscapeAnalysis.
>   This effectively disables the enhancement.
Good!

> * JavaThread::wait_for_object_deoptimization():
>   - Bugfix: the last check of is_obj_deopt_suspend() must be /after/ the
> safepoint check! This
>     caused issues with not walkable stacks with DeoptimizeObjectsALot.
OK. As I understand, there was one safepoint check in the old version, 
now there is one in each iteration.  I assume this is intended, right?

>   - Added limited spinning inspired by HandshakeSpinYield to fix regression in
> microbenchmark [1]
Ok.  Nice improvement, nice catch!
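
To make the "limited spinning before wait()" idea concrete, here is a
minimal Java sketch of the general pattern. It is not the HotSpot code:
the class SpinThenWait, its field names and the spinLimit parameter are
made up for illustration, and a plain Java monitor stands in for the VM
monitor.

```java
// Hypothetical illustration only; not JavaThread::wait_for_object_deoptimization().
final class SpinThenWait {
    private final Object monitor = new Object();
    // Assumed flag: true while the target thread must stay suspended.
    private volatile boolean suspendRequested = true;

    /** Called by the requesting side once the target may continue. */
    void resume() {
        synchronized (monitor) {
            suspendRequested = false;
            monitor.notifyAll();
        }
    }

    /**
     * Called by the target thread. Spins at most spinLimit times, then
     * falls back to blocking. Returns true only if it really blocked
     * in wait().
     */
    boolean awaitResume(int spinLimit) {
        // Phase 1: limited spinning, cheap if the suspension is short.
        for (int i = 0; i < spinLimit; i++) {
            if (!suspendRequested) {
                return false;
            }
            Thread.onSpinWait();
        }
        // Phase 2: block on the monitor until resume() notifies.
        synchronized (monitor) {
            boolean blocked = false;
            while (suspendRequested) {
                blocked = true;
                try {
                    monitor.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while suspended", e);
                }
            }
            return blocked;
        }
    }
}
```

If the requester finishes quickly, the target never pays for the
monitor enter/wait/notify round trip, which is presumably where a
regression at low agent activity would come from.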

> 
> I refer to some more changes answering your questions and comments inline
> below.
> 
> Thanks,
> Richard.
> 
> [1] Microbenchmark:
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/
> 


> > I understand you annotate at safepoints where the escape analysis
> > finds out that an object is "better" than global escape.
> > These are the cases where the analysis identifies optimization
> > opportunities. These annotations are then used to deoptimize
> > frames and the objects referenced by them.
> > Doesn't this overestimate the optimized
> > objects?  E.g., eliminate_alloc_node has many cases where it bails
> > out.
> 
> Yes, the implementation is conservative, but it is comparatively simple and
> the additional debug
> info is just 2 flags per safepoint. 
Thanks. It also helped that you explained to me offline that 
there are more optimizations than only lock elimination and scalar
replacement done based on the ea information.
The ea refines the IR graph, which allows follow-up optimizations
that cannot easily be traced back to the escaping objects or
the call sites where they do not escape.
Thus, if there are non-global escaping objects, you have to 
deoptimize the frame.
Did I repeat that correctly?
With this understanding, a number of my proposed renamings/comments
are obsolete.


> On the other hand, those JVMTI operations
> that really trigger
> deoptimizations are expected to be comparatively infrequent such that
> switching to the interpreter
> for a few microseconds will hardly have an effect.
That sounds reasonable.

> I've done microbenchmarking to check this.
> 
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/
> 
> I found that in the worst case performance can be impacted by 10%. If the
> agent is extremely active
> and does relevant JVMTI calls like GetOwnedMonitorStackDepthInfo() every
> millisecond or more often,
> then the performance impact can be 30%. But I would think that this is not
> realistic. These calls
> are issued in interactive sessions to analyze deadlocks.
Ok. 
 
> We could get more precise deoptimizations by adding a third flag per
> safepoint for ea-local objects
> among the owned monitors. This would help improve the worst case in the
> benchmark. But I'm not
> convinced, if it is worth it.
> 
> Refer to the README.txt of the microbenchmark for a more detailed
> discussion.
 
> > pcDesc.hpp
> >
> > I would like to see some documentation of the methods. 
> Done. I didn't take your text, though, because I only noticed it after writing
> my own. Let me know if you are not ok with it.
That's fine. My texts were only proposals, you as author know better
what goes on anyways.

> > scopeDesc.cpp
> >
> >   Besides refactoring copy escape info from pcDesc to scopeDesc
> >   and add accessors. Trivial.
> >
> >   In scopeDesc.hpp you talk about NoEscape and ArgEscape.
> >   These are opto terms, but scopeDesc is a shared datastructure
> >   that does not depend on a specific compiler.
> >   Please explain what is going on without using these terms.
> 
> Actually these are not too opto specific terms. They are used in the paper
> referenced in
> escape.hpp. Also you can easily google them. I'd rather keep the comments
> as they are.
Hmm, I'm not really happy with this, as the papers are also
written for the compiler community, and probably not familiar to
others that work with HotSpot.
But stay with your terms if you think it makes it clearer.
Anyway, now that I understand why you use conservative
information (see above), the descriptions I had in mind are not precise.

> > callnode.hpp
> >
> > You add functionality to annotate callnodes with escape information
> > This is carried through code generation to final output where it is
> > added to the compiled methods meta information.
> >
> > At Safepoints in general jvmti can access
> >   - Objects that were scalar replaced. They must be reallocated.
> >     (Flag EliminateAllocations)
> >   - Objects that should be locked but are not because they never
> >     escape the thread. They need to be relocked.
> >
> > At calls, Objects where locks have been removed escape to callees.
> > We must persist this information so that if jvmti accesses the
> > object in a callee, we can determine by looking at the caller that
> > it needs to be relocked.
> 
> Note that the ea-optimization must not be at the current location, it can also
> follow when control
> returns to the caller. Lock elimination isn't the only relevant optimization.
Yes, I understood now, see above. Thanks for explaining.
> Accesses to instance
> members or array elements can be optimized as well.
You mean the compiler can/will ignore volatile or memory ordering
requirements for non-escaping objects? Sounds reasonable to do.

> > // Returns true if at least one of the arguments to the call is an oop
> > // that does not escape globally.
> > bool ConnectionGraph::has_arg_escape(CallJavaNode* call) {
> 
> IMHO the method names are descriptive and don't need the comments. But I
> give in :) (only replaced
> "oop" with "object")
Thanks. Yes, object is better than oop.

> You are right, it is not correct how flags are checked. Especially if only
> running with the JVMCI compiler.
>
> I changed Deoptimization::deoptimize_objects_internal() to make
> reallocation and relocking dependent
> on similar checks as in Deoptimization::fetch_unroll_info_helper().
> Furthermore EscapeBarriers are
> conditionally activated depending on the following (see EscapeBarrier ctors):
> 
> JVMCI_ONLY(UseJVMCICompiler) NOT_JVMCI(false)
> COMPILER2_PRESENT(|| DoEscapeAnalysis)
> 
> So the enhancement can be practically completely disabled by disabling
> DoEscapeAnalysis, which is
> what C2 currently does if JVMTI capabilities that allow access to local
> references are taken.
Thanks for fixing. 

> I went for the latter.
> 
> > In fetch_unroll_info_helper, I don't understand why you need
> >  && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
> > for eliminated locks, but not for scalar replaced objects?
> 
> In short reallocation is idempotent, relocking is not.
> 
> Without the enhancement Deoptimization::realloc_objects() can already be
> called more than once for a frame:
> 
> First call in materializeVirtualObjects() (also iterateFrames()).
> 
> Second (indirect) call in fetch_unroll_info_helper().
> 
> The objects from the first call are saved as jvmti deferred updates when
> realloc_objects()
> returns. Note that there is no relationship to jvmti. The thing in common is
> that updates cannot be
> directly installed into a compiled frame, it is necessary to deoptimize the
> frame and defer the
> updates until the compiled frame gets replaced. Every time the vframes
> corresponding to the owner
> frame are iterated, they get the deferred updates. So in
> fetch_unroll_info_helper() the
> GrowableArray<compiledVFrame*>* chunk reference them too. All
> references to the objects created by
> the second (indirect) call to realloc_objects() are never used, because
> compiledVFrame accessors to
> locals, expressions, and monitors override them with the deferred updates.
> The objects become
> unreachable and get gc'ed.
OK, so repeatedly computed vFrames always have the first version of 
reallocated objects by construction, so it need not be handled here.
But also due to construction, objects might be allocated just to be
discarded.
 
> materializeVirtualObjects() does not bother with relocking.
> deoptimize_objects_internal(), which is
> introduced by the enhancement, does relock objects, after all the lock
> elimination becomes illegal 
> with the change in escape state. Relocking twice does not work, so the
> enhancement avoids it by
> checking EscapeBarrier::objs_are_deoptimized(thread, deoptee.id()).
> 
> Note that materializeVirtualObjects() can be called more than once and will
> always return the very
> same objects, even though it calls realloc_objects() again.
Ok.
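
As a loose Java analogy for "reallocation is idempotent, relocking is
not": the sketch below models deferred updates as a map that keeps the
first materialized object per slot, and uses a ReentrantLock as a
stand-in for an object monitor. All names (DeoptOnceSketch, materialize,
relockGuarded, ...) are hypothetical and do not exist in HotSpot.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model; it only mirrors the reasoning in the mail above.
final class DeoptOnceSketch {
    // First materialized object per slot, playing the role of deferred updates.
    private final Map<Integer, Object> deferredUpdates = new HashMap<>();
    // Stand-in for the monitor of an object whose lock was eliminated.
    private final ReentrantLock monitor = new ReentrantLock();
    // Plays the role of an "objects are already deoptimized" check.
    private boolean objsAreDeoptimized = false;

    /** A fresh object is allocated on every call, but callers always see the first one. */
    Object materialize(int slot) {
        Object fresh = new Object();
        Object first = deferredUpdates.putIfAbsent(slot, fresh);
        return first != null ? first : fresh; // the fresh copy is garbage if not first
    }

    /** Relocking without a guard: a second call raises the recursion count. */
    void relockUnguarded() {
        monitor.lock();
    }

    /** Relocking with a guard: repeated calls are harmless. */
    void relockGuarded() {
        if (!objsAreDeoptimized) {
            monitor.lock();
            objsAreDeoptimized = true;
        }
    }

    int holdCount() {
        return monitor.getHoldCount();
    }
}
```

Calling materialize() twice yields the same object, whereas two
unguarded relocks leave the lock held twice, so a single unlock by the
interpreted code would not release it. That is the kind of imbalance
the guard avoids in this model.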


> > I would guess it is because the eliminated locks can be applied to
> > argEscape, but scalar replacement only to noescape objects?
> > I.e. it might have been done before?
> >
> > But why isn't this the case for eliminate_allocations?
> > deoptimize_objects_internal does both unconditionally,
> > so both can happen to inner frames, right?
> 
> Sorry, I don't quite understand. Hope the explanation above helps.
Yes.  I was guessing wrong :)

> >   I like if boolean operators are at the beginning of broken lines,
> >   but I think hotspot convention is to have them at the end.
> Ok, fixed.
Thanks.

> 
> > Code will get much more simple if BiasedLocking is removed.
> >
> > EscapeBarrier:: ...
> >
> > (This class maybe would qualify for a file of its own.)
> >
> > deoptimize_objects()
> > I would mention escape analysis only as side remark.  Also, as I understand,
> > there is only one frame at given depth?
> > // Deoptimize frames with optimized objects. This can be omitted locks and
> > // objects not allocated but replaced by scalars. In C2, these optimizations
> > // are based on escape analysis.
> > // Up to depth, deoptimize frames with any optimized objects.
> > // From depth to entry_frame, deoptimize only frames that
> > // pass optimized objects to their callees.
> > (First part similar for the comment above
> > EscapeBarrier::deoptimize_objects_internal().)
> 
> I've reworked the comment. Let me know if you still think it needs to be
> improved.
Good now, thanks (maybe break the long line ...)


> > What is the check (cur_depth <= depth) good for? Can you
> > ever walk past entry_frame?
> 
> Yes (assuming you mean the outer while-statement), there are java frames
> beyond the entry frame if a
> native method calls java methods again. So we visit all frames up to the given
> depth and from there
> we continue to the entry frame. It is not necessary to continue beyond that
> entry frame, because
> escape analysis assumes that arguments to native functions escape globally.
> 
> Example: Let the java stack look like this:
> 
> +---------+
> | Frame A |
> +---------+
> | Frame N |
> +---------+
> | Frame B |
> +---------+ <- top of stack
> 
> Where java method A calls native method N and N calls java method B.
> 
> Very simplified the native stack will look like this
> 
> +-------------------------+
> | Frame of JIT Compiled A |
> +-------------------------+
> | Frame N                 |
> +-------------------------+
> | Entry Frame             |
> +-------------------------+
> | Frame B                 |
> +-------------------------+ <- top of stack
> 
> The entry frame is an activation of the call stub, which is a small assembler
> routine that
> translates from the native calling convention to the java calling convention.
> 
> There cannot be any ArgEscape that is passed to B (see above), therefore we
> can stop the stackwalk
> at the entry frame if depth is 1. If depth is 3 we have to continue to Frame A,
> as it is directly
> accessed. 
Ok, thanks, nice explanation!!
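
The walk described above can be sketched in Java as follows. This is a
simplified model, not the EscapeBarrier code: Frame, Kind and walk() are
invented names, and the assumption that the native frame N counts
towards the depth is taken from the "depth is 3" remark in the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of "visit frames up to depth, then continue to the entry frame".
final class EscapeWalkSketch {
    enum Kind { JAVA, ENTRY }

    static final class Frame {
        final String name;
        final Kind kind;
        Frame(String name, Kind kind) { this.name = name; this.kind = kind; }
    }

    /** The example stack from the mail, listed from top of stack downwards. */
    static List<Frame> exampleStack() {
        return Arrays.asList(
            new Frame("B", Kind.JAVA),
            new Frame("Entry", Kind.ENTRY),
            new Frame("N", Kind.JAVA),   // native method frame, counted like a java frame
            new Frame("A", Kind.JAVA));
    }

    /**
     * Frames within depth are tagged "direct" (any optimized object matters);
     * frames between depth and the next entry frame are tagged "callee-args"
     * (only objects passed to callees matter).
     */
    static List<String> walk(List<Frame> topToBottom, int depth) {
        List<String> visited = new ArrayList<>();
        int curDepth = 0;
        for (Frame f : topToBottom) {
            if (f.kind == Kind.ENTRY) {
                if (curDepth >= depth) {
                    break;      // arguments to native callers escape globally: stop here
                }
                continue;       // depth not reached yet: keep walking past the call stub
            }
            curDepth++;
            visited.add(f.name + (curDepth <= depth ? ":direct" : ":callee-args"));
        }
        return visited;
    }
}
```

With the example stack, depth 1 stops at the entry frame after B,
while depth 3 walks on through N to A.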

> > Isn't vf->is_compiled_frame() prerequisite that "Move to next physical
> > frame"
> > is needed? You could move it into the other check.
> > If so, similar for deoptimize_objects_all_threads().
> 
> Only compiledVFrame require moving to the /top/ frame. Fixed.
Thanks, this looks better.

> > Synchronization: looks good. I think others had a look at this before.
> >
> > EscapeBarrier::deoptimize_objects_internal()
> >   The method name is misleading, it is not used by
> >   deoptimize_objects().
> >   Also, method with the same name is in Deoptimization.
> >   Proposal: deoptimize_objects_thread() ?
> 
> Sorry, but I don't see, why it would be misleading.
> What would be the meaning of 'deoptimize_objects_thread'? I don't
> understand that name.
1. I have no idea why it's called "_internal". Because it is private?
   By the name, I would expect that EscapeBarrier::deoptimize_objects()
   calls it for some internal tasks. But it does not.
2. My proposal: deoptimize_objects_all_threads() iterates all threads 
and calls deoptimize_objects(_one)_thread(thread) for each of these.
That's how I would have named it. 
But no bike shedding, if you don't see what I mean it's not obvious.


> > C1 stubs: this really shows you tested all configurations, great!
> >
> >
> > mutexLocker: ok.
> > objectMonitor.cpp: ok
> > stackValue.hpp   Is this missing clearing a bug?
> 
> In short: that change is not needed anymore. I'll remove it again.
Good. Thanks for the details.

> > Renaming deferred_locals to deferred_updates is good, as well as
> > adding a datastructure for it.
> > (Adding this data structure might be a breakout, too.)
> >
> > good.
> >
> > thread.cpp
> >
> > good.
> >
> > vframe.cpp
> >
> > Is this a bug in existing code?
> > Makes sense.
> 
> Depends on your definition of bug. There are no references to
> vframe::is_entry_frame() in the
> existing code. I would think it is a bug.
So it is :)

> 
> >
> > vframe_hp.hpp
> > (What does _hp stand for? helper? The file should be named
> > compiledVFrame ...)
> >
> > not_global_escape_in_scope() ...
> > Again, you mention escape analysis here. Comments above hold, too.
> 
> I think it is the right name, because it is meaningful and simple.
Ok, accepted ... given my understandings from above.

> 
> > You introduce JvmtiDeferredUpdates. Good.
> >
> > vframe_hp.cpp
> >
> > Changes for JvmtiDeferredUpdates, escape state accessors,
> >
> > line 422:
> > Would an assertion assert(!info->owner_is_scalar_replaced(), ...) hold here?
> >
> >
> > macros.hpp
> >   Good.
> >
> >
> > Test coding
> > ============
> >
> > compileBroker.h|cpp
> >
> > You introduce a third class of threads handled here and
> > add a new flag to distinguish it. Before, the two kinds
> > of threads were distinguished implicitly by passing in
> > a compiler for compiler threads.
> > The new thread kind is only used for testing in debug.
> >
> > make_thread:
> > You could assert (comp != NULL...) to assure previous
> > conditions.
> 
> I replaced the if-statements with a switch-statement, made sure all
> enum-elements are covered, and added the assertion you suggested.
> 
> > line 989 indentation broken
> 
> You are referring to this block I assume:
> (from
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.5/src/hotspot/share/compiler/compileBroker.cpp.frames.html)
> 
>  976   if (MethodFlushing) {
>  977     // Initialize the sweeper thread
>  978     Handle thread_oop = create_thread_oop("Sweeper thread", CHECK);
>  979     jobject thread_handle = JNIHandles::make_local(THREAD,
> thread_oop());
>  980     make_thread(sweeper_t, thread_handle, NULL, NULL, THREAD);
>  981   }
>  982
>  983 #if defined(ASSERT) && COMPILER2_OR_JVMCI
>  984   if (DeoptimizeObjectsALot == 2) {
>  985     // Initialize and start the object deoptimizer threads
>  986     for (int thread_count = 0; thread_count <
> DeoptimizeObjectsALotThreadCount; thread_count++) {
>  987       Handle thread_oop = create_thread_oop("Deoptimize objects a lot
> thread", CHECK);
>  988       jobject thread_handle = JNIHandles::make_local(THREAD,
> thread_oop());
>  989       make_thread(deoptimizer_t, thread_handle, NULL, NULL, THREAD);
>  990     }
>  991   }
>  992 #endif // defined(ASSERT) && COMPILER2_OR_JVMCI
> 
> I cannot really see broken indentation here. Am I looking at the wrong
> location?
I don't have the source version I reviewed last time any more, so 
I can't check. But maybe an artefact from patching ... if there were
tabs jcheck would have told you, so that's not it. No problem.

Best regards,
  Goetz.

From jatin.bhateja at intel.com  Wed Jul 22 10:27:26 2020
From: jatin.bhateja at intel.com (Bhateja, Jatin)
Date: Wed, 22 Jul 2020 10:27:26 +0000
Subject: RFR[S] : 8248830 : C2 : Rotate API intrinsification for X86
In-Reply-To: <92d97d1b-fc53-e368-b249-1cab7db33964@oracle.com>
References: <MWHPR11MB1614EAFF216144FE6EAE68F9E87F0@MWHPR11MB1614.namprd11.prod.outlook.com>
 <92d97d1b-fc53-e368-b249-1cab7db33964@oracle.com>
Message-ID: <MWHPR11MB1614CB6E26028AC98DAA7F30E8790@MWHPR11MB1614.namprd11.prod.outlook.com>

Hi Vladimir,

Please find the updated patch at following link

http://cr.openjdk.java.net/~jbhateja/8248830/webrev.03/

Change Summary:

1) Unified the handling for scalar rotate operations. All scalar rotate selection patterns are now dependent on newly created RotateLeft/RotateRight nodes. This promotes rotate inferencing. Currently, if DAG nodes corresponding to a sub-pattern are shared (have multiple users), the existing complex patterns based on Or/LShiftL/URShift do not get matched, and this prevents inferring rotate nodes. This can also cause perf degradation if a shift-or dependency chain appears inside a hot region. Please refer to the JIT'ed assembly output with baseline[1] and with patch[2]. We can see that the generated code size also went down from 832 bytes to 768 bytes.

2) Due to enhanced rotate inferencing new patch shows better performance even for legacy targets (non AVX-512). Please refer to the perf result[3] over AVX2 machine for JMH benchmark part of the patch.

3) As suggested, removed Java API intrinsification changes; scalar rotate transformations are now done during OrI/OrL node idealizations.

4) SLP always gets to work on new scalar Rotate nodes and creates vector rotate nodes which are degenerated into OrV/LShiftV/URShiftV nodes if the target does not support vector rotates (non-AVX512).

5) Added new instruction patterns for vector shift Left/Right operations with constant shift operands. This prevents emitting extra moves to XMM.

6) Constant folding scenarios are covered in RotateLeft/RotateRight idealization; inferencing of vector rotate through OrV idealization covers the vector patterns generated through the non-SLP route, i.e. VectorAPI.
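
For readers less familiar with the shift-or idiom the summary refers
to, the following small Java sketch shows the source-level patterns
that are equivalent to a rotate. The class and method names are only
illustrative; whether a given JIT build actually folds each of these
into a Rotate node depends on the patch under review.

```java
// Illustrative only: source-level shapes of the Or/LShift/URShift rotate idiom.
final class RotateIdiom {
    /** Rotate-left of an int written as shift-or; matches Integer.rotateLeft. */
    static int rotlInt(int x, int s) {
        // Java masks int shift counts to 5 bits, so (32 - s) behaves like -s.
        return (x << s) | (x >>> (32 - s));
    }

    /** Rotate-right of an int written as shift-or; matches Integer.rotateRight. */
    static int rotrInt(int x, int s) {
        return (x >>> s) | (x << (32 - s));
    }

    /** Rotate-left of a long; long shift counts are masked to 6 bits. */
    static long rotlLong(long x, int s) {
        return (x << s) | (x >>> (64 - s));
    }
}
```

Because the shift count is masked, the idiom agrees with
Integer.rotateLeft/Long.rotateLeft for every shift value, including 0
and values larger than the word size.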

Following are the JMH benchmark results on an AVX3 target.


Baseline:

Benchmark                         (SHIFT)  (TESTSIZE)   Mode  Cnt      Score   Error   Units
RotateBenchmark.testRotateLeftI        20         512  thrpt    2  33541.569          ops/ms
RotateBenchmark.testRotateLeftL        20         512  thrpt    2  20363.973          ops/ms
RotateBenchmark.testRotateRightI       20         512  thrpt    2  33944.085          ops/ms
RotateBenchmark.testRotateRightL       20         512  thrpt    2  20443.967          ops/ms


With Changes:

Benchmark                         (SHIFT)  (TESTSIZE)   Mode  Cnt      Score   Error   Units
RotateBenchmark.testRotateLeftI        20         512  thrpt    2  48439.220          ops/ms
RotateBenchmark.testRotateLeftL        20         512  thrpt    2  35758.933          ops/ms
RotateBenchmark.testRotateRightI       20         512  thrpt    2  49702.219          ops/ms
RotateBenchmark.testRotateRightL       20         512  thrpt    2  35618.666          ops/ms

Please push the patch through your testing framework and let me know your review feedback.

Best Regards,
Jatin

[1] http://cr.openjdk.java.net/~jbhateja/8248830/rotate_baseline_avx2_asm.txt
[2] http://cr.openjdk.java.net/~jbhateja/8248830/rotate_new_patch_avx2_asm.txt
[3] http://cr.openjdk.java.net/~jbhateja/8248830/rotate_perf_avx2_new_patch.txt


> -----Original Message-----
> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
> Sent: Saturday, July 18, 2020 12:25 AM
> To: Bhateja, Jatin <jatin.bhateja at intel.com>; Andrew Haley <aph at redhat.com>
> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; hotspot-compiler-
> dev at openjdk.java.net
> Subject: Re: RFR[S] : 8248830 : C2 : Rotate API intrinsification for X86
> 
> Hi Jatin,
> 
> > http://cr.openjdk.java.net/~jbhateja/8248830/webrev_02/
> 
> It definitely looks better, but IMO it hasn't reached the sweet spot yet.
> It feels like the focus is on auto-vectorizer while the burden is put on
> scalar cases.
> 
> First of all, considering GVN folds relevant operation patterns into a
> single Rotate node now, what's the motivation to introduce intrinsics?
> 
> Another point is there's still significant duplication for scalar cases.
> 
> I'd prefer to see the legacy cases which rely on pattern matching to go
> away and be substituted with instructions which match Rotate instructions
> (migrating ).
> 
> I understand that it will penalize the vectorization implementation, but
> IMO reducing overall complexity is worth it. On auto-vectorizer side, I see
> 2 ways to fix it:
> 
>    (1) introduce additional AD instructions for RotateLeftV/RotateRightV
> specifically for pre-AVX512 hardware;
> 
>    (2) in SuperWord::output(), when matcher doesn't support
> RotateLeftV/RotateRightV nodes (Matcher::match_rule_supported()),
> generate vectorized version of the original pattern.
> 
> Overall, it looks like more and more focus is made on scalar part.
> Considering the main goal of the patch is to enable vectorization, I'm fine
> with separating cleanup of scalar part. As an interim solution, it seems
> that leaving the scalar part as it is now and matching scalar bit rotate
> pattern in VectorNode::is_rotate() should be enough to keep the
> vectorization part functioning. Then scalar Rotate nodes and relevant
> cleanups can be integrated later. (Or vice versa: clean up scalar part
> first and then follow up with vectorization.)
> 
> Some other comments:
> 
> * There's a lot of duplication between OrINode::Ideal and OrLNode::Ideal.
> What do you think about introducing a super type
> (OrNode) and put a unified version (OrNode::Ideal) there?
> 
> 
> * src/hotspot/cpu/x86/x86.ad
> 
> +instruct vprotate_immI8(vec dst, vec src, immI8 shift) %{
> +  predicate(n->bottom_type()->is_vect()->element_basic_type() == T_INT ||
> +            n->bottom_type()->is_vect()->element_basic_type() ==
> +T_LONG);
> 
> +instruct vprorate(vec dst, vec src, vec shift) %{
> +  predicate(n->bottom_type()->is_vect()->element_basic_type() == T_INT ||
> +            n->bottom_type()->is_vect()->element_basic_type() ==
> +T_LONG);
> 
> The predicates are redundant here.
> 
> 
> * src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp
> 
> +void C2_MacroAssembler::vprotate_imm(int opcode, BasicType etype, XMMRegister dst, XMMRegister src,
> +                                     int shift, int vector_len) {
> +  if (opcode == Op_RotateLeftV) {
> +    if (etype == T_INT) {
> +      evprold(dst, src, shift, vector_len);
> +    } else {
> +      evprolq(dst, src, shift, vector_len);
> +    }
> 
> Please, put an assert for the false case (assert(etype == T_LONG, "...")).
> 
> 
> * On testing (with previous version of the patch): -XX:UseAVX is x86-
> specific flag, so new/adjusted tests now fail on non-x86 platforms.
> Either omitting the flag or adding -XX:+IgnoreUnrecognizedVMOptions will
> solve the issue.
> 
> Best regards,
> Vladimir Ivanov
> 
> >
> >
> > Summary of changes:
> > 1) Optimization is specifically targeted to exploit vector rotation
> > instruction added for X86 AVX512. A single rotate instruction encapsulates
> > entire vector OR/SHIFTs pattern thus offers better latency at reduced
> > instruction count.
> >
> > 2) There were two approaches to implement this:
> >      a)  Let everything remain the same and add new wide complex
> instruction patterns in the matcher for e.g.
> >           set Dst ( OrV (Binary (LShiftVI dst (Binary ReplicateI shift))
> (URShiftVI dst (Binary (SubI (Binary ReplicateI 32) ( Replicate shift))
> >      It would have been an overoptimistic assumption to expect that graph
> shape would be preserved till the matcher for correct inferencing.
> >      In addition we would have required multiple such bulky patterns.
> >      b) Create new RotateLeft/RotateRight scalar nodes, these gets
> generated during intrinsification as well as during additional pattern
> >      matching during node Idealization, later on these nodes are consumed
> by SLP for valid vectorization scenarios to emit their vector
> >      counterparts which eventually emits vector rotates.
> >
> > 3) I chose approach 2b) since it's cleaner. The only problem here was that
> > in non-EVEX mode (UseAVX < 3) new scalar Rotate nodes should either be
> > dismantled back to the OR/SHIFT pattern or we penalize the vectorization,
> > which would be very costly. The other option would have been to add an
> > additional vector rotate pattern for UseAVX=3 in the matcher which emits
> > vector OR-SHIFT instructions, but then it would lose the efficient
> > instruction sequence which node sharing (OrV/LShiftV/URShift) offers in the
> > current implementation. Thus it would not be beneficial for non-AVX512
> > targets; the only saving would be the cleanup of a few existing scalar
> > rotate matcher patterns, and old targets do not offer this powerful rotate
> > instruction. Therefore new scalar nodes are created only for AVX512
> > targets.
> >
> > As per suggestions constant folding scenarios have been covered during
> Idealizations of newly added scalar nodes.
> >
> > Please review the latest version and share your feedback and test
> results.
> >
> > Best Regards,
> > Jatin
> >
> >
> >> -----Original Message-----
> >> From: Andrew Haley <aph at redhat.com>
> >> Sent: Saturday, July 11, 2020 2:24 PM
> >> To: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>; Bhateja, Jatin
> >> <jatin.bhateja at intel.com>; hotspot-compiler-dev at openjdk.java.net
> >> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>
> >> Subject: Re: 8248830 : RFR[S] : C2 : Rotate API intrinsification for
> >> X86
> >>
> >> On 10/07/2020 18:32, Vladimir Ivanov wrote:
> >>
> >>   > High-level comment: so far, there were no pressing need in
> >>   > explicitly marking the methods as intrinsics. ROR/ROL instructions
> >>   > were selected during matching [1]. Now the patch introduces
> >>   > dedicated nodes (RotateLeft/RotateRight) specifically for intrinsics
> >>   > which partly duplicates existing logic.
> >>
> >> The lack of rotate nodes in the IR has always meant that AArch64
> >> doesn't generate optimal code for e.g.
> >>
> >>     (Set dst (XorL reg1 (RotateLeftL reg2 imm)))
> >>
> >> because, with the RotateLeft expanded to its full combination of ORs
> >> and shifts, it's too complicated to match. At the time I put this to
> >> one side because it wasn't urgent. This is a shame because although
> >> such combinations are unusual they are used in some crypto operations.
> >>
> >> If we can generate immediate-form rotate nodes early by pattern
> >> matching during parsing (rather than depending on intrinsics) we'll
> >> get more value than by depending on programmers calling intrinsics.
> >>
> >> --
> >> Andrew Haley  (he/him)
> >> Java Platform Lead Engineer
> >> Red Hat UK Ltd. <https://www.redhat.com>
> >> https://keybase.io/andrewhaley
> >> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
> >

From coleen.phillimore at oracle.com  Wed Jul 22 12:25:13 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Wed, 22 Jul 2020 08:25:13 -0400
Subject: RFR (M) 8249650: Optimize JNIHandle::make_local thread variable
 usage
In-Reply-To: <4ca86ddb-8a73-783c-0b3f-e8003f7160a3@oracle.com>
References: <8410d4a2-bbad-090f-55bf-88940f786781@oracle.com>
 <f5726b31-c23e-f76f-aa0e-68f1599e3944@oracle.com>
 <0590E210-6F23-4498-A51A-C3DAEF54B5AB@oracle.com>
 <6166e191-c954-70e5-5595-956a0c145d10@oracle.com>
 <82ac807a-1492-9ac0-570a-d08b1dc93e09@oracle.com>
 <4ca86ddb-8a73-783c-0b3f-e8003f7160a3@oracle.com>
Message-ID: <e6a0a004-7805-7985-d844-5a2e74cf0814@oracle.com>

Ok, looks good to me.
Coleen

On 7/21/20 10:46 PM, David Holmes wrote:
> Hi Coleen,
>
> On 22/07/2020 4:01 am, coleen.phillimore at oracle.com wrote:
>>
>> This looks like a nice cleanup.
>
> Thanks for looking at this.
>
>> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/src/hotspot/share/runtime/jniHandles.cpp.udiff.html 
>>
>>
>> I'm wondering why you took out the NULL return for make_local() 
>> without a thread argument? Here you may call Thread::current()
>> unnecessarily.
>>
>>   jobject JNIHandles::make_local(oop obj) {
>> - if (obj == NULL) {
>> - return NULL; // ignore null handles
>> - } else {
>> - Thread* thread = Thread::current();
>> - assert(oopDesc::is_oop(obj), "not an oop");
>> - assert(!current_thread_in_native(), "must not be in native");
>> - return thread->active_handles()->allocate_handle(obj);
>> - }
>> + return make_local(Thread::current(), obj);
>>   }
>
> I was simply using a standard call forwarding pattern to avoid code 
> duplication. I suspect passing NULL is very rare so the unnecessary 
> Thread::current() call is not an issue. Otherwise, if not NULL, the 
> NULL check would happen twice (unless I keep the duplicated 
> implementations).
>
>> Beyond the scope of this fix, but it'd be cool to not have a version 
>> that doesn't take thread, since there may be many more callers that 
>> already have Thread::current().
>
> Indeed! And in fact I had missed a number of these in jvm.cpp and 
> jni.cpp so I have fixed those. I've filed a RFE for other cases:
>
> https://bugs.openjdk.java.net/browse/JDK-8249837
>
> Updated webrev:
>
> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v3/
>
> If this passes tier 1-3 re-testing then I plan to push.
>
> Thanks,
> David
> -----
>
>> Coleen
>>
>>
>> On 7/20/20 1:53 AM, David Holmes wrote:
>>> Hi Kim,
>>>
>>> Thanks for looking at this.
>>>
>>> Updated webrev at:
>>>
>>> http://cr.openjdk.java.net/~dholmes/8249650/webrev.v2/
>>>
>>> On 20/07/2020 3:22 pm, Kim Barrett wrote:
>>>>> On Jul 20, 2020, at 12:16 AM, David Holmes 
>>>>> <david.holmes at oracle.com> wrote:
>>>>>
>>>>> Subject line got truncated by accident ...
>>>>>
>>>>> On 20/07/2020 11:06 am, David Holmes wrote:
>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8249650
>>>>>> webrev: http://cr.openjdk.java.net/~dholmes/8249650/webrev/
>>>>>> This is a simple cleanup that touches files across a number of VM 
>>>>>> areas - hence the cross-post.
>>>>>> Whilst working on a different JNI fix I noticed that in most 
>>>>>> cases in jni.cpp we were using the following form of make_local:
>>>>>> JNIHandles::make_local(env, obj);
>>>>>> and what that form does is first extract the thread from the JNIEnv:
>>>>>> JavaThread* thread = JavaThread::thread_from_jni_environment(env);
>>>>>> return thread->active_handles()->allocate_handle(obj);
>>>>>> but there is also another, faster, variant for when you already 
>>>>>> have the "thread":
>>>>>> jobject JNIHandles::make_local(Thread* thread, oop obj) {
>>>>>>    return thread->active_handles()->allocate_handle(obj);
>>>>>> }
>>>>>> When you look at the JNI_ENTRY wrapper (and related JVM_ENTRY, 
>>>>>> WB_ENTRY, UNSAFE_ENTRY etc) it has already extracted the thread 
>>>>>> from the JNIEnv:
>>>>>>      JavaThread* thread = JavaThread::thread_from_jni_environment(env);
>>>>>> and further defined:
>>>>>>      Thread* THREAD = thread;
>>>>>> so we always already have direct access to the "thread" available 
>>>>>> (or indirect via TRAPS), and in fact we can end up removing the 
>>>>>> make_local(JNIEnv* env, oop obj) variant altogether.
>>>>>> Along the way I spotted some related issues with unnecessary use 
>>>>>> of Thread::current() when it is already available from TRAPS, and 
>>>>>> some other cases where we extracted the JNIEnv from a thread only 
>>>>>> to later extract the thread from the JNIEnv.
>>>>>> Testing: tiers 1 - 3
>>>>>> Thanks,
>>>>>> David
>>>>>> -----
>>>>
>>>> ------------------------------------------------------------------------------ 
>>>>
>>>> src/hotspot/share/classfile/javaClasses.cpp
>>>>   439     JNIEnv *env = thread->jni_environment();
>>>>
>>>> Since env is no longer used on the next line, move this down to where
>>>> it is used, at line 444.
>>>
>>> Fixed.
>>>
>>>> ------------------------------------------------------------------------------ 
>>>>
>>>> src/hotspot/share/classfile/verifier.cpp
>>>>   299   JNIEnv *env = thread->jni_environment();
>>>>
>>>> env now seems to only be used at line 320. Move this closer.
>>>
>>> Fixed.
>>>
>>>> ------------------------------------------------------------------------------ 
>>>>
>>>> src/hotspot/share/prims/jni.cpp
>>>>   743     result = JNIHandles::make_local(THREAD, result_handle());
>>>>
>>>> jni_PopLocalFrame is now using a mix of "thread" and "THREAD", where
>>>> previously it just used "thread". Maybe this change shouldn't be made?
>>>> Or can the other uses be changed to THREAD for consistency?
>>>
>>> "thread" and "THREAD" are interchangeable for anything expecting a 
>>> "Thread*" (and somewhat surprisingly a number of API's that only 
>>> work for JavaThreads actually take a Thread*. :( ). I had choice 
>>> between trying to be file-wide consistent with the make_local calls, 
>>> versus local-code consistent, and used THREAD as it is available in 
>>> both JNI_ENTRY and via TRAPS. But I can certainly make a local 
>>> change to "thread" for local consistency.
>>>
>>>> ------------------------------------------------------------------------------ 
>>>>
>>>> src/hotspot/share/prims/jvm.cpp
>>>>
>>>> The calls to JvmtiExport::post_vm_object_alloc have to use "thread"
>>>> instead of "THREAD", even though other places nearby are using
>>>> "THREAD". That inconsistency is kind of unfortunate, but doesn't seem
>>>> easily avoidable.
>>>
>>> Everything that uses THREAD in a JVM_ENTRY method can be changed to 
>>> use "thread" instead. But I'm not sure it's a consistency worth 
>>> pursuing at least as part of these changes (there are likely similar 
>>> issues with most of the touched files).
>>>
>>> Thanks,
>>> David
>>>
>>>> ------------------------------------------------------------------------------ 
>>>>
>>>>
>>
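
The call-forwarding pattern discussed above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the HotSpot code: SketchThread, Obj, handle and current_thread() are hypothetical stand-ins, and placing the null check in the two-argument variant is an assumption based on the remark that the check would otherwise happen twice.

```cpp
#include <cassert>
#include <deque>

// Hypothetical stand-ins for the VM types; not the real HotSpot classes.
struct Obj { int value; };
using oop = Obj*;
using handle = oop*;  // a handle is a pointer to a slot that holds the oop

struct SketchThread {
  // std::deque keeps element addresses stable across push_back.
  std::deque<oop> active_handles;

  handle allocate_handle(oop obj) {
    active_handles.push_back(obj);
    return &active_handles.back();
  }
};

// Stand-in for Thread::current(): one SketchThread per OS thread.
inline SketchThread* current_thread() {
  static thread_local SketchThread t;
  return &t;
}

// Variant for callers that already have the thread.
// Assumption: the null check lives here, and only here.
inline handle make_local(SketchThread* thread, oop obj) {
  if (obj == nullptr) {
    return nullptr;  // ignore null handles
  }
  return thread->allocate_handle(obj);
}

// Variant without a thread argument: it forwards to the one above.
// For a null obj the current_thread() lookup is wasted work, which is
// the trade-off raised in the review.
inline handle make_local(oop obj) {
  return make_local(current_thread(), obj);
}
```

With this shape the null check exists in exactly one place, at the cost of one unnecessary thread lookup when a null oop is passed to the single-argument variant.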


From boris.ulasevich at bell-sw.com  Wed Jul 22 13:36:34 2020
From: boris.ulasevich at bell-sw.com (Boris Ulasevich)
Date: Wed, 22 Jul 2020 16:36:34 +0300
Subject: RFR 8249189: AARCH64: more L2I conversions can be skipped (ubfiz)
Message-ID: <209c5713-4218-4e9c-037d-fe337734697f@bell-sw.com>

Hi,

Please review the update for aarch64 AD template file to generate more
bitfield extraction rules where I2L and L2I conversions can be skipped.

http://cr.openjdk.java.net/~bulasevich/8249189/webrev.02
http://bugs.openjdk.java.net/browse/JDK-8249189

Tested with JTREG and manual [1] tests.

thanks,
Boris

[1] 
http://cr.openjdk.java.net/~bulasevich/8249189/webrev.02/TestConversionSkip.java

From bob.vandette at oracle.com  Wed Jul 22 15:07:53 2020
From: bob.vandette at oracle.com (Bob Vandette)
Date: Wed, 22 Jul 2020 11:07:53 -0400
Subject: RFR: 8249880 - JVMCI calling register_nmethod without CodeCache lock
Message-ID: <07AE1117-D70D-4CE5-A636-3B0C789E0555@oracle.com>

Please review this fix which adds a CodeCache lock around registering an nmethod with
the collector.   This is causing a guarantee to fire when the nmethod sweeper runs during
a CompileTheWorld test.

BUG:
https://bugs.openjdk.java.net/browse/JDK-8249880

PATCH:

diff --git a/src/hotspot/share/jvmci/jvmciRuntime.cpp b/src/hotspot/share/jvmci/jvmciRuntime.cpp 
--- a/src/hotspot/share/jvmci/jvmciRuntime.cpp 
+++ b/src/hotspot/share/jvmci/jvmciRuntime.cpp 
@@ -668,6 +668,7 @@ 
  
   // Since we've patched some oops in the nmethod, 
   // (re)register it with the heap. 
+ MutexLocker ml(CodeCache_lock, Mutex::_no_safepoint_check_flag); 
   Universe::heap()->register_nmethod(nm); 
 } 

Bob.
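
For readers unfamiliar with the idiom, the patch relies on a scoped lock: the MutexLocker object acquires CodeCache_lock in its constructor and releases it when it goes out of scope at the end of the function. A minimal sketch of the same idea with the standard library is shown below. SketchCodeCache and patch_and_register are hypothetical names, and std::lock_guard has no analogue of Mutex::_no_safepoint_check_flag, so this only illustrates the scoping and not the VM's locking rules.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical stand-in for the code cache and the heap's nmethod registry.
struct SketchCodeCache {
  std::mutex lock;              // plays the role of CodeCache_lock
  std::vector<int> registered;  // ids of registered nmethods

  // Precondition (by convention in this sketch): the caller holds `lock`.
  void register_nmethod_locked(int nmethod_id) {
    registered.push_back(nmethod_id);
  }
};

inline void patch_and_register(SketchCodeCache& cache, int nmethod_id) {
  // ... oops in the nmethod would be patched here, without the lock ...

  // The guard takes the lock here and releases it when the function
  // returns, including on early exit or exception.
  std::lock_guard<std::mutex> guard(cache.lock);
  cache.register_nmethod_locked(nmethod_id);
}
```

Placing the guard directly before the registration call keeps the critical section as small as the one in the patch.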


From erik.osterlund at oracle.com  Wed Jul 22 15:12:41 2020
From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=)
Date: Wed, 22 Jul 2020 17:12:41 +0200
Subject: RFR: 8249880 - JVMCI calling register_nmethod without CodeCache
 lock
In-Reply-To: <07AE1117-D70D-4CE5-A636-3B0C789E0555@oracle.com>
References: <07AE1117-D70D-4CE5-A636-3B0C789E0555@oracle.com>
Message-ID: <6723fc8c-b25e-1545-e653-8851f746b7a9@oracle.com>

Hi Bob,

Looks good.

Thanks,
/Erik

On 2020-07-22 17:07, Bob Vandette wrote:
> Please review this fix which adds a CodeCache lock around registering an nmethod with
> the collector.   This is causing a guarantee to fire when the nmethod sweeper runs during
> a CompileTheWorld test.
>
> BUG:
> https://bugs.openjdk.java.net/browse/JDK-8249880
>
> PATCH:
>
> diff --git a/src/hotspot/share/jvmci/jvmciRuntime.cpp b/src/hotspot/share/jvmci/jvmciRuntime.cpp
> --- a/src/hotspot/share/jvmci/jvmciRuntime.cpp
> +++ b/src/hotspot/share/jvmci/jvmciRuntime.cpp
> @@ -668,6 +668,7 @@
>    
>     // Since we've patched some oops in the nmethod,
>     // (re)register it with the heap.
> + MutexLocker ml(CodeCache_lock, Mutex::_no_safepoint_check_flag);
>     Universe::heap()->register_nmethod(nm);
>   }
>
> Bob.
>


From doug.simon at oracle.com  Wed Jul 22 15:53:05 2020
From: doug.simon at oracle.com (Doug Simon)
Date: Wed, 22 Jul 2020 17:53:05 +0200
Subject: RFR: 8249888: failure to create a libgraal JavaVM should result in a
 VM crash
Message-ID: <4ED30258-0888-4D6B-867E-6CC5DB4159E4@oracle.com>

Please review this enhancement which improves the debuggability of certain libgraal isolate creation issues.

BUG:
https://bugs.openjdk.java.net/browse/JDK-8249888

PATCH:

diff -r 8995e9efdee7 src/hotspot/share/jvmci/jvmciRuntime.cpp
--- a/src/hotspot/share/jvmci/jvmciRuntime.cpp	Wed Jul 22 17:08:01 2020 +0200
+++ b/src/hotspot/share/jvmci/jvmciRuntime.cpp	Wed Jul 22 17:41:06 2020 +0200
@@ -800,7 +800,7 @@
 
     JNI_CreateJavaVM = CAST_TO_FN_PTR(JNI_CreateJavaVM_t, os::dll_lookup(sl_handle, "JNI_CreateJavaVM"));
     if (JNI_CreateJavaVM == NULL) {
-      vm_exit_during_initialization("Unable to find JNI_CreateJavaVM", sl_path);
+      fatal("Unable to find JNI_CreateJavaVM in %s", sl_path);
     }
 
     ResourceMark rm;
@@ -835,7 +835,7 @@
       JVMCI_event_1("created JavaVM[%ld]@" PTR_FORMAT " for JVMCI runtime %d", javaVM_id, p2i(javaVM), _id);
       return env;
     } else {
-      vm_exit_during_initialization(err_msg("JNI_CreateJavaVM failed with return value %d", result), sl_path);
+      fatal("JNI_CreateJavaVM failed with return value %d", result);
     }
   }
   return NULL;

-Doug
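
Besides changing how the VM terminates, the patch moves from a fixed message plus a separate info string to a single printf-style format. The sketch below only illustrates that message construction. format_fatal_message is a hypothetical helper, not a HotSpot function; it does not abort or write an error report, and the library path in the usage note is made up.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical helper: builds the text a printf-style fatal(...) call
// would report. Output longer than the buffer is truncated.
inline std::string format_fatal_message(const char* format, ...) {
  char buffer[256];
  va_list args;
  va_start(args, format);
  std::vsnprintf(buffer, sizeof(buffer), format, args);
  va_end(args);
  return std::string(buffer);
}
```

For example, format_fatal_message("Unable to find JNI_CreateJavaVM in %s", "/tmp/libexample.so") yields one message that names the library, where the old call passed the path as a separate argument.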


From goetz.lindenmaier at sap.com  Wed Jul 22 16:21:38 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Wed, 22 Jul 2020 16:21:38 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <AM0PR0202MB3331445A57DBEC5F24C155649B790@AM0PR0202MB3331.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
 <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
 <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331EEF36942FCEBA7E131389BCB0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB329746F57D1C78F14000CB799AC80@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331D64C693490FD0746D1989BC90@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <DB6PR0201MB2152AF18921A375D26A76D89ECA40@DB6PR0201MB2152.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331FF18BED42A71796488E59B600@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM4PR0202MB29641555B86889D51E08441BEC7F0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331445A57DBEC5F24C155649B790@AM0PR0202MB3331.eurprd02.prod.outlook.com>
Message-ID: <AM4PR0202MB29648700486756F4DA6ED521EC790@AM4PR0202MB2964.eurprd02.prod.outlook.com>

Hi Richard,

Thanks for the quick reply.

> > >   With DeoptimizeObjectsALot enabled internal threads are started that
> > > deoptimize frames and
> > >   objects. The number of threads started are given with
> > > DeoptimizeObjectsALotThreadCountAll and
> > >   DeoptimizeObjectsALotThreadCountSingle. The former targets all
> existing
> > > threads whereas the
> > >   latter operates on a single thread selected round robin.
> > >
> > >   I removed the mode where deoptimizations were performed at every nth
> > > exit from the runtime. I never used it.
> 
> > Do I get it right? You have a n:1 and a n:all test scenario.
> >  n:1: n threads deoptimize 1 Java thread    where n => DOALThreadCountSingle
> >  n:m: n threads deoptimize all Java threads where n = DOALThreadCountAll?
> 
> Not quite.
> 
> -XX:+DeoptimizeObjectsALot // required
> -XX:DeoptimizeObjectsALotThreadCountAll=m
> -XX:DeoptimizeObjectsALotThreadCountSingle=n
> 
> Will start m+n threads. Each operating on all existing JavaThreads using
> EscapeBarriers. The
> difference between the 2 thread types is that one distinct EscapeBarrier
> targets either just a
> single thread or all existing threads at once. If just one single thread is
> targeted per
> EscapeBarrier, then it is not always the same thread, but threads are selected
> round robin. So there
> will be n threads selecting independently single threads round robin per
> EscapeBarrier and m threads
> that target all threads in every EscapeBarrier.
Ok, yes, that is how I understood it. 
 
> > > * EscapeBarrier::sync_and_suspend_one(): use a direct handshake and
> > > execute it always independently
> > >   of is_thread_fully_suspended().
> > Is this also a performance optimization?
> 
> Maybe a minor one.
OK

> > > * JavaThread::wait_for_object_deoptimization():
> > >   - Bugfix: the last check of is_obj_deopt_suspend() must be /after/ the
> > > safepoint check! This
> > >     caused issues with not walkable stacks with DeoptimizeObjectsALot.
> > OK. As I understand, there was one safepoint check in the old version,
> > now there is one in each iteration.  I assume this is intended, right?
> 
> Yes it is. The important thing here is (A) a safepoint check is needed /after/
> leaving a safe state
> (_thread_in_native, _thread_blocked). (B) Shared variables that are modified
> at safepoints or with handshakes need to be reread /after/ the safepoint check.
> 
> BTW: I only noticed now that since JDK-8240918 JavaThreads themselves
> must disarm their polling
> page. Originally (before handshakes) this was done by the VM thread. With
> handshakes it was done by
> the thread executing the handshake op. This was changed for
> OrderAccess::cross_modify_fence() where
> the poll is left armed if the thread is in native and since JDK-8240918 it is
> always left armed. So
> when a thread leaves a safe state (native, blocked) and there was a
> handshake/vm op, it will always
> call SafepointMechanism::block_if_requested_slow(), even if the
> handshake/vm operation have been
> processed already and everybody else is happily executing bytecodes :)
Ok.

> Still (A) and (B) hold.

> > >   - Added limited spinning inspired by HandshakeSpinYield to fix regression in
> > > microbenchmark [1]
> > Ok.  Nice improvement, nice catch!
> 
> Yes. It certainly took some time to find out.
> 
> > >
> > > I refer to some more changes answering your questions and comments
> inline
> > > below.
> > >
> > > Thanks,
> > > Richard.
> > >
> > > [1] Microbenchmark:
> > >
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/
> > >
> 
> 
> > > > I understand you annotate at safepoints where the escape analysis
> > > > finds out that an object is "better" than global escape.
> > > > These are the cases where the analysis identifies optimization
> > > > opportunities. These annotations are then used to deoptimize
> > > > frames and the objects referenced by them.
> > > > Doesn't this overestimate the optimized
> > > > objects?  E.g., eliminate_alloc_node has many cases where it bails
> > > > out.
> > >
> > > Yes, the implementation is conservative, but it is comparatively simple
> and
> > > the additional debug
> > > info is just 2 flags per safepoint.
> > Thanks. It also helped that you explained to me offline that
> > there are more optimizations than only lock elimination and scalar
> > replacement done based on the ea information.
> > The ea refines the IR graph which allows follow up optimizations
> > which can not easily be tracked back to the escaping objects or
> > the call sites where they do not escape.
> > Thus, if there are non-global escaping objects, you have to
> > deoptimize the frame.
> > Did I repeat that correctly?
> 
> Mostly, but there are also cases where deoptimization is required if and only
> if ea-local objects
> are passed as arguments. This is the case when values are not read directly
> from a frame, but from a callee frame.
Hmm, don't get this completely, but ok.
  
> > > Accesses to instance
> > > members or array elements can be optimized as well.
> > You mean the compiler can/will ignore volatile or memory ordering
> > requirements for non-escaping objects? Sounds reasonable to do.
> 
> Yes, for instance. Also without volatile modifiers it will eliminate accesses.
> Here is an example:
> Method A has a NoEscape allocation O that is not scalar replaced. A calls
> Method B, which is not
> inlined. When you use your debugger to break in B, then modify a field of O,
> then this modification
> would have no effect without deoptimization, because the jit assumes that B
> cannot modify O without
> a reference to it.
Yes, A can keep O in a register, while the JVMTI thread would write to 
the location in the stack where the local is held (if it was written back).

> > > > Synchronization: looks good. I think others had a look at this before.
> > > >
> > > > EscapeBarrier::deoptimize_objects_internal()
> > > >   The method name is misleading, it is not used by
> > > >   deoptimize_objects().
> > > >   Also, a method with the same name is in Deoptimization.
> > > >   Proposal: deoptimize_objects_thread() ?
> > >
> > > Sorry, but I don't see, why it would be misleading.
> > > What would be the meaning of 'deoptimize_objects_thread'? I don't
> > > understand that name.
> > 1. I have no idea why it's called "_internal". Because it is private?
> >    By the name, I would expect that EscapeBarrier::deoptimize_objects()
> >    calls it for some internal tasks. But it does not.
> 
> Well, I'd say it is pretty internal, what's happening in that method. So IMHO
> the suffix _internal
> is a match.
> 
> > 2. My proposal: deoptimize_objects_all_threads() iterates all threads
> > and calls deoptimize_objects(_one)_thread(thread) for each of these.
> > That's how I would have named it.
> > But no bike shedding, if you don't see what I mean it's not obvious.
> Ok. We could have a quick call, too, if you like.

Ok, I think I have understood the remaining points.  I'm fine with this 
so far.

Thanks,
  Goetz.
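
The thread setup Richard describes above can be summarized in a small sketch. It is an illustration under stated assumptions, not the VM implementation: RoundRobinSelector and total_deoptimizer_threads are hypothetical names, and threads are represented by plain indices. Each "Single"-type deoptimizer thread would own one selector and advance it once per EscapeBarrier, while "All"-type threads need no selector because every barrier targets all threads.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical per-deoptimizer state for the "Single" thread type: each
// barrier targets one thread, and the target advances round robin.
class RoundRobinSelector {
 public:
  explicit RoundRobinSelector(std::size_t start = 0) : next_(start) {}

  // Returns the index of the thread to target for the next barrier.
  // Precondition: thread_count > 0. The modulo also handles the case
  // where the number of live threads shrank since the last call.
  std::size_t select(std::size_t thread_count) {
    std::size_t target = next_ % thread_count;
    next_ = target + 1;
    return target;
  }

 private:
  std::size_t next_;
};

// With DeoptimizeObjectsALotThreadCountAll=m and
// DeoptimizeObjectsALotThreadCountSingle=n, m + n threads are started.
inline std::size_t total_deoptimizer_threads(std::size_t count_all,
                                             std::size_t count_single) {
  return count_all + count_single;
}
```

Because every "Single" thread keeps its own selector, the n threads pick their targets independently of each other, which matches the description that they select "independently single threads round robin".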


From vladimir.kozlov at oracle.com  Wed Jul 22 17:40:18 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 22 Jul 2020 10:40:18 -0700
Subject: RFR: 8249880 - JVMCI calling register_nmethod without CodeCache
 lock
In-Reply-To: <6723fc8c-b25e-1545-e653-8851f746b7a9@oracle.com>
References: <07AE1117-D70D-4CE5-A636-3B0C789E0555@oracle.com>
 <6723fc8c-b25e-1545-e653-8851f746b7a9@oracle.com>
Message-ID: <3625530e-4e48-8346-0ae4-86c88a2778a2@oracle.com>

+1

Thanks,
Vladimir

On 7/22/20 8:12 AM, Erik Österlund wrote:
> Hi Bob,
> 
> Looks good.
> 
> Thanks,
> /Erik
> 
> On 2020-07-22 17:07, Bob Vandette wrote:
>> Please review this fix which adds a CodeCache lock around registering an nmethod with
>> the collector.   This is causing a guarantee to fire when the nmethod sweeper runs during
>> a CompileTheWorld test.
>>
>> BUG:
>> https://bugs.openjdk.java.net/browse/JDK-8249880
>>
>> PATCH:
>>
>> diff --git a/src/hotspot/share/jvmci/jvmciRuntime.cpp b/src/hotspot/share/jvmci/jvmciRuntime.cpp
>> --- a/src/hotspot/share/jvmci/jvmciRuntime.cpp
>> +++ b/src/hotspot/share/jvmci/jvmciRuntime.cpp
>> @@ -668,6 +668,7 @@
>>     // Since we've patched some oops in the nmethod,
>>     // (re)register it with the heap.
>> + MutexLocker ml(CodeCache_lock, Mutex::_no_safepoint_check_flag);
>>     Universe::heap()->register_nmethod(nm);
>>   }
>>
>> Bob.
>>
> 

From tom.rodriguez at oracle.com  Wed Jul 22 17:56:11 2020
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Wed, 22 Jul 2020 10:56:11 -0700
Subject: RFR: 8249880 - JVMCI calling register_nmethod without CodeCache
 lock
In-Reply-To: <07AE1117-D70D-4CE5-A636-3B0C789E0555@oracle.com>
References: <07AE1117-D70D-4CE5-A636-3B0C789E0555@oracle.com>
Message-ID: <9951f7db-24fe-c1c5-26e8-a5174ca57f79@oracle.com>

Looks good.

tom

Bob Vandette wrote on 7/22/20 8:07 AM:
> Please review this fix which adds a CodeCache lock around registering an nmethod with
> the collector.   This is causing a guarantee to fire when the nmethod sweeper runs during
> a CompileTheWorld test.
> 
> BUG:
> https://bugs.openjdk.java.net/browse/JDK-8249880
> 
> PATCH:
> 
> diff --git a/src/hotspot/share/jvmci/jvmciRuntime.cpp b/src/hotspot/share/jvmci/jvmciRuntime.cpp
> --- a/src/hotspot/share/jvmci/jvmciRuntime.cpp
> +++ b/src/hotspot/share/jvmci/jvmciRuntime.cpp
> @@ -668,6 +668,7 @@
>    
>     // Since we've patched some oops in the nmethod,
>     // (re)register it with the heap.
> + MutexLocker ml(CodeCache_lock, Mutex::_no_safepoint_check_flag);
>     Universe::heap()->register_nmethod(nm);
>   }
> 
> Bob.
> 

From vladimir.kozlov at oracle.com  Wed Jul 22 18:02:53 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 22 Jul 2020 11:02:53 -0700
Subject: RFR: 8249888: failure to create a libgraal JavaVM should result
 in a VM crash
In-Reply-To: <4ED30258-0888-4D6B-867E-6CC5DB4159E4@oracle.com>
References: <4ED30258-0888-4D6B-867E-6CC5DB4159E4@oracle.com>
Message-ID: <09a740fa-73e8-5e6b-cc92-db382130c486@oracle.com>

Looks good.

Thanks,
Vladimir

On 7/22/20 8:53 AM, Doug Simon wrote:
> Please review this enhancement which improves the debuggability of certain libgraal isolate creation issues.
> 
> BUG:
> https://bugs.openjdk.java.net/browse/JDK-8249888
> 
> PATCH:
> 
> diff -r 8995e9efdee7 src/hotspot/share/jvmci/jvmciRuntime.cpp
> --- a/src/hotspot/share/jvmci/jvmciRuntime.cpp	Wed Jul 22 17:08:01 2020 +0200
> +++ b/src/hotspot/share/jvmci/jvmciRuntime.cpp	Wed Jul 22 17:41:06 2020 +0200
> @@ -800,7 +800,7 @@
>   
>       JNI_CreateJavaVM = CAST_TO_FN_PTR(JNI_CreateJavaVM_t, os::dll_lookup(sl_handle, "JNI_CreateJavaVM"));
>       if (JNI_CreateJavaVM == NULL) {
> -      vm_exit_during_initialization("Unable to find JNI_CreateJavaVM", sl_path);
> +      fatal("Unable to find JNI_CreateJavaVM in %s", sl_path);
>       }
>   
>       ResourceMark rm;
> @@ -835,7 +835,7 @@
>         JVMCI_event_1("created JavaVM[%ld]@" PTR_FORMAT " for JVMCI runtime %d", javaVM_id, p2i(javaVM), _id);
>         return env;
>       } else {
> -      vm_exit_during_initialization(err_msg("JNI_CreateJavaVM failed with return value %d", result), sl_path);
> +      fatal("JNI_CreateJavaVM failed with return value %d", result);
>       }
>     }
>     return NULL;
> 
> -Doug
> 

From tom.rodriguez at oracle.com  Wed Jul 22 18:04:42 2020
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Wed, 22 Jul 2020 11:04:42 -0700
Subject: RFR: 8249888: failure to create a libgraal JavaVM should result
 in a VM crash
In-Reply-To: <4ED30258-0888-4D6B-867E-6CC5DB4159E4@oracle.com>
References: <4ED30258-0888-4D6B-867E-6CC5DB4159E4@oracle.com>
Message-ID: <62657817-db4d-8887-4d90-9aedd530db78@oracle.com>

Looks good.

tom

Doug Simon wrote on 7/22/20 8:53 AM:
> Please review this enhancement which improves the debuggability of certain libgraal isolate creation issues.
> 
> BUG:
> https://bugs.openjdk.java.net/browse/JDK-8249888
> 
> PATCH:
> 
> diff -r 8995e9efdee7 src/hotspot/share/jvmci/jvmciRuntime.cpp
> --- a/src/hotspot/share/jvmci/jvmciRuntime.cpp	Wed Jul 22 17:08:01 2020 +0200
> +++ b/src/hotspot/share/jvmci/jvmciRuntime.cpp	Wed Jul 22 17:41:06 2020 +0200
> @@ -800,7 +800,7 @@
>   
>       JNI_CreateJavaVM = CAST_TO_FN_PTR(JNI_CreateJavaVM_t, os::dll_lookup(sl_handle, "JNI_CreateJavaVM"));
>       if (JNI_CreateJavaVM == NULL) {
> -      vm_exit_during_initialization("Unable to find JNI_CreateJavaVM", sl_path);
> +      fatal("Unable to find JNI_CreateJavaVM in %s", sl_path);
>       }
>   
>       ResourceMark rm;
> @@ -835,7 +835,7 @@
>         JVMCI_event_1("created JavaVM[%ld]@" PTR_FORMAT " for JVMCI runtime %d", javaVM_id, p2i(javaVM), _id);
>         return env;
>       } else {
> -      vm_exit_during_initialization(err_msg("JNI_CreateJavaVM failed with return value %d", result), sl_path);
> +      fatal("JNI_CreateJavaVM failed with return value %d", result);
>       }
>     }
>     return NULL;
> 
> -Doug
> 

From richard.reingruber at sap.com  Wed Jul 22 20:18:23 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Wed, 22 Jul 2020 20:18:23 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <AM4PR0202MB2964FAF58FBD21D6705A4418EC7C0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
 <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
 <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331EEF36942FCEBA7E131389BCB0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB329746F57D1C78F14000CB799AC80@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331D64C693490FD0746D1989BC90@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <DB6PR0201MB2152AF18921A375D26A76D89ECA40@DB6PR0201MB2152.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331FF18BED42A71796488E59B600@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM4PR0202MB29641555B86889D51E08441BEC7F0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <AM4PR0202MB2964FAF58FBD21D6705A4418EC7C0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <AM0PR0202MB333139A9A877B64198E73D0F9B790@AM0PR0202MB3331.eurprd02.prod.outlook.com>

Hi Goetz,

> > I'll answer to the obvious things in this mail now.
> > I'll go through the code thoroughly again and write
> > a review of my findings thereafter.
> As promised a detailed walk-through, but without any major findings:

> c1_IR.hpp: ok
> ci_Env.h|cpp: ok
> compiledMethod.cpp, nmethod.cpp: ok
> debugInfoRec.h|cpp: ok
> scopeDesc.h|cpp ok

> compileBroker.h|cpp: 
> Maybe a bit of documentation how and why you start 
> the threads? I had expected there are two test
> scenarios run after each other, but now I understand 'Single'
> and 'All' run simultaneously.  Well, this really is a stress test!
> Also good the two variants of deoptimization are
> stressed against each other.
> Besides that really nice it's all in one place.

Done.

> rootResolver.cpp: ok
> jvmciCodeInstaller.cpp: ok

> c2compiler.cpp: The essence of this change! Just one line :)
> Great!

:)

> callnode.hpp ok
> escape.h|cpp ok
> macro.cpp 
> I was not that happy with the names saying not_global_escape
> and similar. I now agreed you have to use the terms of the escape
> analysis (NoEscape, ArgEscape) throughout the runtime code. I'm still not happy with 
> the 'not' in the term, I always try to expand the name to some
> sentence with a negated verb, but it makes no sense.
> For example, "has_not_global_escape_in_scope" expands to 
> "Hasn't a global escape in its scope." in my thinking, which makes 
> no sense. You probably mean
> "Has not-global escape in its scope." or "Has {ArgEscape|NoEscape} 
> in its scope."

> C2 is using the word "non" in this context, e.g., here 
> alloc->is_non_escaping.

There is also ConnectionGraph::not_global_escape()

> non obviously negates the adjective 'global',
> non-global or nonglobal even is an English term I find in the 
> net. 
> So what about "has_non_global_escape_in_scope?"

And what about has_ea_local_in_scope?

> matcher.cpp ok

> output.cpp:1071
> Please break the long line.

Done.

> jvmtiCodeBlobEvents.cpp ok

> jvmtiEnv.cpp
> MaxJavaStackTraceDepth is only documented to affect
> the exceptions stack trace depth, not to limit jvmti 
> operations. Therefore I wondered why it is used here. 
> None of your business, but the flag should
> document this in globals.hpp, too.  
> Does jvmti specify that the same limits are used ...?
> ok on your side.

I don't know and didn't find anything in a quick search.

> jvmtiEnvBase.cpp  ok
> jvmtiImpl.h|cpp  ok
> jvmtiTagMap.cpp ok
> whitebox.cpp ok

> deoptimization.cpp

> line 177: Please break line
> line 246, 281: Please break line
> 1578, 1583, 1589, 1632, 1649, 1651 Break line

> 1651: You use 'non'-terms, too: non-escaping :)

I know :) At least here it is wrong I'd say. "...has to be a not escaping obj..." sounds better
(hopefully not only to my german ears).

> 2805, 2929, 2946ff, break lines

> deoptimization.hpp

> 158, 174, 176 ... I would break lines too, but here you are in
> good company :)

Done.

> globals.hpp ok
> mutexLocker.h|cpp ok
> objectMonitor.cpp ok

> thread.cpp 

> 2631 typo: sapfepont --> safepoint

Done.

> thread.hpp ok
> thread.inline.hpp ok
> vframe.cpp ok
> vframe_hp.cpp   458ff break lines
> vframe_hp.hpp ok
> macros.hpp ok
> TEST.ROOT ok
> WhiteBox.java ok

> IterateHeapWithEscapeAnalysisEnabled.java

> line 415:
> msg("wait until target thread has set testMethod_result");
> while (testMethod_result == 0) {
>     Thread.sleep(50);
> }
> Might the test run into timeouts at this place?
> The field is volatile, i.e. it will be reloaded
> in each iteration. But will dontinline_testMethod
> write it back to main memory in time?

You mean, the test could hang in that loop for a couple of minutes? I don't
think so. There are cache coherence protocols in place which will invalidate
stale data very timely.

> libIterateHeapWithEscapeAnalysisEnabled.c ok

> EATests.java

> This is a very elaborate test.
> I found a row of test cases illustrating issues
> we talked about before. Really helpful!

> 1311: TypeO materialize -> materialized

Found and fixed the typo at line 1369.
(Probably the cursor was on 1311 and your eyes on 1369 ;))

> 1640: setting local variable i triggers always deoptimization
>   --> setting local variable i always triggers deoptimization

Fixed.

> 2176: dontinline_calee --> dontinline_callee
> 2510: poping --> popping  ... but I'm not sure here.

Done.

> https://www.urbandictionary.com/define.php?term=poping
> poping
> Drinking large amounts of Dextromethorphan Hydrobromide (DXM)based cough syrup, and then embarking on an adventure while wandering around neighborhoods or parks all night. This is usually done while listening to Punk rock music from a portable jambox. 
> ;)
> Don't do it!

OMG! How come you know?! ;)

> EATestsJVMTI.java

> I think you can just copy this test description into the other
> test. You can have two @test comments, they will be treated
> as separate tests.  The @requires will be evaluated accordingly.
> For an example see 
> test/hotspot/jtreg/runtime/exceptionMsgs/NullPointerException/NullPointerExceptionTest.java
> which has two different compile setups for the test class (-g).

Done.

> so, that's it for reading code ...


> Some general remarks, maybe a bit picky ...:
> I think you could use less commas ',' in comments.
> As I understand, you need a comma if the relative
> sentence is at the beginning, but not if it is at 
> the end:
>   If Corona is over, I go to the office.
> but
>   I go to the office if Corona is over.

That seems to be correct except "If Corona is over" isn't a relative sentence
but a conditional sentence, isn't it?

The general rule seems to be: the subordinate clause is separated with a comma
from a following main clause. No comma separation is needed if the subordinate
clause follows the main clause.

Thanks, that's a lesson I learned!

> I think the same holds for 'because', 'while' etc.
> E.g., jvmtiEnvBase.cpp:1313, jvmtiImpl.cpp:646ff, 
> vframe_hp.hpp 104ff

Ok. I've removed quite a lot of the occurrences.

> Also, I like full sentences in comments.  
> Especially for me as foreign speaker, this makes
> things much more clear. I.e., I try to make it
> a real sentence with articles, capitalized and a
> dot at the end if there is a subject and a verb
> in first place.
> E.g., jvmtiEnvBase.cpp:1327

Are you referring to the following?
(from http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6/src/hotspot/share/prims/jvmtiEnvBase.cpp.frames.html)

1326 
1327   // If the frame is a compiled one, need to deoptimize it.
1328   if (vf->is_compiled_frame()) {

This line 1327 is preexisting.

> In many places, your comments read really 
> well but some are quite abbreviated I think.

Yeah, but not only because I'm lazy... It is the style that I prefer and I think
it matches the surrounding code quite well.

> E.g. thread.cpp:2601 is an example where a simple
> 'a' helps a lot.
> "Single deoptimization is typically very short."
> I would add 'A': "A single deoptimization is typically very short (fast?)."
> Another meaning of the comment I first considered is this:
> "Single deoptimization is typically very short, all_threads deoptimization takes longer"
> having in mind the functions
> EscapeBarrier::deoptimize_objects_all_threads()
> and
> EscapeBarrier::deoptimize_objects() doing a single thread.
> German with its compound nouns is helpful here :)

> Einzeldeoptimierung <--> eine einzelne Deoptimierung

I've added the 'A' and I'll try to use complete sentences in the future. The
telegram style has advantages, too, though ;)

Thanks!

Cheers, Richard.

-----Original Message-----
From: Lindenmaier, Goetz <goetz.lindenmaier at sap.com> 
Sent: Freitag, 17. Juli 2020 14:31
To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi Richard,

> I'll answer to the obvious things in this mail now.
> I'll go through the code thoroughly again and write
> a review of my findings thereafter.
As promised, a detailed walk-through, but without any major findings:

c1_IR.hpp: ok
ci_Env.h|cpp: ok
compiledMethod.cpp, nmethod.cpp: ok
debugInfoRec.h|cpp: ok
scopeDesc.h|cpp ok

compileBroker.h|cpp: 
Maybe a bit of documentation how and why you start 
the threads? I had expected there are two test
scenarios run after each other, but now I understand 'Single'
and 'All' run simultaneously.  Well, this really is a stress test!
It is also good that the two variants of deoptimization are
stressed against each other.
Besides that, it is really nice that it's all in one place.

rootResolver.cpp: ok
jvmciCodeInstaller.cpp: ok

c2compiler.cpp: The essence of this change! Just one line :)
Great!

callnode.hpp ok
escape.h|cpp ok
macro.cpp 
I was not that happy with the names saying not_global_escape
and similar. I now agree you have to use the terms of the escape
analysis (NoEscape, ArgEscape) throughout the runtime code. I'm still not happy with 
the 'not' in the term, I always try to expand the name to some
sentence with a negated verb, but it makes no sense.
For example, "has_not_global_escape_in_scope" expands to 
"Hasn't a global escape in its scope." in my thinking, which makes 
no sense. You probably mean
"Has not-global escape in its scope." or "Has {ArgEscape|NoEscape} 
in its scope."

C2 is using the word "non" in this context, e.g., here 
alloc->is_non_escaping.

'non' obviously negates the adjective 'global';
non-global or nonglobal is even an English term I find on the
net.
So what about "has_non_global_escape_in_scope?"

matcher.cpp ok

output.cpp:1071
Please break the long line.

jvmtiCodeBlobEvents.cpp ok

jvmtiEnv.cpp
MaxJavaStackTraceDepth is only documented to affect
the exceptions stack trace depth, not to limit jvmti 
operations. Therefore I wondered why it is used here. 
None of your business, but the flag should
document this in globals.hpp, too.  
Does jvmti specify that the same limits are used ...?
ok on your side.

jvmtiEnvBase.cpp  ok
jvmtiImpl.h|cpp  ok
jvmtiTagMap.cpp ok
whitebox.cpp ok

deoptimization.cpp

line 177: Please break line
line 246, 281: Please break line
1578, 1583, 1589, 1632, 1649, 1651 Break line

1651: You use 'non'-terms, too: non-escaping :)

2805, 2929, 2946ff, break lines

deoptimization.hpp

158, 174, 176 ... I would break lines too, but here you are in
good company :)

globals.hpp ok
mutexLocker.h|cpp ok
objectMonitor.cpp ok

thread.cpp 

2631 typo: sapfepont --> safepoint

thread.hpp ok
thread.inline.hpp ok
vframe.cpp ok
vframe_hp.cpp   458ff break lines
vframe_hp.hpp ok
macros.hpp ok
TEST.ROOT ok
WhiteBox.java ok

IterateHeapWithEscapeAnalysisEnabled.java

line 415:
msg("wait until target thread has set testMethod_result");
while (testMethod_result == 0) {
    Thread.sleep(50);
}
Might the test run into timeouts at this place?
The field is volatile, i.e. it will be reloaded
in each iteration. But will dontinline_testMethod
write it back to main memory in time?
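
(Illustration only, not taken from the test: a minimal Java sketch of the
polling pattern in question, assuming the result field is a volatile static.
The class name VolatilePollExample and the field testMethodResult are
hypothetical stand-ins. Under the Java memory model a volatile write
happens-before a later volatile read that observes it, so the loop sees the
update once the writer thread has actually performed the store; what the
sketch cannot show is how long the target thread takes to get there.)

```java
public class VolatilePollExample {
    // Hypothetical stand-in for the test's volatile result field.
    static volatile int testMethodResult = 0;

    // Starts a writer thread, then polls the volatile field until it changes.
    static int waitForResult() {
        testMethodResult = 0;
        // The writer plays the role of the target thread storing its result.
        Thread target = new Thread(() -> testMethodResult = 42);
        target.start();
        try {
            // Each iteration performs a fresh volatile read of the field.
            while (testMethodResult == 0) {
                Thread.sleep(5);
            }
            target.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return testMethodResult;
    }

    public static void main(String[] args) {
        System.out.println("result = " + waitForResult());
    }
}
```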

libIterateHeapWithEscapeAnalysisEnabled.c ok

EATests.java

This is a very elaborate test.
I found a row of test cases illustrating issues
we talked about before. Really helpful!

1311: TypeO materialize -> materialized

1640: setting local variable i triggers always deoptimization
  --> setting local variable i always triggers deoptimization

2176: dontinline_calee --> dontinline_callee
2510: poping --> popping  ... but I'm not sure here.

https://www.urbandictionary.com/define.php?term=poping
poping
Drinking large amounts of Dextromethorphan Hydrobromide (DXM)based cough syrup, and then embarking on an adventure while wandering around neighborhoods or parks all night. This is usually done while listening to Punk rock music from a portable jambox. 
;)
Don't do it!

EATestsJVMTI.java

I think you can just copy this test description into the other
test. You can have two @test comments, they will be treated
as separate tests.  The @requires will be evaluated accordingly.
For an example see 
test/hotspot/jtreg/runtime/exceptionMsgs/NullPointerException/NullPointerExceptionTest.java
which has two different compile setups for the test class (-g).

so, that's it for reading code ...


Some general remarks, maybe a bit picky ...:
I think you could use less commas ',' in comments.
As I understand, you need a comma if the relative
sentence is at the beginning, but not if it is at 
the end:
  If Corona is over, I go to the office.
but
  I go to the office if Corona is over.
I think the same holds for 'because', 'while' etc.
E.g., jvmtiEnvBase.cpp:1313, jvmtiImpl.cpp:646ff, 
vframe_hp.hpp 104ff

Also, I like full sentences in comments.  
Especially for me as foreign speaker, this makes
things much more clear. I.e., I try to make it
a real sentence with articles, capitalized and a
dot at the end if there is a subject and a verb
in first place.
E.g., jvmtiEnvBase.cpp:1327
In many places, your comments read really 
well but some are quite abbreviated I think.

E.g. thread.cpp:2601 is an example where a simple
'a' helps a lot.
"Single deoptimization is typically very short."
I would add 'A': "A single deoptimization is typically very short (fast?)."
Another meaning of the comment I first considered is this:
"Single deoptimization is typically very short, all_threads deoptimization takes longer"
having in mind the functions
EscapeBarrier::deoptimize_objects_all_threads()
and
EscapeBarrier::deoptimize_objects() doing a single thread.
German with its compound nouns is helpful here :)

Einzeldeoptimierung <--> eine einzelne Deoptimierung

Best regards,
  Goetz.


From richard.reingruber at sap.com  Wed Jul 22 20:53:19 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Wed, 22 Jul 2020 20:53:19 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <AM4PR0202MB29648700486756F4DA6ED521EC790@AM4PR0202MB2964.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
 <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
 <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331EEF36942FCEBA7E131389BCB0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB329746F57D1C78F14000CB799AC80@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331D64C693490FD0746D1989BC90@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <DB6PR0201MB2152AF18921A375D26A76D89ECA40@DB6PR0201MB2152.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331FF18BED42A71796488E59B600@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM4PR0202MB29641555B86889D51E08441BEC7F0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331445A57DBEC5F24C155649B790@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM4PR0202MB29648700486756F4DA6ED521EC790@AM4PR0202MB2964.eurprd02.prod.outlook.com>
Message-ID: <AM0PR0202MB3331CCE0DCF038DE3E6838BF9B790@AM0PR0202MB3331.eurprd02.prod.outlook.com>

Hi Goetz,

> Thanks for the quick reply.

Yes, this time it didn't take that long...

[... snip ...]

> > > > > I understand you annotate at safepoints where the escape analysis
> > > > > finds out that an object is "better" than global escape.
> > > > > These are the cases where the analysis identifies optimization
> > > > > opportunities. These annotations are then used to deoptimize
> > > > > frames and the objects referenced by them.
> > > > > Doesn't this overestimate the optimized
> > > > > objects?  E.g., eliminate_alloc_node has many cases where it bails
> > > > > out.
> > > >
> > > > Yes, the implementation is conservative, but it is comparatively simple
> > and
> > > > the additional debug
> > > > info is just 2 flags per safepoint.
> > > Thanks. It also helped that you explained to me offline that
> > > there are more optimizations than only lock elimination and scalar
> > > replacement done based on the ea information.
> > > The ea refines the IR graph which allows follow-up optimizations
> > > which can not easily be tracked back to the escaping objects or
> > > the call sites where they do not escape.
> > > Thus, if there are non-global escaping objects, you have to
> > > deoptimize the frame.
> > > Did I repeat that correctly?
> > 
> > Mostly, but there are also cases where deoptimization is required if and only
> > if ea-local objects
> > are passed as arguments. This is the case when values are not read directly
> > from a frame, but from a callee frame.
> Hmm, don't get this completely, but ok.

Let C be a callee frame of B which is a callee of A. If you use JVMTI to read an
object reference from a local variable of C then the implementation of
JDK-8227745 deoptimizes A if it passes any ea-local as argument, because the
reference could be ea-local in A and there might be optimizations that are
invalid after the escape state change.
  
> > > > Accesses to instance
> > > > members or array elements can be optimized as well.
> > > You mean the compiler can/will ignore volatile or memory ordering
> > > requirements for non-escaping objects? Sounds reasonable to do.
> > 
> > Yes, for instance. Also without volatile modifiers it will eliminate accesses.
> > Here is an example:
> > Method A has a NoEscape allocation O that is not scalar replaced. A calls
> > Method B, which is not
> > inlined. When you use your debugger to break in B, then modify a field of O,
> > then this modification
> > would have no effect without deoptimization, because the jit assumes that B
> > cannot modify O without
> > a reference to it.
> Yes, A can keep O in a register, while the JVMTI thread would write to 
> the location in the stack where the local is held (if it was written back).

Not quite. It is the value of the field of O that is in a register, not the
reference to O itself. The agent changes the field's value in the /java heap/
(remember: O is _not_ scalar replaced), but the field's value is not reloaded
after return from B.
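
A rough Java sketch of that scenario, for illustration only. The names
NoEscapeFieldExample, Box, methodA and calleeB are hypothetical and do not
come from the webrev, and whether C2 really keeps the allocation (rather than
scalar replacing it) in such a tiny method is not claimed here. The point is
only the shape of the code: A holds the only reference to O, B gets none, so
the jit may reuse the value loaded before the call.

```java
public class NoEscapeFieldExample {
    // Hypothetical object type; an instance plays the role of O.
    static final class Box {
        int value;
        Box(int value) { this.value = value; }
    }

    // Plays the role of the non-inlined method B. It receives no reference
    // to the Box, so the jit may assume it cannot modify Box.value.
    // A debugger stopped here could still change the field through JVMTI.
    static int calleeB(int x) {
        return x + 1;
    }

    // Plays the role of method A. The Box never leaves this method.
    static int methodA(int seed) {
        Box o = new Box(seed);
        int before = o.value;     // the field value may now live in a register
        int r = calleeB(before);  // call to B, which has no reference to o
        int after = o.value;      // may be served from the register, no reload
        return r + after;
    }

    public static void main(String[] args) {
        System.out.println(methodA(41));
    }
}
```

Without an agent both reads of o.value agree, so skipping the reload is
harmless. If an agent modifies the field while B is on the stack, only a
deoptimized A would see the new value in 'after'.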

> > > > > Synchronization: looks good. I think others had a look at this before.
> > > > >
> > > > > EscapeBarrier::deoptimize_objects_internal()
> > > > >   The method name is misleading, it is not used by
> > > > >   deoptimize_objects().
> > > > >   Also, a method with the same name is in Deoptimization.
> > > > >   Proposal: deoptimize_objects_thread() ?
> > > >
> > > > Sorry, but I don't see, why it would be misleading.
> > > > What would be the meaning of 'deoptimize_objects_thread'? I don't
> > > > understand that name.
> > > 1. I have no idea why it's called "_internal". Because it is private?
> > >    By the name, I would expect that EscapeBarrier::deoptimize_objects()
> > >    calls it for some internal tasks. But it does not.
> > 
> > Well, I'd say it is pretty internal, what's happening in that method. So IMHO
> > the suffix _internal
> > is a match.
> > 
> > > 2. My proposal: deoptimize_objects_all_threads() iterates all threads
> > > and calls deoptimize_objects(_one)_thread(thread) for each of these.
> > > That's how I would have named it.
> > > But no bike shedding, if you don't see what I mean it's not obvious.
> > Ok. We could have a quick call, too, if you like.

> Ok, I think I have understood the remaining points.  I'm fine with this 
> so far.

Thanks again and best regards,
Richard.

-----Original Message-----
From: Lindenmaier, Goetz <goetz.lindenmaier at sap.com> 
Sent: Mittwoch, 22. Juli 2020 18:22
To: Reingruber, Richard <richard.reingruber at sap.com>; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net
Subject: RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi Richard,

Thanks for the quick reply.

> > >   With DeoptimizeObjectsALot enabled internal threads are started that
> > > deoptimize frames and
> > >   objects. The number of threads started are given with
> > > DeoptimizeObjectsALotThreadCountAll and
> > >   DeoptimizeObjectsALotThreadCountSingle. The former targets all
> existing
> > > threads whereas the
> > >   latter operates on a single thread selected round robin.
> > >
> > >   I removed the mode where deoptimizations were performed at every nth
> > > exit from the runtime. I never used it.
> 
> > Do I get it right? You have a n:1 and a n:all test scenario.
> >  n:1: n threads deoptimize 1 Java thread    where n = DOALThreadCountSingle
> >  n:m: n threads deoptimize all Java threads where n = DOALThreadCountAll?
> 
> Not quite.
> 
> -XX:+DeoptimizeObjectsALot // required
> -XX:DeoptimizeObjectsALotThreadCountAll=m
> -XX:DeoptimizeObjectsALotThreadCountSingle=n
> 
> Will start m+n threads. Each operating on all existing JavaThreads using
> EscapeBarriers. The difference between the 2 thread types is that one
> distinct EscapeBarrier targets either just a single thread or all existing
> threads at once. If just one single thread is targeted per EscapeBarrier,
> then it is not always the same thread, but threads are selected round
> robin. So there will be n threads selecting independently single threads
> round robin per EscapeBarrier and m threads that target all threads in
> every EscapeBarrier.
Ok, yes, that is how I understood it. 
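
(For readers of the archive, a small Java sketch of the two selection
policies as described in the quoted text. This is an illustration under
assumptions, not the HotSpot implementation: DeoptTargetSelection,
SingleWorker, nextTarget and targetsForAll are hypothetical names, and thread
names stand in for JavaThreads.)

```java
import java.util.ArrayList;
import java.util.List;

public class DeoptTargetSelection {
    // One barrier issued by an "All" worker targets every known thread.
    static List<String> targetsForAll(List<String> threads) {
        return new ArrayList<>(threads);
    }

    // A "Single" worker keeps its own cursor and targets exactly one
    // thread per barrier, advancing round robin through the thread list.
    static final class SingleWorker {
        private int cursor = 0;

        String nextTarget(List<String> threads) {
            String target = threads.get(cursor % threads.size());
            cursor++;
            return target;
        }
    }

    public static void main(String[] args) {
        List<String> threads = List.of("T0", "T1", "T2");
        SingleWorker worker = new SingleWorker();
        for (int barrier = 0; barrier < 4; barrier++) {
            System.out.println("single barrier " + barrier + " -> "
                    + worker.nextTarget(threads));
        }
        System.out.println("all barrier -> " + targetsForAll(threads));
    }
}
```

With n SingleWorker instances each one advances its own cursor
independently, which matches the "selecting independently" wording above.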
 
> > > * EscapeBarrier::sync_and_suspend_one(): use a direct handshake and
> > > execute it always independently
> > >   of is_thread_fully_suspended().
> > Is this also a performance optimization?
> 
> Maybe a minor one.
OK

> > > * JavaThread::wait_for_object_deoptimization():
> > >   - Bugfix: the last check of is_obj_deopt_suspend() must be /after/ the
> > > safepoint check! This
> > >     caused issues with not walkable stacks with DeoptimizeObjectsALot.
> > OK. As I understand, there was one safepoint check in the old version,
> > now there is one in each iteration.  I assume this is intended, right?
> 
> Yes it is. The important thing here is (A) a safepoint check is needed /after/
> leaving a safe state
> (_thread_in_native, _thread_blocked). (B) Shared variables that are modified
> at safepoints or with handshakes need to be reread /after/ the safepoint check.
> 
> BTW: I only noticed now that since JDK-8240918 JavaThreads themselves
> must disarm their polling page. Originally (before handshakes) this was
> done by the VM thread. With handshakes it was done by the thread executing
> the handshake op. This was changed for OrderAccess::cross_modify_fence()
> where the poll is left armed if the thread is in native and since
> JDK-8240918 it is always left armed. So when a thread leaves a safe state
> (native, blocked) and there was a handshake/vm op, it will always call
> SafepointMechanism::block_if_requested_slow(), even if the handshake/vm
> operation has been processed already and everybody else is happily
> executing bytecodes :)
Ok.

> Still (A) and (B) hold.

> > >   - Added limited spinning inspired by HandshakeSpinYield to fix regression in
> > > microbenchmark [1]
> > Ok.  Nice improvement, nice catch!
> 
> Yes. It certainly took some time to find out.
> 
> > >
> > > I refer to some more changes answering your questions and comments
> inline
> > > below.
> > >
> > > Thanks,
> > > Richard.
> > >
> > > [1] Microbenchmark:
> > >
> > > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6.microbenchmark/
> > >
> 
> 
> > > > I understand you annotate at safepoints where the escape analysis
> > > > finds out that an object is "better" than global escape.
> > > > These are the cases where the analysis identifies optimization
> > > > opportunities. These annotations are then used to deoptimize
> > > > frames and the objects referenced by them.
> > > > Doesn't this overestimate the optimized
> > > > objects?  E.g., eliminate_alloc_node has many cases where it bails
> > > > out.
> > >
> > > Yes, the implementation is conservative, but it is comparatively simple
> and
> > > the additional debug
> > > info is just 2 flags per safepoint.
> > Thanks. It also helped that you explained to me offline that
> > there are more optimizations than only lock elimination and scalar
> > replacement done based on the ea information.
> > The ea refines the IR graph which allows follow-up optimizations
> > which can not easily be tracked back to the escaping objects or
> > the call sites where they do not escape.
> > Thus, if there are non-global escaping objects, you have to
> > deoptimize the frame.
> > Did I repeat that correctly?
> 
> Mostly, but there are also cases where deoptimization is required if and only
> if ea-local objects
> are passed as arguments. This is the case when values are not read directly
> from a frame, but from a callee frame.
Hmm, don't get this completely, but ok.
  
> > > Accesses to instance
> > > members or array elements can be optimized as well.
> > You mean the compiler can/will ignore volatile or memory ordering
> > requirements for non-escaping objects? Sounds reasonable to do.
> 
> Yes, for instance. Also without volatile modifiers it will eliminate accesses.
> Here is an example:
> Method A has a NoEscape allocation O that is not scalar replaced. A calls
> Method B, which is not
> inlined. When you use your debugger to break in B, then modify a field of O,
> then this modification
> would have no effect without deoptimization, because the jit assumes that B
> cannot modify O without
> a reference to it.
Yes, A can keep O in a register, while the JVMTI thread would write to 
the location in the stack where the local is held (if it was written back).

> > > > Synchronization: looks good. I think others had a look at this before.
> > > >
> > > > EscapeBarrier::deoptimize_objects_internal()
> > > >   The method name is misleading, it is not used by
> > > >   deoptimize_objects().
> > > >   Also, a method with the same name is in Deoptimization.
> > > >   Proposal: deoptimize_objects_thread() ?
> > >
> > > Sorry, but I don't see, why it would be misleading.
> > > What would be the meaning of 'deoptimize_objects_thread'? I don't
> > > understand that name.
> > 1. I have no idea why it's called "_internal". Because it is private?
> >    By the name, I would expect that EscapeBarrier::deoptimize_objects()
> >    calls it for some internal tasks. But it does not.
> 
> Well, I'd say it is pretty internal, what's happening in that method. So IMHO
> the suffix _internal
> is a match.
> 
> > 2. My proposal: deoptimize_objects_all_threads() iterates all threads
> > and calls deoptimize_objects(_one)_thread(thread) for each of these.
> > That's how I would have named it.
> > But no bike shedding, if you don't see what I mean it's not obvious.
> Ok. We could have a quick call, too, if you like.

Ok, I think I have understood the remaining points.  I'm fine with this 
so far.

Thanks,
  Goetz.


From doug.simon at oracle.com  Wed Jul 22 20:56:47 2020
From: doug.simon at oracle.com (Doug Simon)
Date: Wed, 22 Jul 2020 22:56:47 +0200
Subject: RFR: 8249888: failure to create a libgraal JavaVM should result
 in a VM crash
In-Reply-To: <09a740fa-73e8-5e6b-cc92-db382130c486@oracle.com>
References: <4ED30258-0888-4D6B-867E-6CC5DB4159E4@oracle.com>
 <09a740fa-73e8-5e6b-cc92-db382130c486@oracle.com>
Message-ID: <439CC977-9905-4D3E-9FC1-F345D0785F37@oracle.com>

Thanks Vladimir.

-Doug

> On 22 Jul 2020, at 20:02, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> Looks good.
> 
> Thanks,
> Vladimir
> 
> On 7/22/20 8:53 AM, Doug Simon wrote:
>> Please review this enhancement which improves the debuggability of certain libgraal isolate creation issues.
>> BUG:
>> https://bugs.openjdk.java.net/browse/JDK-8249888
>> PATCH:
>> diff -r 8995e9efdee7 src/hotspot/share/jvmci/jvmciRuntime.cpp
>> --- a/src/hotspot/share/jvmci/jvmciRuntime.cpp	Wed Jul 22 17:08:01 2020 +0200
>> +++ b/src/hotspot/share/jvmci/jvmciRuntime.cpp	Wed Jul 22 17:41:06 2020 +0200
>> @@ -800,7 +800,7 @@
>>        JNI_CreateJavaVM = CAST_TO_FN_PTR(JNI_CreateJavaVM_t, os::dll_lookup(sl_handle, "JNI_CreateJavaVM"));
>>      if (JNI_CreateJavaVM == NULL) {
>> -      vm_exit_during_initialization("Unable to find JNI_CreateJavaVM", sl_path);
>> +      fatal("Unable to find JNI_CreateJavaVM in %s", sl_path);
>>      }
>>        ResourceMark rm;
>> @@ -835,7 +835,7 @@
>>        JVMCI_event_1("created JavaVM[%ld]@" PTR_FORMAT " for JVMCI runtime %d", javaVM_id, p2i(javaVM), _id);
>>        return env;
>>      } else {
>> -      vm_exit_during_initialization(err_msg("JNI_CreateJavaVM failed with return value %d", result), sl_path);
>> +      fatal("JNI_CreateJavaVM failed with return value %d", result);
>>      }
>>    }
>>    return NULL;
>> -Doug


From doug.simon at oracle.com  Wed Jul 22 20:56:57 2020
From: doug.simon at oracle.com (Doug Simon)
Date: Wed, 22 Jul 2020 22:56:57 +0200
Subject: RFR: 8249888: failure to create a libgraal JavaVM should result
 in a VM crash
In-Reply-To: <62657817-db4d-8887-4d90-9aedd530db78@oracle.com>
References: <4ED30258-0888-4D6B-867E-6CC5DB4159E4@oracle.com>
 <62657817-db4d-8887-4d90-9aedd530db78@oracle.com>
Message-ID: <8B5E6F86-B1DD-4DE6-AF9F-3A110F152997@oracle.com>

Thanks Tom.

-Doug

> On 22 Jul 2020, at 20:04, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
> 
> Looks good.
> 
> tom
> 
> Doug Simon wrote on 7/22/20 8:53 AM:
>> Please review this enhancement which improves the debuggability of certain libgraal isolate creation issues.
>> BUG:
>> https://bugs.openjdk.java.net/browse/JDK-8249888
>> PATCH:
>> diff -r 8995e9efdee7 src/hotspot/share/jvmci/jvmciRuntime.cpp
>> --- a/src/hotspot/share/jvmci/jvmciRuntime.cpp	Wed Jul 22 17:08:01 2020 +0200
>> +++ b/src/hotspot/share/jvmci/jvmciRuntime.cpp	Wed Jul 22 17:41:06 2020 +0200
>> @@ -800,7 +800,7 @@
>>        JNI_CreateJavaVM = CAST_TO_FN_PTR(JNI_CreateJavaVM_t, os::dll_lookup(sl_handle, "JNI_CreateJavaVM"));
>>      if (JNI_CreateJavaVM == NULL) {
>> -      vm_exit_during_initialization("Unable to find JNI_CreateJavaVM", sl_path);
>> +      fatal("Unable to find JNI_CreateJavaVM in %s", sl_path);
>>      }
>>        ResourceMark rm;
>> @@ -835,7 +835,7 @@
>>        JVMCI_event_1("created JavaVM[%ld]@" PTR_FORMAT " for JVMCI runtime %d", javaVM_id, p2i(javaVM), _id);
>>        return env;
>>      } else {
>> -      vm_exit_during_initialization(err_msg("JNI_CreateJavaVM failed with return value %d", result), sl_path);
>> +      fatal("JNI_CreateJavaVM failed with return value %d", result);
>>      }
>>    }
>>    return NULL;
>> -Doug


From vladimir.x.ivanov at oracle.com  Wed Jul 22 21:36:52 2020
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Thu, 23 Jul 2020 00:36:52 +0300
Subject: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API
 (Incubator): AArch64 backend changes
In-Reply-To: <54d6b2b6-b79a-4700-981c-6ab33aca82f2@arm.com>
References: <VI1PR0802MB2558C90ACF1189F17692FF278EB90@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <275eb57c-51c0-675e-c32a-91b198023559@redhat.com>
 <719F9169-ABC4-408E-B732-F1BD9A84337F@oracle.com>
 <9a13f5df-d946-579d-4282-917dc7338dc8@redhat.com>
 <09BC0693-80E0-4F87-855E-0B38A6F5EFA2@oracle.com>
 <668e500e-f621-5a2c-a41e-f73536880f73@redhat.com>
 <VI1PR0802MB255845256424F9EB4D5D81368EB30@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <1909fa9d-98bb-c2fb-45d8-540247d1ca8b@redhat.com>
 <VI1PR0802MB255895ADE9BC8D027EBB81048EB10@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <BYAPR11MB35438110FA50851942CE17A1EF950@BYAPR11MB3543.namprd11.prod.outlook.com>
 <HE1PR0802MB25542AE212E12FD3EF6D1BE48E6E0@HE1PR0802MB2554.eurprd08.prod.outlook.com>
 <2acbcc99-8dd4-b8f1-5982-1d439953c416@redhat.com>
 <VI1PR0802MB2558ECC61528694DBDFFF48F8E670@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <54d6b2b6-b79a-4700-981c-6ab33aca82f2@arm.com>
Message-ID: <8c05d468-8753-b671-e3a9-92a7148f4f14@oracle.com>

> http://cr.openjdk.java.net/~njian/vectorapi/8223347-integration/aarch64-webrev.01/ 

FTR there's one more aarch64-specific change in shared code to enable 
aarch64_neon.ad processing:

diff --git a/make/hotspot/gensrc/GensrcAdlc.gmk b/make/hotspot/gensrc/GensrcAdlc.gmk
--- a/make/hotspot/gensrc/GensrcAdlc.gmk
+++ b/make/hotspot/gensrc/GensrcAdlc.gmk
@@ -129,6 +129,12 @@
         $d/os_cpu/$(HOTSPOT_TARGET_OS)_$(HOTSPOT_TARGET_CPU_ARCH)/$(HOTSPOT_TARGET_OS)_$(HOTSPOT_TARGET_CPU_ARCH).ad \
     )))
 
+  ifeq ($(HOTSPOT_TARGET_CPU_ARCH), aarch64)
+    AD_SRC_FILES += $(call uniq, $(wildcard $(foreach d, $(AD_SRC_ROOTS), \
+        $d/cpu/$(HOTSPOT_TARGET_CPU_ARCH)/$(HOTSPOT_TARGET_CPU_ARCH)_neon.ad \
+    )))
+  endif
+
   ifeq ($(call check-jvm-feature, shenandoahgc), true)
     AD_SRC_FILES += $(call uniq, $(wildcard $(foreach d, $(AD_SRC_ROOTS), \
         $d/cpu/$(HOTSPOT_TARGET_CPU_ARCH)/gc/shenandoah/shenandoah_$(HOTSPOT_TARGET_CPU).ad \

Best regards,
Vladimir Ivanov

> On 7/8/20 3:05 PM, Yang Zhang wrote:
>> Hi Andrew
>>
>> I have updated this patch. Could you please help to review it again?
>> In this patch, the following changes are made:
>> 1. Separate newly added NEON instructions to a new ad file
>>    aarch64_neon.ad
>> 2. Add assembler tests for NEON instructions. Trailing spaces
>>    in the python script are also removed.
>>
>> http://cr.openjdk.java.net/~yzhang/vectorapi/vectorapi.rfr/aarch64_webrev/webrev.02/ 
>>
>>
>> Thanks,
>> Yang
>>
>>
>> -----Original Message-----
>> From: Andrew Haley <aph at redhat.com>
>> Sent: Tuesday, June 30, 2020 12:10 AM
>> To: Yang Zhang <Yang.Zhang at arm.com>; Viswanathan, Sandhya 
>> <sandhya.viswanathan at intel.com>; Paul Sandoz <paul.sandoz at oracle.com>
>> Cc: nd <nd at arm.com>; hotspot-compiler-dev at openjdk.java.net; 
>> hotspot-dev at openjdk.java.net; core-libs-dev at openjdk.java.net; 
>> aarch64-port-dev at openjdk.java.net
>> Subject: Re: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of 
>> Vector API (Incubator): AArch64 backend changes
>>
>> On 29/06/2020 08:48, Yang Zhang wrote:
>>> 1. Instructions that can be matched with NEON instructions directly.
>>> MulVB, SqrtVF and AbsV have been merged into jdk master already.
>>>
>>> 2. Instructions that jdk master has middle end support for, but they 
>>> cannot be matched with NEON instructions directly.
>>> Such as AddReductionVL, MulReductionVL, And/Or/XorReductionV These 
>>> new instructions can be moved into jdk master first, but for 
>>> auto-vectorization, the performance might not get improved.
>>>
>>> 3. Panama/Vector API specific instructions such as Load/StoreVector 
>>> ( 16 bits), VectorReinterpret, VectorMaskCmp, MaxV/MinV, VectorBlend 
>>> etc.
>>> These instructions cannot be moved into jdk master first because 
>>> there isn't middle-end support.
>>>
>>> I will put 2 and 3 in a new ad file aarch64_neon.ad. I will also 
>>> update aarch64_asmtest.py and macroassemler.cpp. When the patch is 
>>> ready, I will send it again.
>>
>> Thank you *very* much for your hard work. Appreciated!
>>
>> -- 
>> Andrew Haley  (he/him)
>> Java Platform Lead Engineer
>> Red Hat UK Ltd. <https://www.redhat.com> https://keybase.io/andrewhaley
>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>>
> 

From tobias.hartmann at oracle.com  Thu Jul 23 09:13:41 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Thu, 23 Jul 2020 11:13:41 +0200
Subject: RFR(S): 8247732: validate user-input intrinsic_ids in
 ControlIntrinsic
In-Reply-To: <1595401959932.33284@amazon.com>
References: <821e3d29-c95b-aafc-8ee5-6e49a1bdde82@amazon.com>
 <9b324805-eb86-27e1-5dcb-96a823f8495b@amazon.com>
 <82cba5e4-2020-ce0a-4576-e8e0cc2e5ae5@oracle.com>
 <1595401959932.33284@amazon.com>
Message-ID: <a03d92d6-ad07-b347-7452-776459b8d174@oracle.com>


On 22.07.20 09:12, Liu, Xin wrote:
> 1. I move the validation logic for compiler directives to compilerOracle::scan_flag_and_value.  
> If something goes wrong in the parser, the patch will "gracefully" quit the JVM using jvm_exit(1). Is that okay? 

With "piggy-back on the error mechanism" I meant that you should use the existing bailout mechanism
in the parser. In this case, couldn't you simply put the error message in 'errorbuf' and let the
caller take care of handling it?

Best regards,
Tobias


From boris.ulasevich at bell-sw.com  Thu Jul 23 11:25:00 2020
From: boris.ulasevich at bell-sw.com (Boris Ulasevich)
Date: Thu, 23 Jul 2020 14:25:00 +0300
Subject: [aarch64-port-dev ] RFR 8248870: AARCH64: I2L conversions can be
 skipped for small positive masked values
In-Reply-To: <e54d3d17-c865-a422-e9eb-05b3936e81da@bell-sw.com>
References: <d83ab998-ad1e-4889-b637-dc7f0369c50e@bell-sw.com>
 <9ccf64f1-7a88-0f67-8b50-4dea09af9c8b@redhat.com>
 <e54d3d17-c865-a422-e9eb-05b3936e81da@bell-sw.com>
Message-ID: <05369383-c6d8-5e61-50ce-51fec955e2d4@bell-sw.com>

Hi Andrew,

Since the JDK-8248414 patch has been committed, I believe we can revive
this review. I think it is still better to move my rule to the ubfiz
command group, which is in the auto-generated area.

http://cr.openjdk.java.net/~bulasevich/8248870/webrev.02

regards,
Boris

On 09.07.2020 19:20, Boris Ulasevich wrote:
> Hi Andrew,
>
> Ok, let us proceed after 8248414.
>
> Meanwhile, I moved the change out of do-not-edit scope, thanks:
> http://cr.openjdk.java.net/~bulasevich/8248870/webrev.01
>
> regards,
> Boris
>
> On 08.07.2020 12:46, Andrew Haley wrote:
>> On 07/07/2020 16:47, Boris Ulasevich wrote:
>>> Please review the change to skip i2l conversion after the mask:
>>>
>>> http://cr.openjdk.java.net/~bulasevich/8248870/webrev.00
>>> http://bugs.openjdk.java.net/browse/JDK-8248870
>> You seem to have inserted this between the DO NOT EDIT THIS SECTION
>> markers.
>>
>> Please hold off this change until I've committed the patch for
>> 8248414.
>>
>


From ningsheng.jian at arm.com  Thu Jul 23 08:02:47 2020
From: ningsheng.jian at arm.com (Ningsheng Jian)
Date: Thu, 23 Jul 2020 16:02:47 +0800
Subject: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API
 (Incubator): AArch64 backend changes
In-Reply-To: <8c05d468-8753-b671-e3a9-92a7148f4f14@oracle.com>
References: <VI1PR0802MB2558C90ACF1189F17692FF278EB90@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <275eb57c-51c0-675e-c32a-91b198023559@redhat.com>
 <719F9169-ABC4-408E-B732-F1BD9A84337F@oracle.com>
 <9a13f5df-d946-579d-4282-917dc7338dc8@redhat.com>
 <09BC0693-80E0-4F87-855E-0B38A6F5EFA2@oracle.com>
 <668e500e-f621-5a2c-a41e-f73536880f73@redhat.com>
 <VI1PR0802MB255845256424F9EB4D5D81368EB30@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <1909fa9d-98bb-c2fb-45d8-540247d1ca8b@redhat.com>
 <VI1PR0802MB255895ADE9BC8D027EBB81048EB10@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <BYAPR11MB35438110FA50851942CE17A1EF950@BYAPR11MB3543.namprd11.prod.outlook.com>
 <HE1PR0802MB25542AE212E12FD3EF6D1BE48E6E0@HE1PR0802MB2554.eurprd08.prod.outlook.com>
 <2acbcc99-8dd4-b8f1-5982-1d439953c416@redhat.com>
 <VI1PR0802MB2558ECC61528694DBDFFF48F8E670@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <54d6b2b6-b79a-4700-981c-6ab33aca82f2@arm.com>
 <8c05d468-8753-b671-e3a9-92a7148f4f14@oracle.com>
Message-ID: <f588f038-669d-2320-9d3f-c12e6e9c1a66@arm.com>

Hi Vladimir,

Thanks for pointing this out. Yes, I missed that change in shared code. 
I've regenerated the webrev, with GensrcAdlc.gmk file change included:

http://cr.openjdk.java.net/~njian/vectorapi/8223347-integration/aarch64-webrev.01/

Also adding build-dev.

Thanks,
Ningsheng

On 7/23/20 5:36 AM, Vladimir Ivanov wrote:
>> http://cr.openjdk.java.net/~njian/vectorapi/8223347-integration/aarch64-webrev.01/ 
> 
> 
> FTR there's one more aarch64-specific change in shared code to enable 
> aarch64_neon.ad processing:
> 
> diff --git a/make/hotspot/gensrc/GensrcAdlc.gmk b/make/hotspot/gensrc/GensrcAdlc.gmk
> --- a/make/hotspot/gensrc/GensrcAdlc.gmk
> +++ b/make/hotspot/gensrc/GensrcAdlc.gmk
> @@ -129,6 +129,12 @@
>  $d/os_cpu/$(HOTSPOT_TARGET_OS)_$(HOTSPOT_TARGET_CPU_ARCH)/$(HOTSPOT_TARGET_OS)_$(HOTSPOT_TARGET_CPU_ARCH).ad \
>       )))
> 
> +  ifeq ($(HOTSPOT_TARGET_CPU_ARCH), aarch64)
> +    AD_SRC_FILES += $(call uniq, $(wildcard $(foreach d, $(AD_SRC_ROOTS), \
> + $d/cpu/$(HOTSPOT_TARGET_CPU_ARCH)/$(HOTSPOT_TARGET_CPU_ARCH)_neon.ad \
> +    )))
> +  endif
> +
>     ifeq ($(call check-jvm-feature, shenandoahgc), true)
>       AD_SRC_FILES += $(call uniq, $(wildcard $(foreach d, $(AD_SRC_ROOTS), \
>  $d/cpu/$(HOTSPOT_TARGET_CPU_ARCH)/gc/shenandoah/shenandoah_$(HOTSPOT_TARGET_CPU).ad \
> 
> Best regards,
> Vladimir Ivanov
> 
>> On 7/8/20 3:05 PM, Yang Zhang wrote:
>>> Hi Andrew
>>>
>>> I have updated this patch. Could you please help to review it again?
>>> In this patch, the following changes are made:
>>> 1. Separate newly added NEON instructions to a new ad file
>>>     aarch64_neon.ad
>>> 2. Add assembler tests for NEON instructions. Trailing spaces
>>>     in the python script are also removed.
>>>
>>> http://cr.openjdk.java.net/~yzhang/vectorapi/vectorapi.rfr/aarch64_webrev/webrev.02/ 
>>>
>>>
>>> Thanks,
>>> Yang
>>>
>>>
>>> -----Original Message-----
>>> From: Andrew Haley <aph at redhat.com>
>>> Sent: Tuesday, June 30, 2020 12:10 AM
>>> To: Yang Zhang <Yang.Zhang at arm.com>; Viswanathan, Sandhya 
>>> <sandhya.viswanathan at intel.com>; Paul Sandoz <paul.sandoz at oracle.com>
>>> Cc: nd <nd at arm.com>; hotspot-compiler-dev at openjdk.java.net; 
>>> hotspot-dev at openjdk.java.net; core-libs-dev at openjdk.java.net; 
>>> aarch64-port-dev at openjdk.java.net
>>> Subject: Re: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of 
>>> Vector API (Incubator): AArch64 backend changes
>>>
>>> On 29/06/2020 08:48, Yang Zhang wrote:
>>>> 1. Instructions that can be matched with NEON instructions directly.
>>>> MulVB, SqrtVF and AbsV have been merged into jdk master already.
>>>>
>>>> 2. Instructions that jdk master has middle end support for, but they 
>>>> cannot be matched with NEON instructions directly.
>>>> Such as AddReductionVL, MulReductionVL, And/Or/XorReductionV. These 
>>>> new instructions can be moved into jdk master first, but for 
>>>> auto-vectorization, the performance might not get improved.
>>>>
>>>> 3. Panama/Vector API specific instructions such as Load/StoreVector 
>>>> ( 16 bits), VectorReinterpret, VectorMaskCmp, MaxV/MinV, VectorBlend 
>>>> etc.
>>>> These instructions cannot be moved into jdk master first because 
>>>> there isn't middle-end support.
>>>>
>>>> I will put 2 and 3 in a new ad file aarch64_neon.ad. I will also 
>>>> update aarch64_asmtest.py and macroassemler.cpp. When the patch is 
>>>> ready, I will send it again.
>>>
>>> Thank you *very* much for your hard work. Appreciated!
>>>
>>> -- 
>>> Andrew Haley  (he/him)
>>> Java Platform Lead Engineer
>>> Red Hat UK Ltd. <https://www.redhat.com> https://keybase.io/andrewhaley
>>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>>>
>>


From goetz.lindenmaier at sap.com  Thu Jul 23 14:19:57 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 23 Jul 2020 14:19:57 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in
 the Presence of JVMTI Agents
In-Reply-To: <AM0PR0202MB333139A9A877B64198E73D0F9B790@AM0PR0202MB3331.eurprd02.prod.outlook.com>
References: <DB7PR02MB3612C77802B72D3B3A131C729B5B0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <ca46e04d-6c46-7365-0f09-9d649e196442@oracle.com>
 <DB7PR02MB3612E34960EAD89951E788839B5A0@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <a4213452-e7bd-5bed-7456-3eebf4a4c3a7@oracle.com>
 <DB7PR02MB3612C72A7DC0C14CFC8B92969B540@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <f97264ed-c43e-2d7e-19ae-fcff174f74df@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <DB7PR02MB36127925DB5D6609DDBF96909B500@DB7PR02MB3612.eurprd02.prod.outlook.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
 <AM0PR0202MB33316510E86767AED0D29F679B030@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM7PR02MB6049A3D2F6DE10CAD6AA7A51ECEC0@AM7PR02MB6049.eurprd02.prod.outlook.com>
 <b159e349-95bc-01c3-5250-f3b454d7ef53@oracle.com>
 <AM0PR0202MB33315707EAB1F5C9801DB4C19BE40@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB32972071A26C80FB22FC49DE9AFD0@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331EEF36942FCEBA7E131389BCB0@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM0PR0202MB329746F57D1C78F14000CB799AC80@AM0PR0202MB3297.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331D64C693490FD0746D1989BC90@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <DB6PR0201MB2152AF18921A375D26A76D89ECA40@DB6PR0201MB2152.eurprd02.prod.outlook.com>
 <AM0PR0202MB3331FF18BED42A71796488E59B600@AM0PR0202MB3331.eurprd02.prod.outlook.com>
 <AM4PR0202MB29641555B86889D51E08441BEC7F0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <AM4PR0202MB2964FAF58FBD21D6705A4418EC7C0@AM4PR0202MB2964.eurprd02.prod.outlook.com>
 <AM0PR0202MB333139A9A877B64198E73D0F9B790@AM0PR0202MB3331.eurprd02.prod.outlook.com>
Message-ID: <AM4PR0202MB296490252335D6D6D638277AEC760@AM4PR0202MB2964.eurprd02.prod.outlook.com>

Hi Richard, 

Thanks for your two further explanations in the other thread. 
That made the points clear to me.

> > I was not that happy with the names saying not_global_escape
> > and similar. I now agree you have to use the terms of the escape
> > analysis (NoEscape, ArgEscape) throughout the runtime code. I'm still not happy with
> > the 'not' in the term, I always try to expand the name to some
> > sentence with a negated verb, but it makes no sense.
> > For example, "has_not_global_escape_in_scope" expands to
> > "Hasn't a global escape in its scope." in my thinking, which makes
> > no sense. You probably mean
> > "Has not-global escape in its scope." or "Has {ArgEscape|NoEscape}
> > in its scope."
> 
> > C2 is using the word "non" in this context, e.g., here
> > alloc->is_non_escaping.
> 
> There is also ConnectionGraph::not_global_escape()
That talks about a single node that represents a single 
object. An object has a single state wrt. EA.
You use the term for a safepoint, which tracks a set of objects.
Here, has_not_global_escape can mean
  1. None of the several objects escapes globally.
  2. There is at least one object that escapes globally.

> > 'non' obviously negates the adjective 'global';
> > non-global or even nonglobal is an English term I find on the
> > net.
> > So what about "has_non_global_escape_in_scope"?
> 
> And what about has_ea_local_in_scope?
That's good. Please document somewhere that 
Ea_local == ArgEscape | NoEscape.
That's what it is, right?
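
To make the proposed definition concrete, here is a small sketch written
in Java purely for illustration (the HotSpot code under review is C++).
It assumes the "at least one object" reading of the predicate; the enum
and method names are hypothetical and only mirror the terms used in this
thread.

```java
import java.util.List;

final class EaLocalSketch {
    // Escape states named after the terms used in the discussion.
    enum EscapeState { NO_ESCAPE, ARG_ESCAPE, GLOBAL_ESCAPE }

    // "EA-local" means ArgEscape or NoEscape, i.e. anything but GlobalEscape.
    public static boolean isEaLocal(EscapeState state) {
        return state == EscapeState.NO_ESCAPE || state == EscapeState.ARG_ESCAPE;
    }

    // Assumed reading: true if at least one object tracked for the scope is EA-local.
    public static boolean hasEaLocalInScope(List<EscapeState> objectsInScope) {
        return objectsInScope.stream().anyMatch(EaLocalSketch::isEaLocal);
    }

    public static void main(String[] args) {
        List<EscapeState> mixed = List.of(EscapeState.GLOBAL_ESCAPE, EscapeState.ARG_ESCAPE);
        List<EscapeState> allGlobal = List.of(EscapeState.GLOBAL_ESCAPE);
        System.out.println(hasEaLocalInScope(mixed));     // true
        System.out.println(hasEaLocalInScope(allGlobal)); // false
    }
}
```

Spelling the set predicate out with anyMatch avoids the ambiguity of the
negated name: it is explicit that one EA-local object is enough.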

> > Does jvmti specify that the same limits are used ...?
> > ok on your side.
> 
> I don't know and didn't find anything in a quick search.
Ok, not your business.

> 
> > jvmtiEnvBase.cpp  ok
> > jvmtiImpl.h|cpp  ok
> > jvmtiTagMap.cpp ok
> > whitebox.cpp ok
> 
> > deoptimization.cpp
> 
> > line 177: Please break line
> > line 246, 281: Please break line
> > 1578, 1583, 1589, 1632, 1649, 1651 Break line
> 
> > 1651: You use 'non'-terms, too: non-escaping :)
> 
> I know :) At least here it is wrong I'd say. "...has to be a not escaping obj..."
> sounds better
> (hopefully not only to my german ears).
I thought the term non-escaping makes it quite clear.
I just wanted to point out that using non above would
be similar to the wording here.

> > IterateHeapWithEscapeAnalysisEnabled.java
> 
> > line 415:
> > msg("wait until target thread has set testMethod_result");
> > while (testMethod_result == 0) {
> >     Thread.sleep(50);
> > }
> > Might the test run into timeouts at this place?
> > The field is volatile, i.e. it will be reloaded
> > in each iteration. But will dontinline_testMethod
> > write it back to main memory in time?
> 
> You mean, the test could hang in that loop for a couple of minutes? I don't
> think so. There are cache coherence protocols in place which will invalidate
> stale data very timely.
Ok, anyways, it would only be a hanging test.
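
For reference, a self-contained sketch of the polling pattern discussed
above. It is not the test itself: the class, field and method names are
made up, and a deadline is added so that the sketch cannot hang. Because
the field is volatile, the Java memory model guarantees that a read which
follows the write in synchronization order sees the written value.

```java
final class VolatilePollSketch {
    // Volatile, so every loop iteration re-reads the field.
    static volatile int result = 0;

    // Polls until a non-zero value is published or the deadline passes.
    public static int awaitResult(long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (result == 0 && System.nanoTime() - deadline < 0) {
            Thread.sleep(5);
        }
        return result;
    }

    // Starts a writer thread and returns the value the polling thread observed.
    public static int runDemo() {
        result = 0;
        Thread writer = new Thread(() -> { result = 42; });
        writer.start();
        try {
            return awaitResult(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return result;
        }
    }

    public static void main(String[] args) {
        System.out.println("observed " + runDemo());
    }
}
```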
> 
> Ok. I've removed quite a lot of the occurrences.
> 
> > Also, I like full sentences in comments.
> > Especially for me as foreign speaker, this makes
> > things much more clear. I.e., I try to make it
> > a real sentence with articles, capitalized and a
> > dot at the end if there is a subject and a verb
> > in first place.
> > E.g., jvmtiEnvBase.cpp:1327
> 
> Are you referring to the following?
> (from
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.6/src/hotspot/share/prims/jvmtiEnvBase.cpp.frames.html)
> 
> 1326
> 1327   // If the frame is a compiled one, need to deoptimize it.
> 1328   if (vf->is_compiled_frame()) {
> 
> This line 1327 is preexisting.
Sorry, wrong line number again. 
I think I meant
1333 // eagerly reallocate scalar replaced objects.

But I must admit, the subject is missing. It's one of these 
imperative sentences where the subject is left out, which 
are used throughout documentation.
Bad example, but still a correct sentence, so qualifies 
for punctuation?

Best regards,
  Goetz.


From erik.joelsson at oracle.com  Thu Jul 23 13:06:22 2020
From: erik.joelsson at oracle.com (Erik Joelsson)
Date: Thu, 23 Jul 2020 06:06:22 -0700
Subject: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API
 (Incubator): AArch64 backend changes
In-Reply-To: <f588f038-669d-2320-9d3f-c12e6e9c1a66@arm.com>
References: <VI1PR0802MB2558C90ACF1189F17692FF278EB90@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <719F9169-ABC4-408E-B732-F1BD9A84337F@oracle.com>
 <9a13f5df-d946-579d-4282-917dc7338dc8@redhat.com>
 <09BC0693-80E0-4F87-855E-0B38A6F5EFA2@oracle.com>
 <668e500e-f621-5a2c-a41e-f73536880f73@redhat.com>
 <VI1PR0802MB255845256424F9EB4D5D81368EB30@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <1909fa9d-98bb-c2fb-45d8-540247d1ca8b@redhat.com>
 <VI1PR0802MB255895ADE9BC8D027EBB81048EB10@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <BYAPR11MB35438110FA50851942CE17A1EF950@BYAPR11MB3543.namprd11.prod.outlook.com>
 <HE1PR0802MB25542AE212E12FD3EF6D1BE48E6E0@HE1PR0802MB2554.eurprd08.prod.outlook.com>
 <2acbcc99-8dd4-b8f1-5982-1d439953c416@redhat.com>
 <VI1PR0802MB2558ECC61528694DBDFFF48F8E670@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <54d6b2b6-b79a-4700-981c-6ab33aca82f2@arm.com>
 <8c05d468-8753-b671-e3a9-92a7148f4f14@oracle.com>
 <f588f038-669d-2320-9d3f-c12e6e9c1a66@arm.com>
Message-ID: <2bc029fc-2823-18ac-9aa0-1a8edd7f9094@oracle.com>

Hello Ningsheng,

Build change looks good.

/Erik

On 2020-07-23 01:02, Ningsheng Jian wrote:
> Hi Vladimir,
>
> Thanks for pointing this out. Yes, I missed that change in shared 
> code. I've regenerated the webrev, with GensrcAdlc.gmk file change 
> included:
>
> http://cr.openjdk.java.net/~njian/vectorapi/8223347-integration/aarch64-webrev.01/ 
>
>
> Also add build-dev.
>
> Thanks,
> Ningsheng
>
> On 7/23/20 5:36 AM, Vladimir Ivanov wrote:
>>> http://cr.openjdk.java.net/~njian/vectorapi/8223347-integration/aarch64-webrev.01/ 
>>
>>
>>
>> FTR there's one more aarch64-specific change in shared code to enable 
>> aarch64_neon.ad processing:
>>
>> diff --git a/make/hotspot/gensrc/GensrcAdlc.gmk b/make/hotspot/gensrc/GensrcAdlc.gmk
>> --- a/make/hotspot/gensrc/GensrcAdlc.gmk
>> +++ b/make/hotspot/gensrc/GensrcAdlc.gmk
>> @@ -129,6 +129,12 @@
>>  $d/os_cpu/$(HOTSPOT_TARGET_OS)_$(HOTSPOT_TARGET_CPU_ARCH)/$(HOTSPOT_TARGET_OS)_$(HOTSPOT_TARGET_CPU_ARCH).ad \
>>       )))
>>
>> +  ifeq ($(HOTSPOT_TARGET_CPU_ARCH), aarch64)
>> +    AD_SRC_FILES += $(call uniq, $(wildcard $(foreach d, $(AD_SRC_ROOTS), \
>> + $d/cpu/$(HOTSPOT_TARGET_CPU_ARCH)/$(HOTSPOT_TARGET_CPU_ARCH)_neon.ad \
>> +    )))
>> +  endif
>> +
>>     ifeq ($(call check-jvm-feature, shenandoahgc), true)
>>       AD_SRC_FILES += $(call uniq, $(wildcard $(foreach d, $(AD_SRC_ROOTS), \
>>  $d/cpu/$(HOTSPOT_TARGET_CPU_ARCH)/gc/shenandoah/shenandoah_$(HOTSPOT_TARGET_CPU).ad \
>>
>> Best regards,
>> Vladimir Ivanov
>>
>>> On 7/8/20 3:05 PM, Yang Zhang wrote:
>>>> Hi Andrew
>>>>
>>>> I have updated this patch. Could you please help to review it again?
>>>> In this patch, the following changes are made:
>>>> 1. Separate newly added NEON instructions to a new ad file
>>>>     aarch64_neon.ad
>>>> 2. Add assembler tests for NEON instructions. Trailing spaces
>>>>     in the python script are also removed.
>>>>
>>>> http://cr.openjdk.java.net/~yzhang/vectorapi/vectorapi.rfr/aarch64_webrev/webrev.02/ 
>>>>
>>>>
>>>> Thanks,
>>>> Yang
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Andrew Haley <aph at redhat.com>
>>>> Sent: Tuesday, June 30, 2020 12:10 AM
>>>> To: Yang Zhang <Yang.Zhang at arm.com>; Viswanathan, Sandhya 
>>>> <sandhya.viswanathan at intel.com>; Paul Sandoz <paul.sandoz at oracle.com>
>>>> Cc: nd <nd at arm.com>; hotspot-compiler-dev at openjdk.java.net; 
>>>> hotspot-dev at openjdk.java.net; core-libs-dev at openjdk.java.net; 
>>>> aarch64-port-dev at openjdk.java.net
>>>> Subject: Re: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of 
>>>> Vector API (Incubator): AArch64 backend changes
>>>>
>>>> On 29/06/2020 08:48, Yang Zhang wrote:
>>>>> 1. Instructions that can be matched with NEON instructions directly.
>>>>> MulVB, SqrtVF and AbsV have been merged into jdk master already.
>>>>>
>>>>> 2. Instructions that jdk master has middle end support for, but 
>>>>> they cannot be matched with NEON instructions directly.
>>>>> Such as AddReductionVL, MulReductionVL, And/Or/XorReductionV. These 
>>>>> new instructions can be moved into jdk master first, but for 
>>>>> auto-vectorization, the performance might not get improved.
>>>>>
>>>>> 3. Panama/Vector API specific instructions such as 
>>>>> Load/StoreVector ( 16 bits), VectorReinterpret, VectorMaskCmp, 
>>>>> MaxV/MinV, VectorBlend etc.
>>>>> These instructions cannot be moved into jdk master first because 
>>>>> there isn't middle-end support.
>>>>>
>>>>> I will put 2 and 3 in a new ad file aarch64_neon.ad. I will also 
>>>>> update aarch64_asmtest.py and macroassemler.cpp. When the patch is 
>>>>> ready, I will send it again.
>>>>
>>>> Thank you *very* much for your hard work. Appreciated!
>>>>
>>>> -- 
>>>> Andrew Haley  (he/him)
>>>> Java Platform Lead Engineer
>>>> Red Hat UK Ltd. <https://www.redhat.com> 
>>>> https://keybase.io/andrewhaley
>>>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>>>>
>>>
>

From xxinliu at amazon.com  Thu Jul 23 16:02:42 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Thu, 23 Jul 2020 16:02:42 +0000
Subject: RFR(S): 8247732: validate user-input intrinsic_ids in
 ControlIntrinsic
In-Reply-To: <a03d92d6-ad07-b347-7452-776459b8d174@oracle.com>
References: <821e3d29-c95b-aafc-8ee5-6e49a1bdde82@amazon.com>
 <9b324805-eb86-27e1-5dcb-96a823f8495b@amazon.com>
 <82cba5e4-2020-ce0a-4576-e8e0cc2e5ae5@oracle.com>
 <1595401959932.33284@amazon.com>,
 <a03d92d6-ad07-b347-7452-776459b8d174@oracle.com>
Message-ID: <1595520162373.22868@amazon.com>

Hi Tobias, 

That is my intention too, but CompilerOracle doesn't exit the JVM when it encounters parsing errors. 
It just extracts as much information as possible from CompileCommand. That makes sense because compiler "directives" are supposed to be optional for program execution. 

I do put the error message in the parser's errorbuf. I set a flag "exit_on_error" to quit the JVM after it dumps the parser errors. Yes, I treat undefined intrinsics as fatal errors. 
This behavior comes from Nils' comment: "I want to see an error on startup if the user has specified unknown intrinsic names." It is also consistent with the JVM option -XX:ControlIntrinsic=. 

thanks, 
--lx

________________________________________
From: Tobias Hartmann <tobias.hartmann at oracle.com>
Sent: Thursday, July 23, 2020 2:13 AM
To: Liu, Xin; Nils Eliasson; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev
Subject: RE: [EXTERNAL] RFR(S): 8247732: validate user-input intrinsic_ids in ControlIntrinsic


On 22.07.20 09:12, Liu, Xin wrote:
> 1. I moved the validation logic for compiler directives to compilerOracle::scan_flag_and_value.
> If something goes wrong in the parser, the patch will "gracefully" quit the JVM using jvm_exit(1). Is that okay?

With "piggy-back on the error mechanism" I meant that you should use the existing bailout mechanism
in the parser. In this case, couldn't you simply put the error message in 'errorbuf' and let the
caller take care of handling it?

Best regards,
Tobias


From vladimir.x.ivanov at oracle.com  Thu Jul 23 21:50:54 2020
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Fri, 24 Jul 2020 00:50:54 +0300
Subject: RFR[S] : 8248830 : C2 : Rotate API intrinsification for X86
In-Reply-To: <MWHPR11MB1614CB6E26028AC98DAA7F30E8790@MWHPR11MB1614.namprd11.prod.outlook.com>
References: <MWHPR11MB1614EAFF216144FE6EAE68F9E87F0@MWHPR11MB1614.namprd11.prod.outlook.com>
 <92d97d1b-fc53-e368-b249-1cab7db33964@oracle.com>
 <MWHPR11MB1614CB6E26028AC98DAA7F30E8790@MWHPR11MB1614.namprd11.prod.outlook.com>
Message-ID: <dd691913-d9c7-2657-905f-4f3df50f6bb4@oracle.com>

Hi Jatin,

> http://cr.openjdk.java.net/~jbhateja/8248830/webrev.03/

Much better! Thanks.

> Change Summary:
> 
> 1) Unified the handling for scalar rotate operation. All scalar rotate selection patterns are now dependent on newly created RotateLeft/RotateRight nodes. This promotes rotate inferencing. Currently if DAG nodes corresponding to a sub-pattern are shared (have multiple users) then existing complex patterns based on Or/LShiftL/URShift do not get matched and this prevents inferring rotate nodes. Please refer to JIT'ed assembly output with baseline[1] and with patch[2]. We can see that generated code size also went down from 832 bytes to 768 bytes. Also this can cause perf degradation if a shift-or dependency chain appears inside a hot region.
> 
> 2) Due to enhanced rotate inferencing the new patch shows better performance even for legacy targets (non AVX-512). Please refer to the perf result[3] on an AVX2 machine for the JMH benchmark that is part of the patch.

Very nice!
> 3) As suggested, removed Java API intrinsification changes; scalar rotate transformations are done during OrI/OrL node idealizations.

Good.

(Still would be nice to factor the matching code from Ideal() and share 
it between multiple use sites. Especially considering OrVNode::Ideal() 
now does basically the same thing. As an example/idea, take a look at 
is_bmi_pattern() in x86.ad.)
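
For context, the source-level shape behind this shift-or matching can be
sketched in Java. This is an illustration only, not code from the webrev:
the class and method names are hypothetical. It relies on the Java rule
that int shift counts use only their low 5 bits (6 bits for long), so
32 - s behaves like -s and a count of 0 is handled too.

```java
final class RotateIdiomSketch {
    // Shift-or idiom for a 32-bit rotate left.
    public static int rotlShiftOr(int x, int s) {
        return (x << s) | (x >>> (32 - s));
    }

    // Same idiom for 64-bit values.
    public static long rotlShiftOr(long x, int s) {
        return (x << s) | (x >>> (64 - s));
    }

    public static void main(String[] args) {
        int[] ints = {0, 1, -1, 0x80000001, 0x12345678};
        long[] longs = {0L, 1L, -1L, Long.MIN_VALUE, 0x0123456789ABCDEFL};
        for (int s = 0; s < 64; s++) {
            for (int x : ints) {
                if (rotlShiftOr(x, s) != Integer.rotateLeft(x, s)) {
                    throw new AssertionError("int mismatch at s=" + s);
                }
            }
            for (long x : longs) {
                if (rotlShiftOr(x, s) != Long.rotateLeft(x, s)) {
                    throw new AssertionError("long mismatch at s=" + s);
                }
            }
        }
        System.out.println("shift-or idiom matches rotateLeft for all samples");
    }
}
```

Since the idiom and Integer.rotateLeft/Long.rotateLeft compute the same
value, recognizing the Or/shift shape covers hand-written rotates as well
as calls to the library methods.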

> 4) SLP always gets to work on new scalar Rotate nodes and creates vector rotate nodes which are degenerated into OrV/LShiftV/URShiftV nodes if the target does not support vector rotates (non-AVX512).

Good.

> 5) Added new instruction patterns for vector shift Left/Right operations with constant shift operands. This prevents emitting extra moves to XMM.

+instruct vshiftI_imm(vec dst, vec src, immI8 shift) %{
+  match(Set dst (LShiftVI src shift));

I'd prefer to see a uniform Ideal IR shape being used irrespective of 
whether the argument is a constant or not. It should also simplify the 
logic in SuperWord and make it easier to support on non-x86 architectures.

For example, here's how it is done on AArch64:

instruct vsll4I_imm(vecX dst, vecX src, immI shift) %{
   predicate(n->as_Vector()->length() == 4);
   match(Set dst (LShiftVI src (LShiftCntV shift)));
...

> 6) Constant folding scenarios are covered in RotateLeft/RotateRight idealization; inferencing of vector rotate through OrV idealization covers the vector patterns generated through the non-SLP route, i.e. VectorAPI.

I'm fine with keeping OrV::Ideal(), but I'm concerned with the general 
direction here - duplication of scalar transformations to lane-wise 
vector operations. It definitely won't scale and in the longer run it 
risks diverging. Would be nice to find a way to automatically "lift" 
scalar transformations to vectors and apply them uniformly. But right 
now it is just an idea which requires more experimentation.


Some other minor comments/suggestions:

+  // Swap the computed left and right shift counts.
+  if (is_rotate_left) {
+    Node* temp = shiftRCnt;
+    shiftRCnt  = shiftLCnt;
+    shiftLCnt  = temp;
+  }

Maybe use swap() here (declared in globalDefinitions.hpp)?


+  if (Matcher::match_rule_supported_vector(vopc, vlen, bt))
+    return true;

Please, don't omit curly braces (even for simple cases).


-// Rotate Right by variable
-instruct rorI_rReg_Var_C0(no_rcx_RegI dst, rcx_RegI shift, immI0 zero, rFlagsReg cr)
+instruct rorI_immI8_legacy(rRegI dst, immI8 shift, rFlagsReg cr)
  %{
-  match(Set dst (OrI (URShiftI dst shift) (LShiftI dst (SubI zero shift))));
-
+  predicate(!VM_Version::supports_bmi2() && n->bottom_type()->basic_type() == T_INT);
+  match(Set dst (RotateRight dst shift));
+  format %{ "rorl     $dst, $shift" %}
    expand %{
-    rorI_rReg_CL(dst, shift, cr);
+    rorI_rReg_imm8(dst, shift, cr);
    %}

It would be really nice to migrate to MacroAssembler along the way (as a 
cleanup).

> Please push the patch through your testing framework and let me know your review feedback.

There's one new assertion failure:

#  Internal Error (.../src/hotspot/share/opto/phaseX.cpp:1238), pid=5476, tid=6219
#  assert((i->_idx >= k->_idx) || i->is_top()) failed: Idealize should return new nodes, use Identity to return old nodes

I believe it comes from RotateLeftNode::Ideal/RotateRightNode::Ideal 
which can return pre-constructed constants. I suggest getting rid of the 
Ideal() methods and moving the constant folding logic into Node::Value() (as 
implemented for other bitwise/arithmetic nodes in 
addnode.cpp/subnode.cpp/mulnode.cpp et al). It's a more generic approach 
since it enables richer type information (ranges vs constants) and IMO 
it's more convenient to work with constants through Types than ConNodes.

(I suspect that original/expanded IR shape may already provide more 
precise type info for non-constant case which can affect the benchmarks.)

Best regards,
Vladimir Ivanov

> 
> Best Regards,
> Jatin
> 
> [1] http://cr.openjdk.java.net/~jbhateja/8248830/rotate_baseline_avx2_asm.txt
> [2] http://cr.openjdk.java.net/~jbhateja/8248830/rotate_new_patch_avx2_asm.txt
> [3] http://cr.openjdk.java.net/~jbhateja/8248830/rotate_perf_avx2_new_patch.txt
> 
> 
>> -----Original Message-----
>> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
>> Sent: Saturday, July 18, 2020 12:25 AM
>> To: Bhateja, Jatin <jatin.bhateja at intel.com>; Andrew Haley <aph at redhat.com>
>> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>;
>> hotspot-compiler-dev at openjdk.java.net
>> Subject: Re: RFR[S] : 8248830 : C2 : Rotate API intrinsification for X86
>>
>> Hi Jatin,
>>
>>> http://cr.openjdk.java.net/~jbhateja/8248830/webrev_02/
>>
>> It definitely looks better, but IMO it hasn't reached the sweet spot yet.
>> It feels like the focus is on auto-vectorizer while the burden is put on
>> scalar cases.
>>
>> First of all, considering GVN folds relevant operation patterns into a
>> single Rotate node now, what's the motivation to introduce intrinsics?
>>
>> Another point is there's still significant duplication for scalar cases.
>>
>> I'd prefer to see the legacy cases which rely on pattern matching to go
>> away and be substituted with instructions which match Rotate instructions
>> (migrating ).
>>
>> I understand that it will penalize the vectorization implementation, but
>> IMO reducing overall complexity is worth it. On auto-vectorizer side, I see
>> 2 ways to fix it:
>>
>>     (1) introduce additional AD instructions for RotateLeftV/RotateRightV
>> specifically for pre-AVX512 hardware;
>>
>>     (2) in SuperWord::output(), when matcher doesn't support
>> RotateLeftV/RotateRightV nodes (Matcher::match_rule_supported()),
>> generate vectorized version of the original pattern.
>>
>> Overall, it looks like more and more focus is made on scalar part.
>> Considering the main goal of the patch is to enable vectorization, I'm fine
>> with separating cleanup of scalar part. As an interim solution, it seems
>> that leaving the scalar part as it is now and matching scalar bit rotate
>> pattern in VectorNode::is_rotate() should be enough to keep the
>> vectorization part functioning. Then scalar Rotate nodes and relevant
>> cleanups can be integrated later. (Or vice versa: clean up scalar part
>> first and then follow up with vectorization.)
>>
>> Some other comments:
>>
>> * There's a lot of duplication between OrINode::Ideal and OrLNode::Ideal.
>> What do you think about introducing a super type
>> (OrNode) and put a unified version (OrNode::Ideal) there?
>>
>>
>> * src/hotspot/cpu/x86/x86.ad
>>
>> +instruct vprotate_immI8(vec dst, vec src, immI8 shift) %{
>> +  predicate(n->bottom_type()->is_vect()->element_basic_type() == T_INT ||
>> +            n->bottom_type()->is_vect()->element_basic_type() == T_LONG);
>>
>> +instruct vprorate(vec dst, vec src, vec shift) %{
>> +  predicate(n->bottom_type()->is_vect()->element_basic_type() == T_INT ||
>> +            n->bottom_type()->is_vect()->element_basic_type() == T_LONG);
>>
>> The predicates are redundant here.
>>
>>
>> * src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp
>>
>> +void C2_MacroAssembler::vprotate_imm(int opcode, BasicType etype, XMMRegister dst, XMMRegister src,
>> +                                     int shift, int vector_len) {
>> +  if (opcode == Op_RotateLeftV) {
>> +    if (etype == T_INT) {
>> +      evprold(dst, src, shift, vector_len);
>> +    } else {
>> +      evprolq(dst, src, shift, vector_len);
>> +    }
>>
>> Please, put an assert for the false case (assert(etype == T_LONG, "...")).
>>
>>
>> * On testing (with the previous version of the patch): -XX:UseAVX is an
>> x86-specific flag, so new/adjusted tests now fail on non-x86 platforms.
>> Either omitting the flag or adding -XX:+IgnoreUnrecognizedVMOptions will
>> solve the issue.
>>
>> Best regards,
>> Vladimir Ivanov
>>
>>>
>>>
>>> Summary of changes:
>>> 1) Optimization is specifically targeted to exploit vector rotation
>> instruction added for X86 AVX512. A single rotate instruction  encapsulates
>> entire vector OR/SHIFTs pattern thus offers better latency at reduced
>> instruction count.
>>>
>>> 2) There were two approaches to implement this:
>>>       a)  Let everything remain the same and add new wide complex
>> instruction patterns in the matcher for e.g.
>>>            set Dst ( OrV (Binary (LShiftVI dst (Binary ReplicateI shift))
>> (URShiftVI dst (Binary (SubI (Binary ReplicateI 32) ( Replicate shift))
>>>       It would have been an overoptimistic assumption to expect that graph
>> shape would be preserved till the matcher for correct inferencing.
>>>       In addition we would have required multiple such bulky patterns.
>>>       b) Create new RotateLeft/RotateRight scalar nodes; these get
>> generated during intrinsification as well as during additional pattern
>>>       matching during node Idealization. Later on these nodes are consumed
>> by SLP for valid vectorization scenarios to emit their vector
>>>       counterparts, which eventually emit vector rotates.
>>>
>>> 3) I chose approach 2b) since it's cleaner. The only problem here was that
>>> in non-EVEX mode (UseAVX < 3) new scalar Rotate nodes should either be
>> dismantled back to the OR/SHIFT pattern or we penalize the vectorization, which
>> would be very costly. The other option would have been to add an additional vector
>> rotate pattern for UseAVX=3 in the matcher which emits vector OR-SHIFT
>> instructions, but then it would lose out on emitting the efficient instruction
>> sequence which node sharing (OrV/LShiftV/URShift) offers in the current
>> implementation - thus it will not be beneficial for non-AVX512 targets;
>> the only saving will be in terms of cleanup of a few existing scalar rotate
>> matcher patterns. Also, old targets do not offer this powerful rotate
>> instruction. Therefore new scalar nodes are created only for AVX512
>> targets.
>>>
>>> As per suggestions constant folding scenarios have been covered during
>> Idealizations of newly added scalar nodes.
>>>
>>> Please review the latest version and share your feedback and test
>> results.
>>>
>>> Best Regards,
>>> Jatin
>>>
>>>
>>>> -----Original Message-----
>>>> From: Andrew Haley <aph at redhat.com>
>>>> Sent: Saturday, July 11, 2020 2:24 PM
>>>> To: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>; Bhateja, Jatin
>>>> <jatin.bhateja at intel.com>; hotspot-compiler-dev at openjdk.java.net
>>>> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>
>>>> Subject: Re: 8248830 : RFR[S] : C2 : Rotate API intrinsification for
>>>> X86
>>>>
>>>> On 10/07/2020 18:32, Vladimir Ivanov wrote:
>>>>
>>>> > High-level comment: so far, there was no pressing need in
>>>> > explicitly marking the methods as intrinsics. ROR/ROL instructions
>>>> > were selected during matching [1]. Now the patch introduces
>>>> > dedicated nodes (RotateLeft/RotateRight) specifically for intrinsics
>>>> > which partly duplicates existing logic.
>>>>
>>>> The lack of rotate nodes in the IR has always meant that AArch64
>>>> doesn't generate optimal code for e.g.
>>>>
>>>>      (Set dst (XorL reg1 (RotateLeftL reg2 imm)))
>>>>
>>>> because, with the RotateLeft expanded to its full combination of ORs
>>>> and shifts, it's too complicated to match. At the time I put this to
>>>> one side because it wasn't urgent. This is a shame because although
>>>> such combinations are unusual they are used in some crypto operations.
>>>>
>>>> If we can generate immediate-form rotate nodes early by pattern
>>>> matching during parsing (rather than depending on intrinsics) we'll
>>>> get more value than by depending on programmers calling intrinsics.
>>>>
>>>> --
>>>> Andrew Haley  (he/him)
>>>> Java Platform Lead Engineer
>>>> Red Hat UK Ltd. <https://www.redhat.com>
>>>> https://keybase.io/andrewhaley
>>>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>>>

From nick.gasson at arm.com  Fri Jul 24 06:08:52 2020
From: nick.gasson at arm.com (Nick Gasson)
Date: Fri, 24 Jul 2020 14:08:52 +0800
Subject: RFR(XS): 8249781: AArch64: AOT compiled code crashes if C2 allocates
 r27
Message-ID: <85eep11m2z.fsf@nicgas01-pc.shanghai.arm.com>

Hi,

Bug: https://bugs.openjdk.java.net/browse/JDK-8249781
Webrev: http://cr.openjdk.java.net/~ngasson/8249781/webrev.0/

AOT compiled code always assumes r27 is the heap base pointer, but since
JDK-8242449 C2 can allocate it as a general register if the compressed
class base is null. If C2-compiled code that uses r27 runs before AOT
code, rheapbase will be clobbered causing a crash in the AOT code.

To reproduce:

  make test TEST="compiler/aot/cli/jaotc/CompileModuleTest.java" \
     JTREG="VM_OPTIONS=-Xcomp -XX:-TieredCompilation"

Fix by checking if AOT is enabled before using r27 as a general
register.

Tested with jtreg hotspot_all_no_apps and jdk_core.

--
Thanks,
Nick

From rwestrel at redhat.com  Fri Jul 24 07:20:38 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Fri, 24 Jul 2020 09:20:38 +0200
Subject: [8u] RFR: 8240676: Meet not symmetric failure when running lucene on
 jdk8
Message-ID: <87zh7pmla1.fsf@redhat.com>


Original bug:
https://bugs.openjdk.java.net/browse/JDK-8240676
https://hg.openjdk.java.net/jdk/jdk/rev/6ccf082f50d4

The context in compile.hpp changed so the original patch requires a
small adjustment. Testing triggered a crash, so I had to cherry-pick the
change in type.cpp line 3996 from an RFE that was integrated in a later
version of the jdk: 8031755 (Type speculation should be used to optimize
explicit null checks).

8u webrev:
http://cr.openjdk.java.net/~roland/8240676.8u/webrev.00/

Testing: x86_64, verified new test fails with the fix commented out,
works otherwise, hotspot/compiler jtreg, some CTW, ran octane with
nashorn.

Roland.


From aph at redhat.com  Fri Jul 24 07:49:37 2020
From: aph at redhat.com (Andrew Haley)
Date: Fri, 24 Jul 2020 08:49:37 +0100
Subject: [aarch64-port-dev ] RFR(XS): 8249781: AArch64: AOT compiled code
 crashes if C2 allocates r27
In-Reply-To: <85eep11m2z.fsf@nicgas01-pc.shanghai.arm.com>
References: <85eep11m2z.fsf@nicgas01-pc.shanghai.arm.com>
Message-ID: <6780400a-dd63-fef2-fb38-b92d2e9d8292@redhat.com>

Hi,

On 24/07/2020 07:08, Nick Gasson wrote:
> Bug: https://bugs.openjdk.java.net/browse/JDK-8249781
> Webrev: http://cr.openjdk.java.net/~ngasson/8249781/webrev.0/
> 
> AOT compiled code always assumes r27 is the heap base pointer, but since
> JDK-8242449 C2 can allocate it as a general register if the compressed
> class base is null. If C2-compiled code that uses r27 runs before AOT
> code, rheapbase will be clobbered causing a crash in the AOT code.
> 
> To reproduce:
> 
>   make test TEST="compiler/aot/cli/jaotc/CompileModuleTest.java" \
>      JTREG="VM_OPTIONS=-Xcomp -XX:-TieredCompilation"
> 
> Fix by checking if AOT is enabled before using r27 as a general
> register.
> 
> Tested with jtreg hotspot_all_no_apps and jdk_core.

OK, thanks. Are there any backports needed?

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From nick.gasson at arm.com  Fri Jul 24 07:59:55 2020
From: nick.gasson at arm.com (Nick Gasson)
Date: Fri, 24 Jul 2020 15:59:55 +0800
Subject: [aarch64-port-dev ] RFR(XS): 8249781: AArch64: AOT compiled code
 crashes if C2 allocates r27
In-Reply-To: <6780400a-dd63-fef2-fb38-b92d2e9d8292@redhat.com>
References: <85eep11m2z.fsf@nicgas01-pc.shanghai.arm.com>
 <6780400a-dd63-fef2-fb38-b92d2e9d8292@redhat.com>
Message-ID: <85blk51gxw.fsf@nicgas01-pc.shanghai.arm.com>


On 07/24/20 15:49 pm, Andrew Haley wrote:
>
> OK, thanks. Are there any backports needed?

It only affects JDK 15 and tip. I don't know if it's appropriate for
jdk15 in the current RDP2 phase as AOT is an experimental feature? (The
JBS entry is P3.)

--
Thanks,
Nick

From aph at redhat.com  Fri Jul 24 08:15:23 2020
From: aph at redhat.com (Andrew Haley)
Date: Fri, 24 Jul 2020 09:15:23 +0100
Subject: RFR 8249189: AARCH64: more L2I conversions can be skipped (ubfiz)
In-Reply-To: <209c5713-4218-4e9c-037d-fe337734697f@bell-sw.com>
References: <209c5713-4218-4e9c-037d-fe337734697f@bell-sw.com>
Message-ID: <8dc598ba-f17e-9d0c-db02-1a329dc010c4@redhat.com>

On 22/07/2020 14:36, Boris Ulasevich wrote:
> Please review the update for aarch64 AD template file to generate more
> bitfield extraction rules where I2L and L2I conversions can be skipped.
> 
> http://cr.openjdk.java.net/~bulasevich/8249189/webrev.02
> http://bugs.openjdk.java.net/browse/JDK-8249189
> 
> Tested with JTREG and manual [1] tests.

4056 operand immL_positive_bitmaskI()
4057 %{
4058   predicate((n->get_long() != 0)
4059             && ((n->get_long() & 0xffffffff80000000L) == 0)
4060             && is_power_of_2(n->get_long() + 1));
4061   match(ConL);
4062
4063   op_cost(0);
4064   format %{ %}
4065   interface(CONST_INTER);
4066 %}

Isn't this a difficult-to-understand way of saying

4058   predicate((n->get_long() != 0)
4059             && ((julong)n->get_long() < 0x80000000LL)
4060             && is_power_of_2(n->get_long() + 1));

Note the "LL" here: we have to work with LLP64 systems.

Otherwise OK.
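
A minimal standalone sketch of why the two predicate forms agree, assuming
is_power_of_2 is true exactly when one bit is set. The helper names below
are made up for the illustration and are not HotSpot code. The ULL suffix
makes the 64-bit width of the constants explicit instead of relying on the
width of long.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for is_power_of_2 (assumption: true for exactly one set bit).
inline bool is_pow2(uint64_t v) { return v != 0 && (v & (v - 1)) == 0; }

// Mask form: no bit at position 31 or above may be set.
inline bool bitmask_i_masked(int64_t v) {
  return v != 0
      && (static_cast<uint64_t>(v) & 0xffffffff80000000ULL) == 0
      && is_pow2(static_cast<uint64_t>(v) + 1);
}

// Compare form: the same condition written as an unsigned range check.
inline bool bitmask_i_compare(int64_t v) {
  return v != 0
      && static_cast<uint64_t>(v) < 0x80000000ULL
      && is_pow2(static_cast<uint64_t>(v) + 1);
}
```

Negative values have their top bits set, so the mask form rejects them; cast
to unsigned they become very large, so the compare form rejects them too.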

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Fri Jul 24 08:18:40 2020
From: aph at redhat.com (Andrew Haley)
Date: Fri, 24 Jul 2020 09:18:40 +0100
Subject: [aarch64-port-dev ] RFR 8248870: AARCH64: I2L conversions can be
 skipped for small positive masked values
In-Reply-To: <05369383-c6d8-5e61-50ce-51fec955e2d4@bell-sw.com>
References: <d83ab998-ad1e-4889-b637-dc7f0369c50e@bell-sw.com>
 <9ccf64f1-7a88-0f67-8b50-4dea09af9c8b@redhat.com>
 <e54d3d17-c865-a422-e9eb-05b3936e81da@bell-sw.com>
 <05369383-c6d8-5e61-50ce-51fec955e2d4@bell-sw.com>
Message-ID: <e3e7ad61-7747-a16a-6335-f3eed6c507ca@redhat.com>

On 23/07/2020 12:25, Boris Ulasevich wrote:
> Since the JDK-8248414 patch has been committed, I believe we can revive
> this review. I think it is still better to move my rule to the ubfiz 
> command group,
> which is in the auto-generated area.
> 
> http://cr.openjdk.java.net/~bulasevich/8248870/webrev.02

OK, thanks.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Fri Jul 24 08:19:10 2020
From: aph at redhat.com (Andrew Haley)
Date: Fri, 24 Jul 2020 09:19:10 +0100
Subject: [aarch64-port-dev ] RFR(XS): 8249781: AArch64: AOT compiled code
 crashes if C2 allocates r27
In-Reply-To: <85blk51gxw.fsf@nicgas01-pc.shanghai.arm.com>
References: <85eep11m2z.fsf@nicgas01-pc.shanghai.arm.com>
 <6780400a-dd63-fef2-fb38-b92d2e9d8292@redhat.com>
 <85blk51gxw.fsf@nicgas01-pc.shanghai.arm.com>
Message-ID: <b7fc0116-0784-b9b9-fcbe-c611990927d6@redhat.com>

On 24/07/2020 08:59, Nick Gasson wrote:
> 
> On 07/24/20 15:49 pm, Andrew Haley wrote:
>>
>> OK, thanks. Are there any backports needed?
> 
> It only affects JDK 15 and tip. I don't know if it's appropriate for
> jdk15 in the current RDP2 phase as AOT is an experimental feature? (The
> JBS entry is P3.)

Probably no need, thanks.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From adinn at redhat.com  Fri Jul 24 08:57:53 2020
From: adinn at redhat.com (Andrew Dinn)
Date: Fri, 24 Jul 2020 09:57:53 +0100
Subject: [8u] RFR: 8240676: Meet not symmetric failure when running lucene
 on jdk8
In-Reply-To: <87zh7pmla1.fsf@redhat.com>
References: <87zh7pmla1.fsf@redhat.com>
Message-ID: <9ae30a2b-3443-9954-950e-08e7e26ddd97@redhat.com>

Hi Roland,

On 24/07/2020 08:20, Roland Westrelin wrote:
> 
> Original bug:
> https://bugs.openjdk.java.net/browse/JDK-8240676
> https://hg.openjdk.java.net/jdk/jdk/rev/6ccf082f50d4
> 
> The context in compile.hpp changed so the original patch requires a
> small adjustment. Testing triggered a crash, so I had to cherry-pick the
> change in type.cpp line 3996 from an RFE that was integrated in a later
> version of the jdk: 8031755 (Type speculation should be used to optimize
> explicit null checks).
> 
> 8u webrev:
> http://cr.openjdk.java.net/~roland/8240676.8u/webrev.00/
> 
> Testing: x86_64, verified new test fails with the fix commented out,
> works otherwise, hotspot/compiler jtreg, some CTW, ran octane with
> nashorn.
The changes to Type::meet_helper and Type::check_symmetrical look fine.

However, I don't understand what the cherry-picked change to line 3996
in TypeAryPtr::xmeet_helper does and why it is legitimate:

-      return make(NotNull, NULL, tary, lazy_klass, false, off,
InstanceBot);
+      return make(NotNull, NULL, tary, lazy_klass, false, off,
InstanceBot, speculative, depth);

Obviously it fixes a crash but -- for the record -- can you explain

  1) why the crash happened and how this fixes it

  2) why this was not needed in the upstream patch and is needed here

regards,


Andrew Dinn
-----------
Red Hat Distinguished Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill


From rwestrel at redhat.com  Fri Jul 24 09:26:36 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Fri, 24 Jul 2020 11:26:36 +0200
Subject: [8u] RFR: 8240676: Meet not symmetric failure when running lucene
 on jdk8
In-Reply-To: <9ae30a2b-3443-9954-950e-08e7e26ddd97@redhat.com>
References: <87zh7pmla1.fsf@redhat.com>
 <9ae30a2b-3443-9954-950e-08e7e26ddd97@redhat.com>
Message-ID: <87wo2tmfg3.fsf@redhat.com>


Hi Andrew,

Thanks for looking at this one.

> The changes to Type::meet_helper and Type::check_symmetrical look fine.
>
> However, I don't understand what the cherry-picked change to line 3996
> in TypeAryPtr::xmeet_helper does and why it is legitimate:
>
> -      return make(NotNull, NULL, tary, lazy_klass, false, off,
> InstanceBot);
> +      return make(NotNull, NULL, tary, lazy_klass, false, off,
> InstanceBot, speculative, depth);
>
> Obviously it fixes a crash but -- for the record -- can you explain
>
>   1) why the crash happened and how this fixes it

The background for this patch is the following: we saw a rare crash
during testing. The crash couldn't be reproduced. My attempts at a test
case didn't succeed either. So instead, I made a change to the
verification code in the type system so it stress tested some
combinations of types that were usually rarely exercised. It was then
easy to write a test case that triggered the failure and implement a
fix.

The risk with this change is not so much in the fix itself but in the
improvement to the verification code that can uncover bugs that we were
not aware of before. That's what happens with 8u where we hit a bug that
was never seen with 8u before.

Object pointer types have 2 parts: a known type part and a speculative
part. When the verification code triggers, it verifies both parts. In the
case of this fix, the speculative part gets accidentally dropped. The
new verification code catches it. The previous one didn't for some
reason.
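
As a rough illustration only: the toy model below is not HotSpot's Type
lattice, and the "dropping" path is a hypothetical stand-in for the branch
that rebuilt the result without its speculative part. It sketches how a
symmetry check that compares both parts can catch such a meet.

```cpp
#include <cassert>

// Toy type: two bit sets, with meet modelled as bitwise OR.
struct ToyType {
  unsigned known;        // known part
  unsigned speculative;  // speculative part (0 = none)
};

inline bool same(const ToyType& a, const ToyType& b) {
  return a.known == b.known && a.speculative == b.speculative;
}

// Meet that combines both parts.
inline ToyType meet_full(const ToyType& a, const ToyType& b) {
  return ToyType{a.known | b.known, a.speculative | b.speculative};
}

// Meet with a hypothetical path (taken when the receiver has bit 0 set)
// that rebuilds the result without the speculative part.
inline ToyType meet_dropping(const ToyType& a, const ToyType& b) {
  if (a.known & 1u) {
    return ToyType{a.known | b.known, 0u};
  }
  return meet_full(a, b);
}

// Symmetry check over both parts: meet(a, b) must equal meet(b, a).
template <typename Meet>
bool is_symmetric(Meet meet, const ToyType& a, const ToyType& b) {
  return same(meet(a, b), meet(b, a));
}
```

In this toy, the dropping meet agrees with the full one on the known part,
so a check that looked only at that part would not notice the difference.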

>   2) why this was not needed in the upstream patch and is needed here

I cherry-picked the change from a later release (jdk 9 I think). So the
change was not needed in the 11u patch because it was already there.

Roland


From tobias.hartmann at oracle.com  Fri Jul 24 09:52:42 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 24 Jul 2020 11:52:42 +0200
Subject: RFR(S): 8247732: validate user-input intrinsic_ids in
 ControlIntrinsic
In-Reply-To: <1595520162373.22868@amazon.com>
References: <821e3d29-c95b-aafc-8ee5-6e49a1bdde82@amazon.com>
 <9b324805-eb86-27e1-5dcb-96a823f8495b@amazon.com>
 <82cba5e4-2020-ce0a-4576-e8e0cc2e5ae5@oracle.com>
 <1595401959932.33284@amazon.com>
 <a03d92d6-ad07-b347-7452-776459b8d174@oracle.com>
 <1595520162373.22868@amazon.com>
Message-ID: <916b3a4a-5617-941d-6161-840f3ea900bd@oracle.com>

Hi Liu,

On 23.07.20 18:02, Liu, Xin wrote:
> That is my intention too, but CompilerOracle doesn't exit JVM when it encounters parsing errors. 
> It just extracts as much information from CompileCommand as possible. That makes sense because compiler "directives" are supposed to be optional for program execution. 
> 
> I do put the error message in parser's errorbuf.  I set a flag "exit_on_error" to quit JVM after it dumps parser errors. yes, I treat undefined intrinsics as fatal errors.  
> This behavior is from Nils' comment: "I want to see an error on startup if the user has specified unknown intrinsic names."  It is also consistent with JVM option -XX:ControlIntrinsic=. 

Okay, thanks for the explanation! I would prefer consistency in error handling of compiler
directives, i.e., handle all parser failures the same way. But I leave it to Nils to decide.

Best regards,
Tobias

From boris.ulasevich at bell-sw.com  Fri Jul 24 10:48:16 2020
From: boris.ulasevich at bell-sw.com (Boris Ulasevich)
Date: Fri, 24 Jul 2020 13:48:16 +0300
Subject: RFR 8249893: AARCH64: optimize the construction of the value from the
 bits of the other two
Message-ID: <c11d8af8-bca7-7a7e-5486-94c487f241ac@bell-sw.com>

Hi,

Please review the change to C2 and AArch64 which reduces constructs
like "(v1 & 0xFF) | ((v2 & 0xFF) << 8)" into two Bitfield Insert
instructions.

http://bugs.openjdk.java.net/browse/JDK-8249893
http://cr.openjdk.java.net/~bulasevich/8249893/webrev.00

The change in common code was made to enable Node::is_AndL method.
The method in the rule predicate is required to find out if we are within
the straight or reversed rule (ADLC adds rule with swapped parameters
for commutative operands).

Tested with JTREG and generated [1] tests.

thanks,
Boris

[1] http://cr.openjdk.java.net/~bulasevich/8249893/webrev.00/Gen.java
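
A small software model of the idea, for illustration. The bfi helper below
is an assumption that mimics a bitfield insert (copy the low bits of a
source into a destination at a given position, leaving other destination
bits unchanged). How the actual patch sequences the instructions is not
shown in this message, so the two-insert sequence is only a sketch.

```cpp
#include <cassert>
#include <cstdint>

// Model of a bitfield insert: copy the low `width` bits of src into dst
// starting at bit `lsb`. Precondition: width >= 1 and lsb + width <= 64.
inline uint64_t bfi(uint64_t dst, uint64_t src, unsigned lsb, unsigned width) {
  uint64_t field = (width == 64) ? ~0ULL : ((1ULL << width) - 1);
  return (dst & ~(field << lsb)) | ((src & field) << lsb);
}

// The construct from the message: two ANDs, a shift and an OR.
inline uint64_t combine_masked(uint64_t v1, uint64_t v2) {
  return (v1 & 0xFF) | ((v2 & 0xFF) << 8);
}

// The same value built from two inserts into a zeroed destination.
inline uint64_t combine_bfi(uint64_t v1, uint64_t v2) {
  uint64_t r = bfi(0, v1, 0, 8);
  return bfi(r, v2, 8, 8);
}
```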

From boris.ulasevich at bell-sw.com  Fri Jul 24 10:35:41 2020
From: boris.ulasevich at bell-sw.com (Boris Ulasevich)
Date: Fri, 24 Jul 2020 13:35:41 +0300
Subject: RFR 8249189: AARCH64: more L2I conversions can be skipped (ubfiz)
In-Reply-To: <8dc598ba-f17e-9d0c-db02-1a329dc010c4@redhat.com>
References: <209c5713-4218-4e9c-037d-fe337734697f@bell-sw.com>
 <8dc598ba-f17e-9d0c-db02-1a329dc010c4@redhat.com>
Message-ID: <34b9e725-de15-4cb4-c211-9c870c871c52@bell-sw.com>

Hi Andrew,

Thank you! Fixed inline:
(julong)n->get_long() < 0x80000000ULL

Boris

On 24.07.2020 11:15, Andrew Haley wrote:
> On 22/07/2020 14:36, Boris Ulasevich wrote:
>> Please review the update for aarch64 AD template file to generate more
>> bitfield extraction rules where I2L and L2I conversions can be skipped.
>>
>> http://cr.openjdk.java.net/~bulasevich/8249189/webrev.02
>> http://bugs.openjdk.java.net/browse/JDK-8249189
>>
>> Tested with JTREG and manual [1] tests.
> 4056 operand immL_positive_bitmaskI()
> 4057 %{
> 4058   predicate((n->get_long() != 0)
> 4059             && ((n->get_long() & 0xffffffff80000000L) == 0)
> 4060             && is_power_of_2(n->get_long() + 1));
> 4061   match(ConL);
> 4062
> 4063   op_cost(0);
> 4064   format %{ %}
> 4065   interface(CONST_INTER);
> 4066 %}
>
> Isn't this a difficult-to-understand way of saying
>
> 4058   predicate((n->get_long() != 0)
> 4059             && ((julong)n->get_long() < 0x80000000LL)
> 4060             && is_power_of_2(n->get_long() + 1));
>
> Note the "LL" here: we have to work with LLP64 systems.
>
> Otherwise OK.
>


From christian.hagedorn at oracle.com  Fri Jul 24 11:44:41 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Fri, 24 Jul 2020 13:44:41 +0200
Subject: [16] RFR(S): 8248552: C2 crashes with SIGFPE due to division by
 zero
In-Reply-To: <518cd022-73e1-cb5c-499d-86853ae679c3@oracle.com>
References: <70e8e42b-5cb3-9c1e-419e-2f771f042368@oracle.com>
 <3ba2ef6a-8ade-7ede-5252-21051c34b472@oracle.com>
 <9e2f26bd-daa4-9540-8401-9850e0beea94@oracle.com>
 <c569ebf8-58ed-c6df-2e0e-38c692f926c6@oracle.com>
 <a2066deb-ef3b-7071-33ff-03c829069cff@oracle.com>
 <f6a62d10-9cff-3fc2-0947-c49e775a23c9@oracle.com>
 <e8d6aa47-b584-4fdc-f506-1b1c52aa28f5@oracle.com>
 <ca043859-bb7b-f64a-a37d-7b4e47c745d6@oracle.com>
 <5b2e7b1b-24f7-d575-58a3-376ec9ab7944@oracle.com>
 <518cd022-73e1-cb5c-499d-86853ae679c3@oracle.com>
Message-ID: <2f5978fe-af76-df18-15c0-dcc62563299d@oracle.com>

Hi Tobias

Thank you for your review!

> Please make sure to run performance testing.

There is a repeated regression in the micros open crypto benchmark 
openjdk.bench.javax.crypto.small.SecureRandomBench.nextBytes with these 
two settings:
- algorithm=SHA1PRNG-dataSize:64-provider:-shared:false
- algorithm=SHA1PRNG-dataSize:64-provider:-shared:true

Repeated runs with these two settings resulted in a regression between 1 
and 2%. I could trace it back to the additional type filtering in 
PhiNode::Value() (webrev.02). This is only required for the assertion 
code and not for the bailout fix itself. When running performance 
testing with webrev.01, the regressions disappear.

I therefore suggest going with webrev.01 (without assertion code and 
type filtering) and file a new RFE to investigate the usage of type 
filtering in PhiNode::Value() for iv phis and why we get a performance 
regression in these two benchmark settings. In theory, I think it should 
be beneficial to narrow the type range of iv phis.

> cfgnode.cpp:1083
> - There's an extra whitespace before ","
> 
> loopopts.cpp:84/86
> - No need for extra brackets

These are not present anymore in webrev.01.

http://cr.openjdk.java.net/~chagedorn/8248552/webrev.01/

Best regards,
Christian


On 20.07.20 11:14, Tobias Hartmann wrote:
> Hi Christian,
> 
> On 15.07.20 15:08, Christian Hagedorn wrote:
>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.02/
> 
> Looks good to me.
> 
> Some code style comments:


> Best regards,
> Tobias
> 

From adinn at redhat.com  Fri Jul 24 12:32:21 2020
From: adinn at redhat.com (Andrew Dinn)
Date: Fri, 24 Jul 2020 13:32:21 +0100
Subject: [8u] RFR: 8240676: Meet not symmetric failure when running lucene
 on jdk8
In-Reply-To: <87wo2tmfg3.fsf@redhat.com>
References: <87zh7pmla1.fsf@redhat.com>
 <9ae30a2b-3443-9954-950e-08e7e26ddd97@redhat.com> <87wo2tmfg3.fsf@redhat.com>
Message-ID: <af9c4744-de1b-d9ad-6a4d-c69f1f273049@redhat.com>

On 24/07/2020 10:26, Roland Westrelin wrote:
>> The changes to Type::meet_helper and Type::check_symmetrical look fine.
>>
>> However, I don't understand what the cherry-picked change to line 3996
>> in TypeAryPtr::xmeet_helper does and why it is legitimate:
>>
>> -      return make(NotNull, NULL, tary, lazy_klass, false, off,
>> InstanceBot);
>> +      return make(NotNull, NULL, tary, lazy_klass, false, off,
>> InstanceBot, speculative, depth);
>>
>> Obviously it fixes a crash but -- for the record -- can you explain
>>
>>   1) why the crash happened and how this fixes it
> 
> The background for this patch is the following: we saw a rare crash
> during testing. The crash couldn't be reproduced. My attempts at a test
> case didn't succeed either. So instead, I made a change to the
> verification code in the type system so it stress tested some
> combinations of types that were usually rarely exercised. It was then
> easy to write a test case that triggered the failure and implement a
> fix.

Ok, understood.

> The risk with this change is not so much in the fix itself but in the
> improvement to the verification code that can uncover bugs that we were
> not aware of before. That's what happens with 8u where we hit a bug that
> was never seen with 8u before.

Ok, but all the verification code happens under #ifdef ASSERT so that is
only going to change behaviour in non-production builds right?

i.e. the important change is the one to the meet code?

> Object pointer types have 2 parts: a known type part and a speculative
> part. When the verification code triggers, it verifies both parts. In the
> case of this fix, the speculative part gets accidentally dropped. The
> new verification code catches it. The previous one didn't for some
> reason.

Ah ok, I get this now. The change ensures that the speculative type of
the meet type is the meet of the respective speculative types.

That may well change behaviour for some programs as meets are computed
outside of the changed verification path. I'd like to assume the
benefits of improving type accuracy override the risk. Do you think that
is justified? (one might argue that improved type accuracy is not always
better, especially for speculative info where avoiding the erasure might
enable optimizations not previously attempted).

>>   2) why this was not needed in the upstream patch and is needed here
> 
> I cherry-picked the change from a later release (jdk 9 I think). So the
> change was not needed in the 11u patch because it was already there.
Doh! Of course. Thanks for the explanation.

Well, the change looks good to me but I'm not really in a position to
assess the risk of the xmeet change. I am reassured that it exists in
the upstream code and is not known to have caused any errors.

regards,


Andrew Dinn
-----------
Red Hat Distinguished Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill


From lutz.schmidt at sap.com  Fri Jul 24 12:51:14 2020
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Fri, 24 Jul 2020 12:51:14 +0000
Subject: RFR(XS): -XX:+CITime triggers guarantee(events != NULL) in jvmci.cpp
Message-ID: <DA1DEC95-7C81-4B4A-841B-2DBD93D29183@sap.com>

Dear all, 

may I please request reviews for this small fix? I would even say it is a trivial fix. It inverts an if condition such that JVMCI specific code is called only when JVMCI compilation is enabled via UseJVMCICompiler.

Bug:    https://bugs.openjdk.java.net/browse/JDK-8250233
Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8250233.00/ 

Local testing looks good. jdk/submit tests pending.

Thank you!
Lutz

 
From lutz.schmidt at sap.com  Fri Jul 24 12:53:21 2020
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Fri, 24 Jul 2020 12:53:21 +0000
Subject: RFR(XS): 8250233: -XX:+CITime triggers guarantee(events != NULL)
 in jvmci.cpp
Message-ID: <E9E87374-CAA7-4BB8-8860-D5F03080098E@sap.com>

Resending after updating subject line with bug id. 
Sorry for the spam.
Lutz

On 24.07.20, 14:51, "Schmidt, Lutz" <lutz.schmidt at sap.com> wrote:

    Dear all, 

    may I please request reviews for this small fix? I would even say it is a trivial fix. It inverts an if condition such that JVMCI specific code is called only when JVMCI compilation is enabled via UseJVMCICompiler.

    Bug:    https://bugs.openjdk.java.net/browse/JDK-8250233
    Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8250233.00/ 

    Local testing looks good. jdk/submit tests pending.

    Thank you!
    Lutz


From christian.hagedorn at oracle.com  Fri Jul 24 12:57:57 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Fri, 24 Jul 2020 14:57:57 +0200
Subject: [16] RFR(XS): 8249602: C2: assert(cnt == _outcnt) failed: no
 insertions allowed
Message-ID: <2cd118ab-c117-bf61-ae03-117b9383a5e6@oracle.com>

Hi

Please review the following patch:
https://bugs.openjdk.java.net/browse/JDK-8249602
http://cr.openjdk.java.net/~chagedorn/8249602/webrev.00/

The testcase hits the assert when inserting a post loop. When correcting 
the fall-in values to the post-loop phis to take the values from the 
main-loop, we have to separately handle nodes that belong to the 
backedge control block and cannot float. In this process, we clone data 
nodes in PhaseIdealLoop::clone_up_backedge_goo and then hit the assert 
because some nodes to be cloned have a control input from the main-loop 
header node (main_head). These nodes are cloned and the main_head node 
gets these nodes as additional output nodes. This should be fine but the 
DUIterator_Fast forbids insertions.

The fix simply switches to a normal DUIterator which allows insertions. 
This should also be done when correcting the fall-in values to the 
main-loop to take the values from the pre-loop.
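
To illustrate the difference in spirit (this is a toy model with made-up
names, not HotSpot's DUIterator classes): a walk that captures the output
count once is only valid if nothing is inserted during the loop, while a
walk that re-reads the count on each step tolerates appends.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a node's output list.
struct ToyNode {
  std::vector<int> outs;
};

// "Fast" style: the count is captured once up front. Returns false if the
// list grew during the walk, in the spirit of assert(cnt == _outcnt).
template <typename Visit>
bool walk_fixed_count(ToyNode& n, Visit visit) {
  const std::size_t cnt = n.outs.size();
  for (std::size_t i = 0; i < cnt; ++i) {
    visit(n, i);
  }
  return cnt == n.outs.size();
}

// Insertion-tolerant style: the bound is re-read on every iteration and
// elements are accessed by index, so appends during the walk are fine.
template <typename Visit>
std::size_t walk_rereading_size(ToyNode& n, Visit visit) {
  std::size_t visited = 0;
  for (std::size_t i = 0; i < n.outs.size(); ++i, ++visited) {
    visit(n, i);
  }
  return visited;
}
```

A visitor that appends a "clone" to the same list makes the first walk
report a changed count, while the second simply visits the new entry too.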

Best regards,
Christian

From rwestrel at redhat.com  Fri Jul 24 12:59:44 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Fri, 24 Jul 2020 14:59:44 +0200
Subject: [8u] RFR: 8240676: Meet not symmetric failure when running lucene
 on jdk8
In-Reply-To: <af9c4744-de1b-d9ad-6a4d-c69f1f273049@redhat.com>
References: <87zh7pmla1.fsf@redhat.com>
 <9ae30a2b-3443-9954-950e-08e7e26ddd97@redhat.com> <87wo2tmfg3.fsf@redhat.com>
 <af9c4744-de1b-d9ad-6a4d-c69f1f273049@redhat.com>
Message-ID: <87lfj9m5kv.fsf@redhat.com>


> Ok, but all the verification code happens under #ifdef ASSERT so that is
> only going to change behaviour in non-production builds right?
>
> i.e. the important change is the one to the meet code?

Yes, to both.

> That may well change behaviour for some programs as meets are computed
> outside of the changed verification path. I'd like to assume the
> benefits of improving type accuracy override the risk. Do you think that
> is justified? (one might argue that improved type accuracy is not always
> better, especially for speculative info where avoiding the erasure might
> enable optimizations not previously attempted).

I would say both benefit and risk are small. Without the speculative
type change, we'll hit failures in the new verification code so some
other tweak would have to be done to work around them. Not sure what
could be done but that would likely be as risky. Or only the actual fix
is backported and the new verification code is left out. Then the
speculative type fix is not required. But a regression wouldn't be
caught either.

Roland.


From coleen.phillimore at oracle.com  Fri Jul 24 13:10:24 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 24 Jul 2020 09:10:24 -0400
Subject: RFR (T) 8250042: Clean up methodOop and method_oop names from the
 code
In-Reply-To: <c5392d2c-7fe1-09e5-3366-4b9840fef087@oracle.com>
References: <85efc3ab-abbf-c5f2-9b7b-47fa516d9a2d@oracle.com>
 <6f973a0a-cf55-e1ab-8de3-b57f68dbd2cf@oracle.com>
 <c5392d2c-7fe1-09e5-3366-4b9840fef087@oracle.com>
Message-ID: <c6ec9d68-e44f-7288-d084-e64d62a79490@oracle.com>


I can also replace method_oop with method_ptr in the CPU ad files, and 
this seems to build but now someone who knows the compiler area needs to 
comment; this was supposed to be trivial... :) But it still is really 
trivial to look at.

I left interpreter_method_oop_reg and compiler_method_oop_reg and 
friends in opto/matcher.cpp for someone else.

incremental webrev at 
http://cr.openjdk.java.net/~coleenp/2020/8250042.02.incr/webrev
full webrev at http://cr.openjdk.java.net/~coleenp/2020/8250042.02/webrev

Thanks,
Coleen


On 7/24/20 8:23 AM, coleen.phillimore at oracle.com wrote:
>
> Thanks for looking at this.
>
> On 7/24/20 1:01 AM, David Holmes wrote:
>> Hi Coleen,
>>
>> On 24/07/2020 2:58 am, coleen.phillimore at oracle.com wrote:
>>> See bug for more details. I've been running into these names a lot 
>>> lately. Many of these names are in JVMTI.
>>>
>>> Tested with tier1 on all Oracle platforms and built on non-Oracle 
>>> platforms.
>>>
>>> open webrev at 
>>> http://cr.openjdk.java.net/~coleenp/2020/8250042.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8250042
>>
>> src/hotspot/cpu/*/*.ad
>>
>> These still refer to "method oop" and method_oop in a number of places.
>
> Yes, I only replaced method_oop in the shared code and not in the AD 
> code. method_oop can be the name of a parameter and using "sed" to 
> change it to "method" doesn't work. Somebody who understands this 
> code and looks at it will have to make the rest of the changes.
>
> What I did was replace "method oop" with "method" and "methodOop" with 
> "method" in all the sources. I replaced "method_oop" with "method" or 
> "checked_method" in the shared sources.
>
>>
>> src/hotspot/share/adlc/adlparse.cpp
>>
>> +  frame->_interpreter_method_oop_reg = parse_one_arg("method reg 
>> entry");
>>
>> I guess I'm not understanding the scope of this renaming - why is 
>> _interpreter_method_oop_reg not renamed as well? Should this (and 
>> other uses) be parsed as method-(oop-reg) rather than (method-oop)-reg?
>
> I don't know this code, so I'd rather not change any more of it. The 
> comment makes sense changed, even though the variable name still 
> refers to method_oop.
>
> Thanks,
> Coleen
>>
>> Otherwise all okay.
>>
>> Thanks,
>> David
>>
>>> Thanks,
>>> Coleen
>


From boris.ulasevich at bell-sw.com  Fri Jul 24 13:19:33 2020
From: boris.ulasevich at bell-sw.com (Boris Ulasevich)
Date: Fri, 24 Jul 2020 16:19:33 +0300
Subject: [aarch64-port-dev ] RFR 8248870: AARCH64: I2L conversions can be
 skipped for small positive masked values
In-Reply-To: <e3e7ad61-7747-a16a-6335-f3eed6c507ca@redhat.com>
References: <d83ab998-ad1e-4889-b637-dc7f0369c50e@bell-sw.com>
 <9ccf64f1-7a88-0f67-8b50-4dea09af9c8b@redhat.com>
 <e54d3d17-c865-a422-e9eb-05b3936e81da@bell-sw.com>
 <05369383-c6d8-5e61-50ce-51fec955e2d4@bell-sw.com>
 <e3e7ad61-7747-a16a-6335-f3eed6c507ca@redhat.com>
Message-ID: <ab938f27-3fc8-e202-e57c-2ab8849d6db7@bell-sw.com>

Thank you for review, Andrew!

Boris

On 24.07.2020 11:18, Andrew Haley wrote:
> On 23/07/2020 12:25, Boris Ulasevich wrote:
>> Since the JDK-8248414 patch has been committed, I believe we can revive
>> this review. I think it is still better to move my rule to the ubfiz
>> command group,
>> which is in the auto-generated area.
>>
>> http://cr.openjdk.java.net/~bulasevich/8248870/webrev.02
> OK, thanks.
>


From doug.simon at oracle.com  Fri Jul 24 13:21:14 2020
From: doug.simon at oracle.com (Doug Simon)
Date: Fri, 24 Jul 2020 15:21:14 +0200
Subject: RFR(XS): -XX:+CITime triggers guarantee(events != NULL) in
 jvmci.cpp
In-Reply-To: <DA1DEC95-7C81-4B4A-841B-2DBD93D29183@sap.com>
References: <DA1DEC95-7C81-4B4A-841B-2DBD93D29183@sap.com>
Message-ID: <9800BE84-F2C2-456A-BDAD-66D3CA74DBF9@oracle.com>

This is not quite right. UseJVMCICompiler means "use JVMCI compiler as top tier JIT compiler". It is still possible to use the JVMCI compiler via its Java API without using it as the top tier JIT compiler (i.e. "hosted" mode).

I think what you are aiming for is to omit printing this info when the JVMCI compiler is not used at all. This patch should achieve that:

diff -r f564ec7074f0 src/hotspot/share/jvmci/jvmciCompiler.cpp
--- a/src/hotspot/share/jvmci/jvmciCompiler.cpp Thu Jul 23 11:47:20 2020 +0200
+++ b/src/hotspot/share/jvmci/jvmciCompiler.cpp Fri Jul 24 15:18:22 2020 +0200
@@ -146,6 +146,8 @@

 // Print compilation timers and statistics
 void JVMCICompiler::print_compilation_timers() {
-  JVMCI_event_1("JVMCICompiler::print_timers");
-  tty->print_cr("       JVMCI code install time:        %6.3f s",    _codeInstallTimer.seconds());
+  if (_codeInstallTimer.seconds() != 0) {
+    JVMCI_event_1("JVMCICompiler::print_timers");
+    tty->print_cr("       JVMCI code install time:        %6.3f s",    _codeInstallTimer.seconds());
+  }
 }

-Doug

> On 24 Jul 2020, at 14:51, Schmidt, Lutz <lutz.schmidt at sap.com> wrote:
> 
> Dear all, 
> 
> may I please request reviews for this small fix? I would even say it is a trivial fix. It inverts an if condition such that JVMCI specific code is called only when JVMCI compilation is enabled via UseJVMCICompiler.
> 
> Bug:    https://bugs.openjdk.java.net/browse/JDK-8250233
> Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8250233.00/ 
> 
> Local testing looks good. jdk/submit tests pending.
> 
> Thank you!
> Lutz
> 
> 
> 


From luhenry at microsoft.com  Fri Jul 24 15:53:25 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Fri, 24 Jul 2020 15:53:25 +0000
Subject: RFR[XXS] 8248672: utilities: Introduce DEPRECATED macro for GCC and
 MSVC
Message-ID: <MWHPR21MB05113A054F2D8291BFAFB085B0770@MWHPR21MB0511.namprd21.prod.outlook.com>

Hi,

Could I please have a review on the following. It simply adds the `DEPRECATED` macro to wrap `__attribute__ ((deprecated))` for GCC, and the equivalent for MSVC.

JBS: https://bugs.openjdk.java.net/browse/JDK-8248672
Webrev: http://cr.openjdk.java.net/~burban/luhenry/8248672/webrev.00

Thank you,

--
Ludovic
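
For readers of the archive, a macro of the kind described above might look roughly like the following sketch. It is not the code from the webrev: the functions `legacy_add` and `add` are hypothetical, and the exact form of the macro is an assumption. Only `__attribute__((deprecated))` (GCC/Clang) and `__declspec(deprecated)` (MSVC) are existing compiler features.

```cpp
#include <cassert>

// Assumed shape of a portable deprecation macro; the definition in the
// actual change may differ.
#if defined(_MSC_VER)
  #define DEPRECATED __declspec(deprecated)
#elif defined(__GNUC__)
  #define DEPRECATED __attribute__((deprecated))
#else
  #define DEPRECATED  // unknown toolchain: expand to nothing
#endif

// Hypothetical old entry point. Calling it compiles, but the compiler
// emits a deprecation warning at each call site.
DEPRECATED inline int legacy_add(int a, int b) { return a + b; }

// Hypothetical replacement that callers are expected to migrate to.
inline int add(int a, int b) { return a + b; }
```

Note that such a macro only produces a warning; whether that warning breaks the build depends on the warning flags in use.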

From vladimir.kozlov at oracle.com  Fri Jul 24 16:02:23 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 24 Jul 2020 09:02:23 -0700
Subject: RFR[XXS] 8248672: utilities: Introduce DEPRECATED macro for GCC
 and MSVC
In-Reply-To: <MWHPR21MB05113A054F2D8291BFAFB085B0770@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB05113A054F2D8291BFAFB085B0770@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <5e301790-8bfe-0ced-b5e2-8a9c76ae33de@oracle.com>

"And use it in the AArch64 sub system."

Can you explain more? Do you have RFE filed for DEPRECATED use?
It is a small change which is usually done together with its usage. Why do this separately?

Thanks,
Vladimir K

On 7/24/20 8:53 AM, Ludovic Henry wrote:
> Hi,
> 
> Could I please have a review on the following. It simply adds the `DEPRECATED` macro to wrap `__attribute__ ((deprecated))` for GCC, and the equivalent for MSVC.
> 
> JBS: https://bugs.openjdk.java.net/browse/JDK-8248672
> Webrev: http://cr.openjdk.java.net/~burban/luhenry/8248672/webrev.00
> 
> Thank you,
> 
> --
> Ludovic
> 

From luhenry at microsoft.com  Fri Jul 24 16:02:35 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Fri, 24 Jul 2020 16:02:35 +0000
Subject: [aarch64-port-dev ] RFR(S): 8248676: AArch64: Add workaround for
 LITable constructor
In-Reply-To: <0aed0646-c770-03e6-4e0b-5108919b7203@redhat.com>
References: <MWHPR21MB05119533601E0AF1DE3CB0E9B0640@MWHPR21MB0511.namprd21.prod.outlook.com>
 <af179579-b8cd-db69-fede-802614f23092@redhat.com>
 <MWHPR21MB0511A82D28A3FE2A09E3E8ADB0650@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB051121A1E582BCC22FA96F47B07E0@MWHPR21MB0511.namprd21.prod.outlook.com>,
 <0aed0646-c770-03e6-4e0b-5108919b7203@redhat.com>
Message-ID: <MWHPR21MB0511092557F1BBCB3D91276DB0770@MWHPR21MB0511.namprd21.prod.outlook.com>

Hi Andrew,

Are you saying that you would like this change to land into `aarch64-port/jdk-windows` before getting into jdk/jdk? This change doesn't strike me as windows-aarch64 specific and is in line with general removal of GCC-specific code (similarly to the LP64 vs LLP64, or JDK-8248666).

Webrev: http://cr.openjdk.java.net/~burban/luhenry/8248676/webrev.01

Thank you for your review,

--
Ludovic

________________________________________
From: Andrew Haley <aph at redhat.com>
Sent: Thursday, July 16, 2020 01:44
To: Ludovic Henry; hotspot-compiler-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
Cc: openjdk-aarch64
Subject: Re: [aarch64-port-dev ] RFR(S): 8248676: AArch64: Add workaround for LITable constructor

On 15/07/2020 14:27, Ludovic Henry wrote:
> A quick follow-up on that patch. Is there anything you would like to see done differently?

It's fine, but (as discussed) it should go into
http://hg.openjdk.java.net/aarch64-port/jdk-windows/

We'll need to do a regular pull from jdk/jdk into that tree.

--
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com/>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From luhenry at microsoft.com  Fri Jul 24 16:41:52 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Fri, 24 Jul 2020 16:41:52 +0000
Subject: RFR[XXS] 8248672: utilities: Introduce DEPRECATED macro for GCC
 and MSVC
In-Reply-To: <5e301790-8bfe-0ced-b5e2-8a9c76ae33de@oracle.com>
References: <MWHPR21MB05113A054F2D8291BFAFB085B0770@MWHPR21MB0511.namprd21.prod.outlook.com>,
 <5e301790-8bfe-0ced-b5e2-8a9c76ae33de@oracle.com>
Message-ID: <MWHPR21MB051146316FF56A63423DF514B0770@MWHPR21MB0511.namprd21.prod.outlook.com>

I'm not sure I understand your question. Are you asking whether I also replaced all uses of __attribute__((deprecated)) with DEPRECATED? If so, I did replace the only use of it [1] together with defining the macro.

Please let me know if I misunderstood your question.

Thank you.

--
Ludovic

[1] in src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp.

________________________________________
From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
Sent: Friday, July 24, 2020 09:02
To: Ludovic Henry; hotspot-compiler-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
Cc: openjdk-aarch64; hotspot-gc-dev at openjdk.java.net
Subject: Re: RFR[XXS] 8248672: utilities: Introduce DEPRECATED macro for GCC and MSVC

"And use it in the AArch64 sub system."

Can you explain more? Do you have RFE filed for DEPRECATED use?
It is a small change which is usually done together with its usage. Why do this separately?

Thanks,
Vladimir K

On 7/24/20 8:53 AM, Ludovic Henry wrote:
> Hi,
>
> Could I please have a review on the following. It simply adds the `DEPRECATED` macro to wrap `__attribute__ ((deprecated))` for GCC, and the equivalent for MSVC.
>
> JBS: https://bugs.openjdk.java.net/browse/JDK-8248672
> Webrev: http://cr.openjdk.java.net/~burban/luhenry/8248672/webrev.00
>
> Thank you,
>
> --
> Ludovic
>

From doug.simon at oracle.com  Fri Jul 24 16:53:54 2020
From: doug.simon at oracle.com (Doug Simon)
Date: Fri, 24 Jul 2020 18:53:54 +0200
Subject: RFR: 8250548: libgraal can deadlock in -Xcomp mode
Message-ID: <F5BBA60E-C1E8-4653-B424-572DA60BE447@oracle.com>

Please review this bug fix for a deadlock in libgraal under Xcomp.

BUG:
https://bugs.openjdk.java.net/browse/JDK-8250548

PATCH:

diff -r 1f37a5cd6afc src/hotspot/share/compiler/compileBroker.cpp
--- a/src/hotspot/share/compiler/compileBroker.cpp      Fri Jul 24 11:00:50 2020 -0400
+++ b/src/hotspot/share/compiler/compileBroker.cpp      Fri Jul 24 18:52:30 2020 +0200
@@ -1655,7 +1655,7 @@
   bool free_task;
 #if INCLUDE_JVMCI
   AbstractCompiler* comp = compiler(task->comp_level());
-  if (!UseJVMCINativeLibrary && comp->is_jvmci() && !task->should_wait_for_compilation()) {
+  if (comp->is_jvmci() && !task->should_wait_for_compilation()) {
     // It may return before compilation is completed.
     free_task = wait_for_jvmci_completion((JVMCICompiler*) comp, task, thread);
   } else


Testing: hs-tier1,hs-tier2,hs-tier3-graal

-Doug


From tom.rodriguez at oracle.com  Fri Jul 24 16:54:56 2020
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Fri, 24 Jul 2020 09:54:56 -0700
Subject: RFR: 8250548: libgraal can deadlock in -Xcomp mode
In-Reply-To: <F5BBA60E-C1E8-4653-B424-572DA60BE447@oracle.com>
References: <F5BBA60E-C1E8-4653-B424-572DA60BE447@oracle.com>
Message-ID: <daea33fe-6425-613d-21e7-922498fe5001@oracle.com>

Looks good.

tom

Doug Simon wrote on 7/24/20 9:53 AM:
> Please review this bug fix for a deadlock in libgraal under Xcomp.
> 
> BUG:
> https://bugs.openjdk.java.net/browse/JDK-8250548
> 
> PATCH:
> 
> diff -r 1f37a5cd6afc src/hotspot/share/compiler/compileBroker.cpp
> --- a/src/hotspot/share/compiler/compileBroker.cpp      Fri Jul 24 11:00:50 2020 -0400
> +++ b/src/hotspot/share/compiler/compileBroker.cpp      Fri Jul 24 18:52:30 2020 +0200
> @@ -1655,7 +1655,7 @@
>     bool free_task;
>   #if INCLUDE_JVMCI
>     AbstractCompiler* comp = compiler(task->comp_level());
> -  if (!UseJVMCINativeLibrary && comp->is_jvmci() && !task->should_wait_for_compilation()) {
> +  if (comp->is_jvmci() && !task->should_wait_for_compilation()) {
>       // It may return before compilation is completed.
>       free_task = wait_for_jvmci_completion((JVMCICompiler*) comp, task, thread);
>     } else
> 
> 
> Testing: hs-tier1,hs-tier2,hs-tier3-graal
> 
> -Doug
> 

From vladimir.kozlov at oracle.com  Fri Jul 24 17:23:58 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 24 Jul 2020 10:23:58 -0700
Subject: RFR: 8250548: libgraal can deadlock in -Xcomp mode
In-Reply-To: <daea33fe-6425-613d-21e7-922498fe5001@oracle.com>
References: <F5BBA60E-C1E8-4653-B424-572DA60BE447@oracle.com>
 <daea33fe-6425-613d-21e7-922498fe5001@oracle.com>
Message-ID: <a60c1db7-874a-9079-4f1f-b7c2de9317cd@oracle.com>

+1

Thanks,
Vladimir K

On 7/24/20 9:54 AM, Tom Rodriguez wrote:
> Looks good.
> 
> tom
> 
> Doug Simon wrote on 7/24/20 9:53 AM:
>> Please review this bug fix for a deadlock in libgraal under Xcomp.
>>
>> BUG:
>> https://bugs.openjdk.java.net/browse/JDK-8250548
>>
>> PATCH:
>>
>> diff -r 1f37a5cd6afc src/hotspot/share/compiler/compileBroker.cpp
>> --- a/src/hotspot/share/compiler/compileBroker.cpp      Fri Jul 24 11:00:50 2020 -0400
>> +++ b/src/hotspot/share/compiler/compileBroker.cpp      Fri Jul 24 18:52:30 2020 +0200
>> @@ -1655,7 +1655,7 @@
>>    bool free_task;
>>  #if INCLUDE_JVMCI
>>    AbstractCompiler* comp = compiler(task->comp_level());
>> -  if (!UseJVMCINativeLibrary && comp->is_jvmci() && !task->should_wait_for_compilation()) {
>> +  if (comp->is_jvmci() && !task->should_wait_for_compilation()) {
>>      // It may return before compilation is completed.
>>      free_task = wait_for_jvmci_completion((JVMCICompiler*) comp, task, thread);
>>    } else
>>
>>
>> Testing: hs-tier1,hs-tier2,hs-tier3-graal
>>
>> -Doug
>>

From vladimir.kozlov at oracle.com  Fri Jul 24 17:26:31 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 24 Jul 2020 10:26:31 -0700
Subject: RFR[XXS] 8248672: utilities: Introduce DEPRECATED macro for GCC
 and MSVC
In-Reply-To: <MWHPR21MB051146316FF56A63423DF514B0770@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB05113A054F2D8291BFAFB085B0770@MWHPR21MB0511.namprd21.prod.outlook.com>
 <5e301790-8bfe-0ced-b5e2-8a9c76ae33de@oracle.com>
 <MWHPR21MB051146316FF56A63423DF514B0770@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <ded2765f-00e2-f875-4d99-cf0e97ad19f0@oracle.com>

It was my mistake - I missed that it is indeed used in macroAssembler_aarch64.hpp

And thank you, Monica, for RFE description change - it is more clear now.

Change is fine. You need someone from aarch64 to review this too to make sure it works with their GCC.

Regards,
Vladimir K

On 7/24/20 9:41 AM, Ludovic Henry wrote:
> I'm not sure I understand your question. Are you asking whether I also replaced all uses of __attribute__((deprecated)) with DEPRECATED? If so, I did replace the only use of it [1] together with defining the macro.
> 
> Please let me know if I misunderstood your question.
> 
> Thank you.
> 
> --
> Ludovic
> 
> [1] in src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp.
> 
> ________________________________________
> From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
> Sent: Friday, July 24, 2020 09:02
> To: Ludovic Henry; hotspot-compiler-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
> Cc: openjdk-aarch64; hotspot-gc-dev at openjdk.java.net
> Subject: Re: RFR[XXS] 8248672: utilities: Introduce DEPRECATED macro for GCC and MSVC
> 
> "And use it in the AArch64 sub system."
> 
> Can you explain more? Do you have RFE filed for DEPRECATED use?
> It is a small change which is usually done together with its usage. Why do this separately?
> 
> Thanks,
> Vladimir K
> 
> On 7/24/20 8:53 AM, Ludovic Henry wrote:
>> Hi,
>>
>> Could I please have a review on the following. It simply adds the `DEPRECATED` macro to wrap `__attribute__ ((deprecated))` for GCC, and the equivalent for MSVC.
>>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8248672
>> Webrev: http://cr.openjdk.java.net/~burban/luhenry/8248672/webrev.00
>>
>> Thank you,
>>
>> --
>> Ludovic
>>

From vladimir.kozlov at oracle.com  Fri Jul 24 18:03:29 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 24 Jul 2020 11:03:29 -0700
Subject: [16] RFR(XS): 8249602: C2: assert(cnt == _outcnt) failed: no
 insertions allowed
In-Reply-To: <2cd118ab-c117-bf61-ae03-117b9383a5e6@oracle.com>
References: <2cd118ab-c117-bf61-ae03-117b9383a5e6@oracle.com>
Message-ID: <82cbd463-d480-b882-04da-0d1269717fff@oracle.com>

Looks good.

Thanks,
Vladimir

On 7/24/20 5:57 AM, Christian Hagedorn wrote:
> Hi
> 
> Please review the following patch:
> https://bugs.openjdk.java.net/browse/JDK-8249602
> http://cr.openjdk.java.net/~chagedorn/8249602/webrev.00/
> 
> The testcase hits the assert when inserting a post loop. When correcting the fall-in values to the post-loop phis to 
> take the values from the main-loop, we have to separately handle nodes that belong to the backedge control block and 
> cannot float. In this process, we clone data nodes in PhaseIdealLoop::clone_up_backedge_goo and then hit the assert 
> because some nodes to be cloned have a control input from the main-loop header node (main_head). These nodes are cloned 
> and the main_head node gets these nodes as additional output nodes. This should be fine but the DUIterator_Fast forbids 
> insertions.
> 
> The fix simply switches to a normal DUIterator which allows insertions. This should also be done when correcting the 
> fall-in values to the main-loop to take the values from the pre-loop.
> 
> Best regards,
> Christian
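
The difference between the two iteration styles described in the quoted explanation can be illustrated with a small stand-alone model. This is a simplified sketch, not HotSpot code: `OutList`, `walk_fast` and `walk_tolerant` are hypothetical names, and the real DUIterator_Fast and DUIterator classes are considerably more involved. The model assumes only what the explanation states: one style captures the out-edge count up front and treats any insertion as an error, the other tolerates insertions made during the walk.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a node's list of out-edges (users).
struct OutList {
  std::vector<int> outs;
};

// Models the "fast" style: the bound is captured once, and any insertion
// during the walk is reported as a failure (where real code would assert).
bool walk_fast(OutList& n, bool clone_during_walk) {
  const std::size_t cnt = n.outs.size();
  for (std::size_t i = 0; i < cnt; i++) {
    if (clone_during_walk && n.outs[i] % 2 == 0) {
      n.outs.push_back(n.outs[i] + 1);  // "clone" adds a new out-edge
    }
    if (cnt != n.outs.size()) {
      return false;  // corresponds to "no insertions allowed"
    }
  }
  return true;
}

// Models the insertion-tolerant style: the bound is re-read on every step,
// so edges appended during the walk are simply visited later.
std::size_t walk_tolerant(OutList& n) {
  std::size_t visited = 0;
  for (std::size_t i = 0; i < n.outs.size(); i++) {
    if (n.outs[i] % 2 == 0) {
      n.outs.push_back(n.outs[i] + 1);  // clones are odd, so never re-cloned
    }
    visited++;
  }
  return visited;
}
```

In this model the tolerant walk also visits the freshly appended entries, which is harmless here because the clones are never cloned again.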

From doug.simon at oracle.com  Fri Jul 24 18:12:33 2020
From: doug.simon at oracle.com (Doug Simon)
Date: Fri, 24 Jul 2020 20:12:33 +0200
Subject: RFR: 8250548: libgraal can deadlock in -Xcomp mode
Message-ID: <0DE64525-958F-4885-84D0-151990B5D8F9@oracle.com>

Please review this bug fix to revert the JVMCI changes made as part of JDK-8230395.

Instead of aborting the VM when JVMCI counter expansion fails, the JVMCI client should simply be informed of the failure (as was originally suggested <https://bugs.openjdk.java.net/browse/JDK-8230395?focusedCommentId=14289994&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14289994> by David).

https://bugs.openjdk.java.net/browse/JDK-8250556
https://cr.openjdk.java.net/~dnsimon/8250556/webrev.00/

Testing: hs-tier1,hs-tier2,hs-tier3-graal

-Doug


From doug.simon at oracle.com  Fri Jul 24 18:35:21 2020
From: doug.simon at oracle.com (Doug Simon)
Date: Fri, 24 Jul 2020 20:35:21 +0200
Subject: RFR: 8250556: revert JVMCI part of JDK-8230395
Message-ID: <0DE58760-A197-46FD-99B0-A2C1A5394DEE@oracle.com>

(with correct subject this time)

Please review this bug fix to revert the JVMCI changes made as part of JDK-8230395.

Instead of aborting the VM when JVMCI counter expansion fails, the JVMCI client should simply be informed of the failure (as was originally suggested <https://bugs.openjdk.java.net/browse/JDK-8230395?focusedCommentId=14289994&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14289994> by David).

https://bugs.openjdk.java.net/browse/JDK-8250556
https://cr.openjdk.java.net/~dnsimon/8250556/webrev.00/

Testing: hs-tier1,hs-tier2,hs-tier3-graal

-Doug
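
The design choice described above (report the failure to the caller rather than abort) can be sketched in isolation as follows. This is a hypothetical model and not the code in the webrev: `CounterTable`, `try_resize` and the `max_size` limit are invented for illustration, and the real JVMCI counters are managed differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>

// Hypothetical counter storage; not the VM's actual data structure.
struct CounterTable {
  std::int64_t* data;
  std::size_t   size;
};

// Tries to grow the table to new_size entries. On failure it returns false
// and leaves the table untouched, so the caller decides how to react.
bool try_resize(CounterTable& t, std::size_t new_size, std::size_t max_size) {
  if (new_size > max_size) return false;  // assumed policy limit
  if (new_size <= t.size) return true;    // nothing to do
  std::int64_t* bigger = new (std::nothrow) std::int64_t[new_size]();
  if (bigger == nullptr) return false;    // allocation failed: report it
  if (t.size > 0) {
    std::memcpy(bigger, t.data, t.size * sizeof(std::int64_t));
  }
  delete[] t.data;
  t.data = bigger;
  t.size = new_size;
  return true;
}
```

A caller would check the result and surface the failure to the client (for example as an error return or exception on the Java side) instead of terminating the process.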


From vladimir.kozlov at oracle.com  Fri Jul 24 18:40:13 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 24 Jul 2020 11:40:13 -0700
Subject: RFR: 8250556: revert JVMCI part of JDK-8230395
In-Reply-To: <0DE58760-A197-46FD-99B0-A2C1A5394DEE@oracle.com>
References: <0DE58760-A197-46FD-99B0-A2C1A5394DEE@oracle.com>
Message-ID: <cdff486e-e958-1159-c354-08591c98aa69@oracle.com>

Looks good.

Thanks,
Vladimir K

On 7/24/20 11:35 AM, Doug Simon wrote:
> (with correct subject this time)
> 
> Please review this bug fix to revert the JVMCI changes made as part of JDK-8230395.
> 
> Instead of aborting the VM when JVMCI counter expansion fails, the JVMCI client should simply be informed of the failure (as was originally suggested <https://bugs.openjdk.java.net/browse/JDK-8230395?focusedCommentId=14289994&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14289994> by David).
> 
> https://bugs.openjdk.java.net/browse/JDK-8250556
> https://cr.openjdk.java.net/~dnsimon/8250556/webrev.00/
> 
> Testing: hs-tier1,hs-tier2,hs-tier3-graal
> 
> -Doug
> 

From serguei.spitsyn at oracle.com  Fri Jul 24 20:28:18 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 24 Jul 2020 13:28:18 -0700
Subject: RFR (T) 8250042: Clean up methodOop and method_oop names from the
 code
In-Reply-To: <c6ec9d68-e44f-7288-d084-e64d62a79490@oracle.com>
References: <85efc3ab-abbf-c5f2-9b7b-47fa516d9a2d@oracle.com>
 <6f973a0a-cf55-e1ab-8de3-b57f68dbd2cf@oracle.com>
 <c5392d2c-7fe1-09e5-3366-4b9840fef087@oracle.com>
 <c6ec9d68-e44f-7288-d084-e64d62a79490@oracle.com>
Message-ID: <60098204-b23d-7da6-908f-80f3d40c2ebc@oracle.com>

Hi Coleen,

The fix looks good to me.
I focused mostly on the serviceability-related updates.
Thank you for taking care of it!

Thanks,
Serguei


On 7/24/20 06:10, coleen.phillimore at oracle.com wrote:
>
> I can also replace method_oop with method_ptr in the CPU ad files, and 
> this seems to build but now someone who knows the compiler area needs 
> to comment; this was supposed to be trivial... :)  But it still is 
> really trivial to look at.
>
> I left interpreter_method_oop_reg and compiler_method_oop_reg and 
> friends in opto/matcher.cpp for someone else.
>
> incremental webrev at 
> http://cr.openjdk.java.net/~coleenp/2020/8250042.02.incr/webrev
> full webrev at http://cr.openjdk.java.net/~coleenp/2020/8250042.02/webrev
>
> Thanks,
> Coleen
>
>
> On 7/24/20 8:23 AM, coleen.phillimore at oracle.com wrote:
>>
>> Thanks for looking at this.
>>
>> On 7/24/20 1:01 AM, David Holmes wrote:
>>> Hi Coleen,
>>>
>>> On 24/07/2020 2:58 am, coleen.phillimore at oracle.com wrote:
>>>> See bug for more details.  I've been running into these names a lot 
>>>> lately.  Many of these names are in JVMTI.
>>>>
>>>> Tested with tier1 on all Oracle platforms and built on non-Oracle 
>>>> platforms.
>>>>
>>>> open webrev at 
>>>> http://cr.openjdk.java.net/~coleenp/2020/8250042.01/webrev
>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8250042
>>>
>>> src/hotspot/cpu/*/*.ad
>>>
>>> These still refer to "method oop" and method_oop in a number of places.
>>
>> Yes, I only replaced method_oop in the shared code and not in the AD 
>> code.  method_oop can be the name of a parameter and using "sed" to 
>> change it to "method" doesn't work.  Somebody who understands this 
>> code and looks at it will have to make the rest of the changes.
>>
>> What I did was replace "method oop" with "method" and "methodOop" 
>> with "method" in all the sources.  I replaced "method_oop" with 
>> "method" or "checked_method" in the shared sources.
>>
>>>
>>> src/hotspot/share/adlc/adlparse.cpp
>>>
>>> +  frame->_interpreter_method_oop_reg = parse_one_arg("method reg 
>>> entry");
>>>
>>> I guess I'm not understanding the scope of this renaming - why is 
>>> _interpreter_method_oop_reg not renamed as well? Should this (and 
>>> other uses) be parsed as method-(oop-reg) rather than (method-oop)-reg?
>>
>> I don't know this code, so I'd rather not change any more of it. The 
>> comment makes sense changed, even though the variable name still 
>> refers to method_oop.
>>
>> Thanks,
>> Coleen
>>>
>>> Otherwise all okay.
>>>
>>> Thanks,
>>> David
>>>
>>>> Thanks,
>>>> Coleen
>>
>


From adinn at redhat.com  Fri Jul 24 21:07:48 2020
From: adinn at redhat.com (Andrew Dinn)
Date: Fri, 24 Jul 2020 22:07:48 +0100
Subject: [8u] RFR: 8240676: Meet not symmetric failure when running lucene
 on jdk8
In-Reply-To: <87lfj9m5kv.fsf@redhat.com>
References: <87zh7pmla1.fsf@redhat.com>
 <9ae30a2b-3443-9954-950e-08e7e26ddd97@redhat.com> <87wo2tmfg3.fsf@redhat.com>
 <af9c4744-de1b-d9ad-6a4d-c69f1f273049@redhat.com> <87lfj9m5kv.fsf@redhat.com>
Message-ID: <7e9268bf-41d5-52d3-cb6c-449fabb0f192@redhat.com>

On 24/07/2020 13:59, Roland Westrelin wrote:
>> That may well change behaviour for some programs as meets are computed
>> outside of the changed verification path. I'd like to assume the
>> benefits of improving type accuracy override the risk. Do you think that
>> is justified? (one might argue that improved type accuracy is not always
>> better, especially for speculative info where avoiding the erasure might
>> enable optimizations not previously attempted).
> 
> I would say both benefit and risk are small. Without the speculative
> type change, we'll hit failures in the new verification code so some
> other tweak would have to be done to work around them. Not sure what
> could be done but that would likely be as risky. Or only the actual fix
> is backported and the new verification code is left out. Then the
> speculative type fix is not required. But a regression wouldn't be
> caught either.
That's a good enough justification for me. Ship it!

. . . well, modulo maintainer approval ;-)

regards,


Andrew Dinn
-----------
Red Hat Distinguished Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill


From vladimir.kozlov at oracle.com  Fri Jul 24 21:41:29 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 24 Jul 2020 14:41:29 -0700
Subject: [16] RFR(S): 8248552: C2 crashes with SIGFPE due to division by
 zero
In-Reply-To: <2f5978fe-af76-df18-15c0-dcc62563299d@oracle.com>
References: <70e8e42b-5cb3-9c1e-419e-2f771f042368@oracle.com>
 <3ba2ef6a-8ade-7ede-5252-21051c34b472@oracle.com>
 <9e2f26bd-daa4-9540-8401-9850e0beea94@oracle.com>
 <c569ebf8-58ed-c6df-2e0e-38c692f926c6@oracle.com>
 <a2066deb-ef3b-7071-33ff-03c829069cff@oracle.com>
 <f6a62d10-9cff-3fc2-0947-c49e775a23c9@oracle.com>
 <e8d6aa47-b584-4fdc-f506-1b1c52aa28f5@oracle.com>
 <ca043859-bb7b-f64a-a37d-7b4e47c745d6@oracle.com>
 <5b2e7b1b-24f7-d575-58a3-376ec9ab7944@oracle.com>
 <518cd022-73e1-cb5c-499d-86853ae679c3@oracle.com>
 <2f5978fe-af76-df18-15c0-dcc62563299d@oracle.com>
Message-ID: <e1ef73a9-e38f-0847-fdaa-d1b69550bdee@oracle.com>

Good.

Thanks,
Vladimir K

On 7/24/20 4:44 AM, Christian Hagedorn wrote:
> Hi Tobias
> 
> Thank you for your review!
> 
>> Please make sure to run performance testing.
> 
> There is a repeated regression in the micros open crypto benchmark 
> openjdk.bench.javax.crypto.small.SecureRandomBench.nextBytes with these two settings:
> - algorithm=SHA1PRNG-dataSize:64-provider:-shared:false
> - algorithm=SHA1PRNG-dataSize:64-provider:-shared:true
> 
> Repeated runs with these two settings resulted in a regression between 1 and 2%. I could trace it back to the additional 
> type filtering in PhiNode::Value() (webrev.02). This is only required for the assertion code and not for the bailout fix 
> itself. When running performance testing with webrev.01, the regressions disappear.
> 
> I therefore suggest to go with webrev.01 (without assertion code and type filtering) and file a new RFE to investigate 
> the usage of type filtering in PhiNode::Value() for iv phis and why we get a performance regression in these two 
> benchmark settings. In theory, I think it should be beneficial to narrow the type range of iv phis.
> 
>> cfgnode.cpp:1083
>> - There's an extra whitespace before ","
>>
>> loopopts.cpp:84/86
>> - No need for extra brackets
> 
> These are not present anymore in webrev.01.
> 
> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.01/
> 
> Best regards,
> Christian
> 
> 
> On 20.07.20 11:14, Tobias Hartmann wrote:
>> Hi Christian,
>>
>> On 15.07.20 15:08, Christian Hagedorn wrote:
>>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.02/
>>
>> Looks good to me.
>>
>> Some code style comments:
> 
> 
>> Best regards,
>> Tobias
>>

From coleen.phillimore at oracle.com  Fri Jul 24 22:20:23 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 24 Jul 2020 18:20:23 -0400
Subject: RFR (T) 8250042: Clean up methodOop and method_oop names from the
 code
In-Reply-To: <60098204-b23d-7da6-908f-80f3d40c2ebc@oracle.com>
References: <85efc3ab-abbf-c5f2-9b7b-47fa516d9a2d@oracle.com>
 <6f973a0a-cf55-e1ab-8de3-b57f68dbd2cf@oracle.com>
 <c5392d2c-7fe1-09e5-3366-4b9840fef087@oracle.com>
 <c6ec9d68-e44f-7288-d084-e64d62a79490@oracle.com>
 <60098204-b23d-7da6-908f-80f3d40c2ebc@oracle.com>
Message-ID: <4232a089-3984-7feb-d2eb-46f1551ee0ab@oracle.com>


On 7/24/20 4:28 PM, serguei.spitsyn at oracle.com wrote:
> Hi Coleen,
>
> The fix looks good to me.
> I focused mostly on the serviceability-related updates.
> Thank you for taking care of it!

Thank you for reviewing it!  Most of the name changes were in jvmti.  
Hope it's cleaner to work on now.
Coleen
>
> Thanks,
> Serguei
>
>
> On 7/24/20 06:10, coleen.phillimore at oracle.com wrote:
>>
>> I can also replace method_oop with method_ptr in the CPU ad files, 
>> and this seems to build but now someone who knows the compiler area 
>> needs to comment; this was supposed to be trivial... :)  But it still 
>> is really trivial to look at.
>>
>> I left interpreter_method_oop_reg and compiler_method_oop_reg and 
>> friends in opto/matcher.cpp for someone else.
>>
>> incremental webrev at 
>> http://cr.openjdk.java.net/~coleenp/2020/8250042.02.incr/webrev
>> full webrev at 
>> http://cr.openjdk.java.net/~coleenp/2020/8250042.02/webrev
>>
>> Thanks,
>> Coleen
>>
>>
>> On 7/24/20 8:23 AM, coleen.phillimore at oracle.com wrote:
>>>
>>> Thanks for looking at this.
>>>
>>> On 7/24/20 1:01 AM, David Holmes wrote:
>>>> Hi Coleen,
>>>>
>>>> On 24/07/2020 2:58 am, coleen.phillimore at oracle.com wrote:
>>>>> See bug for more details.  I've been running into these names a 
>>>>> lot lately.  Many of these names are in JVMTI.
>>>>>
>>>>> Tested with tier1 on all Oracle platforms and built on non-Oracle 
>>>>> platforms.
>>>>>
>>>>> open webrev at 
>>>>> http://cr.openjdk.java.net/~coleenp/2020/8250042.01/webrev
>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8250042
>>>>
>>>> src/hotspot/cpu/*/*.ad
>>>>
>>>> These still refer to "method oop" and method_oop in a number of 
>>>> places.
>>>
>>> Yes, I only replaced method_oop in the shared code and not in the AD 
>>> code.  method_oop can be the name of a parameter and using "sed" to 
>>> change it to "method" doesn't work.  Somebody who understands this 
>>> code and looks at it will have to make the rest of the changes.
>>>
>>> What I did was replace "method oop" with "method" and "methodOop" 
>>> with "method" in all the sources.  I replaced "method_oop" with 
>>> "method" or "checked_method" in the shared sources.
>>>
>>>>
>>>> src/hotspot/share/adlc/adlparse.cpp
>>>>
>>>> +  frame->_interpreter_method_oop_reg = parse_one_arg("method reg 
>>>> entry");
>>>>
>>>> I guess I'm not understanding the scope of this renaming - why is 
>>>> _interpreter_method_oop_reg not renamed as well? Should this (and 
>>>> other uses) be parsed as method-(oop-reg) rather than 
>>>> (method-oop)-reg?
>>>
>>> I don't know this code, so I'd rather not change any more of it. The 
>>> comment makes sense changed, even though the variable name still 
>>> refers to method_oop.
>>>
>>> Thanks,
>>> Coleen
>>>>
>>>> Otherwise all okay.
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> Thanks,
>>>>> Coleen
>>>
>>
>


From kim.barrett at oracle.com  Fri Jul 24 23:42:43 2020
From: kim.barrett at oracle.com (Kim Barrett)
Date: Fri, 24 Jul 2020 19:42:43 -0400
Subject: RFR[XXS] 8248672: utilities: Introduce DEPRECATED macro for GCC
 and MSVC
In-Reply-To: <MWHPR21MB051146316FF56A63423DF514B0770@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB05113A054F2D8291BFAFB085B0770@MWHPR21MB0511.namprd21.prod.outlook.com>
 <5e301790-8bfe-0ced-b5e2-8a9c76ae33de@oracle.com>
 <MWHPR21MB051146316FF56A63423DF514B0770@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <F50C37BA-FE47-465C-A60C-10547DCF8F69@oracle.com>

> On Jul 24, 2020, at 12:41 PM, Ludovic Henry <luhenry at microsoft.com> wrote:
> 
> I'm not sure I understand your question. Are you asking whether I also replaced all uses of __attribute__((deprecated)) with DEPRECATED? If so, I did replace the only use of it [1] together with defining the macro.
> 
> Please let me know if I misunderstood your question.
> 
> Thank you.
> 
> --
> Ludovic
> 
> [1] in src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp.

Why are we deprecating something rather than just deleting it and
fixing any users?

If the point is to keep the overload but prevent it from being called,
there are better ways than a deprecation warning.

And if we *really* needed deprecation warnings, I suggest using the
C++14 [[deprecated]] attribute (after adding it to the approved new
feature list).  But I think we shouldn't be doing this at all.
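
To make the two alternatives mentioned above concrete, here is a minimal sketch. It assumes that "better ways" includes deleting the unwanted overload, which is one common technique and may not be what was meant; the `Emitter` class and its members are hypothetical and are not taken from macroAssembler_aarch64.hpp.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical class with an overload set, for illustration only.
struct Emitter {
  long last = 0;

  void emit(long imm) { last = imm; }

  // Deleted overload: an int argument selects this exact match, so the call
  // is rejected at compile time instead of merely producing a warning.
  void emit(int imm) = delete;

  // Standard C++14 attribute form of a deprecation warning.
  [[deprecated("use emit(long)")]]
  void emit_legacy(long imm) { emit(imm); }
};

// Detects at compile time whether Emitter::emit is callable with Arg.
template <typename Arg, typename = void>
struct can_emit : std::false_type {};

template <typename Arg>
struct can_emit<Arg,
                decltype(void(std::declval<Emitter&>().emit(std::declval<Arg>())))>
    : std::true_type {};
```

With this shape, `e.emit(42L)` compiles, whereas `e.emit(42)` is a hard error; the `can_emit` trait just makes that observable without triggering the error.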


> 
> ________________________________________
> From: Vladimir Kozlov <vladimir.kozlov at oracle.com>
> Sent: Friday, July 24, 2020 09:02
> To: Ludovic Henry; hotspot-compiler-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
> Cc: openjdk-aarch64; hotspot-gc-dev at openjdk.java.net
> Subject: Re: RFR[XXS] 8248672: utilities: Introduce DEPRECATED macro for GCC and MSVC
> 
> "And use it in the AArch64 sub system."
> 
> Can you explain more? Do you have RFE filed for DEPRECATED use?
> It is a small change which is usually done together with its usage. Why do this separately?
> 
> Thanks,
> Vladimir K
> 
> On 7/24/20 8:53 AM, Ludovic Henry wrote:
>> Hi,
>> 
>> Could I please have a review on the following. It simply adds the `DEPRECATED` macro to wrap `__attribute__ ((deprecated))` for GCC, and the equivalent for MSVC.
>> 
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8248672
>> Webrev: http://cr.openjdk.java.net/~burban/luhenry/8248672/webrev.00
>> 
>> Thank you,
>> 
>> --
>> Ludovic


From aph at redhat.com  Sat Jul 25 15:14:12 2020
From: aph at redhat.com (Andrew Haley)
Date: Sat, 25 Jul 2020 16:14:12 +0100
Subject: [8u] RFR: 8240676: Meet not symmetric failure when running lucene
 on jdk8
In-Reply-To: <7e9268bf-41d5-52d3-cb6c-449fabb0f192@redhat.com>
References: <87zh7pmla1.fsf@redhat.com>
 <9ae30a2b-3443-9954-950e-08e7e26ddd97@redhat.com> <87wo2tmfg3.fsf@redhat.com>
 <af9c4744-de1b-d9ad-6a4d-c69f1f273049@redhat.com> <87lfj9m5kv.fsf@redhat.com>
 <7e9268bf-41d5-52d3-cb6c-449fabb0f192@redhat.com>
Message-ID: <c30ef557-fa61-7d33-3ade-8ae327db6393@redhat.com>

On 24/07/2020 22:07, Andrew Dinn wrote:
> On 24/07/2020 13:59, Roland Westrelin wrote:
>>> That may well change behaviour for some programs as meets are computed
>>> outside of the changed verification path. I'd like to assume the
>>> benefits of improving type accuracy override the risk. Do you think that
>>> is justified? (one might argue that improved type accuracy is not always
>>> better, especially for speculative info where avoiding the erasure might
>>> enable optimizations not previously attempted).
>>
>> I would say both benefit and risk are small. Without the speculative
>> type change, we'll hit failures in the new verification code so some
>> other tweak would have to be done to work around them. Not sure what
>> could be done but that would likely be as risky. Or only the actual fix
>> is backported and the new verification code is left out. Then the
>> speculative type fix is not required. But a regression wouldn't be
>> caught either.
> That's a good enough justification for me. Ship it!
> 
> . . . well, modulo maintainer approval ;-)

Yeah, OK. Of course I'm not super keen on a change in C2 which fixes
a bug that we can't reproduce, but it'll have to do.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Sat Jul 25 15:30:26 2020
From: aph at redhat.com (Andrew Haley)
Date: Sat, 25 Jul 2020 16:30:26 +0100
Subject: [aarch64-port-dev ] RFR(S): 8248676: AArch64: Add workaround for
 LITable constructor
In-Reply-To: <MWHPR21MB0511092557F1BBCB3D91276DB0770@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB05119533601E0AF1DE3CB0E9B0640@MWHPR21MB0511.namprd21.prod.outlook.com>
 <af179579-b8cd-db69-fede-802614f23092@redhat.com>
 <MWHPR21MB0511A82D28A3FE2A09E3E8ADB0650@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB051121A1E582BCC22FA96F47B07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <0aed0646-c770-03e6-4e0b-5108919b7203@redhat.com>
 <MWHPR21MB0511092557F1BBCB3D91276DB0770@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <fa5eb920-30ef-f16c-4efb-d053d842cbd1@redhat.com>

On 24/07/2020 17:02, Ludovic Henry wrote:

> Are you saying that you would like this change to land into
> `aarch64-port/jdk-windows` before getting into jdk/jdk? This change
> doesn't strike me as windows-aarch64 specific and is in line with
> general removal of GCC-specific code (similarly to the LP64 vs
> LLP64, or JDK-8248666).
>
> Webrev: http://cr.openjdk.java.net/~burban/luhenry/8248676/webrev.01

You make a good point. I didn't want to get integer type cleanups
mixed up with the Windows import, so I wanted to do them first.

I think there was a general feeling, expressed by Dalibor, the leader
of the Porters' Group, that the Windows changes should be integrated
into the http://hg.openjdk.java.net/aarch64-port/jdk-windows/ tree.

This change is marginal, IMO. Clearly it's a GCC-ism, so I won't
refuse it being cleaned up in mainline if you want.

But I think we should now move to integrating all of your Windows-
specific changes in the jdk-windows tree and then we'll put together a
Big Windows Patch and push that to mainline. I don't think it'll take
long. Let's just get it done!

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From luhenry at microsoft.com  Sat Jul 25 15:51:21 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Sat, 25 Jul 2020 15:51:21 +0000
Subject: [aarch64-port-dev ] RFR(S): 8248676: AArch64: Add workaround for
 LITable constructor
In-Reply-To: <fa5eb920-30ef-f16c-4efb-d053d842cbd1@redhat.com>
References: <MWHPR21MB05119533601E0AF1DE3CB0E9B0640@MWHPR21MB0511.namprd21.prod.outlook.com>
 <af179579-b8cd-db69-fede-802614f23092@redhat.com>
 <MWHPR21MB0511A82D28A3FE2A09E3E8ADB0650@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB051121A1E582BCC22FA96F47B07E0@MWHPR21MB0511.namprd21.prod.outlook.com>
 <0aed0646-c770-03e6-4e0b-5108919b7203@redhat.com>
 <MWHPR21MB0511092557F1BBCB3D91276DB0770@MWHPR21MB0511.namprd21.prod.outlook.com>,
 <fa5eb920-30ef-f16c-4efb-d053d842cbd1@redhat.com>
Message-ID: <MWHPR21MB0511BD8DB4E5BAA0FA146E52B0740@MWHPR21MB0511.namprd21.prod.outlook.com>

> But I think we should now move to integrating all of your Windows-
> specific changes in the jdk-windows tree and then we'll put together a
> Big Windows Patch and push that to mainline. I don't think it'll take
> long. Let's just get it done!

Sounds good to me, let's do that then.

________________________________________
From: Andrew Haley <aph at redhat.com>
Sent: Saturday, July 25, 2020 08:30
To: Ludovic Henry; hotspot-compiler-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
Cc: openjdk-aarch64
Subject: Re: [aarch64-port-dev ] RFR(S): 8248676: AArch64: Add workaround for LITable constructor

On 24/07/2020 17:02, Ludovic Henry wrote:

> Are you saying that you would like this change to land into
> `aarch64-port/jdk-windows` before getting into jdk/jdk? This change
> doesn't strike me as windows-aarch64 specific and is in line with
> general removal of GCC-specific code (similarly to the LP64 vs
> LLP64, or JDK-8248666).
>
> Webrev: http://cr.openjdk.java.net/~burban/luhenry/8248676/webrev.01

You make a good point. I didn't want to get integer type cleanups
mixed up with the Windows import, so I wanted to do them first.

I think there was a general feeling, expressed by Dalibor, the leader
of the Porters' Group, that the Windows changes should be integrated
into the http://hg.openjdk.java.net/aarch64-port/jdk-windows/ tree.

This change is marginal, IMO. Clearly it's a GCC-ism, so I won't
refuse it being cleaned up in mainline if you want.

But I think we should now move to integrating all of your Windows-
specific changes in the jdk-windows tree and then we'll put together a
Big Windows Patch and push that to mainline. I don't think it'll take
long. Let's just get it done!

--
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Sun Jul 26 09:56:31 2020
From: aph at redhat.com (Andrew Haley)
Date: Sun, 26 Jul 2020 10:56:31 +0100
Subject: [aarch64-port-dev ] RFR[XXS] 8248672: utilities: Introduce
 DEPRECATED macro for GCC and MSVC
In-Reply-To: <F50C37BA-FE47-465C-A60C-10547DCF8F69@oracle.com>
References: <MWHPR21MB05113A054F2D8291BFAFB085B0770@MWHPR21MB0511.namprd21.prod.outlook.com>
 <5e301790-8bfe-0ced-b5e2-8a9c76ae33de@oracle.com>
 <MWHPR21MB051146316FF56A63423DF514B0770@MWHPR21MB0511.namprd21.prod.outlook.com>
 <F50C37BA-FE47-465C-A60C-10547DCF8F69@oracle.com>
Message-ID: <1259c3fd-b69c-6d81-0427-cb769f00bca5@redhat.com>

On 25/07/2020 00:42, Kim Barrett wrote:
> Why are we deprecating something rather than just deleting it and
> fixing any users?

C++ overloading. AArch64 CMP (immediate) only has a limited range, so
we only have a byte-wide Assembler::cmp() definition. The deprecation
warning on the wider version makes sure that any maintenance
programmer is immediately warned if it is used. There are other things
we could do: by not providing a definition for the wider cmp() you get
a link error, but that wouldn't be as explicit as a deprecation
warning.

The root problem is that the immediate value to CMP isn't always known
when HotSpot is compiled, but may be calculated at runtime. We have
seen failures in production when an immediate offset overflowed.
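
The overloading argument above can be sketched in a few lines. This is
only an illustration under stated assumptions: ToyAssembler and its
members are hypothetical stand-ins rather than HotSpot's Assembler, and
the DEPRECATED definition below is a guess at the shape of the macro
under review, not a copy of the webrev.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical portability macro in the spirit of the RFR.
#if defined(__GNUC__)
#  define DEPRECATED __attribute__((deprecated))
#elif defined(_MSC_VER)
#  define DEPRECATED __declspec(deprecated)
#else
#  define DEPRECATED
#endif

// Toy stand-in for an assembler; not the real HotSpot class.
struct ToyAssembler {
  int last_imm = -1;

  // Narrow overload: the only form that is always encodable.
  // Returns true to signal that the narrow overload ran.
  bool cmp(int reg, uint8_t imm) {
    (void)reg;
    last_imm = imm;
    return true;
  }

  // Wider overload: still declared, so overload resolution picks it for
  // int-typed immediates, but every call site gets a compiler warning.
  DEPRECATED bool cmp(int reg, int imm) {
    (void)reg;
    last_imm = imm;
    return false;
  }
};
```

A call such as a.cmp(0, static_cast<uint8_t>(42)) selects the narrow
overload silently, whereas a.cmp(0, 42) would select the wider one and
produce a deprecation warning at compile time.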

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From kim.barrett at oracle.com  Sun Jul 26 21:41:47 2020
From: kim.barrett at oracle.com (Kim Barrett)
Date: Sun, 26 Jul 2020 17:41:47 -0400
Subject: [aarch64-port-dev ] RFR[XXS] 8248672: utilities: Introduce
 DEPRECATED macro for GCC and MSVC
In-Reply-To: <1259c3fd-b69c-6d81-0427-cb769f00bca5@redhat.com>
References: <MWHPR21MB05113A054F2D8291BFAFB085B0770@MWHPR21MB0511.namprd21.prod.outlook.com>
 <5e301790-8bfe-0ced-b5e2-8a9c76ae33de@oracle.com>
 <MWHPR21MB051146316FF56A63423DF514B0770@MWHPR21MB0511.namprd21.prod.outlook.com>
 <F50C37BA-FE47-465C-A60C-10547DCF8F69@oracle.com>
 <1259c3fd-b69c-6d81-0427-cb769f00bca5@redhat.com>
Message-ID: <CD1AE64E-8555-4074-97EB-831BF0A72C61@oracle.com>

> On Jul 26, 2020, at 5:56 AM, Andrew Haley <aph at redhat.com> wrote:
> 
> On 25/07/2020 00:42, Kim Barrett wrote:
>> Why are we deprecating something rather than just deleting it and
>> fixing any users?
> 
> C++ overloading. AArch64 CMP (immediate) only has a limited range, so
> we only have a byte-wide Assembler::cmp() definition. The deprecation
> warning on the wider version makes sure that any maintenance
> programmer is immediately warned if it is used. There are other things
> we could do: by not providing a definition for the wider cmp() you get
> a link error, but that wouldn't be as explicit as a deprecation
> warning.
> 
> The root problem is that the immediate value to CMP isn't always known
> when HotSpot is compiled, but may be calculated at runtime. We have
> seen failures in production when an immediate offset overflowed.

Yeah, I'd guessed that might be the point, and confirmed it later by
looking at the changeset that originally introduced the attribute.

As of early last week, a definition of "= delete;" is the way to
poison an overload.
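
A minimal sketch of that approach, assuming a toy class (ToyAssembler
and can_cmp are hypothetical names used only for illustration): the
wider overload is declared as deleted, so it still takes part in
overload resolution, but selecting it is a hard compile error instead
of a warning.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>

// Toy stand-in for an assembler; not the real HotSpot class.
struct ToyAssembler {
  int last_imm = -1;

  // Narrow overload: the supported form.
  void cmp(int reg, uint8_t imm) {
    (void)reg;
    last_imm = imm;
  }

  // Poisoned overload: chosen for int immediates, then rejected.
  void cmp(int reg, int imm) = delete;
};

// C++11-compatible detection of whether a.cmp(0, T) is well-formed.
template <typename...> struct make_void { typedef void type; };

template <typename T, typename = void>
struct can_cmp : std::false_type {};

template <typename T>
struct can_cmp<T, typename make_void<decltype(
    std::declval<ToyAssembler&>().cmp(0, std::declval<T>()))>::type>
    : std::true_type {};

static_assert(can_cmp<uint8_t>::value, "narrow immediate is accepted");
static_assert(!can_cmp<int>::value, "int immediate hits the deleted overload");
```

Unlike an overload that is declared but never defined, the failure
shows up at the call site during compilation rather than at link time.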


From luhenry at microsoft.com  Sun Jul 26 23:10:56 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Sun, 26 Jul 2020 23:10:56 +0000
Subject: [aarch64-port-dev ] RFR[XXS] 8248672: utilities: Introduce
 DEPRECATED macro for GCC and MSVC
In-Reply-To: <CD1AE64E-8555-4074-97EB-831BF0A72C61@oracle.com>
References: <MWHPR21MB05113A054F2D8291BFAFB085B0770@MWHPR21MB0511.namprd21.prod.outlook.com>
 <5e301790-8bfe-0ced-b5e2-8a9c76ae33de@oracle.com>
 <MWHPR21MB051146316FF56A63423DF514B0770@MWHPR21MB0511.namprd21.prod.outlook.com>
 <F50C37BA-FE47-465C-A60C-10547DCF8F69@oracle.com>
 <1259c3fd-b69c-6d81-0427-cb769f00bca5@redhat.com>,
 <CD1AE64E-8555-4074-97EB-831BF0A72C61@oracle.com>
Message-ID: <MWHPR21MB05117357CE8A5903943234C2B0750@MWHPR21MB0511.namprd21.prod.outlook.com>

> As of early last week, a definition of "= delete;" is the way to
> poison an overload.

Let me try that locally, compile on Windows-AArch64 and Linux-AArch64, and confirm whether it works for MSVC.

________________________________________
From: Kim Barrett <kim.barrett at oracle.com>
Sent: Sunday, July 26, 2020 14:41
To: Andrew Haley
Cc: Ludovic Henry; Vladimir Kozlov; hotspot-compiler-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net; openjdk-aarch64; hotspot-gc-dev at openjdk.java.net
Subject: Re: [aarch64-port-dev ] RFR[XXS] 8248672: utilities: Introduce DEPRECATED macro for GCC and MSVC

> On Jul 26, 2020, at 5:56 AM, Andrew Haley <aph at redhat.com> wrote:
>
> On 25/07/2020 00:42, Kim Barrett wrote:
>> Why are we deprecating something rather than just deleting it and
>> fixing any users?
>
> C++ overloading. AArch64 CMP (immediate) only has a limited range, so
> we only have a byte-wide Assembler::cmp() definition. The deprecation
> warning on the wider version makes sure that any maintenance
> programmer is immediately warned if it is used. There are other things
> we could do: by not providing a definition for the wider cmp() you get
> a link error, but that wouldn't be as explicit as a deprecation
> warning.
>
> The root problem is that the immediate value to CMP isn't always known
> when HotSpot is compiled, but may be calculated at runtime. We have
> seen failures in production when an immediate offset overflowed.

Yeah, I'd guessed that might be the point, and confirmed it later by
looking at the changeset that originally introduced the attribute.

As of early last week, a definition of "= delete;" is the way to
poison an overload.


From xxinliu at amazon.com  Sun Jul 26 23:46:38 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Sun, 26 Jul 2020 23:46:38 +0000
Subject: RFR[XS] 8249809 avoid calling DirectiveSet::clone(this) in
 compilecommand_compatibility_init 
Message-ID: <1595807197546.52082@amazon.com>

hi, Reviewers,

Could you review this simple patch?
bug: https://bugs.openjdk.java.net/browse/JDK-8249809
webrev: https://cr.openjdk.java.net/~xliu/8249809/00/webrev/

When a user specifies a method-level compiler directive, the DirectiveSet is cloned for every method that is compiled. The clone is expensive, although this path is rarely hit. In fact, only the user-specified methods need a cloned DirectiveSet. I introduce a smart pointer, DirectiveSetPtr. Its operator->() returns a pointer to a constant DirectiveSet, which is read-only. It doesn't clone the _origin until C2 needs to update its members. transfer() yields ownership of the pointer.
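
The idea can be sketched as a small copy-on-write handle. This is not
the code in the webrev: ToyDirectiveSet, ToyDirectiveSetPtr, cloned()
and has_clone() are hypothetical names, and the fields are made up for
illustration.

```cpp
#include <cassert>

// Toy stand-in for DirectiveSet; the fields are illustrative only.
struct ToyDirectiveSet {
  bool log = false;
  int inline_depth = 9;
};

// Hypothetical copy-on-write handle modelled on the description above.
class ToyDirectiveSetPtr {
  const ToyDirectiveSet* _origin;  // shared, never written through this handle
  ToyDirectiveSet* _clone;         // private copy, created on first write

 public:
  explicit ToyDirectiveSetPtr(const ToyDirectiveSet* origin)
      : _origin(origin), _clone(nullptr) {}
  ~ToyDirectiveSetPtr() { delete _clone; }
  ToyDirectiveSetPtr(const ToyDirectiveSetPtr&) = delete;
  ToyDirectiveSetPtr& operator=(const ToyDirectiveSetPtr&) = delete;

  // Read access never clones.
  const ToyDirectiveSet* operator->() const {
    return _clone != nullptr ? _clone : _origin;
  }

  // Write access clones the origin lazily, at most once.
  ToyDirectiveSet* cloned() {
    if (_clone == nullptr) {
      _clone = new ToyDirectiveSet(*_origin);
    }
    return _clone;
  }

  bool has_clone() const { return _clone != nullptr; }

  // Gives up ownership: the caller must delete the result only if it
  // differs from the origin it passed in.
  const ToyDirectiveSet* transfer() {
    const ToyDirectiveSet* result = _clone != nullptr ? _clone : _origin;
    _clone = nullptr;
    return result;
  }
};
```

Methods that only read the directives never pay for a copy; the first
call to cloned() is the only point where an allocation happens.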

Test:
manual tests with different CompileCommand options.
hotspot:tier1 and gtest:all.

thanks,
--lx


From david.holmes at oracle.com  Mon Jul 27 01:26:31 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 27 Jul 2020 11:26:31 +1000
Subject: RFR (T) 8250042: Clean up methodOop and method_oop names from the
 code
In-Reply-To: <c6ec9d68-e44f-7288-d084-e64d62a79490@oracle.com>
References: <85efc3ab-abbf-c5f2-9b7b-47fa516d9a2d@oracle.com>
 <6f973a0a-cf55-e1ab-8de3-b57f68dbd2cf@oracle.com>
 <c5392d2c-7fe1-09e5-3366-4b9840fef087@oracle.com>
 <c6ec9d68-e44f-7288-d084-e64d62a79490@oracle.com>
Message-ID: <8eab1729-9a35-8c79-65cf-e67a098965d5@oracle.com>

Hi Coleen,

On 24/07/2020 11:10 pm, coleen.phillimore at oracle.com wrote:
> 
> I can also replace method_oop with method_ptr in the CPU ad files, and
> this seems to build but now someone who knows the compiler area needs to
> comment; this was supposed to be trivial... :) But it still is really
> trivial to look at.

method_ptr works for me. Changes seem fine.

> I left interpreter_method_oop_reg and compiler_method_oop_reg and 
> friends in opto/matcher.cpp for someone else.

Okay. Hopefully someone will pick it up.

Thanks,
David
-----

> incremental webrev at 
> http://cr.openjdk.java.net/~coleenp/2020/8250042.02.incr/webrev
> full webrev at http://cr.openjdk.java.net/~coleenp/2020/8250042.02/webrev
> 
> Thanks,
> Coleen
> 
> 
> On 7/24/20 8:23 AM, coleen.phillimore at oracle.com wrote:
>>
>> Thanks for looking at this.
>>
>> On 7/24/20 1:01 AM, David Holmes wrote:
>>> Hi Coleen,
>>>
>>> On 24/07/2020 2:58 am, coleen.phillimore at oracle.com wrote:
>>>> See bug for more details. I've been running into these names a lot
>>>> lately. Many of these names are in JVMTI.
>>>>
>>>> Tested with tier1 on all Oracle platforms and built on non-Oracle 
>>>> platforms.
>>>>
>>>> open webrev at 
>>>> http://cr.openjdk.java.net/~coleenp/2020/8250042.01/webrev
>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8250042
>>>
>>> src/hotspot/cpu/*/*.ad
>>>
>>> These still refer to "method oop" and method_oop in a number of places.
>>
>> Yes, I only replaced method_oop in the shared code and not in the AD
>> code. method_oop can be the name of a parameter and using "sed" to
>> change it to "method" doesn't work. Somebody who understands this
>> code and looks at it will have to make the rest of the changes.
>>
>> What I did was replace "method oop" with "method" and "methodOop" with
>> "method" in all the sources. I replaced "method_oop" with "method" or
>> "checked_method" in the shared sources.
>>
>>>
>>> src/hotspot/share/adlc/adlparse.cpp
>>>
>>> +  frame->_interpreter_method_oop_reg = parse_one_arg("method reg 
>>> entry");
>>>
>>> I guess I'm not understanding the scope of this renaming - why is 
>>> _interpreter_method_oop_reg not renamed as well? Should this (and 
>>> other uses) be parsed as method-(oop-reg) rather than (method-oop)-reg?
>>
>> I don't know this code, so I'd rather not change any more of it. The 
>> comment makes sense changed, even though the variable name still 
>> refers to method_oop.
>>
>> Thanks,
>> Coleen
>>>
>>> Otherwise all okay.
>>>
>>> Thanks,
>>> David
>>>
>>>> Thanks,
>>>> Coleen
>>
> 

From david.holmes at oracle.com  Mon Jul 27 01:46:03 2020
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 27 Jul 2020 11:46:03 +1000
Subject: RFR: 8250556: revert JVMCI part of JDK-8230395
In-Reply-To: <0DE58760-A197-46FD-99B0-A2C1A5394DEE@oracle.com>
References: <0DE58760-A197-46FD-99B0-A2C1A5394DEE@oracle.com>
Message-ID: <c6864b7d-52e9-aa2e-bf58-df0bc4b713c0@oracle.com>

Hi Doug,

This looks like an accurate reversal of the previous changes.

Thanks,
David
-----

On 25/07/2020 4:35 am, Doug Simon wrote:
> (with correct subject this time)
> 
> Please review this bug fix to revert the JVMCI changes made as part of JDK-8230395.
> 
> Instead of aborting the VM when JVMCI counter expansion fails, the JVMCI client should simply be informed of the failure (as was originally suggested <https://bugs.openjdk.java.net/browse/JDK-8230395?focusedCommentId=14289994&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14289994> by David).
> 
> https://bugs.openjdk.java.net/browse/JDK-8250556
> https://cr.openjdk.java.net/~dnsimon/8250556/webrev.00/
> 
> Testing: hs-tier1,hs-tier2,hs-tier3-graal
> 
> -Doug
> 

From ningsheng.jian at arm.com  Mon Jul 27 01:58:38 2020
From: ningsheng.jian at arm.com (Ningsheng Jian)
Date: Mon, 27 Jul 2020 09:58:38 +0800
Subject: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API
 (Incubator): AArch64 backend changes
In-Reply-To: <2bc029fc-2823-18ac-9aa0-1a8edd7f9094@oracle.com>
References: <VI1PR0802MB2558C90ACF1189F17692FF278EB90@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <9a13f5df-d946-579d-4282-917dc7338dc8@redhat.com>
 <09BC0693-80E0-4F87-855E-0B38A6F5EFA2@oracle.com>
 <668e500e-f621-5a2c-a41e-f73536880f73@redhat.com>
 <VI1PR0802MB255845256424F9EB4D5D81368EB30@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <1909fa9d-98bb-c2fb-45d8-540247d1ca8b@redhat.com>
 <VI1PR0802MB255895ADE9BC8D027EBB81048EB10@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <BYAPR11MB35438110FA50851942CE17A1EF950@BYAPR11MB3543.namprd11.prod.outlook.com>
 <HE1PR0802MB25542AE212E12FD3EF6D1BE48E6E0@HE1PR0802MB2554.eurprd08.prod.outlook.com>
 <2acbcc99-8dd4-b8f1-5982-1d439953c416@redhat.com>
 <VI1PR0802MB2558ECC61528694DBDFFF48F8E670@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <54d6b2b6-b79a-4700-981c-6ab33aca82f2@arm.com>
 <8c05d468-8753-b671-e3a9-92a7148f4f14@oracle.com>
 <f588f038-669d-2320-9d3f-c12e6e9c1a66@arm.com>
 <2bc029fc-2823-18ac-9aa0-1a8edd7f9094@oracle.com>
Message-ID: <942c4be0-4f5d-acd6-86ae-e6769215ca37@arm.com>

Thank you Erik!

Regards,
Ningsheng

On 7/23/20 9:06 PM, Erik Joelsson wrote:
> Hello Ningsheng,
> 
> Build change looks good.
> 
> /Erik
> 
> On 2020-07-23 01:02, Ningsheng Jian wrote:
>> Hi Vladimir,
>>
>> Thanks for pointing out this. Yes, I missed that change in shared 
>> code. I've regenerated the webrev, with GensrcAdlc.gmk file change 
>> included:
>>
>> http://cr.openjdk.java.net/~njian/vectorapi/8223347-integration/aarch64-webrev.01/ 
>>
>>
>> Also add build-dev.
>>
>> Thanks,
>> Ningsheng
>>
>> On 7/23/20 5:36 AM, Vladimir Ivanov wrote:
>>>> http://cr.openjdk.java.net/~njian/vectorapi/8223347-integration/aarch64-webrev.01/ 
>>>
>>>
>>>
>>>
>>> FTR there's one more aarch64-specific change in shared code to enable 
>>> aarch64_neon.ad processing:
>>>
>>> diff --git a/make/hotspot/gensrc/GensrcAdlc.gmk 
>>> b/make/hotspot/gensrc/GensrcAdlc.gmk
>>> --- a/make/hotspot/gensrc/GensrcAdlc.gmk
>>> +++ b/make/hotspot/gensrc/GensrcAdlc.gmk
>>> @@ -129,6 +129,12 @@
>>>
>>> $d/os_cpu/$(HOTSPOT_TARGET_OS)_$(HOTSPOT_TARGET_CPU_ARCH)/$(HOTSPOT_TARGET_OS)_$(HOTSPOT_TARGET_CPU_ARCH).ad 
>>> \
>>>       )))
>>>
>>> +  ifeq ($(HOTSPOT_TARGET_CPU_ARCH), aarch64)
>>> +    AD_SRC_FILES += $(call uniq, $(wildcard $(foreach d, 
>>> $(AD_SRC_ROOTS), \
>>> + $d/cpu/$(HOTSPOT_TARGET_CPU_ARCH)/$(HOTSPOT_TARGET_CPU_ARCH)_neon.ad \
>>> +    )))
>>> +  endif
>>> +
>>>     ifeq ($(call check-jvm-feature, shenandoahgc), true)
>>>       AD_SRC_FILES += $(call uniq, $(wildcard $(foreach d, 
>>> $(AD_SRC_ROOTS), \
>>>
>>> $d/cpu/$(HOTSPOT_TARGET_CPU_ARCH)/gc/shenandoah/shenandoah_$(HOTSPOT_TARGET_CPU).ad 
>>> \
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>>> On 7/8/20 3:05 PM, Yang Zhang wrote:
>>>>> Hi Andrew
>>>>>
>>>>> I have updated this patch. Could you please help to review it again?
>>>>> In this patch, the following changes are made:
>>>>> 1. Separate newly added NEON instructions to a new ad file
>>>>>    aarch64_neon.ad
>>>>> 2. Add assembler tests for NEON instructions. Trailing spaces
>>>>>    in the python script are also removed.
>>>>>
>>>>> http://cr.openjdk.java.net/~yzhang/vectorapi/vectorapi.rfr/aarch64_webrev/webrev.02/ 
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Yang
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Andrew Haley <aph at redhat.com>
>>>>> Sent: Tuesday, June 30, 2020 12:10 AM
>>>>> To: Yang Zhang <Yang.Zhang at arm.com>; Viswanathan, Sandhya 
>>>>> <sandhya.viswanathan at intel.com>; Paul Sandoz <paul.sandoz at oracle.com>
>>>>> Cc: nd <nd at arm.com>; hotspot-compiler-dev at openjdk.java.net; 
>>>>> hotspot-dev at openjdk.java.net; core-libs-dev at openjdk.java.net; 
>>>>> aarch64-port-dev at openjdk.java.net
>>>>> Subject: Re: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of 
>>>>> Vector API (Incubator): AArch64 backend changes
>>>>>
>>>>> On 29/06/2020 08:48, Yang Zhang wrote:
>>>>>> 1. Instructions that can be matched with NEON instructions directly.
>>>>>> MulVB, SqrtVF and AbsV have been merged into jdk master already.
>>>>>>
>>>>>> 2. Instructions that jdk master has middle end support for, but 
>>>>>> they cannot be matched with NEON instructions directly.
>>>>>> Such as AddReductionVL, MulReductionVL, And/Or/XorReductionV These 
>>>>>> new instructions can be moved into jdk master first, but for 
>>>>>> auto-vectorization, the performance might not get improved.
>>>>>>
>>>>>> 3. Panama/Vector API specific instructions such as 
>>>>>> Load/StoreVector ( 16 bits), VectorReinterpret, VectorMaskCmp, 
>>>>>> MaxV/MinV, VectorBlend etc.
>>>>>> These instructions cannot be moved into jdk master first because 
>>>>>> there isn't middle-end support.
>>>>>>
>>>>>> I will put 2 and 3 in a new ad file aarch64_neon.ad. I will also 
>>>>>> update aarch64_asmtest.py and macroassemler.cpp. When the patch is 
>>>>>> ready, I will send it again.
>>>>>
>>>>> Thank you *very* much for your hard work. Appreciated!
>>>>>
>>>>> -- 
>>>>> Andrew Haley  (he/him)
>>>>> Java Platform Lead Engineer
>>>>> Red Hat UK Ltd. <https://www.redhat.com> 
>>>>> https://keybase.io/andrewhaley
>>>>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>>>>>
>>>>
>>


From christian.hagedorn at oracle.com  Mon Jul 27 06:42:45 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Mon, 27 Jul 2020 08:42:45 +0200
Subject: [16] RFR(XS): 8249602: C2: assert(cnt == _outcnt) failed: no
 insertions allowed
In-Reply-To: <82cbd463-d480-b882-04da-0d1269717fff@oracle.com>
References: <2cd118ab-c117-bf61-ae03-117b9383a5e6@oracle.com>
 <82cbd463-d480-b882-04da-0d1269717fff@oracle.com>
Message-ID: <e37256e4-0f08-fe34-a3f1-5c301a93030d@oracle.com>

Thank you Vladimir for your review!

Best regards,
Christian

On 24.07.20 20:03, Vladimir Kozlov wrote:
> Looks good.
> 
> Thanks,
> Vladimir
> 
> On 7/24/20 5:57 AM, Christian Hagedorn wrote:
>> Hi
>>
>> Please review the following patch:
>> https://bugs.openjdk.java.net/browse/JDK-8249602
>> http://cr.openjdk.java.net/~chagedorn/8249602/webrev.00/
>>
>> The testcase hits the assert when inserting a post loop. When 
>> correcting the fall-in values to the post-loop phis to take the values 
>> from the main-loop, we have to separately handle nodes that belong to 
>> the backedge control block and cannot float. In this process, we clone 
>> data nodes in PhaseIdealLoop::clone_up_backedge_goo and then hit the 
>> assert because some nodes to be cloned have a control input from the 
>> main-loop header node (main_head). These nodes are cloned and the 
>> main_head node gets these nodes as additional output nodes. This 
>> should be fine but the DUIterator_Fast forbids insertions.
>>
>> The fix simply switches to a normal DUIterator which allows 
>> insertions. This should also be done when correcting the fall-in 
>> values to the main-loop to take the values from the pre-loop.
>>
>> Best regards,
>> Christian
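
The difference between the two iteration styles described in the quoted
text can be modelled with a toy example. This is only an analogy under
stated assumptions: ToyNode and both walk functions are hypothetical,
and HotSpot's DUIterator and DUIterator_Fast are implemented
differently. The point is that a walk which captures the out-count once
cannot tolerate appended outputs, while a walk that re-reads the count
on every step can.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy node whose "outs" are plain integers; values >= 100 mark clones.
struct ToyNode {
  std::vector<int> outs;
};

// Snapshot-style walk: the bound is captured once, mirroring the
// "cnt == _outcnt" check. Returns false if the count changed underneath.
inline bool walk_snapshot_is_consistent(ToyNode& n, bool insert) {
  const std::size_t cnt = n.outs.size();
  for (std::size_t i = 0; i < cnt; i++) {
    int use = n.outs[i];  // copy first: push_back may reallocate
    if (insert && use < 100) {
      n.outs.push_back(use + 100);  // "clone" becomes an extra output
    }
  }
  return cnt == n.outs.size();
}

// Reloading walk: the size is re-read on every step, so appended
// entries are tolerated and visited as well.
inline std::size_t walk_reloading(ToyNode& n) {
  std::size_t visited = 0;
  for (std::size_t i = 0; i < n.outs.size(); i++) {
    int use = n.outs[i];  // copy first: push_back may reallocate
    if (use < 100) {
      n.outs.push_back(use + 100);
    }
    visited++;
  }
  return visited;
}
```

In this model the snapshot walk reports an inconsistency as soon as a
clone is appended, which corresponds to the assert in the bug title,
while the reloading walk simply visits the new entries too.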

From tobias.hartmann at oracle.com  Mon Jul 27 07:35:56 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 27 Jul 2020 09:35:56 +0200
Subject: [16] RFR(S): 8248552: C2 crashes with SIGFPE due to division by
 zero
In-Reply-To: <e1ef73a9-e38f-0847-fdaa-d1b69550bdee@oracle.com>
References: <70e8e42b-5cb3-9c1e-419e-2f771f042368@oracle.com>
 <3ba2ef6a-8ade-7ede-5252-21051c34b472@oracle.com>
 <9e2f26bd-daa4-9540-8401-9850e0beea94@oracle.com>
 <c569ebf8-58ed-c6df-2e0e-38c692f926c6@oracle.com>
 <a2066deb-ef3b-7071-33ff-03c829069cff@oracle.com>
 <f6a62d10-9cff-3fc2-0947-c49e775a23c9@oracle.com>
 <e8d6aa47-b584-4fdc-f506-1b1c52aa28f5@oracle.com>
 <ca043859-bb7b-f64a-a37d-7b4e47c745d6@oracle.com>
 <5b2e7b1b-24f7-d575-58a3-376ec9ab7944@oracle.com>
 <518cd022-73e1-cb5c-499d-86853ae679c3@oracle.com>
 <2f5978fe-af76-df18-15c0-dcc62563299d@oracle.com>
 <e1ef73a9-e38f-0847-fdaa-d1b69550bdee@oracle.com>
Message-ID: <15b4d739-a972-13db-5b61-f08ab24d2ca7@oracle.com>

+1

Best regards,
Tobias

On 24.07.20 23:41, Vladimir Kozlov wrote:
> Good.
> 
> Thanks,
> Vladimir K
> 
> On 7/24/20 4:44 AM, Christian Hagedorn wrote:
>> Hi Tobias
>>
>> Thank you for your review!
>>
>>> Please make sure to run performance testing.
>>
>> There is a repeated regression in the micros open crypto benchmark
>> openjdk.bench.javax.crypto.small.SecureRandomBench.nextBytes with these two settings:
>> - algorithm=SHA1PRNG-dataSize:64-provider:-shared:false
>> - algorithm=SHA1PRNG-dataSize:64-provider:-shared:true
>>
>> Repeated runs with these two settings resulted in a regression between 1 and 2%. I could trace it
>> back to the additional type filtering in PhiNode::Value() (webrev.02). This is only required for
>> the assertion code and not for the bailout fix itself. When running performance testing with
>> webrev.01, the regressions disappear.
>>
>> I therefore suggest to go with webrev.01 (without assertion code and type filtering) and file a
>> new RFE to investigate the usage of type filtering in PhiNode::Value() for iv phis and why we get
>> a performance regression in these two benchmark settings. In theory, I think it should be
>> beneficial to narrow the type range of iv phis.
>>
>>> cfgnode.cpp:1083
>>> - There's an extra whitespace before ","
>>>
>>> loopopts.cpp:84/86
>>> - No need for extra brackets
>>
>> These are not present anymore in webrev.01.
>>
>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.01/
>>
>> Best regards,
>> Christian
>>
>>
>> On 20.07.20 11:14, Tobias Hartmann wrote:
>>> Hi Christian,
>>>
>>> On 15.07.20 15:08, Christian Hagedorn wrote:
>>>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.02/
>>>
>>> Looks good to me.
>>>
>>> Some code style comments:
>>
>>
>>> Best regards,
>>> Tobias
>>>

From tobias.hartmann at oracle.com  Mon Jul 27 07:50:15 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 27 Jul 2020 09:50:15 +0200
Subject: [16] RFR(XS): 8249602: C2: assert(cnt == _outcnt) failed: no
 insertions allowed
In-Reply-To: <2cd118ab-c117-bf61-ae03-117b9383a5e6@oracle.com>
References: <2cd118ab-c117-bf61-ae03-117b9383a5e6@oracle.com>
Message-ID: <81184963-aa5a-17e8-cc99-58e60045cbbb@oracle.com>

Hi Christian,

On 24.07.20 14:57, Christian Hagedorn wrote:
> http://cr.openjdk.java.net/~chagedorn/8249602/webrev.00/

Looks good to me!

Small suggestion: The test can be executed in same VM mode because it does not require any
additional flags. No new webrev required.

JDK 11 is probably affected as well, right? If so, please add the corresponding affects version.

Best regards,
Tobias


From christian.hagedorn at oracle.com  Mon Jul 27 07:58:23 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Mon, 27 Jul 2020 09:58:23 +0200
Subject: [16] RFR(S): 8248552: C2 crashes with SIGFPE due to division by
 zero
In-Reply-To: <15b4d739-a972-13db-5b61-f08ab24d2ca7@oracle.com>
References: <70e8e42b-5cb3-9c1e-419e-2f771f042368@oracle.com>
 <3ba2ef6a-8ade-7ede-5252-21051c34b472@oracle.com>
 <9e2f26bd-daa4-9540-8401-9850e0beea94@oracle.com>
 <c569ebf8-58ed-c6df-2e0e-38c692f926c6@oracle.com>
 <a2066deb-ef3b-7071-33ff-03c829069cff@oracle.com>
 <f6a62d10-9cff-3fc2-0947-c49e775a23c9@oracle.com>
 <e8d6aa47-b584-4fdc-f506-1b1c52aa28f5@oracle.com>
 <ca043859-bb7b-f64a-a37d-7b4e47c745d6@oracle.com>
 <5b2e7b1b-24f7-d575-58a3-376ec9ab7944@oracle.com>
 <518cd022-73e1-cb5c-499d-86853ae679c3@oracle.com>
 <2f5978fe-af76-df18-15c0-dcc62563299d@oracle.com>
 <e1ef73a9-e38f-0847-fdaa-d1b69550bdee@oracle.com>
 <15b4d739-a972-13db-5b61-f08ab24d2ca7@oracle.com>
Message-ID: <8d1cd893-b366-272a-59f9-e3180526a07c@oracle.com>

Thank you Vladimir and Tobias for reviewing it again!

I filed an RFE [1] to investigate the change in PhiNode::Value() further.

Best regards,
Christian

[1] https://bugs.openjdk.java.net/browse/JDK-8250607

On 27.07.20 09:35, Tobias Hartmann wrote:
> +1
> 
> Best regards,
> Tobias
> 
> On 24.07.20 23:41, Vladimir Kozlov wrote:
>> Good.
>>
>> Thanks,
>> Vladimir K
>>
>> On 7/24/20 4:44 AM, Christian Hagedorn wrote:
>>> Hi Tobias
>>>
>>> Thank you for your review!
>>>
>>>> Please make sure to run performance testing.
>>>
>>> There is a repeated regression in the micros open crypto benchmark
>>> openjdk.bench.javax.crypto.small.SecureRandomBench.nextBytes with these two settings:
>>> - algorithm=SHA1PRNG-dataSize:64-provider:-shared:false
>>> - algorithm=SHA1PRNG-dataSize:64-provider:-shared:true
>>>
>>> Repeated runs with these two settings resulted in a regression between 1 and 2%. I could trace it
>>> back to the additional type filtering in PhiNode::Value() (webrev.02). This is only required for
>>> the assertion code and not for the bailout fix itself. When running performance testing with
>>> webrev.01, the regressions disappear.
>>>
>>> I therefore suggest to go with webrev.01 (without assertion code and type filtering) and file a
>>> new RFE to investigate the usage of type filtering in PhiNode::Value() for iv phis and why we get
>>> a performance regression in these two benchmark settings. In theory, I think it should be
>>> beneficial to narrow the type range of iv phis.
>>>
>>>> cfgnode.cpp:1083
>>>> - There's an extra whitespace before ","
>>>>
>>>> loopopts.cpp:84/86
>>>> - No need for extra brackets
>>>
>>> These are not present anymore in webrev.01.
>>>
>>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.01/
>>>
>>> Best regards,
>>> Christian
>>>
>>>
>>> On 20.07.20 11:14, Tobias Hartmann wrote:
>>>> Hi Christian,
>>>>
>>>> On 15.07.20 15:08, Christian Hagedorn wrote:
>>>>> http://cr.openjdk.java.net/~chagedorn/8248552/webrev.02/
>>>>
>>>> Looks good to me.
>>>>
>>>> Some code style comments:
>>>
>>>
>>>> Best regards,
>>>> Tobias
>>>>

From christian.hagedorn at oracle.com  Mon Jul 27 08:12:35 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Mon, 27 Jul 2020 10:12:35 +0200
Subject: [16] RFR(XS): 8249602: C2: assert(cnt == _outcnt) failed: no
 insertions allowed
In-Reply-To: <81184963-aa5a-17e8-cc99-58e60045cbbb@oracle.com>
References: <2cd118ab-c117-bf61-ae03-117b9383a5e6@oracle.com>
 <81184963-aa5a-17e8-cc99-58e60045cbbb@oracle.com>
Message-ID: <e31c2972-15e3-e3b3-8b30-7db4e127426d@oracle.com>

Hi Tobias

On 27.07.20 09:50, Tobias Hartmann wrote:
> Hi Christian,
> 
> On 24.07.20 14:57, Christian Hagedorn wrote:
>> http://cr.openjdk.java.net/~chagedorn/8249602/webrev.00/
> 
> Looks good to me!

Thank you for your review!

> Small suggestion: The test can be executed in same VM mode because it does not require any
> additional flags. No new webrev required.

Yes that makes sense. I updated my webrev inline.

> JDK 11 is probably affected as well, right? If so, please add the corresponding affects version.

Even though I could not directly reproduce it in JDK 11, it should also
be affected, as the changed code has been there for a long time. I
added 11 as an affected version.

Best regards,
Christian

From nick.gasson at arm.com  Mon Jul 27 08:50:56 2020
From: nick.gasson at arm.com (Nick Gasson)
Date: Mon, 27 Jul 2020 16:50:56 +0800
Subject: RFR(S): 8237483: AArch64 C1 OopMap inserted twice fatal error
Message-ID: <85k0ypjq8f.fsf@nicgas01-pc.shanghai.arm.com>

Hi,

Bug: https://bugs.openjdk.java.net/browse/JDK-8237483
Webrev: http://cr.openjdk.java.net/~ngasson/8237483/webrev.0/

In the method java.util.zip.Inflater::inflate C1 generates these two LIR
instructions:

 724 move [c_rarg3|I] [Base:[c_rarg1|L] Disp: 2147483647|I] [patch_normal] [bci:95]
 728 throw [c_rarg3|I] [c_rarg0|L]  [bci:100]

The move instruction at 724 generates a runtime call to deoptimise the
method since this patching is not implemented on AArch64. An oop map is
inserted for the return PC of the runtime call
(LIR_Assembler::deoptimize_trap()). The following throw LIR instruction
then inserts another oop map at the same PC, triggering an assertion
failure.

To reproduce:

  make test TEST="compiler/c1/CanonicalizeArrayLength.java" \
     JTREG="VM_OPTIONS=-Xcomp"

This patch just adds a NOP in this situation to ensure the PCs are
unique. Not sure if there's a better way to do it?
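
To make the failure mode concrete, here is a toy model in Java. It is
not HotSpot code, and OopMapModel and its methods are made-up names. It
assumes only what is described above: oop maps are keyed by PC, a second
map at the same PC is an error, and emitting one 4-byte instruction (the
NOP) moves the PC forward.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only: hypothetical names, not the HotSpot OopMap API.
final class OopMapModel {
    private static final int INSN_SIZE = 4; // every AArch64 instruction is 4 bytes
    private final Set<Integer> recordedPcs = new HashSet<>();
    private int pc = 0;

    // Emit one instruction (for example a NOP) and advance the current PC.
    void emitInstruction() {
        pc += INSN_SIZE;
    }

    // Record an oop map at the current PC; a second map at the same PC is rejected.
    void recordOopMap() {
        if (!recordedPcs.add(pc)) {
            throw new IllegalStateException("OopMap inserted twice at pc " + pc);
        }
    }

    int pc() {
        return pc;
    }
}
```

Two recordOopMap() calls in a row throw in this model; calling
emitInstruction() between them, as the NOP does in the patch, keeps the
PCs distinct.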

Tested hotspot_all_no_apps, jdk_core.

--
Thanks,
Nick

From aph at redhat.com  Mon Jul 27 09:40:57 2020
From: aph at redhat.com (Andrew Haley)
Date: Mon, 27 Jul 2020 10:40:57 +0100
Subject: [aarch64-port-dev ] RFR(S): 8237483: AArch64 C1 OopMap inserted
 twice fatal error
In-Reply-To: <85k0ypjq8f.fsf@nicgas01-pc.shanghai.arm.com>
References: <85k0ypjq8f.fsf@nicgas01-pc.shanghai.arm.com>
Message-ID: <04c4f9e0-e29a-3250-878c-2b29c11a45d8@redhat.com>

On 7/27/20 9:50 AM, Nick Gasson wrote:
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8237483
> Webrev: http://cr.openjdk.java.net/~ngasson/8237483/webrev.0/
>
> In the method java.util.zip.Inflater::inflate C1 generates these two LIR
> instructions:
>
>  724 move [c_rarg3|I] [Base:[c_rarg1|L] Disp: 2147483647|I] [patch_normal] [bci:95]
>  728 throw [c_rarg3|I] [c_rarg0|L]  [bci:100]
>
> The move instruction at 724 generates a runtime call to deoptimise the
> method since this patching is not implemented on AArch64. An oop map is
> inserted for the return PC of the runtime call
> (LIR_Assembler::deoptimize_trap()). The following throw LIR instruction
> then inserts another oop map at the same PC, triggering an assertion
> failure.
>
> To reproduce:
>
>   make test TEST="compiler/c1/CanonicalizeArrayLength.java" \
>      JTREG="VM_OPTIONS=-Xcomp"
>
> This patch just adds a NOP in this situation to ensure the PCs are
> unique. Not sure if there's a better way to do it?

I would have thought it would make more sense, rather than asserting,
simply to detect that we already have an oopmap so we don't need
another one. Having said that, it's probably not worth worrying about
so your fix is OK.

It needs a better comment, though. The only way to find out why this
code is here would be to trawl the email archives. Something like this
would do:

// In the method java.util.zip.Inflater::inflate C1 generates these two LIR
// instructions:

//  724 move [c_rarg3|I] [Base:[c_rarg1|L] Disp: 2147483647|I] [patch_normal] [bci:95]
//  728 throw [c_rarg3|I] [c_rarg0|L]  [bci:100]

// The move instruction at 724 generates a runtime call to deoptimise the
// method since this patching is not implemented on AArch64. An oop map is
// inserted for the return PC of the runtime call
// (LIR_Assembler::deoptimize_trap()). The following throw LIR instruction
// then inserts another oop map at the same PC, triggering an assertion
// failure.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From patric.hedlin at oracle.com  Mon Jul 27 10:02:49 2020
From: patric.hedlin at oracle.com (Patric Hedlin)
Date: Mon, 27 Jul 2020 12:02:49 +0200
Subject: [aarch64-port-dev ] RFR(S/M): 8247766: [aarch64] guarantee(val <
 (1U << nbits)) failed: Field too big for insn
In-Reply-To: <2809ab8c-4a2e-c0c3-9b93-a0f5df41b992@redhat.com>
References: <0cdbdf26-ad4d-056b-a801-cc31b2cc4ab3@oracle.com>
 <2809ab8c-4a2e-c0c3-9b93-a0f5df41b992@redhat.com>
Message-ID: <4d5b4219-3f9a-f606-64cc-4bc40fe2c7bd@oracle.com>

Hi Andrew,

On 2020-07-09 17:48, Andrew Haley wrote:
> On 07/07/2020 12:17, Patric Hedlin wrote:
>> Dear all,
>>
>> I would like to ask for help to review the following change/update:
>>
>> Issue:  https://bugs.openjdk.java.net/browse/JDK-8247766
>> Webrev: http://cr.openjdk.java.net/~phedlin/tr8247766/
>>
>>
>> C1 code generation for reading and writing stack-slots does not handle
>> large immediate offsets on aarch64. This patch will ensure that
>> immediate offsets are admissible for base+(immediate)offset encoding or,
>> if this is not the case, will enforce an explicit address calculation to
>> a scratch register. (Also correcting a small glitch in 9-bit signed
>> immediate encoding check.)
> This is all very complicated.
>
> So it seems to me that there is a better way to do this. We already have
> MacroAssembler::legitimize_address(), and you should use that.
>
> Like so:
>
> diff -r 7c59af4db158 src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.cpp
> --- a/src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.cpp       Thu Jul 09 11:01:29 2020 -0400
> +++ b/src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.cpp       Thu Jul 09 11:36:02 2020 -0400
> @@ -736,25 +736,32 @@
>
>   void LIR_Assembler::reg2stack(LIR_Opr src, LIR_Opr dest, BasicType type, bool pop_fpu_stack) {
>     if (src->is_single_cpu()) {
> +    int index = dest->single_stack_ix();
>       if (is_reference_type(type)) {
> -      __ str(src->as_register(), frame_map()->address_for_slot(dest->single_stack_ix()));
> +      __ str(src->as_register(),
> +             __ legitimize_address(frame_map()->address_for_slot(index), BytesPerWord, rscratch1));
>         __ verify_oop(src->as_register());
>       } else if (type == T_METADATA || type == T_DOUBLE || type == T_ADDRESS) {
> -      __ str(src->as_register(), frame_map()->address_for_slot(dest->single_stack_ix()));
> +      __ str(src->as_register(),
> +             __ legitimize_address(frame_map()->address_for_slot(index), BytesPerWord, rscratch1));
>       } else {
> -      __ strw(src->as_register(), frame_map()->address_for_slot(dest->single_stack_ix()));
> +      __ strw(src->as_register(),
> +              __ legitimize_address(frame_map()->address_for_slot(index), BytesPerInt, rscratch1));
>       }
>
>     } else if (src->is_double_cpu()) {
>       Address dest_addr_LO = frame_map()->address_for_slot(dest->double_stack_ix(), lo_word_offset_in_bytes);
> +    dest_addr_LO = __ legitimize_address(dest_addr_LO, BytesPerLong, rscratch1);
>       __ str(src->as_register_lo(), dest_addr_LO);
>
>     } else if (src->is_single_fpu()) {
>       Address dest_addr = frame_map()->address_for_slot(dest->single_stack_ix());
> +    dest_addr = __ legitimize_address(dest_addr, BytesPerInt, rscratch1);
>       __ strs(src->as_float_reg(), dest_addr);
>
>     } else if (src->is_double_fpu()) {
>       Address dest_addr = frame_map()->address_for_slot(dest->double_stack_ix());
> +    dest_addr = __ legitimize_address(dest_addr, BytesPerLong, rscratch1);
>       __ strd(src->as_double_reg(), dest_addr);
>
>     } else {
>
> stack_offset_in_reach() seems to duplicate the functionality of offset_ok_for_immed(),
> and it's only used in this one place. By all means please use the new is_uimm() and
> is_simm() in offset_ok_for_immed().
>

I've refreshed the webrev (as discussed off-line), moving 
legitimize_address() into the stack_slot_address() with additional 
conditions related to the (well-aligned) frame slot address produced. 
is_simm9() and is_uimm12() now use the implementation in Assembler.
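
For readers who do not have the encoding rules in mind, the check can be
sketched in Java as follows. This is an illustration under stated
assumptions, not the code in the webrev: the class and method names are
hypothetical, and it models only the two AArch64 load/store immediate
forms (a 12-bit unsigned offset scaled by the access size, and a 9-bit
signed unscaled offset).

```java
// Sketch only: hypothetical helper names, not the HotSpot Assembler API.
final class ImmediateOffsets {
    // True if x fits in a signed two's-complement field of the given width.
    static boolean isSimm(long x, int bits) {
        long limit = 1L << (bits - 1);
        return -limit <= x && x < limit;
    }

    // True if x fits in an unsigned field of the given width.
    static boolean isUimm(long x, int bits) {
        return 0 <= x && x < (1L << bits);
    }

    // shift is log2 of the access size in bytes (2 for 32-bit, 3 for 64-bit).
    static boolean offsetOkForImmediate(long offset, int shift) {
        long alignMask = (1L << shift) - 1;
        // Scaled form: offset is a multiple of the access size and
        // offset / size fits in 12 unsigned bits.
        boolean scaled = (offset & alignMask) == 0 && isUimm(offset >> shift, 12);
        // Unscaled form: any byte offset that fits in 9 signed bits.
        boolean unscaled = isSimm(offset, 9);
        return scaled || unscaled;
    }
}
```

An offset that fails both tests is the case where the address has to be
computed into a scratch register first.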

/Patric


From felix.yang at huawei.com  Mon Jul 27 12:27:19 2020
From: felix.yang at huawei.com (Yangfei (Felix))
Date: Mon, 27 Jul 2020 12:27:19 +0000
Subject: RFR(S): 8250609: C2 crash in IfNode::fold_compares
Message-ID: <DA41BE1DDCA941489001C7FBD7A8820EE7E8D547@dggeml507-mbx.china.huawei.com>

Hi,

Bug: https://bugs.openjdk.java.net/browse/JDK-8250609 
Webrev: http://cr.openjdk.java.net/~fyang/8250609/webrev.00/ 

In IfNode::fold_compares_helper, C2 tries to fold 2 CmpI into a single CmpU. 
At the crash site in IfNode::fold_compares_helper:
 995   if (lo && hi) {
 996     // Merge the two compares into a single unsigned compare by building (CmpU (n - lo) (hi - lo))
 997     Node* adjusted_val = igvn->transform(new SubINode(n,  lo));
 998     if (adjusted_lim == NULL) {
 999       adjusted_lim = igvn->transform(new SubINode(hi, lo));
1000     }

At line 997, we have:
(gdb) p lo->dump()
 641    AddI    === _  513  92  [[]]
$1 = void

After the transformation at line 997, we have
(gdb) p lo->dump()
 641    AddI    === _ _ _  [[]]   [34200641]
$3 = void

Then node 641 was used at line 999, which triggers the crash.
The patch fixes the issue by temporarily delaying the transformation in IfNode::fold_compares.
Tier1-3 tested on aarch64-linux-gnu & x86_64-linux-gnu.
The newly added test fails without the patch and passes with it.
Suggestions?
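
For context, the fold itself (two signed compares merged into one
unsigned compare) can be illustrated in plain Java. This is only a
sketch with hypothetical names, and it assumes lo <= hi as a signed
comparison; it says nothing about how C2 establishes that condition.

```java
// Sketch only: hypothetical names; assumes lo <= hi (signed).
final class RangeCheckFold {
    // The original shape: two signed compares.
    static boolean twoSignedCompares(int n, int lo, int hi) {
        return lo <= n && n < hi;
    }

    // The folded shape: (CmpU (n - lo) (hi - lo)).
    static boolean oneUnsignedCompare(int n, int lo, int hi) {
        int adjustedVal = n - lo; // wraps around on overflow
        int adjustedLim = hi - lo;
        return Integer.compareUnsigned(adjustedVal, adjustedLim) < 0;
    }

    public static void main(String[] args) {
        int[] samples = {Integer.MIN_VALUE, -7, -1, 0, 1, 9, 10, 11, Integer.MAX_VALUE};
        for (int n : samples) {
            if (twoSignedCompares(n, 0, 10) != oneUnsignedCompare(n, 0, 10)) {
                throw new AssertionError("mismatch for n = " + n);
            }
        }
        System.out.println("both forms agree on all samples");
    }
}
```

Values below lo wrap around to large unsigned numbers after the
subtraction, which is why a single unsigned compare covers both bounds.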

Thanks,
Felix

From aph at redhat.com  Mon Jul 27 13:24:23 2020
From: aph at redhat.com (Andrew Haley)
Date: Mon, 27 Jul 2020 14:24:23 +0100
Subject: [aarch64-port-dev ] RFR(S/M): 8247766: [aarch64] guarantee(val <
 (1U << nbits)) failed: Field too big for insn
In-Reply-To: <4d5b4219-3f9a-f606-64cc-4bc40fe2c7bd@oracle.com>
References: <0cdbdf26-ad4d-056b-a801-cc31b2cc4ab3@oracle.com>
 <2809ab8c-4a2e-c0c3-9b93-a0f5df41b992@redhat.com>
 <4d5b4219-3f9a-f606-64cc-4bc40fe2c7bd@oracle.com>
Message-ID: <6663f2a2-ccd8-5692-d90b-6ea664294ea8@redhat.com>

Hi,

On 27/07/2020 11:02, Patric Hedlin wrote:
>
> I've refreshed the webrev (as discussed off-line), moving
> legitimize_address() into the stack_slot_address() with additional
> conditions related to the (well-aligned) frame slot address produced.
> is_simm9() and is_uimm12() now use the implementation in Assembler.

That's much nicer. Some minor nits...

Please pass the scratch register to be used as an argument to
stack_slot_address:

+// Ensure a valid Address (base + offset) to a stack-slot. If stack access is
+// not encodable as a base + (immediate) offset, generate an explicit address
+// calculation to hold the address in a temporary register (rscratch1).
+Address LIR_Assembler::stack_slot_address(int index, uint size, int adjust) {

These consts are too obscure. Please be explicit: either use one of
the predefined constants that mean the same thing (such as
BytesPerInt) or if you really want low-level types, sizeof(int32_t):

+  uint const c_sz32 = 4;
+  uint const c_sz64 = 8;

Otherwise OK. It doesn't need another review with these changes.

Thanks.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From eric.c.liu at arm.com  Mon Jul 27 14:35:17 2020
From: eric.c.liu at arm.com (Eric Liu)
Date: Mon, 27 Jul 2020 14:35:17 +0000
Subject: [aarch64-port-dev ] RFR 8248870: AARCH64: I2L conversions can be
 skipped for small positive masked values
In-Reply-To: <e3e7ad61-7747-a16a-6335-f3eed6c507ca@redhat.com>
References: <d83ab998-ad1e-4889-b637-dc7f0369c50e@bell-sw.com>
 <9ccf64f1-7a88-0f67-8b50-4dea09af9c8b@redhat.com>
 <e54d3d17-c865-a422-e9eb-05b3936e81da@bell-sw.com>
 <05369383-c6d8-5e61-50ce-51fec955e2d4@bell-sw.com>,
 <e3e7ad61-7747-a16a-6335-f3eed6c507ca@redhat.com>
Message-ID: <AM6PR08MB44227FC8590B04B4513C440BC5720@AM6PR08MB4422.eurprd08.prod.outlook.com>

Hi,

We are planning to elide the redundant 'sxt' on AArch64 in the macro-assembler for better performance and smaller code size. I think the
redundant sign extension can be generated in the following cases:

A)  Load a value narrower than 32 bits and then use it as a 64-bit value. E.g.
     ldrsb w1, mem
     sxt   x1, w1
B)  AND with a positive number. E.g.
     and     w11, w1, #0xffff
     sxtw    x0, w11
C) Sign extend a number twice. E.g.
     sxth    w11, w1
     sxtw    x0, w11

To address issue A), C2's current ad file has about 8 rules that match these kinds of patterns. E.g.

// Load Byte (8 bit signed) into long
 instruct loadB2L(iRegLNoSp dst, memory1 mem)
 %{
   match(Set dst (ConvI2L (LoadB mem)));
   predicate(!needs_acquiring_load(n->in(1)));
   ins_cost(4 * INSN_COST);
   format %{ "ldrsb  $dst, $mem\t# byte" %}
   ins_encode(aarch64_enc_ldrsb(dst, mem));
   ins_pipe(iload_reg_mem);
%}

For issue B), Boris' patch does a good job of eliding the redundant 'sxt' that follows an 'and'. But this pair can also be generated
by other patterns, e.g. (ConvI2L (RShiftI (LShiftI src lshift_count) rshift_count)). This pattern can be reproduced by the case below:

        public static long l2c2l (long x) {
          return (char) x;
        }

For issue C), type conversions usually generate this kind of code, e.g.

    private static long test_l2s2l(long x) {
        return (short) x;
    }


In my view, eliding the redundant 'sxt' above is more of a machine code problem than an IR problem.
I think a peephole optimization is the best fit here. However, C2's peephole is very complicated and I'm not
sure whether it has been enabled on AArch64.

So I was wondering whether it would be better to remove these kinds of instructions in the macro-assembler, even though this sounds somewhat beyond
the assembler's responsibility. By handling those pairs in the macro-assembler, we can focus on the instruction sequence alone rather
than the type and shape of the IR nodes.


What do you think? Any feedback is welcome!


--
Best regards,
Eric


From: hotspot-compiler-dev <hotspot-compiler-dev-retn at openjdk.java.net> on behalf of Andrew Haley <aph at redhat.com>
Sent: 24 July 2020 16:18
To: Boris Ulasevich <boris.ulasevich at bell-sw.com>; aarch64-port-dev at openjdk.java.net <aarch64-port-dev at openjdk.java.net>
Cc: hotspot-compiler-dev at openjdk.java.net <hotspot-compiler-dev at openjdk.java.net>
Subject: Re: [aarch64-port-dev ] RFR 8248870: AARCH64: I2L conversions can be skipped for small positive masked values

On 23/07/2020 12:25, Boris Ulasevich wrote:
> Since the JDK-8248414 patch has been committed, I believe we can revive
> this review. I think it is still better to move my rule to the ubfiz
> command group,
> which is in the auto-generated area.
>
> http://cr.openjdk.java.net/~bulasevich/8248870/webrev.02

OK, thanks.

--
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From aph at redhat.com  Mon Jul 27 15:45:24 2020
From: aph at redhat.com (Andrew Haley)
Date: Mon, 27 Jul 2020 16:45:24 +0100
Subject: [aarch64-port-dev ] RFR 8248870: AARCH64: I2L conversions can be
 skipped for small positive masked values
In-Reply-To: <AM6PR08MB44227FC8590B04B4513C440BC5720@AM6PR08MB4422.eurprd08.prod.outlook.com>
References: <d83ab998-ad1e-4889-b637-dc7f0369c50e@bell-sw.com>
 <9ccf64f1-7a88-0f67-8b50-4dea09af9c8b@redhat.com>
 <e54d3d17-c865-a422-e9eb-05b3936e81da@bell-sw.com>
 <05369383-c6d8-5e61-50ce-51fec955e2d4@bell-sw.com>
 <e3e7ad61-7747-a16a-6335-f3eed6c507ca@redhat.com>
 <AM6PR08MB44227FC8590B04B4513C440BC5720@AM6PR08MB4422.eurprd08.prod.outlook.com>
Message-ID: <7c7cf3e3-1d7a-8d83-25c3-47a605055deb@redhat.com>


> In my view, eliding the redundant 'sxt' above is more of a machine
> code problem than an IR problem.  I think a peephole optimization is
> the best fit here. However, C2's peephole is very complicated and
> I'm not sure whether it has been enabled on AArch64.

> So I was wondering whether it would be better to remove these kinds
> of instructions in the macro-assembler, even though this sounds
> somewhat beyond the assembler's responsibility. By handling those
> pairs in the macro-assembler, we can focus on the instruction
> sequence alone rather than the type and shape of the IR nodes.

Is that even possible? Sure, it works if an instruction's output is the
same as its sole register input, but that's all.

If these things can be canonicalized earlier in compilation they
should be. This one:

     sxth    w11, w1
     sxtw    x0, w11

Corresponds with

     (((long)n) << 48) >> 48

and

     and     w11, w1, #0xffff
     sxtw    x0, w11

with

     (((long)n) << 48) >>> 48

and that's how they could be canonicalized early in compilation.
Whether that's a good idea depends on other processors, though. It'll
probably not hurt them but it'll help us.
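
The two correspondences above can be checked at the Java level. The
sketch below uses hypothetical method names and makes no claim about
what C2 emits; it only shows that the shift pairs compute the same
values as the (short) and (char) casts in the earlier test cases.

```java
// Sketch only: hypothetical names, plain Java semantics.
final class ExtendViaShifts {
    // Sign-extend the low 16 bits: left shift, then arithmetic right shift.
    static long signExtend16(long n) {
        return (n << 48) >> 48;
    }

    // Zero-extend the low 16 bits: left shift, then logical right shift.
    static long zeroExtend16(long n) {
        return (n << 48) >>> 48;
    }

    // The cast forms from the test cases in this thread.
    static long viaShortCast(long x) {
        return (short) x;
    }

    static long viaCharCast(long x) {
        return (char) x;
    }

    public static void main(String[] args) {
        long[] samples = {0L, 1L, -1L, 0x7FFFL, 0x8000L, 0x18000L, Long.MIN_VALUE, Long.MAX_VALUE};
        for (long x : samples) {
            if (signExtend16(x) != viaShortCast(x) || zeroExtend16(x) != viaCharCast(x)) {
                throw new AssertionError("mismatch for x = " + x);
            }
        }
        System.out.println("shift pairs match the casts on all samples");
    }
}
```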

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From volker.simonis at gmail.com  Mon Jul 27 15:51:55 2020
From: volker.simonis at gmail.com (Volker Simonis)
Date: Mon, 27 Jul 2020 17:51:55 +0200
Subject: RFR[XS] 8249809 avoid calling DirectiveSet::clone(this) in
 compilecommand_compatibility_init
In-Reply-To: <1595807197546.52082@amazon.com>
References: <1595807197546.52082@amazon.com>
Message-ID: <CA+3eh10LEgJdcSAFrtsE=Ycy2oHJR3Hk2ddrTAFYLiC1E-sJ2w@mail.gmail.com>

Hi Xin,

I'm not sure if saving the allocation of a DirectiveSet has any
visible effect compared to the much larger allocations required for
the method compilation itself.

Apart from that, I must confess that I don't fully understand the
original logic. From what I see, it sets "changed" to true in the case
where it changes the cloned DirectiveSet. But it doesn't do that in
the cases where it only changes the clone's control word:

 341         set->_intrinsic_control_words.fill_in(TriBool());
...
 348           set->_intrinsic_control_words[id] = iter.is_enabled();
...
 361         set->_intrinsic_control_words.fill_in(TriBool());
...
 368           set->_intrinsic_control_words[id] = false;

Why don't these mutations count as "changing"  the cloned DirectiveSet?

With your patch, the above lines will always create a clone, which
seems different from the initial behaviour.

Which of the two behaviours is correct here: the original one or the
new one after your change? Or doesn't it matter, for reasons I don't
understand?


I also wonder why you need to overload both operators "operator*()"
and "operator->()"? It seems a little bit arbitrary (and hard to
understand for people reading the code) that "operator*()" clones the
underlying directiveSet while "operator->()" uses the original one.
Why not just define two versions of "operator->()" and let the
compiler choose the right one like so:

DirectiveSet const* operator->() const {
  return !_clone ? _origin : _clone;
}

DirectiveSet* operator->() {
  if (!_clone) {
    _clone = DirectiveSet::clone(_origin);
  }
  return _clone;
}
...
if (!_modified[LogIndex]) {
  bool log = CompilerOracle::should_log(method);
  if (log != const_cast<const DirectiveSetPtr&>(set)->LogOption) {
    set->LogOption = log;
  }
}
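
The lazy-clone idea itself is language-independent. Below is a minimal
sketch in Java, which has no const overloads, so explicit read() and
write() accessors stand in for the two operator->() versions.
CopyOnWriteRef and its members are hypothetical names, not part of the
patch.

```java
import java.util.function.UnaryOperator;

// Sketch only: hypothetical names, not the DirectiveSetPtr from the webrev.
final class CopyOnWriteRef<T> {
    private final T origin;               // shared, never modified through this handle
    private final UnaryOperator<T> cloner;
    private T clone;                      // stays null until the first write

    CopyOnWriteRef(T origin, UnaryOperator<T> cloner) {
        this.origin = origin;
        this.cloner = cloner;
    }

    // Read access: use the clone if one exists, otherwise the shared origin.
    T read() {
        return clone != null ? clone : origin;
    }

    // Write access: clone lazily on first use, then always return the clone.
    T write() {
        if (clone == null) {
            clone = cloner.apply(origin);
        }
        return clone;
    }

    boolean isCloned() {
        return clone != null;
    }
}
```

With this shape, a caller that only compares values goes through read()
and never allocates; the clone is created by the first write() and the
shared origin stays untouched.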

Thank you and best regards,
Volker

On Mon, Jul 27, 2020 at 1:47 AM Liu, Xin <xxinliu at amazon.com> wrote:
>
> hi, Reviewers,
>
> Could you review this simple patch?
> bug: https://bugs.openjdk.java.net/browse/JDK-8249809
> webrev: https://cr.openjdk.java.net/~xliu/8249809/00/webrev/
>
> When users specify a method-level compiler directive, the DirectiveSet is cloned for every single compiling method. It's expensive but rarely hit. Actually, only user-specified methods must clone the DirectiveSet. I introduce a smart pointer, DirectiveSetPtr. operator->() returns a pointer to a constant DirectiveSet, which is read-only. It doesn't clone the _origin until C2 needs to update its members. transfer() yields ownership of the pointer.
>
> Test:
> manually tested with different CompileCommand options.
> hotspot:tier1 and gtest:all.
>
> thanks,
> --lx
>

From patric.hedlin at oracle.com  Mon Jul 27 16:11:59 2020
From: patric.hedlin at oracle.com (Patric Hedlin)
Date: Mon, 27 Jul 2020 18:11:59 +0200
Subject: [aarch64-port-dev ] RFR(S/M): 8247766: [aarch64] guarantee(val <
 (1U << nbits)) failed: Field too big for insn
In-Reply-To: <6663f2a2-ccd8-5692-d90b-6ea664294ea8@redhat.com>
References: <0cdbdf26-ad4d-056b-a801-cc31b2cc4ab3@oracle.com>
 <2809ab8c-4a2e-c0c3-9b93-a0f5df41b992@redhat.com>
 <4d5b4219-3f9a-f606-64cc-4bc40fe2c7bd@oracle.com>
 <6663f2a2-ccd8-5692-d90b-6ea664294ea8@redhat.com>
Message-ID: <19f664ec-429e-0927-84e2-90749bffca5a@oracle.com>

Thanks for reviewing Andrew.

/Patric

On 2020-07-27 15:24, Andrew Haley wrote:
> Hi,
>
> On 27/07/2020 11:02, Patric Hedlin wrote:
>> I've refreshed the webrev (as discussed off-line), moving
>> legitimize_address() into the stack_slot_address() with additional
>> conditions related to the (well-aligned) frame slot address produced.
>> is_simm9() and is_uimm12() now use the implementation in Assembler.
> That's much nicer. Some minor nits...
>
> Please pass the scratch register to be used as an argument to
> stack_slot_address:
>
> +// Ensure a valid Address (base + offset) to a stack-slot. If stack access is
> +// not encodable as a base + (immediate) offset, generate an explicit address
> +// calculation to hold the address in a temporary register (rscratch1).
> +Address LIR_Assembler::stack_slot_address(int index, uint size, int adjust) {
>
> These consts are too obscure. Please be explicit: either use one of
> the predefined constants that mean the same thing (such as
> BytesPerInt) or if you really want low-level types, sizeof(int32_t):
>
> +  uint const c_sz32 = 4;
> +  uint const c_sz64 = 8;
>
> Otherwise OK. It doesn't need another review with these changes.
>
> Thanks.
>


From lutz.schmidt at sap.com  Mon Jul 27 16:45:33 2020
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Mon, 27 Jul 2020 16:45:33 +0000
Subject: RFR(XS): 8250233: -XX:+CITime triggers guarantee(events != NULL)
 in jvmci.cpp
In-Reply-To: <E9E87374-CAA7-4BB8-8860-D5F03080098E@sap.com>
References: <E9E87374-CAA7-4BB8-8860-D5F03080098E@sap.com>
Message-ID: <B416A958-6690-4506-AD74-5AE086335762@sap.com>

Hi Community,

any volunteers for a review? 

Sorry for being impatient. I'll be on vacation starting Wednesday, EOB, and would like to get this thing out of the way before. 

Thanks,
Lutz

On 24.07.20, 14:53, "Schmidt, Lutz" <lutz.schmidt at sap.com> wrote:

    Resending after updating subject line with bug id. 
    Sorry for the spam.
    Lutz

    On 24.07.20, 14:51, "Schmidt, Lutz" <lutz.schmidt at sap.com> wrote:

        Dear all, 

        may I please request reviews for this small fix? I would even say it is a trivial fix. It inverts an if condition such that JVMCI specific code is called only when JVMCI compilation is enabled via UseJVMCICompiler.

        Bug:    https://bugs.openjdk.java.net/browse/JDK-8250233
        Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8250233.00/ 

        Local testing looks good. jdk/submit tests pending.

        Thank you!
        Lutz


From vladimir.kozlov at oracle.com  Mon Jul 27 19:15:03 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 27 Jul 2020 12:15:03 -0700
Subject: RFR(S): 8250609: C2 crash in IfNode::fold_compares
In-Reply-To: <DA41BE1DDCA941489001C7FBD7A8820EE7E8D547@dggeml507-mbx.china.huawei.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7E8D547@dggeml507-mbx.china.huawei.com>
Message-ID: <23f0ab18-8bfb-3874-3000-ee2b37caca7c@oracle.com>

It happens because 'lo' is a new node, created just now, that has no uses yet.
For such new nodes we usually add a dummy use to avoid removal from the graph:

http://hg.openjdk.java.net/jdk/jdk/file/c379dc750a02/src/hotspot/share/opto/convertnode.cpp#l403

Thanks,
Vladimir K

On 7/27/20 5:27 AM, Yangfei (Felix) wrote:
> Hi,
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8250609
> Webrev: http://cr.openjdk.java.net/~fyang/8250609/webrev.00/
> 
> In IfNode::fold_compares_helper, C2 tries to fold 2 CmpI into a single CmpU.
> At the crash site in IfNode::fold_compares_helper:
>   995   if (lo && hi) {
>   996     // Merge the two compares into a single unsigned compare by building (CmpU (n - lo) (hi - lo))
>   997     Node* adjusted_val = igvn->transform(new SubINode(n,  lo));
>   998     if (adjusted_lim == NULL) {
>   999       adjusted_lim = igvn->transform(new SubINode(hi, lo));
> 1000     }
> 
> At line 997, we have:
> (gdb) p lo->dump()
>   641    AddI    === _  513  92  [[]]
> $1 = void
> 
> After the transformation at line 997, we have
> (gdb) p lo->dump()
>   641    AddI    === _ _ _  [[]]   [34200641]
> $3 = void
> 
> Then node 641 was used at line 999, which triggers the crash.
> The patch fixes the issue by temporarily delaying the transformation in IfNode::fold_compares.
> Tier1-3 tested on aarch64-linux-gnu & x86_64-linux-gnu.
> The newly added test fails without the patch and passes with it.
> Suggestions?
> 
> Thanks,
> Felix
> 

From vladimir.kozlov at oracle.com  Mon Jul 27 19:35:48 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 27 Jul 2020 12:35:48 -0700
Subject: RFR(XS): 8250233: -XX:+CITime triggers guarantee(events != NULL)
 in jvmci.cpp
In-Reply-To: <E9E87374-CAA7-4BB8-8860-D5F03080098E@sap.com>
References: <E9E87374-CAA7-4BB8-8860-D5F03080098E@sap.com>
Message-ID: <7f5c8191-748f-d26d-9a4b-4efcea72ab3c@oracle.com>

Nope, the check is correct. In hosted mode, Graal is used as a Java application and not as the JIT compiler, so the
UseJVMCICompiler flag is false.

The problem is really caused by the recent 8248321 changes - it is a regression. Doug, please advise how to fix it.
I think the new JVMCI events code should be adjusted for hosted mode.

Thanks,
Vladimir

On 7/24/20 5:53 AM, Schmidt, Lutz wrote:
> Resending after updating subject line with bug id.
> Sorry for the spam.
> Lutz
> 
> On 24.07.20, 14:51, "Schmidt, Lutz" <lutz.schmidt at sap.com> wrote:
> 
>      Dear all,
> 
>      may I please request reviews for this small fix? I would even say it is a trivial fix. It inverts an if condition such that JVMCI specific code is called only when JVMCI compilation is enabled via UseJVMCICompiler.
> 
>      Bug:    https://bugs.openjdk.java.net/browse/JDK-8250233
>      Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8250233.00/
> 
>      Local testing looks good. jdk/submit tests pending.
> 
>      Thank you!
>      Lutz
> 
> 
> 
> 

From evgeny.nikitin at oracle.com  Mon Jul 27 19:38:14 2020
From: evgeny.nikitin at oracle.com (Evgeny Nikitin)
Date: Mon, 27 Jul 2020 21:38:14 +0200
Subject: RFR(M): 8067651: Fix Trivial code path for LevelTransitionTest.java
Message-ID: <58fd3cd5-cdce-8e15-3237-d22a3566b0da@oracle.com>

Hi,

Bug: https://bugs.openjdk.java.net/browse/JDK-8067651
Webrev: https://cr.openjdk.java.net/~enikitin/8067651/webrev.00/

This adjusts the test to the current state of the VM:

     - The definition of 'trivial code' does not depend on whether the
       method has been profiled or not;
     - Trivial code only goes from level 0 to level 1;
     - Some refactoring.

The change has been checked in mach5 for the 5 platforms (passed).

Please review,
/Evgeny Nikitin.

From tom.rodriguez at oracle.com  Mon Jul 27 19:51:27 2020
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Mon, 27 Jul 2020 12:51:27 -0700
Subject: RFR(XS): 8250233: -XX:+CITime triggers guarantee(events != NULL)
 in jvmci.cpp
In-Reply-To: <7f5c8191-748f-d26d-9a4b-4efcea72ab3c@oracle.com>
References: <E9E87374-CAA7-4BB8-8860-D5F03080098E@sap.com>
 <7f5c8191-748f-d26d-9a4b-4efcea72ab3c@oracle.com>
Message-ID: <8ff1c8ad-4111-03da-90cc-9c22b9e2b078@oracle.com>

Doug is away, but it seems like we could just remove the JVMCI_event_1 
call in print_compilation_timers.  It's not a particularly worthwhile 
notification and all the other events should be able to safely assume 
that JVMCI has actually been initialized which I think was probably the 
point of the guarantee.  If that's not sufficient then we need to 
convert that guarantee into a check for NULL and return.

tom

Vladimir Kozlov wrote on 7/27/20 12:35 PM:
> Nope, the check is correct. In hosted mode, Graal is used as a Java
> application and not as the JIT compiler, so the UseJVMCICompiler flag is false.
>
> The problem is really caused by the recent 8248321 changes - it is a
> regression. Doug, please advise how to fix it.
> I think the new JVMCI events code should be adjusted for hosted mode.
> 
> Thanks,
> Vladimir
> 
> On 7/24/20 5:53 AM, Schmidt, Lutz wrote:
>> Resending after updating subject line with bug id.
>> Sorry for the spam.
>> Lutz
>>
>> On 24.07.20, 14:51, "Schmidt, Lutz" <lutz.schmidt at sap.com> wrote:
>>
>>      Dear all,
>>
>>      may I please request reviews for this small fix? I would even say
>> it is a trivial fix. It inverts an if condition such that JVMCI
>> specific code is called only when JVMCI compilation is enabled via
>> UseJVMCICompiler.
>>
>>      Bug:    https://bugs.openjdk.java.net/browse/JDK-8250233
>>      Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8250233.00/
>>
>>      Local testing looks good. jdk/submit tests pending.
>>
>>      Thank you!
>>      Lutz
>>
>>
>>
>>

From vladimir.kozlov at oracle.com  Mon Jul 27 20:47:24 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 27 Jul 2020 13:47:24 -0700
Subject: RFR(XS): 8250233: -XX:+CITime triggers guarantee(events != NULL)
 in jvmci.cpp
In-Reply-To: <8ff1c8ad-4111-03da-90cc-9c22b9e2b078@oracle.com>
References: <E9E87374-CAA7-4BB8-8860-D5F03080098E@sap.com>
 <7f5c8191-748f-d26d-9a4b-4efcea72ab3c@oracle.com>
 <8ff1c8ad-4111-03da-90cc-9c22b9e2b078@oracle.com>
Message-ID: <994bb7a0-20dd-ea53-021c-9f8d49b49917@oracle.com>

We are simply missing an EnableJVMCI flag check! That is why JVMCI is not initialized. EnableJVMCI should be true in hosted mode.

I looked and I see a few problematic places in the compileBroker.cpp statistics code that are guarded by #if INCLUDE_JVMCI but
do not check EnableJVMCI or comp->is_jvmci(). I think they should be fixed.

Also, I think JVMCI_event_1 is useless now in print_compilation_timers() because its output does not go to tty.

Thanks,
Vladimir

On 7/27/20 12:51 PM, Tom Rodriguez wrote:
> Doug is away, but it seems like we could just remove the JVMCI_event_1 call in print_compilation_timers.  It's not a 
> particularly worthwhile notification and all the other events should be able to safely assume that JVMCI has actually 
> been initialized which I think was probably the point of the guarantee.  If that's not sufficient then we need to 
> convert that guarantee into a check for NULL and return.
> 
> tom
> 
> Vladimir Kozlov wrote on 7/27/20 12:35 PM:
>> Nope, the check is correct. In hosted mode Graal is used as Java application and not as JIT compiler so that 
>> UseJVMCICompiler flag is false.
>>
>> The problem is really caused by recent 8248321 changes - it is a regression. Doug, please advise how to fix it.
>> I think the new JVMCI events code should be adjusted for hosted mode.
>>
>> Thanks,
>> Vladimir
>>
>> On 7/24/20 5:53 AM, Schmidt, Lutz wrote:
>>> Resending after updating subject line with bug id.
>>> Sorry for the spam.
>>> Lutz
>>>
>>> On 24.07.20, 14:51, "Schmidt, Lutz" <lutz.schmidt at sap.com> wrote:
>>>
>>>      Dear all,
>>>
>>>      may I please request reviews for this small fix? I would even say it is a trivial fix. It inverts an if 
>>> condition such that JVMCI specific code is called only when JVMCI compilation is enabled via UseJVMCICompiler.
>>>
>>>      Bug:    https://bugs.openjdk.java.net/browse/JDK-8250233
>>>      Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8250233.00/
>>>
>>>      Local testing looks good. jdk/submit tests pending.
>>>
>>>      Thank you!
>>>      Lutz
>>>
>>>
>>>
>>>

From tom.rodriguez at oracle.com  Mon Jul 27 20:50:20 2020
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Mon, 27 Jul 2020 13:50:20 -0700
Subject: RFR(XS): 8250233: -XX:+CITime triggers guarantee(events != NULL)
 in jvmci.cpp
In-Reply-To: <994bb7a0-20dd-ea53-021c-9f8d49b49917@oracle.com>
References: <E9E87374-CAA7-4BB8-8860-D5F03080098E@sap.com>
 <7f5c8191-748f-d26d-9a4b-4efcea72ab3c@oracle.com>
 <8ff1c8ad-4111-03da-90cc-9c22b9e2b078@oracle.com>
 <994bb7a0-20dd-ea53-021c-9f8d49b49917@oracle.com>
Message-ID: <fb109e31-b966-ca09-03a8-40ba06ef14fa@oracle.com>


Vladimir Kozlov wrote on 7/27/20 1:47 PM:
> We are simply missing an EnableJVMCI flag check! That is why JVMCI is not 
> initialized. EnableJVMCI should be true in hosted mode.

Yes that makes sense.

tom

> 
> I looked and I see a few problematic places in the compileBroker.cpp 
> statistics code that are guarded by #if INCLUDE_JVMCI but do not check 
> EnableJVMCI or comp->is_jvmci(). I think that should be fixed.
> 
> Also, I think JVMCI_event_1 is now useless in print_compilation_timers() 
> because its output does not go to tty.
> 
> Thanks,
> Vladimir
> 
> On 7/27/20 12:51 PM, Tom Rodriguez wrote:
>> Doug is away, but it seems like we could just remove the JVMCI_event_1 
>> call in print_compilation_timers.  It's not a particularly worthwhile 
>> notification and all the other events should be able to safely assume 
>> that JVMCI has actually been initialized which I think was probably 
>> the point of the guarantee.  If that's not sufficient then we need to 
>> convert that guarantee into a check for NULL and return.
>>
>> tom
>>
>> Vladimir Kozlov wrote on 7/27/20 12:35 PM:
>>> Nope, the check is correct. In hosted mode Graal is used as Java 
>>> application and not as JIT compiler so that UseJVMCICompiler flag is 
>>> false.
>>>
>>> The problem is really caused by recent 8248321 changes - it is a 
>>> regression. Doug, please advise how to fix it.
>>> I think the new JVMCI events code should be adjusted for hosted mode.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 7/24/20 5:53 AM, Schmidt, Lutz wrote:
>>>> Resending after updating subject line with bug id.
>>>> Sorry for the spam.
>>>> Lutz
>>>>
>>>> On 24.07.20, 14:51, "Schmidt, Lutz" <lutz.schmidt at sap.com> wrote:
>>>>
>>>>      Dear all,
>>>>
>>>>      may I please request reviews for this small fix? I would even 
>>>> say it is a trivial fix. It inverts an if condition such that JVMCI 
>>>> specific code is called only when JVMCI compilation is enabled via 
>>>> UseJVMCICompiler.
>>>>
>>>>      Bug:    https://bugs.openjdk.java.net/browse/JDK-8250233
>>>>      Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8250233.00/
>>>>
>>>>      Local testing looks good. jdk/submit tests pending.
>>>>
>>>>      Thank you!
>>>>      Lutz
>>>>
>>>>
>>>>
>>>>

From cjashfor at linux.ibm.com  Tue Jul 28 01:49:06 2020
From: cjashfor at linux.ibm.com (Corey Ashford)
Date: Mon, 27 Jul 2020 18:49:06 -0700
Subject: RFR(S): 8248188: [PATCH] Add HotSpotIntrinsicCandidate and API
 for Base64 decoding
In-Reply-To: <OFF371E11D.F5B64585-ON00258591.0006A5F8-49258591.0007AA3E@notes.na.collabserv.com>
References: <11ca749f-3015-c004-aa6b-3194e1dfe4eb@linux.ibm.com>
 <OFF371E11D.F5B64585-ON00258591.0006A5F8-49258591.0007AA3E@notes.na.collabserv.com>
Message-ID: <ef75251b-c5ef-10b5-d0bd-c4ba15d2c934@linux.ibm.com>

Michihiro Horie uploaded a new revision of the Base64 decodeBlock 
intrinsic API for me:

http://cr.openjdk.java.net/~mhorie/8248188/webrev.01/

It has the following changes with respect to the original one posted:

  * In the event of encountering a non-base64 character, instead of 
having a separate error code of -1, the intrinsic can now just return 
either 0, or the number of data bytes produced up to the point where the 
illegal base64 character was encountered.  This reduces the number of 
special cases, and also provides a way to speed up the process of 
finding the bad character by the slower, pure-Java algorithm.

  * The isMIME boolean is removed from the API for two reasons:
    - The current API is not sufficient to handle the isMIME case, 
because there isn't a strict relationship between the number of input 
bytes and the number of output bytes, because there can be an arbitrary 
number of non-base64 characters in the source.
    - If an intrinsic only implements the (isMIME == false) case as ours 
does, it will always return 0 bytes processed, which will slightly slow 
down the normal path of processing an (isMIME == true) instantiation.
    - We considered adding a separate hotspot candidate for the (isMIME 
== true) case, but since we don't have an intrinsic implementation to 
test that, we decided to leave it as a future optimization.
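
To make the revised contract concrete, here is a minimal Java sketch of how a caller might consume the return value. It is not the code in the webrev: decodeBlockStandIn is a hypothetical plain-Java stand-in for the intrinsic, the class and method names are assumptions made for illustration, and the stock java.util.Base64 decoder plays the role of the slower path that handles padding and reports the bad character.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical sketch of the revised decodeBlock contract; not the code in the webrev.
final class DecodeBlockSketch {
    private static final int[] FROM_BASE64 = new int[256];
    static {
        Arrays.fill(FROM_BASE64, -1);
        String alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
        for (int i = 0; i < alphabet.length(); i++) {
            FROM_BASE64[alphabet.charAt(i)] = i;
        }
    }

    // Plain-Java stand-in for the intrinsic. Decodes whole 4-character groups and
    // returns the number of bytes written; it stops early (possibly returning 0)
    // at the first group containing a non-base64 character, including '='.
    static int decodeBlockStandIn(byte[] src, int sp, int sl, byte[] dst, int dp) {
        int start = dp;
        while (sl - sp >= 4) {
            int b0 = FROM_BASE64[src[sp] & 0xff];
            int b1 = FROM_BASE64[src[sp + 1] & 0xff];
            int b2 = FROM_BASE64[src[sp + 2] & 0xff];
            int b3 = FROM_BASE64[src[sp + 3] & 0xff];
            if ((b0 | b1 | b2 | b3) < 0) {
                break;  // leave this group for the slow path
            }
            int bits = b0 << 18 | b1 << 12 | b2 << 6 | b3;
            dst[dp++] = (byte) (bits >> 16);
            dst[dp++] = (byte) (bits >> 8);
            dst[dp++] = (byte) bits;
            sp += 4;
        }
        return dp - start;
    }

    // Caller side: skip what the fast path consumed, let the slow path do the rest.
    static byte[] decode(byte[] src) {
        byte[] fast = new byte[(src.length / 4) * 3];
        int produced = decodeBlockStandIn(src, 0, src.length, fast, 0);
        int consumed = (produced / 3) * 4;  // every 3 output bytes came from 4 input chars
        // The library decoder handles padding and throws IllegalArgumentException
        // on an illegal character.
        byte[] tail = Base64.getDecoder().decode(Arrays.copyOfRange(src, consumed, src.length));
        byte[] out = Arrays.copyOf(fast, produced + tail.length);
        System.arraycopy(tail, 0, out, produced, tail.length);
        return out;
    }
}
```

Because the fast path only ever reports a byte count, the caller needs no special case for errors: a short count simply means the slow path starts closer to the offending character.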

Comments and suggestions are welcome.  Thanks for your consideration.

- Corey

On 6/23/20 6:23 PM, Michihiro Horie wrote:
> Hi Corey,
> 
> Following is the issue I created.
> https://bugs.openjdk.java.net/browse/JDK-8248188
> 
> I will upload a webrev when you're ready as we talked in private.
> 
> Best regards,
> Michihiro
> 
>
> From: "Corey Ashford" <cjashfor at linux.ibm.com>
> To: "hotspot-compiler-dev at openjdk.java.net" 
> <hotspot-compiler-dev at openjdk.java.net>, 
> "ppc-aix-port-dev at openjdk.java.net" <ppc-aix-port-dev at openjdk.java.net>
> Cc: Michihiro Horie/Japan/IBM at IBMJP, Kazunori Ogata/Japan/IBM at IBMJP, 
> joserz at br.ibm.com
> Date: 2020/06/24 09:40
> Subject: RFR(S): [PATCH] Add HotSpotIntrinsicCandidate and API for 
> Base64 decoding
> 
> ------------------------------------------------------------------------
> 
> 
> 
> Currently in java.util.Base64, there is a HotSpotIntrinsicCandidate and
> API for encodeBlock, but none for decoding.  This means that only
> encoding gets acceleration from the underlying CPU's vector hardware.
> 
> I'd like to propose adding a new intrinsic for decodeBlock.  The
> considerations I have for this new intrinsic's API:
> 
>   * Don't make any assumptions about the underlying capability of the
> hardware.  For example, do not impose any specific block size granularity.
> 
>   * Don't assume the underlying intrinsic can handle isMIME or isURL
> modes, but also let them decide if they will process the data regardless
> of the settings of the two booleans.
> 
>   * Any remaining data that is not processed by the intrinsic will be
> processed by the pure Java implementation.  This allows the intrinsic to
> process whatever block sizes it's good at without the complexity of
> handling the end fragments.
> 
>   * If any illegal character is discovered in the decoding process, the
> intrinsic will simply return -1, instead of requiring it to throw a
> proper exception from the context of the intrinsic.  In the event of
> getting a -1 returned from the intrinsic, the Java Base64 library code
> simply calls the pure Java implementation to have it find the error and
> properly throw an exception.  This is a performance trade-off in the
> case of an error (which I expect to be very rare).
> 
>   * One thought I have for a further optimization (not implemented in
> the current patch), is that when the intrinsic decides not to process a
> block because of some combination of isURL and isMIME settings it
> doesn't handle, it could return extra bits in the return code, encoded
> as a negative number.  For example:
> 
> Illegal_Base64_char   = 0b001;
> isMIME_unsupported    = 0b010;
> isURL_unsupported     = 0b100;
> 
> These can be OR'd together as needed and then negated (flip the sign).
> The Base64 library code could then cache these flags, so it will know
> not to call the intrinsic again when another decodeBlock is requested
> but with an unsupported mode.  This will save the performance hit of
> calling the intrinsic when it is guaranteed to fail.
> 
> I've tested the attached patch with an actual intrinsic coded up for
> Power9/Power10, but those runtime intrinsics and arch-specific patches
> aren't attached today.  I want to get some consensus on the
> library-level intrinsic API first.
> 
> Also attached is a simple test case to test that the new intrinsic API
> doesn't break anything.
> 
> I'm open to any comments about this.
> 
> Thanks for your consideration,
> 
> - Corey
> 
> 
> Corey Ashford
> IBM Systems, Linux Technology Center, OpenJDK team
> cjashfor at us dot ibm dot com
> [attachment "decodeBlock_api-20200623.patch" deleted by Michihiro 
> Horie/Japan/IBM] [attachment "TestBase64.java" deleted by Michihiro 
> Horie/Japan/IBM]
> 
> 


From xxinliu at amazon.com  Tue Jul 28 03:39:08 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Tue, 28 Jul 2020 03:39:08 +0000
Subject: RFR[XS] 8249809 avoid calling DirectiveSet::clone(this) in
 compilecommand_compatibility_init
In-Reply-To: <CA+3eh10LEgJdcSAFrtsE=Ycy2oHJR3Hk2ddrTAFYLiC1E-sJ2w@mail.gmail.com>
References: <1595807197546.52082@amazon.com>,
 <CA+3eh10LEgJdcSAFrtsE=Ycy2oHJR3Hk2ddrTAFYLiC1E-sJ2w@mail.gmail.com>
Message-ID: <1595907547514.55531@amazon.com>

hi, Volker, 

Thank you for reviewing my patch. 

1. Yes. I guess nodes account for most of the memory consumption, but compiler directives are used by both c1 and c2, 
so it can still reduce the memory footprint a little. 

2. The previous code sets the flag here. Both ControlIntrinsic and DisableIntrinsic belong to compilerdirectives_common_flags.

#define init_default_cc(name, type, dvalue, cc_flag) { type v; if (!_modified[name##Index] && CompilerOracle::has_option_value(method, #cc_flag, v) && v != this->name##Option) { set->name##Option = v; changed = true;} }
    compilerdirectives_common_flags(init_default_cc)

When method-level directives override the global directives, this code snippet sets 'changed'. 
Even though I removed the flag 'changed', I have the same logic in the smart pointer. 

3. The smart pointer DirectiveSetPtr needs to provide two kinds of access to the underlying pointer: 
one read-only and the other mutable. 

Ideally, it should have the following two operator->() overloads:
DirectiveSet* operator->(); 
DirectiveSet const* operator->(); 

AFAIK, C++ doesn't support overloading on the return type alone, so we need a workaround. 
The reason I provide an overloaded operator*() is that it returns a reference to the object. Users who want to modify the pointee have to explicitly dereference the smart pointer: 
(*set).member = newvalue; 
set->member = newvalue; // compiler error. 

Your approach also works, but you need to invoke const_cast<> in all places where you want to read. 
I think my approach results in shorter code. 


I just came up with a new idea. How about I provide a method cloned(), which returns the unqualified pointer?

+  DirectiveSet* cloned() {
+    if (!_clone) {
+      _clone = DirectiveSet::clone(_origin);
+    }
+    return _clone;
+  }
+
   DirectiveSet* transfer() {
     assert(_origin != NULL, "_origin is NULL! transfer() can only be invoked once.");
     if (_clone != NULL) {
@@ -340,7 +347,7 @@

     if (CompilerOracle::should_print(method)) {
       if (!_modified[PrintAssemblyIndex]) {
-        (*set).PrintAssemblyOption = true;
+        set.cloned()->PrintAssemblyOption = true;
       }
     }
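
The copy-on-first-write idea behind cloned() is not specific to C++. As a rough illustration only, here is the same pattern sketched in Java: LazyClone, Options, read() and isCloned() are hypothetical names invented for this example and are not part of the patch or of HotSpot. Reads go through read() and never clone; any write has to go through cloned(), which copies the shared original on first use.

```java
import java.util.function.UnaryOperator;

// Hypothetical stand-in for DirectiveSet; only here to have something to clone.
final class Options {
    boolean printAssembly;

    Options copy() {
        Options o = new Options();
        o.printAssembly = printAssembly;
        return o;
    }
}

// Hypothetical Java analogue of a lazily cloning pointer; names are illustrative.
final class LazyClone<T> {
    private final T origin;
    private final UnaryOperator<T> cloner;
    private T clone;  // stays null until the first write

    LazyClone(T origin, UnaryOperator<T> cloner) {
        this.origin = origin;
        this.cloner = cloner;
    }

    // Read-only access: returns the clone if one exists, else the shared original.
    T read() {
        return clone != null ? clone : origin;
    }

    // Mutable access: clones the original on first use, then reuses the clone.
    T cloned() {
        if (clone == null) {
            clone = cloner.apply(origin);
        }
        return clone;
    }

    boolean isCloned() {
        return clone != null;
    }
}
```

A caller would write `set.cloned().printAssembly = true;` at the few mutation sites and `set.read()` everywhere else, so the shared original is never modified and no copy is made unless something actually changes.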

thanks,
--lx

________________________________________
From: Volker Simonis <volker.simonis at gmail.com>
Sent: Monday, July 27, 2020 8:51 AM
To: Liu, Xin
Cc: hotspot-compiler-dev at openjdk.java.net
Subject: RE: [EXTERNAL] RFR[XS] 8249809 avoid calling DirectiveSet::clone(this) in compilecommand_compatibility_init


Hi Xin,

I'm not sure if saving the allocation of a DirectiveSet has any
visible effect compared to the much larger allocations required for
the method compilation itself.

Apart from that, I must confess that I'm not totally understanding the
original logic. From what I see, it sets "changed" to true in the case
where it changes the cloned DirectiveSet. But it doesn't do that in
the cases where it only changes the clone's control word:

 341         set->_intrinsic_control_words.fill_in(TriBool());
...
 348           set->_intrinsic_control_words[id] = iter.is_enabled();
...
 361         set->_intrinsic_control_words.fill_in(TriBool());
...
 368           set->_intrinsic_control_words[id] = false;

Why don't these mutations count as "changing"  the cloned DirectiveSet?

After your patch, you've changed the above lines such that they will
always create a clone which seems different from the initial
behaviour.

Which of the two behaviours is correct here, the original one, the new
one after your change or doesn't it matter for reasons I don't
understand?


I also wonder why you need to overload both operators "operator*()"
and "operator->()"? It seems a little bit arbitrary (and hard to
understand for people reading the code) that "operator*()" clones the
underlying directiveSet while "operator->()" uses the original one.
Why not just define two versions of "operator->()" and let the
compiler choose the right one like so:

DirectiveSet const* operator->() const {
  return !_clone ? _origin : _clone;
}

DirectiveSet* operator->() {
  if (!_clone) {
    _clone = DirectiveSet::clone(_origin);
  }
  return _clone;
}
...
if (!_modified[LogIndex]) {
  bool log = CompilerOracle::should_log(method);
  if (log != const_cast<const DirectiveSetPtr&>(set)->LogOption) {
    set->LogOption = log;
  }
}

Thank you and best regards,
Volker

On Mon, Jul 27, 2020 at 1:47 AM Liu, Xin <xxinliu at amazon.com> wrote:
>
> hi, Reviewers,
>
> Could you review this simple patch?
> bug: https://bugs.openjdk.java.net/browse/JDK-8249809
> webrev: https://cr.openjdk.java.net/~xliu/8249809/00/webrev/
>
> When users specify a method-level compiler directive, the DirectiveSet is cloned for every single method being compiled. It's expensive but rarely hit. Actually, only user-specified methods must clone the DirectiveSet. I introduce a smart pointer, DirectiveSetPtr. operator->() returns a pointer to a constant DirectiveSet, which is read-only. It doesn't clone the _origin until c2 needs to update its members. transfer() yields ownership of the pointer.
>
> Test:
> manual tests with different CompileCommand options.
> hotspot:tier1 and gtest:all.
>
> thanks,
> --lx
>

From nick.gasson at arm.com  Tue Jul 28 05:56:54 2020
From: nick.gasson at arm.com (Nick Gasson)
Date: Tue, 28 Jul 2020 13:56:54 +0800
Subject: [aarch64-port-dev ] RFR(S): 8237483: AArch64 C1 OopMap inserted
 twice fatal error
In-Reply-To: <04c4f9e0-e29a-3250-878c-2b29c11a45d8@redhat.com>
References: <85k0ypjq8f.fsf@nicgas01-pc.shanghai.arm.com>
 <04c4f9e0-e29a-3250-878c-2b29c11a45d8@redhat.com>
Message-ID: <85h7tsji6x.fsf@nicgas01-pc.shanghai.arm.com>

On 07/27/20 17:40 pm, Andrew Haley wrote:
>
> I would have thought it would make more sense, rather than asserting,
> simply to detect that we already have an oopmap so we don't need
> another one. Having said that, it's probably not worth worrying about
> so your fix is OK.
>
> It needs a better comment, though. The only way to find out why this
> code is here would be to trawl the email archives. Something like this
> would do:
>
> // In the method java.util.zip.Inflater::inflate C1 generates these two LIR
> // instructions:
>
> //  724 move [c_rarg3|I] [Base:[c_rarg1|L] Disp: 2147483647|I] [patch_normal] [bci:95]
> //  728 throw [c_rarg3|I] [c_rarg0|L]  [bci:100]
>
> // The move instruction at 724 generates a runtime call to deoptimise the
> // method since this patching is not implemented on AArch64. An oop map is
> // inserted for the return PC of the runtime call
> // (LIR_Assembler::deoptimize_trap()). The following throw LIR instruction
> // then inserts another oop map at the same PC, triggering an assertion
> // failure.

Seems a bit too verbose? How about this:

--- a/src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.cpp
@@ -2085,6 +2085,13 @@ void LIR_Assembler::throw_op(LIR_Opr exceptionPC, LIR_Opr exceptionOop, CodeEmit
 
   // get current pc information
   // pc is only needed if the method has an exception handler, the unwind code does not need it.
+  if (compilation()->debug_info_recorder()->last_pc_offset() == __ offset()) {
+    // As no instructions have been generated yet for this LIR node it's
+    // possible that an oop map already exists for the current offset.
+    // In that case insert a dummy NOP here to ensure all oop map PCs
+    // are unique. See JDK-8237483.
+    __ nop();
+  }
   int pc_for_athrow_offset = __ offset();
   InternalAddress pc_for_athrow(__ pc());
   __ adr(exceptionPC->as_register(), pc_for_athrow);


--
Thanks,
Nick

From shade at redhat.com  Tue Jul 28 07:09:33 2020
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 28 Jul 2020 09:09:33 +0200
Subject: RFR (XS) 8250612: jvmciCompilerToVM.cpp declares jio_printf with
 "void" return type, should be "int"
Message-ID: <90ebea60-d625-e67f-918b-1ba2a531316e@redhat.com>

Bug:
  https://bugs.openjdk.java.net/browse/JDK-8250612

Fix:
  https://cr.openjdk.java.net/~shade/8250612/webrev.01/

Testing: Linux x86_64 builds; jdk-submit

-- 
Thanks,
-Aleksey


From christian.hagedorn at oracle.com  Tue Jul 28 07:31:41 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Tue, 28 Jul 2020 09:31:41 +0200
Subject: RFR (T) 8250042: Clean up methodOop and method_oop names from the
 code
In-Reply-To: <c6ec9d68-e44f-7288-d084-e64d62a79490@oracle.com>
References: <85efc3ab-abbf-c5f2-9b7b-47fa516d9a2d@oracle.com>
 <6f973a0a-cf55-e1ab-8de3-b57f68dbd2cf@oracle.com>
 <c5392d2c-7fe1-09e5-3366-4b9840fef087@oracle.com>
 <c6ec9d68-e44f-7288-d084-e64d62a79490@oracle.com>
Message-ID: <8c737ff0-7a62-d18d-78b2-b415802ebbdb@oracle.com>

Hi Coleen

On 24.07.20 15:10, coleen.phillimore at oracle.com wrote:
> incremental webrev at 
> http://cr.openjdk.java.net/~coleenp/2020/8250042.02.incr/webrev
> full webrev at http://cr.openjdk.java.net/~coleenp/2020/8250042.02/webrev

Thanks for cleaning this up! The compiler changes look good to me.

Just a minor comment (no new webrev required):
- arm.ad:8873 & x86_32.ad:13321: There is an extra whitespace before ")"

Best regards,
Christian

> Thanks,
> Coleen
> 
> 
> On 7/24/20 8:23 AM, coleen.phillimore at oracle.com wrote:
>>
>> Thanks for looking at this.
>>
>> On 7/24/20 1:01 AM, David Holmes wrote:
>>> Hi Coleen,
>>>
>>> On 24/07/2020 2:58 am, coleen.phillimore at oracle.com wrote:
>>>> See bug for more details.  I've been running into these names a lot 
>>>> lately.  Many of these names are in JVMTI.
>>>>
>>>> Tested with tier1 on all Oracle platforms and built on non-Oracle 
>>>> platforms.
>>>>
>>>> open webrev at 
>>>> http://cr.openjdk.java.net/~coleenp/2020/8250042.01/webrev
>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8250042
>>>
>>> src/hotspot/cpu/*/*.ad
>>>
>>> These still refer to "method oop" and method_oop in a number of places.
>>
>> Yes, I only replaced method_oop in the shared code and not in the AD 
>> code.  method_oop can be the name of a parameter and using "sed" to 
>> change it to "method" doesn't work.  Somebody who understands this 
>> code and looks at it will have to make the rest of the changes.
>>
>> What I did was replace "method oop" with "method" and "methodOop" with 
>> "method" in all the sources.  I replaced "method_oop" with "method" or 
>> "checked_method" in the shared sources.
>>
>>>
>>> src/hotspot/share/adlc/adlparse.cpp
>>>
>>> +  frame->_interpreter_method_oop_reg = parse_one_arg("method reg 
>>> entry");
>>>
>>> I guess I'm not understanding the scope of this renaming - why is 
>>> _interpreter_method_oop_reg not renamed as well? Should this (and 
>>> other uses) be parsed as method-(oop-reg) rather than (method-oop)-reg?
>>
>> I don't know this code, so I'd rather not change any more of it. The 
>> comment makes sense changed, even though the variable name still 
>> refers to method_oop.
>>
>> Thanks,
>> Coleen
>>>
>>> Otherwise all okay.
>>>
>>> Thanks,
>>> David
>>>
>>>> Thanks,
>>>> Coleen
>>
> 

From aph at redhat.com  Tue Jul 28 08:34:24 2020
From: aph at redhat.com (Andrew Haley)
Date: Tue, 28 Jul 2020 09:34:24 +0100
Subject: [aarch64-port-dev ] RFR(S): 8237483: AArch64 C1 OopMap inserted
 twice fatal error
In-Reply-To: <85h7tsji6x.fsf@nicgas01-pc.shanghai.arm.com>
References: <85k0ypjq8f.fsf@nicgas01-pc.shanghai.arm.com>
 <04c4f9e0-e29a-3250-878c-2b29c11a45d8@redhat.com>
 <85h7tsji6x.fsf@nicgas01-pc.shanghai.arm.com>
Message-ID: <569a6967-408c-1895-ce99-18c5147c5958@redhat.com>

On 7/28/20 6:56 AM, Nick Gasson wrote:
> Seems a bit too verbose? How about this:
> 
> --- a/src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.cpp
> +++ b/src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.cpp
> @@ -2085,6 +2085,13 @@ void LIR_Assembler::throw_op(LIR_Opr exceptionPC, LIR_Opr exceptionOop, CodeEmit
>  
>    // get current pc information
>    // pc is only needed if the method has an exception handler, the unwind code does not need it.
> +  if (compilation()->debug_info_recorder()->last_pc_offset() == __ offset()) {
> +    // As no instructions have been generated yet for this LIR node it's
> +    // possible that an oop map already exists for the current offset.
> +    // In that case insert a dummy NOP here to ensure all oop map PCs
> +    // are unique. See JDK-8237483.
> +    __ nop();
> +  }

OK. I wonder if this bug exists in other ports. They too deoptimize
C1 code, albeit more rarely.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From tobias.hartmann at oracle.com  Tue Jul 28 09:05:26 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 28 Jul 2020 11:05:26 +0200
Subject: RFR (XS) 8250612: jvmciCompilerToVM.cpp declares jio_printf with
 "void" return type, should be "int"
In-Reply-To: <90ebea60-d625-e67f-918b-1ba2a531316e@redhat.com>
References: <90ebea60-d625-e67f-918b-1ba2a531316e@redhat.com>
Message-ID: <5f0a2b21-c897-c474-b54d-03e587bdf046@oracle.com>

Hi Aleksey,

looks good and trivial to me.

Best regards,
Tobias

On 28.07.20 09:09, Aleksey Shipilev wrote:
> Bug:
>   https://bugs.openjdk.java.net/browse/JDK-8250612
> 
> Fix:
>   https://cr.openjdk.java.net/~shade/8250612/webrev.01/
> 
> Testing: Linux x86_64 builds; jdk-submit
> 

From lutz.schmidt at sap.com  Tue Jul 28 09:24:30 2020
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Tue, 28 Jul 2020 09:24:30 +0000
Subject: RFR(XS): 8250233: -XX:+CITime triggers guarantee(events != NULL)
 in jvmci.cpp
In-Reply-To: <fb109e31-b966-ca09-03a8-40ba06ef14fa@oracle.com>
References: <E9E87374-CAA7-4BB8-8860-D5F03080098E@sap.com>
 <7f5c8191-748f-d26d-9a4b-4efcea72ab3c@oracle.com>
 <8ff1c8ad-4111-03da-90cc-9c22b9e2b078@oracle.com>
 <994bb7a0-20dd-ea53-021c-9f8d49b49917@oracle.com>
 <fb109e31-b966-ca09-03a8-40ba06ef14fa@oracle.com>
Message-ID: <3D2D6A62-52A7-43B2-88EF-AF6589E5B21A@sap.com>

OK then,

I withdraw my proposed fix which is apparently incorrect. 

May I assume the issue is handled by knowledgeable people from now on? Tom, will you take over?

Thanks,
Lutz


On 27.07.20, 22:50, "hotspot-compiler-dev on behalf of Tom Rodriguez" <hotspot-compiler-dev-retn at openjdk.java.net on behalf of tom.rodriguez at oracle.com> wrote:


    Vladimir Kozlov wrote on 7/27/20 1:47 PM:
    > We are simply missing an EnableJVMCI flag check! That is why JVMCI is not 
    > initialized. EnableJVMCI should be true in hosted mode.

    Yes that makes sense.

    tom

    > 
    > I looked and I see a few problematic places in the compileBroker.cpp 
    > statistics code that are guarded by #if INCLUDE_JVMCI but do not check 
    > EnableJVMCI or comp->is_jvmci(). I think that should be fixed.
    > 
    > Also, I think JVMCI_event_1 is now useless in print_compilation_timers() 
    > because its output does not go to tty.
    > 
    > Thanks,
    > Vladimir
    > 
    > On 7/27/20 12:51 PM, Tom Rodriguez wrote:
    >> Doug is away, but it seems like we could just remove the JVMCI_event_1 
    >> call in print_compilation_timers.  It's not a particularly worthwhile 
    >> notification and all the other events should be able to safely assume 
    >> that JVMCI has actually been initialized which I think was probably 
    >> the point of the guarantee.  If that's not sufficient then we need to 
    >> convert that guarantee into a check for NULL and return.
    >>
    >> tom
    >>
    >> Vladimir Kozlov wrote on 7/27/20 12:35 PM:
    >>> Nope, the check is correct. In hosted mode Graal is used as Java 
    >>> application and not as JIT compiler so that UseJVMCICompiler flag is 
    >>> false.
    >>>
    >>> The problem is really caused by recent 8248321 changes - it is a 
    >>> regression. Doug, please advise how to fix it.
    >>> I think the new JVMCI events code should be adjusted for hosted mode.
    >>>
    >>> Thanks,
    >>> Vladimir
    >>>
    >>> On 7/24/20 5:53 AM, Schmidt, Lutz wrote:
    >>>> Resending after updating subject line with bug id.
    >>>> Sorry for the spam.
    >>>> Lutz
    >>>>
    >>>> On 24.07.20, 14:51, "Schmidt, Lutz" <lutz.schmidt at sap.com> wrote:
    >>>>
    >>>>      Dear all,
    >>>>
    >>>>      may I please request reviews for this small fix? I would even 
    >>>> say it is a trivial fix. It inverts an if condition such that JVMCI 
    >>>> specific code is called only when JVMCI compilation is enabled via 
    >>>> UseJVMCICompiler.
    >>>>
    >>>>      Bug:    https://bugs.openjdk.java.net/browse/JDK-8250233
    >>>>      Webrev: http://cr.openjdk.java.net/~lucy/webrevs/8250233.00/
    >>>>
    >>>>      Local testing looks good. jdk/submit tests pending.
    >>>>
    >>>>      Thank you!
    >>>>      Lutz
    >>>>
    >>>>
    >>>>
    >>>>


From tobias.hartmann at oracle.com  Tue Jul 28 10:00:41 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 28 Jul 2020 12:00:41 +0200
Subject: RFR(XS): 8250233: -XX:+CITime triggers guarantee(events != NULL)
 in jvmci.cpp
In-Reply-To: <3D2D6A62-52A7-43B2-88EF-AF6589E5B21A@sap.com>
References: <E9E87374-CAA7-4BB8-8860-D5F03080098E@sap.com>
 <7f5c8191-748f-d26d-9a4b-4efcea72ab3c@oracle.com>
 <8ff1c8ad-4111-03da-90cc-9c22b9e2b078@oracle.com>
 <994bb7a0-20dd-ea53-021c-9f8d49b49917@oracle.com>
 <fb109e31-b966-ca09-03a8-40ba06ef14fa@oracle.com>
 <3D2D6A62-52A7-43B2-88EF-AF6589E5B21A@sap.com>
Message-ID: <e4a13823-3626-3be8-8ef5-0744ee9bb9d2@oracle.com>

Hi Lutz,

On 28.07.20 11:24, Schmidt, Lutz wrote:
> May I assume the issue is handled by knowledgeable people from now on? Tom, will you take over?

You've probably missed that but Doug already replied in your original RFR:
https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-July/039171.html

Best regards,
Tobias

From lutz.schmidt at sap.com  Tue Jul 28 10:31:06 2020
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Tue, 28 Jul 2020 10:31:06 +0000
Subject: RFR(XS): 8250233: -XX:+CITime triggers guarantee(events != NULL)
 in jvmci.cpp
In-Reply-To: <e4a13823-3626-3be8-8ef5-0744ee9bb9d2@oracle.com>
References: <E9E87374-CAA7-4BB8-8860-D5F03080098E@sap.com>
 <7f5c8191-748f-d26d-9a4b-4efcea72ab3c@oracle.com>
 <8ff1c8ad-4111-03da-90cc-9c22b9e2b078@oracle.com>
 <994bb7a0-20dd-ea53-021c-9f8d49b49917@oracle.com>
 <fb109e31-b966-ca09-03a8-40ba06ef14fa@oracle.com>
 <3D2D6A62-52A7-43B2-88EF-AF6589E5B21A@sap.com>
 <e4a13823-3626-3be8-8ef5-0744ee9bb9d2@oracle.com>
Message-ID: <0BA04FB6-18C2-4AE9-B4ED-677270C910E8@sap.com>

Hi Tobias,

thank you for pointing me to Doug's reply. You are right, I missed just that single one - my fault (e-mail filter issue).

Regards,
Lutz 


On 28.07.20, 12:00, "Tobias Hartmann" <tobias.hartmann at oracle.com> wrote:

    Hi Lutz,

    On 28.07.20 11:24, Schmidt, Lutz wrote:
    > May I assume the issue is handled by knowledgeable people from now on? Tom, will you take over?

    You've probably missed that but Doug already replied in your original RFR:
    https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-July/039171.html

    Best regards,
    Tobias


From sergei.tsypanov at yandex.ru  Tue Jul 28 10:35:35 2020
From: sergei.tsypanov at yandex.ru (=?utf-8?B?0KHQtdGA0LPQtdC5INCm0YvQv9Cw0L3QvtCy?=)
Date: Tue, 28 Jul 2020 12:35:35 +0200
Subject: Performance degradation due to probable (?) C2 issue
Message-ID: <925401595926726@mail.yandex.ru>

Hello,

I've run into a strange issue while trying to improve java.net.URLEncoder.encode() for the case where the URL contains UTF-8 symbols.
The idea of the fix is to replace the contents of line 276

  String str = new String(charArrayWriter.toCharArray());

with 

  String str = charArrayWriter.toString();

CharArrayWriter.toCharArray() allocates a copy of the underlying char[], which is then passed to the String constructor,
while CharArrayWriter.toString() passes the char[] to the String constructor directly. In theory this should give us
a certain improvement in both time and memory, as we don't allocate a redundant char[]. To verify this I've used a benchmark
encoding the link to the article about the UN in the Russian wiki:

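import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread)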
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Fork(jvmArgsAppend = {"-Xms2g", "-Xmx2g", "-XX:+UseParallelGC"})
public class UrlEncoderBenchmark {
  private final Charset charset = Charset.defaultCharset();
  private final String utf8Url = "https://ru.wikipedia.org/wiki/Организация_Объединённых_Наций";

  @Benchmark
  public String encodeUtf8() {
    return URLEncoder.encode(utf8Url, charset);
  }
}
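
As a sanity check on the premise that the two variants are interchangeable, here is a minimal standalone sketch showing that both calls yield equal strings. The class and method names are assumptions made for illustration, not JDK code, and the sketch says nothing about performance.

```java
import java.io.CharArrayWriter;

// Illustrative only; the class and method names are assumptions, not JDK code.
final class CharArrayWriterSketch {
    // Current form: toCharArray() copies the buffer, then the String constructor copies again.
    static String viaToCharArray(CharArrayWriter w) {
        return new String(w.toCharArray());
    }

    // Proposed form: toString() builds the String from the writer's internal buffer.
    static String viaToString(CharArrayWriter w) {
        return w.toString();
    }

    public static void main(String[] args) {
        CharArrayWriter w = new CharArrayWriter();
        String text = "\u041e\u041e\u041d";  // three Cyrillic letters
        w.write(text, 0, text.length());
        System.out.println(viaToCharArray(w).equals(viaToString(w)));
    }
}
```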

In practice it turned out that we win only in the interpreter and tier 1:

Benchmark                                                    Mode  Cnt     Score      Error   Units

-Xint before

UrlEncoderBenchmark.encodeUtf8                               avgt  100   179.905 ±    2.498   us/op
UrlEncoderBenchmark.encodeUtf8:·gc.alloc.rate.norm           avgt  100  1712.752 ±    0.542    B/op

-Xint after

UrlEncoderBenchmark.encodeUtf8                               avgt  100   173.323 ±    3.459   us/op
UrlEncoderBenchmark.encodeUtf8:·gc.alloc.rate.norm           avgt  100  1552.409 ±    0.339    B/op

-XX:TieredStopAtLevel=1 before

UrlEncoderBenchmark.encodeUtf8                               avgt  100     3.846 ±    0.021   us/op
UrlEncoderBenchmark.encodeUtf8:·gc.alloc.rate.norm           avgt  100  1712.271 ±    0.011    B/op

-XX:TieredStopAtLevel=1 after

UrlEncoderBenchmark.encodeUtf8                               avgt  100     3.732 ±    0.013   us/op
UrlEncoderBenchmark.encodeUtf8:·gc.alloc.rate.norm           avgt  100  1552.246 ±    0.014    B/op

Here we see that we indeed consume less time and memory. However, with full compilation we see a severe degradation (+30%)
in time consumption, while memory consumption still shows the same improvement:

before

UrlEncoderBenchmark.encodeUtf8                               avgt  100  1108.668 ±    6.226   ns/op
UrlEncoderBenchmark.encodeUtf8:·gc.alloc.rate.norm           avgt  100  1712.202 ±    0.003    B/op

after

UrlEncoderBenchmark.encodeUtf8                               avgt  100  1454.647 ±    6.067   ns/op
UrlEncoderBenchmark.encodeUtf8:·gc.alloc.rate.norm           avgt  100  1528.219 ±    0.007    B/op

As the inlining log shows, in the second case there's something wrong:

Compilation before

               @ 186   java.io.CharArrayWriter::flush (1 bytes)   inline (hot)
!m             @ 195   java.io.CharArrayWriter::toCharArray (26 bytes)   inline (hot)
                 @ 15   java.util.Arrays::copyOf (19 bytes)   inline (hot)
                   @ 11   java.lang.Math::min (11 bytes)   (intrinsic)
                   @ 14   java.lang.System::arraycopy (0 bytes)   (intrinsic)
               @ 198   java.lang.String::<init> (10 bytes)   inline (hot)
                 @ 6   java.lang.String::<init> (74 bytes)   inline (hot)
                   @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                   @ 36   java.lang.StringUTF16::compress (20 bytes)   inline (hot)
                     @ 9   java.lang.StringUTF16::compress (50 bytes)   (intrinsic)
                   @ 67   java.lang.StringUTF16::toBytes (34 bytes)   (intrinsic)

Compilation after

               @ 186   java.io.CharArrayWriter::flush (1 bytes)   inline (hot)
!m             @ 191   java.io.CharArrayWriter::toString (31 bytes)   already compiled into a big method    <----------------
               @ 199   java.lang.String::getBytes (25 bytes)   inline (hot)
                 @ 14   java.lang.String::coder (15 bytes)   inline (hot)
!                @ 21   java.lang.StringCoding::encode (324 bytes)   inline (hot)
                   @ 10   java.lang.StringCoding::encodeUTF8 (132 bytes)   inline (hot)
                     @ 7   java.lang.StringCoding::encodeUTF8_UTF16 (369 bytes)   hot method too big        <----------------
                     @ 15   java.lang.StringCoding::hasNegatives (25 bytes)   (intrinsic)
                     @ 24   java.util.Arrays::copyOf (19 bytes)   inline (hot)
                       @ 11   java.lang.Math::min (11 bytes)   (intrinsic)
                       @ 14   java.lang.System::arraycopy (0 bytes)   (intrinsic)

And in the compilation log for the patched case I have this entry:

<method id='1166' holder='1154' name='toString' return='1032' flags='1' bytes='31' compile_id='1062' compiler='c2' level='4' iicount='11163'/>
<dependency type='unique_concrete_method' ctxk='1154' x='1166'/>
<call method='1166' count='75859' prof_factor='1,000000' inline='1'/>
<inline_fail reason='already compiled into a big method'/>

This is consistent with the results of profiling with perfasm:

- for the original code we have only 1 hot region

....................................................................................................
 62.29%  <total for region 1>
....[Hottest Regions]...............................................................................
 62.29%         c2, level 4  java.net.URLEncoder::encode, version 1032 (1487 bytes)


- for the patched code we have 2 hot regions:

....[Hottest Region 1]..............................................................................
c2, level 4, java.net.URLEncoder::encode, version 1019 (1467 bytes)
....................................................................................................
 61.44%  <total for region 1>

....[Hottest Region 2]..............................................................................
c2, level 4, java.net.URLEncoder::encode, version 1019 (1048 bytes)
....................................................................................................
 10.90%  <total for region 2>

So my question is whether there's something wrong with the compiler, or whether the original idea of the improvement was wrong.

Here are some attachments in case anyone finds them useful:

1. Output of LinuxPerfAsmProfiler for original code: https://gist.github.com/stsypanov/6bcd95fd9fbe79afc5f29db929e517f1
2. Output of LinuxPerfAsmProfiler for patched code: https://gist.github.com/stsypanov/794c0b4fdb13bad9fcb7fc890cec3dc8

Regards,
Sergey Tsypanov


From coleen.phillimore at oracle.com  Tue Jul 28 11:20:50 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 28 Jul 2020 07:20:50 -0400
Subject: RFR (T) 8250042: Clean up methodOop and method_oop names from the
 code
In-Reply-To: <8c737ff0-7a62-d18d-78b2-b415802ebbdb@oracle.com>
References: <85efc3ab-abbf-c5f2-9b7b-47fa516d9a2d@oracle.com>
 <6f973a0a-cf55-e1ab-8de3-b57f68dbd2cf@oracle.com>
 <c5392d2c-7fe1-09e5-3366-4b9840fef087@oracle.com>
 <c6ec9d68-e44f-7288-d084-e64d62a79490@oracle.com>
 <8c737ff0-7a62-d18d-78b2-b415802ebbdb@oracle.com>
Message-ID: <460241cc-7b49-b793-fa50-c12898d4b332@oracle.com>


Hi, Thank you for reviewing the compiler changes.

On 7/28/20 3:31 AM, Christian Hagedorn wrote:
> Hi Coleen
>
> On 24.07.20 15:10, coleen.phillimore at oracle.com wrote:
>> incremental webrev at 
>> http://cr.openjdk.java.net/~coleenp/2020/8250042.02.incr/webrev
>> full webrev at 
>> http://cr.openjdk.java.net/~coleenp/2020/8250042.02/webrev
>
> Thanks for cleaning this up! The compiler changes look good to me.
>
> Just a minor comment (no new webrev required):
> - arm.ad:8873 & x86_32.ad:13321: There is an extra whitespace before ")"

Fixed!
Thanks,
Coleen
>
> Best regards,
> Christian
>
>> Thanks,
>> Coleen
>>
>>
>> On 7/24/20 8:23 AM, coleen.phillimore at oracle.com wrote:
>>>
>>> Thanks for looking at this.
>>>
>>> On 7/24/20 1:01 AM, David Holmes wrote:
>>>> Hi Coleen,
>>>>
>>>> On 24/07/2020 2:58 am, coleen.phillimore at oracle.com wrote:
>>>>> See bug for more details.  I've been running into these names a 
>>>>> lot lately.  Many of these names are in JVMTI.
>>>>>
>>>>> Tested with tier1 on all Oracle platforms and built on non-Oracle 
>>>>> platforms.
>>>>>
>>>>> open webrev at 
>>>>> http://cr.openjdk.java.net/~coleenp/2020/8250042.01/webrev
>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8250042
>>>>
>>>> src/hotspot/cpu/*/*.ad
>>>>
>>>> These still refer to "method oop" and method_oop in a number of 
>>>> places.
>>>
>>> Yes, I only replaced method_oop in the shared code and not in the AD 
>>> code.  method_oop can be the name of a parameter and using "sed" to 
>>> change it to "method" doesn't work.  Somebody who understands this 
>>> code and looks at it will have to make the rest of the changes.
>>>
>>> What I did was replace "method oop" with "method" and "methodOop" 
>>> with "method" in all the sources.  I replaced "method_oop" with 
>>> "method" or "checked_method" in the shared sources.
>>>
>>>>
>>>> src/hotspot/share/adlc/adlparse.cpp
>>>>
>>>> +  frame->_interpreter_method_oop_reg = parse_one_arg("method reg 
>>>> entry");
>>>>
>>>> I guess I'm not understanding the scope of this renaming - why is 
>>>> _interpreter_method_oop_reg not renamed as well? Should this (and 
>>>> other uses) be parsed as method-(oop-reg) rather than 
>>>> (method-oop)-reg?
>>>
>>> I don't know this code, so I'd rather not change any more of it. The 
>>> comment makes sense changed, even though the variable name still 
>>> refers to method_oop.
>>>
>>> Thanks,
>>> Coleen
>>>>
>>>> Otherwise all okay.
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> Thanks,
>>>>> Coleen
>>>
>>


From felix.yang at huawei.com  Tue Jul 28 12:10:13 2020
From: felix.yang at huawei.com (Yangfei (Felix))
Date: Tue, 28 Jul 2020 12:10:13 +0000
Subject: RFR(S): 8250609: C2 crash in IfNode::fold_compares
In-Reply-To: <23f0ab18-8bfb-3874-3000-ee2b37caca7c@oracle.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7E8D547@dggeml507-mbx.china.huawei.com>
 <23f0ab18-8bfb-3874-3000-ee2b37caca7c@oracle.com>
Message-ID: <DA41BE1DDCA941489001C7FBD7A8820EE7E8F0BE@dggeml507-mbx.china.huawei.com>

Hi,

> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Tuesday, July 28, 2020 3:15 AM
> To: Yangfei (Felix) <felix.yang at huawei.com>; hotspot-compiler-
> dev at openjdk.java.net
> Subject: Re: RFR(S): 8250609: C2 crash in IfNode::fold_compares
> 
> It happens because 'lo' is a new node created just now and has no uses yet.
> For such new nodes we usually add a dummy use to avoid removal from the graph:
> 
> http://hg.openjdk.java.net/jdk/jdk/file/c379dc750a02/src/hotspot/share/opto/convertnode.cpp#l403

Thanks for the suggestions.  Yes, that will also fix the issue. 
New webrev: http://cr.openjdk.java.net/~fyang/8250609/webrev.01/ 
Performed the same tests as before.  Does it look better? 

Felix
 
> On 7/27/20 5:27 AM, Yangfei (Felix) wrote:
> > Hi,
> >
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8250609
> > Webrev: http://cr.openjdk.java.net/~fyang/8250609/webrev.00/
> >
> > In IfNode::fold_compares_helper, C2 tries to fold 2 CmpI into a single CmpU.
> > At the crash site in IfNode::fold_compares_helper:
> >   995   if (lo && hi) {
> >   996     // Merge the two compares into a single unsigned compare by building (CmpU (n - lo) (hi - lo))
> >   997     Node* adjusted_val = igvn->transform(new SubINode(n,  lo));
> >   998     if (adjusted_lim == NULL) {
> >   999       adjusted_lim = igvn->transform(new SubINode(hi, lo));
> >  1000     }
> >
> > At line 997, we have:
> > (gdb) p lo->dump()
> >   641    AddI    === _  513  92  [[]]
> > $1 = void
> >
> > After the transformation at line 997, we have
> > (gdb) p lo->dump()
> >   641    AddI    === _ _ _  [[]]   [34200641]
> > $3 = void
> >
> > Then node 641 was used at line 999, which triggers the crash.
> > Patch fixes the issue by delaying transformation in IfNode::fold_compares
> temporarily.
> > Tier1-3 tested on aarch64-linux-gnu & x86_64-linux-gnu.
> > Newly added test fail without the patch and pass otherwise.
> > Suggestions?
> >
> > Thanks,
> > Felix
> >
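
For readers less familiar with the folding mentioned above, here is a minimal standalone sketch of the arithmetic behind (CmpU (n - lo) (hi - lo)). It is plain C++ rather than C2 node code, the function names are made up for illustration, and it assumes lo <= hi; the conditions C2 itself checks before folding are not shown here.

```cpp
#include <cstdint>

// Two signed compares, as the range check looks before folding.
bool in_range_two_compares(int32_t n, int32_t lo, int32_t hi) {
  return lo <= n && n < hi;
}

// One unsigned compare of (n - lo) against (hi - lo).
// The subtractions are done in uint32_t so wrap-around is well defined.
// Assumption: lo <= hi; without it the two forms can disagree.
bool in_range_one_compare(int32_t n, int32_t lo, int32_t hi) {
  const uint32_t adjusted_val = static_cast<uint32_t>(n) - static_cast<uint32_t>(lo);
  const uint32_t adjusted_lim = static_cast<uint32_t>(hi) - static_cast<uint32_t>(lo);
  return adjusted_val < adjusted_lim;
}

// Checks that both forms agree on a small grid of inputs with lo <= hi.
bool forms_agree_on_samples() {
  const int32_t samples[] = {INT32_MIN, -7, -1, 0, 1, 5, 100, INT32_MAX};
  for (int32_t lo : samples) {
    for (int32_t hi : samples) {
      if (lo > hi) continue;  // outside the stated assumption
      for (int32_t n : samples) {
        if (in_range_two_compares(n, lo, hi) != in_range_one_compare(n, lo, hi)) {
          return false;
        }
      }
    }
  }
  return true;
}
```

When n < lo, the unsigned difference wraps to a value larger than any possible (hi - lo), so the single compare rejects it just as the two signed compares would.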

From aph at redhat.com  Tue Jul 28 12:12:43 2020
From: aph at redhat.com (Andrew Haley)
Date: Tue, 28 Jul 2020 13:12:43 +0100
Subject: Performance degradation due to probable (?) C2 issue
In-Reply-To: <925401595926726@mail.yandex.ru>
References: <925401595926726@mail.yandex.ru>
Message-ID: <da261d22-3801-bf63-b44c-c181e80f6209@redhat.com>

Hi,

On 28/07/2020 11:35, Сергей Цыпанов wrote:

> So my question is whether there's something wrong with the compiler, or
> whether the original idea of the improvement was wrong.

No, and (probably) no.

C2 uses a bunch of heuristics. Here, it's detected that
CharArrayWriter::toString is large and has already been compiled so
there's no sense inlining another copy of it. This isn't necessarily
true, but it's a good guess. Try playing with InlineSmallCode: start
with =1000, and increase it from there to see if it helps.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From volker.simonis at gmail.com  Tue Jul 28 13:06:29 2020
From: volker.simonis at gmail.com (Volker Simonis)
Date: Tue, 28 Jul 2020 15:06:29 +0200
Subject: RFR[XS] 8249809 avoid calling DirectiveSet::clone(this) in
 compilecommand_compatibility_init
In-Reply-To: <1595907547514.55531@amazon.com>
References: <1595807197546.52082@amazon.com>
 <CA+3eh10LEgJdcSAFrtsE=Ycy2oHJR3Hk2ddrTAFYLiC1E-sJ2w@mail.gmail.com>
 <1595907547514.55531@amazon.com>
Message-ID: <CA+3eh10dQqSJyu+odn1zfLgMXw_oQaMN5+9icSp0LeERwui7JA@mail.gmail.com>

On Tue, Jul 28, 2020 at 5:40 AM Liu, Xin <xxinliu at amazon.com> wrote:
>
> hi, Volker,
>
> Thank you for reviewing my patch.
>
> 1. Yes. I guess nodes are the major memory consumer, but compiler directives are used by both C1 and C2,
> so it can still reduce the memory footprint a little.
>
> 2.  The previous code set the flag here. Both ControlIntrinsic and DisableIntrinsic belong to compilerdirectives_common_flags.
>
>  327 #define init_default_cc(name, type, dvalue, cc_flag) { type v; if (!_modified[name##Index] && CompilerOracle::has_option_value(method, #cc_flag, v) && v != this->name##Option) { set->name##Option = v; changed = true;} }
>  328     compilerdirectives_common_flags(init_default_cc)
>
> When method-level directives override the global directives, this code snippet sets 'changed'.
> Even though I removed the flag 'changed', I have the same logic in the smart pointer.
>
> 3.  The smart pointer DirectiveSetPtr needs to provide two kinds of access to the underlying pointer:
> one read-only and one mutable.
>
> Ideally, it should have the following 2 operator->().
> DirectiveSet* operator->();
> DirectiveSet const* operator->().
>
> AFAIK, C++ doesn't support overloading on the return type alone.  That is to say, we need to find a workaround.
> The reason I overload operator*() is that it returns a reference to the object.  Users who want to modify the pointee have to explicitly dereference the smart pointer.
> (*set).member = newvalue;
> set->member = newvalue; // compiler error.
>
> Your approach also works, but you need to invoke const_cast<> in all places where you want to read.
> I think my approach results in shorter code.
>
>
> I just came up with a new idea.  How about I provide a method cloned(), which returns the unqualified pointer?
>
> +  DirectiveSet* cloned() {
> +    if (!_clone) {
> +      _clone = DirectiveSet::clone(_origin);
> +    }
> +    return _clone;
> +  }
> +
>    DirectiveSet* transfer() {
>      assert(_origin != NULL, "_origin is NULL! transfer() can only be invoked once.");
>      if (_clone != NULL) {
> @@ -340,7 +347,7 @@
>
>      if (CompilerOracle::should_print(method)) {
>        if (!_modified[PrintAssemblyIndex]) {
> -        (*set).PrintAssemblyOption = true;
> +        set.cloned()->PrintAssemblyOption = true;
>        }
>      }
>

Hi Xin,

I like this solution much better. It makes it clear that we want to
alter the state of the Directive Set. Can you please provide a new
webrev based on this idea?
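
To make the intent concrete, here is a minimal sketch of how I read the cloned() idea. The DirectiveSet below is a stripped-down stand-in with a single option, has_clone() is a hypothetical helper added only for illustration, and transfer() is left out, so this is not the code from the webrev:

```cpp
// Hypothetical stand-in for HotSpot's DirectiveSet; only one option is modeled.
struct DirectiveSet {
  bool PrintAssemblyOption = false;
  static DirectiveSet* clone(const DirectiveSet* src) {
    return new DirectiveSet(*src);
  }
};

// Copy-on-write wrapper: reads go to the shared original until the first
// call to cloned(), after which reads and writes go to the private clone.
class DirectiveSetPtr {
  const DirectiveSet* _origin;
  DirectiveSet* _clone;

 public:
  explicit DirectiveSetPtr(const DirectiveSet* origin)
      : _origin(origin), _clone(nullptr) {}
  ~DirectiveSetPtr() { delete _clone; }
  DirectiveSetPtr(const DirectiveSetPtr&) = delete;
  DirectiveSetPtr& operator=(const DirectiveSetPtr&) = delete;

  // Read-only access: never clones, and assignment through it does not compile.
  const DirectiveSet* operator->() const {
    return _clone != nullptr ? _clone : _origin;
  }

  // Mutable access: clones lazily on first use.
  DirectiveSet* cloned() {
    if (_clone == nullptr) {
      _clone = DirectiveSet::clone(_origin);
    }
    return _clone;
  }

  // Hypothetical helper, only here so the behaviour can be observed.
  bool has_clone() const { return _clone != nullptr; }
};

// Returns true if reading does not clone, writing does, and the original
// DirectiveSet is left untouched.
bool demo_copy_on_write() {
  DirectiveSet defaults;
  DirectiveSetPtr set(&defaults);
  const bool read_before = set->PrintAssemblyOption;  // read through operator->
  const bool no_clone_after_read = !set.has_clone();
  set.cloned()->PrintAssemblyOption = true;            // explicit mutation
  return !read_before && no_clone_after_read && set.has_clone() &&
         set->PrintAssemblyOption && !defaults.PrintAssemblyOption;
}
```

With only a const operator->(), every mutation site has to spell out cloned(), which is what makes the intent visible when reading the code.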

Thank you and best regards,
Volker

> thanks,
> --lx
>
> ________________________________________
> From: Volker Simonis <volker.simonis at gmail.com>
> Sent: Monday, July 27, 2020 8:51 AM
> To: Liu, Xin
> Cc: hotspot-compiler-dev at openjdk.java.net
> Subject: RE: [EXTERNAL] RFR[XS] 8249809 avoid calling DirectiveSet::clone(this) in compilecommand_compatibility_init
>
>
> Hi Xin,
>
> I'm not sure if saving the allocation of a DirectiveSet has any
> visible effect compared to the much larger allocations required for
> the method compilation itself.
>
> Apart from that, I must confess that I'm not totally understanding the
> original logic. From what I see, it sets "changed" to true in the case
> where it changes the cloned DirectiveSet. But it doesn't do that in
> the cases where it only changes the clone's control word:
>
>  341         set->_intrinsic_control_words.fill_in(TriBool());
> ...
>  348           set->_intrinsic_control_words[id] = iter.is_enabled();
> ...
>  361         set->_intrinsic_control_words.fill_in(TriBool());
> ...
>  368           set->_intrinsic_control_words[id] = false;
>
> Why don't these mutations count as "changing"  the cloned DirectiveSet?
>
> After your patch, you've changed the above lines such that they will
> always create a clone which seems different from the initial
> behaviour.
>
> Which of the two behaviours is correct here, the original one, the new
> one after your change or doesn't it matter for reasons I don't
> understand?
>
>
> I also wonder why you need to overload both operators "operator*()"
> and "operator->()"? It seems a little bit arbitrary (and hard to
> understand for people reading the code) that "operator*()" clones the
> underlying directiveSet while "operator->()" uses the original one.
> Why not just define two versions of "operator->()" and let the
> compiler choose the right one like so:
>
> DirectiveSet const* operator->() const {
>   return !_clone ? _origin : _clone;
> }
>
> DirectiveSet* operator->() {
>   if (!_clone) {
>     _clone = DirectiveSet::clone(_origin);
>   }
>   return _clone;
> }
> ...
> if (!_modified[LogIndex]) {
>   bool log = CompilerOracle::should_log(method);
>   if (log != const_cast<const DirectiveSetPtr&>(set)->LogOption) {
>     set->LogOption = log;
>   }
> }
>
> Thank you and best regards,
> Volker
>
> On Mon, Jul 27, 2020 at 1:47 AM Liu, Xin <xxinliu at amazon.com> wrote:
> >
> > hi, Reviewers,
> >
> > Could you review this simple patch?
> > bug: https://bugs.openjdk.java.net/browse/JDK-8249809
> > webrev: https://cr.openjdk.java.net/~xliu/8249809/00/webrev/
> >
> > When users specify a method-level compiler directive, the DirectiveSet is cloned for every single method being compiled. That is expensive, and the clone is rarely needed: only user-specified methods actually have to clone the DirectiveSet. I introduce a smart pointer, DirectiveSetPtr. Its operator->() returns a pointer to a constant DirectiveSet, which is read-only. It doesn't clone the _origin until C2 needs to update its members. transfer() yields ownership of the pointer.
> >
> > Test:
> > Manually tested with different CompileCommand options.
> > hotspot:tier1 and gtest:all.
> >
> > thanks,
> > --lx
> >

From vladimir.kozlov at oracle.com  Tue Jul 28 16:01:45 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 28 Jul 2020 09:01:45 -0700
Subject: RFR (XS) 8250612: jvmciCompilerToVM.cpp declares jio_printf with
 "void" return type, should be "int"
In-Reply-To: <5f0a2b21-c897-c474-b54d-03e587bdf046@oracle.com>
References: <90ebea60-d625-e67f-918b-1ba2a531316e@redhat.com>
 <5f0a2b21-c897-c474-b54d-03e587bdf046@oracle.com>
Message-ID: <97787ade-5d9f-ab11-5e38-11345d4d3c95@oracle.com>

+1

Thanks,
Vladimir K

On 7/28/20 2:05 AM, Tobias Hartmann wrote:
> Hi Aleksey,
> 
> looks good and trivial to me.
> 
> Best regards,
> Tobias
> 
> On 28.07.20 09:09, Aleksey Shipilev wrote:
>> Bug:
>>    https://bugs.openjdk.java.net/browse/JDK-8250612
>>
>> Fix:
>>    https://cr.openjdk.java.net/~shade/8250612/webrev.01/
>>
>> Testing: Linux x86_64 builds; jdk-submit
>>

From luhenry at microsoft.com  Tue Jul 28 16:22:52 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Tue, 28 Jul 2020 16:22:52 +0000
Subject: [aarch64-port-dev ] RFR[XXS] 8248672: utilities: Introduce
 DEPRECATED macro for GCC and MSVC
In-Reply-To: <MWHPR21MB05117357CE8A5903943234C2B0750@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB05113A054F2D8291BFAFB085B0770@MWHPR21MB0511.namprd21.prod.outlook.com>
 <5e301790-8bfe-0ced-b5e2-8a9c76ae33de@oracle.com>
 <MWHPR21MB051146316FF56A63423DF514B0770@MWHPR21MB0511.namprd21.prod.outlook.com>
 <F50C37BA-FE47-465C-A60C-10547DCF8F69@oracle.com>
 <1259c3fd-b69c-6d81-0427-cb769f00bca5@redhat.com>,
 <CD1AE64E-8555-4074-97EB-831BF0A72C61@oracle.com>,
 <MWHPR21MB05117357CE8A5903943234C2B0750@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <MWHPR21MB0511A0713E147045CA02522CB0730@MWHPR21MB0511.namprd21.prod.outlook.com>

Hi,

I confirm that `= delete` works, and that we get a compile-time error if you try to use it.

Please find the updated webrev at http://cr.openjdk.java.net/~burban/luhenry/8248672/webrev.01
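
For anyone following along, this is roughly the shape of the technique. It is a standalone sketch with a made-up MiniAssembler type and int register numbers, not the AArch64 Assembler code from the webrev:

```cpp
#include <cstdint>

// Hypothetical miniature of the situation: only a byte-wide immediate
// compare is supported, and the wider overload is poisoned.
struct MiniAssembler {
  int last_reg = -1;
  int last_imm = -1;

  // Supported form: the immediate must already be a byte-wide value.
  void cmp(int reg, uint8_t imm8) {
    last_reg = reg;
    last_imm = imm8;
  }

  // Poisoned form: an int argument is an exact match for this overload,
  // so overload resolution selects it and the call fails to compile.
  void cmp(int reg, int imm) = delete;
};

// Emits a compare with an explicitly byte-wide immediate and returns it.
int emit_small_compare() {
  MiniAssembler a;
  a.cmp(0, uint8_t{42});
  // a.cmp(0, 300);  // would not compile: selects the deleted overload
  return a.last_imm;
}
```

Compared with leaving the wider overload undefined, the failure shows up at the call site during compilation instead of as a link error.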

Thank you
Ludovic

________________________________________
From: Ludovic Henry <luhenry at microsoft.com>
Sent: Sunday, July 26, 2020 16:10
To: Kim Barrett; Andrew Haley
Cc: Vladimir Kozlov; hotspot-compiler-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net; openjdk-aarch64; hotspot-gc-dev at openjdk.java.net
Subject: Re: [aarch64-port-dev ] RFR[XXS] 8248672: utilities: Introduce DEPRECATED macro for GCC and MSVC

> As of early last week, a definition of "= delete;" is the way to
> poison an overload.

Let me try that locally, compile on Windows-AArch64 and Linux-AArch64, and confirm whether it works for MSVC.

________________________________________
From: Kim Barrett <kim.barrett at oracle.com>
Sent: Sunday, July 26, 2020 14:41
To: Andrew Haley
Cc: Ludovic Henry; Vladimir Kozlov; hotspot-compiler-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net; openjdk-aarch64; hotspot-gc-dev at openjdk.java.net
Subject: Re: [aarch64-port-dev ] RFR[XXS] 8248672: utilities: Introduce DEPRECATED macro for GCC and MSVC

> On Jul 26, 2020, at 5:56 AM, Andrew Haley <aph at redhat.com> wrote:
>
> On 25/07/2020 00:42, Kim Barrett wrote:
>> Why are we deprecating something rather than just deleting it and
>> fixing any users?
>
> C++ overloading. AArch64 CMP (immediate) only has a limited range, so
> we only have a byte-wide Assembler::cmp() definition. The deprecation
> warning on the wider version makes sure that any maintenance
> programmer is immediately warned if it is used. There are other things
> we could do: by not providing a definition for the wider cmp() you get
> a link error, but that wouldn't be as explicit as a deprecation
> warning.
>
> The root problem is that the immediate value to CMP isn't always known
> when HotSpot is compiled, but may be calculated at runtime. We have
> seen failures in production when an immediate offset overflowed.

Yeah, I'd guessed that might be the point, and confirmed it later by
looking at the changeset that originally introduced the attribute.

As of early last week, a definition of "= delete;" is the way to
poison an overload.


From vladimir.kozlov at oracle.com  Tue Jul 28 16:29:20 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 28 Jul 2020 09:29:20 -0700
Subject: RFR(XS): 8250233: -XX:+CITime triggers guarantee(events != NULL)
 in jvmci.cpp
In-Reply-To: <0BA04FB6-18C2-4AE9-B4ED-677270C910E8@sap.com>
References: <E9E87374-CAA7-4BB8-8860-D5F03080098E@sap.com>
 <7f5c8191-748f-d26d-9a4b-4efcea72ab3c@oracle.com>
 <8ff1c8ad-4111-03da-90cc-9c22b9e2b078@oracle.com>
 <994bb7a0-20dd-ea53-021c-9f8d49b49917@oracle.com>
 <fb109e31-b966-ca09-03a8-40ba06ef14fa@oracle.com>
 <3D2D6A62-52A7-43B2-88EF-AF6589E5B21A@sap.com>
 <e4a13823-3626-3be8-8ef5-0744ee9bb9d2@oracle.com>
 <0BA04FB6-18C2-4AE9-B4ED-677270C910E8@sap.com>
Message-ID: <4584367b-c6f4-cb21-d319-357cc9003796@oracle.com>

I actually did not see Doug's reply either.

But I think we still need to add an EnableJVMCI check in a few places to avoid doing useless work. I took this bug.

Thanks,
Vladimir K

On 7/28/20 3:31 AM, Schmidt, Lutz wrote:
> Hi Tobias,
> 
> thank you for pointing me to Doug's reply. You are right, I missed just that single one - my fault (e-mail filter issue).
> 
> Regards,
> Lutz
> 
> 
> On 28.07.20, 12:00, "Tobias Hartmann" <tobias.hartmann at oracle.com> wrote:
> 
>      Hi Lutz,
> 
>      On 28.07.20 11:24, Schmidt, Lutz wrote:
>      > May I assume the issue is handled by knowledgeable people from now on? Tom, will you take over?
> 
>      You've probably missed that but Doug already replied in your original RFR:
>      https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-July/039171.html
> 
>      Best regards,
>      Tobias
> 

From vladimir.kozlov at oracle.com  Tue Jul 28 17:27:47 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 28 Jul 2020 10:27:47 -0700
Subject: RFR(S): 8250609: C2 crash in IfNode::fold_compares
In-Reply-To: <DA41BE1DDCA941489001C7FBD7A8820EE7E8F0BE@dggeml507-mbx.china.huawei.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7E8D547@dggeml507-mbx.china.huawei.com>
 <23f0ab18-8bfb-3874-3000-ee2b37caca7c@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E8F0BE@dggeml507-mbx.china.huawei.com>
Message-ID: <d2e01ca9-1f00-1258-ce50-c27a99064cd2@oracle.com>

Yes, this looks good.

Thanks,
Vladimir K

On 7/28/20 5:10 AM, Yangfei (Felix) wrote:
> Hi,
> 
>> -----Original Message-----
>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>> Sent: Tuesday, July 28, 2020 3:15 AM
>> To: Yangfei (Felix) <felix.yang at huawei.com>; hotspot-compiler-
>> dev at openjdk.java.net
>> Subject: Re: RFR(S): 8250609: C2 crash in IfNode::fold_compares
>>
>> It happens because 'lo' is a new node created just now and has no uses yet.
>> For such new nodes we usually add a dummy use to avoid removal from the graph:
>>
>> http://hg.openjdk.java.net/jdk/jdk/file/c379dc750a02/src/hotspot/share/opto/convertnode.cpp#l403
> 
> Thanks for the suggestions.  Yes, that will also fix the issue.
> New webrev: http://cr.openjdk.java.net/~fyang/8250609/webrev.01/
> Performed the same tests as before.  Does it look better?
> 
> Felix
>   
>> On 7/27/20 5:27 AM, Yangfei (Felix) wrote:
>>> Hi,
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8250609
>>> Webrev: http://cr.openjdk.java.net/~fyang/8250609/webrev.00/
>>>
>>> In IfNode::fold_compares_helper, C2 tries to fold 2 CmpI into a single CmpU.
>>> At the crash site in IfNode::fold_compares_helper:
>>>    995   if (lo && hi) {
>>>    996     // Merge the two compares into a single unsigned compare by building (CmpU (n - lo) (hi - lo))
>>>    997     Node* adjusted_val = igvn->transform(new SubINode(n,  lo));
>>>    998     if (adjusted_lim == NULL) {
>>>    999       adjusted_lim = igvn->transform(new SubINode(hi, lo));
>>>   1000     }
>>>
>>> At line 997, we have:
>>> (gdb) p lo->dump()
>>>    641    AddI    === _  513  92  [[]]
>>> $1 = void
>>>
>>> After the transformation at line 997, we have
>>> (gdb) p lo->dump()
>>>    641    AddI    === _ _ _  [[]]   [34200641]
>>> $3 = void
>>>
>>> Node 641 was then used at line 999, which triggers the crash.
>>> The patch fixes the issue by temporarily delaying transformation in IfNode::fold_compares.
>>> Tier1-3 tested on aarch64-linux-gnu & x86_64-linux-gnu.
>>> The newly added test fails without the patch and passes with it.
>>> Suggestions?
>>>
>>> Thanks,
>>> Felix
>>>

From aph at redhat.com  Tue Jul 28 18:21:18 2020
From: aph at redhat.com (Andrew Haley)
Date: Tue, 28 Jul 2020 19:21:18 +0100
Subject: [aarch64-port-dev ] RFR[XXS] 8248672: utilities: Introduce
 DEPRECATED macro for GCC and MSVC
In-Reply-To: <MWHPR21MB0511A0713E147045CA02522CB0730@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB05113A054F2D8291BFAFB085B0770@MWHPR21MB0511.namprd21.prod.outlook.com>
 <5e301790-8bfe-0ced-b5e2-8a9c76ae33de@oracle.com>
 <MWHPR21MB051146316FF56A63423DF514B0770@MWHPR21MB0511.namprd21.prod.outlook.com>
 <F50C37BA-FE47-465C-A60C-10547DCF8F69@oracle.com>
 <1259c3fd-b69c-6d81-0427-cb769f00bca5@redhat.com>
 <CD1AE64E-8555-4074-97EB-831BF0A72C61@oracle.com>
 <MWHPR21MB05117357CE8A5903943234C2B0750@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511A0713E147045CA02522CB0730@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <279cd44e-dfcc-00c2-4aee-1cca630bd5ec@redhat.com>

On 28/07/2020 17:22, Ludovic Henry wrote:
> I confirm that `= delete` works, and that we get a compile-time error if you try to use it.
> 
> Please find the updated webrev at http://cr.openjdk.java.net/~burban/luhenry/8248672/webrev.01

OK, thanks.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From Charlie.Gracie at microsoft.com  Tue Jul 28 20:39:58 2020
From: Charlie.Gracie at microsoft.com (Charlie Gracie)
Date: Tue, 28 Jul 2020 20:39:58 +0000
Subject: Inlining difference when using G1GC instead of ParallelGC
Message-ID: <4EBCB07A-73AF-43D9-AE9D-3F71152082BB@microsoft.com>

Hi,

I have noticed an inlining difference in C2 when the JVM is using G1GC as compared to
ParallelGC. It is causing a measurable difference in performance since other optimizations
cannot take place if the method is not inlined. This is a code snippet from my small example [1]
that demonstrates the difference:

public class TypeCheck {
   public static void main(String[] args) {
      ...
      Handler handler1 = new Handler(new InnerImpl1());
      handler1.doIt();

      Handler handler2 = new Handler(new InnerImpl2());
      handler2.doIt();

      Handler handler3 = new Handler(new InnerImpl3());
      handler3.doIt();
   }
}

public class Handler {
   Inner inner;
   public Handler(Inner i) {
      inner = i;
   }
   public int doIt() {
      return inner.getValue();
   }
}

abstract class Inner {
   public abstract int getValue();
}

Handler.doIt() is invoked with Handler.inner having more than 2 different types, so
TypeSpeculation is not used. When the JVM is using ParallelGC, C2 determines a concrete
type because it can see the value stored to the `inner` field in the constructor instead of
reading it from the field. I believe this is happening because of
MemNode::can_see_stored_value(). With this optimization the concrete subclass type
is known and the method is inlined. When using G1GC the GC write barrier contains
an Op_MemBarVolatile. I believe that volatile memory barrier generated for the field write
in the constructor stops the value from being visible after the write barrier. This forces the
read of `inner` in doIt() to happen and then the result only has a type of Inner so getValue()
cannot be inlined.

Is this a deficiency that should be investigated further to attempt a "fix"? I would like to
work on a solution, but I am looking for feedback on whether this is something the
community feels can and should be fixed.

Cheers,
Charlie Gracie

Extra information:
If the code is modified such that the allocations are on separate lines, then C2 can inline
the getValue() method when the JVM is using G1GC and ParallelGC. This is because the
constructor will directly follow the allocation of the Handler object. When this happens
the GC barrier can be elided so the original value being stored can be used instead of
having to do the read. I have a 2nd example [2] which can be used to demonstrate this.

[1] https://github.com/charliegracie/code-examples/tree/master/java/InlineTests
[2] https://github.com/charliegracie/code-examples/blob/master/java/InlineTests/TypeCheck2.java


From xxinliu at amazon.com  Tue Jul 28 20:56:25 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Tue, 28 Jul 2020 20:56:25 +0000
Subject: RFR[XS] 8249809 avoid calling DirectiveSet::clone(this) in
 compilecommand_compatibility_init
In-Reply-To: <CA+3eh10dQqSJyu+odn1zfLgMXw_oQaMN5+9icSp0LeERwui7JA@mail.gmail.com>
References: <1595807197546.52082@amazon.com>
 <CA+3eh10LEgJdcSAFrtsE=Ycy2oHJR3Hk2ddrTAFYLiC1E-sJ2w@mail.gmail.com>
 <1595907547514.55531@amazon.com>,
 <CA+3eh10dQqSJyu+odn1zfLgMXw_oQaMN5+9icSp0LeERwui7JA@mail.gmail.com>
Message-ID: <1595969785292.62158@amazon.com>

hi, Volker, 

Here is a new revision with cloned(). 
http://cr.openjdk.java.net/~xliu/8249809/01/webrev/

thanks,
--lx

________________________________________
From: Volker Simonis <volker.simonis at gmail.com>
Sent: Tuesday, July 28, 2020 6:06 AM
To: Liu, Xin
Cc: hotspot-compiler-dev at openjdk.java.net
Subject: RE: [EXTERNAL] RFR[XS] 8249809 avoid calling DirectiveSet::clone(this) in compilecommand_compatibility_init


On Tue, Jul 28, 2020 at 5:40 AM Liu, Xin <xxinliu at amazon.com> wrote:
>
> hi, Volker,
>
> Thank you for reviewing my patch.
>
> 1. Yes. I guess nodes are the major memory consumers, but compiler directives apply to both C1 and C2,
> so this can still reduce the memory footprint a little.
>
> 2.  The previous code sets the flag here. Both ControlIntrinsic and DisableIntrinsic belong to compilerdirectives_common_flags.
>
>  327 #define init_default_cc(name, type, dvalue, cc_flag) { type v; if (!_modified[name##Index] && CompilerOracle::has_option_value(method, #cc_flag, v) && v != this->name##Option) { set->name##Option = v; changed = true;} }
>  328     compilerdirectives_common_flags(init_default_cc)
>
> When method-level directives override the global directives, this code snippet sets 'changed'.
> Even though I removed the flag 'changed', I have the same logic in the smart pointer.
>
> 3.  The smart pointer DirectiveSetPtr needs to provide 2 kinds of access to the underlying pointer:
> one is read-only and the other is mutable.
>
> Ideally, it should have the following 2 operator->() overloads:
> DirectiveSet* operator->();
> DirectiveSet const* operator->();
>
> AFAIK, C++ doesn't support overloading on the return type alone, so we need to find a workaround.
> I provide an overloaded operator*() because it returns a reference to the object.  Users who want to modify the pointee have to explicitly dereference the smart pointer:
> (*set).member = newvalue;
> set->member = newvalue; // compiler error.
>
> Your approach also works, but you need to invoke const_cast<> in all the places where you want to read.
> I think my approach results in shorter code.
>
>
> I just came up with a new idea.  How about I provide a method cloned(), which returns the unqualified (non-const) pointer?
>
> +  DirectiveSet* cloned() {
> +    if (!_clone) {
> +      _clone = DirectiveSet::clone(_origin);
> +    }
> +    return _clone;
> +  }
> +
>    DirectiveSet* transfer() {
>      assert(_origin != NULL, "_origin is NULL! transfer() can only be invoked once.");
>      if (_clone != NULL) {
> @@ -340,7 +347,7 @@
>
>      if (CompilerOracle::should_print(method)) {
>        if (!_modified[PrintAssemblyIndex]) {
> -        (*set).PrintAssemblyOption = true;
> +        set.cloned()->PrintAssemblyOption = true;
>        }
>      }
>

Hi Xin,

I like this solution much better. It makes it clear that we want to
alter the state of the Directive Set. Can you please provide a new
webrev based on this idea?

Thank you and best regards,
Volker
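
For readers following the thread, the cloned() idea can be sketched outside HotSpot roughly as below. This is only an illustration under stated assumptions: the DirectiveSet here is a two-field stand-in, clone_count exists purely for demonstration, and transfer() from the actual patch is left out. It is not the code from the webrev.

```cpp
// Hypothetical stand-in for HotSpot's DirectiveSet; only two options are modeled.
struct DirectiveSet {
  bool LogOption = false;
  bool PrintAssemblyOption = false;

  static int clone_count;  // demonstration only: counts how often clone() ran

  static DirectiveSet* clone(const DirectiveSet* src) {
    ++clone_count;
    return new DirectiveSet(*src);
  }
};
int DirectiveSet::clone_count = 0;

// Copy-on-write handle: reads are served from the original until cloned() is called.
class DirectiveSetPtr {
  const DirectiveSet* _origin;
  DirectiveSet* _clone = nullptr;

 public:
  explicit DirectiveSetPtr(const DirectiveSet* origin) : _origin(origin) {}
  ~DirectiveSetPtr() { delete _clone; }
  DirectiveSetPtr(const DirectiveSetPtr&) = delete;
  DirectiveSetPtr& operator=(const DirectiveSetPtr&) = delete;

  // Read-only access: never allocates, returns the clone only if one already exists.
  const DirectiveSet* operator->() const {
    return _clone != nullptr ? _clone : _origin;
  }

  // Mutable access: clones on first use, so the intent to modify is visible at the call site.
  DirectiveSet* cloned() {
    if (_clone == nullptr) {
      _clone = DirectiveSet::clone(_origin);
    }
    return _clone;
  }
};
```

With this shape, `set->LogOption` reads without allocating, while `set.cloned()->PrintAssemblyOption = true;` triggers at most one clone no matter how many options are changed afterwards. A write through `set->` fails to compile because operator->() only hands out a pointer to const.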

> thanks,
> --lx
>
> ________________________________________
> From: Volker Simonis <volker.simonis at gmail.com>
> Sent: Monday, July 27, 2020 8:51 AM
> To: Liu, Xin
> Cc: hotspot-compiler-dev at openjdk.java.net
> Subject: RE: [EXTERNAL] RFR[XS] 8249809 avoid calling DirectiveSet::clone(this) in compilecommand_compatibility_init
>
>
> Hi Xin,
>
> I'm not sure if saving the allocation of a DirectiveSet has any
> visible effect compared to the much larger allocations required for
> the method compilation itself.
>
> Apart from that, I must confess that I'm not totally understanding the
> original logic. From what I see, it sets "changed" to true in the case
> where it changes the cloned DirectiveSet. But it doesn't do that in
> the cases where it only changes the clone's control word:
>
>  341         set->_intrinsic_control_words.fill_in(TriBool());
> ...
>  348           set->_intrinsic_control_words[id] = iter.is_enabled();
> ...
>  361         set->_intrinsic_control_words.fill_in(TriBool());
> ...
>  368           set->_intrinsic_control_words[id] = false;
>
> Why don't these mutations count as "changing"  the cloned DirectiveSet?
>
> After your patch, you've changed the above lines such that they will
> always create a clone which seems different from the initial
> behaviour.
>
> Which of the two behaviours is correct here, the original one, the new
> one after your change or doesn't it matter for reasons I don't
> understand?
>
>
> I also wonder why you need to overload both operators "operator*()"
> and "operator->()"? It seems a little bit arbitrary (and hard to
> understand for people reading the code) that "operator*()" clones the
> underlying directiveSet while "operator->()" uses the original one.
> Why not just define two versions of "operator->()" and let the
> compiler choose the right one like so:
>
> DirectiveSet const* operator->() const {
>   return !_clone ? _origin : _clone;
> }
>
> DirectiveSet* operator->() {
>   if (!_clone) {
>     _clone = DirectiveSet::clone(_origin);
>   }
>   return _clone;
> }
> ...
> if (!_modified[LogIndex]) {
>   bool log = CompilerOracle::should_log(method);
>   if (log != const_cast<const DirectiveSetPtr&>(set)->LogOption) {
>     set->LogOption = log;
>   }
> }
>
> Thank you and best regards,
> Volker
>
> On Mon, Jul 27, 2020 at 1:47 AM Liu, Xin <xxinliu at amazon.com> wrote:
> >
> > hi, Reviewers,
> >
> > Could you review this simple patch?
> > bug: https://bugs.openjdk.java.net/browse/JDK-8249809
> > webrev: https://cr.openjdk.java.net/~xliu/8249809/00/webrev/
> >
> > > When users specify a method-level compiler directive, the DirectiveSet is cloned for every single method being compiled. It's expensive but rarely hit. Actually, only user-specified methods need to clone the DirectiveSet. I introduce a smart pointer, DirectiveSetPtr. Its operator->() returns a pointer to a constant DirectiveSet, which is read-only. It doesn't clone the _origin until C2 needs to update its members. transfer() yields the ownership of the pointer.
> > >
> > > Test:
> > > Manually tested with different CompileCommand options.
> > hotspot:tier1 and gtest:all.
> >
> > thanks,
> > --lx
> >

From igor.ignatyev at oracle.com  Tue Jul 28 21:38:05 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 28 Jul 2020 14:38:05 -0700
Subject: RFR(T) : 8250739 : remove Compile::Generate_*_Graph methods
 declarations
Message-ID: <122EF98B-1ED5-4813-8D7C-7F6326D8ABD7@oracle.com>

http://cr.openjdk.java.net/~iignatyev//8250739/webrev.00/
> 8 lines changed: 0 ins; 8 del; 0 mod;

Hi all,

could you please review this trivial cleanup?

from JBS:
> Compile::Generate_Compiled_To_Interpreter_Graph and Generate_Interpreter_To_Compiled_Graph methods are declared but not defined (and not used)

webrev: http://cr.openjdk.java.net/~iignatyev//8250739/webrev.00/
JBS: https://bugs.openjdk.java.net/browse/JDK-8250739

Thanks,
-- Igor

From igor.ignatyev at oracle.com  Tue Jul 28 21:40:46 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 28 Jul 2020 14:40:46 -0700
Subject: RFR(T) : 8250738 :
 C2Compiler::is_intrinsic_supported(methodHandle&,bool) shouldn't be virtual
Message-ID: <F6E7E0CA-EA0C-419E-AEBD-47F39FE170E2@oracle.com>

http://cr.openjdk.java.net/~iignatyev//8250738/webrev.00
> 2 lines changed: 0 ins; 0 del; 2 mod; 

Hi all,

could you please review this trivial one-liner which removes virtual specifier from C2Compiler::is_intrinsic_supported(methodHandle&,bool)?

from JBS:
> C2Compiler::is_intrinsic_supported(methodHandle&,bool) is declared by C2Compiler which doesn't and shouldn't have any subclasses. 

webrev: http://cr.openjdk.java.net/~iignatyev//8250738/webrev.00
JBS: https://bugs.openjdk.java.net/browse/JDK-8250738

Thanks,
-- Igor


From xxinliu at amazon.com  Tue Jul 28 22:01:40 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Tue, 28 Jul 2020 22:01:40 +0000
Subject: RFR(T) : 8250738 :
 C2Compiler::is_intrinsic_supported(methodHandle&, 
 bool) shouldn't be virtual
In-Reply-To: <F6E7E0CA-EA0C-419E-AEBD-47F39FE170E2@oracle.com>
References: <F6E7E0CA-EA0C-419E-AEBD-47F39FE170E2@oracle.com>
Message-ID: <1595973700138.36641@amazon.com>

hi, Igor, 

I reviewed your code. You are right; I don't think it is intended to be virtual.

thanks,
--lx

________________________________________
From: hotspot-compiler-dev <hotspot-compiler-dev-retn at openjdk.java.net> on behalf of Igor Ignatyev <igor.ignatyev at oracle.com>
Sent: Tuesday, July 28, 2020 2:40 PM
To: hotspot compiler
Subject: [EXTERNAL] RFR(T) : 8250738 : C2Compiler::is_intrinsic_supported(methodHandle&, bool) shouldn't be virtual


http://cr.openjdk.java.net/~iignatyev//8250738/webrev.00
> 2 lines changed: 0 ins; 0 del; 2 mod;

Hi all,

could you please review this trivial one-liner which removes virtual specifier from C2Compiler::is_intrinsic_supported(methodHandle&,bool)?

from JBS:
> C2Compiler::is_intrinsic_supported(methodHandle&,bool) is declared by C2Compiler which doesn't and shouldn't have any subclasses.

webrev: http://cr.openjdk.java.net/~iignatyev//8250738/webrev.00
JBS: https://bugs.openjdk.java.net/browse/JDK-8250738

Thanks,
-- Igor


From vladimir.x.ivanov at oracle.com  Tue Jul 28 22:09:54 2020
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 29 Jul 2020 01:09:54 +0300
Subject: RFR (XXL): 8223347: Integration of Vector API (Incubator):
 Hotspot and x86 backend changes
In-Reply-To: <BYAPR11MB3543026ABF92A463562E5DE5EF7C0@BYAPR11MB3543.namprd11.prod.outlook.com>
References: <BYAPR11MB3543026ABF92A463562E5DE5EF7C0@BYAPR11MB3543.namprd11.prod.outlook.com>
Message-ID: <e4c76a68-97d1-6fe5-7a12-15ab04a216bf@oracle.com>


> Shared Hotspot:
>     Full: http://cr.openjdk.java.net/~sviswanathan/VAPI_RFR/hs_webrev/webrev.01/
>     Incremental: http://cr.openjdk.java.net/~sviswanathan/VAPI_RFR/hs_webrev/webrev.00-webrev.01/

FTR here are the latest changes in HotSpot shared code:
 
http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.01

Incremental changes:
 
http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.01_00

Best regards,
Vladimir Ivanov

> Older webrev links for your reference:
> Shared Hotspot: http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/

From vladimir.kozlov at oracle.com  Tue Jul 28 22:21:25 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 28 Jul 2020 15:21:25 -0700
Subject: RFR(T) : 8250739 : remove Compile::Generate_*_Graph methods
 declarations
In-Reply-To: <122EF98B-1ED5-4813-8D7C-7F6326D8ABD7@oracle.com>
References: <122EF98B-1ED5-4813-8D7C-7F6326D8ABD7@oracle.com>
Message-ID: <cd0a68ce-df0a-11dc-4fe2-2499b7574f62@oracle.com>

Cleanup is good and trivial.

The code was removed in JDK 6 as part of preparing for tiered JIT system "JDK-5082720: Remove adapter frames".

Thanks,
Vladimir K

On 7/28/20 2:38 PM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev//8250739/webrev.00/
>> 8 lines changed: 0 ins; 8 del; 0 mod;
> 
> Hi all,
> 
> could you please review this trivial cleanup?
> 
> from JBS:
>> Compile::Generate_Compiled_To_Interpreter_Graph and Generate_Interpreter_To_Compiled_Graph methods are declared but not defined (and not used)
> 
> webrev: http://cr.openjdk.java.net/~iignatyev//8250739/webrev.00/
> JBS: https://bugs.openjdk.java.net/browse/JDK-8250739
> 
> Thanks,
> -- Igor
> 

From vladimir.x.ivanov at oracle.com  Tue Jul 28 22:29:41 2020
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 29 Jul 2020 01:29:41 +0300
Subject: RFR (XXL): 8223347: Integration of Vector API (Incubator):
 General HotSpot changes
In-Reply-To: <c1bdf88c-5de2-d069-5f31-5a95c6988bf8@oracle.com>
References: <c1bdf88c-5de2-d069-5f31-5a95c6988bf8@oracle.com>
Message-ID: <38a7fe74-0c5e-4a28-b128-24c40b8ea01e@oracle.com>

Hi,

Thanks for the feedback on webrev.00, Remi, Coleen, Vladimir K., and 
Ekaterina!

Here are the latest changes for Vector API support in HotSpot shared code:
 
http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.01

Incremental changes (diff against webrev.00):
 
http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.01_00

I decided to post it here and not initiate a new round of reviews 
because the changes are mostly limited to minor cleanups / simple bug 
fixes.

Detailed summary:
   - rebased to jdk/jdk tip;
   - got rid of NotV, VLShiftV, VRShiftV, VURShiftV nodes;
   - restore lazy cleanup logic during incremental inlining (see 
needs_cleanup in compile.cpp);
   - got rid of x86-specific changes in shared code;
   - fix for 8244867 [1];
   - fix Graal test failure: enumerate VectorSupport intrinsics in 
CheckGraalIntrinsics;
   - numerous minor cleanups.

Best regards,
Vladimir Ivanov

[1] http://hg.openjdk.java.net/panama/dev/rev/dcfc7b6e8977
     http://jbs.oracle.com/browse/JDK-8244867
     8244867: 2 vector api tests crash with 
assert(is_reference_type(basic_type())) failed: wrong type
Summary: Adding safety checks to prevent intrinsification if class 
arguments of non-primitive types are uninitialized.

On 04.04.2020 02:12, Vladimir Ivanov wrote:
> Hi,
> 
> Following up on review requests of API [0] and Java implementation [1] 
> for Vector API (JEP 338 [2]), here's a request for review of general 
> HotSpot changes (in shared code) required for supporting the API:
> 
> 
> http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/all.00-03/ 
> 
> 
> (First of all, to set proper expectations: since the JEP is still in 
> Candidate state, the intention is to initiate preliminary round(s) of 
> review to inform the community and gather feedback before sending out 
> final/official RFRs once the JEP is Targeted to a release.)
> 
> Vector API (being developed in Project Panama [3]) relies on JVM support 
> to utilize optimal vector hardware instructions at runtime. It interacts 
> with JVM through intrinsics (declared in 
> jdk.internal.vm.vector.VectorSupport [4]) which expose vector operations 
> support in C2 JIT-compiler.
> 
> As Paul wrote earlier: "A vector intrinsic is an internal low-level 
> vector operation. The last argument to the intrinsic is fall back 
> behavior in Java, implementing the scalar operation over the number of 
> elements held by the vector. Thus, if the intrinsic is not supported in 
> C2 for the other arguments then the Java implementation is executed (the 
> Java implementation is always executed when running in the interpreter 
> or for C1)."
> 
> The rest of JVM support is about aggressively optimizing vector boxes to 
> minimize (ideally eliminate) the overhead of boxing for vector values.
> It's a stopgap solution for the vector box elimination problem until 
> inline classes arrive. Vector classes are value-based and in the longer 
> term will be migrated to inline classes once the support becomes available.
> 
> Vector API talk from JVMLS'18 [5] contains brief overview of JVM 
> implementation and some details.
> 
> Complete implementation resides in vector-unstable branch of panama/dev 
> repository [6].
> 
> Now to gory details (the patch is split in multiple "sub-webrevs"):
> 
> ===========================================================
> 
> (1) 
> http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/00.backend.shared/ 
> 
> 
> Ideal vector nodes for new operations introduced by Vector API.
> 
> (Platform-specific back end support will be posted for review separately).
> 
> ===========================================================
> 
> (2) 
> http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/01.intrinsics/ 
> 
> 
> JVM Java interface (VectorSupport) and intrinsic support in C2.
> 
> Vector instances are initially represented as VectorBox macro nodes and 
> "unboxing" is represented by VectorUnbox node. It simplifies vector box 
> elimination analysis and the nodes are expanded later right before EA pass.
> 
> Vectors have 2-level on-heap representation: for the vector value 
> primitive array is used as a backing storage and it is encapsulated in a 
> typed wrapper (e.g., Int256Vector - vector of 8 ints - contains a int[8] 
> instance which is used to store vector value).
> 
> Unless VectorBox node goes away, it needs to be expanded into an 
> allocation eventually, but it is a pure node and doesn't have any JVM 
> state associated with it. The problem is solved by keeping JVM state 
> separately in a VectorBoxAllocate node associated with VectorBox node 
> and use it during expansion.
> 
> Also, to simplify vector box elimination, inlining of vector reboxing 
> calls (VectorSupport::maybeRebox) is delayed until the analysis is over.
> 
> ===========================================================
> 
> (3) 
> http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/02.vbox_elimination/ 
> 
> 
> Vector box elimination analysis implementation. (Brief overview: slides 
> #36-42 [5].)
> 
> The main part is devoted to scalarization across safepoints and 
> rematerialization support during deoptimization. In C2-generated code 
> vector operations work with raw vector values which live in registers or 
> are spilled on the stack, which makes it possible to avoid boxing/unboxing when a 
> vector value is alive across a safepoint. As with other values, there's 
> just a location of the vector value at the safepoint and vector type 
> information recorded in the relevant nmethod metadata and all the 
> heavy-lifting happens only when rematerialization takes place.
> 
> The analysis preserves object identity invariants except during 
> aggressive reboxing (guarded by -XX:+EnableAggressiveReboxing).
> 
> (Aggressive reboxing is crucial for cases when vectors "escape": it 
> allocates a fresh instance at every escape point thus enabling original 
> instance to go away.)
> 
> ===========================================================
> 
> (4) 
> http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/03.module.hotspot/ 
> 
> 
> HotSpot changes for jdk.incubator.vector module. Vector support is 
> marked experimental and turned off by default. JEP 338 proposes the API 
> to be released as an incubator module, so a user has to specify 
> "--add-modules jdk.incubator.vector" on the command line to be able to 
> use it.
> When the user does that, the JVM automatically enables Vector API support.
> It improves usability (the user doesn't need to separately "open" the API 
> and enable JVM support) while minimizing the risk of destabilization from 
> new code when the API is not used.
> 
> 
> That's it! Will be happy to answer any questions.
> 
> And thanks in advance for any feedback!
> 
> Best regards,
> Vladimir Ivanov
> 
> [0] 
> https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-March/065345.html 
> 
> 
> [1] 
> https://mail.openjdk.java.net/pipermail/hotspot-dev/2020-April/041228.html
> 
> [2] https://openjdk.java.net/jeps/338
> 
> [3] https://openjdk.java.net/projects/panama/
> 
> [4] 
> http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/01.intrinsics/src/java.base/share/classes/jdk/internal/vm/vector/VectorSupport.java.html 
> 
> 
> [5] http://cr.openjdk.java.net/~vlivanov/talks/2018_JVMLS_VectorAPI.pdf
> 
> [6] http://hg.openjdk.java.net/panama/dev/shortlog/92bbd44386e9
> 
>      $ hg clone http://hg.openjdk.java.net/panama/dev/ -b vector-unstable

From vladimir.kozlov at oracle.com  Tue Jul 28 22:27:26 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 28 Jul 2020 15:27:26 -0700
Subject: RFR(T) : 8250738 :
 C2Compiler::is_intrinsic_supported(methodHandle&,bool) shouldn't be virtual
In-Reply-To: <F6E7E0CA-EA0C-419E-AEBD-47F39FE170E2@oracle.com>
References: <F6E7E0CA-EA0C-419E-AEBD-47F39FE170E2@oracle.com>
Message-ID: <19167625-9430-3230-87ea-e67b86728de0@oracle.com>

Good and trivial.

Thanks,
Vladimir K

On 7/28/20 2:40 PM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev//8250738/webrev.00
>> 2 lines changed: 0 ins; 0 del; 2 mod;
> 
> Hi all,
> 
> could you please review this trivial one-liner which removes virtual specifier from C2Compiler::is_intrinsic_supported(methodHandle&,bool)?
> 
> from JBS:
>> C2Compiler::is_intrinsic_supported(methodHandle&,bool) is declared by C2Compiler which doesn't and shouldn't have any subclasses.
> 
> webrev: http://cr.openjdk.java.net/~iignatyev//8250738/webrev.00
> JBS: https://bugs.openjdk.java.net/browse/JDK-8250738
> 
> Thanks,
> -- Igor
> 

From igor.ignatyev at oracle.com  Tue Jul 28 22:31:58 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 28 Jul 2020 15:31:58 -0700
Subject: RFR(T) : 8250739 : remove Compile::Generate_*_Graph methods
 declarations
In-Reply-To: <cd0a68ce-df0a-11dc-4fe2-2499b7574f62@oracle.com>
References: <122EF98B-1ED5-4813-8D7C-7F6326D8ABD7@oracle.com>
 <cd0a68ce-df0a-11dc-4fe2-2499b7574f62@oracle.com>
Message-ID: <7EBECEA2-FB20-4A46-8FBA-A3CA6DD5055D@oracle.com>

Thanks Vladimir, pushed.

-- Igor

> On Jul 28, 2020, at 3:21 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> Cleanup is good and trivial.
> 
> The code was removed in JDK 6 as part of preparing for tiered JIT system "JDK-5082720: Remove adapter frames".
> 
> Thanks,
> Vladimir K
> 
> On 7/28/20 2:38 PM, Igor Ignatyev wrote:
>> http://cr.openjdk.java.net/~iignatyev//8250739/webrev.00/
>>> 8 lines changed: 0 ins; 8 del; 0 mod;
>> Hi all,
>> could you please review this trivial cleanup?
>> from JBS:
>>> Compile::Generate_Compiled_To_Interpreter_Graph and Generate_Interpreter_To_Compiled_Graph methods are declared but not defined (and not used)
>> webrev: http://cr.openjdk.java.net/~iignatyev//8250739/webrev.00/
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8250739
>> Thanks,
>> -- Igor


From igor.ignatyev at oracle.com  Tue Jul 28 22:33:56 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 28 Jul 2020 15:33:56 -0700
Subject: RFR(T) : 8250738 :
 C2Compiler::is_intrinsic_supported(methodHandle&,bool) shouldn't be virtual
In-Reply-To: <19167625-9430-3230-87ea-e67b86728de0@oracle.com>
References: <F6E7E0CA-EA0C-419E-AEBD-47F39FE170E2@oracle.com>
 <19167625-9430-3230-87ea-e67b86728de0@oracle.com>
Message-ID: <1402219D-0403-47AD-82E0-1E73C7434815@oracle.com>

Vladimir, Xin,

thank you for your reviews, pushed.

-- Igor

> On Jul 28, 2020, at 3:27 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> Good and trivial.
> 
> Thanks,
> Vladimir K
> 
> On 7/28/20 2:40 PM, Igor Ignatyev wrote:
>> http://cr.openjdk.java.net/~iignatyev//8250738/webrev.00
>>> 2 lines changed: 0 ins; 0 del; 2 mod;
>> Hi all,
>> could you please review this trivial one-liner which removes virtual specifier from C2Compiler::is_intrinsic_supported(methodHandle&,bool)?
>> from JBS:
>>> C2Compiler::is_intrinsic_supported(methodHandle&,bool) is declared by C2Compiler which doesn't and shouldn't have any subclasses.
>> webrev: http://cr.openjdk.java.net/~iignatyev//8250738/webrev.00
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8250738
>> Thanks,
>> -- Igor


From kim.barrett at oracle.com  Wed Jul 29 00:34:35 2020
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 28 Jul 2020 20:34:35 -0400
Subject: [aarch64-port-dev ] RFR[XXS] 8248672: utilities: Introduce
 DEPRECATED macro for GCC and MSVC
In-Reply-To: <MWHPR21MB0511A0713E147045CA02522CB0730@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB05113A054F2D8291BFAFB085B0770@MWHPR21MB0511.namprd21.prod.outlook.com>
 <5e301790-8bfe-0ced-b5e2-8a9c76ae33de@oracle.com>
 <MWHPR21MB051146316FF56A63423DF514B0770@MWHPR21MB0511.namprd21.prod.outlook.com>
 <F50C37BA-FE47-465C-A60C-10547DCF8F69@oracle.com>
 <1259c3fd-b69c-6d81-0427-cb769f00bca5@redhat.com>
 <CD1AE64E-8555-4074-97EB-831BF0A72C61@oracle.com>
 <MWHPR21MB05117357CE8A5903943234C2B0750@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511A0713E147045CA02522CB0730@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <116277BD-EA21-49AA-8DE1-DBC06ED43C43@oracle.com>

> On Jul 28, 2020, at 12:22 PM, Ludovic Henry <luhenry at microsoft.com> wrote:
> 
> Hi,
> 
> I confirm that `= delete` works, and that we get a compile-time error if you try to use it.
> 
> Please find the updated webrev at http://cr.openjdk.java.net/~burban/luhenry/8248672/webrev.01
> 
> Thank you
> Ludovic

Looks good.

Probably the bug title should be updated.
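
As a side note for readers, the effect of `= delete` mentioned above can be illustrated with a small standalone sketch. The Timer type and its member names are hypothetical and are not taken from the webrev; the detection trait is only there so the "compile-time error" can be checked without actually breaking the build.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical API: the retired function stays declared, but as deleted.
struct Timer {
  long _nanos = 0;

  long elapsed_nanos() const { return _nanos; }
  long elapsed_ticks() const = delete;  // any call is rejected at compile time
};

// Detection helper: true only if t.elapsed_ticks() is a well-formed expression.
template <typename T, typename = void>
struct can_call_elapsed_ticks : std::false_type {};

template <typename T>
struct can_call_elapsed_ticks<
    T, decltype(void(std::declval<const T&>().elapsed_ticks()))>
    : std::true_type {};

// Selecting a deleted function is ill-formed, so the specialization drops out.
static_assert(!can_call_elapsed_ticks<Timer>::value,
              "elapsed_ticks() is deleted and must not be callable");
```

Unlike a deprecation attribute, which typically only produces a warning, a deleted declaration turns every remaining use into a hard error while still documenting that the name used to exist.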


From felix.yang at huawei.com  Wed Jul 29 03:20:34 2020
From: felix.yang at huawei.com (Yangfei (Felix))
Date: Wed, 29 Jul 2020 03:20:34 +0000
Subject: RFR(S): 8250609: C2 crash in IfNode::fold_compares
In-Reply-To: <d2e01ca9-1f00-1258-ce50-c27a99064cd2@oracle.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7E8D547@dggeml507-mbx.china.huawei.com>
 <23f0ab18-8bfb-3874-3000-ee2b37caca7c@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E8F0BE@dggeml507-mbx.china.huawei.com>
 <d2e01ca9-1f00-1258-ce50-c27a99064cd2@oracle.com>
Message-ID: <DA41BE1DDCA941489001C7FBD7A8820EE7E8F804@dggeml507-mbx.china.huawei.com>

Hi,

    Thanks for reviewing this.
    Committed to jdk/submit repo and test result received looks good.
    Will do the push.

Thanks,
Felix

> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Wednesday, July 29, 2020 1:28 AM
> To: Yangfei (Felix) <felix.yang at huawei.com>; hotspot-compiler-
> dev at openjdk.java.net
> Subject: Re: RFR(S): 8250609: C2 crash in IfNode::fold_compares
> 
> Yes, this looks good.
> 
> Thanks,
> Vladimir K
> 
> On 7/28/20 5:10 AM, Yangfei (Felix) wrote:
> > Hi,
> >
> >> -----Original Message-----
> >> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> >> Sent: Tuesday, July 28, 2020 3:15 AM
> >> To: Yangfei (Felix) <felix.yang at huawei.com>; hotspot-compiler-
> >> dev at openjdk.java.net
> >> Subject: Re: RFR(S): 8250609: C2 crash in IfNode::fold_compares
> >>
> >> It happens because 'lo' is a new node created just now and has no uses yet.
> >> For such new nodes we usually add a dummy use to avoid removal from the graph:
> >>
> >> http://hg.openjdk.java.net/jdk/jdk/file/c379dc750a02/src/hotspot/share/opto/convertnode.cpp#l403
> >
> > Thanks for the suggestions.  Yes, that will also fix the issue.
> > New webrev: http://cr.openjdk.java.net/~fyang/8250609/webrev.01/
> > Performed the same tests as before.  Does it look better?
> >
> > Felix
> >
> >> On 7/27/20 5:27 AM, Yangfei (Felix) wrote:
> >>> Hi,
> >>>
> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8250609
> >>> Webrev: http://cr.openjdk.java.net/~fyang/8250609/webrev.00/
> >>>
> >>> In IfNode::fold_compares_helper, C2 tries to fold 2 CmpI into a single CmpU.
> >>> At the crash site in IfNode::fold_compares_helper:
> >>>    995   if (lo && hi) {
> >>>    996     // Merge the two compares into a single unsigned compare by building (CmpU (n - lo) (hi - lo))
> >>>    997     Node* adjusted_val = igvn->transform(new SubINode(n,  lo));
> >>>    998     if (adjusted_lim == NULL) {
> >>>    999       adjusted_lim = igvn->transform(new SubINode(hi, lo));
> >>> 1000     }
> >>>
> >>> At line 997, we have:
> >>> (gdb) p lo->dump()
> >>>    641    AddI    === _  513  92  [[]]
> >>> $1 = void
> >>>
> >>> After the transformation at line 997, we have
> >>> (gdb) p lo->dump()
> >>>    641    AddI    === _ _ _  [[]]   [34200641]
> >>> $3 = void
> >>>
> >>> Then node 641 was used at line 999, which triggers the crash.
> >>> The patch fixes the issue by temporarily delaying transformation in IfNode::fold_compares.
> >>> Tier1-3 tested on aarch64-linux-gnu & x86_64-linux-gnu.
> >>> The newly added test fails without the patch and passes otherwise.
> >>> Suggestions?
> >>>
> >>> Thanks,
> >>> Felix
> >>>
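
For context, the fold discussed in this thread (two CmpI merged into one CmpU over n - lo and hi - lo) rests on a standard range-check identity. The sketch below shows it in plain C++ rather than in C2's node graph. The function names are made up for illustration, only the `lo <= n && n < hi` form is shown, and it assumes lo <= hi, which C2 would have to establish by its own means.

```cpp
#include <cstdint>

// Reference form: two signed compares.
inline bool in_range_two_compares(int32_t n, int32_t lo, int32_t hi) {
  return lo <= n && n < hi;
}

// Folded form: one unsigned compare, (n - lo) <u (hi - lo).
// Assumes lo <= hi. The subtractions are done in uint32_t so that
// wraparound is well defined in C++ (it mirrors Java int arithmetic).
inline bool in_range_one_compare(int32_t n, int32_t lo, int32_t hi) {
  uint32_t adjusted_val = static_cast<uint32_t>(n) - static_cast<uint32_t>(lo);
  uint32_t adjusted_lim = static_cast<uint32_t>(hi) - static_cast<uint32_t>(lo);
  return adjusted_val < adjusted_lim;
}
```

The trick works because any n below lo wraps around to a large unsigned value that is at least hi - lo, so the single unsigned compare rejects it together with the n >= hi case. Both adjusted values depend on lo, which is why the crash above appears once lo has been removed from the graph between the two transform calls.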

From jiefu at tencent.com  Wed Jul 29 03:43:22 2020
From: jiefu at tencent.com (=?utf-8?B?amllZnUo5YKF5p2wKQ==?=)
Date: Wed, 29 Jul 2020 03:43:22 +0000
Subject: RFR: 8250745: Fix a potential bug on AVX512 machines with
 assert(eval_map.contains(n)) failed: absent
Message-ID: <6E0374A1-1E57-4FF3-A8B0-BB605E5E8F68@tencent.com>

Hi all,

This bug[1] was first observed while testing Panama's vector api on AVX512 machines.
During the discussion on panama-dev, Vladimir Ivanov pointed out that this is not Vector API-specific[2].
So it would be better to fix the potential bug in the auto-vectorizer.

JBS:    https://bugs.openjdk.java.net/browse/JDK-8250745
Webrev: http://cr.openjdk.java.net/~jiefu/8250745/webrev.00/

Testing:
  1. jdk/jdk: tier1-3 on Linux/x64 AVX512 machines
  2. Panama(vectorIntrinsics): jdk/incubator/vector on Linux/x64 AVX512 machines

Thanks a lot.
Best regards,
Jie

[1] https://bugs.openjdk.java.net/browse/JDK-8250675
[2] https://mail.openjdk.java.net/pipermail/panama-dev/2020-July/010113.html

From christian.hagedorn at oracle.com  Wed Jul 29 07:03:10 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Wed, 29 Jul 2020 09:03:10 +0200
Subject: RFR(S): 8250609: C2 crash in IfNode::fold_compares
In-Reply-To: <DA41BE1DDCA941489001C7FBD7A8820EE7E8F804@dggeml507-mbx.china.huawei.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7E8D547@dggeml507-mbx.china.huawei.com>
 <23f0ab18-8bfb-3874-3000-ee2b37caca7c@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E8F0BE@dggeml507-mbx.china.huawei.com>
 <d2e01ca9-1f00-1258-ce50-c27a99064cd2@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E8F804@dggeml507-mbx.china.huawei.com>
Message-ID: <a9d89b82-a20d-c9b8-cd56-ae456a97b67a@oracle.com>

Hi Felix

Looks good to me.

Just some minor comments (no new webrev required):
- L996: The asterisk should be at the type (Node*)
- L997/1003: You could remove one extra whitespace before the comment starts

Best regards,
Christian

On 29.07.20 05:20, Yangfei (Felix) wrote:
> Hi,
> 
>      Thanks for reviewing this.
>      Committed to jdk/submit repo and test result received looks good.
>      Will do the push.
> 
> Thanks,
> Felix
> 
>> -----Original Message-----
>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>> Sent: Wednesday, July 29, 2020 1:28 AM
>> To: Yangfei (Felix) <felix.yang at huawei.com>; hotspot-compiler-
>> dev at openjdk.java.net
>> Subject: Re: RFR(S): 8250609: C2 crash in IfNode::fold_compares
>>
>> Yes, this looks good.
>>
>> Thanks,
>> Vladimir K
>>
>> On 7/28/20 5:10 AM, Yangfei (Felix) wrote:
>>> Hi,
>>>
>>>> -----Original Message-----
>>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>>>> Sent: Tuesday, July 28, 2020 3:15 AM
>>>> To: Yangfei (Felix) <felix.yang at huawei.com>; hotspot-compiler-
>>>> dev at openjdk.java.net
>>>> Subject: Re: RFR(S): 8250609: C2 crash in IfNode::fold_compares
>>>>
>>>> It happens because 'lo' is a new node created just now and has no uses yet.
>>>> For such new nodes we usually add a dummy use to avoid removal from the graph:
>>>>
>>>> http://hg.openjdk.java.net/jdk/jdk/file/c379dc750a02/src/hotspot/share/opto/convertnode.cpp#l403
>>>
>>> Thanks for the suggestions.  Yes, that will also fix the issue.
>>> New webrev: http://cr.openjdk.java.net/~fyang/8250609/webrev.01/
>>> Performed the same tests as before.  Does it look better?
>>>
>>> Felix
>>>
>>>> On 7/27/20 5:27 AM, Yangfei (Felix) wrote:
>>>>> Hi,
>>>>>
>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8250609
>>>>> Webrev: http://cr.openjdk.java.net/~fyang/8250609/webrev.00/
>>>>>
>>>>> In IfNode::fold_compares_helper, C2 tries to fold 2 CmpI into a single CmpU.
>>>>> At the crash site in IfNode::fold_compares_helper:
>>>>>     995   if (lo && hi) {
>>>>>     996     // Merge the two compares into a single unsigned compare by building (CmpU (n - lo) (hi - lo))
>>>>>     997     Node* adjusted_val = igvn->transform(new SubINode(n,  lo));
>>>>>     998     if (adjusted_lim == NULL) {
>>>>>     999       adjusted_lim = igvn->transform(new SubINode(hi, lo));
>>>>> 1000     }
>>>>>
>>>>> At line 997, we have:
>>>>> (gdb) p lo->dump()
>>>>>     641    AddI    === _  513  92  [[]]
>>>>> $1 = void
>>>>>
>>>>> After the transformation at line 997, we have
>>>>> (gdb) p lo->dump()
>>>>>     641    AddI    === _ _ _  [[]]   [34200641]
>>>>> $3 = void
>>>>>
>>>>> Then node 641 was used at line 999, which triggers the crash.
>>>>> Patch fixes the issue by delaying transformation in
>>>>> IfNode::fold_compares
>>>> temporarily.
>>>>> Tier1-3 tested on aarch64-linux-gnu & x86_64-linux-gnu.
>>>>> Newly added test fail without the patch and pass otherwise.
>>>>> Suggestions?
>>>>>
>>>>> Thanks,
>>>>> Felix
>>>>>

From felix.yang at huawei.com  Wed Jul 29 07:17:47 2020
From: felix.yang at huawei.com (Yangfei (Felix))
Date: Wed, 29 Jul 2020 07:17:47 +0000
Subject: RFR(S): 8250609: C2 crash in IfNode::fold_compares
In-Reply-To: <a9d89b82-a20d-c9b8-cd56-ae456a97b67a@oracle.com>
References: <DA41BE1DDCA941489001C7FBD7A8820EE7E8D547@dggeml507-mbx.china.huawei.com>
 <23f0ab18-8bfb-3874-3000-ee2b37caca7c@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E8F0BE@dggeml507-mbx.china.huawei.com>
 <d2e01ca9-1f00-1258-ce50-c27a99064cd2@oracle.com>
 <DA41BE1DDCA941489001C7FBD7A8820EE7E8F804@dggeml507-mbx.china.huawei.com>
 <a9d89b82-a20d-c9b8-cd56-ae456a97b67a@oracle.com>
Message-ID: <DA41BE1DDCA941489001C7FBD7A8820EE7E8F9E4@dggeml507-mbx.china.huawei.com>

Hi Christian,

Thanks for the careful reviewing :-)
That will be easy to fix and I will modify when I push.

Felix

> -----Original Message-----
> From: Christian Hagedorn [mailto:christian.hagedorn at oracle.com]
> Sent: Wednesday, July 29, 2020 3:03 PM
> To: Yangfei (Felix) <felix.yang at huawei.com>; Vladimir Kozlov
> <vladimir.kozlov at oracle.com>; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR(S): 8250609: C2 crash in IfNode::fold_compares
> 
> Hi Felix
> 
> Looks good to me.
> 
> Just some minor comments (no new webrev required):
> - L996: The asterisk should be at the type (Node*)
> - L997/1003: You could remove one extra whitespace before the comment
> starts
> 
> Best regards,
> Christian
> 
> On 29.07.20 05:20, Yangfei (Felix) wrote:
> > Hi,
> >
> >      Thanks for reviewing this.
> >      Committed to jdk/submit repo and test result received looks good.
> >      Will do the push.
> >
> > Thanks,
> > Felix
> >
> >> -----Original Message-----
> >> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> >> Sent: Wednesday, July 29, 2020 1:28 AM
> >> To: Yangfei (Felix) <felix.yang at huawei.com>; hotspot-compiler-
> >> dev at openjdk.java.net
> >> Subject: Re: RFR(S): 8250609: C2 crash in IfNode::fold_compares
> >>
> >> Yes, this looks good.
> >>
> >> Thanks,
> >> Vladimir K
> >>
> >> On 7/28/20 5:10 AM, Yangfei (Felix) wrote:
> >>> Hi,
> >>>
> >>>> -----Original Message-----
> >>>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> >>>> Sent: Tuesday, July 28, 2020 3:15 AM
> >>>> To: Yangfei (Felix) <felix.yang at huawei.com>; hotspot-compiler-
> >>>> dev at openjdk.java.net
> >>>> Subject: Re: RFR(S): 8250609: C2 crash in IfNode::fold_compares
> >>>>
> >>>> It happens because 'lo' is new node created just now and have no
> >>>> uses
> >> yet.
> >>>> For such new nodes we usually add dummy use to avoid removal from
> >> graph:
> >>>>
> >>>> http://hg.openjdk.java.net/jdk/jdk/file/c379dc750a02/src/hotspot/sh
> >>>> ar
> >>>> e/op
> >>>> to/convertnode.cpp#l403
> >>>
> >>> Thanks for the suggestions.  Yes, that will also fix the issue.
> >>> New webrev: http://cr.openjdk.java.net/~fyang/8250609/webrev.01/
> >>> Performed the same tests as before.  Does it look better?
> >>>
> >>> Felix
> >>>
> >>>> On 7/27/20 5:27 AM, Yangfei (Felix) wrote:
> >>>>> Hi,
> >>>>>
> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8250609
> >>>>> Webrev: http://cr.openjdk.java.net/~fyang/8250609/webrev.00/
> >>>>>
> >>>>> In IfNode::fold_compares_helper, C2 tries to fold 2 CmpI into a
> >>>>> single
> >>>> CmpU.
> >>>>> At the crash site in IfNode::fold_compares_helper:
> >>>>>     995   if (lo && hi) {
> >>>>>     996     // Merge the two compares into a single unsigned compare by
> >>>> building (CmpU (n - lo) (hi - lo))
> >>>>>     997     Node* adjusted_val = igvn->transform(new SubINode(n,  lo));
> >>>>>     998     if (adjusted_lim == NULL) {
> >>>>>     999       adjusted_lim = igvn->transform(new SubINode(hi, lo));
> >>>>> 1000     }
> >>>>>
> >>>>> At line 997, we have:
> >>>>> (gdb) p lo->dump()
> >>>>>     641    AddI    === _  513  92  [[]]
> >>>>> $1 = void
> >>>>>
> >>>>> After the transformation at line 997, we have
> >>>>> (gdb) p lo->dump()
> >>>>>     641    AddI    === _ _ _  [[]]   [34200641]
> >>>>> $3 = void
> >>>>>
> >>>>> Then node 641 was used at line 999, which triggers the crash.
> >>>>> Patch fixes the issue by delaying transformation in
> >>>>> IfNode::fold_compares
> >>>> temporarily.
> >>>>> Tier1-3 tested on aarch64-linux-gnu & x86_64-linux-gnu.
> >>>>> Newly added test fail without the patch and pass otherwise.
> >>>>> Suggestions?
> >>>>>
> >>>>> Thanks,
> >>>>> Felix
> >>>>>
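
The fold discussed in this thread, building (CmpU (n - lo) (hi - lo)) from two signed compares, can be sketched outside the compiler. This is only an illustration, not the HotSpot code: the function names are made up, it assumes lo <= hi, and it does the subtractions in uint32_t so that wraparound is well defined in C++ while matching 32-bit Java int arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Reference form: two signed compares, lo <= n && n < hi.
bool in_range_signed(int32_t n, int32_t lo, int32_t hi) {
  return lo <= n && n < hi;
}

// Folded form: a single unsigned compare of (n - lo) against (hi - lo).
// Assumes lo <= hi; without that the two forms can disagree.
bool in_range_unsigned(int32_t n, int32_t lo, int32_t hi) {
  uint32_t adjusted_val = static_cast<uint32_t>(n) - static_cast<uint32_t>(lo);
  uint32_t adjusted_lim = static_cast<uint32_t>(hi) - static_cast<uint32_t>(lo);
  return adjusted_val < adjusted_lim;
}
```

A value below lo makes (n - lo) wrap to a large unsigned number, so the single compare rejects it just as the lower-bound test would.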

From tobias.hartmann at oracle.com  Wed Jul 29 07:38:50 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Wed, 29 Jul 2020 09:38:50 +0200
Subject: RFR[XS] 8249809 avoid calling DirectiveSet::clone(this) in
 compilecommand_compatibility_init
In-Reply-To: <1595969785292.62158@amazon.com>
References: <1595807197546.52082@amazon.com>
 <CA+3eh10LEgJdcSAFrtsE=Ycy2oHJR3Hk2ddrTAFYLiC1E-sJ2w@mail.gmail.com>
 <1595907547514.55531@amazon.com>
 <CA+3eh10dQqSJyu+odn1zfLgMXw_oQaMN5+9icSp0LeERwui7JA@mail.gmail.com>
 <1595969785292.62158@amazon.com>
Message-ID: <eba1ae8f-9409-48d2-64fa-6bb1e069bb6d@oracle.com>

Hi Xin,

On 28.07.20 22:56, Liu, Xin wrote:
> http://cr.openjdk.java.net/~xliu/8249809/01/webrev/

Overall looks good to me.

Some style comments:
- Add a comment to 'DirectiveSetPtr' to describe its purpose
- Why not put the "cloned" logic in "operator->"?
- Do not use the _clone pointer as boolean (see "Miscellaneous" section in the style guide [1])
- Indentation in line 301-303 is wrong
- Line 306 use brackets around the "else" and move it one line up "} else {"

Best regards,
Tobias

[1] https://hg.openjdk.java.net/jdk/jdk/raw-file/tip/doc/hotspot-style.html

From sergei.tsypanov at yandex.ru  Wed Jul 29 07:43:10 2020
From: sergei.tsypanov at yandex.ru (=?utf-8?B?0KHQtdGA0LPQtdC5INCm0YvQv9Cw0L3QvtCy?=)
Date: Wed, 29 Jul 2020 09:43:10 +0200
Subject: Performance degradation due to probable (?) C2 issue
In-Reply-To: <da261d22-3801-bf63-b44c-c181e80f6209@redhat.com>
References: <925401595926726@mail.yandex.ru>
 <da261d22-3801-bf63-b44c-c181e80f6209@redhat.com>
Message-ID: <79821596008482@mail.yandex.ru>

Hi Andrew,

your suggestion was correct: with -XX:InlineSmallCode=1000
patched code works faster than original as expected.

Thanks for explaining that to me!

28.07.2020, 14:12, "Andrew Haley" <aph at redhat.com>:
> Hi,
>
> On 28/07/2020 11:35, Сергей Цыпанов wrote:
>
>> So my question is whether there's something wrong with the compiler or
>> the original idea of improvement was wrong?
>
> No, and (probably) no.
>
> C2 uses a bunch of heuristics. Here, it's detected that
> CharArrayWriter::toString is large and has already been compiled so
> there's no sense inlining another copy of it. This isn't necessarily
> true, but it's a good guess. Try playing with InlineSmallCode: start
> with =1000, and increase it from there to see if it helps.
>
> --
> Andrew Haley (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From vladimir.x.ivanov at oracle.com  Wed Jul 29 09:20:10 2020
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 29 Jul 2020 12:20:10 +0300
Subject: RFR: 8250745: Fix a potential bug on AVX512 machines with
 assert(eval_map.contains(n)) failed: absent
In-Reply-To: <6E0374A1-1E57-4FF3-A8B0-BB605E5E8F68@tencent.com>
References: <6E0374A1-1E57-4FF3-A8B0-BB605E5E8F68@tencent.com>
Message-ID: <f4a157c5-4ac4-07b8-5d94-6d01f5000674@oracle.com>

> Webrev: http://cr.openjdk.java.net/~jiefu/8250745/webrev.00/

Looks good.

FTR the bug was introduced by JDK-8241040, but I don't see a way it can 
be hit by auto-vectorizer: before it kicks in, scalar code is strongly 
normalized and constants are pushed to the right. It leads to the shape 
where (Replicate -1) is always the second input of bitwise NOT shape 
(XorV v (Replicate -1)). Since there are no GVN transformations 
happening for vector nodes, both left-hand and right-hand variants 
become possible with Vector API.

Best regards,
Vladimir Ivanov
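
A minimal sketch of why both operand orders matter for the bitwise NOT shape described above. The Node struct, the opcode names and the helper functions are hypothetical and far simpler than HotSpot's IR; the only point is that a matcher which cannot rely on constants being normalized to the right has to test both inputs.

```cpp
#include <cassert>

// Hypothetical, heavily simplified IR node; not HotSpot's Node class.
enum Opcode { Op_Value, Op_ReplicateMinusOne, Op_XorV };

struct Node {
  Opcode op;
  const Node* in1;
  const Node* in2;
};

// True if n stands for a vector with all bits set, i.e. (Replicate -1).
bool is_all_ones(const Node* n) {
  return n != nullptr && n->op == Op_ReplicateMinusOne;
}

// Recognise bitwise NOT written as (XorV v allones) or (XorV allones v).
bool is_vector_bitwise_not(const Node* n) {
  if (n == nullptr || n->op != Op_XorV) {
    return false;
  }
  return is_all_ones(n->in1) || is_all_ones(n->in2);
}
```

Checking only in2 would be enough for graphs produced by the auto-vectorizer, but would miss the left-hand variant that the Vector API can create.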

From aph at redhat.com  Wed Jul 29 11:44:50 2020
From: aph at redhat.com (Andrew Haley)
Date: Wed, 29 Jul 2020 12:44:50 +0100
Subject: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API
 (Incubator): AArch64 backend changes
In-Reply-To: <54d6b2b6-b79a-4700-981c-6ab33aca82f2@arm.com>
References: <VI1PR0802MB2558C90ACF1189F17692FF278EB90@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <275eb57c-51c0-675e-c32a-91b198023559@redhat.com>
 <719F9169-ABC4-408E-B732-F1BD9A84337F@oracle.com>
 <9a13f5df-d946-579d-4282-917dc7338dc8@redhat.com>
 <09BC0693-80E0-4F87-855E-0B38A6F5EFA2@oracle.com>
 <668e500e-f621-5a2c-a41e-f73536880f73@redhat.com>
 <VI1PR0802MB255845256424F9EB4D5D81368EB30@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <1909fa9d-98bb-c2fb-45d8-540247d1ca8b@redhat.com>
 <VI1PR0802MB255895ADE9BC8D027EBB81048EB10@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <BYAPR11MB35438110FA50851942CE17A1EF950@BYAPR11MB3543.namprd11.prod.outlook.com>
 <HE1PR0802MB25542AE212E12FD3EF6D1BE48E6E0@HE1PR0802MB2554.eurprd08.prod.outlook.com>
 <2acbcc99-8dd4-b8f1-5982-1d439953c416@redhat.com>
 <VI1PR0802MB2558ECC61528694DBDFFF48F8E670@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <54d6b2b6-b79a-4700-981c-6ab33aca82f2@arm.com>
Message-ID: <852a3a09-a627-c0fc-89c6-8c8100ae17f5@redhat.com>

On 20/07/2020 04:51, Ningsheng Jian wrote:
> Since we are getting ready to propose Vector API target to JDK 16 [1]. I 
> have regenerated webrev of aarch64 backend parts from panama repo, which 
> has been rebased to jdk/jdk very recently, by:
> 
> $ hg update vector-unstable && hg diff -r default > all.patch
> $ grep "diff -r" all.patch | grep -e "src/hotspot/cpu/aarch64" | awk 
> '{print $4}' > aarch64_list
> $ ksh ./webrev.ksh -r default -o aarch64_webrev aarch64_list
> 
> The new webrev:
> http://cr.openjdk.java.net/~njian/vectorapi/8223347-integration/aarch64-webrev.01/
> 
> Could you please help to take a look?

OK, thanks. It all looks fine. Sorry for the delay.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From luhenry at microsoft.com  Wed Jul 29 13:59:47 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Wed, 29 Jul 2020 13:59:47 +0000
Subject: [aarch64-port-dev ] RFR[XXS] 8248672: utilities: Introduce
 DEPRECATED macro for GCC and MSVC
In-Reply-To: <116277BD-EA21-49AA-8DE1-DBC06ED43C43@oracle.com>
References: <MWHPR21MB05113A054F2D8291BFAFB085B0770@MWHPR21MB0511.namprd21.prod.outlook.com>
 <5e301790-8bfe-0ced-b5e2-8a9c76ae33de@oracle.com>
 <MWHPR21MB051146316FF56A63423DF514B0770@MWHPR21MB0511.namprd21.prod.outlook.com>
 <F50C37BA-FE47-465C-A60C-10547DCF8F69@oracle.com>
 <1259c3fd-b69c-6d81-0427-cb769f00bca5@redhat.com>
 <CD1AE64E-8555-4074-97EB-831BF0A72C61@oracle.com>
 <MWHPR21MB05117357CE8A5903943234C2B0750@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB0511A0713E147045CA02522CB0730@MWHPR21MB0511.namprd21.prod.outlook.com>,
 <116277BD-EA21-49AA-8DE1-DBC06ED43C43@oracle.com>
Message-ID: <MWHPR21MB0511A595027B7328169F3816B0700@MWHPR21MB0511.namprd21.prod.outlook.com>

Hi Kim,

I just had it updated.

Thanks

________________________________________
From: Kim Barrett <kim.barrett at oracle.com>
Sent: Tuesday, July 28, 2020 17:34
To: Ludovic Henry
Cc: Andrew Haley; Vladimir Kozlov; hotspot-compiler-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net; openjdk-aarch64; hotspot-gc-dev at openjdk.java.net
Subject: Re: [aarch64-port-dev ] RFR[XXS] 8248672: utilities: Introduce DEPRECATED macro for GCC and MSVC

> On Jul 28, 2020, at 12:22 PM, Ludovic Henry <luhenry at microsoft.com> wrote:
>
> Hi,
>
> I confirm that `= delete` works, and that we get a compile-time error if you try to use it.
>
> Please find the updated webrev at http://cr.openjdk.java.net/~burban/luhenry/8248672/webrev.01
>
> Thank you
> Ludovic

Looks good.

Probably the bug title should be updated.


From volker.simonis at gmail.com  Wed Jul 29 14:34:08 2020
From: volker.simonis at gmail.com (Volker Simonis)
Date: Wed, 29 Jul 2020 16:34:08 +0200
Subject: RFR[XS] 8249809 avoid calling DirectiveSet::clone(this) in
 compilecommand_compatibility_init
In-Reply-To: <eba1ae8f-9409-48d2-64fa-6bb1e069bb6d@oracle.com>
References: <1595807197546.52082@amazon.com>
 <CA+3eh10LEgJdcSAFrtsE=Ycy2oHJR3Hk2ddrTAFYLiC1E-sJ2w@mail.gmail.com>
 <1595907547514.55531@amazon.com>
 <CA+3eh10dQqSJyu+odn1zfLgMXw_oQaMN5+9icSp0LeERwui7JA@mail.gmail.com>
 <1595969785292.62158@amazon.com>
 <eba1ae8f-9409-48d2-64fa-6bb1e069bb6d@oracle.com>
Message-ID: <CA+3eh10yzWC1phhCSv9CMZRiH=A1ySJ6ij5jTQq0jeGpttxi6g@mail.gmail.com>

On Wed, Jul 29, 2020 at 9:38 AM Tobias Hartmann
<tobias.hartmann at oracle.com> wrote:
>
> Hi Xin,
>
> On 28.07.20 22:56, Liu, Xin wrote:
> > http://cr.openjdk.java.net/~xliu/8249809/01/webrev/
>
> Overall looks good to me.
>
> Some style comments:
> - Add a comment to 'DirectiveSetPtr' to describe its purpose
> - Why not put the "cloned" logic in "operator->"?

Because there's also a "read-only" access of the DirectiveSetPtr
which doesn't mutate its content and therefore shouldn't clone the
underlying DirectiveSet. See my first mail, where I proposed adding a
second, `const` version of "operator->". But that still required const
casts in the places where we didn't want to clone. I therefore
voted for the new "cloned()" method, which makes cloning and mutating
explicit and is much easier to understand from my point of view
(compared to two overloaded operators).

> - Do not use the _clone pointer as boolean (see "Miscellaneous" section in the style guide [1])
> - Indentation in line 301-303 is wrong
> - Line 306 use brackets around the "else" and move it one line up "} else {"
>
> Best regards,
> Tobias
>
> [1] https://hg.openjdk.java.net/jdk/jdk/raw-file/tip/doc/hotspot-style.html
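
The design argued for above, read-only access through "operator->" and an explicit "cloned()" for mutation, can be sketched roughly as follows. This is an assumption-laden illustration and not the code in the webrev: DirectiveSet is reduced to a single made-up flag, and the members _origin, _clone and is_cloned() are invented for the example.

```cpp
#include <cassert>

// Hypothetical stand-in for HotSpot's DirectiveSet; only one flag is modelled.
struct DirectiveSet {
  bool log = false;
};

// Hypothetical copy-on-write handle, loosely modelled on the idea under review.
class DirectiveSetPtr {
 public:
  explicit DirectiveSetPtr(DirectiveSet* origin) : _origin(origin), _clone(nullptr) {}
  ~DirectiveSetPtr() { delete _clone; }
  DirectiveSetPtr(const DirectiveSetPtr&) = delete;
  DirectiveSetPtr& operator=(const DirectiveSetPtr&) = delete;

  // Read-only access: never clones.
  const DirectiveSet* operator->() const {
    return (_clone != nullptr) ? _clone : _origin;
  }

  // Mutating access: clones on first use, so the copy is explicit at the call site.
  DirectiveSet* cloned() {
    if (_clone == nullptr) {
      _clone = new DirectiveSet(*_origin);
    }
    return _clone;
  }

  bool is_cloned() const { return _clone != nullptr; }

 private:
  DirectiveSet* _origin;  // not owned
  DirectiveSet* _clone;   // owned, created lazily
};
```

With a single const "operator->", a reader can never trigger a clone by accident, and the pointer is compared against nullptr explicitly rather than used as a boolean.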

From jatin.bhateja at intel.com  Wed Jul 29 14:45:49 2020
From: jatin.bhateja at intel.com (Bhateja, Jatin)
Date: Wed, 29 Jul 2020 14:45:49 +0000
Subject: RFR[S] : 8248830 : C2 : Rotate API intrinsification for X86
In-Reply-To: <dd691913-d9c7-2657-905f-4f3df50f6bb4@oracle.com>
References: <MWHPR11MB1614EAFF216144FE6EAE68F9E87F0@MWHPR11MB1614.namprd11.prod.outlook.com>
 <92d97d1b-fc53-e368-b249-1cab7db33964@oracle.com>
 <MWHPR11MB1614CB6E26028AC98DAA7F30E8790@MWHPR11MB1614.namprd11.prod.outlook.com>
 <dd691913-d9c7-2657-905f-4f3df50f6bb4@oracle.com>
Message-ID: <MWHPR11MB1614E047E14386D3B51EA3A9E8700@MWHPR11MB1614.namprd11.prod.outlook.com>

Hi Vladimir,

Thanks for the pointers, following is the link to updated patch:

http://cr.openjdk.java.net/~jbhateja/8248830/webrev.04/

> I'd prefer to see a uniform Ideal IR shape being used irrespective of
> whether the argument is a constant or not. It should also simplify the
> logic in SuperWord and make it easier to support on non-x86 architectures.
> 
> For example, here's how it is done on AArch64:
> 
> instruct vsll4I_imm(vecX dst, vecX src, immI shift) %{
>    predicate(n->as_Vector()->length() == 4);
>    match(Set dst (LShiftVI src (LShiftCntV shift))); ...
>

Graph shape has been made consistent, we could have also optimized the pattern for ARM port for
immediate shifts.

> #  Internal Error (.../src/hotspot/share/opto/phaseX.cpp:1238),
> pid=5476, tid=6219
> #  assert((i->_idx >= k->_idx) || i->is_top()) failed: Idealize should
> return new nodes, use Identity to return old nodes
> 
> I believe it comes from RotateLeftNode::Ideal/RotateRightNode::Ideal
> which can return pre-constructed constants. I suggest to get rid of
> Ideal() methods and move constant folding logic into Node::Value() (as
> implemented for other bitwise/arithmetic nodes in
> addnode.cpp/subnode.cpp/mulnode.cpp et al). It's a more generic approach
> since it enables richer type information (ranges vs constants) and IMO it's
> more convenient to work with constants through Types than ConNodes.

I have removed the RotateLeftNode/RotateRightNode::Ideal routines, since we are anyway
doing constant folding in the LShiftI/URShiftI Value routines. Since the Java rotate APIs are no longer
intrinsified, these routines may no longer be useful.

> 
> It would be really nice to migrate to MacroAssembler along the way (as a
> cleanup).

I guess you are suggesting removing opcodes/encodings from the patterns and moving them to the Assembler.
Can we take up this cleanup activity separately, since other patterns are also using these matcher
directives?

Other syntactic comments have been taken care of. I have extended the test to cover all the newly
added scalar transforms. Kindly let me know if there are other comments.

Best Regards,
Jatin

> -----Original Message-----
> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
> Sent: Friday, July 24, 2020 3:21 AM
> To: Bhateja, Jatin <jatin.bhateja at intel.com>
> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; Andrew Haley
> <aph at redhat.com>; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR[S] : 8248830 : C2 : Rotate API intrinsification for X86
> 
> Hi Jatin,
> 
> > http://cr.openjdk.java.net/~jbhateja/8248830/webrev.03/
> 
> Much better! Thanks.
> 
> > Change Summary:
> >
> > 1) Unified the handling for scalar rotate operation. All scalar rotate
> selection patterns are now dependent on newly created
> RotateLeft/RotateRight nodes. This promotes rotate inferencing. Currently
> if DAG nodes corresponding to a sub-pattern are shared (have multiple
> users) then existing complex patterns based on Or/LShiftL/URShift does not
> get matched and this prevents inferring rotate nodes. Please refer to
> JIT'ed assembly output with baseline[1] and with patch[2] . We can see that
> generated code size also went down from 832 bytes to 768 bytes. Also this
> can cause perf degradation if shift-or dependency chain appears inside a
> hot region.
> >
> > 2) Due to enhanced rotate inferencing new patch shows better performance
> even for legacy targets (non AVX-512). Please refer to the perf result[3]
> over AVX2 machine for JMH benchmark part of the patch.
> 
> Very nice!
> > 3) As suggested, removed Java API intrinsification changes and scalar
> rotate transformation are done during OrI/OrL node idealizations.
> 
> Good.
> 
> (Still would be nice to factor the matching code from Ideal() and share it
> between multiple use sites. Especially considering OrVNode::Ideal() now
> does basically the same thing. As an example/idea, take a look at
> is_bmi_pattern() in x86.ad.)
> 
> > 4) SLP always gets to work on new scalar Rotate nodes and creates vector
> rotate nodes which are degenerated into OrV/LShiftV/URShiftV nodes if
> target does not supports vector rotates(non-AVX512).
> 
> Good.
> 
> > 5) Added new instruction patterns for vector shift Left/Right operations
> with constant shift operands. This prevents emitting extra moves to XMM.
> 
> +instruct vshiftI_imm(vec dst, vec src, immI8 shift) %{
> +  match(Set dst (LShiftVI src shift));
> 
> I'd prefer to see a uniform Ideal IR shape being used irrespective of
> whether the argument is a constant or not. It should also simplify the
> logic in SuperWord and make it easier to support on non-x86 architectures.
> 
> For example, here's how it is done on AArch64:
> 
> instruct vsll4I_imm(vecX dst, vecX src, immI shift) %{
>    predicate(n->as_Vector()->length() == 4);
>    match(Set dst (LShiftVI src (LShiftCntV shift))); ...
> 
> > 6) Constant folding scenarios are covered in RotateLeft/RotateRight
> idealization, inferencing of vector rotate through OrV idealization covers
> the vector patterns generated though non SLP route i.e. VectorAPI.
> 
> I'm fine with keeping OrV::Ideal(), but I'm concerned with the general
> direction here - duplication of scalar transformations to lane-wise vector
> operations. It definitely won't scale and in a longer run it risks to
> diverge. Would be nice to find a way to automatically "lift"
> scalar transformations to vectors and apply them uniformly. But right now
> it is just an idea which requires more experimentation.
> 
> 
> Some other minor comments/suggestions:
> 
> +  // Swap the computed left and right shift counts.
> +  if (is_rotate_left) {
> +    Node* temp = shiftRCnt;
> +    shiftRCnt  = shiftLCnt;
> +    shiftLCnt  = temp;
> +  }
> 
> Maybe use swap() here (declared in globalDefinitions.hpp)?
> 
> 
> +  if (Matcher::match_rule_supported_vector(vopc, vlen, bt))
> +    return true;
> 
> Please, don't omit curly braces (even for simple cases).
> 
> 
> -// Rotate Right by variable
> -instruct rorI_rReg_Var_C0(no_rcx_RegI dst, rcx_RegI shift, immI0 zero,
> rFlagsReg cr)
> +instruct rorI_immI8_legacy(rRegI dst, immI8 shift, rFlagsReg cr)
>   %{
> -  match(Set dst (OrI (URShiftI dst shift) (LShiftI dst (SubI zero
> shift))));
> -
> +  predicate(!VM_Version::supports_bmi2() &&
> n->bottom_type()->basic_type() == T_INT);
> +  match(Set dst (RotateRight dst shift));
> +  format %{ "rorl     $dst, $shift" %}
>     expand %{
> -    rorI_rReg_CL(dst, shift, cr);
> +    rorI_rReg_imm8(dst, shift, cr);
>     %}
> 
> It would be really nice to migrate to MacroAssembler along the way (as a
> cleanup).
> 
> > Please push the patch through your testing framework and let me know your
> review feedback.
> 
> There's one new assertion failure:
> 
> #  Internal Error (.../src/hotspot/share/opto/phaseX.cpp:1238),
> pid=5476, tid=6219
> #  assert((i->_idx >= k->_idx) || i->is_top()) failed: Idealize should
> return new nodes, use Identity to return old nodes
> 
> I believe it comes from RotateLeftNode::Ideal/RotateRightNode::Ideal
> which can return pre-constructed constants. I suggest to get rid of
> Ideal() methods and move constant folding logic into Node::Value() (as
> implemented for other bitwise/arithmetic nodes in
> addnode.cpp/subnode.cpp/mulnode.cpp et al). It's a more generic approach
> since it enables richer type information (ranges vs constants) and IMO it's
> more convenient to work with constants through Types than ConNodes.
> 
> (I suspect that original/expanded IR shape may already provide more precise
> type info for non-constant case which can affect the benchmarks.)
> 
> Best regards,
> Vladimir Ivanov
> 
> >
> > Best Regards,
> > Jatin
> >
> > [1]
> > http://cr.openjdk.java.net/~jbhateja/8248830/rotate_baseline_avx2_asm.
> > txt [2]
> > http://cr.openjdk.java.net/~jbhateja/8248830/rotate_new_patch_avx2_asm
> > .txt [3]
> > http://cr.openjdk.java.net/~jbhateja/8248830/rotate_perf_avx2_new_patc
> > h.txt
> >
> >
> >> -----Original Message-----
> >> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
> >> Sent: Saturday, July 18, 2020 12:25 AM
> >> To: Bhateja, Jatin <jatin.bhateja at intel.com>; Andrew Haley
> >> <aph at redhat.com>
> >> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>;
> >> hotspot-compiler- dev at openjdk.java.net
> >> Subject: Re: RFR[S] : 8248830 : C2 : Rotate API intrinsification for
> >> X86
> >>
> >> Hi Jatin,
> >>
> >>> http://cr.openjdk.java.net/~jbhateja/8248830/webrev_02/
> >>
> >> It definitely looks better, but IMO it hasn't reached the sweet spot
> yet.
> >> It feels like the focus is on auto-vectorizer while the burden is put
> >> on scalar cases.
> >>
> >> First of all, considering GVN folds relevant operation patterns into
> >> a single Rotate node now, what's the motivation to introduce intrinsics?
> >>
> >> Another point is there's still significant duplication for scalar cases.
> >>
> >> I'd prefer to see the legacy cases which rely on pattern matching to
> >> go away and be substituted with instructions which match Rotate
> >> instructions (migrating ).
> >>
> >> I understand that it will penalize the vectorization implementation,
> >> but IMO reducing overall complexity is worth it. On auto-vectorizer
> >> side, I see
> >> 2 ways to fix it:
> >>
> >>     (1) introduce additional AD instructions for
> >> RotateLeftV/RotateRightV specifically for pre-AVX512 hardware;
> >>
> >>     (2) in SuperWord::output(), when matcher doesn't support
> >> RotateLeftV/RotateRightV nodes (Matcher::match_rule_supported()),
> >> generate vectorized version of the original pattern.
> >>
> >> Overall, it looks like more and more focus is made on scalar part.
> >> Considering the main goal of the patch is to enable vectorization,
> >> I'm fine with separating cleanup of scalar part. As an interim
> >> solution, it seems that leaving the scalar part as it is now and
> >> matching scalar bit rotate pattern in VectorNode::is_rotate() should
> >> be enough to keep the vectorization part functioning. Then scalar
> >> Rotate nodes and relevant cleanups can be integrated later. (Or vice
> >> versa: clean up scalar part first and then follow up with
> >> vectorization.)
> >>
> >> Some other comments:
> >>
> >> * There's a lot of duplication between OrINode::Ideal and
> OrLNode::Ideal.
> >> What do you think about introducing a super type
> >> (OrNode) and put a unified version (OrNode::Ideal) there?
> >>
> >>
> >> * src/hotspot/cpu/x86/x86.ad
> >>
> >> +instruct vprotate_immI8(vec dst, vec src, immI8 shift) %{
> >> +  predicate(n->bottom_type()->is_vect()->element_basic_type() == T_INT
> ||
> >> +            n->bottom_type()->is_vect()->element_basic_type() ==
> >> +T_LONG);
> >>
> >> +instruct vprorate(vec dst, vec src, vec shift) %{
> >> +  predicate(n->bottom_type()->is_vect()->element_basic_type() == T_INT
> ||
> >> +            n->bottom_type()->is_vect()->element_basic_type() ==
> >> +T_LONG);
> >>
> >> The predicates are redundant here.
> >>
> >>
> >> * src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp
> >>
> >> +void C2_MacroAssembler::vprotate_imm(int opcode, BasicType etype,
> >> XMMRegister dst, XMMRegister src,
> >> +                                     int shift, int vector_len) {
> >> + if (opcode == Op_RotateLeftV) {
> >> +    if (etype == T_INT) {
> >> +      evprold(dst, src, shift, vector_len);
> >> +    } else {
> >> +      evprolq(dst, src, shift, vector_len);
> >> +    }
> >>
> >> Please, put an assert for the false case (assert(etype == T_LONG,
> "...")).
> >>
> >>
> >> * On testing (with previous version of the patch): -XX:UseAVX is x86-
> >> specific flag, so new/adjusted tests now fail on non-x86 platforms.
> >> Either omitting the flag or adding -XX:+IgnoreUnrecognizedVMOptions
> >> will solve the issue.
> >>
> >> Best regards,
> >> Vladimir Ivanov
> >>
> >>>
> >>>
> >>> Summary of changes:
> >>> 1) Optimization is specifically targeted to exploit vector rotation
> >> instruction added for X86 AVX512. A single rotate instruction
> >> encapsulates entire vector OR/SHIFTs pattern thus offers better
> >> latency at reduced instruction count.
> >>>
> >>> 2) There were two approaches to implement this:
> >>>       a)  Let everything remain the same and add new wide complex
> >> instruction patterns in the matcher for e.g.
> >>>            set Dst ( OrV (Binary (LShiftVI dst (Binary ReplicateI
> >>> shift))
> >> (URShiftVI dst (Binary (SubI (Binary ReplicateI 32) ( Replicate
> >> shift))
> >>>       It would have been an overoptimistic assumption to expect that
> >>> graph
> >> shape would be preserved till the matcher for correct inferencing.
> >>>       In addition we would have required multiple such bulky patterns.
> >>>       b) Create new RotateLeft/RotateRight scalar nodes, these gets
> >> generated during intrinsification as well as during additional
> >> pattern
> >>>       matching during node Idealization, later on these nodes are
> >>> consumed
> >> by SLP for valid vectorization scenarios to emit their vector
> >>>       counterparts which eventually emits vector rotates.
> >>>
> >>> 3) I choose approach 2b) since its cleaner, only problem here was
> >>> that in non-evex mode (UseAVX < 3) new scalar Rotate nodes should
> >>> either be
> >> dismantled back to OR/SHIFT pattern or we penalize the vectorization
> >> which would be very costly, other option would have been to add
> >> additional vector rotate pattern for UseAVX=3 in the matcher which
> >> emit vector OR-SHIFTs instruction but then it will lose on emitting
> >> efficient instruction sequence which node sharing
> >> (OrV/LShiftV/URShift) offer in current implementation - thus it will
> >> not be beneficial for non-AVX512 targets, only saving will be in
> >> terms of cleanup of few existing scalar rotate matcher patterns, also
> >> old targets does not offer this powerful rotate instruction.
> >> Therefore new scalar nodes are created only for AVX512 targets.
> >>>
> >>> As per suggestions constant folding scenarios have been covered
> >>> during
> >> Idealizations of newly added scalar nodes.
> >>>
> >>> Please review the latest version and share your feedback and test
> >> results.
> >>>
> >>> Best Regards,
> >>> Jatin
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Andrew Haley <aph at redhat.com>
> >>>> Sent: Saturday, July 11, 2020 2:24 PM
> >>>> To: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>; Bhateja, Jatin
> >>>> <jatin.bhateja at intel.com>; hotspot-compiler-dev at openjdk.java.net
> >>>> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>
> >>>> Subject: Re: 8248830 : RFR[S] : C2 : Rotate API intrinsification
> >>>> for
> >>>> X86
> >>>>
> >>>> On 10/07/2020 18:32, Vladimir Ivanov wrote:
> >>>>
> >>>>    > High-level comment: so far, there were no pressing need in  >
> >>>> explicitly marking the methods as intrinsics. ROR/ROL instructions
> >>>> > were selected during matching [1]. Now the patch introduces  >
> >>>> dedicated nodes
> >>>> (RotateLeft/RotateRight) specifically for intrinsics  > which
> >>>> partly duplicates existing logic.
> >>>>
> >>>> The lack of rotate nodes in the IR has always meant that AArch64
> >>>> doesn't generate optimal code for e.g.
> >>>>
> >>>>      (Set dst (XorL reg1 (RotateLeftL reg2 imm)))
> >>>>
> >>>> because, with the RotateLeft expanded to its full combination of
> >>>> ORs and shifts, it's too complicated to match. At the time I put
> >>>> this to one side because it wasn't urgent. This is a shame because
> >>>> although such combinations are unusual they are used in some crypto
> operations.
> >>>>
> >>>> If we can generate immediate-form rotate nodes early by pattern
> >>>> matching during parsing (rather than depending on intrinsics) we'll
> >>>> get more value than by depending on programmers calling intrinsics.
> >>>>
> >>>> --
> >>>> Andrew Haley  (he/him)
> >>>> Java Platform Lead Engineer
> >>>> Red Hat UK Ltd. <https://www.redhat.com>
> >>>> https://keybase.io/andrewhaley
> >>>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
> >>>
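
The "full combination of ORs and shifts" that this thread wants to recognise as a single rotate can be shown on plain 32-bit integers. This is a sketch only: the function names are made up, and the masking of shift counts to 0..31 is written out by hand to mirror what Java does for int shifts.

```cpp
#include <cassert>
#include <cstdint>

// Rotate left spelled as an OR of two opposite shifts, the shape that
// pattern matching has to detect. Shift counts are masked to 0..31.
uint32_t rotl_or_shift(uint32_t x, int s) {
  unsigned l = static_cast<unsigned>(s) & 31u;
  unsigned r = (32u - l) & 31u;
  return (x << l) | (x >> r);
}

// Reference rotate: move the top bit to the bottom one step at a time.
uint32_t rotl_reference(uint32_t x, int s) {
  unsigned n = static_cast<unsigned>(s) & 31u;
  for (unsigned i = 0; i < n; i++) {
    x = (x << 1) | (x >> 31);
  }
  return x;
}
```

Both functions compute the same value for every shift count, which is what allows an Or/LShift/URShift subgraph to be replaced by one RotateLeft node.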

From eric.caspole at oracle.com  Wed Jul 29 15:06:01 2020
From: eric.caspole at oracle.com (eric.caspole at oracle.com)
Date: Wed, 29 Jul 2020 11:06:01 -0400
Subject: RFR (S) - 8249663: LogCompilation cannot process log from
 o.r.scala.dotty.JmhDotty
Message-ID: <72cabad4-ee7d-f045-b2f9-5969c58abb4a@oracle.com>

Hi everyone,
Could I get reviews on this bug fix to the LogCompilation tool? There were 
actually 2 problems, the first is that "site" in LogParser is not reset 
after a <parse>, and so the next parse could misuse the stale site which 
led to the stack trace in the bug.

The second problem appeared after I put the first fix in place: 
<replace_string_concat> can have an uncommon trap with 
reason='intrinsic_or_type_checked_inlining', so we need to record its 
location in order to process the uncommon trap that will follow it.

I tested this with logs from some jvm08's and a short run of all 
renaissance, and about 80 runs of the specific renaissance dotty that 
usually showed the first problem.

JBS:
https://bugs.openjdk.java.net/browse/JDK-8249663

webrev:
http://cr.openjdk.java.net/~ecaspole/JDK-8249663/01/webrev/

Thanks,
Eric

From vladimir.kozlov at oracle.com  Wed Jul 29 16:03:13 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 29 Jul 2020 09:03:13 -0700
Subject: RFR: 8250745: Fix a potential bug on AVX512 machines with
 assert(eval_map.contains(n)) failed: absent
In-Reply-To: <f4a157c5-4ac4-07b8-5d94-6d01f5000674@oracle.com>
References: <6E0374A1-1E57-4FF3-A8B0-BB605E5E8F68@tencent.com>
 <f4a157c5-4ac4-07b8-5d94-6d01f5000674@oracle.com>
Message-ID: <a117584f-d886-0b47-9bfc-8f080afca298@oracle.com>

On 7/29/20 2:20 AM, Vladimir Ivanov wrote:
>> Webrev: http://cr.openjdk.java.net/~jiefu/8250745/webrev.00/
> 
> Looks good.

+1

> 
> FTR the bug was introduced by JDK-8241040, but I don't see a way it can be hit by auto-vectorizer: before it kicks in, 
> scalar code is strongly normalized and constants are pushed to the right. It leads to the shape where (Replicate -1) is 
> always the second input of bitwise NOT shape (XorV v (Replicate -1)). Since there are no GVN transformations happening 
> for vector nodes, both left-hand and right-hand variants become possible with Vector API.

So it is difficult to write a test?

Thanks,
Vladimir K

> 
> Best regards,
> Vladimir Ivanov

From vladimir.x.ivanov at oracle.com  Wed Jul 29 16:18:39 2020
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 29 Jul 2020 19:18:39 +0300
Subject: RFR: 8250745: Fix a potential bug on AVX512 machines with
 assert(eval_map.contains(n)) failed: absent
In-Reply-To: <a117584f-d886-0b47-9bfc-8f080afca298@oracle.com>
References: <6E0374A1-1E57-4FF3-A8B0-BB605E5E8F68@tencent.com>
 <f4a157c5-4ac4-07b8-5d94-6d01f5000674@oracle.com>
 <a117584f-d886-0b47-9bfc-8f080afca298@oracle.com>
Message-ID: <a4aab4d8-3ddb-a96b-9a41-d59a7090b1fb@oracle.com>


>> FTR the bug was introduced by JDK-8241040, but I don't see a way it 
>> can be hit by auto-vectorizer: before it kicks in, scalar code is 
>> strongly normalized and constants are pushed to the right. It leads to 
>> the shape where (Replicate -1) is always the second input of bitwise 
>> NOT shape (XorV v (Replicate -1)). Since there are no GVN 
>> transformations happening for vector nodes, both left-hand and 
>> right-hand variants become possible with Vector API.
> 
> So it is difficult to write a test?

IMO there's no way to hit the bug without using Vector API.
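
As a scalar analogue only (an assumption-laden sketch, not the Vector API code path that triggers the assert), the two operand orders of the bitwise NOT shape look like this in Java. The class and method names are hypothetical. Per the explanation above, scalar code is normalized so the constant ends up on the right, whereas for vector nodes both orders can reach the matcher.

```java
public class NotShapeSketch {
    // Bitwise NOT written as XOR with -1, constant as the second input.
    // This mirrors the (XorV v (Replicate -1)) shape discussed above.
    static int notConstRight(int v) {
        return v ^ -1;
    }

    // The same operation with the constant as the first input.
    // Semantically identical, but a different graph shape before normalization.
    static int notConstLeft(int v) {
        return -1 ^ v;
    }

    public static void main(String[] args) {
        int v = 42;
        // Both forms compute ~v.
        System.out.println(notConstRight(v) == ~v);
        System.out.println(notConstLeft(v) == ~v);
    }
}
```

Both methods return the same value for every input; the point is only that a pattern written for one operand order does not automatically cover the other.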

Best regards,
Vladimir Ivanov

From john.r.rose at oracle.com  Wed Jul 29 16:48:00 2020
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 29 Jul 2020 09:48:00 -0700
Subject: Performance degradation due to probable (?) C2 issue
In-Reply-To: <da261d22-3801-bf63-b44c-c181e80f6209@redhat.com>
References: <925401595926726@mail.yandex.ru>
 <da261d22-3801-bf63-b44c-c181e80f6209@redhat.com>
Message-ID: <5B377E17-952C-409B-98AB-2E6270A84185@oracle.com>

On Jul 28, 2020, at 5:12 AM, Andrew Haley <aph at redhat.com> wrote:
> 
> ... This isn't necessarily true, but it's a good guess. ...

And that is the history of HotSpot heuristics, in a nutshell.


From luhenry at microsoft.com  Wed Jul 29 16:55:29 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Wed, 29 Jul 2020 16:55:29 +0000
Subject: Adding an Intrinsic for MD5
Message-ID: <MWHPR21MB051125F36042FCA5A9A9C1B5B0700@MWHPR21MB0511.namprd21.prod.outlook.com>

Hi,

After profiling some applications on Azure, I noticed that MD5 takes a significant amount of time when verifying the content of large amounts of downloaded data (see [1] for a flamegraph of some Spark operations pulling data from Azure Storage; look at the topmost `Lsun/security/pro..` entry, representing 11.68% of the samples). I then looked into the code generated for `sun.security.provider.MD5.implCompress` (the hottest method). I observed that the generated code contains many branches that are never taken and not even necessary (array-bounds checks on a fixed-size array whose size we have already checked, for example). On top of that, MD5 doesn't require any branches (there are no conditions and no loops), making all these branches pure overhead. Accelerating MD5 will benefit not only Azure workloads but anyone doing any sort of content hashing/verification with MD5 (the continued use of which is quite unfortunate given the known flaws of MD5 and the availability of faster alternatives with better cryptographic properties).
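
For context, a minimal sketch of the kind of application-level call that ends up in this code path, assuming the default provider is in use (the class and method names here are hypothetical; only `java.security.MessageDigest` is the real JDK API):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Md5Sketch {
    // Hashes the input with the JDK's MD5 MessageDigest and returns lowercase hex.
    static String md5Hex(byte[] data) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
        byte[] digest = md.digest(data);
        StringBuilder sb = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // RFC 1321 test vector: prints 900150983cd24fb0d6963f7d28e17f72
        System.out.println(md5Hex("abc".getBytes(StandardCharsets.UTF_8)));
    }
}
```

Verifying a large download amounts to feeding many 64-byte blocks through the digest, which is presumably why the per-block compression routine dominates the profile.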

I worked last night on a prototype of an intrinsic, which I've uploaded at [2]. It's a very rough draft and I want to have your input before I invest further into it.

As it is the first time I do such work (adding an intrinsic, generating assembly by hand, adding support for one instruction in the assembler), I'm still running into a crash and I am not sure how to debug it further. I would really appreciate any pointer on how I need to approach debugging such an issue, or even for an expert to look into my change and help me pinpoint what's going wrong. So far, I used the disassembly and hs_err*.log file to clearly see the generated code and the machine state at the time of the crash. I expect the problem to be around calling conventions and assumptions around the shape/content of the parameters. I'll keep debugging in the meantime.

Thank you very much,

--
Ludovic

[1] http://cr.openjdk.java.net/~burban/luhenry/md5-intrinsics/flamegraph-45235.svg
[2] http://cr.openjdk.java.net/~burban/luhenry/md5-intrinsics/webrev.00/

From vladimir.x.ivanov at oracle.com  Wed Jul 29 17:14:17 2020
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 29 Jul 2020 20:14:17 +0300
Subject: RFR[S] : 8248830 : C2 : Rotate API intrinsification for X86
In-Reply-To: <MWHPR11MB1614E047E14386D3B51EA3A9E8700@MWHPR11MB1614.namprd11.prod.outlook.com>
References: <MWHPR11MB1614EAFF216144FE6EAE68F9E87F0@MWHPR11MB1614.namprd11.prod.outlook.com>
 <92d97d1b-fc53-e368-b249-1cab7db33964@oracle.com>
 <MWHPR11MB1614CB6E26028AC98DAA7F30E8790@MWHPR11MB1614.namprd11.prod.outlook.com>
 <dd691913-d9c7-2657-905f-4f3df50f6bb4@oracle.com>
 <MWHPR11MB1614E047E14386D3B51EA3A9E8700@MWHPR11MB1614.namprd11.prod.outlook.com>
Message-ID: <e0a75968-936f-97df-5693-f1e3275824e9@oracle.com>


> http://cr.openjdk.java.net/~jbhateja/8248830/webrev.04/

Looks good. (Testing is in progress.)

>> I'd prefer to see a uniform Ideal IR shape being used irrespective of
>> whether the argument is a constant or not. It should also simplify the
>> logic in SuperWord and make it easier to support on non-x86 architectures.
>>
>> For example, here's how it is done on AArch64:
>>
>> instruct vsll4I_imm(vecX dst, vecX src, immI shift) %{
>>     predicate(n->as_Vector()->length() == 4);
>>     match(Set dst (LShiftVI src (LShiftCntV shift))); ...
>>
> 
> Graph shape has been made consistent, we could have also optimized the pattern for ARM port for
> immediate shifts.

Good.

> I have removed RotateLeftNode/RotateRightNode::Ideal routines since we are anyway
> doing constant folding in LShiftI/URShiftI value routines. Since the Java rotate APIs are no longer
> intrinsified, these routines may no longer be useful.

Nice observation! Good.

>> It would be really nice to migrate to MacroAssembler along the way (as a
>> cleanup).
> 
> I guess you are saying remove opcodes/encoding from patterns and move them to Assembler.
> Can we take this cleanup activity separately, since other patterns are also using these matcher
> directives?

I'm perfectly fine with handling it as a separate enhancement.

> Other synthetic comments have been taken care of. I have extended the Test to cover all the newly
> added scalar transforms. Kindly let me know if there other comments.

Nice!
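
For reference, a small Java sketch of the scalar shift/or idioms that this thread discusses folding into RotateLeft/RotateRight nodes. The class and method names are hypothetical, and whether a particular form is actually recognized by C2 depends on the patch and is not something this sketch verifies; it only shows that the idioms agree with the library rotate methods.

```java
public class RotateIdiomSketch {
    // Rotate left as (x << s) | (x >>> (32 - s)).
    // Java masks int shift counts to the low 5 bits, so s == 0 is well defined.
    static int rotlViaShifts(int x, int s) {
        return (x << s) | (x >>> (32 - s));
    }

    // Rotate right as (x >>> s) | (x << (32 - s)).
    static int rotrViaShifts(int x, int s) {
        return (x >>> s) | (x << (32 - s));
    }

    // Long variant; long shift counts are masked to the low 6 bits.
    static long rotlLongViaShifts(long x, int s) {
        return (x << s) | (x >>> (64 - s));
    }

    public static void main(String[] args) {
        int x = 0x80000001;
        // The top bit wraps around to bit 0: prints 3
        System.out.println(Integer.toHexString(rotlViaShifts(x, 1)));
        System.out.println(rotlViaShifts(x, 7) == Integer.rotateLeft(x, 7));
        System.out.println(rotrViaShifts(x, 7) == Integer.rotateRight(x, 7));
    }
}
```

Code written in this style benefits from rotate inference during idealization even when the programmer never calls `Integer.rotateLeft` or `Long.rotateLeft` directly.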

Best regards,
Vladimir Ivanov

>> -----Original Message-----
>> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
>> Sent: Friday, July 24, 2020 3:21 AM
>> To: Bhateja, Jatin <jatin.bhateja at intel.com>
>> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; Andrew Haley
>> <aph at redhat.com>; hotspot-compiler-dev at openjdk.java.net
>> Subject: Re: RFR[S] : 8248830 : C2 : Rotate API intrinsification for X86
>>
>> Hi Jatin,
>>
>>> http://cr.openjdk.java.net/~jbhateja/8248830/webrev.03/
>>
>> Much better! Thanks.
>>
>>> Change Summary:
>>>
>>> 1) Unified the handling for scalar rotate operation. All scalar rotate
>> selection patterns are now dependent on newly created
>> RotateLeft/RotateRight nodes. This promotes rotate inferencing. Currently
>> if DAG nodes corresponding to a sub-pattern are shared (have multiple
>> users) then existing complex patterns based on Or/LShiftL/URShift does not
>> get matched and this prevents inferring rotate nodes. Please refer to
>> JIT'ed assembly output with baseline[1] and with patch[2] . We can see that
>> generated code size also went down from 832 bytes to 768 bytes. Also this
>> can cause perf degradation if shift-or dependency chain appears inside a
>> hot region.
>>>
>>> 2) Due to enhanced rotate inferencing new patch shows better performance
>> even for legacy targets (non AVX-512). Please refer to the perf result[3]
>> over AVX2 machine for JMH benchmark part of the patch.
>>
>> Very nice!
>>> 3) As suggested, removed Java API intrinsification changes and scalar
>> rotate transformation are done during OrI/OrL node idealizations.
>>
>> Good.
>>
>> (Still would be nice to factor the matching code from Ideal() and share it
>> between multiple use sites. Especially considering OrVNode::Ideal() now
>> does basically the same thing. As an example/idea, take a look at
>> is_bmi_pattern() in x86.ad.)
>>
>>> 4) SLP always gets to work on new scalar Rotate nodes and creates vector
>> rotate nodes which are degenerated into OrV/LShiftV/URShiftV nodes if
>> target does not support vector rotates (non-AVX512).
>>
>> Good.
>>
>>> 5) Added new instruction patterns for vector shift Left/Right operations
>> with constant shift operands. This prevents emitting extra moves to XMM.
>>
>> +instruct vshiftI_imm(vec dst, vec src, immI8 shift) %{
>> +  match(Set dst (LShiftVI src shift));
>>
>> I'd prefer to see a uniform Ideal IR shape being used irrespective of
>> whether the argument is a constant or not. It should also simplify the
>> logic in SuperWord and make it easier to support on non-x86 architectures.
>>
>> For example, here's how it is done on AArch64:
>>
>> instruct vsll4I_imm(vecX dst, vecX src, immI shift) %{
>>     predicate(n->as_Vector()->length() == 4);
>>     match(Set dst (LShiftVI src (LShiftCntV shift))); ...
>>
>>> 6) Constant folding scenarios are covered in RotateLeft/RotateRight
>> idealization, inferencing of vector rotate through OrV idealization covers
>> the vector patterns generated through non SLP route i.e. VectorAPI.
>>
>> I'm fine with keeping OrV::Ideal(), but I'm concerned with the general
>> direction here - duplication of scalar transformations to lane-wise vector
>> operations. It definitely won't scale and in a longer run it risks to
>> diverge. Would be nice to find a way to automatically "lift"
>> scalar transformations to vectors and apply them uniformly. But right now
>> it is just an idea which requires more experimentation.
>>
>>
>> Some other minor comments/suggestions:
>>
>> +  // Swap the computed left and right shift counts.
>> +  if (is_rotate_left) {
>> +    Node* temp = shiftRCnt;
>> +    shiftRCnt  = shiftLCnt;
>> +    shiftLCnt  = temp;
>> +  }
>>
>> Maybe use swap() here (declared in globalDefinitions.hpp)?
>>
>>
>> +  if (Matcher::match_rule_supported_vector(vopc, vlen, bt))
>> +    return true;
>>
>> Please, don't omit curly braces (even for simple cases).
>>
>>
>> -// Rotate Right by variable
>> -instruct rorI_rReg_Var_C0(no_rcx_RegI dst, rcx_RegI shift, immI0 zero,
>> rFlagsReg cr)
>> +instruct rorI_immI8_legacy(rRegI dst, immI8 shift, rFlagsReg cr)
>>    %{
>> -  match(Set dst (OrI (URShiftI dst shift) (LShiftI dst (SubI zero
>> shift))));
>> -
>> +  predicate(!VM_Version::supports_bmi2() &&
>> n->bottom_type()->basic_type() == T_INT);
>> +  match(Set dst (RotateRight dst shift));
>> +  format %{ "rorl     $dst, $shift" %}
>>      expand %{
>> -    rorI_rReg_CL(dst, shift, cr);
>> +    rorI_rReg_imm8(dst, shift, cr);
>>      %}
>>
>> It would be really nice to migrate to MacroAssembler along the way (as a
>> cleanup).
>>
>>> Please push the patch through your testing framework and let me know your
>> review feedback.
>>
>> There's one new assertion failure:
>>
>> #  Internal Error (.../src/hotspot/share/opto/phaseX.cpp:1238),
>> pid=5476, tid=6219
>> #  assert((i->_idx >= k->_idx) || i->is_top()) failed: Idealize should
>> return new nodes, use Identity to return old nodes
>>
>> I believe it comes from RotateLeftNode::Ideal/RotateRightNode::Ideal
>> which can return pre-constructed constants. I suggest to get rid of
>> Ideal() methods and move constant folding logic into Node::Value() (as
>> implemented for other bitwise/arithmetic nodes in
>> addnode.cpp/subnode.cpp/mulnode.cpp et al). It's a more generic approach
>> since it enables richer type information (ranges vs constants) and IMO it's
>> more convenient to work with constants through Types than ConNodes.
>>
>> (I suspect that original/expanded IR shape may already provide more precise
>> type info for non-constant case which can affect the benchmarks.)
>>
>> Best regards,
>> Vladimir Ivanov
>>
>>>
>>> Best Regards,
>>> Jatin
>>>
>>> [1] http://cr.openjdk.java.net/~jbhateja/8248830/rotate_baseline_avx2_asm.txt
>>> [2] http://cr.openjdk.java.net/~jbhateja/8248830/rotate_new_patch_avx2_asm.txt
>>> [3] http://cr.openjdk.java.net/~jbhateja/8248830/rotate_perf_avx2_new_patch.txt
>>>
>>>
>>>> -----Original Message-----
>>>> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
>>>> Sent: Saturday, July 18, 2020 12:25 AM
>>>> To: Bhateja, Jatin <jatin.bhateja at intel.com>; Andrew Haley
>>>> <aph at redhat.com>
>>>> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>;
>>>> hotspot-compiler-dev at openjdk.java.net
>>>> Subject: Re: RFR[S] : 8248830 : C2 : Rotate API intrinsification for
>>>> X86
>>>>
>>>> Hi Jatin,
>>>>
>>>>> http://cr.openjdk.java.net/~jbhateja/8248830/webrev_02/
>>>>
>>>> It definitely looks better, but IMO it hasn't reached the sweet spot
>> yet.
>>>> It feels like the focus is on auto-vectorizer while the burden is put
>>>> on scalar cases.
>>>>
>>>> First of all, considering GVN folds relevant operation patterns into
>>>> a single Rotate node now, what's the motivation to introduce intrinsics?
>>>>
>>>> Another point is there's still significant duplication for scalar cases.
>>>>
>>>> I'd prefer to see the legacy cases which rely on pattern matching to
>>>> go away and be substituted with instructions which match Rotate
>>>> instructions (migrating ).
>>>>
>>>> I understand that it will penalize the vectorization implementation,
>>>> but IMO reducing overall complexity is worth it. On auto-vectorizer
>>>> side, I see
>>>> 2 ways to fix it:
>>>>
>>>>      (1) introduce additional AD instructions for
>>>> RotateLeftV/RotateRightV specifically for pre-AVX512 hardware;
>>>>
>>>>      (2) in SuperWord::output(), when matcher doesn't support
>>>> RotateLeftV/RotateRightV nodes (Matcher::match_rule_supported()),
>>>> generate vectorized version of the original pattern.
>>>>
>>>> Overall, it looks like more and more focus is made on scalar part.
>>>> Considering the main goal of the patch is to enable vectorization,
>>>> I'm fine with separating cleanup of scalar part. As an interim
>>>> solution, it seems that leaving the scalar part as it is now and
>>>> matching scalar bit rotate pattern in VectorNode::is_rotate() should
>>>> be enough to keep the vectorization part functioning. Then scalar
>>>> Rotate nodes and relevant cleanups can be integrated later. (Or vice
>>>> versa: clean up scalar part first and then follow up with
>>>> vectorization.)
>>>>
>>>> Some other comments:
>>>>
>>>> * There's a lot of duplication between OrINode::Ideal and
>> OrLNode::Ideal.
>>>> What do you think about introducing a super type
>>>> (OrNode) and put a unified version (OrNode::Ideal) there?
>>>>
>>>>
>>>> * src/hotspot/cpu/x86/x86.ad
>>>>
>>>> +instruct vprotate_immI8(vec dst, vec src, immI8 shift) %{
>>>> +  predicate(n->bottom_type()->is_vect()->element_basic_type() == T_INT
>> ||
>>>> +            n->bottom_type()->is_vect()->element_basic_type() ==
>>>> +T_LONG);
>>>>
>>>> +instruct vprorate(vec dst, vec src, vec shift) %{
>>>> +  predicate(n->bottom_type()->is_vect()->element_basic_type() == T_INT
>> ||
>>>> +            n->bottom_type()->is_vect()->element_basic_type() ==
>>>> +T_LONG);
>>>>
>>>> The predicates are redundant here.
>>>>
>>>>
>>>> * src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp
>>>>
>>>> +void C2_MacroAssembler::vprotate_imm(int opcode, BasicType etype,
>>>> XMMRegister dst, XMMRegister src,
>>>> +                                     int shift, int vector_len) {
>>>> + if (opcode == Op_RotateLeftV) {
>>>> +    if (etype == T_INT) {
>>>> +      evprold(dst, src, shift, vector_len);
>>>> +    } else {
>>>> +      evprolq(dst, src, shift, vector_len);
>>>> +    }
>>>>
>>>> Please, put an assert for the false case (assert(etype == T_LONG,
>> "...")).
>>>>
>>>>
>>>> * On testing (with previous version of the patch): -XX:UseAVX is x86-
>>>> specific flag, so new/adjusted tests now fail on non-x86 platforms.
>>>> Either omitting the flag or adding -XX:+IgnoreUnrecognizedVMOptions
>>>> will solve the issue.
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>>>
>>>>>
>>>>>
>>>>> Summary of changes:
>>>>> 1) Optimization is specifically targeted to exploit vector rotation
>>>> instruction added for X86 AVX512. A single rotate instruction
>>>> encapsulates entire vector OR/SHIFTs pattern thus offers better
>>>> latency at reduced instruction count.
>>>>>
>>>>> 2) There were two approaches to implement this:
>>>>>        a)  Let everything remain the same and add new wide complex
>>>> instruction patterns in the matcher for e.g.
>>>>>             set Dst ( OrV (Binary (LShiftVI dst (Binary ReplicateI
>>>>> shift))
>>>> (URShiftVI dst (Binary (SubI (Binary ReplicateI 32) ( Replicate
>>>> shift))
>>>>>        It would have been an overoptimistic assumption to expect that
>>>>> graph
>>>> shape would be preserved till the matcher for correct inferencing.
>>>>>        In addition we would have required multiple such bulky patterns.
>>>>>        b) Create new RotateLeft/RotateRight scalar nodes, these gets
>>>> generated during intrinsification as well as during additional
>>>> pattern
>>>>>        matching during node Idealization, later on these nodes are
>>>>> consumed
>>>> by SLP for valid vectorization scenarios to emit their vector
>>>>>        counterparts which eventually emits vector rotates.
>>>>>
>>>>> 3) I chose approach 2b) since it's cleaner, only problem here was
>>>>> that in non-evex mode (UseAVX < 3) new scalar Rotate nodes should
>>>>> either be
>>>> dismantled back to OR/SHIFT pattern or we penalize the vectorization
>>>> which would be very costly, other option would have been to add
>>>> additional vector rotate pattern for UseAVX=3 in the matcher which
>>>> emit vector OR-SHIFTs instruction but then it will lose on emitting
>>>> efficient instruction sequence which node sharing
>>>> (OrV/LShiftV/URShift) offer in current implementation - thus it will
>>>> not be beneficial for non-AVX512 targets, only saving will be in
>>>> terms of cleanup of few existing scalar rotate matcher patterns, also
>>>> old targets do not offer this powerful rotate instruction.
>>>> Therefore new scalar nodes are created only for AVX512 targets.
>>>>>
>>>>> As per suggestions constant folding scenarios have been covered
>>>>> during
>>>> Idealizations of newly added scalar nodes.
>>>>>
>>>>> Please review the latest version and share your feedback and test
>>>> results.
>>>>>
>>>>> Best Regards,
>>>>> Jatin
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Andrew Haley <aph at redhat.com>
>>>>>> Sent: Saturday, July 11, 2020 2:24 PM
>>>>>> To: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>; Bhateja, Jatin
>>>>>> <jatin.bhateja at intel.com>; hotspot-compiler-dev at openjdk.java.net
>>>>>> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>
>>>>>> Subject: Re: 8248830 : RFR[S] : C2 : Rotate API intrinsification
>>>>>> for
>>>>>> X86
>>>>>>
>>>>>> On 10/07/2020 18:32, Vladimir Ivanov wrote:
>>>>>>
>>>>>>     > High-level comment: so far, there were no pressing need in
>>>>>>     > explicitly marking the methods as intrinsics. ROR/ROL instructions
>>>>>>     > were selected during matching [1]. Now the patch introduces
>>>>>>     > dedicated nodes (RotateLeft/RotateRight) specifically for intrinsics
>>>>>>     > which partly duplicates existing logic.
>>>>>>
>>>>>> The lack of rotate nodes in the IR has always meant that AArch64
>>>>>> doesn't generate optimal code for e.g.
>>>>>>
>>>>>>       (Set dst (XorL reg1 (RotateLeftL reg2 imm)))
>>>>>>
>>>>>> because, with the RotateLeft expanded to its full combination of
>>>>>> ORs and shifts, it's too complicated to match. At the time I put
>>>>>> this to one side because it wasn't urgent. This is a shame because
>>>>>> although such combinations are unusual they are used in some
>>>>>> crypto operations.
>>>>>>
>>>>>> If we can generate immediate-form rotate nodes early by pattern
>>>>>> matching during parsing (rather than depending on intrinsics) we'll
>>>>>> get more value than by depending on programmers calling intrinsics.
>>>>>>
>>>>>> --
>>>>>> Andrew Haley  (he/him)
>>>>>> Java Platform Lead Engineer
>>>>>> Red Hat UK Ltd. <https://www.redhat.com>
>>>>>> https://keybase.io/andrewhaley
>>>>>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>>>>>

From vladimir.x.ivanov at oracle.com  Wed Jul 29 17:15:35 2020
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 29 Jul 2020 20:15:35 +0300
Subject: RFR (S) - 8249663: LogCompilation cannot process log from
 o.r.scala.dotty.JmhDotty
In-Reply-To: <72cabad4-ee7d-f045-b2f9-5969c58abb4a@oracle.com>
References: <72cabad4-ee7d-f045-b2f9-5969c58abb4a@oracle.com>
Message-ID: <9013c6c8-563d-b220-4cc9-f846df76fd2e@oracle.com>


> http://cr.openjdk.java.net/~ecaspole/JDK-8249663/01/webrev/

Looks good.

Best regards,
Vladimir Ivanov

From vladimir.kozlov at oracle.com  Wed Jul 29 18:17:25 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 29 Jul 2020 11:17:25 -0700
Subject: RFR (S) - 8249663: LogCompilation cannot process log from
 o.r.scala.dotty.JmhDotty
In-Reply-To: <9013c6c8-563d-b220-4cc9-f846df76fd2e@oracle.com>
References: <72cabad4-ee7d-f045-b2f9-5969c58abb4a@oracle.com>
 <9013c6c8-563d-b220-4cc9-f846df76fd2e@oracle.com>
Message-ID: <8df2e1bc-d27b-1de2-10e6-d9d3b0ec1532@oracle.com>

+1

Thanks,
Vladimir K

On 7/29/20 10:15 AM, Vladimir Ivanov wrote:
> 
>> http://cr.openjdk.java.net/~ecaspole/JDK-8249663/01/webrev/
> 
> Looks good.
> 
> Best regards,
> Vladimir Ivanov

From sandhya.viswanathan at intel.com  Wed Jul 29 18:19:14 2020
From: sandhya.viswanathan at intel.com (Viswanathan, Sandhya)
Date: Wed, 29 Jul 2020 18:19:14 +0000
Subject: RFR (XXL): 8223347: Integration of Vector API (Incubator):
 General HotSpot changes
In-Reply-To: <38a7fe74-0c5e-4a28-b128-24c40b8ea01e@oracle.com>
References: <c1bdf88c-5de2-d069-5f31-5a95c6988bf8@oracle.com>
 <38a7fe74-0c5e-4a28-b128-24c40b8ea01e@oracle.com>
Message-ID: <BYAPR11MB35437FE26F6DE8890ED94BBBEF700@BYAPR11MB3543.namprd11.prod.outlook.com>

Hi,

Likewise, the corresponding x86 backend changes since first review are also only minor cleanups and simple bug fixes:

X86:
   Full: http://cr.openjdk.java.net/~sviswanathan/VAPI_RFR/x86_webrev/webrev.01/
   Incremental: http://cr.openjdk.java.net/~sviswanathan/VAPI_RFR/x86_webrev/webrev.00-webrev.01/

Summary:
   - rebased to jdk/jdk tip;
   - backend changes related to removal of NotV, VLShiftV, VRShiftV, VURShiftV nodes;
   - vector insert bug fix
   - some minor cleanups

Older webrev links for your reference:
   X86 backend: http://cr.openjdk.java.net/~sviswanathan/VAPI_RFR/x86_webrev/webrev.00/

Best Regards,
Sandhya

-----Original Message-----
From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com> 
Sent: Tuesday, July 28, 2020 3:30 PM
To: hotspot-dev <hotspot-dev at openjdk.java.net>; hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; panama-dev <panama-dev at openjdk.java.net>
Subject: Re: RFR (XXL): 8223347: Integration of Vector API (Incubator): General HotSpot changes

Hi,

Thanks for the feedback on webrev.00, Remi, Coleen, Vladimir K., and Ekaterina!

Here are the latest changes for Vector API support in HotSpot shared code:
 
http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.01

Incremental changes (diff against webrev.00):
 
http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.01_00

I decided to post it here and not initiate a new round of reviews because the changes are mostly limited to minor cleanups / simple bug fixes.

Detailed summary:
   - rebased to jdk/jdk tip;
   - got rid of NotV, VLShiftV, VRShiftV, VURShiftV nodes;
   - restore lazy cleanup logic during incremental inlining (see needs_cleanup in compile.cpp);
   - got rid of x86-specific changes in shared code;
   - fix for 8244867 [1];
   - fix Graal test failure: enumerate VectorSupport intrinsics in CheckGraalIntrinsics
   - numerous minor cleanups

Best regards,
Vladimir Ivanov

[1] http://hg.openjdk.java.net/panama/dev/rev/dcfc7b6e8977
     http://jbs.oracle.com/browse/JDK-8244867
     8244867: 2 vector api tests crash with
assert(is_reference_type(basic_type())) failed: wrong type
Summary: Adding safety checks to prevent intrinsification if class arguments of non-primitive types are uninitialized.

On 04.04.2020 02:12, Vladimir Ivanov wrote:
> Hi,
> 
> Following up on review requests of API [0] and Java implementation [1] 
> for Vector API (JEP 338 [2]), here's a request for review of general 
> HotSpot changes (in shared code) required for supporting the API:
> 
> 
> http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/all.00-03/
> 
> 
> (First of all, to set proper expectations: since the JEP is still in 
> Candidate state, the intention is to initiate preliminary round(s) of 
> review to inform the community and gather feedback before sending out 
> final/official RFRs once the JEP is Targeted to a release.)
> 
> Vector API (being developed in Project Panama [3]) relies on JVM 
> support to utilize optimal vector hardware instructions at runtime. It 
> interacts with JVM through intrinsics (declared in 
> jdk.internal.vm.vector.VectorSupport [4]) which expose vector 
> operations support in C2 JIT-compiler.
> 
> As Paul wrote earlier: "A vector intrinsic is an internal low-level 
> vector operation. The last argument to the intrinsic is fall back 
> behavior in Java, implementing the scalar operation over the number of 
> elements held by the vector. Thus, if the intrinsic is not supported 
> in
> C2 for the other arguments then the Java implementation is executed 
> (the Java implementation is always executed when running in the 
> interpreter or for C1)."
> 
> The rest of JVM support is about aggressively optimizing vector boxes 
> to minimize (ideally eliminate) the overhead of boxing for vector values.
> It's a stopgap solution for vector box elimination problem until 
> inline classes arrive. Vector classes are value-based and in the 
> longer term will be migrated to inline classes once the support becomes available.
> 
> Vector API talk from JVMLS'18 [5] contains brief overview of JVM 
> implementation and some details.
> 
> Complete implementation resides in vector-unstable branch of 
> panama/dev repository [6].
> 
> Now to gory details (the patch is split in multiple "sub-webrevs"):
> 
> ===========================================================
> 
> (1)
> http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/00.backend.shared/
> 
> 
> Ideal vector nodes for new operations introduced by Vector API.
> 
> (Platform-specific back end support will be posted for review separately).
> 
> ===========================================================
> 
> (2)
> http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/01.intrinsics/
> 
> 
> JVM Java interface (VectorSupport) and intrinsic support in C2.
> 
> Vector instances are initially represented as VectorBox macro nodes 
> and "unboxing" is represented by VectorUnbox node. It simplifies 
> vector box elimination analysis and the nodes are expanded later right before EA pass.
> 
> Vectors have 2-level on-heap representation: for the vector value 
> primitive array is used as a backing storage and it is encapsulated in 
> a typed wrapper (e.g., Int256Vector - vector of 8 ints - contains a 
> int[8] instance which is used to store vector value).
> 
> Unless VectorBox node goes away, it needs to be expanded into an 
> allocation eventually, but it is a pure node and doesn't have any JVM 
> state associated with it. The problem is solved by keeping JVM state 
> separately in a VectorBoxAllocate node associated with VectorBox node 
> and use it during expansion.
> 
> Also, to simplify vector box elimination, inlining of vector reboxing 
> calls (VectorSupport::maybeRebox) is delayed until the analysis is over.
> 
> ===========================================================
> 
> (3)
> http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/02.vbox_elimination/
> 
> 
> Vector box elimination analysis implementation. (Brief overview: 
> slides
> #36-42 [5].)
> 
> The main part is devoted to scalarization across safepoints and 
> rematerialization support during deoptimization. In C2-generated code 
> vector operations work with raw vector values which live in registers 
> or spilled on the stack and it allows to avoid boxing/unboxing when a 
> vector value is alive across a safepoint. As with other values, 
> there's just a location of the vector value at the safepoint and 
> vector type information recorded in the relevant nmethod metadata and 
> all the heavy-lifting happens only when rematerialization takes place.
> 
> The analysis preserves object identity invariants except during 
> aggressive reboxing (guarded by -XX:+EnableAggressiveReboxing).
> 
> (Aggressive reboxing is crucial for cases when vectors "escape": it 
> allocates a fresh instance at every escape point thus enabling 
> original instance to go away.)
> 
> ===========================================================
> 
> (4)
> http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/03.module.hotspot/
> 
> 
> HotSpot changes for jdk.incubator.vector module. Vector support is 
> marked experimental and turned off by default. JEP 338 proposes the 
> API to be released as an incubator module, so a user has to specify 
> "--add-modules jdk.incubator.vector" on the command line to be able to 
> use it.
> When user does that, JVM automatically enables Vector API support.
> It improves usability (user doesn't need to separately "open" the API 
> and enable JVM support) while minimizing risks of destabilization 
> from new code when the API is not used.
> 
> 
> That's it! Will be happy to answer any questions.
> 
> And thanks in advance for any feedback!
> 
> Best regards,
> Vladimir Ivanov
> 
> [0]
> https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-March/065345.html
> 
> 
> [1]
> https://mail.openjdk.java.net/pipermail/hotspot-dev/2020-April/041228.html
> 
> [2] https://openjdk.java.net/jeps/338
> 
> [3] https://openjdk.java.net/projects/panama/
> 
> [4]
> http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/01.intrinsics/src/java.base/share/classes/jdk/internal/vm/vector/VectorSupport.java.html
> 
> 
> [5] 
> http://cr.openjdk.java.net/~vlivanov/talks/2018_JVMLS_VectorAPI.pdf
> 
> [6] http://hg.openjdk.java.net/panama/dev/shortlog/92bbd44386e9
> 
>     $ hg clone http://hg.openjdk.java.net/panama/dev/ -b vector-unstable

From luhenry at microsoft.com  Wed Jul 29 19:13:32 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Wed, 29 Jul 2020 19:13:32 +0000
Subject: Adding an Intrinsic for MD5
In-Reply-To: <MWHPR21MB051125F36042FCA5A9A9C1B5B0700@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB051125F36042FCA5A9A9C1B5B0700@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <MWHPR21MB051175BA98E737205A5E4665B0700@MWHPR21MB0511.namprd21.prod.outlook.com>

To add some more information, I've uploaded one of the `hs_err_pid*.log` files at [1].

--
Ludovic

[1] http://cr.openjdk.java.net/~burban/luhenry/md5-intrinsics/hs_err_pid28286.log

-----Original Message-----
From: hotspot-compiler-dev <hotspot-compiler-dev-retn at openjdk.java.net> On Behalf Of Ludovic Henry
Sent: Wednesday, July 29, 2020 9:55 AM
To: hotspot-compiler-dev at openjdk.java.net
Subject: Adding an Intrinsic for MD5

Hi,

After profiling some applications on Azure, I noticed that MD5 takes a significant amount of time when verifying the content of large amounts of downloaded data (see [1] for a flamegraph of some Spark operations pulling data from Azure Storage; look at the topmost `Lsun/security/pro..` entry, representing 11.68% of the samples). I then looked into the code generated for `sun.security.provider.MD5.implCompress` (the hottest method). I observed that the generated code contains many branches that are never taken and are not even necessary (array-bound checks on a fixed-size array whose size we have already checked, for example). On top of that, MD5 doesn't require any branches (there are no conditions and no loops), making all these branches pure overhead. Accelerating MD5 will benefit not only Azure workloads but anyone doing any sort of content hashing/verification with MD5 (which is quite unfortunate given the known flaws of MD5 and the availability of faster alternatives with better cryptographic qualities).
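
To illustrate why the compression function needs no branches, here is a minimal C++ sketch of a single round-1 step as defined in RFC 1321. It is not the JDK implementation or the proposed intrinsic; the names rotl32, md5_f and md5_step_f are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 32-bit value left by s bits (valid for 0 < s < 32,
// which covers every shift amount MD5 uses).
static inline uint32_t rotl32(uint32_t x, unsigned s) {
    return (x << s) | (x >> (32u - s));
}

// Round-1 auxiliary function F from RFC 1321: for each bit position,
// pick the bit of c where b is 1 and the bit of d where b is 0.
static inline uint32_t md5_f(uint32_t b, uint32_t c, uint32_t d) {
    return (b & c) | (~b & d);
}

// One round-1 step: a = b + ((a + F(b,c,d) + x + t) <<< s).
// x is one 32-bit word of the input block, t the per-step constant.
// Only adds, bitwise operations and a rotate; no comparison or jump.
static inline uint32_t md5_step_f(uint32_t a, uint32_t b, uint32_t c,
                                  uint32_t d, uint32_t x, uint32_t t,
                                  unsigned s) {
    return b + rotl32(a + md5_f(b, c, d) + x + t, s);
}
```

The full compression function is 64 such steps over a 64-byte block with fixed word indices, which is why bounds checks inside it are redundant once the block size has been verified.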

I worked last night on a prototype of an intrinsic, which I've uploaded at [2]. It's a very rough draft and I want to have your input before I invest further into it.

As this is the first time I have done such work (adding an intrinsic, generating assembly by hand, adding support for one instruction in the assembler), I'm still running into a crash and I am not sure how to debug it further. I would really appreciate any pointers on how to approach debugging such an issue, or even for an expert to look into my change and help me pinpoint what's going wrong. So far, I have used the disassembly and the hs_err*.log file to see the generated code and the machine state at the time of the crash. I expect the problem to be around calling conventions and assumptions about the shape/content of the parameters. I'll keep debugging in the meantime.

Thank you very much,

--
Ludovic

[1] http://cr.openjdk.java.net/~burban/luhenry/md5-intrinsics/flamegraph-45235.svg
[2] http://cr.openjdk.java.net/~burban/luhenry/md5-intrinsics/webrev.00/

From igor.ignatyev at oracle.com  Wed Jul 29 19:34:02 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 29 Jul 2020 12:34:02 -0700
Subject: RFR(T) : 8250797 : remove CompileReason::Reason_CTW
Message-ID: <2C1589D3-2FEC-411A-8CC2-DF184593BD25@oracle.com>

http://cr.openjdk.java.net/~iignatyev//8250797/webrev.00
> 5 lines changed: 0 ins; 4 del; 1 mod; 

Hi all,

could you please review this patch?
from JBS:
> "native" CTW has been removed by JDK-8213812 (JDK-8214917), so CompileReason::Reason_CTW isn't used anymore and should be removed.

besides removing CompileReason::Reason_CTW and corresponding element from reason_names[] array, the patch also updates the comment for CompileReason as CompileTask::can_become_stale doesn't really depend on the order.

webrev: 
JBS: https://bugs.openjdk.java.net/browse/JDK-8250797

-- Igor

From ekaterina.pavlova at oracle.com  Wed Jul 29 19:44:32 2020
From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova)
Date: Wed, 29 Jul 2020 12:44:32 -0700
Subject: RFR(T) : 8250797 : remove CompileReason::Reason_CTW
In-Reply-To: <2C1589D3-2FEC-411A-8CC2-DF184593BD25@oracle.com>
References: <2C1589D3-2FEC-411A-8CC2-DF184593BD25@oracle.com>
Message-ID: <1948381d-5e52-1f6d-2ad1-a8e445bdea5a@oracle.com>

Looks good.

-katya

On 7/29/20 12:34 PM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev//8250797/webrev.00
>> 5 lines changed: 0 ins; 4 del; 1 mod;
> 
> Hi all,
> 
> could you please review this patch?
> from JBS:
>> "native" CTW has been removed by JDK-8213812 (JDK-8214917), so CompileReason::Reason_CTW isn't used anymore and should be removed.
> 
> besides removing CompileReason::Reason_CTW and corresponding element from reason_names[] array, the patch also updates the comment for CompileReason as CompileTask::can_become_stale doesn't really depend on the order.
> 
> webrev:
> JBS: https://bugs.openjdk.java.net/browse/JDK-8250797
> 
> -- Igor
> 


From dean.long at oracle.com  Wed Jul 29 19:48:01 2020
From: dean.long at oracle.com (Dean Long)
Date: Wed, 29 Jul 2020 12:48:01 -0700
Subject: [15] RFR(XS) 8248597: [Graal]
 api/java_security/SignatureSpi/DelegationTests.html fails with Method
 "javasoft.sqe.tests.api.java.security.SignatureSpi.JCKSignatureSpi.clear"
 doesn't exist.
Message-ID: <688dfe7e-c8ad-3c37-d2f7-432609f45e2f@oracle.com>

https://bugs.openjdk.java.net/browse/JDK-8248597
http://cr.openjdk.java.net/~dlong/8248597/webrev/

This change fixes an issue with frame states in Graal that was causing a 
JCK test to fail. The fix is from Tom Rodriguez. This change has 
already been reviewed and merged into Graal. A new unit test was added 
to detect the problem. After this is reviewed I'll be requesting 
permission to push this into 15.

dl

From vladimir.x.ivanov at oracle.com  Wed Jul 29 20:08:05 2020
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Wed, 29 Jul 2020 23:08:05 +0300
Subject: RFR: 8250745: Fix a potential bug on AVX512 machines with
 assert(eval_map.contains(n)) failed: absent
In-Reply-To: <f4a157c5-4ac4-07b8-5d94-6d01f5000674@oracle.com>
References: <6E0374A1-1E57-4FF3-A8B0-BB605E5E8F68@tencent.com>
 <f4a157c5-4ac4-07b8-5d94-6d01f5000674@oracle.com>
Message-ID: <aea91c40-f139-7cdc-d0a3-c94bf2be1bc6@oracle.com>

FYI test results are clean (hs-precheckin-comp,hs-tier1,hs-tier2).

Best regards,
Vladimir Ivanov

On 29.07.2020 12:20, Vladimir Ivanov wrote:
>> Webrev: http://cr.openjdk.java.net/~jiefu/8250745/webrev.00/
> 
> Looks good.
> 
> FTR the bug was introduced by JDK-8241040, but I don't see a way it can 
> be hit by auto-vectorizer: before it kicks in, scalar code is strongly 
> normalized and constants are pushed to the right. It leads to the shape 
> where (Replicate -1) is always the second input of bitwise NOT shape 
> (XorV v (Replicate -1)). Since there are no GVN transformations 
> happening for vector nodes, both left-hand and right-hand variants 
> become possible with Vector API.
> 
> Best regards,
> Vladimir Ivanov
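
The point about operand order can be sketched in C++ as follows. This is a simplified illustration, not HotSpot code: Expr, is_all_ones and not_operand are hypothetical stand-ins for the real node classes and matcher helpers, and the assumption is only that a bitwise NOT appears as an XOR with an all-ones constant on either side.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for an IR node; not HotSpot's Node class.
struct Expr {
    enum Kind { Var, Const, Xor } kind;
    int64_t value;       // only meaningful for Const
    const Expr* in1;     // only meaningful for Xor
    const Expr* in2;     // only meaningful for Xor
};

// True if e is the constant -1 (all bits set).
static bool is_all_ones(const Expr* e) {
    return e != nullptr && e->kind == Expr::Const && e->value == -1;
}

// Returns the operand being negated if e has the shape (Xor v -1)
// or (Xor -1 v); returns nullptr for any other shape.
// Checking only in2 would miss the left-hand variant.
static const Expr* not_operand(const Expr* e) {
    if (e == nullptr || e->kind != Expr::Xor) {
        return nullptr;
    }
    if (is_all_ones(e->in2)) {
        return e->in1;
    }
    if (is_all_ones(e->in1)) {
        return e->in2;
    }
    return nullptr;
}
```

A matcher that assumes the constant is always the second input is safe only while an earlier normalization pass guarantees that order.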

From vladimir.kozlov at oracle.com  Wed Jul 29 20:49:20 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 29 Jul 2020 13:49:20 -0700
Subject: [15] RFR(XS) 8248597: [Graal]
 api/java_security/SignatureSpi/DelegationTests.html fails with Method
 "javasoft.sqe.tests.api.java.security.SignatureSpi.JCKSignatureSpi.clear"
 doesn't exist.
In-Reply-To: <688dfe7e-c8ad-3c37-d2f7-432609f45e2f@oracle.com>
References: <688dfe7e-c8ad-3c37-d2f7-432609f45e2f@oracle.com>
Message-ID: <a680e364-9d7e-adef-0fcd-78ec6ad4f135@oracle.com>

Looks good.

Thanks,
Vladimir

On 7/29/20 12:48 PM, Dean Long wrote:
> https://bugs.openjdk.java.net/browse/JDK-8248597
> http://cr.openjdk.java.net/~dlong/8248597/webrev/
> 
> This change fixes an issue with frame states in Graal that was causing a JCK test to fail. The fix is from Tom 
> Rodriguez. This change has already been reviewed and merged into Graal. A new unit test was added to detect the 
> problem. After this is reviewed I'll be requesting permission to push this into 15.
> 
> dl

From dean.long at oracle.com  Wed Jul 29 20:52:43 2020
From: dean.long at oracle.com (Dean Long)
Date: Wed, 29 Jul 2020 13:52:43 -0700
Subject: [15] RFR(XS) 8248597: [Graal]
 api/java_security/SignatureSpi/DelegationTests.html fails with Method
 "javasoft.sqe.tests.api.java.security.SignatureSpi.JCKSignatureSpi.clear"
 doesn't exist.
In-Reply-To: <a680e364-9d7e-adef-0fcd-78ec6ad4f135@oracle.com>
References: <688dfe7e-c8ad-3c37-d2f7-432609f45e2f@oracle.com>
 <a680e364-9d7e-adef-0fcd-78ec6ad4f135@oracle.com>
Message-ID: <f28c68d3-4e95-541b-d086-dc0ef20f1b9e@oracle.com>

Thanks Vladimir.

dl

On 7/29/20 1:49 PM, Vladimir Kozlov wrote:
> Looks good.
>
> Thanks,
> Vladimir
>
> On 7/29/20 12:48 PM, Dean Long wrote:
>> https://bugs.openjdk.java.net/browse/JDK-8248597
>> http://cr.openjdk.java.net/~dlong/8248597/webrev/
>>
>> This change fixes an issue with frame states in Graal that was 
>> causing a JCK test to fail. The fix is from Tom Rodriguez. This 
>> change has already been reviewed and merged into Graal. A new unit 
>> test was added to detect the problem. After this is reviewed I'll be 
>> requesting permission to push this into 15.
>>
>> dl


From vladimir.kozlov at oracle.com  Wed Jul 29 20:54:31 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 29 Jul 2020 13:54:31 -0700
Subject: RFR(T) : 8250797 : remove CompileReason::Reason_CTW
In-Reply-To: <2C1589D3-2FEC-411A-8CC2-DF184593BD25@oracle.com>
References: <2C1589D3-2FEC-411A-8CC2-DF184593BD25@oracle.com>
Message-ID: <18e52d8d-afe9-7ccb-1d3a-ae1b37a2b8d8@oracle.com>

Igor,

You missed reference in should_wait_for_compilation().

Thanks,
Vladimir K

On 7/29/20 12:34 PM, Igor Ignatyev wrote:
> http://cr.openjdk.java.net/~iignatyev//8250797/webrev.00
>> 5 lines changed: 0 ins; 4 del; 1 mod;
> 
> Hi all,
> 
> could you please review this patch?
> from JBS:
>> "native" CTW has been removed by JDK-8213812 (JDK-8214917), so CompileReason::Reason_CTW isn't used anymore and should be removed.
> 
> besides removing CompileReason::Reason_CTW and corresponding element from reason_names[] array, the patch also updates the comment for CompileReason as CompileTask::can_become_stale doesn't really depend on the order.
> 
> webrev:
> JBS: https://bugs.openjdk.java.net/browse/JDK-8250797
> 
> -- Igor
> 

From xxinliu at amazon.com  Wed Jul 29 20:55:53 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Wed, 29 Jul 2020 20:55:53 +0000
Subject: RFR[XS] 8249809 avoid calling DirectiveSet::clone(this) in
 compilecommand_compatibility_init
In-Reply-To: <CA+3eh10yzWC1phhCSv9CMZRiH=A1ySJ6ij5jTQq0jeGpttxi6g@mail.gmail.com>
References: <1595807197546.52082@amazon.com>
 <CA+3eh10LEgJdcSAFrtsE=Ycy2oHJR3Hk2ddrTAFYLiC1E-sJ2w@mail.gmail.com>
 <1595907547514.55531@amazon.com>
 <CA+3eh10dQqSJyu+odn1zfLgMXw_oQaMN5+9icSp0LeERwui7JA@mail.gmail.com>
 <1595969785292.62158@amazon.com>
 <eba1ae8f-9409-48d2-64fa-6bb1e069bb6d@oracle.com>,
 <CA+3eh10yzWC1phhCSv9CMZRiH=A1ySJ6ij5jTQq0jeGpttxi6g@mail.gmail.com>
Message-ID: <1596056152748.75196@amazon.com>

hi, Volker and Tobias, 

Here is a new revision. 
http://cr.openjdk.java.net/~xliu/8249809/02/webrev/

1. This one adds comments about the smart pointer and fixes the formatting issues. 

2. Thanks for pointing me to the new HotSpot code style document. 
Since it has been updated to -std=c++14, I changed all NULL to nullptr.

3.  I also added NON_COPYABLE because it's not intended to be copied. 


DirectiveSetPtr is just a thin wrapper around the raw pointer. If users only use it to read, nothing will be cloned. It simply passes through. 

thanks, 
--lx

________________________________________
From: Volker Simonis <volker.simonis at gmail.com>
Sent: Wednesday, July 29, 2020 7:34 AM
To: Tobias Hartmann
Cc: Liu, Xin; hotspot-compiler-dev at openjdk.java.net
Subject: RE: [EXTERNAL] RFR[XS] 8249809 avoid calling DirectiveSet::clone(this) in compilecommand_compatibility_init

CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.


On Wed, Jul 29, 2020 at 9:38 AM Tobias Hartmann
<tobias.hartmann at oracle.com> wrote:
>
> Hi Xin,
>
> On 28.07.20 22:56, Liu, Xin wrote:
> > http://cr.openjdk.java.net/~xliu/8249809/01/webrev/
>
> Overall looks good to me.
>
> Some style comments:
> - Add a comment to 'DirectiveSetPtr' to describe its purpose
> - Why not put the "cloned" logic in "operator->"?

Because there's also a "read-only" access of the DirectiveSetPtr
which doesn't mutate its content and therefore shouldn't clone the
underlying DirectiveSet. See my first mail where I proposed to add a
second, `const`-version of "operator->". But that still required const
casts in the places where we didn't want to clone. I've therefore
voted for the new "cloned()" method which makes cloning and mutating
explicit and which is much easier to understand from my point of view
(compared to two overloaded operators).
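
A minimal C++ sketch of this design, assuming a wrapper that clones lazily on explicit request. It is not the code from the webrev: Settings, CowPtr and has_clone are hypothetical names, and the ownership policy (the wrapper deletes its clone) is chosen only to keep the example self-contained.

```cpp
#include <cassert>

// Hypothetical stand-in for DirectiveSet; only what the sketch needs.
struct Settings {
    bool log = false;
    Settings* clone() const { return new Settings(*this); }
};

// Thin wrapper around a borrowed pointer. Reads go straight through;
// the first call to cloned() makes a private copy that writes mutate.
class CowPtr {
    const Settings* _origin;  // borrowed, never modified
    Settings* _clone;         // owned; nullptr until a write is requested

public:
    explicit CowPtr(const Settings* origin)
        : _origin(origin), _clone(nullptr) {}
    CowPtr(const CowPtr&) = delete;             // not meant to be copied
    CowPtr& operator=(const CowPtr&) = delete;
    ~CowPtr() { delete _clone; }

    // Read-only access: never creates a clone.
    const Settings* operator->() const {
        return (_clone == nullptr) ? _origin : _clone;
    }

    // Explicit write access: clone on first use, then reuse the copy.
    Settings* cloned() {
        if (_clone == nullptr) {
            _clone = _origin->clone();
        }
        return _clone;
    }

    bool has_clone() const { return _clone != nullptr; }
};
```

With this shape a caller writes `p->log` to read and `p.cloned()->log = true` to write, so every place that may trigger a copy is visible at the call site.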

> - Do not use the _clone pointer as boolean (see "Miscellaneous" section in the style guide [1])
> - Indentation in line 301-303 is wrong
> - Line 306 use brackets around the "else" and move it one line up "} else {"
>
> Best regards,
> Tobias
>
> [1] https://hg.openjdk.java.net/jdk/jdk/raw-file/tip/doc/hotspot-style.html

From vladimir.x.ivanov at oracle.com  Wed Jul 29 22:00:26 2020
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Thu, 30 Jul 2020 01:00:26 +0300
Subject: RFR[S] : 8248830 : C2 : Rotate API intrinsification for X86
In-Reply-To: <e0a75968-936f-97df-5693-f1e3275824e9@oracle.com>
References: <MWHPR11MB1614EAFF216144FE6EAE68F9E87F0@MWHPR11MB1614.namprd11.prod.outlook.com>
 <92d97d1b-fc53-e368-b249-1cab7db33964@oracle.com>
 <MWHPR11MB1614CB6E26028AC98DAA7F30E8790@MWHPR11MB1614.namprd11.prod.outlook.com>
 <dd691913-d9c7-2657-905f-4f3df50f6bb4@oracle.com>
 <MWHPR11MB1614E047E14386D3B51EA3A9E8700@MWHPR11MB1614.namprd11.prod.outlook.com>
 <e0a75968-936f-97df-5693-f1e3275824e9@oracle.com>
Message-ID: <5f6a3e52-7854-4613-43f1-32a7423a0db6@oracle.com>


>> http://cr.openjdk.java.net/~jbhateja/8248830/webrev.04/
> 
> Looks good. (Testing is in progress.)

FYI test results are clean (tier1-tier5).

>> I have removed RotateLeftNode/RotateRightNode::Ideal routines since we 
>> are anyway
>> doing constant folding in LShiftI/URShiftI value routines. Since Java 
>> rotate APIs are no longer
>> intrinsified, these routines may no longer be useful.
> 
> Nice observation! Good.

On second thought, it seems there's still a chance left that Rotate 
nodes get their input type narrowed after the folding happened. For 
example, as a result of incremental inlining or CFG transformations 
during loop optimizations. And it does happen in practice since the 
testing revealed some crashes due to the bug in 
RotateLeftNode/RotateRightNode::Ideal().

So, it makes sense to keep the transformations. But I'm fine with 
addressing that as a followup enhancement.
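
For readers following the thread, the scalar shape being discussed is the usual OR-of-shifts rotate idiom. The C++ sketch below is illustrative only and is not HotSpot code; rotl32 and rotr32 are names chosen for the example. It shows two identities a folding step can rely on once the count is known: a count of zero leaves the value unchanged, and a left rotate by s equals a right rotate by 32 - s.

```cpp
#include <cassert>
#include <cstdint>

// Rotate left written as an OR of two shifts. The count is masked to
// 0..31 so that s == 0 never shifts by the full width, which would be
// undefined behaviour in C++.
static inline uint32_t rotl32(uint32_t x, unsigned s) {
    s &= 31u;
    return (x << s) | (x >> ((32u - s) & 31u));
}

// Rotate right is the same shape with the two shift counts swapped.
static inline uint32_t rotr32(uint32_t x, unsigned s) {
    s &= 31u;
    return (x >> s) | (x << ((32u - s) & 31u));
}
```

If the count only becomes a known constant late, for example after inlining, the node already exists by then, which is the case where a transformation on the Rotate node itself still has something to do.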

Best regards,
Vladimir Ivanov

> 
>>> It would be really nice to migrate to MacroAssembler along the way (as a
>>> cleanup).
>>
>> I guess you are saying remove opcodes/encoding from patterns and move 
>> them to Assembler,
>> Can we take this cleanup activity separately since other patterns are 
>> also using these matcher
>> directives.
> 
> I'm perfectly fine with handling it as a separate enhancement.
> 
>> Other synthetic comments have been taken care of. I have extended the 
>> Test to cover all the newly
>> added scalar transforms. Kindly let me know if there other comments.
> 
> Nice!
> 
> Best regards,
> Vladimir Ivanov
> 
>>> -----Original Message-----
>>> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
>>> Sent: Friday, July 24, 2020 3:21 AM
>>> To: Bhateja, Jatin <jatin.bhateja at intel.com>
>>> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; Andrew Haley
>>> <aph at redhat.com>; hotspot-compiler-dev at openjdk.java.net
>>> Subject: Re: RFR[S] : 8248830 : C2 : Rotate API intrinsification for X86
>>>
>>> Hi Jatin,
>>>
>>>> http://cr.openjdk.java.net/~jbhateja/8248830/webrev.03/
>>>
>>> Much better! Thanks.
>>>
>>>> Change Summary:
>>>>
>>>> 1) Unified the handling for scalar rotate operation. All scalar rotate
>>> selection patterns are now dependent on newly created
>>> RotateLeft/RotateRight nodes. This promotes rotate inferencing. 
>>> Currently
>>> if DAG nodes corresponding to a sub-pattern are shared (have multiple
>>> users) then existing complex patterns based on Or/LShiftL/URShift 
>>> does not
>>> get matched and this prevents inferring rotate nodes. Please refer to
>>> JIT'ed assembly output with baseline[1] and with patch[2] . We can 
>>> see that
>>> generated code size also went down from 832 bytes to 768 bytes. Also this
>>> can cause perf degradation if shift-or dependency chain appears inside a
>>> hot region.
>>>>
>>>> 2) Due to enhanced rotate inferencing new patch shows better 
>>>> performance
>>> even for legacy targets (non AVX-512). Please refer to the perf 
>>> result[3]
>>> over AVX2 machine for JMH benchmark part of the patch.
>>>
>>> Very nice!
>>>> 3) As suggested, removed Java API intrinsification changes and scalar
>>> rotate transformation are done during OrI/OrL node idealizations.
>>>
>>> Good.
>>>
>>> (Still would be nice to factor the matching code from Ideal() and 
>>> share it
>>> between multiple use sites. Especially considering OrVNode::Ideal() now
>>> does basically the same thing. As an example/idea, take a look at
>>> is_bmi_pattern() in x86.ad.)
>>>
>>>> 4) SLP always gets to work on new scalar Rotate nodes and creates 
>>>> vector
>>> rotate nodes which are degenerated into OrV/LShiftV/URShiftV nodes if
>>> target does not support vector rotates (non-AVX512).
>>>
>>> Good.
>>>
>>>> 5) Added new instruction patterns for vector shift Left/Right 
>>>> operations
>>> with constant shift operands. This prevents emitting extra moves to XMM.
>>>
>>> +instruct vshiftI_imm(vec dst, vec src, immI8 shift) %{
>>> +  match(Set dst (LShiftVI src shift));
>>>
>>> I'd prefer to see a uniform Ideal IR shape being used irrespective of
>>> whether the argument is a constant or not. It should also simplify the
>>> logic in SuperWord and make it easier to support on non-x86 
>>> architectures.
>>>
>>> For example, here's how it is done on AArch64:
>>>
>>> instruct vsll4I_imm(vecX dst, vecX src, immI shift) %{
>>>     predicate(n->as_Vector()->length() == 4);
>>>     match(Set dst (LShiftVI src (LShiftCntV shift))); ...
>>>
>>>> 6) Constant folding scenarios are covered in RotateLeft/RotateRight
>>> idealization, inferencing of vector rotate through OrV idealization 
>>> covers
>>> the vector patterns generated through the non-SLP route, i.e. VectorAPI.
>>>
>>> I'm fine with keeping OrV::Ideal(), but I'm concerned with the general
>>> direction here - duplication of scalar transformations to lane-wise 
>>> vector
>>> operations. It definitely won't scale and in a longer run it risks to
>>> diverge. Would be nice to find a way to automatically "lift"
>>> scalar transformations to vectors and apply them uniformly. But right 
>>> now
>>> it is just an idea which requires more experimentation.
>>>
>>>
>>> Some other minor comments/suggestions:
>>>
>>> +  // Swap the computed left and right shift counts.
>>> +  if (is_rotate_left) {
>>> +    Node* temp = shiftRCnt;
>>> +    shiftRCnt  = shiftLCnt;
>>> +    shiftLCnt  = temp;
>>> +  }
>>>
>>> Maybe use swap() here (declared in globalDefinitions.hpp)?
>>>
>>>
>>> +  if (Matcher::match_rule_supported_vector(vopc, vlen, bt))
>>> +    return true;
>>>
>>> Please, don't omit curly braces (even for simple cases).
>>>
>>>
>>> -// Rotate Right by variable
>>> -instruct rorI_rReg_Var_C0(no_rcx_RegI dst, rcx_RegI shift, immI0 zero,
>>> rFlagsReg cr)
>>> +instruct rorI_immI8_legacy(rRegI dst, immI8 shift, rFlagsReg cr)
>>>    %{
>>> -  match(Set dst (OrI (URShiftI dst shift) (LShiftI dst (SubI zero
>>> shift))));
>>> -
>>> +  predicate(!VM_Version::supports_bmi2() &&
>>> n->bottom_type()->basic_type() == T_INT);
>>> +  match(Set dst (RotateRight dst shift));
>>> +  format %{ "rorl     $dst, $shift" %}
>>>      expand %{
>>> -    rorI_rReg_CL(dst, shift, cr);
>>> +    rorI_rReg_imm8(dst, shift, cr);
>>>      %}
>>>
>>> It would be really nice to migrate to MacroAssembler along the way (as a
>>> cleanup).
>>>
>>>> Please push the patch through your testing framework and let me know 
>>>> your
>>> review feedback.
>>>
>>> There's one new assertion failure:
>>>
>>> #  Internal Error (.../src/hotspot/share/opto/phaseX.cpp:1238),
>>> pid=5476, tid=6219
>>> #  assert((i->_idx >= k->_idx) || i->is_top()) failed: Idealize should
>>> return new nodes, use Identity to return old nodes
>>>
>>> I believe it comes from RotateLeftNode::Ideal/RotateRightNode::Ideal
>>> which can return pre-constructed constants. I suggest getting rid of
>>> Ideal() methods and move constant folding logic into Node::Value() (as
>>> implemented for other bitwise/arithmetic nodes in
>>> addnode.cpp/subnode.cpp/mulnode.cpp et al). It's a more generic approach
>>> since it enables richer type information (ranges vs constants) and 
>>> IMO it's
>>> more convenient to work with constants through Types than ConNodes.
>>>
>>> (I suspect that original/expanded IR shape may already provide more 
>>> precise
>>> type info for non-constant case which can affect the benchmarks.)
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>>>
>>>> Best Regards,
>>>> Jatin
>>>>
>>>> [1] http://cr.openjdk.java.net/~jbhateja/8248830/rotate_baseline_avx2_asm.txt
>>>> [2] http://cr.openjdk.java.net/~jbhateja/8248830/rotate_new_patch_avx2_asm.txt
>>>> [3] http://cr.openjdk.java.net/~jbhateja/8248830/rotate_perf_avx2_new_patch.txt
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
>>>>> Sent: Saturday, July 18, 2020 12:25 AM
>>>>> To: Bhateja, Jatin <jatin.bhateja at intel.com>; Andrew Haley
>>>>> <aph at redhat.com>
>>>>> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>;
>>>>> hotspot-compiler-dev at openjdk.java.net
>>>>> Subject: Re: RFR[S] : 8248830 : C2 : Rotate API intrinsification for
>>>>> X86
>>>>>
>>>>> Hi Jatin,
>>>>>
>>>>>> http://cr.openjdk.java.net/~jbhateja/8248830/webrev_02/
>>>>>
>>>>> It definitely looks better, but IMO it hasn't reached the sweet spot
>>> yet.
>>>>> It feels like the focus is on auto-vectorizer while the burden is put
>>>>> on scalar cases.
>>>>>
>>>>> First of all, considering GVN folds relevant operation patterns into
>>>>> a single Rotate node now, what's the motivation to introduce 
>>>>> intrinsics?
>>>>>
>>>>> Another point is there's still significant duplication for scalar 
>>>>> cases.
>>>>>
>>>>> I'd prefer to see the legacy cases which rely on pattern matching to
>>>>> go away and be substituted with instructions which match Rotate
>>>>> instructions (migrating ).
>>>>>
>>>>> I understand that it will penalize the vectorization implementation,
>>>>> but IMO reducing overall complexity is worth it. On auto-vectorizer
>>>>> side, I see
>>>>> 2 ways to fix it:
>>>>>
>>>>>      (1) introduce additional AD instructions for
>>>>> RotateLeftV/RotateRightV specifically for pre-AVX512 hardware;
>>>>>
>>>>>      (2) in SuperWord::output(), when matcher doesn't support
>>>>> RotateLeftV/RotateRightV nodes (Matcher::match_rule_supported()),
>>>>> generate vectorized version of the original pattern.
>>>>>
>>>>> Overall, it looks like more and more focus is made on scalar part.
>>>>> Considering the main goal of the patch is to enable vectorization,
>>>>> I'm fine with separating cleanup of scalar part. As an interim
>>>>> solution, it seems that leaving the scalar part as it is now and
>>>>> matching scalar bit rotate pattern in VectorNode::is_rotate() should
>>>>> be enough to keep the vectorization part functioning. Then scalar
>>>>> Rotate nodes and relevant cleanups can be integrated later. (Or vice
>>>>> versa: clean up scalar part first and then follow up with
>>>>> vectorization.)
>>>>>
>>>>> Some other comments:
>>>>>
>>>>> * There's a lot of duplication between OrINode::Ideal and
>>> OrLNode::Ideal.
>>>>> What do you think about introducing a super type
>>>>> (OrNode) and put a unified version (OrNode::Ideal) there?
>>>>>
>>>>>
>>>>> * src/hotspot/cpu/x86/x86.ad
>>>>>
>>>>> +instruct vprotate_immI8(vec dst, vec src, immI8 shift) %{
>>>>> +  predicate(n->bottom_type()->is_vect()->element_basic_type() == T_INT ||
>>>>> +            n->bottom_type()->is_vect()->element_basic_type() == T_LONG);
>>>>>
>>>>> +instruct vprorate(vec dst, vec src, vec shift) %{
>>>>> +  predicate(n->bottom_type()->is_vect()->element_basic_type() == T_INT ||
>>>>> +            n->bottom_type()->is_vect()->element_basic_type() == T_LONG);
>>>>>
>>>>> The predicates are redundant here.
>>>>>
>>>>>
>>>>> * src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp
>>>>>
>>>>> +void C2_MacroAssembler::vprotate_imm(int opcode, BasicType etype,
>>>>> XMMRegister dst, XMMRegister src,
>>>>> +                                     int shift, int vector_len) {
>>>>> +  if (opcode == Op_RotateLeftV) {
>>>>> +    if (etype == T_INT) {
>>>>> +      evprold(dst, src, shift, vector_len);
>>>>> +    } else {
>>>>> +      evprolq(dst, src, shift, vector_len);
>>>>> +    }
>>>>>
>>>>> Please, put an assert for the false case (assert(etype == T_LONG, "...")).
>>>>>
>>>>>
>>>>> * On testing (with previous version of the patch): -XX:UseAVX is an
>>>>> x86-specific flag, so new/adjusted tests now fail on non-x86 platforms.
>>>>> Either omitting the flag or adding -XX:+IgnoreUnrecognizedVMOptions
>>>>> will solve the issue.
>>>>>
>>>>> Best regards,
>>>>> Vladimir Ivanov
>>>>>
>>>>>>
>>>>>>
>>>>>> Summary of changes:
>>>>>> 1) Optimization is specifically targeted to exploit vector rotation
>>>>> instruction added for X86 AVX512. A single rotate instruction
>>>>> encapsulates entire vector OR/SHIFTs pattern thus offers better
>>>>> latency at reduced instruction count.
>>>>>>
>>>>>> 2) There were two approaches to implement this:
>>>>>>    a) Let everything remain the same and add new wide complex
>>>>>>       instruction patterns in the matcher, e.g.
>>>>>>         set Dst ( OrV (Binary (LShiftVI dst (Binary ReplicateI shift))
>>>>>>           (URShiftVI dst (Binary (SubI (Binary ReplicateI 32) ( Replicate shift))
>>>>>>       It would have been an overoptimistic assumption to expect that the
>>>>>>       graph shape would be preserved till the matcher for correct inferencing.
>>>>>>       In addition we would have required multiple such bulky patterns.
>>>>>>    b) Create new RotateLeft/RotateRight scalar nodes; these get
>>>>>>       generated during intrinsification as well as during additional pattern
>>>>>>       matching during node Idealization. Later on these nodes are consumed
>>>>>>       by SLP for valid vectorization scenarios to emit their vector
>>>>>>       counterparts, which eventually emit vector rotates.
>>>>>>
>>>>>> 3) I chose approach 2b) since it's cleaner; the only problem here was
>>>>>> that in non-evex mode (UseAVX < 3) new scalar Rotate nodes should
>>>>>> either be
>>>>> dismantled back to OR/SHIFT pattern or we penalize the vectorization
>>>>> which would be very costly, other option would have been to add
>>>>> additional vector rotate pattern for UseAVX=3 in the matcher which
>>>>> emit vector OR-SHIFTs instruction but then it will lose on emitting
>>>>> efficient instruction sequence which node sharing
>>>>> (OrV/LShiftV/URShift) offer in current implementation - thus it will
>>>>> not be beneficial for non-AVX512 targets, only saving will be in
>>>>> terms of cleanup of few existing scalar rotate matcher patterns, also
>>>>> old targets do not offer this powerful rotate instruction.
>>>>> Therefore new scalar nodes are created only for AVX512 targets.
>>>>>>
>>>>>> As per suggestions constant folding scenarios have been covered
>>>>>> during
>>>>> Idealizations of newly added scalar nodes.
>>>>>>
>>>>>> Please review the latest version and share your feedback and test
>>>>> results.
>>>>>>
>>>>>> Best Regards,
>>>>>> Jatin
>>>>>>
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Andrew Haley <aph at redhat.com>
>>>>>>> Sent: Saturday, July 11, 2020 2:24 PM
>>>>>>> To: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>; Bhateja, Jatin
>>>>>>> <jatin.bhateja at intel.com>; hotspot-compiler-dev at openjdk.java.net
>>>>>>> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>
>>>>>>> Subject: Re: 8248830 : RFR[S] : C2 : Rotate API intrinsification
>>>>>>> for
>>>>>>> X86
>>>>>>>
>>>>>>> On 10/07/2020 18:32, Vladimir Ivanov wrote:
>>>>>>>
>>>>>>>  > High-level comment: so far, there was no pressing need in
>>>>>>>  > explicitly marking the methods as intrinsics. ROR/ROL instructions
>>>>>>>  > were selected during matching [1]. Now the patch introduces
>>>>>>>  > dedicated nodes (RotateLeft/RotateRight) specifically for intrinsics
>>>>>>>  > which partly duplicates existing logic.
>>>>>>> The lack of rotate nodes in the IR has always meant that AArch64
>>>>>>> doesn't generate optimal code for e.g.
>>>>>>>
>>>>>>>       (Set dst (XorL reg1 (RotateLeftL reg2 imm)))
>>>>>>>
>>>>>>> because, with the RotateLeft expanded to its full combination of
>>>>>>> ORs and shifts, it's too complicated to match. At the time I put
>>>>>>> this to one side because it wasn't urgent. This is a shame because
>>>>>>> although such combinations are unusual they are used in some crypto
>>>>>>> operations.
>>>>>>>
>>>>>>> If we can generate immediate-form rotate nodes early by pattern
>>>>>>> matching during parsing (rather than depending on intrinsics) we'll
>>>>>>> get more value than by depending on programmers calling intrinsics.
>>>>>>>
>>>>>>> -- 
>>>>>>> Andrew Haley  (he/him)
>>>>>>> Java Platform Lead Engineer
>>>>>>> Red Hat UK Ltd. <https://www.redhat.com>
>>>>>>> https://keybase.io/andrewhaley
>>>>>>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>>>>>>

From igor.ignatyev at oracle.com  Wed Jul 29 22:24:05 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 29 Jul 2020 15:24:05 -0700
Subject: RFR(T) : 8250797 : remove CompileReason::Reason_CTW
In-Reply-To: <18e52d8d-afe9-7ccb-1d3a-ae1b37a2b8d8@oracle.com>
References: <2C1589D3-2FEC-411A-8CC2-DF184593BD25@oracle.com>
 <18e52d8d-afe9-7ccb-1d3a-ae1b37a2b8d8@oracle.com>
Message-ID: <87C23499-6F1E-4B89-BFF6-CC9269C1C279@oracle.com>

oopsie, removed, http://cr.openjdk.java.net/~iignatyev//8250797/webrev.01 :

> diff -r e5afd04596e7 src/hotspot/share/compiler/compileTask.hpp
> --- a/src/hotspot/share/compiler/compileTask.hpp        Wed Jul 29 15:02:31 2020 -0700
> +++ b/src/hotspot/share/compiler/compileTask.hpp        Wed Jul 29 15:02:53 2020 -0700
> @@ -133,7 +133,6 @@
>    bool         should_wait_for_compilation() const {
>      // Wait for blocking compilation to finish.
>      switch (_compile_reason) {
> -        case Reason_CTW:
>          case Reason_Replay:
>          case Reason_Whitebox:
>          case Reason_Bootstrap:

thanks for noticing. for luck, I've started builds-tier1 job.

-- Igor

> On Jul 29, 2020, at 1:54 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> Igor,
> 
> You missed reference in should_wait_for_compilation().
> 
> Thanks,
> Vladimir K
> 
> On 7/29/20 12:34 PM, Igor Ignatyev wrote:
>> http://cr.openjdk.java.net/~iignatyev//8250797/webrev.00
>>> 5 lines changed: 0 ins; 4 del; 1 mod;
>> Hi all,
>> could you please review this patch?
>> from JBS:
>>> "native" CTW has been removed by JDK-8213812 (JDK-8214917), so CompileReason::Reason_CTW isn't used anymore and should be removed.
>> besides removing CompileReason::Reason_CTW and corresponding element from reason_names[] array, the patch also updates the comment for CompileReason as CompileTask::can_become_stale doesn't really depend on the order.
>> webrev:
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8250797
>> -- Igor


From vladimir.kozlov at oracle.com  Wed Jul 29 22:45:50 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Wed, 29 Jul 2020 15:45:50 -0700
Subject: RFR(T) : 8250797 : remove CompileReason::Reason_CTW
In-Reply-To: <87C23499-6F1E-4B89-BFF6-CC9269C1C279@oracle.com>
References: <87C23499-6F1E-4B89-BFF6-CC9269C1C279@oracle.com>
Message-ID: <27CCBBD2-A2DE-40AA-A894-BB257EFF95B5@oracle.com>

Good.

Thanks
Vladimir

> On Jul 29, 2020, at 3:24 PM, Igor Ignatyev <igor.ignatyev at oracle.com> wrote:
> 
> oopsie, removed, http://cr.openjdk.java.net/~iignatyev//8250797/webrev.01 :
> 
>> diff -r e5afd04596e7 src/hotspot/share/compiler/compileTask.hpp
>> --- a/src/hotspot/share/compiler/compileTask.hpp        Wed Jul 29 15:02:31 2020 -0700
>> +++ b/src/hotspot/share/compiler/compileTask.hpp        Wed Jul 29 15:02:53 2020 -0700
>> @@ -133,7 +133,6 @@
>>   bool         should_wait_for_compilation() const {
>>     // Wait for blocking compilation to finish.
>>     switch (_compile_reason) {
>> -        case Reason_CTW:
>>         case Reason_Replay:
>>         case Reason_Whitebox:
>>         case Reason_Bootstrap:
> 
> thanks for noticing. for luck, I've started builds-tier1 job.
> 
> -- Igor
> 
>> On Jul 29, 2020, at 1:54 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>> 
>> Igor,
>> 
>> You missed reference in should_wait_for_compilation().
>> 
>> Thanks,
>> Vladimir K
>> 
>>> On 7/29/20 12:34 PM, Igor Ignatyev wrote:
>>> http://cr.openjdk.java.net/~iignatyev//8250797/webrev.00
>>>> 5 lines changed: 0 ins; 4 del; 1 mod;
>>> Hi all,
>>> could you please review this patch?
>>> from JBS:
>>>> "native" CTW has been removed by JDK-8213812 (JDK-8214917), so CompileReason::Reason_CTW isn't used anymore and should be removed.
>>> besides removing CompileReason::Reason_CTW and corresponding element from reason_names[] array, the patch also updates the comment for CompileReason as CompileTask::can_become_stale doesn't really depend on the order.
>>> webrev:
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8250797
>>> -- Igor
> 


From vladimir.x.ivanov at oracle.com  Wed Jul 29 23:10:34 2020
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Thu, 30 Jul 2020 02:10:34 +0300
Subject: Adding an Intrinsic for MD5
In-Reply-To: <MWHPR21MB051175BA98E737205A5E4665B0700@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB051125F36042FCA5A9A9C1B5B0700@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB051175BA98E737205A5E4665B0700@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <5445fe9c-f113-be29-3835-e342e10623db@oracle.com>

Hi Ludovic,

It's a crash due to an out-of-bounds Java heap access (right at the upper 
heap boundary). Something is wrong either with the initial buf value 
(r15) or with the limit check:

166   if (multi_block) {
167     // increment data pointer and loop if more to process
168     addptr(buf, 64);
169     movptr(rsi, ofs);
170     addptr(rsi, 64);
171     movptr(ofs, rsi);
172     cmpptr(rsi, limit);
173     jcc(Assembler::belowEqual, loop0);
174   }

 From the hs_err log:

#  SIGSEGV (0xb) at pc=0x00007f34f10354a1, pid=28286, tid=28305

siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr:
0x0000000510800000


   0x00007f34f10354a1:   add     0x18(%r15),%ecx


R15=0x00000005107fffe8 points into unknown readable memory: 
0x0000000000000000 | 00 00 00 00 00 00 00 00


| 100|0x0000000510000000, 0x0000000510800000, 0x0000000510800000|100%| 
E|CS|TAMS 0x0000000510000000, 0x0000000510000000| Complete

Regarding ways to debug it, I'd put a breakpoint right at the beginning 
of the stub first to validate that parameters are valid. Then I'd dump 
parameters on stack in order to simplify post-mortem analysis. (If the 
problem is with limit check, then many iterations should pass before it 
reaches the end of the Java heap.) Also, inserting debug checks in the 
stub itself can catch an inconsistency much closer to the actual place 
where the bug lurks.
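
The numbers in the hs_err excerpt above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration (the class and method names are invented); the constants are copied from the log: R15, the 0x18 displacement of the faulting instruction, and the end address of the heap region shown.

```java
class FaultAddressCheck {
    static final long R15 = 0x00000005107fffe8L;         // R15 from the hs_err log
    static final long DISPLACEMENT = 0x18L;               // add 0x18(%r15),%ecx
    static final long REGION_END = 0x0000000510800000L;   // region end, also si_addr

    // Address touched by the faulting instruction.
    static long effectiveAddress() {
        return R15 + DISPLACEMENT;
    }

    // Bytes between the buf pointer and the end of the region.
    static long bytesLeftInRegion() {
        return REGION_END - R15;
    }

    // Would a block of the given size starting at R15 stay inside the region?
    static boolean blockFits(int blockBytes) {
        return R15 + blockBytes <= REGION_END;
    }

    public static void main(String[] args) {
        System.out.println("effective address = 0x" + Long.toHexString(effectiveAddress()));
        System.out.println("bytes left        = " + bytesLeftInRegion());
        System.out.println("64-byte block fits: " + blockFits(64));
    }
}
```

The effective address equals si_addr, and only 24 bytes remain before the region end, so a 64-byte block starting at r15 cannot fit. That is consistent with buf having been advanced one block too far, or having started too high.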

Best regards,
Vladimir Ivanov

On 29.07.2020 22:13, Ludovic Henry wrote:
> To add some more information, I've uploaded one of the `hs_err_pid*.log` file at [1].
> 
> --
> Ludovic
> 
> [1] http://cr.openjdk.java.net/~burban/luhenry/md5-intrinsics/hs_err_pid28286.log
> 
> -----Original Message-----
> From: hotspot-compiler-dev <hotspot-compiler-dev-retn at openjdk.java.net> On Behalf Of Ludovic Henry
> Sent: Wednesday, July 29, 2020 9:55 AM
> To: hotspot-compiler-dev at openjdk.java.net
> Subject: Adding an Intrinsic for MD5
> 
> Hi,
> 
> After profiling some applications on Azure, I noticed that MD5 takes a significant amount of time when verifying the content of large amounts of downloaded data (see [1] for a flamegraph of some Spark operations pulling data from Azure Storage; look at the topmost `Lsun/securitu/pro..` entry, representing 11.68% of the samples). I then looked into the code generated for `sun.security.provider.MD5.implCompress` (the hottest method). I observed that the generated code contains many branches that are never taken and not even necessary (array-bounds checks on a fixed-size array whose size we have already checked, for example). On top of that, MD5 itself doesn't require any branches (there are no conditions and no loops), making all these branches pure overhead. Accelerating MD5 will benefit not only Azure workloads, but anyone doing any sort of content hashing/verification with MD5 (which is quite unfortunate given the known flaws of MD5 and the availability of faster alternatives with better cryptographic qualities).
> 
> I worked last night on a prototype of an intrinsic, which I've uploaded at [2]. It's a very rough draft and I want to have your input before I invest further into it.
> 
> As it is the first time I do such work (adding an intrinsic, generating assembly by hand, adding support for one instruction in the assembler), I'm still running into a crash and I am not sure how to debug it further. I would really appreciate any pointer on how I need to approach debugging such an issue, or even for an expert to look into my change and help me pinpoint what's going wrong. So far, I used the disassembly and hs_err*.log file to clearly see the generated code and the machine state at the time of the crash. I expect the problem to be around calling conventions and assumptions around the shape/content of the parameters. I'll keep debugging in the meantime.
> 
> Thank you very much,
> 
> --
> Ludovic
> 
> [1] http://cr.openjdk.java.net/~burban/luhenry/md5-intrinsics/flamegraph-45235.svg
> [2] http://cr.openjdk.java.net/~burban/luhenry/md5-intrinsics/webrev.00/
> 

From jiefu at tencent.com  Wed Jul 29 23:09:34 2020
From: jiefu at tencent.com (=?utf-8?B?amllZnUo5YKF5p2wKQ==?=)
Date: Wed, 29 Jul 2020 23:09:34 +0000
Subject: RFR: 8250745: Fix a potential bug on AVX512 machines with
 assert(eval_map.contains(n)) failed: absent
Message-ID: <C8D749BC-3761-4E6B-AD6E-F682578BEE86@tencent.com>

Thanks Vladimir Ivanov and Vladimir Kozlov for your review.

Will push it several hours later (since the hotspot code review must be there for at least 24 hours).

Best regards,
Jie

On 2020/7/30, 4:07 AM, "Vladimir Ivanov" <vladimir.x.ivanov at oracle.com> wrote:

    FYI test results are clean (hs-precheckin-comp,hs-tier1,hs-tier2).
    
    Best regards,
    Vladimir Ivanov
    
    On 29.07.2020 12:20, Vladimir Ivanov wrote:
    >> Webrev: http://cr.openjdk.java.net/~jiefu/8250745/webrev.00/
    > 
    > Looks good.
    > 
    > FTR the bug was introduced by JDK-8241040, but I don't see a way it can 
    > be hit by auto-vectorizer: before it kicks in, scalar code is strongly 
    > normalized and constants are pushed to the right. It leads to the shape 
    > where (Replicate -1) is always the second input of bitwise NOT shape 
    > (XorV v (Replicate -1)). Since there are no GVN transformations 
    > happening for vector nodes, both left-hand and right-hand variants 
    > become possible with Vector API.
    > 
    > Best regards,
    > Vladimir Ivanov
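
To make the point about operand order concrete, here is a small hypothetical sketch of a shape check for bitwise NOT expressed as XOR with all ones. It is not the code from the webrev; the tiny Node class and the "AllOnes" leaf (standing in for (Replicate -1)) are invented for illustration. A check that only looks at the second input misses the left-hand variant, which is the situation described above.

```java
class NotShape {
    // Minimal stand-in for an IR node: an operator name and up to two inputs.
    static final class Node {
        final String op;
        final Node in1;
        final Node in2;

        Node(String op, Node in1, Node in2) {
            this.op = op;
            this.in1 = in1;
            this.in2 = in2;
        }
    }

    static Node leaf(String op) {
        return new Node(op, null, null);
    }

    static Node xor(Node a, Node b) {
        return new Node("Xor", a, b);
    }

    static boolean isAllOnes(Node n) {
        return n != null && n.op.equals("AllOnes");
    }

    // Only recognizes the normalized shape (Xor v AllOnes).
    static boolean isNotRightOnly(Node n) {
        return n.op.equals("Xor") && isAllOnes(n.in2);
    }

    // Recognizes both (Xor v AllOnes) and (Xor AllOnes v).
    static boolean isNotEitherSide(Node n) {
        return n.op.equals("Xor") && (isAllOnes(n.in1) || isAllOnes(n.in2));
    }

    public static void main(String[] args) {
        Node v = leaf("v");
        Node right = xor(v, leaf("AllOnes"));
        Node left = xor(leaf("AllOnes"), v);
        System.out.println(isNotRightOnly(right) + " " + isNotRightOnly(left));
        System.out.println(isNotEitherSide(right) + " " + isNotEitherSide(left));
    }
}
```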
    
    
From dean.long at oracle.com  Wed Jul 29 23:48:23 2020
From: dean.long at oracle.com (Dean Long)
Date: Wed, 29 Jul 2020 16:48:23 -0700
Subject: Adding an Intrinsic for MD5
In-Reply-To: <5445fe9c-f113-be29-3835-e342e10623db@oracle.com>
References: <MWHPR21MB051125F36042FCA5A9A9C1B5B0700@MWHPR21MB0511.namprd21.prod.outlook.com>
 <MWHPR21MB051175BA98E737205A5E4665B0700@MWHPR21MB0511.namprd21.prod.outlook.com>
 <5445fe9c-f113-be29-3835-e342e10623db@oracle.com>
Message-ID: <fdb2aa51-8bbd-5cc7-0b26-23de0d8c558a@oracle.com>

Does this cmp have the lhs and rhs reversed?

dl

On 7/29/20 4:10 PM, Vladimir Ivanov wrote:
> 172 cmpptr(rsi, limit); 


From igor.ignatyev at oracle.com  Wed Jul 29 23:56:17 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Wed, 29 Jul 2020 16:56:17 -0700
Subject: RFR(T) : 8250797 : remove CompileReason::Reason_CTW
In-Reply-To: <27CCBBD2-A2DE-40AA-A894-BB257EFF95B5@oracle.com>
References: <87C23499-6F1E-4B89-BFF6-CC9269C1C279@oracle.com>
 <27CCBBD2-A2DE-40AA-A894-BB257EFF95B5@oracle.com>
Message-ID: <75838403-3E5D-4A7F-87A7-A8A6B905D095@oracle.com>

Vladimir, Katya,

Thanks for your reviews, pushed.

-- Igor

> On Jul 29, 2020, at 3:45 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
> Good.
> 
> Thanks
> Vladimir
> 
>> On Jul 29, 2020, at 3:24 PM, Igor Ignatyev <igor.ignatyev at oracle.com> wrote:
>> 
>> oopsie, removed, http://cr.openjdk.java.net/~iignatyev//8250797/webrev.01 :
>> 
>>> diff -r e5afd04596e7 src/hotspot/share/compiler/compileTask.hpp
>>> --- a/src/hotspot/share/compiler/compileTask.hpp        Wed Jul 29 15:02:31 2020 -0700
>>> +++ b/src/hotspot/share/compiler/compileTask.hpp        Wed Jul 29 15:02:53 2020 -0700
>>> @@ -133,7 +133,6 @@
>>>  bool         should_wait_for_compilation() const {
>>>    // Wait for blocking compilation to finish.
>>>    switch (_compile_reason) {
>>> -        case Reason_CTW:
>>>        case Reason_Replay:
>>>        case Reason_Whitebox:
>>>        case Reason_Bootstrap:
>> 
>> thanks for noticing. for luck, I've started builds-tier1 job.
>> 
>> -- Igor
>> 
>>> On Jul 29, 2020, at 1:54 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>>> 
>>> Igor,
>>> 
>>> You missed reference in should_wait_for_compilation().
>>> 
>>> Thanks,
>>> Vladimir K
>>> 
>>>> On 7/29/20 12:34 PM, Igor Ignatyev wrote:
>>>> http://cr.openjdk.java.net/~iignatyev//8250797/webrev.00
>>>>> 5 lines changed: 0 ins; 4 del; 1 mod;
>>>> Hi all,
>>>> could you please review this patch?
>>>> from JBS:
>>>>> "native" CTW has been removed by JDK-8213812 (JDK-8214917), so CompileReason::Reason_CTW isn't used anymore and should be removed.
>>>> besides removing CompileReason::Reason_CTW and corresponding element from reason_names[] array, the patch also updates the comment for CompileReason as CompileTask::can_become_stale doesn't really depend on the order.
>>>> webrev:
>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8250797
>>>> -- Igor
>> 
> 


From ningsheng.jian at arm.com  Thu Jul 30 01:53:07 2020
From: ningsheng.jian at arm.com (Ningsheng Jian)
Date: Thu, 30 Jul 2020 09:53:07 +0800
Subject: [aarch64-port-dev ] RFR (XXL): 8223347: Integration of Vector API
 (Incubator): AArch64 backend changes
In-Reply-To: <852a3a09-a627-c0fc-89c6-8c8100ae17f5@redhat.com>
References: <VI1PR0802MB2558C90ACF1189F17692FF278EB90@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <275eb57c-51c0-675e-c32a-91b198023559@redhat.com>
 <719F9169-ABC4-408E-B732-F1BD9A84337F@oracle.com>
 <9a13f5df-d946-579d-4282-917dc7338dc8@redhat.com>
 <09BC0693-80E0-4F87-855E-0B38A6F5EFA2@oracle.com>
 <668e500e-f621-5a2c-a41e-f73536880f73@redhat.com>
 <VI1PR0802MB255845256424F9EB4D5D81368EB30@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <1909fa9d-98bb-c2fb-45d8-540247d1ca8b@redhat.com>
 <VI1PR0802MB255895ADE9BC8D027EBB81048EB10@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <BYAPR11MB35438110FA50851942CE17A1EF950@BYAPR11MB3543.namprd11.prod.outlook.com>
 <HE1PR0802MB25542AE212E12FD3EF6D1BE48E6E0@HE1PR0802MB2554.eurprd08.prod.outlook.com>
 <2acbcc99-8dd4-b8f1-5982-1d439953c416@redhat.com>
 <VI1PR0802MB2558ECC61528694DBDFFF48F8E670@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <54d6b2b6-b79a-4700-981c-6ab33aca82f2@arm.com>
 <852a3a09-a627-c0fc-89c6-8c8100ae17f5@redhat.com>
Message-ID: <564c8283-0c8f-9487-af3c-c971fa6b736d@arm.com>

On 7/29/20 7:44 PM, Andrew Haley wrote:
> On 20/07/2020 04:51, Ningsheng Jian wrote:
>> Since we are getting ready to propose Vector API target to JDK 16 [1]. I
>> have regenerated webrev of aarch64 backend parts from panama repo, which
>> has been rebased to jdk/jdk very recently, by:
>>
>> $ hg update vector-unstable && hg diff -r default > all.patch
>> $ grep "diff -r" all.patch | grep -e "src/hotspot/cpu/aarch64" | awk
>> '{print $4}' > aarch64_list
>> $ ksh ./webrev.ksh -r default -o aarch64_webrev aarch64_list
>>
>> The new webrev:
>> http://cr.openjdk.java.net/~njian/vectorapi/8223347-integration/aarch64-webrev.01/
>>
>> Could you please help to take a look?
> 
> OK, thanks. It all looks fine. Sorry for the delay.
> 
Thank you Andrew!

Regards,
Ningsheng

From ningsheng.jian at arm.com  Thu Jul 30 06:22:35 2020
From: ningsheng.jian at arm.com (Ningsheng Jian)
Date: Thu, 30 Jul 2020 14:22:35 +0800
Subject: [aarch64-port-dev ] RFR(L): 8231441: AArch64: Initial SVE backend
 support
In-Reply-To: <707df21c-849d-ac9d-0ab2-61a30d1354f9@arm.com>
References: <bce2556a-b709-d6d1-ffa2-655faaf56787@arm.com>
 <42fca25d-7172-b4f3-335b-92e2b05e8195@arm.com>
 <707df21c-849d-ac9d-0ab2-61a30d1354f9@arm.com>
Message-ID: <5f6dc64c-f51a-50d4-995f-ed0c7a7724e8@arm.com>

Hi,

Pengfei helped to review the patch offline and found that some 
multiply-add/sub and popcount match rules are missing for SVE. Added in 
the new webrev. Thanks to Pengfei!

New webrev:
http://cr.openjdk.java.net/~njian/8231441/webrev.03

Incremental changes:
http://cr.openjdk.java.net/~njian/8231441/webrev.03-vs-02/

Split parts:

1) SVE feature detection:
http://cr.openjdk.java.net/~njian/8231441/webrev.03-feature

2) c2 register allocation:
http://cr.openjdk.java.net/~njian/8231441/webrev.03-ra

3) SVE c2 backend:
http://cr.openjdk.java.net/~njian/8231441/webrev.03-c2

Thanks,
Ningsheng

On 7/21/20 2:05 PM, Ningsheng Jian wrote:
> [Ping]
> 
> Could anyone please help to review this patch, especially for the c2
> register allocation part?
> 
> JBS: https://bugs.openjdk.java.net/browse/JDK-8231441
> 
> The latest webrev:
> http://cr.openjdk.java.net/~njian/8231441/webrev.02
> 
> In the latest webrev, we block one predicate register (p7) with all
> elements preset to TRUE, so that c2 compiled code can use it freely to
> generate instructions for unpredicated operations.
> 
> And the split parts:
> 
> 1) SVE feature detection:
> http://cr.openjdk.java.net/~njian/8231441/webrev.02-feature
> 
> 2) c2 register allocation:
> http://cr.openjdk.java.net/~njian/8231441/webrev.02-ra
> 
> 3) SVE c2 backend:
> http://cr.openjdk.java.net/~njian/8231441/webrev.02-c2
> 
> The initial RFR which has some descriptions of the patch:
> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-March/037628.html
> 
> The description can also be found at:
> http://cr.openjdk.java.net/~njian/8231441/README-RFR.txt
> 
> Notes to verify the patch on QEMU user emulation, with an example of
> compiled code:
> http://cr.openjdk.java.net/~njian/8231441/running-sve-in-qemu-user.txt
> 
> Thanks,
> Ningsheng
> 
> 
> On 5/27/20 3:23 PM, Ningsheng Jian wrote:
>> Hi,
>>
>> I have rebased this patch with some more comments added. And also
>> relaxed the instruction matching conditions for 128-bit vector.
>>
>> I would appreciate if someone could help to review this.
>>
>> Whole patch:
>> http://cr.openjdk.java.net/~njian/8231441/webrev.01
>>
>> Different parts of changes:
>>
>> 1) SVE feature detection
>> http://cr.openjdk.java.net/~njian/8231441/webrev.01-feature
>>
>> 2) c2 register allocation
>> http://cr.openjdk.java.net/~njian/8231441/webrev.01-ra
>>
>> 3) SVE c2 backend
>> http://cr.openjdk.java.net/~njian/8231441/webrev.01-c2
>>
>> (Or should I split this into different JBS?)
>>
>> Thanks,
>> Ningsheng
>>
>> On 3/25/20 2:37 PM, Ningsheng Jian wrote:
>>> Hi,
>>>
>>> Could you please help to review this patch adding AArch64 SVE support?
>>> It also touches c2 compiler shared code.
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231441
>>> Webrev: http://cr.openjdk.java.net/~njian/8231441/webrev.00
>>>
>>> Arm has released new vector ISA extension for AArch64, SVE [1] and
>>> SVE2 [2]. This patch adds the initial SVE support in OpenJDK. In this
>>> patch we have:
>>>
>>> 1) SVE feature enablement and detection
>>> 2) SVE vector register allocation support with initial predicate
>>> register definition
>>> 3) SVE c2 backend for current SLP based vectorizer. (We also have a POC
>>> patch of a new vectorizer using SVE predicate-driven loop control, but
>>> that's still under development.)
>>>
>>> SVE register definition
>>> =======================
>>> Unlike other SIMD architectures, SVE allows hardware implementations to
>>> choose a vector register length between 128 and 2048 bits, in multiples of 128
>>> bits. So we introduce a new vector type VectorA, i.e. length agnostic
>>> (scalable) vector type, and Op_VecA for machine vectora register. In the
>>> meantime, to minimize register allocation code changes, we also take
>>> advantage of one JIT compiler aspect, that is during the compile time we
>>> actually know the real hardware SVE vector register size of current
>>> running machine. So, the register allocator actually knows how many
>>> register slots an Op_VecA ideal reg requires, and could work fine
>>> without much modification.
>>>
>>> Since the bottom 128 bits are shared with the NEON, we extend current
>>> register mask definition of V0-V31 registers. Currently, c2 uses one bit
>>> mask for a 32-bit register slot, so to define at most 2048 bits we will
>>> need to add 64 slots in AD file. That's a really large number, and will
>>> also break current regmask assumption. Considering the SVE vector
>>> register is architecturally scalable for different sizes, we just define
>>> double of original NEON vector register slots, i.e. 8 slots: Vx, Vx_H,
>>> Vx_J ... Vx_O. After adlc, the generated register masks now look like:
>>>
>>> const RegMask _VECTORA_REG_mask( 0x0, 0x0, 0xffffffff, 0xffffffff,
>>> 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, ...
>>>
>>> const RegMask _VECTORD_REG_mask( 0x0, 0x0, 0x3030303, 0x3030303,
>>> 0x3030303, 0x3030303, 0x3030303, 0x3030303, ...
>>>
>>> const RegMask _VECTORX_REG_mask( 0x0, 0x0, 0xf0f0f0f, 0xf0f0f0f,
>>> 0xf0f0f0f, 0xf0f0f0f, 0xf0f0f0f, 0xf0f0f0f, ...
>>>
>>> And we use SlotsPerVecA to indicate regmask bit size for a VecA register.
>>>
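
The slot counts and mask words quoted above can be reproduced with a short calculation. This is a sketch under stated assumptions, not adlc output: it assumes each vector register owns 8 consecutive mask bits (Vx, Vx_H, Vx_J ... Vx_O), that VecD uses the low 2 slots and VecX the low 4, and that one 32-bit mask word therefore covers 4 registers. The class and method names are invented.

```java
class SveRegMaskSketch {
    static final int SLOT_BITS = 32;          // one mask bit per 32-bit register slot
    static final int SLOTS_PER_REGISTER = 8;  // Vx, Vx_H, Vx_J ... Vx_O

    // Number of 32-bit slots needed to describe a vector of the given width.
    static int slotsFor(int vectorBits) {
        return vectorBits / SLOT_BITS;
    }

    // One 32-bit mask word in which every register has its low `slotsUsed`
    // slots set. Valid for slotsUsed in 1..8.
    static int maskWord(int slotsUsed) {
        int perRegister = (1 << slotsUsed) - 1;
        int word = 0;
        for (int reg = 0; reg < 32 / SLOTS_PER_REGISTER; reg++) {
            word |= perRegister << (reg * SLOTS_PER_REGISTER);
        }
        return word;
    }

    public static void main(String[] args) {
        System.out.println("slots for 2048 bits: " + slotsFor(2048));
        System.out.println("slots for 128 bits:  " + slotsFor(128));
        System.out.println("VecD word: 0x" + Integer.toHexString(maskWord(2)));
        System.out.println("VecX word: 0x" + Integer.toHexString(maskWord(4)));
        System.out.println("VecA word: 0x" + Integer.toHexString(maskWord(8)));
    }
}
```

Under these assumptions the three words come out as 0x3030303, 0xf0f0f0f and 0xffffffff, matching the masks in the RFR, and a full 2048-bit description would need 64 slots per register instead of 8.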
>>> Although for physical register allocation, register allocator does not
>>> need to know the real VecA register size, while doing spill/unspill,
>>> current register allocation needs to know actual stack slot size to
>>> store/load VecA registers. SVE is able to do vector size agnostic
>>> spilling, but to minimize the code changes, as I mentioned before, we
>>> just let RA know the actual vector register size in current running
>>> machine, by calling scalable_vector_reg_size().
>>>
>>> In the meantime, since some vector operations do not have unpredicated
>>> SVE1 instructions, but only predicate version, e.g. vector multiply,
>>> vector load/store. We have also defined predicate registers in this
>>> patch, and c2 register allocator will allocate a temp predicate register
>>> to fulfill the expecting unpredicated operations. And this can also be
>>> used for future predicate-driven vectorizer. This is not efficient for
>>> now, as we can see many ptrue instructions in the generated code. One
>>> possible solution I can see, is to block one predicate register, and
>>> preset it to all true. But to preserve/reinitialize a caller save
>>> register value cross calls seems risky to work in this patch. I decide
>>> to defer it to further optimization work. If anyone has any suggestions
>>> on this, I would appreciate it.
>>>
>>> SVE feature detection
>>> =====================
>>> Since we may have some compiled code based on the initial detected SVE
>>> vector register length and the compiled code is compiled only for that
>>> vector register length, we assume that the SVE vector register length
>>> will not be changed during the JVM lifetime. However, SVE vector length
>>> is per-thread and can be changed by system call [3], so we need to make
>>> sure that each jni call will not change the sve vector length.
>>>
>>> Currently, we verify the SVE vector register length on each JNI return,
>>> and if an SVE vector length change is detected, jvm simply reports error
>>> and stops running. The VM running vector length can also be set by
>>> existing VM option MaxVectorSize with c2 enabled. If MaxVectorSize is
>>> specified not the same as system default sve vector length (in
>>> /proc/sys/abi/sve_default_vector_length), JVM will set current process
>>> sve vector length to the specified vector length.
>>>
>>> Compiled code
>>> =============
>>> We have added all current c2 backend codegen on par with NEON, but only
>>> for vector length larger than 128-bit.
>>>
>>> On a 1024 bit SVE environment, for the following simple loop with int
>>> array element type:
>>>
>>>      for (int i = 0; i < LENGTH; i++) {
>>>        c[i] = a[i] + b[i];
>>>      }
>>>
>>> c2 generated loop:
>>>
>>>      0x0000ffff811c0820:   sbfiz   x11, x10, #2, #32
>>>      0x0000ffff811c0824:   add     x13, x18, x11
>>>      0x0000ffff811c0828:   add     x14, x1, x11
>>>      0x0000ffff811c082c:   add     x13, x13, #0x10
>>>      0x0000ffff811c0830:   add     x14, x14, #0x10
>>>      0x0000ffff811c0834:   add     x11, x0, x11
>>>      0x0000ffff811c0838:   add     x11, x11, #0x10
>>>      0x0000ffff811c083c:   ptrue   p1.s    // To be optimized
>>>      0x0000ffff811c0840:   ld1w    {z16.s}, p1/z, [x14]
>>>      0x0000ffff811c0844:   ptrue   p0.s
>>>      0x0000ffff811c0848:   ld1w    {z17.s}, p0/z, [x13]
>>>      0x0000ffff811c084c:   add     z16.s, z17.s, z16.s
>>>      0x0000ffff811c0850:   ptrue   p1.s
>>>      0x0000ffff811c0854:   st1w    {z16.s}, p1, [x11]
>>>      0x0000ffff811c0858:   add     w10, w10, #0x20
>>>      0x0000ffff811c085c:   cmp     w10, w12
>>>      0x0000ffff811c0860:   b.lt    0x0000ffff811c0820
>>>
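
Two details of the listing above can be checked by hand: the index advances by 0x20 per iteration because a 1024-bit vector holds 32 int lanes, and `sbfiz x11, x10, #2, #32` scales the sign-extended index by 4 to get a byte offset. The sketch below is a scalar emulation for illustration only; the names are invented and it is not what C2 emits.

```java
class SveLoopStride {
    // Elements processed per vector operation.
    static int lanes(int vectorBits, int elementBits) {
        return vectorBits / elementBits;
    }

    // Byte offset of element `index`, as computed by sbfiz ..., #2, #32 for ints.
    static long byteOffset(int index) {
        return ((long) index) << 2;
    }

    // Scalar emulation of a strip-mined loop: full blocks of `lanes` elements,
    // then a scalar tail for the remainder.
    static void addStripMined(int[] a, int[] b, int[] c, int lanes) {
        int i = 0;
        for (; i + lanes <= c.length; i += lanes) {
            for (int l = 0; l < lanes; l++) {
                c[i + l] = a[i + l] + b[i + l];
            }
        }
        for (; i < c.length; i++) {
            c[i] = a[i] + b[i];
        }
    }

    public static void main(String[] args) {
        int n = 100;
        int[] a = new int[n];
        int[] b = new int[n];
        int[] c = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = i;
            b[i] = 2 * i;
        }
        addStripMined(a, b, c, lanes(1024, 32));
        System.out.println("lanes = " + lanes(1024, 32) + ", c[99] = " + c[99]);
    }
}
```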
>>> Test
>>> ====
>>> Currently, we don't have real hardware to verify SVE features (and
>>> performance). But we have run jtreg tests with SVE in some emulators. On
>>> QEMU system emulator, which has SVE emulation support, jtreg tier1-3
>>> passed with different vector sizes. We've also verified it with full
>>> jtreg tests without SVE on both x86 and AArch64, to make sure that
>>> there's no regression.
>>>
>>> The patch has also been applied to Vector API code base, and verified on
>>> emulator. In Vector API, there are more vector related tests and is more
>>> possible to generate vector instructions by intrinsification.
>>>
>>> A simple test can also run in QEMU user emulation, e.g.
>>>
>>> $ qemu-aarch64 -cpu max,sve-max-vq=2 java -XX:UseSVE=1 SIMD
>>>
>>> (
>>> To run it in user emulation mode, we will need to bypass SVE feature
>>> detection code in this patch. E.g. apply:
>>> http://cr.openjdk.java.net/~njian/8231441/user-emulation.patch
>>> )
>>>
>>> Others
>>> ======
>>> Since this patch is a bit large, I've also split it into 3 parts, for
>>> easy review:
>>>
>>> 1) SVE feature detection
>>> http://cr.openjdk.java.net/~njian/8231441/webrev.00-feature
>>>
>>> 2) c2 register allocation
>>> http://cr.openjdk.java.net/~njian/8231441/webrev.00-ra
>>>
>>> 3) SVE c2 backend
>>> http://cr.openjdk.java.net/~njian/8231441/webrev.00-c2
>>>
>>> Part of this patch has been contributed by Joshua Zhu and Yang Zhang.
>>>
>>> Refs
>>> ====
>>> [1] https://developer.arm.com/docs/ddi0584/latest
>>> [2] https://developer.arm.com/docs/ddi0602/latest
>>> [3] https://www.kernel.org/doc/Documentation/arm64/sve.txt
>>>
>>> Thanks,
>>> Ningsheng
>>>
>>
> 


From Pengfei.Li at arm.com  Thu Jul 30 06:59:23 2020
From: Pengfei.Li at arm.com (Pengfei Li)
Date: Thu, 30 Jul 2020 06:59:23 +0000
Subject: [aarch64-port-dev ] RFR(L): 8231441: AArch64: Initial SVE backend
 support
In-Reply-To: <5f6dc64c-f51a-50d4-995f-ed0c7a7724e8@arm.com>
References: <bce2556a-b709-d6d1-ffa2-655faaf56787@arm.com>
 <42fca25d-7172-b4f3-335b-92e2b05e8195@arm.com>
 <707df21c-849d-ac9d-0ab2-61a30d1354f9@arm.com>
 <5f6dc64c-f51a-50d4-995f-ed0c7a7724e8@arm.com>
Message-ID: <DB8PR08MB496950AD870476C7C59DB80A96710@DB8PR08MB4969.eurprd08.prod.outlook.com>

Hi,

To help review the large ad file changes in the AArch64 backend, I created some jtreg tests that check whether the expected SVE/NEON instructions are generated for each C2 vector node.

I've uploaded my jtreg tests at http://cr.openjdk.java.net/~pli/rfr/8231441/jtreg.webrev.00/. I hope they are useful for other reviewers.

--
Thanks,
Pengfei

> -----Original Message-----
> From: Ningsheng Jian <ningsheng.jian at arm.com>
> Sent: Thursday, July 30, 2020 14:23
> To: hotspot-compiler-dev at openjdk.java.net; Pengfei Li
> <Pengfei.Li at arm.com>; Vladimir Kozlov <vladimir.kozlov at oracle.com>;
> Vladimir Ivanov <vladimir.x.ivanov at oracle.com>; Andrew Haley
> <aph at redhat.com>
> Cc: aarch64-port-dev at openjdk.java.net
> Subject: Re: [aarch64-port-dev ] RFR(L): 8231441: AArch64: Initial SVE
> backend support
> 
> Hi,
> 
> Pengfei helped to review the patch offline and found that some multiply-
> add/sub and popcount match rules are missing for SVE. Added in the new
> webrev. Thanks to Pengfei!
> 
> New webrev:
> http://cr.openjdk.java.net/~njian/8231441/webrev.03
> 
> Incremental changes:
> http://cr.openjdk.java.net/~njian/8231441/webrev.03-vs-02/
> 
> Split parts:
> 
> 1) SVE feature detection:
> http://cr.openjdk.java.net/~njian/8231441/webrev.03-feature
> 
> 2) c2 register allocation:
> http://cr.openjdk.java.net/~njian/8231441/webrev.03-ra
> 
> 3) SVE c2 backend:
> http://cr.openjdk.java.net/~njian/8231441/webrev.03-c2
> 
> Thanks,
> Ningsheng
> 
> On 7/21/20 2:05 PM, Ningsheng Jian wrote:
> > [Ping]
> >
> > Could anyone please help to review this patch, especially for the c2
> > register allocation part?
> >
> > JBS: https://bugs.openjdk.java.net/browse/JDK-8231441
> >
> > The latest webrev:
> > http://cr.openjdk.java.net/~njian/8231441/webrev.02
> >
> > In the latest webrev, we block one predicate register (p7) with all
> > elements preset to TRUE, so that c2 compiled code can use it freely to
> > generate instructions for unpredicated operations.
> >
> > And the split parts:
> >
> > 1) SVE feature detection:
> > http://cr.openjdk.java.net/~njian/8231441/webrev.02-feature
> >
> > 2) c2 register allocation:
> > http://cr.openjdk.java.net/~njian/8231441/webrev.02-ra
> >
> > 3) SVE c2 backend:
> > http://cr.openjdk.java.net/~njian/8231441/webrev.02-c2
> >
> > The initial RFR which has some descriptions of the patch:
> > http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-March
> > /037628.html
> >
> > The description can also be found at:
> > http://cr.openjdk.java.net/~njian/8231441/README-RFR.txt
> >
> > Notes to verify the patch on QEMU user emulation, with an example of
> > compiled code:
> > http://cr.openjdk.java.net/~njian/8231441/running-sve-in-qemu-user.txt
> >
> > Thanks,
> > Ningsheng
> >
> >
> > On 5/27/20 3:23 PM, Ningsheng Jian wrote:
> >> Hi,
> >>
> >> I have rebased this patch with some more comments added. And also
> >> relaxed the instruction matching conditions for 128-bit vector.
> >>
> >> I would appreciate if someone could help to review this.
> >>
> >> Whole patch:
> >> http://cr.openjdk.java.net/~njian/8231441/webrev.01
> >>
> >> Different parts of changes:
> >>
> >> 1) SVE feature detection
> >> http://cr.openjdk.java.net/~njian/8231441/webrev.01-feature
> >>
> >> 2) c2 register allocation
> >> http://cr.openjdk.java.net/~njian/8231441/webrev.01-ra
> >>
> >> 3) SVE c2 backend
> >> http://cr.openjdk.java.net/~njian/8231441/webrev.01-c2
> >>
> >> (Or should I split this into different JBS?)
> >>
> >> Thanks,
> >> Ningsheng
> >>
> >> On 3/25/20 2:37 PM, Ningsheng Jian wrote:
> >>> Hi,
> >>>
> >>> Could you please help to review this patch adding AArch64 SVE support?
> >>> It also touches c2 compiler shared code.
> >>>
> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231441
> >>> Webrev: http://cr.openjdk.java.net/~njian/8231441/webrev.00
> >>>
> >>> Arm has released new vector ISA extension for AArch64, SVE [1] and
> >>> SVE2 [2]. This patch adds the initial SVE support in OpenJDK. In
> >>> this patch we have:
> >>>
> >>> 1) SVE feature enablement and detection
> >>> 2) SVE vector register allocation support with initial predicate
> >>> register definition
> >>> 3) SVE c2 backend for current SLP based vectorizer. (We also have a
> >>> POC patch of a new vectorizer using SVE predicate-driven loop
> >>> control, but that's still under development.)
> >>>
> >>> SVE register definition
> >>> =======================
> >>> Unlike other SIMD architectures, SVE allows hardware implementations
> >>> to choose a vector register length between 128 and 2048 bits, in multiples
> >>> of 128 bits. So we introduce a new vector type VectorA, i.e. length
> >>> agnostic
> >>> (scalable) vector type, and Op_VecA for machine vectora register. In
> >>> the meantime, to minimize register allocation code changes, we also
> >>> take advantage of one JIT compiler aspect, that is during the
> >>> compile time we actually know the real hardware SVE vector register
> >>> size of current running machine. So, the register allocator actually
> >>> knows how many register slots an Op_VecA ideal reg requires, and
> >>> could work fine without much modification.
> >>>
> >>> Since the bottom 128 bits are shared with the NEON, we extend
> >>> current register mask definition of V0-V31 registers. Currently, c2
> >>> uses one bit mask for a 32-bit register slot, so to define at most
> >>> 2048 bits we will need to add 64 slots in AD file. That's a really
> >>> large number, and will also break current regmask assumption.
> >>> Considering the SVE vector register is architecturally scalable for
> >>> different sizes, we just define double of original NEON vector
> >>> register slots, i.e. 8 slots: Vx, Vx_H, Vx_J ... Vx_O. After adlc, the
> >>> generated register masks now look like:
> >>>
> >>> const RegMask _VECTORA_REG_mask( 0x0, 0x0, 0xffffffff, 0xffffffff,
> >>> 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, ...
> >>>
> >>> const RegMask _VECTORD_REG_mask( 0x0, 0x0, 0x3030303, 0x3030303,
> >>> 0x3030303, 0x3030303, 0x3030303, 0x3030303, ...
> >>>
> >>> const RegMask _VECTORX_REG_mask( 0x0, 0x0, 0xf0f0f0f, 0xf0f0f0f,
> >>> 0xf0f0f0f, 0xf0f0f0f, 0xf0f0f0f, 0xf0f0f0f, ...
> >>>
> >>> And we use SlotsPerVecA to indicate regmask bit size for a VecA register.
> >>>
> >>> Although for physical register allocation, register allocator does
> >>> not need to know the real VecA register size, while doing
> >>> spill/unspill, current register allocation needs to know actual
> >>> stack slot size to store/load VecA registers. SVE is able to do
> >>> vector size agnostic spilling, but to minimize the code changes, as
> >>> I mentioned before, we just let RA know the actual vector register
> >>> size in current running machine, by calling scalable_vector_reg_size().
> >>>
> >>> In the meantime, since some vector operations do not have
> >>> unpredicated
> >>> SVE1 instructions, but only predicate version, e.g. vector multiply,
> >>> vector load/store. We have also defined predicate registers in this
> >>> patch, and c2 register allocator will allocate a temp predicate
> >>> register to fulfill the expecting unpredicated operations. And this
> >>> can also be used for future predicate-driven vectorizer. This is not
> >>> efficient for now, as we can see many ptrue instructions in the
> >>> generated code. One possible solution I can see, is to block one
> >>> predicate register, and preset it to all true. But to
> >>> preserve/reinitialize a caller save register value cross calls seems
> >>> risky to work in this patch. I decide to defer it to further
> >>> optimization work. If anyone has any suggestions on this, I would
> >>> appreciate it.
> >>>
> >>> SVE feature detection
> >>> =====================
> >>> Since we may have some compiled code based on the initial detected
> >>> SVE vector register length and the compiled code is compiled only
> >>> for that vector register length, we assume that the SVE vector
> >>> register length will not be changed during the JVM lifetime.
> >>> However, SVE vector length is per-thread and can be changed by
> >>> system call [3], so we need to make sure that each JNI call will not change
> >>> the SVE vector length.
> >>>
> >>> Currently, we verify the SVE vector register length on each JNI
> >>> return, and if an SVE vector length change is detected, jvm simply
> >>> reports error and stops running. The VM running vector length can
> >>> also be set by existing VM option MaxVectorSize with c2 enabled. If
> >>> MaxVectorSize is specified not the same as system default sve vector
> >>> length (in /proc/sys/abi/sve_default_vector_length), JVM will set
> >>> current process sve vector length to the specified vector length.
> >>>
> >>> Compiled code
> >>> =============
> >>> We have added all current c2 backend codegen on par with NEON, but
> >>> only for vector length larger than 128-bit.
> >>>
> >>> On a 1024 bit SVE environment, for the following simple loop with
> >>> int array element type:
> >>>
> >>>     for (int i = 0; i < LENGTH; i++) {
> >>>       c[i] = a[i] + b[i];
> >>>     }
> >>>
> >>> c2 generated loop:
> >>>
> >>>     0x0000ffff811c0820:   sbfiz   x11, x10, #2, #32
> >>>     0x0000ffff811c0824:   add     x13, x18, x11
> >>>     0x0000ffff811c0828:   add     x14, x1, x11
> >>>     0x0000ffff811c082c:   add     x13, x13, #0x10
> >>>     0x0000ffff811c0830:   add     x14, x14, #0x10
> >>>     0x0000ffff811c0834:   add     x11, x0, x11
> >>>     0x0000ffff811c0838:   add     x11, x11, #0x10
> >>>     0x0000ffff811c083c:   ptrue   p1.s    // To be optimized
> >>>     0x0000ffff811c0840:   ld1w    {z16.s}, p1/z, [x14]
> >>>     0x0000ffff811c0844:   ptrue   p0.s
> >>>     0x0000ffff811c0848:   ld1w    {z17.s}, p0/z, [x13]
> >>>     0x0000ffff811c084c:   add     z16.s, z17.s, z16.s
> >>>     0x0000ffff811c0850:   ptrue   p1.s
> >>>     0x0000ffff811c0854:   st1w    {z16.s}, p1, [x11]
> >>>     0x0000ffff811c0858:   add     w10, w10, #0x20
> >>>     0x0000ffff811c085c:   cmp     w10, w12
> >>>     0x0000ffff811c0860:   b.lt    0x0000ffff811c0820
> >>>
> >>> Test
> >>> ====
> >>> Currently, we don't have real hardware to verify SVE features (and
> >>> performance). But we have run jtreg tests with SVE in some
> >>> emulators. On QEMU system emulator, which has SVE emulation support,
> >>> jtreg tier1-3 passed with different vector sizes. We've also
> >>> verified it with full jtreg tests without SVE on both x86 and
> >>> AArch64, to make sure that there's no regression.
> >>>
> >>> The patch has also been applied to Vector API code base, and
> >>> verified on emulator. In Vector API, there are more vector related
> >>> tests, and it is more likely to generate vector instructions by
> >>> intrinsification.
> >>>
> >>> A simple test can also run in QEMU user emulation, e.g.
> >>>
> >>> $ qemu-aarch64 -cpu max,sve-max-vq=2 java -XX:UseSVE=1 SIMD
> >>>
> >>> (
> >>> To run it in user emulation mode, we will need to bypass SVE feature
> >>> detection code in this patch. E.g. apply:
> >>> http://cr.openjdk.java.net/~njian/8231441/user-emulation.patch
> >>> )
> >>>
> >>> Others
> >>> ======
> >>> Since this patch is a bit large, I've also split it into 3 parts,
> >>> for easy review:
> >>>
> >>> 1) SVE feature detection
> >>> http://cr.openjdk.java.net/~njian/8231441/webrev.00-feature
> >>>
> >>> 2) c2 register allocation
> >>> http://cr.openjdk.java.net/~njian/8231441/webrev.00-ra
> >>>
> >>> 3) SVE c2 backend
> >>> http://cr.openjdk.java.net/~njian/8231441/webrev.00-c2
> >>>
> >>> Part of this patch has been contributed by Joshua Zhu and Yang Zhang.
> >>>
> >>> Refs
> >>> ====
> >>> [1] https://developer.arm.com/docs/ddi0584/latest
> >>> [2] https://developer.arm.com/docs/ddi0602/latest
> >>> [3] https://www.kernel.org/doc/Documentation/arm64/sve.txt
> >>>
> >>> Thanks,
> >>> Ningsheng
> >>>
> >>
> >
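
For readers following the register mask discussion in the quoted RFR,
the slot arithmetic can be checked with a small Java sketch. It is only
an illustration under stated assumptions: the names SlotMaskSketch,
slotsFor and maskWord are hypothetical and not part of the patch, and the
layout (each register class occupying the lowest slots of an 8-slot
register) is inferred from the quoted mask values rather than taken from
the AD file.

```java
// Hypothetical helper, not HotSpot code: reproduces the numbers in the RFR.
final class SlotMaskSketch {
    static final int BITS_PER_SLOT = 32;  // c2 uses one mask bit per 32-bit slot
    static final int SLOTS_PER_REG = 8;   // Vx, Vx_H, Vx_J ... Vx_O

    // Number of 32-bit slots needed to describe a register of the given width.
    static int slotsFor(int registerBits) {
        return registerBits / BITS_PER_SLOT;
    }

    // One 32-bit mask word covering four registers, where each register
    // contributes its lowest usedSlots slots (assumed layout).
    static int maskWord(int usedSlots) {
        int perRegister = (1 << usedSlots) - 1;
        int word = 0;
        for (int shift = 0; shift < 32; shift += SLOTS_PER_REG) {
            word |= perRegister << shift;
        }
        return word;
    }

    public static void main(String[] args) {
        System.out.println("slots for 2048 bits: " + slotsFor(2048));
        System.out.println("slots for 128 bits:  " + slotsFor(128));
        System.out.printf("VecD word: 0x%x%n", maskWord(2));
        System.out.printf("VecX word: 0x%x%n", maskWord(4));
        System.out.printf("VecA word: 0x%x%n", maskWord(8));
        System.out.println("int lanes at 1024 bits: " + slotsFor(1024));
    }
}
```

Under these assumptions maskWord(2), maskWord(4) and maskWord(8) give
0x3030303, 0xf0f0f0f and 0xffffffff, which matches the quoted
_VECTORD_REG_mask, _VECTORX_REG_mask and _VECTORA_REG_mask words. The
same division gives 1024 / 32 = 32 int lanes, consistent with the
"add w10, w10, #0x20" loop increment in the quoted disassembly.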


From ningsheng.jian at arm.com  Thu Jul 30 08:13:23 2020
From: ningsheng.jian at arm.com (Ningsheng Jian)
Date: Thu, 30 Jul 2020 16:13:23 +0800
Subject: [aarch64-port-dev ] RFR(L): 8231441: AArch64: Initial SVE backend
 support
In-Reply-To: <DB8PR08MB496950AD870476C7C59DB80A96710@DB8PR08MB4969.eurprd08.prod.outlook.com>
References: <bce2556a-b709-d6d1-ffa2-655faaf56787@arm.com>
 <42fca25d-7172-b4f3-335b-92e2b05e8195@arm.com>
 <707df21c-849d-ac9d-0ab2-61a30d1354f9@arm.com>
 <5f6dc64c-f51a-50d4-995f-ed0c7a7724e8@arm.com>
 <DB8PR08MB496950AD870476C7C59DB80A96710@DB8PR08MB4969.eurprd08.prod.outlook.com>
Message-ID: <5fdc66e4-da99-f9bf-f656-a85a0e9e7b00@arm.com>

On 7/30/20 2:59 PM, Pengfei Li wrote:
> Hi,
> 
> To help reviewing the large ad file changes in the AArch64 backend, I created some jtreg tests checking if expected SVE/NEON instructions are correctly generated for each C2 vectornode.
> 
> I've uploaded my jtreg at http://cr.openjdk.java.net/~pli/rfr/8231441/jtreg.webrev.00/. Hope it would be useful for other reviewers.
>
Thanks! I like the idea of Opto checker. That would be helpful to the 
SVE vectorization work.

Regards,
Ningsheng

From jatin.bhateja at intel.com  Thu Jul 30 10:55:22 2020
From: jatin.bhateja at intel.com (Bhateja, Jatin)
Date: Thu, 30 Jul 2020 10:55:22 +0000
Subject: RFR[S] : 8248830 : C2 : Rotate API intrinsification for X86
In-Reply-To: <5f6a3e52-7854-4613-43f1-32a7423a0db6@oracle.com>
References: <MWHPR11MB1614EAFF216144FE6EAE68F9E87F0@MWHPR11MB1614.namprd11.prod.outlook.com>
 <92d97d1b-fc53-e368-b249-1cab7db33964@oracle.com>
 <MWHPR11MB1614CB6E26028AC98DAA7F30E8790@MWHPR11MB1614.namprd11.prod.outlook.com>
 <dd691913-d9c7-2657-905f-4f3df50f6bb4@oracle.com>
 <MWHPR11MB1614E047E14386D3B51EA3A9E8700@MWHPR11MB1614.namprd11.prod.outlook.com>
 <e0a75968-936f-97df-5693-f1e3275824e9@oracle.com>
 <5f6a3e52-7854-4613-43f1-32a7423a0db6@oracle.com>
Message-ID: <MWHPR11MB1614B0D4523E65CF9876E72DE8710@MWHPR11MB1614.namprd11.prod.outlook.com>

Hi Vladimir,

> So, it makes sense to keep the transformations. But I'm fine with
> addressing that as a followup enhancement.

Updated patch placed at following link

http://cr.openjdk.java.net/~jbhateja/8248830/webrev.05/

test-tier1 shows no surprises.

Have submitted the patch to submit-repo for testing:

http://hg.openjdk.java.net/jdk/submit/rev/3ed477bb24a7

Best Regards,
Jatin

> -----Original Message-----
> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
> Sent: Thursday, July 30, 2020 3:30 AM
> To: Bhateja, Jatin <jatin.bhateja at intel.com>
> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>;
> hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR[S] : 8248830 : C2 : Rotate API intrinsification for X86
> 
> 
> >> http://cr.openjdk.java.net/~jbhateja/8248830/webrev.04/
> >
> > Looks good. (Testing is in progress.)
> 
> FYI test results are clean (tier1-tier5).
> 
> >> I have removed RotateLeftNode/RotateRightNode::Ideal routines since
> >> we are anyways doing constant folding in LShiftI/URShiftI value
> >> routines. Since the Java rotate APIs are no longer intrinsified,
> >> these routines may no longer be useful.
> >
> > Nice observation! Good.
> 
> As a second thought, it seems there's still a chance left that Rotate nodes
> get their input type narrowed after the folding happened. For example, as a
> result of incremental inlining or CFG transformations during loop
> optimizations. And it does happen in practice since the testing revealed
> some crashes due to the bug in RotateLeftNode/RotateRightNode::Ideal().
> 
> So, it makes sense to keep the transformations. But I'm fine with
> addressing that as a followup enhancement.
> 
> Best regards,
> Vladimir Ivanov
> 
> >
> >>> It would be really nice to migrate to MacroAssembler along the way
> >>> (as a cleanup).
> >>
> >> I guess you are saying remove opcodes/encoding from patterns and move
> >> them to Assembler. Can we take this cleanup activity separately, since
> >> other patterns are also using these matcher directives?
> >
> > I'm perfectly fine with handling it as a separate enhancement.
> >
> >> Other synthetic comments have been taken care of. I have extended the
> >> Test to cover all the newly added scalar transforms. Kindly let me
> >> know if there other comments.
> >
> > Nice!
> >
> > Best regards,
> > Vladimir Ivanov
> >
> >>> -----Original Message-----
> >>> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
> >>> Sent: Friday, July 24, 2020 3:21 AM
> >>> To: Bhateja, Jatin <jatin.bhateja at intel.com>
> >>> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; Andrew
> >>> Haley <aph at redhat.com>; hotspot-compiler-dev at openjdk.java.net
> >>> Subject: Re: RFR[S] : 8248830 : C2 : Rotate API intrinsification for
> >>> X86
> >>>
> >>> Hi Jatin,
> >>>
> >>>> http://cr.openjdk.java.net/~jbhateja/8248830/webrev.03/
> >>>
> >>> Much better! Thanks.
> >>>
> >>>> Change Summary:
> >>>>
> >>>> 1) Unified the handling for scalar rotate operations. All scalar
> >>>> rotate selection patterns are now dependent on the newly created
> >>>> RotateLeft/RotateRight nodes. This promotes rotate inferencing.
> >>>> Currently, if DAG nodes corresponding to a sub-pattern are shared
> >>>> (have multiple users), then existing complex patterns based on
> >>>> Or/LShiftL/URShift do not get matched and this prevents inferring
> >>>> rotate nodes. Please refer to JIT'ed assembly output with
> >>>> baseline[1] and with patch[2]. We can see that generated code size
> >>>> also went down from 832 bytes to 768 bytes. Also this can cause perf
> >>>> degradation if shift-or dependency chain appears inside a hot region.
> >>>>
> >>>> 2) Due to enhanced rotate inferencing, the new patch shows better
> >>>> performance even for legacy targets (non AVX-512). Please refer to
> >>>> the perf result[3] over AVX2 machine for JMH benchmark part of the
> >>>> patch.
> >>>
> >>> Very nice!
> >>>
> >>>> 3) As suggested, removed Java API intrinsification changes, and
> >>>> scalar rotate transformations are done during OrI/OrL node
> >>>> idealizations.
> >>>
> >>> Good.
> >>>
> >>> (Still would be nice to factor the matching code from Ideal() and
> >>> share it between multiple use sites. Especially considering
> >>> OrVNode::Ideal() now does basically the same thing. As an
> >>> example/idea, take a look at
> >>> is_bmi_pattern() in x86.ad.)
> >>>
> >>>> 4) SLP always gets to work on new scalar Rotate nodes and creates
> >>>> vector rotate nodes, which are degenerated into OrV/LShiftV/URShiftV
> >>>> nodes if the target does not support vector rotates (non-AVX512).
> >>>
> >>> Good.
> >>>
> >>>> 5) Added new instruction patterns for vector shift Left/Right
> >>>> operations with constant shift operands. This prevents emitting
> >>>> extra moves to XMM.
> >>>
> >>> +instruct vshiftI_imm(vec dst, vec src, immI8 shift) %{
> >>> +  match(Set dst (LShiftVI src shift));
> >>>
> >>> I'd prefer to see a uniform Ideal IR shape being used irrespective
> >>> of whether the argument is a constant or not. It should also
> >>> simplify the logic in SuperWord and make it easier to support on
> >>> non-x86 architectures.
> >>>
> >>> For example, here's how it is done on AArch64:
> >>>
> >>> instruct vsll4I_imm(vecX dst, vecX src, immI shift) %{
> >>>     predicate(n->as_Vector()->length() == 4);
> >>>     match(Set dst (LShiftVI src (LShiftCntV shift))); ...
> >>>
> >>>> 6) Constant folding scenarios are covered in RotateLeft/RotateRight
> >>>> idealization; inferencing of vector rotate through OrV idealization
> >>>> covers the vector patterns generated through non-SLP routes, i.e.
> >>>> VectorAPI.
> >>>
> >>> I'm fine with keeping OrV::Ideal(), but I'm concerned with the
> >>> general direction here - duplication of scalar transformations to
> >>> lane-wise vector operations. It definitely won't scale and in a
> >>> longer run it risks to diverge. Would be nice to find a way to
> >>> automatically "lift"
> >>> scalar transformations to vectors and apply them uniformly. But
> >>> right now it is just an idea which requires more experimentation.
> >>>
> >>>
> >>> Some other minor comments/suggestions:
> >>>
> >>> +  // Swap the computed left and right shift counts.
> >>> +  if (is_rotate_left) {
> >>> +    Node* temp = shiftRCnt;
> >>> +    shiftRCnt  = shiftLCnt;
> >>> +    shiftLCnt  = temp;
> >>> +  }
> >>>
> >>> Maybe use swap() here (declared in globalDefinitions.hpp)?
> >>>
> >>>
> >>> +  if (Matcher::match_rule_supported_vector(vopc, vlen, bt))
> >>> +    return true;
> >>>
> >>> Please, don't omit curly braces (even for simple cases).
> >>>
> >>>
> >>> -// Rotate Right by variable
> >>> -instruct rorI_rReg_Var_C0(no_rcx_RegI dst, rcx_RegI shift, immI0 zero, rFlagsReg cr)
> >>> +instruct rorI_immI8_legacy(rRegI dst, immI8 shift, rFlagsReg cr)
> >>>  %{
> >>> -  match(Set dst (OrI (URShiftI dst shift) (LShiftI dst (SubI zero shift))));
> >>> -
> >>> +  predicate(!VM_Version::supports_bmi2() && n->bottom_type()->basic_type() == T_INT);
> >>> +  match(Set dst (RotateRight dst shift));
> >>> +  format %{ "rorl     $dst, $shift" %}
> >>>    expand %{
> >>> -    rorI_rReg_CL(dst, shift, cr);
> >>> +    rorI_rReg_imm8(dst, shift, cr);
> >>>    %}
> >>>
> >>> It would be really nice to migrate to MacroAssembler along the way
> >>> (as a cleanup).
> >>>
> >>>> Please push the patch through your testing framework and let me
> >>>> know your
> >>> review feedback.
> >>>
> >>> There's one new assertion failure:
> >>>
> >>> #  Internal Error (.../src/hotspot/share/opto/phaseX.cpp:1238), pid=5476, tid=6219
> >>> #  assert((i->_idx >= k->_idx) || i->is_top()) failed: Idealize should return new nodes, use Identity to return old nodes
> >>>
> >>> I believe it comes from RotateLeftNode::Ideal/RotateRightNode::Ideal
> >>> which can return pre-constructed constants. I suggest to get rid of
> >>> Ideal() methods and move constant folding logic into Node::Value()
> >>> (as implemented for other bitwise/arithmetic nodes in
> >>> addnode.cpp/subnode.cpp/mulnode.cpp et al). It's a more generic
> >>> approach since it enables richer type information (ranges vs
> >>> constants) and IMO it's more convenient to work with constants
> >>> through Types than ConNodes.
> >>>
> >>> (I suspect that original/expanded IR shape may already provide more
> >>> precise type info for non-constant case which can affect the
> >>> benchmarks.)
> >>>
> >>> Best regards,
> >>> Vladimir Ivanov
> >>>
> >>>>
> >>>> Best Regards,
> >>>> Jatin
> >>>>
> >>>> [1] http://cr.openjdk.java.net/~jbhateja/8248830/rotate_baseline_avx2_asm.txt
> >>>> [2] http://cr.openjdk.java.net/~jbhateja/8248830/rotate_new_patch_avx2_asm.txt
> >>>> [3] http://cr.openjdk.java.net/~jbhateja/8248830/rotate_perf_avx2_new_patch.txt
> >>>>
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
> >>>>> Sent: Saturday, July 18, 2020 12:25 AM
> >>>>> To: Bhateja, Jatin <jatin.bhateja at intel.com>; Andrew Haley
> >>>>> <aph at redhat.com>
> >>>>> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>;
> >>>>> hotspot-compiler-dev at openjdk.java.net
> >>>>> Subject: Re: RFR[S] : 8248830 : C2 : Rotate API intrinsification
> >>>>> for
> >>>>> X86
> >>>>>
> >>>>> Hi Jatin,
> >>>>>
> >>>>>> http://cr.openjdk.java.net/~jbhateja/8248830/webrev_02/
> >>>>>
> >>>>> It definitely looks better, but IMO it hasn't reached the sweet spot
> >>>>> yet.
> >>>>> It feels like the focus is on auto-vectorizer while the burden is
> >>>>> put on scalar cases.
> >>>>>
> >>>>> First of all, considering GVN folds relevant operation patterns
> >>>>> into a single Rotate node now, what's the motivation to introduce
> >>>>> intrinsics?
> >>>>>
> >>>>> Another point is there's still significant duplication for scalar
> >>>>> cases.
> >>>>>
> >>>>> I'd prefer to see the legacy cases which rely on pattern matching
> >>>>> to go away and be substituted with instructions which match Rotate
> >>>>> instructions (migrating ).
> >>>>>
> >>>>> I understand that it will penalize the vectorization
> >>>>> implementation, but IMO reducing overall complexity is worth it.
> >>>>> On auto-vectorizer side, I see
> >>>>> 2 ways to fix it:
> >>>>>
> >>>>>     (1) introduce additional AD instructions for
> >>>>> RotateLeftV/RotateRightV specifically for pre-AVX512 hardware;
> >>>>>
> >>>>>     (2) in SuperWord::output(), when matcher doesn't support
> >>>>> RotateLeftV/RotateRightV nodes (Matcher::match_rule_supported()),
> >>>>> generate vectorized version of the original pattern.
> >>>>>
> >>>>> Overall, it looks like more and more focus is made on scalar part.
> >>>>> Considering the main goal of the patch is to enable vectorization,
> >>>>> I'm fine with separating cleanup of scalar part. As an interim
> >>>>> solution, it seems that leaving the scalar part as it is now and
> >>>>> matching scalar bit rotate pattern in VectorNode::is_rotate()
> >>>>> should be enough to keep the vectorization part functioning. Then
> >>>>> scalar Rotate nodes and relevant cleanups can be integrated later.
> >>>>> (Or vice
> >>>>> versa: clean up scalar part first and then follow up with
> >>>>> vectorization.)
> >>>>>
> >>>>> Some other comments:
> >>>>>
> >>>>> * There's a lot of duplication between OrINode::Ideal and
> >>>>> OrLNode::Ideal. What do you think about introducing a super type
> >>>>> (OrNode) and putting a unified version (OrNode::Ideal) there?
> >>>>>
> >>>>>
> >>>>> * src/hotspot/cpu/x86/x86.ad
> >>>>>
> >>>>> +instruct vprotate_immI8(vec dst, vec src, immI8 shift) %{
> >>>>> +  predicate(n->bottom_type()->is_vect()->element_basic_type() == T_INT ||
> >>>>> +            n->bottom_type()->is_vect()->element_basic_type() == T_LONG);
> >>>>>
> >>>>> +instruct vprorate(vec dst, vec src, vec shift) %{
> >>>>> +  predicate(n->bottom_type()->is_vect()->element_basic_type() == T_INT ||
> >>>>> +            n->bottom_type()->is_vect()->element_basic_type() == T_LONG);
> >>>>>
> >>>>> The predicates are redundant here.
> >>>>>
> >>>>>
> >>>>> * src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp
> >>>>>
> >>>>> +void C2_MacroAssembler::vprotate_imm(int opcode, BasicType etype, XMMRegister dst, XMMRegister src,
> >>>>> +                                     int shift, int vector_len) {
> >>>>> +  if (opcode == Op_RotateLeftV) {
> >>>>> +    if (etype == T_INT) {
> >>>>> +      evprold(dst, src, shift, vector_len);
> >>>>> +    } else {
> >>>>> +      evprolq(dst, src, shift, vector_len);
> >>>>> +    }
> >>>>>
> >>>>> Please, put an assert for the false case (assert(etype == T_LONG,
> >>>>> "...")).
> >>>>>
> >>>>>
> >>>>> * On testing (with previous version of the patch): -XX:UseAVX is an
> >>>>> x86-specific flag, so new/adjusted tests now fail on non-x86
> >>>>> platforms. Either omitting the flag or adding
> >>>>> -XX:+IgnoreUnrecognizedVMOptions will solve the issue.
> >>>>>
> >>>>> Best regards,
> >>>>> Vladimir Ivanov
> >>>>>
> >>>>>>
> >>>>>>
> >>>>>> Summary of changes:
> >>>>>> 1) Optimization is specifically targeted to exploit the vector
> >>>>>> rotation instruction added for X86 AVX512. A single rotate
> >>>>>> instruction encapsulates the entire vector OR/SHIFTs pattern and
> >>>>>> thus offers better latency at reduced instruction count.
> >>>>>>
> >>>>>> 2) There were two approaches to implement this:
> >>>>>>    a) Let everything remain the same and add new wide complex
> >>>>>>       instruction patterns in the matcher, e.g.
> >>>>>>         set Dst ( OrV (Binary (LShiftVI dst (Binary ReplicateI shift))
> >>>>>>           (URShiftVI dst (Binary (SubI (Binary ReplicateI 32) ( Replicate shift))
> >>>>>>       It would have been an overoptimistic assumption to expect that
> >>>>>>       the graph shape would be preserved till the matcher for correct
> >>>>>>       inferencing. In addition, we would have required multiple such
> >>>>>>       bulky patterns.
> >>>>>>    b) Create new RotateLeft/RotateRight scalar nodes. These get
> >>>>>>       generated during intrinsification as well as during additional
> >>>>>>       pattern matching during node Idealization. Later on, these
> >>>>>>       nodes are consumed by SLP for valid vectorization scenarios to
> >>>>>>       emit their vector counterparts, which eventually emit vector
> >>>>>>       rotates.
> >>>>>>
> >>>>>> 3) I chose approach 2b) since it's cleaner. The only problem here
> >>>>>> was that in non-EVEX mode (UseAVX < 3) new scalar Rotate nodes
> >>>>>> should either be dismantled back to the OR/SHIFT pattern or we
> >>>>>> penalize the vectorization, which would be very costly. The other
> >>>>>> option would have been to add an additional vector rotate pattern
> >>>>>> for UseAVX=3 in the matcher which emits vector OR-SHIFTs
> >>>>>> instructions, but then it would lose out on the efficient
> >>>>>> instruction sequence which node sharing (OrV/LShiftV/URShift)
> >>>>>> offers in the current implementation. Thus it would not be
> >>>>>> beneficial for non-AVX512 targets; the only saving would be in
> >>>>>> terms of cleanup of a few existing scalar rotate matcher patterns,
> >>>>>> and old targets do not offer this powerful rotate instruction.
> >>>>>> Therefore new scalar nodes are created only for AVX512 targets.
> >>>>>>
> >>>>>> As per suggestions, constant folding scenarios have been covered
> >>>>>> during Idealizations of newly added scalar nodes.
> >>>>>>
> >>>>>> Please review the latest version and share your feedback and test
> >>>>>> results.
> >>>>>>
> >>>>>> Best Regards,
> >>>>>> Jatin
> >>>>>>
> >>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: Andrew Haley <aph at redhat.com>
> >>>>>>> Sent: Saturday, July 11, 2020 2:24 PM
> >>>>>>> To: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>; Bhateja,
> >>>>>>> Jatin <jatin.bhateja at intel.com>;
> >>>>>>> hotspot-compiler-dev at openjdk.java.net
> >>>>>>> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>
> >>>>>>> Subject: Re: 8248830 : RFR[S] : C2 : Rotate API intrinsification
> >>>>>>> for
> >>>>>>> X86
> >>>>>>>
> >>>>>>> On 10/07/2020 18:32, Vladimir Ivanov wrote:
> >>>>>>>
> >>>>>>>  > High-level comment: so far, there were no pressing need in
> >>>>>>>  > explicitly marking the methods as intrinsics. ROR/ROL instructions
> >>>>>>>  > were selected during matching [1]. Now the patch introduces
> >>>>>>>  > dedicated nodes (RotateLeft/RotateRight) specifically for intrinsics
> >>>>>>>  > which partly duplicates existing logic.
> >>>>>>>
> >>>>>>> The lack of rotate nodes in the IR has always meant that AArch64
> >>>>>>> doesn't generate optimal code for e.g.
> >>>>>>>
> >>>>>>>       (Set dst (XorL reg1 (RotateLeftL reg2 imm)))
> >>>>>>>
> >>>>>>> because, with the RotateLeft expanded to its full combination of
> >>>>>>> ORs and shifts, it's too complicated to match. At the time I put
> >>>>>>> this to one side because it wasn't urgent. This is a shame
> >>>>>>> because although such combinations are unusual they are used in
> >>>>>>> some crypto
> >>> operations.
> >>>>>>>
> >>>>>>> If we can generate immediate-form rotate nodes early by pattern
> >>>>>>> matching during parsing (rather than depending on intrinsics)
> >>>>>>> we'll get more value than by depending on programmers calling
> >>>>>>> intrinsics.
> >>>>>>>
> >>>>>>> --
> >>>>>>> Andrew Haley  (he/him)
> >>>>>>> Java Platform Lead Engineer
> >>>>>>> Red Hat UK Ltd. <https://www.redhat.com>
> >>>>>>> https://keybase.io/andrewhaley
> >>>>>>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
> >>>>>>
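
The scalar shapes this thread talks about folding into
RotateLeft/RotateRight nodes can be written in plain Java. The following
is a minimal sketch, not code from the patch: RotateIdiomSketch and its
method names are made up for illustration. It only shows that the
shift/or idioms compute the same values as Integer.rotateLeft and
Integer.rotateRight; whether C2 actually turns a given shape into a
Rotate node depends on the patch under review.

```java
// Hypothetical illustration, not part of the webrev.
final class RotateIdiomSketch {
    // Rotate left written as an OR of two shifts. Java masks int shift
    // counts to 5 bits, so (32 - s) also behaves correctly for s == 0.
    static int rotlByShifts(int x, int s) {
        return (x << s) | (x >>> (32 - s));
    }

    // Rotate right in the shape of the removed rorI_rReg_Var_C0 pattern:
    // OrI (URShiftI x s) (LShiftI x (SubI 0 s)).
    static int rotrByShifts(int x, int s) {
        return (x >>> s) | (x << (0 - s));
    }

    public static void main(String[] args) {
        int x = 0x80000001;
        for (int s = 0; s < 32; s++) {
            if (rotlByShifts(x, s) != Integer.rotateLeft(x, s)
                    || rotrByShifts(x, s) != Integer.rotateRight(x, s)) {
                throw new AssertionError("mismatch at s=" + s);
            }
        }
        System.out.println("shift/or idioms agree with Integer.rotateLeft/rotateRight");
    }
}
```

When one of the shift sub-expressions is shared with another user, the
quoted summary says the old Or/LShift/URShift matcher patterns fail to
match, which is the case a dedicated Rotate node is meant to avoid.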

From adinn at redhat.com  Thu Jul 30 11:26:42 2020
From: adinn at redhat.com (Andrew Dinn)
Date: Thu, 30 Jul 2020 12:26:42 +0100
Subject: [aarch64-port-dev ] RFR(L): 8231441: AArch64: Initial SVE backend
 support
In-Reply-To: <707df21c-849d-ac9d-0ab2-61a30d1354f9@arm.com>
References: <bce2556a-b709-d6d1-ffa2-655faaf56787@arm.com>
 <42fca25d-7172-b4f3-335b-92e2b05e8195@arm.com>
 <707df21c-849d-ac9d-0ab2-61a30d1354f9@arm.com>
Message-ID: <2df4a73f-7e84-87f1-6b2f-1ed6b45bbc27@redhat.com>

Hi Ningsheng,

I will start to review this either later today or (more likely)
tomorrow. It will probably take some time to work through it all. I will
work from the updated patch posted by PengFei.

regards,


Andrew Dinn
-----------
Red Hat Distinguished Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill

On 21/07/2020 07:05, Ningsheng Jian wrote:
> [Ping]
> 
> Could anyone please help to review this patch, especially for the c2
> register allocation part?
> 
> JBS: https://bugs.openjdk.java.net/browse/JDK-8231441
> 
> The latest webrev:
> http://cr.openjdk.java.net/~njian/8231441/webrev.02
> 
> In the latest webrev, we block one predicate register (p7) with all
> elements preset to TRUE, so that c2 compiled code can use it freely to
> generate instructions for unpredicated operations.
> 
> And the split parts:
> 
> 1) SVE feature detection:
> http://cr.openjdk.java.net/~njian/8231441/webrev.02-feature
> 
> 2) c2 register allocation:
> http://cr.openjdk.java.net/~njian/8231441/webrev.02-ra
> 
> 3) SVE c2 backend:
> http://cr.openjdk.java.net/~njian/8231441/webrev.02-c2
> 
> The initial RFR which has some descriptions of the patch:
> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-March/037628.html
> 
> 
> The description can also be found at:
> http://cr.openjdk.java.net/~njian/8231441/README-RFR.txt
> 
> Notes to verify the patch on QEMU user emulation, with an example of
> compiled code:
> http://cr.openjdk.java.net/~njian/8231441/running-sve-in-qemu-user.txt
> 
> Thanks,
> Ningsheng
> 
> 
> On 5/27/20 3:23 PM, Ningsheng Jian wrote:
>> Hi,
>>
>> I have rebased this patch with some more comments added. And also
>> relaxed the instruction matching conditions for 128-bit vector.
>>
>> I would appreciate if someone could help to review this.
>>
>> Whole patch:
>> http://cr.openjdk.java.net/~njian/8231441/webrev.01
>>
>> Different parts of changes:
>>
>> 1) SVE feature detection
>> http://cr.openjdk.java.net/~njian/8231441/webrev.01-feature
>>
>> 2) c2 register allocation
>> http://cr.openjdk.java.net/~njian/8231441/webrev.01-ra
>>
>> 3) SVE c2 backend
>> http://cr.openjdk.java.net/~njian/8231441/webrev.01-c2
>>
>> (Or should I split this into different JBS?)
>>
>> Thanks,
>> Ningsheng
>>
>> On 3/25/20 2:37 PM, Ningsheng Jian wrote:
>>> Hi,
>>>
>>> Could you please help to review this patch adding AArch64 SVE support?
>>> It also touches c2 compiler shared code.
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231441
>>> Webrev: http://cr.openjdk.java.net/~njian/8231441/webrev.00
>>>
>>> Arm has released new vector ISA extension for AArch64, SVE [1] and
>>> SVE2 [2]. This patch adds the initial SVE support in OpenJDK. In this
>>> patch we have:
>>>
>>> 1) SVE feature enablement and detection
>>> 2) SVE vector register allocation support with initial predicate
>>> register definition
>>> 3) SVE c2 backend for current SLP based vectorizer. (We also have a POC
>>> patch of a new vectorizer using SVE predicate-driven loop control, but
>>> that's still under development.)
>>>
>>> SVE register definition
>>> =======================
>>> Unlike other SIMD architectures, SVE allows hardware implementations to
>>> choose a vector register length between 128 and 2048 bits, in multiples of 128
>>> bits. So we introduce a new vector type VectorA, i.e. length agnostic
>>> (scalable) vector type, and Op_VecA for machine vectora register. In the
>>> meantime, to minimize register allocation code changes, we also take
>>> advantage of one JIT compiler aspect, that is during the compile time we
>>> actually know the real hardware SVE vector register size of current
>>> running machine. So, the register allocator actually knows how many
>>> register slots an Op_VecA ideal reg requires, and could work fine
>>> without much modification.
>>>
>>> Since the bottom 128 bits are shared with the NEON, we extend current
>>> register mask definition of V0-V31 registers. Currently, c2 uses one bit
>>> mask for a 32-bit register slot, so to define at most 2048 bits we will
>>> need to add 64 slots in AD file. That's a really large number, and will
>>> also break current regmask assumption. Considering the SVE vector
>>> register is architecturally scalable for different sizes, we just define
>>> double of original NEON vector register slots, i.e. 8 slots: Vx, Vx_H,
>>> Vx_J ... Vx_O. After adlc, the generated register masks now looks like:
>>>
>>> const RegMask _VECTORA_REG_mask( 0x0, 0x0, 0xffffffff, 0xffffffff,
>>> 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, ...
>>>
>>> const RegMask _VECTORD_REG_mask( 0x0, 0x0, 0x3030303, 0x3030303,
>>> 0x3030303, 0x3030303, 0x3030303, 0x3030303, ...
>>>
>>> const RegMask _VECTORX_REG_mask( 0x0, 0x0, 0xf0f0f0f, 0xf0f0f0f,
>>> 0xf0f0f0f, 0xf0f0f0f, 0xf0f0f0f, 0xf0f0f0f, ...
>>>
>>> And we use SlotsPerVecA to indicate regmask bit size for a VecA
>>> register.
>>>
>>> Although for physical register allocation, register allocator does not
>>> need to know the real VecA register size, while doing spill/unspill,
>>> current register allocation needs to know actual stack slot size to
>>> store/load VecA registers. SVE is able to do vector size agnostic
>>> spilling, but to minimize the code changes, as I mentioned before, we
>>> just let RA know the actual vector register size in current running
>>> machine, by calling scalable_vector_reg_size().
>>>
>>> In the meantime, some vector operations, e.g. vector multiply and
>>> vector load/store, have no unpredicated SVE1 instructions, only
>>> predicated versions. We have therefore also defined predicate registers
>>> in this patch, and the c2 register allocator will allocate a temp
>>> predicate register to implement the expected unpredicated operations.
>>> This can also be used by a future predicate-driven vectorizer. It is not
>>> efficient for now, as we can see many ptrue instructions in the generated
>>> code. One possible solution I can see is to block one predicate register
>>> and preset it to all true. But preserving/reinitializing a caller-saved
>>> register value across calls seems risky to do in this patch, so I decided
>>> to defer it to further optimization work. If anyone has any suggestions
>>> on this, I would appreciate them.
>>>
>>> SVE feature detection
>>> =====================
>>> Since we may have some compiled code based on the initially detected SVE
>>> vector register length, and that code is compiled only for that
>>> vector register length, we assume that the SVE vector register length
>>> will not change during the JVM lifetime. However, the SVE vector length
>>> is per-thread and can be changed by a system call [3], so we need to make
>>> sure that no JNI call changes the SVE vector length.
>>>
>>> Currently, we verify the SVE vector register length on each JNI return,
>>> and if an SVE vector length change is detected, the JVM simply reports an
>>> error and stops running. The vector length the VM runs with can also be
>>> set by the existing VM option MaxVectorSize with c2 enabled. If
>>> MaxVectorSize differs from the system default SVE vector length (in
>>> /proc/sys/abi/sve_default_vector_length), the JVM will set the current
>>> process's SVE vector length to the specified vector length.
>>>
>>> Compiled code
>>> =============
>>> We have added all current c2 backend codegen on par with NEON, but only
>>> for vector length larger than 128-bit.
>>>
>>> On a 1024 bit SVE environment, for the following simple loop with int
>>> array element type:
>>>
>>>     for (int i = 0; i < LENGTH; i++) {
>>>       c[i] = a[i] + b[i];
>>>     }
>>>
>>> c2 generated loop:
>>>
>>>     0x0000ffff811c0820:   sbfiz   x11, x10, #2, #32
>>>     0x0000ffff811c0824:   add     x13, x18, x11
>>>     0x0000ffff811c0828:   add     x14, x1, x11
>>>     0x0000ffff811c082c:   add     x13, x13, #0x10
>>>     0x0000ffff811c0830:   add     x14, x14, #0x10
>>>     0x0000ffff811c0834:   add     x11, x0, x11
>>>     0x0000ffff811c0838:   add     x11, x11, #0x10
>>>     0x0000ffff811c083c:   ptrue   p1.s    // To be optimized
>>>     0x0000ffff811c0840:   ld1w    {z16.s}, p1/z, [x14]
>>>     0x0000ffff811c0844:   ptrue   p0.s
>>>     0x0000ffff811c0848:   ld1w    {z17.s}, p0/z, [x13]
>>>     0x0000ffff811c084c:   add     z16.s, z17.s, z16.s
>>>     0x0000ffff811c0850:   ptrue   p1.s
>>>     0x0000ffff811c0854:   st1w    {z16.s}, p1, [x11]
>>>     0x0000ffff811c0858:   add     w10, w10, #0x20
>>>     0x0000ffff811c085c:   cmp     w10, w12
>>>     0x0000ffff811c0860:   b.lt    0x0000ffff811c0820
>>>
>>> Test
>>> ====
>>> Currently, we don't have real hardware to verify SVE features (and
>>> performance). But we have run jtreg tests with SVE in some emulators. On
>>> QEMU system emulator, which has SVE emulation support, jtreg tier1-3
>>> passed with different vector sizes. We've also verified it with full
>>> jtreg tests without SVE on both x86 and AArch64, to make sure that
>>> there's no regression.
>>>
>>> The patch has also been applied to the Vector API code base and verified
>>> on an emulator. The Vector API has more vector-related tests, and
>>> intrinsification makes it more likely to generate vector instructions.
>>>
>>> A simple test can also run in QEMU user emulation, e.g.
>>>
>>> $ qemu-aarch64 -cpu max,sve-max-vq=2 java -XX:UseSVE=1 SIMD
>>>
>>> (To run it in user emulation mode, we will need to bypass the SVE feature
>>> detection code in this patch, e.g. by applying:
>>> http://cr.openjdk.java.net/~njian/8231441/user-emulation.patch)
>>>
>>> Others
>>> ======
>>> Since this patch is a bit large, I've also split it into 3 parts, for
>>> easy review:
>>>
>>> 1) SVE feature detection
>>> http://cr.openjdk.java.net/~njian/8231441/webrev.00-feature
>>>
>>> 2) c2 register allocation
>>> http://cr.openjdk.java.net/~njian/8231441/webrev.00-ra
>>>
>>> 3) SVE c2 backend
>>> http://cr.openjdk.java.net/~njian/8231441/webrev.00-c2
>>>
>>> Part of this patch has been contributed by Joshua Zhu and Yang Zhang.
>>>
>>> Refs
>>> ====
>>> [1] https://developer.arm.com/docs/ddi0584/latest
>>> [2] https://developer.arm.com/docs/ddi0602/latest
>>> [3] https://www.kernel.org/doc/Documentation/arm64/sve.txt
>>>
>>> Thanks,
>>> Ningsheng
>>>
>>
> 
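
A small sketch, not part of the patch, of the slot arithmetic in the quoted description above. It assumes one mask bit per 32-bit slot and 8 slots per vector register, as described; reading each 32-bit mask word as covering four registers is an interpretation of the quoted masks. The names slots_for_bits, mask_word and lanes are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption from the description: one regmask bit per 32-bit register slot.
constexpr int kBitsPerSlot = 32;

// Slots needed to describe a register of the given width.
constexpr int slots_for_bits(int bits) { return bits / kBitsPerSlot; }

// One 32-bit mask word, read here as 4 registers x 8 slot bits each.
// `used_slots` low bits are set in every 8-bit group.
constexpr std::uint32_t mask_word(int used_slots) {
  return ((1u << used_slots) - 1u) * 0x01010101u;
}

// Number of elements processed per vector operation.
constexpr int lanes(int vector_bits, int element_bits) {
  return vector_bits / element_bits;
}
```

Under these assumptions, mask_word(2), mask_word(4) and mask_word(8) give 0x03030303, 0x0f0f0f0f and 0xffffffff, matching the VECTORD, VECTORX and VECTORA words above. slots_for_bits(2048) is 64, and lanes(1024, 32) is 0x20, the step of the loop counter in the listing.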


From jiefu at tencent.com  Thu Jul 30 13:09:14 2020
From: jiefu at tencent.com (jiefu(傅杰))
Date: Thu, 30 Jul 2020 13:09:14 +0000
Subject: RFR: 8250825: C2 crashes with assert(field != __null) failed: missing
 field
Message-ID: <71F94D35-2B7B-4032-AD01-954524A150B7@tencent.com>

Hi all,

JBS:    https://bugs.openjdk.java.net/browse/JDK-8250825
Webrev: http://cr.openjdk.java.net/~jiefu/8250825/webrev.00/

When C2 tries to inline an unsafe-access method, it may generate the following pattern in make_unsafe_address:
      ConP  ConL
         \  |
          \ |
          AddP
The current implementation of TypeOopPtr::TypeOopPtr(...) fails to recognize it as an unsafe operation, which leads to the crash.

Testing:
  - tier1-3 on Linux/x64

Could you please review it and give me some advice?

Thanks a lot.
Best regards,
Jie

From christian.hagedorn at oracle.com  Thu Jul 30 13:15:34 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Thu, 30 Jul 2020 15:15:34 +0200
Subject: [16] RFR(S): C2: assert(no_dead_loop) failed: dead loop detected
Message-ID: <66e123b1-35d1-5b96-d0d7-6b4a8cdf2404@oracle.com>

Hi

Please review the following patch:
https://bugs.openjdk.java.net/browse/JDK-8249605
http://cr.openjdk.java.net/~chagedorn/8249605/webrev.00/

There is a dead memory loop detected during IGVN. In the testcase, many 
nodes are dying during IGVN because they are not reachable anymore. In 
this process, a (not yet dead) memory phi node (150 Phi) with two inputs 
is processed (see [1]):
(1) 289 MergeMem, whose base memory is 150 Phi and has one slice for 274 
StoreD which is again an output of 150 Phi
(2) 356 MergeMem, whose base memory is top (i.e. dead and would be 
removed when IGVN processes this node)

In PhiNode::Ideal, we check if a phi node is part of a dead loop where 
all its inputs reference itself over a MergeMemNode input whose base 
memory is the phi node itself again [2]. However, in this check we do 
not account for dead MergeMemNodes (like the input 356 MergeMem of 150 
Phi). Therefore, we do not return top and apply the optimization [3] to 
replace 150 Phi by a new MergeMemNode (380 MergeMem) whose base memory 
is top and now has again one slice which is input and output to 274 
StoreD [4]. This cycle is later detected and the assertion fails.

The fix accounts additionally for dead MergeMemNodes when trying to 
detect dead loops in PhiNode::Ideal to return top instead of a new dead 
MergeMemNode.
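
The check described above can be pictured with a toy model. This is only a sketch under simplifying assumptions, not the code in cfgnode.cpp: Node, Kind and is_dead_memory_loop are hypothetical, and the flag stands in for the behaviour before and after the fix.

```cpp
#include <cassert>
#include <vector>

enum class Kind { Phi, MergeMem, Top };

// Toy node: a MergeMem has a base memory, a Phi has inputs.
struct Node {
  Kind kind;
  const Node* base_memory = nullptr;
  std::vector<const Node*> inputs;
};

// True if every input of `phi` is a MergeMem whose base memory is the phi
// itself or, when `accept_dead_mergemem` is set, a MergeMem whose base is top.
bool is_dead_memory_loop(const Node& phi, bool accept_dead_mergemem) {
  if (phi.inputs.empty()) return false;
  for (const Node* in : phi.inputs) {
    if (in == nullptr || in->kind != Kind::MergeMem) return false;
    const bool references_phi = (in->base_memory == &phi);
    const bool is_dead = (in->base_memory != nullptr &&
                          in->base_memory->kind == Kind::Top);
    if (references_phi) continue;
    if (accept_dead_mergemem && is_dead) continue;
    return false;
  }
  return true;
}
```

With 150 Phi fed by 289 MergeMem (base memory is the phi) and 356 MergeMem (base memory is top), the toy check reports no dead loop when dead MergeMems are ignored and reports one when they are accepted.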

Best regards,
Christian


[1] 
https://bugs.openjdk.java.net/secure/attachment/89582/before_PhiNode_Ideal.png
[2] 
http://hg.openjdk.java.net/jdk/jdk/file/8f7ede592c28/src/hotspot/share/opto/cfgnode.cpp#l2234
[3] 
http://hg.openjdk.java.net/jdk/jdk/file/8f7ede592c28/src/hotspot/share/opto/cfgnode.cpp#l2246
[4] 
https://bugs.openjdk.java.net/secure/attachment/89583/after_PhiNode_Ideal.png

From volker.simonis at gmail.com  Thu Jul 30 17:03:49 2020
From: volker.simonis at gmail.com (Volker Simonis)
Date: Thu, 30 Jul 2020 19:03:49 +0200
Subject: RFR[XS] 8249809 avoid calling DirectiveSet::clone(this) in
 compilecommand_compatibility_init
In-Reply-To: <1596056152748.75196@amazon.com>
References: <1595807197546.52082@amazon.com>
 <CA+3eh10LEgJdcSAFrtsE=Ycy2oHJR3Hk2ddrTAFYLiC1E-sJ2w@mail.gmail.com>
 <1595907547514.55531@amazon.com>
 <CA+3eh10dQqSJyu+odn1zfLgMXw_oQaMN5+9icSp0LeERwui7JA@mail.gmail.com>
 <1595969785292.62158@amazon.com>
 <eba1ae8f-9409-48d2-64fa-6bb1e069bb6d@oracle.com>
 <CA+3eh10yzWC1phhCSv9CMZRiH=A1ySJ6ij5jTQq0jeGpttxi6g@mail.gmail.com>
 <1596056152748.75196@amazon.com>
Message-ID: <CA+3eh13Fy6-_M_rd6gc_TN701AW+CckQt19o4dwAMT+RhSaSMQ@mail.gmail.com>

On Wed, Jul 29, 2020 at 10:56 PM Liu, Xin <xxinliu at amazon.com> wrote:
>
> hi, Volker and Tobias,
>
> Here is a new revision.
> http://cr.openjdk.java.net/~xliu/8249809/02/webrev/
>
> 1. This one adds comments about the smart pointer and fixes the formatting issues.
>
> 2. Thanks for pointing me to the new HotSpot code style document.
> Since it has been updated to -std=c++14, I changed all NULL to nullptr.
>
> 3. I also added NON_COPYABLE because it's not intended to be copied.
>
>
> DirectiveSetPtr is just a thin wrapper around the raw pointer. If users only read through it, nothing is cloned; access simply passes through to the original.
>

Hi Xin,

I like the new version :)

I think it's fine except the assertion in "transfer()":

  DirectiveSet* transfer() {
    assert(_origin != nullptr, "_origin is NULL! transfer() can only be invoked once.");

    if (_clone != nullptr) {
      // We are returning a (parentless) copy. The original parent don't need to account for this.
      DirectivesStack::release(_origin);
      _origin = nullptr;
      return _clone;
    }
    else {
      return _origin;
    }
  }
};

You should either move the assertion into the "if (_clone != nullptr)" block or
set "_origin" to nullptr in the "else" branch as well.
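
One way to read the second option, as a self-contained sketch rather than the actual HotSpot code: DirectiveSetPtrSketch is a hypothetical stand-in for the wrapper, the releases counter stands in for DirectivesStack::release(), and the standard assert replaces HotSpot's two-argument assert.

```cpp
#include <cassert>

// Stand-in for HotSpot's DirectiveSet; only identity matters here.
struct DirectiveSet { int id; };

class DirectiveSetPtrSketch {
 public:
  explicit DirectiveSetPtrSketch(DirectiveSet* origin)
      : _origin(origin), _clone(nullptr), _releases(0) {}

  void set_clone(DirectiveSet* clone) { _clone = clone; }
  bool transferred() const { return _origin == nullptr; }
  int releases() const { return _releases; }

  // Hands out the clone if there is one, else the original.
  // _origin is cleared on both paths, so the assertion holds either way.
  DirectiveSet* transfer() {
    assert(_origin != nullptr && "transfer() can only be invoked once");
    DirectiveSet* result = _origin;
    if (_clone != nullptr) {
      ++_releases;  // models releasing the original parent
      result = _clone;
    }
    _origin = nullptr;
    return result;
  }

 private:
  DirectiveSet* _origin;
  DirectiveSet* _clone;
  int _releases;
};
```

With this shape the "invoke once" contract is the same whether or not a clone exists, which is what the assertion message promises.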

Best regards,
Volker

PS: I won't have access to mail for the next two weeks. If there won't
be any fundamental changes to this patch any more you can consider it
reviewed from my side.

> thanks,
> --lx
>
> ________________________________________
> From: Volker Simonis <volker.simonis at gmail.com>
> Sent: Wednesday, July 29, 2020 7:34 AM
> To: Tobias Hartmann
> Cc: Liu, Xin; hotspot-compiler-dev at openjdk.java.net
> Subject: RE: [EXTERNAL] RFR[XS] 8249809 avoid calling DirectiveSet::clone(this) in compilecommand_compatibility_init
>
>
> On Wed, Jul 29, 2020 at 9:38 AM Tobias Hartmann
> <tobias.hartmann at oracle.com> wrote:
> >
> > Hi Xin,
> >
> > On 28.07.20 22:56, Liu, Xin wrote:
> > > http://cr.openjdk.java.net/~xliu/8249809/01/webrev/
> >
> > Overall looks good to me.
> >
> > Some style comments:
> > - Add a comment to 'DirectiveSetPtr' to describe its purpose
> > - Why not put the "cloned" logic in "operator->"?
>
> Because there's also a "read-only" access of the DirectiveSetPtr
> which doesn't mutate its content and therefore shouldn't clone the
> underlying DirectiveSet. See my first mail where I proposed to add a
> second, `const`-version of "operator->". But that still required const
> casts in the places where we didn't want to clone. I've therefore
> voted for the new "cloned()" method which makes cloning and mutating
> explicit and which is much easier to understand from my point of view
> (compared to two overloaded operators).
>
> > - Do not use the _clone pointer as boolean (see "Miscellaneous" section in the style guide [1])
> > - Indentation in line 301-303 is wrong
> > - Line 306 use brackets around the "else" and move it one line up "} else {"
> >
> > Best regards,
> > Tobias
> >
> > [1] https://hg.openjdk.java.net/jdk/jdk/raw-file/tip/doc/hotspot-style.html

From vladimir.kozlov at oracle.com  Thu Jul 30 17:17:13 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 30 Jul 2020 10:17:13 -0700
Subject: [16] RFR(S): C2: assert(no_dead_loop) failed: dead loop detected
In-Reply-To: <66e123b1-35d1-5b96-d0d7-6b4a8cdf2404@oracle.com>
References: <66e123b1-35d1-5b96-d0d7-6b4a8cdf2404@oracle.com>
Message-ID: <db79062c-3be1-c80e-1023-ae8860363c7e@oracle.com>

Very good.

Thanks,
Vladimir K

On 7/30/20 6:15 AM, Christian Hagedorn wrote:
> Hi
> 
> Please review the following patch:
> https://bugs.openjdk.java.net/browse/JDK-8249605
> http://cr.openjdk.java.net/~chagedorn/8249605/webrev.00/
> 
> There is a dead memory loop detected during IGVN. In the testcase, many nodes are dying during IGVN because they are not 
> reachable anymore. In this process, a (not yet dead) memory phi node (150 Phi) with two inputs is processed (see [1]):
> (1) 289 MergeMem, whose base memory is 150 Phi and has one slice for 274 StoreD which is again an output of 150 Phi
> (2) 356 MergeMem, whose base memory is top (i.e. dead and would be removed when IGVN processes this node)
> 
> In PhiNode::Ideal, we check if a phi node is part of a dead loop where all its inputs reference itself over a 
> MergeMemNode input whose base memory is the phi node itself again [2]. However, in this check we do not account for dead 
> MergeMemNodes (like the input 356 MergeMem of 150 Phi). Therefore, we do not return top and apply the optimization [3] 
> to replace 150 Phi by a new MergeMemNode (380 MergeMem) whose base memory is top and now has again one slice which is 
> input and output to 274 StoreD [4]. This cycle is later detected and the assertion fails.
> 
> The fix accounts additionally for dead MergeMemNodes when trying to detect dead loops in PhiNode::Ideal to return top 
> instead of a new dead MergeMemNode.
> 
> Best regards,
> Christian
> 
> 
> [1] https://bugs.openjdk.java.net/secure/attachment/89582/before_PhiNode_Ideal.png
> [2] http://hg.openjdk.java.net/jdk/jdk/file/8f7ede592c28/src/hotspot/share/opto/cfgnode.cpp#l2234
> [3] http://hg.openjdk.java.net/jdk/jdk/file/8f7ede592c28/src/hotspot/share/opto/cfgnode.cpp#l2246
> [4] https://bugs.openjdk.java.net/secure/attachment/89583/after_PhiNode_Ideal.png

From vladimir.kozlov at oracle.com  Thu Jul 30 18:23:07 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 30 Jul 2020 11:23:07 -0700
Subject: RFR: 8250825: C2 crashes with assert(field != __null) failed:
 missing field
In-Reply-To: <71F94D35-2B7B-4032-AD01-954524A150B7@tencent.com>
References: <71F94D35-2B7B-4032-AD01-954524A150B7@tencent.com>
Message-ID: <91ac9d1c-410c-786d-f0c1-e4e4c4afda2c@oracle.com>

Hi Jie

Nodes generated by make_unsafe_address() are correct. The issue is that the Unsafe API allows generating an offset that is 
unaligned (to fields), with an arbitrary type. As a result, the C2 type system can't find the corresponding field.
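
To illustrate the point with a toy model (not C2's type system): if fields are looked up by exact offset, an offset that falls inside a field instead of at its start matches nothing. Field, field_at and the layout used in the example are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Toy field descriptor: name, byte offset within the object, size in bytes.
struct Field {
  const char* name;
  long offset;
  int size;
};

// Returns the field that starts exactly at `offset`, or nullptr if none does.
const Field* field_at(const std::vector<Field>& layout, long offset) {
  for (const Field& f : layout) {
    if (f.offset == offset) return &f;
  }
  return nullptr;
}
```

For a layout with an int at offset 12 and a long at offset 16, field_at finds a field for offset 12 but returns nullptr for offset 13, which is the kind of lookup failure described here.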

Did you try unaligned unsafe accesses to instance fields?
Also try an unsafe set of a value (Store node). There is code in C2 which checks for narrow stores. It would be interesting to see how 
it behaves in the unsafe case.

Please, extend your test.

Otherwise fix is good.

Thanks,
Vladimir K

On 7/30/20 6:09 AM, jiefu(傅杰) wrote:
> Hi all,
> 
> JBS:    https://bugs.openjdk.java.net/browse/JDK-8250825
> Webrev: http://cr.openjdk.java.net/~jiefu/8250825/webrev.00/
> 
> When C2 tries to inline an unsafe-access method, it may generate the following pattern in make_unsafe_address:
>        ConP  ConL
>           \  |
>            \ |
>            AddP
> The current implementation of TypeOopPtr::TypeOopPtr(...) fails to recognize it as an unsafe operation, which leads to the crash.
> 
> Testing:
>    - tier1-3 on Linux/x64
> 
> Could you please review it and give me some advice?
> 
> Thanks a lot.
> Best regards,
> Jie
> 

From xxinliu at amazon.com  Thu Jul 30 19:33:23 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Thu, 30 Jul 2020 19:33:23 +0000
Subject: RFR[XS] 8249809 avoid calling DirectiveSet::clone(this) in
 compilecommand_compatibility_init
In-Reply-To: <CA+3eh13Fy6-_M_rd6gc_TN701AW+CckQt19o4dwAMT+RhSaSMQ@mail.gmail.com>
References: <1595807197546.52082@amazon.com>
 <CA+3eh10LEgJdcSAFrtsE=Ycy2oHJR3Hk2ddrTAFYLiC1E-sJ2w@mail.gmail.com>
 <1595907547514.55531@amazon.com>
 <CA+3eh10dQqSJyu+odn1zfLgMXw_oQaMN5+9icSp0LeERwui7JA@mail.gmail.com>
 <1595969785292.62158@amazon.com>
 <eba1ae8f-9409-48d2-64fa-6bb1e069bb6d@oracle.com>
 <CA+3eh10yzWC1phhCSv9CMZRiH=A1ySJ6ij5jTQq0jeGpttxi6g@mail.gmail.com>
 <1596056152748.75196@amazon.com>,
 <CA+3eh13Fy6-_M_rd6gc_TN701AW+CckQt19o4dwAMT+RhSaSMQ@mail.gmail.com>
Message-ID: <1596137602709.774@amazon.com>

hi, Volker, 

Your suggestion is great, and I took it. The assertion is there because I want to prevent a pointer from being released more than once. 
The downside is that it limits how the function transfer() can be used. 

I just came up with a new idea. I changed the function name from transfer() to commit(). 
If _clone is not nullptr, commit() overwrites _origin with it and resets _clone to nullptr. cloned() provisions a new object to update; commit() finalizes it.  

The following is exaggerated, but the smart pointer can now be used repeatedly: 
+    set.commit(); // update _origin
+    set.cloned();  // clone it again
+    set.commit(); // update _origin again
+    set.commit(); // no-op
+    set.cloned();  // clone a new one. 
+    set.cloned();  // no-op
     return set.commit();
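
The cloned()/commit() protocol described above can be sketched outside HotSpot as follows. This is a minimal model under assumptions, not the code in the webrev: CowHandle and Settings are hypothetical, and std::shared_ptr replaces the manual release of the original.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Settings { int level = 0; };

// Copy-on-write handle: reads go to the original until cloned() is called.
class CowHandle {
 public:
  explicit CowHandle(std::shared_ptr<const Settings> origin)
      : _origin(std::move(origin)) {}

  // Read-only access never clones.
  const Settings* operator->() const {
    return _clone != nullptr ? _clone.get() : _origin.get();
  }

  // Provisions a private copy on first use; later calls are no-ops.
  Settings* cloned() {
    if (_clone == nullptr) {
      _clone = std::make_shared<Settings>(*_origin);
      ++_clones_made;
    }
    return _clone.get();
  }

  // Makes the copy the new original; a no-op if nothing was cloned.
  std::shared_ptr<const Settings> commit() {
    if (_clone != nullptr) {
      _origin = std::move(_clone);
      _clone.reset();
    }
    return _origin;
  }

  int clones_made() const { return _clones_made; }

 private:
  std::shared_ptr<const Settings> _origin;
  std::shared_ptr<Settings> _clone;
  int _clones_made = 0;
};
```

In this model, commit() without a prior cloned() returns the original unchanged, a second cloned() before commit() does nothing, and the original object is never modified.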

here is the new revision: 
https://cr.openjdk.java.net/~xliu/8249809/03/webrev/

thanks,
--lx



From luhenry at microsoft.com  Fri Jul 31 01:26:00 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Fri, 31 Jul 2020 01:26:00 +0000
Subject: RFR[M]: Adding MD5 Intrinsic on x86-64
Message-ID: <MWHPR21MB0511E66642AA67A20AFCC2C8B04E0@MWHPR21MB0511.namprd21.prod.outlook.com>

JBS: I just got authorship status and I'll create a bug as soon as I have access to JBS
Webrev: http://cr.openjdk.java.net/~luhenry/md5-intrinsics/webrev.00/

The problem ended up not being with how `ofs` was incremented, but with a callee-saved register not being restored properly before returning from the intrinsic.

The performance results from running with JMH are very encouraging. I ran `org.openjdk.bench.java.security.MessageDigests` with only MD5 enabled; the results without and with the intrinsic follow.

-XX:-UseMD5Intrinsics
Benchmark              (digesterName)  (length)  (provider)   Mode  Cnt     Score    Error   Units
MessageDigests.digest             md5        64     DEFAULT  thrpt   10  3459.747 ± 10.508  ops/ms
MessageDigests.digest             md5      1024     DEFAULT  thrpt   10   446.407 ±  3.383  ops/ms
MessageDigests.digest             md5     16384     DEFAULT  thrpt   10    30.685 ±  0.676  ops/ms
MessageDigests.digest             md5   1048576     DEFAULT  thrpt   10     0.483 ±  0.004  ops/ms

-XX:+UseMD5Intrinsics
Benchmark              (digesterName)  (length)  (provider)   Mode  Cnt     Score    Error   Units
MessageDigests.digest             md5        64     DEFAULT  thrpt   10  4011.556 ± 10.212  ops/ms
MessageDigests.digest             md5      1024     DEFAULT  thrpt   10   526.873 ±  2.101  ops/ms
MessageDigests.digest             md5     16384     DEFAULT  thrpt   10    35.012 ±  0.088  ops/ms
MessageDigests.digest             md5   1048576     DEFAULT  thrpt   10     0.573 ±  0.002  ops/ms

That's overall a jump from ~483MB/s to ~573MB/s on the 1M chunks, or a ~19% speedup.
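
For reference, the arithmetic behind these figures can be written out as a small sketch. The helper names are hypothetical, and the conversion assumes 1 MB = 1024 * 1024 bytes, which is what makes 0.483 ops/ms on 1048576-byte inputs come out at about 483.

```cpp
#include <cassert>
#include <cmath>

// JMH score in ops/ms times bytes per op, expressed in MiB per second.
double throughput_mib_per_s(double ops_per_ms, double bytes_per_op) {
  return ops_per_ms * 1000.0 * bytes_per_op / (1024.0 * 1024.0);
}

// Relative improvement of `after` over `before`, in percent.
double speedup_percent(double before, double after) {
  return (after / before - 1.0) * 100.0;
}
```

Applied to the 1048576-byte rows, this gives roughly 483 and 573 MiB/s and a speedup of about 18.6%, consistent with the ~19% quoted.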

Thank you,
Ludovic

From ningsheng.jian at arm.com  Fri Jul 31 01:41:45 2020
From: ningsheng.jian at arm.com (Ningsheng Jian)
Date: Fri, 31 Jul 2020 09:41:45 +0800
Subject: [aarch64-port-dev ] RFR(L): 8231441: AArch64: Initial SVE backend
 support
In-Reply-To: <2df4a73f-7e84-87f1-6b2f-1ed6b45bbc27@redhat.com>
References: <bce2556a-b709-d6d1-ffa2-655faaf56787@arm.com>
 <42fca25d-7172-b4f3-335b-92e2b05e8195@arm.com>
 <707df21c-849d-ac9d-0ab2-61a30d1354f9@arm.com>
 <2df4a73f-7e84-87f1-6b2f-1ed6b45bbc27@redhat.com>
Message-ID: <8bc0d357-07e7-ae55-b7b2-23ec54ea3e6a@arm.com>

Hi Andrew,

Thanks a lot!!

FYI, the latest patch:

http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-July/039289.html

And some descriptions:

http://cr.openjdk.java.net/~njian/8231441/README-RFR.txt

Thanks,
Ningsheng


On 7/30/20 7:26 PM, Andrew Dinn wrote:
> Hi Ningsheng,
> 
> I will start to review this either later today or (more likely)
> tomorrow. It will probably take some time to work through it all. I will
> work from the updated patch posted by PengFei.
> 
> regards,
> 
> 
> Andrew Dinn
> -----------
> Red Hat Distinguished Engineer
> Red Hat UK Ltd
> Registered in England and Wales under Company Registration No. 03798903
> Directors: Michael Cunningham, Michael ("Mike") O'Neill
> 
> On 21/07/2020 07:05, Ningsheng Jian wrote:
>> [Ping]
>>
>> Could anyone please help to review this patch, especially for the c2
>> register allocation part?
>>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8231441
>>
>> The latest webrev:
>> http://cr.openjdk.java.net/~njian/8231441/webrev.02
>>
>> In the latest webrev, we block one predicate register (p7) with all
>> elements preset to TRUE, so that c2 compiled code can use it freely to
>> generate instructions for unpredicated operations.
>>
>> And the split parts:
>>
>> 1) SVE feature detection:
>> http://cr.openjdk.java.net/~njian/8231441/webrev.02-feature
>>
>> 2) c2 register allocation:
>> http://cr.openjdk.java.net/~njian/8231441/webrev.02-ra
>>
>> 3) SVE c2 backend:
>> http://cr.openjdk.java.net/~njian/8231441/webrev.02-c2
>>
>> The initial RFR which has some descriptions of the patch:
>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-March/037628.html
>>
>>
>> The description can also be found at:
>> http://cr.openjdk.java.net/~njian/8231441/README-RFR.txt
>>
>> Notes to verify the patch on QEMU user emulation, with an example of
>> compiled code:
>> http://cr.openjdk.java.net/~njian/8231441/running-sve-in-qemu-user.txt
>>
>> Thanks,
>> Ningsheng
>>
>>
>> On 5/27/20 3:23 PM, Ningsheng Jian wrote:
>>> Hi,
>>>
>>> I have rebased this patch with some more comments added. And also
>>> relaxed the instruction matching conditions for 128-bit vector.
>>>
>>> I would appreciate it if someone could help review this.
>>>
>>> Whole patch:
>>> http://cr.openjdk.java.net/~njian/8231441/webrev.01
>>>
>>> Different parts of changes:
>>>
>>> 1) SVE feature detection
>>> http://cr.openjdk.java.net/~njian/8231441/webrev.01-feature
>>>
>>> 2) c2 register allocation
>>> http://cr.openjdk.java.net/~njian/8231441/webrev.01-ra
>>>
>>> 3) SVE c2 backend
>>> http://cr.openjdk.java.net/~njian/8231441/webrev.01-c2
>>>
>>> (Or should I split this into different JBS?)
>>>
>>> Thanks,
>>> Ningsheng
>>>
>>> On 3/25/20 2:37 PM, Ningsheng Jian wrote:
>>>> Hi,
>>>>
>>>> Could you please help to review this patch adding AArch64 SVE support?
>>>> It also touches c2 compiler shared code.
>>>>
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231441
>>>> Webrev: http://cr.openjdk.java.net/~njian/8231441/webrev.00
>>>>
>>>> Arm has released a new vector ISA extension for AArch64, SVE [1] and
>>>> SVE2 [2]. This patch adds the initial SVE support in OpenJDK. In this
>>>> patch we have:
>>>>
>>>> 1) SVE feature enablement and detection
>>>> 2) SVE vector register allocation support with initial predicate
>>>> register definition
>>>> 3) SVE c2 backend for current SLP based vectorizer. (We also have a POC
>>>> patch of a new vectorizer using SVE predicate-driven loop control, but
>>>> that's still under development.)
>>>>
>>>> SVE register definition
>>>> =======================
>>>> Unlike other SIMD architectures, SVE allows hardware implementations to
>>>> choose a vector register length between 128 and 2048 bits, in multiples
>>>> of 128 bits. So we introduce a new vector type VectorA, i.e. a
>>>> length-agnostic (scalable) vector type, and Op_VecA for the machine
>>>> vectora register. In the meantime, to minimize register allocation code
>>>> changes, we also take advantage of one aspect of a JIT compiler: at
>>>> compile time we actually know the real hardware SVE vector register size
>>>> of the running machine. So the register allocator actually knows how
>>>> many register slots an Op_VecA ideal reg requires, and can work fine
>>>> without much modification.
>>>> Since the bottom 128 bits are shared with NEON, we extend the current
>>>> register mask definition of the V0-V31 registers. Currently, c2 uses one
>>>> mask bit per 32-bit register slot, so to cover up to 2048 bits we would
>>>> need 64 slots per register in the AD file. That's a really large number,
>>>> and it would also break current regmask assumptions. Since the SVE vector
>>>> register is architecturally scalable across sizes, we just define
>>>> twice the original NEON vector register slots, i.e. 8 slots: Vx, Vx_H,
>>>> Vx_J ... Vx_O. After adlc, the generated register masks now look like:
>>>>
>>>> const RegMask _VECTORA_REG_mask( 0x0, 0x0, 0xffffffff, 0xffffffff,
>>>> 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, ...
>>>>
>>>> const RegMask _VECTORD_REG_mask( 0x0, 0x0, 0x3030303, 0x3030303,
>>>> 0x3030303, 0x3030303, 0x3030303, 0x3030303, ...
>>>>
>>>> const RegMask _VECTORX_REG_mask( 0x0, 0x0, 0xf0f0f0f, 0xf0f0f0f,
>>>> 0xf0f0f0f, 0xf0f0f0f, 0xf0f0f0f, 0xf0f0f0f, ...
>>>>
>>>> And we use SlotsPerVecA to indicate regmask bit size for a VecA
>>>> register.
>>>>
>>>> For physical register allocation, the register allocator does not need
>>>> to know the real VecA register size. For spill/unspill, however, the
>>>> current register allocator needs to know the actual stack slot size to
>>>> store/load VecA registers. SVE is able to do vector-size-agnostic
>>>> spilling, but to minimize the code changes, as I mentioned before, we
>>>> just let the RA know the actual vector register size of the running
>>>> machine, by calling scalable_vector_reg_size().
>>>>
>>>> In the meantime, some vector operations, e.g. vector multiply and
>>>> vector load/store, have no unpredicated SVE1 instructions, only
>>>> predicated versions. We have therefore also defined predicate registers
>>>> in this patch, and the c2 register allocator will allocate a temp
>>>> predicate register to implement the expected unpredicated operations.
>>>> This can also be used by a future predicate-driven vectorizer. It is not
>>>> efficient for now, as we can see many ptrue instructions in the generated
>>>> code. One possible solution I can see is to block one predicate register
>>>> and preset it to all true. But preserving/reinitializing a caller-saved
>>>> register value across calls seems risky to do in this patch, so I decided
>>>> to defer it to further optimization work. If anyone has any suggestions
>>>> on this, I would appreciate them.
>>>>
>>>> SVE feature detection
>>>> =====================
>>>> Since we may have some compiled code based on the initially detected SVE
>>>> vector register length, and that code is compiled only for that
>>>> vector register length, we assume that the SVE vector register length
>>>> will not change during the JVM lifetime. However, the SVE vector length
>>>> is per-thread and can be changed by a system call [3], so we need to make
>>>> sure that no JNI call changes the SVE vector length.
>>>>
>>>> Currently, we verify the SVE vector register length on each JNI return,
>>>> and if an SVE vector length change is detected, the JVM simply reports an
>>>> error and stops running. The vector length the VM runs with can also be
>>>> set by the existing VM option MaxVectorSize with c2 enabled. If
>>>> MaxVectorSize differs from the system default SVE vector length (in
>>>> /proc/sys/abi/sve_default_vector_length), the JVM will set the current
>>>> process's SVE vector length to the specified vector length.
>>>>
>>>> Compiled code
>>>> =============
>>>> We have added all current c2 backend codegen on par with NEON, but only
>>>> for vector length larger than 128-bit.
>>>>
>>>> On a 1024 bit SVE environment, for the following simple loop with int
>>>> array element type:
>>>>
>>>>     for (int i = 0; i < LENGTH; i++) {
>>>>       c[i] = a[i] + b[i];
>>>>     }
>>>>
>>>> c2 generated loop:
>>>>
>>>>     0x0000ffff811c0820:   sbfiz   x11, x10, #2, #32
>>>>     0x0000ffff811c0824:   add     x13, x18, x11
>>>>     0x0000ffff811c0828:   add     x14, x1, x11
>>>>     0x0000ffff811c082c:   add     x13, x13, #0x10
>>>>     0x0000ffff811c0830:   add     x14, x14, #0x10
>>>>     0x0000ffff811c0834:   add     x11, x0, x11
>>>>     0x0000ffff811c0838:   add     x11, x11, #0x10
>>>>     0x0000ffff811c083c:   ptrue   p1.s    // To be optimized
>>>>     0x0000ffff811c0840:   ld1w    {z16.s}, p1/z, [x14]
>>>>     0x0000ffff811c0844:   ptrue   p0.s
>>>>     0x0000ffff811c0848:   ld1w    {z17.s}, p0/z, [x13]
>>>>     0x0000ffff811c084c:   add     z16.s, z17.s, z16.s
>>>>     0x0000ffff811c0850:   ptrue   p1.s
>>>>     0x0000ffff811c0854:   st1w    {z16.s}, p1, [x11]
>>>>     0x0000ffff811c0858:   add     w10, w10, #0x20
>>>>     0x0000ffff811c085c:   cmp     w10, w12
>>>>     0x0000ffff811c0860:   b.lt    0x0000ffff811c0820
>>>>
>>>> Test
>>>> ====
>>>> Currently, we don't have real hardware to verify SVE features (and
>>>> performance). But we have run jtreg tests with SVE in some emulators. On
>>>> QEMU system emulator, which has SVE emulation support, jtreg tier1-3
>>>> passed with different vector sizes. We've also verified it with full
>>>> jtreg tests without SVE on both x86 and AArch64, to make sure that
>>>> there's no regression.
>>>>
>>>> The patch has also been applied to Vector API code base, and verified on
>>>> emulator. The Vector API has more vector-related tests and is more
>>>> likely to generate vector instructions through intrinsification.
>>>>
>>>> A simple test can also run in QEMU user emulation, e.g.
>>>>
>>>> $ qemu-aarch64 -cpu max,sve-max-vq=2 java -XX:UseSVE=1 SIMD
>>>>
>>>> (
>>>> To run it in user emulation mode, we will need to bypass SVE feature
>>>> detection code in this patch. E.g. apply:
>>>> http://cr.openjdk.java.net/~njian/8231441/user-emulation.patch
>>>> )
>>>>
>>>> Others
>>>> ======
>>>> Since this patch is a bit large, I've also split it into 3 parts, for
>>>> easy review:
>>>>
>>>> 1) SVE feature detection
>>>> http://cr.openjdk.java.net/~njian/8231441/webrev.00-feature
>>>>
>>>> 2) c2 register allocation
>>>> http://cr.openjdk.java.net/~njian/8231441/webrev.00-ra
>>>>
>>>> 3) SVE c2 backend
>>>> http://cr.openjdk.java.net/~njian/8231441/webrev.00-c2
>>>>
>>>> Part of this patch has been contributed by Joshua Zhu and Yang Zhang.
>>>>
>>>> Refs
>>>> ====
>>>> [1] https://developer.arm.com/docs/ddi0584/latest
>>>> [2] https://developer.arm.com/docs/ddi0602/latest
>>>> [3] https://www.kernel.org/doc/Documentation/arm64/sve.txt
>>>>
>>>> Thanks,
>>>> Ningsheng
>>>>
>>>
>>
> 
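
For readers who want to try the QEMU command quoted above, here is a
minimal sketch of what a test class like SIMD could look like. The thread
does not show that class, so the name SimdSketch, the LENGTH value and the
method names are assumptions made for illustration only. The helper
lanesPerVector just restates the arithmetic behind the loop stride: a
1024-bit vector holds 1024 / 32 = 32 ints, which matches the
"add w10, w10, #0x20" increment in the quoted disassembly.

```java
// Hypothetical stand-in for the SIMD test class mentioned in the mail;
// the real class is not shown in the thread.
final class SimdSketch {
    static final int LENGTH = 1024; // assumed size, not taken from the mail

    // The loop from the mail: C2 may auto-vectorize it once it is hot.
    static void add(int[] a, int[] b, int[] c) {
        for (int i = 0; i < c.length; i++) {
            c[i] = a[i] + b[i];
        }
    }

    // Number of elements that fit into one vector register.
    static int lanesPerVector(int vectorBits, int elementBits) {
        return vectorBits / elementBits;
    }

    public static void main(String[] args) {
        int[] a = new int[LENGTH];
        int[] b = new int[LENGTH];
        int[] c = new int[LENGTH];
        for (int i = 0; i < LENGTH; i++) {
            a[i] = i;
            b[i] = 2 * i;
        }
        // Repeat the call so the JIT has a chance to compile add().
        for (int iter = 0; iter < 20_000; iter++) {
            add(a, b, c);
        }
        System.out.println("c[last] = " + c[LENGTH - 1]);
        System.out.println("ints per 1024-bit vector = " + lanesPerVector(1024, 32));
    }
}
```

Whether the loop is actually compiled to SVE instructions depends on the
VM flags and the hardware or emulator in use; the sketch only provides
the Java side.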


From vladimir.kozlov at oracle.com  Fri Jul 31 02:54:28 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 30 Jul 2020 19:54:28 -0700
Subject: [16] RFR(M) 8250233: -XX:+CITime triggers guarantee(events != NULL)
 in jvmci.cpp:173
Message-ID: <e754ff2e-d764-cc2f-58f6-30daccb43dbc@oracle.com>

https://cr.openjdk.java.net/~kvn/8250233/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8250233

The main issue was a missing EnableJVMCI flag check when calling JVMCICompiler::print_compilation_timers(). In addition to
fixing that, I did the following refactoring.

The code which collects and prints statistics per compiler was guarded by #if INCLUDE_JVMCI but not by any JVMCI flags.
As a result, it has been the default code used by all JIT compilers since JVMCI was added in JDK 9.

I decided to make it not JVMCI-specific and use it on all platforms.

I also added statistics per compilation tier, which provide more useful information than the combined data for C1.

Removed the code in CompileBroker::print_times() which calculated total values based on data in the compilers' statistics. Such
data is already collected in CompileBroker's static fields.

Added checks for 0 values in print statements to avoid division by 0 (which produced NaN values for doubles).

Don't print empty data in JVMCICompiler::print_compilation_timers() but print total compilation time in 
JVMCICompiler::print_timers().

Tested hs-tier1-3.

Thanks,
Vladimir

Beginning of CITime new output:

Individual compiler times (for compiled methods only)
------------------------------------------------

   C1 {speed: 49626.710 bytes/s; standard:  0.037 s, 1842 bytes, 35 methods; osr:  0.000 s, 0 bytes, 0 methods; 
nmethods_size: 51096 bytes; nmethods_code_size: 30880 bytes}
   C2 {speed: 1451.769 bytes/s; standard:  0.001 s, 2 bytes, 2 methods; osr:  0.000 s, 0 bytes, 0 methods; 
nmethods_size: 288 bytes; nmethods_code_size: 128 bytes}

Individual compilation Tier times (for compiled methods only)
------------------------------------------------

   Tier1 {speed: 21162.963 bytes/s; standard:  0.002 s, 47 bytes, 10 methods; osr:  0.000 s, 0 bytes, 0 methods; 
nmethods_size: 3160 bytes; nmethods_code_size: 1504 bytes}
   Tier2 {speed:  0.000 bytes/s; standard:  0.000 s, 0 bytes, 0 methods; osr:  0.000 s, 0 bytes, 0 methods; 
nmethods_size: 0 bytes; nmethods_code_size: 0 bytes}
   Tier3 {speed: 51438.195 bytes/s; standard:  0.035 s, 1795 bytes, 25 methods; osr:  0.000 s, 0 bytes, 0 methods; 
nmethods_size: 47936 bytes; nmethods_code_size: 29376 bytes}
   Tier4 {speed: 1451.769 bytes/s; standard:  0.001 s, 2 bytes, 2 methods; osr:  0.000 s, 0 bytes, 0 methods; 
nmethods_size: 288 bytes; nmethods_code_size: 128 bytes}

Accumulated compiler times
----------------------------------------------------------
   Total compilation time   :   0.038 s
     Standard compilation   :   0.038 s, Average : 0.001 s
     Bailed out compilation :   0.000 s, Average : 0.000 s
     On stack replacement   :   0.000 s, Average : 0.000 s
     Invalidated            :   0.000 s, Average : 0.000 s
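
The zero checks mentioned in the mail can be illustrated with a small
sketch. This is not the HotSpot code (which is C++); the class and method
names below are hypothetical. It only shows why an unguarded double
division prints NaN when no time was recorded, and how a guard yields the
"0.000 bytes/s" style of output seen for Tier2 above.

```java
// Hypothetical illustration of a guarded speed computation.
final class SpeedSketch {
    // Returns 0 instead of NaN or Infinity when no time was recorded.
    static double bytesPerSecond(long bytes, double seconds) {
        return seconds > 0.0 ? bytes / seconds : 0.0;
    }

    public static void main(String[] args) {
        // Unguarded: 0 bytes in 0 seconds is 0.0 / 0.0, which is NaN.
        System.out.println(0 / 0.0);
        // Guarded: prints 0.0 for the same input.
        System.out.println(bytesPerSecond(0, 0.0));
        // With real data the guard does not change the result.
        System.out.println(bytesPerSecond(1842, 0.037));
    }
}
```

Note that the times in the CITime output are rounded to milliseconds, so
recomputing a speed from the printed bytes and seconds will not exactly
reproduce the printed speed.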

From viv.desh at gmail.com  Fri Jul 31 04:17:21 2020
From: viv.desh at gmail.com (Vivek Deshpande)
Date: Thu, 30 Jul 2020 21:17:21 -0700
Subject: RFR[M]: Adding MD5 Intrinsic on x86-64
In-Reply-To: <MWHPR21MB0511E66642AA67A20AFCC2C8B04E0@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB0511E66642AA67A20AFCC2C8B04E0@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <CACR9jGOqu3vM8J=n44b5C-FHgAE1ZBt9-AesVtxYqny2Dui=bQ@mail.gmail.com>

Hi Ludovic

Your patch looks good to me. Good reuse of existing code for SHA.
You have not added the stub generation for 32 bit.
Did you also test with a 32 bit build?
Thank you.

Regards,
Vivek

On Thu, Jul 30, 2020 at 6:26 PM Ludovic Henry <luhenry at microsoft.com> wrote:

> JBS: I just got authorship status and I'll create a bug as soon as I have
> access to JBS
> Webrev: http://cr.openjdk.java.net/~luhenry/md5-intrinsics/webrev.00/
>
> The problem ended up not being with how `ofs` was incremented, but with a
> callee-saved register not being restored properly before returning from the
> intrinsic.
>
> The performance results from running with JMH are very encouraging. I ran
> the `org.openjdk.bench.java.security.MessageDigests` with MD5 only enabled,
> and following are the results with and without the intrinsic.
>
> -XX:-UseMD5Intrinsics
> Benchmark              (digesterName)  (length)  (provider)   Mode  Cnt     Score    Error   Units
> MessageDigests.digest             md5        64     DEFAULT  thrpt   10  3459.747 ± 10.508  ops/ms
> MessageDigests.digest             md5      1024     DEFAULT  thrpt   10   446.407 ±  3.383  ops/ms
> MessageDigests.digest             md5     16384     DEFAULT  thrpt   10    30.685 ±  0.676  ops/ms
> MessageDigests.digest             md5   1048576     DEFAULT  thrpt   10     0.483 ±  0.004  ops/ms
>
> -XX:+UseMD5Intrinsics
> Benchmark              (digesterName)  (length)  (provider)   Mode  Cnt     Score    Error   Units
> MessageDigests.digest             md5        64     DEFAULT  thrpt   10  4011.556 ± 10.212  ops/ms
> MessageDigests.digest             md5      1024     DEFAULT  thrpt   10   526.873 ±  2.101  ops/ms
> MessageDigests.digest             md5     16384     DEFAULT  thrpt   10    35.012 ±  0.088  ops/ms
> MessageDigests.digest             md5   1048576     DEFAULT  thrpt   10     0.573 ±  0.002  ops/ms
>
> That's overall a jump from ~483MB/s to ~573MB/s on the 1M chunks, or a
> ~19% speedup.
>
> Thank you,
> Ludovic
>


-- 
Thanks and Regards,

Vivek Deshpande
viv.desh at gmail.com
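
The throughput figures quoted from Ludovic's mail follow directly from
the JMH scores for the 1048576-byte inputs. The sketch below redoes that
arithmetic; the class and method names are made up for illustration, and
"MB" is taken to mean 1024 * 1024 bytes, which is an assumption about
how the numbers in the mail were rounded.

```java
// Hypothetical helper that converts JMH ops/ms scores into throughput.
final class Md5ThroughputSketch {
    // ops/ms -> MiB/s for a given input size in bytes.
    static double mibPerSecond(double opsPerMs, long bytesPerOp) {
        return opsPerMs * 1000.0 * bytesPerOp / (1024.0 * 1024.0);
    }

    // Relative improvement of 'after' over 'before', in percent.
    static double speedupPercent(double before, double after) {
        return (after / before - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        long chunk = 1_048_576L;
        System.out.println(mibPerSecond(0.483, chunk)); // without the intrinsic
        System.out.println(mibPerSecond(0.573, chunk)); // with the intrinsic
        System.out.println(speedupPercent(0.483, 0.573));
    }
}
```

With the scores above this gives roughly 483 and 573 MiB/s and a speedup
of about 18.6%, which rounds to the ~19% stated in the mail.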

From jiefu at tencent.com  Fri Jul 31 05:06:24 2020
From: jiefu at tencent.com (=?utf-8?B?amllZnUo5YKF5p2wKQ==?=)
Date: Fri, 31 Jul 2020 05:06:24 +0000
Subject: RFR: 8250825: C2 crashes with assert(field != __null) failed:
 missing field
Message-ID: <DA37DAA7-F8A3-4A0C-9CDE-242F0555B9CC@tencent.com>

Hi Vladimir K,

Thanks for your review.

The test had been extended here:
  - http://cr.openjdk.java.net/~jiefu/8250825/webrev.01/

Before the patch:
  The unsafe access (put/get) to static field will crash.
  The unsafe access (put/get) to instance field is fine.

After the patch:
  All is ok.

Thanks a lot.
Best regards,
Jie

On 2020/7/31, 2:24 AM, "hotspot-compiler-dev on behalf of Vladimir Kozlov" <hotspot-compiler-dev-retn at openjdk.java.net on behalf of vladimir.kozlov at oracle.com> wrote:

    Hi Jie
    
    Nodes generated by make_unsafe_address() are correct. The issue is that the Unsafe API allows generating an unaligned (to
    fields) offset with an arbitrary type. As a result, the C2 type system can't find the corresponding field.
    
    Did you try to do unaligned unsafe access to instance fields?
    Also try an unsafe set of a value (Store node). There is code in C2 which checks for narrow stores. It would be interesting to see how
    it behaves in the unsafe case.
    
    Please, extend your test.
    
    Otherwise fix is good.
    
    Thanks,
    Vladimir K
    
    On 7/30/20 6:09 AM, jiefu(傅杰) wrote:
    > Hi all,
    > 
    > JBS:    https://bugs.openjdk.java.net/browse/JDK-8250825
    > Webrev: http://cr.openjdk.java.net/~jiefu/8250825/webrev.00/
    > 
    > When C2 tries to inline an unsafe-access method, it may generate the following pattern in make_unsafe_address:
    >        ConP  ConL
    >           \  |
    >            \ |
    >            AddP
    > Current implementation of TypeOopPtr::TypeOopPtr(...) failed to recognize it as an unsafe operation, which leads to the crash.
    > 
    > Testing:
    >    - tier1-3 on Linux/x64
    > 
    > Could you please review it and give me some advice?
    > 
    > Thanks a lot.
    > Best regards,
    > Jie
    > 
    
    
From christian.hagedorn at oracle.com  Fri Jul 31 05:56:03 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Fri, 31 Jul 2020 07:56:03 +0200
Subject: [16] RFR(S): C2: assert(no_dead_loop) failed: dead loop detected
In-Reply-To: <db79062c-3be1-c80e-1023-ae8860363c7e@oracle.com>
References: <66e123b1-35d1-5b96-d0d7-6b4a8cdf2404@oracle.com>
 <db79062c-3be1-c80e-1023-ae8860363c7e@oracle.com>
Message-ID: <c9838211-689b-f2c8-c598-c5f27ea2bfe9@oracle.com>

Hi Vladimir

Thanks a lot for your review!

Best regards,
Christian

On 30.07.20 19:17, Vladimir Kozlov wrote:
> Very good.
> 
> Thanks,
> Vladimir K
> 
> On 7/30/20 6:15 AM, Christian Hagedorn wrote:
>> Hi
>>
>> Please review the following patch:
>> https://bugs.openjdk.java.net/browse/JDK-8249605
>> http://cr.openjdk.java.net/~chagedorn/8249605/webrev.00/
>>
>> There is a dead memory loop detected during IGVN. In the testcase, 
>> many nodes are dying during IGVN because they are not reachable 
>> anymore. In this process, a (not yet dead) memory phi node (150 Phi) 
>> with two inputs is processed (see [1]):
>> (1) 289 MergeMem, whose base memory is 150 Phi and has one slice for 
>> 274 StoreD which is again an output of 150 Phi
>> (2) 356 MergeMem, whose base memory is top (i.e. dead and would be 
>> removed when IGVN processes this node)
>>
>> In PhiNode::Ideal, we check if a phi node is part of a dead loop where 
>> all its inputs reference itself over a MergeMemNode input whose base 
>> memory is the phi node itself again [2]. However, in this check we do 
>> not account for dead MergeMemNodes (like the input 356 MergeMem of 150 
>> Phi). Therefore, we do not return top and apply the optimization [3] 
>> to replace 150 Phi by a new MergeMemNode (380 MergeMem) whose base 
>> memory is top and now has again one slice which is input and output to 
>> 274 StoreD [4]. This cycle is later detected and the assertion fails.
>>
>> The fix accounts additionally for dead MergeMemNodes when trying to 
>> detect dead loops in PhiNode::Ideal to return top instead of a new 
>> dead MergeMemNode.
>>
>> Best regards,
>> Christian
>>
>>
>> [1] 
>> https://bugs.openjdk.java.net/secure/attachment/89582/before_PhiNode_Ideal.png 
>>
>> [2] 
>> http://hg.openjdk.java.net/jdk/jdk/file/8f7ede592c28/src/hotspot/share/opto/cfgnode.cpp#l2234 
>>
>> [3] 
>> http://hg.openjdk.java.net/jdk/jdk/file/8f7ede592c28/src/hotspot/share/opto/cfgnode.cpp#l2246 
>>
>> [4] 
>> https://bugs.openjdk.java.net/secure/attachment/89583/after_PhiNode_Ideal.png 
>>

From tobias.hartmann at oracle.com  Fri Jul 31 07:06:20 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 31 Jul 2020 09:06:20 +0200
Subject: RFR(M): 8067651: Fix Trivial code path for
 LevelTransitionTest.java
In-Reply-To: <58fd3cd5-cdce-8e15-3237-d22a3566b0da@oracle.com>
References: <58fd3cd5-cdce-8e15-3237-d22a3566b0da@oracle.com>
Message-ID: <ca490ea2-88f3-db96-a85f-b5dc06438036@oracle.com>

Hi Evgeny,

looks good to me.

Best regards,
Tobias

On 27.07.20 21:38, Evgeny Nikitin wrote:
> Hi,
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8067651
> Webrev: https://cr.openjdk.java.net/~enikitin/8067651/webrev.00/
> 
> Adjusting the test to current state of the VM.
> 
>     - Definition of 'trivial code' does not depend on whether the method has been profiled or not;
>     - Trivial code does only go level 0 to level 1;
>     - Some refactoring.
> 
> The change has been checked in mach5 for the 5 platforms (passed).
> 
> Please review,
> /Evgeny Nikitin.

From tobias.hartmann at oracle.com  Fri Jul 31 07:10:01 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 31 Jul 2020 09:10:01 +0200
Subject: RFR[XS] 8249809 avoid calling DirectiveSet::clone(this) in
 compilecommand_compatibility_init
In-Reply-To: <CA+3eh10yzWC1phhCSv9CMZRiH=A1ySJ6ij5jTQq0jeGpttxi6g@mail.gmail.com>
References: <1595807197546.52082@amazon.com>
 <CA+3eh10LEgJdcSAFrtsE=Ycy2oHJR3Hk2ddrTAFYLiC1E-sJ2w@mail.gmail.com>
 <1595907547514.55531@amazon.com>
 <CA+3eh10dQqSJyu+odn1zfLgMXw_oQaMN5+9icSp0LeERwui7JA@mail.gmail.com>
 <1595969785292.62158@amazon.com>
 <eba1ae8f-9409-48d2-64fa-6bb1e069bb6d@oracle.com>
 <CA+3eh10yzWC1phhCSv9CMZRiH=A1ySJ6ij5jTQq0jeGpttxi6g@mail.gmail.com>
Message-ID: <a7cdcd85-b648-2376-2c39-7c00e6976b3e@oracle.com>


On 29.07.20 16:34, Volker Simonis wrote:
> Because there's also a "read-only" access  of the DirectiveSetPtr
> which doesn't mutate its content and therefore should clone the
> underlying DirectiveSet. See my first mail where I proposed to add a
> second, `const`-version of "operator->". But that still required const
> casts in the places where we didn't want to clone. I've therefore
> voted for the new "cloned()" method which makes cloning and mutating
> explicit and which is much easier to understand from my point of view
> (compared to two overloaded operators).

Right, I've missed the "set->LogOption" usage.

Best regards,
Tobias
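
The design Volker describes (reads that never copy, plus an explicit
cloned() call before mutating) is a form of copy-on-write. The sketch
below shows the idea in Java. It is not the DirectiveSetPtr code from the
webrev, which is C++; the class name, the Map-based representation and
the helper methods are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical copy-on-write handle, loosely modelled on the discussion.
final class DirectiveSetPtrSketch {
    private final Map<String, Boolean> origin; // shared, must not be mutated
    private Map<String, Boolean> clone;        // private copy, created lazily

    DirectiveSetPtrSketch(Map<String, Boolean> origin) {
        this.origin = origin;
    }

    // Read-only access: never copies; sees the clone once one exists.
    boolean get(String option) {
        Map<String, Boolean> current = (clone != null) ? clone : origin;
        return current.getOrDefault(option, false);
    }

    // Explicit request for a mutable copy; the copy is made at most once.
    Map<String, Boolean> cloned() {
        if (clone == null) {
            clone = new HashMap<>(origin);
        }
        return clone;
    }

    boolean hasClone() {
        return clone != null;
    }
}
```

A caller that only reads an option (such as the "set->LogOption" use
mentioned above) never triggers a copy, while a caller that wants to
change an option has to go through cloned() and so makes the cost
visible at the call site.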

From tobias.hartmann at oracle.com  Fri Jul 31 07:12:11 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 31 Jul 2020 09:12:11 +0200
Subject: RFR[XS] 8249809 avoid calling DirectiveSet::clone(this) in
 compilecommand_compatibility_init
In-Reply-To: <1596137602709.774@amazon.com>
References: <1595807197546.52082@amazon.com>
 <CA+3eh10LEgJdcSAFrtsE=Ycy2oHJR3Hk2ddrTAFYLiC1E-sJ2w@mail.gmail.com>
 <1595907547514.55531@amazon.com>
 <CA+3eh10dQqSJyu+odn1zfLgMXw_oQaMN5+9icSp0LeERwui7JA@mail.gmail.com>
 <1595969785292.62158@amazon.com>
 <eba1ae8f-9409-48d2-64fa-6bb1e069bb6d@oracle.com>
 <CA+3eh10yzWC1phhCSv9CMZRiH=A1ySJ6ij5jTQq0jeGpttxi6g@mail.gmail.com>
 <1596056152748.75196@amazon.com>
 <CA+3eh13Fy6-_M_rd6gc_TN701AW+CckQt19o4dwAMT+RhSaSMQ@mail.gmail.com>
 <1596137602709.774@amazon.com>
Message-ID: <ab2ad437-59d4-e97a-23e6-68baff927761@oracle.com>


On 30.07.20 21:33, Liu, Xin wrote:
> https://cr.openjdk.java.net/~xliu/8249809/03/webrev/

Looks good to me. Thanks for making these changes!

Best regards,
Tobias

From tobias.hartmann at oracle.com  Fri Jul 31 07:25:26 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 31 Jul 2020 09:25:26 +0200
Subject: [16] RFR(S): C2: assert(no_dead_loop) failed: dead loop detected
In-Reply-To: <66e123b1-35d1-5b96-d0d7-6b4a8cdf2404@oracle.com>
References: <66e123b1-35d1-5b96-d0d7-6b4a8cdf2404@oracle.com>
Message-ID: <c107f91a-aa62-bfb1-811f-36b3178d9dc4@oracle.com>

Hi Christian,

nice analysis, looks good to me!

Small typo in cfgnode.cpp:2239 "reference" -> "references"

Best regards,
Tobias

On 30.07.20 15:15, Christian Hagedorn wrote:
> Hi
> 
> Please review the following patch:
> https://bugs.openjdk.java.net/browse/JDK-8249605
> http://cr.openjdk.java.net/~chagedorn/8249605/webrev.00/
> 
> There is a dead memory loop detected during IGVN. In the testcase, many nodes are dying during IGVN
> because they are not reachable anymore. In this process, a (not yet dead) memory phi node (150 Phi)
> with two inputs is processed (see [1]):
> (1) 289 MergeMem, whose base memory is 150 Phi and has one slice for 274 StoreD which is again an
> output of 150 Phi
> (2) 356 MergeMem, whose base memory is top (i.e. dead and would be removed when IGVN processes this
> node)
> 
> In PhiNode::Ideal, we check if a phi node is part of a dead loop where all its inputs reference
> itself over a MergeMemNode input whose base memory is the phi node itself again [2]. However, in
> this check we do not account for dead MergeMemNodes (like the input 356 MergeMem of 150 Phi).
> Therefore, we do not return top and apply the optimization [3] to replace 150 Phi by a new
> MergeMemNode (380 MergeMem) whose base memory is top and now has again one slice which is input and
> output to 274 StoreD [4]. This cycle is later detected and the assertion fails.
> 
> The fix accounts additionally for dead MergeMemNodes when trying to detect dead loops in
> PhiNode::Ideal to return top instead of a new dead MergeMemNode.
> 
> Best regards,
> Christian
> 
> 
> [1] https://bugs.openjdk.java.net/secure/attachment/89582/before_PhiNode_Ideal.png
> [2] http://hg.openjdk.java.net/jdk/jdk/file/8f7ede592c28/src/hotspot/share/opto/cfgnode.cpp#l2234
> [3] http://hg.openjdk.java.net/jdk/jdk/file/8f7ede592c28/src/hotspot/share/opto/cfgnode.cpp#l2246
> [4] https://bugs.openjdk.java.net/secure/attachment/89583/after_PhiNode_Ideal.png

From tobias.hartmann at oracle.com  Fri Jul 31 07:49:46 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 31 Jul 2020 09:49:46 +0200
Subject: RFR: 8250825: C2 crashes with assert(field != __null) failed:
 missing field
In-Reply-To: <DA37DAA7-F8A3-4A0C-9CDE-242F0555B9CC@tencent.com>
References: <DA37DAA7-F8A3-4A0C-9CDE-242F0555B9CC@tencent.com>
Message-ID: <a896b8df-d477-7a5d-6074-7a402e2d2a91@oracle.com>

Hi Jie,

On 31.07.20 07:06, jiefu(傅杰) wrote:
>  http://cr.openjdk.java.net/~jiefu/8250825/webrev.01/

Looks good to me.

Some comments regarding TestUnsafeAccess.java:
- Maybe rename the test to something more meaningful, for example "TestMisalignedUnsafeAccess" and
add a small comment in the @summary tag
- Xcomp already implies Xbatch [1]
- I don't think you need 'initUnsafe' in the test, you can just use Unsafe.getUnsafe [2]

Best regards,
Tobias

[1] http://hg.openjdk.java.net/jdk/jdk/file/83aeb4b1079b/src/hotspot/share/runtime/arguments.cpp#l1612
[2]
http://hg.openjdk.java.net/jdk/jdk/file/83aeb4b1079b/test/hotspot/jtreg/compiler/unsafe/UnsafeGetStableArrayElement.java#l67

From christian.hagedorn at oracle.com  Fri Jul 31 08:38:36 2020
From: christian.hagedorn at oracle.com (Christian Hagedorn)
Date: Fri, 31 Jul 2020 10:38:36 +0200
Subject: [16] RFR(S): C2: assert(no_dead_loop) failed: dead loop detected
In-Reply-To: <c107f91a-aa62-bfb1-811f-36b3178d9dc4@oracle.com>
References: <66e123b1-35d1-5b96-d0d7-6b4a8cdf2404@oracle.com>
 <c107f91a-aa62-bfb1-811f-36b3178d9dc4@oracle.com>
Message-ID: <e0c6edf8-3184-4bab-c99b-3265a3d01699@oracle.com>

Hi Tobias

Thanks a lot for your review!

I updated the typo directly in webrev.00.

Best regards,
Christian

On 31.07.20 09:25, Tobias Hartmann wrote:
> Hi Christian,
> 
> nice analysis, looks good to me!
> 
> Small typo in cfgnode.cpp:2239 "reference" -> "references"
> 
> Best regards,
> Tobias
> 
> On 30.07.20 15:15, Christian Hagedorn wrote:
>> Hi
>>
>> Please review the following patch:
>> https://bugs.openjdk.java.net/browse/JDK-8249605
>> http://cr.openjdk.java.net/~chagedorn/8249605/webrev.00/
>>
>> There is a dead memory loop detected during IGVN. In the testcase, many nodes are dying during IGVN
>> because they are not reachable anymore. In this process, a (not yet dead) memory phi node (150 Phi)
>> with two inputs is processed (see [1]):
>> (1) 289 MergeMem, whose base memory is 150 Phi and has one slice for 274 StoreD which is again an
>> output of 150 Phi
>> (2) 356 MergeMem, whose base memory is top (i.e. dead and would be removed when IGVN processes this
>> node)
>>
>> In PhiNode::Ideal, we check if a phi node is part of a dead loop where all its inputs reference
>> itself over a MergeMemNode input whose base memory is the phi node itself again [2]. However, in
>> this check we do not account for dead MergeMemNodes (like the input 356 MergeMem of 150 Phi).
>> Therefore, we do not return top and apply the optimization [3] to replace 150 Phi by a new
>> MergeMemNode (380 MergeMem) whose base memory is top and now has again one slice which is input and
>> output to 274 StoreD [4]. This cycle is later detected and the assertion fails.
>>
>> The fix accounts additionally for dead MergeMemNodes when trying to detect dead loops in
>> PhiNode::Ideal to return top instead of a new dead MergeMemNode.
>>
>> Best regards,
>> Christian
>>
>>
>> [1] https://bugs.openjdk.java.net/secure/attachment/89582/before_PhiNode_Ideal.png
>> [2] http://hg.openjdk.java.net/jdk/jdk/file/8f7ede592c28/src/hotspot/share/opto/cfgnode.cpp#l2234
>> [3] http://hg.openjdk.java.net/jdk/jdk/file/8f7ede592c28/src/hotspot/share/opto/cfgnode.cpp#l2246
>> [4] https://bugs.openjdk.java.net/secure/attachment/89583/after_PhiNode_Ideal.png

From sergei.tsypanov at yandex.ru  Fri Jul 31 09:11:39 2020
From: sergei.tsypanov at yandex.ru (=?utf-8?B?0KHQtdGA0LPQtdC5INCm0YvQv9Cw0L3QvtCy?=)
Date: Fri, 31 Jul 2020 11:11:39 +0200
Subject: Performance degradation due to probable (?) C2 issue
References: <925401595926726@mail.yandex.ru>
 <da261d22-3801-bf63-b44c-c181e80f6209@redhat.com>
Message-ID: <477011596183779@mail.yandex.ru>

Hi,

could I ask one more question?

You wrote previously:

> Here, it's detected that CharArrayWriter::toString is large
> and has already been compiled so there's no sense inlining another copy of it.

So since CharArrayWriter::toString is compiled but not inlined into the call site,
we have to do a real method call, and it is this call that causes the degradation, right?

Regards,
Sergey Tsypanov


28.07.2020, 14:12, "Andrew Haley" <aph at redhat.com>:
> Hi,
>
> On 28/07/2020 11:35, Сергей Цыпанов wrote:
>
>>  So my question is whether there's something wrong with the compiler or
>>  the original idea of the improvement was wrong?
>
> No, and (probably) no.
>
> C2 uses a bunch of heuristics. Here, it's detected that
> CharArrayWriter::toString is large and has already been compiled so
> there's no sense inlining another copy of it. This isn't necessarily
> true, but it's a good guess. Try playing with InlineSmallCode: start
> with =1000, and increase it from there to see if it helps.
>
> --
> Andrew Haley (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From jiefu at tencent.com  Fri Jul 31 09:40:47 2020
From: jiefu at tencent.com (=?gb2312?B?amllZnUouLW93Ck=?=)
Date: Fri, 31 Jul 2020 09:40:47 +0000
Subject: RFR: 8250825: C2 crashes with assert(field != __null) failed:
 missing field(Internet mail)
In-Reply-To: <a896b8df-d477-7a5d-6074-7a402e2d2a91@oracle.com>
References: <DA37DAA7-F8A3-4A0C-9CDE-242F0555B9CC@tencent.com>,
 <a896b8df-d477-7a5d-6074-7a402e2d2a91@oracle.com>
Message-ID: <402d0d22a984470483fef9761d01ad64@tencent.com>

Hi Tobias,


Thanks for your review and comments.


Updated: http://cr.openjdk.java.net/~jiefu/8250825/webrev.02/


 - Rename the test to TestMisalignedUnsafeAccess.java

 - Add @summary tag

 - Remove Xbatch

 - Remove initUnsafe


It seems better now.

Thanks.


Best regards,

Jie


________________________________
From: Tobias Hartmann <tobias.hartmann at oracle.com>
Sent: Friday, July 31, 2020 3:49 PM
To: jiefu(傅杰); Vladimir Kozlov; hotspot-compiler-dev at openjdk.java.net
Subject: Re: RFR: 8250825: C2 crashes with assert(field != __null) failed: missing field(Internet mail)

Hi Jie,

On 31.07.20 07:06, jiefu(傅杰) wrote:
>  http://cr.openjdk.java.net/~jiefu/8250825/webrev.01/

Looks good to me.

Some comments regarding TestUnsafeAccess.java:
- Maybe rename the test to something more meaningful, for example "TestMisalignedUnsafeAccess" and
add a small comment in the @summary tag
- Xcomp already implies Xbatch [1]
- I don't think you need 'initUnsafe' in the test, you can just use Unsafe.getUnsafe [2]

Best regards,
Tobias

[1] http://hg.openjdk.java.net/jdk/jdk/file/83aeb4b1079b/src/hotspot/share/runtime/arguments.cpp#l1612
[2]
http://hg.openjdk.java.net/jdk/jdk/file/83aeb4b1079b/test/hotspot/jtreg/compiler/unsafe/UnsafeGetStableArrayElement.java#l67


From tobias.hartmann at oracle.com  Fri Jul 31 10:00:03 2020
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Fri, 31 Jul 2020 12:00:03 +0200
Subject: RFR: 8250825: C2 crashes with assert(field != __null) failed:
 missing field(Internet mail)
In-Reply-To: <402d0d22a984470483fef9761d01ad64@tencent.com>
References: <DA37DAA7-F8A3-4A0C-9CDE-242F0555B9CC@tencent.com>
 <a896b8df-d477-7a5d-6074-7a402e2d2a91@oracle.com>
 <402d0d22a984470483fef9761d01ad64@tencent.com>
Message-ID: <6c860e7b-832f-3583-b282-7ea44e76e00b@oracle.com>

Hi Jie,

On 31.07.20 11:40, jiefu(傅杰) wrote:
> Updated: http://cr.openjdk.java.net/~jiefu/8250825/webrev.02/

Looks good.

Best regards,
Tobias

From forax at univ-mlv.fr  Fri Jul 31 12:38:11 2020
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 31 Jul 2020 14:38:11 +0200 (CEST)
Subject: Performance degradation due to probable (?) C2 issue
In-Reply-To: <477011596183779@mail.yandex.ru>
References: <925401595926726@mail.yandex.ru>
 <da261d22-3801-bf63-b44c-c181e80f6209@redhat.com>
 <477011596183779@mail.yandex.ru>
Message-ID: <1821013837.321064.1596199091093.JavaMail.zimbra@u-pem.fr>

----- Original message -----
> From: "Сергей Цыпанов" <sergei.tsypanov at yandex.ru>
> To: "Andrew Haley" <aph at redhat.com>, "hotspot compiler" <hotspot-compiler-dev at openjdk.java.net>
> Sent: Friday, 31 July 2020 11:11:39
> Subject: Re: Performance degradation due to probable (?) C2 issue

> Hi,
> 
> could I ask one more question?
> 
> You wrote previously:
> 
>> Here, it's detected that CharArrayWriter::toString is large
>> and has already been compiled so there's no sense inlining another copy of it.
> 
> So since CharArrayWriter::toString is compiled but not inlined into the call
> site, we have to do a real method call, and it is this call that causes the
> degradation, right?

Yes.
If you never share code, you end up with several gigabytes of assembly code, which destroys your performance because you start to get a lot of instruction cache misses.
So there is a trade-off between a theoretical fully inlined program and a never-inlined program.

> 
> Regards,
> Sergey Tsypanov

regards,
Rémi

> 
> 
> 
> 28.07.2020, 14:12, "Andrew Haley" <aph at redhat.com>:
>> Hi,
>>
>> On 28/07/2020 11:35, Сергей Цыпанов wrote:
>>
>>>  So my question is whether there's something wrong with the compiler or
>>>  the original idea of the improvement was wrong?
>>
>> No, and (probably) no.
>>
>> C2 uses a bunch of heuristics. Here, it's detected that
>> CharArrayWriter::toString is large and has already been compiled so
>> there's no sense inlining another copy of it. This isn't necessarily
>> true, but it's a good guess. Try playing with InlineSmallCode: start
>> with =1000, and increase it from there to see if it helps.
>>
>> --
>> Andrew Haley (he/him)
>> Java Platform Lead Engineer
>> Red Hat UK Ltd. <https://www.redhat.com>
>> https://keybase.io/andrewhaley
> > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From vladimir.kozlov at oracle.com  Fri Jul 31 16:45:23 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 31 Jul 2020 09:45:23 -0700
Subject: RFR: 8250825: C2 crashes with assert(field != __null) failed:
 missing field
In-Reply-To: <DA37DAA7-F8A3-4A0C-9CDE-242F0555B9CC@tencent.com>
References: <DA37DAA7-F8A3-4A0C-9CDE-242F0555B9CC@tencent.com>
Message-ID: <83167893-0924-860c-b2eb-fce9348d16eb@oracle.com>

Good.

thanks,
Vladimir K

On 7/30/20 10:06 PM, jiefu(傅杰) wrote:
> Hi Vladimir K,
> 
> Thanks for your review.
> 
> The test had been extended here:
>    - http://cr.openjdk.java.net/~jiefu/8250825/webrev.01/
> 
> Before the patch:
>    The unsafe access (put/get) to static field will crash.
>    The unsafe access (put/get) to instance field is fine.
> 
> After the patch:
>    All is ok.
> 
> Thanks a lot.
> Best regards,
> Jie
> 
> On 2020/7/31, 2:24 AM, "hotspot-compiler-dev on behalf of Vladimir Kozlov" <hotspot-compiler-dev-retn at openjdk.java.net on behalf of vladimir.kozlov at oracle.com> wrote:
> 
>      Hi Jie
>      
>      Nodes generated by make_unsafe_address() are correct. The issue is that the Unsafe API allows generating an unaligned (to
>      fields) offset with an arbitrary type. As a result, the C2 type system can't find the corresponding field.
>      
>      Did you try to do unaligned unsafe access to instance fields?
>      Also try an unsafe set of a value (Store node). There is code in C2 which checks for narrow stores. It would be interesting to see how
>      it behaves in the unsafe case.
>      
>      Please, extend your test.
>      
>      Otherwise fix is good.
>      
>      Thanks,
>      Vladimir K
>      
>      On 7/30/20 6:09 AM, jiefu(傅杰) wrote:
>      > Hi all,
>      >
>      > JBS:    https://bugs.openjdk.java.net/browse/JDK-8250825
>      > Webrev: http://cr.openjdk.java.net/~jiefu/8250825/webrev.00/
>      >
>      > When C2 tries to inline an unsafe-access method, it may generate the following pattern in make_unsafe_address:
>      >        ConP  ConL
>      >           \  |
>      >            \ |
>      >            AddP
>      > Current implementation of TypeOopPtr::TypeOopPtr(...) failed to recognize it as an unsafe operation, which leads to the crash.
>      >
>      > Testing:
>      >    - tier1-3 on Linux/x64
>      >
>      > Could you please review it and give me some advice?
>      >
>      > Thanks a lot.
>      > Best regards,
>      > Jie
>      >
>      
>      
> 

From igor.ignatyev at oracle.com  Fri Jul 31 17:11:39 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Fri, 31 Jul 2020 10:11:39 -0700
Subject: RFR(M): 8067651: Fix Trivial code path for
 LevelTransitionTest.java
In-Reply-To: <58fd3cd5-cdce-8e15-3237-d22a3566b0da@oracle.com>
References: <58fd3cd5-cdce-8e15-3237-d22a3566b0da@oracle.com>
Message-ID: <970076A7-1F18-4E88-994F-802590AF4F9B@oracle.com>

Hi Evgeny,

in general looks good to me, a couple comments/questions though:
 - I don't see the necessity of moving the Helper.* methods into the enclosing class, nor do I see it improving the readability of the test. Why did you decide to move them?
 - If the test is inapplicable to an Xcomp run, you should either throw SkippedException instead of System.err::println at L#67 or use '@requires vm.compMode != "Xcomp"' in the jtreg test description. Currently, the former provides an arguably clearer message that the test wasn't run (as it sets a special sub-status which is understood by our test execution system) than the latter (which will just omit the test from the test results altogether); however, @requires is "faster" as jtreg doesn't need to run any of the test code. In any case, both make it clear that the test wasn't really performed, while your code will lead to a passed test with no automated way to know that the test wasn't run.
 - From your explanation of the fix it's also unclear why BackgroundCompilation got disabled; could you please explain?

Thanks,
-- Igor

> On Jul 27, 2020, at 12:38 PM, Evgeny Nikitin <evgeny.nikitin at oracle.com> wrote:
> 
> Hi,
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8067651
> Webrev: https://cr.openjdk.java.net/~enikitin/8067651/webrev.00/
> 
> Adjusting the test to current state of the VM.
> 
>    - Definition of 'trivial code' does not depend on whether the method has been profiled or not;
>    - Trivial code only goes from level 0 to level 1;
>    - Some refactoring.
> 
> The change has been checked in mach5 for the 5 platforms (passed).
> 
> Please review,
> /Evgeny Nikitin.


From coleen.phillimore at oracle.com  Fri Jul 31 19:38:09 2020
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Fri, 31 Jul 2020 15:38:09 -0400
Subject: RFR (XXL): 8223347: Integration of Vector API (Incubator):
 General HotSpot changes
In-Reply-To: <38a7fe74-0c5e-4a28-b128-24c40b8ea01e@oracle.com>
References: <c1bdf88c-5de2-d069-5f31-5a95c6988bf8@oracle.com>
 <38a7fe74-0c5e-4a28-b128-24c40b8ea01e@oracle.com>
Message-ID: <9c538834-903b-5431-bb43-908b58a1b70a@oracle.com>

The runtime code still looks good to me.
Coleen

On 7/28/20 6:29 PM, Vladimir Ivanov wrote:
> Hi,
>
> Thanks for the feedback on webrev.00, Remi, Coleen, Vladimir K., and 
> Ekaterina!
>
> Here are the latest changes for Vector API support in HotSpot shared 
> code:
>
> http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.01 
>
>
> Incremental changes (diff against webrev.00):
>
> http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.01_00 
>
>
> I decided to post it here and not initiate a new round of reviews 
> because the changes are mostly limited to minor cleanups / simple bug 
> fixes.
>
> Detailed summary:
>   - rebased to jdk/jdk tip;
>   - got rid of NotV, VLShiftV, VRShiftV, VURShiftV nodes;
>   - restore lazy cleanup logic during incremental inlining (see
> needs_cleanup in compile.cpp);
>   - got rid of x86-specific changes in shared code;
>   - fix for 8244867 [1];
>   - fix Graal test failure: enumerate VectorSupport intrinsics in
> CheckGraalIntrinsics
>   - numerous minor cleanups
>
> Best regards,
> Vladimir Ivanov
>
> [1] http://hg.openjdk.java.net/panama/dev/rev/dcfc7b6e8977
>     http://jbs.oracle.com/browse/JDK-8244867
>     8244867: 2 vector api tests crash with
> assert(is_reference_type(basic_type())) failed: wrong type
> Summary: Adding safety checks to prevent intrinsification if class 
> arguments of non-primitive types are uninitialized.
>
> On 04.04.2020 02:12, Vladimir Ivanov wrote:
>> Hi,
>>
>> Following up on review requests of API [0] and Java implementation 
>> [1] for Vector API (JEP 338 [2]), here's a request for review of 
>> general HotSpot changes (in shared code) required for supporting the 
>> API:
>>
>>
>> http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/all.00-03/ 
>>
>>
>> (First of all, to set proper expectations: since the JEP is still in 
>> Candidate state, the intention is to initiate preliminary round(s) of 
>> review to inform the community and gather feedback before sending out 
>> final/official RFRs once the JEP is Targeted to a release.)
>>
>> Vector API (being developed in Project Panama [3]) relies on JVM 
>> support to utilize optimal vector hardware instructions at runtime. 
>> It interacts with JVM through intrinsics (declared in 
>> jdk.internal.vm.vector.VectorSupport [4]) which expose vector 
>> operations support in C2 JIT-compiler.
>>
>> As Paul wrote earlier: "A vector intrinsic is an internal low-level 
>> vector operation. The last argument to the intrinsic is fall back 
>> behavior in Java, implementing the scalar operation over the number 
>> of elements held by the vector. Thus, if the intrinsic is not
>> supported in C2 for the other arguments then the Java implementation 
>> is executed (the Java implementation is always executed when running 
>> in the interpreter or for C1)."
>>
>> The rest of JVM support is about aggressively optimizing vector boxes 
>> to minimize (ideally eliminate) the overhead of boxing for vector 
>> values.
>> It's a stopgap solution for the vector box elimination problem until
>> inline classes arrive. Vector classes are value-based and in the 
>> longer term will be migrated to inline classes once the support 
>> becomes available.
>>
>> Vector API talk from JVMLS'18 [5] contains brief overview of JVM 
>> implementation and some details.
>>
>> Complete implementation resides in vector-unstable branch of 
>> panama/dev repository [6].
>>
>> Now to gory details (the patch is split in multiple "sub-webrevs"):
>>
>> ===========================================================
>>
>> (1) 
>> http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/00.backend.shared/ 
>>
>>
>> Ideal vector nodes for new operations introduced by Vector API.
>>
>> (Platform-specific back end support will be posted for review 
>> separately).
>>
>> ===========================================================
>>
>> (2) 
>> http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/01.intrinsics/ 
>>
>>
>> JVM Java interface (VectorSupport) and intrinsic support in C2.
>>
>> Vector instances are initially represented as VectorBox macro nodes 
>> and "unboxing" is represented by VectorUnbox node. It simplifies 
>> vector box elimination analysis and the nodes are expanded later 
>> right before EA pass.
>>
>> Vectors have 2-level on-heap representation: for the vector value 
>> primitive array is used as a backing storage and it is encapsulated 
>> in a typed wrapper (e.g., Int256Vector - vector of 8 ints - contains 
>> an int[8] instance which is used to store the vector value).
>>
>> Unless VectorBox node goes away, it needs to be expanded into an 
>> allocation eventually, but it is a pure node and doesn't have any JVM 
>> state associated with it. The problem is solved by keeping JVM state 
>> separately in a VectorBoxAllocate node associated with VectorBox node 
>> and use it during expansion.
>>
>> Also, to simplify vector box elimination, inlining of vector reboxing 
>> calls (VectorSupport::maybeRebox) is delayed until the analysis is over.
>>
>> ===========================================================
>>
>> (3) 
>> http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/02.vbox_elimination/ 
>>
>>
>> Vector box elimination analysis implementation. (Brief overview: 
>> slides #36-42 [5].)
>>
>> The main part is devoted to scalarization across safepoints and 
>> rematerialization support during deoptimization. In C2-generated code 
>> vector operations work with raw vector values which live in registers 
>> or spilled on the stack and it allows to avoid boxing/unboxing when a 
>> vector value is alive across a safepoint. As with other values, 
>> there's just a location of the vector value at the safepoint and 
>> vector type information recorded in the relevant nmethod metadata and 
>> all the heavy-lifting happens only when rematerialization takes place.
>>
>> The analysis preserves object identity invariants except during 
>> aggressive reboxing (guarded by -XX:+EnableAggressiveReboxing).
>>
>> (Aggressive reboxing is crucial for cases when vectors "escape": it 
>> allocates a fresh instance at every escape point thus enabling 
>> original instance to go away.)
>>
>> ===========================================================
>>
>> (4) 
>> http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/03.module.hotspot/ 
>>
>>
>> HotSpot changes for jdk.incubator.vector module. Vector support is
>> marked experimental and turned off by default. JEP 338 proposes the
>> API to be released as an incubator module, so a user has to specify
>> "--add-modules jdk.incubator.vector" on the command line to be able to
>> use it.
>> When the user does that, the JVM automatically enables Vector API support.
>> It improves usability (the user doesn't need to separately "open" the API
>> and enable JVM support) while minimizing risks of destabilization
>> from new code when the API is not used.
>>
>>
>> That's it! Will be happy to answer any questions.
>>
>> And thanks in advance for any feedback!
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> [0] 
>> https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-March/065345.html 
>>
>>
>> [1] 
>> https://mail.openjdk.java.net/pipermail/hotspot-dev/2020-April/041228.html
>>
>> [2] https://openjdk.java.net/jeps/338
>>
>> [3] https://openjdk.java.net/projects/panama/
>>
>> [4] 
>> http://cr.openjdk.java.net/~vlivanov/panama/vector/jep338/hotspot.shared/webrev.00/01.intrinsics/src/java.base/share/classes/jdk/internal/vm/vector/VectorSupport.java.html 
>>
>>
>> [5] http://cr.openjdk.java.net/~vlivanov/talks/2018_JVMLS_VectorAPI.pdf
>>
>> [6] http://hg.openjdk.java.net/panama/dev/shortlog/92bbd44386e9
>>
>>      $ hg clone http://hg.openjdk.java.net/panama/dev/ -b vector-unstable


From luhenry at microsoft.com  Fri Jul 31 21:27:25 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Fri, 31 Jul 2020 21:27:25 +0000
Subject: RFR[M]: Adding MD5 Intrinsic on x86-64
In-Reply-To: <CACR9jGOqu3vM8J=n44b5C-FHgAE1ZBt9-AesVtxYqny2Dui=bQ@mail.gmail.com>
References: <MWHPR21MB0511E66642AA67A20AFCC2C8B04E0@MWHPR21MB0511.namprd21.prod.outlook.com>,
 <CACR9jGOqu3vM8J=n44b5C-FHgAE1ZBt9-AesVtxYqny2Dui=bQ@mail.gmail.com>
Message-ID: <MWHPR21MB05116DACB30B8F9D6C330FDCB04E0@MWHPR21MB0511.namprd21.prod.outlook.com>

Hi Vivek,

Thank you for your review.

> You have not added the stub generation for 32 bit.
> Did you also test with a 32 bit build?

I've added and tested it.

Webrev: http://cr.openjdk.java.net/~luhenry/md5-intrinsics/webrev.01

--
Ludovic

________________________________________
From: Vivek Deshpande <mailto:viv.desh at gmail.com>
Sent: Thursday, July 30, 2020 9:17:21 PM
To: Ludovic Henry <mailto:luhenry at microsoft.com>
Cc: Dean Long <mailto:dean.long at oracle.com>; Vladimir Ivanov <mailto:vladimir.x.ivanov at oracle.com>; mailto:hotspot-compiler-dev at openjdk.java.net <mailto:hotspot-compiler-dev at openjdk.java.net>
Subject: Re: RFR[M]: Adding MD5 Intrinsic on x86-64

Hi Ludovic

Your patch looks good to me. Good reuse of existing code for SHA.
You have not added the stub generation for 32 bit.
Did you also test with a 32 bit build?
Thank you.

Regards,
Vivek

On Thu, Jul 30, 2020 at 6:26 PM Ludovic Henry <mailto:luhenry at microsoft.com> wrote:
JBS: I just got authorship status and I'll create a bug as soon as I have access to JBS
Webrev: http://cr.openjdk.java.net/~luhenry/md5-intrinsics/webrev.00/

The problem ended up not being with how `ofs` was incremented, but with a callee-saved register not being restored properly before returning from the intrinsic.

The performance results from running with JMH are very encouraging. I ran the `org.openjdk.bench.java.security.MessageDigests` benchmark with only MD5 enabled; the results without and with the intrinsic follow.

-XX:-UseMD5Intrinsics
Benchmark              (digesterName)  (length)  (provider)   Mode  Cnt     Score    Error   Units
MessageDigests.digest             md5        64     DEFAULT  thrpt   10  3459.747 ± 10.508  ops/ms
MessageDigests.digest             md5      1024     DEFAULT  thrpt   10   446.407 ±  3.383  ops/ms
MessageDigests.digest             md5     16384     DEFAULT  thrpt   10    30.685 ±  0.676  ops/ms
MessageDigests.digest             md5   1048576     DEFAULT  thrpt   10     0.483 ±  0.004  ops/ms

-XX:+UseMD5Intrinsics
Benchmark              (digesterName)  (length)  (provider)   Mode  Cnt     Score    Error   Units
MessageDigests.digest             md5        64     DEFAULT  thrpt   10  4011.556 ± 10.212  ops/ms
MessageDigests.digest             md5      1024     DEFAULT  thrpt   10   526.873 ±  2.101  ops/ms
MessageDigests.digest             md5     16384     DEFAULT  thrpt   10    35.012 ±  0.088  ops/ms
MessageDigests.digest             md5   1048576     DEFAULT  thrpt   10     0.573 ±  0.002  ops/ms

That's overall a jump from ~483MB/s to ~573MB/s on the 1M chunks, or a ~19% speedup.
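
For anyone who wants to re-derive these figures from the table, here is a small sketch of the arithmetic. It is only an illustration: the class and method names are made up, and it assumes one op digests exactly the (length) bytes shown in the table, so for the 1048576-byte row 1 ops/ms corresponds to 1000 MiB/s.

```java
class Md5SpeedupSketch {
    /** Converts a JMH throughput score (ops/ms) into MiB/s for a given payload size. */
    static double mibPerSecond(double opsPerMs, long bytesPerOp) {
        double bytesPerSecond = opsPerMs * 1000.0 * bytesPerOp;
        return bytesPerSecond / (1024.0 * 1024.0);
    }

    /** Relative gain of 'improved' over 'baseline', in percent. */
    static double speedupPercent(double baseline, double improved) {
        return (improved / baseline - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        long length = 1_048_576L;      // the 1M row of the table
        double withoutIntrinsic = 0.483; // ops/ms with -XX:-UseMD5Intrinsics
        double withIntrinsic = 0.573;    // ops/ms with -XX:+UseMD5Intrinsics

        System.out.printf("without: %.1f MiB/s%n", mibPerSecond(withoutIntrinsic, length));
        System.out.printf("with:    %.1f MiB/s%n", mibPerSecond(withIntrinsic, length));
        System.out.printf("speedup: %.1f%%%n", speedupPercent(withoutIntrinsic, withIntrinsic));
    }
}
```

The same two helpers can be applied to the other rows by plugging in their scores and lengths.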

Thank you,
Ludovic


-- 
Thanks and Regards,

Vivek Deshpande
mailto:viv.desh at gmail.com

From luhenry at microsoft.com  Fri Jul 31 22:05:40 2020
From: luhenry at microsoft.com (Ludovic Henry)
Date: Fri, 31 Jul 2020 22:05:40 +0000
Subject: RFR[M]: Adding MD5 Intrinsic on x86-64
In-Reply-To: <MWHPR21MB05116DACB30B8F9D6C330FDCB04E0@MWHPR21MB0511.namprd21.prod.outlook.com>
References: <MWHPR21MB0511E66642AA67A20AFCC2C8B04E0@MWHPR21MB0511.namprd21.prod.outlook.com>,
 <CACR9jGOqu3vM8J=n44b5C-FHgAE1ZBt9-AesVtxYqny2Dui=bQ@mail.gmail.com>
 <MWHPR21MB05116DACB30B8F9D6C330FDCB04E0@MWHPR21MB0511.namprd21.prod.outlook.com>
Message-ID: <MWHPR21MB0511725F80EACA52DD00A811B04E0@MWHPR21MB0511.namprd21.prod.outlook.com>

I've just created a bug on JBS

JBS: https://bugs.openjdk.java.net/browse/JDK-8250902
Webrev: http://cr.openjdk.java.net/~luhenry/8250902/webrev.01/

-----Original Message-----
From: hotspot-compiler-dev <hotspot-compiler-dev-retn at openjdk.java.net> On Behalf Of Ludovic Henry
Sent: Friday, July 31, 2020 2:27 PM
To: Vivek Deshpande <viv.desh at gmail.com>
Cc: Dean Long <dean.long at oracle.com>; Vladimir Ivanov <vladimir.x.ivanov at oracle.com>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: RFR[M]: Adding MD5 Intrinsic on x86-64

Hi Vivek,

Thank you for your review.

> You have not added the stub generation for 32 bit.
> Did you also test with a 32 bit build?

I've added and tested it.

Webrev: http://cr.openjdk.java.net/~luhenry/md5-intrinsics/webrev.01

--
Ludovic

________________________________________
From: Vivek Deshpande <mailto:viv.desh at gmail.com>
Sent: Thursday, July 30, 2020 9:17:21 PM
To: Ludovic Henry <mailto:luhenry at microsoft.com>
Cc: Dean Long <mailto:dean.long at oracle.com>; Vladimir Ivanov <mailto:vladimir.x.ivanov at oracle.com>; mailto:hotspot-compiler-dev at openjdk.java.net <mailto:hotspot-compiler-dev at openjdk.java.net>
Subject: Re: RFR[M]: Adding MD5 Intrinsic on x86-64

Hi Ludovic

Your patch looks good to me. Good reuse of existing code for SHA.
You have not added the stub generation for 32 bit.
Did you also test with a 32 bit build?
Thank you.

Regards,
Vivek

On Thu, Jul 30, 2020 at 6:26 PM Ludovic Henry <mailto:luhenry at microsoft.com> wrote:
JBS: I just got authorship status and I'll create a bug as soon as I have access to JBS
Webrev: http://cr.openjdk.java.net/~luhenry/md5-intrinsics/webrev.00/

The problem ended up not being with how `ofs` was incremented, but with a callee-saved register not being restored properly before returning from the intrinsic.

The performance results from running with JMH are very encouraging. I ran the `org.openjdk.bench.java.security.MessageDigests` benchmark with only MD5 enabled; the results without and with the intrinsic follow.

-XX:-UseMD5Intrinsics
Benchmark              (digesterName)  (length)  (provider)   Mode  Cnt     Score    Error   Units
MessageDigests.digest             md5        64     DEFAULT  thrpt   10  3459.747 ± 10.508  ops/ms
MessageDigests.digest             md5      1024     DEFAULT  thrpt   10   446.407 ±  3.383  ops/ms
MessageDigests.digest             md5     16384     DEFAULT  thrpt   10    30.685 ±  0.676  ops/ms
MessageDigests.digest             md5   1048576     DEFAULT  thrpt   10     0.483 ±  0.004  ops/ms

-XX:+UseMD5Intrinsics
Benchmark              (digesterName)  (length)  (provider)   Mode  Cnt     Score    Error   Units
MessageDigests.digest             md5        64     DEFAULT  thrpt   10  4011.556 ± 10.212  ops/ms
MessageDigests.digest             md5      1024     DEFAULT  thrpt   10   526.873 ±  2.101  ops/ms
MessageDigests.digest             md5     16384     DEFAULT  thrpt   10    35.012 ±  0.088  ops/ms
MessageDigests.digest             md5   1048576     DEFAULT  thrpt   10     0.573 ±  0.002  ops/ms

That's overall a jump from ~483MB/s to ~573MB/s on the 1M chunks, or a ~19% speedup.

Thank you,
Ludovic


-- 
Thanks and Regards,

Vivek Deshpande
mailto:viv.desh at gmail.com

From vladimir.x.ivanov at oracle.com  Fri Jul 31 23:19:15 2020
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Sat, 1 Aug 2020 02:19:15 +0300
Subject: RFR[S] : 8248830 : C2 : Rotate API intrinsification for X86
In-Reply-To: <MWHPR11MB1614B0D4523E65CF9876E72DE8710@MWHPR11MB1614.namprd11.prod.outlook.com>
References: <MWHPR11MB1614EAFF216144FE6EAE68F9E87F0@MWHPR11MB1614.namprd11.prod.outlook.com>
 <92d97d1b-fc53-e368-b249-1cab7db33964@oracle.com>
 <MWHPR11MB1614CB6E26028AC98DAA7F30E8790@MWHPR11MB1614.namprd11.prod.outlook.com>
 <dd691913-d9c7-2657-905f-4f3df50f6bb4@oracle.com>
 <MWHPR11MB1614E047E14386D3B51EA3A9E8700@MWHPR11MB1614.namprd11.prod.outlook.com>
 <e0a75968-936f-97df-5693-f1e3275824e9@oracle.com>
 <5f6a3e52-7854-4613-43f1-32a7423a0db6@oracle.com>
 <MWHPR11MB1614B0D4523E65CF9876E72DE8710@MWHPR11MB1614.namprd11.prod.outlook.com>
Message-ID: <8265e303-0f86-b308-be79-740d6b4710f2@oracle.com>


> http://cr.openjdk.java.net/~jbhateja/8248830/webrev.05/

Looks good.

Tier5 (where I saw the crashes) passed.

Please, incorporate the following minor cleanups in the final version:
   http://cr.openjdk.java.net/~vlivanov/jbhateja/8248830/webrev.05.cleanup/

(Tested with hs-tier1,hs-tier2.)
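
For readers joining the thread late: the scalar shape discussed below, e.g. (OrI (URShiftI dst shift) (LShiftI dst (SubI zero shift))), is the usual Java shift-or rotate idiom. The sketch below is only an illustration and is not taken from the webrev; the class and method names are made up. It relies on Java masking int shift counts to the low 5 bits (6 bits for long), which is why the negated count works for any distance.

```java
class RotateIdiomSketch {
    // Rotate left written as shift-or; -s is masked to (32 - s) & 31 by the JVM.
    static int rotateLeftInt(int x, int s) {
        return (x << s) | (x >>> -s);
    }

    // Rotate right: the mirror image, matching the OrI/URShiftI/LShiftI shape.
    static int rotateRightInt(int x, int s) {
        return (x >>> s) | (x << -s);
    }

    // Same idiom for long; the shift count is masked to the low 6 bits.
    static long rotateLeftLong(long x, int s) {
        return (x << s) | (x >>> -s);
    }

    public static void main(String[] args) {
        int x = 0x80000001;
        long y = 0x8000000000000001L;
        // Each line compares the hand-written idiom with the library method.
        System.out.println(rotateLeftInt(x, 1) == Integer.rotateLeft(x, 1));
        System.out.println(rotateRightInt(x, 1) == Integer.rotateRight(x, 1));
        System.out.println(rotateLeftLong(y, 1) == Long.rotateLeft(y, 1));
    }
}
```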

Best regards,
Vladimir Ivanov

>> -----Original Message-----
>> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
>> Sent: Thursday, July 30, 2020 3:30 AM
>> To: Bhateja, Jatin <jatin.bhateja at intel.com>
>> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; hotspot-compiler-
>> dev at openjdk.java.net
>> Subject: Re: RFR[S] : 8248830 : C2 : Rotate API intrinsification for X86
>>
>>
>>>> http://cr.openjdk.java.net/~jbhateja/8248830/webrev.04/
>>>
>>> Looks good. (Testing is in progress.)
>>
>> FYI test results are clean (tier1-tier5).
>>
>>>> I have removed RotateLeftNode/RotateRightNode::Ideal routines since
>>>> we are anyway doing constant folding in LShiftI/URShiftI value
>>>> routines. Since the Java rotate APIs are no longer intrinsified,
>>>> these routines may no longer be useful.
>>>
>>> Nice observation! Good.
>>
>> On second thought, it seems there's still a chance left that Rotate nodes
>> get their input type narrowed after the folding happened. For example, as a
>> result of incremental inlining or CFG transformations during loop
>> optimizations. And it does happen in practice since the testing revealed
>> some crashes due to the bug in RotateLeftNode/RotateRightNode::Ideal().
>>
>> So, it makes sense to keep the transformations. But I'm fine with
>> addressing that as a followup enhancement.
>>
>> Best regards,
>> Vladimir Ivanov
>>
>>>
>>>>> It would be really nice to migrate to MacroAssembler along the way
>>>>> (as a cleanup).
>>>>
>>>> I guess you are suggesting to remove opcodes/encodings from the patterns
>>>> and move them to the Assembler. Can we take up this cleanup activity
>>>> separately, since other patterns are also using these matcher directives?
>>>
>>> I'm perfectly fine with handling it as a separate enhancement.
>>>
>>>> Other synthetic comments have been taken care of. I have extended the
>>>> Test to cover all the newly added scalar transforms. Kindly let me
>>>> know if there are other comments.
>>>
>>> Nice!
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>>>> -----Original Message-----
>>>>> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
>>>>> Sent: Friday, July 24, 2020 3:21 AM
>>>>> To: Bhateja, Jatin <jatin.bhateja at intel.com>
>>>>> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>; Andrew
>>>>> Haley <aph at redhat.com>; hotspot-compiler-dev at openjdk.java.net
>>>>> Subject: Re: RFR[S] : 8248830 : C2 : Rotate API intrinsification for
>>>>> X86
>>>>>
>>>>> Hi Jatin,
>>>>>
>>>>>> http://cr.openjdk.java.net/~jbhateja/8248830/webrev.03/
>>>>>
>>>>> Much better! Thanks.
>>>>>
>>>>>> Change Summary:
>>>>>>
>>>>>> 1) Unified the handling for scalar rotate operation. All scalar
>>>>>> rotate
>>>>> selection patterns are now dependent on newly created
>>>>> RotateLeft/RotateRight nodes. This promotes rotate inferencing.
>>>>> Currently
>>>>> if DAG nodes corresponding to a sub-pattern are shared (have
>>>>> multiple
>>>>> users) then existing complex patterns based on Or/LShiftL/URShift
>>>>> do not get matched and this prevents inferring rotate nodes.
>>>>> Please refer to JIT'ed assembly output with baseline[1] and with
>>>>> patch[2]. We can see that generated code size also went down from
>>>>> 832 bytes to 768 bytes. Also this can cause perf degradation if
>>>>> shift-or dependency chain appears inside a hot region.
>>>>>>
>>>>>> 2) Due to enhanced rotate inferencing new patch shows better
>>>>>> performance
>>>>> even for legacy targets (non AVX-512). Please refer to the perf
>>>>> result[3] over AVX2 machine for JMH benchmark part of the patch.
>>>>>
>>>>> Very nice!
>>>>>> 3) As suggested, removed Java API intrinsification changes and
>>>>>> scalar
>>>>> rotate transformation are done during OrI/OrL node idealizations.
>>>>>
>>>>> Good.
>>>>>
>>>>> (Still would be nice to factor the matching code from Ideal() and
>>>>> share it between multiple use sites. Especially considering
>>>>> OrVNode::Ideal() now does basically the same thing. As an
>>>>> example/idea, take a look at
>>>>> is_bmi_pattern() in x86.ad.)
>>>>>
>>>>>> 4) SLP always gets to work on new scalar Rotate nodes and creates
>>>>>> vector
>>>>> rotate nodes which are degenerated into OrV/LShiftV/URShiftV nodes
>>>>> if target does not support vector rotates (non-AVX512).
>>>>>
>>>>> Good.
>>>>>
>>>>>> 5) Added new instruction patterns for vector shift Left/Right
>>>>>> operations
>>>>> with constant shift operands. This prevents emitting extra moves to
>> XMM.
>>>>>
>>>>> +instruct vshiftI_imm(vec dst, vec src, immI8 shift) %{
>>>>> +  match(Set dst (LShiftVI src shift));
>>>>>
>>>>> I'd prefer to see a uniform Ideal IR shape being used irrespective
>>>>> of whether the argument is a constant or not. It should also
>>>>> simplify the logic in SuperWord and make it easier to support on
>>>>> non-x86 architectures.
>>>>>
>>>>> For example, here's how it is done on AArch64:
>>>>>
>>>>> instruct vsll4I_imm(vecX dst, vecX src, immI shift) %{
>>>>>      predicate(n->as_Vector()->length() == 4);
>>>>>      match(Set dst (LShiftVI src (LShiftCntV shift))); ...
>>>>>
>>>>>> 6) Constant folding scenarios are covered in RotateLeft/RotateRight
>>>>> idealization, inferencing of vector rotate through OrV idealization
>>>>> covers the vector patterns generated though non SLP route i.e.
>>>>> VectorAPI.
>>>>>
>>>>> I'm fine with keeping OrV::Ideal(), but I'm concerned with the
>>>>> general direction here - duplication of scalar transformations to
>>>>> lane-wise vector operations. It definitely won't scale and in a
>>>>> longer run it risks to diverge. Would be nice to find a way to
>>>>> automatically "lift"
>>>>> scalar transformations to vectors and apply them uniformly. But
>>>>> right now it is just an idea which requires more experimentation.
>>>>>
>>>>>
>>>>> Some other minor comments/suggestions:
>>>>>
>>>>> +  // Swap the computed left and right shift counts.
>>>>> +  if (is_rotate_left) {
>>>>> +    Node* temp = shiftRCnt;
>>>>> +    shiftRCnt  = shiftLCnt;
>>>>> +    shiftLCnt  = temp;
>>>>> +  }
>>>>>
>>>>> Maybe use swap() here (declared in globalDefinitions.hpp)?
>>>>>
>>>>>
>>>>> +  if (Matcher::match_rule_supported_vector(vopc, vlen, bt))
>>>>> +    return true;
>>>>>
>>>>> Please, don't omit curly braces (even for simple cases).
>>>>>
>>>>>
>>>>> -// Rotate Right by variable
>>>>> -instruct rorI_rReg_Var_C0(no_rcx_RegI dst, rcx_RegI shift, immI0
>>>>> zero, rFlagsReg cr)
>>>>> +instruct rorI_immI8_legacy(rRegI dst, immI8 shift, rFlagsReg cr)
>>>>>     %{
>>>>> -  match(Set dst (OrI (URShiftI dst shift) (LShiftI dst (SubI zero
>>>>> shift))));
>>>>> -
>>>>> +  predicate(!VM_Version::supports_bmi2() &&
>>>>> n->bottom_type()->basic_type() == T_INT);
>>>>> +  match(Set dst (RotateRight dst shift));
>>>>> +  format %{ "rorl     $dst, $shift" %}
>>>>>       expand %{
>>>>> -    rorI_rReg_CL(dst, shift, cr);
>>>>> +    rorI_rReg_imm8(dst, shift, cr);
>>>>>       %}
>>>>>
>>>>> It would be really nice to migrate to MacroAssembler along the way
>>>>> (as a cleanup).
>>>>>
>>>>>> Please push the patch through your testing framework and let me
>>>>>> know your
>>>>> review feedback.
>>>>>
>>>>> There's one new assertion failure:
>>>>>
>>>>> #  Internal Error (.../src/hotspot/share/opto/phaseX.cpp:1238),
>>>>> pid=5476, tid=6219
>>>>> #  assert((i->_idx >= k->_idx) || i->is_top()) failed: Idealize
>>>>> should return new nodes, use Identity to return old nodes
>>>>>
>>>>> I believe it comes from RotateLeftNode::Ideal/RotateRightNode::Ideal
>>>>> which can return pre-constructed constants. I suggest getting rid of
>>>>> the Ideal() methods and moving constant folding logic into Node::Value()
>>>>> (as implemented for other bitwise/arithmetic nodes in
>>>>> addnode.cpp/subnode.cpp/mulnode.cpp et al). It's a more generic
>>>>> approach since it enables richer type information (ranges vs
>>>>> constants) and IMO it's more convenient to work with constants
>>>>> through Types than ConNodes.
>>>>>
>>>>> (I suspect that original/expanded IR shape may already provide more
>>>>> precise type info for non-constant case which can affect the
>>>>> benchmarks.)
>>>>>
>>>>> Best regards,
>>>>> Vladimir Ivanov
>>>>>
>>>>>>
>>>>>> Best Regards,
>>>>>> Jatin
>>>>>>
>>>>>> [1]
>>>>>> http://cr.openjdk.java.net/~jbhateja/8248830/rotate_baseline_avx2_asm.
>>>>>> txt [2]
>>>>>> http://cr.openjdk.java.net/~jbhateja/8248830/rotate_new_patch_avx2_
>>>>>> asm
>>>>>> .txt [3]
>>>>>> http://cr.openjdk.java.net/~jbhateja/8248830/rotate_perf_avx2_new_p
>>>>>> atc
>>>>>> h.txt
>>>>>>
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
>>>>>>> Sent: Saturday, July 18, 2020 12:25 AM
>>>>>>> To: Bhateja, Jatin <jatin.bhateja at intel.com>; Andrew Haley
>>>>>>> <aph at redhat.com>
>>>>>>> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>;
>>>>>>> hotspot-compiler- dev at openjdk.java.net
>>>>>>> Subject: Re: RFR[S] : 8248830 : C2 : Rotate API intrinsification
>>>>>>> for
>>>>>>> X86
>>>>>>>
>>>>>>> Hi Jatin,
>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~jbhateja/8248830/webrev_02/
>>>>>>>
>>>>>>> It definitely looks better, but IMO it hasn't reached the sweet
>>>>>>> spot
>>>>> yet.
>>>>>>> It feels like the focus is on auto-vectorizer while the burden is
>>>>>>> put on scalar cases.
>>>>>>>
>>>>>>> First of all, considering GVN folds relevant operation patterns
>>>>>>> into a single Rotate node now, what's the motivation to introduce
>>>>>>> intrinsics?
>>>>>>>
>>>>>>> Another point is there's still significant duplication for scalar
>>>>>>> cases.
>>>>>>>
>>>>>>> I'd prefer to see the legacy cases which rely on pattern matching
>>>>>>> to go away and be substituted with instructions which match Rotate
>>>>>>> instructions (migrating ).
>>>>>>>
>>>>>>> I understand that it will penalize the vectorization
>>>>>>> implementation, but IMO reducing overall complexity is worth it.
>>>>>>> On auto-vectorizer side, I see
>>>>>>> 2 ways to fix it:
>>>>>>>
>>>>>>>       (1) introduce additional AD instructions for
>>>>>>> RotateLeftV/RotateRightV specifically for pre-AVX512 hardware;
>>>>>>>
>>>>>>>       (2) in SuperWord::output(), when matcher doesn't support
>>>>>>> RotateLeftV/RotateRightV nodes (Matcher::match_rule_supported()),
>>>>>>> generate vectorized version of the original pattern.
>>>>>>>
>>>>>>> Overall, it looks like more and more focus is made on scalar part.
>>>>>>> Considering the main goal of the patch is to enable vectorization,
>>>>>>> I'm fine with separating cleanup of scalar part. As an interim
>>>>>>> solution, it seems that leaving the scalar part as it is now and
>>>>>>> matching scalar bit rotate pattern in VectorNode::is_rotate()
>>>>>>> should be enough to keep the vectorization part functioning. Then
>>>>>>> scalar Rotate nodes and relevant cleanups can be integrated later.
>>>>>>> (Or vice
>>>>>>> versa: clean up scalar part first and then follow up with
>>>>>>> vectorization.)
>>>>>>>
>>>>>>> Some other comments:
>>>>>>>
>>>>>>> * There's a lot of duplication between OrINode::Ideal and
>>>>> OrLNode::Ideal.
>>>>>>> What do you think about introducing a super type
>>>>>>> (OrNode) and put a unified version (OrNode::Ideal) there?
>>>>>>>
>>>>>>>
>>>>>>> * src/hotspot/cpu/x86/x86.ad
>>>>>>>
>>>>>>> +instruct vprotate_immI8(vec dst, vec src, immI8 shift) %{
>>>>>>> +  predicate(n->bottom_type()->is_vect()->element_basic_type() ==
>>>>>>> T_INT
>>>>> ||
>>>>>>> +            n->bottom_type()->is_vect()->element_basic_type() ==
>>>>>>> +T_LONG);
>>>>>>>
>>>>>>> +instruct vprorate(vec dst, vec src, vec shift) %{
>>>>>>> +  predicate(n->bottom_type()->is_vect()->element_basic_type() ==
>>>>>>> T_INT
>>>>> ||
>>>>>>> +            n->bottom_type()->is_vect()->element_basic_type() ==
>>>>>>> +T_LONG);
>>>>>>>
>>>>>>> The predicates are redundant here.
>>>>>>>
>>>>>>>
>>>>>>> * src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp
>>>>>>>
>>>>>>> +void C2_MacroAssembler::vprotate_imm(int opcode, BasicType etype,
>>>>>>> XMMRegister dst, XMMRegister src,
>>>>>>> +                                     int shift, int vector_len) {
>>>>>>> +if (opcode == Op_RotateLeftV) {
>>>>>>> +    if (etype == T_INT) {
>>>>>>> +      evprold(dst, src, shift, vector_len);
>>>>>>> +    } else {
>>>>>>> +      evprolq(dst, src, shift, vector_len);
>>>>>>> +    }
>>>>>>>
>>>>>>> Please, put an assert for the false case (assert(etype == T_LONG,
>>>>> "...")).
>>>>>>>
>>>>>>>
>>>>>>> * On testing (with previous version of the patch): -XX:UseAVX is
>>>>>>> x86- specific flag, so new/adjusted tests now fail on non-x86
>> platforms.
>>>>>>> Either omitting the flag or adding
>>>>>>> -XX:+IgnoreUnrecognizedVMOptions will solve the issue.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Vladimir Ivanov
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Summary of changes:
>>>>>>>> 1) Optimization is specifically targeted to exploit vector
>>>>>>>> rotation
>>>>>>> instruction added for X86 AVX512. A single rotate instruction
>>>>>>> encapsulates entire vector OR/SHIFTs pattern thus offers better
>>>>>>> latency at reduced instruction count.
>>>>>>>>
>>>>>>>> 2) There were two approaches to implement this:
>>>>>>>>         a)  Let everything remain the same and add new wide
>>>>>>>> complex
>>>>>>> instruction patterns in the matcher for e.g.
>>>>>>>>              set Dst ( OrV (Binary (LShiftVI dst (Binary
>>>>>>>> ReplicateI
>>>>>>>> shift))
>>>>>>> (URShiftVI dst (Binary (SubI (Binary ReplicateI 32) ( Replicate
>>>>>>> shift))
>>>>>>>>         It would have been an overoptimistic assumption to expect
>>>>>>>> that graph
>>>>>>> shape would be preserved till the matcher for correct inferencing.
>>>>>>>>         In addition we would have required multiple such bulky
>>>>>>>> patterns.
>>>>>>>>         b) Create new RotateLeft/RotateRight scalar nodes, these
>>>>>>>> get
>>>>>>> generated during intrinsification as well as during additional
>>>>>>> pattern
>>>>>>>>         matching during node Idealization, later on these nodes
>>>>>>>> are consumed
>>>>>>> by SLP for valid vectorization scenarios to emit their vector
>>>>>>>>         counterparts which eventually emit vector rotates.
>>>>>>>>
>>>>>>>> 3) I chose approach 2b) since it's cleaner. The only problem was
>>>>>>>> that in non-EVEX mode (UseAVX < 3) the new scalar Rotate nodes must
>>>>>>>> either be dismantled back to the OR/SHIFT pattern or we penalize
>>>>>>>> vectorization, which would be very costly. The other option would
>>>>>>>> have been to add an additional vector rotate pattern for UseAVX=3 in
>>>>>>>> the matcher which emits vector OR/SHIFT instructions, but then it
>>>>>>>> would lose the efficient instruction sequence that node sharing
>>>>>>>> (OrV/LShiftV/URShift) offers in the current implementation. Thus it
>>>>>>>> would not be beneficial for non-AVX512 targets; the only saving would
>>>>>>>> be the cleanup of a few existing scalar rotate matcher patterns.
>>>>>>>> Also, old targets do not offer this powerful rotate instruction.
>>>>>>>> Therefore new scalar nodes are created only for AVX512 targets.
>>>>>>>>
>>>>>>>> As suggested, constant folding scenarios are covered during
>>>>>>>> idealization of the newly added scalar nodes.
>>>>>>>>
>>>>>>>> Please review the latest version and share your feedback and test
>>>>>>>> results.
>>>>>>>>
>>>>>>>> Best Regards,
>>>>>>>> Jatin
>>>>>>>>
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Andrew Haley <aph at redhat.com>
>>>>>>>>> Sent: Saturday, July 11, 2020 2:24 PM
>>>>>>>>> To: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>; Bhateja,
>>>>>>>>> Jatin <jatin.bhateja at intel.com>;
>>>>>>>>> hotspot-compiler-dev at openjdk.java.net
>>>>>>>>> Cc: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>
>>>>>>>>> Subject: Re: 8248830 : RFR[S] : C2 : Rotate API intrinsification for X86
>>>>>>>>>
>>>>>>>>> On 10/07/2020 18:32, Vladimir Ivanov wrote:
>>>>>>>>>
>>>>>>>>>> High-level comment: so far, there was no pressing need for
>>>>>>>>>> explicitly marking the methods as intrinsics. ROR/ROL instructions
>>>>>>>>>> were selected during matching [1]. Now the patch introduces
>>>>>>>>>> dedicated nodes (RotateLeft/RotateRight) specifically for
>>>>>>>>>> intrinsics, which partly duplicates existing logic.
>>>>>>>>>
>>>>>>>>> The lack of rotate nodes in the IR has always meant that AArch64
>>>>>>>>> doesn't generate optimal code for e.g.
>>>>>>>>>
>>>>>>>>>       (Set dst (XorL reg1 (RotateLeftL reg2 imm)))
>>>>>>>>>
>>>>>>>>> because, with the RotateLeft expanded to its full combination of
>>>>>>>>> ORs and shifts, it's too complicated to match. At the time I put
>>>>>>>>> this to one side because it wasn't urgent. This is a shame
>>>>>>>>> because although such combinations are unusual they are used in
>>>>>>>>> some crypto operations.
>>>>>>>>>
>>>>>>>>> If we can generate immediate-form rotate nodes early by pattern
>>>>>>>>> matching during parsing (rather than depending on intrinsics)
>>>>>>>>> we'll get more value than by depending on programmers calling
>>>>>>>>> intrinsics.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Andrew Haley  (he/him)
>>>>>>>>> Java Platform Lead Engineer
>>>>>>>>> Red Hat UK Ltd. <https://www.redhat.com>
>>>>>>>>> https://keybase.io/andrewhaley
>>>>>>>>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>>>>>>>>
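
For reference, the scalar OR/SHIFT shape discussed in the quoted summary
can be written in Java roughly as below. This is only an illustrative
sketch: the class and method names (RotateIdiom, rotlInt, rotlLong,
rotateAll) are made up for this note and are not part of the patch, and
whether C2 actually turns such code into scalar or vector rotate
instructions depends on the JDK build, the flags and the CPU.

```java
// Hypothetical illustration only; none of these names come from the patch.
final class RotateIdiom {

    // Hand-written rotate-left on int: the OR/SHIFT shape (shift left,
    // unsigned shift right by 32 - s, OR the halves) mentioned in the thread.
    // Java masks int shift counts to 5 bits, so s == 0 is handled correctly.
    public static int rotlInt(int x, int s) {
        return (x << s) | (x >>> (32 - s));
    }

    // Same shape for long; shift counts are masked to 6 bits.
    public static long rotlLong(long x, int s) {
        return (x << s) | (x >>> (64 - s));
    }

    // A simple loop over the library method. Loops of this kind are what
    // a superword (SLP) pass could in principle vectorize.
    public static void rotateAll(int[] src, int[] dst, int s) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = Integer.rotateLeft(src[i], s);
        }
    }

    public static void main(String[] args) {
        int x = 0x12345678;
        long y = 0x0123456789ABCDEFL;
        // Both forms compute the same value; print the comparison results.
        System.out.println(rotlInt(x, 8) == Integer.rotateLeft(x, 8));
        System.out.println(rotlLong(y, 8) == Long.rotateLeft(y, 8));
    }
}
```

The hand-written form and Integer.rotateLeft/Long.rotateLeft are
semantically identical, which is why recognizing the pattern during
idealization (rather than only through intrinsics) can cover code that
never calls the library methods.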

From vladimir.kozlov at oracle.com  Fri Jul 31 23:54:00 2020
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 31 Jul 2020 16:54:00 -0700
Subject: 8250825: C2 crashes with assert(field != __null) failed: missing
 field(Internet mail)
In-Reply-To: <11584C93-EDD5-42A9-A2CD-0738970F3181@tencent.com>
References: <11584C93-EDD5-42A9-A2CD-0738970F3181@tencent.com>
Message-ID: <40d947f8-ebdb-0850-274b-583be9a37aa3@oracle.com>

Yes, it is good.

Thanks,
Vladimir

On 7/31/20 4:43 PM, jiefu(??) wrote:
> Hi Vladimir K,
> 
> The latest version for the test case is here: http://cr.openjdk.java.net/~jiefu/8250825/webrev.02/
> Compared with webrev.01, the changes are:
>       - Rename the test to TestMisalignedUnsafeAccess.java
>       - Add @summary tag
>       - Remove Xbatch
>       - Remove initUnsafe
> 
> Are you still OK with it?
> 
> Thanks.
> Best regards,
> Jie
> 
> On 2020/8/1, 12:46 AM, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
> 
>      Good.
>      
>      thanks,
>      Vladimir K
>      
>      On 7/30/20 10:06 PM, jiefu(??) wrote:
>      > Hi Vladimir K,
>      >
>      > Thanks for your review.
>      >
>      > The test had been extended here:
>      >    - http://cr.openjdk.java.net/~jiefu/8250825/webrev.01/
>      >
>      > Before the patch:
>      >    Unsafe access (put/get) to a static field crashes.
>      >    Unsafe access (put/get) to an instance field is fine.
>      >
>      > After the patch:
>      >    All is ok.
>      >
>      > Thanks a lot.
>      > Best regards,
>      > Jie
>      >
>      > On 2020/7/31, 2:24 AM, "hotspot-compiler-dev on behalf of Vladimir Kozlov" <hotspot-compiler-dev-retn at openjdk.java.net on behalf of vladimir.kozlov at oracle.com> wrote:
>      >
>      >      Hi Jie
>      >
>      >      Nodes generated by make_unsafe_address() are correct. The issue is that the Unsafe API allows generating an offset that is
>      >      unaligned (to fields) with an arbitrary type. As a result, the C2 type system can't find the corresponding field.
>      >
>      >      Did you try to do an unaligned unsafe access to instance fields?
>      >      Also try to unsafe-set a value (Store node). There is code in C2 which checks for narrow stores. It would be interesting to see
>      >      how it behaves in the unsafe case.
>      >
>      >      Please, extend your test.
>      >
>      >      Otherwise the fix is good.
>      >
>      >      Thanks,
>      >      Vladimir K
>      >
>      >      On 7/30/20 6:09 AM, jiefu(??) wrote:
>      >      > Hi all,
>      >      >
>      >      > JBS:    https://bugs.openjdk.java.net/browse/JDK-8250825
>      >      > Webrev: http://cr.openjdk.java.net/~jiefu/8250825/webrev.00/
>      >      >
>      >      > When C2 tries to inline an unsafe-access method, it may generate the following pattern in make_unsafe_address:
>      >      >        ConP  ConL
>      >      >           \  |
>      >      >            \ |
>      >      >            AddP
>      >      > The current implementation of TypeOopPtr::TypeOopPtr(...) fails to recognize it as an unsafe operation, which leads to the crash.
>      >      >
>      >      > Testing:
>      >      >    - tier1-3 on Linux/x64
>      >      >
>      >      > Could you please review it and give me some advice?
>      >      >
>      >      > Thanks a lot.
>      >      > Best regards,
>      >      > Jie
>      >      >
>      >
>      >
>      >
>      
>      
>